cross_val_score
returns the score of each test fold, whereas cross_val_predict
returns the predicted y values for each test fold.
With cross_val_score()
, you typically look at the average of the per-fold scores, which is affected by the number of folds: some folds may have a high error (the model may not fit them well), and they pull the average around.
Whereas cross_val_predict()
returns, for each element in the input, the prediction that was obtained for that element when it was in the test set. [Note that only cross-validation strategies that assign all elements to a test set exactly once can be used.] So increasing the number of folds only increases the training data available for each test element, and hence its result may not be affected much.
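A minimal sketch of the difference, on a toy regression dataset (assuming scikit-learn is installed; the dataset and model here are illustrative, not from the question):

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score, cross_val_predict

X, y = make_regression(n_samples=100, n_features=5, noise=10.0, random_state=0)
model = LinearRegression()

# cross_val_score: one score per fold -> array of shape (cv,)
scores = cross_val_score(model, X, y, cv=5)
print(scores.shape)   # (5,)
print(scores.mean())  # the average that depends on how the folds fall

# cross_val_predict: one prediction per input sample -> array of shape (n_samples,)
preds = cross_val_predict(model, X, y, cv=5)
print(preds.shape)    # (100,)
```

So the first gives you 5 numbers (one per fold), while the second gives you 100 numbers (one per sample).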
Edit (after comment)
Please have a look at the following answer on how cross_val_predict
works:
How is scikit-learn cross_val_predict accuracy score calculated?
I think that cross_val_predict
can overfit, because as the number of folds increases, more data is used for training and less for testing, so the resulting predictions depend more heavily on the training data. Also, as mentioned above, the prediction for each sample is made only once, so it may be more susceptible to how the data is split.
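To illustrate the sensitivity to splitting (a sketch on an assumed toy dataset, not from the question): because each sample is predicted exactly once, by the single model whose test fold happened to contain it, reshuffling the folds changes every per-sample prediction.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_predict

X, y = make_regression(n_samples=100, n_features=5, noise=10.0, random_state=0)
model = LinearRegression()

# Same data, same model, same number of folds -- only the shuffling differs.
preds_a = cross_val_predict(model, X, y, cv=KFold(5, shuffle=True, random_state=1))
preds_b = cross_val_predict(model, X, y, cv=KFold(5, shuffle=True, random_state=2))

# Each sample's single prediction changes with the split.
print(np.allclose(preds_a, preds_b))
```

Per-fold scores from cross_val_score also vary with the split, but averaging over folds smooths that out, while cross_val_predict has no such averaging per sample.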
That's why most resources and tutorials recommend using cross_val_score
for analysis.