Scikit-learn predict_proba gives wrong answers

Question

predict_proba uses the Platt scaling feature of libsvm to calibrate probabilities; see:

How does sklearn.svm.svc’s function predict_proba() work internally?

So the hyperplane predictions and the probability calibration can indeed disagree, especially if you only have 2 samples in your dataset: predict uses the sign of the decision function, while predict_proba passes the decision values through a sigmoid that libsvm fits with an internal cross-validation. It's odd that this internal cross-validation does not fail explicitly in this case. This may be a bug; one would have to dive into the Platt scaling code of libsvm to understand what's happening.
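To check whether the two disagree on your own data, a minimal sketch along these lines may help. The toy dataset, the linear kernel, and the variable names are assumptions made up for illustration; they are not from the original question. It compares the labels from predict with the argmax of predict_proba and counts the mismatches, without claiming how many there will be.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical toy data: two small Gaussian blobs, 10 samples per class.
rng = np.random.RandomState(0)
X = np.vstack([
    rng.normal(-1.0, 1.0, size=(10, 2)),
    rng.normal(1.0, 1.0, size=(10, 2)),
])
y = np.array([0] * 10 + [1] * 10)

# probability=True enables the Platt-scaled predict_proba.
clf = SVC(kernel="linear", probability=True, random_state=0).fit(X, y)

# Labels from the hyperplane (decision function).
hyperplane_pred = clf.predict(X)

# Labels implied by the calibrated probabilities.
proba = clf.predict_proba(X)
proba_pred = clf.classes_[np.argmax(proba, axis=1)]

# Count samples where the two labelings differ (may well be zero).
n_disagree = int(np.sum(hyperplane_pred != proba_pred))
print("predict vs argmax(predict_proba) disagreements:", n_disagree)
```

If you need labels that are always consistent with the decision boundary, using predict (or the sign of decision_function) rather than thresholding predict_proba is probably the safer choice.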
