## Scikit learn SVC predict probability doesn't work as expected

I built a sentiment analyzer using an SVM classifier. I trained the model with `probability=True` and it can give me probabilities. But when I pickle the model and load it again later, the probability prediction doesn't work anymore.

The model:

    from sklearn.svm import SVC, LinearSVC

    pipeline_svm = Pipeline([
        ('bow', CountVectorizer()),
        ('tfidf', TfidfTransformer()),
        ('classifier', SVC(probability=True)),
    ])

    # pipeline parameters to automatically explore and tune
    param_svm = [
        {'classifier__C': [1, 10, 100, 1000], 'classifier__kernel': ['linear']},
        {'classifier__C': [1, 10, 100, 1000], 'classifier__gamma': [0.001, 0.0001], 'classifier__kernel': ['rbf']},
    ]

    grid_svm = GridSearchCV(
        pipeline_svm,
        param_grid=param_svm,
        refit=True,
        n_jobs=-1,
        scoring='accuracy',
        cv=StratifiedKFold(label_train, n_folds=5),
    )

    svm_detector_reloaded = cPickle.load(open('svm_sentiment_analyzer.pkl', 'rb'))
    print(svm_detector_reloaded.predict([""""Today is awesome day"""])[0])

Gives me:

AttributeError: predict_proba is not available when probability=False

If that can help, pickling the model with:

    import pickle

    pickle.dump(grid_svm, open('svm_sentiment_analyzer.pkl', 'wb'))

and loading the model and predicting with

    svm_detector_reloaded = pickle.load(open('svm_sentiment_analyzer.pkl', 'rb'))
    print(svm_detector_reloaded.predict_proba(["Today is an awesome day"])[0])

returned two probabilities just fine for me, after I reworked your code so that it would run and trained the model on a pandas DataFrame `sents` with

    grid_svm.fit(sents.Sentence.values, sents.Positive.values)

Best practices on model serialization (e.g. using `joblib`) can be found at https://scikit-learn.org/stable/modules/model_persistence.html
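
As a quick sanity check of the round trip described above, here is a minimal sketch, assuming a recent scikit-learn and a tiny made-up corpus (the sentences, labels, and variable names are illustrative, not the asker's data). It fits a pipeline with `SVC(probability=True)`, pickles it in memory, and checks that the reloaded object still returns probabilities.

```python
import pickle

import numpy as np
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Hypothetical toy corpus, repeated so the internal cross-validation
# used for probability estimates has enough samples per class.
positive = ["awesome day", "great movie", "love this", "really good", "so happy"]
negative = ["awful day", "terrible movie", "hate this", "really bad", "so sad"]
texts = (positive + negative) * 4
labels = ([1] * len(positive) + [0] * len(negative)) * 4

pipeline = Pipeline([
    ("bow", CountVectorizer()),
    ("tfidf", TfidfTransformer()),
    # probability=True has to be set before fit
    ("classifier", SVC(kernel="linear", probability=True, random_state=0)),
])
pipeline.fit(texts, labels)

# Round trip through pickle, in memory instead of a file
reloaded = pickle.loads(pickle.dumps(pipeline))

sample = ["Today is an awesome day"]
proba_before = pipeline.predict_proba(sample)
proba_after = reloaded.predict_proba(sample)
print(proba_after[0])
```

If the reloaded object behaves differently from this, the pickle file on disk was most likely written from a different (older) fitted object than the one in the current session.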

**Scikit learn SVC predict probability doesn't work as expected,** `probability`: bool, default=False. Whether to enable probability estimates. This must be enabled prior to calling `fit`, will slow down that method as it internally uses 5-fold cross-validation, and `predict_proba` may be inconsistent with `predict`. Read more in the User Guide.
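
One way to see which case an estimator falls into is to inspect the `probability` flag directly. The sketch below assumes only that the estimator is an `SVC`; the variable names are illustrative.

```python
from sklearn.svm import SVC

with_proba = SVC(probability=True)
without_proba = SVC()  # probability defaults to False

# predict_proba is only exposed when probability=True; otherwise
# looking up the attribute raises AttributeError, so hasattr is False.
has_with = hasattr(with_proba, "predict_proba")
has_without = hasattr(without_proba, "predict_proba")
print(has_with, has_without)

# The flag itself is an ordinary constructor parameter
flag = without_proba.get_params()["probability"]
print(flag)
```

For a pickled `GridSearchCV` over a pipeline like the one in the question, the same flag should be reachable through something like `loaded.best_estimator_.named_steps['classifier'].probability`, assuming the step is named `'classifier'` and the search was fitted with `refit=True`.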

You can use `CalibratedClassifierCV` to get probability scores as output.

    from sklearn.calibration import CalibratedClassifierCV
    from sklearn.svm import LinearSVC

    model_svc = LinearSVC()
    model = CalibratedClassifierCV(model_svc)
    model.fit(X_train, y_train)

Save the model using pickle.

    import pickle

    filename = 'linearSVC.sav'
    pickle.dump(model, open(filename, 'wb'))

Load the model using `pickle.load`.

    model = pickle.load(open(filename, 'rb'))

Now start predicting.

    pred_class = model.predict(pred)
    probability = model.predict_proba(pred)

**Predicting probability from scikit-learn SVC decision_function with ...,** Your link has sufficient resources, so let's go through: when you call `decision_function()`, you get the output from each of the pairwise classifiers (n*(n-1)/2 ...
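
To make the pairwise-classifier count concrete, here is a small sketch on synthetic data (the dataset and variable names are made up for illustration). With `decision_function_shape="ovo"` an `SVC` returns one column per pair of classes, while the default `"ovr"` setting returns one column per class.

```python
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

n_classes = 4
X, y = make_blobs(n_samples=200, centers=n_classes, random_state=0)

# "ovo" exposes the raw pairwise (one-vs-one) decision values
ovo = SVC(kernel="linear", decision_function_shape="ovo").fit(X, y)
# "ovr" (the default) aggregates them into one column per class
ovr = SVC(kernel="linear", decision_function_shape="ovr").fit(X, y)

ovo_scores = ovo.decision_function(X[:5])
ovr_scores = ovr.decision_function(X[:5])
print(ovo_scores.shape)  # 5 rows, n_classes * (n_classes - 1) / 2 columns
print(ovr_scores.shape)  # 5 rows, n_classes columns
```

Neither of these outputs is a probability; they are signed distances to decision boundaries, which is why a separate calibration step is needed to get `predict_proba`.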

Use: `SVC(probability=True)`

or

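    # GridSearchCV itself has no `probability` argument; set it on the SVC step
    pipeline_svm.set_params(classifier__probability=True)
    grid_svm = GridSearchCV(
        pipeline_svm,
        param_grid=param_svm,
        refit=True,
        n_jobs=-1,
        scoring='accuracy',
        cv=StratifiedKFold(label_train, n_folds=5),
    )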

**API Inconsitency of predict and predict_proba in SVC · Issue #13211,** When using `SVC(probability=True)` or `SVR(probability=True)`, the output of SVC `predict_proba` does not always correspond to the class with the highest ... Also note that in https://github.com/scikit-learn/scikit-learn/pull/16769/files# ... The really strange thing is that `svm_predict()` gives the wrong answer while `svm_predict_probability()`, a more complicated function which falls back to `svm_predict()`, gives the right thing (the correct predictions jump around because libsvm does some sort of cross-validation thing that I don't understand yet to train them, but they're always ...

**scikit-learn/scikit-learn,** This is separate from #13211: if you'd force scikit-learn to do something similarly bad with ... API Inconsitency of predict and predict_proba in SVC #13211 ... so `SVC(probability=True).predict_proba` does return correct results ... When using `SVC(probability=True)` or `SVR(probability=True)`, the output of `predict_proba` will not necessarily be consistent with `predict`, in the sense that `np.argmax(self.predict_proba(X), axis=1) != self.predict(X)`; this is documented in the user guide.
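
The inconsistency described here can be checked on any fitted model. The following is a rough sketch on synthetic data (dataset and names are illustrative); it counts how often the most probable class from `predict_proba` differs from the label returned by `predict`. The count depends on the data and the library version, so it may well be zero.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, n_classes=3, random_state=0
)
clf = SVC(probability=True, random_state=0).fit(X, y)

labels_from_predict = clf.predict(X)
# Map the argmax column index back to the actual class label
labels_from_proba = clf.classes_[np.argmax(clf.predict_proba(X), axis=1)]

n_disagree = int(np.sum(labels_from_predict != labels_from_proba))
print(f"{n_disagree} of {len(X)} samples disagree")
```

Mapping through `clf.classes_` matters when the labels are not `0..n-1`; comparing the raw argmax index against `predict` would otherwise report spurious disagreements.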

**Calibrate Predicted Probabilities In SVC,** In scikit-learn, the predicted probabilities must be generated when the model is being trained. This can be done by setting `SVC`'s `probability` to ...

**sklearn.svm.libsvm.predict_proba,** `sklearn.svm.libsvm.predict_proba()`: Predict probabilities. `svm_model` stores all parameters needed to predict a given value. For speed, all real work is done at ...

You need to do a grid search with cross-validation (`GridSearchCV`) instead of just CV. CV is used for performance evaluation and doesn't actually fit the estimator itself.

    from sklearn.datasets import make_classification
    from sklearn.svm import SVC
    from sklearn.grid_search import GridSearchCV

    # unbalanced classification
    X, y = make_classification(n_samples=1000, weights=[0.1, 0.9])

    # use grid search for tuning
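
The snippet above is cut off, so what follows is not its continuation but a separate sketch of the same idea under stated assumptions: it uses `sklearn.model_selection`, which replaced the older `sklearn.grid_search` module in later scikit-learn releases, and the parameter grid and fold count are made up for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.svm import SVC

# Unbalanced binary problem, similar in spirit to the snippet above
X, y = make_classification(n_samples=300, weights=[0.1, 0.9], random_state=0)

param_grid = {"C": [1, 10], "kernel": ["linear", "rbf"]}  # illustrative grid
search = GridSearchCV(
    SVC(probability=True, random_state=0),
    param_grid=param_grid,
    scoring="accuracy",
    cv=StratifiedKFold(n_splits=3),
    refit=True,  # refit the best model on all data so predict_proba is usable
)
search.fit(X, y)

proba = search.predict_proba(X[:3])
print(search.best_params_)
print(proba)
```

Because `refit=True`, the fitted search object delegates `predict_proba` to its `best_estimator_`, so it can be pickled and used for probability predictions directly.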

##### Comments

- Can you show the code where you originally save the object to `'svm_sentiment_analyzer.pkl'`?

- Did you try to call `predict_proba` rather than `predict` when getting that `AttributeError`? Otherwise this is a bit puzzling.