scikit-learn return value of LogisticRegression.predict_proba

Question

asked Jul 24, 2019 in Machine Learning by ParasSharma1 (19k points)

What exactly does the LogisticRegression.predict_proba function return?

In my example I get a result like this:

[[ 4.65761066e-03 9.95342389e-01]
[ 9.75851270e-01 2.41487300e-02]
[ 9.99983374e-01 1.66258341e-05]]

From other calculations using the sigmoid function, I know that the second column contains probabilities. The documentation says that the first column is n_samples, but that can't be right, because my samples are reviews, which are text and not numbers. The documentation also says that the second column is n_classes. That can't be right either, since I only have two classes (namely +1 and -1), and the function is supposed to calculate the probability that a sample belongs to a class, not the classes themselves.

What is the first column really, and why is it there?

1 Answer

Anurag · Answer 1 · 2019-07-24T05:54:41+0000

Looking at your output, each row sums to 1 (up to rounding):

4.65761066e-03 + 9.95342389e-01 ≈ 1
9.75851270e-01 + 2.41487300e-02 ≈ 1
9.99983374e-01 + 1.66258341e-05 ≈ 1

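The first column is the probability that the entry has the -1 label, and the second column is the probability that the entry has the +1 label. The columns follow the order of the labels in the fitted model's classes_ attribute. The (n_samples, n_classes) in the documentation describes the shape of the returned array, with one row per sample and one column per class; it does not describe what the columns contain.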
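To check the column order and shape yourself, here is a minimal sketch. The toy data below is made up for illustration (a single numeric feature instead of your review texts), and the variable names are my own; only LogisticRegression, fit, predict_proba and classes_ come from scikit-learn.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical toy data: one numeric feature, labels -1 and +1.
X = np.array([[-2.0], [-1.0], [-0.5], [0.5], [1.0], [2.0]])
y = np.array([-1, -1, -1, 1, 1, 1])

model = LogisticRegression().fit(X, y)
proba = model.predict_proba(X)

# classes_ gives the label that each column of predict_proba refers to.
print(model.classes_)
# The array has one row per sample and one column per class.
print(proba.shape)
# Each row is a probability distribution over the classes.
print(proba.sum(axis=1))
```

With labels -1 and +1, classes_ is sorted, so column 0 corresponds to -1 and column 1 to +1.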

If you only need the predicted probabilities for the positive label, you can use logistic_model.predict_proba(data)[:, 1]. For your output this yields:

[9.95342389e-01, 2.41487300e-02, 1.66258341e-05]
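If you would rather not hard-code the column index, you can look it up from classes_. The sketch below again uses made-up toy data and my own variable names. It also compares the positive-class column with the sigmoid of decision_function, which is what I would expect to match for a two-class model fitted with default settings; treat that as an assumption to verify on your own model.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical toy data: one numeric feature, labels -1 and +1.
X = np.array([[-2.0], [-1.0], [-0.5], [0.5], [1.0], [2.0]])
y = np.array([-1, -1, -1, 1, 1, 1])

model = LogisticRegression().fit(X, y)

# Find the column that belongs to label +1 instead of assuming index 1.
pos_col = list(model.classes_).index(1)
p_pos = model.predict_proba(X)[:, pos_col]

# Sigmoid of the raw decision scores, computed by hand for comparison.
sigmoid = 1.0 / (1.0 + np.exp(-model.decision_function(X)))

print(p_pos)
print(np.allclose(p_pos, sigmoid))
```

This is why the second column agreed with your own sigmoid calculation: the first column is simply one minus that value.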

