How to find the importance of the features for a logistic regression model?

Question

asked Jul 2, 2019 in Machine Learning by Sammy (47.6k points)

I have a binary prediction model trained with logistic regression. I want to know which features (predictors) are more important for deciding between the positive and negative class. I know scikit-learn exposes a coef_ attribute, but I don't know whether it is enough to determine importance. I also don't know how to interpret the coef_ values in terms of importance for the negative and positive classes. I have also read about standardized regression coefficients, but I don't know what they are.

Let's say there are features such as the size of the tumor, the weight of the tumor, etc., used to classify a test case as malignant or not malignant. I want to know which of the features are more important for predicting malignant versus not malignant. Does that make sense?

1 Answer

Anurag · Answer 1 · 2019-07-02T06:19:48+0000

You can use Python’s scikit-learn library to implement logistic regression and work with its related APIs.

For example:

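>>> import numpy as np
>>> from sklearn.linear_model import LogisticRegression
>>> # X (NumPy array, n_samples x n_features) and y (labels) are assumed to be loaded already
>>> clf = LogisticRegression(random_state=0, solver='lbfgs',
...                          multi_class='multinomial').fit(X, y)
>>> clf.predict(X[:2, :])
array([0, 0])
>>> clf.predict_proba(X[:2, :])
array([[9.8...e-01, 1.8...e-02, 1.4...e-08],
       [9.7...e-01, 2.8...e-02, ...e-08]])
>>> clf.score(X, y)
0.97...
>>> # Raw coefficients depend on the scale of each feature,
>>> # so they are not directly comparable across features.
>>> print(clf.coef_)
>>> # Scaling each coefficient by its feature's standard deviation makes them
>>> # comparable; these values can show, for example, that the second feature
>>> # is more influential.
>>> print(np.std(X, 0) * clf.coef_)
>>> # Alternatively, scale the features before fitting and read coef_ directly.
>>> clf.fit(X / np.std(X, 0), y)
>>> print(clf.coef_)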
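
For the binary tumor case in the question, the same idea can be sketched end to end. This is only an illustration under some assumptions: scikit-learn's built-in breast cancer dataset stands in for your tumor data, the pipeline and the variable names (model, coefs, ranking) are made up for the example, and ranking by absolute standardized coefficient is one common heuristic rather than a definitive measure of importance.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: the breast cancer dataset stands in for the tumor data
# in the question. In this dataset, target 0 = malignant, 1 = benign.
data = load_breast_cancer()
X, y = data.data, data.target

# Standardize the features so every coefficient is on the same scale,
# then fit a binary logistic regression.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X, y)

# For a binary problem coef_ has shape (1, n_features).
coefs = model.named_steps["logisticregression"].coef_[0]

# Rank features by the absolute value of the standardized coefficient.
order = np.argsort(np.abs(coefs))[::-1]
ranking = [(data.feature_names[i], float(coefs[i])) for i in order]

for name, coef in ranking[:5]:
    # A positive coefficient pushes toward classes_[1] (benign here),
    # a negative one pushes toward classes_[0] (malignant here).
    direction = "benign" if coef > 0 else "malignant"
    print(f"{name:25s} {coef:+.3f} -> {direction}")
```

The magnitude of each standardized coefficient suggests how strongly the feature moves the prediction, and its sign tells you which class it favors. Correlated features can share or trade off weight, so treat the ranking as a rough guide.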

Hope this answer helps.
