in Machine Learning by (47.6k points)

I have a binary prediction model trained with logistic regression. I want to know which features (predictors) matter most for deciding between the positive and negative class. I know scikit-learn exposes a coef_ attribute on the fitted model, but I don't know whether it is enough to judge importance. I also don't know how to read the coef_ values in terms of importance for the negative versus the positive class. Finally, I have read about standardized regression coefficients, but I don't know what they are.

Let's say there are features such as the size of the tumor, the weight of the tumor, and so on, used to classify a test case as malignant or not malignant. I want to know which of these features are more important for the malignant and the not malignant prediction. Does that make sense?

1 Answer

by (33.1k points)

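You can fit the model with scikit-learn's LogisticRegression and inspect its coef_ attribute. In a binary model, a positive coefficient raises the predicted probability of the positive class (clf.classes_[1]) and a negative coefficient lowers it. The raw values depend on the units of each feature, so they are not directly comparable. To compare importance, use standardized coefficients: either multiply each coefficient by the standard deviation of its feature, or fit the model on standardized features.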

For example:

>>> from sklearn.datasets import load_iris
>>> from sklearn.linear_model import LogisticRegression
>>> X, y = load_iris(return_X_y=True)
>>> clf = LogisticRegression(random_state=0, solver='lbfgs').fit(X, y)
>>> clf.predict(X[:2, :])
array([0, 0])
>>> clf.predict_proba(X[:2, :])
array([[9.8...e-01, 1.8...e-02, 1.4...e-08],
       [9.7...e-01, 2.8...e-02, ...e-08]])
>>> clf.score(X, y)
0.97...

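import numpy as np

# Raw coefficients: one row per class, one column per feature.
# Their size depends on each feature's units, so they are not
# directly comparable across features.
print(clf.coef_)

# Scaling each coefficient by its feature's standard deviation
# puts the coefficients on a comparable scale.
print(np.std(X, 0) * clf.coef_)

# Alternatively, refit on features scaled to unit standard
# deviation and read the coefficients directly.
clf.fit(X / np.std(X, 0), y)
print(clf.coef_)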
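
For the binary tumor case in the question, here is a minimal sketch of the same idea. It assumes scikit-learn's built-in breast cancer dataset as a stand-in for your data (in that dataset, class 0 is malignant and class 1 is benign), and it standardizes the features inside a pipeline. The variable names are only illustrative. The sign of each coefficient tells you which class the feature pushes toward, and the absolute value gives a rough ranking of influence.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: the built-in dataset stands in for the asker's tumor data.
data = load_breast_cancer()
X, y = data.data, data.target  # 0 = malignant, 1 = benign

# Standardize the features, then fit, so coefficients share one scale.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X, y)

clf = model.named_steps["logisticregression"]
coefs = clf.coef_[0]  # binary model: a single row of coefficients

# Rank features by absolute standardized coefficient, largest first.
order = np.argsort(-np.abs(coefs))

# A positive coefficient pushes toward clf.classes_[1],
# a negative one toward clf.classes_[0].
for i in order[:5]:
    toward = clf.classes_[1] if coefs[i] > 0 else clf.classes_[0]
    print(f"{data.feature_names[i]:<25s} {coefs[i]:+.3f} -> {data.target_names[toward]}")
```

Keep in mind that this ranking is only a rough guide: strongly correlated features share their effect between them, and the default L2 regularization shrinks all coefficients.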

Hope this answer helps.
