
I have a binary prediction model trained with logistic regression. I want to know which features (predictors) matter most for the decision between the positive and negative class. I know scikit-learn exposes a coef_ attribute, but I don't know whether it is enough to judge importance. I would also like to know how to interpret the coef_ values in terms of importance for the negative and positive classes. I have also read about standardized regression coefficients, but I don't know what they are.

Let's say there are features such as the size of the tumor, the weight of the tumor, etc., used to classify a test case as malignant or not malignant. I want to know which of the features are more important for the malignant versus not-malignant prediction. Does that make sense?

1 Answer


You can use Python's scikit-learn library to fit a logistic regression model and inspect it through its API.

For example:

>>> from sklearn.datasets import load_iris
>>> from sklearn.linear_model import LogisticRegression
>>> # X, y are assumed to be the three-class iris data, which matches the output below
>>> X, y = load_iris(return_X_y=True)
>>> clf = LogisticRegression(random_state=0, solver='lbfgs').fit(X, y)
>>> clf.predict(X[:2, :])
array([0, 0])
>>> clf.predict_proba(X[:2, :])
array([[9.8...e-01, 1.8...e-02, 1.4...e-08],
       [9.7...e-01, 2.8...e-02, ...e-08]])
>>> clf.score(X, y)
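For the binary case in the question, it may help to see how coef_ lines up with the two classes. The following is a minimal sketch on synthetic data, not the asker's model: the data, the noise level, and the names X_demo, y_demo and model are assumptions made up for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic binary data (an assumption for illustration): the label depends
# positively on feature 0 and negatively on feature 1, plus some noise.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(500, 2))
score = 2.0 * X_demo[:, 0] - 1.0 * X_demo[:, 1] + rng.normal(scale=0.5, size=500)
y_demo = (score > 0).astype(int)

model = LogisticRegression().fit(X_demo, y_demo)

# For a binary problem coef_ has shape (1, n_features). The coefficients
# describe the log-odds of model.classes_[1], the "positive" class.
coef = model.coef_[0]
positive_class = model.classes_[1]

# exp(coef) is the multiplicative change in the odds of the positive class
# for a one-unit increase in the feature, holding the others fixed.
odds_ratios = np.exp(coef)

for i, (c, o) in enumerate(zip(coef, odds_ratios)):
    direction = "towards" if c > 0 else "away from"
    print(f"feature {i}: coef={c:+.3f}, odds ratio={o:.3f}, "
          f"pushes {direction} class {positive_class}")
```

A positive coefficient raises the predicted probability of classes_[1] and a negative one lowers it, so a single coefficient vector describes both classes at once.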



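The raw coef_ values alone are not enough to rank features, because each coefficient depends on the scale of its feature. A standardized coefficient is the coefficient multiplied by the standard deviation of its feature, which makes the values comparable. For a binary model, a positive coefficient pushes the prediction toward the positive class (the second entry of classes_) and a negative one toward the negative class; the larger the absolute value, the stronger the influence.

import numpy as np
# m is assumed to be a LogisticRegression already fitted on X, y.
# These scaled values show which parameter is more influential
# (the second one, in the example this snippet comes from)
print(np.std(X, 0) * m.coef_)
# Alternatively, refit on features scaled to unit standard deviation
m.fit(X / np.std(X, 0), y)
print(m.coef_)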
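The scaling idea above can be sketched end to end on a tumor-style dataset. This is only an illustration under stated assumptions: it uses scikit-learn's bundled breast cancer data as a stand-in for the asker's features, and the names pipe, coefs and ranking are chosen here. In that dataset target 0 is malignant and 1 is benign, so a positive coefficient points toward benign.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in data (an assumption): 30 numeric tumor measurements,
# target 0 = malignant, 1 = benign.
data = load_breast_cancer()
X_bc, y_bc = data.data, data.target

# Standardizing first means every coefficient refers to a change of
# one standard deviation in its feature, so magnitudes are comparable.
pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
pipe.fit(X_bc, y_bc)

clf_bc = pipe.named_steps["logisticregression"]
coefs = clf_bc.coef_[0]

# Rank features by absolute standardized coefficient, largest first.
order = np.argsort(np.abs(coefs))[::-1]
ranking = [(data.feature_names[i], coefs[i]) for i in order]

for name, c in ranking[:5]:
    favoured = data.target_names[1] if c > 0 else data.target_names[0]
    print(f"{name:25s} {c:+.3f}  (pushes toward {favoured})")
```

When features are strongly correlated, as several are in this dataset, the individual magnitudes can shift between fits, so the ranking is best read as a rough guide and not as an exact ordering.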


Hope this answer helps.
