
I have trouble understanding the difference (if there is one) between roc_auc_score() and auc() in scikit-learn.

I'm trying to predict a binary output with imbalanced classes (around 1.5% for Y=1).

Classifier

model_logit = LogisticRegression(class_weight='balanced')  # was class_weight='auto' in older scikit-learn releases

model_logit.fit(X_train_ridge, Y_train)

ROC curve

false_positive_rate, true_positive_rate, thresholds = roc_curve(Y_test, clf.predict_proba(xtest)[:,1])

AUCs

auc(false_positive_rate, true_positive_rate)

Out: 0.82338034042531527

and

roc_auc_score(Y_test, clf.predict(xtest))

Out: 0.75944737191205602

Can somebody explain this difference? I thought both were just calculating the area under the ROC curve. It might be because of the imbalanced dataset, but I could not figure out why.

Thanks!


To check or visualize the performance of a classification model, most often a binary one as in your case, we can use the ROC (Receiver Operating Characteristic) curve and the area under it, the AUC (Area Under the Curve).

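AUC is not always the area under a ROC curve. "Area under the curve" is the (abstract) area under some curve, so it is a more general notion than AUROC. In scikit-learn, auc() computes the area under any curve given as x and y points using the trapezoidal rule, while roc_auc_score() computes the area under the ROC curve specifically. With imbalanced classes, it may be better to compute the AUC of a precision-recall curve.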
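As a rough sketch of the precision-recall alternative mentioned above: the data here is synthetic (make_classification with a 5% positive class, not the asker's dataset), and the model is scored on its own training data only to keep the example short.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import auc, average_precision_score, precision_recall_curve

# Synthetic, imbalanced data: a hypothetical stand-in for the real dataset.
X, y = make_classification(n_samples=2000, n_features=10,
                           weights=[0.95, 0.05], random_state=0)

model = LogisticRegression(class_weight="balanced", max_iter=1000).fit(X, y)
scores = model.predict_proba(X)[:, 1]  # continuous scores, not hard labels

# auc() is generic: here it integrates precision over recall.
precision, recall, _ = precision_recall_curve(y, scores)
pr_auc = auc(recall, precision)

# average_precision_score summarizes the same curve with a step-wise sum,
# so it is usually close to, but not identical to, the trapezoidal value.
ap = average_precision_score(y, scores)

print(pr_auc, ap)
```

Both numbers describe the precision-recall curve; neither is the same quantity as what roc_auc_score() returns.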

See the sklearn source for roc_auc_score (from an older release; newer versions differ in detail):

def roc_auc_score(y_true, y_score, average="macro", sample_weight=None):
    def _binary_roc_auc_score(y_true, y_score, sample_weight=None):
        fpr, tpr, tresholds = roc_curve(y_true, y_score,
                                        sample_weight=sample_weight)
        return auc(fpr, tpr, reorder=True)

    return _average_binary_score(
        _binary_roc_auc_score, y_true, y_score, average,
        sample_weight=sample_weight)

As the code above shows, roc_auc_score() first computes the ROC curve and then calls auc() to get the area.
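To make that relationship concrete, here is a small sketch that computes the trapezoidal area by hand and compares it with both functions. The labels and scores are made-up toy values, and manual_area is just an illustrative name.

```python
import numpy as np
from sklearn.metrics import auc, roc_auc_score, roc_curve

# Toy labels and scores, made up for illustration.
y_true = np.array([0, 0, 1, 1, 0, 1, 0, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.5, 0.9])

fpr, tpr, _ = roc_curve(y_true, y_score)

# Trapezoidal rule written out: width of each FPR step times the
# average TPR height over that step.
manual_area = np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0)

# All three values agree because they integrate the same curve.
print(manual_area, auc(fpr, tpr), roc_auc_score(y_true, y_score))
```

Given the same scores, the two functions cannot disagree; a difference in their results means they were given different inputs.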

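I guess your problem is that the two calls get different inputs: the ROC curve you passed to auc() was built from predict_proba() scores, while roc_auc_score() was given the hard labels from predict(). When both functions get the same input, for example predict(), the outputs are always the same: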

For example:

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, auc, roc_auc_score

# class_weight='auto' was replaced by 'balanced' in newer scikit-learn releases
est = LogisticRegression(class_weight='balanced')

# Small random toy dataset; the exact numbers below change from run to run
X = np.random.rand(10, 2)
y = np.random.randint(2, size=10)
est.fit(X, y)

# Both calls receive the same hard labels from predict()
false_positive_rate, true_positive_rate, thresholds = roc_curve(y, est.predict(X))
print(auc(false_positive_rate, true_positive_rate))
# 0.857142857143
print(roc_auc_score(y, est.predict(X)))
# 0.857142857143

If you change the above to the following, where the ROC curve is built from predict_proba() scores but roc_auc_score() still gets predict() labels, you can get different outputs:

false_positive_rate, true_positive_rate, thresholds = roc_curve(y, est.predict_proba(X)[:,1])
# may differ
print(auc(false_positive_rate, true_positive_rate))
print(roc_auc_score(y, est.predict(X)))
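One way to check this is to pass the same input to both functions, as in the sketch below. It uses synthetic data with a fixed seed as a stand-in for the real dataset, and the variable names are only illustrative.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import auc, roc_auc_score, roc_curve

# Synthetic imbalanced data: a hypothetical stand-in, not the asker's data.
X, y = make_classification(n_samples=1000, n_features=10,
                           weights=[0.9, 0.1], random_state=0)
est = LogisticRegression(class_weight="balanced", max_iter=1000).fit(X, y)

proba = est.predict_proba(X)[:, 1]  # continuous scores
labels = est.predict(X)             # hard 0/1 labels (0.5 threshold)

fpr_p, tpr_p, _ = roc_curve(y, proba)
fpr_l, tpr_l, _ = roc_curve(y, labels)

# Same input on both sides -> same area.
auc_proba_curve = auc(fpr_p, tpr_p)
auc_proba_direct = roc_auc_score(y, proba)
auc_label_curve = auc(fpr_l, tpr_l)
auc_label_direct = roc_auc_score(y, labels)

print("probabilities:", auc_proba_curve, auc_proba_direct)
print("hard labels:  ", auc_label_curve, auc_label_direct)
# With hard labels the "curve" has at most one interior point.
print("ROC points from labels:", len(fpr_l))
```

For a ranking metric such as ROC AUC, the probabilities (or decision_function() values) are normally the input you want. Hard labels collapse the curve to a single threshold, which typically gives a different and often lower area.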