AUC-ROC curve assesses binary classification models, and visualize the performance of binary classification models. It is a very popular method to measure the accuracy of a classification model. In this article, we will learn how to interpret an ROC curve and its AUC value to evaluate a binary classification model over all possible classification thresholds.
Table of Contents:
What is ROC Curve?
ROC or Receiver Operating Characteristic plot is used to visualise the performance of a binary classifier. It gives us the trade-off between the True Positive Rate (TPR) and the False Positive Rate (FPR) at different classification thresholds.
1. True Positive Rate
True Positive Rate is the proportion of observations that are correctly predicted to be positive.
2. False Positive Rate
False Positive Rate is the proportion of observations that are incorrectly predicted to be positive.
For different threshold values we will get different TPR and FPR. So, in order to visualise which threshold is best suited for the classifier we plot the ROC curve. The following figure shows what a typical ROC curve look like.
Alright, now that we know the basics of ROC curve let us see how it helps us measuring performance of a classifier.
Thresholding in Machine Learning Classifier Model
Logistic regression outputs probabilities. For example, when using a logistic regression model to determine if breast cancer is malignant or benign, a projected probability of 0.8 indicates that the patient is more likely to have malignant cancer. In contrast, a forecast of 0.2 indicates a decreased chance of malignancy.
How about a patient with a likelihood score of 0.6? Here’s where categorization thresholding comes into play. A threshold is a predetermined value used to transform probability scores into binary classifications. For example, if we set the threshold to 0.5,
- Patients having a chance of ≥0.5 are categorized as “malignant.”
- Patients having a likelihood of < 0.5 are categorized as “benign.”
What about a patient with a likelihood of 0.6? This is when categorization thresholds come into play. A threshold is a predefined value that is used to transform probability scores into binary classifications. For example, if we set the threshold to 0.5, then
- Patients having a likelihood of ≥ 0.5 are categorized as “malignant.”
- Patients having a likelihood < 0.5 are considered “benign.”
By default, logistic regression assumes a 0.5 threshold, but this value is not always optimal. The choice of threshold depends on the specific problem and the trade-offs between false positives and false negatives. Adjusting the threshold allows us to fine-tune model performance based on the desired outcome.
1. Understanding Threshold Tuning with an Analogy
Consider a metal detector whose sensitivity is determined by a threshold:
- A higher threshold leads to lower sensitivity and only detects huge metal items.
- Lower threshold leads to higher sensitivity, allowing for detection of both large and small metal particles.
Similarly, in a classification model.
- A higher threshold lowers false positives while potentially missing actual positives (reducing sensitivity).
- A lower threshold improves sensitivity, collecting more genuine positives while simultaneously producing more false positives.
ROC Curve of a Random Classifier Vs. a Perfect Classifier
The ROC curve of a random classifier with the random performance level (as shown below) always shows a straight line. This random classifier ROC curve is considered to be the baseline for measuring the performance of a classifier. Two areas separated by this ROC curve indicates an estimation of the performance level—good or poor.
ROC curves that fall under the area at the top-left corner indicate good performance levels, whereas ROC curves fall in the other area at the bottom-right corner indicate poor performance levels. An ROC curve of a perfect classifier is a combination of two straight lines both moving away from the baseline towards the top-left corner.
Now, we might be wondering how a perfect classifier looks like.
Note: The dotted line represents the ROC curve of a purely random classifier; a good classifier stays as far away from that line as possible (toward the top-left corner).
Area Under ROC Curve
Area Under the Curve or AUC ROC curve is nothing but the area under the curve calculated in the ROC space. One of the easy ways to calculate the AUC score is using the trapezoidal rule, which is adding up all trapezoids under the curve.
Although the theoretical range of the AUC ROC curve score is between 0 and 1, the actual scores of meaningful classifiers are greater than 0.5, which is the AUC ROC curve score of a random classifier.
Implementing ROC Curve with Python
Here is how, we can implement ROC Curve in Python using sklearn and Logistic Regression Model:
Step 1 – Gathering Requirements
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, auc
from sklearn.datasets import make_classification
Step 2 – Creating Synthetic Data
X, y = make_classification(n_samples=1000, n_features=10, random_state=42)
Step 3 – Split the Data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
Step 4 – Train the Logistic Regression Model
model = LogisticRegression()
model.fit(X_train, y_train)
Step 5 – Get Prediction Probabilities
y_probs = model.predict_proba(X_test)[:, 1]
Step 6 – Compute FPR, TPR and Thresholds
fpr, tpr, thresholds = roc_curve(y_test, y_probs)
roc_auc = auc(fpr, tpr)
Step 7 – Plot ROC Curve
plt.figure(figsize=(8, 6))
plt.plot(fpr, tpr, color='blue', label=f'ROC Curve (AUC = {roc_auc:.2f})')
plt.plot([0, 1], [0, 1], color='gray', linestyle='--') # Random classifier line
plt.xlabel('False Positive Rate (FPR)')
plt.ylabel('True Positive Rate (TPR)')
plt.title('ROC Curve')
plt.legend()
plt.grid()
plt.show()
Output
Get 100% Hike!
Master Most in Demand Skills Now!
Conclusion
We hope this article helped you in getting an idea of AUC-ROC curve metric for assessing classifier performance. This is commonly used in the industry as well as data science or machine learning.
If you want to master data science and create high-performing models, the consider enrolling in our comprehensive Data Science course and advance your machine learning skills!
Our Machine Learning Courses Duration and Fees
Cohort Starts on: 12th Apr 2025
₹70,053