What is F1 Score in Machine Learning?

The F1 score is a widely used metric for evaluating machine learning classifiers, and it is valuable for businesses. It acts like a performance report card that supports better decision-making in areas such as fraud detection, customer support, and marketing campaigns. This blog covers what the F1 score is, how it is calculated, and some real-world examples.

What is F1 Score?

The F1 score is a model evaluation metric that is especially useful when working with imbalanced datasets, where plain accuracy can be misleading. It considers both precision and recall to provide a balanced evaluation of a model’s performance. The F1 score is the harmonic mean of precision and recall and is calculated as follows:

F1 Score = 2 * (precision * recall) / (precision + recall)

Precision measures how accurate the model’s positive predictions are. Recall measures the model’s capacity to capture all positive cases. The F1 score is a single statistic that summarizes how effectively a model balances precision and recall. It is extensively used in machine learning classification problems.
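The formula above can be sketched in a few lines of Python. This is a minimal illustration, not a library implementation: the function name `f1_from_precision_recall` is hypothetical, and returning 0.0 when precision and recall are both zero is an assumed convention to avoid dividing by zero.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        # Assumed convention: report 0.0 instead of dividing by zero.
        return 0.0
    return 2 * (precision * recall) / (precision + recall)


# Illustrative values, not taken from a real model.
example_f1 = f1_from_precision_recall(0.8, 0.6)
print(f"F1 = {example_f1:.3f}")
```

With a precision of 0.8 and a recall of 0.6, the result sits between the two values but closer to the lower one, which is the behavior the harmonic mean is chosen for.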

How to Calculate F1 Score

To understand how the F1 score is calculated, we must first examine a confusion matrix. A confusion matrix represents a model’s prediction performance on a dataset. A confusion matrix contains four fundamental components for a binary class dataset (say, “positive” and “negative” classes):

True Positives (TP)	We predicted yes (they have the disease), and they do
True Negatives (TN)	We predicted no, and they are disease-free
False Positives (FP)	We predicted yes, but they do not have the disease
False Negatives (FN)	We predicted no (they do not have the disease), but they do

A false positive (FP) is also referred to as a “Type I error.”

A false negative (FN) is also referred to as a “Type II error.”
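To make the four components concrete, here is a small sketch that counts them from two lists of labels. The function name `confusion_counts`, the label encoding (1 for “positive”, 0 for “negative”), and the sample data are all assumptions made for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, TN, FP, FN for a binary classification problem."""
    tp = tn = fp = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            tp += 1  # predicted yes, and they do have the disease
        elif predicted != positive and actual != positive:
            tn += 1  # predicted no, and they are disease-free
        elif predicted == positive and actual != positive:
            fp += 1  # predicted yes, but they do not have the disease (Type I error)
        else:
            fn += 1  # predicted no, but they do have the disease (Type II error)
    return tp, tn, fp, fn


# Hypothetical labels: 1 = has the disease, 0 = disease-free.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print(f"TP={tp}, TN={tn}, FP={fp}, FN={fn}")
```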

The F1 score is determined by the precision and recall scores, which are described as follows:

Recall: Recall measures a model’s capacity to identify all relevant instances within a dataset. It’s sometimes referred to as “sensitivity” or “true positive rate.” It is the fraction of true positive predictions (correctly recognized relevant instances) out of all actual positive instances in the dataset. It is calculated as:

Recall = TP / (TP + FN)

Precision: Precision measures the accuracy of a model’s positive predictions. Simply put, precision indicates how many of the items classified as belonging to a specific class genuinely belong to that class. The following formula is used to compute it:

Precision = TP / (TP + FP)
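Both definitions translate directly into code. The sketch below uses a hypothetical helper, `precision_recall`, and made-up counts; returning 0.0 when a denominator is zero is an assumed convention rather than part of the definitions.

```python
def precision_recall(tp: int, fp: int, fn: int):
    """Compute precision and recall from confusion-matrix counts."""
    # Precision: of everything predicted positive, how much was truly positive?
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    # Recall: of everything truly positive, how much did the model find?
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return precision, recall


# Illustrative counts, not from a real dataset.
precision, recall = precision_recall(tp=8, fp=2, fn=4)
print(f"Precision = {precision:.3f}, Recall = {recall:.3f}")
```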


How is F1 Score Calculated Using Precision and Recall?

The F1 score is calculated using precision and recall with a harmonic mean formula to provide a balanced assessment of a model’s performance. It combines precision (the accuracy of positive predictions) and recall (the ability to capture all positive instances) into a single metric, favoring models that achieve a balance between these two measures.

The F1 score combines precision and recall symmetrically, so neither is weighted more heavily than the other:

F1 Score = 2 * (precision * recall) / (precision + recall)

F1 scores range from 0 to 1. A score of 1 means the model has perfect precision and recall (no false positives and no false negatives), while a score of 0 means the model did not correctly identify a single positive case.
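To see why the harmonic mean favors models that balance precision and recall, it can help to compare it with a plain average. The following sketch uses hypothetical precision/recall pairs and two illustrative helper functions; it is meant only to show the trend.

```python
def arithmetic_mean(precision: float, recall: float) -> float:
    """Plain average, shown only for comparison."""
    return (precision + recall) / 2


def harmonic_mean(precision: float, recall: float) -> float:
    """Harmonic mean, i.e. the F1 score; 0.0 if both inputs are zero (assumed convention)."""
    if precision + recall == 0:
        return 0.0
    return 2 * (precision * recall) / (precision + recall)


# Hypothetical models with the same precision but falling recall.
pairs = [(0.9, 0.9), (0.9, 0.5), (0.9, 0.1)]
for p, r in pairs:
    print(f"precision={p}, recall={r}: "
          f"average={arithmetic_mean(p, r):.2f}, F1={harmonic_mean(p, r):.2f}")
```

As recall drops, the plain average stays fairly high while F1 falls much faster, so a model cannot earn a good F1 score by excelling at only one of the two measures.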

F1 Score can also be calculated directly from the confusion matrix counts:

F1 Score = 2 * TP / (2 * TP + FP + FN)
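A count-based form of the F1 score can be checked against the precision/recall form numerically. The sketch below assumes the standard identity noted in the code comment; the function names and the counts are hypothetical.

```python
import math


def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """Count-based form: F1 = 2*TP / (2*TP + FP + FN)."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator > 0 else 0.0


def f1_from_pr(tp: int, fp: int, fn: int) -> float:
    """Precision/recall form: F1 = 2 * (precision * recall) / (precision + recall)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * (precision * recall) / (precision + recall)


# Illustrative counts, not from a real dataset.
counts = dict(tp=30, fp=10, fn=20)
same = math.isclose(f1_from_counts(**counts), f1_from_pr(**counts))
print(f"Count form: {f1_from_counts(**counts):.4f}, "
      f"precision/recall form: {f1_from_pr(**counts):.4f}, equal: {same}")
```

Note that true negatives do not appear in either form, which is one reason the F1 score is less affected by a large negative class than accuracy is.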

Example of F1 Score

Consider the following confusion matrix and calculate the corresponding F1 score:

In this example, the confusion matrix describes disease cases, with values given for True Positives, True Negatives, False Positives, and False Negatives. We will substitute these values into the formulas and then calculate the F1 score.

	Actual: Disease = Yes	Actual: Disease = No
Predicted: Disease = Yes	TP = 670	FP = 45
Predicted: Disease = No	FN = 65	TN = 25

To determine the F1 score, first calculate the Precision and Recall values.

Precision = TP / (TP + FP) = 670 / (670 + 45) = 0.937

Recall = TP / (TP + FN) = 670 / (670 + 65) = 0.912

Now, the F1 score will be:

F1 Score = 2 * (0.937 * 0.912) / (0.937 + 0.912) ≈ 0.924
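The same calculation can be reproduced in Python as a quick check. This sketch only plugs the four values from the confusion matrix above into the formulas from this article; the variable names are chosen for illustration.

```python
# Values from the example confusion matrix.
tp, fp, fn, tn = 670, 45, 65, 25

precision = tp / (tp + fp)  # 670 / 715
recall = tp / (tp + fn)     # 670 / 735
f1 = 2 * (precision * recall) / (precision + recall)

print(f"Precision: {precision:.3f}")  # Precision: 0.937
print(f"Recall:    {recall:.3f}")     # Recall:    0.912
print(f"F1 score:  {f1:.3f}")         # F1 score:  0.924
```

The true negative count (TN = 25) is not used, because neither precision nor recall depends on it.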

Real World Applications of F1 Score

The F1 score, which summarizes a model’s precision and recall, is used in a variety of real-world settings. In each of the applications below, it helps in tuning and comparing machine learning models by providing a balanced assessment of the accuracy of their positive predictions and of their completeness, which is critical for making informed decisions and taking appropriate actions.

Here are some examples of how the F1 score can be used in practice:

Medical Diagnosis: The F1 score is used in healthcare to evaluate the effectiveness of diagnostic models. Positive diagnoses are more likely to be correct when precision is high, while critical illnesses are less likely to be missed when recall is high.
Information Retrieval: The F1 score is used in search engines and information retrieval systems to measure the performance of algorithms that extract relevant documents from big databases. It helps balance the trade-off between search result precision (relevance) and recall (completeness).
Spam Detection: Email and message filtering systems use the F1 score to evaluate how well they separate spam from legitimate messages. A high F1 score indicates that most spam is caught (recall) while few legitimate messages are wrongly flagged as spam (precision).
Credit Scoring: The F1 score is used in the financial industry to evaluate the effectiveness of credit risk models. It helps in striking a compromise between identifying creditworthy individuals correctly (precision) and not refusing credit to those who deserve it (recall).
E-commerce Search Relevance: Online marketplaces use the F1 score to evaluate the performance of their search algorithms. It helps check that only relevant products are displayed (precision) without leaving out any potential matches (recall).
Natural Disaster Prediction: The F1 score is used by meteorological models to assess their capacity to anticipate natural catastrophes. Precision and recall must be balanced in order to issue accurate and timely warnings.
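Several of the applications above can involve a rare positive class, which is where accuracy alone can mislead. The sketch below uses entirely hypothetical data (1,000 messages, 10 of them spam) and a hypothetical helper, `accuracy_and_f1`, to compare two imagined filters.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    accuracy = sum(1 for a, p in pairs if a == p) / len(pairs)
    # Count form of F1, algebraically equivalent to
    # 2 * (precision * recall) / (precision + recall).
    denominator = 2 * tp + fp + fn
    f1 = 2 * tp / denominator if denominator > 0 else 0.0
    return accuracy, f1


# Hypothetical dataset: 10 spam messages (1) and 990 legitimate ones (0).
y_true = [1] * 10 + [0] * 990

# Filter A never flags anything as spam.
pred_a = [0] * 1000
# Filter B catches 8 of the 10 spam messages and raises 5 false alarms.
pred_b = [1] * 8 + [0] * 2 + [1] * 5 + [0] * 985

acc_a, f1_a = accuracy_and_f1(y_true, pred_a)
acc_b, f1_b = accuracy_and_f1(y_true, pred_b)
print(f"Filter A: accuracy={acc_a:.3f}, F1={f1_a:.3f}")
print(f"Filter B: accuracy={acc_b:.3f}, F1={f1_b:.3f}")
```

Both filters score about 99% on accuracy, but Filter A, which catches no spam at all, gets an F1 score of 0, while Filter B scores roughly 0.70.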

Wrap-Up

The F1 score is a key metric in machine learning because it balances the accuracy of positive predictions against completeness. It’s valuable in real-world areas like healthcare and fraud detection, where models must make accurate positive predictions while keeping false negatives low. It guides us through precision and recall trade-offs to find effective solutions in practical situations. As machine learning continues to evolve, businesses can use the F1 score to track improvements in precision and recall, ultimately improving the quality of their services and products.