Gradient Boosting in Machine Learning


Gradient Boosting is a powerful machine learning technique used to improve model accuracy by combining multiple weak learners. It’s widely used in both classification and regression tasks. In this blog, you’ll learn what Gradient Boosting is, how it works, and how to implement it easily in Python.


Why do we need Boosting?

Before learning the gradient boosting technique, let's understand the need for boosting with the help of a scenario. Suppose we have separately built six Machine Learning models for predicting whether it will rain or not. Each of these models has been built on one of the six distinct parameters given below to analyze and predict the weather condition:

  1. Air temperature
  2. Atmospheric (barometric) pressure
  3. Humidity
  4. Precipitation
  5. Solar radiation
  6. Wind

The outputs from these Machine Learning models may differ. The model evaluating air temperature may predict a sunny day, whereas another model may predict a rainy day based on humidity. Moreover, a prediction based on any single model has a high chance of being wrong. These individual models are weak learners. But if we combine all the weak learners to work as one, the prediction relies on all six parameters, and this increase in information boosts the accuracy of the model.

This is the logic behind all the boosting techniques such as AdaBoost, gradient boosting, and XGBoost.

Now, in this blog on ‘Gradient Boosting,’ we will understand ‘What is Boosting?’

What is Boosting?

Boosting is a machine learning approach used to improve the performance of weak learners. A weak learner is a simple model, such as a shallow decision tree, that performs only slightly better than random guessing. Boosting combines many of these weak learners sequentially, with each subsequent model concentrating on correcting the mistakes of the previous ones. As a result, boosting transforms numerous weak models into a single strong and accurate model. It is often used for classification and regression problems and is the basis for many well-known algorithms like AdaBoost, Gradient Boosting, and XGBoost.

Boosting relies on ensemble learning, a technique that combines multiple models to improve the accuracy of Machine Learning predictions. There are two types of ensemble learning:

1. Sequential Ensemble Learning

It is a boosting technique where individual weak learners are combined sequentially during the training phase. The performance of the model is boosted by assigning higher weights to the samples that are incorrectly classified. The AdaBoost algorithm is an example of sequential ensemble learning that we will look at later in this blog.


2. Parallel Ensemble Learning

It is a bagging technique where the outputs from the weak learners are generated in parallel. It reduces errors by averaging (or voting on) the outputs from all the weak learners. The random forest algorithm is an example of parallel ensemble learning.
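
To make this concrete, here is a minimal sketch of parallel ensemble learning using scikit-learn's RandomForestClassifier; the Iris dataset and the parameter values are chosen purely for illustration:

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load a small sample dataset
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Each tree is trained independently on a bootstrap sample (n_jobs=-1 trains them in parallel),
# and the final prediction is the majority vote across all trees
rf = RandomForestClassifier(n_estimators=100, n_jobs=-1, random_state=42)
rf.fit(X_train, y_train)
print("Random Forest accuracy:", accuracy_score(y_test, rf.predict(X_test)))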


Mechanism of Boosting Algorithms

Boosting builds a strong model by combining the predictions of many weak learners, typically through a weighted vote or sum. This increases the predictive power of the Machine Learning model and is achieved by training a series of weak models, each focusing on the mistakes of its predecessors.

Below are the steps that demonstrate this mechanism in practice, using scikit-learn's AdaBoost classifier as an example:

1. Import required libraries

from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

2. Load sample data

data = load_iris()
X = data.data
y = data.target

3. Split the dataset

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

4. Create a weak learner (Decision Tree)

weak_learner = DecisionTreeClassifier(max_depth=1)

5. Create the AdaBoost model

# Note: in scikit-learn versions before 1.2, the 'estimator' parameter was named 'base_estimator'
adaboost_model = AdaBoostClassifier(estimator=weak_learner, n_estimators=50, learning_rate=1.0, random_state=42)

6. Train the model

adaboost_model.fit(X_train, y_train)

7. Make predictions

y_pred = adaboost_model.predict(X_test)

8. Evaluate the model

accuracy = accuracy_score(y_test, y_pred)
print("Accuracy of AdaBoost Model:", accuracy)

Output:

Accuracy of AdaBoost Model: 0.9555555555555556


Now, we will explore the different types of boosting algorithms and how each one handles weak learners.

Types of Boosting Algorithms

There are three main types of boosting algorithms, discussed below:

1. Adaptive Boosting (AdaBoost)

AdaBoost is a boosting algorithm that combines a number of simple models (often decision trees) to create a robust, accurate model. It works by allocating more weight to misclassified data points, so the next model focuses on the more difficult cases. The process is repeated, with all the models combined to improve overall performance. AdaBoost is commonly used for classification tasks and can be easily implemented with scikit-learn.

Steps for implementing AdaBoost:

Step 1: Import Required Libraries

from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

Step 2: Load and Prepare the Data

# Load the Iris dataset
data = load_iris()
X = data.data
y = data.target
# Split into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

Step 3: Define the Weak Learner

# Use a decision stump (a tree with max depth = 1)
weak_learner = DecisionTreeClassifier(max_depth=1)

Step 4: Create the AdaBoost Classifier

adaboost = AdaBoostClassifier(
               estimator=weak_learner,   # named base_estimator in scikit-learn versions before 1.2
               n_estimators=50,
               learning_rate=1.0,
               random_state=42
)

Step 5: Train the Model

adaboost.fit(X_train, y_train)

Step 6: Make Predictions

y_pred = adaboost.predict(X_test)

Step 7: Evaluate the Model

accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)

Output:

Accuracy: 0.9555555555555556

Next, we will see “What is Gradient Boosting?”


2. Gradient Boosting

Gradient Boosting is a powerful machine learning technique that gradually combines numerous weak learners, typically decision trees, to form a strong model. During the process, every new model learns to correct the mistakes of the previous models. To reduce the overall prediction error, it uses a technique known as gradient descent. Gradient Boosting can handle both classification and regression tasks with high accuracy, which is why it is so popular. It serves as the foundation for advanced algorithms like XGBoost, LightGBM, and CatBoost, all of which are frequently used in data science projects and competitions.

The gradient boosting algorithm requires the below components to function:

1. Loss function: To reduce errors in prediction, we need to optimize a loss function. Unlike in AdaBoost, incorrectly predicted samples are not given a higher weight in gradient boosting. Instead, the loss is minimized by fitting each new weak learner to the residual errors of the combined model built so far.

2. Weak learner: In gradient boosting, we require weak learners to make predictions. To get real values as output, we use regression trees. The trees are constructed in a greedy manner, choosing the best split point at each step; because this greediness can cause overfitting, the trees are usually kept small (limited depth or number of leaves).

3. Additive model: In gradient boosting, we reduce the loss by adding decision trees one at a time. The trees already in the model are left unchanged; each new tree is fitted to the remaining errors, and its contribution is scaled by the learning rate so that the overall loss keeps decreasing.

Finally, the outputs of all the trees are added together to form the final model, whose prediction error has been progressively minimized, as the short sketch below illustrates.
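
To see the additive model in action, here is a minimal sketch (on a synthetic regression dataset, purely for illustration) that uses scikit-learn's GradientBoostingRegressor and its staged_predict method to show the test error falling as trees are added one by one:

from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Synthetic regression data, used only to illustrate the additive nature of boosting
X, y = make_regression(n_samples=500, n_features=10, noise=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Each new tree is fitted to the residual errors of the current model
gbr = GradientBoostingRegressor(n_estimators=100, learning_rate=0.1, max_depth=3, random_state=42)
gbr.fit(X_train, y_train)

# staged_predict yields predictions after each added tree, so we can watch the error shrink
for i, y_stage in enumerate(gbr.staged_predict(X_test)):
    if (i + 1) % 25 == 0:
        print("Trees:", i + 1, "Test MSE:", round(mean_squared_error(y_test, y_stage), 2))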

How is this useful?

Gradient boosting is a highly robust technique for developing predictive models. It can be used with a variety of loss (risk) functions and optimizes the accuracy of the model's predictions. It also handles multicollinearity well, i.e., situations where the correlations among the predictor variables are high.

Gradient boosting machines have been successful in various applications of Machine Learning.

Next, we will move on to XGBoost, which is another boosting technique widely used in the field of Machine Learning.

3. XGBoost

The XGBoost algorithm is an extended version of the gradient boosting algorithm. It is designed to enhance the performance and training speed of a Machine Learning model.

Additionally, there is the XGBoost library, which provides gradient boosting frameworks for various languages such as R, Python, Java, etc.

Why do we use XGBoost?

In the gradient boosting algorithm, the trees are built sequentially, one after another, which makes training slow on large datasets. This is where the XGBoost algorithm helps: it speeds up training by parallelizing the computations involved in constructing each decision tree, as illustrated in the short sketch after the feature list below.


What features make XGBoost unique?

XGBoost is much faster than the standard gradient boosting algorithm and improves its execution in several ways.

There are more features that make the XGBoost algorithm unique:

1. Fast: The execution speed of the XGBoost algorithm is high. We get a fast and efficient output due to its parallel computation.

2. Cache optimization: It uses cache-aware algorithms and data structures so that memory and hardware resources are utilized efficiently.

3. Distributed computing: If we are employing large datasets for training the Machine Learning model, then XGBoost provides us distributed computing, which helps combine multiple machines to enhance performance.
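
As a quick illustration, here is a minimal sketch using XGBClassifier from the xgboost Python package (assumed to be installed separately, e.g. via pip install xgboost); the dataset and the parameter values are only for demonstration:

from xgboost import XGBClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load a small sample dataset
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# n_jobs=-1 uses all CPU cores to parallelize the construction of each tree
xgb_model = XGBClassifier(n_estimators=100, learning_rate=0.1, max_depth=3, n_jobs=-1, random_state=42)
xgb_model.fit(X_train, y_train)
print("XGBoost accuracy:", accuracy_score(y_test, xgb_model.predict(X_test)))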

Implementation of Gradient Boosting

In this section, we will look into the implementation of the gradient boosting algorithm. For this, we will use the Titanic dataset.

Here are the steps of implementation:

1. Importing the required libraries

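The original code blocks in this section were not preserved, so the snippets below are representative sketches of how each step is typically done. A usual set of imports for this workflow:

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import confusion_matrix, accuracy_score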

2. Loading the dataset

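A sketch of loading the data, assuming the Titanic data is available locally as titanic.csv (the file name and location are assumptions):

# Load the Titanic dataset from a local CSV file (path is an assumption)
df = pd.read_csv('titanic.csv')
print(df.head())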

3. Performing data preprocessing

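For preprocessing, the categorical columns are typically converted into numeric dummy variables; the column names Sex and Embarked below follow the standard Titanic dataset layout and are assumptions here:

# One-hot encode the categorical columns (drop_first avoids redundant dummy columns)
sex = pd.get_dummies(df['Sex'], drop_first=True)
embarked = pd.get_dummies(df['Embarked'], drop_first=True)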

4. Concatenating a new dataset

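The dummy variables are then concatenated with the original dataframe to form the new dataset (a sketch, continuing the assumptions above):

# Attach the encoded columns to the original dataframe
df = pd.concat([df, sex, embarked], axis=1)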

5. Dropping the columns that are not required

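Columns that do not help prediction (identifiers, free text, and the original categorical columns that have already been encoded) are dropped; the exact column names are assumptions based on the standard Titanic dataset:

# Drop identifier/text columns and the already-encoded categorical columns
df = df.drop(['PassengerId', 'Name', 'Ticket', 'Cabin', 'Sex', 'Embarked'], axis=1)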

6. Assigning empty sets a value of 0

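Missing (empty) values are then replaced with 0, as described in the step above:

# Replace missing values with 0
df = df.fillna(0)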

7. Splitting the data into train and test sets

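A sketch of this step, assuming the target column is named Survived: the features and the label are separated here, and the actual split into training and test sets is carried out in step 9 below, once the data has been scaled and the test size chosen:

# Separate the feature matrix from the target label
X = df.drop('Survived', axis=1)
y = df['Survived']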

8. Scaling the data using MinMaxScaler

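The features are scaled to the [0, 1] range with MinMaxScaler (a sketch; fitting the scaler on the training portion only is the stricter approach):

# Scale every feature to the [0, 1] range
scaler = MinMaxScaler()
X_scaled = scaler.fit_transform(X)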

9. Selecting the size of the dataset for testing

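The fraction of data held out for testing is chosen and the split is performed (the 30% figure and the random seed are assumptions):

# Hold out 30% of the scaled data for testing
test_size = 0.30
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, test_size=test_size, random_state=42)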

10. Assigning the learning rate to evaluate the classifier’s performance

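To evaluate how the learning rate affects the classifier, a common approach (sketched here with assumed hyperparameter values) is to train one model per candidate learning rate and compare the training and test accuracy:

# Try several learning rates and compare training vs. test accuracy
learning_rates = [0.05, 0.1, 0.25, 0.5, 0.75, 1]
for lr in learning_rates:
    gb = GradientBoostingClassifier(n_estimators=20, learning_rate=lr, max_features=2, max_depth=2, random_state=0)
    gb.fit(X_train, y_train)
    print("Learning rate:", lr)
    print("  Accuracy (training):", round(gb.score(X_train, y_train), 3))
    print("  Accuracy (test):", round(gb.score(X_test, y_test), 3))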

Performance of different learning rates: running the loop prints the training and test accuracy for every learning rate tried, making it easy to see which value generalizes best.

11. Creating a new gradient boosting classifier and building a confusion matrix for checking accuracy

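Finally, a gradient boosting classifier is trained with the chosen learning rate (0.5 here, an assumption) and evaluated with a confusion matrix on the test set:

# Train the final model and evaluate it with a confusion matrix
gb_clf = GradientBoostingClassifier(n_estimators=20, learning_rate=0.5, max_features=2, max_depth=2, random_state=0)
gb_clf.fit(X_train, y_train)
predictions = gb_clf.predict(X_test)

print("Confusion Matrix:")
print(confusion_matrix(y_test, predictions))
print("Accuracy:", accuracy_score(y_test, predictions))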

Output: the confusion matrix for the test-set predictions, followed by the overall accuracy of the gradient boosting classifier.

Conclusion

In this blog, we saw what Gradient Boosting is, along with AdaBoost, XGBoost, and the techniques used for building gradient boosting machines. We also implemented a boosting classifier and compared the accuracy of the model for different learning rates. This is how the gradient boosting technique can significantly enhance the performance of Machine Learning models.

