Updated on 2nd Mar, 2023

Introduction to Linear Regression in Python

Linear regression is a supervised learning algorithm commonly used for predictive analysis. As the name suggests, linear regression performs regression tasks. Now, what is regression? Regression is a technique that models the relationship between a dependent variable and one or more independent variables. In this tutorial, you will learn Linear Regression in Python.

Without much delay, let’s get started.

What Is Linear Regression?

As mentioned above, linear regression is a predictive modeling technique. It is used whenever there is a linear relation between the dependent and the independent variables.

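y = b0 + b1 * x

Here, y is the dependent variable, x is the independent variable, b0 is the intercept, and b1 is the slope (the coefficient of x). The equation is used to estimate how much y changes, on average, when x changes by a certain amount.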
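As a minimal sketch of the equation above, the snippet below evaluates the line at two values of x and shows that the change in y depends only on the slope and on the change in x. The function name `predict` and the values of `b0` and `b1` are hypothetical and chosen only for illustration.

```python
def predict(x, b0, b1):
    """Return the value of y on the line y = b0 + b1 * x."""
    return b0 + b1 * x


# Hypothetical intercept and slope, chosen only for illustration.
b0, b1 = 2.0, 0.5

y_at_10 = predict(10, b0, b1)  # 2.0 + 0.5 * 10
y_at_14 = predict(14, b0, b1)  # 2.0 + 0.5 * 14

# Increasing x by 4 changes y by b1 * 4, whatever the starting x is.
change = y_at_14 - y_at_10
print(y_at_10, y_at_14, change)
```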

Linear Regression

In the picture, a flower’s sepal length is plotted on the x-axis and its petal length on the y-axis. Linear regression lets us describe how the petal length changes with respect to the sepal length. The example below makes the idea more concrete.


Example:

Say, there is a telecom network called Neo. Its delivery manager wants to find out whether there is a relationship between a customer’s monthly charges and the customer’s tenure. So, he collects all the customer data and fits a linear regression, taking monthly charges as the dependent variable and tenure as the independent variable. The fitted model shows that there is a relationship between the two: as the tenure of the customer increases, the monthly charges also increase. The best-fit line then lets the delivery manager predict the value of y for every new value of x.

Example of Linear Regression

Let us say, the tenure of a customer is 45 months, and with the help of the best fit line, the delivery manager can predict that the customer’s monthly charges would be somewhere around $64.

Example of Linear Regression

Similarly, if the tenure of a customer is 69 months, then with the help of the best fit line the delivery manager can predict that the customer’s monthly charges would be somewhere around $110.

Example of Linear Regression

This is how linear regression works. Now, the question is: how do we find the best-fit line?


Linear Regression Line of Best Fit

The line of best fit is nothing but the line that best expresses the relationship between the data points. Let us see how to find the best fit line in linear regression.

This is where the residual concept comes into the picture which is shown in the image below:

Linear Regression Line of Best Fit

The red lines in the above image denote the residuals, which are the differences between the actual values and the predicted values. How do residuals help in finding the best-fit line?

To find the best-fit line, we use the residual sum of squares (RSS): we square each residual and add the squares up.

RSS = Σ (yi − ŷi)^2, where yi is the actual value and ŷi is the predicted value for the i-th data point

The line with the lowest value of RSS is the best fit line.
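The idea can be sketched in a few lines of plain Python. The data points below are made up for illustration, and the closed-form formulas in `least_squares_line` are the standard ordinary least squares solution for one predictor, which the text above does not spell out. The sketch compares the RSS of an arbitrary candidate line with the RSS of the least-squares line.

```python
def rss(xs, ys, b0, b1):
    """Residual sum of squares of the line y = b0 + b1 * x on the data."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


def least_squares_line(xs, ys):
    """Intercept and slope that minimize RSS for a single predictor."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_mean - b1 * x_mean
    return b0, b1


# Made-up data points, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

best_b0, best_b1 = least_squares_line(xs, ys)
rss_best = rss(xs, ys, best_b0, best_b1)

# An arbitrary candidate line to compare against (hypothetical values).
rss_candidate = rss(xs, ys, 2.0, 0.5)

print(best_b0, best_b1)
print(rss_best, rss_candidate)  # the least-squares line has the smaller RSS
```

Any other choice of intercept and slope on the same data gives an RSS at least as large as `rss_best`.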

Now, let us see how the coefficient of x influences the relationship between the independent and the dependent variables.

Regression Coefficient

In simple linear regression, if the coefficient of x is positive, then we can conclude that the relationship between the independent and the dependent variables is positive.

Regression Coefficient Positive Relation

Here, if the value of x increases, the value of y also increases.

Now, if the coefficient of x is negative, then we can say that the relationship between the independent and the dependent variables is negative.

Regression Coefficient Negative Relation

Here, if the value of x increases, the value of y decreases.
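A small sketch of both cases, assuming NumPy is available: `np.polyfit` with degree 1 fits a least-squares line and returns its slope and intercept. The two data series are made up so that one rises and one falls with x.

```python
import numpy as np

# Made-up data: one series rises with x, the other falls with x.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_rising = np.array([3.0, 5.0, 7.0, 9.0, 11.0])
y_falling = np.array([9.0, 7.0, 5.0, 3.0, 1.0])

# With degree 1, np.polyfit returns [slope, intercept] of the least-squares line.
slope_up, intercept_up = np.polyfit(x, y_rising, 1)
slope_down, intercept_down = np.polyfit(x, y_falling, 1)

print("positive relationship:", slope_up > 0)
print("negative relationship:", slope_down < 0)
```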

Now, let us see how we can apply these concepts to build linear regression models. In the Python linear regression examples below, we will build two machine learning models: one for simple and one for multiple linear regression. Let’s begin.


Hands-on: Linear Regression Using Python Scikit-learn on the Boston Housing Prices Dataset

  • Environment: Python 3 and Jupyter Notebook
  • Library: Pandas
  • Module: Scikit-learn

Understanding the Dataset

Before we get started with the Python linear regression hands-on, let us explore the dataset. We will be using the Boston House Prices Dataset, which has 506 rows, 13 attributes, and a target column. Let’s take a quick look at the dataset.

This data frame contains the following columns:

  • Crim: Per capita crime rate by town
  • Zn: Proportion of residential land zoned for lots over 25,000 sq. ft.
  • Indus: Proportion of non-retail business acres per town
  • Chas: Charles River dummy variable (= 1 if tract bounds river; 0, otherwise)
  • Nox: Nitrogen oxides concentration (parts per 10 million)
  • Rm: Average number of rooms per dwelling
  • Age: Proportion of owner-occupied units built before 1940
  • Dis: Weighted mean of distances to five Boston employment centers
  • Rad: Index of accessibility to radial highways
  • Tax: Full-value property tax rate per $10,000
  • Ptratio: Pupil–Teacher ratio by town
  • Black: 1000(Bk − 0.63)^2, where Bk is the proportion of Blacks by town
  • Lstat: Lower status of the population (percent)
  • Medv: Median value of owner-occupied homes in $1000s

In this Python Linear Regression example, we will train two models to predict the price.

Model Building

Now that we are familiar with the dataset, let us build the Python linear regression models.

Simple Linear Regression in Python

Consider ‘lstat’ as the independent variable and ‘medv’ as the dependent variable.

Step 1: Load the Boston dataset

Step 2: Have a glance at the shape

Step 3: Have a glance at the dependent and independent variables

Step 4: Visualize the change in the variables

Step 5: Divide the data into independent and dependent variables

Step 6: Split the data into train and test sets


Step 7: Shape of the train and test sets

Step 8: Train the algorithm

Step 9: Retrieve the intercept

Step 10: Retrieve the slope

Step 11: Predicted value


Step 12: Actual value

Step 13: Evaluate the algorithm
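The steps above can be sketched as follows. This is a minimal sketch and not the original notebook code. Because the `load_boston` loader was removed from scikit-learn in version 1.2, it builds a synthetic stand-in with the same number of rows and the column names `lstat` and `medv`. The generating coefficients, the 80/20 split, the random seeds, and the choice of MAE, MSE, and RMSE as evaluation metrics are all assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split

# Steps 1-2: build a synthetic stand-in (NOT the real Boston data) and check its shape.
rng = np.random.default_rng(0)
lstat = rng.uniform(2, 38, size=506)
medv = 34.0 - 0.9 * lstat + rng.normal(0, 3, size=506)  # assumed relationship
df = pd.DataFrame({"lstat": lstat, "medv": medv})
print(df.shape)

# Step 3: glance at the independent and dependent variables.
print(df[["lstat", "medv"]].head())

# Step 4: plotting is left out to avoid a matplotlib dependency;
# df.plot.scatter(x="lstat", y="medv") would draw the scatter plot.

# Step 5: separate the independent (2-D) and dependent (1-D) variables.
X = df[["lstat"]]
y = df["medv"]

# Steps 6-7: hold out 20% of the rows for testing and check the shapes.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
print(X_train.shape, X_test.shape)

# Step 8: train the algorithm.
model = LinearRegression()
model.fit(X_train, y_train)

# Steps 9-10: intercept (b0) and slope (b1) of the fitted line.
print(model.intercept_, model.coef_[0])

# Steps 11-12: predicted values next to the actual values.
y_pred = model.predict(X_test)
comparison = pd.DataFrame({"actual": y_test.to_numpy(), "predicted": y_pred})
print(comparison.head())

# Step 13: evaluate the algorithm on the held-out rows.
mae = mean_absolute_error(y_test, y_pred)
mse = mean_squared_error(y_test, y_pred)
rmse = np.sqrt(mse)
print(mae, mse, rmse)
```

With a real copy of the dataset, only steps 1 and 2 would change, for example by reading the data into `df` with pandas.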


Multiple Linear Regression in Python

Here, consider ‘medv’ as the dependent variable and the rest of the attributes as independent variables.

Step 1: Load the Boston dataset

Step 2: Set up the dependent and the independent variables

Step 3: Have a glance at the independent variable

Step 4: Have a glance at the dependent variable

Step 5: Divide the data into train and test sets

Step 6: Have a glance at the shape of the train and test sets

Step 7: Train the algorithm

Step 8: Have a look at the coefficients that the model has chosen

Step 9: Concatenate the DataFrames to compare

Step 10: Compare the predicted value to the actual value

Step 11: Evaluate the algorithm
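A minimal sketch of these steps follows. It is not the original notebook code. It again uses synthetic data, because recent scikit-learn releases no longer ship the Boston dataset. To stay short, it uses only three made-up predictors named after the columns `rm`, `lstat`, and `ptratio` rather than all 13 attributes. The generating coefficients, the 70/30 split, and the RMSE and R² metrics are assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Step 1: synthetic stand-in data (NOT the real Boston data).
rng = np.random.default_rng(1)
n = 506
df = pd.DataFrame({
    "rm": rng.normal(6.3, 0.7, n),
    "lstat": rng.uniform(2, 38, n),
    "ptratio": rng.uniform(12, 22, n),
})
df["medv"] = (
    15.0 + 5.0 * df["rm"] - 0.6 * df["lstat"] - 0.8 * df["ptratio"]
    + rng.normal(0, 3, n)  # noise
)

# Steps 2-4: set up and inspect the independent and dependent variables.
X = df.drop(columns="medv")
y = df["medv"]
print(X.head())
print(y.head())

# Steps 5-6: split into train and test sets and check the shapes.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)
print(X_train.shape, X_test.shape)

# Step 7: train the algorithm.
model = LinearRegression().fit(X_train, y_train)

# Step 8: one coefficient per independent variable.
coefficients = pd.DataFrame({"feature": X.columns, "coefficient": model.coef_})
print(coefficients)

# Steps 9-10: concatenate actual and predicted values side by side.
y_pred = model.predict(X_test)
comparison = pd.concat(
    [y_test.reset_index(drop=True).rename("actual"),
     pd.Series(y_pred, name="predicted")],
    axis=1,
)
print(comparison.head())

# Step 11: evaluate the algorithm on the held-out rows.
rmse = np.sqrt(mean_squared_error(y_test, y_pred))
r2 = r2_score(y_test, y_pred)
print(rmse, r2)
```

Each coefficient estimates the change in `medv` for a one-unit increase in that variable while the other variables are held fixed.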


What Did We Learn?

In this module, we covered linear regression in Python, the best-fit line, and the coefficient of x. Toward the end, we built two linear regression models, one for simple and one for multiple linear regression, using sklearn in Python. In the next module, we will talk about logistic regression. Let’s meet there!
