Understanding Lasso Regression Using Python

Vikram Singh
Assistant Manager - Content
Updated on Nov 25, 2022 19:16 IST

Lasso is a regularization technique that reduces model complexity by adding a penalty. The penalty shrinks the coefficient sizes and allows some coefficients to shrink exactly to zero. This article briefly discusses what lasso regression is and how to implement it in Python.


LASSO is a regularization technique used in linear regression models for shrinkage and variable selection. Lasso imposes a constraint on the parameters that causes the regression coefficients of some variables to shrink toward zero. The variables whose regression coefficients become zero are excluded from the model. The main goal of using lasso regression is to find the subset of predictors that minimizes the prediction error of a linear regression model. 

Before discussing lasso regression, let’s discuss an important topic: “What is Regularization?”

Regularization is used in machine learning models to prevent the model from overfitting by adding penalties.

  • In regularization, we reduce the magnitude of the coefficients.

There are two regularization techniques commonly used to shrink the regression coefficients, as sketched below:

  • Lasso Regression (also referred to as L1 regularization)
  • Ridge Regression (also referred to as L2 regularization)
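
As a quick, illustrative sketch (not part of the original tutorial, and assuming scikit-learn plus a small synthetic dataset), the snippet below contrasts the two penalties: the L1 penalty tends to set irrelevant coefficients exactly to zero, while the L2 penalty only shrinks them.

#illustrative sketch: L1 (lasso) vs L2 (ridge) regularization in scikit-learn
import numpy as np
from sklearn.linear_model import Lasso, Ridge

#synthetic data: only the first two of five features actually matter
rng = np.random.RandomState(0)
X = rng.randn(100, 5)
y = 3 * X[:, 0] - 2 * X[:, 1] + 0.1 * rng.randn(100)

print(Lasso(alpha = 0.1).fit(X, y).coef_)   #irrelevant coefficients are typically exactly 0
print(Ridge(alpha = 0.1).fit(X, y).coef_)   #coefficients are shrunk but stay non-zero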

This article will briefly discuss how to use lasso regression in a linear regression model.

So, without further delay, let’s dive deep to learn more about lasso regression.


What is Lasso Regression?

LASSO stands for Least Absolute Shrinkage and Selection Operator.

Lasso regression is a regularization technique that reduces the model complexity by adding a penalty.

  • Lasso penalizes the model based on the sum of the Absolute Coefficient Values, also referred to as the L1 penalty.
  • The L1 penalty minimizes the size of the coefficients and allows some coefficients to shrink to zero. 
    • The variables whose regression coefficients become zero are excluded from the model. 
  • In lasso, the coefficient estimates are shrunk towards a central point (zero).
  • It is mainly used when the dataset has high dimensionality and highly correlated features.

Note: A hyperparameter, lambda, is multiplied with the L1 penalty to control the weight (strength) of the penalty.
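
In scikit-learn, this lambda hyperparameter is exposed as the alpha argument of Lasso. A minimal sketch (with made-up synthetic data, not the tutorial's dataset) of how increasing alpha drives more coefficients to exactly zero:

#illustrative sketch: larger alpha (lambda) means stronger shrinkage and more zero coefficients
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.RandomState(1)
X = rng.randn(60, 4)
y = 2 * X[:, 0] + 0.5 * X[:, 1] + 0.1 * rng.randn(60)

for alpha in [0.01, 0.1, 1.0]:
    print(alpha, Lasso(alpha = alpha).fit(X, y).coef_)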

Mathematical Formula of Lasso Regression

The objective that lasso regression minimizes is:

Residual Sum of Squares + lambda * (sum of the absolute values of the coefficients)

i.e.,

Σᵢ ( yᵢ − ŷᵢ )²  +  λ Σⱼ | βⱼ |

where ŷᵢ is the predicted value for observation i, βⱼ are the regression coefficients, and λ ≥ 0 controls the strength of the penalty.
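
As a worked illustration of this objective (a small sketch with made-up numbers, not the tutorial's dataset), the penalized loss can be computed directly with NumPy:

#compute the lasso objective: RSS + lambda * sum(|coefficients|), ignoring the intercept for simplicity
import numpy as np

X = np.array([[1.0, 2.0], [2.0, 0.5], [3.0, 1.5]])   #feature matrix (3 samples, 2 features)
y = np.array([3.0, 2.5, 5.0])                        #target values
beta = np.array([1.2, 0.4])                          #candidate coefficients
lam = 0.5                                            #penalty weight (lambda)

rss = np.sum((y - X @ beta) ** 2)        #residual sum of squares
penalty = lam * np.sum(np.abs(beta))     #L1 penalty term
print(rss + penalty)                     #value of the lasso objective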

Implementation of Lasso Regression in Python

About the Dataset

The data was extracted from the 1974 Motor Trend US magazine and comprises fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973–74 models). The dataset is used here to implement lasso regression and examine its effect on the R2 score of the model.

The data frame has 32 observations on 11 numeric variables and 1 categorical variable (the car model name):

  • mpg: Miles/(US) gallon
  • cyl: Number of cylinders
  • disp: Displacement (cu.in.)
  • hp: Gross horsepower
  • drat: Rear axle ratio
  • wt: Weight (1000 lbs)
  • qsec: 1/4 mile time
  • vs: Engine (0 = V-shaped, 1 = straight)
  • am: Transmission (0 = automatic, 1 = manual)
  • gear: Number of forward gears
  • carb: Number of carburetors

Step 1: Import the Dataset

 
#import Libraries
import pandas as pd
import numpy as np
#import the dataset
mt = pd.read_csv('mtcars.csv')
#check the first five rows of the data
mt.head()

Step 2: Check for Null Values

 
#check for null values
mt.info()

The output of mt.info() shows that no feature contains null values.

Step 3: Drop the Categorical Variable

 
#drop the categorical variable: model from the dataset
mt.drop(['model'], axis = 1, inplace = True)
mt

Step 4: Create the Feature and Target Variables

 
#create the feature matrix and the target variable
x = mt.drop(columns = 'hp')
y = mt['hp']

Step 5: Train-Test Split

 
#splitting the data into train and test set
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size = 0.3, random_state = 4)

Step 6: Fit the Linear Regression Model

 
#fitting data into the model
from sklearn.linear_model import LinearRegression
lr = LinearRegression()
lr.fit(x_train, y_train)

Step 7: Check the R-squared (R2) Score

 
#import r2_score
from sklearn.metrics import r2_score
 
#r2-score for the train data
x_pred_train = lr.predict(x_train)
r2_score(y_train, x_pred_train)
 
#r2-score for test data
x_pred_test = lr.predict(x_test)
r2_score(y_test, x_pred_test)

Step 8: Build the Lasso Model

 
from sklearn.linear_model import Lasso
lasso = Lasso()
lasso.fit(x_train, y_train)
 
x_pred_lasso_test = lasso.predict(x_test)
r2_score(y_test, x_pred_lasso_test)
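
By default, Lasso() uses a penalty weight of alpha = 1.0. As a possible extension (a sketch, not part of the original article), one can inspect which coefficients the L1 penalty has driven to zero and tune alpha by cross-validation with scikit-learn's LassoCV:

#inspect which predictors lasso kept (non-zero coefficients) and which it dropped
import pandas as pd
print(pd.Series(lasso.coef_, index = x.columns))

#optionally, tune the penalty weight alpha by cross-validation
from sklearn.linear_model import LassoCV
lasso_cv = LassoCV(cv = 5, random_state = 4)
lasso_cv.fit(x_train, y_train)
print(lasso_cv.alpha_)                              #best alpha found by cross-validation
print(r2_score(y_test, lasso_cv.predict(x_test)))   #r2-score on the test data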