Bias and Variance with Real-Life Examples


This blog revolves around bias and variance and their tradeoff. These concepts are explained with respect to overfitting and underfitting, with the help of examples.


A machine learning model is trained on some data. It then finds patterns by analyzing that data and makes predictions accordingly. Not all of those predictions are 100% correct; that is not even possible. The model makes mistakes while predicting for numerous reasons, and these mistakes boil down to bias and variance, which we are going to cover in today's blog.

Bias and variance are must-know concepts for every data scientist and among the most common questions in data science interviews.

In this blog, we will cover:

  • Bias and Variance with examples
  • Bias and variance tradeoff


Before understanding bias and variance, we have to understand the concepts of overfitting and underfitting. Overfitting refers to the problem of a model fitting the data too closely: the model tries to memorize all the data you give it during training. On the other hand, underfitting describes the situation where a model performs poorly even on its training data, because it doesn't learn much from that data.

Also, explore:

Cross-validation techniques
Overfitting and Underfitting with a real-life example
How to improve machine learning model

What is Bias?

Bias is the error that measures the difference between the average prediction of our model and the actual value that we are trying to predict.
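In symbols (standard textbook notation, not from the original post): if f(x) is the true value for an input x and ŷ(x) is the model's prediction, averaged over the different training sets the model could have seen, then

```latex
\mathrm{Bias}\big(\hat{y}(x)\big) = \mathbb{E}\big[\hat{y}(x)\big] - f(x)
```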

A model suffering from high bias is a simple model that pays very little attention to the training data. This type of model always leads to a high error on both training and test data. Let's take an example. Suppose we want our model to identify an animal from its photo, and we trained the model on only one attribute, pointed_ears. When we showed the model an image of a cat, it predicted a fox, since a fox also has pointed ears.

This shows the model is not able to capture other details while predicting, because it has high bias.

Characteristics of a high bias model include:

  • Not able to capture proper data trends
  • Too simple to capture the underlying relationship in the data, so it gives less accurate results
  • Suffers from underfitting
  • A more general or simple model
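Here is a minimal sketch of high bias in code, using hypothetical data and scikit-learn (the dataset and numbers below are made up for illustration): a straight line is fitted to clearly nonlinear data, so the error stays high on both the training and the test set.

```python
# High bias: a straight line fitted to nonlinear (sine-shaped) data underfits.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.uniform(0, 6, 200).reshape(-1, 1)          # single feature
y = np.sin(X).ravel() + rng.normal(0, 0.1, 200)    # nonlinear target + a little noise

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LinearRegression().fit(X_train, y_train)   # too simple for this data
print("train MSE:", mean_squared_error(y_train, model.predict(X_train)))
print("test MSE: ", mean_squared_error(y_test, model.predict(X_test)))
# Both errors come out high: the model underfits, i.e., it has high bias.
```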

What is Variance?

Variance is the opposite of bias. It is an error that measures how much the predicted values for the same input scatter around their average when the training data changes.
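In the same notation as above (again, a standard textbook definition rather than something from this post), variance measures how far the predictions scatter around their own average:

```latex
\mathrm{Var}\big(\hat{y}(x)\big) = \mathbb{E}\Big[\big(\hat{y}(x) - \mathbb{E}[\hat{y}(x)]\big)^{2}\Big]
```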

Variance can be defined as the model's sensitivity to fluctuations in the data. If the model is allowed to view the data too many times, it will learn very well, but only for that data. It will capture most patterns in the data, but it will also learn from the unnecessary details present, i.e., from the noise. Learning the noise causes our model to consider trivial features as important. In this case, our model is overfitted. Continuing the animal-prediction example above: if we consider fur as a feature, that feature acts as noise, since many animals have fur.

Note: Noise here means irrelevant details which are not required for predicting the output.

If you train the model on some 100 images of cats and dogs and then show it the same images again, it will predict them correctly. But if you show it different cat and dog images, the model will not be able to predict them correctly. Such a model performs well during the training phase but not during the test phase, and it might be latching onto overly specific features like the exact shape of the nose and ears. When the variance is high, our model captures all the features of the data given to it, tunes itself to that data, and predicts it very well, but only that data.

A model should show only small variation in its predicted values when the training data set changes. Continuing the same cat example, this time we gave the model more features for training, and it started fitting to irrelevant ones such as fur.

Variance errors are classified as either low variance or high variance.

  • Low variance: A model has a small variation in the predicted values with changes in the training data set.
  • High variance: A model has a high variation in the predicted values with changes in the training data set. A model having high variance learns everything shown to it and performs well with the training dataset, but not on test data. 
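A minimal sketch of high variance on the same kind of hypothetical data as before (names and numbers are illustrative only): an unconstrained decision tree memorizes the training set, so the training error is near zero while the test error stays much higher.

```python
# High variance: an unpruned decision tree memorizes the training data.
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 6, 200).reshape(-1, 1)
y = np.sin(X).ravel() + rng.normal(0, 0.3, 200)    # signal + noise

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

tree = DecisionTreeRegressor(random_state=0).fit(X_train, y_train)  # no depth limit
print("train MSE:", mean_squared_error(y_train, tree.predict(X_train)))  # ~0
print("test MSE: ", mean_squared_error(y_test, tree.predict(X_test)))    # much higher
# A near-zero training error paired with a much larger test error is the
# signature of overfitting, i.e., high variance.
```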

Understanding with an example

Suppose you want to predict house price as a function of house area.

Let's say the blue dots in the original figure are training samples and the orange dots are test samples. We can train a model that fits these blue dots perfectly, which means our model is an overfitted model. An overfit model tries to fit exactly to the training samples but not to the test samples; that's why the training error becomes close to zero while the test error is high.

Nonlinear model

Now let's say you want to figure out the error for one particular orange test data point. The error is the distance between the model's prediction and the actual value (the gray dotted line in the original figure). You can measure the error for every point in your test data set and average it out.

Let's say you get this average test error as 100. When you split your dataset, you pick your training samples at random. Suppose your friend uses the same model and the same methodology but happens to choose a different set of training samples. In both scenarios the training error will be zero, because you are both overfitting the model. Yet you get a test error of 100 while your friend gets a test error of 27. Why are you getting a high error compared with your friend even after using the same methodology and the same data?

This is because the test error varies greatly based on your selection of training data points. This is called high variance: there is high variability in the test error depending on which training samples you select. Since you select training samples at random, your test error varies randomly, which is not good, and this is the common issue with overfit models.

The next question that comes to mind is: what happens if we use a linear model?

Linear model

When you select a different set of training data points, your training and test errors remain roughly the same. This means there is much less variability, i.e., the linear model has low variance.
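You can reproduce this behaviour with a small simulation (hypothetical data again; the setup below is just a sketch): repeat the random train/test split several times and compare how much the test error of an overfit tree jumps around versus that of a plain linear model.

```python
# Test-error variability across random training samples:
# large for an overfit tree, small for a simple linear model.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 6, 200).reshape(-1, 1)
y = 2.0 * X.ravel() + rng.normal(0, 1.0, 200)      # roughly linear relationship

tree_errors, linear_errors = [], []
for seed in range(10):                             # ten different random splits
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=seed)
    tree = DecisionTreeRegressor(random_state=0).fit(X_tr, y_tr)
    line = LinearRegression().fit(X_tr, y_tr)
    tree_errors.append(mean_squared_error(y_te, tree.predict(X_te)))
    linear_errors.append(mean_squared_error(y_te, line.predict(X_te)))

# The spread (standard deviation) of the test error across splits is exactly
# the "variance" this section talks about.
print(f"tree   test MSE: mean {np.mean(tree_errors):.2f}, std {np.std(tree_errors):.2f}")
print(f"linear test MSE: mean {np.mean(linear_errors):.2f}, std {np.std(linear_errors):.2f}")
```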

Examples of bias and variance

Some machine learning algorithms with low bias are k-Nearest Neighbours, Decision Trees, and Support Vector Machines. On the other hand, some machine learning algorithms with high bias are Linear Regression and Logistic Regression.

Summary

High bias → Underfitting → High training and test error

High variance → Overfitting → Low training error, high test error

Bias-variance tradeoff

So far we have seen that, to avoid overfitting and underfitting, we want both bias and variance to be low. In practice, however, decreasing one tends to increase the other.

If the model has few parameters, it may have low variance but high bias. Whereas if the model is complex with a large number of parameters, it will have high variance and low bias. So there is a need to strike a balance between the bias and variance errors, and this balance is known as the bias-variance trade-off.

Note: Typically, when the model suffers from high bias it has low variance, and vice versa (although, as case 4 below shows, a poorly chosen model can suffer from both).
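A standard result (not derived in this post) is that the expected test error decomposes as bias² + variance + irreducible noise, so minimizing it means balancing the first two terms. One common way to find that balance, sketched below on hypothetical data, is to sweep the model complexity (here, the polynomial degree) and pick the setting with the lowest validation error.

```python
# Bias-variance tradeoff: sweep polynomial degree, watch the validation error.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, 40).reshape(-1, 1)           # small sample so overfitting shows
y = np.cos(1.5 * np.pi * X).ravel() + rng.normal(0, 0.1, 40)

X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

for degree in (1, 4, 8, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    train_err = mean_squared_error(y_train, model.predict(X_train))
    val_err = mean_squared_error(y_val, model.predict(X_val))
    print(f"degree {degree:2d}: train MSE {train_err:.4f}, validation MSE {val_err:.4f}")
# Degree 1 underfits (high bias) and degree 15 overfits (high variance);
# a middle degree gives the lowest validation error, i.e. the tradeoff point.
```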

Consider the classic bull's-eye diagram. The center, i.e., the bull's eye, is the result we want to achieve: a model that predicts all the values correctly. As we move away from the bull's eye, our model starts to make more and more wrong predictions.

  1. Low-Bias, Low-Variance: This combination is the ideal machine learning model. However, it is not practically achievable.
  2. Low-Bias, High-Variance: This is a case of overfitting where model predictions are inconsistent but accurate on average. The predicted values will be accurate on average but scattered.
  3. High-Bias, Low-Variance: This is a case of underfitting where predictions are consistent but inaccurate on average. The predicted values will be inaccurate but not scattered.
  4. High-Bias, High-Variance: With high bias and high variance, predictions are inconsistent and also inaccurate on average.

Endnotes

In this blog, we talked about bias and variance with examples and also studied the bias-variance tradeoff. We saw that complex nonlinear models tend to have high variance, while simple linear models tend to have low variance (but higher bias).

