Normal Distribution: Definition and Examples

Vikram Singh
Assistant Manager - Content
Updated on Jun 17, 2022 10:50 IST

Introduction:

A probability distribution is a function that describes how likely each possible value of a random variable is in a random experiment.

In this article, we will discuss one of the probability distributions most commonly used in Data Science: the Normal Distribution, also called the Gaussian Distribution.

To know more about Random Variable, read the article Introduction to Probability.

To know about other probability distributions, read the article Probability Distribution used in Data Science.

Normal Distribution:

Normal Distribution or Gaussian Distribution (named after the German mathematician Carl Friedrich Gauss) is a continuous probability distribution that is symmetric about its mean: values near the mean occur most frequently, and values become less likely the farther they lie from the mean.

Examples:

  • Heights of students in a school
  • Scores of students in an exam

Note: Normal distribution is often known as the bell-shaped curve.

Before going further, let's look at an example.

Consider an experiment that records the number of books read by students in a school.

2022_02_normal1.jpg

Mathematical Definition:

2022_02_mathematical-definition_normal-distribution.jpg
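The formula image is not reproduced in this text. For reference, the standard form of the normal probability density function is given below, with mean μ and standard deviation σ (the symbols are an assumption about what the image uses):

```latex
f(x \mid \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}}
  \exp\!\left(-\frac{(x-\mu)^{2}}{2\sigma^{2}}\right),
  \qquad -\infty < x < \infty,\ \sigma > 0
```

Here μ sets the center of the curve and σ² is the variance, which controls its spread.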

To know more about the mean and variance, read the articles on Measures of Central Tendency and Measures of Dispersion.

Example: Normal Distribution curve using Python

# importing libraries

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm  # norm: normal distribution object (pdf, cdf, ...)

# distribution parameters

sample_set = np.arange(-20, 20, 0.1)
# arange(start, stop, step-size): generates an evenly spaced sequence with a constant step size
mean = np.mean(sample_set)        # mean of the x-values
sd = np.std(sample_set, ddof=1)   # sample standard deviation (same result as statistics.stdev)

# plot the normal density function with the computed mean and standard deviation

plt.plot(sample_set, norm.pdf(sample_set, mean, sd))
plt.show()
2022_02_normal-distribution-definition-and-example_example-1.jpg

Effect of Mean and standard deviation on Normal Distribution:

Same Mean – Different Standard Deviation

# import libraries
 
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm
 
# distribution parameters
sample_set = np.arange(-20, 20, 0.1) 
#arange (start, stop, step-size): used to generate linear sequence with a constant step size
 
# define multiple normal distributions: same mean and different standard deviation
 
plt.plot(sample_set, norm.pdf(sample_set, 0, 1), label='μ: 0, σ: 1')
plt.plot(sample_set, norm.pdf(sample_set, 0, 1.5), label='μ: 0, σ: 1.5')
plt.plot(sample_set, norm.pdf(sample_set, 0, 2), label='μ: 0, σ: 2')

# add legend to plot and display it
plt.legend()
plt.show()
2022_02_normal-distribution-definition-and-example_same-mean.jpg

From the plot above, we see that when the mean is held constant, a larger standard deviation gives a flatter and wider curve.
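The flattening can also be checked numerically. The sketch below assumes the same scipy.stats.norm object used in the plots and compares the peak height of each curve with the closed-form value 1 / (σ√(2π)). The helper name peak_height is introduced here only for illustration.

```python
import numpy as np
from scipy.stats import norm

def peak_height(sigma, mu=0.0):
    """Height of the normal density at its mean (hypothetical helper)."""
    return norm.pdf(mu, mu, sigma)

sigmas = [1, 1.5, 2]
peaks = [peak_height(s) for s in sigmas]

for s, p in zip(sigmas, peaks):
    # The closed-form peak of a normal density is 1 / (sigma * sqrt(2 * pi))
    expected = 1 / (s * np.sqrt(2 * np.pi))
    print(f"sigma = {s}: peak = {p:.4f}, closed form = {expected:.4f}")
```

Because the total area under every curve stays equal to 1, a lower peak has to come with a wider spread.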

Different Mean – Same Standard Deviation

#import libraries
 
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm
 
# distribution parameters
 
sample_set = np.arange(-20, 20, 0.1) 
#arange (start, stop, step-size): used to generate linear sequence with a constant step size
 
#define multiple normal distributions: different mean and same standard deviation
 
plt.plot(sample_set, norm.pdf(sample_set, -2, 2), label='μ: -2, σ: 2')
plt.plot(sample_set, norm.pdf(sample_set, 0, 2), label='μ: 0, σ: 2')
plt.plot(sample_set, norm.pdf(sample_set, 2, 2), label='μ: 2, σ: 2')

# add legend to plot and display it

plt.legend()
plt.show()
2022_02_normal-distribution-definition-and-example_different-mean.jpg

From the plot above, we see that changing the mean shifts the curve to the left or to the right without changing its shape.
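A quick numerical check of the shift, again assuming scipy.stats.norm: the curve with mean μ is the mean-0 curve moved by μ along the x-axis, so its peak sits at x = μ. The variable names below are illustrative.

```python
import numpy as np
from scipy.stats import norm

x = np.arange(-20, 20, 0.1)
sigma = 2

peak_locations = {}
for mu in (-2, 0, 2):
    density = norm.pdf(x, mu, sigma)
    # Grid point where the curve is highest; should be (close to) mu
    peak_locations[mu] = x[np.argmax(density)]
    # Shifting x by mu reproduces the mean-0 curve
    assert np.allclose(density, norm.pdf(x - mu, 0, sigma))

print(peak_locations)
```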

Properties of Normal Distribution:

  • Symmetric: The shape of the normal distribution is perfectly symmetric about the mean, i.e. half of the total probability lies on each side of the mean.
  • Mean = Median = Mode: All three measures of central tendency coincide at the center of the Normal distribution.
  • The total area under the curve is 1.
  • Empirical Rule:

In a Normal Distribution, fixed proportions of the data fall within one, two, and three standard deviations of the mean. These proportions are given by the Empirical rule, also known as the 68 – 95 – 99.7 rule or the Three sigma rule.

It states that in a Normal Distribution:

  • About 68% of the data will be within one Standard Deviation of the Mean
  • About 95% of the data will be within two Standard Deviations of the Mean
  • About 99.7% of the data will be within three Standard Deviations of the Mean
2022_02_MicrosoftTeams-image-3-1.jpg
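The properties listed above can be verified numerically. The following is a minimal sketch using scipy.stats.norm and scipy.integrate.quad. The parameters μ = 10 and σ = 3 are arbitrary values chosen for illustration, and the same checks should hold for any other choice.

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

mu, sigma = 10.0, 3.0  # arbitrary illustrative parameters

# Total area under the density curve, by numerical integration
area, _ = quad(norm.pdf, -np.inf, np.inf, args=(mu, sigma))

# Median (50th percentile) of the distribution; should equal the mean
median = norm.ppf(0.5, mu, sigma)

# Probability of falling within k standard deviations of the mean
within = {
    k: norm.cdf(mu + k * sigma, mu, sigma) - norm.cdf(mu - k * sigma, mu, sigma)
    for k in (1, 2, 3)
}

print(f"area = {area:.4f}, median = {median:.1f}")
for k, p in within.items():
    print(f"within {k} standard deviation(s): {p:.4f}")
```

The three probabilities come out close to 0.6827, 0.9545, and 0.9973, which are the rounded figures behind the 68 – 95 – 99.7 rule.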

Standard Normal Distribution:

It is a special case of Normal Distribution for which mean = 0 and standard deviation = 1.

For a normally distributed random variable X with mean μ and standard deviation σ, the Standard Normal Distribution is given by:

2022_02_formula_snd.jpg
2022_02_standard-normal-deviation.jpg
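The formula image is not reproduced in this text. Assuming it shows the usual standardization z = (x - μ) / σ, the sketch below converts a value of a normal variable X to a z-score and checks that the probability is unchanged. The parameters (μ = 70, σ = 10) and the score 85 are made-up numbers used only for illustration.

```python
from scipy.stats import norm

mu, sigma = 70.0, 10.0   # hypothetical exam-score distribution
x = 85.0                 # hypothetical score

# Standardize: how many standard deviations x lies from the mean
z = (x - mu) / sigma

# P(X <= x) under N(mu, sigma) equals P(Z <= z) under the standard normal
p_original = norm.cdf(x, mu, sigma)
p_standard = norm.cdf(z)

print(f"z = {z}")
print(f"P(X <= {x}) = {p_original:.4f}")
print(f"P(Z <= {z}) = {p_standard:.4f}")
```

Standardizing puts values from different normal distributions on one common scale, which is why a single standard normal table is enough for all of them.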

Conclusion:

In this article, we discussed one of the most important probability distributions, the Normal Distribution, with examples in Python.

We hope this article helps you in your data science and machine learning journey.

About the Author
Vikram Singh
Assistant Manager - Content

Vikram has a Postgraduate degree in Applied Mathematics, with a keen interest in Data Science and Machine Learning. He has experience of 2+ years in content creation in Mathematics, Statistics, Data Science, and Mac...