68-95-99.7 Rule: Definition and Implementation

68-95-99.7 Rule: Definition and Implementation

3 mins read1.3K Views Comment
Updated on Aug 26, 2024 17:41 IST

68-95-99.7 Rule or the empirical rule is based on mean and standard deviation. It is a shorthand for remembering percentage of values lying within interval estimate in the normal distribution.

2022_09_Add-a-heading-2.jpg

68-95-99.7 rule is an Empirical Rule followed by all the data following a normal distribution. Through this article, you will learn about the rule in detail.

Table of Contents

In order to understand the 68-95-99.7 rule let’s first understand the normal distribution with the below scenario:

  • This is Jimmy
  • Jimmy is an average 40-year Software Engineer
  • He works 40hrs per week
  • Jimmy earns 1.5lakh rupee per month

Now here’s the question:

Q: How often do you think that you are going to meet someone who earns 5x that of Jimmy?

Q: How often do you think you are going to meet a person who works 5x more per week? 

Well, the answer will definitely differ from the kind of data you are looking at. It might be possible that 5x greater than average is common.

What is Normal Distribution?

Normal Distribution or Gaussian Distribution is defined as a symmetrical bell-shaped curve. The peak of the curve denotes the average of the data points. When compiled and graphed, many continuous large datasets in nature exhibit this bell-shaped curve.

Let’s check out some of the common examples of simple data following Normal Distribution:

Height of Person

The population’s height is an example of normal distribution. The majority of persons in a given population are of average height. The number of persons who are taller and shorter than the average height is almost equal, and only a few people are either extraordinarily tall or extremely short. As a result, when we plot the height of all the people in the world, the graph will follow a bell-shaped curve denoting the normal distribution.

Explore normal distribution certifications

Rolling a Dice

A fair dice (more than 1 dice) roll also creates a bell-shaped curve denoting normal distribution. The normal distribution graph will become increasingly sophisticated as the number of dice increases.

2022_09_dice-roll-68-95-99.7-Rule.jpg

Similarly, there are many more things in the world that follow this distribution:

  • IQ of a population sample
  • Income Distribution In Econom
  • Performance of employees in the company
  • Shoe size of a person
  • Birth weight of a population, and many more

Learn more about normal distribution

Recommended online courses

Best-suited Statistics for Data Science courses for you

Learn Statistics for Data Science with these high-rated online courses

Free
12 weeks
– / –
12 weeks
– / –
10 days
Free
12 weeks
– / –
8 weeks
– / –
10 days
– / –
12 weeks
– / –
10 days

Understanding 68-95-99.7 Rule

After the observation of multiple normal distribution curves, statisticians found a common pattern in the data. They observed that all the data showed some pattern at 68%, 95%, and 99.5%. Since this was a common observation in all the normal distribution graphs, it became a rule and we call it the Empirical Rule of Normal Distribution (also known as the 68-95-99 rule).

In other words, the Empirical Rule is a rule based on experience and observation. It gives how much of the data lies within one, two, or three standard deviations of the mean. 

The 68–95–99.7 was first discovered and named by Abraham de Moivre in 1733 during his experimentation of flipping 100 fair coins. This term was coined 75 years before the normal distribution model even was introduced.

Let’s understand this 68-95-99.7 rule more clearly,

As per Empirical Rule:

  • 68% of observed data points lie within one standard deviation of the mean
  • 95% fall within two standard deviations, and 
  • 99.7% occurs within three standard deviations.

The empirical rule is also known as three-sigma because practically all data fall within the three standard deviations. This rule establishes both upper and lower bounds of a statistical chart at +/- three standard deviations. Since 99.7 percent of all numbers fall inside it, this limit is also used to identify the outliers. All the data points which lie beyond +/- 3 standard deviations are considered an outlier. 

Learn What is an outlier?

Implementing the 68-95-99.7 Rule using Python

Let us now understand the implementation of this empirical rule using Python:


 
import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import norm
# setting the values of
# mean and Standard Deviation
mean = 0
SD = 1
# value of Cumulative distribution function (cdf)
# between one, two and three S.D. around the mean
one_sd = norm.cdf(SD, mean, SD) - norm.cdf(-SD, mean, SD)
two_sd = norm.cdf(2 * SD, mean, SD) - norm.cdf(-2 * SD, mean, SD)
three_sd = norm.cdf(3 * SD, mean, SD) - norm.cdf(-3 * SD, mean, SD)
# printing the value of fractions
print("Values within 1 SD =", one_sd)
print("Values within 2 SD =", two_sd)
print("Values within 3 SD =", three_sd)
Copy code

Output:


 
Values within 1 SD = 0.6826894921370859
Values within 2 SD = 0.9544997361036416
Values within 3 SD = 0.9973002039367398
Copy code

The above output clearly verifies the empirical rule by denoting the fraction of values within 1, 2, and 3 Standard Deviation. 

Conclusion

This observation of 68-95-99.7 is a rule because it is universally accepted for all the data following a normal distribution curve. The important part is to figure out whether or not your data follow a normal distribution or not. 

About the Author

This is a collection of insightful articles from domain experts in the fields of Cloud Computing, DevOps, AWS, Data Science, Machine Learning, AI, and Natural Language Processing. The range of topics caters to upski... Read Full Bio