Skewness in Statistics – Overview, Concepts, Types, Measurements and Importance
Imagine a seesaw—perfectly balanced, right? That's how data can sometimes be—nice and even on both sides. But what if all the kids pile on one side? That's kind of like skewness! This article will explain what skewness is in statistics. It's a way of measuring how data clumps to one side or the other, like that overloaded seesaw. We'll use pictures and easy examples to show you how to spot skewness and why it matters.
“99 percent of all Statistics only tell 49 percent of the story.”
– Ron DeLegge II, Gents with No Cents
In simple terms, skewness means a lack of symmetry. To get a better understanding, let's look at skewness in statistics through a cricket match example.
Case – I: Most of the players scored 40+ runs in a match, and only a few of them scored less than 10 runs. (negative skew)
Case – II: Most of the players score poorly while just a few of them perform well. (positive skew)
What is Skewness in Statistics?
Skewness: Skewness in statistics is a measure of lack of symmetry, i.e. it measures the deviation of the given distribution of a random variable from a symmetric distribution (like normal Distribution).
- Normal Distribution: A Normal Distribution is a probability distribution that is symmetric about the mean. It is also known as a Gaussian Distribution. The distribution appears as a bell-shaped curve, which means that values near the mean are the most frequent in the data set.
In a Normal Distribution:
Mean = Median = Mode
- Standard Normal Distribution: When the mean in a Normal Distribution is 0 and the Standard Deviation is 1, then the Normal Distribution is called a Standard Normal Distribution.
- Normal Distributions are symmetrical in nature, but this does not imply that every symmetrical distribution is a Normal Distribution.
- A Normal Distribution is a probability distribution without any skewness.
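The points above can be checked on a small sample. The sketch below is illustrative only: the data set is hand-made (an assumption, not taken from any real source), and `moment_skewness` is a hypothetical helper name. The sample is symmetric about its center, so its mean, median, and mode coincide and its skewness is zero, yet it is plainly not a Normal Distribution.

```python
import statistics

# Hypothetical sample: symmetric about 3, but far too small and discrete to be normal.
data = [1, 2, 2, 3, 3, 3, 4, 4, 5]

def moment_skewness(values):
    """Moment coefficient of skewness (population form): m3 / m2**1.5."""
    n = len(values)
    mu = sum(values) / n
    m2 = sum((x - mu) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mu) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5

mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)

print(mean, median, mode)     # all three coincide at 3
print(moment_skewness(data))  # zero for a perfectly symmetric sample
```

Zero skewness therefore signals symmetry, not normality.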
Types of Skewness
- Positive Skewness
- Negative Skewness
Unlike the Normal Distribution (mean = median = mode), in positively and negatively skewed distributions the mean, median, and mode generally differ from one another.
Positive Skewness
In positive skewness, the extreme data values are larger, which in turn increases the mean value of the data set. In simple terms, a positive skew distribution is the distribution with the tail on the right side.
In Positive Skewness:
Mean > Median > Mode
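As a rough illustration of this ordering, the sketch below uses made-up scores in the spirit of Case II of the cricket example (most players score poorly, a few score well). The numbers are hypothetical and chosen only to make the pattern visible. The ordering is a typical pattern for single-peaked data rather than a guarantee.

```python
import statistics

# Hypothetical runs for nine players: most score low, two score high.
runs = [5, 5, 5, 8, 10, 12, 15, 60, 90]

mean = statistics.mean(runs)      # pulled upward by the two large scores
median = statistics.median(runs)  # middle value of the sorted scores
mode = statistics.mode(runs)      # most frequent score

print(f"mean={mean:.2f}, median={median}, mode={mode}")
# mean=23.33, median=10, mode=5
```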
Negative Skewness
In negative skewness, the extreme data values are smaller, which decreases the mean value of the data set. In simple terms, a negative skew distribution is the distribution with the tail on the left side.
In Negative Skewness:
Mean < Median < Mode
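The same check can be sketched for a left-tailed sample. The scores below are made up to resemble Case I of the cricket example (most players score 40+ runs, a few score under 10); they are assumptions for illustration, not real match data.

```python
import statistics

# Hypothetical runs for nine players: most score 40+, two score under 10.
runs = [3, 8, 42, 45, 48, 50, 55, 55, 55]

mean = statistics.mean(runs)      # pulled downward by the two small scores
median = statistics.median(runs)  # middle value of the sorted scores
mode = statistics.mode(runs)      # most frequent score

print(f"mean={mean:.2f}, median={median}, mode={mode}")
# mean=40.11, median=48, mode=55
```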
Measuring Skewness
There are different ways to measure the skewness:
- Pearson Mode
- Pearson Median
- Momental
- Kelly’s Measure
- Bowley
But we mainly use the first two, Pearson mode and Pearson median skewness.
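For completeness, the other three measures in the list can be sketched as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, the moment version uses the population form m3 / m2^1.5, and the quartiles and percentiles come from Python's `statistics.quantiles` with its default method. Other quantile conventions give slightly different values on small samples.

```python
import statistics

def moment_skewness(values):
    """Moment coefficient of skewness (population form): m3 / m2**1.5."""
    n = len(values)
    mu = sum(values) / n
    m2 = sum((x - mu) ** 2 for x in values) / n
    m3 = sum((x - mu) ** 3 for x in values) / n
    return m3 / m2 ** 1.5

def bowley_skewness(values):
    """Bowley's quartile-based measure: (Q3 + Q1 - 2*Q2) / (Q3 - Q1)."""
    q1, q2, q3 = statistics.quantiles(values, n=4)
    return (q3 + q1 - 2 * q2) / (q3 - q1)

def kelly_skewness(values):
    """Kelly's percentile-based measure: (P90 + P10 - 2*P50) / (P90 - P10)."""
    deciles = statistics.quantiles(values, n=10)  # nine cut points: P10 .. P90
    p10, p50, p90 = deciles[0], deciles[4], deciles[8]
    return (p90 + p10 - 2 * p50) / (p90 - p10)

# Hypothetical right-tailed sample; all three measures should come out positive.
runs = [5, 5, 5, 8, 10, 12, 15, 60, 90]
print(moment_skewness(runs), bowley_skewness(runs), kelly_skewness(runs))
```

Bowley's and Kelly's measures depend only on quantiles, so a single extreme value affects them less than it affects the moment-based measure.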
Pearson mode skewness is used when the sample data exhibit a strong mode. If the data have multiple modes or a weak mode, Pearson median skewness is used instead.
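The two Pearson measures are usually written as (mean - mode) / standard deviation and 3 × (mean - median) / standard deviation. A minimal sketch follows, assuming the sample standard deviation (`statistics.stdev`); the function names and both data sets are hypothetical.

```python
import statistics

def pearson_mode_skewness(values):
    """Pearson's first coefficient: (mean - mode) / standard deviation."""
    return (statistics.mean(values) - statistics.mode(values)) / statistics.stdev(values)

def pearson_median_skewness(values):
    """Pearson's second coefficient: 3 * (mean - median) / standard deviation."""
    return 3 * (statistics.mean(values) - statistics.median(values)) / statistics.stdev(values)

# Hypothetical samples: one with a right tail, one with a left tail.
right_tailed = [5, 5, 5, 8, 10, 12, 15, 60, 90]
left_tailed = [3, 8, 42, 45, 48, 50, 55, 55, 55]

for name, sample in [("right-tailed", right_tailed), ("left-tailed", left_tailed)]:
    print(name, pearson_mode_skewness(sample), pearson_median_skewness(sample))
```

Note that `statistics.mode` returns a single value even when several values tie for most frequent, which is one reason the median-based version is the safer choice for data with a weak mode or several modes.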
Importance of Skewness
What do we do once we have found the skewness? Skewness gives the direction of the outliers. If the distribution is right-skewed, most of the outliers lie on the right side of the distribution; if it is left-skewed, most of them lie on the left side. Keep in mind, however, that skewness does not tell us the number of outliers.
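A small sketch can make this distinction concrete. Both samples below are hypothetical: the first has a single extreme value on the right, the second has three. Both have positive skewness, so the direction is reported correctly, but the sample with fewer extreme values has the larger coefficient. The helper name `moment_skewness` is an assumption and uses the population moment form.

```python
def moment_skewness(values):
    """Moment coefficient of skewness (population form): m3 / m2**1.5."""
    n = len(values)
    mu = sum(values) / n
    m2 = sum((x - mu) ** 2 for x in values) / n
    m3 = sum((x - mu) ** 3 for x in values) / n
    return m3 / m2 ** 1.5

one_extreme = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]       # one large value on the right
three_extreme = [1, 2, 3, 4, 5, 6, 7, 80, 90, 100]   # three large values on the right

skew_one = moment_skewness(one_extreme)
skew_three = moment_skewness(three_extreme)

# Both are positive (right tail), yet the magnitude does not track the count.
print(skew_one, skew_three)
```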
Endnotes
Skewness in statistics plays an important role in Exploratory Data Analysis (EDA), particularly during feature extraction and selection. To deal with skewness, we use transformations such as the Power Transform, Log Transform, and Exponential Transform to bring positively and negatively skewed distributions closer to a Normal Distribution.
In this article, we have briefly discussed what skewness is, its types (positive skewness and negative skewness), and the different methods used to measure skewness in statistics.
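As one hedged illustration of such a transform, the sketch below applies a natural-log transform to a hypothetical right-tailed sample and compares the moment skewness before and after. The data and the helper name are assumptions. A log transform needs strictly positive values and typically reduces positive skew without making the data exactly normal.

```python
import math

def moment_skewness(values):
    """Moment coefficient of skewness (population form): m3 / m2**1.5."""
    n = len(values)
    mu = sum(values) / n
    m2 = sum((x - mu) ** 2 for x in values) / n
    m3 = sum((x - mu) ** 3 for x in values) / n
    return m3 / m2 ** 1.5

# Hypothetical right-tailed sample (all values strictly positive).
runs = [5, 5, 5, 8, 10, 12, 15, 60, 90]

# The log compresses large values more than small ones, shortening the right tail.
log_runs = [math.log(x) for x in runs]

skew_before = moment_skewness(runs)
skew_after = moment_skewness(log_runs)
print(skew_before, skew_after)  # the second value is smaller but still positive
```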
FAQs
What is Normal Distribution?
A Normal Distribution is a probability distribution that is symmetric about the mean. It is also known as a Gaussian Distribution. The distribution appears as a bell-shaped curve, which means that values near the mean are the most frequent in the data set.
What is Skewness in Statistics?
Skewness in statistics is a measure of lack of symmetry i.e. it measures the deviation of the given distribution of a random variable from a symmetric distribution (like normal Distribution).
What are the types of Skewness?
Positive Skewness: In positive skewness, the extreme data values are larger, which in turn increases the mean value of the data set. In simple terms, a positive skew distribution is the distribution with the tail on the right side. In Positive Skewness: Mean > Median > Mode
Negative Skewness: In negative skewness, the extreme data values are smaller, which decreases the mean value of the data set. In simple terms, a negative skew distribution is the distribution with the tail on the left side. In Negative Skewness: Mean < Median < Mode
How is Skewness Measured?
Generally, skewness is measured using a formula based on the mean, median, and mode. If the mean is greater than the median, the distribution is said to be positively skewed, whereas if the mean is less than the median, it is said to be negatively skewed.
What does it mean if a distribution is positively skewed?
If the distribution is positively skewed, the tail of the distribution is longer on the right side, while the majority of the data is on the left side.
What does it mean if a distribution is negatively skewed?
If the distribution is negatively skewed, the tail of the distribution is longer on the left side, while the majority of the data is on the right side.
Vikram has a Postgraduate degree in Applied Mathematics, with a keen interest in Data Science and Machine Learning. He has experience of 2+ years in content creation in Mathematics, Statistics, Data Science, and Mac...