Understanding the Basics: The Difference Between Mean, Median, and Mode
Have you ever wondered how to summarize a set of data? Three popular measures of central tendency can help: mean, median, and mode. While these terms may sound similar, each represents a dataset in its own way.
In this article, we’ll explore the differences between mean, median, and mode and how they can help us better understand the data we’re working with.
What is the Difference Between Mean, Median and Mode?
Parameter | Mean | Median | Mode |
Definition | The average value of a set of data. | The middle value in the set of data. | The value that occurs most frequently in the dataset. |
Calculation | Sum of all values divided by the number of values. | Arrange the values in order (ascending or descending) and choose the middle value; if the number of data points is even, the median is the average of the two middle values. | Identify the most frequent value (there may be more than one). |
Usefulness | Symmetrical and Continuous Data | Skewed and Discrete Data | Discrete Data and identifying the most common value |
Limitation | Can’t be used for categorical data. Highly affected by outliers. | Less representative, as it doesn’t depend on all observations. | Not always well defined: there may be no mode or multiple modes. |
What is a Mean?
The mean is the measure of central tendency that represents the average value of the dataset. It is calculated by adding all the values in the dataset and dividing by the total number of values, i.e.,
Mean = Sum of all values / Number of observations
Example: Suppose five students score 30, 45, 50, 35, and 40 in a mathematics exam. Find the mean score.
Mean = (30 + 45 + 50 + 35 + 40) / 5
=> Mean = 200 / 5
=> Mean = 40
Hence, the mean score in the math exam is 40.
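The same calculation can be checked in a few lines of Python. This is a minimal sketch that uses the standard-library `statistics` module; the variable names (`scores`, `manual_mean`, `library_mean`) are only illustrative.

```python
import statistics

# Scores of the five students from the example above
scores = [30, 45, 50, 35, 40]

# Mean = sum of all values / number of observations
manual_mean = sum(scores) / len(scores)

# The standard library should agree with the manual calculation
library_mean = statistics.mean(scores)

print(manual_mean)
print(library_mean)
```

Both values come out equal to 40, matching the worked example.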
What is a Median?
The median is the measure of central tendency that represents the middle value of the dataset when the data are arranged in order (either ascending or descending).
So, before applying the formula, the first step is always to arrange the data in ascending or descending order.
Formula
Case-1: When the number of terms is odd.
Median = ((n+1)/2)th term
Example: Let there be 5 data points: 30, 45, 50, 35, and 40
Firstly, we will arrange the data points in ascending order, i.e. 30, 35, 40, 45, and 50.
Position = (5+1)/2 = 3, so Median = 3rd term = 40
Hence, the median of the dataset is 40.
Case-2: When the number of terms is even.
Median = [(n/2)th term + (n/2 + 1)th term] / 2
i.e., mean of two middle values.
Example: Let there be 6 data points: 30, 25, 45, 50, 35, and 40
Firstly, arrange the dataset in ascending order: 25, 30, 35, 40, 45, 50.
Here, the number of datapoints = 6
Therefore, Median = [(6/2)th term + (6/2 + 1)th term] / 2
=> Median = [3rd term + 4th term]/2 = (35 + 40)/2 = 37.5
=> Median = 37.5
Hence, the median of the dataset is 37.5.
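The two cases above can be sketched in Python as follows. The helper `median_by_position` is a hypothetical function written for this illustration, not a library routine; it assumes a non-empty list of numbers and is compared against the standard-library `statistics.median`.

```python
import statistics

def median_by_position(values):
    """Median using the positional formulas above (illustrative helper)."""
    ordered = sorted(values)  # step 1: arrange in ascending order
    n = len(ordered)
    if n % 2 == 1:
        # Odd n: the ((n+1)/2)th term; subtract 1 for 0-based indexing
        return ordered[(n + 1) // 2 - 1]
    # Even n: mean of the (n/2)th and (n/2 + 1)th terms
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2

odd_data = [30, 45, 50, 35, 40]        # Case-1 example
even_data = [30, 25, 45, 50, 35, 40]   # Case-2 example

print(median_by_position(odd_data))    # 40
print(median_by_position(even_data))   # 37.5

# Cross-check against the standard library
print(statistics.median(odd_data) == median_by_position(odd_data))
print(statistics.median(even_data) == median_by_position(even_data))
```

Sorting inside the helper means the input can be passed in any order, which mirrors the "arrange first" step of the manual method.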
What is a Mode?
Like the mean and median, the mode is a measure of central tendency; it represents the most frequently occurring value in a dataset.
To calculate the mode, you just have to identify the most frequently occurring value. A dataset may have no mode at all, or it may have more than one mode.
Example-1: Find the mode of the dataset: 30, 35, 40, 45, and 50.
Mode = No mode exists, as no value appears more than once.
Example-2: Find the mode of the dataset: 30, 35, 40, 40, 40, 45, 50.
Mode = 40
Example-3: Find the mode of the dataset: 30, 35, 40, 40, 45, 45, 50.
Mode = 40 and 45.
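The three examples above can be reproduced with a short Python sketch. The helper `find_modes` is a hypothetical function written for this illustration; it follows this article's convention that a dataset has no mode when no value repeats, and it assumes a non-empty list.

```python
from collections import Counter

def find_modes(values):
    """Return the most frequent value(s), or [] if no value repeats."""
    counts = Counter(values)          # value -> number of occurrences
    highest = max(counts.values())    # frequency of the most common value
    if highest == 1:
        return []                     # every value appears once: no mode
    # Keep every value that reaches the highest frequency
    return sorted(v for v, c in counts.items() if c == highest)

print(find_modes([30, 35, 40, 45, 50]))          # []
print(find_modes([30, 35, 40, 40, 40, 45, 50]))  # [40]
print(find_modes([30, 35, 40, 40, 45, 45, 50]))  # [40, 45]
```

Note that Python's own `statistics.multimode` uses a different convention for the first case: when no value repeats, it returns every value rather than an empty list.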
Key Differences Between Mean, Median, and Mode
Here are the key differences between mean, median, and mode across five types of data:
- Symmetrical Data: The centre of the data is exactly the mid-point, so for symmetrical data with a single peak, mean = median = mode.
- Skewed Data: In a skewed distribution, the three measures behave differently.
  - The mean is influenced by outliers and is pulled in the direction of the skew.
  - The median is the best representation of the centre of skewed data.
  - The mode may not be useful for skewed data, since the most frequent value may not occur often enough to be meaningful.
- Discrete Data: Discrete data can take only certain values.
  - The mode is the most useful measure of central tendency for discrete data.
  - The mean and median may be less useful, as they may not correspond to an actual data point.
- Continuous Data: Continuous data can take any value within a certain range.
  - The mean and median are the best representations of central tendency, since any value within the range is a possible data value.
  - The mode is not useful, as no single value may occur frequently enough to be meaningful.
- Bimodal Data: A dataset with two peaks is called bimodal.
  - The median is the value that divides the dataset into two equal halves.
  - The mean may not be useful, as it may not represent either peak. The mode is not a single value either, since each peak gives its own mode.
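The point about outliers pulling the mean can be seen in a small Python sketch. The extra value 500 is an invented outlier added purely for illustration; it is not part of any dataset in this article.

```python
import statistics

scores = [30, 35, 40, 45, 50]
with_outlier = scores + [500]   # 500 is a made-up extreme value

# Without the outlier, mean and median agree
print(statistics.mean(scores), statistics.median(scores))

# With the outlier, the mean jumps while the median barely moves
mean_with_outlier = statistics.mean(with_outlier)      # 700 / 6
median_with_outlier = statistics.median(with_outlier)  # (40 + 45) / 2

print(round(mean_with_outlier, 2))   # 116.67
print(median_with_outlier)           # 42.5
```

A single extreme value moves the mean from 40 to roughly 116.67, while the median only shifts from 40 to 42.5, which is why the median is usually preferred for skewed data.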
Conclusion
In this article, we have briefly discussed three measures of central tendency: mean, median, and mode. A measure of central tendency is a summary measure that describes a whole set of data with a single value representing the middle or centre of its distribution. We have also discussed how these measures differ from each other.
We hope you found the article helpful.