Simplifying Mean, Median, and Mode Formulas: Your Ultimate Reference Guide
The mean (or arithmetic mean) is calculated by adding all the observations and dividing the sum by the number of observations. To find the median, arrange the dataset in order and take the middle value; if the dataset has an even number of observations, the median is the average of the two middle values. The mode is the most frequent value in the dataset.
When analyzing data or drawing conclusions from it, understanding central tendency is essential. A measure of central tendency is a summary measure that describes a whole set of data with a single value representing the middle or centre of its distribution. The three common measures of central tendency are the mean, median, and mode.
This article is a reference guide, or revision notebook, that collects the definitions and formulas of mean, median, and mode for quick revision.
So, without further delay, let’s start.
What is Mean?
The mean, or arithmetic mean, is the average value of a numerical dataset. It is useful for summarizing a dataset and detecting patterns in it.
Beyond that, the mean is used to compare two or more datasets, set benchmarks, and build forecasts and predictive models that estimate future values.
Along with its advantages, the mean has limitations: it is strongly influenced by sample size, outliers, and skewness.
Now, let’s check out the mean formula.
Mean Formula for Ungrouped Data
To calculate the mean of ungrouped data:
- Sum all the observations.
- Find the total number of observations.
- Divide the sum of all observations by the total number of observations, i.e.,
Mean = Sum of all Observations / Total Number of Observations
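The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation; the function name `mean` and the sample scores are made up for the example.

```python
def mean(observations):
    """Arithmetic mean: sum of the observations divided by their count."""
    if not observations:
        raise ValueError("mean requires at least one observation")
    return sum(observations) / len(observations)


# Made-up sample data: the sum is 108 and there are 6 observations.
scores = [4, 8, 15, 16, 23, 42]
print(mean(scores))  # 108 / 6 = 18.0
```

In practice, Python's standard library offers `statistics.mean` for the same calculation.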
Mean Formula for Grouped Data
To calculate the mean of grouped data:
- If the data values (xi) are given as class intervals, use the midpoint of each interval: (upper-class limit + lower-class limit) / 2.
- Multiply each data value (or midpoint) by its corresponding frequency (fi), i.e., xi*fi.
- Add up all the products from the previous step, i.e., x1*f1 + x2*f2 + … + xn*fn.
- Divide that sum by the total frequency (f1 + f2 + f3 + … + fn).
Mean = (x1*f1 + x2*f2 + … + xn*fn) / (f1 + f2 + f3 + … + fn)
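As a rough sketch of the grouped-data steps, the Python snippet below works through a small frequency table. The function name `grouped_mean`, the (lower limit, upper limit, frequency) layout, and the table values are all assumptions made for illustration.

```python
def grouped_mean(classes):
    """Mean of grouped data.

    classes: list of (lower_limit, upper_limit, frequency) tuples
    (a hypothetical layout chosen for this example).
    """
    total_fx = 0.0  # running sum of midpoint * frequency
    total_f = 0     # running sum of frequencies
    for lower, upper, freq in classes:
        midpoint = (lower + upper) / 2
        total_fx += midpoint * freq
        total_f += freq
    if total_f == 0:
        raise ValueError("total frequency must be positive")
    return total_fx / total_f


# Made-up frequency table: midpoints are 5, 15, 25, 35, 45.
table = [(0, 10, 5), (10, 20, 8), (20, 30, 15), (30, 40, 16), (40, 50, 6)]
print(grouped_mean(table))  # 1350 / 50 = 27.0
```

Because each class is represented by its midpoint, the result is an estimate of the mean of the underlying raw data rather than its exact value.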
What is the Median?
The median is the middle value of an ordered dataset, or the average of the two middle values if the dataset has an even number of values. In simple terms, the median divides the data into two halves.
Unlike the mean, the median is largely unaffected by outliers and skewness, and it can be used with ordinal data. Like the mean, it has limitations: it is affected by the sample size, it may not be unique, and it may not represent the entire dataset well when the data contain gaps or clusters.
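To make the point about outliers concrete, here is a small sketch using Python's standard `statistics` module. The salary figures are invented for illustration.

```python
import statistics

# Made-up salaries (in thousands), then the same list with one extreme value.
salaries = [30, 32, 35, 38, 40]
with_outlier = salaries + [500]

mean_before = statistics.mean(salaries)        # 175 / 5 = 35
median_before = statistics.median(salaries)    # middle value = 35

mean_after = statistics.mean(with_outlier)     # 675 / 6 = 112.5
median_after = statistics.median(with_outlier) # (35 + 38) / 2 = 36.5

print(mean_before, median_before)
print(mean_after, median_after)
```

A single extreme value more than triples the mean here, while the median moves only from 35 to 36.5.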
Median Formula for Ungrouped Data
To calculate the median of the ungrouped data:
- Arrange the data points in order (either ascending or descending).
- Count the total number of observations.
Case-1: If the number of observations is even
Median = [(n/2)th term + ((n/2) + 1)th term] / 2
Case-2: If the number of observations is odd
Median = ((n+1)/2)th term
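Both cases can be sketched in Python as follows. The function name `median` and the sample lists are illustrative assumptions. Note that the formulas above count terms from 1, while Python lists are indexed from 0.

```python
def median(observations):
    """Median of ungrouped data, following the even and odd cases."""
    ordered = sorted(observations)
    n = len(ordered)
    if n == 0:
        raise ValueError("median requires at least one observation")
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the ((n+1)/2)th term, which is index n // 2 when counting from 0.
        return ordered[mid]
    # Even n: average of the (n/2)th and ((n/2)+1)th terms.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([7, 3, 9, 1, 5]))  # ordered: 1, 3, 5, 7, 9 -> 5
print(median([7, 3, 9, 1]))     # ordered: 1, 3, 7, 9 -> (3 + 7) / 2 = 5.0
```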
Median Formula for Grouped Data
To calculate the median of the grouped data:
- Find the cumulative frequency of the data.
- Determine the total frequency of all groups.
- Divide the total frequency by 2 to find the position of the data's midpoint (n/2).
- Find the first group whose cumulative frequency reaches or exceeds n/2.
- This will be the median group.
- Calculate the median using the following:
Median = L + ((n/2 - F) / f) × W
where,
L: Lower Limit of the median group
n: Total Frequency
F: Cumulative Frequency of the group before the median group
f: frequency of the median group
W: width of the median group
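The procedure can be sketched in Python as shown below. The function name `grouped_median`, the (lower limit, upper limit, frequency) layout, and the table values are assumptions for illustration; the variable names mirror the symbols L, n, F, f, and W defined above.

```python
def grouped_median(classes):
    """Median of grouped data: L + ((n/2 - F) / f) * W.

    classes: list of (lower_limit, upper_limit, frequency) tuples in
    ascending order (a hypothetical layout chosen for this example).
    """
    n = sum(freq for _, _, freq in classes)  # total frequency
    if n == 0:
        raise ValueError("total frequency must be positive")
    half = n / 2
    cumulative = 0  # F: cumulative frequency before the current group
    for lower, upper, freq in classes:
        if cumulative + freq >= half:
            # This is the median group: its cumulative frequency reaches n/2.
            width = upper - lower  # W
            return lower + (half - cumulative) * width / freq
        cumulative += freq


# Made-up table: cumulative frequencies are 5, 13, 28, 44, 50, so n/2 = 25
# falls in the 20-30 group (L = 20, F = 13, f = 15, W = 10).
table = [(0, 10, 5), (10, 20, 8), (20, 30, 15), (30, 40, 16), (40, 50, 6)]
print(grouped_median(table))  # 20 + ((25 - 13) / 15) * 10 = 28.0
```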
What is the Mode?
The mode is the value that occurs most frequently in the dataset. It is the only measure of central tendency that can be used for nominal data, and it is a better measure of central tendency than the mean and median for nominal, skewed, binomial, and sparse data.
Mode Formula for Ungrouped Data
To calculate the mode of the ungrouped data:
- Arrange the dataset in order (either ascending or descending)
- Count the frequency of each data value.
- Identify the value that appears most frequently.
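A minimal Python sketch of these steps follows. The function name `modes` and the sample data are assumptions for illustration. `collections.Counter` counts frequencies directly, so the sorting step is not needed in code, and the function returns a list because several values can share the highest frequency.

```python
from collections import Counter


def modes(observations):
    """Return every value that has the highest frequency."""
    counts = Counter(observations)
    if not counts:
        raise ValueError("mode requires at least one observation")
    highest = max(counts.values())
    return [value for value, count in counts.items() if count == highest]


print(modes([2, 3, 3, 5, 7, 3, 5]))                     # [3]
print(modes(["red", "blue", "red", "blue", "green"]))   # ['red', 'blue']
```

The second call shows the mode applied to nominal data, where the mean and median are not defined.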
Mode Formula for Grouped Data
To calculate the mode of the grouped data:
- Identify the class interval in the frequency distribution with the highest frequency.
- This is the modal class interval.
- Now, use the formula below to calculate the mode.
Mode = L + ((f1 - f0) / (2f1 - f0 - f2)) × h
where
L: Lower limit of the modal class
f0: frequency of the class before the modal class (preceding the modal class)
f1: frequency of the modal class
f2: frequency of the class after the modal class (succeeding the modal class)
h: the size of the class interval.
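The grouped-mode formula can be sketched in Python as follows. The function name `grouped_mode`, the (lower limit, upper limit, frequency) layout, and the table values are illustrative assumptions. The sketch also assumes that f0 or f2 is taken as 0 when the modal class is the first or last class, and that the first class is chosen when several share the highest frequency; the text above does not cover these cases.

```python
def grouped_mode(classes):
    """Mode of grouped data: L + ((f1 - f0) / (2*f1 - f0 - f2)) * h.

    classes: list of (lower_limit, upper_limit, frequency) tuples in
    ascending order (a hypothetical layout chosen for this example).
    """
    freqs = [freq for _, _, freq in classes]
    i = freqs.index(max(freqs))  # first class with the highest frequency
    lower, upper, f1 = classes[i]
    # Assumption: a missing neighbouring class counts as frequency 0.
    f0 = freqs[i - 1] if i > 0 else 0
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0
    h = upper - lower  # size of the class interval
    denominator = 2 * f1 - f0 - f2
    if denominator == 0:
        raise ValueError("formula is undefined when 2*f1 - f0 - f2 is 0")
    return lower + (f1 - f0) * h / denominator


# Made-up table: the modal class is 30-40 (f1 = 16, f0 = 15, f2 = 6, h = 10).
table = [(0, 10, 5), (10, 20, 8), (20, 30, 15), (30, 40, 16), (40, 50, 6)]
print(grouped_mode(table))  # 30 + (1 / 11) * 10, roughly 30.91
```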