z-test : Definition and Example


Assistant Manager - Content

Updated on Aug 22, 2023 10:38 IST

The z-test is a statistical method for comparing the mean of a sample with the mean of a normally distributed population, or for comparing the means of two independent samples. In this article, we will briefly discuss the z-test and its types, with examples.

The z-test is a statistical test used in hypothesis testing. There are 3 steps in hypothesis testing:

State the null and alternative hypotheses
Perform the statistical test
Reject or fail to reject the null hypothesis

In this article, we will discuss the z-test, the mathematical formula, and how to calculate it with the help of an example.


Table of Content

Z-test
- One-sample z-test
- Two-sample z-test


What is z-test?

A z-test is a statistical method for comparing the mean of a sample with the mean of a normally distributed population, or for comparing the means of two independent samples.

It is used to test a hypothesis (reject or fail to reject the null hypothesis) when the data is normally distributed.

z-test is used when:

Population variance is known
Sample size is greater than 30


Types of z-test:

Z-test is mainly classified into 2 types:

One Sample
Two Sample

One-Sample

The one-sample test is used when we have to compare a sample mean with the population mean.
When the test is one-tailed, the region of rejection is located at either the extreme left or the extreme right of the distribution.

i.e. if the null hypothesis is: the population mean is 2

Then, its alternative hypothesis is: the population mean is either less than 2 (left-tailed test) or greater than 2 (right-tailed test)

Left-tailed test: the alternative hypothesis states that the population mean is less than the claimed value. In this case, the rejection region will be on the left side of the distribution.

Note: For the left-tailed test, the null hypothesis states that the population mean is greater than or equal to the claimed value.

Right-tailed test: the alternative hypothesis states that the population mean is greater than the claimed value. In this case, the rejection region will be on the right side of the distribution.

Note: For the right-tailed test, the null hypothesis states that the population mean is less than or equal to the claimed value.
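The location of the rejection region can be illustrated with a short Python sketch. It assumes a significance level of 0.05, which the article does not specify, and the helper name rejection_region is hypothetical. It uses only the standard library's statistics.NormalDist to look up the critical z-values.

```python
from statistics import NormalDist

def rejection_region(alpha, tail):
    """Return the critical z-value(s) that bound the rejection region.

    alpha is the significance level (an assumed input, e.g. 0.05).
    tail is 'left', 'right' or 'two'.
    """
    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    if tail == "left":
        # Reject the null hypothesis when z falls below this value.
        return (std_normal.inv_cdf(alpha),)
    if tail == "right":
        # Reject the null hypothesis when z rises above this value.
        return (std_normal.inv_cdf(1 - alpha),)
    if tail == "two":
        # Reject when z falls outside (-cut, cut).
        cut = std_normal.inv_cdf(1 - alpha / 2)
        return (-cut, cut)
    raise ValueError("tail must be 'left', 'right' or 'two'")

left_cut = rejection_region(0.05, "left")
right_cut = rejection_region(0.05, "right")
print("left-tailed critical value:", round(left_cut[0], 3))
print("right-tailed critical value:", round(right_cut[0], 3))
```

With a 0.05 significance level, the two one-tailed critical values are mirror images of each other, one in each tail of the standard normal curve.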

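Mathematical Formula:

$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$

where $\bar{x}$ is the sample mean, $\mu$ is the population mean, $\sigma$ is the population standard deviation, and $n$ is the sample size.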


Let’s understand the one-sample z-test by an example:

z-test Example

A gym trainer claims that the new boys in the gym are above average weight.

A random sample of thirty boys has a mean weight of 112.5 kg. The population mean weight is 100 kg and the population standard deviation is 15 kg.

Is there sufficient evidence to support the gym trainer's claim?
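One way to work through this example is the Python sketch below. It treats the claim as a right-tailed test and assumes a significance level of 0.05; neither choice is stated in the problem, and the function name one_sample_z is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z(sample_mean, pop_mean, pop_sd, n):
    """z statistic for a one-sample z-test with known population sd."""
    standard_error = pop_sd / sqrt(n)
    return (sample_mean - pop_mean) / standard_error

# Values taken from the gym example.
z = one_sample_z(sample_mean=112.5, pop_mean=100, pop_sd=15, n=30)

# Right-tailed test: probability of a z at least this large
# if the null hypothesis (mean weight is not above 100 kg) were true.
p_value = 1 - NormalDist().cdf(z)

alpha = 0.05  # assumed significance level, not given in the article
reject_null = p_value < alpha

print("z =", round(z, 2))
print("reject null hypothesis:", reject_null)
```

The z statistic comes out at roughly 4.56, well beyond the right-tailed critical value for a 0.05 significance level, so under these assumptions the sample supports the trainer's claim.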


Two-Sample:

A two-sample test is used when we have to compare the means of two independent samples.
When the test is two-tailed, the region of rejection is located at both extremes (left and right) of the distribution.

i.e. if the null hypothesis is: the two population means are equal

Then, its alternative hypothesis is: the two population means are not equal

Note: For a two-tailed test, the null hypothesis states that the mean is exactly equal to the claimed value (here, that the difference between the two population means is 0).

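Mathematical Formula:

$$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

where $\bar{x}_1$ and $\bar{x}_2$ are the sample means, $\mu_1$ and $\mu_2$ are the population means, $\sigma_1$ and $\sigma_2$ are the population standard deviations, and $n_1$ and $n_2$ are the sample sizes. Under the null hypothesis of equal means, $\mu_1 - \mu_2 = 0$.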

Let’s understand the two-sample z-test by an example:

Problem Statement:

Random samples of 75 male and 50 female donors yield mean trace-element concentrations of 28 ppm and 33 ppm, respectively. The amount of trace elements in blood varies with standard deviations of 14.1 ppm and 9.5 ppm for males and females, respectively. What is the likelihood that the population mean concentrations are the same for men and women?
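A minimal Python sketch of this calculation follows. It assumes a two-tailed test with a null hypothesis of equal population means, and the function name two_sample_z is hypothetical. The p-value it prints is the probability of seeing a difference at least this large if the two population means really were equal.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(mean1, mean2, sd1, sd2, n1, n2):
    """z statistic for two independent samples with known sds.

    Assumes the null hypothesis that the population means are equal,
    so the hypothesised difference is 0.
    """
    standard_error = sqrt(sd1**2 / n1 + sd2**2 / n2)
    return (mean1 - mean2) / standard_error

# Values taken from the problem statement (males first, then females).
z = two_sample_z(mean1=28, mean2=33, sd1=14.1, sd2=9.5, n1=75, n2=50)

# Two-tailed p-value: area in both tails beyond |z|.
p_value = 2 * NormalDist().cdf(-abs(z))

print("z =", round(z, 2))
print("p-value =", round(p_value, 3))
```

Here z is about -2.37 and the two-tailed p-value is about 0.018, so at a 0.05 significance level (an assumed threshold) the hypothesis of equal means would be rejected.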

Conclusion:

The z-test is a statistical test used for hypothesis testing (null and alternative hypotheses) when the sample size is large and the population parameters (mean and variance) are known. Hope you like the article.

Keep Learning!!

Keep Sharing!!


FAQs

What is z-test with example?

A z-test is a statistical test used to test a hypothesis (reject or fail to reject the null hypothesis) when the data is normally distributed. z-test is used when: 1. Population variance is known. 2. Sample size is greater than 30. Example: Random samples of 75 male and 50 female donors yield mean trace-element concentrations of 28 ppm and 33 ppm, respectively. The amount of trace elements in blood varies with standard deviations of 14.1 ppm and 9.5 ppm for males and females, respectively. What is the likelihood that the population mean concentrations are the same for men and women?

What is the difference between z-test and t-test?

The z-test is a hypothesis test that determines whether the means of two datasets differ from each other when the population standard deviation and variance are known, whereas the t-test is a parametric test that is applied to identify how the means of two sets of data differ from each other when the population standard deviation and variance are not known.

What are the different types of z-tests?

There are two types of z-tests: 1. One Sample z-test 2. Two Sample z-test

What does z-score mean?

A z-score measures how many standard deviations a raw score lies below or above the population mean. It gives an idea of how far a data point is from the mean, and it can be placed on the normal distribution curve. For normally distributed data, about 99.7% of z-scores fall between -3 and +3 standard deviations.
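As a quick illustration, the Python sketch below computes the z-score of a single raw score and the share of a normal distribution lying within 3 standard deviations of the mean. The raw score of 130, the mean of 100 and the standard deviation of 15 are hypothetical values chosen for the example.

```python
from statistics import NormalDist

def z_score(raw, mean, sd):
    """Number of standard deviations that raw lies from the mean."""
    return (raw - mean) / sd

# Hypothetical values: raw score 130, mean 100, standard deviation 15.
score = z_score(130, 100, 15)

# Share of a normal distribution between z = -3 and z = +3.
std_normal = NormalDist()
within_three = std_normal.cdf(3) - std_normal.cdf(-3)

print("z-score:", score)
print("share within 3 standard deviations:", round(within_three, 4))
```

A z-score outside the -3 to +3 band is possible, just rare for normally distributed data.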

What is a good z-score?

Whether a z-score is 'good' or 'bad' is subjective: it depends on whether you decide that a good z-score is one that represents the 70th, 80th, 90th, 95th percentile, etc. For normally distributed data, most z-scores fall between -3 standard deviations (far left of the normal distribution) and +3 standard deviations (far right of the normal distribution).

About the Author

Vikram Singh

Assistant Manager - Content

Vikram has a Postgraduate degree in Applied Mathematics, with a keen interest in Data Science and Machine Learning. He has experience of 2+ years in content creation in Mathematics, Statistics, Data Science, and Mac... Read Full Bio
