# Statistics - Standard Normal Distribution

The standard normal distribution is a normal distribution where the mean is 0 and the standard deviation is 1.

## Standard Normal Distribution

Normally distributed data can be transformed into a standard normal distribution.

Standardizing normally distributed data makes it easier to compare different sets of data.
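As a minimal sketch of what standardizing looks like in practice, the Python snippet below subtracts the mean from each value and divides by the standard deviation. The `heights` list is made up for illustration, and the population standard deviation (`pstdev`) is an assumption chosen for simplicity.

```python
from statistics import mean, pstdev

# Hypothetical heights in cm (illustrative values, not real measurements)
heights = [155, 162, 168, 170, 171, 175, 180, 189]

mu = mean(heights)       # mean of the data
sigma = pstdev(heights)  # population standard deviation of the data

# Standardize: subtract the mean, then divide by the standard deviation
standardized = [(x - mu) / sigma for x in heights]

# After standardizing, the mean is approximately 0
# and the standard deviation is approximately 1
print(abs(mean(standardized)) < 1e-9)
print(abs(pstdev(standardized) - 1) < 1e-9)
```

Once two data sets are both on this scale, their values can be compared directly in units of standard deviations.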

The standard normal distribution is used for:

• Calculating confidence intervals
• Hypothesis tests

Here is a graph of the standard normal distribution with probability values (p-values) between the standard deviations:

Standardizing makes it easier to calculate probabilities.

The functions for calculating probabilities are complex and difficult to calculate by hand.

Typically, probabilities are found by looking up tables of pre-calculated values, or by using software and programming.
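As one way to do this with software, the sketch below uses `NormalDist` from Python's standard library `statistics` module (Python 3.8 or later) to find the probability of a value falling within 1, 2, and 3 standard deviations of the mean. The helper name `prob_within` is hypothetical and only used for this illustration.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1
std_normal = NormalDist(mu=0, sigma=1)

def prob_within(k):
    """Probability that a standard normal value lies within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(prob_within(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```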

The standard normal distribution is also called the 'Z-distribution' and the values are called 'Z-values' (or Z-scores).

## Z-Values

Z-values express how many standard deviations from the mean a value is.

The formula for calculating a Z-value is:

$$\displaystyle Z = \frac{x-\mu}{\sigma}$$

$$x$$ is the value we are standardizing, $$\mu$$ is the mean, and $$\sigma$$ is the standard deviation.

For example, if we know that:

The mean height of people in Germany is 170 cm ($$\mu$$)

The standard deviation of the height of people in Germany is 10 cm ($$\sigma$$)

Bob is 200 cm tall ($$x$$)

Bob is 30 cm taller than the average person in Germany.

30 cm is 3 times 10 cm, so Bob's height is 3 standard deviations above the mean height in Germany.

Using the formula:

$$\displaystyle Z = \frac{x-\mu}{\sigma} = \frac{200-170}{10} = \frac{30}{10} = \underline{3}$$

The Z-value of Bob's height (200 cm) is 3.
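The same calculation can be written as a small Python function. This is only a sketch: the name `z_score` is hypothetical, and the check for a positive standard deviation is an added assumption.

```python
def z_score(x, mu, sigma):
    """Return how many standard deviations x is from the mean mu."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

# Bob's height (200 cm) with mean 170 cm and standard deviation 10 cm
bob_z = z_score(200, 170, 10)
print(bob_z)  # 3.0
```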

## Finding the P-value of a Z-Value

Using a Z-table or programming, we can calculate how many people in Germany are shorter than Bob and how many are taller.

### Example

With Python, use the SciPy Stats library's norm.cdf() function to find the probability of getting less than a Z-value of 3:

    import scipy.stats as stats
    print(stats.norm.cdf(3))

### Example

With R, use the built-in pnorm() function to find the probability of getting less than a Z-value of 3:

    pnorm(3)

Using either method, we find that the probability is $$\approx 0.9987$$, or $$99.87\%$$.

This means that Bob is taller than 99.87% of the people in Germany.

Here is a graph of the standard normal distribution and a Z-value of 3 to visualize the probability:

These methods find the p-value (probability) below the particular z-value we have.

To find the p-value above the z-value, we subtract that probability from 1.

So in Bob's example, we can calculate 1 - 0.9987 = 0.0013, or 0.13%.

This means that only 0.13% of people in Germany are taller than Bob.
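The subtraction can also be done in code. The sketch below uses `NormalDist` from Python's standard library `statistics` module as a stand-in for the SciPy call; the variable names are chosen for this illustration.

```python
from statistics import NormalDist

z = 3  # Z-value of Bob's height

# Probability of a value below z (area to the left of z)
p_below = NormalDist().cdf(z)

# Probability of a value above z (area to the right of z)
p_above = 1 - p_below

print(round(p_below, 4))  # 0.9987
print(round(p_above, 4))  # 0.0013
```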

## Finding the P-Value Between Z-Values

Suppose we instead want to know how many people in Germany are between 155 cm and 165 cm tall, using the same example:

The mean height of people in Germany is 170 cm ($$\mu$$)

The standard deviation of the height of people in Germany is 10 cm ($$\sigma$$)

Now we need to calculate Z-values for both 155 cm and 165 cm:

$$\displaystyle Z = \frac{x-\mu}{\sigma} = \frac{155-170}{10} = \frac{-15}{10} = \underline{-1.5}$$

The Z-value of 155 cm is -1.5

$$\displaystyle Z = \frac{x-\mu}{\sigma} = \frac{165-170}{10} = \frac{-5}{10} = \underline{-0.5}$$

The Z-value of 165 cm is -0.5

Using the Z-table or programming, we can find the p-values for the two z-values:

• The probability of a z-value smaller than -0.5 (shorter than 165 cm) is 30.85%
• The probability of a z-value smaller than -1.5 (shorter than 155 cm) is 6.68%

Subtract 6.68% from 30.85% to find the probability of getting a z-value between them.

30.85% - 6.68% = 24.17%
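The steps above can be sketched in Python as follows. The function name `prob_between` is hypothetical, and `NormalDist` from the standard library `statistics` module is used here in place of a Z-table.

```python
from statistics import NormalDist

def prob_between(lower, upper, mu, sigma):
    """Probability that a normally distributed value lies between lower and upper."""
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    z_lower = (lower - mu) / sigma  # Z-value of the lower bound
    z_upper = (upper - mu) / sigma  # Z-value of the upper bound
    # Area below the upper Z-value minus area below the lower Z-value
    return std_normal.cdf(z_upper) - std_normal.cdf(z_lower)

p = prob_between(155, 165, mu=170, sigma=10)
print(round(p, 4))  # 0.2417
```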

Here is a set of graphs illustrating the process:

## Finding the Z-value of a P-Value

You can also use p-values (probability) to find z-values.

For example:

"How tall are you if you are taller than 90% of Germans?"

The p-value is 0.9, or 90%.

Using a Z-table or programming we can calculate the z-value:

### Example

With Python, use the SciPy Stats library's norm.ppf() function to find the z-value separating the top 10% from the bottom 90%:

    import scipy.stats as stats
    print(stats.norm.ppf(0.9))

### Example

With R, use the built-in qnorm() function to find the z-value separating the top 10% from the bottom 90%:

    qnorm(0.9)

Using either method, we find that the Z-value is $$\approx 1.281$$.

This means that a person who is 1.281 standard deviations taller than the mean height of Germans is taller than 90% of Germans.

We then use the formula to calculate the height ($$x$$) based on a mean ($$\mu$$) of 170 cm and standard deviation ($$\sigma$$) of 10 cm:

$$\displaystyle Z = \frac{x-\mu}{\sigma}$$

$$\displaystyle 1.281 = \frac{x-170}{10}$$

$$1.281 \cdot 10 = x-170$$

$$12.81 = x - 170$$

$$12.81 + 170 = x$$

$$\underline{182.81} = x$$

So we can conclude that:

"You have to be at least 182.81 cm tall to be taller than 90% of Germans"
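Rearranging the formula gives $$x = \mu + Z \cdot \sigma$$, which can be checked with a short Python sketch. It assumes the mean of 170 cm and standard deviation of 10 cm stated above, and uses `NormalDist.inv_cdf()` from the standard library `statistics` module in place of a Z-table. Because the z-value is not rounded to 1.281 here, the result can differ in the last digit from a calculation done by hand.

```python
from statistics import NormalDist

mu = 170     # mean height in cm
sigma = 10   # standard deviation in cm

# Z-value separating the bottom 90% from the top 10%
z = NormalDist().inv_cdf(0.9)

# Convert the Z-value back to a height: x = mu + z * sigma
height = mu + z * sigma

print(round(z, 4))       # 1.2816
print(round(height, 2))  # 182.82
```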