# Statistics - Making Conclusions

Using statistics to make conclusions about a population is called statistical inference.

## Statistical Inference

Statistics calculated from the data in the **sample** are used to make conclusions about the whole **population**. This is a type of **statistical inference**.

**Probability theory** is used to calculate the certainty that those statistics also apply to the population.

When using a sample, there will **always** be some uncertainty about what the data looks like for the population.

Uncertainty is often expressed as **confidence intervals**.

A confidence interval is a range of values, calculated from the sample, that is likely to contain the **true value** of a statistic for the population.
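As a rough illustration, a confidence interval for a mean can be sketched in Python as follows. This is a minimal sketch, not a definitive method: it assumes a normal approximation (reasonable for larger samples), the function name `mean_confidence_interval` is made up for this example, and the height values are hypothetical data.

```python
import math
import statistics

def mean_confidence_interval(sample, confidence=0.95):
    """Approximate confidence interval for the population mean.

    Assumption: uses the normal approximation, which is only a rough
    guide for small samples.
    """
    n = len(sample)
    mean = statistics.fmean(sample)
    # Standard error: sample standard deviation divided by sqrt(n)
    std_err = statistics.stdev(sample) / math.sqrt(n)
    # Critical value from the standard normal distribution
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * std_err, mean + z * std_err

# Hypothetical sample of heights in cm (made-up illustration data)
heights = [172, 168, 181, 175, 169, 178, 174, 171, 180, 176]
low, high = mean_confidence_interval(heights)
print(f"95% confidence interval for the mean: {low:.1f} to {high:.1f}")
```

Asking for a higher confidence level, for example `confidence=0.99`, gives a wider interval for the same sample.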

**Hypothesis testing** is another way of checking if a statement about a population is true. More precisely, it checks how strongly the sample data supports or contradicts a hypothesis.

Some examples of statements or questions that can be checked with hypothesis testing:

- People in the Netherlands are taller than people in Denmark
- Do people prefer Pepsi or Coke?
- Does a new medicine cure a disease?
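One way to check a statement like the first one can be sketched with a permutation test. This is only one of several possible techniques, chosen here because it needs nothing beyond the Python standard library. The function name `permutation_test` and the height values are hypothetical. The idea: if the group labels made no difference, shuffling them should often produce a difference in means at least as large as the one observed.

```python
import random
import statistics

def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Estimate a two-sided p-value for a difference in means."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    observed = abs(statistics.fmean(group_a) - statistics.fmean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Shuffle the pooled values, then split them into two new groups
        rng.shuffle(pooled)
        diff = abs(statistics.fmean(pooled[:n_a]) - statistics.fmean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Fraction of shuffles at least as extreme as the observed difference
    return extreme / n_permutations

# Hypothetical height samples in cm (made-up illustration data)
netherlands = [183, 179, 185, 181, 178, 184, 182, 180]
denmark = [180, 178, 182, 177, 181, 179, 183, 176]
p_value = permutation_test(netherlands, denmark)
print(f"Estimated p-value: {p_value:.3f}")
```

A small p-value suggests the observed difference would be unusual if the two populations had the same mean height. It does not prove the statement is true.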

**Note:** Confidence intervals and hypothesis testing are closely related and describe the same things in different ways. Both are widely used in science.

## Causal Inference

Causal inference is used to investigate whether one thing causes another.

For example: Does rain make plants grow?

If we think two things are related we can investigate to see if they **correlate**. Statistics can be used to find out how strong this relation is.
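The strength of a linear relation is commonly summarized with the Pearson correlation coefficient, which ranges from -1 to 1. The sketch below computes it by hand. The function name `pearson_correlation` and the rainfall and growth numbers are hypothetical, made up for illustration.

```python
import math

def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How the two variables vary together
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How each variable varies on its own
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical weekly rainfall (mm) and plant growth (cm)
rainfall = [5, 12, 20, 8, 30, 25, 15]
growth = [1.0, 1.8, 2.9, 1.3, 4.1, 3.6, 2.2]
r = pearson_correlation(rainfall, growth)
print(f"Correlation between rainfall and growth: {r:.2f}")
```

Values near 1 or -1 indicate a strong linear relation, and values near 0 indicate a weak one. A strong correlation on its own does not show that rain causes the growth.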

Even if things are correlated, finding out whether one is caused by the other can be difficult. It can be done with good **experimental design** or other special statistical techniques.

**Note:** Good experimental design is often difficult to achieve because of ethical concerns or other practical reasons.