Statistics - Making Conclusions
Using statistics to make conclusions about a population is called statistical inference.
Statistics from the data in the sample are used to make conclusions about the whole population. This is a type of statistical inference.
Probability theory is used to calculate the certainty that those statistics also apply to the population.
When using a sample, there will always be some uncertainty about what the data looks like for the population.
Uncertainty is often expressed as confidence intervals.
Confidence intervals are numerical ways of showing how likely it is that the true value of a statistic for the population lies within a certain range.
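As a rough illustration of the idea above, the sketch below computes an approximate 95% confidence interval for a population mean from a sample. The function name mean_confidence_interval and the height values are hypothetical, made up for this example. It assumes the normal approximation is acceptable; for small samples a t-distribution would usually be a better choice.

```python
import math
import statistics
from statistics import NormalDist


def mean_confidence_interval(sample, confidence=0.95):
    """Approximate confidence interval for the population mean.

    Assumes the normal approximation holds (hypothetical helper,
    not taken from the original text).
    """
    n = len(sample)
    mean = statistics.mean(sample)
    # Standard error: sample standard deviation divided by sqrt(n).
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    # z-value that leaves (1 - confidence) / 2 in each tail.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * standard_error
    return mean - margin, mean + margin


# Hypothetical sample of heights in centimeters (made-up data).
heights = [172, 168, 181, 175, 169, 178, 174, 183, 170, 176]
lower, upper = mean_confidence_interval(heights)
print(f"Sample mean: {statistics.mean(heights):.1f}")
print(f"Approximate 95% interval: {lower:.1f} to {upper:.1f}")
```

A wider interval means more uncertainty about the population value. Larger samples generally give narrower intervals.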
Hypothesis testing is another way of checking if a statement about a population is true. More precisely, it checks how likely it is that a hypothesis is true based on the sample data.
Some examples of statements or questions that can be checked with hypothesis testing:
- People in the Netherlands are taller than people in Denmark
- Do people prefer Pepsi or Coke?
- Does a new medicine cure a disease?
Note: Confidence intervals and hypothesis testing are closely related and describe the same things in different ways. Both are widely used in science.
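One simple way to test a statement like the first example is a permutation test, sketched below. This is only one of many possible tests, chosen here because it needs no extra libraries. The function name permutation_test and both groups of heights are hypothetical, made-up values. The idea: if the two groups really come from the same population, shuffling the group labels should often produce a difference in means at least as large as the one observed.

```python
import random


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in means.

    Returns the observed difference and an estimated p-value
    (hypothetical helper written for this illustration).
    """
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    size_a = len(group_a)
    observed = sum(group_a) / size_a - sum(group_b) / len(group_b)
    pooled = list(group_a) + list(group_b)
    extreme = 0
    for _ in range(n_permutations):
        # Shuffle the labels by shuffling the pooled data.
        rng.shuffle(pooled)
        new_a = pooled[:size_a]
        new_b = pooled[size_a:]
        diff = sum(new_a) / len(new_a) - sum(new_b) / len(new_b)
        if abs(diff) >= abs(observed):
            extreme += 1
    return observed, extreme / n_permutations


# Hypothetical height samples in centimeters (made-up data).
netherlands = [183, 179, 185, 181, 184, 180, 186, 182]
denmark = [180, 178, 182, 179, 181, 177, 183, 180]
difference, p_value = permutation_test(netherlands, denmark)
print(f"Observed difference in means: {difference:.2f}")
print(f"Estimated p-value: {p_value:.3f}")
```

A small p-value suggests the observed difference would be unusual if there were no real difference between the populations. It does not prove the statement is true.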
Causal inference is used to investigate if something causes another thing.
For example: Does rain make plants grow?
If we think two things are related we can investigate to see if they correlate. Statistics can be used to find out how strong this relation is.
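One common measure of how strong a relation is, is the Pearson correlation coefficient. The sketch below computes it by hand for the rain and plant growth example. The function name pearson_r and the measurements are hypothetical, made up for illustration. The coefficient only captures linear relationships, so this is one possible measure rather than the only one.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists.

    Ranges from -1 (perfect negative linear relation) to
    1 (perfect positive linear relation); 0 means no linear relation.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of the products of the deviations from each mean.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Hypothetical weekly measurements (made-up data).
rainfall_mm = [5, 12, 8, 20, 15, 3, 18]
growth_cm = [1.1, 2.0, 1.6, 3.2, 2.5, 0.9, 2.9]
print(f"Correlation: {pearson_r(rainfall_mm, growth_cm):.2f}")
```

A value close to 1 or -1 indicates a strong linear relation, while a value close to 0 indicates a weak one.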
Even if things are correlated, finding out if something is caused by other things can be difficult. It can be done with good experimental design or other special statistical techniques.
Note: Good experimental design is often difficult to achieve because of ethical concerns or other practical reasons.