
Machine Learning Statistics

Statistics are tools to get answers to questions about data:

  • What is Common?
  • What is Expected?
  • What is Normal?
  • What is the Probability?

Inferential Statistics

Inferential statistics are methods for estimating properties of a population from a small sample.

You take data from a sample and make a prediction about the whole population.

For example, you can stand in a shop and ask a sample of 100 people if they like chocolate.

If 91 of them say yes, then using inferential statistics you could predict that about 91% of all shoppers like chocolate.
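
The shop survey can be sketched in a few lines of Python. This is only an illustration, assuming that 91 of the 100 people answered yes and that the sample was random. The function name `proportion_estimate` is made up for this example, and the range it returns uses the simple normal approximation, which is one common choice among several.

```python
import math

def proportion_estimate(successes, n, z=1.96):
    """Estimate a population proportion from a sample.

    Returns the sample proportion and an approximate 95% interval
    (normal approximation; assumes a random sample).
    """
    p = successes / n
    standard_error = math.sqrt(p * (1 - p) / n)
    return p, p - z * standard_error, p + z * standard_error

# Assumption for illustration: 91 of the 100 shoppers said yes.
p, low, high = proportion_estimate(91, 100)
print(f"Estimate: {p:.0%} (plausible range {low:.1%} to {high:.1%})")
```

The range is a reminder that a prediction from a sample is an estimate, not an exact value. A larger sample gives a narrower range.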

Incredible Chocolate Facts

Nine out of ten people love chocolate.

50% of the US population cannot live without chocolate every day.

You use inferential statistics to make predictions about a whole population from a small sample of data.

Descriptive Statistics

Descriptive statistics summarize (describe) observations from a set of data.

Since we register every newborn baby, we can tell that 51 out of 100 are boys.

From these collected numbers, we can predict a 51% chance that a new baby will be a boy.
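
Counting like this is easy to try in Python. The list below is hypothetical data built to match the 51 out of 100 figure above. It is not a real birth register.

```python
from collections import Counter

# Hypothetical records, one entry per registered birth (not real data).
births = ["boy"] * 51 + ["girl"] * 49

counts = Counter(births)                  # tally each category
share_boys = counts["boy"] / len(births)  # observed proportion of boys

print(f"{counts['boy']} of {len(births)} newborns are boys ({share_boys:.0%})")
```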

It is a mystery that the ratio is not 50%, as basic biology would predict. We only know that this tilted sex ratio has been observed since the 17th century.


Raw observations are only data. They are not real knowledge.

You use descriptive statistics to transform raw observations into information that you can understand.

Descriptive Statistics Measurements

Descriptive statistics are broken down into different measures:

Tendency (Measures of the Center)

  • The Mean (the average value)
  • The Median (the midpoint value)
  • The Mode (the most common value)
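
The three measures of the center can be computed with Python's built-in `statistics` module. The numbers in `values` are made up for this example.

```python
import statistics

# Hypothetical observations, chosen only for illustration.
values = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(values)      # sum of the values divided by their count
median = statistics.median(values)  # middle value of the sorted data
mode = statistics.mode(values)      # value that occurs most often

print("Mean:", mean)
print("Median:", median)
print("Mode:", mode)
```

With an even number of values, `statistics.median` returns the average of the two middle values, which is why the median here falls between 3 and 5.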

Spread (Measures of Variability)

  • Min and Max
  • Standard Deviation
  • Variance
  • Skewness
  • Kurtosis
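
The measures in this list can be sketched in Python as well. The `statistics` module has no functions for skewness or kurtosis, so this example computes them from the third and fourth central moments. The function name `spread_summary` and the sample values are assumptions made for illustration, and the data is treated as a whole population (the `p` in `pvariance` and `pstdev`).

```python
import statistics

def spread_summary(values):
    """Return the measures listed above, treating values as a whole population."""
    n = len(values)
    mean = statistics.fmean(values)
    variance = statistics.pvariance(values)  # average squared distance from the mean
    std = statistics.pstdev(values)          # square root of the variance

    # Third and fourth central moments, used for skewness and kurtosis.
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n

    return {
        "min": min(values),
        "max": max(values),
        "std": std,
        "variance": variance,
        "skewness": m3 / std ** 3,  # above 0: longer tail to the right
        "kurtosis": m4 / std ** 4,  # a normal distribution gives 3
    }

# Hypothetical observations, chosen only for illustration.
summary = spread_summary([2, 4, 4, 4, 5, 5, 7, 9])
for name, value in summary.items():
    print(f"{name}: {value}")
```

Libraries may use slightly different formulas (for example, dividing by n - 1 for a sample, or reporting kurtosis minus 3), so results can differ from this sketch.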