# Artificial Intelligence

## Statistics

Statistics is about how to collect, analyze, interpret, and present data.

- What is the most **Common?**
- What is the most **Expected?**
- What is the most **Normal?**

## Descriptive Statistics

**Descriptive Statistics** are methods for summarizing observations into information that we can understand.

Because every newborn baby is registered, we can tell that 51 out of 100 are boys.

From the numbers we have collected, we can predict a 51% chance that a new baby will be a boy.

It is a mystery that the ratio is not 50%, as basic biology would predict. We can only say that this tilted sex ratio has existed since at least the 17th century.

## Inferential Statistics

**Inferential statistics** are methods for quantifying properties of a population from a small **Sample**:

You take data from a sample and make a prediction about the whole population.

For example, you can stand in a shop and ask a **sample of 100 people** if they like chocolate.

If 91 of them say yes, you could use inferential statistics to predict that about 91% of **all shoppers** like chocolate.

## Incredible Chocolate Facts

Nine out of ten people love chocolate.

50% of the US population cannot live without chocolate every day.

## Mean Values

The mean value is the **Average** of all values.

This table contains house prices versus size:

| Price | 7 | 8 | 8 | 9 | 9 | 9 | 10 | 11 | 14 | 14 | 15 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Size | 50 | 60 | 70 | 80 | 90 | 100 | 110 | 120 | 130 | 140 | 150 |

The mean price is (7+8+8+9+9+9+10+11+14+14+15)/11 = 114/11 ≈ 10.36.

**How to**: Add all numbers, then divide by the number of numbers.

The **Mean** is the **Sum** divided by the **Count**.
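The rule can be sketched as a small plain-JavaScript helper. The function name `mean` and the `prices` array are illustrative; the values are the prices from the table above.

```javascript
// Mean = sum of all values divided by the number of values.
function mean(values) {
  if (values.length === 0) {
    throw new Error("mean() needs at least one value");
  }
  var sum = values.reduce(function (total, value) {
    return total + value;
  }, 0);
  return sum / values.length;
}

var prices = [7, 8, 8, 9, 9, 9, 10, 11, 14, 14, 15];
console.log(mean(prices)); // about 10.36
```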

Or if you use a math library like **math.js**:

```
var mean = math.mean([7,8,8,9,9,9,10,11,14,14,15]);
```

## The Variance

In statistics, the **Variance** is the average of the squared differences from the mean value.

In other words, it describes how far a set of numbers is spread out from their average value.

The Variance (in JavaScript):

```
// Calculate the Mean (m)
var m = (7+8+8+9+9+9+10+11+14+14+15)/11;

// Calculate the Sum of Squares (ss)
var ss = (7-m)**2 + (8-m)**2 + (8-m)**2 + (9-m)**2 + (9-m)**2 + (9-m)**2 + (10-m)**2 + (11-m)**2 + (14-m)**2 + (14-m)**2 + (15-m)**2;

// Calculate the Variance
var variance = ss / 11;
```

Or if you use a math library like **math.js**:

```
var variance = math.variance([7,8,8,9,9,9,10,11,14,14,15],"uncorrected");
```
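The `"uncorrected"` argument asks math.js to divide by the count (population variance). Without it, math.js divides by the count minus one (sample variance). A minimal plain-JavaScript sketch of both follows; the function name `variance` and its `isSample` flag are illustrative and not part of any library.

```javascript
// Variance: average squared distance from the mean.
// isSample = false divides by n (population variance).
// isSample = true divides by n - 1 (sample variance).
function variance(values, isSample) {
  var n = values.length;
  if (n < (isSample ? 2 : 1)) {
    throw new Error("not enough values");
  }
  // Mean of the values
  var m = values.reduce(function (a, b) { return a + b; }, 0) / n;
  // Sum of squared differences from the mean
  var ss = values.reduce(function (acc, v) { return acc + (v - m) ** 2; }, 0);
  return ss / (isSample ? n - 1 : n);
}

var prices = [7, 8, 8, 9, 9, 9, 10, 11, 14, 14, 15];
console.log(variance(prices, false)); // about 6.96
console.log(variance(prices, true));  // about 7.65
```

The two results differ noticeably for a small data set like this one and move closer together as the count grows.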

## Standard Deviation

**Standard Deviation** is a measure of how spread out numbers are.

The symbol is **σ** (Greek letter sigma).

The standard deviation is the square root of the variance: **σ = √variance**.

The Standard Deviation is (in JavaScript):

```
// Calculate the Mean (m)
var m = (7+8+8+9+9+9+10+11+14+14+15)/11;

// Calculate the Sum of Squares (ss)
var ss = (7-m)**2 + (8-m)**2 + (8-m)**2 + (9-m)**2 + (9-m)**2 + (9-m)**2 + (10-m)**2 + (11-m)**2 + (14-m)**2 + (14-m)**2 + (15-m)**2;

// Calculate the Variance
var variance = ss / 11;

// Calculate the Standard Deviation
var std = Math.sqrt(variance);
```

Or if you use a math library like **math.js**:

```
var std = math.std([7,8,8,9,9,9,10,11,14,14,15],"uncorrected");
```

For the **Normal Distribution Curve** (Bell Curve), values within one **Standard Deviation** of the **Mean** account for 68.27% of the set, values within two standard deviations account for 95.45%, and values within three standard deviations account for 99.73%.
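These percentages can be checked numerically. For a normal distribution, the share of values within k standard deviations of the mean equals erf(k/√2). JavaScript has no built-in error function, so the sketch below assumes a well-known polynomial approximation (Abramowitz and Stegun, formula 7.1.26, accurate to about 1.5e-7). The function names `erf` and `shareWithin` are illustrative.

```javascript
// Approximate error function (Abramowitz & Stegun 7.1.26).
function erf(x) {
  var sign = x < 0 ? -1 : 1;
  var ax = Math.abs(x);
  var t = 1 / (1 + 0.3275911 * ax);
  // Polynomial in t, evaluated with Horner's scheme
  var poly = ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t
    - 0.284496736) * t + 0.254829592) * t;
  return sign * (1 - poly * Math.exp(-ax * ax));
}

// Share of a normal distribution within k standard deviations of the mean.
function shareWithin(k) {
  return erf(k / Math.SQRT2);
}

[1, 2, 3].forEach(function (k) {
  console.log(k + " SD: " + (shareWithin(k) * 100).toFixed(2) + "%");
});
```

The printed shares should land close to the 68.27%, 95.45%, and 99.73% quoted above.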

## The Margin of Error

Statisticians will always try to predict everything with 100% accuracy.

But there will always be some uncertainty.

The **Margin of Error** is the number that quantifies this uncertainty.
Different margins of error define different ranges within which we believe
the true answer can be found.

The acceptable margin of error is a matter of judgment and depends on how important the answer is.
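As one concrete illustration, a common textbook approximation for a survey percentage is margin of error = z × √(p(1 − p)/n), where p is the sample proportion, n is the sample size, and z ≈ 1.96 for a 95% confidence level. The text does not say which formula it has in mind, so this sketch is an assumption, and the function name `marginOfError` is illustrative. It reuses the chocolate survey from earlier (91% of 100 people).

```javascript
// Approximate margin of error for a sample proportion.
// p: sample proportion (0..1), n: sample size,
// z: z-score for the confidence level (assumed 1.96, roughly 95%).
function marginOfError(p, n, z) {
  if (z === undefined) {
    z = 1.96;
  }
  return z * Math.sqrt((p * (1 - p)) / n);
}

console.log(marginOfError(0.91, 100));  // about 0.056 (5.6 percentage points)
console.log(marginOfError(0.91, 1000)); // about 0.018 (1.8 percentage points)
```

Under this approximation, asking ten times as many people shrinks the margin by a factor of about √10.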

The more samples we collect, the lower the margin of error is: