# Data Science - Regression Table: P-Value

## The "Statistics of the Coefficients" Part of the Regression Table

Now, we want to test whether the coefficients from the linear regression function have a significant impact on the dependent variable (Calorie_Burnage).

This means that we want to use statistical tests to show that there is a relationship between Average_Pulse and Calorie_Burnage.

There are four components that explain the statistics of the coefficients:

- **std err** stands for Standard Error
- **t** is the "t-value" of the coefficients
- **P>|t|** is called the "P-value"
- **[0.025 0.975]** represents the confidence interval of the coefficients

We will focus on understanding the "P-value" in this module.

## The P-value

The P-value is a statistic used to decide whether there is a relationship between Average_Pulse and Calorie_Burnage.

We test whether the true value of the coefficient is equal to zero (no relationship). The statistical procedure for this is called hypothesis testing.

- A low P-value (< 0.05) means that the data give evidence that the coefficient is not zero.
- A high P-value (> 0.05) means that we cannot conclude that the explanatory variable affects the dependent variable (here: whether Average_Pulse affects Calorie_Burnage).
- A high P-value is also called an insignificant P-value.
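As a rough illustration of where a P-value comes from, the sketch below computes the slope, its standard error, the t-value, and the two-sided P-value by hand for a simple linear regression, using only Python's standard library. The data values and the function names (`t_pdf`, `two_sided_p_value`, `slope_statistics`) are made up for this example and are not taken from the tutorial's data set. In practice a statistics library would do this calculation for you.

```python
import math

# Hypothetical example data (NOT the tutorial's data set).
average_pulse = [80, 85, 90, 95, 100, 105, 110, 115, 120, 125]
calorie_burnage = [300, 250, 340, 270, 310, 260, 330, 280, 320, 290]


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log(1 + x * x / df))


def two_sided_p_value(t_value, df, steps=2000):
    """P(|T| >= |t_value|), integrating the t density with Simpson's rule."""
    upper = abs(t_value)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * total * h / 3  # area between -|t| and +|t|
    return max(0.0, 1.0 - central_area)


def slope_statistics(x, y):
    """Return (slope, std_err, t_value, p_value) for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((yi - (intercept + slope * xi)) ** 2
                      for xi, yi in zip(x, y))
    df = n - 2  # two estimated parameters: intercept and slope
    std_err = math.sqrt(residual_ss / df / sxx)
    t_value = slope / std_err  # tests H0: slope = 0
    return slope, std_err, t_value, two_sided_p_value(t_value, df)


slope, std_err, t_value, p_value = slope_statistics(average_pulse, calorie_burnage)
print(f"coef={slope:.4f}  std err={std_err:.4f}  t={t_value:.3f}  P>|t|={p_value:.3f}")
```

The t-value is simply the coefficient divided by its standard error, and the P-value is the probability of seeing a t-value at least that extreme if the true coefficient were zero.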

## Hypothesis Testing

Hypothesis testing is a statistical procedure for checking whether the data support a claim about the true values behind your results.

In our example, we are testing whether the true coefficient of Average_Pulse and the true intercept are equal to zero.

A hypothesis test has two statements: the null hypothesis and the alternative hypothesis.

- The null hypothesis is abbreviated H0
- The alternative hypothesis is abbreviated HA

Mathematically written:

```
H0: Average_Pulse = 0
HA: Average_Pulse ≠ 0
H0: Intercept = 0
HA: Intercept ≠ 0
```

The sign ≠ means "not equal to".

## Hypothesis Testing and P-value

The null hypothesis can either be rejected or not.

If we reject the null hypothesis, we conclude that there is a relationship between Average_Pulse and Calorie_Burnage. The P-value is used for this conclusion.

A common threshold for the P-value is 0.05.

**Note:** A threshold of 0.05 means that, when the null hypothesis is actually true, we will falsely reject it 5% of the time. In other words, we accept a 5% risk of concluding that a relationship exists when there is none.
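The 5% figure in the note above can be checked with a small simulation. The sketch below repeatedly generates two variables that are unrelated by construction, tests the slope each time, and counts how often the P-value falls below 0.05. Everything here is an assumption made for illustration: the function names, the sample sizes, and the simulated values are hypothetical, and the P-value uses a normal approximation to the t distribution, which is reasonable only because each sample has 200 points.

```python
import random
from statistics import NormalDist


def slope_p_value(x, y):
    """Approximate two-sided P-value for H0: slope = 0 (normal approximation)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((yi - intercept - slope * xi) ** 2
                      for xi, yi in zip(x, y))
    std_err = (residual_ss / (n - 2) / sxx) ** 0.5
    t_value = slope / std_err
    return 2 * (1 - NormalDist().cdf(abs(t_value)))


def false_rejection_rate(n_experiments=1000, n_points=200, alpha=0.05, seed=1):
    """Share of experiments that reject H0 although H0 is true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_experiments):
        # x and y are drawn independently, so the true slope is zero.
        x = [rng.gauss(100, 15) for _ in range(n_points)]
        y = [rng.gauss(300, 50) for _ in range(n_points)]
        if slope_p_value(x, y) < alpha:
            rejections += 1
    return rejections / n_experiments


rate = false_rejection_rate()
print(f"Share of false rejections: {rate:.3f}")
```

The printed share should land near 0.05, which is what the threshold promises: roughly one false alarm in twenty when no relationship exists.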

If the P-value is lower than 0.05, we can reject the null hypothesis and conclude that there is a relationship between the variables.

However, the P-value of Average_Pulse is 0.824. So, we cannot conclude a relationship between Average_Pulse and Calorie_Burnage.

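It means that, if the true coefficient of Average_Pulse were zero, there would be an 82.4% chance of getting an estimate at least as far from zero as the one we observed. It is not the probability that the true coefficient is zero.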
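The decision rule described above can be sketched as a small Python helper. The function name `interpret_p_value` and the wording of the returned messages are hypothetical, chosen only for this illustration.

```python
def interpret_p_value(p_value, alpha=0.05):
    """Return the conclusion of the hypothesis test at threshold alpha."""
    if p_value < alpha:
        return "reject H0: the data support a relationship"
    return "fail to reject H0: no relationship can be concluded"


# P-value of Average_Pulse from the regression table
print(interpret_p_value(0.824))
# prints "fail to reject H0: no relationship can be concluded"
```

Note that failing to reject H0 is not the same as showing that the coefficient is zero; it only means the data do not give enough evidence against it.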

The intercept is included so that the regression function can predict more precisely. It is therefore uncommon to interpret the P-value of the intercept.