Statistics - Descriptive Statistics
Descriptive statistics gives us insight into data without having to look at all of it in detail.
Key Features to Describe about Data
Getting a quick overview of how the data is distributed is an important step in statistical methods.
We calculate key numerical values about the data that tell us about its distribution. We also draw graphs that show visually how the data is distributed.
Key Features of Data:
- Where is the centre of the data? (location)
- How much does the data vary? (scale)
- What is the shape of the data? (shape)
These can be described by summary statistics (numerical values).
The Centre of the Data
The centre of the data is where most of the values are concentrated.
Different kinds of averages, like mean, median and mode, are measures of the centre.
Note: Measures of the centre are also called location parameters, because they tell us something about where data is 'located' on a number line.
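As a minimal sketch of these three averages, the example below uses Python's standard `statistics` module. The sample values are hypothetical and chosen only for illustration.

```python
import statistics

# Hypothetical sample data, chosen only for illustration
values = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(values)      # sum of the values divided by their count
median = statistics.median(values)  # middle of the sorted values (average of the two middle values here)
mode = statistics.mode(values)      # the value that occurs most often

print("mean:", mean)
print("median:", median)
print("mode:", mode)
```

The three measures can disagree, as they do here: the single large value 10 pulls the mean above the median.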
The Variation of the Data
The variation of the data is how spread out the data are around the centre.
Statistics like standard deviation, range and quartiles are measures of variation.
Note: Measures of variation are also called scale parameters.
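The measures of variation named above can be sketched with Python's standard `statistics` module. The sample values are hypothetical. Quartiles can be computed with several slightly different conventions, so the default method of `statistics.quantiles` used here is an assumption and other tools may give slightly different results.

```python
import statistics

# Hypothetical sample data, chosen only for illustration
values = [2, 4, 4, 4, 5, 5, 7, 9]

# Population standard deviation: typical distance of the values from the mean
std_dev = statistics.pstdev(values)

# Range: distance between the largest and smallest value
value_range = max(values) - min(values)

# Quartiles: three cut points that split the sorted data into four parts
# (uses the default method of statistics.quantiles, one of several conventions)
q1, q2, q3 = statistics.quantiles(values, n=4)

print("standard deviation:", std_dev)
print("range:", value_range)
print("quartiles:", q1, q2, q3)
```

Use `statistics.stdev` instead of `statistics.pstdev` when the values are a sample from a larger population rather than the whole population.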
The Shape of the Data
The shape of the data can refer to how the data are bunched up on either side of the centre.
Statistics like skew describe whether the right or left side of the centre is bigger. Skew is one type of shape parameter.
One typical way of presenting data is with frequency tables.
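The text does not say how skew is calculated. One common choice, assumed here, is the moment-based skewness coefficient: the average cubed deviation from the mean divided by the standard deviation cubed. The helper name `skewness` and the sample values are hypothetical.

```python
def skewness(data):
    """Moment-based skewness (one common definition, assumed for this sketch)."""
    n = len(data)
    mean = sum(data) / n
    # Second and third central moments of the data
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    return m3 / m2 ** 1.5

# Hypothetical data: one large value stretches the right side
right_tailed = [1, 2, 2, 3, 3, 3, 10]
# Hypothetical data: evenly balanced around the centre
symmetric = [1, 2, 3, 4, 5]

print("right-tailed skew:", skewness(right_tailed))
print("symmetric skew:", skewness(symmetric))
```

With this definition, a positive result means the values stretch further to the right of the centre, a negative result means they stretch further to the left, and a result near zero suggests a roughly symmetric distribution.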
A frequency table counts and orders data into a table. Typically, the data will need to be sorted into intervals.
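A rough sketch of building a frequency table with intervals is shown below. The `ages` values and the interval width of 20 are hypothetical choices made only for illustration.

```python
from collections import Counter

# Hypothetical data and interval width, chosen only for illustration
ages = [23, 37, 41, 45, 52, 58, 58, 61, 64, 79]
width = 20

# Map each value to the lower bound of its interval, then count per interval
counts = Counter((age // width) * width for age in ages)

# Order the intervals and label each one, e.g. "20-39"
frequency_table = [
    (f"{lower}-{lower + width - 1}", counts[lower])
    for lower in sorted(counts)
]

for interval, frequency in frequency_table:
    print(interval, frequency)
```

The choice of interval width affects how the table looks: wide intervals hide detail, while narrow intervals can leave many intervals with very few values.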
Frequency tables are often the basis for making graphs to visually present the data.
Different types of graphs are used for different kinds of data. For example:
- Pie charts for qualitative data
- Histograms for quantitative data
- Scatter plots for bivariate data
Graphs often have a close connection to numerical summary statistics.
For example, box plots show where the quartiles are.
Together with the quartiles, a box plot also shows the minimum and maximum values, the range, the interquartile range, and the median.
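The numbers a box plot is drawn from can be sketched as follows. The sample values are hypothetical, and the `inclusive` method of `statistics.quantiles` is an assumed convention; plotting tools may calculate quartiles slightly differently.

```python
import statistics

# Hypothetical sample data, chosen only for illustration
values = [1, 2, 3, 4, 5, 6, 7, 8, 9]

minimum = min(values)
maximum = max(values)

# The "inclusive" method treats the data as covering the full range of interest
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")

value_range = maximum - minimum   # distance between the extremes
iqr = q3 - q1                     # interquartile range: width of the middle half

print("min, Q1, median, Q3, max:", minimum, q1, median, q3, maximum)
print("range:", value_range)
print("interquartile range:", iqr)
```

In a typical box plot, the box spans from Q1 to Q3, a line inside the box marks the median, and the whiskers reach toward the minimum and maximum values.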