How does a statistical test work? – Part 2
Degrees of freedom and p-value
In the last article, we illustrated the concept of statistical tests. If you only want a rough understanding of the concept behind statistical tests, then reading our last article is absolutely sufficient. However, if you are interested in more detail, the following will help you understand how a p-value is calculated and what “degrees of freedom” means. If you haven’t seen the previous article, have a look at it first. Last time we saw that statistical tests are based on two steps:
- Calculation of a so-called “test statistic”.
- Comparison of this test statistic with a threshold. This threshold was obtained from a distribution of the test statistic.
Now we will learn some more details. First of all, when we talked about “the distribution of the test statistic”, this is actually not a single distribution but a whole family of distributions. Which distribution applies to our data depends on our sample size. In statistical terms we don’t talk about “sample size”, but about degrees of freedom.
The term “degrees of freedom” sounds a bit abstract, so I’ll explain why we talk about “freedom”: When we do statistics, we often want to know about variability or variance. When we only measure a single value, we can’t say anything about variability, because there is no other measurement with which we could compare our value. We could say there is no room for variability, or no “freedom”. The more values we add to our sample, the more “freedom” we obtain and the more accurately we can measure variability or variance. So for a sample of ten values the sample variance has nine degrees of freedom:
df = n – 1
However, we don’t need to worry about how the degrees of freedom are calculated. There are formulae to do this, and any statistical software will calculate the degrees of freedom automatically. We only have to understand that the degrees of freedom reflect the sample size.
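To make the n – 1 rule concrete, here is a minimal sketch in Python. The ten measurements are made-up numbers chosen purely for illustration. It computes the sample variance by hand, dividing by the degrees of freedom, and compares the result with the standard library's `statistics.variance`, which uses the same denominator.

```python
import statistics

# Hypothetical sample of ten measurements (invented for illustration).
sample = [4.1, 5.3, 4.8, 5.0, 4.6, 5.2, 4.9, 5.1, 4.7, 5.3]

n = len(sample)
df = n - 1  # degrees of freedom of the sample variance

mean = sum(sample) / n

# Sum of squared deviations from the mean, divided by df rather than n.
variance_by_hand = sum((x - mean) ** 2 for x in sample) / df

# statistics.variance also divides by n - 1.
variance_stdlib = statistics.variance(sample)

print("degrees of freedom:", df)
print("variance by hand:  ", variance_by_hand)
print("variance (stdlib): ", variance_stdlib)
```

With a single measurement, df would be 0 and the division is not possible. `statistics.variance` raises an error in that case, which matches the idea that one value leaves no “freedom” to estimate variability.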
Another point we omitted in the last article was the p-value. So far we have only focussed on the threshold used to decide whether a difference is significant or not. This threshold is also called the “critical value”. In many textbooks you can still find tables where you can look up critical values for given degrees of freedom and for one- or two-sided testing. The p-value, in turn, is calculated from the distribution of the test statistic which we showed last time. In this example, we have marked the critical value for a level of significance of 0.05 (5%), shown as a red line:
The question is now: How do we obtain the p-value for the actual test statistic calculated from our data (the red arrow)? What we want to know is what proportion of the distribution lies below our t-value. So far we have plotted the distribution as a so-called “probability density function” (PDF), i.e. for each value of the test statistic we plotted the probability density on the y-axis. What we need for the calculation of a p-value is another way of plotting this distribution: We use a “cumulative distribution function” (CDF). This means that for each value on the x-axis we add up everything that lies to its left. It is easier to illustrate this if we view our t-distribution as a histogram:
A cumulative distribution is obtained by adding up the values from left to right. The first value on the left is 1, hence in the cumulative histogram we also write a 1. The next value in the upper histogram is 6. If we add this value to the value we already had (1), then at this point we have 6+1 = 7. At the next position we have 23. Adding this to the previous total (7) means that we have 23+7 = 30 values. If we continue adding up all values, we end up with the cumulative histogram shown in the lower figure.
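The running sum described above can be sketched in a few lines of Python. Only the first three bin counts (1, 6, 23) come from the text. The remaining counts are assumed here to complete a symmetric histogram and are not taken from the original figure.

```python
from itertools import accumulate

# Bin counts of the histogram, from left to right.
# 1, 6 and 23 are from the text; the other values are assumptions.
counts = [1, 6, 23, 40, 23, 6, 1]

# Running total: each entry is the sum of all counts up to that bin.
cumulative = list(accumulate(counts))

# Dividing by the total turns counts into cumulative proportions (0 to 1),
# which is the scale used by a cumulative distribution function.
total = cumulative[-1]
proportions = [c / total for c in cumulative]

print(cumulative)   # starts with 1, 7, 30 as in the text
print(proportions)
```

The last entry of the cumulative histogram always equals the total number of values, so the cumulative proportion always ends at 1.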
Now let’s go back to our original distribution. For this distribution the cumulative distribution function looks like this:
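From this cumulative distribution function, we can now easily calculate a p-value. The y-axis gives the probability of obtaining a test statistic at or below a given value. For a test that looks at the lower tail of the distribution this probability is the p-value; for the upper tail it is one minus this probability. We take our t-value on the x-axis and look up the corresponding value on the y-axis: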
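The same lookup can be sketched numerically. The example below is an illustration under stated assumptions, not the method any particular statistics package uses: it builds the CDF of the t-distribution by adding up the area under the density with Simpson's rule. The function names `t_pdf` and `t_cdf` and the t-value of -2.5 are hypothetical choices made for this sketch.

```python
import math


def t_pdf(t, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(t * t / df))


def t_cdf(t, df, steps=2000):
    """P(T <= t): the area under the density to the left of t.

    The density is symmetric around 0, so the area left of 0 is 0.5.
    The area between 0 and t is added using Simpson's rule
    (steps must be even); it is negative when t is negative.
    """
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


df = 9          # a sample of ten values, as in the example above
t_value = -2.5  # hypothetical test statistic (assumption)

p_lower = t_cdf(t_value, df)             # one-sided test, lower tail
p_upper = 1 - p_lower                    # one-sided test, upper tail
p_two_sided = 2 * min(p_lower, p_upper)  # two-sided test

print("p (lower tail):", p_lower)
print("p (upper tail):", p_upper)
print("p (two-sided): ", p_two_sided)
print("significant at 0.05 (lower tail):", p_lower < 0.05)
```

In practice a statistics library would be used for this step, for example `scipy.stats.t.cdf` in SciPy. The hand-written integration is only meant to show that the CDF is nothing more than the density added up from the left.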
In conclusion, using the cumulative distribution function (CDF) we can calculate a p-value. In mathematical terms, the p-value is defined as follows:
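One common way to write this definition is sketched below. The symbols are introduced here for illustration and are assumptions of this sketch: T is the test statistic viewed as a random variable under the null hypothesis H_0, t_obs is the value calculated from our data, and F is the cumulative distribution function of T. The two-sided form assumes a symmetric distribution such as the t-distribution.

```latex
\begin{align*}
p_{\text{lower}} &= P(T \le t_{\text{obs}} \mid H_0) = F(t_{\text{obs}}) \\
p_{\text{upper}} &= P(T \ge t_{\text{obs}} \mid H_0) = 1 - F(t_{\text{obs}}) \\
p_{\text{two-sided}} &= 2 \min\bigl(F(t_{\text{obs}}),\; 1 - F(t_{\text{obs}})\bigr)
\end{align*}
```

Which of the three forms applies depends on whether the test is one-sided (and in which direction) or two-sided.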