7.8 What is a p-value and how is it interpreted?
It is common in many fields, not just biology, to calculate and report a p-value, but in our experience p-values are often misused and misinterpreted. In this course, we will not ask you to calculate p-values, but we do want you to understand what they are and how to interpret them.
What is a p-value?
A p-value is a number that we obtain from many types of statistical tests. It measures how likely it is that we would see data like ours by random chance alone. Let’s look at an example. This course is offered in multiple sections. We could calculate the average height for a random subset of 30 students in section 001 and also the average height for a random subset of 30 students in section 002.
- Section 001 average height = 165 inches
- Section 002 average height = 167 inches
When we look at the data above, we can see that the two averages are not identical. However, we do not know whether that is due to random chance when we randomly selected our 30 individuals to measure or whether Section 002 is significantly taller than Section 001. This is where a statistical test for significance comes into play.
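To see how much two averages can differ through sampling alone, here is a minimal sketch in Python. It assumes a made-up population of 1,000 heights (the mean of 166 and spread of 7 are illustrative values, not course data) and draws two random samples of 30 from that same population.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch gives the same numbers on every run

# Hypothetical population: 1,000 heights centered on 166 (made-up values).
population = [random.gauss(166, 7) for _ in range(1000)]

# Draw two independent random samples of 30 from the SAME population.
sample_a = random.sample(population, 30)
sample_b = random.sample(population, 30)

mean_a = statistics.mean(sample_a)
mean_b = statistics.mean(sample_b)

print(f"Sample A average: {mean_a:.1f}")
print(f"Sample B average: {mean_b:.1f}")
print(f"Difference:       {abs(mean_a - mean_b):.1f}")
```

Both samples come from one population, so any difference between the two averages here is due to random sampling alone.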
NOTE: It is common for people to use the word significant in their everyday language to mean really big or important. In science, we only use the word significant if a statistical test for significance has been done.
First, when conducting a statistical test, we will have a null hypothesis, which is always “There is no difference between the conditions”. In our example, the null hypothesis is “There is no difference in height between the students in Section 001 and the students in Section 002”.
Second, there is also an alternative hypothesis. This one depends on your experiment, but is generally going to say that your conditions are different. In our example, you may hypothesize that students in Section 002 get more sleep and are therefore taller. If we look at the two averages, this would seem to be supported by the data.
So how does a scientist figure out if the difference in average height is due to the randomness of the 30 people sampled from each section OR if the two sections are actually different in height?
By calculating the p-value, we can judge whether a difference in average height this large could plausibly be due to random chance alone, or whether the data point to an actual relationship between class section and height.
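You will not need to calculate p-values in this course, but seeing one way of doing it can make the idea more concrete. The sketch below uses a permutation test, which is only one of many possible tests and is not necessarily the test behind the numbers in this section. The function name `permutation_p_value` and the height data are hypothetical. The idea: if section really does not matter, the section labels are interchangeable, so we shuffle them many times and count how often a shuffled difference is at least as large as the one we observed.

```python
import random

def permutation_p_value(group_a, group_b, n_shuffles=5000, seed=0):
    """Estimate a two-sided p-value for a difference in means by shuffling labels."""
    rng = random.Random(seed)
    n_a = len(group_a)
    observed = abs(sum(group_a) / n_a - sum(group_b) / len(group_b))

    pooled = list(group_a) + list(group_b)
    at_least_as_extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # reassign individuals to the two groups at random
        shuffled_a = pooled[:n_a]
        shuffled_b = pooled[n_a:]
        diff = abs(sum(shuffled_a) / len(shuffled_a) - sum(shuffled_b) / len(shuffled_b))
        if diff >= observed:
            at_least_as_extreme += 1

    # Fraction of shuffles with a difference at least as large as the observed one.
    return at_least_as_extreme / n_shuffles

# Made-up heights for 30 students per section (not real course data).
random.seed(2)
section_001 = [random.gauss(165, 7) for _ in range(30)]
section_002 = [random.gauss(167, 7) for _ in range(30)]

p = permutation_p_value(section_001, section_002)
print(f"Estimated p-value: {p:.3f}")
```

The result is an estimate: more shuffles give a more stable number, and a different test could give a somewhat different p-value for the same data.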
The range for p-values is between 0 and 1. A smaller value means that a difference as large as the one we observed would be less likely to occur by random chance alone if the null hypothesis were true. In other words, a smaller p-value is stronger evidence that there is an actual difference between your two conditions. A larger value means that the results are consistent with random chance, so we do not have good evidence that your two conditions differ.
- Smaller p-value = stronger evidence that the conditions are actually different
- Larger p-value = weaker evidence that the conditions are different
But what is small and what is large? You have likely seen the criterion that a p-value ≤ 0.05 is considered statistically significant, but what does this mean? It means that, if the null hypothesis were true, there would be a 5% or smaller probability of seeing a difference at least as large as the one we observed. In other words, it is strong evidence against the null hypothesis. Therefore, we would reject the null hypothesis and have support for the alternative hypothesis.
- p > 0.05: fail to reject the null hypothesis
- p ≤ 0.05: reject the null hypothesis
Note: Scientists must set their significance level before conducting their tests. It may make more sense based on sample size and the magnitude of the effect for a particular experiment to have a significance level of 0.01 or 0.10. Typically, scientists use 0.05.
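The decision rule and the choice of significance level can be written as a small function. This is a minimal sketch; the names `decide` and `alpha` are hypothetical, and the wording "fail to reject" is used instead of "accept" because a large p-value is not proof that the null hypothesis is true.

```python
def decide(p_value, alpha=0.05):
    """Compare a p-value with a significance level chosen before the test was run."""
    if not 0 <= p_value <= 1:
        raise ValueError("a p-value must be between 0 and 1")
    if p_value <= alpha:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"

print(decide(0.03))              # reject the null hypothesis
print(decide(0.09))              # fail to reject the null hypothesis
print(decide(0.09, alpha=0.10))  # reject the null hypothesis
```

The last two lines show why the significance level has to be fixed in advance: the same p-value leads to different conclusions under different levels.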
Let’s get back to our example. We are asking: how likely is it that we would see a difference in average height at least as large as the one between Section 001 and Section 002 if our null hypothesis that there is no difference is true? We ran the test and our p-value = 0.09.
Since 0.09 is greater than 0.05, we fail to reject our null hypothesis (this is often loosely called "accepting" it). We do not have enough evidence to conclude that the height of Section 001 students differs from the height of Section 002 students. Therefore, there is no significant difference between these two groups.
If you’d like to watch a video about p-values, check out this Khan Academy video [7:58].
Check Yourself
We have two groups of mice. One was given a standard diet and the other was given a diet with added Vitamin C. We found that mice on the standard diet lived an average of 2.1 years. Mice on the added Vitamin C diet lived an average of 2.6 years. We found that p = 0.35.
WARNINGS about use and interpretation of p-value
- A p-value does not tell you whether your null hypothesis is true or false. Rather, it tells you how likely it is to see data at least as extreme as what you observed if the null hypothesis were true.
- The p-value can only tell you whether your data are consistent with the null hypothesis. It cannot tell you whether your alternative hypothesis is true, or why.
- Larger sample sizes can cause small differences between conditions to become statistically significant. When presented with data, examine the actual difference between groups (like your treatment and your control group) as well as the p-value.
- Image made in PowerPoint by author, Katherine Furniss
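The warning about sample size can be illustrated with a short calculation. This is a rough sketch under stated assumptions: it uses a normal approximation with the same known spread in both groups, and the function name `approx_two_sided_p` and the values (a difference of 0.5 and a standard deviation of 7) are made up for illustration. The difference between groups stays fixed while only the sample size changes.

```python
from math import sqrt
from statistics import NormalDist

def approx_two_sided_p(mean_diff, sd, n_per_group):
    """Approximate two-sided p-value for a difference in means (normal approximation)."""
    standard_error = sd * sqrt(2 / n_per_group)  # shrinks as the sample grows
    z = abs(mean_diff) / standard_error
    return 2 * (1 - NormalDist().cdf(z))

# Same small difference every time; only the sample size changes.
for n in (30, 300, 3000, 30000):
    p = approx_two_sided_p(mean_diff=0.5, sd=7, n_per_group=n)
    print(f"n per group = {n:>6}: p = {p:.4f}")
```

With 30 per group the difference is far from significant, while with thousands per group the same small difference falls below 0.05. The size of the difference has not changed, which is why it should be reported alongside the p-value.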