p-values

February 29

I lead math workshops for second-year electronic engineering students at UCL, and statistics has always been the most challenging topic to teach. There is far too much misuse of frequentist statistics in academia (including academic publishing), medicine, and law. Instead of making the students follow some ritual for rejecting the null hypothesis, my goal is to make sure they get the fundamentals right.

In a 2002 study, academics and students from the psychology departments of several German universities were asked to fill out a questionnaire. It consisted of 6 statements about p-values, which the participants had to mark as true or false. For the second year in a row, I’m running a little experiment: at the beginning of the workshop, I present a similar scenario to my students and ask them to judge those same 6 statements. Here’s my version:

The p-value is the probability of obtaining a result at least as extreme as the one observed, assuming that the null hypothesis is true.
You have a treatment that you suspect may cure covid.
You compare the means of your control and experimental groups, each of size 100. At the end of the experiment, 50 people in the control group and 60 people in the experimental group do not have covid.
You use a simple independent-means t-test to investigate whether there is a significant difference between the two groups.
You compute a p-value of 0.01.
Are the following statements true or false?
1) You have absolutely disproved the null hypothesis.
2) You have found the probability of the null hypothesis being true.
3) You have absolutely proved your alternative hypothesis.
4) You can deduce the probability of the alternative hypothesis being true.
5) You know, if you decide to reject the null hypothesis, the probability that you are making the wrong decision.
6) You have a reliable experimental finding that if the experiment were repeated a great number of times, you would obtain a significant result on 99% of occasions.

Results from today’s workshop:

1) 0 true, 15 false
2) 14 true, 1 false
3) 0 true, 15 false
4) 13 true, 2 false
5) 13 true, 2 false
6) 1 true, 14 false

All six statements are, in fact, false. The lesson is that people love to attach additional meaning to p-values and other statistical concepts, even when a clear definition is given. It’s absolutely not an issue that the students didn’t get all the answers right—that’s what education is for. The problem is that even the people who are supposed to teach these things often make the same mistakes:

A bar chart. y axis: 'Proportion that marked all 6 statements correctly (%)'. x axis: 1) 'Academics teaching statistics (n = 30)': 20%, 2) 'Academics not teaching statistics (n = 39)': 10.3%, 3) 'Psychology students (n = 44)': 0%.
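
One way to see why the statements about "the probability of the null hypothesis" fail (the second, fourth, and fifth in the list) is to work through Bayes' rule with some made-up numbers. The sketch below is only an illustration, not part of the workshop material: the function name `prob_null_given_significant`, the significance threshold of 0.05, the power of 0.8, and the three prior probabilities are all assumptions chosen for the example. It also simplifies the decision to "significant or not" at a fixed threshold rather than using the exact observed p-value.

```python
def prob_null_given_significant(prior_null, alpha, power):
    """Probability that the null hypothesis is true, given a significant result.

    prior_null: assumed probability that the null is true before the experiment
    alpha:      probability of a significant result when the null is true
    power:      probability of a significant result when the null is false
    """
    false_positives = alpha * prior_null         # null true, result significant
    true_positives = power * (1.0 - prior_null)  # null false, result significant
    return false_positives / (false_positives + true_positives)


# Hypothetical scenarios: the same test, but different prior plausibility
# of the treatment having no effect.
for prior_null in (0.5, 0.9, 0.99):
    posterior = prob_null_given_significant(prior_null, alpha=0.05, power=0.8)
    print(f"P(null) = {prior_null:.2f} before -> "
          f"P(null | significant) = {posterior:.3f}")
```

With these assumed inputs, the probability of wrongly rejecting the null comes out at roughly 6%, 36%, and 86% respectively. The same significance threshold gives very different answers depending on the prior and the power, neither of which is contained in the p-value. That is why a p-value alone cannot tell you the probability that either hypothesis is true, or the probability that rejecting the null is a mistake.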