Statisticians respond to misuse and misinterpretation of “statistical significance” (p-values) in research

Editor’s note: This article was first published on 8 March 2016. It was republished on 6 January 2017 to become part 7 of the special series Top 100 most-discussed journal articles of 2016.

Amid rising concerns about the reproducibility and replicability of scientific conclusions, the American Statistical Association (ASA) has released a formal statement1 clarifying several widely agreed upon principles underlying the proper use and interpretation of the p-value:

Underpinning many published scientific conclusions is the concept of “statistical significance,” typically assessed with an index called the p-value. While the p-value can be a useful statistical measure, it is commonly misused and misinterpreted.

The six principles are:

1. P-values can indicate how incompatible the data are with a specified statistical model.
2. P-values do not measure the probability that the studied hypothesis is true, or the probability that the data were produced by random chance alone.
3. Scientific conclusions and business or policy decisions should not be based only on whether a p-value passes a specific threshold.
4. Proper inference requires full reporting and transparency.
5. A p-value, or statistical significance, does not measure the size of an effect or the importance of a result.
6. By itself, a p-value does not provide a good measure of evidence regarding a model or hypothesis.
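Principle 5 is easy to demonstrate numerically: with a large enough sample, a practically negligible effect yields a very small p-value. The following is a minimal sketch (not from the ASA statement; the function name and figures are illustrative) using a standard two-sided normal-approximation test for a binomial proportion:

```python
from math import sqrt, erf

def z_test_pvalue(successes, n, p0=0.5):
    """Two-sided normal-approximation p-value for a binomial proportion.

    Tests the null hypothesis that the true success probability is p0.
    """
    phat = successes / n
    z = (phat - p0) / sqrt(p0 * (1 - p0) / n)
    # Two-sided tail probability of the standard normal distribution.
    return 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))

# The same tiny effect (50.1% vs the null's 50.0%) at two sample sizes:
print(z_test_pvalue(501, 1_000))             # large p-value: ~0.95
print(z_test_pvalue(5_010_000, 10_000_000))  # tiny p-value: ~2.6e-10
```

The effect size is identical in both calls; only the sample size changes. A small p-value therefore tells you the data are incompatible with the null model, not that the effect is large or important.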

Research watchdog Retraction Watch interviewed ASA executive director and statement co-author Ron Wasserstein about the new principles.

Wasserstein says that the biggest mistakes include using statistical significance as an arbiter of scientific validity, concluding that a null hypothesis is true because a computed p-value is large, and the logical fallacy of concluding that something is true which you had to assume to be true in order to reach that conclusion.

The latter error relates to principle 2, which addresses a widespread misconception about p-values. The Retraction Watch interviewer asks Wasserstein:

Some of the principles seem straightforward, but I was curious about #2 – I often hear people describe the purpose of a p-value as a way to estimate the probability the data were produced by random chance alone. Why is that a false belief?

Wasserstein responds:

Let’s think about what that statement would mean for a simplistic example. Suppose a new treatment for a serious disease is alleged to work better than the current treatment. We test the claim by matching 5 pairs of similarly ill patients and randomly assigning one to the current and one to the new treatment in each pair. The null hypothesis is that the new treatment and the old each have a 50-50 chance of producing the better outcome for any pair. If that’s true, the probability the new treatment will win for all five pairs is (½)^5 = 1/32, or about 0.03. If the data show that the new treatment does produce a better outcome for all 5 pairs, the p-value is 0.03. It represents the probability of that result, under the assumption that the new and old treatments are equally likely to win. It is not the probability the new treatment and the old treatment are equally likely to win.
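Wasserstein's arithmetic can be checked with a short sketch (assuming Python; the variable names are illustrative). It computes the exact p-value for his 5-pair example and then verifies it by simulating many experiments in which the null hypothesis really is true:

```python
import random
from math import comb

# The interview example: 5 matched pairs; under the null hypothesis the
# new treatment wins each pair with probability 1/2.
n_pairs = 5

# Exact p-value: probability the new treatment wins all 5 pairs
# if each pair is really a 50-50 coin flip.
p_value = comb(n_pairs, n_pairs) * 0.5 ** n_pairs
print(p_value)  # 0.03125, i.e. 1/32

# Simulation check: generate experiments where the null is true and count
# how often the new treatment happens to sweep all 5 pairs anyway.
random.seed(42)
trials = 100_000
sweeps = sum(
    all(random.random() < 0.5 for _ in range(n_pairs))
    for _ in range(trials)
)
print(sweeps / trials)  # close to 0.03125
```

The simulation makes the distinction concrete: the null hypothesis is true in every simulated experiment, yet a "significant" all-5 sweep still occurs about 3% of the time. The p-value is the probability of the data given the null, not the probability of the null given the data.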

Reference:

1. Wasserstein, R.L. & Lazar, N.A. (2016). The ASA’s statement on p-values: context, process, and purpose. The American Statistician, DOI:10.1080/00031305.2016.1154108

Bruce Boyes

Bruce Boyes (www.bruceboyes.info) is editor, lead writer, and a director of the award-winning RealKM Magazine (www.realkm.com), and a knowledge management (KM), environmental management, and project management professional. He is a PhD candidate in the Knowledge, Technology and Innovation Group at Wageningen University and Research, and holds a Master of Environmental Management with Distinction. His expertise and experience include knowledge management (KM), environmental management, project management, stakeholder engagement, teaching and training, communications, research, and writing and editing. With a demonstrated ability to identify and implement innovative solutions to social and ecological complexity, Bruce's many career highlights include establishing RealKM Magazine as an award-winning resource, using agile and knowledge management approaches to oversee an award-winning $77.4 million western Sydney river recovery program, leading a knowledge strategy process for Australia's 56 natural resource management (NRM) regional organisations, pioneering collaborative learning and governance approaches to support the sustainable management of landscapes and catchments, and initiating and teaching two new knowledge management subjects at Shanxi University in China.
