
Statisticians respond to misuse and misinterpretation of “statistical significance” (p-values) in research

Editor’s note: This article was first published on 8 March 2016. It was republished on 6 January 2017 to become part 7 of the special series Top 100 most-discussed journal articles of 2016.

Amid rising concerns about the reproducibility and replicability of scientific conclusions, the American Statistical Association (ASA) has released a formal statement [1] clarifying several widely agreed upon principles underlying the proper use and interpretation of the p-value:

Underpinning many published scientific conclusions is the concept of “statistical significance,” typically assessed with an index called the p-value. While the p-value can be a useful statistical measure, it is commonly misused and misinterpreted.

The six principles are:

  1. P-values can indicate how incompatible the data are with a specified statistical model.
  2. P-values do not measure the probability that the studied hypothesis is true, or the probability that the data were produced by random chance alone.
  3. Scientific conclusions and business or policy decisions should not be based only on whether a p-value passes a specific threshold.
  4. Proper inference requires full reporting and transparency.
  5. A p-value, or statistical significance, does not measure the size of an effect or the importance of a result.
  6. By itself, a p-value does not provide a good measure of evidence regarding a model or hypothesis.

Research watchdog Retraction Watch has interviewed ASA executive director and statement co-author Ron Wasserstein about the new principles.

Wasserstein says that the biggest mistakes being made include using statistical significance as an arbiter of scientific validity; concluding that a null hypothesis is true because a computed p-value is large; and the logical fallacy of concluding that something is true when you had to assume it was true in order to reach that conclusion.

The latter error relates to principle 2, which addresses a widespread misconception about p-values. The Retraction Watch interviewer asks Wasserstein:

Some of the principles seem straightforward, but I was curious about #2 – I often hear people describe the purpose of a p value as a way to estimate the probability the data were produced by random chance alone. Why is that a false belief?

Wasserstein responds:

Let’s think about what that statement would mean for a simplistic example. Suppose a new treatment for a serious disease is alleged to work better than the current treatment. We test the claim by matching 5 pairs of similarly ill patients and randomly assigning one to the current and one to the new treatment in each pair. The null hypothesis is that the new treatment and the old each have a 50-50 chance of producing the better outcome for any pair. If that’s true, the probability the new treatment will win for all five pairs is (½)⁵ = 1/32, or about 0.03. If the data show that the new treatment does produce a better outcome for all 5 pairs, the p-value is 0.03. It represents the probability of that result, under the assumption that the new and old treatments are equally likely to win. It is not the probability the new treatment and the old treatment are equally likely to win.
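The arithmetic in Wasserstein's example can be sketched as a one-sided binomial tail probability: the chance, under the null hypothesis of a 50-50 split, of seeing a result at least as extreme as the one observed. The function name below is illustrative, not from the ASA statement:

```python
from math import comb

def one_sided_p_value(successes: int, trials: int, p0: float = 0.5) -> float:
    """P(X >= successes) when X ~ Binomial(trials, p0), i.e. the
    probability of a result at least this extreme under the null."""
    return sum(
        comb(trials, k) * p0**k * (1 - p0) ** (trials - k)
        for k in range(successes, trials + 1)
    )

# Wasserstein's example: the new treatment wins in all 5 of 5 matched pairs.
p = one_sided_p_value(5, 5)
print(p)  # 0.03125, i.e. (1/2)^5 = 1/32, about 0.03
```

Note what the 0.03 is conditioned on: it is computed *assuming* the treatments are equally good. It therefore cannot also be the probability that that assumption is true, which is exactly the confusion principle 2 warns against.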

Reference:

  1. Wasserstein, R.L. & Lazar, N.A. (2016). The ASA’s statement on p-values: context, process, and purpose. The American Statistician. DOI: 10.1080/00031305.2016.1154108


Bruce Boyes

Bruce Boyes is editor, lead writer, and a director of RealKM Magazine and winner of the International Knowledge Management Award 2025 (Individual Category). He is an experienced knowledge manager, environmental manager, project manager, communicator, and educator, and holds a Master of Environmental Management with Distinction and a Certificate of Technology (Electronics). His many career highlights include:

  - establishing RealKM Magazine as an award-winning resource with more than 2,500 articles and 2 million reader views
  - leading the knowledge management (KM) community KM and Sustainable Development Goals (SDGs) initiative
  - using agile approaches to oversee the on time and under budget implementation of an award-winning $77.4 million recovery program for one of Australia's iconic river systems
  - leading a knowledge strategy process for Australia’s 56 natural resource management (NRM) regional organisations
  - pioneering collaborative learning and governance approaches to empower communities to sustainably manage landscapes and catchments in the face of complexity
  - being one of the first to join a new landmark aviation complexity initiative
  - initiating and teaching two new knowledge management subjects at Shanxi University in China
  - writing numerous notable environmental strategies, reports, and other works
