Interpretation of and alternatives to p-values in biomedical sciences
- Authors: Grjibovski A.M. (1, 2, 3, 4), Gvozdeckii A.N. (5)
- Affiliations:
  1. Northern state medical university
  2. Al-Farabi Kazakh national university
  3. West Kazakhstan Marat Ospanov medical university
  4. North-Eastern federal university
  5. Mechnikov North-Western state medical university
- Issue: Vol 29, No 3 (2022)
- Pages: 209-218
- Section: Articles
- URL: https://journals.rcsi.science/1728-0869/article/view/97249
- DOI: https://doi.org/10.17816/humeco97249
- ID: 97249
Abstract
Difficulties in interpreting the results of statistical analysis have been repeatedly cited as one of the factors behind the poor reproducibility of research findings in biomedical sciences. This has prompted a series of publications proposing ways to improve the situation, including abandonment of p-values and significance testing. In this paper we briefly present the scope of the problem as well as the Fisher and Neyman–Pearson approaches to hypothesis testing. We also present confidence intervals and effect size calculation as alternatives to dichotomizing results as significant or not significant at a fixed cut-off level. In addition, we summarize the pros and cons of the proposal to lower the cut-off value from the traditional 0.05 to 0.005, and we list the most common misunderstandings of p-values discussed in the international statistical literature.
We conclude the paper with brief recommendations on careful interpretation of the results of statistical analysis to prevent misinterpretation and misuse of p-values in biomedical studies.
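As a rough illustration of reporting an effect size and a confidence interval instead of a significant/not significant verdict, the sketch below computes a mean difference, an approximate 95% confidence interval, and Cohen's d for two groups. It is not taken from the paper: the function name `mean_difference_summary` and the data are hypothetical, and the interval uses a normal quantile, which is only a large-sample approximation (a Student's t quantile would give a wider interval for samples this small).

```python
import math
from statistics import NormalDist, mean, stdev


def mean_difference_summary(group_a, group_b, confidence=0.95):
    """Return the mean difference, its approximate CI, and Cohen's d."""
    n_a, n_b = len(group_a), len(group_b)
    diff = mean(group_a) - mean(group_b)
    var_a, var_b = stdev(group_a) ** 2, stdev(group_b) ** 2

    # Pooled standard deviation, used as the denominator of Cohen's d.
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    cohens_d = diff / pooled_sd

    # Standard error of the difference (does not assume equal variances).
    se = math.sqrt(var_a / n_a + var_b / n_b)

    # Assumption: normal quantile instead of Student's t (large-sample approximation).
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return diff, (diff - z * se, diff + z * se), cohens_d


# Hypothetical measurements, invented purely for illustration.
treated = [5.1, 4.8, 5.6, 5.3, 4.9, 5.4, 5.0, 5.2]
control = [4.7, 4.9, 4.5, 5.0, 4.6, 4.8, 4.4, 4.9]
diff, (low, high), d = mean_difference_summary(treated, control)
print(f"difference = {diff:.2f}, 95% CI {low:.2f} to {high:.2f}, Cohen's d = {d:.2f}")
```

Reported this way, the reader sees both the size of the difference and the uncertainty around it, rather than only whether a threshold was crossed.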
About the authors
Andrej M. Grjibovski
Northern state medical university; Al-Farabi Kazakh national university; West Kazakhstan Marat Ospanov medical university; North-Eastern federal university
Author for correspondence.
Email: andrej.grjibovski@gmail.com
ORCID iD: 0000-0002-5464-0498
SPIN-code: 5118-0081
MD, MPhil, PhD
Russian Federation, Arkhangelsk; Almaty, Kazakhstan; Aktobe, Kazakhstan; Yakutsk
Anton N. Gvozdeckii
Mechnikov North-Western state medical university
Email: gvozdetskiy_an@outlook.com
ORCID iD: 0000-0001-8045-1220
SPIN-code: 4430-6841
MD, Cand. Sci. (Med.)
Russian Federation, St. Petersburg