Istotność statystyczna III. Od rytuału do myślenia statystycznego

Piotr Wolski


Statistical significance III. From ritual to statistical thinking

One of the more prominent problems with significance testing is the ritualisation of its practical use and interpretation. This third part of the series discusses the reasons for and manifestations of that rigidity, and presents an alternative, sometimes labeled “statistical thinking”. Matching a significance testing scenario to the needs of a specific research program is part of statistical thinking. Some typical scenarios are described to show how the same statistical tool, depending on its assumptions, can serve different purposes in research.

Keywords: statistical inference, null hypothesis significance testing, NHST, p-value, statistical power

Abelson, R.P. (1995). Statistics as Principled Argument. New York: Psychology Press. Taylor & Francis Group.

American Psychological Association. (2010). Publication Manual of the American Psychological Association. 6th edition. Washington, DC: American Psychological Association.

Brandstätter, E. (1999). Confidence intervals as an alternative to significance testing. Methods of Psychological Research Online, 4(2), 33–46.

Brzeziński, J.M. (2012a). Co to znaczy, że wyniki przeprowadzonych przez psychologów badań naukowych poddawane są analizie statystycznej? Roczniki Psychologiczne, 15(3), 7–40.

Brzeziński, J.M. (2012b). Kontekst teorii psychologicznej a kontekst analizy statystycznej. Roczniki Psychologiczne, 15(3), 75–81.

Cohen, J. (1992). A power primer. Psychological Bulletin, 112(1), 155–159.

Cohen, J. (1994). The earth is round (p < .05). American Psychologist, 49(12), 997. doi:10.1037/0003-066X.49.12.997

Cohen, J. (2006). Ziemia jest okrągła (p < 0,05). In: J. Brzeziński, J. Siuta (Eds.), Metodologiczne i statystyczne problemy psychologii (pp. 100–118). Poznań: Wydawnictwo Zysk i S-ka.

Colquhoun, D. (2014). An investigation of the false discovery rate and the misinterpretation of p-values. Royal Society Open Science, 1(3), 140216. doi:10.1098/rsos.140216

Coulson, M., Healey, M., Fidler, F., Cumming, G. (2010). Confidence intervals permit, but do not guarantee, better inference than statistical significance testing. Frontiers in Psychology, 1, 26. doi:10.3389/fpsyg.2010.00026

Cumming, G. (2014). The new statistics: Why and how. Psychological Science, 25(1), 7–29. doi:10.1177/0956797613504966

Finch, S., Cumming, G., Williams, J., Palmer, L., Griffith, E., Alders, C., ... Goodman, O. (2004). Reform of statistical inference in psychology: The case of Memory & Cognition. Behavior Research Methods, Instruments & Computers, 36(2), 312–324.

Fisher, R.A. (1971). The Design of Experiments (8th ed.). New York: Hafner Publishing Company.

Gigerenzer, G. (1998). We need statistical thinking, not statistical rituals. Behavioral and Brain Sciences, 21, 199–200.

Gigerenzer, G. (2004). Mindless statistics. Journal of Socio-Economics, 33(5), 587–606. doi:10.1016/j.socec.2004.09.033

Greene, J., D’Oliveira, M. (1982). Open Guides to Psychology: Learning to Use Statistical Tests in Psychology: A Student’s Guide. Milton Keynes: Open University Press.

Hoekstra, R., Johnson, A., Kiers, H.A.L. (2012). Confidence intervals make a difference: Effects of showing confidence intervals on inferential reasoning. Educational and Psychological Measurement, 72(6), 1039–1052. doi:10.1177/0013164412450297

Hoekstra, R., Morey, R.D., Rouder, J.N., Wagenmakers, E.J. (2014). Robust misinterpretation of confidence intervals. Psychonomic Bulletin and Review, 21(5), 1157–1164. doi:10.3758/s13423-013-0572-3

Kahneman, D. (2011). Thinking, Fast and Slow. New York: Farrar, Straus and Giroux.

McManus, I.C., Davison, A., Armour, J.A. (2013). Multilocus genetic models of handedness closely resemble single-locus models in explaining family data and are compatible with genome-wide association studies. Annals of the New York Academy of Sciences, 1288, 48–58. doi:10.1111/nyas.12102

Meehl, P.E. (1978). Theoretical risks and tabular asterisks: Sir Karl, sir Ronald, and the slow progress of soft psychology. Journal of Consulting and Clinical Psychology, 46(4), 806–834.

Ostasiewicz, W. (2012). Myślenie statystyczne. Warszawa: Oficyna a Wolters Kluwer Business.

Palij, M. (2012). New statistical rituals for old. PsycCRITIQUES, 57(24). doi:10.1037/a0028079

Sedlmeier, P., Gigerenzer, G. (1989). Do studies of statistical power have an effect on the power of studies? Psychological Bulletin, 105(2), 309.

Simonsohn, U., Nelson, L.D., Simmons, J.P. (2014). P-curve: A key to the file-drawer. Journal of Experimental Psychology: General, 143(2), 534–547. doi:10.1037/a0033242

StatsLife. (2015). Academic journal bans p-value significance test. Royal Statistical Society. Retrieved from:

Trafimow, D., Marks, M. (2015). Editorial. Basic and Applied Social Psychology, 37(1), 1–2.

Tversky, A., Kahneman, D. (1971). Belief in the law of small numbers. Psychological Bulletin, 76(2), 105.

Wasserstein, R. (2015). ASA comment on a journal’s ban on null hypothesis statistical testing. Retrieved from:

Weinberg, G.M. (1979). Myślenie systemowe. Warszawa: Wydawnictwa Naukowo-Techniczne.

Wilkinson, L., APA Task Force on Statistical Inference. (1999). Statistical methods in psychology journals: Guidelines and explanations. American Psychologist, 54(8), 594–604.

Wojciszke, B. (2004). Systematycznie modyfikowane autoreplikacje: Logika programu badań empirycznych w psychologii. In: J. Brzeziński (Ed.), Metodologia badań psychologicznych. Wybór tekstów (pp. 44–60). Warszawa: Wydawnictwo Naukowe PWN.

Woolston, Ch. (2015). Psychology journal bans P values. Nature, 519(7541), 9. doi:10.1038/519009f

The primary version of the journal is the electronic version published on the internet.
The journal is published continuously online.