Replicability crisis in psychology
In case you haven’t heard it, the joke goes like this: not only can the results of the research not be replicated, but even the absence of replication cannot be replicated.
Richard Horton put it very clearly:
The case against science is straightforward: much of the scientific literature, perhaps half, may simply be untrue. Afflicted by studies with small sample sizes, tiny effects, invalid exploratory analyses, and flagrant conflicts of interest, together with an obsession for pursuing fashionable trends of dubious importance, science has taken a turn towards darkness.
Horton, R. (2015). What is medicine’s 5 sigma? The Lancet, 385(9976), 1380.
Andrew Gelman’s blog hosts a long-running discussion of this topic, which is well worth following. Although not everybody agrees, the null-hypothesis significance testing (NHST) approach fares quite badly in this crisis, which is obviously not only a crisis in psychology (actually, it seems much worse elsewhere). Effect sizes are small, samples are small, signal-to-noise ratios are small, and still the p-value can, quite easily, come out smaller than .05. NHST was such a simple procedure: is it any wonder that it could only fail? Should I wonder how first-year psychology students would react if we started teaching Bayesian statistics rather than the frequentist approach? Or should I ask when it will happen?
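To make the point about small effects and small samples concrete, here is a minimal simulation sketch in Python. Everything in it is an assumption chosen for illustration, not something taken from the discussions mentioned above: a true standardized effect of 0.2, 20 participants per group, a two-sample t-test with equal variances, and a hard-coded two-sided 5% critical value of 2.024 for 38 degrees of freedom. The function name simulate_small_studies is hypothetical. It counts how often such a study reaches p < .05, and by how much the significant studies overstate the true effect.

```python
import math
import random
import statistics


def simulate_small_studies(true_d=0.2, n=20, n_sims=5000, t_crit=2.024, seed=1):
    """Simulate many small two-group studies (hypothetical helper).

    Assumptions (illustrative only): normal data with unit variance,
    a true standardized effect `true_d`, `n` participants per group,
    and `t_crit` as the two-sided 5% critical value for 2n - 2 = 38
    degrees of freedom.
    Returns (share of significant studies, exaggeration ratio).
    """
    rng = random.Random(seed)
    significant_effects = []
    for _ in range(n_sims):
        control = [rng.gauss(0.0, 1.0) for _ in range(n)]
        treated = [rng.gauss(true_d, 1.0) for _ in range(n)]
        diff = statistics.fmean(treated) - statistics.fmean(control)
        # Pooled variance for two groups of equal size.
        pooled_var = (statistics.variance(control) + statistics.variance(treated)) / 2
        t_stat = diff / math.sqrt(pooled_var * 2 / n)
        if abs(t_stat) > t_crit:
            # Keep the observed standardized effect of each "significant" study.
            significant_effects.append(diff / math.sqrt(pooled_var))
    share_significant = len(significant_effects) / n_sims
    mean_abs_effect = statistics.fmean(abs(d) for d in significant_effects)
    return share_significant, mean_abs_effect / true_d


if __name__ == "__main__":
    share, exaggeration = simulate_small_studies()
    print(f"share of studies with p < .05: {share:.3f}")
    print(f"average exaggeration among significant studies: {exaggeration:.1f}x")
```

Under these assumptions a study can only reach significance when the observed effect is at least about 0.64 pooled standard deviations, more than three times the assumed true effect, so the results that pass the .05 filter are necessarily exaggerated.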
Well, apart from NHST, apart from researcher degrees of freedom, the money that comes from ‘observing’ a given result rather than another, the garden of forking paths, and the bad decisions of the government agencies that dictate the criteria for career progression in academia, I think it is worthwhile to remember the provocation of Nelson, Simmons, and Simonsohn (2012), who proposed a Utopia in which researchers are allowed to publish only one paper per year:
Publication quantity is no longer a relevant dimension. This system incentivizes researchers to demonstrate that an effect is robust and generalizable, and hence true and important.
… if Hari Seldon arrived at work and the published literature was slimmer and more digestible, would he be worse off? Furthermore, rather than wondering about how to evaluate two job candidates who differ in quality and quantity, Seldon would instead see candidates who were matched on the latter, allowing him to entirely focus on the former. Finally, Hari can pursue his own work with improved clarity and focus. There is only one paper to write this year. He had better make it count.
What’s wrong with that?