A mistaken confidence in data

  • Paper in General Philosophy of Science
European Journal for Philosophy of Science

Abstract

In this paper I explore an underdiscussed factor contributing to the replication crisis: Scientists, and following them policy makers, often neglect sources of errors in the production and interpretation of data and thus overestimate what can be learnt from them. This neglect leads scientists to conduct experiments that are insufficiently informative and science consumers, including other scientists, to put too much weight on experimental results. The former leads to fragile empirical literatures, the latter to surprise and disappointment when the fragility of the empirical basis of some disciplines is revealed.

Notes

  1. To clarify, ghost literatures are not defined by this particular etiology. Presumably, many factors contribute to the existence of ghost literatures, and different ghost literatures may have different etiologies.

Author information

Corresponding author

Correspondence to Edouard Machery.

Ethics declarations

Conflict of interest

The author declares that he has no conflict of interest.

Additional information

This article belongs to the Topical Collection: Philosophical Perspectives on the Replicability Crisis

Guest Editors: Mattia Andreoletti, Jan Sprenger

About this article

Cite this article

Machery, E. A mistaken confidence in data. Euro Jnl Phil Sci 11, 34 (2021). https://doi.org/10.1007/s13194-021-00354-9

  • DOI: https://doi.org/10.1007/s13194-021-00354-9
