Abstract
Evidence hierarchies are widely used to assess evidence in systematic reviews of medical studies. I give several arguments against the use of evidence hierarchies. The problems with evidence hierarchies are numerous, and include methodological shortcomings, philosophical problems, and formal constraints. I argue that medical science should not employ evidence hierarchies, including even the latest and most sophisticated of such hierarchies.
Notes
I have extracted this table from a slightly more detailed version found on the website of the Oxford Centre for Evidence-Based Medicine (http://www.cebm.net/index.aspx?o=1025, accessed Feb 15, 2013).
In part I draw on extant arguments by past critics of evidence hierarchies in medicine, including Bluhm (2005), Upshur (2005), Rawlins (2008), Goldenberg (2009), Borgerson (2009), Solomon (2011) and La Caze (2011). For a specific critique of placing meta-analysis at the top of such hierarchies, see Stegenga (2011); the assumption that RCTs necessarily belong near the top of such hierarchies has been criticized by Worrall (2002) and Cartwright (2007), among many others.
Etymologically, a hierarchy refers to “rule by priests”, in which the hierarch is the top ruling priest.
I am taking internal validity to mean, roughly, freedom from systematic error in the method. This is usually contrasted with external validity, which I take to mean, roughly, the validity of extrapolating results from a test situation to a target situation. For a classic statement of these terms, see Cook and Campbell (1979).
The evidence from an observational study might be especially salient, for instance, if there were a very strong association between the purported cause and its effect, and there were no obvious threats to the internal validity of the study. See Vandenbroucke (2008).
These other causes are sometimes called ‘confounding causes’. A standard probabilistic definition of causality is: C causes E iff p(E | C ∧ Ki) > p(E | ¬C ∧ Ki), where the Ki are the potential confounding causes. See Cartwright (1979) for a canonical, but slightly different, definition.
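To make this definition concrete, consider the following small numerical sketch. It is my own illustration, not drawn from the paper: the joint distribution over a binary cause C, a single binary confounder K, and an effect E is hypothetical, chosen so that the probability-raising inequality holds within each stratum of the confounder.

```python
# Illustration (hypothetical numbers) of the probabilistic definition of
# causality: C causes E iff p(E | C, K) > p(E | ~C, K) for each way K of
# holding the confounding causes fixed.

# Hypothetical joint distribution over (C, K, E); probabilities sum to 1.
joint = {
    # (C, K, E): probability
    (True,  True,  True):  0.20, (True,  True,  False): 0.05,
    (True,  False, True):  0.10, (True,  False, False): 0.15,
    (False, True,  True):  0.10, (False, True,  False): 0.15,
    (False, False, True):  0.05, (False, False, False): 0.20,
}

def p(pred):
    """Probability of the event picked out by pred over (c, k, e) triples."""
    return sum(pr for (c, k, e), pr in joint.items() if pred(c, k, e))

def p_cond(pred, given):
    """Conditional probability p(pred | given)."""
    return p(lambda c, k, e: pred(c, k, e) and given(c, k, e)) / p(given)

# Check the inequality within each stratum of the confounder K.
for k_val in (True, False):
    p_e_c    = p_cond(lambda c, k, e: e, lambda c, k, e: c and k == k_val)
    p_e_notc = p_cond(lambda c, k, e: e, lambda c, k, e: (not c) and k == k_val)
    print(f"K={k_val}: p(E|C,K)={p_e_c:.2f} > p(E|~C,K)={p_e_notc:.2f}: "
          f"{p_e_c > p_e_notc}")
```

In this toy distribution C raises the probability of E within both strata of K, so the definition counts C as a cause of E; had the inequality held only marginally but reversed within a stratum, conditioning on the confounder would have blocked the causal verdict.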
In this section and elsewhere I speak of evidence that provides support for various kinds of hypotheses. This way of construing the goal of medical research might appear to be misleading, since often the goal of clinical trials is to determine the strength of a causal relation, as estimated by so-called ‘effect sizes’. However, ultimately the goal of medical research is to provide evidence for hypotheses regarding the potential effectiveness of medical interventions. The outcomes of clinical trials, often measured by effect sizes, provide evidence relevant to such hypotheses.
See Vandenbroucke (2008) for a discussion of some of these trade-offs, and for a similar defense of the view that different kinds of hypotheses might require different evidence hierarchies.
For more discussion of the view that different hypothesis types require different kinds of studies, see Borgerson (2008).
Howick appeals to a ‘principle of total evidence’ in defense of his proposal. In fact a principle of total evidence would require one to consider not just high-quality evidence, but all evidence. Leuridan (m.s.) presents a nuanced discussion of the principle of total evidence and the various ways it can be interpreted in the context of medical research.
Mechanistic reasoning is permitted, according to Howick, when the mechanism which is appealed to is ‘not incomplete’. But critics of evidence hierarchies do not suggest appealing to evidence from methods purported to be lower on the evidence hierarchy when that evidence is sketchy. Moreover, it is fine to say that all high-quality evidence should be considered when that evidence is concordant, but hard cases are when plausible evidence from methods on various levels of an evidence hierarchy conflict with each other.
This typology of scales is standard (see, for exposition, Suppes and Zinnes (1962)). Consider the following examples of four different kinds of comparisons:
(i) Beth claims that the food at Kiribati Kuisine is better than at Tahitian Treats
(ii) The temperature inside Kiribati Kuisine is 20 °C while the temperature inside Tahitian Treats is 22 °C
(iii) Kiribati Kuisine has been in business 5 years longer than Tahitian Treats
(iv) Kiribati Kuisine has fewer items on its menu than does Tahitian Treats
The scales of these measurements are, respectively, ordinal (i), cardinal (ii), ratio (iii), and absolute (iv). Any order-preserving (monotonic) transformation of a measure of Beth’s food tastes will preserve the information in (i). Only a positive linear transformation will preserve the information in (ii)—for instance, if we switched to the Kelvin scale. Only multiplication by a positive constant (a similarity transformation) will preserve the information in (iii)—for instance, if we switched to a scale of weeks instead of years—because there is a natural zero point: the date of business inception. Only the identity transformation will preserve the information in (iv): the actual number of items on each menu. From a measure on a cardinal scale—take the one in (ii), for example—we can infer a measure on an ordinal scale—in this example, that it is colder inside Kiribati Kuisine than it is inside Tahitian Treats.
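These transformation facts can be checked mechanically. The following sketch is my own illustration: the business ages (12 and 7 years, preserving the 5-year gap from example (iii)) are invented, and the restaurant names are carried over from the examples above.

```python
# Sketch of which transformations preserve which scale types
# (hypothetical numbers; only the 5-year gap comes from the example).

celsius = {"Kiribati Kuisine": 20.0, "Tahitian Treats": 22.0}  # cardinal scale
years   = {"Kiribati Kuisine": 12.0, "Tahitian Treats": 7.0}   # ratio scale

to_kelvin = lambda t: t + 273.15   # positive linear (affine) transformation
to_weeks  = lambda y: y * 52       # similarity transformation: no additive shift

# A cardinal measure still supports the ordinal comparison after an affine map:
assert (celsius["Kiribati Kuisine"] < celsius["Tahitian Treats"]) == \
       (to_kelvin(celsius["Kiribati Kuisine"]) < to_kelvin(celsius["Tahitian Treats"]))

# A ratio measure preserves ratios under multiplication by a positive constant...
ratio_before = years["Kiribati Kuisine"] / years["Tahitian Treats"]
ratio_after  = to_weeks(years["Kiribati Kuisine"]) / to_weeks(years["Tahitian Treats"])
assert abs(ratio_before - ratio_after) < 1e-9

# ...but an additive shift (harmless on a cardinal scale) destroys the ratio:
shifted = lambda y: y + 5.0
assert ratio_before != shifted(years["Kiribati Kuisine"]) / shifted(years["Tahitian Treats"])
```

The asymmetry in the last two checks is the point of the typology: each richer scale type licenses a narrower class of information-preserving transformations, which is why an ordinal evidence ranking cannot support the arithmetic comparisons that a cardinal or ratio measure would.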
References
Atkins D, Best D, Briss PA, et al., GRADE Working Group (2004) Grading quality of evidence and strength of recommendations. BMJ 328:1490
Bluhm R (2005) From hierarchy to network: a richer view of evidence for evidence-based medicine. Perspect Biol Med 48(4):535–547. doi:10.1353/pbm.2005.0082
Bluhm R (2011) Jeremy Howick: the philosophy of evidence-based medicine. Theor Med Bioeth 32(6):423–427. doi:10.1007/s11017-011-9196-7
Borgerson K (2008) Valuing and evaluating evidence in medicine. PhD dissertation
Borgerson K (2009) Valuing evidence: bias and the evidence hierarchy of evidence-based medicine. Perspect Biol Med 52(2):218–233. doi:10.1353/pbm.0.0086
Cartwright N (1979) Causal laws and effective strategies. Nous 13:419–437
Cartwright N (2007) Are RCTs the gold standard? BioSocieties 2(1):11–20. doi:10.1017/s1745855207005029
Cartwright N (2010) What are randomised controlled trials good for? Philos Stud 147:59–70
Cho MK, Bero LA (1994) Instruments for assessing the quality of drug studies published in the medical literature. JAMA 272(2):101–104
Cook TD, Campbell DT (1979) Quasi-experimentation: design and analysis issues for field settings. Houghton Mifflin, Boston
Department of Clinical Epidemiology and Biostatistics, McMaster University Health Sciences Centre (1981) How to read clinical journals: V: to distinguish useful from useless or even harmful therapy. Can Med Assoc J 124(9):1156–1162
Douglas H (2012) Weighing complex evidence in a democratic society. Kennedy Inst Ethics J 22(2):139–162
Goldenberg MJ (2009) Iconoclast or creed? Objectivism, pragmatism, and the hierarchy of evidence. Perspect Biol Med 52(2):168–187. doi:10.1353/pbm.0.0080
Hadorn DC, Baker D, Hodges JS, Hicks N (1996) Rating the quality of evidence for clinical practice guidelines. J Clin Epidemiol 49:749–754
Hartling L, Bond K, Vandermeer B, Seida J, Dryden DM, Rowe BH (2011) Applying the risk of bias tool in a systematic review of combination long-acting beta-agonists and inhaled corticosteroids for persistent asthma. PLoS One 6(2):e17242. doi:10.1371/journal.pone.0017242
Howick J (2011a) Exposing the vanities—and a qualified defense—of mechanistic reasoning in health care decision making. Philos Sci 78(5):926–940
Howick J (2011b) The philosophy of evidence-based medicine. Wiley, Oxford
Illari PM (2011) Mechanistic evidence: disambiguating the Russo–Williamson thesis. Int Stud Philos Sci 25(2):139–157. doi:10.1080/02698595.2011.574856
Ioannidis JP (2005) Why most published research findings are false. PLoS Med 2(8):e124. doi:10.1371/journal.pmed.0020124
Ioannidis JP (2008) Why most discovered true associations are inflated. [Review]. Epidemiology 19(5):640–648. doi:10.1097/EDE.0b013e31818131e7
Ioannidis JP (2011) An epidemic of false claims. Competition and conflicts of interest distort too many medical findings. Sci Am 304(6):16
Karanicolas PJ, Kunz R, Guyatt GH (2008) Point: evidence-based medicine has a sound scientific base. [Editorial]. Chest 133(5):1067–1071. doi:10.1378/chest.08-0068
Kelly MP, Moore TA (2011) The judgement process in evidence-based medicine and health technology assessment. Soc Theory Health 10(1):1–19
La Caze A (2011) The role of basic science in evidence-based medicine. Biol Philos 26(1):81–98. doi:10.1007/s10539-010-9231-5
Leuridan B, Weber E (2011) The IARC and mechanistic evidence. In: Illari PM, Russo F, Williamson J (eds) Causality in the sciences. Oxford University Press, New York, pp 91–109
Moher D, Pham B, Jones A, Cook DJ, Jadad AR, Moher M, Klassen TP (1998) Does quality of reports of randomised trials affect estimates of intervention efficacy reported in meta-analyses? Lancet 352(9128):609–613. doi:10.1016/s0140-6736(98)01085-x
Petticrew M, Roberts H (2003) Evidence, hierarchies, and typologies: horses for courses. J Epidemiol Community Health 57(7):527–529
Rawlins M (2008) De Testimonio: on the evidence for decisions about the use of therapeutic interventions. Royal College of Physicians, London
Russo F, Williamson J (2007) Interpreting causality in the health sciences. Int Stud Philos Sci 21:157–170
Solomon M (2011) Just a paradigm: evidence-based medicine in epistemological context. Eur J Philos Sci 1(3):451–466. doi:10.1007/s13194-011-0034-6
Stegenga J (2011) Is meta-analysis the platinum standard? Stud Hist Philos Biol Biomed Sci 42:497–507
Stegenga J (forthcoming) Quality of information in clinical research. In: Illari PM, Floridi L (eds) The philosophy of information quality. Springer
Straus SE, Richardson WS, Glasziou PP, Haynes RB (2005) Evidence-based medicine: how to practice and teach, 3rd edn. Elsevier Churchill Livingstone, London
Suppes P, Zinnes JL (1962) Basic measurement theory. Institute for Mathematical Studies in the Social Sciences, Technical Report No. 45
Upshur RE (2005) Looking for rules in a world of exceptions: reflections on evidence-based practice. Perspect Biol Med 48(4):477–489. doi:10.1353/pbm.2005.0098
Vandenbroucke JP (2008) Observational research, randomised trials, and two views of medical science. PLoS Med 5(3):e67. doi:10.1371/journal.pmed.0050067
Wilson MC, Hayward RS, Tunis SR, Bass EB, Guyatt G (1995) Users’ guides to the medical literature. VIII. How to use clinical practice guidelines. B. what are the recommendations and will they help you in caring for your patients? The evidence-based medicine working group. JAMA 274(20):1630–1632. doi:10.1001/jama.1995.03530200066040
Worrall J (2002) What evidence in evidence-based medicine? Philos Sci 69:S316–S330
Acknowledgments
I am grateful to Phyllis Illari, Federica Russo, and two anonymous reviewers for detailed commentary on earlier drafts. Financial support was provided by the Banting Postdoctoral Fellowships program administered by the Social Sciences and Humanities Research Council of Canada.
Stegenga, J. Down with the Hierarchies. Topoi 33, 313–322 (2014). https://doi.org/10.1007/s11245-013-9189-4