Medical nihilism is the view that we should have little confidence in the effectiveness of medical interventions. Jacob Stegenga argues persuasively that this is how we should see modern medicine, and suggests that medical research must be modified, clinical practice should be less aggressive, and regulatory standards should be enhanced.
An astonishing volume and diversity of evidence is available for many hypotheses in the biomedical and social sciences. Some of this evidence, usually from randomized controlled trials (RCTs), is amalgamated by meta-analysis. Despite the ongoing debate regarding whether or not RCTs are the ‘gold-standard’ of evidence, it is usually meta-analysis which is considered the best source of evidence: meta-analysis is thought by many to be the platinum standard of evidence. However, I argue that meta-analysis falls far short of that standard. Different meta-analyses of the same evidence can reach contradictory conclusions. Meta-analysis fails to provide objective grounds for intersubjective assessments of hypotheses because numerous decisions must be made when performing a meta-analysis which allow wide latitude for subjective idiosyncrasies to influence its outcome. I end by suggesting that an older tradition of evidence in medicine, the plurality of reasoning strategies appealed to by the epidemiologist Sir Bradford Hill, is a superior strategy for assessing a large volume and diversity of evidence.
Philosophers have committed sins while studying science, it is said – philosophy of science focused on physics to the detriment of biology, reconstructed idealizations of scientific episodes rather than attending to historical details, and focused on theories and concepts to the detriment of experiments. Recent generations of philosophers of science have tried to atone for these sins, and by the 1980s the exculpation was in full swing. Marcel Weber’s Philosophy of Experimental Biology is a zenith mea culpa for philosophy of science: it carefully describes several historical examples from twentieth century biology to address both ‘old’ philosophical topics, like reductionism, inference, and realism, and ‘new’ topics, like discovery, models, and norms. Biology, experiments, history – at last, philosophy of science, free of sin.
Robustness is a common platitude: hypotheses are better supported with evidence generated by multiple techniques that rely on different background assumptions. Robustness has been put to numerous epistemic tasks, including the demarcation of artifacts from real entities, countering the “experimenter’s regress,” and resolving evidential discordance. Despite the frequency of appeals to robustness, the notion itself has received scant critique. Arguments based on robustness can give incorrect conclusions. More worrying is that although robustness may be valuable in ideal evidential circumstances (i.e., when evidence is concordant), often when a variety of evidence is available from multiple techniques, the evidence is discordant. To contact the author, please write to: Jacob Stegenga, Department of Philosophy, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093.
Medicalisation is a social phenomenon in which conditions that were once under legal, religious, personal or other jurisdictions are brought into the domain of medical authority. Low sexual desire in females has been medicalised, pathologised as a disease, and intervened upon with a range of pharmaceuticals. There are two polarised positions on the medicalisation of low female sexual desire: I call these the mainstream view and the critical view. I assess the central arguments for both positions. Dividing the two positions are opposing models of the aetiology of low female sexual desire. I conclude by suggesting that the balance of arguments supports a modest defence of the critical view regarding the medicalisation of low female sexual desire.
Measuring the effectiveness of medical interventions faces three epistemological challenges: the choice of good measuring instruments, the use of appropriate analytic measures, and the use of a reliable method of extrapolating measures from an experimental context to a more general context. In practice each of these challenges contributes to overestimating the effectiveness of medical interventions. These challenges suggest the need for corrective normative principles. The instruments employed in clinical research should measure patient-relevant and disease-specific parameters, and should not be sensitive to parameters that are only indirectly relevant. Effectiveness should always be measured and reported in absolute terms (using measures such as 'absolute risk reduction'), and only sometimes should effectiveness also be measured and reported in relative terms (using measures such as 'relative risk reduction'); employment of relative measures promotes an informal fallacy akin to the base-rate fallacy, which can be exploited to exaggerate claims of effectiveness. Finally, extrapolating from research settings to clinical settings should more rigorously take into account possible ways in which the intervention in question can fail to be effective in a target population.
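The contrast between absolute and relative measures drawn in the abstract above can be illustrated with a small calculation. The following is a minimal sketch, not taken from the paper: the function name `risk_measures` and both sets of trial counts are hypothetical, chosen only to show how the same relative risk reduction can accompany very different absolute risk reductions.

```python
def risk_measures(control_events, control_n, treated_events, treated_n):
    """Return absolute and relative risk reduction for a two-arm trial.

    All inputs are hypothetical counts: events and participants per arm.
    """
    if control_n <= 0 or treated_n <= 0:
        raise ValueError("arm sizes must be positive")
    control_risk = control_events / control_n
    treated_risk = treated_events / treated_n
    if control_risk == 0:
        raise ValueError("relative risk reduction is undefined when control risk is zero")
    arr = control_risk - treated_risk  # absolute risk reduction
    rrr = arr / control_risk           # relative risk reduction
    return {"control_risk": control_risk, "treated_risk": treated_risk,
            "arr": arr, "rrr": rrr}


# Two made-up trials: a common outcome and a rare outcome.
common = risk_measures(200, 1000, 100, 1000)  # 20% vs 10%
rare = risk_measures(2, 1000, 1, 1000)        # 0.2% vs 0.1%

for label, m in (("common outcome", common), ("rare outcome", rare)):
    print(f"{label}: ARR = {m['arr']:.3f}, RRR = {m['rrr']:.0%}")
```

In this made-up example both trials report the same relative risk reduction of one half, while the absolute risk reductions differ by a factor of one hundred. A reader who sees only the relative figure cannot tell the two cases apart without the baseline rate.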
A platitude that took hold with Kuhn is that there can be several equally good ways of balancing theoretical virtues for theory choice. Okasha recently modelled theory choice using technical apparatus from the domain of social choice: famously, Arrow showed that no method of social choice can jointly satisfy four desiderata, and each of the desiderata in social choice has an analogue in theory choice. Okasha suggested that one can avoid the Arrow analogue for theory choice by employing a strategy used by Sen in social choice, namely, to enhance the information made available to the choice algorithms. I argue here that, despite Okasha’s claims to the contrary, the information-enhancing strategy is not compelling in the domain of theory choice.
Robustness arguments hold that hypotheses are more likely to be true when they are confirmed by diverse kinds of evidence. Robustness arguments require the confirming evidence to be independent. We identify two kinds of independence appealed to in robustness arguments: ontic independence (OI), when the multiple lines of evidence depend on different materials, assumptions, or theories, and probabilistic independence. Many assume that OI is sufficient for a robustness argument to be warranted. However, we argue that, as typically construed, OI is not a sufficient independence condition for warranting robustness arguments. We show that OI evidence can collectively confirm a hypothesis to a lower degree than individual lines of evidence, contrary to the standard assumption undergirding usual robustness arguments. We employ Bayesian networks to represent the ideal empirical scenario for a robustness argument and a variety of ways in which empirical scenarios can fall short of this ideal.
Evidence hierarchies are widely used to assess evidence in systematic reviews of medical studies. I give several arguments against the use of evidence hierarchies. The problems with evidence hierarchies are numerous, and include methodological shortcomings, philosophical problems, and formal constraints. I argue that medical science should not employ evidence hierarchies, including even the latest and most-sophisticated of such hierarchies.
I defend a radical interpretation of biological populations (what I call population pluralism), which holds that there are many ways that a particular grouping of individuals can be related such that the grouping satisfies the conditions necessary for those individuals to evolve together. More constraining accounts of biological populations face empirical counter-examples and conceptual difficulties. One of the most intuitive and frequently employed conditions, causal connectivity (itself beset with numerous difficulties), is best construed by considering the relevant causal relations as ‘thick’ causal concepts. I argue that the fine-grained causal relations that could constitute membership in a biological population are huge in number and many are manifested by degree, and thus we can construe population membership as being defined by massively multidimensional constructs, the differences between which are largely arbitrary. I end by showing that positions in two recent debates in theoretical biology depend on a view of biological populations at odds with the pluralism defended here.
We aim here to outline a theory of evidence for use. More specifically we lay foundations for a guide for the use of evidence in predicting policy effectiveness in situ, a more comprehensive guide than current standard offerings, such as the Maryland rules in criminology, the weight of evidence scheme of the International Agency for Research on Cancer (IARC), or the US ‘What Works Clearinghouse’. The guide itself is meant to be well-grounded but at the same time to give practicable advice, that is, advice that can be used by policy-makers not expert in the natural and social sciences, assuming they are well-intentioned and have a reasonable but limited amount of time and resources available for searching out evidence and deliberating.
Amalgamating evidence of different kinds for the same hypothesis into an overall confirmation is analogous, I argue, to amalgamating individuals’ preferences into a group preference. The latter faces well-known impossibility theorems, most famously “Arrow’s Theorem”. Once the analogy between amalgamating evidence and amalgamating preferences is tight, it is obvious that amalgamating evidence might face a theorem similar to Arrow’s. I prove that this is so, and end by discussing the plausibility of the axioms required for the theorem.
To be effective, a medical intervention must improve one's health by targeting a disease. The concept of disease, though, is controversial. Among the leading accounts of disease (naturalism, normativism, hybridism, and eliminativism), I defend a version of hybridism. A hybrid account of disease holds that for a state to be a disease that state must both (i) have a constitutive causal basis and (ii) cause harm. The dual requirement of hybridism entails that a medical intervention, to be deemed effective, must target either the constitutive causal basis of a disease or the harms caused by the disease (or ideally both). This provides a theoretical underpinning to the two principal aims of medical treatment: care and cure.
I defend a radical interpretation of biological populations (what I call population pluralism), which holds that there are many ways that a particular grouping of individuals can be related such that the grouping satisfies the conditions necessary for those individuals to evolve together. More constraining accounts of biological populations face empirical counter-examples and conceptual difficulties. One of the most intuitive and frequently employed conditions, causal connectivity (itself beset with numerous difficulties), is best construed by considering the relevant causal relations as ‘thick’ causal concepts. I argue that the fine-grained causal relations that could constitute membership in a biological population are huge in number and many are manifested by degree, and thus we can construe population membership as being defined by massively multidimensional constructs, the differences between which are largely arbitrary. I end by showing that positions in two recent debates in theoretical biology depend on a view of biological populations at odds with the pluralism defended here. 1 Introduction; 2 Biological Population, Broad and Narrow; 3 Difficulties with Narrow Biological Population Conditions; 3.1 Against the genealogical condition; 3.2 Against the conspecificity condition; 3.3 Against the proximity condition; 3.4 Against the typology condition; 4 Causal Connectivity; 5 Massively Multidimensional Population Constructs; 6 Population Uniqueness and Natural Selection; 6.1 Statisticalism and its discontents; 6.2 Price at what price?; 7 Conclusion.
Data from medical research are typically summarized with various types of outcome measures. We present three arguments in favor of absolute over relative outcome measures. The first argument is from cognitive bias: relative measures promote the reference class fallacy and the overestimation of treatment effectiveness. The second argument is decision-theoretic: absolute measures are superior to relative measures for making a decision between interventions. The third argument is causal: interpreted as measures of causal strength, absolute measures satisfy a set of desirable properties, but relative measures do not. Absolute outcome measures outperform relative measures on all counts.
Consensus conferences are social techniques which involve bringing together a group of scientific experts, and sometimes also non-experts, in order to increase the public role in science and related policy, to amalgamate diverse and often contradictory evidence for a hypothesis of interest, and to achieve scientific consensus or at least the appearance of consensus among scientists. For consensus conferences that set out to amalgamate evidence, I propose three desiderata: Inclusivity, Constraint, and Evidential Complexity. Two examples suggest that consensus conferences can readily satisfy Inclusivity and Evidential Complexity, but consensus conferences do not as easily satisfy Constraint. I end by discussing the relation between social inclusivity and the three desiderata.
Harms of medical interventions are systematically underestimated in clinical research. Numerous factors (conceptual, methodological, and social) contribute to this underestimation. I articulate the depth of such underestimation by describing these factors at the various stages of clinical research. Before any evidence is gathered, the ways harms are operationalized in clinical research contribute to their underestimation. Medical interventions are first tested in phase 1 ‘first in human’ trials, but evidence from these trials is rarely published, despite the fact that such trials provide the foundation for assessing the harm profile of medical interventions. If a medical intervention is deemed safe in a phase 1 trial, it is tested in larger phase 2 and 3 clinical trials. One way to think about the problem of underestimating harms is in terms of the statistical ‘power’ of a clinical trial: the ability of a trial to detect a difference of a certain effect size between the experimental group and the control group. Power is normally thought to be pertinent to detecting benefits of medical interventions. It is important, though, to distinguish between the ability of a trial to detect benefits and the ability of a trial to detect harms. I refer to the former as power-B and the latter as power-H. I identify several factors that maximize power-B by sacrificing power-H in phase 3 clinical trials. If a medical intervention is approved for general use, it is evaluated by phase 4 post-market surveillance. Phase 4 surveillance of harms further contributes to underestimating the harm profile of medical interventions. At every stage of clinical research the hunt for harms is shrouded in secrecy, which further contributes to the underestimation of the harm profiles of medical interventions.
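The distinction between power-B and power-H in the abstract above can be made concrete with a rough calculation. The sketch below is an illustration only, not the author's analysis: the function `approx_power`, the event rates, and the arm size are all hypothetical, and the power formula is the usual normal approximation for a two-sided comparison of two proportions.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(p_control, p_treated, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two proportions.

    Uses the normal approximation with unpooled variance and ignores the
    (usually negligible) chance of rejecting in the wrong direction.
    """
    std_normal = NormalDist()
    se = sqrt(p_control * (1 - p_control) / n_per_arm
              + p_treated * (1 - p_treated) / n_per_arm)
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    return std_normal.cdf(abs(p_control - p_treated) / se - z_crit)


# Hypothetical trial with 500 participants per arm.
# Benefit: response rate rises from 20% to 30%.
power_b = approx_power(0.20, 0.30, 500)
# Harm: a rare adverse event rises from 0.5% to 1.0%.
power_h = approx_power(0.005, 0.010, 500)

print(f"power-B (benefit): {power_b:.2f}")
print(f"power-H (harm):    {power_h:.2f}")
```

Under these assumed numbers, a trial sized to detect the benefit comfortably is very unlikely to detect a doubling of the rare harm. This is only one of the ways a trial's ability to detect harms can fall short of its ability to detect benefits.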
The purpose of this chapter is to describe what we see as several important new directions for philosophy of medicine. This recent work (i) takes existing discussions in important and promising new directions, (ii) identifies areas that have not received sufficient and deserved attention to date, and/or (iii) brings together philosophy of medicine with other areas of philosophy (including bioethics, philosophy of psychiatry, and social epistemology). To this end, the next part focuses on what we call the “epistemological turn” in recent work in the philosophy of medicine; the third part addresses new developments in medical research that raise interesting questions for philosophy of medicine; the fourth part is a discussion of philosophical issues within the practice of diagnosis; the fifth part focuses on the recent developments in psychiatric classification and scientific and ethical issues therein, and the final part focuses on the objectivity of medical research.
Medical scientists employ ‘quality assessment tools’ (QATs) to measure the quality of evidence from clinical studies, especially randomized controlled trials (RCTs). These tools are designed to take into account various methodological details of clinical studies, including randomization, blinding, and other features of studies deemed relevant to minimizing bias and error. There are now dozens available. The various QATs on offer differ widely from each other, and second-order empirical studies show that QATs have low inter-rater reliability and low inter-tool reliability. This is an instance of a more general problem I call the underdetermination of evidential significance. Disagreements about the strength of a particular piece of evidence can be due to different (but in principle equally good) weightings of the fine-grained methodological features which constitute QATs.
Reasons transmit. If one has a reason to attain an end, then one has a reason to effect means for that end: reasons are transmitted from end to means. I argue that the likelihood ratio (LR) is a compelling measure of reason transmission from ends to means. The LR measure is superior to other measures, can be used to construct a condition specifying precisely when reasons transmit, and satisfies intuitions regarding end-means reason transmission in a broad array of cases.
Millstein (2009) argues against conceptual pluralism with respect to the definition of “population,” and proposes her own definition of the term. I challenge both Millstein's negative arguments against conceptual pluralism and her positive proposal for a singular definition of population. The concept of population, I argue, does not refer to a natural kind; populations are constructs of biologists variably defined by contexts of inquiry.
We provide a novel articulation of the epistemic peril of p-hacking using three resources from philosophy: predictivism, Bayesian confirmation theory, and model selection theory. We defend a nuanced position on p-hacking: p-hacking is sometimes, but not always, epistemically pernicious. Our argument requires a novel understanding of Bayesianism, since a standard criticism of Bayesian confirmation theory is that it cannot represent the influence of biased methods. We then turn to pre-analysis plans, a methodological device used to mitigate p-hacking. Some say that pre-analysis plans are epistemically meritorious while others deny this, and in practice pre-analysis plans are often violated. We resolve this debate with a modest defence of pre-analysis plans. Further, we argue that pre-analysis plans can be epistemically relevant even if the plan is not strictly followed, and suggest that allowing for flexible pre-analysis plans may be the best available policy option.
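One familiar way to see why p-hacking is a worry, as discussed in the abstract above, is to compute how often at least one test comes out "significant" by chance when several analyses are tried. The sketch below is a textbook illustration and not the authors' Bayesian argument: the function name `chance_of_false_positive` is hypothetical, and it assumes the analyses are statistically independent and that every null hypothesis is true.

```python
def chance_of_false_positive(alpha, n_tests):
    """Probability that at least one of n independent tests rejects a true
    null hypothesis, when each test uses significance level alpha."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return 1 - (1 - alpha) ** n_tests


# How the chance of a spurious 'finding' grows with the number of analyses tried.
for n in (1, 5, 20):
    rate = chance_of_false_positive(0.05, n)
    print(f"{n:2d} analyses: chance of at least one false positive = {rate:.2f}")
```

With a conventional threshold of 0.05, trying twenty independent analyses makes a spurious result more likely than not. Real analyses are rarely independent, so the figure is an upper-end illustration and should not be read as an estimate for any actual study.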
The chemical characterization of the substance responsible for the phenomenon of “transformation” of pneumococci was presented in the now famous 1944 paper by Avery, MacLeod, and McCarty. Reception of this work was mixed. Although interpreting their results as evidence that deoxyribonucleic acid (DNA) is the molecule responsible for genetic changes was, at the time, controversial, this paper has been retrospectively celebrated as providing such evidence. The mixed and changing assessment of the evidence presented in the paper was due to the work’s interpretive flexibility – the evidence was interpreted in various ways, and such interpretations were justified given the neophytic state of molecular biology and methodological limitations of Avery’s transformation studies. I argue that the changing context in which the evidence presented by Avery’s group was interpreted partly explains the vicissitudes of the assessments of the evidence. Two less compelling explanations of the reception are a myth-making account and an appeal to the wartime historical context of its publication.
The standard view about sex differences in sexual desire is that males are lusty and loose, while females are cool and coy. This is widely believed and is a core premise of some scientific programs like evolutionary psychology. But is it true? A mountain of evidence seems to support the standard view. Yet, this evidence is shot through with methodological and philosophical problems. Developments in the study of sexual desire suggest that some of these problems can be resolved, and when they are, the standard view looks, at best, to be an exaggeration.
It is a plausible speculation that conventional choices in outcome measures might influence the results of meta-analyses. We test that speculation by simulating data from trials on antidepressants. We vary real drug effectiveness while modulating conventional values for outcome measures. We had previously shown that one conventional choice used in meta-analyses of antidepressants falls in a narrow range of values that maximise estimates of effectiveness. Our present analysis investigates why this phenomenon occurs. Moreover, our results suggest the superiority of absolute outcome measures over relative measures. This research program can be extended to test numerous other aspects of clinical research.
I describe two traditions of philosophical accounts of evidence: one characterizes the notion in terms of signs of success, the other characterizes the notion in terms of conditions of success. The best examples of the former rely on the probability calculus, and have the virtues of generality and theoretical simplicity. The best examples of the latter describe the features of evidence which scientists appeal to in practice, which include general features of methods, such as quality and relevance, and general features of evidence, such as patterns in data, concordance with other evidence, and believability of the evidence. Two infamous episodes from biomedical research help to illustrate these features. Philosophical characterization of these latter features (conditions of success) has the virtue of potential relevance to, and descriptive accuracy of, practices of experimental scientists.