Inquiry, Evidence, and Experiment: The “Experimenter’s Regress” Dissolved

Matthew J. Brown
September 10, 2008

Abstract

Contemporary ways of understanding science, especially in the philosophy of science, are beset by overly abstract and formal models of evidence. In such models, the only interesting feature of evidence is that it has a one-way “support” relation to hypotheses, theories, causal claims, etc. These models create a variety of practical and philosophical problems, one prominent example being the experimenter’s regress. According to the experimenter’s regress, good evidence is produced by good techniques, but which techniques are good is only determined by whether they produce the evidence we expect. The best answer to this problem within the traditional approach relies on the concept of robust evidence, but this answer ultimately falls flat because it creates impossible requirements on good evidence. The problem can more easily be solved by rejecting abstract, formalistic models of evidence in favor of a model of inquiry which pays attention to the temporal complexity of the process of inquiry and the distinction between observational and experimental evidence.

1 Introduction

Several problems in the contemporary discussions of evidence arise only because of inattention to the complexity of the process of inquiry and the uses of evidence. While inattention to these matters seems endemic amongst philosophers, the problem has also come to infect certain of the social and medical sciences as well as policy-making. The source of the problem is the use of idealized models of inquiry and evidence, in which evidence is characterized by a formal relation of support it has to a hypothesis or theory. In such models, evidence is useful for this one task, and the task can be characterized without recourse to factors of the particular context or temporal development of the inquiry in which evidence arises and is used.
This essay will show how a model of the temporal dynamics of inquiry and the functional role of evidence better addresses the problem known as the “experimenter’s regress,” popularized by H.M. Collins (1975, [1985] 1992). According to this model, there are different kinds of evidence, playing different roles in an inquiry, and one major distinction is that between observational and experimental evidence. Along the way, I will argue that the appeals to the robustness of evidence (e.g., Culp 1995) in an attempt to solve the experimenter’s regress are bound to fail, and that the value of robustness itself is better understood according to the inquiry-model than by traditional approaches.

2 Evidence and the Model of Inquiry

The temporal dynamics of inquiry have received scant attention in philosophy of science. There are only two major models which develop the temporal complexity of inquiry. One has been developed by Kuhn and his followers and critics. Kuhn’s model discusses the career of large-scale theories or research paradigms that govern entire areas over a large span of time. These models are sufficiently large-scale and long-term that they are not useful for addressing current concerns about the nature of evidence. Problems like the experimenter’s regress deal neither with the evolution of theories over the long run nor with the revolutionary replacement of theories or paradigms. The questions at issue are far more local.

The other major model of inquiry on offer is one introduced by C.S. Peirce and articulated by John Dewey. This model works best at the more local level of particular scientific inquiries, though it has some applications at the larger scale. I will not discuss potential conflicts or compatibilities between the two models, nor will I attempt a full defense of this model here,¹ though I will attempt to make it as plausible as possible and provide an illuminating example.

¹ I have done so elsewhere (Brown 2009, in preparation).
In the main outlines, the pragmatist model of the dynamics of inquiry has the following interlocking phases:

1. Inquiry Begins with a Felt Perplexity. There are many types of perplexity, but they are not in general a mere state of ignorance on the part of the inquirer. Rather, the state of the science, including theories and techniques, is discoordinated or indeterminate, and this is reflected in the feeling of perplexity on the part of the inquirer(s). There are conflicting tendencies within the situation of the field at the present time, requiring investigation. (Contrast this with the smooth application of a theory to a situation with immediate success.)

2. The Institution of a Problem. The situation must be assessed in order to formulate a problem-statement that adequately captures the felt perplexity. Operations of observation must take place in order to arrive at a statement of the problem, which evolves as the inquiry develops.

3. The Determination of a Problem-Solution. The first pass at observation and problem-formulation suggests hypotheses for solving the problem. The hypothesis is usually but not always generated from a larger field of background theory, which may require special development in the specific context.

4. The Coordination of Observations and Hypotheses. A reciprocal process of coordination and improvement of observed facts and theoretical-hypothetical ideas is undertaken, in which hypotheses are developed or eliminated, new more refined observations are made, and the problem-statement is refined.

5. The Necessity of Experiment. A series of tentative, experimental applications of the hypotheses are made in order to estimate their efficacy as problem-solutions. Earlier experiments can suggest more refined experiments, or the necessity of further articulating data and hypothesis, or the need to “go back to the drawing board.” Solving any problem requires operations of making and doing.
Ideas cannot be tested by a purely passive collection of facts through observation. Active interventions of an experimental nature provide necessary evidence about the prospective effectiveness of solutions.

6. Judgment. Inquiry continues until a hypothesis is adjudged to resolve the problem, while the alternatives have been ruled out, and the conclusion can be used as a reliable means to further inquiries.

This is obviously an idealized picture of the conduct of inquiry, but it is informed by Peirce’s and Dewey’s participation in and studies of the actual practice of science. It is a normative-explanatory model, attempting to capture and explain the lessons of successful inquiries past.

It will be helpful to use a concrete illustration to clarify this abstract model. Consider the work of John Snow on the transmission of the disease cholera.²

3 Snow on Cholera

The basic outlines of the problematic situation are clear: cholera is a terrible disease, fatal in nearly all cases in Snow’s time. It is tempting to say that the problem itself is clear from the beginning: how is cholera communicated, and how can its transmission be prevented or contained? While the idea of contagious diseases was not new in the middle of the nineteenth century, when Snow was at work on cholera, it was neither fully accepted nor clearly distinguished from views identifying disease as a punishment for sin. To regard some diseases as communicable, and to identify cholera in this way, is already to be well on into the inquiry. Cholera tended to be concentrated amongst the poor, and almost never infected the doctors who tended to the sick. This was taken as evidence that the disease was “a just punishment for the undeserving and vicious classes of society” (26). To regard the problem as one fixed prior to inquiry would be to take as fixed from the beginning many things that were at first unsettled.
Snow begins by collecting a variety of general facts, such as the beginnings of the disease in India and the spread via human interaction (29). He then moves to more specific cases (30–1). From the start, the evidence clearly suggests that the disease is communicable. But it doesn’t fit well with the more popular “effluvia” theory of transmission of disease through the air, since spending time in the company of the sick doesn’t necessarily lead to infection (31). Rather, a particular pattern of behavior (tending to the patient in intimate fashion) and pathology (beginning with intestinal symptoms) suggest another hypothesis: The disease spreads by some infected matter from a cholera patient being accidentally ingested in sufficient quantity (33).

² My discussion here is taken from Goldstein and Goldstein (1978, pp. 25–62) who draw heavily on Snow’s own manuscripts. Parenthetical references are to their discussion.

This hypothesis suggests some further observations. If it is valid, you’ll find that certain people who come near to the patient do not get cholera (as we’ve seen), and that they avoided it by way of habits of cleanliness that would prevent them from accidentally ingesting any choleraic evacuations. Indeed, this is clearly the case with doctors, who do not generally contract cholera from their patients (33). Reasoning through the implications of the hypothesis, we can see that there are several reasons that people of different social classes would have different levels of risk of contracting the disease based on different living conditions and behavior around the sick (33–34).

One observation raises a puzzle, however. Cholera does sometimes spread to the rich despite the absence of the vectors of direct communication present in the case of the poor. Snow did not take this to invalidate the hypothesis, however.
Rather, he supposed a further specification of the hypothesis in these cases that would provide the appropriate kind of transmission vector: cholera can spread through the water supply (35). Further cases support this hypothesis.

Snow did not stop once he had worked out the implications of the hypothesis and found corresponding facts. The next phase requires experimental application of the hypothesis to real situations in order to test its adequacy. Experimental application is not just a special way of generating further observations. Certainly, techniques of observation are part of the experiment, but the function is nonetheless very different. The functions of observation are to fix the conditions of the problematic situation and the terms of the problem, as well as to suggest and refine hypotheses. The function of an experiment is to put the hypothesis into practice, in a limited and controlled fashion, in order to determine its efficacy in solving the problem.

Snow engaged in at least two experiments, neither of which is entirely satisfactory from the point of view of our model, though the details here are not so important to the main point. The final part of Snow’s monograph on cholera is the most crucial, from the point of view of our model. In the last section, Snow provides a list of twelve recommendations for how to prevent the spread of cholera, based on his two hypotheses, plus some further reasoning about possible cases. For example:

    1st. The strictest cleanliness should be observed by those about the sick. . .
    3rd. Care should be taken that the water employed for drinking and preparing food. . . is not contaminated with the contents of cesspools, house-drains, or sewers. . .
    11th. To inculcate habits of personal and domestic cleanliness among the people everywhere. . .

Such recommendations are crucial to the eventual acceptance of Snow’s explanation.
No amount of convincing argument provided by a scientific manuscript can be the ultimate measure of a scientific judgment’s warranted assertibility. What matters is that others take the results to be so settled as to provide a steady resource for further inquiry and that future applications, such as the ones suggested by Snow in this final section, are successful; these are the “decisive experiments” in favor of Snow’s view.

4 The Experimenter’s Regress

Many people regard the impact of theory on evidence as having problematic consequences. For example, Robert Hudson (2000) believes that if we cannot make room in our epistemology for direct perception, unmediated by theory or concepts, then we can never escape the “hermeneutic circle” and find some independent ground for our knowledge-claims unsullied by the question at issue. Sylvia Culp (1995), in a similar vein, worries about and attempts to solve the problem of the “experimenter’s regress” raised by H.M. Collins (1975, [1985] 1992). According to Collins, good data is regarded as the product of a good experimental technique, but the test of an experimental technique is just whether it produces the expected data. The same worry can be raised about the need to interpret “raw data” before it can become “data” or “facts” (Culp 1995, p. 439).

Something like the following picture, suggested by Culp, is surely right: what happens in the lab prior to interpretation is merely a brute happening, and brute happenings are not themselves evidence. We must then interpret those happenings, take them up as a certain item of fact, and, metaphorically speaking, teach them to speak the language of the theory, in order to see how they bear on the theory. This interpretation is never independent of theory, neither the theory of how the apparatus works nor the theory in question.
All of this presents a problem, according to Collins and Culp, because we are left wondering how interpretations of experiments that themselves presuppose controversial theories, including parts of the theory in question, can serve as solid ground to support those theories.

From the point of view of the inquiry-model, several crucial parts of the story have been left out. For one, it mentions only one direction on the two-way street of the coordination of factual and conceptual materials. Contra Culp’s supposition, we don’t only teach evidence to “speak the language” of theory. We also teach the theory to speak the language of observation; that is, we must develop our hypotheses so that they have operational consequences, so that they may direct activities of observation, and so that experiments may be created that apply the hypothesis as a solution to the problem. Collins’ and Culp’s shared way of setting up the problem presupposes that theory is inert, and experiment must be constructed or interpreted in a way that meets it. But theory and experiment must meet in the middle.

Further, both parties to the debate construe the function of evidence extremely narrowly, collapsing the distinction between observational and experimental evidence. Evidence is taken to be exhausted by its function of supporting a theory, but this is a relatively minor function of evidence within the course of inquiry. Observation helps fix the problem, suggests hypotheses for its solution, and helps improve those hypotheses. Experiments put hypotheses to work in tentative application, trying them out as solutions to a problem. It is undeniable that in some sense, theories “produce” their own evidence, but this is only a problem if evidence only serves to justify theory, and theory is only justified by that body of evidence it produces.
To the contrary, producing (not predicting) some events is the point of a theory; it is the adequacy of the consequences produced to solving the problem at hand, along with its usefulness in attacking new problems and supplementing new inquiries, that is the ultimate test of the theory.

A key to the problem of the experimenter’s regress is the issue of calibration.³ Early attempts to detect or measure some previously unobserved or unquantified phenomenon are faced with a problem of how to calibrate the technique, lacking any other techniques to check against. We have only theoretical expectations about what the phenomenon should be like to guide us.⁴ Later attempts are faced with the problem that their calibration depends on previous measurements which themselves were not calibrated in a standard way. In both cases there is a troublesome regress; in the latter case, the circle of data and technique is simply pushed back to earlier stages.

³ See section I.B.1 of Franklin 2007.
⁴ Hasok Chang’s work on temperature (2004) explicitly addresses the way that basic expectations guide this process.

But the question we should ask is, “What is this experimental evidence for?” Under the impoverished model of theory-evidence relationships that regards the sole role for evidence to be either adding or removing support from a hypothesis (in context-free fashion), the experimenter’s regress is a serious concern. If evidence lacks independent plausibility, it cannot stand as support in the way this simple model would hope.

Godin and Gingras (2002) have suggested that the “experimenter’s regress” amounts to just the classical problems of skepticism, and thus that we should get around it in the same way that we get around skeptical worries in epistemology generally. This answer will not do, however, as the problem is internal to the traditional model, once the facts of theory-dependence in evidence are accepted. One must either elaborate or replace this model to avoid it.
Sylvia Culp’s (1995) alternative solution to the problem posed by the experimenter’s regress is to appeal to the robustness of evidence. We need not have full independence of evidence from our expectations. Rather, what we need is evidence from a variety of different kinds of sources that are independent from each other, whose interpretation relies on the theory in question in quite different ways (if at all), and that all support the same conclusion. Evidence from a single source that seems to support the conclusion but only does so due to being calibrated that way would be problematically circular. A variety of different types of evidence, developed independently from each other, which all seem to support the conclusion but in fact are just the product of our expectations, so the argument goes, would be a miracle. A far better explanation is just the truth of the hypothesis.

The strategy is an appealing one. While no single thread can do the job, a rope woven in the right way can be strong enough. In Culp’s argument, she fully admits that no particular bit of evidence can be theory-free, that it doesn’t even make sense to talk of uninterpreted, bare “happenings” as evidence. Nonetheless, since she is committed to the metaphor of support, she attempts to find an arrangement of evidence that can strongly support our hypotheses. According to her argument, a set of evidence can be a foundation for theoretical knowledge if it is robust, that is, if it comes from a variety of sources that are theoretically independent of each other.

This argument fails to meet the challenge posed by the experimenter’s regress, however. At least three difficulties arise, one empirical and two epistemological.⁵ The first is the difficulty of finding really independent sources of evidence. The history of the development of experimental techniques is replete with a variety of cross-calibration techniques.
Chang’s (2004) discussion of the development of the modern thermometer shows the complex interdependencies of various new techniques for measuring temperature. Early errors propagate into later techniques and take a long time to disappear entirely, as in the case of measurements of the charge of the electron (Feynman [1974] 1999), because of the preponderance of cross-calibration. True independence may be difficult to determine.

The second problem, which springs from the first, is that robustness doesn’t really solve the problem of calibration. For any particular measurement technique, there are two cases: either it is calibrated according to existing techniques, or it isn’t. In the former case, the possibility of independent techniques of measurement is seriously endangered. Furthermore, the question of how those pre-existing techniques were themselves calibrated must be examined. In the latter case, it would appear that all we have to go on to judge the results provided by the technique is the very expectations we hope to support. A variety of different types of evidence, all calibrated by reference to the same set of expectations, also lacks the independence required by the argument.

On the other hand, it may be that the different types of measurement, though originally calibrated in a suspect way, are calibrated with respect to different, independent sets of expectations.⁶ While problematic in those original circumstances, in a present case, they may be sufficiently independent from one another to provide robust, adequate evidence in the case at hand. Even supposing that this case passes the empirical test of independence discussed above, a larger question about whether we ought to rely on the evidence remains.
Perhaps we ought to regard it as a miracle that a variety of such evidence purportedly supports a single conclusion, but why should we think that the truth of that conclusion explains the apparent miracle, given the story of evidence now on offer? A variety of methods, calibrated under highly suspicious circumstances, apparently providing no real support in the case of their original development, now all happen to agree on one conclusion. Do we have any reason to believe that this coincidence has anything to do with the truth of the conclusion? Not without some prior reason to think that the methods, taken individually, track the truth in even a modestly reliable fashion. But it is precisely the lack of such a reason in the case of individual techniques that leads to the demand for robustness in the first place.

⁵ Compare to Jacob Stegenga’s “three easy problems” for robustness (unpublished). The first (empirical) problem is especially close to Stegenga’s discussion.
⁶ Though this seems unlikely in the light of Chang’s discussion of the underlying expectations that inform the development of measurement techniques.

A final problem arises for the attempt to solve this problem through the appeal to robustness. As mentioned before, in order to have truly independent sources of evidence, it is crucial that the measurement techniques not be calibrated to one another, lest the bias in one creep into the other. The sources must be multi-modal, and they must be incommensurable, in the sense of not having any inter-modal standard of comparison (otherwise, they are probably calibrated to one another). If they are incommensurable in this way, however, we’re left with a major worry: if we have no standard of comparison between the types of evidence, how can we say determinately that they support the same conclusion?
If the interpretive framework at hand is the theory in question, of course, then it is easy to see how different pieces of evidence support the same conclusion. But if all the evidence can be interpreted by the theory in such a way as to allow cross-modal comparisons, it isn’t really independent in the way that Culp demands. Suppose, then, that the sources of evidence are all independent from one another in the strong sense. How do you determine the relevance of each to your hypothesis?⁷ Evidence that meets the requirements of robustness, understood in the way it must be in order to solve the problem at hand, may be sufficiently incongruous that it would be difficult to make even qualitative comparisons. And in the common case where there is some discordance between different types of evidence, the necessary lack of an inter-modal standard of comparison prevents us from knowing how to resolve the conflict.⁸

⁷ The problem of relevance is raised by Nancy Cartwright in her discussions of evidence-based policy (2009) and discussed by Stegenga (unpublished).
⁸ The terms “incongruity” and “discordance” and the associated problems are raised in the context of robustness by Stegenga (unpublished).

The inquiry-model of evidence provides a very different answer to the question of the purpose of evidence. Evidence has a variety of functional roles within an inquiry, the main goal of which is the resolution of the perplexity which spurred the inquiry. In general, then, the experimenter’s regress will not present any difficulty, since what matters is that the evidence fulfill its role well enough for the purposes of solving whatever problem presents itself. So long as we find a way to combat the disease and increase the life and vitality of people, it doesn’t matter that the experimental techniques have a variety of dependencies on the experimenter’s expectations.
Since experiment is not merely a procedure for producing neutral evidence, but rather a way of making and doing that puts the hypothesis into practice, there is a test of the experimental evidence, together with the hypothesis, that is independent of expectations per se. Expectation cannot prevent a bridge from falling down, nor can it cure disease, nor can it even make quantum mechanics compatible with general relativity.

5 The Value of Robustness

In attempting to respond to the experimenter’s regress and related problems, the defenders of the value of robustness have created an insoluble dilemma. On such accounts, robust evidence must achieve independence from foreground expectations and background theory by drawing on sources so independent from each other that the potentially infecting theories, assumptions, and expectations “cancel out,” so that it would be a “miracle” if such diverse techniques all pointed to the same conclusion. But in order to achieve independence, the members of the set of evidence must end up being mutually incommensurable, because commensurability requires the shared background and mutual calibration that endangers the needed independence. However, the incommensurability of evidence brings with it the problems of incongruity and discordance, which threaten the very possibility of determining how the evidence bears on a hypothesis.

Not only is robustness unnecessary to solve the problem of the experimenter’s regress, which disappears when we move from an impoverished model of evidence to the inquiry-model, but “robustness” as we’ve been forced to define it is actually an impossible requirement. If evidence cannot be integrated, then inquiry cannot move towards resolution. As a result, it may seem that robustness has no place as a scientific norm. This is an unacceptable conclusion, given the apparent obviousness of its epistemic value and its unanimous support amongst scientists.
But it isn’t the value of robustness per se that has been challenged in this chapter. Rather, it is the particular way of understanding robustness that Culp and others are forced into. Robustness, as it figures in the methodological platitudes which the defenders of robustness cite, is merely the recommendation to seek evidence of several types from different sources. The further requirement of complete independence is forced by the purposes that Culp puts robustness to.

If we relax these impossible restrictions on robust evidence, the value of robustness becomes clearer. A set of evidence that draws on many different kinds of physical processes, and that does not depend on controversial hypotheses unnecessary to the hypothesis in question or to the materials needed to integrate the evidence, has the obvious value that we would expect. When robustness is not asked to do an impossible job, it is no longer plagued by irresolvable difficulties.

6 Conclusion

In closing, I would like to emphasize the variety of roles that evidence plays in the course of an inquiry. In many accounts, evidence is mono-functional: all evidence serves as a test of a theory/hypothesis, and it confirms or disconfirms, and there is no interesting difference between evidence garnered by observation versus that gotten by experimentation. In the model of inquiry I’ve been discussing, however, evidence serves many purposes. Observational evidence helps locate the problem; it provides information about fixed conditions; it guides speculation and hypothesis-formation; it helps us eliminate or improve our original hypotheses. Experimental evidence also serves as a tentative application of a developed hypothesis to check its consequences for future action and inference. In every case, it is not some abstract or formal relation between the evidence and the hypothesis by which the evidence serves to justify the hypothesis.
It is rather a very concrete process of transforming a perplexity into a resolution that evidence serves, and which ultimately justifies any final judgment of the inquiry. The formal or symbolic features of evidence are only one small part. This complex, contextual, and pluralistic model of the role of evidence in inquiry can serve to resolve a variety of problems, as we’ve seen in the case of the experimenter’s regress.

References

[1] Brown, M.J., “Models and Perspectives on Stage,” forthcoming in Studies in the History and Philosophy of Science A, Spring 2009.
[2] Brown, M.J., “Scientific Significance and Genuine Problems,” in preparation.
[3] Brown, M.J., Science and Experience: John Dewey’s Philosophy of Science, dissertation manuscript in preparation.
[4] Cartwright, N.D., “Evidence-Based Policy: What’s To Be Done About Relevance,” forthcoming in Philosophical Models, Methods, and Evidence: Topics in the Philosophy of Science. Proceedings of the Thirty-Eighth Oberlin Colloquium in Philosophy. Special issue of Philosophical Studies, Spring 2009.
[5] Chang, H., Inventing Temperature: Measurement and Scientific Progress. New York: Oxford University Press, 2004.
[6] Collins, H.M. (1975), “The Seven Sexes: A Study in the Sociology of a Phenomenon, or The Replication of Experiments in Physics,” Sociology, 9, 2, 205–224.
[7] Collins, H.M. ([1985] 1992), Changing Order: Replication and Induction in Scientific Practice, Chicago: University of Chicago Press.
[8] Culp, S., “Objectivity in Experimental Inquiry: Breaking Data-Technique Circles,” Philosophy of Science, Vol. 62, No. 3 (Sep., 1995), pp. 438–458.
[9] Feynman, R.P., “Cargo cult science: some remarks on science, pseudoscience, and learning how to not fool yourself.” In: Feynman, R.P. and Robbins, J., The Pleasure of Finding Things Out. Cambridge, Mass.: Perseus Books, 1999, pp. 205–16.
[10] Franklin, A., “Experiment in Physics,” The Stanford Encyclopedia of Philosophy (Fall 2007 Edition), Edward N.
Zalta (ed.), URL = http://plato.stanford.edu/archives/fall2007/entries/physics-experiment/.
[11] Godin, B. and Y. Gingras, “The experimenters’ regress: from skepticism to argumentation,” Studies in History and Philosophy of Science A, Volume 33, Issue 1, March 2002, pp. 133–148.
[12] Goldstein, M. and I.F. Goldstein (1980), How We Know: An Exploration of the Scientific Process, Da Capo Press.
[13] Hudson, R.G. (2000), “Perceiving Empirical Objects Directly,” Erkenntnis, Volume 52, Number 3, May 2000.
[14] Stegenga, J., “Robustness, Discordance, and Relevance,” Hadden Prize Essay, Canadian Society for the History and Philosophy of Science, currently unpublished.