Behavioristic, evidentialist, and learning models of statistical testing
Philosophy of Science 52 (4):493-516 (1985)
Abstract: While orthodox (Neyman-Pearson) statistical tests enjoy widespread use in science, the philosophical controversy over their appropriateness for obtaining scientific knowledge remains unresolved. I shall suggest an explanation and a resolution of this controversy. The source of the controversy, I argue, is that orthodox tests are typically interpreted as rules for making optimal decisions as to how to behave, where optimality is measured by the frequency of errors the test would commit in a long series of trials. Most philosophers of statistics, however, view the task of statistical methods as providing appropriate measures of the evidential strength that data afford hypotheses. Since tests appropriate for the behavioral-decision task fail to provide measures of evidential strength, philosophers of statistics claim the use of orthodox tests in science is misleading and unjustified. What critics of orthodox tests overlook, I argue, is that the primary function of statistical tests in science is neither to decide how to behave nor to assign measures of evidential strength to hypotheses. Rather, tests provide a tool for using incomplete data to learn about the process that generated them. This they do, I show, by providing a standard for distinguishing differences (between observed and hypothesized results) due to accidental or trivial errors from those due to systematic or substantively important discrepancies. I propose a reinterpretation of a commonly used orthodox test to make this learning model of tests explicit.
Deborah G. Mayo (1992). Did Pearson Reject the Neyman-Pearson Philosophy of Statistics? Synthese 90 (2):233-262.
Peter Godfrey-Smith (1994). Of Nulls and Norms. PSA: Proceedings of the Biennial Meeting of the Philosophy of Science Association 1994:280-290.
Patrick Suppes (2007). Statistical Concepts in Philosophy of Science. Synthese 154 (3):485-496.
Deborah G. Mayo (1983). An Objective Theory of Statistical Testing. Synthese 57 (3):297-340.
Deborah G. Mayo (1997). Error Statistics and Learning From Error: Making a Virtue of Necessity. Philosophy of Science 64 (4):212.
Max Albert (1992). Die Falsifikation Statistischer Hypothesen. Journal for General Philosophy of Science 23 (1):1-32.
Andrés Rivadulla (1991). Mathematical Statistics and Metastatistical Analysis. Erkenntnis 34 (2):211-236.
Deborah G. Mayo (1991). Novel Evidence and Severe Tests. Philosophy of Science 58 (4):523-552.
J. D. Trout (1994). Austere Realism and the Worldly Assumptions of Inferential Statistics. PSA: Proceedings of the Biennial Meeting of the Philosophy of Science Association 1994:190-199.
Deborah G. Mayo & Aris Spanos (2006). Severe Testing as a Basic Concept in a Neyman–Pearson Philosophy of Induction. British Journal for the Philosophy of Science 57 (2):323-357.

