Chapter 7
Towards a Grammar of Bayesian Confirmation

Vincenzo Crupi, Roberto Festa, and Carlo Buttasi

7.1 Introduction

A long-standing tradition in epistemology and the philosophy of science sees the notion of confirmation as a fundamental relationship between a piece of evidence $E$ and a hypothesis $H$. A number of philosophical accounts of confirmation, moreover, have been cast or at least could be cast in terms of a formally defined model $c(H, E)$ involving evidence and hypothesis.1 Ideally, a full-fledged and satisfactory confirmation model $c(H, E)$ would meet a series of desiderata, including the following: (1) $c(H, E)$ should be grounded on some simple and intuitively appealing "core intuition"; (2) $c(H, E)$ should exhibit a set of properties which formally express sound intuitions; (3) it should be possible to specify the role and relevance of $c(H, E)$ in science as well as in other forms of inquiry.

In what follows we will focus on accounts of confirmation arising from the Bayesian framework and we will mainly address issues (1) and (2). Bayesianism arguably is a major theoretical perspective in contemporary discussions of reasoning in science as well as in other domains (Bovens and Hartmann 2003; Howson and Urbach 2006; Oaksford and Chater 2007). As we will see, the Bayesian approach to confirmation includes traditional and well-known proposals along with novel and more recent variants. Despite all this, the exploration of points (1) and (2) still seems to lag behind a fully satisfactory level of detail and completeness. In trying to contribute to a more systematic treatment, we hope to provide some useful conceptual material to effectively tackle issue (3), ultimately bridging philosophical accounts of confirmation back to practice.

1 It should be kept in mind that this relationship is strictly speaking a three-place one, involving a given background of knowledge and assumptions, often denoted as $K$. Such a term will be omitted from our notation for simple reasons of convenience, as it is inconsequential for our discussion.

V. Crupi, Department of Philosophy, University of Turin, via Sant'Ottavio 20, 10124 Turin (Italy), e-mail: vincenzo.crupi@unito.it
R. Festa and C. Buttasi, Department of Philosophy, University of Trieste, Trieste, Italy
Research supported by a grant from the Spanish Department of Science and Innovation (FFI2008-01169/FISO).

M. Suárez et al. (eds.), EPSA Epistemology and Methodology of Science: Launch of the European Philosophy of Science Association, DOI 10.1007/978-90-481-3263-8_7, © Springer Science+Business Media B.V. 2010

7.2 How to Price a Horse: Intuitions Concerning Distance

In the 1620s, Galileo Galilei was consulted by some Florentine gentlemen engaged in a "learned conversation" on the following question: a horse is really worth a hundred crowns, one person estimated it at ten crowns and another at a thousand; which of the two made the more extravagant estimate? A priest named Tolomeo Nozzolini had argued that the higher estimate was more mistaken, since "the excess of a thousand above a hundred is greater than that of a hundred above ten". Disagreeing, Galilei submitted that the two estimates were equally extravagant, "because the ratio of a thousand to a hundred is the same as the ratio of a hundred to ten" (see Todhunter 1865, p. 5).

The Nozzolini-Galilei controversy reveals different intuitions about the correct way to measure error and, more generally, the distance between quantitative values. In Bayesian confirmation theory, evidence $E$ is seen as possibly increasing or decreasing the initial probability of a hypothesis of concern $H$.
As applied to the departure of the final from the initial probability, Nozzolini's and Galilei's diverging notions of "distance" seem to lie behind two of the most widely known Bayesian measures of confirmation, i.e., the so-called probability difference and probability ratio measures (first proposed by Carnap (1950/1962) and Keynes (1921), respectively):

$$c_d(H, E) = p(H|E) - p(H)$$
$$c_r(H, E) = p(H|E)/p(H)$$

Indeed, a very similar debate involving diverging intuitions has occurred concerning precisely $c_d(H, E)$ and $c_r(H, E)$. Sober (1994) has argued, by means of numerical examples such as the following, that the probability ratio measure of confirmation conflicts with clear presystematic judgments. Suppose that on one hand $p(H_1) = 0.1$ and $p(H_1|E_1) = 0.9$, whereas on the other hand $p(H_2) = 0.0001$ and $p(H_2|E_2) = 0.001$. Then it can be computed that $c_r(H_1, E_1) = 9 < 10 = c_r(H_2, E_2)$. However, Sober claims, "surely a jump from .1 to .9 reflects a larger change in plausibility than a jump from .0001 to .001", contrary to what $c_r(H, E)$ implies (Sober 1994, 228).

It should be noticed that, as Festa (1999) has pointed out, complaints can be constructed which go in exactly the opposite direction. For now suppose that $p(H_1) = 0.000001$ and $p(H_1|E_1) = 0.1$, whereas $p(H_2) = 0.7$ and $p(H_2|E_2) = 0.8$. This time, the ratio measure $c_r(H, E)$, in contrast to the difference measure $c_d(H, E)$, ranks the former jump as larger, reflecting the arguably sound judgment that "the initial probability increases from a ridiculously small value to a noticeable one" (Festa 1999, p. 66).

Although based on different ways to conceive distance, the difference and ratio measures above share a common trait. Measures $c_d(H, E)$ and $c_r(H, E)$ are both meant to provide an answer to the question: how far has the probability of $H$ gone from its initial value (as an effect of $E$)?
However, the probability of a hypothesis enjoys the important property of having a clear-cut limiting case, i.e., certainty, either of $H$ being true or of it being false. As a consequence, one can legitimately conceive confirmation in terms of a measure $c_{rd}(H, E)$ of the "relative reduction of the probability distance (difference) from certainty", in a sense to be explained immediately. The core intuition underlying a $c_{rd}$-measure is the focus on the question: to what extent is the initial probability distance from certainty concerning the truth (falsehood) of $H$ reduced by a confirming (disconfirming) piece of evidence $E$? Or, in other terms: how much of such a distance is "covered" by the upward (downward) jump from $p(H)$ to $p(H|E)$? A rather natural way to formalize $c_{rd}(H, E)$ is the following:

$$c_{rd}(H, E) = \begin{cases} \dfrac{p(H|E) - p(H)}{1 - p(H)} & \text{if } p(H|E) \geq p(H) \\[2ex] \dfrac{p(H|E) - p(H)}{p(H)} & \text{if } p(H|E) < p(H) \end{cases}$$

Previous appearances of $c_{rd}(H, E)$ include Rescher (1958, p. 87), Shortliffe and Buchanan (1975) and Mura (2006, 2008). Measure $c_{rd}(H, E)$ has also been advocated by Crupi et al. (2007) as enjoying a set of interesting formal properties, to some of which we will return later. For the time being, we would like to point out that, as far as we can see, $c_{rd}(H, E)$ is the only Bayesian alternative to $c_d(H, E)$ and $c_r(H, E)$ which emerges as a relatively simple way to assess confirmation on the basis of "distances" involving relevant probability values.

In probability theory, odds may work as measures of confidence under uncertainty much as probabilities themselves. The two quantities are strictly related in a perfectly well-known fashion: $p(X) = o(X)/[1 + o(X)]$ and $o(X) = p(X)/[1 - p(X)]$. Following an illuminating informal remark made by Joyce (2004), one can conveniently illustrate the correspondence between "probability talk" and "odds talk" as simply analogous to the correspondence between "we are two thirds of the way there" and "we have gone twice as far as we have yet to go".
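The numerical comparisons in this section can be reproduced with a short sketch. The function names below are our own labels chosen for illustration, and each measure is written as a function of the initial probability p(H) and the final probability p(H|E) only; nothing beyond the formulas and the example values comes from the text.

```python
def c_d(prior, posterior):
    """Probability difference: p(H|E) - p(H)."""
    return posterior - prior


def c_r(prior, posterior):
    """Probability ratio: p(H|E) / p(H). Requires prior > 0."""
    return posterior / prior


def c_rd(prior, posterior):
    """Relative reduction of the distance from certainty (0 < prior < 1)."""
    if posterior >= prior:
        # Share of the distance to certainty of truth that has been covered.
        return (posterior - prior) / (1 - prior)
    # Share of the distance to certainty of falsehood (a negative value).
    return (posterior - prior) / prior


def prob_to_odds(p):
    """o(X) = p(X) / [1 - p(X)]. Requires p < 1."""
    return p / (1 - p)


def odds_to_prob(o):
    """p(X) = o(X) / [1 + o(X)]."""
    return o / (1 + o)


# (prior, posterior) pairs taken from the two examples discussed above.
examples = {
    "Sober": [(0.1, 0.9), (0.0001, 0.001)],
    "Festa": [(0.000001, 0.1), (0.7, 0.8)],
}

for name, pairs in examples.items():
    for prior, posterior in pairs:
        print(name, prior, posterior,
              c_d(prior, posterior),
              c_r(prior, posterior),
              c_rd(prior, posterior))
```

Run on Sober's pairs, the ratio measure ranks the second jump higher while the difference measure ranks the first higher; on Festa's pairs the two rankings are reversed. The odds helpers mirror Joyce's remark: a probability of two thirds corresponds to odds of two to one.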
It so happens, thus, that the difference-based and the ratio-based notions of distance can also be applied to odds, yielding the following further pair of Bayesian confirmation measures:

$$c_{od}(H, E) = o(H|E) - o(H)$$
$$c_{or}(H, E) = o(H|E)/o(H)$$

It is then interesting to notice that the odds ratio $c_{or}(H, E)$ itself, a widely known notion famously advocated by Good (1950) and others (e.g., Fitelson 2001), can be seen as a distance-based confirmation measure. The odds difference measure $c_{od}(H, E)$ has also appeared in the literature, if only occasionally (see Festa 1999, p. 59, and Joyce 2004).2

7.3 The Sharp Edges of Incremental Confirmation: Basic Properties

A neat core intuition is a valuable basis, but it is still too feeble a ground to meet the challenges of a satisfactory philosophical account of confirmation. We will now address various details of a proper "grammar" of (Bayesian) confirmation which are often only separately analyzed (if at all). Indeed, Bayesian confirmation theories are typically introduced in a rather cursory way, e.g., by simply pointing to the large class of functions mapping relevant probability values involving $H$ and $E$ onto a number which is positive, null or negative depending on $p(H|E)$ being higher than, equal to or lower than $p(H)$. At other times a different, rather informal characterisation can be found, presumably capturing the "incremental" nature of Bayesian confirmation: $c(H, E)$ is then called an incremental measure of confirmation when it increases with the final probability $p(H|E)$ and, as is sometimes added, decreases with the initial probability $p(H)$. This sort of approach may well be pursued for the sake of simplicity and brevity. It is surprising, however, that no complete formal characterisation of Bayesian incremental confirmation seems to be available so far.
In what follows, we will provide such a characterisation as grounded in a small number of conditions, thus labelled basic.

A few preliminary assumptions and points of notation will be useful. We will say that a statement $X$ is p-normal iff $0 < p(X) < 1$ (see Kuipers 2000, p. 48). Also, we will say that hypothesis $H$ and evidence $E$ represent a p-normal pair iff both are p-normal. Further, we will assume that, for any p-normal pair $(H, E)$, $c(H, E)$ maps the joint probability distribution $p(\pm H \wedge \pm E)$ onto a real number. Importantly, this requires that $c(H, E)$ be defined for p-normal pairs, but does not forbid it to be defined for non-p-normal pairs as well.3

2 One can define a rather natural odds counterpart of the measure $c_{rd}(H, E)$ of the relative reduction of probability distance (from certainty). (An earlier occurrence of this measure appears in Heckerman (1986).) As shown in Crupi (2008), however, such an odds counterpart turns out to be ordinally equivalent to the simple odds ratio measure $c_{or}(H, E)$.

3 There are confirmation measures whose behavior is perfectly defined and identical for any p-normal pair while being divergent for non-p-normal ones. To illustrate, consider the measure advocated by Kuipers (2000), i.e., $c_k(H, E) = p(E|H)/p(E)$. Since for any p-normal pair $p(E|H)/p(E) = p(H|E)/p(H)$, in this class of cases $c_k(H, E)$ is identical to the probability ratio measure $c_r(H, E)$ defined above. However, the latter is not defined whenever $p(H) = 0$. By contrast, $c_k(H, E)$ may be defined in this case as well, provided that $E$ is p-normal and a value for $p(E|H)$ can be specified. (For more on this point, see the distinction between "inclusive" and "non-inclusive" accounts of confirmation in Kuipers (2000); also see Festa (1999, pp. 67–68).)
A reader who is familiar with standard presentations of Bayesian confirmation theory may already expect certain statements to appear as eminently basic conditions. The following is a case in point:

(IFPD) Initial and Final Probability Dependence
For any p-normal pair $(H, E)$, $c(H, E)$ is a real-valued function of $p(H)$ and $p(H|E)$ only.

Indeed, to the extent that confirmation is thought of as capturing the direction and amount of a change in the probability of hypothesis $H$ as provided by evidence $E$, (IFPD) should sound very natural. It will be shown shortly, however, that such a principle need not be assumed as primitive, as it can be promptly derived from a set of conditions which we see as providing a more convenient theoretical basis.

In our proposed reconstruction, the first basic condition defining Bayesian incremental confirmation is the following comparative principle concerning the dependence on the final probability of hypotheses:

(FPI) Final Probability Incrementality
For any two p-normal pairs $(H_1, E_1)$ and $(H_2, E_2)$ such that $p(H_1) = p(H_2)$, $c(H_1, E_1) \gtreqless c(H_2, E_2)$ iff $p(H_1|E_1) \gtreqless p(H_2|E_2)$.

Condition (FPI) quite simply says that, for any fixed (non-extreme) value of the initial probability of hypotheses, the degree of confirmation is higher for higher values of the final probability, i.e., it is a strictly increasing function of the latter.

The second basic condition is a somewhat parallel comparative principle concerning the dependence of incremental confirmation on the initial probability of hypotheses.
(IPI) Initial Probability Incrementality
For any two p-normal pairs $(H_1, E_1)$ and $(H_2, E_2)$:
(1) if $p(H_1|E_1) = p(H_2|E_2) \in (0, 1)$, then $c(H_1, E_1) \gtreqless c(H_2, E_2)$ iff $p(H_1) \lesseqgtr p(H_2)$;
(2) if $p(H_1|E_1) = p(H_2|E_2) \in \{0, 1\}$, then:
(i) if $p(H_1) < p(H_2)$, then $c(H_1, E_1) \geq c(H_2, E_2)$;
(ii) if $p(H_1) = p(H_2)$, then $c(H_1, E_1) = c(H_2, E_2)$;
(iii) if $p(H_1) > p(H_2)$, then $c(H_1, E_1) \leq c(H_2, E_2)$.

Condition (IPI.1) requires that, for non-extreme values of the final probability of hypotheses, the degree of confirmation is higher for lower values of the initial probability, i.e., it is a strictly decreasing function of the latter. On the other hand, (IPI.2) weakens the requirement for extreme values of the final probability of hypotheses, implying only that the degree of confirmation is a non-increasing function of the initial probability. The latter caveat is suggested by the remark that, for extreme values of the final probability, the evidence concerned allows full certainty about the truth value of the hypothesis at issue. In such cases, one might arguably want the degree of confirmation to depend on this final state of full certainty only, whatever the initial probability.

The third basic condition concerns neutrality, i.e., the case in which the evidence at issue does not affect the credibility of the hypothesis of concern:

(EN) Equineutrality
For any two p-normal pairs $(H_1, E_1)$ and $(H_2, E_2)$ such that $p(H_1|E_1) = p(H_1)$ and $p(H_2|E_2) = p(H_2)$, $c(H_1, E_1) = c(H_2, E_2)$.

Condition (EN) dictates that all p-normal pairs involving probabilistically independent statements should be assigned one and the same numerical value. As compared to principles (FPI) and (IPI) above, equineutrality may appear less transparent in its intuitive basis. It can be defended, however, by the crucial role it plays in the derivation of desirable properties to which we will turn shortly. For the time being, let us point out that:

Theorem 1.
The basic conditions (FPI), (IPI) and (EN) are logically independent.4

4 The Appendix provides proofs of this as well as all subsequent theorems.

7.4 Some Derived Properties of Incremental Confirmation

As a consequence of the basic properties specified by the three basic conditions above, incremental confirmation measures share several interesting derived properties. Some of them are established by the following principles, which we will thus label derived conditions.

A first important derived condition is the following, showing how incremental measures naturally discriminate among confirmation, neutrality and disconfirmation:

(QD) Qualitative Discrimination
There exists a real number $t$ such that for any p-normal pair $(H, E)$:
(1) $c(H, E) > t$ iff $p(H|E) > p(H)$;
(2) $c(H, E) = t$ iff $p(H|E) = p(H)$;
(3) $c(H, E) < t$ iff $p(H|E) < p(H)$.

Principle (QD) states that the fixed quantity $t$ indicating neutrality acts as a threshold separating cases of confirmation from cases of disconfirmation. The precise value of $t$ is largely a matter of convenience, usual choices being either 0 or 1. It is easy to see that:

Theorem 2. (QD) follows from the basic conditions (FPI) and (EN).

It should be noticed that (QD) is sometimes taken as an appropriate general definition for Bayesian confirmation measures. Strictly speaking, this is quite unsatisfactory though, as we will now see by discussing a number of further derived properties. To begin with, consider once again the following:

(IFPD) Initial and Final Probability Dependence
For any p-normal pair $(H, E)$, $c(H, E)$ is a real-valued function of $p(H)$ and $p(H|E)$ only.

It can be shown that:

Theorem 3. (IFPD) follows from each of the basic conditions (FPI) and (IPI).

Thus, the fulfilment of (IFPD) amounts to an important derived property of Bayesian incremental confirmation measures. However, (IFPD) does not follow from (QD). As a matter of fact, it can be proven that:

Theorem 4.
(QD) and (IFPD) are logically independent.

Furthermore, consider the following principle:

(FPI-H) Final Probability Incrementality with Fixed Hypothesis
For any two p-normal pairs $(H, E_1)$ and $(H, E_2)$, $c(H, E_1) \gtreqless c(H, E_2)$ iff $p(H|E_1) \gtreqless p(H|E_2)$.

According to Eells and Fitelson (2000, p. 670), "it is not an exaggeration to say that most Bayesian confirmation theorists would accept (FPI-H) as a desideratum for Bayesian measures of confirmation". For one thing, as Eells and Fitelson (2000) also point out, (FPI-H) is crucially involved in classical Bayesian analyses such as the solution of the ravens paradox provided by Horwich (1982, pp. 54–63). Yet (FPI-H) itself cannot be derived from (QD), as implied by the following demonstrable fact:

Theorem 5. (QD) and (FPI-H) are logically independent.

By contrast, our basic conditions for incremental confirmation do yield (FPI-H) as specifying a derived property. Indeed, it is very easy to show that:

Theorem 6. (FPI-H) follows from basic condition (FPI).

Once the "fixed hypothesis" version of final probability incrementality is considered, a natural "fixed evidence" counterpart promptly comes to mind, i.e.:

(FPI-E) Final Probability Incrementality with Fixed Evidence
For any two p-normal pairs $(H_1, E)$ and $(H_2, E)$ such that $p(H_1) = p(H_2)$, $c(H_1, E) \gtreqless c(H_2, E)$ iff $p(H_1|E) \gtreqless p(H_2|E)$.

It is easy to show that this extremely plausible principle is again no more than a straightforward consequence of the basic condition (FPI), thus indicating a further derived property of Bayesian incremental confirmation:

Theorem 7. (FPI-E) follows from basic condition (FPI).

It should also be pointed out that, much as with (FPI-H) above, the often mentioned condition (QD) is not sufficient to yield (FPI-E), as implied by the following demonstrable fact:

Theorem 8. (QD) and (FPI-E) are logically independent.
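One half of Theorem 5, namely that (QD) does not guarantee (FPI-H), can be illustrated numerically. The sketch below takes the likelihood difference p(E|H) − p(E|¬H) as a candidate measure; its sign always agrees with that of p(H|E) − p(H), so it satisfies (QD) with t = 0. The probability values are hypothetical numbers picked by us, and this is not necessarily the counterexample used in the Appendix.

```python
def likelihood_difference(p_h, p_e, p_h_given_e):
    """p(E|H) - p(E|not-H), computed from p(H), p(E) and p(H|E)."""
    # Bayes' theorem gives the two likelihoods.
    p_e_given_h = p_h_given_e * p_e / p_h
    p_e_given_not_h = (1 - p_h_given_e) * p_e / (1 - p_h)
    return p_e_given_h - p_e_given_not_h


P_H = 0.5  # one fixed hypothesis H (hypothetical prior)

# Two hypothetical pieces of evidence bearing on the same H.
E1 = {"p_e": 0.1, "p_h_given_e": 0.9}
E2 = {"p_e": 0.5, "p_h_given_e": 0.8}

c1 = likelihood_difference(P_H, **E1)
c2 = likelihood_difference(P_H, **E2)

# Both values are positive, as (QD) requires for confirming evidence,
# yet E1 yields the higher final probability and the lower score,
# which is what (FPI-H) rules out.
print(c1, c2)
```

The reason is that, for a fixed hypothesis, this candidate measure still depends on p(E) and not only on the final probability p(H|E).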
Further important remarks about incremental measures involve the notion of predictive success. We will say that the predictive success of hypothesis $H$ concerning evidence $E$ amounts to the quantity $q(H, E) = p(E|H)/p(E)$. It can be shown that (FPI-H) is logically equivalent to the following condition linking confirmation to the predictive success of a given hypothesis:5

(PS) Dependence on Predictive Success
For any two p-normal pairs $(H, E_1)$ and $(H, E_2)$, $c(H, E_1) \gtreqless c(H, E_2)$ iff $q(H, E_1) \gtreqless q(H, E_2)$.

Theorem 9. (FPI-H) and (PS) are logically equivalent.

From (PS), in turn, the following "surprise bonus" principle for successful deductive prediction of hypotheses can be easily derived:6

(SB) Surprise Bonus for Successful Deductive Predictions
For any two p-normal pairs $(H, E_1)$ and $(H, E_2)$ such that $H \models E_1, E_2$, $c(H, E_1) \gtreqless c(H, E_2)$ iff $p(E_1) \lesseqgtr p(E_2)$.

Theorem 10. (PS) implies (SB).

5 (PS) essentially amounts to a statement that Steel (2007) labels LP1 and identifies as one among two possible renditions of the "likelihood principle". While departing from his terminological choices, we concur with Steel's argument that (PS) is a compelling principle for Bayesians.

6 We borrow the term "surprise bonus" from Kuipers (2000, 55).

(SB) states that hypothesis $H$ is more strongly confirmed by the occurrence of the most surprising (unlikely) among its deductive consequences, a rather widespread principle in the philosophy of science which is often said to reflect a basic tenet of scientific methodology.

In order to appreciate the relevance of the foregoing analysis, it should be noticed that confirmation measures have been proposed in the literature which, although broadly consistent with the Bayesian framework, notably lack some of the basic and derived properties of incremental confirmation.
As an illustration, consider the following well-known measure, proposed by Nozick (1981, 252):

$$c_n(H, E) = p(E|H) - p(E|\neg H)$$

It can be easily shown that such a measure does satisfy the basic equineutrality condition (EN). As far as the derived properties above are concerned, it also satisfies the qualitative discrimination condition (QD) along with condition (FPI-E). Yet it does not generally satisfy the basic incrementality conditions (FPI) and (IPI), and it ends up violating all other derived conditions as well, i.e., (IFPD), (FPI-H), (PS) and (SB).7

7 Detailed proofs are omitted here, but see Steel (2003, 219–221), Crupi et al. (2007, 246), and Tentori et al. (2007, 109), for relevant remarks.

7.5 Sorting Out the Grammar: from Basic to Structural Properties

7.5.1 The Ordinal Versus Quantitative Level

By definition, all incremental confirmation measures share the basic and derived properties presented above. Yet one or more specific incremental measures may be characterized by further interesting features to be called structural properties, as specified by appropriate structural conditions. Once the class of structural properties of incremental confirmation is so identified, it may serve for grounding a thorough and unified discussion of a variety of issues already addressed or touched upon in the literature in a less systematic fashion. We will outline such a discussion shortly, soon after introducing a further useful distinction, i.e., that between ordinal level and quantitative level structural properties.

For our purposes, two confirmation measures $c(H, E)$ and $c^*(H, E)$ will be said to be ordinally equivalent iff, for any two p-normal pairs $(H_1, E_1)$ and $(H_2, E_2)$, $c(H_1, E_1) \gtreqless c(H_2, E_2)$ iff $c^*(H_1, E_1) \gtreqless c^*(H_2, E_2)$. Isotone transformations of a given confirmation measure yield measures whose detailed quantitative behavior (including domain and neutrality value) may vary widely, but such that rank-order (for p-normal pairs) is strictly preserved. To illustrate, the probability ratio measure and the three variants of it in the following list are all ordinally equivalent:8

(a) $c_r(H, E) = p(H|E)/p(H)$; domain: $[0, +\infty)$; neutrality value: 1
(b) $[p(H|E) - p(H)] / [p(H|E) + p(H)]$; domain: $[-1, 1)$; neutrality value: 0
(c) $[p(H|E) - p(H)] / p(H)$; domain: $[-1, +\infty)$; neutrality value: 0
(d) $p(H|E) / [p(H|E) + p(H)]$; domain: $[0, 1)$; neutrality value: 1/2

Both Fitelson (1999) and Festa (1999) emphasized that probabilistic confirmation measures are not generally ordinally equivalent, not even properly incremental ones. As a consequence, at the ordinal level of analysis of the notion of confirmation one can already find conceptually remarkable properties that are, in our current terms, structural and not basic. Ordinal level structural properties are simply invariant upon classes of ordinal equivalence, i.e., $c(H, E)$ will enjoy the property at issue if and only if any ordinally equivalent $c^*(H, E)$ does. If that is not the case, then the property at issue posits constraints operating at a more fine-grained quantitative level, thus being sensitive to the quantitatively different behavior of ordinally equivalent measures.

In this section, we will mainly address a sample of significant ordinal level structural properties. As a final point, we will also touch upon the quantitative level of analysis by reference to one illustrative example.

7.5.2 "Laws" of Likelihood

A widely known and discussed principle in probabilistic analyses of confirmation is the so-called "law of likelihood" (or "likelihood principle"), whose rendition in our present framework is the following:

(LL) Law of Likelihood
For any two p-normal pairs $(H_1, E)$ and $(H_2, E)$, $c(H_1, E) \gtreqless c(H_2, E)$ iff $p(E|H_1) \gtreqless p(E|H_2)$.9

Principle (LL) certainly amounts to an important structural property of incremental confirmation. It is structural, and not basic, because many incremental confirmation measures are well known not to satisfy (LL).
More generally, it can be shown that:

Theorem 11. (LL) is logically independent from incrementality, i.e., from the set of basic conditions (FPI), (IPI) and (EN).

8 Two of the variants of $c_r(H, E)$ listed are discussed in Festa (1999, 64) and Finch (1960).

9 (LL) essentially amounts to a statement that Steel (2007) labels LP2 and identifies as the second possible rendition of the "likelihood principle". (See footnote 5.)

(LL) is also a principle concerning the ordinal level of analysis. Indeed, it has been seen by Bayesian confirmation theorists as isolating (incremental) confirmation measures ordinally equivalent to the probability ratio measure (see Milne 1996) as well as indicating some significant limitations of this very class of measures (see Fitelson 2007). Interestingly, despite being independent from incrementality, (LL) is a sufficiently powerful and committing principle to imply by itself conditions appearing above as derived for incremental measures:

Theorem 12. (LL) implies the derived condition (FPI-E).

Now consider the following claim:

(WLL) For any two p-normal pairs $(H_1, E)$ and $(H_2, E)$, if $p(E|H_1) > p(E|H_2)$ and $p(E|\neg H_1) < p(E|\neg H_2)$, then $c(H_1, E) > c(H_2, E)$.

"WLL" stands for "weak law of likelihood", in keeping with the following fact:

Theorem 13. (LL) implies (WLL).

Joyce (2004) has argued that (WLL) "must be an integral part of any account of evidential relevance that deserves the title 'Bayesian'". In a similar vein, according to Fitelson (2007, 479), "(WLL) captures a crucial common feature of all Bayesian conceptions of relational confirmation". In light of these statements, it is thus of interest to point out that the following also holds:

Theorem 14. (WLL) is logically independent from incrementality, i.e., from the set of basic conditions (FPI), (IPI) and (EN).
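That (LL) is a structural rather than a basic condition can be made concrete with two familiar incremental measures. In the sketch below, the probability ratio measure orders two hypotheses by their likelihoods for a fixed piece of evidence, as (LL) demands, whereas the probability difference measure does not. All probability values are hypothetical numbers picked by us, and a single numerical case illustrates the point without proving anything general.

```python
P_E = 0.5  # p(E): one fixed piece of evidence (hypothetical value)


def posterior(p_h, p_e_given_h, p_e=P_E):
    """Bayes' theorem: p(H|E) = p(H) * p(E|H) / p(E)."""
    return p_h * p_e_given_h / p_e


def c_r(p_h, p_e_given_h):
    """Probability ratio measure p(H|E) / p(H)."""
    return posterior(p_h, p_e_given_h) / p_h


def c_d(p_h, p_e_given_h):
    """Probability difference measure p(H|E) - p(H)."""
    return posterior(p_h, p_e_given_h) - p_h


# Hypothetical hypotheses, each given as (p(H), p(E|H)).
H1 = (0.01, 1.0)  # improbable hypothesis that entails E
H2 = (0.4, 0.9)   # more probable hypothesis with a lower likelihood

higher_likelihood = H1[1] > H2[1]

# (LL) requires c(H1, E) > c(H2, E) exactly when p(E|H1) > p(E|H2).
ratio_agrees = (c_r(*H1) > c_r(*H2)) == higher_likelihood
difference_agrees = (c_d(*H1) > c_d(*H2)) == higher_likelihood

print(ratio_agrees, difference_agrees)
```

Here the difference measure ranks the second hypothesis higher even though the first has the greater likelihood, because its value also scales with the initial probability of the hypothesis.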
As a consequence we submit that, as plausible as it may seem in a Bayesian perspective, (WLL), just like (LL), counts as a properly structural (not basic) condition for Bayesian theories of incremental confirmation. Joyce's and Fitelson's statements are only contingently supported in the sense that, to the best of our knowledge, all incremental confirmation measures which have been historically proposed and seriously defended do in fact satisfy (WLL).

7.5.3 Confirmability and Disconfirmability

Assuming a fixed confirmation measure $c(H, E)$, we will use $Cy(H)$ to denote the confirmability of a particular hypothesis $H$, amounting to the maximum degree of confirmation that $H$ could possibly receive (given its current probability). It is easy to see that, by the derived condition (FPI-H) above, $Cy(H) = c(H, H)$ for any incremental measure. That is, for incremental measures the confirmability of a given $H$ corresponds to the degree of confirmation provided in the limiting case of $H$ itself having been ascertained. Notably, our basic conditions for incremental confirmation leave the following quite natural question unanswered: does $Cy(H_1) = Cy(H_2)$ generally hold for any two distinct p-normal hypotheses $H_1$ and $H_2$? The following statement amounts to an ordinal level structural condition implying a positive answer:

(ECy) Equiconfirmability
For any two p-normal hypotheses $H_1$ and $H_2$, $Cy(H_1) = Cy(H_2)$.

By a parallel line of argument, let $Dy(H) = c(H, \neg H)$ be the disconfirmability of $H$ (again given the fixed incremental measure considered), by which the following condition can be stated:

(EDy) Equidisconfirmability
For any two p-normal hypotheses $H_1$ and $H_2$, $Dy(H_1) = Dy(H_2)$.

Kemeny and Oppenheim (1952, 309) seem to have at least implicitly advocated (ECy) and (EDy). More recently, Fitelson (2006, 502) has approvingly mentioned a condition, named logicality, apparently implying both principles, i.e., "$c(H, E)$ should be maximal (minimal) when $E \models H$ ($E \models \neg H$)".
Kuipers (2000, 54–55), on the other hand, has argued in favour of confirmability being hypothesis specific, i.e., in favour of:

(HCy) Hypothesis Specific Confirmability
For any two p-normal hypotheses $H_1$ and $H_2$, if $p(H_1) \neq p(H_2)$, then $Cy(H_1) \neq Cy(H_2)$.

whose analogue for disconfirmability is of course the following:

(HDy) Hypothesis Specific Disconfirmability
For any two p-normal hypotheses $H_1$ and $H_2$, if $p(H_1) \neq p(H_2)$, then $Dy(H_1) \neq Dy(H_2)$.

Quite clearly:

Theorem 15. (ECy) and (HCy) are logically inconsistent, as are (EDy) and (HDy).

A less obvious fact to be pointed out is that:

Theorem 16. (ECy) and (EDy) are logically independent, as are (HCy) and (HDy).

7.5.4 Confirmation and Complementary Hypotheses

As a final point, we would like to illustrate how one can shift from a derived to a structural ordinal property, and from the latter to a structural quantitative one, by successively strengthening a given condition. Consider the following similar but increasingly strong principles connecting the confirmation and disconfirmation of pairs of complementary hypotheses:

(CCO-H) Confirmation Complementarity: Ordinal with Fixed Hypothesis
Let $(H, E_1)$ and $(H, E_2)$ be two p-normal pairs such that $p(H|E_1) > p(H)$ and $p(H|E_2) > p(H)$. Then $c(H, E_1) > c(H, E_2)$ iff $c(\neg H, E_1) < c(\neg H, E_2)$.

(CCO) Confirmation Complementarity: Ordinal (General)
Let $(H_1, E_1)$ and $(H_2, E_2)$ be two p-normal pairs such that $p(H_1|E_1) > p(H_1)$ and $p(H_2|E_2) > p(H_2)$. Then $c(H_1, E_1) > c(H_2, E_2)$ iff $c(\neg H_1, E_1) < c(\neg H_2, E_2)$.

(CCQ) Confirmation Complementarity: Quantitative
For any p-normal pair $(H, E)$, $c(H, E) = -c(\neg H, E)$.

First of all, as suggested above, it is quite easy to show that:

Theorem 17. (CCQ) implies (CCO), which implies (CCO-H).

Moreover, it turns out that:

Theorem 18. (CCO-H) follows from the basic condition (FPI).
Thus (CCO-H) describes a derived property of incremental measures: the confirmatory impact of evidence on one given hypothesis (be it positive or negative) is a decreasing function of its impact on the negation of that hypothesis. By contrast, (CCO) is a structural condition at the ordinal level (demonstrably violated, for instance, by any measure ordinally equivalent to the probability ratio $c_r(H, E)$), stating that one hypothesis is better confirmed than another iff the negation of the former is more severely disconfirmed. Finally, as far as our third condition (CCQ) is concerned, consider the following measures:

(a) $[o(H|E) - o(H)] / [o(H|E) + o(H)]$
(b) $[o(H|E) - o(H)] / o(H)$
(c) $c_d(H, E) = p(H|E) - p(H)$

Measures (a) and (b) are ordinally equivalent to the odds ratio measure $c_{or}(H, E)$ but distinct from the third measure listed, i.e., the simple probability difference $c_d(H, E)$. (Also, all three measures listed have 0 as their neutrality value.) It can be shown, however, that (a) and $c_d(H, E)$ satisfy condition (CCQ), whereas (b) does not. This shows that (CCQ) is a properly quantitative structural condition, as it specifies one particular form of the decreasing function connecting $c(H, E)$ and $c(\neg H, E)$, whose fulfilment is orthogonal to the ordinal equivalence relationship among measures.

To the best of our knowledge, conditions (CCO-H) and (CCO) have never been explicitly discussed in the literature. By contrast, it is interesting to notice that the strongest condition (CCQ) has a rather long history: it was first clearly stated as an adequacy condition by Kemeny and Oppenheim (1952, 309), then more recently defended by Eells and Fitelson (2002, 134) and by Crupi et al. (2007) in a more general framework.10

Let us conclude this discussion with a final remark.
Presented with the derived condition (CCO-H) above, some readers may have wondered about its "fixed evidence" counterpart, i.e., about the following rather appealing principle:

(CCO-E) Confirmation Complementarity: Ordinal with Fixed Evidence
Let $(H_1,E)$ and $(H_2,E)$ be two p-normal pairs such that $p(H_1 \mid E) > p(H_1)$ and $p(H_2 \mid E) > p(H_2)$. Then $c(H_1,E) > c(H_2,E)$ iff $c(\neg H_1,E) < c(\neg H_2,E)$.

Just like its counterpart (CCO-H), (CCO-E) is a consequence of the more general condition (CCO), i.e.:

Theorem 19. (CCO) implies (CCO-E).

However, unlike (CCO-H), condition (CCO-E) specifies a property which is not derived but only structural for incremental confirmation measures. Once again, it is demonstrably violated by measures ordinally equivalent to the probability ratio $c_r(H,E)$. This is no accident, as the probability ratio measure satisfies the law of likelihood (LL), which in turn contradicts (CCO-E), i.e.:

Theorem 20. (LL) and (CCO-E) are logically inconsistent.

Notice that, by contradicting (CCO-E), (LL) also contradicts both (CCQ) and (CCO), which are logically stronger (by Theorems 17 and 19). This further illustrates how strong a constraint the law of likelihood is as a structural property for incremental confirmation.

¹⁰ A major issue in Crupi et al. (2007, 236–242) is a thorough analysis of so-called "symmetries and asymmetries" in Bayesian confirmation theory (see Eells and Fitelson 2002). In our current terms, their convergent symmetries are all ordinal structural conditions, whereas their divergent ones are all quantitative structural conditions.

7.6 Concluding Remarks: the Call for a Grammar

The foregoing analyses were meant to lay the foundations of a set of theoretical tools for the formal analysis of reasoning, i.e., a detailed grammar of Bayesian confirmation. Our present results are preliminary, yet already telling, we submit, as suggested by the graphical summary appearing in Fig.
7.1 below. We would like to conclude with a few remarks indicating why we see the endeavour outlined here as fruitful.

To begin with, the distinction between basic/derived and structural properties may serve as a firm guide for separating issues concerning Bayesian incremental confirmation as such from the relatively more subtle puzzles involving its many variants. In particular, the appeal (or lack thereof) of basic and derived features should be seen as a crucial benchmark for the assessment of the notion of Bayesian incremental confirmation per se, as distinct from its diverse possible formalizations.

On the other hand, debated issues such as the so-called irrelevant conjunction problem (see Hawthorne and Fitelson 2004; Crupi and Tentori forthcoming), Matthew effects (Festa forthcoming), so-called "likelihoodist" principles (Fitelson 2007; Steel 2007) and symmetries and asymmetries (Eells and Fitelson 2002; Crupi et al. 2007) can all be seen as examples in which specific and possibly alternative structural conditions (or sets thereof) are formally investigated and arguments are scrutinized concerning their more or less compelling nature. In this connection, a fully developed grammar of confirmation would help clarify which options are theoretically viable and which are not, by pointing out, say, that one cannot logically satisfy both the law of likelihood (LL) and the confirmation complementarity condition (CCO), so that such a pair of principles would amount to an inconsistent set of desiderata.

Fig. 7.1 A graphical representation of the currently investigated logical relationships among basic, derived and structural conditions for Bayesian incremental confirmation. Arrows indicate relationships of logical implication. Dotted lines denote relationships of logical independence. Links marked with a bar (/) represent logical inconsistencies. Numbers refer to the corresponding theorems in the text.
To sum up, the investigation of the logical relationships among basic, derived and structural properties as defined above seems to represent an appropriate general framework of inquiry for a number of analyses and discussions surrounding confirmation and Bayesian confirmation in particular.

References

Bovens L, Hartmann S (2003) Bayesian epistemology. Oxford University Press, Oxford
Carnap R (1950/1962) Logical foundations of probability. University of Chicago Press, Chicago, IL
Christensen D (1999) Measuring confirmation. J Philos 96:437–461
Crupi V (2008) Confirmation and relative distance: the oddity of an odds counterpart (unpublished manuscript)
Crupi V, Tentori K (forthcoming) Irrelevant conjunction: statement and solution of a new paradox. Philos Sci
Crupi V, Tentori K, Gonzalez M (2007) On Bayesian measures of evidential support: theoretical and empirical issues. Philos Sci 74:229–252
Eells E, Fitelson B (2000) Measuring confirmation and evidence. J Philos 97:663–672
Eells E, Fitelson B (2002) Symmetries and asymmetries in evidential support. Philos Stud 107:129–142
Festa R (1999) Bayesian confirmation. In: Galavotti M, Pagnini A (eds) Experience, reality, and scientific explanation. Kluwer, Dordrecht, pp 55–87
Festa R (forthcoming) For unto every one that hath shall be given: Matthew properties for incremental confirmation. Synthese
Finch HA (1960) Confirming power of observations metricized for decisions among hypotheses. Philos Sci 27:293–307; 391–404
Fitelson B (1999) The plurality of Bayesian measures of confirmation and the problem of measure sensitivity. Philos Sci 66:S362–S378
Fitelson B (2001) A Bayesian account of independent evidence with applications. Philos Sci 68:S123–S140
Fitelson B (2006) Logical foundations of evidential support. Philos Sci 73:500–512
Fitelson B (2007) Likelihoodism, Bayesianism, and relational confirmation. Synthese 156:473–489
Gaifman H (1979) Subjective probability, natural predicates and Hempel's ravens.
Erkenntnis 21:105–147
Good IJ (1950) Probability and the weighing of evidence. Griffin, London
Hawthorne J, Fitelson B (2004) Re-solving irrelevant conjunction with probabilistic independence. Philos Sci 71:505–514
Heckerman D (1986) Probabilistic interpretations for MYCIN's certainty factors. In: Kanal L, Lemmer J (eds) Uncertainty in artificial intelligence. North-Holland, New York, pp 167–196
Horwich P (1982) Probability and evidence. Cambridge University Press, Cambridge
Howson C, Urbach P (2006) Scientific reasoning: the Bayesian approach. Open Court, La Salle, IL
Joyce J (1999) The foundations of causal decision theory. Cambridge University Press, Cambridge
Joyce J (2004) Bayes's theorem. In: Zalta EN (ed) The Stanford encyclopedia of philosophy (Summer 2004 Edition), URL = http://plato.stanford.edu/archives/sum2004/entries/bayes-theorem/
Kemeny J, Oppenheim P (1952) Degrees of factual support. Philos Sci 19:307–324
Keynes J (1921) A treatise on probability. Macmillan, London
Kuipers T (2000) From instrumentalism to constructive realism. Reidel, Dordrecht
Milne P (1996) Log[p(h/eb)/p(h/b)] is the one true measure of confirmation. Philos Sci 63:21–26
Mortimer H (1988) The logic of induction. Prentice Hall, Paramus
Mura A (2006) Deductive probability, physical probability and partial entailment. In: Alai M, Tarozzi G (eds) Karl Popper philosopher of science. Rubbettino, Soveria Mannelli, pp 181–202
Mura A (2008) Can logical probability be viewed as a measure of degrees of partial entailment? Logic Philos Sci 6:25–33
Nozick R (1981) Philosophical explanations. Clarendon, Oxford
Oaksford M, Chater N (2007) Bayesian rationality: the probabilistic approach to human reasoning. Oxford University Press, Oxford
Rescher N (1958) A theory of evidence. Philos Sci 25:83–94
Shortliffe EH, Buchanan BG (1975) A model of inexact reasoning in medicine.
Mathematical Biosciences 23:351–379
Sober E (1994) No model, no inference: a Bayesian primer on the grue problem. In: Stalker D (ed) Grue! the new riddle of induction. Open Court, Chicago, IL, pp 225–240
Steel D (2003) A Bayesian way to make stopping rules matter. Erkenntnis 58:213–222
Steel D (2007) Bayesian confirmation theory and the likelihood principle. Synthese 156:55–77
Tentori K, Crupi V, Bonini N, Osherson D (2007) Comparison of confirmation measures. Cognition 103:107–119
Todhunter I (1865) A history of the mathematical theory of probability from the time of Pascal to that of Laplace. Macmillan, London (reprinted: 1949, 1965, Chelsea Publishing Company, New York)

Appendix: Proofs of Theorems

Theorem 1. The basic conditions (FPI), (IPI) and (EN) are logically independent.

Proof. Logical independence amounts to both consistency and non-redundancy. As for consistency, it can be shown that all confirmation measures presented in Section 7.2 (i.e., measures $c_d$, $c_r$, $c_{rd}$, $c_{od}$ and $c_{or}$) jointly satisfy all three conditions (FPI), (IPI) and (EN). As for non-redundancy, consider the following functions of a joint probability distribution $p(\pm H \wedge \pm E)$:

(i) $p(H \mid E)/p(H)^2$
(ii) $p(H)\,[p(H \mid E) - p(H)]$
(iii) $[1 - p(H \mid E)]\,[p(H \mid E) - p(H)]$

Non-redundancy is proven by the following set of easily demonstrable facts: (i) satisfies both (FPI) and (IPI) but violates (EN); (ii) satisfies both (FPI) and (EN) but violates (IPI); (iii) satisfies both (IPI) and (EN) but violates (FPI).

Theorem 2. (QD) follows from the basic conditions (FPI) and (EN).

Proof. (EN) immediately implies (QD.2). Then, since by (FPI) $c(H,E)$ is a strictly increasing function of $p(H \mid E)$, both (QD.1) and (QD.3) follow.

Theorem 3. (IFPD) follows from each of the basic conditions (FPI) and (IPI).

Proof. For any p-normal pair $(H,E)$, a joint probability distribution $p(\pm H \wedge \pm E)$ is completely determined in a non-redundant way by $p(H)$, $p(H \mid E)$ and $p(E)$.
As a consequence, if $c(H,E)$ is a function of $p(\pm H \wedge \pm E)$ but not a function of $p(H)$ and $p(H \mid E)$ only, that is because it is a (non-constant) function of $p(E)$ as well. If that is the case, however, probability models exist showing that $c(H,E)$ violates (FPI) as well as (IPI).

Theorem 4. (QD) and (IFPD) are logically independent.

Proof. Consider the following functions of $p(\pm H \wedge \pm E)$:

(i) $p(H \mid E) - p(H \mid \neg E)$
(ii) $p(H \mid E)/p(H)^2$

It is easy to prove that: (i) (originally proposed by Christensen (1999, 449), and Joyce (1999), Ch. 7, as a confirmation measure) satisfies (QD) while violating (IFPD); (ii), on the other hand, violates (QD) while satisfying (IFPD).

Theorem 5. (QD) and (FPI-H) are logically independent.

Proof. Consider the following functions of $p(\pm H \wedge \pm E)$:

(i) $p(H \mid E) - p(H \mid \neg E)$
(ii) $p(H \mid E)$

It is easy to prove that: (i) satisfies (QD) while violating (FPI-H); (ii), on the other hand, violates (QD) while (trivially) satisfying (FPI-H).

Theorem 6. (FPI-H) follows from basic condition (FPI).

Proof. (FPI-H) trivially follows from (FPI) in the special case $H_1 = H_2$.

Theorem 7. (FPI-E) follows from basic condition (FPI).

Proof. (FPI-E) trivially follows from (FPI) in the special case $E_1 = E_2$.

Theorem 8. (QD) and (FPI-E) are logically independent.

Proof. Consider the following functions of $p(\pm H \wedge \pm E)$:

(i) sin 3 2 […] $(p(H \mid E) - p(H))$
(ii) $p(H \mid E)$

It is easy to prove that: (i) satisfies (QD) while violating (FPI-E); (ii), on the other hand, violates (QD) while (trivially) satisfying (FPI-E).

Theorem 9. (FPI-H) and (PS) are logically equivalent.

Proof. For any two p-normal pairs $(H,E_1)$ and $(H,E_2)$: $p(H \mid E_1) \gtreqless p(H \mid E_2)$ iff $p(H \mid E_1)/p(H) \gtreqless p(H \mid E_2)/p(H)$ iff $p(E_1 \mid H)/p(E_1) \gtreqless p(E_2 \mid H)/p(E_2)$ iff $q(H,E_1) \gtreqless q(H,E_2)$.

Theorem 10. (PS) implies (SB).

Proof. Assume (PS).
Then notice that, for any two p-normal pairs $(H,E_1)$ and $(H,E_2)$, if $H \models E_1, E_2$, then $c(H,E_1) \gtreqless c(H,E_2)$ iff $q(H,E_1) \gtreqless q(H,E_2)$ iff $p(E_1 \mid H)/p(E_1) \gtreqless p(E_2 \mid H)/p(E_2)$ iff $1/p(E_1) \gtreqless 1/p(E_2)$ iff $p(E_1) \lesseqgtr p(E_2)$.

Theorem 11. (LL) is logically independent from incrementality, i.e., from the set of basic conditions (FPI), (IPI) and (EN).

Proof. Consider the following functions of $p(\pm H \wedge \pm E)$:

(i) $p(H \mid E)/p(H)$
(ii) $p(E \mid H) - p(E)$
(iii) $o(H \mid E)/o(H) = p(E \mid H)/p(E \mid \neg H)$

It is easy to prove that: the probability ratio measure (i) satisfies both (LL) and all basic conditions for incrementality; (ii) (originally proposed by Mortimer (1988), Section 11.1, as a confirmation measure) satisfies (LL) while violating the basic conditions for incrementality; the odds ratio measure (iii), on the other hand, violates (LL) while satisfying all basic conditions for incrementality.

Theorem 12. (LL) implies the derived condition (FPI-E).

Proof. Assume (LL). Then notice that, for any two p-normal pairs $(H_1,E)$ and $(H_2,E)$, if $p(H_1) = p(H_2)$, then $c(H_1,E) \gtreqless c(H_2,E)$ iff $p(E \mid H_1) \gtreqless p(E \mid H_2)$ iff $p(E \mid H_1)\,p(H_1)/p(E) \gtreqless p(E \mid H_2)\,p(H_2)/p(E)$ iff $p(H_1 \mid E) \gtreqless p(H_2 \mid E)$.

Theorem 13. (LL) implies (WLL).

Proof. Assume (WLL) is false. Then there exist two p-normal pairs $(H_1,E)$ and $(H_2,E)$ such that $p(E \mid H_1) > p(E \mid H_2)$ while $c(H_1,E) \leq c(H_2,E)$, so that (LL) is violated.

Theorem 14. (WLL) is logically independent from incrementality, i.e., from the set of basic conditions (FPI), (IPI) and (EN).

Proof. Consider the following functions of $p(\pm H \wedge \pm E)$:

(i) $c_{or}(H,E) = o(H \mid E)/o(H) = p(E \mid H)/p(E \mid \neg H)$
(ii) $p(E \mid H) - 2\,p(E \mid \neg H)$
(iii) $p(H \mid E)^{10} - p(H)^{10}$

It is easy to prove that: the odds ratio measure (i) satisfies both (WLL) and all basic conditions for incrementality; (ii) satisfies (WLL) while violating the basic conditions for incrementality; (iii), on the other hand, violates (WLL) while satisfying all basic conditions for incrementality.
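The contrast drawn in the proof of Theorem 11 between the probability ratio and the odds ratio can be illustrated with a small numerical model. The sketch below assumes (LL) in the form used in the proof of Theorem 12, i.e., with fixed evidence $E$, $c(H_1,E) \gtreqless c(H_2,E)$ iff $p(E \mid H_1) \gtreqless p(E \mid H_2)$. The joint distribution is hypothetical and was chosen only so that the two measures disagree; the helper names are likewise illustrative.

```python
from fractions import Fraction

# Hypothetical joint distribution over (H1, H2, E), in hundredths.
# It is an illustrative assumption, not a model taken from the text.
counts = {
    (True, True, True): 10, (True, False, True): 0,
    (False, True, True): 44, (False, False, True): 1,
    (True, True, False): 0, (True, False, False): 0,
    (False, True, False): 6, (False, False, False): 39,
}
assert sum(counts.values()) == 100


def prob(pred):
    """Probability of the set of worlds (H1, H2, E) satisfying pred."""
    return Fraction(sum(n for w, n in counts.items() if pred(w)), 100)


def likelihood(i, value):
    """p(E | H_i = value), where i is 0 for H1 and 1 for H2."""
    return prob(lambda w: w[2] and w[i] == value) / prob(lambda w: w[i] == value)


p_e = prob(lambda w: w[2])


def prob_ratio(i):
    """c_r(H_i, E) = p(H_i | E) / p(H_i) = p(E | H_i) / p(E)."""
    return likelihood(i, True) / p_e


def odds_ratio(i):
    """c_or(H_i, E) = o(H_i | E) / o(H_i) = p(E | H_i) / p(E | not-H_i)."""
    return likelihood(i, True) / likelihood(i, False)


print("p(E|H1), p(E|H2):", likelihood(0, True), likelihood(1, True))
print("c_r  for H1, H2:", prob_ratio(0), prob_ratio(1))
print("c_or for H1, H2:", odds_ratio(0), odds_ratio(1))
```

In this model $p(E \mid H_1) > p(E \mid H_2)$, and the probability ratio ranks $H_1$ above $H_2$ as (LL) requires, whereas the odds ratio ranks $H_2$ above $H_1$ because $p(E \mid \neg H_2)$ is very small.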
Theorem 15. (ECy) and (HCy) are logically inconsistent, as are (EDy) and (HDy).

Proof. Recall that $c(H,E)$ is assumed to be a function of $p(\pm H \wedge \pm E)$. Then simply notice that, if (ECy) holds, then $Cy(H)$ is a constant for any p-normal $H$. If (HCy) holds, on the contrary, $Cy(H)$ must be a non-constant function of $p(H)$. A strictly analogous line of argument applies to (EDy) and (HDy).

Theorem 16. (ECy) and (EDy) are logically independent, as are (HCy) and (HDy).

Proof. Consider the following incremental measures:

(i) $c_{or}^{*}(H,E) = \dfrac{o(H \mid E) - o(H)}{o(H \mid E) + o(H)}$
(ii) $c_{d}(H,E) = p(H \mid E) - p(H)$
(iii) $c_{g}(H,E) = \dfrac{p(\neg H) - p(\neg H \mid E)}{p(\neg H) + p(\neg H \mid E)}$
(iv) $c_{r}^{*}(H,E) = \dfrac{p(H \mid E) - p(H)}{p(H \mid E) + p(H)}$

Measure (i), proposed by Kemeny and Oppenheim (1952), is ordinally equivalent to the odds ratio measure $c_{or}(H,E) = o(H \mid E)/o(H)$ and can be easily shown to jointly satisfy (ECy) and (EDy). This proves that (ECy) and (EDy) are consistent. On the other hand, it is easy to show that the probability difference measure (ii) jointly satisfies (HCy) and (HDy). This proves that (HCy) and (HDy) are consistent. Finally, it is easy to show that: measure (iii) (ordinally equivalent to the one proposed by Gaifman (1979)) jointly satisfies (ECy) and (HDy), thus violating both (HCy) and (EDy); on the other hand, measure (iv) (ordinally equivalent to the probability ratio) jointly satisfies (HCy) and (EDy), thus violating both (ECy) and (HDy). This proves that (ECy) and (EDy) are non-redundant, as are (HCy) and (HDy).

Theorem 17. (CCQ) implies (CCO), which implies (CCO-H).

Proof. Assume (CCQ). Then for any two p-normal pairs $(H_1,E_1)$ and $(H_2,E_2)$ such that $p(H_1 \mid E_1) > p(H_1)$ and $p(H_2 \mid E_2) > p(H_2)$: $c(H_1,E_1) > c(H_2,E_2)$ iff $c(\neg H_1,E_1) = -c(H_1,E_1) < c(\neg H_2,E_2) = -c(H_2,E_2)$. Moreover, simply notice that (CCO-H) trivially follows from (CCO) in the special case $H_1 = H_2$.

Theorem 18. (CCO-H) follows from the basic condition (FPI).

Proof. Assume (FPI) and recall that (FPI-H) follows (Theorem 6 above).
Then for any two p-normal pairs $(H,E_1)$ and $(H,E_2)$ such that $p(H \mid E_1) > p(H)$ and $p(H \mid E_2) > p(H)$: $c(H,E_1) > c(H,E_2)$ iff $p(H \mid E_1) > p(H \mid E_2)$ iff $p(\neg H \mid E_1) < p(\neg H \mid E_2)$ iff $c(\neg H,E_1) < c(\neg H,E_2)$.

Theorem 19. (CCO) implies (CCO-E).

Proof. (CCO-E) trivially follows from (CCO) in the special case $E_1 = E_2$.

Theorem 20. (LL) and (CCO-E) are logically inconsistent.

Proof. Consider the following probability distribution over p-normal statements $H_1$, $H_2$ and $E$:

$p(H_1 \wedge H_2 \wedge E) = .16$; $p(H_1 \wedge H_2 \wedge \neg E) = 0$;
$p(H_1 \wedge \neg H_2 \wedge E) = .04$; $p(H_1 \wedge \neg H_2 \wedge \neg E) = 0$;
$p(\neg H_1 \wedge H_2 \wedge E) = .24$; $p(\neg H_1 \wedge H_2 \wedge \neg E) = .20$;
$p(\neg H_1 \wedge \neg H_2 \wedge E) = .06$; $p(\neg H_1 \wedge \neg H_2 \wedge \neg E) = .30$.

It can then be computed that $p(E \mid H_1) = 1 > .67 \approx p(E \mid H_2)$ and $p(E \mid \neg H_1) \approx .38 > .25 = p(E \mid \neg H_2)$. Thus, by (LL), $c(H_1,E) > c(H_2,E)$ and $c(\neg H_1,E) > c(\neg H_2,E)$, contrary to (CCO-E).
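The figures in the proof of Theorem 20 can be recomputed directly from the joint distribution. The following is a minimal sketch using exact fractions; the helper names (`p`, `neg`, `c_r`) are illustrative, and the probability ratio $c_r(H,E) = p(H \mid E)/p(H)$ is used only as one example of a measure satisfying (LL).

```python
from fractions import Fraction as F

# Joint distribution from the proof of Theorem 20.
# Keys are the truth values of (H1, H2, E).
joint = {
    (True, True, True): F(16, 100),  (True, True, False): F(0),
    (True, False, True): F(4, 100),  (True, False, False): F(0),
    (False, True, True): F(24, 100), (False, True, False): F(20, 100),
    (False, False, True): F(6, 100), (False, False, False): F(30, 100),
}
assert sum(joint.values()) == 1


def p(event, given=lambda w: True):
    """Conditional probability p(event | given) under the joint distribution."""
    num = sum(q for w, q in joint.items() if event(w) and given(w))
    den = sum(q for w, q in joint.items() if given(w))
    return num / den


def neg(statement):
    """Negation of a statement represented as a predicate on worlds."""
    return lambda w: not statement(w)


H1 = lambda w: w[0]
H2 = lambda w: w[1]
E = lambda w: w[2]


def c_r(h):
    """Probability ratio measure c_r(h, E) = p(h | E) / p(h)."""
    return p(h, E) / p(h)


print("p(E|H1) =", p(E, H1), " p(E|H2) =", p(E, H2))
print("p(E|~H1) =", p(E, neg(H1)), " p(E|~H2) =", p(E, neg(H2)))
print("c_r(H1,E) =", c_r(H1), " c_r(H2,E) =", c_r(H2))
print("c_r(~H1,E) =", c_r(neg(H1)), " c_r(~H2,E) =", c_r(neg(H2)))
```

The exact likelihoods are 1, 2/3, 3/8 and 1/4, which round to the values reported in the proof. Both hypotheses are confirmed by $E$, and the probability ratio ranks $H_1$ above $H_2$ and also $\neg H_1$ above $\neg H_2$, which is exactly the pattern that (CCO-E) rules out.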