The Counterpart Principle of Analogical Support by Structural Similarity

A. Hill∗ and J.B. Paris
School of Mathematics
The University of Manchester
Manchester M13 9PL
alex.hill@gmail.com
jeff.paris@manchester.ac.uk

August 21, 2013

∗ Supported by a UK Engineering and Physical Sciences Research Council (EPSRC) Research Studentship.

Abstract

We propose and investigate an Analogy Principle in the context of Unary Inductive Logic based on a notion of support by structural similarity which is often employed to motivate scientific conjectures.

Key words: Analogy, Inductive Logic, Probability Logic, Uncertain Reasoning.

Introduction

Starting with the founding work of Carnap, [1], there have been a number of attempts within Inductive Logic to formally capture the idea of 'analogical support' between evidence and hypotheses, for example [5], [16], [27], [28]. These attempts to date do not seem to us to have provided a comprehensive formal justification to explain the intuitive appeal of 'support by analogy', and indeed our earlier paper [10] shows that some of these formalizations have very limited applicability. By contrast in this paper we shall present an analogy principle within the context of Inductive Logic which holds widely; in particular it holds for the continua of inductive methods of Carnap and Nix-Paris. This principle is inspired by an alternative account of analogical support based on structural similarity.

The structure of this paper is as follows. After a section introducing the formal context and notation, we shall briefly motivate the notion of analogy, structural similarity, which we intend to formalize and investigate. Such a formalization, the Counterpart Principle as we shall call it, together with an investigation into conditions under which it does or does not hold, will then follow.
Context and Notation

We will be working, as usual, in the first order framework for (unary) Inductive Logic¹ where we have a predicate language $L$ with finitely many, say $q$, (unary) predicate symbols $P_1, \ldots, P_q$, constant symbols $a_1, a_2, a_3, \ldots$, the intention being that these constants enumerate the universe, and no function symbols nor equality. Let $SL$ denote the set of sentences of $L$ and $QFSL$ the quantifier free sentences of $L$. We will use θ, φ, ψ etc. for elements of $SL$ and adopt throughout the convention that if we write a sentence as $\theta(a_{i_1}, a_{i_2}, \ldots, a_{i_n})$ then all the constants appearing in θ are amongst these $a_{i_1}, a_{i_2}, \ldots, a_{i_n}$, though they need not all actually appear.

In this context, where the $a_i$ are intended to enumerate the universe, we define a Probability Function on $L$ to be a map $w : SL \to [0, 1]$ such that for all θ, φ, ∃xψ(x) ∈ $SL$:

(P1) If ⊨ θ then w(θ) = 1.
(P2) If θ ⊨ ¬φ then w(θ ∨ φ) = w(θ) + w(φ).
(P3) $w(\exists x\, \psi(x)) = \lim_{m \to \infty} w\left(\bigvee_{i=1}^{m} \psi(a_i)\right)$.

¹ Our general approach to Inductive Logic, see for example [21], [23], is actually close to what Carnap in [3] termed Pure Inductive Logic, the intention of 'Pure' here being similar to that in 'Pure Mathematics'. That is, our aim is to investigate the logic itself in isolation, devoid of any particular intended real world interpretation. Of course the rational principles with which Pure Inductive Logic is concerned are (and indeed should be if the subject is not to become simple mathematics for its own sake) almost invariably motivated by real world examples. But once such principles are formulated it is then the task of Pure Inductive Logic to investigate their consequences as they stand, without borrowing further from any particular, special, interpretation. It is akin to studying a differential equation which has arisen in modeling, say, a bridge. Its solutions apply to any model for which the equation is appropriate, and if they do not apply to the particular bridge we started off with then that is because the original formulation was somehow lacking, not that the differential equation itself is at fault.

The following theorem, due to Gaifman, [8], shows that a probability function w is actually already determined by its action on $QFSL$.

Theorem 1. Let $w : QFSL \to [0, 1]$ satisfy (P1), (P2) for θ, φ ∈ $QFSL$. Then w has a unique extension to $SL$ satisfying (P1-3).

The aim in Inductive Logic, as we view it, is to pick out probability functions on $L$ which are arguably logical or rational in the sense that they could be the choice of a rational agent. Or to put it another way, to discard probability functions which could be judged in some sense to be 'irrational'.² The usual method of thinning down towards such rational choices is to impose 'rationality principles' which these probability functions should arguably satisfy. Of course there can be considerable disagreement about which principles are rational (to the extent of different candidates being mutually inconsistent, see for example [20]) but one such widely accepted principle is that the inherent symmetry between the constants should be respected by any rational probability function w on $L$. Precisely, w should satisfy:

The Constant Exchangeability Principle (Ex) For θ, θ′ ∈ $QFSL$, if θ′ is obtained from θ by replacing the constant symbols³ $a_{i_1}, a_{i_2}, \ldots, a_{i_n}$ in θ by $a_{k_1}, a_{k_2}, \ldots, a_{k_n}$, respectively, then w(θ) = w(θ′).

It is worth remarking that Ex in fact implies this same principle even for θ, θ′ ∈ $SL$. Thus in Theorem 1 if w satisfies Ex then so will its extension to $SL$.

The atoms⁴ of the language $L$ are the formulae $\alpha_i(x)$, $i = 1, 2, \ldots, 2^q$, of the form $\pm P_1(x) \wedge \pm P_2(x) \wedge \ldots \wedge \pm P_q(x)$ where $\pm P_j$ stands for either $P_j$ or $\neg P_j$.
The atoms are disjoint and exhaustive, so if w is a probability function on $L$ then by (P1-2), for any m,
\[ w(\alpha_1(a_m)),\, w(\alpha_2(a_m)),\, \ldots,\, w(\alpha_{2^q}(a_m)) \ge 0 \quad \text{and} \quad \sum_i w(\alpha_i(a_m)) = 1. \]
Conversely for any such vector of non-negative numbers $x_i$ with sum 1 there is a probability function w on $L$ with $w(\alpha_i(a_m)) = x_i$ for $i = 1, 2, \ldots, 2^q$. Namely for
\[ \vec{x} \in D_{2^q} = \Bigl\{ \langle x_1, \ldots, x_{2^q} \rangle \;\Big|\; x_i \ge 0,\ \sum_{i=1}^{2^q} x_i = 1 \Bigr\} \]
define $w_{\vec{x}}$ on state descriptions, that is sentences of the form $\bigwedge_{i=1}^{n} \alpha_{h_i}(a_{k_i})$, by
\[ w_{\vec{x}}\left( \bigwedge_{i=1}^{n} \alpha_{h_i}(a_{k_i}) \right) = \prod_{i=1}^{n} x_{h_i} = \prod_{j=1}^{2^q} x_j^{n_j} \]
where $n_j$ is the number of times that the atom $\alpha_j$ occurs amongst the $\alpha_{h_i}$. Using the Disjunctive Normal Form Theorem it is easy to see that $w_{\vec{x}}$ can be extended to $QFSL$ to satisfy (P1-2) and Ex and hence by Theorem 1 to a probability function on $L$, which continues to satisfy Ex. Clearly also $w_{\vec{x}}(\alpha_i(a_m)) = x_i$ for $i = 1, 2, \ldots, 2^q$ (and any $a_m$).

² It is interesting that 'irrationality' seems much easier to spot than 'rationality'.
³ In such lists all the terms will be assumed to be distinct unless otherwise stated.
⁴ Corresponding to Carnap's Q-predicates within the present framework.

In what follows we take Ex as a standing assumption. This allows us access to a powerful representation theorem due to de Finetti, see [7].⁵

De Finetti's Representation Theorem If the probability function w on $L$ satisfies Ex then there is a (countably additive) measure μ on $D_{2^q}$ such that for θ ∈ $SL$,
\[ w(\theta) = \int_{D_{2^q}} w_{\vec{x}}(\theta)\, d\mu(\vec{x}). \tag{1} \]
Conversely if w is defined by (1) then w is a probability function on $L$ satisfying Ex.

We refer to the measure μ here as the de Finetti prior of w.

⁵ As given here the converse direction assumes a result due to Gaifman on restrictions of probability functions, see [8].

In [9] Gaifman showed the following consequence of de Finetti's Representation Theorem:⁶
Theorem 2. If the probability function w on $L$ satisfies Ex then it satisfies:

The Principle of Instantial Relevance (PIR) For ψ ∈ $QFSL$, α an atom of $L$ and $a_i, a_j$ distinct constant symbols not mentioned in ψ,⁷
\[ w(\alpha(a_i) \mid \alpha(a_j) \wedge \psi) \ge w(\alpha(a_i) \mid \psi). \tag{2} \]

⁶ One may have hoped that such an elementary and comprehensible statement as this theorem would have a correspondingly elementary and comprehensible proof, using only simple applications of (P1-2) say. Unfortunately we know of no such proof and quite commonly it appears that in this subject we cannot avoid diversions into 'higher mathematics'. We will later see an (apparently) similar situation with the use of Theorems 10, 11 to conclude, in the discussion following Theorem 11, that Carnap's $c_\lambda$, for 0 < λ < ∞, satisfy our forthcoming Counterpart Principle.

In short the additional evidence $\alpha(a_j)$ enhances (or at least does not decrease) the probability of $\alpha(a_i)$ conditional on evidence ψ.

Extending the idea of symmetry between symbols of the language, we might also feel it rational to require that the predicates be exchangeable.

The Predicate Exchangeability Principle (Px) For θ, θ′ ∈ $QFSL$, if θ′ is obtained from θ by replacing the (distinct) predicate symbols $P_{j_1}, P_{j_2}, \ldots, P_{j_m}$ appearing in θ by the (distinct) predicate symbols $P_{s_1}, P_{s_2}, \ldots, P_{s_m}$ respectively, then w(θ) = w(θ′).

Using Theorem 1 it is easy to show that Px implies that the same property holds even when θ, θ′ are sentences of $L$ rather than just quantifier free sentences, a fact that we shall use without further mention in what follows.

The following is a stronger symmetry based principle which is also frequently seen in Inductive Logic; for example, it is satisfied by both Carnap's Continuum of Inductive Methods and the Nix-Paris Continuum, [20].
The Atom Exchangeability Principle (Ax) If σ is a permutation of $1, 2, \ldots, 2^q$, then
\[ w\left( \bigwedge_{i=1}^{n} \alpha_{h_i}(a_{j_i}) \right) = w\left( \bigwedge_{i=1}^{n} \alpha_{\sigma(h_i)}(a_{j_i}) \right). \]

Note that Ax (with Ex) implies that the probability of a state description $\theta = \bigwedge_{i=1}^{n} \alpha_{h_i}(a_{j_i})$ depends only on the multiset⁸ $\{n_1, n_2, \ldots, n_{2^q}\}$ where $n_j$ is the number of times that the atom $\alpha_j$ appears amongst the $\alpha_{h_i}$. We call this multiset the spectrum of θ. It follows then that stated in this form Ax implies Px.

⁷ In order to circumvent any problems when the conditioning sentence has probability zero we take an expression such as $w(\varphi_1 \mid \psi_1) \ge w(\varphi_2 \mid \psi_2)$ to be short for the inequality formed by multiplying out the denominators, i.e. $w(\varphi_1 \wedge \psi_1) \cdot w(\psi_2) \ge w(\varphi_2 \wedge \psi_2) \cdot w(\psi_1)$ in this case. So PIR as defined will automatically hold when $w(\alpha(a_j) \wedge \psi) = 0$ since both sides of the inequality will then be zero.
⁸ A multiset is just a set in which we allow the same element to appear more than once. The usual convention here is that $n_j$ which are zero are omitted from mention in the spectrum. However in this case it will be convenient to include them.

In addition to symmetry there have been numerous attempts to incorporate principles based on analogy into Inductive Logic, initially by Carnap, for example in [4], by Carnap & Stegmüller [5] and later for example by Festa [6], Kuipers [13], Maher [16], [17], di Maio [18], Niiniluoto [19], Romeijn [27], Skyrms [28]. Generally these have considered analogy as deriving from the sharing of similar or identical properties by the constants. In other words, a principle has been sought based on a notion of distance between atoms. This approach means that Ax is violated. Although this may not be a failing of such principles it is interesting to note that the alternative conception of analogy presented in this paper is consistent with Ax and is widely satisfied, including by the probability functions of the afore-mentioned continua.
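The definitions in this section can be checked numerically on small cases. The following is a minimal sketch rather than anything taken from the paper: it assumes atoms are indexed 0, 1, ..., 2^q − 1, represents a state description by the list of atom indices of its constants, and replaces the integral in (1) by a finite uniform mixture of the w_x. The names `w_x`, `mixture`, `spectrum` and `pir_holds` are hypothetical helpers introduced only for this illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations
from math import prod


def w_x(x, atoms):
    """Value of w_x on the state description whose i-th constant satisfies atom atoms[i]."""
    return prod((x[h] for h in atoms), start=Fraction(1))


def mixture(points):
    """Uniform finite mixture of the w_x: a discrete stand-in for the integral in (1)."""
    def w(atoms):
        return sum(w_x(x, atoms) for x in points) / len(points)
    return w


def spectrum(atoms, num_atoms):
    """Sorted atom counts of a state description, zeros included as in the text."""
    counts = Counter(atoms)
    return sorted(counts[j] for j in range(num_atoms))


def pir_holds(w, alpha, psi):
    """PIR in the cross-multiplied form of footnote 7, for psi a state description
    on constants other than a_i, a_j (Ex lets us append the new constants at the end)."""
    return w(psi + [alpha, alpha]) * w(psi) >= w(psi + [alpha]) ** 2


# q = 2, so four atoms. Mixing over every permutation of one point makes the
# mixture invariant under permutations of atoms, i.e. it satisfies Ax.
base = (Fraction(7, 10), Fraction(1, 5), Fraction(1, 10), Fraction(0))
w = mixture(list(set(permutations(base))))

print(w([0, 0, 1]), w([2, 2, 3]))   # same spectrum, so the two values agree
print(pir_holds(w, 0, [0, 1, 3]))   # PIR for alpha = atom 0 and evidence psi
```

Exact rational arithmetic via `Fraction` is used so that the equalities required by Ax can be compared with `==` rather than up to floating point error.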
Support by Structural Similarity

As a toy example⁹ of what we have in mind by 'support by structural similarity' suppose that I am observing windfall pears and the first three I see are all green. Then that should not decrease my belief that the next three pears observed will all have a maggot, since my first observation of the three green pears suggests that these pears are a pretty uniform bunch, so no less likely to all have a maggot than was the case before I had observed the three green pears.

⁹ Arising out of the discussion with one of the reviewers.

For an everyday example consider the assertion that there is, or at least once has been, life on Mars. Most of us (we imagine) would think this was indeed worth the multi-billion dollars so far spent trying to confirm it. Equally however for most of us this is (we again imagine) largely based on the similarity between the Earth and Mars. They are both planets with fairly similar ages and orbits, and both have atmospheres of sorts. On top of that 'Earth-Life' is abundant almost across the globe so isn't 'Mars-Life' just a missing piece in the jigsaw, in the pattern, something almost to be expected?

Interestingly we can give here a third example which directly relates to Theorem 3 of this paper. As we have seen, in the presence of Constant Exchangeability, Ex, we have the Principle of Instantial Relevance, PIR, saying that, in the notation of (2), the additional evidence $\alpha(a_j)$ provides support for $\alpha(a_i)$. In that case we might conjecture that there should be some similar 'Support Principle' when we have Predicate Exchangeability in place of Constant Exchangeability. As we shall see this is exactly the case.

This last example is by no means uncommon within our experience of doing mathematics. In fact with most conjectures one makes there is somewhere in the background a structurally similar situation in which the corresponding conjecture has been confirmed.
This is not to say that this is invariably the right goal to aim for; quite often one's conjecture based on this form of analogy turns out to be false, but what it does seem to show is that in forming and attempting to confirm such conjectures we are implicitly giving credence to a principle of support by structural similarity. Nor, of course, do our experiences here seem to differ from those of our fellow mathematicians, or even scientists in general; in [25], [26] Polya gives a nice account of just such analogical reasoning.

The Counterpart Principle

In view of the above discussion we are led to propose, within the context of Pure Inductive Logic, the following as a principle of analogical reasoning:

Counterpart Principle (CP)¹⁰ For any θ ∈ $SL$, if θ′ ∈ $SL$ is obtained by replacing the predicate and constant symbols appearing in θ by (distinct) new ones not occurring in θ then
\[ w(\theta \mid \theta') \ge w(\theta). \tag{3} \]

Our plan now in this section is to show that the Counterpart Principle, CP, is rather widely satisfied. In the section which follows we will make some observations on when we can have strict inequality in (3).

¹⁰ Superficially this principle might be thought to resemble the Analogieschluss ('inference by analogy') of Carnap & Stegmüller dating back to their [5, p226]. The intuition here is that for state descriptions
\[ \psi(a_1, \ldots, a_n) = \bigwedge_{i=1}^{n} \alpha^{L_1}_{h_i}(a_i), \qquad \varphi(a_1, \ldots, a_n) = \bigwedge_{i=1}^{n} \alpha^{L_2}_{g_i}(a_i) \]
for disjoint languages $L_1, L_2$ the probability of the state description $\psi(a_1, \ldots, a_n) \wedge \varphi(a_1, \ldots, a_n)$ of $L_1 \cup L_2$ should be greater the more often $h_i = h_j \iff g_i = g_j$. The difference here with CP is that here the analogy or link is between the relations of indistinguishability between the same $a_i$ engendered by ψ and φ whereas in CP it is between the forms of sentences θ, θ′ involving disjoint $a_i$. (For more on Carnap & Stegmüller's Analogieschluss within the approach to Inductive Logic taken here see [15], [23].)
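As a concrete instance of the Counterpart Principle (an illustration added here, assuming q ≥ 4 so that the required predicate symbols are available in L), take a sentence θ and the counterpart θ′ obtained by renaming every symbol. With the zero-probability convention of footnote 7, the conditional inequality (3) unpacks into a product form:

```latex
\[
\theta = P_1(a_1) \wedge \neg P_2(a_2), \qquad
\theta' = P_3(a_3) \wedge \neg P_4(a_4),
\]
\[
w\bigl(P_1(a_1) \wedge \neg P_2(a_2) \,\big|\, P_3(a_3) \wedge \neg P_4(a_4)\bigr)
\;\ge\; w\bigl(P_1(a_1) \wedge \neg P_2(a_2)\bigr),
\]
\[
\text{i.e.} \quad w(\theta \wedge \theta') \;\ge\; w(\theta)\, w(\theta').
\]
```

When w satisfies Ex and Px we have w(θ′) = w(θ), so the right hand side of the last line is simply w(θ)².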
We need the following notion: A probability function w on a language $L$ is said to satisfy Unary Language Invariance, ULi, if there is a family of probability functions $w^{\mathcal{L}}$, one on each unary language $\mathcal{L}$, each satisfying Px, such that $w^{L} = w$ and whenever $\mathcal{L} \subset \mathcal{L}'$ then $w^{\mathcal{L}'}$ restricted to $S\mathcal{L}$ (notice that $S\mathcal{L} \subseteq S\mathcal{L}'$) equals $w^{\mathcal{L}}$. We say that w satisfies ULi with Ax if in addition we can choose these $w^{\mathcal{L}}$ to satisfy Ax.

Our argument in favour of it being rational for a probability function w on $L$ to satisfy ULi is that it appears reasonable to suppose that were we to enlarge the language then w could be extended to this larger language; after all it would seem unjustified to assume from the start that $L$ was all the language there could ever be (a point made by Kemeny already in [12]). Similarly if we are assuming the position that P is a property our rational choice of probability function w on $L$ should possess then we should equally demand this property of our chosen extension to larger languages. This then provides a rational justification for Language Invariance. In particular we should require that the probability functions in this family satisfy Px (and Ex by standing assumption), thus yielding ULi.

Note that ULi equivalently means that w can be extended to a probability function w∞ on the infinite language L∞ = $\{P_1, P_2, P_3, \ldots\}$ satisfying Px. In more detail, given the ULi family $\{w^{\mathcal{L}}\}$ we let $L_n$ be the language with predicate symbols $\{P_1, P_2, \ldots, P_n\}$ and for θ ∈ SL∞ set w∞(θ) = $w^{L_n}(\theta)$ for any n sufficiently large that θ ∈ $SL_n$. Conversely, given such a probability function w∞ and a language $\mathcal{L}$ with predicate symbols $\{R_1, R_2, \ldots, R_n\}$ we can define the required $w^{\mathcal{L}}(\theta)$ to be w∞(θ′) where θ′ is the result of replacing each $R_i$ in θ by $P_i$. (Notice that by Px this is independent of how we list the predicate symbols of $\mathcal{L}$.)

Theorem 3. Let w satisfy ULi. Then w satisfies the Counterpart Principle, CP.

Proof.
Assume that w satisfies ULi and let w∞ be a probability function on the infinite (unary) language L∞ = $\{P_1, P_2, P_3, \ldots\}$ extending w and satisfying Px. Let θ, θ′ be as in the statement of CP. Without loss of generality assume that all the constant symbols appearing in θ are amongst $a_1, a_2, \ldots, a_k$, all the relation symbols appearing in θ are amongst $P_1, P_2, \ldots, P_j$, and for θ′ they are correspondingly $a_{k+1}, a_{k+2}, \ldots, a_{2k}$, $P_{j+1}, P_{j+2}, \ldots, P_{2j}$. So we can write
\[ \theta = \theta(a_1, a_2, \ldots, a_k, P_1, P_2, \ldots, P_j), \qquad \theta' = \theta(a_{k+1}, a_{k+2}, \ldots, a_{2k}, P_{j+1}, P_{j+2}, \ldots, P_{2j}). \]
With this in place let
\[ \theta_{i+1} = \theta(a_{ik+1}, a_{ik+2}, \ldots, a_{(i+1)k}, P_{ij+1}, P_{ij+2}, \ldots, P_{(i+1)j}) \in SL_\infty \]
so $\theta_1 = \theta$, $\theta_2 = \theta'$.

Let $\mathcal{L}$ be the unary language with a single unary relation symbol $P$ and define $\tau : QFS\mathcal{L} \to SL_\infty$ by
\[ \tau(P(a_i)) = \theta_i, \qquad \tau(\neg\varphi) = \neg\tau(\varphi), \qquad \tau(\varphi \wedge \psi) = \tau(\varphi) \wedge \tau(\psi), \quad \text{etc.} \]
for φ, ψ ∈ $QFS\mathcal{L}$. Now define $v : QFS\mathcal{L} \to [0, 1]$ by
\[ v(\varphi) = w_\infty(\tau(\varphi)). \]
Then since w∞ satisfies (P1-2) (on SL∞) so does v (on $QFS\mathcal{L}$). Also since w∞ satisfies Ex + Px, for φ ∈ $QFS\mathcal{L}$, permuting the $\theta_i$ in w∞(τ(φ)) will leave this value unchanged, so permuting the $a_i$ in φ will leave v(φ) unchanged, i.e. v satisfies Ex.

By Theorem 1 v has an extension to a probability function on $S\mathcal{L}$ satisfying Ex and hence satisfying PIR by Theorem 2. In particular then
\[ v(P(a_1) \mid P(a_2)) \ge v(P(a_1)). \]
But since $\tau(P(a_1)) = \theta$, $\tau(P(a_2)) = \theta'$ this amounts to
\[ w_\infty(\theta \mid \theta') \ge w_\infty(\theta) \]
and hence gives the Counterpart Principle for w since w∞ agrees with w on $SL$.

We remark that by a small amendment of the proof this conclusion, CP, can be strengthened to require that only some of the constant and predicate symbols in θ are changed when forming θ′.
Furthermore by then using w∞ conditioned on ψ in place of w∞ throughout this proof we can further strengthen the conclusion to
\[ w(\theta \mid \theta' \wedge \psi) \ge w(\theta \mid \psi) \]
for any ψ ∈ $SL$ mentioning only constant and predicate symbols common to both θ and θ′.

The condition ULi required for Theorem 3 holds for the $c^L_\lambda$ of Carnap's Continuum of Inductive Methods, [2], and also for the $w^\delta_L$ of the Nix-Paris Continuum¹¹ defined by
\[ w^\delta_L = 2^{-q} \sum_{j=1}^{2^q} w_{\vec{e}_j} \]
where $\vec{e}_j = \langle \gamma, \gamma, \ldots, \gamma, \gamma + \delta, \gamma, \ldots, \gamma, \gamma \rangle$, the δ occurring in the jth coordinate, $\gamma = 2^{-q}(1 - \delta)$ and 0 ≤ δ ≤ 1. Indeed they both satisfy the stronger condition of ULi with Ax, the language invariant families being given by fixing λ, respectively δ, and varying $L$, see [20], [21], or for a more technical account [14] or [23].

¹¹ See [20] for an explanation of how this continuum, like Carnap's Continuum of Inductive Methods, is singled out by arguably rational principles.

It is worth noting that we cannot do without ULi here: Ex and Px alone do not guarantee that a probability function satisfies CP. As an example here let q = 2 and take w to be the probability function¹²
\[ w = 4^{-1}\left( w_{\langle \frac12, \frac12, 0, 0 \rangle} + w_{\langle \frac12, 0, \frac12, 0 \rangle} + w_{\langle 0, \frac12, 0, \frac12 \rangle} + w_{\langle 0, 0, \frac12, \frac12 \rangle} \right). \]
Then w satisfies Ex and Px. However for
\[ \theta = (P_1(a_1) \wedge \neg P_1(a_2)), \qquad \theta' = (P_2(a_3) \wedge \neg P_2(a_4)), \]
a straightforward calculation shows that
\[ w(\theta \mid \theta') = 0 < w(\theta) = \tfrac{1}{8}. \]
Hence CP fails for this function.

A second argument for restricting attention here to probability functions satisfying ULi (equivalently to probability functions on L∞) is that without it the lack of available predicates from which to form θ′ from θ becomes a significant nuisance factor. Given this, and the fact that the main interest in Inductive Logic is in probability functions satisfying Ax, we shall henceforth limit our attention to probability functions satisfying ULi with Ax.
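The 'straightforward calculation' for the four-point mixture above can be reproduced by brute force over state descriptions. The sketch below is an illustration under stated assumptions, not the authors' code: atoms are indexed 0 to 3 in the order P1∧P2, P1∧¬P2, ¬P1∧P2, ¬P1∧¬P2, and for contrast it also evaluates Carnap's c_λ using the standard sequential rule (n_h + λ/2^q)/(n + λ), a formula the text above does not spell out, with the arbitrary choice λ = 2. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import prod

NUM_ATOMS = 4  # q = 2; atom index h in 0..3 encodes the truth values of (P1, P2)


def P1(h):
    return h in (0, 1)   # atoms 0 and 1 contain P1 positively


def P2(h):
    return h in (0, 2)   # atoms 0 and 2 contain P2 positively


# The four points <1/2,1/2,0,0>, <1/2,0,1/2,0>, <0,1/2,0,1/2>, <0,0,1/2,1/2>.
POINTS = [tuple(Fraction(n, 2) for n in p)
          for p in [(1, 1, 0, 0), (1, 0, 1, 0), (0, 1, 0, 1), (0, 0, 1, 1)]]


def w_mix(state):
    """The mixture w from the text, evaluated on a state description."""
    return sum(prod(x[h] for h in state) for x in POINTS) / len(POINTS)


def c_lambda(state, lam=Fraction(2)):
    """Carnap's c_lambda via the sequential rule (assumed standard form, lambda = 2)."""
    value, counts = Fraction(1), [0] * NUM_ATOMS
    for i, h in enumerate(state):
        value *= (counts[h] + lam / NUM_ATOMS) / (i + lam)
        counts[h] += 1
    return value


def prob(sd_prob, sentence, n_constants=4):
    """Probability of a quantifier free sentence about a1..a4, summed over state descriptions."""
    return sum(sd_prob(s) for s in product(range(NUM_ATOMS), repeat=n_constants)
               if sentence(s))


def theta(s):        # P1(a1) and not P1(a2)
    return P1(s[0]) and not P1(s[1])


def theta_prime(s):  # P2(a3) and not P2(a4)
    return P2(s[2]) and not P2(s[3])


def both(s):
    return theta(s) and theta_prime(s)


for name, f in [("mixture", w_mix), ("carnap", c_lambda)]:
    joint, single = prob(f, both), prob(f, theta)
    # CP in cross-multiplied form: w(theta & theta') >= w(theta) * w(theta')
    print(name, joint, single, joint >= single * prob(f, theta_prime))
```

For the mixture the joint probability comes out as 0 against w(θ) = 1/8, matching the values quoted above, so CP fails; for c_λ with λ = 2 the cross-multiplied inequality holds, as Theorem 3 predicts.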
The Strict Counterpart Principle

In the previous section we considered the version of the Counterpart Principle, (3), asserting that the evidence θ′ does not decrease the probability of θ. We now make some observations on when we can, and cannot, assert that θ′ strictly increases the probability of θ, what one might call the 'Strict Counterpart Principle'. In fact we can never have this for all θ since for θ a tautology or contradiction θ′ will have the same status and (3) will simply give equality. In this section we shall look at some non-tautologous non-contradictory θ for which all probability functions satisfying ULi with Ax fail to give a strict inequality. First however we shall look at the other end of the spectrum, probability functions satisfying ULi with Ax which fail the strict version of CP for all θ.

¹² It is straightforward to see that convex combinations of probability functions also satisfy (P1-3) and hence are themselves probability functions.

A particular class of well understood functions (see for example [11], [22], [24]) satisfying ULi with Ax, and so CP, are those which satisfy:

Weak Irrelevance Principle (WIP) If θ, φ ∈ $SL$ have no constant or predicate symbols in common then w(θ | φ) = w(θ).

Clearly WIP implies that CP always holds with equality. For probability functions satisfying WIP we have a precise characterization which we now explain, in part because this notation will be required later.

Let $B$ be the set of infinite sequences
\[ p = \langle p_0, p_1, p_2, p_3, \ldots \rangle \]
of reals such that $p_0 \ge 0$, $p_1 \ge p_2 \ge p_3 \ge \ldots \ge 0$ and $\sum_{i=0}^{\infty} p_i = 1$. For $p \in B$ and $f : \{1, 2, \ldots, n\} \to \{1, 2, \ldots, 2^q\}$ let $R_{p,n} = 1 - \sum_{j=1}^{n} p_j$ and designate
\[ f(p) = \Bigl\langle 2^{-q} R_{p,n} + \sum_{f(j)=1} p_j,\ 2^{-q} R_{p,n} + \sum_{f(j)=2} p_j,\ \ldots,\ 2^{-q} R_{p,n} + \sum_{f(j)=2^q} p_j \Bigr\rangle \in D_{2^q}. \]
Now let $u^{p,L}_n$ be the probability function defined by
\[ u^{p,L}_n = 2^{-nq} \sum_{f} w_{f(p)} \]
where the f range over all functions $f : \{1, 2, \ldots, n\} \to \{1, 2, \ldots, 2^q\}$ and for θ ∈ $QFSL$ define
\[ u^{p,L}(\theta) = \lim_{n \to \infty} u^{p,L}_n(\theta). \]
This limit exists and by Theorem 1 $u^{p,L}$ extends to a probability function on $L$. The fact that the $u^{p,L}_n$ satisfy Ex and Ax carries over to $u^{p,L}$; indeed as we vary $L$ the $u^{p,L}$ form a language invariant family so $u^{p,L}$ satisfies ULi with Ax.

Notice that if $p_{n+1} = 0$ in $p$ then $u^{p,L} = u^{p,L}_n$. In particular for 0 ≤ δ ≤ 1 and $p = \langle 1 - \delta, \delta, 0, 0, \ldots \rangle$,
\[ w^\delta_L = u^{p,L} = u^{p,L}_1, \]
so these $u^{p,L}$ include the Nix-Paris Continuum.

A generalization (to polyadic languages) of the following theorem is proved in [22].

Theorem 4. The $u^{p,L}$ are exactly the probability functions on $L$ satisfying ULi with Ax and WIP.

This theorem then provides a family of probability functions which have equality in CP (in the presence of ULi with Ax) for all θ ∈ $SL$. We would conjecture that conversely these $u^{p,L}$ are the only probability functions with this property.

We now turn to look at non-tautologous non-contradictory sentences θ which guarantee equality in CP for any probability function satisfying ULi with Ax. To describe these we first need to introduce some more notation. For θ ∈ $QFSL$, let $f_\theta(\tilde{n})$ denote the number of state descriptions with spectrum $\tilde{n} = \{n_1, n_2, \ldots, n_{2^q}\}$ appearing in the Disjunctive Normal Form of θ.¹³ Note that for any probability function w satisfying Ax and any sentence θ,
\[ w(\theta) = \sum_{\tilde{n}} f_\theta(\tilde{n})\, w(\tilde{n}) \]
where $w(\tilde{n})$ is the value of w on some/any state description with spectrum $\tilde{n}$. The following lemma appears in [23] but for completeness we include a proof here.

¹³ For some fixed set of constants which includes all the constants mentioned in θ, though the particular fixed set is not important in what follows.

Lemma 5. Let θ ∈ $QFSL$ be such that for any probability function w on $L$ satisfying Ax, w(θ) = c, equivalently
\[ \sum_{\tilde{n}} f_\theta(\tilde{n})\, w(\tilde{n}) = c, \tag{4} \]
for some constant c. Then for each $\tilde{n}$, $f_\theta(\tilde{n}) = c f_\top(\tilde{n})$.

Proof. Given reals $s_1, s_2, \ldots, s_{2^q} \ge 0$, and not all zero, let $v_{\vec{s}}$ be the probability function on $L$ such that
\[ v_{\vec{s}}(\tilde{n}) = (2^q!)^{-1} \sum_{\sigma} s_{\sigma(1)}^{n_1} s_{\sigma(2)}^{n_2} \cdots s_{\sigma(2^q)}^{n_{2^q}} \, (s_1 + s_2 + \ldots + s_{2^q})^{-m} \]
where σ ranges over all permutations of $1, 2, \ldots, 2^q$ and $m = n_1 + n_2 + \ldots + n_{2^q}$. Then $v_{\vec{s}}$ satisfies Ax and (4) together with the fact that $v_{\vec{s}}(\top) = 1$ gives that
\[ \sum_{\tilde{n}} f_\theta(\tilde{n}) (2^q!)^{-1} \sum_{\sigma} s_{\sigma(1)}^{n_1} s_{\sigma(2)}^{n_2} \cdots s_{\sigma(2^q)}^{n_{2^q}} = c\, v_{\vec{s}}(\top)\, (s_1 + s_2 + \ldots + s_{2^q})^{m} = c \sum_{\tilde{n}} f_\top(\tilde{n}) (2^q!)^{-1} \sum_{\sigma} s_{\sigma(1)}^{n_1} s_{\sigma(2)}^{n_2} \cdots s_{\sigma(2^q)}^{n_{2^q}}. \]
Since we can take each $s_i$ to be algebraically independent this is only possible if the coefficients of $s_1^{n_1} s_2^{n_2} \cdots s_{2^q}^{n_{2^q}}$ on both sides agree, from which the result follows.

We shall refer to a θ ∈ $QFSL$ such that w(θ) = c for all probability functions w on $L$ satisfying Ax as being of constant type. Notice that in this case c must be rational with denominator (when in lowest form) which divides all the $f_\top(\tilde{n})$.

As Lemma 5 is stated it seems possible that, since the definition of a constant type sentence depends on the overlying language, θ could be of constant type for $L$ but not for $L'$ even though θ ∈ $SL \cap SL'$. The next lemma shows that this is not the case.

Lemma 6. Suppose that θ ∈ $SL \cap SL'$ and w(θ) = c for all probability functions w on $L$ satisfying Ax. Then w′(θ) = c for all probability functions w′ on $L'$ satisfying Ax.

Proof. It is enough to check this in the cases where $L \subset L'$ and where $L' \subset L$. In the former case the result is immediate since $w'{\restriction}SL$ (w′ restricted to $SL$) still satisfies Ax and $w'{\restriction}SL(\theta) = w'(\theta)$.

So assume that $L' \subset L$. Then by Theorem 33.1 of [23] there is λ ≥ 0 and probability functions $w_1, w_2$ on $L$ satisfying Ax such that
\[ w' = (1 + \lambda)\, w_1{\restriction}SL' - \lambda\, w_2{\restriction}SL'. \tag{5} \]
Hence
\[ w'(\theta) = (1 + \lambda)\, w_1{\restriction}SL'(\theta) - \lambda\, w_2{\restriction}SL'(\theta) = (1 + \lambda)\, w_1(\theta) - \lambda\, w_2(\theta) = (1 + \lambda) c - \lambda c = c, \]
the third equality holding since θ is of constant type for $L$, as required.

Theorem 7. For any θ ∈ $QFSL$ of constant type and any φ ∈ $QFSL$ that has no predicate or constant symbols in common with θ,
\[ w(\theta \mid \varphi) = w(\theta) \]
for all w satisfying Ax.

Proof. Suppose that θ and φ have no constant or predicate symbols in common, and that w(θ) = m/n for all probability functions w satisfying Ax. Clearly the result holds if m = 0 so assume that m > 0.
Let $L_1$ be the set (i.e. language) of all predicate symbols occurring in θ. By putting θ in Disjunctive Normal Form (DNF) we can express θ as a disjunction of state descriptions from $L_1$. By Lemma 5, $f_\theta(\tilde{n}) = (m/n) f_\top(\tilde{n})$ for all spectra $\tilde{n}$ of $L_1$. For each spectrum $\tilde{n}$, let $r(\tilde{n}) = n^{-1} f_\top(\tilde{n})$ and partition the set of state descriptions of $L_1$ with spectrum $\tilde{n}$ into n sets $A_1(\tilde{n}), A_2(\tilde{n}), \ldots, A_n(\tilde{n})$, each containing $r(\tilde{n})$ state descriptions, so that the union of the first m of these sets constitutes all the state descriptions in the DNF of θ with spectrum $\tilde{n}$.

Notice that for ξ, ψ state descriptions of $L_1$ (for the same constants) with spectrum $\tilde{n}$ there is a permutation of atoms of $L$ which sends ξ ∧ η to ψ ∧ η for any state description η of $L - L_1$ (with disjoint constants of course). Hence, by Ax, w(ξ ∧ η) = w(ψ ∧ η), and in turn w(ξ ∧ φ) = w(ψ ∧ φ) by taking the DNF equivalent of φ in $L - L_1$. So
\[ w(\varphi \wedge \theta) = w\Bigl( \varphi \wedge \bigvee_{\tilde{n}} \bigvee_{j=1}^{m} \bigvee_{\psi \in A_j(\tilde{n})} \psi \Bigr) = w\Bigl( \bigvee_{\tilde{n}} \bigvee_{j=1}^{m} \bigvee_{\psi \in A_j(\tilde{n})} (\psi \wedge \varphi) \Bigr) = \sum_{\tilde{n}} \sum_{j=1}^{m} \sum_{\psi \in A_j(\tilde{n})} w(\psi \wedge \varphi) = \sum_{\tilde{n}} m\, r(\tilde{n})\, w(\psi_{\tilde{n}} \wedge \varphi) \]
for some/any state description $\psi_{\tilde{n}}$ of $L_1$ (for the same constants) with spectrum $\tilde{n}$. Exactly similarly by replacing θ by ⊤,
\[ w(\varphi) = \sum_{\tilde{n}} n\, r(\tilde{n})\, w(\psi_{\tilde{n}} \wedge \varphi) = (n/m)\, w(\varphi \wedge \theta) \]
so since w(θ) = m/n the required identity follows.

It is worth remarking here that this theorem can, with some more effort, be proved even for φ ∈ $SL$ (see [23]) though we will not need that stronger version here.

From Theorem 7 we have the following corollary.

Corollary 8. For any θ ∈ $QFSL$ of constant type and any θ′ obtained by replacing all constant and predicate symbols in θ by new ones,
\[ w(\theta \mid \theta') = w(\theta) \]
for all w satisfying Ax.

The converse to Corollary 8 is easily shown.

Proposition 9. Suppose that θ, θ′ ∈ $QFSL$ with θ′ the result of replacing all constant and predicate symbols in θ by new ones and
\[ w(\theta \mid \theta') = w(\theta) \]
for all w satisfying Ax. Then θ is of the constant type.

Proof. Let $w_1, w_2$ be distinct probability functions satisfying Ax.
Then the probability function $2^{-1}(w_1 + w_2)$ also satisfies Ax and so by assumption,
\[ 2^{-1}(w_1 + w_2)(\theta \wedge \theta') = 2^{-1}(w_1 + w_2)(\theta) \cdot 2^{-1}(w_1 + w_2)(\theta') = \bigl( 2^{-1}(w_1 + w_2)(\theta) \bigr)^2 \]
since by Ax w(θ) = w(θ′). Multiplying out and re-arranging we get
\[ 2 w_1(\theta \wedge \theta') + 2 w_2(\theta \wedge \theta') = w_1(\theta)^2 + 2 w_1(\theta) w_2(\theta) + w_2(\theta)^2. \]
Using the assumption this gives that
\[ 2 w_1(\theta)^2 + 2 w_2(\theta)^2 = w_1(\theta)^2 + 2 w_1(\theta) w_2(\theta) + w_2(\theta)^2 \]
and by re-arranging
\[ (w_1(\theta) - w_2(\theta))^2 = 0. \]
Hence $w_1(\theta) = w_2(\theta)$ as required.

Having seen a class of sentences for which equality always holds in the statement of CP, we turn to consider a case in which strict inequality holds for all non-constant θ ∈ $QFSL$. In order to do so we recall the following special case of a theorem (Theorem 1) from [14].

Theorem 10. Any probability function w on $L$ satisfying ULi with Ax can be represented as an integral
\[ w = \int_{B} u^{p,L}\, d\mu \tag{6} \]
for some measure μ on the Borel subsets of $B$. Conversely any such function defined in this way satisfies ULi with Ax.¹⁴

Theorem 11. For a probability function $w = \int_{B} u^{p,L}\, d\mu$, if every point in $B$ is a support¹⁵ point of μ then strict inequality holds in CP whenever θ ∈ $QFSL$ is not of the constant type.

Proof. Assume that w can be expressed in this way and let θ, θ′ ∈ $QFSL$ be as in the statement of CP. Then since the $u^{p,L}$ satisfy WIP,
\[ w(\theta \wedge \theta') - w(\theta)^2 = \int_{B} u^{p,L}(\theta \wedge \theta')\, d\mu(p) - \Bigl( \int_{B} u^{q,L}(\theta)\, d\mu(q) \Bigr)^2 = \int_{B} u^{p,L}(\theta)^2\, d\mu(p) - \Bigl( \int_{B} u^{q,L}(\theta)\, d\mu(q) \Bigr)^2 = \int_{B} \Bigl( u^{p,L}(\theta) - \int_{B} u^{q,L}(\theta)\, d\mu(q) \Bigr)^2 d\mu(p) \ge 0. \]
Since the support of μ is all of $B$ the only way we can have w(θ | θ′) = w(θ) is if
\[ u^{p,L}(\theta) = \int_{B} u^{q,L}(\theta)\, d\mu(q) \]
for all $p \in B$. In other words $u^{p,L}(\theta)$ must be constant for all $p \in B$. By a result in [23, Chapter 33] any probability function w on $L$ satisfying Ax is of the form
\[ w = (\lambda + 1) \int_{B} u^{p,L}\, d\mu_1(p) - \lambda \int_{B} u^{p,L}\, d\mu_2(p) \]
for some 0 ≤ λ and measures $\mu_1, \mu_2$ on $B$. Hence if all the $u^{p,L}(\theta)$ are constant then so too are all w(θ) for w satisfying Ax. In other words θ is of the constant type.

¹⁴ Notice that the 'building block functions' here, i.e.
the $u^{p,L}$, are precisely the probability functions satisfying ULi with Ax and WIP.
¹⁵ Recall that a point $\vec{e} \in B$ is in the support of μ if μ(O) > 0 for all open subsets O of $B$ containing $\vec{e}$.

The conditions given in this theorem which ensure that w satisfies CP with strict inequality for all non constant type quantifier free sentences can be shown to hold for Carnap's Continuum of Inductive Methods $c_\lambda$ when 0 < λ < ∞, thus ensuring that these $c_\lambda$ satisfy this strong version of CP (for quantifier free sentences). However showing this appears to be quite involved and in general we currently have little insight into when these conditions hold for particular probability functions (unlike the situation with de Finetti's Representation).

It might have been hoped at this point that any probability function w satisfying ULi with Ax would either satisfy WIP, and so never give strict inequality in CP, or else not satisfy WIP and always give strict inequality in CP whenever θ ∈ $QFSL$ was not of the constant type. Unfortunately as the following example shows the situation is not as simple as that.

Let $L$ be the language with just two predicate symbols, i.e. q = 2. Then for a state description θ with spectrum {3, 1, 0, 0} or {2, 2, 0, 0} the mapping $\delta \mapsto w^\delta(\theta)$ has a maximum point in (0, 1). So there are 0 < ν < τ < 1 such that $w^\nu(\theta) = w^\tau(\theta)$ and in consequence the probability function $w = (w^\nu + w^\tau)/2$ has the property that w(θ | θ′) = w(θ). However θ is not of the constant type and w does not satisfy WIP.

Conclusions

This paper argues that the Counterpart Principle of Analogy has intuitive appeal and gives a proof that it holds for all probability functions satisfying Unary Language Invariance, another appealing principle in our opinion. In particular then the Counterpart Principle is satisfied by both Carnap's Continuum of Inductive Methods and the Nix-Paris Continuum.
This also means that it is consistent with Atom Exchangeability, which makes it a rather different analogy principle from those considered previously.

The paper also investigates when the inequality in the Counterpart Principle is strict. Here it is shown that there are probability functions satisfying Unary Language Invariance with Atom Exchangeability that never give strict inequality for any sentence (amongst them the wδ of the Nix-Paris Continuum), and that there are non-tautologous, non-contradictory 'constant' sentences which never give a strict inequality for any probability function satisfying Atom Exchangeability. Whilst the general situation still requires clarification, it can be shown that for 0 < λ < ∞ the cλ of Carnap's Continuum of Inductive Methods do give the strict inequality on quantifier free sentences for all except the 'constant' sentences.

References

[1] Carnap, R., Logical Foundations of Probability, University of Chicago Press, Chicago, Routledge & Kegan Paul Ltd., 1950.
[2] Carnap, R., The Continuum of Inductive Methods, University of Chicago Press, 1952.
[3] Carnap, R., The Aim of Inductive Logic, in Logic, Methodology and Philosophy of Science, eds. E.Nagel, P.Suppes & A.Tarski, Stanford University Press, Stanford, California, 1962, pp303-318.
[4] Carnap, R., A Basic System of Inductive Logic, in Studies in Inductive Logic and Probability, Volume II, ed. R.C.Jeffrey, University of California Press, Berkeley and Los Angeles, 1980, pp7-155.
[5] Carnap, R. & Stegmüller, W., Induktive Logik und Wahrscheinlichkeit, Springer Verlag, Wien, 1959.
[6] Festa, R., Analogy and Exchangeability in Predictive Inferences, Erkenntnis, 1996, 45:229-252.
[7] de Finetti, B., Theory of Probability, Volume 1, Wiley, New York, 1974.
[8] Gaifman, H., Concerning measures on first order calculi, Israel Journal of Mathematics, 1964, 2:1-18.
[9] Gaifman, H., Applications of de Finetti's Theorem to Inductive Logic, in Studies in Inductive Logic and Probability, Volume I, eds. R.Carnap & R.C.Jeffrey, University of California Press, Berkeley and Los Angeles, 1971, pp235-251.
[10] Hill, A.J. & Paris, J.B., Reasoning by Analogy in Inductive Logic, The Logica Yearbook 2011, eds. M.Peliš & V.Punčochář, College Publications, London, 2012, pp63-76.
[11] Hill, M.J., Paris, J.B. & Wilmers, G.M., Some observations on induction in predicate probabilistic reasoning, Journal of Philosophical Logic, 2002, 31:43-75.
[12] Kemeny, J.G., Carnap's Theory of Probability and Induction, in The Philosophy of Rudolf Carnap, ed. P.A.Schilpp, La Salle, Illinois, Open Court, 1963, pp711-738.
[13] Kuipers, T.A.F., Two types of Inductive Analogy by Similarity, Erkenntnis, 1984, 21:63-87.
[14] Landes, J., Vencovská, A. & Paris, J.B., A Characterization of the Language Invariant Families satisfying Spectrum Exchangeability in Polyadic Inductive Logic, Annals of Pure and Applied Logic, 2010, 161:800-811.
[15] Landes, J., Paris, J.B. & Vencovská, A., A survey of some recent results on Spectrum Exchangeability in Polyadic Inductive Logic, Synthese, 2011, 181(Supplement 1), pp19-47.
[16] Maher, P., Probabilities for multiple properties: The models of Hesse, Carnap and Kemeny, Erkenntnis, 2001, 55:183-216.
[17] Maher, P., A Conception of Inductive Logic, Philosophy of Science, 2006, 73:513-520.
[18] di Maio, M.C., Predictive Probability and Analogy by Similarity, Erkenntnis, 1995, 43(3):369-394.
[19] Niiniluoto, I., Analogy and Inductive Logic, Erkenntnis, 1981, 16:1-34.
[20] Nix, C.J. & Paris, J.B., A Continuum of Inductive Methods arising from a Generalized Principle of Instantial Relevance, Journal of Philosophical Logic, 2006, 35(1):83-115.
[21] Paris, J.B., Pure Inductive Logic, in The Continuum Companion to Philosophical Logic, eds. L.Horsten & R.Pettigrew, Continuum International Publishing Group, London, 2011, pp428-449.
[22] Paris, J.B. & Vencovská, A., A Note on Irrelevance in Inductive Logic, Journal of Philosophical Logic, 2010, 40(3):357-370.
[23] Paris, J.B. & Vencovská, A., Pure Inductive Logic. To appear in the Association for Symbolic Logic series Perspectives in Logic, ed. M.Rathjen, Cambridge University Press, 2013.
[24] Paris, J.B. & Waterhouse, P., Atom Exchangeability and Instantial Relevance, Journal of Philosophical Logic, 2009, 38(3):313-332.
[25] Polya, G., Induction and Analogy in Mathematics, Volume I of Mathematics and Plausible Reasoning, Geoffrey Cumberlege, Oxford University Press, 1954.
[26] Polya, G., Patterns of Plausible Inference, Volume II of Mathematics and Plausible Reasoning, Geoffrey Cumberlege, Oxford University Press, 1954.
[27] Romeijn, J-W., Analogical Predictions for Explicit Similarity, Erkenntnis, 2006, 64(2):253-280.
[28] Skyrms, B., Analogy by Similarity in Hyper-Carnapian Inductive Logic, in Philosophical Problems of the Internal and External Worlds, eds. J.Earman, A.I.Janis, G.Massey & N.Rescher, University of Pittsburgh Press, 1993, pp273-282.