Ancient Indian Logic and Analogy*

J.B. Paris** and A. Vencovská***

University of Manchester, Manchester M13 9PL, UK
jeff.paris@manchester.ac.uk, alena.vencovska@manchester.ac.uk

* Subsequently appeared in Logic and Its Applications, Proceedings of the 7th Indian Conference, ICLA 2017, Kanpur, India, January 2017. Eds. S. Ghosh and Sanjiva Prasad. Springer LNCS 10119, 2017, pp. 198-210.
** Supported by a UK Engineering and Physical Sciences Research Council (EPSRC) Research Grant.
*** Supported by a UK Engineering and Physical Sciences Research Council (EPSRC) Research Grant.

Abstract. B.K. Matilal, and earlier J.F. Staal, have suggested a reading of the 'Nyāya five limb schema' (also sometimes referred to as the Indian Schema or Hindu Syllogism) from Gotama's Nyāya-Sūtra in terms of a binary occurrence relation. In this paper we provide a rational justification of a version of this reading as Analogical Reasoning within the framework of Polyadic Pure Inductive Logic.

Keywords: Nyāya-Sūtra, Analogy, Pure Inductive Logic, Rationality

Introduction

In the Nyāya-Sūtra (∼150 CE), Gotama discussed the structure of logical reasoning, offering a fundamental schema consisting of:

• statement of the thesis,
• statement of a reason,
• an example supporting the reason on the grounds of similarity to the present case,
• application of the above to the present case,
• conclusion.

B.K. Matilal [5] gives this 'time-honoured' illustration of the schema:

• There is fire on the hill.
• For there is smoke.
• (Wherever there is smoke, there is fire), as in the kitchen.
• This is such a case (smoke on the hill).
• Therefore it is so, i.e. there is fire on the hill.

It is often emphasised that this reasoning should be understood as occurring in the context of a debate, employed to persuade an opponent. Hence the apparently unnecessary number of steps; they each have a role.
Considering the argument taken out of this context, it is commonly rephrased as

• (Wherever there is smoke, there is fire), as in the kitchen.
• There is smoke on the hill.                                    (A)
• Therefore there is fire on the hill.

This is clearly close to one of the Aristotelian syllogisms, but the Indian Schema, as we shall call it, can be reduced to it only at the cost of imposing the perspective of our contemporary deductive logic and rendering the example (almost¹) redundant. See for instance [2] for a collection of papers relating to attempts at understanding and formalising the schema in various ways.

We have suggested in [7] and [8] that returning to the position where the example itself carries the weight of the evidence, somehow itself representing the universal implication, allows formulations of the argument within Pure Inductive Logic (to be introduced shortly) which can be justified as rational on the grounds of following from principles usually accepted in that subject as rational. When the example is so taken to encapsulate the evidence, the argument may be rephrased as²

• When there was smoke in the kitchen, there was fire.
• There is smoke on the hill.                                    (B)
• Therefore there is fire on the hill.

with the rider that the kitchen is a good example, which is taken to mean that the example captures all the relevant information. Regarding this rider, the Nyāya-Sūtra is a cryptic text and does not elaborate on its methodology. Nevertheless it is clear that the relationship here between smoke and fire is not simply taken to be contingent or coincidental, but fundamental: a concomitance, or even a causal relationship, that cannot be otherwise. Being a good example can then be equated with capturing this link, rather as in mathematics we may give a 'proof by example'. Of course the problem in practice of precisely demarcating what we mean by this notion in general appears immensely difficult, but fortunately that is not our problem in this short paper.
We shall simply be interested in providing a justification for this inference on the grounds of its logical form alone.

¹ It has been suggested that under such a perspective, the role of the example may be to ensure existential import, see e.g. [4, p16].
² Notice that we are taking the evidence as a single instance of a kitchen, hence the switch from 'wherever' on line 1 to 'when'.

The Pakṣa Formalisation

In our previous attempts [7] and [8] at formalising B we worked within Unary Predicate Logic, so using S, F, h and k in the obvious sense we employed S(k) to express 'There is smoke in the kitchen', F(h) to express 'There is fire on the hill', etc. Within Pure Inductive Logic, B then becomes the assertion that, in the absence of any other pertinent information, S(h) and S(k) → F(k) provide grounds for accepting F(h). In [7] and [8] we elaborated on the background and evidence for this reading of the schema (and so will not repeat ourselves here) and showed that such inference is indeed justified by certain well accepted rational symmetry principles of probability assignment and in consequence is itself rational.

Some authors however, notably Staal [12] and Matilal [5], have suggested that it is much closer to the Indian way of thinking to formalise the Indian Schema by employing a binary relation standing for 'occurring at'. According to Staal, in Indian logic an entity is never regarded in isolation but always considered as occurring at a locus, and the fundamental relation which underlies all expressions is that between an entity and its locus (pakṣa). Using R for this relation and f, s, h, k for fire, smoke, hill and kitchen respectively, B becomes the claim that, in the absence of any other pertinent information, R(s, h) and R(s, k) → R(f, k) provide grounds for accepting R(f, h). In this note we show that Pure Inductive Logic supports this version as a rational inference.
To facilitate this we first need to summarize some necessary background from Pure Inductive Logic and briefly explain what this logic is attempting to elucidate.

Pure Inductive Logic

The framework for Pure Inductive Logic is Predicate Logic employing a language L with a finite set of relation symbols $R_1, \dots, R_q$, countably many constants $a_1, a_2, a_3, \dots$ and no function symbols nor equality.³ SL denotes the set of sentences of L and QFSL denotes the set of quantifier free sentences in SL. A probability function on L is a function w : SL → [0, 1] such that for any θ, φ, ∃x ψ(x) from SL,

(i) If ⊨ θ then w(θ) = 1.
(ii) If θ ⊨ ¬φ then w(θ ∨ φ) = w(θ) + w(φ).
(iii) $w(\exists x\, \psi(x)) = \lim_{n \to \infty} w\bigl(\bigvee_{i=1}^{n} \psi(a_i)\bigr)$.

Any function w satisfying the above conditions has the properties we usually expect of probability (see [10, Prop. 3.1]), in particular if ψ logically implies θ then w(ψ) ≤ w(θ). Given a probability function w and θ, φ ∈ SL with w(φ) > 0 we define the conditional probability of θ given φ as usual by

$$w(\theta \mid \phi) = \frac{w(\theta \wedge \phi)}{w(\phi)}. \tag{1}$$

With a fixed φ ∈ SL, w(φ) > 0, the function defined by (1) is also a probability function.

³ In place of $a_i$ we sometimes use other letters to avoid subscripts or double subscripts.

The aim of Pure Inductive Logic (see for example [10]) is to investigate the logical or rational assignment of belief, as subjective probability,⁴ in the absence of any intended interpretation. To explain this, consider a valid natural language argument, such as A, where lines 1 and 2 are the premises and line 3 the conclusion. What we understand here by 'valid' is that this conclusion is true whenever the premises are true, independently of the meaning of 'fire', 'smoke', 'kitchen' etc. In other words the conclusion is a logical consequence of the premises depending only on their form and not on the meaning or interpretation we give to 'fire', 'kitchen' etc. Most natural language 'arguments' however are not so valid.
Instead the premises only seem to provide some support for the conclusion rather than deem it categorically true. B is just such an example (though as Matilal points out at [4, p16] and [5, p197] contemporary scholars have commonly understood, and in consequence criticised, the Indian schema as aiming to render a valid conclusion). Nevertheless we can still investigate the question of how much of this support is logical or rational, depending only on the form of the premises and conclusion and not on the actual interpretation of 'fire', 'smoke' etc.

So, just as Predicate Logic seeks to understand the notion of logical consequence by considering sentences of a formal language devoid of any particular interpretation, Pure Inductive Logic aims to address the more general issue of the logical or rational assignment of probabilities to sentences of a formal language (such as L above) in the absence of any particular interpretation. Note that this is indeed a more general issue, since the support given by some evidence to a hypothesis arguably can be measured by the conditional probability of the hypothesis given the evidence. A hypothesis is a logical consequence of the evidence just when this support is total (probability 1) for all probability functions giving non-zero probability to the evidence.

A key requirement here is the rationality of the probability assignment (without it we would get no further than simply standard Predicate Logic). Whilst we may not know exactly what we mean by 'rational' here, nevertheless there are, in this completely uninterpreted context, some constraints or principles governing this assignment that we feel are rational and should be enforced. The most basic of these is that, since there is no reason to treat any one constant any differently from any other, interchanging constants should not alter the assigned probabilities. Precisely, a rational probability function should satisfy:

The Constant Exchangeability Principle, Ex.
If θ ∈ SL and the constant symbol $a_j$ does not appear in θ then w(θ) = w(θ′) where θ′ is the result of replacing each occurrence of $a_i$ in θ by $a_j$.⁵

⁴ In our view this makes it an obvious logic to investigate 'analogical arguments' where it is subjective probability which is being propagated by considerations of rationality.
⁵ This formulation of Ex is equivalent to that given in, say, [10], and avoids introducing extra notation.

Similarly, in the absence of any particular interpretation there is no reason to treat a relation any differently from its negation. This leads to the rationality requirement on a probability function that it satisfy:

The Strong Negation Principle, SN.
w(θ) = w(θ′) where θ′ is the result of replacing each occurrence of the relation symbol $R_i$ in θ by $\neg R_i$.

A word of caution here however. In our main theorem below we will formalise B in a predicate language and then, in this rarefied set-up, argue that, adopting the above principles Ex+SN, the conditional probability of the conclusion given the conjunction of the premises must be at least 1/2 (in fact strictly greater than 1/2 in all except exceptional circumstances). However, to accept this conclusion one must agree, or allow for the sake of argument, that all the relevant information is given in the premises,⁶ so that the actual interpretation ceases to matter and nothing essential is lost in the resulting formalisation as simply uninterpreted sentences of a predicate logic.⁷ This is what we intend by a 'good example'.

The Main Result

The following theorem shows that when formalising the Indian Schema as in the section before last (that is, via a binary relation representing 'occurring at') and assuming that the condition on the example being a good one is taken to be that it represents all the relevant information, the Schema is at least as rational as Ex+SN.
By this we mean that any probability function on L (where from now on L is the fixed language with single binary relation symbol R) satisfying Ex+SN gives probability at least 1/2 to fire occurring on the hill given (just) that smoke occurs on the hill, and that smoke in the kitchen implied fire in the kitchen.

Theorem 1 Let w be a probability function on SL satisfying Ex+SN. Let h, k, s, f be distinct constants from amongst the $a_1, a_2, a_3, \dots$. Then⁸

w(R(f, h) | R(s, h) ∧ (R(s, k) → R(f, k))) ≥ w(R(f, h) | R(s, h)) ≥ 1/2.    (2)

⁶ Of course one has a vast background knowledge about fires and kitchens etc., none of which is alluded to in these premises.
⁷ In other words such reasoning is appropriate only in so far as one is content to apply a principle of ceteris paribus.
⁸ To avoid problems with zero denominators we identify w(θ | φ) ≥ w(ψ | η) with w(θ ∧ φ) · w(η) ≥ w(ψ ∧ η) · w(φ).

A few remarks are in order here. Firstly, one might object that for the claimed justification one really requires the left hand term to be strictly greater than 1/2. In fact it is not difficult to show that if, for a particular probability function w satisfying Ex+SN, the left hand term of (2), and hence also the middle term, does take the value 1/2 then this w must give the same value 1/2 to

$$w\bigl(R(s, k_{n+1}) \mid R(s, k_1) \wedge R(s, k_2) \wedge \dots \wedge R(s, k_n)\bigr) \tag{3}$$

for any number of 'kitchens' $k_1, k_2, \dots, k_{n+1}$. In other words w must completely dismiss any inductive influence: informally, no matter how many kitchens seen in the past have been smokey, this evidence amounts to nothing when it comes to the probability assigned to the next kitchen seen being also smokey. Thus to say that a purportedly rational w could fail to give the left hand side of (2) a value strictly greater than 1/2 entails saying that it is rational to give (3) a value of 1/2 for all n, a not-inconsistent position to take but one which is hardly popular.
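As a numerical illustration of Theorem 1 (a spot check, not a proof), the following sketch evaluates both conditional probabilities in (2) exactly for a probability function of the form 2⁻¹(w^D + w^{D¬}), which the Appendix shows satisfies Ex+SN. The function names, the encoding of sentences as Python callables, and the sample matrix D are all assumptions made for this illustration and do not come from the paper.

```python
from fractions import Fraction
from itertools import product


def w_D(D, event):
    """Probability of `event` under w^D for the four constants f, s, h, k.

    Indices for f, s, h, k are drawn uniformly, with replacement, from
    range(N), and R(x, y) is read as D[x][y] == 1.
    """
    N = len(D)

    def R(x, y):
        return D[x][y] == 1

    hits = sum(
        1
        for f, s, h, k in product(range(N), repeat=4)
        if event(R, f, s, h, k)
    )
    return Fraction(hits, N ** 4)


def u(D, event):
    """Average of w^D and w^(D negated), as in the Appendix."""
    D_neg = [[1 - d for d in row] for row in D]
    return (w_D(D, event) + w_D(D_neg, event)) / 2


def smoke_on_hill(R, f, s, h, k):
    return R(s, h)


def kitchen_example(R, f, s, h, k):
    # R(s, k) -> R(f, k)
    return (not R(s, k)) or R(f, k)


def fire_on_hill(R, f, s, h, k):
    return R(f, h)


def both(*events):
    """Conjunction of the given events."""
    return lambda R, f, s, h, k: all(e(R, f, s, h, k) for e in events)


def theorem_1_terms(D):
    """Return the left hand and middle terms of inequality (2) for u."""
    left = u(D, both(fire_on_hill, smoke_on_hill, kitchen_example)) / u(
        D, both(smoke_on_hill, kitchen_example)
    )
    middle = u(D, both(fire_on_hill, smoke_on_hill)) / u(D, smoke_on_hill)
    return left, middle


if __name__ == "__main__":
    D = [[1, 1, 0], [0, 1, 0], [0, 0, 1]]  # arbitrary illustrative matrix
    left, middle = theorem_1_terms(D)
    print(left, middle, left >= middle >= Fraction(1, 2))
```

Exact rational arithmetic is used so that boundary cases, where a term equals 1/2 exactly, are not blurred by rounding. The denominators are never zero for this particular u, because choosing the same index for f and s, and the same index for h and k, always satisfies the evidence under either D or its negation.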
Of course one might wish that the support is not simply greater than 1/2 but actually substantially greater. However that can only be achieved by making additional assumptions beyond simply Ex+SN, and currently we cannot envisage any such assumption which would avoid introducing a subjective element (just how much is 'substantially greater'?). This would seem to directly conflict with the idea of probabilities being assigned on purely rational or logical grounds.

A second remark here concerns our formalization of B. We have chosen to capture 'when there was smoke in the kitchen there was fire' by R(s, k) → R(f, k). Various other formulations are possible here, for example R(s, k) ↔ R(f, k) and R(s, k) ∧ R(f, k). In each case one can prove by the same methods that, for a probability function satisfying Ex+SN, conditioning R(f, h) on this evidence together with R(s, h) gives a value of at least one half, see Theorem 5 in the appendix. However in these cases it is currently open whether or not we can still interleave w(R(f, h) | R(s, h)) as in Theorem 1.⁹

Thirdly, in case the reader might object here that the second inequality in (2) already gives the claimed 'support' for R(f, h) from evidence R(s, h) alone, we are at pains to point out that, by the assumption that all pertinent evidence has been included, one cannot simply throw away the R(s, k) → R(f, k).

Finally we remark that Matilal's suggestion from [5, p197] that the reasoning in the Indian Schema may be more correctly understood as inductive, and for practical purposes providing knowledge of the real world, seems to us along the lines of the approach we have adopted here: we take the assignment of a probability of at least one half to the conclusion (equivalently, the conclusion being at least as probable as its negation) to be a justification for giving the conclusion the status of a working assumption.
⁹ There are several other currently open problems with these and other formulations (see for example [7], [8], [9]), in particular when the evidence involves multiple smokey kitchens, and the heterogeneous non-smokey lakes, a case not treated at all in this paper.

Conclusion

We have shown that a version of the Indian Schema expressed in terms of the binary occurrence relation, as suggested by Staal and Matilal, is actually a consequence of two of the central principles in Pure Inductive Logic, Constant Exchangeability and Strong Negation. By this we certainly do not wish to imply that the ancients were somehow aware of these principles (so this paper is not at all intended as a contribution to the History of Indian Logic). Rather we simply intend to point out that the everyday common sense of the Indian Schema does in fact have a formal justification as rational within the context of Pure Inductive Logic.

This paper has left much open for further research and investigation. For example, in the way we formalise the schema in terms of the pakṣa, should the concomitance be implication, bi-implication or conjunction? Should 'hill' be thought of as a constant or a predicate, and so on? There is also the issue of the effect of heterogeneous examples and of mixtures of multiple reasons of both kinds. We have already considered some of these questions in [7], [8] and [9] within the context of Pure Inductive Logic but much remains unanswered. One advantage of using this framework is that following recent advances (see [10]) it is now equipped with some powerful representation theorems and a choice of attractive rational principles in addition to Ex+SN. Nevertheless there is the question whether this is the best framework in which to investigate such classical analogical reasoning, and certainly others have previously been proposed, for example [3], [6], [11].
Hopefully this short note will stimulate answers to these questions, not least from the Indian Logic community who clearly (unlike the present authors) have first hand access to the original texts and language.

References

1. Gaifman, H., Concerning Measures on First Order Calculi, Israel Journal of Mathematics, 2:1-18, 1964.
2. Ganeri, J., Indian Logic: A Reader, Routledge, 2001.
3. Ganeri, J., Ancient Indian Logic as a Theory of Case Based Reasoning, Journal of Indian Philosophy, 31:33-45, 2003.
4. Matilal, B.K., The Character of Logic in India, SUNY Series in Indian Thought, Ed. W. Halbfass, State University of New York Press, Albany, 1998.
5. Matilal, B.K., Introducing Indian Logic, in J. Ganeri, Indian Logic: A Reader, Routledge, 2001.
6. Oetke, C., Ancient Indian Logic as a Theory of Non-monotonic Reasoning, Journal of Indian Philosophy, 24:447-539, 1996.
7. Paris, J.B. & Vencovská, A., The Indian Schema as Analogical Reasoning, submitted to the Journal of Philosophical Logic. Currently available at http://eprints.ma.man.ac.uk/2436/01/covered/MIMS ep2016 10.pdf
8. Paris, J.B. & Vencovská, A., The Indian Schema Analogy Principles, submitted to the IfCoLog Journal of Logics and their Applications. Currently available at http://eprints.ma.man.ac.uk/2436/01/covered/MIMS ep2016 8.pdf
9. Paris, J.B. & Vencovská, A., Ancient Indian Logic, Pakṣa and Analogy. To appear in the Proceedings of the joint Conference of The 3rd Asian Workshop on Philosophical Logic (AWPL-2016) and the 3rd Taiwan Philosophical Logic Colloquium (TPLC-2016), Taipei, October 2016.
10. Paris, J.B. & Vencovská, A., Pure Inductive Logic, in the Association of Symbolic Logic Perspectives in Mathematical Logic Series, Cambridge University Press, April 2015.
11. Schayer, S., On the Method of Research into Nyāya, (translated by J. Tuske) in Indian Logic: A Reader, ed. J. Ganeri, Routledge, London & New York, 2001, pp. 102-109.
12. Staal, J.F., The concept of Pakṣa in Indian Logic, in Indian Logic: A Reader, ed. J. Ganeri, Routledge, London & New York, 2001, pp. 151-161.

Appendix

To prove the theorem we need to appeal to a representation theorem for probability functions on L satisfying Ex. First we introduce some notation. For the language L as above a state description for $a_1, \dots, a_n$ is a sentence of L of the form

$$\bigwedge_{i,j \le n} R(a_i, a_j)^{\varepsilon_{i,j}}$$

where the $\varepsilon_{i,j} \in \{0, 1\}$ and $R(a_i, a_j)^1 = R(a_i, a_j)$, $R(a_i, a_j)^0 = \neg R(a_i, a_j)$. By a theorem of Gaifman, see [1], or [10, Chapter 7], a probability function on SL is determined by its values on the state descriptions.

Let $D = (d_{i,j})$ be an $N \times N$ $\{0, 1\}$-matrix. Define a probability function $w^D$ on SL by setting

$$w^D\Bigl(\bigwedge_{i,j \le n} R(a_i, a_j)^{\varepsilon_{i,j}}\Bigr)$$

to be the probability of (uniformly) randomly picking, with replacement, $h(1), h(2), \dots, h(n)$ from $\{1, 2, \dots, N\}$ such that for each $i, j \le n$, $d_{h(i),h(j)} = \varepsilon_{i,j}$. This uniquely determines a probability function on SL satisfying Ex. (For details see e.g. [10, Chapter 7].) Clearly convex mixtures of these $w^D$ also satisfy Ex. Indeed by the proof of [10, Theorem 25.1] it follows that any probability function w satisfying Ex can be approximated arbitrarily closely on QFSL by such convex mixtures. More precisely:

Lemma 2 For a probability function $w$ on SL satisfying Ex and $\theta_1, \dots, \theta_m \in QFSL$ and $\varepsilon > 0$ there is an $N \in \mathbb{N}$ and $\lambda_D \ge 0$ for each $N \times N$ $\{0, 1\}$-matrix $D$ such that $\sum_D \lambda_D = 1$ and for $j = 1, \dots, m$,

$$\Bigl| w(\theta_j) - \sum_D \lambda_D w^D(\theta_j) \Bigr| < \varepsilon.$$

We can extend this representation result to probability functions satisfying additionally SN as follows. For θ ∈ SL let $\theta^{\neg}$ be the result of replacing each occurrence of R in θ by ¬R, and similarly for a matrix D as above let $D^{\neg}$ be the result of replacing each occurrence of 0/1 in D by 1/0 respectively. For w a probability function on SL set $w^{\neg}$ to be the function on SL defined by $w^{\neg}(\theta) = w(\theta^{\neg})$. Then $w^{\neg}$ satisfies Ex and the probability function $2^{-1}(w + w^{\neg})$ satisfies Ex+SN.
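Conversely, if $w$ satisfies Ex+SN then $w = w^{\neg}$, so $w = 2^{-1}(w + w^{\neg})$. Thus every probability function satisfying Ex+SN is of the form $2^{-1}(v + v^{\neg})$ for some probability function $v$ satisfying Ex, and conversely every such probability function satisfies Ex+SN.

Notice that if $w = \sum_D \lambda_D w^D$ then $w^{\neg} = \sum_D \lambda_D w^{D^{\neg}}$ and

$$2^{-1}(w + w^{\neg}) = \sum_D \lambda_D\, 2^{-1}\bigl(w^D + w^{D^{\neg}}\bigr).$$

In particular then by Lemma 2,

Lemma 3 For a probability function $w$ on SL satisfying Ex+SN and $\theta_1, \dots, \theta_m \in QFSL$ and $\varepsilon > 0$ there is an $N \in \mathbb{N}$ and $\lambda_D \ge 0$ for each $N \times N$ $\{0, 1\}$-matrix $D$ such that $\sum_D \lambda_D = 1$ and for $j = 1, \dots, m$,

$$\Bigl| w(\theta_j) - 2^{-1} \sum_D \lambda_D \bigl(w^D(\theta_j) + w^{D^{\neg}}(\theta_j)\bigr) \Bigr| < \varepsilon.$$

Let $w$ be a probability function on SL satisfying Ex and for a $2 \times 2$ $\{0, 1\}$-matrix

$$E = \begin{bmatrix} e_{11} & e_{12} \\ e_{21} & e_{22} \end{bmatrix}$$

let

$$|E|_w = w\bigl(R(a_1, a_3)^{e_{11}} \wedge R(a_1, a_4)^{e_{12}} \wedge R(a_2, a_3)^{e_{21}} \wedge R(a_2, a_4)^{e_{22}}\bigr).$$

We will omit the subscript $w$ if it is clear from the context. Notice that when $D = (d_{i,j})$ is an $N \times N$ $\{0, 1\}$-matrix, then for $E$ as above we have

$$|E|_{w^D} = N^{-4} \sum_{i,j,r,s} d_{i,r}^{\,e_{11}}\, d_{i,s}^{\,e_{12}}\, d_{j,r}^{\,e_{21}}\, d_{j,s}^{\,e_{22}}, \tag{4}$$

where $x^1 = x$, $x^0 = 1 - x$. We will write $|E|_D$ in place of $|E|_{w^D}$.

A useful observation is that for any probability function $w$ satisfying Ex, $|E|$ is invariant under permuting rows and permuting columns, so for example

$$\begin{vmatrix} 1 & 0 \\ 1 & 0 \end{vmatrix} = \begin{vmatrix} 0 & 1 \\ 0 & 1 \end{vmatrix}, \qquad
\begin{vmatrix} 1 & 1 \\ 0 & 0 \end{vmatrix} = \begin{vmatrix} 0 & 0 \\ 1 & 1 \end{vmatrix}, \qquad
\begin{vmatrix} 1 & 0 \\ 0 & 1 \end{vmatrix} = \begin{vmatrix} 0 & 1 \\ 1 & 0 \end{vmatrix},$$

$$\begin{vmatrix} 1 & 0 \\ 0 & 0 \end{vmatrix} = \begin{vmatrix} 0 & 1 \\ 0 & 0 \end{vmatrix} = \begin{vmatrix} 0 & 0 \\ 0 & 1 \end{vmatrix} = \begin{vmatrix} 0 & 0 \\ 1 & 0 \end{vmatrix}, \tag{5}$$

etc. We will use this observation frequently in what follows. Let

$$X = \begin{vmatrix} 1 & 1 \\ 1 & 1 \end{vmatrix} + \begin{vmatrix} 0 & 0 \\ 0 & 0 \end{vmatrix}, \qquad
Y = \begin{vmatrix} 1 & 1 \\ 1 & 0 \end{vmatrix} + \begin{vmatrix} 0 & 0 \\ 0 & 1 \end{vmatrix},$$

$$T = \begin{vmatrix} 1 & 0 \\ 1 & 0 \end{vmatrix}, \qquad
U = \begin{vmatrix} 1 & 0 \\ 0 & 1 \end{vmatrix}, \qquad
Z = \begin{vmatrix} 0 & 0 \\ 1 & 1 \end{vmatrix}.$$

Lemma 4 For any probability function $w$ satisfying Ex we have $T, Z \ge U$ and $X \ge 2Z, 2T$.

Proof. We shall prove that $T \ge U$; the other inequalities follow similarly. Let $D = (d_{i,j})$ be an $N \times N$ $\{0, 1\}$-matrix and assume first that $w = w^D$. By the above observation,

$$T = \frac{1}{2}\left( \begin{vmatrix} 1 & 0 \\ 1 & 0 \end{vmatrix}_D + \begin{vmatrix} 0 & 1 \\ 0 & 1 \end{vmatrix}_D \right), \qquad
U = \frac{1}{2}\left( \begin{vmatrix} 1 & 0 \\ 0 & 1 \end{vmatrix}_D + \begin{vmatrix} 0 & 1 \\ 1 & 0 \end{vmatrix}_D \right),$$

so $T \ge U$ is the inequality

$$\sum_{i,j,r,s} d_{i,r}(1 - d_{i,s})\, d_{j,r}(1 - d_{j,s}) + \sum_{i,j,r,s} (1 - d_{i,r})\, d_{i,s}\, (1 - d_{j,r})\, d_{j,s}$$
$$\ge \sum_{i,j,r,s} d_{i,r}(1 - d_{i,s})(1 - d_{j,r})\, d_{j,s} + \sum_{i,j,r,s} (1 - d_{i,r})\, d_{i,s}\, d_{j,r}(1 - d_{j,s}),$$

which is equivalent to the sum over $r, s$ of

$$\Bigl(\sum_i d_{i,r}(1 - d_{i,s})\Bigr)^2 + \Bigl(\sum_j (1 - d_{j,r})\, d_{j,s}\Bigr)^2 - 2 \Bigl(\sum_i d_{i,r}(1 - d_{i,s})\Bigr)\Bigl(\sum_j (1 - d_{j,r})\, d_{j,s}\Bigr)$$

being nonnegative, and hence clearly true, since each summand is a perfect square. From this it follows that the result holds for convex combinations of the $w^D$ and hence by Lemma 2 for general $w$ satisfying Ex.

Proof of Theorem 1

We start with the left hand side inequality. Let $w$ be a probability function satisfying Ex+SN. If $w(R(s, h) \wedge (R(s, k) \to R(f, k)))$ and/or $w(R(s, h))$ equals 0 then (2) holds by our convention, so assume that these values are nonzero. Consider an approximation $2^{-1} \sum_D \lambda_D (w^D + w^{D^{\neg}})$ of $w$ for the θ of the form

$$R(f, h)^{e_{11}} \wedge R(f, k)^{e_{12}} \wedge R(s, h)^{e_{21}} \wedge R(s, k)^{e_{22}}$$

with small $\varepsilon$ and $N \in \mathbb{N}$ as guaranteed by Lemma 3. For an $N \times N$ $\{0, 1\}$-matrix $D = (d_{i,j})$, write $u$ for $2^{-1}(w^D + w^{D^{\neg}})$. We have

$$u(R(f, h) \wedge R(s, h) \wedge (R(s, k) \to R(f, k))) = 2^{-1}(X_D + 2T_D + Y_D),$$
$$u(R(s, h) \wedge (R(s, k) \to R(f, k))) = 2^{-1}(X_D + 2T_D + 3Y_D + 2U_D),$$
$$u(R(f, h) \wedge R(s, h)) = 2^{-1}(X_D + 2T_D + 2Y_D),$$
$$u(R(s, h)) = 2^{-1}(X_D + 2T_D + 4Y_D + 2U_D + 2Z_D).$$

Let $\hat{D}$ be another (not necessarily distinct) $N \times N$ $\{0, 1\}$-matrix. Working with approximations of $w$ for arbitrarily small $\varepsilon$ it can be seen that to show (2) for $w$ it suffices to demonstrate that for any pair $D, \hat{D}$ we have

$$(X_D + 2T_D + Y_D)(X_{\hat{D}} + 2T_{\hat{D}} + 4Y_{\hat{D}} + 2U_{\hat{D}} + 2Z_{\hat{D}}) + (X_{\hat{D}} + 2T_{\hat{D}} + Y_{\hat{D}})(X_D + 2T_D + 4Y_D + 2U_D + 2Z_D)$$
$$\ge (X_D + 2T_D + 3Y_D + 2U_D)(X_{\hat{D}} + 2T_{\hat{D}} + 2Y_{\hat{D}}) + (X_{\hat{D}} + 2T_{\hat{D}} + 3Y_{\hat{D}} + 2U_{\hat{D}})(X_D + 2T_D + 2Y_D).$$

This simplifies to

$$2X_D Z_{\hat{D}} + 4T_D Z_{\hat{D}} + 2Y_D Z_{\hat{D}} + 2X_{\hat{D}} Z_D + 4T_{\hat{D}} Z_D + 2Y_{\hat{D}} Z_D \ge 4Y_D Y_{\hat{D}} + 2U_D Y_{\hat{D}} + 2U_{\hat{D}} Y_D$$

and since by Lemma 4 we have $Z_D \ge U_D$, $Z_{\hat{D}} \ge U_{\hat{D}}$, it suffices to show that

$$(X_D + 2T_D) Z_{\hat{D}} + (X_{\hat{D}} + 2T_{\hat{D}}) Z_D \ge 2 Y_D Y_{\hat{D}}. \tag{6}$$

Omitting the common factor $N^{-4}$ from (4), which cancels in (6), we have

$$X_D + 2T_D = \sum_{i,j} \Bigl[ \Bigl(\sum_r d_{i,r} d_{j,r}\Bigr)^2 + \Bigl(\sum_s (1 - d_{i,s})(1 - d_{j,s})\Bigr)^2 + 2 \Bigl(\sum_r d_{i,r} d_{j,r}\Bigr)\Bigl(\sum_s (1 - d_{i,s})(1 - d_{j,s})\Bigr) \Bigr]$$
$$= \sum_{i,j} \Bigl(\sum_r d_{i,r} d_{j,r} + \sum_s (1 - d_{i,s})(1 - d_{j,s})\Bigr)^2 = \sum_{i,j} (x_{i,j} + y_{i,j})^2, \tag{7}$$

where $x_{i,j} = \sum_r d_{i,r} d_{j,r}$ and $y_{i,j} = \sum_s (1 - d_{i,s})(1 - d_{j,s})$. Similarly

$$Z_D = \sum_{i,j} \Bigl(\sum_{r,s} d_{i,r}\, d_{i,s}\, (1 - d_{j,r})(1 - d_{j,s})\Bigr) = \sum_{i,j} z_{i,j}^2 \tag{8}$$

where $z_{i,j} = \sum_r d_{i,r}(1 - d_{j,r})$, and, using (5),

$$Y_D = \sum_{i,j} \Bigl(\sum_r (1 - d_{i,r})\, d_{j,r}\Bigr)\Bigl(\sum_s d_{i,s} d_{j,s} + \sum_s (1 - d_{i,s})(1 - d_{j,s})\Bigr) = \sum_{i,j} z_{i,j}(x_{i,j} + y_{i,j}). \tag{9}$$

Similarly for $\hat{D} = (\hat{d}_{i,j})$. Writing $u_{i,j}$ for $x_{i,j} + y_{i,j}$ etc., the inequality (6) becomes

$$\Bigl(\sum_{i,j} u_{i,j}^2\Bigr)\Bigl(\sum_{i,j} \hat{z}_{i,j}^2\Bigr) + \Bigl(\sum_{i,j} \hat{u}_{i,j}^2\Bigr)\Bigl(\sum_{i,j} z_{i,j}^2\Bigr) \ge 2 \Bigl(\sum_{i,j} z_{i,j} u_{i,j}\Bigr)\Bigl(\sum_{i,j} \hat{z}_{i,j} \hat{u}_{i,j}\Bigr)$$

which holds since for any particular pairs $i, j$ and $g, h$,

$$u_{i,j}^2\, \hat{z}_{g,h}^2 + \hat{u}_{g,h}^2\, z_{i,j}^2 \ge 2 z_{i,j} u_{i,j}\, \hat{z}_{g,h} \hat{u}_{g,h}.$$

Turning to the right hand side inequality it is enough to show that

$$w(R(f, h) \wedge R(s, h)) \ge 2^{-1} w(R(s, h)),$$

equivalently

$$w(R(f, h) \wedge R(s, h)) \ge w(\neg R(f, h) \wedge R(s, h)).$$

Proceeding as above (but much simpler since it does not need to involve the $\hat{D}$) it is sufficient to show that

$$X_D + 2T_D \ge 2U_D + 2Z_D,$$

and indeed this holds by Lemma 4. □

Theorem 5 Let w be a probability function on SL satisfying Ex+SN. Let h, k, s, f be distinct constants from amongst the $a_1, a_2, a_3, \dots$. Then

w(R(f, h) | R(s, h) ∧ (R(s, k) ↔ R(f, k))) ≥ 1/2,
w(R(f, h) | R(s, h) ∧ (R(s, k) ∧ R(f, k))) ≥ 1/2.

Proof. Starting with the bi-implication case and proceeding as in the proof of the second inequality in Theorem 1 it is enough to show that

$$X_D + 2T_D \ge 2Y_D. \tag{10}$$

To this end notice that (again omitting the common factor $N^{-4}$)

$$X_D = \sum_{r,s} \Bigl( \Bigl(\sum_i d_{i,r} d_{i,s}\Bigr)^2 + \Bigl(\sum_i (1 - d_{i,r})(1 - d_{i,s})\Bigr)^2 \Bigr),$$
$$2T_D = 2 \sum_{r,s} \Bigl(\sum_i d_{i,r}(1 - d_{i,s})\Bigr)^2,$$
$$2Y_D = \sum_{r,s} 2 \Bigl( \Bigl(\sum_i d_{i,r}(1 - d_{i,s})\Bigr)\Bigl(\sum_i (1 - d_{i,r})(1 - d_{i,s})\Bigr) + \Bigl(\sum_i d_{i,r}(1 - d_{i,s})\Bigr)\Bigl(\sum_i d_{i,r} d_{i,s}\Bigr) \Bigr).$$

Writing

$$A_{r,s} = \sum_i d_{i,r} d_{i,s}, \qquad B_{r,s} = \sum_i (1 - d_{i,r})(1 - d_{i,s}), \qquad C_{r,s} = \sum_i d_{i,r}(1 - d_{i,s}),$$

the required inequality becomes

$$\sum_{r,s} \bigl( A_{r,s}^2 + B_{r,s}^2 + 2C_{r,s}^2 - 2A_{r,s} C_{r,s} - 2B_{r,s} C_{r,s} \bigr) \ge 0,$$

which clearly holds, since each summand equals $(A_{r,s} - C_{r,s})^2 + (B_{r,s} - C_{r,s})^2$.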
The second inequality in the theorem can likewise be reduced to showing that $X_D \ge Y_D$, and this follows from (10) and Lemma 4. □
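The inequalities used in this Appendix can also be spot-checked numerically. The sketch below computes |E|_D directly from formula (4) and tests Lemma 4, inequality (10) and X_D ≥ Y_D for a given {0, 1}-matrix D. It is only an illustration under stated assumptions: the function names `bar`, `quantities` and `appendix_inequalities_hold`, and the sample matrix, are hypothetical and not part of the paper.

```python
from fractions import Fraction
from itertools import product


def bar(D, E):
    """Return |E|_D computed from formula (4) for a square 0/1 matrix D."""
    N = len(D)
    (e11, e12), (e21, e22) = E

    def lit(d, e):
        # x^1 = x and x^0 = 1 - x, as in the text
        return d if e == 1 else 1 - d

    total = sum(
        lit(D[i][r], e11) * lit(D[i][s], e12)
        * lit(D[j][r], e21) * lit(D[j][s], e22)
        for i, j, r, s in product(range(N), repeat=4)
    )
    return Fraction(total, N ** 4)


def quantities(D):
    """Return the quantities X, Y, T, U, Z of the Appendix for w^D."""
    X = bar(D, ((1, 1), (1, 1))) + bar(D, ((0, 0), (0, 0)))
    Y = bar(D, ((1, 1), (1, 0))) + bar(D, ((0, 0), (0, 1)))
    T = bar(D, ((1, 0), (1, 0)))
    U = bar(D, ((1, 0), (0, 1)))
    Z = bar(D, ((0, 0), (1, 1)))
    return X, Y, T, U, Z


def appendix_inequalities_hold(D):
    """Check Lemma 4, inequality (10) and X >= Y for the matrix D."""
    X, Y, T, U, Z = quantities(D)
    lemma_4 = T >= U and Z >= U and X >= 2 * Z and X >= 2 * T
    inequality_10 = X + 2 * T >= 2 * Y
    conjunction_case = X >= Y
    return lemma_4 and inequality_10 and conjunction_case


if __name__ == "__main__":
    D = [[1, 1, 0], [0, 1, 0], [0, 0, 1]]  # arbitrary illustrative matrix
    print(quantities(D))
    print(appendix_inequalities_hold(D))
```

The brute-force sum over all index quadruples costs N⁴ operations, which is adequate for small matrices but would need the factored forms (7) to (9) for larger N.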