LOGIC OF PROBABILITY AND CONJECTURE

HARRY CRANE

Abstract. I introduce a formalization of probability in intensional Martin-Löf type theory (MLTT) and homotopy type theory (HoTT) which takes the concept of 'evidence' as primitive in judgments about probability. In parallel to the intuitionistic conception of truth, in which 'proof' is primitive and an assertion A is judged to be true just in case there is a proof witnessing it, here 'evidence' is primitive and A is judged to be probable just in case there is evidence supporting it. To formalize this approach, we regard propositions as types in MLTT and define for any proposition A a corresponding probability type Prob(A) whose inhabitants represent pieces of evidence in favor of A. Among several practical motivations for this approach, I focus here on its potential for extending meta-mathematics to include conjecture, in addition to rigorous proof, by regarding a 'conjecture in A' as a judgment that 'A is probable' on the basis of evidence.

1. Introduction

I aim to develop a formal logic of probability and conjecture which takes judgments about evidence as primitive and whose calculus is based on rules for reasoning about such judgments. I focus specifically here on the formal approach and its basic technical consequences. Much more can be said about a wide range of philosophical, historical, and technical motivations underlying this work, including its potential implications for understanding how probability and evidence often arise in legal proceedings, scientific discovery, and everyday decision making. For these additional considerations, I refer the reader to the longer version of this article [Cra18]. The study of conjecture and plausible reasoning in mathematics provides one concrete motivation for the theory introduced here.
Although conjecture is central to mathematical practice, and is often a necessary precursor to rigorous proof, there is no 'meta-mathematical' framework for analyzing how heuristics and intuitions guide formal 'rigorous' mathematics. Yet although conjecture and plausible reasoning do not obey strict logical rules, they seem to follow a sound rationale based on some commonly applicable techniques; see, e.g., Pólya's two volume work [Pól54] and Mazur's more recent discussion [Maz12].

For example, let A and B be mathematical propositions. Though it may seem logical for a mathematician who (i) conjectures A and (ii) has proven A → B (i.e., if A then B) to also conjecture B, there exists no formal theory to justify the move. We could attempt to formalize this rule by combining the conventional probability calculus (i.e., set Pr(Q) = 1 if Q is true and Pr(Q) = 0 if ¬Q is true) with a Lockean thesis for belief (i.e., conjecture A if 'Pr(A) ≥ t' for some pre-determined threshold 0 ≤ t ≤ 1) [Fol92].¹ In this case, since A → B ≡ ¬A ∨ B has been proven, we have Pr(¬A ∨ B) = 1, and since A is conjectured we must have Pr(A) ≥ t by the Lockean thesis. A routine application of the probability calculus gives Pr(¬A) ≤ 1 − t and 1 = Pr(¬A ∨ B) ≤ Pr(¬A) + Pr(B), so that Pr(B) ≥ 1 − Pr(¬A) ≥ t, leading to a conjecture in B. But whereas in classical logic any proposition Q is either true or false by the law of excluded middle, a conjecture about Q under this approach relies on the extra-mathematical data encoded by the probability operator Pr(·) and the Lockean threshold t. Here I seek a logic in which truths and conjectures can be treated as mathematical objects of equal standing, thus allowing 'plausible reasoning' in mathematics to be studied internally to the same formal system in which rigorous mathematics already takes place.
From this perspective, the logical rules governing 'rigorous mathematics', which is concerned only with truth and proof and not with conjecture and evidence, are a fragment of a more general framework that also incorporates conjecture. I achieve this by introducing a concept of probability on top of the existing syntax of Martin-Löf intensional type theory (MLTT) [ML84,ML87,ML96], and by interpreting a 'conjecture in A' as a formal judgment that 'A is probable' in the type theory.²

¹ There are other frameworks for belief revision, some based on the probability calculus and some not. I present this one here only for illustration. See [SF18,AGM85] and references therein for more detailed accounts.

² In the specific application to meta-mathematics for conjecture, I interpret the judgment that 'A is probable' to correspond to a conjecture in A. My natural language use of 'probable' differs from the use of 'plausible' in [Pól54,Maz12] on the grounds that mere 'plausibility' is not sufficient to conjecture A. In order to conjecture A, one must believe that A is 'likely' or 'probable' on the basis of the observed evidence, not that it is merely 'plausible'.

1.1. Meaning of a conjecture.

Wittgenstein's conception of 'meaning as use' [Wit73] figures into Martin-Löf's meaning explanation of types [ML87,ML96] and the Curry–Howard 'propositions as types' correspondence [Cur34,CF,How69], by which a mathematical proposition A is represented as a type A : Type whose terms a : A are proofs of A:

"the meaning of a proposition [...] is determined by that which counts as a verification of it." (Martin-Löf [ML96, p. 27])

Following Martin-Löf, we formalize probability in terms of evidence, by associating to each proposition A : Type a probability type Prob(A) : Type whose terms a′ : Prob(A) correspond to pieces of evidence in favor of A. In this case, the judgment a′ : Prob(A) indicates that 'a′ is evidence for A' (or that a′ witnesses the probability of A). In the example discussed above, the conjecture in A corresponds to a judgment a′ : Prob(A), the truth of A → B corresponds to a proof f : A → B, and the derived conjecture in B results from the formal judgment imp_f(a′) : Prob(B),³ which can be constructed by applying the elimination rule (4) for the probability type in the upcoming formalism. Thus, in parallel to Martin-Löf's meaning explanation of truth in terms of proof, we obtain a meaning explanation of probability in terms of evidence:

the meaning of a conjecture is determined by that which counts as evidence in favor of it.⁴

³ In MLTT the 'proof' f : A → B is simply a function which converts any proof of A (i.e., a : A) into a proof of B (i.e., f(a) : B).

Before beginning, I note that the intuitionistic approach to probability presented here seems to be autonomous from other conceptions of probability found throughout the literature, including Weatherson's work on 'intuitionistic probability' [Wea03]. In particular, I am not aware of any previous type-theoretic accounts of probability or any other formalism which treats conjectures as first class mathematical objects in their own right. I also note that the concept of 'probability' invoked here is intended in a purely epistemic sense, and thus is best compared and contrasted with other accounts of logical and subjective probability in [Car50, dF, Ram26].

Though not presented in traditional 'theorem-proof' style, the numbered statements in the coming sections can be proven rigorously. To aid the exposition, I defer these proofs to the appendix (Section 5), and opt to explain their meaning as intuitively and preformally as possible in terms of probability, evidence, and conjecture.

One feature of the constructive logic of MLTT is its amenability to formalization in a computer-based proof assistant, such as Coq or UniMath.
Some parts of the formal system introduced here have been formalized in UniMath; fully formalizing the theory is left as a topic of future work.

Though I attempt to provide as much explanation about type theory as possible, for brevity I assume basic knowledge of the rules and syntax of MLTT. I refer the reader to [Uni13,ML84] for a thorough introduction to MLTT and homotopy type theory. For more on the philosophy of intuitionism, see Brouwer [Brob, Broa, Bro81], Heyting [Hey71], Dummett [Dum00], and Martin-Löf [ML87,ML96].

2. Type theoretic probability

At a conceptual level, statements about truth and probability decompose into five main components:
• Judgment: A judgment is an assertion, based on context and witnessed by either proof or evidence, that a concept is true or probable.
• Context: Judgments about truth and probability depend on context.
• Concept: Judgments are made about concepts. Following Martin-Löf's meaning explanation, the meaning of a concept is determined by how the concept manifests itself.
• Proof: The truth of any concept manifests itself in proof.
• Evidence: A conjecture about a concept manifests itself in evidence.

⁴ Contrast this with the orthodox Bayesian approach to evidence, which takes the probability function Pr(·) as primitive and defines E as evidence for A just in case Pr(A | E) > Pr(A), where Pr(A | E) is the conditional probability of A given E. From this perspective, the probabilities determined by Pr(·) exist prior to and independently of judgments about evidence. In the perspective taken here, evidence is primitive: a judgment about the 'probability of A' cannot be made without evidence that supports the judgment. I explore this contrast further in [Cra18].

Here I write 'concept' instead of 'proposition' to suggest the more general settings (e.g., law, science) in which this thinking applies. The interpretation of types as mathematical concepts has also been suggested in [LP14].
To emphasize this interpretation below, I write A : Concept in place of A : Type, without any change to the rules of MLTT. These main components are expressed formally in MLTT with the following notation.

pre-formal | formal (MLTT)
Context | Γ
A is a concept | A : Concept
'A is probable' is a concept | Prob(A) : Concept
judgment that A is true | a : A
conjecture that A is true | a : Prob(A)

In particular, if A : Concept is a mathematical proposition, then a : A is interpreted as 'a is a proof of A' and a : Prob(A) as 'a is evidence for A'. I call any judgment 'a : A' a truth statement and 'a : Prob(A)' a conjecture.

The above setup is formalized in the following syntax, with the left of the turnstile providing context for the judgments on the right, written 'Context ⊢ Judgment'.

Formal: Γ ⊢ A : Concept
Semi-formal: Context ⊢ A is a concept
Interpretation: Context invokes the concept A.
Example: The context of arithmetic invokes the concept that addition is commutative: for all n, m : N, n + m = m + n.

Formal: Γ, A : Concept ⊢ Prob(A) : Concept
Semi-formal: Context, A is a concept ⊢ 'A is probable' is a concept
Interpretation: The context along with the concept A invokes the concept of the probability of A.
Example: The context of arithmetic and the concept that every even integer greater than 4 is the sum of two odd primes invokes the concept that such a claim is probable.

Formal: Γ, A : Concept ⊢ a : A
Semi-formal: Context, A is a concept ⊢ a is a proof of A
Interpretation: A judgment from context and a concept A that the truth of A is witnessed by proof a.
Example: A proof from the rules of arithmetic that addition is commutative.

Formal: Γ, A : Concept ⊢ a : Prob(A)
Semi-formal: Context, A is a concept ⊢ a is evidence for A
Interpretation: A judgment from context and a concept A that a conjecture in A is supported by evidence a.
Example: Empirical evidence that each of 6, ..., 198, 200 can be expressed as the sum of two odd primes supports Goldbach's conjecture.

The interpretation of Prob(A) as a body of evidence which supports conjectures about A follows by introducing appropriate rules for the probability type.

2.1. The Probability Type.

Reasoning about concepts in MLTT proceeds by applying rules of formation, introduction, elimination, and computation associated to each type. For lack of space, I refer the reader to the appendix of [KL11] and also [Uni13, Appendix A.2] and [ML84, LP14] for a list of the standard rules of MLTT. See Table 1 for a comparison between the syntax of MLTT and classical logic. Note well the distinction between the constructive, proof relevant syntax of MLTT and the non-constructive, truth functional syntax of classical logic. For example, the type ∑_{a:A} B(a) in MLTT and ∃a B(a) in classical logic can both be read as 'there exists a such that B(a)', but their interpretations differ: in MLTT this statement requires an explicit witness 〈a, b〉 for a : A and b : B(a), and any subsequent outcomes derived from 〈a, b〉 : ∑_{a:A} B(a) are formally obtained by manipulating 〈a, b〉 according to the rules of MLTT. In classical logic, on the other hand, the 'truth' of ∃a B(a) is sufficient for deriving further conclusions, without regard for the proof that establishes this claim.

The syntax of MLTT aligns the intuitionistic conceptions of truth and probability through the analogy

(1) proof : truth :: evidence : probability.

Whereas the lefthand side of (1) is fulfilled by judgments of the form a : A (in Martin-Löf's meaning explanation), the righthand side is filled out by the following rules for the probability type.

Formation rule:
\[
\frac{\Gamma \vdash A : \mathsf{Concept}}{\Gamma \vdash \mathsf{Prob}(A) : \mathsf{Concept}} \quad (\text{Prob-form}) \tag{2}
\]
Semi-formal: Associated to any concept A, whose meaning is determined by proofs that A is true, is a concept Prob(A), whose meaning is determined by evidence that A is true.
Example: From the concept of 'it is raining', derive the concept that 'it is probably raining'.

Type theory | Interpretation (Curry–Howard) | Classical logic | Interpretation
A : Type | A is a proposition | A | proposition
a : A | a is a proof of A | ⊢ A | A is true
A × B | A and B | A ∧ B | A true and B true
A + B | A or B (disjoint union) | A ∨ B | A true or B true
A → B | A implies B | ¬A ∨ B | if A then B
∑_{a:A} B(a) | there exists a : A s.t. B(a) | ∃a B(a) | there exists a s.t. B(a)
∏_{a:A} B(a) | B(a) for every a : A | ∀a B(a) | B(a) true for all a
0 | empty type (contradiction) | ⊥ | false/contradiction
A → 0 | A is false | ¬A | not A
a =_A a′ | proofs that a and a′ are identical | |

Table 1. Comparison between syntax of MLTT and classical logic. Note that the constructive nature of MLTT means that a proof of 'B(a) for all a : A' requires an explicit term f : ∏_{a:A} B(a) that associates a witness f(a) : B(a) to every a : A. Similarly, a proof that 'there exists a : A such that B(a)' requires a witness a : A along with b(a) : B(a) so that 〈a, b(a)〉 : ∑_{a:A} B(a). The (propositional) identity type a =_A a′, which consists of all proofs that a and a′ are identical terms of A, has no analog in classical logic.

Introduction rule:
\[
\frac{\Gamma \vdash A : \mathsf{Concept}}{\Gamma,\, a : A \vdash \mathsf{evid}_A(a) : \mathsf{Prob}(A)} \quad (\text{Prob-intro}) \tag{3}
\]
Semi-formal: Proof is the definitive and strongest kind of evidence: any proof of A (i.e., a : A) determines evidence for A (i.e., evid_A(a) : Prob(A)).
Example: From definitive proof that 'it is raining' (e.g., seeing rain through the window), deduce the weaker statement that 'it is probably raining'.

Elimination rule 1 (Rule of implication):
\[
\frac{\Gamma \vdash A : \mathsf{Concept} \qquad \Gamma,\, x : \mathsf{Prob}(A) \vdash C(x) : \mathsf{Concept} \qquad \Gamma,\, a : A \vdash d(a) : C(\mathsf{evid}_A(a))}{\Gamma,\, x : \mathsf{Prob}(A) \vdash \mathsf{imp}_d(x) : \mathsf{Prob}(C(x))} \quad (\text{Prob-elim-1}) \tag{4}
\]
Semi-formal: Reasoning about evidence is guided by reasoning about proof. Thus, if A implies C, then evidence for A implies evidence for C.
Example: Since 'it is raining' implies that 'the roads are wet', evidence that 'it is raining' supports the conjecture that 'the roads are probably wet'.

Elimination rule 2 (Rule of inference):
\[
\frac{\Gamma \vdash A : \mathsf{Concept} \qquad \Gamma,\, x : \mathsf{Prob}(A) \vdash C(x) : \mathsf{Concept} \qquad \Gamma,\, a : A \vdash d(\mathsf{evid}_A(a)) : C(\mathsf{evid}_A(a))}{\Gamma,\, x : \mathsf{Prob}(A) \vdash \mathsf{inf}_d(x) : C(x)} \quad (\text{Prob-elim-2}) \tag{5}
\]
Semi-formal: If A implies C but only through the evidence induced by proofs of A, then evidence of A implies C.
Example: 'It is snowing' implies that 'it is cold outside', but only through the evidence required to determine that there is evidence for 'it is probably snowing'. Thus, from the judgment 'it is probably snowing' we can deduce that 'it is cold outside'.

Computation rule 1:
\[
\frac{\Gamma \vdash A : \mathsf{Concept} \qquad \Gamma,\, x : \mathsf{Prob}(A) \vdash C(x) : \mathsf{Concept} \qquad \Gamma,\, a : A \vdash d(a) : C(\mathsf{evid}_A(a))}{\Gamma,\, a : A \vdash \mathsf{imp}_d(\mathsf{evid}_A(a)) \equiv \mathsf{evid}_{C(\mathsf{evid}_A(a))}(d(a)) : \mathsf{Prob}(C(\mathsf{evid}_A(a)))} \quad (\text{Prob-comp-1}) \tag{6}
\]
Semi-formal: The computation rule combines (3) and (4) to make imp_d defined in (4) compatible with its constructor d.

Computation rule 2:
\[
\frac{\Gamma \vdash A : \mathsf{Concept} \qquad \Gamma,\, x : \mathsf{Prob}(A) \vdash C(x) : \mathsf{Concept} \qquad \Gamma,\, a : A \vdash d(\mathsf{evid}_A(a)) : C(\mathsf{evid}_A(a))}{\Gamma,\, a : A \vdash \mathsf{inf}_d(\mathsf{evid}_A(a)) \equiv d(\mathsf{evid}_A(a)) : C(\mathsf{evid}_A(a))} \quad (\text{Prob-comp-2}) \tag{7}
\]
Semi-formal: The computation rule combines (3) and (5) to make inf_d defined in (5) compatible with its constructor d.

0-rule:
\[
\Gamma \vdash \mathsf{Prob}(0) = 0 : \mathsf{Concept} \quad (\text{Prob-0}) \tag{8}
\]
Semi-formal: Since 0 : Concept is the concept of emptiness, i.e., has no terms, Prob(0) is the body of evidence supporting the claim '0 is inhabited'. But since 0 is uninhabited by definition, any evidence for 0 would immediately lead to contradiction; thus, Prob(0) is also uninhabited.

Additional elimination rules could be introduced to reflect other ways to reason with evidence, e.g., a relation for judging relative strength of evidence (see Section 2.6). To streamline the discussion here, I restrict attention to the fragment described by the rules above.
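To make the shape of these rules concrete, the following is a minimal Lean 4 sketch of the non-dependent fragment of (2), (3), (4), (6), and (8). It is an illustration under stated assumptions rather than the formalization referred to in this article: Lean's `Type` stands in for Concept, the names `Prob`, `evid`, `imp`, `imp_evid`, and `probEmpty` are postulated as axioms, the computation rule is stated as a propositional equality instead of a judgmental one, and the dependent rule (5) is not covered.

```lean
namespace ProbRules
noncomputable section

-- Assumption: `Type` plays the role of the paper's `Concept`.

/-- Formation (2): every concept `A` has a probability concept `Prob A`. -/
axiom Prob : Type → Type

/-- Introduction (3): any proof of `A` determines evidence for `A`. -/
axiom evid {A : Type} : A → Prob A

/-- Elimination 1 (4), non-dependent case: a proof of `A → B`
    turns evidence for `A` into evidence for `B`. -/
axiom imp {A B : Type} (f : A → B) : Prob A → Prob B

/-- Computation 1 (6), stated propositionally (an assumption of this sketch). -/
axiom imp_evid {A B : Type} (f : A → B) (a : A) :
    imp f (evid a) = evid (f a)

/-- 0-rule (8), in the one direction that is not automatic:
    evidence for the empty concept is impossible. -/
axiom probEmpty : Prob Empty → Empty

/-- A proof of `A` and a proof of `A → B` give evidence for `B`. -/
example {A B : Type} (f : A → B) (a : A) : Prob B :=
  imp f (evid a)

end
end ProbRules
```

Postulating the rules as axioms keeps the sketch short; a fuller treatment would need the dependent eliminators and judgmental computation rules, which Lean's axioms cannot express directly.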
These rules complete the righthand side of the analogy in (1), warranting the interpretation

a : A :: evid_A(a) : Prob(A)
proof : truth :: evidence : probability.

The next few sections establish some formal consequences of these rules which make precise the pre-formal interpretation of Prob(A) as a body of evidence and the judgments of the form a : Prob(A) as conjectures about the truth of A. To help the exposition, I defer formal proof of these results to Section 5.

2.2. Carriers of probability.

By the introduction rule (3), evidence can be derived from any proof, and thus to every a : A there corresponds a piece of evidence evid_A(a) : Prob(A). But because judgments about probability are conjectural, the interpretation of the probability type in terms of evidence is viable only if the logic permits the judgment a : Prob(A) to be made without definitive proof of A. In other words, the formal calculus should be consistent with but should not require

(9) ∏_{a:Prob(A)} ∑_{x:A} (evid_A(x) =_{Prob(A)} a).

To every piece of evidence for A (i.e., a : Prob(A)) there exists a proof of A (i.e., x : A) that is compatible with that evidence (i.e., p : evid_A(x) = a).

If, for example, (9) were required to hold, then any conjecture a : Prob(A) would have to correspond to a proof, thus defeating the purpose of the established formalism as a logic for handling evidence. But while it is possible that a conjecture a : Prob(A) may be valid without any x : A for which evid_A(x) = a, the formalism is consistent with this stringent correspondence between Prob(A) and A:

(10) ∏_{a:Prob(A)} ((∑_{x:A} (evid_A(x) =_{Prob(A)} a) → 0) → 0).

It cannot be ruled out that every conjecture (i.e., a : Prob(A)) corresponds to a proof (i.e., x : A) through evid_A : A → Prob(A).

According to (10), the logic is compatible with judgments of someone who refuses to conjecture without having definite proof.
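The relationship between the types in (9) and (10) can be written out in a few lines of Lean 4. This is only a sketch under assumptions: `Prob` and `evid` are postulated, `Type` stands in for Concept, Lean's subtype `{ x : A // evid x = a }` approximates the Σ-type over the identity type (Lean's equality is proof-irrelevant, unlike the identity type of MLTT), and the names `EveryEvidenceIsProof`, `NotRefutable`, and `nineImpliesTen` are hypothetical. The sketch shows only that any inhabitant of (9) yields an inhabitant of (10); it does not derive (10) from the rules.

```lean
namespace Carriers
noncomputable section

-- Assumptions of this sketch: `Prob` and `evid` are postulated.
axiom Prob : Type → Type
axiom evid {A : Type} : A → Prob A

/-- The type in (9): every piece of evidence arises from some proof. -/
abbrev EveryEvidenceIsProof (A : Type) : Type :=
  (a : Prob A) → { x : A // evid x = a }

/-- The type in (10): for no piece of evidence can one refute
    that it arises from some proof. -/
abbrev NotRefutable (A : Type) : Type :=
  (a : Prob A) → ({ x : A // evid x = a } → Empty) → Empty

/-- Any inhabitant of (9) gives an inhabitant of (10):
    feed the witness from (9) to the purported refutation. -/
def nineImpliesTen {A : Type}
    (h : EveryEvidenceIsProof A) : NotRefutable A :=
  fun a refute => refute (h a)

end
end Carriers
```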
The observation in (10) goes hand in hand with the intended interpretation of the inhabitants of Prob(A) as carriers of evidence. For if ∑_{x:A} (evid_A(x) =_{Prob(A)} a) → 0 were consistent for some A and a : Prob(A), then the interpretation of a as evidence for A would be called into question. Since ∑_{x:A} (evid_A(x) =_{Prob(A)} a) → 0 rules out that a corresponds to some proof of A, the interpretation of a as evidence supporting the conjecture that A is provable becomes obscured. So even though the logic is consistent with probability judgments which do not necessarily correspond to a direct proof, in order for a probability judgment (a : Prob(A)) to qualify as a credible statement about 'evidence in favor of A', bona fide evidence a : Prob(A) must at least suggest that a corresponds to a proof of A:

(11) ∏_{a:Prob(A)} Prob(∑_{x:A} (evid_A(x) =_{Prob(A)} a)).

Any piece of evidence for A (i.e., a : Prob(A)) gives rise to evidence that there exists a proof of A (i.e., x : A) that is compatible with that evidence (i.e., p : evid_A(x) = a).

2.3. Structure of probability.

The structure of probability induced by the formal rules (2)-(8) can be summarized neatly for free-standing concepts (i.e., non-dependent types). Suppose A, B : Concept are such that B is provable whenever A is, i.e., A → B is inhabited. Informally, since the truth of A implies the truth of B (i.e., A → B) and evidence for A hints at the truth of A, evidence for A should also hint at the truth of B. Formally, this follows by applying the elimination rule (4) to any f : A → B. In particular, define C ≡ λa.B : A → Concept in (4)⁵ so that for any f : A → B and any context Γ one can derive⁶

Γ, a : A ⊢ f(a) : C(a) ≡ B.

Then by (4) and (6), we have imp_f : Prob(A) → Prob(B) such that imp_f(evid_A(a)) ≡ evid_B(f(a)). When combined with (5) and (7), we obtain the following commutative diagram for A, B, C : Concept.
\[
\begin{array}{ccccc}
A & \xrightarrow{\;f\;} & B & \xrightarrow{\;g\;} & C \\
\Big\downarrow{\scriptstyle \mathsf{evid}_A} & & \Big\downarrow{\scriptstyle \mathsf{evid}_B} & & \Big\downarrow{\scriptstyle \mathsf{evid}_C} \\
\mathsf{Prob}(A) & \xrightarrow{\;\mathsf{imp}_f\;} & \mathsf{Prob}(B) & \xrightarrow{\;\mathsf{imp}_g\;} & \mathsf{Prob}(C)
\end{array}
\tag{12}
\]
The diagram also carries the composites g ∘ f : A → C and imp_g ∘ imp_f = imp_{g∘f} : Prob(A) → Prob(C), the arrows d ∘ evid_A : A → B and e ∘ evid_B : B → C, and the diagonal arrows inf_d : Prob(A) → B and inf_e : Prob(B) → C.

⁵ Here I have used λ-abstraction to define the function C : A → Concept which assigns each a to B. In general the notation λx.y : X → Y is a function that assigns each x : X to some y : Y.

⁶ Given a : A and f : A → B, we can apply f to a to obtain f(a) : B.

The arrows in (12) satisfy

imp_f ∘ evid_A ≡ evid_B ∘ f : A → Prob(B)
inf_d ∘ evid_A ≡ d ∘ evid_A : A → B
(13) imp_{d∘evid_A} = imp_{inf_d∘evid_A} = inf_{evid_B∘inf_d}
(14) imp_{e∘evid_B} = imp_{inf_e∘evid_B} = inf_{evid_C∘inf_e}.

Note that for general expressions x and x′ of type A, I write x ≡ x′ : A to denote judgmental equality and x =_A x′ to denote propositional identity in MLTT. Propositional identity x =_A x′ is shorthand for the type Id(x, x′, A) in MLTT, whose terms are proofs that x and x′ are identical. When the type is clear from context, I often omit it, writing x = x′ for propositional identity.

2.4. Combining evidence.

The rules of MLTT together with the rules for the probability type induce a logic for combining evidence of different assertions and deriving new conjectures from old. For example, as carriers of evidence for 'A and B', the terms of Prob(A × B) ought to give rise to evidence for A and evidence for B individually. That is, when considering the conjecture x : Prob(A × B) for 'A and B', one can disregard the relevance of x to B (respectively, A) and derive a conjecture for A (resp. B) on its own:

(15) splitprob_{A,B} : Prob(A × B) → Prob(A) × Prob(B).

Evidence for 'A and B' can be split into separate pieces of evidence for A and B individually.

Similarly, when in possession of evidence for A or evidence for B, one can derive evidence for 'A or B':

(16) combprob_{A,B} : Prob(A) + Prob(B) → Prob(A + B).

Evidence for 'A or B' can be derived from evidence for A or evidence for B.

Together (15) and (16) give the hierarchy

(17) Prob(A × B) → Prob(A) × Prob(B) → Prob(A) + Prob(B) → Prob(A + B).
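One way the maps in (15)-(17) can be obtained from the implication rule is sketched below in Lean 4. The sketch assumes only a postulated `Prob` and a non-dependent `imp`, with `Type` standing in for Concept; `splitprob` and `combprob` follow the names in the text, while `prodToSum` and `hierarchy` are hypothetical names. The middle arrow of (17) involves a choice of component, and the left one is picked here arbitrarily.

```lean
namespace CombineEvidence
noncomputable section

-- Assumptions of this sketch: `Prob` and the non-dependent `imp` are postulated.
axiom Prob : Type → Type
axiom imp {A B : Type} (f : A → B) : Prob A → Prob B

/-- (15): split evidence for `A × B` by lifting the two projections. -/
def splitprob {A B : Type} (x : Prob (A × B)) : Prob A × Prob B :=
  (imp Prod.fst x, imp Prod.snd x)

/-- (16): lift the coproduct injections by case analysis. -/
def combprob {A B : Type} : Sum (Prob A) (Prob B) → Prob (Sum A B)
  | Sum.inl a => imp Sum.inl a
  | Sum.inr b => imp Sum.inr b

/-- Middle arrow of (17): keep the left component (an arbitrary choice). -/
def prodToSum {A B : Type} (p : Prob A × Prob B) : Sum (Prob A) (Prob B) :=
  Sum.inl p.1

/-- The composite in (17): from evidence for `A × B` to evidence for `A + B`. -/
def hierarchy {A B : Type} (x : Prob (A × B)) : Prob (Sum A B) :=
  combprob (prodToSum (splitprob x))

end
end CombineEvidence
```

Nothing in this sketch produces a map in the reverse direction, which matches the claim that the hierarchy is one-way in general.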
The implications in (17) do not, in general, go in reverse. One may not, for example, feel compelled to combine two pieces of evidence, one for A and one for B, into a single conjecture for A and B jointly, as it might not be clear whether the specific pieces of evidence, say, a : Prob(A) and b : Prob(B), are compatible as evidence for 'A and B'. Similarly, having evidence for 'A or B' need not be sufficient for deriving evidence for either of the two individually. Indeed, for a given proposition A, one might postulate the law of excluded middle LEM_A : A + ¬A without specifying which of A or ¬A holds. Asserting LEM_A allows the derivation evid_{A+¬A}(LEM_A) : Prob(A + ¬A) by (3), but without further evidence as to which of A or ¬A holds the logical calculus does not justify a conjecture in A or ¬A individually. These observations illustrate the 'proof relevant' character of MLTT: e.g., to conjecture 'A and B', it is not enough to simply have evidence for A and evidence for B; the two pieces of evidence must also be compatible with one another.

The relations shown in (12), (15), and (16) for non-dependent types A and B extend to relations about universal and existential statements for dependent types B : A → Concept.

(18) Prob(∏_{a:A} B(a)) → ∏_{a:A} Prob(B(a)).

From evidence that B holds for every proof of A, derive evidence of B(a) from any particular proof a : A.

(19) ∑_{a:A} Prob(B(a)) → Prob(∑_{a:A} B(a)).

From any proof of A (i.e., a : A) for which there is evidence for B (i.e., b : Prob(B(a))), derive evidence that B holds for some proof of A (i.e., Prob(∑_{a:A} B(a))).

2.5. Logic for handling evidence.

The opening discussion of Section 1 raises the question of how a conjecture for B can be justified from (i) a conjecture for A and (ii) proof of A → B. In our formalism, this corresponds to exhibiting a term of type

Prob(A) × (A → B) → Prob(B),

which we have already shown through the commutative diagram in (12).
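The three maps just discussed, (18), (19), and Prob(A) × (A → B) → Prob(B), can each be sketched in Lean 4 as a single application of the implication rule. As before this is an illustration under assumptions: `Prob` and a non-dependent `imp` are postulated, `Type` stands in for Concept, and the names `probPi`, `probSigma`, and `conjectureFromProof` are hypothetical.

```lean
namespace DependentEvidence
noncomputable section

-- Assumptions of this sketch: `Prob` and the non-dependent `imp` are postulated.
axiom Prob : Type → Type
axiom imp {A B : Type} (f : A → B) : Prob A → Prob B

/-- (18): evidence for a universal statement specializes to each `a : A`,
    by lifting the evaluation map `f ↦ f a`. -/
def probPi {A : Type} {B : A → Type}
    (x : Prob ((a : A) → B a)) : (a : A) → Prob (B a) :=
  fun a => imp (fun (f : (a : A) → B a) => f a) x

/-- (19): a proof `a : A` together with evidence for `B a` gives evidence
    for the existential statement, by lifting the pairing map `b ↦ ⟨a, b⟩`. -/
def probSigma {A : Type} {B : A → Type} :
    (Σ a : A, Prob (B a)) → Prob (Σ a : A, B a)
  | ⟨a, y⟩ => imp (fun (b : B a) => (⟨a, b⟩ : Σ a : A, B a)) y

/-- The motivating inference of Section 1: from a conjecture in `A`
    and a proof of `A → B`, obtain a conjecture in `B`. -/
def conjectureFromProof {A B : Type} : Prob A × (A → B) → Prob B
  | (x, f) => imp f x

end
end DependentEvidence
```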
Other similar inference rules for the probability type can be derived from the rules (2)-(8) by noting the distinction between A → B in MLTT and classical logic. In classical logic, A → B is defined as the material conditional ¬A ∨ B, read as 'if A then B'. By the law of excluded middle this is equivalent to ¬¬(¬A ∨ B) ≡ ¬(A ∧ ¬B), yielding logical equivalence among the statements

¬A ∨ B ≡ A → B ≡ ¬(A ∧ ¬B).

But in the constructive logic of MLTT, the above three statements have different meanings and are related by the hierarchy

(20) (¬A + B) → (A → B) → ¬(A × ¬B),

which in turn elicits a corresponding hierarchy among the corresponding probability statements; see (27).

2.5.1. Evidence under ¬A + B.

Working from left to right in (20), assume first that there is evidence for A and proof of 'B or not A' (i.e., ¬A + B). By the elimination rule for the coproduct type ¬A + B, we reason by case analysis. If B is already known, then B can trivially be derived, regardless of the evidence for A. And by the implication rule (4) and 0-rule (8), ¬A ≡ A → 0 implies Prob(A) → 0, meaning that the evidence for A can be used to derive a contradiction, from which anything follows. Together this gives an inhabitant of

(21) Prob(A) × (¬A + B) → B.

From a conjecture in A and proof of 'B or not A', derive a proof of B.

Notice as a special case that taking B ≡ A in (21) gives

Prob(A) × (A + ¬A) → A,

so that, in particular, evidence for A and LEM_A : A + ¬A is enough to construct a proof of A. Thus, from evidence of A one can derive truth of A deductively by postulating LEM_A : A + ¬A. Though this seems counterintuitive at first, it is clarified by considering the meaning of the judgment LEM_A : A + ¬A in the logic of MLTT. (In MLTT, LEM_A : A + ¬A is a 'proof' that 'A or ¬A' holds. By the introduction rule for coproducts, such a proof corresponds either to a proof of A or a proof of ¬A. If the former, then A is true.
If the latter, then the evidence for A contradicts the proof of 'not A', and from a contradiction anything follows.)

On the other hand, from proof of A and a conjecture 'B or not A', one derives evidence for B:

(22) A × Prob(¬A + B) → Prob(B).

2.5.2. Evidence under A → B.

The structure of evidence summarized in (12) implies that evidence for B can be derived from evidence for A and proof that 'A implies B', i.e., from a : Prob(A) and f : A → B derive imp_f(a) : Prob(B). In fact, this can be done so that the evidence for A is compatible with the derived evidence for B, justifying a conjecture in 'A and B' jointly:

(23) Prob(A) × (A → B) → Prob(A × B).

Given evidence for A and proof that 'A implies B', conjecture 'A and B'.

Also by (4), any verification of A and evidence that 'A implies B' combine into evidence for 'A and B':

(24) A × Prob(A → B) → Prob(A × B).

2.5.3. Evidence under ¬(A × ¬B).

When equipped with proof that 'A and not B is not the case' (i.e., ¬(A × ¬B)), evidence for A must suggest that ¬B is not the case (i.e., ¬¬B) since the evidence for A hints that A is true while the proof of ¬(A × ¬B) implies that A and ¬B cannot both be true:

(25) Prob(A) × ¬(A × ¬B) → Prob(A × ¬¬B).

Evidence for A and proof of ¬(A × ¬B) combine to give evidence for A × ¬¬B. (Recall that, in general, ¬¬B and B are not identical in MLTT, because MLTT does not require the law of excluded middle.)

Similarly, from proof of A and evidence for ¬(A × ¬B), one can derive evidence for ¬¬B since knowing A and having evidence that A and ¬B cannot both hold gives evidence against ¬B:

(26) A × Prob(¬(A × ¬B)) → Prob(A × ¬¬B).

Comparing the statements in (21), (23), and (25) and ignoring the appearance of A in their conclusion⁷ gives the following commutative diagram, which agrees with the hierarchy in (20).

\[
\begin{array}{ccc}
\neg A + B & \mathsf{Prob}(A) \times (\neg A + B) \;\longrightarrow & B \\
\Big\downarrow & \Big\downarrow & \Big\downarrow \\
A \to B & \mathsf{Prob}(A) \times (A \to B) \;\longrightarrow & \mathsf{Prob}(B) \\
\Big\downarrow & \Big\downarrow & \Big\downarrow \\
\neg(A \times \neg B) & \mathsf{Prob}(A) \times \neg(A \times \neg B) \;\longrightarrow & \mathsf{Prob}(\neg\neg B)
\end{array}
\tag{27}
\]

2.6. Grades of evidence.
We have so far discussed the probability type as a way of representing evidence 'at level one', i.e., evidence of a proposition. Unlike the conventional numerical approach to probability as a measure of evidence, the formalism presented here provides no immediate way to compare different pieces of evidence in terms of which is 'stronger'. In the same way that different proofs of A cannot be compared on the basis of which better establishes the truth of A (both establish the truth of A, but in possibly different ways), there is no aspect of the formal system which allows one to judge, for example, that a : Prob(A) is 'stronger evidence' for A than a′ : Prob(A), or that a makes A 'more probable' than a′ does.

There are two possible ways to incorporate such a notion of evidential strength into this formalism. One is to extend the rules (2)-(8) by adding a relation ≤_A : Prob(A) × Prob(A) → Bool for each A : Concept along with rules for ≤_A that agree with the interpretation of ≤_A(a, a′) as 'a′ is stronger evidence for A than a'. A second possibility requires no additional rules, but instead follows by iterating the probability type constructor Prob : Concept → Concept to obtain an inductive hierarchy of different 'grades of evidence', beginning with truth (A), then evidence of truth (Prob(A)), evidence of evidence (Prob(Prob(A))), evidence of evidence of evidence (Prob(Prob(Prob(A)))), and so on. For each A : Concept, the formalism captures these different grades of evidence by the inductive definition

Prob^n(A) : Concept
Prob^0(A) :≡ A
Prob^{n+1}(A) :≡ Prob(Prob^n(A)),

so that Prob^n(A) consists of the nth level evidence of A. The theorems expressed throughout Section 2.5 can be extended to these 'higher probability types' in a straightforward way. For example, for m, n ≥ 0, we can extend (23) to

Prob^m(A) × Prob^n(A → B) → Prob^{m+n}(B).

Both of these possible extensions are interesting for future examination, but are not discussed here due to space limitations; see [Cra18] for further discussion.

⁷ For example, from the conclusion Prob(A × ¬¬B) in (26), we can apply (15) to get splitprob_{A,¬¬B} : Prob(A × ¬¬B) → Prob(A) × Prob(¬¬B), which we compose with the projection map pr_{¬¬B} : Prob(A) × Prob(¬¬B) → Prob(¬¬B), 〈x, y〉 ↦ y, to obtain Prob(¬¬B).

3. Homotopy Probability Theory

So far the logic of probability described above has been purely syntactic, expressed as a logic for reasoning about concepts in MLTT. Though Martin-Löf's meaning explanation provides an interpretation of this syntax in terms of proofs and evidence for propositions, we can gain additional insights by interpreting the syntax of MLTT into homotopy type theory (HoTT). In HoTT, the types in MLTT are interpreted as homotopy types (i.e., topological spaces up to homotopy equivalence), and the calculus is empowered by the resulting univalence axiom, by which two types A, B : Concept are regarded as identical (i.e., A =_Concept B) if their associated homotopy types are homotopy equivalent. To emphasize the interpretation in HoTT with univalence, we write A : U (instead of A : Concept) to indicate that A is a homotopy type in a univalent universe U (i.e., a universe of types for which the univalence axiom holds).

With A ≃ B denoting that A, B : U are homotopy equivalent⁸ and A =_U B signifying that A and B are identical as homotopy types in U, the univalence axiom states, roughly, that for all A, B : U

(28) (A ≃ B) ≃ (A =_U B)

equivalence is equivalent to identity.

Formally, this is accomplished by constructing the canonical map stating that identity implies equivalence,

idtoequiv_{A,B} : (A =_U B) → (A ≃ B),

and asserting as an axiom,

ua : ∏_{A,B:U} isequiv(idtoequiv_{A,B}),

so that idtoequiv_{A,B} is an equivalence between A =_U B and A ≃ B for all A, B : U.⁹

The univalence axiom is a powerful and intriguing proposal in Voevodsky's Univalent Foundations program [APW13, PW12,Uni13].
As it is impossible to discuss the many interesting aspects of HoTT and univalence in this short presentation, I provide here only a cursory overview of how the familiar probabilistic concepts of independence, conditional probability, and additivity can be conceived in the proposed type theoretic version of probability. I refer the reader to [Awo14, Awo16, APW13, Tse16, Tse17, Shu17, Uni13] for further details about HoTT.

^8 A ≃ B is formally defined as the type ∑_{f:A→B} isequiv(f), where

    isequiv(f) :≡ (∑_{g:B→A} f ∘ g ∼ id_B) × (∑_{h:B→A} h ∘ f ∼ id_A),

for id_C ≡ λc.c : C → C for any C : U.

^9 Following the convention of [Uni13], I write ua(p) : A =_U B to indicate the image of p : A ≃ B under ua.

3.1. Identical concepts have identical probabilities.

It is intuitive that identical concepts ought to have identical probabilities, as can be proven using the induction rule for identity types in MLTT without appealing to univalence: for all A, B : Concept and a, a′ : A,

(29)    (a =_A a′) → (evid_A(a) =_{Prob(A)} evid_A(a′))
(30)    (A =_Concept B) → (Prob(A) =_Concept Prob(B)).

In MLTT, however, there is no mechanism for proving, e.g., A × B =_Concept B × A. (In MLTT, we can only prove that A × B and B × A are isomorphic, i.e., A × B ≃ B × A, but we need the univalence axiom to derive A × B =_U B × A from this isomorphism.) With the univalence axiom, we have, for all A, B : U,

(31)    (A ≃ B) → (Prob(A) ≃ Prob(B)),

from which several obvious statements follow as corollaries, including Prob(A × B) ≃ Prob(B × A) and Prob(A + B) ≃ Prob(B + A). The univalence axiom has additional consequences for more nuanced aspects of the probability type, such as conditional probability, independence, and additivity, which warrant a much more in-depth discussion than the brief introduction below.

3.2. Conditional probability.

In practice, it is common to form a judgment, such as 'A and B' is probable, by combining new evidence for B with old evidence for A.
Formally, we define the conditional probability of B given a : Prob(A) as the dependent type Prob(B | −) : Prob(A) → Concept given for each a : Prob(A) by

(32)    Prob(B | a) :≡ ∑_{x:Prob(A×B)} (imp_{pr_A}(x) =_{Prob(A)} a).

Conditional evidence for B given evidence for A (i.e., a : Prob(A)) consists of evidence for 'A and B' (i.e., x : Prob(A × B)) along with proof that x is compatible with a (i.e., p : imp_{pr_A}(x) = a). In (32), pr_A : A × B → A, ⟨a, b⟩ ↦ a, is the canonical projection map and imp_{pr_A} : Prob(A × B) → Prob(A) is the map constructed by applying the implication rule (4) to pr_A.

According to (32), the inhabitants of Prob(B | a) correspond to 'conditional conjectures', i.e., a conjecture in B that is conditional on the conjecture a : Prob(A). Such a conjecture can be asserted just in case there is evidence x : Prob(A × B) for A and B along with proof that x is compatible with a. A conditional conjecture in B given a : Prob(A) is thus stronger than a conjecture in 'A and B' alone because the conditional probability requires that x : Prob(A × B) is compatible with a specific conjecture a : Prob(A). This may at first seem counterintuitive since it appears that the previous evidence for A has already done "half the work" in establishing the conjecture in 'A and B'. But, at the same time, a : Prob(A) constrains what can serve as evidence for the conditional conjecture in B, because the conditional conjecture is required to be compatible with a : Prob(A).

By analogy to the classical law of total probability in the probability calculus,^10 we observe a similar equivalence between Prob(A × B) and the total space of all conditional probabilities Prob(B | a) over all a : Prob(A),

(33)    Prob(A × B) ≃ ∑_{a:Prob(A)} Prob(B | a).

Having evidence for 'A and B' is equivalent to having a piece of evidence for A (i.e., a : Prob(A)) along with conditional evidence for B that is compatible with a.
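The definition (32) and the decomposition behind (33) can be sketched in Lean 4 as follows. This is an illustration under stated assumptions, not the paper's formal system: the structure ProbTheory and the names CondProb, condMap, and uncondMap are hypothetical; only the map imp is used, so the remaining rules of the probability type are omitted; and Lean's proof-irrelevant equality stands in for the identity type, which makes the round-trip proofs much shorter than the path-induction argument required in HoTT.

```lean
-- Hypothetical packaging: the names Prob, evid, imp follow the paper, but
-- bundling them into a structure is an assumption made for this sketch.
structure ProbTheory where
  Prob : Type → Type
  evid : {A : Type} → A → Prob A
  imp  : {A B : Type} → (A → B) → Prob A → Prob B

-- Definition (32): evidence x for 'A and B' whose image under imp_{pr_A} is a.
abbrev CondProb (P : ProbTheory) {A : Type} (B : Type) (a : P.Prob A) : Type :=
  { x : P.Prob (A × B) // P.imp (fun q : A × B => q.1) x = a }

-- Split x into evidence for A plus conditional evidence for B.
def condMap (P : ProbTheory) {A B : Type} (x : P.Prob (A × B)) :
    Sigma (fun a : P.Prob A => CondProb P B a) :=
  ⟨P.imp (fun q : A × B => q.1) x, ⟨x, rfl⟩⟩

-- Forget the compatibility proof and keep the evidence for 'A and B'.
def uncondMap (P : ProbTheory) {A B : Type} :
    Sigma (fun a : P.Prob A => CondProb P B a) → P.Prob (A × B)
  | ⟨_, ⟨x, _⟩⟩ => x

-- Round trip starting from evidence for 'A and B'.
theorem uncond_cond (P : ProbTheory) {A B : Type} (x : P.Prob (A × B)) :
    uncondMap P (condMap P x) = x := rfl

-- Round trip starting from a pair (a, conditional evidence given a).
theorem cond_uncond (P : ProbTheory) {A B : Type} :
    (z : Sigma (fun a : P.Prob A => CondProb P B a)) →
      condMap P (uncondMap P z) = z
  | ⟨_, ⟨_, rfl⟩⟩ => rfl
```

The two theorems together say that condMap and uncondMap are mutually inverse, which is the content of (33) in this simplified setting.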
For any A, B : U, the equivalence (33) is established by the conditionalization map

(34)    cond_{B|A} : Prob(A × B) → ∑_{a:Prob(A)} Prob(B | a)
        cond_{B|A}(x) :≡ ⟨imp_{pr_A}(x), ⟨x, refl_{imp_{pr_A}(x)}⟩⟩,

which decomposes evidence x : Prob(A × B) for A and B into evidence for A (i.e., imp_{pr_A}(x) : Prob(A)) and conditional evidence for B given imp_{pr_A}(x) (i.e., ⟨x, refl_{imp_{pr_A}(x)}⟩ : Prob(B | imp_{pr_A}(x))). I prove (33) in Section 5.

^10 In the standard (numerical) probability calculus, the law of total probability states that Pr(A ∧ B) = ∑_{j=1}^{k} Pr(B | A_j) Pr(A_j) for any propositions A, B and a partition of A into mutually exclusive propositions A ≡ A_1 ∨ ··· ∨ A_k.

3.3. Independence.

Two concepts can be regarded as independent whenever their associated 'bodies of evidence' are unrelated to one another. In other words, two assertions are independent if a conjecture about one is irrelevant to a conjecture about the other, expressed formally as

(35)    Prob(A × B) =_Concept Prob(A) × Prob(B).

Evidence for A (resp. B) serves neither to corroborate nor refute evidence for B (resp. A). I note in passing the similarity between (35) and the definition of independence in the ordinary probability calculus, Pr(A ∧ B) = Pr(A) × Pr(B), with A and B regarded as propositions and '×' interpreted as multiplication.

Under univalence, with concepts interpreted as spaces in a universe U, (35) is equivalent to

(36)    Prob(A × B) ≃ Prob(A) × Prob(B),

which must be witnessed by a homotopy equivalence between Prob(A × B) and Prob(A) × Prob(B). From Section 2.4, there is a canonical mapping

    splitprob_{A,B} : Prob(A × B) → Prob(A) × Prob(B)

as defined in (15), but there is no canonical mapping Prob(A) × Prob(B) → Prob(A × B). By understanding independence to mean that evidence for A is irrelevant to evidence for B, and vice versa, we define the independence type of A and B by

(37)    indep(A, B) :≡ isequiv(splitprob_{A,B}).

3.4. Conditional independence.
Combining the definitions in Sections 3.2 and 3.3, we define conditional independence by replacing the probabilities in (35) with conditional probabilities: for A, B : Concept, we say that A and B are conditionally independent given C : Concept if

    ∏_{c:Prob(C)} (Prob(A × B | c) =_Concept Prob(A | c) × Prob(B | c)).

Interpreting MLTT into HoTT, we formally define the conditional independence type of A and B given C by

    indep(A, B | C) :≡ ∏_{c:Prob(C)} isequiv(condsplit_{A,B|C}(c)),

for

    condsplit_{A,B|C} : ∏_{c:Prob(C)} Prob(A × B | c) → Prob(A | c) × Prob(B | c)
    condsplit_{A,B|C} :≡ λc.λ⟨x, p⟩.⟨⟨imp_{pr_A}(x), comp_{pr_C,pr_{A×C}} • p⟩, ⟨imp_{pr_B}(x), comp_{pr_C,pr_{B×C}} • p⟩⟩,

where, for general f : A → B and g : B → C, comp_{g,f} : imp_g ∘ imp_f = imp_{g∘f} is the proof of the propositional identity for composition reflected in (12), and for general x, y, z : X, r • s : x = z is the concatenation of the paths determined by r : x = y and s : y = z in the homotopic interpretation.

3.5. Additivity.

I conclude with a brief discussion of additivity, which figures prominently in the axioms of conventional probability theory but whose analog is absent from the evidence-based theory presented above. In the ordinary quantitative theory of probability, the additivity axiom says that

(38)    Pr(A ∨ B) = Pr(A) + Pr(B)

for any mutually exclusive propositions A and B. If these probabilities are interpreted as a measure of the amount of evidence supporting each proposition, then (38) says that the amount of evidence supporting 'A or B' equals the amount supporting A plus the amount supporting B. In the type theoretic ('proof relevant') setting, with Prob(A) and Prob(B) interpreted as the bodies of evidence supporting A and B, respectively, we express the analog to (38) as

(39)    Prob(A + B) =_Concept Prob(A) + Prob(B),

with + interpreted now as the coproduct in type theory.
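As a reading aid for the independence type (37) and for the additivity condition (39), the two canonical maps from Section 2.4 can be sketched in Lean 4. The sketch rests on several assumptions: the structure ProbTheory and the names IsEquivMap, Indep, and Additive are hypothetical; IsEquivMap is a simplified, proposition-valued stand-in for isequiv (a single two-sided inverse rather than the pair of inverses in the paper's definition); and Lean's proof-irrelevant equality replaces the identity type.

```lean
-- Hypothetical packaging of the probability type, as an assumption of the sketch.
structure ProbTheory where
  Prob : Type → Type
  evid : {A : Type} → A → Prob A
  imp  : {A B : Type} → (A → B) → Prob A → Prob B

-- Canonical map (15): project evidence for 'A and B' onto each factor.
def splitprob (P : ProbTheory) {A B : Type} (x : P.Prob (A × B)) :
    P.Prob A × P.Prob B :=
  (P.imp (fun q : A × B => q.1) x, P.imp (fun q : A × B => q.2) x)

-- Canonical map of Section 2.4: evidence for A, or evidence for B,
-- gives evidence for 'A or B'.
def combprob (P : ProbTheory) {A B : Type} :
    P.Prob A ⊕ P.Prob B → P.Prob (A ⊕ B)
  | Sum.inl a => P.imp (Sum.inl : A → A ⊕ B) a
  | Sum.inr b => P.imp (Sum.inr : B → A ⊕ B) b

-- Simplified stand-in for isequiv: existence of a two-sided inverse.
def IsEquivMap {X Y : Type} (f : X → Y) : Prop :=
  ∃ g : Y → X, (∀ x, g (f x) = x) ∧ (∀ y, f (g y) = y)

-- Definition (37), read through the simplified IsEquivMap.
def Indep (P : ProbTheory) (A B : Type) : Prop :=
  IsEquivMap (fun x : P.Prob (A × B) => splitprob P (A := A) (B := B) x)

-- The additivity condition (39), read as: combprob is an equivalence.
def Additive (P : ProbTheory) (A B : Type) : Prop :=
  IsEquivMap (fun z : P.Prob A ⊕ P.Prob B => combprob P (A := A) (B := B) z)
```

Neither Indep nor Additive is provable from the structure alone, which matches the text: both are extra conditions on particular concepts, not theorems of the calculus.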
In (39), the left-hand side is the body of evidence for 'A or B' while the right-hand side is the disjoint union of the body of evidence for A and the body of evidence for B. The discussion in Section 2.4 showed that evidence for A or evidence for B gives evidence for 'A or B', i.e.,

(40)    combprob_{A,B} : Prob(A) + Prob(B) → Prob(A + B),

but in general having evidence for A or B is not enough to determine which of the two there is evidence for. These observations provide a link between our conception of probability as a body of evidence and the Dempster–Shafer axioms of belief functions Bel(·) [Dem67, Sha76], which instead of (38) require the weaker condition

(41)    Bel(A ∨ B) ≥ Bel(A) + Bel(B).

The inequality in (41) reflects the possibility that the amount of evidence favoring A ∨ B might strictly exceed the sum of evidence for A and B individually. Note that, by interpreting '→' as '≤' when evidence is regarded as a quantity, the implication in (40) agrees with the inequality in (41). The theory of evidence presented here is thus consistent with the Shaferian mathematical theory of evidence [Sha76].

It is interesting to consider the implications of assuming the additivity condition (39) when interpreted in HoTT. In this case, (39) becomes

    Prob(A + B) ≃ Prob(A) + Prob(B),

and one could postulate (perhaps as an axiom) that the canonical map combprob_{A,B} in (40) is an equivalence, isequiv(combprob_{A,B}). But this is beyond the scope of our discussion here.

4. Concluding remarks

I have proposed a type-theoretic formalization of probability in which probability statements are defined as primitive judgments about evidence. As the concepts of probability and evidence have been intermingled for millennia, cf. [GZP+89, Fra15], the formalism presented here is perhaps more historically accurate than the current mathematical orthodoxy for probability. Indeed, it was not until relatively recently in history that probability took its present numerical form [Por96, Hac75].
Also, since judgments of the form 'a is evidence for A' arise much more commonly and naturally than precise quantitative probability assignments (i.e., degrees of belief), this framework is arguably better for modeling the way in which people routinely reason with evidence in legal proceedings, scientific investigation, mathematical conjecture, and everyday decision making. Finally, I have posed mathematical conjecture as the backdrop in order to anchor the exposition in something concrete without delving too far into the details of the given application. I discuss many more historical, philosophical, and conceptual aspects of this work in [Cra18].

5. Appendix: Technical Proofs

Proof of (11). We apply the elimination rule (4) to construct a witness

    λa.imp_d(a) : ∏_{a:Prob(A)} Prob(C(a)),

for the type family C : Prob(A) → Concept,

    C(a) :≡ ∑_{x:A} (evid_A(x) =_{Prob(A)} a),

and d : ∏_{x:A} C(evid_A(x)) defined by

(42)    d(x) :≡ ⟨x, refl_{evid_A(x)}⟩ : ∑_{x:A} (evid_A(x) =_{Prob(A)} evid_A(x)). □

Proof of (10). Given a : Prob(A) and f : ∑_{x:A} (evid_A(x) =_{Prob(A)} a) → 0, we can derive f(y) : 0 for each y : ∑_{x:A} (evid_A(x) =_{Prob(A)} a). By the implication rule (4), we immediately have

    imp_f : Prob(∑_{x:A} (evid_A(x) =_{Prob(A)} a)) → Prob(0) ≡ 0
    imp_f(evid(y)) :≡ evid_0(f(y)),    y : ∑_{x:A} (evid_A(x) =_{Prob(A)} a).

Finally, let d be as defined in (42), so that by (11), we have

    λa.imp_d(a) : ∏_{a:Prob(A)} Prob(∑_{x:A} (evid_A(x) =_{Prob(A)} a)).

We conclude by constructing

    λa.λf.imp_f(imp_d(a)) : ∏_{a:Prob(A)} ((∑_{x:A} (evid_A(x) =_{Prob(A)} a) → 0) → 0). □

Proof of (12), (13), and (14). Several of the commutativity relations in (12) follow directly from the rules (2)-(8) of the probability type. For example, by (4) and (6), we immediately have

    imp_f ∘ evid_A ≡ evid_B ∘ f : A → Prob(B)    and    inf_d ∘ evid_A ≡ d ∘ evid_A : A → B,

and analogously for g : B → C and e ∘ evid_B : B → C in (12).
By the first judgmental equality, we thus have comp_{g,f} : imp_g ∘ imp_f = imp_{g∘f} by first proving

(43)    ∏_{a:Prob(A)} (imp_{g∘f}(a) =_{Prob(C)} (imp_g ∘ imp_f)(a))

and then applying the axiom of function extensionality. To prove (43), we apply the second elimination and computation rules of the probability type ((5) and (7)) as follows. First define C : Prob(A) → Concept by

    C(a) :≡ imp_{g∘f}(a) =_{Prob(C)} (imp_g ∘ imp_f)(a).

For every x : A, we have imp_{g∘f}(evid_A(x)) ≡ evid_C(g(f(x))) and

    (imp_g ∘ imp_f)(evid_A(x)) ≡ imp_g(imp_f(evid_A(x))) ≡ imp_g(evid_B(f(x))) ≡ evid_C(g(f(x))),

so that

    d(evid_A(x)) ≡ refl_{evid_C(g(f(x)))} : C(evid_A(x))

depends on x only through evid_A(x). By (5), we have inf_d : ∏_{a:Prob(A)} C(a), as desired. Commutativity of the other relations in (12) follows by similar applications of the elimination rules for Prob.

To show the first equality in (13), we use both computation rules (6) and (7) with C : Prob(A) → Concept,

    C(a) :≡ imp_{d∘evid_A}(a) = imp_{inf_d∘evid_A}(a),

as follows. For x : A, we have

    imp_{d∘evid_A}(evid_A(x)) ≡ evid_B(d(evid_A(x))) : A → Prob(B)
    imp_{inf_d∘evid_A}(evid_A(x)) ≡ evid_B(inf_d(evid_A(x))) ≡ evid_B(d(evid_A(x))) : A → Prob(B),

so that r(evid_A(x)) ≡ refl_{evid_B(d(evid_A(x)))} : C(evid_A(x)) and

    inf_r : ∏_{a:Prob(A)} imp_{d∘evid_A}(a) = imp_{inf_d∘evid_A}(a)

by (7). For the second equality, we argue similarly by noting that for every x : A

    inf_{evid_B∘inf_d}(evid_A(x)) ≡ evid_B(inf_d(evid_A(x))) ≡ evid(d(evid_A(x))) : A → Prob(B)

by (7). □

Proof of (15). For A, B : Concept let

    pr_A : A × B → A,  pr_A(⟨a, b⟩) :≡ a    and    pr_B : A × B → B,  pr_B(⟨a, b⟩) :≡ b

be the projection maps. By (12) we have imp_{pr_A} : Prob(A × B) → Prob(A) and imp_{pr_B} : Prob(A × B) → Prob(B), from which we construct

    λx.⟨imp_{pr_A}(x), imp_{pr_B}(x)⟩ : Prob(A × B) → Prob(A) × Prob(B). □

Proof of (16). For A, B : Concept let inl : A → A + B and inr : B → A + B be the left and right injections, respectively. By (12) (cf.
(4)) we have imp_{inl} : Prob(A) → Prob(A + B) and imp_{inr} : Prob(B) → Prob(A + B), from which we construct

    h : Prob(A) + Prob(B) → Prob(A + B)
    h(inl(a)) :≡ imp_{inl}(a),    a : Prob(A),
    h(inr(b)) :≡ imp_{inr}(b),    b : Prob(B). □

Proof of (18). Let B : A → Concept be a dependent type. Fix a : A and define C : Prob(∏_{y:A} B(y)) → Concept as the non-dependent type C(x) :≡ B(a). We construct λf.f(a) : ∏_{y:A} B(y) → B(a) so that the elimination rule (4) implies

    λx.imp_{λf.f(a)}(x) : Prob(∏_{y:A} B(y)) → Prob(B(a)).

We may thus construct

    λx.λa.imp_{λf.f(a)}(x) : Prob(∏_{y:A} B(y)) → ∏_{a:A} Prob(B(a)). □

Proof of (19). Let B : A → Concept be a dependent type. For each a : A define C_a : Prob(B(a)) → Concept as the non-dependent type C_a(x) :≡ ∑_{y:A} B(y). From any b : B(a) we construct ⟨a, b⟩ : C_a(evid_{B(a)}(b)), so that the implication rule (4) implies

    λx.imp_{λb:B(a).⟨a,b⟩}(x) : Prob(B(a)) → Prob(∑_{y:A} B(y)).

We then define

    h : ∑_{a:A} Prob(B(a)) → Prob(∑_{a:A} B(a))
    h(⟨a, x⟩) :≡ imp_{λb:B(a).⟨a,b⟩}(x). □

Proof of (21). Fix A, B : Concept and note first that from any f : ¬A ≡ A → 0 the elimination rule (4) implies imp_f : Prob(A) → 0. Arguing by case analysis for the coproduct type, we thus construct

    h : Prob(A) × (¬A + B) → B
    h(⟨a, inl(f)⟩) :≡ efq_B(imp_f(a)),    a : Prob(A), f : A → 0,
    h(⟨a, inr(b)⟩) :≡ b,    a : Prob(A), b : B,

where efq_B : 0 → B is ex falso quodlibet for B. □

Proof of (22). For A, B : Concept and a : A, we define d_a : ¬A + B → B by the elimination rule for ¬A + B:

    d_a(inl(f)) :≡ efq_B(f(a)),    f : A → 0,
    d_a(inr(b)) :≡ b,    b : B.

By the elimination rule for the probability type (4), we construct imp_{d_a} : Prob(¬A + B) → Prob(B), from which we conclude by defining

    λa.λx.imp_{d_a}(x) : A × Prob(¬A + B) → Prob(B). □

Proof of (23). For A, B : Concept and f : A → B, we construct

    h_f : Prob(A) → Prob(A × B)
    h_f(a) :≡ imp_{λx.⟨x,f(x)⟩ : A→(A×B)}(a).

We then define

    λa.λf.h_f(a) : Prob(A) × (A → B) → Prob(A × B). □

Proof of (24). For A, B : Concept and a : A, we define

    d_a ≡ imp_{λf.⟨a,f(a)⟩ : (A→B)→(A×B)} : Prob(A → B) → Prob(A × B)

by the elimination rule (4).
We then construct

    λa.λx.d_a(x) : A × Prob(A → B) → Prob(A × B). □

Proof of (25). For A, B : Concept, f : A × (B → 0) → 0, and a : A, we define

    f_a :≡ λb.f(a, b) : (B → 0) → 0.

Thus, for every a : A we have f_a : ¬¬B ≡ (B → 0) → 0 and λa.⟨a, f_a⟩ : A → (A × ¬¬B). The elimination rule for the probability type (4) gives

    imp_{λa.⟨a,f_a⟩} : Prob(A) → Prob(A × ¬¬B).

From the judgmental equality A × (B → 0) → 0 ≡ ¬(A × ¬B), we define

    λx.λf.imp_{λa.⟨a,f_a⟩}(x) : Prob(A) × ¬(A × ¬B) → Prob(A × ¬¬B). □

Proof of (26). For A, B : Concept, a : A, and f : A × (B → 0) → 0, we define f_a :≡ λb.f(a, b) : (B → 0) → 0 as in the proof of (25). We then define

    d_a ≡ λf.⟨a, f_a⟩ : ¬(A × ¬B) → (A × ¬¬B)

so that imp_{d_a} : Prob(¬(A × ¬B)) → Prob(A × ¬¬B). The proof is completed by

    λa.λx.imp_{d_a}(x) : A × Prob(¬(A × ¬B)) → Prob(A × ¬¬B). □

Proof of (27). The following commutes:

(44)
    Prob(A) × (¬A + B)    --α-->    B
          | β                       | evid_B
          v                         v
    Prob(A) × (A → B)     --ζ-->    Prob(B)
          | γ                       | ε
          v                         v
    Prob(A) × ¬(A × ¬B)   --δ-->    Prob(¬¬B)

Define α, β, γ, δ, ε, ζ as follows. (For f : A → C and g : B → C, I write ind_{f,g} : A + B → C for the function defined by case analysis.)

    λa.α_L(a) ≡ λa.λg.efq_B(imp_g(a)) : Prob(A) → ¬A → B
    λa.α_R(a) ≡ λa.λb.b : Prob(A) → B → B
    α ≡ λa.λz.ind_{α_L(a),α_R(a)}(z) : Prob(A) × (¬A + B) → B
    λa.β_L(a) ≡ λa.λg.⟨a, λx.efq_B(g(x))⟩ : Prob(A) → ¬A → Prob(A) × (A → B)
    λa.β_R(a) ≡ λa.λb.⟨a, λx.b⟩ : Prob(A) → B → Prob(A) × (A → B)
    β ≡ λa.λz.ind_{β_L(a),β_R(a)}(z) : Prob(A) → (¬A + B) → Prob(A) × (A → B)
    γ ≡ λa.λf.⟨a, λx.λg.g(f(x))⟩ : Prob(A) → (A → B) → Prob(A) × ¬(A × ¬B)
    δ ≡ λa.λg.imp_g(a) : Prob(A) → (A → ¬¬B) → Prob(¬¬B)
    ε ≡ λb.imp_{λy.λg.g(y) : B→¬B→0}(b) : Prob(B) → Prob(¬¬B)
    ζ ≡ λa.λf.imp_f(a) : Prob(A) × (A → B) → Prob(B).

We first show that the upper square commutes by repeated application of the elimination rule for the product, coproduct, and probability types. For the upper half of the square, evid_B ∘ α : Prob(A) × (¬A + B) → Prob(B) is defined by case analysis:

    (a, inl(g)) ↦ evid_B(efq_B(imp_g(a)))
    (a, inr(b)) ↦ evid_B(b).
The lower half ζ ∘ β is also defined by case analysis:

    (a, inl(g)) ↦ imp_{λx.efq_B(g(x))}(a)
    (a, inr(b)) ↦ imp_{λx.b}(a).

Now, to show that evid_B ∘ α = ζ ∘ β, we must produce an inhabitant

    p : ∏_{z:Prob(A)×(¬A+B)} (evid_B ∘ α)(z) = (ζ ∘ β)(z).

By the elimination rule for product and coproduct types, we can construct such an inhabitant p by considering z = (a, inl(g)) and z = (a, inr(b)) for a : Prob(A), g : ¬A, and b : B, and defining

    p_1 : ∏_{a:Prob(A)} ∏_{g:¬A} (evid_B(efq_B(imp_g(a))) = imp_{λx.efq_B(g(x))}(a))
    p_2 : ∏_{a:Prob(A)} ∏_{b:B} (imp_{λx.b}(a) = evid_B(b)).

We then conclude evid_B ∘ α = ζ ∘ β by the axiom of function extensionality (e.g., Axiom 2.9.3 in [Uni13]).

For p_1, we appeal to the second elimination rule for the probability type (the rule of inference (7)) to compute

    imp_{λx.efq_B(g(x))}(evid_A(x)) ≡ evid_B(efq_B(g(x))),    x : A.

Now, given g : ¬A, x : A, and a : Prob(A), we have

    q(a, g) ≡ efq_{imp_g(a)=g(x)}(imp_g(a)) : imp_g(a) = g(x).

The proof p_1 follows by continuity of functions in type theory:

    p_1 ≡ λa.λg.ap_{evid_B∘efq_B}(q(a, g)) : ∏_{a:Prob(A)} ∏_{g:¬A} (evid_B(efq_B(imp_g(a))) = evid_B(efq_B(g(x)))),

where ap_{evid_B∘efq_B}(q(a, g)) is the application of evid_B ∘ efq_B to the path q(a, g), as defined in Lemma 2.2.1 in [Uni13]. For p_2, we observe that imp_{λx.b}(a) ≡ evid_B(b) so that p_2 :≡ λa.λb.refl_{evid_B(b)}. The conclusion follows by the elimination rules for Prob(A) × (¬A + B) and ¬A + B.

To show that the bottom square commutes, we have to prove that ε ∘ ζ = δ ∘ γ holds in Prob(A) × (A → B) → Prob(¬¬B). For the upper half, we have

    λa.λf.imp_{λy:B.λg:¬B.g(y)}(imp_f(a)) : Prob(A) × (A → B) → Prob(¬¬B),

which for x : A and f : A → B satisfies

    imp_{λy:B.λg:¬B.g(y)}(imp_f(evid_A(x))) ≡ imp_{λy:B.λg:¬B.g(y)}(evid_B(f(x))) ≡ evid_{¬¬B}(λg : ¬B.g(f(x))).

For the bottom half, we have (δ ∘ γ)(a, f) ≡ imp_{λx.λg.g(f(x))}(a), which for x : A satisfies

    imp_{λx.λg.g(f(x))}(evid_A(x)) ≡ evid_{¬¬B}(λg : ¬B.g(f(x))).

The bottom square commutes by reflexivity.
The outside square commutes by path concatenation and associativity. □

Proof of (31). Let p : A ≃ B and let Prob : U → U be the probability type former. By the univalence axiom of HoTT we have ua(p) : (A =_U B). By [Uni13, Lemma 2.2.1] we have a map

    ap_Prob : (A =_U B) → (Prob(A) =_U Prob(B)),

which combines with ua(p) : A =_U B to give ap_Prob(ua(p)) : Prob(A) =_U Prob(B). Finally, by idtoequiv : (Prob(A) =_U Prob(B)) → (Prob(A) ≃ Prob(B)), we obtain

    λp.idtoequiv(ap_Prob(ua(p))) : (A ≃ B) → (Prob(A) ≃ Prob(B)). □

Proof of (33). Recall the definition of Prob(B | −) : Prob(A) → U by

    Prob(B | a) :≡ ∑_{y:Prob(A×B)} (imp_{pr_A}(y) =_{Prob(A)} a).

Now define

    g ≡ h :≡ λ⟨a, x⟩.pr_1(x) : ∑_{a:Prob(A)} Prob(B | a) → Prob(A × B),

where pr_1 : Prob(B | a) → Prob(A × B) is defined as the projection onto the first coordinate of the ∑-type Prob(B | a) ≡ ∑_{y:Prob(A×B)} (imp_{pr_A}(y) =_{Prob(A)} a),

    pr_1 : ∑_{y:Prob(A×B)} (imp_{pr_A}(y) =_{Prob(A)} a) → Prob(A × B)
    pr_1(⟨y, p⟩) :≡ y.

We construct an inhabitant of

    (g ∘ cond) ∼ id_{Prob(A×B)} :≡ ∏_{y:Prob(A×B)} ((g ∘ cond)(y) =_{Prob(A×B)} y)

by observing that cond(y) ≡ ⟨imp_{pr_A}(y), ⟨y, refl_{imp_{pr_A}(y)}⟩⟩, whence g(cond(y)) ≡ y : Prob(A × B) and

    λy.refl_y : (g ∘ cond) ∼ id_{Prob(A×B)}.

It remains to prove that

    (cond ∘ h) ∼ id_{∑_{a:Prob(A)} Prob(B|a)}.

Note first that (cond ∘ h)(⟨a, ⟨y, p⟩⟩) ≡ ⟨imp_{pr_A}(y), ⟨y, refl_{imp_{pr_A}(y)}⟩⟩. By the elimination rule for ∑-types, it is enough to prove

    ∏_{⟨a,⟨y,p⟩⟩ : ∑_{a:Prob(A)} Prob(B|a)} ⟨a, ⟨y, p⟩⟩ = ⟨imp_{pr_A}(y), ⟨y, refl_{imp_{pr_A}(y)}⟩⟩.

First note that

    ∑_{a:Prob(A)} Prob(B | a) ≃ ∑_{z:Prob(A)×Prob(A×B)} (imp_{pr_A}(pr_{Prob(A×B)}(z)) = pr_A(z)).

By the elimination rule for product types, we can assume z = ⟨a, y⟩ for a : Prob(A) and y : Prob(A × B) so that

    ∑_{a:Prob(A)} Prob(B | a) ≃ ∑_{⟨a,y⟩:Prob(A)×Prob(A×B)} (imp_{pr_A}(y) = a).

Now, given any ⟨a, ⟨y, p⟩⟩ : ∑_{a:Prob(A)} Prob(B | a), we immediately have p : imp_{pr_A}(y) = a, and thus p^{-1} : a = imp_{pr_A}(y), refl_y : y = y, and

    pair^=(p^{-1}, refl_y) : ⟨a, y⟩ = ⟨imp_{pr_A}(y), y⟩,

with pair^= as defined in [Uni13, Theorem 2.6.2].
By Theorem 2.7.2 in [Uni13], it remains to show that

(45)    transport^C(pair^=(p^{-1}, refl_y), p^{-1}) = refl_{imp_{pr_A}(y)},

for C : Prob(A) × Prob(A × B) → U defined by C(⟨a, y⟩) :≡ a =_{Prob(A)} imp_{pr_A}(y). We argue by based path induction as follows. Fix ⟨a, ⟨y, p⟩⟩ : ∑_{a:Prob(A)} Prob(B | a) and define

    D : ∏_{⟨a′,y′⟩:Prob(A)×Prob(A×B)} (⟨a′, y′⟩ = ⟨imp_{pr_A}(y), y⟩) → U
    D(⟨a′, y′⟩, p′) :≡ transport^C(p′, ap_{pr_A}(p′)) = refl_{imp_{pr_A}(y)}.

Arguing by based path induction at ⟨imp_{pr_A}(y), y⟩, we can assume ⟨a′, y′⟩ ≡ ⟨imp_{pr_A}(y), y⟩, so that

    D(⟨imp_{pr_A}(y), y⟩, refl_{⟨imp_{pr_A}(y),y⟩})
        ≡ transport^C(refl_{⟨imp_{pr_A}(y),y⟩}, ap_{pr_A}(refl_{⟨imp_{pr_A}(y),y⟩})) = refl_{imp_{pr_A}(y)}
        ≡ ap_{pr_A}(refl_{⟨imp_{pr_A}(y),y⟩}) = refl_{imp_{pr_A}(y)},

for which we have an inhabitant by the propositional computation rule for pair^=; see [Uni13, p. 106]. By based path induction, we have an inhabitant of D(z, p′) for every z : Prob(A) × Prob(A × B) and p′ : z = ⟨imp_{pr_A}(y), y⟩. In particular, for z ≡ ⟨a, y⟩, with a : Prob(A), y : Prob(A × B), and p′ ≡ pair^=(p^{-1}, refl_y), we have an inhabitant

    d : D(⟨a, y⟩, pair^=(p^{-1}, refl_y))
        ≡ transport^C(pair^=(p^{-1}, refl_y), ap_{pr_A}(pair^=(p^{-1}, refl_y))) = refl_{imp_{pr_A}(y)}.

Again by the propositional computation rule for pair^=, we have an inhabitant

    r : ap_{pr_A}(pair^=(p^{-1}, refl_y)) = p^{-1}.

And thus, by applying the transport function to the path r, we have

    ap_{transport^C(pair^=(p^{-1},refl_y),−)}(r) : transport^C(pair^=(p^{-1}, refl_y), ap_{pr_A}(pair^=(p^{-1}, refl_y))) = transport^C(pair^=(p^{-1}, refl_y), p^{-1}).

By path concatenation, we obtain

    ap_{transport^C(pair^=(p^{-1},refl_y),−)}(r)^{-1} • d : transport^C(pair^=(p^{-1}, refl_y), p^{-1}) = refl_{imp_{pr_A}(y)},

as required by (45). □

References

[AGM85] Carlos Alchourron, Peter Gärdenfors, and David Makinson, On the logic of theory change: Partial meet contraction and revision functions, The Journal of Symbolic Logic 50 (1985), no. 2, 510–530.
[APW13] S. Awodey, A. Pelayo, and M.A. Warren, Voevodsky's Univalence Axiom in Homotopy Type Theory, Notices of the AMS 60 (2013), no. 9, 1164–1167.
[Awo14] S. Awodey, Structuralism, Invariance, and Univalence, Philosophia Mathematica 22 (2014), no. 1, 1–11.
[Awo16] S. Awodey, Univalence as a Principle of Logic, https: (2016).
[Bro81] L.E.J. Brouwer, Brouwer's Cambridge Lectures on Intuitionism, Cambridge University Press, 1981.
[Broa] .
[Brob] L.E.J. Brouwer, Intuitionism and Formalism, Bulletin (New Series) of the American Mathematical Society 37, no. 1, 55–64.
[Car50] Rudolph Carnap, Logical Foundations of Probability, University of Chicago Press, 1950.
[CF] H. Curry and R. Feys, Craig, William, ed., Combinatory Logic vol. I.
[Cra18] Harry Crane, Logic of Probability and Conjecture (full version), in preparation (2018).
[Cur34] H. Curry, Functionality in Combinatory Logic, Proceedings of the National Academy of Sciences 20 (1934), 584–590.
[Dem67] Arthur P. Dempster, Upper and lower probabilities induced by a multivalued mapping, The Annals of Mathematical Statistics 38 (1967), no. 2, 325–339.
[dF] B. de Finetti, La prévision: ses lois logiques, ses sources subjectives, Annales de l'Institut Henri Poincaré 7, 1–68.
[Dum00] M. Dummett, Elements of Intuitionism, 2nd edition, Oxford Logic Guides (Book 39), Clarendon Press, 2000.
[Fol92] Richard Foley, Working Without a Net, Oxford University Press, 1992.
[Fra15] James Franklin, The Science of Conjecture: Evidence and Probability before Pascal, Johns Hopkins University Press, Baltimore, MD, USA, 2015.
[GZP+89] Gerd Gigerenzer, Zeno Zwijtink, Theodore Porter, Lorraine Daston, John Beatty, and Lorenz Kruger, The Empire of Chance: How Probability Changed Science and Everyday Life (Ideas in Context), Cambridge University Press, 1989.
[Hac75] I. Hacking, The Emergence of Probability: A Philosophical Study of Early Ideas about Probability, Induction and Statistical Inference, 2nd edition, Cambridge Series on Statistical & Probabilistic Mathematics, Cambridge University Press, 1975.
[Hey71] A. Heyting, Intuitionism: An Introduction, Study in Logic & Mathematics, 1971.
[How69] W.A. Howard, The formulae-as-types notion of construction, in Seldin, Jonathan P.; Hindley, J. Roger, To H.B. Curry: Essays on Combinatory Logic, Lambda Calculus and Formalism, 1969, pp. 479–490.
[KL11] Chris Kapulkin and Peter LeFanu Lumsdaine, The Simplicial Model of Univalent Foundations (after Voevodsky), arXiv:1211.2851 (2011).
[LP14] J. Ladyman and S. Presnell, A Primer on Homotopy Type Theory Part 1: The Formal Type Theory, http: (2014).
[Maz12] Barry Mazur, Is it plausible?, Available at http://www.math.harvard.edu/~mazur/papers/Plausibility.Notes.3.pdf (2012).
[ML84] P. Martin-Löf, Intuitionistic Type Theory, Studies in Proof Theory. Lecture Notes, v. 1, 1984.
[ML87] P. Martin-Löf, Truth of a Proposition, Evidence of a Judgement, Validity of a Proof, Synthese 73 (1987), 407–420.
[ML96] P. Martin-Löf, On the Meanings of the Logical Constants and the Justifications of the Logical Laws, Nordic Journal of Philosophical Logic 1 (1996), no. 1, 11–60.
[Pól54] George Pólya, Mathematics and Plausible Reasoning, Martino Publishing, Mansfield Centre, CT, 1954.
[Por96] Theodore Porter, Trust in Numbers: The Pursuit of Objectivity in Science and Public Life, Princeton University Press, 1996.
[PW12] Á. Pelayo and M.A. Warren, Homotopy Type Theory and Voevodsky's Univalent Foundations, arXiv:1210.5658 (2012).
[Ram26] Frank P. Ramsey, Truth and Probability (1926).
[SF18] Ted Shear and Branden Fitelson, Two approaches to belief revision, Erkenntnis in press (2018).
[Sha76] Glenn R. Shafer, A Mathematical Theory of Evidence, Princeton University Press, 1976.
[Shu17] M. Shulman, Homotopy type theory: the logic of space, arXiv:1703.03007 (2017).
[Tse16] Dimitris Tsementzis, Univalent Foundations as Structuralist Foundations, Synthese (2016).
[Tse17] Dimitris Tsementzis, A meaning explanation for HoTT, http: (2017).
[Uni13] The Univalent Foundations Program, Homotopy Type Theory: Univalent Foundations of Mathematics, https://homotopytypetheory.org/book, Institute for Advanced Study, 2013.
[Wea03] Brian Weatherson, From Classical to Intuitionistic Probability, Notre Dame Journal of Formal Logic 44 (2003), 111–123.
[Wit73] Ludwig Wittgenstein, Philosophical Investigations (3rd edition), Pearson, 1973.

Department of Statistics & Biostatistics, Rutgers University, 110 Frelinghuysen Road, Piscataway, NJ 08854, USA
E-mail address: hcrane@stat.rutgers.edu