Credence for Epistemic Discourse 1

Paolo Santorio ◦ University of California, San Diego

Draft of October 12th, 2018

1 Overview

Much recent work on epistemic modality appeals to nonclassical notions of entailment. On the classical conception (see e.g. Kaplan 1989a), logical consequence for natural language tracks preservation of truth (of a sentence, at a context). Many theorists have argued that this notion of consequence is inadequate for epistemic modal sentences and conditionals, like (1) and (2).

(1) Frida might roll the die.
(2) If Frida rolled the die, it came up even.

The details vary, but the central idea is that consequence for epistemic language should instead track preservation of support by an information state. On the resulting view, a conclusion ψ follows from a set of premises φ1, . . . , φn just in case all bodies of information that support φ1, . . . , φn also support ψ. This view of consequence naturally dovetails with a non-truth-conditional semantics, on which epistemic modal claims don't express propositions and are not true or false. Call this package of theses about the logic and the semantics of epistemic discourse the informational view.

On the informational view, the right notion of consequence for epistemic discourse validates all patterns of inference that are classically valid, plus some others. This is claimed to be an important empirical advantage of the informational view, since the extra patterns do seem intuitively valid.

This paper focuses on the relation between the informational view and credence. Plausibly, a notion of consequence that adequately characterizes epistemic discourse should model how actual subjects think with might-claims and conditionals. In particular, an adequate notion of consequence should work as input for a descriptive model of credence: a model that (by and large at least) adequately describes actual subjects' degrees of belief.
Here I discuss two problems for the claim that informational consequence can play this role.

1 For conversations about related topics, thanks to Andrew Bacon, Mike Caie, Jennifer Carr, Kenny Easwaran, Michael Glanzberg, Matt Mandelkern, Malte Willer, Robbie Williams, Jack Woods, and to audiences at Northwestern University, the University of Connecticut, and the University of Birmingham. Special thanks to Fabrizio Cariani for key exchanges on epistemic modals and conditionals.

First, I point out that all standard informational inferences are credally invalid. I.e., we can specify intuitive credal assignments that involve a drop of credence between the premises and the conclusion that is incompatible with validity (including on probabilistic notions of validity, in the style of Adams 1998). This rules out the idea that informational consequence appropriately models credence. But it leaves open a conciliatory route. Classical and informational consequence need not be regarded as alternatives: we may endorse a view on which we need both. On this view, each one specifies coherence constraints on a different class of attitudes. Classical consequence specifies coherence constraints on credence, while informational consequence specifies coherence constraints on all-or-nothing attitudes, like acceptance or belief.

Unfortunately, this conciliatory route is blocked by the second problem. There is a close link between informational inferences and triviality results. Even assuming a loose connection between informational consequence and credence (namely: informational consequence tracks preservation of credence 1), all informational inferences give rise to triviality results broadly in the style of Lewis 1976 and Bradley 2000, 2007.
For example, building on the informationally valid inference from a material conditional ⌜φ ⊃ ψ⌝ to the conditional ⌜φ > ψ⌝ (so-called Or-to-If), we can prove via standard triviality techniques that the probability of ⌜φ > ψ⌝ has to be at least as high as the probability of ⌜φ ⊃ ψ⌝. This result, like the results of standard triviality proofs, is unacceptable.

As a result, the informational theorist is left with two options, both of them radical. First, they can adopt a form of nihilism and claim that the sentences pertaining to epistemic discourse are not an appropriate object of credence. Second, they can endorse a nonstandard probability theory. Both of these options require substantial theoretical work, which goes beyond the scope of this paper.

I proceed as follows. §2 sets up a simple version of the informational view, illustrating some of its advantages. In §3–4 I show that informational inferences systematically fail in probabilistic reasoning. I introduce the conciliatory option in §5, and I discuss the link between informational consequence and triviality in §6.

2 Setup: the informational view

I start by setting up a simple semantics for epistemic modals and conditionals, and defining classical and informational consequence for it. I choose a semantics that closely mimics the semantics in Yalcin 2007, mostly for ease of exposition.

2.1 Semantics for epistemic modals

I use an interpretation function (represented as '⟦·⟧') to map expressions to their semantic values. As usual, this mapping is relativized to an n-tuple of parameters (a point of evaluation). I take points of evaluation to be pairs of a world and an information state ⟨w, i⟩. Hence the general form of a semantic clause is:

⟦φ⟧^{w,i} = semantic value of φ relative to ⟨w, i⟩

The world and information state parameters are exploited selectively by different fragments of the language. Nonmodal sentences are sensitive to the world parameter, but not the information state parameter.
Here is an example of a semantic clause for a simple, nonmodal sentence:

(3) ⟦It is raining⟧^{w,i} = true iff it is raining at w

Conversely, modal operators display sensitivity to the information state parameter, but not to the world parameter. In particular, necessity and possibility modals are analyzed as quantifiers over worlds in the information state picked out by i:

(4) ⟦□φ⟧^{w,i} = true iff ∀w′ ∈ i: ⟦φ⟧^{w′,i} = true
(5) ⟦◇φ⟧^{w,i} = true iff ∃w′ ∈ i: ⟦φ⟧^{w′,i} = true

2.2 Defining consequence

This compositional semantics can be made fully compatible with a conservative picture of content and consequence. In particular, we may define a notion of truth at a context in the style of Kaplan (1989a, 1989b):

(6) φ is true at c iff ⟦φ⟧^{w_c,i_c} = true

This allows us to assign classical truth values to utterances of epistemic sentences. (Of course, the definition in (6) requires a metasemantic assumption: each context determines an information state that is relevant for evaluating an utterance. This assumption is widely disputed.2)

2 See, among many, Egan et al. 2005, MacFarlane 2011, 2014.

The alternative option, which is taken by proponents of the informational view, is to refrain from defining a standard notion of truth. Instead, informational theorists define a notion of support by an information state. Intuitively, a sentence is supported by an information state just in case an agent who is in that state accepts the sentence. More formally, we define support as follows:

i supports φ (i ⊨ φ) iff, for all w ∈ i, ⟦φ⟧^{w,i} = 1.

For the case of nonmodal sentences, φ being supported by i simply reduces to φ being true at all worlds in i. But for modal sentences, i supports φ just in case i satisfies a kind of global condition, not reducible to properties of the individual worlds. For example, i supports ◇φ just in case i contains some world where φ is true.

These two views are naturally paired with different notions of consequence.
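Clauses (3)–(5) and the definition of support can be made concrete with a minimal Python sketch on a toy model. The tuple encoding of formulas, the function names `holds` and `supports`, and the two-world model are illustrative assumptions, not part of the formalism above; truth values are Python booleans rather than 1 and 0.

```python
# Formulas are atoms (strings) or nested tuples. Hypothetical encoding:
# ("not", f), ("and", f, g), ("box", f), ("dia", f).

def holds(formula, w, i, val):
    """Truth of `formula` at the point <w, i>.

    `val` maps each atom to the set of worlds where it is true.
    """
    if isinstance(formula, str):          # nonmodal atom: only w matters
        return w in val[formula]
    op = formula[0]
    if op == "not":
        return not holds(formula[1], w, i, val)
    if op == "and":
        return holds(formula[1], w, i, val) and holds(formula[2], w, i, val)
    if op == "box":                       # clause (4): all worlds in i
        return all(holds(formula[1], v, i, val) for v in i)
    if op == "dia":                       # clause (5): some world in i
        return any(holds(formula[1], v, i, val) for v in i)
    raise ValueError(f"unknown operator: {op!r}")


def supports(i, formula, val):
    """i supports formula iff formula is true at <w, i> for every w in i."""
    return all(holds(formula, w, i, val) for w in i)


# Toy model (an assumption): it rains at w1 but not at w2.
val = {"rain": frozenset({"w1"})}
i = frozenset({"w1", "w2"})

rain_supported = supports(i, "rain", val)
might_rain_supported = supports(i, ("dia", "rain"), val)
not_rain_and_might_rain = supports(
    i, ("and", ("not", "rain"), ("dia", "rain")), val
)
```

On this toy state, the atom is not supported while its possibility modal is, which illustrates the global character of support for modal sentences. Under the definition as stated, the empty information state vacuously supports every sentence.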
On the classical truth conditional picture, consequence may be defined in the standard way, as preservation of truth at a point of evaluation. In particular, one notion of consequence that is of particular relevance is preservation of truth at a proper point, i.e. a pair ⟨w, i⟩ such that w ∈ i.3 This is the notion that, throughout the paper, I will refer to as 'classical consequence'.

Classical logical consequence. φ1, ..., φn ⊨_C ψ iff for all ⟨w, i⟩ such that w ∈ i and ⟦φ1⟧^{w,i} = 1, ..., ⟦φn⟧^{w,i} = 1, ⟦ψ⟧^{w,i} = 1.

Conversely, informational consequence is defined as preservation of support.

Informational consequence. φ1, ..., φn ⊨_I ψ iff, for all i s.t. i supports φ1, ..., φn, i supports ψ.

Informational consequence is, at least in a sense, a special case of classical consequence. We could define it from classical consequence by restricting consideration not just to proper points of evaluation, but also to points of evaluation where the premises of an argument are true at all worlds in the information state. We get:

Classical∗ logical consequence. φ1, ..., φn ⊨_C∗ ψ iff for all ⟨w, i⟩ such that w ∈ i and such that, for all w′ ∈ i, ⟦φ1⟧^{w′,i} = 1, ..., ⟦φn⟧^{w′,i} = 1: ⟦ψ⟧^{w,i} = 1.

The reader can check that classical∗ consequence is a notational variant of informational consequence.

The fact that informational consequence is a special case of classical consequence means that informational consequence is strictly stronger than classical consequence. On the one hand, all classically valid rules of inference are also informationally valid:

Fact. For all φ1, . . . , φn, ψ: if φ1, ..., φn ⊨_C ψ, then φ1, ..., φn ⊨_I ψ.

On the other hand, some extra rules of inference, which are not classically valid, are informationally valid. (Notice: this doesn't carry over to what we may call 'meta-rules', i.e. rules that allow us to infer that a conclusion follows from a set of premises on the basis of the fact that other entailments hold.
I discuss the distinction between rules of inference and meta-rules in a footnote.4)

3 Why is this notion 'particularly relevant'? Unlike improper points, proper points represent genuine epistemic predicaments, i.e. pairs of a world and an information state at which, for all a subject knows, they might be located. The resulting notion of consequence is equivalent to what Yalcin (2007) calls 'diagonal consequence'.

4 A rule of inference has the form:
φ1, ..., φn ⊨_i ψ
I.e., a rule of inference allows the derivation of a sentence in the object language from a set of sentences in the object language. Conversely, a meta-rule has the form:
If φ1, ..., φn ⊨_i ψ and . . . and χ1, ..., χn ⊨_i θ, then σ1, ..., σn ⊨_i ω
I.e., a meta-rule allows us to derive that a certain rule of inference holds, given that other rules of inference hold. Informational consequence does not validate more meta-rules than classical consequence. In fact some meta-rules, like reductio and proof by cases, fail on informational consequence for the fragment of the language that involves epistemic modality. This is not surprising: if we add some rules of inference to a logic, there is a risk that meta-rules will be invalidated, since the antecedent of meta-rules ends up ranging over some extra cases.

One example of a rule of inference that is classically invalid but informationally valid is so-called Łukasiewicz's principle:

Łukasiewicz's principle. (ŁP) ¬φ ⊨ ¬◇φ (Equivalently: φ ⊨ □φ)

Łukasiewicz's principle is at the center of arguments for the informational view. For example, Yalcin 2007 argues for the informational view by arguing that sentences like (7) are semantically contradictory, and that this is captured by informational and not classical consequence.

(7) #It's not raining and it might be raining.

Arguments in this style are particularly fruitful for the case of conditionals, so let me turn to the latter.

2.3 Semantics for conditionals

The compositional semantics in §2.1 can be naturally extended to indicative conditionals. Here I use a simple variant of the semantics in Gillies (2004, 2009). Start by defining a notion of update of an information state:

Update of i with φ. i + φ = i ∩ {w : ⟦φ⟧^{w,i} is true}

Conditional antecedents are used to update the information state in the index; conditional consequents are evaluated at the updated information state. Using the traditional corner '>' to represent the conditional, here is the relevant clause:

(8) ⟦φ > ψ⟧^{w,i} = true iff ∀w′ ∈ i + φ: ⟦ψ⟧^{w′,i+φ} = true

2.4 Conditionals and informational consequence: Gibbard's tetralemma

One of the main advantages of the informational view is that informational consequence captures intuitively valid patterns of reasoning. These results are particularly impressive for conditionals. Informational consequence provides an elegant solution to a traditional problem about conditional logic, initially raised by Gibbard (1981). The problem has the form of a tetralemma: four assumptions are all individually plausible, but they jointly lead to an unacceptable result.5

(i) If ψ is entailed by φ, ⌜φ > ψ⌝ is valid.
    Upper Bound. If φ ⊨ ψ, then ⊨ φ > ψ
(ii) ⌜φ > ψ⌝ entails the corresponding material conditional.
    Centering. φ > ψ ⊨ φ ⊃ ψ
(iii) ⌜(φ∧ψ) > χ⌝ entails the complex conditional of the form ⌜φ > (ψ > χ)⌝.
    Exportation. (φ∧ψ) > χ ⊨ φ > (ψ > χ)
(iv) The notion of consequence capturing validity for natural language is classical, i.e. it captures preservation of a classical, bivalent notion of truth.

(i) is generally taken as a basic condition of adequacy for any semantics for conditionals. (ii) and (iii) are backed by empirical argument.6 (iv) is just a background assumption in Gibbard's argument. Gibbard points out that, given (i)–(iii), we can prove that an indicative conditional ⌜φ > ψ⌝ is logically equivalent to (i.e.
entails and is entailed by) the corresponding material conditional ⌜φ ⊃ ψ⌝. Combined with the classical understanding of consequence, this means that an indicative conditional is true if and only if the corresponding material conditional is true.7 This is obviously problematic.8

5 Gibbard's result is sometimes called 'Gibbard's collapse theorem'. Assumption (iv) was left implicit by Gibbard, who took (i)–(iii) together to show that a truth-conditional view of epistemic conditionals is untenable. The label 'Upper Bound' is borrowed from Gillies 2009. Centering is oftentimes called 'Weak Centering'; here I shorten the name for brevity.

6 As for Centering: it's intuitive that (say) a sentence like (i-a) entails a sentence like (i-b). (Notice that just this entailment is what guarantees the validity of Modus Ponens.)
(i) a. If Mary was at the party, Sue was at the party.
    b. Mary was not at the party or Sue was at the party.
As for Exportation: it's intuitive that, say, (ii-a) entails (ii-b).
(ii) a. If Mary and Sue are at the party, then Sam is at the party too.
     b. If Mary is at the party, then, if Sue is at the party, Sam is at the party too.

Classical truth conditional accounts address the problem by dropping one of (ii) and (iii). Classical conditional semantics like Stalnaker's and Lewis's drop the Exportation principle. Kratzer's semantics drops Centering, thus invalidating Modus Ponens in its general form. In particular, on Kratzer's semantics instances of Modus Ponens involving other conditionals embedded in the consequent fail.9

φ > (ψ > χ), φ ⊭_K ψ > χ

(A restricted version of Modus Ponens remains valid on Kratzer's semantics for non-nested conditionals, i.e. conditionals that don't involve embedded modals and conditionals.)

Switching from classical to informational consequence provides an elegant solution to the problem. Informational consequence vindicates all of Upper Bound, Centering, and Exportation, thus validating (i)–(iii).
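The validity claims of this section can be spot-checked by brute force. The Python sketch below implements the clauses for the modals and for '>' from §2.1 and §2.3 and tests several inference patterns (including Or-to-If, mentioned in the overview) over every valuation of three atoms on a three-world model. The formula encoding, the function names, and the restriction to three worlds are assumptions made for illustration. A passed check shows only that no countermodel exists at this size, whereas a failed check exhibits a genuine countermodel.

```python
from itertools import chain, combinations, product

WORLDS = (0, 1, 2)          # assumption: a three-world toy model
ATOMS = ("p", "q", "r")


def holds(f, w, i, val):
    """Truth of formula f at the point <w, i>."""
    if isinstance(f, str):                      # atomic sentence
        return w in val[f]
    op = f[0]
    if op == "not":
        return not holds(f[1], w, i, val)
    if op == "and":
        return holds(f[1], w, i, val) and holds(f[2], w, i, val)
    if op == "imp":                             # material conditional
        return (not holds(f[1], w, i, val)) or holds(f[2], w, i, val)
    if op == "box":
        return all(holds(f[1], v, i, val) for v in i)
    if op == "dia":
        return any(holds(f[1], v, i, val) for v in i)
    if op == "cond":                            # clause (8): update, then test
        updated = frozenset(v for v in i if holds(f[1], v, i, val))
        return all(holds(f[2], v, updated, val) for v in updated)
    raise ValueError(f"unknown operator: {op!r}")


def subsets(xs):
    """All subsets of xs, as tuples."""
    return chain.from_iterable(combinations(xs, k) for k in range(len(xs) + 1))


def models():
    """Every valuation of ATOMS over WORLDS, paired with every state i."""
    for extensions in product(list(subsets(WORLDS)), repeat=len(ATOMS)):
        val = {a: frozenset(e) for a, e in zip(ATOMS, extensions)}
        for i in subsets(WORLDS):
            yield frozenset(i), val


def informational(premises, conclusion):
    """Preservation of support across all states of the toy model."""
    return all(
        all(holds(conclusion, w, i, val) for w in i)
        for i, val in models()
        if all(holds(p, w, i, val) for p in premises for w in i)
    )


def classical(premises, conclusion):
    """Preservation of truth at proper points <w, i> (w in i)."""
    return all(
        holds(conclusion, w, i, val)
        for i, val in models()
        for w in i
        if all(holds(p, w, i, val) for p in premises)
    )


PATTERNS = {
    "Lukasiewicz": ([("not", "p")], ("not", ("dia", "p"))),
    "Centering": ([("cond", "p", "q")], ("imp", "p", "q")),
    "Exportation": ([("cond", ("and", "p", "q"), "r")],
                    ("cond", "p", ("cond", "q", "r"))),
    "Or-to-If": ([("imp", "p", "q")], ("cond", "p", "q")),
    "Nested MP": ([("cond", "p", ("cond", "q", "r")), "p"],
                  ("cond", "q", "r")),
}

# name -> (informationally valid?, classically valid?) on the toy model
results = {name: (informational(ps, c), classical(ps, c))
           for name, (ps, c) in PATTERNS.items()}
```

On this encoding, every pattern passes the informational check, while Łukasiewicz's principle, Or-to-If, and the nested instance of Modus Ponens each have a classical countermodel among the three-world models.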
At the same time, informational consequence is nonclassical, in violation of (iv). This nonclassicality is what blocks the collapse of the indicative conditional onto the material conditional. Informational equivalence doesn't amount to equivalence in meaning tout court. In particular, while ⌜φ ⊃ ψ⌝ and ⌜φ > ψ⌝ are supported by the same information states, some information states support neither ⌜φ ⊃ ψ⌝ nor its negation, but support the negation of ⌜φ > ψ⌝.

(As an aside: Gillies 2004 claims that, to solve Gibbard's tetralemma, we need to resort to an even less classical notion of consequence, i.e. dynamic notions of consequence in the style of Veltman 1996. This is a mistake. Informational consequence is sufficient to block the problem.10)

7 Here is a slightly updated version of Gibbard's proof. (For different versions of the proof, see McGee 1985, Gillies 2009.) One direction of the biconditional is provided by Centering; here is a proof of the other direction.
(i) φ ⊃ ψ
(ii) if φ∧ψ, ψ (by Upper Bound)
(iii) if ((φ ⊃ ψ)∧φ), ψ ((ii), propositional logic)
(iv) if (φ ⊃ ψ), if φ, ψ ((iii), Exportation)
(v) (φ ⊃ ψ) ⊃ (if φ, ψ) ((iv), Centering)
(vi) if φ, ψ ((i), (v), propositional logic)

8 See, among many, Edgington 1995 for some challenging objections to the material conditional account of indicative conditionals.

9 The schematic line in the main text is an approximation; in Kratzer's framework, conditionals don't involve binary connectives, so the claims should be reformulated accordingly. But the logical point is unaffected by this. See Khoo 2013 for discussion.

10 In particular, Gillies (2004) uses a different notion of classicality; on his account, a notion of consequence counts as classical only if it satisfies the following two conditions:
Inclusion. Γ ⊨ γ, for all γ ∈ Γ
Monotonicity. If Γ ⊨ γ and Γ ⊆ ∆, ∆ ⊨ γ
Gillies then claims that the lesson of Gibbard's tetralemma is that we should reject classicality in this sense. But this is incorrect. Informational consequence satisfies both Inclusion and Monotonicity and vindicates all of (i)–(iii); hence there is a way of reconciling (i)–(iii) with a notion of consequence that is classical in Gillies' sense.

2.5 Informational consequence as the logic of natural language

Several theorists take informational consequence to be the notion of consequence that correctly captures the logic of epistemic discourse. This attitude is pervasive in the dynamic semantics literature (for the locus classicus, see Veltman 1996). More recently, it has become widespread among philosophers. In this vein, Gillies 2009 has claimed that "entailment ought not be flat-footed [preservation of truth at a world]" (p. 343)11; Yalcin 2012 has suggested that a classical account of consequence may not be "adequate for modeling natural language" (p. 1011); Bledin 2015 has argued that logic, insofar as it applies to natural language, is "fundamentally concerned not with the preservation of truth but rather with the preservation of structural properties of . . . bodies of information" (p. 64).

For clarity: in this paper, I take all claims about informational consequence being the 'right' notion of consequence to be descriptive, rather than normative. That is, I take them to concern the empirical adequacy of theories that model speakers' capacity to reason with epistemic claims. In the background, I assume that, just like judgments of well-formedness and truth value, judgments about compatibility and entailment track systematic facts about speakers' competence with natural language.12 Hence a claim about how consequence works is not a claim about how speakers should reason, but rather a claim about how (by and large) they do reason.

11 I should flag that Gillies doesn't endorse informational consequence as I present it, but rather opts for a more dynamic variant of it (so-called update-to-test consequence, in the terminology of Veltman 1996). But, for current purposes, it seems fair to place him in the same broad camp as theorists like Yalcin and Bledin.

12 This seems also the view in Yalcin (2012 p. 1011, fn. 20).

3 Probability and informational consequence

In this section, I argue that informational consequence doesn't track sound constraints on credence. For simplicity, here I assume that credences have probabilistic structure. As I make clear later on, this assumption is unneeded.

I start from the observation, due to Moritz Schulz (2010), that the informational inference from ¬φ to ¬◇φ is not probabilistically valid. I then show that the problem goes beyond epistemic possibility modals. All informational inferences that are not also classically valid inferences are probabilistically invalid. This includes standard informational inferences involving conditionals, like unrestricted Modus Ponens and Or-to-If.

3.1 Schulz on informational consequence

Recall that informational consequence validates:

Łukasiewicz's principle. (ŁP) ¬φ ⊨ ¬◇φ (Also: φ ⊨ □φ)

Schulz (2010) points out that Łukasiewicz's principle doesn't track sound probabilistic reasoning. (For a discussion of a very similar point concerning closure and informational consequence, see also Bledin & Lando 2018.) Schulz simply points out that one may reasonably assign high credence to (9-a), while assigning "very low credence (or even credence 0)" to (9-b).

(9) a. They are at home.
    b. They must be at home.

Schulz explicitly assumes that a plausible notion of consequence should preserve probability in one-premise inferences. This seems a basic closure principle that governs all interaction between logic and probability. Here is a precise formulation:

Single-premise probabilistic closure (SPC). If φ ⊨ ψ and Pr(φ) = t, Pr(ψ) ≥ t.

Since Łukasiewicz's principle obviously violates SPC, it is ruled out as invalid.
    This clearly violates a reasonable constraint on logical consequence: If a rational and logically omniscient subject's credence function P is such that P(φ) = t, and φ ⊨ ψ, then P(ψ) ≥ t . . . Since this seems to be one of the core features of how logical consequence relates to rational reasoning, we should not accept informational consequence as an account of the logic of epistemic modals. (Schulz 2010)

To my knowledge, Schulz's objection has received no discussion in the literature. In part, this may be because proponents of informational consequence tend to focus not on probabilistic reasoning, but rather on linguistic data about assertability and acceptance. In part, though, this may be due to skepticism towards the idea that might- and must-claims are appropriate objects of credence.13 On the informational view, might- and must-claims don't express standard propositions. Hence one might plausibly deny that it makes sense to talk about their probability. Crucially, this theoretical conclusion seems supported by intuition. In informal surveys, some speakers consistently find it awkward to assign probability to might-sentences.

13 As a sociological note: I myself have repeatedly encountered this attitude in conversation.

I put off discussion of the 'no probability' option to §7. In the remainder of this section, I argue that the difficulty spotted by Schulz is fully general. All informational inferences that aren't also classical inferences are probabilistically invalid. In particular, this also applies to inferences involving conditionals, for which the 'no probability' claim seems harder to make.

3.2 Probabilistic Modus Ponens

I first introduce a probabilistic constraint that is crucial for my discussion. The principle follows from standard Modus Ponens, plus fairly weak probabilistic constraints on credence. Start from a simple formulation of Modus Ponens:

Modus Ponens.
φ > ψ, φ ⊨ ψ

Now, suppose we assume a principle bridging consequence and probability that generalizes Schulz's single-premise closure principle.

Multi-premise Probabilistic Closure (MPC). If φ1, . . . , φn ⊨ ψ and Pr(φ1) = t1, . . ., Pr(φn) = tn, then Pr(ψ) ≥ 1 − ((1 − t1) + . . . + (1 − tn)).

One intuitive gloss for MPC is the following: a subject's degree of confidence that the conclusion of an inference is false should not exceed the sum of their degrees of confidence that each of the premises is false. This kind of closure principle is widely adopted in the literature on probabilistic notions of validity.14 Given MPC, via Modus Ponens we obtain the following constraint:

Probabilistic Modus Ponens. (PMP) If Pr(φ > ψ) = 1, then Pr(φ) ≤ Pr(ψ).

PMP says that, if a conditional has probability 1, then the probability of the antecedent cannot exceed the probability of the consequent. (Setting t1 = Pr(φ > ψ) = 1 and t2 = Pr(φ), MPC yields Pr(ψ) ≥ 1 − (0 + (1 − Pr(φ))) = Pr(φ).) It is a very natural probabilistic generalization of Modus Ponens.15

14 For some examples, see Adams 1975 and Field 2015.

15 PMP is a special case of a more general principle, which also follows from standard MP and MPC:
Generalized Probabilistic Modus Ponens. Let Pr(if φ, ψ) = 1 − d. Then Pr(φ) − d ≤ Pr(ψ).

Even if we refrain from endorsing MPC, it's very easy to derive PMP from its single-premise counterpart SPC, plus other minimal constraints on credence. In particular, consider the following:

Conjunction Lower Bound. (CLB) If Pr(φ) = 1, Pr(φ∧ψ) = Pr(ψ).

CLB says that, if one has credence 1 in a conjunct, then one's credence in a conjunction is equal to the credence in the other conjunct. CLB is a fairly weak constraint, which is validated by classical probability as well as several nonstandard probability theories.16 But, together with single-premise closure, it is sufficient to entail PMP. (Treating Modus Ponens as the single-premise inference (φ > ψ)∧φ ⊨ ψ, SPC gives Pr(ψ) ≥ Pr((φ > ψ)∧φ); and if Pr(φ > ψ) = 1, CLB gives Pr((φ > ψ)∧φ) = Pr(φ).)

3.3 Failure of Probabilistic Modus Ponens

Let me shift attention to probabilistic judgments about conditionals in a particular scenario. Suppose that Sarah has tossed a fair six-sided die.
For convenience, name the die 'Die'. You have no information about the outcome of the toss. Now consider:

(C) If Die did not land on two or four, then it landed on six.

I ask you to pause and consider what level of confidence you assign to (C): whether low, middling (roughly, 'fifty-fifty'), or high. In informal polls, most subjects answer 'low'. (Several people also give the more precise answer '1/4'.) Very few people, if any, go for 'middling' or 'high'. Now consider:

(P2) Die landed even.
(P1) If Die landed even, then, if it didn't land on two or on four, it landed on six.

In informal polls, (P2) and (P1) are assigned, respectively, middling and high levels of confidence.

I have elicited judgments about an informal notion of 'levels of confidence'. Let me take a theoretical step and assume that these judgments track credences.17 For convenience, I will talk of credences as being modeled by probability functions. But I don't strictly need the assumption that credences have probabilistic structure. All I need is that credences conform to a multi-premise closure principle like MPC (or, alternatively, to SPC and CLB). Also, I don't need the assumption that subjects assign precise numerical values to claims. All I need is that subjects make coarse-grained distinctions between 'high', 'middling', and 'low' degrees of belief.18

16 For example, it is validated by the nonclassical probability theories based on Kleene logics that are discussed in Williams 2016.

17 For the semantics of confidence reports, and their relations to a notion of probability and credence, see Cariani et al. 2018.

18 Further bookkeeping: for the moment, I assume that probabilities attach directly to statements (construed as sentences as uttered at a context).

The table below summarizes the judgments in §3.3.
I model the judgment of certainty concerning (P1) as credence 1, though this assumption can be weakened without harm.19

(P1)  if even, (if not (two or four), six)    certain (= 1)
(P2)  even                                     middling (≈ .5)
(C)   if not (two or four), six                low (≈ .25)

These judgments provide a counterexample to PMP; hence PMP fails for indicative conditionals in natural language. Recall that Modus Ponens (in particular, for the case of complex conditionals involving conditionals nested in the consequent) is informationally valid but not classically valid, at least on a Kratzer-style account. Hence we find, again, that informational consequence and probabilistic reasoning don't interact well.

To strengthen this point, let me also notice that the same scenario also shows that Or-to-If, another signature inference of informational consequence, is probabilistically invalid. Or-to-If is the inference from a disjunction of the form ⌜φ∨ψ⌝ to a conditional of the form ⌜¬φ > ψ⌝. Assuming that negation behaves classically, this can be formalized in a pithy way as the inference from a material conditional to the corresponding indicative conditional. (See Stalnaker 1975 for discussion of Or-to-If.)

Or-to-If. φ ⊃ ψ ⊨_I φ > ψ

Like unrestricted Modus Ponens, Or-to-If is informationally valid but not classically valid. Now, notice that (P2) in the die scenario is equivalent to (10-a), which forms an instance of Or-to-If with (C), repeated below as (10-b):

(10) a. Either Die landed on two or four, or it landed on six.
     b. If Die didn't land on two or four, it landed on six.

We know that (10-a) and (10-b) are assigned, respectively, middling and low credence. Hence the die scenario also provides a counterexample to the probabilistic validity of Or-to-If.

3.4 Equivocation?

Before moving on, let me dispatch a line of objection. One might argue that the embedded and the unembedded occurrence of the conditional if not (two or four), six differ.
Hence (P1)-(C) is not an instance of MP, and the objection to informational consequence fails. This line can be developed into a charge of semantic or syntactic equivocation. Here I review the semantic version, though nothing hinges on this.20

The objector claims that the conditional if not (two or four), six expresses different contents when it appears embedded in a complex conditional, as in (P1), and when it appears unembedded. In particular, the embedded occurrence exploits a domain of quantification that is restricted by the supposition that Die landed even, while the unembedded occurrence does not. But, for an argument to count as an instance of MP, the consequent of the conditional premise and the conclusion must have the same content. Hence the argument is not really an instance of MP.

Much can be said to rebut this objection. For current purposes, I can simply sidestep it. I can agree with the objector that MP proper is not threatened by (P1)-(C). What matters for me is not whether Modus Ponens, in whatever way it should be defined, is valid or invalid. My goal is establishing that some inference patterns that are informationally valid fail qua constraints on probabilistic attitudes. This point is independent of what we choose to call 'Modus Ponens'. Dub the principle exemplified by (P1)-(C) 'MP∗'. MP∗ illustrates the failure I'm interested in.

19 Suppose we assign (P1) credence 1 − ε, for some tiny ε. On this understanding, PMP simply doesn't concern (P1). However, as long as the difference in credences between (P2) and (C) is greater than ε, the same statements violate the generalization of PMP given in footnote 15. For simplicity, from now on I simply assume that the rational credence in (P1) is 1.

4 Back to informational consequence

In §2, I said that proponents of informational consequence claim that this notion captures ordinary reasoning with epistemic discourse. The probabilistic data in §3 challenges this.
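The probabilistic data from §3 can be reproduced numerically. The Python sketch below is a rough illustration under one loudly flagged assumption: credence in a conditional is computed as the conditional probability of the consequent given the antecedent (with nested antecedents conjoined), over a uniform distribution on the six outcomes. The paper reports informal poll judgments rather than this computation, and the helper `pr` and the variable names are hypothetical.

```python
from fractions import Fraction

OUTCOMES = frozenset(range(1, 7))    # fair six-sided die


def pr(event, given=OUTCOMES):
    """Uniform probability of `event`, conditional on `given`."""
    return Fraction(len(event & given), len(given))


even = frozenset({2, 4, 6})
not_two_or_four = OUTCOMES - {2, 4}
six = frozenset({6})

# Assumption (not from the paper): credence in "if A, B" is Pr(B | A),
# and credence in "if A, (if B, C)" is Pr(C | A and B).
cr_P1 = pr(six, given=even & not_two_or_four)   # (P1)
cr_P2 = pr(even)                                # (P2)
cr_C = pr(six, given=not_two_or_four)           # (C)

# MPC lower bound on the conclusion of the two-premise inference.
mpc_bound = 1 - ((1 - cr_P1) + (1 - cr_P2))

# PMP: if Pr(P1) = 1, then Pr(P2) should not exceed Pr(C).
pmp_violated = cr_P1 == 1 and cr_P2 > cr_C
mpc_violated = cr_C < mpc_bound

# Or-to-If: (10-a) is the disjunction equivalent to (P2); (10-b) is (C).
cr_10a = pr(frozenset({2, 4}) | six)
or_to_if_drop = cr_10a - cr_C
```

Under that assumption the three values match the 'certain', 'middling', and 'low' rows of the table in §3.3, and the conclusion falls below the lower bound that MPC would impose.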
Now I want to remark that the claims in §2 were well-motivated. The very same examples that give rise to probabilistic invalidity appear valid in a different epistemic context, i.e. when the premises are accepted.

First of all consider Łukasiewicz's principle. It is well-known that the conjunction of the premise and the negation of the conclusion sounds contradictory (see Veltman 1985, Yalcin 2007).

(11) # They are at home and they might not be.

20 The syntactic version of the objection builds on a standard theory of the logical form of conditionals, defended by Kratzer (see Kratzer 1991, 2012; von Fintel 1994; and Stone 1997). On this theory, bare conditionals involve a covert modal, which comes equipped with a domain of quantification; this domain is denoted by a variable. This variable can occur both free and bound. When conditionals are embedded in other conditionals, this variable is always bound. In particular, in a nested conditional like (P1) the innermost domain variable is bound by the outermost modal, and inherits the domain restriction specified by the first if-clause. Hence the logical forms of (P1) and (C) are:
(P1) modal D_1 [if even]_k [modal D_k [if not two or four][six]]
(C) modal D_2 [if not two or four][six]
The objector claims that an argument counts as an instance of MP only if the conclusion occurs syntactically as a constituent of the major premise, and hence (P1)-(C) is not an instance of MP. Una Stojnic (2017) has recently used a similar line of argument.

The point generalizes to other informational inferences. It's harder to get clear intuitions using analogous conjunctions simply because the resulting sentences are too long. But we can use different evidence by running a test devised by Peter Klecha (2014). Klecha suggests that we test for logical properties of natural language sentences by constructing dialogs, and observing what assertions speakers who represent themselves as agreeing are entitled to make.
Consider what happens when we do this for (P1)-(C).

(12) A: If Die landed even, then, if it didn't land on two or on four, it landed on six. And Die did indeed land even.
B: Yes, that's right. Hence, if it didn't land on two or on four, it landed on six. / # Yes, that's right. But we still don't know whether, if it didn't land on two or on four, it landed on six.

Once B accepts A's utterance of (P1) and (P2), they seem committed to (C). Their refraining from endorsing it is awkward and generates a feeling of contradiction. This is evidence that, once the premises of Modus Ponens are accepted, speakers have a commitment to accepting the conclusion. The point generalizes; I invite the reader to run the same test for Or-to-If and other informational inferences.21

21 This discussion sheds light on McGee's classic argument (1985) against the validity of Modus Ponens. McGee considers examples just like our (P1)-(C), against the backdrop of scenarios like the following: Sarah tossed Die, a fair six-sided die. You caught a brief glimpse of the face that came up on top. You are confident, though not certain, that it landed on 2. In any case, you are very confident that you saw few dots on the face that came up. In McGee's words (1985, p. 462), in a scenario like this you have "good grounds for believing the premises [i.e. (P1) and (P2)]", but you are "not justified in accepting the conclusion". McGee concludes that "sometimes the conclusion of an application of Modus Ponens is something we do not believe and should not believe, even though the premises are propositions we believe very properly" (1985, p. 462). I agree with McGee that examples like (P1)-(C) show that in a sense Modus Ponens is invalid, but the shift to degree attitudes is crucial. In the sense of acceptance that I have been using in the paper, it is incorrect that one may rationally accept the premises of a Modus Ponens argument and reject the conclusion. (Though McGee's argument may be rescued by appealing to a weaker notion of acceptance; discussion of this point is best pursued elsewhere.)

5 Bridging consequence and attitudes

5.1 Endorsing both notions

In §3, I argued that informational consequence is not the notion of consequence governing probabilistic reasoning. Theorists like Schulz conclude that informational consequence is simply not the right notion of consequence for epistemic discourse. But this conclusion doesn't do justice to the explanatory power of informational consequence. As I pointed out in §2 and §4, informational consequence figures in an explanation of a large number of facts about the acceptability of sentences and discourses.

There is a natural strategy to square the facts reviewed in §2 and §3. We should countenance both classical and informational consequence in our theories. Each of the two tracks different constraints on reasoning. Classical consequence tracks constraints on probabilistic reasoning; informational consequence tracks constraints on reasoning from accepted premises. Moreover, this 'split' view doesn't require endorsing a substantial form of logical pluralism. Plausibly, we can regard reasoning from accepted premises as a special case of probabilistic reasoning, i.e. the case in which the premises are assigned probability 1. In a slogan: classical consequence is the correct logic for probabilistic reasoning in general; informational consequence is the correct logic for reasoning from certainties.22

5.2 Bridge principles

Let me state an assumption explicitly. I assume that the final objects that semantic theories assign to utterances (whether they are contents, propositions, context change potentials, etc.) also play a role in reasoning. Moreover, this role is part of linguistic competence. I.e., part of overall competence with epistemic modal discourse involves recognizing certain patterns as valid and invalid, and applying (by and large) this competence in reasoning.
Against this background, the role of classical and informational consequence can be captured via two bridge principles. The first establishes that classical consequence is the notion of consequence regulating credence, given the background assumption that credence obeys a principle of multi-premise closure, like the one in §3.23

Classical-Consequence-to-Credence (CCC). If φ1, ..., φn ⊨_C ψ and Pr(φ1) = t1, ..., Pr(φn) = tn, then Pr(ψ) ≥ 1 − ((1 − t1) + ... + (1 − tn)).

22 Kolodny & MacFarlane 2010 also make a similar suggestion.

The second bridge principle establishes that informational consequence is the notion of consequence regulating acceptance.

Informational-Consequence-to-Acceptance (ICA). If φ1, ..., φn ⊨_I ψ, then a subject's information state i is such that: either one of φ1, ..., φn is not true throughout i, or ψ is true throughout i.

Following Stalnaker (1984, 2002), I use 'acceptance' to refer to a broad category of mental states that includes but goes beyond belief. On a first pass, acceptance is the broadest possible kind of doxastic attitude. Accepting a proposition consists in taking it as true for some purposes or other. Acceptance is linked to assertion (besides Stalnaker 2002, see Maher 1993). Roughly, the outcome of a successful assertion is that all participants in a conversation come to accept the content of the sentence asserted. Hence the assertibility data discussed in this section provide a guide to subjects' acceptances.

If we assume that speakers' attitudes are governed by CCC and ICA, we predict exactly the patterns that we observed. Łukasiewicz's principle, Or-to-If, and Modus Ponens are expected to fail probabilistically, but to hold as constraints on reasoning from accepted premises. Assuming that assertion expresses acceptance, this is exactly what the data show.

Let me make two clarifications. First, both CCC and ICA are synchronic constraints. I.e., they state consistency facts for attitudes at a given time.
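As a rough numerical illustration of CCC, the bound it places on the credence in a conclusion can be computed directly. The Python sketch below is not part of the original argument: the function names ccc_lower_bound and satisfies_ccc and the sample credences 9/10 and 19/20 are hypothetical choices made for illustration. It assumes only the inequality stated in CCC, and it also checks the algebraically equivalent form in which the bound is the sum of the premise credences minus n plus 1.

```python
from fractions import Fraction

def ccc_lower_bound(premise_credences):
    """Lower bound that CCC places on Pr(psi), given credences t_1..t_n in the premises."""
    total_uncertainty = sum(1 - t for t in premise_credences)
    return 1 - total_uncertainty

def satisfies_ccc(premise_credences, conclusion_credence):
    """True iff the credence in the conclusion respects the CCC bound."""
    return conclusion_credence >= ccc_lower_bound(premise_credences)

# Hypothetical credences in two premises of a classically valid argument.
premises = [Fraction(9, 10), Fraction(19, 20)]

# Bound in the form used by CCC: 1 - ((1 - t1) + (1 - t2)).
bound = ccc_lower_bound(premises)

# Equivalent form: sum of premise credences - n + 1.
sum_form_bound = sum(premises) - len(premises) + 1

print(bound, sum_form_bound)
print(satisfies_ccc(premises, Fraction(9, 10)))  # credence above the bound
print(satisfies_ccc(premises, Fraction(4, 5)))   # credence below the bound
```

With more premises, or less confident ones, the bound can drop to 0 or below, in which case CCC places no constraint on the credence in the conclusion.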
I make no assumptions about the connection between language and the dynamic process of reasoning, understood as the process of forming new attitudes. Second, principles bridging consequence and attitudes are often taken to have normative import, i.e. to capture constraints of rationality.24 But I take CCC and ICA to be descriptive. They characterize (an idealized version of) subjects' competence in reasoning with the relevant mental representations.

23 A bridge principle of this sort has been recently defended by Field 2015. Here is a statement that more closely mirrors Field's own statement: If φ1, ..., φn ⊨_C ψ, then a subject's credence function Cr_S is such that: Cr_S(ψ) ≥ Σi Cr_S(φi) − n + 1. To link the two formulations, define a subject's degree of disbelief in φ as Dis(φ) = 1 − Cr(φ), and notice that the informal statement is equivalent to: If φ1, ..., φn ⊨_C ψ, then Dis(ψ) ≤ Σi Dis(φi).

24 For classical criticism of the claim that logic imposes constraints on attitudes, see Harman 1984. Harman is particularly concerned with the view that logic is a normative discipline, and defends the view that it is instead a purely descriptive discipline, on a par with natural sciences. Nowadays, it is generally acknowledged that logic and the study of rational constraints on reasoning are distinct enterprises. But it is controversial whether the two can be bridged via some informative principles. For positive suggestions along these lines, see Christensen 2004, MacFarlane 2004, Field 2015. For discussion, see Kolodny 2007, 2008.

Crucially, both bridge principles can be satisfied by a single model of mental states. In the appendix, I present a toy model of attitudes that vindicates both.

6 Informational Consequence and Triviality

The foregoing leaves us with a conciliatory conclusion. Classical consequence provides a general notion of consequence for probabilistic reasoning. Informational consequence concerns reasoning from accepted premises. Assuming that full credence counts as a kind of acceptance, we can take informational consequence to also concern a special case of probabilistic reasoning, i.e. the case in which the premises are assigned full credence.

Unfortunately, this conciliatory view is untenable without substantial changes to our theories of credence. The assumption that informational consequence preserves full credence is sufficient to generate a large battery of triviality results, broadly in the style of Lewis 1976. In fact, in this section I'm going to suggest that informational inferences are at the heart of several classical triviality results.

A note about the dialectic: throughout this section, I assume that credences have probabilistic structure. But, of course, the very arguments that I present here, if successful, work as a reductio of this assumption.

6.1 Bradley-style triviality

Triviality results are generally associated with Stalnaker's Thesis, i.e. the claim that rational credence in a conditional aligns with the conditional credence in the consequent, given the antecedent.25

Stalnaker's Thesis. For all φ, ψ, and for all Pr modeling rational credence: Pr(φ > ψ) = Pr(ψ | φ)

25 Stalnaker's Thesis owes its name to the fact that Stalnaker (1970) was its first explicit proponent; since then, Stalnaker has disavowed the Thesis; see e.g. his interesting triviality result in Stalnaker 1976.

Stalnaker's Thesis draws intuitive support from ordinary judgments about probabilities of conditionals. For a simple example, suppose that Frida may or may not have tossed a fair coin. It's natural to think that, in this situation, your credence in (13) should be 1/2, as required by Stalnaker's Thesis.

(13) If Frida tossed the coin, the coin came up heads.
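For concreteness, the coin example can be checked numerically. The following Python sketch is an illustration under stated assumptions, not part of the original text: it models the scenario with three hypothetical outcomes (tossed and heads, tossed and tails, not tossed), lets the prior credence that Frida tossed the coin vary, and computes the conditional credence Pr(heads | tossed) via the usual ratio formula. It computes only the right-hand side of Stalnaker's Thesis; it does not assign a proposition to the conditional.

```python
from fractions import Fraction

def coin_credences(pr_tossed):
    """Hypothetical credal state: a fair coin, tossed with credence pr_tossed."""
    return {
        ("tossed", "heads"): pr_tossed / 2,
        ("tossed", "tails"): pr_tossed / 2,
        ("not tossed", None): 1 - pr_tossed,
    }

def conditional(pr, consequent, antecedent):
    """Pr(consequent | antecedent), computed as Pr(both) / Pr(antecedent)."""
    pr_antecedent = sum(p for w, p in pr.items() if antecedent(w))
    pr_both = sum(p for w, p in pr.items() if antecedent(w) and consequent(w))
    return pr_both / pr_antecedent

def tossed(w):
    return w[0] == "tossed"

def heads(w):
    return w[1] == "heads"

# Vary the (assumed) credence that Frida tossed the coin from 1/10 to 1.
results = [
    conditional(coin_credences(Fraction(k, 10)), heads, tossed)
    for k in range(1, 11)
]
print(results)
```

The conditional credence does not depend on how likely it is that the coin was tossed, as long as that credence is positive; this is why the judgment about (13) is stable across different priors.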
Unfortunately, a large battery of so-called 'triviality results' shows that Stalnaker's Thesis is untenable in full generality, at least if we hold fixed some basic assumptions about credence. For example, Lewis's first triviality result (1976) establishes that Stalnaker's Thesis, plus some basic assumptions, entails that the credence in a conditional ⌜φ > ψ⌝ is identical to the credence in the consequent. (See Hájek & Hall 1994 for an overview of triviality results that center around Stalnaker's Thesis.)

An important contribution to the triviality literature has come from Richard Bradley. In a number of papers (see, among others, 2000, 2007), Bradley points out that Stalnaker's Thesis is not a crucial premise in triviality proofs. Rather, triviality can be derived from weaker and less controversial constraints on credence. What is of interest here is Bradley's adaptation of Lewis's original triviality proof, in Bradley 2007. Bradley's proof can be generalized into a template for generating triviality results, starting from any constraint on credence coming from informational inferences.

Let me first illustrate Bradley's proof. The proof exploits a number of classical Bayesian principles about credence:

a. Total Probability. Pr(φ) = Pr(φ∧ψ) + Pr(φ∧¬ψ)
b. Ratio. Pr(ψ | φ) = Pr(φ∧ψ) / Pr(φ)
c. Closure under Conditionalization. For all φ, Pr: if Pr(*) models rational credence, then Pr(* | φ) also models rational credence.

As I point out below, the first of these principles is not strictly necessary. We can replace it with the weaker principle that the probability of a conjunction is a lower bound on the probability of a conjunct:

a'. Lower Bound. Pr(φ) ≥ Pr(φ∧ψ)

Now, let's get to Bradley's adaptation of Lewis's proof. Rather than Stalnaker's Thesis, Bradley adopts the following principle:

Cond-Cert. For any Pr modeling rational credence, if Pr(φ) > 0:
a. If Pr(ψ) = 1, then Pr(φ > ψ) = 1
b. If Pr(ψ) = 0, then Pr(φ > ψ) = 0

For the sake of the argument, assume that: Pr(φ | ψ) > 0, Pr(φ | ¬ψ) > 0. On these assumptions, we can prove that Pr(φ > ψ) = Pr(ψ).

i. Pr(φ > ψ) = Pr((φ > ψ)∧ψ) + Pr((φ > ψ)∧¬ψ) (Total Probability)
ii. = Pr(φ > ψ | ψ)×Pr(ψ) + Pr(φ > ψ | ¬ψ)×Pr(¬ψ) (Ratio)
iii. = 1×Pr(ψ) + 0×Pr(¬ψ) (Cond-Cert, Closure)
iv. = Pr(ψ)

Hence, assuming Cond-Cert plus the three basic constraints in (a)–(c), we can prove that the probability of any conditional ⌜φ > ψ⌝ is equal to the probability of its consequent ψ (assuming that Pr(φ | ψ) > 0 and Pr(φ | ¬ψ) > 0). This is the same conclusion as Lewis's first triviality result.

It should be obvious that this conclusion is unacceptable. For a counterexample, just take (14) as uttered in the die scenario.

(14) If the Die landed even, it landed six.

Intuitively, (14) has probability 1/3, while its consequent has probability 1/6.

6.2 Generalized informational triviality

Notice, first, that both the sub-constraints in Cond-Cert are closely related to informational inferences. On the semantics in §2, both of the following are informationally valid:26

Antecedent Introduction. ψ ⊨ φ > ψ
Converse Antecedent Introduction. ¬ψ ⊨ ¬(φ > ψ)

Hence Bradley's Cond-Cert is equivalent to the claim that two informational inferences preserve full credence.27 Notice, moreover, that the proof in (i)–(iv) doesn't appeal to the specific content of the premises at all. So, given the link we have assumed between informational consequence and acceptance, and given that credence 1 counts as a kind of acceptance, we can replicate it whenever we have two informational inferences of the form φ ⊨_I ψ and ¬φ ⊨_I ¬ψ.

In fact, something even more general holds. Notice first that, by using just the first of the two constraints in Cond-Cert, and by adopting the weaker Lower Bound in place of Total Probability, we can still prove an unacceptable result.
Below is a proof that the probability of ⌜φ > ψ⌝ is an upper bound on the probability of ψ (on the assumption that Pr(φ | ψ) > 0).28

i. Pr(φ > ψ) ≥ Pr((φ > ψ)∧ψ) (Lower Bound)
ii. = Pr(φ > ψ | ψ)×Pr(ψ) (Ratio)
iii. = 1×Pr(ψ) (Cond-Cert, Closure)
iv. = Pr(ψ)

(Provided that Pr(φ | ψ) > 0.)

26 A twist: the semantics in §2 assumes that φ > ψ is vacuously supported by information states that support ¬φ. This semantics might be refined into one that involves a definedness condition for conditionals. (Bradley's own Cond-Cert involves a similar assumption.) This issue is orthogonal to the main point that I am making in this section, so I set it aside.

27 Assuming that an assignment of credence 0 to φ is equivalent to an assignment of credence 1 to the negation of φ.

28 For an intuitive counterexample, consider simply the pair in (i), again against the backdrop of the die scenario.
(i) a. If the die didn't land on two or four, it landed even.
b. The die landed even.
Intuitively, Pr((i-a)) = 1/4 and Pr((i-b)) = 1/2.

The content of φ > ψ and ψ is again irrelevant. Hence the proof can be generalized to any informational inference α ⊨_I β, via the following template:

i. Pr(β) ≥ Pr(β∧α) (Lower Bound)
ii. = Pr(β | α)×Pr(α) (Ratio)
iii. = 1×Pr(α) (Cond-Cert, Closure)
iv. = Pr(α)

(Some restrictions stating that some sentences have positive probability will need to be added on a case-by-case basis to ensure definedness.)

For some examples, consider again the paradigm informational inferences that I have discussed throughout this paper. (For simplicity, I consider the one-premise version of Modus Ponens.)

Łukasiewicz's principle. φ ⊨_I □φ
Or-to-If. φ ⊃ ψ ⊨_I φ > ψ
One-premise MP. (φ > ψ)∧φ ⊨_I ψ

These informational inferences give rise to the following constraints on credence:

a. If Pr(φ) > 0, Pr(□φ) ≥ Pr(φ)
b. If Pr(φ∧ψ) > 0, Pr(φ > ψ) ≥ Pr(φ ⊃ ψ)
c. If Pr(φ | ψ) > 0, Pr(ψ) ≥ Pr((φ > ψ)∧φ)

For an illustration, consider Or-to-If. Restrict consideration to probability functions such that Pr(φ | φ ⊃ ψ) > 0 (this is to ensure that φ > ψ has an epistemically possible antecedent). The proof template above yields:

i. Pr(φ > ψ) ≥ Pr((φ > ψ)∧(φ ⊃ ψ)) (Lower Bound)
ii. = Pr(φ > ψ | φ ⊃ ψ)×Pr(φ ⊃ ψ) (Ratio)
iii. = 1×Pr(φ ⊃ ψ) (Cond-Cert, Closure)
iv. = Pr(φ ⊃ ψ)

This conclusion is disastrous. It says that the probability of a conditional ⌜φ > ψ⌝ is greater than or equal to the probability of the corresponding material conditional ⌜φ ⊃ ψ⌝.29 This is exactly the converse of what we expect. Some of the strongest arguments against the material conditional analysis of indicative conditionals rest precisely on the fact that the probabilities of material conditionals are, in general, much higher than the intuitive probabilities of natural language conditionals. For an example, suppose that we have a 100-ticket fair lottery, with tickets numbered from 1 to 100; now consider the conditional:

(15) If the winning ticket is between 91 and 100, it is ticket 100.

Intuitively, (15) has low probability (.1 if we want to use a precise number). But, on a material conditional analysis, it is predicted to have probability greater than .9. Hence the probabilities of indicative conditionals cannot be, in general, upper bounds on the probabilities of the corresponding material conditionals.

6.3 Summary

Informational inferences are at the root of a family of triviality results, including Lewis's original result. This might not be surprising, especially in the light of a number of recent triviality results that focus on epistemic and probabilistic modals, and that are based just on informational inferences. (See e.g. Russell & Hawthorne 2016, Goldstein forthcoming.) But, once it is appreciated, the point is particularly relevant to the question of how probability fits in the informational framework.

Let me also notice that informational theorists cannot help themselves to one of the standard responses to triviality.
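The figures for the lottery example in (15) can be verified with a short computation. The Python sketch below is an illustrative check, not part of the original argument; it assumes a uniform credence over the 100 tickets and, following the intuitive judgment reported for (15), identifies the intuitive probability of the conditional with the conditional probability of the consequent given the antecedent. The variable names are hypothetical.

```python
from fractions import Fraction

# Uniform credence over a fair 100-ticket lottery (assumption of the example).
pr = {ticket: Fraction(1, 100) for ticket in range(1, 101)}

def antecedent(ticket):
    """The winning ticket is between 91 and 100."""
    return 91 <= ticket <= 100

def consequent(ticket):
    """The winning ticket is ticket 100."""
    return ticket == 100

# Probability of the material conditional: antecedent false or consequent true.
pr_material = sum(
    p for t, p in pr.items() if (not antecedent(t)) or consequent(t)
)

# Conditional probability of the consequent given the antecedent.
pr_antecedent = sum(p for t, p in pr.items() if antecedent(t))
pr_both = sum(p for t, p in pr.items() if antecedent(t) and consequent(t))
pr_conditional = pr_both / pr_antecedent

print(pr_material, pr_conditional)
```

On these assumptions the material conditional receives 91/100 and the conditional probability is 1/10, so the constraint that the conditional's probability be at least that of the material conditional fails by a wide margin.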
One route for blocking triviality involves appealing to context-dependence. Suppose that we adopt a kind of contextualist semantics and assume that the propositions expressed by epistemic modals and conditionals are sensitive to the information state of some relevant agent. This allows us to block the family of triviality proofs we have been considering by claiming equivocation.30 But this route is unavailable precisely for the informational theorist, who is committed to rejecting contextualism. So the informational theorist has to appeal to other kinds of responses to triviality.

29 For a similar, though not quite analogous, triviality result, see Milne 2003.

30 This route to respond to triviality is standardly associated with van Fraassen 1976. But completely standard contextualist semantics for modals and conditionals, such as Stalnaker's (1968) semantics for conditionals, can be used to vindicate Stalnaker's Thesis for simple conditionals, i.e. conditionals that don't embed modals or other conditionals. For discussion, see Khoo 2013 and Khoo & Santorio 2018.

7 Fitting probability into the informational view

In this section, I outline what I take to be the two main options for the informational theorist. Both of them are radical. The first option involves denying that probability applies at all to modal and conditional sentences. The second involves moving to a nonclassical notion of probability. Adjudicating between the two options goes beyond the scope of this paper, but I raise some potential problems for each.

7.1 Nihilism

The first option denies that probability applies to epistemic discourse. This option is familiar from the philosophical logic literature on conditionals and probability. Theorists like Adams 1975 and Edgington 1995 deny that conditionals have probability. (Both of them preserve a weaker kind of connection between conditionals and probability. On Adams' view, conditionals have degrees of assertability, which behave in some ways like probability.
On Edgington's view, conditionals are used to express speakers' credences, though they are not the bearers of credences themselves.) Informational theorists may claim that, in a similar way, all modal claims fail to have probability.

Nihilism might seem intuitively plausible for might- and must-claims. When asked what probability they assign to 'It might be raining', a number of speakers find the question difficult to answer. But it is prima facie difficult to believe for conditionals. Speakers tend to have relatively crisp judgments that conditionals have nontrivial probabilities, and in a large number of cases these judgments line up with the corresponding conditional probabilities (see e.g. Evans & Over 2004). It's unclear whether nihilism has the resources to explain these facts.

To be sure, there is a familiar strategy via which nihilists can vindicate judgments about the truth value of sentences containing object language probability operators. We start by assuming an appropriate semantics for probability operators.31 Following a classical idea tracing back to Kratzer,32 we let if-clauses work as the restrictor on the relevant information state picked out by the operator. As a result, a sentence of the form 'If φ, probably ψ' is predicted to make a claim about the conditional probability of ψ, given φ. For example, (16) is true just in case the probability of the coin landing tails, conditional on Frida tossing it, is .5.

(16) It is 50% likely that, if Frida tossed the coin, the coin landed tails.

31 See Yalcin 2010, 2012, Lassiter 2011, Holliday & Icard 2013 for discussion.

32 To my knowledge, Kratzer never says this fully explicitly in a published work. But this point is widely attributed to her, and I have personally seen her make it in conversation.

Even assuming the restrictor maneuver, nihilists remain open to a number of challenges. First, capturing the functioning of object language probability operators might not be sufficient to capture all the data.
See e.g. Khoo & Santorio 2018 for discussion of the probability of conjunctions of conditionals and of data involving propositional anaphora. Second, the decision not to assign probability to conditionals is in tension with centering principles, one of which was already discussed above.

Centering. φ > ψ ⊨ φ ⊃ ψ
Strong Centering. φ∧ψ ⊨ φ > ψ

According to the centering principles, conditionals are entailed by and entail factual claims.33 Hence the value of the probability of a conditional is naturally bounded from above and below by two claims that unequivocally have probability. So it would be surprising if probabilities just didn't apply to conditionals.34

7.2 Nonclassical probability

The second option is to block triviality proofs by giving up one of the principles that are appealed to throughout, and that are restated below.

(a') Lower Bound. Pr(φ) ≥ Pr(φ∧ψ)
(b) Ratio. Pr(ψ | φ) = Pr(φ∧ψ) / Pr(φ)
(c) Closure under Conditionalization. For all φ, Pr: if Pr(*) models rational credence, then Pr(* | φ) also models rational credence.

I discuss this route in detail elsewhere ([reference omitted]). Adopting a nonclassical account of conjunction seems like a far-fetched solution. So the theorist taking this route will focus on (b) or (c). This involves claiming that, once our language includes epistemic claims, update works in a very different way. The challenge for this approach is to develop a nonstandard view about update that is well-behaved and fits all the relevant constraints.

Let me close by pointing out that a nonclassical theory of update seems a natural option for the informational theorist. On informational semantics, starting from Veltman 1996, might-claims violate a condition of Persistence, understood as follows.

Persistence. For all i: if i ⊨ φ, then for all i′ ⊆ i, i′ ⊨ φ

Persistence says that, if a formula is supported by an information state, it is also supported by any information state more informed than it.
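The behavior of might-claims with respect to Persistence can be seen in a two-world toy case. The Python sketch below is only an illustration under simplifying assumptions that are mine rather than the text's: information states are sets of worlds, a non-modal sentence is supported by a state iff it is true at every world in the state, and 'might φ' is supported iff φ is true at some world in the state (a standard clause in the style of Veltman 1996). The world labels and function names are hypothetical.

```python
from itertools import chain, combinations

def substates(state):
    """All subsets of an information state, i.e. all states at least as informed."""
    items = sorted(state)
    return [
        frozenset(c)
        for c in chain.from_iterable(
            combinations(items, r) for r in range(len(items) + 1)
        )
    ]

def supports_plain(state, prop):
    """A non-modal sentence is supported iff it is true throughout the state."""
    return all(w in prop for w in state)

def supports_might(state, prop):
    """'might phi' is supported iff phi is true at some world in the state."""
    return any(w in prop for w in state)

def persistent_at(supports, state, prop):
    """Check the Persistence condition for one sentence at one state."""
    if not supports(state, prop):
        return True  # the condition is vacuous if the state does not support it
    return all(supports(sub, prop) for sub in substates(state))

rain = frozenset({"w1"})               # worlds where it is raining
open_state = frozenset({"w1", "w2"})   # rain is not settled
settled_state = frozenset({"w1"})      # rain is settled

might_persists = persistent_at(supports_might, open_state, rain)
plain_persists = persistent_at(supports_plain, settled_state, rain)
failures = [s for s in substates(open_state) if not supports_might(s, rain)]

print(might_persists, plain_persists, failures)
```

The more informed state containing only w2 is among the states that fail to support the might-claim, even though the original state supports it; the non-modal sentence, by contrast, stays supported under every strengthening of a state that supports it.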
Might-claims, as characterized by informational semantics, are an obvious counterexample to Persistence. Now, Persistence is a qualitative analog of the requirement that update preserves credence 1, which is entailed by classical conditionalization. So failures of Persistence should lead us to expect failures of update by conditionalization.35

33 Centering for complex conditionals is controversial, as I have pointed out above, but it unequivocally holds for simple conditionals, i.e. conditionals not involving modality in the antecedent or the consequent.

34 Notice also that, if we identify the probability of a conditional with the relevant conditional probability, both the Centering principles are, provably, probabilistically valid. (I leave the proofs to the reader.)

35 This connection is also noticed and discussed explicitly by Bradley 2007.

8 Conclusion

The question I have taken up in this paper is how informational consequence and probability interact. I have first pointed out that all informational inferences are probabilistically invalid. This leaves room for a view on which probabilistic reasoning is regulated by a classical notion of consequence, while informational consequence captures constraints on reasoning from accepted premises. But, on the minimal assumption that probability 1 counts as a kind of acceptance, the probability-preserving properties of informational inferences lead to a battery of triviality results. The only plausible options seem to be a kind of nihilism (claims pertaining to epistemic discourse are not appropriate objects of credence) or a switch to a nonclassical probability theory. Assessing which of the two is the best option is a task for further research.

Appendix: a toy formal model of credence and acceptance

Here I specify a model that vindicates both CCC and ICA. The model is very simple and can be refined in a variety of ways.
I put it forward merely as a possibility proof, showing that CCC and ICA can be vindicated together.

The model assigns probability to epistemic possibilities, and hence derivatively to contents (which are construed as sets of epistemic possibilities). Differently from standard credal models, epistemic possibilities are pairs of a world and an information state. Some theorists may worry about the idea of assigning probabilities to nonpropositional contents in the first place (since these contents are not truth-evaluable); they may think that, for the epistemic fragment of language, sentences are the only proper bearers of probabilities. These theorists may still use the model as a piece of formal machinery that assigns probabilities to sentences on the basis of their contents.

Start by specifying contents for sentences. Following Yalcin 2007, I model contents as sets of world-information state pairs:

Content of φ: {〈w, i〉 : ⟦φ⟧^{w,i} = 1}

I assume that contents are the primary objects of attitudes, but for simplicity I will talk about attitudes attaching to both sentences and contents.

I represent a credal state as a probability distribution over an algebra of world-information state pairs. I start by defining the notion of an epistemic space:

An epistemic space E is a triple 〈W, F*, Pr〉 s.t.:
• W is a set of worlds;
• F* is the set of all the sets of pairs whose first member is a world and whose second member is a set of worlds; i.e., F* = 𝒫(W × 𝒫(W));
• Pr : F* → ℝ is a probability measure over F*,
where a probability measure is defined in the usual way. (Pr(∪F*) = 1, and for any disjoint sets A and B, Pr(A∪B) = Pr(A) + Pr(B).) For simplicity, I work with models where the set of worlds W is finite.

Not all epistemic spaces will work as models of the credences of a rational subject. I impose three extra conditions, which validate intuitive features of epistemic possibilities and the relation between epistemic possibility and probability.
A credal space C is any epistemic space 〈W, F*, Pr〉 s.t.:
• Propriety: if Pr(〈w, i〉) > 0, then w ∈ i;
• Transparency: for some i ⊆ W, Pr({〈w, i〉 : w ∈ W}) = 1;
• Plenitude: if, for some 〈w, i〉, Pr(〈w, i〉) > 0, then, for all w′ ∈ i, Pr(〈w′, i〉) > 0.

Let me gloss each of these informally.

i. Propriety requires that each world-information state pair that receives positive credence is such that the world is a member of the state. This is a kind of consistency requirement. If a world is a live epistemic option, then it is also a member of the information state that the agent takes to be their information state. Propriety underlies the probabilistic validity of the inference □φ ⊨ φ.

ii. Transparency requires that a subject is fully certain about their own information state. This assumption is not strictly needed for the validation of CCC and ICA, but it helps keep the model simple.

iii. Plenitude requires that, if a subject assigns positive credence to a world-information state pair 〈w, i〉, they also assign positive credence to all world-information state pairs where w is replaced by other worlds in i. Plenitude enforces the idea that, if a world is regarded as an open option in the subject's information state, then it has positive probability. Plenitude is needed to ensure that assigning credence 1 to φ warrants assigning credence 1 to □φ.

All of (i)-(iii) may be called into question on philosophical grounds. I leave an exploration of what happens if we drop them to future work.

The model represents both credences and acceptances. Credences are modeled, in the obvious way, as assignments of probabilities to subsets of a space of possibilities. Acceptances are modeled as follows. Let 'CS' be S's credal space. I say that:

S accepts φ iff, for all 〈w, i〉 ∈ ∪F*_CS such that Pr_CS(〈w, i〉) > 0, i + φ = i.
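The three conditions and the definition of acceptance can be made concrete in a small finite case. The Python sketch below is an illustrative toy, not the official model: it represents Pr by its values on individual world-information state pairs, and it assumes, as a stand-in for the update clause i + φ = i, that a state i supports φ iff φ is true at 〈v, i〉 for every v ∈ i. The die worlds, the sample sentences, and all function names are hypothetical.

```python
from fractions import Fraction

def is_credal_space(worlds, pr):
    """Check Propriety, Transparency and Plenitude for a finite assignment
    `pr` of probabilities to (world, information state) pairs."""
    positive = [(w, i) for (w, i), p in pr.items() if p > 0]
    states = {i for _, i in positive}
    propriety = all(w in i for w, i in positive)
    # All mass sits on pairs that share one information state i, a subset of W.
    transparency = (
        len(states) == 1
        and all(i <= worlds for i in states)
        and sum(pr.values()) == 1
    )
    plenitude = all(pr.get((v, i), 0) > 0 for _, i in positive for v in i)
    return propriety and transparency and plenitude

def probability(pr, sentence):
    """Credence in a sentence: total mass of the pairs at which it is true."""
    return sum(p for (w, i), p in pr.items() if sentence(w, i))

def accepts(pr, sentence):
    """Acceptance: every state in a positive-probability pair supports the sentence."""
    return all(sentence(v, i) for (w, i), p in pr.items() if p > 0 for v in i)

def even(w, i):
    return w % 2 == 0

def six(w, i):
    return w == 6

def must_even(w, i):
    return all(v % 2 == 0 for v in i)

worlds = frozenset(range(1, 7))
state = frozenset({2, 4, 6})  # the subject has learned that the die landed even
pr = {(w, state): Fraction(1, 3) for w in state}

# A lopsided assignment that violates Plenitude: all mass on the world 2.
lopsided = {(2, state): Fraction(1)}

def two(w, i):
    return w == 2

print(is_credal_space(worlds, pr), is_credal_space(worlds, lopsided))
print(probability(pr, even), accepts(pr, even))
print(probability(pr, six), accepts(pr, six))
print(probability(lopsided, two), accepts(lopsided, two))
```

In the well-behaved space, credence 1 and acceptance go together (for 'even' and for 'must even'), while 'six' receives credence 1/3 and is not accepted. In the lopsided assignment, 'two' receives credence 1 without being accepted, which illustrates the work that Plenitude does in the model.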
I.e., a sentence φ (and, by extension, its content) is accepted iff all information states in pairs that receive positive probability entail φ.36 Notice that accepting a sentence and assigning probability 1 to it coincide.

36 Given Transparency and Plenitude, this amounts just to φ being entailed by the set of worlds {w : Pr_CS(〈w, i〉) > 0}, i.e. the set of worlds compatible with the subject's credal state that receive positive probability.

Define probabilistic validity in the usual way and acceptance-validity as preservation of probability 1:

Probabilistic Validity. φ1, ..., φn ⊨_Pr ψ iff, for any credal space C, Pr_C(ψ) ≥ Σi Pr_C(φi) − n + 1.

Acceptance Validity. φ1, ..., φn ⊨_A ψ iff, for any credal space C: if Pr_C(φ1) = 1, ..., Pr_C(φn) = 1, then Pr_C(ψ) = 1.

The following are both true on the model, which shows that CCC and ICA can be validated by the same epistemic state.

Proposition 1. If an argument from φ1, ..., φn to ψ is classically valid, then it is probabilistically valid: if φ1, ..., φn ⊨_C ψ, then φ1, ..., φn ⊨_Pr ψ.

Proposition 2. If an argument from φ1, ..., φn to ψ is informationally valid, then it is acceptance valid: if φ1, ..., φn ⊨_I ψ, then φ1, ..., φn ⊨_A ψ.

Here are proofs of Propositions 1 and 2. Define first a notion of proper content, as follows:

Proper content of φ: ‖φ‖ = {〈w, i〉 : 〈w, i〉 is proper and ⟦φ⟧^{w,i} = 1}

I.e., the proper content of a sentence is just the set of proper world-information state pairs at which the sentence is true. Moreover, we define the degree of uncertainty in a sentence (see also footnote 23) as follows:

Degree of uncertainty of φ: U(φ) = 1 − Pr(φ)

Proof of Proposition 1. Suppose that

(i) φ1, ..., φn ⊨_C ψ.

By the definition of classical consequence, we get (ii), and hence (iii):

(ii) ‖φ1‖ ∩ ... ∩ ‖φn‖ ⊆ ‖ψ‖
(iii) ‖φ1∧...∧φn‖ ⊆ ‖ψ‖.

By the definition of a probability function, from (iii), for any credal space C:

(iv) Pr_C(φ1∧...∧φn) ≤ Pr_C(ψ).

By the Uncertainty Sum Theorem (see Adams 1998),

(v) U_C(φ1∧...∧φn) ≤ U_C(φ1) + ... + U_C(φn)

Replacing and rearranging in (v):

(vi) Σi Pr_C(φi) − n + 1 ≤ Pr_C(φ1∧...∧φn)

From (iv) and (vi):

(vii) Σi Pr_C(φi) − n + 1 ≤ Pr_C(ψ)

Proof of Proposition 2. Suppose that

(i) φ1, ..., φn ⊨_I ψ.

For the purpose of conditional proof, suppose moreover that, for an arbitrary credal space C:

(ii) Pr_C(φ1) = ... = Pr_C(φn) = 1

Given the equivalence between assigning full probability and acceptance, from (ii) we get:

(iii) For all 〈w, i〉 such that Pr_C(〈w, i〉) > 0: i + φ1 = i, ..., i + φn = i

From (i) and (iii):

(iv) For all 〈w, i〉 such that Pr_C(〈w, i〉) > 0: i + ψ = i

And hence:

(v) Pr_C(ψ) = 1.

References

Adams, Ernest. 1975. The logic of conditionals: An application of probability to deductive logic, vol. 86. Springer Science & Business Media.
Adams, Ernest. 1998. A Primer of Probability Logic. Stanford: CSLI Publications.
Bledin, Justin. 2015. Modus ponens defended. Journal of Philosophy 112(2). 57–83.
Bledin, Justin & Tamar Lando. 2018. Closure and epistemic modals. Philosophy and Phenomenological Research 97(1). 3–22.
Bradley, Richard. 2000. A preservation condition for conditionals. Analysis 60(267). 219–222.
Bradley, Richard. 2007. A defence of the Ramsey test. Mind 116(461). 1–21.
Cariani, Fabrizio, Paolo Santorio & Alexis Wellwood. 2018. Confidence reports. Draft; Northwestern University, University of California, San Diego, and University of Southern California.
Christensen, David. 2004. Putting Logic in its Place: Formal Constraints on Rational Belief. Oxford University Press.
Edgington, Dorothy. 1995. On conditionals. Mind 235–329.
Egan, Andy, John Hawthorne & Brian Weatherson. 2005. Epistemic modals in context. In G. Preyer & G. Peter (eds.), Contextualism in Philosophy, 131–170. Oxford University Press.
Evans, Jonathan St. B. T. & David E. Over. 2004. If. Oxford: Oxford University Press.
Field, Hartry. 2015. What is logical validity? In Colin R. Caret & Ole T. Hjortland (eds.), Foundations of Logical Consequence, Oxford University Press.
von Fintel, Kai. 1994.
Restrictions on Quantifier Domains: University of Massachusetts Amherst dissertation.
Gibbard, Allan. 1981. Two recent theories of conditionals. In William Harper, Robert C. Stalnaker & Glenn Pearce (eds.), Ifs, 211–247. Reidel.
Gillies, Anthony S. 2004. Epistemic conditionals and conditional epistemics. Noûs 38(4). 585–616.
Gillies, Anthony S. 2009. On truth-conditions for if (but not quite only if). Philosophical Review 118(3). 325–349.
Goldstein, Simon. forthcoming. Triviality results for probabilistic modals. Philosophy and Phenomenological Research.
Hájek, Alan & N. Hall. 1994. The Hypothesis of the Conditional Construal of Conditional Probability. In Ellery Eells, Brian Skyrms & Ernest W. Adams (eds.), Probability and Conditionals: Belief Revision and Rational Decision, 75. Cambridge University Press.
Harman, Gilbert. 1984. Logic and reasoning. Synthese 60(1). 107–127.
Holliday, Wesley & Thomas Icard. 2013. Measure semantics and qualitative semantics for epistemic modals. In Proceedings of SALT 23, 514–534.
Kaplan, David. 1989a. Demonstratives. In Joseph Almog, John Perry & Howard Wettstein (eds.), Themes From Kaplan, 481–563. Oxford University Press.
Kaplan, David. 1989b. Demonstratives: Afterthoughts. In J. Almog, J. Perry & H. Wettstein (eds.), Themes From Kaplan, 565–614. Oxford University Press.
Khoo, Justin. 2013. A note on Gibbard's proof. Philosophical Studies 166(1). 153–164.
Khoo, Justin & Paolo Santorio. 2018. Lecture notes: Probability of conditionals in modal semantics. Lecture notes for a course at the North American Summer School in Logic, Language and Information, Carnegie Mellon University, Summer 2018.
Klecha, Peter. 2014. Two kinds of Sobel sequences: Precision in conditionals. Forthcoming in the Proceedings of the West Coast Conference on Formal Linguistics 32.
Kolodny, Niko. 2007. How does coherence matter? Proceedings of the Aristotelian Society 107(1pt3). 229–263.
Kolodny, Niko. 2008. Why be disposed to be coherent? Ethics 118(3).
437–463.
Kolodny, Niko & John MacFarlane. 2010. Ifs and oughts. Journal of Philosophy 107(3). 115–143.
Kratzer, Angelika. 1991. Modality. Semantics: An international handbook of contemporary research 639–650.
Kratzer, Angelika. 2012. Modals and Conditionals: New and Revised Perspectives, vol. 36. Oxford University Press.
Lassiter, Daniel. 2011. Measurement and Modality: The scalar basis of modal semantics: NYU dissertation.
Lewis, David. 1976. Probabilities of conditionals and conditional probabilities. Philosophical Review 85(3). 297–315.
MacFarlane, John. 2004. In what sense (if any) is logic normative for thought? Unpublished draft, UC Berkeley.
MacFarlane, John. 2011. Epistemic modals are assessment-sensitive. In Andy Egan & B. Weatherson (eds.), Epistemic Modality, Oxford University Press.
MacFarlane, John. 2014. Assessment Sensitivity: Relative Truth and its Applications. OUP Oxford.
Maher, Patrick. 1993. Betting on Theories. Cambridge University Press.
McGee, Vann. 1985. A counterexample to modus ponens. Journal of Philosophy 82(9). 462–471.
Milne, Peter. 2003. The simplest Lewis-style triviality proof yet? Analysis 63(4). 300–303.
Russell, Jeffrey Sanford & John Hawthorne. 2016. General dynamic triviality theorems. Philosophical Review 125(3). 307–339.
Schulz, Moritz. 2010. Epistemic modals and informational consequence. Synthese 174(3). 385–395.
Stalnaker, Robert. 1968. A theory of conditionals. In N. Rescher (ed.), Studies in Logical Theory, Oxford.
Stalnaker, Robert. 1975. Indicative conditionals. Philosophia 5.
Stalnaker, Robert. 1976. Letter to van Fraassen. In W. Harper & C. Hooker (eds.), Foundations of Probability Theory, Statistical Inference, and Statistical Theories of Science, vol. 1, 302–306. Dordrecht: Reidel.
Stalnaker, Robert. 1984. Inquiry. Cambridge University Press.
Stalnaker, Robert. 2002. Common ground. Linguistics and Philosophy 25(5-6). 701–21.
Stalnaker, Robert C. 1970. Probability and conditionals. Philosophy of Science 37(1).
64–80.
Stojnic, Una. 2017. One's modus ponens: Modality, coherence and logic. Philosophy and Phenomenological Research 95(1). 167–214.
Stone, Matthew. 1997. The anaphoric parallel between modality and tense. University of Pennsylvania Institute for Research in Cognitive Science Technical Report No. IRCS-97-06.
Van Fraassen, Bas C. 1976. Probabilities of conditionals. In Foundations of probability theory, statistical inference, and statistical theories of science, 261–308. Springer.
Veltman, Frank. 1985. Logic for Conditionals: University of Amsterdam dissertation.
Veltman, Frank. 1996. Defaults in update semantics. Journal of Philosophical Logic 25(3). 221–261.
Williams, J. Robert G. 2016. Probability and nonclassical logics. In Alan Hájek & Chris Hitchcock (eds.), Oxford Handbook to Probability and Philosophy, Oxford University Press.
Yalcin, Seth. 2007. Epistemic modals. Mind 116(464). 983–1026.
Yalcin, Seth. 2010. Probability operators. Philosophy Compass 5(11). 916–937.
Yalcin, Seth. 2012. Bayesian expressivism. Proceedings of the Aristotelian Society 112(2pt2). 123–160.