Deontic Modals and Probabilities: One Theory to Rule Them All?

Fabrizio Cariani

abstract: This paper motivates and develops a novel semantic framework for deontic modals. The framework is designed to shed light on two things: the relationship between deontic modals and substantive theories of practical rationality, and the interaction of deontic modals with conditionals, epistemic modals and probability operators. I argue that, in order to model inferential connections between deontic modals and probability operators, we need more structure than is provided by classical intensional theories. In particular, we need probabilistic structure that interacts directly with the compositional semantics of deontic modals. However, I reject theories that provide this probabilistic structure by claiming that the semantics of deontic modals is linked to the Bayesian notion of expectation. I offer a probabilistic premise semantics that explains all the data that create trouble for the rival theories.

keywords: deontic modality, probability, information sensitivity, premise semantics, deontic conditionals.

word count: 11450 words excluding footnotes; approximately 13300 including footnotes. [Final version. January 2016]

INTRODUCTION

An important chapter of the semantics and pragmatics of modals concerns deontic sentences like:

(1) You should move that pawn.
(2) You may not move a pawn that way, you must move it this way.
(3) It is better to move a pawn than to eat it.

A theory of the meaning of these sentences must specify when they are acceptable, the kind of content they express, how they contribute to more complex expressions that embed them and what systematic effects they may have on the evolution of a conversation. In this paper, I compare some ways of developing such a theory on the basis of empirical considerations as well as philosophical and methodological ones.
Philosophically, the main theme of my discussion is the connection between the conventional meanings of deontic vocabulary and substantive theories about what one ought to do. Accounts of the meaning of deontic vocabulary are not alone in making predictions about deontic sentences. Such sentences may also be implied by substantive, nonlinguistic theories. For example, (1) may follow, depending on the circumstances, from (i) the rules of chess, (ii) a theory of tactically correct play, or (iii) a more general theory of rational decision making. Say that a practical theory is any theory that issues verdicts of the form:

(4) Given α's circumstances, goals and information, α ought to φ.
    Given α's circumstances, goals and information, α is allowed to φ.

What is the relationship between practical theories and the semantics of deontic modals? Should a practical theory be built into the conventional meaning of deontic modals? The hypothesis that has informed much research in formal semantics1 is that the semantic theory should be, within certain limits, neutral among different practical theories. This is the hypothesis I aim to clarify, articulate and defend here.

It is important to distinguish at the outset two different degrees of neutrality. One may note that practical theories can conflict with one another about particular sentences, and maintain that it is not up to the semantics to settle these conflicts. Consider again: "You should move that pawn". If my desire is to lose the game and go home, a decision theory yields different verdicts from a theory of tactically correct play. To be neutral in the first sense, a semantic theory should be above these differences. A precise way of putting the requirement is that for any agent α, and predicate φ, such that it is contingent whether φ applies to α, ⌜α should φ⌝ is consistent. This degree of neutrality is usually achieved by adopting a flexible semantics for modals.
For example, in the classical, contextualist ordering semantics, deontic modals (and, in fact, modals in general) are analyzed relative to contextual parameters. Among these contextual parameters is an ordering of possibilities. In context, practical theories can help fix this ordering, but that is the entire extent of their contribution. However, there are no context-independent constraints on what can be the source of this ordering. On this picture, there is a thin lexical meaning for deontic modals that is largely independent of practical theories, but practical theories can affect the sharpenings and connotations that individual deontic modals acquire in context. Given some orderings, "You should move that pawn" is true; given others, it is not.2

1 See Kratzer (1977, 1981, 1991b, 2012); Lewis (1978, 1981).

I will argue that we need a slightly stronger degree of neutrality. There are reasonable practical theories that make simultaneous predictions about sets of sentences. If the semantic theory is to be compatible with these theories, it seems, it should make those sets of sentences consistent. Accordingly, the stronger degree of neutrality is this: for every coherent practical theory P, if there are circumstances in which P entails a set of deontic sentences S, then the semantic theory ought to imply that S is consistent.3 So, for example, if there is a coherent practical theory that entails "You should move that pawn unless your king is threatened" and "If your king is threatened, you should move the queen", then these sentences ought to come out as jointly consistent. It is important to note at this stage that even the stronger degree of neutrality does not amount to the suggestion that the verdicts of every practical theory should be consistent. Some practical theories are straightforwardly incoherent. No one claims that their predictions ought to be made consistent.
My main negative thesis is that analyses that attempt to explain the meaning of deontic modals in terms of expected value are in violation of this stronger, but not universal, requirement of neutrality.4 The argument against expected value accounts will not assume neutrality as an abstract principle. What I hope to show, instead, is that violating it has specific problematic empirical consequences (§2). On the other hand, expected value accounts have some attractive features. They systematically capture the idea that our dispositions to accept and reject deontic statements are tightly linked with probabilistic judgments. Indeed, before criticizing them, I will describe some facts concerning the interaction of probability operators and deontic modals that seem to motivate expected value accounts (§1).

2 I do not assume that a semantic theory for deontic modals must appeal to truth-conditions. The ascription of truth here is in the contextualist's voice.

3 Other authors have also suggested a similarly strong degree of neutrality (Carr, 2012, unpublished; Charlow, forthcoming). However, I want to emphasize a potential difference. Carr criticizes the classical ordering semantics for modals for implementing a particular 'decision rule'. That is: because the classical semantics operates by selecting the 'best' worlds out of the ordering, Carr thinks it implements the decision rule maximax. Similarly, she argues that the semantic proposal of Cariani et al. (2013) implements the decision rule maximin. I will not tackle these arguments directly (I want the reader to focus on a different set of details). I do want to make a methodological point: these arguments cannot be made simply on the basis of structural features of a lexical entry. A semantic theory might select the best worlds out of an ordering, but it does not mean that it implements maximax. To sustain an argument that a theory is not neutral, there have to be specific examples of coherent practical theories that are ruled out by the semantics in specific contexts.

4 Expected value accounts are developed, among other places, in Goble (1996); Cariani (2009); Lassiter (2011); Wedgwood (forthcoming). This kind of view is also explored without endorsement in Yalcin (2012a). The theoretical proposals in these references differ from each other on several substantial points.

The main positive thesis of the paper is that it is possible to preserve the virtues of expected value accounts while avoiding their problems. I show how to characterize a probabilistic premise semantics that shares the features that motivate expected value accounts without their problematic consequences (I develop an initial version of my proposal in §3 and a refinement in §4).5

In addition to these programmatic theses, the paper also has a more modest ambition. It is to show, as the semantics of §§3-4 does, that it is possible to merge the insights of a classical premise semantics for deontic modals with the current explosion of work on probabilistic semantic theories for epistemic vocabulary. In other words, to show that it is a mistake to view the space of possible theoretical options as neatly partitioned in two, with classical non-probabilistic theories on the one hand and expected value analyses on the other.

1 DEONTIC AND PROBABILISTIC TALK

Many (though not all) theories of deontic modals relativize the interpretation of ought and should to a probabilistic state. To get a sense of the variety of proposals, I will list some examples of probabilistic theories.
According to one version of the expected value analysis (see §2 for more details): ⌜should A⌝ is true if and only if the expected value of A (as calculated on the basis of contextually given probability and value functions) is greater than the expected value of the relevant alternatives.6 This theory counts as probabilistic because expected values are defined partly in terms of probability functions. Another probabilistic theory is the end-relational analysis (Finlay 2009, 2014). According to this view, ⌜should A⌝ is true in a context c just in case the (contextually salient) probability of A given the contextually salient end E is greater than the probability of the relevant alternatives given E. According to yet another analysis (see Carr 2012), ⌜should A⌝ is true iff A is entailed by all the 'best' alternatives, where 'best' is calculated on the basis of a contextually supplied decision rule d, probability Pr, and value function v. To these theories, I will add a fourth in §3.

5 The proposal generalizes the theory of Cariani et al. (2013). A preliminary sketch has appeared in my Cariani (2013b).

6 In this paper, metalinguistic variables A, B, C range over sentences, while A, B, C range over sets of worlds. In informal discussion and within the same stretch of discourse the value of A is the set of worlds at which A is true. In more formal discussion, I use the more standard notation $\llbracket A \rrbracket$ to denote A.

More precisely, these analyses appear to commit to:

Probabilize Deontic Semantics (in short: Probabilize):
(a) In defining an interpretation function $\llbracket \cdot \rrbracket$ for a fragment of natural language including deontic modals, we should include a probabilistic parameter (e.g. a probability function) among the parameters relative to which we interpret sentences.
(b) Deontic sentences depend non-trivially on this probabilistic parameter. (To be precise, this means that for some deontic sentence A there are parameter assignments π and π∗ that differ only in their Pr coordinate, such that $\llbracket A \rrbracket^{\pi} \neq \llbracket A \rrbracket^{\pi^*}$.)

What motivates this idea? In this section, I propose two arguments in its favor. I emphasize that neither argument is intended as an unconditional proof of Probabilize. In other words, neither argument aims to motivate every kind of theorist regardless of their prior commitments. Instead, both arguments depend on substantive assumptions that I do not intend to defend here.

1.1 Probabilistic Deontic Conditionals

The first argument turns on examples in which deontic modals appear embedded in conditionals that use both probabilistic and deontic operators (the core observation of this argument is from Yalcin 2012a). Imagine an experienced soccer coach advising a less experienced one about an upcoming game. Suppose that the more experienced coach utters:

(5) If it is likely that your opponents will attack you on the right flank, you should concentrate your defense on the right side.
(6) If it is not likely that your opponents will attack you on the right flank, you should not concentrate your defense on the right side.

We have no difficulty interpreting these conditionals. We clearly judge them to be consistent and expect them to figure in certain inferential patterns. For example, the conclusion that you should concentrate your defense on the right follows from (5) together with:

(7) It is twice as likely that your opponents will attack you on the right flank, than not.

These are the kinds of facts we would like a joint theory of modals, conditionals, and probability operators to explain.7

7 It is sometimes said that the goal of semantic theory is to predict which inferences are deemed acceptable by competent speakers of a language (Chierchia and McConnell-Ginet 2000, p. 52, Yalcin 2012c, Holliday and Icard 2013, §3). From this point of view, it may appear unclear exactly what needs to be explained about (5)-(6). Surely, a minimal bar would be to explain why they are consistent. However, and crucially, a semantic theory must deliver at least two more things: first, together with the metasemantics, it must make predictions about speakers' dispositions to accept sentences in context; second, it must connect smoothly to an account of communication and conversation, so as to explain phenomena such as disagreement, assertion, retraction and to form a basis for the calculation of implicatures. It is with respect to the first of these two additional tasks that (5)-(6) pose the biggest problems.

Accounting for the meanings of embeddings like (5)-(6), I claim, motivates Probabilize. The argument I advance here is that, given a natural view of the meanings of their probabilistic antecedents, we must handle their consequents as sensitive to an underlying probabilistic structure. Let me first discuss the probabilistic antecedents. Much recent literature has developed reasons in favor of a probabilistic semantics for sentences like:8

(8) It is likely that they will attack you on the right.

For example, Yalcin (2010) shows that a probabilistic semantics is a natural fit for a core of inferences involving probability operators.9 Swanson (2011) notes that it seems difficult, if not impossible, to reduce the quantitative aspect of probabilistic discourse to a purely qualitative basis. On a simple probabilistic semantics, the semantic value of the probability operator likely (short for 'It is likely that...') may be characterized as in (9).

(9) $\llbracket \text{likely } A \rrbracket^{c,Pr,w} = T$ iff $Pr(A) > .5$

If a probabilistic semantics like (9) is on the right track, we should ask what sort of local context is created by evaluating conditional antecedents of the form ⌜If it is likely that A, ...⌝. One way to get a grip on this problem is to think about what it is to suppose that ⌜It is likely that A⌝. One option is factualism: the view that in making such suppositions one entertains a standard possible world proposition. Factualism is compatible with the semantics in (9), provided that one is a certain kind of contextualist. One can maintain that context determines a probability function Prc, and so ⌜It is likely that A⌝ expresses in c the proposition that is true at a world w just in case Prc(A) > .5. As for how context might determine such a probability function, one option is to claim that Prc is the subjective credence of the speaker of c. More complex accounts are possible (e.g. the collective credence of some salient group in c).

8 Yalcin (2007, 2010, 2012b); Swanson (2011); Lassiter (2011); Rothschild (2012); Moss (forthcoming). In addition to these authors, some think that probabilistic resources might be needed for the semantics of conditionals: see Edgington (1995), Bennett (2003) among others.

9 But see Holliday and Icard 2013, §9 for an account that retrieves essentially the same core inferences identified by Yalcin in a comparative probability setting. In reply, Lassiter (forthcoming) notes that there might still be inferences involving sentences with ratio modifiers (⌜A is twice as likely as B⌝) that elude the comparative probability approaches.

Despite the familiarity of this proposal, there are well-known reasons to doubt it (Yalcin, 2007, 2011; Swanson, 2011; Rothschild, 2012). For instance, Yalcin (2007, pp. 996-997) and Rothschild (2012, p. 102) note that factualism seems to make attitudes towards modal and probabilistic contents to be about the wrong sorts of things. To make a supposition to the effect that A is likely is to (provisionally) enter in a particular state of uncertainty about the world. According to the factualist, however, when Dana makes such a supposition, she enters into a state of certainty about a credence, either her own or the group's.
This seems to be the wrong content for the supposition in question, and it motivates a different model of attitudes towards probabilistic contents. I don't wish to debate the factualist here about whether her position has been thoroughly refuted by these objections. These arguments are meant to open up new lines of inquiry, not close off the old ones. They motivate a nonfactualist model of probabilistic discourse. Ultimately, the score can be settled only globally, by considering a large variety of phenomena and fully fleshed out theories.

The baseline non-factualist idea for probabilistic sentences is this: to suppose that it will likely rain is provisionally to adopt a state that supports the probability of rain. Moving back to the case of conditionals, the basic idea will be to conceive of probabilistic antecedents as operating on an epistemic (and specifically probabilistic) state that is given as a coordinate of semantic evaluation. Evaluating an antecedent of the form ⌜likely A⌝ involves creating a local context according to which A is likely. The consequent of the conditional is evaluated relative to such a local context.

The details of this operation vary from semantic theory to semantic theory. However, a simple preliminary model of how this might work (not the one I will end up adopting in §4) is as follows. Assume that probability operators are evaluated relative to a probability function, as in (9) above. To evaluate conditionals of the form (if likely A)(B), shift the probability function Pr to some other function Pr′ that is suitably related to Pr and makes A likely.

If this picture of the behavior of probabilistic antecedents is on the right track, an account of the meaning of (5) and (6) requires a semantics for should on which the deontic modal is sensitive to the probabilistic update introduced by ⌜likely A⌝. Otherwise, we would not be able to predict the consistency of (5)-(6).
The same holds for other modal expressions with deontic interpretations, such as may, must, better, etc.

The first argument for Probabilize, then, is that it constitutes a natural extension of a plausible picture concerning the meaning and conversational role of expressions of probability. The extension is needed to make sense of what pairs like (5)-(6) mean. It should be obvious that this argument for a probabilistic framework is only as strong as the underlying picture of probability claims. I take this as a welcome result: Probabilize does not seem to be an uncontroversial thesis, but it is supported by independently interesting and well-developed views concerning the language of subjective uncertainty and is a natural extension of these views.

1.2 Information Sensitivity and Probabilistic Structure

My second argument for Probabilize involves embeddings of deontic modals in the consequent of ordinary deontic conditionals. Kolodny and MacFarlane (2010) argue that deontic modals are information sensitive. Very roughly, their central hypothesis is that a correct account of deontic conditionals requires a quantificational semantics for ought and, I suppose, should on which the domain of quantification is specified as a function of an underlying information state. This proposal is motivated by an analysis of the Miners puzzle (Regan 1980; Parfit unpublished, 2011; my variant is loosely inspired by Jackson 1991):

Rescue: in a far away land, there is a natural disaster and you are the only rescuer available. Before you go on your mission, you are offered a choice to take one of two pills (you may also refuse and take no pills). If you refuse, you will rescue 9 people. The outcome of taking one of the pills depends on your genetic makeup. The red pill will give you extra energy if you have the red gene (you will rescue 10 people), but it'll make you faint if you have the blue gene (you will rescue 0 people).
The blue pill will give extra energy, if you have the blue gene, and make you faint if you have the red one. No one knows your genetic makeup and there is not enough time to find out: as far as we all know, probabilities are 50/50.

It is helpful to visualize the scenario as a decision matrix:

                  Red Gene (.5)    Blue Gene (.5)
Take Red Pill     Rescue 10        Rescue 0
Take Blue Pill    Rescue 0         Rescue 10
Refuse Pills      Rescue 9         Rescue 9

We want to derive that the following judgments are jointly consistent:

(10) You should refuse the pills.
(11) If you have the red gene, you should take the red pill.
(12) If you have the blue gene, you should take the blue pill.

An information sensitive semantics, on which the domain is only specified as a function of an information state i, provides an intuitive account of these examples.10 This is Kolodny and MacFarlane's information-sensitive semantics:

(13) $\llbracket \text{should } A \rrbracket^{c,d,i,w} = T$ iff $\forall w' \in d(i),\ \llbracket A \rrbracket^{c,d,i,w'} = T$

Here d is a deontic selection function that inputs the information state i and outputs a domain. Since Kolodny and MacFarlane assume that the job of conditional antecedents is to shrink the information state i, deontic conditionals may quantify over a significantly different domain from bare deontic claims. This, in turn, allows an explanation of the consistency of the verdicts. It turns out to be pretty difficult to explain how to derive all three verdicts at once if, instead of being information sensitive, deontic modals quantify over a domain that is fixed by context in a more direct way.11

The move to an information sensitive semantics is, in my view, well motivated.12 Since the present aim is to develop the information sensitive approach, rather than to justify it, I do not discuss ways of resisting arguments for information sensitivity (see the references in footnote 11). My focus here is to argue that if we adopt an information sensitive approach, there is reason to take an extra step and adopt a probabilistic model of information.
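The way (13) lets (10)-(12) come out jointly acceptable can be illustrated with a small Python sketch. It rests on assumptions that are not part of Kolodny and MacFarlane's proposal: worlds are modeled as (action, gene) pairs, the contextual and world parameters c and w are left out because they are idle in this example, and the selection function d is a hypothetical one that keeps the worlds whose action has the best guaranteed outcome across the information state. Other selection functions would serve equally well for the illustration.

```python
from itertools import product

ACTIONS = ("red_pill", "blue_pill", "refuse")
GENES = ("red", "blue")

# People rescued in each world, read off the Rescue decision matrix.
RESCUED = {
    ("red_pill", "red"): 10, ("red_pill", "blue"): 0,
    ("blue_pill", "red"): 0, ("blue_pill", "blue"): 10,
    ("refuse", "red"): 9, ("refuse", "blue"): 9,
}

# A world is an (action, gene) pair; an information state is a set of worlds.
ALL_WORLDS = frozenset(product(ACTIONS, GENES))


def d(i):
    """Hypothetical selection function (an assumption for illustration only):
    keep the worlds of i whose action has the best guaranteed outcome in i."""
    def guaranteed(action):
        return min(RESCUED[w] for w in i if w[0] == action)
    best = max(guaranteed(action) for action, _ in i)
    return frozenset(w for w in i if guaranteed(w[0]) == best)


def should(prop, i):
    """Entry (13): 'should A' holds relative to i iff A holds throughout d(i)."""
    return all(prop(w) for w in d(i))


def shrink(i, antecedent):
    """Conditional antecedents shrink the information state."""
    return frozenset(w for w in i if antecedent(w))


def takes(action):
    return lambda w: w[0] == action


def has_gene(gene):
    return lambda w: w[1] == gene


verdict_10 = should(takes("refuse"), ALL_WORLDS)
verdict_11 = should(takes("red_pill"), shrink(ALL_WORLDS, has_gene("red")))
verdict_12 = should(takes("blue_pill"), shrink(ALL_WORLDS, has_gene("blue")))
print(verdict_10, verdict_11, verdict_12)  # True True True
```

On this toy model the bare claim and the two conditionals quantify over different domains, because the antecedents change the information state that d takes as input. Nothing in the sketch represents graded information; the state is just a set of worlds.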
Kolodny and MacFarlane's semantics makes deontic modals responsive to qualitative information states. Consider, for example, this variant on Rescue:

Knowledge: Everything is exactly as it is in Rescue but now you (the agent) as well as all the participants to the conversation acquire conclusive and irrefutable evidence that you have the red gene.

In Knowledge, we judge (10) to be unacceptable: intuitively, you should take the red pill. Kolodny and MacFarlane's apparatus can deliver this verdict: acquiring the information that the agent has the red gene may shift the domain. In the resulting domain, every world may be one in which the agent takes the red pill.

10 There is a complex debate to be had about what sort of state i is and how it gets fixed. Towards the end of their (2010), MacFarlane and Kolodny hint towards relativism. According to the relativist, the knowledge state of the assessor sets the initial value for i (other operators might shift it away from its initial value). However, their compositional semantics is explicitly neutral on this issue: a contextualist and even a certain kind of expressivist might deploy it. Likewise, the semantic theories presented here do not require a stand on these issues. Relativism is more explicitly avowed in MacFarlane 2014, §11.3-ff., but again the compositional semantics presented there does not require a relativist construal.

11 In addition to Kolodny and MacFarlane (2010), see also Charlow (2013); Cariani et al. (2013); Silk (2014) for developments of the pro-information sensitivity arguments. For defenses of the classical semantics, see von Fintel (unpublished); Dowell (2012).

12 In my preferred account, the parameter i does not have to be tied to information. But whatever it is tied to, the job of prioritizing possibilities must be executed as a function of i.

In Knowledge, we add information that has a simple qualitative effect on i: we know that you have the red gene.
The situation of perfect ignorance in Rescue becomes one of perfect knowledge. But there are cases whose effects seem best described in terms of graded states: Slanted: Everything is exactly as it is in Rescue but now you have strong evidence (known to you and to all the participants to the conversation) that leads you to assign .99 probability to your having the red gene. As in Knowledge, in Slanted, it is plausible to judge (10) to be unacceptable.13 If there is a .99 probability that you have the red gene, it is plausible that you should take the red pill (of course, if .99 is not enough we could choose a larger probability value short of 1). On the basis of these cases, I advance the following argument: (P1) If the difference between Knowledge and Rescue is traced to the value of an informational parameter in the semantics, then so is the difference between Slanted and Rescue. (P2) The difference between Knowledge and Rescue is traced to the value of an informational parameter in the semantics. (P3) If the difference between Slanted and Rescue is traced to the value of an informational parameter in the semantics, then Probabilize holds. (C) Probabilize holds. I think (P1) is intuitively plausible: the cases are not sufficiently different to warrant altogether different treatments. Those who accept Kolodny and MacFarlane's proposal that deontic modals are information sensitive also accept (P2). But I suppose that one may–and Kolodny and MacFarlane (2010) do–disagree with (P3). As they point out, they adopt: an epistemic and non-probabilistic model of information states; it takes information states to be sets of known facts. We have chosen this model because we think that what one ought to do (relative to an information state) supervenes on what is known: mere differences in beliefs (or partial beliefs) or perceptual states, unaccompanied by differences in what is known, cannot make a difference to what an agent ought to do (footnote 23, p. 130). 
13 I am not claiming that the semantic theory must deliver this verdict of unacceptability for (10). To do so would be to say that risk averse decision theories are false as a matter of meaning. The point in the main text is not to exclude these theories. It is, instead, that, in context, there are natural practical theories that reject (10) in Slanted.

Armed with this picture, one may diagnose Slanted as involving acquisition of a piece of non-probabilistic evidence which affects the probabilistic facts only via the relevant supervenience function. Imagine, for example, that the context of Slanted arises after some reliable test for the red gene came back positive.

I do not know whether the supervenience claim Kolodny and MacFarlane endorse is true. Luckily, I do not think we need to engage with the debate over its truth. Even granting the supervenience claim, it still does not follow that we should theorize at the level of the supervenience basis (i.e., qualitative information states). It is a familiar point that there are supervenience structures in which the 'higher' levels are explanatorily autonomous. For example, chemical facts might supervene on physical facts without undermining the autonomy of chemical explanations. Closer to our concerns, the explanatory constraints on semantics dictate relativizations that are independent of whatever supervenience relations might hold. For instance, one may relativize the truth-conditions of "Someone owns a guitar" to worlds and assignments but that is perfectly compatible with the claim that existential facts about guitar ownership supervene on worlds alone. The case of the relationship between probabilistic states and deontic judgments may be one such case. In sum, Kolodny and MacFarlane's argument does not motivate resistance to my premise (P3), and without such motivation, my argument for Probabilize stands.

1.3 Taking Stock

Consideration of embeddings of deontic modal judgments, e.g.
in the consequents of conditionals, provides two significant arguments for Probabilize. Since deontic modals can occur (§1.1) in the consequents of conditionals with probabilistic antecedents, a semantic theory must explain how these are connected and a theory that satisfies Probabilize seems ideally placed to do so. Furthermore (§1.2), there is independent reason to maintain that deontic judgments may have to be evaluated relative to information states. Once that is granted, it is difficult to resist the fine-graining of these states.

2 EXPECTED VALUE ACCOUNTS

In this section, I introduce an expected value framework for deontic modals in more detail. I raise three criticisms that apply to every version of the theory I am aware of.14

14 For different critical discussions of the approaches I discuss in this section, see Rubinstein (2012, §2.2.1) and Yalcin (2012a, § IX).

2.1 Spelling out the Expected Value Accounts

Suppose that context provides a probability function Pr and a value function v that assigns to each possible world w a real number representing the value of that world according to some contextually salient standard. With these tools (and assuming for simplicity that there are finitely many worlds) we can calculate one kind of expected value for a possible-world proposition A as:

$$EV(A) = \sum_{w \in A} v(w) \cdot Pr(w \mid A)$$

This quantity is the weighted average of the value of the worlds that belong to A, weighted by their probability (conditional on A). Some versions of decision theory evaluate the rationality of choices on the basis of what might appear to be a sharpening of the above concept of expected value.15 The sharpening in question involves several additional commitments. For example, v is identified with a quantity called utility, a numerical representation of the subjective desirability of an outcome. It is also claimed in these theories that the rationally permissible choices are exactly those choices that maximize expected utility.
Neither of these commitments is essential to using expected values in semantics. According to the best versions of this approach, the choice of value function is not encoded by the semantics and is left up to context (hence the value function need not be identified with an agent's utility function).16 Proponents emphasize that the point of the expected value analysis is not to stipulate that deontic modals implement a particular decision theory; after all, they accept the milder of our two degrees of neutrality outlined in the introduction. It is, rather, that we must do semantics with a finer set of resources and more structure. For similar reasons, the semantic analysis is not wedded to the particular way of computing expected values I have identified here. One may well develop a semantic theory based on a notion of expected value that is closer to how causal decision theorists (e.g. Skyrms 1982; Joyce 1999) define the notion.

15 When one looks closer, it turns out to be quite different. Before formulating a precise notion of expected utility, a decision theorist will generally set up a sophisticated modeling discipline concerning what counts as a decision problem (see, inter alia, Joyce, 1999, ch. 2). This modeling discipline affects how one formulates the relevant notion of expected value. For example, utility does not attach to individual possible worlds, but to coarser objects called outcomes.

16 This is why Wedgwood (forthcoming) helpfully suggests using the terms 'value' and 'expected value' rather than 'utility' and 'expected utility'. I follow Wedgwood on this score.

Expected value analyses come in many different variants. On one variant, should-sentences directly express comparisons of expected value: ⌜should A⌝ may be interpreted as meaning that A has higher expected value than the relevant alternatives; or perhaps it may mean that A has higher expected value than a contextually set threshold.
Both of these approaches yield a non-classical deontic logic (for example, should is not closed under logical consequence and does not agglomerate over conjunction). For those who prefer more canonical logics,17 it is easy to sketch an expected value analysis that generates a relatively classical deontic logic. One just needs to adopt two ideas: (i) should/ought are analyzed as universal quantifiers and (ii) the domain over which they quantify is the union of those salient alternatives that maximize expected value (relative to the given contextual parameters). All of this is to highlight an important point: favoring expected value analyses does not require choosing a particular semantic account and it does not determine a particular logic.

In the rest of this section, I discuss expected value accounts generally, regardless of differences in the logics that they project. This makes my critical task more difficult, because I restrict myself to raising objections that apply to all expected value theories (and there are other objections that apply to some theories, but not to others). To narrow down the field of opposing views, I limit myself to views that entail:

(FB1) ⌜should A⌝ is true if (but not necessarily only if) the set of salient A-worlds has higher expected value than each of the salient alternatives.

(FB2) ⌜should A⌝ is false if (but not necessarily only if) for some B, the set of salient A-worlds and the set of salient B-worlds (a) are disjoint and (b) have the same expected value.

In less formal terms, (FB1) says that if o is better than every other option, then o is what you should do. (FB2) says that if two incompatible options have equal expected value, then it cannot be that one of them is exactly what you should do. These constraints are satisfied by all the expected value accounts I know of.

Expected value theories can easily account for Rescue and its variants (Lassiter, 2011, §6.4.2.2). Consider again:

(10) You should refuse the pills.
(11) If you have the red gene, you should take the red pill.

(12) If you have the blue gene, you should take the blue pill.

17 Even though I have argued (Cariani, 2013a) that closure under logical consequence is not a desideratum in a logic for should and ought, I also proposed that agglomeration over conjunction is desirable, with the possible exception of moral dilemmas (see Cariani forthcoming).

Suppose that the context determines the value function v as follows: v(w) = the number of people you rescue in w (it is an arbitrary choice, in the sense that we might have picked a different v, but it will do as an illustration) and that Pr is a probability function that satisfies the constraints stated in the description of Rescue (e.g., the probability of your having the red gene is .5). Then,

• (10) is predicted to be acceptable because refusing the pills has the highest expected value among the alternatives.

• (11)-(12) are also predicted to be acceptable. On one way of construing conditionals, conditional antecedents can update the salient probability function (see the account of the conditional in the system in §3, which is available to the expected value theorist).18 In the local context created by the antecedent of (11), the option that maximizes expected value is taking the red pill. In the local context created by the antecedent of (12), the option that maximizes expected value is taking the blue pill.

Expected value theories also predict the variants of Rescue I have considered. For instance, the reversal on (10) in Slanted and Knowledge is predicted because the expected value of taking the red pill increases as the probability of your having the red gene increases. Once it increases far enough, taking the red pill might become the option with the highest expected value.

Summing up: Rescue and its variants involve decision theoretic verdicts. A theory that mirrors the structure of Bayesian decision theory is well suited to explain them.
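The predictions just summarized can be checked on a toy model. The following is a minimal sketch, not part of the formal proposal: the payoff numbers in `value`, the independence of gene and action in `make_pr`, and all function names are assumptions introduced for illustration only (the payoffs are placeholders, not the figures from the description of Rescue). The function `expected_value` implements the conditional weighted average EV(A) from §2.1.

```python
from typing import Callable, Dict, Set, Tuple

World = Tuple[str, str]  # (gene, action); a hypothetical encoding of worlds

GENES = ("red", "blue")
ACTIONS = ("take_red", "take_blue", "refuse")


def value(world: World) -> float:
    """Hypothetical v(w): number of people rescued in w (placeholder numbers)."""
    gene, action = world
    if action == "refuse":
        return 7.0
    return 10.0 if action == "take_" + gene else 0.0


def make_pr(p_red: float) -> Dict[World, float]:
    """Assumed Pr: gene is independent of action, and actions are equiprobable."""
    gene_pr = {"red": p_red, "blue": 1.0 - p_red}
    return {(g, a): gene_pr[g] / len(ACTIONS) for g in GENES for a in ACTIONS}


def expected_value(A: Set[World], v: Callable[[World], float],
                   pr: Dict[World, float]) -> float:
    """EV(A) = sum over w in A of v(w) * Pr(w | A)."""
    pr_A = sum(pr[w] for w in A)
    if pr_A == 0:
        raise ValueError("Pr(A) = 0, so Pr(w | A) is undefined")
    return sum(v(w) * pr[w] / pr_A for w in A)


def conditionalize(pr: Dict[World, float], B: Set[World]) -> Dict[World, float]:
    """Update Pr on B, as a conditional antecedent is assumed to do."""
    pr_B = sum(pr[w] for w in B)
    if pr_B == 0:
        raise ValueError("cannot conditionalize on a probability-0 proposition")
    return {w: (p / pr_B if w in B else 0.0) for w, p in pr.items()}


def alternative(action: str) -> Set[World]:
    """The proposition that the agent performs the given action."""
    return {(g, action) for g in GENES}


def gene_is(gene: str) -> Set[World]:
    """The proposition that the agent has the given gene."""
    return {(gene, a) for a in ACTIONS}


def best_option(pr: Dict[World, float]) -> str:
    """The action whose alternative has the highest expected value."""
    evs = {a: expected_value(alternative(a), value, pr) for a in ACTIONS}
    return max(evs, key=evs.get)
```

With these assumed payoffs, `best_option(make_pr(0.5))` picks refusing, `best_option(make_pr(0.99))` picks the red pill, and conditionalizing on either gene proposition picks the matching pill, which mirrors the pattern in (10)-(12) and the reversal in Slanted.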
2.2 Against Expected Value Accounts

Despite these advantages, there are important reasons to explore alternatives to these accounts. I consider three kinds of implausible verdicts they force.

Attitudes of Non-Bayesian Agents

Suppose that John is an extremely risk averse subject. According to John, in every decision problem, one should choose one of the options that yield the least bad worst-case scenario (John abides by the decision-theoretic rule maximin and thinks everyone ought to as well). According to John's decision-making extremism, you should refuse the pill even in Slanted (this was the case in which there is .99 probability that you have the red gene). Suppose, now, that John deliberates about Slanted on your behalf; because of his commitment to maximin, he comes to the conclusion that refusing is the thing to do. As a consequence, the following seems acceptable:

(14) John thinks you should refuse the pill.

18 This assumes that one of the effects of evaluating a conditional is to conditionalize the underlying probability function. I think it is plausible that this is one possible construal of deontic conditionals. But I also think that there is another construal on which conditional antecedents restrict a covert necessity modal, even when should is present. See my discussion in fn. 35.

How should an expected value theorist handle (14)? The first task is to find an appropriate model for ⌜j thinks that A⌝, when A is a sentence including deontic vocabulary. Expected value theories relativize compositional semantic values (at least) to quintuples of the form ⟨i, Pr, Alt, v, w⟩, where i is a qualitative state (a set of worlds), Pr a probability function, Alt a set of alternatives, v a value function and w a world. In this setting, a prominent option for the semantics for thinks would be to extend a Hintikka-style operator treatment (Hintikka, 1962) to these complicated points of evaluation.
A first stab might be:19

$$\llbracket \textit{thinks}(j, A) \rrbracket^{i,\Pr,Alt,v,w} = T \text{ iff for all worlds } z \text{ in } i_j(z),\ \llbracket A \rrbracket^{i_j(z),\Pr_j(z),Alt,v_j(z),z} = T.$$

Notation: $i_j(w)$ is j's qualitative state in w, $\Pr_j(w)$ is j's credence in w, $v_j(w)$ is j's value function in w. Let us concede for the sake of argument that the above descriptions (e.g. 'j's value function in w') are all uniquely referring.20 The theory's prediction is that (14) says something roughly like:

(15) For every probability function and value function compatible with John's state, refusing the pill has higher expected value than the alternatives.

At this point, the problem should be clear: (15) does not capture the meaning of (14). (14) is not an ascription to John of the content that refusing the pill maximizes expected value. Rather, (14) is an ascription to John of a way of prioritizing alternatives relative to which refusing the pill is the best option.

19 The analysis that follows is built on a modification of Hintikka's account based on ideas by Stephenson (2007) and Yalcin (2007). The innovation is to model modal belief by shifting not just the world of evaluation to match the belief worlds, but also the entire belief state i.

20 A possible complication is relevant here: plausibly, ⌜thinks(j, should(A))⌝ should not switch the value function v to one based on j's relevant priorities, but rather to the value function that j assigns to whoever is the agent of the deontic modal (if there is one). To address this point, Yalcin (2012a) formulates the expected value theory by replacing v with functions h(A, w) (modeled after the hyperplans of Gibbard 2003) that take each agent A and world w to a value function. We could then say that thinks shifts the salient h-function to $h_j$, i.e. the hyperplan of the subject of the attitude ascription. This all strikes me as correct, but it isn't necessary to cover the cases I deal with here: I assume, instead, that j's value function in w is based on the agent's priorities.
Expected value theorists might reply that the attitude ascription expresses a comparison among expected values relative to a value function that builds in John's risk aversion. After all, as I emphasized, they need not think that one's value function coincides with one's utility in the decision theorist's sense, and it may appear that my argument above requires this. Although I agree that this move is available, it does not solve this particular problem. To see why, we just have to change some features of the case, by taking the option of refusing the pill off the table. Consider this variant context: Rescue−: Everything is exactly as it is in Rescue but now someone is going to force-feed you a pill (either the red one or the blue one). You get to choose which pill you will be given. In this context, it is plausible to accept: (16) John does not think you should take the red pill and does not think you should take the blue pill. After all, John thinks that either pill would be equally good (or equally bad), so he thinks you should be indifferent. Since no other options are salient, this means that taking the red pill or the blue pill must have the same expected value. Now, suppose that some evidence becomes common ground that makes it .99 likely that you have the red gene. That is, Rescue− gets updated to a context that resembles Slanted, except that, again, you do not have the option of refusing. Call this context Slanted−. Despite the fact that Pr(red gene) = .99 (in Slanted−), maximin does not prioritize taking the red pill over taking the blue pill. That is to say that in Slanted− John remains indifferent between the pills, and hence (16) is still acceptable. But the expected value semantics does not predict this. 
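The contrast between the two rankings in Rescue− and Slanted− can be made concrete with a small computation. This is a minimal sketch under stated assumptions: the payoffs in `value` are hypothetical placeholders, the function names are invented for illustration, and `should` treats an option as what one should do exactly when it is the unique top-ranked option, a simplification chosen to agree with (FB1) and (FB2). Maximin is computed over the qualitative state (the genes that remain possible), not over the probability function.

```python
from typing import Dict, Iterable

GENES = ("red", "blue")
ACTIONS = ("take_red", "take_blue")  # refusing is off the table in Rescue−


def value(gene: str, action: str) -> float:
    """Hypothetical payoff: 10 if the pill matches the gene, 0 otherwise."""
    return 10.0 if action == "take_" + gene else 0.0


def ev_scores(p_red: float) -> Dict[str, float]:
    """Expected value of each action, given the probability of the red gene."""
    return {a: p_red * value("red", a) + (1.0 - p_red) * value("blue", a)
            for a in ACTIONS}


def maximin_scores(live_genes: Iterable[str]) -> Dict[str, float]:
    """Worst-case payoff of each action over the genes that are still possible."""
    live = list(live_genes)
    return {a: min(value(g, a) for g in live) for a in ACTIONS}


def should(action: str, scores: Dict[str, float]) -> bool:
    """True iff the action is the unique top-ranked option (cf. FB1 and FB2)."""
    top = max(scores.values())
    winners = [a for a, s in scores.items() if s == top]
    return winners == [action]


# Rescue−: both rankings tie, so neither pill is what you should take.
rescue_ev = ev_scores(0.5)
rescue_mm = maximin_scores(GENES)

# Slanted−: only Pr shifts; both genes remain possible.
slanted_ev = ev_scores(0.99)
slanted_mm = maximin_scores(GENES)
```

On these assumptions, the maximin scores are identical in the two contexts, since the update changes only the probability function, while the expected value ranking comes to favor the red pill. That is the divergence the argument in the text turns on.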
The expected value theory must be bound by the following structural commitment: If John thinks you should be indifferent between taking red or blue in Rescue−, and if Slanted− results only by shifting the relevant probability function to make it .99 likely that you have the red gene, the expected value of taking the red pill in Slanted− must be higher than the expected value of taking the blue pill.21 Therefore, the expected value theorist predicts, contrary to our judgments, that (16) is unacceptable in Slanted−.

To avoid this result, the expected value theorist might claim that, in updating from Rescue− to Slanted−, we must also change the value function. I see two problems with this reply. First, this is neither intuitive nor theoretically motivated: nothing in the update to Slanted− suggested that we had to affect the value function. Second, even in Rescue−, we can say things like:

(17) Even if it were more likely than not that you have the red gene, John would still not think you should take the red pill.

It is implausible that the local context created by the antecedent of (17) involves a change in value function.

Another reply that is available to the expected value theorist might be to reject the operator treatment of epistemic attitudes. In the literature on propositional attitudes, the standard alternative to operator treatments is a relational account. Let C be a context, however conceived. Then, the first stab at a relational semantics for thinks is given by:

$$\llbracket \textit{thinks}(j, A) \rrbracket^{C,w} = T \text{ iff in } w,\ j \text{ stands in the thinks-relation to the structured content of } A \text{ in } C.$$

21 Technically, it is possible to define a value function that violates this. For example: v(w)=10 if, in w, you are certain that you have the red (viz. blue) gene and you take the red (viz. blue) pill. v(w)=0 if, in w, you are uncertain about your genetic state and take neither pill. v(w)=-10 if, in w, you either take a pill in a state of uncertainty, or take the wrong
pill. This kind of value function is counter to the spirit of information sensitive theories. All these theories separate the contribution of the information state from the contribution of whatever device we use to prioritize possibilities. This objection has an empirical component, in that the expected value approach, with v as value function and the natural assignments to the other parameters, wrongly predicts that

(i) John thinks that if you have the red gene, you should take the red pill.

should come out false.

There are two main reasons to adopt a relational analysis of thinks over an operator account. First, to slice contents more finely than is allowed by a possible-worlds representation (e.g., if $\top_1$ and $\top_2$ are distinct tautologies, we may want to accept ⌜John thinks that $\top_1$⌝ but not ⌜John thinks that $\top_2$⌝). Second, to block problems of logical omniscience (e.g., if A is a contingent sentence, it need not follow from ⌜John thinks that A⌝ that ⌜John thinks that $\top_2$⌝). The problem I raised for the expected value theories does not appear to be of either kind: it does not involve the claim that the expected value approaches slice contents too coarsely, and it does not appear related to issues of logical omniscience.

This observation leads to a more general point: the appeal to the relational analysis is merely evasive, unless it is complemented by an account of what content is expressed by a deontic sentence in a given context. And it is difficult to imagine an account of the content of ⌜should A⌝ that (i) is plausible in light of the meaning that the expected value semantics assigns to deontic sentences and (ii) makes the right predictions about (14) and all the related cases in which an agent has what I have been calling a 'non-Bayesian' attitude. Let me emphasize that this is in no way a criticism of the relational analysis. My target is the claim that the relational analysis helps with the problem.
The point that supports it is that the kinds of facts that justify the relational analysis seem orthogonal to the present argument. My opponent may reply here that they are not orthogonal after all: perhaps John really does have inconsistent deontic beliefs. But there is no reason to accept this claim (the points that follow are more extensively discussed in Cariani, 2014, which was written as an elaboration of this argument). As a preliminary point, the contents of John's beliefs are strictly speaking consistent even according to expected value analyses: these analyses only make them inconsistent relative to background facts about the context. More importantly, there is a clear intuitive difference between how we judge John's beliefs and how we judge the beliefs of someone who ranks alternatives in a totally incoherent way (e.g. by minimizing the minimum value). There is no pressure for the semantics to predict the consistency of the deontic beliefs of such an agent. From this point of view, the 'miniminer' is just like someone who has incoherent views about, say, universal quantification. I am happy to rest my argument on the judgment that not every deviation from the expectational paradigm is a form of logical inconsistency.

Disagreement about Decision Rules

The previous argument can be extended and simplified by considering judgments about disagreement. Suppose that Will, like John, has a favorite decision rule. Unlike John, Will thinks that everyone ought to maximize expected utility. Intuitively, Will and John agree in Rescue− that:

(18) It is not the case that you should take the red pill and it is not the case that you should take the blue pill.

Similarly, it is intuitive that they disagree on:

(19) If it is likely that you have the red gene, you should take the red pill.

I argue that if expected value analyses model the agreement on (18) correctly, then they cannot model the disagreement on (19).
To set up this argument, I need a skeletal model of agreement and disagreement between different parties.22 Given an evaluator α and a point of evaluation b, we might associate with α a shifted point of evaluation $b_\alpha$, similar to what we obtained in the operator semantics for thinks (but now without the aim of giving a semantics for attitudes). Earlier, I considered points of evaluation of the form ⟨i, Pr, Alt, v, w⟩. So, for instance, $b_{Will}$ might be $\langle i_{Will}, \Pr_{Will}, Alt_b, v_{Will}, w_b \rangle$. We can then say:

α and β disagree on A (relative to b) iff $\llbracket A \rrbracket^{b_\alpha} \neq \llbracket A \rrbracket^{b_\beta}$.23

In this setting, (18)-(19) pose the following problem: given that it is settled that you will take one of the pills, the only way for John and Will to agree on (18) is if the expected values of taking red and taking blue are identical (in their respective states). But, if that is the case, John and Will should not disagree on (19), for reasons that parallel those I discussed in the case of the attitudes. The conditional antecedent in (19) operates on the probabilistic coordinates of $b_\alpha$ and $b_\beta$, creating local contexts in which having the red gene is more likely than having the blue gene. So, α and β should both accept (19).24

In sum, the expected value analysis can account for some disagreements by postulating value functions that differ from the agent's utility function. If, however, we use patterns of agreement to constrain the value function, as we did to get the agreement on (18), the analysis cannot account for disagreements for all sentences that involve a shift in the background state.

Zero-probability Events and Decision Problems

In the philosophy of probability, several authors have stressed the distinction between probability 0 and epistemic impossibility (for a recent discussion, see Easwaran, 2008).
This observation has important implications in decision theory (Skyrms 1980; Easwaran 2014; Hájek unpublished; see Briggs 2014, §3.3, for a nice summary of this discussion). These extensions have a direct impact on deontic semantics, especially on expected value theories.

22 As with the account of attitudes I entertained in the previous section, this is an extremely simplified approach that is nonetheless sufficient to highlight a difficulty that remains problematic even when we complicate our account of disagreement. For a much more nuanced story about normative disagreement, see Plunkett and Sundell (2013).

23 Note that this notion of disagreement concerns a sentence in context. In my view the most fundamental notion of disagreement concerns contents of speech acts and attitudes, but I intend to run the present argument without adding to the expected value framework controversial views about contents. Given this, the notion in the text seems fairly intuitive (see Willer, 2013, for a use of a similar notion in the context of disagreement with epistemic modal sentences).

24 Dan Lassiter points out to me that this is not technically true on the account of Lassiter (2011): on his view, should-sentences can be false if there are small differences in expected value among options. However, I don't think that this response avoids the problem, though it may patch some of its occurrences. Even on Lassiter's view, there are many contexts in which two options A and B have similar expected value unconditionally, but have vastly different expected values if we shift or update the underlying probability function.

Consider this case:

Darts. Jeff and Zara are playing a game. Zara throws infinitely fine darts at the [0,1] interval. Zara is a perfect randomizer on such dart throws: her throw is fair. Jeff makes predictions about the outcome of Zara's throw. Let D∞ = "this dart thrown at the [0,1] interval will hit π/4".
If Jeff raises his right hand, he is predicting that D∞ will happen. If Jeff raises his left hand, he is predicting that it will not. Suppose that Jeff gets prizes according to the following matrix.

                      D∞         ∼D∞
Raise right hand      $1M        nothing
Raise left hand       nothing    nothing

In such a case, dominance considerations support:

(20) Jeff should raise his right hand.

No amount of fiddling with value functions yields this verdict for expected value analyses. Since D∞ has probability 0, it contributes nothing to the expected value of raising one's right hand. I will not try to spell out possible technical solutions for the expected value theorist: none are as simple as the solution I offer in §3.5. But perhaps it is worth considering a conceptual objection suggested by a reviewer (the following is not a direct quote).

Let us grant that in uncountable spaces some events must have probability 0. It does not follow that the possible events that ordinary agents focus on are ever 0-probability events. After all, the contextually determined probability function might assign positive probability to D∞. Insofar as we get the intuition that Jeff should raise his right hand, it's because we treat D∞ as having positive probability.

I concede everything except for the last sentence. Maybe we do often treat events like D∞ as if they had positive probability. However, the decision theorists are competent speakers of English. Other people who are well acquainted with the relevant measure-theoretic facts are also competent speakers of English. It seems doubtful that their judgments must be based on surreptitiously assigning positive probability to D∞. To put the point another way: if the objection successfully blocked my use of (20) as a constraint on deontic semantics, it would be just as forceful in blocking the original decision theoretic arguments. But it does not do the latter, so it does not do the former.

Examples like Darts support a more general objection.
Expected value accounts would gain plausibility if they could be supported by the claim that they approximate the correct normative theory of rationally permissible choice. But this is not a concession that we should grant. The decision theory literature supports the view that an expectational analysis of rational decision-making makes correct predictions in a large class of one-off choice problems, restricted to individual agents and given probabilities that are (i) determinate and (ii) regular (no contingent proposition gets probability 0). Whether it generalizes beyond that to life-long plans (e.g. being a vegetarian), collective choices or non-regular probability models is a (series of) wide open question(s). Darts is one example in which it seems not to, but the general point is that it is premature to think that a formal model that is intended to have a very clearly delimited domain of application should be constitutive of the meaning of deontic modals.

2.3 Section Summary

I have outlined the basic moves in the space of expected value accounts and their core advantages. Expected value analyses come in a variety of logical strengths and none of them have any trouble handling the phenomena in §1. From a critical point of view, however, I advanced three objections. To start, they do not combine well with attitude reports for non-Bayesian agents. I have used subjects who accept maximin as a decision rule, but the point applies to any subject who does not rank alternatives according to their expected value. For similar reasons, expected value analyses do not interface well with models of disagreement. Finally, they predict that we compare expected values even in decision problems involving zero-probability states (cases in which even Bayesian decision theorists are reluctant to go with expectations).

3 PROBABILISTIC PREMISE SEMANTICS: A FIRST PASS

Let us try to collect all our goals in one place.
The classical accounts of deontic vocabulary satisfy the mildest degree of neutrality, but they are not information sensitive and they are not probabilistic (and so violate Probabilize). Expected value analyses provide probabilistic structure, but at conceptual and empirical costs. I propose a middle ground which, hopefully, has the benefits of both.25

In this section, I develop a system that is capable of elegantly modeling cases like Rescue and Slanted and more generally the phenomenon of information sensitivity. But the system also integrates well with an operator treatment of the attitudes and can easily represent different ways of prioritizing alternatives (as is called for in Darts). Due to its relative simplicity, the proposal in this section cannot handle the probabilistic conditionals of §1.1. This is not a disadvantage compared to the alternatives (all of which face a similar challenge). But it would be a serious problem if the theory could not explain cases that were essential in motivating it. To explain the meanings of those conditionals and the inferential relations between probabilistic and deontic language, the next section (§4) explores a natural extension of the system.

25 Carr (2012, unpublished) also advances a probabilistic account that is motivated by considerations of neutrality. For Carr, the semantic theory should have a 'decision rule' parameter. I prefer my proposal on a number of specific counts, but I will not make the comparisons explicit. There also appear to be similarities between my framework and the explanatory approach of researchers in the program of inquisitive semantics (Ciardelli et al., 2013). I defer the exploration of these similarities to separate work.
3.1 Formal Preliminaries

Assume that our target English sentences are translated into a sentential multimodal language L generated by atomic sentences, Boolean operators, epistemic modals may and must, a deontic modal should,26 a probability operator likely, and a conditional (if ·)(·).27 L has the usual formation rules, except that conditional antecedents are momentarily restricted to non-modal and non-probabilistic sentences. If the translation of an English sentence s does not have a wide-scope modal, we prefix it with a must when translating conditionals with s as their consequent.28

The function $\llbracket \cdot \rrbracket$ assigns truth-values to sentences of L relative to points of evaluation constituted by a context, an index and a world of evaluation. To streamline notation, I omit the context, because all of the parameters we will care about are bundled in the index.29 In the background of $\llbracket \cdot \rrbracket$ is a set of possible worlds W, which I assume, for simplicity, to be finite (except where noted).

26 Schroeder (2010) argues that ought is ambiguous, and that, at least in one sense, ought should not be understood as denoting a function from propositions to truth-values, but rather as denoting a relation between a subject and an action. The systems I describe here can be easily reformulated to fit Schroeder's view of the syntactic and semantic category of ought. For the opposing side of the argument, see Chrisman (2012); Finlay and Snedegar (forthcoming).

27 I adopt this 'translational' view of logical form for its familiarity to the philosophical audience. The alternative, which is prevalent among linguists and I think preferable, is the view that logical form is a level of syntactic representation of the English sentences themselves, without mediation from an intermediate formal language.
28 This is an attempt to replicate Kratzer's idea that conditional antecedents always restrict modals in their consequents without commitment to Kratzer's (1991a) syntactic hypothesis that there are no conditional operators at the level of logical form.

29 Two remarks. First, to define certain concepts of logical consequence, it would be useful to also have a notion of proper point of evaluation (that is, a triple whose index and world coordinate are directly assigned by context), but we will not need this here. Second, implementing some deontic semantic theories might require double indexing in the sense of Kamp (1971). An example of this is the actualism of Jackson (1985); Jackson and Pargetter (1986).

Possible worlds are objects that assign a truth-value (T or F) to every atomic sentence of L. Throughout, 'proposition' abbreviates 'set of possible worlds' (i.e. 'subset of W'). Substantial theses about linguistic content are not intended to follow from this modeling choice.

3.2 A Classical Setup

The framework I develop in this section follows the spirit, although not quite the letter, of Kratzer's premise semantics for modals. In Kratzer's framework, the compositional analysis of modals is given in terms of a pair of parameters, the modal base and the ordering source, whose job it is to determine the modal's domain. Formally, the modal base and the ordering source are the same type of object: they are sets of propositions (modal bases and ordering sources are more properly treated as functions from worlds of evaluation to sets of propositions; I simplify here because the added complexity is not necessary to model the core cases of this paper). However, the two parameters differ significantly in interpretation and in the role they play in the theory. The modal base delimits a range of salient possibilities (where salience can be characterized in different ways according to the context).
Different choices of modal base and ordering source create different flavors for the modals. In the modal flavors of interest here, the ordering source is a set of contextually salient binary preferences among worlds (following Portner 2009, I call the propositions in the ordering source priorities). Its job is to determine an ordering of worlds for deontic modalities to quantify over.

3.3 Three Unorthodox Ideas

My framework deviates from the standard Kratzer framework in three major respects. I introduce them here without giving a specific justification for each deviation: they are justified as a package by the explanations they allow (§3.5).

My first deviation is to replace modal bases with fine-grained states.

Definition 1 (Fine-grained states). A fine-grained state is a pair ⟨i, Pr⟩ consisting of a set of worlds i and a probability function Pr defined on the algebra of subsets $\mathcal{A}$ determined by some subset A of $\mathcal{P}(W)$.

Like modal bases, fine-grained states need not be determined by salient information. In some contexts, i might be determined by relevant facts, or by some other objective factor (ditto for Pr). However, when the modal is interpreted deliberatively (e.g. as expressing the conclusions of practical deliberation),30 i and Pr are plausibly understood to be respectively the agent's qualitative epistemic state and that agent's credence.

30 See Schroeder (2010) for some criteria that govern deliberative interpretations.

It is reasonable to worry that replacing the modal base with a single probability space picks up a heavy metasemantic debt. The worry is that it is implausible to suppose that, in every context, the factors that assign values to contextual parameters determine a single probability space. For instance, this would clearly be a problem if one believed that modal bases are determined by the attitudes of conversational participants.
In reply, I note that it is very easy to weaken my model so that fine-grained states are treated as sets of probability spaces (I do this in §4).31

31 Further weakenings are also possible: in fact, my semantics does not require any probabilistic structure at all. The design principle is to make it compatible with probabilistic structure, not to require it.

For my second deviation, I propose that we order alternatives, and not individual worlds. I model alternatives as propositions. When the deontic modal is interpreted deliberatively, these alternatives represent the courses of action available to an agent in context.32

Definition 2 (Alternatives). An alternative set Alt is a set of mutually exclusive propositions.

The idea of ranking alternatives is inspired by Horty (2001). Unlike Horty, however, I do not restrict myself to dominance orderings.33 For it to be possible to order alternatives, alternative sets must be available in the process of semantic evaluation. To satisfy this requirement, I will introduce in the indices (see Definition 4 below) a parameter that records the relevant deliberative alternatives. Notice that ranking alternatives does not imply that all uses of modals exhibit alternative-sensitivity, much like adding ordering sources in modal semantics does not imply that all uses of modals exhibit ordering-source sensitivity.

My last deviation from the classical framework flows naturally from the second (though it is not forced by it). In the classical semantic framework, we order individual worlds: to this end, we use ordering sources consisting of properties that determine binary preferences among worlds. In my framework, as I have indicated, we order alternatives: to keep as much of the structure of the classical semantics, I induce the ordering by means of properties that determine binary preferences among alternatives. Since alternatives are modeled as propositions,
32See Cariani (2013a, §4) for my preferred way of understanding deliberative alternatives in deontic semantics. 33In Cariani et al. (2013), we have also urged this kind of analysis to model the information sensitivity of modals. The theory I advance in this paper is a probabilistic generalization of the view in that article. 24 the simplest extension would be to model priorities as sets of propositions (I call these elevated priorities). Definition 3 (Elevated Priorities and Ordering Sources). (i) An elevated priority is a set of propositions whose extension is fixed relative to a probability function Pr.34 (ii) An elevated ordering source o is a set of elevated priorities. These moves open up the possibility of ranking alternatives on the basis of genuinely probabilistic considerations. For example, the following is a property that an alternative A might have (relative to Pr): (*) {A | given A, it is likely (according to Pr) that Joe will be less hungry} In many ordinary contexts, the alternative Joe eats a sandwich has this property, while the alternative Joe runs a marathon does not. Because Pr is a parameter, changes in Pr may affect the composition of sets like (*). By moving to elevated priorities we can distinguish between (*) and: (**) {A | given A, it is guaranteed that Joe will be less hungry} (***) {A | A is compatible with Joe's being less hungry} As we will see, these distinctions play a crucial role in my account. 3.4 The Semantics Informed by the discussion in §3.2, we can define our indices. Definition 4 (Indices). An index r is a quadruple r = 〈i, Pr,Alt, o〉 consisting of a set of worlds i, a probability function Pr, a set of alternatives and an elevated ordering source. Notation: Given r, ir, Prr,Altr, or denote the respective parameters in r. The non-modal part of the characterization of 1*o* is straightforward: relative to r and w, atomic sentences get T or F according to how things are in w. Boolean operators are given their standard clauses. 
It follows that non-modal sentences depend only on the world $w$ in a point of evaluation $r, w$. Here are the entries for the modal part of the vocabulary:

$\llbracket must(A) \rrbracket^{r,w} = T$ iff $\forall w' \in i_r,\ \llbracket A \rrbracket^{r,w'} = T$
$\llbracket may(A) \rrbracket^{r,w} = T$ iff $\exists w' \in i_r,\ \llbracket A \rrbracket^{r,w'} = T$
$\llbracket likely(A) \rrbracket^{r,w} = T$ iff $Pr_r(\llbracket A \rrbracket \mid i_r) > .5$
$\llbracket should(A) \rrbracket^{r,w} = T$ iff $\forall w' \in Selected(r),\ \llbracket A \rrbracket^{r,w'} = T$
$\llbracket (if\ A)(B) \rrbracket^{r,w} = T$ iff $\llbracket B \rrbracket^{r+A,w} = T$

First Pass: Lexical Entries

34 In other words, elevated priorities are functions from probability functions to sets of propositions. I thank Malte Willer and a reviewer for this volume for making me notice that I need this complication. The reviewer also wonders whether the fact that elevated priorities are such functions makes my view a notational variant of the expected value analysis. It does not: my resulting semantics is more permissive than (i.e. lacks some of the validities of) all the expected value analyses discussed above, as I verify towards the end of §3.5.

Completing the entry for the conditional. The entry for if appeals to an operation $r + A$ we have not yet defined. To define it, intersect $i_r$ with the set of A-worlds and conditionalize $Pr$ on the resulting qualitative state (while leaving $Alt$ and $o$ unchanged).

Definition 5 (Intersective Update). $r + A = \langle i_r \cap \llbracket A \rrbracket,\ Pr_r(\cdot \mid i_r \cap \llbracket A \rrbracket),\ Alt_r,\ o_r \rangle$.

It might appear at first sight that conditionals with non-modal consequents are settled by what happens at the world of evaluation $w$, but recall that we assumed that for such conditionals we always add in must when translating. Note also that there is independent evidence that we need another construal for indicative conditionals (in addition to the one we just characterized). In particular, we might need a construal on which the antecedent does not have this 'conditionalization' effect.35 The conditionalization reading, however, is the one that is salient here.

Completing the entry for should.
The entry for should appeals to a domain selection function $Selected(\cdot)$: this function takes an index as input and outputs a domain for should (if we want to change the system to make deontic sentences contingent, $Selected$ might have to be relativized to a world as well). Following the Kratzerian playbook, $Selected(\cdot)$ is determined by the ordering source (but recall that we use elevated ordering sources).

35 Within Kratzer's framework, Frank (1997); Geurts (unpublished); von Fintel (2009) all maintain that there is an alternative reading for conditionals on which a covert must is always added to the logical form of the salient conditionals, whether the consequent has an overt modal or not. If we allow each modal to have its own modal base, this will make a significant truth-conditional difference: ⌜(if A)(should B)⌝ would have the following truth condition: for every epistemically possible world $w'$ in which A is true, $\llbracket should\ B \rrbracket^{r,w'} = T$. This is different from the truth condition in the text, partly because it does not update $r$ to $r + A$, and hence does not conditionalize $Pr$. There is some evidence for this reading, but it does not affect any of the cases I discuss here. See the discussion of the interplay of conditionals and deontic modals in Cariani et al. (2013), §2.3.3.

Definition 6 (Elevated Preorder). $A \succeq_r B$ iff $\{\pi \in o_r \mid (A \cap i_r) \in \pi\} \supseteq \{\pi \in o_r \mid (B \cap i_r) \in \pi\}$

Informally, A is at least as good as B (relative to $r$) just in case $A \cap i_r$ satisfies all of the elevated priorities (in $r$) that are satisfied by $B \cap i_r$. If, in addition, $A \cap i_r$ satisfies some priorities that are not satisfied by $B \cap i_r$, we can say that A is ranked strictly above B in the preorder (relative to $r$).

Definition 6 implements another central idea of my probabilistic premise semantics: the relative ranking of an alternative A in the preorder depends in part on $i_r$. This is because when we evaluate whether a priority $\pi$ applies to A, we do not check whether it applies to the whole set of A-worlds.
Rather, we check whether it applies to the distinguished subset $A \cap i_r$. To make this intuitive, consider two deontic alternatives, say flying to France and flying to Japan. The claim is that, to determine which alternative is better (relative to some contextual priorities, say, making it probable that you see cherry blossoms), we only consider the worlds in which you fly to France that belong to $i_r$ and the worlds in which you fly to Japan that belong to $i_r$. If it is early spring, the elevated priority might favor flying to Japan; if it is late fall, it does not favor either alternative. This idea is a crucial element of the account of Rescue, Knowledge and Slanted to be given below (§3.5).

The preorder in Definition 6 determines the domain for should:

Definition 7 (Deontic Selection). $Selected(r) = \{v \in W \mid \exists B \in Alt_r\,[\,v \in B \ \&\ \sim\exists A \in Alt_r\,(A \succ_r B)\,]\}$36

The informal gloss on Definition 7 is that the domain for should consists of the worlds that belong to the maximally ranked alternatives.

36 Two remarks on Definition 7: (i) the correctness of this definition depends on the limit assumption for $\succeq_r$, which follows from the (stipulated) finiteness of $W$ (hence $Alt$ is finite). When we model Darts below, we will have to consider a case in which $W$ is infinite, in fact uncountable. However, even in that case the set of alternatives is still finite. Since it is the alternatives that are ranked by $\succeq_r$, the limit assumption still applies. (ii) $\succ_r$ means that the inclusion in the definition of $\succeq_r$ is proper.

This completes my discussion of the lexical entries. The last thing we need in order to check the predictions of the semantics relative to given indices is a notion of acceptance at an index, which is defined as follows:

Definition 8. A is accepted in $r$ iff for all $w' \in i_r$, $\llbracket A \rrbracket^{r,w'} = T$.

Two quick remarks: although I described must and may as having epistemic interpretations, the semantics can be modified to add entries for their deontic readings.
As in Kratzer's semantics, there is no need to postulate lexical ambiguities (at least not on the basis of anything I have said here). The so-called 'strong' necessity modal must and the possibility modal may get their own elevated ordering source $o^*$, which is a subset of $o$ (as proposed by von Fintel and Iatridou 2008), and their own selection function $Selected^*$. The entries are altered as follows:

$\llbracket must(A) \rrbracket^{r,w} = T$ iff $\forall w' \in Selected^*(r),\ \llbracket A \rrbracket^{r,w'} = T$
$\llbracket may(A) \rrbracket^{r,w} = T$ iff $\exists w' \in Selected^*(r),\ \llbracket A \rrbracket^{r,w'} = T$

In epistemic flavors, we could stipulate that $o^*$ is empty, and hence $Selected^*(r) = i_r$. In deontic flavors, $o^*$ is a designated subset of the elevated priorities in $o$. This completes my presentation of a probabilistic premise semantics in the context of a relatively standard deontic logic.

A second, more radical modification is possible. The First Pass semantics validates the logical schema known as Inheritance.37

Definition 9 (Inheritance). If $A \models B$, then $should(A) \models should(B)$.

In Cariani (2013a), I challenged this principle and gave a theory that invalidates Inheritance while validating Agglomeration:

Definition 10 (Agglomeration). $should(A),\ should(B) \models should(A \ \&\ B)$

It is easy to expand the present framework to implement that non-classical theory. Since the data I discuss here are orthogonal to that debate, I stick to the Inheritance-satisfying variant.

3.5 Applying the Semantics

Let us check how the account handles information sensitivity. My first goal is to show that there are natural assignments of values to the contextual parameters that deliver the expected verdicts in Rescue, Slanted and the whole family of cases from §1. In each case, I identify indices that plausibly represent the salient conversational contexts and show that they make the expected predictions. Recall that the target sentences are:

(10) You should refuse the pills.
(11) If you have the red gene, you should take the red pill.
(12) If you have the blue gene, you should take the blue pill.

37 Although there are debates on the appropriate notion of validity for modal systems such as mine (Veltman, 1996; Yalcin, 2007; Kolodny and MacFarlane, 2010; Willer, 2012; Dowell, 2012), every notion I know of entails that an argument with premise A and conclusion B is valid if there is no pair $r, w$ such that $\llbracket A \rrbracket^{r,w} = T$ and $\llbracket B \rrbracket^{r,w} = F$. This sufficient condition is enough to derive Inheritance, given the semantics I spelled out in this section.

For Rescue, consider an index $r_1$ with:

i = the set containing any world that is compatible with the agent's information in the context specified in Rescue
Pr = any probability function that respects the constraints stated in the description of the context of Rescue. In particular: Pr(red gene) = Pr(blue gene) = .5.
Alt = {take the blue pill, take the red pill, refuse the pills}
o = { {A | given A, it is at least .9 likely that nine people are saved}, {A | given A, it is at least .8 likely that ten people are saved} }

I have chosen the priorities somewhat arbitrarily here, but it is easy to see that there are many other assignments that would work.

Fact 1. In Rescue, $r_1$ accepts (10), (11), and (12). I.e., for any $w$ in $i_{r_1}$, $\llbracket (10) \rrbracket^{r_1,w} = \llbracket (11) \rrbracket^{r_1,w} = \llbracket (12) \rrbracket^{r_1,w} = T$.

In the initial state, taking the red pill fails to satisfy either of the elevated priorities in $r_1$ (conditional on taking the red pill you are only .5 likely to save a person; similarly for taking the blue pill). By contrast, refusing the pills satisfies the first priority: it guarantees saving 9 people. So $\llbracket (10) \rrbracket^{r_1,w} = T$. When we evaluate the conditional (11), however, the deontic modal appears in an embedding. We must consider the result of updating $r_1$ (and in particular $i_{r_1}$ and $Pr_{r_1}$) with the proposition that you have the red gene (as in Definition 5).
Relative to this updated index, refusing the pills still satisfies the first elevated priority (it guarantees the rescue of 9 people) and nothing else (you cannot save 10 if you refuse). However, taking the red pill guarantees (and hence makes probable) the rescue of 10 people, so it satisfies both elevated priorities. For this reason, $\llbracket (11) \rrbracket^{r_1,w} = T$, and by parallel reasoning the same holds of (12).

It is important to appreciate the significance of this reasoning. In my notation, the technical notion of serious information dependence in Kolodny and MacFarlane (2010) amounts to the following property:

Definition 11 (Serious Information Dependence). $Selected(\cdot)$ is seriously information dependent iff for some A and $r$, $Selected(r) \cap A \not\subseteq Selected(r + A)$.

Fact 2. $Selected$ is seriously information dependent.

The reasoning in support of Fact 1 establishes Fact 2. The witnesses for the existential quantifier are (i) the proposition that you have the red gene (for A) and (ii) $r_1$ (for $r$).

To model Slanted, let $r_2$ be the result of changing $Pr$ in $r_1$ to:

Pr_2 = some probability function that respects the constraints stated in the description of the context of Slanted. In particular, one such that Pr(red gene) = .99.

This change suffices to select taking the red pill as the highest ranked alternative, without changing anything else in the index. To see this, note that the option of taking the red pill satisfies both elevated priorities, while refusing can only make it probable that 9 are saved.

Fact 3. In Slanted, $r_2$ accepts (11) and (12), the negation of (10) and, in fact, it also accepts "You should take the red pill". I.e., for any $w$ in $i_{r_2}$, $\llbracket (10) \rrbracket^{r_2,w} = F$ and $\llbracket (11) \rrbracket^{r_2,w} = \llbracket (12) \rrbracket^{r_2,w} = T$. Moreover, $\llbracket should(\text{take red}) \rrbracket^{r_2,w} = T$.

The verification is routine and is omitted here. The broader conclusion is that there is a natural assignment of values to contextual parameters such that, both unconditionally and conditionally, you should take the red pill.
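The reasoning behind Facts 1-3 can be checked mechanically. The following Python sketch is an illustration only, and it rests on assumptions that go beyond the text: worlds are modeled as (gene, action) pairs; the payoffs (ten saved when the pill matches the gene, none when it does not, nine on refusing) are a reconstruction of the scenario from §1; "nine people are saved" is read as "at least nine"; genes and actions are taken to be probabilistically independent, with equiprobable actions; and every function name is hypothetical. Since none of the sentences tested varies with the world of evaluation, acceptance at an index reduces here to truth at that index.

```python
from fractions import Fraction
from itertools import product

GENES, ACTS = ("red", "blue"), ("take_red", "take_blue", "refuse")
WORLDS = frozenset(product(GENES, ACTS))  # a world is a (gene, action) pair

def saved(world):
    # Assumed payoffs: matching pill saves 10, wrong pill saves 0, refusing saves 9.
    gene, action = world
    if action == "refuse":
        return 9
    return 10 if action == "take_" + gene else 0

def gene_is(g):
    return frozenset(w for w in WORLDS if w[0] == g)

def act_is(a):
    return frozenset(w for w in WORLDS if w[1] == a)

def prob(pr, prop):
    return sum((pr[w] for w in prop), Fraction(0))

def likely_priority(n, threshold):
    # Elevated priority: {A | given A, 'at least n saved' is at least `threshold` likely}.
    goal = frozenset(w for w in WORLDS if saved(w) >= n)
    def applies(pr, a):
        z = prob(pr, a)
        return z > 0 and prob(pr, a & goal) / z >= threshold
    return applies

def make_index(p_red):
    # Assumption: gene and action are independent and the three actions equiprobable.
    pr = {w: (p_red if w[0] == "red" else 1 - p_red) / len(ACTS) for w in WORLDS}
    return {"i": WORLDS, "pr": pr, "alt": [act_is(a) for a in ACTS],
            "o": [likely_priority(9, Fraction(9, 10)), likely_priority(10, Fraction(8, 10))]}

def update(r, prop):
    # Definition 5: intersect i with prop and conditionalize Pr on the result.
    i = r["i"] & prop
    z = prob(r["pr"], i)
    pr = {w: (r["pr"][w] / z if w in i else Fraction(0)) for w in WORLDS}
    return {**r, "i": i, "pr": pr}

def satisfied(r, a):
    # Indices of the elevated priorities that apply to a ∩ i (Definition 6).
    return frozenset(k for k, pi in enumerate(r["o"]) if pi(r["pr"], a & r["i"]))

def selected(r):
    # Definition 7: worlds in alternatives that no alternative strictly outranks.
    sat = {a: satisfied(r, a) for a in r["alt"]}
    best = [b for b in r["alt"] if not any(sat[a] > sat[b] for a in r["alt"])]
    return frozenset().union(*best)

def should(r, prop):
    return selected(r) <= prop

def if_should(r, antecedent, prop):
    return should(update(r, antecedent), prop)

r1 = make_index(Fraction(1, 2))      # Rescue
r2 = make_index(Fraction(99, 100))   # Slanted

# Fact 1: r1 accepts (10), (11), (12).
assert should(r1, act_is("refuse"))
assert if_should(r1, gene_is("red"), act_is("take_red"))
assert if_should(r1, gene_is("blue"), act_is("take_blue"))
# Fact 2: Selected(r1) ∩ A is not included in Selected(r1 + A), with A = red gene.
assert not (selected(r1) & gene_is("red")) <= selected(update(r1, gene_is("red")))
# Fact 3: r2 rejects (10), accepts (11), (12) and 'You should take the red pill'.
assert not should(r2, act_is("refuse"))
assert should(r2, act_is("take_red"))
assert if_should(r2, gene_is("red"), act_is("take_red"))
assert if_should(r2, gene_is("blue"), act_is("take_blue"))
```

Exact fractions are used so that threshold comparisons such as ".99 is at least .9" are not disturbed by floating-point rounding. Note that `r1` and `r2` are built by the same function and differ only in the probability of the red gene, which mirrors the claim that Slanted differs from Rescue only in its probabilistic coordinate.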
Moreover, this natural assignment results from the index in Rescue by modifying only the probabilistic coordinate of context.

Fact 4. There is an index $r_3$ that differs from $r_2$ only in its elevated ordering source, and that accepts (10), (11) and (12).

Suppose that $r_3$ is obtained by replacing $o$ in $r_2$ with:

o = { {A | given A, it is guaranteed that nine people are saved}, {A | given A, it is guaranteed that ten people are saved} }

According to $r_3$, you ought to take neither pill, even in Slanted. Effectively, $r_3$ involves priorities that are distinctive of a more risk-averse evaluation of the deliberative situation. Even though in Slanted there is .99 probability that you have the red gene, that is not enough to trigger either priority.

This showcases a precise sense in which my framework is neutral. Different combinations of elevated priorities can match the verdicts of a large variety of decision rules. As a consequence, on this theory, we can correctly model:

(14) John thinks you should refuse the pill.

The effect of 'John thinks' is to shift the index to one that is compatible with John's state (we write $h(x, r, w)$ for the index containing $x$'s information state in $w$, $x$'s probability function in $w$, $Alt_r$, and $x$'s priorities in $w$; we do not shift the alternative set).

$\llbracket thinks(x, A) \rrbracket^{r,w} = T$ iff $\forall w' \in i_{h(x,r,w)},\ \llbracket A \rrbracket^{h(x,r,w),w'} = T$

Even if the initial context is Slanted, the embedded deontic claim in (14) gets interpreted relative to something like $r_3$. In particular, it gets interpreted relative to a state whose elevated priorities reflect John's risk aversion.

Finally, the probabilistic premise semantics allows a simple account of Darts. First, include models in which $W$ is uncountably infinite. Since in Darts we still have only finitely many alternatives, no other changes to the semantics are needed. Now, consider an index with the elevated priority:

(21) {A | A is compatible with Jeff's winning $1 million}.
This elevated priority applies, in context, to Jeff's raising his right hand, but not to Jeff's raising his left hand, so that we can accurately derive the judgment on (20).

This example also illustrates a particularly intuitive feature of my probabilistic premise semantics: the theory allows us to model priorities that involve assigning low probability to a certain goal. Consider this dialogue:

(22) A: I would like to sign up for a softball league.
     B: The registration for softball leagues is now closed. There is, however, a small chance that we will be able to get you into a league if a spot opens up. So, you should sign up for the waitlist.

It is natural to interpret B as conveying that signing up for the waitlist is the agent's only way to confer a small probability on her salient goal (playing in a league). We could model this idea by supposing that the salient priority was:

(23) {A | the probability of entering a league, given A, is greater than r} [with r being some low probability value, e.g. .05]

As a result, relative to the given context, signing up for the waitlist, which satisfies (23), ranks higher than, say, throwing a tantrum, which does not satisfy (23).

3.6 First Pass Wrap Up

I conclude that the first pass of the probabilistic premise semantics can handle the cases that motivate information dependence as well as their probabilistic variants. As anticipated, the theory cannot model the probabilistic conditionals of §1.1. The most immediate reason for this is that so far I have only allowed non-modal and non-probabilistic antecedents. The deeper reason is that, even if we abandon this syntactic restriction, our current theory cannot possibly work well with those sentences. On the current theory, sentences of the form ⌜likely A⌝ do not vary with $w$ (they are either T at every $w$ or F at every $w$). While this may not be problematic per se, it becomes problematic when it interacts with the semantics of §3.4.
If conditional antecedents restrict $i_r$, the restriction effected by a probabilistic conditional antecedent is either trivial (no worlds are ruled out) or total (every world is ruled out). It follows that conditionals of the form ⌜(if likely A)(should B)⌝, such as (5)-(6) (repeated here), are either equivalent to their consequents or else vacuously true.

(5) If it is likely that your opponents will attack you on the right flank, you should concentrate your defense on the right side.
(6) If it is not likely that your opponents will attack you on the right flank, you should not concentrate your defense on the right side.

However, this result is undesirable: intuitively, (5)-(6) are jointly consistent, but neither is equivalent to its consequent, and neither is vacuously true.

4 PROBABILISTIC PREMISE SEMANTICS: SECOND PASS

We just saw that the First Pass semantics does not derive plausible meanings for (5)-(6). The problem, I am going to argue, does not lie in the theory's account of deontic modals, but rather in the theory's modeling of probabilistic antecedents. In this section, I briefly sketch an approach to probabilistic antecedents that, when conjoined with my account of deontics, can handle (5)-(6); I do not claim it to be the only possible approach.

The approach I chose for this illustration is Yalcin's account of conditionals with probabilistic antecedents, which he formulates as an update semantics (in the style of Veltman, 1996, but with several important and subtle modifications). I reformulate Yalcin's system using the indices from §3 as states and using $p$ as a metalinguistic variable ranging over the atomic sentences of L. In this system, we restrict $\llbracket \cdot \rrbracket$ to just the atomic sentences and give a recursive characterization of an update operation $[\cdot]$ on states.

$r[p] = \langle i_r \cap \llbracket p \rrbracket,\ Pr_r(\cdot \mid i_r \cap \llbracket p \rrbracket),\ Alt_r,\ o_r \rangle$
$r[\sim A] = \langle i_r - i_{r[A]},\ Pr_r(\cdot \mid i_r - i_{r[A]}),\ Alt_r,\ o_r \rangle$
$r[A \ \&\ B] = r[A][B]$
$r[may(A)] = r$ if $i_{r[A]} \neq \emptyset$, else $\langle \emptyset,\ Pr_r(\cdot \mid \emptyset),\ Alt_r,\ o_r \rangle$
$r[likely(A)] = r$ if $Pr_r(i_{r[A]}) > .5$, else $\langle \emptyset,\ Pr_r(\cdot \mid \emptyset),\ Alt_r,\ o_r \rangle$
$r[(if\ A)(B)] = r$ if $r[A][B] = r[A]$, else $\langle \emptyset,\ Pr_r(\cdot \mid \emptyset),\ Alt_r,\ o_r \rangle$

Update System with Sharp Probabilities

Note that $Pr_r(\cdot \mid \emptyset)$ is not and cannot be a probability function. To remain consistent with the treatment of $i_r$, we can suppose that $Pr_r(A \mid \emptyset) = 1$ for all A. This means that we have to qualify the claim that the second coordinate of our indices is always a probability function: this is only true if understood as restricted to non-degenerate states. In degenerate states like $\langle \emptyset,\ Pr_r(\cdot \mid \emptyset),\ Alt_r,\ o_r \rangle$ we allow the function $Pr_r(\cdot \mid \emptyset)$.

This system is designed for epistemic modalities, but we can add a clause for deontic should by assuming that, like the other modals, it performs a test on the state $r$ (in the sense of Veltman, 1996).

$r[should(A)] = r$ if $(i_r \cap Selected(r)) \subseteq i_{r[A]}$, else $\langle \emptyset,\ Pr_r(\cdot \mid \emptyset),\ Alt_r,\ o_r \rangle$

Informally, if one thinks of $i_r \cap Selected(r)$ as a quantificational domain, what is tested is that this domain is a subset of $i_{r[A]}$. Note that when A is non-modal and non-probabilistic, $i_{r[A]} = i_r \cap \llbracket A \rrbracket$ (this is a consequence of Fact 5 below). It follows that what is tested is whether $i_r \cap Selected(r)$ is a subset of $\llbracket A \rrbracket$, which is just what a defender of a quantificational semantics for should would want to test for.

We can complete the account with some standard definitions.

Definition 12 (Acceptance). (i) A is accepted in $r$ iff $r[A] = r$. (ii) The sentences in a set $\Sigma$ are jointly accepted in $r$ iff for all $A \in \Sigma$, A is accepted in $r$.

The technical machinery is best illustrated by exploring some of its consequences.

Fact 5. When A is non-modal and non-probabilistic, $r[A] = \langle i_r \cap \llbracket A \rrbracket,\ Pr_r(\cdot \mid i_r \cap \llbracket A \rrbracket),\ Alt_r,\ o_r \rangle$

Fact 5 states that if we restrict attention to the non-modal and non-probabilistic fragment of our language, the update operation coincides with the update operation we have ascribed to atomic sentences.
It updates the information state $i_r$ and conditionalizes the probability function, leaving the rest unchanged. Fact 5 is proven by an easy induction on the complexity of the sentences in the non-modal and non-probabilistic fragment of the language.

Yalcin's system involves a subtle treatment of negation. The clause for $\sim A$ yields different behaviors according to the kind of sentence that is negated. If A is from the Boolean fragment of the language, update on $\sim A$ is covered by the generalization in Fact 5. If A is probabilistic or modal, $\sim A$ works as a test (as summarized by Facts 6 and 7).

Fact 6. $r[\sim likely(A)] = r$ if $Pr_r(A \mid i_r) \leq .5$, else $\langle \emptyset,\ Pr_r(\cdot \mid \emptyset),\ Alt_r,\ o_r \rangle$.

Fact 7. $r[\sim may(A)] = r$ if $i_{r[A]} = \emptyset$, else $\langle \emptyset,\ Pr_r(\cdot \mid \emptyset),\ Alt_r,\ o_r \rangle$.

The proofs of these and the remaining facts are trivial and generally omitted in the interest of space.

The motivation for considering this style of update semantics was the hope for an account of the joint acceptability of (5)-(6) that did not require them to be vacuous or to collapse on their consequents. We are not there yet: updating on ⌜likely A⌝ relative to index $r$ must yield either $r$ itself or the degenerate point $\langle \emptyset,\ Pr_r(\cdot \mid \emptyset),\ Alt_r,\ o_r \rangle$. In the first case, ⌜(if likely A)(B)⌝ is accepted in $r$ if and only if B is accepted in $r$. In the second case, ⌜(if likely A)(B)⌝ is accepted in $r$ for every B (it is vacuously true).

Fortunately, this system is not the last word. Drawing on ideas from Willer (2013), Yalcin extends his system for epistemic vocabulary to one involving sets of fine-grained states. His motivation is to represent uncertainty without buying into the idea that context supplies a perfectly sharp probability function $Pr$. Suppose, for instance, that we want to deploy a probabilistic analogue of the common ground to determine fine-grained states. A probabilistic common ground may include all the constraints (qualitative as well as probabilistic) that are mutually presupposed by conversational participants.
According to this picture, it is almost certain that our mutual presuppositions are not satisfied by a single probability function and are compatible with a large class of probability functions. You and I may mutually presuppose that rain in Paris is more likely than snow, and not much else. It would appear natural, then, to represent the relevant state of uncertainty with a set of probability functions: those functions that assign greater probability to rain in Paris than they assign to snow in Paris.

The formal system built to support these ideas can offer the desired account of (5)-(6). This extension works by first considering sets of states $r$.

Definition 13 (Blunt States). A blunt state $R$ is a set of sharp states such that any two $r, r' \in R$ differ at most in their $i$, $Pr$ or $o$ coordinates.

Luckily, we do not need to characterize the update operation again. Building on an idea by Willer (2013), we can update each member of $R$ and then collect only those members that accept A (in the sense of Definition 12).

Definition 14 (Blunt Update). $R[A] = \{r[A] \in R \mid r[A] = r\}$

The following facts reveal some key bits of the behavior of blunt updates.

Fact 8. $R[likely\ A] = \{r \in R \mid Pr_r(\llbracket A \rrbracket) > .5\}$
Fact 9. $R[\sim likely\ A] = \{r \in R \mid Pr_r(\llbracket A \rrbracket) \leq .5\}$
Fact 10. $R[(if\ A)(B)] = \{r \in R \mid r[A][B] = r[A]\}$

In general, when a sentence performs a test on $r$, Blunt Update retains exactly those states that successfully pass the test. Acceptance and joint acceptance can be redefined for blunt updates by lifting Definition 12.

Definition 15 (Blunt Acceptance). (i) A is accepted in $R$ iff $R[A] = R$. (ii) The sentences in a set $\Sigma$ are jointly accepted in $R$ iff for all $A \in \Sigma$, A is accepted in $R$. (iii) The sentences in $\Sigma$ are jointly acceptable iff there is a blunt state $R$ such that the sentences in $\Sigma$ are jointly accepted in $R$.

The definition of blunt acceptance has implications for what it takes for a conditional with a probabilistic antecedent to be accepted:

Fact 11. $R$ accepts ⌜(if likely A)(B)⌝ iff there is no $r \in R$ with $r[likely\ A] = r$ but $r[B] \neq r$.

With this fact in hand, our pair (5)-(6) can be shown to be jointly acceptable without treating either conditional as vacuous or as collapsing on its consequent. In this setting, ⌜(if likely A)(B)⌝ is vacuously accepted iff there is no $r \in R$ such that $r[likely\ A] = r$; it collapses on its consequent iff for every $r \in R$, $r[likely\ A] = r$.

Fact 12. There is an $R_1$ such that (5)-(6) are jointly accepted in $R_1$. Additionally, $R_1$ can be chosen so that (5)-(6) are not vacuously accepted and so that they do not collapse on their consequents.

For reasons of space, I do not take up the detailed construction of $R_1$. However, the idea should be obvious in light of the earlier facts. Let $R_1$ contain some sharp states $r$ according to which the opponents are likely to attack on the right, and some according to which the opponents are not likely to attack on the right. Then (5) is accepted in $R_1$ provided that all of the sharp states according to which the opponents are likely to attack on the right pass the test for 'You should concentrate your defense on the right'. Moreover, (6) is accepted in $R_1$ provided that all of the sharp states according to which the opponents are not likely to attack on the right pass the test for 'You should not concentrate your defense on the right'. It is easy to construct an elevated ordering source that accomplishes this.

The upshot of this section is that the challenge posed by (5)-(6) can be factored into two separate problems. The first problem concerns update on probabilistic information. The other concerns the interface between deontic judgments and probabilistic states. The semantics of §3 solves the latter problem. This is all that it should be reasonably asked to do. If we borrow an answer to the first problem, we can inject it in the probabilistic premise semantics and explain the meanings of these troublesome conditionals.
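One possible construction of a blunt state that jointly accepts (5)-(6) can be sketched in Python. This is an illustration under stated assumptions, not the construction intended in the text: worlds are (attack side, defense side) pairs; the blunt state contains just two sharp states, with Pr(attack right) set to .8 and .3; the only elevated priority is "given A, it is likely that the defense is on the attacked side"; Alt and o are held fixed as module-level constants rather than stored in each state; the degenerate state is represented as an empty set of worlds paired with `None`; and "you should not concentrate your defense on the right" is parsed as should applied to the negated atom. All names are hypothetical.

```python
from fractions import Fraction

SIDES = ("right", "left")
WORLDS = frozenset((a, d) for a in SIDES for d in SIDES)  # (attack side, defense side)
ATTACK_R = frozenset(w for w in WORLDS if w[0] == "right")
DEFEND_R = frozenset(w for w in WORLDS if w[1] == "right")
MATCH = frozenset(w for w in WORLDS if w[0] == w[1])
ALT = (DEFEND_R, WORLDS - DEFEND_R)      # fixed alternative set (assumption)
BOTTOM = (frozenset(), None)             # the degenerate state

def prob(pr, prop):
    return sum((p for w, p in pr if w in prop), Fraction(0))

def restrict(state, prop):
    # Intersect i with prop and conditionalize Pr; empty results are degenerate.
    i, pr = state
    new_i = i & prop
    if not new_i:
        return BOTTOM
    z = prob(pr, new_i)
    if z == 0:
        return BOTTOM  # simplification: null-set conditionalization treated as degenerate
    return (new_i, tuple((w, p / z if w in new_i else Fraction(0)) for w, p in pr))

def selected(state):
    # Worlds in the top-ranked alternatives under the single assumed priority.
    i, pr = state
    if pr is None:
        return frozenset()
    def ok(a):
        z = prob(pr, a & i)
        return z > 0 and prob(pr, MATCH & a & i) / z > Fraction(1, 2)
    best = [a for a in ALT if ok(a) or not any(ok(b) for b in ALT)]
    return frozenset().union(*best)

def update(state, s):
    # Sharp update r[s]; sentences are nested tuples such as ("likely", ("atom", P)).
    i, pr = state
    op = s[0]
    if op == "atom":
        return restrict(state, s[1])
    if op == "not":
        return restrict(state, i - update(state, s[1])[0])
    if op == "if":
        mid = update(state, s[1])
        return state if update(mid, s[2]) == mid else BOTTOM
    if op == "likely":  # in the degenerate state every test passes
        ok = pr is None or prob(pr, update(state, s[1])[0]) > Fraction(1, 2)
    elif op == "should":
        ok = (i & selected(state)) <= update(state, s[1])[0]
    else:
        raise ValueError(op)
    return state if ok else BOTTOM

def blunt_update(R, s):
    return frozenset(r for r in R if update(r, s) == r)  # Definition 14

def accepts(R, s):
    return blunt_update(R, s) == R  # Definition 15(i)

def sharp(p_right):
    # Assumption: attack side and defense side independent, defense sides equiprobable.
    pr = tuple(sorted((w, (p_right if w[0] == "right" else 1 - p_right) / 2) for w in WORLDS))
    return (WORLDS, pr)

AR, DR = ("atom", ATTACK_R), ("atom", DEFEND_R)
S5 = ("if", ("likely", AR), ("should", DR))
S6 = ("if", ("not", ("likely", AR)), ("should", ("not", DR)))
R1 = frozenset({sharp(Fraction(4, 5)), sharp(Fraction(3, 10))})

assert accepts(R1, S5) and accepts(R1, S6)                  # jointly accepted
assert 0 < len(blunt_update(R1, ("likely", AR))) < len(R1)  # neither vacuous nor collapsing
assert not accepts(R1, ("should", DR))                      # (5) is not equivalent to its consequent
```

The two final assertions correspond to the non-vacuity and non-collapse conditions stated after Fact 11: some but not all sharp states in `R1` pass the test for the probabilistic antecedent, so neither conditional is accepted merely because its antecedent fails everywhere, and the consequent of (5) is not accepted outright.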
5 CONCLUSION

So, is there one theory to rule them all? Probably not. For one thing, the motivational arguments that justify my proposal depend on significant assumptions about semantic theories for the language of subjective uncertainty, about information-sensitivity for deontic modals, about how to integrate deontic semantics with off-the-shelf treatments of attitudes and so on. For another, even granting those assumptions, a few different systems are compatible with the ideas I have developed.

Nonetheless, there is much to recommend the core ideas of the semantic account I developed in §3. The neutrality-based desiderata are plausible and can be reflected in specific empirical predictions. My theory does as well as one can with respect to these desiderata. On the one hand, it is comfortably able to handle the dependence of deontics on probabilistic states. On the other, it can model judgments that track orderings based on expected utility, maximin, dominance and what have you, without assuming that these practical theories are built into the semantics. It strikes a solid balance between the desire to develop a probabilistic semantics for modals and the desire to keep the conventional meanings of deontic modals as thin and flexible as possible.

Acknowledgments. Special thanks to Nate Charlow, Matthew Chrisman, Daniel Lassiter, Paolo Santorio, Ralph Wedgwood, and Malte Willer for detailed feedback on previous versions of this paper, as well as to Thony Gillies and Shyam Nair for public commentaries and to Magdalena and Stefan Kaufmann for our collaboration that helped me sharpen the ideas in this paper.
I am also indebted to conversations and e-mail exchanges with: Mike Caie, Janice Dowell, Kai von Fintel, Michael Glanzberg, Jeff Horty, Graham Katz, Angelika Kratzer, Steve Kuhn, Manuel Križ, Hanti Lin, Peter Ludlow, Dilip Ninan, Paul Portner, Aynat Rubinstein, Alex Silk, Seth Yalcin, audiences at Georgetown Linguistics Colloquium, Maryland Philosophy Colloquium, the 2012 Rutgers Semantics Workshop, Kai von Fintel and Sabine Iatridou's 2012 MIT Graduate Seminar on Deontic Modals and Imperatives, the 2013 Pacific APA, Northwestern Deontic Modality Workshop and USC Deontic Modality Conference. Finally, thanks (again!) to both Nate Charlow and Matthew Chrisman for editing this volume.

BIBLIOGRAPHY

BENNETT, Jonathan (2003), A Philosophical Guide to Conditionals (Oxford University Press).
BRIGGS, Rachael (2014), 'Normative Theories of Rational Choice: Expected Utility', in Edward N. Zalta (ed.), The Stanford Encyclopedia of Philosophy, fall 2014 edition.
CARIANI, Fabrizio (2009), The Semantics of 'ought' and the Unity of Modal Discourse, Ph.D. thesis, UC Berkeley.
--- (2013a), 'Ought and Resolution Semantics', Noûs, 47(3), 534–558.
--- (2013b), 'Epistemic and Deontic Should', Thought, 2(1), 73–84.
--- (2014), 'Attitudes, Deontics and Semantic Neutrality', Pacific Philosophical Quarterly.
--- (forthcoming), 'Consequence and Contrast in Deontic Semantics', Journal of Philosophy.
CARIANI, Fabrizio, KAUFMANN, Magdalena, and KAUFMANN, Stefan (2013), 'Deliberative Modality under Epistemic Uncertainty', Linguistics and Philosophy, 36, 225–259.
CARR, Jennifer (2012), 'Deontic Modals without Decision Theory', in Proceedings of SuB 17, 167–182.
--- (unpublished), 'Subjective Oughts in Natural Language', Manuscript, Leeds University.
CHARLOW, Nate (2013), 'What We Know and What To Do', Synthese, 190, 2291–2323.
--- (forthcoming), 'Decision Theory: Yes! Truth-Conditions: No!', in N. Charlow and M. Chrisman (eds.), Deontic Modals (Oxford University Press).
CHIERCHIA, Gennaro and MCCONNELL-GINET, Sally (2000), Meaning and Grammar: an Introduction to Semantics (MIT Press).
CHRISMAN, Matthew (2012), '"Ought" and Control', Australasian Journal of Philosophy, 90(3), 433–451.
CIARDELLI, Ivano, GROENENDIJK, Jeroen, and ROELOFSEN, Floris (2013), 'Inquisitive Semantics: a New Notion of Meaning', Language and Linguistics Compass, 7(9), 459–476.
DOWELL, Janice (2012), 'Contextualist Solutions to Three Puzzles about Practical Conditionals', in Russ Shafer-Landau (ed.), Oxford Studies in Metaethics, volume 7, 271–303 (Oxford University Press).
EASWARAN, Kenny (2008), The Foundations of Conditional Probability, Ph.D. thesis, UC Berkeley.
--- (2014), 'Regularity and Hyperreal Credences', Philosophical Review, 123, 1–41.
EDGINGTON, Dorothy (1995), 'On Conditionals', Mind, 104, 235–329.
FINLAY, Stephen (2009), 'Oughts and Ends', Philosophical Studies, 143(3), 315–340.
--- (2014), A Confusion of Tongues (Oxford University Press).
FINLAY, Steve and SNEDEGAR, Justin (forthcoming), 'One Ought too Many', Philosophy and Phenomenological Research.
VON FINTEL, Kai (2009), 'Conditionals', in Klaus von Heusinger, Claudia Maienborn, and Paul Portner (eds.), Semantics: An international handbook of meaning (DeGruyter).
--- (unpublished), 'The best we can (expect to) get? Challenges to the classic semantics for deontic modals', presented at the 2012 Central APA, Chicago, IL. http://mit.edu/fintel/fintel-2012-apa-ought.pdf.
VON FINTEL, Kai and IATRIDOU, Sabine (2008), 'How to Say Ought in Foreign: the Composition of Weak Necessity Modals', in Jacqueline Guéron and Jacqueline Lecarme (eds.), Time and Modality (Studies in Natural Language and Linguistic Theory 75), 115–141 (Springer).
FRANK, Annette (1997), Context Dependence in Modal Constructions, Ph.D. thesis, University of Stuttgart.
GEURTS, Bart (unpublished), 'On an Ambiguity in Quantified Conditionals', Manuscript, University of Nijmegen.
GIBBARD, Allan (2003), Thinking How To Live (Harvard University Press).
GOBLE, Lou (1996), 'Utilitarian Deontic Logic', Philosophical Studies, 82, 317–357.
HÁJEK, Alan (unpublished), 'Staying Regular', Australian National University.
HINTIKKA, Jaakko (1962), Knowledge and Belief: an Introduction to the Logic of the Two Notions (Cornell University Press).
HOLLIDAY, Wesley and ICARD, Thomas (2013), 'Measure semantics and qualitative semantics for epistemic modals', Proceedings of SALT 2013, 513–534.
HORTY, John F. (2001), Agency and Deontic Logic (Oxford University Press).
JACKSON, Frank (1985), 'On the Semantics and Logic of Obligation', Mind, 94(374), 177–195.
--- (1991), 'Decision Theoretic Consequentialism and the Nearest Dearest Objection', Ethics, 101(3), 461–482.
JACKSON, Frank and PARGETTER, Robert (1986), 'Oughts, Options and Actualism', The Philosophical Review, 95(2), 233–255.
JOYCE, James M. (1999), The Foundations of Causal Decision Theory (Cambridge University Press).
KAMP, Hans (1971), 'Formal Properties of "Now"', Theoria, 37, 227–273.
KOLODNY, Niko and MACFARLANE, John (2010), 'Ifs and oughts', Journal of Philosophy, 107(3), 115–143.
KRATZER, Angelika (1977), 'What 'Must' and 'Can' Must and Can Mean', Linguistics and Philosophy, 1(3), 337–355.
--- (1981), 'The Notional Category of Modality', in B. Partee and P. Portner (eds.), Formal Semantics: the Essential Readings (Blackwell).
--- (1991a), 'Conditionals', in A. von Stechow and D. Wunderlich (eds.), Semantics: An International Handbook of Contemporary Research (De Gruyter). From the Semantics Archive.
--- (1991b), 'Modality', in A. von Stechow and D. Wunderlich (eds.), Semantics: An International Handbook of Contemporary Research (De Gruyter).
--- (2012), Modals and Conditionals (Oxford University Press).
LASSITER, Daniel (2011), Measurement and Modality: The Scalar Basis of Modal Semantics, Ph.D. thesis, NYU.
--- (forthcoming), 'Epistemic Comparisons, Models of Uncertainty, and the Disjunction Puzzle', Journal of Semantics.
LEWIS, David (1978), 'Reply to McMichael', Analysis, 38(2), 85–86.
--- (1981), 'Ordering Semantics and Premise Semantics for Counterfactuals', Journal of Philosophical Logic, 10, 217–234.
MACFARLANE, John (2014), Assessment Sensitivity (Oxford University Press).
MOSS, Sarah (forthcoming), 'On the Semantics and Pragmatics of Epistemic Vocabulary', Semantics & Pragmatics.
PARFIT, Derek (2011), On What Matters (Oxford University Press).
--- (unpublished), 'What We Together Do', Manuscript, Oxford University.
PLUNKETT, David and SUNDELL, Tim (2013), 'Disagreement and the Semantics of Normative and Evaluative Terms', Philosophers' Imprint, 13, 1–37.
PORTNER, Paul (2009), Modality (Oxford University Press).
REGAN, Donald (1980), Utilitarianism and Cooperation (Oxford University Press).
ROTHSCHILD, Daniel (2012), 'Expressing Credences', Proceedings of the Aristotelian Society, 112, 99–114.
RUBINSTEIN, Aynat (2012), Roots of Modality, Ph.D. thesis, UMass, Amherst.
SCHROEDER, Mark (2010), 'Oughts, Agents and Actions', The Philosophical Review, 120(1), 1–41.
SILK, Alex (2014), 'Evidence Sensitivity in Deontic Modals', Journal of Philosophical Logic, 43, 691–723.
SKYRMS, Brian (1980), Causal Necessity (Yale University Press).
--- (1982), 'Causal Decision Theory', The Journal of Philosophy, 79(11).
STEPHENSON, Tamina (2007), 'Judge Dependence, Epistemic Modals, and Predicates of Personal Taste', Linguistics and Philosophy, 30(4), 487–525.
SWANSON, Eric (2011), 'How Not to Theorize About the Language of Subjective Uncertainty', in Andy Egan and Brian Weatherson (eds.), Epistemic Modality, 249–269 (Oxford University Press).
VELTMAN, Frank (1996), 'Defaults in Update Semantics', Journal of Philosophical Logic, 25, 221–261.
WEDGWOOD, Ralph (forthcoming), 'Subjective and Objective Ought', in N. Charlow and M.
Chrisman (eds.), Deontic Modals (Oxford University Press).
WILLER, Malte (2012), 'A Note on Iffy Oughts', Journal of Philosophy, 109, 449–461.
--- (2013), 'Dynamics of Epistemic Modality', The Philosophical Review, 122, 45–92.
YALCIN, Seth (2007), 'Epistemic Modals', Mind, 116(4), 983–1027.
--- (2010), 'Probability Operators', Philosophy Compass, 916–937.
--- (2011), 'Nonfactualism About Epistemic Modality', in A. Egan and B. Weatherson (eds.), Epistemic Modality (Oxford University Press).
--- (2012a), 'Bayesian Expressivism', Proceedings of the Aristotelian Society, 112, 123–160.
--- (2012b), 'Context Probabilism', in M. Aloni et al. (eds.), Logic, Language, and Meaning: Proceedings of the 18th Amsterdam Colloquium. Lecture Notes in Computer Science, Vol. 7218, 12–21 (Springer).
--- (2012c), 'A Counterexample to Modus Tollens', Journal of Philosophical Logic, 41, 1001–1024.