Modality and Expressibility∗

Matthew Mandelkern†

December 17, 2018

Penultimate draft; to appear in the Review of Symbolic Logic

Abstract

When embedding data are used to argue against semantic theory A and in favor of semantic theory B, it is important to ask whether A could make sense of those data. It is possible to ask that question on a case-by-case basis. But suppose we could show that A can make sense of all the embedding data which B can possibly make sense of. This would, on the one hand, undermine arguments in favor of B over A on the basis of embedding data. And, provided that the converse does not hold (that is, that A can make sense of strictly more embedding data than B can), it would also show that there is a precise sense in which B is more constrained than A, yielding a pro tanto simplicity-based consideration in favor of B. In this paper I develop tools which allow us to make comparisons of this kind, which I call comparisons of potential expressive power. I motivate the development of these tools by way of exploration of the recent debate about epistemic modals. Prominent theories which have been developed in response to embedding data turn out to be strictly less expressive than the standard relational theory, a fact which necessitates a reorientation in how to think about the choice between these theories.

Keywords: relative expressibility; semantic theories; epistemic modals; attitude predicates

1 Introduction

When embedding data are used to argue against semantic theory A and in favor of semantic theory B, it is important to ask whether A could, after all, make sense of those data. It is possible to ask that question on a case-by-case basis. But suppose we could show that A can make sense of all the embedding data which B can make sense of.
This would, on the one hand, undermine arguments in favor of B over A on the basis of embedding data. And, provided that the converse does not hold (that is, that A can make sense of strictly more embedding data than B can), it would also show that there is a precise sense in which B is more constrained than A, yielding a pro tanto simplicity-based consideration in favor of B. In this paper I develop formal tools which allow us to make comparisons of this kind, which I call comparisons of potential expressive power.

∗ This paper is a descendant of work presented at MIT, the Eighth Philosophy and Semantics in Europe Colloquium at the University of Cambridge, the 2015 Bucharest Colloquium for Analytic Philosophy, and the 2016 Central APA, and of the final chapter of Mandelkern 2017a. Thanks to audiences at those talks and to Justin Bledin, David Boylan, Daniel Drucker, Kai von Fintel, Vera Flocke, Peter Fritz, Cosmo Grant, Irene Heim, Valentine Hacquard, Matthias Jenny, Roni Katzir, Justin Khoo, Angelika Kratzer, Harvey Lederman, Jack Marley-Payne, Vann McGee, Maša Močnik, Dilip Ninan, Milo Phillips-Brown, Agustín Rayo, Daniel Rothschild, Bernhard Salow, Paolo Santorio, Ginger Schultheis, Roger White, and Stephen Yablo for very helpful comments and discussion on these topics. Thanks to Fabrizio Cariani, Simon Goldstein, Jonathan Phillips, Robert Stalnaker, Shane Steinert-Threlkeld, and three anonymous referees for this journal for extensive comments and discussion.

† All Souls College, Oxford, OX1 4AL, United Kingdom, matthew.mandelkern@all-souls.ox.ac.uk

I motivate the development of these tools through discussion of the recent debate about epistemic modals (words like 'might' and 'must', on a broadly epistemic reading). Recent work has used embedding data to argue against the standard relational theory (on which epistemic modals have, roughly, the semantics of modal operators in standard modal logics) and in favor of various revisionary theories.
But comparisons of potential expressive power show that those revisionary theories are strictly less expressive than the standard theory, in the sense that the relational theory can make sense of any embedding data involving epistemic modals which those theories can make sense of, but not vice versa (within very mild limits). This necessitates a reorientation in how to think about the choice between these semantics. The situation is roughly the opposite of the way it is standardly presented: we can rest assured that the relational theory can account for any data which those revisionary theories can account for. But not vice versa: we may well discover data that the relational theory can make sense of, but that the revisionary theories cannot. This, in turn, shows that there is a precise sense in which these revisionary theories are simpler than the relational theory, which may count in their favor.

The tools developed here allow us to state these facts in a rigorous and general way, which, among other things, makes application to other debates straightforward. Considerations of relative potential expressive power cannot on their own decide between two semantic theories, but they can help us determine what kinds of arguments from natural language on the basis of embedding data are possibly good arguments in the choice between those theories.

In the final part of the paper, I consider in more detail the empirical picture concerning the behavior of epistemic modals in attitude contexts, and what this behavior tells us about the theory of epistemic modals in light of the foregoing expressibility results.

2 Background

I begin by introducing the relational semantics for epistemic modals; Yalcin (2007)'s data and his revisionary domain semantics response to those data; and Ninan (2016)'s response to Yalcin, which shows that the relational semantics can replicate the predictions of the domain semantics.

2.1 The relational theory

On the relational theory (e.g.
Kripke 1963; Kratzer 1977, 1981), epistemic modals denote quantifiers over a set of worlds: namely, over those worlds which are accessible from the world of evaluation relative to a binary accessibility relation between worlds (equivalently and more conveniently, a corresponding modal base function from worlds to sets of worlds). The accessibility relation for epistemic modals is generally taken to track the contextually relevant evidence: a world is accessible from another world just in case it's compatible with the contextually relevant evidence in that world. 'Might' is treated as an existential quantifier, so ⌜Might φ⌝ is a claim that the contextually relevant evidence leaves it open that φ is true;1 'must' is treated as the dual universal quantifier, so ⌜Must φ⌝ is a claim that the contextually relevant evidence entails that φ is true. More formally:2

Definition 2.1. Relational Semantics: Where f is a modal base:
• [[Might φ]]^{f,w} = 1 iff ∃w′ ∈ f(w) : [[φ]]^{f,w′} = 1
• [[Must φ]]^{f,w} = 1 iff ∀w′ ∈ f(w) : [[φ]]^{f,w′} = 1

2.2 Yalcin (2007)'s challenge

Yalcin (2007) observed that, when you embed a sentence with the form ⌜φ and might not φ⌝ (which he calls an epistemic contradiction) under 'Suppose', the result is infelicitous:3

(1) a. #Suppose it's raining and it might not be raining.
    b. #Suppose (φ and might not φ).

The issue for the relational approach is that it seems to predict that a sentence like (1-a) will be perfectly felicitous. To spell out the problem, let's focus (as Yalcin does) on the corresponding declarative sentence. A sentence like (2) is felt to attribute inconsistent suppositions to Alfred, a fact which explains why the imperative variant will be felt to be infelicitous:

(2) a. Alfred supposes it's raining and it might not be raining.
    b. A supposes (φ and might not φ).

But, given the relational semantics, it looks like (2-a) is predicted to say that Alfred supposes that it's raining, and supposes that the contextually salient body of evidence is compatible with the proposition that it's not raining. In other words, (2-a) should be roughly equivalent to (3), which is not felt to attribute any kind of inconsistency to Alfred.

(3) Alfred supposes that it's raining and that, for all he knows, it's not raining.

1 Greek letters range over all sentences. Italic roman letters below range over atomic sentences. I will leave off a context parameter throughout, for readability; insofar as the languages we are working with contain context-sensitive terms (beyond epistemic modals), these should be read as implicit throughout.

2 This presentation simplifies Kratzer's theory in two ways, one irrelevant (for Kratzer, a modal base is a function from worlds to sets of propositions, not worlds, and quantification is over the intersection of that set), and one possibly relevant (for Kratzer, modals are evaluated relative to a second parameter, an ordering source). Insofar as an ordering source plays an interesting role in what follows, though, it makes the relational theory even more expressive than it is on the present simplification.

3 Parallel observations concerning the embedding of sentences with this form in the scope of quantifiers were made in Groenendijk et al. 1996; Aloni 2001, and were used to motivate the update semantics, discussed below. I focus on Yalcin's version of the argument only because it is simpler.

To make this worry more precise, we need to spell out some background assumptions which, as we will see, are crucial for getting the puzzle going. First, Yalcin assumes that connectives have classical Boolean semantics.4 Second, Yalcin generalizes Hintikka (1962)'s theory of attitude predicates to the relational semantics as follows.
On Hintikka's approach, attitude predicates denote universal quantifiers over the possible worlds compatible with the relevant attitude (for attitude verb V and agent A, let 'V_{A,w}' denote the worlds compatible with everything that A V's in w). Yalcin assumes further that attitude verbs do not change the setting of the modal base parameter, so the schematic result is as follows:

Definition 2.2. Simple Hintikkan semantics:
• [[A V's φ]]^{f,w} = 1 iff ∀w′ ∈ V_{A,w} : [[φ]]^{f,w′} = 1

Thus, in particular, ⌜A supposes φ⌝, evaluated at modal base f and world w, just says that φ is true at every world compatible with A's suppositions in w, relative to f. Together with the relational semantics for epistemic modals and Boolean semantics for connectives, we thus predict that ⌜A supposes (φ and might not φ)⌝, as evaluated at modal base f and world w, says that every world in A's supposition state makes φ true (relative to f), and that every world in A's supposition state can access under f some world where φ is false. So, if the modal base tracks something like A's knowledge, then (2-a) should just mean something like (3), and thus should not be felt to ascribe incompatible suppositions to Alfred.

4 That is, [[φ and ψ]]^{f,w} = 1 iff [[φ]]^{f,w} = 1 and [[ψ]]^{f,w} = 1; and [[Not φ]]^{f,w} = 1 iff [[φ]]^{f,w} = 0.

2.3 Yalcin (2007)'s domain semantics

This is a puzzle for the combination of the relational semantics with the simple Hintikkan semantics for attitude verbs given here together with the Boolean semantics for connectives. Yalcin (2007) responded to this puzzle by developing a theory that is revisionary twice over, rejecting both the relational theory of epistemic modals and the simple Hintikka semantics for attitude verbs. First, Yalcin adopts a domain semantics for epistemic modals (crediting an early version of MacFarlane 2011).
On the domain semantics, epistemic modals denote quantifiers over a set of worlds (information state) s which is supplied, not by an accessibility relation, but rather as a world-independent parameter of the index.

Definition 2.3. Domain semantics:
• [[Might φ]]^{s,w} = 1 iff ∃w′ ∈ s : [[φ]]^{s,w′} = 1
• [[Must φ]]^{s,w} = 1 iff ∀w′ ∈ s : [[φ]]^{s,w′} = 1

Thus ⌜Might φ⌝ says that the truth of φ is compatible with a given information state; ⌜Must φ⌝ says that the truth of φ is entailed by a given information state. The domain semantics is just like the relational semantics except that it substitutes information states where the relational semantics has functions from worlds to information states.

Let me note that I will use 'domain semantics' and 'relational semantics' to denote just the semantics for epistemic modals given here and above, separate from the semantics for attitude verbs and connectives that each of these theories has been typically coupled with (I use 'relational/domain framework' for a semantic theory which extends the relational/domain theories). This lets us focus on the question of how those particular semantics for epistemic modals can be motivated on their own, rather than as part of a package deal.

Yalcin generalizes the classical semantics for the connectives to the domain semantics in the obvious way.5 Yalcin, finally, develops a new semantics for attitude verbs, which builds on the core Hintikka semantics, but stipulates that the attitude verb supplies the set of attitude worlds as the domain of quantification for its complement. Schematically, for attitude verb V, Yalcin proposes:6

Definition 2.4. Yalcin semantics for attitude verbs:
• [[A V's φ]]^{s,w} = 1 iff ∀w′ ∈ V_{A,w} : [[φ]]^{V_{A,w},w′} = 1.

That is, ⌜A V's φ⌝ is true at w and s just in case φ is true at every world compatible with what A V's at w, relative to V_{A,w}.

This combination of views accounts for Yalcin's data as follows. Consider a sentence with the form of (4):

(4) A supposes (φ and might not φ).
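As a rough illustration of Definitions 2.3 and 2.4, the following Python sketch evaluates sentence (4) at domain indices over a toy model. The two-world model, the tuple encoding of sentences, and the names `true_at`, `subsets`, and `results` are assumptions made for this sketch; they are not part of Yalcin's presentation, and the check covers only this one finite model.

```python
from itertools import combinations

# Toy model (an assumption of this sketch): two worlds, one atom p, true only at w1.
WORLDS = ("w1", "w2")
VAL = {"p": {"w1"}}

def true_at(phi, s, w, att):
    """Truth of phi at the domain index <s, w>.

    att maps each world v to V_{A,v}, the worlds compatible with A's attitude at v.
    """
    op = phi[0]
    if op == "atom":
        return w in VAL[phi[1]]
    if op == "not":
        return not true_at(phi[1], s, w, att)
    if op == "and":
        return true_at(phi[1], s, w, att) and true_at(phi[2], s, w, att)
    if op == "might":  # Definition 2.3: existential quantification over s
        return any(true_at(phi[1], s, v, att) for v in s)
    if op == "att":    # Definition 2.4: V_{A,w} replaces s inside the complement
        state = att[w]
        return all(true_at(phi[1], state, v, att) for v in state)
    raise ValueError(f"unknown operator: {op}")

def subsets(xs):
    """All subsets of xs, as frozensets."""
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

p = ("atom", "p")
sentence_4 = ("att", ("and", p, ("might", ("not", p))))

# For each candidate supposition state S (held fixed across worlds),
# record whether (4) is true at the index <all worlds, w1>.
results = {}
for S in subsets(WORLDS):
    att = {w: S for w in WORLDS}
    results[S] = true_at(sentence_4, frozenset(WORLDS), "w1", att)
```

In this toy model, `results` maps the empty supposition state to true and each of the three nonempty states to false.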
Given Yalcin's approach to attitude verbs (plus Boolean connectives), (4) will be true at w, relative to any information state, just in case, at every world w′ in S_{A,w} (the set of worlds compatible with A's suppositions in w), ⌜φ and might not φ⌝ is true relative to 〈S_{A,w}, w′〉. This holds just in case, for every world w′ in S_{A,w}, φ is true at 〈S_{A,w}, w′〉; and, for every world w′ in S_{A,w}, there is some world w′′ in S_{A,w} such that φ is false at 〈S_{A,w}, w′′〉. Clearly these two conditions can only both be met if S_{A,w} is empty. Thus a sentence with the form of (4) will ascribe inconsistent suppositions to A. This, again, suffices to account for Yalcin's data: the imperative version of (4) will be infelicitous because it will be a command to make an inconsistent supposition.7

5 Namely, [[φ and ψ]]^{s,w} = 1 iff [[φ]]^{s,w} = 1 and [[ψ]]^{s,w} = 1, and [[Not φ]]^{s,w} = 1 iff [[φ]]^{s,w} = 0.

6 Yalcin presents a semantics only for 'supposes', but suggests that it should be extended to other attitudes in the obvious way; for simplicity, I am presenting the general format here. Likewise for Ninan's semantics below.

2.4 Ninan (2016)'s response

Yalcin's proposal is, again, revisionary in two respects: it couples a new semantics for modals (the domain semantics) with a new semantics for attitude verbs. Ninan (2016) shows that there is a way to account for Yalcin's data while maintaining the relational semantics for epistemic modals. For any set of worlds s, let f_s be the constant function which takes any world to s. Then, for attitude verb V, modal base f, and world w, Ninan proposes:

Definition 2.5. Ninan semantics for attitude verbs:
• [[A V's φ]]^{f,w} = 1 iff ∀w′ ∈ V_{A,w} : [[φ]]^{f_{V_{A,w}},w′} = 1.

The resemblance to Yalcin's attitude semantics is clear, and this approach accounts for Yalcin's data in essentially the same way that Yalcin's semantics does, but within a relational framework. To see this, suppose ⌜A supposes (φ and might not φ)⌝ is true at world w and modal base f.
Given Ninan's attitude semantics, plus the relational semantics for modals and Boolean connectives, this holds just in case, for every world w′ compatible with A's suppositions at w, w′ has the following properties: (i) φ is true at 〈f_{S_{A,w}}, w′〉; (ii) there is some world w′′ in f_{S_{A,w}}(w′) such that φ is false at 〈f_{S_{A,w}}, w′′〉. Given that f_{S_{A,w}}(w′) = S_{A,w}, it is easy to confirm that these conditions are only met when A's suppositions are inconsistent, and so ⌜A supposes (φ and might not φ)⌝ will entail that A's suppositions are inconsistent, as in Yalcin's system.

7 Yalcin (2007) in fact claims that, on his semantics, a sentence with the form ⌜A supposes (φ and might not φ)⌝ is itself a contradiction, but this would only follow if we add a presupposition that the attitude is consistent, as Ninan (2016) proposes.

3 Relative potential expressibility

Is it a fluke that Ninan was able to reproduce, in a relational framework, the interaction of Yalcin's attitude verbs with the domain semantics, or is there a deeper fact about the relationship between the domain and relational semantics which accounts for this? This is an important question for both practical and theoretical reasons.

Practically, this is an important question because Yalcin's data are not the only challenge to the relational semantics from embedding behavior. Yalcin 2007 discusses similar data involving epistemic contradictions in the antecedents of conditionals, and related work on epistemic modals in a variety of different embedding environments has raised related challenges for relational approaches and been used to argue in favor of revisionary semantics.8 Ninan's system shows that the relational semantics can make sense of just some of these data. One way to proceed would be to go through each data point, and see whether the relational framework can replicate the predictions of the revisionary accounts with respect to those data.
But this approach is inefficient; and, more problematically, even if this method showed that all known data can be accounted for in the relational framework, this would still leave open the possibility that more data are lurking undiscovered which a revisionary approach can account for, but which the relational semantics cannot account for. It would be far more helpful if we could show that the relational framework can account for any embedding data which can be accounted for in the revisionary frameworks.

From a more theoretical point of view, answering this question will elucidate the structure of the underlying relationship between the relational semantics and its competitors. And intuitively, it is very natural to think that there is a sense in which the domain semantics is simpler and more constrained than the relational semantics, as Ninan (2016) points out.

In this section I will set out a general framework which allows us to spell out these intuitions in a precise way, by comparing the expressive power of two theories of a given fragment (like the relational and domain semantics) with respect to what embedding operators are definable within those two theories.

Let me first note that, in the choice between semantic theories, embedding data invariably provide just one motivation. There will also be framework-level considerations which are at least somewhat independent of particular embedding data. In the debate about epistemic modals in particular, issues about assertability, disagreement, and retraction have played a crucial role. Thus, for instance, Yalcin advocates the domain semantics as part of an expressivist approach to epistemic modality; MacFarlane advocates it as part of a relativist approach. Ninan, for his part, remains neutral about the 'post-semantics' he intends for his semantics, but the relational semantics is most commonly coupled with a contextualist post-semantics (though it needn't be).
The choice between these frameworks turns, at least in substantial part, on issues that are independent of embedding data.9 As Ninan (2010) discusses in detail, while it is possible that embedding data will have some impact on the choice between these frameworks, the impact will be at best indirect; embedding data tell us about the logic of the operator we are studying, but not about assertion, disagreement, truth, and so on.10 I will focus in this paper narrowly on the choice of semantic theories, not post-semantic theories.

8 See Beaver 1994; Groenendijk et al. 1996; Gerbrandy 1998; Aloni 2000, 2001; Yalcin 2015; Rothschild and Klinedinst 2015; Ninan 2018; Mandelkern 2019.

9 See e.g. Lewis 1980; Ninan 2010; Rabern 2012, 2013 on the distinction between assertoric and semantic content; for relativist approaches, see e.g. Egan et al. 2005; Stephenson 2007; MacFarlane 2011, 2014; for expressivism, e.g. Yalcin 2007; Swanson 2015; Moss 2015; for contextualism, e.g. Dowell 2011; Khoo 2015; Mandelkern 2018a.

10 Though there may be an indirect bearing: for instance, the logical facts may go more naturally with one conception of logical consequence than another, as e.g. Veltman (1996); Groenendijk et al. (1996); Yalcin (2007) have argued with regard to epistemic modals.

Abstracting from the particulars of epistemic modals, the kind of question that I want to address is the following. Given two semantic theories of a given language, suppose that we add an arbitrary new sentence operator to the language, and extend the first semantic theory to give a compositional semantics for the resulting language.11 Can we be guaranteed to find a way of extending the second semantic theory to give a compositional semantics for that operator such that the operator has exactly the same logic, according to the second theory, as it does according to the first theory? If so, we know that the first theory is no more expressive than the second theory with respect to embedding data. That is, for any embedding data involving an operator O not covered by the first theory, if the first theory can be extended to make sense of those data (that is, if we can extend the first theory so that we have a semantics for O which makes all the right predictions about O's logic), then we can also cover the same data in the second theory: we can give a semantics for O in the second theory which makes all the right predictions about O's logic.

We can spell this out more precisely as follows. First, we make precise the notion of a semantic theory (which I'll denote T, T′, and so on) for a language (a set of sentences L) which is meant to correspond roughly to the kinds of systems that semanticists construct to make sense of fragments of natural language. For propositional languages, we will let our models be sequences comprising a set of possible worlds; a valuation function which takes atomic sentences in the language to subsets of the set of possible worlds; a set of indices; and an interpretation function. Indices ordinarily will be sequences which include a possible world or set of possible worlds (from the stock of worlds in the model), and may also include other parameters, like an accessibility relation or set of worlds. Interpretation functions interpret the language relative to the set of indices, given the model's valuation function: for any sentence and index, the interpretation function tells us whether the sentence is true or false at that index. This extended notion of a model (relative to a standard account in modal logic) allows us to schematize in a general way very different kinds of semantic theories, and thus to compare those approaches. Finally, a semantic theory of a language is any set of models of that language.

A semantic theory for a language yields a logic for the language: for a set of sentences Γ and sentence ψ from the language, Γ ⊨_T ψ means that, in every model in T, ψ is true at every index where all the elements of Γ are true.12 All of this is spelled out more formally in Appendix A.

With this in hand, we can define our notion of relative potential expressibility as follows:

Definition 3.1. Relative Potential Expressibility: Given two semantic theories T and T′ for a language L, T is no more expressive than T′ with respect to L (written T ⪯_L T′) iff, for any set of new sentence operators O, for any extension T^O of T to L^O, there is an extension T′^O of T′ to L^O which preserves the logic of O from T^O.

L^O is the closure of L under the elements of O. What it means for T′^O to preserve the logic of O from T^O is the following: for any set of sentences Γ ⊆ L^O at least one of which contains an operator from O (i.e. one of which is in L^O \ L), for any sentence ψ in L^O, (Γ ⊨_{T^O} ψ) ↔ (Γ ⊨_{T′^O} ψ). (More details, again, in the appendix; '↔' abbreviates meta-language 'iff'.)

11 By 'compositional', I mean that the meaning of a sentence containing the operator in question with scope over some sequence of sentences is obtained by applying the meaning of the operator (which we assume is an n-place propositional function) to the meaning of the sentences it takes scope over.

12 I write φ ⊨_T ψ for {φ} ⊨_T ψ.

The notion of relative potential expressibility spelled out here is somewhat different from the extant notions of expressibility I know of, which are typically concerned with comparisons between different languages rather than with comparisons between different semantic theories of a language with respect to arbitrary extensions of the language.
But relative potential expressibility is just what we need to answer the questions posed at the beginning of this section. If T ⪯_L T′, this tells us that T′ can do anything T can with regard to arbitrary extensions of L, and thus that T′ can make sense of any embedding data that T can make sense of.13

Before exploring the application of this framework, let me note a helpful characterization result. Relative potential expressibility between certain kinds of well-behaved semantic theories boils down to something very simple: the existence of a truth-preserving injection from the indices of a model in the first theory to those of a model in the second.

Fact 3.1. Characterization of Expressibility: For any semantic theories T and T′ and language L, if T is isomorphic with respect to L, and T′ is isomorphic with respect to L, then T ⪯_L T′ iff there is a model M ∈ T and a model M′ ∈ T′ such that there is an injection g from the indices of M to those of M′ which preserves the truth of all sentences in L, i.e. such that ∀φ ∈ L, for any index i from M, φ is true at i in M iff φ is true at g(i) in M′.

A semantic theory T is isomorphic with respect to L iff ∀M, M′ ∈ T : M ⪯_L M′ ∧ M′ ⪯_L M.

The proofs of Fact 3.1 and of all the other facts stated in the rest of this section are relegated to Appendix A. I state this characterization result here because it greatly simplifies the proofs of the facts that follow, and, I think, provides some intuitive grip on those results: whether one semantic theory is less expressive than another (in the relevant sense) depends on the structural relations between the semantic theories, and in particular whether the first semantic theory can, from a very abstract perspective, be embedded into the second semantic theory in a truth-preserving way.
13 There is some work on comparing expressive power across different classes of structures, though not to my knowledge anything along the lines of what I am proposing here; see Pinheiro Fernandes 2017 for an overview and relevant citations, in particular Mossakowski et al. 2009, which proposes a structurally similar framework (in particular in their definition of sublogics), but one that remains quite different in its details. Cf. also French (2017) for investigation of the notion of notational equivalence over possible extensions of the language.

Let me make a final note about the relation between relative potential expressibility and the relative strength of logics. If T ⪯_L T′, then the logic of L in T′ can be no stronger than the logic of L in T (they may have the same logic, but it may also be that the logic of L in T is strictly stronger than the logic of L in T′).14 But it is not guaranteed to be weaker (even if T is strictly less expressive than T′, i.e. T ≺_L T′). Nor does the converse follow: if the logic of L in T′ is no stronger than the logic of L in T, there is no guarantee that T ⪯_L T′; likewise, if the logic of L in T′ is strictly weaker than the logic of L in T, there is no guarantee that T ⪯_L T′. So, while relative potential expressibility has some bearing on relative logical strength, these two ways of comparing semantic theories are orthogonal. (Even if they turned out to coincide, this would itself be a substantive fact, since it would not have been at all obvious that they should coincide; that they fail to coincide shows that they get at distinct features of semantic theories.)15

3.1 The domain and relational semantics

With this discussion in hand, I will turn my attention to the specific comparisons of expressive power which I will focus on here: namely, comparisons between different semantics for epistemic modals.
A note to readers: the rest of this section is concerned with introducing a variety of semantics for epistemic modals and comparing their expressive power. I succinctly summarize these results at the beginning of the next section (§4); readers uninterested in the details that follow may wish to skim up to that point.

Since our focus is on epistemic modals, I will compare semantic theories for a very simple language L♦ comprising infinitely many atomic sentences p_i : i ∈ N (which I will write p, q, r etc.) together with sentences of the form ♦φ, for any sentence φ ∈ L♦ (♦ abbreviates 'might').16 Let R denote the relational semantics, and D the domain semantics. These are, again, classes of models, in the sense given above; I assume that the models in every case contain sets of possible worlds and valuation functions such that for any set of atoms, exactly that set is true at some world, and any two worlds differ on the truth of some atom.

14 It may look as though, if T ⪯_L T′, then we can prove that the logic of L in T′ must be the same as in T: we simply introduce a one-place sentence operator Id to L, and give Id the semantics in T of the identity function: for all φ, [[Id(φ)]]^T = [[φ]]^T. Then we will be guaranteed to be able to give Id a semantics in T′ such that the logic of Id in the extension of T′ matches that of Id in the extension of T. But the fact that we can do this does not, contrary to first appearances, guarantee that L has the same logic in the extension of T′ as in the extension of T, because nothing guarantees that Id will still denote the identity function in the extension of T′; this is no part of the logic of Id.

15 An anonymous referee for this journal points out that there are, however, interesting cases in which these two notions do coincide. It seems to me a striking fact both that they do not coincide in general and that there are limited cases where they do, and it seems worth further exploration exactly when they coincide, and why. Things would have been otherwise had we defined T ⪯_L T′ in a slightly different way, such that this means that for arbitrary extension L^O of L and extension T^O of T, we can extend T′ to a model T′^O which matches the logic of L^O in T^O, i.e. which is such that for any Γ ∪ {ψ} ⊆ L^O, (Γ ⊨_{T^O} ψ) ↔ (Γ ⊨_{T′^O} ψ). We instead said that T′^O only must match the logic of O in T^O. This means that T ⪯_L T′ does not at all entail that T and T′ have to agree on the logic of L. This, in turn, makes it possible to compare the relative expressive power of semantic theories which start with different logics for a given language, as the domain and relational semantics do, with respect to possible enrichments of those languages, something which, as we will see, turns out to be very revealing. Otherwise comparisons of relative potential expressive power would be limited to starting languages with identical logics.

16 I let sentences of the language name themselves for brevity.

Recall that for any model r ∈ R, an index in that model is any pair 〈f, w〉, with f a modal base and w a possible world drawn from r's stock of worlds. An atomic sentence p is true in r at 〈f, w〉 just in case the model's valuation function takes p to true at w; and ♦φ is true in r at 〈f, w〉 just in case there is a world w′ in f(w) such that φ is true at 〈f, w′〉 according to r.

Recall that an index in any model d ∈ D is a pair 〈s, w〉, with s an information state and w a possible world. An atomic sentence p is true in d at 〈s, w〉 just in case p is true at w according to d's valuation function. A sentence with the form ♦φ is true at 〈s, w〉 in d just in case there is a world w′ ∈ s such that φ is true at 〈s, w′〉 according to d.
We have some flexibility in how we think about entailment in the domain framework. The classical notion of entailment would say that for any domain model d, Γ ⊨d ψ iff ψ is true at every index in d where all the members of Γ are. But a different route, suggested by Yalcin (2007), would be to treat entailment as preservation of acceptance rather than truth, where an index 〈s, w〉 accepts a sentence φ iff for every world w′ ∈ s, φ is true at 〈s, w′〉. Preservation of truth is a strictly stronger notion than preservation of acceptance: if ψ is true at every index where all the members of Γ are, then ψ is also accepted at every index where all the members of Γ are. But preservation of acceptance does not entail preservation of truth; to take a simple example, p entails ♦p in the preservation of acceptance sense, but not in the preservation of truth sense (at least not without further stipulation about our models).¹⁷

So which logical notion will be the most illuminating for our purposes? I think it is the preservation of truth notion, for a few reasons. For one thing, on the preservation of acceptance notion, the domain semantics ends up being equivalent to the state-based semantics discussed below. So, just for the sake of diversity, it is worth exploring a different perspective on the domain semantics; readers who are interested in what we would find with a preservation of acceptance notion are referred to the discussion of the state-based semantics below. Second, this perspective renders the domain semantics more expressive than it otherwise would be; the acceptance-based approach renders the world parameter of indices essentially invisible, bleaching the semantics of a good deal of expressive strength (as we will see below in the discussion of the state-based semantics).
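The contrast between the two notions of entailment can be checked by brute force on a toy model. The sketch below is only an illustration under stated assumptions: it uses a single atom p and two worlds, lets every subset of the worlds (including the empty set) count as an information state, and does not require w ∈ s. The function names are hypothetical.

```python
from itertools import combinations

def true_dom(phi, s, w):
    """Truth at a domain index <s, w>; atoms are strings, ("might", phi) is ♦phi."""
    if isinstance(phi, str):
        return phi in w
    _, prejacent = phi
    return any(true_dom(prejacent, s, v) for v in s)

def accepts(phi, s):
    """Acceptance: phi is true at <s, w'> for every w' in s (the world parameter drops out)."""
    return all(true_dom(phi, s, v) for v in s)

W_P, W_0 = frozenset({"p"}), frozenset()
WORLDS = [W_P, W_0]
STATES = [frozenset(c) for r in range(len(WORLDS) + 1)
          for c in combinations(WORLDS, r)]
MIGHT_P = ("might", "p")

# Preservation of truth: ♦p is true wherever p is true.
truth_entails = all(true_dom(MIGHT_P, s, w)
                    for s in STATES for w in WORLDS
                    if true_dom("p", s, w))

# Preservation of acceptance: ♦p is accepted wherever p is accepted.
acceptance_entails = all(accepts(MIGHT_P, s)
                         for s in STATES if accepts("p", s))

print(truth_entails, acceptance_entails)
```

On this toy model the truth-based check fails (take the index whose state contains only the non-p world and whose world is the p-world), while the acceptance-based check succeeds. A finite check of this kind illustrates the claim in the text; it is not a proof of it.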
There are many ways to think about the logic of a given system; I think that for present purposes it makes sense to take a vantage point which does not wash out any part of the semantics' inherent expressive power. This allows us to explore the expressive power of the semantics itself, as it were, rather than of the semantics under a certain, possibly limiting, perspective. Thus I will stick with the preservation of truth notion of consequence for both the relational and domain semantics.

¹⁷ Yalcin (2007) countenances the possibility that a domain index 〈s, w〉 should always be such that w ∈ s, in which case this disanalogy would drop out. With or without this addition, the acceptance-based logic remains strictly weaker than the truth-based logic across extensions of the language.

With this in mind, we can show that the domain semantics is no more expressive than the relational semantics:

Fact 3.2. D ⪯L♦ R.

Proofs of all these results are, again, included in the appendix, but I will say a bit here about the intuition behind each proof. In this case, the proof strategy is to choose an arbitrary model in D and construct a function g which takes any index 〈s, w〉 in that model to 〈fs, w〉, where fs is the constant function from worlds to s. Then Fact 3.2 follows from our characterization result in Fact 3.1.¹⁸

Next we show that the relational semantics is not less expressive than the domain semantics with respect to L♦:

Fact 3.3. R ⋠L♦ D.

Fact 3.3 follows immediately from the observation that, for any index i in any model in D, ♦φ is true in D at i iff ♦(♦φ) is; but the same does not hold in R. This will make it impossible to find a truth-preserving injection from the indices of any R model to those of any D model.
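Both proof ideas can be illustrated on a small finite model. The sketch below is only an illustration under stated assumptions, not the appendix proof: it uses three worlds over the atoms p and q, which is smaller than the models the paper assumes, and the function names and encoding are hypothetical. It checks that the map g from 〈s, w〉 to 〈fs, w〉 preserves truth for a handful of sentences, that ♦p and ♦♦p always agree at domain indices of this model, and that they can come apart at a relational index.

```python
from itertools import combinations

def true_rel(phi, f, w):
    """Truth at a relational index <f, w>; atoms are strings, ("might", phi) is ♦phi."""
    if isinstance(phi, str):
        return phi in w
    return any(true_rel(phi[1], f, v) for v in f[w])

def true_dom(phi, s, w):
    """Truth at a domain index <s, w>."""
    if isinstance(phi, str):
        return phi in w
    return any(true_dom(phi[1], s, v) for v in s)

def g(s, w, worlds):
    """Send <s, w> to <f_s, w>, where f_s is the constant function from worlds to s."""
    f_s = {v: s for v in worlds}
    return f_s, w

W_A, W_B, W_P = frozenset(), frozenset({"q"}), frozenset({"p"})
WORLDS = [W_A, W_B, W_P]
STATES = [frozenset(c) for r in range(len(WORLDS) + 1)
          for c in combinations(WORLDS, r)]
MIGHT_P = ("might", "p")
MIGHT_MIGHT_P = ("might", MIGHT_P)
SENTENCES = ["p", "q", MIGHT_P, MIGHT_MIGHT_P]

# g preserves truth for every listed sentence at every domain index.
g_preserves_truth = all(
    true_dom(phi, s, w) == true_rel(phi, *g(s, w, WORLDS))
    for phi in SENTENCES for s in STATES for w in WORLDS)

# In the domain semantics ♦p and ♦♦p agree at every index of this model.
domain_collapse = all(
    true_dom(MIGHT_P, s, w) == true_dom(MIGHT_MIGHT_P, s, w)
    for s in STATES for w in WORLDS)

# A hypothetical modal base on which they come apart at world W_A.
f = {W_A: {W_B}, W_B: {W_P}, W_P: set()}
rel_pair = (true_rel(MIGHT_P, f, W_A), true_rel(MIGHT_MIGHT_P, f, W_A))

print(g_preserves_truth, domain_collapse, rel_pair)
```

At the relational index 〈f, W_A〉, ♦♦p is true and ♦p is false, so no domain index can agree with it on both sentences. This is the observation that the argument for Fact 3.3 rests on.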
I should note here that this difference in the logic of L♦ itself may be taken on its own as an argument for, or against, the domain semantics, depending on what one thinks of the claim that ♦(♦φ) is always equivalent to ♦φ (a claim, again, validated by D but not by R). Yalcin claims this kind of logical feature as a virtue of his theory, while Moss (2015) criticizes this prediction. I will not explore that debate here, though. Our concern is with a very different question: ignoring the logical differences between the domain and relational semantics within L♦, which theory is better equipped to account for embedding data which involve operators beyond our starting language L♦? It is this latter question I focus on here, though the former may, of course, play an important independent role in the final choice between these theories.

Thus from Facts 3.2 and 3.3, we have that the domain semantics is strictly less expressive than the relational semantics with respect to L♦.

Fact 3.4. D ≺L♦ R.

It is worth noting that 'might's are not generally able to stack in natural language: that is, sentences with the form ⌜Might (Might φ)⌝ are generally infelicitous.¹⁹ This may make it seem as though Fact 3.3 holds due to an irrelevant technicality. But this is not so: we still have an expressive inequality between the domain and relational semantics even if we add a syntactic constraint which rules out embedded 'might's (and thus which renders the logics of each theory for this more limited language fully equivalent). In other words, where L♦− is the language comprising just atomic sentences from L♦ and sentences of the form ♦pᵢ, where pᵢ is any atom, we still have:

Fact 3.5. D ≺L♦− R.

The intuition behind these facts is that sets of worlds are in some sense less fine-grained than functions from worlds to sets of worlds (though this is not a fact about cardinality: given a fixed infinite stock of worlds, the set of domain indices and the set of relational indices have the same cardinality).

¹⁸ Note that it is easy to extend this to a proof that the domain semantics is no more expressive than the relational semantics with respect to an expanded language containing not just atoms and 'might' sentences, but also sentences formed with the Boolean connectives.

¹⁹ Though see Moss 2015 for discussion of nested epistemic modals.

3.2 The update and state-based semantics

An important fact about the formal tools developed here is that it is straightforward to apply them to further comparisons: nothing in the kind of comparison we are doing here depends on details of the domain or relational semantics. In the rest of this section, I will illustrate this by exploring the relative potential expressive power of a few more revisionary semantics of epistemic modals which have been motivated by embedding data. In each case, the relational semantics comes out as the most expressive option.

First, the comparison of expressive power can be extended to the update semantics. In the update semantics U, due to Veltman (1996), building on Heim 1982, 1983, the intension of a complex sentence is a dynamic object: a function from information states (usually called contexts in this literature) to information states. The most natural way to think about this approach from the point of view of the formal framework we are using is to think of our "indices" as pairs of contexts, with φ "true" at a pair 〈s, c〉 in U just in case 〈s, c〉 ∈ [[φ]]U. As for the domain semantics, we have different options for thinking about entailment. The option most parallel to the approach we've taken so far would treat entailment as preservation of truth, so that φ ⊨U ψ just in case the function denoted by φ in U is a subset of a function denoted by ψ in U (likewise, ceteris paribus, for multi-premise entailment). This is a non-standard way of thinking about entailment in the dynamic framework.
A different approach would be to treat our indices simply as contexts, rather than pairs of contexts, and then treat entailment as preservation of acceptance, where a context c accepts a sentence φ just in case c is a fixed point for φ, i.e. just in case [[φ]]U(c) = c. This would bring our notion of entailment in line with Veltman (1996)'s (third) notion of entailment as preservation of acceptance. Just as for the domain semantics, however, I think the truth-based notion is the more interesting one, for essentially the same reasons. First, on an acceptance-based notion, the update semantics ends up, again, equivalent for our purposes to the state-based semantics, so the discussion of the state-based semantics below will suffice to show what would result from an acceptance-based perspective on the update semantics. Second, the truth-based notion of consequence is strictly stronger than the acceptance-based notion, i.e., if Γ entails φ in the truth-based notion, Γ also entails φ in the acceptance-based notion, but not vice versa. The truth-based notion of consequence thus provides a more expressive perspective on the update semantics; again, the update semantics can plausibly be coupled with a variety of different logics, but I think for present purposes it makes sense to couple it with a logic which does not wash out any of its expressive power, and so I will stick with the truth-based notion.²⁰ Readers interested in the perspective gained from an acceptance-based vantage point on the update semantics are, again, referred to the discussion of the state-based semantics below.

With this background, we can gloss the usual update semantic rules for atomic sentences and ♦ as follows. For any model u ∈ U, an atomic sentence p is true in u at 〈s, c〉 just in case c is the result of removing all worlds w from s such that vᵤ(p, w) = 0, where vᵤ is u's valuation function.
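The notions introduced so far for the update semantics can be sketched as follows. The encoding is an assumption made for illustration only: a context is a frozenset of worlds, a world is the frozenset of atoms true at it, and the function names are hypothetical. The clause for ♦ is the standard test clause, which the text goes on to state; it is included here so that the sketch covers the whole of L♦.

```python
def update(phi, s):
    """[[phi]](s): the context that results from updating context s with phi."""
    if isinstance(phi, str):
        # Atomic update: remove every world where the atom is false.
        return frozenset(w for w in s if phi in w)
    _, prejacent = phi
    # Test clause for ♦: return s unchanged if the prejacent update is
    # non-empty, and the empty context otherwise.
    return s if update(prejacent, s) else frozenset()

def true_upd(phi, s, c):
    """phi is 'true' at the pair <s, c> iff updating s with phi yields c."""
    return update(phi, s) == c

def accepts(phi, c):
    """c accepts phi iff c is a fixed point of the update with phi."""
    return update(phi, c) == c

w_p, w_0 = frozenset({"p"}), frozenset()
s = frozenset({w_p, w_0})      # hypothetical input context
might_p = ("might", "p")

print(update("p", s) == frozenset({w_p}))   # the non-p world is removed
print(true_upd(might_p, s, s))              # the test succeeds and leaves s as it is
print(accepts("p", s), accepts(might_p, s))
```

The truth-based perspective looks at the whole input and output pair via `true_upd`, whereas the acceptance-based perspective only asks whether a context is a fixed point via `accepts`. This is one way to see why the latter discards information that the former retains.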
And a sentence of the form ♦φ "tests" whether φ is compatible with the input context: ♦φ is true in u at 〈s, c〉 just in case, roughly, s includes a φ-world and s = c, or s doesn't include a φ-world, and c = ∅; more precisely, just in case [[φ]]ᵤ(s) ≠ ∅ and s = c, or else [[φ]]ᵤ(s) = ∅ and c = ∅.

We first show that the update semantics is no more expressive than the relational semantics with respect to L♦:

Fact 3.6. U ⪯L♦ R.

The proof goes by way of constructing an injection from pairs of contexts to relational indices which preserves truth for all sentences in L♦. The injection goes by way of distinguishing a variety of different cases, and is not particularly intuitive, which makes Fact 3.6 somewhat surprising. We can also show that the relational semantics is more expressive than the update semantics:

Fact 3.7. R ⋠L♦ U.

The proof is identical to the proof of Fact 3.3 (that R ⋠L♦ D). Thus we have:

Fact 3.8. U ≺L♦ R.

Interestingly, though, we do not have parallel results for the domain semantics:

Fact 3.9. U ⋠L♦ D.

Fact 3.10. D ⋠L♦ U.

We can prove these facts by showing that there is no truth-preserving injection from the indices of U to those of D, and vice versa.²¹ Thus the domain and update semantics are expressively incommensurable, though both are strictly less expressive than the relational semantics.

²⁰ It is also worth noting that not everything in the update framework is definable in terms of acceptance; in particular, the update treatment of 'might' cannot be reformulated in an equivalent way just in terms of the fixed points of its complement. To see this, let Acc(φ) be the set of φ's accept states, i.e. Acc(φ) = {c : [[φ]]U(c) = c}. In standard extensions of the update semantics to conjunction and negation, for atomic p, Acc(p ∧ ¬p) = Acc(♦p ∧ ¬p) = {∅}: these sentences have all the same fixed points, namely, just ∅. But they interact differently with ♦: e.g. ♦(p ∧ ¬p) is only accepted by ∅, whereas ♦(♦p ∧ ¬p) is accepted by some non-empty sets, namely those which include both p- and ¬p-worlds; and so the update semantics for ♦ cannot be defined just in terms of the accept states of its complement (though see Gillies 2018 for a variant which can be).

²¹ These facts are particularly interesting in relation to the limited equivalence between the update and domain semantics proved in Rothschild 2017. See also Rothschild and Yalcin 2015, 2016 for detailed discussion of dynamic semantics and their relation to static semantics.

Note, finally, that these relations are all preserved for L♦−, the language which does not allow stacked 'might's. First, we have D ⋠L♦− U (the proof is the same as the proof of Fact 3.10). That, together with Fact 3.5 (that D ≺L♦− R), and the fact that ⪯ is transitive (see Fact 3.13 below), shows that R ⋠L♦− U; otherwise, we would have D ⪯L♦− U. Finally, the proof that U ⪯L♦ R extends immediately to L♦−, so we have:

Fact 3.11. U ≺L♦− R.

The last system I will discuss here is the state-based semantics of Hawke and Steinert-Threlkeld 2016.²² The state-based semantics S is very similar to the update semantics, except an index is a single set of worlds. At any model s ∈ S, an atomic sentence p is true at a set of worlds c just in case, for every world w ∈ c : vₛ(p, w) = 1. And a sentence of the form ♦φ is true in s at c just in case φ is true in s at some {w} ⊆ c. The state-based semantics is strictly less expressive than the domain semantics:

Fact 3.12. S ≺L♦ D.

Furthermore, for any language L, ⪯L is transitive across semantic theories isomorphic over L; this follows from Fact 3.13:

Fact 3.13. For any language L and class of semantic theories T isomorphic with respect to L, ⪯L is a partial pre-order over T.

So we also have:

Fact 3.14. S ≺L♦ R.

3.3 Quantified modal languages

The final comparison of expressive power which I will explore here concerns quantified modal languages.
This is a particularly interesting topic for present purposes because epistemic modals embed in fascinating ways in the scope of quantifiers, ways that have been used to argue against the standard semantics and in favor of various quantified enrichments of the update semantics. The present expressibility results can, however, be extended to show that, if we extend all these frameworks to quantificational languages in the most straightforward way, the expressive hierarchies for the non-quantified language remain unchanged.

To show this, consider a standard quantificational language L♦v, built out of a vocabulary comprising variables xᵢ : i ∈ N; n-place relation symbols Rⁿᵢ : i ∈ I for every n ≥ 0; and a one-place sentence operator ♦. L♦v is the smallest set comprising atomic sentences with the form of an n-place relation symbol Rⁿ followed by an n-tuple of variables; and sentences of the form ♦φ, for any φ ∈ L♦v. We can extend the basic relational semantic theory R to a semantic theory R∃ for this quantified language by adding a domain to the semantic theory, and adding a variable assignment (a function from variables to elements of the domain) to our indices, so that they amount to triples comprising a variable assignment, accessibility relation, and world. Our valuation function now takes a world and an n-place relation symbol and returns an n-place relation on the domain. Our truth clauses for atomic sentences and ♦ will be generalized in the usual way, and our semantics for ♦ remains essentially unchanged.²³ We can treat quantifiers as sentence operators (shifting variable assignments, in the usual way) which we can freely add to our language.

²² It is a more complicated matter to extend the comparison to the bilateral state-based semantics in Aloni 2016, Steinert-Threlkeld 2017, Chapter 3.
We can likewise enrich the domain semantics to a semantic theory D∃ of L♦v in a parallel fashion, augmenting our indices so they are triples of variable assignments, information states, and worlds, and extending the interpretation function in the obvious way. It is clear that these changes do not affect the expressive hierarchy between the domain and relational semantics. That is, we have:

Fact 3.15. D∃ ≺L♦v R∃.

The proof is a straightforward generalization of the proof of the parallel result for the non-quantified case.

Things get more interesting when we turn to the update semantics.²⁴ The most obvious way to incorporate quantification into the update semantics is to simply treat intensions as functions from a variable assignment to a function from contexts to contexts (as in Yalcin 2015), so that U∃-indices have the form 〈a, 〈s, c〉〉, for a a variable assignment and s and c contexts.²⁵ For any model u∃ ∈ U∃, we say that, for atomic sentence p of the form Rⁿ(〈x₁, x₂, …, xₙ〉), p is true in u∃ at 〈a, 〈s, c〉〉 iff c is the result of removing from s all and only worlds w such that 〈a(x₁), a(x₂), …, a(xₙ)〉 ∉ vu∃(Rⁿ, w). And we say that ♦φ is true in u∃ at 〈a, 〈s, c〉〉 iff [[φ]]u∃(a)(s) ≠ ∅ ∧ s = c, or [[φ]]u∃(a)(s) = ∅ = c. If we go this way, then, once again, the expressive hierarchy from above will be preserved: the quantified update semantics will be strictly less expressive than the quantified relational semantics.

Fact 3.16. U∃ ≺L♦v R∃.

²³ I.e. for any model r∃ ∈ R∃, where 〈a, f, w〉 is a relational index, with a a variable assignment, f a modal base, and w a possible world, we say that an atomic sentence of the form Rⁿ(〈x₁, x₂, …, xₙ〉) is true in r∃ at 〈a, f, w〉 iff 〈a(x₁), a(x₂), …, a(xₙ)〉 ∈ vr∃(Rⁿ, w), and ♦φ is true in r∃ at 〈a, f, w〉 iff there is a world w′ ∈ f(w) such that φ is true in r∃ at 〈a, f, w′〉.

²⁴ I'll set aside the state-based semantics for now; I don't know of quantificational versions of that semantics.
²⁵ Interestingly, this is not the most common way to incorporate quantification into the dynamic framework in which the update semantics is cast. In the standard quantified extension of that framework, developed in Heim 1982, 1983, indices are treated, not as pairs of a variable assignment and a pair of information states, but rather as pairs of 'files', where a file is a set of world/variable assignment pairs. I will not explore the expressive hierarchies that would result from this more complicated way of going.

4 The upshot

There is much more to explore here, but I will stop for the present. Let me briefly summarize the situation, and then emphasize a few upshots of this discussion.

The (quantified) domain semantics, (quantified) update semantics, and state-based semantics are all strictly less expressive than the (quantified) relational semantics with respect to a language comprising 'might' and atomic sentences. In other words, for any way of extending this language with a new set of sentence operators and any way of extending the domain, update, or state-based semantic theories to give semantics for those operators, we can extend the relational framework to these operators in a way which exactly replicates the logic of that operator in the given semantics. But the converse is not true: there are operators definable in the relational framework whose logic cannot be replicated in the domain, update, or state-based frameworks. This shows that there are no data involving 'might' embedded under some operator which the domain, update, or state-based semantics can make sense of, but which the relational semantics cannot make sense of.²⁶

This, in turn, sheds new light on the debate about the semantics of epistemic modals. At a high level, that dialectic has sometimes been presented roughly as follows. First, a conservative assumption in favor of the relational semantics is generally taken for granted.
Then, this assumption is challenged with data involving 'might' embedded under other operators which the relational semantics putatively has trouble accounting for. Finally, the data are used to advocate a revisionary semantics which can make sense of them when extended in an appropriate way to give a semantics for the embedding operator. The present results show that this gets things backwards.

On the one hand, these results show that there are no embedding data which on their own tell against the relational semantics and in favor of the domain, update, or state-based semantics. That is, for any embedding data which we can account for within the domain, update, or state-based semantics by extending those semantics to the embedding operator(s) in question, we can account for the same data in the relational semantics by extending the relational semantics in a way which matches the logic of the new operator(s) in the domain/update/state-based frameworks. But the converse does not hold, and so there may well be data which tell the other way: embedding data which the relational framework can make sense of, but which the domain, update, and state-based frameworks cannot make sense of.

²⁶ Of course, this holds only as long as we focus on sentences where 'might' takes as a complement just sentences in our simple starting language. In fact, of course, 'might' can embed sentences of much more complexity. But, as far as the embedding behavior of epistemic modals goes, this limitation seems to be harmless; all of the data that I know of which have been used to motivate departures from the relational semantics stay within these bounds. This limitation follows from the assumption that LO is closed only under the operators in O, and not under the operators in the vocabulary of L. If we made the latter assumption, we would avoid this limitation, but the proofs below would not go through. Note that many (but not all) of the results above can be easily extended to more complex starting languages. (See brief discussion in Footnote 18. In some cases, no such extension will be possible, e.g. if we compare the update and relational semantics with a starting language which includes conjunction, given its standard entries in these two frameworks. But the comparison of that result to those above shows that the resulting failure has to do with those entries for conjunction, not with the semantics for epistemic modals; the results above show that the logical features of the standard update conjunction can be replicated in the relational framework.)

On the other hand, excessive expressive power is to be avoided in giving a semantics for natural language. So these results also provide the basis for an argument in favor of the revisionary, less expressive alternatives. If natural language does not make use of the full expressive power of the relational semantics, we may want to record this fact in our semantic theory. The less expressive a theory of 'might' is, the fewer stipulations are needed to make it match data from natural language. Thus, if the less expressive frameworks can make sense of the behavior of embedded epistemic modals, then the present expressibility results show that there should be a simplicity-based prejudice in their favor. This makes precise the intuition reported by Ninan and others that there is something about the domain semantics which makes it simpler than the relational semantics.

These considerations are, of course, only pro tanto. There are many other considerations that can bear on the choice between semantic theories.
For instance, one framework may be able to account for embedding data in a more natural way than another (for instance, by doing so in a computationally simpler way; it is important to note that just because T ⪯L T′, it does not follow that T′ can match the logic of any extension of T in as computationally simple a way as T does).²⁷

Let me put all this a bit more generally, to bring out the utility (and limitations) of comparisons of potential expressibility. Consider a debate between two semantic theorists about how to make sense of some fragment of natural language. The first theorist may point to the logical behavior of sentences in some extension of this fragment as evidence in favor of her theory. The second theorist could respond by showing that there exists an extension of her preferred theory to that extension which also captures the logical behavior in question. But the present considerations show that a more general response is also available to her. If she can show that her theory has greater potential expressibility than the first theorist's, then she can show that the first theorist can't ever win an argument so easily: whatever extension the first theorist offers up, the second theorist will be able to match the logic of that extension. This does not decide which theory is correct. For one thing, this very fact, although it undermines the first theorist's original argument for her theory, may be taken as a different point in favor of the first theory, as it shows that there is a precise sense in which the first theory is less flexible than the second. Second, there may be other, independent considerations which bear on the choice between the theories. Thus considerations of relative potential expressibility are not decisive, but they play at least two important roles: they help us determine when an argument based purely on the logic of some extension of a semantic theory can on its own be an argument in favor of that theory and against another; and they help us make precise intuitions about the relative simplicity of theories.

²⁷ Cf. debates in syntax, where more powerful grammars are sometimes preferred simply because of the ease with which they can capture the data; thanks to Roni Katzir for pointing out this connection.

5 The domain versus relational semantics

This concludes my abstract discussion of relative potential expressibility. My plan for this final part of the paper is to briefly descend from these abstract considerations back into the more concrete debate about which theory is best able to account for epistemic modals in natural language. My goal is not to choose between these theories here. I will instead focus on a limited subset of embedding data, making a case study of the behavior of epistemic modals under attitude predicates. The goal will be to bring out the kinds of methodological considerations that should be brought to bear in deciding between different semantic theories in light of expressibility results like those proved above.²⁸

As we saw above, Ninan (2016) shows that we can account for Yalcin's data in the relational system. More generally, Ninan's semantics for attitude predicates, plus the relational semantics, is in fact equivalent to Yalcin's system.²⁹ In light of this equivalence, together with the expressibility results which showed that there is a sense in which the domain semantics is more constrained than the relational semantics, it would be tempting to conclude that the domain semantics is to be preferred over the relational semantics, at least modulo considerations about epistemic modals in other environments. But this would be too fast, because it turns out the predictions of Yalcin's and Ninan's systems are problematic.³⁰ These systems have at least three serious empirical problems.
First, as Dorr and Hawthorne (2013) point out, Yalcin's system (and hence Ninan's system) predicts that, for non-modal φ and agent A, the inference from φ to ⌜A knows might φ⌝ will be valid: on these accounts, the latter is true just in case the truth of φ is compatible with A's knowledge, and any truth is compatible with anyone's knowledge. But this is obviously wrong. It is possible to fail to know that some truth might obtain; this is just what happens when one has a false belief. Suppose John sees Mark enter his office and close the door. Unbeknownst to John, Mark has a secret exit in the floor of his office, and has used this exit to leave the office and go to the bar. In this situation, 'John knows that Mark might be at the bar' seems plainly false. But, since Mark is in fact at the bar, the Yalcin/Ninan approach wrongly predicts this to be true.³¹

²⁸ In Mandelkern 2019, I explore a much wider variety of embedding data, and give a different response to those data. To be clear, I stand by the response I give there; the proposals I explore here are local and ad hoc in an entirely unsatisfying way, and serve a merely illustrative purpose. The 'bounded theory' that I develop there builds on the relational theory. However, as I point out there (Footnote 58), a similar response could be given in a domain framework. For what it's worth, my own view is that the relational framework is preferable, but for reasons having to do with higher-level considerations about communication (which I discuss in Mandelkern (2018a)), not on the basis of embedding data.

²⁹ I.e. for any attitude verb V, world w, information state s, and modal base f, ⌜A V's φ⌝ is true in Yalcin's system at an index 〈s, w〉 just in case it is true in Ninan's system at the index 〈f, w〉.

³⁰ The same problems face the update semantics, when combined with the approach to attitudes in Heim 1992, or the event-relative modal and attitude semantics of Hacquard 2006, 2010.

The second problem is that the Yalcin/Ninan framework gets the entailment relations between ⌜A knows might φ⌝ and ⌜A believes might φ⌝ backwards. In this framework, the first says that the truth of φ is compatible with A's knowledge; the latter that it is compatible with A's beliefs (provided A's beliefs are consistent). Since whatever is compatible with someone's beliefs is compatible with their knowledge, but not vice versa, this framework predicts that ⌜A knows might φ⌝ does not entail ⌜A believes might φ⌝; but, as Dorr and Hawthorne (2013) point out, that ⌜A consistently believes might φ⌝ does entail ⌜A knows might φ⌝. But this is wrong: just as in the non-modal case, likewise in the modal case, knowledge entails belief, but not vice versa. Birthers consistently believe that Obama may have been born in Kenya, but they don't know this. By contrast, if John knows Mark may be in his office, then he also believes this; 'John knows Mark might be in his office, but he doesn't believe Mark might be in his office' has the same sense of incoherence as in the non-modal case.³²

The final problem for the Yalcin/Ninan system is that it predicts that 'must' is vacuous under attitudes.³³ That is, sentences with the form ⌜A V's must φ⌝ and ⌜A V's φ⌝ are predicted to be semantically equivalent: modals embedded under an attitude verb are predicted only to contribute quantificational force, and since the universal quantificational force of 'must' matches the universal quantificational force of the embedding predicate, 'must' is predicted to have no effect. This prediction, however, is wrong. A construction with the form ⌜A knows/believes must φ⌝ is generally only felicitous if A's evidence for the truth of φ is in some sense indirect.³⁴ For instance (to modify a stock example) suppose that Sue is watching it rain, and on this basis concludes that it's raining out.
Then we can say 'Sue knows/believes it's raining', but the 'must'-variant 'Sue knows/believes it must be raining' is quite odd. By contrast, suppose that Sue can't see outside, but sees some of her colleagues come inside with wet umbrellas, and on this basis concludes that it's raining out. Then either the 'must' or non-modal variant is acceptable. There are many kinds of explanation we might seek out for this difference, including broadly pragmatic explanations, but it is hard to see how the Yalcin/Ninan systems could provide the foundation for any such explanation, since, on this approach, the 'must' and non-modal variants are, again, semantically equivalent.³⁵

³¹ Yalcin (2012) discusses this problem in the context of the update framework. A referee for this journal helpfully points out that this prediction of the Yalcin/Ninan system would be palatable if there were some reading of 'knows' on which the inference from φ to ⌜A knows might φ⌝ looks valid. If there were, then sequences like the following should have a coherent reading: 'Susie is completely convinced that it's sunny out; she is, after all, looking out at what appears to be a sunny sky. But she knows that it might be raining out, because in fact it's raining out, and the apparently sunny sky is just a clever projection.' On any reading, the last sentence here sounds like a non sequitur.

³² See Hawthorne et al. (2016); Bledin and Lando (2017); Beddor and Goldstein (2018) for discussion of related cases. A referee for this journal helpfully points out that these first two problems stem from the same logical feature of the Yalcin/Ninan system: that on that system, ⌜A consistently believes/knows φ⌝ is equivalent to ⌜φ is consistent with A's beliefs/knowledge⌝. These problems are, nonetheless, distinct, and are important to keep separate because it is possible to solve one problem without solving the other (as we will see presently).

³³ Assuming 'must' is defined as the dual of 'might', with negation given its standard Boolean meaning. This follows from the more general problem that to accept φ and to accept ⌜Must φ⌝ amount to the same thing in this system, a problem Hacquard 2010, §6.1.2 discusses.

³⁴ This generalizes a common parallel observation about unembedded 'must'; see Karttunen 1972; Veltman 1985; Kratzer 1991; von Fintel and Gillies 2010; Kratzer 2012; Matthewson 2015; Lassiter 2016; Giannakidou and Mari 2016; Sherman 2018; Mandelkern 2017b, 2018b. For specific discussion of embedded 'must' see Ippolito 2017.

The Yalcin/Ninan system is thus not empirically viable. Yalcin's and Ninan's systems show us how to respond to Yalcin's data within a domain and relational semantics, respectively. In both cases, those responses have implausible results. The expressibility results above showed that the relational semantics has more expressive power than any of the alternative views we explored. The implausibility of the Yalcin/Ninan system may appear to show that we need an even more expressive semantics than the relational one in order to have the flexibility to account for Yalcin's data in a plausible way. But this would be too fast. The main point which I wish to make in this section is that reasoning like this would only be valid if Yalcin's and Ninan's systems were the weakest ways one could possibly account for Yalcin's data in the domain and relational frameworks, respectively. If this were so, then things would look very bad indeed for those frameworks: we would know that any way those frameworks could account for Yalcin's data will validate the implausible inferences just reviewed. This would, in turn, provide sufficient motivation to reject those theories and pursue a new, more expressive theory of epistemic modals.
But, by contrast, if Yalcin's and Ninan's systems do not represent the weakest ways to respond to Yalcin's data within the domain and relational frameworks, respectively, then the failure of the Yalcin/Ninan systems tells us nothing about the viability of those frameworks. More generally, my point is the following: when faced with new data, the only way to show that a semantic system is incapable of making sense of those data is by looking at the weakest way that system can account for the data. It is only if the weakest possible account of the data within that system validates implausible entailments that we know we need a more expressive underlying system, rather than simply a different way of responding to the data within that system. And it turns out that the Yalcin/Ninan system is not the weakest way to account for Yalcin's data, within either the relational or domain frameworks. In both cases, there are much weaker constraints available, which account for Yalcin's data and avoid the problems just enumerated. To show this, I will explore what the weakest constraint is in each framework which accounts for these data. To make this discussion tractable, I will make two background assumptions: first, that our connectives are the standard Boolean ones;36 second, that attitudes are represented as sets of possible worlds, in the broadly Hintikkan framework that Ninan and Yalcin both assume, i.e. that attitude predicates have as their core semantic values universal quantifiers over accessible worlds. There may be reasons to relax both these assumptions, but they facilitate the present discussion, and are harmless as far as present purposes are concerned. Given all this, the weakest constraint in the relational framework which suffices to account for Yalcin's data (that is, to ensure that ⌜A supposes (φ and might not φ)⌝ entails that A's suppositions are inconsistent) is the following: provided A's suppositions are consistent, then some supposition world can only access other supposition worlds.

35 It may look as though these data can be explained in a relatively conservative way within a close variant of the Yalcin/Ninan approach by adopting a presupposition of indirectness along the lines suggested by von Fintel and Gillies (2010). But, first, to do this, we would have to depart substantively from the domain semantics, since information states do not provide enough structure to formulate a presupposition like the one von Fintel and Gillies propose; on that proposal, modals are evaluated relative to a set of propositions (representing an agent's direct evidence), and presuppose that no single element of the set entails the modal's prejacent or its negation. Second, even if we modify the domain semantics so that the von Fintel and Gillies proposal is statable, Ippolito (2017) gives convincing arguments that indirectness does not project like a presupposition, and so should not be encoded as a presupposition at all (nor will we have better luck encoding it as a conventional implicature, which would face the same objections). I am inclined to think instead that a pragmatic view like the one that I develop in Mandelkern (2017b, 2018b) is more plausible (cf. Degen et al. 2015). But that approach crucially requires that there be a difference in truth conditions between 'must' sentences and corresponding non-modal sentences, and so cannot get off the ground if ⌜A V's must φ⌝ is semantically equivalent to ⌜A V's φ⌝. There are, of course, pragmatic accounts which distinguish sentences which are semantically equivalent, but I do not see a natural account to apply to the present case. In any case, the first two points alone provide motivation to look for alternatives.
We can schematically encode this constraint, which I'll call the subset relational constraint, as follows:

Definition 5.1. Subset attitude semantics, relational version:
• $[\![A\ V\text{'s}\ \varphi]\!]^{f,w}$
a. defined only if $V_{A,w} \neq \emptyset \rightarrow \exists w' \in V_{A,w} : f(w') \subseteq V_{A,w}$ (subset relational constraint)
b. where defined, true iff $\forall w' \in V_{A,w} : [\![\varphi]\!]^{f,w'} = 1$ (standard Hintikkan truth conditions)

To see that this constraint is necessary and sufficient to account for Yalcin's data in the relational framework, given our background assumptions, first assume that ⌜A supposes (φ and might not φ)⌝ is true; then we know that (i) all of A's supposition worlds make φ true (relative to f), and (ii) all of A's supposition worlds can access a world where φ is false (relative to f). But the subset relational constraint ensures that, if A's suppositions are consistent, then one of A's supposition worlds can only access other supposition worlds; by (i), every world that world accesses is a φ-world, contradicting (ii). It follows that (i) and (ii) are only both satisfied when there are no worlds compatible with A's suppositions. Note next that the subset relational constraint is necessary to account for Yalcin's data, given our background assumptions: if the constraint is violated, then ⌜A supposes (φ and might not φ)⌝ will not entail that A's suppositions are inconsistent. Suppose that A's suppositions are consistent at w, and the subset relational constraint is violated for some modal base f: that is, $S_{A,w}$ is non-empty, and $\forall w' \in S_{A,w} : f(w') \setminus S_{A,w} \neq \emptyset$. Let φ denote $S_{A,w}$.37 By construction, every world in $S_{A,w}$ is a φ-world, and every world in $S_{A,w}$ will be able to access a world under f where φ is false. Thus ⌜A supposes (φ and might not φ)⌝ is true at w, even though A's suppositions at w are consistent.

36 While adopting non-standard connectives could on its own explain Yalcin's data (as Rothschild and Klinedinst (2015) and Mandelkern (2019) discuss), it would not on its own account for nearby order and scope variants.
37 It will generally be possible to come up with a sentence that denotes the content of an attitude state in natural language: e.g. just let φ = 'What A V's at w'.

Thus, given our background assumptions, the subset relational constraint is the weakest constraint which accounts for Yalcin's data in the relational framework. The first thing to note about this constraint is that it is much weaker than the constraint implicit in Ninan's system, namely that modal bases must be constant functions to the set of attitude worlds. And thus the relational semantics is by no means locked into the Yalcin/Ninan approach if it is to make sense of Yalcin's data. Before exploring the plausibility of the subset relational system (the system that results from the relational semantics together with the subset attitude semantics), let us explore the parallel question for the domain semantics. It turns out that there is no way to exactly replicate the subset relational system in the domain framework, given our background assumptions,38 which is unsurprising, given the greater expressive power of the relational framework, and which provides a helpful concrete illustration of those expressibility results. But neither is the domain semantics locked into the Yalcin/Ninan framework. The weakest constraint in the domain framework which accounts for Yalcin's data is the following: a modal in the complement of ⌜A supposes⌝ is always evaluated relative to a subset of A's supposition worlds. A simple way to encode this constraint, which I call the subset domain constraint, is as follows:39

Definition 5.2. Subset attitude semantics, domain version:
• $[\![A\ V\text{'s}\ \varphi]\!]^{s,w}$
a. defined only if $s \subseteq V_{A,w}$ (subset domain constraint)
b. where defined, true iff $\forall w' \in V_{A,w} : [\![\varphi]\!]^{s,w'} = 1$ (standard Hintikkan truth conditions)

Given this constraint, if ⌜A supposes (φ and might not φ)⌝ is true at 〈s, w〉, then (i) all the worlds compatible with A's suppositions in w make φ true (relative to s); and (ii) all the worlds compatible with A's suppositions are such that some world in s makes φ false (relative to s). Since the constraint guarantees that every world in s is a supposition world, these two conditions can only be jointly met if there are no worlds compatible with A's suppositions. Next suppose that A's suppositions are consistent at w, and that the subset domain constraint is violated for some information state s, i.e. $s \setminus S_{A,w} \neq \emptyset$. Let φ denote $S_{A,w}$. Then ⌜A supposes (φ and might not φ)⌝ will be true at 〈s, w〉, since (i) all of A's supposition worlds are φ-worlds; and (ii) some world in s makes φ false, since $s \setminus S_{A,w}$ is non-empty and by construction includes only ⌜¬φ⌝-worlds. So, if the subset domain constraint is violated, ⌜A supposes (φ and might not φ)⌝ can be true even if A's suppositions are consistent.

38 The proof follows from the fact that, given our background assumptions and given any semantics for attitudes, the domain framework will validate the inference from ⌜A V's might (φ or ψ)⌝ to ⌜(A V's might φ) or (A V's might ψ)⌝; but this inference is invalid in the subset relational framework. See Stalnaker 1984; Yalcin 2011 for criticism of one prediction of this inference pattern: that ⌜A believes might φ or A believes might ¬φ⌝ is a logical truth.

39 A different route to an approach which ends up being essentially equivalent to the subset domain semantics goes by adding an ordering source to Hacquard (2010)'s event-relative semantics; see Hacquard 2010, §6.1.2 for a brief discussion of this possibility.

Thus, given our background assumptions, the subset domain constraint is the weakest constraint which accounts for Yalcin's data in the domain framework.
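The sufficiency and necessity arguments for the two subset constraints can be checked by brute force on small finite models. The following is a minimal sketch, assuming a hypothetical encoding not found in the paper: worlds are integers, propositions and attitude states are frozensets of worlds, a modal base `f` is a dictionary from worlds to sets of worlds, and all function names are illustrative.

```python
from itertools import combinations, product

def subsets(xs):
    """All subsets of xs, as frozensets."""
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

def relational_constraint(S, f):
    """Subset relational constraint: if S is non-empty, some world in S accesses only S-worlds."""
    return not S or any(f[w] <= S for w in S)

def domain_constraint(S, s):
    """Subset domain constraint: the information state s is a subset of the attitude worlds S."""
    return s <= S

def supposes_rel(S, f, phi):
    """Hintikkan truth conditions for 'A supposes (phi and might not phi)', modal base f."""
    return all(w in phi and any(v not in phi for v in f[w]) for w in S)

def supposes_dom(S, s, phi):
    """Hintikkan truth conditions for 'A supposes (phi and might not phi)', information state s."""
    return all(w in phi and any(v not in phi for v in s) for w in S)

def weakest_constraints_hold(worlds):
    """Brute-force check of sufficiency and necessity over every model on `worlds`."""
    worlds = list(worlds)
    props = subsets(worlds)
    for S in (p for p in props if p):  # consistent (non-empty) supposition states only
        for values in product(props, repeat=len(worlds)):
            f = dict(zip(worlds, values))
            if relational_constraint(S, f):
                # Sufficiency: no proposition phi makes the sentence true.
                if any(supposes_rel(S, f, phi) for phi in props):
                    return False
            elif not supposes_rel(S, f, S):
                # Necessity: when the constraint fails, phi = S should be a witness.
                return False
        for s in props:
            if domain_constraint(S, s):
                if any(supposes_dom(S, s, phi) for phi in props):
                    return False
            elif not supposes_dom(S, s, S):
                return False
    return True

print(weakest_constraints_hold(range(3)))
```

The check treats the definedness condition as a filter on models rather than as a presupposition, which is a simplification; it is meant only to illustrate the two directions of the argument, not to replace the proofs in the text.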
And, once more, this constraint is much weaker than the one relied on in Yalcin's framework, namely that modals under attitudes are evaluated relative to exactly the set of attitude worlds. Thus the domain semantics, like the relational semantics, is not locked into the Yalcin/Ninan approach if it is to make sense of Yalcin's data. The key question at this stage is whether the subset relational or subset domain system improves on the Yalcin/Ninan systems. Strikingly, both do (albeit to varying degrees). First, both approaches predict that non-modal φ does not entail ⌜A knows might φ⌝, avoiding the most serious problem for the Yalcin/Ninan system: this is because, on both approaches, ⌜A knows might φ⌝ is strictly stronger than ⌜φ is compatible with A's knowledge⌝. Second, both approaches, unlike the Yalcin/Ninan approach, predict that ⌜A knows might φ⌝ Strawson entails ⌜A believes might φ⌝ (in the terminology of von Fintel 1997): whenever both sentences are well-defined, if the first is true, the second is as well (holding fixed the modal base/information state parameter). The subset relational approach also correctly predicts that ⌜A consistently believes might φ⌝ does not entail ⌜A knows might φ⌝; by contrast, the subset domain approach still wrongly predicts that this entailment is Strawson valid. Finally, both approaches predict that ⌜A V's must φ⌝ and ⌜A V's φ⌝ are not semantically equivalent: the subset relational approach predicts that neither entails the other, while the subset domain approach predicts that the latter entails the former, but not vice versa (in both cases, there is much more to be done to marshal these facts into an explanation of how 'must' patterns under attitude verbs, but this is a clear improvement over the Yalcin/Ninan prediction that ⌜A V's must φ⌝ is semantically equivalent to ⌜A V's φ⌝). The subset domain system is less attractive than the subset relational system with respect to its predictions about the relation between belief and knowledge.
But there is much more to explore before we come to any conclusion here; in particular, we must, among other things, explore alternative approaches which relax the two background assumptions we have made here.40 The goal of the present excursus is not to decide between the domain and relational approaches, but rather to emphasize that, in deciding whether some embedding data show that a given semantic framework is not expressively powerful enough to make sense of natural language, we must always explore the weakest constraint within that framework which makes sense of those data. It is only if that constraint makes implausible commitments that the data truly tell against the framework in question. In the present case, these considerations showed that there are much weaker ways to account for Yalcin's data in both the domain and relational frameworks than with the implausible Yalcin/Ninan framework. These weaker approaches are both much more plausible than the Yalcin/Ninan system, and thus these embedding data do not, after all, tell against either framework as a candidate semantics for epistemic modals. Let me conclude by pointing to a more indirect upshot of this discussion.

40 Again, in Mandelkern (2019), I explore a system which abandons Boolean semantics for the connectives. In a different direction, Beaver 1992, 2001; Rothschild 2011; Yalcin 2012; Willer 2013 develop systems which treat attitude states, not as sets of worlds, but as sets of sets of worlds. Those approaches avoid the first and second problems raised here, though they still face the third problem: they predict that ⌜A V's must φ⌝ is semantically equivalent to ⌜A V's φ⌝, at least within any eliminative fragment like the ones they are working with.
The domain semantics has a substantially stronger logic than the relational semantics; in particular, enriched with standard classical treatments of the connectives, the domain semantics validates the logic K45 (or S5, if we also impose a reflexivity constraint by assuming that, for all domain indices 〈s, w〉, w ∈ s),41 whereas the relational system validates just the much weaker logic K (again, unless we assume a reflexivity constraint (as I have not), in which case we would have a T logic). Interestingly, Ninan's system, although it adopts the relational semantics, still validates K45 in a local sense: the axioms of K45 will be valid in the scope of an attitude verb. This may make it look as though Yalcin's data are really an argument for K45: we could validate K45 directly by adopting the domain semantics, or indirectly (in a local way) by adopting Ninan's semantics, but either way, we must validate K45, at least in the scope of attitude predicates, if we are to make sense of the data. But the subset relational approach shows this is wrong: it is mere happenstance that the most prominent treatments of the data both validate this strong logic (at least in the scope of attitude predicates), for the subset relational approach does not validate K45 even for modals in the scope of attitude verbs (e.g. on this approach, ⌜A believes might p⌝ can be true without ⌜A believes must (might p)⌝ being true, and ⌜A believes must p⌝ can be true without ⌜A believes must (must p)⌝ being true). And so we can make sense of Yalcin's data without validating K45 even in a local sense. This shows that, even though (I have argued) there is an argument for the domain semantics on the basis of its expressive weakness, there is no argument from Yalcin's data for the domain semantics on the basis of its stronger logic, which turns out to be strictly orthogonal to accounting for those data.
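The two local failures of K45 mentioned above can be illustrated with a small countermodel. The sketch below is an assumption-laden toy, not drawn from the paper: the worlds `a`, `b`, `c`, the belief state `B`, the modal base `f`, and the propositions `p` and `q` are all hypothetical choices, made so that the subset relational constraint holds (world `a` accesses only belief worlds).

```python
# Hypothetical three-world model for the subset relational system.
W = {"a", "b", "c"}
B = {"a", "b"}                                  # A's belief worlds
f = {"a": {"a"}, "b": {"a", "c"}, "c": {"b"}}   # modal base: world -> accessible worlds

def might(prop):
    """Worlds that access at least one prop-world."""
    return {w for w in W if f[w] & prop}

def must(prop):
    """Worlds that access only prop-worlds."""
    return {w for w in W if f[w] <= prop}

def believes(prop):
    """Hintikkan truth conditions: every belief world is a prop-world."""
    return B <= prop

# The subset relational constraint: some belief world accesses only belief worlds.
constraint_ok = any(f[w] <= B for w in B)

p = {"a"}        # used for the 'might' case
q = {"a", "c"}   # used for the 'must' case

print(constraint_ok)
print(believes(might(p)), believes(must(might(p))))
print(believes(must(q)), believes(must(must(q))))
```

In this model the constraint is satisfied, yet the first sentence of each pair comes out true and the second false, so neither the 5-style nor the 4-style pattern is validated in the scope of the attitude verb. A different proposition is used for each case purely for convenience.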
(There may, of course, be other arguments that we want a stronger logic for epistemic modals than K; my present point is simply that the present discussion shows that Yalcin's embedding data do not provide such arguments.)

41 See Schultz 2010; Holliday and Icard 2017.

6 Conclusion

I have argued that we can gain new insight into controversies about the semantics of natural language expressions by taking an abstract perspective on the potential expressive power of different semantic theories. I have developed a formal framework to make these comparisons precise, and have used this framework to explore the relative potential expressive power of different semantics for epistemic modals. These comparisons show that, for any embedding operator which can be defined in the domain, update, or state-based semantics, a corresponding operator can be defined in a relational framework, but not conversely. This shows that the dialectic in this debate is roughly the opposite of what it has often been taken to be. On the one hand, the relational theory can do anything that these revisionary theories can do, showing that it is not defenders of the relational framework who must fight rearguard actions when new data are discovered, but rather defenders of the revisionary frameworks who must show that those frameworks have the expressive power to account for those data. But, on the other hand, the relative expressive weakness of the domain, update, and state-based semantics provides powerful pro tanto considerations in their favor. In the last part of the paper, I explored the comparison between the relational and domain frameworks in light of these results, focusing on the behavior of modals under attitudes.
I emphasized there the importance of methodological parsimony in choosing between semantic frameworks: the only way to see whether a given semantic framework has the expressive power to account for some new domain of data is by finding the weakest way the framework can do so, and then asking if the result makes implausible commitments. I have focused on epistemic modals here because they provide an apt illustration of the utility of the expressive comparisons I have developed. While I hope this discussion has advanced our understanding of the meaning of epistemic modals, my broader goal has been to develop a formal framework with widespread application in semantics. Semantic theory has often advanced thanks to results regarding expressive power, for instance in the theory of tense and temporal adverbs,42 or of generalized quantifiers.43 The framework for comparing relative potential expressibility introduced here makes precise a new kind of question we can ask about the expressive power of different semantic frameworks, and the characterization result proved shows how to straightforwardly answer those questions. There is much more work to do in exploring the applications of this framework, as well as exploring and extending the underlying formalism. Of particular interest is a set of questions about computational complexity: if $T \preceq_L T'$, that means that, for any operator we can give a semantics for in $T$, we can replicate that operator's logic in an extension of $T'$, but this does not tell us anything about the relationship between the complexity of $T'$ and the complexity of $T$. Ideally, we would like to know whether we can replicate all operators from possible extensions of $T$ without an upgrade in computational complexity. I believe this work will pay handsome dividends.
Potential expressibility results cannot on their own determine which of two semantic theories is correct, but they can clarify the dialectical relationship in which competing theories stand, thus clarifying what kinds of evidence we can expect to find for and against them based on embedding data, and making precise one sense in which a theory can be simpler than another.

42 See Kamp 1970; Cresswell 1990.

43 See Peters and Westerståhl (2008, Parts 3-4) for an overview.

A Definitions and proofs

In this appendix I give definitions of the technical terms used in §3, and provide proofs of the claims made there.

A.1 Definitions

Definition A.1. Languages, Models, Semantic theories: Given a propositional language $L$, built from a vocabulary comprising a set $\mathcal{A}$ of atomic sentences $p, q, r, \ldots$ and a set $\mathcal{O}$ of sentence operators, and comprising only (and typically all) (i) atoms from $\mathcal{A}$, and (ii) strings of the form $O^n(\langle \varphi_1, \varphi_2, \ldots, \varphi_n \rangle)$ for any $n$-place sentence operator $O^n \in \mathcal{O}$ and sentences $\varphi_i : 1 \leq i \leq n$ in $L$,44 a model $\mathcal{M}$ of $L$ is a sequence $\langle W, v, I, [\![\cdot]\!] \rangle$, where $W$ is a non-empty set of possible worlds (any set); $v$ is an atomic valuation function, which takes any atomic sentence of $L$ and any possible world from $W$ to either 1 ("true") or 0 ("false"), and which takes any atomic sentence of $L$ to the subset of $W$ where that sentence is true; and $I$ is a non-empty set of indices (again, any set). $[\![\cdot]\!]$ is an interpretation function for $L$ which takes an atomic sentence to a set of indices in the model; takes an $n$-place sentence operator $O^n$ to a function from an $n$-tuple of sets of indices to a set of indices; takes any sentence of the form $O^n(\langle \varphi_1, \varphi_2, \ldots, \varphi_n \rangle)$, for any $n$-place sentence operator $O^n \in \mathcal{O}$ and any $n$-tuple $\langle \varphi_1, \varphi_2, \ldots, \varphi_n \rangle$ of sentences of $L$, to $[\![O^n]\!](\langle [\![\varphi_1]\!], [\![\varphi_2]\!], \ldots, [\![\varphi_n]\!] \rangle)$; and which is otherwise undefined. For convenience, we also stipulate that $[\![\cdot]\!]$ takes any sentence $\varphi$ of $L$ and any index $i$ (written $[\![\varphi]\!]^i$) to 1 just in case $i \in [\![\varphi]\!]$, and otherwise to 0.
Finally, a semantic theory $T$ of $L$ is a non-empty set of models of $L$. Given a quantified language $L_v$, built from a vocabulary comprising a set $\mathcal{V}$ of variables $x_1, x_2, \ldots$, a set $\mathcal{R}$ of relation symbols, and a set $\mathcal{O}$ of sentence operators; and comprising only (and typically all) (i) atoms of the form $R^n(\langle x_1, x_2, \ldots, x_n \rangle)$, for $R^n$ an $n$-place relation in $\mathcal{R}$, and $x_i : 1 \leq i \leq n$ variables from $\mathcal{V}$; and (ii) strings of the form $O^n(\langle \varphi_1, \varphi_2, \ldots, \varphi_n \rangle)$ for any $n$-place sentence operator $O^n \in \mathcal{O}$ and any sentences $\varphi_i : 1 \leq i \leq n$ in $L_v$, a model of $L_v$ is just like a model of a propositional language, except it also includes a domain $D$ of individuals, and the atomic valuation function takes a possible world and an $n$-place relation symbol to an $n$-place relation (a subset of $D^n$, the set of $n$-tuples of elements of $D$). Interpretation functions and semantic theories are constructed as for propositional languages.

Definition A.2. Extension of a Language: Given a language $L$, and a set $\mathcal{O}$ of sentence operators disjoint from the vocabulary of $L$,45 the extension of $L$ to $\mathcal{O}$, written $L^{\mathcal{O}}$, is the smallest set containing $L$ and closed under the elements of $\mathcal{O}$, i.e. where $a$ is the function giving the arity of sentence operators, the smallest set containing $L$ such that if $O \in \mathcal{O}$, $a(O) = n$, and $l \in (L^{\mathcal{O}})^n$, then $O^n(l) \in L^{\mathcal{O}}$.

Definition A.3. Extension of a model and semantic theory: Given a language $L$, a model $\mathcal{M}$ of $L$, and an extension $L^{\mathcal{O}}$ of $L$, an extension $\mathcal{M}^{\mathcal{O}}$ of $\mathcal{M}$ to an interpretation of $L^{\mathcal{O}}$ with respect to $L$ is an interpretation which is exactly like $\mathcal{M}$ except with respect to its interpretation function, which must agree with $\mathcal{M}$'s interpretation function on sentences of $L$, i.e. $\forall \varphi \in L : [\![\varphi]\!]^{\mathcal{M}} \subseteq [\![\varphi]\!]^{\mathcal{M}^{\mathcal{O}}}$.
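Given a semantic theory $T$ of $L$, an extension $T^{\mathcal{O}}$ of $T$ to $L^{\mathcal{O}}$ is a semantic theory each of whose members extends some model in $T$ from $L$ to $L^{\mathcal{O}}$ such that any two models in $T^{\mathcal{O}}$ agree on the logic of $\mathcal{O}$.46

44 We will generally use lower-case italic letters to range over atoms, and Greek letters to range over all sentences.

45 I will call any set of operators which meets this novelty constraint a 'set of new operators'; I sometimes leave this novelty condition implicit for brevity in introducing extensions of languages.

46 I will usually leave the relativization to the initial language implicit. For any model $\mathcal{M}$, I write $[\![\cdot]\!]^{\mathcal{M}}$ for $\mathcal{M}$'s interpretation function, and likewise for its other parameters.

A.2 Proofs

For convenience, I repeat the definition of relative potential expressibility here; I then turn to proofs of the claims of §3:

Definition 3.1. Relative Potential Expressibility: For any semantic theories $T$ and $T'$ of a language $L$, $T \preceq_L T'$ iff, for any set of new operators $\mathcal{O}$, for any extension $T^{\mathcal{O}}$ of $T$ to $L^{\mathcal{O}}$, there is an extension $T'^{\mathcal{O}}$ of $T'$ to $L^{\mathcal{O}}$ which agrees with $T^{\mathcal{O}}$ on the logic of $\mathcal{O}$: that is, which is such that, for any $\Gamma \subseteq L^{\mathcal{O}}$ such that $\exists \varphi \in \Gamma : \varphi \in L^{\mathcal{O}} \setminus L$, and for any sentence $\psi$ in $L^{\mathcal{O}}$, $(\Gamma \vDash_{T^{\mathcal{O}}} \psi) \leftrightarrow (\Gamma \vDash_{T'^{\mathcal{O}}} \psi)$. For a set of sentences $\Gamma$, sentence $\psi$, and semantic theory $T$, $\Gamma \vDash_T \psi$ iff $\Gamma \vDash_{\mathcal{M}} \psi$ for every model $\mathcal{M} \in T$, where $\Gamma \vDash_{\mathcal{M}} \psi$ iff every index in $\mathcal{M}$ which makes all the sentences in $\Gamma$ true also makes $\psi$ true.

We also define a derivative notion of relative expressibility between models: for any models $\mathcal{M}$ and $\mathcal{M}'$ of a language $L$, $\mathcal{M} \preceq_L \mathcal{M}'$ iff, for any set of new operators $\mathcal{O}$, for any extension $\mathcal{M}^{\mathcal{O}}$ of $\mathcal{M}$ to $L^{\mathcal{O}}$, there is an extension $\mathcal{M}'^{\mathcal{O}}$ of $\mathcal{M}'$ to $L^{\mathcal{O}}$ which preserves the logic of $\mathcal{O}$ from $\mathcal{M}^{\mathcal{O}}$: that is, which is such that, for any $\Gamma \subseteq L^{\mathcal{O}}$ such that $\exists \varphi \in \Gamma : \varphi \in L^{\mathcal{O}} \setminus L$, and for any sentence $\psi$ in $L^{\mathcal{O}}$, $(\Gamma \vDash_{\mathcal{M}^{\mathcal{O}}} \psi) \leftrightarrow (\Gamma \vDash_{\mathcal{M}'^{\mathcal{O}}} \psi)$.

The proof of Fact 3.1 goes by way of two lemmas:

Lemma A.1.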
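For any models $\mathcal{M}$ and $\mathcal{M}'$ of a language $L$, $\mathcal{M} \preceq_L \mathcal{M}'$ iff for any extension $L^{\mathcal{O}}$ of $L$ with a set of new operators, and any extension $\mathcal{M}^{\mathcal{O}}$ of $\mathcal{M}$ to a model of $L^{\mathcal{O}}$, there is an extension $\mathcal{M}'^{\mathcal{O}}$ of $\mathcal{M}'$ to $L^{\mathcal{O}}$ such that there exists a function $g$ from the indices of $\mathcal{M}$ to the indices of $\mathcal{M}'$ such that for any sentence $\varphi$ of $L^{\mathcal{O}}$ and index $i$ in $\mathcal{M}$, $\varphi$ is true at $i$ in $\mathcal{M}^{\mathcal{O}}$ iff $\varphi$ is true at $g(i)$ in $\mathcal{M}'^{\mathcal{O}}$.

Proof. [⇒] For arbitrary models $\mathcal{M}$ and $\mathcal{M}'$ of an arbitrary language $L$, suppose $\mathcal{M} \preceq_L \mathcal{M}'$. Recall that this means that, for any set of new operators $\mathcal{O}$ and any extension $\mathcal{M}^{\mathcal{O}}$ of $\mathcal{M}$ to $L^{\mathcal{O}}$, there is an extension $\mathcal{M}'^{\mathcal{O}}$ of $\mathcal{M}'$ to $L^{\mathcal{O}}$ which preserves the logic of $\mathcal{O}$ from $\mathcal{M}^{\mathcal{O}}$. Consider an arbitrary set of new operators $\mathcal{O}$ and arbitrary extension $\mathcal{M}^{\mathcal{O}}$ of $\mathcal{M}$ to $L^{\mathcal{O}}$. We will show that there is an extension $\mathcal{M}'^{\mathcal{O}}_{\mathcal{N}}$ of $\mathcal{M}'$ to $L^{\mathcal{O}}$ such that there is a function $g$ from the indices of $\mathcal{M}$ to the indices of $\mathcal{M}'$ such that for any sentence $\varphi$ of $L^{\mathcal{O}}$ and index $i$ in $\mathcal{M}$, $\varphi$ is true at $i$ in $\mathcal{M}^{\mathcal{O}}$ iff $\varphi$ is true at $g(i)$ in $\mathcal{M}'^{\mathcal{O}}_{\mathcal{N}}$. We do so by way of considering first a different extended language which contains $L^{\mathcal{O}}$, and a different extension of $\mathcal{M}$ which also extends $\mathcal{M}^{\mathcal{O}}$. In particular, consider the extension of $L$ to $\mathcal{O} \cup \mathcal{N}$, where $\mathcal{O} \cap \mathcal{N} = \emptyset$, no operator in $\mathcal{N}$ is in the vocabulary of $L$, and the cardinality of $\mathcal{N}$ is greater than the cardinality of the set of indices of $\mathcal{M}$ (if that set is finite) or equal in cardinality to the set of indices of $\mathcal{M}$ (if that set is infinite). Now extend $\mathcal{M}$ to a new model $\mathcal{M}^{\mathcal{O}}_{\mathcal{N}}$ of the resulting language, $L^{\mathcal{O} \cup \mathcal{N}}$, with the following properties: (a) $\mathcal{M}^{\mathcal{O}}_{\mathcal{N}}$ is an extension of $\mathcal{M}^{\mathcal{O}}$, so that $\forall \varphi \in L^{\mathcal{O}} : [\![\varphi]\!]^{\mathcal{M}^{\mathcal{O}}} = [\![\varphi]\!]^{\mathcal{M}^{\mathcal{O}}_{\mathcal{N}}}$, and the indices of $\mathcal{M}^{\mathcal{O}}_{\mathcal{N}}$ are the same as the indices of $\mathcal{M}^{\mathcal{O}}$; (b) for some sentence $\psi \in L^{\mathcal{O} \cup \mathcal{N}}$, for each index $i$ of $\mathcal{M}$, there is an $O \in \mathcal{N}$, call it $O_i$, which uniquely specifies $i$, in the sense that $O_i(\psi)$ is true at $i$ in $\mathcal{M}^{\mathcal{O}}_{\mathcal{N}}$ and false everywhere else in $\mathcal{M}^{\mathcal{O}}_{\mathcal{N}}$; and (c) for some unary sentence operator $\neg \in \mathcal{N}$, $\neg$ is given the classical semantics of negation in $\mathcal{M}^{\mathcal{O}}_{\mathcal{N}}$, i.e. for any $\varphi \in L^{\mathcal{O} \cup \mathcal{N}}$, $\neg(\varphi)$ is true in $\mathcal{M}^{\mathcal{O}}_{\mathcal{N}}$ at $i$ iff $\varphi$ is not true in $\mathcal{M}^{\mathcal{O}}_{\mathcal{N}}$ at $i$.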
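Now extend $\mathcal{M}'$ to a model $\mathcal{M}'^{\mathcal{O}}_{\mathcal{N}}$ of $L^{\mathcal{O} \cup \mathcal{N}}$ which preserves the logic of $\mathcal{O} \cup \mathcal{N}$ from $\mathcal{M}^{\mathcal{O}}_{\mathcal{N}}$; we know this will be possible by our assumption that $\mathcal{M} \preceq_L \mathcal{M}'$. Now, define a function $g$ such that, for any index $i$ in $\mathcal{M}^{\mathcal{O}}_{\mathcal{N}}$, $g$ takes $i$ to an index $g(i)$ in $\mathcal{M}'^{\mathcal{O}}_{\mathcal{N}}$ such that (i) $O_i(\psi)$ is true at $g(i)$ in $\mathcal{M}'^{\mathcal{O}}_{\mathcal{N}}$ and (ii) some sentence in $L^{\mathcal{O} \cup \mathcal{N}}$ is false at $g(i)$. We know there is such an index; otherwise, we would have $O_i(\psi) \vDash_{\mathcal{M}'^{\mathcal{O}}_{\mathcal{N}}} \neg(O_i(\psi))$, and thus $O_i(\psi) \vDash_{\mathcal{M}^{\mathcal{O}}_{\mathcal{N}}} \neg(O_i(\psi))$, but we know the latter is false by our semantics for $O_i$ and $\neg$ in $\mathcal{M}^{\mathcal{O}}_{\mathcal{N}}$. Now for any sentence $\varphi$ of $L^{\mathcal{O} \cup \mathcal{N}}$:

• Suppose first $\varphi$ is true at $i$ in $\mathcal{M}^{\mathcal{O}}_{\mathcal{N}}$. Then $O_i(\psi) \vDash_{\mathcal{M}^{\mathcal{O}}_{\mathcal{N}}} \varphi$, and thus by our assumption that $O_i$ has the same logic in $\mathcal{M}'^{\mathcal{O}}_{\mathcal{N}}$ as in $\mathcal{M}^{\mathcal{O}}_{\mathcal{N}}$, $O_i(\psi) \vDash_{\mathcal{M}'^{\mathcal{O}}_{\mathcal{N}}} \varphi$, and thus $\varphi$ is true at $g(i)$, since $O_i(\psi)$ is true at $g(i)$.

• Suppose next that $\varphi$ is not true at $i$ in $\mathcal{M}^{\mathcal{O}}_{\mathcal{N}}$. Then $\neg\varphi$ is true at $i$ in $\mathcal{M}^{\mathcal{O}}_{\mathcal{N}}$, and thus $O_i(\psi) \vDash_{\mathcal{M}^{\mathcal{O}}_{\mathcal{N}}} \neg\varphi$ and thus $O_i(\psi) \vDash_{\mathcal{M}'^{\mathcal{O}}_{\mathcal{N}}} \neg\varphi$, and thus $\neg\varphi$ is true at $g(i)$ in $\mathcal{M}'^{\mathcal{O}}_{\mathcal{N}}$, since $O_i(\psi)$ is true at $g(i)$. Thus we can conclude that $\varphi$ is not true at $g(i)$ in $\mathcal{M}'^{\mathcal{O}}_{\mathcal{N}}$; if it were, since our logic for negation is classical in $\mathcal{M}^{\mathcal{O}}_{\mathcal{N}}$, and thus in $\mathcal{M}'^{\mathcal{O}}_{\mathcal{N}}$, we would have that everything is true at $g(i)$ in $\mathcal{M}'^{\mathcal{O}}_{\mathcal{N}}$, contrary to assumption.

Thus for any $\varphi$ in $L^{\mathcal{O} \cup \mathcal{N}}$, we have $\varphi$ true at $i$ in $\mathcal{M}^{\mathcal{O}}_{\mathcal{N}}$ iff $\varphi$ is true at $g(i)$ in $\mathcal{M}'^{\mathcal{O}}_{\mathcal{N}}$; thus in particular, for any $\varphi$ in $L^{\mathcal{O}}$, which is a subset of $L^{\mathcal{O} \cup \mathcal{N}}$, $\varphi$ is true at $i$ in $\mathcal{M}^{\mathcal{O}}_{\mathcal{N}}$ iff $\varphi$ is true at $g(i)$ in $\mathcal{M}'^{\mathcal{O}}_{\mathcal{N}}$; and, since $\mathcal{M}^{\mathcal{O}}_{\mathcal{N}}$ is an extension of $\mathcal{M}^{\mathcal{O}}$, it follows that for any $\varphi$ in $L^{\mathcal{O}}$, $\varphi$ is true at $i$ in $\mathcal{M}^{\mathcal{O}}$ iff $\varphi$ is true at $g(i)$ in $\mathcal{M}'^{\mathcal{O}}_{\mathcal{N}}$. Since $\mathcal{O}$ and $\mathcal{M}^{\mathcal{O}}$ were selected arbitrarily, this shows that, for any extension $L^{\mathcal{O}}$ of $L$ and extension $\mathcal{M}^{\mathcal{O}}$ of $\mathcal{M}$, we can find an extension of $\mathcal{M}'$ to $L^{\mathcal{O}}$ with the property that there is a function from the indices of $\mathcal{M}$ to those of $\mathcal{M}'$ which preserves truth and falsity for the sentences of $L^{\mathcal{O}}$ in the extended models.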
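[⇐] For arbitrary models $\mathcal{M}$ and $\mathcal{M}'$ of an arbitrary language $L$, suppose that, for any arbitrary extension $L^{\mathcal{O}}$ of $L$ with a set of sentence operators, and any arbitrary extension $\mathcal{M}^{\mathcal{O}}$ of $\mathcal{M}$ to a model of $L^{\mathcal{O}}$, there is an extension $\mathcal{M}'^{\mathcal{O}}$ of $\mathcal{M}'$ to $L^{\mathcal{O}}$ such that there exists a function $g$ from the indices of $\mathcal{M}$ to the indices of $\mathcal{M}'$ such that for any sentence $\varphi \in L^{\mathcal{O}}$ and index $i$ in $\mathcal{M}$, $\varphi$ is true at $i$ in $\mathcal{M}^{\mathcal{O}}$ iff $\varphi$ is true at $g(i)$ in $\mathcal{M}'^{\mathcal{O}}$. We can use this fact to construct a new extension $\mathcal{M}'^{\mathcal{O}-}$ of $\mathcal{M}'$ which matches the logic of $\mathcal{O}$ in $\mathcal{M}^{\mathcal{O}}$, as follows. Let $\mathcal{M}'^{\mathcal{O}-}$ be just like $\mathcal{M}'^{\mathcal{O}}$, except that, at every index $i$ of $\mathcal{M}'^{\mathcal{O}}$ which is not in the image of $g$, for any sentence $\varphi \in L^{\mathcal{O}} \setminus L$, $\varphi$ is false at $i$ in $\mathcal{M}'^{\mathcal{O}-}$ (we can do this because the truth of $\varphi$ at $i$ will depend just on the semantics we give to our new operators, since if $\varphi$ is in $L^{\mathcal{O}} \setminus L$, it must by definition of $L^{\mathcal{O}}$ have an operator from $\mathcal{O}$ with highest scope in the sentence). Note that $\mathcal{M}'^{\mathcal{O}-}$ is still an extension of $\mathcal{M}'$ to $L^{\mathcal{O}}$; and $g$ will still preserve truth for the relevant sentences: since we did not change the truth of any sentences in the image of $g$, for any $\varphi \in L^{\mathcal{O}}$, $\varphi$ is true$_{\mathcal{M}^{\mathcal{O}}}$ at $i$ iff $\varphi$ is true$_{\mathcal{M}'^{\mathcal{O}-}}$ at $g(i)$.47 Now suppose that $\Gamma \vDash_{\mathcal{M}^{\mathcal{O}}} \psi$ for $(\Gamma \cup \{\psi\}) \subseteq L^{\mathcal{O}}$, and that $\exists \varphi \in \Gamma : \varphi \in L^{\mathcal{O}} \setminus L$. Then by the fact that $g$ preserves truth for sentences of $L^{\mathcal{O}}$ between $\mathcal{M}^{\mathcal{O}}$ and $\mathcal{M}'^{\mathcal{O}-}$, we have that, within the image of $g$, $\psi$ is true$_{\mathcal{M}'^{\mathcal{O}-}}$ everywhere that all the members of $\Gamma$ are; and by construction we have that one member of $\Gamma$, namely $\varphi$, is false$_{\mathcal{M}'^{\mathcal{O}-}}$ everywhere outside of the image of $g$; and so $\Gamma \vDash_{\mathcal{M}'^{\mathcal{O}-}} \psi$. Likewise suppose that $\Gamma \nvDash_{\mathcal{M}^{\mathcal{O}}} \psi$; then there is some $i$ where all of $\Gamma$ is true$_{\mathcal{M}^{\mathcal{O}}}$ and $\psi$ is false$_{\mathcal{M}^{\mathcal{O}}}$, and so at $g(i)$ all of $\Gamma$ is true$_{\mathcal{M}'^{\mathcal{O}-}}$ while $\psi$ is false$_{\mathcal{M}'^{\mathcal{O}-}}$, and thus we have $\Gamma \nvDash_{\mathcal{M}'^{\mathcal{O}-}} \psi$. Thus we have $(\Gamma \vDash_{\mathcal{M}^{\mathcal{O}}} \psi) \leftrightarrow (\Gamma \vDash_{\mathcal{M}'^{\mathcal{O}-}} \psi)$. Since this construction was perfectly general, it shows that, under our assumption, for any extension $\mathcal{M}^{\mathcal{O}}$ of $\mathcal{M}$ to an extension $L^{\mathcal{O}}$ of $L$, we can find an extension of $\mathcal{M}'$ to $L^{\mathcal{O}}$ which matches the logic of $\mathcal{O}$ in $\mathcal{M}^{\mathcal{O}}$, and so $\mathcal{M} \preceq_L \mathcal{M}'$.

We turn to our second lemma:

Lemma A.2.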
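Characterization of Model Expressibility: For any models $\mathcal{M}$ and $\mathcal{M}'$ and language $L$, $\mathcal{M} \preceq_L \mathcal{M}'$ iff there is a function $g$ (call it a witness function with respect to $L$) from the indices of $\mathcal{M}$ to those of $\mathcal{M}'$ which is such that (i) for any sentence $\varphi$ of $L$ and index $i$ in $\mathcal{M}$, $\varphi$ is true at $i$ in $\mathcal{M}$ iff $\varphi$ is true at $g(i)$ in $\mathcal{M}'$; and (ii) $g$ is an injection.

Proof. [⇒] We argue by contraposition. Suppose for arbitrary $\mathcal{M}$, $\mathcal{M}'$ and $L$, there is no function $g$ from the indices of $\mathcal{M}$ to those of $\mathcal{M}'$ which is such that (i) for any sentence $\varphi$ of $L$ and index $i$ in $\mathcal{M}$, $\varphi$ is true at $i$ in $\mathcal{M}$ iff $\varphi$ is true at $g(i)$ in $\mathcal{M}'$; and (ii) $g$ is an injection. Find a set of new operators $\mathcal{O}$ with cardinality equal to the set of indices in $\mathcal{M}$. Let $f$ be a bijection from the indices of $\mathcal{M}$ to $\mathcal{O}$. Extend $\mathcal{M}$ to a new model $\mathcal{M}^{\mathcal{O}}$ of $L^{\mathcal{O}}$, with the property that, for any index $i$ in $\mathcal{M}$, for any sentence $\varphi \in L^{\mathcal{O}}$, $f(i)(\varphi)$ is true$_{\mathcal{M}^{\mathcal{O}}}$ at $i$ and false$_{\mathcal{M}^{\mathcal{O}}}$ at every other index of $\mathcal{M}$; that is, $f(i)$ "tags" $i$ in $\mathcal{M}^{\mathcal{O}}$. Now consider an arbitrary extension $\mathcal{M}'^{\mathcal{O}}$ of $\mathcal{M}'$. Suppose there is a function $g$ from the indices of $\mathcal{M}$ to the indices of $\mathcal{M}'$ with the property that, for all $\varphi \in L^{\mathcal{O}}$, $\varphi$ is true$_{\mathcal{M}^{\mathcal{O}}}$ at $i$ iff $\varphi$ is true$_{\mathcal{M}'^{\mathcal{O}}}$ at $g(i)$. Since $L \subset L^{\mathcal{O}}$, and since extensions of models of a given language preserve truth for sentences in the original language, we know that, for all $\varphi \in L$, $\varphi$ is true$_{\mathcal{M}}$ at $i$ iff $\varphi$ is true$_{\mathcal{M}'}$ at $g(i)$. Then it follows from our assumption that $g$ is not an injection: for some $\mathcal{M}$-indices $i$ and $i'$ with $i \neq i'$, $g(i) = g(i')$. Choose some $\varphi \in L^{\mathcal{O}}$. We know by construction of $\mathcal{M}^{\mathcal{O}}$ that $f(i)(\varphi)$ is true$_{\mathcal{M}^{\mathcal{O}}}$ at $i$ and false$_{\mathcal{M}^{\mathcal{O}}}$ at $i'$. But, since $g(i) = g(i')$, $f(i)(\varphi)$ will either be true$_{\mathcal{M}'^{\mathcal{O}}}$ at both $g(i)$ and $g(i')$, or false at both, and thus it will not be the case that, for every sentence $\varphi$ of $L^{\mathcal{O}}$ and every index $i$, $\varphi$ is true at $i$ in $\mathcal{M}^{\mathcal{O}}$ iff $\varphi$ is true at $g(i)$ in $\mathcal{M}'^{\mathcal{O}}$, contrary to assumption. Thus, since $\mathcal{M}'^{\mathcal{O}}$ was chosen arbitrarily, there is no extension of $\mathcal{M}'$ to $L^{\mathcal{O}}$ such that there is a function which preserves truth and falsity for all sentences in $L^{\mathcal{O}}$ between $\mathcal{M}^{\mathcal{O}}$ and $\mathcal{M}'^{\mathcal{O}}$; and thus by Lemma A.1, $\mathcal{M} \not\preceq_L \mathcal{M}'$.

47 'True$_{\mathcal{M}}$' is shorthand for 'true in $\mathcal{M}$'.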
[⇐] Suppose, for arbitrary $\mathcal{M}$, $\mathcal{M}'$ and $L$, there is such a truth-preserving injection $g$. Given an arbitrary set $\mathcal{O}$ of sentence operators and an arbitrary extension $\mathcal{M}^{\mathcal{O}}$ of $\mathcal{M}$ to $L^{\mathcal{O}}$, we show there is an extension $\mathcal{M}'^{\mathcal{O}}$ of $\mathcal{M}'$ to $L^{\mathcal{O}}$ which has the property that, for any sentence $\varphi \in L^{\mathcal{O}}$, $\varphi$ is true at an index $i$ in $\mathcal{M}^{\mathcal{O}}$ just in case $\varphi$ is true at $g(i)$ in $\mathcal{M}'^{\mathcal{O}}$. Let $K$ index the elements of $\mathcal{O}$. For each $O_k : k \in K$, extend $\mathcal{M}$ to the model $\mathcal{M}_k$ which is just like $\mathcal{M}$, except its interpretation function $[\![\cdot]\!]^{\mathcal{M}_k}$ is augmented with the semantic rule for $O_k$ from $[\![\cdot]\!]^{\mathcal{M}^{\mathcal{O}}}$. Then, for each $O_k$, extend $\mathcal{M}'$ to the model $\mathcal{M}'_k$ which augments the interpretation function of $\mathcal{M}'$ with a semantic rule for $O_k$ as follows. For brevity, for any set $\alpha$ and function $f$, define $f[\alpha]$ to be the pointwise application of $f$ to $\alpha$ where defined, i.e. $f[\alpha] = \{f(a) : a \in \alpha \wedge f(a) \text{ is defined}\}$. Let $g^{-1}$ be the inverse of $g$, defined only on the image of $g$; that $g^{-1}$ is a well-defined function follows because $g$ is an injection. Now, suppose first that $O_k$ is a unary sentence operator; then let $\mathcal{M}'_k$ extend $\mathcal{M}'$ with the following semantic rule:

$[\![O_k]\!]^{\mathcal{M}'_k} = \lambda s_{\mathcal{M}'} . g[[\![O_k]\!]^{\mathcal{M}_k}(g^{-1}[s])]$, where $s_{\mathcal{M}'}$ ranges over sets of $\mathcal{M}'$-indices.

Thus in $\mathcal{M}'_k$, $O_k$ takes a set of $\mathcal{M}'$-indices; then finds the pre-image (where defined) of this set with respect to $g$; then applies the semantic rule for $O_k$ in $\mathcal{M}_k$ to this pre-image; and finally, returns the pointwise application of $g$ to the resulting set. Now note that, for any set $s$ of $\mathcal{M}$-indices and set $s'$ of $\mathcal{M}'$-indices, if $i \in s \leftrightarrow g(i) \in s'$ for every $\mathcal{M}$-index $i$, it follows that $i \in [\![O_k]\!]^{\mathcal{M}_k}(s) \leftrightarrow g(i) \in [\![O_k]\!]^{\mathcal{M}'_k}(s')$. To see this, assume for arbitrary $s$ and $s'$ that $i \in s \leftrightarrow g(i) \in s'$. Now note that $s = g^{-1}[s']$: if $i \in s$, then by assumption $g(i) \in s'$, and thus $g^{-1}[s']$ will include $i$, by construction; and if $i \notin s$, then by assumption $g(i) \notin s'$, and since $g^{-1}$ is an injection, by construction, we know that $i \notin g^{-1}[s']$. We thus have $[\![O_k]\!]^{\mathcal{M}'_k}(s') = g[[\![O_k]\!]^{\mathcal{M}_k}(g^{-1}[s'])] = g[[\![O_k]\!]^{\mathcal{M}_k}(s)]$.
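The transfer rule for a unary operator described above can be sketched in a few lines of Python. This is a minimal illustration, assuming finite index sets, representing the injection $g$ as a dictionary and an operator as a function on frozensets of indices; the names `transfer_operator`, `complement` and the toy indices are hypothetical and not drawn from the paper.

```python
def transfer_operator(op, g):
    """Given a unary operator `op` on sets of M-indices and an injection `g`
    (a dict from M-indices to M'-indices), return the operator on sets of
    M'-indices defined by s' |-> g[op(g^{-1}[s'])]."""
    g_inv = {image: index for index, image in g.items()}
    if len(g_inv) != len(g):
        raise ValueError("g must be an injection")

    def op_prime(s_prime):
        # Pre-image of s' under g, ignoring indices outside the image of g.
        preimage = frozenset(g_inv[j] for j in s_prime if j in g_inv)
        # Apply the original rule, then push the result forward along g.
        return frozenset(g[i] for i in op(preimage))

    return op_prime

# Toy source model with three indices; the operator is set complement.
M_INDICES = frozenset({0, 1, 2})

def complement(s):
    return M_INDICES - s

# The target model is assumed to contain a further index "d" outside the image of g.
g = {0: "a", 1: "b", 2: "c"}
op_prime = transfer_operator(complement, g)

s = frozenset({0})
s_prime = frozenset({"a", "d"})   # i in s iff g(i) in s_prime, for every i
result = op_prime(s_prime)
matches = all((i in complement(s)) == (g[i] in result) for i in M_INDICES)
print(sorted(result), matches)
```

Here `s` and `s_prime` agree along `g`, and the transferred operator returns exactly the image under `g` of the original operator's output, which is the property the proof goes on to use.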
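In other words, whenever $i \in s \leftrightarrow g(i) \in s'$, then $[\![O_k]\!]^{\mathcal{M}'_k}(s')$ is just the pointwise application of $g$ to $[\![O_k]\!]^{\mathcal{M}_k}(s)$, and thus, since $g$ is an injection, $i \in [\![O_k]\!]^{\mathcal{M}_k}(s) \leftrightarrow g(i) \in [\![O_k]\!]^{\mathcal{M}'_k}(s')$. The generalization of this construction to $n$-place sentence operators, for any $n$, is straightforward. We use this method to construct $\mathcal{M}'_k$ for all $k \in K$. Now, where $\mathcal{M}' = \langle W, I, v, [\![\cdot]\!]^{\mathcal{M}'}\rangle$ or $\langle D, W, I, v, [\![\cdot]\!]^{\mathcal{M}'}\rangle$, let $\mathcal{M}'^{\mathcal{O}} = \langle W, I, v, \bigcup_{k \in K} [\![\cdot]\!]^{\mathcal{M}'_k}\rangle$ or $\langle D, W, I, v, \bigcup_{k \in K} [\![\cdot]\!]^{\mathcal{M}'_k}\rangle$, respectively. By our construction, we know that for any $O \in \mathcal{O}$ and any sets of $\mathcal{M}$-indices $s$ and $\mathcal{M}'$-indices $s'$ such that $i \in s \leftrightarrow g(i) \in s'$, $[\![O]\!]^{\mathcal{M}'^{\mathcal{O}}}(s') = g[[\![O]\!]^{\mathcal{M}^{\mathcal{O}}}(s)]$, and thus $i \in [\![O]\!]^{\mathcal{M}^{\mathcal{O}}}(s) \leftrightarrow g(i) \in [\![O]\!]^{\mathcal{M}'^{\mathcal{O}}}(s')$. We know by assumption that, for all sentences $\varphi \in L$, $i \in [\![\varphi]\!]^{\mathcal{M}} \leftrightarrow g(i) \in [\![\varphi]\!]^{\mathcal{M}'}$, and thus (since extending a model never changes its interpretation of a sentence already in the language of the original model) $i \in [\![\varphi]\!]^{\mathcal{M}^{\mathcal{O}}} \leftrightarrow g(i) \in [\![\varphi]\!]^{\mathcal{M}'^{\mathcal{O}}}$. Now consider any sequence of sentences $\vec{\psi}$ with the property that for each $\psi_j$ in the sequence, $i \in [\![\psi_j]\!]^{\mathcal{M}^{\mathcal{O}}} \leftrightarrow g(i) \in [\![\psi_j]\!]^{\mathcal{M}'^{\mathcal{O}}}$. Then we know that, by our construction, for any $k \in K$ and index $i$ of $\mathcal{M}$, $i \in [\![O_k(\vec{\psi})]\!]^{\mathcal{M}^{\mathcal{O}}} \leftrightarrow g(i) \in [\![O_k(\vec{\psi})]\!]^{\mathcal{M}'^{\mathcal{O}}}$. Since the sentences of $L^{\mathcal{O}}$ are built recursively from the sentences of $L$ and the operators in $\mathcal{O}$, it follows by an induction on the complexity of formulae that, for any $\varphi \in L^{\mathcal{O}}$, $i \in [\![\varphi]\!]^{\mathcal{M}^{\mathcal{O}}} \leftrightarrow g(i) \in [\![\varphi]\!]^{\mathcal{M}'^{\mathcal{O}}}$. Since $\mathcal{O}$ and $\mathcal{M}^{\mathcal{O}}$ were chosen arbitrarily, we conclude that, for any set of new operators $\mathcal{O}$ and any extension $\mathcal{M}^{\mathcal{O}}$ of $\mathcal{M}$ to $L^{\mathcal{O}}$, there is an extension $\mathcal{M}'^{\mathcal{O}}$ of $\mathcal{M}'$ to $L^{\mathcal{O}}$ such that there is a function $g$ with the property that, for any sentence $\varphi \in L^{\mathcal{O}}$, $\varphi$ is true at an index $i$ in $\mathcal{M}^{\mathcal{O}}$ just in case $\varphi$ is true at $g(i)$ in $\mathcal{M}'^{\mathcal{O}}$; and thus by Lemma A.1, $\mathcal{M} \preceq_L \mathcal{M}'$.

We turn now to our proof of Fact 3.1:

Fact 3.1.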
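Characterization of Expressibility: For any semantic theories $T$ and $T'$ and language $L$, if $T$ is isomorphic with respect to $L$, and $T'$ is isomorphic with respect to $L$, then $T \preceq_L T'$ iff there is a model $\mathcal{M} \in T$ and a model $\mathcal{M}' \in T'$ such that there is a witness function $g$ from the indices of $\mathcal{M}$ to those of $\mathcal{M}'$ with respect to $L$. A semantic theory $T$ is isomorphic with respect to $L$ iff $\forall \mathcal{M}, \mathcal{M}' \in T : \mathcal{M} \preceq_L \mathcal{M}' \wedge \mathcal{M}' \preceq_L \mathcal{M}$.

Proof. For arbitrary semantic theories $T$, $T'$, and language $L$, suppose that $T$ and $T'$ are both isomorphic with respect to $L$.

[⇒] Suppose there is no pair of models $\mathcal{M} \in T$ and $\mathcal{M}' \in T'$ such that there is a witness function $g$ from the indices of $\mathcal{M}$ to those of $\mathcal{M}'$. It follows by Lemma A.2 that for any models $\mathcal{M} \in T$ and $\mathcal{M}' \in T'$, $\mathcal{M} \npreceq_L \mathcal{M}'$. Choose arbitrary models $\mathcal{M} \in T$ and $\mathcal{M}' \in T'$, and find a set of operators $\mathcal{O}$ and extension $\mathcal{M}^{\mathcal{O}}$ of $\mathcal{M}$ to $L^{\mathcal{O}}$ such that there is no extension of $\mathcal{M}'$ which matches the logic of $\mathcal{O}$ in $\mathcal{M}^{\mathcal{O}}$. We know this will be possible since otherwise $\mathcal{M} \preceq_L \mathcal{M}'$. We can show moreover that no model $\mathcal{M}'' \in T'$ can be extended to match the logic of $\mathcal{O}$ in $\mathcal{M}^{\mathcal{O}}$; else, since $T'$ is isomorphic, we could extend $\mathcal{M}'$ to match the logic of $\mathcal{O}$ in $\mathcal{M}^{\mathcal{O}}$, contrary to assumption. Consider any extension $T^{\mathcal{O}}$ of $T$ to $L^{\mathcal{O}}$ which includes $\mathcal{M}^{\mathcal{O}}$ and any extension $T'^{\mathcal{O}}$ of $T'$ to $L^{\mathcal{O}}$. If these agreed on the logic of $\mathcal{O}$, then, since all the models within each theory agree with each other on the logic of $\mathcal{O}$, every model in $T'^{\mathcal{O}}$ would agree with every model in $T^{\mathcal{O}}$ on the logic of $\mathcal{O}$, contrary to our assumption that no extension of any model in $T'$ matches the logic of $\mathcal{O}$ in $\mathcal{M}^{\mathcal{O}}$; so $T'^{\mathcal{O}}$ does not agree with $T^{\mathcal{O}}$ on the logic of $\mathcal{O}$. Since $T'^{\mathcal{O}}$ was chosen arbitrarily, it follows that no extension of $T'$ agrees with $T^{\mathcal{O}}$ on the logic of $\mathcal{O}$; and so we have $T \npreceq_L T'$.

[⇐] Suppose there is a model $\mathcal{M} \in T$ and a model $\mathcal{M}' \in T'$ such that there is a witness function $g$ from the indices of $\mathcal{M}$ to those of $\mathcal{M}'$. It follows by Lemma A.2 that $\mathcal{M} \preceq_L \mathcal{M}'$. Consider any extension $L^{\mathcal{O}}$ of $L$ with a new set of operators $\mathcal{O}$, and extension $\mathcal{M}^{\mathcal{O}}$ of $\mathcal{M}$ to a model of $L^{\mathcal{O}}$.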
Find an extension $\mathcal{M}'^{\mathcal{O}}$ of $\mathcal{M}'$ to $L^{\mathcal{O}}$ which matches the logic of $\mathcal{O}$ from $\mathcal{M}^{\mathcal{O}}$. Now extend every model in $T'$ other than $\mathcal{M}'$ to match the logic of $\mathcal{O}$ in $\mathcal{M}'^{\mathcal{O}}$; we know this will be possible because $T'$ is isomorphic with respect to $L$. Call the resulting set of models $T'^{\mathcal{O}}$. For any way of completing the extension of $T$ to a new semantic theory $T^{\mathcal{O}}$ of $L^{\mathcal{O}}$, all the models in the extension will agree with $\mathcal{M}^{\mathcal{O}}$ on the logic of $\mathcal{O}$, by definition of an extension, and so $T^{\mathcal{O}}$ and $T'^{\mathcal{O}}$ will agree on the logic of $\mathcal{O}$. Hence $T \preceq_L T'$.

Fact 3.2. $\mathcal{D} \preceq_{L^{\Diamond}} \mathcal{R}$.

Proof. Recall that our language $L^{\Diamond}$ contains an infinite set of atoms $p, q, r, \ldots$ closed under the one-place sentence operator $\Diamond$, and that we assume that all of our semantic theories are sets of models whose sets of worlds and valuation functions are such that, in any model, any two worlds differ on the truth of some atom, and such that for any combination of atoms, exactly that set of atoms is true at some world, according to that model's valuation function. The relational semantics $\mathcal{R}$ is the class of models $r$ of the form $\langle W_r, v_r, I_r, [\![\cdot]\!]^{r}\rangle$, where $I_r = \{\langle f, w\rangle : f : W_r \to \wp(W_r) \wedge w \in W_r\}$ and $[\![\cdot]\!]^{r}$ is defined as specified in §3; likewise the domain semantics $\mathcal{D}$ is the class of models $d$ of the form $\langle W_d, v_d, I_d, [\![\cdot]\!]^{d}\rangle$, where $I_d = \{\langle s, w\rangle : s \subseteq W_d \wedge w \in W_d\}$, with the interpretation function again specified as above. $\mathcal{D}$ is clearly isomorphic with respect to $L^{\Diamond}$, as is $\mathcal{R}$, so by Fact 3.1 it suffices to show that there is a $d \in \mathcal{D}$ and an $r \in \mathcal{R}$ s.t. $d \preceq_{L^{\Diamond}} r$. Choose $d$ arbitrarily and let $r$ be any model in $\mathcal{R}$ built on the same set of worlds and valuation function as $d$. Let $g$ be a function $I_d \to I_r$ as follows. For any index $\langle s, w\rangle \in I_d$, let $g(\langle s, w\rangle) = \langle f_s, w\rangle$, where $f_s$ is the constant function from worlds to $s$.
For any atomic sentence $p$ of $L^{\Diamond}$, $p$ will be true$_d$ at $i$ iff $p$ is true$_r$ at $g(i)$, since we are assuming the same stock of worlds and atomic valuation in both models, and since the truth of an atomic sentence in these frameworks depends only on the world parameter of the index and the atomic valuation. Now for any sentence $\varphi \in L^{\Diamond}$, assume for induction that, at every index $i$, $\varphi$ is true$_d$ at $i$ iff it is true$_r$ at $g(i)$. We show that, for arbitrary index $i$, $\Diamond\varphi$ is true$_d$ at $i$ iff $\Diamond\varphi$ is true$_r$ at $g(i)$. $i$ will have the form $\langle s, w\rangle$, for information state $s$ and world $w$, and, by our semantics for $\Diamond$ in $d$, $\Diamond\varphi$ will be true$_d$ at $i$ iff $\varphi$ is true$_d$ at some element in the set $\Phi = \{\langle s, w'\rangle : w' \in s\}$. $g(i)$ will have the form $\langle f_s, w\rangle$, and, by our semantics for $\Diamond$ in $r$, $\Diamond\varphi$ will be true$_r$ at $g(i)$ iff $\varphi$ is true$_r$ at some element in the set $\Psi = \{\langle f_s, w'\rangle : w' \in f_s(w)\}$. Now note that, thanks to the way we constructed $g$ and the fact that $f_s(w) = s$, $g$ will be a bijection from $\Phi$ to $\Psi$. And so it follows from our assumption for induction that $\varphi$ will be true$_d$ at some element in $\Phi$ just in case $\varphi$ is true$_r$ at some element in $\Psi$, and thus $\Diamond\varphi$ will be true$_d$ at $i$ iff $\Diamond\varphi$ is true$_r$ at $g(i)$. It thus follows by induction on the complexity of formulae that, for any sentence $\varphi$ of $L^{\Diamond}$ and any index $i$, $\varphi$ is true$_d$ at $i$ iff $\varphi$ is true$_r$ at $g(i)$. Finally, it is easy to see that $g$ is an injection. Given Fact 3.1, it thus follows that $\mathcal{D} \preceq_{L^{\Diamond}} \mathcal{R}$.

Fact 3.3. $\mathcal{R} \npreceq_{L^{\Diamond}} \mathcal{D}$.

Proof. Consider any models $d \in \mathcal{D}$ and $r \in \mathcal{R}$. Consider an $r$-index $\langle f, w\rangle$, with $f(w) = \{w'\}$, $v_r(p, w') = 0$, $f(w') = \{w''\}$, and $v_r(p, w'') = 1$. Then $\Diamond p$ will be false$_r$ at $\langle f, w\rangle$, while $\Diamond(\Diamond p)$ will be true$_r$ at $\langle f, w\rangle$. There is no function $g$ which replicates this pattern in $d$, i.e. which has $\Diamond p$ false$_d$ at $g(\langle f, w\rangle)$ and has $\Diamond(\Diamond p)$ true$_d$ at $g(\langle f, w\rangle)$. This is for the simple reason that, for any index $i$ in $I_d$, $\Diamond\varphi$ is true$_d$ at $i$ iff $\Diamond(\Diamond\varphi)$ is, since $\Diamond(\Diamond\varphi)$ is true$_d$ at $\langle s, x\rangle$, for any $x$, iff $\Diamond\varphi$ is true$_d$ at $\langle s, w'\rangle$ for some $w' \in s$ iff $\varphi$ is true$_d$ at $\langle s, w''\rangle$ for some $w'' \in s$ iff $\Diamond\varphi$ is true$_d$ at $\langle s, x\rangle$ for any $x$.
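As a sanity check on the two arguments above, the following Python sketch evaluates small formulas in a finite stand-in for a domain model and a relational model. It is only an illustration under assumptions that depart from the paper's setup: there are just two atoms (the paper assumes infinitely many), worlds are identified with the sets of atoms true at them, formulas are nested tuples, and all function names are hypothetical. The clauses for $\Diamond$ follow the truth conditions stated in the proofs of Facts 3.2 and 3.3.

```python
from itertools import combinations

ATOMS = ("p", "q")

def powerset(items):
    items = list(items)
    return [frozenset(c) for n in range(len(items) + 1)
            for c in combinations(items, n)]

WORLDS = powerset(ATOMS)    # a world is the set of atoms true at it
STATES = powerset(WORLDS)   # information states: sets of worlds

def atom(name):
    return ("atom", name)

def dia(phi):
    return ("dia", phi)

def true_d(phi, s, w):
    """Domain semantics at index (s, w)."""
    if phi[0] == "atom":
        return phi[1] in w
    return any(true_d(phi[1], s, w2) for w2 in s)

def true_r(phi, f, w):
    """Relational semantics at index (f, w); f maps worlds to sets of worlds."""
    if phi[0] == "atom":
        return phi[1] in w
    return any(true_r(phi[1], f, w2) for w2 in f[w])

def constant_base(s):
    """The constant modal base f_s used to define g in the proof of Fact 3.2."""
    return {w: s for w in WORLDS}

p, q = atom("p"), atom("q")
formulas = [p, q, dia(p), dia(q), dia(dia(p)), dia(dia(q))]

# Fact 3.2: g((s, w)) = (f_s, w) preserves truth for the sampled formulas.
g_preserves_truth = all(
    true_d(phi, s, w) == true_r(phi, constant_base(s), w)
    for phi in formulas for s in STATES for w in WORLDS)

# Fact 3.3: in the domain model, a doubled diamond never differs from a single one.
domain_collapses = all(
    true_d(dia(p), s, w) == true_d(dia(dia(p)), s, w)
    for s in STATES for w in WORLDS)

# The relational index from the proof: f(w0) = {w1}, f(w1) = {w2}, p true only at w2.
w0, w1, w2 = frozenset({"q"}), frozenset(), frozenset({"p"})
f = {w: frozenset() for w in WORLDS}
f[w0] = frozenset({w1})
f[w1] = frozenset({w2})
relational_splits = (not true_r(dia(p), f, w0)) and true_r(dia(dia(p)), f, w0)

print(g_preserves_truth, domain_collapses, relational_splits)
```

The three flags correspond, respectively, to truth preservation by the constant-base map, the collapse of iterated $\Diamond$ in the domain model, and the relational index at which $\Diamond p$ is false while $\Diamond(\Diamond p)$ is true. A finite check of this kind is of course no substitute for the inductive proofs.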
Thus there is no function from the indices of $r$ to those of $d$ which preserves truth for all $\varphi \in L^{\Diamond}$. Since these models were chosen arbitrarily, we have that there is no model $r \in \mathcal{R}$ and model $d \in \mathcal{D}$ s.t. $r \preceq_{L^{\Diamond}} d$, and thus by Fact 3.1 we have $\mathcal{R} \npreceq_{L^{\Diamond}} \mathcal{D}$.

Fact 3.5. $\mathcal{D} \prec_{L^{\Diamond-}} \mathcal{R}$.

Proof. That $\mathcal{D} \preceq_{L^{\Diamond-}} \mathcal{R}$ follows as an immediate corollary of Fact 3.2. The proof that $\mathcal{R} \npreceq_{L^{\Diamond-}} \mathcal{D}$ is as follows. Consider any models $r \in \mathcal{R}$ and $d \in \mathcal{D}$. Let $h$ be a bijection $W_r \to W_d$ such that $\forall p \in L^{\Diamond-} : \forall w \in W_r : v_r(p, w) = v_d(p, h(w))$; that there is such a bijection follows from our assumptions about the stocks of worlds and valuation functions in any model of $\mathcal{D}$ and $\mathcal{R}$. Now consider three different modal bases $f$, $f'$, and $f''$ from pairs in $I_r$, and some world $w \in W_r$, with $f(w) = f'(w) = f''(w) = \emptyset$. Consider any function $g : I_r \to I_d$ which preserves truth for $\varphi \in L^{\Diamond-}$. Suppose that $g(\langle f, w\rangle) = \langle s, x\rangle$, $g(\langle f', w\rangle) = \langle s', x'\rangle$, and $g(\langle f'', w\rangle) = \langle s'', x''\rangle$, with $\langle s, x\rangle$, $\langle s', x'\rangle$, and $\langle s'', x''\rangle$ all different. Since all worlds differ on the truth of some atom, we know that $x = x' = x'' = h(w)$, else we would have that at least one of $g(\langle f, w\rangle)$, $g(\langle f', w\rangle)$, or $g(\langle f'', w\rangle)$ differs from its pre-image on the truth of some atom, contrary to the assumption that $g$ preserves truth. So we must have that $s \neq s'$ and $s \neq s''$ and $s' \neq s''$. It is easy to see that, for any atom $p$, $\Diamond p$ is false$_r$ at all of $\langle f, w\rangle$, $\langle f', w\rangle$, and $\langle f'', w\rangle$. But there are only two $d$-indices with $h(w)$ as their world parameter which make $\Diamond p$ false$_d$ for every atom $p$, namely $\langle \emptyset, h(w)\rangle$ and $\langle \{w^f_d\}, h(w)\rangle$, where $w^f_d$ is the world in $W_d$ such that for every atomic sentence $p$, $v_d(p, w^f_d) = 0$. And so at least one of $\langle s, x\rangle$, $\langle s', x'\rangle$, or $\langle s'', x''\rangle$ will make $\Diamond p$ true$_d$ for some $p$, contrary to the assumption that $g$ preserves truth. Thus any truth-preserving function must take two of $\langle f, w\rangle$, $\langle f', w\rangle$, and $\langle f'', w\rangle$ to the same index in $I_d$, and thus will fail to be an injection. Since $d$ and $r$ were chosen arbitrarily, Fact 3.5 follows by Fact 3.1.

Fact 3.8. $\mathcal{U} \prec_{L^{\Diamond}} \mathcal{R}$.
Proof. Recall that $\mathcal{U}$ is the set of models $u$ of the form $\langle W_u, v_u, I_u, [\![\cdot]\!]^{u}\rangle$, where $I_u = \{\langle s, c\rangle : s \cup c \subseteq W_u\}$, with $[\![\cdot]\!]^{u}$ as specified in §3, and assuming again that in any $\mathcal{U}$-model, any two worlds differ on the truth of some atom, and for any set of atoms, there is a world where exactly they are true. First note that $\mathcal{U}$ is isomorphic with respect to $L^{\Diamond}$. Next, let $u$ be an arbitrary $\mathcal{U}$-model, and let $r$ be a model in $\mathcal{R}$ built on the same set of worlds and valuation function. Let $w^t_u$ be the world in $W_u$ which is such that, for every atom $p \in L^{\Diamond}$, $v_u(p, w^t_u) = 1$, and let $w^f_u$ be the world in $W_u$ which is such that, for every atom $p \in L^{\Diamond}$, $v_u(p, w^f_u) = 0$. Where $\Phi$ is a set of atomic sentences, we let $\Phi$ refer to the unique world from $W_u$ which verifies exactly those sentences according to $v_u$, and vice versa. Let $h$ be an injection which takes any pair of subsets of $W_u$ to a subset of $W_u$ (that there is such a function follows from the fact that $W_u$ must be infinite given our starting language and assumptions about worlds). We stipulate further that $h(\langle \{w^f_u\}, \emptyset\rangle) = \{w^t_u\}$; $h(\langle \{w^t_u\}, \{w^t_u\}\rangle) = \{w^t_u, w^f_u\}$; $h(\langle \{w^f_u, w^t_u\}, \{w^t_u\}\rangle) = \{w^f_u\}$; and that $h(\langle s, c\rangle)$ includes $w^t_u$ whenever $w^t_u \in s$ and $s = c$. For any sets $r$, $s$, let $f^{r}_{s}$ be the function which takes every world in $W_u$ to $r$ except $w^f_u$, which it takes to $s$, and let $f^{r}_{s*}$ be the function which takes every world in $W_u$ to $r$ except $w^t_u$, which it takes to $s$.
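We can then define a witness function $g$ as follows: for any pair $\langle s, c\rangle$ of contexts (subsets of $W_u$), with $p$ ranging over atomic sentences in $L^{\Diamond}$:

\[
g(\langle s, c\rangle) =
\begin{cases}
\langle f^{s}_{h(\langle s,c\rangle)*},\ \{p : \forall w' \in s : p \in w'\}\rangle & \text{iff } s = c \neq \emptyset \\
\langle f^{\emptyset}_{h(\langle s,c\rangle)*},\ \{p : (\forall w' \in c : p \in w') \wedge (\forall w'' \in (s \setminus c) : p \notin w'')\}\rangle & \text{iff } c \subset s \wedge c \neq \emptyset \\
\langle f^{\emptyset}_{h(\langle s,c\rangle)*},\ w^f_u\rangle & \text{iff } c \nsubseteq s \\
\langle f^{\{w : \forall p : p \in w \to \forall w' \in s : p \notin w'\}}_{h(\langle s,c\rangle)*},\ \{p : \forall w' \in s : p \notin w'\}\rangle & \text{iff } s \neq c \wedge c = \emptyset \\
\langle f^{\{w^t_u\}}_{h(\langle s,c\rangle)},\ w^t_u\rangle & \text{iff } s = c = \emptyset
\end{cases}
\]

First note that $g$ is an injection, since each pair of contexts is taken to an index whose modal base is uniquely tagged by $h(\cdot)$. Now note that, for any sentence $\varphi \in L^{\Diamond}$ and $i \in I_u$, $\varphi$ is true$_u$ at $i$ iff $\varphi$ is true$_r$ at $g(i)$. To see this, consider first atomic $q$. Atomic $q$ is true$_u$ at $\langle s, c\rangle$ iff $c$ is the result of removing all and only $\overline{q}$-worlds from $s$:48

• if $s = c \neq \emptyset$, then this holds iff all the worlds in $s$ are $q$-worlds iff $q \in \{p : \forall w' \in s : p \in w'\}$ iff $q$ is true$_r$ at $g(\langle s, c\rangle) = \langle f^{s}_{h(\langle s,c\rangle)*}, \{p : \forall w' \in s : p \in w'\}\rangle$;
• if $c \subset s \wedge c \neq \emptyset$, then this holds iff all the worlds in $c$, but none of the worlds in $s \setminus c$, are $q$-worlds, iff $q \in \{p : (\forall w' \in c : p \in w') \wedge (\forall w'' \in (s \setminus c) : p \notin w'')\}$, iff $q$ is true$_r$ at $g(\langle s, c\rangle) = \langle f^{\emptyset}_{h(\langle s,c\rangle)*}, \{p : (\forall w' \in c : p \in w') \wedge (\forall w'' \in (s \setminus c) : p \notin w'')\}\rangle$;
• if $c \nsubseteq s$, then this never holds, in which case $q$ is also false$_r$ at $g(\langle s, c\rangle) = \langle f^{\emptyset}_{h(\langle s,c\rangle)*}, w^f_u\rangle$;
• if $s \neq c \wedge c = \emptyset$, then this holds iff no world in $s$ is a $q$-world iff $q \in \{p : \forall w' \in s : p \notin w'\}$ iff $q$ is true$_r$ at $g(\langle s, c\rangle) = \langle f^{\{w : \forall p : p \in w \to \forall w' \in s : p \notin w'\}}_{h(\langle s,c\rangle)*}, \{p : \forall w' \in s : p \notin w'\}\rangle$;
• if $s = c = \emptyset$, this holds in any case whatsoever, in which case $q$ is also true$_r$ at $g(\langle s, c\rangle) = \langle f^{\{w^t_u\}}_{h(\langle s,c\rangle)}, w^t_u\rangle$.

Consider next sentences of the form $\Diamond q$, for atomic $q$. In the update semantics, again, $\Diamond q$ is treated as a "test": it takes a context $c$ and returns $c$ unchanged just in case $[\![q]\!]^{u}(c) \neq \emptyset$, and otherwise returns $\emptyset$.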
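48 For atomic $p$, a $p$-world is a world $w$ where $v_u(p, w) = 1$; a $\overline{p}$-world is a world where $v_u(p, w) = 0$.

That means that, for atomic $q$, $\langle s, c\rangle \in [\![\Diamond q]\!]^{u}$ iff

• (i) there is a $q$-world in $s$ and $s = c$; then $\Diamond q$ is true$_r$ at $g(\langle s, c\rangle) = \langle f^{s}_{h(\langle s,c\rangle)*}, \{p : \forall w' \in s : p \in w'\}\rangle$, since $f^{s}_{h(\langle s,c\rangle)*}(w)$ will contain a $q$-world for any $w \neq w^t_u$; and $\{p : \forall w' \in s : p \in w'\} = w^t_u$ iff $s = \{w^t_u\}$, in which case by construction of $h$ we have that $h(\langle s, c\rangle)$ contains $w^t_u$, and thus $f^{s}_{h(\langle s,c\rangle)*}(\{p : \forall w' \in s : p \in w'\})$ will then also contain a $q$-world, namely $w^t_u$;
• or (ii) there is no $q$-world in $s$, and $c = \emptyset$.
  – Suppose first that $s \neq c$. Then $\Diamond q$ will be true$_r$ at $g(\langle s, c\rangle) = \langle f^{\{w : \forall p : p \in w \to \forall w' \in s : p \notin w'\}}_{h(\langle s,c\rangle)*}, \{p : \forall w' \in s : p \notin w'\}\rangle$, since the fact that $q$ is false throughout $s$ ensures that $q$ will be true at some world in $\{w : \forall p : p \in w \to \forall w' \in s : p \notin w'\}$, and $f^{\{w : \forall p : p \in w \to \forall w' \in s : p \notin w'\}}_{h(\langle s,c\rangle)*}$ takes every world but $w^t_u$ to $\{w : \forall p : p \in w \to \forall w' \in s : p \notin w'\}$. Moreover in this case we have $\{p : \forall w' \in s : p \notin w'\} = w^t_u$ iff $\langle s, c\rangle = \langle \{w^f_u\}, \emptyset\rangle$; then $h(\langle s, c\rangle) = \{w^t_u\}$ by construction of $h$, and so $f^{\{w : \forall p : p \in w \to \forall w' \in s : p \notin w'\}}_{h(\langle s,c\rangle)*}(\{p : \forall w' \in s : p \notin w'\})$ again contains a $q$-world.
  – Suppose next that $s = c$. Then $\Diamond q$ will be true$_r$ at $g(\langle s, c\rangle) = \langle f^{\{w^t_u\}}_{h(\langle s,c\rangle)}, w^t_u\rangle$, since $f^{\{w^t_u\}}_{h(\langle s,c\rangle)}(w^t_u) = \{w^t_u\}$.

Next, suppose that $\Diamond q$ is false$_u$ at $\langle s, c\rangle$; this will hold iff:

• $s$ doesn't contain a $q$-world and $c \neq \emptyset$; then either
  – $s = c$; then $g(\langle s, c\rangle) = \langle f^{s}_{h(\langle s,c\rangle)*}, \{p : \forall w' \in s : p \in w'\}\rangle$. We know $\{p : \forall w' \in s : p \in w'\} \neq w^t_u$, for we would only have $\{p : \forall w' \in s : p \in w'\} = w^t_u$ if $\langle s, c\rangle = \langle \{w^t_u\}, \{w^t_u\}\rangle$, in which case $s$ contains a $q$-world contrary to assumption; and so we have $f^{s}_{h(\langle s,c\rangle)*}(\{p : \forall w' \in s : p \in w'\}) = s$; since $s$ doesn't contain a $q$-world, $\Diamond q$ is false$_r$ here;
  – or $c \subset s \wedge c \neq \emptyset$; then $g(\langle s, c\rangle) = \langle f^{\emptyset}_{h(\langle s,c\rangle)*}, \{p : (\forall w' \in c : p \in w') \wedge (\forall w'' \in (s \setminus c) : p \notin w'')\}\rangle$.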
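    Suppose first $\{p : (\forall w' \in c : p \in w') \wedge (\forall w'' \in (s \setminus c) : p \notin w'')\} \neq w^t_u$; so we have $f^{\emptyset}_{h(\langle s,c\rangle)*}(\{p : (\forall w' \in c : p \in w') \wedge (\forall w'' \in (s \setminus c) : p \notin w'')\}) = \emptyset$ and so $\Diamond q$ is false$_r$. Suppose next $\{p : (\forall w' \in c : p \in w') \wedge (\forall w'' \in (s \setminus c) : p \notin w'')\} = w^t_u$; then $\langle s, c\rangle$ must be $\langle \{w^t_u, w^f_u\}, \{w^t_u\}\rangle$; but then $s$ contains a $q$-world, contrary to assumption;
  – or $c \nsubseteq s$; then $g(\langle s, c\rangle) = \langle f^{\emptyset}_{h(\langle s,c\rangle)*}, w^f_u\rangle$, and so $f^{\emptyset}_{h(\langle s,c\rangle)*}(w^f_u) = \emptyset$ and so $\Diamond q$ will be false$_r$ here;
• or $s$ contains a $q$-world and $s \neq c$; then
  – either $c \subset s \wedge c \neq \emptyset$; then $g(\langle s, c\rangle) = \langle f^{\emptyset}_{h(\langle s,c\rangle)*}, \{p : (\forall w' \in c : p \in w') \wedge (\forall w'' \in (s \setminus c) : p \notin w'')\}\rangle$. Suppose first $\{p : (\forall w' \in c : p \in w') \wedge (\forall w'' \in (s \setminus c) : p \notin w'')\} \neq w^t_u$; so we have $f^{\emptyset}_{h(\langle s,c\rangle)*}(\{p : (\forall w' \in c : p \in w') \wedge (\forall w'' \in (s \setminus c) : p \notin w'')\}) = \emptyset$ and so $\Diamond q$ is false$_r$ here; suppose next $\{p : (\forall w' \in c : p \in w') \wedge (\forall w'' \in (s \setminus c) : p \notin w'')\} = w^t_u$; then $\langle s, c\rangle$ must be $\langle \{w^t_u, w^f_u\}, \{w^t_u\}\rangle$; by construction $h(\langle \{w^f_u, w^t_u\}, \{w^t_u\}\rangle) = \{w^f_u\}$; and so in this case, $f^{\emptyset}_{h(\langle s,c\rangle)*}(\{p : (\forall w' \in c : p \in w') \wedge (\forall w'' \in (s \setminus c) : p \notin w'')\}) = \{w^f_u\}$ and so again $\Diamond q$ is false$_r$;
  – or $c \nsubseteq s$; then $g(\langle s, c\rangle) = \langle f^{\emptyset}_{h(\langle s,c\rangle)*}, w^f_u\rangle$, and so $f^{\emptyset}_{h(\langle s,c\rangle)*}(w^f_u) = \emptyset$ and so $\Diamond q$ will be false$_r$ here;
  – or $s \neq c \wedge c = \emptyset$; then $g(\langle s, c\rangle) = \langle f^{\{w : \forall p : p \in w \to \forall w' \in s : p \notin w'\}}_{h(\langle s,c\rangle)*}, \{p : \forall w' \in s : p \notin w'\}\rangle$. Provided $\{p : \forall w' \in s : p \notin w'\} \neq w^t_u$, we have $f^{\{w : \forall p : p \in w \to \forall w' \in s : p \notin w'\}}_{h(\langle s,c\rangle)*}(\{p : \forall w' \in s : p \notin w'\}) = \{w : \forall p : p \in w \to \forall w' \in s : p \notin w'\}$, which will not contain a $q$-world, since there is a $q$-world in $s$. And in this case we have $\{p : \forall w' \in s : p \notin w'\} = w^t_u$ iff $\langle s, c\rangle = \langle \{w^f_u\}, \emptyset\rangle$, contrary to our assumption that $s$ contains a $q$-world.

Finally, in any update model, for any $\varphi \in L^{\Diamond}$ and any update index $i$, $\Diamond(\Diamond\varphi)$ is true$_u$ at $i$ iff $\Diamond\varphi$ is true$_u$ at $i$.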
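And this will also hold relative to any point in the image of $g$:

• when $s = c = \emptyset$, both $\Diamond\varphi$ and $\Diamond(\Diamond\varphi)$ will be true$_r$ (by an obvious induction on the length of formulas which I leave implicit here);
• when $c \nsubseteq s$, both will be false$_r$;
• when $c \subset s \wedge c \neq \emptyset$, $g_1(\langle s, c\rangle)(g_2(\langle s, c\rangle))$ is empty, making both false; unless $g_2(\langle s, c\rangle) = w^t_u$, which holds only when $s = \{w^f_u, w^t_u\}$ and $c = \{w^t_u\}$, in which case by construction of $h$, we have $g_1(\langle s, c\rangle)(g_2(\langle s, c\rangle)) = \{w^f_u\}$.49 Since $g_1(\langle s, c\rangle)(w^f_u) = \emptyset$, we again have false;
• when $s = c \neq \emptyset$ and $g_2(\langle s, c\rangle) \neq w^t_u$, we have that $g_1(\langle s, c\rangle)$ has the same value at $g_2(\langle s, c\rangle)$ and at every element of $g_1(\langle s, c\rangle)(g_2(\langle s, c\rangle))$, provided $s$ does not include $w^t_u$; and whenever $s$ includes $w^t_u$, we have that $g_1(\langle s, c\rangle)(w^t_u)$ includes $w^t_u$ by construction of $h$, and thus that $\forall w \in s : w^t_u \in g_1(\langle s, c\rangle)(w)$, and thus that $\Diamond\varphi$ is true at $g_2(\langle s, c\rangle)$ and at every world in $g_1(\langle s, c\rangle)(g_2(\langle s, c\rangle))$; and $g_2(\langle s, c\rangle) = w^t_u$ iff $s = c = \{w^t_u\}$, in which case $\Diamond\varphi$ is true for any $\varphi$ at $g(\langle s, c\rangle)$;
• when $s \neq c \wedge c = \emptyset$, $g_2(\langle s, c\rangle) = w^t_u$ iff $s = \{w^f_u\}$; then $h(\langle s, c\rangle) = \{w^t_u\}$, so then $\Diamond\varphi$ is true at $g(\langle s, c\rangle)$ for all $\varphi$. When $g_2(\langle s, c\rangle) \neq w^t_u$, we know that $w^t_u \notin \{w : \forall p : p \in w \to \forall w' \in s : p \notin w'\}$ and so $g_1(\langle s, c\rangle)(g_2(\langle s, c\rangle))$ is the same as $g_1(\langle s, c\rangle)$ applied to any element of $g_1(\langle s, c\rangle)(g_2(\langle s, c\rangle))$.

And thus we can conclude that $\Diamond\varphi$ is true$_u$ at $i$ iff $\Diamond\varphi$ is true$_r$ at $g(i)$, for any $\varphi \in L^{\Diamond}$. Thus $g$ is an injection from the indices of $u$ to those of $r$ which preserves truth for all sentences of $L^{\Diamond}$, and thus by Fact 3.1 we have $\mathcal{U} \preceq_{L^{\Diamond}} \mathcal{R}$.50 The proof that $\mathcal{R} \npreceq_{L^{\Diamond}} \mathcal{U}$ will be as for Fact 3.3.

Fact 3.9. $\mathcal{U} \npreceq_{L^{\Diamond}} \mathcal{D}$.

Proof. Consider arbitrary $u \in \mathcal{U}$ and $d \in \mathcal{D}$. Consider the three $u$-indices $\langle \emptyset, \{w\}\rangle$, $\langle \emptyset, \{w'\}\rangle$, $\langle \emptyset, \{w''\}\rangle$, with $w$, $w'$, and $w''$ all different. These three indices all make $p$ false$_u$, for any atomic $p \in L^{\Diamond}$; they also make $\Diamond\varphi$ false$_u$, for any $\varphi \in L^{\Diamond}$, and thus make every sentence in $L^{\Diamond}$ false$_u$.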
There are, however, only two indices in $d$ which make every sentence in $L^{\Diamond}$ false$_d$, namely $\langle \emptyset, w^f_d\rangle$ and $\langle \{w^f_d\}, w^f_d\rangle$, where $w^f_d$ is the world where, for every atomic sentence $p$, $v_d(p, w^f_d) = 0$. Thus any truth-preserving function $g : I_u \to I_d$ will have to map at least two of the $u$-indices in question to the same $d$-index, and thus will fail to be an injection. Thus Fact 3.9 follows by Fact 3.1.

49 $g_n$ is the $n$th projection of $g$, i.e. $g_n(X) = x_n$ iff $g(X) = \langle x_1, x_2, \ldots, x_n, \ldots\rangle$.

50 Note that not every operator which can be added to $\mathcal{U}$ will be well-defined if we want the intension of any sentence in $\mathcal{U}$ to be a function from contexts to contexts, rather than just a relation; there are different approaches within broadly update-style frameworks to this question (e.g. Heim 1983 vs. Groenendijk and Stokhof 1991).

Fact 3.10. $\mathcal{D} \npreceq_{L^{\Diamond}} \mathcal{U}$.

Proof. Consider arbitrary $u \in \mathcal{U}$ and $d \in \mathcal{D}$. There are exactly three $u$-indices which make all sentences in $L^{\Diamond}$ true$_u$, namely $\langle \emptyset, \emptyset\rangle$, $\langle \{w^f_u\}, \emptyset\rangle$, and $\langle \{w^t_u\}, \{w^t_u\}\rangle$, where $w^f_u$ is the world where, for every atomic sentence $p$, $v_u(p, w^f_u) = 0$ and where $w^t_u$ is the world where, for every atomic sentence $p$, $v_u(p, w^t_u) = 1$. There are more than three $d$-indices which make all sentences in $L^{\Diamond}$ true$_d$: these include $\langle W_d, w^t_d\rangle$ and $\langle \{w^t_d\}, w^t_d\rangle$, as well as $\langle s, w^t_d\rangle$ for any $s$ such that $\{w^t_d\} \subseteq s \subseteq W_d$, with $w^t_d$ defined as for $w^t_u$. Thus any function from the indices of $d$ to those of $u$ which preserves truth for all sentences in $L^{\Diamond}$ will have to map more than three $d$-indices to three $u$-indices, and so will fail to be an injection. Since $d$ and $u$ were chosen arbitrarily, by Lemma A.2 and Fact 3.1, $\mathcal{D} \npreceq_{L^{\Diamond}} \mathcal{U}$.

Fact 3.12. $\mathcal{S} \prec_{L^{\Diamond}} \mathcal{D}$.

Proof. Recall that $\mathcal{S}$ is the set of models $s$ of the form $\langle W_s, v_s, I_s, [\![\cdot]\!]^{s}\rangle$ with $I_s = \{s : s \subseteq W_s\}$, with $[\![\cdot]\!]^{s}$ specified as in §3, and assuming again that in any model, any two worlds differ on the truth of some atom, and for any set of atoms, exactly that set is true at some world.
Note again that $\mathcal{S}$ is isomorphic with respect to $L^{\Diamond}$. Choose a model $s \in \mathcal{S}$ and find a model $d \in \mathcal{D}$ s.t. $W_s = W_d$ and $v_s = v_d$. For convenience we identify every world in $W_s$ with the set of atomic sentences it makes true according to $v_s$. Let the function $g$ take any information state $s \subseteq W_s$ to $\langle s, \bigcap s\rangle$. For atomic $p$, $p$ is true$_s$ at $s$ iff for all $w' \in s$, $v_s(p, w') = 1$, iff $p \in \bigcap s$, iff $p$ is true$_d$ at $g(s) = \langle s, \bigcap s\rangle$. For atomic $p$, $\Diamond p$ is true$_s$ at $s$ iff $s$ contains a $p$-world (according to $v_s$) iff $\Diamond p$ is true$_d$ at $g(s) = \langle s, \bigcap s\rangle$. Finally, it holds in both $s$ and $d$ that $\Diamond(\Diamond\varphi)$ is true at an index iff $\Diamond\varphi$ is, and so we know that for any $\varphi \in L^{\Diamond}$, $\Diamond\varphi$ will be true$_s$ at $i$ iff $\Diamond\varphi$ is true$_d$ at $g(i)$. Note finally that $g$ is an injection: for any $s$ and $s'$, if $s \neq s'$, then the first elements of $g(s)$ and $g(s')$ will differ. Thus by Lemma A.2 we have $s \preceq_{L^{\Diamond}} d$ and so by Fact 3.1 we have $\mathcal{S} \preceq_{L^{\Diamond}} \mathcal{D}$.

But the converse fails: $\mathcal{D} \npreceq_{L^{\Diamond}} \mathcal{S}$. Consider any models $s \in \mathcal{S}$ and $d \in \mathcal{D}$. Consider three $d$-indices $\langle \emptyset, w\rangle$, $\langle \emptyset, w'\rangle$, and $\langle \emptyset, w''\rangle$ with $w$, $w'$, and $w''$ all distinct. For any $\varphi \in L^{\Diamond}$, $\Diamond\varphi$ is false$_d$ at all these indices. The only indices which make $\Diamond\varphi$ false$_s$ for every $\varphi \in L^{\Diamond}$ are $\emptyset$ and $\{w^f_s\}$, where $w^f_s$ is the world which makes every atom false according to $v_s$, and thus any truth-preserving function will have to take two of the $d$-indices to the same $s$-index, and thus will fail to be an injection; thus by Lemma A.2 and Fact 3.1, $\mathcal{D} \npreceq_{L^{\Diamond}} \mathcal{S}$.

Fact 3.13. For any language $L$ and a set of semantic theories $\mathbb{T}$ each of which is isomorphic with respect to $L$, $\preceq_L$ is a partial pre-order over $\mathbb{T}$.

Proof. $\preceq_L$ will be transitive: suppose $T \preceq_L T'$, witnessed by a function $g : I_t \to I_{t'}$, for $t \in T$, $t' \in T'$, and $T' \preceq_L T''$, witnessed by a function $f : I_{t^{*\prime}} \to I_{t''}$, for $t^{*\prime} \in T'$, $t'' \in T''$. By isomorphism, there is a truth-preserving injection $h : I_{t'} \to I_{t^{*\prime}}$. $f \circ (h \circ g)$ will witness $T \preceq_L T''$. $\preceq_L$ is reflexive, witnessed by the identity function.
It is not anti-symmetric, since it is easy to see that there are different semantic theories $T$ and $T'$ isomorphic with respect to $L$ such that $T \preceq_L T'$ and $T' \preceq_L T$ (for instance, two standard semantic theories for a language just comprising atomic sentences, with the same set of possible worlds but different valuations, may have this property). And it is not necessarily connected, since, as we saw in the comparison of $\mathcal{D}$ to $\mathcal{U}$, there are semantic theories which are incommensurable with respect to a given language.

Fact 3.15. $\mathcal{D}^{\exists} \prec_{L^{\Diamond}_{v}} \mathcal{R}^{\exists}$.

Proof. Note that $\mathcal{D}^{\exists}$ and $\mathcal{R}^{\exists}$ are both isomorphic with respect to $L^{\Diamond}_{v}$. Consider a model $d^{\exists} \in \mathcal{D}^{\exists}$. Find a model $r^{\exists} \in \mathcal{R}^{\exists}$ built on the same domain, set of worlds, and valuation function. For any index $\langle a, s, w\rangle \in I_{d^{\exists}}$, with $a$ a variable assignment, $s$ a set of worlds, and $w$ a world, let $g$ be the function which takes $\langle a, s, w\rangle$ to the $r^{\exists}$-index $\langle a, f_s, w\rangle$, where $f_s$ is again the constant function to $s$. $g$ will witness $d^{\exists} \preceq_{L^{\Diamond}_{v}} r^{\exists}$; the proof is a generalization of the parallel result in the proof of Fact 3.2; and so we have $\mathcal{D}^{\exists} \preceq_{L^{\Diamond}_{v}} \mathcal{R}^{\exists}$. But there is no truth-preserving injection in the other direction, for the same reasons given in the proof of Fact 3.3. Note moreover that the proof of Fact 3.5 can be extended to show that $\mathcal{D}^{\exists} \prec_{L^{\Diamond-}_{v}} \mathcal{R}^{\exists}$.

Fact 3.16. $\mathcal{U}^{\exists} \prec_{L^{\Diamond}_{v}} \mathcal{R}^{\exists}$.

Proof. First note that $\mathcal{U}^{\exists}$ is isomorphic with respect to $L^{\Diamond}_{v}$. Then our proof is very much as in the proof of Fact 3.8.
We construct a witness function $g$ from an arbitrarily chosen model $u^{\exists} \in \mathcal{U}^{\exists}$ to a model $r^{\exists} \in \mathcal{R}^{\exists}$ built on the same set of worlds, valuation function, and domain, using the function $h$ defined in the proof of Fact 3.8; with the notation defined there:

\[
g(\langle a, \langle s, c\rangle\rangle) =
\begin{cases}
\langle a,\ f^{s}_{h(\langle s,c\rangle)*},\ \iota w : \forall R : v_{u^{\exists}}(R, w) = \bigcap\{v_{u^{\exists}}(R, w') : w' \in s\}\rangle & \text{iff } s = c \neq \emptyset \\
\langle a,\ f^{\emptyset}_{h(\langle s,c\rangle)*},\ \iota w : \forall R : v_{u^{\exists}}(R, w) = \bigcap\{v_{u^{\exists}}(R, w') : w' \in c\} \setminus \bigcup\{v_{u^{\exists}}(R, w') : w' \in (s \setminus c)\}\rangle & \text{iff } c \subset s \wedge c \neq \emptyset \\
\langle a,\ f^{\emptyset}_{h(\langle s,c\rangle)*},\ w^f_{u^{\exists}}\rangle & \text{iff } c \nsubseteq s \\
\langle a,\ f^{\{w : \forall R \forall \vec{d} : \vec{d} \in v_{u^{\exists}}(R, w) \to \forall w' \in s : \vec{d} \notin v_{u^{\exists}}(R, w')\}}_{h(\langle s,c\rangle)*},\ \iota w : \forall n : \forall R^n : v_{u^{\exists}}(R^n, w) = D^n \setminus \bigcup\{v_{u^{\exists}}(R^n, w') : w' \in s\}\rangle & \text{iff } s \neq c \wedge c = \emptyset \\
\langle a,\ f^{\{w^t_{u^{\exists}}\}}_{h(\langle s,c\rangle)},\ w^t_{u^{\exists}}\rangle & \text{iff } s = c = \emptyset
\end{cases}
\]

$R$ ranges over relation symbols in the vocabulary of $L^{\Diamond}_{v}$; $w^t_{u^{\exists}}$ is the world such that $v_{u^{\exists}}(R^n, w^t_{u^{\exists}})$ is the universal $n$-ary relation, for any $n$-place relation symbol $R^n$; and $w^f_{u^{\exists}}$ is the world such that $v_{u^{\exists}}(R^n, w^f_{u^{\exists}})$ is the empty relation, for any $R^n$. $D$ is the domain of individuals; $\vec{d}$ ranges over ordered sequences of elements of $D$; and $\iota$ ranges over worlds in $W_{u^{\exists}}$. The proof that $g$ is a witness function is parallel to the proof of Fact 3.8. The proof that $\mathcal{R}^{\exists} \npreceq_{L^{\Diamond}_{v}} \mathcal{U}^{\exists}$ is parallel to that for Fact 3.3.

References

Aloni, M. (2000). Conceptual covers in dynamic semantics. In Cavedon, L., Blackburn, P., Braisby, N., and Shimojima, A., editors, Logic, Language and Computation, volume III.
Aloni, M. (2016). FC disjunction in state-based semantics. Slides for Logical Aspects of Computational Linguistics (LACL), Nancy, France.
Aloni, M. D. (2001). Quantification Under Conceptual Covers. PhD thesis, University of Amsterdam, Amsterdam.
Beaver, D. (1994). When variables don't vary enough. In Harvey, M. and Santelmann, L., editors, Semantics and Linguistic Theory (SALT), volume 4, pages 35–60.
Beaver, D. (2001). Presupposition and Assertion in Dynamic Semantics. CSLI Publications: Stanford, CA.
Beaver, D. I. (1992). The kinematics of presupposition.
In ITLI Prepublication Series for Logic, Semantics and Philosophy of Language. University of Amsterdam.
Beddor, B. and Goldstein, S. (2018). Believing epistemic contradictions. Review of Symbolic Logic, 11(1):87–114.
Bledin, J. and Lando, T. (2017). Closure and epistemic modals. To appear in Philosophy and Phenomenological Research.
Cresswell, M. (1990). Entities and Indices. Kluwer Academic Publishers, Dordrecht.
Degen, J., Kao, J. T., Scontras, G., and Goodman, N. D. (2015). A cost- and information-based account of epistemic must. Poster at 28th Annual CUNY Conference on Human Sentence Processing.
Dorr, C. and Hawthorne, J. (2013). Embedding epistemic modals. Mind, 122(488):867–913.
Dowell, J. (2011). A flexibly contextualist account of epistemic modals. Philosophers' Imprint, 11(14):1–25.
Egan, A., Hawthorne, J., and Weatherson, B. (2005). Epistemic modals in context. In Preyer, G. and Peter, G., editors, Contextualism in Philosophy: Knowledge, Meaning and Truth, chapter 6, pages 131–169. Oxford University Press.
von Fintel, K. (1997). Bare plurals, bare conditionals, and Only. Journal of Semantics, 14:1–56.
von Fintel, K. and Gillies, A. (2010). Must...stay...strong! Natural Language Semantics, 18(4):351–383.
French, R. (2017). Notational variance and its variants. Topoi, pages 1–11.
Gerbrandy, J. (1998). Identity in epistemic semantics. In Third Conference on Information Theoretic Approaches to Logic, Language and Computation.
Giannakidou, A. and Mari, A. (2016). Epistemic future and epistemic MUST: nonveridicality, evidence, and partial knowledge. In Blaszack, J., Giannakidou, A., Klimek-Jankowska, D., and Mygdalski, K., editors, Mood, Aspect and Modality: What is a Linguistic Category? University of Chicago Press.
Gillies, A. S. (2018). Updating data semantics. Mind.
Groenendijk, J. and Stokhof, M. (1991). Dynamic predicate logic. Linguistics and Philosophy, 14(1):39–100.
Groenendijk, J., Stokhof, M., and Veltman, F. (1996). Coreference and modality.
In Handbook of Contemporary Semantic Theory, pages 179–216. Oxford: Blackwell.
Hacquard, V. (2006). Aspects of Modality. PhD thesis, MIT.
Hacquard, V. (2010). On the event relativity of modal auxiliaries. Natural Language Semantics, 18:79–114.
Hawke, P. and Steinert-Threlkeld, S. (2016). Informational dynamics of epistemic possibility modals. Synthese.
Hawthorne, J., Rothschild, D., and Spectre, L. (2016). Belief is weak. Philosophical Studies, 173(5):1393–1404.
Heim, I. (1982). The Semantics of Definite and Indefinite Noun Phrases. PhD thesis, University of Massachusetts, Amherst.
Heim, I. (1983). On the projection problem for presuppositions. In Barlow, M., Flickinger, D. P., and Wiegand, N., editors, The West Coast Conference on Formal Linguistics (WCCFL), volume 2, pages 114–125. Stanford, Stanford University Press.
Heim, I. (1992). Presupposition projection and the semantics of attitude verbs. Journal of Semantics, 9(3):183–221.
Hintikka, J. (1962). Knowledge and Belief: An Introduction to the Logic of Two Notions. Cornell University Press, Ithaca, NY.
Holliday, W. H. and Icard, III, T. F. (2017). Indicative conditionals and dynamic epistemic logic. In Theoretical Aspects of Rationality and Knowledge (TARK), volume 16.
Ippolito, M. (2017). Constraints on the embeddability of epistemic modals. In Truswell, R., Cummins, C., Heycock, C., Rabern, B., and Rohde, H., editors, Sinn und Bedeutung, volume 21, pages 605–622.
Kamp, H. (1970). Formal properties of 'now'. Theoria, 37:227–273.
Karttunen, L. (1972). Possible and must. In Kimball, J., editor, Syntax and Semantics, volume 1, pages 1–20. Academic Press, New York.
Khoo, J. (2015). Modal disagreements. Inquiry, 58(5):511–534.
Kratzer, A. (1977). What 'must' and 'can' must and can mean. Linguistics and Philosophy, 1(3):337–355.
Kratzer, A. (1981). The notional category of modality. In Eikmeyer, H. and Rieser, H., editors, Words, Worlds, and Contexts: New Approaches in Word Semantics, pages 38–74. de Gruyter.
Kratzer, A. (1991). Modality. In von Stechow, A. and Wunderlich, D., editors, Semantics: An International Handbook of Contemporary Research, pages 639–650. de Gruyter, Berlin.
Kratzer, A. (2012). Modals and Conditionals. Oxford University Press, Oxford.
Kripke, S. (1963). Semantical considerations on modal logic. Acta Philosophica Fennica, 16:83–94.
Lassiter, D. (2016). Must, knowledge, and (in)directness. Natural Language Semantics.
Lewis, D. (1980). Index, context, and content. In Kanger, S. and Ohman, S., editors, Philosophy and Grammar, pages 79–100. D. Reidel.
MacFarlane, J. (2011). Epistemic modals are assessment sensitive. In Egan, A. and Weatherson, B., editors, Epistemic Modality. Oxford University Press.
MacFarlane, J. (2014). Assessment Sensitivity: Relative Truth and Its Applications. Oxford University Press, Oxford.
Mandelkern, M. (2017a). Coordination in Conversation. PhD thesis, Massachusetts Institute of Technology.
Mandelkern, M. (2017b). A solution to Karttunen's problem. In Truswell, R., Cummins, C., Heycock, C., Rabern, B., and Rohde, H., editors, Sinn und Bedeutung 21, pages 827–844.
Mandelkern, M. (2018a). How to do things with modals. To appear in Mind & Language.
Mandelkern, M. (2018b). What 'must' adds. To appear in Linguistics and Philosophy.
Mandelkern, M. (2019). Bounded modality. The Philosophical Review, 181(1).
Matthewson, L. (2015). Evidential restrictions on epistemic modals. In Alonso-Ovalle, L. and Menendez-Benito, P., editors, Epistemic Indefinites. Oxford University Press, New York.
Moss, S. (2015). On the semantics and pragmatics of epistemic vocabulary. Semantics and Pragmatics, 8(5):1–81.
Mossakowski, T., Diaconescu, R., and Tarlecki, A. (2009). What is a logic translation? Logica Universalis, 3(1):95–124.
Ninan, D. (2010). Semantics and the objects of assertion. Linguistics and Philosophy, 33(5):355–380.
Ninan, D. (2016). Relational semantics and domain semantics for epistemic modals.
Journal of Philosophical Logic, 47(1):1–16. Ninan, D. (2018). Quantification and epistemic modality. The Philosophical Review, 127(2):433–485. Peters, S. and Westerståhl, D. (2008). Quantifiers in Language and Logic. Oxford University Press. Pinheiro Fernandes, D. (2017). Translations: generalizing relative expressiveness between logics. Manuscript, University of Salamanca. https://arxiv.org/abs/1706.08481. Rabern, B. (2012). Against the identification of assertoric content with compositional value. Synthese, 189(1):75–96. Rabern, B. (2013). Monsters in Kaplan's logic of demonstratives. Philosophical Studies, 164:393–404. Rothschild, D. (2011). Expressing credences. In Proceedings of the Aristotelian Society, volume 112, pages 99–114. Rothschild, D. (2017). Veltman-Yalcin. http://danielrothschild.com/dyncon/vy/. Rothschild, D. and Klinedinst, N. (2015). Quantified epistemic modality. Handout for talk at Birmingham. Rothschild, D. and Yalcin, S. (2015). On the dynamics of conversation. Noûs, 51(1):24–48. Rothschild, D. and Yalcin, S. (2016). Three notions of dynamicness in language. Linguistics and Philosophy, 39(4):333–355. Schultz, M. (2010). Epistemic modals and informational consequence. Synthese, 174(3):385–395. Sherman, B. (2018). Open questions and epistemic necessity. The Philosophical Quarterly. Stalnaker, R. (1984). Inquiry. MIT. Steinert-Threlkeld, S. (2017). Communication and Computation: New Questions About Compositionality. PhD thesis, Stanford University. Stephenson, T. (2007). Judge dependence, epistemic modals, and predicates of personal taste. Linguistics and Philosophy, 30(4):487–525. Swanson, E. (2015). The application of constraint semantics to the language of subjective uncertainty. Journal of Philosophical Logic, 45(121):121–146. Veltman, F. (1985). Logics for Conditionals. PhD thesis, University of Amsterdam. Veltman, F. (1996). Defaults in update semantics. Journal of Philosophical Logic, 25(3):221–261. Willer, M. (2013). 
Dynamics of epistemic modality. Philosophical Review, 122(1):45–92. Yalcin, S. (2007). Epistemic modals. Mind, 116(464):983–1026. Yalcin, S. (2011). Nonfactualism about epistemic modality. In Egan, A. and Weatherson, B., editors, Epistemic Modality, pages 295–332. Oxford University Press. Yalcin, S. (2012). Context probabilism. In Aloni, M., Kimmelman, V., Roelofsen, F., Sassoon, G. W., Schulz, K., and Westera, M., editors, The 18th Amsterdam Colloquium, pages 12–21. Yalcin, S. (2015). Epistemic modality de re. Ergo, 2(19):475–527.