To Be F Is To Be G
Cian Dorr
Published in Philosophical Perspectives 30 (2016): pp. 1–97
Please cite final published version

1 Identifications

I am interested in a certain way of understanding claims of the form 'To be F is to be G', which I take to have played a central role in philosophy from its inception. Here are some examples where the target reading is natural:

(1) a. To be a vixen is to be a female fox.
b. To be square is to be rectangular and equilateral.
c. To be just is to be such that each part of one's soul does its own proper work.
d. To be a human being is to be a rational animal.
e. To be a hydrogen atom is to be an atom whose nucleus contains exactly one proton.
f. To be a continuous function is to be such that for every open set in one's range, the set of things in one's domain which one maps to members of that set is open.

As (1c) and (1d) illustrate, questions whose answers can be given in the form 'To be F is to be G' have been of central interest to philosophers since the beginning. (1e) illustrates that we cannot always tell whether to be F is to be G using "armchair" methods: sometimes, we need to do experiments. But not always, as witness (1f).

The reading I am interested in is not the only possible reading of 'To be F is to be G'. The readings in play in the following examples seem rather different in character:

(2) a. To be a black athlete in Colombia is to be constantly reminded of your otherness.1
b. To be a teacher is to be forever an optimist.2
c. To be red is to be coloured.
d. To be a vixen is to be female.

1 http://www.okayafrica.com/controversial/afro-colombian-gold-medalist/.

Here are some diagnostics which may help us isolate the target reading. First: one can encourage the target reading by emphasising (focusing) the word 'is'. One says: 'To be a vixen is to be a female fox'. Or: 'To be a vixen just is to be a female fox'.
Second: on the target reading, 'To be F is to be G' can be rephrased as 'To be F and to be G are the same/are one and the same', or (perhaps more idiomatically) 'Being F and being G are the same'.3 Third: on the target reading, 'To be F is to be G' is non-contingent: necessary if true, impossible if false. Fourth: on the target reading, 'To be F is to be G' entails 'Necessarily, all and only F things are G'. And fifth: on the target reading, the claim that to be F is to be G constitutes a very satisfying explanation of the fact that necessarily, all and only F things are G. One might be puzzled as to why it should be necessary that everything F is G, or that everything G is F. But if to be F is to be G, there is nothing more to be puzzled about. The work 'is' does in 'To be F is to be G' on the target reading seems very like the work it does in paradigm cases where it is said to express identity, like 'Hesperus is Phosphorus' and 'This is she'. Indeed, I think that this parallel is the key to understanding our 'is'. However, there is a way of explaining the parallel that we must be wary of, which is to treat the target sentences simply as ways of talking about the identity of properties, so that (1a), for example, would be assimilated to (3): (3) The property of being a vixen is the property of being a female fox. This assimilation misses the fact that the expressions flanking 'is' in (1a) and (3) differ in syntactic category: (4) a. I hope to be an astronaut. b. *I hope the property of being an astronaut. The inference from (1a) to (3) seems similar in status to the inference from 'Nelly is a vixen' to 'Nelly has the property of being a vixen'. Nominalists who deny that the latter inference is strictly truth-preserving (on the grounds that strictly speaking there are no properties) will presumably say the same about the former inference. 
It is not incoherent to be a nominalist while fully endorsing claims like (1a); or at least, such a combination is incoherent only if nominalism itself is incoherent. Indeed, even those with no nominalistic sympathies have reason to be careful about the inference from 'To be F is to be G' to 'The property of being F is the property of being G'. For by such a move one can get from the unproblematic (5a) to the paradoxical, deeply controversial (5b):

(5) a. To be a non-self-instantiator is to fail to instantiate oneself.
b. The property of being a non-self-instantiator is the property of failing to instantiate oneself.

For these reasons, explicit property-talk will only play a heuristic role in our investigation of 'To be F is to be G'.

2 http://www.presidency.ucsb.edu/ws/?pid=56389.
3 Zoltan Szabó (p.c.) tells me that whereas the grammaticality of 'To be F is to be G' is idiosyncratic to English, the forms with 'are the same' are more widespread cross-linguistically.

One important parallel between the 'is' in the target sentences and the 'is' in 'Hesperus is Phosphorus' involves the point I made earlier about explaining necessity. The fact that Hesperus is Phosphorus explains in a supremely satisfying way why it is necessary that everyone who lands on Hesperus will land on Phosphorus. Identities are excellent stopping places for explanation; they do not cry out for explanation in their own right. Indeed, there is something odd about questions like 'Why is Hesperus Phosphorus?'. Unless this is understood as a request to be reminded of the reasons for believing that Hesperus is Phosphorus, it is hard to know what would count as a satisfying answer. It is tempting to respond by citing some metalinguistic facts, as if one had been asked why 'Hesperus' refers to the same thing as 'Phosphorus'. But of course that is a quite different question. And this also applies to questions like 'Why is it that to be a vixen is to be a female fox?'.
Once we set aside the "remind me of reasons to believe" reading, and metalinguistic questions about the word 'vixen', it is hard to see what an answer would even look like.4

4 Rayo 2013 strongly emphasises this point, which he calls 'why closure'. However 'Why is it that to be F is to be G?' is not always odd in this way. 'Why is it that to be a hungry vixen is to be a female fox feeling in need of food?' seems to be quite reasonably answered by 'Because to be a vixen is to be a female fox, and to be hungry is to feel in need of food'. The oddity seems specific to the case where at least one of the expressions flanking 'is' lacks relevant syntactic complexity, like 'to be a vixen'. Thanks to Timothy Williamson for discussion on this point.

There are other environments besides 'To be F is to be G' where something that looks like the 'is' of identity occurs with arguments that are syntactically different from ordinary determiner phrases (like 'Hesperus' and 'the property of being an astronaut'); despite the title, these other constructions will also be part of the topic of this paper. For one thing, we can put untensed verb phrases involving verbs other than 'be' on either side of 'is':

(6) a. To be a triangle is to have three angles.
b. To die is to cease to live.
c. To kill is to cause to die.
d. ? To resemble is to be similar to.
e. To kill something is to cause it to die.

We can also use gerunds:

(7) a. Being square is being rectangular and equilateral.
b. Killing is causing to die.
c. Being a non-self-instantiator is not instantiating oneself.

These seem equivalent to their infinitival counterparts ((1b), (6c), and (5a)).5 Finally, 'is' can be flanked by untensed clauses headed by 'for':

(8) a. For there to be vixens is for there to be female foxes.
b. For Obama to be a bachelor is for Obama to be an unmarried man.
c. For something to be square is for it to be rectangular and equilateral.
d. For someone to kill someone is for them to cause them to die.
e.
For a line x to be parallel to another line y is for x and y to be coplanar and non-intersecting. 'To be F is to be G' seems obviously equivalent (on the relevant reading) to 'For something to be F is for it to be G'. But 'for' clauses are more flexible: for example, they seem to provide the only way of getting across what is expressed by (8b) or (8e). I will introduce the colourless label 'identifications' for sentences in which 'is' is the main verb, flanked by expressions of the kinds just considered, and understood in the way I have tried to make salient. In other work I have used other labels for the same class of sentences: 'metaphysical analyses' (Dorr 2004, 2005) and 'real definitions' (Dorr 2007). But I will not use these labels here, since it is artificial to speak of 'analyses' and 'definitions' as species of sentences: a proper account of analyses and definitions should go hand in hand with an account of the activities of analysing 5It is somewhat more tempting to assimilate 'Being F is being G' to 'The property of being F is the property of being G', perhaps using the italicised 'Being F is being G' as an intermediary. But so long as one recognises an unproblematic reading for (5a), it seems arbitrary to deny that (7c) has a corresponding reading. 4 and defining, of which they seem to be the products (in the sense in which assertions and promises are the products of asserting and promising).6 These activities seem to centrally involve believing, knowing, and/or asserting things expressible by identifications: for example, 'She analysed lying as intentionally misleading' seems roughly equivalent to 'She maintained that to lie is to intentionally mislead'.7 However, in this paper we will be concerned with the activities only insofar as we are concerned with their cognitive subject matter. The questions I will be investigating concern what one might call the logic of identifications. But all I mean by this is that the questions are extremely general ones. 
For example, we will not consider whether to be morally right is to maximise happiness, but we will ask whether, for arbitrary F and G, to be F is to be F and either G or not G. There is no suggestion that the answers to 'logical' questions must be in some sense non-substantive, or analytic, or neutral with respect to the more specific disputes. On the contrary, I believe that these questions are among the hardest and deepest questions in metaphysics, and that differences in how we answer them will interact very significantly with differences in how we approach more specific questions. The order of events will be as follows. §2 will articulate some logical principles that should be relatively uncontroversial, insofar as they simply generalise standard principles of the logic of identity. §3 and §4 will introduce some formal tools for regimenting identifications, and re-express the basic principles from §2 in terms of them. The remainder of the paper will then use these tools to formulate certain further principles that I think should be controversial, and to explore some of their consequences. 2 Basics Given the intimate relation we have seen between the 'is' in identifications and the 'is' in ordinary identity sentences, it is clear that the logic of identifications must be importantly parallel to the logic of identity. At a minimum, all instances of the following schemas had better be true: 6Also, 'real definition' is opposed to 'nominal definition', so there is pressure to take them to be two species of some genus. If we take real definitions to be declarative sentences, we will therefore be expected to take nominal definitions to be declarative sentences too. But which sentences would they be? 7The same goes for 'She reduced lying to intentionally misleading'. But there are several other uses of 'reduction' floating around. In my view, the use of the word 'reduction' is such a mess that we would do better to ban it. 
Reflexivity: To be F is to be F
Transitivity: If to be F is to be G and to be G is to be H, then to be F is to be H
Symmetry: If to be F is to be G, then to be G is to be F

I have come across some resistance to Symmetry; indeed, I seem to have once rejected it myself. But it now strikes me as manifestly valid.8 If to be a vixen simply is to be a female fox (if being a vixen and being a female fox are one and the same), then of course it is equally true that to be a female fox is to be a vixen. However, there are two factors which may mask the obviousness of this implication. First, Symmetry fails for some of the other readings of 'To be F is to be G' which I attempted to set aside in §1: 'To be red is to be coloured' has a true reading, whereas 'To be coloured is to be red' seems not to. And second, even on the target reading, there are pragmatic factors that make 'To be a vixen is to be a female fox' a more natural-sounding speech than 'To be a female fox is to be a vixen'. These are plausibly explained in terms of the idea that certain declarative sentences make salient certain questions, to which they present themselves as helpful answers.9 In particular, 'To be F is to be G' suggests the question 'What is it to be F?' in a way that it does not suggest the question 'What is it to be G?'. And while 'To be a vixen is to be a female fox' is a helpful answer to 'What is it to be a vixen?', 'To be a female fox is to be a vixen' is, although true, not a helpful answer to 'What is it to be a female fox?'. Someone who asked this would more likely be hoping for something like 'To be a female fox is to be a fox with two X chromosomes'.
However, these pragmatic effects are quite context-sensitive: there are settings where 'To be a female fox is to be a vixen' seems completely fine, for example that of an argument about whether some particular animal is a vixen or not.10

Some resist Symmetry because they think identifications are intimately connected in some way with some asymmetric notion such as "metaphysical priority" or "grounding". To my mind, the most plausible way to forge such a connection, which we will revisit in §9, is to claim that 'To be a vixen is to be a female fox' entails that being female and being a fox are each metaphysically prior to being a vixen. Perhaps it also entails that being a fox and being female jointly ground being a vixen. But we should be careful to distinguish these claims, which are perfectly consistent with Symmetry, from the claims that 'To be a vixen is to be a female fox' entails that being a female fox is metaphysically prior to, or a ground of, being a vixen. If one thinks that being a vixen is being a female fox, I don't think one should feel any pull towards these latter claims; but they are what one would need to rely on to make a priority- or grounding-theoretic argument against Symmetry.

8 Thanks to Kieran Setiya for convincing me of this.
9 This intuitive thought has been fruitful in explaining many pragmatic phenomena: see, e.g., Simons et al. (2010).
10 This ordering-bias is especially noticeable in claims of the form 'What it is to be F is to be G'. But while I am not sure what to make of this 'What it is' syntactically or semantically, it seems unlikely that it could make for a difference in truth value.

If the 'is' in identifications works like 'is' in ordinary identity sentences, we should also expect its logic to include some analogue of Leibniz's Law or the principle of substitution of identicals. For example, the following seem to be consequences of (1b) ('To be square is to be rectangular and equilateral'):

(9) a.
Everything square is rectangular and equilateral. b. If it is possible for there to be a square planet, it is possible for there to be a rectangular and equilateral planet. c. To be either round or square is to be either round, or rectangular and equilateral. d. For it to be necessary that everything square is square is for it to be necessary that everything square is rectangular and equilateral. Perhaps there are exceptions to this pattern. For example, (10a)–(10c) seem prima facie to be false despite the truth of (1b): (10) a. Anyone who wants a square garden wants a rectangular and equilateral garden. b. Necessarily, whoever believes that something is square believes that it is rectangular and equilateral. c. To say that everyone believes that everything square is square is to say that everyone believes that everything square is rectangular and equilateral The case is not beyond dispute; the literature on attitude reports contains a whole battery of techniques which could be used to explain away the apparent invalidity of arguments like these.11 But assuming we take the appearance at face value, it points towards something quite distinctive about linguistic environments like 'Someone 11Examples include Stalnaker's pragmatic techniques for defending the validity of substitution of necessarily equivalent sentences in attitude reports (Stalnaker 1999), and the pragmatic (Salmon 1986a, Soames 1987b), error-theoretic (Braun 1988, Saul 2007), and contextualist (Dorr 2014c, Schiffer 1979) techniques that have been used to defend the validity of substitution of co-referential names in attitude reports. 7 believes that...'-something that distinguishes them, for example, from the environments created by 'and', 'not', 'all', 'some', and 'it is metaphysically necessary that...'. 
As we might put it, the former environments are "opaque", sensitive to distinctions in "mode of presentation", whereas the latter are "transparent", only concerned with distinctions "out in the world".12 This picture strongly suggests that any operators we might need to appeal to in stating questions that are central to the subject matter of metaphysics should be transparent. For example, if 'because' is supposed to express a "worldly" notion of explanation, something like grounding, then if we think that to be a vixen is to be a female fox, and reject the claim that every vixen is a vixen because it is a vixen, we must also reject (11):

(11) Every vixen is a vixen because it is a female fox.

Perhaps (11) has true readings where 'because' is understood in epistemological or psychological or metalinguistic terms. But how could it be true if it is just supposed to be about how things are, out in the world?13 Note that one could endorse this argument against (11) while still accepting (12):

(12) Every vixen is a vixen because it is female and it is a fox.

Given that to be a vixen is to be a female fox and that 'because' is transparent, (12) implies (13):

(13) Every female fox is a female fox because it is female and it is a fox.

But (13) is not obviously false, or inconsistent with the irreflexivity of 'because'. In §5 we will consider a certain fine-grained picture which systematically rejects claims like (14):

(14) For something to be a female fox is for it to be the case that it is female and it is a fox.

12 The semantically relevant notion of "mode of presentation" need not be conceived along Fregean lines.
One might tie differences in modes of presentation to differences in syntactic structure, holding that the only cases where substitution of 'G' for 'F' changes the truth value of an attitude report despite the truth of 'To be F is to be G' are cases where 'F' and 'G' differ in syntactic structure, so that in particular such substitution is always legitimate when 'F' and 'G' are syntactically simple predicates like 'groundhog' and 'woodchuck' or 'doctor' and 'physician' (see Salmon 1986a, 2010, Soames 1987a).
13 Most theorists of grounding stress the "worldly" or "metaphysical" character of grounding in introducing the notion (Audi 2012, Fine 2012, Rosen 2010). However, Correia (2010) distinguishes a 'worldly' and 'conceptual' sense of 'ground', and advocates very different logics for the two. But I have little grip on what the conceptual sense of 'ground' is supposed to be, or how it is supposed to relate to other, non-grounding-theoretic uses of 'because'.

And if (14) is false, there is no route from (13) to a violation of the irreflexivity of 'because'.

The thesis that operators that are central to the subject matter of metaphysics should be transparent is relevant to the suggestion, which I have encountered in discussion, that the use of 'To be F is to be G' that is of most importance for metaphysics is one of the readings for which Symmetry fails. I imagine that those who made this suggestion were thinking that the metaphysically important reading would be one on which 'To be a vixen is to be a female fox' is true while 'To be a female fox is to be a vixen' is false.14 I doubt that there is any such reading. But even if there were one, the operator expressing it would be opaque, and thus if the thesis is correct, it cannot have the claimed kind of importance.

The thesis is controversial enough to be worth one last round of argument. Consider how we use words introduced by explicit definitions.
For example, let me now define 'schmixen' by issuing the following stipulation: (15) To be a schmixen is to be a female fox. When we have introduced a word like this, we can substitute the definiendum ('schmixen') for the definiens ('female fox') in a wide range of contexts. Granted, such substitutions are problematic in speech and attitude reports. Talking to someone who misheard my introduction of the word 'schmixen' and went on to get quite confused, it seems like we could speak the truth by saying 'You have been confusedly thinking that schmixens are e-mail boxes, but really they are female foxes', even though there is no true reading of 'You have been confusedly thinking that female foxes are e-mail boxes'. However, whatever is going on here is something quite distinctive about our practice of characterising psychological states, and does not impugn the usual practice of substituting stipulatively defined words for their definienda in non-psychological contexts. But if we are happy substituting 'schmixen' for 'female fox' in such contexts, why would we not substitute 'vixen' for 'female fox'? The differences in how 'schmixen' and 'vixen' got their meanings may be relevant to epistemology, but it is hard to see how it could matter for metaphysics. Speaking a language which didn't provide a single word meaning 'vixen' would not cut one off from any facts: at worst, it would diminish one's repertoire of modes of presentation of the facts. 14Rosen (2010, p. 123) equates 'p reduces to q' with 'For it to be the case that p just is for it to be the case that q', while taking it for granted that if p reduces to q, q does not reduce to p. Similarly Fine (2015, p. 308) proposes an asymmetric reading for 'IS' on which 'H2O IS water' must be false if 'Water IS H2O' is true. 
3 Formalisation

To make further progress in debating the logic of identifications, it will be helpful to have a formal language in which identifications can be stated without the idiosyncratic syntactic trappings that English grammar requires. Two different approaches to formalisation suggest themselves: rather than plumping for one of them, I will present both, since they both have their advantages and there is something to be learnt by thinking about how they can be translated into one another.

The first approach is to make the formal-language counterpart of the 'is' in identifications something that combines with two predicates to make a sentence. So for example, we could represent 'To be triangular is to be trilateral' simply as 'Triangular ≡ Trilateral', where 'Triangular' and 'Trilateral' are two one-place predicates (so that, e.g., we could write '∃x(Triangular(x))' for 'Something is triangular').

To formalise sentences like 'To be square is to be rectangular and equilateral', our language will need some mechanism for building complex predicates. English contains several such mechanisms: for example we have complex adjectival phrases like 'rectangular and equilateral', complex noun phrases like 'female fox', and complex verb phrases like 'has at least one proper part'. But in a formal language we would hope to get by with something more uniform. The most familiar formal device for forming complex predicates is lambda-abstraction, whereby we combine a list of n distinct variables v₁...vₙ with an open sentence φ to form an n-place predicate, written as (λv₁...vₙ.φ), which applies to objects x₁, ..., xₙ iff φ is true on an assignment that maps v₁ to x₁, ..., and vₙ to xₙ. This gives us such translations as the following:

(16) a. Square ≡ λx.Rectangular(x) ∧ Equilateral(x)
To be square is to be rectangular and equilateral.
b. Composite ≡ λx.∃y(ProperPart(y, x))
To be composite is to have a proper part.
c.
Parallel ≡ λxy.Line(x) ∧ Line(y) ∧ Coplanar(x, y) ∧ ¬Intersect(x, y)
For an object x to be parallel to an object y is for x and y to be coplanar lines that do not intersect.

We can also allow ≡ to be flanked by two sentences, allowing for sentences like

(16) d. ∃x(Vixen(x)) ≡ ∃x(Female(x) ∧ Fox(x))
For there to be vixens is for there to be female foxes.

This is no great departure, since sentences can be thought of as 0-adic predicates.15

15 It would be ideally perspicuous to mark a symbolic difference between the different syntactic

Philosophers occasionally use λ-terms as formal equivalents of English expressions of the form 'The property of being F'. Given that I do not want to equate 'To be F is to be G' with 'The property of being F is the property of being G', it is important to be clear that this is not how I am using λ-terms: they are predicates, just like 'is square' and 'is rectangular and equilateral'. If one does want a systematic translation into English, perhaps the best option is to translate 'λx.(...x...)' as 'is such that ...he/she/it/one ...' (with the choice depending on the requirements of syntactic agreement).16 The fact that many English sentences may contain unpronounced variables for items like instants or intervals of time, events, or situations makes the task of translating English identifications into this formal language more complicated than it might initially seem. For example, the currently dominant approach to tense assigns additional arguments to ordinary adjectives, verbs and nouns, which are saturated by time variables present in the syntax.17 Against this background, it is plausible that the most natural reading of 'To be a vixen is to be a female fox' would require a representation like '(λxt.Vixen(x, t)) ≡ (λxt.Female(x, t) ∧ Fox(x, t))'.
Likewise, one might use event arguments to formalise 'To kill is to cause to die' as '(λexy.e is a killing by x of y) ≡ (λexy.e is a causing-to-die by x of y)'.18 I will ignore these subtleties in what follows by confining my attention to stative sentences and assuming that no time arguments are needed. The second approach to formalisation attempts to stay closer to the structure of identifications in English, especially the ones using 'for' clauses. In these sentences, pronouns on the right hand side of 'is' can be syntactically linked to indefinites on the roles in which the '≡' symbol can appear-for example, between the '≡' in (16a) (which combines with two monadic predicates to make a sentence) and the one in (16c) (which combines with two dyadic predicates to make a sentence). But there is no practical need for this, since it can always be reconstructed from the types of the arguments. 16This departs slightly from the translations given above-the more faithful rendition of (16a) would be 'To be square is to be such that one is rectangular and one is equilateral'. This distinction does not matter if (as I believe) to be rectangular and equilateral is to be such that one is rectangular and one is equilateral. However, some proponents of an extreme version of the "structured" picture (see §6) may reject this identification, on the grounds that there is a difference in syntactic structure between the two sides. On this view, the kind of formal language we are currently working with is incapable of expressing the fact we express in English by 'To be square is to be rectangular and equilateral'. Expressing this will require a language that contains a "predicate functor" that can do what 'and' seems to do in this sentence: combine directly with two predicates to form a new predicate, without forming an open sentence as an intermediary. 17For a helpful survey, see Kusumoto 1999, ch. 1. 
18 The event-theoretic analysis of 'e is a causing-to-die by x of y' is itself controversial. One possibility is to understand it as equivalent to 'Cause(e) ∧ Agent(x, e) ∧ ∃f(Die(f) ∧ Theme(y, f) ∧ Theme(f, e))'.

left hand side, and the meaning of the whole sentence turns on the pattern of links. Consider 'For someone to kill someone is for them to cause them to die'. This is ambiguous: using subscripts, we can distinguish the two readings as 'For someone₁ to kill someone₂ is for them₁ to cause them₂ to die' (which is roughly true) and 'For someone₁ to kill someone₂ is for them₂ to cause them₁ to die' (which is obviously false). This structural ambiguity is clearly the same sort of thing as the structural ambiguity in 'Someone told someone she would kill her': it is a matter of different patterns of "coindexing", which formal languages usually resolve by using numerically distinct variables as the counterparts of pronouns in natural language. This suggests a formal treatment where the operator playing the role of 'is' is something that always combines with two (open or closed) sentences to make a new sentence, and may bind some of the variables that occur free in those sentences.19 We can write the list of bound variables as a subscript to ≡.20 This yields formalisations like the following:

(17) a. Composite(x) ≡_x ∃y(ProperPart(y, x))
For something to be composite is for there to be a proper part of it.
b. Parallel(x, y) ≡_{x,y} (Line(x) ∧ Line(y) ∧ Coplanar(x, y) ∧ ¬Intersect(x, y))
For an object x to be parallel to an object y is for x to be a line and y to be a line and x to be coplanar with y and x not to intersect y.
c. ∃y[Country(y) ∧ (German(x) ≡_x From(x, y))]
There is a country y such that for something to be German is for it to be from y.

This approach might seem to fit 'To be F is to be G' worse than 'For something to be F is for it to be G'. But the case is not so clear.
Orthodox syntax posits an unpronounced pronoun-like constituent 'PRO' in such sentences: really we are dealing with 'PRO to be F is PRO to be G'. Whether 'PRO to be F' and 'PRO to be G' combine with local lambda operators before encountering 'is', or whether the binding occurs at the level of the whole sentence, involves hard questions of syntax and semantics that I will not try to resolve here.21, 22

19 The question what mechanisms underlie the behaviour of indefinites like 'something' in 'For someone to kill someone is for them to cause them to die' is a controversial one in linguistics. In dynamic semantics, it is standard to think of indefinites as variable-like rather than quantifier-like: the quantificational meanings of sentences like 'Something is in this box' arise through a. An alternative approach would be to think of indefinites as still being quantifier-like, but as having covert restrictors that may include variables: 'For someone identical to x₁ to kill someone identical to x₂ is for them₁ to cause them₂ to die' (Elbourne 2005). Further options arise if we want indefinites to be predicate-like, as in Graff 2001. I hope these debates can be bypassed for present purposes.
20 Rayo (2013) also uses this notation, with what I take to be the same interpretation.

Even if we were convinced that the sentential approach provides a more adequate formalisation, we can still make sense of the predicate formalism by understanding R ≡ S as shorthand for R(v₁, ..., vₙ) ≡_{v₁,...,vₙ} S(v₁, ..., vₙ) (where R and S are n-ary predicates, and v₁, ..., vₙ are arbitrarily chosen, distinct variables not free in R or S). Since both uses of ≡ were explained by reference to identifications in English, this translation should be good insofar as 'To be F is to be G' and 'For something to be F is for it to be G' are interchangeable in English.
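As a small worked illustration of this shorthand, here is how the predicate identification for 'square' unpacks into sentential form. This is only a sketch: it assumes the predicate names of (16a), writes the bound-variable list as a subscript to ≡, adds explicit parentheses around the body of the λ-term, and uses v as an arbitrarily chosen variable.

```latex
% Requires amsmath. Predicate names follow (16a); v is an arbitrary variable
% assumed not to occur free in either predicate.
\begin{align*}
\text{Predicate form:}\quad
  & \mathrm{Square} \equiv \lambda x.\,(\mathrm{Rectangular}(x) \land \mathrm{Equilateral}(x)) \\
\text{Sentential form:}\quad
  & \mathrm{Square}(v) \equiv_{v} \bigl(\lambda x.\,(\mathrm{Rectangular}(x) \land \mathrm{Equilateral}(x))\bigr)(v)
\end{align*}
```

The right-hand side of the sentential form still contains an unreduced λ-application. The shorthand by itself does not license replacing it with 'Rectangular(v) ∧ Equilateral(v)'; that would be a further step.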
Note however that many sentential identifications are not the translations of any predicate identifications, since they are not syntactically of the form R(v₁, ..., vₙ) ≡_{v₁,...,vₙ} S(v₁, ..., vₙ) for any R, S, v₁, ..., and vₙ. Because of this, any proposal for a translation in the opposite direction, mapping every sentential identification to a predicate identification, will be more controversial. In particular, the obvious suggestion to translate φ ≡_{v₁,...,vₙ} ψ as (λv₁...vₙ.φ) ≡ (λv₁...vₙ.ψ) leads, in combination with the translation in the other direction, to the conclusion that φ ≡_{v₁,...,vₙ} ψ is always equivalent to (λv₁...vₙ.φ)(u₁, ..., uₙ) ≡_{u₁,...,uₙ} (λv₁...vₙ.ψ)(u₁, ..., uₙ). If we take 'equivalent' in the sense of '≡', this is controversial for reasons we will consider in §5. Indeed, even the claim of material equivalence is controversial for certain φ and ψ. For example, let φ abbreviate some closed sentence, say 'snow is white'. Someone might accept the truth of (λx.(λx.φ)(x)) ≡ (λx.φ) ('to be such that one is such that snow is white is to be such that snow is white') while rejecting (λx.φ)(x) ≡_x φ ('for something to be such that snow is white is for snow to be white'), on the grounds that it is not the case (for example) that for Obama to be such that snow is white is for snow to be white, given that Obama's being such that snow is white is about Obama in a way that snow's being white is not.

Let us now consider how to formalise the basic logical principles discussed informally in §2. In the predicate approach, we can use analogues of the usual identity

21 The claim that PRO in infinitival clauses must (at least in certain environments) be semantically bound by a local lambda abstractor features in a popular explanation of the 'de se' character of expressions like 'She expects to φ' and 'She wants to φ' (Chierchia 1989).
22If there turns out to be a structural difference between the arguments of 'is' in 'For something to be F is for it to be G' and in 'To be F is to be G', this will raise the possibility that there is some subtle semantic difference between the forms, so that 'For it to be the case that (for something to be F is for it to be G) is for it to be the case that (to be F is to be G)' can be false-perhaps there are even cases where the two forms differ in truth value. This would require some revision in the present paper, which treats the forms as interchangeable. If evidence against this assumption emerged, I would need to focus more narrowly on 'To be F is to be G'; however I would insist that this kind of claim can intelligibly be generalised to polyadic predicates whether or not the generalisation is expressible in natural languages. (Thanks to Mark Schroeder for helpful discussion here.) 13 axioms: F ≡ FRef (F ≡ G) → (φ → [G/?F ]φ)LL where [G/?F ]φ is a sentence like φ except that one or more occurrences of F are replaced by occurrences of G, in such a way that no variables free in F ≡ G are bound in φ or [G/?F ]φ. I will adopt the convention that universal generalisations of instances of schemas count as instances; so for example, ∀x((λy.Rxy) ≡ (λy.Rxy)) is an instance of Ref. Symmetry and transitivity follow immediately using the instances (F ≡ G) → ((F ≡ F ) → (G ≡ F )) and (G ≡ F ) → ((G ≡ H) → (F ≡ H)) of LL; the truth-preservation principle (F ≡ G) → (F x ↔ Gx) follows from the instance (F ≡ G) → ((F x ↔ F x) → (F x ↔ Gx)). LL will of course have to be restricted if our language contains contexts that are "opaque" in the way that propositional attitude ascriptions seem prima facie to be. 
However, given the general perspective on the exceptional character of these contexts defended in §2, we may reasonably avoid the need to constantly make exceptions to generalisations like LL by simply stipulating that our formal languages will not contain any expressions that generate opaque contexts.

The analogous work in the sentential approach can be done by the following axioms:

Ref_s: φ ≡_{v1...vn} φ
Alphabetic Variation: (φ ≡_{v1...vn} ψ) → ([ui/vi]φ ≡_{u1...un} [ui/vi]ψ)
LL_s: (φ ≡_{v1...vn} ψ) → (χ → [φ/?ψ]χ)

where [ui/vi]φ is the result of replacing each free occurrence of vi in φ with a free occurrence of ui (re-lettering bound variables if necessary), and [φ/?ψ]χ is a sentence that results from χ by replacing one or more instances of the open sentence ψ with an instance of the open sentence φ, in such a way that no variable that is free in φ ≡_{v1...vn} ψ is bound in χ or [φ/?ψ]χ. Note that in this approach, Alphabetic Variation plays a crucial role in the derivation of many implications, such as Fx ≡_x Gx, Hx ≡_x Ix ⊦ Fx ∧ Hy ≡_{x,y} Gx ∧ Iy. By contrast, in the predicate approach, while we surely will want (λv1...vn.φ) ≡ (λu1...un.[ui/vi]φ) to be valid, it is not needed for the closest analogue of the above inference, namely F ≡ G, H ≡ I ⊦ (λxy.Fx ∧ Hy) ≡ (λxy.Gx ∧ Iy).

4 Higher-orderese

My refusal to take 'The property of being F is the property of being G' as more than a heuristic gloss on 'To be F is to be G' is reminiscent of a certain attitude towards the formalism of higher-order logic defended by, amongst others, Prior (1971) and Williamson (2003).
This position holds that sentences like '∃F(F(Frege) ∧ F(Church))' and '∃F(□∀xF(x))' are intelligible and true, but only at best heuristically or misleadingly glossed by sentences using standard English quantified noun phrases, like 'Some property is instantiated by both Frege and Church', and 'Some property is necessarily instantiated by everything'.23 Perhaps there just aren't any natural-language sentences strictly synonymous with these formal sentences: if so, we will have to learn the language of higher-order logic by the same "direct method" we use when learning foreign languages by immersion (Williamson 2003, p. 459).

I will not be arguing for this kind of embrace of higher-order quantification here. But I do want to discuss several ideas about identification which can be more cleanly articulated using higher-order resources.

The formal higher-order language I have in mind is that of simple relational type theory with lambda abstraction. It works as follows (see Appendix A1 for a more rigorous presentation). Every syntactic unit is a term having a particular type, or syntactic category. Types are defined as follows: the letter e is a type ("the type of objects"); for any n ≥ 0 and types τ1, ..., τn, the n-tuple ⟨τ1, ..., τn⟩ is a type; nothing else is a type. Formulae (open or closed sentences) are terms of type ⟨⟩ ("propositional type"). n-place predicates of the familiar sort (expressions that can combine with n first-order variables to form sentences) are terms of type ⟨e, ..., e⟩. There are variables of all types; we may indicate the type of a variable by means of a superscript on its first occurrence. For any terms A, B1, ..., Bn of types ⟨τ1, ..., τn⟩, τ1, ..., τn respectively, A(B1, ..., Bn) is a formula. For any distinct variables v1, ..., vn of types τ1, ..., τn and formula φ, (λv1...vn.φ) is a term of type ⟨τ1, ..., τn⟩.
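The formation rules just stated can be illustrated with a small type checker. This is only a sketch under assumptions of my own choosing: types are encoded as the string "e" or as Python tuples (so the empty tuple plays the role of ⟨⟩), constants such as 'Frege' or '∧' are modelled as typed variables, and the class and function names (Var, App, Lam, type_of, is_well_typed) are hypothetical rather than drawn from the paper or its Appendix.

```python
from dataclasses import dataclass

# Encoding of types (an assumption of this sketch):
#   "e"               the type of objects
#   a tuple of types  the type <t1, ..., tn>; the empty tuple () is the
#                     propositional type <>
E = "e"
PROP = ()


@dataclass(frozen=True)
class Var:
    """A variable (or, for simplicity, a constant) with a declared type."""
    name: str
    ty: object


@dataclass(frozen=True)
class App:
    """An application A(B1, ..., Bn)."""
    head: object
    args: tuple


@dataclass(frozen=True)
class Lam:
    """An abstract (λv1...vn.φ)."""
    params: tuple
    body: object


def type_of(term):
    """Return the type of a term, or raise TypeError if it is ill-formed."""
    if isinstance(term, Var):
        return term.ty
    if isinstance(term, App):
        head_ty = type_of(term.head)
        arg_tys = tuple(type_of(a) for a in term.args)
        # A must have type <t1, ..., tn>, where ti is the type of Bi.
        if head_ty != arg_tys:
            raise TypeError("argument types do not match the head's type")
        return PROP  # every application is a formula
    if isinstance(term, Lam):
        names = [v.name for v in term.params]
        if len(set(names)) != len(names):
            raise TypeError("bound variables must be distinct")
        if type_of(term.body) != PROP:
            raise TypeError("the body of an abstract must be a formula")
        return tuple(v.ty for v in term.params)
    raise TypeError("not a term")


def is_well_typed(term):
    """True if type_of succeeds on the term."""
    try:
        type_of(term)
        return True
    except TypeError:
        return False


frege, church = Var("Frege", E), Var("Church", E)
x = Var("x", E)
F = Var("F", (E,))
AND = Var("∧", (PROP, PROP))

both = App(AND, (App(F, (frege,)), App(F, (church,))))  # F(Frege) ∧ F(Church)
abstract = Lam((x,), App(F, (x,)))                       # (λx.F(x))
```

On this encoding, type_of(both) is the propositional type, type_of(abstract) is the type ⟨e⟩ of a one-place predicate, and an application such as F(F) is rejected because the argument's type ⟨e⟩ is not the type e that F expects.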
The logical constants are ¬ (of type ⟨⟨⟩⟩); ∨ and ∧ (of type ⟨⟨⟩, ⟨⟩⟩); and for every type τ, ∀_τ and ∃_τ (of type ⟨⟨τ⟩⟩), and ≡_τ (of type ⟨τ, τ⟩). Some abbreviations: we write ∀x^τ(φ) and ∃x^τ(φ) instead of ∀_τ((λx^τ.φ)) and ∃_τ((λx^τ.φ)). We freely use infix notation, e.g. writing φ ∧ ψ instead of ∧(φ, ψ). → abbreviates λp^⟨⟩ q^⟨⟩.¬p ∨ q, and ↔ abbreviates λp^⟨⟩ q^⟨⟩.(p → q) ∧ (q → p). We may omit parentheses and type annotations when they can be reconstructed unambiguously.

23 The point has nothing special to do with the word 'property': the reasons for not embracing these English sentences as unproblematic translations of the higher-order sentences also apply to corresponding sentences using 'concept', 'condition', etc.

By regimenting identification using a higher-order predicate ≡_τ, I am opting for the "predicate" formalism from the previous section. The convenience and elegance of having lambda-abstraction do all the work of variable-binding is just too great to pass up. This means that the language will lack the ability to neutrally regiment sentential identifications that are not obviously materially equivalent to any particular predicate identification. Fortunately, there are plenty of deep and controversial questions of identification that do have obvious (material) equivalents in our higher-order language, so we need not be too sad to postpone the rest for another occasion.

One question that can naturally be raised once higher-order quantifiers are in the language is the question whether identifications are materially equivalent to claims of higher-order indiscriminability. Schematically:

Indiscriminability: ∀x^τ ∀y^τ((x ≡_τ y) ↔ ∀z^⟨τ⟩(z(x) ↔ z(y)))

where τ may be any type. Indiscriminability can be derived from Ref and LL given standard classical logic. (x ≡_τ y) → ((z^⟨τ⟩(x) ↔ z(x)) → (z(x) ↔ z(y))) is an instance of LL, and classically implies (x ≡_τ y) → (z(x) ↔ z(y)); universal generalisation on this formula gives the left-to-right direction of Indiscriminability.
In the other direction, suppose that ∀z^⟨τ⟩(z(x) ↔ z(y)); then by universal instantiation, (λu.x ≡_τ u)(x) ↔ (λu.x ≡_τ u)(y), which implies x ≡_τ x ↔ x ≡_τ y (by "extensional β-conversion", see next section), so x ≡_τ y by Ref.

Some will reject Indiscriminability on the basis of an argument like this:

1. Hank believes all vixens are vixens and does not believe all vixens are female foxes.
2. So (λp^⟨⟩.Hank believes p)(all vixens are vixens) ∧ ¬(λp.Hank believes p)(all vixens are female foxes).
3. So ∃z^⟨⟨⟩⟩(z(all vixens are female foxes) ∧ ¬z(all vixens are vixens)).
4. All vixens are female foxes ≡ all vixens are vixens.
5. So, ∃p^⟨⟩ ∃q^⟨⟩(p ≡ q ∧ ∃z^⟨⟨⟩⟩(z(p) ∧ ¬z(q))).

Accepting the conclusion of this argument means holding that higher-order quantifiers themselves create an opaque context, so that we cannot add such quantifiers to the language without disrupting the ban on opacity required for all instances of LL to be true.

The one way of resisting this argument that I want to firmly rule out (at least considered as a general strategy) is that of denying 4: it is crucial to the philosophically important use of 'To be F is to be G' that such claims cannot be refuted so easily. But there are other ways of resisting the argument that are much more promising. One might reject 1, regarding it as literally false, or at least literally false on every uniform interpretation.24 One might (currently my favourite option) reject the inference from 1 to 2, treating propositional attitude ascriptions as creating a special syntactic environment that makes for exceptions to generally valid rules like extensional β-conversion (see §5), similar in this respect to "mixed quotation". Or, most radically, one might reject the classical rule of existential generalisation, and with it one or both of the inferences from 2 to 3 or from 3 and 4 to 5 (see Bacon and J. S. Russell MS).25

I think that we should accept Indiscriminability, and resist the argument against it in one of the above ways.
However, Indiscriminability will not play a major role in what follows; indeed, most of the claims we will be considering will be intelligible even to those who insist that higher-order quantifiers are meaningless. (Identification would give complex higher-order predicates a raison d'être even if there weren't any quantifiers around for them to serve as arguments for.) When I do on occasion use higher-order quantifiers in ways that assume Indiscriminability, those who reject Indiscriminability because of opaque contexts will in most cases be able to make sense of what's going on by interpreting the quantifiers as restricted to the domain of the transparent.26

It is worth mentioning that if we do accept Indiscriminability, there is a strong case to be made for strengthening it to an identification:27

The Identity Identity: ≡_τ ≡_⟨τ,τ⟩ λx^τ y^τ.∀z^⟨τ⟩(z(x) ↔ z(y))

In words: identification is higher-order indiscernibility. The Identity Identity provides a powerful explanation of the truth of Indiscriminability and of the instances of Ref and LL; it allows the entire logic of ≡ to be subsumed under that of quantification, which we need in any case.28 However, once we have Indiscriminability, the question whether The Identity Identity is true will only be relevant to questions that turn on embedded identifications, which will not be central in what follows.

24 For the ideology of uniform interpretations and its relevance to arguments like this one, see Dorr 2014c. There I argue that when N and M are ordinary proper names, 'If N = M, everyone who believes that ...N... believes that ...M...' is true on every uniform interpretation. One could imagine saying the same thing about all identifications that I say about identities between ordinary names. However, this generalisation is harder to defend than the view in the paper. It is fairly easy to imagine situations where 'Lois believes that Clark flies' could be used to assert a truth even though Lois would accept the sentence 'Clark doesn't fly, although Superman does'; it is much harder to imagine situations where 'Hank believes that there are female foxes' could be used to assert a truth even though Hank would accept the sentence 'There are no female foxes, although there are vixens: vixens are sexless Martian spy-robots, not female foxes'. The difference is that ordinary proper names (excluding compound names like 'Oxford University') lack relevant syntactic structure, whereas expressions in other categories can differ structurally in a way that seems to make for systematic, conventionalised differences in the communicative possibilities for speech and attitude reports involving them.

25 Another possible view is that higher-order quantification is ambiguous, so that Indiscriminability has both true and false readings.

26 This applies in particular to the initial universal quantifiers which, according to our convention concerning schemas, may be added to bind any otherwise free variables that appear in an instance of a schema.

5 Beta-equivalence

An object x is such that it is rectangular and it is equilateral if and only if x is rectangular and x is equilateral. More generally: x is such that ...it... iff ...x....
In a language with lambda abstracts, this pattern can be captured by the following schema:

Extensional β-equivalence: (λv1...vn.φ)(A1, ..., An) ↔ [Ai/vi]φ

where v1, ..., vn are any distinct variables, A1, ..., An are any terms (not necessarily all distinct), and [Ai/vi]φ is a sentence that results from replacing each free occurrence of v1 in φ with an occurrence of A1, and each occurrence of v2 with an occurrence of A2, and so on, replacing bound variables in such a way that no free variables in any Ai become bound. Extensional β-equivalence should be uncontroversial given that the language doesn't contain opaque contexts.29 A much more controversial question is whether this claim of coextensiveness can be strengthened to an identification:

Immediate β-equivalence: (λv1...vn.φ)(A1, ..., An) ≡_⟨⟩ [Ai/vi]φ

For example: for Nelly to be such that she is female and she is a fox is for it to be the case that Nelly is female and Nelly is a fox.

27 I have stolen this name from Bacon and J. S. Russell MS.

28 The Identity Identity is also convenient for metalogical purposes, since it lets us work with a shorter list of logical constants and a simpler definition of a model: for this reason I take it for granted in the Appendix. But this is only a convenience: reintroducing ≡ as a primitive in the model theory would be straightforward.

29 Extensional β-equivalence becomes controversial once we have expressions like 'believes' in the language. For example, it is controversial whether (λf.Hank believes ∀y(Vixen(y) → f(y)))(λz.Female(z) ∧ Fox(z)) → Hank believes ∀y(Vixen(y) → (λz.Female(z) ∧ Fox(z))(y)): this conditional will be false on views where the syntactic structure of the argument of 'believes' makes a truth-conditional difference.
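The substitution [Ai/vi]φ that figures in both schemas, including the proviso that bound variables are re-lettered so that no free variable of any Ai becomes bound, can be made concrete in a short sketch. The encoding is an assumption made purely for illustration: terms are nested Python tuples, constants such as 'Female' or '∧' are treated as free variables, types are ignored, and the helper names (var, app, lam, subst, beta_reduce_top) are hypothetical.

```python
import itertools


def var(name):
    return ("var", name)


def app(head, *args):
    return ("app", head, tuple(args))


def lam(params, body):
    return ("lam", tuple(params), body)


def free_vars(t):
    """Names occurring free in t (constants count as free names here)."""
    if t[0] == "var":
        return {t[1]}
    if t[0] == "app":
        names = free_vars(t[1])
        for a in t[2]:
            names |= free_vars(a)
        return names
    return free_vars(t[2]) - set(t[1])


def fresh(base, avoid):
    """Return base1, base2, ... : the first such name not in avoid."""
    for i in itertools.count(1):
        candidate = f"{base}{i}"
        if candidate not in avoid:
            return candidate


def subst(t, env):
    """Simultaneously replace free variables as env dictates, re-lettering
    bound variables so that nothing free in a replacement is captured."""
    if t[0] == "var":
        return env.get(t[1], t)
    if t[0] == "app":
        return ("app", subst(t[1], env), tuple(subst(a, env) for a in t[2]))
    params, body = t[1], t[2]
    # Variables bound by this abstract are not touched by the substitution.
    env = {v: a for v, a in env.items() if v not in params}
    body_fv = free_vars(body)
    incoming = set()
    for v, a in env.items():
        if v in body_fv:
            incoming |= free_vars(a)
    avoid = incoming | body_fv | set(env) | set(params)
    new_params, rename = [], {}
    for p in params:
        if p in incoming:  # p would capture a free variable: re-letter it
            q = fresh(p, avoid)
            avoid.add(q)
            rename[p] = var(q)
            new_params.append(q)
        else:
            new_params.append(p)
    if rename:
        body = subst(body, rename)
    return ("lam", tuple(new_params), subst(body, env))


def beta_reduce_top(t):
    """Map (λv1...vn.φ)(A1, ..., An) to [Ai/vi]φ; leave other terms alone."""
    if t[0] == "app" and t[1][0] == "lam" and len(t[1][1]) == len(t[2]):
        params, body = t[1][1], t[1][2]
        return subst(body, dict(zip(params, t[2])))
    return t


# (λx.Female(x) ∧ Fox(x))(Nelly)
nelly_redex = app(
    lam(["x"], app(var("∧"), app(var("Female"), var("x")),
                   app(var("Fox"), var("x")))),
    var("Nelly"),
)
nelly_reduct = beta_reduce_top(nelly_redex)

# (λx.∃y R(x, y))(y): the bound y must be re-lettered to avoid capture.
capture_redex = app(
    lam(["x"], app(var("∃"), lam(["y"], app(var("R"), var("x"), var("y"))))),
    var("y"),
)
capture_reduct = beta_reduce_top(capture_redex)
```

Here nelly_reduct encodes Female(Nelly) ∧ Fox(Nelly), and capture_reduct encodes ∃y1 R(y, y1) rather than the incorrect ∃y R(y, y). The sketch only computes the two sides of the schemas; whether they are merely coextensive or are to be identified is exactly the question the text goes on to discuss.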
Those who accept Immediate β-equivalence will presumably also want to accept other identifications that result from performing the same kind of substitution in an embedded context, for example the following:

(λy.((λx.R(x, y))(z))) ≡_⟨e⟩ (λy.R(z, y))
R(x, (λy.S(y, (λz.Fz ∧ Gz)(x)))) ≡_⟨⟩ R(x, (λy.S(y, Fx ∧ Gx)))

So, the picture that strengthens Extensional β-equivalence to an identification would seem to go along with the following more general schema:30

β-conversion: φ ↔ φ*, where φ* is derived from φ by replacing some constituent of the form (λv1...vn.ψ)(A1, ..., An) with [Ai/vi]ψ.

When one term can be derived from another by a replacement of this sort, we say that the latter one-step β-reduces to the former, and that the two are one-step β-equivalent. Two terms are β-equivalent simpliciter when they can be connected by a sequence of terms each of which is one-step β-equivalent to its predecessor. Given the symmetry and transitivity of ≡, β-conversion obviously implies the more general principle that φ ↔ φ* is true whenever φ and φ* are β-equivalent. And given Ref, we also get the superficially stronger principle that A ≡_τ B is true whenever A and B are β-equivalent terms of type τ: for in this case, (A ≡_τ A) ↔ (A ≡_τ B) is an instance of β-conversion, and implies (A ≡_τ B) given Ref.31 This explains how we get back from β-conversion to Immediate β-equivalence.

The question whether to endorse β-conversion is a crucial choice point for theorising about the logic of identifications. My view is that β-conversion is close to being valid. To be precise: β-conversion holds whenever the substituted term (λv1...vn.ψ)(A1, ..., An) is such that each of the variables v1, ..., vn has at least one free occurrence in ψ. Call these instances of nonvacuous β-conversion. In this section and the next, I will attempt to make this plausible, though what I have to say will fall far short of being a knock-down argument. In the present section, I will survey four possible strategies for arguing against instances of β-conversion, arguing that none of them are compelling. In §6, I will sketch what seems to me to be the most principled and systematic view on which β-conversion fails, give some arguments against it, and say some positive things in favour of (non-vacuous) β-conversion.

30 Interestingly, although there is no obvious route to β-conversion from Immediate β-equivalence, if we were taking the sentential identification connective as primitive the schema (λv1...vn.φ)(A1, ..., An) ≡_{u1...um} [Ai/vi]φ could do all the work of β-conversion.

31 If we are taking ≡ as primitive, we could also take the schema whose instances are A ≡_τ B when A and B are β-equivalent as our basic axiom, and derive β-conversion from φ ≡ φ* and φ ↔ φ using LL. However, if we are using The Identity Identity to avoid taking ≡ as a primitive, we will need to take β-conversion as the more basic axiom, since we need to rely on it to get from φ ≡ φ*, i.e. (λpq.∀z(z(p) ↔ z(q)))(φ, φ*), to φ ↔ φ*, via ∀z(z(φ) ↔ z(φ*)) (one application of β-conversion), then (λp.p)(φ) ↔ (λp.p)(φ*) by universal instantiation, and finally φ ↔ φ* (two more applications of β-conversion).

The first strategy for arguing against β-conversion is based on familiar arguments for thinking that the proposition that (λv1...vn.φ)(t1, ..., tn) is (in many cases) distinct from the proposition that [ti/vi]φ. For example, Salmon (2010) considers a case where someone thinks they are seeing photos of two yachts when in fact they are seeing two photos of a single yacht a, and sincerely utters 'This yacht is longer than that one is'.
According to Salmon, the speaker in this case believes the proposition that a is larger than a is, but does not believe the proposition that (λx.x is larger than x is)(a).32 Similarly, the Babylonians believed that Venus was a planet visible in the morning and Venus was a planet visible in the evening, and communicated this belief to one another by uttering the Babylonian equivalent of 'Phosphorus is visible in the morning and Hesperus is visible in the evening', but they did not believe that (λx.x is a planet visible in the morning and x is a planet visible in the evening)(Venus).

Considerations of this kind are very important to the metaphysics of propositions (understood as "objects of the attitudes"); but they are terrible as an argument against instances of β-conversion. This argument would be no better than the argument that since it is possible for someone to believe that there are vixens without believing that there are female foxes, or to want to be a vixen without wanting to be a female fox, it is not the case that for there to be vixens is for there to be female foxes or that to be a vixen is to be a female fox. Those (including Salmon and Soames) who think that the proposition that there are vixens is distinct from the proposition that there are female foxes on the basis that it is possible to believe the former without believing the latter should regard 'the proposition that...' as an opaque context, rejecting the inference from 'For it to be the case that φ is for it to be the case that ψ' to 'The proposition that φ = the proposition that ψ'. If they insist on understanding the identification in terms of the identity of entities of some sort, they should think of these entities not as propositions (the "objects of the attitudes"), but as "states of affairs": entities which stand to the number 0 as properties stand to 1 and binary relations stand to 2.33

32 See also Salmon 1986b and Soames 1987a.
33 It is unfortunate that philosophy has no accepted label for such entities. 'Proposition' is hostage to the propositional attitudes; 'fact' is ruled out since there cannot be a fact that φ unless φ; and some treat 'state of affairs' as like 'fact' in this respect. I am tempted by 'factoid'.

The second argumentative strategy is more promising, since it turns on environments that have nothing obvious to do with propositional attitudes or "intentionality". Gideon Rosen (2010) and Kit Fine (2012) suggest certain general principles about grounding which, if true, would provide a widely-applicable strategy for arguing against instances of β-conversion. Rosen puts the point in terms of an ontology of facts: he maintains that in general, the fact that [a/x]φ grounds the fact that (λx.φ)(a). Since no fact grounds itself, this entails that the facts in question are distinct. Fine thinks of grounding claims as involving a sentential operator which (at least in the straightforward case where it connects two sentences) we can pronounce 'because'. So for him, the key claim is that whenever (λx.φ)(a), (λx.φ)(a) because [a/x]φ, although it is never true that (λx.φ)(a) because (λx.φ)(a).34 These claims certainly sound like they should entail that it is not true, in our target sense, that for it to be the case that (λx.φ)(a) is for it to be the case that [a/x]φ.

Rosen and Fine are at the forefront of a movement to give questions expressed in grounding-theoretic terms a central role in metaphysics, not merely as tools for investigating some other questions (in the way that, e.g., questions about conceptual analysis might be), but as topics of investigation for their own sake. At least insofar as one is convinced by the picture I presented in §2 (according to which the "subject matter of metaphysics" is conceived of as being about the world as opposed to our representations of it, and true identifications license substitution within claims of this sort), one will not want to resist the grounding-theoretic argument against β-conversion at its last step.35

However, so long as we conceive of grounding as a worldly matter, I see no good reason for accepting the premise that [a/x]φ ever grounds (λx.φ)(a). When we are first being introduced to the language of grounding, we will be tempted to deploy it quite promiscuously. For example, we will be tempted to claim that the fact that Nelly is a vixen is grounded by the fact that Nelly is a female fox. After all, 'Nelly is a vixen because Nelly is a female fox' certainly sounds true, and there are no obvious tests for distinguishing the 'because' here from the 'because' of grounding. But this temptation must certainly be resisted, as discussed in §2. Given that to be a vixen is to be a female fox, it certainly follows that for Nelly to be a vixen is for Nelly to be a female fox, and hence that the fact that Nelly is a female fox does not ground the fact that Nelly is a vixen, since it does not ground itself. Of course Fine and Rosen need not dispute this, since they can claim that this one fact is distinct from the fact that Nelly is female and Nelly is a fox, which grounds it. But once we have realised that we need to be careful in going from intuitive 'because' claims to grounding claims, it is hard to see any principled grounds for resisting the temptation in the case of the fact that Nelly is a female fox while yielding to it in the case of the fact that Nelly is female and Nelly is a fox.

34 Fine suggests just one possible exception to the generalisation that [a/x]φ strictly grounds (λx.φ)(a), namely when φ is a predication F(x) where x is not free in F, so that (λx.φ)(a) ≡ [a/x]φ is an instance of η-conversion (see below) as well as β-conversion.

35 Interestingly, however, there are indications that Fine and some other grounding-enthusiasts are not thinking along these lines. Fine is open to a view on which propositions are individuated too coarsely to respect grounding-theoretic distinctions: 'the truth of "A, B < C" might be taken to depend not merely upon the propositions expressed by "A", "B" and "C" but also upon how these propositions are expressed'. This suggests a picture where grounding-theoretic claims are in some important sense about our representations, rather than simply about how things are in the world. Correia (2010) takes seriously the idea that 'grounding' admits a "conceptual" interpretation that works like this, as well as a "worldly" interpretation, and interprets Fine and Rosen as concerned with the "conceptual" notion. If he is right, the present argument against β-conversion from putative failures of substitutivity in grounding claims would have no more force than the previously considered argument from failures of substitutivity in attitude reports. However, I have little sense of what the conceptual interpretation of grounding claims is supposed to be, or why anyone would regard such claims as having a distinctive interest for metaphysics.

There is a third influential strategy for arguing against instances of β-conversion, developed by Stalnaker (1977), whose application is more limited. It has two premises.
The first is contingentism, the view that it is metaphysically possible for there to be something such that it is not metaphysically necessary that it exists (in the sense of being identical to something):

Contingentism: ◇∃x¬□∃y(y = x)

The second premise is what Williamson (2013) calls "the being constraint", which can be stated schematically as follows:

(BC) □∀x□(Fx → ∃y(y = x))

Here F stands for any predicate.36 The combination of contingentism, (BC), and classical modal logic requires the failure of β-conversion. For example, we cannot have the following instance of β-conversion:

□∀x□((λz.¬∃y(y = z))(x) → ∃y(y = x)) ↔ □∀x□(¬∃y(y = x) → ∃y(y = x))

since the formula on the left is an instance of (BC), while the one on the right is equivalent in classical modal logic to □∀x□∃y(y = x) and thus inconsistent with contingentism. Similarly, contingentists who endorse (BC) will have to reject the following instance of β-conversion for any φ:

□∀x□((λx.φ ∨ ¬φ)(x)) ↔ □∀x□(φ ∨ ¬φ)

since the right hand side is a theorem of classical modal logic, whereas the left hand side implies □∀x□(∃y(y = x)) given (BC).37

36 The variable x could occur free in F, but the argument does not depend on instances of this sort.

My main complaint about this strategy is that the motivation for the Being Constraint seems weak given contingentism. The central contrast this package draws between subject-predicate sentences and other kinds of sentences is not borne out when we actually look at natural languages. A naïve way to make this argument would be as follows. 'Obama doesn't exist' is a subject-predicate sentence: it results from combining the name 'Obama' with the complex predicate 'doesn't exist'. So if the Being Constraint were correct, it would have to be necessary that if Obama doesn't exist, Obama exists, in which case it would be necessary that Obama exists, which is something no contingentist will grant. The reason this is naïve is that surface syntax may be misleading.
Sentences where 'not' occurs inside the verb phrase sometimes have readings in which the subject really occurs within the scope of the negation, in the sense of 'scope' that matters for semantics:

(18) a. Everyone hasn't yet had a chance to read the minutes.
b. All that glitters is not gold.

(18a) and (18b) are structurally ambiguous: they have weak readings equivalent to 'It is not the case that everyone has already had a chance to read the minutes' and 'It is not the case that all that glitters is gold' as well as the strong readings equivalent to 'No-one has yet had a chance to read the minutes' and 'Nothing that glitters is gold'. Proponents of the Being Constraint can therefore respond to the naïve argument by saying that 'It is possible that Obama doesn't exist' is similarly ambiguous, and is true only on the reading where the negation takes scope over 'Obama'.38

37 Stalnaker (1994) develops a quantified modal logic in which (BC) is upheld and β-conversion fails. Many other authors, most influentially Plantinga (1983), have defended a structurally similar package in the context of a theory of properties, namely that property-exemplification entails existence, so that the intersubstitutability of 'a has the property of being an x such that φ' and '[a/x]φ' must be restricted in modal contexts.

38 See Plantinga 1983, p. 13.

The problem with this response is that when 'doesn't exist' has a quantified subject, the reading where negation takes scope over the subject is often much too weak. If contingentism is true, sentences like the following are plausibly true on both readings:

(19) a. It could have happened that both of us didn't exist.
b. If the second world war had not been fought, everyone who was actually born since then wouldn't have existed.

These sentences are clearly ambiguous in the same way as (18a) and (18b).
But given standard contingentist views about the extent of contingent existence, it seems wrongheaded to insist that they are true only on their weak readings, where they are equivalent respectively to (20a) and (20b):

(20) a. It could have happened that it was not the case that both of us existed.
b. If the second world war had not been fought, it would not have been the case that everyone who was actually born since then existed.

The most prominent reading of (19a) is the strong one on which, as uttered by A to B, it is true only if it could have happened that neither A nor B existed; assuming contingentism, it should be true on that reading. Similarly for (19b). But given (BC), a sentence of the form 'DP VP' will be modally equivalent to 'DP exist(s) and VP', so long as nothing in the VP takes scope over the DP. So the readings of (19a) and (19b) where 'not' takes scope within the VP will be equivalent to (21a) and (21b):

(21) a. It could have happened that both of us existed and didn't exist.
b. If the second world war had not been fought, everyone who was actually born since then would and wouldn't have existed.

And this looks bad, since (21a) and (21b) seem clearly false.39

39 Contingentist proponents of (BC) might reply at this point that (21a) and (21b) are in fact true, for the same reason that 'All unicorns both are and are not unicorns' is true. There are many problems with this move, but perhaps the worst one is that it does not generalise to examples using other quantifiers. 'If that had happened, most of us wouldn't have existed' has a reading where, assuming contingentism, it is true if 'us' refers to A, B, and C, and if the relevant thing had happened, A and B would never have been born but C still would have. But 'If that had happened, most of us both would and wouldn't have existed' isn't true in this circumstance.

(A further argument against the combination of contingentism and the Being Constraint, due to Fritz and J. Goodman (forthcoming, n. 14), turns on higher-order quantification. In a higher-order setting, there is a natural analogue of contingentism involving quantification into sentence position:

◇∃p◇(¬∃q(q ≡_⟨⟩ p))

There is some pressure on contingentists to endorse such "higher-order contingentism": see Williamson 2013, ch. 6. Similarly, there is some pressure on those who endorse (BC) to accept its higher-order analogue:

(BC_⟨⟩) □∀p□(Op → ∃q(q ≡_⟨⟩ p))

Here O is schematic for a sentential operator, that is, a term of type ⟨⟨⟩⟩. But since ¬ and ◇ are sentential operators, (BC_⟨⟩) implies propositional necessitism:

1. □∀p□(◇p ∨ ¬p)   classical modal logic (KT)
2. □∀p□(◇p → ∃q(q ≡ p))   instance of (BC_⟨⟩)
3. □∀p□(¬p → ∃q(q ≡ p))   instance of (BC_⟨⟩)
4. □∀p□(∃q(q ≡ p))   1-3, classical modal logic

Higher-order contingentists must thus reject (BC_⟨⟩), making (BC) look ill-motivated.)

A fourth strategy for arguing against β-conversion targets only the vacuous instances. For example, one can appeal to the concept of aboutness, arguing against the claim that for Obama to be such that snow is white is for snow to be white on the grounds that snow being white is not about Obama, whereas Obama being F is about Obama (for any F). 'About' is a bit too vague for this argument to carry much weight by itself.40 But it does help to undermine the positive case for full β-conversion based on examples, by drawing our attention to the possibility of a weaker generalisation that fits the examples equally well. §8 and §9 will introduce some other considerations that count against vacuous β-conversion but not against nonvacuous β-conversion.

For formal purposes, if we reject vacuous β-conversion, it is convenient to work with a so-called λI-language, where 'λv1...vn.φ' is not well-formed unless all of v1, ..., vn have free occurrences in φ. (For details see Appendix A1.) This restriction lets us use the usual β-conversion rule rather than constantly having to make exceptions for the vacuous case.
The more common form of language ('λK-language'), where vacuous binding is allowed, can be translated into the λI-language by appending trivial conjuncts to abstracts so that every abstracted variable has a free occurrence: for example, when y is not free in F, λxy.Fx could be translated as λxy.Fx ∧ y = y. It might be objected that this is arbitrary. Why not instead choose λxy.Fx ∨ y ≠ y, or λxy.Fx ∧ ∃z(z = y), or λxy.Fx ∧ (x = y ∨ x ≠ y), for example? This is not an issue if these various options are themselves equivalent (i.e. if the identifications between them are true). But even if the options are not equivalent, opponents of vacuous β-conversion can respond to the worry about arbitrariness by saying that the original λK-term is vague and has several different λI-terms as admissible precisifications.41 This seems a strong response: if vacuous β-conversion fails, our use of terms involving vacuous binding is less constrained than our use of λI-terms in a crucial respect, in a way that might be expected to make for vagueness. Thus opponents of vacuous β-conversion may legitimately take λI-languages to be metaphysically more perspicuous than λK-languages, as well as formally more convenient.

40 J. Goodman (MS) shows how the way of thinking about aboutness that underlies this strategy can be developed into a systematic theory.

From now on, when I say that something follows from something else by β-conversion, I will always mean nonvacuous β-conversion. If we need the vacuous case I will say so explicitly. In the next section I will further support (nonvacuous) β-conversion by pointing out some problems for the most systematic kind of theory in which it fails.
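The translation from λK-terms to λI-terms described above can be made concrete with a small sketch. The tuple encoding of formulas and the function names (`free_vars`, `is_lambda_I`, `to_lambda_I`) are assumptions introduced purely for illustration, and every argument symbol is treated as a variable. The sketch implements only the first padding option mentioned in the text (conjoining v = v); it takes no stand on whether the alternative paddings are equivalent.

```python
# Formulas are nested tuples (an illustrative encoding, not the paper's):
#   ("pred", "F", ("x",))      atomic formula F(x)
#   ("eq", "y", "y")           identity y = y
#   ("and", phi, psi)          conjunction
#   ("lam", ("x", "y"), phi)   the abstract λxy.phi

def free_vars(t):
    """Return the set of variables occurring free in t."""
    tag = t[0]
    if tag == "pred":
        return set(t[2])
    if tag == "eq":
        return {t[1], t[2]}
    if tag == "and":
        return free_vars(t[1]) | free_vars(t[2])
    if tag == "lam":
        return free_vars(t[2]) - set(t[1])
    raise ValueError(f"unknown tag: {tag}")

def is_lambda_I(t):
    """True if every abstract in t binds only variables free in its body."""
    tag = t[0]
    if tag in ("pred", "eq"):
        return True
    if tag == "and":
        return is_lambda_I(t[1]) and is_lambda_I(t[2])
    if tag == "lam":
        return set(t[1]) <= free_vars(t[2]) and is_lambda_I(t[2])
    raise ValueError(f"unknown tag: {tag}")

def to_lambda_I(t):
    """Pad each vacuous abstract with trivial conjuncts of the form v = v."""
    tag = t[0]
    if tag in ("pred", "eq"):
        return t
    if tag == "and":
        return ("and", to_lambda_I(t[1]), to_lambda_I(t[2]))
    if tag == "lam":
        body = to_lambda_I(t[2])
        for v in t[1]:
            if v not in free_vars(body):
                body = ("and", body, ("eq", v, v))
        return ("lam", t[1], body)
    raise ValueError(f"unknown tag: {tag}")

# λxy.Fx is a λK-term but not a λI-term, since y is bound vacuously.
vacuous = ("lam", ("x", "y"), ("pred", "F", ("x",)))
padded = to_lambda_I(vacuous)   # encodes λxy.Fx ∧ y = y
print(is_lambda_I(vacuous), is_lambda_I(padded))
```

Swapping the padding clause for one of the other candidates would give a different translation function, which is exactly the arbitrariness worry the text addresses by appeal to vagueness.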
6 Structure

The "structured picture" involves a kind of thinking familiar from the theory of structured propositions (Cresswell 1985, Lewis 1970, Salmon 1986a, Soames 1987a), which holds that propositions have a kind of structure analogous to that of the sentences that express them.42 One signature commitment of the theory of structured propositions is that the proposition that a is F = the proposition that b is G only if a = b and the property of being F = the property of being G. (This is often expressed by saying that these propositions are, or can harmlessly be identified with, ordered pairs of objects and properties.) The idea of the structured picture is that identifications work analogously. So we have (the universal closure of) the following axiom:

Atomic Structure: (f(x) ≡⟨⟩ g(y)) → ((f ≡⟨e⟩ g) ∧ (x = y))

Atomic Structure requires widespread failures of β-conversion. For example, β-conversion implies that (λx.R(x, x))(a) ≡ (λx.R(x, a))(a). But Atomic Structure allows this only if (λx.R(x, x)) ≡ (λx.R(x, a)). For most R, this will be obviously false: typically, λx.(R(x, x)) and λx.(R(x, a)) are not even coextensive.43

41 Different classical-logic-friendly theories of vagueness offer different tools to help one face down arbitrariness-based objections to classical theorems like 'Everyone is either bald or not bald'. The suggestion is that opponents of vacuous β-conversion should respond to the present worry in the same way.

42 See also Bealer 1982, whose theory of 'concepts' provides an especially close analogue of the structured picture in a first-order setting.

Atomic Structure is only a partial articulation of the structured picture, which would not really qualify as "systematic" if it only applied to identifications of the form F(a) ≡ G(b).
In a higher-order setting, a principled theory endorsing Atomic Structure should surely also endorse the analogous schema involving sentential operators:

Propositional Structure: x(p) ≡⟨⟩ y(q) → ((x ≡⟨⟨⟩⟩ y) ∧ (p ≡⟨⟩ q))

This extends the analogy with the theory of structured propositions, which involves the idea that, for example, when one proposition is the negation of another proposition, it is not also the result of applying some other operator to that proposition or any other, and it is not the result of applying negation to any other proposition. Note that even those who reject the intelligibility of higher-order quantifiers or higher-order identifications might accept the following schemas, which capture much of the force of Atomic Structure and Propositional Structure:

Schematic Atomic Structure: F(x) ≡ G(y) → ((F ≡ G) ∧ (x = y))
Schematic Propositional Structure: (X(φ) ≡ Y(ψ)) → ((X(θ) ≡ Y(θ)) ∧ (φ ≡ ψ))

I could say more to flesh out the structured picture, considering analogues of Atomic Structure for other types, as well as principles like r(x, y) ≢ f(x) and f(x^e) ≢ z(p^⟨⟩) corresponding to the idea that each proposition has a unique structure. But the most important objections to the structured picture require only Propositional Structure.

I will consider five objections. The first three are in my view much weaker than the last two; I discuss them here because if they worked, they would threaten not just the structured picture but many other "fine-grained" theories, including the theory I will be developing in §8.

43 The standard theory of structured propositions suggests the following polyadic generalisation of Atomic Structure: (r(a1, ..., an) ≡ s(b1, ..., bn)) → (r ≡ s ∧ a1 = b1 ∧ ⋯ ∧ an = bn) (See, e.g., Audi 2012.) However, this is subject to a further objection which does not impugn Atomic Structure: it rules out the possibility that any r^⟨e,e⟩ is symmetric in the strong sense that r ≡ λxy.r(y, x). While Atomic Structure clearly needs to extend somehow to the polyadic case to be worth taking seriously, the desire to allow for symmetry motivates weakening the extension somehow, perhaps to (R(a1, a2) ≡ S(b1, b2)) → (R ≡ S ∨ R ≡ (λxy.Syx)) ∧ ((a1 = b1 ∧ a2 = b2) ∨ (a1 = b2 ∧ a2 = b1)) and its natural generalisation to the n-adic case.

The first objection involves apparent counterexamples: cases where A ≡ B just seems true despite the fact that A and B differ in structure in a way disallowed by the structured picture. For example, perhaps it just seems obvious that for London to be north of Paris is for Paris to be south of London. But the English sentences 'London is north of Paris' and 'Paris is south of London' have a binary subject-predicate structure: they result from applying the monadic predicates 'is north of Paris' and 'is south of London' to the names 'London' and 'Paris'. The identification is thus ruled out by Atomic Structure. Similarly, it might just look obvious that for it not to be necessary that P is for it to be possible that not P; but this is ruled out by Propositional Structure, since it is false that for it to be necessary that P is for it not to be the case that P. The problem with such direct "appeals to intuition" is that it isn't clear that the judgments in question really involve our target interpretation of 'To be F is to be G', understood literally. We are often pretty permissive in our use of 'To be F is to be G', allowing ourselves the freedom to substitute not only logically equivalent expressions but expressions that we do not even regard as metaphysically necessarily equivalent.
For example, when we are doing physical geometry, we might at one time say 'To be a line is to be the shortest path between two points' and at another 'To be a line is to be such that, of any three of one's points, one is between the other two', even if we are not convinced that these conditions are necessarily coextensive. Whatever is going on here, it suggests that we should not be too impatient if proponents of the structured picture respond to putative counterexamples by invoking some kind of "loose talk".

The second objection depends not on case-by-case judgments but on a general principle which is inconsistent with the structured picture, and which is encouraged by some natural ways of thinking about higher-order logic. Syntactically, the task of an operator is to combine with a sentence to make another sentence. This makes it natural to think of the semantic value of an operator as a function mapping propositions to propositions. A "function" here is simply a binary relation that is functional: every proposition bears it to some proposition, and no proposition bears it to more than one proposition. Those in the grip of this picture may well find it obvious that quantification into operator position is interchangeable with quantification into binary connective position restricted by a functionality requirement. More generally, quantification into type ⟨τ⟩ is interchangeable with quantification into type ⟨τ, ⟨⟩⟩ restricted by functionality. This is made precise by the following principle:

Plenitude: ∀x^⟨τ,⟨⟩⟩(Functional⟨τ,⟨⟩⟩(x) → ∃z^⟨τ⟩∀y^τ(x(y, z(y))))

where Functional⟨τ,⟨⟩⟩(x) =df ∀y^τ∃p(x(y, p) ∧ ∀q(x(y, q) → q ≡⟨⟩ p))

Loosely speaking: for any functional relation x between type-τ things and propositions, there is a corresponding property z of type-τ things, such that for each type-τ thing, the proposition that it has z is the very proposition to which it is related by x. Plenitude is drastically inconsistent with the structured picture.
For some distinct objects a and b and property f^⟨e⟩, let R be

λy^e p^⟨⟩.((y = a) ∧ (p ≡ f(b))) ∨ ((y ≠ a) ∧ (p ≡ ¬f(b)))

Since R is functional, Plenitude entails that ∃z^⟨e⟩∀y^e R(y, z(y)). Choose a witnessing z and set y = a. By Extensional β-equivalence, the fact that R(a, z(a)) implies that ((a = a) ∧ (z(a) ≡ f(b))) ∨ ((a ≠ a) ∧ (z(a) ≡ ¬f(b))) and hence that z(a) ≡ f(b). Atomic Structure entails that this is true only if a = b, but by stipulation a ≠ b.

But it is not clear what is to be said in favour of Plenitude, once we learn to be careful about the heuristic way of thinking in terms of functions that might make it seem undeniable. True, it is a strong and simple generalisation; but so is Atomic Structure (and so, as we shall see later, are certain other generalisations inconsistent with Plenitude). The final comparison between the packages that include Plenitude and those inconsistent with it will have to be made on other grounds.

The third objection takes off from the observation that the structured picture is inconsistent with Plenitude. Take any x^⟨⟨⟩,⟨⟩⟩ that is a counterexample to Plenitude: a functional relation among propositions that does not correspond to any operator z^⟨⟨⟩⟩. Couldn't we introduce into our language a new symbol ⊙, with the syntax of a sentential operator, just by stipulating that whenever a sentence φ means that p and x(p, q), ⊙φ will mean that q? If this stipulation is effective, Schematic Propositional Structure will fail in our new language, assuming that our new symbol ⊙ counts as a legitimate substitution instance for the schematic letter X. For example, x might be λpq.q ≡ ¬φ, for some chosen sentence φ.44 Then we have ⊙ψ ≡ ¬φ for all ψ, though obviously it is not true that ψ ≡ φ for all ψ. Even more simply, we could take x to be λpq.p ≡ q; then we have ⊙(¬φ) ≡ ¬φ despite the fact that φ ≢ ¬φ.45

44 In a λI-language, use λpq.q ≡ ¬φ ∧ p ≡ p.
45 As Jeremy Goodman pointed out, these stipulations do not specify any meaning for sentences where ⊙ occurs as an argument of some higher-order term (e.g. of type ⟨⟨⟨⟩⟩⟩), rather than having a sentence as its argument. If one wants a stipulation that can make ⊙ behave just like ¬ or any other operator in the higher-order language, one will need something vastly more radical, perhaps a complete reinterpretation of the entire language mapping every term of type ⟨⟨⟩⟩ to one of type ⟨⟨⟩, ⟨⟩⟩ and extending this mapping to all types. This does much to undercut the force of the present objection. Thanks to Jeremy Goodman and Peter Fritz for discussion.

These considerations about made-up languages do not of course show that Schematic Propositional Structure has any false instances in our actual language. But they do raise a challenge: given that languages where Schematic Propositional Structure fails are possible, what reason do we have for thinking that our own language is not one of them? After all, the primary social functions which explain how our languages socially evolved could, it seems, be fulfilled just as well by the imagined extended languages. As Rayo (2013, p. 10) puts the point: 'It is simply not the case that ordinary speakers are interested in conveying information about metaphysical structure.'46 Their goals are much more down to earth.

46 Rayo is arguing against a view he calls "metaphysicalism", which seems quite close to the structured picture: see Dorr 2014b.

One can also turn this into a worry about the non-schematic Propositional Structure. Since this does not contain the new symbol ⊙, it will still be true in the new language if it was true in the old language: this means that higher-order existential generalisation will not be valid for ⊙. The challenge is to say why, if we reject Plenitude, we should ever be confident that existential generalisation works for terms in our current language, given that our communicative purposes can be perfectly well served by languages in which it fails. The underlying worry is that while quantification into operator position might initially seem like a readily intelligible generalisation of our ordinary quantificational idioms, its legitimacy becomes much harder to defend if its application requires us to make a metaphysically contentious distinction between the "bona fide" connectives (which admit existential generalisation) and the merely "apparent" connectives (which do not).

One way to respond to this argument is to insist that the relevant stipulation is simply impossible. This is plausible enough for some of the relevant functional relations x. The idea would be that in some cases where a certain sentence φ means that p, we fail to know which q is such that x(p, q) in some metasemantically important sense of 'know which', and because of this, are not in a position to understand the new sentence ⊙φ in a way that conforms to the attempted stipulation. For example, if x(p, q) is 'Either Caesar once asserted p and q ≡ p, or Caesar never asserted p and q ≡ snow is white', we arguably know too little to really understand '⊙Elephants have trunks'. But for other x, it is hard to see how this kind of complaint could be sustained, since we seem to "know the extension of x" perfectly well. The previous example, where x ≡ (λpq.q ≡ ¬φ), seems to be like this, in the case where φ is a sentence we understand. In this case, it is hard to see what understanding-theoretic obstacle there could be to introducing ⊙ by stipulating the truth of ∀p.x(p, ⊙p).
Thus, I think that we will have to get used to the idea that there are possible "ill-behaved" languages in which not all of the connectives are bona fide, so we cannot dismiss out of hand the suggestion that some of the connectives in English might turn out to be like this.47

The real problem with the argument, I think, is that the challenge from made-up languages is simply too general to have any bite against the structured picture in particular: one can raise essentially the same challenge concerning any putatively valid schema. It would not be a compelling argument against, say, the law of non-contradiction (understood as a schema) to point out that we could stipulatively modify our language in such a way that it ceased to be valid, and that the new language would be no worse than the old from the point of view of everyday communication. So what? One might argue: 'The social function of language would be served equally well by a language in which most ordinary instances of this schema were true as by a language in which all instances of the schema were true; therefore it would be a surprising coincidence if a community were to end up speaking a language in which all instances of the schema were true'. But this seems a bit silly, since the truth of different instances of a schema in a community's language are not probabilistically independent events. Rather, the truth about metasemantics (about what it is for one abstractly specified language rather than another to be spoken by a given community) means that it is simply easier for a community to end up speaking a "regular", "well-behaved" language, other things being equal.48 They don't have to care about regularity, or logic, or metaphysics, for this to happen.
It just happens by default, unless something special happens to stop it from happening, such as someone issuing some strange stipulation that would only ever occur to a philosopher trying to prove a point about the power of stipulation.

47 Someone might object that when we introduce ⊙ in the imagined way, φ does not really occur as a syntactic, as opposed to merely orthographic, constituent in the sentence ⊙φ, so that the failure of existential generalisation or the substitution into Schematic Propositional Structure is completely unsurprising. (The apparatus of schemas needs to be understood in such a way as to rule out merely orthographic "embeddings": 'c=d → (Fido is a dog → Fido is a cog)' is not an instance of Leibniz's Law.) While this may be correct in some cases, I do not think it would be wise for proponents of the structured picture to rely on the science of syntax to save them from these kinds of objections. Whether something is a constituent in the sense relevant to syntax presumably turns either on cognitive-psychological facts about how speakers process the compound formula, or on sociological facts about the systems of linguistic rules prevalent in a community. To be in a position to insist that stipulations of the kind we have envisaged never create new sentences with genuine syntactic constituents, one would have to be thinking of syntax as directly answerable to metaphysics in a way that seems alien to the practice of actual syntacticians.

48 The notion of easiness here could be cashed out in terms of physical probability, as in Dorr and Hawthorne 2014.

The fourth objection involves the inconsistency of the structured picture with the following principle:

Involution: p ≡⟨⟩ ¬¬p

The inconsistency with Propositional Structure (or Schematic Propositional Structure) is straightforward. Consider, say, a possibility claim ◇ψ. By Involution, we have ◇ψ ≡ ¬¬◇ψ; by Propositional Structure, this is true only if ψ ≡ ¬◇ψ.
But this is false for any true ψ, since sentences flanking a true identification cannot differ in truth value.

But why believe Involution? The strongest case I know of is based on the following thought experiment from Ramsey (1927):

We might, for instance, express negation not by writing a word "not", but by writing what we negate upside-down. Such a symbolism is only inconvenient because we are not trained to perceive complicated symmetry about a horizontal axis, and if we adopted it we should be rid of the redundant "not-not", for the result of negating the sentence "p" twice would simply be the sentence "p" itself. (Ramsey 1927, pp. 42–3)

If we spoke Ramsey's imagined language, we would simply have no pairs of distinct formulae in our language that relate to one another in the same way that φ relates to ¬¬φ in our actual language.49 If there are truths of the form φ ≢ ¬¬φ, they are inexpressible in such a language. But it is hard to believe that the use of such a language would be any sort of a handicap from a metaphysical point of view.50

49 We had better assume that the basic symbols of the language are chosen so that we never have vertically mirror-symmetric sentences like the English 'HE DOCKED' (see Sorensen 1999, p. 159).

50 It would, perhaps, make it harder to describe certain possible mental states, such as the mental states of those intuitionistic logicians who rejected the claim that every set of natural numbers has a least element while still accepting that every set of natural numbers does not not have a least element. But this is not the kind of deficiency metaphysicians should care about: any form of language will enable people to get confused in certain distinctive ways, and will be better suited than others for the task of characterising those particular forms of confusion.

The point doesn't turn on the lack of any symbol for ¬ in Ramsey's language. Suppose the speakers of the language are willing to introduce such a symbol by stipulation. One obvious way for them to accomplish this would be to stipulate that all instances of the schema ¬φ ≡ φ↕ should be true, where φ↕ is φ written upside-down. But ¬¬φ ≡ φ will certainly be true in their language if this stipulation is successful. For starting with ¬¬φ ≡ ¬¬φ, substitution in accordance with the schema yields ¬¬φ ≡ ¬φ↕, which we can then turn into ¬¬φ ≡ φ using ¬φ↕ ≡ φ, which is another instance of the schema. The other way in which we could imagine them introducing the ¬ symbol would be to stipulate that it is equivalent to λp.p↕ (that's an upside-down 'p' after the dot); if β-conversion fails, this does not guarantee that instances of the schema ¬φ ≡ φ↕ are true. But insofar as we think that this gives them a way of expressing the facts we express using 'not', we face a problem of expressive limitation in the opposite direction: our language seems to lack the resources to express the facts they express using inverted sentences. There seems to be no way for deniers of Involution to do justice to the thought that the two languages are on a par.

The fifth and last argument against Propositional Structure that I want to discuss is well-known (though it has been neglected): it is known as the "Russell-Myhill paradox" (after B. Russell (1903, Appendix B) and Myhill (1958)), and establishes that Propositional Structure is actually inconsistent when we have classical higher-order quantification. Let me start by stating the argument loosely in terms of propositions and properties. Choose some arbitrary proposition p, say that snow is white. Let a "heteropredicative" proposition be one that predicates of p some property that it itself lacks. Now consider the proposition that p is heteropredicative, call it q. Is q heteropredicative? If not, then q must have every property that it predicates of p, and in particular the property of being heteropredicative; contradiction.
So q is heteropredicative: it predicates of p some property f that it, q, lacks. This f cannot be the property of being heteropredicative, which as we have just seen, q does not lack. So, there must be two distinct (and indeed non-coextensive) properties which this single proposition q predicates of p.

For those who feel like working through it, here is a rigorous statement of the argument.51 Let O ("is heteropredicative") abbreviate

λq^⟨⟩.¬∀f^⟨⟨⟩⟩((q ≡ fp) → fq)

51 My version of the argument is similar to the versions given (and endorsed) by Hodes 2015 and J. Goodman forthcoming.

Then we argue as follows:

1. O(Op) ↔ ¬∀f^⟨⟨⟩⟩(Op ≡ fp → f(Op))    (Extensional β-equivalence)
2. ∀f(Op ≡ fp → f(Op)) → (Op ≡ Op → O(Op))    ∀-elim
3. ¬O(Op) → (Op ≡ Op → O(Op))    (1, 2)
4. Op ≡ Op    (Ref)
5. ¬O(Op) → O(Op)    (3, 4)
6. O(Op)    (5)
7. ¬∀f(Op ≡ fp → f(Op))    (1, 6)
8. ∀f(O ≡ f → (f(Op) ↔ O(Op)))    (LL)
9. ∀f(O ≡ f → f(Op))    (6, 8)
10. ¬∀f(Op ≡ fp → O ≡ f)    (7, 9)

The conclusion is plainly inconsistent with the substitution instance ∀f((Op ≡ fp) → ((O ≡ f) ∧ (p ≡ p))) of Propositional Structure.52

When we think about why Propositional Structure fails in this case, we can see that we should expect failures to be quite pervasive. The argument is essentially Cantorian: one can think of the conclusion as saying that the domain of properties of propositions is larger than the domain of propositions, so that there can be no one-one correspondence between the two domains, and in particular the relation of being a property f and a proposition q such that q is f(p) cannot be one-one as required by Propositional Structure. So it is wrong to think of the failure of uniqueness in the case of O(p) as an isolated oddity.
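The Cantorian counting point can be illustrated with a deliberately crude finite toy. The sketch below assumes, purely for illustration, that operators are arbitrary functions from a finite stock of propositions to itself; that is a function-style modelling assumption of the kind the earlier discussion of Plenitude treats as contentious, and it is no part of the rigorous argument above. The names `operators` and `collisions` are hypothetical. With more operators than propositions, the map taking f to f(p) cannot be one-one, so distinct operators must agree on p.

```python
from itertools import product

def operators(props):
    """All functions from props to props, each represented as a dict."""
    return [dict(zip(props, outputs))
            for outputs in product(props, repeat=len(props))]

def collisions(props, p):
    """Pairs of distinct operators f, g that yield the same value at p."""
    ops = operators(props)
    return [(f, g)
            for i, f in enumerate(ops)
            for g in ops[i + 1:]
            if f[p] == g[p]]

# Toy domain with two "propositions" (labels only; an assumption of the sketch).
props = ("p0", "p1")
ops = operators(props)
pairs = collisions(props, "p0")

# There are len(props) ** len(props) operators but only len(props) values
# available for f(p), so collisions are forced whenever len(props) >= 2.
print(len(ops), len(props), len(pairs))
```

Nothing here depends on the choice of p: for any fixed argument, the pigeonhole count is the same, which fits the suggestion that failures of uniqueness should be expected to be pervasive rather than isolated.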
In the absence of some plausible criterion for confining failures of uniqueness to some special propositions, it seems that we should expect just about any proposition that is the result of applying an operator f to also be the result of applying some other operator that is not even coextensive with f.

One way to block this reasoning is to adopt a ramified type theory like that of Whitehead and B. Russell 1910. Even explaining the basic idea behind this move, let alone properly evaluating it, would take me too far afield; so let me just echo the widespread consensus that this would be a major cost.53

52 The argument remains valid if we uniformly replace all constituents of the form 'X(p)' with [X/y^⟨⟨⟩⟩]φ for any formula φ. The conclusion will be interesting, and arguably inconsistent with the structured picture, when φ contains at least one occurrence of the operator variable y^⟨⟨⟩⟩. We can recover something close to Russell's original argument (B. Russell 1903, Appendix B) by taking φ to be ∀p^⟨⟩((y^⟨⟨⟩⟩p) → p) ('every y proposition is true').

53 See Bacon, Hawthorne and Uzquiano 2016, sect. 7 for a survey of some of the forms that ramification might take, including an approach that (unlike that of Whitehead and Russell) keeps the syntax of the language intact and merely replaces each of our quantifiers with a hierarchy of "restricted" quantifiers. For considerations against ramification, see Ramsey 1926 and Prior 1971, ch. 3. Hodes (2015) considers an argument for ramification based on "converse-compositional" principles like Propositional Structure, and finds it wanting.

I conclude that the structured picture is false. Since the structured picture looks to be the simplest and most systematic alternative to β-conversion, this bolsters the case for β-conversion. But of course, intermediate positions that accept neither β-conversion nor the structured picture are imaginable. So, let me close this section by tentatively presenting a positive argument for β-conversion. This argument turns on the familiar practice of stipulative definition, in which new terms are introduced by writing down things like (22a)–(22c):

(22) a. x is a schmixen =df x is female and x is a fox
b. Collinear(x, y, z) =df Between(x, y, z) ∨ Between(y, z, x) ∨ Between(z, y, x)
c. x is a transitive set =df ∀y∀z(y ∈ x ∧ z ∈ y → z ∈ x)

As discussed in §2, it is central to the practice of introducing new predicates in this way that having done so, we get to substitute the open sentence on the right of '=df' for the one on the left, salva veritate (perhaps with a special exception for attitude and speech reports). In particular, we should be able to substitute in identifications, licensing claims like (23):

(23) Schmixen(Nelly) ≡ Female(Nelly) ∧ Fox(Nelly)
For Nelly to be a schmixen is for it to be the case that Nelly is female and Nelly is a fox.

But words we have introduced in this way also seem to be perfectly genuine predicates in our expanded language. We should therefore be able to existentially generalise from (23) to (24):

(24) ∃f^⟨e⟩(f(Nelly) ≡ Female(Nelly) ∧ Fox(Nelly))

But this sits very strangely with the denial of the following instance of β-conversion:

(25) (λx.Female(x) ∧ Fox(x))(Nelly) ≡ Female(Nelly) ∧ Fox(Nelly)

If (24) is witnessed by some f^⟨e⟩, why on earth should we not take it to be witnessed by λx.Female(x) ∧ Fox(x)? If existentially quantified claims like (24) were true, surely the best convention about the use of λ-abstracts would be one on which λx.Female(x) ∧ Fox(x) could be used to stand for the witnesses. And of course this mode of argument is extremely general.
For any open sentence φ in which the variables x1, ..., xn all occur free, we can introduce a new n-ary predicate F by stipulating that F(x1, ..., xn) =df φ, and then infer by substitution that F(A1, ..., An) ≡ [Ai/xi]φ and by existential generalisation that ∃f(f(A1, ..., An) ≡ [Ai/xi]φ), which makes the denial of (λx1...xn.φ)(A1, ..., An) ≡ [Ai/xi]φ seem bizarre. (Note that this seems a lot less compelling in the vacuous case where some of x1, ..., xn do not occur free in φ, since we do not normally introduce new predicates by means of stipulations like that.)

Another way to use our practice of stipulative definition to argue for β-conversion relies not on existential generalisation but on the following schema, which is considerably less controversial than β-conversion:

(η-conversion) A ≡ A∗, where A∗ is derived from A by replacing some constituent of the form (λv1...vn.F(v1, ..., vn)), where none of v1, ..., vn is free in F, with F.54

η-conversion is not in tension with the structured picture, and several authors who reject β-conversion in general have expressed sympathy for η-conversion; for example, Fine (2012, §9) and Salmon (2010, §2) both suggest that η-convertible sentences may be equivalent in some strong sense in which β-equivalent sentences are not. Thus, the following argument from η-conversion to β-conversion is of some interest. As before, let φ be some formula with x1, ..., xn free, and introduce F by F(x1, ..., xn) =df φ. First use two applications of definitional substitution to get the following identifications:

F(A1, ..., An) ≡ [Ai/xi]φ
(λx1...xn.F(x1, ..., xn))(A1, ..., An) ≡ (λx1...xn.φ)(A1, ..., An)

But (λx1...xn.F(x1, ..., xn))(A1, ..., An) ≡ F(A1, ..., An) is an instance of η-conversion. So by symmetry and transitivity, we can derive

(λx1...xn.φ)(A1, ..., An) ≡ [Ai/xi]φ

(Again, this is much less compelling in the vacuous case.)
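The parallel between definitional substitution and nonvacuous β-conversion can be seen at the level of syntax in a small sketch. The tuple encoding and the helper names (`substitute`, `beta`, `unfold`) are assumptions made for illustration; bodies are taken to be quantifier-free so that variable capture cannot arise. The sketch shows only that unfolding a stipulatively defined predicate and β-reducing the corresponding abstract produce the same formula. Whether the corresponding identifications are true is the philosophical question at issue, which no code can settle.

```python
# Formulas as nested tuples (illustrative encoding):
#   ("pred", "Fox", ("x",))    atomic formula
#   ("and", phi, psi) / ("or", phi, psi)

def substitute(phi, env):
    """Compute [A_i/x_i]phi for a quantifier-free phi."""
    tag = phi[0]
    if tag == "pred":
        return ("pred", phi[1], tuple(env.get(a, a) for a in phi[2]))
    if tag in ("and", "or"):
        return (tag, substitute(phi[1], env), substitute(phi[2], env))
    raise ValueError(f"unsupported formula: {tag}")

def free_vars(phi, variables):
    """Members of `variables` that occur in phi."""
    if phi[0] == "pred":
        return {a for a in phi[2] if a in variables}
    return free_vars(phi[1], variables) | free_vars(phi[2], variables)

def beta(params, body, args):
    """Nonvacuous beta step: (λparams.body)(args) becomes [args/params]body."""
    if len(params) != len(args):
        raise ValueError("arity mismatch")
    if set(params) - free_vars(body, set(params)):
        raise ValueError("vacuous abstract: not a lambda-I term")
    return substitute(body, dict(zip(params, args)))

def unfold(name, args, definitions):
    """Definitional substitution: replace F(args) by its definiens."""
    params, body = definitions[name]
    return substitute(body, dict(zip(params, args)))

# (22a): x is a schmixen =df x is female and x is a fox
body = ("and", ("pred", "Female", ("x",)), ("pred", "Fox", ("x",)))
definitions = {"Schmixen": (("x",), body)}

via_definition = unfold("Schmixen", ("Nelly",), definitions)  # right side of (23)
via_beta = beta(("x",), body, ("Nelly",))                     # right side of (25)
print(via_definition == via_beta)
```

The `beta` helper refuses vacuous abstracts, mirroring the restriction to λI-terms; the definitional route has no analogous check, which matches the remark that stipulations with idle variables are not part of ordinary practice.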
54 Note that instances of η-conversion where F is (λv1...vn.φ) for some formula φ are also instances of (nonvacuous) β-conversion.

It is natural to think of the formation of complex predicates by λ-abstraction as a device for "automating" the procedure of introducing a new predicate by stipulation and then using it: when we write λx1...xn.φ in a formula (where all of x1, ..., xn occur free in φ and no other variables do), it is just as if we had inserted a simple predicate F which we had earlier defined by issuing the stipulation F(x1, ..., xn) =df φ. The differences between these procedures seem to be matters of convenience rather than principle. Thus we should not be surprised by the idea that disputed questions about the logic of λ-abstracts can be settled by reference to the behaviour of stipulatively defined predicates.

Opponents of β-conversion will probably reply to these arguments by insisting that we have to choose between two different interpretations for stipulative definitions like (22a)–(22c). We could treat them as mere abbreviations, in which case the uses of existential generalisation and η-conversion are not licensed; or we could treat them as true predicates interchangeable with the corresponding lambda terms, in which case definitional substitutions will not be licensed in all contexts (and in particular, not in identifications). But the idea that we have to make such a choice looks like an artefact of a bad theory. True, logicians theorising in a metalanguage about a distinct object language sometimes introduce things called "metalinguistic abbreviations" which are not predicates at all, but part of a system for forming complex names for expressions in the object language. But despite superficial similarities, this practice is really quite different from the practice of stipulative definition as engaged in by mathematicians, scientists, and philosophers, which manifestly does lead to extensions of the language.
7 Booleanism

The questions discussed so far concern the 'pure logic' of identifications; they have nothing specific to say about the behaviour of other familiar logical vocabulary (truth-functional operators or quantifiers) within the scope of identifications. One simple theory about the interaction of identifications with truth-functional operators is Booleanism, according to which the truth-functional operators conform to the axioms of a Boolean algebra, with identification playing the role of identity. This theory can be axiomatised by adding the following schemas to the logic we have already (classical logic plus Ref, LL, and βη-conversion):

∧-Commutativity: (λv1...vn.φ ∧ ψ) ≡ (λv1...vn.ψ ∧ φ)
∨-Commutativity: (λv1...vn.φ ∨ ψ) ≡ (λv1...vn.ψ ∨ φ)
∧∨-Distributivity: (λv1...vn.φ ∧ (ψ ∨ θ)) ≡ (λv1...vn.(φ ∧ ψ) ∨ (φ ∧ θ))
∨∧-Distributivity: (λv1...vn.φ ∨ (ψ ∧ θ)) ≡ (λv1...vn.(φ ∨ ψ) ∧ (φ ∨ θ))
∧∨-Dissolution: (λv1...vn.φ ∧ (ψ ∨ ¬ψ)) ≡ (λv1...vn.φ)
∨∧-Dissolution: (λv1...vn.φ ∨ (ψ ∧ ¬ψ)) ≡ (λv1...vn.φ)

Here v1...vn stands for any list of zero or more distinct variables, and φ, ψ, θ are formulae. Given these axioms, various other familiar-looking schemas follow (see Huntingdon 1904 for proofs), including the following:

Involution: (λv1...vn.¬(¬φ)) ≡ (λv1...vn.φ)
∧-Associativity: (λv1...vn.(φ ∧ ψ) ∧ θ) ≡ (λv1...vn.φ ∧ (ψ ∧ θ))
∧-Idempotence: (λv1...vn.φ ∧ φ) ≡ (λv1...vn.φ)
∧∨-De Morgan: (λv1...vn.¬(φ ∧ ψ)) ≡ (λv1...vn.¬φ ∨ ¬ψ)
∧∨-Absorption: (λv1...vn.φ ∧ (φ ∨ ψ)) ≡ (λv1...vn.φ)
∧∨-Annihilation: (λv1...vn.φ ∧ (ψ ∧ ¬ψ)) ≡ (λv1...vn.ψ ∧ ¬ψ)

We also get the dual versions that interchange ∧ and ∨.55

Booleanism could alternatively be axiomatised using a single schema with a more complicated condition on what counts as an instance:

Taut: (λv1...vn.φ) ≡ (λv1...vn.ψ) whenever φ ↔ ψ is a tautology (theorem of classical propositional logic).
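The side condition on Taut, that φ ↔ ψ be a tautology, is mechanically decidable by truth tables. The following is a minimal sketch under an assumed encoding (atoms as strings, compound formulae as tuples headed by "not", "and" or "or"); the name `taut_equivalent` is hypothetical. It checks only the classical side condition for a given pair of formulae, not the truth of the resulting identification.

```python
from itertools import product

def atoms(f):
    """Collect the sentence letters occurring in formula f."""
    if isinstance(f, str):
        return {f}
    return set().union(*(atoms(sub) for sub in f[1:]))

def evaluate(f, valuation):
    """Classical truth value of f under a valuation of its atoms."""
    if isinstance(f, str):
        return valuation[f]
    op = f[0]
    if op == "not":
        return not evaluate(f[1], valuation)
    if op == "and":
        return evaluate(f[1], valuation) and evaluate(f[2], valuation)
    if op == "or":
        return evaluate(f[1], valuation) or evaluate(f[2], valuation)
    raise ValueError(f"unknown connective: {op}")

def taut_equivalent(phi, psi):
    """True iff phi <-> psi is a tautology (checked by truth table)."""
    names = sorted(atoms(phi) | atoms(psi))
    for row in product([False, True], repeat=len(names)):
        valuation = dict(zip(names, row))
        if evaluate(phi, valuation) != evaluate(psi, valuation):
            return False
    return True

# Bodies of some of the schemas listed above, with p, q, r for φ, ψ, θ.
distributivity = (("and", "p", ("or", "q", "r")),
                  ("or", ("and", "p", "q"), ("and", "p", "r")))
dissolution = (("and", "p", ("or", "q", ("not", "q"))), "p")
involution = (("not", ("not", "p")), "p")
de_morgan = (("not", ("and", "p", "q")), ("or", ("not", "p"), ("not", "q")))

for lhs, rhs in (distributivity, dissolution, involution, de_morgan):
    print(taut_equivalent(lhs, rhs))
```

A pair such as φ ∧ ψ and φ fails the check, so it is not licensed as an instance of Taut.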
55 In a higher-order λK-language, there is no need to use schemas in axiomatising Booleanism. For example, we can replace ∧∨-Distributivity with the single axiom (λpqr.p ∧ (q ∨ r)) ≡ (λpqr.(p ∧ q) ∨ (p ∧ r)). In a λI-language we can still do this for Commutativity and Distributivity, but it will not work for Dissolution since λpq.p is ill-formed. However, Booleans will see no advantage to λI-languages. As noted in §5, vacuous lambda abstracts can be translated into a λI-language by adding tautologous conjuncts to turn them into non-vacuous abstracts; for example translating λp.φ when p is not free in φ as λp.φ ∨ (p ∧ ¬p). Because they accept Dissolution, Booleans accept full β-conversion even for the expanded language.

The equivalence of Taut to the axioms listed above follows from the soundness and completeness of the Boolean-valued semantics for classical propositional logic, where the semantic values of sentences in a model are members of an arbitrary Boolean algebra and theorems are sentences whose semantic value in every model is the top element of that model's Boolean algebra.

Booleanism has more often been taken for granted than argued for. One exception is Ramsey (1927), whose rather compelling argument for Involution we already encountered in §6. After making this argument, Ramsey goes on in short order to generalise the conclusion to all the other Boolean equivalences. But it is doubtful whether the mode of argument from alternative possible forms of language actually extends as far as this. It certainly extends to De Morgan: Involution entails that both ∧∨-De Morgan and ∨∧-De Morgan hold when ∨ is interpreted as λpq.¬(¬p ∧ ¬q) or when ∧ is interpreted as λpq.¬(¬p ∨ ¬q), and it is hard to believe that the actual interpretations of ∧ and ∨ fail to fit together in the way that these possible interpretations do.
(Indeed, Ramsey's community where negation is represented by inversion will not have different expressions corresponding to '¬(φ ∨ ψ)' and '¬φ ∧ ¬ψ', or '¬(φ∧ψ)' and '¬φ∨¬ψ', if they wisely use ∧ and ∨ for conjunction and disjunction.) Another argument in a similar style, turning on possible languages whose sentences do not always have to consist of linearly ordered strings of symbols, can be used to support Commutativity.56 The fact that our language forces us to choose an order for conjuncts and disjuncts seems no more relevant to enhancing its expressive capacity than the fact that we have to choose a typeface in writing, or a tone of voice in speech. However, it is hard to see how this mode of argument could be extended to any of the other Boolean equivalences. One could imagine a written language in which symbols, once arranged to form a formula, automatically rearrange themselves to form a tautologically equivalent formula in some canonical form (say, disjunctive normal form). But assuming the community in question retained the ability to introduce new simple symbols stipulatively equivalent to old complex expressions, they could use such symbols to generate stable equivalents to sentences of ours that are not in the canonical form, and use these to express counterexamples to Booleanism, if there are any: by contrast with the case of Involution, allowing for definitional expansions of the language seems to restore expressive parity. Anyway, this thought experiment seems lame as an argument for Booleanism in particular, given that we can also imagine a script whose symbols providentially rearrange themselves into the shortest expression equivalent to what was written down modulo the laws of nature, or indeed modulo the truth about some arbitrarily chosen subject matter. 
While users of such languages are lucky in one way, in that it is harder for them to communicate false beliefs about the relevant subject matter, they are surely subject to genuine expressive limitations (until they introduce new simple symbols that do not get destroyed by the rearrangement). From an anti-Boolean point of view, the biconditionals corresponding to the Boolean axioms are not importantly different from the laws of nature in this respect.

56 Cf. Williamson (1985), who in a somewhat different context imagines languages whose sentences are built up by putting expressions into bags.

Define □ as λp.p ≡ ⊤, where ⊤ is some arbitrarily chosen closed tautology, say ∃p(p) ∨ ¬∃p(p). This operator □ is of great interest in a Boolean setting, partly because it can take over the role of the propositional identification connective ≡t, since □(φ ↔ ψ) ↔ (φ ≡ ψ) is a theorem of Booleanism.57 Booleanism implies that all instances of the modal axioms K and T hold for □.58 Moreover, it would be natural for Booleans to endorse further principles about embedded occurrences of ≡ which have the effect of making □ be a normal modal operator obeying the logic S4, or perhaps even S5.59 If one thought that □ had an S5 logic, it would be natural to equate the claim that □φ (i.e. that φ ≡ ⊤) with the claim that it is metaphysically necessary that φ. Some might see this as a welcome explanation of the unfamiliar (identification) in terms of the familiar (metaphysical necessity).60 My attitude towards the proposed equation, if I believed that S5 was valid for □, would be the reverse. Philosophers have struggled to say something helpful to single out the "metaphysical" readings of modal operators from among the panoply of other readings they may bear; their efforts have not been conspicuously successful. So an explication of 'It is metaphysically necessary that φ' as 'For it to be the case that φ is for it to be the case that ⊤' would shed some welcome light on the concept of metaphysical necessity and the interest of questions formulated in terms of it.61 True, as we saw in §1, 'To be F is to be G' also admits several readings; but at least it doesn't have the same array of readings as 'necessarily', and the task of singling out the target reading is perhaps less challenging. Moreover, whereas it is easy to find philosophers who claim to have no idea what metaphysical necessity is or to regard it as in some sense a defective notion, it is hard to see how any philosopher could be so dismissive of 'To be F is to be G' given the central role questions about the truth of such claims play in most branches of the subject.

57 Proof: Suppose φ ≡ ψ. Then (φ ↔ ψ) ≡ (φ ↔ φ) is true by Ref and LL; ⊤ ≡ (φ ↔ φ) is true by Taut, so (φ ↔ ψ) ≡ ⊤ is true by LL. In the other direction, suppose (φ ↔ ψ) ≡ ⊤. Then φ ≡ (ψ ↔ (φ ↔ ψ)) is true by Taut, so by LL, φ ≡ (ψ ↔ ⊤); also (ψ ↔ ⊤) ≡ ψ by Taut, so φ ≡ ψ by LL.

58 For K, suppose that (φ → ψ) ≡ ⊤ and φ ≡ ⊤; then (⊤ → ψ) ≡ ⊤ by LL; since (⊤ → ψ) ≡ ψ by Taut, a second application of LL yields ψ ≡ ⊤. For T, suppose that φ ≡ ⊤; ⊤ is true by propositional logic, so we can infer φ by LL.

59 The following principles suffice for S4:

Necessitated Ref: (x ≡τ x) ≡ ⊤
Necessitated LL: ((A ≡ B) → (φ → [B/?A]φ)) ≡ ⊤

Both of these are natural generalisations of Taut, since they assert identifications where the corresponding biconditionals were already provable in the system. Furthermore, if we accept The Identity Identity, we can derive these principles from the following natural analogues of Booleanism for the quantifiers:

((λv0...vn.A) ≡ (λv0...vn.A ∧ B)) ↔ ((λv1...vn.∃v0(A)) ≡ (λv1...vn.∃v0(A) ∧ B))
((λv0...vn.A) ≡ (λv0...vn.A ∨ B)) ↔ ((λv1...vn.∀v0(A)) ≡ (λv1...vn.∀v0(A) ∨ B))

(Here B is a formula in which v0 does not occur free. See Dorr 2014a and J. Goodman 2016 for more on these "Adjunction" principles.) To get to S5, on the other hand, we would have to add something much less clearly well-motivated, namely the □-necessity of distinctness:

(x ≢τ y) → ((x ≢τ y) ≡ ⊤)

For arguments that Booleans should reject this principle, see Bacon MS and J. Goodman MS.

60 Such an explanation would be worth little if it only applied to the sentential connective ≡⟨⟩, but there are natural strategies for extending it to other types. For example, one might maintain that (A ≡⟨τ1,...,τn⟩ B) ≡⟨⟩ □∀x1...∀xn(A(x1, ..., xn) ↔ B(x1, ..., xn)). Or if one rejected this on the grounds that it is inconsistent with contingentism, one might still accept (A ≡⟨τ1,...,τn⟩ B) ≡⟨⟩ □∀x1□...∀xn□(A(x1, ..., xn) ↔ B(x1, ..., xn)).

61 The idea of defining necessity in terms of an identity connective occurs, for example, in Cresswell 1965 and Suszko 1975.

'Metaphysically necessary' was introduced into the philosophical vernacular partly through general formulas, for example the equation of metaphysical necessity with "unrestricted" or "absolute" necessity, or 'necessity in the highest degree – whatever that means' (Kripke 1972, p. 99), but partly also through philosophers exchanging opinions about which truths are, in fact, metaphysically necessary, e.g. that nothing is green and red all over, that Nixon is not an inanimate object, that a certain lectern is not made of ice, etc. If you were convinced of the falsity of claims like 'Nixon is not an inanimate object ≡ ⊤' or 'The lectern is not made of ice ≡ ⊤', etc., you might worry that the proposed interpretation of 'metaphysically necessary' would be unduly uncharitable. However, given the character of 'metaphysically necessary' as a term of art, charity to the explicit explanatory remarks made by those who introduced the term should weigh especially heavily with us in deciding how to interpret it, while charity to claims that they made where they thought of themselves as taking a stand on philosophically controversial questions should count for relatively little. If we can make sense of a notion of necessity with a good claim to labels like 'necessity in the highest degree', it would be perverse to interpret 'metaphysically necessary' as expressing something more restricted just for the sake of being more charitable to the case-by-case modal pronouncements of, say, Saul Kripke. This would be as misguided as interpreting philosophers like Shoemaker (1998), who maintain a doctrine that they express by saying 'All laws of nature are metaphysically necessary', as merely meaning by 'metaphysically necessary' what other philosophers have meant by 'nomically necessary'. At least if the logic of □ is S5, it has a very strong claim to be the "absolute" form of necessity, since whenever φ ≡ ⊤ is true, Oφ must be true for every other transparent operator O such that O⊤ is true, a category which certainly includes all non-epistemic necessity operators.

If the project of using identifications to say what it is to be metaphysically necessary were feasible only for S5-endorsing Booleans, that would be a weighty consideration in favour of their view. In fact, however, even if we do not question the orthodox view that the logic of metaphysical necessity is S5, this project is also open to non-Booleans, and to Booleans who reject S5 for □. The question how this should go is beyond the scope of the present paper, but one obvious strategy would be to first define a higher-order predicate is an S5 operator (of type ⟨⟨⟨⟩⟩⟩), and then identify being metaphysically necessary with being mapped to a truth by every S5 operator, λq.∀x⟨⟨⟩⟩(S5(x) → x(q)).62 It is not obvious that this operator will turn out to itself be an S5 operator. But if this can be shown, the case for identifying it with metaphysical necessity will be parallel to the case for identifying λp.p ≡ ⊤ with metaphysical necessity if its logic is S5.
Thus, while it does seem like an advantage of a theory if it allows for a plausible explanation of metaphysical necessity in terms of identifications, this advantage is not distinctive to Booleanism. The feature that does seem to be distinctive to Booleanism is the possibility of explaining identifications in terms of metaphysical necessity; but the ability to do this does not seem to me to be much of an advantage at all.

62 For Booleans, "S5(x)" might be defined as Taut(x) ∧ K(x) ∧ T(x) ∧ 4(x) ∧ B(x), where these are in turn defined in terms of a type-⟨⟨⟨⟩⟩, ⟨⟨⟩⟩⟩ "entailment" connective ≤⟨⟨⟩⟩ as follows:

Taut(x) =df (λq.q ∨ ¬q) ≤ (λq.x(q ∨ ¬q))
K(x) =df λpq.(xp ∧ p ≤ q) ≤ (λpq.xq)
T(x) =df x ≤ (λq.q)
4(x) =df x ≤ (λq.x(x(q)))
B(x) =df (λq.q) ≤ (λq.x(¬x(¬q)))

Here, x ≤⟨⟨⟩⟩ y can in turn be defined as x ≡⟨⟨⟩⟩ λp.(x(p) ∧ y(p)). Non-Booleans could adopt the same strategy but with a different definition of ≤. One possible definition that is well behaved in a wide variety of non-Boolean settings, including the models considered in the Appendix, is x ≤⟨⟨⟩⟩ y =df (λp.x(p) ∧ (x(p) ∧ y(p))) ≡⟨⟨⟩⟩ (λp.x(p) ∨ (x(p) ∧ y(p))) (see Appendix A8).

8 Non-circularity

Even though the arguments for Booleanism considered in the previous section were unconvincing, Booleanism is a simple and powerful metaphysical vision. At times, indeed, its simplicity has won me over. But before we can properly assess the force of such abductive considerations, we will need a much more detailed appreciation of the space of alternatives to Booleanism. In the remainder of this paper, I want to consider one particular kind of consideration which might lead one to reject Booleanism, and see what kind of alternative it might suggest. Consider the following claims:

GRUE: To be grue is to be either green and observed before t, or blue and not observed before t.
BLEEN: To be bleen is to be either blue and observed before t, or green and not observed before t.
GREEN: To be green is to be either grue and observed before t, or bleen and not observed before t.
BLUE: To be blue is to be either bleen and observed before t, or grue and not observed before t.

GRUE and BLEEN are uncontroversial: just look at the passages of N. Goodman 1954 in which the words 'grue' and 'bleen' are introduced. GREEN and BLUE, on the other hand, are very odd. It is tempting to think (pace Goodman himself) that they are simply false. Colour scientists and philosophers consider various views about what it is to be green and what it is to be blue. Maybe to be green is to be disposed to affect perceivers in a certain way, or to be disposed to reflect light in a certain way, or to have a surface with certain intrinsic physical characteristics. I can get into a frame of mind where GREEN and BLUE seem like obviously misguided competitors to these serious views. Of course, those who think that they are true will probably want to tell some pragmatic story about why they would be unhelpful things to assert in many contexts. Note however that 'To be green is to be green' does not seem false although it is as unhelpful as can be, so those who think that GREEN and BLUE are true should not be too sanguine about the prospects for a pragmatic explanation of their oddity that undercuts the temptation to think them false.63

63 One reason one might have for rejecting GREEN and BLUE has to do with the fact that a particular time t is mentioned on the right hand sides: one might object that being green and being blue are not about any particular times (perhaps on the grounds that they are qualitative), whereas being either grue and observed before t or bleen and observed after t is about the particular time t. But I am interested in a more general reason that would still apply even if we replaced 'observed before t' throughout with 'observed at some time or other'.

The following derivation shows that GREEN follows from GRUE and BLEEN given Booleanism. Here 'Gx', 'Bx', 'G′x', 'B′x' and 'Ox' respectively abbreviate 'x is green', 'x is blue', 'x is grue', 'x is bleen' and 'x is observed before t':

1. (λx.Gx) ≡ (λx.((((Gx ∧ Ox) ∨ (Bx ∧ ¬Ox)) ∧ Ox) ∨ (((Bx ∧ Ox) ∨ (Gx ∧ ¬Ox)) ∧ ¬Ox))) (Taut)
2. G ≡ (λx.(((λy.(Gy ∧ Oy) ∨ (By ∧ ¬Oy))(x) ∧ Ox) ∨ ((λy.(By ∧ Oy) ∨ (Gy ∧ ¬Oy))(x) ∧ ¬Ox))) (1, βη-conversion)
3. G′ ≡ (λy.(Gy ∧ Oy) ∨ (By ∧ ¬Oy)) (GRUE)
4. B′ ≡ (λy.(By ∧ Oy) ∨ (Gy ∧ ¬Oy)) (BLEEN)
5. G ≡ (λx.(G′x ∧ Ox) ∨ (B′x ∧ ¬Ox)) (2, 3, 4, LL)

So if we want to reject GREEN (which is line 5), we have to give up Booleanism.

One project we could engage in at this point is that of going through the list of Boolean axioms and theorems and trying to decide which ones to keep and which to give up. There is a large literature we can draw on in this enterprise: for any logic weaker than classical propositional logic, we could consider a weakening of Taut where equivalence in that logic replaces equivalence in classical propositional logic. But we should hope to be able to do better than this, not simply weakening Booleanism but articulating some general principles that are actually inconsistent with it, so as to put together a competing package with advantages of simplicity and strength of its own that could be set against those of Booleanism. Of course, one might think that there just are no true principles that are both inconsistent with Booleanism and comparable in simplicity and generality to those of Booleanism. But general methodological considerations suggest that such a view should be regarded as a last resort.64

64 For abductive methodology in metaphysics, see Williamson 2013 (especially its 'Methodological Afterword') and Williamson 2016.

The case of GREEN and BLUE is suggestive in this regard. For in my case at least, the inclination to think them false is not primarily based on any positive thoughts about what it might really be to be green or blue, or even on the nature of colour in general. Rather, I am inclined to think that GREEN and BLUE can be ruled out simply on the basis of GRUE and BLEEN. Just looking at the logical form of these identifications, I have an impulse to say that they cannot possibly all be true together, since that would be circular.

The idea that there is something "vicious" (i.e. impossible) about a kind of circularity that would be exhibited by this set of identifications is encouraged if we think of identifications as "real definitions". For as readers of mathematics textbooks know, our standard practice of stipulative definition is certainly governed by a "no circularity" constraint. If I have stipulatively defined a simple expression using a certain complex expression, I am not allowed later to stipulatively define one of the simple constituents of that complex expression using a different complex expression that contains the originally defined simple expression.65 The label 'real definition' suggests that there is some subject matter in reality which is importantly analogous to this human practice, and a "no circularity" constraint would provide one natural basis for such an analogy.

Another way to support some kind of "no circularity" principle about identifications is to lean on the idea that identifications explain claims about necessity. As noted in §1, identifications (at least those with a syntactically simple predicate on one side) seem to provide maximally satisfying explanations of the necessities that follow from them; if to be a vixen is to be a female fox, there is no further sense in wondering why it is necessary that all vixens are female foxes. But if someone offered to explain some initially mysterious necessity by citing identifications which run in a circle, we would feel cheated; the sense of there being nothing left to explain seems to disappear.
Suppose, for example, that we thought that (26) was true:

(26) It is metaphysically necessary that whenever distinct lines x and y are both perpendicular to some third line z, x and y do not intersect.

We might wonder why this is true. As we might put it: what stops intersecting lines from ever being perpendicular to a third line? It would seem a bad joke if someone proposed (27a) and (27b) as an answer:

(27) a. (λxy.x is perpendicular to y) ≡ (λxy.x intersects y and every line that intersects both x and y intersects each of them obliquely).
b. (λxy.x intersects y obliquely) ≡ (λxy.x intersects y and x is not perpendicular to y).66

65 "Implicit definitions" might be offered as a counterexample to this claim. But in ordinary mathematical practice it is taken for granted that implicit definitions are shorthand for more complicated explicit definitions, e.g. of the form 'x is F =df x belongs to every set closed under such-and-such operations'. In cases where this recipe cannot be applied, the acceptability of a so-called implicit definition is a matter of deep philosophical controversy, in sharp contrast to the acceptability of standard stipulative definitions.

66 Understand 'intersect' in (27a) and (27b) in such a way that lines do not intersect themselves.

Although (26) follows straightforwardly from (27a) and (27b) (if x and y are both perpendicular to z, then by (27a) y intersects z; if y intersects both x and z, it intersects both x and z obliquely; and by (27b) y does not intersect z obliquely, so y cannot intersect x), the suggestion that they are both true does nothing to allay any puzzlement one might have had about (26).

Similarly, in the philosophy of mind, some philosophers (e.g. Setiya 2007, ch. 1) engaged with the question 'What is it to intend to do something?' take one of their central tasks to be that of explaining (28):

(28) Necessarily, everyone who intends to do something believes that they will do it.
The kind of understanding of (28) these philosophers are looking for certainly cannot be attained just by accepting some "circular" identification like (29):

(29) To intend to do something is to intend to do it while believing that you will do it.

If combinations like (27a) and (27b) or single identifications like (29) were tenable, the task of explaining puzzling facts about necessity by deriving them from identifications would be much easier than it seems in fact to be. This provides further motivation for the idea that there is some formal "non-circularity" constraint which is violated by pairs like these.

But what does it even mean to say that identifications cannot "run in a circle"? We had better be careful. Given Reflexivity, 'To be a vixen is to be a vixen' cannot count as "circular" in the objectionable sense; given Symmetry, neither can the combination of 'To be a vixen is to be a female fox' with 'To be a female fox is to be a vixen'. A more promising suggestion is that the relevant notion of circularity involves the term on one side of an identification occurring as a proper constituent of the term on the other side:

No Circles: A ≢τ B, where A and B are terms of any type τ such that B properly contains an occurrence of A in which no occurrence of any variable free in A is bound.

(Note that No Circles also rules out conjunctions of identifications (A1 ≡ B1) ∧ (A2 ≡ B2) where A1 is a proper constituent of B2 and A2 is a proper constituent of B1 (with no variables bound): we can use LL and the second conjunct to substitute B2 for the occurrence of A2 in B1 on the right hand side of the first conjunct, to get an identification A1 ≡ B1∗ where A1 is a proper constituent of B1∗.) Principles like No Circles have been taken seriously: for example, Correia (2010, p. 16) suggests that it may be correct given what he calls a "conceptualist view of factual equivalence". And perhaps Prior (1964, p. 193) is endorsing something like No Circles when he writes 'I cannot see how the sense of a sentence can ever be identical with a logical complication of itself'.67 But we cannot accept No Circles as it stands, since we have endorsed β-conversion, which implies φ ≡ (λp.p)(φ), and also (on independent grounds) tentatively endorsed φ ≡ ¬¬φ: both of these are identifications in which the term on one side of ≡ is a proper constituent of the term on the other. We need a principle that lets us discriminate between benign circles and vicious ones.

The idea I want to propose is this: the case where the term on one side of a true identification occurs as a proper constituent of the term on the other side can arise only if all of the other expressions on the more complex side (all of the other ingredients which combine with the term on the less complex side to form the term of which it is a proper constituent) are or are equivalent to logical terms, like ¬ and (λp.p) in the above examples. Non-logical entities are indissoluble, and always make for a genuine increase in complexity when they combine with something else.

To capture this idea in our higher-order language, let us help ourselves to a predicate Logicalτ, of type ⟨τ⟩, for every type τ. Logicalτ(xτ) should be true only if xτ is denoted by some closed term whose only constants are the logical constants ¬, ∧, ∨, ∀τ, ∃τ, and ≡τ. While this gloss may not constitute a satisfactory definition of 'Logicalτ', it seems to convey an adequate grip on the intended interpretation.
Using this vocabulary, we can state the proposal schematically as follows:

Only Logical Circles: (A ≡τ B) → Logicalσ(C), where σ and τ are any types, and A, B, C are any terms, of types τ, τ, and σ respectively, such that B contains an occurrence of A together with an occurrence of C that neither contains, nor is identical to, nor is contained by that occurrence of A, and none of the variables free in A or C are bound in either of these occurrences.

Recall that we count universal closures of instances of schemas as instances in their own right; thus all of the following are instances of Only Logical Circles:

(∃q(q) ≡ (λp.p)(∃q(q))) → Logical⟨⟨⟩⟩(λp.p)
∀x⟨⟨⟩⟩∀p⟨⟩((p ≡ x(p)) → Logical⟨⟨⟩⟩(x))
∀f ⟨e⟩∀p⟨⟩((f ≡ (λx.f(x) ∧ p)) → Logical⟨⟩(p))

67 Note however that there are many passages in Prior that suggest a commitment to β-conversion: he extracts predicates from sentences by replacing some constituents with blanks, and takes it for granted that in doing this he is providing one legitimate 'analysis' of the original sentence.

Note too that Only Logical Circles is a non-starter if we accept vacuous β-conversion. Vacuous β-conversion entails that p ≡ (λq.p)(p) is true for every p, which given Only Logical Circles would imply the obviously false conclusion that Logical⟨⟩(p) is true for every p.

Only Logical Circles does not look very elegant: the criterion for what counts as an instance of the schema involves some rather fiddly syntactic considerations. Fortunately, using the power of β-conversion, we can achieve the same effect as Only Logical Circles with the following simpler schema:

OLC: (x ≡τ λv1...vn.y(z, x, v1, ..., vn)) → Logicalσ(z)

This schema has one instance for every pair σ, τ of types where τ ≠ e: this fixes the types of the variables, since when τ is ⟨τ1, ..., τn⟩, v1, ..., vn must be of types τ1, ..., τn, and y must therefore be of type ⟨σ, ⟨τ1, ..., τn⟩, τ1, ..., τn⟩.
(When τ = ⟨⟩, the list of variables v1, ..., vn is empty, so in this case OLC is just (x ≡⟨⟩ y(z, x)) → Logicalσ(z), where y is of type ⟨σ, ⟨⟩⟩.) As a special case of OLC, we have the principle that if a proposition is the result of applying some operator to itself, that operator must be a logical one:

(p ≡ f(p)) → Logical⟨⟨⟩⟩(f)

(To get this from OLC take x = p, z = f, and y = λg⟨⟨⟩⟩q⟨⟩.(g(q)).) As we might put it with apologies to Prior: the sense of a sentence may never be identical to a non-logical complication of itself.

OLC follows from Only Logical Circles, since each of its instances is an instance of Only Logical Circles. It also implies Only Logical Circles. Assume for conditional proof that A ≡τ B, where B is a complex term containing non-overlapping occurrences of A and C in which none of their free variables are bound. Since B is complex, it is either a formula φ or an abstract λv1...vn.φ. Let u and w be variables of the same types as A and C which do not occur free in B, and let φ∗ be the result of substituting u and w for the two given non-overlapping occurrences of A and C in φ. Then φ is β-equivalent to (λwuv1...vn.φ∗)(C, A, v1, ..., vn), so by β-conversion, we can transform A ≡ B into

A ≡ (λv1...vn.(λwuv1...vn.φ∗)(C, A, v1, ..., vn))

So by universal instantiation, we can substitute A, (λwuv1...vn.φ∗), and C respectively for x, y, z in OLC and thus derive Logical(C).

As we hoped, OLC can be used to rule out combinations of identifications such as GRUE, BLEEN, GREEN, and BLUE. In fact we only need to consider GRUE and GREEN. Suppose both were true:

(30) a. G′ ≡ λxe.(Gx ∧ Ox) ∨ (Bx ∧ ¬Ox)
b. G ≡ λxe.(G′x ∧ Ox) ∨ (B′x ∧ ¬Ox)

Combining these using LL and β-conversion, we get

(31) G ≡ λxe.(((Gx ∧ Ox) ∨ (Bx ∧ ¬Ox)) ∧ Ox) ∨ (B′x ∧ ¬Ox)

in which the closed term on the left of ≡ is a proper constituent of the one on the right.
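The passage from (30a) and (30b) to (31) is a purely syntactic substitution, and can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: formula bodies are modelled as nested tuples, every predicate is applied to the single variable x, and the helper names (`app`, `grue_like`, `replace`, `constants`, `show`) are hypothetical. The sketch substitutes the right side of (30a) for G′x in the right side of (30b), and then lists the constants that occur alongside G on the right side of (31).

```python
def app(pred, var="x"):
    """Atomic formula: predicate constant applied to a variable, e.g. Gx."""
    return ("app", pred, var)


def conj(a, b):
    return ("and", a, b)


def disj(a, b):
    return ("or", a, b)


def neg(a):
    return ("not", a)


def grue_like(first, second):
    """Body of the abstract: (first ∧ Ox) ∨ (second ∧ ¬Ox)."""
    return disj(conj(first, app("O")), conj(second, neg(app("O"))))


GRUE_BODY = grue_like(app("G"), app("B"))      # right side of (30a)
GREEN_BODY = grue_like(app("G'"), app("B'"))   # right side of (30b)


def replace(term, target, replacement):
    """Replace every occurrence of the subterm `target` in `term`."""
    if term == target:
        return replacement
    if term[0] == "app":
        return term
    return (term[0],) + tuple(replace(sub, target, replacement) for sub in term[1:])


def constants(term):
    """Set of predicate constants occurring in a term."""
    if term[0] == "app":
        return {term[1]}
    found = set()
    for sub in term[1:]:
        found |= constants(sub)
    return found


def show(term):
    """Render a term in the notation of the text (fully parenthesised)."""
    tag = term[0]
    if tag == "app":
        return term[1] + term[2]
    if tag == "not":
        return "¬" + show(term[1])
    symbol = "∧" if tag == "and" else "∨"
    return "(" + show(term[1]) + " " + symbol + " " + show(term[2]) + ")"


# LL plus β-conversion: put the right side of (30a) where G'x stood in (30b).
BODY_31 = replace(GREEN_BODY, app("G'"), GRUE_BODY)

if __name__ == "__main__":
    print(show(BODY_31))
    # G recurs on the right side of (31); the remaining constants are the
    # candidates that Only Logical Circles would require to be logical.
    print(sorted(constants(BODY_31) - {"G"}))
```

Since the constants here are atomic, occurrences of distinct constants never overlap, so the check for a non-overlapping occurrence reduces to set membership; a fuller treatment would need to track positions and bound variables.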
Since the term on the right contains non-overlapping occurrences of the constants G and B, (31) implies Logical⟨e⟩(B) by Only Logical Circles.68 But given the intended meaning for 'Logical', this conclusion (that being blue is logical) should seem obviously false.69 Of course it is not enough to establish this merely to point out that 'blue' is not itself on our official list of logical constants. There is nothing to stop us from introducing new constants equivalent to the given logical constants, or to complex terms built out of them. Nevertheless, when we consider a sample of closed terms of type ⟨e⟩, the idea that 'blue' is even coextensive with any such term, let alone equivalent to one in the sense of ≡, looks terribly implausible:

λx.x = x
λx.∃f ⟨e⟩fx
λx.∀f ⟨e⟩¬fx
λx.∃f ⟨e⟩(fx ∧ ∃y(y ≠ x ∧ fy))
λx.∃f ⟨e⟩(fx ∧ ∀y(y = x → ¬fy))
λx.∃f ⟨e,⟨⟩⟩(f (x, f (x)))

The idea that we could, even in principle, state necessary and sufficient conditions for something to be blue given only this meagre list of ingredients to work with strains credulity far past the breaking point.70 We can give a parallel argument from OLC against many other intuitively "circular" combinations of identifications: for example, we can rule out the combination of (27a) and (27b) using the equally plausible auxiliary premise that perpendicularity is not logical.

68 To reach the same conclusion using OLC, we need to apply another β-conversion to (31) to get to something which has the right form to instantiate the antecedent of OLC:

G ≡ λxe((λf ⟨e⟩h⟨e⟩x.(((hx ∧ Ox) ∨ (fx ∧ ¬Ox)) ∧ Ox) ∨ (B′x ∧ ¬Ox))(B, G, x))

69 Note that even Carnap 1928 does not endorse this: although he calls everything on his final list of unreduced vocabulary "logical", it includes not just truth-functional operators and quantifiers but a higher-order predicate 'fund', expressing something like naturalness.

70 One might be tempted to argue that being blue isn't logical from the premise that any two objects have exactly the same logical properties. But this is not obviously true. As we will see in §9, it is plausible in the present setting that qualitativeness is definable in logical terms, in which case so are qualitative indiscernibility (sharing all the same qualitative properties) and being qualitatively indiscernible from something distinct from oneself. It is possible that some but not all objects have this property: for example, this would plausibly be the case if the world consisted of two spatiotemporally disconnected parts of which one but not the other was mirror-symmetric. The suggestion that being blue is logical is thus not quite as absurd as we might have supposed. We can imagine that the everyday world consists of swarms of co-located, qualitatively indiscernible objects, and that swarms containing different numbers of such objects tend to reflect different amounts of light at different wavelengths. Insofar as this is an epistemic possibility, perhaps we should allow that it might turn out, say, that being blue is being qualitatively indiscernible from exactly seventeen other objects, in which case being blue is logical after all. However, actual colour science does not provide a fertile ground for such far-fetched speculations.

In fact, in many cases, we can get by without having to rely on any such auxiliary premise. There is no way, in our higher-order language, to build a term of type e out of logical constants. (In fact there are no complex terms of type e at all.) Thus on the intended interpretation of 'Logical', 'Logical(z)' is always false when z is of type e. So by setting σ = e, we can extract from OLC the following schema, whose instances are themselves purely logical sentences:

Qual: x ≢τ λv1...vn.y(ze, x, v1, ..., vn)

Using Qual, we can rule out the conjunction of GRUE and GREEN without having to rely on any premises about logicality; indeed the only additional premise we need is the claim that there is at least one object. Choose any object a; then by applying both sides of (31) to a and β-converting, we get

(32) Ga ≡ (((Ga ∧ Oa) ∨ (Ba ∧ ¬Oa)) ∧ Oa) ∨ (B′a ∧ ¬Oa)

A second β-conversion then yields

(33) Ga ≡ (λxep⟨⟩.(((p ∧ Ox) ∨ (Bx ∧ ¬Ox)) ∧ Ox) ∨ (B′x ∧ ¬Ox))(a, Ga)

whose negation is an instance of Qual.

OLC has the striking consequence that no non-logical proposition is its own self-conjunction:

(34) (p ≡ (p ∧ p)) → Logical⟨⟩(p)

For the antecedent is β-equivalent to p ≡ (λqr.r ∧ q)(p, p), which implies Logical⟨⟩(p) by OLC. Turning to types other than ⟨⟩, it turns out we only need Qual to derive the even more sweeping consequence that no property is its own self-conjunction. For example, in type ⟨e⟩, Qual implies the schema

(35) f ≢⟨e⟩ λx.(f(x) ∧ f(x))

For suppose that f ≡ λx.(f(x) ∧ f(x)); then for any object a, we would have f(a) ≡ f(a) ∧ f(a), which is β-equivalent to f(a) ≡ (λzeq⟨⟩.(q ∧ f(z)))(a, f(a)), which is inconsistent with Qual. This generalises to arbitrary complex types ⟨τ1, ..., τn⟩ (n ≥ 1):

(36) f ≢⟨τ1,...,τn⟩ λx1...xn.(f(x1, ..., xn) ∧ f(x1, ..., xn))

Suppose f were a counterexample to this; let A1...An be some terms of types τ1...τn whose only free variable is ze, and let B1...Bn be the result of substituting the name a for ze in these terms.71 Then we have

(37) f(B1, ..., Bn) ≡ (λzeq⟨⟩.(q ∧ (λze.f(A1, ..., An))z))(a, f(B1, ..., Bn))

again contradicting Qual. Given this derivation of (36) for all types other than ⟨⟩, considerations of uniformity arguably favour strengthening (34) to p ≢ (p ∧ p) even in type ⟨⟩.

The replacement of Idempotence with its negation (at least in complex types) is perhaps the most distinctive hallmark of the particular kind of "fine-grained" theory we are developing.
It distinguishes it, for example, from Goodman's theory of aboutness (mentioned in note 40), and from the theories of "worldly factual equivalence" developed by Correia (2010, 2016), all of which endorse Idempotence. I admit that this feature is quite surprising: especially given that we are endorsing Involution, you might have expected other especially "trifling" equivalences in propositional logic to correspond to true identifications. Correia and Skiles MS suggest the rejection of Idempotence as a hallmark of a "conceptual" or "representational" (as opposed to "worldly") conception of the kind of claim made by sentences of the form 'φ ≡ ψ', where this is taken to involve, for example, denying that 'a is a water molecule ≡ a is a H2O molecule' is true, on the grounds that its two sides involve distinct "concepts". But I insist that despite the fact that the present theory rejects Idempotence, it is still intended as a theory of a kind of claim just as "worldly" as ordinary identity claims. The hypothesis is that in the very sense in which it is true that to be a water molecule is to be a H2O molecule, it is just not true that to be red is to be red and red. I insist that these claims are not obviously false. They are "edge cases" of the sort one would only ever consider as part of the kind of systematic logical investigation we are currently engaged in, and as such they should be settled on the basis of broader systematic considerations.72

71 A proof that there is at least one such λI-term Ai in every type can be extracted from a proof of a related fact given in the Appendix, note 105.

Many of the other Boolean theorems and axioms turn out, given OLC, to have the same status as Idempotence: universally false in complex types, and also in type ⟨⟩ with a possible exception for logical propositions. For example, consider Dissolution, p ≡ p ∧ (q ∨ ¬q). This is β-equivalent to p ≡ (λrs.s ∧ (r ∨ ¬r))(q, p), which implies Logical(q) by OLC.
In type ⟨e⟩, f ≡ λx.f(x) ∧ (g(x) ∨ ¬g(x)) implies f(a) ≡ (λx^e q^⟨⟩.q ∧ (g(x) ∨ ¬g(x)))(a, f(a)), which is inconsistent with Qual. Absorption and Annihilation go the same way, as does the combination of the two Distributivity principles.73 This replacement of ∧-Idempotence with its opposite (at least in complex types) shows that the present theory fulfils our aspiration to have a theory that is inconsistent with Booleanism and comparable to it in generality. Moreover, the fact that Qual is enough by itself to secure so much of the strength of this result shows that the interest of the theory does not depend crucially on the presence in the language of the undefined predicates 'Logicalσ'. Someone might object that the apparent intelligibility of these predicates is an illusion based on taking model theory too seriously, or that a theory using such predicates suffers from a pernicious kind of ideological complexity. I think that these objections are mistaken, but I will not engage with them here except to note that they are not objections to Qual.

72 An important worry about the failure of idempotence concerns infinite conjunctions. If infinitary conjunction is intelligible at all, we can form the infinite conjunction of φ, φ ∧ φ, φ ∧ φ ∧ φ, ...: call it φω. If φ is non-logical, so is φω; so by (34), we have φω ≢ (φω ∧ φω). But this is puzzling, since we might think that φω and φω ∧ φω are both just the conjunction of φ with itself countably many times. The best resolution to this puzzle, I suspect, will involve rejecting ∧-Associativity even in the finite case. If associativity fails, we should not expect that describing a proposition as a 'conjunction of so-and-so many copies of φ' would suffice to pin it down uniquely. For example, we will distinguish ∧(∧(φ, φ), ∧(φ, φ)) from ∧(φ, ∧(φ, ∧(φ, φ))); we may also want to distinguish both of these from ∧(φ, φ, φ, φ) (a simple, quaternary conjunction). However there are several further choice points we need to consider in order to come up with an account of infinitary conjunction and disjunction that fits with OLC.

73 The hardest case is that of Distributivity. Suppose we have a non-logical proposition f(a), call it p for short. Combining ∧∨-Distributivity and ∨∧-Distributivity yields p ∧ (p ∨ p) ≡ ((p ∧ p) ∨ p) ∧ ((p ∧ p) ∨ p); by Commutativity this yields p ∧ (p ∨ p) ≡ (p ∨ (p ∧ p)) ∧ ((p ∧ p) ∨ p); substituting in the first disjunct another application of ∧∨-Distributivity then gives us p ∧ (p ∨ p) ≡ ((p ∨ p) ∧ (p ∨ p)) ∧ ((p ∧ p) ∨ p), and another application of ∨∧-Distributivity turns this into p ∧ (p ∨ p) ≡ ((p ∧ (p ∨ p)) ∨ (p ∧ (p ∨ p))) ∧ ((p ∧ p) ∨ p). This, finally, has the left hand side as a proper constituent of its right hand side, so that we can apply Only Logical Circles to get Logical(p), which is impossible if p ≡ f(a). Note too that given De Morgan's and Involution, which we want to accept, the two Distributivity principles become equivalent.

It is worth noting that OLC, and indeed Qual, requires rejecting the principle of Plenitude which we considered back in §6, according to which every functional relation between propositions corresponds to an operator. For any property f and object a, we can consider the relation R =df λpq.(q ≡ f(a) ∧ (p ≡ p)), which every proposition bears uniquely to f(a). Since R is functional, Plenitude implies that there is a corresponding operator z, such that for any p, R(p, z(p)), i.e. for any p, z(p) ≡ f(a). So in particular, f(a) ≡ z(f(a) ∧ f(a)). Since this β-converts to f(a) ≡ (λxp.z(f(x) ∧ p))(a, f(a)), it is inconsistent with Qual.
The present theory thus commits one just as much as the structured picture did to making a distinction in principle between genuine operators (which allow existential generalisation) and mere "quasi-operators" which use the meanings of their input sentences to fix the meanings of their output sentences in a manner captured by some functional ζ^⟨⟨⟩,⟨⟩⟩ that does not correspond to any χ^⟨⟨⟩⟩.

Now that we have seen how to derive lots of interesting, controversial, and anti-Boolean conclusions from OLC (together with β-conversion and the background classical logic), an urgent question is whether you can derive everything else as well, including the negations of these conclusions. In other words: is OLC consistent? The answer is not at all obvious, but in the Appendix, I show that it is yes, by constructing a nonempty class of set-theoretic models for a λI-language in which all instances of OLC have value 1, and in which all the rules of the background logic (including βη-conversion) preserve value 1, although not everything has value 1. Since Involution, De Morgan, Commutativity, and Associativity also hold in some of the models in the class, the proof also shows that OLC is consistent with these principles. Moreover, in these models, 'Logicalσ(x)' has value 1 on an assignment only if the denotation of x on that assignment is the same as that of some closed term built out of logical constants; thus, the models are consistent with 'Logical' meaning what I wanted it to mean.

In fact, OLC also remains true even when we contract the extension of 'Logical' further so that 'Logicalσ(x)' has value 1 on an assignment only when the denotation of 'x' is the same as that of some closed term in which the only constant is ¬: i.e. a pure term such as λp^⟨⟩.p or λf^⟨e,e⟩ xy.f(y, x) or λf^⟨⟨⟩⟩ p^⟨⟩.f(f(p)), or the result of inserting some negation symbols into such a term.
One result of this strengthening is that we can no longer build any "logical" terms of type ⟨⟩, just as we previously could not build any logical terms of type e. This means that 'Logical⟨⟩(p)' is always false, so we derive another schema not involving 'Logical', analogous to Qual:

x ≢τ λv1...vn.y(z^t, x, v1, ..., vn)

Amongst other things, this lets us drop the exception for logical propositions from (34) and the negations of the other Boolean schemas.74 I am not sure whether it is a good move to strengthen OLC in this way. On the one hand, if we keep Involution, the strengthened view accords to negation a certain status (that of being able to take part in "circles") that it denies to the other logical constants, which might seem invidious. On the other hand, the strengthened view has the nice feature that it makes type ⟨⟩ behave more uniformly with other types with respect to the negations of Boolean axioms. I leave the question open for now.

9 Priority

Despite our disagreements about their logic, I think we understand identifications very well. My grip on them feels much firmer than my grip on expressions like the following, which have also been cropping up in philosophy from its inception:

fundamental
more fundamental than
metaphysically primitive
derivative from
metaphysically prior to
prior in the order of being

These expressions are, I think, among the most obscure in all of philosophy. Looking at the wide variety of ways in which they are used, we may reasonably worry that they are hopelessly vague, to such an extent that no sentence expressed in terms of them would be both interesting and definitely true. So it will be a substantial advance if it turns out that certain such sentences can be glossed or reconstructed in terms of identifications.75 Against the background of the kind of theory explored in §8, we can define something which can reasonably be thought of as capturing a notion of metaphysical priority.
For any x and y, we can say that x is "weakly prior" to y iff y is the result of saturating an argument place of some z with x, and that x is "strictly prior" to y iff x is weakly prior to y and y is not weakly prior to x. x and y here can be variables of any type other than e (since it doesn't make sense for an object to be the result of saturating an argument place of a relation). Formally, we express this using connectives ≼σ,τ and ≺σ,τ, both of type ⟨σ, τ⟩ where σ and τ are any types other than e:

x ≼σ,τ y =df ∃z(y ≡τ λv1...vn(z(x, v1, ..., vn)))
x ≺σ,τ y =df (x ≼σ,τ y) ∧ ¬(y ≼τ,σ x)

Here where τ is ⟨τ1, ..., τn⟩, the variables v1, ..., vn must be of types τ1, ..., τn respectively, and so z must be of type ⟨σ, τ1, ..., τn⟩. When τ is ⟨⟩, the variable list v1, ..., vn is empty, so x ≼σ,⟨⟩ p just means ∃z(p ≡ z(x)). We can also introduce a connective expressing the case of weak priority without strict priority:

x ≈σ,τ y =df x ≼σ,τ y ∧ y ≼τ,σ x

This could be pronounced as "x and y are coeval".

74 This also applies to all the other types σ in which there are no pure terms, for example ⟨e, ..., e⟩. Moreover, in types with only finitely many pure terms up to βη-equivalence, there are also only finitely many terms whose only constant is ¬ up to βη-equivalence and double negation elimination; so for these σ we can also restate OLC in purely logical terms by replacing 'Logicalσ' with a finite disjunction containing one representative of each equivalence class. For example, in type ⟨⟨e, e⟩, e, e⟩, there are just four such equivalence classes, represented by λrxy.r(x, y), λrxy.r(y, x), λrxy.¬r(x, y), and λrxy.¬r(y, x).

75 I feel the same way about 'grounds' and 'in virtue of', but the ideas in the present section are less directly relevant to them. For one natural approach to analysing grounding in terms of identifications, see Correia and Skiles MS.
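As a small worked illustration of the definition of weak priority in the simplest case (both types ⟨⟩), the following sketch exhibits a witness showing that a conjunct is weakly prior to a conjunction containing it. The propositional variables p and q and the choice of conjunction are illustrative assumptions rather than examples drawn from the text, and the only principles used are non-vacuous β-conversion and existential generalisation. The block assumes the amsmath and amssymb packages.

```latex
% Illustrative sketch only: p and q are arbitrary propositions (type <>).
\begin{align*}
p \wedge q &\equiv (\lambda r.\,(r \wedge q))(p)
  && \text{$\beta$-conversion (non-vacuous: $r$ occurs in the body)}\\
\exists z\,\bigl(p \wedge q &\equiv z(p)\bigr)
  && \text{existential generalisation, with } z := \lambda r.\,(r \wedge q)\\
p &\preccurlyeq_{\langle\rangle,\langle\rangle} (p \wedge q)
  && \text{definition of } \preccurlyeq_{\sigma,\langle\rangle}
\end{align*}
```

The same pattern, with the witness λr.(p ∧ r), gives q ≼ (p ∧ q). Nothing in this sketch settles whether the priority is strict: that would further require that p ∧ q not be weakly prior to p, which the definitions alone leave open.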
Using βη-conversion we can show that weak priority is reflexive (∀x(x ≼τ,τ x)) and transitive (∀x∀y∀z((x ≼σ,τ y ∧ y ≼τ,ρ z) → x ≼σ,ρ z)).76 Weak priority is not a partial order, however: even when x and y are of the same type τ, we can have x ≈τ,τ y without x ≡τ y. For example, Involution entails that p ≈⟨⟩,⟨⟩ ¬p, since p ≡ ¬(¬p) entails ∃z(p ≡ z(¬p)). Similarly, β-conversion entails that r ≈⟨e,e⟩,⟨e,e⟩ λxy.r(y, x): each relation is coeval with its converse, since r ≡ (λsxy.s(y, x))(λxy.r(y, x)).

76 Proof: For reflexivity, note that when x is of type ⟨τ1, ..., τn⟩, x ≡ x is η-equivalent to x ≡ λy1...yn.x(y1, ..., yn), which is β-equivalent to x ≡ λy1...yn.(λzu1...un.z(u1, ..., un))(x, y1, ..., yn), which implies ∃z(x ≡ λy1...yn.z(x, y1, ..., yn)) by existential generalisation. For transitivity, suppose a ≼σ,τ b and b ≼τ,ρ c, where τ is ⟨τ1, ..., τn⟩ and ρ is ⟨ρ1, ..., ρm⟩. Then for some d^⟨σ,τ1,...,τn⟩ and e^⟨τ,ρ1,...,ρm⟩, b ≡ λu1...un.(d(a, u1, ..., un)) and c ≡ λv1...vm.(e(b, v1, ..., vm)). Substituting the first of these into the second yields
c ≡ λv1...vm.(e(λu1...un.(d(a, u1, ..., un)), v1, ..., vm))
which is β-equivalent to
c ≡ λv1...vm.((λw0...wm.e(λu1...un.d(w0, u1, ..., un), w1, ..., wm))(a, v1, ..., vm))
which implies ∃x(c ≡ λv1...vm.(x(a, v1, ..., vm))) by existential generalisation. All this looks less forbidding if we work in a functional language of the kind explained in Appendix A2.

Strict priority, on the other hand, is a strict partial order: transitive because ≼ is, and irreflexive because ≼ is reflexive. OLC gives us an easy way to establish claims of strict priority: whenever b ≡ λv1...vn.y(z, a, v1, ..., vn) and z is not logical, a must be strictly prior to b. For example, being green is strictly prior to being grue, since grue ≡ λx.(λf^⟨e⟩ y^e.(fy ∧ Oy) ∨ (By ∧ ¬Oy))(green, x), and being green is not logical.
Whenever we have a true identification B ≡ C where A is a constituent of C and C also has a nonoverlapping, non-logical constituent, OLC implies that A ≺ B, since if there were any way to get back to A by plugging B into an argument place of some other term, the result would then be the kind of circle forbidden by OLC. This is a good fit for the way in which the notion of priority is used in informal philosophical settings. For example, the theory that to know something is to have a justified true belief in it would be naturally described as one on which belief is prior to knowledge, whereas the theory that to believe something is to be such that one would know it if one were in normal circumstances would be described as one on which knowledge is prior to belief.

By contrast, three of the competing views we have considered entail that the notions of priority we have just defined are far too undiscriminating to be of any interest: for some or all σ, τ, everything is weakly prior to everything else (∀x^σ∀y^τ(x ≼σ,τ y)) and thus nothing is strictly prior to anything (¬∃x^σ∃y^τ(x ≺σ,τ y)). Booleanism and vacuous β-conversion both imply that ≼σ,τ is universal for any σ and τ (other than e). Plenitude, meanwhile, implies that ≼σ,⟨⟩ is universal (every proposition is weakly posterior to everything), which is nearly as bad.

Proof. Let σ = ⟨σ1, ..., σn⟩ and τ = ⟨τ1, ..., τm⟩. (i) It is easy to see why Booleanism entails that ≼⟨⟩,⟨⟩ is universal: for any p and q, we have q ≡ q ∧ (p ∨ ¬p) by Dissolution, hence q ≡ (λr.q ∧ (r ∨ ¬r))(p), so p ≼ q. To generalise this reasoning to show that x ≼σ,τ y, let z1, ..., zn be of types σ1, ..., σn; then Dissolution implies that
y ≡ λv1...vm.(y(v1, ..., vm) ∧ (x(z1, ..., zn) ∨ ¬x(z1, ..., zn)))
which implies x ≼σ,τ y for the same reason as before. (ii) It is easy to see why vacuous β-conversion entails that ≼⟨⟩,⟨⟩ is universal: for any p and q, we have q ≡ (λr.q)(p), which implies p ≼ q.
To generalise this reasoning to show that x ≼σ,τ y, consider the following vacuous β-equivalence:
y ≡ λv1...vm.((λu0...um.y(u1, ..., um))(x, v1, ..., vm))
(iii) To show that Plenitude implies the universality of ≼σ,⟨⟩, for a given q let Rq be λu^σ s^⟨⟩.u ≡ u ∧ s ≡ q. By Plenitude, there is a corresponding operator Oq (of type ⟨σ⟩) such that for all x^σ, Rq(x, Oq(x)). By β-conversion this implies that for all x, Oq(x) ≡ q, and hence that x ≼σ,⟨⟩ q.77

Indeed, given Booleanism or Plenitude, there seems to be no prospect of finding a non-trivial reconstruction of questions about metaphysical priority in terms of identifications.78

We have not provided any way of making sense of the question whether an object is strictly prior or strictly posterior to something else: ≺σ,τ is well-defined only when both σ and τ are types other than e. By contrast, our definition of ≼σ,τ actually makes perfectly good sense if σ is e, so long as τ is not e. Extending the use of ≼ to this case, we can say for example that Obama ≼e,⟨⟩ Obama is tall, and similarly Obama ≼e,⟨e⟩ λx.x admires Obama. Saying that an object is weakly prior to a proposition, property, or relation is a way of saying that it is about that object, in a demanding sense of "about": a sense in which it is at least not obviously true that the proposition that every man is mortal is about every man.
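The two Obama examples can be spelled out against the definition of ≼ by exhibiting the witnessing z in each case. The sketch below is illustrative only: the constants o (for Obama), Tall (type ⟨e⟩) and Admires (type ⟨e,e⟩) are hypothetical names introduced here, and the second line relies on non-vacuous β-conversion together with renaming of bound variables. The block assumes the amsmath and amssymb packages.

```latex
% Hypothetical constants: o : e, Tall : <e>, Admires : <e,e>.
\begin{align*}
\mathrm{Tall}(o) &\equiv z(o)
  && \text{with } z := \mathrm{Tall},
  \text{ so } o \preccurlyeq_{e,\langle\rangle} \mathrm{Tall}(o)\\
\lambda x.\,\mathrm{Admires}(x, o) &\equiv \lambda v.\,z(o, v)
  && \text{with } z := \lambda y u.\,\mathrm{Admires}(u, y),
  \text{ so } o \preccurlyeq_{e,\langle e\rangle} \lambda x.\,\mathrm{Admires}(x, o)
\end{align*}
```

In the second case z(o, v) β-reduces to Admires(v, o), so the right-hand side differs from the left only in the name of its bound variable.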
So long as we deny that ≼e,⟨⟩ is universal, there is a strong case for identifying qualitativeness with not being, in this demanding sense, about anything.79 That is:

Qualitativeτ ≡ λx^τ.¬∃y(y ≼e,τ x)

Since the good standing of the notion of qualitativeness is relatively uncontroversial, the fact that we can give this simple analysis of it in logical terms is a significant advantage of the present framework, which rejects Booleanism, vacuous β-conversion, and Plenitude.80 For the reasons given above, Booleanism and vacuous β-conversion imply that ∀x^e∀y^τ.x ≼e,τ y, and Plenitude implies that ∀x^e∀p^⟨⟩.x ≼e,⟨⟩ p, which rules out accepting the above account of qualitativeness on pain of having to say that nothing at all is qualitative.

77 Plenitude is a special case of a stronger principle (Strong Plenitude, discussed in §A3 of the Appendix) which implies that ≼σ,τ is universal for all σ, τ ≠ e.

78 Proponents of vacuous β-conversion might, by contrast, hope to restore nontriviality by somehow restricting the existential quantification in the definition of ≼. However it is far from obvious how this could be done while preserving transitivity. Even if we reject all three of Booleanism, vacuous β-conversion, and Plenitude, there is of course no guarantee that our ≼ and ≺ will behave in a way that would make it reasonable to think of them as expressing notions of metaphysical priority. For example, although J. Goodman (MS) rejects q ≡ q ∧ (p ∨ ¬p) in general, he accepts it in the case where p is qualitative, so his view implies that x ≼ y whenever x is qualitative; more generally, x ≼ y exactly when x is about every object that y is about.

79 Khamara (1988) considers a suggestion like this: 'A property, P, is impure [non-qualitative] if and only if there is at least one individual, y, such that, for any individual, x, x's having P consists in x's having a certain relation to y.'

80 This advantage does not turn on accepting anything like OLC, however. The same definition of qualitativeness is available in the setting of Goodman's theory of aboutness (J. Goodman MS), although that theory makes our ≼ pretty useless as an account of priority talk, since everything qualitative is weakly prior to everything.

The ease with which we can make sense of the notion of metaphysical priority, or relative fundamentality, naturally raises the question whether we can also make sense of a corresponding notion of absolute fundamentality.81 Here is an obvious initial suggestion:

(F1) Fundamentalτ(x) =df ¬∃y(y ≺τ,τ x)

Loosely: a fundamental property or relation is one to which no other property or relation of the same type is strictly prior.82 However, this threatens to be in one respect too demanding, and in another respect not demanding enough. To see how it might be too demanding, consider the thesis that conjoining any property with self-identity yields the same property back:

(38) ∀f^⟨e⟩(f ≡ (λx.fx ∧ x = x))

Since conjunction and identity are logical, this is not ruled out by OLC. (38) implies that self-identity is weakly prior to all properties, i.e. ∀f((λx.x = x) ≼⟨e⟩,⟨e⟩ f), and hence that the only properties that are fundamental in the sense of (F1) are those that are also weakly prior to self-identity. But given OLC, (38) also implies that the only properties that are weakly prior to self-identity are logical properties. (Suppose f is weakly prior to self-identity: then there is some z such that (λx.x = x) ≡ (λx.z(f, x)), hence f ≡ (λx.fx ∧ x = x) ≡ (λx.(fx ∧ x = x) ∧ x = x) ≡ (λx.(fx ∧ z(f, x)) ∧ z(f, x)), which implies Logical⟨e⟩(f) since it contains two non-overlapping occurrences of f.)
80 The above definition will also look problematic when combined with certain views that postulate special objects whose logical behaviour is very different from that of ordinary individuals. For example, if one believed in propositions (understood as special objects) and held that ∀p^⟨⟩∃x^e(p ≡ True(x)), one would object that the definition wrongly implies that ∀p^⟨⟩¬Qualitative(p). This is not the right place for an argument against such views.

81 For the idea that we should want to be able to talk about naturalness or fundamentality for quantifiers and connectives as well as ordinary predicates, see Sider 2011; for a higher-order implementation of this thought, see Dorr and Hawthorne 2013, sect. 2.

82 Note that any notion of fundamentality defined in these terms will be one on which anything coeval with something fundamental is itself fundamental; in particular, given Involution, the negations of fundamental properties are themselves fundamental. See Plate 2016 for an analysis of fundamentality (or "logical simplicity") in the setting of a first-order theory of properties that also embraces this consequence.

We could respond to this just by giving up (38), something we will already be committed to if we adopt the strengthening of OLC we were contemplating at the end of §8, where 'Logicalσ(x)' is interpreted so that it is true only when x is the denotation of a closed term whose only constant is ¬. But an analogous worry will still arise in other types. For example, we definitely have to admit that the identity operator λp.p is weakly prior to every other operator, since x^⟨⟨⟩⟩ ≡ λq.(λyr.y(x(r)))(λp.p, q) by β-conversion. Thus the only fundamental operators are those (like negation) which count as 'Logical' in the sense relevant to OLC, since OLC implies that only these could be weakly prior to the identity operator.
This is not completely indefensible (no positive claims about fundamentality are uncontroversial), but it is especially odd in the context of Involution, since Involution implies that negation is fundamental in the sense of (F1), and if we are saying this there is pressure to say that at least one of conjunction and disjunction is fundamental too. It seems better to modify (F1) by simply leaving logical entities (in whatever turns out to be the sense relevant to OLC) out of consideration altogether:

(F2) Fundamentalτ(x) =df ∀y^τ(y ≺τ,τ x → Logicalτ(y))

This addresses the first worry, but still leaves us with the second worry, about not being demanding enough. The problem is that it allows x^τ to count as fundamental even when it is strictly posterior to some y^σ, where σ is some type other than τ. For example, given a (non-logical) binary relation r^⟨e,e⟩, we can build up properties like λx^e.r(x, x), λx^e.r(x, a), and λx^e.∃y^e(r(x, y)). These should not count as fundamental in the target sense, since r is strictly prior to all of them. However there is no special reason to expect any non-logical property f^⟨e⟩ to be strictly prior to any of them. To address this problem, it looks like (F2) needs to be strengthened to something like this:

(F3) Fundamentalτ(x) =df ⋀σ ∀y^σ(y ≺σ,τ x → Logicalσ(y))

where the right hand side is an infinite conjunction with one conjunct for every type σ other than e. However, such an infinite conjunction is not expressible in the kind of higher-order language we have been working in, and its intelligibility raises some difficult issues. It is not enough to be tolerant of infinite conjunctions and disjunctions, since such tolerance is naturally combined with a tolerance of types of transfinite adicity (i.e. predicates that make a sentence only when combined with infinitely many arguments), and one might hold that there are too many such types for it to be possible to subsume all of them even in an infinite conjunction or disjunction.
A proper treatment of this issue is beyond the scope of the present work. On an optimistic note, however, it's not clear that we really need infinitely many conjuncts. If a property f^⟨e⟩ is derived from a three-place relation r^⟨e,e,e⟩ by reflexivisation, or by quantification, or by saturating its arguments with objects, then it is also derived in one of these ways from some binary relation s^⟨e,e⟩. So at least if these operations are representative of the ways in which something of one type can be strictly posterior to something of a more complex type, it looks defensible to omit the conjunct corresponding to ⟨e, e, e⟩ in the definition of Fundamental⟨e⟩. And this suggests that we might modify (F3) by restricting the conjunction to a finite collection of types whose complexity (on some measure) is not too much greater than that of the given type τ.83

An account of absolute fundamentality in logical terms would be a very nice thing to have, since it would help to precisify, and perhaps also to resolve, a wide range of interesting but elusive metaphysical questions.84 But even setting this aside, a purely logical conception of priority (relative fundamentality) is nothing to sniff at. Claims of priority turn up all over philosophy, especially when we want to engage in a kind of "big picture" thinking that abstracts away from the nitty-gritty details of particular controversial identifications. The fact that the non-circularity picture provides an explanation of priority that justifies our standard practice of reasoning from identifications to priority claims is a significant additional consideration in its favour.

83 The resulting analysis of fundamentality will not be quite as neutral as we might have hoped. For example, someone who believed in a fundamental contemplating relation of type ⟨e, ⟨e, e, e⟩⟩ (a relation between objects and three-place relations) as well as a fundamental betweenness relation of type ⟨e, e, e⟩ might object that dropping the conjunct for type ⟨e, e, e⟩ in the definition of Fundamental⟨e⟩ leads to contemplating betweenness being incorrectly classified as fundamental (since we have to ascend to type ⟨e, e, e⟩ before we find something nonlogical that is strictly prior to it, namely betweenness). But complete neutrality on all potentially controversial questions is too much to demand. Unlike, for example, a definition of fundamentality that took the form of a mere list, the finitised version of (F3) leaves open a very wide array of views about what is fundamental, and thus provides a reasonable way of reconstructing what might be at stake in many debates about fundamentality.

84 For example, we could consider to what extent fundamentality as defined plays the various roles for perfect naturalness discussed in Dorr and Hawthorne 2013.

10 Conclusion

Given how central identifications have always been in philosophy, it is surprising how little has been done to explore their logic. I think people have been held back by the assimilation of identifications to questions about the identity of properties, together with the assumption that the latter questions would turn out to be merely verbal. This leads to the idea that the right response to the kinds of general questions we have been concerned with (questions like 'Is it the case that to be red is to be red and either square or not square?' and 'Is it the case that to be red is to be red and red?') will always be something like this: 'If you are talking about "properties" in sense A, then obviously yes; if you are talking about "properties" in sense B, then obviously no'.85 But as a reason for scepticism, this is premature. Indeed it is quite obscure what conception of the range of available readings for identifications could justify the assumption that our very general questions will turn out to have boringly obvious answers on all their readings, although more specific questions of identification (whether to be morally right is to maximise happiness, whether to be water is to be H2O, and so forth) still have readings on which their answers are interesting and non-obvious.

I hope that the present investigation can illustrate the progress that can be made when one, instead, takes the general questions as seriously as we are used to taking the specific ones. As we have seen, it is not just a matter of settling some intrinsically uninteresting edge cases. Rather, different systematic approaches to such questions reflect deeply and fascinatingly different views about the nature of reality at an extremely general level. Moreover, we have seen how investigating the logic of identifications might illuminate many other questions traditionally of interest to metaphysicians, including questions about metaphysical necessity, priority, and fundamentality. And the exploration has barely begun: there is a whole continent of views waiting to be mapped out, and at this point we can only guess which of them will look most believable in the long run. Onwards!86

85 Influential here has been the view of Lewis, who writes, concerning the question whether triangularity and trilaterality are the same property, that 'I don't see it as a matter of dispute. Here there is a rift in our talk of properties, and we simply have two different conceptions' (Lewis 1986, p. 55).

86 This paper has been in progress in some form or another for a very long time, and there is no way I could thank all those who deserve thanks. But I would like to mention Lucas Champollion, Kit Fine, John Hawthorne, Thomas Hofweber, Jessica Moss, Jim Pryor, Mark Schroeder, Kieran Setiya, Ted Sider, Zoltan Szabó, Peter van Inwagen, and Timothy Williamson. Special thanks to Andrew Bacon, Peter Fritz, and Jeff Russell, who provided helpful guidance; and especially special thanks to Jeremy Goodman, whose influence on the final result has been pervasive.

A Appendix

The central goal of the appendix is to prove the consistency of OLC in a classical deductive system with nonvacuous βη-conversion. However, I also want to give a more rigorous presentation of the syntax of the higher-order languages I have been working with than I gave in the main text, and to present a general model theory for these languages that will also be useful to those investigating other views of the logic of identifications.

The plan is as follows. A1 will give a more precise characterisation of the syntax of the higher-order languages introduced in §4 (relationally typed higher order languages). A2 will characterise a different family of languages (functionally typed higher order languages), and show how to translate back and forth between them and relationally typed languages. A3 will present the basic definition of a model, and A4 will put this definition in context by introducing some significant properties of models. A5 and A6 will develop some definitions and results that will be useful in the proof of the main model existence theorem: this proof will then be given in §A7. Finally A8 will consider some extensions of the result.

There is much here that is well-known to those who know it. In particular I will be drawing extensively on the textbook Hindley and Seldin 2008 (henceforth HS) and on Benzmüller, Brown and Kohlhase 2004 (henceforth BBK).

Many of the objects we will be interested in are 'collections' indexed by types, in one of two senses of 'type'. In discussing such entities, the following general definitions will be useful.
• When T is any set, a T -typed collection C is a set of ordered pairs such that the second co-ordinate of each pair is a member of T . When C is a T -typed collection and τ ∈ T , we write Cτ for {x ∶ ⟨x, τ⟩ ∈ C }.
• When C is a T -typed collection, ⋃ C is the set of all first co-ordinates of members of C . When X is any set, C − X is the typed collection such that (C − X)τ = Cτ \ X for every τ ∈ T .
• A T -typed collection C is nonoverlapping if Cσ ∩ Cτ = ∅ whenever σ ≠ τ; completely overlapping if Cσ = Cτ for every σ, τ; and populated if Cσ ≠ ∅ for every σ.
• When C and D are T -typed collections, a typed function f from C to D is a function f from C to D such that f(x) always has the same second co-ordinate as x. For any τ ∈ T , we write fτ for the function from Cτ to Dτ such that f(⟨x, τ⟩) = ⟨fτ(x), τ⟩ for all x ∈ Cτ . We denote the set of all typed functions from C to D by D [C ].
• We write 'x1 ↦τ1 y1, ..., xn ↦τn yn' for the minimal typed function f such that fτi(xi) = yi for each i, i.e. {⟨⟨x1, τ1⟩, ⟨y1, τ1⟩⟩, ..., ⟨⟨xn, τn⟩, ⟨yn, τn⟩⟩}.

Often it will be convenient to work with non-overlapping typed collections, since this enables an abuse of notation where, given f ∈ D [C ], we can just write f(x) to mean 'the unique y such that for some τ ∈ T , f(⟨x, τ⟩) = ⟨y, τ⟩'.

We also define a T -family to be any function whose domain is T . When F is a T -family and τ ∈ T , we write Fτ instead of F (τ). When we use this subscript notation it should always be clear from context whether we have in mind a typed collection, a typed function, or a typed family. We could instead have defined a typed collection as a typed family of sets, and a typed function as a typed family of functions. The advantage of the above definitions is that standard set-theoretic and function-theoretic concepts can be applied directly to typed collections and functions; for example, we can take the union of two T -typed collections or the composition of two T -typed functions.
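The definitions above can be made concrete with a minimal Python sketch. It assumes that a typed collection is represented as a set of (element, type) pairs and a typed function as a dictionary between such pairs; the function names (`component`, `union_of`, `remove`, and so on) are illustrative choices, not notation from the text.

```python
# A T-typed collection is modelled as a Python set of (element, type) pairs.

def component(C, tau):
    """C_tau: the elements that C pairs with the type tau."""
    return {x for (x, t) in C if t == tau}

def union_of(C):
    """The set of all first co-ordinates of members of C."""
    return {x for (x, _) in C}

def remove(C, X):
    """C - X: the typed collection with (C - X)_tau = C_tau minus X."""
    return {(x, t) for (x, t) in C if x not in X}

def is_nonoverlapping(C):
    """True iff no element occurs at two different types."""
    seen = {}
    return all(seen.setdefault(x, t) == t for (x, t) in C)

def is_populated(C, T):
    """True iff C_tau is nonempty for every type tau in T."""
    return all(component(C, tau) for tau in T)

def is_typed_function(f, C, D):
    """f is a dict from members of C to members of D that preserves type tags."""
    return set(f) == set(C) and all(v in D and v[1] == k[1] for k, v in f.items())

# A small example with two types, "e" and "t".
T = {"e", "t"}
C = {("a", "e"), ("b", "e"), ("p", "t")}
D = {(1, "e"), (0, "t")}
f = {("a", "e"): (1, "e"), ("b", "e"): (1, "e"), ("p", "t"): (0, "t")}
```

Because collections here are ordinary sets of pairs, the union of two typed collections is just set union, which is the advantage of the pair-based definition noted above.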
A1 Relational types and relationally typed languages

In this section I give a more precise definition of the "relational" higher order languages introduced in §4. We begin by defining the set of types or syntactic categories.

Definition 1.1. R, the set of relational types, is the smallest set containing the letter 'e' such that, for any n ≥ 0, if τ1, ..., τn belong to the set, the n-tuple ⟨τ1, ..., τn⟩ does too. We call R-typed collections relationally typed collections.

To define our languages we suppose we are given once and for all a certain R-typed collection VarR of variables, with infinitely many members in each type. It doesn't matter what we take variables to be, so long as no variable is a string with multiple other variables as constituents. Let a relational signature be any non-overlapping R-typed collection Σ such that no element of ⋃ Σ is a variable or a string containing multiple variables or elements of ⋃ Σ. For each relational signature Σ we define two languages, K Σ, the λK-language of Σ, and I Σ, the λI-language of Σ.

Definition 1.2. For any relational signature Σ, I Σ, the λI-language of Σ, is a function that maps each finite, non-overlapping V ⊆ VarR to a typed collection I Σ(V ) (the 'terms of I Σ whose free variables are exactly V '), minimally satisfying the following conditions:87
(i) c ∈ I Σ(∅)τ whenever c ∈ Στ .
(ii) v ∈ I Σ({⟨v, τ⟩})τ whenever v ∈ Varτ .
(iii) A(B1, ..., Bn) ∈ I Σ(V 0 ∪ V 1 ∪ ⋯ ∪ V n)⟨⟩ whenever A ∈ I Σ(V 0)⟨τ1,...,τn⟩, B1 ∈ I Σ(V 1)τ1 , ..., Bn ∈ I Σ(V n)τn , and V 0 ∪ V 1 ∪ ⋯ ∪ V n is nonoverlapping.88
(iv) (λv1...vn.φ) ∈ I Σ(V − {v1, ..., vn})⟨τ1,...,τn⟩ whenever v1 ∈ Vτ1 , ..., and vn ∈ Vτn (n > 0) are any distinct variables, and φ ∈ I Σ(V )⟨⟩.
The definition of K Σ, the λK-language of Σ, is the same except for clause (iv), which is modified as follows to allow for vacuous λ-abstracts: (iv′) (λv1...vn.φ) ∈ K Σ(V − {v1, ..., vn})⟨τ1,...,τn⟩ whenever v1 ∈ Varτ1 , ..., and vn ∈ Varτn (n > 0) are any distinct variables, and φ ∈ K Σ(V )⟨⟩. A trivial induction shows that if A ∈ K Σ(V )σ and A ∈ K Σ(V ′)τ , ⋃ V = ⋃ V ′: we call this the set of free variables of A, F V (A). If we take Var to be non-overlapping, it follows that K Σ(V )σ is disjoint from K Σ(V ′)τ for any V ′ ≠ V ; moreover, another straightforward induction shows that K Σ(V )σ is disjoint from K Σ(V )τ whenever σ ≠ τ.89 When L is I Σ or K Σ, a term of L is a member of L(V )τ for some V and τ; wff(L) is the typed collection such that A ∈ wff(L)τ iff A ∈ L(V )τ for some V . A pure term of L is a member of I ∅(V )τ or K ∅(V )τ for some V and τ; a closed term of L is a member of L(∅)τ for some τ; a formula of L is a member of L(V )⟨⟩ for some V . 87'Minimally' means that for any f satisfying the conditions, I Σ(V ) ⊆ f(V ) for every finite, non-overlapping V ⊆ Var. 88The requirement that V 0 ∪ V 1 ∪ ⋯ ∪ V n is non-overlapping guarantees that I Σ(V ) will always be empty unless V is non-overlapping, which ensures that strings like x(x) do not belong to I Σ(V )τ for any V and τ. 89Both conditions can fail if Var is overlapping. For example, if y ∈ Varσ ∩Varτ and x ∈ Var⟨σ⟩ ∩Var⟨τ⟩, λxy.x(y) is in both I Σ(∅)⟨⟨σ⟩,σ⟩ and I Σ(∅)⟨⟨τ⟩,τ⟩, while x(y) is in both I Σ({⟨x, ⟨σ⟩⟩, ⟨y, σ⟩})⟨⟩ and I Σ({⟨x, ⟨τ⟩⟩, ⟨y, τ⟩})⟨⟩. The approach where Var is completely overlapping is called "Curry-style" typing, while the approach where Var is non-overlapping is "Churchstyle": see HS, chapters 10 and 12. 
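The difference between clauses (iv) and (iv′) can be illustrated with a small checker for relational terms. This is a sketch under stated assumptions: terms are nested Python tuples tagged `"const"`, `"var"`, `"app"` and `"lam"`; the type of formulae is the empty tuple `()`; and variables are typed Church-style by a dictionary `var_types`, so that each variable has exactly one type. None of these representation choices comes from the text.

```python
def check(term, sig, var_types, lambda_i=False):
    """Return (type, free variables) of a relational term, or raise TypeError."""
    kind = term[0]
    if kind == "const":                       # clause (i)
        return sig[term[1]], frozenset()
    if kind == "var":                         # clause (ii)
        return var_types[term[1]], frozenset([term[1]])
    if kind == "app":                         # clause (iii): A(B1, ..., Bn)
        _, head, args = term
        head_type, free = check(head, sig, var_types, lambda_i)
        checked = [check(arg, sig, var_types, lambda_i) for arg in args]
        if tuple(t for t, _ in checked) != head_type:
            raise TypeError("argument types do not match the predicate's type")
        for _, arg_free in checked:
            free = free | arg_free
        return (), free
    if kind == "lam":                         # clause (iv) or (iv')
        _, bound, body = term
        body_type, free = check(body, sig, var_types, lambda_i)
        if body_type != ():
            raise TypeError("only formulae can be abstracted over")
        if not bound or len(set(bound)) != len(bound):
            raise TypeError("need one or more distinct bound variables")
        if lambda_i and not set(bound) <= free:
            raise TypeError("vacuous abstraction is excluded from the lambda-I language")
        return tuple(var_types[v] for v in bound), free - set(bound)
    raise ValueError(f"unknown term constructor: {kind!r}")

# Hypothetical signature and variables, for illustration only.
sig = {"Loves": ("e", "e")}
var_types = {"x": "e", "y": "e", "p": ()}
loves_xy = ("app", ("const", "Loves"), [("var", "x"), ("var", "y")])
relation = ("lam", ["x", "y"], loves_xy)      # non-vacuous: fine in both languages
vacuous = ("lam", ["x"], ("var", "p"))        # vacuous: only a lambda-K term
```

With `lambda_i=True` the checker rejects `vacuous`, since `x` is not free in the body; with the default setting it accepts the term at type `("e",)`.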
A2 Functional types and functionally typed languages

Most work in higher-order logic uses functional languages, in which each complex term has exactly two terms as immediate constituents, rather than relational languages, in which a complex term can have any number of immediate constituents. Relational higher-order languages are arguably more metaphysically perspicuous: by treating all the arguments of a polyadic predication as syntactically on a par, they avoid introducing a kind of asymmetry in the notation that does not intuitively correspond to any asymmetry in the metaphysics. Moreover, they have the major advantage that they can be straightforwardly extended to allow for predicates of transfinite adicity.90 But functional languages are more readable and more convenient for proving things about. In this section I will introduce a certain family of functional languages, and explain the sense in which they are equivalent to the relational languages from §A1.

Definition 2.1. F , the set of functional types, is the smallest set containing the letters 'e' (the type of objects) and 't' (the propositional type: think 'truth evaluable', not 'truth value'), such that whenever σ and τ belong to it and τ ≠ e, the ordered pair ⟨σ, τ⟩, which we write as (σ → τ), belongs to it. A terminal type is a functional type other than e; a complex type is a functional type other than e or t; a basic type is e or t.

Following a standard convention, when talking about functional types we will omit parentheses; they are to be restored from the right, so for example ρ → σ → τ abbreviates (ρ → (σ → τ)). I will use σ, ρ to range over all functional types, while τ always stands for terminal types.

We inductively define functions ⋅F from R to F , and ⋅R from F to R, as follows:
eF = e
⟨⟩F = t
⟨τ0, ..., τn⟩F = (τ0F → ⟨τ1, ..., τn⟩F ) (n ≥ 0)
eR = e
tR = ⟨⟩
(σ → τ)R = ⟨σR ⟩⌢τR
where ⌢ is concatenation of tuples. It is easy to show that ⋅F and ⋅R are bijections and mutual inverses.
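The two translations between relational and functional types can be sketched in a few lines of Python. The sketch assumes that relational types are represented as `"e"` or nested tuples (with `()` for the type of formulae), and functional types as `"e"`, `"t"`, or triples `("->", sigma, tau)`; the names `to_functional` and `to_relational` are illustrative.

```python
def to_functional(rel):
    """Curry a relational type: <> becomes t, and <t0, ..., tn> becomes t0 -> <t1, ..., tn>."""
    if rel == "e":
        return "e"
    if rel == ():
        return "t"
    head, rest = rel[0], rel[1:]
    return ("->", to_functional(head), to_functional(rest))

def to_relational(fun):
    """Uncurry a functional type: t becomes <>, and sigma -> tau prepends sigma to tau's tuple."""
    if fun == "e":
        return "e"
    if fun == "t":
        return ()
    _, sigma, tau = fun
    # tau is terminal (never "e"), so its translation is always a tuple.
    return (to_relational(sigma),) + to_relational(tau)

# A dyadic relation between individuals, and a first-order quantifier type.
dyadic = ("e", "e")
quantifier = (("e",),)
```

On these examples, `to_functional(dyadic)` is the curried type e → e → t, and translating back with `to_relational` recovers the original tuple, as the claim that the two functions are mutual inverses requires.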
Given these functions, we can turn any relationally typed collection C into a functionally typed collection C F = {⟨x, τF ⟩ ∶ ⟨x, τ⟩ ∈ C }, and similarly we can turn a functionally typed collection D into a relationally typed collection D R . 90Thanks to Peter Fritz for pointing this out. 65 Note that it is crucial to this result that F does not contain any types of the form τ → e: if we had allowed for these, we would have something richer than R. We can of course introduce function symbols into the language using Russellstyle contextual definition; this will turn quantification into function-symbol position into a notational variant of quantification into dyadic predicate position restricted by functionality. But unless we favour a very coarse-grained logic in the original language, the logical behaviour of the stipulatively extended language will likely be disunified in a way that makes it worse for the purposes of stating general schemas than the original. This is especially clear if we reject Plenitude (see §6), since in that case we will think that, for example, quantification into type t → t works quite differently from quantification into type t → t → t restricted by functionality. To define functional languages, we take VarF to be VarRF . A functionally typed signature is any nonoverlapping functionally typed collection Σ that does not contain any variables, or strings consisting of multiple variables or elements of Σ. Definition 2.2 (Functional languages). For any functionally typed signature Σ, I Σ, the λI-language of Σ, is a function that maps each finite nonoverlapping V ⊆ VarF to a functionally typed collection I Σ(V ) (the 'terms of I Σ whose free variables are exactly V '), minimally satisfying the following conditions: (i) c ∈ I Σ(∅)σ whenever c ∈ Σσ . (ii) v ∈ I Σ({⟨v, σ⟩})σ whenever v ∈ Varσ . (iii) (AB) ∈ I Σ(V ∪ V ′)τ whenever A ∈ I Σ(V )σ→τ , B ∈ I Σ(V ′)σ , and V ∪ V ′ is nonoverlapping. 
(iv) (λv.A) ∈ I Σ(V − {v})σ→τ whenever v ∈ Vσ and A ∈ I Σ(V )τ . The definition of K Σ, the λK-language of Σ, is the same except that clause (iv) is less restrictive: (iv′) (λv.A) ∈ K Σ(V − {v})σ→τ whenever v ∈ Varσ and A ∈ K Σ(V )τ . It follows that I Σ(V )τ ⊆ K Σ(V )τ for every V and τ. The concepts of a term, a closed term, a pure term, a formula, and of a formula's set of free variables apply to functional languages just as they did to relational languages. And it is still true that if VarF is non-overlapping, K Σ(V )σ is disjoint from K Σ(V ′)ρ whenever V ′ ≠ V or σ ≠ ρ.91 91The present notation has the following annoying ambiguity: the empty set is officially both a functional and a relational signature, so the expressions 'I ∅' and 'K ∅' are ambiguous between the relational and functional languages with no constants. It won't be worth our while to try to resolve this. 66 We may omit parentheses in writing terms of functional languages; by contrast with the convention for types, they are to be restored from the left, so that ABC abbreviates ((AB)C). Also, when it is understood that terms A, B, and C are respectively of types ρ → σ → τ, ρ, and σ, I will sometimes use infix notation, writing "B A C" for ((AB)C): for example, if A is a constant ∧ of type t → t → t. Having defined the four classes of languages, we now show how to translate between them. Here we assume that VarR is nonoverlapping so that every term has a unique type. First we define a function ⋅F mapping the set of K Σ-terms for each relational signature Σ to the set of K (ΣF )-terms, as follows: Definition 2.3. When Σ is a relational signature and A is a term of K Σ, AF is given by the following inductive definition: (i) aF = a when a is a constant or variable. (ii) A(B1, B2, ..., Bn)F = (AF B1F B2F ...BnF ) (iii) (λv1v2...vn.A)F = (λv1.λv2...λvn.AF ) It is straightforward to verify that ⋅F maps K Σ(V )σ to K Σ F (V F )σF and I Σ(V )τ to I ΣF (V )τF . 
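Definition 2.3 amounts to currying, as the following Python sketch shows. It assumes relational terms are nested tuples of the form `("const", name)`, `("var", name)`, `("app", head, [args])` and `("lam", [vars], body)`, and that functional terms use binary `("app", f, a)` and single-variable `("lam", v, body)`; this representation and the name `term_to_functional` are assumptions made for illustration.

```python
def term_to_functional(term):
    """Translate a relational term into a functional one by currying."""
    kind = term[0]
    if kind in ("const", "var"):              # clause (i): atoms are unchanged
        return term
    if kind == "app":                         # clause (ii): A(B1, ..., Bn) becomes ((A B1) ... Bn)
        _, head, args = term
        result = term_to_functional(head)
        for arg in args:
            result = ("app", result, term_to_functional(arg))
        return result
    if kind == "lam":                         # clause (iii): one binder per bound variable
        _, bound, body = term
        result = term_to_functional(body)
        for v in reversed(bound):
            result = ("lam", v, result)
        return result
    raise ValueError(f"unknown term constructor: {kind!r}")

# Loves(x, y) and its abstract, with a hypothetical constant "Loves".
loves_xy = ("app", ("const", "Loves"), [("var", "x"), ("var", "y")])
relation = ("lam", ["x", "y"], loves_xy)
```

Application nests to the left and abstraction to the right, matching the parenthesisation conventions for functional terms and types respectively.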
The reverse translation function, from a functional language K Σ to a relational language K ΣR , is slightly more involved:

Definition 2.4. When Σ is a functional signature and A is a term of K Σ, AR is given by the following inductive definition:
(i) aR = a when a is a constant or variable.
(ii) When A ∈ K Σσ0→⋯→σn→t and B ∈ K Σσ0 (for n ≥ 0), (AB)R = (λv1...vn.AR(BR , v1, ..., vn)) where v1, ..., vn are distinct variables of types σ1R , ..., σnR respectively, chosen according to some arbitrary order from among the variables not free in AR or BR .
(iii) When v0 ∈ Var and A ∈ K Σσ1→⋯→σn→t (for n ≥ 0), (λv0.A)R = (λv0v1...vn.AR(v1, ..., vn)) where v1, ..., vn are distinct variables of types σ1R , ..., σnR respectively, chosen according to some arbitrary order from among the variables not free in AR and not identical to v0.

(If we didn't want to take VarR to be non-overlapping, the notion of translation would need to be relativised to a type σ and a non-overlapping typed collection of variables V , mapping K Σ(V )σ to K Σ F (V F )σF and vice-versa.)

Unlike the mappings between relational and functional types, the two translation functions on formulae are not mutual inverses. However, the result of translating a term back and forth always stands in an intimate syntactic relation to that term, namely that of βη-equivalence. The next order of business is to explain this.

Definition 2.5 (Capture-free simultaneous substitution). Suppose that for a relational or functional signature Σ, π is a function mapping a set of variables and constants to a set of terms, such that whenever π(a) is defined and a ∈ Varσ ∪ Σσ , π(a) is in wff(K Σ)σ . We define the capture-free simultaneous substitution by π of a term A, [π]A, as follows.
[π]a = π(a) when π(a) is defined
[π]a = a when π(a) is undefined
[π](BC) = ([π]B[π]C)
[π](B(C1, ..., Cn)) = ([π]B)([π]C1, ..., [π]Cn)
[π](λv1...vn.B) = (λu1...un.[π[ui/vi]]B)
where for each i, ui = vi if vi is not free in π(a) for any constant or free variable a in λv1...vn.B, and otherwise ui is a variable of the same type as vi which is not identical to any of v1...vn or u1...ui−1, not free in B, and not free in π(a) for any constant or free variable a in λv1...vn.B, chosen according to some arbitrary ordering; and π[ui/vi] is the substitution function like π except that it maps each vi to ui. We write [B/v]A for [{⟨v, B⟩}]A, and [Bi/vi]A for [{⟨v1, B1⟩, ..., ⟨vn, Bn⟩}]A.

This gives us what we need to define the notions of ∗-reduction, ∗-equivalence, and ∗-normal forms, where '∗' may be 'α', 'β', 'η', 'αβ', 'αη', 'βη', or 'αβη'. I will first give the definitions for functional languages, followed by those for relational languages in square brackets when they are different.

Definition 2.6. A immediately α-reduces to B iff A is an abstract λv.C and for some u not free in C , B is λu.[u/v]C . [A is an abstract λv1...vn.φ, and for some variables u1...un each of which is either identical to vi or not free in φ, B is λu1...un.[ui/vi]φ.]92

A immediately β-reduces to B iff A is a "β-redex" (a term of the form ((λv.C)D)) and B is [D/v]C . [A is a term of the form (λv1...vn.φ)(D1, ..., Dn) and B is [Di/vi]φ.]

A immediately η-reduces to B iff A is an "η-redex" (a term of the form (λv.(Cv)) where v is not free in C ) and B is C . [A is a term of the form λv1...vn.C(v1, ..., vn) where none of v1, ..., vn is free in C , and B is C .]

A immediately ∗-reduces to B iff either α is in ∗ and A immediately α-reduces to B, or β is in ∗ and A immediately β-reduces to B, or η is in ∗ and A immediately η-reduces to B.

A one-step ∗-reduces to B iff B results from A by replacing one constituent with something to which it immediately ∗-reduces.
That is: iff there are two finite sequences ⟨A1, ..., An⟩, ⟨B1, ..., Bn⟩ such that A1 immediately ∗-reduces to B1, An = A, Bn = B, and whenever 0 < i < n, either for some C , Ai+1 is AiC and Bi+1 is BiC or Ai+1 is CAi and Bi+1 is CBi, or else for some v, Ai+1 is λv.Ai and Bi+1 is λv.Bi. [...either for some C0...Cm, Ai+1 is Ai(C0, ..., Cm) and Bi+1 is Bi(C0, ..., Cm) or for some 0 < k ≤ m, Ai+1 is C0(C1, ..., Ck, Ai, Ck+1, ..., Cm) and Bi+1 is C0(C1, ..., Ck, Bi, Ck+1, ..., Cm), or else for some v1, ..., vm, Ai+1 is λv1...vm.Ai and Bi+1 is λv1...vm.Bi.]

A is in ∗-normal form iff everything to which it one-step ∗-reduces is something to which it one-step α-reduces.

A is one-step ∗-equivalent to B iff either A one-step ∗-reduces to B or B one-step ∗-reduces to A.

A ∗-reduces to B iff there is a finite sequence of terms ⟨C1, ..., Cn⟩ such that A = C1 and B = Cn and whenever 0 < i < n, Ci one-step ∗-reduces to Ci+1.

A is ∗-equivalent to B (for short, A ≈∗ B) iff there is a finite sequence of terms ⟨C1, ..., Cn⟩ such that A = C1 and B = Cn and whenever 0 < i < n, Ci is one-step ∗-equivalent to Ci+1.

The following consequences of these definitions will be significant for us. We state them for functional languages; the extensions to relational languages are straightforward.

Proposition 2.7. If A ≈αβη B, then A ≈βη B.

Proof. If A immediately α-reduces to B, there is a term C and variables u, v, with u not free in A, such that A is λv.C and B is λu.[u/v]C; but then λu.(λv.C)u immediately η-reduces to A and one-step β-reduces to B, hence A ≈βη B. It follows that A ≈βη B whenever A one-step α-reduces to B, and hence whenever A ≈αβη B.

92 When A immediately α-reduces to B, B also immediately α-reduces to A, since if u is not free in C , v is not free in [u/v]C and C = [v/u][u/v]C .

Proposition 2.8. If A ∗-reduces to B and A ∈ K Σ(V )σ , then B ∈ K Σ(V ′)σ for some V ′ ⊆ V . Moreover, if A ∈ I Σ(V )σ , B ∈ I Σ(V )σ .

Proof.
By an induction using the definition of substitution, we see that if C ∈ K Σ(V )σ and D ∈ K Σ(V ′)ρ, [C/v]D belongs to K Σ(V ∪ (V ′ − {v}))ρ when v ∈ V ′, and K Σ(V ′ \ {v})ρ otherwise. Since (λv.D)C also belongs to K Σ(V ∪ (V ′ − {v}))ρ, it follows that whenever A ∈ K Σ(V )σ immediately β-reduces to B ∈ K Σ(V ′)σ , V ′ ⊆ V , with V = V ′ in the case where A is of the form (λv.D)C where v has a free occurrence in D: this must be the case if A ∈ I Σ(V )σ . It is also easy to show that if A immediately αη-reduces to B and A ∈ K Σ(V )σ , B ∈ K Σ(V )σ . Two straightforward inductions then establish the result, first for ∗-reduction in one step, and then in general.

Corollary 2.9. If A ≈∗ B and A ∈ I Σ(V )σ , B ∈ I Σ(V )σ .

Proposition 2.10. AC ≈∗ BD whenever A ≈∗ B and C ≈∗ D, and λv.A ≈∗ λv.B whenever A ≈∗ B.

Proof. If A ≈∗ B and C ≈∗ D, then there exist sequences ⟨A1, ..., An⟩, ⟨C1, ..., Cm⟩ such that A = A1, B = An, C = C1, and D = Cm, where each Ai is one-step ∗-equivalent to Ai+1 and each Ci is one-step ∗-equivalent to Ci+1. Then ⟨A1C1, A2C1, ..., AnC1, AnC2, ..., AnCm⟩ witnesses the ∗-equivalence of AC and BD. Similarly, if ⟨A1, ..., An⟩ witnesses the ∗-equivalence of A and B, ⟨λv.A1, ..., λv.An⟩ witnesses the ∗-equivalence of λv.A and λv.B: by Corollary 2.9 these are all well-formed so long as λv.A is.

For proofs of the following theorems, which will be important later, see HS appendices A–C.

Proposition 2.11 (Substitution). If A ≈α∗ B and for each i, Ci ≈α∗ Di, then [Ci/vi]A ≈α∗ [Di/vi]B.93

Proposition 2.12 (Strong normalisation). Every term of K Σ ∗-reduces to at least one term in ∗-normal form.

Proposition 2.13 (Church-Rosser). If A ∗-reduces to both B and C , then for some B′, C′, B ∗-reduces to B′, and C ∗-reduces to C′, and B′ is α-equivalent to C′.

Corollary 2.14. If A ∗-reduces to both B and C , and B and C are both in ∗-normal form, then B is α-equivalent to C .

We can now precisely characterise the way in which our two translation functions are in harmony.

93 This holds for α∗-reduction as well as α∗-equivalence.
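Before stating that result, the reduction notions of Definition 2.6 can be made concrete with a small normaliser for functional terms. The following is a minimal sketch, not the formal development: it ignores types, handles only single-variable substitution rather than the simultaneous substitution of Definition 2.5, and always contracts the leftmost-outermost redex. Terms are assumed to be tuples `("var", v)`, `("const", c)`, `("app", f, a)` and `("lam", v, body)`; fresh variables are named `v0`, `v1`, and so on, and a step limit guards against untyped terms that have no normal form.

```python
import itertools

def free_vars(t):
    kind = t[0]
    if kind == "var":
        return {t[1]}
    if kind == "const":
        return set()
    if kind == "app":
        return free_vars(t[1]) | free_vars(t[2])
    return free_vars(t[2]) - {t[1]}           # ("lam", v, body)

def fresh(avoid):
    """First name of the form v0, v1, ... that is not in `avoid`."""
    return next(f"v{i}" for i in itertools.count() if f"v{i}" not in avoid)

def subst(t, v, s):
    """[s/v]t, renaming bound variables where needed to avoid capture."""
    kind = t[0]
    if kind == "var":
        return s if t[1] == v else t
    if kind == "const":
        return t
    if kind == "app":
        return ("app", subst(t[1], v, s), subst(t[2], v, s))
    _, u, body = t
    if u == v:
        return t                              # v is rebound here; nothing to replace
    if u in free_vars(s) and v in free_vars(body):
        w = fresh(free_vars(s) | free_vars(body) | {v})
        body, u = subst(body, u, ("var", w)), w
    return ("lam", u, subst(body, v, s))

def step(t):
    """Contract one beta- or eta-redex, or return None if there is none."""
    kind = t[0]
    if kind == "app":
        f, a = t[1], t[2]
        if f[0] == "lam":                     # beta-redex ((lam v. C) D)
            return subst(f[2], f[1], a)
        r = step(f)
        if r is not None:
            return ("app", r, a)
        r = step(a)
        if r is not None:
            return ("app", f, r)
    elif kind == "lam":
        v, body = t[1], t[2]
        if body[0] == "app" and body[2] == ("var", v) and v not in free_vars(body[1]):
            return body[1]                    # eta-redex (lam v. C v), v not free in C
        r = step(body)
        if r is not None:
            return ("lam", v, r)
    return None

def normalise(t, limit=1000):
    """Reduce until no beta- or eta-redex remains (up to `limit` steps)."""
    for _ in range(limit):
        r = step(t)
        if r is None:
            return t
        t = r
    raise RuntimeError("no normal form found within the step limit")

K = ("lam", "x", ("lam", "y", ("var", "x")))
a, b = ("const", "a"), ("const", "b")
```

For instance, `normalise(("app", ("app", K, a), b))` contracts two β-redexes and returns `a`, while applying `K` to the free variable `y` forces the bound `y` to be renamed, which is the capture-avoidance that Definition 2.5 builds into substitution.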
70 Proposition 2.15. When Σ is a functional signature, ARF ≈η A for any term A of the functional language K Σ. Proof. By induction on the complexity of A. If A is a variable or constant, ARF is identical to A and thus trivially η-equivalent to A. If A is BC , then for some zero or more distinct variables v1, ..., vn not free in B or C , AR F = (λv1...vn.BR(CR , v1, ... , vn)) F = λv1...λvn.BR F CRF v1 ... vn. This η-reduces in n steps to BR F CRF , which is η-equivalent to BC by Proposition 2.10 and the induction hypothesis. Finally if A is λv0.B, then for some zero or more distinct variables v1, ..., vn not free in B, AR F = (λv0v1...vn.BR(v1, ..., vn)) F = λv0.λv1...λvn.BR F v1...vn. This η-reduces in n steps to λv0.BR F , which is η-equivalent to λv0.B by Proposition 2.10 and the induction hypothesis. For the analogous result in the other direction, we first need a couple of lemmas. Lemma 2.16. Whenever A0 is a functional term of type σ1 → ⋯ → σm → t and for some positive n ≤ m A1, ..., An are terms of types σ1, ..., σn, (A0...An)R ≈β λvn+1...vm.A0R(A1R , ..., AnR , vn+1, ..., vm) where vn+1...vm are zero or more distinct variables, of types σn+1...σm respectively, not free in A0...An. Proof. By induction on n. Base case: n = 1. (A0A1)R is λv2...vm.A0R(A1R , v2, ..., vm) for some appropriate v2...vm by definition. Induction step: by definition, for some appropriate vn+2...vm, (A0...An+1)R = λvn+2...vm.(A0...An)R(An+1R , vn+2, ..., vm), which by the induction hypothesis is βη-equivalent to λvn+2...vm.(λun+1...um.A0R(A1R , ..., AnR , un+1, ..., um))(An+1R , vn+2, ..., vm) for some appropriate un+1...um. This β-reduces in one step to λvn+2...vm.A0R(A1R , ..., An+1R , vn+2, ..., vm) Lemma 2.17. Whenever v1...vn (n>0) are distinct variables of types σ1...σn and A is a functional term of type σn+1 → ⋯ → σm → t (wherem ≥ n), (λv1...λvn.A)R ≈αβ λv1...vm.AR(vn+1, ..., vm) where vn+1...vm are any (zero or more) variables of types σn+1...σm, not free in A. Proof. 
By induction on n. Base case: n = 1; then by definition, (λv1.A)R is λv1...vm.AR(v1, ..., vm) for appropriate v2, ..., vm. Induction step: by definition, (λv1...λvn+1.A)R is λv1u2...um.(λv2...λvn+1.A)R(u2, ..., um) for some appropriate u2...um, which by the induction hypothesis is β-equivalent to λv1u2...um.(λv2...vm.AR(vn+2...vm))(u2, ..., um) 71 This β-reduces in one step to λv1u2...um.([ui/vi]AR(un+2...um)), which is α-equivalent to λv1...vm.(AR(vn+2...vm)). Proposition 2.18. When Σ is a relational signature, AF R ≈αβ A for any term A of the relational language K Σ. Proof. By induction on the complexity of A. When A is a variable or constant, AF R = A. When A is B(C1, ..., Cn), AF is BF C1F ...CnF , so AF R is (BF C1F ...CnF ) R , which is β-equivalent to BF R(C1F R , ..., CnF R) by the n = m case of Lemma 2.16. When A is an abstract λv1...vn.B, AF is λv1...λvn.BF , so AF R is λv1...λvn.BF R , which is αβequivalent to λv1...vn.BF R by the n = m case of Lemma 2.17. The upshot of Propositions 2.15 and 2.18 is that so long as our logic licenses treating βη-equivalent terms as interchangeable, the choice whether to theorise in a relational or a functional language is just a matter of taste. Moreover, when we are studying properties of languages that do not distinguish βη-equivalent terms, the choice whether to theorise about a relational or a functional language is also a matter of taste, since every definition and result about the one will have an analogue for the other. In the rest of the appendix, we will be theorising about functional languages. A3 Structures and models We now turn from syntax to semantics. Our models for a given language L = I Σ or K Σ will consist of an L -structure-a domain for each type together with an interpretation of the language on those domains-together with a valuationwhich assigns truth values to elements of the propositional domain. 
(The material in this section is largely drawn from BBK: my "K Σ-structures" are, roughly, their "η-functional Σ-evaluations".) Definition 3.1. When D is a typed collection, an assignment for D is any typed function from some nonoverlapping typed collection of variables V to D . Two assignments for D are compatible iff they agree on the intersection of their domains and the union of their domains is nonoverlapping. Definition 3.2. Given a functional language L = I Σ or K Σ, an L -structure is a pair S = ⟨D, ⟦⋅⟧⟩ where D is a typed collection and ⟦⋅⟧ is a function which maps every assignment function g for D to a typed function ⟦⋅⟧g from L(dom g) to D , subject to the following conditions: (i) ⟦v⟧v↦σ xσ = x. (Equivalently: ⟦v⟧ g σ = gσ(v) whenever defined.) 72 (ii) If ⟦A⟧g1σ→τ = ⟦C⟧ g2 σ→τ and ⟦B⟧ g3 σ = ⟦D⟧ g4 σ , then ⟦AB⟧ g1∪g2 τ = ⟦CD⟧ g3∪g4 τ if g1 and g2 are compatible and g3 and g4 are compatible. (iii) ⟦A⟧gσ = ⟦B⟧hσ whenever A ≈βη B, g and h are compatible, and both sides are defined.94 When A is closed, we write ⟦A⟧σ instead of ⟦A⟧∅σ . For convenience, I will generally assume when discussing L -structures that D and Var are both nonoverlapping, and that ⋃ D is disjoint from ⋃wff(L). I use boldface variables x, y, z... when talking about the elements of ⋃ D . This enables a useful convention where, for example, given x ∈ Dρ→σ→τ , y ∈ Dρ and z ∈ Dσ , we can write ⟦xyz⟧ instead of ⟦uvw⟧[u↦x,v↦y,w↦z]τ , or ⟦xyz⟧g instead of ⟦uyw⟧g∪[u↦x,w↦z]τ . In general, occurrences of boldfaced symbols in expressions inside ⟦⋅⟧ function as abbreviations of variables not otherwise in use, assigned to the relevant element of the domain.95 (Note that the definition of an L -structure did not require the domain D to be populated. When Dσ is empty and Vσ is nonempty, there are no assignment functions from V to D . In that case, when A ∈ L(V )ρ, there will be no g for which ⟦A⟧ g ρ is defined. 
But this does not stop ⟦⋅⟧ from being non-trivial for terms with bound variables of type σ. If we had taken the more customary approach of assigning denotations to terms relative to variable assignments defined on the entirety of Var, we could not have allowed for nontrivial L -structures without populated domains.)

We can have I Σ- and K Σ-structures for arbitrary signatures Σ, including ∅. However, to make a model, we will need an I Σ-structure or K Σ-structure where Σ contains the logical constants. We will call such signatures "logical".

Definition 3.3. Log is the functional signature containing just the following constants: ¬ ∈ Logt→t, ∧ ∈ Logt→t→t, ∨ ∈ Logt→t→t, and for every type σ, ∀σ ∈ Log(σ→t)→t and ∃σ ∈ Log(σ→t)→t. A functional signature Σ is logical iff Log ⊆ Σ. A functional language L is logical if it is I Σ or K Σ for some logical signature Σ.

94 In an I -language, βη-equivalent terms must have the same free variables (Proposition 2.8), so if ⟦A⟧gσ and ⟦B⟧hσ are both defined, g and h must have the same domain, and so are identical if compatible.

95 Another way to look at this convention is that, so long as D is nonoverlapping and disjoint from Σ and Var, any I Σ-structure or K Σ-structure ⟨D, ⟦⋅⟧⟩ can be naturally extended to an I Σ∪D -structure or K Σ∪D -structure ⟨D, [⋅]⟩, by the stipulation that [A]gσ = ⟦A⟧gσ whenever ⟦A⟧gσ is defined, and [x]σ = x for any x in Dσ . (Proposition 5.4 below shows that this uniquely determines [⋅].) Going further in this direction, we could modify the definition of an L -structure so that ⟦⋅⟧ only interprets closed terms of I Σ∪D or K Σ∪D , and define the assignment-relative notion of denotation for open terms by treating the assignment function as specifying a substitution function that replaces free variables with constants from D .
In writing terms in logical languages we use the following metalinguistic abbreviations:
→ =df λpt.λqt.¬p ∨ q
↔ =df λpt.λqt.(p → q) ∧ (q → p)
≡τ =df λxτ .λyτ .∀τ→t(λzτ→t.zx ↔ zy)
≢τ =df λxτ .λyτ .¬(x ≡τ y)
We also write ∀vσ(φ) for ∀σ(λvσ .φ) and ∃vσ(φ) for ∃σ(λvσ .φ).

A model, then, will be the result of supplementing a logical L -structure with a valuation, which assigns truth values to elements of the propositional domain in accordance with certain constraints.

Definition 3.4. When L is logical, an L -model is a triple ⟨D, ⟦⋅⟧, |⋅|⟩ where ⟨D, ⟦⋅⟧⟩ is an L -structure and |⋅| is a function Dt → {0, 1} such that
(i) For any p ∈ Dt, |⟦¬p⟧| = 1 − |p|.
(ii) For any p,q ∈ Dt, |⟦p ∨ q⟧| = max{|p|, |q|}.
(iii) For any p,q ∈ Dt, |⟦p ∧ q⟧| = min{|p|, |q|}.
(iv) For any x ∈ Dσ→t, |⟦∃σx⟧| = max{|⟦xy⟧| ∶ y ∈ Dσ}.
(v) For any x ∈ Dσ→t, |⟦∀σx⟧| = min{|⟦xy⟧| ∶ y ∈ Dσ}.

We can generalise |⋅| to elements of Dτ for all complex types τ by defining |x|, for any x ∈ Dσ→τ , to be the function with domain Dσ such that, for any y ∈ Dσ , |x|(y) = |⟦xy⟧|. We call |x| the extension of x.

As usual, validity is defined in terms of truth (value 1) in a model:

Definition 3.5. A formula φ ∈ L(V )t is valid on a class of models C iff for every model ⟨D, ⟦⋅⟧, |⋅|⟩ ∈ C and g ∈ D [V ], |⟦φ⟧g| = 1. Where Γ and Δ are any sets of L -formulae, the sequent Γ ⇒ Δ is valid on a class of models C iff there is no model ⟨D, ⟦⋅⟧, |⋅|⟩ ∈ C and function f mapping every φ ∈ Γ ∪ Δ to an assignment function f(φ) such that φ ∈ L(dom f(φ))t, such that any two of these assignment functions are compatible, |⟦φ⟧f(φ)| = 1 for every φ ∈ Γ, and |⟦φ⟧f(φ)| = 0 for every φ ∈ Δ.

The central reason to care about models in the sense of Definition 3.4 comes from the soundness and completeness theorems proved by BBK, which imply that the standard classical rules of (multi-sorted) quantification theory, supplemented by βη-conversion, are sound and complete for the class of all populated models (i.e. models with populated domains).
As usual with such results, there is flexibility as regards exactly what proof system we have in mind; we will have no need to be more specific here. BBK focus on a natural deduction calculus NKβη, whose rules are the standard classical natural deduction rules for ¬, ∨, and ∀ (they treat ∧ and ∃ as defined), together with rules allowing one to derive the sequent Γ ⇒ {ψ} from Γ ⇒ {φ} whenever ψ is β-equivalent or η-equivalent to φ. In this setting, the result is that a sequent of the form Γ ⇒ {φ} is valid in the class of all populated models iff there is a derivation of it in NKβη.96 Although their result concerns only K -models, the proof goes through with essentially no modification for I -models.97 Moreover, some of the proof procedures that are sound and complete for the class of all populated models need only slight amendment to be sound and complete for the class of all models.98

For those of us whose interest in the model theory is driven by metaphysics, the primary importance of the soundness theorem comes from the fact that, by finding a mathematical proof that there is a model of a certain proposed theory stated in a higher-order language, we can assure ourselves that we will not end up with a contradiction if we endorse that theory and reason in accordance with the classical inference rules and βη-conversion. The importance of the completeness theorem, meanwhile, comes from the guarantee it provides that, in searching for models (in the sense of Definition 3.4) of a theory we like, we are not unnecessarily limiting our search space in a way that would deprive us of the assurance that the discovery of a model would provide. Of course, this kind of assurance is only worth so much, since there are many ways of objecting to theories that don't require deriving contradictions from them using the classical rules and βη-conversion. But investigating the class of models in which a theory is true can also be helpful in other ways when assessing its credibility.
For example, since most of us are better at informal reasoning about mathematics than at producing valid arguments in formal languages, exploring models can help us to find deductive consequences of the theory (which may turn 96Because derivations in NKβη consist of sequents of closed sentences, BBK's result requires a restriction to require the signature Σ contains a sufficiently large infinite supply of constants in each type that do not occur in Γ. We can get around this by allowing free variables to occur in derivations, although we will still need a restriction to rule out the case where Γ is so enormous that (in some type) the set of variables free in Γ is larger than the set of variables not free in Γ. 97Note that when K -formulae are allowed to occur in derivations, we can prove I -formulae which are not provable when derivations are required to contain only I -formulae. For example, in the K language, we can prove p ≡ (λq.p)p from the theorem p ≡ p using β-conversion, and then apply the introduction rules for the quantifiers to derive ∀pt∃xt→t∀qt(p ≡ xq), an I -formula which cannot be proved by a derivation consisting entirely of I -formulae. 98For example, in a sequent calculus, we could achieve this by changing the rules ∃R and ∀L to disallow steps which take us from a sequent in which some variable occurs free to a sequent in which that variable does not occur free. 75 out to be attractive or objectionable); and thanks to the completeness theorem, it can also help us identify questions that the theory does not deductively resolve (which may be praiseworthy open-mindedness or problematic weakness).99 A4 Varieties of structures and models In this section we will distinguish some interesting subclasses of structures and models. This will help to illuminate why the definitions in A3 look the way they do, and to further refine our sense of what we should be hoping for when looking for models of a theory. 
Most of these definitions are standard; here again my discussion draws heavily on BBK.

As we go on, it will be helpful to be able to apply some standard algebraic terminology to L-structures and L-models. For future reference, the definitions are as follows:

Definition 4.1. A homomorphism from an L-structure ⟨D, ⟦⋅⟧⟩ to an L-structure ⟨D′, [⋅]⟩ is a typed function f from D to D′ such that for any A ∈ L(V)_σ and g ∈ D[V], [A]^{f∘g}_σ = f(⟦A⟧^g_σ). A homomorphism from an L-model ⟨D, ⟦⋅⟧, |⋅|⟩ to an L-model ⟨D′, [⋅], ‖⋅‖⟩ is a homomorphism f from ⟨D, ⟦⋅⟧⟩ to ⟨D′, [⋅]⟩ such that for any p ∈ D_t, ‖f(p)‖ = |p|. An isomorphism is a homomorphism that is bijective.¹⁰⁰ An L-structure ⟨D, ⟦⋅⟧⟩ is a substructure of an L-structure ⟨D′, [⋅]⟩ if D ⊆ D′ and ⟦A⟧^g_σ = [A]^g_σ whenever A ∈ L(V)_σ and g ∈ D[V]. ⟨D, ⟦⋅⟧, |⋅|⟩ is a submodel of ⟨D′, [⋅], ‖⋅‖⟩ iff ⟨D, ⟦⋅⟧⟩ is a substructure of ⟨D′, [⋅]⟩ and |p| = ‖p‖ whenever p ∈ D_t. A congruence on an L-structure ⟨D, ⟦⋅⟧⟩ is a typed family ∼ where ∼_σ is an equivalence relation on D_σ for each σ, and for any assignment functions g, h ∈ D[V] such that g(v) ∼_ρ h(v) whenever v ∈ V_ρ, ⟦A⟧^g_σ ∼_σ ⟦A⟧^h_σ for any A ∈ L(V)_σ. A congruence on an L-model ⟨D, ⟦⋅⟧, |⋅|⟩ is a congruence on ⟨D, ⟦⋅⟧⟩ with the further property that |p| = |q| whenever p ∼_t q. When ∼ is a congruence on an L-structure ⟨D, ⟦⋅⟧⟩, the quotient of ⟨D, ⟦⋅⟧⟩ under ∼ is the L-structure ⟨D^∼, ⟦⋅⟧^∼⟩, where D^∼_σ is the set of equivalence classes of ∼_σ, and for any assignment g for D^∼ with domain V and A ∈ L(V)_σ, ⟦A⟧^{∼g}_σ is the unique equivalence class that contains ⟦A⟧^h_σ for some assignment h ∈ D[V] such that h_ρ(v) ∈ g_ρ(v) whenever v ∈ V_ρ. When ∼ is a congruence on an L-model ⟨D, ⟦⋅⟧, |⋅|⟩, the quotient of the model under ∼ is the model ⟨D^∼, ⟦⋅⟧^∼, |⋅|^∼⟩, where for each p ∈ D^∼_t, |p|^∼ is the unique x ∈ {0, 1} such that |q| = x for some q ∈ p.

⁹⁹ Those who reject βη-conversion or the classical logic of truth-functional connectives and quantifiers will want to find a broader definition of "model" if they want to pursue similar investigations. If you like the classical rules but not βη-conversion, see Muskens (2007) for a conception of "model" so broad that it does not even require α-equivalent formulae to agree in truth value; it is equivalent to the result of imposing a certain weakening on clause (iii) in our definition of an L-structure while adding a new clause to the definition of a model to require |⋅| to respect extensional β-conversion. If you like βη-conversion and the classical rules for truth-functional connectives, but are tempted to reject the classical rules for ∀ and ∃ (and ≡), Bacon and J. S. Russell MS is a good starting point. If you want to weaken the classical rules for truth-functional connectives, there is a vast array of model-theoretic techniques used in the study of non-classical propositional logics which could be adapted to the present setting by taking over the role of the valuation |⋅| in the definition of a model.

¹⁰⁰ If f is an isomorphism from ⟨D, ⟦⋅⟧⟩ to ⟨D′, [⋅]⟩, f⁻¹ is an isomorphism from ⟨D′, [⋅]⟩ to ⟨D, ⟦⋅⟧⟩, since ⟦A⟧^{f⁻¹∘g}_σ = f⁻¹(f(⟦A⟧^{f⁻¹∘g}_σ)) = f⁻¹([A]^{f∘f⁻¹∘g}_σ) = f⁻¹([A]^g_σ).

With these preliminaries out of the way, we can turn to two significant properties of L-structures:

Definition 4.2. An L-structure ⟨D, ⟦⋅⟧⟩ is functional if for any x, y ∈ D_{σ→τ} such that x ≠ y, there is some z ∈ D_σ such that ⟦xz⟧ ≠ ⟦yz⟧.

Definition 4.3. An L-structure ⟨D, ⟦⋅⟧⟩ is full if for any function f from D_σ to D_τ, there is at least one x ∈ D_{σ→τ} such that for every y ∈ D_σ, ⟦xy⟧ = f(⟦y⟧).

These definitions may be helpfully motivated using the concept of applicative structure. An applicative structure on a typed collection D is a collection of functions @_{σ,τ} where @_{σ,τ} maps D_{σ→τ} to D_τ^{D_σ}, the set of functions from D_σ to D_τ. Any L-structure naturally induces an applicative structure on its domain, setting @_{σ,τ}(x)(y) = ⟦xy⟧.
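For readers who find a finite toy case helpful, the following is a minimal sketch of how Definitions 4.2 and 4.3 could be checked at a single pair of types, given only an application table. The names `induced_function`, `is_functional`, `is_full` and the example domains are hypothetical and do not come from the text, and the check looks only at the induced application operation: it does not verify that the table comes from a genuine L-structure.

```python
from itertools import product

def induced_function(app, x, arg_dom):
    """The function induced by x, recorded as the tuple of values x@y for y in arg_dom."""
    return tuple(app[(x, y)] for y in arg_dom)

def is_functional(app, fun_dom, arg_dom):
    """Definition 4.2 at one type pair: distinct elements induce distinct functions."""
    images = [induced_function(app, x, arg_dom) for x in fun_dom]
    return len(set(images)) == len(images)

def is_full(app, fun_dom, arg_dom, res_dom):
    """Definition 4.3 at one type pair: every function from arg_dom to res_dom is induced."""
    images = {induced_function(app, x, arg_dom) for x in fun_dom}
    every_function = set(product(res_dom, repeat=len(arg_dom)))
    return every_function <= images

# Hypothetical example A: two truth values and four elements of type t -> t.
D_t = [0, 1]
D_tt_a = ["id", "not", "top", "bot"]
app_a = {("id", 0): 0, ("id", 1): 1, ("not", 0): 1, ("not", 1): 0,
         ("top", 0): 1, ("top", 1): 1, ("bot", 0): 0, ("bot", 1): 0}

# Hypothetical example B: "id" and "id2" behave alike, and the constant functions are missing.
D_tt_b = ["id", "id2", "not"]
app_b = {("id", 0): 0, ("id", 1): 1, ("id2", 0): 0, ("id2", 1): 1,
         ("not", 0): 1, ("not", 1): 0}

print(is_functional(app_a, D_tt_a, D_t), is_full(app_a, D_tt_a, D_t, D_t))
print(is_functional(app_b, D_tt_b, D_t), is_full(app_b, D_tt_b, D_t, D_t))
```

Example B fails both checks for different reasons: two distinct elements induce the same function (so application is not one-one), and no element induces either constant function (so it is not onto).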
When x ∈ D_{σ→τ} and y ∈ D_σ, we may write x@y as an equivalent to @_{σ,τ}(x)(y) or ⟦xy⟧. In these terms, functionality and fullness can be explained as follows: an L-structure is functional if @_{σ,τ} is one-one for each σ, τ, and it is full if @_{σ,τ} is onto for each σ, τ.¹⁰¹

The unfamiliar form of Definition 3.2 is largely motivated by the desire not to restrict our attention only to functional L-structures. Every functional L-structure is isomorphic to a frame, i.e. an L-structure in which for every complex type σ → τ, D_{σ→τ} is a subset of D_τ^{D_σ}, and for x ∈ D_{σ→τ} and y ∈ D_σ, ⟦xy⟧ = x(y); in other words, an L-structure in which @_{σ,τ} is just the identity operation for every σ, τ.¹⁰² Moreover, in any functional L-structure, the full denotation function ⟦⋅⟧ can be recovered from the application operation @ together with the restriction of ⟦⋅⟧ to constants. For example, in any L-structure, ⟦λp.p⟧_{t→t} must be an x ∈ D_{t→t} such that x@q = q for all q ∈ D_t (since ⟦λp.p⟧_{t→t}@q = ⟦(λp.p)q⟧_t = ⟦q⟧_t = q); but given functionality there can be at most one x ∈ D_{t→t} with this property, so the identity of ⟦λp.p⟧_{t→t} is determined by the application operation. So if we were only interested in functional L-structures, it would have been natural to define the structures of interest as frames meeting certain further conditions (i.e. those required for ⟦A⟧ to exist for every pure closed term A of L), together with a typed function from Σ to the domain.

¹⁰¹ Thanks here to Andrew Bacon.

¹⁰² See BBK, Theorem 3.68.

We call a model functional or full if its constituent L-structure is functional or full. Here are some further significant properties of models:

Definition 4.4. A model is extensional if D_t contains exactly two elements.¹⁰³

Definition 4.5. A model is extensionally full if, for every set Z ⊆ D_{σ1} × ⋯ × D_{σn}, there is some x ∈ D_{σ1→⋯→σn→t} such that, for any y1 ∈ D_{σ1}, ..., yn ∈ D_{σn}, |⟦xy1...yn⟧| = 1 iff ⟨y1, ..., yn⟩ ∈ Z.

Definition 4.6.
A model is internally full if, for every function f from D_σ to D_τ, if there is some x ∈ D_{σ→τ→t} such that |⟦xyy′⟧| = 1 exactly when y′ = f(y), then there is some z ∈ D_{σ→τ} such that for any y ∈ D_σ, ⟦zy⟧ = f(y).

Definition 4.7. A model is Leibnizean if, for every z ∈ D_σ, there is some x ∈ D_{σ→t} such that, for any y ∈ D_σ, |x@y| = 1 iff y = z.

A Henkin model is one that is populated, functional, extensional, and Leibnizean.¹⁰⁴ A standard model is a full Henkin model.

¹⁰³ Note that D_t could not contain fewer than two elements, since it follows from the definition of a model that |⟦∃p(p)⟧| = 1 and |⟦∀p(p)⟧| = 0.

¹⁰⁴ Henkin (1950) actually did not require Leibnizeanness, leading to a mistake in his central theorem, which Andrews (1972) points out and remedies by imposing a condition equivalent to Leibnizeanness.

We can note the following logical relations among these properties.

Proposition 4.8. Every extensionally full model is Leibnizean.

Proof. Take Z = {z}.

Proposition 4.9. A model is full just in case it is both extensionally full and internally full.

Proof. If a model is extensionally full, then for every function f from D_σ to D_τ there is some x ∈ D_{σ→τ→t} such that |⟦xyy′⟧| = 1 iff y′ = f(y), and if it is also internally full, there is a corresponding z in D_{σ→τ} such that for any y ∈ D_σ, |⟦xy(zy)⟧| = 1, hence ⟦zy⟧ = f(y). In the other direction, the implication from fullness to internal fullness is immediate. To show that every full model is extensionally full, we use induction on n. For the base case where n = 0 and Z is either {⟨⟩} or ∅, we need only observe that |⟦∃p(p)⟧| = 1 and |⟦∀p(p)⟧| = 0. For the induction step, given Z ⊆ D_{σ1} × ⋯ × D_{σn+1}, we define a function f_Z from D_{σ1} to D_{σ2→⋯→σn+1→t} by choosing f_Z(y1), for each y1 ∈ D_{σ1}, to be some w such that |⟦wy2...yn+1⟧| = 1 iff ⟨y1, y2, ..., yn+1⟩ ∈ Z: such a w is guaranteed to exist in every case by the induction hypothesis.
Since the model is full, there is some x ∈ D_{σ1→⋯→σn+1→t} such that ⟦xy1⟧ = f_Z(y1) for all y1 ∈ D_{σ1}. It follows that |⟦xy1...yn+1⟧| = 1 iff ⟨y1, y2, ..., yn+1⟩ ∈ Z.

Proposition 4.10. Any functional, extensional, extensionally full model is full.

Proof. First prove (by a straightforward induction on n) that in a functional, extensional model, when x, y ∈ D_{σ1→⋯→σn→t}, x = y iff |x| = |y|: i.e. iff for all ⟨z1, ..., zn⟩ ∈ D_{σ1} × ⋯ × D_{σn}, |⟦xz1...zn⟧| = |⟦yz1...zn⟧|. Then suppose f is a function from D_σ to D_τ, where τ = σ1 → ⋯ → σn → t. By extensional fullness, there exists x ∈ D_{σ→τ} such that for any y0, ..., yn, |⟦xy0...yn⟧| = 1 iff |f(y0)@y1@⋯@yn| = 1; but then by the foregoing fact, ⟦xy0⟧ = f(y0) for all y0 ∈ D_σ.

Corollary 4.11. Any extensionally full Henkin model is standard.

The following definition helps to clarify the interest of Leibnizean models:

Definition 4.12. When M = ⟨D, ⟦⋅⟧, |⋅|⟩ is a model, Leibniz-equivalence in M is the typed family ∼ where ∼_σ is the equivalence relation on D_σ such that x ∼_σ y iff |⟦zx⟧| = |⟦zy⟧| for every z ∈ D_{σ→t}, or equivalently, iff |⟦x ≡_σ y⟧| = 1.

A model is Leibnizean, then, if no distinct elements of any D_σ are ever Leibniz-equivalent. It is not hard to show that Leibniz-equivalence is a congruence (see BBK, Theorem 3.62), and that the quotient of any model by Leibniz-equivalence is Leibnizean. One nice thing about the class of Leibnizean models is that within it, some of our other properties can be characterised by object language axioms.

Proposition 4.13. If a model is Leibnizean, it is extensional iff The Fregean Axiom is true in it; functional iff Functionality is true in it for every type σ and terminal type τ; and internally full iff Strong Plenitude is true in it for every type σ and terminal type τ.
The Fregean Axiom: ∀p^t ∀q^t ((p ↔ q) → (p ≡_t q))

Functionality: ∀x^{σ→τ} ∀y^{σ→τ} (∀z^σ (xz ≡_τ yz) → x ≡_{σ→τ} y)

Strong Plenitude: ∀x^{σ→τ→t} (∀y^σ ∃u^τ (xyu ∧ ∀v^τ (xyv → u ≡_τ v)) → ∃z^{σ→τ} ∀y^σ (xy(zy)))

(Note that Plenitude from §6 is equivalent to the restriction of Strong Plenitude to the case where τ is t.)

Proof. These follow straightforwardly from the fact that in a Leibnizean model, |⟦x ≡_σ y⟧| = 1 iff x = y.

We may also note that any model (Leibnizean or not) is populated iff ∃x^e(∃f^{e→t}(fx)) is true in it.¹⁰⁵ However, not all of our classes of models can be characterised in this way. One example is the class of Leibnizean models. Since Leibniz-equivalence is a congruence, we can take the quotient of any model under Leibniz-equivalence, and the result is a model in which exactly the same closed sentences are true, and exactly the same sets of open formulae can be made true by providing compatible assignments. This means we would neither gain nor lose anything if we restricted our attention to Leibnizean models: the formulae and sequents valid in any class of models C are exactly those that are valid in the class of Leibniz-quotients of models in C. It follows that the same proof procedures that are sound and complete for the class of all models are sound and complete for the class of all Leibnizean models. Extensional fullness is another property not characterised by any object-language axiom-schema, but in contrast to the case of Leibnizean models, no recursive proof procedure is sound and complete for the class of all extensionally full models. Gödel's first incompleteness theorem shows that no such proof procedure is sound and complete for the class of all standard models, or indeed for any class of standard models that contains at least one member whose domain in some type is infinite.
And this result extends to the class of all extensionally full models, and to any class of extensionally full models that contains at least one member whose domain in some type is infinite.¹⁰⁶ The discovery that a certain theory was not true in any extensionally full models would be a worrying development for proponents of that theory. Explaining exactly what the reasons for worrying would be is harder than explaining why it would be bad if the theory turned out not to have any models at all, and I won't try to get to the bottom of the matter. But one obvious danger is that when we inspected the proof, we could see how to transform it into a proof of the inconsistency of the theory either with the mathematical axioms we used in the proof, or with some analogues of these axioms formulated in higher-order terms (for example, a higher-order analogue of the axiom of choice). Insofar as the axioms were well supported, this would be a blow to the theory's credibility. Another possibility is that the proof would reveal some kind of clash between the way we are treating infinity within the theory and the way we are reasoning about infinity in the syntactic metatheory we use when specifying what the theory is. This would be the case, for example, if the theory contained the higher-order regimentations of 'Fred is a farmer', 'Fred's parents are farmers', 'Fred's parents' parents are farmers', and so on, while also containing 'Not all Fred's ancestors are farmers', i.e. 'Some non-farmer has every property that Fred has and that is had by the parents of everything that has it'. (Such a theory cannot have extensionally full models, since the extension of 'farmer' in each model must include all objects that can be connected to the denotation of 'Fred' by finite chains in which neighbouring objects belong to the extension of 'parent'.) There is clearly something deeply objectionable here, even though it is hard to pin down what it is. But however the objection is best conceived, we can assure ourselves that our theory will not face it if we manage to prove from standard mathematical axioms that the theory has extensionally full models.

¹⁰⁵ Clearly the truth of ∃x^e(∃f^{e→t}(fx)) is sufficient for D_e to be nonempty. This suffices for the domain to be populated because of the following fact: whenever ⟨D, ⟦⋅⟧⟩ is an L-structure and L is logical, D_τ is nonempty for every terminal type τ. The proof of this is straightforward for K-models: D_t must be nonempty since it contains ⟦∃p(p)⟧, and so every complex type σ → τ must be nonempty since D_{σ→τ} contains ⟦λu^σ.x⟧ for every x ∈ D_τ. For an I-model, it is still true but less obvious. The required lemmas are as follows: (i) D_t, D_{e→t}, D_{(σ→t)→t}, and D_{t→t→t} are nonempty, since they contain ⟦∃p(p)⟧_t, ⟦λx.∃f(fx)⟧_{e→t}, ⟦∃_σ⟧, and ⟦∧⟧. (ii) If x ∈ D_{τ→τ→τ}, ⟦λy.λz.λw.x(yw)(zw)⟧ ∈ D_{(ρ→τ)→(ρ→τ)→ρ→τ}. (iii) If x ∈ D_{(σ→τ)→τ}, ⟦λy.λz.x(λw.ywz)⟧ ∈ D_{(σ→ρ→τ)→ρ→τ}. (iv) If x ∈ D_{π→τ} and y ∈ D_{(σ→π)→π}, ⟦λz.x(yz)⟧ ∈ D_{(σ→π)→τ}. (v) If x ∈ D_{σ→τ}, y ∈ D_{ρ→τ}, and z ∈ D_{τ→τ→τ}, ⟦λv.λw.z(xv)(yw)⟧ ∈ D_{σ→ρ→τ}. Using these we can establish the nonemptiness of D_τ for every terminal τ: first for types τ → τ → τ and (σ → τ) → τ using (i)–(iii), and then for the remaining types using (iv) and (v).

¹⁰⁶ The following sketch shows that this claim is true in the special case where the infinite type is e and the models are populated; the general result is analogous. For each σ, define a "hereditary coextensiveness" predicate E∼_σ of type σ → σ → t as follows: (i) x E∼_e y =df x ≡_e y; (ii) x E∼_t y =df x ↔ y; (iii) x E∼_{σ→τ} y =df ∀z∀w(z E∼_σ w → xz E∼_τ yw). Also define ∀E_σ as λx^{σ→t}.∀y^σ((y E∼_σ y) → xy) and ∃E_σ as λx^{σ→t}.∃y^σ((y E∼_σ y) ∧ xy), and note that ∀E_σ E∼_{(σ→t)→t} ∀E_σ and ∃E_σ E∼_{(σ→t)→t} ∃E_σ are valid for every σ. Say x ∈ D_σ is hereditarily extensional if |⟦x E∼_σ x⟧| = 1. Given any Log-model M = ⟨D, ⟦⋅⟧, |⋅|⟩, we can make a new model M′ by first throwing away all elements of the domains that are not hereditarily extensional, and then modifying the denotation function (see Definition 5.3) so that the new denotations of ∀_σ and ∃_σ are the old denotations of ∀E_σ and ∃E_σ. (This makes sense since the old denotations of ∀E_σ and ∃E_σ are hereditarily extensional: see Proposition 5.9 below for a more careful description of this general procedure for modifying models.) Hereditary coextensiveness can be shown to be a congruence on M′, so we can take the quotient of M′ by it: call the resulting model Mext the extensional core of M. Mext will be extensional, since there are only two equivalence classes under hereditary coextensiveness in D_t. Given the form of the definition of hereditary coextensiveness it is immediate that Mext will be functional; moreover, if M was extensionally full, Mext will be extensionally full too. So by Proposition 4.10, if M was extensionally full with an infinite domain of objects, Mext will be standard with an infinite domain of objects. Moreover, a sentence is true in Mext iff the result of simultaneously substituting ∀E_σ for ∀_σ and ∃E_σ for ∃_σ in it is true in M. Thus, any sound and complete proof procedure for a class C of models can be turned into a sound and complete proof procedure for the class Cext of extensional cores of models in C. But if every model in C is populated and extensionally full and at least one is infinite in type e, then every model in Cext is standard and at least one is infinite in type e, so Gödel's theorem rules out the existence of a sound and complete proof procedure for Cext.
So, the strategic considerations that motivate the search for models in the first place also motivate the search for extensionally full models. On the other hand, I see nothing worrying about the discovery that a theory lacks full models. As we have seen, any theory inconsistent with Strong Plenitude will lack Leibnizean, internally full models, and hence will lack full models (since the Leibniz-quotient of any full model is full); this observation has not suggested any objections to such theories that seem at all comparable in force to those that might follow on the discovery that a theory lacks extensionally full models.¹⁰⁷

A5 Transformations of structures and models

This section will define a few operations for turning one structure or model into another which we will need to rely on later. Most obviously, we can get a new structure (or model) just by shrinking the language on which the denotation function is defined.

Definition 5.1. When Σ′ ⊆ Σ and S = ⟨D, ⟦⋅⟧⟩ is an I Σ-structure [K Σ-structure], SΣ′, the restriction of S to Σ′, is ⟨D, [⋅]⟩, where for any g ∈ D[V], [A]^g_σ = ⟦A⟧^g_σ when A is in I Σ′(V)_σ [K Σ′(V)_σ] and undefined otherwise. MΣ′ = ⟨D, [⋅], |⋅|⟩ is a model if M = ⟨D, ⟦⋅⟧, |⋅|⟩ is: we call this the restriction of M to Σ′.

Similarly, we can turn a K Σ-structure or K Σ-model into an I Σ-structure or I Σ-model simply by excluding all terms of K Σ that are not terms of I Σ from the domain of each ⟦⋅⟧^g.

Slightly less trivially, when Σ′ is any signature, we can transform an I Σ-structure or K Σ-structure into an I Σ′-structure or K Σ′-structure by choosing some typed function I from Σ′ to D to provide the interpretations of the constants in Σ′. To prepare the ground for this, we need a lemma to the effect that the denotation on an assignment of the result of performing a substitution operation on a term is the same as the denotation of the original term on an appropriately modified assignment.
Recall that [π]A is the result of simultaneously substituting π(v) for each free occurrence of v in A (see Definition 2.5).

¹⁰⁷ Sometimes, schemas that imply Strong Plenitude, like ∀x^σ∃y^ρ(rxy) → ∃f^{σ→ρ}(∀x^σ(rx(fx))), are taken as appropriate higher-order analogues of the axiom of choice. But in effect this simply bundles Strong Plenitude together with the weaker schema ∀x^σ∃y^ρ(rxy) → ∃s^{σ→ρ→t}(∀x^σ∃y^ρ(rxy ∧ sxy ∧ ∀z^ρ(sxz → z ≡_ρ y))), and the intuitive and mathematical reasons for choice seem to attach more properly to the weaker schema. Thanks to Jeff Russell for discussion on this point.

Lemma 5.2 (Substitution Lemma). If A ∈ L(V)_σ, and π is a substitution function defined only on variables, then whenever ⟦[π]A⟧^g_σ is defined, it equals ⟦A⟧^{g^{π,V}}_σ, where for each v ∈ V_ρ, g^{π,V}_ρ(v) is ⟦[π]v⟧^{g⇂FV([π]v)}_ρ; that is, ⟦π(v)⟧^{g⇂FV(π(v))}_ρ if π is defined on v and g_ρ(v) otherwise.

Proof. We first show that the claim is true for all "straightforward" substitution functions, where π is straightforward if none of the variables in its domain is free in any of the terms in its range. We argue by induction on the cardinality of V. Base case: if V = ∅, then [π]A = A, and g = g^{π,V} = ∅, so ⟦[π]A⟧^g_σ = ⟦A⟧_σ = ⟦A⟧^{g^{π,V}}_σ. Induction step: if A is of type e and some variable v is free in A, then A must be v. Then g^{π,V} is v ↦_e ⟦[π]v⟧^g_e, and so (by condition (i)) ⟦v⟧^{g^{π,V}}_e = g^{π,V}_e(v) = ⟦[π]v⟧^g_e. Otherwise, A is of some terminal type τ. Let v ∈ V_ρ for some type ρ, and let V⁻ = V − {v}. Let π⁻ be the restriction of π to V⁻, and let k and h be the restrictions of g respectively to the variables in FV([π]v), and to the variables free in [π]u for some u in V⁻. Note that h^{π⁻,V⁻} is the restriction of g^{π,V} to V⁻, so g^{π,V} = h^{π⁻,V⁻} ∪ (v ↦_ρ ⟦[π]v⟧^k).
Then

⟦A⟧^{g^{π,V}}_τ
= ⟦(λv.A)v⟧^{g^{π,V}}_τ   (by condition (iii))
= ⟦(λv.A)v⟧^{h^{π⁻,V⁻} ∪ (v ↦_ρ ⟦[π]v⟧^k)}_τ
= ⟦λv.A⟧^{h^{π⁻,V⁻}}_{ρ→τ} @ ⟦v⟧^{v ↦_ρ ⟦[π]v⟧^k}_ρ   (by condition (ii))
= ⟦λv.A⟧^{h^{π⁻,V⁻}}_{ρ→τ} @ ⟦[π]v⟧^k_ρ   (by condition (i))
= ⟦[π⁻]λv.A⟧^h_{ρ→τ} @ ⟦[π]v⟧^k_ρ   (by the induction hypothesis)
= ⟦([π⁻]λv.A)([π]v)⟧^g_τ   (by condition (ii))
= ⟦(λv.[π⁻]A)[π]v⟧^g_τ   (since π is straightforward)
= ⟦[([π]v)/v][π⁻]A⟧^g_τ   (by condition (iii))
= ⟦[π]A⟧^g_τ   (since π is straightforward)

Finally, if the lemma holds for all straightforward π, it holds for all π, since for any π and term A ∈ L(V)_σ we can find two straightforward substitution functions π1 and π2 such that [π]A = [π1][π2]A: just let π1 bijectively map V to a typed family of variables U none of which is in any term in the range of π, and let π2(u) = π(π1⁻¹(u)). So ⟦[π]A⟧^g_σ = ⟦[π1][π2]A⟧^g_σ = ⟦A⟧^{(g^{π1,V})^{π2,U}}_σ, where for any v in V, (g^{π1,V})^{π2,U}(v) = ⟦[π2]v⟧^{g^{π1}⇂FV(π2(v))}. Applying the just-proved result again, we see that this is the same as ⟦[π1][π2]v⟧^g = ⟦[π]v⟧^g. Thus (g^{π1,V})^{π2,U} = g^{π,V}, and so ⟦[π]A⟧^g_τ = ⟦A⟧^{g^{π,V}}_τ.

With this lemma in the background we can define our "reinterpretation" operation as follows:

Definition 5.3. Where S = ⟨D, ⟦⋅⟧⟩ is an L Σ-structure (i.e. I Σ-structure or K Σ-structure), and I is a typed function from Σ′ to D, S^I, the reinterpretation of S by I, is the I Σ′-structure [K Σ′-structure] ⟨D, ⟦⋅⟧^I⟩, where for any A ∈ L Σ′(V) containing constants c1...cn of types σ1...σn and g ∈ D[V], ⟦A⟧^{Ig} = ⟦[vi/ci]A⟧^{g∪[vi ↦_{σi} I(ci)]}, where v1...vn are distinct variables of types σ1...σn not in V (by Lemma 5.2 it doesn't matter which ones we choose).

Proposition 5.4. S^I as defined in Definition 5.3 is an L Σ′-structure. Moreover, if ⟨D, [⋅]⟩ is any L Σ′-structure such that [⋅] agrees with ⟦⋅⟧ on all pure terms and [c] = I(c) for each constant c of Σ′, [⋅] = ⟦⋅⟧^I.

Proof. Condition (i) is trivially satisfied since a variable does not contain any constants; condition (iii) is satisfied since if A ≈βη B, [vi/ci]A ≈βη [vi/ci]B by Proposition 2.11.
For condition (ii), suppose that ⟦A⟧Ig1 = ⟦C⟧Ig3 and ⟦B⟧Ig2 = ⟦D⟧Ig4 , where g1 and g2 are compatible and g3 and g4 are compatible. Let c1...cn and d1...dm be the constants of AB and CD respectively, and v1...vn and u1...um variables of appropriate types. Then ⟦AB⟧Ig1∪g2 = ⟦[vi/ci]AB⟧g1∪g2∪[vi↦σi I(ci)] = ⟦[ui/di]CD⟧g3∪g4∪[ui↦σi I(di)] (by condition (ii) for ⟦⋅⟧) = ⟦CD⟧Ig3∪g4 . To prove the uniqueness claim we argue by induction on the number of constants. The base case is the given fact that ⟦⋅⟧I and [⋅] agree on pure terms. For the induction step, let c be some constant of type ρ in A, let x = [c] = ⟦c⟧I , and let B be a term not containing c such that A = [c/v]B. Then by (iii), [A]g = [(λv.B)c]g = [λv.B]g@x = ⟦λv.B⟧Ig@x by the induction hypothesis = ⟦(λv.B)c⟧Ig = ⟦A⟧Ig . The above operations leave the domain D unchanged. We will also want to be able to take a structure or model and make a new one by "throwing away" some elements. In the case of an L -structure, we can specify the conditions for this to be possible using the concept of a closed subcollection of the domain. Definition 5.5. When S = ⟨D , ⟦⋅⟧⟩ is an L -structure, C ⊆ D is closed iff for any A ∈ L(V )σ , and any assignment function g ∈ C [V ], ⟦A⟧ g σ ∈ Cσ . Definition 5.6. When S = ⟨D , ⟦⋅⟧⟩ is an L -structure and C ⊆ D is closed, the restriction of S to C , is the L -structure SC = ⟨C , [⋅]⟩, where [A]gσ is ⟦A⟧gσ for any A ∈ L(V )σ and g ∈ C [V ]. It is trivial to show that SC satisfies the defining conditions to be an L -structure, and that it is a substructure of S. In the case of models, things are less straightforward. Typically, if we start with a model ⟨D, ⟦⋅⟧, |⋅|⟩ and just restrict D to some closed C while leaving ⟦⋅⟧ and |⋅| unchanged (except for restricting them respectively to assignment functions for C and to Ct), the result will no longer be a model, because |⟦∃σx⟧| and |⟦∀σx⟧| will no longer satisfy conditions (iv) and (v). 
We will need to either adjust ⟦∀_σ⟧ and ⟦∃_σ⟧ or adjust the valuation |⋅| to make sure that these conditions are still satisfied. In practice it is easiest to do the former, since so long as the characteristic function of each C_σ is the extension of some element of C_{σ→t}, we can use these elements to specify the new denotations of the quantifiers as restrictions of the old denotations.

Definition 5.7. If M = ⟨D, ⟦⋅⟧, |⋅|⟩ is a model and F is a typed family such that, for each σ, F_σ ∈ D_{σ→t}, the extension of F in M is the typed collection |F| such that |F|_σ = {y ∈ D_σ ∶ |F_σ@y| = 1}. F is self-contained if |F_{σ→t}@F_σ| = 1 for every σ.

Definition 5.8. Suppose that M = ⟨D, ⟦⋅⟧, |⋅|⟩ is an L-model, and F is a self-contained typed family with a closed extension. Then M^F, the restriction of M by F, is the triple ⟨|F|, ⟦⋅⟧^F, |⋅|^F⟩, where |⋅|^F is just the restriction of |⋅| to |F|_t, and ⟦⋅⟧^F is the restriction to |F| of the reinterpreted denotation function ⟦⋅⟧^{I_F} (see Definition 5.3) where I_F(c) = ⟦c⟧ for every constant that is not a quantifier, I_F(∀_σ) = ⟦λx^{σ→t}.∀_σ(λy^σ.(F_σy → xy))⟧, and I_F(∃_σ) = ⟦λx^{σ→t}.∃_σ(λy^σ.(F_σy ∧ xy))⟧.

Proposition 5.9. M^F as defined in Definition 5.8 is a model.

Proof. Since F is self-contained and |F| is closed in ⟨D, ⟦⋅⟧⟩, I_F(∀_σ) and I_F(∃_σ) belong to |F|; it follows that |F| is closed in ⟨D, ⟦⋅⟧^F⟩, so ⟨|F|, ⟦⋅⟧^F⟩ is an L-structure. So it suffices to show that |⋅|^F obeys conditions (i)–(v). Conditions (i)–(iii) follow immediately from the fact that ⟦∧⟧^F = ⟦∧⟧, ⟦∨⟧^F = ⟦∨⟧, and ⟦¬⟧^F = ⟦¬⟧ and that |⋅|^F agrees with |⋅| whenever defined. For condition (iv), note that

|⟦∃_σ x⟧^F|^F
= |⟦∃_σ x⟧^F|
= |⟦(λx^{σ→t}.∃_σ(λy^σ.(F_σy ∧ xy)))x⟧|
= |⟦∃_σ(λy^σ.(F_σy ∧ xy))⟧|
= max{|⟦(λy^σ.(F_σy ∧ xy))y⟧| ∶ y ∈ D_σ}
= max{|⟦F_σy ∧ xy⟧| ∶ y ∈ D_σ}
= max{min{|⟦F_σy⟧|, |⟦xy⟧|} ∶ y ∈ D_σ}
= max{|⟦xy⟧| ∶ y ∈ D_σ and |⟦F_σy⟧| = 1}
= max{|⟦xy⟧^F|^F ∶ y ∈ |F|_σ}

Condition (v) holds for the same reason.
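The arithmetic behind the last few steps of this calculation can be checked by brute force on a small finite domain. The sketch below is only an illustration under stated assumptions: truth values are 0 and 1, conjunction is min, the conditional is read as material implication, the extension of F is nonempty, and the function names (`exists_restricted`, `forall_restricted`, `check`) are hypothetical rather than taken from the text.

```python
from itertools import product

def exists_restricted(F, x, domain):
    """Value of the relativised existential: max over the old domain of min(F(y), x(y))."""
    return max(min(F[y], x[y]) for y in domain)

def forall_restricted(F, x, domain):
    """Value of the relativised universal, reading the conditional as material implication."""
    return min(max(1 - F[y], x[y]) for y in domain)

def check(domain):
    """Compare the relativised quantifiers with plain max/min over the extension of F."""
    for F_vals in product((0, 1), repeat=len(domain)):
        F = dict(zip(domain, F_vals))
        extension = [y for y in domain if F[y] == 1]
        if not extension:
            continue  # assumption: the extension of F is nonempty
        for x_vals in product((0, 1), repeat=len(domain)):
            x = dict(zip(domain, x_vals))
            assert exists_restricted(F, x, domain) == max(x[y] for y in extension)
            assert forall_restricted(F, x, domain) == min(x[y] for y in extension)
    return True

print(check(["a", "b", "c"]))
```

Elements outside the extension contribute 0 to the maximum and 1 to the minimum, so they never affect the result; this is the point of relativising the old quantifier denotations to F rather than changing the valuation.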
A6 Applicative notions

When S = ⟨D, ⟦⋅⟧⟩ is an L-structure and S′ = ⟨D′, [⋅]⟩ is an L′-structure, a @-homomorphism from S to S′ is a typed function f from D to D′ such that f(x)@f(y) = f(x@y) for every x ∈ D_{σ→τ} and y ∈ D_σ. A @-isomorphism is a bijective @-homomorphism. In this section we will introduce some properties that depend only on the applicative structure induced by an L-structure, and are thus preserved by @-isomorphisms. Here are the first two:

Definition 6.1. C ⊆ D is @-closed iff x@y ∈ C_τ whenever x ∈ C_{σ→τ} and y ∈ C_σ.

Definition 6.2. C ⊆ D is inclusive iff x@y ∈ C_τ whenever x ∈ D_{σ→τ} and y ∈ C_σ.

Some elementary consequences of these definitions:

Proposition 6.3. If C is inclusive, then x@y ∈ C_τ whenever x ∈ C_{σ→τ} and y ∈ D_σ.

Proof. x@y = ⟦(λx.xy)x⟧ = ⟦(λx.xy)⟧@x.

Proposition 6.4. If C is inclusive and C′ is @-closed, C ∪ C′ is @-closed.

Proof. If x and y are both in C ∪ C′, then either they are both in C′, in which case x@y is in C′ because C′ is @-closed, or else y is in C, in which case x@y is also in C because C is inclusive, or else x is in C, in which case x@y is also in C by Proposition 6.3.

Proposition 6.5. C is closed (see Definition 5.5) iff (a) C is @-closed, and (b) C_σ contains ⟦A⟧ for every closed term A of L.

Proof. Suppose (a) and (b) hold, and the range of g is contained in C; then for any term A with free variables v1, ..., vn, ⟦A⟧^g = ⟦(λv1...λvn.A)v1...vn⟧^g = ⟦λv1...λvn.A⟧@g(v1)@⋯@g(vn) and so ⟦A⟧^g is in C.

Note that even an inclusive C need not be closed, since it need not contain ⟦A⟧ for every closed A; for example, ∅ is inclusive.

The concept of inclusiveness is of little interest when we are talking about K-structures, because of the following:

Proposition 6.6. If ⟨D, ⟦⋅⟧⟩ is a K-structure, C ⊆ D is inclusive just in case either C = ∅, or C_τ = D_τ for every terminal τ.

Proof. The right to left direction is trivial.
For the left to right direction, suppose that C is inclusive and y ∈ C_σ for some σ; then for any x ∈ D_τ, x = ⟦λu^σ.x⟧@y, so x ∈ C_τ.

By contrast, the domain of an I-structure may have inclusive subcollections that are proper in terminal types. Term structures provide a rich source of examples of this. The general concept of a term structure makes sense for K-structures as well (see BBK, sect. 3), but we will only be concerned here with term I-structures whose domains consist of βη-equivalence classes of closed terms of some underlying I language.

Definition 6.7. When Σ and Σ′ are signatures, a βη-term structure for I Σ with base language I Σ′ is an I Σ-structure ⟨D, ⟦⋅⟧⟩ such that (a) D_σ is the set of βη-equivalence classes of closed terms in I Σ′(∅)_σ, and (b) for any pure term A ∈ I ∅(V)_σ, g ∈ D[V], and h ∈ (I Σ′(∅))[V] such that h_ρ(v) ∈ g_ρ(v) for all v ∈ V_ρ, ⟦A⟧^g_σ is the set of all closed terms of I Σ′(∅)_σ βη-equivalent to [h]A, i.e. the result of substituting for each free variable of A the closed term to which it is mapped by h. (This makes sense since by Proposition 2.11, we will get the same βη-equivalence class whichever h we choose.)

Proposition 6.8. For any Σ′, there is a unique βη-term structure for I ∅ with base language I Σ′.

Proof. ⟦A⟧^g is obviously uniquely pinned down for each term A of I ∅ by condition (b), so we just have to verify that the resulting ⟦⋅⟧ obeys the conditions in the definition of an L-structure. Condition (i) is obviously satisfied since when A ∈ C, ⟦v⟧^{v ↦_σ C} is the βη-equivalence class containing A, i.e. C. For condition (ii), suppose g1 and g2 are compatible, g3 and g4 are compatible, ⟦A⟧^{g1} = ⟦C⟧^{g3}, and ⟦B⟧^{g2} = ⟦D⟧^{g4}. Let h1–h4 be typed functions with the same domains as g1–g4 such that for each v, hi(v) ∈ gi(v), h1 and h2 are compatible, and h3 and h4 are compatible. Then ⟦AB⟧^{g1∪g2} is the βη-equivalence class containing [h1 ∪ h2]AB = [h1]A[h2]B.
But since [h1]A ≈βη [h3]C and [h2]B ≈βη [h4]D, [h1]A[h2]B ≈βη [h3]C[h4]D = [h3 ∪ h4]CD by Proposition 2.10; so ⟦AB⟧^{g1∪g2} = ⟦CD⟧^{g3∪g4}. Finally, condition (iii) holds as an immediate consequence of Proposition 2.11: if A ≈βη B, then [h]A ≈βη [h]B.

It follows from Proposition 6.8 that so long as I Σ′(∅)_σ is nonempty whenever Σ_σ is, there exists at least one βη-term structure for I Σ with base language I Σ′, since we can construct such a structure by starting with the βη-term structure for I ∅ with base language I Σ′, taking any typed function I from Σ to the domain of this structure, and using I to extend the denotation function to I Σ in accordance with Proposition 5.4.

The domain of a βη-term structure with base language I Σ′ may have many nontrivial inclusive subcollections. Since we are dealing with a λI-language, for any Σ′′ ⊆ Σ′, if some constant of Σ′′ occurs in a term A of I Σ′_σ, it also occurs in any term βη-equivalent to A, and in any term βη-equivalent to BA for any B. Thus, the collection of all βη-equivalence classes containing terms containing constants in Σ′′ is an inclusive subcollection of the domain.

The following definition picks out one important inclusive subcollection of the domain of any L-structure in a way that depends only on its induced applicative structure.

Definition 6.9. When S = ⟨D, ⟦⋅⟧⟩ is an L-structure, z ∈ D_σ is directly circular if σ is of the form τ → τ and for some x ∈ D_τ, x = z@x. z ∈ D_σ is circular if for some type τ, x ∈ D_τ, and y ∈ D_{σ→τ→τ}, x = (y@z)@x. Otherwise z is noncircular. Noncirc(S) is the typed collection of all noncircular elements of D.

Proposition 6.10. Noncirc(S) is inclusive.

Proof. If z ∈ D_{σ→τ} and z′ ∈ D_σ are such that z@z′ is circular, there is some ρ, x ∈ D_ρ, y ∈ D_{τ→ρ→ρ} such that x = (y@(z@z′))@x, in which case x = (⟦λw^σ.y(zw)⟧@z′)@x, so z′ is circular too.

Proposition 6.11. If C ⊆ D is inclusive and contains no directly circular elements, then C ⊆ Noncirc(S).

Proof.
Suppose that z ∈ Cσ is circular, i.e. x = (y@z)@x for some τ, x ∈ Dτ, y ∈ Dσ→τ→τ; then if C is inclusive, Cτ→τ must contain y@z, which is directly circular. Thus an inclusive collection that contains no directly circular elements must consist entirely of noncircular elements.

Since the union of any set of inclusive subcollections of D must itself be inclusive, it follows from the last two results that we can also characterise Noncirc(S) as the largest inclusive subcollection of the domain of S that contains no directly circular elements.

K-structures cannot have noncircular elements, since every inclusive subcollection other than ∅ contains every member of Dτ for every terminal τ, and hence in particular contains the directly circular ⟦λp.p⟧t→t. However, when S is an I-structure, Noncirc(S) can be nontrivial. We can see this by looking again at βη-term structures.

Proposition 6.12. If S is a βη-term structure with base language I Σ′, every circular element is a combinator (i.e. identical to ⟦A⟧ for some pure closed term A).

Proof. We have already seen that the collection of βη-equivalence classes containing impure terms is inclusive. So by Proposition 6.11, to show that it is contained in Noncirc(S) it suffices to show that it does not contain any directly circular elements. For any term A of I Σ′ and constant c in Σ′, define the number of occurrences of c in A, Count(c, A), in the obvious way: Count(c, A) = 0 when A is a variable or constant other than c; Count(c, c) = 1; Count(c, AB) = Count(c, A) + Count(c, B); Count(c, λv.A) = Count(c, A). A straightforward induction shows that for any terms A, B of I Σ′, if v has at least one free occurrence in A, then Count(c, [B/v]A) ≥ Count(c, A) + Count(c, B). Since we are dealing with an I-language, any term C that immediately β-reduces to some term D must be of the form (λv.A)B where v has a free occurrence in A, so for every c ∈ Σ′, we must have Count(c, D) ≥ Count(c, A) + Count(c, B) = Count(c, C).
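The counting step can be made concrete with a small sketch. The Python fragment below is an illustration only and not part of the proof: the term representation (`Var`, `Const`, `App`, `Lam`), the helper names, and the naive substitution (which assumes that no bound variable of A occurs free in B) are all assumptions introduced here. It implements Count as just defined and contracts a root redex (λv.A)B subject to the λI restriction that v occurs free in A.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Const:
    name: str


@dataclass(frozen=True)
class App:
    fn: "Term"
    arg: "Term"


@dataclass(frozen=True)
class Lam:
    var: str
    body: "Term"


Term = Union[Var, Const, App, Lam]


def count(c, t):
    """Count(c, t): the number of occurrences of the constant named c in t."""
    if isinstance(t, Const):
        return 1 if t.name == c else 0
    if isinstance(t, Var):
        return 0
    if isinstance(t, App):
        return count(c, t.fn) + count(c, t.arg)
    return count(c, t.body)  # Lam


def occurs_free(v, t):
    """True if the variable named v has a free occurrence in t."""
    if isinstance(t, Var):
        return t.name == v
    if isinstance(t, Const):
        return False
    if isinstance(t, App):
        return occurs_free(v, t.fn) or occurs_free(v, t.arg)
    return t.var != v and occurs_free(v, t.body)  # Lam


def subst(b, v, t):
    """[b/v]t. Naive: assumes no bound variable of t occurs free in b."""
    if isinstance(t, Var):
        return b if t.name == v else t
    if isinstance(t, Const):
        return t
    if isinstance(t, App):
        return App(subst(b, v, t.fn), subst(b, v, t.arg))
    if t.var == v:  # v is rebound here, so nothing to substitute below
        return t
    return Lam(t.var, subst(b, v, t.body))


def beta_root(t):
    """Contract a root redex (λv.A)B, enforcing the λI restriction on A."""
    if not (isinstance(t, App) and isinstance(t.fn, Lam)):
        raise ValueError("not a root β-redex")
    v, a, b = t.fn.var, t.fn.body, t.arg
    if not occurs_free(v, a):
        raise ValueError("λv.A is not a λI-abstract: v is not free in A")
    return subst(b, v, a)


# Example: C = (λv. f v v) c immediately β-reduces to D = f c c.
C = App(Lam("v", App(App(Const("f"), Var("v")), Var("v"))), Const("c"))
D = beta_root(C)
```

In this example the count of c rises from 1 in C to 2 in D, while the count of f stays at 1, so Count(c, D) ≥ Count(c, C) holds for the immediate β-reduction from C to D.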
It follows from this that the same inequality holds whenever C β-reduces in one step to D. Even more obviously, if C α-reduces or η-reduces in one step to D, Count(c, C) = Count(c, D) for every c. Hence whenever C βη-reduces to D, Count(c, D) ≥ Count(c, C). By the Church-Rosser theorem (Proposition 2.13), when A is in βη-normal form and B ≈βη A, B αβη-reduces to A, so Count(c, A) ≥ Count(c, B) for every c. And by the strong normalisation theorem (Proposition 2.12), every βη-equivalence class of closed terms of I Σ′ contains at least one member A in βη-normal form, and hence one for which Count(c, A) is maximal for every c.

Suppose then that x is a directly circular element of Dτ→τ, i.e. that for some y ∈ Dτ, y = x@y. Let A ∈ x and C ∈ y be in βη-normal form; then we know that AC ≈βη C. But then, for every c, Count(c, A) + Count(c, C) = Count(c, AC) ≤ Count(c, C). This can only be true if, for every c, Count(c, A) = 0: in other words, A is a pure closed term (a member of I ∅(∅)τ→τ). But if so, x is ⟦A⟧τ→τ.

Since the question whether an element is noncircular depends only on the applicative structure, an @-isomorphism from an L-structure S to an L′-structure S′ will map Noncirc(S) to Noncirc(S′). More generally, only elements of Noncirc(S) can be mapped into Noncirc(S′) by an @-homomorphism from S to S′:

Proposition 6.13. Suppose S = ⟨D, ⟦⋅⟧⟩ is an L-structure, S′ = ⟨D′, [⋅]⟩ is an L′-structure, and f is an @-homomorphism from S to S′. Then for any x ∈ Dσ, if f(x) ∈ Noncirc(S′)σ, then x ∈ Noncirc(S)σ.

Proof. Let J be the typed collection defined by Jσ = {x ∈ Dσ : f(x) ∈ Noncirc(S′)σ}. J contains no directly circular elements. For suppose that for x ∈ Dτ→τ and y ∈ Dτ, x@y = y; then f(x)@f(y) = f(x@y) = f(y), so f(x) is a directly circular element of D′τ→τ and hence not in Noncirc(S′)τ→τ, so x is not in Jτ→τ. Also, J is inclusive.
For suppose y ∈ Jσ and x ∈ Dσ→τ; then f(x@y) = f(x)@f(y) must be in Noncirc(S′) since Noncirc(S′) is inclusive and contains f(y), and so x@y is in J. J is thus an inclusive collection with no directly circular elements, which means by Proposition 6.11 that it is contained in Noncirc(S).

Corollary 6.14. If S = ⟨D, ⟦⋅⟧⟩ is an L-structure and C ⊆ D is closed in S, then Noncirc(S) ∩ C ⊆ Noncirc(S_C) (where S_C is the restriction of S to C, see Definition 5.8).

Proof. The identity function on C is an @-homomorphism from S_C to S.

A7 Model existence

We now have the ingredients we need to prove our consistency theorem for OLC. First, observe that the translation of OLC into a functional language is (βη-equivalent to) the following:

OLCσ,τ   ∀xτ∀yσ→τ→τ∀zσ((x ≡τ yzx) → Logicalσ z)

(I have made explicit the initial universal quantifiers.) Let an admissible signature Σ be a logical signature such that for every σ, the constant 'Logicalσ' belongs to Σσ→t. Let Neg be the signature whose only constant is ¬. Then what we want to prove can be stated as follows:

Model existence theorem. (a) For every admissible Σ, there is an extensionally full I Σ-model M in which OLCσ,τ is true for every σ, τ, and in which for any z ∈ Dσ, |⟦Logicalσ z⟧| = 1 only if z = ⟦A⟧ for some closed term A of I ∅. (b) For every admissible Σ, there is an extensionally full I Σ-model M in which OLCσ,τ is true for every σ, τ, and in which for any z ∈ Dσ, |⟦Logicalσ z⟧| = 1 only if z = ⟦A⟧ for some closed term A of I Neg, and in which ⟦λp.¬(¬p)⟧t→t = ⟦λp.p⟧t→t.

We can simplify our goal a little by observing that in any Leibnizean model, and a fortiori in any extensionally full model, |⟦x ≡τ yzx⟧^g| = 1 just in case g(x) = (g(y)@g(z))@g(x), in which case g(z) is circular. So, OLCσ,τ is true in a model just in case for every z ∈ Dσ, either z is noncircular or |⟦Logicalσ z⟧| = 1.
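To see concretely what the circularity condition x = (y@z)@x demands, it may help to check it by brute force in a toy case. The sketch below is an illustration under stated assumptions, not one of the structures used in the proof: it takes the full set-theoretic function hierarchy over {0, 1} at the three types t, t → t and t → t → t, with functions encoded as tuples of outputs (the encoding and all names are hypothetical). In such a full hierarchy every element of type t comes out circular, because the domain of type t → t → t contains the function sending every z to the identity. This agrees with the earlier remark that K-structures cannot have noncircular elements.

```python
from itertools import product

# Toy full function hierarchy over {0, 1} (an assumption for illustration).
D_t = [0, 1]
# Functions t -> t, each encoded as the tuple (f(0), f(1)).
D_tt = list(product(D_t, repeat=2))
# Functions t -> (t -> t), each encoded as (g(0), g(1)) with g(i) in D_tt.
D_ttt = list(product(D_tt, repeat=2))


def apply(f, a):
    """Application @ for the tuple encoding: look up the output at argument a."""
    return f[a]


def directly_circular(z):
    """z in D_tt is directly circular if x = z@x for some x in D_t."""
    return any(apply(z, x) == x for x in D_t)


def circular_at_t(z):
    """z in D_t is circular, witnessed at tau = t, if x = (y@z)@x for some x, y."""
    return any(apply(apply(y, z), x) == x for y in D_ttt for x in D_t)


every_proposition_circular = all(circular_at_t(z) for z in D_t)
negation_has_fixed_point = directly_circular((1, 0))  # (1, 0) encodes 0 -> 1, 1 -> 0
identity_has_fixed_point = directly_circular((0, 1))  # (0, 1) encodes the identity
```

Here `every_proposition_circular` is true, so in this toy hierarchy OLC could hold only if Logical applied to every element of type t; the truth function for negation, by contrast, has no fixed point and so is not directly circular.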
Moreover, we know (by Proposition 5.4) that we can extend the denotation function of any populated extensionally full model to interpret terms containing the constants Logicalσ (and any other constants) and assign them any extensions we wish. So, we will be done if we can establish the following:

(a) There is a populated, extensionally full I Log-model M in which each z ∈ Dσ is either noncircular or identical to ⟦A⟧ for some closed term A of I ∅.

(b) There is a populated, extensionally full I Log-model M in which each z ∈ Dσ is either noncircular or identical to ⟦A⟧ for some closed term A of I Neg, and in which ⟦λp.¬(¬p)⟧t→t = ⟦λp.p⟧t→t.

This helps clarify why part (a) of the theorem can't be strengthened by including the final clause of part (b). If ⟦λp.¬(¬p)⟧ = ⟦λp.p⟧, ⟦¬⟧ is indirectly circular; but ⟦¬⟧ cannot be a combinator in any model, since ⟦λp.p⟧ is the only combinator of type t → t, and the requirement that |⟦¬p⟧| = 1 − |⟦p⟧| rules out the possibility that ⟦¬⟧ = ⟦λp.p⟧. Note however that there is nothing to stop it from being the case that ⟦¬⟧ = ⟦λp.p⟧ in an L-structure; indeed structures where this is the case will be crucial for proving (b).

So that we can address both parts simultaneously, let O be either ∅ or Neg, and let an O-combinator in any L-structure be any element of D denoted by some closed term of I O. Our strategy will be as follows. Step one is to identify a populated I Σ-structure S = ⟨N, [⋅]⟩ satisfying the condition that every circular element is an O-combinator. We have actually already carried out this step in Proposition 6.12, where we saw that in a βη-term I-structure, the only circular elements are combinators. Step two is to construct an extensionally full model M = ⟨D, ⟦⋅⟧, |⋅|⟩ for which there is a homomorphism f from ⟨D, ⟦⋅⟧⟩ to S.
While it will not be true that every circular element of the domain of this model is an O-combinator (since the homomorphism f maps some circular elements that are not O-combinators onto circular O-combinators), this will still be true of all those elements that are denoted by closed terms. So, step three will be to throw away all those elements of D that are circular but not O-combinators, using the method for throwing things away described in Definition 5.8. The result of this final step will be a model meeting our requirements.108

108 Why not proceed more straightforwardly, by simply defining a valuation |⋅| that makes ⟨N, [⋅], |⋅|⟩ an extensionally full model? The answer is that there cannot be an extensionally full model ⟨N, [⋅], |⋅|⟩ in which ⟨N, [⋅]⟩ is a βη-term structure. The proof of this result is omitted here.

The following construction gives us what we need to implement step two of our strategy.

Proposition 7.1. For any populated structure S = ⟨N, [⋅]⟩ for a logical I Σ, there is an extensionally full I Σ-model M = ⟨D, ⟦⋅⟧, |⋅|⟩ such that there is a surjective homomorphism from ⟨D, ⟦⋅⟧⟩ to S.

Proof. We will identify De with Ne and take the members of Dτ (for terminal τ) to be ordered pairs. The second coordinate of each ordered pair (which we can think of as its "nonlogical content") will be a member of Nτ, so the homomorphism mapping Dτ to Nτ can just be the typed function that is the identity on De and maps each ordered pair in Dτ to its second coordinate. The first coordinate of each ordered pair will just be an extension. In particular, the first coordinates of the ordered pairs in Dt will each be 0 or 1, so |⋅| can just be the function mapping each member of Dt to its first coordinate. We allow the two coordinates to recombine freely. To make this precise, let the first and second coordinates of an ordered pair p be denoted by π1(p) and π2(p) respectively; then we construct D, ⟦⋅⟧, and |⋅| respectively as follows:

(1) a. De = Ne.
    b. Dt = {0, 1} × Nt.
    c. Dσ→τ = {π1(x) : x ∈ Dτ}^{Dσ} × Nσ→τ.

(2) a. ⟦v⟧^{v↦σ x}_σ = x for all x ∈ Dσ.
    b. ⟦AB⟧^{g∪h}_τ = ⟨π1(⟦A⟧^g_{σ→τ})(⟦B⟧^h_σ), [AB]^{π2∘(g∪h)}_τ⟩, whenever A ∈ I Σ(V)σ→τ, B ∈ I Σ(V′)σ, g ∈ D[V], h ∈ D[V′], and g and h are compatible.
    c. ⟦λv.A⟧^g_{σ→τ} = ⟨f, [λv.A]^{π2∘g}_{σ→τ}⟩, whenever A ∈ I Σ(V)τ, g ∈ D[V−{v}], v ∈ Vσ, and f is the function such that for all x ∈ Dσ, f(x) = π1(⟦A⟧^{g∪(v↦σ x)}_τ).
    d. ⟦¬⟧ = ⟨f¬, [¬]⟩, ⟦∧⟧ = ⟨f∧, [∧]⟩, ⟦∨⟧ = ⟨f∨, [∨]⟩, ⟦∀σ⟧ = ⟨f∀σ, [∀σ]⟩, and ⟦∃σ⟧ = ⟨f∃σ, [∃σ]⟩, where for all p, q ∈ Dt and x ∈ Dσ→t, f¬(p) = 1 − π1(p), f∧(p)(q) = min{π1(p), π1(q)}, f∨(p)(q) = max{π1(p), π1(q)}, f∀σ(x) = min{π1(x@y) : y ∈ Dσ}, and f∃σ(x) = max{π1(x@y) : y ∈ Dσ}.

(3) |p| = π1(p) for all p ∈ Dt.

Note that since clauses (2b) and (2c) just pass the second coordinate of the argument of ⟦⋅⟧ over to [⋅], a trivial induction suffices to show that π2(⟦A⟧^g_σ) = [A]^{π2∘g}_σ whenever ⟦A⟧^g_σ is defined.

It follows immediately from clauses (2d) and (3) that |⋅| obeys all the conditions for ⟨D, ⟦⋅⟧, |⋅|⟩ to be a model provided that ⟨D, ⟦⋅⟧⟩ is an I Σ-structure. So, to show that it is a model it suffices to verify that ⟨D, ⟦⋅⟧⟩ satisfies conditions (i)–(iii) in the definition of an I Σ-structure. Condition (i) is just clause (2a). For condition (ii), suppose that ⟦A⟧^{g1}_{σ→τ} = ⟦B⟧^{g3}_{σ→τ}, ⟦C⟧^{g2}_σ = ⟦D⟧^{g4}_σ, g1 and g2 are compatible, and g3 and g4 are compatible. Then [A]^{π2∘g1}_{σ→τ} = [B]^{π2∘g3}_{σ→τ}, [C]^{π2∘g2}_σ = [D]^{π2∘g4}_σ, π2∘g1 and π2∘g2 are compatible, and π2∘g3 and π2∘g4 are compatible, so since S satisfies condition (ii), [AC]^{π2∘(g1∪g2)}_τ = [AC]^{(π2∘g1)∪(π2∘g2)}_τ = [BD]^{(π2∘g3)∪(π2∘g4)}_τ = [BD]^{π2∘(g3∪g4)}_τ. Thus by (2b), ⟦AC⟧^{g1∪g2}_τ = ⟨π1(⟦A⟧^{g1}_{σ→τ})(⟦C⟧^{g2}_σ), [AC]^{π2∘(g1∪g2)}_τ⟩ = ⟨π1(⟦B⟧^{g3}_{σ→τ})(⟦D⟧^{g4}_σ), [BD]^{π2∘(g3∪g4)}_τ⟩ = ⟦BD⟧^{g3∪g4}_τ. For condition (iii), it is enough to show that if A βη-reduces in one step to B, π1(⟦A⟧^g_σ) = π1(⟦B⟧^g_σ) whenever defined.
(The second coordinates are the same since S is an I Σ-structure.) We show this by an induction using the definition of one-step βη-reduction.

Base case: A immediately βη-reduces to B. If A immediately η-reduces to B, σ must be of the form ρ → τ, and A is λv.(Bv) for some v ∈ Varρ not free in B. So by (2c) and (2b), π1(⟦A⟧^g_σ) is the function f such that for any x ∈ Dρ, f(x) = π1(⟦Bv⟧^{g∪(v↦ρ x)}_τ) = π1(⟦B⟧^g_{ρ→τ})(⟦v⟧^{v↦ρ x}_ρ) = π1(⟦B⟧^g_{ρ→τ})(x), i.e. f = π1(⟦B⟧^g_{ρ→τ}). If A immediately β-reduces to B, then for some types ρ, τ, v ∈ Varρ, C ∈ I Σ(V ∪ ⟨v, ρ⟩)τ and D ∈ I Σ(U)ρ, A is (λv.C)D and B is [D/v]C. Then by (2c), π1(⟦λv.C⟧^{g⇂V}_{ρ→τ}) is the function f such that for any x ∈ Dρ, f(x) = π1(⟦C⟧^{g⇂V ∪ (v↦ρ x)}_τ). Thus by (2b), π1(⟦A⟧^g_τ) = π1(⟦λv.C⟧^{g⇂V}_{ρ→τ})(⟦D⟧^{g⇂U}_ρ) = f(⟦D⟧^{g⇂U}_ρ) = π1(⟦C⟧^{g⇂V ∪ (v↦ρ ⟦D⟧^{g⇂U}_ρ)}_τ). By the substitution lemma (Lemma 5.2), this is the same as π1(⟦[D/v]C⟧^g_τ) = π1(⟦B⟧^g_τ).

Induction step: suppose that A, B ∈ I Σ(V)σ are such that π1(⟦A⟧^g_σ) = π1(⟦B⟧^g_σ) for every g ∈ D[V]. Then by the above proof that ⟦⋅⟧ satisfies condition (ii), whenever C ∈ I Σ(V′)σ→τ and h ∈ D[V ∪ V′], π1(⟦CA⟧^h_τ) = π1(⟦CB⟧^h_τ). Similarly if σ is of the form ρ → τ and C ∈ I Σ(V′)ρ, π1(⟦AC⟧^h_τ) = π1(⟦BC⟧^h_τ). And finally whenever v ∈ Vρ and x ∈ Dρ, π1(⟦λv.A⟧^{g−{v}}_{ρ→σ})(x) = π1(⟦A⟧^{g[v↦ρ x]}_σ) = π1(⟦B⟧^{g[v↦ρ x]}_σ) = π1(⟦λv.B⟧^{g−{v}}_{ρ→σ})(x).

So, ⟨D, ⟦⋅⟧, |⋅|⟩ is a model. To show that it is extensionally full, note that the first coordinate of any member of Dσ1→⋯→σn→t belongs to (⋯({0, 1}^{Dσn})^{⋰})^{Dσ1}, and for every such function, there is a member of Dσ1→⋯→σn→t with it as first coordinate and any arbitrary member of Nσ1→⋯→σn→t as second coordinate. So, for any Z ⊆ Dσ1 × ⋯ × Dσn, there is an x ∈ Dσ1→⋯→σn→t such that for any ⟨y1, ..., yn⟩ ∈ Dσ1 × ⋯ × Dσn, |⟦xy1...yn⟧| = π1(x)(y1)⋯(yn) = 1 iff ⟨y1, ..., yn⟩ ∈ Z.
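The extensional-fullness claim can be checked by brute force in a tiny instance of the ordered-pair construction. The following sketch is only an illustration: the nonlogical contents `N_t` and `N_tt` are hypothetical placeholders standing in for members of Nt and Nt→t, and only the types t and t → t are built. It constructs Dt = {0, 1} × Nt and Dt→t = {0, 1}^{Dt} × Nt→t and confirms that every subset of Dt is the extension of some member of Dt→t, whichever second coordinate is chosen.

```python
from itertools import product

# Hypothetical nonlogical contents (placeholders for members of N_t and N_{t->t}).
N_t = ["a", "b"]
N_tt = ["m", "n"]

# Clause (1b): D_t = {0, 1} x N_t.
D_t = [(v, n) for v in (0, 1) for n in N_t]

# Clause (1c) at type t -> t: the first coordinate is any function D_t -> {0, 1},
# encoded as a tuple of outputs aligned with the order of D_t; the second
# coordinate is any member of N_tt. The two coordinates recombine freely.
D_tt = [(f, n) for f in product((0, 1), repeat=len(D_t)) for n in N_tt]


def extension(x):
    """The set of y in D_t such that |x@y| = pi_1(x)(y) = 1."""
    first = x[0]
    return frozenset(y for y, out in zip(D_t, first) if out == 1)


# Every subset of D_t should occur as an extension ...
extensions = {extension(x) for x in D_tt}
# ... and should do so with each available second coordinate.
extension_content_pairs = {(extension(x), x[1]) for x in D_tt}
```

With four elements in Dt there are 16 subsets, and `extensions` contains all 16; pairing each with either of the two second coordinates gives the 32 members of Dt→t.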
Finally, we have already noted that the typed function that is the identity on De and maps each member of Dτ to its second coordinate satisfies the condition to be a homomorphism from D to N. This homomorphism is surjective because every element of Nτ is the second element of at least one member of Dτ.

We can use Proposition 7.1 to construct an extensionally full model in which every circular element denoted by a closed term is an O-combinator.

Proposition 7.2. When O is ∅ or Neg, there is an extensionally full I Log-model M = ⟨D, ⟦⋅⟧, |⋅|⟩ such that for any closed A, either ⟦A⟧ is in Noncirc(⟨D, ⟦⋅⟧⟩) or ⟦A⟧ = ⟦B⟧ for some B ∈ I O. Moreover, if O = Neg, ⟦λp.¬¬p⟧ = ⟦λp.p⟧.

Proof. Choose any Σ such that I ∅(∅)σ is a proper subset of I Σ(∅)σ for every type σ. (For example, this will be the case if Σ is populated.) Let S− = ⟨N, [⋅]−⟩ be the unique βη-term I ∅-structure with base language I Σ (see Proposition 6.8). Choose a typed function I from Log to N such that if c ∈ Logσ and c ∉ Oσ, I(c) is an impure element of Nσ (a βη-equivalence class each of whose members contains at least one constant in Σ), while if c ∈ Oσ (i.e. c = ¬ and O = Neg and σ = t → t), I(c) is the βη-equivalence class containing λp.p. Let S = ⟨N, [⋅]⟩ be the I Log-structure that results from reinterpreting S− with I (see Definition 5.3). Finally let M = ⟨D, ⟦⋅⟧, |⋅|⟩ be an extensionally full I Log-model, and f a surjective homomorphism from ⟨D, ⟦⋅⟧⟩ to S: such an M and f exist by Proposition 7.1. Since f is a homomorphism, it is an @-homomorphism, so by Proposition 6.13, x is in Noncirc(⟨D, ⟦⋅⟧⟩) whenever f(x) is in Noncirc(S). But for any closed term A of I Log that contains at least one constant not in O, f(⟦A⟧) = [A] is a βη-equivalence class of impure terms, and hence in Noncirc(S) by Proposition 6.12; so ⟦A⟧ ∈ Noncirc(⟨D, ⟦⋅⟧⟩).
Moreover, if O is Neg and we built M using the ordered-pair construction of Proposition 7.1, ⟦λp.¬¬p⟧ will be the same as ⟦λp.p⟧, since their second coordinates [λp.¬¬p] and [λp.p] are the same, and their first coordinates are also the same, namely the function mapping each ordered pair in {0, 1} × Nt to its first coordinate.

Finally comes step three, where we throw away everything in the domain of M that is circular but not an O-combinator. This is possible because the typed collection of elements that are either noncircular or O-combinators is closed, and since M is extensionally full, its domain contains a family with this typed collection as its extension.

Proposition 7.3. There exists an extensionally full I Log-model M = ⟨D, ⟦⋅⟧, |⋅|⟩ such that for every x ∈ Dσ, either x = ⟦A⟧ for some closed term A of I O, or x is in Noncirc(⟨D, ⟦⋅⟧⟩). Moreover, if O is Neg, there is a model meeting these conditions in which ⟦λp.¬¬p⟧ = ⟦λp.p⟧.

Proof. Let M = ⟨D, ⟦⋅⟧, |⋅|⟩ be an extensionally full I Log-model such that for any closed A, ⟦A⟧ is either in Noncirc(⟨D, ⟦⋅⟧⟩) or an O-combinator, and such that if O = Neg, ⟦λp.¬¬p⟧ = ⟦λp.p⟧; we know from Proposition 7.2 that such a model can always be found. Let C be the subcollection of D such that x ∈ Cσ iff x is either noncircular or an O-combinator. Since C is the union of an inclusive collection and an @-closed one, it is @-closed by Proposition 6.4. Moreover, C contains ⟦A⟧ for every closed term A, so by Proposition 6.5, C is closed. Since M is extensionally full, we know that for each type σ, there exists some x ∈ Dσ→t such that for any y ∈ Dσ, |x@y| = 1 iff y ∈ Cσ. For any such x ∈ Dσ→t, there is a noncircular x′ that has the same extension as x: for any p ∈ Noncirc(M)t, we can take x′ = ⟦λyσ.xy ∧ (p ∨ ¬p)⟧. (Noncirc(M)t cannot be empty: since ∃p(p) is not a term of I O, ⟦∃p(p)⟧ is noncircular.) So we can choose a typed family F such that for every σ, Fσ is a noncircular element of Dσ→t whose extension is Cσ.
F thus has a closed extension; it is also self-contained, since being noncircular, Fσ belongs to Cσ→t. So by Proposition 5.9, there is a model MF = ⟨C, ⟦⋅⟧F, |⋅|F⟩, the restriction of M by F. The restricted structure ⟨C, ⟦⋅⟧F⟩ is an @-substructure of ⟨D, ⟦⋅⟧⟩. So by Corollary 6.14, every noncircular element of ⟨D, ⟦⋅⟧⟩ that is in C is a noncircular element of ⟨C, ⟦⋅⟧F⟩. And every circular element of D that is in C is ⟦A⟧ for some closed term A of I O, in which case it is also ⟦A⟧F, since ⟦⋅⟧ and ⟦⋅⟧F agree on I O. So, MF is a model in which every circular element is an O-combinator.

MF is also extensionally full. For suppose Z ⊆ Cσ1 × ⋯ × Cσn. Then Z ⊆ Dσ1 × ⋯ × Dσn, so by the extensional fullness of M, there is some x ∈ Dσ1→⋯→σn→t such that |⟦xy1...yn⟧| = 1 exactly when ⟨y1, ..., yn⟩ ∈ Z. By the fact noted above, this x has the same extension as some noncircular x′; being noncircular, x′ also belongs to Cσ1→⋯→σn→t. So for any ⟨y1, ..., yn⟩ ∈ Cσ1 × ⋯ × Cσn, |⟦x′y1...yn⟧F|F = |⟦x′y1...yn⟧| = |⟦xy1...yn⟧| = 1 iff ⟨y1, ..., yn⟩ ∈ Z.

Finally, if O is Neg, ⟦λp.¬¬p⟧F = ⟦λp.¬¬p⟧ = ⟦λp.p⟧ = ⟦λp.p⟧F, since ⟦⋅⟧F coincides with ⟦⋅⟧ on quantifier-free terms. This concludes the proof of the theorem.

A8 Extensions

The proof from §A7 can also be extended to show the consistency of OLC with several other principles, some of which look like attractive strengthenings of the theory.

(i) Part (b) of the theorem establishes the consistency of OLC with Involution. As noted in §7, once we have Involution we will almost certainly want De Morgan too. We can achieve this simply by choosing our βη-term structure ⟨N, [⋅]⟩ to be one in which [∧] = [∨]. (For example, we could set both [∧] and [∨] to be the βη-equivalence class containing a certain constant ∗ of the base language.) Given that [¬] = [λp.p], this ensures that [λp.λq.¬p ∧ ¬q] = [λp.λq.¬(p ∨ q)] and [λp.λq.¬p ∨ ¬q] = [λp.λq.¬(p ∧ q)].
Moreover, the first coordinates of the denotations of each of these pairs are also guaranteed to be the same since they are just the corresponding truth functions. Thus their denotations in M, and hence also in MF, will also be the same.

(ii) For every terminal type τ, we can inductively define "lifted" negation, disjunction, and conjunction operators ¬τ, ∧τ, ∨τ in the obvious way, i.e. ¬t = ¬; ∧t = ∧; ∨t = ∨; and

¬σ→τ = λxσ→τ.λzσ.¬τ(xz)
∧σ→τ = λxσ→τ.λyσ→τ.λzσ.xz ∧τ yz
∨σ→τ = λxσ→τ.λyσ→τ.λzσ.xz ∨τ yz

Using these, we can define operators of "logical equivalence" and "nonlogical equivalence", L∼τ and N∼τ:

L∼τ =df λxτ.λyτ.(x ∧τ y) ≡τ (x ∨τ y)
N∼τ =df λxτ.λyτ.(x ∧τ ¬x) ≡τ (y ∧τ ¬y)

In ordered-pair models based on βη-term structures, |⟦x N∼τ y⟧| will be 1 exactly when π2(x) = π2(y), since π1(⟦x ∧τ ¬x⟧) = π1(⟦y ∧τ ¬y⟧) for any x, y, while π2(⟦x ∧τ ¬x⟧^g) = π2(⟦y ∧τ ¬y⟧^g) only when π2(x) = π2(y). Moreover, if we base the model on a βη-term structure in which [∧] is identical to [∨], |⟦x L∼τ y⟧| will be 1 exactly when π1(x) = π1(y), since π2(⟦x ∧τ y⟧) = π2(⟦x ∨τ y⟧) for any x, y ∈ Dτ. This means that L∼τ will behave as an equivalence relation in these models.109 Moreover, each of the following principles will hold:

Two-dimensionality: (x N∼τ y) ∧ (x L∼τ y) → (x ≡τ y)
Weak ∧-Commutativity: (x ∧τ y) L∼τ (y ∧τ x)
Weak ∧∨-Distributivity: (x ∧τ (y ∨τ z)) L∼τ (x ∧τ y) ∨τ (x ∧τ z)
Weak ∧∨-Dissolution: (x ∧τ (y ∨τ ¬τy)) L∼τ x

We also get dual versions of the last three. All of this seems rather nice.

109 N∼τ is automatically an equivalence relation in every model.

However, these models also validate the following principle:

Weak Extensionality: (p ↔ q) → (p L∼t q)

And this is just crazy.
Snow is white if and only if grass is green, but it is certainly not true that for it to be the case that snow is white and grass is green is for it to be the case that snow is white or grass is green, since the latter but not the former would be the case if snow were white and grass were red. Fortunately, the construction can be modified in such a way as to invalidate Weak Extensionality while keeping the things that seemed nice. In the modified construction, an arbitrary complete Boolean algebra B takes over the role of {0, 1} in Proposition 7.1, with the meet and join operations ⨅ and ⨆ of B replacing the minimum and maximum operations in the definitions of ⟦∧⟧, ⟦∨⟧, ⟦∀⟧, and ⟦∃⟧, and the complementation operation ′ of B replacing the "subtract from 1" operation in the definition of ⟦¬⟧. The valuation |⋅| is then fixed by any ultrafilter on B, i.e. a mapping f from B to {0, 1} such that for any X ⊆ B, f(⨆X) = max{f(x) : x ∈ X} and f(⨅X) = min{f(x) : x ∈ X}, and for any x in B, f(x′) = 1 − f(x).110 This change does not disrupt the result that |⟦x L∼τ y⟧| is 1 exactly when π1(x) = π1(y) and |⟦x N∼τ y⟧| is 1 exactly when π2(x) = π2(y), since in any Boolean algebra, a ⊔ b = a ⊓ b only when a = b, while a ⊓ a′ = b ⊓ b′ for any a and b. So L∼τ still behaves as an equivalence relation, Two-dimensionality still holds, and we still get the weakened Boolean axioms as consequences of the corresponding identities in B.

(iii) To show that OLC is consistent with Functionality, note first that any βη-term structure whose base language contains infinitely many constants in every type is functional: when x ≠ y ∈ Dσ→τ and z is the βη-equivalence class of a constant that does not occur in the members of x or the members of y, xz ≠ yz (BBK, lemma 3.14).
Moreover, if the input structure ⟨N, [⋅]⟩ is functional, the ordered-pair model constructed at step two will be functional too, since when x ≠ y ∈ Dσ→τ, either π2(x) ≠ π2(y), in which case ⟦xz⟧ ≠ ⟦yz⟧ for any z such that π2(x)@π2(z) ≠ π2(y)@π2(z), or else π1(x) ≠ π1(y), in which case there is some z ∈ Dσ such that π1(⟦xz⟧) = π1(x)(z) ≠ π1(y)(z) = π1(⟦yz⟧). However, if all we do at step three is, as before, to throw away elements in the domain that are circular but not O-combinators, there is no guarantee that the final model will still be functional, since it could happen that although x ≠ y ∈ Dσ→τ do not get thrown away, each z ∈ Dσ such that x@z ≠ y@z does get thrown away. So to preserve functionality, we need to do something more complicated at step three instead. Briefly: define a partial equivalence relation ∼σ on each Dσ as follows: x ∼e y iff x = y and π2(x) is noncircular; x ∼t y iff x = y and π2(x) is noncircular; x ∼σ→τ y iff whenever z ∼σ w, xz ∼τ yw, and each of x and y either has a noncircular second coordinate, or is ⟦A⟧ for some closed I O-term A. Instead of just throwing away all circular elements that are not O-combinators, we will go further by throwing away all elements x such that it is not the case that x ∼ x. Using the functionality of ⟨N, [⋅]⟩ we can show that π2(x) = π2(y) whenever x ∼ y. Also, ∼ is a congruence on this model, since ⟦A⟧ ∼ ⟦A⟧ for every closed I Log term A. So we can take the quotient by ∼. The resulting model is functional; and since each equivalence class consists of ordered pairs with the same second coordinate, there is a homomorphism from it to ⟨N, [⋅]⟩; thus the only circular elements in the quotient model are those whose second coordinates are circular in ⟨N, [⋅]⟩, all of which are guaranteed to be denoted by closed I O-terms.

110 If we require B to be atomic, we can call its atoms the "logically possible worlds" and the unique atom w such that |w| = 1 the "actual world" of the model.
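Returning briefly to the Boolean-valued modification in (ii), its effect on Weak Extensionality can be seen in the smallest non-trivial case. The sketch below is illustrative only and models just the first coordinates of propositions: B is taken to be the powerset algebra on two hypothetical "worlds", and the valuation is the principal ultrafilter at a designated actual world (all names here are assumptions introduced for the example). Two propositions can then receive the same truth value, so that their biconditional is true, while their meet and join differ, so that the first coordinates of their conjunction and disjunction differ.

```python
from itertools import combinations

# Hypothetical two-world powerset algebra standing in for B.
WORLDS = ("w1", "w2")
ACTUAL = "w1"


def powerset(xs):
    """All subsets of xs, as frozensets."""
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]


B = powerset(WORLDS)


def meet(a, b):
    return a & b


def join(a, b):
    return a | b


def comp(a):
    return frozenset(WORLDS) - a


def val(a):
    """Principal ultrafilter at ACTUAL, viewed as a map from B to {0, 1}."""
    return 1 if ACTUAL in a else 0


def biconditional_true(p, q):
    """In this sketch, p <-> q is true exactly when p and q get the same value."""
    return val(p) == val(q)


def first_coordinates_agree(p, q):
    """Whether the first coordinates of p-and-q and p-or-q coincide."""
    return meet(p, q) == join(p, q)


p = frozenset({"w1"})
q = frozenset({"w1", "w2"})
counterexample = biconditional_true(p, q) and not first_coordinates_agree(p, q)
```

Here `counterexample` is true: p and q are both true at the actual world, but p ⊓ q is {w1} while p ⊔ q is {w1, w2}. With {0, 1} in place of B no such pair exists, since min and max of two equal truth values coincide.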
(iv) Because our models are built on top of βη-term structures, Commutativity fails in them.111 But we can secure Commutativity if we build the non-logical domains N using equivalence classes, not under βη-equivalence, but under a weaker equivalence relation that allows substitution of φ ∗ ψ for ψ ∗ φ as well as substitution of terms one of which immediately βη-reduces to the other. (Here ∗ is the constant whose βη-equivalence class is [∧] and [∨].) Similarly, if we want Associativity, we can get it by weakening the equivalence relation to allow substitution of φ ∗ (ψ ∗ θ) for (φ ∗ ψ) ∗ θ. Neither modification will affect the proof of Proposition 6.12, since these substitutions do not change the number of occurrences of any constant c in the base language. Because of this, every equivalence class will still contain a term A such that Count(c, A) ≥ Count(c, B) for every other B in the equivalence class and every constant c, which is what we need to prove that CD is never equivalent to D unless C is pure. By contrast, we cannot similarly extend the list of permitted substitutions to secure Distributivity, since (as pointed out in note 73) a sequence of Distributivity-licensed substitutions can increase the number of occurrences of a constant without limit.112

111 Instead they validate ((p ∧ q) ≡ (q ∧ p)) → p L∼ q.

112 Note however that if we move to a transfinite relational type theory including α-adic conjunction and disjunction operators ∧α and ∨α for both finite and transfinite ordinals α, the natural generalisations of commutativity and associativity will be inconsistent with the natural generalisation of OLC. Generalised associativity would imply that for any p, ∧2(p, ∧ω(p, p, ...)) ≡ ∧ω(p, p, ...), which is ruled out by OLC when p is nonlogical. And generalised commutativity for a conjunction operator ∧2ω would imply that for any p, λx1x2....(∧2ω(x1, x2, ..., p, p...)) ≡ λx1x2....(∧2ω(p, x1, x2, ..., p, p...)); but transfinite OLC rules this out, since the term on the right can be derived from the term on the left and p (which may be nonlogical) by plugging them into the first two argument places of λzqx1x2....z(q, x1, x2, ...). How bad is this? Giving up Associativity seems tolerable, but I admit I am more worried about Commutativity. Section 7 tentatively suggested that the binary version of Commutativity might be supported by Ramsey-style thought experiments involving nonlinear languages. However, much more work would need to be done to properly flesh out such an argument and generalise it to the transfinite setting, since it is not at all clear how to imagine the nonlinear elements of the language coexisting harmoniously with transfinite applications and abstracts.

(v) Since everything in Dt is noncircular, our models validate φ ≢ φ ∧ φ and φ ≢ φ ∨ φ without any exception for logical φ. As I mentioned in §8, I am not sure whether this is a good thing or not; the worry is that it may seem invidious for ⟦¬⟧ to be circular when ⟦∧⟧, ⟦∨⟧, ⟦∀σ⟧, and ⟦∃σ⟧ are noncircular. Either way, it would be interesting to find a model of OLC in which the denotations of all logical constants are circular. However, there is no easy way to adapt our construction so as to give the quantifiers circular denotations: even if we managed to make them circular in the initial M, the final pruned model MF modifies the denotations of the quantifiers by adding the restrictions by F, and we needed to make each Fσ noncircular to guarantee that F would be internally closed, as it must be for MF to be a well-defined model. So, if we wanted to show the consistency of OLC with schemas like p ≡ p ∧ ∃q(q), we would need a rather different kind of construction.

References

Andrews, Peter (1972). 'General Models and Extensionality'.
Journal of Symbolic Logic 37, pp. 395–7.
Audi, Paul (2012). 'Grounding: Toward a Theory of the In-Virtue-of Relation'. Journal of Philosophy 109.12, pp. 685–711.
Bacon, Andrew (MS). 'The Broadest Necessity'. MS.
Bacon, Andrew, John Hawthorne and Gabriel Uzquiano (2016). 'Higher-Order Free Logic and the Prior-Kaplan Paradox'. Canadian Journal of Philosophy 46.4-5, pp. 493–541.
Bacon, Andrew and Jeffrey Sanford Russell (MS). 'The Logic of Opacity'. MS.
Bealer, George (1982). Quality and Concept. Oxford: Oxford University Press.
Benzmüller, Christoph, Chad E. Brown and Michael Kohlhase (2004). 'Higher-Order Semantics and Extensionality'. Journal of Symbolic Logic 69.4, pp. 1027–88.
Braun, David (1988). 'Understanding Belief Reports'. Philosophical Review 107, pp. 555–95.
Carnap, Rudolf (1928). Der Logische Aufbau der Welt. Trans. by R. A. George.
Chierchia, Gennaro (1989). 'Anaphora and Attitudes De Se'. In Language in Context, ed. Renate Bartsch, J. F. A. K. van Benthem and P. van Emde Boas. Dordrecht: Foris, pp. 1–31.
Correia, Fabrice (2010). 'Grounding and Truth-functions'. Logique et Analyse 53, pp. 211–51.
- (2016). 'On the Logic of Factual Equivalence'. Review of Symbolic Logic 9, pp. 103–22.
Correia, Fabrice and Alexander Skiles (MS). 'Grounding, Essence, and Identity'. MS.
Cresswell, Maxwell J. (1965). 'Another Basis for S4'. Logique et Analyse 8, pp. 191–5.
- (1985). Structured Meanings. Cambridge, MA: MIT Press.
Dorr, Cian (2004). 'Non-Symmetric Relations'. In Oxford Studies in Metaphysics, vol. 1, ed. Dean Zimmerman. Oxford: Oxford University Press, pp. 155–92.
- (2005). 'What We Disagree About When We Disagree About Ontology'. In Fictionalist Approaches to Metaphysics, ed. Mark Kalderon. Oxford: Oxford University Press, pp. 234–86.
- (2007). 'There Are No Abstract Objects'. In Contemporary Debates in Metaphysics, ed. Theodore Sider, John Hawthorne and Dean Zimmerman. Malden, Mass.: Wiley-Blackwell, pp. 32–64.
Dorr, Cian (2014a). 'Quantifier Variance and the Collapse Theorems'. The Monist 97, pp. 503–70.
- (2014b). 'Review of Rayo 2013'. Notre Dame Philosophical Reviews 2014.06.33.
- (2014c). 'Transparency and the Context-sensitivity of Attitude Reports'. In Empty Representations: Reference and Non-existence, ed. Manuel García-Carpintero and Genoveva Martí. Oxford: Oxford University Press, pp. 25–66.
Dorr, Cian and John Hawthorne (2013). 'Naturalness'. In Oxford Studies in Metaphysics, vol. 8, ed. Karen Bennett and Dean Zimmerman. Oxford: Oxford University Press, pp. 3–77.
- (2014). 'Semantic Plasticity'. Philosophical Review 123, pp. 281–338.
Elbourne, Paul (2005). Situations and Individuals. Cambridge, MA: MIT Press.
Fine, Kit (2012). 'Guide to Ground'. In Metaphysical Grounding, ed. Fabrice Correia and Benjamin Schnieder. Cambridge: Cambridge University Press, pp. 37–80.
- (2015). 'Unified Foundations for Essence and Ground'. Journal of the American Philosophical Association 1.2, pp. 296–311.
Fritz, Peter and Jeremy Goodman (forthcoming). 'Higher-Order Contingentism, Part 1: Closure and Generation'. Journal of Philosophical Logic. Forthcoming.
Goodman, Jeremy (MS). 'Theories of Aboutness'. MS.
- (forthcoming). 'Reality Is Not Structured'. Analysis. Forthcoming.
- (2016). 'An Argument for Necessitism'. Philosophical Perspectives this volume.
Goodman, Nelson (1954). Fact, Fiction and Forecast. Fourth. Cambridge, MA: Harvard University Press.
Graff, Delia (2001). 'Descriptions as Predicates'. Philosophical Studies 102, pp. 1–42.
Henkin, Leon (1950). 'Completeness in the Theory of Types'. The Journal of Symbolic Logic 15.
Hindley, J. Roger and Jonathan P. Seldin (2008). Lambda Calculus and Combinators: An Introduction. Cambridge: Cambridge University Press.
Hodes, Harold T. (2015). 'Why Ramify?' Notre Dame Journal of Formal Logic 56.2, pp. 379–415.
Huntingdon, Edward V. (1904). 'Sets of Independent Postulates for the Algebra of Logic'. Transactions of the American Mathematical Society 5.3, pp. 288–309.
Khamara, E.J. (1988). 'Indiscernibles and the Absolute Theory of Space and Time'. Studia Leibnitiana 20, pp. 140–59.
Kripke, Saul (1972). Naming and Necessity. revised. Cambridge, MA: Harvard University Press.
Kusumoto, Kiyomi (1999). 'Tense in Embedded Contexts'. PhD thesis. University of Massachusetts, Amherst.
Lewis, David (1970). 'General Semantics'. Synthese 22, pp. 18–67.
- (1986). On the Plurality of Worlds. Oxford: Blackwell.
Muskens, Reinhard (2007). 'Intensional Models for the Theory of Types'. Journal of Symbolic Logic 72, pp. 98–118.
Myhill, John (1958). 'Problems Arising in the Formalization of Intensional Logic'. Logique et Analyse 1, pp. 78–83.
Plantinga, Alvin (1983). 'On Existentialism'. Philosophical Studies 44, pp. 1–20.
Plate, Jan (2016). 'Logically Simple Properties and Relations'. Philosopher's Imprint 16, pp. 1–40.
Prior, A. N. (1964). 'Conjunction and Contonktion Revisited'. Analysis 24, pp. 191–5.
- (1971). Objects of Thought. Ed. Peter T. Geach and Anthony J. P. Kenny. Oxford: Clarendon Press.
Ramsey, F. P. (1926). 'The Foundations of Mathematics'. Proceedings of the London Mathematical Society. Series 2 25, pp. 338–84.
- (1927). 'Facts and Propositions'. Proceedings of the Aristotelian Society 7, pp. 153–70.
Rayo, Agustín (2013). The Construction of Logical Space. Oxford: Oxford University Press.
Rosen, Gideon (2010). 'Metaphysical Dependence: Grounding and Reduction'. In Modality: Metaphysics, Logic, and Epistemology, ed. Bob Hale and Aviv Hoffmann. Oxford: Oxford University Press.
Russell, Bertrand (1903). The Principles of Mathematics. London: Routledge.
Salmon, Nathan (1986a). Frege's Puzzle. Cambridge, MA: MIT Press.
- (1986b). 'Reflexivity'. Notre Dame Journal of Formal Logic 27, pp. 401–29.
- (2010). 'Lambda in Sentences with Designators'. Journal of Philosophy 107, pp. 445–68.
Saul, Jennifer (2007). Simple Sentences, Substitution, and Intuitions. Oxford: Oxford University Press.
Schiffer, Stephen (1979). 'Naming and Knowing'. In Midwest Studies in Philosophy II: Contemporary Perspectives in the Philosophy of Language, ed. Peter French, Theodore E. Uehling Jr. and Howard K. Wettstein. Minneapolis: University of Minnesota Press, pp. 28–41.
Setiya, Kieran (2007). Reasons Without Rationalism. Princeton: Princeton University Press.
Shoemaker, Sydney (1998). 'Causal and Metaphysical Necessity'. Pacific Philosophical Quarterly 79, pp. 59–77.
Sider, Theodore (2011). Writing the Book of the World. Oxford: Oxford University Press.
Simons, Mandy et al. (2010). 'What Projects and Why'. Proceedings of SALT 20, pp. 309–27.
Soames, Scott (1987a). 'Direct Reference, Propositional Attitudes, and Semantic Content'. Philosophical Topics 15, pp. 47–87.
- (1987b). 'Substitutivity'. In On Being and Saying: Essays for Richard Cartwright, ed. Judith Jarvis Thomson. Cambridge: MIT Press, pp. 99–132.
Sorensen, Roy A. (1999). 'Mirror Notation: Symbol Manipulation Without Inscription Manipulation'. Journal of Philosophical Logic 28, pp. 141–64.
Stalnaker, Robert C. (1977). 'Complex Predicates'. The Monist 60, pp. 327–39.
- (1994). 'The Interaction of Modality with Quantification and Identity'. In Ways a World Might Be. Oxford: Oxford University Press, pp. 144–61.
- (1999). Context and Content: Essays on Intentionality in Speech and Thought. Oxford: Oxford University Press.
Suszko, Roman (1975). 'Abolition of the Fregean Axiom'. Lecture Notes in Mathematics 453, pp. 169–239.
Whitehead, Alfred North and Bertrand Russell (1910). Principia Mathematica. Second. Cambridge: Cambridge University Press.
Williamson, Timothy (1985). 'Converse Relations'. Philosophical Review 94, pp. 249–62.
- (2003). 'Everything'. In Philosophical Perspectives 17: Language and Philosophical Linguistics, ed. John Hawthorne and Dean Zimmerman. Oxford: Blackwell, pp. 415–65.
- (2013). Modal Logic as Metaphysics. Oxford: Oxford University Press.
- (2016). 'Abductive Philosophy'. Philosophical Forum 47, pp. 263–80.