DISARMING A PARADOX OF VALIDITY

HARTRY FIELD

Abstract. Any theory of truth must find a way around Curry's paradox, and there are well-known ways to do so. This paper concerns an apparently analogous paradox, about validity rather than truth, which JC Beall and Julien Murzi ("Two Flavors of Curry's Paradox") call the v-Curry. They argue that there are reasons to want a common solution to it and the standard Curry paradox, and that this rules out the solutions to the latter offered by most "naive truth theorists". To this end they recommend a radical solution to both paradoxes, involving a substructural logic, in particular one without structural contraction. In this paper I argue that substructuralism is unnecessary. Diagnosing the "v-Curry" is complicated because of a multiplicity of readings of the principles it relies on. But these principles are not analogous to the principles of naive truth, and taken together there is no reading of them that should have much appeal to anyone who has absorbed the morals of both the ordinary Curry paradox and the Second Incompleteness Theorem.

Key words and phrases. Curry paradox, substructural logic, validity.

1. Introduction

Any theory of truth must deal in some way with Curry's paradox: what JC Beall and Julien Murzi [2] designate the c-Curry, where the prefix 'c-' indicates that the paradox involves a conditional connective. The paradox is commonly dealt with either by restriction on some of the usual assumptions about truth, or by restriction on some of the usual rules or metarules of classical logic, or both. Beall and Murzi argue that there is an analogous paradox that they call the v-Curry, that doesn't involve a conditional at all but rather involves a validity predicate; and they argue that the most popular solutions to the c-Curry don't extend to the v-Curry.
They argue that we do need a common solution, and propose a radical one, involving a logic without structural contraction: roughly speaking, a logic where a conclusion B can follow from an assumption A taken twice without it following from A taken once.

The c-Curry is a very deep paradox that has important consequences for truth, logic or both; but I am skeptical that there is anything new to be learned from the v-Curry. More fully: the notion of validity can be understood in distinct ways; on some ways of understanding it, the only morals of the v-Curry are ones we should have learned long ago from Gödel, whereas on other ways of understanding it, any way of handling the paradoxes of truth and satisfaction handles the v-Curry automatically. In either case, there is no need of anything nearly so drastic as a logic without structural contraction. I argued for this previously ([5]), but that was prior to Beall and Murzi's elegant presentation of the paradox, their diagnosis of structural contraction as the culprit, and their defense of the two key assumptions other than structural contraction on which the argument rests. So it is worth another look. And I will include a more explicit discussion of several respects in which the apparent analogy of the v-Curry to the c-Curry and other truth paradoxes is superficial. (The main disanalogy, and my main critique of the Beall and Murzi paper, is in Section 4.)

2. The c-Curry

The idea of a Curry paradox is that there seems to be a simple way of proving any sentence that one likes (or doesn't like). In the case of the ordinary c-Curry paradox, it goes like this. Let B abbreviate a sentence that one wants to "prove" (for instance, that I will win the lottery tomorrow), and let KB be a sentence of form

  t0 is true → B,

where '→' is a conditional and t0 abbreviates a singular term that denotes KB.
As is well-known, there are natural ways in which such self-referential sentences can arise.¹ Then one way to run the Curry argument is in two stages.

¹ E.g. someone in a certain room with two blackboards on different walls might fill up one blackboard, and then write on the other: If the last sentence written on the blackboard on the east wall of this room is true then B. This would be especially natural if the person thinks that the first blackboard was on the east wall. But if he has his directions confused and it is the second blackboard that is on the east wall, then the person has unintentionally produced a sentence meeting the requirements for KB.

The first stage will use Modus Ponens and the plausible principle

  (True-Elim): t = 〈A〉, True(t) ⊢ A,

where t is any abbreviation of a singular term and 〈A〉 is a standard name of A. '⊢' represents intuitive reasoning, the sort of thing sometimes written less compactly as

  t = 〈A〉    True(t)
  ------------------
          A

We might think of it as indicating a legitimate conditional assertion of the conclusion given the premises, or a legitimate intuitive argument of the conclusion from the premises, or a legitimate drawing of the conclusion as a consequence of the premises; in the last case, it can't be narrowly logical consequence (e.g. consequence in first order logic), but might be something like "consequence in the logic of truth". (The exact interpretation of the '⊢', or the horizontal line in the less compact notation, will become relevant later.) Modus Ponens is the principle

  (MP): A, A → B ⊢ B,

so the reading of Modus Ponens depends on the reading of '⊢'.

The second stage of the c-Curry argument will also use Modus Ponens, but in addition will use the metarule of Conditional Proof and the plausible principle

  (True-Introd): t = 〈A〉, A ⊢ True(t).

(Actually we don't really need quite this, just the metarule

  (TI Metarule): If ⊢ A, then t = 〈A〉 ⊢ True(t),

which (assuming transitivity of ⊢, as I shall) follows from True-Introd.) Conditional Proof is the metarule

  (CP): If A ⊢ B then ⊢ A → B;

again, the reading of this depends on the reading of '⊢'.

To avoid having to carry the assumption t0 = 〈KB〉 as an explicit antecedent to '⊢' throughout the discussion, let's assume ⊢ t0 = 〈KB〉. (On the consequence interpretation of '⊢', this simplification prevents the argument as I'll state it from applying to "contingent self-reference" like that in note 1; but even if '⊢' is read as consequence, we can either rewrite what follows without the simplification, or stick to self-reference produced by Gödel-Tarski diagonalization.)² Given this simplifying assumption, True-Elim and True-Introd yield True(t0) ⊢ KB and KB ⊢ True(t0) (and the TI Metarule yields that if ⊢ KB then ⊢ True(t0)).

² The latter does require that elementary syntactic reasoning (or number-theoretic reasoning, when syntax is developed within number theory) be allowed in the reasoning represented by '⊢'. But this seems harmless since reasoning about (sentential) truth would in any case need to be included, and one can't reason about truth in any significant way without being able to reason in a minimal way about the bearers of truth.

Now for the c-Curry argument:

Stage 1: Assume for the sake of argument that True(t0). Then by True-Elim we get KB, that is, True(t0) → B. But since we're assuming for the sake of argument that True(t0), we can infer B, using Modus Ponens. The derivation in Stage 1 is under the supposition that True(t0); so the upshot of Stage 1 is: True(t0) ⊢ B.

Stage 2: Applying the rule of Conditional Proof to the upshot of Stage 1, we get ⊢ True(t0) → B. But that's ⊢ KB, so by True-Introd, or just the TI Metarule, we get ⊢ True(t0). And now by Modus Ponens again, we get ⊢ B. The derivation in Stage 2 is not under the scope of a supposition, it is an absolute derivation of B. ∎

Of course, something has gone wrong, but it isn't obvious what. One possible diagnosis is that one or both of the truth rules needs restriction. Another possible diagnosis is that Conditional Proof needs restriction. A third is that Modus Ponens needs restriction. I'll call these the standard diagnoses, though the third one, questioning Modus Ponens, isn't common. My personal preference (at least for natural readings of '⊢') is for the diagnosis that blames Conditional Proof; this can either go with a broadly classical logic (e.g. supervaluationism) or a more thoroughly nonclassical logic. (For instance, a "paracomplete" logic that restricts excluded middle, or a "paraconsistent" logic that restricts disjunctive syllogism. Unlike broadly classical logics, paracomplete and paraconsistent logics can allow for "naive" theories of truth that accept the general intersubstitutivity of True(〈C〉) with C even in embedded contexts, rather than merely the rules True-Introd and True-Elim.) But there is no need here to decide among the standard diagnoses.³,⁴

³ There are other ways to use KB to apparently derive B that don't make direct use of Conditional Proof, but do make use of principles that are easily derivable by Conditional Proof together with Modus Ponens: e.g.

  Pseudo Modus Ponens (PMP): ⊢ [A ∧ (A → B)] → B

or

  Contraction Rule for Conditionals (CRC): A → (A → B) ⊢ A → B.

The paradoxical derivations from these principles involve the general intersubstitutivity of True(〈C〉) with C, not merely the rules True-Elim and True-Introd. So "naive" theories of truth, which accept the general intersubstitutivity, must reject both PMP and CRC, but theorists who accept merely True-Elim and True-Introd are free to retain PMP and CRC (and indeed all classical validities, though not classical metarules like Conditional Proof or even Reasoning by Cases).

⁴ Indeed, while I think that on the most natural readings of '⊢' it is Conditional Proof that needs restriction, I think there are less natural ones in which it is Modus Ponens that needs restriction. For instance, if A1, ..., An ⊢* B is defined to mean ⊢ A1 ∧ ... ∧ An → B, then Modus Ponens for '⊢*' requires Pseudo-Modus Ponens for '⊢' (see previous footnote), which "naive" truth theorists must reject when 'True' or a related notion is in the language.

Beall and Murzi ultimately suggest (what at least superficially is) a different diagnosis: that Stage 1 of the argument fails as an argument that B follows from 't0 is true' because it illicitly used the assumption that t0 is true twice: once in inferring that if t0 is true then B, and the second in going on to conclude B by Modus Ponens. The suggestion is that at Stage 1 we properly get only

  True(t0), True(t0) ⊢ B;

and that then when we apply Conditional Proof in Stage 2, we get only

  True(t0) ⊢ True(t0) → B.

This blocks the paradox.⁵,⁶

I won't argue against the substructuralist diagnosis of the c-Curry, though it strikes me as unnecessarily radical.⁷ What I do want to argue against is their effort to support this diagnosis by consideration of a different Curry paradox, the v-Curry.

3. The v-Curry

Again let B abbreviate a sentence one wants to prove, and let πB be a sentence of form

  〈B〉 follows from t1,

or for short, Val(t1, 〈B〉); where t1 abbreviates a term that denotes πB, and 〈B〉 names (the sentence abbreviated as) B. Again I'll assume for simplicity that in fact

  (Id): ⊢ t1 = 〈πB〉;

but we could run through at least the initial argument without this, by carrying t1 = 〈πB〉 as a premise. Presumably the validity predicate is extensional, so given (Id), we may assume ⊢ [Val(〈πB〉, 〈B〉) if and only if Val(t1, 〈B〉)], i.e.

  (Equiv): ⊢ [Val(〈πB〉, 〈B〉) if and only if πB].
⁵ We could apply Conditional Proof again to get ⊢ True(t0) → (True(t0) → B); but one can't get from this to ⊢ True(t0) → B without the Contraction Rule for Conditionals mentioned in note 3; and Beall and Murzi reject that (as do "naive" theorists of truth who accept structural contraction).

⁶ An alternative substructural resolution would question the transitivity of ⊢, rather than Structural Contraction (see for instance [4]); on this diagnosis the primary problem with the argument is a disguised use of transitivity in Stage 2. (Prior to the end of Stage 2, we had "established" (1) ⊢ True(t0), and (2) ⊢ True(t0) → B. Modus Ponens as stated above gives (3) True(t0), True(t0) → B ⊢ B. And then assuming transitivity, (1)-(3) yield ⊢ B.) But Beall and Murzi don't question transitivity, and I won't either.

⁷ Though in Section 5 I will suggest that it may be less radical than it seems.

Let's try to more or less mimic the argument of the c-Curry to derive the conclusion that B. To do this we'll need two principles, which Beall and Murzi call VP and VD:

  (VP): If A ⊢ C then ⊢ Val(〈A〉, 〈C〉)
  (VD): A, Val(〈A〉, 〈C〉) ⊢ C.

Stage 1: Assume for the sake of argument that πB. From this and (Equiv), Val(〈πB〉, 〈B〉). But from πB and Val(〈πB〉, 〈B〉), VD yields B; so on the assumption πB, we can conclude B. That is, πB ⊢ B.

Stage 2: From that we can use VP to get ⊢ Val(〈πB〉, 〈B〉). But then by (Equiv), ⊢ πB; and now VD, outside the scope of a supposition, gives ⊢ B. ∎

It is important to the paradox as presented here that 'Val' be a 2-place predicate; there is a related paradox with a 1-place predicate, but it would not serve the dialectical role Beall and Murzi want, of motivating the rejection of structural contraction. In more detail: suppose that VAL is a 1-place predicate of sentence validity, perhaps defined in terms of the 2-place Val, by letting VAL(x) mean Val(〈⊤〉, x) where ⊤ is an uncontroversial logical truth such as ∀x(x = x).
The instances of VP and VD when A is ⊤ become:

  (VALP): If ⊢ C then ⊢ VAL(〈C〉)
  (VALD): VAL(〈C〉) ⊢ C.

And these lead to a "VAL-Curry", i.e. the derivation of an arbitrary C in full classical logic, taken to include Conditional Proof even for sentences with 'VAL'. To see this, just replace 'True' in the construction of the c-Curry sentence with 'VAL' and run the derivation in the way analogous to the one involving the TI Metarule. However, by the same token, any of the standard ways of blocking the c-Curry can be carried over to block the VAL-Curry. Indeed, the naive truth versions block contradiction from strengthened premises, such as

  (VALD+): ⊢ VAL(〈C〉) → C.

Carrying over the standard resolutions of the c-Curry may or may not lead to plausible resolutions of the VAL-Curry: what's plausible for 'True' need not be plausible for 'VAL'. (Indeed the conditional strengthening of VALP, viz. ⊢ C → VAL(〈C〉), is obviously not plausible.) I myself am inclined to think that a better resolution of the VAL-Curry is to restrict either VALP or VALD; which one depends on the readings of 'VAL' and '⊢'. (I'll discuss this later.) Still, these standard c-Curry resolutions give possible solution routes for the VAL-Curry that allow for full acceptance of the VAL principles and keep structural contraction intact.⁸

It is for this reason that Beall and Murzi formulate their v-Curry paradox in a way that essentially involves the 2-place 'Val'. The v-Curry argument does not overtly rely on the notion of truth, or in any serious way on conditionals.⁹ Unless it covertly does because of the meaning of 'Val' (a possibility I will consider in Section 7), the standard resolutions of the c-Curry paradox do not carry over.

Beall and Murzi think the only plausible diagnosis of the v-Curry is that it fails because of an illicit use of structural contraction.
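The two stages of the v-Curry argument in Section 3 can be set out line by line, which makes visible the single point at which structural contraction is invoked. This is only a sketch and not Beall and Murzi's own layout: the line numbering, the explicit appeals to reflexivity and Cut, and the reading of (Equiv) as licensing the replacement of $\pi_B$ by $Val(\langle \pi_B\rangle,\langle B\rangle)$ (and conversely) on the right of the turnstile are assumptions made here for illustration. $\pi_B$ is the sentence written πB in the text.

```latex
\begin{align*}
1.\;& \pi_B \vdash \pi_B
      && \text{reflexivity (assumed)}\\
2.\;& \pi_B \vdash Val(\langle \pi_B\rangle,\langle B\rangle)
      && \text{1, (Equiv)}\\
3.\;& \pi_B,\; Val(\langle \pi_B\rangle,\langle B\rangle) \vdash B
      && \text{(VD)}\\
4.\;& \pi_B,\; \pi_B \vdash B
      && \text{2, 3, Cut}\\
5.\;& \pi_B \vdash B
      && \text{4, structural contraction}\\
6.\;& \vdash Val(\langle \pi_B\rangle,\langle B\rangle)
      && \text{5, (VP)}\\
7.\;& \vdash \pi_B
      && \text{6, (Equiv)}\\
8.\;& \vdash B
      && \text{3, 6, 7, Cut}
\end{align*}
```

Lines 1 to 5 correspond to Stage 1 and lines 6 to 8 to Stage 2. Without the step from line 4 to line 5, (VP) has no one-premise sequent to apply to at line 6.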
(Or at least, that this is the only plausible diagnosis available to those who advocate "naive truth theories" of the sort mentioned in note 3.) πB is used twice in Stage 1 of the argument: each of the two premises of the application of VD depends on it. According to their diagnosis, all we can validly get via VD at Stage 1 is that B is derivable from πB taken twice, whereas to then apply VP at Stage 2 we'd need that B is derivable from πB taken only once. ("VD is fine as long as you don't contract it.")

Again, I won't argue that this diagnosis is wrong, but it strikes me as totally unnecessary. For a more obvious diagnosis (which I think should have strong appeal even to those who think that VALP and VALD are jointly acceptable) is that there's something wrong with at least one of VP and VD. (VP is perhaps the more obvious culprit, but I think that which of them is faulty depends on precisely how one understands 'Valid', and on how one understands '⊢' as well. More on this at the end of Section 4 and in Sections 7-9.)

Of course Beall and Murzi are aware of such a response, and reject it. So let us see what they say in support of VP and VD. Much of the support is apparently supposed to come from something they call the V-Schema.

⁸ The VAL-Curry and c-Curry are also formally analogous to the "Knower paradox", which involves 'knows' viewed as a 2-place predicate taking sentences in its second slot. But again, what's plausible for one of the predicates 'true', 'knows' and 'VAL' needn't be plausible for others.

⁹ It implicitly uses Modus Ponens for the metalinguistic 'if...then', but this seems pretty harmless. (Two of the implicit uses of Modus Ponens came in applying (Equiv), and (Equiv) could be avoided if we allowed a simple kind of self-reference where a sentence can simply contain a name of itself, and then these uses of Modus Ponens wouldn't be needed. The other implicit use is in the application of (VP), but to question Modus Ponens here would be to question the whole idea of a metarule.)

4. The V-Schema

Beall and Murzi introduce the following V-Schema:

  (V-Schema): ⊢ Val(〈A〉, 〈B〉) if and only if A ⊢ B.

They describe this as what results from putting VP and VD together, but this is inaccurate. Yes, the right to left of the V-Schema is just VP. The left to right, however, is strictly weaker than VD. From VD and ⊢ Val(〈A〉, 〈B〉) one gets A ⊢ B by the structural rule of transitivity or Cut (which I assume is not in question; see note 6). But

  (1) VD concerns what follows from the assumption of Val(〈A〉, 〈B〉) (together with another assumption), or from what we can conditionally assert or intuitively argue for on that assumption.

Whereas:

  (2) The left hand side of the V-Schema involves Val(〈A〉, 〈B〉) being established, or unconditionally asserted.

So it's hard to see how one could use the left to right of the V-Schema to establish VD.¹⁰

Perhaps then we should forget about VD, and do the v-Curry argument using the V-Schema instead? Beall and Murzi suggest that this is possible (p. 153 l. 6), but the same consideration that blocks VD following from the schema seems to undermine this: in the first stage of the v-Curry argument, Val(〈πB〉, 〈B〉) is an assumption rather than an established or asserted conclusion, so the left to right of the V-Schema gains no purchase. In the case where B is an absurdity ⊥, the V-Schema merely requires that

  ⊢ Val(〈π⊥〉, 〈⊥〉) if and only if π⊥ ⊢ ⊥.

Both sides could fail, so Stage 1 of the v-Curry argument is blocked. I conclude that even if one accepts the V-Schema, it provides no reason to accept VD, and without it, the v-Curry argument does not go through.

¹⁰ VD does follow from the left to right half of the strengthened V-Schema

  (VS+): G ⊢ Val(A, B) iff G, A ⊢ B.

(Just take G to be {Val(A, B)}.) But that should be little consolation since (VS+) is clearly false: we have

  snow is white, grass is green ⊢ snow is white,

but obviously not

  snow is white ⊢ Val('grass is green', 'snow is white').

Admittedly the obvious problem is with the right to left of (VS+), but why should the left to right be any better? If as suggested below we read '⊢' as 'Val', this would say that if Val(G, Val(A, B)) then Val(G ∪ {A}, B), and this "Val-Contraction" isn't obviously compelling.

Unless there is independent reason to accept VD (a question I will discuss starting in Section 7), this is already enough to defuse the v-Curry paradox. But it is also worth asking whether it's obvious that we ought to accept the right-to-left half of the V-Schema, i.e. VP, since rejecting that is an alternative and perhaps preferable resolution of the paradox. (The left-to-right half of the V-Schema does not seem especially problematic,¹¹ but in any case plays no role in their argument other than via the mistaken claim that it implies VD.)

Regarding VP, Beall and Murzi say that

  ... giving up VP seems not to be an option, at least if Val(x, y) is to be the validity predicate - that is, if Val(x, y) expresses what follows from what, what stands in the validity relation (which we normally mark with the turnstile). (p. 156)

Until this point they, and I, have been neutral on the reading of the turnstile, but in their parenthetical remark they assume that it is to be read as a (or "the") validity relation rather than, say, a relation of conditional assertability. And given that it is to be a kind of validity, they suggest that we take 'Val' to be simply a rendering of '⊢' into the object language (thereby allowing it to freely embed). Prima facie this is a very natural suggestion (though in Section 8 I will point out considerations that make it less so).
It clearly doesn't help with the gap between the V-Schema and VD (and in fact makes the gap much worse, as we'll see), but maybe it at least helps with the plausibility of VP?

It may help a bit with VP, but not enough. On this interpretation of '⊢', what VP says is

  (VP-spec): If Val(〈A〉, 〈B〉) then Val(〈⊤〉, 〈Val(〈A〉, 〈B〉)〉).

(I've put in the ⊤ so as to make do with simply a two-place validity relation.) This may seem a plausible principle, though its similarity to Conditional Proof might well give pause. But I don't see why Beall and Murzi say that giving it up isn't an option. Why can't there be true claims of form Val(〈A〉, 〈B〉) that aren't valid? Beall and Murzi's likening of the V-Schema to the truth schema (p. 158) seems incorrect: even on the assumption that '⊢' represents a kind of validity and 'Val' the same kind of validity, their schema has a "double occurrence of validity" ('⊢ Val') on the left side and a "single occurrence" ('⊢') on the right, making the argument from right to left (i.e. VP) problematic. (The "genuine validity scheme", we might say, connects 'Val' rather than '⊢ Val' to '⊢'.) And without the assumption that '⊢' represents a kind of validity and 'Val' the same kind of validity, there seems even less reason to accept VP.

¹¹ That is, it doesn't seem problematic that its instances are true. The claim that the instances are valid, and in the same sense of 'valid' encoded by 'Val', would be another matter.

But it's worth noting that if we accept the idea that 'Val' is simply a rendering of '⊢' into the object language, it isn't really necessary to question VP, because with 'Val' and '⊢' so related, VD becomes highly implausible (at least assuming structural contraction). Recall that the first stage of the v-Curry argument, in the case where B is ⊥, yields the conclusion that π⊥ ⊢ ⊥ (using only VD and structural contraction).
If 'Val' is simply a rendering of '⊢' into the object language, this conclusion can be written as Val(π⊥, ⊥), which is equivalent to π⊥. But this pair of conclusions seems impossible to accept: we'd have that the sentence π⊥ is both true and implies absurdity. VP was nowhere used in drawing this conclusion: we have a direct argument that if 'Val' is just a rendering of '⊢' into the object language then VD and structural contraction shouldn't be accepted together.¹²

¹² It might seem that this argument yields the conclusions ⊢ Val(π⊥, ⊥) and ⊢ π⊥ without the use of VP, but that is not so. For though it in some sense establishes Val(π⊥, ⊥) and π⊥, assuming (VD) and the Beall-Murzi assumption about the relation between 'Val' and '⊢', we can't write ⊢ Val(π⊥, ⊥) and ⊢ π⊥, because on the proposed reading of '⊢' that would mean that Val(π⊥, ⊥) and π⊥ are valid, which goes beyond what's established. Still, we've established Val(π⊥, ⊥) and π⊥ as true, assuming (VD) and the Beall-Murzi assumption about the relation between 'Val' and '⊢'. The bizarreness of that conclusion is what makes VD not cotenable with the Beall-Murzi assumption, assuming structural contraction.

This may seem surprising to those like me who accept structural contraction, since it seems at first blush natural to blame the v-Curry paradox on VP rather than on VD. But I think that's partly because we naturally read the turnstile as representing something other than validity, like conditional acceptability: we think of VD as saying that if you (fully) accept both A and the validity of the inference from A to C then you ought to accept C. So read, VD is compelling, whereas VP (that conditional acceptability requires validity) is not. (As we'll see in Section 8, there are also natural ways to take '⊢' as a validity relation, but not take 'Val' as fully reflecting it; on such interpretations too, VD may be correct and VP be the culprit.)

In summary: I haven't argued against giving up structural contraction, but Beall and Murzi's case for doing so depends on there being some interpretation of 'Val' and '⊢' for which both VP and VD hold. (The modified case that avoids VP is based both on VD and the additional assumption that 'Val' is simply a rendering of '⊢' into the object language.) And Beall and Murzi's case for VP and VD is very weak. In later sections I will discuss a number of views on which one or both fail.

5. An Irenic Alternative?

Before going on to further defend the view that substructuralism is unnecessary, let me raise the possibility that the issue might be largely verbal. The substructural view advocated by Beall and Murzi typically goes with the view that in addition to ordinary conjunction, there is a stronger conjunction ◦, called fusion, with A, B ⊢ C equivalent to A ◦ B ⊢ C; the reason that A, A ⊢ C doesn't entail A ⊢ C is that A doesn't entail A ◦ A. (On many such views, ◦ is definable from other connectives; for instance, A ◦ B might be just ¬(A → ¬B).) Given this, we can distinguish between two one-premise forms of Modus Ponens:

  Strong 1-Premise Modus Ponens: A ∧ (A → B) ⊢ B

and

  Weak 1-Premise Modus Ponens: A ◦ (A → B) ⊢ B.

The typical substructuralist takes the usual 2-premise form of Modus Ponens as equivalent to the Weak 1-Premise form rather than the Strong 1-Premise form. But is that substantively different from identifying the 2-Premise form with the Strong 1-Premise form and rejecting both, in favor of the Weak 1-Premise form? On this latter formulation, the resolution of the c-Curry is to restrict ordinary Modus Ponens, replacing it by its Weak 1-Premise variant.
That formulation would seem to avoid the need for any restriction on structural contraction; it would be a standard resolution of the c-Curry, albeit of the somewhat unusual sort that places the blame on Modus Ponens rather than on Conditional Proof or the truth rules.¹³

What about the v-Curry? An analogous point applies: when Beall and Murzi say that VD is fine as long as you don't contract it, perhaps what they're saying isn't substantively different from saying that ordinary VD is not fine, and that what is fine is only

  (Weak VD): A ◦ Val(〈A〉, 〈C〉) ⊢ C.

In that case too, we seem to avoid any need for a restriction on structural contraction. Since I'm sympathetic with the view that some good validity concepts restrict (ordinary) VD, I'm not entirely unsympathetic with this structuralist analogue of the Beall and Murzi view, according to which it should be rejected in favor of Weak VD. That isn't to say I endorse it: it would have a cost, to be discussed in the next section.

¹³ While on the whole I prefer a standard resolution that restricts Conditional Proof over one that restricts Modus Ponens, the latter does have one advantage: Conditional Proof together with True-Introd and True-Elim (and obvious conjunction rules) yield the principle that valid inferences preserve truth, a principle that requires restriction in the presence of (Strong 1-Premise) Modus Ponens.

6. Classicality Constraints?

A principle that strikes me as somewhat natural is the

  Weak Classicality Constraint (WCC): If the 'Val'-free fragment of L is classical, then sentences containing the 'Val' predicate (restricted to inferences in L) should also be classical, in the sense of obeying classical laws like excluded middle and explosion.

Of course, this principle would immediately rule out substructural solutions to the validity paradoxes in otherwise classical languages (assuming that the "classical laws" in the WCC include the standard structural rules).
It would also seem to rule out the "irenic alternative" of the previous section. For in a classical context, fusion reduces to conjunction (at least if defined from other connectives); so if WCC is accepted then at least for languages whose 'Val'-free fragment is classical, Weak VD seems no more acceptable than ordinary VD. (One might, I suppose, have a primitive fusion operator, and not count it as a violation of classicality to have it diverge from conjunction in some contexts; under that interpretation of WCC, the irenic alternative would accord with it.)

Finally, WCC would seem to rule out any 1-place predicate 'VAL' (perhaps defined from the 2-place 'Val') that satisfied both VALP and VALD, given that when one insists on these principles the VAL-Curry requires some sort of non-classical resolution. Admittedly, the non-classical resolution might simply be the restriction of Conditional Proof to 'Val'-free sentences, and it would not be entirely unnatural to read WCC in a narrow way so that it allows this. That narrow reading, though, wouldn't be enough to allow the substructural solution of the v-Curry, or the irenic alternative, since these don't depend on Conditional Proof.

While I'm sympathetic to WCC, I certainly wouldn't call it totally obvious. In the next section I will briefly discuss another resolution of the v-Curry that rejects WCC. (I will not further discuss Weak VD, or the fusion connective it requires.)

In Sections 8 and 9 I will discuss views that accept WCC; indeed, many of them accept even the

  Strong Classicality Constraint (SCC): Even for nonclassical L, 'Val' (applied to L) should be a classical predicate, in the sense that classical laws like excluded middle and explosion apply to sentences containing it.

(Here too there are narrower and broader readings, depending e.g. on whether the "classical laws" are taken to include meta-rules such as Conditional Proof; the views I will discuss generally accept SCC even in the broader sense.) Indeed, many of these views accept the still stronger constraint that 'Val' be a classical predicate with recursively enumerable extension, even when the language containing 'Val' is otherwise nonclassical.

There are at least two things to be said in favor of some or all of these constraints. First, the notion of validity should serve as a regulator of reasoning. It would seem as if it might hamper that role if there were inferences for which we had to reject that they were either valid or not valid (or accept that they were both); or if the notion of validity were to have high computational complexity. But to pursue this issue would require a much bigger discussion.

Second, the classicality constraints sidestep what we might call the hypocrisy problem: if you take 'logically valid' to obey a logic weaker than classical, you shouldn't ultimately be satisfied with developing your theory of that logic using inferences that are merely classically valid; and yet development of the metalogic without full classical resources presents added difficulties. If 'Val' is a classical predicate in the sense of SCC and WCC, there's no problem; and if it is at least classical when L0 is, we can at least discuss the v-Curry for classical ground languages without raising the hypocrisy issue.

While there's a lot to be said for the classicality constraints, I will take no official stand on them, even the weak one. I'm merely exploring possible views, and in fact am sympathetic to the idea that the constraints are appropriate for some understandings of 'Val' but not for others.

7. Validity as Necessarily Preserving Truth

There are several ways to think of validity. Some are broadly reductive.
And one kind of broadly reductive account reduces validity to truth and some standard notion of necessity that is understood independently of validity: validity is necessary preservation of truth. On this construal, any paradoxes of validity will simply be paradoxes of truth in the modal language. Standard resolutions of the paradoxes of truth (whether paracomplete, paraconsistent, supervaluationalist, revision-theoretic, hierarchical, or whatever) carry over to modal languages, and this will automatically resolve any paradoxes of validity. For instance, with any reasonable notion of necessity, standard theories that place the blame on Conditional Proof will reject VP, and any standard theories that place the blame on Modus Ponens would reject VD. Beall and Murzi's idea that there are new paradoxes of validity presumably requires rejecting this reduction of validity to truth and an antecedently understood modality.14 (And there are independent reasons to reject that reduction: see e.g. [1] pp. 34-5, and [6] Section I.)

It's worth noting that such resolutions of the v-Curry needn't question the weaker principles VALP and VALD discussed in Section 3. For instance, a theory that resolves the c-Curry in a naive truth theory that keeps Modus Ponens and rejects Conditional Proof can be happy with VALP as well as VALD. On such resolutions, the consequent of VALP involves truth but not truth preservation: no conditionality is involved in it. The paradoxical conclusion of the VAL-Curry can be blamed on its use of Conditional Proof. (Similarly for a theory that resolves the c-Curry in a naive truth theory by restricting Modus Ponens: it can keep VALP and VALD and blame the VAL-Curry on Modus Ponens. VP will be questionable where VALP isn't, because the latter involves only the unconditional notion of truth, not the conditional notion of truth preservation.)
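
To make the reduction concrete, here is one schematic way of writing it out. It is only a sketch: the box for the antecedently understood necessity, the predicate True, and the use of '→' as the relevant conditional are assumed notation rather than anything fixed by the text, and A is assumed to be intersubstitutable with True(〈A〉). VP and VD are taken in the forms used elsewhere in the paper. (The display assumes the amsmath package.)

```latex
% Sketch only: \Box, \mathrm{True} and \rightarrow are assumed notation.
\begin{align*}
\mathrm{Val}(\langle A\rangle,\langle B\rangle)
  \;&:=\; \Box\bigl(\mathrm{True}(\langle A\rangle)\rightarrow\mathrm{True}(\langle B\rangle)\bigr)\\[4pt]
\text{(VP)}\quad &\text{if } A\vdash B \text{ then }
  \vdash \Box\bigl(\mathrm{True}(\langle A\rangle)\rightarrow\mathrm{True}(\langle B\rangle)\bigr)\\
\text{(VD)}\quad & A,\ \Box\bigl(\mathrm{True}(\langle A\rangle)\rightarrow\mathrm{True}(\langle B\rangle)\bigr)\vdash B
\end{align*}
```

Read this way, VP asks us to pass from a derivation of B from A to a (necessitated) conditional, which is the step that Conditional Proof licenses; VD asks us to detach the consequent of that conditional, which is the step that Modus Ponens licenses.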
There's also a semi-reductive approach, which leaves the 1-place 'VAL' undefined but defines 'Val' in terms of it and a conditional, as

Val(〈A〉, 〈B〉) is VAL(〈A → B〉).

The VAL-Curry can then be resolved in accordance with VALP and VALD in a paracomplete or paraconsistent logic as in Section 3, by restricting either Conditional Proof or Modus Ponens. But if one restricts Conditional Proof, one can't get from VALD to VD; and if one restricts Modus Ponens it's hard to see how to get from VALP to VP (and indeed this needs additional assumptions even given Modus Ponens).

As I've stated the approach that takes validity to be necessary preservation of truth, it disallows having 'Val' in a language without 'True', so the issue of WCC doesn't arise. But it is in the spirit of the approach to allow 'Val' in a language without 'True', by making it behave as it would if 'True' were added and 'Val' defined in terms of it as above. In that case, then of course if the hypothetical treatment of 'True' is nonclassical, the view would violate WCC. But as I've said, I don't regard that as clearly objectionable.

14 Of course, any non-classical version of this raises the hypocrisy problem; but I take it that the v-Curry problem that Beall and Murzi are suggesting is supposed to be independent of that. Indeed, if that were their problem, they'd owe a development of substructural logic in a substructural metalanguage, which they do not attempt.

There may also be thoroughly nonreductive ways of introducing nonclassical 'Val' into otherwise classical languages. Perhaps some of them would be more attractive than the reductive approach considered here, but I don't think VP and VD would fare any better on them. But aside from a very brief observation in the last three paragraphs of the paper, my focus from now on will be on views that retain at least weak classicality.

8. Other Broadly Reductive Concepts of Validity

Another kind of reductive account explains validity in standard mathematical terms, e.g. in terms of proof or models; the notion of truth is not used in the reduction. With models, we do use the notion of semantic value in a model. But (at least if 'model' is understood in the strict sense, requiring a set as a domain) this is not a notion that leads to paradoxes, so the situation is very different from the necessary truth-preservation account. But it is similar in one respect: in this case too, the treatment of validity for sentences without 'valid' will dictate how the paradoxes of validity are to be resolved.

More fully, let L0 be a language that contains no primitive validity predicate, but which is mathematically rich: it includes at least the language of Peano arithmetic, and preferably the language of set theory (say ZF). (It may or may not contain a truth predicate of some sort; but since we're considering paradoxes that are alleged to arise for validity alone, and are rejecting the idea of explaining that in terms of truth, the reader might want to focus on the case where it doesn't. The general point I'm making, though, goes over to the case where it does.) There are various standard proposals for "defining" validity for such a language, including:

(A): Pure first order validity, as codified in standard first order logic with identity.

(B): The conclusion following by first order logic from the premise together with standard mathematical axioms, say those of Peano Arithmetic or ZF or some recursive extension of ZF.

(C): Various model-theoretic characterizations. Focusing on one-premise inferences, the general form is either (i) that the inference from A to B is valid iff in all models M of type Ψ, if A has designated value in M then so does B; or (ii) that it is valid iff in all models M of type Ψ, the value of A is less than or equal to that of B.
(When L0 is classical, the models are presumably 2-valued, so there is no difference between (i) and (ii).)

These needn't be understood as accounts of the meaning of 'valid', but merely as explications (giving proposed extensions in a theoretically fruitful way); that's all that definition in mathematics is usually taken to involve. The definitions can extensionally disagree with each other, so we should regard them as explications of distinct but related validity concepts.

The reason that I wanted to take L0 to be a mathematically rich language is that this allows these definitions to be given within L0. More exactly, if L0 is rich enough to formulate Peano arithmetic then it can formulate notions of valid L0-inference that are of form (A) or (B) (provided, in the case of (B), that the set of axioms is recursive or at least arithmetical). And if L0 is rich enough to formulate ZF then it can formulate notions of valid L0-inference of any of the three forms (provided that, in the case of (C), we confine ourselves to Ψ that are definable within ZF).

What about if we now expand L0 to include a binary predicate 'valid'? Well, since that term is definable in L0, it seems clear how to proceed: the inference from L-sentence A to L-sentence B is deemed valid if the inference from L0-sentence A* to L0-sentence B* is valid, where A* and B* are the results of eliminating 'valid' from A and B by the definition of 'valid'. Using this procedure, it is a simple matter to determine which if any of VP and VD is acceptable.15 The answer will depend on the notion of validity for L0 from which we started.

Notion (A), purely first order validity, is of little interest in the present context: validity in that sense requires arithmetic (or syntax) to define, so laws about validity will be special cases of laws of arithmetic (or syntax); they won't themselves be validities in the strictly logical sense given by (A). VP and VD will both fail.
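
The elimination procedure can be summarized schematically. In the following sketch the subscripted turnstiles for the validity relations of the expanded language L and of L0 are labels introduced here for illustration only, the star is the translation described above, and the 'if' of the text is read as a definition. (The display assumes the amsmath package.)

```latex
% (.)^* replaces each occurrence of 'valid' by its L_0-definition.
% \vdash_{L} and \vdash_{L_0} are illustrative labels, not the paper's notation.
\begin{align*}
A \vdash_{L} B \quad &:\Longleftrightarrow\quad A^{*} \vdash_{L_0} B^{*}\\[4pt]
\text{VP is acceptable iff, for all } A, C:\quad
  & A^{*} \vdash_{L_0} C^{*} \ \text{ implies }\
    \vdash_{L_0} \bigl(\mathrm{Val}(\langle A\rangle,\langle C\rangle)\bigr)^{*}\\
\text{VD is acceptable iff, for all } A, C:\quad
  & A^{*},\ \bigl(\mathrm{Val}(\langle A\rangle,\langle C\rangle)\bigr)^{*} \vdash_{L_0} C^{*}
\end{align*}
```

Both tests are then questions about L0 alone, which is why the answer turns on the notion of validity for L0 from which we started.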
15 At least in the case where L0 is classical, the analysis will diagnose the VAL-Curry in the same way as the v-Curry; this is because Val(〈A〉, 〈B〉) will be taken as equivalent to VAL(〈A ⊃ B〉).

If we start with a version of (B) where the mathematical theory in question is identical to that we use in our informal reasoning (which is presumably the case of (B) that's most in the spirit of Beall and Murzi's discussion), then we will get VP but not VD (unless the standard mathematical axioms are inconsistent, in which case the definition of validity counts all inferences as "valid"). On this view, the validity of the inference from A to B is essentially provability of B from T ∪ {A}, where T is a recursively enumerable set of axioms that is sufficient to generate all positive atomic truths about such provability. Then VP just says

if A ⊢_T C then ⊢_T Deriv_T(〈C〉, 〈A〉),

which is a natural generalization of the standard Löb derivability condition and holds for any reasonable provability predicate. But VD becomes

A, Deriv_T(〈C〉, 〈A〉) ⊢_T C,

which taking A to be a tautology becomes

(S): Prov_T(〈C〉) ⊢_T C.

And from Gödel's second incompleteness theorem, we know that this is impossible if T (here, PA or ZF) is consistent. Given that PA and ZF are presumably consistent, we must reject VD (and its special case VALD) on this understanding of validity.16 That, I assume, is a fact that we have come to terms with long ago.

I suppose it would be possible to use substructural logic to overcome the second incompleteness theorem (that is, to develop a weak arithmetic in a logic that rejects structural contraction and adds schema (S) postulating its own soundness). But I doubt that many would find that a profitable way to go.

The situation with (B) is different if our informal reasoning is significantly more powerful than that codified in T; in particular, if it employs a theory T* that can prove the soundness of T. In that case, VD will hold (if T* is itself sound).
But then of course we shouldn't expect VP (or even its special case VALP) to hold: establishing C in T* shouldn't lead to the conclusion that C is provable in the weaker theory T. (We will, though, get VP-spec: we'll be able to prove in T* that if Prov_T(〈B〉) then Prov_T(〈Prov_T(〈B〉)〉), or more generally that if Deriv_T(〈B〉, 〈A〉) then Prov_T(〈Deriv_T(〈B〉, 〈A〉)〉).)17

16 There is a broad analogy to the provability logic GL of [3]. But in the case where L0 is nonclassical (e.g. because it contains a naive truth predicate), they are not the same: the operator corresponding to Val will be irreducibly 2-place, not equivalent to a 1-place operator □ applied to a conditional (even a nonclassical conditional).

Although a central presupposition of Beall and Murzi's paper is that we should have at best limited interest in the case where our informal reasoning is significantly more powerful than that codified in our validity predicate, I'm not sure that is right. For our system of informal reasoning, which we're imagining could be codified as T*, is extraordinarily complex, so complex that we almost certainly could never recognize any codification of it as correct. While the language can define a "validity predicate" corresponding to T*, it is not one we could recognize as deserving the name 'validity predicate'. Anything we could recognize as a validity predicate would correspond to provability in a weaker T; and for such a predicate we will not have VP, but may have VD if the gap between T* and T is sufficient. I regard this as an attractive way to accept VD without VP.

This is the explanation of my hints in Section 4 that there might be something unnatural even about what I called the "genuine validity schema": the equation of 'Val' with '⊢' (as opposed to the equation of '⊢ Val' with '⊢' in what Beall and Murzi call the V-Schema). Unnatural, not incorrect.
It is not incorrect, for there will be validity predicates for which this "genuine validity schema" holds (so the view is not that "we do not have the resources to talk about validity [i.e. the relation '⊢' involved in intuitive reasoning]", as they put it on p. 158). But it is unnatural in that the "genuine validity schema" doesn't hold for the most interesting validity predicates.

The last few paragraphs have focused on Option (B). With Option (C), the analysis depends on exactly which models Ψ we quantify over, and also, in the case of multivalued models, on whether we use the definition in terms of designated values or the one in terms of the partial order. Of course if the models Ψ over which we quantify are precisely the models of some recursively axiomatized first order theory like ZF, then the model-theoretic definition of validity is equivalent to a proof-theoretic definition, and the case is no different from (B). In particular, if that recursively axiomatized theory is weak enough for us to prove sound in our broader theory, VD is bound to hold and VP is bound to fail. If on the other hand we employ the constraint that the validity concept in question should coincide with what we use in our informal reasoning, VP will hold and VD fail.

17 Extending the previous footnote, there is in this case a broad analogy to a provability logic, but in this case the logic GLS of [3]. Again, in the case of nonclassical L0, the operator corresponding to Val will be irreducibly 2-place.

It may however be of interest to quantify over models meeting a condition Ψ of a form other than "model of such and such a recursively axiomatized first order theory" (though it is arguable that doing so would require a rejection of the constraint that the validity concept would match our intuitive reasoning). For instance, we might want to quantify only over models in which there are no non-standard natural numbers.
With conditions like these, we can find both cases where VP fails and cases where VD fails (and cases where both do).18 But of course there's no way of defining a notion of validity in set theory that extends first order validity such that both VP and VD hold.

18 We can also get examples where VP-spec fails; when we've given up the constraint that the validity coincides with ordinary reasoning, this is different than VP failing. But the main interest in VP-spec was from its supposed connection to VP.

The approach to validity given in Options (B) and (C) is bound to respect at least the Weak Classicality Constraint (even on the broad reading where 'classicality' includes Conditional Proof). On natural assumptions, it will respect even the Strong Classicality Constraint (again taken to include Conditional Proof); and many versions of it will take 'Val' to have a recursively enumerable extension. As noted earlier, there is something to be said for these things, though they are not uncontroversial desiderata.

To repeat the main point of this section and the last: given a definition of validity in a language without a validity predicate, it's a routine matter to see what its consequences are for the validity paradoxes. While on some definitions VALP and VALD will both hold, on none will both VP and VD hold; and which one fails is determined by the definition. (Moreover, on definitions in which the validity concept reflects our intuitive reasoning, VP will hold and VD fail; but as suggested in the discussion of Option (B), that may not be the most interesting case.)

Note the big difference between the case of validity and the case of truth: truth-in-L0, unlike validity-in-L0, isn't even extensionally definable in L0, and so there is no prospect of simply figuring out the validities involving the predicate 'true-in-L0' from the validities of L0.

9. Validity as Normative, and Hierarchical Validity

The argument of the previous section seems to be based on the assumption that we can in some important sense (perhaps weaker than meaning-equivalence) define validity in standard mathematics, in such a way that the only laws of validity will be "pre-existing" laws of mathematics. While this isn't very precise, it seems to suggest a way in which the argument might be questioned. For instance, it is natural to view validity as a normative concept. And if we do, we might think that in "defining" validity for the language L0 we give what appears to be an extensionally adequate account only because we ignore the norms of validity itself. Once we extend the language to explicitly recognize a notion of validity Val, we automatically generate new norms, not accommodated by the original "definition". (Maybe the so-called "definition" works when applied to L0-sentences, but in regarding it as not really a definition in any serious sense, we can allow that and yet not extend the account to the full L.)

One natural way to work this out is in terms of a hierarchy of weaker and weaker validity concepts, say Val_α for ordinals α. (Beall and Murzi mention this, but are skeptical; the view is advocated in [8]. Although they don't say so, presumably the hierarchy would extend only through an initial segment of the ordinals, such as those hereditarily definable in L0, or those hereditarily definable by simple enough notations that we have the means to prove that they do represent ordinals and prove the order relations among them.) The view might have it that with the Val predicates goes a corresponding hierarchy of '⊢_α'.
It would then be natural to modify the principles VP and VD to the following stratified principles (schematic in α as well as in A and C):

(VP_{α,α+1}): If A ⊢_α C then ⊢_{α+1} Val_α(〈A〉, 〈C〉)

(VD_{α,α+1}): A, Val_α(〈A〉, 〈C〉) ⊢_{α+1} C.

Such principles seem entirely acceptable, and the paradoxical derivations will be blocked.19

On the hierarchical view, at least as given by VP_{α,α+1} and VD_{α,α+1}, both VP and VD fail for each successor ordinal, i.e. when for the same successor β, '⊢' is read as '⊢_β' and 'Val' as 'Val_β'. (That is, VP_{α+1,α+1} and VD_{α+1,α+1} both fail.) What about for limit ordinals (i.e. VP_{λ,λ} and VD_{λ,λ})? The natural construction takes Val_λ to be the union of the Val_α for α < λ, when λ is a limit; similarly for ⊢_λ. If we proceed in this way, then VD_{λ,λ} will certainly fail: the only way to get A, Val_λ(〈A〉, 〈C〉) ⊢_λ C would be for there to be an α < λ such that A, Val_λ(〈A〉, 〈C〉) ⊢_α C; but the construction will give only A, Val_β(〈A〉, 〈C〉) ⊢_α C for each β < α, and since Val_λ is weaker than Val_β, this is not enough.

19 It isn't really necessary to have the ordinal jump in both principles, but putting it in both obviates the need to discuss in which the jump is more appropriate.

What about VP? A crude argument may suggest that VP_{λ,λ} should hold when λ is a limit ordinal:

If A ⊢_λ C then for some α < λ, A ⊢_α C; so ⊢_{α+1} Val_α(〈A〉, 〈C〉); and α + 1 < λ since λ is a limit ordinal, so ⊢_λ Val_α(〈A〉, 〈C〉). And Val_λ is weaker than the Val_α for α < λ, so ⊢_λ Val_λ(〈A〉, 〈C〉).

But the last stage of this is suspect: the worry is that at stage λ we may not have the means to recognize that λ is the limit of its predecessors (or that Val_λ is the "infinitary disjunction" of the Val_α for α < λ). In fact, the whole argument would need reformulation, in the language of ordinal notations rather than simply of ordinals. There is no need to go through this since in any case VD_{λ,λ} fails.

What should we say about such a hierarchical view?
For purposes of this paper there's really no reason to decide. For (i) as Beall and Murzi themselves suggest, the stratification of validity concepts would not have nearly the devastating impact on our reasoning that a stratification of truth predicates would have; and (ii) the stratification would not support a failure of structural contraction, since it is a view on which VD fails at every stage.

Still, a few words on the hierarchical view seem appropriate. First (and contrary to what I take to be the suggestion in the Appendix of [8]), I don't think it at all obvious that we should go hierarchical for '⊢'. Suppose we start a hierarchy from a fairly weak ⊢_0 (weaker than what our intuitive reasoning employs). The hierarchy constructed from it will doubtless reflect our intuitive reasoning for a long way; but at some (countable) limit ordinal Ω far more complex than its predecessors, our language will contain no notation for it that we are able to recognize as a notation for an ordinal that is the least upper bound of the ordinals denoted by prior ordinal notations. Given this, I think it's natural to suppose that our intuitive reasoning is accurately represented by ⊢_Ω; that is, the intuitive validities A ⊢ B are those where A ⊢_α B for some α < Ω. We can comprehend the α < Ω and validity predicates corresponding to them; but we can't really comprehend Ω, or put those validity predicates to use in defining a predicate Val_Ω that we can recognize as a validity predicate.

But that raises a second question: is this view still hierarchical as regards validity predicates? The answer is: yes and no.

No: in that there would presumably still be a formula 'Val' in our language corresponding to the intuitive validity relation ⊢ given by our reasoning practice: if the latter relation is recursively enumerable (as seems plausible), there must be.
Yes: in that we can't recognize a given unstratified predicate as the appropriate Val, but we can recognize specific Val_α as validity predicates; so what we recognize as validity predicates would still fall in a hierarchy.

Using 'Val' in accord with the 'No' ("nonhierarchical") answer, A ⊢ C if and only if Val(〈A〉, 〈C〉), by definition. But it is doubtful that A ⊢ C if and only if ⊢ Val(〈A〉, 〈C〉), as the V-Schema requires: VP probably fails, simply because Val is too complicated for us to intuitively reason about. In any case, the V-Schema isn't enough to give VD; and for this Val, VD definitely fails in general. (On the other hand, the construction will validate any instances of VD and VP where the instantiating sentences A and C contain only restricted validity concepts.)

Using the multiplicity of validity predicates, in accord with the 'Yes' ("hierarchical") answer, we may well get a version of VD for each of the Val_α (α < Ω); the version of VD would involve the unstratified ⊢ that is in effect ⊢_Ω. That seems attractive, and not altogether different from the VD solutions of the previous section. But like those, it still doesn't deliver VP.

I have been sketching one natural way to try to "transcend the hierarchy", at least for '⊢' and (in the 'No' version) for 'Val' as well. But perhaps the Beall and Murzi paper suggests that there is a different way of "transcending the hierarchy", one that would allow us to keep both VP and VD in a substructural setting? The thought might be that just as Kripke [7] showed how to transcend the Tarski hierarchy in a nonclassical setting (introducing a single unstratified nonclassical truth predicate, from which we can define stratified Tarskian predicates that behave classically), we should do the same for validity in a nonclassical setting.
Extending the analogy, the idea might be to argue in a nonclassical setting that by starting from a hierarchy of validity predicates and allowing sentences to "seek their own level", an unstratified predicate that satisfied VP and VD would emerge at some fixed point. (Just how invalidity claims are to get in at each stage of the construction is far from obvious, but suppose there's a way to do it.) Obviously there's no way that anything like this could happen if the nonclassical setting were merely paracomplete or paraconsistent, with standard structural rules: perhaps VALP and VALD could both emerge, but the whole point of the v-Curry argument was that mere paracompleteness or paraconsistency don't suffice to allow for VP and VD together. But perhaps if we did a construction modeled after Kripke's in a substructural setting, VP and VD together would emerge? That would certainly be interesting if it could be done, but Beall and Murzi don't claim it can, and nothing in their paper gives any reason to think that it can. And if it can't, we have a further respect20 in which the situation with the validity principles VP and VD seems totally different from the situation with the principles of naive truth.

10. Acknowledgements

I am grateful to Paul Egre, Julien Murzi and an anonymous reviewer for helpful suggestions.

References

[1] Beall, JC, Spandrels of Truth, Oxford University Press, Oxford UK, 2009.
[2] Beall, JC and J. Murzi, "Two Flavors of Curry's Paradox", Journal of Philosophy vol. 110 (2013), pp. 143-65.
[3] Boolos, G., The Logic of Provability, Cambridge University Press, Cambridge UK, 1993.
[4] Cobreros, P., P. Egre, D. Ripley and R. van Rooij, "Reaching Transparent Truth", forthcoming in Mind.
[5] Field, H., Saving Truth From Paradox, Oxford University Press, Oxford UK, 2008.
[6] Field, H., "What Is Logical Validity?", in Colin Caret and Ole Hjortland, eds., Foundations of Logical Consequence, Oxford University Press, Oxford UK, forthcoming.
[7] Kripke, S., "Outline of a Theory of Truth", Journal of Philosophy vol. 72 (1975), pp. 690-716.
[8] Whittle, B., "Dialetheism, Logical Consequence, and Hierarchy", Analysis vol. 64 (2004), pp. 318-26.

Department of Philosophy, New York University, 5 Washington Place, New York NY 10003
Department of Philosophy, University of Birmingham, ERI Building, Edgbaston UK B15 2TT
hf18@nyu.edu

20 To remind you of the others: First, as discussed in Section 4, the "V-Schema" goes beyond the apparently innocent

Val(〈A〉, 〈B〉) if and only if A ⊢ B,

for it involves an extra '⊢' on the left hand side; and in any case, even the V-Schema doesn't suffice for VD. Second, as discussed under (B) in Section 8, even the apparently innocent schema indented above might not be so innocent, given that our intuitive reasoning is too complex to be codified in a predicate we can recognize as corresponding to it. Third (asserted but not here defended), stratifying validity would not be nearly as crippling as stratifying truth. (And see also the last paragraph of Section 8.)