  5:       Greg Restall∗ Philosophy Department, The University of Melbourne restall@unimelb.edu.au  1 February 9, 2006 Abstract: In this paper I introduce a sequent system for the propositional modal logic 5. Derivations of valid sequents in the system are shown to correspond to proofs in a novel natural deduction system of circuit proofs (reminiscient of proofnets in linear logic [9, 15], or multiple-conclusion calculi for classical logic [22, 23, 24]). The sequent derivations and proofnets are both simple extensions of sequents and proofnets for classical propositional logic, in which the new machinery-to take account of the modal vocabulary-is directly motivated in terms of the simple, universal Kripke semantics for 5. The sequent system is cut-free (the proof of cut-elimination is a simple generalisation of the systematic cut-elimination proof in Belnap's Display Logic [5, 21, 26]) and the circuit proofs are normalising. This paper arises out of the lectures on philosophical logic I presented at Logic Colloquium 2005. Instead of presenting a quick summary of the material in the course, I have decided to write up in a more extended fashion the results on proofnets for 5. I think that this is the most original material covered in the lectures, and the techniques and ideas presented here gives a flavour of the approach to proof theory I took in the rest of the material in those lectures. » « ∗Thanks to Conrad Asmus, Lloyd Humberstone and Allen Hazen for many helpful discussions when I was developing the material in this paper, and to Bryn Humberstone, Adrian Pearce, Graham Priest and the rest of the audience of the University of Melbourne Logic Seminar for encouraging feedback in the early stages of this research. 
Thanks also to audiences at Logic Colloquium 2005 in Athens and ESSLLI'05 in Edinburgh, where I presented this material in courses on proof theory, to seminar audiences at Bath, Nottingham, Oxford, Paris and Utrecht, and in particular to Samson Abramsky, Denis Bonnay, Bob Coecke, Matthew Collinson, Melvin Fitting, Alessio Guglielmi, Hannes Leitgeb, Neil Leslie, Øystein Linnebo, Richard McKinley, Sara Negri, Luke Ong, David Pym, Helmut Schwichtenberg, Benjamin Simmenauer, Phiniki Stouppa, and Albert Visser for fruitful conversations on these topics, and to the Philosophy Faculty at the University of Oxford and Dan Isaacson, where this paper was written. Thanks to an anonymous referee for comments that helped clarify a number of issues. ¶ Comments posted at http://consequently.org/writing/s5nets are most welcome. ¶ This research is supported by the Australian Research Council, through grant 0343388.

The modal logic S5 is the most straightforward propositional modal logic, at least when you consider its models. The Kripke semantics for S5 is just about the smallest modification to classical propositional logic that you can make once you add the idea that propositions may vary in truth value from context to context. We add just one new operator, □, with the proviso that □A is true in a context when and only when A is true in every context. (The dual operator ♦ is definable in terms of □ in the usual way. We could start with ♦ as primitive, and then □ is the definable connective. Nothing hangs here on the choice of □ as primitive.)

The modal logic S5 has very simple models. A (universal) S5 frame is a nonempty set P of points. An evaluation relation ⊩ is an arbitrary relation between points and atomic formulas. A (universal) S5 model ⟨P, ⊩⟩ is a frame together with an evaluation relation on that frame. Given a model, the evaluation relation may be extended to the entire modal language as follows:

• x ⊩ A ∧ B iff x ⊩ A and x ⊩ B.
• x ⊩ ¬A iff x ⊮ A.
• x ⊩ □A iff for every y ∈ P, y ⊩ A.

A formula □A is true at a point just when A is true at all points. In this case, A is not merely contingently true, but is unavoidably, or necessarily true. (We utilise the primitive vocabulary {∧, ¬, □}, leaving ∨ and → as defined connectives in the usual manner. In addition, the modal operator ♦ for possibility is definable as ¬□¬.)

A formula A is S5-valid if and only if for every model ⟨P, ⊩⟩, for every point x ∈ P, we have x ⊩ A. An argument from premises X to a conclusion A is S5-valid if and only if for each model ⟨P, ⊩⟩, for every x ∈ P, if x ⊩ B for each B ∈ X, then x ⊩ A also. Clearly every classical tautology, and every classically valid argument, is S5-valid. Here are some examples of distinctively modal S5 validities.

    □(A → B) ⊢ □A → □B      □A ∧ □B ⊢ □(A ∧ B)      ⊢ □(A ∨ ¬A)
    □A ⊢ A      □A ⊢ □□A      A ⊢ □¬□¬A

When it comes to models, S5 is simple. Models for other modal logics complicate things by relativising possibility. (A point y is possible from the point of view of the point x, and to evaluate □A at point x, we consider merely the points that are possible relative to x.) You can then find interesting modal logics by constraining the behaviour of relative possibility in some way or other (is it reflexive, transitive, etc.). The logic S5 can be seen as a system in which relative possibility has disappeared (possibility is unrelativised) or equivalently, as one in which relative possibility has a number of conditions governing it: typically, reflexivity, transitivity and symmetry. Once relative possibility is an equivalence relation, from the perspective of a point inside some equivalence class you can ignore the points outside that class with no effect on the satisfaction of formulas, and the model may as well be universal.
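The evaluation clauses and the definition of S5-validity above can be tried out mechanically. The following is a minimal sketch, not part of the paper: the tuple encoding of formulas and the names `holds`, `models` and `entails` are assumptions made for illustration. It also assumes that points may be identified with sets of true atoms, which loses nothing in a universal model, since two points that agree on the atoms satisfy the same formulas.

```python
from itertools import combinations, product

# Formulas are nested tuples (an encoding assumed for this sketch):
#   ("atom", "A"), ("not", F), ("and", F, G), ("box", F)

def holds(points, x, formula):
    """x ⊩ formula in the universal model whose points are sets of true atoms."""
    tag = formula[0]
    if tag == "atom":
        return formula[1] in x
    if tag == "not":
        return not holds(points, x, formula[1])
    if tag == "and":
        return holds(points, x, formula[1]) and holds(points, x, formula[2])
    if tag == "box":
        # □F is true at x when F is true at every point of the model.
        return all(holds(points, y, formula[1]) for y in points)
    raise ValueError(f"unknown connective: {tag}")

def models(atoms):
    """Yield every nonempty set of valuations over the given atoms."""
    valuations = [frozenset(a for a, bit in zip(atoms, bits) if bit)
                  for bits in product([False, True], repeat=len(atoms))]
    for size in range(1, len(valuations) + 1):
        yield from combinations(valuations, size)

def entails(premises, conclusion, atoms):
    """Brute-force test of the S5-validity of the argument premises ⊢ conclusion."""
    for points in models(atoms):
        for x in points:
            if (all(holds(points, x, b) for b in premises)
                    and not holds(points, x, conclusion)):
                return False
    return True

def neg(f):
    return ("not", f)

def box(f):
    return ("box", f)

A = ("atom", "A")
box_a_to_a = entails([box(A)], A, ["A"])              # □A ⊢ A
b_axiom = entails([A], box(neg(box(neg(A)))), ["A"])  # A ⊢ □¬□¬A
a_to_box_a = entails([A], box(A), ["A"])              # A ⊢ □A, not valid
```

The search is exhaustive only relative to the listed atoms, so the list passed as `atoms` should contain every atom occurring in the formulas being tested.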
In other words, you can consider S5 as a logic in which there is not much machinery at all (there is no relation of relative possibility), or as one in which there is quite a bit of machinery (we have a notion of relative possibility with a number of conditions governing it). This difference in perspectives plays a role when it comes to the proof theory for S5.

Despite the simplicity of the formal semantics, providing a natural account of proof in S5 has proved to be a difficult task. We have little idea of what a natural account of proofs in S5 might look like. There are sequent systems for S5, but the most natural and straightforward of these are not cut-free [20]. The cut-free sequent systems in the literature tend to be quite complicated [10, 19], partly because they treat S5 as a logic with many rules (that is, the systems cover many modal logics, and S5 is treated as a logic in which relative possibility has a number of features, so we have many different rules governing the behaviour of relative possibility), or they are quite some distance from Gentzen's straightforward sequent system for classical propositional logic [5, 26].¹ On the other hand, sequent systems can be modified by multiplying the kind or number of sequents that are considered [3, 16], or by keeping a closer eye on how formulas are used in a deduction [7]. These approaches are closest to the one that I shall follow here, but the present approach brings something new to the discussion.

In this paper I introduce and defend a simple sequent system for S5. The main novelty of this result is that the generalised sequents of this system (superficially similar, at least, to hypersequents [3]) have a straightforward interpretation both in terms of the models for S5, and in terms of natural deduction proofs for this modal logic. Sequent derivations are, in a clear and principled manner, descriptions of underlying proofs.
1

Our aim is to defend a simple, cut-free sequent calculus for the modal logic S5, in which derivations correspond in some meaningful way to constructions of proofs. The guiding idea for this quest looks back to the original motivation of the sequent system for intuitionistic propositional logic [13]. For Gentzen, a derivation of an intuitionistic sequent of the form X ⊢ A is not merely a justification of the inference from X to A, and the sequent system is not merely a collection of rules with some pleasing formal properties (each connective having a left rule and a right rule, the subformula property, etc.). Instead, the derivation can be seen as a recipe for the construction of a natural deduction proof of the conclusion A from the premises X.

¹ Display logic is a fruitful way of constructing sequent systems for a vast range of logical systems, but it comes at the cost of a significant distance from traditional sequent systems. We do not extend the sequent system for classical logic with new machinery to govern modality. We must strike at the heart of the sequent system to replace the rules for negation, at the cost of a proliferation of the number of sequent derivations.

For example, the derivation of the sequent A → B ⊢ (C → A) → (C → B):

    A ⊢ A    B ⊢ B
    ───────────────
    A → B, A ⊢ B        C ⊢ C
    ──────────────────────────
    A → B, C → A, C ⊢ B
    ────────────────────
    A → B, C → A ⊢ C → B
    ──────────────────────────
    A → B ⊢ (C → A) → (C → B)

may be seen to guide the construction of the following natural deduction proof.

                [C → A]$    [C]∗
                ────────────────
    A → B              A
    ────────────────────
              B
          ───────── (∗)
            C → B
    ───────────────────── ($)
    (C → A) → (C → B)

However, a proof may be constructed in more than one way. The first three lines of the proof (from A → B, C → A, C to B) may be analysed by the different derivation

    A ⊢ A    C ⊢ C
    ───────────────
    C → A, C ⊢ A        B ⊢ B
    ──────────────────────────
    A → B, C → A, C ⊢ B

In this case, the natural deduction proof constructed is no different, but the analysis varies.
Instead of thinking of the tree as starting with a proof from A → B and A to B (that is, A → B, A ⊢ B) and then justifying the premise A by means of the addition of the two extra premises C → A and C, we think of the proof as starting with the proof from C → A and C to A, and then we add the premise A → B to deduce B. So, the sequent rules

    [→L]   from  X ⊢ A  and  Y, B ⊢ C   infer  X, Y, A → B ⊢ C
    [→R]   from  X, A ⊢ B   infer  X ⊢ A → B

can be seen as being motivated and justified by considerations of natural deduction inferences. The rule [→L] can be motivated by the thought that if we have a proof π1 of A from X and another proof π2 from B to C (with extra premises Y) then we may use π1 to deduce A from X, and use the new premise A → B to deduce (using an implication elimination in the natural deduction system) B. Now using Y and the newly justified B, we may add the proof π2 to deduce the desired conclusion C. In other words, X, Y, A → B ⊢ C. The rule [→R] is motivated similarly. If we have a proof π from X, A to B, then we may discharge A to deduce X ⊢ A → B.²

² There are niceties here about how many instances of A are discharged, and whether sequents have lists, multisets, or sets of formulas on the left-hand side. Most likely the structural rule of contraction will play a role at this point.

These two derivations of the sequent A → B, C → A, C ⊢ B differ in the order of the application of the [→L] rules. In some sense, this difference is merely "bureaucratic": the sequent system imposes a difference (you must apply either this rule or that rule first) when the natural deduction proof does not (the rules are applied; the order is only imposed when we decide to read the proof from top to bottom, or from bottom to top, or from the inside out or in some other way).
There is an important sense in which the sequent system, as a theory of proof, is parasitic on a prior notion of proof found in natural deduction. Some of the merely bureaucratic differences in the sequent calculus are absent from the natural deduction system. This increase in bureaucracy is not without its virtues, of course. The sequent system makes explicit what is implicit in natural deduction proofs. The sequent A → B, C → A, C ⊢ B tells us quite explicitly that at the stage of the proof at which B is the conclusion, the premises A → B, C → A and C are all undischarged. This can only be "read off" the natural deduction proof with some skill. You must look down from B to notice that the two discharges (∗) and ($) occur below, and hence that at the point of the proof where B is deduced, C → A and C are still active.

In the rest of this paper, I aim to do the same thing for the modal logic S5. Instead of taking the sequent calculus for classical propositional logic and modifying it, we will first endeavour to construct a natural deduction proof theory for S5, and from this, reconstruct a sequent calculus that makes explicit the kinds of implicit inferential relationships between premises and conclusions that are found in our proofs.

2

The sequent calculus for classical logic uses sequents with multiple formulas on each side of the turnstile: a sequent has the form X ⊢ Y where both X and Y may involve more than one (or less than one) formula. If a derivation of the sequent X ⊢ A constructs a proof from premises X to conclusion A, then it is natural to think of a derivation ending in X ⊢ Y as constructing a proof π with the formulas in X as premises or inputs and the formulas in Y as conclusions, or outputs. We could think of a proof as having a shape like this:

    A1   A2   ⋯   An
    ─────────────────
    B1   B2   ⋯   Bm

This is a very natural idea. It goes back at least to William Kneale, who introduced his tables of development in the 1950s [17].
The simple natural deduction rules for conjunction and negation are these:

    from  A  and  B   infer  A ∧ B
    from  A ∧ B   infer  A
    from  A ∧ B   infer  B
    from  A  and  ¬A   infer  nothing (no conclusions)
    from  nothing (no premises)   infer  A  and  ¬A (two conclusions)

Tables of development are found by chaining basic inferences together formula-to-formula. Here is a proof of the conclusion ¬(A ∧ ¬A).

    ────────────────────        ────────────────────
    ¬(A ∧ ¬A)    A ∧ ¬A         A ∧ ¬A    ¬(A ∧ ¬A)
                 ──────         ──────
                   A              ¬A
                   ───────────────────

Notice that it has two instances of the one conclusion ¬(A ∧ ¬A). (This phenomenon is just like the case of the simple Gentzen-style natural deduction proof of A ∧ ¬A ⊢ ⊥, which has two instances of the premise A ∧ ¬A: one to justify A and the other to justify ¬A, which are then combined to infer the falsum ⊥.) In what follows, we will call this proof of ¬(A ∧ ¬A), 'π'. The proof π corresponds to a derivation δ of the sequent ⊢ ¬(A ∧ ¬A), ¬(A ∧ ¬A). In the sequent calculus we may chain two instances of δ together with an application of a [∧R] rule, to derive ¬(A ∧ ¬A) ∧ ¬(A ∧ ¬A).

              δ                                 δ
    ⊢ ¬(A ∧ ¬A), ¬(A ∧ ¬A)            ⊢ ¬(A ∧ ¬A), ¬(A ∧ ¬A)
    ────────────────────── [WR]       ────────────────────── [WR]
         ⊢ ¬(A ∧ ¬A)                       ⊢ ¬(A ∧ ¬A)
    ─────────────────────────────────────────────────── [∧R]
                ⊢ ¬(A ∧ ¬A) ∧ ¬(A ∧ ¬A)

This (essentially) utilises the rule of contraction on the right of the turnstile. (The steps labelled "WR".) There is no corresponding move in the natural deduction system. If we want to introduce a conjunction, we are free to paste together two instances of π

             π                            π
    ¬(A ∧ ¬A)   ¬(A ∧ ¬A)        ¬(A ∧ ¬A)   ¬(A ∧ ¬A)
                ──────────────────────────
                  ¬(A ∧ ¬A) ∧ ¬(A ∧ ¬A)

but as you can see, we have leftover conclusions ¬(A ∧ ¬A). Each time we add another proof π to provide another conjunct for one conclusion, we add another unconjoined instance of ¬(A ∧ ¬A). This would not matter if there were a proof which concluded in merely one instance of ¬(A ∧ ¬A), but it is easy to see that with these rules there is no such proof. (Proceed by way of an induction on the construction of a proof: every proof has at least either two conclusions, or two premises, or one premise and one conclusion. So, each proof with no premises has at least two conclusions.)
Tables of development, as defined here, are incomplete for classical logic.³ Tables of development face a more prosaic problem too: it is not straightforward to typeset them. It turns out that we can solve both of our problems, the notational problem and the contraction problem, in one go.

³ Patching the system is not a simple matter. The canonical references here are Shoesmith and Smiley's Multiple Conclusion Logic [23] and Ungar's Normalisation, Cut-Elimination, and the Theory of Proofs [24].

It is much more flexible to change our notation completely. Instead of taking proofs as connecting formulas in inference steps, in which formulas are represented as characters on a page, ordered in a tree, think of proofs as taking inputs and outputs, where we represent the inputs and outputs as wires. Wires can be rearranged willy-nilly (we are all familiar with the tangle of cables behind the stereo or under the computer desk), so we can exploit this to represent cut straightforwardly. In our pictures, then, formulas label wires. This change of representation will afford another insight: instead of thinking of the rules as labelling transitions between formulas in a proof, we will think of inference steps (instances of our rules) as nodes with wires coming in and wires going out. Proofs are then circuits composed of wirings of nodes. The nodes for the connectives are then:

    ¬I: no input wires; output wires ¬A and A
    ¬E: input wires ¬A and A; no output wires
    ∧I: input wires A and B; output wire A ∧ B
    ∧E1: input wire A ∧ B; output wire A
    ∧E2: input wire A ∧ B; output wire B

The proof π for ⊢ ¬(A ∧ ¬A), ¬(A ∧ ¬A) is now represented as follows:

[Figure: the circuit for π, built from two ¬I nodes, an ∧E1 node, an ∧E2 node and a ¬E node, with wires labelled ¬(A ∧ ¬A), A ∧ ¬A, A, ¬A, A ∧ ¬A and ¬(A ∧ ¬A).]

(The arrow notation for wires allows us to lay proofs out in such a way that inference need not go from the top of the page to the bottom of the page.)
We can construct a circuit with one conclusion wire by contracting the two original conclusions like this:

[Figure: the circuit for π with its two output wires ¬(A ∧ ¬A) plugged into a WI node, leaving a single output wire ¬(A ∧ ¬A).]

The new WI node corresponds to the contraction of the two conclusions into one in the sequent proof. We can then combine these proofs to obtain the proof of the desired conclusion: ¬(A ∧ ¬A) ∧ ¬(A ∧ ¬A).

[Figure: two copies of the contracted circuit, whose output wires ¬(A ∧ ¬A) are plugged into an ∧I node with output wire ¬(A ∧ ¬A) ∧ ¬(A ∧ ¬A).]

There is much more that one can say about classical circuits. The first detailed presentation of classical proofnets is found in Robinson's 2003 paper [22]. Our style of presentation here follows Blute, Cockett, Seely and Trimble's work on weakly (or linearly) distributive categories [6]. I will leave the detail for the next section, in which we introduce modal operators.

3  S5

We hope to find rules for introducing a □-formula, and for eliminating a □-formula. If these rules are to be anything like the rules in a natural deduction system, they should step from □A to A, and vice versa:

    □E: input wire □A; output wire A
    □I: input wire A; output wire □A

From □A, we can infer A. Similarly, from A (at least, sometimes) we can infer □A. The analogy with rules for the universal quantifier should be clear. From ∀xFx we infer Fa, and if we have derived Fa in a special way (the a is arbitrary) we may infer ∀xFx. In the modal setting, we do not have something playing the role of names. So, we need some other way to ensure that [□E] is stronger than it appears (in the quantifier case, we may infer Fa for any object a) and that [□I] is weaker than it appears (what is the restriction on its application, corresponding to the condition on names for ∀x?). Consider models for the modal operators: if □A is true at a point, what can we infer about A?
It follows that A is true at every point: not just the point at which we derived (or assumed) □A. So, if we infer A from □A, we are free to infer A not only here (in this context) but also there (whatever other context "there" might be). So, we can think of the output A wire in the [□E] node as freely 'applying to' a context other than the one in which we have evaluated □A. It is in a sense such as this that [□E] is stronger than merely the inference that straightforwardly strips the box from the front of the formula.

Consider [□I]. Under what conditions can our inference of A justify the step to □A? We can infer □A when our inference to A is general, that is, when we have inferred A at an arbitrary context. What does it mean for a context to be arbitrary? Here we take our cue from the proof theory for predicate logic. We can infer ∀xFx from some proof of Fa just when the conclusion Fa is the only part of the proof (premises or conclusion) to contain information about a (that is, to be formulas containing the name a). We can do the same thing here. If we have all of the premises and conclusions in our proof applying to a collection of contexts, and only the conclusion A applies to a given context, then we can infer □A, since that context was arbitrary. We have the conclusion of A generally, in a manner which is appropriate for any context.

But contexts are not like names in predicate logic: they do not explicitly show up in the syntax of the logic S5. All that this talk of contexts requires is that we pay attention to whether or not a formula in a proof occurs in the same context as another formula. We can make these suggestive ideas more precise in the following way. We start by defining the class of inductively generated circuits, and the equivalence relation of nearness (ν) on wires in a circuit.
 [  , ] Inductively generated circuits are defined in the following manner. • An identity wire: A for any formula A is an inductively generated circuit. The sole input type for this circuit is A and its output type is also (the very same instance) A. As there is only one wire in this circuit, it is near to itself. • Each boolean connective node presented in the list below is an inductively generated circuit. ¬E ¬A A ¬I ¬A A ∧I A B A ∧ B ∧E1 A ∧ B A ∧E2 A ∧ B B The inputs of a node are those wires pointing into the node, and the outputs of a node are those wires pointing out. The input and output wires of a each of these nodes are in the same nearness equivalence class. • Given an inductively generated circuit π with an output wire labelled A, and an inductively generated circuit π ′ with an input wire labelled A, we obtain a new inductively generated circuit in which the output wire of π is plugged in to the input wire of π ′. The output wires of the new circuit are the output wires of π (except for the indicated A wire) and the output wires of π ′, and the input Greg Restall, restall@unimelb.edu.au  1 February 9, 2006 http://consequently.org/writing/s5nets 10 wires of the new circuit are the input wires of π together with the input wires of π ′ (except for the indicated A wire). A wire in the new circuit near another wire if and only if either those two wires are near in π or close in π ′, or one wire is near to the ouputA in π and the other wire is close to the input A in π ′. (In other words, the equivalence classes for ν on the new circuit are those classes in the old circuit, except for the classes for the wire at the point of composition. The two classes for this wire are merged.) • Given an inductively generated circuit π with two input wires A, a new inductively generated circuit is formed by plugging both of those input wires into the input contraction node  . 
In the new circuit, the relation ν is the same as the original relation, except that the classes for the two contracted input wires are merged, and the new single input A is in the same class. Similarly, two output wires with the same label may be extended with a contraction node  . The two output wires are now near in the new circuit, as before. • Given an inductively generated circuit π, we may form a new circuit with the addition of a new output, or output wire (with an arbitrary label) using a weakening node  or  .4 π X Y KI B π X Y KE B The new wires are not near any other wires in the proof. (They are arbitrary extra conclusions or premises, and they could well be in any context.) • A E node is also an inductively generated circuit. In this node, the input wire A is not nearby the output wire A. E A A • Given an inductively generated circuit π in which a conclusion wire A is not nearby any other conclusion wire, and is not nearby any premise wire, then the result of plugging inI to the conclusion wireA is a new inductively generated circuit. The new conclusion A is not nearby any other wire of the circuit. I A A 4Using an unlinked weakening node like this makes some circuits disconnected. It also forces a great number of different sequent derivations to be represented by the same circuit. Any derivation of a sequent of the form X ` Y, B in which B is weakened in at the last step will construct the same circuit as a derivation in which B is weakened in at an earlier step. If this identification is not desired, then a more complicated presentation of weakening, using the 'supporting wire' of Blute, Cockett, Seely and Trimble [6] is possible. Here, I opt for a simple presentation of circuits rather than a comprehensive account of "proof identity." Greg Restall, restall@unimelb.edu.au  1 February 9, 2006 http://consequently.org/writing/s5nets 11 This completes our definition of the proofs for 5. 
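The bookkeeping in this definition can be sketched in code. The sketch below is an illustration under stated assumptions, not the paper's own machinery: it records only the periphery of a circuit (its input and output wires), tags each wire with a number standing for its nearness class, and implements just the identity wire, the negation nodes, composition, [□E] and [□I]. The names `Circuit`, `compose`, `box_intro` and the use of strings as formulas are all hypothetical. Tracking only the periphery is enough for the [□I] side condition, because that condition mentions premise and conclusion wires only.

```python
from dataclasses import dataclass
from itertools import count

_fresh = count()  # supplies fresh names for nearness classes

@dataclass(frozen=True)
class Circuit:
    inputs: tuple   # pairs (label, nearness class)
    outputs: tuple  # pairs (label, nearness class)

def identity(a):
    z = next(_fresh)
    return Circuit(((a, z),), ((a, z),))

def neg_elim(a):
    z = next(_fresh)  # both input wires of the node are near each other
    return Circuit(((a, z), (f"¬{a}", z)), ())

def neg_intro(a):
    z = next(_fresh)  # both output wires of the node are near each other
    return Circuit((), ((a, z), (f"¬{a}", z)))

def box_elim(a):
    # The input wire □a and the output wire a are NOT near each other.
    return Circuit(((f"□{a}", next(_fresh)),), ((a, next(_fresh)),))

def compose(left, right, label):
    """Plug an output `label` of left into an input `label` of right.

    Assumes left and right were built separately, so they share no classes.
    """
    i = [l for l, _ in left.outputs].index(label)
    j = [l for l, _ in right.inputs].index(label)
    keep, gone = left.outputs[i][1], right.inputs[j][1]

    def rename(wires):  # merge the two classes at the point of composition
        return tuple((l, keep if z == gone else z) for l, z in wires)

    return Circuit(
        left.inputs + rename(right.inputs[:j] + right.inputs[j + 1:]),
        left.outputs[:i] + left.outputs[i + 1:] + rename(right.outputs),
    )

def box_intro(circuit, label):
    """Apply □I to an output wire that is near no other peripheral wire."""
    i = [l for l, _ in circuit.outputs].index(label)
    zone = circuit.outputs[i][1]
    rest = circuit.outputs[:i] + circuit.outputs[i + 1:]
    if any(z == zone for _, z in circuit.inputs + rest):
        raise ValueError(f"{label} is near another peripheral wire")
    return Circuit(circuit.inputs, rest + ((f"□{label}", next(_fresh)),))

# Build a circuit from A to □¬□¬A out of ¬E, □E, ¬I and □I nodes.
step1 = compose(box_elim("¬A"), neg_elim("A"), "¬A")   # inputs □¬A and A
step2 = compose(neg_intro("□¬A"), step1, "□¬A")        # input A, output ¬□¬A
proof = box_intro(step2, "¬□¬A")                       # input A, output □¬□¬A
```

By contrast, `box_intro(identity("A"), "A")` raises `ValueError`, since the output wire of an identity circuit is near its input wire. Contraction and weakening nodes are left out to keep the sketch short.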
Inductively generated circuits represent valid reasoning in S5. Here is an example, showing how one can derive □¬□¬A from A. The circuit below has A as its only input, and □¬□¬A as its only output.

[Figure: a circuit with nodes ¬E, □E, ¬I and □I, and wires labelled A, ¬A, □¬A, ¬□¬A and □¬□¬A.]

It is a useful exercise to show that this circuit may be inductively generated from left to right. The sub-circuit

[Figure: the sub-circuit with nodes ¬E, □E and ¬I, and wires labelled A, ¬A, □¬A and ¬□¬A.]

is inductively generated, because each of the nodes is itself a circuit. In this circuit, the equivalence relation ν relates the A and ¬A wires, and it relates the □¬A and ¬□¬A wires. But the nearness relation does not relate the wires on the left to the wires on the right. As a result, we may apply [□I], since the output wire ¬□¬A is not near to any other wire on the periphery of the circuit. The result is the complete circuit with input A and output □¬□¬A.

This proof tells us more than simply that in any model, in any world where A is true, □¬□¬A is true (though it does tell us this too). Since the output wire □¬□¬A is not near to the input wire A, it tells us that there is no model at all where there is a world where A is true and a world where □¬□¬A is not true. Those worlds need not be the same. To speak in terms of contexts, it is incoherent to assert A in one context and to deny □¬□¬A in another context. This is an example of the following general result, on the soundness of inductively generated circuits.

Theorem [soundness] Given an inductively generated circuit with input wires X1, …, Xn and output wires Y1, …, Yn, where each Xi ∪ Yi is an equivalence class for the nearness relation, then for any S5 model, there is no set w1, …, wn of worlds where each Xi is true at wi and each Yi is false at wi.

Proof: The proof is a trivial induction on the construction of the proof. Identity, boolean nodes, contraction and weakening are all immediate. The cut rule is a simple consequence of the transitivity of consequence in S5 models.
For [E] we note that there is no model in which there is no pair of worlds, where A is true in one andA is false in the other. For [I], we note that if there there is no model satisfying some condition (concerning the rest of the wires in the proof π except for the one output A which is near no other wire in the periphery) where there is a world in whichA is false, then in these models there is noworld in which A is false, and hence, there no world in which A is false either. But this is the condition for [I]. Greg Restall, restall@unimelb.edu.au  1 February 9, 2006 http://consequently.org/writing/s5nets 12 So, circuits encode valid reasoning in our models. To show that they encode all of the validities of our models, we need a completeness proof. To discuss the completeness proof, we will examine another way of representing the behaviour of circuits. 4 5  We may represent the periphery of a circuit as a general sequent, in which the input wires are formulas in antecedent position, and the output wires are formulas in consequent position. However, this leaves out the nearness relation, which we need to model the behaviour of modal operators. So, in a sequent, we will keep track of the nearness of formulas. One way to do this is by segregating formulas into equivalence classes, and in those classes, into antecedent and consequent position. The picture, then, is of a hypersequent5 X1 ` Y1 | * * * | Xn ` Yn a multiset of sequents, in which each Xi and Yi is a multiset of formulas.6 We think of the sequent Xi ` Yi as forming one of the zones of the hypersequent. The hypersequent calculus for s5 has the following connective rules:7 X ` A, Y | ∆ [¬L] X,¬A ` Y | ∆ X,A ` Y | ∆ [¬R] X ` ¬A, Y | ∆ X,A ` Y | ∆ [∧L1] X,A ∧ B ` Y | ∆ X,B ` Y | ∆ [∧L2] X,A ∧ B ` Y | ∆ X ` A, Y | ∆ X ′ ` B, Y ′ | ∆ ′ [∧R] X,X ′ ` A ∧ B, Y, Y ′ | ∆ | ∆ ′ X,A ` Y | ∆ [L] A ` | X ` Y | ∆ ` A | ∆ [R] ` A | ∆ which are motivated by way of the rules for constructing circuits. 
For [¬L], if we have a circuit in which A is an output formula, then we may expand the circuit by adding a [¬E] node, plugged in at the A wire, which will give us a circuit in which ¬A is an input wire. It is nearby all and only the formulas that are nearby to the A, and so, in the hypersequent, it is a part of the same zone. Similarly, for [□R], if we have a circuit in which A is an output wire, adjacent to no other wires on the periphery of the circuit (so, we have a sequent in which ⊢ A is in a zone of its own), then we may add a [□I] node at this point, and the new output □A is nearby no other point in the circuit; that is, ⊢ □A is in a zone of its own.

⁵ These are hypersequents due to Arnon Avron [1, 2, 3, 27]. However, the account here differs in two ways from Avron's presentation. First, hypersequents are motivated in terms of an underlying deductive machinery. Second, the behaviour of the modal operators is captured by a single pair of left and right rules. There is no special "modal splitting rule" connecting hypersequents and the modal operators.

⁶ In other words, the one hypersequent may be presented as p ⊢ q, r | s, t ⊢ u or as t, s ⊢ u | p ⊢ r, q, but this is not the same as the hypersequent p, p ⊢ q, r | s, t ⊢ u | s, t ⊢ u. The order of formulas or zones in a hypersequent does not matter (in just the same way that the order of wires does not matter in a circuit) but the number of instances of formulas does (just as it does in a circuit).

⁷ To save space, I present the rules for conjunction, but not disjunction. You can think of disjunction as a defined connective, or you can use the obvious rules for disjunction, dual to these rules for conjunction.

The appropriate rules for identity and cut are straightforward:

    [Id]    A ⊢ A
    [Cut]   from  X ⊢ A, Y | ∆  and  X′, A ⊢ Y′ | ∆′   infer  X, X′ ⊢ Y, Y′ | ∆ | ∆′

With the system as it stands, we may make a number of derivations.
$$
\dfrac{\dfrac{\dfrac{\dfrac{A \vdash A}{\neg A, A \vdash}\;[\neg L]}{\Box A \vdash {} \mid \neg A \vdash}\;[\Box L]}{\Box A \vdash {} \mid {} \vdash \neg\neg A}\;[\neg R]}{\Box A \vdash {} \mid {} \vdash \Box\neg\neg A}\;[\Box R]
$$

$$
\dfrac{\dfrac{\dfrac{\dfrac{A \vdash A}{\Box A \vdash {} \mid {} \vdash A}\;[\Box L]}{\Box A \wedge \Box B \vdash {} \mid {} \vdash A}\;[\wedge L]
\qquad
\dfrac{\dfrac{B \vdash B}{\Box B \vdash {} \mid {} \vdash B}\;[\Box L]}{\Box A \wedge \Box B \vdash {} \mid {} \vdash B}\;[\wedge L]}
{\Box A \wedge \Box B \vdash {} \mid \Box A \wedge \Box B \vdash {} \mid {} \vdash A \wedge B}\;[\wedge R]}
{\Box A \wedge \Box B \vdash {} \mid \Box A \wedge \Box B \vdash {} \mid {} \vdash \Box(A \wedge B)}\;[\Box R]
$$

Clearly, to be able to derive all of the valid sequents, we must add a few structural rules. To mimic the behaviour of circuits closely, we allow contraction inside zones in a circuit, and weakening into a new zone.

$$
\frac{X, A, A \vdash Y \mid \Delta}{X, A \vdash Y \mid \Delta}\;[WL]
\qquad
\frac{X \vdash A, A, Y \mid \Delta}{X \vdash A, Y \mid \Delta}\;[WR]
\qquad
\frac{\Delta}{A \vdash {} \mid \Delta}\;[KL]
\qquad
\frac{\Delta}{{} \vdash A \mid \Delta}\;[KR]
$$

Finally, to ensure that we can derive all of the valid hypersequents, we need to be able to throw away information by merging zones in sequents.

$$
\frac{X \vdash Y \mid X' \vdash Y' \mid \Delta}{X, X' \vdash Y, Y' \mid \Delta}\;[\mathit{merge}]
$$

This rule in a sequent proof has no parallel node in the structure of a circuit.⁸ It corresponds to taking a circuit and merging two zones, that is, taking two equivalence classes to coalesce. One simple example is taking the circuit consisting of a [□E] node alone, with input □A and output A, to prove for us □A ⊢ A (that there's no model with a world w in which □A is true and A is false). This is throwing away information, as the circuit can also be read as telling us that □A ⊢ | ⊢ A (that there's no model with a world w at which □A is true and a world w′ at which A is false). This is a more general fact. There is no harm in throwing away information, and it is helpful to have a rule such as this when it comes to proving completeness, to the effect that any valid hypersequent is provable.⁹

⁸ Actually, the effect of a merge can be found by contracting two instances of A in different zones in the proof. Then X, A ⊢ Y | X′, A ⊢ Y′ merge to become X, X′, A ⊢ Y, Y′. It seemed too confusing to introduce contraction in this more general form. It can be modelled straightforwardly as an application of merge and then [WL].

Before moving on to consider completeness, we will state, without proof, the fact that motivated the construction of this sequent system.

Definition. A hypersequent X1 ⊢ Y1 | ··· | Xn ⊢ Yn decorates a circuit if and only if the input wires for the circuit are X1, ..., Xn, the output wires are Y1, ..., Yn, and if two wires are close in the circuit, they appear in the same zone in the hypersequent.¹⁰

Theorem. For each inductively generated circuit, and for any hypersequent decorating that circuit, there is a derivation of that hypersequent. Conversely, for any derivation of a hypersequent, there is an inductively generated circuit decorated by that hypersequent.

5

In the next section, I will cover quite quickly some properties of the sequent system. The discussion is necessarily (for reasons of space) compressed. The aim is to explore the behaviour of this presentation of S5.

Definition. A hypersequent X1 ⊢ Y1 | ··· | Xn ⊢ Yn is valid in a model if and only if there are no worlds w1, ..., wn in that model in which each formula in Xi is true at wi and each formula in Yi is false at wi.

The soundness theorem, proved in the section before last, then, may be restated as saying that the hypersequent corresponding to an inductively generated circuit (that is, a derivable hypersequent) is valid. The completeness theorem is the converse.

Theorem. A valid hypersequent is derivable.

This result may be proved in a number of ways. One is simple, but it relies upon a prior completeness result.
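The definition of validity for hypersequents given above can be checked mechanically on small models. What follows is a minimal Python sketch and not part of the paper's machinery: the tuple encoding of formulas and the names `holds`, `valid_in_model`, `all_models` and `valid_up_to` are assumptions made for illustration. A bounded search over small models can refute a hypersequent, but it only gives evidence of validity, not a proof.

```python
from itertools import product

# Hypothetical encoding (not from the paper): formulas are nested tuples,
# ("atom", "p"), ("not", A), ("and", A, B), ("box", A).
# A universal S5 model is a list of worlds; each world is the set of atoms true there.

def holds(formula, world, model):
    """Truth of a formula at a world (an index into model)."""
    tag = formula[0]
    if tag == "atom":
        return formula[1] in model[world]
    if tag == "not":
        return not holds(formula[1], world, model)
    if tag == "and":
        return holds(formula[1], world, model) and holds(formula[2], world, model)
    if tag == "box":
        # Box A is true when A is true at every world of the model.
        return all(holds(formula[1], w, model) for w in range(len(model)))
    raise ValueError(f"unknown connective: {tag!r}")

def valid_in_model(hypersequent, model):
    """A hypersequent is a list of zones (antecedent, consequent).
    It is valid in the model iff no choice of one world per zone makes
    every antecedent formula true and every consequent formula false."""
    for choice in product(range(len(model)), repeat=len(hypersequent)):
        refuted = all(
            all(holds(a, w, model) for a in antecedent)
            and not any(holds(c, w, model) for c in consequent)
            for (antecedent, consequent), w in zip(hypersequent, choice)
        )
        if refuted:
            return False
    return True

def all_models(atoms, max_worlds):
    """Enumerate every model over the given atoms with 1..max_worlds worlds."""
    valuations = [
        frozenset(a for a, bit in zip(atoms, bits) if bit)
        for bits in product([False, True], repeat=len(atoms))
    ]
    for size in range(1, max_worlds + 1):
        for worlds in product(valuations, repeat=size):
            yield list(worlds)

def valid_up_to(hypersequent, atoms, max_worlds=3):
    """Bounded check: valid in every enumerated model (evidence only)."""
    return all(valid_in_model(hypersequent, m) for m in all_models(atoms, max_worlds))

p = ("atom", "p")
box_p_zone_p = [([("box", p)], []), ([], [p])]   # box p |- (empty)  |  (empty) |- p
not_p_zone_p = [([], [("not", p)]), ([], [p])]   # |- not p  |  |- p
print(valid_up_to(box_p_zone_p, ["p"]))  # True
print(valid_up_to(not_p_zone_p, ["p"]))  # False
```

The second hypersequent fails in the two-world model where p is true at one world and false at the other, which shows why zones, unlike the two sides of a single sequent, may be evaluated at different worlds.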
Proof []: (i) Convert each hypersequent into a formula which is derivable if and only if the hypersequent is derivable, and valid if and only if the formula is valid, and then show that (ii) every axiom in some axiomatisation of 5 is derivable, and the rules in that axiomatisation preserve derivability. Stage (i) is simple. Convert each sequent X ` Y inside a hypersequent to ` ¬( ∧ X ∧ ¬ ∨ Y). The resulting hypersequent is derivable if and only if the 9The situation is somewhat analagous with the role of weakening in the sequent system for intuitionistic propositional logic and the natural deduction system. There is no normal natural deduction proof from premises p, q to conclusion p, but there is a sequent derivation of p, q ` p. We take the identity proof from p to p (consisting of the formula itself ) to tell us not only that p ` p, but also that p, X ` p for any collection of formulas X. 10This allows A ` A to decorate the single [E] node, as well as A ` | ` A. Greg Restall, restall@unimelb.edu.au  1 February 9, 2006 http://consequently.org/writing/s5nets 15 original hypersequent is derivable, and valid if and only if the original hypersequent is valid. Then, encode a hypersequent of the form ` A1 | * * * | ` An as a particular formula in the form ` A1 ∨ A2 ∨ * * * ∨ An and this, too, is co-derivable and co-valid with the original hypersequent.11 For the second part, show that every axiom in your favourite axiomatisation of 5 is derivable in the sequent system. The verification of this part is routine. To show that modus ponens (say in the form of the inference from ¬(A∧¬B) and A to B) preserved derivability, we must use the rule cut, to extend the derivations as follows: *** ` A *** ` ¬(A ∧ ¬B) A ` A B ` B [¬R] ` ¬B,B [∧R] A ` A ∧ ¬B,B [¬L] ¬(A ∧ ¬B), A ` B [Cut] A ` B [Cut] ` B That proof is simple, but it does not tell us much about the proof system. It is more interesting to prove completeness directly. 
Proof [model construction]: Given an underivable hypersequent, we construct a model in which that hypersequent is invalid. One way to do this is to show that any underivable sequent must have an unsuccessful derivation search, from which a model can be constructed. This technique can succeed without the use of the cut rule. Firstly, notice that the following rules can be derived on the basis of the connective rules (and contractions, merges and weakenings).

$$
\frac{X, \neg A \vdash A, Y \mid \Delta}{X, \neg A \vdash Y \mid \Delta}\;[\neg L_s]
\qquad
\frac{X, A \vdash \neg A, Y \mid \Delta}{X \vdash \neg A, Y \mid \Delta}\;[\neg R_s]
$$

$$
\frac{X, A, B, A \wedge B \vdash Y \mid \Delta}{X, A \wedge B \vdash Y \mid \Delta}\;[\wedge L_s]
\qquad
\frac{X \vdash A, A \wedge B, Y \mid \Delta \qquad X \vdash B, A \wedge B, Y \mid \Delta}{X \vdash A \wedge B, Y \mid \Delta}\;[\wedge R_s]
$$

$$
\frac{X, \Box A \vdash Y \mid X', A \vdash Y' \mid \Delta}{X, \Box A \vdash Y \mid X' \vdash Y' \mid \Delta}\;[\Box L_s]
\qquad
\frac{X \vdash \Box A, Y \mid {} \vdash A \mid \Delta}{X \vdash \Box A, Y \mid \Delta}\;[\Box R_s]
$$

Now consider what happens with an underivable hypersequent. If a hypersequent is underivable, and it has the form of one of the lower hypersequents in the rules displayed above, then one of the hypersequents above that line must also be underivable. In particular, that means that we do not get a hypersequent in which the same formula finds itself on both sides of a turnstile in the one zone. (Any hypersequent containing a zone of the form X, A ⊢ A, Y is derivable, using weakenings and merges.)

¹¹ Why the boxes on all formulas other than one? First, to make the translation of a hypersequent with a single zone the identity translation. Second, the valid hypersequent ⊢ ¬□A | ⊢ A may be translated as ⊢ □¬□A ∨ A, which is also valid.

So, we can think of an underivable hypersequent as a partial description of a model. Each zone partially describes a world. Antecedent formulas are true, and consequent formulas are false. The search rules above tell us that if we have a negation true, its negand is false, and if a negation is false, its negand is true.
Similarly for conjunction, and for necessity: if □A is true, then A is true in each zone, and if □A is false, then there is some zone in which A is false. So, search for a derivation by taking a hypersequent and, whenever we have a formula in a zone that is 'unprocessed' (a negation whose negand is not on the opposite side of the turnstile in that zone, or □A true in a zone but A not appearing in some zone), process it by means of the rules we have seen. (This might require branching in the case of a conjunction in consequent position.) Continue this process. If the original sequent is underivable, the result will be a partial description of a model in which each zone describes a world. The model will falsify the original hypersequent.

This technique (which is, in effect, constructing a tableau system from this sequent calculus) has the advantage of not requiring the cut rule. A corollary of soundness and completeness proved in this way is that cut is admissible. That is, since we know that the cut rule preserves validity in models, and since we know that validity in models is captured exactly by the hypersequents with cut-free derivations, we know that if the premise hypersequents of a cut rule are derivable, so is the endsequent.

This proof tells us nothing about how to convert a proof involving cuts into one that does not use cut. We can adopt the standard cut-elimination technique [13]. My presentation follows Belnap's systematic account in his Display Logic [5, 21], which in turn follows Curry's formulation of the proof [8, page 250]. First, we check that the rules of the hypersequent calculus satisfy a number of conditions.

The first condition is that a Cut on an identity sequent is redundant:

$$
\frac{A \vdash A \qquad X', A \vdash Y' \mid \Delta'}{X', A \vdash Y' \mid \Delta'}\;[\mathit{Cut}]
$$

Clearly, a cut on an identity sequent may be left out completely.

Next we have conditions on parameters in rules.
In our case, a parameter in an inference falling under a rule is every formula except for the major formulas in a connective rule (the formula with the connective introduced below the line and its ancestor formulas above the line), and the cut formulas in a cut rule. Every other formula is a parameter. Parameters may appear both above and below the line. A parametric class is a collection of instances of a formula in a proof. Two formulas are a part of the same parametric class if they are represented by the same letter in a presentation of the rule (the instances of A in an inference of contraction, for example) or if they occur in the same place in a structure (such as an antecedent X or a hypersequent term Δ).

The regularity condition is that if a cut formula is parametric in an inference immediately before the cut, the cut may be permuted above that inference. For example, the segment

$$
\dfrac{\dfrac{X \vdash A, A, Y \mid \Delta}{X \vdash A, Y \mid \Delta}\;[WR] \qquad X', A \vdash Y' \mid \Delta'}{X, X' \vdash Y, Y' \mid \Delta \mid \Delta'}\;[\mathit{Cut}]
$$

can be replaced by this segment, in which cuts take place on the top sequents, at the cost of duplicating material in the derivation.

$$
\dfrac{\dfrac{\dfrac{X \vdash A, A, Y \mid \Delta \qquad X', A \vdash Y' \mid \Delta'}{X, X' \vdash A, Y, Y' \mid \Delta \mid \Delta'}\;[\mathit{Cut}] \qquad X', A \vdash Y' \mid \Delta'}{X, X', X' \vdash Y, Y', Y' \mid \Delta \mid \Delta' \mid \Delta'}\;[\mathit{Cut}]}{X, X' \vdash Y, Y' \mid \Delta \mid \Delta'}\;[W \text{ and } \mathit{merge}]
$$

And similarly,

$$
\dfrac{X \vdash A, Y \mid \Delta \qquad \dfrac{X', A, B \vdash Y' \mid \Delta'}{\Box B \vdash {} \mid X', A \vdash Y' \mid \Delta'}\;[\Box L]}{\Box B \vdash {} \mid X, X' \vdash Y, Y' \mid \Delta \mid \Delta'}\;[\mathit{Cut}]
$$

becomes

$$
\dfrac{\dfrac{X \vdash A, Y \mid \Delta \qquad X', A, B \vdash Y' \mid \Delta'}{X, X', B \vdash Y, Y' \mid \Delta \mid \Delta'}\;[\mathit{Cut}]}{\Box B \vdash {} \mid X, X' \vdash Y, Y' \mid \Delta \mid \Delta'}\;[\Box L]
$$

Two formulas in the same parametric class are in the same position (either antecedent position or consequent position). This is straightforward to check.¹²

Parametric classes have only one member below the line of an inference. This is straightforward to check.
The previous conditions all concern permuting cuts over inferences when one side or other is parametric.

¹² This condition rules out inferences such as "matched weakening", leading from X ⊢ Y to X, A ⊢ A, Y, in which the parametric class for A would appear in both antecedent and consequent position.

A formula is principal in a rule if it is not parametric. The single principal constituent condition is that each inference has only one principal formula below the line. This is immediate.

An instance of cut in which the cut formula is principal in both inferences immediately before the cut may be traded in for a cut (or cuts) on subformulas of the cut formula. The interesting case in our system is for □A. We have:

$$
\dfrac{\dfrac{{} \vdash A \mid \Delta}{{} \vdash \Box A \mid \Delta}\;[\Box R] \qquad \dfrac{X, A \vdash Y \mid \Delta'}{\Box A \vdash {} \mid X \vdash Y \mid \Delta'}\;[\Box L]}{X \vdash Y \mid \Delta \mid \Delta'}\;[\mathit{Cut}]
$$

Clearly we could have made the cut before the introduction:

$$
\frac{{} \vdash A \mid \Delta \qquad X, A \vdash Y \mid \Delta'}{X \vdash Y \mid \Delta \mid \Delta'}\;[\mathit{Cut}]
$$

Given that our system satisfies these conditions, we may eliminate cuts from derivations.

Theorem. Given a derivation in which the rule [Cut] is applied, we may effectively transform this derivation into one in which cut is not used.

Proof: We perform an induction on the complexity of the cut formula. The hypothesis is that for every proper subformula of A (and for every X, X′, Y, Y′, Δ, Δ′) if X ⊢ A, Y | Δ and X′, A ⊢ Y′ | Δ′ are derivable, so is X, X′ ⊢ Y, Y′ | Δ | Δ′, and we wish to show that this is the case for the formula A also. So, suppose we have derivations δ and δ′ of X ⊢ A, Y | Δ and X′, A ⊢ Y′ | Δ′ respectively. If the cut formula A indicated in the concluding inferences of δ and δ′ is principal, then we may apply the eliminability of matching principal constituents condition and our induction hypothesis to eliminate the cut.
If, on the other hand, A is parametric in either δ or δ′, we proceed as follows. Without loss of generality, suppose A is parametric in δ. Consider the class 𝒜 of occurrences of A in δ found by tracing up the derivation and selecting each parametric instance of A congruent with the A in the conclusion of δ. We commute the cut on A (with the other premise X′, A ⊢ Y′ | Δ′) past each inference in which an instance in 𝒜 features, using regularity. The result is a derivation in which there may be many more cuts, but for each cut on A introduced, there are no parametric instances of A in consequent position. For each copy of δ′ introduced, we may form the set 𝒜′ of instances of A congruent with the A in antecedent position in the cut inference. We commute the cut with each inference crossing the set 𝒜′ to construct a derivation in which the cut on A occurs only on principal instances of A, and this case has already been covered.

6

We will end by looking at a number of ways to extend this approach.

Elimination of cuts corresponds quite directly to the normalisation of circuits, by way of the translation between derivations and circuits. The circuit presentation of this system gives us scope for examining other ways in which proofs may be normalised.

Not every plugging of wires into nodes produces a circuit. (Consider the putative "inference" in which the output wires of [∨E] are plugged into the input wires of [∧I]. This does not tell us that we may infer A ∧ B from A ∨ B.) The literature on proofnets has introduced the notion of a correctness criterion [15, 9]. It is an open question as to what might be an appropriate correctness criterion for these circuits.

Natural deduction systems lend themselves to a representation in a term calculus, according to which proofs correspond to terms, where formulas are types.
An appropriate term calculus for these circuits is, also, an open question. It seems that Philip Wadler's recent work on term calculi for classical linear logic will provide a useful starting point [25].

We have not said when two circuits represent the same proof. Clearly, these circuits are not the last word for proof identity. Even in the classical case, proof identity is a complicated business. There are many proposals in the literature [4, 11, 12, 18]. The key idea in this literature is that a theory of proofs has the structure of a category. A proof from A to B is, essentially, an arrow in that category. It is less clear that this is what we want in the case of modal reasoning. In the category-centred approach, we take a proof for X ⊢ Y to be an arrow f : ⋀X → ⋁Y. In the case of hypersequents, we do not have an obvious translation in terms of formulas. Take the hypersequent A ⊢ | ⊢ B. It can be thought of from the perspective of A (so it tells us that A ⊢ □B) or from B (it tells us that ♦A ⊢ B). The proof from A to □B cannot be the same as the proof from ♦A to B, as the source formulas differ, and the target formulas differ.¹³ So, which arrow in the category is the proof? Could a more natural model for these deductions be a different generalisation of a category? If we quotient our proofnets with some congruence relation (respecting the kind of identities we might expect, given our preferences about the way to go here) then what kind of "category-like" structure do we find? This is an open question.

Finally, it is clear that we need to generalise this account to cover modal logics other than S5. To do this, we need to step from a simple relation of nearness, which ignores anything other than the identity and difference of contexts for wires in a proof, to something more subtle. In an inference [□E] from □A, we step not to an arbitrary context, but to a successor context. The rule [□I] must similarly be modified.
The aim, of course, is an account of proof in which the rules for the modal operators are untouched, and the structural rules (in this case, the behaviour of nearness and the relations of ancestor/descendant) play the role of determining which modal logic is found. Exploring these matters must be left for another time.

¹³ They are not only different, they will not be isomorphic in the categories, as they have different inferential roles.

References

[1] Arnon Avron. "A Constructive Analysis of RM". Journal of Symbolic Logic, 52:939–951, 1987.
[2] Arnon Avron. "Using Hypersequents in Proof Systems for Non-classical Logics". Annals of Mathematics and Artificial Intelligence, 4:225–248, 1991.
[3] Arnon Avron. "The Method of Hypersequents in the Proof Theory of Propositional Non-classical Logics". In W. Hodges, M. Hyland, C. Steinhorn, and J. Truss, editors, Logic: from foundations to applications, pages 1–32. Oxford University Press, 1996.
[4] Gianluigi Bellin, Martin Hyland, Edmund Robinson, and Christian Urban. "Categorical Proof Theory of Classical Propositional Calculus". Theoretical Computer Science, 200+. To appear.
[5] Nuel D. Belnap, Jr. "Display Logic". Journal of Philosophical Logic, 11:375–417, 1982.
[6] R. F. Blute, J. R. B. Cockett, R. A. G. Seely, and T. H. Trimble. "Natural Deduction and Coherence for Weakly Distributive Categories". Journal of Pure and Applied Algebra, 13(3):229–296, 1996. Available from ftp://triples.math.mcgill.ca/pub/rags/nets/nets.ps.gz.
[7] Torben Braüner. "A Cut-Free Gentzen Formulation of the Modal Logic S5". Logic Journal of the IGPL, 8(5):629–643, 2000.
[8] Haskell B. Curry. Foundations of Mathematical Logic. Dover, 1977. Originally published in 1963.
[9] Vincent Danos and Laurent Regnier. "The Structure of Multiplicatives". Archive of Mathematical Logic, 28:181–203, 1989.
[10] Kosta Došen. "Sequent-Systems for Modal Logic".
Journal of Symbolic Logic, 50:149–168, 1985.
[11] Kosta Došen and Zoran Petrić. Proof-Theoretical Coherence. KCL Publications, London, 2004.
[12] Carsten Führmann and David Pym. "Order-enriched Categorical Models of the Classical Sequent Calculus". Journal of Pure and Applied Algebra, 204:21–78, 2006.
[13] Gerhard Gentzen. "Untersuchungen über das logische Schliessen". Math. Zeitschrift, 39:176–210 and 405–431, 1934. Translated in The Collected Papers of Gerhard Gentzen [14].
[14] Gerhard Gentzen. The Collected Papers of Gerhard Gentzen. North Holland, 1969. Edited by M. E. Szabo.
[15] Jean-Yves Girard. "Linear Logic". Theoretical Computer Science, 50:1–101, 1987.
[16] Andrzej Indrzejczak. "Cut-free Double Sequent Calculus for S5". Logic Journal of the IGPL, 6(3):505–516, 1998.
[17] William C. Kneale. "The Province of Logic". In H. D. Lewis, editor, Contemporary British Philosophy: Third Series, pages 237–261. George Allen and Unwin, 1956.
[18] F. Lamarche and L. Straßburger. "Naming Proofs in Classical Logic". To appear in Proceedings of TLCA 2005.
[19] Grigori Mints. "Indexed systems of sequents and cut-elimination". Journal of Philosophical Logic, 26:671–696, 1997.
[20] M. Ohnishi and K. Matsumoto. "Gentzen Method in Modal Calculi". Osaka Mathematical Journal, 9:113–130, 1957.
[21] Greg Restall. An Introduction to Substructural Logics. Routledge, 2000.
[22] Edmund Robinson. "Proof Nets for Classical Logic". Journal of Logic and Computation, 13(5):777–797, 2003.
[23] D. J. Shoesmith and T. J. Smiley. Multiple Conclusion Logic. Cambridge University Press, Cambridge, 1978.
[24] A. M. Ungar. Normalization, cut-elimination, and the theory of proofs. Number 28 in CSLI Lecture Notes. CSLI Publications, Stanford, 1992.
[25] Philip Wadler. "Down with the bureaucracy of syntax! Pattern matching for classical linear logic".
Available at http://homepages.inf.ed.ac.uk/wadler/papers/dual-revolutions/dual-revolutions.pdf, 2004.
[26] Heinrich Wansing. Displaying Modal Logic. Kluwer Academic Publishers, Dordrecht, 1998.
[27] Heinrich Wansing. "Translation of Hypersequents into Display Sequents". Logic Journal of the IGPL, 6(5):719–733, 1998.