Formalizing Kant's Rules
A Logic of Conditional Imperatives and Permissives

R. Evans · M. Sergot · A. Stephenson

Abstract This paper formalizes part of the cognitive architecture that Kant develops in the Critique of Pure Reason. The central Kantian notion that we formalize is the rule. As we interpret Kant, a rule is not a declarative conditional stating what would be true if such and such conditions hold. Rather, a Kantian rule is a general procedure, represented by a conditional imperative or permissive, indicating which acts must or may be performed, given certain acts that are already being performed. These acts are not propositions; they do not have truth-values. Our formalization is related to the input/output logics, a family of logics designed to capture relations between elements that need not have truth-values. In this paper, we introduce KL3 as a formalization of Kant's conception of rules as conditional imperatives and permissives. We explain how it differs from standard input/output logics, geometric logic, and first-order logic, as well as how it translates natural language sentences not well captured by first-order logic. Finally, we show how the various distinctions in Kant's much-maligned Table of Judgements emerge as the most natural way of dividing up the various types and sub-types of rule in KL3. Our analysis sheds new light on the way in which normative notions play a fundamental role in the conception of logic at the heart of Kant's theoretical philosophy.

Keywords Kant · Conditional imperatives · Input/output logics · Normativity

R. Evans
E-mail: RichardEvans@google.com
M. Sergot
E-mail: m.sergot@imperial.ac.uk
A. Stephenson
E-mail: andrew.stephenson@soton.ac.uk

1 Introduction

Judgments, insofar as they are regarded merely as the condition of the unification of given representations in one consciousness, are rules.
[Prolegomena 4:305]1 We will define a logic of conditional imperatives and permissives that was designed as part of an effort to make sense of what Kant was trying to do in the Critique of Pure Reason. There were two sources of motivation for designing this logic. The first came from our long-term project to extract from the Critique a cognitive architecture that could be realised in a computational system 2. Consider a simple agent with various sensors, trying to make sense of its3 sensory perturbations. It must, somehow, interpret its motley array of sensory perturbations as representations of an external world. This world consists of objects located in space and persisting through time, causally interacting with each other. What sorts of things must an agent do in order to achieve this? What must an agent do in order to represent a world at all? This is not an epistemological question: we are not asking what conditions have to hold in order for an agent who already believes something to also know something. This is a pre-epistemological question about intentionality: what conditions must hold for an agent to even think a thought that is about the world, irrespective of whether that thought is true or false? Kant's cardinal innovation, as we read him, is that the agent makes sense of its sensory perturbations by constructing and applying rules: We have above explained the understanding in various ways – through a spontaneity of cognition (in contrast to the receptivity of the sensibility), through a faculty of thinking, or a faculty of concepts, or also of judgements – which explanations, if one looks at them properly, come down to the same thing. Now we can characterize it as the faculty of rules. This designation is more fruitful, and comes closer to its essence. Sensibility gives us forms (of intuition), but the understanding gives us rules. It is always busy poring through the appearances with the aim of finding some sort of rule in them. 
[A126]4

In making sense of its sensory perturbations, the rules an agent constructs and applies must satisfy various constraints, codified by Kant as the Categories and Principles of Pure Understanding. Only if they satisfy these constraints does the agent achieve what Kant calls "experience": it has constructed a coherent, unified representation of a coherent, unified external world5. According to this interpretation, self-legislation is just as critical to Kant's theoretical philosophy as it is to his practical philosophy. The agent is only able to achieve experience by constructing rules that it then applies. According to this picture, self-legislation is prior to conscious experience in the explanatory ordering. In stark contrast to interpretations on which the Kantian agent is only able to construct and apply rules after it has already achieved a conscious representation of the world, our account views the construction and application of rules as necessary for, indeed partially constitutive of, such representation.

1 Translations are from the Cambridge Edition of the Works of Immanuel Kant (details at the end), with occasional modifications. With the exception of those to the Critique of Pure Reason, which take the standard A/B format, references to Kant are by volume and page number in the Academy Edition [Immanuel Kants gesammelte Schriften, 29 volumes, Berlin: de Gruyter, 1902-], along with a short English title.
2 For a description of this larger project, and preliminary results, see [17] and [18].
3 'It' will be our default singular third person pronoun: "Through this I, or He, or It (the thing), which thinks, nothing further is represented than a transcendental subject of thoughts = x" [A346/B404].
4 See also [A52/B76], [A127], [B143], [A132/B171], [A159/B198], [A302/B359], [Jäsche Logic 9:1112], and [Prolegomena 4:318].
If rules are to play this fundamental, load-bearing role in Kant's theory of intentionality – his theory of experience – then we had better be very clear what we mean, exactly, by a rule. If a rule is seen as a conditional that relates propositions that have truth-values, then it cannot be a foundational part of his architecture. Kant's project, as we understand it, is to explain intentionality itself: he wants to explain how an agent can have world-directed thoughts that are so much as capable of being true or false. If he presupposes rules that connect propositions that are already true or false, then he has already presupposed too much. We argue that, for Kant, a rule is a general procedure relating acts, not propositions. In the case of theoretical reason, the rule's constituent acts are mental rather than physical. They include things like seeing a bruised apple, feeling a heavy hammer, and hearing a buzzing bee. Crucially, for Kant, such acts do not themselves have truth-values: For truth and illusion are not in the object insofar as it is intuited, but in the judgment about it insofar as it is thought. Thus it is correctly said that the senses do not err; yet not because they always judge correctly, but because they do not judge at all. Hence truth, as much as error, and thus also illusion as leading to the latter, are to be found only in judgments, i.e., only in the relation of the object to our understanding... In the senses there is no judgment at all, neither a true nor a false one. [A293-4/B350] See also [Jäsche Logic 9:53]. In this paper, we will provide a formalization of Kant's conception of rules as general procedures, using a logic of conditional imperatives and permissives over acts. We will also sketch how this logic fits into the larger picture of the Kantian self-legislating agent, that makes sense of its sensory perturbations by spontaneously constructing and applying rules. 
We said that there were two sources of motivation for designing this logic. The first was to formalize the notion of a rule at the heart of Kant's cognitive architecture with a view to realising that architecture in a computational system. The second source of motivation is exegetical and defensive. Although Kant's Table of Judgements plays an absolutely pivotal role in his Critical system, it has been roundly criticised. One common objection has been that it is based on the out-dated Aristotelian term logic. Just as Kant's views on nature are based on a defunct conception of Newtonian physics and his views on mathematics are based on a defunct conception of Euclidean geometry, so his views on the mind and its acts of judgement are based on a defunct conception of Aristotelian logic. Yet if the Table of Judgements is incomplete or arbitrary, then the derivation of the Table of Categories and the subsequent Transcendental Deduction have failed before they have even started. If the Table of Judgements is based on an incomplete and out-dated logic, then Kant's recurrent use throughout his work of the basic structure it provides is merely the result of an "architectonic mania"6, rather than the persistent application of a unified template that runs through all our mental activity. Or so the story goes. Our second motivation for formalizing Kant's conception of rules, then, was to design a logic in which the distinctions he introduces in the Table of Judgements emerge as the most natural way of dividing up the various types and sub-types of rule.

5 Kant uses "experience" ("Erfahrung") in various ways. For this, which we regard as its central use, see especially [Bxli], [A110], [A146/B185], [A225/B272], [A229-30/B282]; [Prolegomena 4:292, 320]; [Metaphysical Foundations of Natural Science 4:560].

In Section 2, we briefly outline some of the key elements in our preferred interpretation of Kant.
We define the essential terms and sketch how they fit together in Kant's theory of experience, focusing on his conception of rules. In Sections 3, 4, and 5, we present a logic that formalizes Kant's rules as conditional imperatives and permissives. We explain how it handles the major deontic paradoxes and how it differs from standard input/output logics, geometric logic, and first-order logic. We also explain how it translates natural language sentences, including those involving multiple quantification and features not captured by first-order logic, such as predicate negation and disjunction. Finally, in Section 6, we show how this logic makes sense of Kant's Table of Judgements, not only its particular "moments" and its overall structure, but also the various finer points of structure that Kant insists upon. Our analysis sheds new light on the way in which normative notions play an absolutely fundamental role in the conception of logic at the heart of Kant's theoretical philosophy. Apart from its own intrinsic interest, historical and philosophical, this also provides a clear hint that our analysis might be extended to Kant's account of moral rules and practical agency. Consider the central role that imperatives and permissives play in Kant's moral philosophy (e.g. at [Groundwork 4:414ff., 421ff.]), as well as his claim that practical and theoretical reason share a "common principle" [Groundwork 4:391]. We cannot pursue this extension here; that is a task for future work.

6 [Kant and the Capacity to Judge, p.5].

2 Kant's cognitive architecture

In this section, we outline our preferred interpretation of Kant. The interpretation will be elaborated on and confirmed throughout (especially in Section 6), but we do not attempt to mount a full defence of it here.
Our primary aim in this paper is to formalize an aspect of Kant's thought, as we understand it, and then apply the results in making sense of his Table of Judgements. The aspects of our interpretation that we do not defend here have been defended by ourselves or others elsewhere, which we note when relevant7.

2.1 Mental activity as constituted activity

It is a familiar idea that social activity is constituted activity8. In certain circumstances, if various constraints are satisfied, then pushing a horse-shaped object across a chequered board counts as moving a knight to king's bishop three; an utterance of the words "I do" counts as an acceptance of marriage vows; running away counts as desertion; writing your name counts as signing a contract. Such counts-as claims are constitutive, not merely predicative or classificatory (as when we say that a horse counts as a mammal). We are saying that doing x just consists in doing y in the right circumstances; that there is nothing more to doing x than doing y in the right circumstances. In this sense, social actions are things we can only do mediately, by doing something else in the right circumstances. The constitution might continue, so that one constituted social activity in turn constitutes another, as when a move in a chess game in turn counts as winning the game. But here, too, we do one thing by doing another, and the circumstances must be appropriate. Neither playing nor winning at chess are things we can do immediately.

One action can only count as another if the surrounding circumstances satisfy certain conditions. Just going up to a stranger in the street and saying "I do" does not count as marrying them. Saying "I do" only counts as marrying someone in the particular context of a marriage ceremony when the officiator has asked a particular question. What determines which circumstances are the right circumstances?
It is the constitutive rules of the constituted activity that determine the subset of circumstances in which doing y also counts as doing x. These constitutive rules are to be distinguished from regulative rules, like the driving laws, which merely regulate a pre-existing independent activity. So we have here a distinction between constituting and constituted activities, where constituted activities can in turn play a role in constituting further activities, as well as an attendant distinction between constitutive and regulative rules. And note finally that, when a constituting activity itself consists in the construction and application of rules, then the constitutive rules that determine when this activity counts as a further, constituted activity will be meta-rules: rules that determine how the construction and application of rules must go if it is to constitute the constituted activity in question.

7 We are particularly indebted to the writings of Christine Korsgaard [35,36], Béatrice Longuenesse [40,41], Wayne Waxman [58,59], and Robert Brandom [7–9].
8 See [49]. For an overview as well as a formal treatment, see [23].

As we read Kant, the guiding theme of his philosophy of mind is that mental activity is constituted activity. His primary concern is with the constituted mental activity experience. This is a complex, high-level activity, itself constituted by other constituted mental activities, themselves constituted by yet others, and so on. In each case, we can ask: what constraints must be satisfied for one activity to constitute another? Ultimately, we are asking: what are the constitutive rules of experience? It is the purpose of Kant's cognitive architecture to articulate all of this (and more). He calls it "the conditions for the possibility of experience" [e.g. at A92ff./B124ff., A158ff./B197ff.].
Two of the mental activities that play a role in constituting experience, if certain constraints are satisfied, are perception and judgement. The former includes things like seeing a bruised apple, feeling a heavy hammer, and hearing a buzzing bee. The latter includes things like forming the thought that all humans are mortal, that some humans are uneducated, or that Caius is a human. Perception and judgement are distinct but interdependent activities. They are also themselves constituted activities, and it is at this level that we come to the four elements of Kant's cognitive architecture that will be essential for our logic.

2.2 The four elements

An intuition is a mental object, constructed out of given sensations by the solitary individual agent. Your intuitions are different from my intuitions – each agent has its own private repository. There is no limit to how many intuitions an agent can construct. Intuiting is the constituted mental activity of constructing intuitions9.

A mark is a symbol that can be ascribed to multiple intuitions. Unlike an intuition which can only be had by a single agent, a mark is public and can be used by many different agents. (Marks are general in both of these senses.) A mark has no predefined meaning. Its meaning is determined entirely by its inferential role (i.e. by the rules in which it figures)10.

A subsumption is the mental activity of assigning a mark to an intuition11 (or tuple of intuitions). As we read Kant, and this is central to everything that follows, a subsumption is an act that does not itself have a truth-value. It is not itself a judgement or a thought, still less a belief or knowledge. Although marks are shared public symbols, intuitions are private mental objects, so the act of subsuming an intuition under a mark is only performable by the particular individual who has that particular intuition.
9 We have argued for this traditional view of intuition, and against the currently popular 'relationalist' view, elsewhere [52].
10 See [38].
11 See [Kant and the Capacity to Judge, p.92n] and also [58] p.264 and p.269n.

A rule is a general procedure for generating subsumptions from subsumptions. There are two basic types of rule. As Kant describes them:

the representation of a universal condition in accordance with which a certain manifold (of whatever kind) can be posited is called a rule [Regel], and, if it must be so posited, a law [Gesetz]. [A113]12

A rule is not a sentence – it is a general procedure. But if it were to be described by a sentence, it would be described by a conditional imperative or permissive. For example: for every intuition x, if you subsume x under mark p, then also subsume x under mark q! Or: for every intuition x, if you subsume x under mark p, then feel free13 to also subsume x under mark r! Rules are general procedures that apply to all intuitions. Unlike subsumptions, which are private to the individual, rules are things that the solitary agent can share with others14. Unlike subsumptions, which bring two heterogeneous elements together (the intuition and the mark), rules bring homogeneous elements (various subsumptions) together.

To reiterate, rules are not themselves conditional imperatives or permissives like those above. This is important because natural language conditional imperatives and permissives have truth-evaluable content in their antecedents, whereas this is not the case for Kantian rules (for the reasons given in Section 1). Our claim is that Kantian rules, as general procedures, can nevertheless be formalized using a logic of conditional imperatives and permissives (with a suitable formal semantics).
As a convenience we will often talk as though rules just are conditional imperatives or permissives, but strictly speaking what we mean is that they can be described or formalized as such.

These, then, are the four basic elements of Kant's constitutive psychology: intuitions, marks, subsumptions, and rules. If certain constraints are satisfied, if everything comes together, then:

– an intuition counts as a representation of a particular external object
– a mark counts as a concept
– a subsuming counts as a perception
– a rule counts as a judgement (with a truth-value)

12 Kant rarely sticks to this rule/law terminology and we do not adopt it here, referring to both imperatives and permissives simply as rules. See also [B201n], where Kant uses yet other terminology.
13 This informal way of putting it is not ideal. What we are trying to express here is the permissive that corresponds to the imperative as "may" corresponds to "must". If English contained a punctuation mark corresponding to "!" that represented a permissive rather than an imperative, then we would use that, but there isn't one.
14 [Kant and the Capacity to Judge, p.88]: "This is how, by virtue of its logical form alone, a judgment lays a claim to holding for any consciousness, whereas a mere coordination of representations might only hold for my subjective consciousness." See also [32] and [53].

As we read Kant, these constitutive counts-as claims should not be thought of as successive or independent stages.
An intuition only counts as a representation of an external particular insofar as it is subsumed under a mark that counts as a concept; a mark only counts as a concept insofar as it is involved in a subsumption that counts as a perception; and a subsumption only counts as a perception insofar as it is bound to other subsumptions in a rule that counts as a judgement; thus, in turn, an intuition only counts as a representation of an external particular and a mark only counts as a concept insofar as each figures in a rule that counts as a judgement. And a rule only counts as a judgement insofar as it is bound to other such rules in a coherent, unified representation of a coherent, unified external world; that is, only insofar as it is part of experience.

This, in a nutshell, is Kant's constitutive theory of experience. Its constitutive (meta-) rules – the constraints that must be satisfied for the above counts-as claims to hold – are what Kant articulates in the Analytic of Principles15. Its basic elements are intuitions, marks, subsumptions, and rules, all of which will play a role in our logic. Here we focus on just one of the counts-as claims: that a rule counts as a judgement.

2.3 Judgements as rules

Recall that a rule is a general procedure that we will formalize as a conditional imperative or permissive. It might seem strange to think of an imperative or permissive rule as something that can count as having a truth-value, but this, we contend, is exactly what Kant has in mind. He says:

All judgements are accordingly functions of unity among our representations, since instead of an immediate representation, a higher one, which comprehends this and other representations under itself, is used for the cognition of the object, and many possible cognitions are thereby drawn together into one [A69/B94]

This is exactly the role of rules, on our account.
A rule is a "higher representation" that binds together subsumptions, which consist of marks and intuitions, or "immediate representations". It is "the mediate cognition of an object, hence the representation of a representation of it" [A68/B93]. Thus:

Judgments, insofar as they are regarded merely as the condition of the unification of given representations in one consciousness, are rules. [Prolegomena 4:305]

All rules (judgments) contain objective unity of consciousness of the manifold of cognition, hence a condition under which one cognition belongs with another to one consciousness. [Jäsche Logic 9:121]

15 How exactly this goes has been interpreted in many different ways. See, for example, [10,33,34,40,58,60]. For our own preferred interpretation, implemented in the aforementioned computer model, see [17,18].

Consider Kant's identification of judgements with rules. This identification is easiest to see in the case of universal judgements. The judgement "All humans are mortal" just is the rule:

for every intuition x, if you subsume x under "human", then also subsume x under "mortal"!

But the identification applies equally to particular judgements. The difference is that particular judgements are permissive rules. "Some humans are uneducated" just is the rule:

for every intuition x, if you subsume x under "human", then feel free to also subsume x under "uneducated"!

And it also applies to singular judgements. "Caius is a human" just is the rule:

for every intuition x, if you subsume x under "Caius", then also subsume x under "human"!

together with a constraint:

for any distinct intuitions x and y, do not subsume both x and y under "Caius"!
Indeed, after presenting our logic in Sections 3, 4, and 5, we will argue in Section 6 that our rule-based analysis accounts for all of the different kinds of judgement that Kant identifies in his Table of Judgements, and we will also show how it accounts for the Table's finer points of structure. Note, for instance, how singular judgements come out above as a sub-type of universal judgements; how both singular judgements and universal judgements imply particular judgements, so long as imperatives imply permissives; and how negation can be applied to a predicate within an atomic (categorical) judgement.

2.4 What Kant meant by "logic"

We have become accustomed to thinking of logic as the study of entailment relations between sets of linguistic items. We are given a set of sentences, A, and a further sentence p and we want to find out if A entails p, written A ⊨ p, where ⊨ is defined in terms of truth: A ⊨ p if any model in which A is true is also a model in which p is true.

Logic, for Kant, was not primarily about entailment relations between elements with truth-values. First and foremost, it describes how we should think16. And since thinking is a mental activity, logic is primarily a codification of principles describing which activities we should perform, conditional on the activities we have already performed. This project will turn out to include an account of relations between elements with truth-values. But, we contend, it is not exhausted by such an account.

16 See [A52ff./B76ff.], [A131/B170], and [Jäsche Logic 9:14].

Our focus in this paper is on how we should think when our goal is experience. As we read Kant, this question amounts to the following: given a collection of subsumptions that the agent is performing concurrently, and a set of rules it has adopted, what further subsumptions may/must it also perform, if it is to achieve experience?
Note that this is a question about acts: given that the agent is performing these acts, and given that it has adopted these rules, what further acts may/must it perform? These mental acts (subsumptions) do not have truth-values. Kant's logic is not only concerned with relating elements that have truth-values, but also tells us what mental acts we may/must perform. Suppose, for example, the agent has adopted the following rules:

– If you are seeing yellow and black stripes and hearing a buzzing, then feel free to perceive a bee!
– If you are seeing yellow and black stripes and hearing a buzzing, then feel free to perceive a wasp!
– Do not count anything as both a wasp and a bee!

Having adopted the above rules, suppose that the agent now performs the subsumptions that, in the right circumstances, constitute the antecedents: seeing yellow and black stripes and hearing a buzzing. What further subsumptions may/must it make? To repeat, this is not yet a question about which judgements it should adopt. It is not yet a question about which propositions it should hold for true. Rather, it is a question about what mental acts it should perform. In the case at hand, it is a question about what it should perceive17. One permissible subsumption would be to perceive a bee. Another permissible subsumption would be to perceive a wasp. But it is not permissible to subsume the same intuition under both "bee" and "wasp".

On our account, a fundamental question of Kant's logic concerns relations between elements that do not have truth-values. There is, however, so conceived, a further question for logic to answer: given a collection of judgements (i.e. rules) that have been adopted, what other judgements may/must also be adopted? Experience is constituted by perception and judgement. Now, since judgements do have truth-values, this secondary aspect of transcendental logic aligns with the contemporary, Fregean focus on truth-conditional logic.
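The bee and wasp example above can be made concrete with a small sketch. It is only an illustration under stated assumptions, not the semantics defined later in the paper: it assumes that each permissive whose antecedent is being performed may either be acted on or declined, that rules are applied in a single step without chaining, and that a constraint simply rules out any set of acts containing all of its members. The atom names and the function `acceptable_extensions` are hypothetical.

```python
from itertools import combinations

# Hypothetical atom names for the acts in the example.
STRIPES, BUZZING = "see_stripes", "hear_buzzing"
BEE, WASP = "perceive_bee", "perceive_wasp"

# Permissives as (antecedent acts, act that one may feel free to perform).
PERMISSIVES = [
    (frozenset({STRIPES, BUZZING}), BEE),
    (frozenset({STRIPES, BUZZING}), WASP),
]

# Constraints as sets of acts that must not all be performed together.
CONSTRAINTS = [frozenset({BEE, WASP})]


def acceptable_extensions(acts):
    """Enumerate the sets of acts reachable by taking up any subset of the
    licensed permissives, discarding those that violate a constraint.
    Assumption: declining a permissive is itself acceptable."""
    acts = frozenset(acts)
    licensed = sorted({head for body, head in PERMISSIVES if body <= acts})
    results = set()
    for k in range(len(licensed) + 1):
        for extra in combinations(licensed, k):
            candidate = acts | frozenset(extra)
            if not any(c <= candidate for c in CONSTRAINTS):
                results.add(candidate)
    return results


given = frozenset({STRIPES, BUZZING})
options = acceptable_extensions(given)
```

Under these assumptions, `options` contains the given acts alone, the given acts plus perceiving a bee, and the given acts plus perceiving a wasp, but never both perceptions together.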
In this paper, we present a logic that addresses both aspects (see Section 3.6). But our logic begins "one level down" from first-order logic18: it represents (the perceptual) activities that do not themselves have truth-values but which can contribute to the constitution of (the judgemental) elements that do have truth-values, all of which together, if things go right, will constitute experience19.

17 Note that these mental acts are at the sub-personal level – our 'agent' is a sub-personal rule-induction system. It is not as if the person consciously chooses whether to perceive a bee or a wasp, but rather that a pre-conscious process makes this "decision". It does so with the goal of achieving experience, whence the sense and force of a question about what should be perceived. But experience, on Kant's account, is the first level at which there arises anything like a person's conscious perspective on a unified, coherent external world. See Section 6 and [51] for further discussion.
18 See [1], p.236 for a related claim.

In the Critique, Kant distinguished between general and transcendental logic. While general logic describes the forms and principles for thinking in general, transcendental logic describes the forms and principles needed for thinking to have "objective validity" [A88-9/B122], i.e., to be about the world. How does the logic of rules that we present relate to Kant's distinction between general and transcendental logic? Our logic of rules is necessary (but not sufficient) for a subsequent project of properly formalizing the transcendental logic, but there is still remaining work to do. We do not give a full account of the conditions for the possibility of experience. Nor do we give a full account of what it takes for a thought to have objective validity.
3 KL1

In the following three sections, we present a logic that formalizes Kant's rules as conditional imperatives and permissives. Our claim is not, of course, that Kant had this precise logic in mind. Rather, the claim is that our formalization is based on, compatible with, and helps explain part of Kant's view in the Critique of Pure Reason (and associated texts, especially Jäsche Logic). From what has already been said, we know this logic must satisfy two constraints.

First, since subsumptions are acts, we need a logic that does not assume its constituent elements have truth-values. The input/output logics [43] were designed to capture inferential relations between elements that do not necessarily have truth-values. They were conceived in response to Jørgensen's Dilemma:

1. Logical inference requires that the elements (premises and conclusions) have truth-values.
2. Imperatives do not have truth-values. For example, the command "Burn all the books!" has a satisfaction-condition, but does not have a truth-value.
3. There are valid logical inferences between imperatives. For example:
   – Burn all the books in the library!
   – The Critique of Pure Reason is a book in the library
   – Therefore: burn The Critique of Pure Reason!

The input/output logics resolve this impasse by denying the first claim: they support inference between elements that do not have truth-values. The logic we present below is a member of the family of input/output logics (broadly conceived).

The second constraint on any logic that formalizes Kant's rules is that it must support not only conditional imperatives but also conditional permissives. For example: if you subsume intuition x under mark p, then feel free to also subsume x under q! A logic that contains explicit permissives as well as imperatives will generate multiple acceptable sets of derived subsumptions20.

19 In the Second Analogy, Kant claims that it is only through the construction and application of causal rules that temporal succession is generated. It is not that we first perceive a temporal succession of individual events, and then subsequently posit causal laws to explain the succession. Rather, what it is to perceive the succession of a followed by b just is to posit a causal rule whose body subsumes a and whose head subsumes b. In this picture, rules and acts are prior in the order of explanation to temporality. A potential problem for our account emerges when we acknowledge that acts (the constituents of rules) are themselves temporal phenomena. If a rule has a conjunction of actions in the body, this means that both actions are performed at the same time. If an act has pre- and post-conditions, this means that certain conditions must be true before the act is performed, and certain conditions must be true after. How, then, can the notion of rule and act be prior to temporality in the direction of explanation, if the notion of an act presupposes a temporal ordering? This is a difficult question, but it is not just a question for our particular account. It is a problem for all interpretations of Kant that focus on the mental processes that underlie experience. One way to address this concern is to invoke Kant's distinction between subjective succession and objective succession [B233-4], and to argue that the subjective succession of mental acts can be used to explain the construction of the objective succession of external events.

For ease of exposition, we divide the logic into three parts. The first part, KL1, is a type of input/output logic with two types of rule: conditional imperatives and permissives. The second part, KL2, extends KL1 with a negation operator. The third part, KL3, extends KL2 by adding variables and quantifiers. All three logics, KL1, KL2, and KL3, have been implemented and tested in computer programs. In particular, the soundness, completeness, and monotonicity for KL1 and KL2 have been empirically verified21.
The code corresponds closely to the text in Sections 3, 4, and 5 below.

20 The standard input/output logics generate a single set of derived conclusions. The family of logics we present here is unusual (in the family of input/output logics) in generating multiple acceptable sets of conclusions. Of course, many non-monotonic formalisms generate multiple acceptable sets of conclusions. See Section 3.7 for further comparison.
21 See https://github.com/RichardEvans/kl haskell.

3.1 Syntax

Let 𝒜 be the set of all atoms. A, B, C and X will range over sets of atoms, and a, b, c will range over individual atoms. There are two types of rule in KL1:

    R ::= B ⇒ C | B ⇝ C

B is the body of the rule; it is a set of atoms representing a conjunction. C is the head of the rule; it is a set of atoms representing a disjunction. We use sets, not sequences or multisets, to avoid various uninteresting inferences involving duplication and permutation of elements. For readability, we write the body of the rule as a conjunction, and the head of a rule as a disjunction. The rule {p, q} ⇒ {r, s} is represented as:

    p ∧ q ⇒ r ∨ s

Table 1: Example rules in KL1

    Rule              Readable version   Translation
    {p} ⇒ {q}         p ⇒ q              If you are doing p, then do q!
    {p, q} ⇒ {r}      p ∧ q ⇒ r          If you are doing p and q, then do r!
    {p} ⇒ {q, r}      p ⇒ q ∨ r          If you are doing p, then do q or do r!
    {} ⇒ {p}          ⊤ ⇒ p              Do p!
    {p} ⇒ {}          p ⇒ ⊥              Don't, whatever you do, do p!
    {p} ⇝ {q}         p ⇝ q              If you are doing p, then feel free to do q!
    {p, q} ⇝ {r}      p ∧ q ⇝ r          If you are doing p and q, then feel free to do r!
    {p} ⇝ {q, r}      p ⇝ q ∨ r          If you are doing p, then feel free to do q or r!
    {} ⇝ {p}          ⊤ ⇝ p              Feel free to do p!

If the body of the rule is empty, we write ⊤ instead of the empty set. For example, {} ⇒ {p} is written as ⊤ ⇒ p. If the head of the rule is empty, we write ⊥ for the empty set. For example, {p} ⇒ {} is written as p ⇒ ⊥.
Rules with empty bodies and singleton heads are called facts, while rules with empty heads are called constraint rules. The ⇒ rules are intended to be read as conditional imperatives between actions. For example, the rule p ⇒ q ∨ r should be read as "if you are doing p, then do q or do r!". The ⇝ rules are intended to be read as conditional permissives between actions. For example, p ⇝ q should be read as "if you are doing p, then feel free to do q!". Some example rules are given in Table 1.

Since the elements of rules are actions – not propositions with truth-values – disjunction should not be interpreted truth-functionally. To say that you must do p ∨ q is to say there are two available actions, p and q, and you must choose one of these actions (or both22). It is straightforward to generalise the form of rules to allow disjunctions of conjunctions of atoms in the head. We will not do that here for ease of exposition.

22 Kant's disjunctions are exclusive ([A73-4,B99], [Jäsche Logic 9:106]), while ours are not. We represent an exclusive disjunction by a non-exclusive disjunction ⊤ ⇒ p ∨ q plus a constraint p ∧ q ⇒ ⊥.

3.2 Semantics (part 1)

Given a (countable but not necessarily finite) set R of rules and a (finite) set A ⊆ 𝒜 of atoms, the consequences out1(R,A) form a set {X1, X2, . . . } of sets of atoms, where each Xi ⊆ 𝒜 is one of the distinct ways in which the rules can be satisfied. There are two sources of non-determinism in KL1. The first is disjunction. Rules that have disjunctive heads can be satisfied in multiple ways. For example, given:

    R = { p ⇒ q ∨ r,  q ⇒ s }

with A = {p}, the possible outcomes are:

    out1(R,A) = { {p, q, s}, {p, r}, {p, q, r, s} }

The second source of non-determinism is the permissive (⇝) rules.
For example, given:

    R = { ⊤ ⇝ p,  ⊤ ⇝ q,  p ∧ q ⇒ ⊥ }

the possible outcomes are:

    out1(R, {}) = { {}, {p}, {q} }

The out1 function is defined in terms of a set cns(R,A) of consequences, representing the various ways of applying R to A, from which are filtered out those that do not satisfy the rules in R.

Definition 1 A set X of atoms satisfies a set R of rules, written X ⊨ R, when X satisfies every rule in R. X satisfies a rule r, written X ⊨ r, when:

    X ⊨ B ⇒ C   if B ⊈ X or C ∩ X ≠ ∅
    X ⊨ B ⇝ C   always

Definition 2 For all sets R of rules and sets A of atoms:

    out1(R,A) = {X ∈ cns(R,A) | X ⊨ R}

cns(R,A) is defined inductively as follows:

    cns_0(R,A) = {A}
    cns_{n+1}(R,A) = {X ∪ {t} | X ∈ cns_n(R,A), t ∈ step(R,X)}
    cns(R,A) = ⋃_{n≥0} cns_n(R,A)

The step function combines the consequences of the various rules that apply:

    step(R,X) = {c | B ⇒ C ∈ R or B ⇝ C ∈ R, B ⊆ X, c ∈ C}

The step function treats ⇒ and ⇝ exactly the same; the place where they are treated differently is in the satisfaction condition X ⊨ R in out1.

Note that, according to this semantics, the permissives are weak in that they are overridden by the imperatives. If we have p ⇝ q and p ∧ q ⇒ ⊥, then the constraint overrides the ⇝ rule:

    R = { p ⇝ q,  p ∧ q ⇒ ⊥ }
    A = {p}
    out1(R,A) = { {p} }

Similarly, if we have the imperative p ⇒ q and the permissive p ⇝ r with q and r incompatible (i.e., q ∧ r ⇒ ⊥), then p ⇒ q will trump p ⇝ r:

    R = { p ⇒ q,  p ⇝ r,  q ∧ r ⇒ ⊥ }
    A = {p}
    out1(R,A) = { {p, q} }

It is also worth observing that the assumptions A can be replaced equivalently by a set of corresponding unconditional ⇒ rules. For any set of assumptions A:

    out1(R,A) = out1(R ∪ {⊤ ⇒ a | a ∈ A}, ∅)

For readability we sometimes write out1(R) as shorthand for out1(R, ∅). Note also that out1(R,A) is not monotonic in either R or A.

It remains to confirm that cns and out1 are well-defined and unique for any set of rules R and assumptions A.
We do this in the next section by translating rules R to a simple form of logic program for which the required properties are immediate.

3.3 Semantics (part 2)

In this section, we provide an alternative semantics that is provably equivalent to the semantics in Section 3.2 above.

A definite clause is a rule of the form

    c ← b1, . . . , bn   (n ≥ 0)

where c and b1, . . . , bn are atoms. We are using standard logic programming notation: c is the head of the clause; the body b1, . . . , bn is to be read as a conjunction. As is usual, where the body of a clause is empty we identify a clause c ← with the atom c. A definite logic program is a set of definite clauses.

The idea is that every ⇒ and ⇝ rule can be translated to a set of definite clauses, each of which represents one of the ways that the rule can be satisfied; a set of rules is translated to the set of definite programs obtained by taking all combinations of the translations of the individual rules.

Definition 3 (Definite clause encoding) Define a function defr from rules to sets of sets of definite clauses:

    defr(B ⇒ C) = { ∅ }                                      if C = ∅
    defr(B ⇒ C) = { {c ← B | c ∈ C′} | C′ ⊆ C, C′ ≠ ∅ }      otherwise
    defr(B ⇝ C) = { {c ← B | c ∈ C′} | C′ ⊆ C }

Now define a function def that translates a set of rules into a set of definite programs (a set of sets of definite clauses):

    def({r1, r2, . . . }) = {D1 ∪ D2 ∪ · · · | D1 ∈ defr(r1), D2 ∈ defr(r2), . . . }

Example 1

    defr(p ⇝ q ∨ r) = { {}, {q ← p}, {r ← p}, {q ← p, r ← p} }
    defr(r ⇒ s) = { {s ← r} }
    def({p ⇝ q ∨ r, r ⇒ s}) = { {s ← r}, {q ← p, s ← r}, {r ← p, s ← r}, {q ← p, r ← p, s ← r} }

Definition 4 (Least model) Let T_D : 2^𝒜 → 2^𝒜 be the 'immediate consequence operator' [16] of the definite program D:

    T_D(X) = {c | c ← B ∈ D, B ⊆ X}

For any A ⊆ 𝒜, let M(D,A) be defined inductively as follows:

    M_0(D,A) = A
    M_{i+1}(D,A) = M_i(D,A) ∪ T_D(M_i(D,A))
    M(D,A) = ⋃_{i≥0} M_i(D,A)

The following are all properties of definite logic programs [16] found in any standard text on logic programming. M(D,A) is the least fixpoint of T_D that contains A. For definite clauses D it always exists and is unique. M(D,A) is also the least (under set inclusion) set of atoms containing A and closed under the rules D, and the least (Herbrand) model of D ∪ A. (We are identifying an atom a with the clause a ←.)

Now we can define an alternative version of cns in terms of def and M:

    cnsd(R,A) = {M(D,A) | D ∈ def(R)}

In the original semantics of Section 3.2, the inductive definition of cns can be seen as the construction of a tree rooted in {A} whose leaves are the elements of cns(R,A). Here in our second, alternative semantics, cnsd(R,A) can be seen as a set of linear derivations, each of which is the application to A of one of the definite programs in def(R).

Clearly if D ∈ def(R) then M(D,A) satisfies every rule of the form B ⇒ C (C ≠ ∅) in R and (trivially) every ⇝ rule in R. It can be confirmed (e.g., by induction on R) that if R contains no constraint rules, i.e., no rules of the form B ⇒ ⊥, then:

    cnsd(R,A) = {X ∈ cns(R,A) | X ⊨ R}

Let R⊥ denote the set of all constraint rules in R. Then:

    out1(R,A) = {X ∈ cnsd(R,A) | X ⊨ R⊥}

Putting the above together we have the following alternative characterisation of out1(R,A).

Proposition 1 Let R be a set of rules and A a set of atoms.
    out1(R,A) = {M(D,A) | D ∈ def(R), M(D,A) ⊨ R}

Proof In the preceding discussion. ∎

The following corollary will be useful later:

Proposition 2 If R is a set of rules and X a set of atoms, then if X ⊨ R then X ∈ out1(R,X).

Proof Assume X ⊨ R. We shall provide a definite program D in def(R) such that M(D,X) = X. For each rule ri in R, choose a definite program di ∈ defr(ri) as follows. If ri = B ⇝ C, then let di = {}. If ri = B ⇒ C, consider two cases. First, if B ⊈ X, let di be any member of defr(ri): every clause in di has body B, so none of them fires on X. Second, if B ⊆ X, then since X ⊨ B ⇒ C, X ∩ C ≠ ∅. Let C′ = X ∩ C and define di as {c ← B | c ∈ C′}. Let D be the union of the di. We have M(D,X) = X and hence by Proposition 1, X ∈ out1(R,X). ∎

3.4 Entailment

We shall define entailment on KL1 rules using a notion of strong equivalence between rule sets. It is natural to say that two rule sets R1 and R2 are strongly equivalent if, for all rule sets R′ and all sets A of assumptions:

    out1(R1 ∪ R′, A) = out1(R2 ∪ R′, A)

Since out1(R,A) = out1(R ∪ {⊤ ⇒ a | a ∈ A}, ∅), it is equivalent to require that out1(R1 ∪ R′, ∅) = out1(R2 ∪ R′, ∅) for all rule sets R′. However, for strong equivalence of KL1 rule sets it is sufficient to restrict attention to sets R′ of nullary rules ('facts'), i.e., rules of the form ⊤ ⇒ a. This is because out1(R, ∅) can be seen as being defined by a set of definite clause programs def(R). Two definite clause programs D1 and D2 have the same models, and in particular the same least models, if and only if D1 ∪ A and D2 ∪ A have the same models for all sets A of unit clauses ('facts'), i.e., clauses of the form a ← ⊤. The property generalises straightforwardly to comparing sets of definite clause programs as required here. We therefore take the following definition of strong equivalence and of rule entailment.
Definition 5 Two rule sets R1 and R2 are strongly equivalent in KL1 if:

    out1(R1,A) = out1(R2,A) for all sets A of atoms

A set R of rules entails a rule r in KL1, written R ⊨KL1 r, if R and R ∪ {r} are strongly equivalent in KL1. In other words, R ⊨KL1 r if

    out1(R,A) = out1(R ∪ {r}, A) for all sets A of atoms

It is also convenient to employ a functional notation. kl1(R) denotes the set of rules semantically entailed by R in KL1: kl1(R) = {r | R ⊨KL1 r}. Rule sets R1 and R2 are strongly equivalent in KL1 when kl1(R1) = kl1(R2).

A set R of rules is strongly inconsistent in KL1 if, for every set A of atoms, out1(R,A) = ∅.

Proposition 3 Let R be a set of rules. out1(R,A) = ∅ for all sets A of atoms if and only if out1(R, ∅) = ∅. Hence, R is strongly inconsistent in KL1 when out1(R, ∅) = ∅.

Proof We prove that if out1(R,A) ≠ ∅ for some A then out1(R, ∅) ≠ ∅. For suppose X ∈ out1(R,A). Then there is some definite program DR in the encoding def(R) of R such that X = M(DR,A) and X ⊨ R⊥, where R⊥ are the constraint rules B ⇒ ⊥ of R. But M(DR, ∅) ⊆ M(DR,A), so M(DR, ∅) ⊨ R⊥, and M(DR, ∅) ∈ out1(R, ∅). The other direction is trivial. ∎

Notice that if out1(R,A) = ∅ for all A then, for all R′, out1(R ∪ R′, A) = ∅ for all A, and hence R ⊨KL1 R′. Now suppose R ⊨KL1 ⊤ ⇒ ⊥. Then out1(R,A) = out1(R ∪ {⊤ ⇒ ⊥}, A) for all A, and clearly out1(R ∪ {⊤ ⇒ ⊥}, A) = ∅ because X ⊭ ⊤ ⇒ ⊥ for any X. So we have the following.

Proposition 4 Let R be a set of rules. R is strongly inconsistent in KL1 iff R ⊨KL1 ⊤ ⇒ ⊥.

3.5 Inference rules

The inference rules for KL1 are given in Figure 2. Note that, since the left- and right-hand sides of rules are sets of atoms, not sequences or multisets, a finite set R of rules has only a finite number of inferential consequences.

Fig. 1: The entailment lattice for {p, q}. If there is a line between two nodes, then the lower node entails the higher node.
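The semantic notions used so far (satisfaction, step, cns, out1, and the entailment of Definition 5) can be checked by brute force on small examples. The following Python sketch is ours and is not the implementation referred to in the footnotes. The constructors `must` and `may`, the helper `atoms_of`, and the choice to quantify only over subsets of the atoms that occur in the rules are all assumptions made for illustration.

```python
from itertools import chain, combinations

# A rule is a triple (kind, body, head); kind is "must" (imperative)
# or "may" (permissive); body and head are frozensets of atoms.
def must(body, head):
    return ("must", frozenset(body), frozenset(head))

def may(body, head):
    return ("may", frozenset(body), frozenset(head))

def satisfies(x, rules):
    # Definition 1: imperatives need B not included in X, or C meeting X;
    # permissives are always satisfied.
    return all(kind == "may" or not body <= x or bool(head & x)
               for kind, body, head in rules)

def step(rules, x):
    # Atoms that some applicable rule (of either kind) offers.
    return {c for _, body, head in rules if body <= x for c in head}

def cns(rules, a):
    # Every set reachable from A by adding one offered atom at a time.
    start = frozenset(a)
    seen, frontier = {start}, [start]
    while frontier:
        x = frontier.pop()
        for t in step(rules, x):
            y = x | {t}
            if y not in seen:
                seen.add(y)
                frontier.append(y)
    return seen

def out1(rules, a=()):
    # Definition 2: keep the candidates that satisfy every rule.
    return {x for x in cns(rules, a) if satisfies(x, rules)}

def atoms_of(rules):
    return set().union(*[body | head for _, body, head in rules])

def entails(rules, r):
    # Definition 5, restricted (by assumption) to subsets of the
    # atoms occurring in the rules under comparison.
    atoms = sorted(atoms_of(rules + [r]))
    assumptions = chain.from_iterable(
        combinations(atoms, n) for n in range(len(atoms) + 1))
    return all(out1(rules, a) == out1(rules + [r], a) for a in assumptions)

# The first example of Section 3.2: three acceptable outcomes.
R = [must({"p"}, {"q", "r"}), must({"q"}, {"s"})]
print(sorted(sorted(x) for x in out1(R, {"p"})))
```

Under these assumptions, `entails([must({"p"}, {"q", "r"}), must({"q"}, set())], must({"p"}, {"r"}))` holds, whereas adding a weakened head, as in `entails([must({"p"}, {"q"})], must({"p"}, {"q", "r"}))`, does not.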
Given a set R of rules, the derived rules deriv(R) are the rules generated by repeated application of the inference rules in Figure 2. We also write R ⊢KL1 r if r ∈ deriv(R).

    MUST-ID       infer A ⇒ A (no premises), provided A ≠ ∅
    MUST-SI       from A ⇒ B infer A′ ⇒ B, provided A ⊂ A′
    MUST-UNION    from A ⇒ B and A ⇒ C infer A ⇒ B ∪ C
    MUST-TRANS    from A ⇒ b1 ∨ . . . ∨ bn and A ∧ b1 ⇒ C, . . . , A ∧ bn ⇒ C infer A ⇒ C
    QUOD-LIBET    from A ⇒ ⊥ infer A ⇒ B
    MAY-ID        infer A ⇝ A (no premises)
    MAY-SI        from A ⇝ B infer A′ ⇝ B, provided A ⊂ A′
    MAY-UNION     from A ⇝ B and A ⇝ C infer A ⇝ B ∪ C
    MAY-TRANS     from A ⇝ b1 ∨ . . . ∨ bn and A ∧ b1 ⇝ C, . . . , A ∧ bn ⇝ C infer A ⇝ C,
                  if for every c ∈ C, A ∧ c ⇒ bi for some bi ∈ {b1, . . . , bn}
    MAY-SO        from A ⇝ B infer A ⇝ B′, provided B′ ⊂ B
    MAY-MUST      from A ⇒ B infer A ⇝ B
    MAY-FALSUM    from A ∧ b ⇒ ⊥ infer A ⇝ b

Fig. 2: Inference rules of KL1

Example 2 One can check that {p ⇒ q, p ⇝ r} ⊨KL1 p ⇝ p ∨ q ∨ r. Here is a derivation using the inference rules:

    1. p ⇒ q              (premise)
    2. p ⇝ q              (MAY-MUST, 1)
    3. p ⇝ r              (premise)
    4. p ⇝ q ∨ r          (MAY-UNION, 2, 3)
    5. p ⇝ p              (MAY-ID)
    6. p ⇝ p ∨ q ∨ r      (MAY-UNION, 5, 4)

Example 3 It can also be confirmed that {p ⇒ q ∨ r, q ⇒ ⊥} ⊨KL1 p ⇒ r. Here is a derivation using the inference rules:

    1. q ⇒ ⊥          (premise)
    2. q ⇒ r          (QUOD-LIBET, 1)
    3. p ∧ q ⇒ r      (MUST-SI, 2)
    4. r ⇒ r          (MUST-ID)
    5. p ∧ r ⇒ r      (MUST-SI, 4)
    6. p ⇒ q ∨ r      (premise)
    7. p ⇒ r          (MUST-TRANS, 6, 3, 5)

Proposition 5 (Soundness) KL1 is sound: R ⊢KL1 r implies R ⊨KL1 r. That is, deriv1(R) ⊆ kl1(R).

Proofs are in the Appendix.

Conjecture 1 (Completeness) KL1 is complete: R ⊨KL1 r implies R ⊢KL1 r. That is, kl1(R) ⊆ deriv1(R).

We do not have a proof of this, although we have strong reasons to believe it is true. We have tested it empirically using a computer implementation23 of the definitions and results in Sections 3, 4, and 5. For KL1, we sample randomly generated sets R of KL1 rules, and individual rules r such that R ⊨KL1 r. Then we generate all inferential consequences of R (always finite for a finite set of rules) and test if r is among the consequences. Extensive empirical testing, from sample sets of the order of 100,000 rules, suggests that KL1 is indeed complete. We would of course prefer the full confidence of a formal proof 24.

23 See https://github.com/RichardEvans/kl haskell.
24 Completeness is not just of formal interest – it also has significant philosophical consequences. The contrapositive of completeness is that any consistent set of sentences has a model. If every finite subset of a set Γ of sentences is consistent, then (by completeness) each subset has a model. Then, by compactness, Γ has a model. Thus, completeness allows us to move from a proof-theoretic notion (consistency) to a world-directed notion (having a model), thus allowing our logic to fulfil a key role in the task of transcendental logic: explaining how it is possible for our thoughts to be about an external mind-independent world. This is related to the inverse system semantics of Achourioti and Michiel van Lambalgen [1].

3.5.1 A comparison between ⇒ and ⇝ inference rules

The MUST-TRANS and MAY-TRANS rules are very similar. But there is one extra condition in MAY-TRANS which does not appear in MUST-TRANS. The reason for the extra condition is this. Consider the simpler variant:

    MAY-TRANS-BAD    from A ⇝ b1 ∨ . . . ∨ bn and A ∧ b1 ⇝ C, . . . , A ∧ bn ⇝ C infer A ⇝ C

This rule is exactly parallel to MUST-TRANS but it is unsound: it would allow us to infer p ⇝ r from p ⇝ q and q ⇝ r. Let p ⇝ q be "if you visit the south of France, then feel free to enter a naturist resort!". Let q ⇝ r be "if you enter a naturist resort, then feel free to disrobe entirely!". We do not want to infer "if you visit the south of France, then feel free to disrobe entirely!". To avoid this, there is an extra condition in MAY-TRANS that insists that each c in C requires some bi in {b1, . . . , bn}: for every c ∈ C, there must be a rule A ∧ c ⇒ bi for some bi ∈ {b1, . . . , bn}.

3.6 The primary and secondary aim of logic

In Section 2.4 above, we claimed that, for Kant, a fundamental question of logic is: given a set of subsumptions, and a set of rules, what further subsumptions may/must I perform? This question is answered by the out1 function defined in Section 3.2 above and further characterised by the results of the sections that follow. A further, secondary question of logic is: given a set of rules I have adopted, what further rules may/must I adopt? This question is answered by the rule entailment (⊨KL1) and inference rules (⊢KL1) defined in Sections 3.4 and 3.5 above.

3.7 Comparing KL1 with other input / output logics

The family of input/output logics, broadly conceived, are logics that operate on elements that do not (necessarily) have truth-values, and in which inferences are not closed under contraposition. KL1 is a member of the family of input/output logics, very broadly conceived. However, there are a number of essential differences between KL1 and the particular input/output logics described in [43].

There is the obvious notational difference. Standard input/output logics use the pair notation (p, q) to indicate implication from p to q. That in itself is a trivial difference but KL1 needs two distinct arrows, p ⇒ q and p ⇝ q, for the two different types of rule (see Section 2.2).

That aside, the first major difference is that, in the standard input/output logics, a rule (φ,ψ) relates arbitrarily complex expressions that are closed under the Boolean connectives. In KL1, by contrast, the elements are conjunctions and disjunctions of atoms only. In KL1, the left hand side (antecedent) of a rule is a set of atoms representing a conjunction, while the right hand side (consequent) is a set of atoms representing a disjunction. In a standard input/output logic, if we apply the rule (p, q ∨ r) to the premises {p}, we get the single result {q ∨ r}.
In KL1, by contrast, if we apply the rule p ⇒ q ∨ r to the premises {p}, we get three possible answers:

    out1({p ⇒ q ∨ r}, {p}) = { {p, q}, {p, r}, {p, q, r} }

The second major difference, then, is that the output out1 of a set of rules in KL1 is a set of sets of atoms, representing different possible ways to satisfy the rules, while the output in a standard input/output logic is a single set of propositions.

A third major difference is that KL1 does not have an inference rule for weakening the output. Although there is a rule for strengthening the input (MUST-SI), there is no corresponding rule for weakening the output (WO):

    WO    from A ⇒ B infer A ⇒ B ∪ C

This rule is not valid in KL1 because it would license activities that were arbitrary25. Suppose, for example, the agent is performing the action p and has the rule:

    p ⇒ q

If we are allowed to infer, using WO, that p ⇒ q ∨ r then there will be two possible sets of actions that are compatible with the original rule plus the derived rule: {p, q} and {p, q, r}. The trouble is that the action r that is introduced in the second answer is arbitrary in the sense that it is not itself grounded in a rule.

Throughout his mature writings, Kant assumes that all activity must be grounded in a rule in order to count as activity at all. Consider the following difference: some of the movements my body performs are mere spasms, while other movements count as actions. The difference between the two, according to Kant, is that actions are movements that are grounded in a rule I have adopted. Kant's fundamental normative step is to characterise the subset of my bodily movements that count as actions as those which are subsumed under a rule. This is as true of mental activity as it is of physical activity:

    Like all our powers, the understanding in particular is bound in its actions to rules [Jäsche Logic §1] 26

Therefore, in any logic that tries to capture Kant's normative theory of activity, weakening output (WO) should not be valid (see Section 3.9). KL1 has only the following restricted form:

    MUST-UNION    from A ⇒ B and A ⇒ C infer A ⇒ B ∪ C

The final difference between KL1 and the standard input/output logics is that KL1 allows what is called "throughput" in input/output logics: for each X ∈ out1(R,A), A ⊆ X. The inference rule corresponding to throughput is MUST-ID, allowing us to infer p ⇒ p: "if you are doing p, then do p!", for all actions p. The reason why this inference rule is valid in KL1 is again because of the particular intended application: KL1 is a logic of concurrent activity, prescribing the activities that we must perform conditional on the activities we are already performing. It is for the agent to produce a package of activities that together satisfy the various rules. If it is already performing action p, then p must be part of any complete package of activity. If you are already doing p, there is no point trying to undo the performance of p – that ship has already sailed. See Section 6 for further discussion.

25 [1] and [2] make a similar point, for different reasons. They argue that B ∨ C may not constitute a "whole". See footnotes 10 and 41 of [1].

3.8 Comparing KL1 to other logics of imperatives

Many logics of imperatives27 start with a base truth-functional logic and extend it with one or more imperatival operators (e.g. "!"). For example, given a set Σ of sentences that have a truth-functional semantics (e.g. sentences of classical propositional logic or first-order logic), with S ranging over sentences in Σ, define an imperative language L as:

    L := S | !S

Given such a framework, imperatival inference can be explained using Dubislav's trick [25]: !p entails !q whenever p entails q. We stress, however, that KL1 does not use this framework.
We do not presuppose an existing base language Σ in which truth-conditions have already been assigned28. In KL1, the atoms 𝒜 represent actions that do not have truth-values. In this crucial respect, KL1 is closer to the input/output logics than it is to most logics of imperatives.

26 Or to put it another way, mental occurrences not grounded in a rule "would then belong to no experience, and would consequently be without an object, and would be nothing but a blind play of representations, i.e., less than a dream." [A112], see also [A156/B195]. See [18] for further elaboration.
27 For example, [13,27,12].
28 In this respect, our approach follows the pragmatist order of explanation, in which rules specifying what to do are explanatorily prior to rules specifying what is the case. See Sections 2.4 and 6.

[12] identifies three sets of requirements that any logic of imperatives must satisfy:

1. imperatives can stand in inconsistency relations
2. imperatives can stand in inferential relations
3. imperatives can be embedded

Requirement 1: A set R of rules is strongly inconsistent in KL1 if, for every set A of atoms, out1(R,A) = ∅. A set R of rules is weakly inconsistent if there is some set A of atoms such that out1(R,A) = ∅.

Example 4

    R1 = { ⊤ ⇒ p,  p ⇒ ⊥ }
    R2 = { p ⇒ q,  p ⇒ r,  q ∧ r ⇒ ⊥ }

Here, R1 is strongly inconsistent but R2 is only weakly inconsistent. When A = {}, out1(R2, {}) = {∅} ≠ ∅.

[12] insists that an imperative requiring φ is inconsistent with a permissive allowing ∼φ. In KL2, where we add a form of negation to KL1, the following set R5 is not inconsistent (not even weakly inconsistent):

    R5 = { p ⇒ q,  p ⇝ ∼q }

This is because the permissive rule is weak and is overridden by a ⇒ rule. Of course, differences of intuition are to be expected here because [12] is developing a semantics for conditional imperatives in natural language, while our project is to provide a logic of conditional imperatives for describing rules of thought.
Requirement 2: inferential relations in KL1 are defined in terms of a semantic (⊨KL1) and a syntactic (⊢KL1) notion of entailment. We have commented on the relationship (soundness and completeness) between them.

Requirement 3: the central motivating cases of embedding in [12] are conditional imperatives and permissives. These are precisely the types of imperative that KL1 is designed to model. Although we agree that other forms of embeddedness are also important, we do not have space to do justice to a full discussion here. We have developed an extension of KL1 that includes embedded rules (in which a rule relating p to q itself occupies the body of a further rule with head r), but we leave a full description to further work.

3.9 The deontic paradoxes in KL1

According to our interpretation of Kant, conditional imperatives and permissives play a key role in both his theoretical and his practical philosophy. In this section, we sketch how KL1 handles some of the standard deontic paradoxes of practical reasoning.

Ross's paradox [48] was first described for von Wright's deontic logic, but it also applies to logics of conditional imperatives. Suppose we are given the order:

    Post the letter!

Now the declarative proposition "x posts the letter" entails the proposition "either x posts the letter or x burns it." But we do not want to infer from this entailment and the original order that:

    Therefore: post the letter or burn it!

One way of seeing the problem with this conclusion is by inferring (via the inference that if you must do something, then you may do it):

    Therefore: you may post the letter or burn it!

and then inferring (since permission distributes over disjunctions):

    Therefore: you may burn it!

The absurdity in this chain of reasoning comes out even more clearly if the newly introduced disjunct is something altogether irrelevant to posting the letter, and altogether unacceptable. For example:

    Post the letter!
    Therefore: post the letter or set fire to the school!
    Therefore: you may post the letter or set fire to the school!
    Therefore: you may set fire to the school!

In the usual input/output logics, the rule for weakening the output is:

    WO    from A ⇒ B infer A ⇒ B ∨ C

KL1 avoids this paradox because it does not have the troublesome WO inference rule (see Section 3.7).

There is a paradox that is closely related to Ross's paradox, that involves conjunction rather than disjunction. Suppose you are permitted to wipe your feet and enter the house. It does not follow that you are permitted to enter the house simpliciter29. This troublesome inference does not go through in KL1. Letting w stand for "wipe your feet", e stand for "enter the house", and c stand for the conjunction of both activities, the conjunctive permission is represented by the set R of rules30:

    R = { ⊤ ⇝ c,  c ⇒ w,  c ⇒ e }

Here there are two acceptable packages of actions: doing nothing, or doing both w and e:

    out1(R, {}) = { {}, {e, w, c} }

Note that neither {e} nor {c, e} is an element of out1(R, {}).

29 See the related "Window paradox" in [25].

There is a third, related paradox involving implication. Suppose we have:

    You must do a or b!
    If you do a then you must do c!

We do not want to infer from these rules that:

    You may do b and c!

The trouble with this inference is that doing c is only conditionally permissible: it is only if you are doing a that you must (and hence may) do c. Consider the concrete example:

    You must either leave the dinner early or stay until the end.
    If you leave early, then you must interrupt the conversation to tell everyone you are going early.

It would not be acceptable both to stay until the end and to interrupt the conversation to tell everyone that you were leaving. This troublesome inference does not go through in KL1:

    out1({⊤ ⇒ a ∨ b, a ⇒ c}, {}) = { {a, c}, {b}, {a, b, c} }

Note that {b, c} is not an element of out1({⊤ ⇒ a ∨ b, a ⇒ c}, {}).
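The outcome sets quoted in this section can be reproduced mechanically. The Python sketch below is ours, not the implementation cited in the footnotes; it follows the characterisation of Proposition 1, enumerating the definite programs in def(R) and keeping the least models that satisfy R. The names `must`, `may`, `defr`, `least_model` and the triple representation of rules are assumptions chosen for illustration.

```python
from itertools import combinations, product

# A rule is (kind, body, head); kind is "must" or "may";
# body and head are frozensets of atoms.
def must(body, head):
    return ("must", frozenset(body), frozenset(head))

def may(body, head):
    return ("may", frozenset(body), frozenset(head))

def defr(rule):
    # Definition 3: one definite program per admissible choice C' of head atoms.
    kind, body, head = rule
    choices = [frozenset(c)
               for n in range(len(head) + 1)
               for c in combinations(sorted(head), n)]
    if kind == "must" and head:
        # An imperative with a non-empty head needs a non-empty choice.
        choices = [c for c in choices if c]
    # A definite clause c <- B is represented as the pair (c, B).
    return [frozenset((c, body) for c in chosen) for chosen in choices]

def least_model(program, a):
    # Definition 4: close A under the clauses of the program.
    m = set(a)
    changed = True
    while changed:
        changed = False
        for c, body in program:
            if body <= m and c not in m:
                m.add(c)
                changed = True
    return frozenset(m)

def satisfies(x, rules):
    return all(kind == "may" or not body <= x or bool(head & x)
               for kind, body, head in rules)

def out1(rules, a=()):
    # Proposition 1: least models of the programs in def(R) that satisfy R.
    result = set()
    for parts in product(*(defr(r) for r in rules)):
        x = least_model(frozenset().union(*parts), a)
        if satisfies(x, rules):
            result.add(x)
    return result

# Conjunctive permission: feel free to do c; doing c requires w and e.
conj = [may(set(), {"c"}), must({"c"}, {"w"}), must({"c"}, {"e"})]
print(sorted(sorted(x) for x in out1(conj)))

# Must do a or b; if doing a, must do c.
dinner = [must(set(), {"a", "b"}), must({"a"}, {"c"})]
print(sorted(sorted(x) for x in out1(dinner)))
```

On these two rule sets the sketch yields the empty package and {c, e, w} for the conjunctive permission, and {a, c}, {b} and {a, b, c} for the dinner example, so {e}, {c, e} and {b, c} are all excluded.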
30 It is straightforward to extend the form of rules in KL1 to allow disjunctions of conjunctions of atoms in the heads (consequents) of rules. In that version, the example would be represented by R′ = {⊤ ⇝ (w ∧ e)} with out1(R′, {}) = {{}, {w, e}}. The details are straightforward but we have omitted them here to shorten the presentation.

4 KL2: extending KL1 with negation

We define KL2 by adding a negation operator to KL1. This negation operator applies to an atom a to generate a literal ∼a. It can only be applied to atoms; we do not allow expressions such as ∼∼p, ∼(p ∧ q) or ∼(p ∨ q). Henceforth, a, b, c range over literals and A, B, C and X range over sets of literals. Where c is a literal we write c̄ for the complement of c: if a is an atom then the complement of a is ∼a and the complement of ∼a is a. When C is a set of literals, C̄ = {c̄ | c ∈ C}.

The rules of KL2 are:

    R ::= B ⇒ C | B ⇝ C

as for KL1 except that now B and C are sets of literals (representing conjunctions and disjunctions respectively).

A set of literals X satisfies a set R of rules, X ⊨ R, and satisfies an individual rule r, X ⊨ r, just as in KL1 except that now the elements of a rule are sets of literals rather than atoms: X ⊨ B ⇝ C always; X ⊨ B ⇒ C if B ⊈ X or C ∩ X ≠ ∅.

4.1 Minimal requirements on a Kantian negation operator

Kant describes a variety of properties that negation must satisfy throughout the Jäsche Logic31. As minimal requirements, we pick out two fundamental properties that he insists on. The operator ∼ from atoms to literals is our negation operator. First, p and ∼p must be incompatible32. Second, ∼p must be the most general proposition that is incompatible with p33: for any q, if p and q are incompatible, then q must entail ∼p.

To motivate the second requirement, consider the following example. Suppose Jill can support at most one of three football teams: Arsenal, Barnet, or Chelsea.
She cannot support more than one: supporting Barnet is incompatible with supporting Arsenal. But "Jill supports Barnet" (b) cannot be the negation of "Jill supports Arsenal" (a) because it is too specific. The negation ∼a of "Jill supports Arsenal" is the most general claim that is incompatible with her supporting Arsenal, and "Jill supports Chelsea" (c) is also incompatible with her supporting Arsenal. All we can say about ∼a is that b entails ∼a and c entails ∼a.

31 See [Jäsche Logic 9:51, 103-4, 109, 117-19, 124ff.].
32 See the principle of contradiction in [A150ff./B189ff.] and [Jäsche Logic 9:51].
33 Kant makes this precise claim in [Jäsche Logic §49]: "one of [a pair of contrary judgements] says more . . . than the mere negation of the other." In other words, a claim that is incompatible with p entails (but is not necessarily entailed by) the negation of p. See also Brandom [7], Humberstone [28], p.1170.

4.2 Inference rules

The extra inference rules for KL2 are given in Figure 3. They are chosen to capture the two requirements on the negation operator described above. Here we are assuming that the mutual incompatibility of a (non-empty) set A of literals is expressed by the rule A ⇒ ⊥.

    ∼-LEFT     infer c ∧ ∼c ⇒ ⊥ (no premises)
    ∼-RIGHT    from A ∧ b ⇒ ⊥ infer A ⇒ b̄

Fig. 3: Additional inference rules for KL2

Example 5 Here we use p ⇒ q to derive ∼q ⇒ ∼p:

    1. p ⇒ q               (premise)
    2. p ∧ ∼q ⇒ q          (MUST-SI, 1)
    3. q ∧ ∼q ⇒ ⊥          (∼-LEFT)
    4. p ∧ q ∧ ∼q ⇒ ⊥      (MUST-SI, 3)
    5. p ∧ ∼q ⇒ ⊥          (MUST-TRANS, 2, 4)
    6. ∼q ⇒ ∼p             (∼-RIGHT, 5)

Example 6 Here we derive ⊤ ⇒ ∼q from p ∧ q ⇒ ⊥ and ∼p ∧ q ⇒ ⊥:

    1. p ∧ q ⇒ ⊥       (premise)
    2. q ⇒ ∼p          (∼-RIGHT, 1)
    3. ∼p ∧ q ⇒ ⊥      (premise)
    4. q ⇒ ⊥           (MUST-TRANS, 2, 3)
    5. ⊤ ⇒ ∼q          (∼-RIGHT, 4)

It is instructive to look at some derived rules of KL2. Those in Figure 4 are all derivable using only ∼-LEFT and the rules of KL1. TRANSPOSE is obtained from ∼-LEFT using MUST-SI and MUST-TRANS. MUST-⊥ is obtained by repeated application of TRANSPOSE. (The case of MUST-⊥ where C = ∅ is vacuous but harmless.) INCONS generalises ∼-LEFT. We do not show the derivations in detail.
They are very straightforward and will be shown in detail when we present their KL3 versions in Section 5.

    MUST-⊥       from B ⇒ C infer B ∪ C̄ ⇒ ⊥
    TRANSPOSE    from B ⇒ b ∨ C infer B ∧ b̄ ⇒ C
    INCONS       from A ⇒ c and B ⇒ ∼c infer A ∪ B ⇒ ⊥

Fig. 4: Some derived inference rules for KL2

Of particular interest is the following pair:

    REDUCE-⊥     from B ∧ c ⇒ ⊥ and B ∧ ∼c ⇒ ⊥ infer B ⇒ ⊥
    RESOLVE-⊥    from A ∧ c ⇒ ⊥ and B ∧ ∼c ⇒ ⊥ infer A ∪ B ⇒ ⊥

The first is a special case of the second. They will be discussed in more detail in the treatment of KL3. Unlike the inference rules in Figure 4, their derivation relies on ∼-RIGHT.

Example 7 Here is how the earlier Examples 5 and 6 look with these derived inference rules. To derive ∼q ⇒ ∼p from p ⇒ q:

    1. p ⇒ q           (premise)
    2. p ∧ ∼q ⇒ ⊥      (MUST-⊥, 1)
    3. ∼q ⇒ ∼p         (∼-RIGHT, 2)

To derive ⊤ ⇒ ∼q from p ∧ q ⇒ ⊥ and ∼p ∧ q ⇒ ⊥:

    1. p ∧ q ⇒ ⊥       (premise)
    2. q ⇒ ∼p          (∼-RIGHT, 1)
    3. ∼p ∧ q ⇒ ⊥      (premise)
    4. q ⇒ p           (∼-RIGHT, 3)
    5. q ⇒ ⊥           (INCONS, 2, 4)
    6. ⊤ ⇒ ∼q          (∼-RIGHT, 5)

With RESOLVE-⊥ it is even easier:

    1. p ∧ q ⇒ ⊥       (premise)
    2. ∼p ∧ q ⇒ ⊥      (premise)
    3. q ⇒ ⊥           (RESOLVE-⊥, 1, 2)
    4. ⊤ ⇒ ∼q          (∼-RIGHT, 3)

4.3 Semantics

A set X of literals is consistent if it does not contain a pair of complementary literals a and ∼a for any atom a. It is inconsistent otherwise. 𝒜 denotes the set of atoms. Let 𝒞_𝒜 denote the set of constraint rules

    𝒞_𝒜 = {a ∧ ∼a ⇒ ⊥ | a ∈ 𝒜}

We omit the subscript 𝒜 where it is obvious from context. Clearly the set X of literals is consistent when X ⊨ 𝒞. We write V_𝒜 for the set of maximal consistent sets of literals from 𝒜, i.e., V_𝒜 is the set of sets Xm such that Xm is consistent and, for every a ∈ 𝒜, either a ∈ Xm or ∼a ∈ Xm.

Definition 6 Given a set R of rules, a set X of literals is a violating set of R if there is no maximal consistent Xm ∈ V_𝒜 such that Xm ⊇ X and Xm ⊨ R.

One can see that an inconsistent set of literals is, by definition, a violating set of every set of rules. And if X is a violating set of R then so is every X′ ⊇ X.

A violating set X of R cannot be extended to a maximal consistent set Xm ⊇ X that satisfies every rule in R.
If B ⊥ is a rule in R then X is a violating set of R when B ⊆ X. For a rule of the form B C (C ≠ ∅) and without the consistency requirement, a set X of literals can always be extended to a set X′ ⊇ X that satisfies that rule (the set of all literals satisfies it). With the consistency requirement, X cannot be extended to a consistent X′ ⊇ X that satisfies B C when B ∪ C̄ ⊆ X, where C̄ is the set of complements of the literals in C. (Indeed that condition applies to constraint rules also: X cannot be extended to satisfy B ⊥ when B ∪ ∅ ⊆ X.) It is possible that X is a violating set of a set R of rules even though X is not a violating set of any of them individually. A permissive rule B C is satisfied by every set X of literals: only inconsistent sets of literals are violating sets of permissive rules.

Example 8 R = { p q, p ∼q }
The consistent sets {p}, {p,∼q} and {p, q} are violating sets of R. All inconsistent sets are also violating sets of R. Compare:
R′ = { p ∧ q ⊥, p ∧ ∼q ⊥ }
Again, {p}, {p,∼q} and {p, q} are violating sets of R′.

Example 9 R′′ = { p ∧ q r, p ∧ ∼q r, p ∧ r ⊥ }
{p} is a violating set of R′′.

Example 10 R′′′ = { p ∧ q r, p ∧ ∼q r, r ∧ s ⊥ }
Here {r, s}, {p, s} and {p,∼r} are the minimal consistent violating sets.

Definition 7 Given a set R of rules, we define an auxiliary set of rules aux(R). The outcome function out2 for KL2 is defined using aux(R).

aux(R) = {A − {a} ā | A is a finite violating set of R, a ∈ A}
out2(R,A) = out1(R ∪ C ∪ aux(R),A)

Example 11 If R = {p ⊥}, then {p} and {p,∼p} are violating sets of R, and
aux(R) = { ⊤ ∼p, p p, ∼p ∼p, . . . }

Note the close parallel between the inference rules and the semantics. In out2(R,A), the check for consistency, expressed by the rules C, matches ∼-LEFT, while the set of rules aux(R) provides the additional inferences afforded by ∼-RIGHT. Indeed, we will show (below) that X is a (finite) violating set of R if, and only if, the rule X ⊥ is entailed by R in KL2. That will establish the connections between semantic and syntactic entailment in KL2.
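For a finite set of atoms, Definition 6 can be checked by brute force. The Python sketch below is illustrative only. The string encoding of literals ("p", "~p"), the representation of a must rule as a (body, head) pair with an empty head standing for ⊥, and all function names are assumptions introduced here. The satisfaction test (body ⊆ X implies head ∩ X ≠ ∅) is the propositional reading assumed for must rules. Permissive rules are left out, since every set satisfies them.

```python
from itertools import combinations, product

def neg(lit):
    """Complement of a literal in the assumed string encoding."""
    return lit[1:] if lit.startswith("~") else "~" + lit

def consistent(X):
    return all(neg(l) not in X for l in X)

def satisfies(X, rule):
    """Assumed reading of a must rule: if the body holds, some head literal holds."""
    body, head = rule
    return not body <= X or bool(head & X)

def maximal_sets(atoms):
    """All maximal consistent sets of literals over the given atoms."""
    for signs in product([True, False], repeat=len(atoms)):
        yield frozenset(a if s else "~" + a for a, s in zip(atoms, signs))

def is_violating(X, rules, atoms):
    """Definition 6: no maximal consistent superset of X satisfies every rule."""
    X = frozenset(X)
    return not any(
        X <= Xm and all(satisfies(Xm, r) for r in rules)
        for Xm in maximal_sets(atoms)
    )

def minimal_consistent_violating(rules, atoms):
    """Enumerate consistent violating sets by increasing size, keeping minimal ones."""
    literals = list(atoms) + ["~" + a for a in atoms]
    found = []
    for k in range(len(atoms) + 1):
        for combo in combinations(literals, k):
            X = frozenset(combo)
            if (consistent(X) and is_violating(X, rules, atoms)
                    and not any(m <= X for m in found)):
                found.append(X)
    return set(found)

if __name__ == "__main__":
    # The rule set of Example 10, in the assumed encoding.
    R10 = [
        (frozenset({"p", "q"}), frozenset({"r"})),
        (frozenset({"p", "~q"}), frozenset({"r"})),
        (frozenset({"r", "s"}), frozenset()),
    ]
    for X in sorted(map(sorted, minimal_consistent_violating(R10, ["p", "q", "r", "s"]))):
        print(X)
```

The enumeration is exponential in the number of atoms, so it is only useful for checking small examples such as Examples 8 to 10.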
Rule entailment is defined as in KL1.

Definition 8 Two rule sets R1 and R2 are strongly equivalent in KL2 if:

out2(R1,A) = out2(R2,A) for all sets A of atoms

A set R of rules entails a rule r in KL2, written R |=KL2 r, if R and R ∪ {r} are strongly equivalent in KL2.

We write kl2(R) for the set of rules semantically entailed by R in KL2: kl2(R) = {r | R |=KL2 r}. Rule sets R1 and R2 are strongly equivalent in KL2 when kl2(R1) = kl2(R2). For brevity, we will write KL1(C) for KL1 extended with the inference rule ∼-LEFT and say that R KL1(C)-entails r when R ∪ C |=KL1 r.

Proposition 6 (Decomposition) A set R of rules semantically entails a rule r in KL2 if and only if R ∪ C ∪ aux(R) semantically entails r in KL1. That is, for all rule sets R:

kl2(R) = kl1(R ∪ C ∪ aux(R))

The following is a general property of KL1.

Proposition 7 Let R be a set of rules and A a (finite) set of literals:

out1(R,A) = ∅ iff R |=KL1 A ⊥

It is a corollary of the above that R is strongly inconsistent in KL1 if and only if R |=KL1 ⊤ ⊥. This was Proposition 4.

Now we are ready to prove what we want, that X is a (finite) violating set of R precisely when X ⊥ is KL2-entailed by R. Informally, if X is a (finite, non-empty) violating set of R then

{X − {a} ā | a ∈ X} ⊆ aux(R)

And aux(R) ⊆ kl1(R ∪ C ∪ aux(R)). So straight away:

{X − {a} ā | a ∈ X} ⊆ kl2(R)

Now X ⊥ ∈ kl2(R) because (for any non-empty finite set X of literals) X ⊥ is entailed in KL1(C) by X − {a} ā, any a ∈ X. Syntactically, that is easy to see. It is just an instance of MUST-⊥:

B c   B ∧ c̄ ⊥

which is a derived rule of KL1(C). Semantically, we want to confirm that B ∧ c̄ ⊥ ∈ kl1({B c} ∪ C). That is very easy (see proof below).

Proposition 8 Let R be a set of rules. If X is a finite violating set of R then:

X ⊥ ∈ kl1(aux(R) ∪ C)

Proposition 9 Let R be a set of rules and X a finite set of literals.
R |=KL2 X ⊥ iff X is a violating set of R

Note that according to the above, ∅ is a violating set of R if and only if R |=KL2 ⊤ ⊥.

Two refinements are immediately available. First, any inconsistent set of literals is a violating set of any set R of rules. (It has no consistent superset.) But an inconsistent set of literals contributes nothing useful to aux(R). The inconsistent set {c,∼c} produces only the pair {c c, ∼c ∼c} in aux(R). These are merely instances of MUST-ID. More generally, an inconsistent set A ∪ {c,∼c} contributes the following rules to aux(R):

right({A ∧ c ∧ ∼c ⊥}) = { A ∧ c c,  A ∧ ∼c ∼c,  (A − {a}) ∧ c ∧ ∼c ā (all a ∈ A) }

The first two rules are merely consequences of MUST-ID and MUST-SI. The others are entailed by c ∧ ∼c ⊥ in KL1 by MUST-SI and QUOD-LIBET. So if X is an inconsistent set of literals, then right({X ⊥}) ⊆ kl1(C): X contributes nothing to out2(R,A) and can be ignored.

Second, if X is a violating set of R and X′ ⊇ X then X′ is also a violating set of R. Moreover, every rule in right({X′ ⊥}) can be derived in KL1 from a rule in right({X ⊥}). For suppose X′ = X ∪ Y, X and Y disjoint. Then the rules in right({X′ ⊥}) are of the following two forms:

right({X ∪ Y ⊥}) = { (X − {a}) ∪ Y ā (all a ∈ X),  X ∪ (Y − {a}) ā (all a ∈ Y) }

Rules in the first group are derived by MUST-SI from (X − {a}) ā. Rules in the second group are derived from X ⊥ by MUST-SI and QUOD-LIBET. So if X′ ⊇ X then right({X′ ⊥}) ⊆ kl1(right({X ⊥})). This means that in the construction of aux(R) it is enough to consider the minimal violating sets of R.

Definition 9 Let R be a set of rules. Define:

auxm(R) = {A − {a} ā | A is a minimal consistent violating set of R, a ∈ A}

Proposition 10 Let R be a set of rules and A a set of literals.

auxm(R) ⊆ aux(R) ⊆ kl1(auxm(R) ∪ C)

and hence

out2(R,A) = out1(R ∪ C ∪ auxm(R), A)

Proof In the preceding discussion.
□

Note that if R ∪ C is strongly inconsistent then ∅ is the only minimal consistent violating set of R. In that case auxm(R) = ∅ and out2(R,A) = out1(R ∪ C,A) = ∅ for all sets A of literals. (The converse does not hold, as observed earlier.)

Example 12 R = { ⊤ p ∨ q, p ∧ q ⊥ }
{∼p,∼q}, {p, q} are the only minimal consistent violating sets of R. (There are other violating sets, but they are either inconsistent or non-minimal.)
auxm(R) = { ∼p q, p ∼q, ∼q p, q ∼p }
out2(R, ∅) = out1(R ∪ C ∪ auxm(R), ∅) = {{p,∼q}, {q,∼p}}

Example 13 R = { p ∧ q ⊥, ∼p ∧ q ⊥ }
{q}, {p, q}, {∼p, q} are consistent violating sets of R. (There are others.) {q} is the only minimal consistent violating set.
auxm(R) = { ⊤ ∼q }
out2(R, ∅) = out1(R ∪ C ∪ auxm(R), ∅) = {{∼q}}

Example 14 R = { p q, p r, q ∧ r ⊥ }
{p}, {p, q}, {p,∼q}, {p, r}, {p,∼r}, {p, q, r}, {p, q,∼r}, {p,∼q, r}, {p,∼q,∼r}, {q, r}, {∼p, q, r} are consistent violating sets of R. (There are others.) {p} and {q, r} are the minimal violating sets.
auxm(R) = { ⊤ ∼p, q ∼r, r ∼q }
out2(R, ∅) = out1(R ∪ C ∪ auxm(R), ∅) = {{∼p}}
out2(R, {q}) = out1(R ∪ C ∪ auxm(R), {q}) = {{∼p, q,∼r}}

Finally we confirm that out2(R,A) is well-defined for non-empty sets A of assumptions.

Proposition 11 Let R be a set of rules and A a set of literals.

out2(R,A) = out2(R ∪ {⊤ a | a ∈ A}, ∅)

4.4 An alternative characterisation of out2

It is possible to construct alternative, equivalent characterisations of the auxiliary rules aux(R). The following will be used in discussions of KL3 to come and for completeness of KL2.

Observation 1 X is a violating set of R iff X is a violating set of rules R⊥ ∪ must⊥(R) where R⊥ is the set of constraint rules of the form B ⊥ in R and must⊥(R) is the set of rules obtained by applying MUST-⊥ to the rules in R:

must⊥(R) = {B ∧ c̄1 ∧ . . . ∧ c̄k ⊥ | (B c1 ∨ . . . ∨ ck) ∈ R}

Observation 2 Let R1 and R2 be sets of rules.
If X is a violating set of R1 ∪ R2 then X ⊆ X1 ∪ X2 for some X1 and X2 such that X1 is a violating set of R1 and X2 is a violating set of R2.

The following is a derived rule of KL2:

REDUCE-⊥   B ∧ c ⊥   B ∧ ∼c ⊥   B ⊥

Its derivation requires ∼-RIGHT and so it is an inference rule of KL2 not of KL1(C). We can also give a semantic justification by appeal to violating sets. If B ∪ {c} and B ∪ {∼c} are both violating sets of R then clearly so is B.

The following more general rule is also easily derived in KL2:

RESOLVE-⊥   A ∧ c ⊥   B ∧ ∼c ⊥   A ∪ B ⊥

Semantically, if A ∪ {c} and B ∪ {∼c} are both violating sets of R then so is A ∪ B. (We cannot extend consistently by either c or ∼c.) These rules will feature prominently in KL3 and we will present their derivation there.

We can reformulate RESOLVE-⊥ as a rule applying to violating sets, exactly as stated in the semantic argument. We will call that V-RESOLVE.

Definition 10 Let Γ be a set of sets of literals, where each component set of literals is a violating set. Then V-RESOLVE(Γ) = Γ ∪ {A ∪ B | A ∪ {a} ∈ Γ, B ∪ {ā} ∈ Γ}, where ā is the complement of the literal a.

Now we shall use V-RESOLVE to provide an alternative version of aux, called auxm, which provides the minimal set of auxiliary rules that are needed to derive all the consequences we want.

Definition 11 v(R) = {B ∪ C̄ | B C ∈ R, B ∪ C̄ consistent}, where C̄ is the set of complements of the literals in C.

(As observed earlier, that also covers the case of constraint rules, where C = ∅.) The elements of v(R) are all consistent violating sets of R (though there may be others). Now define v∗(R) as the closure of v(R) under V-RESOLVE. Let v∗m(R) be the set of the minimal elements of v∗(R). Assuming the construction is complete (see below) we define auxm as follows:

auxm(R) = {A − {a} ā | A ∈ v∗m(R), a ∈ A}

We could reformulate RESOLVE-⊥ so that it does not generate rules with inconsistent bodies and V-RESOLVE so that inconsistent sets are discarded.
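The construction of v(R), its closure under V-RESOLVE, and the resulting auxiliary rules can be sketched in Python as follows. This is a minimal sketch under stated assumptions: literals are strings with "~" marking negation, a must rule is a (body, head) pair of sets with an empty head standing for ⊥, and the function names v, v_star, minimal and aux_m are chosen here for illustration. Resolution is performed on a complementary pair of literals, and inconsistent resolvents are discarded as suggested above.

```python
def neg(lit):
    """Complement of a literal in the assumed string encoding."""
    return lit[1:] if lit.startswith("~") else "~" + lit

def consistent(X):
    return all(neg(l) not in X for l in X)

def v(rules):
    """Seed sets: body together with the complements of the head literals."""
    seeds = set()
    for body, head in rules:
        X = frozenset(body) | frozenset(neg(c) for c in head)
        if consistent(X):
            seeds.add(X)
    return seeds

def v_star(rules):
    """Closure of v(rules) under V-RESOLVE, discarding inconsistent sets."""
    closed = v(rules)
    changed = True
    while changed:
        changed = False
        for X in list(closed):
            for Y in list(closed):
                for a in X:
                    if neg(a) in Y:
                        Z = (X - {a}) | (Y - {neg(a)})
                        if consistent(Z) and Z not in closed:
                            closed.add(Z)
                            changed = True
    return closed

def minimal(sets):
    """Elements with no proper subset in the collection."""
    return {X for X in sets if not any(Y < X for Y in sets)}

def aux_m(rules):
    """Auxiliary rules as (body, head literal) pairs: A minus a, complement of a."""
    return {(A - {a}, neg(a)) for A in minimal(v_star(rules)) for a in A}

if __name__ == "__main__":
    # A rule set with two must rules sharing a body and one constraint.
    R = [({"p"}, {"q"}), ({"p"}, {"r"}), ({"q", "r"}, set())]
    for body, head in sorted(aux_m(R), key=lambda r: (sorted(r[0]), r[1])):
        print(sorted(body), "=>", head)
```

The loop terminates because only finitely many sets can be built from the literals occurring in the rules.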
We could also make the computation of v∗(R) more efficient by discarding nonminimal elements as soon as they are constructed during the computation of the closure v∗(R). These are details.

Example 15 R = { p ∧ q ⊥, p ∧ ∼q ⊥ }
v(R) = {{p, q}, {p,∼q}}
v∗(R) = {{p, q}, {p,∼q}, {p}}
v∗m(R) = {{p}}
auxm(R) = { ⊤ ∼p }

Example 16 R = { p q, p r, q ∧ r ⊥ }
v(R) = {{p,∼q}, {p,∼r}, {q, r}}
v∗(R) = v(R) ∪ {{p, r}, {p, q}, {p}}
v∗m(R) = {{p}, {q, r}}
auxm(R) = { ⊤ ∼p, q ∼r, r ∼q }

Example 17 R = { p ∧ q r, p ∧ ∼q r, p ∧ r ⊥ }
v(R) = {{p, q,∼r}, {p,∼q,∼r}, {p, r}}
v∗(R) = v(R) ∪ {{p,∼r}, {p, q}, {p,∼q}, {p}}
v∗m(R) = {{p}}
auxm(R) = { ⊤ ∼p }

Now we can define an alternative set of auxiliary rules auxe(R) to be used in out2(R,A) that will be useful in establishing completeness and in KL3.

Definition 12 For all rule sets R, let:

auxe(R) = {A − {a} ā | A ∈ v∗(R), a ∈ A}

In order to use auxe(R) in the computation of out2(R,A), and to preserve semantic entailment |=KL2, it is not necessary that auxe(R) generates all elements of aux(R) – only that it generates at least all minimal elements auxm(R) of aux(R) and nothing that is not in aux(R).

Proposition 12 Let R be a set of rules and X a set of literals. If X is a violating set of R and X ∉ v∗(R) then either X is inconsistent or there exists X′ ⊂ X such that X′ ∈ v∗(R).

Proof By induction on the number of rules in R, and Observation 2. □

This does not say that all elements of v∗(R) are consistent or minimal, but only that all minimal consistent violating sets of R are elements of v∗(R), which is all we need.

Clearly auxm(R) ⊆ auxe(R). Further, since the set of all violating sets of R is closed under V-RESOLVE, auxe(R) ⊆ aux(R). We also have (Proposition 10) aux(R) ⊆ kl1(auxm(R) ∪ C). Putting these observations together gives the following.

Proposition 13 Let R be a set of rules.
auxm(R) ⊆ auxe(R) ⊆ aux(R) ⊆ kl1(auxm(R) ∪ C)

and hence

kl1(R ∪ C ∪ aux(R)) = kl1(R ∪ C ∪ auxe(R))

Now we shall provide an alternative characterisation of auxe(R) in terms of inference rules of KL2.

Definition 13 Let right+(R) denote the results of applying inference rule ∼-RIGHT to rules R keeping only those rules whose bodies are consistent:

right+(R) = {A − {a} ā | A ⊥ ∈ R, A is consistent, a ∈ A}

Let resolve∗⊥(R) denote the closure of rules R under derived rule RESOLVE-⊥, i.e., the closure of R under:

resolve⊥(R) = {B ∪ B′ ⊥ | B ∧ c ⊥ ∈ R, B′ ∧ c̄ ⊥ ∈ R}

Let must⊥(R) denote the application of derived rule MUST-⊥ to R:

must⊥(R) = {B ∪ C̄ ⊥ | B C ∈ R}

Now we can provide an alternative characterisation of auxe:

Proposition 14 Let R be a set of rules.

auxe(R) = right+(resolve∗⊥(R ∪ must⊥(R)))

Proof This follows from the definitions. must⊥(R) is the set of constraint rules implied by rules of the form B C (C ≠ ∅) in R. R may also contain constraint rules of the form B ⊥. So (by definition) X ∈ v(R) when X ⊥ is a rule in R ∪ must⊥(R) and X is consistent. X ∈ v∗(R) when X ⊥ is a rule in resolve∗⊥(R ∪ must⊥(R)) and X is consistent. So auxe(R) = right+(resolve∗⊥(R ∪ must⊥(R))). □

Now this is going to be used in establishing completeness, because all the inference rules used in the construction of auxe(R) are (derived) inference rules of KL2.

4.5 Soundness and completeness

The inference rules of KL2 are those of KL1 together with ∼-LEFT and ∼-RIGHT. We write deriv2(R) to denote the set of rules that can be derived from the set R of rules by repeated application of the inference rules of KL2. We write R ⊢KL2 r if r ∈ deriv2(R).

Proposition 15 (Soundness of KL2) For all sets R of rules: deriv2(R) ⊆ kl2(R)

We would expect that if KL1 is complete with respect to out1 then KL2 is complete with respect to out2. That is indeed the case.
Proposition 16 (Conditional completeness of KL2) If KL1 is complete with respect to out1 then KL2 is complete with respect to out2. That is: if, for all sets R of rules kl1(R) ⊆ deriv1(R) then, for all sets R of rules kl2(R) ⊆ deriv2(R).

Proposition 17 (Decomposition of KL2) If KL1 is complete with respect to out1 then

deriv2(R) = deriv1(R ∪ C ∪ auxe(R))

4.6 Conservative extension

We have established that:

kl2(R) = kl1(R ∪ C ∪ aux(R))
deriv2(R) = deriv1(R ∪ C ∪ auxe(R))

One can see that KL2 is a conservative extension of KL1, both semantically and syntactically.

Proposition 18 (Conservative extension) KL2 is a conservative extension of KL1: If R is a set of rules containing no negative literals, and rule r also contains no negative literals, then r ∈ kl2(R) iff r ∈ kl1(R), and r ∈ deriv2(R) iff r ∈ deriv1(R).

We can see this claim is true by looking at the aux(R) construction: if R contains no negative literals, then all violating sets of R are sets of atoms. All the rules in aux(R) are therefore rules with singleton heads where the head is a negative literal and the body contains only positive atoms, i.e., rules of the form B ∼c where c is an atom and B is a set (possibly empty) of atoms. Any rule r containing only positive atoms can only be derived from R ∪ aux(R) (syntactically or semantically) if it can be derived from the rules R. The constraint rules C have no effect if neither R nor the entailed rule r contains negative literals.

Indeed, KL1(C) is a conservative extension of KL1 and KL2 is a conservative extension of KL1(C). kl1(R ∪ C) is a conservative extension of kl1(R) and kl2(R) is a conservative extension of kl1(R ∪ C).

KL2 can also be seen as a conservative extension of KL1 in the following rather different sense. Given a set X of literals, let X+ be the largest subset of X containing only positive literals. In other words, let X+ be the set of atoms obtained by removing all negative literals from X.
If ∆ is a set of sets of literals, let ∆+ = {X+ | X ∈ ∆}.

Proposition 19 Let R be a set of rules and A a set of assumptions, both containing no negative literals. Then:

out2(R,A)+ = out1(R,A)

In other words: out2 does not add or remove from the set of solutions to out1 – all it does is possibly add some negative literals to the existing solutions.

Example 18 Let R = { p q, p r, q ∧ r ⊥ }. Then out2(R, ∅) = {{∼p}} and out1(R, ∅) = {∅}.

4.7 Entailments

Some examples of entailments and non-entailments are given in Table 2. Note that the rule corresponding to the law of excluded middle (⊤ p ∨ ∼p) is not a theorem of KL2.

Table 2: Some entailments and non-entailments in KL2

Entailments                      Non-entailments
{} |=KL2 p ∧ ∼p q                {} ⊭KL2 ⊤ p ∨ ∼p
p ⊥ |=KL2 ⊤ ∼p                   p ∧ q ⊥ ⊭KL2 ⊤ ∼p ∨ ∼q
p q |=KL2 p q ∨ ∼q               p q ⊭KL2 p q ∨ r
p q |=KL2 ∼q ∼p                  p q ⊭KL2 ∼p ∼q
∼q ∼p |=KL2 p q                  ∼p ∼q ⊭KL2 p q
⊤ ∼p ∨ q |=KL2 p q               p q ⊭KL2 ⊤ ∼p ∨ q

4.8 Concluding remarks on negation

The treatment of negation in KL2 derives from two starting assumptions: that complementary literals c and ∼c are mutually incompatible (∼-LEFT), and that the negation of c is the most general proposition that is incompatible with c (∼-RIGHT). These two assumptions embedded in KL1 produce a form of negation in which the rules B ∧ c̄ ⊥ and B c turn out to be equivalent³⁴.

It is possible to devise some more elaborate technical constructions which weaken this equivalence, such that the rule B c entails B ∧ c̄ ⊥ but not the other way round. We have not presented any such alternatives here. The technical constructions are not difficult but we have not found support for them in Kant's writings.

34 This equivalence only holds at the propositional level. We shall see in Section 5.1 below that the two rules are not equivalent when one of them contains existentially quantified variables.
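If one accepts Proposition 9 together with the propositional equivalence noted in Section 4.8, a must rule with body B and singleton head c is entailed by R exactly when B together with the complement of c is a violating set of R. The Python sketch below uses this to check the singleton-head rows of Table 2 by brute force. The encoding (string literals with "~", rules as (body, head) pairs of sets, an empty head for ⊥) and the function names are assumptions made for illustration, and the check does not cover disjunctive heads.

```python
from itertools import product

def neg(lit):
    """Complement of a literal in the assumed string encoding."""
    return lit[1:] if lit.startswith("~") else "~" + lit

def is_violating(X, rules, atoms):
    """No maximal consistent superset of X satisfies every must rule."""
    X = frozenset(X)
    for signs in product([True, False], repeat=len(atoms)):
        Xm = frozenset(a if s else "~" + a for a, s in zip(atoms, signs))
        if X <= Xm and all(not set(b) <= Xm or set(h) & Xm for b, h in rules):
            return False
    return True

def entails_singleton(rules, body, head_literal, atoms):
    """Assumed test: the body plus the complement of the head is a violating set."""
    return is_violating(set(body) | {neg(head_literal)}, rules, atoms)

if __name__ == "__main__":
    atoms = ["p", "q"]
    contraposition = entails_singleton([({"p"}, {"q"})], {"~q"}, "~p", atoms)
    inverse = entails_singleton([({"p"}, {"q"})], {"~p"}, "~q", atoms)
    print("contraposition entailed:", contraposition)
    print("inverse entailed:", inverse)
```

An inconsistent body is handled automatically: it has no maximal consistent superset, so the rule counts as entailed, in line with the first row of Table 2.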
5 KL3: extending KL2 with variables and quantifiers

KL3 extends KL2 by adding quantified rules to KL2, including rules in which the head may have existentially quantified variables. In KL3, an atom has internal structure; it is composed of a predicate and a list of terms. Given a set P of predicate symbols with associated arities³⁵:

P+/− = P ∪ {∼p | p ∈ P}

The negation of a predicate p means "un-p". For example, ∼clear means "unclear". The formula p(x) ∼q(x) does not mean "if p(x) then do not subsume x under q!". Rather, it means: "if p(x) then do subsume x under un-q!".

Given a set P+/− of predicate symbols, a set K of constants, and a set X of variables, the set LK of ground literals is:

LK ::= {p(k1, . . . , kn) | p ∈ P+/−, ki ∈ K, arity(p) = n}

The set LX of unground literals is:

LX ::= {p(x1, . . . , xn) | p ∈ P+/−, xi ∈ X, arity(p) = n}

The set L of literals is:

L ::= {p(t1, . . . , tn) | p ∈ P+/−, ti ∈ K ∪ X, arity(p) = n}

Note that predicates of arity 0 are allowed. A literal of arity 0 is both a grounded literal and an ungrounded literal. In what follows, constants and variables are written in lower case. Constants are a, b, c, while variables are x, y, z, possibly with subscripts. To avoid cluttering the syntax, we take it to be obvious from context whether a, b, c are to be read as constants or as ranging over literals.

Note that both LK and LX are proper subsets of L, and there are literals in L that are in neither LK nor LX: any literal that contains a mixture of variables and constants is in neither. For example, p(x, k) is in neither LK nor LX.

In KL3, rules are made up entirely of unground literals from LX. No constants are allowed in any of the literals in any of the rules. This is essential. Since rules are intended to be public and shareable between agents, while intuitions are private mental objects, rules must not contain constants (intuitions) or they would not be public (see Section 2.2).
35 Some commentators believe Kant's logic only allowed monadic predicates, but [1] and [2] argue convincingly that Kant always had n-ary predicates in mind.

There are two forms of rule in KL3, as in KL1 and KL2 but with B and C ranging over sets of unground literals from LX:

R ::= B C | B C

Variables that appear in both the body and the head of a rule are read as universally quantified. For example, p(x) q(x) means: "for any x, if you perform p(x), then you must perform q(x)!". Variables that appear in the head but not in the body are existentially quantified³⁶. For example, p(x) q(x, y) means: "for any x, if you perform p(x), then you may construct a y and perform q(x, y)!". p(x) q(x, y) means: "for any x, if you perform p(x), then you must construct a y and perform q(x, y)!". To emphasise this reading, we write such rules with explicit existential quantifiers in the head, as in e.g. p(x) ∃y q(x, y) and p(x) ∃y q(x, y).

A rule such as p(x) q(x, y) ∨ r(x, y) where there is a shared variable y in the head can be read either as p(x) ∃y (q(x, y) ∨ r(x, y)) or (equivalently) as p(x) ∃y q(x, y) ∨ ∃y r(x, y). Notice that the latter is also equivalent to p(x) ∃y q(x, y) ∨ ∃z r(x, z), and therefore to the rule p(x) q(x, y) ∨ r(x, z) without explicit quantifiers. As explained below, however, for simplicity of presentation and for practical reasons we will restrict the language so that existential rules have only singleton heads. This does not restrict the expressive power of the language.

5.1 Preliminaries

Where θ is a substitution (an assignment of variables and/or constants to variables) and c is a literal, the expression c.θ denotes the application of θ to c. Where C is a set of literals C.θ = {c.θ | c ∈ C}. A substitution θ is ground when all variables in θ are assigned to constants. Where C is a set of unground literals and C.θ are ground literals we say that C.θ is a ground instantiation of C.
Example Suppose θ = {x/a} and θ′ = {y/b}. Then

p(x).θ = p(a)
q(x, y).θ.θ′ = q(a, y).θ′ = q(a, b)

Suppose θ = {x/a, y/b, z/c} and θ′ = {} (the identity substitution). Then

p(x).θ = p(a)
q(x, y).θ.θ′ = q(a, b).θ′ = q(a, b)

36 Rules with existentials in the head are common in geometric logic [14] (also known as coherent logic [5,6]), existential datalog [11], and in agent languages [37]. See Section 5.5 for further comparison.

Definition 14 A set X of ground literals satisfies a set of rules R, written X |= R, when X satisfies every rule in R. X satisfies a rule r, written X |= r, when:

X |= B C if for every ground instantiation B.θ of B, if B.θ ⊆ X then there exists a ground instantiation C.θ.θ′ of C.θ such that C.θ.θ′ ∩ X ≠ ∅
X |= B C always

Note in the above that if there are no existential variables in the head C of a rule B C then C.θ is ground and θ′ is the identity substitution. If there are existentially quantified variables in C and θ instantiates all the variables in C to constants (i.e., if C.θ is already a ground instantiation of C) then θ′ is the identity substitution.

Rules in KL3 are quantified and unground. Leaving aside rules with existential heads, it is clear that formally all ground instances of KL3 rules are – syntactically and semantically – rules of KL2 where the positive ground literals of KL3 are treated as positive literals (atoms) of KL2, and negative ground literals of KL3 as negative literals of KL2, i.e., as atoms prefixed by the negation operator. We can see that, for rules without existential heads:

X |= B C if X |= B.θ C.θ for every ground instantiation B.θ of B
X |= B C always

Universally quantified rules without existentially quantified heads behave exactly in KL3, syntactically and semantically, as the set of all their ground instances in KL2.
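The satisfaction condition of Definition 14 for must rules can be sketched in Python over a finite set of ground literals. The representation is an assumption made for illustration: a literal is a (predicate, arguments) pair, a substitution is a dictionary, terms beginning with x, y or z are treated as variables (so constants must avoid those initial letters), and the names apply, matches and satisfies are hypothetical. A head literal is accepted when its remaining (existential) variables can be bound so that it coincides with some literal in X.

```python
from itertools import product

def is_var(term):
    # Assumed convention: variables are x, y, z, possibly with a subscript.
    return term[0] in "xyz"

def apply(literal, theta):
    """Apply a substitution (a dict from variables to terms) to a literal."""
    pred, args = literal
    return (pred, tuple(theta.get(t, t) for t in args))

def variables(literals):
    return sorted({t for _, args in literals for t in args if is_var(t)})

def matches(pattern, ground):
    """True if binding the variables left in `pattern` can yield `ground`."""
    (p, pargs), (g, gargs) = pattern, ground
    if p != g or len(pargs) != len(gargs):
        return False
    binding = {}
    for t, k in zip(pargs, gargs):
        if is_var(t):
            if binding.setdefault(t, k) != k:
                return False
        elif t != k:
            return False
    return True

def satisfies(X, body, head):
    """Must-rule case of Definition 14, restricted to constants occurring in X."""
    constants = sorted({k for _, args in X for k in args})
    body_vars = variables(body)
    for values in product(constants, repeat=len(body_vars)):
        theta = dict(zip(body_vars, values))
        if all(apply(b, theta) in X for b in body):
            if not any(matches(apply(c, theta), lit) for c in head for lit in X):
                return False
    return True

if __name__ == "__main__":
    body, head = [("p", ("x",))], [("q", ("x", "y"))]   # existential variable y
    print(satisfies({("p", ("a",)), ("q", ("a", "b"))}, body, head))
    print(satisfies({("p", ("a",))}, body, head))
```

Only substitutions into constants that occur in X are enumerated, which is enough because a body instance containing any other constant cannot be a subset of X. An empty head plays the role of ⊥.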
Rules with existentially quantified heads however are a different kind of rule and have to be treated specially. Consider the very simplest example:

p ∃x q(x)

At first sight it might seem that this rule cannot be violated, that there is (apparently) no consistent violating set because we can always extend a (consistent) set of ground literals by finding a new candidate q(ki) atom. But that is not so. The set {p,∼q(a0),∼q(a1), . . . } (infinitely many ∼q(ai) literals) is a violating set, as are all of its supersets.

Ordinarily, in the semantics adopted for negation in KL2, if X is a violating set of rule set R then R entails the rule X ⊥. That does not work here: X in this example would represent an infinite conjunction, which is not well-formed. Put another way, the derived inference rule MUST-⊥, which we rely on in the construction of auxe(R) in KL2, would look as follows:

p ∃x q(x)   p ∧ ∼q(x) ⊥

That rule does not hold for existential rules. The quantification is wrong. To make it valid we would need

p ∃x q(x)   p ∧ ∼∃x q(x) ⊥

but the consequent is not an allowed rule form in KL3.

What about ∼-RIGHT? Could the following be valid?

p ∧ q(x) ⊥   p ∃x∼q(x)

Clearly not. "You must not perform p and q(x) for any x!" should not imply "if you perform p you must also construct an x and perform ∼q(x)!". For KL3 we will need a restricted form of ∼-RIGHT, as discussed in the next section. For example, the following inference is valid:

p ∧ ∼q(x) ⊥   ∼q(x) ∼p

Further, notice that the following rules

⊤ ∃x q(x)
q(x) ⊥

are strongly inconsistent (with the obvious definition). And that the following pair

p ∃x q(x)
q(x) ⊥

is weakly inconsistent and has a violating set {p}.

Now this is key, because we will want to construct a set auxqe(R) of auxiliary rules for KL3 in analogy to the construction of auxe(R) in KL2.
auxe(R) employs a combination of MUST-⊥, to derive constraint rules from non-constraint rules, and then RESOLVE-⊥ to process constraint rules. That is not available here – we do not have MUST-⊥ for existential rules. For existential rules we need (the general form of) the following inference rule, a generalisation of the example above:

A ∃x q(x)   B ∧ q(x) ⊥   A ∪ B ⊥

In the next section we will call the general form of the above inference rule EXISTS-⊥. In KL2 its propositional analogue can be derived from MUST-⊥ followed by an application of RESOLVE-⊥. In KL3 it can be given a semantic justification in terms of violation sets, as sketched above for the example. It is also derivable from the inference rules for KL3 to be presented in the next section – however, as we show there, the derivation imposes certain restrictions on variables that limit its applicability in KL3. Similarly, we are also going to need the KL3 analogue of RESOLVE-⊥; its derivation likewise will impose certain restrictions in order to deal correctly with quantifiers.

For ease of exposition, we restrict attention to the special case of existential rules with singleton heads. Note that this restriction causes no loss of expressive power: we can express rules with existentially quantified disjunctive heads by introducing auxiliary predicates if necessary.

In summary we have:

– universally quantified rules without existentially quantified heads; they have exactly the same meaning – the same semantics and inference rules – as sets of all their ground instances in KL2;
– inference rules for converting existential rules to constraint rules, which we can justify by appeal to violation sets, and which are derivable from the inference rules for KL3 to be presented in the next section. The inference rule EXISTS-⊥ for existential rules with singleton heads is simple.
5.2 Inference rules

The inference rules for KL3 are provided in Figure 5. As explained above, for simplicity we deal only with the case of universally quantified rules and existential rules with singleton heads. In Figure 5, a, b, c range over unground literals, and A, B, C, A′, B′, C′ range over sets of unground literals.

The inference rules are of two types: those that are valid for all rules, and those that are valid only for universally quantified rules without existential variables in the head. In the figure they are distinguished by specifying restrictions on variables.

SUB-1 and SUB-2 are specific to KL3. They allow the uniform replacement of variables by variables, enabling, for example, the inference from p(x) q(x) to p(y) q(y). In SUB-1 and SUB-2, the substitution θ must be injective on the existential variables (the variables in B − A). Without this restriction, they would allow the inference from p(x) ∃y q(x, y) to p(x) q(x, x), which is invalid.

In MUST-SI and MAY-SI, the new literals in A′ − A must not bind any of the existential variables in B − A. Without this restriction, we would be able to infer from p(x) ∃y q(x, y) to p(x) ∧ r(y) q(x, y), which is invalid.

In ∼-RIGHT, we insist that var(c) ⊆ var(B). Without this restriction, we would be able to infer wrongly from p(x) ∧ q(x, y) ⊥ to p(x) ∃y∼q(x, y).

Figure 6 shows three derived inference rules. They will be used, as in KL2, in the construction of auxiliary rules auxqe(R) used in the definition of the out function for KL3.

MUST-⊥ was used in KL2. It is valid for universally quantified rules without existentially quantified heads but not for rules with existentially quantified heads. Its derivation is presented below in order to show how the restrictions on variables are inherited from MUST-SI. For brevity we only show the derivation for the special case of a rule with singleton head. The derivation of the general form is easily reconstructed.
B c MUST-SI var(c) ⊆ var(B) B ∧ c c ∼-LEFT c ∧ c ⊥ MUST-SI B ∧ c ∧ c ⊥ MUST-TRANS B ∧ c ⊥ EXISTS-⊥, also discussed informally in the previous section, gives the conditions under which we can derive a universally quantified constraint rule from an existential rule. We present only the version for existential rules with 44 R. Evans et al. SUB-1 A B A.θ B.θ when θ injective on var(B − A) SUB-2 A B A.θ B.θ when θ injective on var(B − A) MUST-ID − A A A , ∅ MUST-UNION A B A C A B ∪ C MUST-SI A B A′ B A ⊂ A′, var(A′ − A) ∩ var(B − A) = ∅ MUST-TRANS A b1 ∨ . . . ∨ bn A ∧ b1  C . . . A ∧ bn  C A C QUOD-LIBET A ⊥ A B MAY-ID − A A MAY-UNION A B A C A B ∪ C MAY-SI A B A′ B A ⊂ A′, var(A′ − A) ∩ var(B − A) = ∅ MAY-TRANS A b1 ∨ . . . ∨ bn A ∧ b1  C . . . A ∧ bn  C A C if for every c ∈ C, A ∧ c bi for some bi ∈ {b1, . . . , bn} MAY-SO A B A B′ B′ ⊂ B MAY-MUST A B A B MAY-FALSUM A ∧ b ⊥ A b ∼-LEFT − c ∧ ∼c ⊥ ∼-RIGHT A ∧ b ⊥ A b var(b) ⊆ var(A) Fig. 5: Inference rules for KL3 singleton heads. The rule can be justified semantically, by reference to violation sets, and also derived from the inference rules in Figure 5: the derivation uses MUST-SI and MUST-TRANS and for this reason EXISTS-⊥ inherits restrictions on variables from MUST-SI. Notice in particular that none of the existentially quantified variables in A c may appear in the literals B of B ∧ c ⊥. Formalizing Kant's Rules: A Logic of Conditional Imperatives and Permissives 45 MUST-⊥ A C A ∪ C ⊥ var(C) ⊆ var(A) EXISTS-⊥ A c B ∧ c ⊥ A ∪ B ⊥ (var(c) − var(A)) ∩ var(B) = ∅ RESOLVE-⊥ A ∧ c ⊥ B ∧ c ⊥ A ∪ B ⊥ var(c) ⊆ var(A) or var(c) ⊆ var(B) Fig. 6: Three derived inference rules of KL3 A c MUST-SI (var(c) − var(A)) ∩ var(B) = ∅ A ∪ B c B ∧ c ⊥ MUST-SI (A ∪ B) ∧ c ⊥ MUST-TRANS A ∪ B ⊥ RESOLVE-⊥was introduced in its quantifier free form in the section on KL2. 
Although it deals with universally quantified constraint rules its derivation relies on EXISTS-⊥ and ∼-RIGHT from which it inherits restrictions on variables:

A ∧ c ⊥
∼-RIGHT (var(c) ⊆ var(A))   A c̄
B ∧ c̄ ⊥
EXISTS-⊥   A ∪ B ⊥

(and the symmetric form, which gives the variable restrictions quoted for RESOLVE-⊥ in Figure 6).

5.3 Semantics

A set of ground literals is consistent when it contains no complementary pair of literals p(k1, . . . , kn) and ∼p(k1, . . . , kn). Violation sets (sets of ground literals) are defined as in KL2.

Given a (countable but not necessarily finite) set R of rules and a (finite) set A of ground literals, the consequences out3(R,A) are defined, as in KL2, in terms of a set auxqe(R) of additional rules representing the consequences of the inference rules for negation, ∼-LEFT and ∼-RIGHT. We will have:

out3(R,A) = outq1(R ∪ CP ∪ auxqe(R), A)

CP is the set of rules corresponding to ∼-LEFT:

{p(x1, . . . , xn) ∧ ∼p(x1, . . . , xn) ⊥ | p ∈ P, xi ∈ X, arity(p) = n}

As usual we omit the subscript P when it is obvious from context.

outq1(R,A) is the set of all possible outcomes obtained by applying the rules in R to the assumptions A. Each element of outq1(R,A) is a set (finite if R is finite) of ground literals. The definition is essentially the same as for KL1 but adjusted to deal with variables in rules. Notice that since R is a set of unground rules with variables and A is a set of grounded literals, it is no longer the case that assumptions A can be replaced by 'facts' (unconditional rules with singleton head). An expression ⊤ a where a is a grounded literal is not a valid rule in KL3 (unless a is a 0-ary term).

Definition 15 Let R be a set of KL3 rules and A a set of ground literals.

    out^q_1(R, A) = {X ∈ cns^q(R, A) | X |= R}
    cns^q_0(R, A) = {A}
    cns^q_{n+1}(R, A) = {X ∪ {t} | X ∈ cns^q_n(R, A), t ∈ step^q(R, X)}
    cns^q(R, A) = ⋃_{n≥0} cns^q_n(R, A)
    step^q(R, X) = {c.θ | B ⇒ C ∈ R or B ⇝ C ∈ R, B.θ ⊆ X, c ∈ C, c.θ is ground}

The step^q function takes a set of rules and a set of ground literals and produces all the ground literals that can be inferred in a single step using a single rule from R. step^q is exactly like the step function in the definition of cns and out1 for KL1, except for the need to instantiate variables in rules to constants in the ground literals of argument X. The substitution θ can include new fresh constants that do not appear in A that can serve as witnesses for existentially quantified variables.

Example 19 Suppose R = {p(x) ⇒ ∃y q(x, y)} and A = {p(a)}. Then:

    step^q(R, A) = {q(a, a), q(a, ν0), q(a, ν1), ...}

Here, ν0 and ν1 are new fresh constants. We assume we have an infinite stock of such constants ν0, ν1, ... .

aux^q_e(R) is defined analogously to aux_e(R) in KL2. For rules without existential heads this will be exactly as for KL2 with universal rules treated as standing for the set of their ground instances. The additional ingredient for existential rules is an application of the inference rule EXISTS-⊥ as discussed informally in the previous section.

Definition 16 Let R be a set of KL3 rules. must_⊥(R), resolve*_⊥(R) and right^+(R) are the three derived rules in Figure 6, defined as for KL2 (and in accordance with the relevant KL3 variable restrictions). Let exists*_⊥(R) denote the closure of rules R under EXISTS-⊥. Define:

    aux^q_e(R) = right^+(resolve*_⊥(exists*_⊥(R ∪ must_⊥(R))))

R ∪ aux^q_e(R) is the closure of R under ∼-RIGHT, EXISTS-⊥ and MUST-⊥. In aux^q_e(R) it is sufficient to perform a single application of MUST-⊥, which deals with non-existential rules, and then the closure under EXISTS-⊥ and RESOLVE-⊥.
The latter can be done in two separate steps, first the closure under EXISTS-⊥ and then the closure under RESOLVE-⊥. This is because (as was shown earlier) RESOLVE-⊥ is derivable as ∼-RIGHT followed by EXISTS-⊥: resolve_⊥(R) = exists_⊥(R ∪ right(R)) for any R. resolve_⊥(R), for any R, is already closed under exists_⊥.

Example 20 Suppose:

    R = { p(x) ⇒ q(x),
          ⊤ ⇒ ∃x ∼q(x) }
    A = {}

Then

    aux^q_e(R) = { p(x) ⇒ q(x),
                   ∼q(x) ⇒ ∼p(x),
                   p(x) ∧ ∼q(x) ⇒ ⊥ }

    out3(R, A) = { {∼p(ν0), ∼q(ν0)},
                   {∼p(ν0), ∼q(ν0), ∼p(ν1), ∼q(ν1)},
                   {∼p(ν0), ∼q(ν0), ∼p(ν1), ∼q(ν1), ∼p(ν2), ∼q(ν2)},
                   ... }

Note that the existentially quantified variable x in the rule ⊤ ⇒ ∃x ∼q(x) of R appears in the body of the inferred constraint rule p(x) ∧ ∼q(x) ⇒ ⊥ of aux^q_e(R). The variable restrictions in EXISTS-⊥ however do not sanction the inference of the rule p(x) ⇒ ⊥.

5.4 Equality

We add an extra binary logical operator ≠. The expression x ≠ y does not represent the act of subsuming x and y under the mark of inequality. Rather, ≠ is a testing operator that is different from the act of subsumption: to test if x ≠ y is just to see whether the denotations of x and y are distinct. Expressions of the form x ≠ y can appear only in the body of a rule; x and y must be variables appearing in the body. One can think of a rule as having two distinct components (T, r) where r is an expression of the form B ⇒ C or B ⇝ C, and T is a set, possibly empty, of ≠ tests on variables appearing in B. However, for readability, we allow the inequality tests in T to be written in the body of a rule as if they were atoms.

Example 21 Suppose R = {p(x) ⇒ ∃y q(x, y)} and A = {p(a)}.

    out3(R, A) = { {p(a), q(a, a)},
                   {p(a), q(a, ν0)},
                   {p(a), q(a, a), q(a, ν0)},
                   {p(a), q(a, ν0), q(a, ν1)},
                   ... }

If we add an extra rule containing ≠, then we can constrain the set of witnesses.
R = p(x) ∃y q(x, y)q(x, y) ∧ q(x, z) ∧ y , z ⊥ A = {p(a)} out3(R,A) =  {p(a), q(a, a)} {p(a), q(a, ν0)} {p(a), q(a, ν1)} {p(a), q(a, ν2)} . . . Note that ⊥-RIGHT does not allow us to infer from the rule q(x, y) ∧ q(x, z) ∧ y , z  ⊥ in R to the rule q(x, z) ∧ y , z  ∼q(x, y). In the (T, r) representation described above, that would be an inference from ({y , z}, q(x, y) ∧ q(x, z)  ⊥) to ({y , z}, q(x, y)  ∃z q(x, z) ), which does not satisfy the variable restrictions of ⊥-RIGHT. To handle inequality, we modify what it means for a set X of ground literals to satisfy the body of a rule to take into account the possible presence of , tests. Let us think of the inequality tests as belonging to the body, B. We will say that X satisfies B with ground instantiation of variables θ, written X |=θ B, if for every literal b in B, b.θ ∈ X, and for every expression x , y in B, the constants x.θ and y.θ are distinct. We are thereby making a unique names assumption on constants: two constants denote distinct objects when they are lexicographically distinct. The adjustment for stepq is as follows: stepq(R,X) = {c.θ | B C ∈ R or B C ∈ R, X |=θ B, c ∈ C, c.θ is ground} Formalizing Kant's Rules: A Logic of Conditional Imperatives and Permissives 49 Example 22 This example shows how the natural numbers can be constructed. The rules R are: > ∃x zero(x) zero(x) nat(x) zero(x) ∧ zero(y) ∧ x , y ⊥ nat(x) ∃y succ(x, y) succ(x, y) ∧ succ(x, z) ∧ y , z ⊥ succ(x, y) nat(y) succ(x, x) ⊥ succ(x, y) less(x, y) succ(x, y) ∧ less(y, z) less(x, z) less(x, x) ⊥ Note the rule for constructing successors. These rules allow us to create any finite subset of the natural numbers. 
For example, one of the members of out3(R, ∅) is:

    { nat(ν0), nat(ν1), nat(ν2),
      zero(ν0), ∼zero(ν1), ∼zero(ν2),
      succ(ν0, ν1), succ(ν1, ν2),
      less(ν0, ν1), less(ν0, ν2), less(ν1, ν2),
      ∼succ(ν0, ν0), ∼succ(ν0, ν2), ∼succ(ν1, ν0), ∼succ(ν1, ν1) }

5.5 Comparing KL3 with geometric logic

All rules in KL3 are of the form ∀x̄ φ(x̄) ◦→ ∃ȳ ψ(x̄, ȳ), where x̄ and ȳ are tuples of variables and ◦→ is either ⇒ (imperative) or ⇝ (permissive). These rules have the same quantifier structure as the rules of geometric logic³⁷. The geometric formulae (also known as the "coherent implications") are the implications C → D where

    C ::= ⊤ | C ∧ P
    D ::= ⊥ | D ∨ E
    E ::= ∃x C

and P ranges over L³⁸.

37 The importance of geometric logic for understanding Kant's thought is stressed in the papers by Theodora Achourioti and Michiel van Lambalgen [1,2]. For geometric logic in general, see [14], [24], [5], [6], [46].

Although the rules of KL3 have the same quantifier structure as the rules of geometric logic, there are a number of differences. First, KL3 has two types of rule, ⇒ and ⇝, while geometric logic has only one. Second, KL3 has predicate negation, while geometric logic does not include any sort of negation. Third, weakening the output is valid in geometric logic³⁹, but not in KL3.

The fourth difference between the two systems is the way in which the tree of nodes⁴⁰ is generated. In KL3, to generate the successors step^q(R, X) of a set X of atoms, we consider all rules in R whose bodies are satisfied. When we have finished constructing the nodes, we filter them to accept only those that satisfy all the ⇒ rules. In geometric logic, the dynamical proof tree is generated by considering only violated rules: rules whose body is satisfied but whose head is unsatisfied. To see the difference, consider the rule-set R consisting of only one rule:

    ⊤ ⇒ ∃x φ(x)

In geometric logic, the proof tree contains one node with the single atom φ(a0) for some constant a0.
Once the rule's head has been satisfied, it is no longer available to generate further nodes. In KL3 by contrast, the step^q function allows a rule to be applied whenever its body is satisfied, so out3(R, {}) contains infinitely many possible solutions: {φ(a0)}, {φ(a0), φ(a1)}, {φ(a0), φ(a1), φ(a2)}, ... . All of these differences are crucial to the intended application of our logic in understanding Kant (see Sections 2, 3.7, and 6).

38 Note that geometric logic, unlike KL3, does allow constants as terms in rules.
39 See the classical evaluation rule for disjunction on page 3 of [14]: X ⊩ φ1 ∨ φ2 if X ◁ U and for all Y ∈ U, Y ⊩ φ1 or Y ⊩ φ2.
40 Each "node" is a set of literals in out^q_n(R, A) at depth n.

5.6 Translating natural language into KL3

Finally in this section, and before returning to Kant's texts, we shall spend a little time showing how natural language sentences can be translated into KL3. This exercise is important because the translation guidelines for KL3 are rather different from those for translating natural language into first-order logic.

5.6.1 Singular judgements

In first-order logic, a singular judgement, such as "Caius is mortal," is translated into an atom:

    mortal(caius)

where mortal is a one-place predicate and caius is a constant representing the individual Caius.

In KL3, by contrast, an atom represents a subsumption: the act of subsuming a private mental intuition under a mark. So in KL3, declarative sentences are never translated into atoms (subsumptions). Instead, the judgement "Caius is mortal" is rendered as a rule:

    caius(x) ⇒ mortal(x)

This is a conditional imperative that relates actions. It says: for all intuitions x, if you are subsuming x under the mark "caius", then also subsume x under the mark "mortal"!
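As a rough illustration, the conditional imperative for "Caius is mortal" can be applied mechanically to a set of subsumptions. The sketch below is a deliberate simplification of the step function of Definition 15: it handles only rules whose head variables all occur in the body (so no existential witnesses are generated). The tuple encoding, the names `match` and `step`, and the intuition name i1 are hypothetical choices made for this example.

```python
def match(body, X, theta=None):
    """Yield substitutions theta such that every body literal under theta is in X."""
    theta = theta or {}
    if not body:
        yield dict(theta)
        return
    (mark, args), rest = body[0], body[1:]
    for fact_mark, fact_args in X:
        if fact_mark != mark or len(fact_args) != len(args):
            continue
        extended = dict(theta)
        # Bind each variable, rejecting a fact that clashes with an earlier binding.
        if all(extended.setdefault(v, k) == k for v, k in zip(args, fact_args)):
            yield from match(rest, X, extended)

def step(rules, X):
    """Ground head literals obtainable from X by one application of one rule."""
    derived = set()
    for body, head in rules:
        for theta in match(body, X):
            for mark, args in head:
                if all(v in theta for v in args):  # skip non-ground instances
                    derived.add((mark, tuple(theta[v] for v in args)))
    return derived

# "Caius is mortal": if you subsume x under "caius", also subsume x under "mortal".
rules = [([("caius", ("x",))], [("mortal", ("x",))])]
performing = {("caius", ("i1",))}
result = step(rules, performing)
```

Here `result` contains the single subsumption of i1 under "mortal". Nothing in the encoding treats "caius" as a constant: it is a mark like any other, which is the point of the translation.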
Now a proper noun, such as "Caius," is normally taken to imply existence (there is at least one individual denoted by "Caius") and uniqueness (there is at most one individual denoted by "Caius"). If we wish to express existence and uniqueness in KL3, we write⁴¹:

    ⊤ ⇒ ∃x caius(x)
    caius(x) ∧ caius(y) ∧ x ≠ y ⇒ ⊥

Judgements involving binary predicates are represented similarly. "Jack loves Jill" is rendered as:

    jack(x) ∧ jill(y) ⇒ loves(x, y)

plus existence and uniqueness constraints, as needed.

41 This is related to Kant's point that "It is a mere tautology to speak of universal or common concepts – a mistake that is grounded in an incorrect division of concepts into universal, particular, and singular. Concepts themselves cannot be so divided, but only their use" [Jäsche Logic p. 91]. For further discussion, see Section 6. Relatedly, note that KL3 and inclusive logic have one thing in common in that they both avoid the presupposition that the domain is non-empty.

5.6.2 All and some

Universally quantified judgements, such as "All humans are mortal," are rendered directly into KL3 as:

    human(x) ⇒ mortal(x)

Recall, once more, that this rule is a conditional imperative stating what actions you must do: if you are subsuming private mental intuition x under the mark "human" then also subsume x under "mortal"!

Judgements involving "some" can be translated into KL3 in two different ways. "Some humans are fickle," for example, can be translated into:

    human(x) ⇝ fickle(x)

This is a permissive rule: if you are subsuming intuition x under "human", then feel free to also subsume x under "fickle"! This way of translating the sentence has no existential import whatsoever. It is fully compatible with there actually being no humans at all.
The other way of translating "Some humans are fickle," by contrast, provides existential import:

    ⊤ ⇒ ∃x p1(x)
    p1(x) ⇒ human(x)
    p1(x) ⇒ fickle(x)

Here, p1 is a new predicate mark introduced to represent the conjunction of human and fickle. These rules mean: you must construct at least one intuition x and subsume x under both "human" and under "fickle"⁴².

42 The auxiliary predicate p1 is necessary because we have restricted the form of rules so as not to allow conjunctive conclusions. This restriction can be removed straightforwardly; we have not presented the details to avoid lengthening the presentation. Henceforth when presenting examples we will occasionally write rules with conjunctive conclusions without further comment. Rules with conjunctive conclusions can always be translated by introducing an auxiliary predicate, as in this example.

In [Jäsche Logic §46], Kant says that universal judgements ("all" judgements) imply particular judgements ("some" judgements). In KL3 the inference from "all" to "some" is valid if we interpret "all" and "some" in terms of "must" and "may"⁴³:

    human(x) ⇒ mortal(x) |= human(x) ⇝ mortal(x)

But the inference from "all" to "some" is not valid if we interpret "some" in terms of the existential quantifier:

    human(x) ⇒ mortal(x) ⊭ ⊤ ⇒ ∃x human(x) ∧ mortal(x)

43 An alternative way to handle the inference from 'all' to 'some' is to use a many-sorted logic (where the domain of each sort must be non-empty), thus legitimising the inference from ∀x:t φ(x) to ∃x:t φ(x). We are grateful to Michiel van Lambalgen for this suggestion.

One of the key strengths of first-order logic is its ability to handle multiply quantified sentences. We can infer, for example, from "there is some (particular) prince who has offended every delegate", that "for every delegate, there is some prince who has offended her". Aristotle's two-term logic has been rightly criticised for its inability to deal with inferences involving multiply quantified sentences.

KL3 does not suffer from the inadequacies of Aristotle's logic. A single rule in KL3 is implicitly of the form:

    ∀x̄ φ(x̄) ◦→ ∃ȳ ψ(x̄, ȳ)

where x̄ and ȳ are tuples of variables, and ◦→ is either ⇒ (imperative) or ⇝ (permissive). A single rule cannot capture a sentence of the form ∃x ∀ȳ φ(x, ȳ). However, a set of rules in KL3 can capture this. For example, "there is some (particular) prince who has offended every delegate" can be rendered as R1 below, while "for every delegate, there is some prince who has offended her" can be rendered as R2:

    R1 = { ⊤ ⇒ ∃x p1(x),
           p1(x) ⇒ prince(x),
           p1(x) ∧ delegate(y) ⇒ offended(x, y) }

    R2 = { delegate(y) ⇒ ∃x p2(x, y),
           p2(x, y) ⇒ prince(x),
           p2(x, y) ⇒ offended(x, y) }

In the above example R2 is a conservative extension of R1 in the following sense:

1. for all A and X, if X ∈ out3(R1, A) then ∃Y ∈ out3(R1 ∪ R2, A) such that X ∩ Y = X;
2. for all A and Y, if Y ∈ out3(R1 ∪ R2, A) then ∃X ∈ out3(R1, A) such that X ∩ Y = X.

More generally, [15] shows that, for each set F of first-order sentences, there is a set of sentences of geometric logic that is a conservative extension of F⁴⁴.

44 Many commentators (for example, MacFarlane [42], p.26; also [19] and [55]) assume or claim that Kant's logic does not support nested quantifiers, while our formalization presupposes that his logic does have this expressive power. Our main evidence that this common view is wrong is the systematic support our account gets from making sense of Kant's otherwise notoriously obscure and problematic Table of Judgements (section 6). But for compelling textual evidence, see [1], pages 260-2.

5.6.3 The "is" of identity and the "is" of predication

In first-order logic, the sentence "Phosphorus is bright" is translated as a predication:

    bright(phosphorus)

where bright is a one-place predicate and phosphorus is a constant. The sentence "Hesperus is Phosphorus," by contrast, involves the "is" of identity and should be translated as:

    hesperus = phosphorus

If we wish to infer that, therefore, Hesperus is bright, we need to use Leibniz's law. This is an (infinite) axiom schema licensing, for every sentence φ(x) with one free variable x, the inference:

    φ(x)     x = y
    --------------- Leibniz's Law
    φ(y)

In KL3, by contrast, the two senses of "is" do not come apart. "Phosphorus is bright" is translated as:

    phosphorus(x) ⇒ bright(x)

"Hesperus is Phosphorus" is rendered as:

    hesperus(x) ⇒ phosphorus(x)

together with the symmetric rule:

    phosphorus(x) ⇒ hesperus(x)

The inference to hesperus(x) ⇒ bright(x) does not require any infinite axiom schema. It just involves the standard MUST-TRANS inference rule.

5.6.4 Two types of negation

Natural language distinguishes between sentence-negation (e.g. "It is not the case that Jack is tall") and predicate-negation⁴⁵ (e.g. "Jack is not tall"). First-order logic, of course, cannot capture these two distinct interpretations. The only negation in first-order logic is sentential negation. But KL3 can capture the two distinct readings. "It is not the case that Jack is tall" is rendered as:

    jack(x) ∧ tall(x) ⇒ ⊥

"Jack is not tall" is rendered as:

    jack(x) ⇒ ∼tall(x)

Now in KL3 these two particular claims are provably equivalent, but in general, when existentially quantified variables are involved, sentence-negation and predicate-negation are not equivalent in KL3. Consider "Jack is not married to anyone":

    jack(x) ∧ married(x, y) ⇒ ⊥

Compare with "There is someone who Jack is not married to":

    jack(x) ⇒ ∃y ∼married(x, y)

Neither claim entails the other.

6 Recovering the Table of Judgements

The Table of Judgements [A70/B95] is divided into four "titles": Quantity, Quality, Relation, and Modality.
Kant clearly thought this division was fundamental because it appears as an organising framework throughout the critical works⁴⁶. Each title represents one structural feature of a judgement. In Kant's table, there are three possible values for each structural feature, so there are at most 3⁴ = 81 possible types of judgement⁴⁷. The four titles were in widespread use in the logic textbooks of the time⁴⁸, but Kant's particular use of them was unusual.

45 As does Kant (in transcendental logic), e.g. at [A71-3/B97-8], [Jäsche Logic 9:103-4]. See also [58] p.268.
46 See, for example, the Table of Categories [A80/B106], the four aspects of time determination [A145/B184], the four principles [A161/B200], the four ways of comparing concepts [A263/B319], the four aspects of the concept of nothing [A291-2/B348]. For Kant on the importance of his table, see [A80-1/B106-7], Prolegomena 4:306. Kant also employs the four titles as an organising framework in the Critique of Practical Reason and in the Critique of the Power of Judgement. For a comprehensive visual representation of the extent of this organizing structure in the first Critique, see goo.gl/cRVqZ7.

The Quantity of a categorical subject-predicate judgement indicates whether the extension of the subject is partly or wholly contained in the extension of the predicate. If the extension of the subject S is wholly contained in the extension of the predicate P, then we say "all S are P", and the judgement has universal quantity. If the extension of S is only partly contained in the extension of P, then we say "some S are P", and the judgement has particular quantity. If the extension of S is a singleton, and this single element is a member of the extension of P, then we say "the individual S is P", and the judgement has singular quantity.
One problem with this way of characterising Quantity is that it only applies to categorical judgements involving monadic predicates. But we shall see below how to extend this idea naturally to all other types of judgement, including hypothetical and disjunctive judgements involving binary or n-ary predicates.

Kant claimed that singular judgements are a sub-type of universal judgements [A71/B96] [Jäsche Logic 9:102]. This was a common claim for logicians working within the Aristotelian two-term logic. But note that this claim is obviously false if universal and singular judgements are interpreted in terms of first-order logic. The singular judgement p(a) is not a sub-type of the universal judgement (∀x) a(x) ⊃ p(x).

The Quality of a judgement indicates whether the predicate is affirmed or denied of the subject. If the predicate is affirmed of the subject, as in "All humans are mortal", then the judgement is affirmative. If the predicate is denied of the subject, as in "It is not the case that the soul is mortal", then the judgement is negative. But if the negation of the predicate is affirmed of the subject, as in "The soul is non-mortal", then the judgement is infinite⁴⁹. The infinite judgements are, according to Kant, a sub-type of the affirmative judgements:

    If I had said of the soul that it is not mortal, then I would at least have avoided an error by means of a negative judgement. Now by means of the proposition "The soul is non-mortal" I have certainly made an actual affirmation as far as logical form is concerned, for I have placed the soul within the unlimited domain of undying things. [A72/B97]

Note that both the distinction between negative and infinite judgements, and the claim that the infinite judgements are a sub-type of affirmative judgements, make no sense within first-order logic. In Frege's logic and its descendants, there is only one type of negation: sentence-level negation.

47 In practice, there will be slightly fewer, since some combinations are incompatible. For example, a judgement cannot both be negative and disjunctive. Nor can it be both negative and particular. See [2].
48 See in particular The Port-Royal Logic [4].
49 "In negative judgements the negation always affects the copula; in infinite ones it is not the copula but rather the predicate that is affected" [Jäsche Logic 9:104]

Kant's use of Relation is very different from its current meaning. In modern logic, a relation is an n-ary predicate where n > 1. For Kant, the Relation is a structural feature of a judgement indicating how the various subsumptions in the judgement are related to each other:

    All relations of thinking in judgement are either those a) of the predicate to the subject, b) of the ground to the consequence, and c) between the cognition that is to be divided and all of the members of the division. [A73/B98]

In case (a), when a judgement involves just two subsumptions (e.g. "all humans are mortal"), then the judgement is categorical. In case (b), when a judgement has a condition that must be satisfied (e.g. "If there is perfect justice, then obstinate evil will be punished"), then the judgement is hypothetical. In case (c), when a judgement has a disjunctive conclusion (e.g. "The world exists either through blind chance, or through inner necessity, or through an external cause"), then the judgement is disjunctive⁵⁰ [A73-4/B98-9].

Strawson [55] criticised Kant's use of Relation for being neither exhaustive nor exclusive. The three types of Relation are not exhaustive since some types of judgement (e.g. conjunctions) are not present at all. The three types of Relation are not exclusive since hypotheticals and disjunctions can, in standard propositional logic, be inter-defined using negation: p ⊃ q if and only if ¬p ∨ q. However we shall see, below, that in KL3, Kant's threefold division is very natural.

The fourth title, Modality, is a different type of feature from the others.
While Quantity, Quality, and Relation are structural features of an individual judgement, Modality (as we read Kant) is a feature indicating how the judgement relates to the rest of the judgements held by an agent:

    The modality of judgements is a quite special function of them, which is distinctive in that it contributes nothing to the content of the judgement (for besides quantity, quality, and relation there is nothing more that constitutes the content of the judgement), but rather concerns only the value of the copula in relation to thinking in general. [A74/B100]

The Modality of a judgement can be either problematic, assertoric, or apodictic. These are not the alethic modalities of possibility, actuality, and necessity. They are more like epistemic modals that relate us to the alethic modalities in particular ways:

    Problematic judgments are those in which one regards the assertion or denial as merely possible (arbitrary). Assertoric judgments are those in which it is considered actual (true). Apodictic judgments are those in which it is seen as necessary. [A74-5/B100]

50 Disjunctions for Kant are exclusive disjunctions (see Section 3.1).

In the [Jäsche Logic 9:108-9], Kant goes on to explain his modalities of judgement in terms of the very same normative notions (may/must) that have been so central to KL3. The difference, we shall see, is that they function at a different level.

In each of the four titles, the third moment is defined as a sub-type of the first moment. A singular judgement is a sub-type of universal judgement; an infinite judgement is a sub-type of affirmative judgement; a disjunctive judgement is a sub-type of categorical judgement, and an apodictic judgement is a sub-type of problematic judgement. According to Kant, the third moment in each title entails a judgement of the second moment.
A singular judgement entails a particular judgement; an infinite judgement entails a negative judgement; a disjunctive judgement entails a hypothetical judgement, and an apodictic judgement entails an assertoric judgement.

Kant's Table of Judgements has been roundly criticised for being incomplete, confused, or for being based on an impoverished, expressively limited logic. In this paper we argue, by contrast, that KL3 is a powerful and expressive logic in which Kant's table emerges as the most natural way of categorising rules.

6.1 KL3 makes sense of Kant's Table of Judgements

Since Kant sees a judgement as a type of rule (see Sections 1 and 2), a way of classifying rules will also be a way of classifying judgements. In this section, we shall provide four ways of classifying rules in KL3, and show how each classification corresponds to one of the four titles in the Table of Judgements.

Quantity. In KL3, there are two types of rule: conditional imperatives and conditional permissives. An imperative of the form p ⇒ q means: "if you are performing p, then also perform q!" A permissive of the form p ⇝ q means: "if you are performing p, then feel free to also perform q!" We propose the following simple identification: a rule (judgement) has universal quantity if it is a conditional imperative, while a rule has particular quantity if it is a conditional permissive. So, for example, the universal judgement "all humans are mortal" would be rendered as:

    human(x) ⇒ mortal(x)

while the particular judgement "some humans are fickle" would be rendered as:

    human(x) ⇝ fickle(x)

A singular judgement is a sub-type of universal judgement in which there is at most one object falling under the subject term. "Caius is mortal", for example, would be rendered by a pair of rules:

    caius(x) ⇒ mortal(x)
    caius(x) ∧ caius(y) ∧ x ≠ y ⇒ ⊥

This way of characterising Quantity has three appealing features. First, it shows how a singular judgement can be a type of universal judgement.
In first-order logic, by contrast, a singular judgement is typically not rendered as a type of universal judgement. Second, it shows how Quantity can apply to all types of judgement. Recall that Quantity is normally defined for affirmative categorical judgements involving monadic predicates (subject-predicate sentences of the form "S is P"), and it is unclear how to extend this definition to all types of judgement. If Quantity is based on the distinction between conditional imperatives and conditional permissives, then it applies to all types of rule. Finally, this way of defining Quantity is consistent with Kant's view⁵¹ that the inference from universal to particular quantity is valid. Consider the inference:

    All S are P
    Therefore, some S are P

If these statements are translated into first-order logic, the inference is obviously invalid: we cannot infer from ∀x s(x) ⊃ p(x) that ∃x s(x) ∧ p(x), since there may be no objects whatsoever satisfying s(x). However, when we translate into KL3 as

    s(x) ⇒ p(x)
    s(x) ⇝ p(x)

we can infer s(x) ⇝ p(x) from s(x) ⇒ p(x) using the MAY-MUST inference rule of Figure 5.

In Kant's Table of Judgements, the third moment always entails a judgement of the second moment. In the case of Quantity, a singular judgement is a type of universal judgement, which itself entails a particular judgement, using the MAY-MUST inference rule.

Quality. In KL3, a conditional imperative has the form p1 ∧ ... ∧ pn ⇒ q1 ∨ ... ∨ qm. In particular, if the disjunction is empty, then the imperative acts as a constraint: p1 ∧ ... ∧ pn ⇒ ⊥ says whatever you do, do not perform all of p1, ..., pn. Constraints can be used to represent negative judgements⁵²:

    jack(x) ∧ married(x) ⇒ ⊥

represents the judgement that it is not the case that Jack is married. Recall from Section 5 that the set of predicates in KL3 contains positive and negative marks.
Given a set P of predicate marks, the complete set P^{+/−} of signed predicates is:

    P^{+/−} = P ∪ {∼p | p ∈ P}

An infinite judgement is an affirmative judgement in which the conclusion involves a negated mark. To say that Jack is unmarried, we write:

    jack(x) ⇒ ∼married(x)

Note that the negation binds to the mark married and not to the subsumption married(x). Unlike first-order logic, KL3 is able to distinguish between negative and infinite judgements, and is able to characterise infinite judgements as a subtype of affirmative judgements.

51 See e.g. [Jäsche Logic 9:116].
52 "Negative judgements have the special job of preventing error" [A709/B737].

In Kant's Table of Judgements, the third moment always entails a judgement of the second moment. In the case of Quality, an infinite judgement (e.g. p ⇒ ∼q) entails a negative judgement (e.g. p ∧ q ⇒ ⊥) using the following inference:

    1. p ⇒ ∼q               premise
    2. p ∧ q ⇒ ∼q           from 1 by MUST-SI
    3. q ∧ ∼q ⇒ ⊥           ∼-LEFT
    4. p ∧ q ∧ ∼q ⇒ ⊥       from 3 by MUST-SI
    5. p ∧ q ⇒ ⊥            from 2 and 4 by MUST-TRANS

Relation. In KL3, imperatives of the form p1 ∧ ... ∧ pn ⇒ q1 ∨ ... ∨ qm and permissives of the form p1 ∧ ... ∧ pn ⇝ q1 ∨ ... ∨ qm can be categorised based on how many conjuncts n they have in the antecedent, and how many disjuncts m they have in the consequent. If there is one element in the antecedent, then the rule is categorical⁵³. If there are many elements in the antecedent, then the rule is hypothetical. If there is one element in the antecedent, but many elements in the consequent, then the rule is disjunctive [A73-4/B98-9].

Note that Strawson's criticism of Kant's three moments of Relation (that they are not exhaustive) does not apply to this formalization in KL3. The first two types of rule are exhaustive as long as n > 0.

Recall that in Kant's Table of Judgements, the third moment of each title is a sub-type of the first moment, and entails a judgement of the second moment.
In the case of Relation, the disjunctive judgement (because it has one element in the antecedent) is a sub-type of the categorical and entails a hypothetical judgement (using MUST-SI or MAY-SI).

53 Alternatively, if we extend KL to include embedded rules (see Section 6.3), then a rule is categorical if it does not include an embedded rule.

Modality. The Kantian agent makes sense of its sensory perturbations by constructing and applying rules. These rules are conditional imperatives and permissives relating mental acts, e.g., for all private mental intuitions x, if you are subsuming x under mark p, then also subsume x under mark q! At any moment, the Kantian agent has a set A of subsumptions that it is performing, and a set R of rules that it has adopted. Given the subsumptions A and rules R, there are various different bundles of mental activity that are compatible with A and R. These are the various sets X ∈ out(R, A): the various sets of subsumptions it may perform. There is also the one distinguished set of subsumptions it is actually performing. If we take the intersection of all the X such that X ∈ out(R, A), then we get the subsumptions it must perform.

As well as the collections of subsumption acts that it may or must perform, however, there are also the rules that it may or must adopt. These are the selfsame normative notions at work in both cases; they have their force and content relative to the agent's goal of achieving experience. But they function at different levels. Whereas the normative characterizations of subsumptions within rules were the basis of the types of Quantity, Quality, and Relation in the Table of Judgements, it is the normative characterizations of rules themselves that are the basis of the types of Modality.

[Fig. 7: Interpreting the Table of Judgements in KL]

Given a set A of subsumptions that the agent is performing, and a set R of rules it has adopted, there are various further rules that the agent may adopt.
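The "must" reading of subsumptions, as the intersection of all outcomes, can be illustrated over a finite collection. This is only a sketch: out(R, A) may be infinite, so the code assumes that a finite sample of outcomes has already been computed. The function name `must_perform`, the tuple encoding, and the witness name n0 are hypothetical; the sample is the first three outcomes listed in Example 21.

```python
def must_perform(outcomes):
    """Subsumptions common to every outcome in the sample (their intersection)."""
    outcomes = [set(X) for X in outcomes]
    return set.intersection(*outcomes) if outcomes else set()

# A finite sample of outcomes; each one is a bundle the agent may perform.
sample = [
    {("p", ("a",)), ("q", ("a", "a"))},
    {("p", ("a",)), ("q", ("a", "n0"))},
    {("p", ("a",)), ("q", ("a", "a")), ("q", ("a", "n0"))},
]
required = must_perform(sample)
```

In this sample only the subsumption p(a) is common to every bundle, so it is the only one the sketch reports as required; which witness is subsumed under q differs from bundle to bundle.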
Of course, not every set of rules can be added. Some rules may be incompatible with one of the existing rules in R. Or some rule may be incompatible with some of the agent's current subsumptions. For example, if the agent is subsuming an intuition k under mark p, and also is subsuming k under q, then it may not adopt the rule:

p(x) ∧ q(x) ⊥

We propose that the problematic judgements are the rules an agent may adopt54:

Problematic judgements are those in which one regards the assertion or denial as merely possible. [A74/B100]

Given a set R of rules that an agent has already adopted, the assertoric judgements are the rules in R that it has already committed to:

The assertoric proposition ... indicates that the proposition is already bound to the understanding according to its laws. [A76/B101]

Further, the apodictic judgements are the rules that it must adopt55, given R:

Apodictic judgements are those in which it [the judgement] is seen as necessary. [A75/B100]

These are the rules in deriv(R), in Section 3.5 above. A summary of our interpretation of Kant's Table of Judgements in KL is given in Figure 7 56.

6.2 Apperception and rule-revision

There are, then, two different levels in Kant's theory at which the same normative notions play a key role: there are the subsumptions that may / must be performed, and there are the judgements (i.e. rules) that may / must be adopted. These two levels can come apart: an agent may choose to adopt a rule saying that it must perform a particular subsumption.
In this case, there is a sense in which the subsumption is necessary, even though the rule that prompted it need not have been adopted and is, hence, contingent:

this word [the copula "is"] designates the relation of the representations to the original apperception and its necessary unity, even if the judgement itself is empirical, hence contingent [B142]

This passage brings to the fore a topic that has been latent in much of the preceding text, but which we can only discuss very briefly here insofar as it relates to an important simplification in the present account of Kant's modalities of judgement. The topic is the unity of apperception. The simplification is that we have so far avoided the question of whether, and how, the Kantian agent can reject or revise rules it has previously adopted.

54 See also [Jäsche Logic 9:109]: "The soul of man may be immortal."
55 See also [Jäsche Logic 9:109]: "The soul of man must be immortal."
56 Note that the given formulas in Figure 7 are only examples. Certain combinations are proscribed; see Section 6. But, for instance, while our example of a categorical judgement is universal and affirmative, it could just as well have been, say, particular and infinite: p(X) ∼q(X).

For Kant, unity of apperception and experience are two sides of the same coin. On our interpretation, the Kantian agent "binds" itself in two distinct but related senses when it constructs and applies its rules in constructing experience. First, it binds itself to its rules: it commits to upholding those rules. Second, it binds itself together: it forms itself into a unity by upholding its rules57. Now, the agent can only do either of these things insofar as it also binds its subsumptions into a unity, since the rules to and by which it binds itself just are procedures for generating subsumptions from subsumptions.
And as we have said, if various (meta-) constraints on this activity are satisfied, this rule-bound unity of subsumptions will constitute experience. It is in this way that a unity of consciousness arises alongside and necessarily accompanies that consciousness of unity that is our experience of a coherent, unified external world. Unity of apperception and experience are two sides of the same coin, and both are the upshot of self-legislation. So how does this relate to rule-revision? Above we assumed that the set R of rules an agent has adopted is fixed. Our prototype computer simulations [17,18] of the Kantian cognitive architecture also make the same assumption. But what if we consider the set of rules to be changing – what if we consider adding to or removing from R? Clearly some sort of revision will sometimes be required in the light of new information. Indeed, the Kantian agent will always be changing its rules to best account for the stream of sensory data. The pattern is new in every moment, constantly requiring revision in making coherent, unified sense of the on-going stream of sensory perturbations. However, if the Kantian agent can reject or revise any rule whenever it sees fit, then this makes a nonsense of the idea that the agent has previously committed to that rule (and with it, the quoted notion of necessary unity of apperception even in contingency of judgement). In what sense is the agent really bound to its rules or together into a unity? Kant's notion of spontaneity cannot just be a free-for-all – rather, it must be compatible with self-legislation. A thorough model of rule revision must answer two questions. First, under what circumstances is the agent permitted to revise a rule? Second, when it is in one of these special circumstances, what is the proper procedure for revision? What are the constraints on acceptable revision? We do not attempt a full answer or a formal implementation here – that is a task for future work. 
But in brief: first, the agent is permitted to revise a rule when its current rules cannot make sense of its current sensory stimulations; second, the only acceptable revisions of a rule-set are revisions in which all previous subsumptions are still licensed. The end towards which the agent's activity is directed is experience (and apperception): a coherent, unified representation. It cannot revise a rule-set if the new rule-set no longer legitimizes one of the activities it has already performed. (See Section 3.11.)

57 Note that there is also a sense in which the agent binds others (and itself to others) in this way, in that its rules quantify over all intuitions (see Sections 2.2 and 5).

6.3 Expressive limitations and open questions

In this section, we highlight aspects of Kant's logic that are not well captured by our formalization. We have described a logic of conditional imperatives and permissives that formalizes Kant's central notion of a rule. In our interpretation, the relata of rules are acts that do not have a truth-value. In the case of cognition, the constituents of rules are mental acts (specifically, subsumptions). In the case of practical reason, the constituents are physical acts. In both cases, both theoretical and practical reason, we use the same form of rules, the same semantics, and the same inference rules. Because our logic is designed to be able to capture what is in common between theoretical and practical reason, inferences which are only valid in theoretical reason (but not in practical reason) are not supported in KL3.

Consider, for example, the law of excluded middle, which is endorsed in [Jäsche Logic 9:117]58. In KL2 this would be expressed as ⊤  p ∨ ∼p. This is not valid in KL2 (see Section 4.7).

Consider, next, the inference from 'All A are B' to 'Some B are A'. This is endorsed in [Jäsche Logic 9:103].
The inference from 'All A are B' to 'Some B are A' follows from two simpler inferences: (i) the inference from 'All A are B' to 'Some A are B' (endorsed in [Jäsche Logic 9:116]), and (ii) the inference from 'Some A are B' to 'Some B are A' (endorsed in [Jäsche Logic 9:118]). In KL3, 'Some A are B' has two readings. One is permissive: a(x)  b(x). This rule has no existential import: if you subsume x under a, then feel free to also subsume x under b. The second reading is imperative and has existential import: ⊤  ∃x a(x) ∧ b(x). In KL3, the inference from 'All A are B' to 'Some A are B' is only valid under the first, permissive, non-existential reading of 'Some A are B'. It is not valid under the existential reading. The second inference (from 'Some A are B' to 'Some B are A') is valid in KL3 under the existential reading, but is not valid under the permissive non-existential reading:

a(x)  b(x) ⊭ b(x)  a(x)

When considering practical actions, the inference is also invalid. The conditional "If you spill the drink then you may apologize" does not entail "If you apologize then you may spill the drink." To conclude, under either interpretation of 'some' (the permissive or existential), the inference from 'All A are B' to 'Some B are A' is not valid in KL3.

Relatedly, Kant endorses the inference from 'All A are B' to 'Some B are not-A' (see [Jäsche Logic 9:103]). This inference is also not supported in KL3.

Finally, consider Kant's discussion of embedded judgements. In the first Critique, Kant says that hypothetical and disjunctive judgements "do not contain a relation of concepts but of judgements themselves" [B141] (see also [A73/B98]). However, in our formalization, rules cannot contain rules as constituents.

58 It is not clear that Kant consistently endorses the law of the excluded middle.

In our approach, a universal categorical 'all A are B' is translated to the conditional a(x)  b(x).
A hypothetical 'if an A is C, then it is also B' is translated to the conditional a(x) ∧ c(x)  b(x). In general, a categorical judgement is translated into a conditional with exactly one antecedent, while a hypothetical judgement is translated into a conditional with more than one antecedent. Note that, in this approach, the translation of the categorical 'all A are B' (a(x)  b(x)) is not a syntactic constituent of the translation of the hypothetical 'if an A is C, then it is also B' (a(x) ∧ c(x)  b(x)).

This approach to categorical and hypothetical judgements is based on Longuenesse [Kant and the Capacity to Judge, p. 103n]. The major difference is that she is analysing judgements using conditionals of classical logic, where the constituents of the conditional are truth-evaluable propositions, while our conditionals relate acts that do not have truth values59. In both Longuenesse's approach and in ours, the constituents of hypothetical judgements are not themselves judgements, but are rather more primitive elements (in our case, subsumptions). This is a clear divergence from Kant's explicit pronouncements.

We are currently developing extensions of KL1, KL2, and KL3 to include embedded rules. A rule p  (q  r) means that if you are doing p, then you must adopt the following rule: if you are doing q, then you must do r. A rule (p  q)  r means that if the rules you have adopted jointly commit you to upholding p  q, then you must do r. Extending the semantics of KL1 to include embedded rules is relatively straightforward: we modify cns of Section 3.2 so that each output is no longer a set of atoms, but a pair consisting of a set of atoms and a set of additional rules that we have adopted. But things get more complicated when embedded rules interact with bound variables. Consider, for example, the rule a(x)  (b(x)  c(x)). Is the x inside the embedded rule b(x)  c(x) bound to the same x in outer scope?
Or is it a distinct variable, thus equivalent to a(x) (b(y) c(y))? If we take the first option, then the subsumption a(k) together with the rule a(x)  (b(x)  c(x)) jointly commit us to adopting the ground rule b(k) c(k); this option requires a thorough understanding of how ground rules interact with quantified rules. If we take the second option, then the "currying" principle p ∧ q  r |= p  (q  r) is invalid, since a(x) ∧ b(x)  c(x) does not entail a(x)  (b(x)  c(x)) = a(x)  (b(y)  c(y)). There is much further work to do here.

59 There are two other important differences between her approach and ours. First, Longuenesse (p.93n) is forced to retreat to the claim, unsupported by Kant's texts, that only universal judgements are rules, whereas we can make sense of Kant's claim that all judgements are rules. Second, with support from the texts, we make use of conditional permissives, which Longuenesse does not consider.

7 Conclusion

The Kantian agent is a self-legislating rule-induction system. It makes sense of its sensory perturbations by spontaneously constructing and applying rules. If this activity satisfies various constraints, the agent achieves experience: it has constructed a coherent, unified representation of a coherent, unified external world.

We have defined a logic of conditional imperatives and permissives that was designed as a formalization of Kant's conception of rules relating acts that do not have truth-values. At its heart are the normative notions captured by conditional imperatives and permissives, rather than the notion of truth. In this paper, we showed how the rules formalized in our logic have structural features that correspond precisely to those displayed in Kant's Table of Judgements. We also explained how this logic handles the major deontic paradoxes, how it differs from related logics, and how it translates natural language sentences.
Of course our claim has not been that Kant had this precise logic in mind, but rather that it is based on, compatible with, and helps to explain part of Kant's view in the Critique of Pure Reason and associated texts.

This paper is part of a larger project to realize Kant's cognitive architecture in the medium of computation60. This wider project includes a number of other key components. First, we must give a formalizable account of the meta-rules governing the agent's construction of experience. Second, as noted above (Section 6), we must give an account of rule-revision. And third, we must extend our analysis to the personal level to account for the agent's consciously guided rule-based activity in deciding what to do. In this paper, we have mostly focused on theoretical reason, where the constituents of rules are private mental acts (subsumptions), rather than practical reason, where the constituents of rules are public physical acts. In future work, we plan to extend our account to practical reason.

References

1. Achourioti, Theodora, and Michiel van Lambalgen. (2011). A formalization of Kant's transcendental logic. The Review of Symbolic Logic, 4.02: 254-289.
2. Achourioti, Theodora, and Michiel van Lambalgen. (2012). Kant's logic revisited. PhML-2012.
3. Allais, Lucy. (2016). Conceptualism and Nonconceptualism in Kant: A Survey of the Recent Debate. In Kantian Nonconceptualism (pp. 1-25). Palgrave Macmillan, London.
4. Arnauld, Antoine, and Pierre Nicole. (1996). Logic or the art of thinking. Cambridge University Press.
5. Bezem, Marc, and Thierry Coquand. (2005). Automating coherent logic. International Conference on Logic for Programming Artificial Intelligence and Reasoning. Springer, Berlin, Heidelberg.
6. Bezem, Marc. (2005). On the undecidability of coherent logic. Lecture notes in computer science, 3838: 6.
7. Brandom, Robert B. (2008). Between saying and doing: towards an analytic pragmatism. Oxford University Press.
8. Brandom, Robert B. (2015).
From Empiricism to Expressivism. Harvard University Press.
9. Brandom, Robert B. (2009). Norms, selves, and concepts. Reason in philosophy. Harvard University Press.
10. Brook, Andrew. (1997). Kant and the mind. Cambridge University Press.
11. Cali, A., Gottlob, G. and Lukasiewicz, T., 2012. A general datalog-based framework for tractable query answering over ontologies. Web Semantics: Science, Services and Agents on the World Wide Web, 14, pp.57-83.

60 For details, see [17,18].

12. Charlow, Nate. (2014). Logic and semantics for imperatives. Journal of Philosophical Logic, 43(4), 617-664.
13. Chellas, Brian. F. (1971). Imperatives. Theoria, 37(2), 114-129.
14. Coquand, T. (2010). A completeness proof for geometric logic. Technical report, Computer Science and Engineering Department, University of Gothenburg.
15. Dyckhoff, Roy, and Sara Negri. (2015). Geometrisation of first-order logic. Bulletin of Symbolic Logic, 21.2: 123-163.
16. Van Emden, Maarten H., and Robert A. Kowalski. The semantics of predicate logic as a programming language. Journal of the ACM (JACM) 23.4 (1976): 733-742.
17. Evans, Richard. (2017). A Kantian cognitive architecture. On the Cognitive, Ethical, and Scientific Dimensions of Artificial Intelligence. Springer, Cham, 2019. 233-262.
18. Evans, Richard. (2017). Kant on constituted mental activity. APA on Philosophy and Computers.
19. Friedman, Michael. (1992). Kant and the exact sciences. Harvard University Press.
20. Geach, Peter Thomas. (1979). Names and predicables. Semiotics in Poland, 240-246.
21. Gelfond, Michael, and Vladimir Lifschitz. (1988). The stable model semantics for logic programming. ICLP/SLP, Vol. 88.
22. Gelfond, Michael, and Vladimir Lifschitz. (1991). Classical negation in logic programs and disjunctive databases. New generation computing, 9(3-4), 365-385.
23. Grossi, David, and Jones, Andrew. (2013). Constitutive norms and counts-as conditionals. Handbook of deontic logic and normative systems, p.
407-441
24. Gurevich, Yuri, Marc Bezem, and Thierry Coquand. (2003). Newman's lemma – a case study in proof automation and geometric logic. Bulletin of the European Association for Theoretical Computer Science.
25. Hansen, Jörg. (2013). Imperative logic and its problems. Handbook of Deontic Logic and Normative Systems, 137-191.
26. Hansen, Jörg. (2008). Imperatives and deontic logic. PhD Thesis, Leipzig.
27. Hofstadter, Albert., and McKinsey, John. C. (1939). On the logic of imperatives. Philosophy of Science, 6(4), 446-457.
28. Humberstone, Lloyd. The connectives. MIT Press, 2011.
29. Kant, Immanuel. (1781). Critique of pure reason. Cambridge University Press.
30. Kant, Immanuel. (2004). Lectures on logic. Cambridge University Press.
31. Kaufmann, Stefan., and Schwager, Magdalena. (2009, September). A unified analysis of conditional imperatives. In Semantics and Linguistic Theory (Vol. 19, pp. 239-256).
32. Kitcher, Patricia. (2017). A Kantian critique of transparency. Kant and the philosophy of mind. Oxford University Press.
33. Kitcher, Patricia. (1993). Kant's transcendental psychology. Oxford University Press.
34. Kitcher, Patricia. (2011). Kant's thinker. Oxford University Press.
35. Korsgaard, Christine. (2009). Self constitution. Oxford University Press.
36. Korsgaard, Christine. (2014). The constitution of agency. Oxford University Press.
37. Kowalski, Robert, and Fariba Sadri. "An agent language with destructive assignment and model-theoretic semantics." In International Workshop on Computational Logic in Multi-Agent Systems, pp. 200-218. Springer, Berlin, Heidelberg, 2010.
38. Landy, David. (2015). Kant's Inferentialism: The Case Against Hume (Vol. 11). Routledge.
39. Lifschitz, Vladimir, David Pearce, and Agustín Valverde. Strongly equivalent logic programs. ACM Transactions on Computational Logic (TOCL) 2.4 (2001): 526-541.
40. Longuenesse, Beatrice. (1998). Kant and the capacity to judge. Princeton UP.
41. Longuenesse, Beatrice. (2005).
Kant on the human standpoint. Cambridge University Press.
42. MacFarlane, John. (2002). Frege, Kant, and the logic in logicism. The Philosophical Review, 111.1: 25-65.
43. Makinson, David, and Leendert Van Der Torre. (2000). Input/output logics. Journal of Philosophical Logic, 29.4: 383-408.
44. Minker, Jack, and Carolina Ruiz. (1994). Semantics for disjunctive logic programs with explicit and default negation. Fundamenta Informaticae, 20(1, 2, 3), 145-192.
45. Mosser, Kurt. (2008). Necessity and possibility: the logical strategy of Kant's Critique of Pure Reason. CUA Press.
46. Negri, Sara. (2003). Contraction-free sequent calculi for geometric theories with an application to Barr's theorem. Archive for Mathematical Logic, 42.4: 389-401.
47. Reiter, Raymond. (1980). A logic for default reasoning. Artificial intelligence, 13.1-2: 81-132.
48. Ross, Alf. (1941). "Imperatives and Logic." Theoria, 7: 53-71.
49. Searle, John R. (1995). The construction of social reality. Simon and Schuster.
50. Sommers, Fred. (1983). The logic of natural language. Clarendon Press.
51. Stephenson, Andrew. (2013). Kant's theory of experience (Doctoral dissertation, University of Oxford).
52. Stephenson, Andrew. (2015). Kant on the object-dependence of intuition and hallucination. The Philosophical Quarterly, 65(260), 486-508.
53. Stephenson, Andrew. (2018). How to solve the knowability paradox with transcendental epistemology. Synthese, forthcoming.
54. Stephenson, Andrew. (2018). Logicism, possibilism, and the logic of Kantian actualism. Critique.
55. Strawson, Peter. (2002). The bounds of sense. Routledge.
56. Tiles, Mary. (2004). Kant: From general to transcendental logic. The rise of modern logic: from Leibniz to Frege, 3: 85-130.
57. Vranas, Peter. (2008). New foundations for imperative logic I: Logical connectives, consistency, and quantifiers. Noûs, 42(4), 529-572.
58. Waxman, Wayne. (2005).
Kant and the empiricists: understanding understanding. Oxford University Press.
59. Waxman, Wayne. (2013). Kant's anatomy of the intelligent mind. Oxford University Press.
60. Wolff, Robert P. (1963). Kant's theory of mental activity: a commentary on the transcendental analytic of the Critique of Pure Reason. Harvard University Press.

Author Contributions Richard Evans designed the logic and wrote the first draft of the paper. Marek Sergot developed the alternative semantics for KL1, developed the semantics for KL2, and improved the semantics for KL3. Andrew Stephenson improved the philosophical discussion and added further discussion of Kant. All three authors edited, revised, and polished the final draft.

Acknowledgements We are very grateful to Michiel van Lambalgen for extensive feedback, and for a number of suggestions that have been incorporated into the text. Thanks also to Barnaby Evans for feedback and suggestions.

A Proofs

Proposition 5 (Soundness) KL1 is sound: R ⊢KL1 r implies R |=KL1 r. That is, deriv1(R) ⊆ kl1(R).

Proof We show that if r ∈ deriv1(R) then, for all A, out1(R,A) = out1(R ∪ {r},A). The proof is by induction on the length of the derivation. The base case is trivial. For the inductive step, suppose r was derived in one step from r1 and r2. By the inductive hypothesis out1(R ∪ {r1, r2},A) = out1(R,A) for all A. We must show that, for all A:

out1(R ∪ {r1, r2, r}, A) = out1(R ∪ {r1, r2}, A)

By Proposition 1, if X ∈ out1(R ∪ {r1, r2, r}, A) then X = M(D,A) for some definite program D ∈ def(R ∪ {r1, r2, r}) and X |= R ∪ {r1, r2, r}. Suppose r was derived from r1 and r2 using MUST-UNION. (The other cases are similar and are omitted61). Let r1 = A  B, r2 = A  C, r = A  B ∪ C. It can be checked that
61 For other inference rules it is not always the case that def({r1, r2, r}) = def({r1, r2}).
However, for the first half of the proof it is sufficient to show that for every D ∈ def({r1, r2, r}) there exists D′ ∈ def({r1, r2}) such that D and D′ have the same models (and therefore the same minimal model). This is straightforward for all the inference rules of Figure 2.
def({A B ∪ C}) = def({A B, A C}), and so:

def(R ∪ {r1, r2, r}) = def(R ∪ {r1, r2})

This establishes that out1(R ∪ {r1, r2, r}, A) ⊆ out1(R ∪ {r1, r2}, A) for all A. To show the other inclusion, that out1(R ∪ {r1, r2}, A) ⊆ out1(R ∪ {r1, r2, r}, A), it remains to show that if X |= R ∪ {r1, r2} then X |= R ∪ {r1, r2, r}. To see this, note that if X |= A B then X |= A B ∪ C. □

Proposition 6 (Decomposition) A set R of rules semantically entails a rule r in KL2 if and only if R ∪ C ∪ aux(R) semantically entails r in KL1. That is, for all rule sets R:

kl2(R) = kl1(R ∪ C ∪ aux(R))

Proof kl2(R) = kl1(R ∪ C ∪ aux(R)) for all R is equivalent to saying that two rule sets R1 and R2 are strongly equivalent in KL2, kl2(R1) = kl2(R2), iff R1 ∪ C ∪ aux(R1) and R2 ∪ C ∪ aux(R2) are strongly equivalent in KL1, kl1(R1 ∪ C ∪ aux(R1)) = kl1(R2 ∪ C ∪ aux(R2)).

kl2(R1) = kl2(R2)
iff out2(R1,A) = out2(R2,A) for all A
iff out1(R1 ∪ C ∪ aux(R1), A) = out1(R2 ∪ C ∪ aux(R2), A) for all A
iff kl1(R1 ∪ C ∪ aux(R1)) = kl1(R2 ∪ C ∪ aux(R2)) □

Proposition 7 Let R be a set of rules and A a (finite) set of literals:

out1(R,A) = ∅ iff R |=KL1 A ⊥

Proof We need to show that out1(R,A) = ∅ iff out1(R ∪ {A ⊥}, A′) = out1(R,A′) for all sets A′ of literals. For left-to-right: suppose out1(R,A) = ∅. First observe that out1(R ∪ {A ⊥}, A′) ⊆ out1(R,A′) for all A′. (Because if X ∈ out1(R ∪ {A ⊥},A′) then X is computed from assumptions A′ using the non-constraint rules of R and X |= R ∪ {A  ⊥}. Since X |= R, that means X ∈ out1(R,A′) also.) It remains to show that out1(R,A′) ⊆ out1(R ∪ {A ⊥},A′) for all A′. Assume X ∈ out1(R,A′); we shall show X ∈ out1(R ∪ {A  ⊥},A′).
Since X ∈ out1(R,A′), there is a D in def(R) such that X = M(D,A′). We will prove, first, that D is one of the definite programs of R ∪ {A  ⊥}, and second, that X |= R ∪ {A  ⊥}. First, since de fr(A  ⊥) = {∅}, D ∈ def(R ∪ {A  ⊥}). So D, as well as being one of the definite programs of R, is also one of the definite programs of R ∪ {A  ⊥}. Second, since X ∈ out1(R,A′), X |= R. We just need to show X |= A  ⊥. Since out1(R,A) = ∅, A ⊭ R by Proposition 2. Now, since X |= R, A ⊈ X, hence X |= A ⊥. These two claims entail, using Proposition 1, that X ∈ out1(R ∪ {A ⊥},A′).

For the other direction: suppose out1(R,A) ≠ ∅. We need to show that there is some A′ such that out1(R ∪ {A ⊥}, A′) ≠ out1(R,A′). Take A′ = A: clearly out1(R ∪ {A ⊥}, A) = ∅. □

Proposition 8 Let R be a set of rules. If X is a finite violating set of R then:

X ⊥ ∈ kl1(aux(R) ∪ C)

Proof If X = ∅ (∅ is a violating set of R) then {a} and {∼a} are also violating sets of R, any atom a, and {⊤ a, ⊤ ∼a} ⊆ aux(R). Clearly aux(R) ∪ C is strongly inconsistent in KL1 and so aux(R) ∪ C |=KL1 ⊤ ⊥ (Proposition 4). Suppose X ≠ ∅. If X is a (finite) violating set of R then {X − {a} a | a ∈ X} ⊆ aux(R). We show that X ⊥ ∈ kl1(C ∪ aux(R)) by showing that out1(C ∪ {X − {a} a}), X) = ∅. Consider any rule X − {a} a, a ∈ X. Suppose, for contradiction, that X′ ∈ out1(C ∪ {X − {a} a}), X). Then X′ ⊇ X and X′ |= C ∪ {X − {a} a}. In order to satisfy X − {a} a, X′ must contain a. But a ∈ X so X′ also contains a, and X′ ⊭ C. □

Remark The proof above shows that if ∅ is a violating set of R then aux(R) ∪ C is strongly inconsistent in KL1. We can also show that if R ∪ C is strongly inconsistent in KL1 then ∅ is a violating set of R. (Because then out1(R ∪ C,X) = ∅ for every set X of literals including every maximal consistent set X, and ∅ is thus a violating set of R.) The converse however is not true. Consider R = {p  ⊥, ∼p  ⊥}.
∅ is a violating set of R but R is not strongly inconsistent: out1(R, ∅) = {∅} ≠ ∅.

Proposition 9 Let R be a set of rules and X a finite set of literals.

R |=KL2 X ⊥ iff X is a violating set of R

Proof One half follows from the preceding result: if X is a (finite) violating set of R then X ⊥ ∈ kl1(C ∪ aux(R)); kl1(C ∪ aux(R)) ⊆ kl1(R ∪ C ∪ aux(R)) and so X ⊥ ∈ kl2(R). It remains to prove that if X ⊥ ∈ kl2(R) then X is a violating set of R. We will prove that if out1(R ∪ C ∪ aux(R), X) = ∅ then X is a violating set of R. Consider any definite logic program DR in the encoding def(R) of R. Let def(aux(R)) = {Daux} (all rules in aux(R) are rules with singleton heads and so there is a single definite program encoding aux(R)). M(DR ∪ Daux, X) |= aux(R) so it must be that M(DR ∪ Daux, X) ⊭ R ∪ C, i.e., either M(DR ∪ Daux, X) is inconsistent or B ⊆ M(DR ∪ Daux, X) for some rule B ⊥ in R. If X is inconsistent then X is a violating set of R. If X is consistent then consider any maximal consistent Xm ⊇ X. M(DR ∪ Daux, Xm) ⊇ M(DR ∪ Daux, X), and since Xm is maximal, Xm ⊇ M(DR ∪ Daux, Xm) ⊇ M(DR ∪ Daux, X). If M(DR ∪ Daux, X) is inconsistent then so is Xm, and that cannot be. So B ⊆ M(DR ∪ Daux, X) for some rule B ⊥ in R. But then B ⊆ Xm, and Xm ⊭ R. □

Proposition 11 Let R be a set of rules and A a set of literals.

out2(R,A) = out2(R ∪ {⊤ a | a ∈ A}, ∅)

Proof The result follows from the previous minimality result. It is enough to consider a singleton set of assumptions A = {a}. The general result follows by repeated application. If R ∪ {⊤ a} ∪ C is strongly inconsistent the result holds trivially. Suppose it is not strongly inconsistent. We need to show that:

out1(R ∪ {⊤ a} ∪ C ∪ aux(R ∪ {⊤ a})) = out1(R ∪ {⊤ a} ∪ C ∪ aux(R))

We will show that auxm(R ∪ {⊤ a}) = auxm(R) ∪ {⊤ a}. Clearly {a} and all violating sets of R are violating sets of R ∪ {⊤ a}. Further, right(a ⊥) = {⊤  a}. So aux(R) ∪ {⊤  a} ⊆ aux(R ∪ {⊤  a}).
Now (assuming R ∪ {⊤  a} ∪ C is not strongly inconsistent) {a} is a minimal consistent violating set of R ∪ {⊤  a}. If X is a minimal consistent violating set of R and a ∈ X then X is not a minimal consistent violating set of R ∪ {⊤ a}; if a ∉ X then X is a minimal consistent violating set of R ∪ {⊤ a}. So auxm(R ∪ {⊤ a}) = auxm(R) ∪ {⊤ a}. □

Proposition 15 (Soundness of KL2) For all sets R of rules:

deriv2(R) ⊆ kl2(R)

Proof We need to show soundness of the inference rules of KL1, ∼-LEFT and ∼-RIGHT with respect to semantic entailment in KL2. Since kl2(R) = kl1(R ∪ C ∪ aux(R)) (Proposition 6) and KL1 is sound with respect to kl1, the soundness of KL1 inference rules is immediate. Soundness of ∼-LEFT is just C ⊆ kl2(R) which also follows trivially. It remains to show that ∼-RIGHT is sound: if r ∈ right(R) then R |=KL2 r, or more generally, that right(R) ⊆ kl2(R). We want to show:

right(R) ⊆ kl1(R ∪ C ∪ aux(R))

We will show that right(R) ⊆ aux(R) (which implies the above). In full: if r ∈ right(R) then r is a rule of the form B c where B ∧ c ⊥ is a rule in R. In that case B ∪ {c} is a (not necessarily minimal or consistent) violating set of R, and aux(R) contains the rule B c. □

Proposition 16 (Conditional completeness of KL2) If KL1 is complete with respect to out1 then KL2 is complete with respect to out2. That is: if, for all sets R of rules kl1(R) ⊆ deriv1(R) then, for all sets R of rules kl2(R) ⊆ deriv2(R).

Proof
r ∈ kl2(R)
⇒ r ∈ kl1(R ∪ C ∪ aux(R))
⇒ r ∈ kl1(R ∪ C ∪ auxe(R)) (Proposition 13)
⇒ r ∈ deriv1(R ∪ C ∪ auxe(R)) (completeness of KL1)
⇒ r ∈ deriv2(R)

The final step is because all the inference rules used in the construction of auxe(R) in Proposition 14 are inference rules of KL2. □

Proposition 17 (Decomposition of KL2) If KL1 is complete with respect to out1 then

deriv2(R) = deriv1(R ∪ C ∪ auxe(R))

Proof Right-in-left inclusion is noted in the proof of Proposition 16.
The other inclusion is similar:

r ∈ deriv2(R)
⇒ r ∈ kl2(R) (soundness of KL2)
⇒ r ∈ kl1(R ∪ C ∪ aux(R))
⇒ r ∈ kl1(R ∪ C ∪ auxe(R)) (Proposition 13)
⇒ r ∈ deriv1(R ∪ C ∪ auxe(R)) (completeness of KL1) □

Proposition 19 Let R be a set of rules and A a set of assumptions, both containing no negative literals. Then:

out2(R,A)+ = out1(R,A)

Proof We can see this again by looking at the rules in aux(R): if R contains no negative literals, all rules in aux(R) are of the form B  ∼c where c and all of B are atoms. These rules have no effect on the outcomes computed from R except possibly to add negative literals. This is clear if we look at the translation to definite logic programs: every definite program D in the encoding def(R ∪ C ∪ aux(R)) has the form DR ∪ Daux where DR ∈ def(R) and def(aux(R)) = {Daux} (all rules in aux(R) are rules with singleton heads and so there is a single definite program encoding aux(R)). Moreover, since none of the heads of clauses in Daux appear in DR the least model M(DR ∪ Daux,A) = M(DR,M(Daux,A)) (indeed that is true whether the set A of assumptions contains negative literals or not). If A contains no negative literals, then this least model also satisfies the constraints C. □
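The proofs in this appendix repeatedly appeal to M(D,A), the least model of a definite program D together with a set A of assumptions. A minimal sketch of how such a least model can be computed by forward chaining, in the standard manner of definite logic programs [16], is given below. The clause encoding (a pair of a body set and a single head literal), the function name least_model, and the treatment of signed literals such as ∼c as opaque strings are assumptions made for illustration; the sketch does not implement def, aux, or the constraints C.

```python
def least_model(program, assumptions):
    """Least set of literals containing the assumptions and closed
    under every clause (body, head) of a definite program."""
    clauses = [(frozenset(body), head) for body, head in program]
    model = set(assumptions)
    changed = True
    while changed:  # iterate until no clause adds a new literal
        changed = False
        for body, head in clauses:
            if body <= model and head not in model:
                model.add(head)
                changed = True
    return frozenset(model)


# Hypothetical program: q follows from p; ~r follows from q and s.
D = [({"p"}, "q"), ({"q", "s"}, "~r")]
small = least_model(D, {"p"})
large = least_model(D, {"p", "s"})
print(sorted(small))   # ['p', 'q']
print(sorted(large))   # ['p', 'q', 's', '~r']
print(small <= large)  # True
```

The last line illustrates the monotonicity in the assumptions that the proof of Proposition 9 relies on when it passes from X to a maximal consistent Xm ⊇ X.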