Sten Lindström

A SEMANTIC APPROACH TO NONMONOTONIC REASONING: INFERENCE OPERATIONS AND CHOICE*

Abstract

This paper presents a uniform semantic treatment of nonmonotonic inference operations that allow for inferences from infinite sets of premisses. The semantics is formulated in terms of selection functions and is a generalization of the preferential semantics of Shoham (1987), (1988), Kraus, Lehmann, and Magidor (1990) and Makinson (1989), (1993). A selection function picks out from a given set of possible states (worlds, situations, models) a subset consisting of those states that are, in some sense, the most preferred ones. A proposition α is a nonmonotonic consequence of a set of propositions Γ iff α holds in all the most preferred Γ-states. In the literature on revealed preference theory, there are a number of well-known theorems concerning the representability of selection functions, satisfying certain properties, in terms of underlying preference relations. Such theorems are utilized here to give corresponding representation theorems for nonmonotonic inference operations. At the end of the paper, the connection between nonmonotonic inference and belief revision, in the sense of Alchourrón, Gärdenfors, and Makinson, is explored. In this connection, infinitary belief revision operations, which allow for the revision of a theory with a possibly infinite set of propositions, are introduced and characterized axiomatically. Several semantic representation theorems are proved for operations of this kind.

1. Introduction

In standard deductive logic, a proposition α is a logical consequence of a set of propositions Γ (in symbols, Γ ⊢ α) just in case α holds (or is true) in every possible state (situation, world) in which all the propositions in Γ hold.
In other words, we have the following semantic characterization of logical consequence:

Γ ⊢ α iff ⟦Γ⟧ ⊆ ⟦α⟧,

where ⟦α⟧ and ⟦Γ⟧ are the sets of all possible states in which, respectively, α and all the propositions in Γ hold. If Γ ⊆ ∆, then, of course, ⟦∆⟧ ⊆ ⟦Γ⟧. It follows that standard deductive logic is monotonic, that is:

if Γ ⊢ α and Γ ⊆ ∆, then ∆ ⊢ α. (Monotonicity)

Notions of plausible inference or default reasoning do not in general satisfy monotonicity. From the information that x is a Quaker, we may plausibly infer that x is a pacifist. However, from the information that x is a Quaker and a Republican, it is not a plausible inference to conclude that x is a pacifist. Of course, the phenomenon of nonmonotonicity is familiar also from probabilistic contexts: from α being highly probable given β, we may not conclude in general that α is highly probable given β ∧ γ.

A common idea in the literature on nonmonotonic reasoning is the following: α is a nonmonotonic consequence of Γ (in symbols, Γ |~ α) just in case α holds in all those Γ-states that are maximally plausible (from the viewpoint of some agent). Or more abstractly, Γ |~ α obtains if α holds in all the best preferred Γ-states, namely in those Γ-states to which no other Γ-state is strictly preferred (or better). Formally we represent this idea by introducing a selection function S which, given a set X of possible states, picks out the set S(X) of all the "best" elements in X. The relation |~ of nonmonotonic consequence (or plausible inference) is then defined in terms of S in the following way:

Γ |~ α iff S(⟦Γ⟧) ⊆ ⟦α⟧.

This definition will in general lead to |~ being nonmonotonic, since there is no guarantee that S(⟦Γ⟧) ⊆ ⟦α⟧ will imply that S(⟦Γ ∪ ∆⟧) ⊆ ⟦α⟧. Clearly, one of the best preferred Γ ∪ ∆-states may fail to be a best preferred member of the more inclusive class of Γ-states. Therefore, it need not be the case that S(⟦Γ ∪ ∆⟧) ⊆ S(⟦Γ⟧).
Neither does it follow that S(⟦Γ ∪ ∆⟧) ⊆ ⟦α⟧.

Different choices of underlying language, different conceptions of possible states, and different formal requirements on the selection function will give rise to different nonmonotonic logics. In this paper we shall explore some of the possibilities that ensue. In particular, we are going to study correspondences between various conditions on the selection function S - many of which are well-known from the literature on preference and choice - and conditions on the inference relation |~. In this connection it is often more natural to look at nonmonotonic inference as a Tarski-style inference operation C on sets of propositions rather than as an inference relation |~. The two notions are simply related by the equation:

C(Γ) = {α: Γ |~ α}.

The essential idea behind our semantic modelling of nonmonotonic inference goes back to McCarthy's classical paper (McCarthy, 1980) on circumscription. McCarthy presents circumscription as a formalized rule of nonmonotonic inference (he calls it a rule of conjecture) which is used in conjunction with the rules of standard logic. There are many versions of circumscription, but the essential model-theoretic idea is the same: among all the models of a formula α some are singled out as being minimal.
Minimality here can mean various things, for instance:

(i) Domain Circumscription: the minimal models of α are those that have no proper submodels that are also models of α;
(ii) Predicate Circumscription: the extensions of some designated predicates are minimized, while the domain together with the extensions of all other predicates are kept fixed;
(iii) Parameterized Predicate Circumscription: this case is like (ii), except that the extensions of some predicates (the parameters) are allowed to vary freely;
(iv) Prioritized Circumscription: there is a priority ordering of the predicates to be minimized: minimizing a predicate with higher priority is always preferred to minimizing a predicate with lower priority.

Given some notion of a minimal model, one can define a corresponding notion of minimal entailment: α minimally entails β iff all minimal α-models are β-models.¹ In order to single out the minimal models of α, and thereby the sentences that are minimally entailed by α, a new sentence, called the circumscription of α, is associated with α. This new sentence has as its models just the minimal models of α. Thus, α minimally entails β just in case β is a logical consequence of the circumscription of α. It should be noted, however, that the circumscription of α is in general a sentence of second-order logic.

To make all this a little more concrete, let us look at a special case - predicate circumscription (McCarthy, 1980). Let α(P) be a sentence involving the predicate P (for simplicity, we let P be unary). The (Predicate) Circumscription of α with P is the second-order sentence Circum(α, P) defined as:²

α(P) ∧ ∀Q(Q < P → ¬α(Q)),

where Q < P is an abbreviation for the sentence ∀x(Q(x) → P(x)) ∧ ∃x(P(x) ∧ ¬Q(x)).
Next, we introduce a strict partial ordering ≺_P on models of the language under consideration: M ≺_P N iff M and N have the same domain, all predicate symbols in the language besides P have the same extension in M and N, but the extension of P in M is a proper subset of its extension in N. We say that a model M is a P-minimal model of α if M ⊨ α (M is a model of α) and there is no model N such that N ≺_P M and N ⊨ α. Now, the models of Circum(α, P) are exactly the P-minimal models of α. Following McCarthy (1980), we say that α minimally entails β with respect to P (in symbols, α |~_P β) if all P-minimal models of α are models of β. Thus, α |~_P β holds just in case β is a logical consequence of Circum(α, P). Since a P-minimal model of α ∧ β may not be a P-minimal model of α, minimal entailment with respect to P is nonmonotonic.

Example 1: Let α be the conjunction of the sentences:³

(1) bird(tweety);
(2) ∀x(bird(x) ∧ ¬ab1(x) → canfly(x));
(3) ∀x(penguin(x) → bird(x));
(4) ∀x(penguin(x) → ab1(x));
(5) ∀x(penguin(x) ∧ ¬ab2(x) → ¬canfly(x)),

saying that (1) Tweety is a bird, (2) birds that are not abnormal1 can fly, (3) all penguins are birds, (4) penguins are abnormal1, and (5) all penguins, except those that are abnormal2, cannot fly. Applying predicate circumscription to the abnormality predicates ab1 and ab2 (i.e., minimizing the extension of these two predicates while keeping the extensions of the other predicates fixed) we infer from α that Tweety can fly. We cannot make this inference from α ∧ penguin(tweety). Since Tweety is a penguin, she is abnormal1. Hence, (2) cannot be used to infer that she can fly. On the other hand, it follows by minimality that Tweety is not abnormal2. Therefore, it follows by (5) that she cannot fly.

Example 2: Prioritized Circumscription. Let β be the conjunction of

(1) bird(tweety);
(2) ∀x(bird(x) ∧ ¬ab1(x) → canfly(x));
(3) ∀x(penguin(x) → bird(x));
(5) ∀x(penguin(x) ∧ ¬ab2(x) → ¬canfly(x)).
That is, β is like α, except for not containing the so-called cancellation of inheritance axiom (4). Using ordinary predicate circumscription, we can only infer from β ∧ penguin(tweety) that one of the following cases obtains: (i) Tweety is an abnormal1 bird that cannot fly; (ii) Tweety is an abnormal2 penguin that can fly. Nothing follows concerning Tweety's ability to fly. Intuitively, however, it seems reasonable to conjecture from β ∧ penguin(tweety) that Tweety cannot fly. The cases (i) and (ii) are not symmetrical: the information that Tweety is a penguin is more specific than the information that she is a bird. It seems reasonable to give higher priority to minimizing abnormality with respect to the more specific predicate. In the choice between minimizing abnormality1 and abnormality2, we choose the latter. Hence, we conclude that Tweety is not abnormal2. Then, it follows by (5) that she cannot fly.

Shoham (1987) and (1988) generalized the concept of circumscription, or minimal entailment, to a more abstract notion: preferential entailment. Shoham's idea was to start from any ordinary model-theoretic semantics for a formal language L and add a new primitive notion to it: a strict partial ordering ≺ of all the models of L. Intuitively, M ≺ N means that the model M is preferred over the model N. Then, M is defined to be a preferred model of α iff (i) M ⊨ α; and (ii) there is no model N such that N ⊨ α and N ≺ M. Finally, α is then said to preferentially entail β (in symbols, α |~_≺ β) just in case every preferred model of α is a model of β.

Shoham (1988) emphasizes three ways in which his own approach generalizes that of McCarthy:

(i) Preferential entailment can be defined relative to any logic having a model-theoretic semantics, not just to standard first-order logic. Starting, for instance, with a modal logic and a preference relation over its Kripke-models, one can define the corresponding nonmonotonic modal logic.
(ii) A notion of preferential entailment can be defined in terms of any partial ordering of models. That is, one is not limited to those orderings that correspond to circumscription axioms.
(iii) There is a shift of emphasis from syntax - circumscription axioms - to semantics - partial orderings of models.

In the work of Kraus, Lehmann and Magidor (1990), Shoham's approach is generalized further: A new primitive is introduced into the semantics: the notion of a state. Each state is labeled by a set of models of the underlying nonmonotonic logic and the states, not the models, are ordered by a binary relation ≺. In general, it is not assumed that ≺ satisfies any of the usual properties like irreflexivity or transitivity. A formula α holds in a state u (u is an α-state) iff α is true at every model that is labeled by the state. A state u is a preferred α-state iff (i) u is an α-state and (ii) there is no α-state v such that v ≺ u. α preferentially entails β, in symbols, α |~ β, if all preferred α-states are β-states.

The main objective of Kraus, Lehmann and Magidor (1990) is to study nonmonotonic inference relations |~ both in terms of abstract proof-theoretic properties and semantically in terms of preferential models. Several important classes of inference relations are characterized semantically by means of representation theorems.

The study of abstract nonmonotonic inference relations was initiated by Gabbay (1985), who took |~ to be a relation between a finite set Γ of premises and a single conclusion α. Gabbay (1985) defined a nonmonotonic logic as a relation of the described sort satisfying the following conditions:

if α ∈ Γ, then Γ |~ α; (Reflexivity)
if Γ |~ α and Γ, α |~ β, then Γ |~ β; (Finitary Cut)
if Γ |~ α and Γ |~ β, then Γ, α |~ β. (Finitary Cautious Monotony)

He argued that these requirements should be satisfied by any reasonable inference relation.
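Gabbay's three conditions can be checked mechanically on a small preferential model. The following is only a minimal sketch under stated assumptions: the three states, the ranking `RANK`, and the identification of a proposition with the set of states in which it holds are all illustrative choices, not part of the paper. The check covers premise sets of at most two propositions.

```python
from itertools import combinations

# Hypothetical toy universe: three states ordered by an illustrative rank.
STATES = ("u0", "u1", "u2")
RANK = {"u0": 0, "u1": 1, "u2": 1}  # lower rank = more preferred


def prefers(x, y):
    """x is strictly preferred to y (a transitive, irreflexive relation)."""
    return RANK[x] < RANK[y]


def powerset(items):
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


# A proposition is identified with the set of states in which it holds.
PROPS = powerset(STATES)


def ext(gamma):
    """States in which every proposition of the premise set gamma holds."""
    result = frozenset(STATES)
    for p in gamma:
        result &= p
    return result


def preferred(xs):
    """States of xs to which no state of xs is strictly preferred."""
    return frozenset(x for x in xs if not any(prefers(y, x) for y in xs))


def entails(gamma, alpha):
    """gamma |~ alpha: alpha holds in all preferred gamma-states."""
    return preferred(ext(gamma)) <= alpha


def gabbay_conditions_hold():
    """Check Reflexivity, Finitary Cut and Finitary Cautious Monotony."""
    premise_sets = [frozenset(c) for r in range(3)
                    for c in combinations(PROPS, r)]
    for gamma in premise_sets:
        for a in PROPS:
            if a in gamma and not entails(gamma, a):
                return False  # Reflexivity fails
            if not entails(gamma, a):
                continue
            with_a = gamma | {a}
            for b in PROPS:
                if entails(with_a, b) and not entails(gamma, b):
                    return False  # Finitary Cut fails
                if entails(gamma, b) and not entails(with_a, b):
                    return False  # Finitary Cautious Monotony fails
    return True
```

In this toy model the three conditions hold, while Monotonicity does not: the empty premise set yields the proposition `{u0}`, but adding the premise `{u1, u2}` removes that conclusion.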
As we have seen, Shoham (1987, 1988) and Kraus, Lehmann and Magidor (1990) define |~ as a relation taking only single propositions as premises. In the presence of conjunction in the object language, this is essentially equivalent to allowing finite sets of propositions as premises. A more general treatment is proposed in Makinson (1988), where |~ is allowed to take infinite sets of premises. This generalization makes it possible for Makinson to redefine nonmonotonic consequence as a Tarski-style operation C on arbitrary sets of sentences. Generalizing Gabbay's conditions to the infinitary case and expressing them in terms of C rather than |~, Makinson (1988) obtains the following conditions:

Γ ⊆ C(Γ); (Inclusion)
Γ ⊆ ∆ ⊆ C(Γ) implies C(∆) ⊆ C(Γ); (Infinitary Cut)
Γ ⊆ ∆ ⊆ C(Γ) implies C(Γ) ⊆ C(∆). (Cautious Monotony)

An operation on sets of sentences satisfying these conditions is called by Makinson a cumulative inference operation. Makinson (1993) is a comprehensive survey - from an abstract logical point of view - of systems of nonmonotonic logic: it focuses on properties of the inference relations (or operations) that are associated with the various systems.

In the present paper, we follow Makinson - and differ from Kraus, Lehmann and Magidor - in viewing nonmonotonic consequence as an operation C on arbitrary sets of sentences (or equivalently, as a relation Γ |~ α, where Γ is allowed to be infinite). In addition, we modify the preferential semantics of Shoham and Kraus-Lehmann-Magidor by defining C, in the way previously described, in terms of a selection function S on sets of states rather than in terms of a preference relation on states. This treatment is more general, since a given selection function may not be definable in terms of any preference relation.
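The last claim can be made concrete with a small brute-force search. The sketch below is an illustration only: the three-element universe and the selection function `S` are hypothetical, and "definable in terms of a preference relation P" is read as S(X) being the set of elements of X to which no element of X is P-preferred.

```python
from itertools import product

U = ("a", "b", "c")
PAIRS = [(x, y) for x in U for y in U]

# Hypothetical selection function, specified on two sets only.
S = {
    frozenset("abc"): frozenset("a"),
    frozenset("ab"): frozenset("b"),
}


def best(xs, relation):
    """Elements of xs to which no element of xs is preferred."""
    return frozenset(x for x in xs
                     if not any((y, x) in relation for y in xs))


def rationalizing_relations(selection):
    """All binary relations P on U with selection[X] == best(X, P)
    for every X on which the selection function is specified."""
    found = []
    for bits in product((False, True), repeat=len(PAIRS)):
        relation = {pair for pair, bit in zip(PAIRS, bits) if bit}
        if all(best(xs, relation) == chosen
               for xs, chosen in selection.items()):
            found.append(relation)
    return found
```

Searching all 512 binary relations on `U` finds none that reproduces `S`: choosing a from {a, b, c} requires that nothing in that set is preferred to a, whereas rejecting a from {a, b} requires that something in {a, b} is preferred to a. Changing the second entry to select a instead makes the search succeed.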
Utilizing various well-known results from preference theory on the rationalizability of a selection function by an underlying preference ordering (cf. Moulin 1985), we are able to prove a series of representation theorems for nonmonotonic inference. The general strategy in proving these results is the following: First, it is shown that any inference operation C that satisfies some set X of conditions may be defined in terms of a selection function S on sets of states satisfying a corresponding set of conditions X*. Next, it is shown that if S satisfies the conditions X*, then S is based on a preference relation P between states (read xPy as: state x is preferred over state y) satisfying some suitable conditions like asymmetry, transitivity, etc. Finally, the two steps are combined to yield a representation theorem for the inference operation C in terms of the preference relation P. The connection between C and P is given by:

α ∈ C(Γ) iff ∀x[(x ∈ ⟦Γ⟧ ∧ ∀y(y ∈ ⟦Γ⟧ → ¬yPx)) → x ∈ ⟦α⟧],

that is, α is a nonmonotonic consequence of Γ iff every P-maximal member of ⟦Γ⟧ is also a member of ⟦α⟧.

At the end of the paper, we shall also briefly consider dyadic inference operations C, where C_∆(Γ) is the set of all nonmonotonic consequences of the set of premisses Γ relative to the background assumptions ∆. Dyadic inference operations may be defined from dyadic selection functions on sets of states:⁴

α ∈ C_∆(Γ) iff S(⟦∆⟧, ⟦Γ⟧) ⊆ ⟦α⟧.

The notion of a dyadic nonmonotonic inference operation is, of course, closely related to Gärdenfors' concept of theory revision. If K∗α is the revision of a theory K with the proposition α, then we have the following natural connection:

β ∈ K∗α iff β ∈ C_K({α}),

or more briefly: K∗α = C_K({α}).
That is, the revision of K with α is identified with the theory consisting of all the nonmonotonic consequences of α relative to the background theory K.⁵ Conversely, a dyadic nonmonotonic inference relation may be viewed as a generalization of ordinary theory revision: C_∆(Γ) may be thought of as the result of revising ∆ with the set Γ.

2. Deductive Logics

This section consists essentially of a review of selected, but well-known, material about consequence relations and consequence operations, some of it going back to the work of Tarski in the 1920's and 1930's. The concepts introduced here are basic to the development of nonmonotonic logic in the rest of the paper.

We assume that a fixed object language L is given. The details of L are left open, except that we assume L to contain the standard connectives: ⊥ (falsity), → (the material conditional), ∧ (conjunction) and ∨ (disjunction). Hence, the set Φ of sentences of L is closed under the rules: (i) ⊥ ∈ Φ; (ii) if α, β ∈ Φ, then (α → β), (α ∧ β), (α ∨ β) ∈ Φ. ¬α is taken as a metalinguistic abbreviation of (α → ⊥). If Γ is a set of sentences in L and α is a sentence in L, then we write Γ ⊢₀ α just in case α is a tautological consequence of Γ (that is, if α follows from Γ in classical propositional logic). We also write Cn₀(Γ) = {α: Γ ⊢₀ α}, that is, Cn₀(Γ) is the closure of Γ under tautological consequence.

By a consequence relation we shall understand a binary relation ⊢ which takes sets of sentences (in L) as its first argument and single sentences (in L) as its second and which satisfies the following conditions:

(⊢1) if α ∈ Γ, then Γ ⊢ α; (Reflexivity)
(⊢2) if Γ ⊢ α and Γ ⊆ ∆, then ∆ ⊢ α; (Monotonicity)
(⊢3) if Γ ∪ ∆ ⊢ β and for each α ∈ ∆, Γ ⊢ α, then Γ ⊢ β. (Cut)

Here, Γ and ∆ are any sets of sentences and α, β are any sentences.

By a deductive logic L we shall understand a finitary consequence relation, i.e., a consequence relation ⊢_L that satisfies:

(⊢4) if Γ ⊢_L α, then for some finite ∆ ⊆ Γ, ∆ ⊢_L α.
(Finiteness)

We say that a deductive logic L is {∧, ∨}-normal if it satisfies the standard natural deduction rules for conjunction and disjunction, i.e.,

(∧I) Γ, α, β ⊢_L α ∧ β;
(∧E) Γ, α ∧ β ⊢_L α; and Γ, α ∧ β ⊢_L β;
(∨I) Γ, α ⊢_L α ∨ β; and Γ, β ⊢_L α ∨ β;
(∨E) if Γ, α ⊢_L γ and Γ, β ⊢_L γ, then Γ, α ∨ β ⊢_L γ.

By a classical logic we understand a deductive logic that satisfies the following two conditions:

(⊢5) if Γ ⊢₀ α, then Γ ⊢_L α; (Supraclassicality)
(⊢6) if Γ, α ⊢_L β, then Γ ⊢_L α → β. (Deduction Theorem)

That is, a classical logic is a deductive logic which extends the classical propositional calculus and satisfies the deduction theorem. Every classical logic is, of course, {∧, ∨}-normal.

A deductive logic L can equivalently be presented as a finitary consequence operation Cn_L, that is, an operation that takes sets of sentences in L into sets of sentences in L and satisfies the following conditions:

(Cn1) Γ ⊆ Cn_L(Γ); (Inclusion)
(Cn2) if Γ ⊆ ∆, then Cn_L(Γ) ⊆ Cn_L(∆); (Monotonicity)
(Cn3) Cn_L(Cn_L(Γ)) ⊆ Cn_L(Γ); (Iteration)
(Cn4) Cn_L(Γ) ⊆ ∪{Cn_L(∆): ∆ ⊆ Γ and ∆ is finite}. (Finiteness)

In the presence of (Cn1) and (Cn2), (Cn3) is equivalent to the cut rule:

(Cn3') If ∆ ⊆ Cn_L(Γ), then Cn_L(Γ ∪ ∆) ⊆ Cn_L(Γ). (Cut)

Lemma 2.1. If Cn is a consequence operation, i.e., satisfies (Cn1) - (Cn3), then it also satisfies:
(i) Γ ⊆ Cn(∆) iff Cn(Γ) ⊆ Cn(∆);
(ii) Cn(Γ ∪ ∆) = Cn(Γ ∪ Cn(∆)) = Cn(Cn(Γ) ∪ Cn(∆)).

This lemma, like several of the theorems and lemmas below, is proved in the Appendix.

Of course, L is a classical logic if, in addition to (Cn1) - (Cn4), it satisfies the following two conditions:

(Cn5) Cn₀(Γ) ⊆ Cn_L(Γ); (Supraclassicality)
(Cn6) If β ∈ Cn_L(Γ ∪ {α}), then α → β ∈ Cn_L(Γ). (Deduction Theorem)

The two presentations of a deductive logic L are related by the following conditions:

Cn_L(Γ) = {α: Γ ⊢_L α}; and Γ ⊢_L α iff α ∈ Cn_L(Γ).

If α ∈ Cn_L(Γ), we say that α is an L-consequence of Γ. We say that α is an L-theorem if α ∈ Cn_L(∅).

Lemma 2.2.
If L is a classical logic, then it satisfies the following conditions:
(⊢7) for all α, ⊥ ⊢_L α; (Falsity)
(⊢8) if Γ ⊢_L α → β and Γ ⊢_L α, then Γ ⊢_L β; (Modus Ponens)
(⊢9) if Γ, α → ⊥ ⊢_L ⊥, then Γ ⊢_L α. (Reductio Ad Absurdum)

We omit the straightforward proof of Lemma 2.2.

Let L be a deductive logic. L is (absolutely) inconsistent if ∅ ⊢_L ⊥, and consistent otherwise. A set Γ of sentences is said to be L-inconsistent if Γ ⊢_L ⊥. Γ is L-consistent if it is not L-inconsistent. A sentence α is said to be L-consistent if {α} is L-consistent. Γ is an L-theory iff Γ = Cn_L(Γ). A set Γ is L-maximal iff Γ is L-consistent and for every ∆, if Γ ⊆ ∆ and ∆ is L-consistent, then Γ = ∆.

In view of the next lemma, we may speak of L-maximal sets as L-maximal theories. Observe the use of Iteration (i.e., Cut) in the proof of the lemma.

Lemma 2.3. Let L be a deductive logic. Then every L-maximal set is an L-theory.

The proof of the following lemma uses Inclusion, Cut, Monotonicity, Finiteness and the Axiom of Choice in the form of Zorn's Lemma.

Lemma 2.4. (Lindenbaum's Lemma) Let L be a deductive logic.
(a) Every L-consistent set is included in an L-maximal theory.
(b) If α ∉ Cn_L(Γ), then there exists an L-maximal theory m such that Cn_L(Γ) ⊆ m and α ∉ m.

If L is a deductive logic, then we write M_L, T_L for the set of all L-maximal theories and the set of all L-theories, respectively. m, m', m",... are variables ranging over L-maximal theories and G, H, K, T, T',... range over L-theories. We also introduce the following notation:

for any Γ ⊆ Φ, |Γ|_L = {m ∈ M_L: Γ ⊆ m};
for α ∈ Φ, |α|_L = |{α}|_L = {m ∈ M_L: α ∈ m}.

In what follows, we shall often suppress the subscript L in contexts where the logic is assumed to be fixed.

Lemma 2.5. Let L be a deductive logic. Then, Cn_L(Γ) = ∩(|Γ|_L). That is, α is an L-consequence of Γ iff α belongs to every L-maximal extension of Γ.

Proof: (⇒) Suppose that α ∈ Cn_L(Γ) and that m is an L-maximal set such that Γ ⊆ m. It follows by Monotonicity that α ∈ Cn_L(m).
Since m = Cn_L(m) (Lemma 2.3), α ∈ m.
(⇐) Suppose that α ∉ Cn_L(Γ). By Lemma 2.4, there exists an L-maximal theory m such that Cn_L(Γ) ⊆ m and α ∉ m. ∎

Lemma 2.6. Let L be a {∧, ∨}-normal deductive logic. Then, every L-maximal set m satisfies the conditions:
(i) α ∧ β ∈ m iff α ∈ m and β ∈ m;
(ii) α ∨ β ∈ m iff α ∈ m or β ∈ m.

Lemma 2.7. Let L be a classical logic. Then,
(a) Γ is an L-theory iff
(i) Cn_L(∅) ⊆ Γ (i.e., all L-theorems are in Γ);
(ii) if α ∈ Γ and α → β ∈ Γ, then β ∈ Γ (i.e., Γ is closed under modus ponens).
(b) Γ is an L-maximal set iff it satisfies the conditions:
(i) Cn_L(∅) ⊆ Γ;
(ii) ⊥ ∉ Γ;
(iii) α → β ∉ Γ iff α ∈ Γ and β ∉ Γ.

3. Nonmonotonic Inference

We assume that a fixed consistent deductive logic L is given. We are next going to introduce the notion of an inference relation based on the underlying deductive logic L. We shall assume that all relations of nonmonotonic inference that we are going to study are inference relations in the sense defined below. In addition, we introduce the notion of an inference operation, which is just a notational variant of that of an inference relation.

Definition 3.1. (a) By an inference relation based on L we understand a relation |~ ⊆ ℘(Φ) × Φ satisfying the following conditions for all sets Γ and ∆ of sentences and sentences α, β:⁶

(|~1) if α ∈ Γ, then Γ |~ α; (Reflexivity)
(|~2) if Γ |~ α, for all α ∈ ∆, and ∆ ⊢_L β, then Γ |~ β; (Closure)
(|~3) if for all α, Γ ⊢_L α iff ∆ ⊢_L α, then Γ |~ β iff ∆ |~ β. (Congruence)

According to (|~1), an element of Γ is a nonmonotonic consequence of Γ. (|~2) says that if β is an L-consequence of a set of nonmonotonic consequences of Γ, then β is itself a nonmonotonic consequence of Γ. In other words, the set of nonmonotonic consequences of Γ is closed under L-consequence. According to (|~3), L-equivalent sets of sentences have the same nonmonotonic consequences.
(b) An inference operation based on L is an operation C: ℘(Φ) → ℘(Φ) satisfying the following conditions:

(C1) Γ ⊆ C(Γ); (Inclusion)
(C2) Cn_L(C(Γ)) ⊆ C(Γ); (Closure)
(C3) if Cn_L(Γ) = Cn_L(∆), then C(Γ) = C(∆). (Congruence)

Of course, there is a one-to-one correspondence between inference relations based on L and inference operations based on L. That is, we define the inference operation corresponding to |~ by: C(Γ) = {α: Γ |~ α}. Conversely, given C, we define |~ by: Γ |~ α iff α ∈ C(Γ).

Lemma 3.2. If C is an inference operation based on L, then for all Γ:
C(Cn_L(Γ)) = Cn_L(C(Γ)) = C(Γ).

Proof: Since Cn_L(Γ) = Cn_L(Cn_L(Γ)), we have by (C3), C(Γ) = C(Cn_L(Γ)). But C(Γ) = Cn_L(C(Γ)), by (Cn1) and (C2). ∎

Lemma 3.3. If C is an inference operation based on L, then C satisfies the condition:
Cn_L(Γ) ⊆ C(Γ).
In other words, if Γ ⊢_L α, then Γ |~ α.

Proof: Γ ⊆ C(Γ), by (C1). Hence, Cn_L(Γ) ⊆ Cn_L(C(Γ)), by (Cn2). But Cn_L(C(Γ)) = C(Γ). It follows that Cn_L(Γ) ⊆ C(Γ). ∎

Lemma 3.4. Suppose that L is a classical logic. Then, |~ is an inference relation based on L iff it satisfies the following conditions:
(|~1) if α ∈ Γ, then Γ |~ α;
(|~3) if Cn_L(Γ) = Cn_L(∆), then Γ |~ α iff ∆ |~ α;
(|~4) if Cn_L({α}) = Cn_L({β}), then Γ |~ α iff Γ |~ β;
(|~5) if Γ |~ α ∧ β, then Γ |~ α and Γ |~ β;
(|~6) if Γ |~ α and Γ |~ β, then Γ |~ α ∧ β;
(|~7) Γ |~ ⊤.

It is easy to verify that conditions (|~4) and (|~5) may be replaced in Lemma 3.4 by the single condition:⁷

(|~7) If Γ |~ α and α ⊢_L β, then Γ |~ β. (Right Weakening)

In the next definition, we introduce the notion of an L-maximal theory being Γ-optimal with respect to an inference relation |~. The L-maximal theories may be thought of as (descriptions of) those possible worlds that are allowed by the underlying logic L. We may think of Γ |~ α as expressing a (conditional) disposition on the part of an agent to expect α to be true, if she were to be given Γ as her total new information.
The set C(Γ) = {α: Γ |~ α}, then, consists of all the agent's Γ-expectations.⁸ A possible world is Γ-optimal if all the Γ-expectations of the agent are true in it. In other words, after having received the total information Γ, the agent would not be surprised at all if any of the Γ-optimal worlds turned out to be the actual one.

Definition 3.5. Let L be a deductive logic and |~ an inference relation based on L. Let Γ be any set of sentences and m any L-maximal theory. We say that m is Γ-optimal (with respect to |~) if for all α, if Γ |~ α, then α ∈ m. In other words, m is Γ-optimal iff C(Γ) ⊆ m.

Lemma 3.6. Let L be a deductive logic and |~ an inference relation based on L. Then,
Γ |~ α iff for every Γ-optimal m, α ∈ m.

Proof: (⇒) This direction follows immediately from the definition of Γ-optimality.
(⇐) Suppose that α ∉ C(Γ). By Lemma 3.2, C(Γ) = Cn_L(C(Γ)). Hence, α ∉ Cn_L(C(Γ)). Then, by Lemma 2.4, there exists an L-maximal theory m such that Cn_L(C(Γ)) ⊆ m and α ∉ m. Hence, C(Γ) ⊆ m and α ∉ m. ∎

4. Semantics: Models using Set-Valued Selection Functions

In the following we let L be a deductive logic which we assume to be {∧, ∨}-normal. The notion of a model based on L will be introduced in two steps. First, we define the notion of a structure based on L. After having defined the requisite concepts, a model will be defined as a structure of a special kind.

Definition 4.1. A structure based on L is a 4-tuple M = ⟨U, V, l, S⟩, where
(i) U is a nonempty set, the elements of which are called states (these might be thought of as representing the possible belief states of an agent). We use the lower case letters x, y, z, u as variables ranging over U. The letters X, Y, Z will be variables ranging over ℘(U).
(ii) V is a nonempty family of subsets of U.
(iii) l (the labeling function) is a function that assigns to every state u ∈ U a nonempty set l(u) of L-maximal theories.
We may think of the members of l(u) as representing those possible worlds that are compatible with the agent's beliefs in state u (the agent's doxastically possible worlds in state u).
(iv) S is a function from V to V such that for every X ∈ V, S(X) ⊆ X. Such a function we call a selection function on V.

Let M = ⟨U, V, l, S⟩ be a structure based on L. We say that a sentence α holds (or is accepted) in the state u ∈ U (relative to M) and write M ⊩_u α iff for every m ∈ l(u), α ∈ m. That is, M ⊩_u α obtains just in case l(u) ⊆ |α|_L. Intuitively, a sentence α is accepted in a state u just in case α is true in all possible worlds that are compatible with the agent's beliefs in the state u. The set of all states in which α holds will be written ⟦α⟧_M (or just ⟦α⟧). Thus,

⟦α⟧ = {u ∈ U: M ⊩_u α}.

For a set of sentences Γ, we write:

⟦Γ⟧ = ∩{⟦α⟧: α ∈ Γ},

that is, ⟦Γ⟧ is the set of all states in which all sentences in Γ are accepted. Given any set X of states in M, we may also define the set t(X) of sentences that are accepted in all the states in X, i.e.,

t(X) = {α: X ⊆ ⟦α⟧}.

Notice that the pair of mappings ⟦...⟧ and t together form a Galois connection between ℘(Φ) and ℘(U), i.e., they satisfy:⁹

(i) if Γ ⊆ ∆, then ⟦∆⟧ ⊆ ⟦Γ⟧;
(ii) if X ⊆ Y, then t(Y) ⊆ t(X);
(iii) Γ ⊆ t(⟦Γ⟧);
(iv) X ⊆ ⟦t(X)⟧.

It follows from (i) - (iv) that these mappings also satisfy:

(v) ⟦Γ⟧ = ⟦t(⟦Γ⟧)⟧;
(vi) t(X) = t(⟦t(X)⟧).

For every set X ⊆ U, we define the closure of X, Cl(X), as the set

(∗) ⟦t(X)⟧ = ∩{⟦α⟧: X ⊆ ⟦α⟧}.

A set X of states in M is said to be closed if X = Cl(X). The closure of X is the intersection of all closed subsets of U that include X.¹⁰

Lemma 4.2. Let M = ⟨U, V, l, S⟩ be a structure based on L. Then, the operator Cl: ℘(U) → ℘(U), defined by the equation (∗) above, satisfies the following conditions. For all X, Y ⊆ U,
(Cl 1) If X ⊆ Y, then Cl(X) ⊆ Cl(Y);
(Cl 2) X ⊆ Cl(X);
(Cl 3) Cl(Cl(X)) = Cl(X);
(Cl 4) Cl(∅) = ∅.
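The Galois connection and the closure conditions of Lemma 4.2 can be verified by enumeration on a small structure. The sketch below rests on several assumptions that are not in the paper: worlds are valuations of two atoms p and q, a five-sentence fragment `SENTENCES` stands in for the infinite set Φ, and the four states with their labels are invented for illustration.

```python
from itertools import combinations

# Worlds: valuations of the atoms p and q, written as sets of true atoms.
W_PQ, W_P, W_Q, W_0 = frozenset("pq"), frozenset("p"), frozenset("q"), frozenset()

# Hypothetical finite fragment standing in for the sentence set.
SENTENCES = {
    "⊥": lambda w: False,
    "p": lambda w: "p" in w,
    "q": lambda w: "q" in w,
    "p∧q": lambda w: "p" in w and "q" in w,
    "p∨q": lambda w: "p" in w or "q" in w,
}

# Labeling function: every state gets a nonempty set of worlds.
LABEL = {
    "u1": {W_PQ},
    "u2": {W_P},
    "u3": {W_P, W_Q},  # accepts p∨q but neither p nor q
    "u4": {W_0},
}
STATES = frozenset(LABEL)


def accepts(u, name):
    """A sentence is accepted in u iff it is true in every world of l(u)."""
    return all(SENTENCES[name](w) for w in LABEL[u])


def ext(gamma):
    """States in which all sentences of gamma are accepted."""
    return frozenset(u for u in STATES if all(accepts(u, a) for a in gamma))


def t(xs):
    """Sentences accepted in all states of xs."""
    return frozenset(a for a in SENTENCES if all(accepts(u, a) for u in xs))


def cl(xs):
    """Closure of a set of states: ext(t(xs))."""
    return ext(t(xs))


def subsets(items):
    items = sorted(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def galois_and_closure_hold():
    state_sets = subsets(STATES)
    for gamma in subsets(SENTENCES):
        if not gamma <= t(ext(gamma)):           # Galois condition (iii)
            return False
    for xs in state_sets:
        if not xs <= cl(xs):                     # (iv), i.e. (Cl 2)
            return False
        if cl(cl(xs)) != cl(xs):                 # (Cl 3)
            return False
        for ys in state_sets:
            if xs <= ys and not t(ys) <= t(xs):  # Galois condition (ii)
                return False
            if xs <= ys and not cl(xs) <= cl(ys):  # (Cl 1)
                return False
    return cl(frozenset()) == frozenset()        # (Cl 4)
```

In this toy structure the singleton `{"u3"}` is not closed: the only sentence of the fragment accepted in u3 is p∨q, so its closure also contains u1 and u2.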
We are now ready to define the notion of a model based on L.

Definition 4.3. Let M = ⟨U, V, l, S⟩ be a structure based on L. We say that M is a model (based on L) if the family V satisfies the following conditions:
(i) for every X ⊆ U, Cl(X) ∈ V;
(ii) if X, Y ∈ V, then X ∪ Y ∈ V;
(iii) for any nonempty family F of members of V, ∩{X: X ∈ F} ∈ V.

That is, a model is a structure in which the domain V of the selection function contains all closed subsets of U and is closed under finite unions and arbitrary intersections.

For any model M = ⟨U, V, l, S⟩ based on L, we define two corresponding relations ⊨_M and |~_M between sets of sentences and single sentences:

Γ ⊨_M α iff ⟦Γ⟧ ⊆ ⟦α⟧; and
Γ |~_M α iff S(⟦Γ⟧) ⊆ ⟦α⟧.

That is, Γ ⊨_M α obtains just in case α is accepted in all the Γ-states (i.e., in all the states in which all sentences in Γ are accepted). And Γ |~_M α obtains just in case α is accepted in all the most preferred Γ-states.

Lemma 4.4. If M = ⟨U, V, l, S⟩ is a model based on L, then ⊨_M is a consequence relation which extends L, i.e., such that ⊢_L ⊆ ⊨_M, and |~_M is an inference relation based on L.

Proof: The easy verification that ⊨_M is a consequence relation extending L is omitted. We prove that |~_M is an inference relation based on L.
Reflexivity: Suppose α ∈ Γ. Then clearly, ⟦Γ⟧ ⊆ ⟦α⟧. However, S(⟦Γ⟧) ⊆ ⟦Γ⟧. Hence, S(⟦Γ⟧) ⊆ ⟦α⟧. That is, Γ |~_M α.
Closure: Suppose Γ |~_M α, for all α ∈ ∆, and ∆ ⊢_L β. Then, S(⟦Γ⟧) ⊆ ∩{⟦α⟧: α ∈ ∆}, i.e., S(⟦Γ⟧) ⊆ ⟦∆⟧. Furthermore, since ∆ ⊢_L β, ⟦∆⟧ ⊆ ⟦β⟧. It follows that S(⟦Γ⟧) ⊆ ⟦β⟧, i.e., Γ |~_M β.
Congruence: Suppose that for all α, Γ ⊢_L α iff ∆ ⊢_L α. Then, |Γ|_L = |∆|_L. It follows that also ⟦Γ⟧ = ⟦∆⟧. Hence, S(⟦Γ⟧) = S(⟦∆⟧). Finally, we have that Γ |~_M β iff ∆ |~_M β. ∎

We speak of ⊨_M and |~_M as the consequence relation and the inference relation, respectively, determined by M.

We say that a model M = ⟨U, V, l, S⟩ is a world model if l(u) is a unit set for each u ∈ U.
In a world model, the set of sentences accepted in a state is always L-maximal. Since L is assumed to be {∧, ∨}-normal, we have for any model M and all sentences α and β:
(i) ⟦α ∧ β⟧_M = ⟦α⟧_M ∩ ⟦β⟧_M;
(ii) ⟦α⟧_M ∪ ⟦β⟧_M ⊆ ⟦α ∨ β⟧_M;
(iii) if M is a world model, ⟦α ∨ β⟧_M ⊆ ⟦α⟧_M ∪ ⟦β⟧_M;
(iv) if L is classical, ⟦α⟧_M ∩ ⟦¬α⟧_M = ∅;
(v) if L is classical and M is a world model, ⟦α⟧_M ∪ ⟦¬α⟧_M = U.

A model M = ⟨U, V, l, S⟩ is said to be full if V = ℘(U).

Theorem 4.5. Let L be a {∧, ∨}-normal deductive logic and let |~ be an inference relation based on L. Then, there exists a world model M = ⟨U, V, l, S⟩ (based on L) such that: ⊢_L = ⊨_M and |~ = |~_M, i.e., ⊢_L and |~ are, respectively, the consequence relation and inference relation determined by M.

Proof: We define a structure M = ⟨U, V, l, S⟩, which we shall call the canonical model for ⊢_L and |~, as follows:
(i) U = M_L, i.e., U is the set of all L-maximal theories;
(ii) V = {|Γ|_L: Γ is a set of sentences in L}. That is, V consists of all closed subsets of U.
(iii) for each u ∈ U, l(u) = {u};
(iv) We define S as follows: For any set X ∈ V, consider the theory t(X) determined by X, namely:

t(X) = {α: ∀m ∈ X, α ∈ m} = ∩X.

Now, define:

S(X) = |C(t(X))|_L = {m ∈ U: for all α, if t(X) |~ α, then α ∈ m}.

That is, S(X) is the set of t(X)-optimal L-maximal theories.

Let X ∈ V. Then, X = |t(X)|_L. Since t(X) ⊆ C(t(X)), we get that |C(t(X))|_L ⊆ |t(X)|_L. That is, S(X) ⊆ X. Thus, the canonical model for ⊢_L and |~ is a structure based on L. In this structure, we have ⟦Γ⟧ = |Γ|_L, for all Γ. Hence, ⊢_L = ⊨_M.

We now claim that:

(∗) Γ |~ α iff S(|Γ|) ⊆ |α|.

Proof of (∗): S(|Γ|) is the set of all ∩(|Γ|)-optimal L-maximal sets. But ∩(|Γ|) = Cn_L(Γ), so S(|Γ|) is the set of all L-maximal sets that are Cn_L(Γ)-optimal. However, m is Cn_L(Γ)-optimal iff m is Γ-optimal (since C(Cn_L(Γ)) = C(Γ)). Hence, S(|Γ|) is the set of all L-maximal theories that are Γ-optimal. It follows by Lemma 3.6 that Γ |~ α iff for all m ∈ S(|Γ|), α ∈ m. Q.E.D.
By definition, we have:
(∗∗) Γ |~M α iff S(⟦Γ⟧) ⊆ |α|.
From (∗) and (∗∗) and the fact that ⟦Γ⟧ = |Γ|L, we get that Γ |~ α iff Γ |~M α.

It only remains to show that the canonical model M = <U, V, l, S> for ⊢L and |~ is indeed a model, i.e., satisfies conditions (i) - (iii) of Definition 4.3.

Condition (i) is immediate from the definition of V.

Condition (ii): We first prove that the closure operation of the canonical model satisfies:
(∗) Cl(X ∪ Y) = Cl(X) ∪ Cl(Y).
First of all, X ⊆ X ∪ Y. Hence, t(X ∪ Y) ⊆ t(X), and, therefore, ⟦t(X)⟧ ⊆ ⟦t(X ∪ Y)⟧. That is, Cl(X) ⊆ Cl(X ∪ Y). In the same way, we get Cl(Y) ⊆ Cl(X ∪ Y). Thus, Cl(X) ∪ Cl(Y) ⊆ Cl(X ∪ Y). In order to prove the other direction, assume that m ∈ Cl(X ∪ Y), i.e., m ∈ ⟦t(X ∪ Y)⟧. Then, we have: t(X ∪ Y) ⊆ m. But t(X ∪ Y) = t(X) ∩ t(Y). Hence, t(X) ∩ t(Y) ⊆ m. Suppose now that m ∉ Cl(X) ∪ Cl(Y). Then, m ∉ Cl(X) and m ∉ Cl(Y). Hence, there must exist sentences α, β such that α ∈ t(X), α ∉ m, β ∈ t(Y) and β ∉ m. Consider now the sentence α ∨ β. Since α ∉ m and β ∉ m, it follows by the {∧, ∨}-normality of L that α ∨ β ∉ m. But, on the other hand, α ∨ β ∈ t(X) ∩ t(Y) ⊆ m. This is a contradiction, so we conclude that m ∈ Cl(X) ∪ Cl(Y). Thus, Cl(X ∪ Y) ⊆ Cl(X) ∪ Cl(Y).
Suppose now that X, Y ∈ V. By the definition of V, X = Cl(X) and Y = Cl(Y). Hence, X ∪ Y = Cl(X) ∪ Cl(Y) = (by (∗)) Cl(X ∪ Y). But, by the definition of V, Cl(X ∪ Y) ∈ V. Thus, X ∪ Y ∈ V.

Condition (iii): Let F be a non-empty family of elements in V. By (Cl 2) of Lemma 4.2, ∩F ⊆ Cl(∩F). On the other hand, ∩F ⊆ X, for each X ∈ F. Hence, by (Cl 1), Cl(∩F) ⊆ Cl(X) = X, for each X ∈ F. Thus, Cl(∩F) ⊆ ∩F. We have shown that ∩F = Cl(∩F). So by the definition of V, ∩F ∈ V. ∎

Remark 4.6. Let |~ be an inference relation based on L. Let M = <U, V, l, S> be the corresponding canonical model. Then, we have for all Γ ⊆ Φ and X ∈ V:
(i) CnL(Γ) = t(⟦Γ⟧);
(ii) X ≠ ∅ iff t(X) is L-consistent;
(iii) C(Γ) = t(S(⟦Γ⟧)); and
(iv) S(X) = ⟦C(t(X))⟧.
It follows that for all sets of sentences Γ and all X ∈ V,
(v) C(t(X)) = t(S(X)); and
(vi) S(⟦Γ⟧) = ⟦C(Γ)⟧,
that is, the following two diagrams commute:

[diagrams not reproduced]

By a canonical model for L we understand a model M = <U, V, l, S> such that:
(i) U is the set of all maximal L-theories;
(ii) V is the set of all closed subsets of U, i.e., V = {⟦t(X)⟧ : X ⊆ U} = {Cl(X) : X ⊆ U};
(iii) for each u ∈ U, l(u) = {u}.
It is easy to see that a canonical model for L is the canonical model for L and the inference operation CM defined by: CM(Γ) = t(S(⟦Γ⟧)). That is, we also have: S(X) = ⟦CM(t(X))⟧.

The next lemma states that the set V of a canonical model has certain important closure properties: V is closed under finite unions and arbitrary intersections and contains all singleton sets. It follows that V contains all finite subsets of U.

Lemma 4.7. Let M = <U, V, l, S> be a canonical model for L. Then, for all X, Y ∈ V, and m ∈ U,
(i) Cl(X ∪ Y) = Cl(X) ∪ Cl(Y);11
(ii) Cl({m}) = {m}, i.e., all singleton sets are closed;
(iii) all finite subsets of U are members of V.

Proof: We have already proved (i) in the course of proving Theorem 4.5. Observe that in the proof of (i), we used the fact that L is closed under the standard natural deduction rules for ∨ (see the Appendix). (ii) Cl({m}) = ⟦t({m})⟧ = ⟦m⟧, since t({m}) = m. But ⟦m⟧ = {n ∈ U : m ⊆ n} = {m}, since U is the set of all L-maximal theories. Hence, Cl({m}) = {m}. (iii) follows immediately from (i) and (ii). ∎

We shall now consider some natural conditions that we might want to impose on the selection function in a model.
Most of these are taken from the literature on choice functions and preference relations.12 Some, however, borrow their names from the corresponding conditions on inference operations: for any X, Y ∈ V,
(cp) if X ≠ ∅, then S(X) ≠ ∅; (Consistency Preservation)
(it) S(S(X)) = S(X); (Iteration)
(c) if S(X) ⊆ Y ⊆ X, then S(X) ⊆ S(Y); (Cut)
(d) S(X ∪ Y) ⊆ S(X) ∪ S(Y); (Distributivity)
(ch) S(X) ∩ Y ⊆ S(X ∩ Y); (Chernoff)
(aiz) if S(X) ⊆ Y ⊆ X, then S(Y) ⊆ S(X); (Aizerman)
(pi) S(S(X) ∪ S(Y)) = S(X ∪ Y); (Path Independence)
(g) if ∅ ≠ F ⊆ V and ∪{X : X ∈ F} ∈ V, then ∩{S(X) : X ∈ F} ⊆ S(∪{X : X ∈ F}); (Gamma)
(s) if S(X) ∩ S(Y) ≠ ∅, then S(X ∩ Y) ⊆ S(X) ∩ S(Y); (Sen)
(iia) if S(X) ∩ Y ≠ ∅, then S(X ∩ Y) = S(X) ∩ Y. (Arrow)

The condition (ch) - originally introduced by Chernoff (1954) - is identical to Sen's (1971) Property α.13 The following formulation is easily seen to be equivalent to the one above:
(α) if Y ⊆ X, then S(X) ∩ Y ⊆ S(Y).
Intuitively, if x is a best choice in the set X, then x is still a best choice in any subset of X to which x belongs.

The conditions Aizerman and Cut together say that: if S(X) ⊆ Y ⊆ X, then S(X) = S(Y). That is, deleting from a set only members that are not among its best members does not affect which members are best in the set.

The condition (g) is called Property γ by Sen (1971). It says that if x is a best choice in every set X in a family of sets, then x is also a best choice in their union. It has the following finitary consequence: S(X) ∩ S(Y) ⊆ S(X ∪ Y).

The condition Sen may also be formulated as: if X ⊆ Y and S(X) ∩ S(Y) ≠ ∅, then S(X) ⊆ S(Y). It is called Property β by Sen (1971).

Arrow's condition (iia) of Independence of Irrelevant Alternatives can be rewritten as: if Y ⊆ X and S(X) ∩ Y ≠ ∅, then S(X) ∩ Y = S(Y). That is, if Y is a subset of X that contains some of the best members of X, then the best members of Y are precisely the best members of X that belong to Y.
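On a finite full structure, several of the conditions above can be checked by brute force. The sketch below is illustrative only: the universe {1, 2, 3}, the ranking `RANK` and the two selection functions `s_rank` and `s_odd` are hypothetical examples, not taken from the text. `s_rank` picks the elements of lowest rank, while `s_odd` is an artificial function that satisfies Chernoff but violates Aizerman.

```python
from itertools import combinations

def subsets(universe):
    """All subsets of a finite universe as frozensets (a full model: V = P(U))."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def chernoff(S, V):  # (ch) S(X) ∩ Y ⊆ S(X ∩ Y)
    return all(S(X) & Y <= S(X & Y) for X in V for Y in V)

def cut(S, V):       # (c) if S(X) ⊆ Y ⊆ X, then S(X) ⊆ S(Y)
    return all(S(X) <= S(Y) for X in V for Y in V if S(X) <= Y <= X)

def aizerman(S, V):  # (aiz) if S(X) ⊆ Y ⊆ X, then S(Y) ⊆ S(X)
    return all(S(Y) <= S(X) for X in V for Y in V if S(X) <= Y <= X)

def sen(S, V):       # (s) if S(X) ∩ S(Y) ≠ ∅, then S(X ∩ Y) ⊆ S(X) ∩ S(Y)
    return all(S(X & Y) <= S(X) & S(Y) for X in V for Y in V if S(X) & S(Y))

def arrow(S, V):     # (iia) if S(X) ∩ Y ≠ ∅, then S(X ∩ Y) = S(X) ∩ Y
    return all(S(X & Y) == S(X) & Y for X in V for Y in V if S(X) & Y)

RANK = {1: 0, 2: 0, 3: 1}  # hypothetical ranking; lower rank is better

def s_rank(X):
    """Select the elements of X with the lowest rank."""
    if not X:
        return frozenset()
    best = min(RANK[x] for x in X)
    return frozenset(x for x in X if RANK[x] == best)

def s_odd(X):
    """Artificial example: keep small sets whole, else pick the least element."""
    return frozenset(X) if len(X) <= 2 else frozenset({min(X)})

V = subsets({1, 2, 3})
for name, check in [("Chernoff", chernoff), ("Cut", cut), ("Aizerman", aizerman),
                    ("Sen", sen), ("Arrow", arrow)]:
    print(name, check(s_rank, V), check(s_odd, V))
```

For `s_odd`, taking X = {1, 2, 3} and Y = {1, 2} gives S(X) = {1} ⊆ Y ⊆ X but S(Y) = {1, 2}, so deleting a non-best element changes the set of best elements, which is exactly what Aizerman rules out.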
The results of the next lemma are either well-known (see Moulin (1985)) or obvious. Their proofs are included in the Appendix for the sake of completeness and easy reference.

Lemma 4.8.
(i) (c) implies (it);
(ii) (ch) implies (c);
(iii) (ch) implies (d);
(iv) (c), (aiz) and (d) together imply (pi). If the model is full, i.e., if V = ℘(U), then (pi) is equivalent to (ch) and (aiz);14
(v) (cp) and (iia) together imply (ch), (aiz) and (g);
(vi) (iia) is equivalent to (ch) and (s).

Next we turn to conditions on the inference operation C:
(CP) if ⊥ ∉ Cn(Γ), then ⊥ ∉ C(Γ); (Consistency Preservation)
(It) C(C(Γ)) = C(Γ); (Iteration)
(C) if Γ ⊆ ∆ ⊆ C(Γ), then C(∆) ⊆ C(Γ); (Cut)
(D) C(Γ) ∩ C(∆) ⊆ C(Cn(Γ) ∩ Cn(∆)); (Distributivity)
(Ch) C(Γ ∪ ∆) ⊆ Cn(C(Γ) ∪ ∆); (Chernoff)
(Aiz) if Γ ⊆ ∆ ⊆ C(Γ), then C(Γ) ⊆ C(∆); (Aizerman)
(PI) C(C(Γ) ∩ C(∆)) = C(Cn(Γ) ∩ Cn(∆)); (Path Independence)
(G) C(∩{Cn(Γ) : Γ ∈ F}) ⊆ Cn(∪{C(Γ) : Γ ∈ F}), where F is any non-empty family of sets of sentences; (Gamma)
(S) if C(Γ) ∪ C(∆) is L-consistent, then C(Γ) ∪ C(∆) ⊆ C(Γ ∪ ∆); (Sen)
(IIA) if C(Γ) ∪ ∆ is L-consistent, then C(Γ ∪ ∆) = Cn(C(Γ) ∪ ∆). (Arrow)

Notice that Aizerman is the condition that Makinson (1989) calls Cautious Monotony. It is also worth mentioning that Arrow implies the following generalization of the so-called condition of Rational Monotony:15
If C(Γ) ∪ ∆ is L-consistent, then C(Γ) ⊆ C(Γ ∪ ∆).
In the presence of (CP), this principle implies Sen, but is not implied by Sen. Rational Monotony is the special case of the principle for which ∆ is a unit set.

Theorem 4.9. Let L be a {∧, ∨}-normal deductive logic and M = <U, V, l, S> a canonical model based on L. Let C = CM be the inference operation that is determined by M. If (x) is any of the conditions (cp) - (iia), then S satisfies (x) iff the inference operation C satisfies the corresponding condition (X) among (CP) - (IIA).

In view of Lemma 4.8 and Theorem 4.9, we have the following result connecting the various conditions on C.

Lemma 4.10.
(i) (C) implies (It);
(ii) (Ch) implies (C);
(iii) (Ch) implies (D);
(iv) (C), (Aiz) and (D) together imply (PI);
(v) (CP) and (IIA) together imply (Ch), (Aiz) and (G);
(vi) (IIA) is equivalent to (Ch) and (S).

We conclude this section by discussing some consequences of the conditions above in the context of L being classical. First of all, L being classical implies that finite sets of premises may be treated as conjunctions, i.e., α ∧ β |~ γ iff {α, β} |~ γ. Hence, the infinitary conditions (C) and (Aiz) above have - in the classical case - the following finitary consequences:
if α |~ β and α ∧ β |~ γ, then α |~ γ; (Cut)
if α |~ β and α |~ γ, then α ∧ β |~ γ. (Cautious Monotony)

The next lemma is due to Makinson (1993).

Lemma 4.11. (Makinson) Suppose that C is an inference operation based on a classical logic L. If C satisfies Distributivity, then the following conditions are also satisfied:
(i) if Γ, α |~ γ and Γ, β |~ γ, then Γ, α ∨ β |~ γ; (Disjunction in the Antecedent)
(ii) if Γ, α |~ γ and Γ, ¬α |~ γ, then Γ |~ γ; (Proof by Cases)
(iii) if Γ, α |~ β, then Γ |~ α → β; (Conditionalization)
(iv) if Γ |~ α and Γ, β |~ ¬α, then Γ |~ ¬β.

Since Chernoff implies Distributivity, the assumption of Chernoff yields, in the context of classical logic, Conditions (i) - (iv). Condition (iv) would license inferences of the kind:
(1) If Squeaky is a mammal, then it is expected that Squeaky cannot fly.
(2) If Squeaky is a mammal and a bat, then it is expected that Squeaky can fly.
(3) Hence: if Squeaky is a mammal, then it is expected that Squeaky is not a bat.

If L is classical and C satisfies Arrow, then we also have:
If Γ |~ γ and Γ |≁ ¬β, then Γ, β |~ γ. (Rational Monotony)
This principle yields inferences of the kind:
(1) If Squeaky is a mammal, then it is expected that Squeaky cannot fly.
(2) If Squeaky is a mammal, then it is not expected that Squeaky is not a dog.
(3) Hence: if Squeaky is a mammal and a dog, then it is expected that Squeaky cannot fly.

5.
Representation Theorems

In the last section, we proved a series of results connecting properties of the inference operation C with properties of the selection function S in the canonical model corresponding to C. In this section we wish to explore under what conditions a given selection function can be defined in terms of an underlying preference relation P on the set U of all states. In order to make this question precise, we introduce the notion of a choice structure:16

Definition 5.1. A choice structure is an ordered triple S = <U, V, S>, where U is a non-empty set, V is a non-empty family of subsets of U, and S is a function from V to V, such that:
(i) for each x ∈ U, {x} ∈ V;
(ii) ∅ ∈ V;
(iii) if X, Y ∈ V, then X ∪ Y ∈ V;
(iv) for any non-empty family F of members of V, ∩{X : X ∈ F} ∈ V;
(v) for each X ∈ V, S(X) ⊆ X.

U is the domain of S and the elements of U are here called states (or points). Axioms (ii) - (iv) say that the elements of V form the closed sets of a topological space over U. Hence, it is appropriate to refer to the elements of V as the closed sets of S. For any X ⊆ U, we write Cl(X) for the closure of X, i.e., the intersection of all closed sets that include X. Cl, of course, satisfies the axioms (Cl 1) - (Cl 4) of a topological closure operation. A topological space satisfying condition (i), that all singleton sets are closed, is called a T1-space. It follows from (i) and (iii) that all finite sets are members of V.

S is the selection function (or the choice function) of the structure S. According to (v), S selects a subset of elements from any closed subset X of U. Since S is an operation on V, S(X) is always a closed set.

The principal case we are interested in is the following: A {∧, ∨}-normal deductive logic L and an inference operation C based on L are given. S = <U, V, S> is defined in terms of L and C as follows:
(i) U is the set of all L-maximal theories;
(ii) V = {X ⊆ U : X = ⟦t(X)⟧};
(iii) for each X ∈ V, S(X) = ⟦C(t(X))⟧.
In this case, S = <U, V, S> is essentially identical to the canonical model for L and C.

In this section, we shall think of the set U of states as being provided with a preference relation P ⊆ U × U (we read xPy as: x is better than y). In terms of such a relation, we can define the selection function S: V → V as follows: for all X ∈ V,
(∗) S(X) = {x : x ∈ X & (∀y)(y ∈ X → ¬(yPx))}.
That is, S(X) is the set of all P-maximal elements of X. We say that S is based on the relation P - and that P rationalizes S - if S is defined from P by means of the equation (∗). S is said to be rationalizable if there is a relation that rationalizes it.

We use the following terminology for preference relations. We use xRy as an abbreviation for ¬(yPx). P is said to be:
(i) a strict partial ordering iff P is asymmetric and transitive;
(ii) a strict weak ordering iff P is asymmetric and ∀xyz(xRy ∧ yPz → xPz);
(iii) a strict linear order iff P is a strict partial ordering and ∀xy(xRy ∧ x ≠ y → xPy);
(iv) neat iff every non-empty element X of V contains a P-maximal element, i.e., an x such that ∀y(y ∈ X → ¬(yPx)).17

Lemma 5.2. Let S: V → V be a selection function and P a relation that rationalizes S. Then,
(a) P is irreflexive iff S satisfies the condition:
(ir) S({x}) ≠ ∅, for each x ∈ U. (Irreflexivity)
(b) P is neat iff S satisfies Consistency Preservation.
(c) If P is irreflexive, then P is unique and for all x, y ∈ U, xPy iff y ∉ S({x, y}).

Proof: (a) and (b) are trivial. We prove (c). By (∗), we have: y ∈ S({x, y}) iff ¬yPy and ¬xPy. The irreflexivity of P then yields: xPy iff y ∉ S({x, y}). ∎

Lemma 5.3. A selection function S is rationalizable iff it satisfies the condition: for all X ∈ V,
S(X) = {x ∈ X : (∀y)(y ∈ X → x ∈ S({x, y}))}.

Proof: Suppose that P rationalizes S. Then, we have for all Z ∈ V and all z ∈ Z,
(1) z ∈ S(Z) ↔ (∀y)(y ∈ Z → ¬(yPz)).
Let X ∈ V and x ∈ X. We want to show that:
(2) x ∈ S(X) ↔ (∀y)(y ∈ X → x ∈ S({x, y})).
Suppose that x ∈ S(X) and that y ∈ X.
By (1), we then have: ¬(yPx) and ¬(xPx). Hence, by (1) again, we get: x ∈ S({x, y}). To prove the other direction of (2), assume that (∀y)(y ∈ X → x ∈ S({x, y})). Suppose also that x ∉ S(X). Then, by (1), there is a y ∈ X such that yPx. Applying (1) again, we get x ∉ S({x, y}), i.e., a contradiction.

To prove the other direction of the lemma, suppose that for all X ∈ V,
(3) S(X) = {x ∈ X : (∀y)(y ∈ X → x ∈ S({x, y}))}.
Define P ⊆ U × U by the condition:
(4) xPy iff y ∉ S({x, y}).
(3) and (4) then yield:
S(X) = {x ∈ X : (∀y)(y ∈ X → ¬(yPx))},
that is, P rationalizes S. ∎

In the theory of preference and choice, there are many well-known theorems relating conditions on the selection function S, like Chernoff, Aizerman, Gamma, etc., to the existence of an underlying preference relation P.18 The following theorem is a slight strengthening of a result by Sen (1971).19

Theorem 5.4. A selection function S is rationalizable iff it satisfies Chernoff and Gamma (i.e., Sen's Properties α and γ):
(ch) S(X) ∩ Y ⊆ S(X ∩ Y);
(g) if ∅ ≠ F ⊆ V and ∪{X : X ∈ F} ∈ V, then ∩{S(X) : X ∈ F} ⊆ S(∪{X : X ∈ F}).

Proof: We omit the straightforward verification that every rationalizable selection function satisfies Chernoff and Gamma. For the other direction, suppose that S satisfies Chernoff and Gamma. By Lemma 5.3, it is sufficient to prove that for all X ∈ V and x ∈ X,
x ∈ S(X) iff (∀y)(y ∈ X → x ∈ S({x, y})).
Assume first that x ∈ S(X). Consider any y ∈ X. Chernoff then yields that S(X) ∩ {x, y} ⊆ S({x, y}), which implies that x ∈ S({x, y}). Next, we assume that x is such that (∀y)(y ∈ X → x ∈ S({x, y})). This means that x ∈ ∩{S({x, y}) : y ∈ X}. But X = ∪{{x, y} : y ∈ X}. Gamma then yields ∩{S({x, y}) : y ∈ X} ⊆ S(∪{{x, y} : y ∈ X}) = S(X). Hence, x ∈ S(X). ∎

Lemma 5.5.20 Suppose that S is based on P.
(a) If S is neat and satisfies Aizerman:
(aiz) if S(X) ⊆ Y ⊆ X, then S(Y) ⊆ S(X),
then P is transitive.
(b) If P is transitive and P⁻¹ is well-founded (i.e., there are no infinitely ascending P-chains in U), then S satisfies Aizerman.

Lemma 5.6. Suppose that S is based on a neat and transitive relation P. Then, S satisfies Sen:
(s) if S(X) ∩ S(Y) ≠ ∅, then S(X ∩ Y) ⊆ S(X) ∩ S(Y),
iff P satisfies the condition ∀xyz(xRy ∧ yPz → xPz) (i.e., iff P is a neat strict weak ordering).

Proof: Cf. Kanger (to appear), Theorem 8.1.

Theorem 5.7.
(a) S is based on a neat relation P iff S satisfies (cp), Chernoff and Gamma.
(b) If S satisfies (cp), Chernoff, Gamma and Aizerman, then S is based on a neat and transitive relation (i.e., a neat strict partial ordering).
(c) If any of the following equivalent conditions is satisfied:
(i) S satisfies (cp), Chernoff, Gamma, Aizerman and Sen;
(ii) S satisfies (cp), Chernoff and Sen;
(iii) S satisfies (cp) and (iia),
then S is based on a neat strict weak ordering.

Proof: By Lemma 4.8, Lemma 5.2 (b), Theorem 5.4, Lemma 5.5 and Lemma 5.6.

Theorems 4.9, 5.4 and 5.7 together give us:

Theorem 5.8. (Representation Theorem I) Let C be an inference operation based on the deductive logic L. Let M = <U, V, l, S> be the canonical model for L and C. Then, L = LM and C = CM and:
(i) C satisfies Chernoff and Gamma iff S is based on a relation P ⊆ U × U.
(ii) If C satisfies (CP), Chernoff, Gamma and Aizerman, then S is based on a neat strict partial ordering P ⊆ U × U.
(iii) If C satisfies (CP) and Arrow, then S is based on a neat strict weak ordering P ⊆ U × U.

6.
Dyadic Inference Operations and Infinitary Belief Revision

In this section we shall explore the connection between nonmonotonic inference and belief revision in the sense of Alchourrón, Gärdenfors, and Makinson.21 In doing so, we generalize the notion of belief revision to allow for the revision of a set of beliefs with a possibly infinite set of propositions representing the new information.22 A representation theorem is proved for the generalized notion of belief revision in terms of systems of spheres of the kind introduced by Grove (1988).

In Makinson and Gärdenfors (1990) a method is described for translating postulates for belief revision into postulates for nonmonotonic inference, and vice versa.23 The basic idea here is to interpret β ∈ K∗α as a claim that β is a nonmonotonic consequence of α, relative to the background (or default) theory K. That is, β ∈ K∗α is translated as α |~K β, where |~K is a nonmonotonic inference relation associated with the theory K. Expressing this equivalence in terms of an inference operation CK instead, we get, for a fixed K, the identity:
K∗α = CK({α}).
This idea can be generalized: Thinking of C as a binary operation and allowing K to be replaced by an arbitrary set of sentences ∆, we get:
∆∗α = C(∆, {α}).
In other words, for any ∆ and α, the revision of ∆ with α is identified with the set of nonmonotonic consequences of α, relative to the default assumptions ∆. Now, in order to get complete interdefinability between the notions of belief revision and nonmonotonic inference, we need just one more step: we must allow for the possibility of a set ∆ being revised with a possibly infinite set of propositions Γ. Then, for all Γ and ∆, we obtain:
∆∗Γ = C(∆, Γ).
Conversely, given a notion of belief revision ∆∗Γ, we may, of course, define the corresponding notion of nonmonotonic inference via the same equality.
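The identification of the revision of ∆ by Γ with C(∆, Γ) can be made concrete on a finite set of worlds. The following sketch rests on assumptions that are not part of the text: sets of sentences are represented by the worlds in which they hold, and the choice of most preferred Γ-worlds uses Hamming distance to the ∆-worlds (a Dalal-style distance measure, swapped in purely for illustration). The names `models`, `distance`, `revise` and `accepts`, as well as the bird example, are hypothetical.

```python
from itertools import product

ATOMS = ("bird", "penguin", "flies")
WORLDS = [dict(zip(ATOMS, bits))
          for bits in product([False, True], repeat=len(ATOMS))]

def models(sentences):
    """Worlds in which every sentence of the given collection holds."""
    return [w for w in WORLDS if all(s(w) for s in sentences)]

def distance(w, v):
    """Hamming distance between two worlds (illustrative assumption)."""
    return sum(w[a] != v[a] for a in ATOMS)

def revise(delta, gamma):
    """Worlds of the revision of delta by gamma, read as C(delta, gamma)."""
    X, Y = models(delta), models(gamma)
    if not X or not Y:
        return Y  # assumption: with inconsistent defaults, keep all gamma-worlds
    closeness = lambda y: min(distance(x, y) for x in X)
    best = min(closeness(y) for y in Y)
    return [y for y in Y if closeness(y) == best]

def accepts(delta, gamma, beta):
    """beta belongs to the revision of delta by gamma."""
    return all(beta(w) for w in revise(delta, gamma))

bird = lambda w: w["bird"]
flies = lambda w: w["flies"]
not_flies = lambda w: not w["flies"]
birds_fly = lambda w: (not w["bird"]) or w["flies"]
penguins_are_birds = lambda w: (not w["penguin"]) or w["bird"]

delta = [birds_fly, penguins_are_birds]            # default assumptions
print(accepts(delta, [bird], flies))               # consistent case: plain expansion
print(accepts(delta, [bird, not_flies], flies))    # conflicting information
print(accepts(delta, [bird, not_flies], bird))     # the new information is kept
```

When the new information is consistent with the defaults, the closest worlds are exactly the common worlds, so `revise` reduces to expansion; when it is not, the new information still holds in every selected world.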
In our formal treatment, however, we shall not make a complete identification between the notions of belief revision and nonmonotonic inference. Instead, we take the former as a special case of the latter - in the sense of being characterized by stronger axioms. The axioms for belief revision presented here are straightforward generalizations to the infinitary case of Gärdenfors' (1988) basic axioms (K∗1) - (K∗6) for finitary belief revision.24

Definition 6.1. Let L be a consistent deductive logic.
(a) A dyadic inference operation based on L is an operation C: ℘(Φ) × ℘(Φ) → ℘(Φ) satisfying the following conditions. For easy readability, we shall write C∆(Γ) instead of C(∆, Γ). We also write Γ + ∆ as an abbreviation of CnL(Γ ∪ ∆). We speak of Γ + ∆ as the expansion of Γ with ∆.
(BC1) C∆(Γ) is an L-theory; (Closure)
(BC2) Γ ⊆ C∆(Γ); (Success)
(BC3) if CnL(Γ) = CnL(∆) and CnL(Σ) = CnL(Π), then CΣ(Γ) = CΠ(∆). (Congruence)
(b) An (infinitary) belief revision operation based on L is a dyadic inference operation C satisfying - in addition to (BC1) - (BC3) - the following conditions:
(BC4) if ⊥ ∉ ∆ + Γ, then C∆(Γ) = ∆ + Γ; (Expansion)
(BC5) if ⊥ ∉ CnL(Γ), then ⊥ ∉ C∆(Γ). (Consistency Preservation)

Here, the preferred reading of C∆(Γ) is: 'the result of revising the set ∆ with the new information Γ'. Notice that (BC1) - (BC3) say no more than that, for any fixed ∆, C∆(...) is an inference operation in the sense of Definition 3.1 (b). The axioms (BC1) - (BC5) should be compared with the corresponding axioms in Gärdenfors (1988), namely:
(K∗1) K∗α is an L-theory; (Closure)
(K∗2) α ∈ K∗α; (Success)
(K∗6) if CnL({α}) = CnL({β}), then K∗α = K∗β. (Congruence)
The following two axioms correspond to Expansion:
(K∗3) K∗α ⊆ K + {α};
(K∗4) if ¬α ∉ K, then K + {α} ⊆ K∗α.
Finally, there is:
(K∗5) if α is L-consistent, then K∗α is L-consistent.
(Consistency Preservation)

To the basic axioms (BC1) - (BC5) for belief revision, we might want to add some of the following supplementary axioms:
(BC6) CΠ(Γ ∪ ∆) ⊆ CΠ(Γ) + ∆; (Chernoff)
(BC7) CΠ(∩{Cn(Γ) : Γ ∈ F}) ⊆ Cn(∪{CΠ(Γ) : Γ ∈ F}), where F is any non-empty family of sets of sentences; (Gamma)
(BC8) if Γ ⊆ ∆ ⊆ CΠ(Γ), then CΠ(Γ) ⊆ CΠ(∆); (Aizerman)
(BC9) if ⊥ ∉ CΠ(Γ) + ∆, then CΠ(Γ ∪ ∆) = CΠ(Γ) + ∆. (Arrow)

Provided that CnL({α ∧ β}) = CnL({α, β}), (BC6) yields the following supplementary axiom of Gärdenfors:
(K∗7) K∗(α ∧ β) ⊆ (K∗α) + β.
Under the same provision, (BC9) yields Revision by Conjunction:
if (K∗α) + β is L-consistent, then K∗(α ∧ β) = (K∗α) + β,
which is equivalent to (K∗7) together with the other of Gärdenfors' supplementary axioms:
(K∗8) if ¬β ∉ (K∗α) + β, then K∗(α ∧ β) = (K∗α) + β.

It is straightforward to modify the notion of a model M = <U, V, l, S> based on L, that was introduced in Section 4, in such a way as to get a semantics for dyadic inference operations. The only difference occurs in clause (iii) of Definition 4.1, which has to be changed to:
(iii') S is a function from V × V to V such that for all X, Y ∈ V, S(X, Y) ⊆ Y.
Such a function we call a dyadic selection function on V. Each model M = <U, V, l, S>, of the new kind, determines two operations:
CnM(Γ) = t(⟦Γ⟧) and CM∆(Γ) = t(S(⟦∆⟧, ⟦Γ⟧)),
where the first operation is a consequence operation that extends L and the second is a dyadic inference operation based on L (cf. Lemma 4.4).

Now, for any {∧, ∨}-normal deductive logic L and any dyadic inference operation C, we may define the corresponding canonical model M = <U, V, l, S>, where:
(i) U = ML, i.e., U is the set of all L-maximal theories;
(ii) V is the set of all closed subsets of U;
(iii) for each u ∈ U, l(u) = {u};
(iv) for any sets X, Y ∈ V, S(X, Y) = |C(t(X), t(Y))|L = {m ∈ U : C(t(X), t(Y)) ⊆ m}.
The proof of Theorem 4.5 carries over unchanged, so we have that CnL = CnM and for all Γ, ∆, C∆(Γ) = CM∆(Γ).
That is, CnL and C are, respectively, the consequence operation and the dyadic inference operation that are determined by the canonical model M.

In addition to letting the selection functions take an extra argument, we apply the same procedure to the preference relations. That is, we write xPXy and read it as: x is preferred over (or better than) y, relative to X. Intuitively, xPXy means that x is closer to the optimal alternatives in X than y. We shall refer to ternary relations P ⊆ U × ℘(U) × U as (relativized) preference relations. Properties of relations like reflexivity, transitivity, being a weak ordering, etc. carry over to relativized preference relations as follows: a given property is said to apply to P iff for each X, PX has the property in question. A dyadic selection function S: V × V → V is said to be based on a given preference relation P if for all X, Y ∈ V,
S(X, Y) = {x : x ∈ Y & (∀y)(y ∈ Y → ¬(yPXx))}.

We have now introduced the concepts that are required in order to state the following representation theorem.

Theorem 6.2. (Representation Theorem II) Let C be a belief revision operation based on the deductive logic L. Let M = <U, V, l, S> be the canonical model for L and C. Then, CnL = CnM, for all Γ, ∆, C∆(Γ) = CM∆(Γ), and for all X, Y ∈ V:
(i) if Y ≠ ∅, then S(X, Y) ≠ ∅; (Consistency Preservation)
(ii) if X ∩ Y ≠ ∅, then S(X, Y) = X ∩ Y. (Expansion)
Moreover,
(a) If C, in addition to the basic axioms (BC1) - (BC5), also satisfies axioms (BC6) (Chernoff) and (BC7) (Gamma), then there exists a (relativized) preference relation P ⊆ U × ℘(U) × U such that P is neat and S is based on P.
(b) If C satisfies the conditions (BC1) - (BC8), then there exists a relation P ⊆ U × ℘(U) × U such that P is neat and transitive and S is based on P.
(c) If C satisfies (BC1) - (BC5) together with (BC9) (Arrow), then there exists a relation P ⊆ U × ℘(U) × U such that P is a neat strict weak ordering and S is based on P.
Proof: (to be written)

We are next going to prove that any (infinitary) belief revision operation C that satisfies (BC1) - (BC5) together with (BC9) can be defined in terms of "systems of spheres" of the kind defined in Grove (1988). Theorems 6.4 and 6.5 below for infinitary belief revision operations should be compared with Grove's (1988) Theorems 1 and 2 for finitary belief revision operations.

In the following, we let L be a fixed deductive logic and U the set ML of all L-maximal theories. V is the set of all closed subsets of U.

Definition 6.3. (a) A family of spheres centered on X ∈ V is a collection $X of elements in V satisfying the conditions:25
($1) $X is totally ordered by ⊆, that is, if Y, Z ∈ $X, then Y ⊆ Z or Z ⊆ Y;
($2) X is the ⊆-minimum of $X, that is, X ∈ $X and for all Y ∈ $X, X ⊆ Y;
($3) U ∈ $X;
($4) if Y ∈ V, then there exists an element Z of $X such that Y ∩ Z ≠ ∅ and for all Z' ∈ $X, if Y ∩ Z' ≠ ∅, then Z ⊆ Z'. In other words, for every closed set Y ∈ V, there exists a smallest sphere in $X intersecting Y.
(b) A system of spheres is a function $ that associates a family of spheres $X with any set X ∈ V.

Let P ⊆ U × U be a neat strict weak ordering of U. As usual, xRy is defined as ¬yPx. We say that X is a P-sphere iff X ∈ V and (∀x, y)(x ∈ X and xRy → y ∈ X). The set $ of all P-spheres is then a family of spheres centered around ∩$.

With any system of spheres $, we may associate a dyadic selection function S: V × V → V in the following way: for all X, Y ∈ V,
(i) if Y ≠ ∅, we let S(X, Y) = Z0 ∩ Y, where Z0 is the smallest sphere in $X that intersects Y; and
(ii) S(X, ∅) = ∅.

Theorem 6.4. Let $ be any system of spheres and let S be the associated dyadic selection function. For all Γ and ∆, let C∆(Γ) = t(S(⟦∆⟧, ⟦Γ⟧)). Then C is a belief revision operation satisfying axioms (BC1) - (BC5) and (BC9).

Theorem 6.5. (Representation Theorem III) Let C be any belief revision operation satisfying axioms (BC1) - (BC5) and (BC9).
Then, there exists a system of spheres $ such that for all Γ, ∆, C∆(Γ) = t(S(⟦∆⟧, ⟦Γ⟧)), where S is the dyadic selection function associated with $.

(Proofs will be added)

APPENDIX: PROOFS OF LEMMAS AND THEOREMS

Proof of Lemma 2.1: We omit the easy verification of (i) and proceed to prove (ii). By (Cn1), Γ ∪ ∆ ⊆ Γ ∪ Cn(∆). (Cn2) then yields: Cn(Γ ∪ ∆) ⊆ Cn(Γ ∪ Cn(∆)). We also have, by (Cn1) and (Cn2), that: Cn(Γ ∪ Cn(∆)) ⊆ Cn(Cn(Γ) ∪ Cn(∆)). It remains to prove that Cn(Cn(Γ) ∪ Cn(∆)) ⊆ Cn(Γ ∪ ∆). By Monotonicity we get: Γ ⊆ Cn(Γ ∪ ∆) and ∆ ⊆ Cn(Γ ∪ ∆). (i) then yields: Cn(Γ) ⊆ Cn(Γ ∪ ∆) and Cn(∆) ⊆ Cn(Γ ∪ ∆). Hence, Cn(Γ) ∪ Cn(∆) ⊆ Cn(Γ ∪ ∆). Monotonicity then yields: Cn(Cn(Γ) ∪ Cn(∆)) ⊆ Cn(Cn(Γ ∪ ∆)). Finally, using (Cn3), we get: Cn(Cn(Γ) ∪ Cn(∆)) ⊆ Cn(Γ ∪ ∆). ∎

Proof of Lemma 2.3: Suppose Γ is L-maximal. We prove that CnL(Γ) ⊆ Γ. By (Cn1), Γ ⊆ CnL(Γ). Suppose that CnL(Γ) is not L-consistent. Then, CnL(Γ) ⊢L ⊥, i.e., ⊥ ∈ CnL(CnL(Γ)). By (Cn3), ⊥ ∈ CnL(Γ), which is impossible. Hence, CnL(Γ) is an L-consistent superset of Γ. It follows by the L-maximality of Γ that Γ = CnL(Γ). ∎

Proof of Lemma 2.4: First we notice that (a) follows from (b). Substituting ⊥ for α in (b) we get: If Γ is L-consistent, then CnL(Γ) is included in an L-maximal theory. (a) then follows, by Inclusion. Next, we prove (b). Suppose α ∉ CnL(Γ). Let
X = {∆ : ∆ is L-consistent, CnL(Γ) ⊆ ∆ and α ∉ ∆}.
X is non-empty, since CnL(Γ) ∈ X (the claim that CnL(Γ) is L-consistent presupposes Iteration (i.e., Cut)). Let Y be any non-empty chain in X. Consider Σ = ∪Y. Clearly CnL(Γ) ⊆ Σ. We claim that Σ is L-consistent. Indirect proof: Suppose not. Then Σ ⊢L ⊥. By Finiteness, there is a finite Σ' ⊆ Σ such that Σ' ⊢L ⊥. But then, since Y is simply ordered by inclusion and Σ' is finite, there must exist some ∆ ∈ Y such that Σ' ⊆ ∆. It follows, by Monotonicity, that ∆ is L-inconsistent. Contradiction. It remains to prove that α ∉ Σ.
But this is clear, since otherwise α ∈ ∆, for some ∆ ∈ Y, which is impossible. Thus, Σ satisfies the necessary conditions for being a member of X. It follows by Zorn's Lemma that X has a maximal element. ∎

Proof of Lemma 3.4: (⇒) Suppose that L is a classical logic and that |~ is an inference relation based on L. We verify that |~ satisfies conditions (4) - (7).
(4) Suppose that CnL({α}) = CnL({β}). Then, α ⊢L β and β ⊢L α. Γ |~ α and α ⊢L β yield via (2) that Γ |~ β. Similarly, from Γ |~ β and β ⊢L α, one obtains Γ |~ α. Thus, Γ |~ α iff Γ |~ β.
(5) Suppose that Γ |~ α ∧ β. Since L is classical, α ∧ β ⊢L α and α ∧ β ⊢L β. Hence, by (2), Γ |~ α and Γ |~ β.
(6) Suppose that Γ |~ α and Γ |~ β. L being classical yields α, β ⊢L α ∧ β. Hence, by (2), Γ |~ α ∧ β.
(7) Letting ∆ = ∅ and β = ⊤ in (2), we get: if Γ |~ α, for all α ∈ ∅, and ∅ ⊢L ⊤, then Γ |~ ⊤. Γ |~ α, for all α ∈ ∅, holds vacuously and ∅ ⊢L ⊤ holds because L is classical. Hence, Γ |~ ⊤.
(⇐) Suppose that L is a classical logic and that |~ satisfies (1), (3) - (7). We must prove that |~ then also satisfies (2). Suppose that Γ |~ α, for all α ∈ ∆, and ∆ ⊢L β. We wish to prove that Γ |~ β. We first consider the case when ∆ ≠ ∅. Then there are β1,..., βn ∈ ∆ (n ≥ 1) such that Γ |~ β1,..., Γ |~ βn and {β1,..., βn} ⊢L β. Using (4) and (6), we get that Γ |~ β1 ∧ ... ∧ βn. Furthermore, β1 ∧ ... ∧ βn ⊢L β. This implies that ∅ ⊢L ((β1 ∧ ... ∧ βn) ∧ β) ↔ β1 ∧ ... ∧ βn. Using (4), we get Γ |~ (β1 ∧ ... ∧ βn) ∧ β. This yields by (5) that Γ |~ β. We now consider the case when ∆ = ∅. We need to prove that if ∅ ⊢L β, then Γ |~ β. But if ∅ ⊢L β, then ∅ ⊢L (β ↔ ⊤). We have Γ |~ ⊤, by (7). Hence, Γ |~ β, by (4). ∎

Proof of Lemma 4.8: (i) Suppose that S satisfies condition (c). By the definition of a model, S(S(X)) ⊆ S(X) ⊆ X. Hence, by condition (c), S(X) ⊆ S(S(X)). Thus, S(S(X)) = S(X).
(ii) Assume that S(X) ⊆ Y ⊆ X. By Chernoff: S(X) ∩ Y ⊆ S(X ∩ Y). By the assumption: S(X) ∩ Y = S(X) and X ∩ Y = Y. Hence, S(X) ⊆ S(Y).
(iii) We assume that Chernoff holds. It follows that:
S(X ∪ Y) ∩ X ⊆ S((X ∪ Y) ∩ X); and S(X ∪ Y) ∩ Y ⊆ S((X ∪ Y) ∩ Y).
However, (X ∪ Y) ∩ X = X and (X ∪ Y) ∩ Y = Y. So,
S(X ∪ Y) ∩ X ⊆ S(X); and S(X ∪ Y) ∩ Y ⊆ S(Y).
Since S(X ∪ Y) ⊆ X ∪ Y, we get: S(X ∪ Y) ⊆ S(X) ∪ S(Y).
(iv) Suppose that S satisfies Cut, Aizerman and Distributivity. By Distributivity and the definition of a selection function:
S(X ∪ Y) ⊆ S(X) ∪ S(Y) ⊆ X ∪ Y.
From this we get, using Cut and Aizerman:
S(S(X) ∪ S(Y)) = S(X ∪ Y),
which is Path Independence.
We next prove that for full models, Path Independence is equivalent to Chernoff and Aizerman. Since Chernoff implies Cut and Distributivity, we have in general that Chernoff and Aizerman imply Path Independence. In order to prove the other direction, suppose that S is the selection function of a full model which satisfies Path Independence. First, we prove Chernoff in the formulation:
(α) if Y ⊆ X, then S(X) ∩ Y ⊆ S(Y).
Suppose Y ⊆ X. Applying Path Independence to the sets Y and X \ Y we get:
(1) S(X) = S(Y ∪ (X \ Y)) = S(S(Y) ∪ S(X \ Y)).
Here, we need the assumption of fullness in order to be sure that X \ Y ∈ V. But,
(2) S(S(Y) ∪ S(X \ Y)) ⊆ S(Y) ∪ S(X \ Y) ⊆ S(Y) ∪ (X \ Y).
(1) and (2) yield: S(X) ⊆ S(Y) ∪ (X \ Y). Hence, S(X) ∩ Y ⊆ (S(Y) ∪ (X \ Y)) ∩ Y. That is, S(X) ∩ Y ⊆ S(Y).
To prove that Path Independence, in the presence of fullness, implies Aizerman, suppose that S(X) ⊆ Y ⊆ X. Cut (which follows from Chernoff) then yields S(X) ⊆ S(Y). Path Independence yields:
S(X) = S(X ∪ Y) = S(S(X) ∪ S(Y)) = S(S(Y)).
Since Chernoff yields Iteration, we also have S(S(Y)) = S(Y). Thus, S(X) = S(Y). We have shown:
(aiz) if S(X) ⊆ Y ⊆ X, then S(Y) ⊆ S(X).
(v) Suppose that S satisfies Consistency Preservation and Arrow. Suppose Y ⊆ X. If S(X) ∩ Y = ∅, then clearly S(X) ∩ Y ⊆ S(Y). However, if S(X) ∩ Y ≠ ∅, we have S(X) ∩ Y ⊆ S(Y), by Arrow. Hence, we have derived (α), which is equivalent to Chernoff. (Notice that we did not use (cp) in this derivation.)
In order to prove Aizerman, assume that S(X) ⊆ Y ⊆ X. If S(X) ∩ Y = ∅, then S(X) = ∅. Consistency Preservation then yields X = ∅ and Y = ∅. Hence, ∅ = S(Y) ⊆ S(X), in this case. Hence, we may assume that S(X) ∩ Y ≠ ∅. Arrow then yields: S(X) ∩ Y = S(Y). But since S(X) ⊆ Y, we get S(X) = S(Y). Finally, we prove Gamma. Let ∅ ≠ F ⊆ V such that ∪X∈F X ∈ V. If X = ∅, for all X ∈ F, then Gamma holds trivially. We therefore suppose that for some X ∈ F, X ≠ ∅. Then, ∪X∈F X ≠ ∅. By Consistency Preservation, S(∪X∈F X) ≠ ∅. Hence, for some Y ∈ F, S(∪X∈F X) ∩ Y ≠ ∅. Arrow then yields: If Y ⊆ ∪X∈F X and S(∪X∈F X) ∩ Y ≠ ∅, then S(∪X∈F X) ∩ Y = S(Y). Thus, S(∪X∈F X) ∩ Y = S(Y). It follows that S(Y) ⊆ S(∪X∈F X). Since ∩X∈F S(X) ⊆ S(Y), we obtain: ∩X∈F S(X) ⊆ S(∪X∈F X).

(vi) We first assume that S satisfies Arrow. As we have already shown (in (v) above), S then satisfies Chernoff. To prove Sen, assume that X ⊆ Y and S(X) ∩ S(Y) ≠ ∅. By Arrow, we then get: S(Y ∩ S(X)) = S(Y) ∩ S(X). But, S(X) ⊆ X ⊆ Y. Hence, Y ∩ S(X) = S(X). We get: S(S(X)) = S(Y) ∩ S(X). But Arrow implies Chernoff, which in turn implies Iteration: S(S(X)) = S(X). Hence, S(X) = S(Y) ∩ S(X), i.e., S(X) ⊆ S(Y). For the other direction, assume that S satisfies Chernoff and Sen. We prove Arrow in the form: If Y ⊆ X and S(X) ∩ Y ≠ ∅, then S(X) ∩ Y = S(Y). Thus, assume that Y ⊆ X and S(X) ∩ Y ≠ ∅. By Chernoff, we have: S(X) ∩ Y ⊆ S(X ∩ Y) = S(Y). It follows that S(X) ∩ Y ⊆ S(X) ∩ S(Y) and that S(X) ∩ S(Y) ≠ ∅. Sen then yields S(Y) ⊆ S(X). Hence, S(Y) ⊆ S(X) ∩ Y. But we already have S(X) ∩ Y ⊆ S(Y). Thus, S(X) ∩ Y = S(Y). ∎

Proof of Theorem 4.9: Let M = <U, V, l, S> be a canonical model based on the {∧, ∨}-normal deductive logic L and let C = CM. Then, we have: C(Γ) = t(S(⟦Γ⟧)) and S(X) = ⟦C(t(X))⟧.

Consistency Preservation: We assume (cp) and prove (CP). Suppose that ⊥ ∉ Cn(Γ). Then ⟦Γ⟧ ≠ ∅, so by (cp), S(⟦Γ⟧) ≠ ∅. But S(⟦Γ⟧) = ⟦t(S(⟦Γ⟧))⟧ = ⟦C(Γ)⟧. Thus, ⟦C(Γ)⟧ ≠ ∅. It follows that C(Γ) is L-consistent.
For the other direction, assume that C satisfies (CP). Also assume that X ≠ ∅. Then, t(X) is L-consistent, so it follows by (CP) that C(t(X)) is L-consistent. Hence, ⟦C(t(X))⟧ ≠ ∅, i.e., S(X) ≠ ∅.

Iteration: We assume (it) and prove (It). C(Γ) = t(S(⟦Γ⟧)) = t(S(S(⟦Γ⟧))) (by (it)) = t(S(⟦t(S(⟦Γ⟧))⟧)) = C(t(S(⟦Γ⟧))) = C(C(Γ)). Next, we assume (It). Then, S(X) = ⟦C(t(X))⟧ = ⟦C(C(t(X)))⟧ = ⟦C(Cn(C(t(X))))⟧ = ⟦C(t(⟦C(t(X))⟧))⟧ = S(⟦C(t(X))⟧) = S(S(X)).

Cut: First, we assume (c) and prove (C). Suppose Γ ⊆ ∆ ⊆ C(Γ). It follows that: ⟦C(Γ)⟧ ⊆ ⟦∆⟧ ⊆ ⟦Γ⟧. Hence, ⟦t(S(⟦Γ⟧))⟧ ⊆ ⟦∆⟧ ⊆ ⟦Γ⟧. But S(⟦Γ⟧) = ⟦t(S(⟦Γ⟧))⟧, so S(⟦Γ⟧) ⊆ ⟦∆⟧ ⊆ ⟦Γ⟧. Condition (c) now yields: S(⟦Γ⟧) ⊆ S(⟦∆⟧). Hence, t(S(⟦∆⟧)) ⊆ t(S(⟦Γ⟧)). Finally, we get C(∆) ⊆ C(Γ). In order to prove the converse, assume (C) and that S(X) ⊆ Y ⊆ X. Then we get: t(X) ⊆ t(Y) ⊆ t(S(X)). But, S(X) = S(⟦t(X)⟧), so it follows that t(X) ⊆ t(Y) ⊆ t(S(⟦t(X)⟧)). Now t(S(⟦t(X)⟧)) = C(t(X)). Hence, t(X) ⊆ t(Y) ⊆ C(t(X)). Condition (C) now yields C(t(Y)) ⊆ C(t(X)). Thus, ⟦C(t(X))⟧ ⊆ ⟦C(t(Y))⟧. Finally, we obtain S(X) ⊆ S(Y).

Distributivity: First, we prove (D) from (d). By (d), we get: S(⟦Γ⟧ ∪ ⟦∆⟧) ⊆ S(⟦Γ⟧) ∪ S(⟦∆⟧). However, t(⟦Γ⟧ ∪ ⟦∆⟧) = Cn(Γ) ∩ Cn(∆). Hence, ⟦t(⟦Γ⟧ ∪ ⟦∆⟧)⟧ = ⟦Cn(Γ) ∩ Cn(∆)⟧. It follows that: S(⟦Γ⟧ ∪ ⟦∆⟧) = S(⟦t(⟦Γ⟧ ∪ ⟦∆⟧)⟧) = S(⟦Cn(Γ) ∩ Cn(∆)⟧). Hence, we have: S(⟦Cn(Γ) ∩ Cn(∆)⟧) ⊆ S(⟦Γ⟧) ∪ S(⟦∆⟧). However, this in turn implies: ⟦C(t(⟦Cn(Γ) ∩ Cn(∆)⟧))⟧ ⊆ ⟦C(t(⟦Γ⟧))⟧ ∪ ⟦C(t(⟦∆⟧))⟧. That is, ⟦C(Cn(Γ) ∩ Cn(∆))⟧ ⊆ ⟦C(Γ)⟧ ∪ ⟦C(∆)⟧. But, ⟦C(Γ)⟧ ∪ ⟦C(∆)⟧ ⊆ ⟦C(Γ) ∩ C(∆)⟧, so ⟦C(Cn(Γ) ∩ Cn(∆))⟧ ⊆ ⟦C(Γ) ∩ C(∆)⟧. It follows that: t(⟦C(Γ) ∩ C(∆)⟧) ⊆ t(⟦C(Cn(Γ) ∩ Cn(∆))⟧). That is: C(Γ) ∩ C(∆) ⊆ C(Cn(Γ) ∩ Cn(∆)). In order to prove the converse, assume (D). Then, we have: C(t(X)) ∩ C(t(Y)) ⊆ C(t(⟦t(X)⟧) ∩ t(⟦t(Y)⟧)). That is, C(t(X)) ∩ C(t(Y)) ⊆ C(t(X) ∩ t(Y)). But, t(X) ∩ t(Y) = t(X ∪ Y), so: C(t(X)) ∩ C(t(Y)) ⊆ C(t(X ∪ Y)).
In other words, t(S(X)) ∩ t(S(Y)) ⊆ t(S(X ∪ Y)). Since t(S(X)) ∩ t(S(Y)) = t(S(X) ∪ S(Y)), t(S(X) ∪ S(Y)) ⊆ t(S(X ∪ Y)). This in turn yields: Cl(S(X ∪ Y)) ⊆ Cl(S(X) ∪ S(Y)). But according to Lemma 4.2, Cl(S(X) ∪ S(Y)) = Cl(S(X)) ∪ Cl(S(Y)) = S(X) ∪ S(Y). Thus, finally, we get: S(X ∪ Y) ⊆ S(X) ∪ S(Y).

Chernoff: First, we prove (Ch) from (ch). By (ch), we have: S(⟦Γ⟧) ∩ ⟦∆⟧ ⊆ S(⟦Γ⟧ ∩ ⟦∆⟧). That is: ⟦C(t(⟦Γ⟧))⟧ ∩ ⟦∆⟧ ⊆ ⟦C(t(⟦Γ⟧ ∩ ⟦∆⟧))⟧. Hence, t(⟦C(t(⟦Γ⟧ ∩ ⟦∆⟧))⟧) ⊆ t(⟦C(t(⟦Γ⟧))⟧ ∩ ⟦∆⟧). But, ⟦C(t(⟦Γ⟧))⟧ ∩ ⟦∆⟧ = ⟦C(t(⟦Γ⟧)) ∪ ∆⟧, so: t(⟦C(t(⟦Γ⟧ ∩ ⟦∆⟧))⟧) ⊆ t(⟦C(t(⟦Γ⟧)) ∪ ∆⟧). That is, C(t(⟦Γ⟧ ∩ ⟦∆⟧)) ⊆ Cn(C(Γ) ∪ ∆). Now t(⟦Γ⟧ ∩ ⟦∆⟧) = Cn(Γ ∪ ∆), and C(Cn(Γ ∪ ∆)) = C(Γ ∪ ∆), so C(Γ ∪ ∆) ⊆ Cn(C(Γ) ∪ ∆). For the other direction, assume (Ch). Then, we have for any X, Y ∈ V: C(t(X) ∪ t(Y)) ⊆ Cn(C(t(X)) ∪ t(Y)). Hence, C(t(X ∩ Y)) ⊆ Cn(t(S(X)) ∪ t(Y)). That is, t(S(X ∩ Y)) ⊆ t(S(X) ∩ Y). Hence, ⟦t(S(X) ∩ Y)⟧ ⊆ ⟦t(S(X ∩ Y))⟧. But, S(X) ∩ Y ⊆ ⟦t(S(X) ∩ Y)⟧ and ⟦t(S(X ∩ Y))⟧ = S(X ∩ Y). Thus, it follows that: S(X) ∩ Y ⊆ S(X ∩ Y).

Aizerman: First, we assume (aiz) and prove (Aiz). Suppose that Γ ⊆ ∆ ⊆ C(Γ). It follows that ⟦C(Γ)⟧ ⊆ ⟦∆⟧ ⊆ ⟦Γ⟧. Hence, S(⟦Γ⟧) ⊆ ⟦∆⟧ ⊆ ⟦Γ⟧. Condition (aiz) then yields S(⟦∆⟧) ⊆ S(⟦Γ⟧). This in turn implies: t(S(⟦Γ⟧)) ⊆ t(S(⟦∆⟧)), i.e., C(Γ) ⊆ C(∆). In order to prove the converse, assume (Aiz) and S(X) ⊆ Y ⊆ X. Then we get: t(X) ⊆ t(Y) ⊆ t(S(X)), i.e., t(X) ⊆ t(Y) ⊆ C(t(X)). (Aiz) then yields: C(t(X)) ⊆ C(t(Y)). This, in turn, implies: t(S(X)) ⊆ t(S(Y)). Hence, ⟦t(S(Y))⟧ ⊆ ⟦t(S(X))⟧, that is, S(Y) ⊆ S(X).

Gamma: First, we assume (g) and prove (G). Condition (g) yields: ∩Γ∈F S(⟦Γ⟧) ⊆ S(∪Γ∈F ⟦Γ⟧), where F is a non-empty family of sets of sentences. Now, for each Γ, S(⟦Γ⟧) = ⟦C(Γ)⟧. We also have: S(∪Γ∈F ⟦Γ⟧) = S(Cl(∪Γ∈F ⟦Γ⟧)). Thus, (1) ∩Γ∈F ⟦C(Γ)⟧ ⊆ S(Cl(∪Γ∈F ⟦Γ⟧)). However, t(∪Γ∈F ⟦Γ⟧) = ∩Γ∈F Cn(Γ), so Cl(∪Γ∈F ⟦Γ⟧) = ⟦∩Γ∈F Cn(Γ)⟧. (1) then yields: ∩Γ∈F ⟦C(Γ)⟧ ⊆ S(⟦∩Γ∈F Cn(Γ)⟧).
This, in turn, yields: ∩Γ∈F ⟦C(Γ)⟧ ⊆ ⟦C(∩Γ∈F Cn(Γ))⟧. Hence, t(⟦C(∩Γ∈F Cn(Γ))⟧) ⊆ t(∩Γ∈F ⟦C(Γ)⟧). But, t(⟦C(∩Γ∈F Cn(Γ))⟧) = C(∩Γ∈F Cn(Γ)) and t(∩Γ∈F ⟦C(Γ)⟧) = Cn(∪Γ∈F C(Γ)). Hence, we get: C(∩Γ∈F Cn(Γ)) ⊆ Cn(∪Γ∈F C(Γ)). Next, we assume (G) and prove (g). By (G), we have: C(∩X∈F Cn(t(X))) ⊆ Cn(∪X∈F C(t(X))), where F is any non-empty family of elements in V such that ∪X∈F X ∈ V. Simplifying and using C(t(X)) = t(S(X)), we get: C(∩X∈F t(X)) ⊆ Cn(∪X∈F t(S(X))). But, C(∩X∈F t(X)) = t(S(⟦∩X∈F t(X)⟧)), so: t(S(⟦∩X∈F t(X)⟧)) ⊆ Cn(∪X∈F t(S(X))). Hence, ⟦Cn(∪X∈F t(S(X)))⟧ ⊆ ⟦t(S(⟦∩X∈F t(X)⟧))⟧. It follows that: ⟦∪X∈F t(S(X))⟧ ⊆ S(⟦∩X∈F t(X)⟧). But, ⟦∪X∈F t(S(X))⟧ = ∩X∈F ⟦t(S(X))⟧, so we get: ∩X∈F ⟦t(S(X))⟧ ⊆ S(⟦∩X∈F t(X)⟧). However, ∩X∈F t(X) = t(∪X∈F X). Hence, ⟦∩X∈F t(X)⟧ = Cl(∪X∈F X). We get: ∩X∈F ⟦t(S(X))⟧ ⊆ S(Cl(∪X∈F X)). But, ⟦t(S(X))⟧ = S(X) and S(Cl(∪X∈F X)) = S(∪X∈F X), so we finally get: ∩X∈F S(X) ⊆ S(∪X∈F X).

Sen: We first assume (s) and prove (S). Suppose that C(Γ) ∪ C(∆) = t(S(⟦Γ⟧)) ∪ t(S(⟦∆⟧)) is L-consistent. It follows that S(⟦Γ⟧) ∩ S(⟦∆⟧) ≠ ∅. Hence, by (s), S(⟦Γ⟧ ∩ ⟦∆⟧) ⊆ S(⟦Γ⟧) ∩ S(⟦∆⟧). It follows that t(S(⟦Γ⟧) ∩ S(⟦∆⟧)) ⊆ t(S(⟦Γ⟧ ∩ ⟦∆⟧)). That is, Cn(t(S(⟦Γ⟧)) ∪ t(S(⟦∆⟧))) ⊆ t(S(⟦Γ ∪ ∆⟧)). In other words: Cn(C(Γ) ∪ C(∆)) ⊆ C(Γ ∪ ∆). But C(Γ) ∪ C(∆) ⊆ Cn(C(Γ) ∪ C(∆)), so we get: C(Γ) ∪ C(∆) ⊆ C(Γ ∪ ∆). In order to prove the other direction, we assume (S) and that S(X) ∩ S(Y) ≠ ∅. Hence t(S(X) ∩ S(Y)) is L-consistent. But, t(S(X) ∩ S(Y)) = Cn(t(S(X)) ∪ t(S(Y))) = Cn(C(t(X)) ∪ C(t(Y))). Thus, C(t(X)) ∪ C(t(Y)) is L-consistent. It follows, by (S), that: C(t(X)) ∪ C(t(Y)) ⊆ C(t(X) ∪ t(Y)). But, t(X) ∪ t(Y) = t(X ∩ Y). Hence, C(t(X)) ∪ C(t(Y)) ⊆ C(t(X ∩ Y)). This, however, means that: t(S(X)) ∪ t(S(Y)) ⊆ t(S(X ∩ Y)). But, t(S(X)) ∪ t(S(Y)) = t(S(X) ∩ S(Y)), so: t(S(X) ∩ S(Y)) ⊆ t(S(X ∩ Y)). Hence, Cl(S(X ∩ Y)) ⊆ Cl(S(X) ∩ S(Y)). But, Cl(S(X ∩ Y)) = S(X ∩ Y) and the intersection of two closed sets is closed, so Cl(S(X) ∩ S(Y)) = S(X) ∩ S(Y).
Hence, S(X ∩ Y) ⊆ S(X) ∩ S(Y).

Arrow: We first assume (IIA) and prove (iia). Suppose that S(X) ∩ Y ≠ ∅. That is, ⟦C(t(X))⟧ ∩ Y ≠ ∅, which means that C(t(X)) ∪ t(Y) is L-consistent. It follows by (IIA) that: C(t(X) ∪ t(Y)) = Cn(C(t(X)) ∪ t(Y)). This implies: ⟦C(t(X) ∪ t(Y))⟧ = ⟦C(t(X)) ∪ t(Y)⟧ = ⟦C(t(X))⟧ ∩ ⟦t(Y)⟧. But C(t(X) ∪ t(Y)) = C(Cn(t(X) ∪ t(Y))) = C(t(X ∩ Y)). Hence, ⟦C(t(X ∩ Y))⟧ = ⟦C(t(X))⟧ ∩ ⟦t(Y)⟧. This means that: S(X ∩ Y) = S(X) ∩ Cl(Y). But Y ∈ V, so Cl(Y) = Y. Hence, S(X ∩ Y) = S(X) ∩ Y. (Add proof of the other direction)

Proof of Lemma 5.5: (a) Suppose that S is based on P and that P is neat. Then, we have that: xPy iff S({x, y}) = {x}. Now, assume that S satisfies Aizerman and that xPy and yPz. We want to show that xPz. Since S is rationalizable, it also satisfies Cut; together with Aizerman this gives: if S(X) ⊆ Y ⊆ X, then S(Y) = S(X). Now, let X = {x, y, z} and Y = {x, z}. Since xPy and yPz, we have that y ∉ S(X) and z ∉ S(X). Since P is neat, S satisfies (cp). Thus, S(X) ≠ ∅. It follows that S(X) = {x}. Thus, we have S(X) ⊆ Y ⊆ X. So by Aizerman (and Cut), S(Y) = S(X), i.e., S({x, z}) = {x}. We conclude that xPz.

(b) Assume that P⁻¹ is well-founded and that P is transitive. In order to prove Aizerman, assume that S(X) ⊆ Y ⊆ X. Let x0 ∈ S(Y). In order to derive a contradiction, we assume that x0 ∉ S(X). It follows that there exists some x1 ∈ X such that x1 P x0. If x1 ∉ S(X), then there exists some x2 ∈ X such that x2 P x1, and so on. For any n, if xn ∉ S(X), we choose xn+1 in such a way that xn+1 P xn and xn+1 ∈ X; and if xn ∈ S(X), we terminate the process. Since P⁻¹ is well-founded, this process must terminate after a finite number of steps. Thus, we get a finite sequence (with at least two terms) x0, x1,..., xn of elements in X such that x0 ∉ S(X) and xn ∈ S(X) and xn P xn-1 P ... x1 P x0. By the transitivity of P, xn P x0. Since S(X) ⊆ Y, we get that xn ∈ Y. Thus, we have: x0 ∈ S(Y) (by assumption), xn P x0 and xn ∈ Y, i.e., a contradiction.
Hence, we have proved that S(Y) ⊆ S(X). ∎

NOTES

* This paper has existed in its present form since August 1991. It was presented at the 9th International Congress of Logic, Methodology and Philosophy of Science, 7-14 August 1991, Uppsala, and at a conference on Recent Advances in Philosophy of Science at the University of Amsterdam in August 1991. I started to work on the paper while being a visitor at the Institute for Mathematical Studies in the Social Sciences at Stanford University in the Spring of 1990. I am grateful to the director of the institute, Patrick Suppes, for his great hospitality. At Stanford I learned from Yoav Shoham's lectures and from discussions with Michael Webster - on choice functions - and David Israel - on nonmonotonic reasoning. I also wish to thank Peter Gärdenfors, David Makinson - who saved me from an embarrassing mistake - and Wlodzimierz Rabinowicz for stimulating suggestions and criticisms of earlier versions.
1 Cf. McCarthy (1980).
2 In McCarthy (1980), the circumscription of a sentence is a first-order schema instead of a second-order sentence (cf. the difference between the induction schema of first-order Peano arithmetic and the induction axiom of second-order Peano arithmetic). In later work by John McCarthy and Vladimir Lifschitz (cf. McCarthy, 1986 and Lifschitz 1985, 1987) the circumscription of a sentence is defined as a single second-order sentence. In general, a first-order schema is weaker than the corresponding second-order sentence and is not strong enough to characterize the minimal models (example: the existence of non-standard models of first-order Peano arithmetic). In certain cases, though, the second-order formulation may be equivalent to a first-order sentence or a set of first-order sentences.
3 This is a shortened version of an example in McCarthy (1986).
4 Binary selection functions were studied by Stig Kanger (to appear).
However, Kanger's interpretation of S(V, X), where X, V are subsets of some grand domain U, differs from the one employed here. Kanger took S(V, X) to be "the set of those alternatives of V ∩ X which, compared with alternatives of V, are regarded as not being worse than any alternative of V ∩ X." Here, on the other hand, S(V, X) is interpreted as the set of all those alternatives of X which are not farther removed from the set V than any alternatives in X. Thus, we think of the elements of V as the 'ideal' alternatives; and the elements of S(V, X) are the elements of X that are as close to being ideal as possible.
5 This method of translating back and forth between theories of belief revision and nonmonotonic inference (with single propositions as premises) was suggested by Makinson and Gärdenfors (1990).
6 For any set A, we let ℘(A) be the power set of A.
7 Cf. Gärdenfors and Makinson (1991).
8 This way of talking about an agent's expectations is inspired by Gärdenfors and Makinson (1991).
9 The notion of a Galois Connection and its use in model theory is discussed, for example, in Cohn (1965).
10 In order to be fully explicit, we should write ClL(X) rather than Cl(X) and speak of it as the L-closure of X. Similarly, we should say that X is L-closed, if X = ClL(X). In most cases, however, the reference to L can be left implicit.
11 Together with (Cl 1) - (Cl 4), this condition implies that the closure operation of a canonical model is a topological closure operation in the sense of Kuratowski (see, for instance, Kelley (1955), p. 43).
12 Cf. for example Moulin (1985).
13 See Moulin (1985) for additional references and details about the origin of some of the conditions.
14 David Makinson (personal communication) has proved that (pi) does not in general imply (ch).
15 Cf. Makinson (1989) and Kraus, Lehmann and Magidor (1990).
16 The term choice structure is borrowed from Hansson (1968), although his notion of a choice structure is not exactly the one defined here: Hansson's choice structures satisfy weaker structural conditions on the set V, but stronger conditions on the selection function S.
17 Our terminology here differs from Kanger (to appear), who uses "neat" to refer to the stronger property of P⁻¹ (the converse of P) being well-founded. Thus, P is neat in Kanger's sense just in case every non-empty subset of U has a P-maximal element, i.e., iff there are no infinitely ascending P-chains in U. Neatness in our sense only requires every non-empty closed subset of U to contain a P-maximal element. Neatness is analogous to the limit assumption of Lewis (1973).
18 Cf. Hansson (1968), Sen (1971), Moulin (1985), Kanger (to appear).
19 Sen proved that a selection function that satisfies Consistency Preservation is rationalizable iff it satisfies Chernoff and Gamma. Theorem 2 of Moulin (1985) is Sen's theorem for the case when U is finite.
20 Moulin (1985) proves a finitary version of this lemma which may be formulated as follows: Suppose that S is based on P and that the set U of all alternatives is finite. Then, S satisfies Aizerman iff P is transitive. This is, of course, a consequence of Lemma 5.5.
21 Cf. Alchourrón, Gärdenfors and Makinson (1985) and Gärdenfors (1988).
22 Infinitary belief revision has been studied before in the literature. Fuhrmann (1988) considers both infinitary belief contraction operations and infinitary belief revision operations. He refers to these kinds of operations as multiple contraction and multiple revision, respectively. Via a generalization of the so-called Levi identity, Fuhrmann defines infinitary belief revision in terms of infinitary belief contraction. He also formulates a set of postulates for infinitary belief revision which is equivalent to (BC1) - (BC4) together with (BC9) (Fuhrmann 1988, p. 159). S. O.
Hansson (1989) contains a theory of infinitary belief contraction.
23 See also the discussion in Gärdenfors (1990).
24 The reader should perhaps also be reminded that we make weaker assumptions concerning the underlying logic than Gärdenfors does. We assume only that it is a deductive logic, i.e., a finitary Tarski-style consequence relation. He assumes, in addition, that it is classical, i.e., is closed under the axioms and rules (including the deduction theorem) of classical propositional logic.
25 ($4) is a strengthening of Grove's (1988) limit assumption. For a discussion of the limit assumption in the context of possible worlds semantics for counterfactuals, see Lewis (1973).

REFERENCES

Alchourrón, C. E., Gärdenfors, P., and Makinson, D. (1985) 'On the logic of theory change: Partial meet contraction and revision functions', Journal of Symbolic Logic 50, 510-530.
Cohn, P. M. (1965) Universal Algebra. Harper & Row, London.
Fuhrmann, A. (1988) Relevant Logics, Modal Logics and Theory Change. Doctoral Dissertation, Department of Philosophy and Automated Reasoning Project, Australian National University.
Gabbay, D. (1985) 'Theoretical foundations for non-monotonic reasoning in expert systems', in K. R. Apt (ed.) Logics and Models of Concurrent Systems, Springer Verlag, NATO ASI Series, Vol. F13.
Gärdenfors, P. (1988) Knowledge in Flux: Modeling the Dynamics of Epistemic States, Bradford Books, MIT Press.
Gärdenfors, P. (1990) 'Belief revision and nonmonotonic logic: two sides of the same coin?', to appear in The Proceedings of the Ninth European Conference on AI, ed. by Luigia Carlucci Aiello, Pitman Publishing, London, 768-773.
Gärdenfors, P. and Makinson, D. (1994) 'Nonmonotonic inference based on expectations', Artificial Intelligence 65, 197-245.
Ginsberg, M. (1987) Readings in Nonmonotonic Reasoning, Morgan Kaufmann Publishers.
Grove, A. (1988) 'Two modellings for theory change', Journal of Philosophical Logic 17, 157-170.
Hansson, B. (1968) 'Choice functions and preference relations', Synthese 18, 443-458.
Hansson, S. O. (1989) 'New operators for theory change', Theoria 55, 114-132.
Jónsson, B. and Tarski, A. (1951-52) 'Boolean algebras with operators', Part I, Amer. J. Math. 73, 891-939; Part II, Amer. J. Math. 74, 127-162.
Kanger, S. (to appear) 'Choice based on preference', forthcoming in Pörn et al., Formal Methods in Practical Philosophy - The Scandinavian Contribution.
Kelley, J. L. (1955) General Topology, D. Van Nostrand.
Kraus, S., Lehmann, D., and Magidor, M. (1990) 'Nonmonotonic reasoning, preferential models and cumulative logics', Artificial Intelligence 44, 167-207.
Lewis, D. (1973) Counterfactuals, Basil Blackwell, Oxford.
Lifschitz, V. (1985) 'Computing circumscription', in Proceedings of the Ninth International Joint Conference on Artificial Intelligence 27, 229-235. Reprinted in Ginsberg (1987).
Lifschitz, V. (1987) 'Pointwise circumscription', Technical report, Stanford University. Reprinted in Ginsberg (1987).
Makinson, D. (1989) 'General theory of cumulative inference', in M. Reinfrank, J. de Kleer, M. L. Ginsberg, and E. Sandewall (eds.) Non-Monotonic Reasoning, Springer Verlag, Lecture Notes on Artificial Intelligence, no. 346.
Makinson, D. (1993) 'General patterns of nonmonotonic reasoning', Handbook of Logic in Artificial Intelligence and Logic Programming, Volume II: Non-Monotonic and Uncertain Reasoning. Oxford University Press.
Makinson, D. and Gärdenfors, P. (1990) 'Relations between the logic of theory change and nonmonotonic logic', to appear in Fuhrmann and Morreau (eds.), The Logic of Theory Change, Lecture Notes in Computer Science, Springer Verlag.
McCarthy, J. (1980) 'Circumscription - a form of non-monotonic reasoning', Artificial Intelligence 13, 27-39. Reprinted in Ginsberg (1987).
McCarthy, J. (1986) 'Applications of circumscription to formalizing common-sense knowledge', Artificial Intelligence 28, 89-116.
Reprinted in Ginsberg (1987).
Moulin, H. (1985) 'Choice functions over a finite set: a summary', Social Choice and Welfare 2, 147-160.
Sen, A. (1971) 'Choice functions and revealed preference', Review of Economic Studies 38, 307-17. Also in Sen, A. (1982) Choice, Welfare and Measurement, The MIT Press.
Shoham, Y. (1987) 'A semantical approach to non-monotonic logics', Proceedings of the Tenth International Joint Conference on Artificial Intelligence, Milan, Italy, pp. 388-392. Reprinted in Ginsberg (1987).
Shoham, Y. (1988) Reasoning about Change: Time and Causation from the Standpoint of Artificial Intelligence, The MIT Press.
Suzumura, K. (1983) Rational Choice, Collective Decisions, and Social Welfare, Cambridge University Press.