Structural Equations and Beyond

Franz Huber
Department of Philosophy
University of Toronto
franz.huber@utoronto.ca
http://huber.blogs.chass.utoronto.ca/

penultimate version: please cite the paper in the Review of Symbolic Logic

July 7, 2014

Contents
1 Introduction
2 Structural Equations and Defaults
3 Generalizing Causal Models
4 Laws and Counterfactuality
5 Counterfactuality and Actuality
6 Beyond Structural Equations
7 Appendix
7.1 Proof of Theorem 1
7.2 Proof of Theorem 2

1 Introduction

Recent accounts of actual causation are stated in terms of extended causal models. These extended causal models contain two elements representing two seemingly distinct modalities. The first element consists of structural equations, which represent the "(causal) laws" or mechanisms of the model, just as in ordinary causal models. The second element consists of ranking functions, which represent normality or typicality. The aim of this paper is to show that these two modalities can be unified. I will do so by formulating two constraints under which extended causal models with their two modalities can be subsumed under so-called "counterfactual models", which contain just one modality. These two constraints will be formally precise versions of Lewis' (1979) familiar "system of weights or priorities" governing overall similarity between possible worlds.

Here is my strategy in a bit more detail. Elsewhere I have introduced counterfactual models which contain one element representing one modality: objective ranking functions representing counterfactuality. In a first step I will generalize extended causal models by relaxing certain restrictions. If anything, this makes my task more difficult. In a second step I will interpret the ranking functions in these generalized extended causal models objectively as in counterfactual models.
In a third step I will formulate two constraints on these generalized and objectively interpreted extended causal models. The first constraint relates structural equations and ranking functions. It is reminiscent of Lewis' (1979: 472) two conditions that "[i]t is of the first importance to avoid big, widespread, diverse violations of law" and that "[i]t is of the third importance to avoid even small, localized, simple violations of law." I will show that extended causal models satisfying this first constraint can be subsumed under counterfactual models. The second constraint relates ranking functions and actuality. It is reminiscent of Lewis' (1979: 472) condition that "[i]t is of the second importance to maximize the spatio-temporal region throughout which perfect match of particular fact prevails." I will show that extended causal models that satisfy this second constraint in addition to the first constraint can be subsumed under counterfactual models in a conservative way. By that I mean that all counterfactual claims as well as all claims about lawhood, causality, and actuality are conserved.

Therefore, given these two constraints, there is only one modality that is needed to model actual causation and causality in general. That one modality is counterfactuality, which unifies the two modalities of "(causal) laws" or mechanisms and of normality or typicality that figure in extended causal models. This unification is achieved by a formally precise version of Lewis' (1979: 472) "system of weights or priorities."

This result is primarily a result about counterfactuals. However, it may impact the theory of causality in the following way. On the new picture of extended causal models, actual causation is the wrong concept to focus on, because it is a hybrid that involves two seemingly distinct modalities. On this view the concept to focus on is the notion of a "(causal) law" or mechanism as represented by a structural equation.
In combination with normality or typicality, as well as what is actually the case, "(causal) laws" or mechanisms somehow give rise to actual causation. On a more traditional picture the concept to focus on is that of actual causation, which is to be analyzed in terms of counterfactuals (Lewis 1973a, 1986a, 2000).

I do not want to take sides on the issue of which causal notion to focus on. The issue I want to take sides on is how to represent counterfactuals. The traditional picture has come under attack because it has the wrong theory of counterfactuals (Lewis 1973b, 1979). The new picture of extended causal models receives incredulous stares because it has an incomplete theory of counterfactuals. It reaches for a second modality in order to compensate for this incompleteness. However, in contrast to the first modality of "(causal) laws" or mechanisms the second modality of normality or typicality seems to be partly subjective. This flies in the face of the seemingly objective nature of actual causation. Hence the incredulous stares.

The present account corrects the theory of counterfactuals underlying the traditional picture. It completes the theory of counterfactuals underlying the new picture by unifying the two modalities of the latter. Therefore the present account provides the framework in terms of which a counterfactual theory of causality should be formulated, if one wants to defend such a theory.[1]

[1] For a quite different way of relating ranking functions and structural equations via causation see Spohn (2010).

2 Structural Equations and Defaults

The most promising framework for analyzing causation seems to be the structural equations approach (Spirtes & Glymour & Scheines 2000, Pearl 2009: ch. 7; see also Halpern & Pearl 2005a and 2005b and Hitchcock 2001 and 2007).
While structural equations are primarily used for the analysis of causation, they are of independent interest for studying the logic of counterfactuals (see Briggs 2012 and Halpern forthcoming). I will touch upon some issues in this connection below, but first we have to get started.

The following definition is due to Halpern (2008). $\mathcal{M} = (\mathcal{S}, \mathcal{F})$ is a causal model if and only if $\mathcal{S}$ is a signature and $\mathcal{F} = \{F_1, \ldots, F_n\}$ represents a set of $n$ modifiable structural equations. $\mathcal{S} = (\mathcal{U}, \mathcal{V}, \mathcal{R})$ is a signature if and only if $\mathcal{U}$ is a finite set of exogenous variables, $\mathcal{V} = \{V_1, \ldots, V_n\}$ is a set of $n$ endogenous variables disjoint from $\mathcal{U}$, and $\mathcal{R} : \mathcal{U} \cup \mathcal{V} \to \mathbb{R}$ assigns each variable $X$ in $\mathcal{U} \cup \mathcal{V}$ its range $\mathcal{R}(X) \subseteq \mathbb{R}$. $\mathcal{W} = \times_{X \in \mathcal{U} \cup \mathcal{V}} \mathcal{R}(X)$ is the set of possible worlds. $\mathcal{F} = \{F_1, \ldots, F_n\}$ represents a set of $n$ modifiable structural equations if and only if each $F_i$ is a function from $\mathcal{W}_i = \times_{X \in \mathcal{U} \cup \mathcal{V} \setminus \{V_i\}} \mathcal{R}(X)$ into the range $\mathcal{R}(V_i)$ of the endogenous variable $V_i$.

A causal model $\mathcal{M} = (\mathcal{S}, \mathcal{F})$ is acyclic if and only if there is no cycle $V_{i_1}, \ldots, V_{i_m}, V_{i_1}$ in $\mathcal{V}$ such that the value of $F_{i_{j+1}}$ depends on $\mathcal{R}(V_{i_j})$ for $j = 1, \ldots, m-1$, and the value of $F_{i_1}$ depends on $\mathcal{R}(V_{i_m})$. Dependence is functional dependence: $F_i$ depends on $\mathcal{R}(V_j)$ just in case there are $\vec{w}_i$ and $\vec{w}_i'$ in $\mathcal{W}_i = \times_{X \in \mathcal{U} \cup \mathcal{V} \setminus \{V_i\}} \mathcal{R}(X)$ that differ only in the value from $\mathcal{R}(V_j)$ such that $F_i(\vec{w}_i) \neq F_i(\vec{w}_i')$.

Let $\mathrm{Pa}(V_i)$ be the set of exogenous or endogenous variables $X$ in $\mathcal{U} \cup \mathcal{V}$ such that $F_i$ depends on $\mathcal{R}(X)$. The members of $\mathrm{Pa}(V_i)$ are called the parents of the endogenous variable $V_i$. Let $\mathrm{An}(V_i)$ be the ancestral, or transitive closure, of $\mathrm{Pa}(V_i)$, which is defined inductively as follows: $\mathrm{Pa}(V_i) \subseteq \mathrm{An}(V_i)$; and if $V \in \mathrm{An}(V_i)$, then $\mathrm{Pa}(V) \subseteq \mathrm{An}(V_i)$. The members of $\mathrm{An}(V_i)$ are called the ancestors of the endogenous variable $V_i$. They are the parents of $V_i$, $\mathrm{Pa}(V_i)$, and the parents of all parents (but excluding $V_i$ itself, unless the model is cyclic).
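The notions of functional dependence, parents, and ancestors just defined can be illustrated in code. The following is only a minimal sketch: the four binary variables U, A, B, C and their equations are hypothetical and chosen purely for illustration, and each structural equation is represented as a Python function on an assignment of values to the other variables.

```python
from itertools import product

# Hypothetical toy model (not from the text): U is exogenous; A, B, C are endogenous.
RANGES = {"U": (0, 1), "A": (0, 1), "B": (0, 1), "C": (0, 1)}
EQUATIONS = {
    "A": lambda w: w["U"],               # A = U
    "B": lambda w: w["A"],               # B = A
    "C": lambda w: max(w["A"], w["B"]),  # C = A or B
}

def depends_on(vi, x):
    """True iff two inputs of F_vi that differ only in x yield different values."""
    if x == vi:
        return False
    others = [y for y in RANGES if y not in (vi, x)]
    for values in product(*(RANGES[y] for y in others)):
        base = dict(zip(others, values))
        outputs = {EQUATIONS[vi]({**base, x: a}) for a in RANGES[x]}
        if len(outputs) > 1:
            return True
    return False

def parents(vi):
    """Pa(vi): all variables on whose range F_vi functionally depends."""
    return {x for x in RANGES if depends_on(vi, x)}

def ancestors(vi):
    """An(vi): closure of Pa(vi) under taking parents of endogenous members."""
    result = set(parents(vi))
    frontier = list(result)
    while frontier:
        v = frontier.pop()
        if v in EQUATIONS:  # exogenous variables have no equation, hence no parents
            for p in parents(v) - result:
                result.add(p)
                frontier.append(p)
    return result
```

In this toy model `parents("C")` contains A and B, while `ancestors("C")` additionally contains U, which is reached only through A.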
A context is a specification of the values of all exogenous variables and so can be formalized as a vector $\vec{u}$ in $\mathcal{R}(\mathcal{U}) = \times_{U \in \mathcal{U}} \mathcal{R}(U)$. A basic fact about causal models is that every acyclic causal model has a unique solution for any context. An acyclic causal model can be represented by a directed acyclic graph whose nodes are the exogenous and endogenous variables in $\mathcal{U} \cup \mathcal{V}$ and whose arrows point into each endogenous variable $V_i$ from all of the latter's parents in $\mathrm{Pa}(V_i)$.

The signature provides the framework or language of the model. It contains more structure than a set of possible worlds, because there is a distinction between exogenous and endogenous variables. What may be even more important is the way one understands these variables. I understand them as singular variables and briefly want to explain why.

Philosophers such as Woodward (2003), following the lead of Spirtes & Glymour & Scheines (2000) and Pearl (2009), are mainly interested in causal relevance between properties rather than actual causation between events (or, more cautiously, the relata of actual causation; see Paul 2000). That is, they understand the variables in the generic way they are understood in science, especially those areas of science that rely on statistical methods, as assigning values to a population of individuals from which one can draw samples. For instance, the population may be the set of people at a certain age and in a certain geographical region, and the generic variable may assign values to these individuals – say, value $i$ is assigned to an individual in that population if $i$ mg ibuprofen are administered to that individual. With this generic understanding of the variables it might indeed be possible to test counterfactual claims of what would happen under certain interventions by "carry[ing] out the interventions described in the[...] antecedents and then check[ing] to see whether certain correlations hold" (Woodward 2003: 72-73).
For instance, it might indeed be possible to test the causal relevance claim that the administration of ibuprofen causes relief of pain by carrying out the intervention of administering a certain number of mg ibuprofen to some select subgroup of the population and then checking if pain is relieved in the members of that group.[2]

[2] It is not entirely clear to me how Woodward (2003) can distinguish between the test of a counterfactual conditional, the test of an indicative conditional, and the test of a claim about conditional probabilities. How to empirically test or confirm counterfactuals on the account presented in section 4 is explained in Author (ms 1).

However, we cannot use generic variables if we want to construct a set of possible worlds in the way we have done above. In order to understand the Cartesian product of all possible values of all variables as a set of possible worlds we have to understand the variables in a singular sense. Otherwise the resulting possibilities are not exclusive. For instance, the variable may assign value $i$ to a possible world if $i$ mg ibuprofen are administered to me at noon on July 1, 2014, in that possible world. By moving from generic variables to singular variables we may lose some connection to science, but we get closer to philosophy. The reason is that now we can understand better the counterfactual claims implicit in a causal claim. Here is how.

If we can interpret the Cartesian product of all possible values of all variables as a set of possible worlds, then we can rely on a well-developed theory of counterfactuals. According to that theory a counterfactual conditional of the form 'if A were the case, then C would be the case' is true at a world if C is true in all worlds of a certain subset of the A-worlds. This understanding of counterfactuals is not obviously available if we work with generic variables. The reason is that it is not obvious how to construct possible worlds out of generic variables.
And even if one has succeeded in constructing possible worlds out of generic variables, it is not obvious how to understand counterfactuals in the sense of this theory while still being able to test them in the way envisaged by Woodward (2003) and sketched above.

Another reason why it is important to understand the variables of the causal model as singular variables is that the restriction to acyclic causal models, which will be important later on, is only plausible for singular variables. For generic variables acyclicity is clearly false. A related point is made by Kistler (forthcoming).

Pearl (2009: ch. 10), Hitchcock (2001), Woodward (2003: sct. 2.7), and Halpern & Pearl (2005a) have provided increasingly sophisticated definitions of actual causation in terms of acyclic causal models (the particular way these authors formalize causal models differs in detail). However, Hiddleston (2005) presents two acyclic causal models where the "intuitively correct" causal judgments differ, even though the two models are isomorphic (two examples illustrating this point will be presented in the next section). As Halpern (2008) puts it: "there must be more to causality than just the structural equations." I will refer to this claim as the insufficiency thesis: structural equations representing the "(causal) laws" or mechanisms of a model are insufficient for causality.

In order to solve this problem Hall (2007) and Hitchcock (2007) distinguish between normal or default values and abnormal or deviant values of a variable. In Halpern (2008) and Halpern & Hitchcock (2010) these defaults are modeled in terms of ranking functions (Spohn 1988). The latter are defined as follows. A function $\varrho : \mathcal{W} \to \mathbb{N}$ is a ranking function if and only if $\varrho$ assigns rank 0 to at least one possible world $w$ in $\mathcal{W}$.
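The defining clause of a ranking function is easy to check mechanically. The following is a minimal sketch, assuming worlds are represented as tuples and ranks as a Python dictionary; the example worlds and ranks are hypothetical. The helper `min_rank` is added for illustration only: it assigns to a set of worlds the minimum rank of its members, and infinity to the empty set, which is the usual way ranks are carried over to propositions in ranking theory.

```python
import math

def is_ranking_function(rank):
    """True iff all ranks are natural numbers and at least one world has rank 0."""
    values = list(rank.values())
    return all(isinstance(r, int) and r >= 0 for r in values) and 0 in values

def min_rank(rank, proposition):
    """Minimum rank over a set of worlds; infinity for the empty set."""
    return min((rank[w] for w in proposition), default=math.inf)

# Hypothetical example: worlds are pairs of binary values, ranked by their sum.
worlds = [(l, m) for l in (0, 1) for m in (0, 1)]
rank = {w: sum(w) for w in worlds}
shifted = {w: r + 1 for w, r in rank.items()}  # no world has rank 0 any more
```

Here `rank` satisfies the definition, whereas `shifted` does not, because it assigns a positive rank to every world.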
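Usually ranking functions are interpreted epistemically as grades of disbelief, and then their defining clause is a consistency constraint saying that one should not disbelieve every possible world. A ranking function $\varrho$ on the set of possible worlds $\mathcal{W}$ is extended to a function $\varrho^+ : \wp(\mathcal{W}) \to \mathbb{N} \cup \{\infty\}$ on the powerset of (the propositions over) $\mathcal{W}$, $\wp(\mathcal{W})$, by setting $\varrho^+(A) = \min\{\varrho(w) : w \in A \subseteq \mathcal{W}\}$ and $\varrho^+(\emptyset) = \infty$. I will abuse notation and write '$\varrho$' instead of '$\varrho^+$'.

$\mathcal{M} = (\mathcal{S}, \mathcal{F}, \varrho)$ is an extended (acyclic) causal model if and only if $(\mathcal{S}, \mathcal{F})$ is a(n) (acyclic) causal model and $\varrho$ is a ranking function on $\mathcal{W}$. As suggested – unintentionally, but nevertheless appropriately – by Halpern (2008: sct. 4), the ranking function $\varrho$ should be indexed to the set of contexts, because what is normal may vary from context to context. Thus, extended (acyclic) causal models really are of the form $\mathcal{M} = (\mathcal{S}, \mathcal{F}, (\varrho_{\vec{u}})_{\vec{u} \in \mathcal{R}(\mathcal{U})})$, where $\mathcal{R}(\mathcal{U}) = \times_{U \in \mathcal{U}} \mathcal{R}(U)$ is the set of all contexts or specifications of the values of all exogenous variables.

The definition of actual causation then runs as follows (Halpern & Hitchcock 2010: sct. 3). $X_1 = x_1 \wedge \ldots \wedge X_k = x_k$, or simply $\vec{X} = \vec{x}$, is an actual cause of $\varphi$ in the extended acyclic causal model $\mathcal{M} = (\mathcal{S}, \mathcal{F}, (\varrho_{\vec{u}})_{\vec{u} \in \mathcal{R}(\mathcal{U})})$ in context $\vec{u}$ if and only if:

1. $\vec{X} = \vec{x}$ and $\varphi$ are true in $\mathcal{M}$ in $\vec{u}$.
2. There is a partition $\{\vec{Z}, \vec{W}\}$ of the endogenous variables $\mathcal{V}$ with $\vec{X} \subseteq \vec{Z}$, and there are vectors of values $\vec{x}'$ and $\vec{w}$ of $\vec{X}$ and $\vec{W}$, respectively, with $\varrho_{\vec{u}}(\vec{X} = \vec{x}' \wedge \vec{W} = \vec{w}) \leq \varrho_{\vec{u}}(w_{\vec{u}})$ such that: if $\vec{Z} = \vec{z}^*$ is true in $\mathcal{M}$ in $\vec{u}$, then
   (a) $\vec{X} = \vec{x}' \wedge \vec{W} = \vec{w} \mathrel{\Box\!\!\rightarrow_{SE}} \neg\varphi$ is true in $\mathcal{M}$ in $\vec{u}$; and
   (b) for all $\vec{W}^- \subseteq \vec{W}$ and all $\vec{Z}^- \subseteq \vec{Z}$: $\vec{X} = \vec{x} \wedge \vec{W}^- = \vec{w} \wedge \vec{Z}^- = \vec{z}^* \mathrel{\Box\!\!\rightarrow_{SE}} \varphi$ is true in $\mathcal{M}$ in $\vec{u}$.
3. There is no proper subset $\vec{X}^-$ of $\vec{X}$ such that 1. and 2. hold for $\vec{X}^-$.

In order to understand this definition we need to know the truth conditions for counterfactuals of the form $\vec{X} = \vec{x} \mathrel{\Box\!\!\rightarrow_{SE}} \varphi$ in an extended acyclic causal model $\mathcal{M}$ in a context $\vec{u}$.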
It is these counterfactuals that are my main target. In what follows I will ignore the use/mention distinction whenever possible so that the notation does not become even more cumbersome.

For an endogenous variable $X$ in $\mathcal{V}$ and a value $x$ in $\mathcal{R}(X)$, $X = x$ is an atomic sentence. An atomic sentence $X = x$ is true in $\mathcal{M}$ in $\vec{u}$ just in case all solutions to the equations represented by $\mathcal{F}$ assign value $x$ to the endogenous variable $X$ when the exogenous variables are set to $\vec{u}$. Since we are restricting the discussion to extended acyclic causal models which have a unique solution in any given context, this means that $X = x$ is true in $\mathcal{M}$ in $\vec{u}$ if and only if $x$ is the value of $X$ in the unique solution to all equations in $\mathcal{M}$ in $\vec{u}$. The truth conditions for negations and conjunctions are given in the usual way.

A counterfactual $X_1 = x_1 \wedge \ldots \wedge X_k = x_k \mathrel{\Box\!\!\rightarrow_{SE}} \varphi$, or simply $\vec{X} = \vec{x} \mathrel{\Box\!\!\rightarrow_{SE}} \varphi$, is true in $\mathcal{M}$ in $\vec{u}$ just in case $\varphi$ is true in $\mathcal{M}_{\vec{X}=\vec{x}} = (\mathcal{S}_{\vec{X}}, \mathcal{F}^{\vec{X}=\vec{x}})$ (but the same $\vec{u}$). The latter model results from $\mathcal{M}$ by replacing the equations for $X_i$ by the equations $X_i = x_i$, $i = 1, \ldots, k$. Formally this means two things (i-ii).

(i) The signature $\mathcal{S}$ is reduced to $\mathcal{S}_{\vec{X}} = (\mathcal{U}, \mathcal{V} \setminus \{X_1, \ldots, X_k\}, \mathcal{R}|_{\mathcal{U} \cup \mathcal{V} \setminus \{X_1, \ldots, X_k\}})$, where $\mathcal{R}|_{\mathcal{U} \cup \mathcal{V} \setminus \{X_1, \ldots, X_k\}}$ is $\mathcal{R}$ with its domain restricted from $\mathcal{U} \cup \mathcal{V}$ to $\mathcal{U} \cup \mathcal{V} \setminus \{X_1, \ldots, X_k\}$.

(ii) $\mathcal{F}$ is reduced to $\mathcal{F}^{\vec{X}=\vec{x}}$ which results from $\mathcal{F}$ by deleting the functions $F_{X_i}$ representing the equations for $X_i$ and by changing the remaining functions $F_Y$ in $\mathcal{F} \setminus \{F_{X_1}, \ldots, F_{X_k}\}$ as follows. First, restrict the domain of each $F_Y$ from $\times_{X \in \mathcal{U} \cup \mathcal{V} \setminus \{Y\}} \mathcal{R}(X)$ to $\times_{X \in \mathcal{U} \cup \mathcal{V} \setminus \{Y, X_1, \ldots, X_k\}} \mathcal{R}(X)$. Second, replace $F_Y$ by $F_Y^{\vec{X}=\vec{x}}$ which results from $F_Y$ by setting $X_1, \ldots, X_k$ to $x_1, \ldots, x_k$, respectively.

While this definition is fairly complicated, the idea behind it is quite simple. In evaluating the counterfactual $\vec{X} = \vec{x} \mathrel{\Box\!\!\rightarrow_{SE}} \varphi$ in model $\mathcal{M}$ in context $\vec{u}$, first validate the antecedent by deleting the equations for the endogenous variables $\vec{X}$ and setting their values to $\vec{x}$.
In a second step set the exogenous variables to $\vec{u}$ and let the remaining equations determine the values of the remaining endogenous variables. In a third step check if the resulting solution yields the right value for $\varphi$.

The equations represent the "(causal) laws" or mechanisms of the model. It is important to stress the relativity to the model and that laws, as understood here, may fail to meet many of the traditional criteria for lawfulness (Woodward 2003: ch. 6). The laws of the model can represent the workings of your fridge, the economics of the food market in the country I live in, the laws of gravitation of some planetary system, or Schrödinger's Equation.

Several features of the formal language from above are worth pointing out. First, all sentences are built up from endogenous variables. Second, Structural-Equations-counterfactuals or SE-counterfactuals cannot be iterated (embeddings can be defined, though, as shown by Halpern forthcoming). Third, the antecedents of SE-counterfactuals are restricted to non-empty conjunctions of atomic sentences, although the consequents of SE-counterfactuals can be arbitrary Boolean combinations of atomic sentences.

As Halpern & Hitchcock (2010) note, the introduction of defaults makes the notion of actual causation doubly "subjective" (Halpern & Hitchcock 2010: 384) or relative: judgments of actual causation depend on the choice of the exogenous and endogenous variables and on the choice of the default values for these variables. Let us look at their FIRE example.

Endogenous variable $L$ takes on the value 1 if there is lightning, and 0 otherwise. Endogenous variable $M$ takes on the value 1 if there is an arsonist dropping a lit match, and 0 otherwise. Endogenous variable $F$ takes on the value 1 if there is a forest fire, and 0 otherwise. Furthermore exogenous variable $(U_L, U_M)$ determines the values of $L$ and $M$.
The functions $F_L : ((i, j), m, f) \mapsto i$, $F_M : ((i, j), l, f) \mapsto j$, and $F_F : ((i, j), l, m) \mapsto \max\{l, m\}$ describe the following equations:

• $(U_L, U_M)$
• $L = U_L$
• $M = U_M$
• $F = L \vee M$

According to Halpern & Hitchcock (2010), in the context where $U_L = 1$ and $U_M = 1$ so that there is lightning ($L = 1$) and there is an arsonist dropping a lit match ($M = 1$) and there is a forest fire ($F = 1$), the arsonist's dropping a lit match ($M = 1$) is an actual cause of the forest fire ($F = 1$). This is so, because:

1. $M = 1$ and $F = 1$ are true in $\mathcal{M}$ in $(u_L, u_M) = (1, 1)$.
2. For the partition $\{\{M, F\}, \{L\}\}$ and the values 0 and 0 of $M$ and $L$ we have $\varrho_{(1,1)}(M = 0 \wedge L = 0) \leq \varrho_{(1,1)}(w_{(1,1)})$ and: $(M, F) = (1, 1)$ is true in $\mathcal{M}$ in $(1, 1)$ and
   (a) $M = 0 \wedge L = 0 \mathrel{\Box\!\!\rightarrow_{SE}} F \neq 1$ is true in $\mathcal{M}$ in $(1, 1)$, and so are
   (b) $M = 1 \mathrel{\Box\!\!\rightarrow_{SE}} F = 1$, $M = 1 \wedge F = 1 \mathrel{\Box\!\!\rightarrow_{SE}} F = 1$, $M = 1 \wedge L = 0 \mathrel{\Box\!\!\rightarrow_{SE}} F = 1$, $M = 1 \wedge L = 0 \wedge F = 1 \mathrel{\Box\!\!\rightarrow_{SE}} F = 1$.
3. There is no proper subset of $\{M\}$ such that 1. and 2. hold.

The relevant inequality for the ranking function $\varrho_{(1,1)}$ says that the most typical world where there is no lightning and no arsonist dropping a lit match is at least as typical as the actual world where there are lightning and an arsonist dropping a lit match and a forest fire. This inequality holds (in the context where $U_L = 1$ and $U_M = 1$) for the following reason. It is more typical that there is no lightning ($L = 0$) than that there is lightning ($L = 1$). It is more typical that there is no arsonist dropping a lit match ($M = 0$) than that there is an arsonist dropping a lit match ($M = 1$). It is more typical that there is no forest fire ($F = 0$) than that there is a forest fire ($F = 1$).

In addition to this the structural equations seem to put a constraint on the ordering of normality or typicality. Even though it is more typical that there is no forest fire than that there is a forest fire, it is more typical that there is lightning and a forest fire than that there is lightning and no forest fire.
Similarly, even though it is more typical that there is no forest fire than that there is a forest fire, it is more typical that there are an arsonist dropping a lit match and a forest fire than that there is an arsonist dropping a lit match and no forest fire. Finally, even though it is more typical that there is no forest fire than that there is a forest fire, it is much more typical that there are lightning and an arsonist dropping a lit match and a forest fire than that there are lightning and an arsonist dropping a lit match, but there is no forest fire. And this is so no matter which context we are in.

More generally, the structural equations seem to put the following constraint on the ordering of normality or typicality. It seems that worlds which violate an equation are less typical than worlds that obey all equations (the latter are called "legal" in Glymour et al. 2010). And it seems that worlds violating certain equations and then some are less typical than worlds violating only certain equations.

It is easy to see, though, that this constraint does not hold for equations such as $L = U_L$ and $M = U_M$, if only because we do not know what $U_L$ and $U_M$ stand for. However, it would be wrong to take this as a reason to reject the constraint that the structural equations seem to put on the ordering of normality or typicality. The fact that the constraint does not hold for equations such as $L = U_L$ and $M = U_M$ should rather be taken as a reason to reject the above model. Let me explain.

The only reason Halpern & Hitchcock (2010) include the "dummy variables" $U_L$ and $U_M$ and the "dummy equations" $L = U_L$ and $M = U_M$ is that they want to say that $L = 1$ and $M = 1$ are actual causes of $F = 1$, but cannot do so unless both $L$ and $M$ are endogenous variables. Besides that, these variables and equations do no work and could be dropped if the artificial restriction were not in place that only endogenous variables can be causally efficacious.
If that restriction were not in place, $L$ and $M$ would be the exogenous variables, and $F = L \vee M$ the only equation. Indeed, this is the model one would use in the framework of Hitchcock (2007).

3 Generalizing Causal Models

FIRE example, version 2: Let exogenous variable $L$ take on the value 1 if there is lightning, and 0 otherwise. Let exogenous variable $M$ take on the value 1 if there is an arsonist dropping a lit match, and 0 otherwise. Let endogenous variable $F$ take on the value 1 if there is a forest fire, and 0 otherwise. The function $F_F : (l, m) \mapsto \max\{l, m\}$ describes the following equation:

• $L$
• $M$
• $F = L \vee M$

In this model it is true that worlds that violate an equation are less typical than worlds that obey all equations. My first proposal therefore is to relax the restriction in the (extended acyclic) causal models of Halpern (2008) and Halpern & Hitchcock (2010) and define an atomic sentence to be of the form $X = x$ for an exogenous or endogenous variable $X$ in $\mathcal{U} \cup \mathcal{V}$ and a value $x$ in $\mathcal{R}(X)$. Then we do not have to include arbitrary exogenous variables to render $L$ and $M$ endogenous and thus be able to state counterfactual and causal claims with them.

For this to make sense we have to define the truth conditions for sentences in a slightly different way. An atomic sentence $X = x$ is true in $\mathcal{M}$ in $\vec{u}$ just in case all solutions to the equations represented by $\mathcal{F}$ when the exogenous variables are set to $\vec{u}$ assign value $x$ to the exogenous or endogenous variable $X$. Since we keep restricting the discussion to acyclic models which have a unique solution in any context, this means that $X = x$ is true in $\mathcal{M}$ in $\vec{u}$ if and only if $x$ is the value of $X$ in the unique solution to all equations in $\mathcal{M}$ in $\vec{u}$. The truth conditions for negations and conjunctions are again given in the usual way.

A counterfactual $X_1 = x_1 \wedge \ldots \wedge X_k = x_k \mathrel{\Box\!\!\rightarrow_{SE}} \varphi$, or simply $\vec{X} = \vec{x} \mathrel{\Box\!\!\rightarrow_{SE}} \varphi$, is true in $\mathcal{M}$ in $\vec{u}$ just in case $\varphi$ is true in $\mathcal{M}_{\vec{X}=\vec{x}} = (\mathcal{S}_{\vec{X}}, \mathcal{F}^{\vec{X}=\vec{x}})$ in $\vec{u}_{\vec{X}=\vec{x}}$.
The latter model and context result from $\mathcal{M}$ and $\vec{u}$ by replacing the equations for $X_i$ by the equations $X_i = x_i$, $i = 1, \ldots, k$. Formally this means two things (i-ii).

(i) The signature $\mathcal{S}$ is reduced to $\mathcal{S}_{\vec{X}} = (\mathcal{U}, \mathcal{V} \setminus \{X_1, \ldots, X_k\}, \mathcal{R}|_{\mathcal{U} \cup (\mathcal{V} \setminus \{X_1, \ldots, X_k\})})$, where $\mathcal{R}|_{\mathcal{U} \cup (\mathcal{V} \setminus \{X_1, \ldots, X_k\})}$ is $\mathcal{R}$ with its domain restricted from the original $\mathcal{U} \cup \mathcal{V}$ to those variables $\mathcal{U} \cup (\mathcal{V} \setminus \{X_1, \ldots, X_k\})$ that remain after deleting the endogenous variables among $\{X_1, \ldots, X_k\}$.

(ii) $\mathcal{F}$ is reduced to $\mathcal{F}^{\vec{X}=\vec{x}}$ which results from $\mathcal{F}$ by deleting the functions $F_{X_i}$ representing the equations for the endogenous $X_i$ and by changing the remaining functions $F_Y$ in $\mathcal{F} \setminus \{F_{X_1}, \ldots, F_{X_k}\}$ as follows. First, restrict the domain of each $F_Y$ from $\times_{X \in \mathcal{U} \cup \mathcal{V} \setminus \{Y\}} \mathcal{R}(X)$ to $\times_{X \in \mathcal{U} \cup (\mathcal{V} \setminus \{Y, X_1, \ldots, X_k\})} \mathcal{R}(X)$. Second, replace $F_Y$ by $F_Y^{\vec{X}=\vec{x}}$ which results from $F_Y$ by setting $X_1, \ldots, X_k$ to $x_1, \ldots, x_k$, respectively.

The new context $\vec{u}_{\vec{X}=\vec{x}}$ results from the original context $\vec{u}$ as follows. First, set the values of the exogenous variables among $\{X_1, \ldots, X_k\}$ to $x_1, \ldots, x_k$, respectively. Second, leave the values of the other exogenous variables in $\mathcal{U} \setminus \{X_1, \ldots, X_k\}$ as they are in $\vec{u}$.

The definition of actual causation has to be changed slightly: in clause (2) we consider a partition of all variables, exogenous or endogenous, $\mathcal{U} \cup \mathcal{V}$ rather than a partition of the endogenous variables $\mathcal{V}$ only.

The SURVIVAL example (Halpern & Hitchcock 2010: 400) explains why we need ranking functions in addition to the structural equations. Let exogenous variable $A$ take on the value 1 if Assassin does not put in poison, and 0 otherwise. Let exogenous variable $B$ take on the value 1 if Bodyguard puts in antidote, and 0 otherwise. Let endogenous variable $S$ take on the value 1 if Victim survives, and 0 otherwise. The function $F_S : (a, b) \mapsto \max\{a, b\}$ describes the following equation:

• $A$
• $B$
• $S = A \vee B$

The structural equation for the SURVIVAL example is isomorphic to that for the FIRE example, version 2.
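The generalized truth conditions for SE-counterfactuals, and the isomorphism between the two examples, can be illustrated with a small evaluator. This is a minimal sketch under assumptions that are not in the text: each equation is stored as a pair of a parent tuple and a function, the consequent is passed as a Python predicate on the solved world, and the function names `solve` and `counterfactual` are hypothetical.

```python
def solve(equations, context):
    """Unique solution of an acyclic model in a context (raises on cycles)."""
    world = dict(context)
    pending = dict(equations)
    while pending:
        ready = [v for v, (pa, _) in pending.items() if all(p in world for p in pa)]
        if not ready:
            raise ValueError("model is cyclic or refers to an unknown variable")
        for v in ready:
            pa, f = pending.pop(v)
            world[v] = f(*(world[p] for p in pa))
    return world

def counterfactual(equations, context, antecedent, consequent):
    """Evaluate 'antecedent []->_SE consequent' by intervention.

    Exogenous variables in the antecedent are reset in the context;
    endogenous ones get their equation replaced by a constant.
    """
    new_context = {u: antecedent.get(u, val) for u, val in context.items()}
    new_equations = {
        v: ((), (lambda x=antecedent[v]: x)) if v in antecedent else eq
        for v, eq in equations.items()
    }
    return consequent(solve(new_equations, new_context))

# FIRE, version 2, and SURVIVAL share one equation up to relabelling.
fire = {"F": (("L", "M"), max)}      # F = L or M
survival = {"S": (("A", "B"), max)}  # S = A or B
```

For instance, `counterfactual(fire, {"L": 1, "M": 1}, {"M": 0, "L": 0}, lambda w: w["F"] != 1)` and the corresponding call for `survival` with A and B both set to 0 evaluate to the same truth value, which is what the isomorphism amounts to at the level of SE-counterfactuals.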
However, people have different intuitions about the correct causal judgment for these two examples. In the FIRE example, version 2 people say that the arsonist's dropping a lit match is an actual cause of the forest fire if there are lightning and an arsonist dropping a lit match (and a forest fire). In the SURVIVAL example people do not say that Bodyguard's putting in antidote is an actual cause of Victim's survival, if Bodyguard puts in antidote and Assassin does not put in poison (and Victim survives).

This difference in people's intuitions about the correct causal judgment is explained by appeal to normality or typicality. While the structural equation for the SURVIVAL example is isomorphic to that for the FIRE example, version 2, the ordering of normality or typicality for the former differs from that of the latter in the following way.

It is more typical that Assassin does not put in poison ($A = 1$) than that Assassin puts in poison ($A = 0$). It is more typical that Bodyguard does not put in antidote ($B = 0$) than that Bodyguard puts in antidote ($B = 1$). It is more typical that Victim survives ($S = 1$) than that Victim does not survive ($S = 0$). In addition to this the structural equation seems to put a constraint on the ordering of normality or typicality. Even though it is more typical that Victim survives than that Victim does not survive, it is more typical that Assassin puts in poison and Bodyguard does not put in antidote and Victim does not survive than that Assassin puts in poison and Bodyguard does not put in antidote and Victim survives.

This helps us see why Bodyguard's putting in antidote is no actual cause of Victim's survival, if Bodyguard puts in antidote and Assassin does not put in poison and Victim survives, that is, if $A = 1$, $B = 1$, and $S = 1$.

1. $B = 1$ and $S = 1$ are true in $\mathcal{M}$ in $(a, b) = (1, 1)$; but
2. for the partition $\{\{B, S\}, \{A\}\}$ (and any other partition) there are no values $b$ and $a$ of $B$ and $A$ with $\varrho_{(1,1)}(B = b \wedge A = a) \leq \varrho_{(1,1)}(w_{(1,1)})$ and: $(B, S) = (1, 1)$ is true in $\mathcal{M}$ in $(1, 1)$ and
   (a) $B = b \wedge A = a \mathrel{\Box\!\!\rightarrow_{SE}} S \neq 1$ is true in $\mathcal{M}$ in $(1, 1)$, and so are
   (b) $B = 1 \mathrel{\Box\!\!\rightarrow_{SE}} S = 1$, $B = 1 \wedge S = 1 \mathrel{\Box\!\!\rightarrow_{SE}} S = 1$, $B = 1 \wedge A = a \mathrel{\Box\!\!\rightarrow_{SE}} S = 1$, $B = 1 \wedge A = a \wedge S = 1 \mathrel{\Box\!\!\rightarrow_{SE}} S = 1$; and
3. there is no proper subset of $\{B\}$ such that 1. and 2. hold.

The reason is that the values $b$ and $a$ of $B$ and $A$ needed for $B = b \wedge A = a \mathrel{\Box\!\!\rightarrow_{SE}} S \neq 1$ to come out true in $\mathcal{M}$ in $(1, 1)$ are 0 and 0. However, any world in which Bodyguard does not put in antidote and Assassin puts in poison, i.e. where $B = 0 \wedge A = 0$ is true, is less typical than the actual world $w_{(1,1)}$ where Bodyguard puts in antidote and Assassin does not put in poison – or so Halpern & Hitchcock (2010: sct. 5) claim.

In fact, however, this is not true for the ranking function used by Halpern & Hitchcock (2010). Their ranking function assigns rank 1 both to the world that would be needed, where Bodyguard does not put in antidote and Assassin puts in poison, and to the actual world, where Bodyguard puts in antidote but Assassin does not put in poison. What is true, though, is that the world that would be needed where Bodyguard does not put in antidote and Assassin puts in poison is less typical than the most typical world where Assassin does not put in poison, viz. the world where Bodyguard does not put in antidote and Assassin does not put in poison.

We therefore have to slightly adjust the definition of actual causation (in the spirit of Hitchcock 2007, who also refers to the actual value of $\vec{W}$ rather than the actual world) as follows: in condition (2), $\varrho_{\vec{u}}(\vec{X} = \vec{x}' \wedge \vec{W} = \vec{w}) \leq \varrho_{\vec{u}}(\vec{W} = \vec{w}_{\vec{u}})$, where $\vec{w}_{\vec{u}}$ is the actual value of $\vec{W}$ in model $\mathcal{M}$ in context $\vec{u}$.

For the sake of completeness I state the slightly revised definition of actual causation in extended acyclic causal models: $X_1 = x_1 \wedge \ldots \wedge X_k = x_k$, or simply $\vec{X} = \vec{x}$, is an actual cause of $\varphi$ in the extended acyclic causal model $\mathcal{M} = (\mathcal{S}, \mathcal{F}, (\varrho_{\vec{u}})_{\vec{u} \in \mathcal{R}(\mathcal{U})})$ in context $\vec{u}$ if and only if:

1. $\vec{X} = \vec{x}$ and $\varphi$ are true in $\mathcal{M}$ in $\vec{u}$.
2. There is a partition $\{\vec{Z}, \vec{W}\}$ of all variables, exogenous or endogenous, $\mathcal{U} \cup \mathcal{V}$ with $\vec{X} \subseteq \vec{Z}$, and there are vectors of values $\vec{x}'$ and $\vec{w}$ of $\vec{X}$ and $\vec{W}$, respectively, with $\varrho_{\vec{u}}(\vec{X} = \vec{x}' \wedge \vec{W} = \vec{w}) \leq \varrho_{\vec{u}}(\vec{W} = \vec{w}_{\vec{u}})$ such that: if $\vec{Z} = \vec{z}^*$ is true in $\mathcal{M}$ in $\vec{u}$, then
   (a) $\vec{X} = \vec{x}' \wedge \vec{W} = \vec{w} \mathrel{\Box\!\!\rightarrow_{SE}} \neg\varphi$ is true in $\mathcal{M}$ in $\vec{u}$; and
   (b) for all $\vec{W}^- \subseteq \vec{W}$ and all $\vec{Z}^- \subseteq \vec{Z}$: $\vec{X} = \vec{x} \wedge \vec{W}^- = \vec{w} \wedge \vec{Z}^- = \vec{z}^* \mathrel{\Box\!\!\rightarrow_{SE}} \varphi$ is true in $\mathcal{M}$ in $\vec{u}$.
3. There is no proper subset $\vec{X}^-$ of $\vec{X}$ such that 1. and 2. hold for $\vec{X}^-$.

This completes the first step of my argument as it was outlined in the Introduction. In a second step I now want to step back from Halpern & Hitchcock's (2010) interpretation of the ranking functions $\varrho_{\vec{u}}$. Instead of interpreting them solely in terms of normality or typicality, I propose to interpret them as that notion – let us call it (counterfactual) distance – that gives truth conditions to counterfactuals. In the way I propose to interpret them, ranking functions represent a modality, the modality of counterfactuality, that is as objective as counterfactuals are. Therefore I will refer to them as objective ranking functions.

Counterfactual distance figures as a primitive on my account. It is the same notion that Stalnaker (1968) and Lewis (1973a; 1979) interpret in terms of overall similarity between possible worlds. While I do not think that overall similarity is an adequate interpretation of counterfactual distance (else I would not treat the latter as primitive), it may be helpful to the reader to think of objective ranking functions as formalizing overall similarity.

This formalization in terms of objective ranking functions differs slightly[3] from Stalnaker's (1968) formalization in terms of selection functions and from Lewis' (1973a) formalization in terms of a system of spheres.
However, these slight differences do not affect the logic of counterfactuals in any way that is relevant for present purposes.⁴

Interim report: I have taken Halpern's (2008) notion of an extended (acyclic) causal model in terms of which Halpern & Hitchcock (2010) define actual causation. First I have slightly generalized these models by indexing the ranking functions in them to the contexts rather than assuming one fixed ranking function for all contexts. Then I have further generalized these models in the spirit of Hitchcock (2007) by dropping the restriction that only endogenous variables can be causally efficacious. Finally, after fixing a small bug in the definition of actual causation I have re-interpreted the ranking functions in these generalized extended (acyclic) causal models objectively as that notion which gives truth conditions to counterfactuals. This completes the first and second step of my argument as it was outlined in the Introduction. In the next three sections I will carry out the third step.

³ The difference is that the limit assumption, which is rejected by Lewis (1973), holds on the formalization in terms of objective ranking functions.

⁴ One reason why I think that similarity is not an adequate interpretation of counterfactual distance is that the axiom(s) of strong centering (and weak centering) come out as (analytic) truths on this interpretation. I think that neither strong centering nor weak centering holds for counterfactuals. For criticism of similarity see Hájek (ms). For criticism of weak centering and strong centering see Leitgeb (2012a; 2012b) and Menzies (2004: sct. 6).

4 Laws and Counterfactuality

As stressed by Collins & Hall & Paul (2004: 2ff) the logical properties of the counterfactual conditional do not suffice for a counterfactual theory of causation, if only because they do not exclude backtracking counterfactuals.
This is why Lewis (1979) imposes four constraints on the similarity relation that governs the logic of counterfactuals on his account, in addition to its defining features that fix the logical properties of the counterfactual conditional via the system VC. I will impose two constraints as well.⁵

The first constraint concerns the relation between structural equations and ranking functions and is a strong-dominance version of Lewis' (1979: 472) conditions that "[i]t is of the first importance to avoid big, widespread, diverse violations of law" and that "[i]t is of the third importance to avoid even small, localized, simple violations of law", except that it is relative to the causal model (see Menzies 2004).

We start with some terminology relative to an extended acyclic causal model $\mathcal{M} = (\mathcal{S}, \mathcal{F}, (\varrho_{\vec{u}})_{\vec{u} \in \mathcal{R}(\mathcal{U})})$. Say that a world $w = (\vec{u}, v_1, \ldots, v_n)$ violates the equation for the endogenous variable $V_i$ if and only if $v_i \neq F_i(\vec{u}, v_1, \ldots, v_{i-1}, v_{i+1}, \ldots, v_n)$. Let $\mathcal{V}^*(w) \subseteq \mathcal{V}$ be the set of endogenous variables $V_i$ such that $w$ violates the equation for $V_i$. Next say that a world $w$ weakly Halpern-dominates a world $w'$ if and only if for each endogenous variable $X \in \mathcal{V}^*(w) \setminus \mathcal{V}^*(w')$ there is an endogenous variable $X' \in \mathcal{V}^*(w') \setminus \mathcal{V}^*(w)$ such that $X' \in An(X)$. Finally say that a world $w$ strongly Halpern-dominates a world $w'$ if and only if $w$ weakly Halpern-dominates $w'$, but $w'$ does not weakly Halpern-dominate $w$ (and so $\mathcal{V}^*(w') \setminus \mathcal{V}^*(w)$ is not empty).

Now we are in a position to formulate our first constraint. The idea is that worlds that violate certain equations and then some are (counterfactually) more distant than worlds that violate only certain equations. However, since a violation of the equation for an endogenous variable early on in the causal hierarchy affects everything causally downstream of that variable, a violation early on is worse – infinitely worse – than a violation later on.
If we adopt the terminology of Lewis (1979), a violation of an equation early on in the causal hierarchy amounts to an infinitely bigger miracle than a violation of an equation later on. This is why the first constraint has to be stated in terms of ancestors.⁶

⁵ In stressing that it is an art to come up with an appropriate model for a given scenario or case Hitchcock (2007) states various constraints on appropriate models. His constraints concern the relation between the model and the case to be modeled. In contrast to these the constraints I will impose are inherent to the model and independent of the case to be modeled.

An extended acyclic causal model $\mathcal{M} = (\mathcal{S}, \mathcal{F}, (\varrho_{\vec{u}})_{\vec{u} \in \mathcal{R}(\mathcal{U})})$ respects the equations if and only if the following holds for all worlds $w$ and $w'$ in $\mathcal{W}$: if $w$ strongly Halpern-dominates $w'$, then it holds for all contexts $\vec{u}$ in $\mathcal{R}(\mathcal{U})$: $\varrho_{\vec{u}}(w) < \varrho_{\vec{u}}(w')$.⁷

The idea behind respect for the equations is quite simple. First associate with each world the set of endogenous variables whose equation the world violates. Then, when comparing two given worlds for (counterfactual) distance, ignore those endogenous variables whose equations are violated by both worlds. Finally check whether, among the remaining endogenous variables, for each endogenous variable whose equation is violated by the first world there is an endogenous variable that is causally upstream and whose equation is violated by the second world. In addition, check whether the converse is not true. In other words, check if any violation in the first world is compensated for by a violation in the second world that is worse, because it is further up in the causal hierarchy. In addition check if the converse is not true. If so, then the first world is (counterfactually) less distant, or closer, to any world than the second world.
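The comparison just described can be sketched in a few lines of Python. This is only an illustration under assumptions of my own choosing: the toy model (exogenous `U`, endogenous `V1 := U` and `V2 := V1`), the names `violated`, `weakly_dominates`, `strongly_dominates` and `respects_equations`, and the two sample rankings are hypothetical and not part of the account itself. The check is carried out for a single ranking function; respect for the equations requires it for the ranking function of every context.

```python
from itertools import product

# Hypothetical toy model (not from the paper): exogenous U, endogenous V1 := U and V2 := V1.
VARS = ("U", "V1", "V2")
EQUATIONS = {"V1": lambda w: w["U"], "V2": lambda w: w["V1"]}
ANCESTORS = {"V1": {"U"}, "V2": {"U", "V1"}}  # An(X) for each endogenous variable X
WORLDS = list(product((0, 1), repeat=len(VARS)))


def violated(world):
    """V*(w): the endogenous variables whose equation the world violates."""
    w = dict(zip(VARS, world))
    return {name for name, f in EQUATIONS.items() if w[name] != f(w)}


def weakly_dominates(w1, w2):
    """Every violation peculiar to w1 is matched by an upstream violation peculiar to w2."""
    only_w1 = violated(w1) - violated(w2)
    only_w2 = violated(w2) - violated(w1)
    return all(any(y in ANCESTORS[x] for y in only_w2) for x in only_w1)


def strongly_dominates(w1, w2):
    """w1 weakly dominates w2, but not conversely."""
    return weakly_dominates(w1, w2) and not weakly_dominates(w2, w1)


def respects_equations(rank):
    """Check the constraint for one ranking function, given as a dict world -> rank."""
    return all(rank[w1] < rank[w2]
               for w1 in WORLDS for w2 in WORLDS
               if strongly_dominates(w1, w2))


# Assumed sample ranks: an early violation (V1) costs 2, a late one (V2) costs 1.
weighted = {w: 2 * ("V1" in violated(w)) + ("V2" in violated(w)) for w in WORLDS}
flat = {w: 0 for w in WORLDS}

# A late violation strongly dominates (is closer than) an early one.
print(strongly_dominates((1, 1, 0), (1, 0, 0)))
print(respects_equations(weighted), respects_equations(flat))
```

In this toy model the world violating only the equation for `V2` strongly Halpern-dominates the world violating only the equation for `V1`, and the flat ranking fails the constraint because it does not separate law-abiding from law-violating worlds.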
If the second world violates all the equations that are violated by the first world and then some we have the special case where, after ignoring the common violations, no violations in the first world are left.

We are approaching the summit of this paper. My aim is to show that by objectively interpreting the ranking functions in them, causal models respecting the equations can be subsumed under so called "counterfactual models", because the ranking functions thus interpreted yield all structural equations. In fact, counterfactual models give us more than causal models, because they define truth conditions for counterfactuals with arbitrary antecedents, something that is hard to come by in the structural equations approach (Briggs 2012, Halpern 2008: sct. 5). Furthermore, in counterfactual models counterfactuals may not only be embedded, but can also be iterated. Finally, the sentences in the formal language for counterfactual models are built up from exogenous and endogenous variables.

⁶ Woodward (2003: 141) can be read as endorsing our first constraint when he points to the following "important general difference between Lewis's scheme and the manipulationist picture. On the manipulationist account [...] "[l]ate" miracles, even numerous, are automatically preferred to "early" miracles, even if single. By contrast, in Lewis's theory, whether we [...] insert many late miracles [...] or whether instead we [insert some early miracle] [...] depends on whether [the effects] have many causes or just one. This sort of sensitivity leads to the insertion of miracles in what, intuitively, is the wrong place."

⁷ The formulation of respect for the equations has undergone several changes. The present one is due to Joseph Y. Halpern, for whose many most helpful comments and suggestions I am very grateful.

Here is the definition.
$\mathcal{M}^* = (\mathcal{S}, (\varrho_w)_{w \in \mathcal{W}})$ is a counterfactual model if and only if $\mathcal{S} = (\mathcal{U}, \mathcal{V}, \mathcal{R})$ is a signature and, for each world $w$ in $\mathcal{W}$, $\varrho_w : \mathcal{W} \to \mathbb{N}$ is a ranking function on $\mathcal{W}$.

Rather than indexing the ranking functions to the context $\vec{u}$ or the "legal" world $w_{\vec{u}}$ determined by that context, ranking functions are now indexed to the set of all possible worlds. The reason is that truth is a relation between sentences and possible worlds, and not between sentences and contexts (or between sentences and "legal" worlds). This makes it necessary to be explicit about the exogenous variables. From now on $\mathcal{U}$ is the set of $m$ exogenous variables $\{U_1, \ldots, U_m\}$.

An atomic sentence $X_i = x$, $i = 1, \ldots, m + n$, is true in $\mathcal{M}^*$ in world $w \in \mathcal{W}$ if and only if $w \in \{(u_1, \ldots, u_m, v_1, \ldots, v_n) = (x_1, \ldots, x_{m+n}) \in \mathcal{W} : x_i = x\}$. Negations and conjunctions are defined as usual, and where $\varphi$ and $\psi$ are arbitrary sentences, the counterfactual $\varphi \mathrel{\Box\!\!\to} \psi$ is true in the counterfactual model $\mathcal{M}^*$ in the world $w$ just in case all $\varrho_w$-minimal $\varphi$-worlds are $\psi$-worlds. The system V is sound and complete with respect to this semantics (Huber ms 2).

In a causal model the structural equations are given and then used to define truth conditions for a limited set of counterfactual conditionals. In a counterfactual model the counterfactual conditionals are given via the ranking functions $\varrho_w$. Therefore we have to say what it means for a structural equation represented by some function $F$ to hold in a counterfactual model. For this we first restrict the functions $F$ to those from $\mathcal{W}_i = \times_{X \in \mathcal{U} \cup \mathcal{V} \setminus \{V_i\}} \mathcal{R}(X)$ into $\mathcal{R}(V_i)$, for some endogenous variable $V_i$ from $\mathcal{V}$. Call such a function eligible for $V_i$. A function $F : \mathcal{W}_i \to \mathcal{R}(V_i)$, which is eligible for $V_i$, holds in a counterfactual model $\mathcal{M}^*$ just in case, for every world $w$ in $\mathcal{W}$, the following counterfactuals are all true in $\mathcal{M}^*$ in $w$: $\overrightarrow{\mathcal{U} \cup \mathcal{V} \setminus \{V_i\}} = \vec{w}_i \mathrel{\Box\!\!\to} V_i = F_i(\vec{w}_i)$, where $\vec{w}_i$ is in $\mathcal{W}_i$.
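The truth condition for counterfactuals and the notion of an eligible function holding in a counterfactual model can be sketched in Python as follows. The sketch rests on assumptions of my own: a hypothetical signature with one binary exogenous variable `U` and one binary endogenous variable `V`, a family of ranking functions that is the same at every world, and the helper names `counterfactual` and `holds`. It is meant as an illustration, not as the official semantics.

```python
from itertools import product

VARS = ("U", "V")  # hypothetical signature: one exogenous and one endogenous variable
WORLDS = list(product((0, 1), repeat=len(VARS)))


def counterfactual(rank_w, antecedent, consequent):
    """True iff all rank_w-minimal antecedent-worlds are consequent-worlds.

    Vacuously true if there is no antecedent-world.
    """
    phi_worlds = [x for x in WORLDS if antecedent(x)]
    if not phi_worlds:
        return True
    least = min(rank_w[x] for x in phi_worlds)
    return all(consequent(x) for x in phi_worlds if rank_w[x] == least)


def holds(func, i, ranks):
    """Does func, eligible for VARS[i], hold in the model given by ranks?

    func maps the values of all variables other than VARS[i] to a value of VARS[i];
    ranks maps each world of evaluation to its ranking function.
    """
    for w in WORLDS:  # world of evaluation
        for target in WORLDS:  # enumerates the maximally specific antecedents
            rest = target[:i] + target[i + 1:]
            antecedent = lambda x, rest=rest: x[:i] + x[i + 1:] == rest
            consequent = lambda x, rest=rest: x[i] == func(rest)
            if not counterfactual(ranks[w], antecedent, consequent):
                return False
    return True


# Assumed ranking functions: at every world, violating V = U costs rank 1.
ranks = {w: {x: int(x[1] != x[0]) for x in WORLDS} for w in WORLDS}

print(holds(lambda rest: rest[0], 1, ranks))  # the equation V = U
print(holds(lambda rest: 1, 1, ranks))        # the constant function V = 1
```

With these assumed ranks the function for `V = U` holds, whereas the constant function does not, because the closest world with `U = 0` has `V = 0`.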
For an eligible function $F$ to hold in a counterfactual model the above counterfactuals must be true in every world in that model. In contrast to counterfactuals in general, whose truth value is world-dependent, the structural equations hold world-independently. In this sense they are necessarily true. Therefore talk of "(causal) laws" is appropriate.

My thesis is that the one modality of counterfactuality suffices for actual causation and causality in general. We have seen why to subscribe to the insufficiency thesis according to which "there must be more to causality than just the structural equations." We should not infer from the insufficiency thesis that we need a second modality. What we should infer from the insufficiency thesis is that the limited set of counterfactuals we get from the structural equations is not enough to represent the one relevant modality of counterfactuality.

To put it bluntly: structural equations are insufficient and unnecessary for causality. They are insufficient because they do not give us all counterfactuals, and because they do not give us all correct causal claims. They are unnecessary because we get them for free once we have moved beyond them, on to objective ranking functions. This is the content of the following theorem, which completes the first part of the third step of my argument as it was outlined in the Introduction. The second part of the third step follows in the next section.

Theorem 1 For each extended acyclic causal model $\mathcal{M} = (\mathcal{S}, \mathcal{F}, (\varrho_{\vec{u}})_{\vec{u} \in \mathcal{R}(\mathcal{U})})$ which respects the equations there is a counterfactual model $\mathcal{M}^* = (\mathcal{S}, (\varrho_w)_{w \in \mathcal{W}})$ such that:

SE $F_i$ holds in $\mathcal{M}$ iff $F_i$ holds in $\mathcal{M}^*$.

D For all $\vec{u} \in \mathcal{R}(\mathcal{U})$ and all $w \in \mathcal{W}$: $\varrho_{\vec{u}}(w) = \varrho_{w_{\vec{u}}}(w)$, where $w_{\vec{u}}$ is the unique solution to all equations in $\mathcal{M}$ in $\vec{u}$.

Proof: Appendix 7.1.
$\Box$

5 Counterfactuality and Actuality

Let us look at the counterfactual models for our two examples if we use the ranking functions from Halpern & Hitchcock (2010) and evaluate counterfactuals in terms of them rather than the structural equations.

In the SURVIVAL example it is false in the actual context where Assassin does not put in poison and Bodyguard puts in antidote (and Victim survives) that Victim would not survive if Bodyguard did not put in antidote, $B = 0 \mathrel{\Box\!\!\to} S \neq 1$. The reason is that one of the (counterfactually) least distant, or closest, worlds where Bodyguard does not put in antidote, viz. the world where Bodyguard does not put in antidote, Assassin does not put in poison, and Victim survives, is a world where Victim survives.

In the FIRE example, version 2 it is true in the actual context where there are lightning and an arsonist dropping a lit match (and a forest fire) that there would be no forest fire if there were no arsonist dropping a lit match, $M = 0 \mathrel{\Box\!\!\to} F \neq 1$. The reason is that all the (counterfactually) least distant, or closest, worlds where there is no arsonist dropping a lit match, viz. the world where there is no arsonist dropping a lit match, no lightning, and no forest fire, are also worlds where there is no forest fire.

This means that theorem 1 is not enough. For it is not true that there would be no lightning if there were no arsonist dropping a lit match. On the contrary, even if there were no arsonist dropping a lit match, there would still be lightning, and hence there would still be a forest fire. This is also how the counterfactual $M = 0 \mathrel{\Box\!\!\to_{SE}} F \neq 1$ is evaluated according to the structural models approach of Halpern & Hitchcock (2010). This highlights the fact that the counterfactuals defined in terms of the structural equations of a causal model and the counterfactuals defined in terms of a counterfactual model may differ even if all and only the structural equations of the causal model hold in the counterfactual model.
So far the only counterfactuals the two approaches agree on are those with maximally specific antecedents: $\overrightarrow{\mathcal{U} \cup \mathcal{V} \setminus \{V_i\}} = \vec{w}_i \mathrel{\Box\!\!\to_{(SE)}} V_i = F_i(\vec{w}_i)$, where $\vec{w}_i$ is in $\mathcal{W}_i$. These are the necessarily true "(causal) laws" that are true in all worlds or contexts.

Defeat is not the appropriate reaction to this mismatch, though. What the mismatch shows is that we cannot define a counterfactual $\varphi \mathrel{\Box\!\!\to} \psi$ to be true in a world $w$ in a model $\mathcal{M}$ if and only if all $\varrho_w$-minimal antecedent worlds are consequent worlds and interpret $\varrho_w$ solely in terms of normality or typicality. For that means that $\varphi \mathrel{\Box\!\!\to} \psi$ is true if $\varphi$-worlds normally are $\psi$-worlds. And that is not right. More specifically, that is too weak.

The LIGHTNING example due to Christopher R. Hitchcock (personal correspondence) helps us see what is still missing to get the counterfactuals right. Let exogenous variable $L$ take on the value 1 if there is lightning, and 0 otherwise. Let endogenous variable $F$ take on the value 1 if there is a forest fire, and 0 otherwise. The function $F_F : l \mapsto f$ describes the following equation:

[Figure: causal graph with the two nodes $L$ and $F$]

$F = L$

The equation says that there would be a forest fire if there were lightning. In the context where there is lightning, $L = 1$, we want to say that (even) if there were no forest fire there would (still) be lightning, $F = 0 \mathrel{\Box\!\!\to} L = 1$. That is, we do not want our counterfactuals to backtrack. However, the world where there is lightning and no forest fire violates the equation, whereas the world where there is no lightning and no forest fire does not. Therefore, if all we require is respect for the equations we get the wrong result that, in the context where there is lightning, there would be no lightning if there were no forest fire, $F = 0 \mathrel{\Box\!\!\to} L = 0$. In order to get the right result that there would (still) be lightning, (even) if there were no forest fire, we additionally need to hold fixed what is actually true in the context of evaluation.
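The LIGHTNING example can be replayed in a small Python sketch. The two ranking functions below are assumptions chosen for illustration: `equations_only` counts violated equations and so merely respects the equations, while `focused` additionally penalizes, by an arbitrarily chosen weight of 2, disagreement with the actual value of the exogenous variable $L$. The names `violations` and `counterfactual` are likewise hypothetical.

```python
# Worlds are pairs (l, f): l = 1 iff there is lightning, f = 1 iff there is a forest fire.
WORLDS = [(l, f) for l in (0, 1) for f in (0, 1)]
ACTUAL = (1, 1)  # the context with lightning, hence with a forest fire


def violations(world):
    """Number of violated equations; the only equation is F = L."""
    l, f = world
    return int(f != l)


def counterfactual(rank, antecedent, consequent):
    """True iff all rank-minimal antecedent-worlds are consequent-worlds."""
    phi_worlds = [w for w in WORLDS if antecedent(w)]
    if not phi_worlds:
        return True
    least = min(rank[w] for w in phi_worlds)
    return all(consequent(w) for w in phi_worlds if rank[w] == least)


# Assumed ranking that merely respects the equations.
equations_only = {w: violations(w) for w in WORLDS}
# Assumed ranking that first holds fixed the actual value of L, then respects the equations.
focused = {w: 2 * int(w[0] != ACTUAL[0]) + violations(w) for w in WORLDS}

no_fire = lambda w: w[1] == 0
lightning = lambda w: w[0] == 1
no_lightning = lambda w: w[0] == 0

# Respect for the equations alone lets the counterfactual backtrack.
print(counterfactual(equations_only, no_fire, no_lightning))
# Holding fixed what is actually the case blocks the backtracking.
print(counterfactual(focused, no_fire, lightning))
```

Under `equations_only` the closest world without a forest fire is the law-abiding world without lightning, so the counterfactual backtracks. Under `focused` the closest such world keeps the lightning and violates the equation instead.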
When we formulate the antecedent of a counterfactual we keep fixed as much of the actual context as is consistent with the antecedent. In the LIGHTNING example we keep fixed that there is lightning. The same is true of the FIRE example, version 2, where we also keep fixed that there is lightning. That is why it is true that if there were no arsonist dropping a lit match there would still be lightning, and hence there would still be a forest fire.⁸

⁸ Note that we cannot hold fixed everything that is consistent with the antecedent. Consider the counterfactual 'If there were no lightning or no arsonist dropping a lit match, there would still be a forest fire.' This counterfactual has no truth-value on the structural models approach, even in its generalized form, because the antecedent is a disjunction. On our counterfactual models account this counterfactual does have a truth-value. Its antecedent is consistent with there being lightning. Its antecedent is also consistent with there being an arsonist dropping a lit match. However, its antecedent is not consistent with there jointly being lightning as well as an arsonist dropping a lit match. Thus we cannot hold fixed everything that is consistent with the antecedent. Nor can we hold fixed only what is common to all antecedent-worlds. For then we would only consider worlds where there is neither lightning nor an arsonist dropping a lit match. The worlds we want to consider are such that either there is lightning but no arsonist dropping a lit match, or else there is no lightning but an arsonist dropping a lit match. For it is those worlds that hold fixed as much of the actual context as is consistent with the antecedent.

Consequently the second constraint concerns the relation between ranking functions and actuality.
It is a strong-dominance version of Lewis' (1979: 472) condition that "[i]t is of the second importance to maximize the spatio-temporal region throughout which perfect match of particular fact prevails", except that it is relative to the causal model (again, see Menzies 2004).

As before we start with some terminology relative to an extended acyclic causal model $\mathcal{M} = (\mathcal{S}, \mathcal{F}, (\varrho_{\vec{u}})_{\vec{u} \in \mathcal{R}(\mathcal{U})})$. Say that a world $w = (u_1, \ldots, u_m, \vec{v})$ differs from a world $w^+ = (u_1^+, \ldots, u_m^+, \vec{v}^+)$ in the value for the exogenous variable $U_i$ if and only if $u_i \neq u_i^+$. Let $\mathcal{U}^*_{w^+}(w)$ be the set of exogenous variables for whose value $w$ differs from $w^+$. Next say that a world $w$ weakly dominates a world $w'$ in terms of focus on a world $w^+$ if and only if $\mathcal{U}^*_{w^+}(w) \subseteq \mathcal{U}^*_{w^+}(w')$. Finally say that a world $w$ strongly dominates a world $w'$ in terms of focus on a world $w^+$ if and only if $w$ weakly dominates $w'$ in terms of focus on $w^+$, but $w'$ does not weakly dominate $w$ in terms of focus on $w^+$.

Now we are in a position to formulate our second constraint. The idea is that worlds that differ from the actual world in the values of certain exogenous variables and then some are (counterfactually) more distant from the actual world than worlds that differ from the actual world only in the values for certain exogenous variables. In contrast to the global constraint of respect for the equations focus on actuality is a local constraint. This is so because what is actual varies from context to context. And that is why we now quantify over contexts at the beginning of the relevant clause.

An extended acyclic causal model $\mathcal{M} = (\mathcal{S}, \mathcal{F}, (\varrho_{\vec{u}})_{\vec{u} \in \mathcal{R}(\mathcal{U})})$ is focused on actuality if and only if the following holds for all contexts $\vec{u}$ in $\mathcal{R}(\mathcal{U})$ and all worlds $w$ and $w'$ in $\mathcal{W}$: if $w$ strongly dominates $w'$ in terms of focus on the world $w_{\vec{u}}$, then: $\varrho_{\vec{u}}(w) < \varrho_{\vec{u}}(w')$.

However, we cannot simply demand of an extended acyclic causal model that it satisfy focus on actuality in addition to respect for the equations.
Focus on actuality is more important than respect for the equations, as the above example shows. For this reason, as well as to make sure that the two constraints do not conflict with each other, respect for the equations has to be restricted to worlds which agree on the values for the exogenous variables in $\mathcal{U}$. This means that we have a system of priorities rather than a system of weights (cf. Lewis 1979: 472, Kroedel & Huber forthcoming). Its content is that extended acyclic causal models have to be focused on actuality and subsequently respect the equations in the following sense. (Note that I have omitted this point in outlining my argument in the Introduction.)

An extended acyclic causal model $\mathcal{M} = (\mathcal{S}, \mathcal{F}, (\varrho_{\vec{u}})_{\vec{u} \in \mathcal{R}(\mathcal{U})})$ is focused on actuality and subsequently respects the equations if and only if $\mathcal{M}$ is focused on actuality and the following holds for all worlds $w$ and $w'$ in $\mathcal{W}$ that agree on the values of the exogenous variables $\mathcal{U}$: if $w$ strongly Halpern-dominates $w'$, then it holds for all contexts $\vec{u}$ in $\mathcal{R}(\mathcal{U})$: $\varrho_{\vec{u}}(w) < \varrho_{\vec{u}}(w')$.

For extended acyclic causal models which are focused on actuality and subsequently respect the equations the mismatch between the truth-values of counterfactuals in the structural models approach and in the counterfactual models account disappears. This is the content of the following theorem.

Theorem 2 For each extended acyclic causal model $\mathcal{M} = (\mathcal{S}, \mathcal{F}, (\varrho_{\vec{u}})_{\vec{u} \in \mathcal{R}(\mathcal{U})})$ which is focused on actuality and subsequently respects the equations there is a counterfactual model $\mathcal{M}^* = (\mathcal{S}, (\varrho_w)_{w \in \mathcal{W}})$ such that:

C For all statements $\varphi$ in the language of the generalized version of Halpern & Hitchcock (2010) and all contexts $\vec{u} \in \mathcal{R}(\mathcal{U})$: $\varphi$ is true in $\mathcal{M}$ in $\vec{u}$ according to the structural equations approach iff $\varphi$ is true in $\mathcal{M}^*$ in $w_{\vec{u}}$ according to the counterfactual models account.

Proof: Appendix 7.2. $\Box$

This almost completes the third step of my argument as it was outlined in the Introduction.
There is one more twist to the story that will be the topic of the next section when we put things together. However, before doing so I want to present a slightly different formulation of focus on actuality and subsequent respect for the equations that may be more accessible.

Respect for the equations is a global constraint on the endogenous variables and the structural equations governing them. Focus on actuality is a local constraint on the exogenous variables and their values in a given context. The distinction between exogenous and endogenous variables is relative to the model, and an exogenous variable may become endogenous if one refines a model by including further variables. Therefore one may sometimes want to think of the exogenous variables as potentially endogenous, governed by structural equations that are temporarily set to a constant value for practical purposes, say, for the model to be simple.

From this point of view it is natural to adopt the following terminology relative to an extended acyclic causal model $\mathcal{M} = (\mathcal{S}, \mathcal{F}, (\varrho_{\vec{u}})_{\vec{u} \in \mathcal{R}(\mathcal{U})})$. Say that the equation for an exogenous variable $U_j$ in context $\vec{u} = (u_1, \ldots, u_m)$ is represented by the constant function $F_{u_j} : (u_1, \ldots, u_{j-1}, u_{j+1}, \ldots, u_m, \vec{v}) \mapsto u_j$ from $\times_{X \in \mathcal{U} \cup \mathcal{V} \setminus \{U_j\}} \mathcal{R}(X)$ into $\mathcal{R}(U_j)$. Next say that a world $w = (u_1, \ldots, u_m, \vec{v})$ violates the equation for the exogenous variable $U_j$ in context $\vec{u}^+ = (u_1^+, \ldots, u_m^+)$ if and only if $u_j \neq F_{u_j^+}(u_1, \ldots, u_{j-1}, u_{j+1}, \ldots, u_m, \vec{v}) = u_j^+$. Let $\mathcal{X}^*_{\vec{u}}(w) \subseteq \mathcal{U} \cup \mathcal{V}$ be the set of exogenous or endogenous variables $X$ such that $w$ violates the equation for $X$ (in context $\vec{u}$).
Finally say that an extended acyclic causal model $\mathcal{M} = (\mathcal{S}, \mathcal{F}, (\varrho_{\vec{u}})_{\vec{u} \in \mathcal{R}(\mathcal{U})})$ is respectful if and only if the following holds for all contexts $\vec{u}$ and all worlds $w$ and $w'$ in $\mathcal{W}$: if for each exogenous or endogenous variable $X \in \mathcal{X}^*_{\vec{u}}(w) \setminus \mathcal{X}^*_{\vec{u}}(w')$ there is an exogenous or endogenous variable $X' \in \mathcal{X}^*_{\vec{u}}(w') \setminus \mathcal{X}^*_{\vec{u}}(w)$ such that $X' \in An(X)$, but the converse does not hold, then: $\varrho_{\vec{u}}(w) < \varrho_{\vec{u}}(w')$.

Respectfulness is a mixed constraint on the exogenous and endogenous variables of a model, the values of the former in a given context, and the structural equations governing the latter in all contexts. It unifies the prioritized combination of focus on actuality and subsequent respect for the equations and allows us to state the following (strictly weaker) corollary of theorem 2.

Theorem 3 For each extended acyclic causal model $\mathcal{M} = (\mathcal{S}, \mathcal{F}, (\varrho_{\vec{u}})_{\vec{u} \in \mathcal{R}(\mathcal{U})})$ which is respectful there is a counterfactual model $\mathcal{M}^* = (\mathcal{S}, (\varrho_w)_{w \in \mathcal{W}})$ such that:

C For all statements $\varphi$ in the language of the generalized version of Halpern & Hitchcock (2010) and all contexts $\vec{u} \in \mathcal{R}(\mathcal{U})$: $\varphi$ is true in $\mathcal{M}$ in $\vec{u}$ according to the structural equations approach iff $\varphi$ is true in $\mathcal{M}^*$ in $w_{\vec{u}}$ according to the counterfactual models account.

6 Beyond Structural Equations

It is time to put things together. Typicality and actuality can come apart. Actuality matters for counterfactuality. So, one might think, even counterfactual models are insufficient for causality. However, consider Spohn's (2006) account of causation. He starts out with a ranking function $\varrho$ over a set of possible worlds which is generated by a set of singular variables in the same way as ours. Spohn interprets the ranking function $\varrho$ subjectively in terms of grades of disbelief.
He defines actual causation in terms of the conditional ranking function $\varrho(\cdot \mid H_w)$, where $H_w$ is the complete history of the actual world $w$ up to right before the effect, but excluding the cause (a temporal ordering relation over the variables allows Spohn to give a precise definition of this clause). So the seemingly objective nature of actual causation in this purely subjective account is partially captured by conditionalizing on what is actually the case. This paves the way for the final move, suggested by Wolfgang Spohn (personal correspondence).

Let us follow Halpern & Hitchcock (2010) in interpreting the unconditional ranking functions in terms of typicality. Furthermore, suppose our extended acyclic causal model $\mathcal{M} = (\mathcal{S}, \mathcal{F}, (\varrho_{\vec{u}})_{\vec{u} \in \mathcal{R}(\mathcal{U})})$ respects the equations. Typicality and actuality come apart in context $\vec{u}$ only if the unconditional ranking function $\varrho_{\vec{u}}$ and the conditional ranking function $\varrho_{\vec{u}}(\cdot \mid \vec{U} = \vec{u})$ differ for the rank assigned to some proposition $U_i = u_i$, for some exogenous variable $U_i$ and some value $u_i$ in $\mathcal{R}(U_i)$. But nothing forces us to use the unconditional ranking function $\varrho_{\vec{u}}$ in evaluating counterfactuals in $\vec{u}$. We are free to use the conditional ranking function $\varrho_{\vec{u}}(\cdot \mid \vec{U} = \vec{u})$ to evaluate counterfactuals in $\vec{u}$.

Here is a restricted, but hopefully more comprehensible version of the main result detailed below. Suppose the model $\mathcal{M}$ with its family of unconditional ranking functions $(\varrho_{\vec{u}})_{\vec{u} \in \mathcal{R}(\mathcal{U})}$ respects the equations. This implies that the model $\mathcal{M}_{\vec{U}}$ with the family of conditional ranking functions $(\varrho_{\vec{u}}(\cdot \mid \vec{U} = \vec{u}))_{\vec{u} \in \mathcal{R}(\mathcal{U})}$ is focused on actuality and subsequently respects the equations, provided we momentarily exclude the exogenous variables from the sentences of our language (this assumption will be dropped below). The conditional ranking functions give us the counterfactuals in the various contexts (or worlds, if we do not exclude the exogenous variables from the sentences of our language).
If two scenarios or cases agree on the conditional ranking functions and the counterfactuals they represent, as is the case for the FIRE example, version 2 and the SURVIVAL example, they may still differ in the unconditional ranking functions they arise from and the defaults these latter represent. We do not need to introduce a second element in our model.

In a nutshell causality and counterfactuality interact in the following way. Typicality is represented by the unconditional or "prior" ranking functions. Counterfactuality includes typicality, but goes beyond it by respecting the equations and, in the context of causality, by being focused on actuality (and subsequently respecting the equations). In the context of causality, counterfactuality is represented by the conditional or "posterior" ranking functions that arise from the unconditional ranking functions by conditionalizing on what is actually the case. Both unconditional as well as conditional ranking functions are to respect the equations. In addition to this the latter, but not the former, are to be focused on actuality. As a consequence, the latter, but not the former, do not represent typicality anymore, if, as may happen, typicality and actuality come apart. As Lewis might put it, typicality "is of little or no importance" (Lewis 1979: 472).

Even though conditionalizing on what is actually the case may erase the traces of typicality, we can still refer back to the unconditional roots. This is exactly what we do if we adopt Halpern & Hitchcock's (2010) definition of actual causation. In the relevant clause (2) we use the unconditional ranking function to determine the default values of the variables, whereas we use the conditional ranking function to determine the truth-values of the counterfactuals. Halpern & Hitchcock (2010) use two different formalisms, viz. structural equations and ranking functions, to represent the "(causal) laws" and typicality, respectively. I use just one formalism, viz.
objective ranking functions, that, due to its conditional nature, is sufficiently rich to capture both of these dimensions of counterfactuality.⁹

⁹ It should be noted that this story cannot be told on an account of counterfactuals such as Lewis' (1973) or Stalnaker's (1968), because these accounts lack the operation of conditionalisation: there are no such things as a conditional sphere of similarity or conditional selection functions.

Things are more complicated if we allow for exogenous variables in the sentences of our language. Then the following more general move has to be made. Take $\mathcal{M}$ with its family of unconditional ranking functions $(\varrho_{\vec{u}})_{\vec{u} \in \mathcal{R}(\mathcal{U})}$. Instead of strictly conditionalizing every $\varrho_{\vec{u}}$ on $\vec{U} = \vec{u}$ to obtain the model $\mathcal{M}_{\vec{U}}$ with its family of conditional ranking functions $(\varrho_{\vec{u}}(\cdot \mid \vec{U} = \vec{u}))_{\vec{u} \in \mathcal{R}(\mathcal{U})}$, merely Shenoy conditionalize every $\varrho_{\vec{u}}$ on $\vec{U} = \vec{u}$ by an appropriately chosen number $\mathit{max}$ to obtain the model $\mathcal{M}^{\mathit{max}}_{\vec{U}}$ with its family of "Shenoy shifted" ranking functions $(\varrho_{\vec{u}}(\cdot \uparrow \vec{U} = \vec{u}))_{\vec{u} \in \mathcal{R}(\mathcal{U})}$.

Shenoy conditionalization is defined as follows. If $\varrho : \wp(\mathcal{W}) \to \mathbb{N} \cup \{\infty\}$ is the unconditional ranking function on the powerset over $\mathcal{W}$, $\wp(\mathcal{W})$, then the result of Shenoy conditionalizing $\varrho$ on the proposition $A$ from $\wp(\mathcal{W})$ by rank $k \in \mathbb{N} \cup \{\infty\}$, $\varrho_{A \uparrow k}$, is defined as follows. For each $B$ from $\wp(\mathcal{W})$,

$\varrho_{A \uparrow k}(B) = \min\{\varrho(B \cap A) + 0 - \mathit{min},\ \varrho(B \cap \bar{A}) + k - \mathit{min}\}$,

where $\mathit{min} = \min\{0 + \varrho(A),\ k + \varrho(\bar{A})\}$ is a normalization parameter which depends on the ranking function $\varrho$ which is to be updated, the partition $\{A, \bar{A}\}$, and the input parameters $\{0, k\}$ by which the elements $A, \bar{A}$ of the partition are shifted. The effect of normalizing by $\mathit{min}$ is that at least one possible world is assigned rank 0 rather than rank $\mathit{min}$. Strictly conditionalizing $\varrho$ on $A$ results in the same ranking function as Shenoy conditionalizing $\varrho$ on $A$ by $\infty$ so that $\mathcal{M}_{\vec{U}} = \mathcal{M}^{\infty}_{\vec{U}}$. Shenoy conditionalization was introduced by Shenoy (1991).
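The definition of Shenoy conditionalization can be sketched in Python, working pointwise on worlds. The sketch makes several assumptions that are mine: a ranking function is stored as a dict from worlds to ranks, the rank of a proposition is the minimum rank of its worlds (infinity for the empty proposition), the proposition conditionalized on contains a world of finite rank, and the names `rank_of`, `shenoy` and `strict` as well as the sample ranks are hypothetical.

```python
import math


def rank_of(rank, prop):
    """Rank of a proposition: the minimum rank of its worlds, infinity if it is empty."""
    return min((rank[w] for w in prop), default=math.inf)


def shenoy(rank, A, k):
    """Shenoy conditionalize rank on A by k: shift A by 0 and its complement by k, then normalize."""
    worlds = set(rank)
    A = set(A) & worlds
    not_A = worlds - A
    norm = min(0 + rank_of(rank, A), k + rank_of(rank, not_A))
    return {w: rank[w] + (0 if w in A else k) - norm for w in worlds}


def strict(rank, A):
    """Strict conditionalization on A: worlds outside A receive rank infinity."""
    a = rank_of(rank, A)
    return {w: (rank[w] - a if w in A else math.inf) for w in rank}


# Assumed sample ranking over four worlds, and a proposition A.
r = {"a": 0, "b": 1, "c": 2, "d": 3}
A = {"c", "d"}

shifted = shenoy(r, A, 5)
print(sorted(shifted.items()))
# Shifting by infinity coincides with strict conditionalization.
print(shenoy(r, A, math.inf) == strict(r, A))
```

In the sample, the normalization parameter is 2, so the worlds in `A` move down by 2 and the worlds outside `A` move up by 5 - 2 = 3, which leaves one world at rank 0.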
It is the rank-theoretic counterpart to probability theory's Field conditionalization (Field 1978).

The family of Shenoy shifted ranking functions in terms of which we evaluate counterfactuals results from the original family of unconditional ranking functions by a series of $m$ Shenoy shifts, one for each exogenous variable $U_j$. We start with $\varrho_{\vec{u}} =: \varrho_0$ from the family $(\varrho_{\vec{u}})_{\vec{u} \in \mathcal{R}(\mathcal{U})}$, which, following Halpern & Hitchcock (2010), we interpret in terms of typicality. What we need to do is to Shenoy conditionalize on what is actually the case in context $\vec{u} = (u_1, \ldots, u_m)$. We do this by first Shenoy conditionalizing $\varrho_0$ on $U_1 = u_1$ by $\mathit{max} = \max\{\varrho_{\vec{u}}(w) : w \in \mathcal{W}\} + 1$, which is sufficiently large but finite. This has two effects. First, all worlds that differ from the actual world $w_{\vec{u}}$ in the value for the exogenous variable $U_1$ are shifted upwards by $\mathit{max} - \mathit{min}_1$, where $\mathit{min}_1$ depends, among other things, on $\varrho_0$. Second, all worlds that agree with the actual world $w_{\vec{u}}$ on the value for the exogenous variable $U_1$ are shifted downwards by $\mathit{min}_1$ (and so at least one of those latter worlds is assigned rank 0). The result is $\varrho_{0,\,U_1 = u_1 \uparrow \mathit{max}} =: \varrho_1$. We continue by Shenoy conditionalizing $\varrho_1$ on $U_2 = u_2$ by $\mathit{max}$ to obtain $\varrho_{1,\,U_2 = u_2 \uparrow \mathit{max}} =: \varrho_2$ and so on until we finally arrive at $\varrho_{m-1,\,U_m = u_m \uparrow \mathit{max}} = \varrho_m =: \varrho_{\vec{u}}(\cdot \uparrow \vec{U} = \vec{u})$. $\varrho_m$ differs from the original $\varrho_0$ in that worlds that differ from the actual world $w_{\vec{u}}$ in the value for exactly $k$ exogenous variables have been shifted upwards or further away by $k \cdot \mathit{max}$, modulo normalization.

The first thing this means is that the model with the Shenoy shifted ranking functions $\varrho_m$ instead of the unconditional ranking functions $\varrho_{\vec{u}}$ is focused on actuality. By the choice of $\mathit{max}$ every world that differs from the actual world in the value of some exogenous variable now has a higher rank than any world that agrees with the actual world in the value of all exogenous variables.
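The series of Shenoy shifts can be traced in a small Python sketch. Everything specific in it is an assumption made for illustration: the toy model has two binary exogenous variables and one endogenous variable governed by a disjunctive equation, the typicality ranks in `rho0` are invented, and `shenoy`, `iterated_shift`, `differing` and `is_focused` are hypothetical helper names. The sketch also compares the iterated shift with a single shift on the conjunction of the actual exogenous values.

```python
import math
from itertools import product

# Worlds are triples (u1, u2, v); the assumed equation is V = max(U1, U2).
WORLDS = list(product((0, 1), repeat=3))
ACTUAL_U = (1, 1)  # the context of evaluation


def shenoy(rank, A, k):
    """Shenoy conditionalize rank (a dict world -> rank) on the set of worlds A by k."""
    not_A = set(rank) - set(A)
    rank_A = min((rank[w] for w in A), default=math.inf)
    rank_not_A = min((rank[w] for w in not_A), default=math.inf)
    norm = min(0 + rank_A, k + rank_not_A)
    return {w: rank[w] + (0 if w in A else k) - norm for w in rank}


def iterated_shift(rank, actual_u):
    """One Shenoy shift per exogenous variable, each by max = highest rank + 1."""
    shift = max(rank.values()) + 1
    for j, u_j in enumerate(actual_u):
        rank = shenoy(rank, {w for w in rank if w[j] == u_j}, shift)
    return rank


def differing(w, actual_u):
    """Indices of the exogenous variables on whose value w differs from the context."""
    return {j for j, u_j in enumerate(actual_u) if w[j] != u_j}


def is_focused(rank, actual_u):
    """Fewer differences from the context (proper subset) must mean a strictly smaller rank."""
    return all(rank[w1] < rank[w2]
               for w1 in rank for w2 in rank
               if differing(w1, actual_u) < differing(w2, actual_u))


# Assumed typicality ranks: each exogenous event is atypical (+1), a violated equation costs 3.
rho0 = {w: w[0] + w[1] + 3 * int(w[2] != max(w[0], w[1])) for w in WORLDS}

rho_m = iterated_shift(rho0, ACTUAL_U)
one_shot = shenoy(rho0, {w for w in WORLDS if w[:2] == ACTUAL_U}, max(rho0.values()) + 1)

print(is_focused(rho0, ACTUAL_U), is_focused(rho_m, ACTUAL_U), is_focused(one_shot, ACTUAL_U))
```

In this toy case only the iterated shift yields a ranking function that is focused on actuality. Worlds that agree on the exogenous values keep their rank differences, since they are always shifted together.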
More generally, every world that dominates another world in terms of focus on the actual world is assigned a smaller rank than the dominated world.10

The second thing this means is that, in the Shenoy shifted ranking functions $\varrho_m$, the relative position of two worlds that agree on the values for the exogenous variables is the same as it is in the unconditional ranking functions $\varrho_{\vec{u}}$ (the two worlds are always shifted together). Therefore the model with the Shenoy shifted ranking functions $\varrho_m$ instead of the unconditional ranking functions $\varrho_{\vec{u}}$ still respects the equations for those worlds that agree on the values for the exogenous variables.

Therefore the model $\mathcal{M}^{\mathit{max}}_{\vec{U}}$, call it the appropriate Shenoy shift of $\mathcal{M}$ on $\vec{U}$, with its family of Shenoy shifted ranking functions $(\varrho_{\vec{u}}(\cdot \uparrow \vec{U} = \vec{u}))_{\vec{u} \in \mathcal{R}(\mathcal{U})}$ is focused on actuality and subsequently respects the equations, if the extended acyclic causal model $\mathcal{M}$ with its family of unconditional ranking functions $(\varrho_{\vec{u}})_{\vec{u} \in \mathcal{R}(\mathcal{U})}$ respects the equations. In the same way we can form the appropriate Shenoy shift $\mathcal{M}^{*\mathit{max}}_{\vec{U}}$ of a counterfactual model $\mathcal{M}^*$. As the proofs of theorems 1 and 2 make clear, $\mathcal{M}^{*\mathit{max}}_{\vec{U}}$ constructed in this way is one of the counterfactual models $\mathcal{M}^{\mathit{max}*}_{\vec{U}}$ that exist for each extended acyclic causal model $\mathcal{M}^{\mathit{max}}_{\vec{U}}$ which is focused on actuality and subsequently respects the equations. Thus we arrive at

Theorem 4 For each extended acyclic causal model $\mathcal{M} = (\mathcal{S}, \mathcal{F}, (\varrho_{\vec{u}})_{\vec{u} \in \mathcal{R}(\mathcal{U})})$ which respects the equations and its appropriate Shenoy shift $\mathcal{M}^{\mathit{max}}_{\vec{U}}$ which is focused on actuality and subsequently respects the equations there is a counterfactual model $\mathcal{M}^* = (\mathcal{S}, (\varrho_w)_{w \in W})$ and its appropriate Shenoy shift $\mathcal{M}^{*\mathit{max}}_{\vec{U}}$ such that:

SE $F_i$ holds in $\mathcal{M}$ iff $F_i$ holds in $\mathcal{M}^{\mathit{max}}_{\vec{U}}$ iff $F_i$ holds in $\mathcal{M}^*$ iff $F_i$ holds in $\mathcal{M}^{*\mathit{max}}_{\vec{U}}$.

D For all $\vec{u} \in \mathcal{R}(\mathcal{U})$ and all $w \in W$: $\varrho_{\vec{u}}(w) = \varrho_{w_{\vec{u}}}(w)$, where $w_{\vec{u}}$ is the unique solution to all the equations of $\mathcal{M}$ in $\vec{u}$.
C For all statements $\varphi$ in the language of the generalized version of Halpern & Hitchcock (2010) and all contexts $\vec{u} \in \mathcal{R}(\mathcal{U})$: $\varphi$ is true in $\mathcal{M}$ in $\vec{u}$ according to the structural equations approach iff $\varphi$ is true in $\mathcal{M}^{\mathit{max}}_{\vec{U}}$ in $\vec{u}$ according to the structural equations approach iff $\varphi$ is true in $\mathcal{M}^{*\mathit{max}}_{\vec{U}}$ in $w_{\vec{u}}$ according to the counterfactual models account.

10 Shenoy conditionalizing just once on the conjunction $\vec{U} = \vec{u}$ by $\mathit{max}$ does not guarantee that the resulting model is focused on actuality, because in that case all that matters is whether a world differs from the actual world in the value for at least one or no exogenous variable.

Theorem 4 shows that we can do everything with objective ranking functions that we can do with structural equations together with normality or typicality, and more. It does not show that we can do everything. The reason I am belaboring the obvious is that it may well be that someone comes up with examples which are modeled by isomorphic counterfactual models, and of which it is claimed that the "intuitively correct" causal judgments differ (see, however, Glymour et al. 2010). In the same way one may come up with examples which are modeled by extended acyclic causal models in which, "intuitively", respect for the equations does not hold. The following one due to Christopher R. Hitchcock (personal correspondence) might be a case in point. I think it is not, because counterfactuality trumps typicality in the sense that the most typical $A \wedge C$-worlds are more typical than the most typical $A \wedge \neg C$-worlds if $A \mathrel{\Box\!\!\rightarrow} C$ is true.

Here is Hitchcock's VICTIM example. Let exogenous variable $A$ take on the value 1 if Assassin shoots, and 0 otherwise. Let endogenous variable $B$ take on the value 1 if Backup shoots, and 0 otherwise. Let endogenous variable $V$ take on the value 1 if Victim dies, and 0 otherwise.
The functions $F_B : a \mapsto 1 - a$ and $F_V : (a, b) \mapsto \max\{a, b\}$ describe the following equations:

• $A$
• $B = 1 - A$
• $V = A \vee B$

In every context, it is less typical for Assassin as well as Backup to shoot than not to shoot, and for Victim to die than not to die. The first equation implies that Backup would shoot if Assassin did not shoot. Respect for the equations forces us to say that the world where Assassin does not shoot, Backup does not shoot, and Victim does not die is less typical than the world where Assassin does not shoot, Backup shoots, and Victim does not die. The reason is that the latter world strongly Halpern-dominates the former world: the latter world violates the equation for $V$ (and no other equation), the former world violates the equation for $B$ (and no other equation), and $B \in \mathrm{An}(V)$, but $V \notin \mathrm{An}(B)$. For a similar reason we have to say that the world where Assassin does not shoot and Backup does not shoot and Victim dies is less typical than the world where Assassin does not shoot, Backup shoots, and Victim dies. Therefore we must say that it is more typical that Assassin does not shoot and Backup shoots than that Assassin does not shoot and Backup does not shoot. I think this is correct, because it conforms with the counterfactual that Backup would shoot if Assassin did not shoot.

Another example is the PEN example mentioned in Halpern & Hitchcock (forthcoming). Let exogenous variable $PS$ take on the value 1 if Professor Smith takes a pen, and 0 otherwise. Let exogenous variable $CP$ take on the value 1 if the department chair institutes a policy forbidding faculty members from taking pens, and 0 otherwise. Let endogenous variable $PO$ take on the value 1 if a problem occurs, and 0 otherwise. The function $F : c \mapsto c$ describes the following equation:

• $CP$
• $PS$
• $PO = PS$

It is more typical for Professor Smith not to take a pen than to take a pen.
In the context where the department chair institutes a policy forbidding faculty members from taking pens, $CP = 1$, and where Professor Smith takes a pen, $PS = 1$, it is true that Professor Smith would (still) take a pen (even) if the department chair instituted a policy forbidding faculty members from taking pens, $CP = 1 \mathrel{\Box\!\!\rightarrow} PS = 1$. So far so good.

Here is the important point. Halpern & Hitchcock (forthcoming) claim that it is more "typical" that the department chair institutes a policy forbidding faculty members from taking pens and Professor Smith does not take a pen than that the department chair institutes a policy forbidding faculty members from taking pens and Professor Smith takes a pen. The reason is that Professor Smith violates a norm when he takes a pen in the context where the department chair institutes a policy forbidding faculty members from taking pens. This norm, or rather its violation, is claimed to have an impact on what is typical in that context.

However, what Halpern & Hitchcock (forthcoming) call "typicality" involves a deontic modality. The PEN example contains the conditional obligation that Professor Smith should not take a pen given that the department chair institutes a policy forbidding faculty members from taking pens, $\mathrm{Ought}(PS = 0 \mid CP = 1)$. And while I hold the view that typicality or normality respects the equations, I do not hold the view that deontic modalities do. Quite the opposite is the case. Given that the department chair institutes a policy forbidding faculty members from taking pens, Professor Smith should not, but (still) would, take a pen. This, I submit, implies that it is less typical that the department chair institutes a policy forbidding faculty members from taking pens and Professor Smith does not take a pen than that the department chair institutes a policy forbidding faculty members from taking pens and Professor Smith takes a pen.
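Returning to the VICTIM example, the two domination claims made there can be checked mechanically. The Python sketch below is an illustration under stated assumptions: the definition of weak Halpern-domination used here (every variable whose equation is violated only by the first world has an ancestor whose equation is violated only by the second world) is reconstructed from the way the notion is used in the appendix, strong domination is taken to be weak domination that does not hold in the other direction, and all names are hypothetical.

```python
# Causal structure of VICTIM: A is exogenous; B and V are endogenous.
PARENTS = {"A": [], "B": ["A"], "V": ["A", "B"]}
EQUATIONS = {
    "B": lambda w: 1 - w["A"],          # B = 1 - A
    "V": lambda w: max(w["A"], w["B"]),  # V = A or B
}

def ancestors(x):
    """All variables from which x can be reached along parent links."""
    found, stack = set(), list(PARENTS[x])
    while stack:
        y = stack.pop()
        if y not in found:
            found.add(y)
            stack.extend(PARENTS[y])
    return found

def violated(w):
    """Endogenous variables whose equation the world w violates."""
    return {v for v, f in EQUATIONS.items() if w[v] != f(w)}

def weakly_dominates(w1, w2):
    """Assumed definition, reconstructed from the proofs in the appendix."""
    only1 = violated(w1) - violated(w2)
    only2 = violated(w2) - violated(w1)
    return all(any(y in ancestors(x) for y in only2) for x in only1)

def strongly_dominates(w1, w2):
    return weakly_dominates(w1, w2) and not weakly_dominates(w2, w1)

# Assassin does not shoot in any of the four worlds compared in the text.
backup_shoots_alive = {"A": 0, "B": 1, "V": 0}
nobody_shoots_alive = {"A": 0, "B": 0, "V": 0}
backup_shoots_dead = {"A": 0, "B": 1, "V": 1}
nobody_shoots_dead = {"A": 0, "B": 0, "V": 1}

print(strongly_dominates(backup_shoots_alive, nobody_shoots_alive))  # True
print(strongly_dominates(backup_shoots_dead, nobody_shoots_dead))    # True
```

On this reconstruction both pairs come out as in the text: in each case the world in which Backup shoots strongly dominates the world in which Backup does not.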
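7 Appendix

7.1 Proof of Theorem 1

Theorem 1 For each extended acyclic causal model $\mathcal{M} = (\mathcal{S}, \mathcal{F}, (\varrho_{\vec{u}})_{\vec{u} \in \mathcal{R}(\mathcal{U})})$ which respects the equations there is a counterfactual model $\mathcal{M}^* = (\mathcal{S}, (\varrho_w)_{w \in W})$ such that:

SE $F_i$ holds in $\mathcal{M}$ iff $F_i$ holds in $\mathcal{M}^*$.

D For all $\vec{u} \in \mathcal{R}(\mathcal{U})$ and all $w \in W$: $\varrho_{\vec{u}}(w) = \varrho_{w_{\vec{u}}}(w)$, where $w_{\vec{u}}$ is the unique solution to all equations in $\mathcal{M}$ in $\vec{u}$.

Proof: Let $\mathcal{M} = (\mathcal{S}, \mathcal{F}, (\varrho_{\vec{u}})_{\vec{u} \in \mathcal{R}(\mathcal{U})})$ be an extended acyclic causal model which respects the equations. I will construct a counterfactual model $\mathcal{M}^* = (\mathcal{S}, (\varrho_w)_{w \in W})$ with the appropriate features. Take $\mathcal{S}$ from $\mathcal{M}$. For each context $\vec{u} \in \mathcal{R}(\mathcal{U})$ the equations in $\mathcal{F}$ determine a unique "legal" world $w_{\vec{u}} \in W$. $W_0 = \{w_{\vec{u}} \in W : \vec{u} \in \mathcal{R}(\mathcal{U})\}$ is the set of all "legal" worlds, i.e. the set of all worlds that satisfy all equations. For $w_{\vec{u}} \in W_0$ we define $\varrho_{w_{\vec{u}}}(w) = \varrho_{\vec{u}}(w)$ for all $w \in W$. For the "illegal" worlds $w \in W \setminus W_0$ which violate at least one equation we let the ranking functions $\varrho_w$ copy an arbitrary ranking function $\varrho_{w_{\vec{u}}}$, $w_{\vec{u}} \in W_0$. The counterfactual model $\mathcal{M}^*$ constructed in this way satisfies D. It remains to be shown that it also satisfies SE.

Let $F_i$ represent the equation for $V_i$, $i = 1, \ldots, n$. Obviously $F_i$ is eligible for $V_i$. We have to show that $F_i$ holds in $\mathcal{M}^*$. This means we have to show for every world $w \in W$ that the following counterfactuals are all true in $\mathcal{M}^*$ in $w$: $\overrightarrow{\mathcal{U} \cup \mathcal{V} \setminus \{V_i\}} = \vec{w}_i \mathrel{\Box\!\!\rightarrow} V_i = F_i(\vec{w}_i)$, where $\vec{w}_i \in W_i = \times_{X \in \mathcal{U} \cup \mathcal{V} \setminus \{V_i\}} \mathcal{R}(X)$. Since the $\varrho_w$ for the "illegal" worlds $w \in W \setminus W_0$ copy some $\varrho_{w_{\vec{u}}}$ for a "legal" world $w_{\vec{u}} \in W_0$, it suffices to show that this holds for every "legal" world $w_{\vec{u}} \in W_0$.

Each antecedent of the form $\overrightarrow{\mathcal{U} \cup \mathcal{V} \setminus \{V_i\}} = \vec{w}_i$, for $\vec{w}_i \in W_i$, is true in the set of worlds $\{(\vec{w}_i, v_i) : v_i \in \mathcal{R}(V_i)\}$. There is exactly one $v_i^* \in \mathcal{R}(V_i)$, viz. the value $F_i$ assigns to $\vec{w}_i$, such that $(\vec{w}_i, v_i^*)$ does not violate the equation for $V_i$. For all other $v_i \in \mathcal{R}(V_i)$ the resulting world $(\vec{w}_i, v_i)$ violates the equation for the endogenous variable $V_i$.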
Hence $V_i \in \mathcal{V}^*(\vec{w}_i, v_i) \setminus \mathcal{V}^*(\vec{w}_i, v_i^*)$ for all $v_i \neq v_i^*$. Furthermore, $(\vec{w}_i, v_i^*)$ and $(\vec{w}_i, v_i)$ agree on the values of all variables other than $V_i$.

Suppose $X \in \mathcal{V}^*(\vec{w}_i, v_i^*) \setminus \mathcal{V}^*(\vec{w}_i, v_i)$ for an arbitrary $v_i \neq v_i^*$. Since $(\vec{w}_i, v_i^*)$ and $(\vec{w}_i, v_i)$ agree on the value of $X$, and since, by assumption, $(\vec{w}_i, v_i)$ does not violate the equation for $X$, there must be an exogenous or endogenous variable $Y$ such that $Y \in \mathrm{An}(X)$ and $(\vec{w}_i, v_i^*)$ and $(\vec{w}_i, v_i)$ do not agree on the value of $Y$. Since $(\vec{w}_i, v_i^*)$ and $(\vec{w}_i, v_i)$ agree on the values of all variables other than $V_i$, this variable $Y$ must be $V_i$. That is, if $X \in \mathcal{V}^*(\vec{w}_i, v_i^*) \setminus \mathcal{V}^*(\vec{w}_i, v_i)$, then $V_i \in \mathrm{An}(X)$. Since $V_i \in \mathcal{V}^*(\vec{w}_i, v_i) \setminus \mathcal{V}^*(\vec{w}_i, v_i^*)$ for all $v_i \neq v_i^*$, this means that $(\vec{w}_i, v_i^*)$ weakly Halpern-dominates $(\vec{w}_i, v_i)$.

Since, in acyclic causal models, $X \notin \mathrm{An}(V_i)$ if $V_i \in \mathrm{An}(X)$, and since $V_i \in \mathcal{V}^*(\vec{w}_i, v_i) \setminus \mathcal{V}^*(\vec{w}_i, v_i^*)$, $(\vec{w}_i, v_i)$ does not weakly Halpern-dominate $(\vec{w}_i, v_i^*)$. Respect for the equations implies that $\varrho_{\vec{u}}((\vec{w}_i, v_i^*)) < \varrho_{\vec{u}}((\vec{w}_i, v_i))$ for all $v_i \neq v_i^*$. Since $V_i = F_i(\vec{w}_i)$ is true in $(\vec{w}_i, v_i^*)$ it follows that all $\varrho_{\vec{u}}$-minimal, i.e. all $\varrho_{w_{\vec{u}}}$-minimal, antecedent worlds are consequent worlds. And this is so for all contexts $\vec{u} \in \mathcal{R}(\mathcal{U})$, i.e. all "legal" worlds $w_{\vec{u}} \in W_0$.

The if-direction follows from the fact that, for each endogenous variable $V_i$, at most one eligible function holds in a given counterfactual model $\mathcal{M}^*$. For two such functions $F$ and $F'$ differ only if there is a $\vec{w}_i$ such that $F(\vec{w}_i) \neq F'(\vec{w}_i)$. In that case the two counterfactuals $\overrightarrow{\mathcal{U} \cup \mathcal{V} \setminus \{V_i\}} = \vec{w}_i \mathrel{\Box\!\!\rightarrow} V_i = F(\vec{w}_i)$ and $\overrightarrow{\mathcal{U} \cup \mathcal{V} \setminus \{V_i\}} = \vec{w}_i \mathrel{\Box\!\!\rightarrow} V_i = F'(\vec{w}_i)$ have inconsistent consequents, and so cannot be jointly true at any world $w$. $\Box$

7.2 Proof of Theorem 2

Theorem 2 For each extended acyclic causal model $\mathcal{M} = (\mathcal{S}, \mathcal{F}, (\varrho_{\vec{u}})_{\vec{u} \in \mathcal{R}(\mathcal{U})})$ which is focused on actuality and subsequently respects the equations there is a counterfactual model $\mathcal{M}^* = (\mathcal{S}, (\varrho_w)_{w \in W})$ such that:

C For all statements $\varphi$ in the language of the generalized version of Halpern & Hitchcock (2010) and all contexts $\vec{u} \in \mathcal{R}(\mathcal{U})$: $\varphi$ is true in $\mathcal{M}$ in $\vec{u}$ according to the structural equations approach iff $\varphi$ is true in $\mathcal{M}^*$ in $w_{\vec{u}}$ according to the counterfactual models account.

Proof: Let $\mathcal{M} = (\mathcal{S}, \mathcal{F}, (\varrho_{\vec{u}})_{\vec{u} \in \mathcal{R}(\mathcal{U})})$ be an extended acyclic causal model which is focused on actuality and subsequently respects the equations. Construct $\mathcal{M}^*$ as in the proof of theorem 1.

Suppose $\varphi$ is an atomic sentence of the form $X_i = x$ for some exogenous or endogenous variable $X_i$. If $\varphi$ is true in $\mathcal{M}$ in context $\vec{u}$ this means that $x$ is the value of $X_i$ in the unique solution $w_{\vec{u}}$ to all the equations in $\mathcal{F}$. But then $w_{\vec{u}} \in \{(u_1, \ldots, u_m, v_1, \ldots, v_n) = (x_1, \ldots, x_{m+n}) : x_i = x\}$. Conversely, if $\varphi$ is not true in $\mathcal{M}$ in context $\vec{u}$ this means that $x$ is not the value of $X_i$ in the unique solution $w_{\vec{u}}$ to all the equations in $\mathcal{F}$, in which case $w_{\vec{u}} \notin \{(x_1, \ldots, x_{m+n}) : x_i = x\}$.

Now suppose $\varphi$ is Boolean. Since negations and conjunctions are defined in the same way in the structural equations approach and the counterfactual models account, $\varphi$ is true in $\mathcal{M}$ in context $\vec{u}$ iff $\varphi$ is true in $\mathcal{M}^*$ in "legal" world $w_{\vec{u}}$.

Finally, suppose $\varphi$ is of the form $X_1 = x_1 \wedge \ldots \wedge X_k = x_k \mathrel{\Box\!\!\rightarrow} \psi$, for short: $\vec{X} = \vec{x} \mathrel{\Box\!\!\rightarrow} \psi$, where $\psi$ is Boolean. Then $\varphi$ is true in $\mathcal{M}$ in $\vec{u}$ according to the structural equations account just in case $\psi$ is true in that model $\mathcal{M}_{\vec{X} = \vec{x}} = (\mathcal{S}_{\vec{X}}, \mathcal{F}^{\vec{X} = \vec{x}})$ and that context $\vec{u}_{\vec{X} = \vec{x}}$ that result from $\mathcal{M}$ and $\vec{u}$ by replacing the equations for $X_i$ by the equations $X_i = x_i$, $i = 1, \ldots, k$. On the other hand, $\varphi$ is true in $\mathcal{M}^*$ in $w_{\vec{u}}$ just in case all $\varrho_{w_{\vec{u}}}$-minimal $\vec{X} = \vec{x}$-worlds are $\psi$-worlds. It suffices to consider the case where $\psi$ is an atomic sentence of the form $Z_i = z$.
In this case $\psi$ is true in the first sense just in case $z$ is the value of $Z_i$ in the unique solution $w^{\vec{X} = \vec{x}}_{\vec{u}_{\vec{X} = \vec{x}}} =: w^*$ to all equations represented by $\mathcal{F}^{\vec{X} = \vec{x}}$ in context $\vec{u}_{\vec{X} = \vec{x}}$. We need to show that $w^*$ is the one and only $\varrho_{w_{\vec{u}}}$-minimal $\vec{X} = \vec{x}$-world.

$w^*$ is an $\vec{X} = \vec{x}$-world and differs from any other $\vec{X} = \vec{x}$-world $w'$ at most in the values assigned to $\mathcal{U} \cup \mathcal{V} \setminus \{X_1, \ldots, X_k\}$. $w^*$ agrees with $w_{\vec{u}}$ in the values for the exogenous variables $\mathcal{U} \setminus \{X_1, \ldots, X_k\}$. Therefore, if an $\vec{X} = \vec{x}$-world $w'$ differs from $w^*$ in the value of some exogenous variable $U$, $w'$ differs also from $w_{\vec{u}}$ in the value of $U$. This means that $w^*$ dominates any such world $w'$ in terms of focus on $w_{\vec{u}}$. Focus on actuality implies that any such world $w'$ has a higher rank in $w_{\vec{u}}$ and so is not among the $\varrho_{\vec{u}}$-minimal $\vec{X} = \vec{x}$-worlds.

This leaves only $\vec{X} = \vec{x}$-worlds which differ from $w^*$ in at most the values for the endogenous variables $\mathcal{V} \setminus \{X_1, \ldots, X_k\}$. Let $w'$ be such a world and suppose $X \in \mathcal{V}^*(w^*) \setminus \mathcal{V}^*(w')$. Since $w^*$ satisfies the equations for all endogenous variables $\mathcal{V} \setminus \{X_1, \ldots, X_k\}$, it must be that $X \in \{X_1, \ldots, X_k\}$. Since $w'$ and $w^*$ agree on the values of $X_1, \ldots, X_k$, and since, by assumption, $w'$ satisfies the equation for $X$, there must be an exogenous or endogenous variable $Y$ such that $Y \in \mathrm{An}(X)$ and $w'$ and $w^*$ differ in the value for $Y$. The latter implies that $Y$ is endogenous, but not among $X_1, \ldots, X_k$, and therefore $w^*$ does not violate the equation for $Y$. If $w'$ violates the equation for $Y$, we are done. So suppose $w'$ does not violate the equation for $Y$.

$w^*$ and $w'$ agree on the values of $\mathcal{U}$ as well as $X_1, \ldots, X_k$, $w^*$ satisfies the equations for $\mathcal{V} \setminus \{X_1, \ldots, X_k\}$, and $Y \in \mathcal{V} \setminus \{X_1, \ldots, X_k\}$. Hence, if $w'$ satisfies the equation for $Y$, there must be an exogenous or endogenous variable $Z$ such that $Z \in \mathrm{An}(Y) \subseteq \mathrm{An}(X)$ and $w'$ and $w^*$ differ in the value of $Z$. As before it follows that $Z$ is endogenous, but not among $X_1, \ldots, X_k$, and that $w^*$ satisfies the equation for $Z$.
If $w'$ violates the equation for $Z$, we are done. If not, there must be another endogenous variable $Z' \in \mathrm{An}(Z) \subseteq \mathrm{An}(Y) \subseteq \mathrm{An}(X)$ with the same properties. Since there are only finitely many variables, and since the model is acyclic, we finally arrive at an endogenous variable $Z^* \in \mathrm{An}(X)$ such that $w'$ violates the equation for $Z^*$, but $w^*$ does not. Hence $w^*$ weakly Halpern-dominates $w'$.

Note that $\mathcal{V}^*(w') \setminus \mathcal{V}^*(w^*)$ is not empty, if $w'$ differs from $w^*$. For suppose it is. Then all variables whose equations are violated by $w'$ also have their equations violated by $w^*$. Since $w^*$ does not violate the equations for $\mathcal{V} \setminus \{X_1, \ldots, X_k\}$, and since $w'$ and $w^*$ agree on the values of $\mathcal{U}$ as well as $X_1, \ldots, X_k$, $w'$ and $w^*$ agree on the values for all variables, and thus are identical.

Since, in acyclic models, $X \notin \mathrm{An}(Z^*)$ if $Z^* \in \mathrm{An}(X)$, and since $Z^* \in \mathcal{V}^*(w') \setminus \mathcal{V}^*(w^*)$ for at least one endogenous variable $Z^*$, $w'$ does not weakly Halpern-dominate $w^*$. Focus on actuality and subsequent respect for the equations imply that any such world $w'$ has a higher rank in $w_{\vec{u}}$ and so is not among the $\varrho_{\vec{u}}$-minimal $\vec{X} = \vec{x}$-worlds. $\Box$

Acknowledgements

I am grateful to Thomas Kroedel, Jim Joyce, an anonymous referee, and, especially, Joe Halpern, Chris Hitchcock, and Wolfgang Spohn for helpful comments on earlier versions of this paper.

References

[1] Briggs, Rachael (2012), Interventionist Counterfactuals. Philosophical Studies 160, 139-166.
[2] Collins, John & Hall, Ned & Paul, L.A. (2004), Counterfactuals and Causation: History, Problems, and Prospects. In J. Collins & N. Hall & L.A. Paul (eds.), Causation and Counterfactuals. Cambridge, MA: MIT Press, 1-57.
[3] Field, Hartry (1978), A Note on Jeffrey Conditionalization. Philosophy of Science 45, 361-367.
[4] Glymour, Clark & Danks, David & Glymour, Bruce & Eberhardt, Frederick & Ramsey, Joseph & Scheines, Richard & Spirtes, Peter & Teng, Choh Man & Zhang, Jiji (2010), Actual Causation: A Stone Soup Essay. Synthese 175, 169-192.
[5] Hájek, Alan (ms), Most Counterfactuals Are False.
[6] Hall, Ned (2007), Structural Equations and Causation. Philosophical Studies 132, 109-136.
[7] Halpern, Joseph Y. (2008), Defaults and Normality in Causal Structures. Proceedings of the Eleventh International Conference on Principles of Knowledge Representation and Reasoning (KR 2008), 198-208.
[8] Halpern, Joseph Y. (forthcoming), From Causal Models to Counterfactual Structures. The Review of Symbolic Logic.
[9] Halpern, Joseph Y. & Hitchcock, Christopher R. (2010), Actual Causation and the Art of Modelling. In R. Dechter & H. Geffner & J. Halpern (eds.), Heuristics, Probability, and Causality. London: College Publications, 383-406.
[10] Halpern, Joseph Y. & Hitchcock, Christopher R. (forthcoming), Compact Representations of Extended Causal Models. Cognitive Science.
[11] Halpern, Joseph Y. & Pearl, Judea (2005a), Causes and Explanations: A Structural-Model Approach. Part I: Causes. British Journal for the Philosophy of Science 56, 843-887.
[12] Halpern, Joseph Y. & Pearl, Judea (2005b), Causes and Explanations: A Structural-Model Approach. Part II: Explanations. British Journal for the Philosophy of Science 56, 889-911.
[13] Hiddleston, Eric (2005), Causal Powers. British Journal for the Philosophy of Science 56, 27-59.
[14] Hitchcock, Christopher R. (2001), The Intransitivity of Causation Revealed in Equations and Graphs. Journal of Philosophy XCVIII, 273-299.
[15] Hitchcock, Christopher R. (2007), Prevention, Preemption, and the Principle of Sufficient Reason. Philosophical Review 116, 495-532.
[16] Huber, Franz (ms 1), What Should I Believe About What Would Have Been the Case? Unpublished manuscript.
[17] Huber, Franz (ms 2), New Foundations for Counterfactuals. Unpublished manuscript.
[18] Kistler, Max (forthcoming), The Interventionist Account of Causation and Non-causal Association Laws. Erkenntnis.
[19] Kroedel, Thomas & Huber, Franz (forthcoming), Counterfactual Dependence and Arrow. Noûs.
[20] Leitgeb, Hannes (2012a), A Probabilistic Semantics for Counterfactuals. Part A. Review of Symbolic Logic 5, 26-84.
[21] Leitgeb, Hannes (2012b), A Probabilistic Semantics for Counterfactuals. Part B. Review of Symbolic Logic 5, 85-121.
[22] Lewis, David K. (1973a), Causation. Journal of Philosophy 70, 556-567.
[23] Lewis, David K. (1973b), Counterfactuals. Cambridge, MA: Harvard University Press.
[24] Lewis, David K. (1979), Counterfactual Dependence and Time's Arrow. Noûs 13, 455-476.
[25] Lewis, David K. (1986a), Postscripts to "Causation". In D. Lewis (1986), Philosophical Papers II. Oxford: Oxford University Press, 172-213.
[26] Lewis, David K. (2000), Causation as Influence. Journal of Philosophy 97, 182-197.
[27] Menzies, Peter (2004), Difference-Making in Context. In J. Collins & N. Hall & L.A. Paul (eds.), Causation and Counterfactuals. Cambridge, MA: MIT Press, 139-180.
[28] Paul, L.A. (2000), Aspect Causation. Journal of Philosophy XCVII, 235-256.
[29] Pearl, Judea (2009), Causality: Models, Reasoning, and Inference. 2nd ed. Cambridge: Cambridge University Press.
[30] Spirtes, Peter & Glymour, Clark & Scheines, Richard (2000), Causation, Prediction, and Search. 2nd ed. Cambridge, MA: MIT Press.
[31] Shenoy, Prakash P. (1991), On Spohn's Rule for Revision of Beliefs. International Journal of Approximate Reasoning 5, 149-181.
[32] Spohn, Wolfgang (1988), Ordinal Conditional Functions: A Dynamic Theory of Epistemic States. In W.L. Harper & B. Skyrms (eds.), Causation in Decision, Belief Change, and Statistics II. Dordrecht: Kluwer, 105-134.
[33] Spohn, Wolfgang (2006), Causation: An Alternative. British Journal for the Philosophy of Science 57, 93-119.
[34] Spohn, Wolfgang (2010), The Structural Model and the Ranking Theoretic Approach to Causation: a Comparison. In R. Dechter & H. Geffner & J. Halpern (eds.), Heuristics, Probability, and Causality. London: College Publications, 507-522.
[35] Stalnaker, Robert C. (1968), A Theory of Conditionals. In N. Rescher (ed.), Studies in Logical Theory. American Philosophical Quarterly. Monograph Series 2. Oxford: Blackwell, 98-112.
[36] Woodward, James (2003), Making Things Happen: A Theory of Causal Explanation. Oxford: Oxford University Press.