A Model-Invariant Theory of Causation
J. Dmitri Gallow †
Final Draft. Forthcoming in the Philosophical Review.

Causal models provide us with a formal tool for representing the networks of determination in which causes and effects are embedded. They tell us how some token features of the world (represented in the model with variables) determine others. They tell us whether one variable determines another along a single path or along multiple paths. They tell us whether two variables determine a third; and, if so, whether they do so along independent or intersecting paths. And it has been hoped that they can also tell us whether one variable is a token cause of another.1 To this end, a number of authors have developed theories of token causation within the causal modelling framework.2 Lots of good work has been done on this front, but most of the theories developed to date have an awkward consequence: adding or removing an inessential variable from a model will lead these theories to revise their verdicts about whether two variables are causally related.3 Attend to an additional, inessential, variable interpolated along the path leading from C to E, and these theories will change their mind about whether C caused E. Attend to an additional, inessential, variable feeding into the path from C to E, and these theories will likewise change their mind about whether C caused E.4

† For helpful conversation and feedback on this material, I am indebted to Gordon Belot, Daniel Drucker, Malcolm Forster, Christopher Hitchcock, James M. Joyce, L. A. Paul, Brian Weatherson, Mark Wilson, and James Woodward, as well as audiences at the University of North Carolina, Chapel Hill, the Center for the Philosophy of Science at the University of Pittsburgh, and the Causal and Explanatory Reasoning conference at Venice International University. I am especially indebted to two anonymous reviewers whose generous feedback and watchful eyes helped make this paper much better than it would otherwise have been.
1. Token causation is sometimes called 'singular causation' or 'actual causation'. Token causal relations are the causal relations described with token causal claims: sentences of the form 'c's F-ing caused e to G' or 'c's F-ing was a cause of e's G-ing', where c's F-ing and e's G-ing are token events (e.g., 'Chris's drinking was a cause of his esophageal cancer'). These are to be contrasted with type, or general, causal claims like 'Drinking causes esophageal cancer'. So too should they be contrasted with the relations of causal determination represented in a causal model. (Looking ahead to section 1, in figure 1, whether B fires causally determines whether E does, but B's failure to fire is not a token cause of E's firing.) Throughout, 'cause' should be understood to mean 'token cause'.
2. See, e.g., Halpern and Pearl (2001, 2005), Hitchcock (2001, 2007a), Woodward (2003), Menzies (2004, 2006), Hall (2007), Halpern (2008, 2016), Beckers and Vennekens (2017, 2018), Weslake (forthcoming), and Andreas and Günther (forthcominga, forthcomingb).

I believe that this should concern us. In several instances, these theories are only able to agree with intuition through a judicious choice of which variables to include in the model. For just one example: in section 1.1 below, we'll encounter two systems which appear to differ causally, but which may be modeled with isomorphic variables and equations. Nonetheless, Hitchcock (2001) treats them differently by including an inessential variable in his model of one system while omitting the corresponding variable from his model of the other. There is a serious worry that, in the absence of some more general guidance about when variables can be ignored, and when not, ad hoc decisions like this can be used to effectively shield a theory from refutation.
A theory whose verdicts about whether C caused E don't change as inessential variables are added or removed (a model-invariant theory) would protect us from this kind of special pleading. Such a theory would have the added virtue of making it easier to determine whether C caused E. With such a theory, we needn't consider all possible correct causal models, nor decide which is most appropriate or apt for the present context; we need only check whether C caused E in a single correct model.

Below, I will provide a model-invariant theory of causation. Along the way, we'll see reason to think that an adequate theory of causation must distinguish between states which are normal, default, or inertial, and events which are abnormal, deviant departures therefrom (section 1.1). This is striking even after you have been persuaded it is true. Why should a distinction between default and deviant behavior play a role in our causal thought and talk? The theory developed here suggests an answer. In rough outline, the theory says that C caused E whenever both C and E are deviant or non-inertial events, and there is an uninterrupted process which transmits C's deviancy to E (this will be made more precise below). That is, according to this theory, a cause is something which transmits aberrational behavior to its effect; and, if this is what a cause is, then it is no surprise to find the distinction between the default and the deviant, the normal and the abnormal, or the inertial and the non-inertial showing up in our theorizing about causation.

3. The theory of Beckers and Vennekens (2017, 2018) is a notable exception, modulo some finicky issues related to their 'timings'. Unfortunately this theory says that a preemptive overdeterminer (see section 4) is not a cause. Beckers and Vennekens recognize and embrace this consequence of their theory, but it is not one that I am willing to endorse.
4. See Gallow (ms). I'll get more precise about the term 'inessential' in section 2 below.
In section 1, I will introduce causal models, explain how they can be used to provide a semantics for causal counterfactuals, and explain why I've been persuaded that these models must include information about which variable values are more default, normal, or inertial than which others. Then, in section 2, I'll explain more carefully what I mean when I call a theory of causation formulated in terms of these causal models model-invariant. Sections 3–5 develop the notion of a causal network. This is a formal characterization of what I called above "an uninterrupted process which transmits C's deviancy to E". Causal networks are the heart of my theory of causation, and they are model-invariant. In section 6.1, I will give some further motivation for thinking of causal networks as transmitting deviancy from cause to effect. In section 6.2, I will give a precise statement of the theory and apply it to some additional cases.

A few words of forewarning: in what follows, I will for the most part confine my attention to some simple 'neuron systems' (see section 1 below), though, along the way, I'll provide 'real world' cases which exemplify similar causal structures. All of these systems will be deterministic. This narrow focus will allow me to sidestep some thorny issues: for instance, which kinds of variables can be included in a causal model, when a system of equations is correct,5 and when one variable value is more or less default, normal, or inertial than another. By focusing on neuron systems, I will be able to get by with a small number of relatively uncontroversial assumptions about these contentious questions. Any complete theory of causation must say more about these issues than I will say here, just as it must be extended to cover indeterministic systems. Accordingly, the story I'll tell here is a central part of a full theory of causation, but it is not yet a complete theory.

5. I've said a bit about this in Gallow (2016), and I'll say a bit more in section 2 below, though there remains more to be said.

1 Causal Models

As I'll be using the term here,6 a causal model consists of 5 components: a collection of exogenous variables, U, an assignment of values to those variables, u, a collection of endogenous variables, V,7 a system of structural equations, one equation for each endogenous variable in V, and a specification of which variable values are more normal, typical, inertial, or default than which other variable values.8

Causal Models
A causal model M = (U, u, V, E, ≽) is a 5-tuple of
(a) An m-tuple, U = (U1, U2, . . . , Um), of exogenous variables;
(b) An assignment of values, u = (u1, u2, . . . , um), to U;
(c) An n-tuple, V = (V1, V2, . . . , Vn), of endogenous variables;
(d) A system of structural equations, E = (φV1, φV2, . . . , φVn), one equation for each endogenous variable Vi ∈ V; and
(e) A specification, ≽, of which values of each variable in U ∪ V are more default, normal, typical, or inertial than which others.

6. This terminology is somewhat idiosyncratic. Many authors do not include either ≽ or u in their definition of a 'causal model.'
7. As I understand them, variables are functions from some domain to the real line; in my view, this domain is a set of possible spacetime regions. So, a variable will tell you what its possible values are (these are just the real numbers in the image of the function). The reader may think about variables differently; but they should ensure that a causal model tells us which values each variable may take on.
8. A word on notation: variables will be denoted with uppercase italic letters (A, B, C, . . . ), while their values will be denoted with the corresponding lowercase italic letters (a, b, c, . . . ). Tuples will be indicated with boldface. I will use uppercase for a tuple of variables and lowercase for a tuple of their values. The Greek letter φ, subscripted with a variable, will stand for a function, and I will often use just ⌜φV⌝ to stand for an entire structural equation like V := φV(X, Y, Z). Throughout, I will apply set-theoretic notation to tuples of variables. Thus, U ∪ V is a tuple containing all and only the variables in either U or V, V \ X is a tuple containing all and only the variables in V, except for those in X, and so on. There will in general be many such tuples, depending upon an arbitrary choice of order. It won't matter which of these an expression like 'U ∪ V' denotes. In sections 3–6, I will use calligraphic letters (P, N) to stand for sets of directed edges.

To see how a causal model represents structures of causal determination, consider the Lewisian system of neurons shown in figure 1. Here's how to read the diagram in figure 1: for every time listed at the bottom, the neurons above it can either fire or not fire at that time. If a neuron actually fires at its designated time, then it is colored gray. Otherwise, it is colored white. The arrows represent stimulatory connections between neurons. If the neuron at the tail of the arrow fires at its designated time, then, ceteris paribus, the neuron at the head will fire at its designated time. Thus, if either B or D in figure 1 fires at t2, then E will fire at t3. The circle-headed lines represent inhibitory connections between neurons. If the neurons at their base fire, then the neurons at their head definitely won't fire. In figure 1, for instance, if C fires at t1, then B won't fire at t2, no matter whether A fires or not.

U1 : (A, C)
u1 : (1, 1)
V1 : (B, D, E)
E1 : (E := B ∨ D, D := C, B := A ∧ ¬C)
Figure 1: On the left, the neuron system Preemptive Overdetermination. On the right, the canonical model, M1, of this neuron system. (For all variables, the value 0 is default, and the value 1 is deviant.)
Parenthetically, it is not uncommon to see diagrams like these used to represent the causal scenarios described in vignettes: scenarios involving rock throwings, coffee poisonings, and the like. This is not how I will be using them here. Rather, I will be understanding these diagrams as representing hypothetical mechanical systems obeying the simple causal laws described above. These systems consist of a small number of parts, the neurons, with two potential states: being dormant, which is a neuron's inertial state, the state in which it will remain unless acted upon from without, and firing, which a neuron will only do when another neuron connected to it with a stimulatory connection fires.9 You could think of these diagrams as representing an appropriately constructed electrical circuit,10 neurons in the brain, connected with appropriate excitatory and inhibitory synapses,11 or a boring possible world containing no more than a few objects, the 'neurons', and governed by simple laws of nature specifying when these neurons will and will not fire. I'll be using these neuron systems, not as representational tools, but rather as the reality to be represented with a causal model.12

9. Some neuron systems I introduce later on will have more potential states than these. I'll explain the additional complications then.
10. Cf. Armstrong (2004, p. 446).
11. This is how Lewis (1986) thought of them (see, e.g., p. 196).
12. See Hitchcock (2007b, p. 392).

To represent the neuron system shown in figure 1, we may assign a variable to every neuron: A, B, C, D, and E. These variables take on the value 1 if their associated neurons fire at their designated times, and take on the value 0 if their associated neurons remain dormant at their designated times. (Thus, I use 'A' for both the neuron and the variable which represents whether A fires at t1. Context will disambiguate.)
Both A and C are exogenous variables: variables whose values are not causally determined by the values of the other variables in the model. Since both of those neurons fire at t1, the exogenous assignment will tell us that A = C = 1. B, D, and E will be endogenous variables: variables whose values are causally determined by the values of the other variables in the model. The structural equations in E tell us exactly how the values of the endogenous variables are causally determined. The equation E := B ∨ D tells us, firstly, that whether E fires is causally determined by whether B does and whether D does, and secondly, that E will fire iff either B or D does.13 Similarly, the equation D := C tells us that whether D fires is causally determined by whether C does, and that D will fire iff C does.

The structural equations, together with the exogenous variable assignment, allow us to solve for the value of every variable in the model. For instance, in the model M1, the structural equation B := A ∧ ¬C, together with the exogenous assignment A = C = 1, tells us that B = 0. Similarly, the structural equation D := C, together with the exogenous assignment C = 1, tells us that D = 1. And, finally, the structural equation E := B ∨ D, together with the values B = 0 and D = 1, tells us that E = 1.

Because the equations in E encode information about the direction of causal determination, we cannot re-arrange D := C to get C := D, as we could with an ordinary equation. A structural equation V := φV(U) tells us more than just that the value of V is a function, φV, of the value of U. It additionally tells us that the value of V is causally determined by the value of U, in a way that the value of U is not causally determined by the value of V. This is why we use ':=', rather than the symmetric '=', in structural equations.
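The solution procedure just described can be sketched in a few lines of Python. This is only an illustrative sketch: the dictionary encoding of the model and the names solve, M1_EXOGENOUS, and M1_EQUATIONS are assumptions of the sketch, not part of the formal framework. The connectives ∧, ∨, and ¬ are computed as min, max, and 1 − x. Each equation lists its right-hand-side variables explicitly, so the direction of determination is kept and an equation is only ever evaluated from its right-hand side to its left.

```python
# Illustrative sketch only: the encoding and all names here are assumptions.

def solve(exogenous, equations):
    """Solve an acyclic system of structural equations for every variable."""
    values = dict(exogenous)
    pending = dict(equations)
    while pending:
        # An equation can be evaluated once all of its right-hand-side variables have values.
        ready = [v for v, (parents, _) in pending.items()
                 if all(p in values for p in parents)]
        if not ready:
            raise ValueError("cyclic system or undefined parent variable")
        for v in ready:
            parents, fn = pending.pop(v)
            values[v] = fn(*(values[p] for p in parents))
    return values

# Canonical model M1 of Preemptive Overdetermination (figure 1).
M1_EXOGENOUS = {"A": 1, "C": 1}
M1_EQUATIONS = {
    "E": (("B", "D"), lambda b, d: max(b, d)),      # E := B ∨ D
    "D": (("C",), lambda c: c),                     # D := C
    "B": (("A", "C"), lambda a, c: min(a, 1 - c)),  # B := A ∧ ¬C
}

m1_values = solve(M1_EXOGENOUS, M1_EQUATIONS)
print(m1_values)  # B = 0, D = 1, E = 1, as computed in the text
```

Storing each equation as a pair of parent names and a function is one simple way to keep the asymmetry of ':=' visible in the data structure; other encodings would serve equally well.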
Given a causal model, M, we may construct a causal graph which displays the causal determination structure amongst the variables in U ∪ V, as follows: if a variable U appears on the right-hand-side of V's structural equation φV, then place a directed edge between U and V, with its tail at U and its head at V, U → V. Thus, given the causal model shown in figure 1, we may construct the following causal graph. (Note: I have additionally decorated the graph with the values the variables take on in the model.) This graph tells us that the variables A and C are exogenous, that B's value is causally determined by the values of A and C, that D's value is causally determined by the value of C, and that E's value is causally determined by the values of B and D. While it tells us by which other variables each endogenous variable is directly causally determined, the graph on its own does not tell us how the values of the endogenous variables are causally determined. For that information, we must look to the structural equations in E.

It is common to use the metaphor of genealogy to describe the causal determination relations between variables displayed in a graph. For instance, B and D are E's causal parents, and C's causal children. Similarly, B, D, and E are C's causal descendants. Throughout, I will assume that no variable is among its own causal descendants; that is, I will assume that there are no causal loops.14 I will use 'PA(V)' to denote a tuple of V's causal parents.

13. Notation: x ∧ y, x ∨ y, and ¬x are the Boolean functions min{x, y}, max{x, y}, and 1 − x, respectively.

Finally, our causal model should specify, for each variable, which values of that variable are more default, inertial, or normal than which others.
In the case of the neuron system from figure 1, I will assume that remaining dormant is the default, normal state of a neuron: it is the state in which the neuron will remain unless it is acted upon by some other, stimulatory neuron. And I will assume that firing is a more abnormal deviation from that default, inertial state. I will assume likewise for every other neuron system in this paper.15 The reader may be curious why this kind of information is included in a causal model. I'll explain in section 1.1 below.

14. I make this assumption in the interests of simplicity, not out of necessity. Local dependence (see section 4) is well-defined in cyclic models; so causal networks (see section 5) are well-defined in cyclic models; so the theory of causation I'll present in section 6 can be applied straightforwardly to cyclic models.
15. Formally, we can understand ≽ as a function from the variables V ∈ U ∪ V to a partial pre-order over their values, ≽V. If v ≽V v∗, then v is no more default, normal, or inertial than v∗ (cf. Halpern (2008, 2016) and Halpern and Hitchcock (2015)). Perhaps which variable values are more inertial than which others should be relativized to the values of some other variables in the model. Taking for granted that your food is poisoned, your death may be inertial, even though, when we don't take this for granted, death is an abnormal departure from inertial behavior (cf. Halpern (2016)). Perhaps we should further distinguish variable values which are inertial from those which are deviant, saying that, conditional on the poisoning, your death is inertial, but deviant. I'm sympathetic to these thoughts; but I'll put them aside for the nonce. We will be able to say many interesting things without worrying too much about the particulars of the default/deviant distinction.

1.1 Defaults and Deviancy

The neuron system shown in figure 1 gives a case of preemptive overdetermination. There, either A's firing or C's firing would have been enough, on its own, to make E fire. Both A and C fired, so the firing of E was overdetermined. But the overdetermination is not symmetric. Though the causal process initiated with C runs to completion, the causal process initiated with A is preempted by C's firing. A would have caused E to fire, were it not for C; but as it happens, A is merely a backup would-be cause. C, on the other hand, is a genuine cause of E's firing.

Figure 2: Short Circuit

Consider the neuron system shown in figure 2. (I follow Hall (2007) in calling this neuron system a 'short circuit'.16) There, the neuron C fires, causing B to fire; and B's firing threatens to make E fire. But, at the same time that C initiates this threat to E's dormancy, it also makes D fire. And D's firing prevents E from firing. So C both creates a threat to E's dormancy and, at the same time, neutralizes that very threat. For a case with a similar causal structure, consider:17

Boulder
Matthew hikes through the Scottish highlands. Above him, a large boulder becomes dislodged and careens down the hillside. He sees the boulder coming and jumps out of the way at the last second, narrowly escaping death.

16. See also Lewis (2004, p. 97–99), in which the same structure is called an inert network.
17. This case is attributed to an early draft of Hall (2004) by Hitchcock (2001). In assuming that Boulder and Short Circuit have similar causal structures, I am in part assuming that the boulder's fall is a deviant, non-inertial event, and that Matthew's survival is a default, inertial state.

The boulder's becoming dislodged creates a threat to Matthew's life. However, at the same time that it creates this threat, it also alerts him to its presence, causing him to jump out of the way. So the boulder both creates a threat to Matthew's life and, at the same time, neutralizes that very threat.
I take it that the boulder's becoming dislodged did not cause Matthew to survive, nor did C's firing cause E to remain dormant in the neuron system from figure 2.

As Hall (2007) notes, we may write down a system of structural equations for Short Circuit which is isomorphic to the canonical model of Preemptive Overdetermination from figure 1. Let A be a variable which takes on the value 1 if the neuron A doesn't fire, and takes on the value 0 if it does fire. Similarly, let B and E be variables which take on the value 1 if their associated neurons don't fire, and take on the value 0 if they do fire. And let C and D be variables which take on the value 1 if their associated neurons fire, and take on the value 0 if they don't. Then, the following system of equations will correctly describe the causal determination structure amongst these variables.

E := B ∨ D
D := C
B := A ∧ ¬C

E won't fire just in case either B doesn't fire or D does; D will fire just in case C does; and B won't fire just in case neither A nor C does. These are isomorphic to the equations we wrote down for the case of Preemptive Overdetermination. Moreover, the exogenous variables take on precisely the same values. In Preemptive Overdetermination, C's firing caused E to fire (that is, C = 1 caused E = 1). But, in Short Circuit, C's firing did not cause E to not fire (that is, C = 1 did not cause E = 1). So, if we wish to use causal models to determine which variable values caused which other variable values, then we will need to know more than a true system of structural equations and an assignment of values to the exogenous variables is capable of telling us.

It is natural to think of the dormancy of a neuron as a kind of default, normal, or inertial state. It is the state in which the neuron will remain unless it is acted upon by some other, stimulatory neuron. And the event of a neuron's firing is a deviation from that default, normal, inertial state.
Several authors18 have thought that this distinction, between default, normal, or inertial states and events which are abnormal, non-inertial deviations therefrom, must be incorporated into a theory of causation. And appealing to this distinction allows us to distinguish Preemptive Overdetermination from Short Circuit. For, in our model of Preemptive Overdetermination, A = 1, B = 1, and E = 1 stand for the deviant, abnormal, non-inertial events of neurons firing; while, in our model of Short Circuit, A = 1, B = 1, and E = 1 stand for the default, normal, inertial states of neurons remaining dormant. It is for this reason that a causal model includes ≽, which tells us which variable values are more deviant, abnormal, or non-inertial than which others.

No theory of causation incorporating this kind of information is complete until it provides an independent characterization of which variable values are more or less default than which others.19 However, insofar as we keep our focus on simple neuron systems, the only assumption I will need is that a neuron's remaining dormant is more default than its firing. When additional assumptions about the deviancy of a variable's values are needed, I will explicitly state them.

The focus on simple neuron systems will also allow me to get by with just one, relatively weak, assumption about when a causal model is correct. To understand this assumption, return to the neuron system shown in figure 1. To construct the causal model M1 from this neuron system, we assigned a variable to every neuron, with a value of 1 standing for the neuron firing at its designated time, and a value of 0 standing for the neuron remaining dormant at that time. The variables for the neurons without any stimulatory or inhibitory connections coming into them were made exogenous, and assigned the values corresponding to the actual state of their neurons.
We then wrote down equations describing how the state of each endogenous neuron is directly causally determined by the other neurons in the system, and we assumed that firing is a more deviant state of a neuron than remaining dormant. Let's call the causal model that we construct in this way from a given neuron system the canonical model of that neuron system. Then, the assumption I'll need about model correctness going forward is this: the canonical model of any neuron system is correct.

18. See in particular Kahneman and Miller (1986), Thomson (2003), McGrath (2005), Maudlin (2004), Hall (2007), Hitchcock (2007a), Halpern (2008, 2016), Hitchcock and Knobe (2009), Paul and Hall (2013), and Halpern and Hitchcock (2015).
19. For some attempts, see Kahneman and Miller (1986), Maudlin (2004), McGrath (2005), Hall (2007), Hitchcock (2007a), Hitchcock and Knobe (2009), and Wolff (2016).

1.2 Counterfactual Causal Models

Given a causal model M = (U, u, V, E, ≽), with some tuple of variables A ⊆ U ∪ V, we may construct a counterfactual model, in which the variables in A have been intervened upon so as to hold their values fixed at a, as follows: We remove any endogenous variables in A from the endogenous variables V, and add them to the exogenous variables, U. Next, we remove the structural equations of any endogenous variables in A from the system of structural equations E, and change the exogenous assignment u so that it assigns the values in a to the variables in A. The information in ≽ will remain unchanged.

Counterfactual Causal Model
Given a causal model M = (U, u, V, E, ≽), including the variables A, and given the assignment of values a to A, the counterfactual model M[A → a] = (U∗, u∗, V∗, E∗, ≽∗) is the model such that:
(a) U∗ = U ∪ A
(b) u∗ = u + a (see footnote 20)
(c) V∗ = V \ A
(d) E∗ = E \ (φA | A ∈ A)
(e) ≽∗ = ≽

For instance, figure 3 displays the counterfactual model M1[D → 0] in which we have intervened so as to set D's value to 0.
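The construction of M[A → a] can be sketched in Python as follows. This is a hedged illustration rather than an official implementation: the dictionary encoding and the names solve, intervene, cf_exogenous, cf_equations, and cf_values are assumptions of the sketch, and the ordering ≽ is left out because clause (e) leaves it unchanged.

```python
# Illustrative sketch only: the encoding and all names here are assumptions.

def solve(exogenous, equations):
    """Solve an acyclic system of structural equations for every variable."""
    values = dict(exogenous)
    pending = dict(equations)
    while pending:
        ready = [v for v, (parents, _) in pending.items()
                 if all(p in values for p in parents)]
        if not ready:
            raise ValueError("cyclic system or undefined parent variable")
        for v in ready:
            parents, fn = pending.pop(v)
            values[v] = fn(*(values[p] for p in parents))
    return values

def intervene(exogenous, equations, assignment):
    """Build the counterfactual model in which `assignment` is held fixed."""
    # (a), (b): the intervened variables become exogenous with the new values;
    # an already-exogenous variable simply has its value revised.
    new_exogenous = {**exogenous, **assignment}
    # (c), (d): intervened variables lose their structural equations.
    new_equations = {v: eq for v, eq in equations.items() if v not in assignment}
    return new_exogenous, new_equations

# Canonical model M1 of Preemptive Overdetermination (figure 1).
M1_EXOGENOUS = {"A": 1, "C": 1}
M1_EQUATIONS = {
    "E": (("B", "D"), lambda b, d: max(b, d)),      # E := B ∨ D
    "D": (("C",), lambda c: c),                     # D := C
    "B": (("A", "C"), lambda a, c: min(a, 1 - c)),  # B := A ∧ ¬C
}

# The counterfactual model M1[D -> 0].
cf_exogenous, cf_equations = intervene(M1_EXOGENOUS, M1_EQUATIONS, {"D": 0})
cf_values = solve(cf_exogenous, cf_equations)
print(cf_exogenous, sorted(cf_equations), cf_values["E"])
```

The intervention returns new dictionaries and leaves the original model untouched, so the actual model and any number of counterfactual models can be inspected side by side.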
Notice that, in this model, it is no longer the case that D's value is causally determined by C. Rather, D has been 'exogenized', and it has been given the exogenous assignment 0. In this new model, when we solve for the values of the variables as before, we find that E = 0.

20. Here, I use 'u + a' to refer to the result of adding the assignment a to u (if the variable from A was not already exogenous) or revising the assignment u to match a (if the variable from A was already exogenous).

U∗1 : (A, C, D)
u∗1 : (1, 1, 0)
V∗1 : (B, E)
E∗1 : (E := B ∨ D, B := A ∧ ¬C)
Figure 3: On the right, the counterfactual model M1[D → 0] (for all variables, 0 is default, and 1 is deviant). On the left, its associated causal graph.

We can use these counterfactual models to provide a semantics for causal counterfactuals.21 According to this semantics, a counterfactual "Had A taken on the values a, then it would have been that C" (where C is any Boolean combination of variable values) is true in a causal model M just in case C is true in the counterfactual model in which you've intervened so as to set A to the values a, M[A → a].

Causal Counterfactuals
If C is a proposition about the values of the variables in a causal model M, and M contains the variables in A, then the causal counterfactual A = a → C is true in M iff C is true in the counterfactual model M[A → a],22
M |= A = a → C ⇐⇒ M[A → a] |= C

Thus, because E = 0 is true in the counterfactual model M1[D → 0], the counterfactual D = 0 → E = 0 is true in the model M1.

2 Model Invariance

Like any other vehicle of representation, a causal model may be appraised for accuracy. The model tells us that the world is a certain way, and what it tells us could be either true or false. In the former case, the model is correct. In the latter case, it is incorrect.
A causal model which says that the rain is causally determined by the state of my umbrella is not correct; it gets the causal structure of the world backwards.

21. For more on this semantics, see Galles and Pearl (1998), Briggs (2012), and Huber (2013).
22. I use 'M |= S' for 'the sentence S is true in the model M'. For sentences of the form ⌜V = v⌝, and Boolean functions of these sentences, the definition of truth in a model is just what you would expect.

Amongst the correct causal models, some are more detailed, some less so. One correct model tells us that whether the match lights is causally determined by whether it is struck. Another tells us that whether the match lights is causally determined both by whether it is struck and whether there is oxygen present. Both models tell us true things about the world's causal structure, though the second tells us strictly more. Other correct causal models may tell us which variables causally determine whether the match is struck, which are causally determined by whether the match is lit, and which are causally intermediate between the match's striking and its lighting.

If we wish to theorize about causation in terms of causal models, then it is important for us to distinguish between correct and incorrect models; for it is only the verdicts issued about correct models which are commitments of our theory. Without some way of distinguishing correct models from incorrect models, a theory of causation tells us nothing at all about which variable values are token causes of which others.

For my purposes, I won't need to supply a complete account of when a causal model is correct. I will only need to endorse three, rather weak, conditions on the correctness of a causal model (viz., that the canonical model of a neuron system is correct, and the principles Exogenous and Endogenous Removal, to be introduced below).
However, just to orient the reader, let me say a few things here about what I think it takes for a causal model to represent the world correctly. On my view, in order to be correct, a causal model must entail only true counterfactuals about the values of the variables appearing in the model. If a causal model entails a false counterfactual, then the model is incorrect. But entailing only true counterfactuals is not sufficient for a model being correct; some incorrect models entail only true counterfactuals.

Return to the case of Preemptive Overdetermination from figure 1, and consider a model which contains only the variables C and E, both of which are exogenous and take on the value 1. This model tells us, truly, that E's firing is counterfactually independent of C's firing. But it also tells us, falsely, that whether E fires is causally independent of whether C fires. So this model is not correct, even though it entails only true counterfactuals. Or consider a model of Preemptive Overdetermination which contains only the variables A, C, and E, where A and C are exogenous and both take on the value 1, and a single structural equation which tells us that E := A ∨ C. This model will entail only true counterfactuals. However, in this model, the variables A and C are perfectly symmetric. So any theory of causation presented with this model will tell us that A = 1 caused E = 1 iff C = 1 caused E = 1. Since C = 1 caused E = 1 and A = 1 did not, this model cannot be correct. My diagnosis is that this too-simple model tells us, falsely, that A and C determine the value of E along non-intersecting paths.
So, on my view, causal models don't just represent patterns of counterfactual dependence between variable values; they also tell us something about the paths by which variables causally determine the values of their descendants.23 In general, my view is that a causal model tells us how each of the values of each endogenous variable, V ∈ V, are causally determined by the values of V's ancestors in the model (whether by a single path or multiple paths, and whether by independent or intersecting paths), and it tells us that those values are not causally determined by V's non-ancestors. From these facts, we can determine which variable values counterfactually depend upon which others, as described in section 1.2. So, if a model entails false counterfactuals, then the model must have told us something false. But the model tells us strictly more than those counterfactuals do. (I will expand upon this view when discussing some examples below.)

2.1 Exogenous Removal

In order to be correct, a causal model needn't include a variable for every factor which is potentially causally relevant. The model which says that whether the match lights is causally determined by whether it is struck and whether there's oxygen in the room is correct. But, so long as the oxygen is present, the variable for oxygen is not needed. We could remove it, and the causal model left behind (the one which tells us that whether the match lights is causally determined by whether it's struck) would be correct, also. (This model no longer tells us whether there's oxygen in the room, but no model will tell us everything about the world, just as no map tells us everything about where things are located. A map of London is not incorrect simply because it doesn't tell us where Sabeen's flat and the Eiffel tower are located. Likewise, a causal model is not incorrect simply because it doesn't tell us something about the values of omitted variables.) Or consider the neuron system displayed in figure 4.
The canonical model of this neuron system, M4, includes a variable for A, C, and E (with 1 corresponding to firing and 0 corresponding to not firing). Its exogenous assignment tells us that A = 1 and C = 0, and it includes the structural equation E := A ∧ ¬C. The canonical model M4 is correct; but, so long as C doesn't fire, the variable for C isn't necessary. Just as we can take the presence of oxygen for granted, so too can we take the non-firing of C for granted. So we can pluck the variable C out of this model and replace it with its actual value, 0, in the structural equation. We will be left with a model, call it 'M4^−C', which contains the sole exogenous variable A, the sole endogenous variable E, and the structural equation E := A ∧ ¬0, or just E := A.

[Figure 4: Omission. Figure 5: Prevention.]

23. Again, this is my diagnosis of why the model is not correct; but if the reader disagrees with it, this disagreement won't make any difference to anything else I have to say here. My goal in this section is just to defend the principles Exogenous and Endogenous Removal (see below). And you can accept these principles while disagreeing with me about why this simple model of Preemptive Overdetermination is incorrect.

In general, if M = (U, u, V, E, ≽) is a causal model with the exogenous variable U ∈ U, then let M^−U be the model that you get by (a) removing U from U; (b) removing U's value from u; (c) 'exogenizing' any variables in V whose only parent was U;24 (d) replacing U with its value in every structural equation in E; and (e) removing information about the deviancy of U's values from ≽. In my view, removing an exogenous variable from a correct causal model in this way will not always leave a correct causal model behind. For instance, consider the neuron system in figure 5. This is just like the neuron system from figure 4, except that, in figure 5, C fires, and therefore, E doesn't.
The canonical model of this neuron system, M5, will be exactly like M4, except that the exogenous assignment will tell us that C = 1, rather than C = 0. In my view, this makes a difference with respect to whether the variable C can be ignored. For if we try to replace C with its actual value in M5, we will be left with the structural equation E := A ∧ ¬1, which is a constant function of A. Whether A is 0 or 1, E will take on the value 0. This equation tells us, falsely, that E and A are causally independent. So the model M5^−C is not correct, even though M5 is. So removing an exogenous variable does not always preserve correctness.

24. 'Exogenizing' a variable V ∈ V means (a) moving V from V to U; (b) enriching the exogenous assignment u so that it assigns V the value it takes on in the original model M; and (c) removing V's structural equation from E.

In my view, in order for a structural equation V := φV(PA(V)) to be correct, it must tell us how each of V's values are causally determined by the values of V's causal parents. So φV must be a surjective function of all of the right-hand-side variables. That is: for every value v of the left-hand-side variable V, there must be some assignment of values to the right-hand-side variables PA(V) which gets mapped to v by the function φV. If φV is not surjective, then the structural equation for V cannot tell us how each of V's values could be causally determined by the values of V's parents. So, if φV is not surjective, then the structural equation for V cannot be correct.25 Additionally: on my view, a structural equation φV tells us that the left-hand-side variable V has its value causally determined by all of the right-hand-side variables. So φV must be a function of all of V's causal parents Pa ∈ PA(V).
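The contrast between the two reduced models can be made concrete with a small check. This is only a sketch under stated assumptions: variables are binary, a structural equation is represented as a Python function from a dictionary of parent values to a value, and the helper names (`is_surjective`, `depends_on_all`, `remove_exogenous`, `c_is_inessential`) are hypothetical, introduced here for illustration.

```python
from itertools import product

VALUES = (0, 1)  # assumption: every variable is binary

def is_surjective(phi, parents):
    """Every value of the left-hand-side variable is produced by some parent assignment."""
    outputs = {phi(dict(zip(parents, combo)))
               for combo in product(VALUES, repeat=len(parents))}
    return outputs == set(VALUES)

def depends_on_all(phi, parents):
    """For each parent, some setting of the other parents makes the output vary with it."""
    for p in parents:
        others = [q for q in parents if q != p]
        varies = any(
            len({phi({**dict(zip(others, combo)), p: v}) for v in VALUES}) > 1
            for combo in product(VALUES, repeat=len(others))
        )
        if not varies:
            return False
    return True

def remove_exogenous(phi, parents, u, u_value):
    """Replace the exogenous variable u with its actual value in one equation."""
    remaining = [p for p in parents if p != u]
    return (lambda assignment: phi({**assignment, u: u_value})), remaining

def e_equation(s):
    """E := A ∧ ¬C, the structural equation shared by figures 4 and 5."""
    return int(s["A"] and not s["C"])

def c_is_inessential(c_value):
    """Check both conditions on the equation left behind after plucking out C."""
    phi, parents = remove_exogenous(e_equation, ["A", "C"], "C", c_value)
    return is_surjective(phi, parents) and depends_on_all(phi, parents)

print(c_is_inessential(0))  # figure 4: the reduced equation is E := A
print(c_is_inessential(1))  # figure 5: the reduced equation is the constant 0
```

With C = 0 the reduced equation passes both checks; with C = 1 it is constant, so it is neither surjective nor dependent on A.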
To spell out this second requirement: for each Pa ∈ PA(V), there must be some assignment of values to the other variables in PA(V) such that, when they take on those values, the value V takes on depends upon which value Pa takes on. So, if φV is not a function of all of the right-hand-side variables, then the structural equation cannot be correct.

In general, if U is exogenous in M, and if every structural equation φV in M^−U is both (a) a surjective function and (b) a function of all of V's remaining causal parents, then I will say that U is an inessential exogenous variable in M.26 Though removing exogenous variables will not always preserve correctness, I believe that removing inessential exogenous variables will. That is, I believe we should endorse the following principle.27

Exogenous Removal. If a causal model M = (U, u, V, E, ≽) is correct, and U ∈ U is inessential, then M^−U is also correct.

25. Or so it seems to me. You may not agree that structural equations must be surjective. If so, this shouldn't prevent you from accepting anything else I have to say here. By imposing this requirement, I strengthen the antecedent of my principle Exogenous Removal (see below). Strengthening the antecedent weakens the conditional. I think that this weakening is necessary; but, even if you think the principle is weaker than it needs to be, this is no reason for you to worry about its truth. (Readers who worry about this surjectivity requirement should also note that it is not required at any point in the proof of my theory's model-invariance in appendix A; so the theory would still be model-invariant even if Exogenous Removal were strengthened. Indeed, the theory would be model-invariant even if we say that every exogenous variable is inessential.)

26. To be clear: V's remaining causal parents in M^−U are just V's causal parents in M, minus U.

27. To be clear: I think that Exogenous Removal is a substantive claim; you could very well disagree with me about it (if, for instance, you thought that removing any exogenous variable with a deviant value wouldn't leave a correct model behind); Exogenous Removal is not an implicit partial definition of what I mean by 'correctness'.

2.2 Endogenous Removal

In order to be correct, a causal model need not include a variable for every factor which is causally intermediate between two variables. Whether the room is illuminated is causally determined by whether the switch is up. There are ever so many variables causally intermediate between these two: whether current is flowing, whether the filament in the bulb is heated, etc. Nevertheless, a model which omits them all is still correct. So, just as we may remove inessential exogenous variables from a causal model, so too may we remove inessential endogenous variables.

Consider again the model M1, shown in figure 1. This model tells us that whether E fires is determined by whether D does, and that whether D does is determined by whether C does. Here, the variable for D is not necessary. We could pluck it out of the model by replacing it with the right-hand-side of its structural equation, C, wherever it appears. We will be left with a model, call it 'M1^−D', which contains the following system of structural equations.

E := B ∨ C
B := A ∧ ¬C

This model won't tell us how D fits into the causal determination structure of the neuron system, but it tells us about the causal determination structure amongst the variables A, B, C, and E, and what it tells us about them is all correct.

In general, if M = (U, u, V, E, ≽) is a causal model with the endogenous variable V ∈ V, then let M^−V be the model that you get by (a) leaving U and u alone; (b) removing V from V; (c) removing V's structural equation V := φV(PA(V)) from E; (d) replacing V with φV(PA(V)) wherever V appears on the right-hand-side of a structural equation in E; and (e) removing information about V from ≽.

In my view, removing an endogenous variable from a correct causal model in this way will not always leave a correct causal model behind. As with exogenous variables, removing some endogenous variables won't leave behind surjective, functional structural equations. Those variables are not inessential. But they are not the only ones. Consider again the model M1^−D. If we pluck the variable B out of this model in the manner specified above, we will arrive at a model, M1^−D,−B, which contains the sole structural equation E := (A ∧ ¬C) ∨ C, or just E := A ∨ C, and the exogenous assignment A = C = 1. This model treats the variables A and C symmetrically; yet A and C differ causally. So the causal model M1^−D,−B cannot be correct. As I remarked above, in my view, this is because M1^−D,−B tells us that A and C causally determine the value of E along non-intersecting paths, which is not true.

Suppose that, in M, V has a single parent, Pa, and a single child, Ch,

Pa → V → Ch

and suppose that Pa is not also a parent of Ch. If that's so, then say that V is an interpolated variable in M.28 If V is interpolated, then I'll say that it is an inessential endogenous variable. Though removing endogenous variables will not always preserve the correctness of a causal model, I believe that removing inessential endogenous variables will. That is, I think we should endorse the following principle.

Endogenous Removal. If a causal model M = (U, u, V, E, ≽) is correct, and V ∈ V is inessential, then M^−V is also correct.

2.3 Model-Invariance

We want a theory which will tell us whether two variable values, C = c and E = e, are causally related; and we wish to formulate that theory within the framework of causal models. (Notation: throughout, I will use 'C' and 'E' for the cause and effect variables of interest, and I will use 'c' and 'e' for the actual values of C and E.)
This theory will say whether C = c caused E = e relative to a given causal model. For an arbitrary C and E, there will be a great many correct causal models containing both C and E. It would be nice if our theory did not require us to survey them all. It would be nice if it said whether C caused E relative to a single causal model, and if its verdicts did not change from correct model to correct model.29 That is, it would be nice if our theory satisfied the following constraint.30,31

Model Invariance. For any two causal models M and M′ which both contain the variables C and E, if both M and M′ are correct, then C = c caused E = e in M iff C = c caused E = e in M′.

28. Note that, if V is interpolated, then all of the equations in M^−V will automatically be surjective functions of all of their right-hand-side variables, so long as all of the equations in M are.

29. Of course, in order for a theory of causation to tell us whether C = c caused E = e, we will have to provide it with a correct model which contains the variables C and E. There will be many correct models which don't contain C or E. If you want to know whether C caused E, those other models are like maps of Paris when you're lost in London: they're not inaccurate, just unhelpful.

30. There are alternatives to accepting Model Invariance. In general, a theory of causation formulated with causal models will specify when a causal model is a witness to C = c causing E = e. We might go on to say that C = c caused E = e iff there is some witness to C = c causing E = e (and therefore, C = c didn't cause E = e iff there is no witness). Or we might say that C = c caused E = e iff all correct causal models are witnesses to C = c causing E = e (and therefore, C = c didn't cause E = e iff some correct causal model fails to witness C = c causing E = e). The first alternative makes it easy to establish causation but difficult to establish non-causation (we must establish non-causation in all of the correct causal models). Likewise, the second makes it easy to establish non-causation, but difficult to establish causation. Model-invariance makes it easy to establish causation and non-causation both.

31. Cf. Halpern (2016, §4.4), who shows that his theory of causation will not reverse its verdicts of non-causation as endogenous variables are removed, though it may reverse its judgments of causation. (Note that this result requires strong assumptions about normality. Given the assumption that the dormancy of a neuron is default, while the firing of a neuron is deviant, Halpern's theory will reverse its verdicts about non-causation as well. See Gallow (ms).)

Let's call a theory of causation which is consistent with the principles Model Invariance, Exogenous Removal, and Endogenous Removal a model-invariant theory of causation.32 If the theory is inconsistent with these principles, then let's say that it is a model-variant theory of causation. It would be nice to have a model-invariant theory. If our theory is model-invariant, then, when we ask whether C = c caused E = e, we needn't worry about our causal verdicts changing as we include additional variables lying along, or feeding into, paths from C to E. Nor need we worry about the theory being shielded from refutation by ad hoc choices about which variables to include and which to ignore.

Unfortunately, almost all of the extant theories of causation situated in the framework of causal modelling are model-variant. In particular, the accounts of Hitchcock (2001, 2007a), Halpern and Pearl (2001, 2005), Woodward (2003), Halpern (2008, 2016), Weslake (forthcoming), and Andreas and Günther (forthcoming a, forthcoming b) will all reverse or suspend their verdicts when inessential variables are removed from a causal model.33

32. Notice that a theory's verdicts about causation will be preserved when inessential variables are removed iff that theory's verdicts about non-causation are preserved when inessential variables are added. And a theory's verdicts about non-causation will be preserved when inessential variables are removed iff that theory's verdicts about causation are preserved when inessential variables are added. So, if we are able to show that a theory's verdicts don't change as inessential variables are removed, we will have also thereby shown that its verdicts don't change as inessential variables are added.

In sections 3–6, I will introduce a theory of causation which is model-invariant. If this theory says that C = c caused E = e in a causal model M, then it will continue to say this after any inessential variables are removed from M. And, if the theory says that C = c didn't cause E = e in a causal model M, then it will continue to say this after any inessential variables are removed from M. I will build the theory up by walking through some standard cases from the literature: symmetric overdetermination (section 3), preemptive overdetermination (section 4), and counterexamples to transitivity (section 5). According to this theory, a cause must be connected to its effect by what I will call a 'causal network'. In rough outline, a causal network represents an uninterrupted process, each stage of which depends upon its predecessors, and which transmits the cause's deviant, non-inertial behavior to the effect. The definition of a causal network will be developed in section 5. Then, in section 6, I will consider some additional cases and suggest that, if we should understand causation in terms of causal networks, then we should understand a cause as something which transmits deviant or non-inertial behavior to its effect.

3 Symmetric Overdetermination

A simple case of symmetric overdetermination is shown in figure 6. Either A or C's firing would have been enough, on its own, to make E fire. Both A and C fired, so the firing of E was overdetermined, and symmetrically so.
There's nothing that A's firing has that C's firing lacks; nor anything C has that A lacks. If either of them caused E to fire, then both of them did. For another case with a similar structure, consider Pay Raise.34

[Figure 6: Symmetric Overdetermination]

Pay Raise. Franny, Sammy, and Tammy vote on a proposal to raise legislators' salaries. The proposal requires two out of three votes in order to pass. All three vote in favor, and the proposal passes.

33. See Gallow (ms).

34. Cf. Livengood (2013). Note: when I say that Pay Raise has a similar causal structure, I am in part assuming that the 'yea' votes and the proposal's passing are deviant. (Of course, the causal structure is similar, not exactly the same. In Symmetric Overdetermination, E would still have fired, even if either A or C had not fired; and in Pay Raise, the proposal would still have passed, even if either Franny, Sammy, or Tammy had not voted 'yea'.)

The passing of the proposal was overdetermined by the three votes in favor, and symmetrically so. There's nothing that any one vote has that the others lack. If any vote caused the motion to pass, then all of them did.

In cases like these, the effect is overdetermined. The world supplied more than enough for the effect to obtain. There is some appeal to the idea that the world did this by supplying more than enough causes; that is, there is some appeal to the idea that each of the overdeterminers are individually causes of the effect. For instance: C individually caused E to fire; and Franny individually caused the proposal to pass. At the same time, there is some appeal to the idea that C's firing didn't all by itself cause E to fire, and that Franny didn't all by herself cause the proposal to pass.
Perhaps she is a part of a cause, and perhaps she contributed to the proposal's passing, but, we may think, she did not cause it to pass all by herself, given that the proposal would have had a two-vote majority even without her support.

Mackie (1965)35 and Lewis (1986)36 were both happy with the judgment that C's firing did not cause E to fire in figure 6. According to both, in cases of symmetric overdetermination, intuition is split and a theory of causation could reasonably answer with either verdict. I agree with Mackie and Lewis.37 An adequate theory of causation needn't say that C's firing caused E to fire. However, it should not say that E's firing was uncaused. If neither A nor C individually causes E to fire, then they must do so jointly. I will formally represent A and C's jointly causing E to fire by allowing not just individual variable values, but also tuples of variable values, to be causes. In the canonical model M6, to say that A's firing and C's firing jointly caused E to fire is to say that (A, C) = (1, 1) caused E = 1.38 My theory will not say that C's firing individually caused E to fire.

35. "Our ordinary concept of cause does not deal clearly with cases of this sort." (Mackie, 1965, p. 251).

36. "Such cases can be left as spoils to the victor, in D. M. Armstrong's phrase. We can reasonably accept as true whatever answer comes from the analysis that does best on the clearer cases." (Lewis, 1986, p. 194)

So I will take the lesson of symmetric overdetermination to be this: we should want a theory to tell us more than when an individual variable value C = c caused another variable value E = e. We should also want it to tell us when some collection of variable values, C = c, caused a variable value E = e.
That is: we should want a theory not just of individual causation, but of joint causation as well.39 Throughout, by the way, I will draw no distinction between a variable, V, and a 1-tuple containing that variable, (V); nor will I distinguish between a variable value, V = v, and a 1-tuple variable value, (V) = (v). This conflation allows a theory of joint causation to cover individual causation as a special case.

Once we allow tuples of variables to be causes, we should generalize Model Invariance. So generalized, the principle will tell us that, if both M and M′ are correct and contain the variables in C ∪ (E), then C = c caused E = e in M iff C = c caused E = e in M′. This is how I will understand the principle, and the corresponding property of model-invariance, from here on out.

Are joint causes causes simpliciter? Did Franny cause the proposal to pass? We could go either way. While the formalism will distinguish causes which are 1-tuples from causes which are n-tuples, for n > 1, we could decide to interpret this formalism by saying that, if some n-tuple C caused E, then each C ∈ C counts as a cause of E in its own right. Or we could decide to say that each C ∈ C is merely part of a cause, and distinguish joint from individual causation. My own inclination is to say that neither Franny nor Sammy individually caused the proposal to pass, even though, together, they did; but if the reader balks at this, they should feel free to go the other way.

37. This view is increasingly unpopular. Halpern and Pearl, Hitchcock, Woodward, and Weslake, inter alia, take it as a desideratum of a theory of causation that it say that C's firing caused E to fire all by itself. See also the arguments in Schaffer (2003).

38. '(A, C)' is a pair whose first component is the variable A and whose second component is the variable C; '(1, 1)' is a pair whose first and second components are the value 1. '(A, C) = (1, 1)' thus says that A = 1 and C = 1.

39. We could try to generalize further by asking when one tuple of variable values, C = c, caused another, E = e. From my perspective, allowing collections of variable values to be effects in this way does not purchase any additional generality; for I am inclined to say that C = c caused E = e iff C = c caused Ei = ei, for each Ei ∈ E and its corresponding value ei ∈ e.

4 Preemptive Overdetermination

The neuron system shown in figure 1 provides a case of preemptive overdetermination. For another case with a similar causal structure, consider Tax Cut.40

Tax Cut. The proposal to lower corporate taxes requires one more vote to pass. Tammy's constituents will be angry if she votes in favor, but it is important to her campaign contributors that the proposal pass, so she is prepared to deal with the constituents' ire if her vote is needed. Fortunately for Tammy, Sammy votes 'yea', the proposal passes by a single vote, and Tammy is free to vote 'nay'.

The proposal's passing was overdetermined: the corporate donors bought more than enough influence. Still, the overdetermination is not symmetric. Though the causal process initiated with donations to Sammy runs to completion, the causal process initiated with donations to Tammy is preempted by Sammy's voting 'yea'. Tammy would have caused the proposal to pass, were it not for Sammy; but, as it happens, Tammy is merely a backup would-be cause of the proposal's passing. Sammy, on the other hand, is a genuine cause of the proposal's passing.

Preemptive overdetermination serves as a counterexample to a simple counterfactual theory of causation which says that counterfactual dependence is both necessary and sufficient for causation. Consider the canonical model of the neuron system from figure 1, M1. In that model, it is not true that, had C not fired, E wouldn't have fired. For, had C not fired, B would have, and E would have fired all the same.
(In the counterfactual model M1[C → 0] in which we intervene so as to set C's value to 0, E takes on the value 1.) But C's firing caused E to fire. So counterfactual dependence is not necessary for causation.

40. Note: when I say that Tax Cut has a similar causal structure, I assume that the corporate donations, the 'yea' votes, and the proposal's passing are all deviant.

Lewis (1973) dealt with cases of preemptive overdetermination by taking causation to be, not counterfactual dependence, but rather the ancestral, or the transitive closure, of counterfactual dependence. While E's firing doesn't counterfactually depend upon C's firing directly, it does counterfactually depend upon D's firing, and D's firing counterfactually depends upon C's firing. So Lewis says that C's firing caused E to fire.

This Lewisian transitivity maneuver allows us to correctly say that, in the model M1, C's firing caused E's firing. Unfortunately, if we straightforwardly import the Lewisian maneuver into the framework of causal models, the resulting account will be model-variant. For suppose we remove the variable D from M1, in the manner described in section 2. We will get the causal model, M1^−D, in which there is no variable intermediate between C and E.

E := B ∨ C
B := A ∧ ¬C

Even though, given the causal model M1, a Lewisian theory will say that C = 1 caused E = 1, given the model M1^−D, it will say that C = 1 didn't cause E = 1. So the theory will be model-variant.

The treatment of preemptive overdetermination favored by almost every author in the causal modeling literature41 appeals to either A or B. Though E = 1 does not counterfactually depend upon C = 1 in the model M1, it does counterfactually depend upon C = 1 in the counterfactual model where we've intervened so as to hold B fixed at its actual value of 0; that is, M1[B → 0] |= C = 0 → E = 0.
Likewise, E = 1 counterfactually depends upon C = 1 in the counterfactual model M1[A → 0]. And according to these authors, counterfactual dependence in counterfactual models like these is sufficient to show that C = 1 caused E = 1.

41. See, in particular, Halpern and Pearl (2001, 2005), Hitchcock (2001), Woodward (2003), Halpern (2008, 2016), and Weslake (forthcoming). See Yablo (2002, 2004) for similar ideas. Andreas and Günther (forthcoming a) have a different treatment of preemptive overdetermination which also appeals to the variable B. (Beckers and Vennekens (2017, 2018) have a radically different treatment of preemptive overdetermination; according to them, preemptive overdeterminers are not causes.)

No solution which appeals to the variables A or B in this way will be model-invariant. For note that the exogenous variable A is inessential in M1. So, by Exogenous Removal, we may pluck it out, and we will be left with a model, M1^−A, in which the endogenous variable B is (now) inessential.

E := B ∨ D
D := C
B := ¬C

Since B is inessential, Endogenous Removal tells us that we may pluck it out. Doing so leaves us with a model, M1^−A,−B, in which neither A nor B appears.

E := ¬C ∨ D
D := C

So, if we want our theory of causation to be model-invariant, then we will want a treatment of preemptive overdetermination which does not require the variables A or B.

Return to the causal model M1^−D. For a moment, ignore the structural equation for B, focus just on E's structural equation, and treat this isolated structural equation as if it were a causal model unto itself, what we can call the local model at E.

E := B ∨ C

Notice that, in the local model, there will be counterfactual dependence between E = 1 and C = 1. Since this is so, I'll say that E = 1 locally counterfactually depends upon C = 1.
In general, given a causal model M = (U, u, V, E, ≽), with E ∈ V, let's define the local model at E, which we can write 'M(E)', to be the causal model in which (a) the exogenous variables are just the parents of E, PA(E), in the original model M; (b) the exogenous variables PA(E) are assigned whatever values they take on in M; (c) the sole endogenous variable is E; (d) the sole structural equation is E's structural equation in M; and (e) the information about the deviancy of E and PA(E)'s values is the same as in M. Then, we may say that, in the model M, E = e, rather than e∗, locally counterfactually depends upon C = c, rather than c∗, iff, in the local model at E, M(E):

M(E) |= C = c∗ → E = e∗

In contrast, if

M |= C = c∗ → E = e∗

then I will say that E = e, rather than e∗, globally counterfactually depends upon C = c, rather than c∗, in the model M.42 (If C is a causal parent of E and there is only one path leading from C to E, then there won't be any difference between local and global dependence; in those cases, I will allow myself to say simply: 'E = e, rather than e∗, depends upon C = c, rather than c∗'.) To properly classify C = 1 as a cause of E = 1 in M1^−D, I will suggest that we focus on local, as opposed to global, counterfactual dependence.

Turning our attention to local counterfactual dependence may help with M1^−D, but it will not, on its own, help us to say that C's firing caused E's firing in the canonical model M1. For in this model, E's firing does not locally counterfactually depend upon C's firing (the variable for C is not even included in the local model M1(E)). I believe that we should handle this case roughly as Lewis (1973) did: by focusing, not on local dependence, but rather on something like the transitive closure of local dependence. However, there are a number of counterexamples to the thesis that a chain of dependence is sufficient for causation.
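The difference between local and global dependence in M1 and in the model with D removed can be worked through mechanically. What follows is a rough sketch rather than a definitive implementation. It assumes binary variables, so that 'E takes a different value' stands in for 'E takes the contrast value e∗'; it ignores the deviancy ordering ≽; and the representation of a model as a dictionary of (parents, function) pairs, together with the helper names, is hypothetical.

```python
def solve(equations, exogenous, interventions=None):
    """Solve an acyclic model. equations maps each endogenous variable to (parents, phi)."""
    interventions = interventions or {}
    values = {**exogenous, **interventions}
    pending = [v for v in equations if v not in interventions]
    while pending:  # assumes acyclicity and that every parent is defined somewhere
        for v in list(pending):
            parents, phi = equations[v]
            if all(p in values for p in parents):
                values[v] = phi(*(values[p] for p in parents))
                pending.remove(v)
    return values

def globally_depends(equations, exogenous, cause, alt, effect):
    """Does the effect change when we intervene to set the cause to alt?"""
    actual = solve(equations, exogenous)
    counterfactual = solve(equations, exogenous, {cause: alt})
    return actual[effect] != counterfactual[effect]

def locally_depends(equations, exogenous, cause, alt, effect):
    """Same question, asked in the local model at the effect."""
    parents, phi = equations[effect]
    if cause not in parents:
        return False  # the cause variable does not appear in the local model
    actual = solve(equations, exogenous)
    local_exogenous = {p: actual[p] for p in parents}
    return globally_depends({effect: (parents, phi)}, local_exogenous, cause, alt, effect)

EXO = {"A": 1, "C": 1}
M1 = {
    "B": (("A", "C"), lambda a, c: int(a and not c)),  # B := A ∧ ¬C
    "D": (("C",), lambda c: c),                        # D := C
    "E": (("B", "D"), lambda b, d: int(b or d)),       # E := B ∨ D
}
M1_MINUS_D = {
    "B": (("A", "C"), lambda a, c: int(a and not c)),  # B := A ∧ ¬C
    "E": (("B", "C"), lambda b, c: int(b or c)),       # E := B ∨ C
}

print(globally_depends(M1, EXO, "C", 0, "E"))         # no global dependence in M1
print(locally_depends(M1_MINUS_D, EXO, "C", 0, "E"))  # local dependence once D is removed
print(locally_depends(M1, EXO, "D", 0, "E"),
      locally_depends(M1, EXO, "C", 0, "D"))          # the chain of local dependence in M1
```

Under these assumptions, E = 1 does not globally depend upon C = 1 in either model, it locally depends upon C = 1 only after D is removed, and in M1 the dependence runs through the chain from C to D to E.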
Let's turn to those counterexamples now.

42. Of course, in order for these dependence claims to be true, it must also be that C = c ∧ E = e. Throughout, I am using 'c' and 'e' for the actual values of C and E. I will say more about the contrastive 'rather than' clauses in section 5.1 below.

5 Causal Networks

Suppose you've traced out a sequence of states or events, where each state or event in the sequence depends upon its predecessor. When can you go on to conclude that the state or event at the start of the sequence caused the one at the end? Lewis gave the answer: 'always'. This answer allowed him to deal with cases of preemptive overdetermination, but it came at a cost. Chris smokes, contracts cancer, undergoes chemo, and survives. The survival depends upon the chemo; the chemo depends upon the cancer; and the cancer depends upon the smoking. Lewis concludes that smoking caused Chris to survive. This is difficult to swallow, no matter how it's seasoned. The answer to give is 'sometimes, but not always', and the difficulty lies in working out just when.

In this section, I will try to lay down conditions specifying when a directed path running from C to E, P: C → D1 → D2 → ⋯ → DN → E, is what I will call a causal path. Actually, I will try to do something slightly more general. In section 3, I explained that I will provide a theory of causation which allows tuples of variable values to be causes. But there won't be a single directed path from a tuple of variables C to an effect variable E. So I will begin by generalizing the notion of a directed path (I'll call the generalization a network) and then I'll try to lay down conditions specifying when a network from C to E is what I will call a causal network. My theory will say that causal networks are necessary for causation: if C's values are to be a cause of E's, then there must be a causal network leading from C to E.

First, let me explain what I mean by a network.
We may think of a directed path, P, from C to E, as a collection of directed edges generated by the following procedure: begin with C, and select exactly one of its causal children, D, to be its P-child. Then, include the directed edge between C and D, C → D, in P. Next, select exactly one of D's causal children to be its P-child, and proceed in this manner until you reach E. Now, we can define what I will call a network, N,43 from the sequence of variables C to E, as a collection of directed edges generated by the following procedure: begin with each variable C ∈ C, and select some of its causal children, D1, D2, ..., DN (you needn't choose just one), to be its N-children.44 Next, for each of the Di, select some of their causal children to be their N-children, and proceed in this manner until E is the only variable lying in N without an N-child. That is, a network from C to E is just a union of directed paths from some C ∈ C to E, where, for each C ∈ C, there is some directed path leading from C to E included in the union. For instance, in M6, A → E ← C is a network from (A, C) to E, and in M1, C → B → E ← D ← C is a network from C to E (remember, I don't distinguish between the variable C and the 1-tuple (C)). Note that every path is a network, though not every network is a path.

43. Cf. Hitchcock (2007a).

44. Terminology: if there is a directed edge C → D in a network N, then I say that D is one of C's N-children, and that C is one of D's N-parents. Note that being one of D's N-parents is not the same as being a parent of D lying in the network N. Consider the network N : C → B → E in the model M1^{−D}. C is a parent of E lying in N, but C is not one of E's N-parents.

[Figure 7, panels (a) and (b)]

To reiterate: in this section, I will be trying to lay down conditions specifying when a network is causal. And according to the theory I'll present in section 6, causal networks are necessary for causation.
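The definition of a network, and the distinction drawn in footnote 44 between a variable's N-parents and its parents lying in N, can be made concrete with a small sketch. It is offered only as an illustration: the graph below contains just the three edges that footnote 44's example requires (C → B, B → E, and C → E), the graph is assumed to be acyclic, and the function names are hypothetical.

```python
# Illustrative graph: only the edges needed for footnote 44's example are included.
GRAPH = {("C", "B"), ("B", "E"), ("C", "E")}


def is_network(edges, sources, target, graph):
    """Check that `edges` is a union of directed paths in an acyclic `graph` from sources to target."""
    edges = set(edges)
    if not edges or not edges <= graph:
        return False
    nodes = {v for edge in edges for v in edge}
    children = {v: {c for (p, c) in edges if p == v} for v in nodes}
    # The target must be the only variable in the network without a child in the network.
    if any((not children[v]) != (v == target) for v in nodes):
        return False
    # Every source must lie in the network, and every variable must be reachable from a source.
    if not set(sources) <= nodes:
        return False
    reached, frontier = set(sources), list(sources)
    while frontier:
        for child in children[frontier.pop()]:
            if child not in reached:
                reached.add(child)
                frontier.append(child)
    return reached == nodes


def network_parents(edges, var):
    """The N-parents of `var`: variables with an edge into `var` that belongs to the network."""
    return {p for (p, c) in edges if c == var}


def parents_lying_in_network(edges, var, graph):
    """Causal parents of `var` in the graph that appear somewhere in the network."""
    nodes = {v for edge in edges for v in edge}
    return {p for (p, c) in graph if c == var and p in nodes}


path = [("C", "B"), ("B", "E")]  # the network C → B → E
print(is_network(path, ["C"], "E", GRAPH))
print(network_parents(path, "E"), parents_lying_in_network(path, "E", GRAPH))
```

In this example, C → B → E counts as a network from C to E; B is E's only N-parent, while both B and C are parents of E lying in the network.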
In order for C to be a cause of E, there must be a causal network leading from C to E. In these terms, a Lewisian view says that a network N is causal whenever the value of each variable in N depends upon the values of its N-parents. I believe that we should impose additional constraints on a network being causal. I'll introduce these constraints by surveying some representative counterexamples to this Lewisian view.

5.1 Causal Networks and Contrasts

One class of counterexamples to the Lewisian view is well illustrated by the neuron system shown in figure 7.45 In this neuron system, the octagonal neurons A and B are special. They can either fire weakly (indicated with light grey coloring) or strongly (indicated with dark grey). The connection between C and B is a special kind of inhibitory connection: if the neuron at its base fires, then this will diminish the strength with which the neuron at its head would otherwise have fired. So, e.g., if A fires strongly and C doesn't fire, as in figure 7b, then B will fire strongly; but if A fires strongly and C fires, as in figure 7a, then B will only fire weakly. Neuron E is a regular neuron, so if B fires, no matter whether weakly or strongly, E will fire. In figure 7a, E's firing (rather than not) depends upon B's firing weakly (rather than not firing). And B's firing weakly (rather than strongly) depends upon C's firing (rather than not). But C's firing did not cause E to fire. So this neuron system provides a counterexample to the Lewisian view that causation is the transitive closure of dependence.

45. Cf. Paul and Hall (2013, figure 17), and also Lewis (1986, p. 210).

For another case with a similar structure: a dog bites Michael's right hand. With his right hand on the mend, Michael uses his left hand to hail a taxi.
The taxi's stopping depends upon Michael's hailing the taxi with his left hand (rather than not hailing the taxi), and Michael's hailing the taxi with his left hand (rather than his right) depends upon the dog bite. But the dog bite did not cause the taxi to stop.46 I follow Maslen (2004) and Schaffer (2005) in thinking that cases like these illustrate the importance of paying attention to contrasts in chains of dependence.47 There is a difference between saying that (a) E = e, rather than e∗, depends upon C = c, rather than c∗, and saying that (b) E = e, rather than e∗∗, depends upon C = c, rather than c∗, or that (c) E = e, rather than e∗, depends upon C = c, rather than c∗∗. The first claim, (a), is made true by a counterfactual C = c∗ → E = e∗; the second, (b), is made true by a counterfactual C = c∗ → E = e∗∗; and the third, (c), is made true by a counterfactual C = c∗∗ → E = e∗.

The lesson of figure 7 is this: in order for a network to be causal, it is not enough that the value of each variable in the network depend upon the value of its parents in the network. The relevant contrasts also have to 'match up'. As a preliminary account, we may say:

Causal Network (preliminary) A network N, from C to E, is a causal network only if there is an assignment of contrasts to the variables in N such that:
(a) E's contrast is distinct from its value; and
(b) for each D ∉ C in the network, D's value, rather than its contrast, locally depends upon D's N-parents' values, rather than their contrasts.48

And our preliminary theory is that C = c caused E = e only if there is a causal network leading from C to E. Note that there is no one contrast we could assign to B in figure 7a such that E's firing, rather than not, depends upon B's firing weakly, rather than that contrast; and such that B's firing weakly, rather than that contrast, depends upon C's firing, rather than not. So C → B → E is not a causal network, and our preliminary theory tells us that C's firing was not a cause of E's firing.

46. See McDermott (1995), as well as the counterexamples to transitivity discussed in Paul (2004).

47. See Hitchcock (1996b,a) and Schaffer (2012a) for more on contrasts in causal claims.

48. Recall: there is a difference between a variable's N-parents and its causal parents lying in N. See fn 44.

Note that, because we require the contrasts to 'match up', once we have chosen contrasts for the variables in C, the choice of every other contrast is out of our hands. Pick any D ∉ C in the network, let P be its N-parents, and let p∗ be their contrasts. Then, clause (b) tells us that D's contrast must be the value d∗ such that P = p∗ → D = d∗ is true in the local model at D. There will only be one such d∗, so we have no choice about what contrast to assign to D. (D was arbitrary, so the same goes for every variable in the network, except for those in C.)

Paying attention to contrasts has other benefits, as well. For instance, it allows us to handle cases of trumping preemption.49 Suppose that the troops always follow the orders of the highest ranked officer. The Major and the Sergeant both order the troops to advance, and they advance. Since the Major outranks the Sergeant, it is natural to want to say that it was the Major, and not the Sergeant, who caused the troops to advance. Use a variable, M, to represent the Major's orders. Let M take on the value 2 if the Major orders to advance, 1 if he orders to stay put, and 0 if he gives no order at all. Similarly, use the variable S for the Sergeant's orders. S is 2 if the Sergeant orders to advance, 1 if he orders to stay put, and 0 if he gives no orders at all. And, finally, use a variable, A, for whether the troops advance. A = 2 if they advance, and A = 1 if they do not.
I'll assume that the structural equation A := φA(M, S) is correct, where

\[
\varphi_A(M, S) =
\begin{cases}
M & \text{if } M \neq 0 \\
S & \text{if } M = 0 \text{ and } S \neq 0 \\
1 & \text{if } M = 0 \text{ and } S = 0
\end{cases}
\]

That is: the soldiers will do whatever the Major orders, so long as the Major gives an order. If he does not, then they will follow the orders of the Sergeant. If neither the Major nor the Sergeant gives orders, then they will not advance. In this model, notice that, even though the soldiers' advance doesn't depend upon the Major's giving the order to advance, rather than giving no orders at all (M = 0 → A = 2), it does depend upon the Major's giving the order to advance, rather than giving the order to stay put (M = 1 → A ≠ 2). So M → A will be a causal path. Since the soldiers' advance does not depend upon the Sergeant's orders, no matter which contrast we choose, S → A will not be a causal path, and the Sergeant's orders will not count as a cause of the soldiers' advance.50

49. See Schaffer (2004).

5.2 Causal Networks, Defaults, and Deviancy

Schaffer (2005) holds that this kind of contrastivism allows us to handle all counterexamples to the Lewisian view, but in the present context, this would be an overreach.51 Consider again the neuron system of preemptive overdetermination from figure 1, but suppose that C doesn't fire, as in figure 8. In this neuron system, E's firing depends upon B's firing (rather than not). And B's firing (rather than not) depends upon C's dormancy. So we have a chain of dependence with matching contrasts leading from C to E, but C's dormancy didn't cause E to fire.52 Or consider again the neuron system from figure 2 (reproduced here). There, E's remaining dormant depends upon D's firing (rather than not); and D's firing (rather than not) depends upon C's firing. So again we have a chain of dependence with matching contrasts leading from C to E; but C's firing did not cause E to remain dormant.

[Figure 8]

[Figure 2 (reproduced)]
As we've already seen (in section 1.1), were it not for the information about which variable values are default, inertial states and which are deviant, non-inertial events, we could model the neuron system in figure 2 with a model isomorphic to the canonical model of preemptive overdetermination from figure 1. So we should expect an explanation of why C = 1 didn't cause E = 0 to make use of this additional information. Note also that Exogenous Removal and Endogenous Removal allow us to remove every variable other than C and E from M2. A is inessential, so Exogenous Removal tells us that the model M2^{−A} is correct. In the model M2^{−A}, B is inessential, so Endogenous Removal tells us that the model M2^{−A,−B} is correct. And similarly, in the model M2^{−A}, D is inessential, so Endogenous Removal tells us that the model M2^{−A,−D} is correct. If we want our theory of causation to be model-invariant, then it had better tell us that C = 1 didn't cause E = 0 in each of these models. So we have good reason to think that the verdicts of our theory should not depend upon the default information of any variables other than C and E themselves.53

50. Cf. the treatments of trumping preemption in Lewis (2004), Halpern and Hitchcock (2010), and Hitchcock (2011).

51. Schaffer is working in a different theoretical framework; and it affords him a response to the kinds of counterexamples raised below (see p. 342).

52. Cf. Sartorio (2005, 2016)'s Causes as Difference Makers principle, which entails that C's dormancy cannot cause E to fire, as long as C's firing would have.

In both figure 2 and figure 8, it is noteworthy that either C or E takes on a value representing a default, normal, or inertial state. Whereas, in figure 1, both C and E take on values representing deviant, abnormal, non-inertial events. It is also noteworthy that, in both M2 and M8, there are multiple directed paths from C to E.
I will suggest that these are the reasons why C does not cause E in either of those neuron systems. Suppose that we are given a network, N, from C to E, and in this network are two variables, D and R. If there is a directed path from D to R, O : D → O1 → O2 → ⋯ → ON → R, where none of the directed edges in O are included in N, then I'll say that D is a departure variable, and that R is one of its return variables (relative to the network N). For instance, in the model M8, relative to the network C → B → E, C is a departure variable, and E is its return. And, in the model M2, relative to the network C → D → E, C is a departure variable with return E. In contrast, relative to the network A → B → E in M2, E is not a return variable; and, relative to the network C → B → E ← D ← C, C is not a departure variable.

Take some network, N, with a departure variable D, and one of its returns, R. D potentially affects R both via N and via some other path or paths external to N. It could be that, what D gives R through N, it takes away along some other path or paths. If D gives a deviant value to R through N (that is, if both D and R take on deviant, rather than default, values), then this will make no difference with respect to whether N is a causal network. (Thus, in figure 1, C → D → E is causal.) But if D does not give a deviant value to R, then N is not causal. (Thus, in figure 2, C → D → E is not causal.)

53. Every variable in the model besides C and E may be removed; but we may not remove every variable besides C and E. For D is not inessential in M2^{−A,−B}, and B is not inessential in M2^{−A,−D}. So for all we've said, it could be that what ≽ tells us about D should be relevant to the theory's verdicts in M2^{−A,−B}, while what ≽ tells us about B should be relevant to the theory's verdicts in M2^{−A,−D}.

[Figure 9]
Let us add this to our account: a network is causal only if every departure and return variable in the network takes on a value which is more deviant than its contrast.54

Causal Network A network N, from C to E, is a causal network if and only if there is an assignment of contrasts to the variables in N such that:
(a) E's contrast is distinct from its value;
(b) for each D ∉ C in the network, D's value, rather than its contrast, locally depends upon D's N-parents' values, rather than their contrasts; and
(c) every departure and return variable in N has a value which is more deviant than its contrast.

This completes my account of when a network is causal. Note that, while Causal Network requires E's contrast to be distinct from its value, it does not require that the other variables in the network have contrasts which are distinct from their values.55

54. Couldn't a departure variable have a value no more deviant than its contrast, yet still not take away along other paths what it gives R along N? Yes, but in that case, the additional paths from D to R may simply be incorporated into the network N, and the resulting network will be causal. See the discussion of figure 9 below.

55. If d∗ is D's actual value, it's a bit odd to call d∗ a contrast value, but I'll stick to this terminology nonetheless.

For instance, consider the neuron system in figure 9. This is a case of double prevention. F is a potential preventer of E's firing; and C's firing prevented F from preventing E. In the canonical model M9, the network consisting of the edges C → B, C → D, B → F, D → F, and F → E is a causal network from C to E. For we may assign C, B, D, and E the contrast value 0 (note that B's contrast is the same as its value) and F the contrast value 1. Then, E = 1, rather than 0, locally depends upon F = 0, rather than 1. F = 0, rather than 1, locally depends upon (B, D) = (0, 1), rather than (0, 0). D = 1, rather than 0, locally depends upon C = 1, rather than 0.
And B = 0, rather than 0, locally depends upon C = 1, rather than 0. (For, in the local model at B, M9(B), the counterfactual C = 0 → B = 0 is true.) It can seem that the variable B is an idle wheel in this network, but it is important that it be included. For, relative to the network C → D → F → E, F is a return variable with a default value and a deviant contrast. So the network C → D → F → E is not causal. However, relative to the network which includes B, F is not a return variable, and need not have a deviant value, nor a default contrast.

Note that E's firing globally counterfactually depends upon C's firing. If we think that global counterfactual dependence between events like these suffices for causation, and we wish to understand causation in terms of causal networks, then it is for the good that we count as causal the network which includes B. In fact, global counterfactual dependence suffices for the existence of a causal network, not just for the model M9, but in general. That is: in any causal model M, if there is some assignment c∗ to the variables in C, such that the global counterfactual C = c∗ → E ≠ e is true, then there will be a causal network leading from some sub-tuple of C to E in M. (See Proposition 1 in the appendix for a proof.)

So defined, causal networks are model-invariant. Suppose we have a causal model M, with an inessential exogenous variable U ∉ C. Then, there will be a causal network from C to E in M if and only if there is a causal network from C to E in M^{−U}. Similarly, if we have a causal model M with an inessential endogenous variable V ∉ C ∪ (E), then there will be a causal network from C to E in M if and only if there is a causal network from C to E in M^{−V}. (See the proof of Proposition 2 in the appendix.)
If we suppose that survival is an inertial state (the state in which people normally remain unless they are acted upon from without), then this proposal explains why the boulder's becoming dislodged does not cause Matthew to survive (section 1.1), even though his survival depends upon his jumping out of the way (rather than staying put), and his jumping out of the way (rather than staying put) depends upon the boulder's getting dislodged. So too does it explain why Chris's smoking does not cause him to survive, even though his survival depends upon the chemotherapy, and the chemotherapy depends upon the smoking. Both cases have a causal structure similar to Short Circuit: a threat is created along one path, and simultaneously neutralized along another. If survival is an inertial state, then neither path will be causal. (Nor will the network which consists of both paths be causal; for, while the survival depends upon the neutralization of the threat, it does not depend upon the threat and the neutralization both. If Chris had neither cancer nor chemo, he would still have survived; and, had the boulder not fallen and Matthew not jumped, Matthew would still have survived.)

6 Causation and the Transmission of Deviancy

Causal networks are the model-invariant heart of my theory of causation. On my view, in order for C to cause E, there must be a causal network leading from C to E. In section 6.1, I'll say a bit to motivate thinking of a causal network as a process which transmits deviant, abnormal, or non-inertial behavior. In section 6.2, I'll provide my preferred theory of causation, according to which (roughly) C is a cause of E iff there is a causal network leading from C to E, C has deviancy to give, and E receives that deviancy via the causal network. I'll go on to apply this theory to cases from McGrath (2005) and Hall (2004).
6.1 Productive Networks

The distinction between the values of variables which represent default, normal, inertial states and those which represent deviant, abnormal, non-inertial events enters into my theory of causation at least in clause (c) of Causal Network. It is natural to wonder about what this distinction is doing in a theory of causation. I take the arguments presented in section 1.1 to demonstrate that this distinction or something like it must be included in an adequate theory; but, even once this is appreciated, it is natural to wonder: why should this distinction play any role in our causal thought and talk? In this subsection, I want to gesture at an answer to this question. Roughly, I will suggest that a cause is something which transmits abnormal, deviant, or non-inertial behavior to its effect.

If causation is to be understood in terms of the transmission of deviancy, then what is it for this deviancy to be transmitted? One possible answer is that deviancy is transmitted iff there is an uninterrupted process leading from cause to effect, each stage of which receives its deviancy from the preceding stage. Let's try to make this a bit more precise. Contrast a causal network (as defined in section 5 above), with a productive network, as defined below. (The only difference is in clause (c).)

Productive Network A network, N, from C to E, is a productive network iff there is an assignment of contrasts to the variables in N such that
(a) E's contrast is distinct from its value;56
(b) for each D ∉ C in the network, D's value, rather than its contrast, locally depends upon D's N-parents' values, rather than their contrasts; and
(c) every variable in the network has a value which is more deviant than its contrast.

Note that any productive network from C to E will automatically count as a causal network from C to E. But not all causal networks are productive networks.
Being linked by a productive network is sufficient, but not necessary, for being linked by a causal network. A productive network is so-called because it provides a natural characterization of the notion of a productive causal process in the terms of causal models.57 So understood, a productive causal process is an uninterrupted process by which deviant values are transmitted. And what it is for this deviancy to be transmitted is for the deviancy of each stage in the process to locally depend upon the deviancy of its immediate predecessors.

Notice that there is a productive network leading from C to E in the canonical model of preemptive overdetermination in figure 1. Similarly, there is a productive network leading from A to E in the canonical models of figures 4, 7, and 8, and from G to E in figure 9. In general, it seems that, if there is a productive network from C to E in the canonical model, the judgment that C caused E is intuitive and uncontroversial. There is little debate about whether C's firing caused E to fire in figure 1, or whether A's firing caused E to fire in figures 4, 7, and 8.

56. Condition (a) is redundant in the presence of condition (c), but I include it to emphasize that Productive Network is just a strengthened version of Causal Network.

57. The notion I am characterizing here is not the notion of a causal process provided by authors like Fair (1979), Salmon (1984, 1994), and Dowe (2000) (those notions are characterized in physical terms, rather than the terms of a causal model), but there are some similarities. See also Hall (2004)'s characterization of causal production.

[Figure 10: Double Prevention]

In contrast, in cases of double prevention like the one shown in figure 10, there is a causal, but not a productive, network leading from C's firing to E's firing. There, D is a potential preventer of E's firing. C's firing prevents D from preventing E from firing.
In the canonical model M10, C → D → E is a causal network. However, C → D → E is not a productive network, since the intermediate variable D takes on a default value. People's causal judgments about figure 10 tend to be less uniform. More generally, it seems that, when variables are connected by causal, but not productive, networks, some (but by no means all) are more hesitant to attribute causation.

Unlike causal networks, productive networks are model-variant. Take the canonical model of the case of double prevention from figure 10, M10,

E := B ∧ ¬D
D := A ∧ ¬C

In this model, the exogenous variables A and B are both inessential. So Exogenous Removal tells us that we may remove them both, leaving behind the model M10^{−A,−B},

E := ¬D
D := ¬C

In this model, the endogenous variable D is inessential, so Endogenous Removal tells us that we may remove it, leaving behind the model M10^{−A,−B,−D},

E := C

And, in this model, there is a productive network leading from C to E. So if we were to understand causation in terms of productive networks, our causal verdicts would change as we attended to additional variables lying along, or feeding into, the network from cause to effect.58 More generally, if we think about the transmission of deviancy as Productive Network does (each variable intermediate between C and E has a deviant, rather than a more default, value which locally depends upon the deviancy of its causal parents), then whether deviancy is transmitted from C to E will vary from model to model.

Causal Network is a model-invariant weakening of Productive Network. It suggests a different way of understanding the transmission of deviancy. Suppose that E = e counterfactually depends upon C = c, and suppose that c and e both represent deviant, non-inertial events, rather than default, inertial states.
In that case, let us say that C has transmitted deviancy to E; we won't concern ourselves, for instance, with whether this transmission was accomplished by means of double prevention or not. Because counterfactual dependence suffices for a causal network, if E's deviancy counterfactually depends upon C's, then there will be a causal network leading from C to E. Moreover, if E globally counterfactually depends upon C, there will be a causal network leading from (some sub-tuple of) C to E without any departure and return variables; call this a 'closed causal network'.59 So another, equivalent, way of understanding the claim that counterfactual dependence between deviant, non-inertial events suffices for the transmission of deviancy is this: a closed causal network linking deviant, non-inertial events, rather than default, inertial states, suffices for the transmission of deviancy.

58. Schaffer (2000, 2012b) argues that, in many paradigm instances of productive causal processes (pulling the trigger, thereby shooting the gun, thereby killing the target), we may interpolate variables between cause and effect so as to reveal a case of double prevention.

59. See the proof of Proposition 1 in the appendix to understand why, if E = e counterfactually depends upon C = c, there will be a closed causal network leading from (some sub-tuple of) C to E.

In the case of Preemptive Overdetermination from figure 1, E's deviancy does not depend upon C's. This is because C affects E along two separate paths. Along one path, C deprives E of deviancy; along the other, it provides deviancy. In cases like these, too, let us say that deviancy has been transmitted from cause to effect. More generally, if there are departure and return variables in a network, then it may be that what D transmits to R through the network, it takes away along some other path or paths.
If D transmits deviancy to R through the network (if D and R take on deviant, rather than more default, values), then this won't matter. We should still say that C has transmitted deviancy to E. That is: in general, we should allow deviancy to be transmitted through any causal network, and not just closed causal networks.

6.2 Productive Causation

Causal Network does not say anything about C and E having deviant values or (more) default contrasts. So if we think of causation in terms of the transmission of deviancy in the way that I have been suggesting, then we should impose this additional requirement. Doing so yields the following relation, which I will call 'productive causation':

Productive Causation Given a correct causal model M containing the variables in C and E, C = c is a productive cause of E = e iff there is a minimal causal network leading from C to E in M which assigns contrasts to C and E which are more default than their values.

That is: C = c is a productive cause of E = e iff there's a minimal causal network leading from C to E and, additionally, C and E, like any departure and return variables in the network, have values more deviant than their contrasts. (I'll explain what I mean by 'minimal' below.)

If causation just is productive causation, this would explain some otherwise puzzling features of our causal thought and talk. To borrow an example from McGrath (2005): Alice's neighbor Bob promises Alice that he will water her plant while she is away on vacation. He doesn't, and Alice's plant dies. Many judge that Bob's failure to water the plant caused it to die.
Only philosophers in the grip of theory judge that Alice's other neighbor, Carlos, caused the plant to die, though the plant's death counterfactually depends upon Carlos's failure to water it every bit as much as it depends upon Bob's.60 If we suppose that death and promise-breaking are both deviant events, and that survival and promise-keeping are (more) default, then Bob's failure to water the plant is a productive cause of its death. And if we suppose that Carlos's failure to water is a default state, then Carlos's failure to water is not a productive cause of its death.

60. See also the pen case in Hitchcock and Knobe (2009).

[Figure 11: Switch. Panels: (a) S = 3, (b) S = 2, (c) S = 1, (d) S = 0. The neuron S can either be set to the left, or to the right. If D fires, then it will be set to the right; if D doesn't fire, then it will be set to the left. S will fire iff M fires.]

If causation is productive causation, then this allows us to explain why switches are not causes (see Hall (2004) and Sartorio (2005)). For, while switches affect the route by which deviancy is transmitted to an effect, they do not themselves transmit deviancy to the effect. For a concrete case of a switch, consider the neuron system shown in figure 11a. There, the neuron S is a switch, which can either be set left (when the variable S is even, as in figures 11b and 11d) or right (when the variable S is odd, as in figures 11a and 11c). D determines whether the switch is set left or right. If D fires, then S will be set right; whereas, if D does not fire, then S will be set left. D does not determine whether S fires or not. M does that. If M fires, then S will fire; if M does not fire, then S will not fire. If S fires while left, then L will fire. If S fires while right, then R will fire. And, finally, E will fire iff either L or R does. For a case with a similar causal structure, consider:

Doorbells There are two doorbells: one on the left, and one on the right.
The signal from the button outside passes through a switch, which can have one of two settings: left or right. If the switch is set to the left and the button is pressed, the signal will pass to the left, and the left bell will ring. If the switch is set to the right and the button is pressed, the signal will pass to the right, and the right bell will ring. If either bell rings, Einstein will bark. Before leaving that morning, Doc flipped the switch to the right. When Marty arrives, he presses the button, the right bell rings, and Einstein barks.

In Doorbells, when Marty presses the button, Einstein will bark, no matter whether the switch is set to the left or the right. Doc's flipping the switch to the right was not a cause of Einstein's barking.61 In contrast, Marty's pressing the button was a cause of Einstein's barking. Likewise, in figure 11a, while D's firing was a cause of R's firing, it was not a cause of E's firing. In contrast, M's firing was a cause of E's firing.

61. Of course, it was (along with Marty's pressing the button) a cause of the right bell's ringing. And the right bell's ringing was a cause of Einstein's bark. So, like Boulder, Short Circuit, and figures 7 and 8, Doorbells provides a counterexample to the transitivity of causation. See Hall (2004) and Sartorio (2005). Cf. also Pearl (2000, example 10.3.6) and Halpern and Pearl (2005).

I'll assume that both of these systems can be modeled with the following system of structural equations:62

E := L ∨ R
L := S = 2
R := S = 3
S := 2M + D

62. 'L := S = 2' says that L's value will be the truth-value of 'S = 2': that is, L = 1 if S = 2 and L = 0 if S ≠ 2. Likewise for 'R := S = 3'.

I will also assume that S = 2 is no more deviant or abnormal than S = 3; that is, being set to the left is no less normal than being set to the right. With this assumption, we can show that, while there is a causal network from M to E, there is no causal network from D to E.

First, let's assume that S = 3 is more deviant than S = 1 and that E = 1 is more deviant than E = 0. In the case of Switch, firing is more deviant than remaining dormant; in the case of Doorbells, directing a signal to the right is more deviant than not directing any signal, and barking is more deviant than not barking. With these assumptions, we can show that M → S → R → E is a causal path. For we may assign M, R, and E the contrasts 0, and S the contrast 1. Then: E = 1, rather than 0, depends upon R = 1, rather than 0; R = 1, rather than 0, depends upon S = 3, rather than 1; and S = 3, rather than 1, depends upon M = 1, rather than 0. Relative to this path, S is a departure and R is its return, but both S and R have values which are more deviant than their contrasts. So the path is causal.

The assumption that S = 3 is more deviant than S = 1 isn't needed to show that there's a causal network from M to E. Even if it is not, the network consisting of the edges M → S, S → L, S → R, L → E, and R → E will be causal. For we may assign M, R, L, and E the contrasts 0, and assign S the contrast 1 (note that L's contrast is the same as its value). Then: E = 1, rather than 0, depends upon (L, R) = (0, 1), rather than (0, 0); R = 1, rather than 0, depends upon S = 3, rather than 1; L = 0, rather than 0, depends upon S = 3, rather than 1; and S = 3, rather than 1, depends upon M = 1, rather than 0. In this network, there are no departures or returns, so the network is causal.

In contrast, so long as S = 3 is no more deviant than S = 2, there will be no causal network from D to E. We could assign D, R, and E the contrasts 0, and assign S the contrast 2. Then: E = 1, rather than 0, depends upon R = 1, rather than 0; R = 1, rather than 0, depends upon S = 3, rather than 2; and S = 3, rather than 2, depends upon D = 1, rather than 0. But, relative to this network, S is a departure variable. Since its contrast is no more default than its value, this network is not causal.
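The structural equations for Switch and Doorbells can be checked mechanically. What follows is a minimal Python sketch, not part of the original text; the function name `solve` and the dictionary layout are illustrative assumptions. It evaluates E := L ∨ R, L := S = 2, R := S = 3, and S := 2M + D for given settings of M and D, and compares the actual case (M = 1, D = 1) with the two single-variable interventions. It tests only simple counterfactual dependence; it does not implement the local dependence or deviancy conditions that define a causal network.

```python
def solve(m: int, d: int) -> dict:
    """Evaluate the structural equations for the settings M = m and D = d."""
    s = 2 * m + d      # S := 2M + D
    l = int(s == 2)    # L := (S = 2), i.e. L is the truth-value of 'S = 2'
    r = int(s == 3)    # R := (S = 3), i.e. R is the truth-value of 'S = 3'
    e = int(l or r)    # E := L or R
    return {"M": m, "D": d, "S": s, "L": l, "R": r, "E": e}


# Actual case: Marty presses the button (M = 1), Doc flips right (D = 1).
actual = solve(m=1, d=1)
# Intervention: Doc does not flip the switch (D = 0), Marty still presses.
no_flip = solve(m=1, d=0)
# Intervention: Marty does not press the button (M = 0), Doc still flips.
no_press = solve(m=0, d=1)

# Simple counterfactual dependence: does the value change under the intervention?
e_depends_on_m = actual["E"] != no_press["E"]
e_depends_on_d = actual["E"] != no_flip["E"]
r_depends_on_d = actual["R"] != no_flip["R"]

print("E depends on M:", e_depends_on_m)
print("E depends on D:", e_depends_on_d)
print("R depends on D:", r_depends_on_d)
```

On these equations, setting D to 0 moves S from 3 to 2, so L takes the value 1 in place of R and E is unchanged, whereas setting M to 0 leaves S at 1 and E at 0. This matches the verdicts in the text that D's firing makes a difference to R but not to E, while M's firing makes a difference to E.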
Nor is the network consisting of the edges D → S, S → L, S → R, L → E, and R → E causal. If D were to be 0, then S would be 2: D = 0 □→ S = 2. And, if S were 2, then L would be 1 and R would be 0. So, if the network is to be causal, then (L, R) must be assigned the contrasts (1, 0). But, if L were to be 1 and R were to be 0, then E would be 1. So E's contrast would not be distinct from its value. So the network is not causal.

The upshot is this: if Marty's pressing the button and Einstein's barking are both deviant, non-inertial events, then the deviancy of Marty's pushing the button will be transferred to Einstein's barking, via a causal network. So Marty's pressing the button will be a productive cause of Einstein's barking. On the other hand, so long as the switch's directing a signal to the right is no more deviant than its directing a signal to the left, Doc's flipping the switch will not transfer any deviancy to Einstein's barking. Instead, Doc's flipping the switch merely diverts the deviancy of Marty's pushing the button to the right path. So Doc's flipping the switch will not be a cause of Einstein's barking.

If productive causation just is causation, then default, inertial states can be neither causes nor effects. Assuming that dormancy is the default state of a neuron, this means that C's dormancy does not cause E to fire in the case of omission from figure 4, nor does C's firing cause E to remain dormant in the case of prevention from figure 5 (both reproduced here).63,64

Figure 4: Omission
Figure 5: Prevention

If we find these consequences unacceptable, and we wish to insist that prevention and omission are both species of causation, then we may prefer the following theory of causation:

Given a correct causal model M containing the variables in C and E, C = c is a cause of E = e iff there is a minimal causal network leading from C to E in M.

Alternatively, we could allow C, but not E, to take on default values or more deviant contrasts.
Or we could allow E, but not C, to take on a default value or a more deviant contrast. Because minimal causal networks are model-invariant (see Proposition 2 in the appendix), any of these accounts would be model-invariant. Which kinds of values and contrasts we tolerate in our causes and effects is a free parameter of the theory.

63. Both of these verdicts have defenders in the literature. Personally, I find the second verdict less intuitive than the first. I am currently inclined towards classifying C's firing as a productive cause of E's failure to fire by appealing to a more nuanced account of when a variable value is non-inertial. By way of explanation: I've some inclination to say that it would have been inertial for E to fire, given that A had fired; and thus, that E's failure to fire was a departure from that inertial behavior. (See footnote 15.) However, I won't explore this proposal any further here.

64. I classify figure 4 as a case of omission merely because C's failure to fire is an omission, and E's firing counterfactually depends upon this omission. I don't mean for the label 'omission' to imply that this is an instance of causation. Similarly, I classify figure 5 as a case of prevention merely because E's failure to fire is an omission, and this omission counterfactually depends upon C's firing. I don't mean for the label 'prevention' to imply that this is an instance of causation, either.

Figure 12

I will say that a causal network, N, from C to E is minimal iff there is no proper sub-network of N, leading from any sub-tuple of the variables in C to E, which is itself a causal network. In order for C to cause E, they must be connected by a minimal causal network. To understand why, return again to the case of Switch from figure 11a. While there is no causal network leading from D to E, there is a causal network leading from the pair (D, M) to E, consisting of the edges D → S, M → S, S → R, and R → E. Assign each of D, M, R, S, and E the contrast 0.
Then, E = 1, rather than 0, locally depends upon R = 1, rather than 0; R = 1, rather than 0, locally depends upon S = 3, rather than 0; and S = 3, rather than 0, locally depends upon (D, M) = (1, 1), rather than (0, 0). In this network, S is a departure with return E, but both S and E have values more deviant than their contrasts. So this is a causal network. But D is not a joint cause of E, along with M. In M_11a, the network M → S → R → E is a sub-network of the causal network leading from (D, M) to E, and this sub-network is causal. So requiring a causal network to be minimal prevents us from saying that D is a joint cause of E. More generally, it prevents us from counting as joint causes any irrelevant factors 'free riding' on a causal network which they did nothing to help forge; in order to share in a causal network as a joint cause, you have to pull your weight.

Some theories impose a minimality condition on the variables in C. They say that C caused E only if no proper sub-tuple of C caused E.65 These theories face difficulties with neuron systems like the one shown in figure 12. There, C's firing is a joint cause of E's firing. It, together with A, causes E to fire. However, if we were to impose a minimality condition on the variables in C, our theory would disagree.66 For even though there is a causal network from (A, C) to E, namely C → E ← A, there is also a causal network from A alone to E, namely A → C → E ← A. Though the tuple (A, C) is not minimal, the network C → E ← A is minimal. So our theory tells us, correctly, that A and C jointly caused E to fire. (And also that A individually caused E to fire.)

65. See, for instance, Halpern and Pearl (2001, 2005) and Halpern (2016).

66. Cf. Rosenberg and Glymour (forthcoming).

A. Technicalities

A notational convention: throughout the appendix, I will write things like '$\langle e, e^* \rangle$ locally depends upon $\langle \mathbf{c}, \mathbf{c}^* \rangle$' to mean that $E = e$, rather than $e^*$, locally depends upon $\mathbf{C} = \mathbf{c}$, rather than $\mathbf{c}^*$.

Proposition 1. If $\mathcal{M} \models \mathbf{C} = \mathbf{c}^* \mathrel{\Box\!\!\to} E \neq e$, then there is a causal network leading from some sub-tuple of $\mathbf{C}$ to $E$ in $\mathcal{M}$.

Proof. Let $\mathcal{N}$ be the union of every directed path leading from a member of $\mathbf{C}$ to $E$. We will show that, if $\mathcal{M} \models \mathbf{C} = \mathbf{c}^* \mathrel{\Box\!\!\to} E \neq e$, then $\mathcal{N}$ is a causal network. (Since not every $C \in \mathbf{C}$ is guaranteed to be an ancestor of $E$, $\mathcal{N}$ may not be a causal network from $\mathbf{C}$ to $E$, but it will be a causal network from some sub-tuple of $\mathbf{C}$ to $E$.)

Firstly, note that there are no departure or return variables on $\mathcal{N}$. For suppose there were a departure variable $D$ with return $R$. Then, there would be a directed path from $D$ to $R$, $D \to O_1 \to \cdots \to O_N \to R$, which is not included in $\mathcal{N}$. But there is a directed path from some member of $\mathbf{C}$ to $D$, and a directed path from $R$ to $E$. So there is a directed path from some member of $\mathbf{C}$ to $E$ which goes by way of $D \to O_1 \to \cdots \to O_N \to R$. Since $\mathcal{N}$ includes every directed path from $\mathbf{C}$ to $E$, this path must be included in $\mathcal{N}$. Contradiction. So there can be no departure and return variables.

For every variable $V \notin \mathbf{C}$ in the network $\mathcal{N}$, let $v$ be its actual value, and let $v^*$ be the value it takes on in the counterfactual model $\mathcal{M}[\mathbf{C} \to \mathbf{c}^*]$. Since $\mathcal{M}[\mathbf{C} \to \mathbf{c}^*] \models E \neq e$, $e^* \neq e$, and $E$'s contrast is distinct from its value.

Now, take an arbitrary $D \notin \mathbf{C}$ which lies in $\mathcal{N}$. We now show that $D$'s value, rather than its contrast, locally depends upon its $\mathcal{N}$-parents' values, rather than their contrasts. Let $\mathbf{P}_{\mathcal{N}}$ be the parents of $D$ which lie in the network $\mathcal{N}$, and let $\mathbf{P}_{\overline{\mathcal{N}}}$ be the parents of $D$ which do not lie in the network $\mathcal{N}$. By the construction of $\mathcal{N}$, $\mathbf{P}_{\overline{\mathcal{N}}}$ are not causal descendants of any member of $\mathbf{C}$. So, in the counterfactual model $\mathcal{M}[\mathbf{C} \to \mathbf{c}^*]$, $\mathbf{P}_{\overline{\mathcal{N}}}$ take on their actual values, $\mathbf{p}_{\overline{\mathcal{N}}}$. Since $\mathcal{M}[\mathbf{C} \to \mathbf{c}^*] \models D = d^*$,

$$\varphi_D(\mathbf{p}^*_{\mathcal{N}}, \mathbf{p}_{\overline{\mathcal{N}}}) = d^*$$

So $\langle d, d^* \rangle$ locally depends upon $\langle \mathbf{p}_{\mathcal{N}}, \mathbf{p}^*_{\mathcal{N}} \rangle$. $D$ was arbitrary, so the same goes for every variable in the network $\mathcal{N}$, except for those in $\mathbf{C}$.
So there is a causal network running from some sub-tuple of $\mathbf{C}$ to $E$. ∎

Remark. The proposition shows us that counterfactual dependence suffices for a causal network, but this causal network need not be minimal. If $\mathbf{C}$ is a singleton, however, then counterfactual dependence will suffice for a minimal causal network. For counterfactual dependence of $E = e$ on $C = c$ means that there is some causal network from $C$ to $E$. Perhaps this network is not minimal, but no matter: if it is not minimal, then some sub-network of it will be both causal and minimal. So there will be some minimal causal network from $C$ to $E$.

Lemma 1. Given a causal model $\mathcal{M} = (\mathbf{U}, \mathbf{u}, \mathbf{V}, \mathcal{E}, \succcurlyeq)$, with $U \in \mathbf{U}$, $\mathbf{C} \subset \mathbf{U} \cup \mathbf{V}$, $E \in \mathbf{V}$, and $U \notin \mathbf{C}$, $\mathcal{N}$ is a causal network from $\mathbf{C}$ to $E$ in $\mathcal{M}$ if and only if $\mathcal{N}$ is a causal network from $\mathbf{C}$ to $E$ in $\mathcal{M} - U$.

Proof. Suppose that $\mathcal{N}$ is a causal network from $\mathbf{C}$ to $E$ in $\mathcal{M}$. The exogenous $U \in \mathbf{U}$ will not be in this network: being exogenous, it has no parents, and since $U \notin \mathbf{C}$, no path in the network begins at it. So removing it will not affect any of the local dependence relationships between any of the variables in $\mathcal{N}$. Nor will it affect whether any departure or return variables along $\mathcal{N}$ have values more deviant than their contrasts. So $\mathcal{N}$ will be a causal network from $\mathbf{C}$ to $E$ in $\mathcal{M} - U$.

Suppose, on the other hand, that $\mathcal{N}$ was not a causal network from $\mathbf{C}$ to $E$ in $\mathcal{M}$. If $\mathcal{N}$ is a network from $\mathbf{C}$ to $E$, then the exogenous $U \in \mathbf{U}$ is not on this network, and removing it will not affect the local dependence relationships between any of the variables on $\mathcal{N}$, nor whether any departure and return variables have values more deviant than their contrasts. So removing $U$ will not make $\mathcal{N}$ into a causal network from $\mathbf{C}$ to $E$. So $\mathcal{N}$ will not be a causal network from $\mathbf{C}$ to $E$ in $\mathcal{M} - U$. ∎

Definition 1.
If $V$ is an interpolated variable in $\mathcal{M}$ with parent $Pa$ and child $Ch$, and $\mathcal{N}$ is a network in $\mathcal{M}$ (which may or may not contain the directed edges $Pa \to V$ and $V \to Ch$), then let $\mathcal{N} - V$ be the network in $\mathcal{M} - V$ defined as follows: if $V$ lies along $\mathcal{N}$, then $\mathcal{N} - V$ is $\mathcal{N}$, minus the directed edges $Pa \to V$ and $V \to Ch$, and plus the new directed edge $Pa \to Ch$; and if $V$ does not lie along $\mathcal{N}$, then $\mathcal{N} - V$ is just $\mathcal{N}$.

Definition 2. If $V$ is an interpolated variable in $\mathcal{M}$ with parent $Pa$ and child $Ch$, and $\mathcal{N}$ is a network in $\mathcal{M} - V$ (which may or may not contain the directed edge $Pa \to Ch$), then let $\mathcal{N} + V$ be the network in $\mathcal{M}$ defined as follows: if $\mathcal{N}$ includes $Pa \to Ch$, then $\mathcal{N} + V$ is $\mathcal{N}$, minus $Pa \to Ch$, and plus the directed edges $Pa \to V$ and $V \to Ch$; and if $\mathcal{N}$ does not include $Pa \to Ch$, then $\mathcal{N} + V$ is just $\mathcal{N}$.

Lemma 2. Given a causal model $\mathcal{M} = (\mathbf{U}, \mathbf{u}, \mathbf{V}, \mathcal{E}, \succcurlyeq)$, with $V \in \mathbf{V}$, $\mathbf{C} \subseteq \mathbf{U} \cup \mathbf{V}$, $E \in \mathbf{V}$, and $V \notin \mathbf{C} \cup (E)$, if $V$ is inessential, then: (a) if $\mathcal{N}$ is a causal network from $\mathbf{C}$ to $E$ in $\mathcal{M}$, then $\mathcal{N} - V$ is a causal network from $\mathbf{C}$ to $E$ in $\mathcal{M} - V$; and (b) if $\mathcal{N}$ is a causal network from $\mathbf{C}$ to $E$ in $\mathcal{M} - V$, then $\mathcal{N} + V$ is a causal network from $\mathbf{C}$ to $E$ in $\mathcal{M}$.

Proof. Start with part (a). Suppose that $\mathcal{N}$ is a causal network from $\mathbf{C}$ to $E$ in $\mathcal{M}$. Since $V$ is inessential, it has a single parent, $Pa$, and a single child, $Ch$ (and $Pa$ is not a parent of $Ch$). Let their actual values in $\mathcal{M}$ be $v$, $pa$, and $ch$, respectively. There are two possibilities: either (A) $V$ does not lie on $\mathcal{N}$; or (B) $V$ does lie on $\mathcal{N}$.

In case (A), removing $V$ may introduce new local dependence relationships between $Pa$ and $Ch$, but it will not alter any local dependence relations between any of the variables on $\mathcal{N}$ and their $\mathcal{N}$-parents. Since, in $\mathcal{M}$, the value of each variable in $\mathcal{N}$, rather than its contrast, locally depends upon its $\mathcal{N}$-parents' values, rather than their contrasts, in $\mathcal{M} - V$, the value of each variable in $\mathcal{N} - V = \mathcal{N}$, rather than its contrast, will still locally depend upon its $(\mathcal{N} - V)$-parents' values, rather than their contrasts.
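For any departure or return variable in $\mathcal{N}$, removing $V$ will not affect whether these variables are departure/return variables, nor whether their values are deviant and their contrasts default. So, in case (A), $\mathcal{N} - V$ will still be a causal network in $\mathcal{M} - V$.

In case (B), $V$ lies on $\mathcal{N}$. Then, $Pa$ and $Ch$ must lie on $\mathcal{N}$ as well. Let $\mathbf{R}_{\mathcal{N}}$ be $Ch$'s parents other than $V$ that lie in the network $\mathcal{N}$ (if such there be); let their actual values be $\mathbf{r}_{\mathcal{N}}$, and their designated contrasts, $\mathbf{r}^*_{\mathcal{N}}$. Similarly, let $\mathbf{R}_{\overline{\mathcal{N}}}$ be $Ch$'s parents that don't lie in the network $\mathcal{N}$ (if such there be), and let their actual values be $\mathbf{r}_{\overline{\mathcal{N}}}$. Then, in $\mathcal{M}$, there are some $v^*$, $pa^*$, and $ch^*$ such that $\langle ch, ch^* \rangle$ locally depends upon $\langle \mathbf{r}_{\mathcal{N}} \cup (v), \mathbf{r}^*_{\mathcal{N}} \cup (v^*) \rangle$ and $\langle v, v^* \rangle$ locally depends upon $\langle pa, pa^* \rangle$. Since $Pa$ is $V$'s only parent, we can conclude that

$$\varphi_V(pa^*) = v^* \tag{1}$$

And since $\langle ch, ch^* \rangle$ locally depends upon $\langle \mathbf{r}_{\mathcal{N}} \cup (v), \mathbf{r}^*_{\mathcal{N}} \cup (v^*) \rangle$, we can conclude that

$$\varphi_{Ch}(v^*, \mathbf{r}^*_{\mathcal{N}}, \mathbf{r}_{\overline{\mathcal{N}}}) = ch^* \tag{2}$$

By the construction of $\mathcal{M} - V$, it contains the structural equation

$$Ch := \varphi_{Ch}(\varphi_V(Pa), \mathbf{R}_{\mathcal{N}}, \mathbf{R}_{\overline{\mathcal{N}}})$$

Note that (3) follows from (1) and (2).

$$\varphi_{Ch}(\varphi_V(pa^*), \mathbf{r}^*_{\mathcal{N}}, \mathbf{r}_{\overline{\mathcal{N}}}) = ch^* \tag{3}$$

So, in $\mathcal{M} - V$, $\langle ch, ch^* \rangle$ locally depends upon $\langle \mathbf{r}_{\mathcal{N}} \cup (pa), \mathbf{r}^*_{\mathcal{N}} \cup (pa^*) \rangle$. Removing $V$ will not affect whether any variables are departure or return variables, relative to $\mathcal{N}$, nor whether departure and return variables have deviant values or default contrasts. So $\mathcal{N} - V$ will be a causal network in $\mathcal{M} - V$.

To establish part (b), suppose that $\mathcal{N}$ is a causal network from $\mathbf{C}$ to $E$ in $\mathcal{M} - V$. $\mathcal{N}$ either (A) includes the directed edge $Pa \to Ch$ or (B) doesn't. If (A), then there must be some $pa^*$, $ch^*$, and $\mathbf{r}^*_{\mathcal{N}}$ such that $\langle ch, ch^* \rangle$ locally depends upon $\langle \mathbf{r}_{\mathcal{N}} \cup (pa), \mathbf{r}^*_{\mathcal{N}} \cup (pa^*) \rangle$. ($\mathbf{R}_{\mathcal{N}}$ are $Ch$'s $\mathcal{N}$-parents, other than $Pa$, if such there be.) So

$$\varphi_{Ch}(\varphi_V(pa^*), \mathbf{r}^*_{\mathcal{N}}, \mathbf{r}_{\overline{\mathcal{N}}}) = ch^* \tag{4}$$

($\mathbf{R}_{\overline{\mathcal{N}}}$ are the parents of $Ch$ which do not lie on the causal network $\mathcal{N}$.) Let $v^*$ be the value of $V$ such that $v^* = \varphi_V(pa^*)$.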
Then, it follows from (4) that $\langle ch, ch^* \rangle$ will locally depend upon $\langle \mathbf{r}_{\mathcal{N}} \cup (v), \mathbf{r}^*_{\mathcal{N}} \cup (v^*) \rangle$ in $\mathcal{M}$. Including $V$ will not affect which variables are departure/return variables, nor whether their values are deviant rather than default. So $\mathcal{N} + V$ will be a causal network in $\mathcal{M}$. If (B), then $\mathcal{N} + V = \mathcal{N}$ will also be a causal network from $\mathbf{C}$ to $E$ in $\mathcal{M}$, since including the interpolated variable $V$ will not alter any of the local dependence relationships amongst any of the variables other than $Pa$ and $Ch$, nor will it affect which variables are departures/returns relative to $\mathcal{N}$, nor whether their values are deviant and their contrasts default. ∎

Proposition 2. Minimal causal networks are model-invariant. That is: (a) given a causal model $\mathcal{M} = (\mathbf{U}, \mathbf{u}, \mathbf{V}, \mathcal{E}, \succcurlyeq)$, with $U \in \mathbf{U}$, $\mathbf{C} \subseteq \mathbf{U} \cup \mathbf{V}$, $E \in \mathbf{V}$, and $U \notin \mathbf{C}$, there is a minimal causal network from $\mathbf{C}$ to $E$ in $\mathcal{M}$ iff there is a minimal causal network from $\mathbf{C}$ to $E$ in $\mathcal{M} - U$. And (b) given a causal model $\mathcal{M} = (\mathbf{U}, \mathbf{u}, \mathbf{V}, \mathcal{E}, \succcurlyeq)$, with $\mathbf{C} \subseteq \mathbf{U} \cup \mathbf{V}$, $E, V \in \mathbf{V}$, and $V \notin \mathbf{C} \cup (E)$, if $V$ is inessential, then there is a minimal causal network from $\mathbf{C}$ to $E$ in $\mathcal{M}$ iff there is a minimal causal network from $\mathbf{C}$ to $E$ in $\mathcal{M} - V$.

Proof. Begin with part (b): suppose there is a minimal causal network from $\mathbf{C}$ to $E$ in $\mathcal{M}$. Then, there is a causal network, $\mathcal{N}$, from $\mathbf{C}$ to $E$ in $\mathcal{M}$, and there is no proper sub-network of $\mathcal{N}$, from any sub-tuple of $\mathbf{C}$ to $E$ in $\mathcal{M}$, which is itself causal. By Lemma 2, $\mathcal{N} - V$ is a causal network in $\mathcal{M} - V$. Suppose (for reductio) that this causal network is not minimal. Then, there is some proper sub-network of $\mathcal{N} - V$, $\mathcal{N}^*$, from some sub-tuple of $\mathbf{C}$ to $E$ in $\mathcal{M} - V$ which is causal. By Lemma 2, $\mathcal{N}^* + V$ is causal in $\mathcal{M}$. If $\mathcal{N}^*$ is a proper sub-network of $\mathcal{N} - V$ in $\mathcal{M} - V$, then $\mathcal{N}^* + V$ is a proper sub-network of $\mathcal{N}$ in $\mathcal{M}$. So in $\mathcal{M}$ there is a proper sub-network of $\mathcal{N}$, from some sub-tuple of $\mathbf{C}$ to $E$, which is causal. So $\mathcal{N}$ is not a minimal causal network in $\mathcal{M}$. Contradiction. So $\mathcal{N} - V$ is a minimal causal network in $\mathcal{M} - V$.
Going in the other direction, suppose that there is a minimal causal network from $\mathbf{C}$ to $E$ in $\mathcal{M} - V$. So there is a causal network, $\mathcal{N}$, in $\mathcal{M} - V$, and there is no proper sub-network of $\mathcal{N}$, from any sub-tuple of $\mathbf{C}$ to $E$ in $\mathcal{M} - V$, which is itself causal. By Lemma 2, $\mathcal{N} + V$ is a causal network in $\mathcal{M}$. Suppose (for reductio) that this causal network is not minimal. Then, there is some proper sub-network of $\mathcal{N} + V$, $\mathcal{N}^*$, from some sub-tuple of $\mathbf{C}$ to $E$ in $\mathcal{M}$ which is causal. By Lemma 2, $\mathcal{N}^* - V$ is a causal network from some sub-tuple of $\mathbf{C}$ to $E$ in $\mathcal{M} - V$. If $\mathcal{N}^*$ is a proper sub-network of $\mathcal{N} + V$ in $\mathcal{M}$, then $\mathcal{N}^* - V$ is a proper sub-network of $\mathcal{N}$ in $\mathcal{M} - V$. So in $\mathcal{M} - V$ there is a proper sub-network of $\mathcal{N}$, from some sub-tuple of $\mathbf{C}$ to $E$, which is causal. So $\mathcal{N}$ is not a minimal causal network in $\mathcal{M} - V$. Contradiction. So $\mathcal{N} + V$ is a minimal causal network from $\mathbf{C}$ to $E$ in $\mathcal{M}$.

The proof of part (a) is exactly the same, with Lemma 2 swapped out for Lemma 1, $\mathcal{M} - V$ swapped out for $\mathcal{M} - U$, $\mathcal{N} - V$ and $\mathcal{N} + V$ swapped out for $\mathcal{N}$, and $\mathcal{N}^* - V$ and $\mathcal{N}^* + V$ swapped out for $\mathcal{N}^*$. ∎

References

Andreas, Holger and Mario Günther, forthcominga. "A Ramsey Test Analysis of Causation for Causal Models." The British Journal for the Philosophy of Science. https://doi.org/10.1093/bjps/axy074.
Andreas, Holger and Mario Günther, forthcomingb. "Causation in Terms of Production." Philosophical Studies. https://doi.org/10.1007/s11098-019-01275-3.
Armstrong, David, 2004. "Going Through the Open Door Again: Counterfactuals vs. Singularist Theories of Causation." In Collins et al. (2004), pages 445–457.
Beckers, Sander and Joost Vennekens, 2017. "The Transitivity and Asymmetry of Actual Causation." Ergo, 4: 1–27.
Beckers, Sander and Joost Vennekens, 2018. "A Principled Approach to Defining Actual Causation." Synthese, 195(2): 835–862.
Briggs, R. A., 2012. "Interventionist Counterfactuals." Philosophical Studies, 160: 139–166.
Collins, J., N. Hall, and L. A. Paul, editors, 2004. Causation and Counterfactuals. Cambridge, MA: The MIT Press.
Dowe, Phil, 2000. Physical Causation. Cambridge: Cambridge University Press.
Fair, David, 1979. "Causation and the Flow of Energy." Erkenntnis, 14: 219–50.
Galles, David and Judea Pearl, 1998. "An axiomatic characterization of causal counterfactuals." Foundations of Science, 3(1): 151–182.
Gallow, J. Dmitri, 2016. "A Theory of Structural Determination." Philosophical Studies, 173(1): 159–186.
Gallow, J. Dmitri, ms. "Model-Variance in Theories of Token Causation."
Hall, Ned, 2004. "Two Concepts of Causation." In Collins et al. (2004), pages 225–276.
Hall, Ned, 2007. "Structural Equations and Causation." Philosophical Studies, 132(1): 109–136.
Halpern, Joseph Y., 2008. "Defaults and Normality in Causal Structures." Proceedings of the Eleventh International Conference on Principles of Knowledge Representation and Reasoning, pages 198–208.
Halpern, Joseph Y., 2016. Actual Causality. Cambridge, MA: MIT Press.
Halpern, Joseph Y. and Christopher Hitchcock, 2010. "Actual Causation and the Art of Modeling." In "Heuristics, Probability and Causality: A Tribute to Judea Pearl," editors Rina Dechter, Hector Geffner, and Joseph Y. Halpern, pages 383–406. College Publications.
Halpern, Joseph Y. and Christopher Hitchcock, 2015. "Graded Causation and Defaults." The British Journal for the Philosophy of Science, 66(2): 413–457.
Halpern, Joseph Y. and Judea Pearl, 2001. "Causes and Explanations: A Structural-Model Approach. Part 1: Causes." In "Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence," editors John Breese and Daphne Koller, pages 194–202. San Francisco: Morgan Kaufmann.
Halpern, Joseph Y. and Judea Pearl, 2005. "Causes and Explanations: A Structural-Model Approach. Part 1: Causes." The British Journal for the Philosophy of Science, 56: 843–887.
Hitchcock, Christopher, 1996a. "Farewell to Binary Causation." Canadian Journal of Philosophy, 26(2): 267–282.
Hitchcock, Christopher, 1996b. "The Role of Contrast in Causal and Explanatory Claims." Synthese, 107(3): 395–419.
Hitchcock, Christopher, 2001. "The Intransitivity of Causation Revealed in Equations and Graphs." The Journal of Philosophy, 98(6): 273–299.
Hitchcock, Christopher, 2007a. "Prevention, Preemption, and the Principle of Sufficient Reason." Philosophical Review, 116(4): 495–532.
Hitchcock, Christopher, 2007b. "What's Wrong with Neuron Diagrams?" In "Causation and Explanation," editors Joseph Keim Campbell, Michael O'Rourke, and Harry Silverstein, Topics in Contemporary Philosophy, chapter 4, pages 69–92. Cambridge, MA: The MIT Press.
Hitchcock, Christopher, 2011. "Trumping and contrastive causation." Synthese, 181: 227–240.
Hitchcock, Christopher and Joshua Knobe, 2009. "Cause and Norm." Journal of Philosophy, 106(11): 587–612.
Huber, Franz, 2013. "Structural Equations and Beyond." The Review of Symbolic Logic, 6(4): 709–732.
Kahneman, Daniel and Dale T. Miller, 1986. "Norm Theory: Comparing Reality to Its Alternatives." Psychological Review, 94(2): 136–153.
Lewis, David K., 1973. "Causation." The Journal of Philosophy, 70(17): 556–567.
Lewis, David K., 1986. "Causation." In "Philosophical Papers," volume II. New York: Oxford University Press.
Lewis, David K., 2004. "Causation as Influence." In Collins et al. (2004), chapter 3, pages 75–106.
Livengood, Jonathan, 2013. "Actual Causation in Simple Voting Scenarios." Noûs, 47(2): 316–345.
Mackie, John L., 1965. "Causes and Conditions." American Philosophical Quarterly, 2(4): 245–55.
Maslen, Cei, 2004. "Causes, contrasts, and the nontransitivity of causation." In Collins et al. (2004), pages 341–357.
Maudlin, Tim, 2004. "Causation, Counterfactuals, and the Third Factor." In Collins et al. (2004), pages 419–443.
McDermott, Michael, 1995. "Redundant Causation." The British Journal for the Philosophy of Science, 46(4): 523–544.
McGrath, Sarah, 2005. "Causation by Omission: A Dilemma." Philosophical Studies, 123: 125–148.
Menzies, Peter, 2004. "Causal Models, Token Causation, and Processes." Philosophy of Science, 71(5): 820–832.
Menzies, Peter, 2006. "A Structural Equations Account of Negative Causation." In "Contributed Papers of the Philosophy of Science Association 20th Biennial Meeting."
Paul, L. A., 2004. "Aspect Causation." In Collins et al. (2004).
Paul, L. A. and Ned Hall, 2013. Causation: A User's Guide. Oxford: Oxford University Press.
Pearl, Judea, 2000. Causality: Models, Reasoning, and Inference. Cambridge: Cambridge University Press, 2nd edition.
Rosenberg, Ian and Clark Glymour, forthcoming. "Review of Joseph Halpern, Actual Causality." The British Journal for the Philosophy of Science.
Salmon, Wesley, 1984. Scientific Explanation and the Causal Structure of the World. Princeton: Princeton University Press.
Salmon, Wesley, 1994. "Causality without Counterfactuals." Philosophy of Science, 61(2): 297–312.
Sartorio, Carolina, 2005. "Causes as Difference-Makers." Philosophical Studies, 123: 71–98.
Sartorio, Carolina, 2016. Causation & Free Will. Oxford: Oxford University Press.
Schaffer, Jonathan, 2000. "Causation by Disconnection." Philosophy of Science, 67(2): 285–300.
Schaffer, Jonathan, 2003. "Overdetermining Causes." Philosophical Studies, 114: 23–45.
Schaffer, Jonathan, 2004. "Trumping Preemption." In Collins et al. (2004), chapter 2, pages 59–75.
Schaffer, Jonathan, 2005. "Contrastive Causation." The Philosophical Review, 114(3): 297–328.
Schaffer, Jonathan, 2012a. "Causal Contextualism." In "Contrastivism in Philosophy," editor Blaauw, chapter 2, pages 35–63. Routledge.
Schaffer, Jonathan, 2012b. "Disconnection and Responsibility." Legal Theory, 18(4): 399–435.
Thomson, Judith Jarvis, 2003. "Causation: Omissions." Philosophy and Phenomenological Research, 66(1): 81–103.
Weslake, Brad, forthcoming. "A Partial Theory of Actual Causation." The British Journal for the Philosophy of Science.
Wolff, J. E., 2016. "Using Defaults to Understand Token Causation." The Journal of Philosophy, 113(1): 5–26.
Woodward, James, 2003. Making Things Happen: A Theory of Causal Explanation. Oxford: Oxford University Press.
Yablo, Stephen, 2002. "De Facto Dependence." The Journal of Philosophy, 99(3): 130–148.
Yablo, Stephen, 2004. "Advertisement for a Sketch of an Outline of a Prototheory of Causation." In Collins et al. (2004), chapter 5, pages 119–138.