Bell's Theorem, Quantum Probabilities, and Superdeterminism

Eddy Keming Chen*

July 1, 2020

Forthcoming in Eleanor Knox and Alastair Wilson (eds.), The Routledge Companion to the Philosophy of Physics

Abstract

In this short survey article, I discuss Bell's theorem and some strategies that attempt to avoid the conclusion of non-locality. I focus on two that intersect with the philosophy of probability: (1) quantum probabilities and (2) superdeterminism. The issues they raise not only apply to a wide class of no-go theorems about quantum mechanics but are also of general philosophical interest.

Keywords: Bell's theorem, non-locality, quantum probabilities, Kolmogorov axioms, super-determinism, simplicity, complexity, initial conditions of the universe

Contents
1 Introduction
2 Bell's Theorem
3 Quantum Probabilities to the Rescue?
4 Escape with Super-Determinism?
5 Conclusion
6 Further readings

*Department of Philosophy, University of California, San Diego, 9500 Gilman Dr, La Jolla, CA 92093-0119. Website: www.eddykemingchen.net. Email: eddykemingchen@ucsd.edu

1 Introduction

Since the early days of quantum mechanics, there have been numerous attempts to prove impossibility results or "no-go" theorems about quantum mechanics. They aim to show that certain plausible assumptions about the world are impossible to maintain given the predictions of quantum mechanics, which can be and have been empirically confirmed. Some of them are more significant than others. Arguably, the most significant is J. S. Bell's (1964) celebrated theorem of non-locality: given plausible assumptions, Bell shows that, in our world, events that are arbitrarily far apart can instantaneously influence each other. Bell's theorem is most significant because its conclusion is so striking and its assumptions so innocuous that it requires us to radically change how we think about the world (and not just about quantum theory).

Before Bell's theorem, the picture we have about the world is like this: physical things interact only locally in space. For example, a bomb dropped on the surface of Mars will produce immediate physical effects (chemical reactions, turbulences, and radiations) in the immediate surroundings; the event will have (much milder) physical effects on Earth only at a later time, via certain intermediate transmission between Mars and the Earth. More generally, we expect the world to work in a local way such that events arbitrarily far apart in space cannot instantaneously influence one another. This picture is baked into classical theories of physics such as Maxwellian electrodynamics and (apparently) in relativistic spacetime theories. After Bell's theorem, that picture is untenable. Bell proves that Nature is nonlocal if certain predictions of quantum mechanics are correct. Many experimental tests (starting with Aspect et al. (1982a) and Aspect et al. (1982b)) have been performed. They confirm over and over again the predictions of quantum mechanics. Hence, we should have extremely high confidence in the conclusion that Nature is non-local: events that are arbitrarily far apart in space can instantaneously influence each other. (In the relativistic setting, it amounts to the conclusion that events that are space-like separated can influence each other.) However, not everyone is convinced. In fact, there are still disagreements about what Bell proved and how general the result is. Some disagreements can be traced to misunderstandings about the assumptions in the proof. Others may be due to more general issues about scientific explanations and the standards of theory choice. There are many good articles and books about Bell's theorem. (For example, see Maudlin (2011, 2014), Goldstein et al. (2011), and Myrvold and Shimony (2019).) In this short article, I would like to focus on two strategies that attempt to avoid the conclusion of non-locality. 
They are about (1) quantum probabilities and (2) superdeterminism, both of which bear, in some way, on the philosophy of probability. First, I argue that solving the problem by changing the axioms of classical probability theory is a non-starter, as Bell's theorem only uses frequencies and proportions that obey the rules of arithmetic. Moreover, this point is independent of any interpretation of probability (such as frequentism). Second, I argue that a super-deterministic theory may end up requiring an extremely complex initial condition, one that deserves a much lower prior probability than its non-local competitors. Since both issues can be appreciated without much technical background and have implications for other subfields of philosophy, I will try to present them in a non-technical way that is accessible to non-specialists.

The lessons we learn from them also apply to the more recently proven theorem (2012) of Pusey, Barrett, and Rudolph about the reality of the quantum state, which is in the same spirit as Bell's theorem. (Their theorem says that, under plausible assumptions, quantum states represent states of reality rather than merely certain knowledge about reality.)

2 Bell's Theorem

There are many versions of Bell's theorem and Bell inequalities. For illustration, in this section, we discuss one version by adapting a simple example involving perfect correlations discussed in Maudlin (2011), §1. (Another simple example, involving perfect anti-correlations, can be found in Albert (1992), §3.)

Under certain physical conditions, a calcium atom can emit a pair of photons that travel in opposite directions: left and right. We have labs that can realize such conditions. In this situation, we can set up polarizers on the left and on the right, as well as devices on both sides that detect photons that happen to pass through the polarizers. If a photon is absorbed by a polarizer, then the photon detector placed behind the polarizer will detect nothing.
(Here we assume that the photon detectors are 100% reliable. The idealization can be relaxed, and analyses have been done to show that the differences do not change the conclusion we want to draw.) Further, we can arrange the polarizers to point in any direction on a particular plane. Each direction is representable by a number between 0 and 180, corresponding to the clockwise angle of the polarizer away from the vertical direction. Since each polarizer receives exactly one incoming photon, we say that the pair of photons agree if they either both passed or both got absorbed by the polarizers (so the photon detectors on both sides clicked or neither did); they disagree if one passed but the other got absorbed (so exactly one photon detector clicked).

When we carry out the experiments, say, by using 100,000 pairs of photons, quantum mechanics predicts that we would observe the following:

• Prediction 1: If the left polarizer and the right polarizer point in the same direction, 100% of the pairs agree.
• Prediction 2: If the left polarizer and the right polarizer differ in direction by 30 degrees, 25% of the pairs disagree.
• Prediction 3: If the left polarizer and the right polarizer differ in direction by 60 degrees, 75% of the pairs disagree.

(The situation is a bit simplified. In actual experiments, the empirical frequencies will be approximately 25% and approximately 75% respectively and will approach these values more closely as we carry out more trials.) In the end, these statistics will be shown to clash with a plausible hypothesis of locality:

Locality: Events arbitrarily far apart cannot instantaneously influence each other.

Bell shows that the conjunction of Locality and the predictions of quantum mechanics leads to a contradiction. There are two parts in Bell's argument. The first part is based on the argument of Einstein et al. (1935), also known as the EPR argument. In the context of our example, the EPR argument can be summarized as follows.
First, the photon traveling to the left and the photon traveling to the right can be separated by an arbitrarily large distance. Second, we can always place a polarizer in the path of the photon on the left and another in the path of the photon on the right. Third, according to Prediction 1, if the two polarizers point in the same direction, the pair of photons always agree, however far away they are from each other. Moreover, if we first measure the photon on the left and find that it passed the polarizer on the left, then we do not even need to measure the photon on the right if the polarizer on the right points in the same direction; we know the result: it will pass the polarizer on the right. Assume Locality: what happens to the photon on the (distant) left cannot instantaneously influence the photon on the (distant) right. So there is already a fact of the matter, before measurement, about the result on the right. Hence, Locality implies that there are facts of the matter about the polarization direction of the photon on the left and the photon on the right. In other words, their values of polarization are predetermined.

Here is another way to see this. Given Prediction 1, since the photons have no way to "know" the directions of the two polarizers, they must already agree, even inside the calcium atom, on how they would react to the polarizers come what may. That is, they must already agree whether to both pass or both get absorbed for polarizers pointing at any particular angle. For example, they must "agree" how to react when facing polarizers pointing at 0 degrees, when facing polarizers pointing at 30 degrees, when facing polarizers pointing at 60 degrees, and so on. Otherwise they would not be able to satisfy Prediction 1. However, such predetermined facts are not included in the quantum mechanical description using a wave function. So somehow these facts will be encoded in further parameters going beyond quantum theory.
Indeed, the EPR argument aims to show that Locality implies that quantum mechanics is an incomplete description of Nature. (A famous example of a theory that adds additional parameters is the de Broglie-Bohm theory, but it is manifestly non-local in the particle dynamics. So it is not an example of the kind of local completion of quantum mechanics that EPR look for. Nevertheless, the non-local character of the de Broglie-Bohm theory was one of the motivations for Bell to investigate the generality of non-locality. See Bell (1964), §1 and Bell (2001).) In short, what was shown by EPR and used in Part I of Bell's argument is the following:

Part I: Locality & Quantum Predictions ⟹ Predetermined Values

In Part II, Bell shows the following:

Part II: Predetermined Values & Quantum Predictions ⟹ Contradiction

We will see that predetermined values and quantum predictions lead to a contradiction with the laws of arithmetic (regarding addition, multiplication, and fractions). Recall that there are facts of the matter about the polarization properties of the pair of photons. But there are still two possibilities for each angle. For example, for polarizers pointing at 30 degrees, there can be two alternatives: both pass and both get absorbed. To simplify the example, we assume that the directions of the polarizers have only three choices (say, limited by the turning knobs on the devices): 0 degrees, 30 degrees, and 60 degrees. Then for each choice of the angle of polarizer, there can be two possibilities for the pair: both pass (P) or both get absorbed (A). For example, they may both instantiate P30, which means they will both pass if the polarizer is pointing at a 30 degree angle; they may both instantiate A60, which means they will both get absorbed if the polarizer is pointing at a 60 degree angle. Since 2³ = 8, there are exactly eight choices for the assignments of properties in the two photons.
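The count of eight assignments can be checked mechanically. The following is a minimal sketch, assuming only the three polarizer settings named in the text; the names ANGLES, assignments, and all_assignments are hypothetical and introduced here for illustration. The order in which the assignments are generated differs from the numbering (1)-(8) used in the table.

```python
from itertools import product

# The three polarizer settings assumed in the text (degrees).
ANGLES = (0, 30, 60)


def assignments():
    """Return every way to fix Pass (P) or Absorb (A) at each angle."""
    return [
        tuple(f"{outcome}{angle}" for outcome, angle in zip(outcomes, ANGLES))
        for outcomes in product("PA", repeat=len(ANGLES))
    ]


all_assignments = assignments()

for labels in all_assignments:
    # Prediction 1 forces both photons of a pair to carry the same assignment,
    # so one tuple of labels describes the left and the right photon alike.
    print(", ".join(labels))

print(len(all_assignments), "assignments in total")
```

Because Prediction 1 requires the two photons to carry identical assignments, listing the assignment of one photon is enough to describe the pair.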
Eight Possible Assignments of Properties

        Left Photon      Right Photon     Feature   Percentage
(1)     P0, P30, P60     P0, P30, P60     X         α%
(2)     A0, A30, A60     A0, A30, A60
(3)     A0, P30, P60     A0, P30, P60     Y         β%
(4)     P0, A30, A60     P0, A30, A60
(5)     P0, A30, P60     P0, A30, P60     Z         γ%
(6)     A0, P30, A60     A0, P30, A60
(7)     P0, P30, A60     P0, P30, A60     W         δ%
(8)     A0, A30, P60     A0, A30, P60

To satisfy Prediction 1, each pair of photons must realize exactly one of these eight assignments. If a pair does not realize one of these eight, then it can violate experimental results. The eight assignments can be put in four groups as indicated in the table. Let us label the four groups with features X, Y, Z, and W, which we mention again in §3.

Now suppose we have a large number of pairs of photons emitted from a collection of calcium atoms. (The larger the number, the closer empirical frequencies will approach the predicted percentages.) Assuming Locality, each pair must adopt one of the eight assignments listed above. Let α be the percentage of pairs that realize either (1) or (2); β be the percentage of pairs that realize either (3) or (4); γ be the percentage of pairs that realize either (5) or (6); and δ be the percentage of pairs that realize either (7) or (8). By the laws of arithmetic,

α + β + γ + δ = 100    (1)

Moreover, each percentage number must be non-negative. In particular,

γ ≥ 0    (2)

Therefore,

γ + δ + β + γ ≥ β + δ    (3)

Unfortunately, this is inconsistent with the conjunction of Prediction 2 and Prediction 3. According to Prediction 2, if the angles of the polarizers on the two sides differ by 30 degrees, then we find photon disagreement 25% of the time. We run the large number of pairs of photons with the left polarizer pointing at 0 degrees and the right pointing at 30 degrees. By inspection of the table, we know that pairs realizing assignments (1) and (2) will agree. So we know that α percent of the pairs agree. Moreover, we know that pairs realizing assignments (7) and (8) will also agree. That is another δ percent of pairs that agree.
The only pairs that disagree will be those realizing assignments (3), (4), (5), and (6). That is β + γ percent of pairs that disagree. Hence,

β + γ = 25    (4)

Similar considerations apply when we set the left polarizer at 30 degrees and the right at 60 degrees. Then,

γ + δ = 25    (5)

According to Prediction 3, if the angles of the left and the right polarizers differ by 60 degrees, which in our example is when one is pointing at 0 degrees and the other at 60 degrees, then pairs of photons disagree 75% of the time. All disagreements come from photon pairs that realize assignments (3), (4), (7), and (8). Hence,

β + δ = 75    (6)

Adding equations (4) and (5) gives γ + δ + β + γ = 50, while equation (6) gives β + δ = 75. Since 50 is smaller than 75, we can conclude that

γ + δ + β + γ < β + δ    (7)

But equation (7) is inconsistent with equation (3). We have arrived at a contradiction. Hence, the second part of Bell's argument is established. Together, Part I and Part II imply:

Locality & Quantum Predictions ⟹ Contradiction

Since quantum predictions have been confirmed to an extremely high degree, we should have very high confidence that Locality is refuted and that Nature is non-local. (Here we take quantum predictions to be statistical, that is, regarding empirical frequencies, rather than probabilistic.)

Of course, we have made some implicit assumptions in the derivation:

(A) The rules of inference obey classical logic.
(B) The laws of arithmetic are true.
(C) Frequencies and proportions obey the laws of arithmetic.
(D) There are no conspiracies in nature.

Strictly speaking, only by assuming (A)-(D) can we derive the contradiction from Locality and Quantum Predictions. We will return to these implicit assumptions in the next two sections. (Another assumption is the idea that each experimental outcome is unique and definite, which is denied in the Many-Worlds interpretation. See §6 for further readings. One can label this as the fifth assumption.
However, this assumption is arguably already contained in our description of Quantum Predictions about empirical frequencies. If experimental outcomes are not definite, empirical frequencies wouldn't even make sense unless we state them in a different way, such as by pairing certain outcomes into a single branch and using "branch-weighted" frequencies.)

In this section, we have presented one version of Bell inequalities (in equation (3)) and explained how it is violated by the predictions of quantum mechanics (in equation (7)). (Bell's own version (1964) uses perfect anti-correlation and is stated in terms of expectation values. Clauser et al. (1969) provide a generalization of Bell's result that allows imperfect correlations.)

3 Quantum Probabilities to the Rescue?

Perhaps due to the significance of Bell's theorem, there have been many attempts to avoid the conclusion of non-locality by identifying some other "weak link" in the argument. (For some examples, see Further Readings in §6.) That is surprising, since the other assumptions are quite innocuous and a priori, as illustrated by the previous example.

One purported "weak link" is associated with the "implicit assumptions" about classical probability theory. One might suspect that the derivations of Bell's theorem require substantive assumptions about the nature of probability. Probability is notoriously difficult to understand. Hence, there may be room to revise our classical theory of probability given empirical data. The suggestion is that, instead of rejecting Locality, we can modify (or generalize) the classical axioms and algebraic structure of Kolmogorov probability theory to avoid the contradiction. (For example, see Fine (1982a), Fine (1982b) and Pitowsky (1989).)

However, the previous example serves as a counterexample. In the argument of §2, assumptions of classical probability theory do not even occur. Nor do they implicitly play any essential role.
All we ever needed were proportions and how they arithmetically interact with each other (addition, multiplication, subtraction, and division). For example, Predictions 1, 2, and 3 are formulated in terms of percentages of pairs of photons. The four groups of possible assignments of properties receive percentages α, β, γ, and δ. We call them "percentages," which may remind readers of probabilities. But in our argument they merely represent proportions. To say that α percent of the pairs realize property assignments (1) or (2) is to say that the number of pairs having those properties is exactly α per 100 pairs. If we have 100,000 pairs in total in the collection, then that amounts to 1000 × α pairs. Since the percentages α, β, γ, and δ represent proportions, it is in their nature that they obey the laws of arithmetic, and their bearers (property assignments (1)-(8)) obey the rules of Boolean algebra. (Tumulka (2016) makes a similar point.) The fact that we are assuming, in the conditional proof, that the pairs have hidden properties does not matter at all.

As such, proportions obey the axioms governing how we should count a finite number of things, which in turn obey the Kolmogorovian axioms, which may also govern probabilities (according to some interpretations of probability). Nevertheless, that does not make proportions subject to the various interpretational issues that probability faces. Many other concepts also satisfy the Kolmogorovian axioms, including the mass, length, and volume of finite physical objects. Neither are they subject to the interpretational controversies surrounding the concept of probability. Probability faces a wide range of interpretational puzzles, and it is controversial what its axioms ought to be. Still, there are no similar difficulties with the concepts of mass, length, volume, frequencies, or proportions.

Why is it in the nature of frequencies and proportions to obey the laws of arithmetic or of counting a finite number of things?
This may seem like a question in the philosophy of mathematics. Fortunately, we do not need to settle those controversies to answer that question for our purposes here. The discussion about non-classical probability spaces and Bell's theorem is sometimes highly technical, and different proposals have been suggested to understand violations of the rules of Boolean algebra and Kolmogorov axioms. For our purposes we can distill the central intuitions using the concrete example of §2. Suppose we have a large collection of photon pairs adequately prepared. Consider four features that each photon pair can have (X, Y, Z, and W) that are mutually exclusive and jointly exhaustive, and consider the following propositions:

(i) The percentage of photon pairs having exactly one of the four features is 100%.
(ii) The percentage of photon pairs having feature Z is non-negative.
(iii) The percentage of photon pairs having either Y or W is the sum of the percentage of photon pairs having Y and the percentage of photon pairs having W.
(iv) The sum of the percentage of photon pairs having the property (Y or Z) and the percentage of photon pairs having the property (Z or W) is well defined: it is a non-negative number.

Can these propositions be false? In particular, can they fail in the following ways?

(i') The percentage of photon pairs having exactly one of the four features is 115%.
(ii') The percentage of photon pairs having feature Z is −5%.
(iii') The percentage of photon pairs having either Y or W is less than the sum of the percentage of photon pairs having Y and the percentage of photon pairs having W.
(iv') The sum of the percentage of photon pairs having the property (Y or Z) and the percentage of photon pairs having the property (Z or W) does not exist.

It is a priori that propositions (i)-(iv) cannot be false while propositions (i')-(iv') cannot be true. Propositions such as (i)-(iv) are sufficient to prove the violation of a Bell inequality (equation (3)) in §2.
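The arithmetic behind this claim can be checked directly. The sketch below makes two illustrative assumptions: a small collection of 20 pairs (an arbitrary size, chosen to keep the enumeration short) and hypothetical helper names. It first confirms that every way of distributing the pairs among the features X, Y, Z, and W satisfies inequality (3), and then solves equations (4)-(6) of §2 to show that they would force a negative percentage for Z, in the manner of (ii').

```python
from fractions import Fraction
from itertools import product


def bell_inequality_holds(alpha, beta, gamma, delta):
    """Inequality (3): (gamma + delta) + (beta + gamma) >= beta + delta."""
    return (gamma + delta) + (beta + gamma) >= beta + delta


def percentages_from_counts(counts):
    """Turn counts of pairs with features X, Y, Z, W into exact percentages."""
    total = sum(counts)
    return tuple(Fraction(100 * count, total) for count in counts)


# Every way of splitting N pairs among the four features (N is arbitrary).
N = 20
splits = [c for c in product(range(N + 1), repeat=4) if sum(c) == N]
always_holds = all(
    bell_inequality_holds(*percentages_from_counts(c)) for c in splits
)
print("Inequality (3) holds for every split:", always_holds)

# Equations (4)-(6): beta + gamma = 25, gamma + delta = 25, beta + delta = 75.
# Adding (4) and (5) and subtracting (6) gives 2 * gamma = 25 + 25 - 75.
gamma = Fraction(25 + 25 - 75, 2)
beta = 25 - gamma                     # from equation (4)
delta = 25 - gamma                    # from equation (5)
alpha = 100 - beta - gamma - delta    # from equation (1)
print("alpha, beta, gamma, delta =", alpha, beta, gamma, delta)
```

The second part uses exact fractions so that the negative value of γ is not an artifact of floating-point rounding.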
These propositions do not depend on any substantive theory or axioms about probabilities, because they are about proportions and not about probabilities. We do not need to appeal to assumptions about the nature of probabilities to prove that Nature is non-local.

A potential misunderstanding is that, in saying this, we must be endorsing a particular interpretation of probability, namely frequentism, according to which probabilities boil down to long-run frequencies. But that is a mistake. We can make judgments about those eight propositions without endorsing any particular interpretation of probability. To evaluate them, we do not have to settle the debate among subjectivism, frequentism, and the propensity interpretations. For example, one can be a subjective Bayesian about probabilities and still accept that frequencies, percentages, and proportions obey propositions (i)-(iv). One can even adopt the view that the actual axioms governing real probabilities are non-Kolmogorovian and involve non-Boolean algebra without denying that frequencies and proportions obey the rules of arithmetic. (Moreover, the actual evidence we use to support quantum theory consists in empirical frequencies, which obviously obey the classical probability axioms.)

However, not everyone would agree with our assessment. Fine (1982a,b) and Pitowsky (1989) seem to suggest it remains possible to save locality by revising classical probability theory. (See Malament (2006) for a clear introduction to this project. Feintzeig (2015) demonstrates further mathematical constraints.) The project has led to important and beautiful mathematical results that can shed light on the mathematical structures of impossibility theorems. Nevertheless, if the above analysis is correct, then the project of avoiding non-locality by revising probability axioms is a non-starter; it cannot get off the ground, no matter how ingenious or elegant the models of non-classical probability spaces are.
No matter what changes we make to classical probability theory, they do not affect the conclusion of non-locality. The argument for non-locality does not rely on classical probability theory. We only need to use rules for counting relative frequencies and proportions.

Quantum probability (as an alternative to classical probability) is related to quantum logic (as an alternative to classical logic). Some people who want to keep classical logic may nonetheless be open to revising the axioms of probability to make room for locality. But as we just discussed, it is the axioms governing frequencies and proportions that need to be revised if one goes that route. Since they obey the axioms of arithmetic, and since the latter are closely related to logic, it is hard to see how to pursue this route without also revising logic in some way. (See Wilce (2017) for a survey of quantum logic and quantum probability theory.)

Therefore, we cannot save Locality by changing the axioms governing classical probability theory. Which probability theory is correct is an important question in the philosophy of probability, but it is irrelevant to the question of whether Nature is non-local.

4 Escape with Super-Determinism?

Another purported "weak link" in Bell's argument is associated with the assumption of statistical independence. The strategy is to allow systematic violations of statistical independence in favor of "super-deterministic" theories. (These are sometimes labeled "conspiratorial theories.") In this section we will try to understand what the strategy is and what difficulties it faces.

In §2, we assumed that the direction of the polarizer can be set independently of the collection of incoming photon pairs. We can, for example, use a mechanical device that randomly selects (say, based on certain digits of π) among the three choices: pointing at 0 degrees, 30 degrees, and 60 degrees. That assumption, statistical independence, seems fundamental to scientific experimentation.
Another way to see it is in terms of random sampling. Given any collection of photon pairs adequately prepared, and after the experimental setup is completed, we can perform random sampling on the collection and obtain a sub-collection that reflects the same statistical profile as the overall collection and any other sub-collection so randomly chosen. That is, if the sub-collection is such that 25% of them would disagree when pairs of photons pass through polarization filters that differ by 30 degrees, then the whole collection (and any other randomly chosen sub-collection) would also have that property. In other words, the choice of the sub-collections can be made statistically independent of the experimental setup.

Statistical independence enables us to apply the conjunction of Prediction 1, Prediction 2, and Prediction 3 to the collection as a whole (and to each sub-collection) and to deduce equations (4), (5), and (6), from which we derive a contradiction with inequality (3). Without assuming statistical independence, the inference is not valid. We can construct an example in which the quantum predictions are all satisfied during experiments but there is no contradiction.

Suppose we have 100,000 photon pairs to start with. Each photon pair realizes one of the eight assignments listed in the table. Suppose further that α = β = γ = δ = 25. We have three experimental setups:

(A) Left polarizer at 0 degrees, right polarizer at 30 degrees.
(B) Left polarizer at 30 degrees, right polarizer at 60 degrees.
(C) Left polarizer at 0 degrees, right polarizer at 60 degrees.

From the collection of 100,000 photon pairs, we choose three sub-collections, (a), (b), and (c), each with exactly 100 photon pairs. It turns out that, when we send (a) through (A), 25% of them disagree; when we send (b) through (B), 25% of them disagree; when we send (c) through (C), 75% of them disagree. (As before, this is an idealization.
The fractions get closer to these numbers when we run the trials with more pairs.) This can be realized in the following way. In (a), 25 pairs are of type (3) and the rest are of type (1); in (b), 25 pairs are of type (5), and the rest are of type (1); in (c), 75 pairs are of type (7) and the rest are of type (1). That is, sub-collection (a) has exactly the kind of statistical profile required to be in agreement with quantum predictions for experiment (A); sub-collection (b) for (B); and sub-collection (c) for (C). Hence, each sub-collection has the "right" statistical profile matching the experimental setup it goes through, but none of them has the statistical profile required by the conjunction of the three predictions. Moreover, none of the sub-collections is statistically similar to any other sub-collection. Still, the outcomes of experiments are consistent with quantum predictions.

The problem is that the sampling is not random. Somehow, the choice of which photon pairs to send to which experimental setup is correlated with the choice of the experimental setup itself. In this case, equations (4)-(6) do not hold for the entire collection or any particular sub-collection, and γ + δ + β + γ is larger than or equal to β + δ without contradicting quantum predictions. In this case, 100 ≥ 50; no contradictions exist between outcomes of actual experiments and the assumption of Locality.

Such a violation of statistical independence would seem to require some extraordinary conspiracies in Nature. Not only does this have to be true for these particular setups, which is incredible already; we also need there to be similar conspiracies for every such experimental setup, done by anyone, anywhere, and at any time. No matter where, when, and by whom the experiment is carried out, and no matter what random sampling method we use, the strategy requires that the photon pairs with the "right" statistical profile should always find themselves at the "right" experimental setup.
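The sub-collection example above can be checked with a short computation. This is a minimal sketch under the assumptions of the example (three setups, sub-collections of 100 pairs built from types (1), (3), (5), and (7)); the names TYPES, SETUPS, and disagreement_rate are hypothetical. It shows that each sub-collection reproduces the quantum prediction for the setup it is matched with, but not for the other setups.

```python
# Predetermined outcome at each angle for the four types used in the example:
# "P" = both photons pass, "A" = both photons are absorbed.
TYPES = {
    1: {0: "P", 30: "P", 60: "P"},
    3: {0: "A", 30: "P", 60: "P"},
    5: {0: "P", 30: "A", 60: "P"},
    7: {0: "P", 30: "P", 60: "A"},
}

# (left angle, right angle) for each experimental setup.
SETUPS = {"A": (0, 30), "B": (30, 60), "C": (0, 60)}


def disagreement_rate(sub_collection, setup):
    """Percentage of pairs whose two photons give different results."""
    left, right = SETUPS[setup]
    disagreeing = sum(
        1 for t in sub_collection if TYPES[t][left] != TYPES[t][right]
    )
    return 100 * disagreeing / len(sub_collection)


sub_collections = {
    "a": [3] * 25 + [1] * 75,
    "b": [5] * 25 + [1] * 75,
    "c": [7] * 75 + [1] * 25,
}

# Each sub-collection sent through its "matching" setup.
matched = {
    setup: disagreement_rate(sub_collections[setup.lower()], setup)
    for setup in SETUPS
}
print("matched:", matched)

# Every sub-collection sent through every setup.
cross = {
    (name, setup): disagreement_rate(pairs, setup)
    for name, pairs in sub_collections.items()
    for setup in SETUPS
}
for key, rate in cross.items():
    print(key, rate)
```

Sending sub-collection (a) through setup (C), for instance, gives 25% disagreement rather than the 75% that Prediction 3 requires, which is why the agreement with quantum predictions depends on the sampling being correlated with the setup.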
The randomization can be done by a deterministic device that decides, based on the digits of π, which photon pair goes into which sub-collection. The randomization can also use other mundane methods, such as the rolling of dice, the flipping of a coin, or the English letters in Act V of Hamlet. No matter what randomization method is used in the experiment, the superdeterministic theory will require violations of statistical independence in such a way that the sub-collections will be statistically dissimilar to each other, rendering equations (4)-(6) false of each sub-collection and the entire collection. Nature conspires to hide its locality from us.

Such extraordinary features may be difficult to achieve in any realistic physical theory. Are there any physical theories that can do this? I am not aware of any worked-out theory at the moment. However, some initial steps have been taken to investigate possible dynamics and toy models of superdeterminism. 'T Hooft (2014) provides an illustration. Hossenfelder and Palmer (2020) provide an up-to-date overview and some philosophical discussions. (Friederich and Evans (2019) review some "retrocausal" models that use backward-in-time causal influences.)

Superdeterminism faces many objections. An important criticism is that endorsing violations of statistical independence would be bad for science in one way or another. After all, the assumption of statistical independence is integral to ordinary statistical inferences. Shimony et al. (1976) argue that rejecting statistical independence would undermine the scientific enterprise of discovery by experimentation:

"In any scientific experiment in which two or more variables are supposed to be randomly selected, one can always conjecture that some factor in the overlap of the backwards light cones has controlled the presumably random choices. But, we maintain, skepticism of this sort will essentially dismiss all results of scientific experimentation. Unless we proceed under the assumption that hidden conspiracies of this sort do not occur, we have abandoned in advance the whole enterprise of discovering the laws of nature by experimentation."

Similarly, Maudlin (2019) suggests that rejecting it would make it impossible to do science:

"If we fail to make this sort of statistical independence assumption, empirical science can no longer be done at all. For example, the observed strong robust correlation between mice being exposed to cigarette smoke and developing cancer in controlled experiments means nothing if the mice who are already predisposed to get cancer somehow always end up in the experimental rather than control group. But we would regard that hypothesis as crazy."

These objections based on scientific methodology seem quite compelling to many people. Recently, Hossenfelder and Palmer (2020) have argued that there are multiple mistakes in this type of criticism. One of the mistakes is "the idea that we can infer from the observation that Statistical Independence is useful to understand the properties of classical systems, that it must also hold for quantum systems. This inference is clearly unjustified; the whole reason we are having this discussion is that classical physics is not sufficient to describe the systems we are considering" (emphasis original). We may have justification for applying statistical independence to classical systems such as experimental setups involving mice and cigarette smoke. But it does not logically follow that we have justification for applying it to quantum systems of photons and electrons. (What kind of justification do they mean here? I think they mean both epistemic and pragmatic justifications but the text is ambiguous.)

Their response does not seem to address the worry about scientific methodology. Statistical independence is not the kind of principle we try to empirically justify. Rather, it is part of the inductive principles that we presuppose in order to do science.
That is, statistical independence is a precondition for empirical investigation by experimentation. It is not clear what experiment would confirm or disconfirm it, and we may need to assume statistical independence to draw conclusions from the very experiment itself. It may be impossible to empirically justify statistical independence, but that does not suggest there is a problem with applying it in the first place. This follows from a more general observation: even if we cannot empirically justify induction, we are justified in using induction to learn about the world. (See Henderson (2020) on Hume's problem of induction.) Hence, their response does not seem to answer the objections of Shimony et al. (1976) and Maudlin (2019).

Nevertheless, their response raises an interesting possibility. It is certainly logically consistent for a defender of superdeterminism to maintain that while small microscopic systems (such as electrons and photons) violate statistical independence, large macroscopic systems (such as mice) do not violate it for all practical purposes. That is, we may have reasons to think that the violations of statistical independence are suppressed when we reach the macroscopic level. Hence, it is logically consistent for one to claim that statistical independence is false of microscopic systems but, for all practical purposes, true of macroscopic systems. In short, in ordinary situations when we experiment with mice, we can still use statistical independence; but we should not assume statistical independence when experimenting with electrons and photons (and other microscopic systems).

That is of course logically consistent. But we may ask what reasons we have for thinking that it is true in the superdeterministic theory. One might appeal to decoherence as the mechanism for suppressing certain quantum effects from manifesting in the macroscopic domain (for more on decoherence, see Crull's article in this volume).
But decoherence does not fit naturally in a superdeterministic theory. For one thing, decoherence is primarily about the behavior of quantum states (represented by wave functions). However, a superdeterministic theory (such as the type favored by Hossenfelder and Palmer (2020)) typically does not regard quantum states as objective and does not postulate quantum states in the fundamental ontology. Moreover, it is unclear how decoherence can suppress violations of statistical independence. What decoherence explains is the dynamical fact that certain "branches" of the wave function do not interfere much with each other. Although the possibility is interesting, there is much work to be done to demonstrate its plausibility in a superdeterministic framework.

I would like to raise a different worry about superdeterminism. We may worry that superdeterminism of this sort is unlikely to result in a simple fundamental theory. (Here, by "unlikely" I mean unlikely in the epistemic sense: unlikely given what we know so far and absent any explicit empirically adequate models that show otherwise.) The constraints on empirical frequencies are so severe that it is hard to see how they can be written down in any simple formula. (See Kronz (1990) for a related argument. See Lewis (2006) for a discussion of Kronz's argument as well as a new "measurement problem" for superdeterminism.) In order for the local theory to be compatible with the predictions of quantum mechanics, it would have to radically constrain its state space so that only a very small class of histories is allowed. (Such a constraint can be a joint effect of some lawlike initial conditions and the dynamical laws.) Not all arrangements of the local parameters will be permitted; otherwise one cannot guarantee perfect agreement with quantum predictions. What kind of constraints? They will have to encode as much information as the setup and non-local correlations.
For example, they would need to entail that an experiment done today using a randomization method based on the digits of π will somehow still result in statistically dissimilar sub-collections, in such a way as to produce the desired outcomes of experiments done at arbitrarily far away locations. The same goes for randomization based on the letters of Act V of Hamlet, the Chinese characters in the Analects, or the hexagrams of the I Ching. No matter what randomization method we choose, the superdeterministic mechanism must ensure that the chosen sub-collection is somehow just the right one for a particular experimental setup. Since the randomization methods seem to have nothing in common, it is hard to see how the constraints on initial conditions and dynamics can be simple at all. These considerations give us reasons to think that the constraints will be quite complicated.

A defender of superdeterminism may reply that there is a simple formula: just write down the usual Born rule of quantum mechanics and demand that the superdeterministic theory more or less respect it. But it is not clear how to state the Born rule as a simple law in terms of objects accepted by superdeterminism. As mentioned earlier, a superdeterministic theory (such as the type favored by Hossenfelder and Palmer (2020)) typically does not postulate quantum states (represented by wave functions) in the fundamental ontology. After all, a non-separable quantum state may lead to non-local dynamics. However, the Born rule is stated in terms of the quantum state. Respecting the Born-rule statistics (or something close to them) is certainly a nice goal when trying to construct a local superdeterministic theory with a well-defined ontology and dynamics. The goal is simple (respect the Born rule where it is valid), but it does not follow that the underlying theory will be simple. Because of the lack of simplicity, the constraints we need to impose in a superdeterministic theory will not look lawlike.
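To make vivid what statistical independence asserts about the randomization methods mentioned above, here is a minimal Python sketch. It is only an illustration under stated assumptions and not a model of any theory discussed here. It assumes a toy collection in which each photon pair carries a binary local parameter that is fixed independently of how the pairs are later sorted, and it sorts the pairs into three sub-collections by two unrelated deterministic rules: the decimal digits of π (computed with Machin's formula) and the character codes of a stand-in text (a placeholder, not Hamlet). The names demo, split, frequencies, STAND_IN_TEXT, and all parameter values are hypothetical.

```python
import random

# Placeholder text standing in for a literary source (an assumption of this sketch).
STAND_IN_TEXT = "the quick brown fox jumps over the lazy dog"


def arctan_inv(x, scale):
    """Approximate arctan(1/x) * scale with an integer Taylor series."""
    term = scale // x
    total = term
    x2 = x * x
    k = 3
    sign = -1
    while term:
        term //= x2
        total += sign * (term // k)
        sign = -sign
        k += 2
    return total


def pi_digits(n):
    """First n decimal digits of pi after the leading 3 (Machin's formula)."""
    guard = 10  # extra digits that absorb truncation error
    scale = 10 ** (n + guard)
    pi_scaled = 16 * arctan_inv(5, scale) - 4 * arctan_inv(239, scale)
    return [int(c) for c in str(pi_scaled // 10 ** guard)[1:]]


def text_codes(text, n):
    """n integers obtained by cycling through the character codes of text."""
    return [ord(text[i % len(text)]) for i in range(n)]


def split(values, labels, groups=3):
    """Put values[i] into sub-collection number labels[i] % groups."""
    buckets = [[] for _ in range(groups)]
    for value, label in zip(values, labels):
        buckets[label % groups].append(value)
    return buckets


def frequencies(buckets):
    """Relative frequency of the value 1 in each sub-collection."""
    return [sum(b) / len(b) for b in buckets]


def demo(n=6000, p=0.3, seed=0):
    """Return (overall frequency, frequencies by pi rule, frequencies by text rule)."""
    rng = random.Random(seed)
    # Toy local parameters, generated without reference to either sorting rule.
    hidden = [1 if rng.random() < p else 0 for _ in range(n)]
    overall = sum(hidden) / n
    by_pi = frequencies(split(hidden, pi_digits(n)))
    by_text = frequencies(split(hidden, text_codes(STAND_IN_TEXT, n)))
    return overall, by_pi, by_text


if __name__ == "__main__":
    overall, by_pi, by_text = demo()
    print("overall:", overall)
    print("sorted by digits of pi:", by_pi)
    print("sorted by stand-in text:", by_text)
```

Under these assumptions, demo() returns the overall frequency together with the frequencies in each sub-collection, and the latter should all sit close to the former whichever rule does the sorting. A superdeterministic theory has to deny exactly this similarity, and deny it for every sorting rule an experimenter might pick, which is what makes the required constraints so hard to state simply.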
Hence, such a theory can be quite complex and will find it difficult to compete with other candidate theories that are far simpler. For example, 'T Hooft's (2014) Cellular Automaton Interpretation requires the selection of an initial state of the universe, which may be extremely detailed and not at all simple.

Here I take simplicity as a hallmark of fundamental laws of nature. A superdeterministic theory will likely postulate an extremely complicated initial condition (or complicated dynamical laws) that looks nothing like a fundamental law. Hence, this problem of superdeterminism boils down to a violation of a familiar constraint on fundamental laws of nature: a fundamental law should not be too complex. When we evaluate competing theories, we judge them (in part) by the relative complexity of their fundamental laws. Among competing observationally equivalent theories, the more complex a theory is, the lower the prior probability we should assign to it. This corresponds to an objective Bayesian way of thinking about probabilities. Simplicity and complexity come in degrees, and both notions are notoriously vague. But they are indispensable theoretical tools when we confront observationally equivalent theories. For our purposes here, one can plug in any reasonable notion of simplicity and complexity for evaluating scientific theories.

In fact, some good physical theories do constrain initial states in order to explain certain wide-spread regularities. For example, in a universe with wide-spread temporal asymmetries, we postulate a low-entropy initial condition, now called the Past Hypothesis (Albert (2000)). We ought to subject the Past Hypothesis to the constraint of simplicity because it is a candidate fundamental law. It is a candidate fundamental law because it underlies many nomological generalizations such as the Second Law of Thermodynamics and does not seem to be further explained by the dynamics.
(This will no longer be true if Carroll and Chen's (2004) model can successfully explain time's arrow.) Fortunately, we have reasons to think that the Past Hypothesis is not extremely complex. Indeed, it can be specified in terms of simple macroscopic variables (the values of the pressure, density, volume, and energy of the early universe). In certain frameworks, it can even be specified in simple microscopic variables, as in Penrose's (1979) Weyl Curvature Hypothesis or Ashtekar and Gupt's (2016) initial condition for Loop Quantum Cosmology. In the density-matrix-realist framework, the Past Hypothesis can be replaced by the Initial Projection Hypothesis (Chen (2018)), which pins down a unique quantum microstate of the universe. It is interesting that a simple postulate about the initial condition of the universe can explain the wide-spread temporal asymmetries. Part of the reason lies in the structure of state space: there is an asymmetry of macrostate volumes (or dimensions) that emerges as a result of simple dynamics; it is part of the answer to the problem of time's arrow. Moreover, the Past Hypothesis explanation is perfectly compatible with statistical independence.

We have good reasons to think that superdeterministic theories, in contrast, will postulate something much more complicated than the Past Hypothesis as an initial condition. If such a superdeterministic theory is devised, we should also interpret its initial condition as a fundamental law of nature. (At the very least, it should be given a fundamental axiomatic status in the theory since it is not derived from other laws of the theory.) The wide-spread violations of Bell-type inequalities cry out for explanation. In such a superdeterministic theory, the initial condition is supposed to do the work of explaining why arbitrarily far away events are correlated with each other. We see no reason at all why such a theory (and especially its constraint on the state space) will be simple enough.
At least we do not have any evidence that it will be simpler than the competing non-superdeterministic and non-local theories that are already on the market, such as Bohmian mechanics and GRW theory (see the survey articles by Tumulka and Lewis in this volume).

Hence, there are significant differences between the superdeterministic theory that constrains its initial states to explain Bell-type correlations and a regular quantum theory that constrains its initial states (by the Past Hypothesis) to explain temporal asymmetries. However, these are differences of degree and not of kind. If a superdeterministic theory aims to recover all quantum predictions, then it would be observationally equivalent to Bohmian mechanics and more or less equivalent to some versions of GRW theory. But we have good reasons to think that Bohmian mechanics and GRW theory are far simpler than the superdeterministic theory. Hence, the superdeterministic theory should receive a much lower prior probability than either Bohmian mechanics or GRW theory.

Nevertheless, that does not mean we should assign 0 credence to superdeterminism. Instead, I think we should follow Bell (1977) and be open-minded in a qualified way:

"Of course it might be that these reasonable ideas about physical randomizers are just wrong – for the purpose at hand. A theory may appear in which such conspiracies inevitably occur, and these conspiracies may then seem more digestible than the non-localities of other theories. When that theory is announced I will not refuse to listen, either on methodological or other grounds." (my emphasis)

If one constructs an empirically adequate superdeterministic theory that is simpler than a non-local theory such as Bohmian mechanics or GRW theory, we should be open to assigning much higher credence to it. At the moment, no such theory is available.

5 Conclusion

In this short survey article, I introduced Bell's theorem by discussing a simple example.
I focused on two strategies that attempt to avoid the conclusion of non-locality: (1) changing the axioms of classical probability theory and (2) embracing superdeterminism and allowing systematic violations of statistical independence. Both have to do in some way with the philosophy of probability. Neither seems promising. Nevertheless, understanding these ideas can help us come to a deeper understanding of Bell's theorem, its significance, and the relevance (or irrelevance) of the nature of probability.

6 Further readings

• The issue of "realism" in Bell's proof: see Norsen (2007), Maudlin (2014), and Tumulka (2016).
• Non-locality, superluminal signaling, and relativistic invariance: see Maudlin (2011) for a landmark monograph on the topic; see Tumulka (2006) and Bedingham et al. (2014) for collapse models that demonstrate the compatibility of Lorentz invariance and non-locality.
• Locality and non-locality in the many-worlds interpretation of quantum mechanics: see Wallace (2012)§8 and Allori et al. (2010).
• Parameter independence and outcome independence: see Jarrett (1984), Shimony (1989), Healey (1992), and Maudlin (2011)§4.
• Causation and causal explanations: see Bell (1981), Redhead (1989), Healey (1992), and Maudlin (2011)§5.
• Experimental tests and certain loophole-free tests of Bell's inequalities: see Myrvold and Shimony (2019)§4-§5 for a review.

Acknowledgement

I am grateful for helpful discussions with Craig Callender, Sheldon Goldstein, Mario Hubert, and Isaac Wilhelm, as well as written comments from Alan Hájek, Sabine Hossenfelder, Kelvin McQueen, Peter Morgan, Travis Norsen, Timothy Palmer, and Alastair Wilson.

References

Albert, D. Z. (1992). Quantum Mechanics and Experience. Harvard University Press.
Albert, D. Z. (2000). Time and chance. Harvard University Press.
Allori, V., Goldstein, S., Tumulka, R., and Zanghì, N. (2010). Many worlds and Schrödinger's first quantum theory. British Journal for the Philosophy of Science, 62(1):1–27.
Ashtekar, A. and Gupt, B. (2016). Initial conditions for cosmological perturbations. Classical and Quantum Gravity, 34(3):035004.
Aspect, A., Dalibard, J., and Roger, G. (1982a). Experimental test of Bell's inequalities using time-varying analyzers. Physical Review Letters, 49(25):1804.
Aspect, A., Grangier, P., and Roger, G. (1982b). Experimental realization of Einstein-Podolsky-Rosen-Bohm gedankenexperiment: a new violation of Bell's inequalities. Physical Review Letters, 49(2):91.
Bedingham, D., Dürr, D., Ghirardi, G., Goldstein, S., Tumulka, R., and Zanghì, N. (2014). Matter density and relativistic models of wave function collapse. Journal of Statistical Physics, 154(1-2):623–631.
Bell, J. (2001). Introduction to the hidden-variable question. John S. Bell on the Foundations of Quantum Mechanics, page 22.
Bell, J. S. (1964). On the Einstein-Rosen-Podolsky paradox. Physics, 1(3):195–200.
Bell, J. S. (1977). Free variables and local causality. Epistemological Lett., 15(CERN-TH-2252):xx.
Bell, J. S. (1981). Bertlmann's socks and the nature of reality. Le Journal de Physique Colloques, 42(C2):C2–41.
Carroll, S. M. and Chen, J. (2004). Spontaneous inflation and the origin of the arrow of time. arXiv preprint hep-th/0410270.
Chen, E. K. (2018). Quantum mechanics in a time-asymmetric universe: On the nature of the initial quantum state. The British Journal for the Philosophy of Science, forthcoming.
Clauser, J. F., Horne, M. A., Shimony, A., and Holt, R. A. (1969). Proposed experiment to test local hidden-variable theories. Physical Review Letters, 23(15):880.
Einstein, A., Podolsky, B., and Rosen, N. (1935). Can quantum-mechanical description of physical reality be considered complete? Physical Review, 47(10):777.
Feintzeig, B. (2015). Hidden variables and incompatible observables in quantum mechanics. The British Journal for the Philosophy of Science, 66(4):905–927.
Fine, A. (1982a). Hidden variables, joint probability, and the Bell inequalities. Physical Review Letters, 48(5):291.
Fine, A. (1982b). Joint distributions, quantum correlations, and commuting observables. Journal of Mathematical Physics, 23(7):1306–1310.
Friederich, S. and Evans, P. W. (2019). Retrocausality in quantum mechanics. In Zalta, E. N., editor, The Stanford Encyclopedia of Philosophy. Metaphysics Research Lab, Stanford University, summer 2019 edition.
Goldstein, S., Norsen, T., Tausk, D. V., and Zanghì, N. (2011). Bell's theorem. Scholarpedia, 6(10):8378. revision #91049.
Healey, R. (1992). Chasing quantum causes: how wild is the goose? Philosophical Topics, 20(1):181–204.
Henderson, L. (2020). The problem of induction. In Zalta, E. N., editor, The Stanford Encyclopedia of Philosophy. Metaphysics Research Lab, Stanford University, spring 2020 edition.
Hossenfelder, S. and Palmer, T. (2020). Rethinking superdeterminism. Frontiers in Physics, 8:139.
Hájek, A. (2019). Interpretations of probability. In Zalta, E. N., editor, The Stanford Encyclopedia of Philosophy. Metaphysics Research Lab, Stanford University, fall 2019 edition.
Jarrett, J. P. (1984). On the physical significance of the locality conditions in the Bell arguments. Noûs, pages 569–589.
Kronz, F. M. (1990). Hidden locality, conspiracy and superluminal signals. Philosophy of Science, 57(3):420–444.
Lewis, P. J. (2006). Conspiracy theories of quantum mechanics. The British Journal for the Philosophy of Science, 57(2):359–381.
Malament, D. B. (2006). Notes on Bell's theorem. Accessed at http://www.lps.uci.edu/malament/probdeterm/PDnotesBell.pdf.
Maudlin, T. (2011). Quantum non-locality and relativity: Metaphysical intimations of modern physics. John Wiley & Sons.
Maudlin, T. (2014). What Bell did. Journal of Physics A: Mathematical and Theoretical, 47(42):424010.
Maudlin, T. (2019). Bell's other assumption(s) (accessed on youtube).
Myrvold, Wayne, G. M. and Shimony, A. (2019). Bell's theorem. In Zalta, E. N., editor, The Stanford Encyclopedia of Philosophy. Metaphysics Research Lab, Stanford University, spring 2019 edition.
Norsen, T. (2007). Against 'realism'. Foundations of Physics, 37(3):311–340.
Penrose, R. (1979). Singularities and time-asymmetry. In Hawking, S. and Israel, W., editors, General relativity, pages 581–638.
Pitowsky, I. (1989). Quantum probability-quantum logic. Springer.
Redhead, M. L. (1989). Nonfactorizability, stochastic causality, and passion-at-a-distance. Cushing and McMullin (1989), 145:53.
Shimony, A. (1989). Search for a worldview which can accommodate our knowledge of microphysics. Philosophical consequences of quantum theory, pages 62–76.
Shimony, A., Horne, M. A., and Clauser, J. F. (1976). Comment on 'the theory of local beables'. Epistemological Letters, 13(1).
'T Hooft, G. (2014). The cellular automaton interpretation of quantum mechanics. arXiv preprint arXiv:1405.1548.
Tumulka, R. (2006). A relativistic version of the Ghirardi–Rimini–Weber model. Journal of Statistical Physics, 125(4):821–840.
Tumulka, R. (2016). The assumptions of Bell's proof. In Bell, M. and Gao, S., editors, Quantum Nonlocality and Reality: 50 Years of Bell's Theorem. Cambridge University Press.
Wallace, D. (2012). The emergent multiverse: Quantum theory according to the Everett interpretation. Oxford University Press.
Wilce, A. (2017). Quantum logic and probability theory. In Zalta, E. N., editor, The Stanford Encyclopedia of Philosophy. Metaphysics Research Lab, Stanford University, spring 2017 edition.