Belief Revision for Growing Awareness

Katie Steele
Australian National University
katie.steele@anu.edu.au

H. Orri Stefánsson
Stockholm University
orri.stefansson@philosophy.su.se

Abstract

The Bayesian maxim for rational learning could be described as conservative change from one probabilistic belief or credence function to another in response to new information. Roughly: 'Hold fixed any credences that are not directly affected by the learning experience.' This is precisely articulated for the case when we learn that some proposition that we had previously entertained is indeed true (the rule of conditionalisation). But can this conservative-change maxim be extended to revising one's credences in response to entertaining propositions or concepts of which one was previously unaware? The economists Karni and Vierø (2013, 2015) make a proposal in this spirit. Philosophers have adopted effectively the same rule: revision in response to growing awareness should not affect the relative probabilities of propositions in one's 'old' epistemic state. The rule is compelling, but only under the assumptions that its advocates introduce. It is not a general requirement of rationality, or so we argue. We provide informal counterexamples. And we show that, when awareness grows, the boundary between one's 'old' and 'new' epistemic commitments is blurred. Accordingly, there is no general notion of conservative change in this setting.

1 Introduction

Central to Bayesianism is the idea that beliefs come in varying degrees, and that a rational agent's graded beliefs (dubbed credences) over the 'possibility space', to put it roughly, can be measured by a probability function. This model of belief, also known as probabilism, is mathematically elegant and powerful, all the more so given its celebrated intimate connection to rational preference, as defined by, e.g., the so-called expected utility axioms.
Moreover, the model promises a simple account of learning: it is a question of how the new (probabilistic) credence function, in light of the learning experience, should relate to the old credence function. The standard Bayesian answer to this question of learning is conditionalisation. The rule dictates precisely how to revise credences when one gets a particular type of information, namely, the information that some proposition that one had previously entertained is indeed true. The rule is conservative: 'hold fixed any credences that are not directly affected by the learning experience', where this is taken to imply that the agent's new credence function should equate to her old one, conditional on the proposition that is now found to be true. This underlying 'conservative-change' maxim has been extended to cases where the information one gains is that some proposition is more or less probable than one had thought (in the form of the rule known as Jeffrey conditionalisation).1 But there are ways in which one can become 'better informed' that fall outside the scope of the traditional Bayesian conception of learning and its extensions. In particular, one can become better informed by entertaining propositions or concepts of which one was previously unaware. This is not a learning experience that can be characterised in the usual Bayesian way: as a constraint on the agent's probability function over a given possibility space. It is rather a revision of this very possibility space. Although so-called awareness growth has not been central to the Bayesian project, there is no apparent reason why the maxims of probabilism and conservative change cannot be extended to these kinds of cases. 
Indeed, in just that spirit, the economists Karni and Vierø (2013, 2015) propose an extension of the Bayesian model to capture growing awareness, and they defend the following rule for the associated credence change: When a person becomes aware of new possibilities, she should update her credences 'in such a way that likelihood [probability] ratios of events in the original [epistemic] state space remain intact' (2013, p. 2801). They dub this rule Reverse Bayesianism.

1 The maxim has also been extended to cases where the information one gains affects one's conditional probabilities (in the form of the rule known as Adams conditionalisation) (Bradley 2005).

Philosophers too have explored the challenge that growing awareness poses for the traditional Bayesian model, typically under the guise of 'the problem of new theories' (Earman 1992; the problem was originally raised by Glymour 1980 as the counterpart to 'the problem of old evidence'). The most detailed and well-worked-out of the philosophers' responses to this problem (namely, Henderson et al. 2010, Wenmackers and Romeijn 2016, and Bradley 2017) endorse what is effectively Reverse Bayesianism. We regard this as an important development of the popular Bayesian approach to belief and learning. But the extant proposals all focus, albeit for different reasons, on limited sorts of cases (more on which later). As far as we know, the question of whether Reverse Bayesianism is a general requirement of rationality (that is, whether it is an appropriate learning rule for any kind of credence function and for any awareness growth) has not been explored. Here we defend a negative answer to that question. We initially offer some informal counterexamples to the rule. We then proceed to diagnose the counterexamples by appeal to a formal model. We claim that awareness growth is best conceived as the replacement of the old possibility space with an entirely new possibility space.
This serves to highlight that the agent's understanding of any given proposition (in particular, how it relates to other propositions) changes upon awareness growth. As such, there is no clear way to delineate the credences that are and are not affected by the new learning experience. So there is no general notion of 'conservative change' for growing awareness.

2 Unawareness and traditional Bayesianism

Let's state the Bayesian framework more precisely as well as the problem that unawareness raises. Let P be the function representing the degrees of belief of our agent, that is, her credence function, defined on an algebra F with the following character:

• F contains a contradictory proposition (⊥).
• F contains a tautologous proposition (⊤).
• F is closed under disjunction, conjunction, and negation. That is, if A and B are in F, then A ∨ B, A & B, ¬A and ¬B are also in F.

The Bayesian norms of rationality are taken to establish that P is a probability function.2 This core Bayesian commitment is known as probabilism.3 The algebra F is typically interpreted as the propositions about which the agent has an opinion; see, e.g., Pettigrew (2011). Moreover, Pettigrew articulates the common assumption that the algebra, F, is best interpreted in terms of possible worlds, understood to be 'maximally specific ways the world might be'. Specifically, each proposition is identified with the set of those possible worlds that make it true.4 The tautologous proposition is thus identified with the set of all possible worlds, while the contradictory proposition is identified with the empty set of possible worlds. This is an appealing way to conceive of propositions because it makes vivid how the total probability 'mass' is distributed such that the probability for both simple and complex propositions can be derived.
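The possible-worlds reading just described can be made concrete with a small finite model. The following is a minimal sketch and not part of the original text: the four worlds, their credence masses, and the names `MASS`, `WORLDS`, `algebra`, `prob` and `is_closed` are all illustrative assumptions. It treats every set of worlds as a proposition, checks the closure conditions listed above, and derives the credence in a proposition by summing the mass of the worlds that make it true.

```python
from itertools import chain, combinations

# Hypothetical coarse-grained possible worlds; labels and masses are illustrative.
MASS = {"w1": 0.2, "w2": 0.1, "w3": 0.4, "w4": 0.3}
WORLDS = frozenset(MASS)

TOP = WORLDS            # tautology: the set of all worlds
BOTTOM = frozenset()    # contradiction: the empty set of worlds


def algebra(worlds):
    """Every set of worlds counts as a proposition, giving the algebra F."""
    items = sorted(worlds)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return {frozenset(s) for s in subsets}


def prob(proposition):
    """Credence in a proposition: total mass of the worlds that make it true."""
    return sum(MASS[w] for w in proposition)


def is_closed(family):
    """Check closure under disjunction (|), conjunction (&) and negation."""
    return all(
        (a | b) in family and (a & b) in family and (TOP - a) in family
        for a in family
        for b in family
    )


F = algebra(WORLDS)
A = frozenset({"w1", "w2"})
B = frozenset({"w3"})

print(len(F), is_closed(F))              # size of the algebra, closure check
print(prob(A | B), prob(A) + prob(B))    # additivity for mutually exclusive A, B
```

On this sketch the probability of a complex proposition such as A ∨ B is not stipulated separately; it falls out of the distribution of mass over worlds, which is the feature of the possible-worlds picture highlighted above.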
Pettigrew notes that for most purposes, it is not useful to quantify over all possible worlds; rather, one can simply refer to the possible worlds relative to F, which are specific enough just to assign truth values to each of the propositions in F, and thus as a set amount to a coarsening of the set of possible worlds. There is a further widely neglected question here: whether it makes sense to conceive of an agent's possibility space as having the apparently objective status of the set of possible worlds, whether coarsened or not. Indeed, the phenomenon of awareness growth calls for a reconsideration of this standard presumption. It is far from being a benign presumption because, strictly speaking, it means the agent has substantive knowledge of how propositions relate to one another, and that she in some sense grasps the full set of possibilities. But we will get to this issue in due course. For now what is important is simply that the agent, at a given time, has credences in select propositions, where the set of these propositions forms an algebra F.

2 That is, P(A) ∈ [0, 1] for all A in F; P(⊥) = 0; P(⊤) = 1; P(A ∨ B) = P(A) + P(B) for all mutually exclusive A and B in F.

3 We will not here consider conservative generalisations of probabilism, such as the position that rational belief is representable by a set of probability functions, and not necessarily a single precise probability function.

4 If propositions are identified with sets of possible worlds, the reader might wonder why we characterised the above algebra sententially rather than set-theoretically. The reason is that we want to leave open the possibility that propositions are not sets of possible worlds (see e.g. footnote 13).

Now, Bayesianism tells an agent, whose credences are represented by P, precisely how she ought to revise these credences when she learns that some proposition A in F is true.
The norm of conditionalisation, one of the core theses of Bayesian epistemology, requires that for any proposition B, the agent's credence in B, after learning A (and nothing stronger), should equal her (prior) conditional credence in B given A, i.e., $P(B \mid A)$, which, according to the standard definition of conditional probabilities, equates to $P(A \& B)/P(A)$ whenever $P(A)$ is positive.5 More formally, let $P_A$ represent our agent's credences after she has learned A. Then the norm of Conditionalisation states that:

Conditionalisation. For any A, B ∈ F and according to any rational agent:

$$P_A(B) = P(B \mid A), \quad \text{assuming that } P(A) > 0$$

Given the standard definition of conditional probabilities stated above, Conditionalisation is logically equivalent to the conjunction of the following two principles:

Certainty. $P_A(A) = 1$

Rigidity. $P_A(B \mid A) = P(B \mid A)$, assuming that $P(A) > 0$

Informally, Certainty says that the agent is certain of whatever she has learned. Rigidity, on the other hand, says that whatever proposition the agent may learn, her credences conditional on this proposition should be rigid, or unchanged by the learning experience. These two principles thus reflect a neat division between those credences directly affected by the learning experience (described by Certainty), and those credences that are not affected by the learning experience and are thus unchanged (as per Rigidity).

It has been well noted that the Certainty condition does not encompass all kinds of learning. For starters, it does not fit well with an intuitive notion of learning according to which one could take oneself to have learned something without having become certain of some proposition. Fortunately, the Bayesian framework can be straightforwardly extended to learning experiences where an agent does not learn anything with certainty, without giving up Rigidity, as Richard Jeffrey (1965) proposed.
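Conditionalisation, Certainty and Rigidity can be checked on a toy credence function. The sketch below is illustrative only: the four worlds, the prior masses, the propositions `A` and `B`, and the helper names `prob`, `cond` and `conditionalise` are assumptions introduced here, and exact fractions are used so that the equalities hold without rounding.

```python
from fractions import Fraction as Fr

# Hypothetical prior over four worlds; the masses are illustrative assumptions.
PRIOR = {"w1": Fr(1, 5), "w2": Fr(1, 10), "w3": Fr(2, 5), "w4": Fr(3, 10)}

A = frozenset({"w1", "w3"})   # the proposition that is learned
B = frozenset({"w3", "w4"})   # some other proposition in the algebra


def prob(cred, prop):
    """Credence in a proposition: total mass of the worlds in it."""
    return sum(cred[w] for w in prop)


def cond(cred, b, a):
    """Conditional credence P(b | a) by the ratio definition (needs P(a) > 0)."""
    p_a = prob(cred, a)
    if p_a == 0:
        raise ValueError("P(A) = 0: the ratio definition does not apply")
    return prob(cred, b & a) / p_a


def conditionalise(cred, a):
    """Posterior P_A: zero out the worlds outside a and renormalise the rest."""
    p_a = prob(cred, a)
    if p_a == 0:
        raise ValueError("P(A) = 0: conditionalisation is undefined here")
    return {w: (p / p_a if w in a else Fr(0)) for w, p in cred.items()}


posterior = conditionalise(PRIOR, A)

print(prob(posterior, B) == cond(PRIOR, B, A))       # Conditionalisation
print(prob(posterior, A) == 1)                       # Certainty
print(cond(posterior, B, A) == cond(PRIOR, B, A))    # Rigidity
```

The zero-probability case is deliberately left as an error rather than given a value, since the text treats conditional credences on zero-probability propositions as an open question.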
But Jeffrey's extension does not help in solving the problem that is the topic of this paper, i.e., learning that amounts to a growth in awareness. How are such learning experiences best characterised? And what does conservative change amount to in such cases? These are the questions we address in this article. The next section discusses an example that informally illustrates two intuitively different types of awareness growth.

5 We leave it as an open question whether or not the relevant conditional probabilities can otherwise be defined for cases where the proposition conditioned on is assigned zero probability (see Hájek 2003 for discussion). In general, there is a question which we do not address in this paper of how an agent should revise her credences if she were to learn that a proposition which she had assigned zero probability is in fact true.

3 An example: The movie screening tonight

Suppose you are deciding whether to see a movie at your local cinema. You know that the cinema only shows one 'international' movie each evening, but let's suppose that you have no way of checking which movie will be shown tonight without paying the theatre a visit. Moreover, you know that the movie's predominant language and genre will affect your viewing experience. The languages you consider are French and German and the genres you consider are thrillers and comedies. But then you realise that, due to your poor French and German skills, your enjoyment of the movie will also depend on the level of difficulty of the language, whether high difficulty, or low difficulty.
We can understand this as a move from the state of awareness represented by Table 1 to the state represented by Table 2, both shown below.6

         | Thriller          | Comedy
French   | French & Thriller | French & Comedy
German   | German & Thriller | German & Comedy

Table 1: Less aware state

              | Thriller                 | Comedy
French & High | French & Thriller & High | French & Comedy & High
French & Low  | French & Thriller & Low  | French & Comedy & Low
German & High | German & Thriller & High | German & Comedy & High
German & Low  | German & Thriller & Low  | German & Comedy & Low

Table 2: Awareness gain by refinement

Now suppose you realise that the genre could be drama. To keep things simple, let's assume you realise this when you are in the state of awareness represented by Table 1. The resulting state of awareness is represented by Table 3.

         | Thriller          | Comedy          | Drama
French   | French & Thriller | French & Comedy | French & Drama
German   | German & Thriller | German & Comedy | German & Drama

Table 3: Awareness gain by expansion

6 'French' and 'German' do not represent potential actions, but rather the propositions that the movie being shown is predominantly French/German.

In the shift from the epistemic state represented by Table 1 to the one represented by Table 2, you have refined the possibilities you entertain. Your old possibilities are effectively split into more fine-grained ones, allowing for new partitions of the possibility space. In this particular case, all your old possibilities are similarly split into the high versus low language-level versions. But the refinement might otherwise be more limited. For example, perhaps there is no gradation in the difficulty of the German language, such that the high/low language distinction only properly applies to French. Then only the French possibilities would be split into high versus low language-level versions.
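The cells of Tables 1 to 3 can be generated mechanically, which makes the structural difference between the two kinds of awareness growth easy to see. The following is only an illustrative sketch: representing a cell as a string of conjuncts, and the helper names `cells` and `refinements`, are assumptions made here rather than anything proposed in the text.

```python
from itertools import product

LANGUAGES = ("French", "German")
GENRES = ("Thriller", "Comedy")
LEVELS = ("High", "Low")


def cells(*partitions):
    """Cells of a table: one conjunction per combination of partition members."""
    return [" & ".join(combo) for combo in product(*partitions)]


table1 = cells(LANGUAGES, GENRES)                 # less aware state
table2 = cells(LANGUAGES, GENRES, LEVELS)         # refinement by language level
table3 = cells(LANGUAGES, GENRES + ("Drama",))    # expansion by a new genre


def refinements(old_cell, new_cells):
    """New cells that split an old cell, i.e. that add conjuncts to it."""
    return [c for c in new_cells if c.startswith(old_cell + " & ")]


for cell in table1:
    print(cell, "->", refinements(cell, table2))  # each old cell is split in two

print([c for c in table3 if c not in table1])     # cells added by expansion
```

In the refinement, every old cell reappears only as a pair of finer cells; in the expansion, every old cell survives unchanged and two new cells are added alongside them.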
In the shift from the epistemic state represented by Table 1 to the one represented by Table 3, you have instead expanded or extended the set of possibilities you entertain to include the further genre of drama. It is not simply that old possibilities are split, allowing for new, finer-grained partitions of the possibility space. Rather, what was previously thought to be a partition of the possibility space is recognised to not in fact be exhaustive.

We have drawn attention only to special cases of awareness growth: what we might dub pure refinement and pure expansion respectively. But mixed cases are also possible. For example, one may recognise dramas as a movie genre at the same time as recognising that the language difficulty of movies may vary. In this paper, however, it will suffice to consider only the special cases of pure refinement and pure expansion. We encounter difficulties enough in extending Bayesian belief revision to these special cases of awareness growth.

4 Reverse Bayesianism

We now consider how to extend Bayesian belief revision to encompass awareness growth. More specifically, we ask: How should awareness growth affect an agent's credences? Note that we take probabilism to be non-negotiable: we are assuming that a rational agent's credences, at any stage of awareness, or in any given awareness context, must satisfy the probability axioms. Formally, we say that an agent's awareness context is defined by a set $X$ of basic propositions of which she is aware, and from which a Boolean algebra, $F_X$, can be generated that is the domain of her probabilistic credence function. (One can understand basic propositions as primitive propositions, representing simple facts about the world, that do not involve any logical connectives; so, for instance, 'French' and 'Thriller' are basic while '¬French' and 'French & Thriller' are not.)
The basic propositions effectively form the building blocks of the formal model that we present in §6.1; when the agent's set of basic propositions changes upon awareness growth, so too does her possibility space, as it is constituted by truth functions over the basic propositions.7 The question is whether and how an agent's credence function for one awareness context constrains or relates to her credence function once she has experienced a growth in awareness. For instance, suppose again that having found yourself in the epistemic situation represented by Table 1, you realise that the level of language difficulty further distinguishes possibilities. How should this refinement affect your credences in other propositions? Or, instead, suppose that in the situation represented by Table 1, you become aware that the movie showing could be a drama; perhaps because you come across movie reviews that concern dramas. How should you revise your credences in the various other propositions in light of this expansion? Traditional Bayesianism is silent on these two questions. As we have seen, this is not a type of learning experience that the traditional Bayesian framework incorporates. But recently, Karni and Vierø have defended a unified answer to these two questions (at least for the particular kind of decision problem and awareness growth that they represent) in the form of a principle that they call 'Reverse Bayesianism'. Versions of this principle have more recently been endorsed by Wenmackers and Romeijn (2016) and Bradley (2017). Let us state Reverse Bayesianism as if it were a general principle transcending the particular type of decision model formulated by Karni and Vierø (more on which below). 
To ensure that our formulation is a plausible reading (or rather extension) of the intent of Karni and Vierø as well as the aforementioned philosophers, we appeal to the notion of basic propositions that we introduced above. Let $X$ ($X^+$) be the basic propositions of which the agent is aware before (after) awareness grows. We use $P$ ($P^+$) to represent the credences of the agent, defined over $F_X$ ($F_{X^+}$), before (after) awareness grows. Reverse Bayesianism holds that the ratio between the credences of any two incompatible basic propositions in the old epistemic state (that each had positive credence) should not change when awareness grows. More formally:

Reverse Bayesianism. For any $A, B \in X$ (where $P(A \& B) = 0$, $P(A) > 0$ and $P(B) > 0$) and according to any rational agent:

$$\frac{P(A)}{P(B)} = \frac{P^+(A)}{P^+(B)}$$

The restriction to basic propositions is important in that it excludes propositions involving the negation operator. (For one thing, if the principle extended to pairs of incompatible propositions of the form A, ¬A, it would require that credences in propositions of which the agent was already aware should stay constant. This would not do justice to the phenomenon of awareness growth by expansion.)8

7 A brief comment is in order here: Given that differing awareness contexts involve differing possibility spaces, the tautology, which has probability one according to the probability axioms (recall footnote 2), must be interpreted such that it depends on the awareness context: it is associated with the set of all possibilities in that context, which corresponds, for instance, to A ∨ ¬A, for any A in the context.

Consider what Reverse Bayesianism requires in the movie example we have been discussing. Suppose you find 'German' to be twice as likely as 'French' before realising that both French and German can be either at a high or a low level of language difficulty. Then after this realisation, you should still find 'German' to be twice as likely as 'French'.
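The requirement just stated can be checked numerically for the movie example. The sketch below is a hedged illustration and not the authors' formal model: worlds are represented as tuples of basic propositions, all credence values are invented for the purpose, and the function names `prob`, `compatible` and `satisfies_reverse_bayesianism` are hypothetical. It tests the ratio condition for every pair of incompatible old basic propositions with positive credence, once for a refinement (splitting each old world evenly into High and Low) and once for an expansion (adding Drama).

```python
from fractions import Fraction as Fr
from itertools import combinations


def prob(cred, label):
    """Credence in a basic proposition: mass of the worlds that include it."""
    return sum(p for world, p in cred.items() if label in world)


def compatible(cred, a, b):
    """True if some world with positive credence makes both a and b true."""
    return any(p > 0 and a in world and b in world for world, p in cred.items())


def satisfies_reverse_bayesianism(old, new, old_basics):
    """Check P(A)/P(B) = P+(A)/P+(B) for incompatible, positive-credence pairs.

    Ratios are compared by cross-multiplication, which avoids dividing by zero.
    """
    for a, b in combinations(old_basics, 2):
        pa, pb = prob(old, a), prob(old, b)
        if compatible(old, a, b) or pa == 0 or pb == 0:
            continue  # the rule is silent about this pair
        if pa * prob(new, b) != pb * prob(new, a):
            return False
    return True


OLD_BASICS = ("French", "German", "Thriller", "Comedy")

# Illustrative old credences: 'German' is twice as likely as 'French'.
old = {
    ("French", "Thriller"): Fr(1, 6),
    ("French", "Comedy"): Fr(1, 6),
    ("German", "Thriller"): Fr(1, 3),
    ("German", "Comedy"): Fr(1, 3),
}

# Refinement: split every old world evenly into High and Low versions.
refined = {w + (lvl,): p / 2 for w, p in old.items() for lvl in ("High", "Low")}

# Expansion: give Drama a quarter of the mass and rescale the old worlds.
expanded = {w: p * Fr(3, 4) for w, p in old.items()}
expanded[("French", "Drama")] = Fr(1, 12)
expanded[("German", "Drama")] = Fr(1, 6)

print(satisfies_reverse_bayesianism(old, refined, OLD_BASICS))
print(satisfies_reverse_bayesianism(old, expanded, OLD_BASICS))
print(prob(refined, "German") / prob(refined, "French"))
```

These two new credence functions were constructed so as to respect the rule. Nothing in the probability axioms forces that choice, which is why the rule is a substantive constraint.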
Similarly, after you realise the movie could be a drama, you should still find 'German' to be twice as likely as 'French'. On the face of it, these implications of Reverse Bayesianism seem quite intuitive. For why should the fact that the language in the movie could have a high or low level of difficulty, or the prospect of it being a drama, change one's relative credence in the movie being predominantly German versus French?

8 One might worry that the restriction to basic propositions introduces unwanted language dependence in what is supposed to be an unambiguous rule of belief revision. After all, 'French' may be basic in one language while 'Frenchgerman' is basic in another, translated as 'French ∨ German'. We claim that the language dependence here is not, however, problematic; it corresponds to a deep feature of our model of an agent's awareness growth. As mentioned above and elaborated in §6.1, basic propositions are not introduced merely in the statement of Reverse Bayesianism but rather play a foundational role in defining an agent's awareness context.

One might surmise that Reverse Bayesianism is compelling because it precisely captures conservative belief change for the learning experience in question, namely awareness growth. Indeed, its defenders take it to be the consequence of something akin to the Rigidity condition for this kind of learning experience. Bradley, for instance, says as much:

Within the Bayesian framework, conservation of the agent's relational beliefs is ensured by the rigidity of her conditional probabilities. So we can conclude that conservative belief change [when faced with growing awareness] requires [that] the agent's new conditional probabilities, given the old domain, for any members of the old domain should equal her old unconditional probabilities for these members. (2017, p. 258)

Wenmackers and Romeijn similarly suggest that the conservation of 'probability ratios among the old hypotheses' follows from the relevant conditional probabilities remaining constant:

In analogy with Bayes' rule, one natural conservativity constraint is that the new [i.e., more aware] probability distribution must respect the old [i.e., less aware] distribution on the preexisting parts of the algebra [i.e., on the distributions' shared domain]. (2016, p. 1235)

Karni and Vierø also appeal to the constancy of conditional attitudes by way of defending Reverse Bayesianism. In the behaviourist economics tradition, they appeal to constraints on preferences, and only indirectly on credences:

. . . as the decision-maker's awareness of consequences grows and his state space expands, his preference relation conditional on the prior state space remains unchanged. (2013, p. 2801)

The above defences of Reverse Bayesianism are arguably sound given the models of awareness growth to which they pertain. As mentioned earlier, however, these models place limitations on the kinds of awareness growth and/or the credences that are subject to the Reverse Bayesianism rule. Karni and Vierø, for instance, employ an Anscombe and Aumann (1963) framework, which (like Savage's 1954 framework, on which it is based) consists of acts, maximally specific consequences, and states amounting to act-consequence pairs. For Karni and Vierø, in cases of awareness growth by expansion, the agent ultimately comes to be aware of states that are by their very nature inconsistent with the states that define her old awareness context.

Philosophers tend to prefer a more general Jeffrey (1965)-inspired propositional framework, but nonetheless introduce similar restrictions to Karni and Vierø in their discussions of belief revision under growing awareness. Wenmackers and Romeijn (2016) focus on changes to sets of scientific theories that are assumed to be mutually inconsistent.
Bradley's interests are more general, but he too, in his endorsement of Reverse Bayesianism for awareness growth by expansion, at least, focuses on propositions that are inconsistent with those the agent comes to be aware of:

. . . the key to conservative attitude change in cases where we become aware of prospects that are inconsistent with those that we previously took into consideration is that we should extend our relational attitudes to the new set in such a way as to conserve all prior relational beliefs . . . (2017, p. 257, emphasis added)

Moreover, we hold that Bradley implicitly assumes only 'vanilla' kinds of awareness growth by refinement, as our discussion in the next section will reveal.

We allow that Reverse Bayesianism may be defensible in the limited setting that the above authors consider. But the question remains as to whether this learning rule is defensible in a more general setting (as Bradley, at least, suggests). In the next section, we offer some informal counterexamples to Reverse Bayesianism. This prompts, in the subsequent section, a more careful diagnosis of the limits of this learning rule, and indeed the very notion of conservative belief change.

5 Counterexamples to Reverse Bayesianism

It is not hard to see that Reverse Bayesianism cannot generally be true once we move beyond the constrained models used by its defenders. That is, one can devise examples where Reverse Bayesianism is violated without irrationality on behalf of the agent in question. All we need are examples where awareness grows because an agent becomes aware of a proposition that she takes to be evidentially relevant, intuitively speaking, to the comparison of propositions of which she was already aware.
For in that case, the ratio between probabilities of propositions of which the agent was already aware will not stay the same; one proposition will become more probable compared to the other, just like in ordinary cases where one learns evidence relevant to the comparison of hypotheses. In fact, the history of science is full of examples that undermine Reverse Bayesianism, for the above reason. Here is a particularly prominent such example:

Example 1. Nineteenth-century physicists were unaware of the Special Theory of Relativity (STR). That is, not only did they not take the theory to be true; they had not even entertained the theory. We can suppose, however, that they had entertained various propositions for which the theory was regarded as evidentially relevant, once Einstein brought the theory to their attention. In particular, they did (rightly) take the theory to be evidentially relevant to various propositions about the speed of light, such as whether the speed of light would always be measured at 300,000 km/s independently of how fast the investigator is moving or whether the measured speed would differ, depending on how fast the investigator is moving. But then the awareness and subsequent acceptance of the STR changed their relative confidence in such propositions.

Not all examples where Reverse Bayesianism fails come from the history of science. Here is a more mundane, or everyday, example:

Example 2. Suppose you happen to see your partner enter your best friend's house on an evening when your partner had told you she would have to work late. At that point, you become convinced that your partner and best friend are having an affair, as opposed to their being warm friends or mere acquaintances. You discuss your suspicion with another friend of yours, who points out that perhaps they were meeting to plan a surprise party to celebrate your upcoming birthday, a possibility that you had not even entertained.
Becoming aware of this possible explanation for your partner's behaviour makes you doubt that she is having an affair with your friend, relative, for instance, to their being warm friends.

A defender of Reverse Bayesianism might argue that the above two examples do not undermine their thesis, since, for instance, the proposition picked out by the sentence 'the speed of light will always be measured at 300,000 km/s independently of how fast the investigator is moving' is different before and after the speaker becomes aware of the Special Theory of Relativity. (Similarly, the proposition picked out by the sentence 'my partner and best friend are having an affair' is different before and after the speaker realises that their partner and best friend might be organising a surprise party.) That is, it is not just that the propositions in question are understood differently, given a change in the underlying possibility space (which is our own approach, to be elaborated in §6.1); rather what appear to be the same propositions across awareness contexts are in fact entirely different propositions. For instance, the physics case might be spelled out as follows: despite appearances, the agent's growth in awareness is not simply an expansion of the 'fundamental physical theory' partition to include the STR; there is also an expansion of the 'light hypothesis' partition to include the STR versions of the (speed-of-) light hypotheses. As a result, the addition of the STR has no bearing on the original (speed-of-) light hypotheses, in conformity with Reverse Bayesianism.
It might be added that, if the new propositions of which the agent becomes aware were apparently evidentially relevant to the basic propositions in the old awareness context, then we would not have a case of genuine awareness growth, to which Reverse Bayesianism is limited.9 This way of saving Reverse Bayesianism however seriously weakens the commonsense appeal and normative interest of the thesis, and seems rather ad hoc, as the examples under consideration are surely as genuine cases of awareness growth as any. Moreover, if the aim is to represent the internal perspective of some agent, then it is surely more natural to take the individuation of propositions at face value, such that, with respect to our example above, the speed-of-light hypothesis corresponds to the same proposition before and after recognition of the Special Theory of Relativity. But that means that new propositions may well have a bearing on the relative probabilities of old basic propositions. Better to modify the Reverse Bayesian principle itself than to modify what counts as genuine awareness growth. So, we can conclude that we should not impose Reverse Bayesianism as a general constraint on how a rational agent can revise her credences when her awareness grows. The above counterexamples, however, both involve what we called awareness growth by expansion. But as previously mentioned, proponents also want to impose Reverse Bayesianism as a constraint on how a rational agent can revise her credences when her awareness grows due to refinement (e.g., Karni and Vierø 2013 and Bradley 2017). And one might well hope that despite the above counterexamples, the principle could be retained for belief revision due to refinement. Unfortunately, counterexamples similar to those discussed above also undermine Reverse Bayesianism understood in this latter way. Consider a third example, which is an elaboration on the refinement case represented by a shift from Table 1 to Table 2: Example 3. 
Suppose you are deciding whether to see a movie at your local cinema. You know that the movie's predominant language and genre will affect your viewing experience. The possible languages you consider are French and German and the genres you consider are thriller and comedy. But then you realise that, due to your poor French and German skills, your enjoyment of the movie will also depend on the level of difficulty of the language. Since it occurs to you that the owner of the cinema is quite simple-minded, you are, after this realisation, much more confident that the movie will have low-level language than high-level language. Moreover, since you associate low-level language with thrillers, this makes you more confident than you were before that the movie on offer is a thriller as opposed to a comedy.

9 The implication is that we would rather have a case of irrational and/or poorly represented belief change.

The important feature of the above example is that the original awareness context is partitioned according to some new property (the language level) that is taken to be evidentially relevant to the comparison of some pair of incompatible basic propositions ('Thriller', 'Comedy') in the old awareness context.

Again, a defender of Reverse Bayesianism might retort that the above example is not one of genuine awareness growth. Since you take the owner of the cinema to be simple-minded, you should already have expected the movie to be simple (and hence a thriller) rather than difficult, the defender might argue. So, what is going on in this example is that you are reasoning to a conclusion that you already should have drawn, rather than gaining more awareness and revising your credences contrary to Reverse Bayesianism.10 More generally, one might worry that what at first sight may look like a counterexample to Reverse Bayesianism will generally turn out to be simply a correction of previously sloppy reasoning.
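To see that the credences described in Example 3 are probabilistically coherent and nonetheless shift the ratio between 'Thriller' and 'Comedy', one can attach numbers to them. The sketch below is not taken from the example itself: every number (equal old credence in the two genres, 4/5 credence in low-level language, and the two conditional credences in a thriller) is an illustrative assumption, as are the names `p_level`, `p_thriller_given` and `marginal`. The language dimension (French versus German) is left out for brevity.

```python
from fractions import Fraction as Fr

# Before the refinement: 'Thriller' and 'Comedy' are equally likely (assumed).
old = {"Thriller": Fr(1, 2), "Comedy": Fr(1, 2)}

# After the refinement; all numbers are illustrative assumptions.
p_level = {"Low": Fr(4, 5), "High": Fr(1, 5)}              # simple-minded owner
p_thriller_given = {"Low": Fr(7, 10), "High": Fr(3, 10)}   # low level suggests thriller

# Build the new credence function over (genre, level) worlds.
new = {}
for level, p in p_level.items():
    new[("Thriller", level)] = p * p_thriller_given[level]
    new[("Comedy", level)] = p * (1 - p_thriller_given[level])


def marginal(cred, genre):
    """New credence in a genre: sum over the language-level versions of it."""
    return sum(p for (g, _level), p in cred.items() if g == genre)


ratio_old = old["Thriller"] / old["Comedy"]
ratio_new = marginal(new, "Thriller") / marginal(new, "Comedy")

print(sum(new.values()) == 1)   # the new credences are a probability function
print(ratio_old, ratio_new)     # the ratio between old basic propositions changes
```

The new credence function satisfies the probability axioms, so whatever is wrong with the belief change (if anything) cannot be a violation of probabilism; the only principle it breaks is Reverse Bayesianism.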
We make two connected points about our example that also speak to the general worry here. First, we agree that if you take the owner of the cinema to be simple-minded, then that is a reason for you to be more confident that the movie will be simple rather than difficult. However, we assume that you had not considered the fact that the movie's language is a way in which the movie can be either more or less difficult. And once you become aware of this fact, you become even more convinced that the movie is simple, and thus a thriller rather than a comedy, in violation of Reverse Bayesianism, assuming that this is a case of awareness growth. Second, we think this is indeed a case of awareness growth, since while it is true that your belief changes because you make an inference that you already had most of the resources to make, the growth in awareness nevertheless provides a necessary link that makes the new inference possible.

10 We thank a referee for making us see the need to respond to this objection.

In sum, we take the above examples to show that Reverse Bayesianism cannot hold in full generality, neither as a constraint on belief revision due to expansion nor as a constraint on belief revision due to refinement.

Before moving on, however, we note a different potential criticism of our analysis. It might be argued that our examples are not illustrative of a simple learning event (a simple growth in awareness); rather, our examples illustrate and should be expressed formally as complex learning experiences, where first there is a growth in awareness, and then there is a further learning event that may be represented, say, as a Jeffrey-style or Adams-style learning event.11 In this way, one could argue that the awareness-growth aspect of the learning event always satisfies Reverse Bayesianism (the new propositions are in the first instance evidentially irrelevant to the comparison of the old basic propositions).
Subsequently, however, there may be a revision of probabilities over some partition of the possibility space, resulting in changes to the ratios of probabilities for the old basic propositions. The reason we reject this way of conceiving of the learning events described by our examples is that the two-part structure seems ultimately unmotivated. The second learning stage is an odd, spontaneous learning event that would be hard to rationalise. Hence, this would again seem to us to be an artificial and ad hoc way to save Reverse Bayesianism.

11 This suggestion resonates with the discussion in Hill (2010).

6 Diagnosis

So, Reverse Bayesianism fails with respect to propositions A and B if the awareness growth favours one of these propositions over the other. What happens in these cases is that the 'new' propositions that the agent comes to be aware of change the way she comprehends the 'old' propositions, in particular, how these propositions relate to other propositions. But this suggests a retreat from Reverse Bayesianism; a retreat to the kind of rigidity principle that defenders of Reverse Bayesianism apparently take as fundamental. Informally, the idea is that the probabilities of the old propositions, conditional on, roughly speaking, 'how things were before' (in Example 1, the theories that scientists were aware of at the beginning of the nineteenth century, say), should be rigid or unaffected by the awareness growth (the expansion of the fundamental theory space to include 'STR'). Such a rigidity principle looks to be the appropriate basis for conservative belief change in the awareness-growth setting. It is just that the condition does not entail Reverse Bayesianism when stated in general terms, namely, for all pairs of incompatible basic propositions. Or so the argument might go.

Even if this position were roughly right, the relevant rigidity condition would need to be properly spelled out. For instance, what precisely is meant by 'how things were before'?
The details of how an agent's proposition space changes with growing awareness are crucial if we want to fully understand conservative belief change in this setting. We thus turn to the details now. We eventually show that, when it comes to a notion of rigidity for awareness growth, the devil does lie in the details.

6.1 Awareness growth in detail

A model of awareness growth should do two things: i) represent the agent's own internal perspective, and in so doing permit both awareness growth due to refinement and awareness growth due to expansion, and ii) shed light on an agent's credences at any given time by relating propositions to atomic possibilities. The way forward, we suggest, is to divorce the agent's possibilities from objective possible worlds. (We will give a more comprehensive defence of this move in §6.2 below.) After all, the aim is to shed light on an agent's credences over propositions, and not, simultaneously, to represent the meaning of these propositions. They can simply go uninterpreted. The agent's possibilities can be defined as truth functions over uninterpreted propositions.12 In this way, the second modelling aim can be achieved without compromising the first aim of depicting an agent's limited awareness at a time and the ways in which this awareness may subsequently grow.

Recall that an agent's awareness context is defined by a set X of basic propositions of which she is aware (which we assume to be finite). We earlier defined basic propositions to be primitive propositions, representing simple facts about the world, that do not involve any logical connectives. As already noted, the basic propositions are not themselves given an interpretation in our model. (In other words, any deeper interpretation of these propositions, whether in terms of objective possible worlds or some other kind of structure, is not explicitly modelled here. We do not here take a stance on whether propositions should be identified with sets; see footnotes 4 and 13.)
Let the possibilities that the agent is aware of be truth functions, ωi, that return 'true/false' for each of the basic propositions. Note that below we will occasionally use ω1, ω2, ..., ωn to denote individual possibilities. The putative set of possibilities consists of all the distinct truth functions that take this form, i.e., effectively all the different combinations of truth values for the basic propositions. This is merely the putative or first-pass set of possibilities, since some will be deemed inconsistent by the agent (to be explained shortly) and thus excluded from the real set of possibilities (as recognised by the agent). We may describe the possibilities in terms of conjunctions of the basic propositions for which the ωi function in question returns 'true'. So, in the awareness context represented by Table 1, the possibility {ωi(French) = true, ωi(German) = false, ωi(Thriller) = true, ωi(Comedy) = false} can be described as 'French & Thriller'. From now on, we will use this latter way of describing possibilities.

12 As such, our account has some affinity with the 'subjective state-space' approach of economists Heifetz et al. (2006, 2008).

For the set of basic propositions X, let WX be the agent's (real) set of possibilities, which is a subset of the putative set of possibilities, containing only the possibilities that the agent regards as consistent. A possibility is consistent, by the agent's lights, if all its conjuncts could be true, i.e., if the agent does not take the conjuncts to be mutually inconsistent. What an agent takes to be the set of consistent possibilities will depend on what she regards as partitions of the proposition space (corresponding to properties or categories for which one and only one value can be assumed).
For instance, for the agent described by Table 1, one partition of the space is {'French', 'German'}, these being the candidate values for the language-type property; a necessary condition for being a consistent possibility, then, is that the conjuncts include only one of 'French', 'German'. So an agent's awareness context X may be just as well defined in terms of her possibility space, WX.

Any given basic proposition Xi can now be associated with a set of possibilities in WX: the ωi ∈ WX for which the proposition Xi is true. For simplicity, we refer to this set as {Xi}.13,14 We can now also generate a Boolean algebra, FX, in the usual way: ¬Xi is associated with the set WX \ {Xi}, Xi ∨ Xj is associated with the set {Xi} ∪ {Xj}, and Xi & Xj is associated with the set {Xi} ∩ {Xj}. For reasons that will become apparent shortly, the same proposition can be associated with different sets of possibilities in different awareness contexts. So, more formally, we can think of a proposition as a function from the awareness contexts in which the proposition plays a role to the corresponding sets of possibilities.15

13 We are not here suggesting that the basic propositions are identical to, or defined in terms of, the relevant set of possibilities. After all, the possibilities were themselves constructed from propositions that had some prior meaning. One can retain the traditional notion of propositions being identified with sets of objective possible worlds, as per, e.g., Stalnaker (1984), although this is not explicitly represented in our model. The relation of 'association' that we appeal to here is intended to be weaker than 'identity'.

14 Strictly speaking, the set in question should be thought of as being indexed to the relevant awareness context. If we wanted to make the index explicit, we could, for instance, write {Xi}X. But to simplify the notation, we omit making the index explicit.

Now let us address the dynamics of awareness.
We say that the agent's awareness grows when the awareness context shifts from X to X+ = X ∪ Xj where Xj is the set of propositions that are in X+ but not in X. Note that by the assumptions we made above, when the awareness context shifts from X to X+ there is a corresponding shift from WX to WX+ and from FX to FX+. Strictly speaking, WX and WX+ do not have any possibilities in common; after all, the possibilities in each are truth functions that have a different number of propositions in their domain. If, however, we allow that the possibilities may be described in terms of the proposition that they are each associated with (the conjunction of all basic propositions for which the function in question returns 'true'), then WX and WX+ may in certain cases (as we will see shortly) have possibilities in common.

Finally, while awareness growth by refinement and awareness growth by expansion have a lot in common, we can now characterise more precisely how they differ. To that end, let us measure the length of a possibility by the number of propositions for which the function in question returns 'true'. (Recall that we assume that the set of basic propositions is finite.) We say that the awareness growth was (purely) due to refinement if the number of possibilities in WX+ is greater than in WX, and moreover, at least some possibilities in WX+ are longer (in the sense just described) than the corresponding possibilities in WX. In contrast, we say that the awareness growth was (purely) due to expansion if the number of possibilities in WX+ is greater than in WX, without any possibilities becoming longer in the sense given.

Return again to our movie example, and suppose that in the least-aware context (Table 1), the only possibilities that the agent is aware of and considers consistent can be characterised as: 'French & Thriller', 'French & Comedy', 'German & Thriller', 'German & Comedy'.
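The definitions above can be sketched in Python as follows. The sketch rests on two simplifying assumptions that are not part of the text: consistency is taken to require exactly one true proposition from each partition, and an old possibility is taken to 'correspond' to a new one when its description is a proper part of the new description. The function names, and the way the Table 1, Table 2 and Table 3 contexts are encoded, are illustrative only.

```python
from itertools import product


def consistent_possibilities(partitions):
    """Build W_X: start from every truth function over the basic propositions
    (the putative possibilities) and keep those with exactly one true
    proposition per partition (an assumed consistency test). Each possibility
    is described by the set of basic propositions it makes true."""
    basic = [p for part in partitions for p in part]
    worlds = set()
    for values in product([True, False], repeat=len(basic)):
        omega = dict(zip(basic, values))
        if all(sum(omega[p] for p in part) == 1 for part in partitions):
            worlds.add(frozenset(p for p in basic if omega[p]))
    return worlds


def associated_set(proposition, worlds):
    """The set {X_i}: the possibilities in which the basic proposition is true."""
    return {w for w in worlds if proposition in w}


def classify_growth(old, new):
    """Classify a shift from W_X to W_X+ by the length criterion.
    Assumed reading of 'corresponding': an old possibility corresponds to a
    new one when its description is a proper subset of the new description."""
    if len(new) <= len(old):
        return "no growth in possibilities"
    lengthened = any(w_old < w_new for w_old in old for w_new in new)
    return "refinement" if lengthened else "expansion"


language = ["French", "German"]
table1 = consistent_possibilities([language, ["Thriller", "Comedy"]])
table2 = consistent_possibilities([language, ["Thriller", "Comedy"], ["High", "Low"]])
table3 = consistent_possibilities([language, ["Thriller", "Comedy", "Drama"]])

print(len(table1), len(table2), len(table3))   # 4 8 6
print(classify_growth(table1, table2))         # refinement
print(classify_growth(table1, table3))         # expansion
print(len(associated_set("Thriller", table1)),
      len(associated_set("Thriller", table2)))  # 2 4
```

The last line illustrates the earlier remark that the same proposition can be associated with different sets of possibilities in different awareness contexts: 'Thriller' picks out two possibilities in the Table 1 context and four in the refined one.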
In other words, she regards any possibility that involves 'French & German', and likewise 'Thriller & Comedy', as inconsistent. Now, when awareness grows due to refinement into high and low level language (as represented by the shift from the awareness context represented by Table 1 to the one represented by Table 2), the new possibilities are longer: 'French & Thriller & High', 'French & Thriller & Low', etc. The number of possibilities also grows, since, e.g., 'French & Thriller' becomes 'French & Thriller & High', 'French & Thriller & Low'. In contrast, when awareness grows due to an expansion, e.g., when the agent becomes aware of the possibility of the movie being a drama, which she takes to be inconsistent with the movie being either a comedy or a thriller (as represented by the shift from the awareness context represented by Table 1 to the one represented by Table 3), the possibilities do not become longer: we simply add 'French & Drama', 'German & Drama' to the first four possibilities.

15 To clarify: For awareness contexts where the proposition does not play a role, it is not associated with any set of possibilities.

6.2 Why not appeal to a 'catch-all'?

One might object that our first modelling aim can just as well be achieved by a traditional, and far more elegant, possible-worlds model. Even if all awareness growth is at base a refinement of the set of possible worlds, such a model can still do justice to an agent's internal perspective and to the difference between refinement and expansion. The latter simply amounts to a particular kind of refinement: one involving what was previously an unarticulated catch-all, that is, a proposition interpreted roughly as 'none of the above' or 'something else'.
Indeed, the more and less worked-out proposals in the philosophy literature for accommodating awareness growth appeal to an explicit catch-all, or else similarly abstract propositions that are place-holders for yet-to-be-properly-articulated propositions/theories.16 For instance, Maher (1995) assumes that the agent's algebra contains variable propositions for each of the yet-to-be-formulated theories, and he moreover assumes that the agent assigns a (non-zero) probability to each such proposition. Henderson et al. (2010) propose something similar, although with the added sophistication that the propositions in the agent's algebra form a hierarchy that remains fixed throughout the investigation, i.e., remains unchanged even when the agent becomes aware of new theories that effectively fill in this hierarchy.17 Wenmackers and Romeijn (2016) appeal to a single catch-all to account for the negation of all explicit theories to date, an idea that Earman (1992) also considers. The economists Grant and Quiggin (2013) incorporate a catch-all in their model too; it is assigned a probability based on the agent's past experience of limited awareness.

16 Bradley (2017) is an exception.

17 The proposal of Zabell (1992) for extending statistical inference to cases where previously unsuspected phenomena occur (such as in the so-called sampling of the species problem) can be understood along these lines. The probability function is defined over a set of hypotheses that are sufficiently abstract to accommodate all the possible phenomena, whatever they turn out to be. Moreover, by construction, the probability function does not depend on how exactly the abstract hypotheses are instantiated.

The problem with these proposals is that they either constrain what an agent may later come to be aware of (e.g. Maher op. cit., Henderson et al. op. cit.)
or else they appeal to catch-all propositions that are so abstract from the agent's point of view (after all, the agent has no idea how to specify the propositions' content) that it is unclear why it is useful, or whether it is even cogent, to depict the agent as entertaining these propositions. On the question of cogency: in order for an agent to make sense of a catch-all, she would presumably need to entertain some universal set of possibilities relative to which the catch-all can be defined as the complement of those possibilities she can properly articulate. But it is hard to see how the agent could have access to this universal set of possibilities (which might in fact not even be a coherent notion), given that, by assumption, some of these possibilities cannot be articulated. So, it is hard to see how the catch-all could be well-defined for the agent.18

Notwithstanding these doubts about cogency, the appeal to highly abstract catch-alls cannot be easily dismissed. After all, this approach has the kind of generality we seek in a model of awareness growth. The idea is that every partition of the agent's possibilities includes a catch-all proposition that represents remaining mutually inconsistent possibilities that the agent cannot articulate (Shimony 1970 was an early advocate of an idea of this kind). In favour of this kind of model, it might be argued that not only do the cogency worries not hold up on further inspection, but rationality positively requires an agent to entertain catch-alls (as argued by de Canson ms.). Alternatively, it might be argued that the best way to model an agent's limited perspective is to situate it in a more objective setting, whereby she implicitly assigns zero probability to that of which she is unaware. While we are sympathetic to these lines of argument, we find them both ultimately unconvincing.

As for the first: We admit that the cogency worries might, as it were, prove too much.
After all, the 'abstractness' of a proposition would seem to be a matter of degree rather than an on/off affair. Moreover, one can surely represent an agent as having credences in all sorts of propositions without thereby being committed to her being able to precisely articulate what these propositions mean. Nonetheless, all that these cogency counterarguments can establish is that an agent may entertain propositions that have the form of catch-alls. But that is not ruled out by our framework, since it too can accommodate abstract and unspecific propositions receiving positive probability. It is a much stronger claim, and not obviously true, that rationality requires an agent to entertain catch-alls, such that her possibility space always coincides with the set of possible worlds. Even if this turns out to be a compelling norm of rationality, better that it is not baked into the underlying model. As far as possible, we seek a model of an agent's epistemic state that leaves open the important normative questions.

18 We thank Alan Hájek for suggesting this way of putting the problem.

Turning now to the second line of argument: The idea is that the catch-all is accessible only to the modeller or to the agent in her more aware state, not the agent in question (in her less aware state), who is modelled as implicitly assigning the catch-all zero probability.19 In this way, there is no question that the catch-all is well defined. For example, an agent may be modelled as partitioning the set of fundamental physical theories into 'Newton's Theory', 'STR', plus, implicitly, 'other fundamental physical theory', where the last alternative is assigned zero probability and corresponds to all the remaining possible physical theories, e.g., from the modeller's all-seeing perspective.
In this way, if the agent were to become aware of a new fundamental theory, say 'Quantum Theory', this would not change the overall possibility space; the catch-all would simply be divided into two mutually inconsistent propositions, 'Quantum Theory' and '(revised) other fundamental physical theory'.

For some applications, there may well be special reason to capture a wiser, third-person or later-person perspective on an agent's limited awareness (see, e.g., Fagin and Halpern 1987). But we contend that that is not the case for our application. We are interested in the concerns or limited perspective of a single agent, and how these concerns change with time. For this purpose, there is no need to keep track of how the agent's awareness looks from some more expansive point of view. Doing so only detracts from the simplicity of the model. We maintain that our model described above is the clearest way to proceed.

19 We thank Richard Bradley for this suggestion.

6.3 Awareness Rigidity

So, let's finally return to the idea raised at the start of this section, namely, that some more fundamental, and broadly Bayesian, rigidity condition holds even when it comes to awareness growth. Recall that the thought was that this rigidity condition would hold in general, but that it would not in general imply Reverse Bayesianism. In particular, the hope would be that this rigidity condition does not imply Reverse Bayesianism in the situations in which we do not want the latter to hold, that is, in situations where the awareness growth is intuitively evidentially relevant to the comparison of some propositions.

Our suggestion for specifying a rigidity condition for awareness growth is to identify the smallest set of possibilities in the new awareness context that corresponds to what used to be the tautology in the old awareness context.
So, for instance, with respect to our movie example, when awareness grows by expansion to incorporate the new genre of 'Drama', as per the shift from Table 1 to Table 3, the proposition corresponding to the smallest set of new possibilities and which used to be the tautology in the old awareness context is the disjunction of all the old genres, i.e., 'Thriller ∨ Comedy'. In the case of refinement, any such proposition will simply correspond to the set of all possibilities constituting the new awareness context. This allows us to specify a rigidity condition that one might take to be the appropriate extension of Bayesian (i.e., conservative) belief change to the case of growing awareness:

Awareness Rigidity. Let T∗ in FX be the proposition that, amongst those associated with the full set WX, is associated with the smallest subset of WX+. For any rational agent and for any A ∈ FX:

P+(A | T∗) = P(A)

This, we think, captures the rigidity condition that defenders of Reverse Bayesianism take as more fundamental than Reverse Bayesianism itself. Unfortunately, it is not a compelling rationality requirement, especially for refinement. The problem is that Awareness Rigidity entails Reverse Bayesianism in cases where awareness grows by refinement (since it effectively requires that the probabilities for all propositions in the old awareness context remain unchanged). We previously argued that Reverse Bayesianism is not plausible even in cases of refinement. By modus tollens, Awareness Rigidity is then not a plausible principle for belief change.

Is there any reason to retract this position? We think not. Awareness of a new property, like the language level of a movie, may intuitively cause adjustment of the relative probabilities of old propositions. Moreover, our formal model reveals why this should not be surprising.
The set of possibilities associated with, say, 'Thriller', or even 'French & Thriller', is completely different before and after awareness grows to include language levels. Initially the set is constituted by possibilities that are effectively truth functions over a given set of basic propositions; after awareness growth, the set is constituted by possibilities that are truth functions over an enlarged set of basic propositions. We see that there is no clear delineation between those propositions and associated credences that are and are not affected by the awareness growth.

Awareness Rigidity might nevertheless be compelling for awareness growth by expansion. After all, when awareness grows by expansion, the new tautology contains possibilities that it did not contain before, and hence, Awareness Rigidity does not entail Reverse Bayesianism in cases of expansion. So, perhaps Awareness Rigidity is plausible when it comes to expansion, even though Reverse Bayesianism is not. Even in cases of expansion, however, it may be that the gained awareness shakes things up sufficiently in one's old awareness state that Awareness Rigidity is violated. Again, our formal model underscores the fact that all propositions are in some way affected by a growth in awareness, whether it be awareness growth due to expansion or refinement. So it is certainly not off the table that Awareness Rigidity may be violated in cases of awareness growth by expansion. We leave as an open question whether that is something that can rationally happen.

7 Concluding remarks

Our leading question was how to extend the Bayesian maxim of conservative belief change to cases of growing awareness. We have seen that the arguably most-worked-out proposal to this effect, namely, Reverse Bayesianism, does not hold in general. Nor does the more limited Awareness Rigidity principle.
That is, there are examples where an agent's awareness grows in a way that conflicts with Reverse Bayesianism, and sometimes with Awareness Rigidity too, without any apparent irrationality on behalf of the agent. By way of diagnosis, we suggested that when awareness grows, there is no clear boundary between those credences that are directly affected by the learning experience and those that are not. Thus an agent need not hold any aspect of her credence function fixed in such circumstances.

While Reverse Bayesianism must be abandoned as a general principle for belief revision in the case of growing awareness, it is arguably still an interesting relation that may hold between pairs of propositions in the transition from one belief state to another. As we have seen with the earlier motivating movie examples, there are circumstances where Reverse Bayesianism does intuitively hold. We suggest that these are cases in which what is learnt is evidentially irrelevant to the pairs of propositions at issue. In other words, Reverse Bayesianism is a useful relationship for characterising cases of awareness growth that is evidentially irrelevant to the pair of propositions at issue. We save a fuller and more precise discussion of this for elsewhere.20

20 We have benefited from discussing the content of this paper with Richard Bradley, Kamilla Haworth Buchter, Krister Bykvist, Catrin Campbell-Moore, Chloé de Canson, Edward Elliott, Thord Grünbaum, Alan Hájek, Brian Hedden, Anna Mahtani, Andreas Mogensen, Michael Nielsen, Joe Roussos, Julia Staffel, Daniel Stoljar, Christian Tarsney, Teruji Thomas, Aron Vallinder, Marie-Louise Vierø and Ignacio Ojea Quintana. Two referees and an editor of Mind also provided very useful feedback. We moreover received helpful comments when presenting early versions of this paper at Higher Seminar in Philosophy, Umeå University, 2017; 'Stockholm Workshop on Philosophy & Economics', KTH Royal Institute of Technology, 2017; Departmental Seminar, University of Vienna, 2017; 'What are Degrees of Belief?', University of Leeds, 2018; 'Foundations of Normative Decision Theory', University of Oxford, 2018; 'Ethics and Risk', Australian National University, 2018; 'Epistemic and Personal Transformation: Dealing with the Unknowable and Unimaginable', University of Queensland, 2019; and, in particular, at a reading group on unawareness at the Global Priorities Institute, University of Oxford, 2020. Orri gratefully acknowledges financial support from Riksbankens Jubileumsfond (The Swedish Foundation for Humanities and Social Sciences) through a Pro Futura Sciencia 2019-2024 fellowship. Katie's research was supported by an Australian National University (ANU) Futures grant, an Australian Research Council Discovery grant (grant number: 170101394) and the Humanising Machine Intelligence 'Grand Challenge' project at ANU. Finally, both Katie and Orri have received financial support from the project 'Climate Ethics and Future Generations' which is funded by Riksbankens Jubileumsfond (grant number M17-0372:1).

References

Anscombe, F. J. and R. J. Aumann (1963). A definition of subjective probability. The Annals of Mathematical Statistics 34(1), 199–205.
Bradley, R. (2005). Radical probabilism and Bayesian conditioning. Philosophy of Science 72(2), 342–364.
Bradley, R. (2017). Decision Theory with a Human Face. Cambridge University Press.
de Canson, C. (ms.). The nature of awareness growth.
Earman, J. (1992). Bayes or Bust? A Critical Examination of Bayesian Confirmation Theory. MIT Press.
Fagin, R. and J. Y. Halpern (1987). Belief, awareness, and limited reasoning. Artificial Intelligence 34(1), 39–76.
Glymour, C. (1980). Why I am not a Bayesian. In C. Glymour (Ed.), Theory and Evidence. Princeton University Press.
Grant, S. and J. Quiggin (2013). Bounded awareness, heuristics and the precautionary principle. Journal of Economic Behavior & Organization 93(C), 17–31.
Hájek, A. (2003). What conditional probability could not be. Synthese 137(3), 273–323.
Heifetz, A., M. Meier, and B. Schipper (2006). Interactive unawareness. Journal of Economic Theory 130(1), 78–94.
Heifetz, A., M. Meier, and B. C. Schipper (2008). A canonical model for interactive unawareness. Games and Economic Behavior 62(1), 304–324.
Henderson, L., N. D. Goodman, J. B. Tenenbaum, and J. F. Woodward (2010). The structure and dynamics of scientific theories: A hierarchical Bayesian perspective. Philosophy of Science 77(2), 172–200.
Hill, B. (2010). Awareness dynamics. Journal of Philosophical Logic 39(2), 113–137.
Jeffrey, R. (1965). The Logic of Decision. The University of Chicago Press.
Karni, E. and M.-L. Vierø (2013). "Reverse Bayesianism": A choice-based theory of growing awareness. American Economic Review 103(7), 2790–2810.
Karni, E. and M.-L. Vierø (2015). Probabilistic sophistication and reverse Bayesianism. Journal of Risk and Uncertainty 50(3), 189–208.
Maher, P. (1995). Probabilities for new theories. Philosophical Studies 77(1), 103–115.
Pettigrew, R. (2011). Epistemic utility arguments for probabilism. In E. Zalta (Ed.), Stanford Encyclopedia of Philosophy.
Savage, L. (1954). The Foundations of Statistics. John Wiley & Sons.
Shimony, A. (1970). Scientific inference. In R. Colodny (Ed.), The Nature and Function of Scientific Theories, pp. 79–172. University of Pittsburgh Press.
Stalnaker, R. (1984). Inquiry. MIT Press.
Wenmackers, S. and J. Romeijn (2016). New theory about old evidence. Synthese 193(4), 1225–1250.
Zabell, S. L. (1992). Predicting the unpredictable. Synthese 90(2), 205–232.