COMPARATIVE PROBABILITIES

Jason Konek

On the Bayesian view, belief is not just a binary, on-off matter. Bayesians model agents not as simply categorically believing or disbelieving propositions, but rather as having degrees of confidence, or degrees of belief, or credences in those propositions. Rather than flat out believing that your Kimchi Jjigae will turn out splendidly, you might, for example, be 0.7 confident that it will turn out splendidly. Or you might have less precise opinions on the matter. You might be more confident than not that it will turn out splendidly. You might be at least 0.6 confident and at most 0.9 confident that it will turn out splendidly. You might have any number of more or less informative opinions, but nevertheless fall short of having a precise credence on the matter. In that case, we say that your credences are imprecise.

Credences, whether precise or imprecise, play a number of important theoretical roles according to Bayesians. For example, a rational agent's credences determine expectations of measurable quantities (quantities like the size of the deficit 10 years hence, or the utility of an outcome), which capture her best estimates of those quantities. Those best estimates, in turn, typically rationalise or make sense of her evaluative attitudes and choice behaviour.

Suppose that you are considering donating to charity. You have credences regarding the cost of bulk food, shipping, and other matters relevant for estimating how good different donation options are. Your credences, let's imagine, determine a higher expected utility for giving cash directly to the poor than for investing in infrastructure development. These expected utilities, on the Bayesian view, capture your best estimates of how much good each option would produce. And these best estimates, in turn, rationalise or make sense of your evaluative attitudes: your opinion that direct-giving is the better action, perhaps.
Evaluative attitudes, in turn, rationalise choice behaviour. In the case at hand, your evaluative attitudes rationalise your choice to give cash directly to the poor rather than invest in infrastructure development.

According to many Bayesians, e.g., Koopman (1940a, 1940b), Good (1950), de Finetti (1951), Savage (1954), and Joyce (2010), certain types of doxastic attitudes (opinions of the form "X is at least as likely as Y," known as comparative beliefs) play an especially important role in explicating the concept of credence. These explications typically involve an important bit of mathematics known as a representation theorem. The aim of this chapter is straightforward, but fundamental. We will explore three very different approaches to explicating credence using comparative beliefs and representation theorems. Along the way, we will introduce a brand new account of credence: epistemic interpretivism. We will also evaluate how these respective accounts stand up to the criticisms of Hájek (2009), Meacham and Weisberg (2011), and Titelbaum (2015).

The rest of the chapter proceeds as follows. Section 1 outlines the main interpretations of comparative probability orderings, which mirror the main interpretations of quantitative probability functions. Section 2 homes in on one interpretation of comparative probability in particular: the subjectivist interpretation. Then it briefly surveys some important representation theorems from Kraft, Pratt, and Seidenberg (1959), Scott (1964), Suppes and Zanotti (1976), and Alon and Lehrer (2014). Section 3 outlines three different "comparativist" accounts of credence: the measurement-theoretic, decision-theoretic, and epistemic interpretivist accounts. Comparativist accounts explain what it is to have one credal state or another in terms of subjective comparative probability relations (or comparative belief relations) and representation theorems. Epistemic interpretivism is an entirely new account of credence.
So we spend a bit of time developing it. Section 4 examines criticisms of comparativist accounts by Hájek (2009), Meacham and Weisberg (2011), and Titelbaum (2015). Finally, Sections 5, 6, and 7 explore the extent to which our three different approaches can withstand these criticisms. We will pay special attention to the question of whether they vindicate probabilism: the thesis that rational credences satisfy the probability axioms.

1 Main interpretations of comparative probability

Probabilities seem to pop up all over the place. They feature in the respective explanations of all sorts of different phenomena. They help to explain, for example, singular events, such as the outcomes of particular experiments, particular one-off historical events, and the like. Consider some examples:

(1a) Why did the die land with a blue face up?
(1b) It has 5 blue faces and 1 red face. And it's fair. So it had a 5/6 probability of coming up blue.
(2a) Why did Rose get lung cancer?
(2b) She smoked for 30 years. And the probability of getting lung cancer if you smoke for so long is really high.

The high probability (= 5/6) of this particular die coming up blue on this particular toss helps to explain why it in fact came up blue. And the high probability of this particular person, Rose, getting lung cancer (as a result of her 30 years of smoking) helps to explain why she in fact got lung cancer.

Probabilities also help to explain why we ought to have high or low confidence in certain hypotheses. Consider:

(3a) Why should we think that Quantum Electrodynamics is true?
(3b) It's the best confirmed physical theory ever. It's extremely probable given the current evidence.
(4a) Why should we think that Jones stole the paintings?
(4b) Given his acrimonious history with the art museum's curator, the eyewitness testimony, and the DNA evidence, it's quite probable that Jones is guilty.
The extremely high probability of Quantum Electrodynamics given the current evidence at least partially explains why we ought to think that it is true. Likewise, the high probability that Jones stole the paintings given the eyewitness testimony, the DNA evidence, etc., at least partially explains why we ought to think that he is guilty.

Finally, probabilities explain and rationalise our behaviour. For example:

(5a) Why did you bet Aadil £100 that Manchester City would win their match against Newcastle?
(5b) Have you seen Newcastle lately? They're a joke. It's extremely probable that Manchester City will win.
(6a) Why did you go to Better Food Company rather than Sainsbury's?
(6b) I wanted fresh herbs, and it's more probable that the Better Food Company will have them.

The fact that it is extremely probable, in your view, that Manchester City will win the match helps to explain why you took the bet. It also helps to rationalise or make sense of your decision. And the fact that it's more probable, in your view, that the Better Food Company will have fresh herbs than Sainsbury's helps to explain and rationalise your choice to go to the Better Food Company.

So, probabilities seem to do quite a lot of explanatory work. But no single thing is shouldering the whole explanatory load in (1)–(6). Different kinds of probability do the explaining in different examples. In (1)–(2), it is the physical probability or chance of the singular event in question that helps to explain why the event actually occurs. (See Hájek, 2009, Gillies, 2000, and Hitchcock, 2012, for discussion of different theories of chance.) In (3)–(4), it is the logical probability or degree of confirmation of the hypothesis in question (conditional on the current evidence) that helps to explain why we ought to accept it. (See Earman, 1992, Hájek and Joyce, 2008, and Paris, 2011, for discussion of Bayesian confirmation theory and some of its issues.)
Finally, in (5)–(6), it is the subjective probability, or degree of belief, or credence, of the agent in question that helps to explain and rationalise her choice.

Formally, a probability function is just a particular type of real-valued function. Let Ω be a universal set, which we can think of as the set of "possible worlds" or "basic possibilities." And let F be a Boolean algebra of subsets of Ω, which we can think of as a set of "propositions." More carefully, we can think of each X ∈ F as the proposition that is true at each world w ∈ X, and false at each w* ∉ X. A Boolean algebra F of subsets of Ω has three important properties: (i) F contains Ω (i.e., Ω ∈ F); (ii) F is closed under complementation (i.e., if X ∈ F, then Ω − X ∈ F); and (iii) F is closed under unions (i.e., if X, Y ∈ F, then X ∪ Y ∈ F). A real-valued function p : F → ℝ is a probability function if and only if it satisfies the laws of (finitely additive) probability.

1. Normalization. p(Ω) = 1.
2. Nonnegativity. p(X) ≥ p(∅).
3. Finite Additivity. If X ∩ Y = ∅, then p(X ∪ Y) = p(X) + p(Y).

Axiom 1 says that p must assign probability 1 to the tautologous proposition Ω. Axiom 2 says that p must assign at least as high a probability to every proposition X as it does to the contradiction ∅. Axiom 3 says that the probability that p invests in a disjunction of incompatible propositions X and Y must be the sum of the probabilities that it invests in X and Y, respectively.

The three main interpretations of probability (physical probability, or chance; logical probability, or degree of confirmation; and subjective probability, or degree of belief, or credence) correspond to the three main types of phenomena that we use probability functions to model. For example, according to propensity theories of chance, chance functions measure how strongly a causal system is disposed to produce one outcome or other on a particular occasion.
Chance functions, on this view, are just probability functions that are used to model one type of physical system (causal systems) as having one type of gradable property (causal dispositions of varying strengths). Likewise, logical probability functions measure how strongly a body of evidence E supports or confirms a given hypothesis H. Logical probability functions are just probability functions that are used to model another type of target system (systems of propositions describing data and hypotheses) as having another type of gradable property (as having hypotheses which are supported to varying degrees by data propositions). Finally, subjective probability functions, or credence functions, measure (roughly) how confident an agent with some range of doxastic attitudes can be said to be of various propositions. Subjective probability functions are just probability functions that are used to model yet another type of system (an agent's doxastic attitudes) as having yet another type of gradable property (as either constituting or licensing varying degrees of confidence).

Of course, sorting out the precise relationship between these various models (probability functions) and their respective target systems is a delicate task. The "interpretations" above provide only rough, first-pass descriptions of that relationship. Part of this chapter's goal is to explore the relationship between subjective probability functions, in particular, and the underlying system of doxastic attitudes that they model.

Just as there are a few main interpretations of quantitative probability functions, corresponding to the main types of phenomena that we use those probability functions to model, so too are there a few main interpretations of "comparative probability orderings." Formally, a comparative probability ordering is just a particular type of relation ⪰ on a Boolean algebra F of subsets of Ω.
On each of the three main interpretations, "X ⪰ Y" means roughly that X is at least as probable as Y. (What exactly this amounts to, however, will vary from interpretation to interpretation.) Traditionally, a relation ⪰ on F is said to be a comparative probability ordering if and only if it satisfies de Finetti's (1964, pp. 100–101) axioms of comparative probability.

1. Nontriviality. Ω ⪰ ∅ and ∅ ⋡ Ω.
2. Nonnegativity. X ⪰ ∅.
3. Transitivity. If X ⪰ Y and Y ⪰ Z, then X ⪰ Z.
4. Totality. X ⪰ Y or Y ⪰ X.
5. Quasi-Additivity. If X ∩ Z = Y ∩ Z = ∅, then X ⪰ Y if and only if X ∪ Z ⪰ Y ∪ Z.

Axiom 1 says that the tautology Ω is strictly more probable than the contradiction ∅. Axiom 2 says that every proposition X is at least as probable as the contradiction ∅. Axioms 3 and 4 guarantee that ⪰ is a total preorder (i.e., reflexive, transitive, and total). Finally, axiom 5 says that disjoining X and Y with some incompatible Z does nothing to alter their comparative probability; so X is at least as probable as Y if and only if the disjunction of X and Z is at least as probable as the disjunction of Y and Z.

The three main interpretations of comparative probability correspond to the three main types of phenomena that we use comparative probability orderings to model. Physical comparative probability orderings, or chance orderings (on one theory of chance anyway, viz., propensity theory), model causal systems. In particular, they model causal systems C as being more or less strongly disposed to produce one outcome or another on a particular occasion:

X ⪰ Y iff C is at least as strongly disposed to produce an outcome w ∈ X and thereby make X true as it is to produce an outcome w* ∈ Y and thereby make Y true.

Logical comparative probability orderings model a rather different type of target system: systems of propositions describing data and hypotheses.
In particular, they model certain data D as supporting or confirming certain hypotheses H more than other data D* support other hypotheses H*:

⟨H, D⟩ ⪰ ⟨H*, D*⟩ iff D supports or confirms H at least as much as D* supports or confirms H*.

Finally, subjective comparative probability orderings model yet another type of target system: an agent's doxastic attitudes. In particular, they model agent A as being more or less confident that one proposition or another is true:

X ⪰ Y iff A is at least as confident that X is true as she is that Y is true.

Of course, the three main interpretations of comparative probability are really families of interpretations. All three types of comparative probability orderings come in different flavours. For example, behaviorists like de Finetti (1931, 1964) and Savage (1954) treat subjective comparative probability orderings as particular types of preference orderings. To be more confident that X is true than Y is, roughly speaking, to prefer a dollar bet on X to a dollar bet on Y. They subscribe to what Jeffrey calls the thesis of the primacy of practical reason, which says that between belief and preference, "it is preference that is real and primary" (Jeffrey, 1987, p. 590). Hence, "belief states that correspond to identical preference rankings of propositions are in fact one and the same" (Jeffrey, 1965/1983, p. 138).

Jeffrey (1965, 2002) and Joyce (1999), in contrast, do not subscribe to this thesis. On their view, being more confident that X is true than Y involves making a peculiarly doxastic judgment. Such doxastic judgments partially explain and rationalise our preferences. But they do not even supervene on preferences, let alone reduce to them. (Two agents, for example, could both be in a state of nirvana on this view, and so be indifferent between every prospect and the status quo, but nevertheless make different comparative probability judgments.)
And the laws governing rational subjective comparative probability judgments, on this account, are not simply special cases of the laws governing rational preference. Rather, they derive from peculiarly epistemic considerations, e.g., considerations of accuracy.

To have a general way of talking about comparative beliefs, without assuming that they satisfy de Finetti's axioms, let's introduce some terminology. Call any relation ⪰ on an algebra F of subsets of Ω that is used to model an agent's comparative beliefs a comparative belief relation. And call ⟨Ω, F, ⪰⟩ a comparative belief structure. Comparative belief relations may or may not be comparative probability orderings. That is, they may or may not satisfy de Finetti's axioms of comparative probability.

Why the hubbub about comparative belief? Why think that comparative belief relations have a particularly important role to play in modeling rational agents' doxastic states? What are they especially suited to do that precise, real-valued credence functions are not? There are a number of common answers to these questions.

The first is that comparative belief relations provide a more psychologically realistic model of agents' doxastic attitudes than precise, real-valued credence functions. Often I simply lack an opinion about which of two propositions is more plausible. I am not more confident that copper will be greater than £2/lb in 2025 (call this proposition C) than I am that nickel will be greater than £3/lb in 2025 (call this proposition N). Neither am I less confident in C than N, nor equally confident. I simply lack an opinion on the matter. We can model this using comparative belief relations. Just choose a relation ⪰ that does not rank C and N: C ⋡ N and N ⋡ C. The incompleteness in ⪰ reflects my lack of opinionation. Precise credence functions, on the other hand, do not allow for this sort of lack of opinionation.
Any agent with precise credences for C and N takes a stand on their comparative plausibility. She is either more confident in C than N, less confident in C than N, or equally confident in the two.1

1 See Suppes (1994, p. 19), Kyburg and Pittarelli (1996, p. 325), Kaplan (2010, p. 47), and Joyce (2010, p. 283) for similar remarks.

The second answer is evidentialist. Not only do real agents in fact have sparse and incomplete opinions, but they ought to have such opinions. If your evidence is incomplete and unspecific, then your comparative beliefs (and your other qualitative and comparative opinions) should be correspondingly incomplete to reflect the unspecific nature of that evidence. This is the response that is most justified, or warranted, or appropriate in light of such evidence. Again, we can capture this sort of lack of opinionation using comparative belief relations, but not using precise credence functions. Having precise credences requires having total or complete comparative beliefs (as well as total conditional comparative beliefs, total preferences, and so on).2

The third answer is information-theoretic. Proponents of maximum entropy methods, for example, argue that you ought to have the least informative doxastic state consistent with your evidence. And according to any plausible informativeness or entropy measure for comparative beliefs, any incomplete comparative belief relation will be less informative than any extension of it.3 As a result, minimizing informativeness will often require adopting incomplete comparative beliefs. Precise credences, however, do not allow for incomplete comparative beliefs. As Joyce puts it, adopting precise credences, in many evidential circumstances, "amounts to pretending that you have lots and lots of information that you simply don't have" (Joyce, 2010, p. 284).

The final common answer is that comparative belief is more explanatorily fundamental than precise credence.
Comparative beliefs figure into the best explanation of what precise credences are, but not vice versa. It is worthwhile, then, exploring the various accounts of credence that aim to furnish such an explanation. We will turn our attention to them shortly. Each of these accounts, however, makes use of an important bit of mathematics known as a representation theorem. So our first task is to get familiar with the nuts and bolts of representation theorems.

2 See Joyce (2005, p. 171) for similar remarks.

3 See Abellan and Moral (2000, 2003) for measures of entropy for imprecise probability models which might also serve as measures of entropy for comparative belief relations.

2 Representation theorems

Suppose that Monty Hall invites you to choose one of three doors: either door a, b, or c. Behind one of these doors: a car. Behind the other two: a goat. You are more confident that the car is behind a than b, let's imagine. You are also more confident that it's behind b than c. But that is all. You do not take a stand, for example, on whether it's more likely to be behind either b or c than a, or vice versa. You abstain from judgment on all other matters.

Let w_a be the world in which the car is behind door a, w_b be the world in which the car is behind door b, and w_c be the world in which the car is behind door c. Then we can represent you as having comparative beliefs about propositions in the following Boolean algebra:

F = { {w_a, w_b, w_c}, {w_a, w_b}, {w_a, w_c}, {w_b, w_c}, {w_a}, {w_b}, {w_c}, ∅ }.

And we can represent those fragmentary comparative beliefs as follows:

{w_a, w_b, w_c} ≻ {w_a} ≻ {w_b} ≻ {w_c} ≻ ∅,

where X ≻ Y is shorthand for X ⪰ Y and Y ⋡ X.4

Your comparative belief relation ⪰ is not a comparative probability ordering, i.e., ⪰ does not satisfy de Finetti's axioms of comparative probability. Relation ⪰ violates Quasi-Additivity, for example, as well as Totality.
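The claim that the Monty Hall relation violates both Totality and Quasi-Additivity can be checked by brute force over the eight propositions in F. The following is only a minimal sketch: the names `chain`, `geq`, `totality_violations`, and `quasi_additivity_violations` are illustrative, and the encoding assumes that the agent's only judgments are those in the displayed chain, closed under reflexivity and transitivity.

```python
from itertools import combinations

WORLDS = "abc"  # world x stands for w_x: the car is behind door x

# The Boolean algebra F: all subsets of the three worlds.
F = [frozenset(c) for r in range(len(WORLDS) + 1)
     for c in combinations(WORLDS, r)]

# The agent's explicit judgments, from most to least probable.
chain = [frozenset("abc"), frozenset("a"), frozenset("b"),
         frozenset("c"), frozenset()]

# Assumption: X >= Y holds exactly when X is Y or X precedes Y in the
# chain (reflexive-transitive closure); nothing else is ranked.
geq = {(chain[i], chain[j])
       for i in range(len(chain)) for j in range(i, len(chain))}


def totality_violations(geq, algebra):
    """Unordered pairs of distinct propositions that geq leaves unranked."""
    return [(x, y) for x, y in combinations(algebra, 2)
            if (x, y) not in geq and (y, x) not in geq]


def quasi_additivity_violations(geq, algebra):
    """Triples (X, Y, Z), with Z disjoint from X and from Y, for which
    X >= Y and (X u Z) >= (Y u Z) do not stand or fall together."""
    return [(x, y, z)
            for x in algebra for y in algebra for z in algebra
            if not (x & z) and not (y & z)
            and ((x, y) in geq) != ((x | z, y | z) in geq)]


unranked = totality_violations(geq, F)
qa_failures = quasi_additivity_violations(geq, F)
print(len(unranked), "unranked pairs")
print((frozenset("a"), frozenset("b"), frozenset("c")) in qa_failures)
```

Under this encoding, the triple X = {w_a}, Y = {w_b}, Z = {w_c} shows up among the Quasi-Additivity failures, and the pair {w_a, w_c}, {w_b, w_c} among the unranked pairs.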
You are, after all, more confident that the car is behind a than b: {w_a} ≻ {w_b}. So de Finetti's Quasi-Additivity axiom demands that you also be more confident that it is behind a or c than you are that it is behind b or c. But you abstain from judgment on the matter: {w_a, w_c} ⋡ {w_b, w_c} and {w_b, w_c} ⋡ {w_a, w_c}.

A few back-of-the-envelope calculations suffice to show that de Finetti's axioms are necessary for probabilistic representability. Following Savage (1954), we say that:

p fully agrees with ⪰ iff X ⪰ Y ⇔ p(X) ≥ p(Y).

We say that ⪰ is (fully) probabilistically representable iff there is a probability function p that fully agrees with ⪰. Since your comparative belief relation ⪰ does not satisfy de Finetti's axioms, in our little example, it is not probabilistically representable.

4 This shorthand is inadequate. You may well think X ⪰ Y and Y ⋡ X without thinking X ≻ Y. Imagine, for example, that you recently learned that Y entails X. So you think that X is at least as likely as Y, i.e., X ⪰ Y. But you have no idea whether the entailment goes both ways. So you withhold judgment about whether Y is at least as likely as X, i.e., Y ⋡ X. For exactly the same reason you withhold judgment about whether X is strictly more likely than Y, i.e., X ⊁ Y. We would do better, then, to represent your doxastic state with a pair of relations ⟨⪰, ≻⟩. But historically one or the other has been taken as primitive. For ease of exposition, we follow de Finetti (1951), Savage (1954), and Krantz, Luce, Suppes, and Tversky (1971), who take ⪰ as primitive.

De Finetti (1951) famously conjectured that his axioms encode not only necessary conditions for probabilistic representability, but sufficient conditions as well. The question: was de Finetti right?

Let ⟨Ω, F, ⪰⟩ be an agent's comparative belief structure. (For now, assume that F is finite.) Probability functions that fully agree with ⪰ show that we can think of that structure ⟨Ω, F, ⪰⟩ numerically, so to speak.
We can map the propositions X in F to real-valued proxies p(X). And we can do so in such a way that one proxy p(X) is at least as large as another p(Y) exactly when our agent is at least as confident in X as in Y. So the familiar "greater than or equal to" relation ≥ on the real numbers ℝ is a mirror image of our agent's comparative belief relation ⪰ on F.

Kraft et al. (1959) show that de Finetti's conjecture is false. Though de Finetti's axioms are necessary for probabilistic representability, they are not sufficient. To establish this, Kraft et al. construct a clever counterexample involving a comparative belief relation ⪰ over the Boolean algebra G of all subsets of Ω = {w_a, w_b, w_c, w_d, w_e}. Their relation ⪰ satisfies de Finetti's axioms of comparative probability, and also the following inequalities:

{w_d} ≻ {w_a, w_c},   (1)
{w_b, w_c} ≻ {w_a, w_d},   (2)
{w_a, w_e} ≻ {w_c, w_d}.   (3)

As a result, any probability function p that fully agrees with ⪰ satisfies the corresponding inequalities (to simplify notation, we let p({w}) = p(w)):

p(w_d) > p(w_a) + p(w_c),   (4)
p(w_b) + p(w_c) > p(w_a) + p(w_d),   (5)
p(w_a) + p(w_e) > p(w_c) + p(w_d).   (6)

But any p that satisfies (4)–(6) also satisfies (7) (simply sum the inequalities and cancel the terms that appear on both sides).

p(w_b) + p(w_e) > p(w_a) + p(w_c) + p(w_d).   (7)

Notice, however, that {w_b, w_e} and {w_a, w_c, w_d} appear nowhere in (1)–(3). So Transitivity does not constrain how you order them. Neither do any supersets of {w_b, w_e} or {w_a, w_c, w_d} appear there. So Quasi-Additivity does not constrain how you order them either. Hence, for all de Finetti's axioms say, you can order {w_b, w_e} and {w_a, w_c, w_d} any way you please. So Kraft et al. make ⪰ satisfy (8).

{w_a, w_c, w_d} ⪰ {w_b, w_e}.   (8)

But if p fully agrees with ⪰, then (8) requires:

p(w_b) + p(w_e) ≤ p(w_a) + p(w_c) + p(w_d).   (9)

Lines (7) and (9), however, are jointly unsatisfiable. So no probability function p fully agrees with ⪰.

2.1 Scott's Theorem

So de Finetti's conjecture is false.
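The arithmetic behind the Kraft et al. counterexample is easy to check mechanically. The sketch below is illustrative only (the helper `diff` and the variable names are not from the source): it encodes each of (4)–(6) and (9) as a vector of coefficients on the five atomic probabilities and confirms that the four vectors sum to zero. Since (4)–(6) each require a strictly positive quantity and (9) a non-negative one, a zero sum means the four constraints cannot hold together.

```python
def diff(lhs, rhs):
    """Coefficient vector of p(lhs) - p(rhs), one entry per world."""
    coeffs = {w: 0 for w in "abcde"}
    for w in lhs:
        coeffs[w] += 1
    for w in rhs:
        coeffs[w] -= 1
    return coeffs


# (4)-(6): each of these quantities must be > 0 if p fully agrees.
strict = [diff("d", "ac"), diff("bc", "ad"), diff("ae", "cd")]
# (9): this quantity must be >= 0 if p fully agrees.
weak = diff("acd", "be")

# Sum of the three strict inequalities: this is line (7).
line7 = {w: sum(c[w] for c in strict) for w in "abcde"}
# Adding (9) on top: every coefficient cancels.
total = {w: line7[w] + weak[w] for w in "abcde"}

print(line7)  # coefficients of p(w_b) + p(w_e) - p(w_a) - p(w_c) - p(w_d)
print(total)  # all zeros: (positive) + (non-negative) = 0 is impossible
```

The same bookkeeping, with sets of propositions in place of inequalities, is what the Isovalence axiom introduced in this section formalises.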
De Finetti's axioms of comparative probability are not necessary and sufficient for probabilistic representability. Luckily, Kraft et al. (1959) and Scott (1964) provide the requisite fix. They provide stronger axioms that are both necessary and sufficient for probabilistic representability. Scott's axioms are more straightforward, so we will focus our attention on them.

Before stating Scott's theorem, it is worth noting that our formulation abuses notation a bit. Expressions 'X_i' and 'Y_i' refer both to propositions in F, as well as their characteristic functions, i.e., functions that take the value 1 at worlds w where X_i (or Y_i) is true (i.e., w ∈ X_i), and take the value 0 at worlds w′ where X_i is false (i.e., w′ ∉ X_i). This will turn out to be a helpful bit of sloppiness. Scott (1964) proves the following:

Scott's Theorem. Every comparative belief structure ⟨Ω, F, ⪰⟩ (with finite F) has a probability function p : F → ℝ that fully agrees with ⪰, in the sense that

X ⪰ Y ⇔ p(X) ≥ p(Y),

if and only if ⪰ satisfies the following axioms.

1. Non-triviality. Ω ≻ ∅.
2. Non-negativity. X ⪰ ∅.
3. Totality. X ⪰ Y or Y ⪰ X.
4. Isovalence. If X_1 + ... + X_n = Y_1 + ... + Y_n and X_i ⪰ Y_i for all i ≤ n, then Y_i ⪰ X_i for all i ≤ n as well.

Axiom 4, sometimes called Scott's axiom, the Isovalence axiom, or the Finite cancellation axiom, is the one new axiom of the bunch. To see what it says, note that X_1(w) + ... + X_n(w) counts the number of truths in the set {X_1, ..., X_n} at world w. Ditto for Y_1(w) + ... + Y_n(w). So X_1 + ... + X_n = Y_1 + ... + Y_n says that the two sets of propositions, {X_1, ..., X_n} and {Y_1, ..., Y_n}, contain the same number of truths come what may, i.e., in every possible world. We call such sets of propositions isovalent. In light of this, the Isovalence axiom says that if {X_1, ..., X_n} and {Y_1, ..., Y_n} are isovalent, then you cannot think that the X_i are uniformly more plausible than the Y_i.
("Uniformly more plausible" here means that you think that X_i is at least as plausible as Y_i for all i, and that X_j is strictly more plausible than Y_j for some j.) After all, an equal number of the X_i and Y_i are guaranteed to be true! So you can only think that the X_i are at least as plausible as the Y_i across the board if you think that they are equally plausible.

But how exactly does Scott prove his representation theorem? It is worth walking through the proof strategy informally. This will help interested readers dig through the mathematical minutiae in Scott (1964). Indeed, it will prove instructive to use Scott's strategy to establish something slightly stronger than Scott's theorem.

Generalised Scott's Theorem (GST). For any comparative belief structure ⟨Ω, F, ⪰⟩ with finite F and a comparative belief relation ⪰ that satisfies

1. Non-triviality. Ω ≻ ∅,
2. Non-negativity. X ⪰ ∅,

the following two conditions are equivalent.

3. Isovalence. If X_1 + ... + X_n = Y_1 + ... + Y_n and X_i ⪰ Y_i for all i ≤ n, then Y_i ⪰ X_i for all i ≤ n as well.
4. Strong representability. There exists a probability function p : F → ℝ that strongly agrees with ⪰ in the sense that
(i) X ⪰ Y ⇒ p(X) ≥ p(Y),
(ii) X ≻ Y ⇒ p(X) > p(Y).

Scott's theorem, as we shall see at the end of Section 2, follows fairly straightforwardly from GST. We prove the GST in the appendix.

The key insight required for proving GST is this. In the presence of Non-triviality and Non-negativity, strong representability boils down to sorting almost desirable gambles from undesirable gambles.5 On top of this, Scott (1964) shows that sorting almost desirable from undesirable gambles is equivalent to satisfying Isovalence.6 Figure 1 summarizes the situation.

5 For an accessible introduction to desirable gambles, see Walley (2000). See Quaeghebeur (2014) for more detail.

Figure 1: Logical relations between properties of ⪰. Given Non-triviality and Non-negativity: Strong rep. ⇔ Sort gambles ⇔ Isovalence.
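To make the Isovalence condition concrete, it helps to see it fail. The sketch below revisits the Kraft et al. relation on five worlds: it takes the left-hand sides of judgments (1)–(3) and (8) as the X_i and the right-hand sides as the Y_i, and checks that the two families have the same number of truths at every world. The helper `truth_counts` is an illustrative name, not something from the source.

```python
def truth_counts(props, worlds):
    """Number of propositions in `props` true at each world, i.e. the
    pointwise sum of their characteristic functions."""
    return {w: sum(w in p for p in props) for w in worlds}


WORLDS = "abcde"

# X_i >= Y_i holds for each i in the Kraft et al. relation:
# strictly for the first three pairs, weakly for the fourth.
xs = [set("d"), set("bc"), set("ae"), set("acd")]
ys = [set("ac"), set("ad"), set("cd"), set("be")]

x_counts = truth_counts(xs, WORLDS)
y_counts = truth_counts(ys, WORLDS)
isovalent = x_counts == y_counts

print(x_counts)
print(y_counts)
print("isovalent:", isovalent)
```

The two families are isovalent, and each X_i is ranked at least as high as its Y_i, yet the first three comparisons are strict, so Y_i ⪰ X_i fails for them. The Kraft et al. relation therefore violates Isovalence, which is exactly why it satisfies de Finetti's axioms without being probabilistically representable.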
6 More carefully, Scott (1964) shows that for any comparative belief relation ⪰ that satisfies Non-triviality and Non-negativity, satisfying Isovalence is sufficient for sorting almost desirable from undesirable gambles. Showing that it is necessary is straightforward. See the appendix for proof.

Gambles are measurable quantities G : Ω → ℝ. Say that a gamble G is almost desirable relative to ⪰ iff it is a non-negative linear combination of almost desirable components: (X_1 − Y_1), ..., (X_n − Y_n). And say that each component X_i − Y_i is almost desirable iff X_i ⪰ Y_i. Gamble G is a non-negative linear combination of (X_1 − Y_1), ..., (X_n − Y_n) just in case:

G = Σ_i λ_i (X_i − Y_i)

for some λ_1, ..., λ_n ≥ 0.

We call components X_i − Y_i almost desirable if X_i ⪰ Y_i because any probability function p that strongly agrees with ⪰ determines a non-negative expected value for X_i − Y_i:

X_i ⪰ Y_i ⇒ p(X_i) ≥ p(Y_i)
⇔ E_p[X_i] ≥ E_p[Y_i]
⇔ E_p[X_i − Y_i] ≥ 0.

So if we interpret those values as payoffs in utility, then p expects X_i − Y_i to be at least as good as the status quo (i.e., its expected utility is non-negative). Likewise, we call G almost desirable if it is a non-negative linear combination of almost desirable components because any probability function p that strongly agrees with ⪰ determines a non-negative expected value for G:

X_i ⪰ Y_i for all i ⇒ E_p[X_i − Y_i] ≥ 0 for all i
⇒ Σ_i λ_i E_p[X_i − Y_i] ≥ 0
⇔ E_p[Σ_i λ_i (X_i − Y_i)] ≥ 0
⇔ E_p[G] ≥ 0.

Similar remarks apply to undesirable gambles. We call a gamble G* undesirable relative to ⪰ iff it is a convex combination of undesirable components (X*_1 − Y*_1), ..., (X*_n − Y*_n), so that

G* = Σ_i λ*_i (X*_i − Y*_i)

for some λ*_1, ..., λ*_n ≥ 0 with Σ_i λ*_i = 1. A component X*_i − Y*_i is undesirable iff X*_i ≺ Y*_i. The reason is the same as before.
Any probability function that strongly agrees with ⪰ determines a negative expected value for undesirable components, as well as convex combinations of undesirable components.

Now say that ⪰ sorts almost desirable gambles from undesirable ones iff the two sets of gambles are disjoint. That is, if

A = { G | G is almost desirable relative to ⪰ }

and

U = { G* | G* is undesirable relative to ⪰ },

then ⪰ sorts almost desirable gambles from undesirable ones iff A ∩ U = ∅. If ⪰ fails to sort gambles in this way, then some gamble is both almost desirable and undesirable, i.e., G = G* for some almost desirable gamble G and some undesirable gamble G*. And if that is the case, then there is no probability function that strongly agrees with ⪰. (Moreover, since full probabilistic representability entails strong representability, there is no probability function that fully agrees with it either.) If ⪰ were strongly representable, then we would have both E_p[G] = E_p[G*] and E_p[G*] < 0 ≤ E_p[G] for some probability function p, which is impossible.

This shows that sorting almost desirable from undesirable gambles is necessary for strong agreement with a probability function, which is itself necessary for full agreement with a probability function, i.e., probabilistic representability. Scott's insight, though, is that it is also sufficient, in the presence of Non-triviality and Non-negativity. Given that ⪰ satisfies Non-triviality and Non-negativity, it sorts almost desirable from undesirable gambles if and only if it strongly agrees with a probability function. What's more, if ⪰ is total as well, then strong agreement is equivalent to full agreement. So non-trivial, non-negative, total comparative belief relations sort almost desirable from undesirable gambles if and only if they are probabilistically representable. See Figure 2.

To prove this, Scott uses what is known as a hyperplane separation theorem.
The hyperplane separation theorem guarantees that for any two closed, convex, disjoint sets, there is a hyperplane that strictly separates them (Kuhn & Tucker, 1956, p. 50). Now note that $\mathcal{A}$ is the closed, convex polyhedral cone generated by the set $\{ X - Y \mid X \succeq Y \}$. Likewise, $\mathcal{U}$ is the convex hull of $\{ Y - X \mid X \succ Y \}$, which is a closed and convex set. And if $\succeq$ sorts almost desirable from undesirable gambles, then they are also disjoint. So there is a hyperplane that strictly separates $\mathcal{A}$ and $\mathcal{U}$ (see Figure 3).

[Figure 2: Logical relations between properties of $\succeq$. Given Non-triviality, Non-negativity and Totality: strong representability ⇔ full probabilistic representability; sorting gambles ⇔ Isovalence; and the first pair is equivalent to the second.]

[Figure 3: Hyperplane strictly separating $\mathcal{A}$ and $\mathcal{U}$.]

This hyperplane determines (in effect) an expectation operator $E$. Gambles $G$ on one side of the hyperplane get positive expected values according to $E$. Gambles on the other side get negative ones. Precisely how high or low $E[G]$ happens to be is determined by $G$'s distance from the hyperplane. The resulting expectation operator $E$ assigns a non-negative value to every almost desirable gamble $G$ in $\mathcal{A}$, and a negative value to every undesirable gamble $G^*$ in $\mathcal{U}$:

$$E[G] \geq 0 \text{ for all } G \in \mathcal{A}, \qquad E[G^*] < 0 \text{ for all } G^* \in \mathcal{U}.$$

And from this expectation operator, $E$, it is fairly straightforward to extract a probability function $p$ that strongly agrees with $\succeq$. Just let $p(X) = E[X]$ for all $X \in \mathcal{F}$.[7] Then we have:

$$X_i \succeq Y_i \;\Rightarrow\; E[X_i - Y_i] \geq 0 \;\Leftrightarrow\; E[X_i] \geq E[Y_i] \;\Leftrightarrow\; p(X_i) \geq p(Y_i).$$

We also have:

$$X^*_i \prec Y^*_i \;\Rightarrow\; E[X^*_i - Y^*_i] < 0 \;\Leftrightarrow\; E[X^*_i] < E[Y^*_i] \;\Leftrightarrow\; p(X^*_i) < p(Y^*_i).$$

The upshot: sorting almost desirable from undesirable gambles is both necessary and sufficient for strong agreement with a probability function (in the presence of Non-triviality and Non-negativity).
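The definitions above can be made concrete with a small computation. The following is a minimal Python sketch, not Scott's construction: the three-world space, the probability values, the example comparative judgments, and all helper names (`prob`, `indicator`, `expectation`, `combine`, `strongly_agrees`) are assumptions introduced purely for illustration. It checks that a given probability function strongly agrees with a handful of comparative beliefs, and that it then assigns a non-negative expectation to an almost desirable gamble and a negative expectation to an undesirable one.

```python
# Illustrative three-world example. All names and numbers here are
# assumptions made for this sketch; none come from the chapter.
WORLDS = ("w1", "w2", "w3")
P = {"w1": 0.5, "w2": 0.3, "w3": 0.2}  # a probability function on worlds


def prob(event):
    """Probability of a proposition, given as a set of worlds."""
    return sum(P[w] for w in event)


def indicator(event):
    """Proposition as a gamble: pays 1 where it is true, 0 elsewhere."""
    return {w: 1.0 if w in event else 0.0 for w in WORLDS}


def expectation(gamble):
    """Expected value of a gamble (dict from worlds to payoffs) under P."""
    return sum(P[w] * gamble[w] for w in WORLDS)


def combine(components, weights):
    """Gamble sum_i weights[i] * (X_i - Y_i) for components (X_i, Y_i)."""
    return {
        w: sum(lam * (indicator(x)[w] - indicator(y)[w])
               for lam, (x, y) in zip(weights, components))
        for w in WORLDS
    }


def strongly_agrees(weak, strict):
    """True iff P respects every weak (X >= Y) and strict (X > Y) judgment."""
    return (all(prob(x) >= prob(y) for x, y in weak)
            and all(prob(x) > prob(y) for x, y in strict))


# Hypothetical comparative beliefs: pairs (X, Y) read as "X is at least as
# likely as Y" (weak) or "X is strictly more likely than Y" (strict).
weak = [({"w1"}, {"w2"}), ({"w1", "w2"}, {"w3"})]
strict = [({"w2"}, {"w3"})]

# Almost desirable gamble: non-negative combination of X_i - Y_i.
G = combine(weak, [2.0, 0.5])
# Undesirable gamble: convex combination of Y - X for strict pairs X > Y.
G_star = combine([(y, x) for x, y in strict], [1.0])

print(strongly_agrees(weak, strict), expectation(G), expectation(G_star))
```

In this toy case the probability function strongly agrees with the listed judgments, and the two expectations come out at approximately 0.7 and -0.1, so the almost desirable gamble and the undesirable gamble fall on opposite sides of zero, as the argument in the text requires.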
Now for the kicker: a comparative belief relation $\succeq$, whether or not it satisfies Non-triviality and Non-negativity, sorts almost desirable from undesirable gambles (in the sense that $\mathcal{A} \cap \mathcal{U} = \emptyset$) if and only if it satisfies Isovalence.[8] Hence non-trivial and non-negative $\succeq$ are strongly representable if and only if they satisfy Isovalence. What's more, as we mentioned above, for total comparative belief relations it's easy to see that strong agreement with a probability function is equivalent to full agreement. So non-trivial, non-negative, total $\succeq$ are fully probabilistically representable if and only if they satisfy Isovalence. This is the main thrust of Scott's theorem.

[7] More methodically, the hyperplane separation theorem gives a strictly separating linear functional, $\varphi$. But given that $\succeq$ satisfies Non-triviality and Non-negativity, $\mathcal{A}$ and $\mathcal{U}$ have a certain structure, which guarantees that we can normalise $\varphi$ to arrive at an expectation operator $E$. For example, Non-triviality ensures that $\emptyset - \Omega \in \mathcal{U}$. Hence $\varphi(\Omega) > 0$. Normalising then gives us $E[\Omega] = 1$. Similarly, Non-negativity ensures that $X - \emptyset \in \mathcal{A}$. Hence $\varphi(X) \geq 0$, and in turn $E[X] \geq 0$.

[8] See Scott (1964, pp. 235–6) for the proof of sufficiency. We present a simplified version of both necessity and sufficiency in the appendix.

2.2 Varieties of Representability

There are, of course, other types of representability besides just strong and full probabilistic representability. For example, a probability function $p$ strongly agrees with $\succeq$ just in case it satisfies two conditions:

$$X \succeq Y \Rightarrow p(X) \geq p(Y), \qquad X \succ Y \Rightarrow p(X) > p(Y).$$

We can pick apart these two conditions to arrive at two weaker notions of representability. Say that $p$ almost agrees with $\succeq$ iff

$$X \succeq Y \Rightarrow p(X) \geq p(Y),$$

and also that $p$ partially agrees with $\succeq$ iff

$$X \succ Y \Rightarrow p(X) > p(Y).$$

A comparative belief relation $\succeq$ is almost representable if there is a probability function $p$ that almost agrees with $\succeq$.
Likewise, $\succeq$ is partially representable if there is a probability function $p$ that partially agrees with $\succeq$. Kraft et al. (1959) show that $\succeq$ is almost representable if and only if it satisfies the Almost-Cancellation axiom.

Almost-Cancellation. If $X_1 + \ldots + X_n < Y_1 + \ldots + Y_n$ and $X_i \succeq Y_i$ for all $i \neq j$, then $X_j \not\succeq Y_j$.

Similarly, Adams (1965) and Fishburn (1969) show that $\succeq$ is partially representable if and only if it satisfies the Partial-Cancellation axiom.

Partial-Cancellation. If $X_1 + \ldots + X_n \leq Y_1 + \ldots + Y_n$ and $X_i \succ Y_i$ for all $i \neq j$, then $X_j \not\succ Y_j$.

It does not, to be clear, follow from these two results that $\succeq$ is strongly representable if and only if $\succeq$ satisfies both the Almost-Cancellation and Partial-Cancellation axioms. Satisfying Almost-Cancellation and Partial-Cancellation would simply guarantee that (i) some probability function $p$ almost agrees with $\succeq$, and (ii) some possibly distinct probability function $q$ partially agrees with $\succeq$. But strong representability requires that a single probability function do both types of agreeing. It is an open question what exactly is required for strong representability. (Of course, GST identifies necessary and sufficient conditions for strong representability given Non-triviality and Non-negativity. But clearly neither of those conditions is itself necessary for strong representability.)

Almost, partial, and strong representability all place negative demands on your comparative beliefs. They require you to avoid certain sets of comparative beliefs. Say that $p$ endorses $\succeq$ iff

$$p(X) \geq p(Y) \Rightarrow X \succeq Y.$$

For your comparative belief relation $\succeq$ to be almost representable, you must avoid having weak comparative beliefs that no probability function whatsoever endorses. Likewise, for $\succeq$ to be partially representable, you must avoid having strict comparative beliefs that no probability function endorses. For $\succeq$ to be strongly representable, you must avoid both.

Full probabilistic representability is stronger.
It makes positive demands as well as negative demands on your comparative beliefs. Full probabilistic representability requires your comparative beliefs to be sufficiently rich and specific that some probability function endorses exactly those comparative beliefs. So not only must you avoid comparative beliefs that are not endorsed by any probability function, but you must positively go in for all of the comparative beliefs endorsed by some probability function.

Imprecise representability, or IP-representability, strikes a balance between these previous types. Like strong representability, IP-representability places negative demands on your comparative beliefs. It requires you to avoid comparative beliefs that no probability function endorses. But like full probabilistic representability, it also makes positive demands on your comparative beliefs. It does not go so far as to demand that you go in for all of the comparative beliefs endorsed by some probability function. But it does say that you must already be more confident in $X$ than $Y$ if every probability function that endorses your other comparative beliefs endorses $X \succ Y$ as well. In this way, it requires you to draw out the "probabilistic consequences" of your other comparative beliefs.

Formally, a comparative belief relation $\succeq$ is imprecisely representable if and only if there is a set of probability functions $\mathcal{P}$ that fully agrees with it:

$$\mathcal{P} \text{ fully agrees with } \succeq \text{ iff } \big( X \succeq Y \Leftrightarrow p(X) \geq p(Y) \text{ for all } p \in \mathcal{P} \big).$$

Rios Insua (1992) and Alon and Lehrer (2014) show that $\succeq$ is IP-representable if and only if it satisfies Reflexivity, Non-negativity, Non-triviality, and the Generalised Finite-Cancellation axiom.

Generalised Finite-Cancellation. If

$$X_1 + \ldots + X_n + \underbrace{A + \ldots + A}_{k \text{ times}} = Y_1 + \ldots + Y_n + \underbrace{B + \ldots + B}_{k \text{ times}}$$

and $X_i \succeq Y_i$ for all $i \leq n$, then $A \preceq B$.

IP-representability is clearly stronger than strong representability. IP-representability implies strong representability.
But a strongly representable comparative belief relation $\succeq$ might fail to satisfy Reflexivity, Non-negativity, and Non-triviality. No IP-representable $\succeq$ will do so, however. So strong representability does not imply IP-representability. Moreover, Harrison-Trainor, Holliday, and Icard (2016) show that even for non-trivial, non-negative, and reflexive $\succeq$, IP-representability is stronger than strong representability.

To wrap up, let's taxonomise these various types of representability according to their logical strength: see Figure 4.

[Figure 4: Logical relations between different types of representability. Nodes: full probabilistic representability, imprecise representability, strong representability, partial representability, almost representability.]

2.3 Loose Ends: Infinite Algebras, Conditional Comparative Beliefs, Etc.

Scott (1964, p. 247) claims that his theorem extends to comparative belief structures $\langle \Omega, \mathcal{F}, \succeq \rangle$ with infinite algebras $\mathcal{F}$, by a clever application of the Hahn-Banach Theorem. The proof, however, remains unpublished.

Suppes and Zanotti (1976) also provide necessary and sufficient conditions for a comparative belief relation on an infinite algebra to be fully probabilistically representable. Suppes and Zanotti's axioms, however, do not directly constrain comparative beliefs. Rather, they show that $\succeq$ is probabilistically representable if and only if it is extendable to a comparative estimation relation over a larger set: a set containing not just propositions (sets of worlds, or equivalently, functions from worlds to 1 (true) or 0 (false), i.e., indicator functions) but real-valued quantities $Q : \Omega \to \mathbb{R}$ more generally.

To get the rough idea, consider a travel agent. She might not have a precise estimate of how many travelers will go to Hawaii this year. (Perhaps her evidence is incomplete and ambiguous.) Likewise, she might not have a precise estimate of how many travelers will go to Acapulco. Despite this, she might well estimate that more travelers will go to Hawaii than Acapulco. Or consider the weather.
Alayna might not have a precise estimate of how much rain London will receive in June. She might not have a precise estimate of how much rain Canterbury will receive in June. Despite this, she might estimate that London will receive more rain than Canterbury. Ditto for stock prices, or the number of MPs that different parties will lose or gain in the next election, or any other quantity you might care about. You can have comparative estimates regarding those respective quantities (i.e., estimate that one quantity $Q$ will have a higher/equal/lower value than another quantity $Q^*$) without having a unique, precise best estimate for any of them.

Call a relation $\succeq^*$ on a set $\mathcal{F}^*$ of real-valued quantities defined on $\Omega$ a comparative estimation relation if it is used to model an agent's comparative estimates. And call a comparative estimation relation $\succeq^*$ qualitatively satisfactory if it satisfies the following (putative) coherence constraints.

1. $\succeq^*$ is transitive and total.
2. $\Omega \succ^* \emptyset$.
3. $X \succeq^* \emptyset$.
4. $X \succeq^* Y$ iff $X + Z \succeq^* Y + Z$.
5. If $X \succ^* Y$ then for all $W, Z \in \mathcal{F}^*$, there is an $n > 0$ such that $\underbrace{X + \ldots + X}_{n \text{ times}} + W \succeq^* \underbrace{Y + \ldots + Y}_{n \text{ times}} + Z$.

Suppes and Zanotti (1976) show that a comparative belief relation $\succeq$ on an algebra $\mathcal{F}$ of propositions (indicator functions), whether $\mathcal{F}$ is finite or infinite, is fully probabilistically representable if and only if there is a comparative estimation relation $\succeq^*$ on the set $\mathcal{F}^*$ of non-negative integer-valued quantities,

$$\mathcal{F}^* = \{ Q \mid Q : \Omega \to \mathbb{Z}_{\geq 0} \},$$

which both (i) extends $\succeq$, and (ii) is qualitatively satisfactory. This shows that whatever latent structural defect prevents a comparative belief relation $\succeq$ from being fully probabilistically representable rears its head explicitly when you extend the relation.
If $\succeq$ has this defect, then when you extend it, so that it encodes not just comparative estimates of truth-values of propositions, but also comparative estimates of the values of non-negative integer-valued quantities more generally, what you end up with (your new, larger comparative estimation relation $\succeq^*$) will violate one of Suppes and Zanotti's putative coherence constraints. And vice versa. If $\succeq$ does not have this latent defect, then there is some way of extending it that does not violate those constraints.[9]

[9] Suppes and Zanotti's axiom 5 is an "Archimedean axiom," which guarantees (roughly) that differences in one's best estimates are not "infinitely small." For a non-Archimedean theory of comparative estimation, see Pederson (2014).

Of course, the representation theorems surveyed here are just the tip of the iceberg. For example, we said that a comparative belief relation $\succeq$ is imprecisely representable if and only if there is a set of probability functions $\mathcal{P}$ that fully agrees with it. But we could explore full agreement (or almost agreement, or partial agreement, or strong agreement) with any number of imprecise probability models: Dempster-Shafer belief functions, n-monotone Choquet capacities, coherent lower previsions/expectations, or coherent lower probabilities (cf. Walley, 1991, 2000; Augustin, Coolen, de Cooman, and Troffaes, 2014; Troffaes and de Cooman, 2014).

Alternatively, we could focus not on comparative belief relations, but conditional comparative belief relations. Hájek (2003), following Rényi (1955), Jeffreys (1961), de Finetti (1974), and others, argues forcefully that we should treat precise conditional credence as more fundamental than precise unconditional credence. Similarly, we might treat conditional comparative beliefs of the form $A \mid B \succeq C \mid D$ as more fundamental than unconditional comparative beliefs. $A \mid B \succeq C \mid D$ says that the agent in question is at least as confident in $A$ given $B$ as she is in $C$ given $D$.
We can then recover unconditional comparative belief relations from conditional ones by conditioning on the tautology, $\Omega$:

$$A \succeq B \;\Leftrightarrow\; A \mid \Omega \succeq B \mid \Omega.$$

Say that a conditional comparative belief relation $\succeq$ on $\mathcal{F}$ is probabilistically representable if there is a conditional probability function that fully agrees with it. More carefully: there is a probability function $p : \mathcal{F} \to \mathbb{R}$ such that for any two propositions $A, C \in \mathcal{F}$, and any two non-null propositions $B, D \in \mathcal{F}$, we have

$$A \mid B \succeq C \mid D \;\Leftrightarrow\; \frac{p(A \cap B)}{p(B)} \geq \frac{p(C \cap D)}{p(D)}.$$

A proposition $X$ is non-null just in case it is not just as likely as the contradiction, i.e., $X \mid \Omega \not\approx \emptyset \mid \Omega$.

Now we can ask: when are conditional comparative belief relations probabilistically representable? Domotor (1969) extends the results of Scott (1964) to provide necessary and sufficient conditions for probabilistic representability when $\succeq$ is defined on a finite algebra $\mathcal{F}$. Suppes and Zanotti (1982) extend the results of Suppes and Zanotti (1976) to provide necessary and sufficient conditions in the general case (whether or not $\mathcal{F}$ is finite). See Suppes (1994) for additional detail.

With a basic understanding of representation theorems and their mechanics in hand, we can now turn our attention to the central question of this chapter: do comparative beliefs and representation theorems figure into the best explanation of what precise credences are? If so, how? What do these accounts of credence look like? And how do they stand up to the criticisms of Hájek (2009), Meacham and Weisberg (2011), and Titelbaum (2015)?

3 The Comparative Belief-Credence Connection

To a first approximation, an agent's credence function measures how confident she can be said to be in each proposition. If $c(X) = 1$, then she is maximally confident that $X$ is true, i.e., 100% confident. If $c(X) = 0$, then she is minimally confident that $X$ is true, i.e., 0% confident. If $c(X) = 2/3$, then she is more confident than not that $X$ is true, but not quite fully confident.
But what does this really mean? What does it mean to say that an agent is 100%, or 80%, or 23.9556334% confident in a proposition?

We might have similar questions for imprecise Bayesians. Imprecise Bayesians model rational agents' opinions not with a single credence function $c$, but with a set of credence functions $C$. Sets of credence functions are called imprecise credal states (see Mahtani, this volume).[10] To a first approximation, imprecise credal states also measure how confident agents can be said to be in various propositions. But they allow for a strictly greater range of opinions than precise credal states. For example, if $c(X) = 1$ for all $c$ in $C$, then our agent is 100% confident that $X$ is true. If, however, $0.6 \leq c(X) \leq 0.9$ for all $c$ in $C$, and nothing stronger, then she is at least 60% confident and at most 90% confident that $X$ is true. But she has no precise level of confidence for $X$. Precise credence functions allow for the first sort of opinion, but not the second.

But again, what exactly does this mean? What does it mean to say that an agent is at least 60% confident and at most 90% confident in a proposition?

The history of Bayesianism is chock-full of different accounts of credence that aim to answer this question. Very roughly, we can lump them into three groups: measurement-theoretic accounts, decision-theoretic accounts, and interpretivist accounts. Before exploring the differences between these various accounts, it is worth emphasising one similarity. They all treat 'credence function' or 'credal state' in roughly the way Carnap treated theoretical terms more generally. They carve out some theoretical role (or set of roles) $R$ as constitutive of what it is for a real-valued function, $c$, or a set of such functions, $C$, to count as "your credal state." The better $c$ (or $C$) plays role $R$, the more eligible it is as a "credal state candidate." What these accounts disagree on is what the relevant theoretical role $R$ is.
[10] Precise credal states are special cases of imprecise credal states, on the imprecise Bayesian view. Formally, $C$ is precise just in case $C = \{c\}$ for some credence function $c$.

3.1 Measurement-Theoretic Account of Credence

Measurement-theoretic accounts, like those of Koopman (1940a, 1940b), Good (1950), and Krantz et al. (1971), treat credal states as mere numerical measurement systems for comparative beliefs (or more generally, for some underlying structure of comparative and qualitative opinions). Compare: numerical measurement systems for length, mass, velocity, etc., allow engineers, scientists and the like to measure certain parts of the system of interest, perform numerical calculations, and draw inferences about other parts of the system. Imagine, for example, measuring the length of two pieces of wood arranged at a right angle, and using the Pythagorean theorem to infer how long the diagonal must be.

Similarly, on the measurement-theoretic view, credal states are mere numerical measurement systems. They allow you to measure certain parts of an agent's system of comparative beliefs, perform numerical calculations, and draw conclusions about what other comparative beliefs she must have (or must not have). Imagine, for example, that you elicit a sufficient number of an agent's comparative beliefs $\succeq$ to be quite confident that (i) she satisfies Scott's axioms, so that $\succeq$ fully agrees with some probability function $c$, and further that (ii) $c(X) = 0.3$, $c(Y) = 0.4$, and $c(X \cap Z) = c(Y \cap Z) = 0$. Given these measurements, you can use the probability axioms to calculate that $c(X \cup Z) \leq c(Y \cup Z)$. Since $c$ fully agrees with $\succeq$, you can infer that $X \cup Z \preceq Y \cup Z$. (See Section 5 for a more complete introduction to the measurement-theoretic view.)

The upshot: just like measurement systems for physical quantities (length, etc.), credal states allow you to represent comparative beliefs in an elegant, easy-to-use, numerical fashion.
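For readers who want the arithmetic behind the example above spelled out, here is one way the calculation might go. It uses only finite additivity (in the form of inclusion-exclusion) and the stipulated values of $c$; the value of $c(Z)$ is left unspecified, since the text does not fix it.

```latex
\begin{align*}
c(X \cup Z) &= c(X) + c(Z) - c(X \cap Z) = 0.3 + c(Z) - 0 = 0.3 + c(Z), \\
c(Y \cup Z) &= c(Y) + c(Z) - c(Y \cap Z) = 0.4 + c(Z) - 0 = 0.4 + c(Z), \\
\text{so}\quad c(X \cup Z) &\leq c(Y \cup Z) \quad \text{whatever the value of } c(Z).
\end{align*}
```

Full agreement then turns this numerical inequality into the comparative belief that $Y \cup Z$ is at least as likely as $X \cup Z$.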
And modeling comparative (and qualitative) beliefs with numbers is useful. Numerical measurement systems are designed specifically to reflect important structural features of the underlying target system, so that you can use them to straightforwardly extract information about one part of the system from information about other parts.

Where does this leave us? The measurement-theoretic view takes a particular stand on the nature of the theoretical role $R$ that a function $c$ (or set of functions $C$) must play in order to count as "your credal state." More specifically, $c$ (or $C$) must fully agree (or almost agree, or partially agree, or strongly agree) with the agent's comparative beliefs, $\succeq$, in the way required to count as a numerical measurement system for $\succeq$. The better $c$ (or $C$) plays this role $R$, the more eligible it is as a credal state candidate. Equally good measurement systems are equally eligible credal state candidates.

3.2 Decision-Theoretic Account of Credence

Decision-theoretic accounts of credence, like those of Ramsey (1931), de Finetti (1931, 1964), and Walley (1991), carve out a rather different theoretical role for credal states. Credal states, on these views, encode an agent's fair buying and selling prices. An agent's fair buying price for a gamble $G$ is, roughly, the largest amount that she could pay for $G$ while still leaving herself (in her own view) in at least as good a position as the status quo. An agent's fair selling price for a gamble $G$ is, roughly, the smallest amount that someone else would have to pay her in exchange for $G$ in order to leave herself (in her own view) in at least as good a position as the status quo.

To illustrate, imagine that you have an urn. The urn contains 10 balls. Each ball is either red or black. There are at least 3 black balls, and at most 7 black balls. But you have absolutely no idea whether the urn contains 3, 4, 5, 6, or 7 black balls.
Let $G$ be the gamble that pays out £10 if a random draw from the urn yields a black, and £0 otherwise. Given what you know about the contents of the urn, you would likely judge that paying a measly £1 for $G$ is a good deal. Maybe £2 is a good deal too. But let's imagine that £3 is your limit. Paying any more than £3 would leave you in a situation where you are no longer, in your own view, determinately doing at least as well as the status quo. Then your fair buying price for $G$ is 3. More carefully, your fair buying price for $G$ is 3 iff you weakly prefer paying 3 and receiving $G$ to the status quo, but not so for any amount higher than 3.

Similarly, suppose that a friend wants to buy $G$ from you. They will pay you some initial amount. Then you will pay them £10 if the draw comes up black and £0 otherwise. Given what you know about the urn, you would likely judge that selling $G$ to your friend for £9 is a good deal (for you, anyway). Maybe £8 is a good deal too. But let's imagine that £7 is your limit. If they offer you any less than £7, then you would be left in a position where you are no longer, in your own view, determinately doing at least as well as the status quo. Then your fair selling price for $G$ is 7. More carefully, your fair selling price for $G$ is 7 iff you weakly prefer receiving 7 and selling $G$ to the status quo, but not so for any amount lower than 7.

On the decision-theoretic view, the principal theoretical role of an agent's credal state is to encode her fair buying and selling prices. A set $\mathcal{E}$ of real-valued functions $e$ counts as "your credal state" just in case its lower and upper envelopes for gambles $G$,

$$\underline{\mathcal{E}}[G] = \inf \{ e(G) \mid e \in \mathcal{E} \}, \qquad \overline{\mathcal{E}}[G] = \sup \{ e(G) \mid e \in \mathcal{E} \},$$

are equal to your fair buying and selling prices for $G$, $B(G)$ and $S(G)$, respectively, i.e., $\underline{\mathcal{E}}[G] = B(G)$ and $\overline{\mathcal{E}}[G] = S(G)$. (Treat a single real-valued function $e$ as the singleton $\mathcal{E} = \{e\}$.)
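As a rough illustration of these envelopes, the following Python sketch applies them to the urn example above. It rests on two assumptions that the text does not make: that the set of functions contains exactly one expectation function per candidate urn composition (3 to 7 black balls), and that payoffs are valued linearly in pounds. The names `BLACK_COUNTS`, `PAYOUT_IF_BLACK`, and `expectation` are hypothetical, chosen only for this sketch.

```python
from fractions import Fraction

# Candidate urn compositions: between 3 and 7 black balls out of 10.
BLACK_COUNTS = range(3, 8)
PAYOUT_IF_BLACK = 10  # gamble G pays 10 on black, 0 otherwise


def expectation(n_black, total=10):
    """Expected payoff of G if the urn holds n_black black balls."""
    return Fraction(n_black, total) * PAYOUT_IF_BLACK


# Assumed credal state: one expectation function per candidate composition.
values = [expectation(n) for n in BLACK_COUNTS]

# Over a finite set, the infimum and supremum are the minimum and maximum.
lower_envelope = min(values)
upper_envelope = max(values)

print(lower_envelope, upper_envelope)
```

Under these assumptions the lower and upper envelopes for the gamble come to 3 and 7, which coincide with the fair buying and selling prices stipulated in the example. Exact fractions are used so that the comparison with those prices does not depend on floating-point rounding.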
The better $\mathcal{E}$ plays this role $R$ (the closer its lower and upper envelopes are to your fair buying and selling prices), the more eligible it is as a credal state candidate.

Your credal state only captures information about your beliefs, on this view, insofar as they are reflected in your fair buying and selling prices. For any proposition $X \in \mathcal{F}$, let $G_X$ be the unit gamble on $X$, i.e., the gamble that pays out £1 if $X$ and £0 otherwise. Your lower and upper "previsions" for $G_X$, $\underline{\mathcal{E}}[G_X]$ and $\overline{\mathcal{E}}[G_X]$ (i.e., the values of the lower and upper envelopes of $\mathcal{E}$ at $G_X$), encode your fair buying and selling prices for $G_X$. If you are willing to pay something near £1 for a unit gamble on $X$ ($\underline{\mathcal{E}}[G_X] \approx 1$), then for the purposes of decision-making you are quite confident in $X$. If you would be happy to sell a unit gamble on $X$ to a friend for mere pennies ($\overline{\mathcal{E}}[G_X] \approx 0$), then for the purposes of decision-making you have extremely low confidence in $X$. If you would only buy a unit gamble on $X$ for next to nothing ($\underline{\mathcal{E}}[G_X] \approx 0$), and would only sell a unit gamble on $X$ for close to its maximum payout ($\overline{\mathcal{E}}[G_X] \approx 1$), then for the purposes of decision-making you have no idea whether $X$ is true. Your opinions are rather imprecise.

The decision-theoretic view comes in many flavours, one for each way of thinking about the preferences that determine your fair buying and selling prices. On a flat-footed behaviourist view, $B(G)$ is your fair buying price for $G$ just in case you actually buy $G$ for $B(G)$, and actually refuse to buy $G$ for any higher price (or perhaps do so a sufficiently high proportion of the time). On a more sophisticated behaviourist view, $B(G)$ is your fair buying price for $G$ just in case you are disposed to buy $G$ for $B(G)$, and disposed to refuse to buy $G$ for any higher price. Alternatively, we might reject behaviourism in its various guises, and say that the preferences that fix your fair buying/selling prices are irreducibly evaluative attitudes.

But where do comparative beliefs enter the picture?
It may not appear that comparative beliefs play an especially important role in explicating the concept of credence on the decision-theoretic view. After all, on this view, an agent's credal state encodes her fair buying and selling prices. And fair buying and selling prices are fixed by one's preferences, not one's comparative beliefs. Even on Savage's view, where comparative belief reduces to preference, different fragments of an agent's preference relation fix her fair buying/selling prices and comparative beliefs, respectively. Nonetheless, rational comparative beliefs and fair buying/selling prices hang together in a certain way (see Section 6). So comparative beliefs (and representation theorems) will be important for answering the normative question, even on the decision-theoretic view.

Also worth noting: if an agent has a precise credal state $\mathcal{E} = \{e\}$, then

$$\underline{\mathcal{E}}[G] = \overline{\mathcal{E}}[G]$$

for all gambles $G$. That is, her fair buying prices just are her fair selling prices. The maximum amount she is willing to pay for $G$ is precisely the minimum amount she is willing to accept in exchange for selling $G$. Agents with genuinely imprecise credal states (non-singleton $\mathcal{E}$), in contrast, may well think that buying is worthwhile only at very low prices, and selling is worthwhile only at very high prices. Imprecise Bayesians typically see this as the proper (or at least a permissible) type of evaluative attitude to bear in decision contexts where evidence is unspecific or ambiguous.

One final note: measurement-theoretic and decision-theoretic accounts of credence can be difficult to distinguish in practice. Consider a proponent of the measurement-theoretic account, such as Savage, who treats comparative belief as reducible to preference (Savage, 1954, Section 3.2). You judge that $X \succeq Y$ iff whenever you prefer one outcome to another, you also prefer getting the better outcome if $X$ than if $Y$.
Then certain types of measurement systems for comparative belief (viz., sets of probability functions) encode fair buying and selling prices (see Section 6). Whence the difference, then, between this sort of measurement-theoretic account of credence, and a decision-theoretic account?

The difference is this. On the measurement-theoretic view, any numerical measurement system for $\succeq$ does the work necessary to count as "your credal state," not just ones that encode your fair buying and selling prices. Likewise, on the decision-theoretic view, any numerical system that encodes your buying and selling prices counts as "your credal state." But some of those systems (viz. upper and lower previsions) carry too little information to determine a numerical representation of your preference relation (Walley, 2000, Section 6). Shorter: even though some numerical systems do both jobs (measurement-theoretic and decision-theoretic), it is possible to do one without doing the other. So the two accounts make different predictions about which functions (sets of functions) count as "eligible credal state candidates."

3.3 Interpretivist Account of Credence

Our final account of credence is the interpretivist account, of the sort espoused by Lewis (1974) and Maher (1993). According to preference-based interpretivist accounts, like Patrick Maher's, "an attribution of probabilities and utilities is correct just in case it is part of an overall interpretation of the person's preferences that makes sufficiently good sense of them and better sense than any competing interpretation does" (Maher, 1993, p. 12).
And according to Maher, if some probabilistically coherent credence function $c$ and cardinal utility function $u$ jointly agree with an agent $A$'s preferences, in the following sense:

$A$ weakly prefers $\alpha$ to $\beta$ iff $E_c[\alpha] \geq E_c[\beta]$

(where $E_c[\alpha]$ and $E_c[\beta]$ are the expected utilities of acts $\alpha$ and $\beta$, respectively, relative to $c$ and $u$), then $c$ and $u$ perfectly rationalise or make sense of that agent's preferences.

On Maher's view, both credence functions and utility functions earn their theoretical keep by rationalising preferences. If $c$ and $u$ rationalise your preferences better than any competing $c^*$ and $u^*$, then $c$ plays the appropriate theoretical role to count as "your credal state," and $u$ plays the appropriate theoretical role to count as "your utility function." This presupposes the thesis of the primacy of practical reason. Whether or not $c$ rationalises your comparative and qualitative beliefs, understood as irreducibly doxastic attitudes, is neither here nor there. What makes $c$ "your credence function" is the fact that it helps to rationalise your preferences.

But we can distinguish another brand of interpretivism: epistemic interpretivism. This is a new account of credence. So we will spend a bit of time developing it. According to epistemic interpretivism, credal states are assignments of truth-value estimates (or sets of such assignments) that rationalise one's comparative beliefs (or more generally, one's comparative and qualitative opinions), understood as irreducibly doxastic attitudes. A function $c : \mathcal{F} \to \mathbb{R}$ (or set $C$) counts as "your credal state" just in case it encodes truth-value estimates (or constraints on such estimates) that best rationalise or make sense of your comparative beliefs.

Spelling out epistemic interpretivism requires two things: (i) saying something about what truth-value estimates are, and (ii) explaining what it means for truth-value estimates to best rationalise a set of comparative beliefs.
Estimates are familiar enough. For example, an analyst's best estimate of Tesla's stock price 1 year hence might be $425. Your best estimate of the number of bananas in a randomly selected bunch might be 5.7. And so on. In each of these examples, there is the agent doing the estimating, there is the quantity being estimated, and there is the estimate of that quantity. For the purposes of spelling out epistemic interpretivism, it is the last of these that matters most.

Estimates are numbers. But not all numbers are estimates. For example, the numbers in the expression

1,000,000 > 2

are not estimates. What sorts of numbers are estimates then? Plausibly, they are numbers that are subject to a certain standard of evaluation. A number is an estimate in a context iff it is evaluated qua estimate in that context. In typical contexts of evaluation, numbers like 2 in expressions like the above are not estimates because they are not evaluated qua estimates. There is no quantity that it would be better or worse for 2 to be close to. It is no better or worse for being close to the actual price of stock X, or the actual dosage of drug Y, etc. In contrast, the number at the bottom of a contractor's quote, a paradigm of an estimate, is evaluated qua estimate. It is quite bad, for example, if it is £20,000 off the actual price of the job.

What exactly is it to evaluate a number qua estimate? We will not provide a full answer here. But we can say something informative. The type of phenomenon under consideration, evaluating an entity E qua X, is a common one. You might be brilliant qua scientist, mediocre qua mentor, and terrible qua conversationalist. The reason seems to be this: scientists, mentors and conversationalists all perform characteristic functions. And you can perform some functions well while performing others poorly.
Microbiologists, for example, carefully dissect tissue samples, meticulously document their experiments, write up academic papers, communicate their results at conferences, etc. Conversationalists, on the other hand, ask engaging questions, are familiar with current events, and so on. You might well dissect tissue samples masterfully, but have no idea what the news of the day is. This suggests the following. Evaluating an entity E in some capacity X, or qua X, is a matter of evaluating E on the basis of how well it performs the characteristic functions F1, ..., Fn associated with X.

What to say about estimates in particular then? What characteristic functions do they serve, for example, in scientific inquiry, engineering, finance, etc.? Whatever the full answer is, the following seems non-negotiable: an estimate of quantity Q serves the function of approximating the true value of Q. So ceteris paribus it is better the closer it is to the true value of Q.

Note that, on the present account, for a number to count as an estimate in a context, there must be an evaluator in that context; an agent evaluating the number qua estimate. (This need not require having the concept estimate, or anything of the sort. Evaluating a number qua estimate might be a fairly cognitively undemanding task.) But there need be no estimator; no agent producing the estimate; no agent explicitly judging that this is the best estimate of that, etc. Thermometers provide estimates of temperature. Geiger counters provide estimates of radiation. Ditto for other measurement devices. In each of these cases, there is an estimate (38°C, 0.10 mSv, etc.), but no estimator; no agent doing the estimating. Similarly, a tree's rings provide an estimate of its age. Your parents' income provides an estimate of your income. Again, estimates without estimators. And estimates, of course, do not need to be good.
The number of tea leaves concentrated in one part of your cup provides a (thoroughly unreliable) estimate of the number of fortunate events in your future. Once more: estimate, but no estimator. The upshot: we can talk of estimates doing this or that (for example, rationalising a set of comparative beliefs) even if those estimates do not "belong" to anyone. Estimates without estimators.

Back to our original question: what are truth-value estimates? We have made some progress in saying what estimates are more generally. Now, following de Finetti and Jeffrey, treat a proposition X as an "indicator variable" that takes the value 1 at worlds where X is true, and 0 where X is false. Truth-value estimates, then, are simply estimates of the value, 0 or 1, that the proposition takes at the actual world.

To finish spelling out the epistemic interpretivist account of credence, we need to explain what it means for truth-value estimates to "best rationalise" a set of comparative beliefs. To get a feel for how this might work, consider an example. Grandma relies on folklore methods for predicting the weather. She feels things in her bones, observes the behaviour of the cows in the pasture, etc. You are not sure whether the weather-related opinions that Grandma comes to on this basis make much sense or not. But then you open your weather app. Lo and behold, you find a bunch of estimates (probabilities for sun, clouds, rain, etc., estimates of rainfall amount, hour-by-hour temperature estimates, etc.) that recommend thinking precisely what Grandma thinks. For example, Grandma thinks it is likelier than not to rain this evening. And the weather app recommends thinking that too. It specifies a greater than 50% probability of rain. (We will explore a few different accounts of recommendation shortly.) The weather app's estimates recommend having Grandma's opinions. And these estimates are themselves eminently rational.
In virtue of this, they rationalise or make sense of those opinions.

Note, however, that the weather app itself is not essential to this story. Estimates do not need an estimator. If there exists some rational set of estimates that recommend Grandma's opinions, then whether or not any weather app actually spits those estimates out, or any meteorologist actually judges those estimates to be best (or indeed whether any artificial or human system is in the business of explicitly estimating quantities at all), Grandma's opinions are nonetheless rationalisable. The rational estimates that recommend her opinions provide that rationale.

Before saying something more general about when a set of truth-value estimates best rationalises a set of comparative beliefs, we should key in on two important features of our example. The first is the strength of the recommendation in question. The second is the quality of that recommendation. We stipulated that the weather app's estimates recommend having Grandma's opinions. This makes it seem as though recommendation is an on-off matter. But recommendations plausibly come in degrees. You can recommend a trip to the Alps a little more strongly than a trip to Tahoe, but much more strongly than a trip to Cudahy, Wisconsin. In our example, the weather app's estimates most strongly recommend thinking precisely what Grandma thinks. We might have stipulated, however, that they recommend a similar but distinct state of opinion most strongly, and recommend Grandma's state of opinion a little less strongly. In that case, the weather app's estimates provide a fairly strong, but not maximally strong rationale for Grandma's state of opinion.

In addition to the strength of a recommendation, we can consider the quality of that recommendation. We stipulated that the weather app's estimates are eminently rational.
But our weather app could have been a bit glitchy and delivered mildly irrational estimates (ones that violate the probability axioms, perhaps, but not by much). Those estimates might still recommend thinking what Grandma thinks just as strongly. But in virtue of their mild irrationality, they provide a slightly lower quality rationale for Grandma's state of opinion.

The distinction between strength and quality is important. If Grandma's state of opinion is epistemically defective, it may turn out that no estimates unreservedly recommend it, i.e., recommend it at least as strongly as any other state of opinion. Every set of estimates might recommend some other state of opinion more strongly. Nonetheless, some sets of estimates might recommend Grandma's state of opinion more strongly than others. And amongst the sets of estimates that recommend it as strongly as possible (at least as strongly as any other set of estimates), some might provide a higher quality recommendation than others. The extent to which a set of estimates rationalises or makes sense of a state of opinion depends on both strength and quality. To provide the best possible rationale for Grandma's state of opinion, for example, a set of estimates must (i) recommend that state as strongly as possible, and (ii) provide the highest quality recommendation from amongst the sets of estimates that satisfy (i).

Let's take stock. According to epistemic interpretivism, a function c : F → R (or set of functions C) counts as "your credal state" just in case it encodes truth-value estimates (or constraints on such estimates) that best rationalise or make sense of your comparative beliefs. We gave a brief account of estimatehood to fill this out a bit. And we quickly unpacked what it means for c (or C) to "best rationalise" your comparative beliefs, ≽. To best rationalise ≽, c should provide at least as strong a rationale for ≽ as any other set of truth-value estimates c∗.
And on the picture sketched above, c provides a rationale for ≽ by recommending ≽. So for c to count as "your credal state," no other c∗ can recommend ≽ more strongly than c. Moreover, amongst the truth-value estimates that provide a maximally strong rationale for ≽ (recommend it as strongly as possible), c should provide at least as high quality a rationale as any other c∗. On the picture sketched above, the quality of c's rationale depends on how close c itself is to rational. So for c to count as "your credal state," no other c∗ that recommends ≽ as strongly as possible should be more rational than c. Pulling this all together, c (or C) counts as "your credal state" just in case it encodes truth-value estimates (or constraints on such estimates) that recommend your comparative beliefs as strongly as possible, and are as rational as possible whilst doing so. See Figure 5.

[Figure 5: More rational estimates provide higher quality rationales. Labels in the figure: c1, c2, c3; "Estimates that recommend your comparative beliefs as strongly as possible"; "Maximally rational estimates"; "Provides some rationale for your comparative beliefs"; "Provides a higher quality rationale"; "Provides the highest quality rationale".]

The big lingering question is this: when exactly does a set of truth-value estimates recommend a certain set of comparative beliefs more or less strongly? There are a number of ways one could spell this out. We will not defend a particular account of recommendation here. But here are three options.

Metaphysical Account. The truth-value estimates given by c : F → R recommend ≽ to degree k iff it is metaphysically necessary that any agent who explicitly judges c(X) to be the best truth-value estimate for X, for all X ∈ F, has comparative beliefs ≽c and D(≽c, ≽) = 1/k, where D is some reasonable measure of distance between comparative belief relations.
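The two-stage selection summarised above, read through a distance-based account of recommendation such as the Metaphysical Account, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the account itself: the induced relation ≽c is assumed to be "X ≽c Y iff c(X) ≥ c(Y)", the distance D is assumed to be a simple count of the ordered pairs on which two relations differ, and `incoherence` is a hypothetical stand-in for "distance from rational". The candidate names and numbers are invented for the example.

```python
from fractions import Fraction

EVENTS = ("empty", "w1", "w2", "Omega")  # a toy two-world algebra

def induced_relation(c):
    """Assumed link: the relation that goes with c is X >= Y iff c(X) >= c(Y)."""
    return {(x, y) for x in EVENTS for y in EVENTS if c[x] >= c[y]}

def distance(r1, r2):
    """Hypothetical D: number of ordered pairs on which two relations differ."""
    return len(r1 ^ r2)

def incoherence(c):
    """Hypothetical stand-in for distance from rational (0 for a probability function)."""
    return (abs(c["Omega"] - 1) + abs(c["empty"])
            + abs(c["w1"] + c["w2"] - c["Omega"]))

def best_rationalisers(candidates, beliefs):
    # Stage 1: keep the candidates that recommend the beliefs as strongly as
    # possible; minimising D is the same as maximising k when D = 1/k.
    dist = {name: distance(induced_relation(c), beliefs)
            for name, c in candidates.items()}
    closest = min(dist.values())
    strongest = [name for name in candidates if dist[name] == closest]
    # Stage 2: amongst those, keep the candidates closest to rational.
    least = min(incoherence(candidates[name]) for name in strongest)
    return [name for name in strongest if incoherence(candidates[name]) == least]

# The agent's comparative beliefs: Omega > w1 > w2 > empty (encoded via ranks).
BELIEFS = induced_relation({"Omega": 3, "w1": 2, "w2": 1, "empty": 0})

CANDIDATES = {
    "coherent": {"empty": Fraction(0), "w1": Fraction(3, 5), "w2": Fraction(2, 5), "Omega": Fraction(1)},
    "glitchy":  {"empty": Fraction(0), "w1": Fraction(7, 10), "w2": Fraction(1, 2), "Omega": Fraction(1)},
    "reversed": {"empty": Fraction(0), "w1": Fraction(2, 5), "w2": Fraction(3, 5), "Omega": Fraction(1)},
}

print(best_rationalisers(CANDIDATES, BELIEFS))
```

Here "coherent" and "glitchy" induce exactly the agent's ordering, so both survive the first stage; the second stage then prefers the one that satisfies the probability axioms. Any other reasonable choice of D or of the incoherence measure could be slotted in without changing the two-stage shape.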
On the metaphysical account, judging c : F → R to encode the best truth-value estimates for propositions in F entails having certain comparative beliefs ≽c. Since having comparative beliefs ≽c is part and parcel of judging c best, c recommends ≽c as strongly as possible. And c recommends other comparative beliefs, ≽, less strongly the further away they are from ≽c. See Deza and Deza (2009) and Fitelson and McCarthy (2015) for more information on measures of distance between comparative belief relations.

Our next account says that while judging c to encode the best truth-value estimates may not entail that you have some set of comparative beliefs or other, it nevertheless rationally requires you to have those beliefs. And we can use this fact to say what it is for a set of truth-value estimates to recommend comparative belief relations to different degrees.

Normative Account. The truth-value estimates given by c : F → R recommend ≽ to degree k iff it is rationally required that any agent who explicitly judges c(X) to be the best truth-value estimate for X, for all X ∈ F, has comparative beliefs ≽c and D(≽c, ≽) = 1/k, where D is some reasonable measure of distance between comparative belief relations.

A proponent of the normative account might treat the principles of rationality that generate the relevant requirement as properly basic components of her epistemology. Alternatively, she might provide a teleological explanation of why those principles have the normative force that they do by appealing to facts about epistemic value or utility.

One final account of recommendation, the epistemic utility account, explains recommendation more directly in terms of epistemic value/utility facts. Informally, the epistemic utility account says that c recommends ≽ to degree k just in case the most rational way of adding estimates of the value of comparative beliefs to the stock of truth-value estimates encoded by c involves estimating ≽ to have epistemic utility k.
Let's make this a little more precise. An assignment of truth-value estimates c : F → R (or set C) maps a very specific kind of measurable quantity (propositions or indicator functions) to estimates. Let Q be the set of all measurable quantities Q : Ω → R. An assignment est : Q → R of estimates to measurable quantities extends c just in case c(X) = est(X) for all X ∈ F.

To make sense of something being closer or further from rational, we need two things: an epistemic utility function U and laws of preference L. First let's talk about U. For any assignment of truth-value estimates c, U(c, w) measures how epistemically valuable c is at world w. Whatever properties make truth-value estimates epistemically valuable at a world, U(c, w) captures the extent to which c has a good balance of these properties at w. Likewise, U(≽, w) measures how epistemically valuable comparative beliefs ≽ are at world w. For a philosophically rich discussion of how to measure the epistemic value of estimates, see Joyce (2009) and Pettigrew (2016).

Laws of preference L are familiar from decision theory. In conjunction with U, they specify rationally permissible ways of structuring one's preferences over options. For example, the law of dominance says that if one option o is guaranteed to have higher utility than another option o∗, then you ought to prefer o to o∗. Likewise, the law of (first-order) stochastic dominance says: if for any possible utility value x, o is guaranteed to have greater chance than o∗ of having higher-than-x utility, then you ought to prefer o to o∗. And so on.

Let T be the set of rational truth-value estimates, relative to U and L, i.e., the set of c that are not dispreferred to some other c∗. Let E be the set of rational estimates more generally relative to U and L, i.e., the set of est that are not dispreferred to some other est∗.
Say that est is the maximally rational extension of c to Q iff (i) est extends c to Q, and (ii) est is closer to rational (i.e., closer to E) than any other est∗ that extends c to Q. We can now state the epistemic utility account more precisely.

Epistemic Utility Account. The truth-value estimates given by c : F → R recommend ≽ to degree k iff the maximally rational extension of c to Q, estc, is such that estc(U(≽)) = k.

The basic thought here is that while c might not directly encode estimates of quantities other than truth-values, it nonetheless takes a stand on how to estimate those quantities. It encodes such estimates indirectly. There is some most rational way of adding estimates of other measurable quantities Q to the stock of truth-value estimates encoded by c. These estimates, estc(Q), are the best estimates of those quantities, from c's perspective. So, in effect, the epistemic utility account says that c recommends ≽ to degree k just in case it indirectly estimates ≽ to be epistemically valuable to degree k.

There are no doubt myriad unanswered questions about each of these accounts of recommendation. It is not our purpose to provide a full defense of any particular account. Just note that you can choose your favourite (or one not on the list) and slot it into our official version of epistemic interpretivism.

Epistemic Interpretivism. A function c (or set C) counts as "your credal state" iff it best rationalises your comparative beliefs ≽. Moreover, c (or set C) best rationalises ≽ iff (i) it recommends ≽ as strongly as possible, so that no other c∗ (or set C∗) recommends ≽ to a higher degree, and (ii) c is itself closer to rational (closer to T) than any other c∗ that recommends ≽ as strongly as possible.

Even setting aside questions about how to understand recommendation, there are various lingering questions about epistemic interpretivism.
For example, one might wonder what makes comparative beliefs more or less epistemically valuable at a world, or how to measure such value. See Fitelson and McCarthy (2015) for an investigation of "additive" epistemic utility measures for comparative belief. One might also wonder what makes one set of estimates closer to rational than another. For a nuanced discussion, see Staffel (2018). We will not address these questions here. But we will evaluate epistemic interpretivism in a bit more depth in Section 7.

4 Challenges to the Relevance of Representation Theorems

We now have a number of accounts of credence on the table, however briefly sketched. These accounts purport to tell us what it means to say that an agent is x% confident in a proposition (if she has precise credences), or between y% and z% confident (if she has imprecise credences). Proponents of these accounts use them to answer some important questions. For example, when exactly is there a real-valued function c (or set C) that plays the relevant theoretical role R well enough to count as "your credal state"? Following Meacham and Weisberg (2011), we will call this the characterisation question. And why should we expect rational agents to have probabilistically coherent credences? We will call this the normative question. In answering these questions, proponents typically invoke coherence constraints (on either preference or comparative belief) and representation theorems.

Hájek (2009), Meacham and Weisberg (2011), and Titelbaum (2015) challenge any such approach. Whatever account of credence you adopt, they argue, there is no plausible representation-theorem-centric narrative that could answer these questions. Their objections are many. We will focus on a few central ones.

Hájek, Meacham and Weisberg, and Titelbaum all imagine that the "basic representation theorem argument" goes as follows.

1. Coherence Constraints.
Any rational agent's comparative belief relation ≽ satisfies coherence constraints φ.

2. Representation Theorem. Relation ≽ satisfies constraints φ if and only if ≽ fully agrees (or almost agrees, or partially agrees, or strongly agrees) with some probability function c (or set of probability functions C).

C. Probabilism. Any rational agent has probabilistic credences (either precise credences given by c, or imprecise credences given by C).

If successful, this argument would at least partially answer both the characterisation and normative questions at once. When is there a credence function c, or a set of such functions C, that plays the relevant theoretical role R well enough to count as your credal state? Whenever your comparative beliefs satisfy coherence constraints φ! Satisfying φ is a sufficient condition for having credences. And why should we expect rational agents to have probabilistically coherent credences? Because the coherence constraints φ are rationally mandatory. And any agent who satisfies φ not only has credences, but probabilistically coherent credences.

But this argument is not successful as it stands. As Eriksson and Hájek (2007), Hájek (2009), Meacham and Weisberg (2011), and Titelbaum (2015) emphasise, it does not follow from the mere fact that some probabilistically coherent credence function fully agrees with an agent's comparative beliefs that she in fact has probabilistic credences. So the argument is invalid. Hájek puts the point as follows (cf. also Meacham and Weisberg, 2011, p. 14, and Titelbaum, 2015, p. 274):

the mere possibility of representing you one way or another might have less force than we want; your acting as if the representation is true of you does not make it true of you. To make this concern vivid, suppose that I represent your preferences with Voodooism. My voodoo theory says that there are warring voodoo spirits inside you.
When you prefer A to B, then there are more A-favouring spirits inside you than B-favouring spirits [...] I then 'prove' Voodooism: if your preferences obey the usual rationality axioms, then there exists a Voodoo representation of you. That is, you act as if there are warring voodoo spirits inside you in conformity with Voodooism. Conclusion: rationality requires you to have warring Voodoo spirits in you. Not a happy result. (Hájek, 2009, p. 238)

The same thing, these objectors claim, can be said about the representation theorem argument for probabilism. Just because your preferences can be represented as the end product of a vigorous war between the voodoo spirits inside you does not imply that you in fact have such spirits inside you. Similarly, just because your comparative beliefs can be represented as arising from precise credences c (or imprecise credences C) does not imply that you in fact have such credences.

This line of criticism is not particularly concerning. The reason: no Bayesians put forward this basic "representation theorem argument." Koopman, Savage, Joyce, etc. all presuppose some account of credence or other. For example, Krantz et al. presuppose a measurement-theoretic account of credence.

we inquire into conditions under which an ordering ≽ of E has an order-preserving function P that satisfies Definition 2. Obviously, the ordering is to be interpreted empirically as meaning "qualitatively at least as probable as." Put another way, we shall attempt to treat the assignment of probabilities to events as a measurement problem of the same fundamental character as the measurement of, e.g., mass or momentum. (Krantz et al., 1971, pp. 199–202)

The upshot: any faithful reconstruction of the "representation theorem argument" really ought to feature an account of credence explicitly as a premise. The simple argument under attack here fails this basic test.
Of course, objectors do not focus exclusively on this simple version of the representation theorem argument. Hájek, Meacham and Weisberg, and Titelbaum all consider more sophisticated versions as well. A fairly general, more charitable way of understanding what fans of representation theorems are up to is this. Firstly, to shed some light on the characterisation question, they establish a "Bridge Theorem" which shows that the function c, or set of functions C, outputted by their favourite representation theorem is fit to play the theoretical role R singled out by their favourite account of credence.

Bridge Theorem. If ≽ satisfies φ, then at least one of the probability functions c (or set of probability functions C) whose existence is guaranteed by the Representation Theorem plays role R well enough to count as "your credal state."

Secondly, to answer the normative question, they put their favourite account of credence, their favourite representation theorem, and this bridge theorem to work in order to provide a more sophisticated argument for probabilism.

1. Coherence Constraints. Any rational agent's comparative belief relation ≽ satisfies coherence constraints φ.

2. Theory of Credence. A real-valued function c (or set C) counts as "your credal state" to the extent that it plays theoretical role R. The better c (or C) plays role R, the more eligible it is as a "credal state candidate."

3. Representation Theorem. Relation ≽ satisfies constraints φ if and only if ≽ fully agrees (or almost agrees, or partially agrees, or strongly agrees) with some probability function c (or set of probability functions C).

4. Bridge Theorem. If ≽ satisfies φ, then at least one of the probability functions c (or set of probability functions C) whose existence is guaranteed by the Representation Theorem plays role R well enough to count as "your credal state."

C. Probabilism.
Any rational agent has probabilistic credences (either precise credences given by c, or imprecise credences given by C).

In Sections 5–7, we will evaluate how this argument fares on each of our competing accounts of credence. But it is worth addressing some general concerns about this argumentative strategy here.¹¹

Meacham and Weisberg worry that even if the axioms φ of your favourite representation theorem encode genuine coherence constraints on rational comparative belief, ordinary folks like you and me are not typically rational (Meacham & Weisberg, 2011, pp. 7–8).¹² Our comparative beliefs violate these constraints φ. So even if the Bridge Theorem is correct (even if the representation theorem in question would output a function c, or a set of functions C, that deserves to be called "your credal state" if your comparative beliefs satisfied φ), it is silent about ordinary folks. The upshot: it does not help to answer the characterisation question in any interesting way. While it does specify sufficient conditions for having credences, those conditions are so demanding that they are more or less irrelevant for agents like us.

This concern, however, does not cut much ice. As we will see in Sections 5–7, there is plenty to say about when ordinary folks (folks who reliably violate constraints of rationality) count as having credences on each of our competing accounts (measurement-theoretic, decision-theoretic, and epistemic interpretivist).

Meacham and Weisberg also worry that the "representation theorem argument" trivialises normative epistemology (Meacham & Weisberg, 2011, pp. 14–16). There is a gap, recall, between representability and psychological reality. Just because your comparative beliefs can be represented as arising from precise credences c (or imprecise credences C) does not imply that you in fact have such credences.
To avoid this problem, the objection goes, representation theorem arguments must stipulatively define an agent's credences to be given by the function c (or set C) outputted by one's favourite representation theorem. But those theorems deliver probabilistic representations by construction. So it is simply true by stipulative definition that whenever an agent has credences, they are probabilistically coherent. Whence the normative force of probabilism then? The claim that rational credences are probabilistically coherent is trivial if all credences are probabilistically coherent by definition.

¹¹ The following objections are adapted from Hájek (2009), Meacham and Weisberg (2011), and Titelbaum (2015).

¹² Meacham and Weisberg are concerned primarily with representation theorems for preference relations. Accordingly, they focus on empirical data that shows that ordinary agents reliably violate putative coherence constraints on rational preference. For example, Kahneman and Tversky (1979) show that subjects consistently violate Savage's Independence Axiom, and Lichtenstein and Slovic (1971, 1973) show that subjects often have intransitive preferences. We adapt their concerns to the case of comparative belief mutatis mutandis.

But again this concern need not give us much pause. We do not need to bridge the gap between representability and psychological reality by stipulative definition. Rather, we bridge that gap by (i) providing a theory of credence, which specifies the theoretical role R that a function c (or set C) must play to count as "your credal state," and (ii) providing a bridge theorem, which establishes that some function c (or set C) outputted by one's favourite representation theorem in fact plays role R sufficiently well. This strategy does not stipulatively define your credences as those given by c (or C). Far from it. Establishing that c (or C) plays R well enough to count as "your credal state" requires substantive argumentation.
It is safe, then, to put these general concerns to the side. Of course, their spectre lingers until we see the details about the relevant bridge theorems and so on (Sections 5–7). We now turn our attention to evaluating how well this strategy answers the characterisation and normative questions, respectively, on each of our competing accounts of credence.

5 Evaluating the Measurement-Theoretic View

5.1 Interpreting Credence Functions

On the measurement-theoretic view, a credence function c (or set C) is a mere numerical measurement system. It allows you to represent an agent's comparative belief structure, ⟨Ω, F, ≽⟩, numerically in the following sense. Firstly, c maps the propositions X in F to real-valued proxies, c(X). Secondly, it does so in a "structure-preserving fashion." If c fully agrees with ≽, then one proxy c(X) is larger than another c(Y) exactly when our agent is more confident in X than Y (and c(X) = c(Y) exactly when she is equally confident in X and Y):

X ≽ Y ⇔ c(X) ≥ c(Y).

In this sense, the familiar "greater than or equal to" relation ≥ on the real numbers "preserves the structure" of our agent's comparative belief relation ≽ on F. Because of this, you can use the numerical measurement system in helpful ways. You can elicit certain comparative beliefs, infer properties of c, perform numerical calculations, and draw conclusions about what other comparative beliefs she must have (or must not have).

Similarly, a set of real-valued functions C can provide a numerical measurement system for ≽. If C fully agrees with ≽, then the c in C uniformly assign larger proxies to X than Y exactly when our agent is more confident in X than Y (and uniformly assign equal proxies exactly when she is equally confident in X and Y):

X ≽ Y ⇔ c(X) ≥ c(Y) for all c ∈ C.

Once more, this allows you to use the (imprecise) numerical measurement system C in helpful ways.
You can elicit certain comparative beliefs, infer properties of C, perform numerical calculations, and draw conclusions about what other comparative beliefs she must have (or must not have).

Weaker types of agreement yield numerical measurement systems fit for slightly different purposes. Suppose, for example, that c strongly agrees with ≽:

X ≽ Y ⇒ c(X) ≥ c(Y),
X ≻ Y ⇒ c(X) > c(Y).

Such a measurement system licenses fewer inferences about ≽ than fully agreeing systems. To see this, imagine that c is a probability function, X and Y are both incompatible with Z, and X ≻ Y. Then since c strongly agrees with ≽, we have c(X) > c(Y). And since c is a probability function, c(X ∪ Z) > c(Y ∪ Z). Hence c(X ∪ Z) ≰ c(Y ∪ Z). From this we can infer that X ∪ Z ⋠ Y ∪ Z. But we cannot infer that X ∪ Z ≽ Y ∪ Z. If, on the other hand, c were to fully agree with ≽, then we could make this latter inference.

To recap: the measurement-theoretic view takes a particular stand on the nature of the theoretical role R that a function c (or set C) must play in order to count as "your credal state." More specifically, c (or C) must fully agree (or almost agree, or partially agree, or strongly agree) with the agent's comparative beliefs, ≽, in the way required to count as a numerical measurement system for ≽. The better c (or C) plays this role R, the more eligible it is as a credal state candidate.

Importantly, though, any function c (or set C) that plays this role R has equal claim to be called "your credence function," on the measurement-theoretic view. Any order-preserving mapping (homomorphism) from F into R is just as eligible as a credal state candidate as any other. So credence functions are not unique. Indeed, if c fully agrees (or almost agrees, or partially agrees, or strongly agrees) with ≽, then any of the infinitely many strictly increasing transformations of c do so as well. So if you have one credence function, on this view, then you have infinitely many.
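The non-uniqueness point can be checked on a toy case. The following Python sketch is only an illustration: the three-world space and the weights are hypothetical, and squaring is just one convenient strictly increasing transformation on non-negative values. It compares the ordering encoded by a probability function c with the ordering encoded by its square.

```python
from fractions import Fraction
from itertools import combinations

WORLDS = ("w1", "w2", "w3")  # hypothetical three-world space
EVENTS = [frozenset(s) for r in range(len(WORLDS) + 1)
          for s in combinations(WORLDS, r)]
WEIGHTS = {"w1": Fraction(6, 10), "w2": Fraction(3, 10), "w3": Fraction(1, 10)}

def c(event):
    """A probability function on the algebra generated by WORLDS."""
    return sum((WEIGHTS[w] for w in event), Fraction(0))

def c_squared(event):
    """A strictly increasing transformation of c (squaring on [0, 1])."""
    return c(event) ** 2

def weak_order(f):
    """All pairs (X, Y) with f(X) >= f(Y): the ordering that f encodes."""
    return {(x, y) for x in EVENTS for y in EVENTS if f(x) >= f(y)}

# Both functions encode exactly the same comparative structure ...
same_ordering = weak_order(c) == weak_order(c_squared)

# ... even though the transformed function is no longer additive.
x, y = frozenset({"w1"}), frozenset({"w2"})
additive = c_squared(x | y) == c_squared(x) + c_squared(y)

print(same_ordering, additive)
```

On these numbers the two orderings coincide while additivity fails for the squared function, so a fully agreeing measurement system need not itself be a probability function. Exact fractions are used so that ties and strict inequalities are not disturbed by floating-point rounding.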
In addition, interpreting credence functions requires care, on the measurement-theoretic view. An agent's credence function does not wear its representationally significant features on its sleeve. Sorting out which features of one's credence function are representationally significant, rather than "mere artefacts," requires knowing what the "permissible transformations" of that credence function are. That is, it requires knowing not only that b counts as "your credence function," but also what other functions c preserve the structure of your comparative and qualitative beliefs, and so count as "your credence function" as well.

For example, on the standard Bayesian picture, an agent's credence function c is such that

c(X ∩ Y) / c(Y) = c(X)

just in case she judges that Y is evidentially independent of X. But if credence functions c are mere numerical measurement systems for an agent's comparative beliefs ≽, then properties like c(X ∩ Y)/c(Y) = c(X) are not representationally significant. They do not reflect anything real about the agent's doxastic state.

To see this, imagine that you take two blood tests. Let w++ be the world in which both tests come back positive; w+− be the world in which the first comes back positive and the second negative; w−+ be the world in which the first comes back negative and the second positive; and w−− be the world in which both tests come back negative. You have comparative beliefs over propositions in the following Boolean algebra:

F = { {w++, w+−, w−+, w−−},
      {w++, w+−, w−+}, {w++, w+−, w−−}, {w++, w−+, w−−}, {w+−, w−+, w−−},
      {w++, w+−}, {w++, w−+}, {w++, w−−}, {w+−, w−+}, {w+−, w−−}, {w−+, w−−},
      {w++}, {w+−}, {w−+}, {w−−}, ∅ }.
In particular, your comparative beliefs are given by:

{w++, w+−, w−+, w−−} ≻ {w++, w+−, w−−} ≻ {w++, w−+, w+−} ≻ {w++, w+−}
≻ {w−+, w+−, w−−} ≻ {w+−, w−−} ≻ {w−+, w+−} ≻ {w+−}
≻ {w++, w−+, w−−} ≻ {w++, w−−} ≻ {w++, w−+} ≻ {w++}
≻ {w−+, w−−} ≻ {w−−} ≻ {w−+} ≻ ∅.

Then the probability functions b and c in Table 1 both fully agree with ≽, and hence both count as numerical measurement systems for ≽. So both play the credal state role, on the measurement-theoretic view.

        w++      w+−      w−+      w−−
b       11/36    22/36    1/36     2/36
c       20/64    33/64    4/64     7/64

Table 1: Probability functions b and c on F. Both fully agree with ≽.

But b is such that

b({w++}) / b({w++, w+−}) = 1/3 = b({w++, w−+}).

Given that b is your credence function, the standard interpretation says: you think that the result of the first test provides no evidence one way or the other about the result of the second test. On the other hand, c is such that

c({w++}) / c({w++, w+−}) = 20/53 > 24/64 = c({w++, w−+}).

Given that c is your credence function, the standard interpretation says: you judge that a positive outcome on the first test supports or confirms a positive outcome on the second test, and vice versa. Finding out that the second test is positive increases your credence that the first test is positive too. What is going on here? Answer: the "standard interpretation" reads more information into one's credence function than is actually encoded in that credence function. On the measurement-theoretic view, credence functions are nothing more than numerical measurement systems that encode the ordering determined by your comparative beliefs. (They are mere "ordinal scale" measurement systems, not "ratio scale" measurement systems.) But there is more to making judgments of evidential relevance and irrelevance than having a particular constellation of comparative beliefs.
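The two claims about Table 1 can be checked mechanically. The following is a minimal sketch using exact rational arithmetic; the ASCII world labels ("w+-" for w+− and so on) and all function names are our own illustrative choices.

```python
from fractions import Fraction
from itertools import combinations

WORLDS = ("w++", "w+-", "w-+", "w--")  # ASCII stand-ins for the four worlds

def extend(weights):
    """Extend weights on worlds to every proposition (subset of WORLDS) by additivity."""
    return {frozenset(s): sum((weights[w] for w in s), Fraction(0))
            for r in range(len(WORLDS) + 1)
            for s in combinations(WORLDS, r)}

# The two probability functions of Table 1.
b = extend({"w++": Fraction(11, 36), "w+-": Fraction(22, 36),
            "w-+": Fraction(1, 36), "w--": Fraction(2, 36)})
c = extend({"w++": Fraction(20, 64), "w+-": Fraction(33, 64),
            "w-+": Fraction(4, 64), "w--": Fraction(7, 64)})

def same_ordering(p, q):
    """True iff p and q induce exactly the same weak ordering of propositions."""
    return all((p[x] >= p[y]) == (q[x] >= q[y]) for x in p for y in p)

def conditional(p, x, given):
    """Ratio p(x and given) / p(given)."""
    return p[x & given] / p[given]

first_pos = frozenset({"w++", "w+-"})   # first test positive
second_pos = frozenset({"w++", "w-+"})  # second test positive

same = same_ordering(b, c)
b_ratio = conditional(b, second_pos, first_pos)
c_ratio = conditional(c, second_pos, first_pos)
print(same, b_ratio, b[second_pos], c_ratio, c[second_pos])
```

Since b and c order the sixteen propositions identically, any comparative belief relation with which one fully agrees is one with which the other fully agrees too. Yet the ratio equals the unconditional value for b and exceeds it for c, which is exactly the difference that the standard interpretation treats as significant.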
Agents who only have comparative beliefs simply are not opinionated enough to count as having opinions about evidential relevance and irrelevance. So credence functions do not reflect any such opinions. Interpreting imprecise credal states requires care too. Suppose, for example, that you have opinions about the propositions in the Boolean algebra F∗ = {Ω, X, ¬X, ∅}. Consider the precise and imprecise credal states on F∗ given by the probability function b in Table 2 and the set of probability functions C:

        Ω     X      ¬X     ∅
b       1     0.7    0.3    0

Table 2: Probability function b on F∗

C = { c | c(Ω) > c(X) > c(¬X) > c(∅) }.

On the standard interpretation, b and C represent different doxastic states. An agent with credence function b is precisely 70% confident that X is true. An agent with imprecise credal state C, in contrast, is at least 50% confident that X is true, but nothing stronger. On the standard interpretation, these are not idle differences. These differences in doxastic states are reflected in one's evaluative attitudes. For example, an agent with credence function b will have precisely the same fair buying and selling price for a unit gamble G on X, viz., 0.7. Paying any price up to £0.7 for G is a good deal in her view. Selling G for any price over £0.7 is a good deal. But an agent with imprecise credal state C will have different buying and selling prices for G. Paying any price up to £0.5 for G is a good deal, in her view. But selling G is only a determinately good deal if the buyer is willing to pay more than £1. On the measurement-theoretic view, however, b and C represent exactly the same doxastic state. They both fully agree with the following comparative belief relation ≽:

Ω ≻ X ≻ ¬X ≻ ∅.

They are both order-preserving mappings from F∗ into the reals that preserve exactly the same structure. In this case, there is simply no substantive difference between being 70% confident, or 89.637% confident, or at least 50% confident that X is true.
Only the comparative beliefs that b and C encode are psychologically real. Everything else is a "mere artefact" of one's preferred numerical measurement system. Both b and C, and any other credal state that fully agrees with ≽, play exactly the same theoretical role: they represent the comparative beliefs captured by ≽ in an elegant, easy-to-use, numerical fashion, nothing more and nothing less.

5.2 Unary and Pluralist Variants

We have focussed thus far on a particular unary variant of the measurement-theoretic view. On this view, credence functions are mere numerical measures of one's comparative beliefs. Having credences is nothing over and above having numerically representable comparative beliefs. You might be attracted to this view if, for example, you think that we can explain and rationalise everything important about choice and inference by appealing exclusively to comparative belief, with no additional modes or types of doxastic judgment necessary. In that case, you might say: to the extent that we are willing to talk about prima facie distinct types of opinion (degrees of belief, full or categorical belief, etc.), they ought to ultimately reduce to comparative beliefs. Reducing those other types of opinion away will allow us to provide the simplest and most unified possible explanations of the relevant data regarding choice and inference.

But there is also a pluralist variant of the measurement-theoretic view, which you might find attractive if you are less optimistic about the explanatory power of comparative belief. On the pluralist version, agents have a genuine plurality of doxastic attitudes, not simply comparative beliefs.
In addition to comparative beliefs, agents also have: (i) opinions about the evidential dependence or independence of one hypothesis on another; (ii) opinions about the causal dependence or independence of one variable on another; (iii) full or categorical beliefs; they may even (iv) explicitly estimate the values of all sorts of different variables, including the frequency of truths in a set of propositions, and the truth-values of individual propositions. Estimating, in this sense, is a matter of making a sui generis doxastic judgment: a type of judgment that may bear interesting relations to other types of judgments (normative relations, causal relations, etc.), but is not reducible to them. Estimating the truth-value of a proposition, in this sense, is what Jeffrey (2002) calls having an exact judgmental probability for the truth of that proposition. On the pluralist measurement-theoretic view, your credence function is a mere numerical measurement system, but not a measure specifically of your comparative belief relation. Rather, on the pluralist view, you have a genuine plurality of comparative and qualitative doxastic attitudes, and your credence function is a measure of that entire system of attitudes. Consider once again our blood test example. You have the following comparative beliefs:

{w++, w+−, w−+, w−−} ≻ {w++, w+−, w−−} ≻ {w++, w−+, w+−} ≻ {w++, w+−}
≻ {w−+, w+−, w−−} ≻ {w+−, w−−} ≻ {w−+, w+−} ≻ {w+−}
≻ {w++, w−+, w−−} ≻ {w++, w−−} ≻ {w++, w−+} ≻ {w++}
≻ {w−+, w−−} ≻ {w−−} ≻ {w−+} ≻ ∅.

But now imagine that you have a wide range of comparative and qualitative opinions, not just comparative beliefs. You think, for example, that when you find out the result of the first test (positive or negative), this provides no evidence one way or the other about the result of the second test. (Perhaps the tests probe two different, unrelated conditions.)
That is, you judge {w++, w+−} and {w−+, w−−} to be evidentially independent of {w++, w−+} and {w+−, w−−}, and vice versa. In addition, you have certain full beliefs or categorical beliefs. Let's suppose that you believe that the first test will come back positive. (It probes for a condition that you quite clearly have.) That is, you fully believe {w++, w+−}. And you believe all of the logical consequences of this proposition. But you have no further full or categorical beliefs. Finally, you judge 1/3 to be the best estimate of the truth-value of the proposition that the second test will come back positive. (Recall, a proposition's truth-value is 1 if it is true and 0 if it is false.) In Jeffrey's parlance, you have a judgmental probability of 1/3 for the proposition {w++, w−+}. So you have a genuine plurality of doxastic attitudes: you have comparative beliefs; you make evidential independence judgments; you have full or categorical beliefs; you also estimate the truth-values of certain propositions (you have exact judgmental probabilities). On the pluralist measurement-theoretic view, your credence function is a measure of this entire system of attitudes. To make this more precise, let's model your doxastic attitudes using a relational structure:

A = ⟨F, ≽, I, B, E_{1/3}⟩.

A comprises your Boolean algebra F of subsets of Ω = {w++, w+−, w−+, w−−}, together with a comparative belief relation ≽ on F, an independence relation I, a (unary) belief relation B, and a (unary) estimation relation E_{1/3}. I models your evidential independence judgments. It will be convenient to think of I as a 3-place relation on F:

I(X, Y, X ∩ Y) iff you judge X to be evidentially independent of Y.
Since you judge {w++, w+−} and {w−+, w−−} to be independent of {w++, w−+} and {w+−, w−−}, and vice versa, we have:

I({w++, w+−}, {w++, w−+}, {w++}),
I({w−+, w−−}, {w++, w−+}, {w−+}),
I({w++, w+−}, {w+−, w−−}, {w+−}),
I({w−+, w−−}, {w+−, w−−}, {w−−}).

We also have I(Y, X, X ∩ Y) for each of these four independence judgments I(X, Y, X ∩ Y). Likewise, B models your full or categorical beliefs:

B(X) iff you believe X.

Since you believe {w++, w+−} and all of its logical consequences, we have:

B({w++, w+−, w−+, w−−}),
B({w++, w+−, w−+}),
B({w++, w+−, w−−}),
B({w++, w+−}).

Finally, E_{1/3} models your explicit estimates of truth-values:

E_x(X) iff you judge x to be the best estimate of the truth-value of X.

Since you judge 1/3 to be the best estimate of the truth-value of {w++, w−+}, we have:

E_{1/3}({w++, w−+}).

On the pluralist view, your credence function is a measure of your entire system of attitudes: A = ⟨F, ≽, I, B, E_{1/3}⟩. It is a homomorphism (a structure-preserving mapping) that takes A into some numerical structure A∗:

A∗ = ⟨ℝ, ≽∗, I∗, B∗, E∗_{1/3}⟩.

That is, your credence function c maps F into ℝ in a way that preserves A's structure, so that:13

X ≽ Y ⇔ c(X) ≽∗ c(Y),
I(X, Y, X ∩ Y) ⇔ I∗(c(X), c(Y), c(X ∩ Y)),
B(X) ⇔ B∗(c(X)),
E_{1/3}(X) ⇔ E∗_{1/3}(c(X)).

Which numerical structure c takes A into, on the measurement-theoretic view, is either a matter of convention or a matter to be decided on practical grounds. For illustrative purposes, let's choose a familiar numerical structure. Let ≽∗ be the "greater than or equal to" relation, ≥. Let I∗ be the standard probabilistic independence relation:

I∗(c(X), c(Y), c(X ∩ Y)) iff c(X)c(Y) = c(X ∩ Y).

13 We could swap full agreement for almost, or partial, or strong agreement here. Weaker notions of agreement would provide us with weaker notions of structure-preservation.
Let B∗ be a Lockean belief relation, so that believed propositions X have real-valued proxies c(X) that are greater than (or equal to) some threshold τ (for concreteness let τ = 5/6):

B∗(c(X)) iff c(X) ≥ τ.

Finally, let E∗_{1/3} be:

E∗_{1/3}(c(X)) iff c(X) = 1/3.

This ensures that for any structure-preserving measurement system, c, you explicitly judge 1/3 to be the best estimate of X's truth-value just in case c(X) = 1/3. The important observation to make is this: the pluralist view carves out a bigger job for credence functions to do than the reductive view. Credence functions must do more than preserve the order induced on F by your comparative belief relation. They must also preserve the structure induced by your various other doxastic attitudes: your evidential independence judgments, full or categorical beliefs, and so on. So a function c : F → ℝ may well do the work required to count as "your credence function" on the unary view, but yet fall short of that mark on the pluralist view. Consider, for example, the function b : F → ℝ:

b({w++, w+−, w−+, w−−}) = 1,        b({w++, w−+, w−−}) = 31/64,
b({w++, w+−, w−−}) = 60/64,         b({w++, w−−}) = 27/64,
b({w++, w−+, w+−}) = 57/64,         b({w++, w−+}) = 24/64,
b({w++, w+−}) = 53/64,              b({w++}) = 20/64,
b({w−+, w+−, w−−}) = 44/64,         b({w−+, w−−}) = 11/64,
b({w+−, w−−}) = 40/64,              b({w−−}) = 7/64,
b({w−+, w+−}) = 37/64,              b({w−+}) = 4/64,
b({w+−}) = 33/64,                   b(∅) = 0.

It is easy to verify that b fully agrees with ≽, i.e., X ≽ Y ⇔ b(X) ≥ b(Y). So b is a real-valued measure of your comparative belief relation ≽. Hence, it counts as a credence function on the unary measurement-theoretic view. It preserves the structure on F induced by your comparative beliefs, the only type of doxastic attitude that the unary view countenances.
But it does not play the theoretical role required to count as a credence function on the pluralist measurement-theoretic view. To do that, it must also preserve the structure on F induced by your various other doxastic attitudes: your independence judgments, full or categorical beliefs, and so on. But b falls short of that mark. For example, you think that the outcome of the first test provides no evidence about the outcome of the second. But b does not treat {w++, w+−} and {w++, w−+}, for example, as independent, in the way specified by I∗:

b({w++, w+−}) · b({w++, w−+}) = 53/64 · 24/64 = 159/512 ≠ 160/512 = 20/64 = b({w++}).

Similarly, you believe that the first test will come back positive. That is, you fully believe {w++, w+−}. But b does not treat {w++, w+−} as believed, in the way specified by the Lockean belief relation B∗. It maps {w++, w+−} to a real-valued proxy b({w++, w+−}) = 53/64 ≈ 0.828 below the threshold τ = 5/6 ≈ 0.833 required for full or categorical belief. Finally, you judge 1/3 to be the best estimate of the truth-value of the proposition that the second test will come back positive. You have a judgmental probability of 1/3 for the proposition {w++, w−+}. But b fails to map {w++, w−+} to the real-valued proxy set aside by E∗_{1/3} for such propositions, viz., 1/3. Instead, b({w++, w−+}) = 24/64 = 0.375. The upshot: while b preserves the structure on F induced by your comparative beliefs, it fails to preserve the additional structure induced by your various other doxastic attitudes: your evidential independence judgments, full or categorical beliefs, and so on. So while b does count as one of your (infinitely many) credence functions on the unary measurement-theoretic view, it does not count as one on the pluralist measurement-theoretic view. In contrast, the function c : F → ℝ of Figure 6 counts as "your credence function" on both the unary and pluralist views.
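The three failures of b just described, and the corresponding successes of the function c of Figure 6, can be checked with a short sketch. Note the naming: here b is the function with denominators of 64 and c the one with denominators of 36, as in this section. The ASCII world labels and helper names are our own. As a simplifying assumption, the independence check runs only from your listed judgments to the numerical relation I∗ (left to right), since products involving Ω or ∅ satisfy the numerical condition trivially.

```python
from fractions import Fraction
from itertools import combinations

WORLDS = ("w++", "w+-", "w-+", "w--")  # ASCII stand-ins for the four worlds
PROPS = [frozenset(s) for r in range(len(WORLDS) + 1)
         for s in combinations(WORLDS, r)]
TAU = Fraction(5, 6)  # Lockean threshold from the text

def extend(numerators, denom):
    """Additive extension of world weights (given as numerators over denom)."""
    return {p: Fraction(sum(numerators[w] for w in p), denom) for p in PROPS}

b = extend({"w++": 20, "w+-": 33, "w-+": 4, "w--": 7}, 64)
c = extend({"w++": 11, "w+-": 22, "w-+": 1, "w--": 2}, 36)

first_pos, first_neg = frozenset({"w++", "w+-"}), frozenset({"w-+", "w--"})
second_pos, second_neg = frozenset({"w++", "w-+"}), frozenset({"w+-", "w--"})

# Your qualitative attitudes, as described in the text.
INDEPENDENT = [(x, y) for x in (first_pos, first_neg)
               for y in (second_pos, second_neg)]
BELIEVED = {p for p in PROPS if first_pos <= p}  # {w++, w+-} and its consequences
ESTIMATED_ONE_THIRD = {second_pos}

def pluralist_checks(p):
    """Return (independence_ok, belief_ok, estimate_ok) for a candidate measure p."""
    independence_ok = all(p[x] * p[y] == p[x & y] for x, y in INDEPENDENT)
    belief_ok = all((p[x] >= TAU) == (x in BELIEVED) for x in PROPS)
    estimate_ok = all((p[x] == Fraction(1, 3)) == (x in ESTIMATED_ONE_THIRD)
                      for x in PROPS)
    return independence_ok, belief_ok, estimate_ok

b_result = pluralist_checks(b)
c_result = pluralist_checks(c)
print(b_result, c_result)
```

On these inputs b fails all three checks while c passes all three, in line with the verdicts in the text.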
(The interested reader may verify this for herself.)

c({w++, w+−, w−+, w−−}) = 1,        c({w++, w−+, w−−}) = 14/36,
c({w++, w+−, w−−}) = 35/36,         c({w++, w−−}) = 13/36,
c({w++, w−+, w+−}) = 34/36,         c({w++, w−+}) = 12/36,
c({w++, w+−}) = 33/36,              c({w++}) = 11/36,
c({w−+, w+−, w−−}) = 25/36,         c({w−+, w−−}) = 3/36,
c({w+−, w−−}) = 24/36,              c({w−−}) = 2/36,
c({w−+, w+−}) = 23/36,              c({w−+}) = 1/36,
c({w+−}) = 22/36,                   c(∅) = 0.

Figure 6: Credence function c

To recap: the measurement-theoretic view stakes out a particular position on the theoretical role R that a function c (or set of functions C) must play in order to count as "your credal state." It says that c (or C) must fully agree (or almost agree, or partially agree, or strongly agree) with your comparative and qualitative opinions (comparative beliefs, evidential independence judgments, full or categorical beliefs, etc.) in the way required to count as a numerical measure of that entire system of attitudes. The better c (or C) plays this role R, the more eligible it is as a credal state candidate. On the unary measurement-theoretic view, the fundamental type of doxastic attitude is comparative belief. So credal states are numerical measures of comparative beliefs. On the pluralist measurement-theoretic view, you have a genuine plurality of doxastic attitudes. So credal states are numerical measures of a more highly structured system of attitudes.

To streamline our discussion, we will focus on the unary variant of the measurement-theoretic view going forward.

5.3 The Characterisation and Normative Questions

How does the unary measurement-theoretic account answer the characterisation question?
When exactly is there a function c (or set C) that fully agrees (or almost agrees, or partially agrees, or strongly agrees) with your comparative belief relation ≽ in the way required to count as a numerical measure of ≽? We explored a partial answer to this question earlier. Scott (1964) proves that there is a probability function that fully agrees with your comparative belief relation ≽ just in case ≽ satisfies Non-Triviality, Non-Negativity, Totality, and Isovalence. Rios Insua (1992) and Alon and Lehrer (2014) prove that there is a set of probability functions that fully agrees with ≽ just in case ≽ satisfies Reflexivity, Non-Negativity, Non-Triviality, and the Generalised Finite-Cancellation axiom. Kraft et al. (1959) prove that there is a probability function that almost agrees with ≽ just in case it satisfies Almost-Cancellation. Adams (1965) and Fishburn (1969) prove that there is a probability function that partially agrees with ≽ just in case it satisfies Partial-Cancellation. Finally, in proving the Generalised Scott Theorem, we identified sufficient conditions for the existence of a probability function that strongly agrees with ≽: Non-Triviality, Non-Negativity, and Isovalence. Pinning down necessary and sufficient conditions for strong representability is an open problem. These representation theorems tell us what it takes to count as having probabilistically coherent credences, on the measurement-theoretic view. But they do not answer the more general characterisation question: when is your comparative belief relation sufficiently well-behaved for you to count as having credences full stop, coherent or not? Krantz et al. (1971) provide an answer. They show that a comparative belief relation ≽ fully agrees with a real-valued function c if and only if ≽ is a weak order, i.e., ≽ satisfies Transitivity and Totality (Krantz et al., 1971, p. 15, Theorem 1).
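For a finite algebra, one direction of this result can be illustrated with a simple counting construction: assign to each proposition the number of propositions it is at least as likely as. This construction is our own illustrative choice, not a reproduction of the proof in Krantz et al.; the proposition labels and the two example relations below are likewise hypothetical encodings, with the first based on the chain Ω ≻ X ≻ ¬X ≻ ∅ used earlier.

```python
from itertools import product

def is_weak_order(items, relation):
    """Check Totality and Transitivity of a relation given as a set of pairs."""
    total = all((x, y) in relation or (y, x) in relation
                for x, y in product(items, repeat=2))
    transitive = all((x, z) in relation
                     for x, y, z in product(items, repeat=3)
                     if (x, y) in relation and (y, z) in relation)
    return total and transitive

def counting_measure(items, relation):
    """c(x) = number of items y such that x is weakly above y."""
    return {x: sum(1 for y in items if (x, y) in relation) for x in items}

def fully_agrees(c, items, relation):
    return all(((x, y) in relation) == (c[x] >= c[y])
               for x, y in product(items, repeat=2))

items = ["Omega", "X", "not-X", "empty"]
# The chain Omega > X > not-X > empty, closed under reflexivity and transitivity.
chain = {(items[i], items[j]) for i in range(4) for j in range(i, 4)}
# A hypothetical incomplete relation: the comparison between X and not-X is withheld.
gappy = chain - {("X", "not-X")}

chain_ok = is_weak_order(items, chain)
chain_agrees = fully_agrees(counting_measure(items, chain), items, chain)
gappy_ok = is_weak_order(items, gappy)
gappy_agrees = fully_agrees(counting_measure(items, gappy), items, gappy)
print(chain_ok, chain_agrees, gappy_ok, gappy_agrees)
```

When the relation is a weak order the counting measure fully agrees with it. When Totality fails, as in the second relation, no real-valued function can fully agree, since any two real numbers are comparable by ≥.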
So if a real-valued function c counts as a structure-preserving numerical measure of ≽ just in case c fully agrees with ≽, and if precise credence functions just are structure-preserving numerical measures of ≽, then we now know exactly when you count as having precise credences full stop. You count as having precise credences just in case ≽ satisfies Transitivity and Totality. Weaker notions of agreement set weaker standards for "structure preservation." They thereby make it easier for a real-valued function (or set of functions) to count as a structure-preserving numerical measurement system for ≽. In turn, your comparative beliefs need not satisfy such strict constraints for you to count as having credences. For example, every comparative belief relation ≽ almost agrees with a real-valued function c. So if all that is required for structure-preservation is almost-agreement, then nothing whatsoever is required of ≽ for you to count as having credences. Any comparative belief relation will do. More interestingly, ≽ strongly agrees with a real-valued function c if and only if ≽ satisfies Weak Transitivity (see the appendix for proof).14

Weak Transitivity. If X ≽ Y1 ≽ . . . ≽ Yn ≽ Z, then X ⊀ Z.

So if structure-preservation requires strong-agreement and nothing more, then you count as having precise credences just in case ≽ satisfies Weak Transitivity. What then of Meacham and Weisberg's concern? They claim that the axioms of typical representation theorems for comparative belief are so demanding that only perfectly rational agents could possibly satisfy them. So even if those axioms do encode sufficient conditions for having credences, they are more or less irrelevant for irrational agents like us. They leave entirely open whether our comparative beliefs are ever well-behaved enough for us to count as having credences.
But your comparative beliefs need not satisfy the axioms of Scott's Theorem (or the Almost-Cancellation axiom, or the Partial-Cancellation axiom, etc.) for you to count as having credences. Such axioms encode necessary and sufficient conditions for having probabilistic credences. Probabilistic credence functions, however, are not the only credence functions in town. Your comparative beliefs only need to satisfy weaker constraints, such as Weak Transitivity, for you to count as having credences tout court. Weak Transitivity is not nearly as demanding as Scott's axiom. It is also worth noting that even though Scott's axiom and the like seem complicated, it is not obvious that they are excessively difficult for agents like us to satisfy. It may be computationally intensive to run a diagnostic program which continually checks your comparative beliefs for violations of Scott's axiom. And if we had to run such a program to reliably satisfy Scott's axiom, then you might well expect that limited agents like us typically violate it. But no such program is necessary. Nature is replete with cheap solutions to seemingly computationally intensive problems. This is one main lesson of the embodied cognition movement in cognitive science.15 Agents like us might well use computationally cheap strategies, rather than demanding diagnostic programs, in order to minimise violations of Scott's axiom and other coherence constraints. Meacham and Weisberg also worry that the measurement-theoretic view and its ilk count the wrong functions as eligible credal state candidates (Meacham & Weisberg, 2011, p. 5). On the (unary) measurement-theoretic view, any of the infinitely many numerical measurement systems for ≽ count as equally eligible credal state candidates. But some clearly are more eligible than others.

14 The proof strategy for this theorem is due to Catrin Campbell-Moore.
For example, suppose that Holmes has opinions about finitely many propositions, e.g., about whether Moriarty is in London, etc. Then Holmes is struck on the head. The blow does not change Holmes' comparative beliefs. He is still more confident that Moriarty is in London than Paris, and so on. But it does raise his confidence that Moriarty is in London. Then clearly something has changed about which functions are the most eligible candidates for counting as Holmes' credence function. But on the measurement-theoretic view, nothing at all has changed. One of two things is going on here. Option 1: the objection tacitly presupposes that the measurement-theoretic view simply misidentifies the theoretical role R that a function c (or set C) must play in order to count as "your credal state." That is a meaty, substantive debate, and we will not explore it any further. Option 2: the objection tacitly presupposes that Holmes makes explicit judgments about the best estimates of truth-values, or something of the sort. But that assumes pluralism. And the pluralist measurement-theoretic view simply does not say that any of the infinitely many numerical measurement systems of Holmes' comparative beliefs are equally eligible candidates for counting as Holmes' credence function. So much for the characterisation question.

15 Consider, for example, the "outfielder's problem" (Clark, 2015, p. 12). It might seem miraculous that baseball players manage to catch fly balls if doing so involves: (i) estimating the position of a ball at various time points; (ii) using this information to estimate the ball's trajectory; (iii) calculating where the ball will land on the basis of its trajectory. This is computationally intensive! Luckily, there is a computationally cheap solution. You can just move your body in a way that keeps the ball centred in your visual field. This strategy uses the agent's body to reduce computational demand.
How does the unary measurement-theoretic account answer the normative question? Why should we expect rational agents to have probabilistically coherent credences? How we answer the normative question depends on what we say about structure preservation. If we say, for example, that c must fully agree with ≽ to count as a structure-preserving numerical measure of ≽, and in turn count as "your credal state," then the following argument answers the normative question.

1. Coherence Constraints. Any rational agent's comparative belief relation ≽ satisfies Non-Triviality, Non-Negativity, Totality, and Isovalence.
2. Theory of Credence. A real-valued function c (or set C) counts as "your credal state" just in case it is a structure-preserving numerical measure of ≽, i.e., just in case it plays the "structure-preservation role" R. And c preserves the structure of ≽ just in case c fully agrees with ≽.
3. Scott's Theorem. Relation ≽ satisfies Non-Triviality, Non-Negativity, Totality, and Isovalence if and only if ≽ fully agrees with some probability function c.
4. Bridge Theorem. If ≽ satisfies Non-Triviality, Non-Negativity, Totality, and Isovalence, then there is some probability function c that plays role R well enough to count as "your credal state." (From 2 and 3)
C. Probabilism. Any rational agent has probabilistic credences. (From 1 and 4)

Now, you might quibble with premise 1. You might doubt whether Totality, for example, encodes a genuine constraint of rationality. In that case, we might weaken our putative coherence constraints by adopting less demanding standards for structure preservation. For example, if we say that structure preservation requires only strong agreement with ≽, rather than full agreement, then we can offer the following argument.

1∗. Coherence Constraints. Any rational agent's comparative belief relation ≽ satisfies Non-Triviality, Non-Negativity, and Isovalence.
2∗. Theory of Credence. A real-valued function c (or set C) counts as "your credal state" just in case it is a structure-preserving numerical measure of ≽, i.e., just in case it plays the "structure-preservation role" R. And c preserves the structure of ≽ just in case c strongly agrees with ≽.
3∗. Corollary of GST. If ≽ satisfies Non-Triviality, Non-Negativity, and Isovalence, then ≽ strongly agrees with some probability function c.
4∗. Bridge Theorem. If ≽ satisfies Non-Triviality, Non-Negativity, and Isovalence, then there is some probability function c that plays role R well enough to count as "your credal state." (From 2∗ and 3∗)
C∗. Probabilism. Any rational agent has probabilistic credences. (From 1∗ and 4∗)

Each type of agreement (full, strong, almost, partial) yields a different variant of this argument. Whether you find any of them compelling will depend on (i) which putative coherence constraints you find plausible or implausible (premise 1), and (ii) what type of agreement is required for credence functions to play any auxiliary theoretical roles you deem important (premise 2). At this point, you might be a bit suspicious. Doesn't this argument trivialise probabilism? True enough, you might say, the probability functions outputted by Scott's theorem are fit to play the "credal state role" R on the measurement-theoretic view. But that is because we reverse engineered R so that Scott's theorem outputs exactly the right sorts of functions to play R! We stipulatively defined R to be the role of preserving the structure of ≽. Then we stipulatively defined structure-preservation to be a matter of fully agreeing with ≽. But given these stipulative definitions, it follows trivially that the probability functions outputted by Scott's theorem play R well enough to count as "your credal state." Probabilism seems less like a substantive normative thesis, then, and more like a trivial consequence of stipulative definitions. This suspicion is doubly off the mark.
Firstly, the measurement-theoretic account of credence puts forward a substantive claim about the principal theoretical role of credence functions c (and imprecise credal states C). It is motivated by the thought that our opinions are qualitative. At bottom, we have opinions like: comparative beliefs, full beliefs, etc. And the best way to understand the numbers that we use to describe these qualitative attitudes is in exactly the same way that we understand the numbers that we use to describe length, mass, volume, etc., viz., as numerical measurement systems. Whether this is right or wrong, it is surely no stipulative definition. Secondly, as we have already emphasised, representability by a probability function is strictly stronger than representability by a real-valued function. Establishing that the stronger axioms (e.g., Scott's axioms) encode genuine constraints of rationality, rather than merely the weaker axioms (e.g., Transitivity and Totality) is non-trivial. As a result, establishing probabilism is non-trivial, even if we simply grant the measurement theorist her account of credence. You might also be concerned that the strategy above only establishes half of probabilism. If successful, it establishes that all rational agents have probabilistic credences. But it does not establish that rational agents have only probabilistic credences. On the measurement-theoretic view, any agent that counts as having a credence function at all in fact has a plurality of credence functions. If she is rational, then at least one of these will be probabilistically coherent. But many will not be. If c is a probability function that fully agrees with ≽ (or almost agrees, or partially agrees, or strongly agrees), then any of the infinitely many strictly increasing transformations of c do so as well. These transformations will not in general be probability functions.
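To illustrate with one such transformation, take a probability function b and let c(X) = e^{b(X)}. The sketch below assumes, purely for concreteness, that b is the function with denominators of 36 from the blood test example; the ASCII world labels and variable names are our own. It checks that c induces the same ordering as b, that c is neither normalised nor additive, and that c instead satisfies the multiplicative analogue c(X ∪ Y) · c(X ∩ Y) = c(X) · c(Y).

```python
import math
from fractions import Fraction
from itertools import combinations

WORLDS = ("w++", "w+-", "w-+", "w--")  # ASCII stand-ins for the four worlds
NUMERATORS = {"w++": 11, "w+-": 22, "w-+": 1, "w--": 2}  # over 36
PROPS = [frozenset(s) for r in range(len(WORLDS) + 1)
         for s in combinations(WORLDS, r)]

# b is a probability function; c is a strictly increasing transformation of b.
b = {p: Fraction(sum(NUMERATORS[w] for w in p), 36) for p in PROPS}
c = {p: math.exp(float(b[p])) for p in PROPS}

pairs = [(x, y) for x in PROPS for y in PROPS]

same_order = all((b[x] >= b[y]) == (c[x] >= c[y]) for x, y in pairs)
normalised = math.isclose(c[frozenset(WORLDS)], 1.0)
additive = all(math.isclose(c[x | y] + c[x & y], c[x] + c[y]) for x, y in pairs)
multiplicative = all(math.isclose(c[x | y] * c[x & y], c[x] * c[y], rel_tol=1e-9)
                     for x, y in pairs)
print(same_order, normalised, additive, multiplicative)
```

So c agrees with exactly the same comparative belief relations as b, yet it is not a probability function: it assigns e rather than 1 to the tautology and it is multiplicative rather than additive.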
But this auxiliary thesis, that no rational agent has a probabilistically incoherent credence function, is not particularly interesting on the measurement-theoretic view. The reason: nothing interesting hinges on whether some incoherent function (or set of functions) is fit to play the "credal state role" for you. On the measurement-theoretic view, credence functions are mere numerical measurement systems for comparative belief: systems which allow you to measure certain parts of an agent's comparative belief relation ≽ and draw inferences about other parts of ≽. Probabilistic measurement systems are particularly useful for this end. Probability functions have nice properties, properties that simplify the calculations necessary to draw inferences about ≽. Whether or not some unhelpful, incoherent measurement system exists is neither here nor there.16 If some such system exists, who cares! It's not hurting anyone. The interesting question is whether the useful things exist.

16 Of course, not all incoherent measurement systems are unhelpful. For example, suppose that b is a probability function and fully agrees with ≽. Let $c(X) = e^{b(X)}$. Then c fully agrees with ≽. But while b satisfies Finite Additivity, $b(X \cup Y) + b(X \cap Y) = b(X) + b(Y)$, c satisfies Finite Multiplicativity, $c(X \cup Y) \cdot c(X \cap Y) = c(X) \cdot c(Y)$. Note, though, that c is no less "helpful" than b. All of the theorems of probability theory can be rewritten in terms of a multiplicative scale rather than an additive scale. So c could be used to facilitate inference about ≽ just as well as b. For analogous remarks regarding additive and multiplicative measures in physics, see (Krantz et al., 1971, p. 100).

But if an agent has incoherent credences, doesn't this come at some cost to her? Doesn't it hurt her? De Finetti (1964) shows that any agent with incoherent credences is Dutch bookable, i.e., susceptible to sure loss at the hands of a clever bettor. And Joyce (1998, 2009) shows that any agent with incoherent credences is accuracy-dominated, i.e., there are distinct (coherent) credences that are guaranteed to be closer to the truth than hers. Aren't these costs, pragmatic and epistemic, that any agent with incoherent credences must pay?

No. Not on the measurement-theoretic view. De Finetti assumes that if c counts as your credence function, then c(X) is both your fair buying and selling price for a unit gamble on X. But this is simply not so on the measurement-theoretic view. Credence functions represent your comparative beliefs ≽ in an elegant, easy-to-use, numerical fashion: nothing more, nothing less. It is simply not the job of a credence function to capture your fair buying (or selling) prices. We cannot read your fair buying and selling prices off of c in any straightforward fashion.

Indeed, to infer anything about your betting behaviour from c, we need decision-theoretic norms that specify how rational comparative beliefs and preferences hang together. For example, following (Savage, 1954, Section 3.2), we might suggest the following.

Coherence. If X ≻ Y, then you ought to prefer to stake good outcomes on X rather than on Y. More carefully, if you strictly prefer outcome $o$ to $o^*$, and X ≻ Y, then you ought to strictly prefer A to B:

\[ A = [\, o \text{ if } X,\ o^* \text{ if } \neg X \,], \qquad B = [\, o \text{ if } Y,\ o^* \text{ if } \neg Y \,]. \]

Moreover, you ought to be willing to sacrifice some small amount ε to exchange B for A.

If you satisfy Coherence, and c fully agrees with ≽, then we can use c to infer something about your betting behaviour. For example, if c(X) = 0.7 and c(Y) = 0.6, then we can infer that you prefer to let £1 ride on X rather than on Y, and would even be willing to pay some small amount to exchange the second gamble for the first. But we cannot infer that your fair buying (selling) price for X is £0.7, or that your fair buying (selling) price for Y is £0.6.
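To see why only the ordinal inference survives, one can compare two numerical measures of the same comparative beliefs. In the sketch below, the values 0.7 and 0.6 come from the example above, while the second measure (the square of the first) and the function name are assumptions introduced purely for illustration.

```python
# Two numerical measures of the same comparative beliefs about X and Y.
# c uses the values from the running example; c_squared is an assumed
# strictly increasing rescaling of c, so it agrees with the same ordering.
c = {"X": 0.7, "Y": 0.6}
c_squared = {event: value ** 2 for event, value in c.items()}

def preferred_event_for_stake(measure, first, second):
    """Apply the Coherence norm: stake the good outcome on the event the
    agent is strictly more confident in; None if neither is ranked higher."""
    if measure[first] > measure[second]:
        return first
    if measure[second] > measure[first]:
        return second
    return None

# Both measures license the same inference about betting preferences ...
print(preferred_event_for_stake(c, "X", "Y"))
print(preferred_event_for_stake(c_squared, "X", "Y"))

# ... but they attach different numbers to X, so neither number can be
# read off as a fair buying or selling price.
print(c["X"], c_squared["X"])
```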
Without this crucial assumption, that credences encode fair buying/selling prices, we cannot provide a de Finetti-style Dutch book argument to show that no rational agent has incoherent credences. Having an incoherent credence function does not mean that you have incoherent fair buying/selling prices, and hence does not mean that your buying/selling prices render you Dutch bookable.

In a similar fashion, Joyce assumes that if c counts as your credence function, then c(X) is your best estimate of X's truth-value. Moreover, the accuracy of these estimates is what makes your doxastic state better or worse from the epistemic perspective. (Accuracy is the principal source of epistemic value, anyway.) But again, this is not so on the measurement-theoretic view. Credence functions are mere numerical measures of comparative belief relations. It is simply not the job of a credence function to capture your best estimates of truth values, on the measurement-theoretic view. The upshot: having an incoherent credence function does not mean that you in any sense have incoherent truth-value estimates; so it does not mean having accuracy-dominated truth-value estimates; so it does not mean having a doxastic state that is epistemic-value-dominated.

Finally, one might level a criticism similar to Meacham and Weisberg's (2011, pp. 19–20) criticism of Lyle Zynda. Zynda is a proponent of the unary measurement-theoretic account (Zynda, 2000, pp. 66–68).17 On Zynda's view, there are comparative beliefs (agents are more confident in some propositions than others) but there are no additional modes or types of doxastic judgment. To the extent that we countenance talk of fully believing a proposition, or believing something much more strongly than something else, this had better ultimately reduce to talk about comparative beliefs.
Meacham and Weisberg object that comparative beliefs lack the structure required to explain everything about choice and inference that we would like to explain. So even if the measurement theorist provides some reason to expect rational agents to have probabilistic credences, the background picture of the basic stock of doxastic attitudes available to such agents is too impoverished for their arguments to cut much ice.

For example, if we buy the unary measurement-theoretic account, then the well-known problem of interpersonal utility comparisons rears its head as a problem of interpersonal credal comparisons. Just as it makes no sense to say that Ashan desires chocolate ice cream more strongly than Bilal does, on the measurement-theoretic account (since there is no common scale on which their preferences are measured), similarly it makes no sense to say that Ashan is more confident that it will rain than Bilal is. But, at least in certain cases, it seems that we need such facts to explain choice behaviour. Why did Ashan grab his umbrella but Bilal did not? One possible explanation: both are more confident than not that it will rain, but Ashan is more confident than Bilal. On the unary measurement-theoretic account, such explanations are unavailable, Meacham and Weisberg argue. More generally:

    the extra-ordinal structure contained in the standard Bayesian picture of degrees of belief is not idle. Magnitudes encode important features of our degrees of belief, and if we abandon this structure, degrees of belief lose much of their utility. (Meacham & Weisberg, 2011, p. 20)

17 Like Maher, Zynda subscribes to the thesis of the primacy of practical reason (cf., Zynda 2000, p. 55). Credence functions are numerical measures of comparative beliefs. But preferences are the real thing. Comparative beliefs reduce to preferences.

You might not think that there is much to this line of criticism.
For example, Ashan might think that rain is just as likely as picking a black ball at random from an urn containing 99 black balls and 1 white ball. Bilal, in contrast, might think that rain is just as likely as picking a black ball at random from an urn containing 51 black balls and 49 white balls.18 These individual comparative belief facts help to explain why Ashan grabbed his umbrella but Bilal did not at least as well as the purported interpersonal fact that Ashan is more confident than Bilal. It is not obvious, then, that there is any genuine problem of interpersonal credal comparisons to resolve.

Even if you do think there is something to this line of criticism, note that it is not an objection to the measurement-theoretic account of credence per se. It is only an objection to the unary measurement-theoretic account. A pluralist faces no such problems. Of course, in answering the normative question, a pluralist cannot simply appeal to Scott's theorem. Scott's theorem only shows that comparative belief relations with certain properties are probabilistically representable. The pluralist must appeal to a representation theorem that shows that a more comprehensive system of doxastic attitudes with certain properties is probabilistically representable. But there is no principled reason for thinking that such representation theorems are not forthcoming.

6 Evaluating the decision-theoretic view

On the decision-theoretic view, the principal theoretical role of an agent's credal state is to encode her fair buying and selling prices. Recall, an agent's fair buying price for a gamble G is the largest amount B(G) that she could pay for G without making herself worse off. She pays B(G), receives G, and is no worse than the status quo, in her own view. Her fair selling price for G is the smallest amount S(G) that someone else would have to pay her in exchange for G to avoid being worse off.
She receives S(G), commits to shelling out G's payoff, and is no worse than the status quo, in her own view.

18 Both de Finetti (1931) and Koopman (1940a) use "partition axioms" to extract quantitative information from belief relations in roughly this way. For a recent approach along these lines, see Elliott (2018). You might also model agents as having comparative estimation relations, as explored in §2.3. Comparative estimation relations allow for a much richer and explanatorily powerful set of doxastic attitudes than comparative belief relations.

Gambles are measurable quantities $G : \Omega \to \mathbb{R}$. For simplicity, we will assume that $|\Omega| = n$, and treat gambles as vectors in $\mathbb{R}^n$. When we model a gamble as a vector $G = \langle g_1, \ldots, g_n \rangle$, we do so by specifying the net effect $g_i$ that the gamble has on our agent's level of total wealth in world $w_i$. For example, suppose you let £100 ride on red at the roulette table. Let $w_1, \ldots, w_i$ be the worlds in which the ball lands on red (you net £100), and $w_{i+1}, \ldots, w_n$ be the worlds in which it does not (you net −£100). Then we model your gamble as follows:

\[ G = \langle \underbrace{100, \ldots, 100}_{i \text{ times}},\ \underbrace{-100, \ldots, -100}_{(n-i) \text{ times}} \rangle . \]

For any proposition $X \in \mathcal{F}$, we model a unit gamble on X by the characteristic vector $x = \langle x_1, \ldots, x_n \rangle$ of X, i.e., the vector with $x_i = 1$ if $w_i \in X$ and $x_i = 0$ if $w_i \notin X$. And for any $a \in \mathbb{R}$, we model the "constant gamble" that pays out £a in every world by the constant vector $a = \langle a, \ldots, a \rangle$.

Following Walley (1991), we can specify an agent's fair buying and selling prices using sets of almost-desirable gambles. Say that a gamble G is almost desirable for an agent iff she weakly prefers G to $\langle 0, \ldots, 0 \rangle$, i.e., the status quo. Let $D \subseteq \mathbb{R}^n$ be the set of gambles that she finds almost desirable. Now we can specify her fair buying and selling price for G (B(G) and S(G), respectively) in terms of D. Let

\[ B(G) = \sup \{\, a \mid G - a \in D \,\}. \]
Taking the gamble G − a is equivalent to paying £a for G. So B(G) is the largest amount that she could pay for G while leaving herself in a position that she weakly prefers to the status quo, i.e., her fair buying price for G. Likewise, let

\[ S(G) = \inf \{\, a \mid a - G \in D \,\}. \]

Taking the gamble a − G is equivalent to receiving £a and shelling out G's payoff. So S(G) is the smallest amount that someone else would have to pay her in exchange for G while leaving herself in a position that she weakly prefers to the status quo, i.e., her fair selling price for G.

Talk of both fair buying and fair selling prices is actually a bit redundant. Note that

\[ -B(-G) = -\sup \{\, a \mid -G - a \in D \,\} = \inf \{\, -a \mid -G - a \in D \,\} = \inf \{\, a \mid -G + a \in D \,\} = S(G). \]

Taking −G from someone (they shell out −G's payoff to you) is nothing more than you offering G to them (you shell out G's payoff to them). And paying a negative amount to someone for some good is really nothing more than them paying you a positive amount (and vice versa: taking a negative amount is nothing more than you paying a positive amount). The smaller the positive amount that they pay you, the bigger the negative amount you pay them. So the negative of the biggest amount that you would pay to take −G, i.e., −B(−G), is just another way of describing the smallest amount that you would need to be paid to offer G. We will just talk of your fair buying prices henceforth. But these really capture both your fair buying and selling prices.

Say that a set E of real-valued functions $e : \mathbb{R}^n \to \mathbb{R}$ encodes your fair buying prices iff its lower envelope for G,

\[ \underline{E}[G] = \inf \{\, e(G) \mid e \in E \,\}, \]

is equal to B(G) when B(G) is defined, and is undefined when it is not. Say that a set of probability functions C encodes your fair buying prices just in case its corresponding set of expectation operators $E_C = \{\, E_c \mid c \in C \,\}$ does so:

\[ \underline{E}_C[G] = \inf \{\, E_c[G] \mid c \in C \,\}. \]
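As a rough illustration of how a set of probability functions might encode prices, the sketch below computes the lower envelope of expectations for a finite credal set and checks the identity S(G) = −B(−G) derived above. The three-world space, the two probability vectors, and the function names are hypothetical choices made for the example.

```python
from fractions import Fraction as F

# Assumed credal set C over three worlds: each tuple is a probability vector.
CREDAL_SET = [
    (F(1, 2), F(1, 4), F(1, 4)),
    (F(1, 4), F(1, 2), F(1, 4)),
]

def expectation(c, gamble):
    """Expected payoff E_c[G] of a gamble (a payoff vector) under c."""
    return sum(p * g for p, g in zip(c, gamble))

def lower_envelope(credal_set, gamble):
    """inf { E_c[G] | c in C }: the buying price B(G) that C would encode."""
    return min(expectation(c, gamble) for c in credal_set)

def selling_price(credal_set, gamble):
    """S(G), computed via the identity S(G) = -B(-G)."""
    negated = tuple(-g for g in gamble)
    return -lower_envelope(credal_set, negated)

G = (100, -100, 0)  # net +100 in w1, -100 in w2, nothing in w3
buy = lower_envelope(CREDAL_SET, G)
sell = selling_price(CREDAL_SET, G)
upper = max(expectation(c, G) for c in CREDAL_SET)

print(buy, sell)      # -25 25
print(sell == upper)  # the selling price is the upper envelope of expectations
```

With more than one probability function in the set, the buying and selling prices can come apart, as they do here; for a singleton set they coincide with the ordinary expectation.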
Finally, say that your fair buying prices are probabilistic iff some set of probability functions encodes them.

How does the decision-theoretic account answer the characterisation question? When exactly is there a real-valued function c (or a set of such functions C) that encodes your fair buying and selling prices? Answer: always. Say that a real-valued function $e : \mathbb{R}^n \to \mathbb{R}$ dominates your fair buying prices iff e(G) ≥ B(G) whenever B(G) is defined. Let $E^*$ be the set of real-valued functions that dominate your fair buying prices, i.e.,

\[ E^* = \{\, e \mid e(G) \geq B(G) \text{ if } B(G) \text{ is defined} \,\}. \]

Then $E^*$ encodes your fair buying prices, whatever they are. Hence $E^*$ counts as "your credal state" according to the decision-theoretic view. So there are no demanding constraints that an agent must satisfy in order to have credences, on this view. Having credences is dead easy. And clearly it is perfectly possible to have non-probabilistic credences.

So much for the characterisation question. How does the decision-theoretic account answer the normative question? Why should we expect rational agents to have probabilistic credences? The story here is considerably more tricky.

One might expect standard Dutch book arguments to provide an answer. De Finetti (1964) shows that for a specific sort of agent, one whose fair buying prices are equal to her fair selling prices, i.e., B(G) = S(G), having non-probabilistic fair buying prices renders her Dutch bookable (susceptible to sure loss at the hands of a clever bettor). One can see essentially the same result by considering (Walley, 1991, 3.3.3a). Walley shows that an agent's fair buying prices are not Dutch bookable (avoid sure loss) iff they are dominated by the expectation operator of some probability function.
And in the special case under consideration, where fair buying prices equal fair selling prices, one's fair buying prices are dominated in this way just in case they are probabilistic, i.e., encoded by some set of probabilities C. The upshot: in this special case an agent is not Dutch bookable (avoids sure loss) iff there is some set of probabilities C that encodes her fair buying prices, and hence counts as "her credal state." So if rationality requires avoiding sure loss, then we have good reason to expect this very special kind of agent to have probabilistic credences.

You might hope, then, that such a Dutch book argument could show quite generally that rational agents have probabilistic credences. But your hopes would be in vain. An agent avoids sure loss iff there is some set of probabilities C whose expectations for gambles uniformly dominate her fair buying prices for those gambles, i.e., $E_c[G] \geq B(G)$ for all c ∈ C and all gambles G. When an agent's fair buying and selling prices come apart, this can happen even when there is no set of probabilities $C^*$ that actually encodes her fair buying prices.19

Bottom line: non-Dutch-bookability (avoiding sure loss) does not require having probabilistic credences. Having probabilistic credences, in the decision-theoretic sense (i.e., some set of probabilities that encodes your fair buying prices), is equivalent to something stronger than non-Dutch-bookability: what Walley calls "coherence." Your fair buying prices are coherent iff they satisfy the following axioms.

1. Accept Sure Gains. $B(G) \geq \inf G$.
2. Homogeneity. $B(\lambda G) = \lambda B(G)$ for $\lambda \geq 0$.
3. Superlinearity. $B(G + G^*) \geq B(G) + B(G^*)$.

Axiom 1 forbids you from paying at most £1 for G when G is guaranteed to pay off either £2, £3, or £4, for example. It says that your maximum buying price for G must be at least £2.
Axiom 2 says that your fair buying price for a gamble G that is guaranteed to pay 2 (or 10, or 58.97) times another gamble $G^*$ should be 2 (or 10, or 58.97) times your fair buying price for $G^*$. Axiom 3 says that your fair buying price for a package of bets should be at least as great as the sum of your fair buying prices for each of the bets in the package.

19 Consider, for example, an agent whose fair buying price for any gamble G is $\inf G - \varepsilon$. For any non-constant G,

\[ B(G) = \inf G - \varepsilon < \sup G - \varepsilon = -\inf(-G) - \varepsilon < -\inf(-G) + \varepsilon = S(G). \]

But clearly $B(G) < \underline{E}_C[G]$ for any set of probability functions C. So the lower envelope of $E_C$ dominates her fair buying prices. Hence she avoids sure loss. But no such C encodes her fair buying prices.

To reiterate: coherence is strictly stronger than avoiding sure loss. Walley (1991, Section 2.4) provides examples of fair buying prices that avoid sure loss (are not Dutch bookable), but nevertheless are not coherent. (Every coherent set of fair buying prices, in contrast, avoids sure loss.) So Dutch book or sure loss considerations do not give us good reason to think that, quite generally, rational agents have probabilistic credences.

All is not lost, though. Even if Dutch book arguments don't do the trick, another argument might. For example, in the spirit of Icard (2016) and Fishburn (1986, p. 338), we might propose constraints of rationality governing how one's comparative beliefs and preferences, or judgments of almost-desirability, ought to hang together. In particular, we might suggest that the set D of gambles that an agent finds almost desirable (i.e., that she weakly prefers to the status quo) ought to be exactly the set $D_{\succeq}$ of gambles that are almost desirable relative to her comparative belief relation ≽.

Belief-Preference Coherence. $D = D_{\succeq}$.

Recall, a gamble G is almost desirable relative to ≽ iff it is a non-negative linear combination of components $(X_1 - Y_1), \ldots, (X_n - Y_n)$ which are such that $X_i \succeq Y_i$.
G is a non-negative linear combination of $(X_1 - Y_1), \ldots, (X_n - Y_n)$ just in case $G = \sum_i \lambda_i (X_i - Y_i)$ for some $\lambda_1, \ldots, \lambda_n \geq 0$. The basic thought here is this. $X_i - Y_i$ is the gamble that pays out £1 if $X_i$ is true and −£1 if $Y_i$ is true (so it nets £0 if both are true). You ought to weakly prefer this to the status quo iff you are at least as confident that $X_i$ is true as $Y_i$. Moreover, you ought to think that any package of such bets, even if their stakes are scaled up or down by a positive constant, is almost-desirable; you ought to weakly prefer it to the status quo. And nothing more. Your comparative beliefs give you no reason to determinately prefer any other gamble to the status quo.

Now suppose that rationality not only demands that comparative beliefs and preferences hang together as per Belief-Preference Coherence, but that it also demands that comparative beliefs on their own satisfy the Generalised Finite-Cancellation axiom.

Generalised Finite-Cancellation. If

\[ X_1 + \ldots + X_n + \underbrace{A + \ldots + A}_{k \text{ times}} = Y_1 + \ldots + Y_n + \underbrace{B + \ldots + B}_{k \text{ times}} \]

and $X_i \succeq Y_i$ for all i ≤ n, then $A \preceq B$.

Perhaps pragmatic considerations other than Dutch book or sure loss considerations establish that rational comparative beliefs satisfy GFC.20 Or perhaps epistemic considerations establish this. Perhaps, for example, comparative beliefs that satisfy GFC epistemic-utility-dominate ones that do not, or something of the sort. For now, let's just leave an IOU for the justification of GFC. Supposing that rational comparative beliefs satisfy GFC, we can now provide some reason to think that quite generally rational agents have probabilistic credences.

1. Coherence Constraints. Any rational agent's comparative beliefs satisfy GFC. Moreover, her comparative beliefs and preferences, or judgments of almost-desirability, jointly satisfy Belief-Preference Coherence.
2. Theory of Credence.
A set of real-valued functions C counts as "your credal state" just in case it encodes your fair buying prices.
3. IP-Representability Theorem. Relation ≽ satisfies GFC iff ≽ is IP-representable, i.e., ≽ fully agrees with some set of probability functions C.
4. Bridge Theorem. If ≽ is IP-representable and satisfies Belief-Preference Coherence, then the maximal set of probability functions $C^*$ that fully agrees with ≽ also encodes your fair buying prices, and hence counts as "your credal state." (Premise 2, Appendix)
C. Probabilism. Any rational agent has probabilistic credences. (From 1, 3, and 4)

Does this argument trivialise probabilism? Of course not. It relies on the decision-theoretic account of credence, a substantive, highly non-trivial thesis. Moreover, even if we simply grant the decision-theoretic account of credence, it is no trivial consequence that rational agents have probabilistic credences. Having credences is easy. You have credences whatever your fair buying prices are. But having probabilistic credences requires satisfying some demanding axioms (Belief-Preference Coherence, GFC). Establishing that these axioms encode genuine constraints of rationality is non-trivial. As a result, establishing probabilism is non-trivial.

20 Icard (2016) shows that an agent who satisfies Belief-Preference Coherence avoids sure loss iff her comparative beliefs are strongly representable. Strong representability is weaker than GFC. So we need something other than sure loss considerations to show that rational comparative beliefs satisfy GFC.

You might be concerned that our little argument only establishes half of probabilism. It shows that rational agents have probabilistic credences. But it does not show that rational agents have only probabilistic credences. Indeed, it cannot do so. We gave a recipe earlier for constructing a credal state for any agent. Just take the set of real-valued functions that dominate her fair buying prices.
This set will encode those prices, and so count as "her credal state," on the decision-theoretic view. But it is not a set of probability functions.

But this converse thesis, viz., that for any rational agent, no set of real-valued functions with non-probabilistic members counts as "her credal state," is not theoretically interesting on the decision-theoretic view. The benefits of having probabilistic credences (avoiding sure loss, coherence) accrue to any agent whose fair buying prices are encoded by some set of probabilities. Whether or not some non-probabilistic set encodes them as well is neither here nor there. Nothing of theoretical import hangs on it.

Finally, you might once again complain that the decision-theoretic account presupposes that rational agents only have comparative beliefs; no additional modes or types of doxastic judgment. But this stock of basic doxastic attitudes is too sparse. It is insufficient to explain everything about choice and inference that we would like to explain. So arguments that presuppose it are weak.

A similar response to the one in Section 5 will suffice. There is good reason to think this criticism lacks punch. And even if you buy the criticism, it is not an objection to the decision-theoretic account of credence per se. It is only an objection to the unary variant of this account. A pluralist faces no such problems. Of course, a pluralist must say more about how other types of doxastic attitudes, not just comparative beliefs, ought to hang together with judgments of almost-desirability. In addition, she must provide a more sophisticated IP-representability theorem and bridge theorem. But these are not in-principle problems. They are requests to cash in an IOU.

What does the scorecard look like? Whether the decision-theoretic account provides a compelling answer to the normative question depends in part on whether those IOUs can be replaced by theorems. But there is no special reason to think this task cannot be done.
In addition, epistemic utility theorists, e.g., Joyce (1998, 2009) and Pettigrew (2016), worry that this story provides an incomplete picture of our reasons to have probabilistic credences. A complete picture would provide a purely epistemic rationale for having imprecise credences.21 Nevertheless, some form of the argument presented here might help illuminate some of our reasons for having probabilistic credences.

21 Epistemic utility theorists get off the boat early by rejecting the decision-theoretic account of credence.

7 Evaluating the epistemic interpretivist view

On the epistemic interpretivist view, a function (or set of functions) counts as "your credal state" just in case it encodes truth-value estimates that best rationalise or make sense of your comparative beliefs, understood as irreducibly-doxastic attitudes. More formally, a function $c : \mathcal{F} \to \mathbb{R}$ (or a set of such functions C) counts as "your credal state" iff its truth-value estimates best rationalise your comparative beliefs ≽ (or on the pluralist version: your comparative and qualitative opinions more generally). For c (or C) to best rationalise ≽, it must satisfy two conditions:

(i) c must recommend ≽ as strongly as possible, so that no other $c^*$ (or set $C^*$) recommends ≽ to a higher degree;
(ii) c must be closer to rational (closer to the set T of rational assignments of truth-value estimates) than any other $c^*$ that recommends ≽ as strongly as possible; this ensures that c provides the highest quality recommendation possible.

To illustrate this view, consider two concrete cases. In case 1, you have comparative beliefs over propositions in the following Boolean algebra:

\[ \mathcal{F} = \{\, \emptyset, \{w_1\}, \{w_2\}, \{w_1, w_2\} \,\}. \]

In particular, your comparative beliefs are given by:

\[ \emptyset \prec \{w_1\} \prec \{w_2\} \prec \{w_1, w_2\}. \]
Question: which function c (or set C) encodes truth-value estimates (or constraints on estimates) that best rationalise your comparative beliefs ≽, and hence counts as "your credal state," according to epistemic interpretivism? To provide a concrete answer, we will need to make a few substantive assumptions about recommendation, rationality, and the like.

In Section 3.3, we outlined three accounts of recommendation (the metaphysical, normative, and epistemic utility accounts) which aim to explain when and how an assignment of truth-value estimates c (or set of assignments C) recommends comparative beliefs ≽ more or less strongly. For simplicity, we will assume the metaphysical account in what follows. Recall, on the metaphysical account, c (or C) recommends ≽ as strongly as possible just in case explicitly judging c (or C) to encode the best (constraints on) estimates of truth-values metaphysically entails having precisely the comparative beliefs given by ≽. Moreover, we will assume that judging c to be best entails having a specific set of comparative beliefs, viz., the comparative beliefs $\succeq_c$ that fully agree with c. Likewise, we will assume that judging a set of assignments C (constraints on truth-value estimates) to be best entails having the comparative beliefs $\succeq_C$ that fully agree with C.

Finally, we will assume that the set T of rational assignments of truth-value estimates is fairly inclusive. In particular, we will adopt the radical subjective Bayesian assumption that T is the set of probability functions. Likewise, the rational constraints on truth-value estimates (imprecise truth-value estimates) are just the sets of probability functions.

Back to our question then. Which function c (or set C) encodes (constraints on) truth-value estimates that best rationalise your comparative beliefs ≽, and hence counts as "your credal state," according to epistemic interpretivism?
Firstly, note that any probability function c with $0.5 < c(\{w_2\}) < 1$ is such that

\[ c(\emptyset) < c(\{w_1\}) < c(\{w_2\}) < c(\{w_1, w_2\}), \]

and hence fully agrees with ≽. So any such c recommends ≽ as strongly as possible, on the metaphysical account. Secondly, note that each such c is probabilistically coherent. So it is maximally rational, according to our radical subjective Bayesian assumption. Hence any of these probability functions counts as "your credal state," according to epistemic interpretivism.

Similarly, note that any set of probability functions C with $0.5 < c(\{w_2\}) < 1$ for all c ∈ C fully agrees with ≽. So any such C recommends ≽ as strongly as possible, on the metaphysical account. And each such C is maximally rational, according to our radical subjective Bayesian assumption. So any of these sets of probability functions C counts as "your credal state," according to epistemic interpretivism.

On a pluralist version of epistemic interpretivism, according to which your credal state does not simply best rationalise your comparative beliefs, but rather best rationalises a broader set of comparative and qualitative opinions, we might be able to winnow down the set of candidate credal states more than this. Likewise, on a more sophisticated version of epistemic interpretivism according to which your credal state does not simply best rationalise your current opinions, but rather is part of a package that best rationalises your opinions over time, we might be able to winnow down this set even further. But as it stands, epistemic interpretivism allows for a lot of slack in what counts as your credences. It allows for a great many ties between maximally eligible credal states. This, however, is as it should be.
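The claim about case 1 can be checked by brute force. The sketch below, whose grid of candidate values and helper names are assumptions for illustration, parameterises each probability function on the four-element algebra by p = c({w2}) and tests whether it fully agrees with the ordering ∅ ≺ {w1} ≺ {w2} ≺ {w1, w2}.

```python
from fractions import Fraction

def probability_function(p):
    """Probability function on the four-element algebra with c({w2}) = p."""
    return {"empty": Fraction(0), "w1": 1 - p, "w2": p, "w1w2": Fraction(1)}

def fully_agrees_with_case_1(c):
    """True iff c respects  empty < {w1} < {w2} < {w1, w2}  strictly."""
    return c["empty"] < c["w1"] < c["w2"] < c["w1w2"]

# Candidate values for c({w2}) on a grid of twentieths in [0, 1].
grid = [Fraction(k, 20) for k in range(21)]
agreeing = [p for p in grid if fully_agrees_with_case_1(probability_function(p))]

# Every agreeing candidate lies strictly between 1/2 and 1.
print(all(Fraction(1, 2) < p < 1 for p in agreeing))
print(min(agreeing), max(agreeing))  # 11/20 19/20
```

Every grid point strictly between 1/2 and 1 passes the check, which is the slack in candidate credal states that the text describes.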
Given what epistemic interpretivists take the principal theoretical role of credal states to be (their job is to best rationalise your comparative and qualitative opinions) and given how few comparative beliefs you actually have, various sets of truth-value estimates (and constraints on such estimates) play that role equally well.

Consider one more concrete case. In case 2, your comparative beliefs are given by:

\[ \emptyset \approx \{w_1\} \approx \{w_2\} \prec \{w_1, w_2\}. \]

Which function c (or set C) encodes (constraints on) truth-value estimates that best rationalise your comparative beliefs ≽, and hence counts as "your credal state," according to epistemic interpretivism?

To answer this question, first note that your comparative belief relation is clearly not probabilistically representable. No probability function fully agrees with ≽: additivity would require $c(\{w_1\}) + c(\{w_2\}) = 1$, yet full agreement would require both to equal $c(\emptyset) = 0$. Nor is it imprecisely representable. So no probability function c, or set of probability functions C, recommends ≽ as strongly as possible. But there are non-probabilistic assignments of truth-value estimates that fully agree with ≽, and hence recommend it as strongly as possible. In particular, any c with

\[ c(\emptyset) = c(\{w_1\}) = c(\{w_2\}) = x, \qquad c(\{w_1, w_2\}) = y \]

and x < y is such that

\[ c(\emptyset) = c(\{w_1\}) = c(\{w_2\}) < c(\{w_1, w_2\}), \]

and hence fully agrees with ≽. So any such c recommends ≽ as strongly as possible, on the metaphysical account.

But which one of these provides the highest quality recommendation of ≽? That is, which one is closest to rational? Since the rational assignments of truth-value estimates are exactly the probability functions (by assumption), the question really is: which of these assignments of truth-value estimates is closest to probabilistically coherent (i.e., closest to the set of all probability functions)? To answer this question, we need to plump for some measure of "closeness" or "proximity." One natural choice: squared Euclidean distance.
Squared Euclidean distance is the "Bregman divergence" generated by a very popular measure of accuracy, viz., the Brier Score. It captures one attractive way of thinking about how close two sets of truth-value estimates are in terms of how similar their degree of accuracy is expected to be. If we plump for squared Euclidean distance as our measure of "closeness" or "proximity," then the assignment of truth-value estimates that best rationalises your comparative beliefs is given by:

\[ c(\emptyset) = c(\{w_1\}) = c(\{w_2\}) = 1/3, \qquad c(\{w_1, w_2\}) = 1. \]

So this function c counts as "your credal state," according to epistemic interpretivism. What's more, assuming that supersets $\{c, c^*, \ldots\}$ of $\{c\}$ with the $c^*$ all strictly less rational than c are themselves less rational than $\{c\}$, we have that c is the unique function that counts as "your credal state."

So much for the nuts and bolts of epistemic interpretivism. What about the characterisation question? When exactly is there a real-valued function c (or a set of such functions C) that encodes truth-value estimates which best rationalise your comparative beliefs, and hence counts as "your credal state"?

It depends. It depends which account of recommendation you plump for. It depends which assignments of truth-value estimates you count as rational. It depends how you measure proximity to the set of rational assignments of truth-value estimates. Different commitments on these fronts will yield different answers to the question of which truth-value estimates rationalise which comparative beliefs.

But for concreteness, suppose we stick with the metaphysical account and radical subjective Bayesian assumption we made earlier. In that case, whenever your comparative beliefs ≽ fully agree with a real-valued function c, that function will recommend ≽ as strongly as possible.
So some such function will best rationalise ⪰ (or near enough).[22] And a comparative belief relation ⪰ fully agrees with a real-valued function c if and only if ⪰ satisfies Transitivity and Totality (Krantz et al., 1971, Theorem 1). So some real-valued function c best rationalises ⪰, and hence counts as "your credal state," whenever ⪰ satisfies Transitivity and Totality.

Conversely, if ⪰ violates either Transitivity or Totality, then no single real-valued function fully agrees with ⪰. Hence no single function recommends ⪰ as strongly as possible. If there is some set of real-valued functions that fully agrees with ⪰, then that set recommends ⪰ more strongly than any single function, on the metaphysical view. So while no single function will count as "your credal state," some set (the most rational one that fully agrees with ⪰) will do so (or near enough). It is an interesting question when exactly there is such a set of functions (not necessarily probability functions) that fully agrees with ⪰.

The overarching lesson here is a familiar one. Your comparative beliefs need not satisfy Scott's axioms (axioms which might seem rather demanding on their face) in order for you to count as having credences, according to epistemic interpretivism. Given our working assumptions (the metaphysical account, etc.), such axioms encode necessary and sufficient conditions for having precise probabilistic credences. But it is perfectly possible to have non-probabilistic credences. As we saw in case 2, if your comparative beliefs are given by ∅ ≈ {w1} ≈ {w2} ≺ {w1, w2}, then the following non-probabilistic real-valued function c counts as "your credal state," according to the epistemic interpretivist: c(∅) = c({w1}) = c({w2}) = 1/3, c({w1, w2}) = 1. Your comparative beliefs only need to satisfy relatively weak constraints for you to count as having credences tout court.

[22] For any ε > 0, we can pick some c that recommends ⪰ such that any other c∗ that does so as well is no more than ε-closer to coherent.
On to the normative question then. Why should we expect rational agents to have probabilistic credences? The epistemic interpretivist's answer to the normative question is much more complicated than our previous accounts' answers. For the epistemic interpretivist, the normative question breaks into (at least) three subquestions.

1. Why should we expect that for any rational agent, there is some probability function that fully agrees with her comparative beliefs?
2. Why should we expect it to be the case that the rational assignments of truth-value estimates are exactly the probability functions?
3. Why should we expect any probability function that fully agrees with comparative beliefs ⪰ to recommend ⪰ as strongly as possible?

Scott's theorem tells us that comparative beliefs ⪰ fully agree with some probability function c iff ⪰ satisfies Non-Triviality, Non-negativity, Totality, and Isovalence. So answering question 1 amounts to defending the claim that rational comparative beliefs satisfy Non-Triviality, Non-negativity, Totality, and Isovalence. Of course, you might doubt whether "structural axioms" like Totality encode genuine constraints of rationality. In that case, the epistemic interpretivist might defend the Generalised Finite-Cancellation axiom and argue that rational agents have imprecise probabilistic credences.

To answer question 2, the epistemic interpretivist might rely on results from epistemic utility theory. For example, she might endorse an austere conception of rationality, according to which rationality requires you to prefer one assignment of truth-value estimates b to another c just in case b is guaranteed to be more accurate than c. Then she might appeal to Joyce (1998, 2009), Predd et al.
(2009), Schervish, Seidenfeld, and Kadane (2009), and Pettigrew (2016), who show that any non-probabilistic b is accuracy-dominated by some probabilistic c, i.e., the truth-value estimates encoded by c are guaranteed to be more accurate than those encoded by b. No probabilistic c, in contrast, is even weakly accuracy-dominated. So the probability functions are exactly the rational assignments of truth-value estimates (not rationally dispreferred to any other assignment), on the austere conception of rationality.

Finally, to answer question 3, the epistemic interpretivist must defend an account of recommendation and various auxiliary claims. For example, a proponent of the epistemic utility account must defend various substantive claims about the nature of epistemic value. In particular, she must defend the claim that on any reasonable measure of epistemic utility for comparative beliefs, all probability functions expect the comparative belief relations that fully agree with them to have maximal epistemic utility.

We will not provide answers to questions 1–3 here. But if the epistemic interpretivist can answer them satisfactorily, then she can offer up something like the following argument for probabilism.

1. Coherence Constraints. Any rational agent's comparative belief relation ⪰ satisfies Non-Triviality, Non-negativity, Totality, and Isovalence.
2. Theory of Credence. An assignment of truth-value estimates c counts as "your credal state" iff it best rationalises your comparative beliefs ⪰. Moreover, c best rationalises ⪰ iff (i) it recommends ⪰ as strongly as possible, and (ii) c is itself closer to rational than any other c∗ that recommends ⪰ as strongly as possible.
3. Accuracy Argument. An assignment of truth-value estimates c is rational iff c is a probability function. (Accuracy-dominance theorem, austere conception of rationality)
4. Scott's Theorem. Relation ⪰ satisfies Non-Triviality, Non-negativity, Totality, and Isovalence if and only if ⪰ fully agrees with some probability function c.
5. Theory of Recommendation. An assignment of truth-value estimates c recommends ⪰ to degree k iff the maximally rational extension of c to Q, est_c, is such that est_c(U(⪰)) = k.
6. Bridge Theorem I. If ⪰ fully agrees with a probability function c, then c recommends ⪰ as strongly as possible. (Premise 5, austere conception of rationality, auxiliary assumptions about epistemic utility)
7. Bridge Theorem II. If ⪰ fully agrees with a probability function c, then c not only recommends ⪰ as strongly as possible, but is also rational. Hence c counts as "your credal state." (From 2, 3 and 6)
C. Probabilism. Any rational agent has probabilistic credences. (From 1, 4 and 7)

Does this argument trivialise probabilism? Obviously not! It is positively baroque! And unlike proponents of the measurement-theoretic and decision-theoretic accounts, the epistemic interpretivist can plausibly argue that rational agents have only probabilistic credences. To see this, suppose that your comparative beliefs are given by the rational comparative belief relation ⪰. By premise 1, ⪰ satisfies Non-Triviality, Non-negativity, Totality, and Isovalence. So by premises 4 and 7, there is some probability function c that fully agrees with ⪰, and hence best rationalises ⪰. Now, while many non-probabilistic assignments of truth-value estimates b will also fully agree with ⪰, and hence recommend ⪰ as strongly as possible, none will provide as high a quality recommendation for ⪰. Hence none will best rationalise ⪰. The reason: any non-probabilistic b is accuracy-dominated by some probabilistic c, and hence less rational than c (premise 3). So b provides a weaker rationale for ⪰ than c. Hence b fails to count as "your credal state," according to the epistemic interpretivist.
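The appeal to accuracy dominance in premise 3 can be illustrated with the non-probabilistic assignment from case 2. The sketch below is in Python and rests on assumptions of its own: inaccuracy is measured by the (unnormalised) Brier score mentioned earlier, and the dominating probability function (0, 1/2, 1/2, 1) is simply one convenient choice. It compares the two assignments world by world.

```python
from fractions import Fraction as F

# Propositions in a fixed order: ∅, {w1}, {w2}, {w1, w2}.
TRUTH_VALUES = {
    "w1": (0, 1, 0, 1),
    "w2": (0, 0, 1, 1),
}

def brier_inaccuracy(estimates, world):
    """Sum of squared differences between estimates and truth-values at a world."""
    truth = TRUTH_VALUES[world]
    return sum((e - t) ** 2 for e, t in zip(estimates, truth))

def strictly_dominates(c, b):
    """True if c is strictly less inaccurate than b at every world."""
    return all(brier_inaccuracy(c, w) < brier_inaccuracy(b, w) for w in TRUTH_VALUES)

b = (F(1, 3), F(1, 3), F(1, 3), F(1))  # non-probabilistic assignment from case 2
c = (F(0), F(1, 2), F(1, 2), F(1))     # a probability function (illustrative choice)

for w in TRUTH_VALUES:
    print(w, brier_inaccuracy(b, w), brier_inaccuracy(c, w))
print(strictly_dominates(c, b))
```

At each world b has inaccuracy 2/3 while c has inaccuracy 1/2, so c is guaranteed to be more accurate than b. This is only a single instance of the dominance results cited above, not a proof of them.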
Finally, you might return to the complaint that this account presupposes that rational agents only have comparative beliefs, an overly austere and explanatorily deficient stock of doxastic attitudes. But again, this is only an objection to a unary version of epistemic interpretivism (to the extent that it has bite at all). It has no force against pluralist variants. Of course, pluralist variants face a range of unanswered questions. When exactly does a set of truth-value estimates rationalise a more comprehensive system of doxastic attitudes (comparative beliefs, full beliefs, opinions about evidential and causal dependence and independence, etc.)? Why should we expect reasonable measures of epistemic utility for these more comprehensive systems to satisfy a suitably generalised version of strict propriety? And when exactly are these more comprehensive systems of doxastic attitudes probabilistically representable? But these are new research questions bubbling up on the boundary of an active research programme. There is no principled reason for thinking that they do not have adequate answers.

Before wrapping up, it is worth highlighting one additional virtue of epistemic interpretivism. At the outset, we mentioned a Bayesian platitude about credences. Joyce puts the platitude as follows: "in the probabilistic tradition, the defining fact about credences is that they are used to estimate quantities that depend on truth-values" (Joyce, 2009, pp. 268–9). A rational agent's credences determine expectations of measurable quantities (quantities like the size of the deficit 10 years hence, or the utility of an outcome), which capture her best estimates of those quantities. Those best estimates, in turn, typically rationalise or make sense of her evaluative attitudes and choice behaviour. Shorter: credences capture estimates that provide rationalising explanations.
Epistemic interpretivism is much better positioned than the measurement-theoretic or decision-theoretic views to vindicate this platitude. On the measurement-theoretic view, credence functions are just mappings from propositions to real numbers that preserve the structure of your comparative beliefs. They do not encode estimates, or any other quantity that might plausibly play a role in rationalising your doxastic attitudes, evaluative attitudes, or choice behaviour. On the decision-theoretic view, credence functions are just numerical systems that encode your fair buying and selling prices. But having fair buying and selling prices is nothing over and above having certain kinds of preferences. So they are hardly fit to rationalise preferences.[23]

In contrast, epistemic interpretivism directly identifies your credence function c with the assignment of truth-value estimates that best rationalises your comparative beliefs ⪰. And however we spell out what it is for truth-value estimates to best rationalise comparative beliefs, we can apply a similar story to preferences and choice behaviour. Consider, for example, the epistemic utility account of recommendation from Section 3.3. On this view, we start with c, and we add estimates of other measurable quantities Q to the stock of truth-value estimates encoded by c in the most rational way possible. In particular, we add estimates of the epistemic utility of comparative belief relations. The larger the estimate of ⪰'s epistemic utility, the more strongly c recommends ⪰. We can tell exactly the same story about how your credences rationalise preferences and choice behaviour. We start with the truth-value estimates c that best rationalise your comparative beliefs, and we add estimates of the value of actions, for example, in the most rational way possible. The larger the estimate of an action's value, the more strongly c recommends it. In turn, the more strongly it rationalises choosing that action.
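As a rough illustration of this last step, suppose c is a probability function, so that (in line with the Bayesian platitude above) the estimate it determines for a quantity is that quantity's expectation. The Python sketch below uses made-up worlds, credences and action values purely for illustration; none of the names or numbers come from the text.

```python
def estimate(credence, quantity):
    """Expectation of a world-dependent quantity under a probability function."""
    return sum(credence[w] * quantity[w] for w in credence)

# Hypothetical two-world example; every number here is an illustrative assumption.
credence = {"w1": 0.7, "w2": 0.3}
action_values = {
    "give_directly": {"w1": 10.0, "w2": 4.0},
    "fund_infrastructure": {"w1": 6.0, "w2": 8.0},
}

estimates = {action: estimate(credence, values) for action, values in action_values.items()}
# The action with the larger estimated value is the one c recommends more strongly.
recommended = max(estimates, key=estimates.get)
print(estimates, recommended)
```

With these invented numbers the estimate for giving directly exceeds the estimate for funding infrastructure, so the former is the more strongly recommended action.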
The moral: epistemic interpretivism appears to have the resources to vindicate core tenets of Bayesianism that other accounts have trouble with.

[23] For a similar criticism of behaviourism, see Joyce (1999, Section 1.3).

8 Concluding Remarks

Many Bayesians take comparative belief to be crucial for spelling out what it is to have a degree of confidence, or degree of belief, or credence. And they typically appeal to representation theorems when answering foundational questions about credence. We have explored three different accounts (measurement-theoretic, decision-theoretic, and epistemic interpretivist) that utilise comparative beliefs and representation theorems in order to answer two such questions: the characterisation question, i.e., when exactly an agent counts as having credences, and the normative question, i.e., why we should expect rational agents to have probabilistic credences. Hájek (2009), Meacham and Weisberg (2011), and Titelbaum (2015) pose some pressing challenges to accounts of this sort: that they set the bar for having credences so high that very few real agents clear it, that they trivialise probabilism, and so on. But we found that suitably sophisticated versions of each of our three accounts handle these challenges fairly well. There is more work to be done in filling these accounts out. But wholesale scepticism about the role of comparative belief and representation theorems in providing an account of credence seems premature.

9 Appendix

Choose any comparative belief structure $\langle \Omega, \mathcal{F}, \succeq \rangle$ with finite $\mathcal{F}$. Assume without loss of generality that $|\Omega| = n$.

Theorem 1 (Generalised Scott's Theorem) Suppose $\succeq$ satisfies the following two conditions.

1. Non-triviality. $\Omega \succ \emptyset$.
2. Non-negativity. $X \succeq \emptyset$ for all $X \in \mathcal{F}$.

Then the following two conditions are equivalent.

3. Isovalence. If $X_1 + \dots + X_t = Y_1 + \dots + Y_t$ and $X_i \succeq Y_i$ for all $i \leq t$, then $X_i \preceq Y_i$ for all $i \leq t$ as well.
4. Strong representability.
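There exists a probability function $p : \mathcal{F} \to \mathbb{R}$ that strongly agrees with $\succeq$ in the sense that: (i) $X \succeq Y \Rightarrow p(X) \geq p(Y)$, (ii) $X \succ Y \Rightarrow p(X) > p(Y)$.

Proof. Let

\[ A = \Big\{ \sum_i \lambda_i (X_i - Y_i) \;\Big|\; \lambda_i \geq 0 \text{ and } X_i \succeq Y_i \Big\} \]

and

\[ U = \Big\{ \sum_i \lambda_i (Y_i - X_i) \;\Big|\; \lambda_i \geq 0,\ \sum_i \lambda_i = 1, \text{ and } X_i \succ Y_i \Big\}. \]

First we will show that $A \cap U = \emptyset$ iff $\succeq$ satisfies Isovalence. Then we will show that if $\succeq$ satisfies Non-Triviality and Non-Negativity, then $A \cap U = \emptyset$ iff $\succeq$ is strongly representable.

Suppose that $\succeq$ satisfies Isovalence. So if $X_1 + \dots + X_t = Y_1 + \dots + Y_t$ and $X_i \succeq Y_i$ for all $i \leq t$, then $X_i \preceq Y_i$ for all $i \leq t$ as well. Suppose for reductio that $A \cap U \neq \emptyset$. So there is some $G \in A \cap U$. Hence

\[ G = \sum_{i \leq m} \lambda_i (X_i - Y_i) = \sum_{i \leq k} \delta_i (B_i - A_i), \]

where $\lambda_i \geq 0$ and $X_i \succeq Y_i$ for all $i \leq m$; likewise $\delta_i \geq 0$, $\sum_i \delta_i = 1$, and $A_i \succ B_i$ for all $i \leq k$. So

\[ \lambda_1 (X_1 - Y_1) + \dots + \lambda_m (X_m - Y_m) + \delta_1 (A_1 - B_1) + \dots + \delta_k (A_k - B_k) = 0. \]

Let $X_i = \langle x^i_1, \dots, x^i_n \rangle$. Likewise for $Y_i$, $A_i$, and $B_i$. Then the equality above gives us a system of $n$ homogeneous linear equations with rational coefficients:

\[ \begin{aligned} (x^1_1 - y^1_1)\lambda_1 + \dots + (x^m_1 - y^m_1)\lambda_m + (a^1_1 - b^1_1)\delta_1 + \dots + (a^k_1 - b^k_1)\delta_k &= 0, \\ &\ \,\vdots \\ (x^1_n - y^1_n)\lambda_1 + \dots + (x^m_n - y^m_n)\lambda_m + (a^1_n - b^1_n)\delta_1 + \dots + (a^k_n - b^k_n)\delta_k &= 0. \end{aligned} \]

Since this system of equations has rational coefficients, it has a rational solution if it has any solution, by Gauss' method. So we can rewrite

\[ \lambda_1 (X_1 - Y_1) + \dots + \lambda_m (X_m - Y_m) + \delta_1 (A_1 - B_1) + \dots + \delta_k (A_k - B_k) = 0 \]

as

\[ \frac{\alpha_1}{\beta_1}(X_1 - Y_1) + \dots + \frac{\alpha_m}{\beta_m}(X_m - Y_m) + \frac{\varphi_1}{\psi_1}(A_1 - B_1) + \dots + \frac{\varphi_k}{\psi_k}(A_k - B_k) = 0, \]

where the $\alpha_i$ and $\varphi_i$ are non-negative integers and the $\beta_i$ and $\psi_i$ are positive integers. Multiplying through and rearranging gives us

\[ \begin{aligned} &(\alpha_1 \beta_2 \cdots \beta_m \psi_1 \cdots \psi_k)(X_1 - Y_1) + \dots + (\alpha_m \beta_1 \cdots \beta_{m-1} \psi_1 \cdots \psi_k)(X_m - Y_m) \\ &\quad + (\varphi_1 \psi_2 \cdots \psi_k \beta_1 \cdots \beta_m)(A_1 - B_1) + \dots + (\varphi_k \psi_1 \cdots \psi_{k-1} \beta_1 \cdots \beta_m)(A_k - B_k) = 0. \end{aligned} \]

This in turn gives us

\[ \begin{aligned} &(\alpha_1 \beta_2 \cdots \beta_m \psi_1 \cdots \psi_k) X_1 + \dots + (\alpha_m \beta_1 \cdots \beta_{m-1} \psi_1 \cdots \psi_k) X_m \\ &\quad + (\varphi_1 \psi_2 \cdots \psi_k \beta_1 \cdots \beta_m) A_1 + \dots + (\varphi_k \psi_1 \cdots \psi_{k-1} \beta_1 \cdots \beta_m) A_k \\ &= (\alpha_1 \beta_2 \cdots \beta_m \psi_1 \cdots \psi_k) Y_1 + \dots + (\alpha_m \beta_1 \cdots \beta_{m-1} \psi_1 \cdots \psi_k) Y_m \\ &\quad + (\varphi_1 \psi_2 \cdots \psi_k \beta_1 \cdots \beta_m) B_1 + \dots + (\varphi_k \psi_1 \cdots \psi_{k-1} \beta_1 \cdots \beta_m) B_k. \end{aligned} \]

Reading each integer coefficient as the number of times the corresponding event is repeated, this is an equation of the form that Isovalence covers. But recall, $X_i \succeq Y_i$ for all $i \leq m$, and $A_i \succeq B_i$ for all $i \leq k$. So by Isovalence we must have $X_i \preceq Y_i$ for all $i \leq m$ and $A_i \preceq B_i$ for all $i \leq k$. But since $A_i \succ B_i$ for all $i \leq k$, we have $A_i \not\preceq B_i$ for all $i \leq k$. Contradiction. Therefore $A \cap U = \emptyset$.

Conversely, suppose that $A \cap U = \emptyset$. Suppose for reductio that $\succeq$ violates Isovalence. So there are $X_1, \dots, X_t, Y_1, \dots, Y_t \in \mathcal{F}$ such that $X_1 + \dots + X_t = Y_1 + \dots + Y_t$, $X_i \succeq Y_i$ for all $i \leq t$, and $X_j \succ Y_j$ for some $j \leq t$. Assume without loss of generality that $X_i \approx Y_i$ for all $i \neq j$. Then

\[ \sum_{i \neq j} (X_i - Y_i) = Y_j - X_j. \]

Let $G = \sum_{i \neq j} (X_i - Y_i) = Y_j - X_j$. Then $G \in A \cap U$. Contradiction. Therefore $\succeq$ must satisfy Isovalence.

This establishes that $A \cap U = \emptyset$ iff $\succeq$ satisfies Isovalence. Now we will show that if $\succeq$ satisfies Non-Triviality and Non-Negativity, then $A \cap U = \emptyset$ iff $\succeq$ is strongly representable.

Suppose that $\succeq$ satisfies Non-Triviality and Non-Negativity. So $\Omega \succ \emptyset$ and $X \succeq \emptyset$ for all $X \in \mathcal{F}$. Now suppose that $A \cap U = \emptyset$. Note that $A$ is the closed, convex polyhedral cone generated by the set $\{X - Y \mid X \succeq Y\}$. Likewise, $U$ is the convex hull of $\{Y - X \mid X \succ Y\}$, which is a closed and convex set. So the hyperplane separation theorem of Kuhn and Tucker (1956, p. 50) guarantees that there is a linear functional $E$ that strictly separates $A$ and $U$ in the sense that

\[ E[G] \geq 0 \text{ for all } G \in A, \qquad E[G^*] < 0 \text{ for all } G^* \in U. \]

Since $\Omega \succ \emptyset$, $\emptyset - \Omega = -\Omega \in U$. Hence $E[-\Omega] < 0$, which is the case iff $E[\Omega] > 0$. Since $X \succeq \emptyset$ for all $X \in \mathcal{F}$, $X - \emptyset = X \in A$. Hence $E[X] \geq 0$. Now let

\[ p(X) = \frac{E[X]}{E[\Omega]} \]

for all $X \in \mathcal{F}$. Obviously $p$ satisfies Normalization and Non-negativity, since

\[ p(\Omega) = \frac{E[\Omega]}{E[\Omega]} = 1 \]

and

\[ p(X) \geq p(\emptyset) \iff \frac{E[X]}{E[\Omega]} \geq \frac{E[\emptyset]}{E[\Omega]} \iff E[X] \geq 0. \]

Moreover, if $X \cap Y = \emptyset$, then $X \cup Y = X + Y$. So

\[ p(X \cup Y) = \frac{E[X + Y]}{E[\Omega]} = \frac{E[X]}{E[\Omega]} + \frac{E[Y]}{E[\Omega]} = p(X) + p(Y). \]

So $p$ satisfies Finite Additivity. Hence $p$ is a probability function.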
And it follows straightforwardly that $p$ strongly agrees with $\succeq$:

\[ X \succeq Y \Rightarrow E[X - Y] \geq 0 \iff E[X] \geq E[Y] \iff p(X) \geq p(Y), \]

and

\[ X \succ Y \Rightarrow E[Y - X] < 0 \iff E[X] > E[Y] \iff p(X) > p(Y). \]

Conversely, suppose that $\succeq$ is strongly representable. So there is some probability function $p$ that strongly agrees with $\succeq$ in the sense that (i) $X \succeq Y \Rightarrow p(X) \geq p(Y)$, (ii) $X \succ Y \Rightarrow p(X) > p(Y)$. Suppose for reductio that $A \cap U \neq \emptyset$. So there is some $G \in A \cap U$. Hence

\[ G = \sum_{i \leq m} \lambda_i (X_i - Y_i) = \sum_{i \leq k} \delta_i (B_i - A_i), \]

where $\lambda_i \geq 0$ and $X_i \succeq Y_i$ for all $i \leq m$; likewise, $\delta_i \geq 0$, $\sum_i \delta_i = 1$, and $A_i \succ B_i$ for all $i \leq k$. Let

\[ E_p[V] = \sum_{w_i \in \Omega} p(w_i) v_i, \]

where $\Omega = \{w_1, \dots, w_n\}$ and $V = \langle v_1, \dots, v_n \rangle$. Once more let $X_i = \langle x^i_1, \dots, x^i_n \rangle$. Likewise for $Y_i$, $A_i$, and $B_i$. Then

\[ E_p[G] = \sum_{w_i \in \Omega} p(w_i) \Big[ \sum_{j \leq m} \lambda_j (x^j_i - y^j_i) \Big] = \sum_{j \leq m} \lambda_j \Big[ \sum_{w_i \in \Omega} p(w_i)(x^j_i - y^j_i) \Big] = \sum_{j \leq m} \lambda_j \big( p(X_j) - p(Y_j) \big). \]

Since $X_j \succeq Y_j$ for all $j \leq m$, $p(X_j) \geq p(Y_j)$. Hence $E_p[G] \geq 0$. But we also have

\[ E_p[G] = \sum_{w_i \in \Omega} p(w_i) \Big[ \sum_{j \leq k} \delta_j (b^j_i - a^j_i) \Big] = \sum_{j \leq k} \delta_j \Big[ \sum_{w_i \in \Omega} p(w_i)(b^j_i - a^j_i) \Big] = \sum_{j \leq k} \delta_j \big( p(B_j) - p(A_j) \big). \]

Since $A_j \succ B_j$ for all $j \leq k$, $p(A_j) > p(B_j)$. And since the $\delta_j$ are non-negative and sum to 1, it follows that $E_p[G] < 0$. Contradiction. Therefore $A \cap U = \emptyset$.

So far we have established that $A \cap U = \emptyset$ iff $\succeq$ satisfies Isovalence. Moreover, we have established that if $\succeq$ satisfies Non-Triviality and Non-Negativity, then $A \cap U = \emptyset$ iff $\succeq$ is strongly representable. This suffices to prove GST. □

Theorem 2 Suppose that $\succeq$ is IP-representable and satisfies Belief-Preference Coherence. Let $\mathcal{C}$ be the maximal set of probability functions that fully agrees with $\succeq$. Let $E_{\mathcal{C}} = \{E_c \mid c \in \mathcal{C}\}$. This $\mathcal{C}$ encodes your fair buying prices in the sense that

\[ E_{\mathcal{C}}[G] = \inf \{ E_c[G] \mid c \in \mathcal{C} \} \]

is equal to your fair buying price for $G$, $B(G)$, when $B(G)$ is defined, and is undefined when it is not.

Proof. Suppose that $\succeq$ is IP-representable. So there is a set of probability functions that fully agrees with it. Let $\mathcal{C}$ be the maximal set of probability functions that fully agrees with $\succeq$.
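So $X \succeq Y \iff c(X) \geq c(Y)$ for all $c \in \mathcal{C}$. And if $\mathcal{C}^*$ fully agrees with $\succeq$, then $\mathcal{C}^* \subseteq \mathcal{C}$.

First, note that $\mathcal{C}$ must be the set $\mathcal{B}$ of all probability functions $b$ that almost agree with $\succeq$:

\[ \mathcal{B} = \{ b \mid X \succeq Y \Rightarrow b(X) \geq b(Y) \}. \]

Obviously $\mathcal{C} \subseteq \mathcal{B}$. To see that $\mathcal{B} \subseteq \mathcal{C}$, choose $b \in \mathcal{B}$. Suppose for reductio that $b \notin \mathcal{C}$.

Case 1. For all $X, Y \in \mathcal{F}$, if $c(X) \geq c(Y)$ for all $c \in \mathcal{C}$, then $b(X) \geq b(Y)$. In that case, $\mathcal{C}^* = \mathcal{C} \cup \{b\}$ fully agrees with $\succeq$. But then $\mathcal{C}$ is not maximal. Contradiction.

Case 2. For some $X, Y \in \mathcal{F}$, $c(X) \geq c(Y)$ for all $c \in \mathcal{C}$, but $b(X) < b(Y)$. In that case, since $\mathcal{C}$ fully agrees with $\succeq$, $X \succeq Y$. But since $b$ almost agrees with $\succeq$, this implies $b(X) \geq b(Y)$. Contradiction.

Hence $\mathcal{B} = \mathcal{C}$.

Second, note that since $\succeq$ is IP-representable, $\succeq$ satisfies Non-Triviality and Non-Negativity. That is, $\Omega \succ \emptyset$ and $X \succeq \emptyset$ for all $X \in \mathcal{F}$. Now suppose that $\succeq$ also satisfies Belief-Preference Coherence. So the set $\mathcal{A}$ of gambles that our agent finds almost desirable (i.e., that she weakly prefers to the status quo) is exactly the set $A$ of gambles that are almost desirable relative to $\succeq$:

\[ \mathcal{A} = A = \Big\{ \sum_i \lambda_i (X_i - Y_i) \;\Big|\; \lambda_i \geq 0 \text{ and } X_i \succeq Y_i \Big\}. \]

Our agent's fair buying price for a gamble $G$ is

\[ B(G) = \sup \{ a \mid G - a \in \mathcal{A} \}. \]

Our aim is to show that $E_{\mathcal{C}}[G] = B(G)$ if $B(G)$ is defined, and undefined if not. We will start by first showing that $\mathcal{A} = A^*$ where

\[ A^* = \{ G \mid E_c[G] \geq 0 \text{ for all } c \in \mathcal{C} \}. \]

Suppose that $G \in \mathcal{A}$. So $G = \sum_{i \leq m} \lambda_i (X_i - Y_i)$, where $\lambda_i \geq 0$ and $X_i \succeq Y_i$ for all $i \leq m$. Again let $X_i = \langle x^i_1, \dots, x^i_n \rangle$. Likewise for $Y_i$. Choose $c \in \mathcal{C}$. Then

\[ E_c[G] = \sum_{w_i \in \Omega} c(w_i) \Big[ \sum_{j \leq m} \lambda_j (x^j_i - y^j_i) \Big] = \sum_{j \leq m} \lambda_j \Big[ \sum_{w_i \in \Omega} c(w_i)(x^j_i - y^j_i) \Big] = \sum_{j \leq m} \lambda_j \big( c(X_j) - c(Y_j) \big). \]

Since $X_j \succeq Y_j$ for all $j \leq m$, $c(X_j) \geq c(Y_j)$. So $E_c[G] \geq 0$. Therefore $E_c[G] \geq 0$ for all $c \in \mathcal{C}$. So $G \in A^*$.

Now suppose that $G \in A^*$. Suppose for reductio that $G \notin \mathcal{A}$. Note that $\mathcal{A}$ ($= A$) is the closed, convex polyhedral cone generated by the set $\{X - Y \mid X \succeq Y\}$. So the hyperplane separation theorem of Kuhn and Tucker (1956, p.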
50) guarantees that there is a linear functional $E$ that strictly separates this point $G \notin \mathcal{A}$ from $\mathcal{A}$ in the sense that $E[V] \geq 0$ for all $V \in \mathcal{A}$, but $E[G] < 0$. The proof of Theorem 1 shows how to use $E$ to construct a probability function $b$ that almost (indeed, strongly) agrees with $\succeq$. And $b$ is such that

\[ E_b[V] = \sum_{w_i \in \Omega} b(w_i) v_i = \sum_{w_i \in \Omega} \frac{E[w_i]}{E[\Omega]} v_i = \frac{E[V]}{E[\Omega]} \]

for any gamble $V$. So $E_b[V] \geq 0$ iff $E[V] \geq 0$. In particular, then, since $E[G] < 0$, $E_b[G] < 0$ as well. But since $b$ almost agrees with $\succeq$, $b \in \mathcal{B} = \mathcal{C}$. Since $E_b[G] < 0$, $G \notin A^*$. Contradiction.

This establishes that $\mathcal{A} = A^*$. Now we will show that $E_{\mathcal{C}}[G] = B(G)$ if $B(G)$ is defined, and undefined if not.

\[ \begin{aligned} E_{\mathcal{C}}[G] &= \inf \{ E_c[G] \mid c \in \mathcal{C} \} \\ &= \sup \{ a \mid E_c[G] \geq a \text{ for all } c \in \mathcal{C} \} \\ &= \sup \{ a \mid E_c[G - a] \geq 0 \text{ for all } c \in \mathcal{C} \} \\ &= \sup \{ a \mid G - a \in A^* \} \\ &= \sup \{ a \mid G - a \in \mathcal{A} \} \\ &= B(G). \end{aligned} \]

□

Theorem 3 A relation $\succeq$ strongly agrees with a real-valued function $c$ if and only if $\succeq$ satisfies weak transitivity:

Weak Transitivity. If $X \succeq Y_1 \succeq \dots \succeq Y_n \succeq Z$, then $X \not\prec Z$.

Proof. The left-to-right direction is trivial. So suppose that $\succeq$ satisfies weak transitivity. For any $X \in \mathcal{F}$, let

\[ \Phi_X = \{X\} \cup \{ Z \mid X \succeq Y_1 \succeq \dots \succeq Y_n \succeq Z \text{ for some } Y_1, \dots, Y_n \in \mathcal{F} \}. \]

Let $c : \mathcal{F} \to \mathbb{R}$ be defined by $c(X) = |\Phi_X|$. We must show: (i) $A \succeq B \Rightarrow c(A) \geq c(B)$, (ii) $A \succ B \Rightarrow c(A) > c(B)$.

Assume that $A \succeq B$. Choose $Z \in \Phi_B$. Either $Z = B$ or $B \succeq Y_1 \succeq \dots \succeq Y_n \succeq Z$ for some $Y_1, \dots, Y_n \in \mathcal{F}$. So either $A \succeq Z$ or $A \succeq B \succeq Y_1 \succeq \dots \succeq Y_n \succeq Z$. Either way, $Z \in \Phi_A$. Hence $\Phi_B \subseteq \Phi_A$. As a result $c(A) = |\Phi_A| \geq |\Phi_B| = c(B)$.

Now suppose that $A \succ B$. As before, $\Phi_B \subseteq \Phi_A$. But now note that while $A \in \Phi_A$, $A \notin \Phi_B$. To see this, suppose for reductio that $A \in \Phi_B$. Then either $A = B$ or $B \succeq Y_1 \succeq \dots \succeq Y_n \succeq A$ for some $Y_1, \dots, Y_n \in \mathcal{F}$. If $A = B$, then $A \succ A$, i.e., $A \succeq A$ but $A \not\succeq A$. Contradiction. If $B \succeq Y_1 \succeq \dots \succeq Y_n \succeq A$, then by weak transitivity, $B \not\prec A$, which contradicts $A \succ B$. So $A \in \Phi_A$ but $A \notin \Phi_B$. Hence $\Phi_B \subset \Phi_A$. As a result $c(A) = |\Phi_A| > |\Phi_B| = c(B)$.
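The construction used in the proof of Theorem 3 is easy to run on a small example. The sketch below is in Python and makes some assumptions of its own: the relation is supplied as a finite set of ordered pairs (X, Y) meaning X ⪰ Y, chains of length one count (so Z ∈ Φ_X whenever X ⪰ Z, as the proof itself requires), and the three-element relation at the bottom is a made-up example that is neither total nor transitive. All function names are hypothetical.

```python
def phi(x, weakly_above):
    """Φ_X: X together with everything reachable from X via chains X ⪰ ... ⪰ Z."""
    reached, frontier = {x}, [x]
    while frontier:
        current = frontier.pop()
        for (a, b) in weakly_above:
            if a == current and b not in reached:
                reached.add(b)
                frontier.append(b)
    return reached

def representing_function(items, weakly_above):
    """c(X) = |Φ_X|, as in the proof of Theorem 3."""
    return {x: len(phi(x, weakly_above)) for x in items}

def strongly_agrees(c, weakly_above):
    """Check (i) X ⪰ Y ⇒ c(X) ≥ c(Y) and (ii) X ≻ Y ⇒ c(X) > c(Y)."""
    for (a, b) in weakly_above:
        strict = (b, a) not in weakly_above  # a ≻ b: a ⪰ b but not b ⪰ a
        if c[a] < c[b] or (strict and c[a] <= c[b]):
            return False
    return True

items = ["A", "B", "C"]
relation = {("A", "B"), ("B", "C")}  # hypothetical; A ⪰ C is not asserted
c = representing_function(items, relation)
print(c, strongly_agrees(c, relation))
```

On this example the construction assigns 3, 2 and 1 to A, B and C, which strongly agrees with the relation even though the relation is neither total nor transitive. Adding the pair (C, A) would create a cycle that violates weak transitivity, and the same construction would then fail the check.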
□

References

Abellan, J. & Moral, S. (2000). A non-specificity measure for convex sets of probability distributions. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 8, 357–367.
Abellan, J. & Moral, S. (2003). Maximum entropy for credal sets. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 11, 587–597.
Adams, E. W. (1965). Elements of a theory of inexact measurement. Philosophy of Science, 32, 205–228.
Alon, S. & Lehrer, E. (2014). Subjective multi-prior probability: A representation of a partial likelihood relation. Journal of Economic Theory, 151, 476–92.
Augustin, T., Coolen, F., de Cooman, G., & Troffaes, M. (2014). Introduction to imprecise probabilities. Wiley Series in Probability and Statistics. Wiley.
Clark, A. (2015). Embodied prediction. In T. Metzinger & J. M. Windt (Eds.), Open mind (Vol. 7, T). Frankfurt am Main: MIND Group.
de Finetti, B. (1931). Sul significato soggettivo della probabilità. Fundamenta Mathematicae, 17, 298–329.
de Finetti, B. (1951). La "logica del plausibile" secondo la concezione di Polya. In Atti della xlii riunione (pp. 227–236). Rome: Societa Italiana per il Progresso delle Scienze.
de Finetti, B. (1964). Foresight: Its logical laws, its subjective sources (1937). In H. E. Kyburg Jr. & H. E. Smokler (Eds.), Studies in subjective probability (Vol. 7, pp. 93–158). Wiley.
de Finetti, B. (1974). Theory of probability. Wiley.
Deza, M. & Deza, E. (2009). Encyclopedia of distances. Heidelberg: Springer.
Domotor, Z. (1969). Probabilistic relational structures and their applications (tech. rep. No. 144). Institute for Mathematical Studies in the Social Sciences, Stanford University.
Earman, J. (1992). Bayes or bust? A critical examination of Bayesian confirmation theory. Cambridge: MIT Press.
Elliott, E. (2018). Comparativism and the measurement of partial belief. Ms.
Eriksson, L. & Hájek, A. (2007). What Are Degrees of Belief?
Studia Logica: An International Journal for Symbolic Logic, 86(2), 183–213.
Fishburn, P. C. (1969). Weak qualitative probability on finite sets. Annals of Mathematical Statistics, 40, 2118–2126.
Fishburn, P. C. (1986). The axioms of subjective probability. Statistical Science, 1(3), 335–345.
Fitelson, B. & McCarthy, D. (2015). Toward an epistemic foundation for comparative confidence. Ms.
Gillies, D. (2000). Varieties of propensity. British Journal for the Philosophy of Science, 51, 807–835.
Good, I. J. (1950). Probability and the weighing of evidence. London: Griffin.
Hájek, A. (2003). What Conditional Probability Could Not Be. Synthese, 137(3), 273–323.
Hájek, A. (2009). Argument for-or against-probabilism? In F. Huber & C. Schmidt-Petri (Eds.), Degrees of belief (Vol. 342). Springer.
Hájek, A. & Joyce, J. M. (2008). Confirmation. In S. Psillos & M. Curd (Eds.), The Routledge companion to philosophy of science (Chap. 11, pp. 115–128). Routledge.
Harrison-Trainor, M., Holliday, W. H., & Icard, T. F. (2016). A note on cancellation axioms for comparative probability. Theory and Decision, 80(1), 159–166.
Hitchcock, C. (2012). Cause and chance. Ms.
Icard, T. (2016). Pragmatic considerations on comparative probability. Philosophy of Science, 83, 348–370.
Jeffrey, R. (1965/1983). The logic of decision (2nd). University of Chicago Press.
Jeffrey, R. (1965). The logic of decision. University of Chicago Press.
Jeffrey, R. (1987). Indefinite probability judgment: A reply to Levi. Philosophy of Science, 54(4), 586–591.
Jeffrey, R. (2002). Subjective probability: The real thing. Cambridge: Cambridge University Press.
Jeffreys, H. (1961). Theory of probability (3rd). Oxford: Clarendon Press.
Joyce, J. M. (1998). A nonpragmatic vindication of probabilism. Philosophy of Science, 65(4), 575–603.
Joyce, J. M. (1999). The foundations of causal decision theory. Cambridge University Press.
Joyce, J. M. (2005). How probabilities reflect evidence.
Philosophical Perspectives, 19, 153–178.
Joyce, J. M. (2009). Accuracy and coherence: Prospects for an alethic epistemology of partial belief. In F. Huber & C. Schmidt-Petri (Eds.), Degrees of belief (Vol. 342). Dordrecht: Springer.
Joyce, J. M. (2010). A defense of imprecise credences in inference and decision making. Philosophical Perspectives, 24(1), 281–323.
Kahneman, D. & Tversky, A. (1979). Prospect theory: An analysis of decision under risk. Econometrica, 47(2), 263–91.
Kaplan, M. (2010). In defense of modest probabilism. Synthese, 176(1), 41–55.
Koopman, B. O. (1940a). The axioms of algebra of intuitive probability. Annals of Mathematics, 41(2), 269–292.
Koopman, B. O. (1940b). The bases of probability. Bulletin of the American Mathematical Society, 46, 763–774.
Kraft, C. H., Pratt, J. W., & Seidenberg, A. (1959). Intuitive probability on finite sets. The Annals of Mathematical Statistics, 30(2), 408–419.
Krantz, D., Luce, R. D., Suppes, P., & Tversky, A. (1971). Foundations of measurement vol. i: Additive and polynomial representations. New York: Academic Press.
Kuhn, H. & Tucker, A. (Eds.). (1956). Linear inequalities and related systems. Princeton University Press.
Kyburg, H. & Pittarelli, M. (1996). Set-based Bayesianism. IEEE Transactions on Systems, Man, and Cybernetics, 26, 324–339.
Lewis, D. (1974). Radical interpretation. Synthese, 23, 331–344.
Lichtenstein, S. & Slovic, P. (1971). Reversals of preferences between bids and choices in gambling decisions. Journal of Experimental Psychology, 89, 46–55.
Lichtenstein, S. & Slovic, P. (1973). Response-induced reversals of preference in gambling decisions: An extended replication in Las Vegas. Journal of Experimental Psychology, 101(1), 16–20.
Maher, P. (1993). Betting on theories. Cambridge University Press.
Mahtani, A. (2019). Imprecise probabilities. In R. Pettigrew & J. Weisberg (Eds.), The open handbook of formal epistemology. PhilPapers.
Meacham, C. & Weisberg, J. (2011).
Representation theorems and the foundations of decision theory. Australasian Journal of Philosophy, 89(4), 641–663.
Paris, J. (2011). Pure inductive logic. In L. Horsten & R. Pettigrew (Eds.), The continuum companion to philosophical logic (pp. 428–449). London: Continuum International Publishing Group.
Pedersen, A. P. (2014). Comparative expectations. Studia Logica, 102, 811–848.
Pettigrew, R. (2016). Accuracy and the laws of credence. Oxford: Oxford University Press.
Predd, J., Seiringer, R., Lieb, E. H., Osherson, D., Poor, H. V., & Kulkarni, S. R. (2009). Probabilistic coherence and proper scoring rules. IEEE Transactions on Information Theory, 55(10), 4786–4792.
Quaeghebeur, E. (2014). Introduction to imprecise probabilities. In T. Augustin, F. Coolen, G. de Cooman, & M. Troffaes (Eds.), (Chap. 1, pp. 1–27). Wiley.
Ramsey, F. P. (1931). Truth and probability. In The foundations of mathematics and other logical essays. New York: Humanities Press.
Rényi, A. (1955). On a new axiomatic theory of probability. Acta Mathematica Academiae Scientiarum Hungaricae, 6, 286–335.
Rios Insua, D. (1992). On the foundations of decision making under partial information. Theory and Decision, 33(1), 83–100.
Savage, L. (1954). The foundations of statistics. Wiley.
Schervish, M., Seidenfeld, T., & Kadane, J. (2009). Proper scoring rules, dominated forecasts, and coherence. Decision Analysis, 6(4), 202–221.
Scott, D. (1964). Measurement structures and linear inequalities. Journal of Mathematical Psychology, 1(2), 233–247.
Staffel, J. (2018). Unsettled thoughts: A theory of degrees of rationality. Oxford: Oxford University Press.
Suppes, P. (1994). Qualitative theory of subjective probability. In G. Wright & P. Ayton (Eds.), Subjective probability (pp. 17–37). Chichester, UK: John Wiley and Sons.
Suppes, P. & Zanotti, M. (1976). Necessary and sufficient conditions for existence of a unique measure strictly agreeing with a qualitative probability ordering.
Journal of Philosophical Logic, 5(3), 431–438.
Suppes, P. & Zanotti, M. (1982). Necessary and sufficient qualitative axioms for conditional probability. Z. Wahrsch. Verw. Gebiete, 60, 163–169.
Titelbaum, M. (2015). Fundamentals of Bayesian epistemology. Oxford: Oxford University Press.
Troffaes, M. C. M. & de Cooman, G. (Eds.). (2014). Lower previsions. Wiley.
Walley, P. (1991). Statistical reasoning with imprecise probabilities. New York: Chapman and Hall.
Walley, P. (2000). Towards a unified theory of imprecise probability. International Journal of Approximate Reasoning, 24(2–3), 125–148.
Zynda, L. (2000). Representation theorems and realism about degrees of belief. Philosophy of Science, 67(1), 45–69.