We discuss several features of coherent choice functions, where the admissible options in a decision problem are exactly those that maximize expected utility for some probability/utility pair in a fixed set S of probability/utility pairs. In this paper we consider, primarily, normal form decision problems under uncertainty, where only the probability component of S is indeterminate and utility for two privileged outcomes is determinate. Coherent choice distinguishes between each pair of sets of probabilities regardless of the "shape" or "connectedness" of the sets of probabilities. We axiomatize the theory of choice functions and show these axioms are necessary for coherence. The axioms are sufficient for coherence using a set of probability/almost-state-independent utility pairs. We give sufficient conditions under which a choice function satisfying our axioms is represented by a set of probability/state-independent utility pairs with a common utility.
This essay is, primarily, a discussion of four results about the principle of maximizing entropy (MAXENT) and its connections with Bayesian theory. Result 1 provides a restricted equivalence between the two: where the Bayesian model for MAXENT inference uses an "a priori" probability that is uniform, and where all MAXENT constraints are limited to 0-1 expectations for simple indicator-variables. The other three results report on an inability to extend the equivalence beyond these specialized constraints. Result 2 establishes a sensitivity of MAXENT inference to the choice of the algebra of possibilities even though all empirical constraints imposed on the MAXENT solution are satisfied in each measure space considered. The resulting MAXENT distribution is not invariant over the choice of measure space. Thus, old and familiar problems with the Laplacian principle of Insufficient Reason also plague MAXENT theory. Result 3 builds upon the findings of Friedman and Shimony (1971; 1973) and demonstrates the absence of an exchangeable, Bayesian model for predictive MAXENT distributions when the MAXENT constraints are interpreted according to Jaynes's (1978) prescription for his (1963) Brandeis Dice problem. Lastly, Result 4 generalizes the Friedman and Shimony objection to cross-entropy (Kullback-information) shifts subject to a constraint of a new odds-ratio for two disjoint events.
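To make the kind of constraint at issue concrete, here is a minimal sketch, under my own illustrative assumptions, of the MAXENT computation behind Jaynes's Brandeis Dice problem: among distributions over the faces 1–6 with a prescribed mean (Jaynes used 4.5), entropy is maximized by an exponentially tilted distribution, with the tilt chosen so that the mean constraint holds.

```python
# A minimal sketch (not from the essay) of the MAXENT solution to the Brandeis
# Dice problem: among distributions p over faces 1..6 with a fixed mean m, the
# entropy maximizer has p_i proportional to exp(lam * i), with lam chosen so
# the mean constraint holds. The target mean 4.5 follows Jaynes's statement of
# the problem; everything else here is illustrative.
import numpy as np

faces = np.arange(1, 7)

def maxent_dice(m, lo=-10.0, hi=10.0, tol=1e-12):
    """Return the maximum-entropy distribution over faces 1..6 with mean m."""
    def mean_for(lam):
        w = np.exp(lam * faces)
        p = w / w.sum()
        return p @ faces
    # mean_for is increasing in lam, so bisection finds the Lagrange multiplier.
    for _ in range(200):
        mid = (lo + hi) / 2
        if mean_for(mid) < m:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    w = np.exp(mid * faces)
    return w / w.sum()

p = maxent_dice(4.5)
print(np.round(p, 4), "mean =", round(float(p @ faces), 4))
```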
Gordon Belot argues that Bayesian theory is epistemologically immodest. In response, we show that the topological conditions that underpin his criticisms of asymptotic Bayesian conditioning are self-defeating. They require extreme a priori credences regarding, for example, the limiting behavior of observed relative frequencies. We offer a different explication of Bayesian modesty using a goal of consensus: rival scientific opinions should be responsive to new facts as a way to resolve their disputes. We also address Adam Elga's rebuttal to Belot's analysis, which focuses attention on the role that the assumption of countable additivity plays in Belot's criticisms.
For Savage (1954) as for de Finetti (1974), the existence of subjective (personal) probability is a consequence of the normative theory of preference. (De Finetti achieves the reduction of belief to desire with his generalized Dutch-Book argument for Previsions.) Both Savage and de Finetti rebel against legislating countable additivity for subjective probability. They require merely that probability be finitely additive. Simultaneously, they insist that their theories of preference are weak, accommodating all but self-defeating desires. In this paper we dispute these claims by showing that the following three cannot simultaneously hold: (i) Coherent belief is reducible to rational preference, i.e. the generalized Dutch-Book argument fixes standards of coherence. (ii) Finitely additive probability is coherent. (iii) Admissible preference structures may be free of consequences, i.e. they may lack prizes whose values are robust against all contingencies.
We review de Finetti’s two coherence criteria for determinate probabilities: coherence1, defined in terms of previsions for a set of events that are undominated by the status quo – previsions immune to a sure loss – and coherence2, defined in terms of forecasts for events undominated in Brier score by a rival forecast. We propose a criterion of IP-coherence2 based on a generalization of Brier score for IP-forecasts that uses 1-sided, lower and upper, probability forecasts. However, whereas Brier score is a strictly proper scoring rule for eliciting determinate probabilities, we show that there is no real-valued strictly proper IP-score. Nonetheless, with respect to either of two decision rules – Γ-maximin or E-admissibility + Γ-maximin – we give a lexicographic strictly proper IP-scoring rule that is based on Brier score.
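As a quick numerical reminder of what strict propriety means for a determinate probability (the property the abstract says no real-valued IP-score can have), the following sketch, with my own illustrative numbers, checks that the expected Brier penalty is uniquely minimized by announcing one's actual probability.

```python
# A minimal numerical check (not from the paper) that the Brier score is
# strictly proper for a determinate probability: if your probability for an
# event is p, your expected penalty p*(1-q)**2 + (1-p)*q**2 is uniquely
# minimized by announcing q = p. All numbers are illustrative.
import numpy as np

def expected_brier(p, q):
    """Expected Brier penalty for announcing q when the event has probability p."""
    return p * (1 - q) ** 2 + (1 - p) * q ** 2

p = 0.3
qs = np.linspace(0, 1, 1001)
best_q = qs[np.argmin(expected_brier(p, qs))]
print(best_q)  # ~0.3: honest announcement uniquely minimizes expected score
```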
A familiar argument advocates accommodating the so-called paradoxes of decision theory by abandoning the “independence” postulate. After all, if we grant that choice reveals preference, the anomalous choice patterns of the Allais and Ellsberg problems violate postulate P2 of Savage's system. The strategy of making room for new preference patterns by relaxing independence is adopted in each of the following works: Samuelson, Kahneman and Tversky's “Prospect Theory”, Allais and Hagen, Fishburn, Chew and MacCrimmon, McClennen, and in closely argued essays by Machina.
Can there be good reasons for judging one set of probabilistic assertions more reliable than a second? There are many candidates for measuring "goodness" of probabilistic forecasts. Here, I focus on one such aspirant: calibration. Calibration requires an alignment of announced probabilities and observed relative frequency, e.g., 50 percent of forecasts made with the announced probability of .5 occur, 70 percent of forecasts made with probability .7 occur, etc. To summarize the conclusions: (i) Surveys designed to display calibration curves, from which a recalibration is to be calculated, are useless without due consideration for the interconnections between questions (forecasts) in the survey. (ii) Subject to feedback, calibration in the long run is otiose. It gives no ground for validating one coherent opinion over another as each coherent forecaster is (almost) sure of his own long-run calibration. (iii) Calibration in the short run is an inducement to hedge forecasts. A calibration score, in the short run, is improper. It gives the forecaster reason to feign violation of total evidence by enticing him to use the more predictable frequencies in a larger finite reference class than that directly relevant.
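For readers unfamiliar with calibration curves, the following sketch, using synthetic forecasts of my own devising rather than anything from the paper, shows the basic bookkeeping: group forecasts by announced probability and compare each announced value with the observed relative frequency in that group.

```python
# A minimal sketch (my own illustration) of an empirical calibration check:
# group forecasts by announced probability and compare the announced value
# with the observed relative frequency in each group. The forecasts and
# outcomes below are synthetic and purely illustrative.
import numpy as np

rng = np.random.default_rng(0)
announced = rng.choice([0.3, 0.5, 0.7], size=3000)   # announced probabilities
outcomes = rng.random(3000) < announced               # simulate a calibrated forecaster

for q in np.unique(announced):
    hit_rate = outcomes[announced == q].mean()
    print(f"announced {q:.1f} -> observed relative frequency {hit_rate:.3f}")
```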
Let κ be an uncountable cardinal. Using the theory of conditional probability associated with de Finetti and Dubins, subject to several structural assumptions for creating sufficiently many measurable sets, and assuming that κ is not a weakly inaccessible cardinal, we show that each probability that is not κ-additive has conditional probabilities that fail to be conglomerable in a partition of cardinality no greater than κ. This generalizes our earlier result, where we established that each finitely additive but not countably additive probability has conditional probabilities that fail to be conglomerable in some countable partition.
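For reference, the standard statement of conglomerability in a partition, which I take to be the property at issue here, is the following.

```latex
% Conglomerability in a partition \pi (standard formulation, stated here for
% reference; see de Finetti and Dubins for the original statements):
% P is conglomerable in \pi if, for every event E,
\[
  \inf_{h \in \pi} P(E \mid h) \;\le\; P(E) \;\le\; \sup_{h \in \pi} P(E \mid h).
\]
% A failure of conglomerability means some event E has a probability lying
% outside the range of its conditional probabilities across the partition.
```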
De Finetti introduced the concept of coherent previsions and conditional previsions through a gambling argument and through a parallel argument based on a quadratic scoring rule. He showed that the two arguments lead to the same concept of coherence. When dealing with events only, there is a rich class of scoring rules which might be used in place of the quadratic scoring rule. We give conditions under which a general strictly proper scoring rule can replace the quadratic scoring rule while preserving the equivalence of de Finetti’s two arguments. In proving our results, we present a strengthening of the usual minimax theorem. We also present generalizations of de Finetti’s fundamental theorem of prevision to deal with conditional previsions.
The Sleeping Beauty problem has spawned a debate between “thirders” and “halfers” who draw conflicting conclusions about Sleeping Beauty's credence that a coin lands heads. Our analysis is based on a probability model for what Sleeping Beauty knows at each time during the experiment. We show that conflicting conclusions result from different modeling assumptions that each group makes. Our analysis uses a standard “Bayesian” account of rational belief with conditioning. No special handling is used for self-locating beliefs or centered propositions. We also explore what fair prices Sleeping Beauty computes for gambles that she might be offered during the experiment.
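The following Monte Carlo sketch is my own illustration (not the authors' probability model) of how the two familiar counting conventions generate the rival answers: counting heads per awakening tends to 1/3, while counting heads per run of the experiment tends to 1/2.

```python
# A Monte Carlo sketch (my own illustration, not the authors' model) of why
# modeling assumptions matter in the Sleeping Beauty problem. Heads -> one
# awakening (Monday); Tails -> two awakenings (Monday and Tuesday). Counting
# heads per awakening gives ~1/3; counting heads per experiment gives ~1/2.
import random

random.seed(1)
trials = 100_000
heads_awakenings = tails_awakenings = heads_runs = 0

for _ in range(trials):
    heads = random.random() < 0.5
    if heads:
        heads_runs += 1
        heads_awakenings += 1          # one awakening on heads
    else:
        tails_awakenings += 2          # two awakenings on tails

print("per awakening :", heads_awakenings / (heads_awakenings + tails_awakenings))  # ~1/3
print("per experiment:", heads_runs / trials)                                       # ~1/2
```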
Several axiom systems for preference among acts lead to a unique probability and a state-independent utility such that acts are ranked according to their expected utilities. These axioms have been used as a foundation for Bayesian decision theory and subjective probability calculus. In this article we note that the uniqueness of the probability is relative to the choice of what counts as a constant outcome. Although it is sometimes clear what should be considered constant, in many cases there are several possible choices. Each choice can lead to a different "unique" probability and utility. By focusing attention on state-dependent utilities, we determine conditions under which a truly unique probability and utility can be determined from an agent's expressed preferences among acts. Suppose that an agent's preference can be represented in terms of a probability P and a utility U. That is, the agent prefers one act to another iff the expected utility of that act is higher than that of the other. There are many other equivalent representations in terms of probabilities Q, which are mutually absolutely continuous with P, and state-dependent utilities V, which differ from U by possibly different positive affine transformations in each state of nature. We describe an example in which there are two different but equivalent state-independent utility representations for the same preference structure. They differ in which acts count as constants. The acts involve receiving different amounts of one or the other of two currencies, and the states are different exchange rates between the currencies. It is easy to see how it would not be possible for constant amounts of both currencies to have simultaneously constant values across the different states. Savage (1954, sec. 5.5) discovered a situation in which two seemingly equivalent preference structures are represented by different pairs of probability and utility. He attributed the phenomenon to the construction of a "small world." We show that the small world problem is just another example of two different, but equivalent, representations treating different acts as constants. Finally, we prove a theorem (similar to one of Karni 1985) that shows how to elicit a unique state-dependent utility and does not assume that there are prizes with constant value. To do this, we define a new hypothetical kind of act in which both the prize to be awarded and the state of nature are determined by an auxiliary experiment.
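A minimal numerical version of the two-currency example described above is sketched below; the exchange rates, payoffs, and probabilities are my own illustrative choices. It exhibits two equivalent representations of one preference ordering, one treating dollars as the constant-valued prize and the other treating pounds as constant, with different "unique" probabilities.

```python
# A numerical sketch of the two-currency example described above (my own
# illustrative numbers). States are exchange rates (dollars per pound). An act
# is specified by its dollar payoff in each state. Treating dollars as the
# constant-valued prize gives one probability/state-independent-utility pair;
# treating pounds as constant gives another; both represent the same
# preference ordering over acts.
import itertools

states = ["s1", "s2"]
rate = {"s1": 2.0, "s2": 0.5}        # dollars per pound in each state
P = {"s1": 0.5, "s2": 0.5}           # probability when dollars count as constant

acts = {
    "10 dollars for sure": {"s1": 10.0, "s2": 10.0},
    "10 pounds for sure":  {"s1": 10.0 * rate["s1"], "s2": 10.0 * rate["s2"]},
}

def eu_dollars(act):                  # utility linear and state-independent in dollars
    return sum(P[s] * act[s] for s in states)

# Equivalent representation: utility linear and state-independent in POUNDS,
# with a rescaled probability Q that is mutually absolutely continuous with P.
c = sum(P[s] * rate[s] for s in states)
Q = {s: P[s] * rate[s] / c for s in states}

def eu_pounds(act):
    return sum(Q[s] * (act[s] / rate[s]) for s in states)

for a, b in itertools.combinations(acts, 2):
    same_order = (eu_dollars(acts[a]) >= eu_dollars(acts[b])) == (eu_pounds(acts[a]) >= eu_pounds(acts[b]))
    print(a, "vs", b, "-> same ranking under both representations:", same_order)
print("P =", P, " Q =", {s: round(Q[s], 3) for s in states})
```

Because the pound-denominated expected utility is a positive multiple of the dollar-denominated one, the two representations rank all acts identically even though their probabilities (0.5, 0.5) and (0.8, 0.2) differ.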
This important collection of essays is a synthesis of foundational studies in Bayesian decision theory and statistics. An overarching topic of the collection is understanding how the norms for Bayesian decision making should apply in settings with more than one rational decision maker and then tracing out some of the consequences of this turn for Bayesian statistics. There are four principal themes to the collection: cooperative, non-sequential decisions; the representation and measurement of 'partially ordered' preferences; non-cooperative, sequential decisions; and pooling rules and Bayesian dynamics for sets of probabilities. The volume will be particularly valuable to philosophers concerned with decision theory, probability, and statistics, as well as to statisticians, mathematicians, and economists.
This paper (based on joint work with M. J. Schervish and J. B. Kadane) discusses some differences between the received theory of regular conditional distributions, which is the countably additive theory of conditional probability, and a rival theory of conditional probability using the theory of finitely additive probability. The focus of the paper is maximally "improper" conditional probability distributions, where the received theory requires, in effect, that P{a: P(a|a) = 0} = 1. This work builds upon the results of Blackwell and Dubins (1975).
Conditioning can make imprecise probabilities uniformly more imprecise. We call this effect "dilation". In a previous paper (1993), Seidenfeld and Wasserman established some basic results about dilation. In this paper we further investigate dilation in several models. In particular, we consider conditions under which dilation persists under marginalization and we quantify the degree of dilation. We also show that dilation manifests itself asymptotically in certain robust Bayesian models and we characterize the rate at which dilation occurs.
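A standard textbook-style example of dilation (not taken from the paper) is sketched below: two events with known marginal probability 1/2 but unknown dependence, where conditioning on either outcome of one event dilates the precise unconditional probability of the other to nearly the whole unit interval.

```python
# A minimal sketch of dilation (a standard illustrative example, not from the
# paper). Two events A and B each have known probability 1/2, but their
# dependence is unknown, so the credal set contains every joint distribution
# with those marginals. Unconditionally, P(B) = 1/2 exactly; yet after
# observing either A or not-A, the set of conditional probabilities for B
# dilates to (essentially) the whole interval [0, 1].
import numpy as np

cond_given_A, cond_given_notA = [], []
for t in np.linspace(0.001, 0.499, 500):      # t = P(A and B); marginals fixed at 1/2
    p_AB, p_AnotB, p_notAB, p_notAnotB = t, 0.5 - t, 0.5 - t, t
    cond_given_A.append(p_AB / (p_AB + p_AnotB))              # P(B | A)
    cond_given_notA.append(p_notAB / (p_notAB + p_notAnotB))  # P(B | not A)

print("P(B) = 0.5 for every member of the set")
print("P(B | A)     ranges over", (round(min(cond_given_A), 3), round(max(cond_given_A), 3)))
print("P(B | not A) ranges over", (round(min(cond_given_notA), 3), round(max(cond_given_notA), 3)))
```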
When real-valued utilities for outcomes are bounded, or when all variables are simple, it is consistent with expected utility to have preferences defined over probability distributions or lotteries. That is, under such circumstances two variables with a common probability distribution over outcomes – equivalent variables – occupy the same place in a preference ordering. However, if strict preference respects uniform, strict dominance in outcomes between variables, and if indifference between two variables entails indifference between their difference and the status quo, then preferences over rich sets of unbounded variables, such as variables used in the St. Petersburg paradox, cannot preserve indifference between all pairs of equivalent variables. In such circumstances, preference is not a function only of probability and utility for outcomes. Then the preference ordering is not defined in terms of lotteries.
The "traditional" view of normative decision theory, as reported (for example) in chapter 2 of Luce and RaiÃa's [1957] classic work, Games and Decisions, proposes a reduction of sequential decisions problems to non-sequential decisions: a reduction of extensive forms to normal forms. Nonetheless, this reduction is not without its critics, both from inside and outside expected utility theory, It islay purpose in this essay to join with those critics by advocating the following thesis.
We contrast three decision rules that extend Expected Utility to contexts where a convex set of probabilities is used to depict uncertainty: Γ-Maximin, Maximality, and E-admissibility. The rules extend Expected Utility theory as they require that an option is inadmissible if there is another that carries greater expected utility for each probability in a (closed) convex set. If the convex set is a singleton, then each rule agrees with maximizing expected utility. We show that, even when the option set is convex, this pairwise comparison between acts may fail to identify those acts which are Bayes for some probability in a convex set that is not closed. This limitation affects two of the decision rules but not E-admissibility, which is not a pairwise decision rule. E-admissibility can be used to distinguish between two convex sets of probabilities that intersect all the same supporting hyperplanes.
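The following sketch, built on a small example of my own, shows how the three rules can disagree on a two-state problem with an interval-valued credal set: Γ-maximin selects a single "security" option, E-admissibility keeps exactly the options that are Bayes for some member of the set, and Maximality keeps everything that is not pairwise dominated.

```python
# A small numerical sketch (my own example, not from the paper) contrasting the
# three rules on a two-state problem. The credal set is all p with
# p(state1) in [0.3, 0.7], approximated here by a fine grid. Options are given
# by their utilities in (state1, state2).
import numpy as np

options = {"a": (1.0, 0.0), "b": (0.0, 1.0), "c": (0.4, 0.4)}
grid = np.linspace(0.3, 0.7, 401)                  # approximation of the credal set

def eu(opt, p):
    u1, u2 = options[opt]
    return p * u1 + (1 - p) * u2

# Gamma-maximin: maximize the minimum expected utility over the credal set.
gamma_maximin = max(options, key=lambda o: min(eu(o, p) for p in grid))

# E-admissibility: keep options that are Bayes (an expected-utility maximizer)
# for at least one probability in the credal set.
e_admissible = {o for o in options
                if any(eu(o, p) >= max(eu(o2, p) for o2 in options) - 1e-12 for p in grid)}

# Maximality: keep options not strictly dominated in expectation for EVERY p.
maximal = {o for o in options
           if not any(all(eu(o2, p) > eu(o, p) for p in grid) for o2 in options if o2 != o)}

print("Gamma-maximin :", gamma_maximin)          # 'c'
print("E-admissible  :", sorted(e_admissible))   # ['a', 'b']
print("Maximal       :", sorted(maximal))        # ['a', 'b', 'c']
```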
We consider Geanakoplos and Polemarchakis’s generalization of Aumann’s famous result on “agreeing to disagree", in the context of imprecise probability. The main purpose is to reveal a connection between the possibility of agreeing to disagree and the interesting and anomalous phenomenon known as dilation. We show that for two agents who share the same set of priors and update by conditioning on every prior, it is impossible to agree to disagree on the lower or upper probability of a hypothesis unless a certain dilation occurs. With some common topological assumptions, the result entails that it is impossible to agree not to have the same set of posterior probabilities unless dilation is present. This result may be used to generate sufficient conditions for guaranteed full agreement in the generalized Aumann setting for some important models of imprecise priors, and we illustrate the potential with an agreement result involving the density ratio classes. We also provide a formulation of our results in terms of “dilation-averse” agents who ignore information about the value of a dilating partition but otherwise update by full Bayesian conditioning.
We report two issues concerning diverging sets of Bayesian (conditional) probabilities, divergence of "posteriors", that can result with increasing evidence. Consider a set P of probabilities typically, but not always, based on a set of Bayesian "priors." Fix E, an event of interest, and X, a random variable to be observed. With respect to P, when the set of conditional probabilities for E, given X, strictly contains the set of unconditional probabilities for E, for each possible outcome X = x, call this phenomenon dilation of the set of probabilities (Seidenfeld and Wasserman 1993). Thus, dilation contrasts with the asymptotic merging of posterior probabilities reported by Savage (1954) and by Blackwell and Dubins (1962). (1) In a wide variety of models for Robust Bayesian inference the extent to which X dilates E is related to a model-specific index of how far key elements of P are from a distribution that makes X and E independent. (2) At a fixed confidence level (1-α), Classical interval estimates A_n for, e.g., a Normal mean θ have length O(n^{-1/2}) (for sample size n). Of course, the confidence level correctly reports the (prior) probability that θ ∈ A_n, P(A_n) = 1-α, independent of the prior for θ. However, as shown by Pericchi and Walley (1991), if an ε-contamination class is used for the prior on the parameter θ, there is asymptotic (posterior) dilation for the A_n, given the data. If, however, the intervals A′_n are chosen with length $O(\sqrt{\log(n)/n})$, then there is no asymptotic dilation. We discuss the asymptotic rates of dilation for Classical and Bayesian interval estimates and relate these to Bayesian hypothesis testing.
The degree of incoherence, when previsions are not made in accordance with a probability measure, is measured by either of two rates at which an incoherent bookie can be made a sure loser. Each bet is considered as an investment from the points of view of both the bookie and a gambler who takes the bet. From each viewpoint, we define an amount invested (or escrowed) for each bet, and the sure loss of incoherent previsions is divided by the escrow to determine the rate of incoherence. Potential applications include the treatment of arbitrage opportunities in financial markets and the degree of incoherence of classical statistical procedures. We illustrate the latter with the example of hypothesis testing at a fixed size.
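To convey the flavor of a rate (rather than a mere verdict) of incoherence, here is a toy calculation with my own numbers; the normalization used, total price paid, is a simple stand-in for exposition and not necessarily the escrow defined in the paper.

```python
# A toy illustration (my own, heavily simplified) of measuring a RATE of
# incoherence. A bookie posts previsions 0.7 for an event A and 0.6 for its
# complement; buying both unit bets from the bookie guarantees the bookie a
# loss. Dividing the guaranteed loss by the amount the bookie puts at stake
# gives a crude rate. The normalization below (total price paid) is a
# stand-in for exposition, not the escrow defined in the paper.
prevision_A, prevision_notA = 0.7, 0.6      # incoherent: they sum to more than 1

price_paid = prevision_A + prevision_notA   # bookie pays this for the two unit bets
payoff = 1.0                                # exactly one of A, not-A occurs
sure_loss = price_paid - payoff             # 0.3, whatever happens

rate_of_incoherence = sure_loss / price_paid
print(f"sure loss = {sure_loss:.2f}, rate = {rate_of_incoherence:.3f}")
```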
PSA: Proceedings of the Biennial Meeting of the Philosophy of Science Association, Vol. 1984, Volume Two: Symposia and Invited Papers. (1984), pp. 201-212.
It has long been known that the practice of testing all hypotheses at the same level α, regardless of the distribution of the data, is not consistent with Bayesian expected utility maximization. According to de Finetti’s “Dutch Book” argument, procedures that are not consistent with expected utility maximization are incoherent and they lead to gambles that are sure to lose no matter what happens. In this paper, we use a method to measure the rate at which incoherent procedures are sure to lose, so that we can distinguish slightly incoherent procedures from grossly incoherent ones. We present an analysis of testing a simple hypothesis against a simple alternative as a case‐study of how the method can work.
When can a Bayesian investigator select an hypothesis H and design an experiment (or a sequence of experiments) to make certain that, given the experimental outcome(s), the posterior probability of H will be lower than its prior probability? We report an elementary result which establishes sufficient conditions under which this reasoning to a foregone conclusion cannot occur. Through an example, we discuss how this result extends to the perspective of an onlooker who agrees with the investigator about the statistical model for the data but who holds a different prior probability for the statistical parameters of that model. We consider, specifically, one-sided and two-sided statistical hypotheses involving i.i.d. Normal data with conjugate priors. In a concluding section, using an "improper" prior, we illustrate how the preceding results depend upon the assumption that probability is countably additive.
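The familiar law-of-total-probability argument behind results of this kind, stated here for a countable partition of experimental outcomes (the paper's own sufficient conditions are more general), runs as follows.

```latex
% If the posterior were certain to drop, i.e. P(H | X = x) < P(H) for every
% possible outcome x with P(X = x) > 0, then
\[
  P(H) \;=\; \sum_{x} P(H \mid X = x)\, P(X = x)
        \;<\; \sum_{x} P(H)\, P(X = x) \;=\; P(H),
\]
% a contradiction. The argument relies on (countable) additivity across the
% outcome partition, which is why the abstract's concluding example with an
% ``improper'' prior can behave differently.
```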
We review several of de Finetti’s fundamental contributions where these have played and continue to play an important role in the development of imprecise probability research. Also, we discuss de Finetti’s few, but mostly critical remarks about the prospects for a theory of imprecise probabilities, given the limited development of imprecise probability theory as that was known to him.
The Independence postulate links current preferences between called-off acts with current preferences between constant acts. Under the assumption that the chance-events used in compound von Neumann-Morgenstern lotteries are value-neutral, current preferences between these constant acts are linked to current preferences between hypothetical acts, conditioned by those chance events. Under an assumption of stability of preferences over time, current preferences between these hypothetical acts are linked to future preferences between what are then and there constant acts. Here, I show that a failure of Independence with respect to current preferences leads to an inconsistency in sequential decisions. Two called-off acts are constructed such that each is admissible in the same sequential decision and yet one is strictly preferred to the other. This responds to a question regarding admissibility posed by Rabinowicz ([2000] Preference stability and substitution of indifferents: A rejoinder to Seidenfeld, Theory and Decision 48: 311–318 [this issue]).
Tiebreak rules are necessary for revealing indifference in non-sequential decisions. I focus on a preference relation that satisfies Ordering and fails Independence in the following way. Lotteries a and b are indifferent but the compound lottery ⟨0.5f, 0.5b⟩ is strictly preferred to the compound lottery ⟨0.5f, 0.5a⟩. Using tiebreak rules the following is shown here: In sequential decisions when backward induction is applied, a preference like the one just described must alter the preference relation between a and b at certain choice nodes, i.e., indifference between a and b is not stable. Using this result, I answer a question posed by Rabinowicz (1997) concerning admissibility in sequential decisions when indifferent options are substituted at choice nodes.
Statistical decision theory, whether based on Bayesian principles or other concepts such as minimax or admissibility, relies on minimizing expected loss or maximizing expected utility. Loss and utility functions are generally treated as unit-less numerical measures of value for consequences. Here, we address the issue of the units in which loss and utility are settled and the implications that those units have on the rankings of potential decisions. When multiple currencies are available for paying the loss, one must take explicit account of which currency is used as well as the exchange rates between the various available currencies.
We investigate differences between a simple Dominance Principle applied to sums of fair prices for variables and dominance applied to sums of forecasts for variables scored by proper scoring rules. In particular, we consider differences when fair prices and forecasts correspond to finitely additive expectations and dominance is applied with infinitely many prices and/or forecasts.
In 1936 R. A. Fisher asked the pointed question, "Has Mendel's Work Been Rediscovered?" The query was intended to open for discussion whether someone altered the data in Gregor Mendel's classic 1866 research report on the garden pea, "Experiments in Plant-Hybridization." Fisher concluded, reluctantly, that the statistical counts in Mendel's paper were doctored in order to create a better intuitive fit between Mendelian expected values and observed frequencies. That verdict remains the received view among statisticians, so I believe. Fisher's analysis is a tour de force of so-called "Goodness of Fit" statistical tests using χ² to calculate significance levels, i.e., P-values. In this presentation I attempt a defense of Mendel's report, based on several themes. (1) Mendel's experiments include some important sequential design features that Fisher ignores. (2) Fisher uses particular statistical techniques of Meta-analysis for pooling outcomes from different experiments. These methods are subject to critical debate. And (3) I speculate on a small modification to Mendelian theory that offers some relief from Fisher's harsh conclusion that Mendel's data are too good to be true.
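For readers who want to see the kind of goodness-of-fit computation Fisher's analysis rests on, here is a minimal sketch with illustrative counts (not Mendel's reported data): a χ² test of observed dominant/recessive counts against the Mendelian 3:1 expectation.

```python
# A minimal sketch of the kind of goodness-of-fit computation Fisher's analysis
# rests on: a chi-squared test of observed counts against a Mendelian 3:1
# expectation. The counts below are illustrative, not Mendel's reported data.
from scipy.stats import chi2

observed = [740, 260]                      # hypothetical dominant/recessive counts
total = sum(observed)
expected = [0.75 * total, 0.25 * total]    # Mendelian 3:1 ratio

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
p_value = chi2.sf(chi_sq, df=1)            # one degree of freedom for two classes
print(f"chi-squared = {chi_sq:.3f}, P-value = {p_value:.3f}")
```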
On pp. 55–58 of Philosophical Problems of Statistical Inference, I argue that in light of unsatisfactory after-trial properties of “best” Neyman-Pearson confidence intervals, we can strengthen a traditional criticism of the orthodox N-P theory. The criticism is that, once particular data become available, we see that the pre-trial concern for tests of maximum power may then misrepresent the conclusion of such a test. Specifically, I offer a statistical example where there exists a Uniformly Most Powerful test, a test of highest N-P credentials, which generates a system of “best” confidence intervals with exact confidence coefficients. But the [CIλ] intervals have the unsatisfactory feature that, for a recognizable set of outcomes, the interval estimates cover all parameter values consistent with the data, at strictly less than 100% confidence.
This paper has two main parts. In the first part, we motivate a kind of indeterminate, suppositional credence by discussing the prospects for a subjective interpretation of a causal Bayesian network, an important tool for causal reasoning in artificial intelligence. A CBN consists of a causal graph and a collection of interventional probabilities. The subjective interpretation in question would take the causal graph in a CBN to represent the causal structure that is believed by an agent, and interventional probabilities in a CBN to represent suppositional credences. We review a difficulty noted in the literature with such an interpretation, and suggest that a natural way to address the challenge is to go for a generalization of CBN that allows indeterminate credences. In the second part, we develop a decision-theoretic foundation for such indeterminate suppositional credences, by generalizing a theory of coherent choice functions to accommodate some form of act-state dependence. The upshot is a decision-theoretic framework that is not only rich enough to, so to speak, ground the probabilities in a subjectively interpreted causal network, but also interesting in its own right, in that it accommodates both act-state dependence and imprecise probabilities.
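As background on what a CBN's interventional probabilities are, the following toy example of my own shows the truncated-factorization computation of P(Y | do(X = x)) for a three-node graph with a common cause, and contrasts it with ordinary conditioning; the numbers and structure are purely illustrative.

```python
# A minimal sketch (my own toy example, not from the paper) of interventional
# probabilities in a causal Bayesian network. For the graph Z -> X -> Y with
# Z -> Y, the truncated factorization gives
#   P(Y | do(X = x)) = sum_z P(z) * P(Y | x, z),
# which generally differs from the observational P(Y | X = x).
P_Z1 = 0.5
P_X1_given_Z = {0: 0.2, 1: 0.8}
P_Y1_given_XZ = {(0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.4, (1, 1): 0.9}

def p_z(z): return P_Z1 if z == 1 else 1 - P_Z1
def p_x_given_z(x, z): return P_X1_given_Z[z] if x == 1 else 1 - P_X1_given_Z[z]

# Interventional: the factor for X given its parents is removed.
def p_y1_do_x(x):
    return sum(p_z(z) * P_Y1_given_XZ[(x, z)] for z in (0, 1))

# Observational, for contrast: condition on X = x in the joint distribution.
def p_y1_given_x(x):
    num = sum(p_z(z) * p_x_given_z(x, z) * P_Y1_given_XZ[(x, z)] for z in (0, 1))
    den = sum(p_z(z) * p_x_given_z(x, z) for z in (0, 1))
    return num / den

print("P(Y=1 | do(X=1)) =", round(p_y1_do_x(1), 3))   # 0.65
print("P(Y=1 | X=1)     =", round(p_y1_given_x(1), 3))  # 0.8
```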
Experimenters sometimes insist that it is unwise to examine data before determining how to analyze them, as it creates the potential for biased results. I explore the rationale behind this methodological guideline from the standpoint of an error statistical theory of evidence, and I discuss a method of evaluating evidence in some contexts when this predesignation rule has been violated. I illustrate the problem of potential bias, and the method by which it may be addressed, with an example from the search for the top quark. A point in favor of the error statistical theory is its ability, demonstrated here, to explicate such methodological problems and suggest solutions, within the framework of an objective theory of evidence.