Abstract
This paper studies the monotonicity properties of optimal decisions in bandit decision problems with respect to the probability distribution of the state processes. Orderings of dynamic discrete projects are provided by extending the notion of stochastic dominance to stochastic processes.
Magnac, T., Robin, JM. Dynamic stochastic dominance in bandit decision problems. Theory and Decision 47, 267–295 (1999). https://doi.org/10.1023/A:1005142630173
DOI: https://doi.org/10.1023/A:1005142630173