1 Introduction

Since Friedman (1998), the Monty Hall decision problem has been discussed intensively. While the experimental observations are interesting, their behavioral explanation remains disappointing. The decision frame is modeled on a television show with three doors, only one of which conceals the winning prize, while the other two yield zero profit. After you, as the contestant, have picked a door, the show master Monty opens one of the unchosen doors, which does not reveal the prize. The question then is: do you want to switch to the remaining door, or do you want to stick with your original choice? In other words, what are the winning probabilities for changing and for not changing doors? In the standard construction, Monty always opens one of the unchosen doors (the one without the prize if the prize has not been chosen, and otherwise one of the two at random) and offers the contestant the option to change doors. Under these simplifying specifications the undisputed consensus is to always change doors, as the remaining door’s probability of being the winning door is (at least) larger than 1/3, while the probability has not changed for the initially chosen door. Why is it then that so many of us stay with our initial choice and do not want to change to the other door with the higher winning probability?

Is it then necessary to resort to things like reverse psychology, a possibility raised by Kevin Spacey in the role of MIT Professor Micky Rosa (in the movie “21”, released in 2008 by Columbia Pictures)? Interestingly, not only is the first intuition to stick with the initially chosen door, but experimental investigations also show that many participants remain reluctant to change and do not switch to the other unopened door. Playing the Monty game repeatedly, however, documents a robust learning effect toward increased switching, close to or slightly above 50% (e.g. Friedman 1998). Palacios-Huerta (2003) shows that incentives, ability, and social interaction can further strengthen learning effects in the repeated game. In a similar vein, Slembeck and Tyran (2004) conclude that communication and competition between participants support learning towards increased switching, especially over the first rounds. Repetitions seem to help, although they do not lead to optimal behavior. Granberg (1999a) shows in a cross-cultural comparison that sticking with the initial choice in the Monty game is a rather universal phenomenon. Cognitive illusions (e.g. of control) or cognitive biases (e.g. status quo) have been proposed as possible explanatory concepts for this kind of behavior (compare Granberg and Brown 1995; Granberg 2014). Can game theory provide alternative solutions besides explanatory concepts and post hoc rationalizations?

2 Definitions and Solutions

The Monty game can be defined as a sequential two player constant sum game with asymmetric information and the following specific characteristics.

  (i) Player 1 (i.e. you) chooses between three options with only one holding the winning prize, but you do not know which. Therefore, the probability of having chosen the winning option (W) is 1/3 and the probability of having chosen the losing option (L) is 2/3.

  (ii) Player 2 (i.e. Monty) has the possibility to expose (e) or not expose (\(e'\)) one of the unchosen options which is not holding the prize.

  (iii) Player 2 knows before deciding between e or \(e'\) if W or L. The prize is never exposed and revealed to player 1 only in the final stage of the game.

  (iv) Iff e, player 1 decides between changing to the unexposed and unchosen option (c) or staying with the initial choice (\(c'\)).

  (v) The incentive for player 1 is to win the prize and for player 2 not to give away the prize.

Furthermore, assume fully rational players who abide completely by these rules and always act purposefully and without error. Simplified Monty, as player 2, decides only between e and \(e'\). Sophisticated Monty fully takes the information under (iii) into account and, as player 2, chooses separately for \(e_W\) and for \(e_L\), or their respective odds. First pure and then mixed strategies are investigated. The utility structure under (v) is strongly simplified. The easiest representation of individual utility is in monetary terms, here as winning or not winning the prize. Monetary rewards are not necessarily the only outcome taken into account; social considerations or anticipated feelings can determine the resulting utility as well. Plausible utility extensions for player 2 and player 1 are investigated under Monty game expansions. These additional interdependent components are introduced by adding complexity stepwise.

2.1 Simplified Monty game

The simplest representation of the Monty game as a strategic game is in normal form. This defines the full strategy space for every player and all possible strategy combinations with the resulting payoff for each player. All possible strategy combinations are represented in a static matrix, which can be a contingent representation of a sequential game. Without considering the information on whether the winning or losing option (W or L) was chosen, the Monty game can be considered a simultaneous move game, as shown in Table 1. The solution concept here is the Nash equilibrium, where in a given situation no player would be better off by switching to an alternative strategy. With two players and two strategies each, this simply means that a player could not increase his/her payoff by choosing the other strategy, given the current strategy of the other player. This must hold for both players.

Table 1 Winning probabilities for simplified Monty in 2x2 normal form

Proposition 1

The only equilibria in pure strategies are those with player 2 not exposing, (\(e',c\)) and (\(e',c'\)).

Proof

Player 1 profitably changes (c) iff player 2 exposes (e); therefore player 2 never chooses e.

As a sequential game in extensive form, the simplified Monty game reduces to one subgame perfect equilibrium at (\(e',c\)) through backwards induction (see Fig. 1). Given that player 2 decides without knowing whether W or L, there is no mixed strategy equilibrium, as player 2 can only improve by increasing the proportion of \(e'\) because \(e'\) weakly dominates e (if c then \(e'\) is better, and if \(c'\) then \(e'\) is not worse). The maximum gain for player 1 is an increase in the winning probability from \(\frac{1}{3}\) to \(\frac{2}{3}\) by playing c given e. The literature simplifies by taking e as given, although without further assumptions player 2 would prefer \(e'\) (i.e. would never open a door to show that it does not hide the winning prize).
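The equilibrium check for the simplified game can be sketched in a few lines of Python. The payoff entries below are an assumption reconstructed from the text (2/3 for changing after an exposure, 1/3 in every other cell), not a copy of Table 1, and the helper name `pure_nash` is hypothetical.

```python
from fractions import Fraction
from itertools import product

# Player 1's winning probabilities (assumed entries, reconstructed from the text):
# changing after an exposure wins with 2/3, every other combination leaves 1/3.
WIN = {
    ("e", "c"): Fraction(2, 3),
    ("e", "c'"): Fraction(1, 3),
    ("e'", "c"): Fraction(1, 3),   # no door exposed, so there is nothing to change to
    ("e'", "c'"): Fraction(1, 3),
}

def pure_nash(win):
    """Return all pure-strategy Nash equilibria of the constant-sum game."""
    p2_moves = sorted({m2 for m2, _ in win})
    p1_moves = sorted({m1 for _, m1 in win})
    equilibria = []
    for m2, m1 in product(p2_moves, p1_moves):
        # Player 1 maximizes the winning probability, player 2 minimizes it.
        p1_ok = all(win[(m2, m1)] >= win[(m2, alt)] for alt in p1_moves)
        p2_ok = all(win[(m2, m1)] <= win[(alt, m1)] for alt in p2_moves)
        if p1_ok and p2_ok:
            equilibria.append((m2, m1))
    return equilibria

print(pure_nash(WIN))
```

Under these assumed entries the search returns exactly the two non-exposing profiles named in Proposition 1.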

Fig. 1 Simplified Monty game in extensive form

2.2 Sophisticated Monty game

In addition, previous investigations stress that player 2 knows whether the winning door was chosen (W or L), and this knowledge can be acknowledged in a formal representation of the Monty game. Monty as player 2 knows whether or not player 1 has initially picked the winning option (i.e. the door with the prize behind it), so it is reasonable to assume two variants of e in the sequential game: one if it was the winning choice, \(e_W\) (or \(e'_W\)), and another if it was the losing choice, \(e_L\) (or \(e'_L\)). Furthermore, these can be chosen with different probabilities in a mixed strategy equilibrium. A comparable differentiation between probabilities for e has been made by Morgan et al. (1991), page 286, Mueser et al. (1999), pages 43-46, and Whitmeyer (2017), pages 5-7. Schuller (2012) stresses more generally that, with unknown expose probabilities in winning versus losing cases, the safe strategy for player 1 is not to change and thereby to secure a 1/3 winning probability. As a consequence, all sophisticated Monty game equilibria restrict player 1 to \(c'\).

Proposition 2

The only Nash equilibria in pure strategies are with player 1 not changing \(((e_W,e'_L),c')\) and \(((e'_W,e'_L),c')\).

Proof

Player 2 is indifferent (\(e=e'\)) iff player 1 does not change (\(c'\)); otherwise player 2 prefers \(e_W\) and \(e'_L\), in which case player 1 prefers \(c'\) over c. Only for \(((e_W,e'_L),c')\) and \(((e'_W,e'_L),c')\) is \(c'\ge c\):

$$\begin{aligned} ((e_W,e'_L),c')>((e_W,e'_L),c) \text{ and } ((e'_W,e'_L),c')=((e'_W,e'_L),c). \end{aligned}$$
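As a rough check of Proposition 2, the eight pure strategy profiles can be enumerated directly. The sketch below assumes the constant-sum reading of (v), in which player 2 simply minimizes player 1's winning probability; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def win_prob(expose_w, expose_l, change):
    """Player 1's winning probability for one pure strategy profile.

    expose_w / expose_l: does Monty expose a door after a winning / losing pick?
    change: does player 1 switch whenever a door is exposed?
    """
    if not change:
        return Fraction(1, 3)
    win_after_w = 0 if expose_w else 1   # switching away from the prize loses
    win_after_l = 1 if expose_l else 0   # switching from a losing door wins
    return Fraction(1, 3) * win_after_w + Fraction(2, 3) * win_after_l

def pure_nash():
    """Enumerate profiles ((expose_w, expose_l), change) with no profitable deviation."""
    p2_strategies = list(product([True, False], repeat=2))
    p1_strategies = [True, False]
    equilibria = []
    for (ew, el), ch in product(p2_strategies, p1_strategies):
        value = win_prob(ew, el, ch)
        p1_ok = all(value >= win_prob(ew, el, alt) for alt in p1_strategies)
        p2_ok = all(value <= win_prob(aw, al, ch) for aw, al in p2_strategies)
        if p1_ok and p2_ok:
            equilibria.append(((ew, el), ch))
    return equilibria

print(pure_nash())
```

Here `((True, False), False)` stands for \(((e_W,e'_L),c')\) and `((False, False), False)` for \(((e'_W,e'_L),c')\); no profile with player 1 changing survives the deviation check.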

Player 2 exposing doors dependent on the initial choice of player 1 (e conditional on W or L) is an informational advantage and does change the equilibria. With asymmetric information the game is represented in extensive form. In pure strategies it makes player 1 choose \(c'\), which is consistent with most people’s intuition. Mixed strategies can be derived for player 1 with p for c and \(1-p\) for \(c'\). Player 2 can mix with two probabilities r and s (with r for \(e_W\), \(1-r\) for \(e'_W\), s for \(e_L\), and \(1-s\) for \(e'_L\)). Figure 2 shows the sophisticated Monty game in extensive form with corresponding probabilities in brackets.

Fig. 2 Sophisticated Monty game in extensive form

Proposition 3

The only equilibria with mixed strategies exist for player 1 not changing \(((r,s),c')\).

Proof

Indifference for player 2 between \(e_W\) and \(e'_W\) as well as \(e_L\) and \(e'_L\) requires \(p=0\) as otherwise \(r=1\) and \(s=0\). Determining r and s so that player 1 is indifferent between c and \(c'\) requires

$$\begin{aligned} \frac{1}{3}(1-r)+\frac{2}{3}s=\frac{1}{3}(r+1-r) \Longrightarrow r=2s \end{aligned}$$

All combinations of \(e_W\) and \(e_L\) with \(r=2s\) (and \(c'\)) are equilibria. It pays for player 1 to choose c only when \(s>\frac{r}{2}\), but this again would contradict player 2’s interests. Player 2 keeps this combination only for \(c'\), as otherwise increasing r and decreasing s would be beneficial. Naturally, player 2 can have different incentives in this game, deriving for example from extending the game or from receiving something back if the prize is won.
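The indifference condition can be verified numerically with exact fractions. This is only an illustrative sketch; the function names are hypothetical, and the two expressions are the left- and right-hand sides of the indifference equation in the proof of Proposition 3.

```python
from fractions import Fraction

def win_if_change(r, s):
    """Winning probability of c: keep the prize if W is not exposed, switch to it if L is exposed."""
    return Fraction(1, 3) * (1 - r) + Fraction(2, 3) * s

def win_if_stay():
    """Winning probability of c', independent of r and s."""
    return Fraction(1, 3)

# Every pair with r = 2s leaves player 1 indifferent between c and c'.
for s in (Fraction(0), Fraction(1, 4), Fraction(1, 2)):
    r = 2 * s
    assert win_if_change(r, s) == win_if_stay()

# Player 2's preferred response to a changing player 1 (r = 1, s = 0) makes c worthless.
print(win_if_change(Fraction(1), Fraction(0)))
```

Any pair with s above r/2 makes changing strictly better for player 1, which is why such pairs cannot be sustained against a prize-protecting player 2.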

2.3 Monty game expansions

Additional assumptions can be introduced as explanatory concepts for the observed behavior. Two game expansions are proposed here for illustration purposes. First, the process of opening a door (e) is beneficial for the host and the derived utility needs to be added for player 2. Second, social concerns like reciprocity might play a role and can be taken into account.

It appears reasonable that the host is fickle and alternates between e and \(e'\). Furthermore, these frequencies can be chosen purposefully when the host enjoys the prolongation of the game per se. This is represented in Fig. 3a by adding constant utility for player 2 when reaching the second stage. The only equilibrium in pure strategies would then be \(((e_W,e_L),c)\), as e weakly dominates \(e'\) and, given e, player 1 prefers c. Note that this only holds if the value of prolonging equals the value of the prize. This value can be expected to be lower, and then only one mixed strategy equilibrium remains. As the payoffs for player 1 are always the same, \(r=2s\) remains unchanged. \(e_W\) is strictly preferred (i.e. \(r=1\)), and indifference between \(e_L\) and \(e'_L\) requires

$$\begin{aligned} p+2(1-p)=1 \Longrightarrow p=1. \end{aligned}$$

More generally, if prolonging is worth less than the prize, p equals the ratio of the two values (i.e. \(p=0.5\) if the value of prolonging is half the value of the prize). Only if the values are equal does the pure strategy equilibrium result. Otherwise, the question for player 1 to answer in order to determine p is “what is prolonging worth for player 2”. Interestingly, the proclaimed advantage of c can result, but the value of simply prolonging the show can be comparatively small.
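The relation between p and the value of prolonging can be illustrated with a small calculation. The sketch assumes a normalization that is not spelled out in the text: keeping the prize is worth 1 to player 2, giving it away is worth 0, and reaching the second stage adds a prolonging value v (a hypothetical symbol). With v = 1 this reproduces the indifference equation above.

```python
from fractions import Fraction

def monty_payoff_after_losing_pick(expose, p, v):
    """Player 2's expected payoff when player 1 holds a losing door.

    Assumed normalization: keeping the prize is worth 1 to player 2,
    and reaching the second stage adds the prolonging value v.
    """
    if not expose:
        return Fraction(1)           # no switch possible, the prize is kept
    # After an exposure player 1 switches (and wins) with probability p.
    return p * v + (1 - p) * (1 + v)

def indifferent_switch_probability(v):
    """Solve p*v + (1 - p)*(1 + v) = 1 for p, which simplifies to p = v."""
    return Fraction(v)

for v in (Fraction(1), Fraction(1, 2), Fraction(1, 4)):
    p = indifferent_switch_probability(v)
    exposed = monty_payoff_after_losing_pick(True, p, v)
    hidden = monty_payoff_after_losing_pick(False, p, v)
    print(v, p, exposed == hidden)
```

Under this normalization the switching probability that keeps player 2 indifferent equals the prolonging value measured in units of the prize.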

Fig. 3 Monty game expansions

Another game expansion is to assume social motives in the form of reciprocity. In the setting of the Monty Hall game show this could take the form of showing extra joy for winning after having had to reconsider the choice (which is valuable for the show master as it increases the number of viewers). The expanded game in Fig. 3b acknowledges this, but without taking negative reciprocity into account. Concerning pure strategy equilibria nothing changes, and mixed strategy equilibria still require \(p=0\) for player 2 to be indifferent. The only difference concerns the relation between s and r, which now need to be equal for a payback of 0.5, as shown in Fig. 3b. In general \(2s[1-\)payback\(]=r\), so that a payback below 50% of the prize gives \(r>s\). The higher the payback, the lower the proportion of r. The question for player 2 then shifts towards the question of reciprocity (“how much can I expect back”) when exposing the door without the prize behind it (i.e. in terms of show value). Both expansions together provide a more specific characterisation of the Monty Hall problem than its simplified representation in the literature, one that is more in line with the natural understanding of this strongly framed choice task under uncertainty.
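How the payback shifts the relation between r and s can be tabulated with a short sketch. It rests on one possible reading of Fig. 3b, labeled here as an assumption: player 1 keeps only the share 1 - payback of the prize when winning by changing after an exposure, while all other payoffs stay as before. The function names are hypothetical.

```python
from fractions import Fraction

def value_if_change(r, s, payback):
    """Player 1's expected value of c.

    Assumption: a win by changing after an exposure is worth 1 - payback,
    a win without exposure is worth the full prize.
    """
    return Fraction(1, 3) * (1 - r) + Fraction(2, 3) * s * (1 - payback)

def value_if_stay():
    """Player 1's expected value of c' (assumed unaffected by the payback)."""
    return Fraction(1, 3)

def indifferent_r(s, payback):
    """The relation r = 2s(1 - payback) stated in the text."""
    return 2 * s * (1 - payback)

s = Fraction(1, 2)
for payback in (Fraction(0), Fraction(1, 4), Fraction(1, 2)):
    r = indifferent_r(s, payback)
    assert value_if_change(r, s, payback) == value_if_stay()
    print(payback, r, s)
```

With no payback the sketch falls back to r = 2s, and with a payback of one half it gives r = s, matching the two cases mentioned in the text.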

3 Conclusion and discussion

Psychological expansions can rationalize the popular solution, although simple mixed strategy equilibria and conditional probabilities suffice here. An interesting psychological aspect is to take first associations or the initial intuition into account. This need not only apply to the equilibrium selection problem (i.e. focal points or prominence), but could also enrich the understanding of other behavioral regularities. Perceived risk is the fundamental characteristic investigated by the Monty Hall game. The derived results are in line with the (persistent) behavior of many who regard switching doors as more risky. This holds not only under bounded rationality, where the odds are not known, but also in a strategic setting where the host prefers not to give away the prize. Only for simplified Monty, who always opens a door, or if Monty is assumed to make many errors while revealing a losing door (i.e. opening doors in winning and losing cases more equally), does switching doors become the more successful strategy.

Most controversies about the Monty Hall problem might be due to unclear player incentives (see Mueser et al. 1999). The experimental evidence of many participants not switching is robust even under the experimenter’s explicit claim of always opening the unchosen door with no prize behind it (compare Granberg 1999b). Uncertainty might prevail, as this experimental promise is non-binding and the choice situation can be represented as a normal form game with two players who both have two strategies, as in the Simplified Monty game. The sequential game representation, as in the Sophisticated Monty game, illustrates this uncertainty as an information set for the contestant, who does not know which state of the world, W or L, (s)he is in. Furthermore, bounded rationality could argue that the complexity of the task makes not switching the more robust strategy, and we do not need to refer to reverse psychology or other forms of psychological tricks to influence the other player’s behavior. Only if there is an additional utility from prolonging the game, and this crucial utility of the host is acknowledged by the contestant, should switching be preferred to not switching. An alternative explanation is social preferences. In the form of sequential reciprocity this can work similarly to forward induction in the trust game (compare Kohlberg and Mertens 1986; Dufwenberg and Kirchsteiger 2004; Battigalli and Dufwenberg 2009). The (anticipated) effect of trusting or not can be seen as a serious competitor to mixed strategy equilibria, but Monty’s motivation mostly remains unclear. For this reason various Monty types have been proposed (e.g. mean, altruistic, etc.), but the general grounds for cooperative versus uncooperative behavior remain dubious. The Monty game is usually specified as a one-shot game (though investigated experimentally as a repeated game). Signaling the Monty type by opening a door does not work either (compare common priors in Whitmeyer 2017).
Also, that joy will be shown by the contestant cannot be taken for granted and would demand another decision stage. Note that not all possible incentive structures of the game are covered here and that the chosen game tree expansions are mainly introduced to illustrate corresponding shortcomings in the discussion of this choice task under uncertainty. When the specific structural component of a simultaneous choice is stressed for switching to be the dominant strategy, as if deciding before the revealing whether to switch or not, this too does not seem to represent the strategic situation in the game properly. If Monty always reveals a losing door, this does not represent a free agent in a strict economic sense (i.e. for game theory an awkward definition of a social problem as one player against chance). Furthermore, the experimental results of increasing switching decisions over repetitions might as well result from experimenter demand or from a reconsidering effect, and improving behavior over repetitions does not necessarily incorporate learning of the underlying odds.

Still, the Monty Hall game illustrates the clash between statistical thinking and observed choice behavior. Taking this discrepancy seriously calls for descriptive models that can cope with the complexity of the problem. Different standard representations already help to illustrate the problem. Game theory provides a formalization of choices in social settings that captures the strategic dependencies between players. The exercise provided here of representing the choice situation in different ways should sharpen the understanding of the problem’s diversity and illustrate how the representation of a choice problem can theoretically lead to distinct outcomes. Which expansions are useful for improving the general understanding of the problem can only be answered empirically. The provided expansions for the Monty Hall problem clearly need to be investigated experimentally. The theoretical approach here is meant to stress the importance of developing sound foundations in experimental investigations, and to help understand the behavioral facets in social settings. Behavior can be manifold. Formalizing, and thereby clearly defining, the decision problem at hand is important in all social sciences, and teaching conditional probabilities and aspects of game theory serves as a nice illustrative example here.

Sometimes the initial intuition can be right. Usually the audience in the Monty Hall show perceives changing doors as more risky under unknown probabilities. This can be seen as a kind of uncertainty avoidance (similar to the Allais paradox) by people simply playing safe. For the Monty game, uncertainty avoidance has been investigated as anticipated regret (Gilovich et al. 1995) or as a minimax strategy (Schuller 2012), and not switching doors does not need another explanatory heuristic. If a person changes his/her initial choice, this behavior demands distributional assumptions about Monty’s behavior, preferences for prolonging the game, or some form of forward induction with specific social preferences. Social situations can usually be rather complex, but they can also be grasped by various theoretical concepts. Grasping the statistical dependencies within the Monty Hall game is representative of the understanding of various decision problems in the social sciences.