Running Head: MONTY HALL EXPERIMENT 1
Please cite this article as:
Saenen, L., Van Dooren, W., Onghena, P. (2015). A randomized Monty Hall experiment: The positive effect of conditional frequency feedback. Thinking & Reasoning, 21 (2), 176-192. doi:10.1080/13546783.2014.918562
A Randomized Monty Hall Experiment:
The Positive Effect of Conditional Frequency Feedback
Lore Saenen (a), Wim Van Dooren (b), & Patrick Onghena (a)
(a) Methodology of Educational Sciences Research Group, KU Leuven, Leuven, Belgium
(b) Centre for Instructional Psychology and Technology, KU Leuven, Leuven, Belgium
Correspondence concerning this article can be addressed to Lore Saenen, Methodology of
Educational Sciences Research Group, Andreas Vesaliusstraat 2 - Box 3762, B-3000 Leuven,
Belgium. Phone +32 16 373041. Fax +32 16 326200.
E-mail [email protected]
Acknowledgement: This work was supported by the Concerted Research Action of the KU
Leuven under Grant GOA/12/010.
Total word count: 6425
Abstract
The Monty Hall dilemma (MHD) is a notorious probability problem with a counterintuitive
solution. There is a strong tendency to stay with the initial choice, despite the fact that
switching doubles the probability of winning. The current randomized experiment investigates
whether feedback in a series of trials improves behavioural performance on the MHD and
increases the level of understanding of the problem. Feedback was either conditional or non-
conditional, and was given either in frequency format or in percentage format. Results show
that people learn to switch most when receiving conditional feedback in frequency format.
However, problem understanding does not improve as a consequence of receiving feedback.
Our study confirms the dissociation between, on the one hand, behavioural performance on
the MHD and, on the other hand, actual understanding of the MHD. We discuss how this
dissociation can be understood.
Keywords: Monty Hall dilemma, probability, conditional feedback, frequencies,
decision making
The Monty Hall dilemma (MHD) is a notoriously difficult conditional probability
problem with a counterintuitive solution (e.g., Baratgin & Politzer, 2010; Howard, Lambdin,
& Datteri, 2007). The classic version of the MHD involves a contestant who is asked to pick
one of three identical doors. Behind one of the doors, there is a valuable prize, usually a car.
The other two doors conceal goats. Once the contestant picks a door, the host, who knows the
location of the prize, opens a door not selected by the contestant, always revealing a goat. The
dilemma then offered to the contestant is whether he stays with his initial choice, or wants to
switch to the other remaining door. The optimal solution to the MHD is to switch, as
switching doors yields a probability to win the prize that is twice as large as the probability
obtained by staying. Nevertheless, there is a strong (cross-cultural) tendency to stay with the
initial decision (Granberg, 1999; Granberg & Brown, 1995; Granberg & Dorr, 1998).
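The 1/3 versus 2/3 claim can be checked empirically. The following is a minimal simulation sketch, not part of the original study: the function names `play_trial` and `win_rate` are hypothetical, and the host is assumed to pick at random when two goat doors are available.

```python
import random


def play_trial(switch, rng):
    """Play one Monty Hall trial; return True if the contestant wins."""
    doors = [0, 1, 2]
    prize = rng.choice(doors)
    first_pick = rng.choice(doors)
    # The host knows the prize location and opens a goat door
    # that the contestant did not pick (random tie-break: an assumption).
    host_options = [d for d in doors if d != first_pick and d != prize]
    opened = rng.choice(host_options)
    if switch:
        # Switch to the only door that is neither picked nor opened.
        final_pick = next(d for d in doors if d not in (first_pick, opened))
    else:
        final_pick = first_pick
    return final_pick == prize


def win_rate(switch, n_trials=100_000, seed=0):
    """Estimate the winning probability of always staying or always switching."""
    rng = random.Random(seed)
    wins = sum(play_trial(switch, rng) for _ in range(n_trials))
    return wins / n_trials


if __name__ == "__main__":
    print("stay:  ", win_rate(switch=False))
    print("switch:", win_rate(switch=True))
```

With a large number of simulated trials, the staying rate should settle near 1/3 and the switching rate near 2/3.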
Various studies have shown that this tendency is malleable. First, there is strong
evidence that repeated experience with the MHD leads to increased switching rates across
trials (Franco-Watkins, Derks, & Dougherty, 2003; Granberg & Brown, 1995; Granberg &
Dorr, 1998; Herbranson & Schroeder, 2010; Klein, Evans, Schultz, & Beran, 2013; Petrocelli
& Harris, 2011; Slembeck & Tyran, 2004; Tubau & Alonso, 2003). Second, the
numerical format in which the MHD is presented also seems to influence
participants’ performance. Aaron and Spivey-Knowlton (1998) showed that an MHD
presentation in terms of natural frequencies resulted in higher switching rates than a
presentation in terms of probabilities. Although the difference in
switching rates was not statistically significant, the difference in the answers on the
mathematical questions did reach statistical significance: Participants who solved the MHD
stated in natural frequencies reported the correct posterior winning probabilities
significantly more often than participants who were asked to solve the MHD stated in
probabilities (proportions or percentages). Research by Tubau (2008) confirmed these results:
Participants who completed the MHD in natural frequencies showed both significantly
higher switching rates and more correct posterior probability judgements than
participants who completed the MHD presented in terms of relative frequencies. Krauss
and Wang (2003) also included a natural frequencies manipulation in their experiment.
Participants were asked in how many of the three possible MHD situations a participant
would win the game when staying and when switching. When this manipulation was
combined with the manipulation in which the participant was asked to imagine that (s)he was
the host of the MHD, switching behaviour was facilitated.
The critical question that arises when behavioural performance on the MHD improves,
is whether this performance reflects actual understanding of the problem. More specifically, a
decision to switch is not necessarily the consequence of correct probabilistic reasoning. This
proposition is confirmed in previous research: Franco-Watkins, Derks, and Dougherty (2003),
Stibel, Dror, and Ben-Zeev (2009), and Tubau and Alonso (2003) demonstrated that there is a
clear dissociation between choice behaviour on the MHD and the actual understanding of the
problem’s underlying probabilities. Thus, although research demonstrates ways to influence
the use of the optimal strategy to solve the MHD, people still fail to grasp the reason why
switching is beneficial.
The failure to detect the correct underlying probabilities of the MHD can be explained
by the cognitive process of updating (Baratgin & Politzer, 2010) and by a distorted memory
for decision/outcome frequencies when a series of trials is involved (Petrocelli & Harris,
2011). Baratgin and Politzer (2010) showed that when people solve the MHD, they often
erroneously estimate the posterior probability for winning when staying (and switching) as
being 1/2 because they engage in what the authors call ‘updating’, by which they mean that
participants believe that the new situation, with only two remaining doors, is independent
from the conditional information received about the opened non-winning door. As a
consequence, the posterior winning probabilities for staying and switching are incorrectly
judged equally as 1/2 (Baratgin & Politzer, 2010). For a correct probability understanding,
however, one should engage in what the authors call ‘focusing’, by which they mean that the
situation in which the dilemma occurs is considered as not independent from the situation
before the actual dilemma took place. Thus, when focusing, one takes into account the
conditional information (Baratgin & Politzer, 2010).
Next, Petrocelli and Harris (2011) pointed to the distorted memory for
decision/outcome frequencies that participants have after solving successive MHD trials. In
their research, participants completed several trials of the MHD. Afterwards, they were asked
to estimate how often they had won and lost, depending on the used strategy. Above all,
participants estimated their switch losses being much higher as compared to the actual number
of switch losses during the experiment. Of course, it is difficult to keep in mind the number of
winning and losing trials depending on the used strategy without having any external help. It
requires that the working memory of participants is constantly updating these frequencies
while solving the MHD. The important role of working memory capacity on MHD
performances has indeed been demonstrated, both correlationally and experimentally (De
Neys, 2005; De Neys & Verschueren, 2006).
In sum, we assume that when people solve the MHD repeatedly, correctly
registering the number of times they win or lose in relation to the choice to
switch or to stay may be extremely difficult because it demands too much from working
memory. There exist several studies in which participants completed successive trials of the
MHD and received feedback about their performances: After each trial, the participant was
informed about whether (s)he won or lost the trial. The results of these studies showed that
participants indeed learned to switch across successive trials, but still did not respond
optimally, which means that they still did not switch on all trials (Herbranson & Schroeder,
2010; Klein et al., 2013; Mazur & Kahlbaugh, 2012; Slembeck & Tyran, 2004). Although the
influence of feedback was examined in these studies, to our knowledge no study has so
far systematically investigated whether cumulative, constantly updated feedback provided to
the participants during the game influences performance on the MHD and, more generally,
improves participants’ understanding of the underlying probabilities.
In the current paper, we will investigate the effect of cumulative, constantly updated,
and available feedback during an experiment in which participants are asked to complete
multiple trials of the MHD. The nature of feedback in the MHD is systematically manipulated
in two different ways.
First, we will provide either conditional or non-conditional feedback that is constantly
updated and shown to the participant. We expect that participants will perform better when
receiving conditional feedback compared to non-conditional feedback, because conditional
feedback will prevent participants’ memory for decision/outcome frequencies from becoming
distorted, whereas this distortion remains possible with non-conditional feedback. Thus,
when conditional feedback is provided to the participants while solving multiple trials, their
working memory will be less burdened in comparison with non-conditional feedback.
Second, the feedback will be provided in either a percentage or a frequency format.
This latter manipulation is based on research indicating that people perform better on
conditional probability problems when the problem is stated in frequencies rather than in
percentages or probabilities (e.g., Gigerenzer, 1991, 1994; Gigerenzer & Hoffrage, 1995;
Moro, Bodanza, & Freidin, 2011). As already mentioned above, this has also been
demonstrated for the MHD by the research of Aaron and Spivey-Knowlton (1998), Krauss
and Wang (2003), and Tubau (2008). However, in these studies the numerical format of the
problem presentation of the MHD was manipulated, whereas in our study we will investigate
whether the numerical presentation of the feedback makes a difference in how people solve
the MHD. We expect that participants perform better on the MHD when feedback is given in
frequency format rather than in percentage format, because humans’ cognitive systems are
considered to be more suitable for natural frequencies, but less suitable for proportions and/or
percentages (see Cosmides & Tooby, 1996). The latter statement does not necessarily imply
that the frequentist format is the causal explanation for the reported research results on the
MHD (Aaron & Spivey-Knowlton, 1998; Krauss & Wang, 2003; Tubau, 2008). An
alternative plausible explanation is that natural frequencies favour the correct focusing
interpretation of the MHD, instead of the misleading updating interpretation (for an overview,
see Baratgin, 2009).
Notice that the previously mentioned expectations imply that we expect participants to
perform best when they receive conditional feedback presented in frequencies, because they
would benefit from both manipulations.
To the best of our knowledge, no previous study has systematically investigated the
effect of cumulative, constantly updated, and available feedback, nor has it been investigated
which format of such feedback (i.e., the numerical and conditional format in which the
feedback is provided) is most effective. In our experiment, we will investigate the influence of
cumulative, constantly updated, and available feedback on people’s performances on the
MHD. More specifically, we will investigate whether such feedback leads to both more
optimal behavioural responses on the MHD and an increased understanding of the MHD.
Methods
Participants and Design
Seventy-seven (under)graduate students of the University of Leuven (Belgium)
volunteered to participate in the experiment, in return for a small financial reward. Nine
participants were eliminated due to prior familiarity with the MHD, leaving data of 68
participants (18 males, 50 females; Mage = 21.60, SDage = 4.36) used for further analyses.
The participants were randomly assigned to one of four treatments created by a 2 × 2
between-subjects design. Both independent variables concerned the nature of the constantly
updated feedback on the MHD. The first independent variable was the conditional format of
the feedback (conditional vs. non-conditional). Conditional feedback was given by providing
information about the number of wins and losses for staying and switching separately. When
non-conditional feedback was given, only information about the number of times the
participant won or lost was provided. The second independent variable was the numerical
format of the feedback (in frequencies vs. in percentages). This led to the following four
treatments: non-conditional feedback in frequencies (NCF), non-conditional feedback in
percentages (NCP), conditional feedback in frequencies (CF), and conditional feedback in
percentages (CP). Figure 1 shows the feedback as given to the participants in the different
treatments.
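The four feedback treatments can be sketched as views on a single running tally. This is an illustrative reconstruction only: the names `Tally` and `feedback` are hypothetical, and the exact layout and rounding of the on-screen feedback (Figure 1) are assumptions, in particular that conditional percentages are computed within each strategy.

```python
from dataclasses import dataclass


@dataclass
class Tally:
    """Cumulative outcome counts, split by final strategy."""
    stay_wins: int = 0
    stay_losses: int = 0
    switch_wins: int = 0
    switch_losses: int = 0

    def record(self, switched: bool, won: bool) -> None:
        """Add the outcome of one completed trial."""
        key = ("switch" if switched else "stay") + ("_wins" if won else "_losses")
        setattr(self, key, getattr(self, key) + 1)


def _display(wins, losses, as_percentage):
    """Show a win/loss pair as counts or as percentages of their total."""
    if not as_percentage:
        return {"wins": wins, "losses": losses}
    total = wins + losses
    if total == 0:
        return {"wins": None, "losses": None}  # nothing to report yet
    return {"wins": round(100 * wins / total, 1),
            "losses": round(100 * losses / total, 1)}


def feedback(tally, conditional, as_percentage):
    """Return the feedback for one of the four treatments (NCF, NCP, CF, CP)."""
    if conditional:
        # Conditional: wins and losses reported separately per strategy.
        return {
            "stay": _display(tally.stay_wins, tally.stay_losses, as_percentage),
            "switch": _display(tally.switch_wins, tally.switch_losses, as_percentage),
        }
    # Non-conditional: wins and losses pooled over both strategies.
    return _display(tally.stay_wins + tally.switch_wins,
                    tally.stay_losses + tally.switch_losses,
                    as_percentage)
```

Calling `record` after every trial and `feedback` with the participant's treatment flags would yield the constantly updated display; only the conditional variants let the participant compare staying with switching directly.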
(Figure 1 about here)
Two dependent variables were included in the study. The first dependent variable was
the behavioural response on the MHD (staying vs. switching). The second dependent variable
was the probability judgement of winning for the final decision that was made by the
participant (staying vs. switching). This means that when the participant chose to stay with the
initial choice, (s)he was asked to give the posterior winning probability of staying, while the
choice to switch resulted in a question about the posterior winning probability of switching.
The study protocol was approved by the Ethical Committee of the University of
Leuven.
Materials and Procedure
The MHD was completed repeatedly (80 times) and feedback on the performance (see
Figure 1) was constantly updated on a computer screen that was placed at an angle of 45
degrees and at a distance of approximately 45 centimetres to the front left of the participant.
Participants came individually to the laboratory. Upon arrival, the participant took
a seat in front of the experimenter. Three piles of 80 cards lay face down on the table. For
each trial of the MHD, the three cards that lay on top of the piles were used. These three
cards always included two black cards and one red (winning) card. The participant received
the instructions for the MHD task on a sheet of paper (see the Appendix). After the participant
had read the instructions, the experimenter summarized the procedure. The participant
was told that (s)he would participate in a game of chance that consisted of 80 trials and that
for each trial, the aim was to locate the red winning card. For each trial, the participant made
an initial choice by mentioning a number from one to three (number one being the card on the
left, number three being the card on the right). Next, the experimenter turned face up a
card other than the participant’s initial choice. This card was always black, and the
participant knew that the experimenter was aware of the location of the red winning card.
Next, the participant was asked to decide to either stay with the initial choice or to switch to
the other remaining card. After this final choice, the two remaining cards were turned
face up so that the participant could see whether (s)he had won or lost this trial (see Figure
2). Before starting the next trial, the information on the computer screen was updated.
(Figure 2 about here)
Two practice trials of the MHD were included in order to make sure the participant
understood the procedure of the game. These practice trials were analogous to the
experimental trials. After these two practice trials, the MHD was completed 80 times. The
reason to include 80 trials was both practical and theoretical. From a practical point of view, a
pilot study showed that 80 trials were easily completed in a one-hour experimental session.
From a theoretical point of view, 80 trials is considered to be sufficient to investigate the
influence of feedback as the majority of learning is known to happen in approximately the
first 50 trials (Herbranson & Schroeder, 2010). After every 10 trials, the participant was
explicitly encouraged by the experimenter to look at the feedback on the computer screen (see
Figure 1). The participant could look at the computer screen more often if (s)he wanted to,
because the feedback on the computer screen was constantly updated and remained available
during the entire experiment.
During trial 1, trial 40, and trial 80, after the experimenter revealed a non-winning
card, the participant was also asked to make a probability judgement with respect to winning,
given the final decision that (s)he had made. This was done in order to detect whether his/her
understanding of the underlying probabilities of the MHD improved.
The participant was motivated to perform well by receiving the instruction that for
every winning trial (s)he would gain 10 eurocents. At the end of the experiment, the
participant received the actual amount of money (s)he had obtained during the experiment.
Results
Choice Behaviour
Frequencies of switching were counted for every block of 20 trials. A repeated
measures analysis of variance was performed with one within-subjects variable ‘block’ (four
blocks of 20 trials) and two between-subjects variables ‘conditional format’ (conditional vs.
non-conditional feedback) and ‘numerical format’ (feedback in frequencies vs. percentages)
and with frequency of switching per block as the dependent variable. Mauchly’s test indicated
that the assumption of sphericity was violated, χ²(5) = 34.13, p < .01. Therefore, corrections
for the degrees of freedom using the Greenhouse-Geisser estimates of sphericity were used.
The significance level was determined at α = .05.
As expected, after having experienced more trials of the MHD, participants switched
more often (first block: M = 7.57, SD = 5.08; second block: M = 9.31, SD = 5.73; third block:
M = 11.03, SD = 6.27; fourth block: M = 11.59, SD = 6.69), F(2.16, 138.04) = 26.90, p < .01,
Cohen’s f = .65, which indicated a main learning effect. In line with our expectations, an
interaction effect between conditional format and numerical format was also observed, F(1,
64) = 5.69, p = .02, Cohen’s f = .30. Post-hoc pairwise contrasts between the four treatments
were tested across blocks, using Tukey’s Honestly Significant Difference (HSD) test. This
test revealed only one statistically significant pairwise contrast: Participants in the conditional
feedback in frequencies condition (CF) switched significantly more often than participants in
the non-conditional feedback in frequencies condition (NCF), HSD = 23.24, p < .01. The
other five pairwise contrasts did not reach statistical significance. See Table 1 for an overview
of all post-hoc pairwise contrasts.
(Table 1 about here)
The above-mentioned main effect of block and the two-way interaction effect between
conditional format and numerical format were further qualified by a three-way (block by
conditional format by numerical format) interaction effect, F(2.16, 138.04) = 8.92, p < .01,
Cohen’s f = .37. Figure 3 clarifies this three-way interaction effect by showing how the two-
way interaction effect (conditional format by numerical format) differs from block to block.
(Figure 3 about here)
To interpret this three-way interaction effect, we examined how the two-way
interaction between conditional format and numerical format differed for each block. A
Bonferroni-Holm correction was used to control the Familywise Error Rate at 5%. For blocks
1 and 2, the interaction between conditional format and numerical format was not statistically
significant, F(1, 64) = 0.69, p = .41, Cohen’s f = .11, and F(1, 64) = 0.85, p = .36, Cohen’s f =
.11, respectively. This interaction, however, reached statistical significance in blocks 3 and 4,
F(1, 64) = 8.77, p < .01, Cohen’s f = .37, and F(1, 64) = 13.24, p < .01, Cohen’s f = .45,
respectively. These two-way interactions reveal that participants in the last 40 trials switched
more often when receiving conditional feedback in frequencies than when receiving non-
conditional feedback in frequencies, while the switch rates for participants receiving either
conditional or non-conditional feedback in percentages fell in between.
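The reported effect sizes can be related to the F statistics through partial eta squared. The sketch below assumes this standard conversion; whether the authors computed Cohen’s f in this way is not stated, and the function name `cohens_f_from_F` is hypothetical.

```python
import math


def cohens_f_from_F(F, df_effect, df_error):
    """Cohen's f implied by an F statistic, via partial eta squared."""
    # Partial eta squared: share of effect-plus-error variance due to the effect.
    eta_p2 = (F * df_effect) / (F * df_effect + df_error)
    # Cohen's f is the square root of the ratio eta^2 / (1 - eta^2).
    return math.sqrt(eta_p2 / (1 - eta_p2))


if __name__ == "__main__":
    # (F, df_effect, df_error) triples taken from the Results section.
    for F, df1, df2 in [(26.90, 2.16, 138.04), (5.69, 1, 64),
                        (8.92, 2.16, 138.04), (8.77, 1, 64), (13.24, 1, 64)]:
        print(F, round(cohens_f_from_F(F, df1, df2), 2))
```

Under this assumption the conversion agrees with most reported values to two decimals; small discrepancies are possible (the block 1 interaction comes out at about .10 against the reported .11).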
Probability Judgements
During trial 1, trial 40, and trial 80, after the experimenter revealed a non-winning
card and the participant made a final choice to either stay with his/her initial decision or to
switch, (s)he was asked to estimate the probability of winning. The probability judgements of
the participants are summarized in Table 2. Answers were classified as correct when the
participant answered “1/3” and had chosen to stay with the initial choice, or “2/3” when the
participant switched to the other remaining card. All other answers were classified as
incorrect and were further subdivided in the typical “1/2” answer category, or the category
“other incorrect answers”.
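The classification rule just described can be written out as a short sketch. The function name `classify_judgement` is hypothetical, and answers are assumed to have been converted to exact fractions beforehand.

```python
from fractions import Fraction


def classify_judgement(answer, switched):
    """Classify a posterior probability judgement given the final choice."""
    # The correct answer depends on the strategy chosen on that trial.
    correct = Fraction(2, 3) if switched else Fraction(1, 3)
    if answer == correct:
        return "correct"
    if answer == Fraction(1, 2):
        return "1/2"  # the typical equiprobability answer
    return "other"
```

Note that an answer of 1/3 counts as correct only after staying; the same answer after switching falls in the "other" category.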
(Table 2 about here)
The results show that in all conditions and for all three investigated trials, the majority
of participants judged the posterior probability associated with the winning card incorrectly as
1/2. Thus, most participants’ understanding of the probabilities underlying the MHD did not
improve through the experiment. Only two participants – one participant in the non-
conditional feedback in percentages condition (NCP) and one participant in the conditional
feedback in frequencies condition (CF) – judged the posterior winning probabilities correctly
for all three investigated trials.
The category of “other” incorrect answers contained one nonsensical answer and
answers from three other students that possibly expressed some understanding that
switching was advantageous, but whose justifications could not unequivocally be
considered correct.
Three Fisher-Freeman-Halton Tests were performed – one for each trial that involved
the probability judgement (i.e., trial 1, trial 40, and trial 80) – to test for statistically
significant differences in the number of times a particular probability judgement was given
between the four treatments. A Bonferroni-Holm correction was used to keep the Familywise
Error Rate under control at 5%. The results of these tests revealed that the type of probability
judgement (i.e., correct, “1/2”, or “other”) participants gave, was not systematically related to
the treatment participants were assigned to (trial 1: p = .38; trial 40: p = .54; and trial 80: p
= .64).
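The Bonferroni-Holm step-down procedure applied to these three p-values can be sketched as follows. This is a generic illustration of the procedure, not the authors' analysis script; the function name `holm_reject` is hypothetical.

```python
def holm_reject(p_values, alpha=0.05):
    """Return, per hypothesis, whether Holm's step-down procedure rejects it."""
    m = len(p_values)
    # Visit hypotheses from the smallest to the largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is compared with alpha / (m - k + 1).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # once one test fails, all larger p-values are retained
    return reject


if __name__ == "__main__":
    # p-values for trial 1, trial 40, and trial 80 as reported above.
    print(holm_reject([0.38, 0.54, 0.64]))
```

For the reported values, the smallest p-value (.38) already exceeds the first threshold of .05/3, so no null hypothesis is rejected.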
Discussion
The present study focused on the notorious MHD. The MHD was solved repeatedly
and the constantly updated feedback on the outcomes was manipulated in two different ways.
First, the feedback was either conditional or non-conditional. Second, the feedback was given
either in percentages or in frequencies. To the best of our knowledge, this is the first study
that directly addressed the differential effect of these forms of feedback on the MHD
performances of participants in a controlled and randomized experimental setting. We
investigated the effect of feedback on both the behavioural level and the level of
understanding.
As in previous MHD studies, participants of our experiment learned to switch by
solving the MHD repeatedly (Franco-Watkins et al., 2003; Granberg & Brown, 1995;
Granberg & Dorr, 1998; Herbranson & Schroeder, 2010; Klein et al., 2013; Petrocelli &
Harris, 2011; Slembeck & Tyran, 2004; Tubau & Alonso, 2003). Our results showed that
behavioural performance on the MHD is related to the feedback that is provided to the
participants. More specifically and in line with our expectations, at the behavioural level
participants learned most when they received conditional feedback in frequencies. In order to
keep the load on participants’ working memory capacity low, we provided the participants
with constantly updated conditional feedback. The conditional format of the feedback most
likely helped to avoid the distortion of participants’ memory for decision/outcome frequencies
(see Petrocelli & Harris, 2011). Furthermore, the results of the behavioural performances on
the MHD show that when giving conditional feedback, the numerical representation of
feedback is important. Previous research on the MHD (Aaron & Spivey-Knowlton, 1998;
Krauss & Wang, 2003; Tubau, 2008) and other conditional probability problems (e.g.,
Gigerenzer, 1991, 1994; Gigerenzer & Hoffrage, 1995; Moro et al., 2011) has shown that
participants perform better when the problems are presented in frequencies rather than in
percentages or proportions. In our study, we did not replicate this overall superiority of the
frequency manipulation as such.
Although the results of participants’ choices on the MHD indicated that cumulative,
constantly updated, and available feedback enhances participants’ behavioural performances
on the MHD, especially when this feedback is provided in conditional frequency format, the
results of the probability judgements clearly demonstrate that feedback does not help to
improve participants’ level of understanding of the underlying probabilities of the MHD. The
majority of participants still judged their posterior winning probability as 1/2. In sum, our
results provide further support for the earlier demonstrated dissociation between behavioural
performance on the MHD and the level of actual problem understanding (Franco-Watkins et
al., 2003; Stibel, Dror, & Ben-Zeev, 2009; Tubau & Alonso, 2003).
The question that this dissociation raises is what exactly is going on in terms of
participants’ underlying reasoning processes that can explain why understanding does not
improve.
A first possibility that can explain why the posterior probability judgements were
rarely correct in our research is participants’ probability understanding in general: Previous
MHD research showed that participants sometimes even have difficulties with understanding
the initial probabilities (Tubau, 2008), that is, the probabilities before a door is
eliminated and the actual dilemma takes place. Of course, in that case it is very unlikely
that a participant would give a correct posterior winning probability judgement. Note that
we did not manipulate the way in
which the posterior probability judgement questions were posed: We always formulated these
questions in terms of probabilities (i.e., “What is your chance of winning?”). However,
formulating the posterior probability judgement questions in terms of frequencies could have
positively influenced participants’ correct answer rates. Therefore, future research should
investigate whether this is the case.
A second explanation for the lack of problem understanding in our research refers to
the structure of the MHD itself, because previous research already showed that people do not
take into account the fact that the behaviour of the host is dependent on the initial choice of
the participant. Thus, participants fail to take into account the structure of the MHD (Burns &
Wieth, 2003, 2004; Idson et al., 2004; Krauss & Wang, 2003; Tubau & Alonso, 2003).
Feedback does not influence the structure of the problem, and does not make the underlying
structure of the MHD more explicit. This again explains why feedback may not have helped
to improve participants’ understanding of the MHD and its underlying probabilities.
Third, the finding that the large majority of our participants estimated the
posterior winning probabilities as 1/2 can be explained by two related cognitive processes,
namely partitioning-editing-counting (Fox & Levav, 2004) and updating (Baratgin, 2009;
Baratgin & Politzer, 2010). When the actual dilemma of the MHD occurs, one door has been
eliminated by the host and only two doors remain. At this point, a participant may engage in
the cognitive strategy of partitioning-editing-counting to deal with the probabilities, as
described by Fox and Levav (2004). Partitioning-editing-counting in the MHD means that in
order to calculate the posterior winning probabilities, the number of prizes (i.e., one) is
divided by the number of remaining options (i.e., two). As a consequence, the posterior
winning probabilities for both staying and switching are considered to be equal (i.e., 1/2). The
partition-edit-count strategy (Fox & Levav, 2004) is a specific example of the cognitive
strategy of updating (Baratgin, 2009; Baratgin & Politzer, 2010) which we already described
earlier. Thus, updating is a valid alternative explanation for the majority of 1/2 answers which
were obtained on the posterior probability judgement questions of our experiment.
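The contrast between the 1/2 answer and the correct posteriors can be made explicit with Bayes' rule. The following is a worked sketch under labelled assumptions: the contestant picks door 1, the host opens door 3, and the host chooses at random when both unchosen doors hide goats (an assumption not stated in the text above). The symbols $C_i$ (prize behind door $i$) and $H_3$ (host opens door 3) are introduced here for illustration only.

```latex
\begin{align*}
P(C_1) &= P(C_2) = P(C_3) = \tfrac{1}{3}, \\
P(H_3 \mid C_1) &= \tfrac{1}{2}, \quad P(H_3 \mid C_2) = 1, \quad P(H_3 \mid C_3) = 0, \\
P(H_3) &= \tfrac{1}{3}\cdot\tfrac{1}{2} + \tfrac{1}{3}\cdot 1 + \tfrac{1}{3}\cdot 0 = \tfrac{1}{2}, \\
P(C_1 \mid H_3) &= \frac{\tfrac{1}{3}\cdot\tfrac{1}{2}}{\tfrac{1}{2}} = \tfrac{1}{3} \quad \text{(staying)}, \\
P(C_2 \mid H_3) &= \frac{\tfrac{1}{3}\cdot 1}{\tfrac{1}{2}} = \tfrac{2}{3} \quad \text{(switching)}.
\end{align*}
```

Dividing one prize by two remaining doors, by contrast, treats $P(H_3 \mid C_1)$ and $P(H_3 \mid C_2)$ as equal, which is exactly the conditional information that the host's behaviour provides.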
Apart from the open questions about the reasons for the
dissociation between behavioural performance on the MHD and the level of understanding of
the problem, our study has some other limitations. One limitation of the current study
concerns the physical implementation of the experiment. First, the posterior winning
probabilities for both staying and switching could differ from the theoretical 1/3 and 2/3
probabilities. Although the completion of 80 trials resulted in posterior winning probabilities
of approximately 1/3 and 2/3, the posterior winning probabilities could deviate substantially
for smaller subsets of the 80 trials, for example for the first 10 trials. Furthermore, the
physical implementation led to (rather small) differences in posterior winning probabilities
between participants. Second, because of the physical implementation of the experiment, it
was not possible to use eye tracking methodology in order to control for participants’ number
of fixations and fixation lengths on the screen that provided the feedback. Despite these
methodological issues, we were convinced of the strength of a physical implementation of the
experiment because computerized versions may lead participants to believe that the computer
has been manipulated to change the winning card after the initial choice has been made,
whereas manipulating the outcomes of an MHD experiment is not possible in the case of a
physical implementation. Although an effect of the physical versus computerized
implementation on participants’ MHD performances is speculative because no prior research
has been done on this topic, we decided to rule out participants’ suspicion
about the experiment by choosing a physical implementation.
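The sampling variability mentioned in the first limitation can be roughly quantified. The sketch below assumes that each trial's outcome is an independent draw with a fixed winning probability, which is a simplification of the card-pile procedure; the function name `binomial_se` is hypothetical.

```python
import math


def binomial_se(p, n):
    """Standard error of an observed win proportion over n independent trials."""
    return math.sqrt(p * (1 - p) / n)


if __name__ == "__main__":
    # Theoretical switch-win probability of 2/3, for 10 versus 80 trials.
    for n in (10, 80):
        print(n, round(binomial_se(2 / 3, n), 3))
```

Under this assumption, the standard error of the observed switch-win proportion is roughly 0.15 after 10 trials and roughly 0.05 after 80, which illustrates why early feedback can deviate noticeably from the theoretical values.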
Another limitation of the current research is that posterior probability judgements of
participants were included as a dependent variable, but these mathematical answers do not
provide further insight into the reasoning process that led to the (in)correct answers.
Furthermore, it should be mentioned that correct probability judgements do not necessarily
reflect a real understanding of the MHD problem. For example, a participant might
accidentally give a correct probability judgement on the MHD when engaging in the
cognitive process of updating (see Baratgin, 2009). In order to get a better grip on the
reasoning processes participants engage in when confronted with the MHD, further research
should consider including open questions. Mixed methods research in which both quantitative
and qualitative dependent variables are included may lead us to a deeper understanding of
how people reason when confronted with the MHD (see Heyvaert, Maes, & Onghena, 2013).
More specifically, open questions (i.e., qualitative component) may show the reasoning
process a participant is involved in to give a particular probability judgement (i.e.,
quantitative component).
In sum, this paper investigated how and which kind of cumulative, constantly updated,
and available feedback influenced people’s performance on the MHD, both at the
behavioural level and at the level of understanding. First, the results showed that participants’ switch
rates increased most when they received conditional feedback in frequency format, which was
in line with our hypotheses. Second, the findings in this paper provided additional evidence
for the dissociation between behavioural performances on the MHD and actual understanding
of the problem (Franco-Watkins et al., 2003; Stibel et al., 2009; Tubau & Alonso, 2003).
Although cumulative, constantly updated, and available feedback, especially in conditional
frequency format, helped the participants to detect that switching was the optimal response on
the MHD, the feedback did not help the participants to improve their actual understanding of
why this is the case.
References
Aaron, E., & Spivey-Knowlton, M. (1998). Frequency vs. probability formats: Framing the
three doors problem. Proceedings of the 20th Annual Conference of the Cognitive
Science Society, 13-18. Retrieved from
http://dspace.library.cornell.edu/bitstream/1813/7326/1/98-1672.pdf
Baratgin, J. (2009). Updating our beliefs about inconsistency: The Monty-Hall case.
Mathematical Social Sciences, 57, 67-95. doi:10.1016/j.mathsocsci.2008.08.006
Baratgin, J., & Politzer, G. (2010). Updating: A psychologically basic situation of probability
revision. Thinking & Reasoning, 16, 253-287. doi:10.1080/13546783.2010.519564
Burns, B. D., & Wieth, M. (2003). Causality and reasoning: The Monty Hall dilemma.
Proceedings of the 25th Annual Conference of the Cognitive Science Society,
198-203. Retrieved from http://csjarchive.cogsci.rpi.edu/proceedings/2003/pdfs/57.pdf
Burns, B. D., & Wieth, M. (2004). The collider principle in causal reasoning: Why the Monty
Hall dilemma is so hard. Journal of Experimental Psychology: General, 133, 434-449.
doi:10.1037/0096-3445.133.3.434
Cosmides, L., & Tooby, J. (1996). Are humans good intuitive statisticians after all?
Rethinking some conclusions from the literature on judgment under uncertainty.
Cognition, 58, 1-73. doi:10.1016/0010-0277(95)00664-8
De Neys, W. (2005). Smarter and richer?: Executive processing and the Monty Hall dilemma.
Proceedings of the 26th Annual Conference of the Cognitive Science Society, 285-290.
Retrieved from http://www.cogsci.northwestern.edu/cogsci2004/papers/paper131.pdf
De Neys, W., & Verschueren, N. (2006). Working memory capacity and a notorious brain
teaser. The case of the Monty Hall dilemma. Experimental Psychology, 53, 123-131.
doi:10.1027/1618-3169.53.2.123
Fox, C. R., & Levav, J. (2004). Partition-edit-count: Naive extensional reasoning in judgment
of conditional probability. Journal of Experimental Psychology: General, 133, 626-
642. doi:10.1037/0096-3445.133.4.626
Franco-Watkins, A. M., Derks, P. L., & Dougherty, M. R. P. (2003). Reasoning in the Monty
Hall problem: Examining choice behavior and probability judgements. Thinking &
Reasoning, 9, 67-90. doi:10.1080/13546780244000114
Gigerenzer, G. (1991). How to make cognitive illusions disappear: Beyond “heuristics and
biases”. European Review of Social Psychology, 2, 83-115.
doi:10.1080/14792779143000033
Gigerenzer, G. (1994). Why the distinction between single event probabilities and frequencies
is important for psychology (and vice versa). In G. Wright & P. Ayton (Eds.),
Subjective probability (pp. 129-161). Chichester, England: John Wiley & Sons.
Gigerenzer, G., & Hoffrage, U. (1995). How to improve Bayesian reasoning without
instruction: Frequency formats. Psychological Review, 102, 684-704.
doi:10.1037/0033-295X.102.4.684
Granberg, D. (1999). Cross-cultural comparison of responses to the Monty Hall dilemma.
Social Behavior and Personality, 27, 431-438. doi:10.2224/sbp.1999.27.4.431
Granberg, D., & Brown, T. A. (1995). The Monty Hall dilemma. Personality and Social
Psychology Bulletin, 21, 711-723. doi:10.1177/0146167295217006
Granberg, D., & Dorr, N. (1998). Further exploration of two-stage decision making in the
Monty Hall dilemma. American Journal of Psychology, 111, 561-579.
doi:10.2307/1423551
Herbranson, W. T., & Schroeder, J. (2010). Are birds smarter than mathematicians? Pigeons
(Columba livia) perform optimally on a version of the Monty Hall dilemma. Journal
of Comparative Psychology, 124, 1-13. doi:10.1037/a0017703
Heyvaert, M., Maes, B., & Onghena, P. (2013). Mixed methods research synthesis:
Definition, framework, and potential. Quality & Quantity, 47, 659-676.
doi:10.1007/s11135-011-9538-6
Howard, J. N., Lambdin, C. G., & Datteri, D. L. (2007). Let’s make a deal: Quality and
availability of second-stage information as a catalyst for change. Thinking &
Reasoning, 13, 248-272. doi:10.1080/13546780600848049
Idson, L. C., Chugh, D., Bereby-Meyer, Y., Moran, S., Grosskopf, B., & Bazerman, M.
(2004). Overcoming focusing failures in competitive environments. Journal of
Behavioral Decision Making, 17, 159-172. doi:10.1002/bdm.467
Klein, E. D., Evans, T. A., Schultz, N. B., & Beran, M. J. (2013). Learning how to “make a
deal”: Human (Homo sapiens) and monkey (Macaca mulatta) performance when
repeatedly faced with the Monty Hall dilemma. Journal of Comparative Psychology,
127, 103-108. doi:10.1037/a0029057
Krauss, S., & Wang, X. T. (2003). The psychology of the Monty Hall problem: Discovering
psychological mechanisms for solving a tenacious brain teaser. Journal of
Experimental Psychology: General, 132, 3-22. doi:10.1037/0096-3445.132.1.3
Mazur, J. E., & Kahlbaugh, P. E. (2012). Choice behavior of pigeons (Columba livia), college
students, and preschool children (Homo sapiens) in the Monty Hall dilemma. Journal
of Comparative Psychology, 126, 407-420. doi:10.1037/a0028273
Moro, R., Bodanza, G. A., & Freidin, E. (2011). Sets or frequencies? How to help people
solve conditional probability problems. Journal of Cognitive Psychology, 23, 843-857.
doi:10.1080/20445911.2011.579072
Petrocelli, J. V., & Harris, A. K. (2011). Learning inhibition in the Monty Hall problem: The
role of dysfunctional counterfactual prescriptions. Personality and Social Psychology
Bulletin, 37, 1297-1311. doi:10.1177/0146167211410245
Slembeck, T., & Tyran, J.-R. (2004). Do institutions promote rationality? An experimental
study of the three-door anomaly. Journal of Economic Behavior and Organization, 54,
337-350. doi:10.1016/j.jebo.2003.03.002
Stibel, J. M., Dror, I. E., & Ben-Zeev, T. (2009). The collapsing choice theory: Dissociating
choice and judgment in decision making. Theory and Decision, 66, 149-179.
doi:10.1007/s11238-007-9094-7
Tubau, E. (2008). Enhancing probabilistic reasoning: The role of causal graphs, statistical
format and numerical skills. Learning and Individual Differences, 18, 187-196.
doi:10.1016/j.lindif.2007.08.006
Tubau, E., & Alonso, D. (2003). Overcoming illusory inferences in a probabilistic
counterintuitive problem: The role of explicit representations. Memory and Cognition,
31, 596-607. doi:10.3758/BF03196100
Appendix¹
Instructions
As you can see, three piles of cards lie on the table. We will always play with the three cards
that lie on top of the piles. Out of these three cards, one card is red and is the “winning card”.
The other two cards are black and are “losing cards”. Your aim is to collect as many winning
cards as possible. For each “winning card” you indicate correctly, you will be rewarded with €
0.10.
Each trial will take place in an identical way, as described below.
For each trial, you will be asked to pick one of the three cards. Make your choice for a card
explicit by mentioning the number of the card to the experimenter (imagine: you choose card
1). The experimenter is aware of the location of the red winning card. After you have
mentioned your choice to the experimenter, the experimenter will reveal one of the black non-
winning cards. This will never be the card that you have initially chosen. When the
experimenter has the choice between two black cards, he will randomly pick one of the two
black cards (imagine: card 2 is one of the losing cards and will be turned face up). Next, the
experimenter will give you the opportunity to either stay with your initial choice (card 1), or
to switch to the other remaining card (card 3). Make your final choice explicit by again
mentioning the number of the card to the experimenter. After you have communicated your
final choice for a card, the experimenter will turn face up the two remaining cards in order to
reveal whether you have won or lost the trial.
Each trial you win will be rewarded with € 0.10. Losing a trial will result in nothing (you will
never lose money).
1 Original instructions were in Dutch.
You will play this game 80 times. After each trial, you will receive feedback about your
previous performances on the computer screen. Have a look at this screen on a regular basis.
Before starting the game, you will play two practice trials.
If you still have any questions about the game after reading the above instructions, please do
not hesitate to communicate your questions to the experimenter.
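The procedure described in these instructions determines the posterior winning probabilities for staying and switching. As an illustration only, the following sketch applies Bayes' rule to the example used in the instructions (card 1 is chosen, card 2 is turned face up). It assumes that the red card is equally likely to be in each position; the function name `posterior_given_reveal` and the constant `CARDS` are hypothetical and not part of the study materials.

```python
from fractions import Fraction

CARDS = (1, 2, 3)


def posterior_given_reveal(first_pick, revealed):
    """Posterior probability that each card is red, given the initial
    choice and the black card turned face up by the experimenter.
    Assumes revealed differs from first_pick, as the instructions require."""
    joint = {}
    for winning in CARDS:
        prior = Fraction(1, 3)  # assumed: red card is equally likely anywhere
        # Cards the experimenter may reveal: never the chosen card and
        # never the red card; a random one if two black cards qualify.
        options = [c for c in CARDS if c != first_pick and c != winning]
        if revealed in options:
            likelihood = Fraction(1, len(options))
        else:
            likelihood = Fraction(0)
        joint[winning] = prior * likelihood
    total = sum(joint.values())
    return {card: p / total for card, p in joint.items()}


if __name__ == "__main__":
    for card, p in posterior_given_reveal(first_pick=1, revealed=2).items():
        print(f"card {card}: {p}")
```

For the example in the instructions, this gives 1/3 for staying with card 1 and 2/3 for switching to card 3, which matches the theoretical probabilities referred to in the discussion.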
Table 1
Post-hoc pairwise contrasts between the four treatments using Tukey’s Honestly Significant Difference (HSD)

Condition 1   Condition 2   Mean difference (1 - 2)   p
CF            CP             13.29                    .23
CF            NCF            23.24                    < .01
CF            NCP            13.24                    .23
CP            NCF             9.94                    .48
CP            NCP            -0.06                    1.00
NCF           NCP           -10.00                    .47

Note. CF = conditional feedback in frequencies condition; CP = conditional feedback in percentages condition; NCF = non-conditional feedback in frequencies condition; NCP = non-conditional feedback in percentages condition.
Table 2
Frequencies of participants per MHD probability judgement category

                        Probability judgement
Trial      Condition    Correct   “1/2”   “other”
Trial 1    NCF          0         16      1
           NCP          2         15      0
           CF           2         15      0
           CP           0         17      0
Trial 40   NCF          0         16      1
           NCP          1         16      0
           CF           1         16      0
           CP           0         15      2
Trial 80   NCF          0         16      1
           NCP          1         16      0
           CF           2         14      1
           CP           1         14      2

Note. NCF = non-conditional feedback in frequencies condition; NCP = non-conditional feedback in percentages condition; CF = conditional feedback in frequencies condition; CP = conditional feedback in percentages condition.
Figure 3. Mean frequencies of switching in the different feedback formats for each block of 20 trials (block 1 = trial 1-20; block 2 = trial 21-40; block 3 = trial 41-60; block 4 = trial 61-80). Error bars represent standard errors. NCP = non-conditional feedback in percentages condition; NCF = non-conditional feedback in frequencies condition; CP = conditional feedback in percentages condition; CF = conditional