
Journal of Experimental Psychology: Applied, 1999, Vol. 5, No. 2, 205-221

Copyright 1999 by the American Psychological Association, Inc. 1076-898X/99/$3.00

The Curse of Expertise: The Effects of Expertise and Debiasing Methods on Predictions of Novice Performance

Pamela J. Hinds
Stanford University

Experts are often called on to predict the performance of novices, but cognitive heuristics may interfere with experts' ability to capitalize on their superior knowledge in predicting novice task performance. In Study 1, experts, intermediate users, and novices predicted the time it would take novices to complete a complex task. In Study 2, expertise was experimentally manipulated. In both studies, those with more expertise were worse predictors of novice performance times and were resistant to debiasing techniques intended to reduce underestimation. Findings from these studies suggest that experts may have a cognitive handicap that leads to underestimating the difficulty novices face and that those with an intermediate level of expertise may be more accurate in predicting novices' performance.

Organizations often rely on experts to predict the performance of novices. Project managers predict team members' times to complete new work assignments. Marketers and designers predict consumers' abilities to master the use of new products. Teachers predict students' efforts to complete homework and finish exams. The predictions of experts influence the allocation of time and resources to projects, decisions about product design, and the pacing of students' lessons. Predictive errors can have significant negative consequences: employee alienation, failure to meet project deadlines, consumer dissatisfaction, and student frustration or boredom. This study draws on the literature in cognitive heuristics to explore whether experts, as compared with those with less expertise and novices themselves, are more or less accurate in predicting the performance of novices. The measure of performance in this research was the accuracy of estimating how long it would take novices to complete a new task. Inaccuracy can result from either overestimating or underestimating how long it will take novices to complete a task. Because people generally underestimate task performance times (see Buehler, Griffin, & Ross, 1994), this study also explored the effectiveness of debiasing techniques intended to improve underestimators' predictions of novice performance.

This research was partially supported by funding from Hewlett-Packard Laboratories. I am grateful to George Loewenstein, Sara Kiesler, Robert Kraut, and Robyn Dawes for their assistance during the formative stages of this work.

Correspondence concerning this article should be addressed to Pamela J. Hinds, Department of Industrial Engineering and Engineering Management, 496 Terman Building, Stanford University, Stanford, California 94305-4024. Electronic mail may be sent to [email protected].

Accuracy of Experts' Predictions

Logically, one may expect experts to be comparatively good at predicting novice performance. Experts have experience as learners and a body of task-specific knowledge on which to draw (Simon, 1989; Simon, Larkin, & McDermott, 1980). From their experience as learners, experts should be familiar with the problems a novice will encounter and should therefore be able to predict how much time it will take novices to learn. Presumably, experts can extrapolate from their task-specific knowledge and store of practical know-how to guess what difficulties novices face when performing a new task. Therefore, experts should make more accurate predictions than those with less expertise.



Research on cognitive heuristics suggests a different set of predictions: that experts may not be very good at predicting novices' performance after all. Three cognitive shortcuts on which experts may rely could lead them to underestimate the performance time of novices. First, experts may rely on the availability heuristic to predict novice performance. The availability heuristic is a cognitive shortcut in which people make predictions on the basis of the information that they have readily available in memory (Tversky & Kahneman, 1973). Experts, if they have learned a task longer ago than those with less expertise, may not have their own learning experience as readily available in memory as those with less expertise because people tend to have better recall of their more recent experiences (Hoch, 1984; Miller & Campbell, 1959). For example, Marcus (1986) studied people's recall of their own political attitudes and found that participants' recall of their political attitudes in 1973 more closely matched their attitudes at the time of making the estimate in 1982. Experts' recent experiences, which should be more available, are likely to include a disproportionate number of trouble-free, routine experiences with the task. If the availability heuristic influences estimators' predictions of novice task performance, then experts are likely to underestimate novice task times.

A second cognitive heuristic that may bias experts is anchoring and insufficient adjustment. Experts may anchor on their own performance and fail to adjust adequately for the differences in skills between themselves and novices. People often anchor on their own attitudes, beliefs, and knowledge and use this anchor as a basis for predicting what others believe, feel, and know (Davis, Hoch, & Ragsdale, 1986; Nickerson, Baddeley, & Freeman, 1987). This prediction is consistent with the curse of knowledge, a bias whereby knowledgeable people are unable to ignore information they hold that others do not (Camerer, Loewenstein, & Weber, 1989). Experts have a larger gap to bridge than those with less expertise when they try to predict the skill and knowledge level of novices. From the literature on anchoring and adjusting, experts should anchor on their own skill and knowledge, which is greater than that of others and requires a larger adjustment to accurately predict the inferior skill and knowledge of novices. Because the adjustment required is larger for experts, they should underestimate the performance of novices more than do those with less expertise.

A third cognitive shortcut that may interfere with experts' abilities to predict novice performance is oversimplification of the task. Langer and Imber (1979) argued that "the expert . . . may be in a position of knowing that he/she can perform the task, without any longer knowing how he or she performs it, that is, without knowing the steps or components that make up the performance" (p. 2015). As people become more expert, they automate the process (see Sternberg, 1997) and develop an oversimplified view of the task as the details of the task become less salient (Langer & Imber, 1979). In oversimplifying the task, experts may lose sight of the complexity faced by the novice performer.

If the availability heuristic, anchoring and adjusting, and oversimplification are involved when experts attempt to predict novice performance, then experts may not benefit from their superior experience and may seriously underestimate the performance of novices as compared with those with less expertise.

Novices also should underestimate task completion time because they have no task-relevant experience on which to draw, and they may have unrealistically simplistic views of the task. They are likely to generate overly smooth scenarios because of their lack of task-specific experience combined with their failure to take into account the unplanned difficulties that can arise with any new project or task (see Buehler et al., 1994). Therefore, in comparison with those with a little expertise (intermediate users), novices should be less accurate estimators.

Hypothesis 1: Experts and novices will underestimate novice task completion times more than will intermediate users.

    Improving Experts' Predictions

Methods for improving people's predictive and recall accuracy have been explored in numerous studies (e.g., Connolly & Dean, 1997; Glenberg, Sanocki, Epstein, & Morris, 1987; Osberg & Shrauger, 1986; Thompson & Mingay, 1991). Two such methods may improve the accuracy of


experts' predictions of novice performance by reducing reliance on cognitive heuristics. One method is to prompt people to recall their own experience and instruct them to use this experience to construct a scenario of how the task being estimated may progress. When studying the accuracy of estimating the time it would take programmers to write a new program, Buehler et al. (1994) found that research participants provided more accurate estimates of task completion times when they were required to construct scenarios linking past experience and the event to be estimated. To the extent that underestimation is caused by use of the availability heuristic, prompting experts to construct a novice scenario may make their own novice experience more available. To the extent that underestimation is caused by anchoring and adjusting, prompting experts to recall their past experiences may help them reset their anchor or make a bigger adjustment. More generally, prompting experts to recall their own experience and to develop scenarios may encourage them to process relevant information more systematically and rely less on cognitive heuristics (e.g., Tversky & Kahneman, 1974). Because novices have no task-specific experience on which to draw, asking them to reflect on their own task experience is unlikely to cause any increase in predictive accuracy.

Hypothesis 2: Experts' predictions of novice task times will increase in accuracy when they are asked to recall their own experience.

Hypothesis 3: Novices' predictions of novice task times will not increase in accuracy when they are asked to recall their own experience.

Another debiasing method that has been used in prior research (e.g., Engle & Lumpkin, 1992) is to present estimators with a list of problems novices encounter when they learn the task. Recent research indicates that this method, combined with the suggestion to remember one's own past experiences, results in significant improvements in recall accuracy. Providing a list of potential problems has significantly improved the accuracy of recall in a variety of tasks, including studying and talking on the telephone (Engle & Lumpkin, 1992). Buehler et al. (1994) used a similar debiasing method of providing information about the target's previous task experiences.

They found that time estimates increased and that observers relied heavily on the information about others' past problems to generate their predictions of the target's performance. This debiasing method may be especially effective for experts. Presenting experts with a list of novice problems might remind them of their own experiences as novices, making these experiences more available in memory. Presenting experts with a list of novice problems may also shift the experts' anchor from their own recent experience to their own experience as novices. This list also may provide experts with more information for the adjustment process as the steps in the process are made more salient. Because a list is a more explicit reminder of the novice experience than an admonition to recall one's own prior experience, one may expect the list method to be more effective at debiasing experts than the recall method.

Novices are unlikely to have anticipated the problems identified on the list and may increase their accuracy when given relevant information on the difficulty of the task. In fact, a list of problems significantly increases the relevant information available to novices in their estimation process. Therefore, the list method should increase the accuracy of novices' predictions by increasing the length of time they estimate the task will take.

Hypothesis 4: Experts will increase the accuracy of their novice task time predictions more when they are presented with a list of problems faced by novices than when they are asked to recall their own experiences.

Hypothesis 5: Novices will increase the accuracy of their novice task time predictions even more when they are presented with a list of problems faced by novices than when they are asked to recall their own experiences.

Overview of Studies

To test these hypotheses, two studies were performed. The first was a field study comparing novice, intermediate, and expert predictions of novice performance times on a series of tasks using a cellular telephone. In Study 1, three elicitation methods were compared: unaided, recall, and list. Study 2 was a laboratory experiment in which expertise was manipulated using a LEGO¹ assembly task (V-Wing Fighter; Lego Systems, Inc., Enfield, CT). In this between-subjects experiment, predictions based on the unaided elicitation and the list debiasing method were compared.

    Study 1

Study 1 was designed to compare people's predictions of novice performance when using an advanced cellular telephone technology. Salespeople (experts), customers (intermediate users), and novices estimated the novices' performance. This approach allowed for a measure of the accuracy of the predictions by comparing the prediction of each group with actual novice performance. Each group of research participants also was exposed to one debiasing method: the recall method (telling research participants to remember their own past experience) or the list method (presenting research participants with a list of novice problems). Both debiasing methods were modeled after approaches found successful in previous research with similar estimation tasks (e.g., Buehler et al., 1994; Engle & Lumpkin, 1992). It was expected that research participants in each group (experts, intermediate users, and novices) would tend to underestimate novices' task completion times. People generally make overly optimistic predictions, in part because they fail to take into account the number of unanticipated problems they have encountered on other similar tasks (Buehler et al., 1994; Kahneman & Tversky, 1979). Experts (e.g., Kidd, 1970), those with less expertise (e.g., Buehler et al., 1994), and novices would all be expected to underestimate task performance. Over and above this general bias toward underestimation, this study poses different predictions for groups with different levels of expertise (i.e., experts may underestimate more than others).

The design was a 3 (level of expertise) × 2 (debiasing condition) factorial in which every research participant made two predictions of novices' performance. The unaided-recall condition, assigned to half of the participants at each level of expertise, required these participants to first predict novices' task completion times on the basis of a description of the novices and the task (the unaided prediction). These same research participants then predicted novices' task completion times after exposure to a recall debiasing method. The unaided-list condition, assigned to the other half of the participants, required that these participants make the unaided prediction first, then make their estimate after exposure to a list debiasing technique. The novices always performed the task after giving their estimates.

There were two dependent variables of interest. Predictive accuracy was calculated as the difference between the prediction and the actual (median) time it took the novices to complete the task. Using predictive accuracy allows an evaluation of the extent to which performance time was underestimated (the predicted direction of the bias).²
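The predictive-accuracy measure described above can be sketched in a few lines. This is an illustrative reconstruction, not the study's analysis code; the completion times and predictions below are invented for the example.

```python
import statistics

# Invented novice completion times, in minutes (not the study's data).
actual_novice_times = [25.0, 28.0, 30.0, 32.0, 45.0]
median_actual = statistics.median(actual_novice_times)  # 30.0

# Predictive accuracy = prediction minus the median actual time.
# Negative values indicate underestimation, the predicted direction of bias.
predictions = {"expert": 10.0, "intermediate": 22.0, "novice": 12.0}
predictive_accuracy = {who: p - median_actual for who, p in predictions.items()}

print(predictive_accuracy)
# {'expert': -20.0, 'intermediate': -8.0, 'novice': -18.0}
```

A median, rather than a mean, is the sensible reference here because a few very slow novices (like the invented 45-minute case) would otherwise dominate the benchmark.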

Method

Novice task. A task was required in which experience ranged from none to extensive. Further, it was important that experts be knowledgeable in the domain and highly skilled at the task. At the time of this study, a field trial of an advanced cellular telephone and network services (Personal Communication Service, or PCS) was underway at Carnegie Mellon University. The task given to research participants was to estimate the median length of time it would take novices to use the advanced cellular telephone to store a greeting on voice mail, leave a message on voice mail, listen to the messages in a voice mail box, erase one message, and save one message. Novices generally had difficulty performing the task. They had to use an indicator light rather than a dial tone to recognize that the system was working. They had to learn to send a call using a keypad. They also had to navigate at least three levels in the voice prompt menu of the voice mail system.

Participants. Research participants were drawn from three different populations. Experts were drawn from the sales staff of a local cellular telephone company sponsoring the field trial of PCS. The sales staff in this organization is responsible for selling cellular telephone service directly to customers, training customers in the use of the phones, and providing ongoing customer support. Salespeople were recruited through the company field trial coordinator. Several of the volunteers were eliminated from the study because they were recently hired by the cellular company and did not have extensive knowledge of the technology used in the trial. The 18 remaining experts fit the definition of experts in that they had extensive task-related experience and had, as experienced mobile telecommunications salespeople, a large body of domain knowledge.

¹ LEGO is a trademark of the LEGO Group (Lego Systems, Inc., Enfield, Connecticut).

² Absolute error (the absolute value of estimate accuracy) also was used in the Study 1 analyses and resulted in the same pattern of results.

Research participants with an intermediate level of expertise were drawn from a sample of university staff, faculty, and students who were participating as customers in the PCS field trial. The customers in the field trial were approached in person and asked to participate in this study. Sixty-nine percent of the intermediate users were available to participate. The 44 intermediate users had 3-9 months of experience with the new cellular telephones and services.

Novice research participants were drawn from the pool of graduate students in nontechnical and nonartistic departments at the university. They were recruited at the graduate coffee house and at the graduate student lounges on campus. Novices were asked to complete estimates similar to the ones completed by the expert and intermediate research participants and to learn how to perform the tasks after completing the estimates. Thirty-four novice volunteers finished the study. Nine novices decided after volunteering that they did not have time to complete the study. A one-way analysis of variance (ANOVA) examining the difference in estimates between novices who completed the task and those who only provided estimates showed no difference between these groups.

    All research participants were given a gourmetchocolate candy bar for their participation in thestudy.

Procedure. Novice, intermediate user, and expert research participants were asked to predict the time it would take a novice to perform a task with three components: (a) store a greeting on voice mail, (b) leave a message on voice mail, and (c) listen to the messages in their voice mail box, erase the first one (from the experimenter), and save the message they recorded in the previous component. Research participants were given a booklet that described the novices as graduate students from nontechnical and nonartistic departments on campus and told them that the novices would be "sitting in a coffee house or in a shared office while learning to perform the tasks." Research participants were further informed that none of the novices had ever owned a cellular phone or subscribed to voice mail and that the novices would be instructed to learn how to perform the tasks from written instructions developed by the companies providing the products.

With their estimate booklets, all research participants were given a copy of the instructional materials novices would use. (These were the actual materials used by customers in the field trial.) The experimenter asked research participants to open the booklet and make two separate time estimates. Research participants within each level of expertise were randomly assigned to presentation sets in which questions were presented in one of the following two orders: Presentation Set 1 = unaided (Trial 1)-recall (Trial 2) and Presentation Set 2 = unaided (Trial 1)-list (Trial 2). The unaided prediction simply described the situation and then read as follows:

We would like your estimate of how long you think it will take these people to learn to perform tasks A, B, and C above. Please provide a "median estimate." This is an estimate such that you would expect half of the participants to take more time than your estimate and the other half to take less time than your estimate. Please fill in the statement below with your best estimate.

I estimate that it will take ___ minutes for a participant to learn how to perform tasks A-C above.

The above scenario is referred to as the unaided question because no additional information relevant to the task was elicited or provided in this question. The second method is referred to as the recall question, and the instructions read as follows:

We would now like you to recall your own learning experience and think of some of the problems you may have encountered when you attempted to learn to use the new cellular phones and voice mail. Keeping your own experience in mind, think about any points of confusion or difficulty you think the participants might encounter in performing tasks A-C.


Research participants were further instructed to make notes of any difficulties they thought the novices would encounter, to "construct a scenario of a participant learning to perform tasks A-C," and to incorporate this information into their estimate. Space was provided for research participants to make notes about their scenario. The rest of the instructions were the same as in the unaided prediction. The final (list) question was identical to the recall question with one additional component. After listing the tasks, the instructions stated that several graduate students were observed learning the tasks and that some had difficulty. Six observed difficulties were listed, such as "figuring out how to get to a previous menu when having made an error" and "clearing the display when making a dialing error." Research participants were instructed to "construct a scenario of a participant learning to perform tasks A-C" and to use this information in developing an estimate. As in the recall question, the rest of the question mirrored the unaided prediction.

After providing the two estimates, research participants were asked to answer a series of questions regarding their experience with cellular technology and network services. They were asked when they began using a cellular phone, how many calls per day they sent and received on their cellular phone, how many voice mail messages they retrieved each day, and several questions regarding their proficiency with the voice mail system.

After experts and intermediate users finished their estimates and other questionnaire items, they were dismissed. After novices completed their estimates, the novices learned to use the cellular handset (telephone) to record, listen to, erase, and save a message on the associated voice mail service. Novices were given instructional materials that had all the necessary information to complete the task. They were required to work alone until they finished the task. Start and end times were recorded.

Analysis. The primary dependent variable was predictive accuracy: research participants' predictions minus the median actual task completion time of the novices in the study. Repeated measures ANOVAs were used to control for multiple trials, and planned orthogonal contrasts (experts and novices vs. intermediate users; experts vs. novices) were made to determine the nature of the relationship between expertise and predictive accuracy. An alpha level of .05 was used for all statistical tests.
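The two planned contrasts named above can be expressed as weight vectors over the three groups. The sketch below uses the standard coding for these comparisons and invented group means; neither is taken from the article.

```python
groups = ["expert", "intermediate", "novice"]

# Contrast 1: experts and novices vs. intermediate users.
c1 = [1, -2, 1]
# Contrast 2: experts vs. novices (intermediate users get weight 0).
c2 = [1, 0, -1]

# Each planned contrast must sum to zero, and an orthogonal pair
# must have a zero dot product.
assert sum(c1) == 0 and sum(c2) == 0
assert sum(a * b for a, b in zip(c1, c2)) == 0  # orthogonality

# Contrast estimate for invented group means of predictive accuracy
# (minutes; negative = underestimation).
means = {"expert": -20.0, "intermediate": -8.0, "novice": -18.0}
estimate_c1 = sum(w * means[g] for w, g in zip(c1, groups))
print(estimate_c1)  # -22.0
```

A negative estimate on the first contrast (as with these invented means) is the pattern Hypothesis 1 predicts: experts and novices underestimating more than intermediate users.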

Results

The experts had used cellular phones a mean of 28.50 months (SD = 17.72 months) with a range of 12-72 months. The intermediate group had used cellular phones a mean of 9.32 months (SD = 10.42 months) with a range of 1-30 months. By selection, none of the novices had any experience using cellular telephony. The expert and intermediate groups differed in their number of months of use, t(53) = 4.91, p < .05; in the number of cellular calls they placed per day (M = 8.43 vs. M = 5.72), t(52) = 1.54, ns; and in the number of voice mail messages they received per day (M = 6.64 vs. M = 2.34), t(52) = 4.15, p < .05. In every case, experts were heavier users. By their own estimates, intermediate users had completed many calls (1,340 on average); however, experts had completed a much larger number (8,640 on average). It is unclear when the learning curve flattens out, but it is clear that experts had far more experience than did intermediate users. Experts, as compared with intermediates, also rated themselves as more proficient at using the voice mail system (M = 14.23 vs. M = 11.02), t(54) = 3.32, p