COGNITIVE SCIENCE Vol 22 (4) 1998, pp. 517-546 ISSN 0364-0213 Copyright © 1998 Cognitive Science Society, Inc. All rights of reproduction in any form reserved.

Pertinence Generation in Radiological Diagnosis: Spreading Activation and the Nature of Expertise

ERIC RAUFASTE

HELENE EYROLLE

CLAUDETTE MARINÉ

Université de Toulouse Le Mirail

An empirical study of human expert reasoning processes is presented. Its purpose is to test a model of how a human expert’s cognitive system learns to detect, and does detect, pertinent data and hypotheses. This process is called pertinence generation. The model is based on the phenomenon of spreading activation within semantic networks. Twenty-two radiologists were asked to produce diagnoses from two very difficult X-ray films. As the model predicted, pertinence increased with experience and with semantic network integration. However, the experts whose daily work involved explicit reasoning were able, in addition, to go beyond and to generate more pertinence. The results suggest that two qualitatively different kinds of expertise, basic and super, should be distinguished. A reinterpretation of the results of Lesgold et al. (1988) is proposed, suggesting that apparent nonmonotonicities in performance are not representative of common radiological expertise acquisition but result from the inclusion of basic and super expertise on the same curve.

I. INTRODUCTION

Problem solving consists of finding a path that gradually reduces the gap between a current

state and a goal state (Newell & Simon, 1972). However, in diagnosis the goal state is not

given and must be identified first by the physician: “Medical problem solving proceeds by

selecting a number of diagnostic hypotheses as possible goals, and then testing to see

whether one or more of the hypotheses selected can be justified.” (Elstein, Shulman, &

Sprafka, 1978, p. 21). Early diagnostic hypotheses guide data gathering (Barrows, Nor-

man, Neufeld, & Feightner, 1982; Lesgold, Rubinson, Feltovich, Glaser, Klopfer, & Wang,

Direct all correspondence to: Eric Raufaste, Laboratoire Travail et Cognition, UMR 5551 du CNRS / Université Toulouse Le Mirail, Maison de la Recherche, 5, Allées A. Machado, 31058 Toulouse Cedex, France.

1988) in such a way that further processes tend to eliminate early diagnoses rather than add

new ones (Elstein et al., 1978; Joseph & Patel, 1990; Sebillotte, 1984). Consequently, the

problem of how early hypotheses are selected is crucial in medical reasoning. More specif-

ically, given the large number of possible diagnostic schemata, and the fact that diagnosti-

cians often make use of irrelevant information (Doherty, Schiavo, Mynatt, & Tweney,

1981), an important question is how pertinent rather than non-pertinent hypotheses are

selected. Acquisition of expertise in medical diagnosis could also be related to the ability

to select appropriate information and hypotheses (Custers, Boshuizen, & Schmidt, 1996).

The present study was designed to investigate what we call the problem of pertinence gen-

eration. This involves both the question of how pertinence increases with experience and

the question of how pertinence is created by the expert’s mental processes. After reviewing

the related work, we present a semantic-network-based model of expertise acquisition and

an empirical study of radiologists’ diagnostic activity.

Levels of Expertise and Pertinence Generation

We begin with three questions. The first, how expertise develops, relates to the nature of

expertise. The second, how pertinence is generated, addresses the issue of the processes

that allow early selection of pertinent hypotheses. Third, we examine how recognition of

pertinence can be expected to develop.

1. How does Expertise Develop?

Expertise (and thus the cognitive processes underlying expert performance) is thought to

develop in various ways, depending on the authors’ assumptions. Dreyfus and Dreyfus

(1986) proposed a five-step model of expertise development. At the first level, the novice

uses context-independent rules without taking into account the pertinent cues that exist

within the context. The next two steps are rule-based reasoning refinements. At the fourth

level, the subject goes beyond analytical reasoning and uses holistic recognition, although

if necessary, she or he may determine a goal structure and return to analytic processing.

Finally, the expert uses intuitions that cannot really be explained. Even if deliberating, the

expert does not use reasoning but proceeds by comparisons between the current situation

and his reservoir of memorized situations. Thus, Dreyfus and Dreyfus’ model tends to pro-

mote a conception of expertise acquisition where automaticity is what characterizes the

real expert.

Chase and Simon (1973a, 1973b) described an unconscious and fast perceptual pro-

cessing that structures perception in chess players and retrieves memorized patterns for

subsequent slow conscious processing. According to Lesgold et al. (1988), the first kind of

processing in radiological diagnosis is perceptual processing. The outcome of perceptual

processing is a set of pointers directed toward the patterns that are most compatible with

the cues contained in the X-ray film. However, one problem in perceptual processing is dis-

crimination insufficiency, particularly in radiological diagnosis. Several diagnoses are

often compatible with all the data, so a simple recognition process is inadequate. Accord-

ing to Lesgold, deeper cognitive-based processing is then necessary. In this second step,

the cognitive apparatus delays the action that would be based solely on probabilities and

tries to resolve ambiguity by obtaining more information. Such cognitive processing would

be triggered by the outcomes of the perceptual process. Only experts would develop a

plain, cognitive way of reasoning because cognitive processing uses the outcomes of per-

ceptual processing as input, and because prior contact with a large number of experiences

would be necessary to develop perceptual processes. Concerning perceptual and cognitive

processing, Lesgold et al. (1988) proposed a three-stage model of expertise development.

They observed that intermediate individuals sometimes demonstrate worse performance

than novices and experts. Lesgold et al. attributed this temporary loss of performance to a

transition phase in the three-stage cognitive development. The three stages hypothesized

by Lesgold and his colleagues were: (1) Subsymbolic processes develop with the first

experiences. (2) Perceptual processes begin to produce outcomes. These outcomes trigger

cognitive processes which may start to gain importance. But at that time, they may still be

in conflict with the perceptual processes. (3) Cognitive processing is sufficiently developed

and the subjects are now able to tune their schemata and test them, so that perceptual and

cognitive processes are no longer in conflict.

Lesgold’s model is the opposite of Dreyfus and Dreyfus’s model. It emphasizes percep-

tual processing in novices, and then both perceptual and cognitive processing in experts. In

the Dreyfus and Dreyfus view, cognitive processing comes first, and then reasoning disap-

pears, leaving only perceptual processing. However, these models have in common the

point that both processes exist within some period of the acquisition of expertise.

2. How is Pertinence Generated?

Pertinence might be generated by perceptual or cognitive processes. The two kinds of mod-

els will now be reviewed.

In connectionist models (Rumelhart, McClelland, & the PDP Research Group,

1986), each concept is represented by a schema that can be viewed as a processor with an

activation level. Only the most activated schemata have a chance to control behavior.

When a cue matches parts of a schema, this increases its activation. Now, the most appro-

priate schemata are usually more likely to match environmental cues and thus to be

selected. In a connectionist view, generating pertinence may also result from connections

between schemata. When the activation of a schema goes over a certain threshold, it

spreads to the surrounding schemata. Balota and Paul (1996) found that the facilitation pro-

duced on a node by preliminary activation of several linked nodes (multiple priming) is

additive. Consequently, schemata that match several cues receive a large amount of activa-

tion, so they are reinforced more than non-pertinent schemata, which tend to be eliminated.

This competition among schemata could explain empirical studies showing that not all

available data are used by physicians (e.g., Hoffman, Slovic, & Rorer, 1968 for radiologists;

Weber, Böckenholt, Hilton, & Wallace, 1993 for general practitioners). In the

experts studied by Lesgold et al., pertinent diagnostic schemata appeared to be triggered

within the first two seconds. According to Lesgold et al., this could result from a pandemo-

nium decision process (Selfridge, 1959) based on the probabilities associated with the dis-

eases, and resident training is supposed to make those probabilities more accurate. In the


same way, Medin and Edelson (1988) found an effect of base-rate knowledge acquired

from experience on diagnostic judgments, and according to Hasher and Zacks (1984), the

frequency of occurrences associated with facts is spontaneously coded by memory. Never-

theless, several studies have shown that a human’s estimation of probabilities is biased

(e.g., Smedslund, 1963; Kahneman & Tversky, 1973). Moreover, Medin and Edelson

found some heuristics used by subjects that enable them to take rare cases into account, for

example, when confronted with specific cues. According to Weber et al. (1993), physicians

make a compromise between likelihood and clinical severity because they cannot afford to

miss diagnoses with severe consequences.

The previous examples suggest that perceptual processes are not sufficient for solving

every case. So, how might cognitive processing explain pertinence generation?

Patel and Groen (1986) suggested that relevant causal information is accessed from the

initial representation of the case. Using a propositional analysis, they rebuilt causal net-

works, and showed that expert physicians demonstrated a better ability than novices to dis-

tinguish relevant from irrelevant material. Moreover, experts made more inferences from

the relevant material. Lemieux and Bordage (1992) found that diagnosticians who pro-

vided correct diagnoses had a deeper representation of the problem, mainly because they

were able to transform elementary meanings into rich and well organized semantic net-

works. When they collected data, they used a more pertinent and diversified set of semantic

axes that enabled them to think in terms of opposing properties. Lemieux and Bordage

hypothesized that successful diagnosticians have a greater “ability to recognize pertinent

relationships between abstract properties” (1992, p. 198). The abstract properties are

thought to be linked inside a network of formal qualities that are then used to drive induc-

tive inferences. Successful diagnosticians would be able to follow a vertical pathway of

abstraction from the case, whereas unsuccessful physicians would not. The pathway would

be built by using formal qualities which allow the physician, when confronted with a

branch, to decide what direction has to be taken.

Because early generation of hypotheses takes only a few seconds, we should consider

mechanisms that can generate pertinence automatically rather than by controlled process-

ing (Shiffrin and Schneider, 1977). The Lemieux and Bordage approach is appealing but

its limitation lies in the fact that it is based upon deliberate choices by the subjects, so it

cannot explain automated processing such as early hypothesis generation.

3. How can Pertinence be Expected to Develop?

Few studies have addressed the issue of pertinence development. Experts have regularly

been found to be more able to recognize pertinence than novices (Patel & Groen, 1986;

Lesgold et al., 1988; Lemieux & Bordage, 1992; Shanteau, 1992) but we know nothing

about the monotonicity of the pertinence curve. In perceptual processing, pertinence comes

from the interdependence of schemata within networks. In novice radiologists, where per-

ceptual processes are dominant (Lesgold et al., 1988), one can expect pertinence to grow

correlatively with that interdependence. On the other hand, when cognitive processing is

dominant, we do not know how pertinence develops: it might grow monotonically with the

progressive acquisition of mastery in cognitive reasoning, or it might temporarily decrease

due to a transition period during which subjects would no longer use genuine perceptual

processing and would not yet be able to use cognitive processing. Patel and Groen (1991)

reported three experiments where intermediates used a greater number of both relevant and

non-relevant inferences than novices and experts. Experts used fewer irrelevant inferences

than both novices and intermediates. Nevertheless, the experts in these studies were resi-

dents. In radiology, beginners already are residents and it would be hazardous to generalize

any conclusions without first knowing more about what mental processes are involved.

Connectionist models seem to be oriented toward perceptual processing, and symbolic

models seem to be oriented toward cognitive processing. But Lesgold et al. (1988) empir-

ically showed the necessity of finding a framework of radiological expertise that can inte-

grate both types. We shall now present a conception of expertise acquisition, which is

based on spreading-activation theory (Collins and Loftus, 1975), and could fulfill that con-

dition.

Toward a Model of Pertinence Generation

The schemata (and the links between those schemata) that have been explicitly acquired

through a university education will be called canonical schemata and canonical links. Our

main assumption is that explicitly acquired canonical knowledge constitutes a pre-struc-

tured semantic network to which further acquisitions will be attached. An analogy could be

a small crystal in a salt solution: the crystalline network develops around the initial crystal.

In other words, semantic networks of explicit knowledge can be the basis for further

implicit acquisitions. We will examine two questions: how experience affects semantic

networks, and how this could explain previous experimental results about expertise.

How does Experience Affect Semantic Networks?

According to Rabinowitz and McAuley (1990), connectionist approaches toward concep-

tual knowledge acquisition involve three aspects: the number of available concepts, the

number of units interconnected by associative links, and the potential for knowledge struc-

tures to activate the units so they can be retrieved. Applying this view to the acquisition of

medical expertise, we suggest that in spite of the initial existence of canonical schemata

and links in the semantic network, poor spreading activation potential does not allow nov-

ices to easily access the concepts (Rabinowitz & Chi, 1987) that are necessary to establish

a pertinent diagnosis. Each time a resident manages to see the association between a com-

bination of features and the correct diagnosis given by the professor, a subsymbolic asso-

ciation is created or reinforced within the semantic network. When a feature matches one

or more low-level concepts in the semantic network, the correct diagnosis (the one pro-

vided by the professor) is also activated. This simultaneous activation is repeated several

times so that subsymbolic associations can link two concepts in the semantic network. The

same result is obtained when the correct solution is given, not by a professor, but by further

examination, another x-ray, or anything that can confirm the real solution. Such an associ-

ation may also occur with abstract concepts, between a non-observable sign and a disease

for example, or between a disease and a complication. Such a process is plausible, given

that Frick and Lee (1995) demonstrated that implicit learning of concepts may occur by

mere exposure to the instances of the concepts.

As Rabinowitz and Chi (1987) stated, “the amount of activation that spreads from any

node is dependent on both the strength of the associative link and the level of activation of

a node” (p. 92). These notions are not easily measured in human knowledge bases such as

the ones experts have. It is easier to determine, when a schema is triggered, whether activation

spreads or does not spread to surrounding ones. We defined as activable those links that,

through experience, became easy to activate; we called this property activability. As such,

the associative mechanisms produce two types of links that do not result from explicit

teaching alone (activable canonical links and operative links). Activable canonical links

are canonical links that already exist in the network but have been reinforced by experi-

ence. Operative links are new links, created only by an associative process. They connect

concepts that, canonically, were not directly connected. For instance, a radiologist sees a

film of a patient with an ablated lung and immediately starts to search for secondary loca-

tions of cancer. There is no direct causal relation between lung ablation and metastases.

The relation is indirect: cancer is a possible, although not unique, cause of lung ablation.

The search for metastases is a normal procedure when a cancer is suspected. Experience

has created a new direct link between lung ablation and metastases, an operative link. This

point of view is an extension of the experimental results obtained by Dagenbach, Horst,

and Carr (1990), who showed that “very extensive episodic association of some types can

result in an addition to semantic memory that will function similarly to existing structures,

at least with respect to producing automatic semantic priming of a lexical decision” (p.

589). Considering that retrieval of a concept within a semantic network is a probabilistic

function that depends on the level of activation of the concept, Rabinowitz and McAuley

(1990) implemented a computer simulation of how knowledge structure affects free recall.

They found that variations in the strengths of associations between concepts determine the

amount of information available for retrieval. On this basis, we hypothesize that activable

and operative links make canonical knowledge accessible and useful for calculation.
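The distinction between activable canonical links and operative links can be illustrated with a minimal sketch. It is not the authors' implementation: the class `SemanticNetwork`, the constant `ACTIVABLE_AFTER`, the idea of counting confirmed co-activations, and the label "emerging" for a new link that is not yet activable are all assumptions introduced here for illustration. The only points taken from the text are that canonical links exist before experience, that repeated co-activation makes a link activable, and that an operative link joins two schemata that were not canonically connected.

```python
# Hypothetical parameter: number of confirmed co-activations after which
# a link is treated as activable. The value is chosen for illustration only.
ACTIVABLE_AFTER = 3


class SemanticNetwork:
    """Toy network that tracks how often pairs of schemata are co-activated."""

    def __init__(self, canonical_links):
        # Canonical links come from explicit teaching and exist before experience.
        self.canonical = {frozenset(pair) for pair in canonical_links}
        self.count = {pair: 0 for pair in self.canonical}

    def coactivate(self, a, b):
        """Record one simultaneous activation of schemata a and b."""
        pair = frozenset((a, b))
        self.count[pair] = self.count.get(pair, 0) + 1

    def kind(self, a, b):
        """Classify the link between a and b in the vocabulary of the text."""
        pair = frozenset((a, b))
        if pair not in self.count:
            return "no link"
        activable = self.count[pair] >= ACTIVABLE_AFTER
        if pair in self.canonical:
            return "activable canonical" if activable else "non-activable canonical"
        # A link absent from canonical knowledge was created by association alone.
        return "operative" if activable else "emerging"


net = SemanticNetwork([("lung ablation", "cancer"), ("cancer", "metastases")])
for _ in range(3):
    net.coactivate("lung ablation", "metastases")  # shortcut built by experience
    net.coactivate("lung ablation", "cancer")      # canonical link reinforced

print(net.kind("lung ablation", "metastases"))
print(net.kind("lung ablation", "cancer"))
print(net.kind("cancer", "metastases"))
```

In this toy run the pair lung ablation and metastases, which has no canonical link, ends up joined by an operative link, while the canonical link that was never exercised stays non-activable.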

Figure 1 illustrates these concepts by depicting a part of a hypothetical semantic net-

work showing canonical schemata and the links between them. The link between schemata

a1 and b is weakly activable. We assume that, in such a case, no automatic spreading of

activation will occur. Thus, the only way for schema b to be activated is through deliberate

attention on the subject’s part. On the other hand, the link between schemata a2 and b is

highly activable because it has been reinforced by one or more experiences. If activated,

schema a2 will automatically activate schema b. Furthermore, in this example an operative

link had previously been created between a2 and c, which is two steps away. This operative

link will enable the subject to directly activate schema c after seeing a2. For a less-experi-

enced subject who has not developed that link, only b would be activated and the activation

of c would need a deliberate inference from b.
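The example above can be sketched as a small program. This is a simplification under stated assumptions, not the model itself: activability is reduced to a number per link, spreading is all-or-none above a hypothetical `THRESHOLD`, links are treated as directed from lower-level to higher-level schemata, and the link from b to c is assumed to be weakly activable (the text says only that reaching c from b requires a deliberate inference). The function name `automatically_activated` is likewise introduced here for illustration.

```python
THRESHOLD = 0.5  # hypothetical: below this value activation does not spread on its own

# Directed links (source, target) with assumed activability values.
EXPERIENCED = {
    ("a1", "b"): 0.2,  # weakly activable canonical link
    ("a2", "b"): 0.9,  # highly activable canonical link
    ("b", "c"): 0.2,   # canonical link, assumed weakly activable
    ("a2", "c"): 0.9,  # operative link (shortcut) created by experience
}

# A less-experienced subject has the same network without the operative link.
LESS_EXPERIENCED = {k: v for k, v in EXPERIENCED.items() if k != ("a2", "c")}


def automatically_activated(cue, links, threshold=THRESHOLD):
    """Return the schemata reached from `cue` through activable links only."""
    reached, frontier = set(), [cue]
    while frontier:
        node = frontier.pop()
        for (src, dst), activability in links.items():
            if src == node and activability >= threshold:
                if dst != cue and dst not in reached:
                    reached.add(dst)
                    frontier.append(dst)
    return reached


print(automatically_activated("a1", EXPERIENCED))
print(automatically_activated("a2", EXPERIENCED))
print(automatically_activated("a2", LESS_EXPERIENCED))
```

Seeing a1 activates nothing automatically, seeing a2 activates both b and c for the experienced subject, and only b for the less-experienced one, who would need a deliberate inference to reach c.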

If experience affects semantic networks by adding and reinforcing links, a second ques-

tion is: How could these changes in semantic networks explain empirical results concern-

ing expertise?

Figure 1. A Model of How Experience Affects the Knowledge Base. (Panel labels: Knowledge Base; Explicit Learning; canonical schemata; canonical links; activable canonical links. Legend: non-activable canonical link; activable canonical link; operative link (shortcut); low-level schema (e.g., figural feature); intermediate schema (e.g., semiological concept); high-level schema (e.g., diagnosis).)

Working memory can be seen as the activated part of long-term memory (Anderson,

1983; Ericsson & Kintsch, 1995; and for empirical evidence Cantor & Engle, 1993). In this

point of view, only subjects who have a lot of activated canonical links and operative links

will have a rich and deep representation: rich, because the number of activated links deter-

mines the number of concepts that will be included in the representation; deep, because

concepts having a higher level of abstraction will be included. This mechanism seems suf-

ficient to explain why studies have shown that experts’ representations are deeper and

richer than those of novices.

In order to represent a global property of semantic networks that could account for the

interdependence of the elements in the representation, we define a numeric function called

integration. The desired function has to fulfill two requirements: (1) for a constant number

of schemata, integration should increase with the number of links; and (2) for a constant

number of links, integration should decrease with the number of schemata. A cluster (Les-

gold et al., 1988) is defined as a set of schemata such that each schema in the cluster is con-

nected to at least one other schema in the same cluster. Each separate schema is counted as

one cluster. An integration mark (I) for each cluster is defined by taking the ratio of the

number of links (L) within the cluster to the number of schemata (S). We can generalize

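that formula for the whole representation:

$$I = \frac{\sum_{k} L_k}{\sum_{k} S_k}$$

where $L_k$ and $S_k$ are the numbers of links and schemata in cluster $k$.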

Such a function has the advantage of being richness-independent: 20 schemata con-

nected by 20 links (I= 1) are equivalent to five schemata connected by five links. The result

is equal to half the mean number of links departing from or arriving at a schema.
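The integration measure can be illustrated with a short sketch. Two assumptions are made here and are not taken from the text: a cluster is treated as a connected component of the network, and the whole-representation value is taken to be the total number of links divided by the total number of schemata, which is the reading consistent with the remark that the result equals half the mean number of links per schema. The function names `find_clusters`, `cluster_marks`, `integration`, and `ring` are hypothetical.

```python
def find_clusters(schemata, links):
    """Group schemata into clusters; an isolated schema forms its own cluster."""
    parent = {s: s for s in schemata}

    def find(s):
        while parent[s] != s:
            s = parent[s]
        return s

    for a, b in links:
        parent[find(a)] = find(b)  # merge the two clusters joined by this link

    groups = {}
    for s in schemata:
        groups.setdefault(find(s), set()).add(s)
    return list(groups.values())


def cluster_marks(schemata, links):
    """Integration mark L / S for each cluster."""
    marks = []
    for cluster in find_clusters(schemata, links):
        inside = [(a, b) for a, b in links if a in cluster and b in cluster]
        marks.append((cluster, len(inside) / len(cluster)))
    return marks


def integration(schemata, links):
    """Integration of the whole representation: total links over total schemata."""
    if not schemata:
        raise ValueError("integration is undefined for an empty representation")
    return len(links) / len(schemata)


def ring(n):
    """Helper: n schemata connected in a ring by n links."""
    nodes = [f"s{i}" for i in range(n)]
    return nodes, [(nodes[i], nodes[(i + 1) % n]) for i in range(n)]


big_nodes, big_links = ring(20)
small_nodes, small_links = ring(5)
print(integration(big_nodes, big_links), integration(small_nodes, small_links))
```

Both rings give the same value, which is the richness-independence property described above: the measure reflects how densely the schemata are linked, not how many of them there are.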

As we reported earlier, experts seem to demonstrate better pertinence than novices in

their selection of cues and diagnostic hypotheses. From a connectionist standpoint, pertinence

could come from the interrelations between the elements in the network: the more

schemata are linked together, the more pertinence will be demonstrated because pertinent

schemata will receive multiple primes and then receive more activation (Balota & Paul,

1996). Non-pertinent schemata could also be inhibited by concurrent ones, but here we


chose not to treat this issue. It is important to note that this relation between network inte-

gration and pertinence is not really causal: in novices, when integration increases, activa-

tion is roughly distributed among categories. In experts, more links connect schemata.

Many more combinations are then available and the distribution of activation among sche-

mata may be more precise and differentiated. Consequently, in a competition based upon

the level of activation, and for a given number of schemata, pertinent schemata are more

likely to be selected when more relations exist in the network. This is why, overall, we

can expect pertinence to increase with integration. However, this relation is not causal

because there could be cases where a novice activates only accurate categories, leading to

one hundred-percent pertinence despite a sparse network. Conversely, the same configura-

tion of cues might activate a non-pertinent hypothesis in a particular expert, despite a

dense network. Nevertheless, these cases should be unlikely enough that if experience

increases the number and the activability of links then pertinence can be expected to grow

accordingly.

From that global framework, we can derive seven predictions:

1. Experience increases the number of activable canonical links. With experience, each link is used more. Consequently, due to classical subsymbolic learning processes, its activability will grow. The increased activability of canonical links should allow schemata to be more easily retrieved. Therefore, a greater number of canonical links should appear within the verbalizations.

2. Experience increases the number of operative links. This hypothesis is justified by the same subsymbolic mechanisms as the previous one. The expected difference between the two kinds of links comes from the fact that most canonical links exist before experience because they are explicitly acquired. Subsymbolic processes do not create them but just make them more activable. On the other hand, operative links are supposed to be acquired genuinely by experience: they result from the frequent simultaneous activation of two schemata that canonically are not directly linked.

3. A correlation exists between the number of operative and activable canonical links. Because the numbers of activable canonical links and operative links are expected to grow due to the effect of experience, we can expect them to be correlated.

4. Experience increases the integration of the representation. Because the accessibility of schemata is expected to grow with experience, there should be, given a constant number of features, a greater number of relations among the elements of the representation. This should produce a rise in integration: that is, an increase in the ratio of links to schemata.

5. Pertinence is related to integration. Because pertinence is expected to come solely from the effect of spreading activation within the semantic networks, there should be a correlation between integration and pertinence. In particular, pertinent schemata should receive more activation because they are more compatible with the whole set of data. Moreover, if pertinence results from the network structure, then the correlation between pertinence and integration should be independent of experience.

6. The pertinence of the representation increases with experience. Because experience is expected to increase integration, and integration is expected to generate pertinence, more experience will lead to greater pertinence.

7. Accuracy depends on pertinence. Because pertinence is expected to determine the initial set of schemata, and given that prior studies have shown that subsequent processing essentially consists of eliminating superfluous diagnoses, pertinence should be an important factor for accuracy.

II. EXPERIMENTAL CONDITIONS

In order to test these hypotheses, we asked radiologists to examine some x-ray films and

make a diagnosis. We used a method similar to that designed by Lesgold et al. (1988) in

which several radiologists examined films of real cases.

Subjects

Our treatment of the subjects’ levels of experience was based on the work of Lesgold and

his colleagues (1988). Lesgold’s experts were considered as “outstanding” by colleagues

and had at least 10 years of experience after residency. But as not all experienced radiolo-

gists are outstanding, one could ask if this group was truly representative of the overall

population of radiologists. Consequently, we used four groups of subjects instead of three.

The first group consisted of four experts who had practiced radiology for at least 13 years

after residency. They were not only diagnosticians but also had an institutional role as

researchers and as teachers and trainers of residents. They were recognized by colleagues

as outstanding, among the top French specialists, so they were called super experts. They

had the kind of expertise that Lesgold et al. called “expert”. The second group consisted of

four other experts who had practiced radiology for at least six years after residency.

Because their professional roles involved neither teaching nor research, they were called

basic experts, that is, they were more representative of the overall population of radiolo-

gists. They were genuine practitioners. The third group consisted of eight “novices” who

were first- and second-year residents. The fourth group consisted of six intermediates who

were third- and fourth-year residents.

Task

Each physician was asked to examine two cases. In each case, the experimenter had the

physician follow a sequence of three steps.

1. In the first step, the physician was asked to produce a diagnosis from a posteroanterior

chest x-ray (the most typical type of x-ray). A radiologist typically has access to exter-

nal information, such as a clinical description, that constrains the exploration (Kundel

& Wright, 1969). In the experiment, however, we were interested in the cognitive con-

straints on exploration, so the x-rays were the only source of data provided to the par-

ticipating diagnosticians. The subjects were asked to think aloud and their verbalizations were recorded simultaneously on a Dictaphone and a tape recorder.

Radiologists are familiar with the use of a Dictaphone; the only difference here was

that subjects were asked to leave the recording button on the ‘run’ position. This way,

everything was recorded twice, as needed for the third step.


2. In the second step the x-ray was taken away from the physician, who was asked to

draw it on a paper and to name everything he or she could remember that was relevant

to analyzing the case. The verbalizations were tape-recorded. This step allowed the

experimenter to obtain the kind of information that is contained in topographical

aspects of the subject’s representation (Denis, 1989). These aspects can play a role in

reasoning, especially in the field of medical imagery.

3. In the third and final step, the subject commented on the playback of the Dictaphone

recording from the first step, while the film was presented again. The subject was

asked to explain what he or she had said in the first step. During this third step, the subject might intervene spontaneously or answer the experimenter’s questions. As in the

previous steps, verbalizations were tape-recorded.

X-ray Films

Using appropriate psychological criteria (discussed below), and with the helpful coopera-

tion of a confederate expert radiologist, we chose two films that were difficult enough to

generate differences among groups. The films were selected so as to present different kinds

of problems. The first was chosen because of salient cues that usually lead to another diag-

nosis which was in fact wrong. The second was expected to be difficult due to the simulta-

neous presence of several common but independent pathologies.

The first film (“Film 1”) showed a case of lymphangitic carcinomatosis with a left lower

lobe atelectasis. This type of cancer often causes nodules that are commonly visible on the

x-rays. On our film, the nodules existed but were very hard to discern because one was hid-

den by the heart and the images of the others were extremely faded. Typically, the salient

clues in this film, given the fact that no clinical data was available, evoke bronchial disease

and lead to a completely different kind of diagnosis such as chronic bronchitis.

The second film (“Film 2”) showed a complex case where four independent pathologies

could be diagnosed. The most salient was a lung ablation that had occurred two months

before, after cancer. The second pathology was an infection in the remaining lung. The fea-

tures that could lead to that diagnosis were not very hard to see but could be misinterpreted

when compared with pneumonectomy features. The third pathology was heart failure, and

the fourth was a hydatid cyst in the liver. The key aspect of this film was the overwhelming

salience of one of the pathologies, in this case the pneumonectomy.

Building Semantic Networks

Radiologists use a technical language with a codified vocabulary describing precise medi-

cal concepts. These concepts could be identified in the verbalizations. Our confederate

expert, who knew the films and all the charts, helped us list these schemata and determine

the logical relationships among them. The schemata constituted the nodes of networks, and

their relations provided the links. Semantic networks were built according to several rules;

a detailed example of semantic network building is provided in the Appendix.

Some of the rules involved data selection. A specific semantic network was built for

each film and subject, using the verbalizations from the three steps. Non-medical com-

ments and systematic evaluations were ignored. For example, reports about absent pathol-

ogies that are supposed to be verified systematically on that kind of film were ignored.

Technical evaluations of the film were ignored unless they were used in an inference

related to a suspected pathology. Residents are taught to systematically verify such things,

regardless of the particular case they have to diagnose. However, radiologists do not ver-

balize technical evaluations, and do not even do all the systematic verifications (Carmody,

Kundel, & Nodine, 1984; Lesgold et al., 1988). Taking them into account would have inap-

propriately penalized novices’ pertinence marks.

Graphical rules were also used. The nodes, which were the words or expressions corre-

sponding to the medical concepts, were typewritten. Each relationship between two nodes

(link) was depicted as a line, without regard to its inhibiting or activating nature. The sche-

mata describing diagnoses were enclosed in ellipses. Rejected diagnoses were crossed out.

The topography of the semantic networks was oriented vertically in order to represent the

depth of semantic categories in canonical medical knowledge. The visual clues (such as

lines, borders, and opacities) were near the top of the networks, and diagnostic schemata

were further down. Thus, an inductive inference was drawn as a line connecting two type-

written schemata, with the deepest schema lower down.

Data Coding

All of the medical concepts expressed were considered as canonical schemata and all of the

links expressed (through explicit or implicit inferences) were counted as canonical links.

The shortcuts (i.e., inferences in the first phase between two schemata that canonically

were not directly linked) were counted as operative links. Intermediary schemata that had

remained implicit even after the last phase were restored. For instance, an expert said “a

North African patient”. Only one thing could enable him to say that: the name written on

the film was “Mohammed X”. Consequently, that cue was assumed to have been taken into

account by the expert and was introduced into his semantic network.

As previously discussed, the integration score for a subject and a film was calculated by

dividing the number of links in the verbalizations by the number of schemata.
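The integration score can be illustrated with a short computation. The following Python sketch assumes that a network is stored as a list of schema labels plus a list of links (pairs of labels); the function name `integration_score` and the toy network are illustrative assumptions, not the coding materials used in the study.

```python
def integration_score(schemata, links):
    """Number of links divided by number of schemata, for one subject and one film."""
    if not schemata:
        raise ValueError("a network needs at least one schema")
    return len(links) / len(schemata)


# Hypothetical toy network: two visual cues, one intermediate finding, one diagnosis.
schemata = ["opacity", "line", "atelectasis", "carcinomatosis"]
links = [
    ("opacity", "atelectasis"),
    ("line", "atelectasis"),
    ("atelectasis", "carcinomatosis"),
]

score = integration_score(schemata, links)  # 3 links over 4 schemata
```

A score above 1 therefore means that, on average, each schema takes part in more than one expressed inference.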

We now turn to the issue of methods for evaluating pertinence. Patel and Groen’s (1986) definition of relevancy was based on the matching of the findings to the correct diagnosis, and what led to that diagnosis. They considered a proposition to be relevant when it matched a canonical structure (given by their confederate expert) that led to the right solution. With this definition, all propositions leading to other diagnoses are irrelevant. Our problem here, however, was less that of understanding how rules can lead to the correct solution than that of understanding how relevant rules are selected. Patel and Groen

considered that a rule is applied when either the premise or the conclusion of the rule

matches a clinical datum. Actually, a single clinical cue may match with the premise of

numerous rules, and most of the time, a physician can yield only a set of possible diag-

noses. Consequently, all diagnoses are relevant until new data can eliminate some of them.

Therefore, we preferred to evaluate relevancy from the standpoint of stimuli rather than

correct diagnoses: a representation was considered as more pertinent when it contained

more clearly visible features or legitimately inferable cues from the film. For instance, on

the lymphangitis film, the diagnosis of chronic bronchial disease was incorrect. But this


diagnosis could be legitimately evoked because the film contained salient clues that were

relatively typical of that pathology, and did not contain any true incompatibilities. We

developed a list of schemata used by each subject. For each schema we decided, with our

confederate expert, whether it was legitimate to evoke that schema. The pertinence value

was determined by calculating the proportion of pertinent schemata in the verbalizations.
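As a minimal sketch of this computation, the Python fragment below assumes that the legitimacy judgments for one subject are stored as a mapping from schema label to a Boolean. The function name and all labels except chronic bronchial disease are hypothetical placeholders.

```python
def pertinence_value(legitimacy):
    """Proportion of the evoked schemata that were judged legitimate to evoke."""
    if not legitimacy:
        raise ValueError("no schemata were evoked")
    return sum(legitimacy.values()) / len(legitimacy)


# Hypothetical judgments for one subject on Film 1. Chronic bronchial disease
# is incorrect as a diagnosis but still counts as legitimate to evoke.
judged = {
    "chronic bronchial disease": True,
    "schema_b": True,
    "schema_c": True,
    "schema_d": False,  # placeholder for a schema judged not legitimate
}

value = pertinence_value(judged)  # 3 legitimate schemata out of 4
```

Note that an incorrect diagnosis can raise the pertinence value, which is the point of evaluating relevancy from the standpoint of the stimuli rather than the correct diagnoses.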

In contrast to pertinence, accuracy was taken to express whether the subject had found

the correct diagnoses. Our confederate expert established a comprehensive list of correct

diagnoses. For x-ray 1, two diagnoses were correct: carcinomatous lymphangitis and left

lower lobe atelectasis. For x-ray 2, the four correct diagnoses were pneumonectomy,

hydatid cyst in the liver, pneumonia, and heart disease. The level of accuracy was deter-

mined by tallying the number of correct diagnoses each subject had stated.

Finally, in order to compare our results with Lesgold’s reference study (1988, pp. 318-319), we measured the longest chain of reasoning (the longest sequence of links), the number of findings in the biggest cluster, the number of different clusters, and the percentage of findings connected to other findings.
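These four measures can be read as standard graph quantities. The Python sketch below is one possible operationalization, under assumptions that are ours and not necessarily those of Lesgold et al.: links are directed pairs running from the shallower to the deeper schema, the network is acyclic, a cluster is a connected component containing at least one link, and chain length is counted in links. The function name `network_measures` is hypothetical.

```python
from collections import defaultdict


def network_measures(findings, links):
    """Return (longest_chain, biggest_cluster, n_clusters, pct_connected)."""
    children = defaultdict(list)   # directed view, used for chains
    neighbours = defaultdict(set)  # undirected view, used for clusters
    for shallow, deep in links:
        children[shallow].append(deep)
        neighbours[shallow].add(deep)
        neighbours[deep].add(shallow)

    # Longest chain: longest directed path, by memoised depth-first search.
    depth = {}

    def chain_from(node):
        if node not in depth:
            depth[node] = max((1 + chain_from(c) for c in children[node]), default=0)
        return depth[node]

    longest = max((chain_from(f) for f in findings), default=0)

    # Clusters: connected components, ignoring isolated findings.
    seen, cluster_sizes = set(), []
    for f in findings:
        if f in seen or not neighbours[f]:
            continue
        stack, size = [f], 0
        seen.add(f)
        while stack:
            node = stack.pop()
            size += 1
            for n in neighbours[node]:
                if n not in seen:
                    seen.add(n)
                    stack.append(n)
        cluster_sizes.append(size)

    connected = sum(1 for f in findings if neighbours[f])
    pct = 100.0 * connected / len(findings) if findings else 0.0
    return longest, max(cluster_sizes, default=0), len(cluster_sizes), pct


# Hypothetical network: a three-finding chain, a separate pair, one isolated finding.
findings = ["a", "b", "c", "d", "e", "f"]
links = [("a", "b"), ("b", "c"), ("d", "e")]
longest, biggest, n_clusters, pct = network_measures(findings, links)
```

In this toy case the longest chain has two links, the biggest cluster holds three findings, there are two clusters, and five of the six findings are connected.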

In order to evaluate coding reliability, two confederate expert radiologists built seman-

tic networks from eight protocols (one per group and per x-ray), using the rules presented

above. Each schema and each link that was cited by at least one coder received a value for

each coder: “1” if cited, “0” if not. Coders also rated schema pertinence (“1” if pertinent,

“0” if not). For each of these three variables, we computed the statistic G, developed by

Light (1971) to compare the joint agreement of several observers with a standard. In the

present study, the standard was the initial coding.

Results

First we report the results related to the hypotheses derived from our model. Then we

present data for comparison with the study by Lesgold et al. (1988).

Effects of Experience on Canonical and Operative Links

The index of agreement for schema coding was Light’s G(226) = 3.82 (p < .0001). The index for link coding was G(245) = 11.986 (p < .0001). As shown in Table 1, and as expected with experience, there was a monotonic growth in the number of activable canonical links (F(3,18) = 7.160; p < .003) and in the number of operative links (F(3,18) = 8.13; p < .002) demonstrated by subjects.

TABLE 1 Numbers of Canonical and Operative Links, by Level of Expertise

                            Novices      Intermediates  Basic Experts  Super Experts
                            Mean (SD)    Mean (SD)      Mean (SD)      Mean (SD)

Number of canonical links   17.3 (9.2)   22.2 (6.4)     33.75 (12.7)   42.5 (11.7)
Number of operative links   0.25 (0.4)   0.5 (0.8)      1.25 (0.6)     2.125 (0.9)

TABLE 2 Correlations between Canonical and Operative Links, by Level of Expertise

                                                    Novices      Intermediates  Basic Experts  Super Experts

Correlations between canonical and operative links  r = -.0975   r = .5261      r = .8527      r = .7490

Intergroup comparisons showed that means significantly differed only for noncontiguous groups. For instance, novices differed from basic experts (p < .005) but did not differ significantly from intermediates. The absolute number of operative links remained small, even in super experts. Only the comparisons between (a) novices and super experts, and (b) intermediates and super experts showed a significant difference (p < .05).

Overall, the expected correlation between the numbers of canonical and operative links was confirmed by Pearson’s coefficient analysis (r(20) = .765; p < .001). Correlations involving groups of four subjects cannot be regarded as reliable. Nevertheless, after analysis of within-group correlations (see Table 2), the correlation between the numbers of canonical and operative links did not even exist in novices (r(6) = -.0975). It was higher in intermediates (r(4) = .5261), although not statistically significant. It was relatively high in experts (r(6) = .814; p = .014), although not significant if we consider the basic and super experts separately.

This apparent growth of the correlation between the numbers of canonical and operative

links requires validation by additional research. However, our model suggests a possible

interpretation: For novices, many canonical links are not very activable and there are very

few operative links. Therefore, canonical links are used so they can be reinforced. Con-

versely, the generation of operative links requires sufficient prior activability of canonical

links. In intermediates, many canonical links become activable so that the initial operative

links can be built. Because experience reinforces both types of links, a correlation gradu-

ally emerges. In experts, the networks are fully formed. They are dense and highly activa-

ble. The subjects can demonstrate both types of links, and the correlation produced by

experience is more fully manifested.

Effects of Experience on the Integration and Pertinence of the Representation

The expected continuous growth of integration, shown in Table 3, was highly significant (F(3,18) = 9.777; p < .0005). This suggests that as experience grows, subjects make more inferences. This growth appears in Figure 2, which shows that from the same stimulus, the experienced radiologists built representations that were both richer and more integrated.

The index of agreement for coding pertinence was G(207) = 10.126 (p < .0001). As expected, pertinence also rose with experience (F(3,18) = 8.586; p < .001). Although Table 3 seems to show that basic experts demonstrated a lower mean pertinence than intermediates, this difference was not significant. This lack of a difference suggests that the efficiency of the mechanisms (whatever they might be) that generate pertinence in levels of expertise ranging from novice to basic expert reaches a limit beyond which only super experts go. The nature of this limit, and the solutions super experts find in order to go beyond it, will be discussed below.

Figure 2. Examples of Semantic Networks on Film 1

Relationship between Integration and Pertinence

As expected, the global correlation between integration and pertinence was confirmed (r(20) = .776; p < .001). The correlation between integration and pertinence might be explained by the genuine effect of experience. Studies have shown that experts usually

TABLE 3 Mean Values of Integration and Pertinence, by Level of Expertise

                       Novices         Intermediates   Basic Experts   Super Experts
                       Mean (SD)       Mean (SD)       Mean (SD)       Mean (SD)

Value of Integration   0.783 (0.162)   0.993 (0.104)   1.071 (0.92)    1.162 (0.90)
Value of Pertinence    0.745 (0.13)    0.927 (0.058)   0.910 (0.024)   0.978 (0.020)

make more inferences than novices. Experts also demonstrate greater pertinence (e.g., Patel & Groen, 1986). These two characteristics may be sufficient to generate a correlation. An analysis of covariance was conducted in order to obtain the real effect of experience, controlling for integration. In these conditions, pertinence was no longer significantly dependent upon experience (F(3,17) = 1.536; p = .241), and integration seemed sufficient to explain the variations in pertinence (F(3,17) = 5.384; p = .033). This result is interesting because it suggests that pertinence may be an effect of integration rather than experience. Nevertheless, if we consider the within-group correlation, we find a monotonic decrease as experience increases: from r(6) = +.62 in novices (p = .050) to r(2) = -.77 in super experts. Even if correlations within such small groups are unreliable, our results suggest that integration may be a source of pertinence in novices but not in experts. These ideas will be developed in the discussion.

Origins of Precision and Accuracy

Film 1 turned out to be too difficult to generate differences among the subjects. Only one

subject in each group, even for super experts, found the correct diagnosis. All the other

subjects found none. Therefore, we used only Film 2 for statistical tests involving accu-

racy. Nevertheless, an analysis of the diagnoses proposed by the subjects for Film 1 con-

firmed the expected sources of error. The critical clues were both atypical and not very

salient, and conversely, salient clues typically led to an incorrect solution.

On Film 2, we would have been able to say that accuracy increased with experience had

the basic experts performed a little better (F(3,18) = 3.135; p = .0512). As Table 4 shows,

their performance was lower than that of the novices.

The qualitative analysis of the protocols showed that the basic experts were affected by

the overwhelming salience of some clues, so they did not detect more subtle features. This

finding is particularly clear when we examine the detection rate for the hydatid cyst, on

Film 2. Fifty percent of the novices saw it, compared to only 17 percent of the intermedi-

ates and none of the basic experts. In contrast, 75 percent of the super experts saw it. After

the experiment, when the critical clues were pointed out, the basic experts immediately

interpreted them correctly. They had not seen them before because they were masked by

the image of pneumonectomy. Novices, as they are taught to be, were globally more sys-

tematic in their exploration. They saw the clue but they had difficulty interpreting it. We

TABLE 4 Accuracy and Precision on Film 2, by Level of Expertise

                               Novices         Intermediates   Basic Experts   Super Experts
                               Mean (SD)       Mean (SD)       Mean (SD)       Mean (SD)

Number of accurate diagnoses   1.63 (1.06)     2.33 (1.03)     1.5 (0.58)      3.25 (0.96)
Precision of diagnoses         43.13 (30.01)   49.37 (16.29)   91.68 (16.65)   49.10 (9.69)

TABLE 5 Diagnoses Proposed for Film 2 (with percentages of subjects within groups)*

Diagnoses Novices Interm. B. Experts S. Experts

Pneumonectomy Pneumonectomy

Cancer

Metastases

Atelectasis

Right pleural effusion

Hematic effusion

Surgical complication

Mcleod’s Syndrome

Focus Focus Pneumonia

Cancer or metastases

Inhalation

Edema

Bronchopleural fistula

Tuberculosis

Gynecomastia

Heart failure Heart failure

Hydatid intracardiac Dissemination

Pericarditis

Tumor invading the pericardial sac

Hydatid cyst in the liver Hydatid cyst

Amebic cyst

No interpretation

Other Left pleural effusion

Left supraclavicular nodule

62.5

25

12.5

37.5

62.5

12.5

12.5

37.5

12.5

12.5

25 33.3 25

83.3 100 100

16.7 75

16.7

100

16.7

50 25

33.3

16.7 25

16.7

16.7

37.5 16.7

12.5

25

12.5

12.5

75

25

50

25

25

50

50

25

25

75

Note. *Correct diagnoses are shown in italics

can also examine how the salient clues of pneumonectomy were processed. All of the

experts found the right diagnosis (Table 5) nearly instantaneously. Among the residents,

two intermediates out of six rejected the diagnosis and three novices out of eight did not

even evoke it. These two examples show that novices detected more clues than basic

experts, probably due to their more systematic exploration, but also that their problem was

interpreting the clues they had found. In contrast, experts processed features in a better

way, but sometimes did not detect all critical clues.

Sixteen diagnoses were proposed by novices, 12 by intermediates, only 4 by basic

experts, and 12 by super experts. However, this variation appears to depend on the type of

pathology involved. For features involving pneumonectomy, only two diagnoses were

offered by super experts, and one by basic experts. These were correct diagnoses. In con-

trast, diversity among residents came from the inability to discard inappropriate diagnoses.

TABLE 6 Correlations between Pertinence or Richness, and Accuracy

Correlation between
Accuracy and          Novices     Intermediates   Basic Experts   Super Experts

Pertinence            .3105       -.4665          -.9853          .1741
                                                  p = .007

Richness              .5809       .9055           .9642           .9619
                      p = .066    p = .006        p = .018        p = .019

Six diagnoses were proposed by novices and five by intermediates. For intermediates,

pleural effusion was the most frequently cited diagnosis for the pneumonectomy image,

and it was wrong. As one can see in Table 5, the main variation among super experts came

from the right diagnosis, or from the activation of knowledge about rare complications of

the right diagnoses: pneumonectomy (e.g., bronchopleural fistula), cancer (e.g., tumor

invading the pericardial sac), or hydatid disease (e.g., hydatid intracardiac dissemination).

Only basic experts had a highly focused analysis. In order to evaluate accuracy relative to

variation, we computed a new variable, called “precision”, which was the ratio of the num-

ber of correct diagnoses made by a subject to the total number of diagnoses she or he made

(Table 4). Precision varied with experience (F (3,18) = 4.691; p = .0137), and basic experts

obtained the best performance (p’s were below .05 on Student-Newman-Keuls tests com-

paring them with every group). This probably means that basic experts restrict their infer-

ences to typical diagnoses, in contrast to super experts who tend to explore all possibilities.
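The relation between accuracy and precision can be made concrete with a small sketch. The Python fragment below takes the four correct Film 2 diagnoses from the text and applies them to a hypothetical subject; the function name `precision` is ours, and expressing the ratio as a percentage is an assumption based on the magnitudes reported in Table 4.

```python
FILM_2_CORRECT = {
    "pneumonectomy",
    "hydatid cyst in the liver",
    "pneumonia",
    "heart disease",
}


def accuracy(proposed, correct):
    """Number of correct diagnoses the subject stated."""
    return len(set(proposed) & set(correct))


def precision(proposed, correct):
    """Correct diagnoses as a percentage of all diagnoses the subject proposed."""
    distinct = set(proposed)
    if not distinct:
        return 0.0
    return 100.0 * len(distinct & set(correct)) / len(distinct)


# Hypothetical subject: two correct diagnoses among four proposals.
proposed = ["pneumonectomy", "pneumonia", "pleural effusion", "amebic cyst"]
acc = accuracy(proposed, FILM_2_CORRECT)
prec = precision(proposed, FILM_2_CORRECT)
```

A subject who proposes few diagnoses can thus reach high precision with low accuracy, which is the pattern reported for the basic experts.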

The overall correlation between accuracy and pertinence was poor (r(20) = .3586, p = .051). As Table 6 shows, the correlation became increasingly negative with experience, except in super experts. The gap between basic and super experts was considerable.

In order to find better determinants of accuracy, we also hypothesized that in addition to

pertinence, the richness of the representation would play a role in accuracy. Pertinent find-

ings were the ones that, given the available data, could be assumed to lead to accurate diag-

noses. If among the salient features, the pertinent findings were not sufficient for the

diagnostician to reach an accurate diagnosis, one can assume that accuracy came from the

deliberate search for new features on the film, or from the deliberate search for non-obvi-

ous diagnoses in memory. Both strategies can be expected to lead to a greater number of

activated schemata and expressed links. To test this idea, the number of schemata and the

number of links were totaled to make the “richness” variable. Thus, a representation was

considered rich when it included many schemata and links.
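Richness differs from integration in that it is a sum rather than a ratio. The following Python sketch, with hypothetical counts and function names, shows two networks that have the same integration score but different richness.

```python
def richness(n_schemata, n_links):
    """Total number of schemata and links in a representation."""
    return n_schemata + n_links


def integration(n_schemata, n_links):
    """Links per schema, as defined for the integration score."""
    return n_links / n_schemata


# Hypothetical networks: (number of schemata, number of links).
small = (4, 3)
large = (8, 6)

same_integration = integration(*small) == integration(*large)
richness_small = richness(*small)
richness_large = richness(*large)
```

The larger network is twice as rich although it is no more integrated, so the two variables can carry different information about a subject's representation.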

As expected, we found (see Table 6) a significant correlation between the richness of the representation and accuracy (r(20) = .7266; p < .001). Because accuracy appeared to increase with experience, we analyzed the variance of accuracy produced by experience while controlling for richness. We found that richness was the main source of accuracy (F(3,18) = 30.06; p < .0001). Experience alone was not significant (F(3,18) = 2.662; p = .081). Nevertheless, the film was very difficult because neither the obvious cues nor the


TABLE 7 Results for Comparison with the Lesgold et al. (1988) Reference Study

                                             Novices       Intermediates   Basic Experts   Super Experts
                                             Mean (SD)     Mean (SD)       Mean (SD)       Mean (SD)

Number of findings                           21.4 (11.1)   22.8 (7.5)      32.3 (9.5)      37.8 (8.0)
Longest chain                                2.9 (0.7)     3.4 (0.6)       3.9 (0.6)       6.0 (1.1)
Biggest cluster                              10.9 (4.6)    15.1 (2.7)      25.8 (6.7)      36.1 (8.5)
Number of different clusters                 5.1 (2.9)     2.9 (1.5)       2.5 (0.7)       1.8 (0.5)
Percentage of findings connected to others   89.5 (6.7)    97.9 (1.2)      97.2 (2.2)      99.42 (0.5)

pertinence were sufficient to find accurate diagnoses. It was necessary to explore the film

in order to find new clues that could not be produced inferentially. Consequently, more typ-

ical x-rays do not need such a level of richness for proper diagnosis. We agree with Weber

et al. (1993), who suggested that richness is useful for solving uncommon cases, but auto-

matic generation of hypotheses is sufficient for common cases.

Results for Comparison with the Reference Study (Lesgold et al., 1988)

Because the low number of subjects may have biased our results, especially in the two

expert groups, it would be useful to see how the study matches previous results in the field

of radiological expertise. Consequently, before our general discussion, we will present

some quantitative and qualitative results that enable comparison with Lesgold et al.‘s ref-

erence study.

Most of the relations shown in Table 7 are similar to those found in Lesgold et al. (1988). Several variables increased monotonically with experience. These were the mean number of findings (F(3,18) = 3.278; p = .0388), the longest chain of reasoning (F(3,18) = 16.637; p < .0001), the biggest cluster size (F(3,18) = 21.834; p < .0001), and the percentage of connected findings (F(3,18) = 6.891; p = .0028). There were two discrepancies, however:

1) Even though the overall trends were similar, the absolute values of the measured vari-

ables were consistently higher in our study. For example, the longest chain was 2.03 in

Lesgold’s experts whereas ours was 6.0. The explanation probably lies in the particularities

of the stimuli: we chose x-rays with numerous visible abnormalities, leading to more find-

ings, from which more hypotheses were plausible and thus had to be discussed by radiolo-

gists. This effect was probably reinforced by the simultaneous existence of several diseases

(two on Film 1 and four on Film 2). The combination of possibilities might have led sub-

jects to make more attempts to connect findings, especially among the super experts. All of

the super experts explicitly evoked rare complications of diseases in order to make all the

data fit with only one disease (e.g., one of them declared “I think he may have had a pneu-

monectomy (...) I think he has a hydatid disease. Is there a connection between the two? It

seems logical or else it is a five-legged sheep”). Differences in coding rules may also have

played a small role. We chose to include only reported abnormalities in the semantic net-

works, and thus, in our calculations. We also included normal features when their normal-

ity made sense with regard to the interpretation of abnormalities (for example, when a

normal feature allowed the diagnostician to discard a hypothesis). Most of the time, the

eliminated elements were disconnected findings (e.g., “the scapulas are normal”), which

might have resulted in a higher percentage of connected findings in our study than in Les-

gold’s. Consequently, the discrepancies in the percentage of connected findings are proba-

bly well explained by specific stimulus-characteristic differences and coding differences.

2) The number of different clusters decreased monotonically (F(3,18) = 3.330; p = .043), whereas in Lesgold’s study it increased. With the rates of connected findings we found (ranging from 89.5% in novices to 99.42% in super experts), it is reasonable to find a reduction in the number of clusters, as greater integration tends to make clusters merge.

Qualitative analysis of protocols showed that subjects used several reasoning patterns.

We could identify all the elementary reasoning steps reported by Lesgold et al.: schema

triggering, testing, tuning and changing. At first a schema was triggered by features, and

then tested. Sometimes the schema did not fit the features well, so the subjects evoked a

more specific diagnosis (what Lesgold et al. called “tuning”). The subjects tuned and dis-

carded schemata but we observed only one case where the subject completely changed the

schema (what Lesgold et al. called “flexibility”). All subjects manifested schema trigger-

ing and testing. On Film 1, only triggering and testing were really used by subjects. On

Film 2, the tuning rate was 37.5% for the novices, 67% for the intermediates, 75% for the

basic experts, and 100% for the super experts. These results are consistent with those

reported by Lesgold et al. In addition, we found a new complex pattern of forward reason-

ing, related to independent disease processing. We call it “integrating.” Having selected

and tested a diagnosis, some subjects tried to relate this diagnosis to every available fea-

ture. While doing this, they evoked rare complications of diseases. This pattern was used

by all of the super experts on Film 2, and by no one else. On Film 1, only one basic expert

and one super expert used this pattern.

III. DISCUSSION

After an evaluation of the above results, we will examine their implications with regard to

the nature of expertise and pertinence generation.

Evaluation of the Results

The results confirmed most of the predictions derived from the model. Experience

increased the mean number of activable canonical and operative links, and the correlation

between them. Integration and pertinence also increased continuously with experience, but

while the effect of experience upon pertinence disappeared when integration was con-

trolled, the correlation between integration and pertinence remained significant. Unexpect-

edly, experience had little effect on overall accuracy, mainly because basic experts did not

detect every critical feature; super experts avoided this problem. Richness was more highly

correlated with accuracy than with pertinence. However, qualitative analysis revealed dif-

ferences in data processing. Residents had more problems than experts in retrieving good

diagnoses and discarding poor ones. Super experts used a specific kind of deliberate rea-

soning. Most of the variables used in Lesgold’s study were found to behave similarly in our

study: with experience, the number of findings grew, as did the longest chain, the biggest

cluster, and the percentage of findings connected to others.

Despite the fact that the experimental testing confirmed the predictions derived from the

model, we are aware that we did not provide any manipulation that clearly addresses sub-

symbolic processes. Consequently, other explanations of our data might exist. For exam-

ple, there is evidence of the existence of mediated priming in semantic networks (Balota &

Lorch, 1986), that is, facilitation of a schema that is not directly connected to the prime.

This could explain the emergence of shortcuts in protocols without recourse to operative

links. Nevertheless, some authors claim that “mediated priming can be said to be priming

between directly related weak associates” (McKoon & Ratcliff, 1992, p. 1165).

A second issue arises out of the idea that experts possess much more knowledge than do

novices. In this view, the higher number of findings and links could be a consequence of

differences in knowledge availability rather than being an effect of knowledge accessibility. Actually, the real question is why intermediates who have knowledge that would enable them to reach the correct diagnoses do not use that knowledge. In a recent article

about medical diagnosis, Custers et al. said that “It remains a challenge . . . to discover why

subjects at intermediate levels of expertise do possess the relevant knowledge, but are

unable to use it in many diagnostic situations” (1996, p. 395). In Lesgold’s and our studies,

even the novices were already doctors, with experience in external medicine. Intermediates

were 3- and 4-year residents. It is extremely unlikely that they were unaware of pneu-

monectomies or hydatid cysts. This phenomenon is demonstrated in the following tran-

script, from the verbalizations of an intermediate who missed the pneumonectomy:

(Experimenter) You rejected pneumonectomy because there was no-

(Subject) Attraction

(Experimenter) Then you kept pleural effusion. Do you have other signs for that?

(Subject) No. I don’t. Actually in pneumonectomies, just after, there may be no mediastinal attraction (. . .).

(Experimenter) But did you think of it at that time?

(Subject) No.

In our model, such an effect might be explained in terms of the weakness of canonical

link activability, that is, knowledge accessibility.

A third potential bias might result from knowledge that has been activated but not ver-

balized. This would matter if it affected the groups differently. In fact, many studies have

shown that acquisition of expertise is accompanied by greater automatization. This means

that we might expect omissions to affect more experts than intermediates and novices; this

would have led to a greater underestimation of the number of findings and links in experts

than in novices. This bias probably made it more difficult to confirm hypotheses about dif-

ferences in the richness of the representation. It is unclear what effects such a bias might

have with regard to pertinence.

All these criticisms might be summarized by noting that verbal protocols cannot consti-

tute sufficient proof of a spreading activation system; rule-based reasoning could lead to

the same verbal protocols. Thus, we have no data to prove that spreading activation is the

cause of the observed results, even if the predicted hypotheses were globally corroborated.

Nevertheless, our results have implications with respect to the nature of expertise and per-

tinence generation.

Implications for the Nature of Expertise

We claim that the addition of the basic expert group reveals that what we have called basic

and super expertise are qualitatively different, so that outstanding radiologists (super

experts) are not representative of common radiological expertise because of qualitative

gaps in daily practice and cognitive processes.

Expertise can be considered either as increased automation of perceptual processing

(e.g., Dreyfus & Dreyfus, 1986) or, in contrast, as a greater capacity to proceed in a more

flexible and deliberate manner in both perceptual and cognitive processing (e.g., Lesgold

et al., 1988). Two facts suggest that Dreyfus and Dreyfus’s position does not apply to radi-

ologists. First, the mean longest chain for super experts was 6.0 whereas it was 2.9 in nov-

ices. Second, the integrating pattern of reasoning we found in all super experts on Film 2

was clearly deliberate. Therefore cognitive processing was involved in super experts’ per-

formance. They did not proceed only by intuition.

However, the existence of perceptual processing in radiological expertise is well docu-

mented (e.g., Kundel & Nodine, 1983; Lesgold et al., 1988). Considering their findings

showing that intermediates sometimes provided less accurate diagnoses than did novices,

Lesgold et al. (1988) attempted to apply a three-stage model to expertise acquisition in

adults. The loss of accuracy was attributed to the conflict between perceptual and cognitive

processes. Our own results showed that salient features (namely of the pneumonectomy)

led to a monotonic increase in the performance curve with experience. The results also

showed that subtle features led to a U-shaped performance curve (namely those associated

with a hydatid cyst). The point here is that the non-monotonicity can be ascribed to the

super experts. If we had considered only three groups (with no super experts), we would

have found two monotonic performance curves: increasing for salient features and

decreasing for subtle features. But if we had considered three groups with no basic experts,

we would have found the same results as Lesgold et al. (Figure 3). So, if basic and super

experts represent two different kinds of expertise then we have no reason to place these

subjects on the same curve. Basic expertise acquisition curves (including novices, interme-

diates, and basic experts) are compatible with models where expertise acquisition is a

gradual process leading to automaticity (e.g., Anderson, 1983, 1992; Sternberg & Frensch,

1992): with the salient and typical features, performance monotonically increased, and

with the inconspicuous or atypical features, performance monotonically decreased. This

result is compatible with the qualitative findings showing that basic experts used a type of

cognitive processing which was limited to testing diagnostic hypotheses and tuning schemata.

[Figure 3 (diagram not reproduced). Left panel, “Lesgold’s model”: novices, intermediates, super experts. Right panel, “Our findings”: novices, intermediates, basic experts, super experts, with a “qualitative gap” marked.]

Figure 3. Reinterpretation of the Lesgold et al. (1988) Performance Curve on Atypical Films*

*The left figure of the diagram shows Lesgold’s point of view. It can be reconstructed from our data if basic experts are withdrawn and super experts included in the curve. The right-hand figure shows our point of view: basic experts are representative of the overall radiologist population, and super experts are not.

The basic expert level probably matches what Olsen and Rasmussen (1989) called

“highly skilled performer” whereas super experts correspond more closely to the descrip-

tion of the “reflective expert” provided by these authors. The longest chain of reasoning,

which might be taken as a cue for deliberate reasoning, was significantly higher in super

experts than in any other group, including basic experts. At the same time, there was no

significant difference between intermediates and basic experts. Furthermore, as we

reported, super experts also demonstrated a deliberate ‘integrating’ strategy that was not

found in the other groups. Consequently, super experts probably produced higher perfor-

mance due to their greater use of cognitive reasoning. This raises two new questions. First,

why did all the basic experts fail to detect a feature (hydatid cyst) that was noticed by inter-

mediates, and even better by novices? Their more advanced perceptual skills should have

given them a better chance at detection. Second, why did super experts not miss this

hydatid feature?
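The grouping argument made above about Figure 3 can be checked mechanically. The following Python sketch is only an illustration under stated assumptions: the scores are hypothetical ordinal placeholders that respect the ordering described in the text for the subtle (hydatid cyst) feature, not the data of this study, and the names `curve_shape` and `shape_for` are introduced here for the example.

```python
# Hypothetical ordinal placeholders, NOT the study's data: they only encode
# the ordering described in the text for the subtle (hydatid cyst) feature.
SUBTLE_FEATURE_SCORE = {
    "novices": 2,
    "intermediates": 1,
    "basic experts": 0,  # all basic experts missed the feature
    "super experts": 2,  # super experts did not miss it
}


def curve_shape(scores):
    """Classify a sequence of scores as increasing, decreasing, or non-monotonic."""
    pairs = list(zip(scores, scores[1:]))
    if all(a < b for a, b in pairs):
        return "increasing"
    if all(a > b for a, b in pairs):
        return "decreasing"
    return "non-monotonic"


def shape_for(groups):
    """Shape of the curve obtained when only the listed groups are plotted."""
    return curve_shape([SUBTLE_FEATURE_SCORE[g] for g in groups])


# Three groups without super experts, then three groups without basic experts.
without_super = shape_for(["novices", "intermediates", "basic experts"])
without_basic = shape_for(["novices", "intermediates", "super experts"])
print(without_super)  # decreasing
print(without_basic)  # non-monotonic
```

Under these placeholder values, leaving out the super experts gives a monotonically decreasing curve, whereas replacing the basic experts with the super experts gives the non-monotonic, U-shaped pattern.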

Berbaum and colleagues (1990) reported that having previously detected features relevant

to a specific diagnosis may make it harder to detect features related to different diagnoses,

a phenomenon they called satisfaction of search (SOS). Using eye-position tracking, Samuel,

Kundel, Nodine, and Toto (1995) found that SOS occurs when attention is captured by the

first features detected. Another explanation involves the use of a schema-driven exploration strategy, which in our protocols,

was used by all basic experts. Studying the integration of clinical data in radiological

diagnosis, Norman, Brooks, Coblentz, and Babcook (1992) found that feature calls

depend on expected categories. In our study, nothing else on the film gave grounds for

expecting the feature associated with the hydatid cyst.

The most obvious explanation of why super experts did not fail to detect the hydatid

cyst would be to attribute it to better perceptual skills. But this explanation is not suffi-

cient, because if only perceptual processes were responsible for this detection then basic

experts would have performed better than residents. Furthermore, Samuel and colleagues,

who studied nodule detection, reported that “most missed nodules were fixated.” (1995,

p. 895).

We propose another explanation based on differences in usual professional practices.

The basic expert has to make the best diagnosis, in a limited time. As the job becomes

familiar, processes that require a high level of conscious attention are needed less and less.

In real-life practice, basic experts have several x-rays of the same patient. They often have

a clinical description, which has been proven to reduce the SOS phenomenon (Berbaum et

al., 1993). In case of doubt, they proceed with further examinations using other tech-

niques. In contrast, super experts are not only practitioners in radiology but also profes-

sors, which implies an ability to observe, justify and explicitly describe the methods

leading to accurate diagnoses. Furthermore, super experts are also researchers, which means

they must devote conscious effort to making their results explicit and publishing them in

scientific journals. Finally, as our consulting expert radiologist suggested, super experts

usually see much more complex cases than other radiologists. They are often asked to give

advice about difficult or non-standard films. So, as a consequence of their different duties

and practices, they are highly trained to maintain high levels of attention in their

diagnostic activities, and to organize and explain the information with which they are con-

fronted. Their usually higher level of attention may partially protect them against the SOS

phenomenon. In another task domain, Bryan and Harter (1897) showed that Morse code

operators normally reached a plateau in their skill acquisition but that performance could

still increase with deliberate effort. According to Ericsson, Krampe, and Tesch-Römer

(1993), “eminent performance” is directly related to “the amount of deliberate practice

related to that goal” (p. 392), and it is probable that a significant proportion of super

experts’ daily activities constitutes deliberate practice. So, if super expertise comes from a

specific professional activity that is clearly different from the activity of “normal” radiolo-

gists, super expertise can no longer be considered as the natural end of expertise develop-

ment in radiology.

More practically, we suggest that daily activity could be used as a criterion in order to

determine groups that vary in processing level: experienced subjects whose normal profes-

sional activity mainly involves skill-based processing could be classified as highly skilled

performers, and experienced subjects whose normal activity involves explicit reasoning

could be classified as reflective experts. Validating such a criterion requires further exper-

imentation, but if correct, this distinction could be useful for testing hypotheses that imply

control over the level of data processing.


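[Figure 4 (diagram not reproduced). Vertical axis: pertinence. Horizontal axis: experience. Groups marked along the curve: novices, intermediates, basic experts. Annotations: pertinence “coming from automatic processing” and pertinence “coming from controlled processing”.]

Figure 4. Hypothetical Curve of Pertinence Acquisition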

On the Origins and Role of Pertinence

The fact that experience increases the interdependence between representation elements,

which we called integration, is consistent with numerous studies showing that experts have

more integrated representations (e.g., Lesgold et al., 1988 in radiology; Amergé & Mariné,

1992 in ergonomic diagnosis; Chase & Simon, 1973a; Freyhof, Gruber, & Ziegler, 1992 in

chess playing). The present study shows that pertinence in schema triggering is related to

integration. Ericsson and Smith stated, “In domains with complex stimuli, such as medicine

. . . it is clear that part of the integration of the presented information involves identification

of the relevant and critical information . . .” (1991, p. 24). In our model, identification

of relevant and critical information is not part of the integration process but rather is the

result of that process. Integration of data into the network occurs before pertinence detec-

tion: information is deemed pertinent after it has been integrated. This is globally compat-

ible with a connectionist point of view for the acquisition of pertinence through the basic

expert level. Nevertheless, the pertinence curve drawn from our results (Figure 4) showed

an increase only until a plateau was reached in the intermediates. Only super experts went

beyond. Actually, in the super experts the pertinence continued to grow while the correla-

tion between pertinence and integration disappeared. Consequently, other mechanisms are

necessary to explain why a plateau is reached and how pertinence can still increase.

Rabinowitz (1991) found that strategic processing is important for learning “medium typi-

cal” nouns because spreading activation within semantic networks is not sufficient to con-

strain the activation of related knowledge, whereas strategic processing is sufficient. Such

an explanation could be transferred to our model: even if, as experience grows, the possi-

bility for activation to spread within the semantic network grows, integration of the net-

work as the source of pertinence lessens. Thus, for typical cases, spreading activation could

be assumed to be wide enough in basic experts for pertinent diagnoses to be quickly

retrieved, but processing would become unconstrained for less typical cases. Only super

experts, who maintain deliberate reasoning activity in everyday work, would trigger suffi-

cient levels of deliberate reasoning, the only kind of reasoning able to constrain possibili-

ties optimally in such cases.
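The role that spreading activation plays in this argument can be made concrete with a small Python sketch. It is a minimal illustration, not the model tested in this article: the network, the link weights, the `decay` factor, and the function name `spread_activation` are all hypothetical, and the node names are borrowed from the appendix protocol only for readability.

```python
def spread_activation(links, sources, decay=0.5, steps=1):
    """Propagate activation from source nodes along weighted links.

    links:   dict mapping a node to {neighbor: weight}, weights in [0, 1]
    sources: dict mapping a node to its initial activation
    Each step passes amount * weight * decay to every neighbor of the
    nodes activated in the previous step, and sums converging inputs.
    """
    activation = dict(sources)
    frontier = dict(sources)
    for _ in range(steps):
        next_frontier = {}
        for node, amount in frontier.items():
            for neighbor, weight in links.get(node, {}).items():
                passed = amount * weight * decay
                next_frontier[neighbor] = next_frontier.get(neighbor, 0.0) + passed
        for node, amount in next_frontier.items():
            activation[node] = activation.get(node, 0.0) + amount
        frontier = next_frontier
    return activation


# Hypothetical network: two film features linked to diagnostic hypotheses.
LINKS = {
    "right white lung": {"pneumonectomy": 0.8, "pleural effusion": 0.4},
    "surgical clips": {"pneumonectomy": 0.9},
}

one_cue = spread_activation(LINKS, {"right white lung": 1.0})
two_cues = spread_activation(
    LINKS, {"right white lung": 1.0, "surgical clips": 1.0}
)
print(f"{one_cue['pneumonectomy']:.2f}")      # 0.40
print(f"{two_cues['pneumonectomy']:.2f}")     # 0.85
print(f"{two_cues['pleural effusion']:.2f}")  # 0.20
```

In this toy network, a hypothesis that receives activation from two linked features ends up more active than a competitor fed by a single weak link. Nothing in the mechanism restricts which linked nodes receive activation, which is the limitation the preceding paragraph attributes to spreading activation in less typical cases.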

IV. CONCLUSION

Holyoak (1991) stated that “The most salient gap in the current models is that none

addresses the crucial issue of learning. Nonetheless, it seems reasonable to expect that

learning models can be developed within the symbolic connectionist paradigm” (p. 324).

The model of medical reasoning presented in this article falls within this paradigm. The

present work is somewhat preliminary: much more testing is necessary to give this

model a solid empirical background. Nevertheless, the results appear promising. The

model proposes a simple mechanism by which declarative knowledge is triggered and

integrated into a representation. Because it involves both declarative and subsymbolic

knowledge, it supplies a basis for understanding how likelihood (provided by experi-

ence) and probabilities (provided by college teaching) are used. Using spreading activa-

tion mechanisms, the model describes how pertinent diagnostic hypotheses would be

triggered and integrated into a representation. The model also provides an explanation of

how experience enhances performance for typical cases. Basic experts appeared to con-

strain their inferences to typical diagnoses, in contrast with super experts who tended to

explore all possibilities. Solutions to atypical cases probably require controlled process-

ing. Super experts, who demonstrated more deliberate reasoning than any other group,

and whose daily activity involves more deliberate reasoning, also performed better than

any other group. Finally, the question of the difference between basic and super exper-

tise, which was raised in this study, is of practical importance because the value of an

expert lies in his or her ability to find solutions not only for typical cases but for atypical

cases as well.

Acknowledgments: We thank Paul Feltovich, Mitchell Rabinowitz, and an anony-

mous reviewer for their comments on earlier drafts of this article. We also thank our

consulting expert radiologists, Daniele Verderi-Raufaste, Andreas Schulz, and Jacques

Bernier.

APPENDIX

Example of Semantic Network Building

Verbalizations

The following protocol (Table 8) was verbalized by a novice (first year resident). Trans-

lated from French.


TABLE 8. Verbal Protocols and Commentaries

Segment 1

Concomitant protocol: Well, good, the second film... well, I’m looking at the skeleton. So the x-ray, it was taken in bed, it’s written on the film. Clavicles... I look at the scapulas, it’s OK, ribs, you can follow the ribs, mainly on the left side because on the right you can see nothing. Left costo-diaphragmatic angle. then, ... so, I’m missing a little trachea...

Self-confrontation: Experimenter: What does “missing a little trachea” mean to you? Subject: Well no, it’s because I cannot follow it until tracheal bifurcation because it’s white... it does not mean anything to me.

Segment 2

Concomitant protocol: Well, left diaphragmatic cupola OK... Mmm left costo-diaphragmatic angle... OK. Right, so, what can we... what can we say? The whole field, the right pulmonary field is white. Well, ...er...

Self-confrontation: (none)

Segment 3

Concomitant protocol: and left pulmonary field, I’ve a feeling I see little micro-nodules... so are they cut vessels? ...Mmm... may be vessels.

Self-confrontation: Subject: Finally I think not, there are no micro-nodules, finally I don’t think so.

Segment 4

Concomitant protocol: Well... There you are, so to conclude, field... white lung on the right or ...er... lung absence, prior pneumonectomy to con... well to confront with clinical data.

Self-confrontation: Subject: Yes, it must be a pneumonectomy because I think I’m seeing the ... [points to surgical clips] Experimenter: But you didn’t see it a few minutes ago? Subject: No. I didn’t.

Segment 5

Concomitant protocol: No deviation of the mediastinum, neither right attraction nor right-to-left shift.

Self-confrontation: Experimenter: Well, what did you try to verify? Subject: so, precisely, if there’s a pneumonectomy, normally the parenchyma there expands a little and therefore it pushes away mediastinum... It seems to me. Well, here it’s not so. Experimenter: Then it would be against pneumonectomy? Subject: Yes.

Segment 6

Concomitant protocol: No left pleural effusion, no visible evolutionary-like pleuro-parenchymatous lesion on the left... mmm... on skeleton, no visible lesion ...er... in the soft tissues... no well, that’s all.

Self-confrontation: (none)

Verbalizations During Drawing

Well, I’m drawing the skeleton, these are thoracic outlines, two clavicles, trachea that can

be followed until... until above the aortic notch, no left lung abnormality but either little

nodules or cut vessels . . . should ask the boss, then a whole white lung there.


Extracted Schemata

in bed film

no visible left evolutionary-like pleuro-parenchymatous lesion

no pleural effusion

no right attraction of mediastinum

no right-to-left shift of mediastinum

micro-nodules

pneumonectomy

right white lung

missing a little trachea

cut vessels

Comments on the Choices

TABLE 9. Commentaries on the Choices in Semantic Network Building

Verbalization: Well, good, the second film

Comment: This is a commentary which has nothing to do within the semantic network.

Verbalization: it was taken in bed,

Comment: In spite of the fact that this sentence seems disconnected from everything else, we kept it because this characteristic is pertinent information relative to the patient’s state.

Verbalization: clavicles... I’m looking at the scapulas, it’s OK, ribs, you can follow the ribs

Comment: We didn’t choose to retain these schemata because they are, here, parts of a systematic examination of anatomy that does not mean anything by itself: novices are taught to do this.

Verbalization: then ... so, I’m missing a little trachea... What does “missing a little trachea” mean to you? Well no, it’s because I cannot follow it until tracheal bifurcation because it’s white... it does not mean anything to me.

Comment: Two schemata are connected here: the subject explains the lost trachea as a consequence of the white lung. That’s why we drew a line between the two.

Verbalization: ...little micro-nodules... so are they cut vessels? ...Mmm... may be vessels.

Comment: The subject hesitates between two semiological interpretations of an image. Both are non-pertinent. We drew a line between them because the subject clearly tries to reinterpret micro-nodules as vessels.

Verbalization: ... white lung on the right or ...er... lung absence, prior pneumonectomy to con... well to confront with clinical data.

Comment: The subject suspects a white lung then changes her mind towards the hypothesis of a pneumonectomy (lung absence is the same thing here).

Verbalization: Yes, it must be a pneumonectomy because I think I’m seeing the ... [she points to surgical clips] But you didn’t see it a few minutes ago? No. I didn’t.

Comment: While commenting on the verbal protocols, the subject discovered a critical cue not seen before. We did not include that late finding which would have been an artifact.

Semantic Network

[Figure 5 (network diagram not reproduced). Recoverable node labels: in bed film; lost trachea; no right attraction of mediastinum; no right-to-left shift of mediastinum; micro-nodules, linked to cut vessels.]

Figure 5. Resulting Semantic Network
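As an illustration of how such a network can be handled computationally, the Python sketch below encodes the extracted schemata and then searches for the longest chain. Several points are assumptions of the sketch rather than statements from the article: only the three links that the Table 9 commentaries state explicitly are included (any further links in Figure 5 are omitted, so the remaining schemata are treated as unlinked), “longest chain” is taken to mean the largest number of schemata on a single path with no schema repeated, and the function names are introduced here.

```python
from collections import defaultdict

# Links stated explicitly in the Table 9 commentaries.
EDGES = [
    ("lost trachea", "right white lung"),
    ("micro-nodules", "cut vessels"),
    ("right white lung", "pneumonectomy"),
]

# Retained schemata with no link stated in Table 9 (assumed unlinked here).
UNLINKED = [
    "in bed film",
    "no visible left evolutionary-like pleuro-parenchymatous lesion",
    "no pleural effusion",
    "no right attraction of mediastinum",
    "no right-to-left shift of mediastinum",
]


def build_graph(edges, unlinked=()):
    """Build an undirected adjacency map from links and unlinked schemata."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    for node in unlinked:
        graph[node]  # accessing the key creates an empty neighbor set
    return dict(graph)


def longest_chain(graph):
    """Return the longest simple path, found by exhaustive depth-first search."""
    best = []

    def extend(path, visited):
        nonlocal best
        if len(path) > len(best):
            best = list(path)
        for nxt in sorted(graph[path[-1]]):
            if nxt not in visited:
                visited.add(nxt)
                path.append(nxt)
                extend(path, visited)
                path.pop()
                visited.remove(nxt)

    for start in sorted(graph):
        extend([start], {start})
    return best


network = build_graph(EDGES, UNLINKED)
chain = longest_chain(network)
print(chain)       # ['lost trachea', 'right white lung', 'pneumonectomy']
print(len(chain))  # 3
```

Exhaustive search is adequate for networks of this size (about ten schemata). It would not scale to large graphs, where finding the longest simple path is computationally hard.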

REFERENCES

Amergé, C., & Mariné, C. (1992). Étude comparative expert-débutant lors de l’élaboration d’un prédiagnostic ergonomique. Le Travail Humain, 55(2), 97-117.

Anderson, J. R. (1983). The architecture of cognition. Cambridge, MA: Harvard University Press.

Anderson, J. R. (1992). Automaticity and the ACT* theory. American Journal of Psychology, 105(2), 165-180.

Balota, D. A., & Lorch, R. F. (1986). Depth of automatic spreading activation: Mediated priming effects in pronunciation but not in lexical decisions. Journal of Experimental Psychology: Learning, Memory, and Cognition, 12, 336-345.

Balota, D. A., & Paul, S. T. (1996). Summation of activation: Evidence from multiple primes that converge and diverge within semantic memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22(4), 827-845.

Barrows, H. S., Norman, G. R., Neufeld, V. R., & Feightner, J. W. (1982). The clinical reasoning of randomly selected physicians in general medicine practice. Clinical and Investigative Medicine, 5, 49-55.

Berbaum, K. S., Franken, E. A., Dorfman, D. D., Rooholamini, S. A., Kathol, M. H., Barloon, T. J., Behlke, F. M., Sato, Y., Lu, C. H., El-Khoury, G. Y., Flickinger, F. W., & Montgomery, W. J. (1990). Satisfaction of search in diagnostic radiology. Investigative Radiology, 25, 133-140.

Berbaum, K. S., Franken, E. A., Anderson, K. L., Dorfman, D. D., Erkonen, W. E., Farrar, G. P., Geraghty, J. J., Gleason, T. J., MacNaughton, M. E., Phillips, M. E., Renfrew, D. L., Walker, C. W., Whitten, C. G., & Young, D. C. (1993). The influence of clinical history on visual search with single and multiple abnormalities. Investigative Radiology, 28, 191-201.

Bryan, W. L., & Harter, N. (1897). Studies in the physiology and psychology of the telegraphic language. Psychological Review, 4, 27-53.

Cantor, J., & Engle, R. W. (1993). Working-memory capacity as long-term memory activation: An individual-differences approach. Journal of Experimental Psychology: Learning, Memory, and Cognition, 19(5), 1101-1114.

Carmody, D. P., Kundel, H. L., & Nodine, C. F. (1984). Comparison scans while reading chest images: Taught but not practiced. Investigative Radiology, 19(5), 462-466.

Chase, W. G., & Simon, H. A. (1973a). Perception in chess. Cognitive Psychology, 4, 55-81.

Chase, W. G., & Simon, H. A. (1973b). The mind’s eye in chess. In W. G. Chase (Ed.), Visual information processing (pp. 215-281). New York: Academic Press.

Collins, A. M., & Loftus, E. F. (1975). A spreading activation theory of semantic processing. Psychological Review, 82, 407-428.

Custers, E. J. M. F., Boshuizen, H. P. A., & Schmidt, H. G. (1996). The influence of medical expertise, case typicality, and illness script component on case processing and disease probability estimates. Memory and Cognition, 24(3), 384-399.

Dagenbach, D., Horst, S., & Carr, T. H. (1990). Adding new information to semantic memory: How much learning is necessary to produce automatic priming. Journal of Experimental Psychology: Learning, Memory, and Cognition, 16(4), 581-591.

Denis, M. (1989). Image et cognition. 2nd edition 1994. Paris : PUF.

Doherty, M. E., Schiavo, M., Mynatt, C. R., & Tweney, R. D. (1981). The influence of feedback and diagnostic data on pseudodiagnosticity. Bulletin of the Psychonomic Society, 18(4), 191-194.

Dreyfus, H. L., & Dreyfus, S. E. (1986). Mind over machine: The power of human intuition and expertise in the era of the computer. New York: The Free Press.

Elstein, A. S., Shulman, L. S., & Sprafka, S. A. (1978). Medical problem solving: An analysis of clinical reasoning. Cambridge, MA, & London: Harvard University Press.

Ericsson, K. A., Krampe, R. T., & Tesch-Römer, C. (1993). The role of deliberate practice in the acquisition of expert performance. Psychological Review, 100(3), 363-406.

Ericsson, K. A., & Smith, J. (1991). Prospects and limits of the empirical study of expertise. In K. A. Ericsson & J. Smith (Eds.), Toward a general theory of expertise: Prospects and limits (pp. 1-38). Cambridge: Cambridge University Press.

Ericsson, K. A., & Kintsch, W. (1995). Long-term working memory. Psychological Review, 102, 211-245.

Freyhof, H., Gruber, H., & Ziegler, A. (1992). Expertise and hierarchical knowledge representation in chess. Psychological Research, 54, 32-37.

Frick, R. W., & Lee, Y.-S. (1995). Implicit learning and concept learning. The Quarterly Journal of Experimental Psychology, 48A(3), 762-782.

Hasher, L., & Zacks, R. T. (1984). Automatic processing of fundamental information: The case for frequency of occurrence. American Psychologist, 39, 1372-1388.

Hoffman, P., Slovic, P., & Rorer, L. (1968). An analysis of variance model for the assessment of configural cue utilization in clinical judgment. Psychological Bulletin, 69, 338-349.

Holyoak, K. J. (1991). Symbolic connectionism: Toward third-generation theories of expertise. In K. A. Ericsson & J. Smith (Eds.), Toward a general theory of expertise: Prospects and limits (pp. 301-335). Cambridge: Cambridge University Press.

Joseph, G.-M., & Patel, V. L. (1990). Domain knowledge and hypothesis generation in diagnostic reasoning. Medical Decision Making, 10, 31-46.

Kahneman, D., & Tversky, A. (1973). On the psychology of prediction. Psychological Review, 80, 237-251.

Kundel, H. L., & Nodine, C. F. (1983). A visual concept shapes image perception. Radiology, 146(2), 363-368.

Kundel, H. L., & Wright, D. J. (1969). The influence of prior knowledge on visual search strategies during the viewing of chest radiographs. Radiology, 93(8), 315-320.

Lemieux, M., & Bordage, G. (1992). Propositional versus structural semantic analyses of medical diagnostic thinking. Cognitive Science, 16, 185-212.

Lesgold, A. M., Rubinson, H., Feltovich, P., Glaser, R., Klopfer, D., & Wang, Y. (1988). Expertise in a complex skill: Diagnosing x-ray pictures. In M. T. H. Chi, R. Glaser, & M. J. Farr (Eds.), The nature of expertise (pp. 311-342). Hillsdale, NJ: Lawrence Erlbaum.

Light, R. J. (1971). Measures of response agreement for qualitative data: Some generalizations and alternatives. Psychological Bulletin, 76(5), 365-377.

McKoon, G., & Ratcliff, R. (1992). Spreading activation versus compound-cue accounts of priming: Mediated priming revisited. Journal of Experimental Psychology: Learning, Memory, and Cognition, 18(6), 1155-1172.

Medin, D., & Edelson, S. (1988). Problem structure and the use of base-rate information from experience. Journal of Experimental Psychology: General, 117(1), 68-85.

Newell, A., & Simon, H. A. (1972). Human problem solving. Englewood Cliffs, NJ: Prentice-Hall.


Norman, G. R., Brooks, L. R., Coblentz, C. L., & Babcook, C. J. (1992). The correlation of feature identification

and category judgment in diagnostic radiology. Memory and Cognition, 20(4), 344-355.

Olsen, S. E., & Rasmussen, J. (1989). The reflective expert and the prenovice: Notes on skill-, rule- and knowledge-based performance in the setting of instruction and training. In L. Bainbridge & S. A. Ruiz-Quintanilla (Eds.), Developing skills with information technology (pp. 9-33). Chichester: John Wiley.

Patel, V. L., & Groen, G. J. (1986). Knowledge base solution strategies in medical reasoning. Cognitive Science, 10, 91-116.

Patel, V. L., & Groen, G. J. (1991). The general and specific nature of medical expertise: A critical look. In K. A. Ericsson & J. Smith (Eds.), Toward a general theory of expertise: Prospects and limits (pp. 93-125). Cambridge: Cambridge University Press.

Rabinowitz, M. (1991). Semantic and strategic processing: Independent roles in determining memory performance. American Journal of Psychology, 104(3), 427-431.

Rabinowitz, M., & Chi, M. T. H. (1987). An interactive model of strategic processing. In S. J. Ceci (Ed.), Handbook of the cognitive, social, and physiological characteristics of learning disabilities (Vol. 2, pp. 84-102). Hillsdale, NJ: Erlbaum.

Rabinowitz, M., & McAuley, M. (1990). Conceptual knowledge processing: An oxymoron? In W. Schneider & F. E. Weinert (Eds.), Interactions among aptitudes, strategies, and knowledge in cognitive performance (pp. 117-133). New York: Springer-Verlag.

Rumelhart, D. E., McClelland, J. L., & The PDP Research Group (1986). Parallel distributed processing: Explorations in the microstructure of cognition (Vols. 1 & 2). Cambridge, MA: MIT Press.

Samuel, S., Kundel, H. L., Nodine, C. F., & Toto, L. C. (1995). Mechanism of satisfaction of search: Eye position recordings in the reading of chest radiographs. Radiology, 194(3), 895-902.

Sebillotte, S. (1984). La résolution de problème en situation de diagnostic. Un exemple: le diagnostic médical. Psychologie Française, 29(3/4), 273-277.

Selfridge, O. G. (1959). Pandemonium: A paradigm for learning. In Symposium on the mechanization of thought processes. London: HMSO.

Shanteau, J. (1992). How much information does an expert use? Is it relevant? Acta Psychologica, 81, 75-86.

Shiffrin, R. M., & Schneider, W. (1977). Controlled and automatic human information processing, II: Perceptual learning, automatic attending and a general theory. Psychological Review, 84, 127-190.

Smedslund, J. (1963). The concept of correlation in adults. Scandinavian Journal of Psychology, 4, 165-173.

Sternberg, R. J., & Frensch, P. A. (1992). On being an expert: A cost-benefit analysis. In R. R. Hoffman (Ed.), The psychology of expertise: Cognitive research and empirical AI (pp. 191-203). New York: Springer-Verlag.

Weber, E. U., Böckenholt, U., Hilton, D. J., & Wallace, B. (1993). Determinants of diagnostic hypothesis generation: Effects of information, base rates and experience. Journal of Experimental Psychology: Learning, Memory, and Cognition, 19(5), 1151-1164.
