In this Issue: Casting a Net

Brian C. Madden, Rochester

The effort to model neural activity as a computation has its roots in Santiago Ramón y Cajal's discovery late in the nineteenth century that the brain was not a syncytium but, instead, a collection of isolated cells. Each cell embodied a complex transformation of input values which in turn passed on new values to other cells. The research to model this transformation took on a life of its own, distinct from physiology, with the discovery of the manifest power of neural networks as classifiers whose structure and weights could be derived directly from the data. Any semblance to physical neurons rapidly became symbolic. Progress in this area, however, was not without its fits and starts. Although Minsky and Papert's influential book, Perceptrons, set lowered expectations in the 1960s for a generation of neural network research (certainly a minor blip when set beside their pioneering work in artificial intelligence and a few other sundries such as Minsky's invention of confocal microscopy), there was a resurgence in the 1980s when it was demonstrated that a multilayered construction can give the classification method appreciable computational power. A good example of this power is presented in a paper by Rubegni et al. (p. 471) that uses an artificial neural network to fashion a melanoma detector.

The authors first present a short summary of the impressive results in melanoma detection that have been published based on epiluminescence light microscopy (ELM). For the last 15 years, various versions of this technique have used oil immersion to match the optical properties of the device to those of the skin and thereby improve the sharpness and contrast of the lesion. The improved match of the refractive index between glass and skin both increases the transmission of light into the skin and reduces the amount of specularity that obscures subsurface detail (as with sunlight reflected off the surface of a lake). In addition, contrast is often enhanced by some form of polarization filtering. The comparison between ELM and surface photography is analogous to that between slide and print film. Reflectance images simply do not afford the same dynamic range as when the epidermis and dermis are suffused with light, allowing transillumination of pigmented regions. A number of recent systems incorporate digital color cameras, an addition that facilitates subsequent computer analysis. The instrument used in the research presented here is the DBDermo-Mips, a system that was developed at the University of Siena and provided for the extraction of 48 lesion parameters from the images, some grossly apparent and others not. Processing was performed with an image resolution of 45 pixels per mm and with a 3CCD color camera (still appreciably less than the equivalent resolution available with the type of Kodachrome film normally used in clinical imaging and the same field of view). A selection of 57 melanomas and 90 benign pigmented lesions obtained in their clinic in Siena were used by the authors to test the detector. Using the ten most significant lesion parameters, they constructed a classifier that achieved a maximum accuracy of 93% in detecting melanomas in the 147-lesion sample set. Depending on the costs of false negatives and false positives, the decision criterion can be altered to vary the obtained sensitivity and specificity (a sensitivity of 99% could be achieved but only with a specificity of 81%; a decrease in sensitivity to 93% allowed the specificity also to increase to 93%).
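
The relation between these figures can be checked with a little arithmetic. The sketch below is an illustration only, not the authors' computation: it assumes that overall accuracy is simply the fraction of all 147 lesions classified correctly, and it works with expected (fractional) counts rather than the actual per-lesion results, which are not given here. The function name `accuracy` is hypothetical.

```python
# Lesion counts reported in the text: 57 melanomas and 90 benign lesions.
N_MELANOMA = 57
N_BENIGN = 90


def accuracy(sensitivity: float, specificity: float,
             n_pos: int = N_MELANOMA, n_neg: int = N_BENIGN) -> float:
    """Overall accuracy implied by a sensitivity/specificity pair.

    Assumption: accuracy = (true positives + true negatives) / all lesions,
    using expected counts derived from the two rates.
    """
    true_pos = sensitivity * n_pos   # melanomas expected to be flagged
    true_neg = specificity * n_neg   # benign lesions expected to be cleared
    return (true_pos + true_neg) / (n_pos + n_neg)


balanced = accuracy(0.93, 0.93)    # the 93% / 93% operating point
high_sens = accuracy(0.99, 0.81)   # the 99% / 81% operating point

print(f"balanced criterion:         {balanced:.3f}")
print(f"high-sensitivity criterion: {high_sens:.3f}")
```

Under these assumptions the 93%/93% criterion yields the higher overall accuracy, while the 99%/81% criterion trades some accuracy for fewer missed melanomas. Which point is preferable depends on the relative costs of the two kinds of error.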

A strong point of neural networks is that once they are trained and the weights are selected, their implementations can run exceedingly fast (as distinct from the input filters that still often require considerable image processing computations, hence less than half of the significant parameters were used). But, by far, the greatest benefit of this formulation is that, once designed, it is completely objective and does not suffer from variations in the training of the operator. For those not familiar with artificial neural networks, a brief description of some of the techniques presented in the paper is given here as background.

SLP: Single-Layer Perceptrons are an elementary form of neural network. Weights are applied to the inputs of each node and the results are summed. If the total response exceeds a threshold the output is a 1 (the selected property is detected); else, it is 0. SLPs are limited in that there are certain types of relations among the inputs that they cannot distinguish.

ANN: Artificial Neural Networks are more powerful classifiers that have multiple layers of nodes: an input layer, one or more hidden layers and an output layer (Fig 1). Similarly to SLPs, inputs to each node are weighted, summed and transformed by an output function. The weights are tuned by applying a set of lesions whose classification is known to the network and then using any classification errors to adjust the weights. Starting with the new set of weights, the tuning is repeated. This process continues until a satisfactory level of performance is achieved or no more improvement can be induced.

Leave one out: A statistical method that allows the data used to train the network to also be used to provide an estimate of how the current set of weights would perform on a new set of lesions with the same characteristics as the test set.

Stepwise feature selection: When computational resources are limited and it is desirable to optimize response time, this technique orders the significance of each of the input features and eliminates the one that contributes the least until the desired compromise between speed and performance is reached.

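
The SLP and ANN entries above can be made concrete with a toy example. The sketch below is not the network from the paper: the weights are set by hand rather than tuned on lesions, and the names `slp`, `two_layer_xor` and `slp_can_fit` are hypothetical. It uses the classic exclusive-or relation as one instance of a relation among inputs that a single thresholded node cannot distinguish, and shows that adding a hidden layer removes the limitation. The search over a small grid of weights is a demonstration, not a proof.

```python
from itertools import product


def step(x: float) -> int:
    """Threshold output function: 1 if the value exceeds 0, else 0."""
    return 1 if x > 0 else 0


def slp(inputs, weights, threshold):
    """Single-layer perceptron node: weight the inputs, sum, apply threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return step(total - threshold)


def two_layer_xor(inputs):
    """Two hidden nodes (OR and AND) feeding one output node.

    Weights are hand-picked for illustration, not learned.
    """
    h_or = slp(inputs, (1, 1), 0.5)    # fires if at least one input is 1
    h_and = slp(inputs, (1, 1), 1.5)   # fires only if both inputs are 1
    return slp((h_or, h_and), (1, -1), 0.5)  # OR but not AND


PATTERNS = list(product((0, 1), repeat=2))
AND = {p: p[0] & p[1] for p in PATTERNS}
XOR = {p: p[0] ^ p[1] for p in PATTERNS}


def slp_can_fit(target, grid):
    """Search a grid of weights and thresholds for an SLP matching target."""
    for w1, w2, t in product(grid, repeat=3):
        if all(slp(p, (w1, w2), t) == target[p] for p in PATTERNS):
            return True
    return False


GRID = [i / 2 for i in range(-6, 7)]   # -3.0 to 3.0 in steps of 0.5

print("SLP fits AND:", slp_can_fit(AND, GRID))
print("SLP fits XOR:", slp_can_fit(XOR, GRID))
print("two-layer XOR outputs:", [two_layer_xor(p) for p in PATTERNS])
```

In a real classifier the weights would be found by the iterative tuning described in the ANN entry, and the inputs would be lesion parameters rather than binary values.
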
ROC: The Receiver Operating Characteristic is a graphical method of displaying the changes in classifying malignant and benign lesions as the level of the network output used to separate the two classes is changed. The graph plots the probability of false positives (horizontal axis) versus the probability of true positives (vertical axis). As the criterion goes from being very strict to very lax, most ROC curves form arcs that are vertical in the lower left of the plot (where many more true positives are obtained at the cost of only a few false positives), become less steep, and finally turn nearly horizontal in the upper right (where many more false positives must be accepted to obtain only a few more true positives). The more arced the curve, the more discriminable is the task. A given sensitivity/specificity pair forms a single point on an ROC curve.
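
As a rough sketch of how such a curve is traced, the code below sweeps the decision criterion across a set of network outputs and records one (false-positive rate, true-positive rate) point per criterion. The scores and labels are made up for illustration and are not data from the paper. The function names `roc_points` and `area_under` are hypothetical, and the area under the curve is used here as one common summary of how "arced" the curve is, not necessarily the measure the authors report.

```python
def roc_points(scores, labels):
    """Return (false-positive rate, true-positive rate) pairs, strict to lax.

    scores: network outputs, higher meaning "more likely malignant"
    labels: 1 for malignant, 0 for benign
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]   # strictest criterion: no lesion is flagged
    for criterion in sorted(set(scores), reverse=True):
        flagged = [s >= criterion for s in scores]
        tp = sum(f and y == 1 for f, y in zip(flagged, labels))
        fp = sum(f and y == 0 for f, y in zip(flagged, labels))
        points.append((fp / n_neg, tp / n_pos))
    return points


def area_under(points):
    """Trapezoidal area under the ROC curve (0.5 is chance, 1.0 is perfect)."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical outputs for eight lesions (illustrative values only).
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.10]
labels = [1, 1, 0, 1, 1, 0, 0, 0]

curve = roc_points(scores, labels)
for fpr, tpr in curve:
    # Each point is one sensitivity/specificity pair.
    print(f"sensitivity={tpr:.2f}  specificity={1 - fpr:.2f}")
print(f"area under curve: {area_under(curve):.3f}")
```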

In addition to these terms, two other issues deserve further consideration: the development of input filters and the selection of appropriate performance metrics. The choice of the input values and the details of how they are operationalized are crucial to the performance of the neural network and determine in large part how well lesions can be classified. In the extreme, one could simply take all of the individual pixels as input values. Starting from such raw input, finding meaningful properties among all the differences observed in a set of lesions has proven very difficult. Given the naturally occurring degree of variation and noise, such detailed input almost universally leads to classification failure. Higher-level features (e.g., the gradient of dark regions from the center to the periphery of the lesion) must be individually designed and selected to capture significant qualities of the target. Obtaining a good set of filters is often the most difficult part of the task of building a classifier.

0022-202X/02/$15.00 · Copyright © 2002 by The Society for Investigative Dermatology, Inc.

Implicit in the search for algorithms with ever higher specificity and sensitivity is the assumption that melanoma detection is, in fact, a highly discriminable task, albeit a highly complex, multidimensional one. The strength of the ANN technique comes from the weighted combination of many properties, each of which may be highly ambiguous when taken alone. A hint as to the upper bound on performance can be obtained from the 3% of the histological classifications that were in contention. Given that the readings of the dermatopathologist are the gold standard in lesion classification, there will be a ceiling in the overlap and scoring uncertainty of the lesion populations until markers more unambiguous and unique to the disease are developed.

With such high performance available why, then, do only 25% of dermatologists use any form of dermoscopy, and decidedly fewer avail themselves of cutting-edge digital analysis? It takes more than good science to get a technique into the clinic. Is an automated melanoma detector yet another expensive piece of equipment squeezed into an already overcrowded exam room that requires another time-consuming procedure to be squeezed into an already overcrowded schedule? Is this something only for technophiles in the laboratory or can a good business case be made? What is the received benefit over the established practice of "when in doubt, cut it out" and let the pathologists sort it out? What is the value of the reduction in delayed diagnosis or in unnecessary disfigurement? Answers to these questions are as important as the science in actually getting the technology implemented.

In addition to sorting out these practical considerations, clinicians are faced with a range of technological choices. Since the early 1990s, almost every imaging laboratory with a new technological hammer has tried to pound the melanoma nail. The resulting multiplicity of currently available devices has led to a degree of confusion and uncertainty as to what constitutes the best available technique. Although the current work places appreciable technical detail in the public record, this is not always the case. Often, details sufficient to actually recreate the experiment are not available. Many instruments are currently offered, each with its own proprietary transducer and processing algorithms. It is difficult to sort out the wheat from the chaff.

In May 1993, the committee created by the FCC to test competing HDTV proposals itself created the Grand Alliance in order to combine the four best of the 23 proposed, and proprietary, ideas. Each individual proposal contained serious flaws and, rather than have a selection process among less than optimal alternatives, it became clear that cooperation would produce the best result for the public. Consideration should be given to creating a similar mechanism to review the potential in the current array of melanoma detection devices, an effort perhaps vetted by the NCI. When the market is small (dermatology clinics) and the instrumentation does not cost millions per unit, this form of collaboration might be helpful in increasing user confidence and market penetration. An institutional resource that does not have a vested interest in any particular technology could provide a valuable service in testing and evaluation. Also, the potential to test instruments under the same realistic conditions is valuable given that even a moderately large set of lesions from a single clinic might pose selection problems. It is certainly in the best interest of the patient population to make the best instrumentation readily accessible.

Technological advance does not end with ANNs. Other detection technologies are on the horizon (e.g., confocal, OCT). Dermoscopy in any of its forms is still a 2D compression of a 3D lesion. A volumetric representation might enrich the classification by locating features relative to functional structures such as the dermal-epidermal boundary. The cameras used in dermoscopy usually have only three overlapping broadband color filters. Expansion to spectrographic analysis with more and narrower spectral filters also offers promise. We lack even a basic standard as to what is required in terms of pixels per mm on the skin or bits per pixel for diagnostically sufficient images. Do 45 pixels per mm capture all the available, and necessary, information? Even with the current uncertainties, one thing is clear: when it comes to imaging, dermatology is an orchard with much low-hanging fruit, and technologies such as digital ELM deserve the support required to bring them to the patient.

Figure 1. The basic organization of an Artificial Neural Network. Inputs to each of the nodes in the left column are weighted and summed and then an output transformation is applied to the sum before it is passed on as input to hidden nodes of the next layer (middle columns). The magnitude of the transformed sum of an output node (the rightmost node) is used to assign the classification to the lesion represented by the current inputs.

364 IN THIS ISSUE THE JOURNAL OF INVESTIGATIVE DERMATOLOGY
