

Congressional Update: Report from the Biomedical Imaging Program of the National Cancer Institute

National Cancer Institute Initiative: Lung Image Database Resource for Imaging Research¹

Laurence P. Clarke, PhD, Barbara Y. Croft, PhD, Edward Staab, MD, Houston Baker, PhD, Daniel C. Sullivan, MD

Preliminary clinical studies suggest that spiral computed tomography (CT) of the lungs can improve early detection of lung cancer in high-risk individuals. More clinical studies are needed, however, before public health recommendations can be proposed for population-based screening. Spiral CT generates large-volume data sets and thus poses problems in terms of implementation of efficient and cost-effective screening methods. Image processing algorithms such as computer-assisted diagnostic (CAD) methods have the potential to assist in lesion (eg, nodule) detection on spiral CT studies. CAD methods may also be used to characterize nodules by either assessing the stability or change in size of lesions based on evaluation of serial CT studies, or quantitatively measuring the temporal parameters related to contrast dynamics when using contrast material–enhanced CT studies. CAD methods therefore have the potential to enhance the sensitivity and specificity of spiral CT lung screening studies. Lung cancer screening studies now under investigation create an opportunity to develop an image database that will allow comparison and optimization of CAD algorithms. This database could serve as an important national resource for the academic and industrial research community that is currently involved in the development of CAD methods. The National Cancer Institute request for applications (RFA) (CA-01-001) has already been announced (April 2000) to establish and support a consortium of academic centers to develop this database, the consortium to be referred to as the Lung Image Database Consortium (LIDC). This RFA is now closed. Five academic sites have been selected to be members of the LIDC, the first meeting of this consortium is planned for spring of 2001, and a public meeting is to be held in 2002. This report is abstracted from the previously published RFA to serve as an example of how an initiative is developed by the National Cancer Institute to support a research resource. For specific details of the RFA, please access the following Internet site: http://www.nci.nih.gov/bip/NCI-DIPinisumm.htm#a11.

Acad Radiol 2001; 8:447–450

¹ From the Biomedical Imaging Program, Division of Cancer Treatment and Diagnosis, National Cancer Institute, Executive Plaza North, Suite 6000, 6130 Executive Blvd, Rockville, MD 20852. Received and accepted February 22, 2001. Address correspondence to L.P.C.

© AUR, 2001

Electronic image detectors such as those used in contemporary spiral computed tomographic (CT) scanners acquire more information than can be displayed at any one time with standard display methods. Therefore, research on image processing methods is essential to fully exploit the information that has been acquired. Furthermore, the ability to extract quantitative information from images is increasingly important and requires image processing. Investigators working on image processing frequently do not have access to or the resources necessary to create the large databases of images necessary to develop and test their work. In addition, the comparison and relative evaluation of image processing techniques against each other require common data sets and standardized methods for evaluation. The need for medical image databases as research resources for medical image processing has often been identified as a priority at various National Institutes of Health (NIH) workshops (eg, the National Cancer Institute [NCI] Lung Imaging Workshop and Related Technology Transfer, January 1997; the Image Processing Workshop, October 1998; and the NIH Biomedical Imaging Symposium, June 1999). Previous efforts to create image databases and make them widely available have met with limited success, partly due to lack of consensus on critical issues such as the case selection criteria and methods or metrics used for evaluation of computer-assisted diagnostic (CAD) methods.

An example of the establishment of a successful image database as an international resource is the National Library of Medicine’s Visible Human Project (VHP), which was initiated in 1989. This project was partly supported by the National Science Foundation. The VHP has proven to be far more useful than expected at its inception. This resource consists of CT and magnetic resonance (MR) images of male and female human cadavers, along with photographs of the corresponding cadaver sections. It is continually used around the world for research and educational purposes. The National Library of Medicine is currently supporting work to develop generalizable algorithms for image segmentation and registration that are specifically optimized for use with the VHP data.

Another example of an image database resource in development is the NIH collaborative effort called the Human Brain Project (HBP), begun in 1993, that seeks to create common tools to facilitate research on neuroinformatics. This project was also partly supported by the National Science Foundation. Some of these tools under development are image processing algorithms for neuroimaging data from brain CT, positron emission tomographic (PET), and MR imaging studies. The solutions under development by the VHP for normal anatomy and by the HBP for brain imaging will not necessarily transfer directly to other organ systems, but the NCI will encourage the exchange of information among the investigators supported by this Lung Image Database Consortium (LIDC) initiative, the VHP, and the HBP.

CAD is a general term used to describe a variety of artificial intelligence techniques applied to medical images. CAD methods are being rapidly developed at several academic and industrial sites, particularly for large-scale breast, lung, and colon cancer screening studies. Imaging for lung cancer screening is a good physical and clinical model for the development of image processing and CAD methods, related image database resources, and the development of common metrics and statistical methods for evaluation. Automatic target recognition algorithms are one example of a CAD method. For large-scale screening applications, automatic target recognition is an important method for (a) improving the sensitivity of cancer detection, (b) reducing observer variation in image interpretation, (c) increasing the efficiency of reading large image arrays, (d) improving efficiency of screening by identifying normal images, and (e) facilitating remote reading by experts. Image processing tools are also being developed for temporal analysis of serial images or analysis of the dynamics of contrast material, with the aim of detecting early subtle changes that might not be obvious to the reading physician. CAD techniques may therefore improve the specificity of cancer detection by assigning a quantitative estimate of the probability that a detected lesion is benign or malignant.
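To make the terms sensitivity and specificity concrete, the following is a minimal Python sketch of how a CAD probability estimate might be scored against a reference standard. It is an illustration only and is not taken from the RFA: the `Case` record, the `sensitivity_specificity` function, the 0.5 decision threshold, and the eight example cases are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    """One detected lesion (hypothetical record, for illustration only)."""
    malignant: bool          # reference-standard label ("ground truth")
    cad_probability: float   # CAD estimate that the lesion is malignant, in [0, 1]


def sensitivity_specificity(cases, threshold=0.5):
    """Return (sensitivity, specificity) when CAD calls a lesion
    malignant at or above the given probability threshold."""
    tp = sum(1 for c in cases if c.malignant and c.cad_probability >= threshold)
    fn = sum(1 for c in cases if c.malignant and c.cad_probability < threshold)
    tn = sum(1 for c in cases if not c.malignant and c.cad_probability < threshold)
    fp = sum(1 for c in cases if not c.malignant and c.cad_probability >= threshold)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one malignant and one benign case")
    return tp / (tp + fn), tn / (tn + fp)


# Invented example data: four malignant and four benign lesions.
cases = [
    Case(True, 0.9), Case(True, 0.7), Case(True, 0.4), Case(True, 0.8),
    Case(False, 0.2), Case(False, 0.6), Case(False, 0.1), Case(False, 0.3),
]
sens, spec = sensitivity_specificity(cases, threshold=0.5)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

Lowering the threshold in this sketch raises sensitivity at the possible expense of specificity, which is one reason a shared database and agreed metrics are needed before different CAD methods can be compared fairly.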

Another promising application of CAD is in the prediction of which cases are most suitable for a particular treatment option. These types of CAD methods require consensus on such issues as the development of reference standards (electronic ground truth), software modules for registration of serial images, and related image segmentation. Multimodality imaging (x-ray imaging, CT, MR imaging, or PET/single photon emission CT) will also be important for diagnostic image interpretation. Therefore, development and implementation of multimodality image registration software by the proposed LIDC is desirable. Finally, x-ray CT image reconstruction methods may also influence the performance of CAD methods, and hence the collection of raw image data must be considered in the design of an image database that is intended to be a general resource.

The intent of the proposed request for applications is to support a consortium of institutions to develop consensus guidelines for a spiral CT lung image resource and to construct a database of spiral CT lung images. The investigators funded under this initiative will create a set of guidelines and metrics for database use and will develop a database as a test-bed and showcase for those methods. The database created by this consortium will be available to all researchers and users through the Internet in a timely manner and will have wide utility as a research, teaching, and training resource.

SPECIFIC GOALS

The first goal of the RFA is to create a lung image database consortium (LIDC) to develop consensus guidelines for (a) the preparation and submission of lung CT cases that are representative of clinical practice, (b) the development of reference standards (“ground truth”) for spatial determination of an imaged lesion(s) electronically in three dimensions, (c) the development of standards for histological verification of an imaged lesion(s), and (d) the development of common metrics and software for statistical validation of the performance of image processing methods. The second goal is to create an image database to serve as a common research resource to the medical imaging community to (a) permit early identification of promising software methods from the diverse pool of emerging tools and (b) stimulate the development of advanced image processing methods including temporal analysis and image registration. The third goal is to allow Internet access to the database by the broad imaging research community to stimulate interdisciplinary research collaboration among researchers in academia, government, and industry. Internet access will be accomplished either by using the resources of the NIH to develop and provide the necessary infrastructure, with image access using DICOM (Digital Imaging and Communications in Medicine) standards, or through the use of a distributed image archive by the members of the LIDC.

CLARKE ET AL Academic Radiology, Vol 8, No 5, May 2001

It is anticipated that approximately five individual academic sites will be supported. There is a critical need to have broad representation from the medical imaging community to promote the development of consensus and establishment of standards. To ensure a representative database is collected, the final criteria for selection of awardees will include a requirement for the inclusion of different awardees representing a minimum of three different institutions and the collection of images from at least two different spiral CT x-ray imaging devices (ie, from two different manufacturers). The individual awardee institutions do not need to provide images from more than one CT scanner manufacturer. In assembling the LIDC membership as a whole, NCI staff will ensure that images from at least two different scanner manufacturers will be included. The generation of image subsets of either CT raw data or multimodality images is an optional secondary goal, and thus no specific requirements are posed. The Consortium will determine, after it is formed, what data sets of raw data and multimodality images will be useful and are practical to obtain.
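The two numeric selection requirements above (at least three institutions and at least two scanner manufacturers across the consortium as a whole) can be expressed as a simple check. The Python sketch below is illustrative only; the `Awardee` record, the function name, and the site and vendor names are hypothetical and do not describe the actual awardees.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Awardee:
    """A funded site (hypothetical record, for illustration only)."""
    institution: str
    scanner_manufacturers: frozenset  # manufacturers of the site's spiral CT scanners


def consortium_is_representative(awardees, min_institutions=3, min_manufacturers=2):
    """Check the consortium-wide criteria: enough distinct institutions and
    enough distinct scanner manufacturers. An individual site may contribute
    images from a single manufacturer only."""
    institutions = {a.institution for a in awardees}
    manufacturers = set().union(*(a.scanner_manufacturers for a in awardees))
    return len(institutions) >= min_institutions and len(manufacturers) >= min_manufacturers


# Invented example: five sites, each using a single scanner manufacturer.
awardees = [
    Awardee("Site A", frozenset({"Vendor X"})),
    Awardee("Site B", frozenset({"Vendor X"})),
    Awardee("Site C", frozenset({"Vendor Y"})),
    Awardee("Site D", frozenset({"Vendor X"})),
    Awardee("Site E", frozenset({"Vendor Y"})),
]
print(consortium_is_representative(awardees))
```

The check is applied to the consortium as a whole rather than to each site, mirroring the statement that individual awardee institutions need not provide images from more than one manufacturer.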

COLLABORATIVE RESPONSIBILITIES

A Steering Committee will be the main governing board of the LIDC and will be composed of the principal investigator of each awarded grant, a second member from the research team of each awarded grant selected by its principal investigator, the NCI program director and chief of the Imaging Technology Development Branch. Each of these members will have one vote. The chairperson will be someone other than an NCI staff member and will be selected by the Steering Committee.

The Steering Committee will have primary responsibility for implementation of the overall goals of the Consortium, refining the scope of the original applications submitted so that the work of the Consortium as a whole is integrated, and organizing research tasks for each participating site where necessary. The Steering Committee’s responsibilities will include efforts to reach consensus on the criteria to populate the image database, to monitor the accrual of cases, to monitor image quality control, to review ground truth and related confirmation of pathology to ensure commonality of methods at each site, to determine whether raw data will be incorporated into the database, to decide what multimodality images to include, to expand the database if improved CT imaging sensors become available, to reach a consensus on the metrics and statistical methods for software evaluation, and to ensure DICOM standards are followed for transfer of images and all the patient-related information.

The Steering Committee will be responsible for determining the image format and the mode of transfer to either an NIH central image archive or a distributed image archive within the LIDC. The means of archiving is to be decided by the Steering Committee. In addition, Internet access by the research community must be developed. The Steering Committee will continue to monitor the implementation of the data release plan for the life of the awards. It will also be responsible for the collaboration with other NCI-supported clinical trials where applicable (eg, the American College of Radiology Imaging Network Cooperative Group), for establishing procedures for access to image databases, and for ensuring appropriate selection of imaging protocols. It will also be encouraged to interact with professional societies such as the Radiological Society of North America and the American College of Radiology, with government agencies such as the National Science Foundation, and with other NIH cooperative groups active in generating image databases (eg, HBP, VHP) to ensure community acceptance of standards proposed for evaluation of the image databases. The Steering Committee will facilitate the conduct of studies and reporting and publication of study results. Subcommittees will be established by the Steering Committee, as it deems appropriate.


The Steering Committee will review database submissions, follow-up information on cases selected for inclusion in the database, protocol and DICOM compliance, results of audits, and regulatory requirements at the participating centers, and will formally report the results of its reviews to the NCI in writing. The format and time intervals of such reports will be decided by the Steering Committee. The Steering Committee will meet three times in year 1 and two times per year in subsequent years. The Steering Committee will be responsible for the organization of a public meeting in year 2 (open to all interested researchers working on image processing algorithm development and evaluation, lung cancer screening, image databases, or any related activity) to seek feedback on the plans for the database generation and evaluation.
