Opportunity and Rewards for Pathology Data Sharing Association of Pathology Chairs July 23, 2004 Mt. Tremblant, Quebec Jules J. Berman, Ph.D., M.D.* Program Director, Pathology Informatics Cancer Diagnosis Program, NCI, NIH email: [email protected] *Opinions do not necessarily represent the policies/opinions of the U.S. federal government.

UFO Abductees

Lots of them

They often say about the same thing (independent confirmations)

All walks of life

Mostly honest and rational people

Minority are a little crazy

One problem: no evidence

Researchers who don’t publish their primary data

Lots of them

They often say about the same thing (independent confirmations)

All walks of life

Mostly honest and rational people

Minority are a little crazy

One problem: no evidence

Pathology research is data-intensive.

Example: A tissue microarray study can easily involve terabytes of data.

After your research data reaches a certain size, the data becomes the research, and the journal articles become tiny editorials that describe or interpret the data

Think of the relationship between the earth and the sun.

The sun is hundreds of thousands of times more massive than the earth. Consequently, it’s the earth that orbits the sun.

In data-intensive research, the manuscripts are tiny satellites of editorial comment orbiting a central large BLOB of data.

Data sharing requirements (from funding organizations):

NIH Statement on Data Sharing
http://grants.nih.gov/grants/guide/notice-files/NOT-OD-03-032.html

Data sharing requirements (from journals)

National Research Council UPSIDE: Uniform Principle for Sharing Integral Data and Materials Expeditiously
http://books.nap.edu/books/0309088593/html/R1.html

Data sharing requirements (from Congress, in addition to FOIA):

On June 26, 2003, the "Public Access to Science Act" (H.R. 2613) was introduced in the House by Congressman Sabo.

Purpose: “to exclude from copyright protection works resulting from scientific research substantially funded by the Federal Government.”

Latest Major Action: 9/4/2003 Referred to House subcommittee. Status: Referred to the Subcommittee on Courts, the Internet, and Intellectual Property

Data Quality requirements (from federal government):

The Data Quality Act (DQA) was passed as part of the FY 2001 Consolidated Appropriations Act (Pub. L. No. 106-554, codified at 44 U.S.C. § 3516 note).

The DQA requires the Office of Management and Budget ("OMB") to develop government-wide standards for data quality in the form of guidelines, which OMB has done through a series of rule makings.

Data Quality requirements (from courts):

In 1999, the Supreme Court of Pennsylvania carved a large exception out of the immunity doctrine for expert witnesses. In LLMD of Michigan, Inc. v. Jackson-Cross Co. (1999), the court held that a client could sue his own expert witness for negligence.

In 2002, the Supreme Court of Appeals of West Virginia took the issue further in Davis v. Wallace (2002), suggesting that an expert witness could be sued for negligence not only by his own client but also by the opposing party against whom the expert testifies.

NIH Funding for data sharing

Shared Pathology Informatics Network
http://grants.nih.gov/grants/guide/rfa-files/RFA-CA-01-006.html

Tools for collaborations that involve data sharing
http://grants1.nih.gov/grants/guide/pa-files/PAR-03-134.html

Infrastructure for data sharing and archiving
http://grants.nih.gov/grants/guide/rfa-files/RFA-HD-03-032.html

caBIG
http://cabig.nci.nih.gov/

How are we doing?

The Scientist
Volume 18 | Issue 3 | 47 | Feb. 16, 2004

Scientists Abandon their Software

Good biology programs abound in universities, but academia offers little incentive to keep them current

By Sam Jaffe

http://www.isse.gmu.edu/~adinh/wchap1.html

GAO investigation of fed-funded software projects

29% never delivered

47% never used

19% reworked or abandoned after delivery

3% needed modifications after delivery

2% could be used as delivered

For the IRS There's No EZ Fix

“By assembling a star-studded team of vendors, the IRS thought its $8 billion modernization project would manage itself.

The IRS thought wrong. Now the agency's ability to collect revenue, conduct audits and go after tax evaders has been severely compromised.”

BY ELANA VARON

Apr. 1, 2004 Issue of CIO Magazine

http://www.cio.com/archive/040104/irs.html

Not unusual for a large medical center to spend over $100 million on Information Technology

What use are we getting from all that data?

Do we even have the fundamental tools needed to share data?

1. Standard ways of obtaining medical research data (confidentiality methods)

2. Standard ways of organizing data (nomenclatures, taxonomies, ontologies, classifications, data structures)

3. Standard ways of exchanging and merging data

Ensuring confidentiality

Berman JJ. Zero-check: a zero-knowledge protocol for reconciling patient identities across institutions. Arch Pathol Lab Med. 2004 Mar;128(3):344-6.

Berman JJ. Racing to share pathology data. Am J Clin Pathol. 2004 Feb;121(2):169-71 (editorial).

Berman JJ. Concept-Match Medical Data Scrubbing: How pathology datasets can be used in research. Arch Pathol Lab Med. 2003 Jun;127(6):680-6.

Berman JJ. Threshold protocol for the exchange of confidential medical data. BMC Medical Research Methodology, 2002, 2:12.

Organizing data

Developmental Lineage Classification and Taxonomy of Neoplasms

Free, open access, soon to be merged into NCI Thesaurus

Comprehensive: 102,000+ terms (7+ megabytes)

Heritable class structure with a unique class location for each tumor

XML document that can be cross-annotated with molecular biology databases

Preserves current tumor names, while abandoning purely morphologic categories (e.g. epithelial/stromal)
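The idea of a heritable class structure with a single class location per tumor can be sketched in a few lines. This is only an illustration under stated assumptions: the class and tumor names below are placeholders rather than entries taken from the classification, and the `PARENT`, `lineage`, and `is_a` names are hypothetical. Storing exactly one parent per entry is what gives every tumor a unique position.

```python
# Hypothetical fragment of a lineage-style hierarchy. Class and tumor names
# are placeholders for illustration, not entries copied from the classification.
PARENT = {
    "neoplasm": None,              # root class has no parent
    "mesoderm": "neoplasm",
    "neural_crest": "neoplasm",
    "fibrous_tissue": "mesoderm",
    "fibrosarcoma": "fibrous_tissue",
    "schwannoma": "neural_crest",
}


def lineage(name):
    """Return the chain of classes from `name` up to the root."""
    chain = []
    while name is not None:
        chain.append(name)
        name = PARENT[name]  # one parent per entry: a unique class location
    return chain


def is_a(name, ancestor):
    """True if `name` is `ancestor` or inherits from it."""
    return ancestor in lineage(name)


if __name__ == "__main__":
    print(" > ".join(reversed(lineage("fibrosarcoma"))))
    print(is_a("schwannoma", "mesoderm"))
```

Because each tumor inherits every class above it, an annotation attached to a class (for example, a link to a molecular biology database) would apply to all of its descendants.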

Berman JJ. Tumor classification: molecular analysis meets Aristotle. BMC Cancer 2004 4:10, 17 March 2004

Article
http://www.biomedcentral.com/1471-2407/4/10

XML file (gzipped)
http://12.183.10.150/jjb/neoclxml.gz

Flat file (gzipped)
http://12.183.10.150/jjb/neoself.gz

Exchanging data

Real world example: The Tissue Microarray Data Exchange Specification

The greatest value of TMAs is the ability to link TMA data with data from other TMAs and from other databases that inform on the data contained in the TMA database.

That value is essentially untapped because there has been no way to publish, exchange, merge and link TMA datasets in a manner that everyone can use and understand.
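Why a shared convention matters for merging can be shown with a toy example. The sketch below assumes two labs happen to use the same field name (`block_id`, a hypothetical identifier) for the tissue block behind each record; it does not use the tags or file layout of the TMA Data Exchange Specification itself, and all field names and values are invented.

```python
from collections import defaultdict


def merge_by_key(datasets, key="block_id"):
    """Combine records from several datasets that share a common key."""
    merged = defaultdict(dict)
    for dataset in datasets:
        for record in dataset:
            # Records describing the same block are folded into one entry.
            merged[record[key]].update(record)
    return dict(merged)


if __name__ == "__main__":
    # Invented example records from two hypothetical labs.
    lab_a = [{"block_id": "B1", "stain_score": 2},
             {"block_id": "B2", "stain_score": 0}]
    lab_b = [{"block_id": "B1", "diagnosis": "adenocarcinoma"}]
    for block, fields in merge_by_key([lab_a, lab_b]).items():
        print(block, fields)
```

The merge is trivial only because both labs agreed on the key and on what it means; without that agreement, no amount of code can link the two datasets reliably.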

The TMA Specification is an open access document that can be used without any restriction.

Its development was sponsored by the NCI and by the Association for Pathology Informatics

Jules J Berman, Mary Edgerton and Bruce Friedman. The tissue microarray data exchange specification: a community-based, open source tool for sharing tissue microarray data. BMC Med Inform Decis Mak. 2003 May 23;3:5

Jules J Berman, Milton Datta, Andre Kajdacsy-Balla, Jonathan Melamed, Jan Orenstein, Kevin Dobbin, Ashok Patel, Rajiv Dhir, Michael J Becich. The tissue microarray data exchange specification: implementation by the Cooperative Prostate Cancer Tissue Resource. BMC Bioinformatics 2004 Feb 27, 5:19

Querying/collecting dispersed and heterogeneous data

Shared Pathology Informatics Network
http://grants.nih.gov/grants/guide/rfa-files/RFA-CA-01-006.html

MGH and affiliates (Isaac Kohane PI)

UCLA and affiliate hospitals (Jonathan Braun PI)

Indiana University and affiliates (Clem McDonald PI)

U. of Pittsburgh (Mike Becich PI)

Program Director, Jules Berman

Shared Pathology Informatics Network – NCI’s most ambitious pathology data sharing effort

Peer-to-peer network of institutions that open their databases to queries on their surgical pathology data, providing de-identified records linked to specimens (first public demo on May 28, 2004)

The individual informatics systems are all different, but they share a common data exchange and data query language

Current prototype has millions of annotated specimens

Future: expanding number of SPIN participants, expanding the kinds of data that can be queried
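The general pattern described above (different local systems answering one common query and returning de-identified records) might look roughly like the following. This is a sketch under assumptions, not a description of how SPIN is implemented: the `Node` class, the field names, and the list of identifying fields are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical set of fields treated as identifying and never returned.
IDENTIFYING_FIELDS = {"patient_name", "mrn"}


@dataclass
class Node:
    """One participating site with its own local record layout."""
    name: str
    records: list
    diagnosis_field: str  # each site may call this field something different

    def query(self, diagnosis):
        """Answer a common-format query with de-identified records."""
        results = []
        for record in self.records:
            if record[self.diagnosis_field].lower() != diagnosis.lower():
                continue
            clean = {k: v for k, v in record.items()
                     if k not in IDENTIFYING_FIELDS}
            # Translate the local field name into the shared one.
            clean["diagnosis"] = clean.pop(self.diagnosis_field)
            clean["source"] = self.name
            results.append(clean)
        return results


def federated_query(nodes, diagnosis):
    """Send the same query to every node and pool the answers."""
    pooled = []
    for node in nodes:
        pooled.extend(node.query(diagnosis))
    return pooled


if __name__ == "__main__":
    # Invented records for two hypothetical sites.
    site_a = Node("site_a", [{"mrn": "001", "dx": "Melanoma", "specimen": "S-10"}], "dx")
    site_b = Node("site_b", [{"patient_name": "X", "final_diagnosis": "melanoma",
                              "specimen": "S-77"}], "final_diagnosis")
    for row in federated_query([site_a, site_b], "melanoma"):
        print(row)
```

Each site keeps its own schema and its own data; only the query format and the shape of the returned records are shared.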

end