Common Biorepository Model (CBM): Specimen Searches Across Real Specimen Collection Data
caBIG® Tissue Banks & Pathology Tools Workspace
TBPT F2F
Houston, Texas
November 3, 2010
Agenda
• Progress on CBM Challenge with CBM 1.0 Beta
• CBM Use at the University of Colorado Cancer Center
• CBM Use at the Medical University of South Carolina Hollings Cancer Center
• CBM Use at Washington University St. Louis and caB2B Querying
• Model, Next Steps and Discussion
Why CBM? (1/2)
• Clinical research often uses only locally obtained specimens due to limited ability to search for specimens outside an institution.
• The ability to aggregate similar specimens from various sites will expand the validation of pathology research findings and thus, more quickly impact patient care.
• caGrid allows data from caBIG-compatible systems to be connected across sites
  • Has a query language (CQL) that allows interrogation of model metadata to navigate and find information about the data on the “grid node”
• caTissue Suite allows individual specimen level queries on caGrid
• How can we get, at minimum, all biorepository systems to share SUMMARY LEVEL information about their specimens, if many use different vendor systems and/or institute-developed solutions?
Why CBM? (2/2)
• A CBM serves as a simple information model for interfacing with systems by sharing key summary-level specimen information, enabling a single search across multiple biorepositories/banks.
• Biorepository software vendors and NCI stakeholders (tissue bank personnel, researchers) convened to develop a CBM, which is now in its final development stage.
• The goal is to reduce the time and effort required by researchers to locate biorepositories with needed specimens.
• Researchers will be able to search via the OBBR’s Specimen Resource Locator to identify specimen resources to fit their research needs.
Compatibility Path
[Figure: axes labeled Compatibility and Functionality; stages shown: Bronze caGrid Services, Common Biorepository Model, Specimen Management Services]
Path to CBM 1.0 (to 2009)
[Timeline figure, 2008 to 2009. Periods on the timeline: November, Dec-Jan, Feb-May, June-Aug, Sept-Nov, December. Milestones shown:]
• Internal NCI Revisions
• Updated version CBM 0.93, Review with Stakeholders; engage more vendors
• CBM Vendor Participants Accept CBM Challenge (14)
• SRL Stakeholders define initial vocabulary terms desired, initial Silver Level Review Feedback incorporated into CBM 0.95
• Vendor workshop developed CBM 0.9
• CBM 0.95 Service files generated (Grid-KC, with simple data set from caTissue ETL); CBM vendors ready to begin looking at service docs
Path to CBM 1.0 (2010)
[Timeline figure, 2010. Periods on the timeline: Jan-April, June, July-Oct, Nov. Milestones shown:]
• CBM Participants Mapping to REAL Collections:
  • UCCC, using .NET service connected to home-grown system
  • Hollings Cancer Center – AIM TissueMetrix
  • WashU – caTissue
• CBM 1.0Beta-Oct
  • All terms in caDSR
  • Mapping to terms with SNOMED, ICD-9/10 synonyms
  • Corrections to model based on June feedback
• CBM Vendors Stand up Grid with Test Data (IMS, 5AM, caTissue-WashU (early ETL), AIM)
  • FreezerWorks testing code
  • Daedalus, Healthcare IT, Westat, GenoLogics, Ocimum Biosolutions, LabVantage looking at code, ready when vendors ready
  • UCCC interested in testing
• CBM 1.0Beta-May – Service Files Generated for early testers; new term curation started in NCI Thesaurus
• 2011: CBM 1.0 ECCF-SAIF Compliant, caBIG Compatible Service
Where we are today, Tuesday Nov 3
• UCCC has a node on the Training Grid – generated from a CBM service based on October 2010 CBM1.0Beta
  • ETL mapping done from MS-SQL to MS-SQL
  • Used CBM vocabulary
  • Using CBM1.0Beta-October EA model
• Hollings Cancer Center – with Artificial Intelligence in Medicine (AIM) TissueMetrix
  • ETL mapping from Oracle SQL to MySQL
  • Extensively used the CBM vocabulary (CBM1.0Beta October vocab)
  • Close to showing on the Grid – can query the Database
  • Using CBM1.0Beta-May service files
• Washington University
  • Early caTissue Suite-CBM ETL scripts from April 2010 used (not complete mapping)
  • Using CBM1.0Beta-May service files and Database
Training Grid Portal has CBM Nodes: http://portal.training.cagrid.org/web/guest/home
iGoogle Gadget – Querying the CBM Grid Node
• iGoogle Application
• Queries use GridPortlet
• Using “My Gadgets”
  • http://www.neatdev.com/cabig/gadget.xml
• Aggregating Data from the Grid:
  • UCCC
  • Washington University
How to install the Specimen Counts Google Gadget (connecting to CBM test nodes on the caGrid)
• Works best in the Chrome and Firefox browsers
• Go to http://www.google.com/ig
• Click “add stuff”
• Search for “My gadgets” and click to add this to your iGoogle page
• Install “My gadgets”, then go back to your iGoogle page
• Once installed, enter this address in the “add gadget” bar: http://www.neatdev.com/cabig/gadget.xml
• The Specimen Counts should appear on your iGoogle homepage.
Hollings Cancer Center: Number of cases by specimen
• Waiting for Grid Connection – will be up soon (next few days?)
• Specimens that will be exposed are reported here: http://tmxstorefront.hcc.musc.edu/
BIRT (Ex. June 2009) – open-source reporting tool
• Tie into XML output of Grid Queries
• Customize reporting
• Could customize what is displayed
Lessons Learned as we continue…
• Working with vendors and multiple systems helps identify issues and improve iteratively
  • Mapping issues: are we all mapping to the same things?
  • Testing with real data is when challenges are identified
  • Grid node testing: the various environments and difficulties encountered help improve documentation and expose potential muddy areas!
• Extract-Transform-Load process must be meticulous, with all terms agreed on
  • Mapping decisions are key in the process: map to a code or map to an NCI preferred name?
  • All must subscribe to the same version, or work out how to map against different versions
  • Real testing is when we find these challenges. THANKS to the institutes testing with us in this Challenge
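One way to picture the mapping and versioning points above is the small Python sketch below. It is only an illustration under stated assumptions: the concept codes, preferred names, local terms, and the `map_terms` helper are hypothetical placeholders, and only the version label is taken from the slides. Each local term resolves to a concept that carries both a code and a preferred name, and a site on a different vocabulary version is rejected rather than silently mixed in.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    code: str            # placeholder code, not a real NCI Thesaurus code
    preferred_name: str  # placeholder preferred name


# Version label as it appears in the slides; everything else is hypothetical.
AGREED_VERSION = "CBM1.0Beta-October"

# Hypothetical site-specific terms mapped to shared concepts.
LOCAL_TERMS = {
    "frozen": Concept("X0001", "Example Frozen Specimen"),
    "ffpe": Concept("X0002", "Example Fixed Specimen"),
}


def map_terms(local_terms, site_version):
    """Map local terms to concepts; report terms that have no mapping."""
    if site_version != AGREED_VERSION:
        raise ValueError(
            f"site uses {site_version}, expected {AGREED_VERSION}"
        )
    mapped, unmapped = {}, []
    for term in local_terms:
        concept = LOCAL_TERMS.get(term.strip().lower())
        if concept is None:
            unmapped.append(term)
        else:
            mapped[term] = concept
    return mapped, unmapped


mapped, unmapped = map_terms(["Frozen", "ffpe ", "mystery"], AGREED_VERSION)
```

Keeping the code and the preferred name together means either can be exposed later, while the unmapped list gives the ETL team a concrete worklist of terms still to be agreed on.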
Fall 2010 Additions to help ETL
• Tables added with the lists of values from the model
  • Can be directly accessed by ETL processes
  • Enforces integrity via foreign keys
• Mapping table
  • Addresses mappings to
    • ICD9CM
    • ICD10
    • SNOMEDCT
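As a rough illustration of these two additions, the sketch below uses Python's built-in sqlite3 module instead of the MySQL database that the CBM provides. All table names, column names, and codes are hypothetical placeholders and are not taken from CBM.sql. It shows a list-of-values table referenced by a foreign key, plus a mapping table keyed by source vocabulary and source code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
-- Hypothetical list-of-values table generated from the model.
CREATE TABLE permissible_value (
    nci_code TEXT PRIMARY KEY,
    nci_name TEXT NOT NULL
);
-- Hypothetical summary table; its code must exist in permissible_value.
CREATE TABLE summary_record (
    id INTEGER PRIMARY KEY,
    nci_code TEXT NOT NULL REFERENCES permissible_value(nci_code),
    record_count INTEGER NOT NULL
);
-- Hypothetical mapping table (source_vocabulary such as ICD9CM, ICD10, SNOMEDCT).
CREATE TABLE term_mapping (
    source_vocabulary TEXT NOT NULL,
    source_code TEXT NOT NULL,
    nci_code TEXT NOT NULL REFERENCES permissible_value(nci_code),
    PRIMARY KEY (source_vocabulary, source_code)
);
INSERT INTO permissible_value VALUES ('X0001', 'Example Concept');
INSERT INTO term_mapping VALUES ('SNOMEDCT', 'SRC-001', 'X0001');
""")


def to_nci_code(vocabulary, code):
    """Look up the shared code for a source-vocabulary code, or None."""
    row = conn.execute(
        "SELECT nci_code FROM term_mapping "
        "WHERE source_vocabulary = ? AND source_code = ?",
        (vocabulary, code),
    ).fetchone()
    return row[0] if row else None


def load_summary(nci_code, count):
    """Insert a summary row; return False if the foreign key rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO summary_record (nci_code, record_count) "
                "VALUES (?, ?)",
                (nci_code, count),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

With this layout an ETL job can translate a site code through `to_nci_code` and then rely on the database itself to refuse any row whose value is not in the agreed list.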
What we have today
• Artifact: CBM with Value Domains.EAP (UML Model)
  • Links:
    https://ncisvn.nci.nih.gov/svn/common_biorepository_model/trunk/caCORE_SDK/models/CBM%20with%20Value%20Domains.EAP
    https://ncisvn.nci.nih.gov/svn/common_biorepository_model/trunk/caCORE_SDK/models/CBM%20with%20Value%20Domains.xmi
  • Description: EA model (with permissible values) and XMI version
• Artifact: CBM.SQL (MySQL Database)
  • Link:
    https://ncisvn.nci.nih.gov/WebSVN/filedetails.php?repname=common_biorepository_model&path=%2Ftrunk%2Fdatabase%2FCBM.sql
  • Description: Database to be used for ETL; now has the NCI Concept Code and NCI Concept name
• Artifact: HTML view of EA model
  • Link:
    https://ncisvn.nci.nih.gov/svn/common_biorepository_model/trunk/html_documentation/index.htm
  • Description: Able to navigate through EA model (w/o having EA on machine) – can walk through UML model
• Artifact: /cbm service files and Grid Deployment instructions
  • Status: 90% complete. Needs some testing of the /cbm
  • Description: Files generated from running model through caCORE-SDK and Introduce (latest version)
Where we are headed:
• Specimen Resource Locator: http://biospecimens.cancer.gov/locator
• From 2002 to December 2009, ~14,000 queries looking at static-based SRL
• Q1 2011 – SRL Developers will begin
  • CBM 1.0 documentation/service/test package will be developed (caBIG® Compliant ECCF/SAIF service)
  • SRL 2.0 Work will begin
    1. Electronic web form = A web-based questionnaire based on CBM
    2. Common Biorepository Model = through CBM challenge adoptees
• SRL will provide names of biorepositories’ contacts that have specimens researchers are looking for.
• Additional information will be obtained from direct interaction with contact
• Material Transfer Agreements (collaboration, purchase, etc) – will then be discussed between researcher and biorepository
CBM 1.0Beta Next Steps in Testing
• Work with our testing vendors to check if we can query them
  • Through querying, determine if the values are appropriate/expected
  • This also helps define how the Specimen Resource Locator will expect data to come in
• Test if .NET-based service is matching with the caCORE-SDK/Introduce version, in terms of querying paths
• caTissue Suite ETL – continue testing with Washington University, to expose all the data types and thus, aggregate with the specimen information from other test sites
• Incorporate feedback, more detailed instructions, guidance for ETL
• CBM 1.0Beta – transfer to Specimen Resource Locator Development team
  • ECCF/SAIF documentation
  • Comply with the standards set up for new caBIG® services (ISO 21090 data types, etc.)
• Position to be first Specimen Management Service Set component
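The check that queried values are "appropriate/expected" could be sketched roughly as below. This is a minimal Python illustration under stated assumptions: the field names, permissible values, and the `find_unexpected_values` helper are hypothetical and do not come from the CBM model or its vocabulary.

```python
def find_unexpected_values(rows, permissible):
    """Return (row index, field, value) for every value outside its allowed set.

    rows        -- list of dicts, one per record returned by a query
    permissible -- dict mapping a field name to the set of allowed values
    """
    problems = []
    for index, row in enumerate(rows):
        for field, allowed in permissible.items():
            value = row.get(field)  # a missing field shows up as None
            if value not in allowed:
                problems.append((index, field, value))
    return problems


# Hypothetical permissible values and query results from one test node.
permissible = {
    "specimen_type": {"Tissue", "Blood"},
    "preservation": {"Frozen", "Fixed"},
}
rows = [
    {"specimen_type": "Tissue", "preservation": "Frozen"},
    {"specimen_type": "tissue", "preservation": "Fixed"},  # wrong case
    {"specimen_type": "Blood"},                            # missing field
]
problems = find_unexpected_values(rows, permissible)
```

Reporting the row index and field alongside the offending value gives a test site enough detail to trace the problem back to a specific ETL mapping.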
How to Participate
• View CBM Wiki (Latest information): https://wiki.nci.nih.gov/display/TBPT/Common+Biorepository+Model+%28CBM%29
• Review Model and vocabulary – how will it match with your biorepository data?
• Test the service/mapping when final service files are released
  • ETL process can begin today
• Contact TBPT
• If you are using a CBM Participating Vendor system – let them know you are interested in testing/using CBM
• Identify key projects in your institute that could have/will benefit from finding more specimens for their work – or finding uses for the specimens they currently hold
Acknowledgements
• Hollings Cancer Center
• Artificial Intelligence in Medicine (AIM)
• University of Colorado Cancer Center
• University of Virginia (.NET service)
TBPT – CBM Team
• Ian Fore, DPhil, NCI-CBIIT
• Anna Fernandez, PhD, Booz Allen Hamilton
• Libby Prince, Sapient Government Solutions
• Andrew Breychak, Sapient Government Solutions
• Ben Fombonne, Kelly Government
• Beth DiGiulian, Booz Allen Hamilton
• caGrid KC, special thanks to Joe George, Bill Stephens, & Justin Permar!
• Tissue Banking Knowledge Center
• Vendor Community
• Biorepository community!
Key CBM Sites:
• CBM Site (Latest Information): https://cabig.nci.nih.gov/workspaces/TBPT/CBM/
• CBM 1.0Beta Grid Test Package (May 2010) – email and links will be found on CBM website
• CBM Latest Model (Interactive Enterprise Architect version, IE browser):
https://ncisvn.nci.nih.gov/svn/common_biorepository_model/trunk/html_documentation/index.htm
• Find CBM 1.0Beta Test Nodes!
  • The service is deployed to the Training Grid. Here is the service URL: http://portal.training.cagrid.org/web/guest/discovery
  • Then search for a service using the “Name” field, with name “CBM”
• CBM Site for Vendor/Community Comments: https://cabig-kc.nci.nih.gov/Biospecimen/forums/
  • Find “Common Biorepository Model” – Discussions Forum
caBIG iGoogle Gadget
Queries & displays NCI Common Biorepository Specimen Count Data
Developed by Booz Allen Hamilton
caBIG iGoogle Gadget Flowchart
[Flowchart boxes: caBIG iGoogle Gadget; PHP Page; caGRID.org REST form; caGRID.org XML output. Arrows numbered 1 to 4.]
• First, the caBIG iGoogle Gadget is added to an iGoogle Homepage. A Google Gadget is stored in an XML file which contains gadget metadata as well as HTML, CSS and JavaScript.
• When the iGoogle homepage loads, the Gadget makes an AJAX request to retrieve data from a PHP page hosted on an external server.
• The PHP page makes several queries using the RESTful interface to cagrid.org, located at http://portal-demo.training.cagrid.org/cagridportlets/xml/form. This form returns links to XML files, which contain data retrieved from multiple servers.
• Using the links obtained in the previous step, these XML files are retrieved and summed within the PHP page. The PHP page then outputs this summary data in XML format.
• The caBIG iGoogle Gadget’s AJAX request is completed as the XML data output by the PHP page is retrieved. A JavaScript callback function parses the summary XML and outputs it to the screen.
• Selecting a different option within the caBIG iGoogle Gadget does not make another AJAX request, but instead locally re-parses the already retrieved XML summary data.
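The aggregation step in the gadget flow (retrieve several XML result files, sum them, output one summary XML) can be sketched as follows. The original page was written in PHP; this sketch uses Python instead, and the XML element and attribute names (`counts`, `count`, `category`, `summary`) are hypothetical, since the slides do not show the actual caGrid output format. Network retrieval is left out, so the inputs are plain XML strings.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def sum_counts(xml_documents):
    """Add up per-category counts across several XML result documents."""
    totals = Counter()
    for document in xml_documents:
        root = ET.fromstring(document)
        for node in root.iter("count"):
            totals[node.get("category")] += int(node.text)
    return totals


def to_summary_xml(totals):
    """Serialize the totals as one summary document, categories sorted."""
    root = ET.Element("summary")
    for category in sorted(totals):
        child = ET.SubElement(root, "count", category=category)
        child.text = str(totals[category])
    return ET.tostring(root, encoding="unicode")


# Hypothetical results from two grid nodes.
site_a = '<counts><count category="tissue">10</count><count category="blood">4</count></counts>'
site_b = '<counts><count category="tissue">5</count><count category="blood">3</count></counts>'

totals = sum_counts([site_a, site_b])
summary_xml = to_summary_xml(totals)
```

Doing the summation on the server side, as the slides describe, means the gadget receives one small document and can switch views by re-parsing it locally instead of querying the grid again.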
Key TBPT/NCI Sites:
• The Specimen Resource Locator (website) http://biospecimens.cancer.gov/locator
• caBIG® Tissue Banks & Pathology Tools Workspace:
https://cabig.nci.nih.gov/workspaces/TBPT
• OBBR – Office of Biorepositories and Biospecimen Research:
http://biospecimens.cancer.gov
• OBBR & NCI Best Practices http://biospecimens.cancer.gov/practices/
BIRT (in the works) – open-source reporting tool
• Tie into XML output of Grid Queries
• Customize reporting
• Could customize what is displayed
CBM Next Steps
caBIG® Tissue Banks & Pathology Tools Workspace
caBIG® TBPT
National Cancer Institute
CBM Challenge (DEC 2009 Update)
[Timeline figure, 2009 to 2010: iterative development process with the CBM community]
• June 2009 – Challenge announced to use CBM to expose data on the caGrid – Vendor & Cancer Centers (biorepositories) involvement
• Fall 2009 – A CBM with caGrid artifacts ready for testing
• Winter 2010 – First set of vendors have incorporated a CBM into their systems and can expose test data on the caGrid
• Fall 2010 – First set of cancer centers (biorepositories) have deployed CBM at their site and are exposing real data on the caGrid
• End of 2010/2011 – Researchers demonstrate real-world cases of successful research with biospecimens located via CBM
Activities shown along the timeline:
• Vendor commitment to identify resources/timeline for CBM testing in their product
• NCI-CBIIT releases initial documentation, SW artifacts and tools for vendors to start testing
• Vendors/caTissue/NCI-CBIIT pass test suite on test data
• Biorepositories' commitment to expose their own biorepository data using CBM-ready SW
• Biorepositories with Vendors/caTissue successfully expose real data on the caGrid
• NCI SRL 2.0 ready to query for specimens across caGrid via CBM
• Cancer Centers identify projects that will use CBM in specimen search
• Researchers report out on research impact using CBM
• Identify future publications using specimens located through CBM