A LexWiki-based Representation and Harmonization Framework for caDSR Common Data Elements Guoqian Jiang, Ph.D. Robert Freimuth, Ph.D. Harold Solbrig Mayo Clinic Architecture/VCDE Face-to-Face Meeting Atlanta, GA, October 22, 2009


A LexWiki-based Representation and Harmonization Framework for caDSR Common Data Elements

Guoqian Jiang, Ph.D.
Robert Freimuth, Ph.D.
Harold Solbrig
Mayo Clinic

Architecture/VCDE Face-to-Face Meeting
Atlanta, GA, October 22, 2009


Challenges for Metadata Community

• The community faces a harmonization scaling problem, and tooling to navigate the model space is urgently needed.

• To foster better community adoption and governance, a more open, scalable and collaborative platform is desired.


Wiki/Semantic Wiki/LexWiki

• Wiki as a collaborative system – community generated content.

• Semantic wiki as a platform – supports different levels of the formality continuum (Free text -> OWL).

• LexWiki – a collaborative authoring platform for large-scale biomedical terminologies.
  • BiomedGT – Biomedical Grid Terminology
  • CTCAE – Common Terminology Criteria for Adverse Events
  • WHO ICD11 – the International Classification of Diseases
  • NeuroLex – the Neuroscience Lexicon
  • XMDR – eXtended MetaData Registry Project
  • CSHARE – CDISC Shared Health and Research Electronic Library


Objectives

• We propose a LexWiki-based representation and harmonization framework for the caDSR CDEs.

• We intend to provide enhanced capabilities for
  • semantic storage and retrieval
  • community involvement and collaboration


Representation and Harmonization Framework in UML Model (Compatible with ISO 11179)

• The representation and harmonization framework is expressed as a UML model, which we believe is compatible with the ISO 11179 standard.

• The boxes in light blue indicate what is loaded from individual contributors: the classes for data elements, value domains and permissible values. The boxes in pink represent the data element semantics, in other words, the formal definition of the data elements. The boxes in light green represent the terminology elements, indicating what comes from the various terminology resources such as SNOMED-CT, MedDRA, etc. The boxes in yellow indicate where the harmonization assertions happen.


Representation and Harmonization Framework in UML Model (Compatible with ISO 11179)

• This diagram shows the relationship between a data element and a data element concept. The yellow box between them indicates that the data element meaning should be asserted.

• This diagram shows the relationship between the permissible value class and the value meaning. The yellow box between them indicates that the permissible value meaning link should be asserted.
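As a rough illustration of the model described on these two slides, the classes and their harmonization links might be sketched in Python as follows. This is only a sketch under stated assumptions: the class and field names are chosen for illustration and are not taken from the UML model itself, and the optional links stand in for the yellow assertion boxes.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ValueMeaning:
    """Terminology-side meaning of a value (hypothetical fields)."""
    concept_code: str
    label: str


@dataclass
class PermissibleValue:
    """A contributed value; `meaning` is the link to be asserted."""
    value: str
    meaning: Optional[ValueMeaning] = None


@dataclass
class DataElementConcept:
    """Formal semantics of a data element (hypothetical fields)."""
    object_class: str
    property_concept: str


@dataclass
class DataElement:
    """A contributed data element; `concept` is the link to be asserted."""
    name: str
    source: str
    permissible_values: List[PermissibleValue] = field(default_factory=list)
    concept: Optional[DataElementConcept] = None


def unasserted_values(element: DataElement) -> List[str]:
    """Return the permissible values that still lack a value meaning."""
    return [pv.value for pv in element.permissible_values if pv.meaning is None]
```

A helper such as `unasserted_values` would let a curator list the permissible values that still need a harmonization assertion.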


Workflow process

• This diagram shows the workflow process derived from the CSHARE project. Data elements are contributed by individual organizations; the community then works together to link them with terminologies and standards, merge and harmonize them, and generate new standardized definitions.

SNOMED, NCI Meta, ICD 10, BRIDG, CDISC etc.


Individual Data Element Representation - Description

• This screenshot shows an individual data element represented in the wiki platform. The lexical description part shows the lexical descriptions of the data element, which belongs to the domain Lesion Measurement and whose original source is caBIG.


Individual Data Element Representation - Valueset (or codelist)

• Under the Values tab, the data element is linked to a code list (value set), and the permissible values contained in that value set are listed. Interestingly, only one permissible value has a concept reference asserted.


Underlying Semantic Annotations

• Each data element has underlying semantic annotations that can be used for formal rendering. We use an "11179" prefix in property names to indicate that the model is ISO 11179 compatible.

• Please note that a list of terminologies is also loaded in the wiki to facilitate the harmonization process. The following slides will show this point.
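To make the idea of prefixed semantic annotations concrete, the following sketch renders a set of property/value pairs in Semantic MediaWiki's inline `[[Property::value]]` annotation syntax. The property names shown (for example `11179 datatype`) are hypothetical illustrations of the prefix convention, not the actual property names used in the wiki.

```python
def to_smw_annotations(properties):
    """Render property/value pairs as Semantic MediaWiki inline annotations.

    `properties` maps a property name to its value; the names used by the
    caller are assumptions made for illustration only.
    """
    lines = []
    for name, value in properties.items():
        # Each annotation takes the form [[Property name::value]].
        lines.append(f"[[{name}::{value}]]")
    return "\n".join(lines)


# Hypothetical annotations for a single data element page.
example = {
    "11179 datatype": "Character",
    "11179 object class concept": "Lesion",
    "11179 property concept": "Increase",
}
wikitext = to_smw_annotations(example)
```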


RDF/OWL Rendering

• This is the RDF/OWL rendering of the semantic annotations for that specific data element. The rendering is a built-in feature of Semantic MediaWiki and could potentially be extended to interface with other existing Semantic Web applications.
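The idea that each annotation becomes an RDF statement can be sketched as below. This is a simplified illustration that emits N-Triples-style lines using only the standard library; it is not the export format that Semantic MediaWiki itself produces, and the base IRI and property names are assumptions.

```python
from urllib.parse import quote

# Hypothetical base IRI for wiki pages and properties.
BASE = "http://example.org/lexwiki/"


def iri(local_name):
    """Build an IRI from a wiki page or property name."""
    return f"<{BASE}{quote(local_name.replace(' ', '_'), safe=':/')}>"


def literal(text):
    """Quote a plain string literal, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_ntriples(page, properties):
    """Turn a page's property/value annotations into one triple per line."""
    return [
        f"{iri(page)} {iri('Property:' + name)} {literal(value)} ."
        for name, value in properties.items()
    ]


triples = to_ntriples("Example Data Element", {"11179 datatype": "Character"})
```

Each annotated page is the subject, each property is the predicate, and the annotated value is the object, which is what makes the annotations consumable by other Semantic Web tools.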


Harmonization Assertion – Lexical definition and ISO 21090 datatypes

• This slide shows form-based authoring support for lexical definitions and ISO 21090 datatypes, with enhanced functionality such as autocompletion and definition display.


Harmonization Assertion – Concept reference and BRIDG linkage

• This slide shows form-based concept reference authoring with autocompletion and definition display support. The prefix NCIM indicates that the concepts come from the NCI Metathesaurus coding scheme.

• The platform also supports mapping to the BRIDG model.


Slicer and Dicer – Domain-based view

• The slide shows a domain-based view through an Exhibit browser, which was developed by the MIT SIMILE group. We call this browser a slicer and dicer; it provides a flexible way to browse the data elements from different sources.


Slicing and dicing

• We can slice and dice by datatype, object class concept and property concept. In this slide, 6 data elements are grouped together by datatype Character, object class concept Lesion and property concept Increase.
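The grouping behind this kind of faceted view can be sketched in a few lines. This is a minimal illustration, not the Exhibit browser's implementation; the element names and the second facet combination are made up for the example.

```python
from collections import defaultdict


def slice_by(elements, facets):
    """Group data element names by the values of the chosen facets."""
    groups = defaultdict(list)
    for element in elements:
        # The group key is the tuple of facet values for this element.
        key = tuple(element[facet] for facet in facets)
        groups[key].append(element["name"])
    return dict(groups)


# Hypothetical data elements contributed from different sources.
elements = [
    {"name": "DE-A", "datatype": "Character",
     "object_class": "Lesion", "property": "Increase"},
    {"name": "DE-B", "datatype": "Character",
     "object_class": "Lesion", "property": "Increase"},
    {"name": "DE-C", "datatype": "Number",
     "object_class": "Lesion", "property": "Size"},
]
groups = slice_by(elements, ["datatype", "object_class", "property"])
```

Elements that land in the same group share the same datatype, object class concept and property concept, which makes them natural candidates for harmonization.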


CDE Proposal – New Definition

• A CDE proposal can then be generated by merging those 6 data elements. The community can work together on harmonization and create a new definition.

• I would like to mention that the wiki platform has built-in collaboration features for discussion and version management.


Summary

• In summary, our LexWiki-based framework provides a flexible, extensible and collaborative way to represent and harmonize data elements contributed from different sources.

SNOMED, NCI Meta, ICD 10, BRIDG, CDISC etc.


Use Cases for a Platform to Support Collaborative Authoring of Data Elements

• Data Element Harmonization
  • Identification of related CDEs
  • Creation of CDE standards from existing CDEs
  • Development of DAMs (or DAM subdomains)

• Linkage among CDEs, DAMs and Standards/Terminologies

• Connections to MDR – caDSR/cgMDR/openMDR

• Dynamic Extensions
  • Advertisement of proposed or implemented extensions
  • Mirror extensions in other application instances
  • Use of extensions in new applications

• Connections with Semantic Web applications/communities

• …

The Vocabulary Knowledge Center will host a teleconference to demonstrate CSHARE and discuss other potential use cases.


Questions?