Supporting clinical trial data curation and integration with table mining Nikola Milosevic 1 , Cassie Gregson 3 , Robert Hernandez 3 , Goran Nenadic 1,2 1 School of Computer Science, University of Manchester 2 The Farr Institute @HeRC 3 AstraZeneca

Page 1: Supporting clinical trial data curation and integration with table mining

Supporting clinical trial data curation and integration with table mining

Nikola Milosevic (1), Cassie Gregson (3), Robert Hernandez (3), Goran Nenadic (1,2)

(1) School of Computer Science, University of Manchester
(2) The Farr Institute @HeRC
(3) AstraZeneca

Page 2: Supporting clinical trial data curation and integration with table mining

Clinical trial publications
• Around 800 000 clinical trials in PubMed
• Difficult to digest/search
• Text mining approaches
• But tables and figures are often not processed

Page 3: Supporting clinical trial data curation and integration with table mining

Tables in publications
• Present factual information
• Usually:
  • Experimental settings (e.g. demographics)
  • Findings and results (e.g. drug-drug interactions (DDI), side effects, adverse events…)
  • Background information (previous research, datasets, etc.)
• Examples
• Important information about trials

Page 4: Supporting clinical trial data curation and integration with table mining

Extraction and curation of table data

Page 5: Supporting clinical trial data curation and integration with table mining

Challenges
• Complex structure
  • Table dimensionality (1, 2, multi-dimensional)
  • Visual relationships
• Dense content
  • Ambiguous short text
  • Lack of context
  • Acronyms and abbreviations
  • Incomplete information

Page 6: Supporting clinical trial data curation and integration with table mining
Page 7: Supporting clinical trial data curation and integration with table mining

Table analysis overview

Page 8: Supporting clinical trial data curation and integration with table mining

Table types (1)
• 4 types: list, matrix, super-row and multi-table
• List table:

Page 9: Supporting clinical trial data curation and integration with table mining

Table types (2)
• Matrix table

Page 10: Supporting clinical trial data curation and integration with table mining

Table types (3)
• Super-row table

Page 11: Supporting clinical trial data curation and integration with table mining

Table types (4)
• Multi-table
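
The example images for the four table types did not survive in this transcript, so the following is only a minimal sketch of how the distinction could be expressed in code. It assumes the usual reading of the names: a list table is one-dimensional, a matrix table has column headers and row stubs, a super-row table groups rows under spanning sub-header rows, and a multi-table stacks several tables into one. The names `TableType` and `guess_table_type` and the heuristics themselves are hypothetical illustrations, not the authors' method.

```python
from enum import Enum


class TableType(Enum):
    LIST = "list"                # one-dimensional list of items
    MATRIX = "matrix"            # column headers on top, row stubs on the side
    SUPER_ROW = "super-row"      # rows grouped under spanning sub-header rows
    MULTI_TABLE = "multi-table"  # several tables stacked into one


def guess_table_type(rows):
    """Toy heuristic (an assumption, not the method from the slides).

    `rows` is a non-empty list of rows; each row is a list of cell strings,
    with "" standing for a blank cell. The first row is taken as the header.
    """
    width = max(len(row) for row in rows)
    if width == 1:
        return TableType.LIST

    header, body = rows[0], rows[1:]

    # Assumption: a repeated header row marks the start of a second table.
    if any(row == header for row in body):
        return TableType.MULTI_TABLE

    def is_super_row(row):
        # Only the first cell is filled; the rest of the row is blank.
        return bool(row[0].strip()) and all(not c.strip() for c in row[1:])

    if any(is_super_row(row) for row in body):
        return TableType.SUPER_ROW
    return TableType.MATRIX
```

A real system would need richer cues (cell spans, formatting, indentation) than this sketch uses; it only shows that the four categories can be separated by structural features of the rows.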

Page 12: Supporting clinical trial data curation and integration with table mining

Example of decomposition

Page 13: Supporting clinical trial data curation and integration with table mining

Example of decomposition

Page 14: Supporting clinical trial data curation and integration with table mining

Example of decomposition
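
The decomposition slides are image-only in this transcript. As a rough sketch, assuming decomposition means turning each data cell into a record that carries its column header, row stub and (if any) enclosing super-row, it might look like the following. The function name `decompose`, the record keys and the sample table are all hypothetical.

```python
def decompose(rows):
    """Split a rectangular table into one record per data cell.

    Assumptions (not from the slides): the first row is the header, the first
    column holds row stubs, and a row whose only filled cell is the first one
    is a super-row that applies to the rows below it.
    """
    header = rows[0]
    records = []
    super_row = None
    for i, row in enumerate(rows[1:], start=1):
        stub, values = row[0], row[1:]
        if stub.strip() and all(not v.strip() for v in values):
            super_row = stub  # remember the group label, emit no data cells
            continue
        for j, value in enumerate(values, start=1):
            records.append({
                "row": i,
                "col": j,
                "value": value,
                "header": header[j],
                "stub": stub,
                "super_row": super_row,
            })
    return records


# Made-up sample table, for illustration only.
sample = [
    ["", "Drug", "Placebo"],
    ["Adverse events", "", ""],
    ["Headache", "3", "1"],
]
records = decompose(sample)
```

Each record is then self-describing: the value "3" is no longer just a cell at row 2, column 1, but "Headache" under "Adverse events" for "Drug".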

Page 15: Supporting clinical trial data curation and integration with table mining

Results

Page 16: Supporting clinical trial data curation and integration with table mining

Next steps
• Add semantic annotations
• Link patterns in data cells with their meaning
• Build/expand knowledge bases
• Relate to existing knowledge on the semantic web

Page 17: Supporting clinical trial data curation and integration with table mining

Annotation schema
• Meta-data
• Paper (name, abstract, authors, publisher)
• Authors (names, emails, affiliations)
• Table (caption, footers)
• Cells (content, role)
• Inter-cell relationships
• Semantics (links to ontologies, dictionaries, knowledge bases)
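
One way to picture the annotation schema is as a set of nested records. The sketch below mirrors the fields named on the slide using Python dataclasses; the class names, field types, role labels and the way relationships and semantic links are stored (as lists of strings) are assumptions made for illustration, not the schema the authors implemented.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Author:
    name: str
    email: Optional[str] = None
    affiliations: List[str] = field(default_factory=list)


@dataclass
class Cell:
    cell_id: str
    content: str
    role: str  # e.g. "header", "stub", "super-row", "data" (labels are assumed)
    related_cells: List[str] = field(default_factory=list)   # inter-cell relationships, by cell_id
    semantic_links: List[str] = field(default_factory=list)  # ontology/dictionary/knowledge-base references


@dataclass
class Table:
    caption: str
    footers: List[str] = field(default_factory=list)
    cells: List[Cell] = field(default_factory=list)


@dataclass
class Paper:
    name: str
    abstract: str
    publisher: str
    authors: List[Author] = field(default_factory=list)
    tables: List[Table] = field(default_factory=list)


# Made-up example values, for illustration only.
header = Cell(cell_id="c0", content="Placebo", role="header")
value = Cell(cell_id="c1", content="55", role="data", related_cells=["c0"])
paper = Paper(
    name="Example trial report",
    abstract="",
    publisher="Example publisher",
    authors=[Author(name="A. Author")],
    tables=[Table(caption="Table 1. Demographics", cells=[header, value])],
)
```

Storing relationships as cell identifiers keeps the structure flat and easy to serialise; a graph or triple representation would be a natural alternative if the annotations are to be linked to the semantic web.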

Page 18: Supporting clinical trial data curation and integration with table mining

Summary
• Tables contain valuable information such as settings or results
• System for extraction and curation of table data
• Decomposition and annotation of the tables
• Accuracy of 85%
• Semantic analysis and information extraction