Cognitive and Social Challenges of Ontology Use in the Biomedical Domain. SLE 2012: 5th International Conference on Software Language Engineering, Dresden, Germany. Margaret-Anne Storey, The CHISEL Group, University of Victoria

SLE 2012 Keynote: Cognitive and Social Challenges of Ontology Use in the Biomedical Domain


DESCRIPTION

ABSTRACT: Ontologies can provide a conceptualization of a domain leading to a common vocabulary for communities of researchers and important standards to facilitate computation, software interoperability and data reuse. Most successful ontologies, especially those that have been developed by diverse communities over long periods of time, are typically large and complex. To address this complexity, ontology authoring and browsing tools must provide cognitive support to improve comprehension of the many concepts and relationships in ontologies. Also, ontology tools must support collaboration as the heart of ontology design and use is centered on community consensus. In this talk, I will describe how standardized ontologies are developed and used in the biomedical and clinical domains to aid in scientific and medical discoveries. Specifically, I will present how the US National Center for Biomedical Ontology has designed the BioPortal ontology library (and associated technologies) to promote the use of standardized ontologies and tools. I will review how BioPortal and other ontology tools use established and novel visualization and collaboration approaches to improve ontology authoring and data curation activities. I will also discuss an ambitious project by the World Health Organization that leverages the use of social media to broaden participation in the development of the next version of the International Classification of Diseases. To conclude, I will discuss the challenges and opportunities that arise from using ontologies to bridge communities that manage and curate important information resources.

Citation preview

1. Cognitive and Social Challenges of Ontology Use in the Biomedical Domain. SLE 2012: 5th International Conference on Software Language Engineering, Dresden, Germany. Margaret-Anne Storey, The CHISEL Group, University of Victoria

2. Studying and addressing human aspects in software engineering and knowledge engineering
3. Research methods used: mixed methods (analysis of archival data, interviews, grounded theory, surveys, etc.). Technologies explored: visualization techniques, collaboration support, social media.
4. Focus of this talk: providing cognitive support for ontology developers and users through collaborative visual user interfaces
5. [Outline diagram. Labels: Ontology Creation, Ontology Library, Ontology Applications; Background, BioPortal, Annotation, Examples, Services, Search, Tools, Mappings]
6. [Outline diagram, repeated]
7. The study of being
8. Co-opted by computer science to enable the explicit specification of: entities; properties and attributes of entities; relations between entities
9. One definition: "Explicit specification of a conceptualization" [Gruber, 1993]
10. Ontologies, Ontologies, Ontologies
11. Ontology languages: Choice of language and choice of reasoning engine. Tradeoff between expressiveness, reasoning power, tractability and human understanding. May need inference engine to give real-time feedback while authoring an ontology.
12. Why ontologies?
13. Awash in data.
14. How are ontologies used?
15. Challenges? Cognitive issues: complexity, scale; evolution; inclusion of upper ontologies, or parts of other ontologies. Social issues: one size does not fit all; multiple authors; input from broader set of stakeholders.
16. [Outline diagram, repeated, with added label: FMA, GO, ICD]
17. Foundational Model of Anatomy (FMA): Comprehensive ontology of human anatomy. Over 120K terms, 2.1M relationship instances (168 relationship types). One of the largest and best developed ontologies in biomedicine, multi-purpose. Slide by Mark Musen.
18. Slide by Mark Musen.
19. Gene Ontology (GO): To unify representation of gene and gene product attributes across all species. For annotating genes and gene products, assimilate and disseminate annotation data. Contains over 24,500 terms applicable to a wide variety of biological organisms. A standard tool in bioinformatics.
20. See http://www.nature.com/scitable/topicpage/ontologies-scientific-data-sharing-made-easy-77972
21. International Classification of Diseases (ICD): An enumeration of diseases that forms the basis for medical claims and reimbursements. A legacy terminology that has its roots in 19th century epidemiology. Created initially by biostatisticians with a pressing need to compare death statistics in different European countries. Slide by Mark Musen.
22. ICD is used for lots of (too many?) things! ICD is used to code all patient encounters with the health-care system for: billing and reimbursement; institutional planning; disease surveillance and public health; quality assurance; economic modeling. ICD was never intended to make the distinctions relevant to all these tasks! Nevertheless it is widely used! Slide by Mark Musen.
23. ICD: An excerpt
    724 Unspecified disorders of the back
      724.0 Spinal stenosis, other than cervical
        724.00 Spinal stenosis, unspecified region
        724.01 Spinal stenosis, thoracic region
        724.02 Spinal stenosis, lumbar region
        724.09 Spinal stenosis, other
      724.1 Pain in thoracic spine
      724.2 Lumbago
      724.3 Sciatica
      724.4 Thoracic or lumbosacral neuritis
      724.5 Backache, unspecified
      724.6 Disorders of sacrum
      724.7 Disorders of coccyx
        724.70 Unspecified disorder of coccyx
        724.71 Hypermobility of coccyx
        724.71 Coccygodynia
      724.8 Other symptoms referable to back
      724.9 Other unspecified back disorders
    Slide by Mark Musen.
24. ICD9 (1977): A handful of codes for traffic accidents. Slide by Mark Musen.
25. ICD10 (1999): 587 codes for such accidents
    V31.22 Occupant of three-wheeled motor vehicle injured in collision with pedal cycle, person on outside of vehicle, nontraffic accident, while working for income
    W65.40 Drowning and submersion while in bath-tub, street and highway, while engaged in sports activity
    X35.44 Victim of volcanic eruption, street and highway, while resting, sleeping, eating or engaging in other vital activities
    Slide by Mark Musen.
26. ICD revision process in the 20th century: international and national revision conferences; 1-5 person delegations in international conferences, multi-disciplinary; manual curation; output: paper copy; negotiation process: "decibel" method of discussion; ICD drafts translated into 27 languages. See http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2950305/
27. ICD-11 revision, key aspects: content model; Topic Advisory Groups, vertical and horizontal; classification experts (ontology development); iCAT: web-based collaborative authoring tool; use cases evaluating ICD-11 in use.
28. Deliverables: print versions "fit for purpose" in multiple languages; web portal to access, browse and maintain it; input from the crowd; classification in formalized language.
29. [Outline diagram, repeated, with added label: FMA-Explorer, Protégé, iCAT]
30. Foundational Model Explorer. University of Victoria.
31. Protégé ontology authoring environment: Ontology contents need to be processed and interpreted by computers. Interactive tools can assist developers in ontology authoring (e.g. Protégé).
32. Collaborative Protégé
33. iCAT: web-based authoring tool for ICD-11
34. [Outline diagram, repeated]
35. National Center for Biomedical Ontology. Goal: develop innovative technology and methods that allow scientists to record, manage, and disseminate biomedical information and knowledge in both human-readable and machine-processable form.
36. BioPortal Library
37. [Outline diagram, repeated]
38. BioPortal services: Ontology recommender; Ontology widgets; Annotator; API access through REST services; Virtual appliance (custom installs, can be proprietary)
39. Ontology Widgets
40. Ontology Widgets (2)
41. [Outline diagram, variant. Labels: Ontology Creation, Ontology Acquisition, Ontology Applications; Background, Library, Annotation, Examples, Services, Search, Tools, Mappings]
42. Visualizing multiple ontologies and mappings
43. Mappings between terms: Matrix
44. Mappings between ontologies: Graph
45. [Outline diagram, variant, repeated]
46. Data from STRIDE: 1.8 million pediatric and adult patients with clinical and demographic data (1994 - present); 19 million clinical encounters (1994 - present). Other counts shown, labels not recoverable: 35 million, 22 million, 2.9 million, 1.2 million, 7 million, 137 million, 10 million. Slide by Nigam Shah.
47. Making EMRs Unreasonably Effective. [Diagram: annotation workflow. Labels: Text clinical note; BioPortal knowledge graph; Creating clean lexicons (Diseases, Procedures, Drugs; Frequency; Syntactic types); Term recognition tool: NCBO Annotator; Negation detection; Terms Recognized (per patient P1 ... Pn, with ICD9 codes and terms such as T1, T5, "no T4"); Further Analysis; Cohort of Interest.] Terms form a temporal series of tags. Slide by Nigam Shah.
48. [Diagram: per-patient term lists (patients P1 ... Pn, terms T1 ... Tn, with ICD9 codes) turned into patient-by-term, term-by-term and patient-by-patient matrices. Questions shown: "Who is getting these drugs, conditions, etc.?" and "What is special about these patients?" Application labels: Comparative Effectiveness, Drug Safety, learning and predictions from data. A word cloud of clinical terms is included, e.g. cephalexin, amoxicillin, cilostazol, hydralazine, angioplasty, atherectomy, revascularization, doppler ultrasonography, congestive heart failure, pneumonia, obesity.] Slide by Nigam Shah.
49. Drug Safety: Detecting Risk Signals. ROR of 1.5, CI of [1.11, 2.13]. The X2 p-value
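
The drug-safety slide reports a reporting odds ratio (ROR) with a confidence interval but does not say how either was computed. As a minimal sketch, assuming the standard 2x2 contingency-table definition of the ROR and a 95% interval from the normal approximation on the log scale (z = 1.96), the calculation could look like the following. The function name and the cell counts are hypothetical and are not taken from the talk or from STRIDE.

```python
import math


def reporting_odds_ratio(a, b, c, d, z=1.96):
    """Return (ror, lower, upper) for a 2x2 table of patient counts.

    a: exposed to the drug, event observed
    b: exposed to the drug, event not observed
    c: not exposed, event observed
    d: not exposed, event not observed
    z: normal quantile for the interval (1.96 gives roughly 95%)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    ror = (a * d) / (b * c)
    # Standard error of ln(ROR) under the usual large-sample approximation.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(ror) - z * se)
    upper = math.exp(math.log(ror) + z * se)
    return ror, lower, upper


# Hypothetical counts, for illustration only.
ror, lower, upper = reporting_odds_ratio(20, 10, 10, 20)
# A lower bound above 1 is commonly read as a possible risk signal.
is_signal = lower > 1.0
print(f"ROR={ror:.2f}, CI=[{lower:.2f}, {upper:.2f}], signal={is_signal}")
```

Read this way, the interval quoted on the slide, [1.11, 2.13], excludes 1, which is the usual reason for flagging such a drug-event pair for closer review.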