BioPortal: ontologies and biomedical data resources at your fingertips… Clement Jonquet & BioPortal team [email protected] Medical Semantic Web Workshop (Atelier Web Sémantique Médical), Nîmes, France - 8 June 2010

BioPortal: ontologies and integrated data resources at the click of a mouse

DESCRIPTION

Invited presentation at the French Medical Semantic Web workshop 2010 in Nîmes. The presentation has since been given at several seminars.

2. About this presentation
Thank you for this opportunity
Contribution of the whole NCBO group (~20 people)
Outline
General presentation
What can be done with BioPortal (demo?)
Discussion
Reference article
N. F. Noy, N. H. Shah, P. L. Whetzel, B. Dai, M. Dorf, N. B. Griffith, C. Jonquet, D. L. Rubin, M. Storey, C. G. Chute, M. A. Musen. BioPortal: ontologies and integrated data resources at the click of a mouse. Nucleic Acids Research, 37:170-173, May 2009.
3. Biologists have adopted ontologies
To provide canonical representation of scientific knowledge
To annotate experimental data to enable interpretation, comparison, and discovery across databases
To facilitate knowledge-based applications for
Decision support
Natural language processing
Data integration
But ontologies are spread out, in different formats, of different sizes, with different structures
4. What is BioPortal?
Web repository for biomedical ontologies: a one-stop shop
Makes ontologies accessible and usable by abstracting away format, location, structure, etc.
Users can publish, download, browse, search, comment on, and align ontologies, and use them for annotations, both online and via a web services API.
Community-based ontology development, alignment, and evaluation
Figures:
200+ ontologies (OWL, OBO, UMLS)
~1.7 million terms
~2 million mappings
22 annotated biomedical resources
~10 billion annotations
5. What are we trying to do
You've built an ontology, how do you let the world know?
You need an ontology, where do you go to get it?
How do you know whether an ontology is any good?
How do you find resources that are relevant to the domain of the ontology (or to specific terms)?
How could you leverage your ontology to enable new science?
6. Community-based ontology repository
http://bioportal.bioontology.org
7. BioPortal features
Library of ontologies (support for browsing, visualizing, versioning, metrics, views)
Search ontologies, resources
Peer review: comments and discussion
Mapping
Annotate data
8. Library of biomedical ontologies
9. Ontology metadata
10. Ontology metrics
Statistics
Conformance to best practices
11. Ontology views
Specific subset
Other languages
12. Ontology search
Keywords & options
Ontologies to use
13. Ontology browsing
14. Ontology visualizing
15. Ontology notes
16. Ontology mappings
17. Mappings in BioPortal
Ontologies, vocabularies, and terminologies will inevitably overlap in coverage
Concept-to-concept mappings
e.g., nostril in NCI Thesaurus is similar to naris in Mouse Anatomy Ontology
Found by tools and uploaded in bulk
Created by users
Provenance
20. How are mappings useful?
Navigation mechanism, linking one ontology to another
Annotation & query expansion in search
Allows including synonyms defined in other ontologies
Use for finding important or reference ontologies
If everyone maps to NCI Thesaurus, it must be important
Accessible through web services & RDF to be used in other applications
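A minimal sketch of how mappings could drive query expansion, assuming mappings are stored as simple pairs of concept IDs. The concept IDs, the `LABELS` table, and the function names are hypothetical illustrations; only the nostril/naris pair comes from the slides, and this is not BioPortal's actual implementation.

```python
from collections import defaultdict

# Hypothetical toy data: concept IDs and label table are illustrative only.
LABELS = {
    "NCIT:nostril": ["nostril"],
    "MA:naris": ["naris"],
}
MAPPINGS = [("NCIT:nostril", "MA:naris")]


def build_mapping_index(mappings):
    """Index mappings symmetrically so either side can be looked up."""
    index = defaultdict(set)
    for source, target in mappings:
        index[source].add(target)
        index[target].add(source)
    return index


def expand_query(term, labels, mapping_index):
    """Return the query term plus labels of concepts mapped to any concept it names."""
    term = term.lower()
    expanded = {term}
    for concept, names in labels.items():
        if term in (n.lower() for n in names):
            for mapped in mapping_index.get(concept, ()):
                expanded.update(n.lower() for n in labels.get(mapped, []))
    return sorted(expanded)


index = build_mapping_index(MAPPINGS)
print(expand_query("nostril", LABELS, index))  # ['naris', 'nostril']
```

A search for "nostril" would then also match text that only says "naris", which is the synonym effect described on the slide.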
21. Ontology-based annotation workflow
First, direct annotations are created by recognizing concepts in raw text,
Second, annotations are semantically expanded using knowledge of the ontologies,
Third, all annotations are scored according to the context in which they have been created.
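The three steps above can be sketched as follows. This is a toy illustration under stated assumptions, not the NCBO implementation: the dictionary, the is-a hierarchy, the context weights, and the expansion discount are all hypothetical values chosen for the example.

```python
import re

# Hypothetical toy ontology: term -> concept ID, and concept -> parent (is-a).
DICTIONARY = {"melanoma": "C:melanoma", "skin": "C:skin"}
PARENTS = {"C:melanoma": "C:skin_neoplasm", "C:skin_neoplasm": "C:neoplasm"}

# Assumed weights: a title match counts more than a body match, and an
# expanded annotation is discounted at each step up the hierarchy.
CONTEXT_WEIGHT = {"title": 10, "body": 5}
EXPANSION_DISCOUNT = 2


def direct_annotations(text, context):
    """Step 1: recognize dictionary terms in raw text."""
    found = []
    for token in re.findall(r"[a-z]+", text.lower()):
        if token in DICTIONARY:
            found.append({"concept": DICTIONARY[token], "context": context, "distance": 0})
    return found


def expand(annotations):
    """Step 2: add ancestor concepts by walking is-a links."""
    expanded = list(annotations)
    for ann in annotations:
        concept, distance = ann["concept"], 0
        while concept in PARENTS:
            concept, distance = PARENTS[concept], distance + 1
            expanded.append({"concept": concept, "context": ann["context"], "distance": distance})
    return expanded


def score(annotations):
    """Step 3: score each annotation from its context and expansion distance."""
    scores = {}
    for ann in annotations:
        value = max(CONTEXT_WEIGHT[ann["context"]] - EXPANSION_DISCOUNT * ann["distance"], 1)
        scores[ann["concept"]] = scores.get(ann["concept"], 0) + value
    return scores


def annotate(element):
    """Run the three steps over a dict of {context: text}."""
    direct = []
    for context, text in element.items():
        direct.extend(direct_annotations(text, context))
    return score(expand(direct))


result = annotate({"title": "Melanoma study", "body": "Skin samples from melanoma patients"})
print(result)
```

In this toy run the directly matched concept scores highest, and its ancestors receive lower scores the further they are from the matched term.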
22. Annotation challenge
Explosion of biomedical data: diverse, distributed, unstructured, and not linked to ontologies
Hard for biomedical researchers to find the data they need
Data integration problem
Translational discoveries are prevented
Good examples
GO annotations
PubMed (biomedical literature) indexed with MeSH headings
Annotate data with ontology concepts
Horizontal approach
[Figure labels: RESOURCES, ONTOLOGIES]
28. NCBO Annotator in BioPortal
29. NCBO Annotator service: multiple ways to access
Code
Word & Firefox add-ins to call the Annotator service?
Excel
UIMA platform
Specific UI
30. NCBO Biomedical Resources index
We have used the workflow to index several important biomedical resources with ontology concepts (22+)
The index can be used to enhance search & data integration
[DILS 08]
[BMC BioInfo 09]
[IC 10]
32. Ex: annotation of a GEO element
33. Ontology-based search (1/2)
Example of resource available (name and description)
Number of annotations in the NCBO Resource Index
Ontology concept/term browsed
Title and URL link to the original element
Context in which an element has been annotated
ID of an element
34. Ontology-based search (2/2)
Ontology concept(s) to use for search
Keyword to search
Biomedical resources to query
Resource elements found
35. Good use of the semantics (1/2)
Simple keyword-based search misses results
36. Good use of the semantics (2/2)
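To illustrate why keyword search can miss results that an ontology-based search finds, here is a small sketch. The hierarchy, concept IDs, and resource elements are hypothetical toy data, and the functions are illustrative rather than the Resource Index implementation.

```python
# Hypothetical toy data: concept IDs, hierarchy and resource elements are illustrative.
CHILDREN = {
    "C:neoplasm": ["C:skin_neoplasm"],
    "C:skin_neoplasm": ["C:melanoma"],
}
# Element ID -> (text, set of annotated concepts)
ELEMENTS = {
    "E1": ("Gene expression in melanoma cell lines", {"C:melanoma"}),
    "E2": ("Neoplasm growth under hypoxia", {"C:neoplasm"}),
    "E3": ("Yeast cell cycle time course", set()),
}


def keyword_search(keyword):
    """Plain substring match on the element text."""
    return sorted(eid for eid, (text, _) in ELEMENTS.items() if keyword.lower() in text.lower())


def descendants(concept):
    """Collect a concept and everything below it in the hierarchy."""
    seen, stack = set(), [concept]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(CHILDREN.get(current, []))
    return seen


def concept_search(concept):
    """Match elements annotated with the concept or any of its descendants."""
    wanted = descendants(concept)
    return sorted(eid for eid, (_, concepts) in ELEMENTS.items() if concepts & wanted)


print(keyword_search("neoplasm"))    # ['E2']
print(concept_search("C:neoplasm"))  # ['E1', 'E2']
```

The keyword query misses E1 because its text never contains the word "neoplasm", while the concept query retrieves it through the is-a hierarchy.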
37. Ontology recommendation
38. The BioPortal technology
All BioPortal data is accessible through REST services
BioPortal user interface accesses the repository through REST services as well
For example:
http://bioportal.bioontology.org/visualize/40401/?conceptid=D008545
http://rest.bioontology.org/bioportal/concepts/40401/?conceptid=D008545
The BioPortal technology is domain-independent
BioPortal code is open-source
Technology stack includes: Protégé, LexGrid, MySQL, Hibernate, Spring, J2EE, Ruby-on-Rails
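The two example links on this slide share the same shape, which the sketch below reproduces. It assumes that `40401` identifies an ontology (version) and that `conceptid` identifies the concept; that reading, and the function names, are assumptions made for illustration. The code only builds and parses strings and makes no network call; these 2010 URLs may no longer resolve.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Base URLs copied from the two example links on the slide.
UI_BASE = "http://bioportal.bioontology.org/visualize"
REST_BASE = "http://rest.bioontology.org/bioportal/concepts"


def concept_urls(ontology_version_id, concept_id):
    """Build the user-interface and REST URLs for one concept (assumed URL pattern)."""
    query = urlencode({"conceptid": concept_id})
    return {
        "ui": f"{UI_BASE}/{ontology_version_id}/?{query}",
        "rest": f"{REST_BASE}/{ontology_version_id}/?{query}",
    }


def parse_concept_url(url):
    """Recover the (ontology version ID, concept ID) pair from either URL form."""
    parsed = urlparse(url)
    version_id = parsed.path.rstrip("/").split("/")[-1]
    concept_id = parse_qs(parsed.query)["conceptid"][0]
    return version_id, concept_id


urls = concept_urls(40401, "D008545")
print(urls["ui"])
print(urls["rest"])
print(parse_concept_url(urls["rest"]))
```

Because the user interface and the REST service take the same two identifiers, a page in the browser can be translated into the corresponding service call by swapping the base URL.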
39. Other installations of BioPortal
40. BioPortal's future
Better support of Semantic Web standards
Done: provide a URI for every concept in the ontology
TBD: ontologies & annotations available through a SPARQL endpoint
Development of a biomedical mega-thesaurus based on ontology mappings
Merge ontology editing & publishing
Scalability
Distributed architecture
Enhance views/modularization, e.g., different languages
41. Conclusion
BioPortal is allowing NCBO to experiment with new models for
Dissemination of knowledge on the Web
Integration and alignment of online content
Knowledge visualization and cognitive support
Peer review of online content
Exciting context of research & application for both CS and Biomedical informatics
BioPortal is a good illustration of a biomedical Semantic Web application
Please try it and join us!
42. Collaborators & acknowledgments
@ NCBO, Stanford University
Natasha Noy, Mark Musen, Nigam Shah, Patricia Whetzel, Adrien Coulet, Paea Le Pendu, Michael Dorf, Cherie Youn, Paul Alexander, Sean Falconer
@ NCBO, elsewhere
Peggy Storey, Chris Callendar, Christopher Chute, Pradip Kanjamala, Jyoti Pathak, Jim Buntrock
and many others
47. Thank you
National Center for BioMedical Ontology: http://www.bioontology.org
BioPortal, biomedical ontology repository: http://bioportal.bioontology.org
Contact: [email protected]
48. Develop a mega-thesaurus
Group mapped concepts from different ontologies to create a single concept
Similar to the approach taken by NLM with the UMLS Metathesaurus
Manual vs. automatic
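One possible way to group mapped concepts automatically is to treat mappings as edges and merge connected concepts with a union-find structure. This is only a sketch of that idea, not the approach NCBO or NLM actually uses; the concept IDs other than the nostril/naris pair are hypothetical.

```python
def group_mapped_concepts(concepts, mappings):
    """Merge concepts connected by mappings into groups using union-find."""
    parent = {c: c for c in concepts}

    def find(c):
        # Follow parent links to the group representative, halving the path as we go.
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    for a, b in mappings:
        # Concepts that appear only in mappings are registered on the fly.
        parent.setdefault(a, a)
        parent.setdefault(b, b)
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for c in parent:
        groups.setdefault(find(c), set()).add(c)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical concept IDs; "FMA:naris" and "NCIT:lung" are invented for the example.
concepts = ["NCIT:nostril", "MA:naris", "FMA:naris", "NCIT:lung"]
mappings = [("NCIT:nostril", "MA:naris"), ("MA:naris", "FMA:naris")]
groups = group_mapped_concepts(concepts, mappings)
print(groups)
```

Note that merging is transitive here, so a chain of approximate mappings can pull loosely related concepts into one group; that is one reason the manual vs. automatic trade-off matters.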
49. Integration of ontology editing and publishing
Enable users to go seamlessly between ontology editing and publishing
Notes created in BioPortal are visible in an ontology editor
User accounts and roles shared among BioPortal and ontology editors
Users don't need to be aware of the difference: they just get their work done
50. Annotation & semantic web
Part of the vision for the semantic web
Web content must be semantically described using ontologies
Semantic annotations help to structure the web
Annotation is not an easy task
Automatic vs. manual
Lack of annotation tools (convenient, simple to use, and easily integrated into automatic processes)
Today's web content (& public data available through the web) is mainly composed of unstructured text
57. Annotation is not a common practice
High number of ontologies
Getting access to all is hard: formats, locations, APIs
Lack of tools that easily access all ontologies (domain)
Users do not always know the structure of an ontology's content or how to use it in order to do the annotations themselves
Lack of tools to do the annotations automatically
Boring additional task without immediate reward for the user
63. The challenge
Automatically process a piece of raw text to annotate it with relevant ontologies
Large scale: to scale up for many resources and ontologies
Automatic: to keep precision and accuracy
Easy to use and to access: to prevent the biomedical community from getting lost
Customizable: to fit very specific needs
Smart: to leverage the knowledge contained in ontologies