Environmental Health Science: Cross Domain Ontology Research (EHS-CORE) Project

Collaborative Expedition Workshop #38, February 22, 2005, National Science Foundation

Jane Greenberg, Associate Professor, School of Information and Library Science, University of North Carolina at Chapel Hill (SILS/UNC-CH)

Abe Crystal, Research Assistant and Doctoral Student, SILS/UNC

W. Davenport Robertson, Library Director, National Institute of Environmental Health Sciences



Page 2

Obesity and the Built Environment: An Interdisciplinary Challenge

- Obesity in America has become an “epidemic” (Health and Human Services Secretary Tommy Thompson).
- It accounts for more than 300,000 premature deaths each year and direct health care costs in excess of $61 billion.
- The burden is significantly greater in the lower socioeconomic strata and in minority and vulnerable populations.
- Promising solution: integrate physical activity into daily life by improving the built environment, the physical surroundings in which one lives and works.
- Interdisciplinary nature of obesity and the built environment

Page 3

Problem: “Information Silos”

- Researchers are unaware of useful data and literature sources in related disciplines, beyond their immediate scope, because they are confronted with information silos.
  - Scenario 1: we know it’s there, but “it’s roll the dice whether or not we find it”
  - Scenario 2: we don’t know it’s there (student PubMed search misses many relevant databases)
- Researchers aware of resources in other domains must locate all relevant and independent data sources, interact with each data source in isolation, and manually combine results.

Page 4

Problem impact

Researchers face:
- A labor-intensive and inefficient interdisciplinary research experience (it is hard to find and integrate data and literature from outside one's own domain)
- Difficulty in locating “undiscovered public knowledge” (Swanson, 1986): research from disparate disciplines that, when combined, can solve an open problem
- Duplicative research resulting from the absence of knowledge about research in related and pertinent disciplines

Page 5

Solution: information integration

Research goals of proposed project:
- Integrate existing domain-specific ontologies to provide uniform intellectual access to interdisciplinary data and literature on obesity and the built environment.
- Use Semantic Web metadata and technologies to provide powerful querying and inferencing capabilities on the integrated ontology.
- Develop an ontology server capable of dynamically incorporating changes (i.e., “just-in-time” integration) in domain-specific ontologies (e.g., new or revised vocabularies) into the integrated ontology.
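The integration and inferencing goals above can be illustrated with a minimal sketch, assuming each domain ontology can be reduced to (narrower, broader) subclass pairs. The `nut:` and `plan:` terms, the cross-domain mapping, and the helper names are hypothetical and are not taken from the project's vocabularies. The inference shown is only a transitive walk over subclass links, which is far simpler than OWL reasoning.

```python
# Hypothetical toy vocabularies: each pair is (narrower term, broader term).
NUTRITION = {("nut:PhysicalActivity", "nut:HealthBehavior")}
PLANNING = {("plan:Walking", "plan:ActiveTransport")}
# Hypothetical cross-domain mapping that links the two vocabularies.
MAPPINGS = {("plan:ActiveTransport", "nut:PhysicalActivity")}


def integrate(*edge_sets):
    """Union the subclass pairs of several vocabularies into one parent index."""
    parents = {}
    for edges in edge_sets:
        for child, parent in edges:
            parents.setdefault(child, set()).add(parent)
    return parents


def ancestors(parents, term):
    """Return every broader term reachable from `term` (transitive closure)."""
    seen, stack = set(), [term]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:  # the seen set also guards against cycles
                seen.add(parent)
                stack.append(parent)
    return seen


def narrower_terms(parents, term):
    """Return every term, from any domain, that falls under `term`."""
    return {t for t in parents if term in ancestors(parents, t)}


merged = integrate(NUTRITION, PLANNING, MAPPINGS)
```

With the mapping in place, `narrower_terms(merged, "nut:HealthBehavior")` returns terms from both vocabularies, which is the kind of uniform access the goals describe.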

Page 6

Proposed Research Team

- Domain science (nutrition and public health): UNC School of Public Health, Active Living by Design
- Ontology engineering and systems development (computer science): MINDSWAP/UMD
- Ontology and Web semantics development and evaluation (information science): Metadata Research Center/SILS/UNC-CH

Page 7

Information Integration: Ontological Solutions

Functional criteria
- Integrate ontologies from different domains/disciplines, using standard languages such as OWL
- Provide access to disparate and distributed data and literature
- Update vocabulary dynamically (on the fly, or at frequent intervals) based on changes in host ontologies
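One way the dynamic-update criterion could work is sketched below, assuming the server polls each host ontology and re-integrates only when its content has changed. The class name, the fingerprinting scheme, and the representation of an ontology as a set of (narrower, broader) pairs are all assumptions made for illustration, not the project's design.

```python
import hashlib
import json


def fingerprint(ontology):
    """Hash an ontology's (narrower, broader) pairs in a stable order."""
    payload = json.dumps(sorted(ontology))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class IntegratedVocabulary:
    """Hypothetical cache that re-merges sources only when one has changed."""

    def __init__(self):
        self._sources = {}       # source name -> set of pairs
        self._fingerprints = {}  # source name -> last seen hash
        self.edges = set()       # union of all source pairs
        self.rebuilds = 0        # how many times the union was recomputed

    def refresh(self, name, ontology):
        """Take a fresh copy of a host ontology; return True if it changed."""
        digest = fingerprint(ontology)
        if self._fingerprints.get(name) == digest:
            return False  # unchanged, so the integrated vocabulary stays as-is
        self._sources[name] = set(ontology)
        self._fingerprints[name] = digest
        self.edges = set().union(*self._sources.values())
        self.rebuilds += 1
        return True
```

Calling `refresh` at frequent intervals approximates the "on the fly" behavior: an unchanged source costs one hash comparison, and a revised vocabulary triggers a single rebuild.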

Page 8

Information Integration: Ontological Solutions (2)

Technical criteria
- The components must be openly accessible, preferably open source, and listed in a standard registry.
- They must use open enabling technologies and standards, such as:
  - Uniform Resource Identifiers (URIs)
  - Resource Description Framework (RDF), RDFS, and OWL (Web Ontology Language)
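To show how these standards fit together, here is a small sketch that names two classes with URIs and states facts about them as RDF triples, serialized in N-Triples syntax. The `rdf:type`, `rdfs:subClassOf`, and `owl:Class` URIs are the standard W3C ones; the `http://example.org/ehs-core#` namespace and the class names are hypothetical placeholders.

```python
# Standard W3C vocabulary URIs.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
OWL_CLASS = "http://www.w3.org/2002/07/owl#Class"

# Hypothetical namespace for illustration only.
EX = "http://example.org/ehs-core#"

# Each statement is a (subject, predicate, object) triple of URIs.
triples = [
    (EX + "BuiltEnvironment", RDF_TYPE, OWL_CLASS),
    (EX + "Sidewalk", RDF_TYPE, OWL_CLASS),
    (EX + "Sidewalk", RDFS_SUBCLASS_OF, EX + "BuiltEnvironment"),
]


def to_ntriples(statements):
    """Serialize URI-only triples as N-Triples, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in statements)


ntriples = to_ntriples(triples)
```

Because every term is a URI, statements published by different groups can refer to the same class without prior coordination, which is what makes these standards suitable for cross-domain integration.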

Page 9

Implementation

- Domain research
  - Multi-method approach (interviews, log analysis…)
- Ontology mapping
  - Standardization, pruning, mapping, testing, reviewing, etc.
- Ontology server
  - Define functional requirements, system architecture, prototyping, evaluation
- Document cataloging
  - Document sampling, cataloging (Dublin Core), metadata evaluation
- Unified interface
  - Define functional requirements, prototyping, connect to ontology server, usability testing
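The document cataloging step can be sketched as follows, assuming records use the fifteen elements of the Dublin Core Metadata Element Set. The sample record is built from an entry in this deck's reference list; the `evaluate_record` helper and its default list of required elements are hypothetical, since the slides do not say which elements the project would require.

```python
# The fifteen elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = frozenset({
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
})


def evaluate_record(record, required=("title", "creator", "date", "subject")):
    """Report keys that are not Dublin Core elements and required ones left empty.

    The default `required` tuple is an assumption made for this sketch.
    """
    unknown = sorted(set(record) - DC_ELEMENTS)
    missing = sorted(e for e in required if not record.get(e))
    return {"unknown": unknown, "missing": missing}


# Sample record taken from the reference list; no subject terms assigned yet.
record = {
    "title": "Undiscovered Public Knowledge",
    "creator": "Swanson, D. R.",
    "date": "1986",
    "source": "Library Quarterly, 56: 103-118",
}

report = evaluate_record(record)
```

Here the report flags `subject` as missing, the element where terms from an integrated ontology would be assigned during cataloging.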

Page 10

Three Key Impacts

- Addresses a major social problem, epidemic obesity
- Validates an approach to dynamic ontological integration, which may be applicable to many domains
- Facilitates cross-domain research, leading to increased scientific productivity and discovery

Page 11

Project Status

- Beginning preliminary fieldwork
- Pending proposals: NSF (system design and ontological integration), IMLS (user access to resource collection at ALbD)
- Environmental Health Science Thesaurus Forum (buy-in by many)

Page 12

Selected References

Greenberg, J. (2004a). Metadata Extraction and Harvesting: A Comparison of Two Automatic Metadata Generation Applications. Journal of Internet Cataloging, 6(4): 59-82.

Gruber, T. R. (1993). A Translation Approach to Portable Ontology Specifications. Knowledge Acquisition, 5: 199-220.

Gruber, T. R. (1994). Toward Principles for the Design of Ontologies Used for Knowledge Sharing. International Journal of Human-Computer Studies, 43(5/6): 907-928.

Guarino, N. (1998). Formal Ontology and Information Systems. In: N. Guarino, editor, Proceedings of the 1st International Conference on Formal Ontologies in Information Systems, FOIS '98, Trento, Italy, June 1998. IOS Press, pp. 2-15.

Kalyanpur, A., Sirin, E., Parsia, B., and Hendler, J. (2004). Hypermedia Inspired Ontology Engineering Environment: Swoop. Submitted to ISWC 2004 as a poster. [Online]. Available: http://www.mindswap.org/papers/SWOOP-Poster.pdf

Lauser, B., Wildemann, T., Poulos, A., Fisseha, F., Keizer, J., and Katz, S. (2002). A Comprehensive Framework for Building Multilingual Domain Ontologies: Creating a Prototype Biosecurity Ontology. In Proceedings of the International Conference on Dublin Core and Metadata for e-Communities, Florence, Italy, October 13-17, 2002. Firenze: Firenze University Press, pp. 113-123. [Online]. Available: http://www.bncf.net/dc2002/program/ft/paper13.pdf

Robertson, W. D., and Greenberg, J. (2004). Architecting a Cross-Disciplinary Thesaurus for the Semantic Web. DC-2004: Metadata across Languages and Cultures. Proceedings of the International Conference on Dublin Core and Metadata Applications, October 11-14, 2004, Shanghai, China. Shanghai: Shanghai Scientific & Technological Literature Publishing House, pp. 231-235.

Sowa, J. F. (2002). Ontology, Metadata, and Semiotics. International Conference on Conceptual Structures, ICCS '2000, August 14-18, Darmstadt, Germany.

Swanson, D. R. (1986). Undiscovered Public Knowledge. Library Quarterly, 56: 103-118.