UMBC an Honors University in Maryland 1 Information Integration and the Semantic Web Finding knowledge, data and answers Tim Finin 1, Anupam Joshi 1, Li

Slide 1

UMBC: an Honors University in Maryland
Information Integration and the Semantic Web: Finding knowledge, data and answers
Tim Finin (1), Anupam Joshi (1), Li Ding (2)
(1) University of Maryland, Baltimore County
(2) Stanford University, Knowledge Systems Lab
Joint work with Yun Peng, Cynthia Parr, Andriy Parafinyk, Lushan Han, Pranam Kolari, Pavan Reddivari, Rong Pan, Akshay Java, Joel Sachs and others.
http://creativecommons.org/licenses/by-nc-sa/2.0/
This work was partially supported by DARPA contract F30602-97-1-0215, NSF grants CCR007080 and IIS9875433, and grants from IBM, Fujitsu and HP.
http://ebiquity.umbc.edu/resource/html/id/327/

Slide 2

Google has made us smarter

Slide 3

But what about our agents?
Agents still have a very minimal understanding of text and images.
[Figure labels: tell, register]

Slide 4

But what about our agents?
A Google for knowledge on the Semantic Web is needed by software agents and programs.
[Figure labels: Swoogle, tell, register]

Slide 5

Information Integration and the Semantic Web
The Semantic Web enables information integration with standards supporting shared semantic models, ontology mapping, common tools, etc.
A Google-like global index can help people and programs to:
  • Find Semantic Web ontologies and data
  • Understand how these are being used
  • Build trust and provenance models
  • Assemble ontology maps
  • Create new integration tools

Slide 6

http://swoogle.umbc.edu/
Running since summer 2004
1.8M RDF docs, 320M triples, 10K ontologies, 15K namespaces, 1.3M classes, 175K properties, 43M instances, 600 registered users

Slide 7

Applications and use cases
  • 1. Supporting Semantic Web developers
      • Ontology designers, vocabulary discovery, who uses what ontologies and data, use analysis, errors, statistics, etc.
  • 2. Helping scientists publish and find data
      • Spire: aggregating observations and data from biologists
      • InferenceWeb: searching over and enhancing proofs
      • SemNews: Text Meaning of news stories
  • 3. Supporting SW tools
      • Triple shop: finding data for SPARQL queries

Slide 8

[Use case 1]

Slide 9

80 ontologies were found that had these three terms.
By default, ontologies are ordered by their popularity, but they can also be ordered by recency or size.
Let's look at this one.

Slide 10

All of this is available in RDF form for the agents among us.

Slide 11

Here's what the agent sees. Note the swoogle and wob (web of belief) ontologies.
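The ordering options mentioned on slide 9 (popularity by default, or recency or size) can be illustrated with a small sketch. This is not Swoogle's ranking code: the `OntologyHit` record, its field names, the example.org URLs and all scores below are hypothetical, chosen only to show a result list being re-ordered on three different keys.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Dict, List


@dataclass(frozen=True)
class OntologyHit:
    """One search result. Every field here is a hypothetical stand-in."""
    url: str
    popularity: float      # assumed score; higher means more popular
    last_modified: date    # assumed timestamp used for "recency"
    triple_count: int      # assumed measure of "size"


# Map each ordering named on the slide to the field it sorts on.
SORT_KEYS: Dict[str, Callable[[OntologyHit], object]] = {
    "popularity": lambda hit: hit.popularity,
    "recency": lambda hit: hit.last_modified,
    "size": lambda hit: hit.triple_count,
}


def order_hits(hits: List[OntologyHit], by: str = "popularity") -> List[OntologyHit]:
    """Return hits sorted in descending order on the chosen key."""
    if by not in SORT_KEYS:
        raise ValueError(f"unknown ordering: {by!r}")
    return sorted(hits, key=SORT_KEYS[by], reverse=True)


# Made-up results for illustration only.
hits = [
    OntologyHit("http://example.org/onto/a", 0.9, date(2004, 8, 1), 1200),
    OntologyHit("http://example.org/onto/b", 0.4, date(2006, 1, 15), 300),
    OntologyHit("http://example.org/onto/c", 0.6, date(2005, 5, 30), 5000),
]

top_by_popularity = order_hits(hits)[0].url
top_by_recency = order_hits(hits, by="recency")[0].url
top_by_size = order_hits(hits, by="size")[0].url
```

With this toy data each ordering puts a different ontology first, which is the point of offering more than one sort key.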
Slide 12

[Use case 2]
An NSF ITR collaborative project with:
  • University of Maryland, Baltimore County
  • University of Maryland, College Park
  • University of California, Davis
  • Rocky Mountain Biological Laboratory

Slide 13

Invasive Species
  • Invasive species cost the U.S. economy over $138 billion per year.
  • By various estimates, these species contribute to the decline of 35% to 46% of U.S. endangered and threatened species.
  • The invasive species problem is growing, as the number of pathways of invasion increases.
Sources:
  • Pimentel et al. 2000. Environmental and economic costs associated with non-indigenous species in the United States. BioScience 50:53-65.
  • Charles Groat, Director, U.S. Geological Survey, http://www.usgs.gov/invasive_species/plw/usgsdirector01.html

Slide 14

East River Valley Trophic Web
http://www.foodwebs.org/

Slide 15

  • Biologists
  • Gathering data
  • Increase utility
  • Maximize productivity
  • Foster discovery
  • Broaden participation

Slide 16

Representing and sharing data
  • Journal articles
  • Flat files
  • Spreadsheets
  • Local databases
  • On the Web in HTML or XML

Slide 17

ELVIS: Ecosystem Localization, Visualization, and Integration System
[Figure labels: Bacteria, Microprotozoa, Amphithoe longimana, Caprella penantis, Cymadusa compta, Lembos rectangularis, Batea catharinensis, Ostracoda, Melanitta, Tadorna tadorna, Oreochromis niloticus (Nile tilapia), ? ?...]
  • Species list constructor
  • Food web constructor

Slide 18

The ELVIS Food Web Constructor predicts basic network structure.
Prelude to systems models.

Slide 19

Examine evidence for predicted links.
The Evidence Provider lets users explore evidence (data, papers, reasoning) for food web links.

Slide 20

Data from ~300 food webs

Slide 21

Supporting ontologies and their use
  • SpireEcoConcepts
      • for confirmed and potential food web links
      • bibliographic information of food web studies
      • ecosystem terms
      • taxonomic ranks
  • California Wildlife Habitat Relationships Ontology
      • life history
      • geographic range
      • management information
  • ETHAN (Evolutionary Trees and Natural History)
      • Natural history information on species derived from data in the Animal Diversity Web and other taxonomic sources

Slide 22

[Use case 3]
UMBC Triple Shop
http://sparql.cs.umbc.edu/
Online SPARQL RDF query processing with several interesting features:
  • Automatically finds data for queries using Swoogle
  • Datasets, queries and results can be saved, tagged, annotated, shared, searched for, etc.
  • RDF datasets as first-class objects
      • Can be stored on our server or downloaded
      • Can be materialized in a database or (soon) as a Jena model
[Figure labels: RDF, OWL, RDF query language]

Slide 23

Triple Shop
What are body masses of fishes that eat fishes?
... leaving out the FROM clause

Slide 24

Specify dataset

Slide 25

11 RDF documents were found that might have useful data.

Slide 26

We'll select them all and add them to the current dataset.

Slide 27

We'll run the query against this dataset to see if the results are as expected.
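The workflow on slides 23 to 27 (pose a query without naming its data sources, assemble a dataset from several RDF documents, then run the query) can be sketched in a few lines. This is a toy illustration, not the Triple Shop implementation and not a SPARQL engine: triples are plain string tuples, variables are strings starting with `?`, and the `ex:` vocabulary, the two "documents" and all facts in them are hypothetical.

```python
from typing import Dict, Iterable, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]
Binding = Dict[str, str]


def match_pattern(pattern: Triple, triple: Triple, binding: Binding) -> Optional[Binding]:
    """Extend binding so that pattern equals triple, or return None on conflict."""
    extended = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):              # variable: bind it or check consistency
            if extended.get(term, value) != value:
                return None
            extended[term] = value
        elif term != value:                   # constant: must match exactly
            return None
    return extended


def query(patterns: Iterable[Triple], dataset: Set[Triple]) -> List[Binding]:
    """Join the triple patterns one at a time against the whole dataset."""
    bindings: List[Binding] = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for binding in bindings
            for triple in dataset
            if (extended := match_pattern(pattern, triple, binding)) is not None
        ]
    return bindings


# Two hypothetical RDF documents; all facts are invented for illustration.
doc_diet: Set[Triple] = {
    ("ex:pike", "rdf:type", "ex:Fish"),
    ("ex:perch", "rdf:type", "ex:Fish"),
    ("ex:pike", "ex:eats", "ex:perch"),
    ("ex:heron", "ex:eats", "ex:perch"),
}
doc_mass: Set[Triple] = {
    ("ex:pike", "ex:bodyMass", "3.5"),
    ("ex:perch", "ex:bodyMass", "0.4"),
    ("ex:heron", "ex:bodyMass", "1.5"),
}

# "Select them all and add them to the current dataset": a simple set union.
dataset = doc_diet | doc_mass

# "What are body masses of fishes that eat fishes?" with no FROM clause.
patterns = [
    ("?predator", "rdf:type", "ex:Fish"),
    ("?prey", "rdf:type", "ex:Fish"),
    ("?predator", "ex:eats", "?prey"),
    ("?predator", "ex:bodyMass", "?mass"),
]

answers = sorted({(b["?predator"], b["?mass"]) for b in query(patterns, dataset)})
```

Neither toy document can answer the question alone: the diet facts and the body masses live in different sources, which is why the dataset has to be assembled before the query is run.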
Slide 28

The results can be produced in any of several formats.

Slide 29

Results
http://sparql.cs.umbc.edu/tripleshop2/

Slide 30

Looks like a useful dataset! Let's annotate, tag and save it, and also materialize it in the TS triple store.
Queries can also be annotated, tagged and shared.

Slide 31

Themes revisited
  • The Web contains the world's knowledge in forms accessible to people and computers.
  • The Semantic Web enables information integration with standards supporting shared semantic models, ontology mapping, common tools, etc.
  • We need better ways to discover, index, search and reason over knowledge on the Semantic Web.
  • Swoogle-like systems help create consensus ontologies, foster best practices, find data and support tools.

Slide 32

For more information: http://ebiquity.umbc.edu/
Annotated in OWL