Free Open-Source, Open-Platform System for Information Mash- Up and Exploration in Earth Science Tawan Banchuen, Will Smart, Brandon Whitehead, Mark Gahegan, Sina Masoud-Ansari Center for eResearch & School of Environment The University of Auckland


Free Open-Source, Open-Platform System for

Information Mash-Up and Exploration in Earth Science

Tawan Banchuen, Will Smart, Brandon Whitehead, Mark Gahegan, Sina Masoud-Ansari

Center for eResearch & School of Environment
The University of Auckland

Overview

1. Introduction and background to project
2. Application Development
– Software system for integrating, browsing and understanding large information bases
3. Demonstration / sample results
4. Conclusion

Components of knowledge computing

Rich descriptions of resource meaning

Recommender systems

Finding analogous situations

Knowledge evaluation

Ontology alignment tools

Filters and query tools for locating resources

Knowledge visualization tools (e.g. ConceptVista, CMap, ThinkBase)

Workflow description

Metadata scraping

Ontology capture

Use-case capture

Tag clouds

Ontologies, controlled vocabularies, taxonomies

Metadata

Knowledge bases

RDF/OWL/KIF

What is an Ontology?

• An ontology describes what we know or what is true, via a kind of logic

• An ontology can be as simple as a concept map showing terms used to describe a topic and the relationships between those terms

Topic

Terms
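
The concept-map view of an ontology can be sketched in a few lines of Python. This is only an illustration: the terms and relationship names below are hypothetical and are not taken from the project's ontologies. Each entry is a (subject, relationship, object) triple, which is also the basic shape of an RDF statement.

```python
# A tiny concept map held as (subject, relationship, object) triples.
# All terms below are hypothetical examples, not taken from the project.
TRIPLES = [
    ("Basalt", "is_a", "IgneousRock"),
    ("Granite", "is_a", "IgneousRock"),
    ("IgneousRock", "is_a", "Rock"),
    ("Basalt", "forms_from", "Lava"),
]


def related_terms(term, triples):
    """Return (relationship, other_term) pairs for triples that touch `term`."""
    out = []
    for subject, relation, obj in triples:
        if subject == term:
            out.append((relation, obj))
        elif obj == term:
            out.append(("inverse of " + relation, subject))
    return out


def ancestors(term, triples):
    """Follow is_a links upward to collect every broader term."""
    found = []
    frontier = [term]
    while frontier:
        current = frontier.pop()
        for subject, relation, obj in triples:
            if subject == current and relation == "is_a" and obj not in found:
                found.append(obj)
                frontier.append(obj)
    return found


if __name__ == "__main__":
    print(related_terms("IgneousRock", TRIPLES))
    print(ancestors("Basalt", TRIPLES))
```

Following the `is_a` links is the "kind of logic" part: the broader terms for Basalt are never stated directly, they are derived from the relationships.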

The problem

• Knowledge leaks from organizations
– Some gets forgotten
– Some leaves with its container
– Some gets buried or lost in the infrastructure

• We are very poorly equipped to care for knowledge in computational infrastructure
– Can we ‘surface’ more of the knowledge implicitly held in unstructured documents?
– If so, can we put it to use effectively?

Complete conceptual neighborhood of a document

ConceptVista, Gahegan et al.

Methods

Lab Books

Preprints

Data

Video

Blogs

Podcasts

Codes

Algorithms

Models

Presentations

Ontologies

Intermediate Results

Related Articles

Comments & Reviews

Plans

Reproducible, transparent science

Composite research components

Carole Goble, UK eScience

Methods

Lab Books

Preprints

Data

Video

Blogs

Podcasts

Codes

Algorithms

Models

Presentations

Ontologies

Intermediate Results

Related Articles

Comments & Reviews

Connections run both ways…an open, linked web of science

Plans

Carole Goble, UK eScience

Application Development

Software system for integrating, browsing and understanding large information bases

Alfred & SemDat Integration

Data Sources
• Geospatial Data - Geoserver & Mapserver
• Ontological Data - Sesame
• Documents - webpages, PDFs, reports

Visualization
• Map
• Concept graph
• Concept tree
• Web browser

Analysis methods
• Visual exploration
• Relevant measurement
• Spatial and ontological queries

• The application has the following basic module types:

Single Sourcing

• Eclipse is used as the base
– Stable and industry-standard
– Enables advanced coordination between our modules and many available third party modules

• The display modules provide a view on the dataset with rich interactivity
– A user can focus on the information they want.

• The query engine is the smarts
– Determines which information is relevant to the current selection
– Determines how that information should be displayed
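
The two jobs of the query engine can be sketched as follows. This is not the application's actual engine (which is built on Eclipse and SPARQL); it is a minimal Python illustration that assumes relevance means "within a few links of the current selection". The item names, the `LINKS` table, and the `VIEW_FOR_KIND` mapping are all hypothetical.

```python
from collections import deque

# Hypothetical links between items (documents, concepts, map features).
LINKS = {
    "doc:report-1": ["concept:Basalt", "place:site-A"],
    "concept:Basalt": ["doc:report-1", "doc:report-2"],
    "doc:report-2": ["concept:Basalt"],
    "place:site-A": ["doc:report-1"],
    "doc:report-3": [],
}

# Hypothetical mapping from item kind to the view that should show it.
VIEW_FOR_KIND = {"doc": "web browser", "concept": "concept graph", "place": "map"}


def relevant_items(selection, links, max_hops=2):
    """Breadth-first search: map each item within max_hops to its hop count."""
    hops = {selection: 0}
    queue = deque([selection])
    while queue:
        item = queue.popleft()
        if hops[item] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in links.get(item, []):
            if neighbour not in hops:
                hops[neighbour] = hops[item] + 1
                queue.append(neighbour)
    return hops


def display_plan(selection, links, max_hops=2):
    """Pair each relevant item with the view that would display it."""
    plan = {}
    for item in relevant_items(selection, links, max_hops):
        kind = item.split(":", 1)[0]
        plan[item] = VIEW_FOR_KIND.get(kind, "web browser")
    return plan


if __name__ == "__main__":
    for item, view in display_plan("place:site-A", LINKS).items():
        print(item, "->", view)
```

Selecting a place on the map pulls in the report written about it and the concept that report mentions, while unlinked items such as `doc:report-3` stay out of every view.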

Style queries mark up displayed information based on semantics:

• Standards:
– Eclipse – Industry-standard base with standardized plug-in format
   • NeOn – Existing Eclipse application providing useful ontological plug-ins
   • uDig – Existing Eclipse application providing useful mapping and browser plug-ins
   • Open source
   • Open standard
   • Active communities
– OWL/RDF – Industry standard for representing ontologies
– SPARQL – Query language
– Jython/Python – Advanced styling and rendering of data
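
To show what a SPARQL query asks of RDF data, the sketch below pairs an illustrative query text with a plain-Python matcher that mimics a basic graph pattern. It does not talk to Sesame or any real triple store, and the `ex:` vocabulary and data are hypothetical.

```python
# The kind of SPARQL a triple store would receive (illustrative only; the
# ex: vocabulary is hypothetical and this text is not executed below).
SPARQL_TEXT = """
PREFIX ex: <http://example.org/>
SELECT ?doc ?place WHERE {
  ?doc ex:mentions ex:Basalt .
  ?doc ex:locatedAt ?place .
}
"""

# Hypothetical RDF-style statements as (subject, predicate, object).
TRIPLES = [
    ("ex:report-1", "ex:mentions", "ex:Basalt"),
    ("ex:report-1", "ex:locatedAt", "ex:site-A"),
    ("ex:report-2", "ex:mentions", "ex:Granite"),
    ("ex:report-2", "ex:locatedAt", "ex:site-B"),
]


def match_pattern(pattern, triple, binding):
    """Extend `binding` so `pattern` equals `triple`, or return None."""
    new_binding = dict(binding)
    for part, value in zip(pattern, triple):
        if part.startswith("?"):
            # A variable must keep the value it was first bound to.
            if new_binding.setdefault(part, value) != value:
                return None
        elif part != value:
            return None
    return new_binding


def query(patterns, triples):
    """Return every variable binding that satisfies all patterns together."""
    bindings = [{}]
    for pattern in patterns:
        next_bindings = []
        for binding in bindings:
            for triple in triples:
                extended = match_pattern(pattern, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings


if __name__ == "__main__":
    patterns = [
        ("?doc", "ex:mentions", "ex:Basalt"),
        ("?doc", "ex:locatedAt", "?place"),
    ]
    print(query(patterns, TRIPLES))
```

Because `?doc` appears in both patterns, the two statements are joined on the same document, which is how a single query can combine an ontological condition with a spatial one.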

Geographic Context (Map View)

Analysts can gain insights from geographic relationships between cases

• Distance – possible physical/chemical interactions, team collaboration
• Clusters – successes and failures
• Patterns – successes restricted to a particular team
• Possible explanations/theories
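
The distance relationship listed above can be made concrete with a short Python sketch. It assumes cases are stored as latitude/longitude points and uses the standard haversine great-circle formula with a mean Earth radius of 6371 km; the case names, coordinates, and outcomes are hypothetical, and the map view itself may compute proximity differently.

```python
import math

# Hypothetical cases: (name, latitude, longitude, outcome).
CASES = [
    ("case-1", -36.85, 174.76, "success"),
    ("case-2", -36.86, 174.77, "success"),
    ("case-3", -41.29, 174.78, "failure"),
]


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def neighbours_within(cases, radius_km):
    """Return pairs of case names that lie within radius_km of each other."""
    pairs = []
    for i, (name_a, lat_a, lon_a, _) in enumerate(cases):
        for name_b, lat_b, lon_b, _ in cases[i + 1:]:
            if haversine_km(lat_a, lon_a, lat_b, lon_b) <= radius_km:
                pairs.append((name_a, name_b))
    return pairs


if __name__ == "__main__":
    print(neighbours_within(CASES, radius_km=10))
```

Pairs that fall inside the radius are candidates for a cluster; comparing their outcomes is then a first step toward the patterns and explanations an analyst would look for on the map.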

Drill Down to Related Document

Analysts can drill down to investigate an individual abstract/article for more details

We need far better information filters

Demonstration / sample results

Conclusions

• We are drowning in data / information / knowledge, yet are rewarded for producing more, not less
– Zero-sum game: if we are writing more, we must be reading less…

• Describing documents and other digital artifacts according to a variety of different facets holds considerable promise
– The semantic web is providing many ways to describe data collections
– We may not be able to capture what things mean directly, but we can provide some useful signifiers (clues)

• The traces that individuals leave behind can be very useful, both to themselves and to others.
– And they are comparatively inexpensive to capture and analyse

• Trust: Researchers need commitments over data custodianship that they can rely on into the long term.
– Not 4-year funding cycles for nationally significant datasets

Questions?

Tawan Banchuen, PhD
Lecturer at Auckland University

[email protected]
eresearch.auckland.ac.nz