A Web-Based Resource Model for eScience: Object Reuse & Exchange 2008 Microsoft eScience Conference Indianapolis, December 8, 2008

Page 1

A Web-Based Resource Model for eScience:

Object Reuse & Exchange

2008 Microsoft eScience Conference

Indianapolis, December 8, 2008

Page 2

OAI-ORE Editors

• Carl Lagoze, Cornell University

• Herbert Van de Sompel, Los Alamos National Laboratory

• Pete Johnston, Eduserv Foundation

• Michael Nelson, Old Dominion University

• Rob Sanderson, University of Liverpool

• Simeon Warner, Cornell University

Page 3

Joint work with …

ORE Technical Committee

• Chris Bizer, Freie Universität Berlin
• Les Carr, University of Southampton
• Tim DiLauro, Johns Hopkins University
• Leigh Dodds, Ingenta
• David Fulker, UCAR
• Tony Hammond, Nature Publishing Group
• Pete Johnston, Eduserv Foundation
• Richard Jones, HP Labs
• Carl Lagoze, Cornell University
• Peter Murray, OhioLINK
• Michael Nelson, Old Dominion University
• Ray Plante, NCSA and National Virtual Observatory
• Rob Sanderson, University of Liverpool
• Herbert Van de Sompel, Los Alamos National Laboratory
• Simeon Warner, Cornell University
• Jeff Young, OCLC

ORE Liaison Group

• Leonardo Candela, Consiglio Nazionale delle Ricerche - DRIVER
• Tim Cole, University of Illinois Urbana-Champaign - Aquifer
• Julie Allinson, JISC
• Jane Hunter, University of Queensland - DEST
• Savas Parastatidis, Microsoft Corporation
• Sandy Payette, Fedora Commons
• Thomas Place, University of Tilburg - DARE
• Andy Powell, Eduserv Foundation - DCMI
• Robert Tansley, Google, Inc. - DSpace

Page 4

OAI Object Reuse and Exchange: Support

• The Andrew W. Mellon Foundation
• The Coalition for Networked Information
• Joint Information Systems Committee
• Microsoft Corporation
• The National Science Foundation

Page 5

OAI Object Reuse and Exchange

Subject: Aggregations of Web resources

Approach: Publish Resource Maps to the Web that Instantiate, Describe, and Identify Aggregations

Page 6

Instantiate, Describe, and Identify Aggregations

Aggregations

Page 7

Aggregations

At one time it was possible to convey all scientific information about a topic in a single “convenient” medium.

Babylonian Astronomical Catalogue

Page 8

Aggregations

But quickly the limitations of that medium became obvious.

Figure: 1857 Astrophysics paper (labels: text, data)

Page 9

Aggregations

Those limitations seem to live on.

Page 10

Aggregations

“Solving” the problem with ad hoc methods.

Photo plate kept separate from text (digitized version of original plate shown)

Figure: 1890 Astrophysics paper (label: text)

Page 11

Aggregations

Objects of interest in eScience are by nature compound.

Figure: 2006 Astrophysics paper (A1795), shown with its components:

• text
• Hubble optical observation (Baltimore, MD)
• Basic object information (Strasbourg, France)
• XMM-Newton X-ray observation (Vilspa, Spain)
• Chandra X-ray observation (Cambridge, MA)

Page 12

Aggregations!

http://arxiv.org/abs/astro-ph/0611775

Formats

Versions

Identifiers

Relationships

Splash page

Page 13

Object Reuse and Exchange: A Web-Centric Approach

• The Web Architecture as the platform for interoperability

• De-facto integration with existing Web applications

• Potential for adoption by other communities

• Potential to use tools created by other communities

• Incorporating the “social web” (Web 2.0) in eScience

Page 14

Foundations of OAI-ORE

o Web Architecture
- <http://www.w3.org/TR/webarch/>

o Semantic Web, RDF
- <http://www.w3.org/TR/rdf-primer/>

o Linked Data
- <http://linkeddata.org/>
- <http://www4.wiwiss.fu-berlin.de/bizer/pub/LinkedDataTutorial/>

o Cool URIs for the Semantic Web
- <http://www.w3.org/TR/cooluris>

Page 15

W3C Web Architecture

Diagram: a URI identifies a Resource; Representation 1 and Representation 2 each represent the Resource (Content Negotiation).

The tools we have to solve the interoperability problem are:

• Resource
• URI
• Representation
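The relationship between one Resource and several Representations can be sketched as a small content-negotiation routine. This is only an illustration, not how any particular server implements it: the `REPRESENTATIONS` table, the `negotiate` function, and the placeholder bodies are all hypothetical, and the `Accept` parsing handles only media types and `q` values.

```python
# Hypothetical representations of a single resource, keyed by media type.
REPRESENTATIONS = {
    "text/html": "<html>...splash page...</html>",
    "application/rdf+xml": "<rdf:RDF>...</rdf:RDF>",
}

def negotiate(accept_header, available=REPRESENTATIONS):
    """Return (media_type, body) for the most preferred available type, or None."""
    prefs = []
    for position, item in enumerate(accept_header.split(",")):
        parts = [p.strip() for p in item.split(";")]
        media_type, q = parts[0], 1.0
        for param in parts[1:]:
            if param.startswith("q="):
                q = float(param[2:])
        if q > 0:  # q=0 means "not acceptable"
            prefs.append((-q, position, media_type))
    # Highest q first; ties keep the order given in the header.
    for _, _, media_type in sorted(prefs):
        if media_type in available:
            return media_type, available[media_type]
        if media_type == "*/*" and available:
            first = next(iter(available))
            return first, available[first]
    return None

chosen = negotiate("application/rdf+xml;q=0.9, text/html;q=0.5")
```

The same URI is used in every call; only the `Accept` value decides which Representation comes back.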

Page 16

Semantic Web

The tools we have to solve the interoperability problem are:

• URI
• RDF
• Vocabularies

Diagram: the Semantic Web, built from URI, RDF, and Vocabularies
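How the three pieces fit together can be shown with plain tuples: URIs name things, an RDF triple states one fact, and a vocabulary supplies the predicate. This is a sketch only. The Dublin Core Terms namespace is real, but the `PAPER` and `AUTHOR` URIs, the title, and the `objects` helper are made up for illustration.

```python
# Dublin Core Terms, a widely used vocabulary.
DCTERMS = "http://purl.org/dc/terms/"

# Hypothetical URIs used only for illustration.
PAPER = "http://example.org/paper/1"
AUTHOR = "http://example.org/people/carl"

# Each triple is (subject, predicate, object); literals are plain strings here.
triples = {
    (PAPER, DCTERMS + "title", "Example title"),
    (PAPER, DCTERMS + "creator", AUTHOR),
}

def objects(graph, subject, predicate):
    """Return every object asserted for the given subject and predicate."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)

creators = objects(triples, PAPER, DCTERMS + "creator")
```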

Page 17

Linked Data

• Linked Data principles:

1. Use URIs as names for things.

2. Use HTTP URIs so that people can look up those names.

3. When someone looks up a URI, provide useful information.

4. Include links to other URIs, so that they can discover more things.
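The four principles amount to a follow-your-nose traversal, which can be sketched without any network access. Here the `WEB` dictionary stands in for HTTP lookups, and every URI and link in it is hypothetical; `discover` is an illustrative helper, not part of any Linked Data library.

```python
# A stand-in for the Web: looking up a URI returns (predicate, object) pairs.
# All URIs and data are hypothetical.
WEB = {
    "http://example.org/a": [("rdfs:seeAlso", "http://example.org/b")],
    "http://example.org/b": [
        ("rdfs:seeAlso", "http://example.org/c"),
        ("rdfs:label", "B"),
    ],
    "http://example.org/c": [],
}

def discover(start, web=WEB):
    """Follow links breadth-first from start; return URIs in discovery order."""
    seen, queue = [start], [start]
    while queue:
        uri = queue.pop(0)
        for _, obj in web.get(uri, []):
            # Only HTTP URIs can be looked up; literals such as "B" are skipped.
            if obj.startswith("http://") and obj not in seen:
                seen.append(obj)
                queue.append(obj)
    return seen

found = discover("http://example.org/a")
```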

Page 18

OAI Object Reuse and Exchange: The Approach

Subject: Aggregations of Web resources

Approach: Publish Resource Maps to the Web that Instantiate, Describe, and establish identity of Aggregations

Approach: Instantiate Aggregations as Resources with unique URIs on the Web

Page 19

An Aggregation and the Web

• Resources of an Aggregation are distinct URI-identified Web resources
• Missing are:
o The boundary that delineates the Aggregation in the Web
o An identity (URI) for the Aggregation

Page 20

Publish a Resource Map to the Web

Page 21

The Resource Map Describes the Aggregation

Page 22

The Resource Map and the Aggregation integrate into the Web

Page 23

ORE Data Model

Page 24

ORE Data Model

We want to have our cake and to eat it too (don't we all?):

o ORE should be simple and easy to use without deep understanding
- Use simple tools and rules to create Atom Resource Maps

o ORE should have a well-crafted data model that enables interoperability through well-defined semantics
- Separate design from implementation
- Future-proof ORE: today's technologies will be replaced (even HTTP?)
- Don't need to understand Data Model fully to do ORE

Page 25

Aggregation: Resource that is a set of resources

This resource is an Aggregation

This resource is an Aggregated Resource

A Relationship defined in the ORE vocabulary
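An Aggregation and its Aggregated Resources can be written down as a handful of triples. The sketch below assumes the relationship in the figure is `ore:aggregates` from the ORE terms namespace; the example URIs and the two helper functions are hypothetical.

```python
ORE = "http://www.openarchives.org/ore/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical URIs used only for illustration.
AGGREGATION = "http://example.org/aggregation/1"
PARTS = ["http://example.org/paper.pdf", "http://example.org/data.csv"]

def make_aggregation(aggregation_uri, resource_uris):
    """Return triples typing the Aggregation and listing its Aggregated Resources."""
    triples = {(aggregation_uri, RDF_TYPE, ORE + "Aggregation")}
    for uri in resource_uris:
        triples.add((aggregation_uri, ORE + "aggregates", uri))
    return triples

def aggregated_resources(triples, aggregation_uri):
    """Return the resources the Aggregation is asserted to aggregate."""
    return sorted(
        o for s, p, o in triples
        if s == aggregation_uri and p == ORE + "aggregates"
    )

graph = make_aggregation(AGGREGATION, PARTS)
```

The Aggregation gets its own URI here, which is what gives the set of resources a boundary and an identity.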

Page 26

Resource Map: Describes an Aggregation

This resource is a Resource Map

Resource Map Serialization: the resource has a representation (HTTP GET)

ore:isDescribedBy: implied as inverse of “describes”
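The "implied inverse" can be made concrete: given a triple using `ore:describes`, a consumer may derive the matching `ore:isDescribedBy` triple. The sketch below uses the ORE terms namespace; the `REM` and `AGGREGATION` URIs and the `with_inverse` helper are hypothetical.

```python
ORE = "http://www.openarchives.org/ore/terms/"

# Hypothetical URIs used only for illustration.
REM = "http://example.org/rem/1.atom"
AGGREGATION = "http://example.org/aggregation/1"

asserted = {(REM, ORE + "describes", AGGREGATION)}

def with_inverse(triples):
    """Return the triples plus one ore:isDescribedBy per ore:describes."""
    inferred = {
        (o, ORE + "isDescribedBy", s)
        for s, p, o in triples
        if p == ORE + "describes"
    }
    return set(triples) | inferred

closed = with_inverse(asserted)
```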

Page 27

Recommend use of HTTP URIs

• HTTP is the technology of today's Web
• Want to be able to cite or refer to the Aggregation but get the Resource Map describing it
o Follow Linked Data strategies to link: access URI-A and get redirected to URI-R (HTTP 303), or use a simple # URI
o Provides notion of Authority

Multiple Resource Maps

o An Aggregation MAY be asserted and described by multiple Resource Maps
o The purpose of multiple Resource Maps is to provide descriptions of the Aggregation in multiple serializations (e.g., Atom, RDF/XML, RDFa, etc.)
o Each Resource Map MUST have only one representation
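The two ways of getting from URI-A to URI-R can be sketched as a small resolver. No network request is made: the `SEE_OTHER` table is a hypothetical stand-in for a server that answers `303 See Other`, and all URIs are invented for the example.

```python
from urllib.parse import urldefrag

# Hypothetical redirect table standing in for a server that answers
# "303 See Other" when an Aggregation URI (URI-A) is requested.
SEE_OTHER = {
    "http://example.org/aggregation/1": "http://example.org/rem/1.atom",
}

def resource_map_uri(aggregation_uri, see_other=SEE_OTHER):
    """Return the Resource Map URI (URI-R) reached from an Aggregation URI."""
    base, fragment = urldefrag(aggregation_uri)
    if fragment:
        # Hash URI: a client dereferences only the part before "#".
        return base
    # Otherwise rely on the (simulated) 303 redirect; None if there is none.
    return see_other.get(aggregation_uri)

via_hash = resource_map_uri("http://example.org/rem/2#aggregation")
via_303 = resource_map_uri("http://example.org/aggregation/1")
```

In both cases the Aggregation and the Resource Map keep distinct URIs, so one can cite the Aggregation and still retrieve its description.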

Page 28

Authority

o Authoritative Resource Maps
- Get to Resource Map via Aggregation, usually created by same authority
- Multiple: MUST be minimally equivalent (same Aggregated Resources and Proxies), SHOULD assert mutual existence

o Non-authoritative Resource Maps
- Best practice is to not create them
- Assert your own Aggregation instead
- Use rdfs:seeAlso to assert relationship between two Aggregations
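One reading of "minimally equivalent" is that two Resource Maps list the same Aggregated Resources and the same Proxies for the Aggregation, whatever else they say. The check below is a sketch under that assumption; the `core` and `minimally_equivalent` helpers and all example URIs are hypothetical.

```python
ORE = "http://www.openarchives.org/ore/terms/"
DCTERMS = "http://purl.org/dc/terms/"

# Hypothetical URIs used only for illustration.
A = "http://example.org/aggregation/1"
AR1, AR2 = "http://example.org/ar/1", "http://example.org/ar/2"

def core(triples, aggregation_uri):
    """Aggregated Resources and Proxies asserted for one Aggregation."""
    aggregated = frozenset(
        o for s, p, o in triples
        if s == aggregation_uri and p == ORE + "aggregates"
    )
    proxies = frozenset(
        s for s, p, o in triples
        if p == ORE + "proxyIn" and o == aggregation_uri
    )
    return aggregated, proxies

def minimally_equivalent(rem_a, rem_b, aggregation_uri):
    """True if both Resource Maps agree on Aggregated Resources and Proxies."""
    return core(rem_a, aggregation_uri) == core(rem_b, aggregation_uri)

atom_rem = {(A, ORE + "aggregates", AR1), (A, ORE + "aggregates", AR2)}
# Same core, plus extra metadata that does not affect equivalence.
rdfxml_rem = atom_rem | {(A, DCTERMS + "title", "Example")}
partial_rem = {(A, ORE + "aggregates", AR1)}

same = minimally_equivalent(atom_rem, rdfxml_rem, A)
different = minimally_equivalent(atom_rem, partial_rem, A)
```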

Page 29

Multiple Resource Maps

Diagram labels: Resource Maps in Atom, RDFa, Atom, and RDF/XML, linked by ore:describes

These are authoritative Resource Maps

These are non-authoritative Resource Maps

Page 30

Not much else

Page 31

Association with another resource/identifier

Page 32

Adding other properties to the core

The ReM makes the assertions

Metadata about the ReM

Metadata about the Aggregation

Required

Page 33

Asserting other Relationships

Aggregation is a journal

Aggregation has another version “A”

Aggregated Resources are articles

“AR-3” is by Stephen Hawking

The ReM makes the assertions

Assertions about the Aggregation.

Assertions about Aggregated Resources.
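The two kinds of assertion can be told apart by their subject. The sketch below restates two of the slide's examples as triples and splits them accordingly. The `Journal` class URI, the example URIs, and the `partition_assertions` helper are hypothetical; `ore:aggregates`, `rdf:type`, and `dcterms:creator` are existing terms.

```python
ORE = "http://www.openarchives.org/ore/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DCTERMS = "http://purl.org/dc/terms/"

# Hypothetical URIs used only for illustration.
A = "http://example.org/aggregation/journal"
AR3 = "http://example.org/ar/3"
JOURNAL = "http://example.org/vocab/Journal"

rem = {
    (A, ORE + "aggregates", AR3),
    (A, RDF_TYPE, JOURNAL),                       # Aggregation is a journal
    (AR3, DCTERMS + "creator", "Stephen Hawking"),  # "AR-3" is by Stephen Hawking
}

def partition_assertions(triples, aggregation_uri):
    """Split non-core triples by subject: the Aggregation or an Aggregated Resource."""
    aggregated = {
        o for s, p, o in triples
        if s == aggregation_uri and p == ORE + "aggregates"
    }
    about_aggregation, about_resources = [], []
    for triple in sorted(triples):
        subject, predicate, _ = triple
        if predicate == ORE + "aggregates":
            continue  # core membership triple, not an extra assertion
        if subject == aggregation_uri:
            about_aggregation.append(triple)
        elif subject in aggregated:
            about_resources.append(triple)
    return about_aggregation, about_resources

about_a, about_ar = partition_assertions(rem, A)
```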

Page 34

Limits of Assertions thus Far

• The meaning of an RDF triple is independent of the context in which it is stated

• Think of the difference:
o Carl is a man
o Carl is visiting Indianapolis

• All the triples described thus far are context independent
o Therefore they can have the URI of an aggregated resource as subject or object
o But remember that this is just the URI of the Resource; it does not identify the Resource specifically in its role as an Aggregated Resource

• Introduce proxy URI

Page 35

Proxy: Stands for resource in context of other resource

hasNext might have meaning only in context
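A Proxy can be sketched as a separate URI tied to a resource with `ore:proxyFor` and to an Aggregation with `ore:proxyIn`; context-dependent statements such as ordering are then made between Proxies. The `HAS_NEXT` predicate, the example URIs, and both helper functions below are hypothetical.

```python
ORE = "http://www.openarchives.org/ore/terms/"

# Hypothetical predicate and URIs used only for illustration.
HAS_NEXT = "http://example.org/vocab/hasNext"
A = "http://example.org/aggregation/1"
AR1, AR2 = "http://example.org/ar/1", "http://example.org/ar/2"
P1, P2 = "http://example.org/proxy/1", "http://example.org/proxy/2"

graph = {
    (P1, ORE + "proxyFor", AR1), (P1, ORE + "proxyIn", A),
    (P2, ORE + "proxyFor", AR2), (P2, ORE + "proxyIn", A),
    (P1, HAS_NEXT, P2),  # ordering holds only within Aggregation A
}

def proxy_of(triples, resource_uri, aggregation_uri):
    """Return the Proxy standing for the resource in the given Aggregation."""
    for s, p, o in triples:
        if (p == ORE + "proxyFor" and o == resource_uri
                and (s, ORE + "proxyIn", aggregation_uri) in triples):
            return s
    return None

def next_in_context(triples, resource_uri, aggregation_uri):
    """Return the resource that follows this one within the Aggregation."""
    proxy = proxy_of(triples, resource_uri, aggregation_uri)
    for s, p, o in triples:
        if s == proxy and p == HAS_NEXT:
            return next(
                (x for a, b, x in triples if a == o and b == ORE + "proxyFor"),
                None,
            )
    return None

after_ar1 = next_in_context(graph, AR1, A)
```

Asking the same question in a different Aggregation returns nothing, because no Proxy for AR1 exists there.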

Page 36

lineage: “this came from”

Reuse of data set AR-1 in Aggregation A-2.

The ore:lineage predicate expresses the origin or provenance of data. It needs Proxies because the statement depends on context.
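The reuse of data set AR-1 in Aggregation A-2 can be sketched with two Proxies for the same resource, one per Aggregation, joined by `ore:lineage`. All URIs and the `lineage_source` helper are hypothetical; the direction used here (the Proxy in A-2 points back to the Proxy in A-1) is an assumption based on the "this came from" reading.

```python
ORE = "http://www.openarchives.org/ore/terms/"

# Hypothetical URIs used only for illustration.
A1, A2 = "http://example.org/aggregation/1", "http://example.org/aggregation/2"
AR1 = "http://example.org/dataset/1"
P_IN_A1 = "http://example.org/proxy/ar1-in-a1"
P_IN_A2 = "http://example.org/proxy/ar1-in-a2"

graph = {
    (P_IN_A1, ORE + "proxyFor", AR1), (P_IN_A1, ORE + "proxyIn", A1),
    (P_IN_A2, ORE + "proxyFor", AR1), (P_IN_A2, ORE + "proxyIn", A2),
    (P_IN_A2, ORE + "lineage", P_IN_A1),  # AR-1 in A-2 "came from" AR-1 in A-1
}

def lineage_source(triples, proxy_uri):
    """Return (resource, aggregation) the proxy's lineage points back to, or None."""
    for s, p, o in triples:
        if s == proxy_uri and p == ORE + "lineage":
            resource = next(
                (x for a, b, x in triples if a == o and b == ORE + "proxyFor"),
                None,
            )
            aggregation = next(
                (x for a, b, x in triples if a == o and b == ORE + "proxyIn"),
                None,
            )
            return resource, aggregation
    return None

origin = lineage_source(graph, P_IN_A2)
```

Stating lineage directly between AR-1 and itself would say nothing, which is why the Proxies carry the statement.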

Page 37

ORE Deployment

Page 38

arXiv.org: ORE possibilities

arXiv is an e-print archive of 500k scholarly articles

Express:

• Structure of arXiv: archives, sub-categories, articles
• Versioning: “article” (concept) and specific versions and formats
• Articles by Joe Smith: somewhat like a result set
• Constituents of an article (metadata, PDF, source, video, data, extracted references)
• Describe internal and external components (e.g. external video associated with article but on Perimeter Institute server)
• Use as part of workflow for ingest: assembly of components, possible combination with SWORD

Page 39

http://www.openarchives.org/oreChem/

Page 40

SCOPE Architecture