GEOSS ADC Architecture Workshop
Clearinghouse, Catalogues, Registries

Doug Nebert

U.S. Geological Survey

ddnebert@usgs.gov

February 5, 2008

GEOSS Access Context

[Diagram: numbered interactions (1 to 9) among the User, Web Portals and client applications, the GEOSS Clearinghouse, the GEOSS Component and Service Registry, the Standards and Special Arrangements Registries, Offerors, and Community Resources (Catalogues, Services). Arrow labels: accesses, search, get list of catalogue services, invoke, reference, references, contribute, register, operate.]

GEOSS Clearinghouse

• The Clearinghouse acts as a broker to Community Catalogues

• It queries the GEOSS Service Registry to identify catalogue services that can be searched

• Community Catalogues may either be “harvested” in advance or “searched” at the time of a user query

• Searches are received from the GEO Web Portal, Community Portals, or any other external application acting as a catalog client

• Brief or full responses are marshaled and returned to the requesting client as XML
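
The brief-versus-full response step can be illustrated with a short sketch. The record fields (`id`, `title`, `abstract`, `source`) and the `<results>` envelope are assumptions made up for this example; they are not the Clearinghouse's actual response schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical field sets; a real clearinghouse would follow the schema of
# its catalog service interface rather than these illustrative names.
BRIEF_FIELDS = ("id", "title")
FULL_FIELDS = ("id", "title", "abstract", "source")


def marshal_results(records, element_set="brief"):
    """Serialize search hits as an XML string for the requesting client."""
    fields = BRIEF_FIELDS if element_set == "brief" else FULL_FIELDS
    root = ET.Element("results", count=str(len(records)), elementSet=element_set)
    for rec in records:
        node = ET.SubElement(root, "record")
        for name in fields:
            if name in rec:  # skip fields the remote catalogue did not supply
                ET.SubElement(node, name).text = str(rec[name])
    return ET.tostring(root, encoding="unicode")


hits = [{"id": "a1", "title": "Sea surface temperature",
         "abstract": "Daily grids", "source": "community-catalogue-1"}]
brief_xml = marshal_results(hits, "brief")
full_xml = marshal_results(hits, "full")
```

The same hit list yields a short record for result lists and a longer one when the client asks for detail.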

GEOSS Clearinghouse

Use Case: coordination of Registry and Clearinghouse

• Providers interact with the Registry through a GUI to register components and services.

• The Clearinghouse is routinely updated with selected contents of the Service Registry.

• Portals (both GEO and Community) and other clients search the Clearinghouse through a catalog service interface, i.e., not a GUI.

• Searches of the Clearinghouse are accomplished via:
  – 1) metadata held in the Clearinghouse, previously harvested from remote catalogues
  – 2) distributed searches to remote catalogues at the time of the user's search.
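
The two search paths can be combined in one call, sketched below under stated assumptions: harvested records are plain dictionaries with a `title` key, and each remote catalogue is represented by a hypothetical callable that takes the query and returns a list of records. Neither is taken from an actual Clearinghouse implementation.

```python
from concurrent.futures import ThreadPoolExecutor


def search_clearinghouse(query, harvested, remote_searchers):
    """Return (hits, failed): cached matches plus live results from remotes."""
    q = query.lower()
    # Path 1: metadata already harvested into the local cache.
    hits = [r for r in harvested if q in r["title"].lower()]
    failed = []
    # Path 2: distributed search, one thread per remote catalogue.
    with ThreadPoolExecutor(max_workers=max(1, len(remote_searchers))) as pool:
        futures = {name: pool.submit(fn, query)
                   for name, fn in remote_searchers.items()}
        for name, fut in futures.items():
            try:
                hits.extend(fut.result())
            except Exception:
                failed.append(name)  # remote search did not complete
    return hits, failed


def live_ok(query):  # stand-in for a reachable remote catalogue
    return [{"title": f"{query} (live)", "source": "remote-a"}]


def live_down(query):  # stand-in for an unreachable remote catalogue
    raise ConnectionError("remote catalogue unreachable")


harvested = [{"title": "Global land cover", "source": "cache"}]
hits, failed = search_clearinghouse(
    "land cover", harvested, {"remote-a": live_ok, "remote-b": live_down})
```

Reporting the failed catalogues alongside the hits lets a client show partial results instead of failing the whole search.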

Use Case: coordination of Registry and Clearinghouse

• In the publishing activity, “A,” a GEOSS publisher activates an online service and documents its existence or its data sources in a catalog.

• Activity “B” details the transactions taking place between a publisher who is registering a Component and a service and the Service and Standards registries.

• Activity “C” shows the GEOSS Clearinghouse discovering eligible services, including catalog services, in the GEOSS Service Registry and then accessing the found services directly. In some cases the remote catalogs are set up for real-time distributed query; in others, for harvesting or processing the results into a local cache.

• Activity “D” shows the expected interaction between a Web Portal and the Clearinghouse and Component and Service registry.

Interaction Diagram – Clearinghouse

Interaction Diagram, continued

Clearinghouse testing

• Three implementations tested
  – GeoNetwork Clearinghouse
  – ESRI Clearinghouse
  – Compusult Clearinghouse

• Three sets of tests were performed
  – Clearinghouse to Service Registry
  – Search of Clearinghouses by GEO Web Portal candidates
  – Clearinghouse to Community Catalogues

Clearinghouse Requirements

• Assessment of GEOSS Clearinghouse candidates is based on fulfillment of the requirements contained in the CFP
  – Requirements contain slight changes vs. the CFP

• Clearinghouse candidate self-assessment against requirements
  – Compliant except where requirements are ambiguous
  – Expectation that all registered catalog services should be made searchable through each Clearinghouse instance

Clearinghouse trade study: Distributed search vs. Harvest

• A set of evaluation criteria was defined, followed by analysis of the alternatives

• Harvest alternative: the advantage is quick searches; the disadvantage is metadata duplication

• Distributed Search alternative: the advantage is that metadata is maintained closer to the source; the disadvantages are that searching takes longer to complete and a search has more chances of not completing

• Recommend Harvest when possible
  – Harvest only collection metadata
  – Policy of community catalogue must be respected
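
The recommendation can be read as a simple decision rule. The sketch below is one possible encoding; the `allows_harvest` and `supports_live_search` flags and the `level` field are hypothetical names invented for the example, not attributes of the GEOSS registries.

```python
from dataclasses import dataclass


@dataclass
class CatalogueEntry:
    name: str
    allows_harvest: bool        # hypothetical flag for the community's policy
    supports_live_search: bool  # hypothetical flag for distributed query


def choose_access_mode(entry):
    """Prefer harvest when policy permits; otherwise fall back to live search."""
    if entry.allows_harvest:
        return "harvest"
    if entry.supports_live_search:
        return "distributed"
    return "unavailable"


def harvestable(records):
    """Keep only collection-level records; granule-level items stay remote."""
    return [r for r in records if r.get("level") == "collection"]


open_cat = CatalogueEntry("open-catalogue", True, True)
closed_cat = CatalogueEntry("restricted-catalogue", False, True)
modes = [choose_access_mode(open_cat), choose_access_mode(closed_cat)]
```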

Integration Issues

• Catalogues registered with GEOSS vary widely in their standardization. Protocols include:
  – ISO 23950 (Z39.50) “GEO” Profile Version 2.2
    • FGDC (CSDGM Metadata)
    • ANZLIC Metadata
    • ISO 19115 Metadata
  – OGC Catalogue Service for the Web (Version 2.0.1 and 2.0.2)
    • ebRIM Profile (incl. ISO and EO Extension Packages)
    • FGDC Profile
    • ISO 19115 Profile
  – SRU/SRW OpenSearch
  – OAI-Protocol for Metadata Harvesting (OAI-PMH)
  – Dublin/Darwin Core Metadata
  – Web-accessible folder/ftp?
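
As one concrete instance of the protocols listed, a client could query an OGC CSW 2.0.2 catalogue with a GetRecords request in key-value form. The sketch below only builds the URL. The endpoint `https://catalogue.example.org/csw` is a placeholder, and servers differ in which optional parameters and queryables they accept, so treat the parameter set as an assumption to check against each catalogue's GetCapabilities response.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def csw_getrecords_url(endpoint, cql_text, element_set="brief", max_records=10):
    """Build a CSW 2.0.2 GetRecords GET request using key-value parameters."""
    params = {
        "service": "CSW",
        "version": "2.0.2",
        "request": "GetRecords",
        "typeNames": "csw:Record",
        "resultType": "results",
        "elementSetName": element_set,  # brief, summary, or full
        "constraintLanguage": "CQL_TEXT",
        "constraint_language_version": "1.1.0",
        "constraint": cql_text,
        "maxRecords": str(max_records),
    }
    return f"{endpoint}?{urlencode(params)}"


# Placeholder endpoint and an illustrative full-text constraint.
url = csw_getrecords_url("https://catalogue.example.org/csw",
                         "AnyText LIKE '%ozone%'")
query = parse_qs(urlsplit(url).query)
```

Each of the other protocols would need its own request builder, which is part of why integrating heterogeneous catalogues is costly.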

Who are the primary user types?

• Registries

• Clearinghouse

• Catalogues

What resource types should be registered?

• Consider service, data set, data collection (series), and item as alternative resource types, along with the ability to transition from one to another.

• Current results are too heterogeneous

What protocols can be expected?

• Let responses to the CFP suggest choices

• Support a test harness capability to self-test registered catalog service types

• Clearinghouse instances must expose identical service interfaces
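
A self-test harness of the kind suggested here could run one probe query through each registered catalog adapter and report the outcome. In this sketch an adapter is assumed to be a callable that takes a query string and returns a list of records. That uniform signature is a hypothetical convention chosen for the example, not an interface defined by GEOSS.

```python
def self_test(name, search_fn, probe_query="test"):
    """Run one probe query against a catalogue adapter and report the outcome."""
    try:
        result = search_fn(probe_query)
    except Exception as exc:
        return {"service": name, "ok": False, "detail": f"error: {exc}"}
    if not isinstance(result, list):
        return {"service": name, "ok": False,
                "detail": "response is not a list of records"}
    return {"service": name, "ok": True, "detail": f"{len(result)} record(s)"}


def run_harness(adapters):
    """Self-test every registered adapter; adapters maps name -> callable."""
    return [self_test(name, fn) for name, fn in adapters.items()]


def csw_adapter(query):  # stand-in for a working catalogue adapter
    return [{"title": "probe hit"}]


def z3950_adapter(query):  # stand-in for an adapter whose server is down
    raise TimeoutError("no response")


report = run_harness({"csw": csw_adapter, "z39.50": z3950_adapter})
```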

What metadata formats are found?

• ISO 19139 and Profiles (INSPIRE, ANZLIC, NAP)

• FGDC CSDGM

• Dublin Core

• Darwin Core

What metadata? How should it be presented?

• Need to refine the “core” metadata results that are handled and presented by the Clearinghouse, either as an intersection of data elements or as a “Summary” style record synthesized from the remote response
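
The "intersection of data elements" idea can be made concrete with a small sketch. The field mappings below are deliberately simplified assumptions: real ISO, FGDC, and Dublin Core records are nested XML documents, and the flat keys here only stand in for them.

```python
# Hypothetical, simplified mappings from a core element name to the key that
# carries it in each source format (flat dictionaries stand in for real XML).
FIELD_MAP = {
    "iso": {"title": "title", "abstract": "abstract",
            "identifier": "fileIdentifier"},
    "fgdc": {"title": "title", "abstract": "abstract"},
    "dc": {"title": "title", "abstract": "description",
           "identifier": "identifier"},
}

# The core set is whatever every format can supply.
CORE_ELEMENTS = sorted(set.intersection(*(set(m) for m in FIELD_MAP.values())))


def summarize(record, fmt):
    """Synthesize a summary record holding only the shared core elements."""
    mapping = FIELD_MAP[fmt]
    return {name: record.get(mapping[name]) for name in CORE_ELEMENTS}


dc_record = {"title": "Ozone", "description": "Monthly means", "identifier": "x1"}
summary = summarize(dc_record, "dc")
```

Because the FGDC mapping in this example has no identifier, that element drops out of the core set, which shows how a strict intersection can shrink as more formats are added.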

Specific recommendations (agreements for Clearinghouse testing and implementation)

• Performance and scalability need to be addressed, including usage expectations and the type and volume of use

• Typical use cases for query, presentation, and load handling need to be included so that numerous users and heavy query loads are handled gracefully