The CIARD RING, a global directory of datasets for agriculture, has been enhanced during the EC-funded agINFRA project. It has become a Linked Data hub that can be queried by other applications. Presented at the 4th RDA Plenary Meeting in Amsterdam on 22/09/2014.
The new CIARD RING: a machine-readable directory of datasets for agriculture
Valeria Pesce, Global Forum on Agricultural Research (GFAR)
Research Data Alliance 4th Plenary Meeting, 22-24 September 2014, Amsterdam
Agricultural Data Interoperability Interest Group
agINFRA project, EC 7th Framework Programme INFRA-2011-1.2.2 - Grant agreement no. 283770
The CIARD RING
http://ring.ciard.net
The CIARD RING is a project implemented within the CIARD initiative and is led by the Global Forum on Agricultural Research (GFAR).
The CIARD RING is a global directory of web-based information services and datasets for agriculture.
Why (1)
- Producers and managers of information / data need a place where their information products can be found
- Data consumers need to find suitable data sources
- IT professionals need information on the level and mode of interoperability of information services and datasets for using data in their applications
Numbers and map
• 468 data providers
• 1018 information services, of which
  – 268 exposed datasets
Definition of “dataset” in the RING
The term “datasets” has been defined in several ways, all of which further specify or extend the basic concept of “a collection of data”.
Definition given by the W3C Government Linked Data Working Group:
A dataset is “a collection of data, published or curated by a single source, and available for access or download in one or more formats”
The “instances” of the dataset “available for access or download in one or more formats” are called “distributions”. A dataset can have many distributions.
Examples of distributions include a downloadable CSV file, an API or an RSS feed.
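The dataset / distribution relationship can be sketched as a small data structure. This is only an illustration, not the RING's actual schema: the class names, field names and example.org URLs are assumptions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Distribution:
    """One concrete form in which a dataset is exposed (file, API, feed)."""
    access_url: str
    media_format: str  # e.g. "text/csv"


@dataclass
class Dataset:
    """A collection of data published or curated by a single source."""
    title: str
    publisher: str
    distributions: List[Distribution] = field(default_factory=list)


# Hypothetical dataset exposed both as a CSV download and as an RSS feed.
rainfall = Dataset(
    title="Rainfall observations",
    publisher="Example Institute",
    distributions=[
        Distribution("http://example.org/rainfall.csv", "text/csv"),
        Distribution("http://example.org/rainfall/feed", "application/rss+xml"),
    ],
)

# One dataset, many distributions.
formats = [d.media_format for d in rainfall.distributions]
```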
Direct submission + federation
• All datasets currently featured in the RING have been manually submitted by their owners / managers
• But we do not want to force data owners who already have a dataset catalog to catalog and maintain their datasets in two places
We are working on procedures to federate datasets from the most used dataset cataloguing platforms (Dataverse, CKAN…)
First experiment started with the IFPRI Dataverse dataset catalog
The RING user interface
Dataset record
The RING machine interface – Why (2)
• Datasets registered in the RING have to be found by applications
• Applications have to be able to read all the metadata about datasets and filter datasets according to their needs
• Applications have to find enough technical metadata in the RING to:
  – Identify datasets with a specific coverage (type of data, thematic coverage, geographic coverage)
  – Identify datasets that comply with certain technical specifications (format, protocol etc.)
  – Access the dataset and get the data
This machine-readable layer can support the data aggregation workflows of external services
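As a rough sketch of what such filtering looks like on the application side, the function below selects records by topic, protocol and format. The record layout (plain dictionaries with "topics", "protocol", "format" and "url" keys) and the sample records are hypothetical and stand in for metadata already retrieved from the RING.

```python
from typing import Dict, List, Optional


def find_datasets(
    records: List[Dict],
    topic: Optional[str] = None,
    protocol: Optional[str] = None,
    fmt: Optional[str] = None,
) -> List[Dict]:
    """Return the records that satisfy every criterion that was given."""
    matches = []
    for rec in records:
        if topic is not None and topic not in rec["topics"]:
            continue
        if protocol is not None and rec["protocol"] != protocol:
            continue
        if fmt is not None and rec["format"] != fmt:
            continue
        matches.append(rec)
    return matches


# Hypothetical records, for illustration only.
records = [
    {"title": "A", "topics": ["soil"], "protocol": "OAI-PMH",
     "format": "XML", "url": "http://example.org/a/oai"},
    {"title": "B", "topics": ["soil", "water"], "protocol": "SPARQL",
     "format": "RDFXML", "url": "http://example.org/b/sparql"},
    {"title": "C", "topics": ["water"], "protocol": "OAI-PMH",
     "format": "XML", "url": "http://example.org/c/oai"},
]

soil_oai = find_datasets(records, topic="soil", protocol="OAI-PMH")
access_urls = [rec["url"] for rec in soil_oai]
```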
The RING machine interface – SPARQL
An RDF store is a way of storing data using a machine-readable "grammar" (the Resource Description Framework) and documented semantics (RDF vocabularies).
URIs
The URI for each service / dataset is built as follows: RING-domain/node/service-ID.
For example: http://ring.ciard.net/node/2417
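The URI pattern can be expressed as a pair of helper functions. The function names are assumptions; only the domain and the /node/ pattern come from the slide.

```python
RING_DOMAIN = "http://ring.ciard.net"


def ring_uri(service_id: int) -> str:
    """Build the URI of a RING service / dataset from its numeric ID."""
    return f"{RING_DOMAIN}/node/{service_id}"


def service_id_from_uri(uri: str) -> int:
    """Recover the numeric ID from a RING service / dataset URI."""
    prefix = f"{RING_DOMAIN}/node/"
    if not uri.startswith(prefix):
        raise ValueError(f"not a RING node URI: {uri}")
    return int(uri[len(prefix):])


example_uri = ring_uri(2417)
```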
The RING database is also an accessible RDF store.
SPARQL endpoint
http://ring.ciard.net/sparql1
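A minimal sketch of how an application could prepare a request for this endpoint, assuming it follows the standard SPARQL Protocol (query passed in a "query" parameter of a GET request) and can return JSON results. Both points are assumptions about the server. The request is only built here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Endpoint URL as given on the slide.
ENDPOINT = "http://ring.ciard.net/sparql1"


def build_request(query: str,
                  accept: str = "application/sparql-results+json") -> Request:
    """Prepare a SPARQL Protocol GET request without sending it."""
    url = ENDPOINT + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": accept})


req = build_request("SELECT ?s WHERE { ?s ?p ?o } LIMIT 5")
# To actually run the query: urllib.request.urlopen(req).read()
```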
SPARQL how to: vocabularies
The vocabularies used in the RDF store are:
• RDF: http://www.w3.org/1999/02/22-rdf-syntax-ns#
• RDFS: http://www.w3.org/2000/01/rdf-schema#
• DC: http://purl.org/dc/terms/
• DCAT: http://www.w3.org/ns/dcat#
• ADMS: http://www.w3.org/ns/adms#
• FOAF: http://xmlns.com/foaf/0.1/
• DOAP: http://usefulinc.com/ns/doap#
• SKOS: http://www.w3.org/2004/02/skos/core#
• VCARD: http://www.w3.org/2006/vcard/ns#
The data model chosen to describe datasets is the W3C Data Catalog Vocabulary (DCAT), designed to describe datasets and the forms in which they are exposed, their "distributions".
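To avoid retyping namespace declarations, an application can keep the vocabularies in a lookup table and generate the PREFIX lines it needs. The namespaces are those listed on the slide; the helper name and the lowercase prefix labels are assumptions.

```python
from typing import Iterable

# Namespaces used in the RING RDF store, as listed on the slide.
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "dc": "http://purl.org/dc/terms/",
    "dcat": "http://www.w3.org/ns/dcat#",
    "adms": "http://www.w3.org/ns/adms#",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "doap": "http://usefulinc.com/ns/doap#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "vcard": "http://www.w3.org/2006/vcard/ns#",
}


def prefix_header(names: Iterable[str]) -> str:
    """Return one SPARQL PREFIX declaration per requested vocabulary."""
    return "\n".join(f"PREFIX {name}: <{PREFIXES[name]}>" for name in names)


header = prefix_header(["rdf", "dcat"])
```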
SPARQL how to: sample query
To get all datasets available through the OAI-PMH protocol
Query:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dc: <http://purl.org/dc/terms/>
PREFIX dcat: <http://www.w3.org/ns/dcat#>
PREFIX adms: <http://www.w3.org/ns/adms#>
PREFIX doap: <http://usefulinc.com/ns/doap#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
DESCRIBE ?dataset ?distro ?owner ?contact ?topic ?standard ?format ?protocol
WHERE {
  ?dataset rdf:type dcat:Dataset .
  ?dataset dc:title ?title .
  ?dataset dcat:distribution ?distro .
  ?dataset dc:publisher ?owner .
  ?distro dcat:accessURL ?url .
  ?distro adms:representationTechnique <http://ring.ciard.net/taxonomy_term/108> .
  OPTIONAL { ?dataset doap:maintainer ?contact }
  OPTIONAL { ?dataset dcat:theme ?topic }
  OPTIONAL { ?distro dc:conformsTo ?standard }
  OPTIONAL { ?distro dc:format ?format }
  OPTIONAL { ?distro adms:representationTechnique ?protocol }
}
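An application that only needs titles and access URLs might prefer a SELECT variant of this query and read the results as SPARQL JSON. The sketch below builds such a query for any protocol URI and flattens a JSON result document into rows. The SELECT form, the function names and the sample response are assumptions, and whether the endpoint serves JSON is not confirmed here.

```python
import json
from typing import Dict, List


def datasets_by_protocol_query(protocol_uri: str) -> str:
    """SELECT variant of the sample query, for a given protocol concept URI."""
    return f"""PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dc: <http://purl.org/dc/terms/>
PREFIX dcat: <http://www.w3.org/ns/dcat#>
PREFIX adms: <http://www.w3.org/ns/adms#>
SELECT ?dataset ?title ?url
WHERE {{
  ?dataset rdf:type dcat:Dataset .
  ?dataset dc:title ?title .
  ?dataset dcat:distribution ?distro .
  ?distro dcat:accessURL ?url .
  ?distro adms:representationTechnique <{protocol_uri}> .
}}"""


def rows_from_json(payload: str) -> List[Dict[str, str]]:
    """Flatten a SPARQL JSON results document into {variable: value} rows."""
    doc = json.loads(payload)
    return [
        {name: cell["value"] for name, cell in binding.items()}
        for binding in doc["results"]["bindings"]
    ]


# Protocol concept URI used for OAI-PMH in the sample query.
OAI_PMH = "http://ring.ciard.net/taxonomy_term/108"
query = datasets_by_protocol_query(OAI_PMH)

# Hypothetical response, for illustration only.
sample = json.dumps({
    "head": {"vars": ["dataset", "title", "url"]},
    "results": {"bindings": [{
        "dataset": {"type": "uri", "value": "http://example.org/dataset/1"},
        "title": {"type": "literal", "value": "Example repository"},
        "url": {"type": "uri", "value": "http://example.org/oai"},
    }]},
})
rows = rows_from_json(sample)
```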
SPARQL how to: URIs?
All the URIs that you may need in queries are listed on the RING web site:
• A list of the URIs of all the RING entities (services/datasets, organizations, KOSs etc.)
• A list of the URIs of all RING concepts (countries, topics, regions, protocols etc.)
SPARQL how to: URIs of entities
SPARQL how to: exploit linked URIs
Example of use: AGRIS RING
1. How AGRIS uses the RING Linked Data
AGRIS (http://agris.fao.org): a database of more than 7 million bibliographic references on agricultural research and technology, with links to related data resources on the Web.
AGRIS retrieves information on AGRIS centers through a SPARQL query run against the RING.
<http://ring.ciard.net/node/10687> is the URI of the AGRIS network in the RING.
------------------------------
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dc: <http://purl.org/dc/terms/>
PREFIX dcat: <http://www.w3.org/ns/dcat#>
DESCRIBE ?dataset
WHERE {
  ?dataset rdf:type dcat:Dataset .
  ?dataset dc:partOf <http://ring.ciard.net/node/10687>
}
------------------------------
Example of use: AGRIS RING
2. How to get AGRIS Linked Data bibliographic records for each AGRIS center
In the AGRIS RDF store, all bibliographic records are associated with the corresponding AGRIS center through the dcterms:source property; the URI used to identify the AGRIS center is its RING URI.
Any application can therefore retrieve all records belonging to an AGRIS center by running a query against the AGRIS SPARQL endpoint (http://202.45.139.84:10035/catalogs/fao/repositories/agris).
------------------------------------
PREFIX dcterms: <http://purl.org/dc/terms/>
DESCRIBE ?rec
WHERE {
  ?rec dcterms:source <http://ring.ciard.net/node/2754> .
}
-----------------------------------
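Because the RING URI is the join key between the two stores, the same query can be generated for any AGRIS center. A minimal sketch, assuming dcterms is bound to the standard DCMI Terms namespace (http://purl.org/dc/terms/); the function name is hypothetical.

```python
RING_NODE_PREFIX = "http://ring.ciard.net/node/"


def agris_records_query(center_uri: str) -> str:
    """Build the DESCRIBE query for all records sourced from one center."""
    if not center_uri.startswith(RING_NODE_PREFIX):
        raise ValueError("expected a RING node URI")
    return (
        "PREFIX dcterms: <http://purl.org/dc/terms/>\n"
        "DESCRIBE ?rec\n"
        "WHERE {\n"
        f"  ?rec dcterms:source <{center_uri}> .\n"
        "}"
    )


# Center URI taken from the example on the slide.
query = agris_records_query("http://ring.ciard.net/node/2754")
```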
Interoperability assessment in the RING
The technical metadata registered in the RING for each dataset provide enough information to give a good idea of the level of “interoperability” of that dataset.
“Interoperability is a feature of datasets— and of information services that give access to datasets— whereby data can easily be retrieved, processed, re-used, and re-packaged (“operated”) by other systems. The less pre-coordination required to achieve this, the more “interoperable” the dataset.”
[from: Interim Proceedings of International Expert Consultation on “Building the CIARD Framework for Data and Information Sharing”, Beijing 20-23 June 2011. 2011.]
# | Metadata | Type | Interoperability points | Tim Berners-Lee’s stars

For the service/dataset in general
1 | Global coverage | Select list | 4 if not empty |
2 | Regional coverage (FAO) | Select list | 4 if not empty |
3 | Regional coverage (GFAR) | Select list | 4 if not empty |
4 | National coverage | Select list | 4 if not empty |
5 | Specific topic (AGROVOC) | Autocomplete multiple (authority: AGROVOC) | 8 if not empty |
6 | Type of content/data managed | Autocomplete multiple | 4 if not empty |
7 | KOSs used | Select list multiple (authority: VEST Registry) | 10 for each KOS used | 5 IF you already have 4
8 | Special instructions for getting data from this service | Text | 3 if not empty |
9 | Examples | Text multiple | 2 for each example |

For each distribution of the dataset
10 | URL / target / endpoint | Text | 30 if not empty | 1
11 | File upload | Upload | 10 if not empty | 1
12 | Access / licensing | Autocomplete | 4 if half-open; 6 if free / open; 8 if formally open (OA, CC) | 0.5 if half-open; 1 if open; 1.5 if open and known license e.g. CC
13 | License URL | Text: URL | 7 if not empty | 0.5
14 | Protocol | Select list | 10 ftp/download; 20 OAI-PMH or web service; 30 if SPARQL | 1 if ftp/download; 3 if OAI-PMH or RSS; 4 if SPARQL
15 | Format / serialization / notation | Select list (authority: subset of IANA types) | 5 Excel; 10 CSV, XML; 12 JSON; 15 RDFXML; 20 JsonLD, ntriples-n3-turtle | 2 if Excel; 3 if CSV, XML, JSON; 4 if JsonLD, RDFXML, ntriples-n3-turtle
16 | Metadata set(s) used | Select list (authority: VEST Registry) | 6 for each metadata set | 2.5
17 | Does the dataset use URIs? | Yes/No | 20 if yes; OR: multiply 15 by n. 10 | 4 (OR: 4 IF you already have 3)
18 | Does the dataset link to external URIs? | Yes/No | 20 if yes; OR: multiply 15 by n. 15 | 5 (OR: 5 IF you already have 3)
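The point values above lend themselves to a simple lookup. The sketch below covers only a subset of the per-distribution rows (URL, access, license URL, protocol, format) and assumes the points are simply added up. The way the RING actually combines them is not stated on the slide, and the dictionary keys and field names are hypothetical.

```python
from typing import Dict

# Point values copied from the table (interoperability points column).
ACCESS_POINTS = {"half-open": 4, "open": 6, "formally open": 8}
PROTOCOL_POINTS = {"ftp/download": 10, "OAI-PMH": 20,
                   "web service": 20, "SPARQL": 30}
FORMAT_POINTS = {"Excel": 5, "CSV": 10, "XML": 10, "JSON": 12,
                 "RDFXML": 15, "JsonLD": 20, "ntriples-n3-turtle": 20}


def distribution_points(dist: Dict[str, str]) -> int:
    """Sum the points for the rows covered here (assumed to be additive)."""
    points = 0
    if dist.get("url"):
        points += 30  # row 10: URL / target / endpoint not empty
    if dist.get("license_url"):
        points += 7   # row 13: license URL not empty
    points += ACCESS_POINTS.get(dist.get("access", ""), 0)      # row 12
    points += PROTOCOL_POINTS.get(dist.get("protocol", ""), 0)  # row 14
    points += FORMAT_POINTS.get(dist.get("format", ""), 0)      # row 15
    return points


# Hypothetical distribution: an open SPARQL endpoint serving RDF/XML.
example = {
    "url": "http://example.org/sparql",
    "access": "open",
    "protocol": "SPARQL",
    "format": "RDFXML",
}
total = distribution_points(example)  # 30 + 6 + 30 + 15
```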
Example of interoperability assessment in the RING
Thank you for your attention
Valeria Pesce