Nikos Manolis Agro-Know Technologies Tutorial on data aggregation and accessing datasets

Nikos Manolis, Agro-Know Technologies

Tutorial on data aggregation and accessing datasets

Slide 2 of 63

There is a lot of data

Slide 3 of 63

Need for data aggregation and harmonization

Slide 4 of 63

Objectives

This presentation aims to provide information on:

• How to use a service for aggregating datasets
• How to get already processed datasets
• How to search processed datasets with a search API
  – Educational: GLN API (21008 resources)
  – Bibliographic: ABN API (451602 resources)

Slide 5 of 63

The agDataHarvester service

• Implements the OAI-PMH protocol to harvest metadata records from open data providers
  – REST-based API
  – Harvested dataset available through HTTP
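To make the harvesting step concrete, here is a minimal sketch of the kind of OAI-PMH request a harvester issues. It shows the standard ListRecords verb only; it is not taken from agDataHarvester itself, and the function name list_records_url is a hypothetical helper. The target URL and the "mets" prefix are the ones used in the tutorial's example repository.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix=None, resumption_token=None):
    """Build an OAI-PMH ListRecords request URL (hypothetical helper).

    In OAI-PMH, resumptionToken is an exclusive argument: when it is
    present, metadataPrefix is not sent again.
    """
    if resumption_token is not None:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    elif metadata_prefix is not None:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    else:
        raise ValueError("metadata_prefix or resumption_token is required")
    return base_url + "?" + urlencode(params)


# First page of a harvest; later pages would pass the token returned by the provider.
first_page = list_records_url("http://repository.ias.ac.in/cgi/oai2", metadata_prefix="mets")
print(first_page)
```

A real harvester would fetch this URL, read the records, and repeat with each resumption token until the provider stops returning one.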

Slide 6 of 63

agDataHarvester parameters

{
  "document_type": "harvesting_target",
  "harvesting_target": {
    "name": "Repository name",
    "description": "Short Repository Description",
    "url": "OAI-PMH target URL",
    "type": "metadata format prefix",
    "frequency": hours
  }
}
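The parameter template can also be generated programmatically. The sketch below builds the same structure as a Python dictionary and serializes it to JSON. The function name harvesting_target and the check that the frequency is a positive whole number of hours are assumptions made for illustration; the service's actual validation rules are not stated in the slides.

```python
import json


def harvesting_target(name, description, url, metadata_prefix, frequency_hours):
    """Return a parameter document with the fields of the template (hypothetical helper)."""
    # Assumption: frequency is a positive whole number of hours.
    if not isinstance(frequency_hours, int) or frequency_hours <= 0:
        raise ValueError("frequency must be a positive number of hours")
    return {
        "document_type": "harvesting_target",
        "harvesting_target": {
            "name": name,
            "description": description,
            "url": url,
            "type": metadata_prefix,
            "frequency": frequency_hours,
        },
    }


doc = harvesting_target(
    name="Indian Academy of Science",
    description="Indian Academy of Science",
    url="http://repository.ias.ac.in/cgi/oai2",
    metadata_prefix="mets",
    frequency_hours=24,
)
print(json.dumps(doc, indent=2))
```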

Slide 7 of 63

param.json

{
  "document_type": "harvesting_target",
  "harvesting_target": {
    "name": "Indian Academy of Science",
    "description": "Indian Academy of Science",
    "url": "http://repository.ias.ac.in/cgi/oai2",
    "type": "mets",
    "frequency": 24
  }
}

curl -X POST [email protected] http://'demo001':[email protected]/agcouchdb

{ "ok": true, "id": "5c56a3fa18fa21d2a85fd63cc9eb78ac", "rev": "1-19ef1210376df8f1695a32b53ecb963a" }
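The submission step can be sketched in Python as well. The example below only prepares the HTTP request and does not send it. Several details are assumptions: the endpoint host is taken from the dataset URLs used elsewhere in the tutorial, "PASSWORD" is a placeholder because the real credential is not visible in the slides, and HTTP Basic authentication with a JSON content type is a guess at what the curl call does. build_submit_request is a hypothetical helper name.

```python
import base64
import json
import urllib.request

# Assumed endpoint; the host is borrowed from the tutorial's dataset URLs.
ENDPOINT = "http://agro.ipb.ac.rs/agcouchdb"


def build_submit_request(endpoint, user, password, document):
    """Prepare, but do not send, a POST carrying the parameter document."""
    body = json.dumps(document).encode("utf-8")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(endpoint, data=body, method="POST")
    request.add_header("Content-Type", "application/json")
    request.add_header("Authorization", "Basic " + token)
    return request


document = {
    "document_type": "harvesting_target",
    "harvesting_target": {
        "name": "Indian Academy of Science",
        "description": "Indian Academy of Science",
        "url": "http://repository.ias.ac.in/cgi/oai2",
        "type": "mets",
        "frequency": 24,
    },
}
request = build_submit_request(ENDPOINT, "demo001", "PASSWORD", document)
print(request.get_method(), request.full_url)
```

Sending the request would be a matter of passing it to urllib.request.urlopen and reading the JSON reply, which carries the id needed for the follow-up queries.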

Slide 8 of 63

Get details on the dataset
http://agro.ipb.ac.rs/agcouchdb/_design/datasets/_list/search/list?dataset.process_parameter_id=5c56a3fa18fa21d2a85fd63cc9eb78ac
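The link between the submission reply and this URL is the returned id. A minimal sketch, assuming the reply has the "ok" and "id" fields shown in the tutorial; details_url_from_reply is a hypothetical helper and the error handling is illustrative only.

```python
import json
from urllib.parse import urlencode

SEARCH_LIST = "http://agro.ipb.ac.rs/agcouchdb/_design/datasets/_list/search/list"


def details_url_from_reply(reply_text):
    """Turn the submission reply into the dataset details URL."""
    reply = json.loads(reply_text)
    if not reply.get("ok"):
        raise ValueError("the service did not accept the parameter document")
    # Strip stray whitespace so the id can be used as a query value.
    parameter_id = reply["id"].strip()
    return SEARCH_LIST + "?" + urlencode({"dataset.process_parameter_id": parameter_id})


reply_text = (
    '{"ok": true, "id": "5c56a3fa18fa21d2a85fd63cc9eb78ac", '
    '"rev": "1-19ef1210376df8f1695a32b53ecb963a"}'
)
print(details_url_from_reply(reply_text))
```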

Slide 9 of 63

Get details on the dataset
http://agro.ipb.ac.rs/agcouchdb/_design/datasets/_view/list_by_process?key=agdataharvester

{
  "id": "6796259b52d79e4797e210c06e6a0aee",
  "key": "6796259b52d79e4797e210c06e6a0aee",
  "value": {
    "_id": "6796259b52d79e4797e210c06e6a0aee",
    "_rev": "1-d55d7bc90d26db64dae328c9328e4e4a",
    "document_type": "harvesting_target",
    "harvesting_target": {
      "name": "WorldBank",
      "description": "The World Bank - Open Knowledge Repository",
      "url": "https://openknowledge.worldbank.org/oai/request",
      "type": "mets",
      "frequency": 24
    },
    "document_publisher": {
      "address": "83.212.96.169",
      "author": "demo001",
      "utc_datetime": "Wed Dec 11 11:58:45 2013",
      "utc_timestamp": 1386763125
    }
  }
}
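A record like this is straightforward to consume from code. The sketch below pulls out the fields a client is most likely to need and converts utc_timestamp to an ISO 8601 date. The sample row is trimmed to the fields used, and summarize_row is a hypothetical helper rather than part of the service.

```python
import json
from datetime import datetime, timezone


def summarize_row(row):
    """Flatten one result row into the fields of interest."""
    target = row["value"]["harvesting_target"]
    publisher = row["value"]["document_publisher"]
    published = datetime.fromtimestamp(publisher["utc_timestamp"], tz=timezone.utc)
    return {
        "id": row["id"],
        "name": target["name"],
        "url": target["url"],
        "format": target["type"],
        "every_hours": target["frequency"],
        "author": publisher["author"],
        "published": published.isoformat(),
    }


row = json.loads("""
{
  "id": "6796259b52d79e4797e210c06e6a0aee",
  "value": {
    "harvesting_target": {
      "name": "WorldBank",
      "url": "https://openknowledge.worldbank.org/oai/request",
      "type": "mets",
      "frequency": 24
    },
    "document_publisher": {"author": "demo001", "utc_timestamp": 1386763125}
  }
}
""")
print(summarize_row(row))
```

The converted timestamp agrees with the utc_datetime string in the record (11 December 2013, 11:58:45 UTC).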

Slide 10 of 63

The agWorkflow service

http://agro.ipb.ac.rs/agcouchdb/_design/datasets/_list/search/list?dataset.process=agworkflow&dataset.type=oai_lom&dataset.accuracy=true

I want all datasets with educational resources processed by the agINFRA-powered aggregation workflow!

http://agro.ipb.ac.rs/agcouchdb/_design/datasets/_list/search/list?dataset.process=agworkflow&dataset.type=oai_agris&dataset.accuracy=true

I want all datasets with bibliographic resources processed by the agINFRA-powered aggregation workflow!
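Both queries follow the same pattern: a fixed list endpoint plus "dataset."-prefixed filters. A minimal sketch of a builder for such URLs follows. The helper name dataset_search_url is hypothetical, and only the three filters shown in the tutorial (process, type, accuracy) are known to be supported.

```python
from urllib.parse import urlencode

SEARCH_LIST = "http://agro.ipb.ac.rs/agcouchdb/_design/datasets/_list/search/list"


def dataset_search_url(**filters):
    """Build a dataset search URL; each keyword becomes a 'dataset.<name>' filter."""
    params = {"dataset." + name: value for name, value in filters.items()}
    return SEARCH_LIST + "?" + urlencode(params)


# Educational resources (oai_lom) and bibliographic resources (oai_agris).
educational = dataset_search_url(process="agworkflow", type="oai_lom", accuracy="true")
bibliographic = dataset_search_url(process="agworkflow", type="oai_agris", accuracy="true")
print(educational)
print(bibliographic)
```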

Slide 11 of 63

Is there a way to search the available datasets?

Slide 12 of 63

Search API

• REST-based queries over harmonized information (result of metadata processing)

• Two data models supported
  – akif: describing educational resources for agriculture, http://54.228.180.124:8080/search-api/v1/akif/?q=*
  – agrif: describing bibliographic resources for agriculture (mainly from FAO’s data), http://212.189.145.241:8080/search-api/v1/agrif/?q=*

Slide 13 of 63

Search options

• Simple search
  http://BASE_URL/search-api/v1/akif/?q=tomato

• Searching within specific fields
  http://BASE_URL/search-api/v1/akif/?languageBlocks.en.description=tomato

• Temporal
  http://BASE_URL/search-api/v1/akif/?creationDate=2013-04-16

• Fetching specific items
  http://BASE_URL/search-api/v1/akif/COLLECTION/20296
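The four search options above can be sketched as two small URL builders. BASE_URL is kept as the same placeholder the slides use, and search_url and item_url are hypothetical helper names. The asterisk is left unescaped so that a wildcard query reads q=*, as in the tutorial's examples.

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://BASE_URL"  # placeholder, as in the slides


def search_url(model, params, base_url=BASE_URL):
    """Build a query URL for a data model such as 'akif' or 'agrif'."""
    return f"{base_url}/search-api/v1/{model}/?" + urlencode(params, safe="*")


def item_url(model, collection, item_id, base_url=BASE_URL):
    """Build the URL of one specific item inside a collection."""
    return f"{base_url}/search-api/v1/{model}/{quote(str(collection))}/{item_id}"


print(search_url("akif", {"q": "tomato"}))                              # simple search
print(search_url("akif", {"languageBlocks.en.description": "tomato"}))  # field search
print(search_url("akif", {"creationDate": "2013-04-16"}))               # temporal
print(item_url("akif", "COLLECTION", 20296))                            # specific item
```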

Slide 14 of 63

Managing results

• Sorting results
  e.g. ?q=*&sort_by=creationDate&sort_order=desc

• Facets
  e.g. ?facets=set&facet_size=3

• Pagination
  e.g. ?q=sea&page_size=25&page=3
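The sorting, facet, and pagination options combine into a single query string. A minimal sketch, using only the parameter names that appear in the examples above; result_query is a hypothetical helper, and whether the API accepts every combination is not stated in the slides.

```python
from urllib.parse import urlencode


def result_query(q=None, sort_by=None, sort_order=None,
                 facets=None, facet_size=None, page_size=None, page=None):
    """Assemble a query string from whichever result options are given."""
    options = {
        "q": q,
        "sort_by": sort_by,
        "sort_order": sort_order,
        "facets": facets,
        "facet_size": facet_size,
        "page_size": page_size,
        "page": page,
    }
    # Drop unset options; keep '*' readable for wildcard queries.
    present = {name: value for name, value in options.items() if value is not None}
    return "?" + urlencode(present, safe="*")


print(result_query(q="*", sort_by="creationDate", sort_order="desc"))
print(result_query(facets="set", facet_size=3))
print(result_query(q="sea", page_size=25, page=3))
```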

Full documentation: 54.228.180.124:8080/search-api/

Nikos Manolis
Agro-Know [email protected]