Semantics, technology and linked data in open access repositories on agriculture and related...


DESCRIPTION

This presentation provides a practical overview of current practices in creating vocabularies and linked data in agriculture and related sciences, and of authority control practices for bibliographic data. It closes with the survey carried out by FAO from December 2009 to January 2010 on the state of the art in the use of semantics and technology in open access document repositories in agriculture and related sciences.


Semantics, Technology and Linked Data in Open Access Repositories on Agriculture and Related Sciences

imma.subirats@fao.org
sarah.dister@fao.org

IT-Enhanced Organic, Agro-Ecological and Environmental Education
September 16-17, 2010
Budapest (Hungary)

About ourselves…

Imma Subirats & Sarah Dister

Information & knowledge management specialists at FAO

Actively involved in the promotion of open access in agriculture and related sciences, assuring the quality of repository content through implementing metadata standards, thesauri, and other forms of authority control

…about FAO of the UN

It is the specialized agency of the United Nations that leads international efforts to defeat hunger

acts as a neutral forum where all nations meet as equals to negotiate agreements and debate policy

is also a source of knowledge and information.

Semantics & Technology in Open Access Document Repositories

Short introduction about…

What linked data is and its benefits

What the authority control content model means and its benefits for the open access repositories in the agricultural domain

Overview of the current situation of the use of technology and semantics in open access repositories in agriculture

What can we say about Linked Data?

What is linked data?

Data which contains URIs as identifiers for the concepts described in the data, and URIs to identify the relationships between those concepts

A richer linking mechanism for the web that takes us from hypertext links (document to document) to hyperdata links (across things that documents are about)…

A term coined by Tim Berners-Lee

So?

TALIS, 2009

Linked Data Principles

• Use of URIs as names for things
• Use of HTTP URIs
• Provide useful information in RDF
• Including RDF links to other URIs

What is RDF?

Resource Description Framework
• RDF is the data format for linked data
• Describes relationships between things
• RDF uses URIs to name things, preferably HTTP

http://www.w3.org/RDF/
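To make the idea of resources, relations and literals concrete, here is a minimal sketch of RDF statements held as (subject, predicate, object) triples and written out in N-Triples syntax. The FOAF and Dublin Core namespaces are the published ones; every `example.org` URI, the document title and the author name are hypothetical values chosen only for illustration, and the serializer is a simplification that handles plain string literals only.

```python
# Minimal sketch: RDF statements as (subject, predicate, object) triples.
# FOAF and Dublin Core namespaces are real; example.org data is hypothetical.

FOAF = "http://xmlns.com/foaf/0.1/"
DCT = "http://purl.org/dc/terms/"
EX = "http://example.org/"  # hypothetical namespace for illustration


class Literal(str):
    """Marks a plain string value, as opposed to a URI naming a resource."""


triples = [
    # resource --relation--> literal
    (EX + "doc/1", DCT + "title", Literal("Linked data in agriculture")),
    # resource --relation--> resource (an RDF link to another URI)
    (EX + "doc/1", DCT + "creator", EX + "person/1"),
    (EX + "person/1", FOAF + "name", Literal("Example Author")),
]


def to_ntriples(statements):
    """Serialize triples as N-Triples: one '<s> <p> o .' statement per line."""
    lines = []
    for s, p, o in statements:
        if isinstance(o, Literal):
            escaped = o.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        else:
            obj = f"<{o}>"
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


print(to_ntriples(triples))
```

Because the creator is named by a URI rather than by a string, any other dataset can link to the same person, which is the step from hypertext links to hyperdata links.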

Graphically

TALIS, 2009

Relations

Literals

Resources

RDF

TALIS, 2009

What data?

• People
• Documents
• Photographs
• Places
• Journals
• Corporate bodies (Institutions)
• Conferences
• Etc...

What vocabularies?

• FOAF
• Dublin Core
• BIBO
• SKOS
• Etc...

Examples in Agriculture

AIMS

Not much yet

http://linkeddata.org/data-sets

AGROVOC

What is AGROVOC?
Multilingual structured thesaurus for all subject fields in agriculture, forestry, fisheries, food and related domains

What is its purpose?
Standardize the indexing process in order to make searching simpler and more efficient and to guide the user to the most relevant sources

Who uses AGROVOC?
Downloaded on average 1000 times per year, and individuals in about ninety countries regularly access AGROVOC online

More about AGROVOC

• It is a concept/term based system
• Around 30,000 concepts
• 600,000 labels in around 20 languages
• A knowledge base of related concepts organized in relationships (hierarchical, associative, equivalence)
• One-stop shop for terminological knowledge related to agriculture in general
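As a rough illustration of what a concept-based thesaurus with multilingual labels and typed relationships looks like as data, the sketch below models a handful of concepts with hierarchical (`broader`) and associative (`related`) links. The identifiers, labels and links are hypothetical stand-ins, not actual AGROVOC records, and the structure is an assumption in the spirit of SKOS rather than the AGROVOC data model itself.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Concept:
    concept_id: str
    labels: dict                                 # language code -> label
    broader: list = field(default_factory=list)  # hierarchical links (parents)
    related: list = field(default_factory=list)  # associative links


# Hypothetical identifiers and labels; not taken from AGROVOC itself.
THESAURUS = {
    c.concept_id: c
    for c in [
        Concept("ex:cereals", {"en": "cereals", "fr": "céréales", "es": "cereales"}),
        Concept("ex:maize", {"en": "maize", "fr": "maïs", "es": "maíz"},
                broader=["ex:cereals"], related=["ex:maize_flour"]),
        Concept("ex:maize_flour", {"en": "maize flour"}, related=["ex:maize"]),
        Concept("ex:sweet_corn", {"en": "sweet corn"}, broader=["ex:maize"]),
    ]
}


def label(concept_id, lang, fallback="en"):
    """Return the label in the requested language, else the fallback language."""
    labels = THESAURUS[concept_id].labels
    return labels.get(lang, labels[fallback])


def ancestors(concept_id):
    """Follow 'broader' links upward and return ancestors, nearest first."""
    found = []
    queue = deque(THESAURUS[concept_id].broader)
    while queue:
        current = queue.popleft()
        if current not in found:
            found.append(current)
            queue.extend(THESAURUS[current].broader)
    return found


print([label(c, "fr") for c in ancestors("ex:sweet_corn")])
```

Indexing a document with the narrowest concept and expanding a query through `ancestors` is one way the hierarchy can guide a user from a specific term to broader, relevant sources.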

AGROVOC as linked data

AIMS

Concept Based Authority Control System for bibliographic data

Authority Control for Bibliographic Data

Context: library information systems

Used for: access points to bibliographic records
Corporate bodies, Conferences, Projects, Journal titles…

Definition: Technique/process of assigning a unique form of name and the use of cross-references from obsolete and related forms

Scope: To bring all the works of a bibliographical entity together in one place by selecting a single form of name

Benefits

FAO

Food and Agriculture Organization

Example

Food and Agriculture Organization of the United Nations

Benefits
• Efficient system searching
• Exhaustive search results

It improves access dramatically by providing consistency in the forms used to identify corporate authors, conferences, place names, subjects, etc.

FAO Documents

Food and agriculture Organisations of the United Nations

Search

FAO Authority Control System

Why

FAO OA Repository project → 170,000 records of legacy data managed by a flat (no cross-references) authority control system

↓

Features of new Authority Control System
• Concept based
• Multilingual
• URIs

Example

AUTHORIZED TERMS
English: Food and Agriculture Organization of the United Nations
French: Organisation des Nations Unies pour l'alimentation et l'agriculture
Spanish: Organización de las Naciones Unidas para la Agricultura y la Alimentación
Arabic: منظمة الأغذية والزراعة للأمم المتحدة
Russian: Продовольственная и сельскохозяйственная организация Объединенных Наций
Chinese ....

ALTERNATIVE TERMS
Incomplete form: Food and Agriculture Organization
Acronym: FAO
Dutch form: Voedsel en landbouw Organisatie

C-C RELATIONSHIPS
Is spatially located in: Italy
Has parts: Office of Knowledge Exchange, Research and Extension

Methodology

The Authority Control Content Model

It is based on a concept-based system

A concept is represented by all the forms, preferred and non-preferred, in all languages, associated with it

A form is a word (simple term) or a multiword expression (complex term) that designates a particular concept
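The content model can be sketched as a small data structure: one concept, identified by a URI, carrying every preferred and non-preferred form in every language, plus an index that resolves any form back to the concept. The terms below are the FAO example from this presentation; the URI, the class and function names, the language tags assigned to the alternative forms, and the normalization rule are assumptions made for illustration, not the FAO system's actual design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Form:
    text: str
    lang: str        # language tag; "und" (undetermined) used for the acronym
    preferred: bool  # True for authorized terms, False for alternative terms


@dataclass
class AuthorityConcept:
    uri: str
    forms: list


def normalize(text):
    """Case-fold and collapse whitespace so variant spellings compare equal."""
    return " ".join(text.casefold().split())


def build_index(concepts):
    """Map every form, preferred or not, to the concept it designates."""
    return {normalize(f.text): c for c in concepts for f in c.forms}


def authorized_form(concept, lang):
    """Return the preferred form in the given language, or None if absent."""
    for f in concept.forms:
        if f.preferred and f.lang == lang:
            return f.text
    return None


fao = AuthorityConcept(
    uri="http://example.org/authority/corporate/1",  # hypothetical URI
    forms=[
        Form("Food and Agriculture Organization of the United Nations", "en", True),
        Form("Organisation des Nations Unies pour l'alimentation et l'agriculture", "fr", True),
        Form("Food and Agriculture Organization", "en", False),
        Form("FAO", "und", False),
        Form("Voedsel en landbouw Organisatie", "nl", False),
    ],
)

index = build_index([fao])
concept = index[normalize("fao")]
print(concept.uri, "->", authorized_form(concept, "fr"))
```

Because records point at the concept URI rather than at one spelling, a search on the acronym, the Dutch form or the incomplete form reaches the same set of documents, and the display form can follow the user's language.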

Content

http://202.73.13.50:54123/agrovocdevv10/

Conclusions

• Arbitrary
• Politically sensitive
• Expensive

But properly implemented, authority control provides…

• Sharing
• Standardization
• Simplification
• Consistency
• Reliability

Do you have any questions so far?

What can we say about the current situation of open access document repositories in the agricultural domain?

OA Document Repository

Definition
A digital archive to collect, preserve and disseminate scientific information in digital form

Benefits
• Immediate, universal and free access to available information
• Increased visibility, usage and impact of the work of researchers/institutions

Importance
Making knowledge accessible → vital to (agricultural) development

Survey

Why
• Obtain a better understanding of the current situation
• Identify trends and issues that need attention

How
• 30 questions divided into thematic groups
• Web-based survey on CIARD Ring
• Mail sent to 150 institutions and 9 specialized mailing lists

General

Data collection: 82 repositories completed the survey

Type of institution: majority universities; minority governmental, international and non-governmental organizations

Year of foundation: founded between 1993 and 2009
• 1993 – 2000: 1/2 repositories a year
• From 2001: substantial increase in growth ↕ promotion of OA

OAI-PMH

Open Archives Initiative Protocol for Metadata Harvesting

Purpose: To improve interoperability of digital repositories by exposing and harvesting metadata
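Harvesting in OAI-PMH comes down to HTTP requests carrying a `verb` and a few arguments. The sketch below only builds such request URLs and does not contact any server. `ListRecords`, `metadataPrefix`, `set`, `from`, `resumptionToken` and the `oai_dc` (unqualified Dublin Core) prefix come from the protocol; the base URL, the function names and the token value are hypothetical.

```python
from urllib.parse import urlencode

BASE_URL = "http://repository.example.org/oai"  # hypothetical repository endpoint


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None, from_date=None):
    """Build a ListRecords request; oai_dc is the Dublin Core metadata format."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec    # restrict harvesting to one set
    if from_date:
        params["from"] = from_date  # selective harvesting by datestamp
    return f"{base_url}?{urlencode(params)}"


def resume_url(base_url, token):
    """Continue a partial list; resumptionToken is sent with the verb only."""
    return f"{base_url}?{urlencode({'verb': 'ListRecords', 'resumptionToken': token})}"


print(list_records_url(BASE_URL, from_date="2010-01-01"))
print(resume_url(BASE_URL, "token-123"))  # hypothetical token value
```

A repository that cannot answer the first request with Dublin Core records cannot be harvested by generic service providers, which is why the metadata export format matters for interoperability.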

• 45% do not use DC as metadata set to export data → 55% is not OAI-PMH compliant
• 70% not interested in improving metadata

↓

Promotion of OAI-PMH

Authority Control

Bibliographical concepts
• 62% make no use of authority control
• When used, especially for journal titles
• 50% would be interested in applying an authority control system

↓

Promotion

Software

AIMS

Software

Comparing with other repositories

Summary

• OAI-PMH – interoperability
• Authority control – accessibility
• Software – standardization

CIARD Ring

Data collected added to repository profiles on CIARD Ring

Thank you for your attention

imma.subirats@fao.org

sarah.dister@fao.org

