The AGRICULTURAL ONTOLOGY SERVICE. A Comprehensive Framework for Building Multilingual Domain Ontologies: Creating an ontology on Food Safety, Animal and Plant Health (OFsAPH)

Johannes Keizer

Food and Agriculture Organization of the UN, Library and Documentation Systems Division

Semantic Standards for the Web, 28-10-2002

A Comprehensive Framework for Building Multilingual Domain Ontologies: Creating a Prototype Biosecurity Ontology

AFITA 2002, Beijing

The AGRICULTURAL ONTOLOGY SERVICE

A Comprehensive Framework for Building Multilingual Domain Ontologies: Creating an ontology on Food Safety, Animal and Plant Health (OFsAPH)

Johannes Keizer, Information Systems Officer, Food and Agriculture Organization of the UN

AFITA 2002, Beijing, 28th October 2002

Team: Boris Lauser, Allison Poullos, Tanja Wildemann, Frehiwot Fisseha

Slide 2

FAO's mandate

• Halving the number of hungry people by the year 2015 (World Food Summit 1996).

• WAICENT (World Agricultural Information Center) is FAO's approach to fighting hunger with information

• FAO itself produces a huge amount of content in its subject area

• It is also within FAO's mandate to make available useful information from other information providers

• FAO collaborates in information networks

Slide 3

Introduction

It has become a truism that finding relevant information on the web is difficult

Slide 4

The Search Problem

How to evaluate search results?

\[ \text{Recall} = \frac{\text{Number of relevant documents identified}}{\text{Number of relevant documents in the collection}} \]

\[ \text{Precision} = \frac{\text{Number of relevant documents identified}}{\text{Total number of documents identified}} \]

Both parameters rank low today!
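
The two measures on this slide can be computed directly from sets of document identifiers. The following is a minimal sketch; the function name and the document ids are illustrative assumptions, not part of the original presentation.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for retrieved and relevant document ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # relevant documents identified
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: 4 documents returned, 3 of them relevant,
# out of 6 relevant documents in the whole collection.
p, r = precision_recall(
    {"d1", "d2", "d3", "d9"},
    {"d1", "d2", "d3", "d4", "d5", "d6"},
)
print(p, r)  # 0.75 0.5
```

A system that returns everything reaches a recall of 1 at very low precision, which is why both values have to be read together.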

Slide 5

State of Search Systems

• Full-text search engines based on statistical text analysis are imprecise by nature

• New systems based only on "machine intelligence" do not show very promising results

• Recognition of meaning (semantic analysis) by machines is only possible by using knowledge organization systems
– Agreed metadata schemas
– Controlled vocabularies
– Machine-readable encoding

Slide 6

Knowledge Organization Systems: Vocabularies

AGROVOC

NAL Thesaurus

CABI Thesaurus

Dedicated KOSs

Non-dedicated KOSs

e.g., ASFA thesaurus

e.g., the Multilingual Forestry Thesaurus

e.g., the Sustainable Development website classification

e.g., biological taxonomies such as NCBI and ITIS

GEMET

Other thematic thesauri

Existing Thesauri and Knowledge Organization Systems (KOSs):

• Common concepts are not declared
• No or very limited interoperability
• Insufficient subject and language coverage
• Severe maintenance problems
• Very limited machine readability
• Only very simple encoding of semantic relations

Slide 7

Ontologies?

• An ontology is a formal knowledge organization system

• It contains concepts (and instances)
– A formal description of the application knowledge
– Definitions of concepts and instances
– Relations between concepts and instances
– The possibility of machine processing

• Nearly everyone tries to build (implicit) ontologies
– Directory structures, navigation trees
– Humans can overcome bad organization by intuition
– Machines have no intuition; machines need formal information

Slide 8

What benefits do we expect from Ontologies?

• Semantic organization of websites
– Knowledge maps
– Guided discovery of knowledge
– Easy retrieval of information without using complicated Boolean logic

• Text processing by machines
– Text mining on the Web (meaning-oriented access)
– Automatic indexing and text annotation tools
– Full-text search engines that create meaningful classifications (semantic clustering), e.g. FAO-Schwartz is not related to FAO

• Intelligent search of the Web
– Building dynamic catalogues from machine-readable metadata

• Natural language processing
– Better machine translation
– Queries using natural language

Slide 9

The Example: International Portal on Food Safety, Animal and Plant Health

• Goal: To create an ontology, i.e. an explicit, formal specification of a shared conceptualization of a domain of interest

Slide 10

Ontology: conceptual model

[Diagram: a Concept with a label, synonyms, a stem and a description; a relationship links it to another Concept.]
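
As an illustration only, the elements named in the conceptual model (concept, label, synonym, stem, description, relationship) could be represented with two small data classes. The class and field names are assumptions chosen to mirror the diagram labels, and the example relationship is taken from the usage scenario later in the deck.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    label: str
    synonyms: list = field(default_factory=list)  # alternative terms
    stem: str = ""          # stemmed form, left empty here
    description: str = ""   # free-text definition


@dataclass
class Relationship:
    source: Concept
    name: str               # e.g. one of the named relationships of the ontology
    target: Concept


risk_assessment = Concept(label="risk assessment")
risk_analysis = Concept(label="risk analysis")
link = Relationship(risk_assessment, "is a step in the process", risk_analysis)
print(f"{link.source.label} --{link.name}--> {link.target.label}")
```

Keeping relationships as separate named objects, rather than as attributes of a concept, makes it easy to add new relationship types without changing the concept structure.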

Slide 11

Ontology: RDFS model, machine readable encoding
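
The RDFS model shown on this slide is not recoverable from the text. As a hedged sketch of what a machine-readable encoding can look like, the snippet below writes one concept with multilingual labels as N-Triples using the standard RDF and RDFS vocabulary. The namespace `http://example.org/ofsaph#`, the local name `FoodSafety` and the label texts are hypothetical and not taken from the presentation.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
BASE = "http://example.org/ofsaph#"  # hypothetical namespace, not from the slides


def literal(text, lang):
    """Format a language-tagged literal, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@{lang}'


def concept_triples(local_name, labels):
    """Return N-Triples lines declaring a class with one label per language."""
    subject = f"<{BASE}{local_name}>"
    lines = [f"{subject} <{RDF_TYPE}> <{RDFS}Class> ."]
    for lang, text in sorted(labels.items()):
        lines.append(f"{subject} <{RDFS}label> {literal(text, lang)} .")
    return lines


# Illustrative labels only.
triples = concept_triples(
    "FoodSafety",
    {"en": "food safety", "fr": "sécurité sanitaire des aliments"},
)
print("\n".join(triples))
```

Language-tagged labels are one straightforward way to carry the multilingual aspect of the ontology in the encoding itself.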

Slide 12

Processes to create a Domain Ontology

• Ontology acquisition (2 paths)
– Creating core ontology from scratch
– Automatic extraction of ontological knowledge from base vocabulary and domain-specific text sources

• Merging into one ontology

• Refinement and Extension

• Evaluation and Assessment

Slide 13

Creation of the core ontology

Information resources:
• Brainstorming
• Codex Alimentarius
• SPS Agreement

3 subject specialists

Ontology Editor (SOEP)

Core Ontology: 67 concepts, 91 relationships

Slide 14

1st Acquisition Approach: Focused Crawling

Core Ontology: 67 concepts, 91 relationships

Focused Web Crawling

List of 257 food safety domain web pages

Grouping into main sites

List of extracted main sites:
• http://www.foodsafety.gov/ Gateway to Government Food Safety Information
• http://vm.cfsan.fda.gov/ Center for Food Safety & Applied Nutrition
• http://www.inspection.gc.ca/ Canadian Food Inspection Agency
• http://www.extension.iastate.edu/foodsafety/ Iowa State University - Food Safety Project
• http://www.foodsafety.iastate.edu Iowa State University - Food Safety Consortium
• http://www.fsis.usda.gov/ United States Department of Agriculture, Food Safety and Inspection Service
• http://www.nal.usda.gov/foodborne/index.html Foodborne Illness Education Information Center
• http://www.euro.who.int/foodsafety World Health Organization – Regional Office for Europe Food Safety Programme

Slide 15

Selection of Documents

• Domain Set: Manual selection
– 11 documents
  • Codex Alimentarius: Description, Code of Ethics, Food Hygiene, Food Import and Export
  • Report of consultation on risk assessment of microbiological hazards in foods
  • Ensuring food quality and safety, Protecting food quality and safety

• Domain Set: Focused Crawler Output
– 5 documents extracted:
  • http://vm.cfsan.fda.gov/
  • http://www.inspection.gc.ca/
  • http://www.foodsafety.iastate.edu
  • http://www.extension.iastate.edu/foodsafety/
  • http://www.euro.who.int/foodsafety

• Generic documents: Manual selection
– 8 documents
  • www.nytimes.com
  • Several documents of the animal feed domain

Slide 16

2nd Acquisition Approach: Thesaurus Pruning

Inputs:
• Food safety documents
• Generic documents
• AGROVOC: 27365 keywords (thesaurus entries of the form: Rice BT … NT … RT … RT … RT … …)

Automatic pruning, 5 evaluation runs

Results:
• 1632 frequent terms
• Extracted ontological structure: 504 concepts, taxonomic depth 5
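
The slide does not state the pruning criterion. One plausible reading, sketched below purely as an assumption, is to keep thesaurus terms that occur clearly more often in the domain documents than in the generic documents, and to keep their broader terms (BT) so the hierarchy stays connected. The function names, the `ratio` threshold and the toy thesaurus are all hypothetical.

```python
import re
from collections import Counter


def term_frequencies(documents, terms):
    """Relative frequency of each thesaurus term across a list of texts."""
    counts = Counter()
    total_words = 0
    for text in documents:
        lowered = text.lower()
        total_words += len(re.findall(r"\w+", lowered))
        for term in terms:
            pattern = r"\b" + re.escape(term.lower()) + r"\b"
            counts[term] += len(re.findall(pattern, lowered))
    return {t: (counts[t] / total_words if total_words else 0.0) for t in terms}


def prune(broader, domain_docs, generic_docs, ratio=2.0):
    """Keep terms over-represented in domain texts, plus their broader terms."""
    terms = set(broader) | {b for b in broader.values() if b}
    dom = term_frequencies(domain_docs, terms)
    gen = term_frequencies(generic_docs, terms)
    kept = {t for t in terms if dom[t] > 0 and dom[t] >= ratio * gen[t]}
    for term in list(kept):
        parent = broader.get(term)
        while parent and parent not in kept:  # walk up the BT chain
            kept.add(parent)
            parent = broader.get(parent)
    return kept


# Toy thesaurus: term -> broader term (None for top terms).
broader = {
    "rice": "cereals",
    "cereals": "plant products",
    "plant products": None,
    "tractors": "machinery",
    "machinery": None,
}
domain_docs = ["Rice contamination was found in imported rice."]
generic_docs = ["Tractors and machinery were sold. Rice was cheap."]
print(sorted(prune(broader, domain_docs, generic_docs)))
```

Comparing against generic documents is what filters out terms that are frequent everywhere; the threshold would need tuning over several runs on real data.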

Slide 17

Merging of Ontologies and Refinement

Inputs to the assembly step:
• Core Ontology: 67 concepts, 91 relationships
• 1632 terms from the pruning process: 12 new concepts extracted
• Ontological structure extracted from AGROVOC: 23 new concepts with hierarchical relationships extracted

Assembly step: 92 new relationships created

Food Safety Ontology Prototype: 102 concepts, 183 relationships

Slide 18

Final Prototype

Core Ontology: 67 concepts, 91 relationships, 1.36 relationships per concept

Food Safety Ontology Prototype: 102 concepts, 183 relationships, 1.79 relationships per concept
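
The figures reported for the core ontology and the prototype are consistent with each other, as this short check shows. The numbers are taken from the merging and final prototype slides; the variable names are illustrative.

```python
core = {"concepts": 67, "relationships": 91}
new_concepts = 12 + 23    # from the frequent terms + from the AGROVOC structure
new_relationships = 92    # created in the assembly step

prototype = {
    "concepts": core["concepts"] + new_concepts,
    "relationships": core["relationships"] + new_relationships,
}


def density(ontology):
    """Relationships per concept, rounded to two decimals."""
    return round(ontology["relationships"] / ontology["concepts"], 2)


print(prototype)           # {'concepts': 102, 'relationships': 183}
print(density(core))       # 1.36
print(density(prototype))  # 1.79
```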

Slide 19

102 Concepts

Agreement of Agriculture; ALOP; ALOP, Codex; ALOP, OIE; ALR; animal byproducts; animal diseases; animal fats; animal feed additives; animal feed contaminants; animal feed ingredients; animal feeding; animal health; animal processing; animal products; animal waste; animals; antibiotics; Bacteria; bakery products; biological agent; CAC; Cartagena protocol; CCFH; cereal products; cheese

chemical agent; Codex Committees; commodities; Consumer health; diseases; eggs; exposure assessment; fabrication; FAO; fishes; food; food additives; food consumption; food contaminants; food export; food import; food ingredients; food safety; food-borne diseases; fungi; good hygienic practices; hazard; hazard characterization; hazard identification; human health; human nutrition

humans; international agreements; international food trade; international governmental organizations; IPPC; labelling; meat; microorganisms; microorganisms byproducts; microorganisms processing; microorganisms products; microorganisms waste; milk; milk products; milk products; non-pathogens; OIE; packaging; parasites; pathogens; physical agent; plant byproducts; plant diseases; plant feed additives; plant feed contaminants

plant feed ingredients; plant feeding; plant health; plant processing; plant products; plant waste; plants; processed animal products; processed plant products; processed products; processing; risk analysis; risk assessment; risk characterization; risk communication; risk management; slaughter; SPS agreement; standards; sugar; TBT agreement; transport; viruses; WHO; WTO

Slide 20

29 Unique Relationships

adopts; adversely affect; are included in; are produced by; are the source for; can be used as; constitutes; describes; determines; ensures; establishes; govern; has economical impact on; implies; includes

influences; interacts with; is a consequence of; is a step in the process; is comprised of; is established by; is protected by; originate from; refer to; requires; rule; sustains; trades; uses

Slide 21

Current project status. Ontology creation: 2nd application of framework

[Diagram labels: Food Safety Ontology Prototype (102 concepts, 183 relationships); ~100 domain-specific documents; Text To Onto; AGROVOC; Revised Ontology Pruner; list of frequent terms; Pruned AGROVOC (~3000 concepts); 1st acquisition approach; 2nd acquisition approach; Merging & Refinement; Ontology Editor (OIModeler)]

Slide 22

Usage Scenario

Biosecurity Portal: Ontology Enabled Search Application

Search: Risk assessment

Ontology-based search extension

[Diagram: ontology fragment showing Risk characterization, Hazard characterization, Hazard identification and Exposure assessment around Risk assessment, and Risk assessment, Risk management and Risk communication around Risk analysis, with edges labelled "is a step in the process" and "interacts with".]

Extended Search

Mark the terms below, which you might want to include in your search:

Search: Risk assessment, Risk characterization, Risk analysis

Ontology and doc base → Search results
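
The search extension in this scenario can be sketched as a lookup of neighbouring concepts followed by an OR query over the terms the user marks. The edge list below is an assumed reading of the diagram, and the function names and the quoted OR syntax are illustrative, not the portal's actual implementation.

```python
# Assumed edges, read off the diagram: (source, relationship, target).
STEP = "is a step in the process"
EDGES = [
    ("hazard identification", STEP, "risk assessment"),
    ("hazard characterization", STEP, "risk assessment"),
    ("exposure assessment", STEP, "risk assessment"),
    ("risk characterization", STEP, "risk assessment"),
    ("risk assessment", STEP, "risk analysis"),
    ("risk management", STEP, "risk analysis"),
    ("risk communication", STEP, "risk analysis"),
]


def related_terms(term, edges=EDGES):
    """Concepts one relationship away from `term`, in either direction."""
    found = []
    for source, _name, target in edges:
        if source == term and target not in found:
            found.append(target)
        elif target == term and source not in found:
            found.append(source)
    return found


def expand_query(term, selected):
    """OR the original term with the suggested terms the user marked."""
    allowed = set(related_terms(term))
    chosen = [t for t in selected if t in allowed]
    return " OR ".join(f'"{t}"' for t in [term] + chosen)


suggestions = related_terms("risk assessment")
query = expand_query("risk assessment", ["risk characterization", "risk analysis"])
print(suggestions)
print(query)
```

Letting the user mark which suggestions to include, rather than expanding automatically, keeps precision under the user's control while still improving recall.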

Slide 23

Current project status. Application scenario: 2 use cases

Use Case 1: Indexing the subject of a document

Use Case 2: Searching information on the portal

[Diagram labels: OFsAPH; document fields Title and Subject (Risk; …); Search field (Risk; …)]

Slide 24

Current project status. Application: Ontology Browser for the Ontology on Food Safety, Animal and Plant Health

Slide 25

The Project for an Agricultural Ontology Service

• Only agreed semantic standards guarantee knowledge discovery between different applications

• The definition of knowledge organization systems is resource intensive

• Therefore FAO started initiatives to bring interested partners together
– October 2000: launch of the AGStandards initiative to agree on metadata standards
– July 2001: concept paper on the Agricultural Ontology Service

Slide 26

What does Agricultural Ontology Service mean?

The Agricultural Ontology Service is an approach to organizing knowledge organization systems that is:

• International: the Internet must become plurilingual

• Multidisciplinary: the range of subjects is broad and needs various inputs

• Cooperative: different expert knowledge has to be brought together and used

• Distributed: no central ownership should be sought

• Coordinated: coordination must ensure reusability and standardization

Slide 27

AOS: Iterative Knowledge Registration

Agricultural Ontology Service (AOS): federated storage and description facility for components (terms, definitions, relationships)

The registration cycle:
• A KOS uses components (terms, definitions, relationships) to build an application
• Users search and browse the application using the components
• User feedback
• Discussions and choices for amendments to components

Slide 28

Activities up to now

• The first workshop took place in Rome, November 2001

• A launch group was established with participation of
– Content providers (FAO, CABI)
– Solution providers in the agricultural area (ATO-Wageningen, University of Florida)
– Ontology development groups (AIFB Karlsruhe, CNR Italy)

• Two further workshops were organized in January and May 2002

• Ontology prototypes are under development

Slide 29

AOS – a “business model”

• A consortium of information providers

• A clearinghouse for semantic standards in the relevant subject areas

• One-stop access to agreed standards (ontologies, metadata schemas, vocabularies…)

• Participation as a consortium in semantic web activities to get funding for specific projects ("Semkos" for EU 6th framework)

• Organization of seminars and workshops to further develop and promote the use of semantic standards

Slide 30

Further Information

http://www.fao.org/agris/AOS

http://www.fao.org/agris/AGMES