
FP7, Information Day Call 5, Luxembourg, May 11-12, 2009

KYOTO (ICT-211423)

Yielding Ontologies for Transition-Based Organization

FP7: Intelligent Content and Semantics

http://www.kyoto-project.eu/

Piek Vossen, VU University Amsterdam


Project goals

• Open platform for knowledge sharing across languages and cultures
  – Wiki environment that allows people in the field to maintain their knowledge and agree on meaning without knowledge engineering skills
  – Bootstrap this knowledge through open text mining & concept learning
  – Enables knowledge transition and information search across different target groups, transgressing linguistic, cultural and geographic boundaries
  – Enables deep semantic search for facts and knowledge
• Free, open source license (GPL)


Scope

• Languages:
  – English, Dutch, Italian, Spanish, Basque, Chinese, Japanese
• Domain:
  – Environmental domain, BUT usable in any domain
• Global:
  – Both European and non-European languages
• Available:
  – Free: as open source system and data (GPL)
• Future perspective:
  – Content standardization that supports worldwide communication


KYOTO (ICT-211423)

• Funded:
  – 7th Framework Programme (ICT) of the European Union: Intelligent Content and Semantics
  – Taiwan and Japan funded by national grants
• STREP project: research & development
• Duration:
  – March 2008 – March 2011
• Effort:
  – 364 person months of work


Consortium

1. Vrije Universiteit Amsterdam (Amsterdam, The Netherlands)
2. Consiglio Nazionale delle Ricerche (Pisa, Italy)
3. Berlin-Brandenburg Academy of Sciences and Humanities (Berlin, Germany)
4. Euskal Herriko Unibertsitatea (San Sebastian, Spain)
5. Academia Sinica (Taipei, Taiwan)
6. National Institute of Information and Communications Technology (Kyoto, Japan)
7. Irion Technologies (Delft, The Netherlands)
8. Synthema (Rome, Italy)
9. European Centre for Nature Conservation (Tilburg, The Netherlands)

• Subcontractors:
  – World Wide Fund for Nature (Zeist, The Netherlands)
  – Masaryk University (Brno, Czech Republic)


Current situation in the environmental domain

• Vast amount of information in all kinds of formats and structures: websites, documents, databases, experts, community networks
• Scattered over the world: different regions, languages and cultures
• Highly dynamic and developing
• Increasing time and information pressure
• Technology gap: people use the first results from Google
• Critical knowledge dependency


KYOTO cycle


KYOTO's Solution

• Text mining:
  – Massive and accurate indexing of facts from vast amounts of text
  – In any language/culture from scattered sources
  – Again and again to detect trends and changes
  – Direct relation between knowledge modeling effort and text mining
• Knowledge modeling:
  – Automatic learning of terms and concepts from text in any language
  – Formalization of knowledge in computer usable format -> wordnets & ontologies
• Community software:
  – For experts in the field and not knowledge engineers
  – Continuous and collaborative effort:
    • adapt to the changing domain
    • consensus in the field
    • consensus across languages and cultures
  – Produce interoperable, formal, standardized knowledge structures
  – Relate knowledge structure to expressions in languages


[Figure: KYOTO system overview]

• Ontology layers: Top, Middle, Domain
  – Concepts shown: Abstract, Physical, Process, Substance, H2O, CO2
  – Domain terms shown: CO2 emission, H2O pollution, greenhouse gas
• Wordnets linked to the ontology
• Source: distributed, diverse & dynamic data
• Steps:
  1. Capture text: "Sudden increase of CO2 emissions in 2008 in Europe"
  2. Tybot (term yielding robot): CO2 emission
  3. Wikyoto: maintain terms & concepts
  4. Kybot (knowledge yielding robot): index facts
     Process: Emission; Involves: CO2; Property: increase, sudden; When: 2008; Where: Europe
  5. Text & Fact Index
  6. Semantic Search
• Users: citizens, governments, companies, environmental organizations
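The "Index facts" example on this slide can be pictured as a small structured record. The following is a minimal sketch in Python, not KYOTO's actual data model: the `Fact` class, its field names and the `search` helper are hypothetical names chosen for illustration, and the only data used is the example sentence from the slide, filled in by hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """One indexed fact; the fields mirror the slots shown on the slide."""
    process: str
    involves: str
    properties: tuple = ()
    when: str = ""
    where: str = ""
    source_text: str = ""


def search(index, **criteria):
    """Return the facts whose slots equal every given criterion (hypothetical helper)."""
    return [
        fact for fact in index
        if all(getattr(fact, slot) == value for slot, value in criteria.items())
    ]


# The slide's example, entered manually rather than produced by a real Kybot.
fact_index = [
    Fact(
        process="Emission",
        involves="CO2",
        properties=("increase", "sudden"),
        when="2008",
        where="Europe",
        source_text="Sudden increase of CO2 emissions in 2008 in Europe",
    ),
]

hits = search(fact_index, process="Emission", where="Europe")
print(hits[0].source_text)
```

Searching on slots rather than on words is what separates a fact index from a plain text index: a query for emission processes in Europe matches the record regardless of how the source sentence was phrased.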


Achievements after 1st year

• First version of all system components
  – Wordnets in 7 languages in uniform database formats
  – Standard representation for output of linguistic processing for 7 languages, based on ISO proposals
  – Tybot (term extraction), Kybot (fact extraction) and Wikyoto (user editor)
  – Semantic search
• Extensive definition of user requirements
• Integration of system components


Potential impact


Kyoto Knowledge Base

[Figure: seven wordnets (WnIT, WnEN, WnEU, WnNL, WnJP, WnCH, WnES), each with a Domain extension, connected to a shared Ontology with a Domain Ontology]
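One way to read this picture is that each language's wordnet maps its own terms onto concepts in a shared ontology, so a term in one language can be related to terms in another through the concept they share. The sketch below illustrates that reading only. The `wordnets` dictionary, the concept identifiers, the example words and the function names are all assumptions made for illustration; none of them are KYOTO data or a KYOTO API.

```python
# Hypothetical wordnets: language code -> {term: shared ontology concept id}.
wordnets = {
    "en": {"emission": "Emission", "greenhouse gas": "GreenhouseGas"},
    "nl": {"uitstoot": "Emission", "broeikasgas": "GreenhouseGas"},
    "es": {"emisión": "Emission"},
}


def concept_of(term, lang):
    """Look up the ontology concept a term is linked to, or None if unknown."""
    return wordnets.get(lang, {}).get(term)


def terms_for(concept, lang):
    """List the terms in one language that are linked to a given concept."""
    return sorted(t for t, c in wordnets.get(lang, {}).items() if c == concept)


def translate_via_ontology(term, source, target):
    """Relate a term to target-language terms through the shared concept."""
    concept = concept_of(term, source)
    if concept is None:
        return []
    return terms_for(concept, target)


print(translate_via_ontology("uitstoot", "nl", "en"))
```

Because the link runs through the ontology rather than through pairwise dictionaries, adding an eighth language in this sketch only means adding one more mapping onto the same concepts.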


Linking Open Data dataset cloud

http://richard.cyganiak.de/2007/10/lod/

[Figure: the Linking Open Data cloud with per-domain resources added. Nodes shown: wordnet terms (sailing, legal, medical, environment), ontology concepts (environment, medical, legal, sailing) and facts (environment, medical, legal). The environment wordnet node appears several times.]


Project characteristics


Why a STREP project?

• Major technical challenges
• Cross-cultural and cross-lingual
• Small consortium for intense collaboration and discussion
• Bridge the gap between users and technology: two-directional process
• Roll-out needs to follow from technical achievements


How to keep focus?

• Use existing state-of-the-art technology
• Start from current practice as baseline
• Develop robust platform that adds to baseline, with baseline as fallback
• Gradually add richer data, more precision and new functionalities
• Allow end-users to control the process, driven by textual examples
• Open standardized architecture that can be developed further


Thank you for your attention