1
ESCRIRE: Embedded Structured Content Representation In Repositories
Jérôme Euzenat
INRIA Rhône-Alpes
Jerome.Euzenat@inrialpes.fr
2
ESCRIRE: Motivations
Embedding a simplified but formal representation of content in documents:
• search on structured criteria;
• document comparison (genericity, similarity…);
• automatic classification and organisation.
3
Knowledge based queries
(and book (about "Agatha Christie"))
vs. book AND "Agatha Christie"
(and flat (location "Alps"))
…including those in Val d’Isère!
(and bookshop (location "London"))
…bookstore included.
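The contrast with keyword search can be sketched in a few lines of Python. This is only an illustration under assumptions: the toy synonym and part-of tables, the document records, and the function names are all hypothetical and are not part of ESCRIRE. The sketch shows why a flat in Val d’Isère answers a query about the Alps, and why a bookstore answers a query about bookshops.

```python
# Hypothetical toy ontology: every name below is an illustrative assumption.
SYNONYMS = {"bookstore": "bookshop"}   # term -> canonical class name
PART_OF = {"Val d'Isère": "Alps"}      # place -> containing region


def canonical(term):
    """Map a term to its canonical class name."""
    return SYNONYMS.get(term, term)


def located_in(place, region):
    """True if place is region, or is transitively contained in it."""
    while place is not None:
        if place == region:
            return True
        place = PART_OF.get(place)
    return False


def matches(doc, cls, region):
    """Evaluate a query of the form (and <cls> (location <region>))."""
    same_class = canonical(doc["class"]) == canonical(cls)
    return same_class and located_in(doc["location"], region)


docs = [
    {"id": "d1", "class": "flat", "location": "Val d'Isère"},
    {"id": "d2", "class": "flat", "location": "Paris"},
    {"id": "d3", "class": "bookstore", "location": "London"},
]

flats_in_alps = [d["id"] for d in docs if matches(d, "flat", "Alps")]
bookshops_in_london = [d["id"] for d in docs if matches(d, "bookshop", "London")]
print(flats_in_alps, bookshops_in_london)
```

A full-text search for "Alps" would miss d1 and a search for "bookshop" would miss d3, since neither string occurs in those records.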
4
Query languages
level 3 Semiotic
level 2 Semantic (F-logic, Escrire…)
level 1 Structural (SQL, XQL)
level 0 Full-text search
5
ESCRIRE: Goals
Comparison of several knowledge representation techniques
in order to find the type of situation to which they are most suited (indexing, classifying, filtering…).
6
ESCRIRE: Consortium
“Coordinated research action (ARC)” involving
Acacia (Sophia-Antipolis): conceptual graphs
Sherpa/Exmo (Rhône-Alpes): object-based representations
Orpailleur (Lorraine): terminological logics.
Usinor: application.
7
ESCRIRE: Acquisition
[Diagram labels: Document, Global analysis, Individual analysis, “Ontology”, Description, Integration, XML document, Tr-schema, Tr-object]
9
ESCRIRE: Problem statement
Given:
• A set of (HTML) documents annotated with a description of their content in a pivot language
• An ontology of the domain
• A set of queries about the subject.
Retrieve: the relevant documents.
10
ESCRIRE: Software variation
Knowledge representation + query evaluation
Translated from a pivot language into
Conceptual graphs, Object-based representation, Description logic
Translated by hand into CG, OKR, DL
12
ESCRIRE: Quantitative criteria
• Precision: proportion of returned answers that are correct
• Recall: proportion of correct answers that are returned
• Accuracy = (precision + recall) / 2
• Time performance
• Coverage of the query language
• Ordering of answers
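The first three criteria can be computed as follows. This is a minimal sketch: the function name and the document identifiers are hypothetical, and "accuracy" here is the plain average of precision and recall given on the slide, not the usual classification accuracy.

```python
def evaluate(returned, relevant):
    """Return (precision, recall, accuracy) for one query.

    returned: documents the system answered with
    relevant: documents a human judged to be correct answers
    """
    returned, relevant = set(returned), set(relevant)
    hits = len(returned & relevant)
    precision = hits / len(returned) if returned else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    accuracy = (precision + recall) / 2   # definition used on the slide
    return precision, recall, accuracy


# Hypothetical run: 4 documents returned, 3 relevant, 2 in common.
p, r, a = evaluate(["d1", "d2", "d3", "d4"], ["d1", "d2", "d5"])
print(f"precision={p:.2f} recall={r:.2f} accuracy={a:.2f}")
```

Returning 0.0 for an empty answer set or an empty relevant set is a convention chosen for this sketch; the slides do not specify one.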
13
ESCRIRE: Qualitative criteria
Given by external users (query designers):
• Naturalness of queries
• Adequacy of answers
• Overall appreciation (aggregation).
16
ESCRIRE: Ontology elements (1)
<esc:ontology>
<esc:defclass name="gene">
<esc:classref name="adn-part"/>
<esc:defattribute name="length">
<esc:typeref name="integer"/>
</esc:defattribute>
<esc:defattribute name="protein">
<esc:classref name="protein"/>
</esc:defattribute>
</esc:defclass>
…
17
ESCRIRE: Ontology elements (2)
<esc:descrelation name="interaction">
<esc:relref name="bio-process"/>
<esc:defattribute name="effect">
<esc:typeref name="string"/>
</esc:defattribute>…
<esc:defrole name="promoter">
<esc:classref name="gene"/>
</esc:defrole>…
</esc:descrelation>…
</esc:ontology>
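As an illustration of how such an ontology could be read by a program, the sketch below parses the class definition with Python's standard XML library. Several things are assumptions: the namespace URI is a placeholder (the slides do not give one), the elided parts are simply left out, and reading a `classref` placed directly under `defclass` as a superclass is an interpretation of the example rather than a documented rule.

```python
import xml.etree.ElementTree as ET

ESC = "http://example.org/escrire"  # hypothetical namespace URI, not from the slides
NS = {"esc": ESC}

ONTOLOGY = f"""
<esc:ontology xmlns:esc="{ESC}">
  <esc:defclass name="gene">
    <esc:classref name="adn-part"/>
    <esc:defattribute name="length"><esc:typeref name="integer"/></esc:defattribute>
    <esc:defattribute name="protein"><esc:classref name="protein"/></esc:defattribute>
  </esc:defclass>
</esc:ontology>
""".strip()


def summarize_classes(xml_text):
    """Map each class name to its (assumed) superclasses and attribute ranges."""
    root = ET.fromstring(xml_text)
    summary = {}
    for cls in root.findall("esc:defclass", NS):
        # Only direct children: classrefs nested in attributes are ranges.
        supers = [c.get("name") for c in cls.findall("esc:classref", NS)]
        attrs = {}
        for attr in cls.findall("esc:defattribute", NS):
            target = attr.find("esc:typeref", NS)
            if target is None:
                target = attr.find("esc:classref", NS)
            attrs[attr.get("name")] = None if target is None else target.get("name")
        summary[cls.get("name")] = {"superclasses": supers, "attributes": attrs}
    return summary


classes = summarize_classes(ONTOLOGY)
print(classes)
```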
18
ESCRIRE: Content descriptions
<esc:content ontology="biointer.xml" url=".">
<esc:object type="gene" id="bcd"/>
<esc:relation type="interaction">
<esc:attribute name="effect">
inhibition
</esc:attribute>
<esc:role name="promoter">
<esc:objref id="bcd"/>
</esc:role>
</esc:relation>…
</esc:content>
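A content description is only useful if every object reference points at a declared object. The sketch below checks this with Python's standard XML library. The namespace URI and the function name are hypothetical, and the elided parts of the example are left out. Identifiers are compared case-sensitively, as XML ids are, so a reference to `Bcd` would not resolve to an object declared as `bcd`.

```python
import xml.etree.ElementTree as ET

ESC = "http://example.org/escrire"  # hypothetical namespace URI, not from the slides
NS = {"esc": ESC}

CONTENT = f"""
<esc:content xmlns:esc="{ESC}" ontology="biointer.xml" url=".">
  <esc:object type="gene" id="bcd"/>
  <esc:relation type="interaction">
    <esc:attribute name="effect">inhibition</esc:attribute>
    <esc:role name="promoter"><esc:objref id="bcd"/></esc:role>
  </esc:relation>
</esc:content>
""".strip()


def dangling_refs(xml_text):
    """Return the ids used in objref elements that no object declares."""
    root = ET.fromstring(xml_text)
    declared = {obj.get("id") for obj in root.iterfind(".//esc:object", NS)}
    return [
        ref.get("id")
        for ref in root.iterfind(".//esc:objref", NS)
        if ref.get("id") not in declared
    ]


missing = dangling_refs(CONTENT)
print(missing)
```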
19
ESCRIRE: Knowledge embedding
<html>… <!-- xhtml -->
<rdf:RDF>
<rdf:Description about="/">
<!-- dublin core -->
<dc:title>…</dc:title>…
<!-- pivot language -->
<esc:content>… </esc:content>
<!-- conceptual graphs -->
<gc:graphs>…</gc:graphs>
…
</rdf:Description>…
</rdf:RDF>…
</html>
20
ESCRIRE: Queries
• Stated on objects, but results are documents
(concerning these topics)
• Document similarity by content similarity
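One simple way to picture content-based similarity is to compare the sets of objects and relations that two documents describe. The sketch below uses the Jaccard coefficient for this. That choice is an assumption made for illustration, since the slides do not say which measure ESCRIRE uses, and the descriptions (including the ids `hb` and `kr`) are hypothetical.

```python
def jaccard(a, b):
    """Jaccard coefficient: shared items divided by all items in either set."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty descriptions are treated as identical
    return len(a & b) / len(a | b)


# Hypothetical content descriptions, flattened to (type, value) pairs.
doc_a = {("gene", "bcd"), ("gene", "hb"), ("interaction", "inhibition")}
doc_b = {("gene", "bcd"), ("interaction", "inhibition")}
doc_c = {("gene", "kr")}

sim_ab = jaccard(doc_a, doc_b)
sim_ac = jaccard(doc_a, doc_c)
print(sim_ab, sim_ac)
```

A measure that also used the ontology (for example, counting two different genes as closer than a gene and a protein) would be closer in spirit to the knowledge-based queries shown earlier, but is beyond this sketch.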
21
ESCRIRE: Query language
SELECT / FROM / WHERE / ORDERBY
+
AND / OR / NOT / ALL / EXISTS
<path> <relop> <path>|<value>
IN <class>
ALIKE <document>
22
ESCRIRE: Corpus 1
Subject: genetic interaction
Text source: MedLine abstracts
Annotations: manual
Ontology: Knife knowledge base + other
23
ESCRIRE: Corpus 2
Subject: Psychological stress
Text source: MedLine abstracts
Annotations: manual
Ontology: UMLS/MeSH
24
ESCRIRE: Where are we?
• Building translators from pivot to actual formats
• First part of Corpus 1 available (other data will follow quickly)
25
ESCRIRE: Calls
• Other corpora
• Natural language technology
• Other representation systems
starting from September 2000