Keyword-based Context-aware Selection
of Natural Language Query-Patterns
Giorgio Orsi, Letizia Tanca and Eugenio Zimeo
EDBT Conference – Uppsala
March 23rd 2011
SAFE – EDBT Conference
Background: Cardiovascular diseases
March 16, 2011
Courtesy of American Heart Association: Heart Disease & Stroke Statistics (2009)
Background: Emergency rescue of people with CVD
(1) emergency
(2) rescue
(3) on-site assistance
(4) transport to hospital
(5) surgery preparation
(6) surgery
missing information
time constraints
limited technology
Positioning: which information access paradigm?
Form-based: too rigid; the application flow does not always cover the users' needs.
IR-style: interpretation of keywords; the output is documents, not tuples.
NLP queries: non-trivial NL analysis takes time, and shallow analysis is too imprecise.
Keyword search: good, if keywords are interpreted.
Schema-less: semantics is not exploited enough.
Graph patterns: still affected by uncertainty.
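Approach: The SAFE way
Desiderata: user, query-answering system, DB, relevant results
SAFE: user, keywords, ranked query patterns, instantiation of query patterns, queries, user review
[Figure: flow diagrams of both pipelines. The SAFE pipeline is annotated with the stages formulation and execution, and with context as an input at three points.]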
Approach: Query Patterns

<nlquery id="Q23">
  <sentence> … </sentence>
  <variables> … </variables>
  <formalQuery>
    <query> … </query>
    <resources> … </resources>
  </formalQuery>
</nlquery>

<sentence description="pharmacological interactions">
  <fixed>show the substances and their formulas which are known to interact with</fixed>
  <var ref="v1"/>
</sentence>

<variables>
  <variable id="v1" label="pharmacon name" type="xsd:string"/>
</variables>

<query>
  select ?name ?formula
  where {
    ?x rdf:type domain:Substance.
    ?y rdf:type domain:Substance.
    ?x domain:subName ?n1.
    ?x domain:formula ?formula.
    { ?x domain:interacts ?y. }
    ?y domain:subName ?n2
    FILTER (?n2 = '<fvar ref="v1"/>')
  }
</query>

<resources>
  <res modelRef="&domain#Substance" />
  <res modelRef="&domain#Additive" />
  <res modelRef="&domain#Molecule" />
  <res modelRef="&domain#Pharmacon" />
  <res modelRef="&domain#interacts" />
  <res modelRef="&domain#foodPresence" />
</resources>
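The formal query of a pattern carries an fvar placeholder that has to be replaced by the value the user supplies for the matching variable. A minimal sketch of that instantiation step in Python follows. The function name instantiate and the quote-escaping policy are assumptions made for illustration; the slides do not say how SAFE performs the substitution.

```python
import re

# Matches placeholders of the form <fvar ref="v1"/> inside a formal query.
FVAR = re.compile(r'<fvar\s+ref="([^"]+)"\s*/>')


def instantiate(formal_query: str, bindings: dict[str, str]) -> str:
    """Replace every fvar placeholder with the value bound to its ref.

    Hypothetical helper: the escaping below assumes the placeholder always
    sits inside a single-quoted SPARQL literal, as in the slide's FILTER.
    """
    def fill(match: re.Match) -> str:
        ref = match.group(1)
        if ref not in bindings:
            raise KeyError(f"no value supplied for variable {ref!r}")
        # Escape backslashes and single quotes so the value cannot
        # terminate the surrounding literal.
        return bindings[ref].replace("\\", "\\\\").replace("'", "\\'")

    return FVAR.sub(fill, formal_query)


template = "FILTER (?n2 = '<fvar ref=\"v1\"/>')"
print(instantiate(template, {"v1": "ascriptin"}))
```

Escaping the value before substitution keeps a parameter such as a name containing an apostrophe from breaking the query; a production system would more likely use the parameter binding of its SPARQL engine.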
Approach: Keyword to Ontology Matching
ontology controlled vocabulary
keywords:
search terms (e.g., patient, drug)
parameters (e.g., "John Doe", "49.5 Kg")
online keyword suggestion, auto-completion:
semantically-related terms
frequently-used terms
LOnt: ontological terms (labels)
S: suggested keywords
K: input keywords
LOnt = {…, heart stroke, heart failure, CPR, resuscitation, …}
input = "": K = Ø, S = Ø
input = <he… (intended input: <heart stroke>): S = {heart stroke, heart failure, …}, K = {heart stroke}
input = <c… (intended input: <CPR>): S = {resuscitation, …}; S = {CPR, heart massage, …}, K = {heart stroke, CPR}
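The suggestion behaviour on this slide can be sketched as follows: a typed prefix is completed against the ontology labels, and with no prefix the suggestions come from terms related to the keywords already chosen. This is only an illustration in Python. The RELATED map and the function suggest are hypothetical; the slides do not describe how semantic relatedness is stored, and frequency of use is ignored here.

```python
# Ontology labels taken from the slide (LOnt), elisions left out.
L_ONT = ["heart stroke", "heart failure", "CPR", "resuscitation"]

# Hypothetical relatedness map, standing in for the ontology's structure.
RELATED = {"heart stroke": ["CPR", "resuscitation"]}


def suggest(prefix: str, chosen: list[str], n: int = 5) -> list[str]:
    """Return up to n suggested keywords (the set S on the slide)."""
    candidates = [t for t in L_ONT if t not in chosen]
    if prefix:
        # Auto-completion: case-insensitive prefix match on the labels.
        p = prefix.lower()
        return [t for t in candidates if t.lower().startswith(p)][:n]
    # No prefix typed: fall back to terms related to the chosen keywords.
    related: list[str] = []
    for keyword in chosen:
        for term in RELATED.get(keyword, []):
            if term in candidates and term not in related:
                related.append(term)
    return related[:n]


print(suggest("", []))                  # nothing typed, nothing chosen
print(suggest("he", []))                # completing "he"
print(suggest("c", ["heart stroke"]))   # completing "c" after one keyword
```

With empty input and no chosen keywords the function returns an empty list, which mirrors the initial state K = Ø, S = Ø.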
Approach: Pertinence
Construct S by picking n terms t from LOnt related to the keywords already chosen (those in the set K)
S = f( freq( t ), pert( t, K ) )
pertinence computation:
phase 1: best-match decoration
phase 2: neighbors decoration
phase 3: pertinence combination
input = Ø
K = { , ascriptin}
LR1 = {drug, pharmaceutical, medicinal}
LR2 = {disease, condition, illness, sickness}
LR3 = {treats, cures, heals}
LR4 = {name}
LR5 = {code}
[Figure: ontology graph over the resources R1 to R5, with string-valued nodes and the label drug, decorated with the pertinence values 1.0, 0.5, 0.5, 0.25 and 0.25.]
assuming n = 6…
S = { treats, cures, heals, name, disease, condition }
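The three phases can be sketched in Python as below. Several things here are assumptions and not taken from the slides: the keyword drug is assumed to be in K, the graph edges in EDGES are hypothetical and chosen only so that the scores match the values shown in the figure, pertinence is assumed to halve at each hop, and the per-keyword scores are combined with max. The frequency term freq(t) is left out.

```python
from collections import deque

# Label sets LR1..LR5 from the slide.
LABELS = {
    "R1": ["drug", "pharmaceutical", "medicinal"],
    "R2": ["disease", "condition", "illness", "sickness"],
    "R3": ["treats", "cures", "heals"],
    "R4": ["name"],
    "R5": ["code"],
}
# Hypothetical topology: the slide's figure did not survive extraction.
EDGES = [("R1", "R3"), ("R1", "R4"), ("R3", "R2"), ("R3", "R5")]


def neighbours(node: str) -> list[str]:
    return [b if a == node else a for a, b in EDGES if node in (a, b)]


def pertinence(keywords: list[str], decay: float = 0.5) -> dict[str, float]:
    scores = {r: 0.0 for r in LABELS}
    for kw in keywords:
        # Phase 1, best-match decoration: matching resources score 1.0.
        matched = [r for r, labels in LABELS.items() if kw in labels]
        local = {r: 1.0 for r in matched}
        # Phase 2, neighbours decoration: breadth-first spread with decay.
        queue = deque(matched)
        while queue:
            node = queue.popleft()
            for nxt in neighbours(node):
                if nxt not in local:
                    local[nxt] = local[node] * decay
                    queue.append(nxt)
        # Phase 3, pertinence combination: keep the best score per resource.
        for r, s in local.items():
            scores[r] = max(scores[r], s)
    return scores


def suggest(keywords: list[str], n: int) -> list[str]:
    scores = pertinence(keywords)
    ranked = sorted(LABELS, key=lambda r: -scores[r])  # stable for ties
    # Skip best-matched resources (already chosen) and unreachable ones.
    terms = [t for r in ranked if 0.0 < scores[r] < 1.0 for t in LABELS[r]]
    return terms[:n]


print(pertinence(["drug"]))
print(suggest(["drug", "ascriptin"], 6))
```

Under these assumptions the six suggestions coincide with the set S listed on the slide; ascriptin matches no label, so it contributes nothing to the scores.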
Approach: Ranking the Query Patterns
naïve approach
Rank by average pertinence of the formal resources in the pattern
$\mathrm{rkg}(p) = \frac{\sum_{r_i \in RP(p)} \mathrm{pert}(r_i, K)}{|RP(p)|}$
normalized approach
use the number of resources directly associated to a keyword and mentioned in the pattern
$\mathrm{rkg}_{\mathrm{norm}}(p) = \theta \times \mathrm{rkg}(p)$
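Both ranking functions are small enough to sketch directly in Python. The naïve score follows the formula on the slide. The slide does not define θ, so the sketch assumes θ is the fraction of the pattern's resources that are directly associated with a keyword; that choice, the pattern names P1 and P2 and the pertinence values are illustrative only.

```python
def rkg(resources: list[str], pert: dict[str, float]) -> float:
    """Naive rank: average pertinence of the resources in the pattern."""
    if not resources:
        return 0.0
    return sum(pert.get(r, 0.0) for r in resources) / len(resources)


def rkg_norm(resources: list[str], pert: dict[str, float],
             direct: set[str]) -> float:
    """Normalised rank: theta * rkg(p).

    Assumption: theta is the share of the pattern's resources that are
    directly associated with a keyword (the set `direct`).
    """
    distinct = set(resources)
    if not distinct:
        return 0.0
    theta = len(distinct & direct) / len(distinct)
    return theta * rkg(resources, pert)


# Made-up pertinence values and patterns, for illustration only.
pert = {"R1": 1.0, "R2": 0.25, "R3": 0.5, "R4": 0.5, "R5": 0.25}
patterns = {"P1": ["R1", "R3", "R2"], "P2": ["R2", "R5"]}
direct = {"R1"}

ranked = sorted(patterns, key=lambda p: rkg_norm(patterns[p], pert, direct),
                reverse=True)
print(ranked)
```

With this θ, a pattern that mentions no directly matched resource drops to rank 0, which penalises patterns that are only loosely related to the keywords.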
Approach: Focus by Context-Awareness
relevant areas definition
keyword suggestion
pattern-ranking
query-answering
[Figure: context tree rooted at "all", with the dimensions role, topic and situation, and the values rescue, ER, anamn, treat, para-md, doctor, pharm, pathology, natural, chem, CeV and CaV.]
Experimentation: Experimental settings
10 people with no previous experience of the systems
50 natural-language query patterns
Metric: access time (aT)
Breakdown: aT = thT + kpT + srT + qeT + coT
thinking time (thT)
pertinence computation time (kpT)
scoring and ranking time (srT)
query execution time (qeT)
communication time (coT)
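The access-time breakdown is a plain sum of five components, which a small Python record makes explicit. The class name AccessTime, the choice of seconds as the unit and the sample numbers are all made up for illustration; they are not measurements from the experiment.

```python
from dataclasses import dataclass


@dataclass
class AccessTime:
    """One measured interaction, all components in seconds (assumed unit)."""
    thT: float  # thinking time
    kpT: float  # pertinence computation time
    srT: float  # scoring and ranking time
    qeT: float  # query execution time
    coT: float  # communication time

    @property
    def aT(self) -> float:
        # aT = thT + kpT + srT + qeT + coT
        return self.thT + self.kpT + self.srT + self.qeT + self.coT


# Illustrative values only.
sample = AccessTime(thT=4.0, kpT=0.5, srT=0.25, qeT=1.0, coT=0.25)
print(sample.aT)
```

Keeping the components separate makes it possible to see whether the user's thinking time or the system-side steps dominate the total.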
Experimentation: Validation (1)
5 query patterns (randomly selected from the pool) were assigned to each user.
The "right" queries were found:
in 65% of cases at the top of the list
in 25% of cases in second position
in 8% of cases on the first result page
In some cases, the testers were not able to formulate the right query using the form-based system.
Experimentation: Validation (2)
[Figure: chart with axis ticks 1 to 19; the plotted data did not survive extraction.]
Summary: Conclusion and future work
now…
novel paradigm for keyword-based search
context-aware and semantic ranking of query patterns
fast and precise information access
future…
automatic definition of query patterns
automatic definition of natural language descriptions
automatic definition of relevant areas
Q & A
Experimentation: Testbeds
Two implementations:
Maemo Linux on Nokia N810 and N900 smartphones
Web-based, built on OpenLaszlo and enterprise technologies
Experimental testbed in a client/server environment
Web-based SAFE vs. the form-based system provided by a hospital