
Supporting Queries with Imprecise Constraints

Ullas Nambiar
Dept. of Computer Science
University of California, Davis

Subbarao Kambhampati
Dept. of Computer Science
Arizona State University

18th July, AAAI-06, Boston, USA

[WebDB 2004; VLDB 2005 (d); WWW 2005 (p); ICDE 2006]

Dichotomy in Query Processing

Databases

• User knows what she wants

• User query completely expresses the need

• Answers exactly matching query constraints

IR Systems

• User has an idea of what she wants

• User query captures the need to some degree

• Answers ranked by degree of relevance

Autonomous Un-curated DB

Inexperienced, Impatient user population

Why Support Imprecise Queries?

Want a ‘sedan’ priced around $7000

A Feasible Query

Make = “Toyota”, Model = “Camry”, Price ≤ $7000

What about the price of a Honda Accord?

Is there a Camry for $7100?

Solution: Support Imprecise Queries

Make: Toyota, Model: Camry, Price: $6500, Year: 1998

………

Others are following …

The Problem: Given a conjunctive query Q over a relation R, find a set of tuples that will be considered relevant by the user.

Ans(Q) = {x | x ∈ R, Rel(x|Q,U) > c}

Constraints
– Minimal burden on the end user
– No changes to existing database
– Domain independent
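The answer-set definition can be read as a thresholded, ranked filter over the relation. The sketch below is only an illustration under stated assumptions: `answer_set`, `price_relevance`, the `CARS` rows, and the threshold value are all hypothetical, and the toy relevance function stands in for the learned Rel(x|Q,U), which the talk derives from data rather than hand-coding.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]

def answer_set(relation: List[Row], rel: Callable[[Row], float], c: float) -> List[Row]:
    """Return the tuples x in R with rel(x) > c, most relevant first."""
    scored = [(rel(x), x) for x in relation]
    kept = [(score, x) for score, x in scored if score > c]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [x for _, x in kept]

# Hypothetical sample relation (not data from the talk).
CARS: List[Row] = [
    {"Make": "Toyota", "Model": "Camry", "Price": 6500},
    {"Make": "Honda", "Model": "Accord", "Price": 7100},
    {"Make": "Ford", "Model": "Focus", "Price": 3000},
]

def price_relevance(x: Row) -> float:
    """Toy stand-in for Rel(x|Q,U): closeness of Price to a $7000 target."""
    return max(0.0, 1.0 - abs(x["Price"] - 7000) / 7000)

answers = answer_set(CARS, price_relevance, c=0.9)
```

With this toy function the $7100 Accord is retained alongside the Camry, which is the behaviour the "Is there a Camry for $7100?" question is asking for.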

What does Supporting Imprecise Queries Mean?

Autonomous Un-curated DB

Inexperienced, Impatient user population

Assessing Relevance Function Rel(x|Q,U)

We looked at a variety of non-intrusive relevance assessment methods
– Basic idea is to learn the relevance function for the user population rather than for single users

Methods
– From the analysis of the (sample) data itself
  • Allows us to understand the relative importance of attributes, and the similarity between the values of an attribute [ICDE 2006; WWW 2005 poster]
– From the analysis of query logs
  • Allows us to identify related queries, and then throw in their answers [WIDM 2003; WebDB 2004]
– From co-click patterns
  • Allows us to identify similarity based on user click patterns [Under Review]

Our Solution: AIMQ

The AIMQ Approach

Imprecise Query Q

Query Engine
– Map: convert “like” to “=”: Qpr = Map(Q)
– Derive base set Abs: Abs = Qpr(R)

Dependency Miner
– Use base set as set of relaxable selection queries
– Using AFDs, find relaxation order
– Derive extended set by executing relaxed queries

Similarity Miner
– Use value similarities and attribute importance to measure tuple similarities
– Prune tuples below threshold

Return Ranked Set

[For the special case of empty query, we start with a relaxation that uses AFD analysis]

An Illustrative Example


Relation: CarDB(Make, Model, Price, Year)

Imprecise query
Q :− CarDB(Model like “Camry”, Price like “10k”)

Base query
Qpr :− CarDB(Model = “Camry”, Price = “10k”)

Base set Abs
Make = “Toyota”, Model = “Camry”, Price = “10k”, Year = “2000”
Make = “Toyota”, Model = “Camry”, Price = “10k”, Year = “2001”
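The mapping step in this example can be sketched in a few lines. This is a minimal illustration, not the AIMQ implementation: the constraint representation, the function names `map_to_precise` and `run_precise`, and the two extra rows in `CAR_DB` are assumptions made for the example.

```python
from typing import Dict, List, Tuple

Row = Dict[str, str]
Constraint = Tuple[str, str, str]  # (attribute, operator, value)

def map_to_precise(query: List[Constraint]) -> List[Constraint]:
    """Qpr = Map(Q): turn every 'like' constraint into an equality constraint."""
    return [(attr, "=" if op == "like" else op, val) for attr, op, val in query]

def run_precise(query: List[Constraint], relation: List[Row]) -> List[Row]:
    """Abs = Qpr(R): keep the tuples that satisfy every equality constraint."""
    for _, op, _ in query:
        if op != "=":
            raise ValueError(f"unsupported operator in precise query: {op}")
    return [t for t in relation if all(t[attr] == val for attr, _, val in query)]

CAR_DB: List[Row] = [
    {"Make": "Toyota", "Model": "Camry", "Price": "10k", "Year": "2000"},
    {"Make": "Toyota", "Model": "Camry", "Price": "10k", "Year": "2001"},
    {"Make": "Toyota", "Model": "Camry", "Price": "11k", "Year": "2001"},  # hypothetical row
    {"Make": "Honda", "Model": "Accord", "Price": "10k", "Year": "2000"},  # hypothetical row
]

Q = [("Model", "like", "Camry"), ("Price", "like", "10k")]
Qpr = map_to_precise(Q)
base_set = run_precise(Qpr, CAR_DB)
```

Here `base_set` holds the two Camry tuples priced at 10k, matching the base set listed on the slide.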

Obtaining Extended Set


Problem: Given base set, find tuples from database similar to tuples in base set.

Solution:
– Consider each tuple in base set as a selection query.
  e.g. Make = “Toyota”, Model = “Camry”, Price = “10k”, Year = “2000”
– Relax each such query to obtain “similar” precise queries.
  e.g. Make = “Toyota”, Model = “Camry”, Price = “”, Year = “2000”
– Execute and determine tuples having similarity above some threshold.

Challenge: Which attribute should be relaxed first?

– Make ? Model ? Price ? Year ?

Solution: Relax least important attribute first.
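Relaxing a base-set tuple amounts to dropping one or more of its bindings and re-running the resulting selection query. The sketch below assumes Price is the least important attribute (as the later slides conclude for CarDB); the helper names and the rows of `CAR_DB` are hypothetical, and similarity-based pruning is left out.

```python
from typing import Dict, List

Row = Dict[str, str]

def relax(query: Row, attrs_to_relax: List[str]) -> Row:
    """Drop the bindings of the relaxed attributes; the rest stay as equalities."""
    return {a: v for a, v in query.items() if a not in attrs_to_relax}

def execute(query: Row, relation: List[Row]) -> List[Row]:
    """Run a conjunctive equality query over the relation."""
    return [t for t in relation if all(t[a] == v for a, v in query.items())]

CAR_DB: List[Row] = [  # hypothetical sample data
    {"Make": "Toyota", "Model": "Camry", "Price": "10k", "Year": "2000"},
    {"Make": "Toyota", "Model": "Camry", "Price": "9k", "Year": "2000"},
    {"Make": "Toyota", "Model": "Camry", "Price": "10k", "Year": "2001"},
    {"Make": "Honda", "Model": "Accord", "Price": "10k", "Year": "2000"},
]

base_tuple = {"Make": "Toyota", "Model": "Camry", "Price": "10k", "Year": "2000"}
relaxed_query = relax(base_tuple, ["Price"])
# Candidate extended-set tuples: everything the relaxed query returns except the base tuple.
extended = [t for t in execute(relaxed_query, CAR_DB) if t != base_tuple]
```

Relaxing Price alone surfaces the 9k Camry from the same year while leaving the 2001 Camry and the Accord for later, less conservative relaxations.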

Least Important Attribute

Definition: An attribute whose binding value, when changed, has minimal effect on values binding other attributes.
• Does not decide values of other attributes
• Value may depend on other attributes

E.g. Changing/relaxing Price will usually not affect other attributes, but changing Model usually affects Price

Dependence between attributes is useful to decide relative importance
• Approximate Functional Dependencies & Approximate Keys
  – Approximate in the sense that they are obeyed by a large percentage (but not all) of tuples in the database
• Can use TANE, an algorithm by Huhtala et al [1999]
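One way to make "obeyed by a large percentage of tuples" concrete is to measure, for a candidate dependency X → Y, the largest fraction of tuples that can be kept so that the dependency holds exactly (one minus the g3 error used by TANE). The sketch below computes that fraction directly on a small sample; it is not TANE itself, and the function name `afd_support` and the sample rows are assumptions for illustration.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Sequence

Row = Dict[str, str]

def afd_support(rows: List[Row], lhs: Sequence[str], rhs: str) -> float:
    """Fraction of tuples that can be kept so that lhs -> rhs holds exactly."""
    groups = defaultdict(Counter)
    for r in rows:
        groups[tuple(r[a] for a in lhs)][r[rhs]] += 1
    # In each lhs-group, keep only the tuples carrying the most frequent rhs value.
    kept = sum(max(counts.values()) for counts in groups.values())
    return kept / len(rows)

SAMPLE: List[Row] = [  # hypothetical sample; the last row is a deliberate data-entry error
    {"Make": "Toyota", "Model": "Camry", "Price": "10k"},
    {"Make": "Toyota", "Model": "Camry", "Price": "10k"},
    {"Make": "Toyota", "Model": "Camry", "Price": "11k"},
    {"Make": "Honda", "Model": "Accord", "Price": "10k"},
    {"Make": "Honda", "Model": "Camry", "Price": "9k"},
]

model_to_make = afd_support(SAMPLE, ["Model"], "Make")
model_to_price = afd_support(SAMPLE, ["Model"], "Price")
```

On this sample Model → Make holds for 4 of 5 tuples while Model → Price holds for only 3 of 5, so a support threshold between the two would accept the first as an AFD and reject the second.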

Deciding Attribute Importance

• Mine AFDs and Approximate Keys
• Create dependence graph using AFDs
  – Strongly connected, hence a topological sort is not possible
• Using Approximate Key with highest support, partition attributes into
  – Deciding set
  – Dependent set
  – Sort the subsets using dependence and influence weights
• Measure attribute importance using the weight formula that follows the CarDB example


CarDB(Make, Model, Year, Price)

Decides: Make, Year
Depends: Model, Price

Order: Price, Model, Year, Make

1-attribute: {Price, Model, Year, Make}

2-attribute: {(Price, Model), (Price, Year), (Price, Make), ...}

• Attribute relaxation order is all non-keys first, then keys

• Greedy multi-attribute relaxation
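The 1-attribute and 2-attribute lists above follow directly from the single-attribute order if multi-attribute relaxations are enumerated as ordered subsets. The sketch below assumes that reading of "greedy multi-attribute relaxation"; `relaxation_sets` is a hypothetical helper name, and the attribute order is the one given on the slide.

```python
from itertools import combinations
from typing import Iterator, List, Tuple

def relaxation_sets(order: List[str], max_size: int) -> Iterator[Tuple[str, ...]]:
    """Yield attribute sets to relax: all 1-attribute sets, then 2-attribute sets, ...

    Within each size, itertools.combinations preserves the input order, so sets
    built from less important attributes come first.
    """
    for k in range(1, max_size + 1):
        yield from combinations(order, k)

ORDER = ["Price", "Model", "Year", "Make"]  # least important attribute first
sets_to_relax = list(relaxation_sets(ORDER, max_size=2))
```

The first four entries are the single attributes in order, followed by (Price, Model), (Price, Year), (Price, Make), and so on.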

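Attribute importance weight:

$$
W_{imp}(A_i) = \frac{RelaxOrder(A_i)}{count(Attributes(R))} \times
\begin{cases}
\dfrac{Wt_{decides}(A_i)}{\sum Wt_{decides}} & \text{if } A_i \text{ is in the deciding set} \\
\dfrac{Wt_{depends}(A_i)}{\sum Wt_{depends}} & \text{if } A_i \text{ is in the dependent set}
\end{cases}
$$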

Tuple Similarity

Tuples obtained after relaxation are ranked according to their similarity to the corresponding tuples in base set

$$
Similarity(t_1, t_2) = \sum_{i} AttrSimilarity(value(t_1[A_i]), value(t_2[A_i])) \times W_i
$$

where W_i = normalized influence weights, ∑ W_i = 1, i = 1 to |Attributes(R)|

Value Similarity
• Euclidean for numerical attributes, e.g. Price, Year
• Concept Similarity for categorical attributes, e.g. Make, Model
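Tuple similarity as described here is a weighted sum of per-attribute similarities, with weights that sum to 1. The sketch below illustrates that structure only: the weights, the categorical similarity table, and the normalized-difference measure used for numeric attributes (a simple stand-in for the distance-based measure the slide mentions, assuming non-negative values) are all hypothetical.

```python
from typing import Dict, Union

Value = Union[str, int, float]
Row = Dict[str, Value]

# Hypothetical normalized importance weights (they sum to 1).
WEIGHTS = {"Make": 0.2, "Model": 0.3, "Price": 0.3, "Year": 0.2}

# Hypothetical concept similarities between distinct categorical values.
CATEGORICAL_SIM = {
    frozenset({"Toyota", "Honda"}): 0.5,
    frozenset({"Camry", "Accord"}): 0.6,
}

def attr_similarity(a: Value, b: Value) -> float:
    """Similarity in [0, 1] between two values of the same attribute."""
    if isinstance(a, (int, float)) and isinstance(b, (int, float)):
        denom = max(abs(a), abs(b))
        return 1.0 if denom == 0 else 1.0 - abs(a - b) / denom
    if a == b:
        return 1.0
    return CATEGORICAL_SIM.get(frozenset({a, b}), 0.0)

def tuple_similarity(t1: Row, t2: Row, weights: Dict[str, float]) -> float:
    """Weighted sum of attribute similarities between two tuples."""
    return sum(w * attr_similarity(t1[a], t2[a]) for a, w in weights.items())

t1 = {"Make": "Toyota", "Model": "Camry", "Price": 10000, "Year": 2000}
t2 = {"Make": "Honda", "Model": "Accord", "Price": 9000, "Year": 2000}
score = tuple_similarity(t1, t2, WEIGHTS)
```

Because the weights sum to 1 and each attribute similarity lies in [0, 1], the score is itself in [0, 1] and can be compared against a pruning threshold.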


Categorical Value Similarity

• Two words are semantically similar if they have a common context – from NLP
• Context of a value represented as a set of bags of co-occurring values called Supertuple
• Value Similarity: Estimated as the percentage of common {Attribute, Value} pairs
  – Measured as the Jaccard Similarity among supertuples representing the values

Supertuple for Concept Make=Toyota, ST(QMake=Toy…

Model: Camry: 3, Corolla: 4, …
Year: 2000: 6, 1999: 5, 2001: 2, …
Price: 5995: 4, 6500: 3, 4000: 6

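$$
JaccardSim(A, B) = \frac{|A \cap B|}{|A \cup B|}
$$

$$
VSim(v_1, v_2) = \sum_{i=1}^{m} W_{imp}(A_i) \times JaccardSim(ST(v_1).A_i, ST(v_2).A_i)
$$

where m is the number of attributes in the supertuple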
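A supertuple can be built by grouping the tuples that contain a value and counting, per remaining attribute, the values that co-occur with it. The sketch below assumes a bag (multiset) form of Jaccard similarity, with intersection as per-value minimum counts and union as per-value maximum counts; that choice, the function names, the weights, and the sample rows are assumptions for illustration rather than details confirmed by the slides.

```python
from collections import Counter
from typing import Dict, List

Row = Dict[str, str]
SuperTuple = Dict[str, Counter]

def supertuple(rows: List[Row], attr: str, value: str) -> SuperTuple:
    """Bags of values co-occurring with attr=value, one bag per other attribute."""
    st: SuperTuple = {}
    for r in rows:
        if r[attr] != value:
            continue
        for a, v in r.items():
            if a != attr:
                st.setdefault(a, Counter())[v] += 1
    return st

def bag_jaccard(a: Counter, b: Counter) -> float:
    """|A ∩ B| / |A ∪ B| on bags: Counter '&' takes min counts, '|' takes max counts."""
    union = sum((a | b).values())
    return sum((a & b).values()) / union if union else 0.0

def vsim(st1: SuperTuple, st2: SuperTuple, weights: Dict[str, float]) -> float:
    """Weighted sum of per-attribute bag similarities between two supertuples."""
    return sum(w * bag_jaccard(st1.get(a, Counter()), st2.get(a, Counter()))
               for a, w in weights.items())

ROWS: List[Row] = [  # hypothetical sample data
    {"Make": "Toyota", "Model": "Camry", "Year": "2000"},
    {"Make": "Toyota", "Model": "Corolla", "Year": "2000"},
    {"Make": "Honda", "Model": "Accord", "Year": "2000"},
    {"Make": "Honda", "Model": "Civic", "Year": "2001"},
]
WEIGHTS = {"Model": 0.5, "Year": 0.5}  # hypothetical importance weights

toyota = supertuple(ROWS, "Make", "Toyota")
honda = supertuple(ROWS, "Make", "Honda")
similarity = vsim(toyota, honda, WEIGHTS)
```

In this toy data the two makes share no models but overlap on Year, so all of their similarity comes from the Year bag.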



Value Similarity Graph

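[Figure: similarity graph over Make values (Chevrolet, Toyota, Dodge, Nissan); the edge weights that survive extraction are 0.11 and 0.15]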

Empirical Evaluation

Goal
– Evaluate the effectiveness of the query relaxation and similarity estimation

Database
– Used car database CarDB based on Yahoo Autos
  CarDB(Make, Model, Year, Price, Mileage, Location, Color)
  • Populated using 100k tuples from Yahoo Autos
– Census Database from UCI Machine Learning Repository
  • Populated using 45k tuples

Algorithms
– AIMQ
  • RandomRelax – randomly picks attribute to relax
  • GuidedRelax – uses relaxation order determined using approximate keys and AFDs
– ROCK: RObust Clustering using linKs (Guha et al, ICDE 1999)
  • Compute Neighbours and Links between every tuple
    Neighbour – tuples similar to each other
    Link – number of common neighbours between two tuples
  • Cluster tuples having common neighbours
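The neighbour and link notions used by the ROCK baseline can be sketched as follows. This is a small illustration rather than the ROCK algorithm: it stops before clustering, uses plain set Jaccard as the similarity, does not count a tuple as its own neighbour, and the threshold and sample tuples are hypothetical.

```python
from itertools import combinations
from typing import Callable, Dict, FrozenSet, List, Set, Tuple

Item = FrozenSet[str]

def jaccard(a: Item, b: Item) -> float:
    """Set Jaccard similarity between two tuples viewed as sets of values."""
    return len(a & b) / len(a | b) if (a or b) else 0.0

def neighbours(items: List[Item], sim: Callable[[Item, Item], float],
               theta: float) -> Dict[int, Set[int]]:
    """Two tuples are neighbours when their similarity is at least theta."""
    nbrs: Dict[int, Set[int]] = {i: set() for i in range(len(items))}
    for i, j in combinations(range(len(items)), 2):
        if sim(items[i], items[j]) >= theta:
            nbrs[i].add(j)
            nbrs[j].add(i)
    return nbrs

def links(nbrs: Dict[int, Set[int]]) -> Dict[Tuple[int, int], int]:
    """Link count of a pair = number of neighbours the two tuples share."""
    return {(i, j): len(nbrs[i] & nbrs[j]) for i, j in combinations(sorted(nbrs), 2)}

ITEMS: List[Item] = [  # hypothetical tuples
    frozenset({"Toyota", "Camry", "2000"}),
    frozenset({"Toyota", "Camry", "2001"}),
    frozenset({"Toyota", "Corolla", "2000"}),
    frozenset({"Honda", "Civic", "1995"}),
]

nbrs = neighbours(ITEMS, jaccard, theta=0.5)
link_counts = links(nbrs)
```

Tuples 1 and 2 are not neighbours of each other, yet they share tuple 0 as a neighbour and so have one link, which is the kind of indirect evidence link-based clustering relies on.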

Efficiency of Relaxation

Random Relaxation
[Plot: one series per threshold (ε = 0.5, 0.6, 0.7); x-axis: Queries 1–10]
• Average 8 tuples extracted per relevant tuple for ε = 0.5. Increases to 120 tuples for ε = 0.7.
• Not resilient to change in ε

Guided Relaxation
[Plot: one series per threshold (ε = 0.5, 0.6, 0.7); x-axis: Queries 1–10]
• Average 4 tuples extracted per relevant tuple for ε = 0.5. Goes up to 12 tuples for ε = 0.7.
• Resilient to change in ε

Accuracy over CarDB

• 14 queries over 100K tuples

• Similarity learned using 25k sample

• Mean Reciprocal Rank (MRR) estimated as

$$
MRR(Q) = Avg\left( \frac{1}{|UserRank(t_i) - AIMQRank(t_i)| + 1} \right)
$$

• Overall high MRR shows high relevance of suggested answers

[Plot: GuidedRelax vs. RandomRelax; x-axis: Queries 1–14]
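Reading the slide's MRR formula as the average, over the answers to a query, of 1 / (|UserRank(t_i) − AIMQRank(t_i)| + 1), the measure can be computed as below. That reading of the garbled formula is an assumption, as are the function name `mrr` and the sample ranks.

```python
from typing import Sequence

def mrr(user_ranks: Sequence[int], system_ranks: Sequence[int]) -> float:
    """Average of 1 / (|user rank - system rank| + 1) over the answers to one query."""
    if len(user_ranks) != len(system_ranks) or not user_ranks:
        raise ValueError("rank lists must be non-empty and of equal length")
    terms = [1.0 / (abs(u - s) + 1) for u, s in zip(user_ranks, system_ranks)]
    return sum(terms) / len(terms)

# Hypothetical ranks for three answers to one query.
user = [1, 2, 3]
system = [1, 3, 5]
score = mrr(user, system)
```

A score of 1.0 means the system's ranking agrees with the user's on every answer; each position of disagreement pulls the corresponding term toward 0.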

Handling Imprecision & Incompleteness

Incompleteness in data
– Databases are being populated by
  • Entry by lay people
  • Automated extraction
  E.g. entering an “accord” without mentioning “Honda”

Imprecision in queries
– Queries posed by lay users
  • Who combine querying and browsing

General Solution: “Expected Relevance Ranking”
– Relevance Function
– Density Function

Challenge: Automated & non-intrusive assessment of Relevance and Density functions

