Query Relaxation Using Malleable Schemas Xuan Zhou, Julien Gaugaz, Wolf-Tilo Balke, Wolfgang Nejdl L3S Research Center Leibniz University Hanover, Germany Presented by Aaron Stewart BYU CS 652 Spring 2009


Query Relaxation Using Malleable Schemas

Xuan Zhou, Julien Gaugaz, Wolf-Tilo Balke, Wolfgang Nejdl
L3S Research Center
Leibniz University
Hanover, Germany

Presented by Aaron Stewart
BYU CS 652
Spring 2009


Problem

+ = ?


Problem

• Multiple data sources

• Unmatched schemas


Approach

1. Malleable schemas

2. Discover correlations

3. Relax user queries


Malleable Schemas

• Allow duplicate fields

• Allow related fields


Malleable Schemas


Malleable Schemas

first_name, sur_name

name


Malleable Schemas

contents

body
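
The two preceding examples (first_name and sur_name alongside name, contents alongside body) can be sketched in code. This is only an illustration, assuming each source is a plain dictionary; the record values and the helper `project` are hypothetical and do not come from the paper.

```python
# Hypothetical records from two sources whose schemas overlap but do not match.
source_a = {"first_name": "Ada", "sur_name": "Lovelace", "contents": "Notes on the engine"}
source_b = {"name": "Ada Lovelace", "body": "Notes on the engine"}

# A malleable schema keeps all of these attributes side by side
# instead of forcing a single canonical column per concept.
malleable_attributes = sorted(set(source_a) | set(source_b))


def project(record, attributes):
    """Lay a record out over the full attribute set; missing fields become None."""
    return {attr: record.get(attr) for attr in attributes}


rows = [project(source_a, malleable_attributes),
        project(source_b, malleable_attributes)]
```

A query on name alone would miss the first row, which is the gap that query relaxation is meant to close.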


In Practice: Tables

• “…a malleable schema… contains imprecise and overlapping definitions of attributes or relationships.”

• “In this way, a malleable schema can capture such heterogeneous data structures as in Figure 1.”


In Practice: Tables


In Practice: Tables

Entities (database records, rows)

Attributes (database fields, columns)

Equivalently: Distinct tables


Query Relaxation Planning

• Multiple queries
  – Different columns or tables
  – As few queries as possible

• Exponential number of relaxed queries
  – Evaluate in order of precision
  – Stop at k results
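
The "evaluate in order of precision, stop at k results" idea can be sketched as a lazy best-first enumeration. This is not the paper's planning algorithm. It assumes that each query attribute has a list of alternative attributes with an estimated precision, and that the precision of a relaxed query is the product of those estimates; both are assumptions made for illustration. The names `relaxed_queries`, `top_k`, and `run_query` are hypothetical.

```python
import heapq
from math import prod


def relaxed_queries(alternatives):
    """Yield (precision, query) pairs in non-increasing order of precision.

    alternatives: one list per query attribute, each holding
    (attribute_name, estimated_precision) sorted by descending precision.
    """
    def score(idx):
        return prod(alternatives[i][j][1] for i, j in enumerate(idx))

    start = (0,) * len(alternatives)
    heap = [(-score(start), start)]
    seen = {start}
    while heap:
        neg, idx = heapq.heappop(heap)
        yield -neg, tuple(alternatives[i][j][0] for i, j in enumerate(idx))
        # Relax one attribute at a time; only the frontier is ever stored.
        for i in range(len(idx)):
            if idx[i] + 1 < len(alternatives[i]):
                nxt = idx[:i] + (idx[i] + 1,) + idx[i + 1:]
                if nxt not in seen:
                    seen.add(nxt)
                    heapq.heappush(heap, (-score(nxt), nxt))


def top_k(alternatives, run_query, k):
    """Run relaxed queries best-first and stop once k rows are collected."""
    results = []
    for precision, query in relaxed_queries(alternatives):
        for row in run_query(query):
            results.append((precision, row))
            if len(results) >= k:
                return results
    return results
```

Because the generator is lazy, the exponential set of relaxed queries is never materialized; only the queries needed to reach k results are evaluated.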


Query Relaxation Planning

A1 A2

relaxed attribute

child attributes


Query Relaxation Planning

• A “relaxed query always yields better precision than its child queries, so that it should always be evaluated prior to its child queries”


Parent/Child Relationship

• We would think A is the parent, and A1 and A2 are the children, but…

• Put them in order of correlation probability
  – If P(A|A1) > P(A|A2)
  – Then A => A1 => A2
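
The ordering rule above amounts to a sort on the conditional probabilities. A minimal sketch, assuming the values P(A|Ai) are already available as a dictionary; the function name `relaxation_order` and the probabilities in the example are made up for illustration.

```python
def relaxation_order(target, cond_prob):
    """Return the chain target => alternatives, best alternative first.

    cond_prob maps each alternative attribute Ai to an estimate of P(target | Ai).
    """
    ranked = sorted(cond_prob, key=cond_prob.get, reverse=True)
    return [target] + ranked


# Hypothetical estimates: P(A|A1) = 0.8 and P(A|A2) = 0.6.
order = relaxation_order("A", {"A2": 0.6, "A1": 0.8})
```

With these numbers, `order` is the chain A, A1, A2, matching the slide's A => A1 => A2.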


Query Relaxation Planning


Query Relaxation


Experiments

• Data sets
  – IMDB Movies
  – Amazon.com DVDs and VHS videos


Results


Results


Results


Analysis

• Strengths
  – Handles mixed schemas
  – Well-designed algorithms (IMO)

• Future work
  – Speed