Answering Queries Using Views
Advanced DB Class
Presented by David Fuhry
March 9, 2006
Presentation Outline
1) Introduction to views
2) Where views are used
3) How a database processes views
4) Query equivalence and containment
5) Using views to solve queries
6) Means of optimizing the above
i. System-R, Transformational
What is a View?
● A named query [Hal0?]
● Virtual or logical table composed of the result set of a query [Wik06]
● Any relation that is not a part of the logical model, but is made visible to a user as a virtual relation [SKS02]
An Example View
CREATE VIEW CHEAP_HOTELS AS SELECT Hotel_name, Distance FROM HOTELS WHERE Price < 250;
HOTELS
Hotel_name          Price  Distance
Aqua Ocean Tower    78     1.2
Outrigger Reef      154    0.2
The Royal Hawaiian  257    0.3
Halekulani          385    0
Sheraton Waikiki    207    0.5
SELECT * FROM HOTELS
CHEAP_HOTELS
Hotel_name          Distance
Aqua Ocean Tower    1.2
Outrigger Reef      0.2
Sheraton Waikiki    0.5
SELECT * FROM CHEAP_HOTELS
Where are views used?
● Query Optimization & DB Design
  – Significant performance gain (if materialized)
  – Logical perspective of physical data
● Data Integration
  – Provide common query interface to non-uniform data sources
  – Query -> Mediated Schema -> Source Descriptor -> Source Data
When might I use a view?
● Organize the data to be presented by a screen or page of an application
● Secure a protected global table by making only parts of it visible to users
● Reduce size of query statement
  – As do stored procedures and prepared statements
  – Views integrate into SQL expressions more easily though
When else might I use a view?
● Result set is too large to exist on disk
  – Frequent itemsets when the number of items is realistically large
● I can only access chunks of the data at a time
  – Web "screen scraping" of detail pages
How does the database process views?
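A query against a view is rewritten by substituting the view's definition:
SELECT * FROM CHEAP_HOTELS
is processed as
SELECT Hotel_name, Distance FROM HOTELS WHERE Price < 250
and
SELECT Hotel_name FROM CHEAP_HOTELS WHERE Distance > 0.3
is processed as
SELECT Hotel_name FROM HOTELS WHERE Distance > 0.3 AND Price < 250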
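The substitution can be checked on the sample data. The following is a minimal sketch using Python's built-in sqlite3 module (an assumption; the slides do not name a database system). It runs the query against the view and its hand-expanded form against the base table, then compares the results.

```python
import sqlite3

# In-memory database holding the HOTELS sample data from the slides.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE HOTELS (Hotel_name TEXT, Price INTEGER, Distance REAL)")
conn.executemany(
    "INSERT INTO HOTELS VALUES (?, ?, ?)",
    [
        ("Aqua Ocean Tower", 78, 1.2),
        ("Outrigger Reef", 154, 0.2),
        ("The Royal Hawaiian", 257, 0.3),
        ("Halekulani", 385, 0.0),
        ("Sheraton Waikiki", 207, 0.5),
    ],
)
conn.execute(
    "CREATE VIEW CHEAP_HOTELS AS "
    "SELECT Hotel_name, Distance FROM HOTELS WHERE Price < 250"
)

# Query written against the view.
via_view = conn.execute(
    "SELECT Hotel_name FROM CHEAP_HOTELS WHERE Distance > 0.3 ORDER BY Hotel_name"
).fetchall()

# The same query with the view definition substituted by hand.
expanded = conn.execute(
    "SELECT Hotel_name FROM HOTELS "
    "WHERE Distance > 0.3 AND Price < 250 ORDER BY Hotel_name"
).fetchall()

print(via_view == expanded)  # both forms return the same rows
```

Both forms return Aqua Ocean Tower and Sheraton Waikiki on this data.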
Query Containment
● Q1 ⊆ Q2 if, for every database instance, the tuples (rows) returned by Q1 are a subset of those returned by Q2
  – Q1 is contained in Q2

Q1: SELECT Hotel_name, Price, Distance FROM hotels WHERE Price < 400;
Hotel_name          Price  Distance
Aqua Ocean Tower    78     1.2
Outrigger Reef      154    0.2
The Royal Hawaiian  257    0.3
Halekulani          385    0
Sheraton Waikiki    207    0.5

Q2: SELECT Hotel_name, Price, Distance FROM hotels WHERE Price < 240;
Hotel_name          Price  Distance
Aqua Ocean Tower    78     1.2
Outrigger Reef      154    0.2
Sheraton Waikiki    207    0.5

In the above case Q1 ⊇ Q2 (that is, Q2 ⊆ Q1): every row returned by the Price < 240 query is also returned by the Price < 400 query.
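A minimal sketch of the containment example, using Python's sqlite3 module (an assumption; any SQL engine would do). Comparing result sets on one instance only shows that containment is not violated there. The helper `upper_bound_contained` is a hypothetical name that captures the general argument for predicates of the form Price < c.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hotels (Hotel_name TEXT, Price INTEGER, Distance REAL)")
conn.executemany(
    "INSERT INTO hotels VALUES (?, ?, ?)",
    [
        ("Aqua Ocean Tower", 78, 1.2),
        ("Outrigger Reef", 154, 0.2),
        ("The Royal Hawaiian", 257, 0.3),
        ("Halekulani", 385, 0.0),
        ("Sheraton Waikiki", 207, 0.5),
    ],
)

wide = set(conn.execute(
    "SELECT Hotel_name, Price, Distance FROM hotels WHERE Price < 400"))
narrow = set(conn.execute(
    "SELECT Hotel_name, Price, Distance FROM hotels WHERE Price < 240"))

# On this instance the narrower query returns a subset of the wider one.
print(narrow <= wide)   # True
print(wide <= narrow)   # False

def upper_bound_contained(inner_bound, outer_bound):
    """Hypothetical helper: 'Price < inner_bound' is contained in
    'Price < outer_bound' on every instance exactly when
    inner_bound <= outer_bound."""
    return inner_bound <= outer_bound

print(upper_bound_contained(240, 400))  # True
```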
Query Equivalence
● Q1 and Q2 are equivalent if Q1 ⊆ Q2 and Q2 ⊆ Q1

Q1: SELECT Hotel_name, Price, Distance FROM hotels WHERE Distance >= 0.3
Hotel_name          Price  Distance
Aqua Ocean Tower    78     1.2
The Royal Hawaiian  257    0.3
Sheraton Waikiki    207    0.5

Q2: SELECT Hotel_name, Price, Distance FROM hotels WHERE Distance BETWEEN 0.3 AND MAX_FLOAT
Hotel_name          Price  Distance
Aqua Ocean Tower    78     1.2
The Royal Hawaiian  257    0.3
Sheraton Waikiki    207    0.5

Here MAX_FLOAT stands for the largest representable floating-point value.
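The two queries can be compared on the sample data. This is a sketch using Python's sqlite3 module, with `sys.float_info.max` standing in for MAX_FLOAT (both are assumptions). Matching results on one instance are evidence of equivalence, not a proof. The argument also assumes no Distance value is infinite.

```python
import sqlite3
import sys

MAX_FLOAT = sys.float_info.max  # stand-in for the slide's MAX_FLOAT

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hotels (Hotel_name TEXT, Price INTEGER, Distance REAL)")
conn.executemany(
    "INSERT INTO hotels VALUES (?, ?, ?)",
    [
        ("Aqua Ocean Tower", 78, 1.2),
        ("Outrigger Reef", 154, 0.2),
        ("The Royal Hawaiian", 257, 0.3),
        ("Halekulani", 385, 0.0),
        ("Sheraton Waikiki", 207, 0.5),
    ],
)

q1 = conn.execute(
    "SELECT Hotel_name, Price, Distance FROM hotels "
    "WHERE Distance >= 0.3 ORDER BY Hotel_name"
).fetchall()

# BETWEEN is inclusive on both ends, so the lower bound matches '>= 0.3'.
q2 = conn.execute(
    "SELECT Hotel_name, Price, Distance FROM hotels "
    "WHERE Distance BETWEEN 0.3 AND ? ORDER BY Hotel_name",
    (MAX_FLOAT,),
).fetchall()

print(q1 == q2)  # the two formulations agree on this instance
```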
When can a view be useful for solving part of a query?
● If it has relation(s) in common with the query and selects some attributes selected by the query
Hotel_name          Price  Distance  Rooms  Address            County
Aqua Ocean Tower    78     1.2       78     129 Paokalani      Kauai
Outrigger Reef      154    0.2       240    2169 Kalia Rd.     Maui
The Royal Hawaiian  257    0.3       137    2259 Kalakaua Av.  Kalawao
Halekulani          385    0         92     2199 Kalia Rd.     Honolulu
Sheraton Waikiki    207    0.5       57     120 Kaiulani Av.   Nihoa
Example views shown on the slide: US_Hotels, Hawaii_Buildings, Norwegian_Beagles, Jordanian_Hotels
Grouping and aggregation
● How useful can views with grouping or aggregation be in solving the query?
  – If the view uses weaker predicates than the query, very useful
  – If the view uses stronger predicates, then perhaps as a subset of the results
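A sketch of the first case, using Python's sqlite3 module. The table ROOMS_BY_PRICE is a hypothetical materialized view, simulated here with CREATE TABLE ... AS because SQLite views are not materialized. It groups by Price with no predicate at all (the weakest possible), so a query with the stronger predicate Price < 250 can filter on the grouping column and re-aggregate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE HOTELS (Hotel_name TEXT, Price INTEGER, Rooms INTEGER)")
conn.executemany(
    "INSERT INTO HOTELS VALUES (?, ?, ?)",
    [
        ("Aqua Ocean Tower", 78, 78),
        ("Outrigger Reef", 154, 240),
        ("The Royal Hawaiian", 257, 137),
        ("Halekulani", 385, 92),
        ("Sheraton Waikiki", 207, 57),
    ],
)

# Hypothetical materialized view: room totals per price, no WHERE clause.
conn.execute(
    "CREATE TABLE ROOMS_BY_PRICE AS "
    "SELECT Price, SUM(Rooms) AS Total_rooms FROM HOTELS GROUP BY Price"
)

# The query, answered from the base table.
(direct,) = conn.execute(
    "SELECT SUM(Rooms) FROM HOTELS WHERE Price < 250").fetchone()

# The same query, answered from the view: the stronger predicate is applied
# to the grouping column, then the partial sums are added up.
(from_view,) = conn.execute(
    "SELECT SUM(Total_rooms) FROM ROOMS_BY_PRICE WHERE Price < 250").fetchone()

print(direct, from_view)
```

Had the view been defined with a stronger predicate such as Price < 200, it could supply at most part of the answer.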
Grouping and aggregation
[Figure: multidimensional diagram with axes Price, Distance, and Rooms. Adapted from: Essbase Database Administrator's Guide, Understanding Multidimensional Databases [Ess06]]
Problem Statement
● How can we more efficiently answer queries using a predefined set of materialized views?
Efficiently answering a query
● Suppose a query like the following is being run very often:
  – SELECT attr1, attr2, ..., attrN FROM t1 INNER JOIN t2 ON t1.some_attr = t2.id ... OUTER JOIN tM ON t1.other_attr = tM.id
  – Lots of JOINs.
  – M tables must be joined. The operation will be expensive.
  – Can we do better? (Hint: yes)
Efficiently answering a query
[Figure: M source tables and materialized views leading to a result set with columns Attr1 ... AttrN]
How can database systems determine which (if any) materialized views to use to solve the query?
Query Optimization Techniques
● Here are a few techniques:
  – Bottom-up (System-R style)
  – Transformational
  – Other
System-R style optimization
1. Identify potentially useful views
2. Termination testing
3. Pruning of plans
4. Combining partial plans
System-R style optimization
1. Identify potentially useful views
  – Here is where we use the concepts of query containment and equivalence discussed earlier
  – But to recap: "A view can be useful for a query if the set of relations it mentions overlaps with that of the query, and it selects some of the attributes selected by the query"
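The quoted condition can be sketched as a simple set test. The `Shape` class and the relation, attribute, and view names below are all hypothetical, and a real optimizer would apply further checks before actually using a view.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Shape:
    """Hypothetical summary of a query or view: the relations it mentions
    and the attributes it selects."""
    relations: frozenset
    attributes: frozenset

def potentially_useful(view: Shape, query: Shape) -> bool:
    # Useful only if the relations overlap AND some selected attribute is shared.
    return bool(view.relations & query.relations) and bool(
        view.attributes & query.attributes
    )

query = Shape(frozenset({"hotels"}), frozenset({"Hotel_name", "Price"}))

# Hypothetical views for illustration.
hotel_view = Shape(frozenset({"hotels", "counties"}), frozenset({"Hotel_name", "County"}))
beagle_view = Shape(frozenset({"beagles"}), frozenset({"Breed"}))
address_view = Shape(frozenset({"hotels"}), frozenset({"Address"}))

print(potentially_useful(hotel_view, query))    # True
print(potentially_useful(beagle_view, query))   # False: no relation in common
print(potentially_useful(address_view, query))  # False: no selected attribute in common
```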
System-R style optimization
2. Termination testing
  – Differentiate partial query plans from complete query plans
  – Enumerate possible join orders and explore all partial paths
[Figure: source tables, materialized views, and a result set with columns Col_1 ... Col_6]
System-R style optimization
3. Pruning of plans
  – A plan is pruned if a cheaper plan exists which contains it
[Figure: three candidate plans, Plan 0 (cost 30), Plan 1 (cost 25), and Plan 2 (cost 18); one result covers columns Attr1 ... AttrN, another covers Attr1 ... AttrN plus AttrY and AttrZ]
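A minimal sketch of the pruning rule. It treats "contains" as "covers at least the same attributes", which is a simplifying assumption. The costs match the figure, but the attribute sets assigned to each plan are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Plan:
    name: str
    cost: int
    attrs: frozenset  # attributes the plan produces (assumed measure of coverage)

def prune(plans):
    """Drop every plan for which a strictly cheaper plan covers at least
    the same attributes."""
    kept = []
    for p in plans:
        dominated = any(
            q.cost < p.cost and q.attrs >= p.attrs for q in plans if q is not p
        )
        if not dominated:
            kept.append(p)
    return kept

base = frozenset({"Attr1", "Attr2", "Attr3", "Attr4"})
plans = [
    Plan("Plan 0", 30, base),
    Plan("Plan 1", 25, base | {"AttrY", "AttrZ"}),  # hypothetical wider coverage
    Plan("Plan 2", 18, base),
]

survivors = [p.name for p in prune(plans)]
print(survivors)  # ['Plan 1', 'Plan 2']
```

Plan 0 is pruned because both other plans are cheaper and cover everything it covers. Plan 1 survives because the cheaper Plan 2 does not cover AttrY and AttrZ.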
System-R style optimization
4. Combining partial plans
  – Consider different possible ways of joining the views
  – Use dynamic programming
● To find the optimal plan for Join({A, B, C, D}), find the optimal (cheapest) plan among
  – Join({A, B, C}, D)
  – Join({A, B, D}, C)
  – Join({A, C, D}, B)
  – Join({B, C, D}, A)
● Use recursion to solve each three-way sub-join
● Keep the cheapest of the four and discard the other three
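The recursion can be sketched with memoization over subsets. The cardinalities and the cost model below (scan cost plus intermediate result size, with every join keeping 1 row in 100) are assumptions chosen for illustration, not the cost model of any real optimizer.

```python
from functools import lru_cache
from math import prod

# Hypothetical table cardinalities.
CARD = {"A": 1000, "B": 100, "C": 10, "D": 10000}
SELECTIVITY_DIVISOR = 100  # assumed: each join keeps 1 row in 100

def result_size(rels):
    """Estimated row count of joining all relations in rels."""
    return prod(CARD[r] for r in rels) // SELECTIVITY_DIVISOR ** (len(rels) - 1)

@lru_cache(maxsize=None)
def best_plan(rels):
    """Return (cost, join_order) of the cheapest left-deep plan for rels,
    a frozenset of relation names. Each subset is solved only once."""
    if len(rels) == 1:
        (r,) = rels
        return CARD[r], (r,)
    candidates = []
    for last in sorted(rels):
        # Join(rels - {last}, last): best plan for the rest, then add 'last'.
        sub_cost, sub_order = best_plan(rels - {last})
        cost = sub_cost + CARD[last] + result_size(rels)
        candidates.append((cost, sub_order + (last,)))
    return min(candidates)  # keep the cheapest, discard the others

cost, order = best_plan(frozenset("ABCD"))
print(cost, order)
```

Under these assumed numbers the cheapest choice is Join({A, B, C}, D), with the sub-join built in the order B, C, A.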
System-R style optimization
Source: Surajit Chaudhuri, An Overview of Query Optimization in Relational Systems
Transformational query rewriting
● Top-down approach
● Cache materialized view metadata
  – Relations the view is composed of
  – Columns the view covers
  – Groupings the view applies
  – etc.
● Build a multiway search tree out of all views' metadata
  – It partitions the views by the above attributes
  – Idea is to reject irrelevant views quickly
A Filter Tree
Levels of the tree, each partitioning the views by one condition:
● Source table condition
● Hub condition
● Output column condition
● Grouping columns
● Range constrained columns
● Residual predicate condition
● Output expression condition
● Grouping expression condition
Leaf nodes point to sets of relevant views, e.g. {V1,V3}, {V7,V9,V10}, ...
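A two-level sketch of the idea, covering only the source table and output column levels. The superset tests are simplifying assumptions and not necessarily the exact conditions used in [JL01]. The class name and the view metadata are hypothetical.

```python
from collections import defaultdict

class FilterTree:
    """Toy index: level 1 keys on a view's source tables, level 2 on its
    output columns, and leaves hold sets of view names."""

    def __init__(self):
        self._index = defaultdict(lambda: defaultdict(set))

    def add(self, name, tables, columns):
        self._index[frozenset(tables)][frozenset(columns)].add(name)

    def candidates(self, query_tables, query_columns):
        query_tables = frozenset(query_tables)
        query_columns = frozenset(query_columns)
        found = set()
        for tables, by_columns in self._index.items():
            if not tables >= query_tables:
                continue  # whole subtree rejected without inspecting its views
            for columns, views in by_columns.items():
                if columns >= query_columns:
                    found |= views
        return found

tree = FilterTree()
tree.add("V1", {"hotels"}, {"Hotel_name", "Price"})
tree.add("V3", {"hotels"}, {"Hotel_name", "Price"})
tree.add("V7", {"beagles"}, {"Breed"})
tree.add("V9", {"hotels"}, {"Distance"})

relevant = tree.candidates({"hotels"}, {"Hotel_name"})
print(sorted(relevant))  # ['V1', 'V3']
```

V7 is rejected at the first level and V9 at the second, so only the leaf holding V1 and V3 is reached.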
Other types of view rewriting
● Query Graph Model (QGM)
  – Split query into multiple boxes, and try to match the view's boxes with the query's
References
● [Hal0?] A.Y. Halevy. Answering Queries Using Views: A Survey. VLDB Journal, 10(4).
● [Ull97] Jeffrey D. Ullman. Information Integration Using Logical Views. ICDT 1997.
● [Wik06] Wikipedia contributors (2006). View (database). Wikipedia, The Free Encyclopedia.
● [SKS02] Silberschatz, Korth, Sudarshan. Database System Concepts, 4th Ed. 2002. (100)
● [JL01] Jonathan Goldstein and Per-Ake Larson. Optimizing queries using materialized views: a practical, scalable solution. In Proc. of SIGMOD, pages 331-342, 2001.
● [Ess06] IBM Corp. Essbase Analytic Services Database Administrator's Guide. Understanding Multidimensional Databases.
Recap
1) Introduction to views
2) Where views are used
3) How a database processes views
4) Query equivalence and containment
5) Using views to solve queries
6) Means of optimizing the above
i. System-R, Transformational