CMPT 354 Views and Indexes Spring 2012 Instructor: Hassan Khosravi

Preview:

Citation preview

CMPT 354 Views and Indexes

Spring 2012

Instructor: Hassan Khosravi

1.2

Three level vision of a database

1.3

What are views Relations that are defined with a create table statement exist in

the physical layer and do not change unless explicitly told so

Virtual views do not physically exist, they are defined by expression over the tables. Can be queries (most of the time) as if they were tables.

Why use views? Hide some data from users Make some queries easier Modularity of database

When not specified exactly based on tables.

1.4

Defining and using Views

Create View <view name> As <view definition> View definition is defined in SQL From now on we can use the view almost as if it is just a normal

table

View V (R1,…Rn )

query Q involving V Conceptually

V (R1,…Rn ) is used to evaluate Q

In reality

The evaluation is performed over R1,…Rn

1.5

Example of views

1.6

Views Conceptually

Reality

Query optimizer

Users do not need to worry about any of this

1.7

Views on Views

1.8

Views make life easy

1.9

Deleting views

Drop View <view name> Dropping a view does not affect any tuples of the in the underlying

relation. Dropping a table would remove all of the tuples of the table and views

that use that table.

What would happen if CSaccept is dropped? Some DBMS would not allow it because it has dependencies Some would delete it and the be unable to answer queries on

CSberk

1.10

Modifying views

Can views be modified? Doesn’t make sense to modify a view since it doesn’t physically

exist Should be possible since many users can only observe views and

do not have access to tables, so they should be able to alter tables using views.

Solution Modifications to the view rewritten to modify base tables.

1.11

Example

R(A,B) which has (1, 2)

V is just the A attribute of V = ∏ A (R)

Insert (3) into A What would be insert into R(A,B)?

(3,1)? (3,2)? (3,null)?

R(N) V = avg(N)

Update

1.12

Example

1.13

Example

1.14

Approaches to View Modification

Rewriting process specified explicitly by view creator (user) + Can handle all modification - No guarantee of correctness - More work

Restrict views and modifications so that translation to base tables can be done automatically, correctly, and unambiguously + No User intervention - Restrictions are significant

1.15

Rewriting process specified explicitly by view creator

Instead of Triggers are mainly used

We can write a trigger to perform deletion on this view

1.16

This will produce an error without having a trigger for it There is nothing stopping you from making a mistake in your trigger.

1.17

Restricting modifications in Views Showing students that are either in CS or EE

Inserting new students

1.18

Automatic View Modification

Restrict views and modifications so that translation to base tables can be done automatically, correctly, and unambiguously The view should be on one table and not join of tables Attributes that are not in the view can be null or have default

values Sub-queries must not refer to T You cannot have group by or aggregation

MYSQL supports automatic updates on views Postgress and SQLite do not support automatic updates on views and

triggers should be used

1.19

Deletion Example

The deletion will be done in the apply table and would show in the CSaccept view

1.20

Insertion Example

You should not be able to add to view tuples that do not belong to the view

1.21

Updatable Views -- Aggregation

What would this deletion mean? The system disallows this deletion.

1.22

Updatable views -- Sub-queries

This is disallowed since it is vague It would have surprising effects on the base table

1.23

Updatable views-- Joins

What would insertion or deletion to this table mean? Updates the apply table but not the student table so the first row

would not match any more.

1.24

Indexes in SQL

Primary mechanism to get improved performance on database Data structure stored in the database

Implementation issues

Full scan vs. immediate tuple location

1.25

1.26

Implementation of Indexes

Underlying data structure Balance Trees

A = v A < v A > v Logarithm time

Hash tables Only A = v If designed properly, constant time

1.27

1.28

1.29

1.30

Downside of Indexes

Extra space Local hard space

Over head in Index creation Index maintenance

1.31

Calculating the best indexes to create

Typically a relation is stored over many disk blocks

Principle cost of a query or modification is often the number of disks that have to be brought into the main memory

Indexes store where tuples are stored, so we know which block to restore instead of full scan of the table

However indexes themselves have to be stored and modified Modification in the actual relation needs one page access to

modify the relation and one page access to modify the index Assuming we have history of the previous queries and modifications

how can we decide what to have indexes on?

1.32

Example StarIn(movieTitle, movieYear, starName) Three database operations are performed

(Q1)Look for title and year of movies in which a given star appeared (p1) (Q2) Look for stars that appear in a given movie (p2) (I) Insert a new tuple into the starIn table (1-p1-p2)

StarIn occupies 10 pages On average a star has appeared in 3 movies On average a movie has 3 stars Movies of a star and stars of a movie are speared over all pages One disk access is needed to read a page in the index Insertion requires 1 access to find page to write to and 1 to actually write it

Action No Index Star Index Movie Index

Both Indexes

Q1 10 4 10 4

Q2 10 10 4 4

I 2 4 4 6

Average 2 + 8p1 +8p2 4 + 6p2 4 + 6p1 6 - 2p1 - 2p2

1.33

Action No Index Star Index Movie Index

Both Indexes

Q1 10 4 10 4

Q2 10 10 4 4

I 2 4 4 6

Average 2 + 8p1 +8p2 4 + 6p2 4 + 6p1 6 - 2p1 - 2p2

P1=0.33P2=0.33

7.28 6 6 4.67

P1=0.1 p2=0.1

3.6 4.6 4.6 5.6

P1=0.1 p2=0.5

6.8 7 4.6 4.8

1.34

Automatic Selection of Indexes to create

1.35

Materialized views

Materialized views

Periodic Maintenance of Materialized views Common for databases to serve two purposes

Record current inventory Analysts to study patterns

Analyst’s queries involve many aggregation and joins Not as sensitive to modifications They are updated (reconstructed) periodically

Once a day or week