CS 257 Database Principles
Presentation File for CS 257 Database Principles
Submitted to: Dr. T.Y. Lin
Submitted by: Saurabh Vishal (006530966) Roll ID: 115
Chapters Covered
Chapter 13: Secondary Storage
Chapter 15: Query Execution
Chapter 16: Query Compiler
Chapter 18: Concurrency Control
Chapter 21: Information Integration
Chapter 13 SECONDARY STORAGE
13.1 Secondary Storage Management
13.2 Disks
13.3 Accelerating Access to Secondary Storage
13.4 Recovery from Disk Crashes
13.5 Arranging Data on Disk
13.6 Representing Block and Record Addresses
13.7 Variable-Length Data and Records
13.8 Record Modification
13.1 Secondary Storage Management
Database systems always involve secondary storage: disks and other devices that store large amounts of data that persist over time.
Memory Hierarchy
A typical computer system has several different components in which data may be stored.
These components have data capacities ranging over at least seven orders of magnitude and also have access speeds ranging over seven or more orders of magnitude.
Diagram of Memory hierarchy
Cache Memory
The lowest level of the hierarchy is the cache. Some cache is found on the same chip as the microprocessor itself, and additional level-2 cache is found on another chip.
Data and instructions are moved to cache from main memory when they are needed by the processor.
Cache data can be accessed by the processor in a few nanoseconds.
Main Memory
At the center of the action is the computer's main memory. We may think of everything that happens in the computer (instruction executions and data manipulations) as working on information that is resident in main memory.
Typical times to move data from main memory to the processor or cache are in the 10-100 nanosecond range.
Secondary Storage
Essentially every computer has some sort of secondary storage, which is a form of storage that is both significantly slower and significantly more capacious than main memory.
The time to transfer a single byte between disk and main memory is around 10 milliseconds.
Tertiary Storage
As capacious as a collection of disk units can be, there are databases much larger than what can be stored on the disk(s) of a single machine, or even of a substantial collection of machines.
Tertiary storage devices have been developed to hold data volumes measured in terabytes.
Tertiary storage is characterized by significantly higher read/write times than secondary storage, but also by much larger capacities and smaller cost per byte than is available from magnetic disks.
Transfer of Data Between Levels
Normally, data moves between adjacent levels of the hierarchy.
At the secondary and tertiary levels, accessing the desired data or finding the desired place to store data takes a great deal of time, so each level is organized to transfer large amounts of data to or from the level below whenever any data at all is needed.
The disk is organized into disk blocks, and entire blocks are moved to or from a contiguous section of main memory called a buffer.
Volatile and Nonvolatile Storage
A volatile device "forgets" what is stored in it when the power goes off.
A nonvolatile device, on the other hand, is expected to keep its contents intact even for long periods when the device is turned off or there is a power failure.
Magnetic and optical materials hold their data in the absence of power. Thus, essentially all secondary and tertiary storage devices are nonvolatile. On the other hand, main memory is generally volatile.
Virtual Memory
When we write programs, the data we use (variables of the program, files read, and so on) occupies a virtual memory address space.
Many machines use a 32-bit address space; that is, there are 2^32 bytes, or 4 gigabytes, of addresses.
The operating system manages virtual memory, keeping some of it in main memory and the rest on disk.
Transfer between memory and disk is in units of disk blocks.
13.2 Disks
The use of secondary storage is one of the important characteristics of a DBMS, and secondary storage is almost exclusively based on magnetic disks
Mechanics of Disks
The two principal moving pieces of a disk drive are a disk assembly and a head assembly.
The disk assembly consists of one or more circular platters that rotate around a central spindle
The upper and lower surfaces of the platters are covered with a thin layer of magnetic material, on which bits are stored.
Diagram for Disk Management
0’s and 1’s are represented by different patterns in the magnetic material.
A common diameter for the disk platters is 3.5 inches.
Each disk surface is organized into tracks, which are concentric circles on a single platter.
The tracks that are at a fixed radius from the center, among all the surfaces, form one cylinder.
Tracks are organized into sectors, which are segments of the circle separated by gaps that are not magnetized to represent either 0’s or 1’s.
The second movable piece, the head assembly, holds the disk heads.
The Disk Controller
One or more disk drives are controlled by a disk controller, which is a small processor capable of:
Controlling the mechanical actuator that moves the head assembly to position the heads at a particular radius.
Transferring bits between the desired sector and the main memory.
Selecting a surface from which to read or write, and selecting a sector from the track on that surface that is under the head.
An example of a simple single-processor computer system is shown in the next slide.
Simple computer Architecture System
Disk Access Characteristics
Seek Time: The disk controller positions the head assembly at the cylinder containing the track on which the block is located. The time to do so is the seek time.
Rotational Latency: The disk controller waits while the first sector of the block moves under the head. This time is called the rotational latency.
Transfer Time: All the sectors and the gaps between them pass under the head, while the disk controller reads or writes data in these sectors. This delay is called the transfer time.
The sum of the seek time, rotational latency, and transfer time is the latency of the disk.
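As a rough numeric illustration of this sum, the sketch below adds the three components. The values are assumptions: 6.46 ms average seek and the 10.76 ms total match the Megatron 747 figures quoted later in these slides, 4.17 ms is about half of the 8.33 ms rotation time quoted there, and 0.13 ms is taken as an illustrative transfer time for one block.

```python
def disk_latency_ms(seek_ms, rotational_latency_ms, transfer_ms):
    """Latency of one block access: the sum of its three components."""
    return seek_ms + rotational_latency_ms + transfer_ms


# Assumed Megatron 747 style values (see the lead-in above).
latency = disk_latency_ms(seek_ms=6.46, rotational_latency_ms=4.17, transfer_ms=0.13)
print(f"average latency: {latency:.2f} ms")
```

Seek time and rotational latency dominate; the transfer itself is a small fraction of the total.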
13.3 Accelerating Access to Secondary Storage
Several approaches for accessing data in secondary storage more efficiently:
Place blocks that are accessed together in the same cylinder.
Divide the data among multiple disks.
Mirror disks.
Use disk-scheduling algorithms.
Prefetch blocks into main memory.
Scheduling latency: added delay in accessing data caused by a disk-scheduling algorithm.
Throughput: the number of disk accesses per second that the system can accommodate.
The I/O Model of Computation
The number of block accesses (disk I/O’s) is a good approximation of the running time of an algorithm, and it should be minimized.
Ex 13.3: You want to have an index on R to identify the block on which the desired tuple appears, but not where on the block it resides.
For the Megatron 747 (M747) example, it takes 11ms to read a 16K block.
A standard microprocessor can execute millions of instructions in 11ms, making any delay in searching the block for the desired tuple negligible.
Organizing Data by Cylinders
If we read all blocks on a single track or cylinder consecutively, then we can neglect all but first seek time and first rotational latency.
Ex 13.4: We request 1024 blocks of M747.
If the data is randomly distributed, the average latency is 10.76ms per block by Ex 13.2, making the total latency about 11s.
If all blocks are stored consecutively on 1 cylinder:
6.46ms + 8.33ms * 16 = 139ms
(1 average seek) + (time per rotation) * (# rotations)
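The arithmetic of Ex 13.4 can be checked with a few lines of Python. This is only a sketch using the figures quoted above; the constant names are made up for illustration.

```python
BLOCKS = 1024
AVG_LATENCY_MS = 10.76   # per randomly placed block (Ex 13.2)
AVG_SEEK_MS = 6.46       # one average seek
ROTATION_MS = 8.33       # time per rotation
ROTATIONS = 16           # rotations needed to read the whole cylinder

# Random placement: every block pays the full average latency.
random_total_ms = BLOCKS * AVG_LATENCY_MS

# Consecutive placement on one cylinder: one seek, then 16 rotations.
cylinder_total_ms = AVG_SEEK_MS + ROTATION_MS * ROTATIONS

print(f"random:   {random_total_ms / 1000:.1f} s")
print(f"cylinder: {cylinder_total_ms:.2f} ms")
```

The second figure comes out at about 139.7 ms, which the slide gives as 139ms; either way it is roughly 80 times faster than the random layout.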
Using Multiple Disks
If we have n disks, read/write performance can increase by up to a factor of n.
Striping: distributing a relation across multiple disks following this pattern:
Data on disk 1: R1, R1+n, R1+2n, …
Data on disk 2: R2, R2+n, R2+2n, …
…
Data on disk n: Rn, Rn+n, Rn+2n, …
Ex 13.5: We request 1024 blocks with n = 4.
6.46ms + (8.33ms * (16/4)) = 39.8ms
(1 average seek) + (time per rotation) * (# rotations)
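A minimal sketch of the striping pattern and of the Ex 13.5 estimate follows. The function names are hypothetical, blocks and disks are numbered from 1 as in the pattern above, and the timing defaults are the M747 figures used in Ex 13.4.

```python
def blocks_on_disk(disk, n_disks, total_blocks):
    """Blocks (numbered from 1) that round-robin striping places on `disk` (1..n_disks)."""
    return list(range(disk, total_blocks + 1, n_disks))


def striped_read_ms(n_disks, avg_seek_ms=6.46, rotation_ms=8.33, rotations=16):
    """One average seek plus each disk's share of the rotations, done in parallel."""
    return avg_seek_ms + rotation_ms * (rotations / n_disks)


print(blocks_on_disk(1, 4, 12))        # blocks 1, 1+n, 1+2n for n = 4
print(f"{striped_read_ms(4):.2f} ms")  # the Ex 13.5 estimate
```

With n = 1 the same function reproduces the single-cylinder estimate of Ex 13.4.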
Mirroring Disks
Mirroring disks: having 2 or more disks hold identical copies of data.
Benefit 1: If n disks are mirrors of each other, the system can survive a crash of n-1 disks.
Benefit 2: If we have n disks, read performance increases by a factor of n.
Performance increases further by having the controller select the disk which has its head closest to desired data block for each read.
Disk Scheduling and the Elevator Problem
The disk controller runs this algorithm to select which of several requests to process first.
Pseudocode:
requests[] // array of all non-processed data requests
upon receiving new data request:
    requests[].add(new request)
while (requests[] is not empty):
    move head to next location
    if (head location is at data in requests[]):
        retrieve data
        remove data from requests[]
    if (head reaches end):
        reverse head direction
Disk Scheduling and the Elevator Problem (cont’d)
Events:
Time 0: head starting point; request data at 8000, 24000, and 56000
Time 4.3: get data at 8000
Time 10: request data at 16000
Time 13.6: get data at 24000
Time 20: request data at 64000
Time 26.9: get data at 56000
Time 30: request data at 40000
Time 34.2: get data at 64000
Time 45.5: get data at 40000
Time 56.8: get data at 16000
Figure: head position over cylinders 8000, 16000, 24000, 32000, 40000, 48000, 56000, 64000
Disk Scheduling and the Elevator Problem (cont’d)
Elevator Algorithm        FIFO Algorithm
data      time            data      time
8000      4.3             8000      4.3
24000     13.6            24000     13.6
56000     26.9            56000     26.9
64000     34.2            16000     42.2
40000     45.5            64000     59.5
16000     56.8            40000     70.8
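The two schedules above can be reproduced with a small simulation. This is a sketch under stated assumptions, not the controller's actual logic: the seek is modeled as 1 ms plus 1 ms per 4000 cylinders travelled (zero if the head does not move), rotational latency plus transfer is a flat 4.3 ms per access, and the head starts at cylinder 8000 moving toward higher cylinders. The arrival times are the ones in the event list for this example; the function names are hypothetical.

```python
def service_ms(head, cylinder):
    """Assumed cost model: seek (1 ms + 1 ms per 4000 cylinders) plus 4.3 ms."""
    distance = abs(cylinder - head)
    seek = 0.0 if distance == 0 else 1.0 + distance / 4000.0
    return seek + 4.3


def simulate(arrivals, start, policy):
    """Return [(cylinder, completion_time)] for policy 'elevator' or 'fifo'."""
    future = sorted(arrivals, key=lambda a: a[0])  # stable: keeps arrival order
    queue, done = [], []
    now, head, direction = 0.0, start, +1
    while future or queue:
        while future and future[0][0] <= now:      # admit requests that have arrived
            queue.append(future.pop(0)[1])
        if not queue:                              # idle until the next arrival
            now = future[0][0]
            continue
        if policy == "fifo":
            target = queue[0]
        else:
            ahead = [c for c in queue if (c - head) * direction >= 0]
            if not ahead:                          # nothing ahead: reverse direction
                direction = -direction
                ahead = [c for c in queue if (c - head) * direction >= 0]
            target = min(ahead, key=lambda c: abs(c - head))
        queue.remove(target)
        now += service_ms(head, target)
        head = target
        done.append((target, round(now, 1)))
    return done


ARRIVALS = [(0, 8000), (0, 24000), (0, 56000), (10, 16000), (20, 64000), (30, 40000)]
elevator = simulate(ARRIVALS, start=8000, policy="elevator")
fifo = simulate(ARRIVALS, start=8000, policy="fifo")
print(elevator)
print(fifo)
```

Under these assumptions the elevator policy finishes the last request at 56.8 and FIFO at 70.8, because FIFO makes the head sweep back and forth across the disk.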
Prefetching and Large-Scale Buffering
If, at the application level, we can predict the order in which blocks will be requested, we can load them into main memory before they are needed.
13.4 Recovery from Disk Crashes
Ways to recover data
The most serious mode of failure for disks is a “head crash,” in which data is permanently destroyed.
To reduce the risk of data loss from disk crashes, there are a number of schemes known as RAID (Redundant Arrays of Independent Disks) schemes.
Each scheme starts with one or more disks that hold the data (the data disks) and adds one or more disks that hold information completely determined by the contents of the data disks (the redundant disks).
Mirroring / Redundancy Technique
The mirroring scheme is referred to as RAID level 1 protection against data loss.
In this scheme we mirror each disk.
One of the disks is called the data disk and the other the redundant disk.
In this case the only way data can be lost is if there is a second disk crash while the first crash is being repaired.
Parity Blocks
RAID level 4 scheme uses only one redundant disk no matter how many data disks there are.
In the redundant disk, the ith block consists of the parity checks for the ith blocks of all the data disks.
That is, the jth bits of the ith blocks of all the data disks and the redundant disk must together contain an even number of 1’s; the redundant disk's bit is chosen to make this condition true.
Parity Blocks – Reading
Reading a block from a data disk is the same as reading a block from any disk.
• Alternatively, we could read the corresponding block from each of the other disks and compute the block of the disk we want to read by taking the modulo-2 sum.
disk 2: 10101010
disk 3: 00111000
disk 4: 01100010
If we take the modulo-2 sum of the bits in each column, we get
disk 1: 11110000
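The modulo-2 sum is a bitwise XOR, so the example above can be checked directly. The sketch below treats each one-byte block as a Python integer; the helper name `modulo2_sum` is made up for illustration.

```python
def modulo2_sum(blocks):
    """Bitwise modulo-2 sum (XOR) of equally sized blocks given as integers."""
    result = 0
    for block in blocks:
        result ^= block
    return result


disk2, disk3, disk4 = 0b10101010, 0b00111000, 0b01100010

# Reading disk 1 indirectly: combine the corresponding blocks of the other disks.
disk1 = modulo2_sum([disk2, disk3, disk4])
print(f"disk 1: {disk1:08b}")

# The redundant disk (disk 4) is itself the modulo-2 sum of the data disks.
parity = modulo2_sum([disk1, disk2, disk3])
print(f"disk 4: {parity:08b}")
```

The same function serves for computing the redundant block and for reconstructing any single missing block, which is why one redundant disk suffices against one crash.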
Parity Block - Writing
• When we write a new block of a data disk, we need to change that block of the redundant disk as well.
• One approach is to read the corresponding blocks of the other n-1 data disks, compute the modulo-2 sum together with the new block, and write the result to the redundant disk. This requires n-1 reads of data blocks, one write of the data block, and one write of the redundant disk block.
Total = n+1 disk I/Os
Continued………………
• Better approach will require only four disk I/Os
1. Read the old value of the data block being changed.
2. Read the corresponding block of the redundant disk.
3. Write the new data block.
4. Recalculate and write the block of the redundant disk.
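The four steps work because the new redundant block is the old redundant block plus (modulo 2) the bits that changed in the data block. Below is a sketch under the assumption that blocks are one-byte integers; the new value written to disk 2 is an arbitrary illustrative choice, and the function name is hypothetical.

```python
def updated_parity(old_parity, old_data, new_data):
    """Step 4: flip exactly those parity bits where the data block changed."""
    return old_parity ^ old_data ^ new_data


disk1, disk2, disk3 = 0b11110000, 0b10101010, 0b00111000
old_parity = disk1 ^ disk2 ^ disk3        # redundant block before the write

new_disk2 = 0b11001100                    # illustrative new contents for disk 2
new_parity = updated_parity(old_parity, disk2, new_disk2)

# Cross-check against the n+1 I/O approach that rereads every data disk.
recomputed = disk1 ^ new_disk2 ^ disk3
print(f"{new_parity:08b} {recomputed:08b}")
```

Only two reads and two writes are needed regardless of how many data disks there are.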
Parity Blocks – Failure Recovery
If any of the data disks crashes, we just have to compute the modulo-2 sum to recover the disk.
Suppose that disk 2 fails. We need to recompute each block of the replacement disk. We are given the corresponding blocks of the first and third data disks and the redundant disk, so the situation looks like:
disk 1: 11110000
disk 2: ????????
disk 3: 00111000
disk 4: 01100010
If we take the modulo-2 sum of each column, we deduce that the missing block of disk 2 is: 10101010
An Improvement: RAID 5
• RAID 4 is effective in preserving data unless there are two simultaneous disk crashes.
• Whatever scheme we use for updating the disks, we need to read and write the redundant disk's block. If there are n data disks, then the number of disk writes to the redundant disk will be n times the average number of writes to any one data disk.
• However we do not have to treat one disk as the redundant disk and the others as data disks. Rather, we could treat each disk as the redundant disk for some of the blocks. This improvement is often called RAID level 5.
Continued: An Improvement: RAID 5
• For instance, if there are n + 1 disks numbered 0 through n, we could treat the ith cylinder of disk j as redundant if j is the remainder when i is divided by n+1.
• For example, n = 3 so there are 4 disks. The first disk, numbered 0, is redundant for its cylinders numbered 4, 8, 12, and so on, because these are the numbers that leave remainder 0 when divided by 4.
• The disk numbered 1 is redundant for blocks numbered 1, 5, 9, and so on; disk 2 is redundant for blocks 2, 6, 10, . . ., and disk 3 is redundant for 3, 7, 11, . . . .
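The assignment rule is a single remainder operation. A minimal sketch, with a hypothetical function name:

```python
def redundant_disk(cylinder, n):
    """Disk (0..n) that holds the redundant block for `cylinder`, given n + 1 disks."""
    return cylinder % (n + 1)


n = 3  # four disks, numbered 0 through 3
for disk in range(n + 1):
    cylinders = [i for i in range(1, 13) if redundant_disk(i, n) == disk]
    print(f"disk {disk} is redundant for cylinders {cylinders}")
```

Because consecutive cylinders rotate through all the disks, the extra write load of maintaining redundancy is spread evenly instead of falling on one disk.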
Coping With Multiple Disk Crashes
• The theory of error-correcting codes, in particular Hamming codes, leads to RAID level 6.
• With this strategy, two simultaneous crashes are correctable.
Disks 1 to 4 hold data; disks 5 to 7 are redundant.
The bits of disk 5 are the modulo-2 sum of the corresponding bits of disks 1, 2, and 3.
The bits of disk 6 are the modulo-2 sum of the corresponding bits of disks 1, 2, and 4.
The bits of disk 7 are the modulo-2 sum of the corresponding bits of disks 1, 3, and 4.
Coping With Multiple Disk Crashes – Reading/Writing
• We may read data from any data disk normally.
• To write a block of some data disk, we compute the modulo-2 sum of the new and old versions of that block. These bits are then added, in a modulo-2 sum, to the corresponding blocks of all those redundant disks that have 1 in a row in which the written disk also has 1.
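The three parity rules for disks 5, 6, and 7 can be sketched as follows, together with a check that any two lost disks can be rebuilt. The data values are arbitrary one-byte blocks chosen for illustration, and all function names are hypothetical; the recovery loop simply looks for a parity group with exactly one missing member.

```python
# Each group lists the disks whose corresponding bits must sum to 0 modulo 2;
# the last member of each group is the redundant disk.
GROUPS = [(1, 2, 3, 5), (1, 2, 4, 6), (1, 3, 4, 7)]


def xor_all(values):
    result = 0
    for value in values:
        result ^= value
    return result


def add_redundancy(data):
    """Given blocks for data disks 1..4, compute the blocks of disks 5..7."""
    disks = dict(data)
    for *sources, target in GROUPS:
        disks[target] = xor_all(disks[d] for d in sources)
    return disks


def recover(disks, failed):
    """Rebuild up to two failed disks from the surviving ones."""
    known = {d: v for d, v in disks.items() if d not in failed}
    missing = set(failed)
    while missing:
        progress = False
        for group in GROUPS:
            lost = [d for d in group if d in missing]
            if len(lost) == 1:  # the only unknown in this group can be solved for
                known[lost[0]] = xor_all(known[d] for d in group if d != lost[0])
                missing.discard(lost[0])
                progress = True
        if not progress:
            raise ValueError("too many failed disks to recover")
    return known


def all_pairs_recoverable(disks):
    return all(recover(disks, {a, b}) == disks
               for a in range(1, 8) for b in range(a + 1, 8))


full = add_redundancy({1: 0b11110000, 2: 0b10101010, 3: 0b00111000, 4: 0b01000001})
print({d: f"{v:08b}" for d, v in full.items()})
print(all_pairs_recoverable(full))
```

Every disk belongs to a different combination of groups, so for any two failed disks some group contains exactly one of them, and that disk can be restored first.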
13.5 Arranging data on disk
Data elements are represented as records, which are stored in consecutive bytes in the same disk block.
Basic layout techniques for storing data:
Fixed-length records
Allocation criterion: data should start at a word boundary.
Fixed-length record header:
A pointer to the record schema.
The length of the record.
Timestamps to indicate when the record was last modified or last read.
Example
CREATE TABLE employee(
    name CHAR(30) PRIMARY KEY,
    address VARCHAR(255),
    gender CHAR(1),
    birthdate DATE
);
Data should start at a word boundary and contain a header and four fields: name, address, gender, and birthdate.
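To make the word-boundary rule concrete, the sketch below computes field offsets for this record. Everything numeric here is an assumption for illustration: 4-byte words, a 12-byte header (schema pointer, record length, timestamp), the VARCHAR(255) stored as a fixed 256 bytes, and the DATE stored as a 10-byte string.

```python
def align(size, word=4):
    """Round `size` up to the next multiple of the word size."""
    return (size + word - 1) // word * word


def layout(fields, header_bytes=12, word=4):
    """Return ({field: offset}, total_record_length) with word-aligned fields."""
    offset = align(header_bytes, word)
    offsets = {}
    for name, size in fields:
        offsets[name] = offset
        offset += align(size, word)
    return offsets, offset


# Assumed storage sizes in bytes for the employee record.
EMPLOYEE = [("name", 30), ("address", 256), ("gender", 1), ("birthdate", 10)]
offsets, record_length = layout(EMPLOYEE)
print(offsets, record_length)
```

Under these assumptions the 30-byte name is padded to 32 bytes and the 1-byte gender to 4, so that each following field starts on a word boundary.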
Packing Fixed-Length Records into Blocks
Records are stored in blocks on the disk, and they move into main memory when we need to update or access them.
A block header is written first, and it is followed by a series of records.
Block header contains the following information:
Links to one or more blocks that are part of a network of blocks.
Information about the role played by this block in such a network.
Information about the relation, the tuples in this block belong to.
A "directory" giving the offset of each record in the block.
Time stamp(s) to indicate time of the block's last modification and/or access.
Example
Along with the header, we pack as many records as we can into one block, as shown in the figure; the remaining space is unused.
13.6 Representing Block and Record Addresses
Address of a block and record
In main memory:
Address of the block is the virtual memory address of its first byte
Address of the record within the block is the virtual memory address of the first byte of the record
In secondary memory:
A sequence of bytes describes the location of the block in the overall system: the device ID for the disk, cylinder number, etc.
Addresses in Client-Server Systems
The addresses in the address space are represented in two ways:
Physical addresses: byte strings that determine the place within the secondary storage system where the record can be found.
Logical addresses: arbitrary strings of bytes of some fixed length.
Physical address bits are used to indicate:
Host to which the storage is attached
Identifier for the disk
Number of the cylinder
Number of the track
Offset of the beginning of the record
Map Table relates logical addresses to physical addresses.
Figure: map table with two columns, Logical Address and Physical Address
Continued………….
Logical and Structural Addresses
Purpose of a logical address: it gives more flexibility when we
Move the record around within the block
Move the record to another block
It also gives us the option of deciding what to do when a record is deleted.
Figure: a block containing a header, an offset table, unused space, and Records 4, 3, 2, and 1
Pointer Swizzling
Having pointers is common in object-relational database systems
It is important to learn about the management of pointers
Every data item (block, record, etc.) has two addresses:
Database address: its address on the disk
Memory address: its address in virtual memory, if the item is currently in memory
Continued………………
Translation Table: Maps database address to memory address
All addressable items in the database have entries in the map table, while only those items currently in memory are mentioned in the translation table
Figure: translation table with two columns, Dbaddr (database address) and Mem-addr (memory address)
Continued………..
A pointer consists of the following two fields:
A bit indicating the type of address
The database or memory address
Figure: Block 1 copied from disk into memory, with one swizzled and one unswizzled pointer; Block 2 remains on disk
Example
Block 1 has a record with pointers to a second record on the same block and to a record on another block
If Block 1 is copied to memory:
The first pointer, which points within Block 1, can be swizzled so it points directly to the memory address of the target record
Since Block 2 is not in memory, we cannot swizzle the second pointer
Continued…………
Three types of swizzling:
Automatic swizzling: as soon as a block is brought into memory, swizzle all relevant pointers.
Swizzling on demand: only swizzle a pointer if and when it is actually followed.
No swizzling: pointers are never swizzled; they are accessed using the database address.
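The pointer fields, the translation table, and the Block 1 / Block 2 example can be sketched in a few lines. This is an illustrative model only: the class and function names are hypothetical, addresses are plain integers, and the specific address values (1000, 2000, 50) are made up.

```python
from dataclasses import dataclass


@dataclass
class Pointer:
    address: int             # database address or memory address
    swizzled: bool = False   # the type bit: True means `address` is a memory address


class TranslationTable:
    """Maps database addresses to memory addresses for items currently in memory."""

    def __init__(self):
        self._db_to_mem = {}
        self._mem_to_db = {}

    def add(self, db_addr, mem_addr):
        self._db_to_mem[db_addr] = mem_addr
        self._mem_to_db[mem_addr] = db_addr

    def to_memory(self, db_addr):
        return self._db_to_mem.get(db_addr)   # None if the item is not in memory

    def to_database(self, mem_addr):
        return self._mem_to_db[mem_addr]


def swizzle(ptr, table):
    """Swizzle `ptr` if its target is in memory; return True on success."""
    if ptr.swizzled:
        return True
    mem_addr = table.to_memory(ptr.address)
    if mem_addr is None:
        return False
    ptr.address, ptr.swizzled = mem_addr, True
    return True


def unswizzle(ptr, table):
    """Restore the database address before the block goes back to disk."""
    if ptr.swizzled:
        ptr.address, ptr.swizzled = table.to_database(ptr.address), False


table = TranslationTable()
table.add(db_addr=1000, mem_addr=50)   # target record in Block 1, now in memory
same_block = Pointer(address=1000)     # points within Block 1
other_block = Pointer(address=2000)    # points into Block 2, still on disk

first_ok = swizzle(same_block, table)
second_ok = swizzle(other_block, table)
print(same_block, first_ok)
print(other_block, second_ok)
```

Calling `swizzle` only when a pointer is followed gives swizzling on demand; calling it for every pointer as a block is loaded gives automatic swizzling.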
Programmer control of Swizzling
Unswizzling
When a block is moved from memory back to disk, all pointers must go back to database (disk) addresses
Use the translation table again
It is important to have an efficient data structure for the translation table
Pinned Records and Blocks
A block in memory is said to be pinned if it cannot be written back to disk safely.
If block B1 has a swizzled pointer to an item in block B2, then B2 is pinned
To unpin a block, we must unswizzle any pointers to it
Keep in the translation table the places in memory holding swizzled pointers to that item
Unswizzle those pointers (use the translation table to replace the memory addresses with database (disk) addresses)
13.7 Variable length data and record
A simple but effective scheme is to put all fixed-length fields ahead of the variable-length fields. We then place in the record header:
1. The length of the record.
2. Pointers to (i.e., offsets of) the beginnings of all the variable-length fields. However, if the variable-length fields always appear in the same order, then the first of them needs no pointer; we know it immediately follows the fixed-length fields.
Records With Repeating Fields
A similar situation occurs if a record contains a variable number of occurrences of a field F, but the field itself is of fixed length.
It is sufficient to group all occurrences of field F together and put in the record header a pointer to the first.
We can locate all the occurrences of the field F as follows.
Let the number of bytes devoted to one instance of field F be L.
We then add to the offset for the field F all integer multiples of L, starting at 0, then L, 2L, 3L, and so on.
Eventually, we reach the offset of the field following F, whereupon we stop.
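The procedure for locating the occurrences of F amounts to stepping through offsets in strides of L. A minimal sketch, with a hypothetical function name and made-up example offsets:

```python
def occurrence_offsets(first_offset, length, next_field_offset):
    """Offsets of every occurrence of a fixed-length repeating field F.

    first_offset:      offset of the first occurrence (stored in the record header)
    length:            L, the number of bytes in one occurrence
    next_field_offset: offset of the field following F, where we stop
    """
    return list(range(first_offset, next_field_offset, length))


# Example: F starts at byte 40, each occurrence is 8 bytes, the next field is at byte 64.
print(occurrence_offsets(40, 8, 64))
```

If the next field starts right at the first offset, the list is empty, which corresponds to a record with no occurrences of F.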
An alternative representation is to keep the record of fixed length, and put the variable-length portion (be it fields of variable length or fields that repeat an indefinite number of times) on a separate block. In the record itself we keep:
Pointers to the place where each repeating field begins, and
Either how many repetitions there are, or where the repetitions end.
Storing variable-length fields separately from the record
Variable-Format Records
The simplest representation of variable-format records is a sequence of tagged fields, each of which consists of:
1. Information about the role of this field, such as:
(a) The attribute or field name,
(b) The type of the field, if it is not apparent from the field name and some readily available schema information, and
(c) The length of the field, if it is not apparent from the type.
2. The value of the field.
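One possible byte encoding of tagged fields is sketched below. The concrete format is an assumption made for illustration (a 1-byte name length, the name, a 1-character type code, a 2-byte value length, then the value); the slides do not prescribe any particular layout, and the function names are hypothetical.

```python
import struct


def encode(fields):
    """Encode (name, type_code, value_bytes) triples as a sequence of tagged fields."""
    out = bytearray()
    for name, type_code, value in fields:
        name_bytes = name.encode("utf-8")
        out += struct.pack(">B", len(name_bytes)) + name_bytes                # (a) field name
        out += struct.pack(">cH", type_code.encode("ascii"), len(value))      # (b) type, (c) length
        out += value                                                          # 2. the value
    return bytes(out)


def decode(data):
    """Inverse of encode: walk the bytes and rebuild the list of triples."""
    fields, pos = [], 0
    while pos < len(data):
        (name_len,) = struct.unpack_from(">B", data, pos)
        pos += 1
        name = data[pos:pos + name_len].decode("utf-8")
        pos += name_len
        type_code, value_len = struct.unpack_from(">cH", data, pos)
        pos += 3
        value = data[pos:pos + value_len]
        pos += value_len
        fields.append((name, type_code.decode("ascii"), value))
    return fields


record = [("name", "S", b"Clint Eastwood"), ("restaurant", "S", b"Hog's Breath Inn")]
encoded = encode(record)
print(len(encoded), decode(encoded))
```

Since every field carries its own name, type, and length, two records of the same relation can hold entirely different sets of fields.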
There are at least two reasons why tagged fields would make sense.
1. Information integration applications - Sometimes, a relation has been constructed from several earlier sources, and these sources have different kinds of information. For instance, our movie star information may have come from several sources, some of which record birthdates, some of which give addresses and others not, and so on. If there are not too many fields, we are probably best off leaving NULL those values we do not know.
2. Records with a very flexible schema - If many fields of a record can repeat and/or not appear at all, then even if we know the schema, tagged fields may be useful. For instance, medical records may contain information about many tests, but there are thousands of possible tests, and each patient has results for relatively few of them.
A record with tagged fields
Records That Do Not Fit in a Block
These large values have a variable length, but even if the length is fixed for all values of the type, we need to use some special techniques to represent these values. In this section we shall consider a technique called “spanned records" that can be used to manage records that are larger than blocks.
Spanned records also are useful in situations where records are smaller than blocks, but packing whole records into blocks wastes significant amounts of space. For both these reasons, it is sometimes desirable to allow records to be split across two or more blocks. The portion of a record that appears in one block is called a record fragment.
If records can be spanned, then every record and record fragment requires some extra header information:
1. Each record or fragment header must contain a bit telling whether or not it is a fragment.
2. If it is a fragment, then it needs bits telling whether it is the first or last fragment for its record.
3. If there is a next and/or previous fragment for the same record, then the fragment needs pointers to these other fragments.
Storing spanned records across blocks
BLOBS
• Binary, Large OBjectS = BLOBS
• BLOBS can be images, movies, audio files and other very large values that can be stored in files.
• Storing BLOBS
– Stored in several blocks.
– Preferable to store them consecutively on a cylinder or multiple disks for efficient retrieval.
• Retrieving BLOBS
– A client retrieving a 2 hour movie may not want it all at the same time.
– Retrieving a specific part of the large data requires an index structure to make it efficient. (Example: An index by seconds on a movie BLOB.)
Column Stores
An alternative to storing tuples as records is to store each column as a record. Since an entire column of a relation may occupy far more than a single block, these records may span many blocks, much as long files do. If we keep the values in each column in the same order, then we can reconstruct the relation from the column records.
13.8 Record Modification
A data manipulation operation that changes stored records is called record modification.
Example: record structure for a PERSON table
CREATE TABLE PERSON (
    NAME CHAR(30),
    ADDRESS CHAR(256),
    GENDER CHAR(1),
    BIRTHDATE CHAR(10)
);
Types of Records
Fixed-length records
• CREATE TABLE SJSUSTUDENT(STUDENT_ID INT(9) NOT NULL, PHONE_NO INT(10) NOT NULL);
Variable-length records
• CREATE TABLE SJSUSTUDENT(STUDENT_ID INT(9) NOT NULL, NAME VARCHAR(100), ADDRESS VARCHAR(100), PHONE_NO INT(10) NOT NULL);
Record modification
• Insert, update & delete
Structure of a Block & Records
Various records are grouped together and stored together in memory in blocks
Figure: structure of a block
BLOCKS & RECORDS
If records need not be in any particular order, then just find a block with enough empty space
We keep track of all records/tuples in a relation/table using index structures and file organization concepts
Inserting New Records
If records are not required to be in a particular order, just find a block with empty space and place the record in that block, e.g., heap files.
What if the records are to be kept in a particular order (e.g., sorted by primary key)?
Locate the appropriate block and check whether space is available in it; if so, place the record in the block.
Inserting New Records
We may have to slide the Records in the Block to place the Record at an appropriate place in the Block and suitably edit the block header.
What If The Block Is Full ?
We need to keep the record in a particular block, but the block is full. How do we deal with it?
We find room outside the block.
There are 2 approaches to finding room for the record:
I. Find space on a nearby block
II. Create an overflow block
Approaches to Finding Room for a Record
Find space on a nearby block
Block B1 has no space
If space is available on block B2, move records of B1 to B2
If there are external pointers to records of B1 that moved to B2, leave a forwarding address in the offset table of B1
Approaches to Finding Room for a Record
Create an overflow block
Each block B has in its header a pointer to an overflow block where additional records of B can be placed
Deletion
Try to reclaim the space made available in a block after the deletion of a record
If an offset table is used for storing information about the records in the block, then rearrange/slide the remaining records.
If sliding of records is not possible, then maintain a SPACE-AVAILABLE LIST to keep track of the space available in the block.
Tombstone
What about pointers to deleted records?
A tombstone is placed in place of each deleted record
A tombstone is a bit placed at the first byte of the deleted record to indicate whether the record was deleted (0: not deleted, 1: deleted)
A tombstone is permanent
Updating Records
For Fixed-Length Records, there is no effect on the storage system
For variable-length records:
• If the length increases, as with insertion, we slide the records
• If the length decreases, as with deletion, we update the space-available list and recover the space or eliminate the overflow blocks.
Chapter 13 ends.
Chapter 15 starts.
Chapter 15
QUERY EXECUTION
15.1 Introduction to Physical-Query-Plan Operators
15.2 One-Pass Algorithms for Database Operations
15.3 Nested-Loop Joins
15.4 Two-Pass Algorithms Based on Sorting
15.5 Two-Pass Algorithms Based on Hashing
15.6 Index-Based Algorithms
15.7 Buffer Management
15.8 Algorithms Using More Than Two Passes
15.9 Parallel Algorithms for Relational Operations
Query Compilation
Query compilation is divided into 3 major steps:
Parsing, in which a parse tree representing the query and its structure is constructed.
Query rewrite, in which the parse tree is converted to an initial query plan, which is an algebraic representation of the query.
Physical Plan Generation, where the abstract query plan is converted into physical query plan.
Query Compilation
15.1 Introduction to Physical-Query-Plan Operators
Physical query plans are built from operators, each of which implements one step of the plan.
Physical operators can be implementations of the operators of relational algebra.
However, they can also be operators that are not part of relational algebra, like the ‘scan’ operator used for scanning tables.
For scanning a table
We have two different approaches:
Table scan: relation R is stored in secondary memory with its tuples arranged in blocks. It is possible to get the blocks one by one.
Index scan: if there is an index on any attribute of relation R, then we can use this index to get all the tuples of R.
Sorting is a major topic when scanning a table
Reasons why we need sorting while scanning tables:
Various algorithms for relational-algebra operations require one or both of their arguments to be sorted relations.
The query could include an ORDER BY clause, requiring that a relation be sorted.
A physical-query-plan operator sort-scan takes a relation R and a specification of the attributes on which the sort is to be made, and produces R in that sorted order.
If we are to produce a relation R sorted by attribute a, and if there is a B-tree index on a, then an index scan is used.
If relation R is small enough to fit in main memory, then we can retrieve its tuples using a table scan and sort them in main memory.
Parameters for measuring costs
Parameters that mainly affect the performance of a query are:
1. The size of a block on disk and the size of main memory.
2. The buffer space available in main memory at the time the query executes.
3. The size of the input and the size of the output generated.
The number of disk I/O's needed for each of the scan operators is:
1. If a relation R is clustered, a table scan takes approximately B disk I/O's, where B is the number of blocks on which R is stored.
2. If R is clustered but requires a two-phase multiway merge sort, the total number of disk I/O's is 3B.
3. If R is not clustered, the number of disk I/O's is generally much higher, up to one per tuple.
CS 257 Database Principles
Implementation of Physical Operator
The three major functions are Open(), GetNext(), and Close().
Open() starts the process of getting tuples and initializes any data structures needed to perform the operation.
GetNext() returns the next tuple in the result.
Close() ends the iteration after all tuples have been obtained and releases any resources.
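The iterator interface above can be sketched in Python for a table scan. This is only an illustration: the class name TableScan, the helper drain, and the representation of a relation as a list of blocks (each a list of tuples) are assumptions, not part of the slides.

```python
class TableScan:
    """Hypothetical iterator over a relation stored as a list of blocks."""

    def __init__(self, blocks):
        self.blocks = blocks  # each block is a list of tuples

    def open(self):
        # Initialise the scan position: first tuple of the first block.
        self.block_no = 0
        self.offset = 0

    def get_next(self):
        # Skip exhausted (or empty) blocks, then return the next tuple.
        while self.block_no < len(self.blocks):
            block = self.blocks[self.block_no]
            if self.offset < len(block):
                t = block[self.offset]
                self.offset += 1
                return t
            self.block_no += 1
            self.offset = 0
        return None  # plays the role of "not found"

    def close(self):
        # Nothing to release in this in-memory sketch.
        pass


def drain(op):
    """Run any operator with open/get_next/close to completion."""
    op.open()
    out = []
    while (t := op.get_next()) is not None:
        out.append(t)
    op.close()
    return out
```

Because every operator exposes the same three calls, a consumer such as drain does not need to know whether it is reading from a scan, a selection, or a join.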
CS 257 Database Principles
15.2 One-Pass Algorithms for Database Operations
To transform a logical query plan into a physical query plan, we must choose an algorithm for each operator.
Main classes of algorithms:
• Sorting-based methods
• Hash-based methods
• Index-based methods
Division based on degree of difficulty and cost:
• 1-pass algorithms
• 2-pass algorithms
• 3-or-more-pass algorithms
CS 257 Database Principles
One-Pass Algorithm Methods
One-Pass Algorithms for Tuple-at-a-Time Operations (unary operations such as selection and projection):
• Read the blocks of R one at a time into an input buffer.
• Perform the operation on each tuple.
• Move the selected tuples or the projected tuples to the output buffer.
The disk I/O requirement for this process depends only on how the argument relation R is provided.
If R is initially on disk, then the cost is whatever it takes to perform a table scan or index scan of R.
CS 257 Database Principles
[Figures: Figure (1) shows the memory layout for a one-pass, tuple-at-a-time operation; Figure (2) shows the layout for duplicate elimination.]
Ref: Database Systems: The Complete Book, by Hector Garcia-Molina, Jeffrey D. Ullman, and Jennifer Widom
CS 257 Database Principles
One-Pass Algorithms for Unary, Full-Relation Operations
Duplicate Elimination
To eliminate duplicates, we can read each block of R one at a time, but for each tuple we need to make a decision as to whether:
1. It is the first time we have seen this tuple, in which case we copy it to the output, or
2. We have seen the tuple before, in which case we must not output this tuple.
One memory buffer holds one block of R's tuples, and the remaining M - 1 buffers can be used to hold a single copy of every tuple seen so far.
CS 257 Database Principles
Duplicate Elimination Continued
When a new tuple from R is considered, we compare it with all tuples seen so far.
• If it is not equal to any of them, we copy it to the output and add it to the in-memory list of tuples we have seen.
• If there are n tuples in main memory and we compare naively, each new tuple takes processor time proportional to n, so the complete operation takes processor time proportional to n².
We therefore need a main-memory structure that supports each of these operations efficiently:
• Add a new tuple.
• Tell whether a given tuple is already there.
Note: Basic data structures used for this kind of searching are:
• Hash table
• Balanced binary search tree
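A minimal sketch of one-pass duplicate elimination, assuming the relation is given as a list of blocks and that a Python set stands in for the hash table held in the M - 1 buffers. The function name distinct_one_pass is hypothetical.

```python
def distinct_one_pass(blocks):
    """Yield each distinct tuple once, in order of first appearance."""
    seen = set()              # hash table with one copy of every tuple seen
    for block in blocks:      # one input buffer holds the current block
        for t in block:
            if t not in seen: # expected constant-time membership test
                seen.add(t)
                yield t
```

With a hash table, the membership test no longer costs time proportional to n, which avoids the quadratic behaviour described above.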
CS 257 Database Principles
One-Pass Algorithms for Unary, Full-Relation Operations
Grouping
The grouping operation gives us zero or more grouping attributes and presumably one or more aggregated attributes. If we create in main memory one entry for each group, then we can scan the tuples of R, one block at a time. The entry for a group consists of values for the grouping attributes and an accumulated value or values for each aggregation.
The aggregations are MIN(), MAX(), AVG(), COUNT(), and SUM().
Binary operations include:
• Union
• Intersection
• Difference
• Product
• Join
CS 257 Database Principles
Binary operations
We read S into M - 1 buffers of main memory and build a search structure whose search key is the entire tuple. We then read each block of R into the Mth buffer, one at a time.
Set union: Copy all tuples of S to the output. For each tuple t of R, see if t is in S; if not, copy t to the output. If t is also in S, skip t.
Set intersection: For each tuple t of R, see if t is also in S. If so, copy t to the output; if not, ignore t.
Set difference R - S: For each tuple t of R, see if t is in S. If so, ignore t; if not, copy t to the output.
Set difference S - R: For each tuple t of R, if t is in S, delete t from the copy of S in main memory. At the end, copy the remaining tuples of S to the output.
CS 257 Database Principles
Continued………………
Bag Difference
To compute S -B R, read the tuples of S into main memory and count the number of occurrences of each distinct tuple.
Then read R; check each tuple t to see whether t occurs in S, and if so, decrement its associated count. At the end, copy to the output each tuple in main memory whose count is positive; the number of times we copy it equals that count.
To compute R -B S, read the tuples of S into main memory and count the number of occurrences of each distinct tuple.
Think of a tuple t with a count of c as c reasons not to copy t to the output as we read tuples of R.
Read a tuple t of R and check whether t occurs in S. If not, copy t to the output. If t does occur in S, look at the current count c associated with t. If c = 0, copy t to the output. If c > 0, do not copy t to the output, but decrement c by 1.
CS 257 Database Principles
Continued….
Product
Read S into M - 1 buffers of main memory. Then read each block of R, and for each tuple t of R concatenate t with each tuple of S in main memory. Output each concatenated tuple as it is formed.
Natural Join
To compute the natural join of R(X, Y) and S(Y, Z), where Y is the set of attributes common to both relations, do the following:
1. Read all tuples of S and form them into a main-memory search structure with Y as the search key. A hash table or a balanced tree is a good example of such a structure. Use M - 1 blocks of memory for this purpose.
2. Read each block of R into the one remaining main-memory buffer. For each tuple t of R, find the tuples of S that agree with t on all attributes of Y, using the search structure. For each matching tuple of S, form a tuple by joining it with t, and move the resulting tuple to the output.
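The two steps of the one-pass natural join can be sketched as follows. The sketch assumes tuples of R have the shape (x, y) and tuples of S the shape (y, z), with a dictionary as the search structure on Y; the function name one_pass_join is hypothetical.

```python
from collections import defaultdict


def one_pass_join(r_blocks, s_tuples):
    """Join R(X, Y) with S(Y, Z), holding all of S in memory."""
    # Step 1: build the main-memory search structure on S, keyed by Y.
    index = defaultdict(list)
    for (y, z) in s_tuples:
        index[y].append(z)
    # Step 2: stream R one block at a time and probe the structure.
    for block in r_blocks:
        for (x, y) in block:
            for z in index.get(y, []):
                yield (x, y, z)
```

Only S has to fit in memory; R is read once, one block at a time, so the smaller relation should play the role of S.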
CS 257 Database Principles
15.3 Nested Loops Joins
Tuple-Based Nested-Loop Join
The simplest variation of nested-loop join has loops that range over individual tuples of the relations involved. In this algorithm, which we call tuple-based nested-loop join, we compute the join as follows
FOR each tuple s in S DO
    FOR each tuple r in R DO
        IF r and s join to make a tuple t THEN
            output t;
CS 257 Database Principles
Continued……………..
If we are careless about how we buffer the blocks of relations R and S, then this algorithm could require as many as T(R)T(S) disk I/O's. However, there are many situations where this algorithm can be modified to have a much lower cost.
One case is when we can use an index on the join attribute or attributes of R to find the tuples of R that match a given tuple of S, without having to read the entire relation R.
The second improvement looks much more carefully at the way tuples of R and S are divided among blocks, and uses as much of the memory as it can to reduce the number of disk I/O's as we go through the inner loop. We shall consider this block-based version of nested-loop join.
CS 257 Database Principles
An Iterator for Tuple-Based Nested-Loop Join
Open() {
    R.Open();
    S.Open();
    s := S.GetNext();
}
GetNext() {
    REPEAT {
        r := R.GetNext();
        IF (r = NotFound) { /* R is exhausted for the current s */
            R.Close();
            s := S.GetNext();
            IF (s = NotFound) RETURN NotFound; /* both R and S are exhausted */
            R.Open();
            r := R.GetNext();
        }
    } UNTIL (r and s join);
    RETURN the join of r and s;
}
Close() {
    R.Close();
    S.Close();
}
CS 257 Database Principles
A Block-Based Nested-Loop Join Algorithm
1. Organize access to both argument relations by blocks.
2. Use as much main memory as we can to store tuples belonging to the relation S, the relation of the outer loop.
3. Algorithm:
FOR each chunk of M-1 blocks of S DO BEGIN
    read these blocks into main-memory buffers;
    organize their tuples into a search structure whose search key is the common attributes of R and S;
    FOR each block b of R DO BEGIN
        read b into main memory;
        FOR each tuple t of b DO BEGIN
            find the tuples of S in main memory that join with t;
            output the join of t with each of these tuples;
        END;
    END;
END;
CS 257 Database Principles
Analysis of Nested-Loop Join
Assuming S is the smaller relation, the number of chunks, or iterations of the outer loop, is B(S)/(M - 1). At each iteration, we read M - 1 blocks of S and B(R) blocks of R. The number of disk I/O's is thus
(B(S)/(M - 1)) × (M - 1 + B(R)), or B(S) + B(S)B(R)/(M - 1)
Assuming all of M, B(S), and B(R) are large, but M is the smallest of these, an approximation to the above formula is B(S)B(R)/M. That is, the cost is proportional to the product of the sizes of the two relations, divided by the amount of available main memory.
CS 257 Database Principles
Example
B(R) = 1000, B(S) = 500, M = 101
Important aside: 101 buffer blocks is not as unrealistic as it sounds. There may be many queries at the same time, competing for main-memory buffers.
The outer loop iterates 5 times.
At each iteration we read M - 1 (i.e., 100) blocks of S and all of R (i.e., 1000 blocks).
Total: 5 × (100 + 1000) = 5500 I/O's.
Question: What if we reversed the roles of R and S?
We would iterate 10 times, and in each iteration we would read 100 + 500 blocks, for a total of 6000 I/O's.
Compare with a one-pass join, if it could be done: we would need 1500 disk I/O's, which requires B(S) ≤ M - 1.
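The arithmetic of this example can be checked with a small helper. The function name nested_loop_io is hypothetical, and rounding the number of chunks up to a whole number is an assumption for sizes that do not divide evenly.

```python
import math


def nested_loop_io(b_outer, b_inner, m):
    """Disk I/O's for block-based nested-loop join with m buffers."""
    # The outer relation is read once, in chunks of m - 1 blocks.
    chunks = math.ceil(b_outer / (m - 1))
    # The inner relation is read in full once per chunk.
    return b_outer + chunks * b_inner


cost_s_outer = nested_loop_io(500, 1000, 101)   # S in the outer loop
cost_r_outer = nested_loop_io(1000, 500, 101)   # roles reversed
cost_one_pass = nested_loop_io(500, 1000, 501)  # S fits in M - 1 buffers
```

The three values come out as 5500, 6000, and 1500, matching the figures on the slide; the last case shows that the nested-loop join degenerates into the one-pass join when the outer relation fits in memory.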
CS 257 Database Principles
Continued………
1. The cost of the nested-loop join is not much greater than the cost of a one-pass join, which is 1500 disk I/O's for this example. In fact, if B(S) ≤ M - 1, the nested-loop join becomes identical to the one-pass join algorithm of Section 15.2.3.
2. Nested-loop join is generally not the most efficient join algorithm.
CS 257 Database Principles
15.4 Two-Pass Algorithms Based on Sorting
In two-pass algorithms, data from the operand relations is read into main memory, processed in some way, written out to disk again, and then reread from disk to complete the operation.
The first pass creates sorted sublists of a relation R:
• Read M blocks of R into main memory.
• Sort these M blocks in main memory, using an efficient main-memory sorting algorithm.
• Write the sorted list into M blocks of disk; we refer to the contents of these blocks as one of the sorted sublists of R.
CS 257 Database Principles
Duplicate Elimination Using Sorting: δ(R)
First we sort the tuples of R into sorted sublists.
Then we use the available main memory to hold one block from each sorted sublist.
Then we repeatedly copy the least remaining tuple to the output and ignore all tuples identical to it.
The total cost of this algorithm is 3B(R) disk I/O's.
This algorithm requires only about √B(R) blocks of main memory, rather than the B(R) blocks needed by the one-pass algorithm.
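A rough sketch of the two passes, assuming the relation is a Python list and that in-memory lists stand in for the sorted sublists written to disk. The function name distinct_two_pass and the block_size parameter are hypothetical; heapq.merge plays the role of the merge phase.

```python
import heapq


def distinct_two_pass(tuples, m, block_size):
    """Sort-based duplicate elimination with m buffers of block_size tuples."""
    # Pass 1: sort chunks of M blocks and keep them as sorted sublists
    # (a real system would write each sublist back to disk).
    run_len = m * block_size
    runs = [sorted(tuples[i:i + run_len])
            for i in range(0, len(tuples), run_len)]
    # Pass 2: merge the sublists; emit a tuple only when it differs
    # from the last one emitted.
    out = []
    for t in heapq.merge(*runs):
        if not out or out[-1] != t:
            out.append(t)
    return out
```

The merge only ever needs the front of each sublist, which is why the number of sublists, roughly B(R)/M, must not exceed the number of buffers.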
CS 257 Database Principles
Grouping and Aggregation Using Sorting: γ(R)
The two-pass algorithm for grouping and aggregation is quite similar to the previous algorithm.
Read the tuples of R into memory, M blocks at a time. Sort each group of M blocks, using the grouping attributes of L as the sort key. Write each sorted sublist to disk.
Use one main-memory buffer for each sublist, and initially load the first block of each sublist into its buffer.
Repeatedly find the least value of the sort key (grouping attributes) present among the first available tuples in the buffers. This value becomes the next group: accumulate the aggregates over all tuples with that sort-key value, then output one tuple for the group.
This algorithm takes 3B(R) disk I/O's, and will work as long as B(R) < M².
CS 257 Database Principles
A Sort-Based Union Algorithm
For bag union, the one-pass algorithm is used.
For set union:
• Repeatedly bring M blocks of R into main memory, sort their tuples, and write the resulting sorted sublist back to disk.
• Do the same for S, to create sorted sublists for relation S.
• Use one main-memory buffer for each sublist of R and S. Initialize each with the first block from the corresponding sublist.
• Repeatedly find the first remaining tuple t among all the buffers. Copy t to the output, and remove from the buffers all copies of t (if R and S are sets there should be at most two copies).
This algorithm takes 3(B(R)+B(S)) disk I/O's, and will work as long as B(R)+B(S) < M².
CS 257 Database Principles
Sort-Based Intersection and Difference
For both the set version and the bag version, the algorithm is the same as that of set union, except for the way we handle the copies of a tuple t at the fronts of the sorted sublists.
• For set intersection, output t if it appears in both R and S.
• For bag intersection, output t the minimum of the number of times it appears in R and in S.
• For set difference, R - S, output t if and only if it appears in R but not in S.
• For bag difference, R - S, output t the number of times it appears in R minus the number of times it appears in S (and not at all if that difference is not positive).
CS 257 Database Principles
A Simple Sort-Based Join Algorithm
Given relations R(X, Y) and S(Y, Z) to join, and M blocks of main memory for buffers:
Step 1: Sort R and S, using a two-phase, multiway merge sort, with Y as the sort key.
Step 2: Merge the sorted R and S. The following steps are done repeatedly:
• Find the least value y of the join attributes Y that is currently at the front of the blocks for R and S.
• If y does not appear at the front of the other relation, remove the tuple(s) with sort key y.
• Otherwise, identify all the tuples from both relations having sort key y.
• Output all the tuples that can be formed by joining tuples from R and S with the common Y-value y.
• If either relation has no more unconsidered tuples in main memory, reload the buffer for that relation.
More Efficient Sort-Based Join
By merging the sorted sublists of R and S directly, instead of fully sorting each relation first, the number of disk I/O's is 3(B(R) + B(S)).
It requires B(R) + B(S) ≤ M² to work.
CS 257 Database Principles
Table illustrating Sort-Based Algorithms

Operators              Approximate M required    Disk I/O
γ, δ                   √B                        3B
∪, ∩, −                √(B(R) + B(S))            3(B(R) + B(S))
⋈                      √(max(B(R), B(S)))        5(B(R) + B(S))
⋈ (more efficient)     √(B(R) + B(S))            3(B(R) + B(S))
CS 257 Database Principles
15.5 Two-Pass Algorithms Based on Hashing
Hashing is done if the data is too big to store in main memory buffers.
Hash all the tuples of the argument(s) using an appropriate hash key.
For all the common operations, there is a way to select the hash key so all the tuples that need to be considered together when we perform the operation have the same hash value.
This reduces the size of the operand(s) by a factor equal to the number of buckets.
CS 257 Database Principles
Partitioning Relations by Hashing
Algorithm:
initialize M-1 buckets using M-1 empty buffers;
FOR each block b of relation R DO BEGIN
    read block b into the Mth buffer;
    FOR each tuple t in b DO BEGIN
        IF the buffer for bucket h(t) has no room for t THEN
            BEGIN
                copy the buffer to disk;
                initialize a new empty block in that buffer;
            END;
        copy t to the buffer for bucket h(t);
    END;
END;
FOR each bucket DO
    IF the buffer for this bucket is not empty THEN
        write the buffer to disk;
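The partitioning algorithm above translates almost line for line into Python. In this sketch the "disk" is simulated by a list of blocks per bucket; the function name partition and the parameters num_buckets, block_size, and key are assumptions made for illustration.

```python
def partition(blocks, num_buckets, block_size, key=lambda t: t):
    """Hash-partition a relation into num_buckets lists of blocks."""
    buffers = [[] for _ in range(num_buckets)]  # one buffer per bucket
    disk = [[] for _ in range(num_buckets)]     # blocks written per bucket
    for block in blocks:                        # read into the Mth buffer
        for t in block:
            i = hash(key(t)) % num_buckets
            if len(buffers[i]) == block_size:   # no room for t
                disk[i].append(buffers[i])      # copy the buffer to disk
                buffers[i] = []                 # start a new empty block
            buffers[i].append(t)
    for i, buf in enumerate(buffers):           # flush non-empty buffers
        if buf:
            disk[i].append(buf)
    return disk
```

With num_buckets = M - 1, this uses exactly M buffers: one for input and one per bucket.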
CS 257 Database Principles
Duplicate Elimination
For the operation δ(R) hash R to M-1 Buckets.
Note: - That two copies of the same tuple t will hash to the same bucket
Do duplicate elimination on each bucket Ri independently, using one-pass algorithm
The result is the union of δ(Ri), where Ri is the portion of R that hashes to the ith bucket
CS 257 Database Principles
Requirements
Number of disk I/O's: 3B(R)
For the two-pass, hash-based algorithm to work, we need:
• a hash function h that distributes the tuples evenly among the buckets
• each bucket Ri to fit in main memory (to allow the one-pass algorithm)
Together these require B(R) ≤ M(M-1), i.e., approximately B(R) ≤ M².
CS 257 Database Principles
Union, Intersection, and Difference
For a binary operation we use the same hash function to hash the tuples of both arguments.
• R ∪ S: we hash both R and S to M-1 buckets each.
• R ∩ S: we hash both R and S to M-1 buckets each, 2(M-1) buckets in total.
• R - S: we hash both R and S to M-1 buckets each, 2(M-1) buckets in total.
Each operation requires 3(B(R)+B(S)) disk I/O's.
The two-pass, hash-based algorithm requires approximately min(B(R), B(S)) ≤ M².
CS 257 Database Principles
Hash-Join Algorithm
Use the same hash function for both relations; the hash function should depend only on the join attributes.
Hash R to M-1 buckets R1, R2, …, R(M-1).
Hash S to M-1 buckets S1, S2, …, S(M-1).
Do a one-pass join of Ri and Si, for all i.
Cost: 3(B(R) + B(S)) disk I/O's; requires min(B(R), B(S)) ≤ M².
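A compact sketch of the hash-join idea, assuming R has tuples (x, y), S has tuples (y, z), and both relations are plain Python lists rather than disk files. The function name hash_join and the num_buckets parameter are hypothetical.

```python
from collections import defaultdict


def hash_join(r, s, num_buckets):
    """Two-phase hash join of R(X, Y) and S(Y, Z) on Y."""
    # Phase 1: partition both relations with the same hash function on Y.
    r_parts = defaultdict(list)
    s_parts = defaultdict(list)
    for (x, y) in r:
        r_parts[hash(y) % num_buckets].append((x, y))
    for (y, z) in s:
        s_parts[hash(y) % num_buckets].append((y, z))
    # Phase 2: one-pass join of each pair of corresponding buckets.
    out = []
    for i in range(num_buckets):
        table = defaultdict(list)          # search structure on bucket Si
        for (y, z) in s_parts[i]:
            table[y].append(z)
        for (x, y) in r_parts[i]:
            out.extend((x, y, z) for z in table[y])
    return out
```

Tuples that join always share a Y-value and therefore land in buckets with the same index, so no pair of buckets Ri, Sj with i ≠ j ever has to be compared.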
CS 257 Database Principles
Sort based Vs Hash based
For Binary operations, hash-based only limits size to min of arguments, not sum
Sort-based can produce output in sorted order, which can be helpful
Hash-based depends on buckets being of equal size
Sort-based algorithms can experience reduced rotational latency or seek time
CS 257 Database Principles
15.6 Index-Based Algorithms
Clustering and Non-Clustering Indexes
Clustered Relation: Tuples are packed into roughly as few blocks as can possibly hold those tuples
Clustering indexes: Indexes on an attribute or attributes such that all the tuples with a fixed value for the search key of the index appear on roughly as few blocks as can hold them
A relation that isn’t clustered cannot have a clustering index
A clustered relation can have Non-clustering indexes
CS 257 Database Principles
Index-Based Selection
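For a selection σC(R), suppose C is of the form a=v, where a is an attribute.
For a clustering index on R.a, the number of disk I/O's will be about B(R)/V(R,a).
For a nonclustering index on R.a, the number of disk I/O's will be about T(R)/V(R,a).
The actual number may be higher because:
• The index is not kept entirely in main memory.
• The matching tuples may spread over more blocks.
• Tuples may not be packed as tightly as possible into blocks.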
CS 257 Database Principles
Example
B(R)=1000, T(R)=20,000
Number of I/O's required:
• R clustered, index not used: 1000
• R not clustered, index not used: 20,000
• V(R,a)=100, index is clustering: 10
• V(R,a)=10, index is nonclustering: 2,000
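The four cases of this example follow from two divisions and can be reproduced with a small helper. The function name selection_cost and its parameters are hypothetical, and the estimates ignore the cost of reading the index itself.

```python
import math


def selection_cost(b, t, v=None, relation_clustered=True, index=None):
    """Estimated disk I/O's for the selection a = v on R.

    index is None, "clustering", or "nonclustering".
    """
    if index == "clustering":
        return math.ceil(b / v)      # matching tuples are packed together
    if index == "nonclustering":
        return math.ceil(t / v)      # roughly one I/O per matching tuple
    # No index used: scan the whole relation.
    return b if relation_clustered else t
```

Calling it with B(R)=1000 and T(R)=20,000 gives 1000, 20,000, 10, and 2,000 for the four cases listed above.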
CS 257 Database Principles
Joining by Using an Index
Natural join R(X, Y) ⋈ S(Y, Z), with an index on Y for S
Number of I/O's to read R:
• Clustered: B(R)
• Not clustered: T(R)
Number of I/O's to fetch the matching tuples of S, over all tuples t of R:
• Clustering index: T(R)B(S)/V(S,Y)
• Nonclustering index: T(R)T(S)/V(S,Y)
CS 257 Database Principles
Example
R(X,Y): 1000 blocks; S(Y,Z): 500 blocks
Assume 10 tuples in each block, so T(R)=10,000 and T(S)=5000
V(S,Y)=100
If R is clustered, and there is a clustering index on Y for S:
• the number of I/O's for R is 1000
• the number of I/O's for S is 10,000 × 500/100 = 50,000
CS 257 Database Principles
Joins Using a Sorted Index
Natural join R(X, Y) ⋈ S(Y, Z) with a sorted index on Y for either R or S
Extreme case: zig-zag join
Example:
relations R(X,Y) and S(Y,Z) with an index on Y for both relations
search keys (Y-value) for R: 1,3,4,4,5,6
search keys (Y-value) for S: 2,2,4,6,7,8
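The zig-zag idea can be sketched over the two sorted key lists of this example. The sketch assumes sorted Python lists stand in for the two indexes and uses bisect_left to model an index lookup that jumps ahead to the next candidate key; the function name zigzag_join_keys is hypothetical. It returns one Y-value per joining pair of tuples.

```python
from bisect import bisect_left


def zigzag_join_keys(r_keys, s_keys):
    """Return the Y-value of every joining (r, s) pair, in key order."""
    out = []
    i = j = 0
    while i < len(r_keys) and j < len(s_keys):
        if r_keys[i] < s_keys[j]:
            # Jump in R's index to the first key >= the current S key.
            i = bisect_left(r_keys, s_keys[j], i)
        elif r_keys[i] > s_keys[j]:
            # Jump in S's index to the first key >= the current R key.
            j = bisect_left(s_keys, r_keys[i], j)
        else:
            y = r_keys[i]
            i_end, j_end = i, j
            while i_end < len(r_keys) and r_keys[i_end] == y:
                i_end += 1
            while j_end < len(s_keys) and s_keys[j_end] == y:
                j_end += 1
            # Every R tuple with key y joins every S tuple with key y.
            out.extend([y] * ((i_end - i) * (j_end - j)))
            i, j = i_end, j_end
    return out
```

For the keys above, only the values 4 and 6 match, so tuples with any other Y-value never need to be fetched from disk.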
CS 257 Database Principles
15.7 Buffer Management
Buffer Manager manages the required memory for the process with minimum delay.
[Figure: requests are sent to the buffer manager, which reads and writes the buffers.]
CS 257 Database Principles
Buffer Management Architecture
Two types of architecture:
• The buffer manager controls main memory directly.
• The buffer manager allocates buffers in virtual memory.
In each method, the buffer manager should limit the number of buffers in use so that they fit in the available main memory.
When the buffer manager controls main memory directly, it selects a buffer to empty by returning its contents to disk. If the buffered block has not been changed, it may simply be erased from main memory.
If the buffers in use do not fit in main memory, very little useful work gets done, because blocks are constantly swapped in and out.
CS 257 Database Principles
Buffer Management Strategies
LRU (Least Recently Used)
It frees the buffer holding the block that has not been read or written for the longest time.
FIFO (First In, First Out)
It frees the buffer that has been occupied by the same block the longest and assigns it to the new request.
The “Clock” Algorithm
[Figure: buffers arranged in a circle, each carrying a flag of 0 or 1.]
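The LRU strategy described above can be sketched with an ordered dictionary. The class name LRUBufferPool and the read_from_disk callback are hypothetical, and the sketch ignores pinned blocks and writing changed blocks back to disk.

```python
from collections import OrderedDict


class LRUBufferPool:
    """Hypothetical buffer pool that evicts the least recently used block."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.frames = OrderedDict()  # block id -> contents, LRU first

    def get(self, block_id, read_from_disk):
        if block_id in self.frames:
            # Hit: mark the block as most recently used.
            self.frames.move_to_end(block_id)
            return self.frames[block_id]
        if len(self.frames) >= self.capacity:
            # Miss with a full pool: free the least recently used buffer.
            self.frames.popitem(last=False)
        self.frames[block_id] = read_from_disk(block_id)
        return self.frames[block_id]
```

A FIFO pool would differ only in skipping the move_to_end call, so that a block's position reflects when it was loaded rather than when it was last used.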
CS 257 Database Principles
The Relationship Between Physical Operator Selection and Buffer Management
The query optimizer will eventually select a set of physical operators that will be used to execute a given query.
However, the buffer manager may not be able to guarantee the availability of the buffers when the query is executed.
CS 257 Database Principles
Chapter 15 Ends………..!
Chapter 16
Starts………………………….
CS 257 Database Principles
Chapter 16
QUERY COMPILER
16.1 Parsing 16.2 Algebraic Laws for Improving Query Plans 16.3 From Parse Trees to Logical Query Plans 16.4 Estimating the Cost of Operations 16.5 Introduction to Cost-Based Plan Selection 16.6 Choosing an Order for Joins 16.7 Completing the Physical-Query-Plan
CS 257 Database Principles
16.1 Parsing
Query compilation is divided into three steps
1. Parsing: Parse the SQL query into a parse tree.
2. Logical query plan: Transform the parse tree into an expression tree of relational algebra.
3. Physical query plan: Transform the logical query plan into a physical query plan, which specifies:
• the operations performed
• the order of operations
• the algorithm used for each operation
• the way in which stored data is obtained and passed from one operation to another
CS 257 Database Principles
[Figure: Parser → Preprocessor → Logical query plan generator → Query rewrite → Preferred logical query plan]
CS 257 Database Principles
Syntax Analysis and Parse Tree
The parser takes the SQL query and converts it to a parse tree. Nodes of the parse tree:
• Atoms: lexical elements such as keywords, constants, parentheses, operators, and other schema elements.
• Syntactic categories: names for families of query subparts that play a similar role in a query.
CS 257 Database Principles
Simple Grammar
<Query> ::= <SFW>
<Query> ::= (<Query>)
<SFW> ::= SELECT <SelList> FROM <FromList> WHERE <Condition>
<SelList> ::= <Attribute>, <SelList>
<SelList> ::= <Attribute>
<FromList> ::= <Relation>, <FromList>
<FromList> ::= <Relation>
<Condition> ::= <Condition> AND <Condition>
<Condition> ::= <Tuple> IN <Query>
<Condition> ::= <Attribute> = <Attribute>
<Condition> ::= <Attribute> LIKE <Pattern>
<Tuple> ::= <Attribute>
Notation: atoms are constants, <syntactic categories> are variables, and ::= means "can be expressed as" or "is defined as".
CS 257 Database Principles
Query and Parse Tree
StarsIn(title, year, starName)
MovieStar(name, address, gender, birthdate)
Query: Give titles of movies that have at least one star born in 1960.
Version with a subquery:
SELECT title FROM StarsIn WHERE starName IN (
    SELECT name FROM MovieStar WHERE birthdate LIKE '%1960%'
);
Equivalent version with a join:
SELECT title FROM StarsIn, MovieStar
WHERE starName = name AND birthdate LIKE '%1960%';
CS 257 Database Principles
Parse Tree
CS 257 Database Principles
Preprocessor
Functions of Preprocessor
• If a relation used in the query is a virtual view, then each use of this relation in the FROM-list must be replaced by a parse tree that describes the view.
• It is also responsible for semantic checking:
1. Check relation uses: Every relation mentioned in the FROM clause must be a relation or a view in the current schema.
2. Check and resolve attribute uses: Every attribute mentioned in the SELECT or WHERE clause must be an attribute of some relation in the current scope.
3. Check types: All attributes must be of a type appropriate to their uses.
CS 257 Database Principles
Preprocessing Queries Involving Views
When an operand in a query is a virtual view, the preprocessor needs to replace the operand by a piece of parse tree that represents how the view is constructed from base tables.
Base table: Movies(title, year, length, genre, studioName, producerC#)
View definition:
CREATE VIEW ParamountMovies AS
    SELECT title, year FROM Movies WHERE studioName = 'Paramount';
Example based on the view:
SELECT title FROM ParamountMovies WHERE year = 1979;
CS 257 Database Principles
16.2 Algebraic Laws for Improving Query Plans
Optimizing the Logical Query Plan
The translation rules converting a parse tree to a logical query tree do not always produce the best logical query tree.
It is often possible to optimize the logical query tree by applying relational algebra laws to convert the original tree into a more efficient logical query tree.
Optimizing a logical query tree using relational algebra laws is called heuristic optimization
CS 257 Database Principles
Relational Algebra Laws
These laws often involve the properties of:
• Commutativity: the operator can be applied to its operands independent of order. E.g., A + B = B + A, so the “+” operator is commutative.
• Associativity: the operator is independent of operand grouping. E.g., A + (B + C) = (A + B) + C, so the “+” operator is associative.
CS 257 Database Principles
Associative and Commutative Operators
The relational algebra operators cross-product (×), join (⋈), union (∪), and intersection (∩) are all associative and commutative.
Commutative
R × S = S × R
R ⋈ S = S ⋈ R
R ∪ S = S ∪ R
R ∩ S = S ∩ R
Associative
(R × S) × T = R × (S × T)
(R ⋈ S) ⋈ T = R ⋈ (S ⋈ T)
(R ∪ S) ∪ T = R ∪ (S ∪ T)
(R ∩ S) ∩ T = R ∩ (S ∩ T)
CS 257 Database Principles
Laws Involving Selection
Complex selections involving AND or OR can be broken into two or more selections: (splitting laws)
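σC1 AND C2 (R) = σC1(σC2(R))
σC1 OR C2 (R) = (σC1(R)) ∪S (σC2(R)), where ∪S is set union; this law holds only if R is a set.
Example showing why the OR law fails for bags: R = {a,a,b,b,b,c}, p1 is satisfied by a and b, p2 is satisfied by b and c.
σp1∨p2(R) = {a,a,b,b,b,c}
σp1(R) = {a,a,b,b,b}
σp2(R) = {b,b,b,c}
The bag union σp1(R) ∪B σp2(R) = {a,a,b,b,b,b,b,b,c} contains b six times, and the set union is {a,b,c}; neither equals σp1∨p2(R).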
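The splitting laws can be checked on this example by modelling bags with collections.Counter. The helper select and the predicates p1 and p2 are hypothetical names chosen for the sketch.

```python
from collections import Counter

# The bag R = {a, a, b, b, b, c} as a multiset of tuple -> count.
R = Counter({"a": 2, "b": 3, "c": 1})


def select(bag, pred):
    """Bag selection: keep every copy of each tuple satisfying pred."""
    return Counter({t: n for t, n in bag.items() if pred(t)})


def p1(t):
    return t in ("a", "b")


def p2(t):
    return t in ("b", "c")


# AND splitting holds for bags: a cascade of selections.
and_direct = select(R, lambda t: p1(t) and p2(t))
and_cascade = select(select(R, p2), p1)

# OR splitting with bag union (Counter addition) double-counts b.
or_direct = select(R, lambda t: p1(t) or p2(t))
or_bag_union = select(R, p1) + select(R, p2)
```

Here and_direct equals and_cascade, while or_bag_union holds six copies of b against three in or_direct, which illustrates why the OR law is stated with set union on a relation that is a set.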
CS 257 Database Principles
Continued………………….
Selection is pushed through both arguments for union:
σC(R ∪ S) = σC(R) ∪ σC(S)
Selection is pushed to the first argument, and optionally the second, for difference:
σC(R - S) = σC(R) - S
σC(R - S) = σC(R) - σC(S)
All other operators require the selection to be pushed to only one of the arguments.
For joins, we may not be able to push the selection to both arguments, because one of them may lack the attributes the selection requires. Assuming R has all the attributes mentioned in C:
σC(R × S) = σC(R) × S
σC(R ∩ S) = σC(R) ∩ S
σC(R ⋈ S) = σC(R) ⋈ S
σC(R ⋈D S) = σC(R) ⋈D S
CS 257 Database Principles
Laws Involving Projection
It is also possible to push projections down the logical query tree. However, the performance gain is smaller than for selections, because projections only reduce the number of attributes rather than the number of tuples.
Laws for pushing projections with joins (M and N are the attributes of R and S, respectively, that appear in L or are needed by the join):
πL(R × S) = πL(πM(R) × πN(S))
πL(R ⋈ S) = πL(πM(R) ⋈ πN(S))
πL(R ⋈D S) = πL(πM(R) ⋈D πN(S))
Laws for pushing projections with set operations: projection can be performed entirely before bag union.
πL(R ∪B S) = πL(R) ∪B πL(S)
Projection can be pushed below selection as long as we also keep all attributes needed for the selection (M = L ∪ attr(C)):
πL(σC(R)) = πL(σC(πM(R)))
CS 257 Database Principles
Laws Involving Join
1. Joins are commutative and associative.
2. Selection can be distributed into joins.
3. Projection can be distributed into joins
CS 257 Database Principles
Laws Involving Duplicate Elimination
The duplicate elimination operator (δ) can be pushed through many operators, but not all. For example, if R has two copies of tuple t and S has one copy of t, then δ(R ∪B S) has one copy of t while δ(R) ∪B δ(S) has two copies of t.
Laws for pushing the duplicate elimination operator (δ):
δ(R × S) = δ(R) × δ(S)
δ(R ⋈ S) = δ(R) ⋈ δ(S)
δ(R ⋈D S) = δ(R) ⋈D δ(S)
δ(σC(R)) = σC(δ(R))
The duplicate elimination operator (δ) can also be pushed through bag intersection, but not across bag union, difference, or projection in general.
δ(R ∩B S) = δ(R) ∩B δ(S)
CS 257 Database Principles
Laws Involving Grouping
The grouping operator (γ) laws depend on the aggregate operators used, because some aggregate functions are unaffected by duplicates (MIN and MAX) while others are affected (SUM, COUNT, and AVG).
There is one general rule, however: grouping subsumes duplicate elimination, since γ produces one tuple per group.
δ(γL(R)) = γL(R)
CS 257 Database Principles
16.3 From Parse Trees to Logical Query Plans
Parsing
The goal is to convert a text string containing a query into a parse tree data structure:
• Leaves form the text string (broken into lexical elements).
• Internal nodes are syntactic categories.
Parsing uses standard algorithmic techniques from compilers: given a grammar for the language (e.g., SQL), process the string and build the tree.
CS 257 Database Principles
Convert Parse Tree to Relational Algebra
The complete algorithm depends on the specific grammar, which determines the forms of the parse trees. Here is a flavor of the approach.
Suppose there are no subqueries. Then
SELECT att-list FROM rel-list WHERE cond
is converted into PROJ att-list(SELECT cond(PRODUCT(rel-list))), or, in symbols, π att-list(σ cond(× (rel-list))).
CS 257 Database Principles
SELECT movieTitle
FROM StarsIn, MovieStar
WHERE starName = name AND birthdate LIKE '%1960';

Parse tree:
<Query>
  <SFW>
    SELECT
    <SelList>
      <Attribute>: movieTitle
    FROM
    <FromList>
      <RelName>: StarsIn
      ,
      <FromList>
        <RelName>: MovieStar
    WHERE
    <Condition>
      <Condition>
        <Attribute>: starName
        =
        <Attribute>: name
      AND
      <Condition>
        <Attribute>: birthdate
        LIKE
        <Pattern>: '%1960'
CS 257 Database Principles
Equivalent Algebraic Expression Tree
π movieTitle
  σ starName = name AND birthdate LIKE '%1960'
    ×
      StarsIn
      MovieStar
CS 257 Database Principles
Query:
SELECT title
FROM StarsIn
WHERE starName IN (
    SELECT name
    FROM MovieStar
    WHERE birthdate LIKE '%1960'
);
Use an intermediate format called two-argument selection.
CS 257 Database Principles
Example: Two-Argument Selection
π title
  σ (two-argument selection)
    StarsIn
    <condition>
      <tuple>
        <attribute>: starName
      IN
      π name
        σ birthdate LIKE '%1960'
          MovieStar
CS 257 Database Principles
Converting Two-Argument Selection
To continue the conversion, we need rules for replacing two-argument selection with a relational algebra expression
Different rules depending on the nature of the sub query
Shown here is an example for the IN operator and an uncorrelated subquery (the subquery computes a relation independent of the tuple being tested).
CS 257 Database Principles
Improving the Logical Query Plan
There are numerous algebraic laws concerning relational algebra operations.
By applying them to a logical query plan judiciously, we can get an equivalent query plan that can be executed more efficiently.
Next we'll survey some of these laws.
CS 257 Database Principles
Example: Improved Logical Query Plan
π title
  ⋈ starName=name
    StarsIn
    π name
      σ birthdate LIKE '%1960'
        MovieStar
CS 257 Database Principles
Associative and Commutative Operations
• Product
• Natural join
• Set and bag union
• Set and bag intersection
Associative: (A op B) op C = A op (B op C)
Commutative: A op B = B op A
CS 257 Database Principles
Laws Involving Selection
• Selections usually reduce the size of the relation.
• It is usually good to do selections early, i.e., "push them down the tree".
• It can also be helpful to break up a complex selection into parts.
Selection Splitting
σC1 AND C2 (R) = σC1(σC2(R))
σC1 OR C2 (R) = (σC1(R)) ∪set (σC2(R)), if R is a set
σC1(σC2(R)) = σC2(σC1(R))
CS 257 Database Principles
Selection and Binary Operators
• Must push selection to both arguments:
– σC(R ∪ S) = σC(R) ∪ σC(S)
• Must push to the first argument, optional for the second:
– σC(R - S) = σC(R) - S
– σC(R - S) = σC(R) - σC(S)
• Push to at least one argument that has all the attributes mentioned in C:
– product, natural join, theta join, intersection
– e.g., σC(R × S) = σC(R) × S, if R has all the attributes in C
CS 257 Database Principles
Pushing Selection Up the Tree
• Suppose we have relations
– StarsIn(title, year, starName)
– Movie(title, year, len, inColor, studioName)
• and a view
– CREATE VIEW MoviesOf1996 AS
      SELECT *
      FROM Movie
      WHERE year = 1996;
• and the query
– SELECT starName, studioName
      FROM MoviesOf1996 NATURAL JOIN StarsIn;
CS 257 Database Principles
The Straightforward Tree
π starName,studioName
  ⋈
    σ year=1996
      Movie
    StarsIn
Remember the rule σC(R ⋈ S) = σC(R) ⋈ S?
CS 257 Database Principles
The Improved Logical Query Plan
Original plan:
π starName,studioName
  ⋈
    σ year=1996
      Movie
    StarsIn
After pushing the selection up the tree:
π starName,studioName
  σ year=1996
    ⋈
      Movie
      StarsIn
After pushing the selection down the tree to both arguments:
π starName,studioName
  ⋈
    σ year=1996
      Movie
    σ year=1996
      StarsIn
Grouping Assoc/Comm Operators
• Groups together adjacent joins, adjacent unions, and adjacent intersections as siblings in the tree
• Sets up the logical QP for future optimization when the physical QP is constructed: determine the best order for doing a sequence of joins (or unions or intersections)
[Figure: a tree of nested binary unions (∪) over the relations A, B, C, D, E, F, regrouped so that adjacent unions become siblings.]
16.4 Estimating the Cost of Operations
Physical Plan
A physical plan specifies:
• An order and grouping for associative-and-commutative operations like joins and unions.
• An algorithm for each operator in the logical plan, e.g., whether a nested-loop join or a hash join is to be used.
• Additional operators that are needed for the physical plan but that were not present explicitly in the logical plan, e.g., scanning and sorting.
• The way in which arguments are passed from one operator to the next.
Estimating Sizes of Intermediate Relations
Rules for estimating the number of tuples in an intermediate relation should:
1. Give accurate estimates
2. Be easy to compute
3. Be logically consistent
The objective of estimation is to select the best physical plan, the one with least cost.
Estimating the Size of a Projection
The projection is different from the other operators, in that the size of the result is computable. Since a projection produces a result tuple for every argument tuple, the only change in the output size is the change in the lengths of the tuples.
Estimating the Size of a Selection (1) & (2)
Let S = σ_{A=c}(R), where A is an attribute of R and c is a constant. Then we recommend as an estimate:
T(S) = T(R)/V(R,A)
The rule above surely holds if all values of attribute A occur equally often in the database.
If S = σ_{a<c}(R), then our estimate for T(S) is: T(S) = T(R)/3
For a not-equal comparison, S = σ_{a≠c}(R), we may use T(S) = T(R)(V(R,a) - 1)/V(R,a) as an estimate.
When the selection condition C is the AND of several equalities and inequalities, we can treat the selection as a cascade of simple selections, each of which checks for one of the conditions.
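The three selection estimates above can be written as small functions. This is a minimal sketch; the function names and the sample statistics are illustrative assumptions, not part of the slides.

```python
def estimate_equality(t_r: float, v_r_a: float) -> float:
    """T(S) for S = sigma_{A=c}(R): T(R) / V(R, A)."""
    return t_r / v_r_a


def estimate_inequality(t_r: float) -> float:
    """T(S) for S = sigma_{a<c}(R): T(R) / 3."""
    return t_r / 3


def estimate_not_equal(t_r: float, v_r_a: float) -> float:
    """T(S) for S = sigma_{a != c}(R): T(R) * (V(R, a) - 1) / V(R, a)."""
    return t_r * (v_r_a - 1) / v_r_a


# Hypothetical statistics: T(R) = 10000 tuples, V(R, a) = 50 distinct values.
print(estimate_equality(10000, 50))   # equality selection
print(estimate_inequality(9000))      # inequality selection
print(estimate_not_equal(10000, 50))  # not-equal selection
```

For an AND of conditions, the same functions can be applied one after another, feeding each result in as the next T(R).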
Estimating the Size of a Selection (3)
A less simple, but possibly more accurate, estimate of the size of S = σ_{C1 OR C2}(R) is to assume that C1 and C2 are independent. If R has n tuples, m1 of which satisfy C1 and m2 of which satisfy C2, we would estimate the number of tuples in S as
n(1 - (1 - m1/n)(1 - m2/n))
In explanation, 1 - m1/n is the fraction of tuples that do not satisfy C1, and 1 - m2/n is the fraction that do not satisfy C2. The product of these numbers is the fraction of R's tuples that are not in S, and 1 minus this product is the fraction that are in S.
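As a quick numerical check of the OR estimate, here is a short sketch. The function name and the sample counts (n, m1, m2) are assumptions chosen for illustration.

```python
def estimate_or(n: float, m1: float, m2: float) -> float:
    """Estimated T(S) for S = sigma_{C1 OR C2}(R), assuming C1 and C2 are independent.

    n  -- number of tuples in R
    m1 -- number of tuples satisfying C1
    m2 -- number of tuples satisfying C2
    """
    fraction_not_in_s = (1 - m1 / n) * (1 - m2 / n)
    return n * (1 - fraction_not_in_s)


# Hypothetical relation: 10000 tuples, 200 satisfy C1, 1000 satisfy C2.
print(round(estimate_or(10000, 200, 1000)))
```

The estimate is never larger than m1 + m2, since tuples satisfying both conditions are counted once.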
Estimating the Size of a Join
Two simplifying assumptions:
1. Containment of Value Sets: if R and S are two relations with attribute Y and V(R,Y) <= V(S,Y), then every Y-value of R will be a Y-value of S.
2. Preservation of Value Sets: if we join a relation R with another relation S, and attribute A is in R but not in S, then all distinct values of A are preserved and not lost: V(S ⋈ R, A) = V(R,A)
Under these assumptions, we estimate
T(R ⋈ S) = T(R)T(S)/max(V(R,Y), V(S,Y))
Natural Joins With Multiple Join Attributes
Of the T(R)T(S) pairs of tuples from R and S, the expected number of pairs that match in both y1 and y2 is:
T(R)T(S) / (max(V(R,y1), V(S,y1)) · max(V(R,y2), V(S,y2)))
In general, the following rule can be used to estimate the size of a natural join when there are any number of attributes shared between the two relations.
The estimate of the size of R ⋈ S is computed by multiplying T(R) by T(S) and dividing by the larger of V(R,y) and V(S,y) for each attribute y that is common to R and S.
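The rule for a two-relation natural join can be sketched as follows. Representing each relation's V(R, y) values as a dictionary keyed by attribute name is an assumption made for this example, as are the sample statistics.

```python
def estimate_join(t_r: float, t_s: float, v_r: dict, v_s: dict) -> float:
    """Estimate T(R join S).

    v_r and v_s map attribute names to V(R, attr) and V(S, attr).
    For every shared attribute y, divide by max(V(R, y), V(S, y)).
    """
    size = t_r * t_s
    for y in v_r.keys() & v_s.keys():
        size /= max(v_r[y], v_s[y])
    return size


# Hypothetical statistics: R(a, b) and S(b, c) share only attribute b.
print(estimate_join(1000, 2000, {"a": 100, "b": 20}, {"b": 50, "c": 100}))

# Two shared attributes y1 and y2.
print(estimate_join(1000, 2000, {"y1": 20, "y2": 100}, {"y1": 50, "y2": 50}))
```

With no shared attributes the loop does nothing and the estimate is the size of the product, T(R)T(S).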
Joins of Many Relations (1) & (2)
Rule for estimating the size of any join:
Start with the product of the number of tuples in each relation. Then, for each attribute A appearing at least twice, divide by all but the least of the V(R,A)'s.
We can estimate the number of values that will remain for attribute A after the join. By the preservation-of-value-sets assumption, it is the least of these V(R,A)'s.
Based on the two assumptions, containment and preservation of value sets: no matter how we group and order the terms in a natural join of n relations, the estimation rules, applied to each join individually, yield the same estimate for the size of the result. Moreover, this estimate is the same that we get if we apply the rule for the join of all n relations as a whole.
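The many-relation rule can be sketched in a few lines. Each relation is given here as a pair (T(R), {attribute: V(R, attribute)}); that representation and the three sample relations are assumptions for illustration only.

```python
from math import prod


def estimate_multiway_join(relations):
    """Estimate the size of the natural join of several relations.

    relations -- list of (tuple_count, {attribute: distinct_value_count}) pairs
    """
    size = prod(t for t, _ in relations)
    attributes = set().union(*(v.keys() for _, v in relations))
    for attr in attributes:
        counts = sorted(v[attr] for _, v in relations if attr in v)
        # Divide by all but the least of the V(R, attr) values.
        for count in counts[1:]:
            size /= count
    return size


# Hypothetical statistics for R(a, b), S(b, c), U(c, d).
r = (1000, {"a": 100, "b": 20})
s = (2000, {"b": 50, "c": 100})
u = (5000, {"c": 500, "d": 10})
print(estimate_multiway_join([r, s, u]))
```

Joining R and S first gives 1000 · 2000 / 50 = 40000 tuples with 100 values of c, and joining that with U gives 40000 · 5000 / 500, the same figure the all-at-once rule produces.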
Estimating Sizes for Other Operations
Union: the average of the sum and the larger of T(R) and T(S).
Intersection:
  approach 1: take the average of the extremes, which is half the smaller.
  approach 2: intersection is an extreme case of the natural join; use the formula T(R ⋈ S) = T(R)T(S)/max(V(R,Y), V(S,Y))
Difference: T(R) - (1/2)·T(S)
Duplicate Elimination: take the smaller of (1/2)·T(R) and the product of all the V(R, a_i)'s.
Grouping and Aggregation: upper-bound the number of groups by a product of V(R,A)'s, where attribute A ranges over only the grouping attributes of L. An estimate is the smaller of (1/2)·T(R) and this product.
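These rules of thumb translate directly into code. The sketch below uses hypothetical function names and sample sizes; the intersection function implements approach 1 only.

```python
from math import prod


def union_size(t_r: float, t_s: float) -> float:
    """Average of the sum T(R) + T(S) and the larger of the two."""
    return ((t_r + t_s) + max(t_r, t_s)) / 2


def intersection_size(t_r: float, t_s: float) -> float:
    """Approach 1: half the smaller relation."""
    return min(t_r, t_s) / 2


def difference_size(t_r: float, t_s: float) -> float:
    """T(R) - (1/2) T(S)."""
    return t_r - t_s / 2


def distinct_size(t_r: float, value_counts) -> float:
    """Smaller of (1/2) T(R) and the product of the V(R, a_i) values.

    The same formula serves for grouping when value_counts holds only
    the V(R, A) values of the grouping attributes.
    """
    return min(t_r / 2, prod(value_counts))


print(union_size(1000, 400))
print(intersection_size(1000, 400))
print(difference_size(1000, 400))
print(distinct_size(1000, [10, 20]))
```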
16.5 Introduction to Cost-Based Plan Selection
The "cost" of evaluating an expression is approximated well by the number of disk I/O's performed.
The number of disk I/O's, in turn, is influenced by:
1. The particular logical operators chosen to implement the query, a matter decided when we choose the logical query plan.
2. The sizes of intermediate results (whose estimation we discussed in Section 16.4).
3. The physical operators used to implement logical operators, e.g., the choice of a one-pass or two-pass join, or the choice to sort or not sort a given relation.
4. The ordering of similar operations, especially joins.
5. The method of passing arguments from one physical operator to the next.
Estimates for Size Parameter
The formulas of Section 16.4 were predicated on knowing certain important parameters, especially T(R), the number of tuples in a relation R, and V(R, a), the number of different values in the column of relation R for attribute a.
A modern DBMS generally allows the user or administrator to request explicitly the gathering of statistics, such as T(R) and V(R, a). These statistics are then used in subsequent query optimizations to estimate the cost of operations.
By scanning an entire relation R, it is straightforward to count the number of tuples T(R) and also to discover the number of different values V(R, a) for each attribute a.
The number of blocks in which R can fit, B(R), can be estimated either by counting the actual number of blocks used (if R is clustered), or by dividing T(R) by the number of tuples per block
Computation of Statistics
Periodic recomputation of statistics is the norm in most DBMS's, for several reasons:
• First, statistics tend not to change radically in a short time.
• Second, even somewhat inaccurate statistics are useful as long as they are applied consistently to all the plans.
• Third, the alternative of keeping statistics up-to-date can make the statistics themselves into a "hot-spot" in the database; because statistics are read frequently, we prefer not to update them frequently too.
The recomputation of statistics might be triggered automatically after some period of time, or after some number of updates.
However, a database administrator who notices that poor-performing query plans are being selected by the query optimizer on a regular basis might request the recomputation of statistics in an attempt to rectify the problem.
Computing statistics for an entire relation R can be very expensive, particularly if we compute V(R, a) for each attribute a in the relation.
One common approach is to compute approximate statistics by sampling only a fraction of the data. For example, let us suppose we want to sample a small fraction of the tuples to obtain an estimate for V(R, a).
Heuristics for Reducing the Cost of Logical Query Plans
One important use of cost estimates for queries or sub-queries is in the application of heuristic transformations of the query.
We have already observed how certain heuristics, applied independently of cost estimates, can be expected almost certainly to improve the cost of a logical query plan.
However, there are other points in the query optimization process where estimating the cost both before and after a transformation will allow us to apply a transformation where it appears to reduce cost and avoid the transformation otherwise.
In particular, when the preferred logical query plan is being generated, we may consider a number of optional transformations and the costs before and after.
Because we are estimating the cost of a logical query plan, and so have not yet made decisions about the physical operators that will be used to implement the operators of relational algebra, our cost estimate cannot be based on disk I/O's.
Rather, we estimate the sizes of all intermediate results using the techniques of Section 16.4, and their sum is our heuristic estimate for the cost of the entire logical plan.
Example Illustrating This
Consider the initial logical query plan shown below.
The statistics for the relations R and S are as follows.
To generate a final logical query plan from it, we shall insist that the selection be pushed down as far as possible. However, we are not sure whether it makes sense to push the δ below the join or not. Thus, we generate from it the two query plans shown in the next slide. They differ in whether we have chosen to eliminate duplicates before or after the join.
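[Figure: two candidate logical query plans annotated with estimated sizes. Plan (a) eliminates duplicates before the join: R (5000) → σ a=10 (100) → δ (50); S (2000) → δ (1000); join (250). Plan (b) eliminates duplicates after the join: R (5000) → σ a=10 (100); S (2000); join (1000) → δ (500).]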
We know how to estimate the size of the result of the selections: we divide T(R) by V(R, a) = 50.
We also know how to estimate the size of the joins; we multiply the sizes of the arguments and divide by max(V(R, b), V(S, b)), which is 200.
Approaches to Enumerating Physical Plans
Let us consider the use of cost estimates in the conversion of a logical query plan to a physical query plan.
The baseline approach, called exhaustive, is to consider all combinations of choices (for each of the issues such as order of joins, physical implementation of operators, and so on).
Each possible physical plan is assigned an estimated cost, and the one with the smallest cost is selected.
There are two broad approaches to exploring the space of possible physical plans:
• Top-down: Here, we work down the tree of the logical query plan from the root.
• Bottom-up: For each sub-expression of the logical-query-plan tree, we compute the costs of all possible ways to compute that sub-expression. The possibilities and costs for a sub-expression E are computed by considering the options for the sub-expressions of E, and combining them in all possible ways with implementations for the root operator of E.
Branch-and-Bound Plan Enumeration
This approach, often used in practice, begins by using heuristics to find a good physical plan for the entire logical query plan. Let the cost of this plan be C. Then as we consider other plans for sub-queries, we can eliminate any plan for a sub-query that has a cost greater than C, since that plan for the sub-query could not possibly participate in a plan for the complete query that is better than what we already know.
Likewise, if we construct a plan for the complete query that has cost less than C, we replace C by the cost of this better plan in subsequent exploration of the space of physical query plans.
Hill Climbing & Dynamic Programming
Hill Climbing
• This approach, in which we really search for a "valley" in the space of physical plans and their costs, starts with a heuristically selected physical plan.
• We can then make small changes to the plan, e.g., replacing one method for an operator by another, or reordering joins by using the associative and/or commutative laws, to find "nearby" plans that have lower cost.
• When we find a plan such that no small modification yields a plan of lower cost, we make that plan our chosen physical query plan.
Dynamic Programming
• In this variation of the general bottom-up strategy, we keep for each sub-expression only the plan of least cost.
• As we work up the tree, we consider possible implementations of each node, assuming the best plan for each sub-expression is also used.
16.6 Choosing an Order for Joins
The argument relations in joins determine the cost of the join.
The left argument of the join is:
• Called the build relation
• Assumed to be smaller
• Stored in main memory
The right argument of the join is:
• Called the probe relation
• Read a block at a time
• Its tuples are matched with those of the build relation
The join algorithms which distinguish between the arguments are:
• One-pass join
• Nested-loop join
• Index join
Join Trees
• Order of arguments is important for joining two relations
• Left argument, since stored in main memory, should be smaller
• With two relations there are only two choices of join tree
• With more than two relations, there are n! ways to order the arguments and therefore n! join trees, where n is the number of relations
Database System The Complete Book Chapter 16 Figure: 16.27
Join Tree Types
• Fig (a), left-deep tree: all the right children are leaves.
• Fig (c), right-deep tree: all the left children are leaves.
• Fig (b): bushy tree
Considering left-deep trees is advantageous for deciding join orders
Dynamic Programming to Select a Join Order and Grouping
We have three choices:
• Consider them all
• Consider a subset
• Use a heuristic to pick one
Dynamic Programming to Select a Join Order and Grouping
• Dynamic programming is used either to consider all join orders or a subset
• Construct a table of costs based on relation size
• Remember only the minimum entry, which will be required to proceed
• A disadvantage of this simple dynamic programming is that it does not involve the actual costs of the joins in the calculations
• It can be improved by:
  – Using disk I/O's for evaluating cost
  – When computing the cost of R1 join R2, since we sum the costs of R1 and R2, we must also compute estimates for their sizes
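A minimal sketch of the table-building idea, restricted to left-deep orders and using the sum of intermediate-result sizes as the cost (the simple size-based measure, not disk I/O's). The relation statistics, the `STATS` layout, and the function names are assumptions made for this example.

```python
from itertools import combinations

# Hypothetical statistics: name -> (T(R), {attribute: V(R, attribute)}).
STATS = {
    "R": (1000, {"a": 100, "b": 200}),
    "S": (1000, {"b": 100, "c": 500}),
    "T": (1000, {"c": 20, "d": 50}),
    "U": (1000, {"a": 50, "d": 1000}),
}


def join_size(names, stats):
    """Estimated size of the natural join of the named relations."""
    size = 1
    by_attr = {}
    for name in names:
        tuples, values = stats[name]
        size *= tuples
        for attr, count in values.items():
            by_attr.setdefault(attr, []).append(count)
    for counts in by_attr.values():
        for count in sorted(counts)[1:]:  # all but the least V(R, attr)
            size /= count
    return size


def best_left_deep_order(stats):
    """Return (cost, order); cost is the sum of intermediate-result sizes."""
    names = sorted(stats)
    # Table entry for each subset: cheapest (cost, join order) found so far.
    best = {frozenset([n]): (0, (n,)) for n in names}
    for k in range(2, len(names) + 1):
        for subset in combinations(names, k):
            key = frozenset(subset)
            candidates = []
            for last in subset:
                rest = key - {last}
                cost, order = best[rest]
                # Joining `last` onto `rest` materializes `rest` as an
                # intermediate result unless `rest` is a stored relation.
                intermediate = join_size(rest, stats) if len(rest) > 1 else 0
                candidates.append((cost + intermediate, order + (last,)))
            best[key] = min(candidates)  # remember only the minimum entry
    return best[frozenset(names)]


print(best_left_deep_order(STATS))
```

Only one entry per subset of relations is kept, which is what makes the table far smaller than the list of all n! orders.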
A Greedy Algorithm for Selecting a Join Order
• It is expensive to use an exhaustive method like dynamic programming
• A better approach is to use a join-order heuristic for the query optimization
• The greedy algorithm is an example of that: make one decision at a time and never backtrack on the decisions once made
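One common greedy heuristic, sketched here under assumptions, starts with the pair of relations whose join is estimated to be smallest and then repeatedly adds the relation that keeps the running join smallest. The statistics and function names are hypothetical, and the function expects at least two relations.

```python
from itertools import combinations

# Hypothetical statistics: name -> (T(R), {attribute: V(R, attribute)}).
STATS = {
    "R": (1000, {"a": 100, "b": 200}),
    "S": (1000, {"b": 100, "c": 500}),
    "T": (1000, {"c": 20, "d": 50}),
    "U": (1000, {"a": 50, "d": 1000}),
}


def join_size(names, stats):
    """Estimated size of the natural join of the named relations."""
    size = 1
    by_attr = {}
    for name in names:
        tuples, values = stats[name]
        size *= tuples
        for attr, count in values.items():
            by_attr.setdefault(attr, []).append(count)
    for counts in by_attr.values():
        for count in sorted(counts)[1:]:  # all but the least V(R, attr)
            size /= count
    return size


def greedy_join_order(stats):
    """Pick a left-deep join order one relation at a time, never backtracking."""
    names = sorted(stats)
    first_pair = min(combinations(names, 2), key=lambda p: join_size(p, stats))
    order = list(first_pair)
    remaining = [n for n in names if n not in first_pair]
    while remaining:
        nxt = min(remaining, key=lambda n: join_size(order + [n], stats))
        order.append(nxt)
        remaining.remove(nxt)
    return order


print(greedy_join_order(STATS))
```

The greedy choice examines far fewer candidates than an exhaustive search, at the price of possibly missing the cheapest order.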
16.7 Completing the Physical-Query-Plan
Issues Regarding Physical Query Plan
• Decisions regarding when intermediate results will be materialized and when they will be pipelined.
• Selection of algorithms to implement the operations of the query plan, when algorithm-selection was not done as part of some earlier step such as selection of a join order by dynamic programming.
• Notation for physical-query-plan operators, which must include details regarding access methods for stored relations and algorithms for implementation of relational-algebra operators.
Choosing a Selection Method
• We must pick an algorithm for each selection operator.
• Assuming there are no multidimensional indexes on several of the attributes, each physical plan uses some number of attributes that each:
  – Have an index.
  – Are compared to a constant in one of the terms of the selection.
• We then use these indexes to identify the sets of tuples that satisfy each of the conditions.
Choosing a Selection Method (continued)
We discuss physical plans that:
• Use one comparison of the form Aθc, where A is an attribute with an index, c is a constant, and θ is a comparison operator such as = or <.
• Retrieve all tuples that satisfy the above comparison, using the index-scan physical operator.
• Consider each tuple selected above to decide whether it satisfies the rest of the selection condition.
We decide among the physical plans with which to implement a given selection by estimating the cost of reading data for each possible option.
We shall count only the cost of accessing the data blocks, not the index blocks.
Join Method
• One approach is to call for the one-pass join, hoping that the buffer manager can devote enough buffers to the join, or that the buffer manager can come close, so thrashing is not a major cost.
• An alternative is to choose a nested-loop join, hoping that if the left argument cannot be granted enough buffers to fit in memory at once, then that argument will not have to be divided into too many pieces, and the resulting join will still be reasonably efficient.
• A sort-join is a good choice when either:
  – One or both arguments are already sorted on their join attributes, or
  – There are two or more joins on the same attribute, such as (R(a,b) ⋈ S(a,c)) ⋈ T(a,d), where sorting R and S on a will cause the result of R ⋈ S to be sorted on a and used directly in a second sort-join.
Pipelining vs. Materialization
• The naïve way to execute a query plan is to order the operations appropriately and store the results of each operation on disk until it is needed by another operation. This strategy is called materialization.
• A more subtle way to execute a query plan is to interleave the execution of several operations. The tuples produced by one operation are passed directly to the operation that uses them, without ever storing the intermediate tuples on disk. This approach is called pipelining.
• Since pipelining saves disk I/O's, there is an obvious advantage to pipelining, but there is a corresponding disadvantage. Since several operations must share main memory at any time, there is a chance that an algorithm with higher disk-I/O requirements must be chosen, or thrashing will occur, thus giving back all the disk-I/O savings that were gained by pipelining.
Pipelining Unary Operations
Selection and projection are excellent candidates for pipelining. Since these operations are tuple-at-a-time, we never need to have more than one block for input and one for output.
Figure 1:
Pipelining Binary Operations
We use one buffer to pass the results to its consumer, one block at a time.
The number of other buffers needed to compute the results and to consume the results varies, depending on the size of the result and the sizes of other relations involved in the query.
Notations for Physical Query Plans
• Each operator of the logical plan becomes one or more operators of the physical plan, and leaves (stored relations) of the logical plan become, in the physical plan, one of the scan operators applied to that relation.
• Materialization would be indicated by a Store operator applied to the intermediate result that is to be materialized, followed by a suitable scan operator when the materialized result is accessed by its consumer.
• We shall indicate that a certain intermediate relation is materialized by a double line crossing the edge between that relation and its consumer.
• All other edges are assumed to represent pipelining between the supplier and consumer of tuples.
Operators for Leaves
Each relation R that is a leaf operand of the logical-query-plan tree will be replaced by a scan operator. The options are:
• TableScan(R): All blocks holding tuples of R are read in arbitrary order.
• SortScan(R, L): Tuples of R are read in order, sorted according to the attribute(s) on list L.
• IndexScan(R, C): Here, C is a condition of the form Aθc, where A is an attribute of R, θ is a comparison such as = or <, and c is a constant. Tuples of R are accessed through an index on attribute A. If the comparison θ is not =, then the index must be one, such as a B-tree, that supports range queries.
• IndexScan(R, A): Here A is an attribute of R. The entire relation R is retrieved via an index on R.A. This operator behaves like TableScan, but may be more efficient in certain circumstances, if R is not clustered and/or its blocks are not easily found.
Physical Operators for Selection
• A logical operator σ_C(R) is often combined, or partially combined, with the access method for relation R, when R is a stored relation.
• Other selections, where the argument is not a stored relation or an appropriate index is not available, will be replaced by the corresponding physical operator we have called Filter.
• The notation we shall use for the various selection implementations is:
  – We may simply replace σ_C(R) by the operator Filter(C).
  – If condition C can be expressed as Aθc AND D for some other condition D, and there is an index on R.A, then we may:
    · Use the operator IndexScan(R, Aθc) to access R, and
    · Use Filter(D) in place of the selection σ_C(R).
Physical Sort Operators
Sorting of a relation can occur at any point in the physical query plan.
When we apply a sort-based algorithm for operations such as join or grouping, there is an initial phase in which we sort the argument according to some list of attributes
It is common to use an explicit physical operator Sort(L) to perform this sort on an operand relation that is not stored.
Other Relational Algebra Operations
All other operations are replaced by a suitable physical operator.
These operators can be given designations that indicate:
• The operation being performed, e.g., join or grouping.
• Necessary parameters, e.g., the condition in a theta-join or the list of elements in a grouping.
• A general strategy for the algorithm: sort-based, hash-based, or in some joins, index-based.
• The decision about the number of passes to be used: one-pass, two-pass, or multi-pass.
• An anticipated number of buffers the operation will require.
Ordering of Physical Operations
The following rules summarize the ordering of events implicit in a physical-query-plan tree:
• Break the tree into sub-trees at each edge that represents materialization. The sub-trees will be executed one at a time.
• Order the execution of the sub-trees in a bottom-up, left-to-right manner. To be precise, perform a preorder traversal of the entire tree. Order the sub-trees in the order in which the preorder traversal exits from the sub-trees.
• Execute all nodes of each sub-tree using a network of iterators. Thus, all the nodes in one sub-tree are executed simultaneously, with GetNext calls among their operators determining the exact order of events.
Following this strategy, the query optimizer can now generate executable code, perhaps a sequence of function calls, for the query.
Chapter 16 Ends
Chapter 18 Starts
Chapter 18
CONCURRENCY CONTROL
18.1 Serial and Serializable Schedule
18.2 Conflict-Serializability
18.3 Enforcing Serializability by Locks
18.4 Locking Systems With Several Lock Modes
18.5 An Architecture for a Locking Scheduler
18.6 Managing Hierarchies of Database Elements
18.7 The Tree Protocol
18.8 Concurrency Control by Timestamp
18.9 Concurrency Control by Validation
What Is Concurrency Control & Who Controls It?
The process of assuring that transactions preserve consistency when executing simultaneously is called concurrency control.
This consistency is taken care of by the scheduler.
[Figure: read/write requests passing through the scheduler to the buffers.]
Correctness Principle
• The principle states that a transaction that starts in a correct database state ends in a correct database state.
• Does the system really follow the correctness principle all the time?
18.1 Serial and Serializable Schedule
Basic Example Schedule
T1
READ (A,t)
t := t+100
WRITE (A,t)
READ (B,t)
t := t+100
WRITE (B,t)
T2
READ (A,s)
s := s*2
WRITE (A,s)
READ (B,s)
s := s*2
WRITE (B,s)
Initially A = B = 50. To be consistent, the final state should satisfy A = B.
Serial Schedule T1 T2 A B
50 50
READ (A,t)
t := t+100
WRITE (A,t) 150
READ (B,t)
t := t+100
WRITE (B,t) 150
READ (A,s)
s := s*2
WRITE (A,s) 300
READ (B,s)
s := s*2
WRITE (B,s) 300
(T1,T2)
A := 2*(A+100)
Does the order really matter? T1 T2 A B
50 50
READ (A,s)
s := s*2
WRITE (A,s) 100
READ (B,s)
s := s*2
WRITE (B,s) 100
READ (A,t)
t := t+100
WRITE (A,t) 200
READ (B,t)
t := t+100
WRITE (B,t) 200
(T2,T1)
The final state of a database is not independent of the order of transactions.
Serializable Schedule T1 T2 A B
50 50
READ (A,t)
t := t+100
WRITE (A,t) 150
READ (A,s)
s := s*2
WRITE (A,s) 300
READ (B,t)
t := t+100
WRITE (B,t) 150
READ (B,s)
s := s*2
WRITE (B,s) 300
Serializable but not Serial Schedule
Non-Serializable Schedule T1 T2 A B
50 50
READ (A,t)
t := t+100
WRITE (A,t) 150
READ (A,s)
s := s*2
WRITE (A,s) 300
READ (B,s)
s := s*2
WRITE (B,s) 100
READ (B,t)
t := t+100
WRITE (B,t) 200
A := 2*(A+100)
B := 2*B + 100
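The four schedules above can be replayed with a small simulator to confirm the final values of A and B. This is a sketch: the tuple encoding of steps and the helper names t1, t2, and run are assumptions introduced for the example.

```python
def run(schedule, a=50, b=50):
    """Execute a schedule of (transaction, operation, argument) steps.

    Each transaction has one local variable (t for T1, s for T2).
    """
    db = {"A": a, "B": b}
    local = {}
    for txn, op, arg in schedule:
        if op == "READ":
            local[txn] = db[arg]
        elif op == "WRITE":
            db[arg] = local[txn]
        elif op == "ADD":
            local[txn] += arg
        elif op == "MUL":
            local[txn] *= arg
    return db["A"], db["B"]


def t1(x):
    """T1's three steps on element x: read, add 100, write."""
    return [("T1", "READ", x), ("T1", "ADD", 100), ("T1", "WRITE", x)]


def t2(x):
    """T2's three steps on element x: read, double, write."""
    return [("T2", "READ", x), ("T2", "MUL", 2), ("T2", "WRITE", x)]


serial_t1_t2 = t1("A") + t1("B") + t2("A") + t2("B")
serial_t2_t1 = t2("A") + t2("B") + t1("A") + t1("B")
serializable = t1("A") + t2("A") + t1("B") + t2("B")
non_serializable = t1("A") + t2("A") + t2("B") + t1("B")

for name, sched in [("(T1,T2)", serial_t1_t2), ("(T2,T1)", serial_t2_t1),
                    ("serializable", serializable),
                    ("non-serializable", non_serializable)]:
    print(name, run(sched))
```

Only the last schedule leaves A and B unequal, which is why it cannot be equivalent to either serial order.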
A Serializable Schedule with details
T1 T2 A B
50 50
READ (A,t)
t := t+100
WRITE (A,t) 150
READ (A,s)
s := s*1
WRITE (A,s) 150
READ (B,s)
s := s*1
WRITE (B,s) 50
READ (B,t)
t := t+100
WRITE (B,t) 150
A := 1*(A+100)
B := 1*B + 100
Notations for Transaction
1. Action: An expression of the form ri(X) or wi(X), meaning that transaction Ti reads or writes, respectively, the database element X.
2. Transaction: A transaction Ti is a sequence of actions with subscript i.
3. Schedule: A schedule S of a set of transactions T is a sequence of actions in which, for each transaction Ti in T, the actions of Ti appear in S in the same order that they appear in the definition of Ti itself.
Notational Example
T1
READ (A,t)
t := t+100
WRITE (A,t)
READ (B,t)
t := t+100
WRITE (B,t)
T2
READ (A,s)
s := s*2
WRITE (A,s)
READ (B,s)
s := s*2
WRITE (B,s)
Notation:
T1: r1(A); w1(A); r1(B); w1(B)
T2: r2(A); w2(A); r2(B); w2(B)
Concurrency Control
Concurrency control in database management systems (DBMS) ensures that database transactions are performed concurrently without the concurrency violating the data integrity of a database.
Executed transactions should follow the ACID rules. The DBMS must guarantee that only serializable (unless Serializability is intentionally relaxed), recoverable schedules are generated.
It also guarantees that no effect of committed transactions is lost, and no effect of aborted (rolled back) transactions remains in the related database.
Transaction ACID rules
Atomicity - Either the effects of all or none of its operations remain when a transaction is completed - in other words, to the outside world the transaction appears to be indivisible, atomic.
Consistency - Every transaction must leave the database in a consistent state.
Isolation - Transactions cannot interfere with each other. Providing isolation is the main goal of concurrency control.
Durability - Successful transactions must persist through crashes.
Serial and Serializable Schedules
In the field of databases, a schedule is a list of actions, (i.e. reading, writing, aborting, committing), from a set of transactions.
In this example, Schedule D is the set of 3 transactions T1, T2, T3. The schedule describes the actions of the transactions as seen by the DBMS. T1 Reads and writes to object X, and then T2 Reads and writes to object Y, and finally T3 Reads and writes to object Z. This is an example of a serial schedule, because the actions of the 3 transactions are not interleaved.
Serial and Serializable Schedules
A schedule that is equivalent to a serial schedule has the serializability property.
In schedule E, the order in which the actions of the transactions are executed is not the same as in D, but in the end, E gives the same result as D.
Serial Schedule: T1 precedes T2

T1                      T2                      A     B
                                                25    25
Read(A); A := A+100
Write(A);                                       125
Read(B); B := B+100
Write(B);                                             125
                        Read(A); A := A*2
                        Write(A);               250
                        Read(B); B := B*2
                        Write(B);                     250
Final state                                     250   250
Serial Schedule: T2 precedes T1

T1                      T2                      A     B
                                                25    25
                        Read(A); A := A*2
                        Write(A);               50
                        Read(B); B := B*2
                        Write(B);                     50
Read(A); A := A+100
Write(A);                                       150
Read(B); B := B+100
Write(B);                                             150
Final state                                     150   150
Serializable, but not serial, schedule

T1                      T2                      A     B
                                                25    25
Read(A); A := A+100
Write(A);                                       125
                        Read(A); A := A*2
                        Write(A);               250
Read(B); B := B+100
Write(B);                                             125
                        Read(B); B := B*2
                        Write(B);                     250
Final state                                     250   250

r1(A); w1(A); r2(A); w2(A); r1(B); w1(B); r2(B); w2(B);
Non-serializable schedule

T1                      T2                      A     B
                                                25    25
Read(A); A := A+100
Write(A);                                       125
                        Read(A); A := A*2
                        Write(A);               250
                        Read(B); B := B*2
                        Write(B);                     50
Read(B); B := B+100
Write(B);                                             150
Final state                                     250   150
Schedule that is serializable only because of the detailed behavior of the transactions

T1                      T2'                     A     B
                                                25    25
Read(A); A := A+100
Write(A);                                       125
                        Read(A); A := A*1
                        Write(A);               125
                        Read(B); B := B*1
                        Write(B);                     25
Read(B); B := B+100
Write(B);                                             125
Final state                                     125   125

Regardless of the consistent initial state, the final state will be consistent.
Non-Conflicting Actions
Two actions are non-conflicting if, whenever they occur consecutively in a schedule, swapping them does not affect the final state produced by the schedule. Otherwise, they are conflicting.
18.2 Conflict-Serializability
Two actions of the same transaction conflict, e.g., r1(A); w1(B)
Two actions over the same database element conflict if one of them is a write, e.g., r1(A); w2(A) and w1(A); w2(A)
Conflicting Actions
Two or more actions are said to be in conflict if:
• The actions belong to different transactions.
• At least one of the actions is a write operation.
• The actions access the same object (read or write).
The following set of actions is conflicting:
  T1:R(X), T2:W(X), T3:W(X)
While the following sets of actions are not:
  T1:R(X), T2:R(X), T3:R(X)
  T1:R(X), T2:W(Y), T3:R(X)
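The three conditions can be checked mechanically for a pair of actions. In this sketch an action is encoded as a tuple (transaction, operation, object); that encoding and the function name are assumptions for illustration.

```python
def conflicts(x, y) -> bool:
    """True if actions x and y conflict.

    Each action is (transaction, operation, obj) with operation "R" or "W".
    """
    txn_x, op_x, obj_x = x
    txn_y, op_y, obj_y = y
    different_transactions = txn_x != txn_y
    same_object = obj_x == obj_y
    one_is_write = "W" in (op_x, op_y)
    return different_transactions and same_object and one_is_write


print(conflicts(("T1", "R", "X"), ("T2", "W", "X")))  # read vs. write on X
print(conflicts(("T1", "R", "X"), ("T2", "R", "X")))  # two reads
print(conflicts(("T2", "W", "Y"), ("T3", "R", "X")))  # different objects
```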
Conflict Serializable
We may take any schedule and make as many non-conflicting swaps as we wish, with the goal of turning the schedule into a serial schedule.
If we can do so, then the original schedule is serializable, because its effect on the database state remains the same as we perform each of the non-conflicting swaps.
Conflict Serializable
A schedule is said to be conflict-serializable when the schedule is conflict-equivalent to one or more serial schedules.
Another definition for conflict-serializability is that a schedule is conflict-serializable if and only if its precedence graph (serializability graph) is acyclic.
Which is conflict-equivalent to the serial schedule <T1,T2>, but not <T2,T1>.
Conflict equivalent / conflict-serializable
Let Ai and Aj be consecutive non-conflicting actions that belong to different transactions. We can swap Ai and Aj without changing the result.
Two schedules are conflict equivalent if they can be turned one into the other by a sequence of non-conflicting swaps of adjacent actions.
We shall call a schedule conflict-serializable if it is conflict-equivalent to a serial schedule.
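Conflict-serializability can be tested with the precedence graph mentioned earlier: add an edge Ti → Tj whenever an action of Ti precedes a conflicting action of Tj, then look for a cycle. The sketch below is one way to do this; the parser for the r1(A); w1(A) notation and the function names are assumptions for the example.

```python
import re


def parse(text):
    """Turn 'r1(A); w2(B)' into [(1, 'R', 'A'), (2, 'W', 'B')]."""
    return [(int(txn), op.upper(), elem)
            for op, txn, elem in re.findall(r"([rw])(\d+)\((\w+)\)", text)]


def precedence_edges(schedule):
    """Edges (Ti, Tj): an action of Ti precedes a conflicting action of Tj."""
    edges = set()
    for i, (ti, op_i, elem_i) in enumerate(schedule):
        for tj, op_j, elem_j in schedule[i + 1:]:
            if ti != tj and elem_i == elem_j and "W" in (op_i, op_j):
                edges.add((ti, tj))
    return edges


def conflict_serial_order(schedule):
    """Return an equivalent serial order, or None if the graph has a cycle."""
    txns = sorted({t for t, _, _ in schedule})
    edges = precedence_edges(schedule)
    indegree = {t: 0 for t in txns}
    for _, target in edges:
        indegree[target] += 1
    ready = [t for t in txns if indegree[t] == 0]
    order = []
    while ready:  # topological sort
        current = ready.pop(0)
        order.append(current)
        for source, target in sorted(edges):
            if source == current:
                indegree[target] -= 1
                if indegree[target] == 0:
                    ready.append(target)
    return order if len(order) == len(txns) else None


good = parse("r1(A); w1(A); r2(A); w2(A); r1(B); w1(B); r2(B); w2(B)")
bad = parse("r1(A); w1(A); r2(A); w2(A); r2(B); w2(B); r1(B); w1(B)")
print(conflict_serial_order(good))
print(conflict_serial_order(bad))
```

The first schedule has the single edge T1 → T2, so it is conflict-equivalent to the serial order (T1, T2). The second has edges in both directions and therefore no equivalent serial order.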
Conflict-serializable
T1 T2
R(A)
W(A)
R(A)
R(B)
W(A)
W(B)
R(B)
W(B)
T1 T2
R(A)
W(A)
R(B)
R(A)
W(A)
W(B)
R(B)
W(B)
conflict-serializable
T1 T2
R(A)
W(A)
R(A)
R(B)
W(B)
W(A)
R(B)
W(B)
T1 T2
R(A)
W(A)
R(A)
W(B)
R(B)
W(A)
R(B)
W(B)
Serial Schedule
18.3 Enforcing Serializability by Locks
• Locks
• Locking scheduler
• Two-phase locking
Locks
It works as follows:
• A request arrives from a transaction
• The scheduler checks the lock table
• The scheduler generates a serializable schedule of actions
Consistency of transactions
• Actions and locks must relate to each other:
  – A transaction can read or write an element only if it holds a lock on that element and has not released it.
  – Unlocking an element is compulsory.
Legality of schedules
• No two transactions can acquire the lock on the same element without the prior one releasing it.
Locking scheduler
• Grants a lock request only if it results in a legal schedule.
• The lock table stores the information about current locks on the elements.
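Both conditions can be checked on a schedule written as a list of (transaction, action, element) steps, where the action is "l" (lock), "u" (unlock), "r", or "w". The encoding and the function names below are assumptions made for this sketch.

```python
def is_consistent(schedule) -> bool:
    """Every read/write happens under a held lock, and every lock is released."""
    held = set()
    for txn, action, elem in schedule:
        if action == "l":
            held.add((txn, elem))
        elif action == "u":
            if (txn, elem) not in held:
                return False
            held.discard((txn, elem))
        elif (txn, elem) not in held:  # a read or write without a lock
            return False
    return not held  # unlocking is compulsory


def is_legal(schedule) -> bool:
    """No element is locked by two transactions at the same time."""
    lock_table = {}  # element -> transaction currently holding the lock
    for txn, action, elem in schedule:
        if action == "l":
            if elem in lock_table:
                return False
            lock_table[elem] = txn
        elif action == "u" and lock_table.get(elem) == txn:
            del lock_table[elem]
    return True


ok = [(1, "l", "A"), (1, "r", "A"), (1, "w", "A"), (1, "u", "A"),
      (2, "l", "A"), (2, "r", "A"), (2, "u", "A")]
overlapping = [(1, "l", "A"), (2, "l", "A"), (1, "u", "A"), (2, "u", "A")]
print(is_consistent(ok), is_legal(ok))
print(is_legal(overlapping))
```

A locking scheduler would apply the legality test before granting each lock request, delaying the request instead of rejecting the whole schedule.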
The locking scheduler (continued)
A schedule of consistent transactions can be legal and still, unfortunately, not be serializable.
Locking scheduler (continued)
The locking scheduler delays requests that would result in an illegal schedule.
T1                          T2                          A      B
l1(A); r1(A);
A := A+100;
w1(A); l1(B); u1(A);                                    125
                            l2(A); r2(A);
                            A := A*2;
                            w2(A);                      250
                            l2(B) Denied
r1(B); B := B+100;
w1(B); u1(B);                                                  125
                            l2(B); u2(A); r2(B);
                            B := B*2;
                            w2(B); u2(B);                      250
Two-phase locking
Guarantees that a legal schedule of consistent transactions is conflict-serializable.
In every transaction, all lock requests precede all unlock requests.
The growing phase: the transaction obtains locks and releases none.
The shrinking phase: the transaction releases locks and obtains none.
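The two-phase condition is easy to check mechanically. The sketch below assumes a hypothetical encoding in which one transaction is a list of (kind, element) pairs with kind 'lock', 'unlock', 'read' or 'write'; the function name is illustrative only.

```python
def is_two_phase(actions):
    """Return True if no lock request follows an unlock request."""
    shrinking = False
    for kind, _element in actions:
        if kind == "unlock":
            shrinking = True      # the shrinking phase has begun
        elif kind == "lock" and shrinking:
            return False          # a lock after an unlock violates 2PL
    return True

two_phase = [("lock", "A"), ("read", "A"), ("write", "A"),
             ("lock", "B"), ("unlock", "A"),
             ("read", "B"), ("write", "B"), ("unlock", "B")]
not_two_phase = [("lock", "A"), ("read", "A"), ("unlock", "A"),
                 ("lock", "B"), ("read", "B"), ("unlock", "B")]
print(is_two_phase(two_phase), is_two_phase(not_two_phase))
```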
How two-phase locking works
Assures serializability.
Two stricter variants of 2PL:
Strict two-phase locking: a transaction holds all its exclusive locks until it commits or aborts.
Rigorous two-phase locking: a transaction holds all its locks until it commits or aborts.
For any transaction Ti that is not two-phase locked, it is possible to find a 2PL transaction Tj and a schedule S for Ti and Tj that is not conflict-serializable.
2PL failures
2PL does not protect against deadlocks.
T1                          T2                          A      B
                                                        25     25
l1(A); r1(A);
                            l2(B); r2(B);
A := A+100;
                            B := B*2;
w1(A);                                                  125
                            w2(B);                             50
l1(B) Denied
                            l2(A) Denied
18.4 Locking Systems With Several Lock Modes
Locking scheme
Shared/read lock (for reading)
Exclusive/write lock (for writing)
Compatibility matrices
Upgrading locks
Update locks
Increment locks
Shared & Exclusive Locks
Consistency of transactions
A transaction cannot write without holding an exclusive lock, and cannot read without holding some lock.
This rests on two principles:
A read action must be preceded by a shared or an exclusive lock on the element.
A write action must be preceded by an exclusive lock on the element.
Every lock must be released before the transaction commits.
Shared and exclusive locks (continued)
Two-phase locking of transactions
All locking must precede all unlocking.
Legality of schedules
An element may be locked exclusively by one transaction or in shared mode by several, but not both.
T1                          T2
sl1(A); r1(A);
A := A+100;
                            sl2(A); r2(A);
                            sl2(B); r2(B);
xl1(B) Denied
                            u2(A); u2(B);
xl1(B); r1(B); u1(B);
u1(A);
Compatibility Matrices
A compatibility matrix has a row and a column for each lock mode.
Rows correspond to a lock already held on an element by another transaction.
Columns correspond to the mode of the lock requested.
Example:
                    Lock requested
                    S      X
Lock held    S      Yes    No
             X      No     No
Upgrading Locks
Suppose a transaction wants to read as well as write an element:
It acquires a shared lock on the element.
It performs its calculations on the element.
When it is ready to write, it is granted an exclusive lock.
Transactions whose read and write needs cannot be predicted in advance can use upgrading locks.
Upgrading locks (continued)
Indiscriminate use of upgrading can produce a deadlock.
Example: two transactions each hold a shared lock on the same element and both want to upgrade it. Neither upgrade can be granted while the other transaction still holds its shared lock.
Update locks
Update locks avoid the deadlock that can occur with the upgrade method.
A transaction holding an update lock can read the element but cannot write it.
An update lock can later be converted to an exclusive lock.
An update lock can be granted on an element that already has shared locks, but not on one that already has an update or exclusive lock.
Update locks (continued)
An update lock looks like a shared lock when it is being requested and like an exclusive lock once it is held.
Compatibility matrix (rows: lock held; columns: lock requested):
       S      X      U
S      Yes    No     Yes
X      No     No     No
U      No     No     No
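The matrix above can be written as a lookup table. The following is a minimal sketch, with the dictionary layout and the name `can_grant` chosen for illustration; it decides whether a requested mode is compatible with every lock that other transactions already hold on the element. Note the asymmetry: U may be granted while S locks are held, but S may not be granted while a U lock is held.

```python
# COMPAT[held][requested], copied from the S/X/U matrix.
COMPAT = {
    "S": {"S": True,  "X": False, "U": True},
    "X": {"S": False, "X": False, "U": False},
    "U": {"S": False, "X": False, "U": False},
}

def can_grant(requested, held_modes):
    """held_modes lists the modes held on the element by other transactions."""
    return all(COMPAT[held][requested] for held in held_modes)

print(can_grant("U", ["S", "S"]))  # update lock alongside shared locks
print(can_grant("S", ["U"]))       # shared lock while an update lock is held
```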
Increment Locks
Used for incrementing and decrementing stored values.
Examples: transferring money from one bank to another; ticket-selling transactions, in which the number of seats is decremented after each transaction.
Increment locks (continued)
An increment lock permits only increment actions; it does not permit reading or writing the element.
Any number of transactions can hold an increment lock on an element at the same time.
Shared and exclusive locks cannot be granted while an increment lock is held on the element.
Compatibility matrix (rows: lock held; columns: lock requested):
       S      X      I
S      Yes    No     No
X      No     No     No
I      No     No     Yes
18.5 Locking Scheduler
Scheduler That Inserts Lock Actions
The scheduler has two parts, which work as follows:
Part I takes the stream of requests generated by the transactions and inserts the appropriate lock actions ahead of the database operations (read, write, or update).
Part II takes the actions (lock or database operations) passed on by Part I and executes them.
Part II determines the transaction T that each action belongs to and the status of T (delayed, waiting for a lock, or not).
If T is not delayed, a database access action is transmitted to the database and executed.
Scheduler That Inserts Lock Actions (continued)
If Part II receives a lock action, it checks the lock table to see whether the lock can be granted:
i) If it can be granted, the lock table is modified to include the granted lock.
ii) If it cannot, the lock table is updated to record the requested lock, and Part II delays transaction T.
When T commits or aborts, Part I is notified by the transaction manager and releases all locks held by T. If any transactions are waiting for those locks, Part I notifies Part II.
When Part II is notified that a lock on some database element is available, it determines the next transaction T' to receive the lock and lets T' continue.
The Lock Table
A relation that associates database elements with locking information about that element.
Implemented as a hash table using database elements as the hash key.
Its size is proportional to the number of locked elements only, not to the size of the entire database.
Figure: the lock table maps DB element A to the lock information for A.
Lock Table Entry Structure
Information found in a lock table entry for element A:
1) Group mode
S: only shared locks are held
X: one exclusive lock and no other locks
U: one update lock and possibly one or more shared locks
2) Waiting bit: at least one transaction is waiting for a lock on A
3) A list of the transactions that currently hold locks on A or are waiting for a lock on A
Handling Lock Requests
Suppose transaction T requests a lock on A.
If there is no lock table entry for A, then there are no locks on A, so create the entry and grant the lock request.
If the lock table entry for A exists, use the group mode to guide the decision about the lock request.
Handling Lock Requests (continued)
If the group mode is U (update) or X (exclusive):
No other lock can be granted.
Deny the lock request by T.
Place an entry on the list saying T requests a lock, with Wait? = 'yes'.
If the group mode is S (shared):
Another shared or update lock can be granted.
Grant the request for an S or U lock.
Create an entry for T on the list with Wait? = 'no'.
Change the group mode to U if the new lock is an update lock.
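The request-handling rules can be sketched as a small class. This is an illustration only: the class name, the dictionary layout of an entry, and the simplifications (no unlock handling, no fairness policy for waiters) are assumptions of the sketch, not part of the slides.

```python
class LockTable:
    def __init__(self):
        # element -> {"group": mode, "waiting": bool, "list": [(txn, mode, wait)]}
        self.entries = {}

    def request(self, txn, element, mode):
        """Return True if the lock is granted, False if txn must wait."""
        entry = self.entries.get(element)
        if entry is None:
            # No entry means no locks on the element: create it and grant.
            self.entries[element] = {"group": mode, "waiting": False,
                                     "list": [(txn, mode, False)]}
            return True
        if entry["group"] == "S" and mode in ("S", "U"):
            entry["list"].append((txn, mode, False))   # Wait? = 'no'
            if mode == "U":
                entry["group"] = "U"
            return True
        # Group mode U or X, or an X request against shared locks: deny.
        entry["list"].append((txn, mode, True))        # Wait? = 'yes'
        entry["waiting"] = True
        return False

table = LockTable()
print(table.request("T1", "A", "S"))   # first lock on A
print(table.request("T2", "A", "U"))   # update lock joins the shared lock
print(table.request("T3", "A", "S"))   # denied: group mode is now U
```

A hash map keyed by element keeps the table proportional to the number of locked elements, matching the lock table slide.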
Handling Unlock Requests
Now suppose transaction T unlocks A.
Delete T's entry on the list for A.
If T's lock is not the same as the group mode, there is no need to change the group mode.
Otherwise, check the entire list to find the new group mode:
If T held an S lock, the new group mode is S or nothing.
If T held a U lock, the new group mode is S or nothing.
If T held an X lock, there are no other locks, so the new group mode is nothing.
Handling Unlock Requests (continued)
If the waiting bit is 'yes', one or more locks must be granted, using one of the following approaches:
First-come-first-served: grant the lock to the request that has waited longest. No starvation (waiting forever for a lock) can occur.
Priority to shared locks: grant all waiting S locks, then one U lock. Grant an X lock only if no others are waiting.
Priority to upgrading: if there is a U lock waiting to upgrade to an X lock, grant that first.
18.6 Managing Hierarchies of Database Elements
This section focuses on two problems that come up when there is a tree structure to our data.
The first kind of tree structure is a hierarchy of lockable elements: how do we allow locks both on large elements, such as relations, and on the smaller elements within them, such as blocks or individual tuples?
The second kind is data that is itself organized in a tree. A major example is a B-tree index.
Locks With Multiple Granularity
Database elements: the term covers the various units of data that can be used for locking, e.g. tuples, pages or blocks, and relations.
Granularity of locks: the choice of which database element is used as the unit of locking separates locks into two types:
1) Large-grained
2) Small-grained
Example
Small-granularity locks: greater concurrency can be achieved.
Large-granularity locks: sometimes needed to prevent unserializable behavior.
Warning locks
The solution to the problem of managing locks at different granularities involves a new kind of lock called a "warning".
Warning locks are useful when the database elements form a hierarchy or nested structure.
The scheme involves both "ordinary" locks and "warning" locks.
Ordinary locks: shared (S) and exclusive (X) locks.
Warning locks: intention-to-share (IS) and intention-to-lock-exclusively (IX) locks.
Warning Protocol
These are the rules to follow when placing locks on the elements of a hierarchy:
1. To place an ordinary S or X lock on any element, we must begin at the root of the hierarchy.
2. If we are at the element that we want to lock, we need look no further; we request the S or X lock on that element.
3. If the element we want is further down the hierarchy, we place a warning lock on the current node (IS if we want a shared lock, IX if we want an exclusive lock). We then move to the appropriate child and repeat step 2 or 3 until we reach the desired node, where we request the shared or exclusive lock.
Compatibility Matrix
IS column: conflicts only with an X lock.
IX column: conflicts with S and X locks.
S column: conflicts with X and IX locks.
X column: conflicts with every lock.
Rows are the lock held; columns are the lock requested:
        IS     IX     S      X
IS      Yes    Yes    Yes    No
IX      Yes    Yes    No     No
S       Yes    No     Yes    No
X       No     No     No     No
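A rough sketch of the warning protocol using this matrix is given below. The function name, the `held` dictionary, and the element names are hypothetical. As a simplification, the sketch checks the whole root-to-target path first and then grants every lock on it at once, whereas a real scheduler would request the locks one node at a time, top-down.

```python
# COMPAT[held][requested], copied from the IS/IX/S/X matrix.
COMPAT = {
    "IS": {"IS": True,  "IX": True,  "S": True,  "X": False},
    "IX": {"IS": True,  "IX": True,  "S": False, "X": False},
    "S":  {"IS": True,  "IX": False, "S": True,  "X": False},
    "X":  {"IS": False, "IX": False, "S": False, "X": False},
}

def lock_path(held, txn, path, mode):
    """held: node -> list of (txn, mode). path: nodes from the root to the target.
    mode: 'S' or 'X'. Returns True and records the locks if all are compatible."""
    warning = "IS" if mode == "S" else "IX"
    wanted = [(node, warning) for node in path[:-1]] + [(path[-1], mode)]
    for node, requested in wanted:
        for other, held_mode in held.get(node, []):
            if other != txn and not COMPAT[held_mode][requested]:
                return False
    for node, requested in wanted:
        held.setdefault(node, []).append((txn, requested))
    return True

held = {}
print(lock_path(held, "T1", ["relation", "tuple1"], "S"))  # IS on relation, S on tuple1
print(lock_path(held, "T2", ["relation", "tuple2"], "X"))  # IX on relation, X on tuple2
print(lock_path(held, "T3", ["relation"], "S"))            # S on relation conflicts with IX
```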
Warning Protocol: Example
Consider the relation Movie(title, year, length, studioName).
Transaction 1 (T1):
SELECT *
FROM Movie
WHERE title = 'King Kong';
Transaction 2 (T2):
UPDATE Movie
SET year = 1939
WHERE title = 'Gone With the Wind';
Whenever a transaction inserts new sub-elements under a node that another transaction has locked, serializability problems can arise.
Suppose transaction 3 (T3) is to be executed:
SELECT SUM(length)
FROM Movie
WHERE studioName = 'Disney';
At the same time, transaction T4 inserts a new movie for the Disney studio. If T4's insert takes place while T3 is running, the sum that T3 computes may be incorrect, because T3 never locked the new tuple. The solution is to treat an insertion or deletion as a write of the whole relation, which requires an exclusive lock on the relation; T3 and T4 then cannot run at the same time.
18.7 Tree Protocol
This section deals with tree structures that are formed by the link pattern of the elements themselves. The database elements are disjoint pieces of data, but the only way to get to a node is through its parent.
B-trees are the best example of this sort of data.
Knowing that we must traverse a particular path to reach an element gives us some important freedom to manage locks differently from the two-phase locking approaches.
Tree-Based Locking
Consider a B-tree index in a system that treats individual nodes (i.e. blocks) as lockable database elements. The node is the right level of granularity.
If we use a standard set of lock modes (shared, exclusive, and update) together with two-phase locking, every transaction must begin by locking the root, so concurrent use of the B-tree is almost impossible.
Rules for Access to Tree-Structured Data
The tree protocol places a few restrictions on locks.
We assume that there is only one kind of lock.
Transactions are assumed to be consistent, and schedules must be legal.
Locks are granted only when they do not conflict with locks already held at a node, but there is no two-phase locking requirement on transactions.
Rules of the tree protocol
1. A transaction's first lock may be at any node of the tree.
2. Subsequent locks may be acquired only if the transaction currently has a lock on the parent node.
3. Nodes may be unlocked at any time.
4. A transaction may not relock a node on which it has released a lock, even if it still holds a lock on the node's parent.
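These rules can be checked for a single transaction with a short function. The sketch assumes a hypothetical `parent` map from each node to its parent (the root is absent from the map) and a list of ('lock' | 'unlock', node) actions; the tree used in the example is made up for illustration.

```python
def obeys_tree_protocol(parent, actions):
    held, released = set(), set()
    first_lock = True
    for kind, node in actions:
        if kind == "lock":
            if node in held or node in released:
                return False      # no relocking a node that was already locked
            if not first_lock and parent.get(node) not in held:
                return False      # later locks need a lock on the parent
            held.add(node)
            first_lock = False
        else:                     # unlock is allowed at any time
            if node not in held:
                return False      # cannot unlock what is not held
            held.remove(node)
            released.add(node)
    return True

# Hypothetical tree: A is the root, B and C are its children, D and E are children of B.
parent = {"B": "A", "C": "A", "D": "B", "E": "B"}
ok = [("lock", "A"), ("lock", "B"), ("unlock", "A"),
      ("lock", "D"), ("unlock", "B"), ("unlock", "D")]
bad = [("lock", "B"), ("lock", "C")]   # C's parent A is not locked
print(obeys_tree_protocol(parent, ok), obeys_tree_protocol(parent, bad))
```

The first transaction releases A before it is finished acquiring locks, so it is not two-phase, yet it still obeys the tree protocol.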
Figure: A tree structure of lockable elements
Figure: Three transactions following the tree protocol
Why the tree protocol works
The tree protocol forces a serial order on the transactions involved in a schedule.
Define Ti <S Tj if, in schedule S, the transactions Ti and Tj lock a node in common and Ti locks the node first.
Example
If the precedence graph drawn from the precedence relations defined above has no cycles, then any topological order of the transactions is an equivalent serial schedule.
For example, either (T1,T2,T3) or (T3,T1,T2) is an equivalent serial schedule. The reason for this serial order is that all the nodes are touched in the same order as in the original schedule.
If two transactions lock several elements in common, then they lock them all in the same order.
This is explained with the help of an example.
Figure: Precedence graph derived from the schedule
Figure: Path of elements locked by two transactions
Now consider an arbitrary set of transactions T1, T2, ..., Tn that obey the tree protocol and lock some of the nodes of a tree according to schedule S.
First, the transactions that lock the root do so in some order.
If Ti locks the root before Tj, then Ti locks every node it has in common with Tj before Tj does. That is, Ti <S Tj, but not Tj <S Ti.
18.8 Concurrency Control By Timestamp
Concurrency is a property of a system in which several computations execute overlapping in time and interact with each other.
A timestamp is a sequence of characters denoting the date or time at which a certain event occurred.
Examples of timestamps:
20-MAY-09 05.45.14.000000 AM
18/23/2003 14:55:14:000000
Timestamping
We assign a timestamp to each transaction. Timestamps are usually presented in a consistent format, which allows easy comparison of two different records and tracking of progress over time. The practice of recording timestamps in a consistent manner along with the actual data is called timestamping.
Timestamps
To use timestamping as a concurrency-control method, the scheduler needs to assign to each transaction T a unique number, its timestamp TS(T). Timestamps are usually generated in one of two ways:
Using the system clock.
Having the scheduler maintain a counter. Each time a transaction starts, the counter is incremented by 1 and the new value becomes the timestamp of that transaction.
Whichever method we use to generate timestamps, the scheduler must maintain a table of the currently active transactions and their timestamps.
To use timestamps as a concurrency-control method, we need to associate with each database element X two timestamps and an additional bit:
RT(X): the read time of X, the highest timestamp of a transaction that has read X.
WT(X): the write time of X, the highest timestamp of a transaction that has written X.
C(X): the commit bit of X, which is true if and only if the most recent transaction to write X has already committed. The purpose of this bit is to avoid dirty reads.
Physically Unrealizable Behaviors
Read too late
Figure: Transaction T tries to read too late
Write too late
Figure: Transaction T tries to write too late
Problem with dirty data
Figure: T could perform a dirty read if it reads X
Figure: A write is cancelled because of a write with a later timestamp, but the writer then aborts
Rules for timestamp-based scheduling
In response to a read or write request from a transaction T, the scheduler can:
1. Grant the request.
2. Abort T (if T would violate physical reality) and restart T with a new timestamp (rollback).
3. Delay T and later decide whether to abort T or to grant the request.
Rules
Read request rT(X):
1. If TS(T) >= WT(X), the read is physically realizable.
a) If C(X) is true, grant the request. If TS(T) > RT(X), set RT(X) := TS(T); otherwise do not change RT(X).
b) If C(X) is false, delay T until C(X) becomes true or the transaction that wrote X aborts.
2. If TS(T) < WT(X), the read is physically unrealizable. Roll back T; that is, abort T and restart it with a new, larger timestamp.
Write request wT(X):
1. If TS(T) >= RT(X) and TS(T) >= WT(X), the write is physically realizable and must be performed:
a) Write the new value for X,
b) Set WT(X) := TS(T), and
c) Set C(X) := false.
2. If TS(T) >= RT(X) but TS(T) < WT(X), the write is physically realizable, but there is already a later value in X. If C(X) is true, ignore the write by T. If C(X) is false, delay T.
3. If TS(T) < RT(X), the write is physically unrealizable, and T must be rolled back.
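The read and write rules translate almost line for line into code. The sketch below is illustrative only: the `Element` class, the function names, and the string results ('grant', 'delay', 'rollback', 'ignore') are assumptions of the sketch, and commit or abort processing is left out.

```python
from dataclasses import dataclass

@dataclass
class Element:
    rt: int = 0              # RT(X): highest timestamp that has read X
    wt: int = 0              # WT(X): highest timestamp that has written X
    committed: bool = True   # C(X): has the latest writer committed?

def read_request(ts, x):
    if ts < x.wt:
        return "rollback"    # read too late: physically unrealizable
    if not x.committed:
        return "delay"       # wait until the writer commits or aborts
    x.rt = max(x.rt, ts)
    return "grant"

def write_request(ts, x):
    if ts < x.rt:
        return "rollback"    # write too late: physically unrealizable
    if ts < x.wt:
        # A later value is already in X.
        return "ignore" if x.committed else "delay"
    x.wt = ts
    x.committed = False
    return "grant"

x = Element()
print(read_request(5, x))    # RT(X) becomes 5
print(write_request(3, x))   # TS(T) < RT(X)
print(write_request(7, x))   # WT(X) becomes 7, C(X) becomes false
print(read_request(8, x))    # C(X) is false
```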
Timestamps vs. Locks
Timestamps:
Superior if most transactions are read-only, or if it is rare that concurrent transactions will read or write the same element.
In high-conflict situations, rollback will be frequent, introducing more delays than a locking system.
Locks:
Superior in high-conflict situations.
Frequently delay transactions as they wait for locks.
18.9 Concurrency Control by Validation
What is validation?
A form of optimistic concurrency control.
Assumes that conflicts between transactions are rare.
The scheduler maintains a record of the active transactions.
Does not require locking.
Checks for conflicts just before commit.
Validation Phases
Read phase
The transaction reads from the database all the elements in its read set.
ReadSet(Ti): the set of objects read by transaction Ti.
Whenever the first write to a given object is requested, a copy is made, and all subsequent writes are directed to the copy.
When the transaction completes, it requests its validation and write phases.
Validation phases (continued)
Validation phase
Checks are made to ensure that serializability is not violated.
Transactions are ordered by assigning a transaction number to each transaction.
There must exist a serial schedule in which transaction Ti comes before transaction Tj whenever t(i) < t(j).
If validation fails, the transaction is rolled back; otherwise it proceeds to the write phase.
Validation phases (continued)
Write phase
The transaction writes the corresponding values for the elements in its write set.
WriteSet(Ti): the set of objects that transaction Ti intends to write.
Locally written data are made global.
Terminology
The scheduler maintains three sets, with the associated times START(T), VAL(T), and FIN(T):
START: transactions that have started but not yet validated.
VAL: transactions that have validated but not yet finished.
FIN: transactions that have finished.
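Rules for Validation
Suppose T1 validated before T2.
Rule 1: if T2 starts before T1 finishes, that is, FIN(T1) > START(T2), then we require RS(T2) ∩ WS(T1) = ∅.
Rule 2: if T2 validates before T1 finishes, that is, FIN(T1) > VAL(T2), then we require WS(T2) ∩ WS(T1) = ∅.
Figure: timelines of T1 and T2 for each rule, showing the read, validation, and write phases. An overlap that violates a rule is interference and leads to rollback of T2; otherwise there is no problem.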
Validation Rules: Example
T2 & T1:
RS(T2) ∩ WS(T1) = {B} ∩ {A,C} = ∅
WS(T2) ∩ WS(T1) = {D} ∩ {A,C} = ∅
T3 & T1:
RS(T3) ∩ WS(T1) = {B} ∩ {A,C} = ∅
WS(T3) ∩ WS(T1) = {D,E} ∩ {A,C} = ∅
T3 & T2:
RS(T3) ∩ WS(T2) = {B} ∩ {D} = ∅
WS(T3) ∩ WS(T2) = {D,E} ∩ {D} = {D}, but Rule 2 does not apply because FIN(T2) < VAL(T3).
Validation rules: example (continued)
T4 starts before T1 and T3 finish, so T4 has to be checked against the write sets of T1 and T3.
T4 & T1:
RS(T4) ∩ WS(T1) = {A,D} ∩ {A,C} = {A}
Rule 2 cannot be applied.
T4 & T3:
RS(T4) ∩ WS(T3) = {A,D} ∩ {D,E} = {D}
WS(T4) ∩ WS(T3) = {A,C} ∩ {D,E} = ∅
Because the read set of T4 intersects the write sets of T1 and T3, Rule 1 is violated and T4 cannot be validated.
Comparison
Locks
Lock management overhead.
Deadlock detection and resolution are needed.
Concurrency is significantly lowered when congested nodes are locked.
Locks cannot be released until the end of a transaction.
When conflicts are rare, we might get better performance by not locking and instead checking for conflicts at commit time.
Timestamps
Deadlock is not possible.
Prone to restarts.
Comparison (continued)
Validation
Optimistic concurrency control is superior to locking methods for systems where transaction conflict is highly unlikely, e.g. query-dominant systems.
Avoids locking overhead.
Starvation: what should be done when validation repeatedly fails?
Solution: if the concurrency control detects a starving transaction, it restarts the transaction without releasing the critical-section semaphore, and the transaction is run to completion by write-locking the database.
Chapter 21
Information Integration
21.1 Introduction to Information Integration
21.2 Modes of Information Integration
21.3 Wrappers in Mediator-Based Systems
21.1 Introduction to Information Integration
Need for information integration
Ideally, all the data in the world could be put in a single database (an ideal database system).
In the real world this is impossible for a single database:
Databases are created independently.
It is hard to design a database to support future use.
Example of Information Integration
Registrar: records students and grades.
Bursar: records tuition payments by students.
Human Resources Department: records employees.
Other departments, and so on.
How to Integrate the Databases
Start over:
Build one database that contains all the data of the legacy databases, and rewrite all the applications.
Result: painful.
Build a layer of abstraction (middleware) on top of all the legacy databases:
This layer is often defined by a collection of classes.
Heterogeneity Problem
What is the heterogeneity problem?
Aardvark Automobile Co. has 1000 dealers, and so has 1000 databases.
To find a model at another dealer, can we use this command?
SELECT * FROM CARS
WHERE MODEL = 'A6';
21.2 Modes of Information Integration
Federations
The simplest architecture for integrating several DBs.
One-to-one connections between all pairs of DBs.
The DBs talk to each other directly; n(n-1) wrappers are needed for n DBs.
Good when communications between DBs are limited.
Wrapper
Wrapper: software that translates incoming queries and outgoing answers. As a result, it allows information sources to conform to some shared schema.
Federations Diagram
A federated collection of 4 DBs needs 12 components to translate queries from one to another.
Figure: four DBs, with 2 wrappers on each of the six pairwise connections.
Example
Car dealers want to share their inventory. Each dealer queries the other's DB to find the needed car.
Dealer-1's DB relation: NeededCars(model, color, autoTrans)
Dealer-2's DB relations: Autos(serial, model, color) and Options(serial, option)
Figure: Dealer-1's DB and Dealer-2's DB, connected through wrappers in each direction.
Example (continued)
for (each tuple (:m, :c, :a) in NeededCars) {
    if (:a = TRUE) { /* automatic transmission wanted */
        SELECT Autos.serial
        FROM Autos, Options
        WHERE Autos.serial = Options.serial AND
              Options.option = 'autoTrans' AND
              Autos.model = :m AND
              Autos.color = :c;
    }
    else { /* automatic transmission not wanted */
        SELECT serial
        FROM Autos
        WHERE Autos.model = :m AND
              Autos.color = :c AND
              NOT EXISTS (
                  SELECT *
                  FROM Options
                  WHERE serial = Autos.serial AND
                        option = 'autoTrans');
    }
}
Data Warehouse
Data from the sources is translated from each local schema to a global schema and copied to a central DB.
User-transparent: the user queries the data warehouse just like an ordinary DB.
The user is not allowed to update the data warehouse.
Diagram of Data Warehouse
Figure: Source 1 and Source 2 each feed an extractor; a combiner merges the extracted data into the warehouse, which receives user queries and returns results.
Example
Construct a data warehouse from the source DBs of 2 car dealers:
Dealer-1's schema: Cars(serialNo, model, color, autoTrans, cdPlayer, ...)
Dealer-2's schema: Autos(serial, model, color), Options(serial, option)
Warehouse's schema: AutosWhse(serialNo, model, color, autoTrans, dealer)
Extractor: query to extract data from Dealer-1's data:
INSERT INTO AutosWhse(serialNo, model, color, autoTrans, dealer)
    SELECT serialNo, model, color, autoTrans, 'dealer1'
    FROM Cars;
Example (continued)
Extractor: queries to extract data from Dealer-2's data:
INSERT INTO AutosWhse(serialNo, model, color, autoTrans, dealer)
    SELECT Autos.serial, model, color, 'yes', 'dealer2'
    FROM Autos, Options
    WHERE Autos.serial = Options.serial AND
          option = 'autoTrans';
INSERT INTO AutosWhse(serialNo, model, color, autoTrans, dealer)
    SELECT serial, model, color, 'no', 'dealer2'
    FROM Autos
    WHERE NOT EXISTS (
        SELECT *
        FROM Options
        WHERE serial = Autos.serial AND
              option = 'autoTrans');
How to Construct a Data Warehouse
1. The warehouse is periodically reconstructed from the current data in the sources, once a night or at even longer intervals.
Advantages: simple algorithms.
Disadvantages: the warehouse needs to be shut down while it is rebuilt; data can become out of date.
2. The warehouse is updated periodically (e.g. each night), based on the changes made to the sources.
Advantages: involves smaller amounts of data (important when the warehouse is large and needs to be modified in a short period).
Disadvantages: the process of calculating the changes to the warehouse is complex; data can become out of date.
3. The warehouse is changed immediately, in response to each change or a small set of changes at one or more of the sources.
Advantages: data won't become out of date.
Disadvantages: requires too much communication, so it is generally too expensive.
Mediators
A mediator is a virtual warehouse: it supports a virtual view, or a collection of views, that integrates several sources.
A mediator doesn't store any data.
A mediator's tasks:
a) receive the user's query,
b) send queries to the wrappers,
c) combine the results from the wrappers,
d) send the final result to the user.
Mediator Diagram
Figure: a user query goes to the mediator, which sends queries through two wrappers to Source 1 and Source 2; results flow back through the wrappers to the mediator and on to the user.
Example
Using the same data sources as in the data warehouse example, the mediator integrates the same two dealers' sources into a view with schema:
AutosMed(serialNo, model, color, autoTrans, dealer)
Suppose the user asks the query:
SELECT serialNo, model
FROM AutosMed
WHERE color = 'red';
Example (continued)
In this simple case, the mediator forwards the same query to each of the two wrappers.
Wrapper 1: Cars(serialNo, model, color, autoTrans, cdPlayer, ...)
SELECT serialNo, model
FROM Cars
WHERE color = 'red';
Wrapper 2: Autos(serial, model, color); Options(serial, option)
SELECT serial, model
FROM Autos
WHERE color = 'red';
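The forwarding step can be tried out with Python's built-in sqlite3 module. This is only a sketch: the two in-memory databases stand in for the dealers' sources, the sample rows are invented, and the names `wrapper1`, `wrapper2`, and `mediator` are hypothetical. The mediator stores no data itself; it sends the query to each wrapper and combines the answers.

```python
import sqlite3

# Hypothetical sample data for the two dealers' sources.
dealer1 = sqlite3.connect(":memory:")
dealer1.execute("CREATE TABLE Cars (serialNo INTEGER, model TEXT, color TEXT, autoTrans TEXT)")
dealer1.executemany("INSERT INTO Cars VALUES (?, ?, ?, ?)",
                    [(1, "Gobi", "red", "yes"), (2, "Gobi", "blue", "no")])

dealer2 = sqlite3.connect(":memory:")
dealer2.execute("CREATE TABLE Autos (serial INTEGER, model TEXT, color TEXT)")
dealer2.executemany("INSERT INTO Autos VALUES (?, ?, ?)", [(7, "Sahara", "red")])

def wrapper1(color):
    # Translates the mediator's query into dealer 1's schema.
    return dealer1.execute(
        "SELECT serialNo, model FROM Cars WHERE color = ?", (color,)).fetchall()

def wrapper2(color):
    # Translates the mediator's query into dealer 2's schema.
    return dealer2.execute(
        "SELECT serial, model FROM Autos WHERE color = ?", (color,)).fetchall()

def mediator(color):
    # Forward the query to both wrappers and combine the results.
    return wrapper1(color) + wrapper2(color)

print(mediator("red"))
```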
Different Solutions
The mediator may have different options for forwarding a user query. For example, suppose the user asks whether there is a car of a specific model and color (e.g. "Gobi", "blue").
The mediator decides whether the 2nd query is needed based on the result of the 1st query. That is, if dealer-1 has the specific car, the mediator doesn't have to query dealer-2.
21.3 Wrappers in Mediator-Based Systems
Wrappers in mediator-based systems:
Are more complicated than those in most data warehouse systems.
Must be able to accept a variety of queries from the mediator and translate them into the terms of the source.
Must communicate the result to the mediator.
How to design a wrapper?
Classify the possible queries that the mediator can ask into templates, which are queries with parameters that represent constants.
Templates for Query Patterns
We use the notation T => S to express the idea that the template T is turned by the wrapper into the source query S.
Example 1
Dealer 1's source: Cars(serialNo, model, color, autoTrans, navi, ...)
For use by a mediator with schema:
AutosMed(serialNo, model, color, autoTrans, dealer)
Example 1 (continued)
If we denote the code representing the color by the parameter $c, the template is:
SELECT *
FROM AutosMed
WHERE color = '$c';
=>
SELECT serialNo, model, color, autoTrans, 'dealer1'
FROM Cars
WHERE color = '$c';
(Template T => Source query S)
There will be a total of 2^n templates if we have the option of specifying n attributes.
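The count follows because each of the n attributes is either specified in the query or not, so there is one template per subset of the attributes, giving 2^n subsets in all. A small enumeration, using three attribute names taken from the example schema as an illustrative choice, makes this concrete:

```python
from itertools import combinations

attributes = ["model", "color", "autoTrans"]   # n = 3, chosen for illustration

# One template per subset of attributes that the query specifies.
templates = [subset
             for size in range(len(attributes) + 1)
             for subset in combinations(attributes, size)]

for subset in templates:
    where = " AND ".join(f"{a} = '${a}'" for a in subset) or "(no condition)"
    print(where)
print(len(templates), 2 ** len(attributes))
```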
Wrapper Generators
The wrapper generator creates a table that holds the various query patterns contained in the templates, together with the source queries that are associated with each.
A driver is used in each wrapper. The task of the driver is to:
Accept a query from the mediator.
Search the table for a template that matches the query.
Send the source query to the source, again using a "plug-in" communication mechanism.
Process the response in the wrapper.
Filters: a wrapper can include a filter in order to support more queries.
Example 2
Suppose the wrapper is designed with a more complicated template, for queries that specify both model and color:
SELECT *
FROM AutosMed
WHERE model = '$m' AND color = '$c';
=>
SELECT serialNo, model, color, autoTrans, 'dealer1'
FROM Cars
WHERE model = '$m' AND color = '$c';
Now suppose instead that the only template we have is the one for color, but the wrapper is asked by the mediator to find a blue Gobi model car.
Solution
1. Use the template with $c = 'blue' to find all blue cars and store them in a temporary relation:
TemAutos(serialNo, model, color, autoTrans, dealer)
2. The wrapper then returns to the mediator the desired set of automobiles by executing the local query:
SELECT *
FROM TemAutos
WHERE model = 'Gobi';
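The template-plus-filter idea can be sketched with Python's built-in sqlite3 module. The in-memory Cars table and its rows are invented sample data, and `COLOR_TEMPLATE` and `find_cars` are hypothetical names; the temporary relation is represented here by an ordinary Python list rather than a stored table.

```python
import sqlite3

source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE Cars (serialNo INTEGER, model TEXT, color TEXT, autoTrans TEXT)")
source.executemany("INSERT INTO Cars VALUES (?, ?, ?, ?)",
                   [(1, "Gobi", "blue", "yes"),
                    (2, "Sahara", "blue", "no"),
                    (3, "Gobi", "red", "yes")])

# The only template the wrapper has: select by color.
COLOR_TEMPLATE = ("SELECT serialNo, model, color, autoTrans, 'dealer1' "
                  "FROM Cars WHERE color = ?")

def find_cars(model, color):
    # Step 1: run the color template; the result plays the role of TemAutos.
    temp_autos = source.execute(COLOR_TEMPLATE, (color,)).fetchall()
    # Step 2: filter locally on the attribute the template could not handle.
    return [row for row in temp_autos if row[1] == model]

print(find_cars("Gobi", "blue"))
```

The filter lets one template serve queries that are more selective than the template itself, at the cost of fetching extra tuples from the source.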
References
Hector Garcia-Molina, Jeffrey D. Ullman, and Jennifer Widom. Database Systems: The Complete Book (2nd Edition). Prentice Hall.