Tutorial 19 Dina Said


Indexing Data

1. A data entry k* is an actual data record (with search key value k).

2. A data entry is a (k, rid) pair, where rid is the record id of a data record with search key value k.

3. A data entry is a (k, rid-list) pair, where rid-list is a list of record ids of data records with search key value k.
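The three alternatives can be compared side by side with a small sketch. The sample records, the `(page, slot)` rid format, and the function names `alt1`, `alt2`, `alt3` are illustrative assumptions, not part of the tutorial.

```python
from collections import defaultdict

# Hypothetical sample data: rid -> (name, age). The search key is age.
records = {
    (1, 1): ("Ann", 19),
    (1, 2): ("Bob", 21),
    (1, 3): ("Cy", 19),
}

def alt1(records):
    """Alternative (1): each data entry is the data record itself."""
    return sorted(records.values(), key=lambda rec: rec[1])

def alt2(records):
    """Alternative (2): each data entry is a (k, rid) pair."""
    return sorted((rec[1], rid) for rid, rec in records.items())

def alt3(records):
    """Alternative (3): each data entry is a (k, rid-list) pair."""
    groups = defaultdict(list)
    for rid, rec in sorted(records.items()):
        groups[rec[1]].append(rid)
    return sorted(groups.items())

print(alt1(records))
print(alt2(records))
print(alt3(records))
```

Note that Alternative (2) produces one entry per record, so a repeated key value appears several times, while Alternative (3) collapses those repeats into one entry per key value.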


Indexing Data

1. A data entry k* is an actual data record (with search key value k). → Primary Index

2. A data entry is a (k, rid) pair, where rid is the record id of a data record with search key value k. → Secondary Index

3. A data entry is a (k, rid-list) pair, where rid-list is a list of record ids of data records with search key value k.


Duplicates

• Two data entries are said to be duplicates if they have the same value for the search key field associated with the index.


Indexing Data

1. A data entry k* is an actual data record (with search key value k). → Can’t have duplicates

2. A data entry is a (k, rid) pair, where rid is the record id of a data record with search key value k. → May have duplicates

3. A data entry is a (k, rid-list) pair, where rid-list is a list of record ids of data records with search key value k.


Duplicates

• If no duplicates exist:
  – The search key contains some candidate key.
  – We call the index a unique index.
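The definition of duplicates can be checked mechanically. The sketch below assumes data entries in the (k, rid) form of Alternative (2); the function names and the sample entries are hypothetical.

```python
from collections import Counter

def duplicate_keys(data_entries):
    """Return the search key values shared by more than one data entry."""
    counts = Counter(k for k, _rid in data_entries)
    return sorted(k for k, n in counts.items() if n > 1)

def is_unique_index(data_entries):
    """An index is unique when no two data entries share a search key value."""
    return not duplicate_keys(data_entries)

# Hypothetical entries: (search key, (page-id, slot))
by_age = [(19, (1, 1)), (21, (1, 2)), (19, (1, 3))]
by_sid = [(101, (1, 1)), (102, (1, 2)), (103, (1, 3))]

print(duplicate_keys(by_age), is_unique_index(by_age))
print(duplicate_keys(by_sid), is_unique_index(by_sid))
```

Here the age index has duplicates because age is not a candidate key, while the index on the hypothetical `sid` key has none.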


Problem 10.10

Consider the instance of the Students relation shown in Figure 10.22. Show a B+ tree of order 2 in each of the cases below, assuming that duplicates are handled using overflow pages. Clearly indicate what the data entries are (i.e., do not use the k* convention).

1. A B+ tree index on age using Alternative (1) for data entries.


Solution


Problem 10.10

Consider the instance of the Students relation shown in Figure 10.22. Show a B+ tree of order 2 in each of the cases below, assuming that duplicates are handled using overflow pages. Clearly indicate what the data entries are (i.e., do not use the k* convention).

2. A dense B+ tree index on gpa using Alternative (2) for data entries. For this question, assume that these tuples are stored in a sorted file in the order shown in Figure 10.22: the first tuple is in page 1, slot 1; the second tuple is in page 1, slot 2; and so on. Each page can store up to three data records. You can use <page-id, slot> to identify a tuple.


Record ids of the tuples, in file order:

<1,1> <1,2> <1,3>
<2,1> <2,2> <2,3>
<3,1> <3,2> <3,3>
<4,1> <4,2> <4,3>
<5,1> <5,2> <5,3>
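The record ids follow directly from the storage rule in the problem (three records per page, slots filled in order). A minimal sketch, where the function name `rid_for` and the 0-based `position` argument are assumptions for illustration:

```python
def rid_for(position, records_per_page=3):
    """Map a 0-based position in the sorted file to a (page-id, slot) pair.

    Pages and slots are numbered from 1, as in the problem statement.
    """
    page, slot = divmod(position, records_per_page)
    return (page + 1, slot + 1)

# The slide lists 15 record ids, from <1,1> to <5,3>.
rids = [rid_for(i) for i in range(15)]
for page_start in range(0, 15, 3):
    print(rids[page_start:page_start + 3])
```

With Alternative (2), each leaf-level data entry would then be a (gpa, rid) pair built from these rids. The gpa values come from Figure 10.22, which is not reproduced here.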


Is that correct?


3-d tree

• Construct a 3-d tree using the following dimensions: age (int), years with the company (int), salary (real), for the following database: John(60, 24, 64000); Scott(25, 2, 50000); Charlie(38, 18, 54000); David(55, 29, 68400); Ellen(27, 7, 55000); Frank(57, 17, 115000); Grant(66, 22, 40000).
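One possible construction is sketched below. It assumes the records are inserted one at a time in the order listed, that the split dimension cycles age, years, salary by depth, and that values equal to the splitting value go to the right subtree. The tutorial does not state these choices, and a median-based construction would produce a different, more balanced tree.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

DIMS = ("age", "years", "salary")

@dataclass
class Node:
    name: str
    point: Tuple[float, float, float]  # (age, years, salary)
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def insert(node, name, point, depth=0):
    """Insert a record, comparing on dimension depth mod 3 at each level."""
    if node is None:
        return Node(name, point)
    axis = depth % len(DIMS)
    if point[axis] < node.point[axis]:
        node.left = insert(node.left, name, point, depth + 1)
    else:  # assumption: ties go to the right subtree
        node.right = insert(node.right, name, point, depth + 1)
    return node

def show(node, depth=0, label="root"):
    """Print the tree, one node per line, indented by depth."""
    if node is None:
        return
    print("  " * depth + f"{label}: {node.name}{node.point} splits on {DIMS[depth % 3]}")
    show(node.left, depth + 1, "L")
    show(node.right, depth + 1, "R")

employees = [
    ("John", (60, 24, 64000)),
    ("Scott", (25, 2, 50000)),
    ("Charlie", (38, 18, 54000)),
    ("David", (55, 29, 68400)),
    ("Ellen", (27, 7, 55000)),
    ("Frank", (57, 17, 115000)),
    ("Grant", (66, 22, 40000)),
]

root = None
for name, point in employees:
    root = insert(root, name, point)

show(root)
```

Under these assumptions John becomes the root (splitting on age), with Scott on the left and Grant on the right. The remaining records form a chain under Scott, which shows how sensitive insertion-based construction is to input order.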
