DBMAN 8
V 1.0
szabo.zsolt@nik.uni-obuda.hu

• Data access layers
• Files and Indices
• Relational algebra
• Relational calculus
• Random theoretical extras

Data access layers

Data representation
• What are the boundaries of the questions the database can answer?
• Models the real world: a mini world with limitations
• 3 levels:
  – Conceptual model: the world described by the DB
  – Implementation/representation model: a model understandable for the DBMS (structured records, tables, fields, etc.)
  – Physical model: the DBMS as implemented on the computer (files, programs)

Structure of the DBMS
• The user has permission for only a smaller part of the DB: a view
[Figure: layered structure: views (View1, View2, View3), conceptual model, implementation model, physical model]

Example: university DB
• View: teachers can see info about their courses (= DQL)
• Conceptual model (= tables)
  – Student (sid: string, name: string, age: integer, cumulative average: real)
  – Subject (subid: string, sname: string, credit: integer)
  – Registration (sid: string, subid: string, mark: integer)
• Implementation model (= DDL)
  – CREATE TABLE subject (
        subid VARCHAR(10) NOT NULL PRIMARY KEY,
        sname VARCHAR(50) NOT NULL,
        credit INT NOT NULL
    );
• Physical model: files containing unsorted records
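The three levels of the university example can be tried out in a small sketch. It uses Python's built-in `sqlite3` module as a stand-in DBMS, which is an assumption, since the slide does not name a DBMS for this example. The view name `db1_marks`, the column name `cumulative_average` and all sample rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite database, used only as a stand-in DBMS for the sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (
    sid VARCHAR(10) NOT NULL PRIMARY KEY,
    name VARCHAR(50) NOT NULL,
    age INT,
    cumulative_average REAL
);
CREATE TABLE subject (
    subid VARCHAR(10) NOT NULL PRIMARY KEY,
    sname VARCHAR(50) NOT NULL,
    credit INT NOT NULL
);
CREATE TABLE registration (
    sid VARCHAR(10) NOT NULL,
    subid VARCHAR(10) NOT NULL,
    mark INT,
    PRIMARY KEY (sid, subid)
);
-- Hypothetical view: what the teacher of subject 'DB1' is allowed to see.
CREATE VIEW db1_marks AS
    SELECT s.name, r.mark
    FROM registration r JOIN student s ON s.sid = r.sid
    WHERE r.subid = 'DB1';
""")

# Made-up sample rows.
conn.execute("INSERT INTO student VALUES ('S1', 'Anna', 20, 4.5)")
conn.execute("INSERT INTO student VALUES ('S2', 'Bela', 22, 3.1)")
conn.execute("INSERT INTO subject VALUES ('DB1', 'Databases', 4)")
conn.execute("INSERT INTO subject VALUES ('PR2', 'Programming II', 6)")
conn.execute("INSERT INTO registration VALUES ('S1', 'DB1', 5)")
conn.execute("INSERT INTO registration VALUES ('S2', 'PR2', 3)")

# The teacher queries the view (DQL) and never touches the base tables.
view_rows = conn.execute("SELECT name, mark FROM db1_marks").fetchall()
print(view_rows)
```

The tables play the role of the conceptual model, the `CREATE TABLE` statements are the implementation model, and how SQLite lays the rows out in its file or memory pages is the physical model that stays hidden from the user.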

Structure of the DBMS
• Layers (top to bottom):
  – Query optimization and execution
  – Relational operators
  – Files and permissions
  – Buffering
  – Handling storage
  – DBOS
• Handling concurrency and recovery control is taken into consideration in the layers

Steps of a query
1. The user "asks" the DBMS (SQL query)
2. The DBMS checks the permission in the schema
3. The DBMS checks the permission in the subschema
4. The DBMS asks the OS to execute the I/O operation
5. The OS looks for the requested record
6. The OS loads the record into the system buffer
7. The OS notifies the DBMS
8. The record is moved to the user workspace
9. The DBMS notifies the user about the received data
+1. How to get the actual data chunks from the …

Files and Indices

Records in files
• The DBMS handles records and files
• File: a collection of pages/blocks containing records
• Files must support
  – DML (insert, update, delete)
  – Reading a record (identified by its rid)
  – Reading all the records (satisfying some conditions)
• How to store the blocks?

Way of storage - RAID
• Redundant Array of Inexpensive/Independent Disks
• Connecting disks logically, storing data redundantly
• Aims:
  – Minimizing data loss, increasing reliability
  – Increasing capacity by using more, smaller/cheaper disks
  – Increasing data access performance
  – Increasing flexibility (disks can be replaced during usage)
• Data striping: data is partitioned into striping units and the partitions are distributed over several disks
• Redundancy: data is stored redundantly so that it can be reconstructed in case of a disk failure
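Data striping can be illustrated with a short sketch. It assumes a simple round-robin distribution of fixed-size striping units; the function names `stripe` and `unstripe` and the sample data are hypothetical, and a real RAID controller works on disk blocks rather than Python lists.

```python
def stripe(data: bytes, n_disks: int, unit: int) -> list[list[bytes]]:
    """Split data into striping units and distribute them round-robin."""
    disks: list[list[bytes]] = [[] for _ in range(n_disks)]
    units = [data[i:i + unit] for i in range(0, len(data), unit)]
    for index, chunk in enumerate(units):
        disks[index % n_disks].append(chunk)
    return disks


def unstripe(disks: list[list[bytes]]) -> bytes:
    """Read the units back row by row, in the same round-robin order."""
    out = []
    longest = max(len(disk) for disk in disks)
    for row in range(longest):
        for disk in disks:
            if row < len(disk):
                out.append(disk[row])
    return b"".join(out)


disks = stripe(b"ABCDEFGH", n_disks=3, unit=2)
print(disks)
print(unstripe(disks))
```

Because consecutive units land on different disks, a large read can be served by several disks in parallel. The sketch has no redundancy at all: losing one of the lists loses part of the data.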

RAID Levels – Level 0 – Striping
• Non-redundant
• If one of the disks fails, data is lost
• Parallel reading/writing
• If the disks differ, the array is limited by the worst disk (capacity by the smallest, speed by the slowest)
• Often mentioned together with JBOD, which only concatenates disks without striping

RAID Levels – Level 1 – Mirror
• Mirrored: the data is the same on every disk
• If one of the disks fails, the data can be reconstructed
• Parallel reading with increased speed
• Parallel writing with normal speed
• If the disks differ, the performance depends on the worst disk
• Does not use data striping

RAID 0+1 and RAID 1+0 (RAID 10)
• RAID 0+1: the speed of RAID 0 and the redundancy of RAID 1 (first striping, then mirroring)
• Min. 4 disks
• RAID 10: first mirroring, then connecting the mirrors by striping
• If a disk fails, only the RAID 1 pair containing it is involved

RAID Levels – Level 2
• Uses data striping (unit = 1 bit), but some of the disks are used to store error-correcting codes
• ECC: redundant bits calculated from the data bits (e.g. a Hamming code)
• The check disks store the error-correcting code of the corresponding stripe
• Not used any more (HDDs handle error correction themselves)

RAID Levels – Level 3
• Bit-Interleaved Parity
• Parity alone cannot identify the failed disk (disk controllers do that)
• One check disk with parity information
• The failed disk's data can be recovered
• Can process only one I/O at a time
• Striping unit = 1 bit
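How a single check disk makes recovery possible can be shown with XOR parity. This is a minimal sketch, assuming byte-sized toy blocks instead of the 1-bit striping unit of RAID 3; the helper `xor_blocks` and the block contents are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Bytewise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data disks holding one (made-up) block each.
data_disks = [b"\x0f\xf0", b"\x33\x55", b"\xaa\x01"]

# The check disk stores the parity of the data disks.
parity = xor_blocks(data_disks)

# Disk 1 fails; the controller knows which disk failed.
failed = 1
survivors = [block for i, block in enumerate(data_disks) if i != failed]

# XOR of the surviving data blocks and the parity gives back the lost block.
recovered = xor_blocks(survivors + [parity])
print(recovered == data_disks[failed])
```

The parity says nothing about which disk is broken, which is why the slide notes that the disk controller has to identify the failed disk.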

RAID Levels – Level 4
• Block-Interleaved Parity
• Like RAID 3, but the striping unit is a disk block
• Supports serving multiple users
• The parity disk needs to be updated at every write, so it can become a bottleneck
• In case of a disk failure, the reading speed decreases

RAID Levels – Level 5
• Block-Interleaved Distributed Parity
• Rotating parity: parity is not stored on a single check disk, but distributed uniformly over all disks
• Parallel read and write
• Similar to RAID 3 and 4, depending on the size of the striping units
• If a disk fails, it has to be replaced immediately; otherwise, if another one fails, all data will be lost
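Rotating parity can be sketched as a rule that picks a different parity disk for each stripe. The rotation rule below is only one possible scheme and is an assumption; real controllers use various layouts, and the function names are hypothetical.

```python
def parity_disk(stripe_no: int, n_disks: int) -> int:
    """Disk that holds the parity block of the given stripe (assumed rotation rule)."""
    return (n_disks - 1 - stripe_no) % n_disks


def layout(n_stripes: int, n_disks: int) -> list[list[str]]:
    """One row per stripe: 'P' marks the parity block, 'D' a data block."""
    rows = []
    for stripe_no in range(n_stripes):
        p = parity_disk(stripe_no, n_disks)
        rows.append(["P" if disk == p else "D" for disk in range(n_disks)])
    return rows


for row in layout(n_stripes=5, n_disks=4):
    print(" ".join(row))
```

Since every disk holds the parity of only some stripes, parity updates of different stripes hit different disks, which removes the single-check-disk bottleneck of RAID 4.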

RAID Levels – Level 6
• There is a high chance that a second disk fails during disk recovery, so an extra check disk is used
• Needs 2 check disks
• Able to recover from up to two simultaneous disk failures
• Read speed is comparable to RAID 5; writes are somewhat slower because two parity blocks must be updated

Unordered (heap) files
• Okay, now we can access blocks... How do we access the data block we actually need from the files?
• Simplest file structure: heap file
• For the record-level operations the DBMS must keep track of
  – the pages in the file
  – the free space in each page
  – the records in each page
• Sloooooooooooow!

Heap file as a linked list
• The address of the header page and the name of the heap file must be stored in a known location
• In addition to the data, every page contains two pointers
• Typical problems of a linked list: slow!
[Figure: a header page linked to two chains of data pages: pages with free space and full pages]

Directory-based heap file
• Maintain a directory of pages
• The DBMS stores the address of the first page of each heap file
• Directory = collection of pages (e.g. a linked list)
• Each directory entry holds a counter for its page: the amount of free space
• Can be faster... The structure of the directory is the key factor!
• Directory = index file ... We can use more than one index file for one table!
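The directory idea can be sketched in a few lines: the directory keeps one free-space counter per page, so an insert can find a suitable page without reading the data pages themselves. This is a toy model under stated assumptions: the class `HeapFile`, the capacity of 3 records per page and the in-memory lists are all hypothetical, and deletion is left out.

```python
PAGE_CAPACITY = 3  # records per page; an arbitrary value for the sketch


class HeapFile:
    def __init__(self) -> None:
        self.pages: list[list[str]] = []   # the data pages
        self.free_slots: list[int] = []    # the directory: free slots per page

    def insert(self, record: str) -> tuple[int, int]:
        """Store the record on the first page with free space; return its rid."""
        for page_no, free in enumerate(self.free_slots):
            if free > 0:
                break
        else:
            # No page has room: allocate a new one and register it in the directory.
            self.pages.append([])
            self.free_slots.append(PAGE_CAPACITY)
            page_no = len(self.pages) - 1
        self.pages[page_no].append(record)
        self.free_slots[page_no] -= 1
        return (page_no, len(self.pages[page_no]) - 1)

    def get(self, rid: tuple[int, int]) -> str:
        """Read one record identified by its rid (page number, slot number)."""
        page_no, slot = rid
        return self.pages[page_no][slot]

    def scan(self, predicate) -> list[str]:
        """Read all records satisfying a condition: a full scan of every page."""
        return [r for page in self.pages for r in page if predicate(r)]


heap = HeapFile()
rids = [heap.insert(title) for title in ["Foundation", "I, Robot", "Dune", "Solaris"]]
print(rids, heap.free_slots)
```

Reading by rid is cheap, but `scan` still has to touch every page, which is the motivation for adding index structures on top of the heap file.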

Example, library
1. Search the books of Asimov
2. Search for the book Foundation

Access methods
• B trees (B+ trees, B* trees...)
• Hash-based structures
• We already learned the basics of those in Programming II.
• Let's recap: hash vs symmetric vs asymmetric encryption
  Usage in DB: table joins (equality check), partitioning index structures, encryption
• Let's recap: B tree: basic structure, always balanced, search, insert (split node), delete (merge nodes)
  Usage in DB: key-based LUTs, custom ordered indices
  (note: char indices are SLOW – use hash indexes!)
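The use of hashing for table joins with an equality check can be illustrated with a small hash join. This is a sketch, not how any particular DBMS implements it; the function `hash_join` and the sample rows are made up, with column names borrowed from the university example.

```python
from collections import defaultdict


def hash_join(left, right, left_key, right_key):
    """Equality join: build a hash table on `left`, then probe it with `right`."""
    buckets = defaultdict(list)
    for row in left:                      # build phase
        buckets[row[left_key]].append(row)
    result = []
    for row in right:                     # probe phase
        for match in buckets.get(row[right_key], []):
            result.append({**match, **row})
    return result


subjects = [
    {"subid": "DB1", "sname": "Databases"},
    {"subid": "PR2", "sname": "Programming II"},
]
registrations = [
    {"sid": "S1", "subid": "DB1", "mark": 5},
    {"sid": "S2", "subid": "DB1", "mark": 3},
    {"sid": "S3", "subid": "XX9", "mark": 2},  # no matching subject
]

joined = hash_join(subjects, registrations, "subid", "subid")
for row in joined:
    print(row)
```

Each input is read once, so the cost grows with the sum of the table sizes rather than their product. The price is that a hash table only supports equality, not range conditions, which is where ordered structures such as B trees come in.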

B-tree variants
• B+ tree: copies of the keys are stored in the internal nodes; the keys and records are stored in the leaves; in addition, a leaf node may include a pointer to the next leaf node to speed up sequential access
• B* tree: balances more neighboring internal nodes to keep the internal nodes more densely packed. This variant requires non-root nodes to be at least 2/3 full instead of 1/2
• In a B* tree, instead of immediately splitting a node when it gets full, its keys are shared with a node next to it. When both nodes are full, the two nodes are split into three. Deleting nodes is very complex

Indexes
• To speed up operations that the file organization does not support efficiently
• A collection of data entries that speeds up search
• rid = pointer to the entries

Index properties: Clustered vs. unclustered
• Clustered
  – The ordering of the data is the same as or close to the ordering of the index entries (data sorted by the search key on every page)
  – Expensive to maintain
• Unclustered
  – The data is in arbitrary order relative to the index


Index properties: Dense vs. sparse
• Dense
  – Contains at least one data entry for every search key value that appears in the data file
  – Useful optimization techniques rely on it
• Sparse
  – Contains one entry for each page in the data file
  – Much smaller
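The difference between the two can be seen in a small sketch. It assumes a data file whose pages are sorted by the search key, which a sparse index needs in order to work; the keys, page contents and function names are made up for illustration.

```python
from bisect import bisect_right

# A sorted data file: three pages with three (made-up) keys each.
pages = [[3, 5, 8], [12, 15, 19], [23, 27, 31]]

# Dense index: one entry per search key value, pointing to (page, slot).
dense_index = {
    key: (page_no, slot)
    for page_no, page in enumerate(pages)
    for slot, key in enumerate(page)
}

# Sparse index: one entry per page, holding the first key on that page.
sparse_index = [page[0] for page in pages]


def lookup_dense(key):
    """The index alone answers whether the key exists and where it is."""
    return dense_index.get(key)


def lookup_sparse(key):
    """Find the only page that may hold the key, then search inside that page."""
    page_no = bisect_right(sparse_index, key) - 1
    if page_no < 0:
        return None
    page = pages[page_no]
    if key in page:
        return (page_no, page.index(key))
    return None


print(len(dense_index), len(sparse_index))
print(lookup_dense(15), lookup_sparse(15))
```

The sparse index has 3 entries instead of 9, but every lookup must read a data page, while the dense index can answer an existence check without touching the data file at all.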

Example

Index properties: Primary vs. secondary
• Primary
  – Index on a set of fields that includes the primary key
• Secondary
  – Any index that is not a primary index
• Unique
  – The search key contains a candidate key

Index properties: Simple vs composite
• Simple: the search key is a single field
• Composite: the search key contains several fields

Relational algebra

Basic operations

Consequent operations
[Figure: quotient (division) example]

Dividend (pairs):
a1 b1
a1 b2
a1 b3
a1 b4
a2 b1
a2 b3
a3 b2
a3 b3
a3 b4
a4 b1
a4 b2
a4 b3

Divisor:
b1
b2
b3

Result:
a1
a4

Quotient examples

Dividend:
sno  pno
s1   p1
s1   p2
s1   p3
s1   p4
s2   p1
s2   p2
s3   p2
s4   p2
s4   p4

Divisor (pno)    Quotient (sno)
p2               s1, s2, s3, s4
p2, p4           s1, s4
p1, p2, p4       s1
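The quotient examples can be checked with a short sketch. It models relations as Python sets of tuples, which is an assumption made for illustration; the function names are hypothetical. The second function uses the usual way of expressing division with the basic operations (projection, Cartesian product, difference).

```python
def divide(a: set, b: set) -> set:
    """A / B: every x such that (x, y) is in A for all y in B."""
    candidates = {x for x, _ in a}
    return {x for x in candidates if all((x, y) in a for y in b)}


def divide_with_basic_ops(a: set, b: set) -> set:
    """Same result using projection, Cartesian product and difference only."""
    all_x = {x for x, _ in a}                      # projection of A onto x
    required = {(x, y) for x in all_x for y in b}  # all_x × B
    disqualified = {x for x, _ in required - a}    # x values missing some y
    return all_x - disqualified


# The (sno, pno) relation from the slide.
supplies = {
    ("s1", "p1"), ("s1", "p2"), ("s1", "p3"), ("s1", "p4"),
    ("s2", "p1"), ("s2", "p2"),
    ("s3", "p2"),
    ("s4", "p2"), ("s4", "p4"),
}

for divisor in ({"p2"}, {"p2", "p4"}, {"p1", "p2", "p4"}):
    print(sorted(divisor), "->", sorted(divide(supplies, divisor)))
```

Both functions give the same quotients as the slide, which also shows why division counts as a derived operation: it adds convenience, not expressive power.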

Joins

All symbols

Relational calculus

Goal

Atomic formulas

Formulas

Example

Example no. 2

Example no. 3

To conclude relational algebra/calculus
• Relational algebra and relational calculus have the same expressive power
• The declarative part is convenient in the queries (user-friendly)
• The algebra (the way of calculation) is the task of the DB; it is hidden from the user
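The contrast between the two styles can be sketched in Python. The example assumes relations modelled as sets of tuples; the helpers `select` and `project` and the sample Registration rows are made up. The algebra version spells out the steps, while the calculus-like version only states the condition that the result tuples must satisfy.

```python
# Made-up Registration(sid, subid, mark) rows.
registration = {("S1", "DB1", 5), ("S2", "DB1", 2), ("S1", "PR2", 4)}


def select(rel, predicate):
    """Selection: keep the tuples satisfying the predicate."""
    return {t for t in rel if predicate(t)}


def project(rel, *columns):
    """Projection: keep only the given column positions."""
    return {tuple(t[c] for c in columns) for t in rel}


# Algebra style: an explicit sequence of operations (how to compute it).
algebra_result = project(
    select(registration, lambda t: t[1] == "DB1" and t[2] >= 3),
    0,
)

# Calculus style: a description of the wanted tuples (what to compute).
calculus_result = {
    (sid,) for (sid, subid, mark) in registration if subid == "DB1" and mark >= 3
}

print(algebra_result, calculus_result)
```

Both expressions describe the same set. A DBMS accepts the declarative form from the user and chooses an algebra plan for it internally.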

Random theoretical extras

SYSTEM PRIVILEGES - ORACLE
• CREATE SESSION – REQUIRED TO LOG IN
• CREATE TABLE
• CREATE VIEW
• CREATE PROCEDURE
• CREATE USER
• ALTER ANY TABLE
• ALTER ANY TRIGGER
• SELECT ANY TABLE
• DROP ANY TABLE
• DROP USER
• ALL PRIVILEGES

ROLES, USERS
• CREATE USER {name} IDENTIFIED BY {password};
• DROP USER {name};
• Role ~ group of privileges
• CREATE ROLE {name};
• DROP ROLE {name};
• Authentication: user+pass / user+pass+host / LDAP / PAM / Windows-based

SETTING SYSTEM PRIVILEGES
• GRANT {privileges} TO {role/user};
• "WITH ADMIN OPTION" lets the grantee pass the system privilege or role on (WITH GRANT OPTION applies to object privileges)
• GRANT {role} TO {user};
• REVOKE {privileges / role} FROM {user};

OBJECT-PRIVILEGES
• SELECT
• INSERT
• UPDATE
• DELETE
• ALTER
• EXECUTE (for PL/SQL)
• READ (for files)
• REFERENCES (for constraints)
• INDEX (for CREATE INDEX)

SETTING OBJECT-PRIVILEGES
• GRANT {privileges} ON {object} TO {user} [WITH GRANT OPTION];
• REVOKE {privileges / ALL} ON {object} FROM {user};

SQL: Kleene's three-valued logic
• A comparison with NULL usually yields Unknown (NULL); the exception is set operations such as UNION or INTERSECT, where NULLs are treated as equal

A AND B      B=True    B=Unknown  B=False
A=True       True      Unknown    False
A=Unknown    Unknown   Unknown    False
A=False      False     False      False

A OR B       B=True    B=Unknown  B=False
A=True       True      True       True
A=Unknown    True      Unknown    Unknown
A=False      True      Unknown    False

A            NOT A
True         False
Unknown      Unknown
False        True
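The truth tables can be reproduced with a few lines of Python. This is a sketch that represents Unknown by `None`, which is an assumption of the example and not how SQL engines store NULL; the function names are hypothetical.

```python
def and3(a, b):
    """Three-valued AND: False dominates, then Unknown (None)."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


def or3(a, b):
    """Three-valued OR: True dominates, then Unknown (None)."""
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False


def not3(a):
    """Three-valued NOT: Unknown stays Unknown."""
    return None if a is None else (not a)


values = [True, None, False]
for a in values:
    print([and3(a, b) for b in values], [or3(a, b) for b in values], not3(a))
```

The practical consequence is that a `WHERE` condition evaluating to Unknown does not select the row, just as if it were False.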

Basic terms – should NOT be new!
• Elements: a, b, c
• Sets: A, B, C
• Defining a set:
  – enumeration: A = {a, b, c}, thus a ∈ A
  – rules: B = {x | x ≥ 100 ∧ x ≤ 1000}
• Subset: A ⊂ B if ∀a ∈ A: a ∈ B
• Ordered set (vector): q = ⟨a, b, c⟩
• Attributes: key vs secondary attributes
• SQL = DML, DDL, DQL, DCL!

Finding the key