V 1.0 DBMAN 8 Data access layers Files and Indices Relational algebra Relational calculus Random...

DBMAN8

Data access layers
Files and Indices
Relational algebra
Relational calculus
Random theoretical extras

szabo.zsolt@nik.uni-obuda.hu


Data representation
• What are the boundaries of the questions the database can answer?
• Models the real world: a mini world with limitations
• 3 levels:
– Conceptual model: the world as described by the DB
– Implementation/representation model: a model understandable for the DBMS (structured records, tables, fields, etc.)
– Physical model: the DBMS as implemented on the computer (files, programs)

Structure of the DBMS
• The user has permission for only a smaller part of the DB:
[Figure: layers Conceptual model, Implementation model, Physical model; user views View1, View2, View3]

Example: university DB
• View: Teachers can see info about their courses (= DQL)
• Conceptual model (= tables)
– Student (sid: string, name: string, age: integer, cumulative average: real)
– Subject (subid: string, sname: string, credit: integer)
– Registration (sid: string, subid: string, mark: integer)
• Implementation model (= DDL)
– CREATE TABLE subject (
    subid VARCHAR(10) NOT NULL PRIMARY KEY,
    sname VARCHAR(50) NOT NULL,
    credit INT NOT NULL )
• Physical model: files containing unsorted data
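The levels of the university example can be tried out with a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in DBMS. Only the table and column names come from the slide; the column name cum_avg (for the cumulative average), the view name course_marks, and the sample rows are hypothetical.

```python
import sqlite3

# In-memory database as a stand-in DBMS (assumption: SQLite, not Oracle).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (sid VARCHAR(10) NOT NULL PRIMARY KEY,
                      name VARCHAR(50) NOT NULL,
                      age INT, cum_avg REAL);
CREATE TABLE subject (subid VARCHAR(10) NOT NULL PRIMARY KEY,
                      sname VARCHAR(50) NOT NULL,
                      credit INT NOT NULL);
CREATE TABLE registration (sid VARCHAR(10) NOT NULL REFERENCES student(sid),
                           subid VARCHAR(10) NOT NULL REFERENCES subject(subid),
                           mark INT);
-- Hypothetical view: what a teacher may see about a course.
CREATE VIEW course_marks AS
    SELECT s.sname, r.sid, r.mark
    FROM registration r JOIN subject s ON s.subid = r.subid;
""")

# Hypothetical sample rows.
conn.execute("INSERT INTO student VALUES ('S1', 'Anna', 20, 4.5)")
conn.execute("INSERT INTO subject VALUES ('DB', 'Databases', 5)")
conn.execute("INSERT INTO registration VALUES ('S1', 'DB', 5)")

# The view is queried like a table (DQL); the files behind it stay hidden.
rows = conn.execute("SELECT * FROM course_marks").fetchall()
print(rows)
```

The view only exposes the columns chosen in its SELECT, which is how a user ends up seeing a smaller part of the DB.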

Structure of the DBMS
• Layers
– Query optimization and execution
– Relational operators
– Files and permissions
– Buffering
– Handling storage
• Concurrency handling and recovery control are taken into consideration

Steps of a query
1. User "asks" the DBMS (SQL query)
2. DBMS checks the permission in the schema
3. DBMS checks the permission in the subschema
4. DBMS asks the OS to execute the I/O operation
5. OS looks for the requested record
6. OS imports the record into the system buffer
7. OS notifies the DBMS
8. Record is taken to the user workspace
9. DBMS notifies the user about the received data
+1. How to get the actual data chunks from the HDD?


Records in files
• DBMS handles records and files
• Files: collection of pages/blocks containing records
• They must support
– DML (insert, update, delete)
– Read records (identified by rid)
– Read all the records (satisfying some conditions)
• How to store the blocks?

Way of storage - RAID
• Redundant Array of Inexpensive/Independent Disks
• Connecting disks logically, storing data redundantly
• Aims:
– Minimize data loss, increase reliability
– Increase capacity by using more smaller/cheaper disks
– Increase data access performance
– Increase flexibility (disks can be replaced during usage)
• Data striping: Data is partitioned into striping units and the partitions are distributed over several disks
• Redundancy: Data is stored redundantly so that it can be reconstructed after a disk failure

RAID Levels – Level 0 – Striping
• Non-redundant
• If one of the disks fails, data is lost
• Parallel reading/writing
• If the capacity of the disks is different, then the performance depends on the worst disk
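Data striping without redundancy can be illustrated with a minimal sketch. The function names, the round-robin placement, and the sample data are assumptions made for illustration; real controllers work on disk blocks, not Python lists.

```python
def stripe(data, n_disks, unit):
    """Distribute striping units of `unit` bytes round-robin over n_disks."""
    disks = [[] for _ in range(n_disks)]
    for i in range(0, len(data), unit):
        disks[(i // unit) % n_disks].append(data[i:i + unit])
    return disks


def unstripe(disks):
    """Read the units back in the same round-robin order."""
    out = []
    for row in range(max(len(d) for d in disks)):
        for d in disks:
            if row < len(d):
                out.append(d[row])
    return b"".join(out)


# Hypothetical data: five 2-byte units spread over three disks.
disks = stripe(b"ABCDEFGHIJ", n_disks=3, unit=2)
restored = unstripe(disks)
```

Because consecutive units sit on different disks, they can be read or written in parallel; since nothing is stored twice, losing any one list in `disks` makes the data unrecoverable.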

RAID Levels – Level 1 – Mirror
• Mirrored, the data is the same on every disk
• If one of the disks fails, the data can be reconstructed
• Parallel reading with increased speed
• Parallel writing with normal speed
• If the capacity of the disks is different, then the performance depends on the worst disk
• Does not use data striping

RAID 0+1 and RAID 1+0 (RAID 10)

• RAID 0+1: speed of RAID 0 and redundancy of RAID 1

• Min 4 disks

• RAID 10: first mirroring, then striping across the mirrors

• If a disk fails, only the RAID 1 set containing it is affected

RAID Levels – Level 2

• Uses data striping (unit=1 bit) but some of the disks are used to store error-correcting codes

• ECC: redundant bits calculated from data bits (compress)

• The check disks store the error-correcting code of the corresponding strip.

• Not used any more (HDDs handle error correction)

RAID Levels – Level 3
• Bit-Interleaved Parity
• Cannot identify the failed disk (disk controllers do that)
• One check disk with parity information
• The failed disk's data can be recovered
• Can process only one I/O at a time
• Striping unit = 1 bit
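Recovery from a single check disk relies on XOR parity. The sketch below works on small byte strings instead of bit-interleaved disks, and the disk contents are hypothetical; it only illustrates why one parity block is enough once the controller has reported which disk failed.

```python
from functools import reduce


def parity(blocks):
    """XOR equally sized blocks byte by byte to get the parity block."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Hypothetical contents of three data disks.
data = [b"\x0f\x10", b"\xf0\x01", b"\x33\x44"]
check = parity(data)  # content of the single check disk

# Disk 1 fails; XOR of the surviving disks and the check disk rebuilds it.
survivors = [data[0], data[2], check]
rebuilt = parity(survivors)
```

XOR cannot tell which disk is wrong, only what the missing content must be, which matches the remark that identifying the failed disk is left to the disk controllers.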

RAID Levels – Level 4
• Block-Interleaved Parity
• Like RAID 3, with disk blocks as striping units
• Supports serving multiple users
• The parity disk needs to be updated at every write, so it can become a bottleneck
• In case of disk failure, reading speed is reduced

RAID Levels – Level 5
• Block-Interleaved Distributed Parity
• Rotating parity: parity is not stored on a single check disk, but uniformly over all disks
• Parallel read and write
• Similar to RAID 3 and 4 depending on the size of the striping units
• If a disk fails, it has to be replaced immediately, otherwise if another one fails, all data will be lost
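Rotating parity can be pictured with a small layout sketch. The rotation rule used here (parity moves one disk to the left with every stripe) is an assumption chosen for illustration; actual RAID 5 implementations use several different rotation schemes.

```python
def parity_disk(stripe_no, n_disks):
    """Disk holding the parity block of a stripe (assumed rotation scheme)."""
    return (n_disks - 1 - stripe_no) % n_disks


def layout(n_stripes, n_disks):
    """One string per stripe, one character per disk: 'P' parity, 'D' data."""
    return ["".join("P" if d == parity_disk(s, n_disks) else "D"
                    for d in range(n_disks))
            for s in range(n_stripes)]


rows = layout(4, 4)
for row in rows:
    print(row)
```

Over four stripes each of the four disks holds the parity block exactly once, so no single disk becomes the write bottleneck that the dedicated parity disk is in RAID 4.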

RAID Levels – Level 6
• A second disk is quite likely to fail during disk recovery, so an extra disk is used
• Needs 2 check disks
• Able to recover from up to two simultaneous disk failures
• Read and write speed is comparable to RAID 5

Unordered (heap) files
• Okay, now we can access blocks... How to access the actually needed data block from the files?
• Simplest file structure: heap file
• For the record-level operations DBMS must register
– pages in the file
– free space in the page
– records in the page
• Sloooooooooooow!
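Why a heap file is slow for searching can be seen in a minimal sketch. The class, the page capacity, and the sample records are hypothetical; pages are modelled as Python lists and a rid as a (page number, slot) pair.

```python
PAGE_CAPACITY = 2  # hypothetical: number of records that fit in a page


class HeapFile:
    def __init__(self):
        self.pages = [[]]  # each page is a list of records

    def insert(self, record):
        """Put the record into the first page with free space; return its rid."""
        for page_no, page in enumerate(self.pages):
            if len(page) < PAGE_CAPACITY:
                page.append(record)
                return (page_no, len(page) - 1)
        self.pages.append([record])
        return (len(self.pages) - 1, 0)

    def get(self, rid):
        """Direct access by rid = (page number, slot)."""
        page_no, slot = rid
        return self.pages[page_no][slot]

    def scan(self, predicate):
        """No ordering: every page must be read to evaluate a condition."""
        return [r for page in self.pages for r in page if predicate(r)]


hf = HeapFile()
rids = [hf.insert(r) for r in [("s1", "Anna"), ("s2", "Bela"), ("s3", "Cili")]]
found = hf.scan(lambda r: r[0] == "s2")
```

Access by rid touches a single page, while any search by value has to walk through the whole file.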

Heap file as a linked list
• Address of the header page and the name of the heap file must be stored in a known location
• Every page contains two pointers in addition to its data
• Typical problems of a linked list: slow!
[Figure: the header page links to two chains of data pages: pages with free space and full pages]

Directory-based heap file
• Maintain a directory of pages
• DBMS stores the address of the first page of each heap file
• Directory = collection of pages (e.g. chained list)
• Counter for every page: amount of free space/entry
• Can be faster... Structure of the directory = key factor!
• Directory = index file ... We can use more than one index file for one table!
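The role of the per-page free-space counter can be sketched as follows. The class name, the page size, and the records are hypothetical, and records are assumed to be no larger than a page; the point is that an insert consults only the directory, not the data pages.

```python
PAGE_SIZE = 100  # hypothetical page size in bytes


class DirectoryHeapFile:
    def __init__(self):
        self.pages = []  # page contents
        self.free = []   # directory: free bytes per page

    def insert(self, record):
        """Find a page with enough free space by reading only the directory."""
        for page_no, free_bytes in enumerate(self.free):
            if free_bytes >= len(record):
                break
        else:
            # No page has room: allocate a new one and register it.
            self.pages.append([])
            self.free.append(PAGE_SIZE)
            page_no = len(self.pages) - 1
        self.pages[page_no].append(record)
        self.free[page_no] -= len(record)
        return (page_no, len(self.pages[page_no]) - 1)


f = DirectoryHeapFile()
rid_a = f.insert(b"x" * 60)
rid_b = f.insert(b"y" * 60)  # does not fit into page 0 any more
rid_c = f.insert(b"z" * 30)  # fits into the 40 free bytes of page 0
```

Here the directory is a plain list; how it is organized in a real system decides how fast this lookup is.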

Example, library
1. Search the books of Asimov
2. Search for the book Foundation
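The two library searches correspond to two different indexes on the same table. A minimal sketch, assuming a hypothetical in-memory table of (author, title) records keyed by rid:

```python
from collections import defaultdict

# Hypothetical "table": rid -> record (author, title).
books = {
    0: ("Asimov", "Foundation"),
    1: ("Asimov", "I, Robot"),
    2: ("Herbert", "Dune"),
}


def build_index(table, field):
    """Map each value of the given field to the rids of the matching records."""
    index = defaultdict(list)
    for rid, record in table.items():
        index[record[field]].append(rid)
    return index


by_author = build_index(books, 0)  # serves search 1
by_title = build_index(books, 1)   # serves search 2

asimov_books = [books[rid][1] for rid in by_author["Asimov"]]
foundation = [books[rid] for rid in by_title["Foundation"]]
```

Each index answers one kind of question without scanning the table, and both point into the same set of records.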

Access methods
• B trees (B+ trees, B* trees...)
• Hash-based structures
• We already learned the basics of those from Programming II.
• Let’s recap: Hash vs symmetric vs asymmetric encryption
Usage in DB: table joins (equality check), partitioning index structures, encryption
• Let’s recap: B tree - basic structure, always balanced, search, insert (split node), delete (merge nodes)
Usage in DB: key-based LUTs, custom ordered indices (note: char indices are SLOW – use hash indexes!)
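The use of hashing for table joins with an equality check can be sketched as a build phase and a probe phase. The function name and the sample rows are hypothetical; rows are plain tuples and the keys are given as column positions.

```python
from collections import defaultdict


def hash_join(left, right, left_key, right_key):
    """Equality join: hash the left rows, then probe with the right rows."""
    buckets = defaultdict(list)
    for left_row in left:                      # build phase
        buckets[left_row[left_key]].append(left_row)
    return [left_row + right_row               # probe phase
            for right_row in right
            for left_row in buckets.get(right_row[right_key], [])]


# Hypothetical rows: (sid, name) and (sid, subid, mark).
students = [("s1", "Anna"), ("s2", "Bela")]
registrations = [("s1", "DB", 5), ("s2", "DB", 4), ("s1", "PR", 3)]

joined = hash_join(students, registrations, 0, 0)
```

Each row is touched once in each phase, instead of comparing every left row with every right row. The technique works only for equality conditions, because a hash value says nothing about ordering.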

B-tree variants
• In the B+ tree, copies of the keys are stored in the internal nodes; the keys and records are stored in the leaves; in addition, a leaf node may include a pointer to the next leaf node to speed up sequential access
• The B* tree balances more neighboring internal nodes to keep them more densely packed. This variant requires non-root nodes to be at least 2/3 full instead of 1/2
• Instead of immediately splitting a node when it gets full, the B* tree shares its keys with a neighboring node. When both nodes are full, the two nodes are split into three. Deleting nodes is very complex

Indexes
• To speed up operations that the file organization does not support efficiently
• Collection of data entries to speed up search
• Rid = pointer to the entries

Index properties: Clustered vs. unclustered
• Clustered
– Ordering of data is similar to ordering of the index entries (data sorted by the search key on every page)
– Expensive to maintain
• Unclustered
– Random ordering of data

Index properties: Dense vs. sparse
• Dense:
– It contains at least one data entry for every search key that appears in the data file
– Useful optimization techniques rely on it
• Sparse
– Contains one entry for each page in the data file
– Much smaller
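The size difference between the two kinds of index can be made concrete with a minimal sketch. The data file, its keys, and the helper name are hypothetical, and the file is assumed to be sorted by the search key, which a sparse index requires.

```python
from bisect import bisect_right

# Hypothetical data file: pages sorted by search key.
pages = [[3, 5, 8], [12, 15, 19], [21, 30, 42]]

# Sparse index: one entry per page (the first key on the page).
sparse = [page[0] for page in pages]

# Dense index: one entry per search key, pointing to (page number, slot).
dense = {key: (p, s)
         for p, page in enumerate(pages)
         for s, key in enumerate(page)}


def sparse_lookup(key):
    """Find the page through the sparse index, then search inside the page."""
    page_no = bisect_right(sparse, key) - 1
    if page_no < 0 or key not in pages[page_no]:
        return None
    return (page_no, pages[page_no].index(key))
```

In this sketch the sparse index has 3 entries against 9 in the dense one. The dense index can answer whether a key exists without touching the data file at all, while the sparse one always has to read the page.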

Example

Index properties: Primary vs. secondary
• Primary
– Index on a set of fields that includes the primary key
• Secondary
– Not a primary index
• Unique
– Contains a candidate key

Index properties: Simple vs composite
• Composite: contains several fields


Basic operations

Derived operations

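Quotient examples
Dividend (sno, pno): (s1, p1), (s1, p2), (s1, p3), (s1, p4), (s2, p1), (s2, p2), (s3, p2), (s4, p2), (s4, p4)
Divisor (pno): p2; quotient (sno): s1, s2, s3, s4
Divisor (pno): p2, p4; quotient (sno): s1, s4
Divisor (pno): p1, p2, p4; quotient (sno): s1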
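The quotient can be checked with a minimal sketch that models relations as Python sets. The function name divide and the variable names are assumptions for illustration; the data is the supplier/part relation of the example.

```python
def divide(a, b):
    """A / B: the sno values that are paired in A with every pno value in B."""
    snos = {sno for sno, _ in a}
    return {sno for sno in snos if all((sno, pno) in a for pno in b)}


# The (sno, pno) relation from the example.
A = {("s1", "p1"), ("s1", "p2"), ("s1", "p3"), ("s1", "p4"),
     ("s2", "p1"), ("s2", "p2"),
     ("s3", "p2"),
     ("s4", "p2"), ("s4", "p4")}

q1 = divide(A, {"p2"})
q2 = divide(A, {"p2", "p4"})
q3 = divide(A, {"p1", "p2", "p4"})
```

The larger the divisor, the fewer suppliers qualify, since each one must supply every listed part.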

All symbols


Atomic formulas

Formulas

Example

Example no. 2

Example no. 3

To conclude relational algebra/calculus
• Relational algebra and relational calculus can express the same queries
• The declarative part is convenient in queries (user-friendly)
• The algebra (the way of calculation) is the task of the DB; it is hidden from the user


SYSTEM PRIVILEGES - ORACLE

• CREATE SESSION – REQUIRED TO LOG IN
• CREATE TABLE
• CREATE VIEW
• CREATE PROCEDURE
• CREATE USER
• ALTER ANY TABLE
• ALTER ANY TRIGGER
• SELECT ANY TABLE
• DROP ANY TABLE
• DROP USER
• ALL PRIVILEGES

ROLES, USERS

• CREATE USER {name} IDENTIFIED BY {password};
• DROP USER {name};
• Role ~ Group of privileges
• CREATE ROLE {name};
• DROP ROLE {name};
• Authentication: user+pass / user+pass+host / LDAP / PAM / Windows-based

SETTING SYSTEM PRIVILEGES
• GRANT {privileges} TO {role/user};
• "WITH ADMIN OPTION"
• GRANT {role} TO {user};
• REVOKE {privileges / role} FROM {user};

OBJECT-PRIVILEGES
• SELECT
• INSERT
• UPDATE
• DELETE
• ALTER
• EXECUTE (for PL/SQL)
• READ (for files)
• REFERENCES (for constraints)
• INDEX (for CREATE INDEX)

SETTING OBJECT-PRIVILEGES
• GRANT {privileges} ON {object} TO {user} [WITH GRANT OPTION];
• REVOKE {privileges / ALL} ON {object} FROM {user};

SQL: Kleene’s three-valued logic
• A comparison with NULL usually yields NULL (Unknown), except in UNION or INTERSECT, where NULLs are treated as equal

A AND B      B=True     B=Unknown  B=False
A=True       True       Unknown    False
A=Unknown    Unknown    Unknown    False
A=False      False      False      False

A OR B       B=True     B=Unknown  B=False
A=True       True       True       True
A=Unknown    True       Unknown    Unknown
A=False      True       Unknown    False

A            NOT A
True         False
Unknown      Unknown
False        True
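The truth tables can be reproduced with a minimal sketch that represents Unknown as Python's None. The function names and the choice of None are assumptions made for illustration and are not SQL syntax.

```python
def and3(a, b):
    """Kleene AND: False dominates, then Unknown (None)."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


def or3(a, b):
    """Kleene OR: True dominates, then Unknown (None)."""
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False


def not3(a):
    """Kleene NOT: Unknown stays Unknown."""
    return None if a is None else not a


# Print the AND table row by row (rows: A, columns: B).
values = [True, None, False]
for a in values:
    print([and3(a, b) for b in values])
```

A WHERE clause keeps a row only when its condition evaluates to True, so rows whose condition is Unknown are dropped just like those where it is False.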

Basic terms – should NOT be new!
• Elements: $a, b, c$
• Sets: $A, B, C$
• Defining a set:
– enumeration: $A = \{a, b, c\}$, thus $a \in A$
– rules: $B = \{x \mid x \geq 100 \land x \leq 1000\}$
• Subset: $A \subset B$ if $\forall a \in A: a \in B$
• Ordered set (vector): $q = \langle a, b, c \rangle$
• Attributes: key vs secondary attributes
• SQL = DML, DDL, DQL, DCL!

Finding the key
