40
Part I What are Databases?

Part I What are Databases?

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Part I

What are Databases?

What are Databases?

What are Databases?

1 Overview & Motivation

2 Architectures

3 Areas of Application

4 History

Saake Database Concepts Last Edited: April 2019 1–1

What are Databases?

Educational Objective for Today . . .

Motivation for using database systemsKnowledge of basic architectures

Saake Database Concepts Last Edited: April 2019 1–2

What are Databases? Overview & Motivation

What are Databases?

Data = logically grouped pieces of informationBase1

: the bottom or lowest part of something : the part on whichsomething rests or is supported: something (such as a group of people or things) that providessupport for a place, business, etc.

Base of a column in architecture

1http://www.merriam-webster.com/dictionary/base

Saake Database Concepts Last Edited: April 2019 1–3

What are Databases? Overview & Motivation

Areas of Application

Saake Database Concepts Last Edited: April 2019 1–4

What are Databases? Overview & Motivation

How is Data Managed?

Without databasesEach application manages its own dataData is stored multiple times redundancyProblems

I Waste of storage spaceI “Forgetting” of changesI No centralized, “standardized” data management

Saake Database Concepts Last Edited: April 2019 1–5

What are Databases? Overview & Motivation

Problems of Data Redundancy

Other software systems cannot process large amounts of dataefficientlyMany users or applications cannot access the same data inparallel without interfering with each otherApplication developers / users cannot develop / use applicationswithout knowing

I internal representation of dataI storage media or computers

(no data independence)No data security; potential loss of data

Saake Database Concepts Last Edited: April 2019 1–6

What are Databases? Overview & Motivation

Idea: Data Integration Using Database Systems

Database

...

DBMS

Application Application

structured data, which is managed by DBMS

Database Management System =Software for Managing Databases

DBS = Database System

Saake Database Concepts Last Edited: April 2019 1–7

What are Databases? Overview & Motivation

Motivation

Database systemsare center piece ofmodern ITsystems

. . . ubiquitous

Databasespecialists are inhigh demand

Saake Database Concepts Last Edited: April 2019 1–8

What are Databases? Overview & Motivation

Questions

1 How to organize (model and use) data?2 How to store data safely and persistently?3 How to process huge amounts of data (� terabytes) efficiently?4 How can many users (� 10,000) access data concurrently?

Saake Database Concepts Last Edited: April 2019 1–9

What are Databases? Architectures

Principles: Codd’s Nine Rules

1 Integration: uniform, non-redundant data management2 Operations: insert, query, update, delete3 Catalog: access to the database description in a data dictionary4 User views: different users/applications must be able to have a

different perception of the data5 Integrity: ensure conformity of database contents with real world6 Security: prevention of unauthorized access7 Transactions: multiple DB operations handled as an atomic unit8 Synchronisation: coordination of concurrent transactions9 Recovery: data backup and recovery after system errors

Saake Database Concepts Last Edited: April 2019 1–10

What are Databases? Architectures

Data Independence and Schemata

Based on coarse DBMS architectureDecouple user and implementation viewGoals include:

I Separate modeling view from internal storageI PortabilityI Simplify tuningI Standardized interfaces

Saake Database Concepts Last Edited: April 2019 1–11

What are Databases? Architectures

Schema Architecture

Conceptual Schema

ExternalSchema 1

ExternalSchema N

InternalSchema

...

Query Processing

Data Representation

Saake Database Concepts Last Edited: April 2019 1–12

What are Databases? Architectures

Schema Architecture /2

Connection betweenI Conceptual schema (result of data definition)I Internal schema (definition of file structure and access paths)I External schemata (result of view definition)I Application programs (result of application programming)

Saake Database Concepts Last Edited: April 2019 1–13

What are Databases? Architectures

Schema Architecture /3

Separation schema — instanceI Schema (metadata, data description)I Instance (user data, database state or shape)

Database schema consists ofI Internal, conceptual, external schemata and application programs

Conceptual schema contains, e.g.:I Structure descriptionsI Integrity descriptionsI Authorization rules (DB accesses that a user may perform)

Saake Database Concepts Last Edited: April 2019 1–14

What are Databases? Architectures

Data Independence /2

Stability of user interface with respect to changesPhysical: Changes to file structure or access paths do notinfluence the conceptual schemaLogical: Changes to conceptual schema and certain externalschemata do not influence other external schemata or applicationprograms

Saake Database Concepts Last Edited: April 2019 1–15

What are Databases? Architectures

Data Independence /3

Potential impact of changes to the conceptual schema:I external schemata may be affected (changing attributes)I application programs may be affected (recompilation of application

programs, adaptions may be necessary)

But: necessary changes are recognized and monitored by theDBMS

Saake Database Concepts Last Edited: April 2019 1–16

What are Databases? Architectures

Application Example: Music Store

Musician

Title

Year

Tracks

PriceCritique(s)

Saake Database Concepts Last Edited: April 2019 1–17

What are Databases? Architectures

Layer Architecture by Example

Conceptual view: presentation in tables (relations)

Artist MNr Name Country103 Apocalyptica Finland104 Subway To Sally Germany105 Rammstein Germany

Album ANr Title Year Genre MNr ! Artist1014 Amplified 2006 Rock 1031015 Nord Nord Ost 2005 Rock 1041016 Rosenrot 2005 Rock 1051021 Engelskrieger 2003 Rock 1041025 Reflections 2006 Rock 103

Saake Database Concepts Last Edited: April 2019 1–18

What are Databases? Architectures

Layer Architecture by Example /2

External view: data in a flat relation

ANr Title Year Genre Artist1014 Amplified 2006 Rock Apocalyptica1015 Nord Nord Ost 2005 Rock Subway To Sally1016 Rosenrot 2005 Rock Rammstein1021 Engelskrieger 2003 Rock Subway To Sally1025 Reflections 2006 Rock Apocalyptica

Saake Database Concepts Last Edited: April 2019 1–19

What are Databases? Architectures

Layer Architecture by Example /3

External view: data in a hierarchically structured relation

Artist AlbumTitle Year Genre

Apocalyptica Amplified 2006 RockReflections 2003 Rock

Subway To Sally Nord Nord Ost 2005 MetalEngelskrieger 2003 Rock

Rammstein Rosenrot 2005 Rock

Saake Database Concepts Last Edited: April 2019 1–20

What are Databases? Architectures

Layer Architecture by Example /4Internal presentation

1000 1500 2000

1014 Amplified 2006

1015 Nord Nord Ost 2005

19.99 Rock 103 ....15.99 Rock 104 ....

Space for Data SetsOverflow

Tree-like Access on

Album Number

Partial Storage

of Data Sets in a Tree

Saake Database Concepts Last Edited: April 2019 1–21

What are Databases? Architectures

System Architectures

Description of components of a database systemStandardized interfaces between components

Architecture proposalsI ANSI-SPARC architecture three layer architecture

I Five layer architecture describes transformation components in detailLecture “Database Implementation Techniques”

Saake Database Concepts Last Edited: April 2019 1–22

What are Databases? Architectures

ANSI-SPARC Architecture

ANSI: American National Standards InstituteSPARC: Standards Planning and Requirement CommitteeProposal from 1978Basically refines coarse architecture

I Internal layer / operation system refinedI Multiple interactive and programming componentsI Interfaces named and standardized

Saake Database Concepts Last Edited: April 2019 1–23

What are Databases? Architectures

ANSI-SPARC Architecture /2

Data Dictionary

Optimizer Mapping Disk AccessQuery

Updates

View Definition

Data Definition

File Organisation

DB-Operations

Embedding

GUIs

P1

Pn

...

External Layer Conceptual Layer Internal Layer

Saake Database Concepts Last Edited: April 2019 1–24

What are Databases? Architectures

Classification of Components

Definition components: data definition, file organization, viewdefinitionProgramming components: DB programming with embedded DBoperationsUser components: interactive application programs, query andupdateTransformation components: optimizer, analysis, disk accessData dictionary: contains data from definition components,provides other components with this information

Saake Database Concepts Last Edited: April 2019 1–25

What are Databases? Architectures

Five Layer ArchitectureRefinement of transformation steps

Data System

Access System

Storage System

Buffer Manager

Operating System

Set-OrientedInterface

Record-OrientedInterface

System BufferInterface

FileInterface

DeviceInterface

External Storage

Translation, Access Path Selection,Access Control, Integrity Control

Data Dictionary, Currency Pointer,Sorting, Concurrency control

Record Manager, Access Path Management, Lock Management, Logging, Recovery

System Buffer Management, Page Replacement

External Storage Management

SOI

ROI

IRI

SBI

FI

DI

Internal-RecordInterface

Saake Database Concepts Last Edited: April 2019 1–26

What are Databases? Architectures

Application Architectures

Architecture of database applications typically based onclient-server model: server ⌘ database system

3. Answer

Client(Customer)

Server(Service Provider)

1. Request

2. Processing

Saake Database Concepts Last Edited: April 2019 1–27

What are Databases? Architectures

Application Architectures /2Separation of functionality of an application

I Presentation and user interactionI Application logic (“business logic”)I Data management functionality (store, query, . . . ).

User Interface

Application Logic

DB-Interface

Client

DB-Server

Two Layer Architecture

User Interface

Application Logic

DB-Interface

Client

DB-Server

Application-Server

Three Layer ArchitectureSaake Database Concepts Last Edited: April 2019 1–28

What are Databases? Areas of Application

Some concrete systems

(Object-)relational DBMSI Oracle11g, IBM DB2 V.10, Microsoft SQL Server 2012, SAP HANAI MySQL (www.mysql.org), PostgreSQL (www.postgresql.org)

Pseudo DBMSI MS Access

NoSQL systemsI Graph database systems (InfiniteGraph, neo4j), document

databases (MongoDB), key-value stores, ....

Saake Database Concepts Last Edited: April 2019 1–29

What are Databases? Areas of Application

Areas of Application

Classical areas of application:I Many objects (15,000 books, 300 users, 100 books borrowed per

week, . . . )I Few object types (BOOK, USER, BORROWING)I For instance, book keeping systems, order tracking systems, library

systems, . . .

Current applications:I E-Commerce, decision supporting systems (data warehouses,

OLAP), NASA’s Earth Observation System (petabyte databases),data mining

Saake Database Concepts Last Edited: April 2019 1–30

What are Databases? Areas of Application

Database Sizes

eBay Data Warehouse 10 PB (⇡ 10 · 1015 bytes)Teradata DBMS, 72 nodes, 10,000 users,millions of queries/day

WalMart Data Warehouse 2,5 PBTeradata DBMS, NCR MPP hardware;product information (sales etc.) of 2,900 stores;50,000 queries/week

Facebook 400 TBx.000 MySQL serverHadoop/Hive, 610 nodes, 15 TB/day

US Library of Congress 10-20 TBnot digitized

PB for Petabyte is in the order of 1015

Saake Database Concepts Last Edited: April 2019 1–31

What are Databases? History

Historical Developments: 60s

Start of 60s: elementary files, application specific file structure(device-dependent, redundant, inconsistent)End of 60s: file management systems (SAM, ISAM) with serviceprograms (sorting) (device-independent, but redundant andinconsistent)DBS based on hierarchical model, network model

I Pointer structures between dataI Weak separation of internal / conceptual layerI Navigational DMLI Separation DML / programming language

Saake Database Concepts Last Edited: April 2019 1–32

What are Databases? History

Historical Developments: 70s and 80s

70s: database systems (device and data independence,redundancy free, consistent)Relational database systems

I Data in table structuresI 3 layer conceptI Declarative DMLI Separation DML / programming language

Saake Database Concepts Last Edited: April 2019 1–33

What are Databases? History

History of RDBMS

1970: Ted Codd (IBM) ! relational model as conceptual basis ofrelational DBS1974: System R (IBM) ! first prototype of an RDBMS

I Two modules: RDS, RSS; ca. 80,000 LOC (PL/1, PL/S, assembler),ca. 1.2 MB of code

I Query language SEQUELI First deployment 1977

1975: University of California at Berkeley (UCB) ! IngresI Query language QUELI Predecessor of Postgres, Sybase, . . .

1979: Oracle Version 2

Saake Database Concepts Last Edited: April 2019 1–34

What are Databases? History

Historical Developments: (80s and) 90s

Knowledge base systemsI Data in table structuresI Strongly declarative DML, integrated database programming

languageObject-oriented database systems

I Data in more complex object structures (separation of object and itsdata)

I Declarative or navigational DMLI Often integrated database programming languageI Often incomplete separation of layers

Saake Database Concepts Last Edited: April 2019 1–35

What are Databases? History

Historical Developments: Today

New hardware architecturesI Multicore processors, terabytes of main memory: in-memory

database systems (e.g., SAP HANA)Support for specific applications

I Cloud databases: Hosting of databases, scalable datamanagement solutions (Amazon RDS, Microsoft Azure)

I Data stream processing: online processing of live data, e.g., stockexchange data, sensor data, RFID data, . . . (StreamBase, MSStreamInsight, IBM Infosphere Streams)

I Big Data: Handle petabytes of data through highly scalable, parallelprocessing, data analysis (Hadoop, Hive, Google Spanner & F1,. . . )

I NoSQL databases (“Not only SQL”): non-relational databases,flexible schema (document-centered), “light-weight” because SQLfunctionality like transactions is omitted, powerful declarative querylanguages with joins, etc. (CouchDB, MongoDB, Cassandra, . . . )

Saake Database Concepts Last Edited: April 2019 1–36

What are Databases? History

Trends

User-generated content, e.g., Google:I Daily processing of 20 PBI 15 hours of video uploaded to YouTube every minuteI Reading 20 PB would take 12 years with a 50 MB/s hard disk drive

Linked data and data webI Provision, exchange and linking of structured data on the WebI Enables querying (with query languages like SPARQL) and further

processingI Examples: DBpedia, GeoNames

Saake Database Concepts Last Edited: April 2019 1–37

What are Databases? History

Summary

Motivation for using database systemsCodd’s rules3 layer schema architecture & data independenceAreas of application

Saake Database Concepts Last Edited: April 2019 1–38

What are Databases? History

Control Questions

What is the advantage of using databasesystems compared to application-specificdata management?What does data independence mean andhow is it achieved?Which are areas of application ofdatabase systems?

Saake Database Concepts Last Edited: April 2019 1–39