Upload
others
View
3
Download
0
Embed Size (px)
Citation preview
What are Databases?
What are Databases?
1 Overview & Motivation
2 Architectures
3 Areas of Application
4 History
Saake Database Concepts Last Edited: April 2019 1–1
What are Databases?
Educational Objective for Today . . .
Motivation for using database systemsKnowledge of basic architectures
Saake Database Concepts Last Edited: April 2019 1–2
What are Databases? Overview & Motivation
What are Databases?
Data = logically grouped pieces of informationBase1
: the bottom or lowest part of something : the part on whichsomething rests or is supported: something (such as a group of people or things) that providessupport for a place, business, etc.
Base of a column in architecture
1http://www.merriam-webster.com/dictionary/base
Saake Database Concepts Last Edited: April 2019 1–3
What are Databases? Overview & Motivation
Areas of Application
Saake Database Concepts Last Edited: April 2019 1–4
What are Databases? Overview & Motivation
How is Data Managed?
Without databasesEach application manages its own dataData is stored multiple times redundancyProblems
I Waste of storage spaceI “Forgetting” of changesI No centralized, “standardized” data management
Saake Database Concepts Last Edited: April 2019 1–5
What are Databases? Overview & Motivation
Problems of Data Redundancy
Other software systems cannot process large amounts of dataefficientlyMany users or applications cannot access the same data inparallel without interfering with each otherApplication developers / users cannot develop / use applicationswithout knowing
I internal representation of dataI storage media or computers
(no data independence)No data security; potential loss of data
Saake Database Concepts Last Edited: April 2019 1–6
What are Databases? Overview & Motivation
Idea: Data Integration Using Database Systems
Database
...
DBMS
Application Application
structured data, which is managed by DBMS
Database Management System =Software for Managing Databases
DBS = Database System
Saake Database Concepts Last Edited: April 2019 1–7
What are Databases? Overview & Motivation
Motivation
Database systemsare center piece ofmodern ITsystems
. . . ubiquitous
Databasespecialists are inhigh demand
Saake Database Concepts Last Edited: April 2019 1–8
What are Databases? Overview & Motivation
Questions
1 How to organize (model and use) data?2 How to store data safely and persistently?3 How to process huge amounts of data (� terabytes) efficiently?4 How can many users (� 10,000) access data concurrently?
Saake Database Concepts Last Edited: April 2019 1–9
What are Databases? Architectures
Principles: Codd’s Nine Rules
1 Integration: uniform, non-redundant data management2 Operations: insert, query, update, delete3 Catalog: access to the database description in a data dictionary4 User views: different users/applications must be able to have a
different perception of the data5 Integrity: ensure conformity of database contents with real world6 Security: prevention of unauthorized access7 Transactions: multiple DB operations handled as an atomic unit8 Synchronisation: coordination of concurrent transactions9 Recovery: data backup and recovery after system errors
Saake Database Concepts Last Edited: April 2019 1–10
What are Databases? Architectures
Data Independence and Schemata
Based on coarse DBMS architectureDecouple user and implementation viewGoals include:
I Separate modeling view from internal storageI PortabilityI Simplify tuningI Standardized interfaces
Saake Database Concepts Last Edited: April 2019 1–11
What are Databases? Architectures
Schema Architecture
Conceptual Schema
ExternalSchema 1
ExternalSchema N
InternalSchema
...
Query Processing
Data Representation
Saake Database Concepts Last Edited: April 2019 1–12
What are Databases? Architectures
Schema Architecture /2
Connection betweenI Conceptual schema (result of data definition)I Internal schema (definition of file structure and access paths)I External schemata (result of view definition)I Application programs (result of application programming)
Saake Database Concepts Last Edited: April 2019 1–13
What are Databases? Architectures
Schema Architecture /3
Separation schema — instanceI Schema (metadata, data description)I Instance (user data, database state or shape)
Database schema consists ofI Internal, conceptual, external schemata and application programs
Conceptual schema contains, e.g.:I Structure descriptionsI Integrity descriptionsI Authorization rules (DB accesses that a user may perform)
Saake Database Concepts Last Edited: April 2019 1–14
What are Databases? Architectures
Data Independence /2
Stability of user interface with respect to changesPhysical: Changes to file structure or access paths do notinfluence the conceptual schemaLogical: Changes to conceptual schema and certain externalschemata do not influence other external schemata or applicationprograms
Saake Database Concepts Last Edited: April 2019 1–15
What are Databases? Architectures
Data Independence /3
Potential impact of changes to the conceptual schema:I external schemata may be affected (changing attributes)I application programs may be affected (recompilation of application
programs, adaptions may be necessary)
But: necessary changes are recognized and monitored by theDBMS
Saake Database Concepts Last Edited: April 2019 1–16
What are Databases? Architectures
Application Example: Music Store
Musician
Title
Year
Tracks
PriceCritique(s)
Saake Database Concepts Last Edited: April 2019 1–17
What are Databases? Architectures
Layer Architecture by Example
Conceptual view: presentation in tables (relations)
Artist MNr Name Country103 Apocalyptica Finland104 Subway To Sally Germany105 Rammstein Germany
Album ANr Title Year Genre MNr ! Artist1014 Amplified 2006 Rock 1031015 Nord Nord Ost 2005 Rock 1041016 Rosenrot 2005 Rock 1051021 Engelskrieger 2003 Rock 1041025 Reflections 2006 Rock 103
Saake Database Concepts Last Edited: April 2019 1–18
What are Databases? Architectures
Layer Architecture by Example /2
External view: data in a flat relation
ANr Title Year Genre Artist1014 Amplified 2006 Rock Apocalyptica1015 Nord Nord Ost 2005 Rock Subway To Sally1016 Rosenrot 2005 Rock Rammstein1021 Engelskrieger 2003 Rock Subway To Sally1025 Reflections 2006 Rock Apocalyptica
Saake Database Concepts Last Edited: April 2019 1–19
What are Databases? Architectures
Layer Architecture by Example /3
External view: data in a hierarchically structured relation
Artist AlbumTitle Year Genre
Apocalyptica Amplified 2006 RockReflections 2003 Rock
Subway To Sally Nord Nord Ost 2005 MetalEngelskrieger 2003 Rock
Rammstein Rosenrot 2005 Rock
Saake Database Concepts Last Edited: April 2019 1–20
What are Databases? Architectures
Layer Architecture by Example /4Internal presentation
1000 1500 2000
1014 Amplified 2006
1015 Nord Nord Ost 2005
19.99 Rock 103 ....15.99 Rock 104 ....
Space for Data SetsOverflow
Tree-like Access on
Album Number
Partial Storage
of Data Sets in a Tree
Saake Database Concepts Last Edited: April 2019 1–21
What are Databases? Architectures
System Architectures
Description of components of a database systemStandardized interfaces between components
Architecture proposalsI ANSI-SPARC architecture three layer architecture
I Five layer architecture describes transformation components in detailLecture “Database Implementation Techniques”
Saake Database Concepts Last Edited: April 2019 1–22
What are Databases? Architectures
ANSI-SPARC Architecture
ANSI: American National Standards InstituteSPARC: Standards Planning and Requirement CommitteeProposal from 1978Basically refines coarse architecture
I Internal layer / operation system refinedI Multiple interactive and programming componentsI Interfaces named and standardized
Saake Database Concepts Last Edited: April 2019 1–23
What are Databases? Architectures
ANSI-SPARC Architecture /2
Data Dictionary
Optimizer Mapping Disk AccessQuery
Updates
View Definition
Data Definition
File Organisation
DB-Operations
Embedding
GUIs
P1
Pn
...
External Layer Conceptual Layer Internal Layer
Saake Database Concepts Last Edited: April 2019 1–24
What are Databases? Architectures
Classification of Components
Definition components: data definition, file organization, viewdefinitionProgramming components: DB programming with embedded DBoperationsUser components: interactive application programs, query andupdateTransformation components: optimizer, analysis, disk accessData dictionary: contains data from definition components,provides other components with this information
Saake Database Concepts Last Edited: April 2019 1–25
What are Databases? Architectures
Five Layer ArchitectureRefinement of transformation steps
Data System
Access System
Storage System
Buffer Manager
Operating System
Set-OrientedInterface
Record-OrientedInterface
System BufferInterface
FileInterface
DeviceInterface
External Storage
Translation, Access Path Selection,Access Control, Integrity Control
Data Dictionary, Currency Pointer,Sorting, Concurrency control
Record Manager, Access Path Management, Lock Management, Logging, Recovery
System Buffer Management, Page Replacement
External Storage Management
SOI
ROI
IRI
SBI
FI
DI
Internal-RecordInterface
Saake Database Concepts Last Edited: April 2019 1–26
What are Databases? Architectures
Application Architectures
Architecture of database applications typically based onclient-server model: server ⌘ database system
3. Answer
Client(Customer)
Server(Service Provider)
1. Request
2. Processing
Saake Database Concepts Last Edited: April 2019 1–27
What are Databases? Architectures
Application Architectures /2Separation of functionality of an application
I Presentation and user interactionI Application logic (“business logic”)I Data management functionality (store, query, . . . ).
User Interface
Application Logic
DB-Interface
Client
DB-Server
Two Layer Architecture
User Interface
Application Logic
DB-Interface
Client
DB-Server
Application-Server
Three Layer ArchitectureSaake Database Concepts Last Edited: April 2019 1–28
What are Databases? Areas of Application
Some concrete systems
(Object-)relational DBMSI Oracle11g, IBM DB2 V.10, Microsoft SQL Server 2012, SAP HANAI MySQL (www.mysql.org), PostgreSQL (www.postgresql.org)
Pseudo DBMSI MS Access
NoSQL systemsI Graph database systems (InfiniteGraph, neo4j), document
databases (MongoDB), key-value stores, ....
Saake Database Concepts Last Edited: April 2019 1–29
What are Databases? Areas of Application
Areas of Application
Classical areas of application:I Many objects (15,000 books, 300 users, 100 books borrowed per
week, . . . )I Few object types (BOOK, USER, BORROWING)I For instance, book keeping systems, order tracking systems, library
systems, . . .
Current applications:I E-Commerce, decision supporting systems (data warehouses,
OLAP), NASA’s Earth Observation System (petabyte databases),data mining
Saake Database Concepts Last Edited: April 2019 1–30
What are Databases? Areas of Application
Database Sizes
eBay Data Warehouse 10 PB (⇡ 10 · 1015 bytes)Teradata DBMS, 72 nodes, 10,000 users,millions of queries/day
WalMart Data Warehouse 2,5 PBTeradata DBMS, NCR MPP hardware;product information (sales etc.) of 2,900 stores;50,000 queries/week
Facebook 400 TBx.000 MySQL serverHadoop/Hive, 610 nodes, 15 TB/day
US Library of Congress 10-20 TBnot digitized
PB for Petabyte is in the order of 1015
Saake Database Concepts Last Edited: April 2019 1–31
What are Databases? History
Historical Developments: 60s
Start of 60s: elementary files, application specific file structure(device-dependent, redundant, inconsistent)End of 60s: file management systems (SAM, ISAM) with serviceprograms (sorting) (device-independent, but redundant andinconsistent)DBS based on hierarchical model, network model
I Pointer structures between dataI Weak separation of internal / conceptual layerI Navigational DMLI Separation DML / programming language
Saake Database Concepts Last Edited: April 2019 1–32
What are Databases? History
Historical Developments: 70s and 80s
70s: database systems (device and data independence,redundancy free, consistent)Relational database systems
I Data in table structuresI 3 layer conceptI Declarative DMLI Separation DML / programming language
Saake Database Concepts Last Edited: April 2019 1–33
What are Databases? History
History of RDBMS
1970: Ted Codd (IBM) ! relational model as conceptual basis ofrelational DBS1974: System R (IBM) ! first prototype of an RDBMS
I Two modules: RDS, RSS; ca. 80,000 LOC (PL/1, PL/S, assembler),ca. 1.2 MB of code
I Query language SEQUELI First deployment 1977
1975: University of California at Berkeley (UCB) ! IngresI Query language QUELI Predecessor of Postgres, Sybase, . . .
1979: Oracle Version 2
Saake Database Concepts Last Edited: April 2019 1–34
What are Databases? History
Historical Developments: (80s and) 90s
Knowledge base systemsI Data in table structuresI Strongly declarative DML, integrated database programming
languageObject-oriented database systems
I Data in more complex object structures (separation of object and itsdata)
I Declarative or navigational DMLI Often integrated database programming languageI Often incomplete separation of layers
Saake Database Concepts Last Edited: April 2019 1–35
What are Databases? History
Historical Developments: Today
New hardware architecturesI Multicore processors, terabytes of main memory: in-memory
database systems (e.g., SAP HANA)Support for specific applications
I Cloud databases: Hosting of databases, scalable datamanagement solutions (Amazon RDS, Microsoft Azure)
I Data stream processing: online processing of live data, e.g., stockexchange data, sensor data, RFID data, . . . (StreamBase, MSStreamInsight, IBM Infosphere Streams)
I Big Data: Handle petabytes of data through highly scalable, parallelprocessing, data analysis (Hadoop, Hive, Google Spanner & F1,. . . )
I NoSQL databases (“Not only SQL”): non-relational databases,flexible schema (document-centered), “light-weight” because SQLfunctionality like transactions is omitted, powerful declarative querylanguages with joins, etc. (CouchDB, MongoDB, Cassandra, . . . )
Saake Database Concepts Last Edited: April 2019 1–36
What are Databases? History
Trends
User-generated content, e.g., Google:I Daily processing of 20 PBI 15 hours of video uploaded to YouTube every minuteI Reading 20 PB would take 12 years with a 50 MB/s hard disk drive
Linked data and data webI Provision, exchange and linking of structured data on the WebI Enables querying (with query languages like SPARQL) and further
processingI Examples: DBpedia, GeoNames
Saake Database Concepts Last Edited: April 2019 1–37
What are Databases? History
Summary
Motivation for using database systemsCodd’s rules3 layer schema architecture & data independenceAreas of application
Saake Database Concepts Last Edited: April 2019 1–38