29
Database Management Systems CSE 594 Introduction September 28, 2000

Database Management Systems CSE 594 Introduction September 28, 2000

Embed Size (px)

Citation preview

Page 1: Database Management Systems CSE 594 Introduction September 28, 2000

Database Management Systems

CSE 594

IntroductionSeptember 28, 2000

Page 2: Database Management Systems CSE 594 Introduction September 28, 2000

Staff

Instructor: Alon Halevy Sieg, Room 310, [email protected] Office hours: Thursdays 5pm, email.

TAs: Maya Rodrig Office hours: Thursdays 5pm, or by

appointment.Mailing list: cse594@csWeb page:

http://www.cs.washington.edu/education/courses/594/00au/

Page 3: Database Management Systems CSE 594 Introduction September 28, 2000

Goals of the Course

Purpose: Principles of building database applications Foundations of database management systems. Issues in building database systems. Have fun: databases are not just bunches of

tuples. Not an introduction to the nitty gritty of any

specific commerical system.

Page 4: Database Management Systems CSE 594 Introduction September 28, 2000

Grading

Paper homeworks: 25% Very little regurgitation. Meant to be challenging (I.e., fun).

Two programming projects: 40% Work in pairs. Build a database application Build an XML query processor

Final Exam: 25% (currently scheduled for Dec. 14th).

Intangibles (e.g., participation): 10%

Page 5: Database Management Systems CSE 594 Introduction September 28, 2000

Textbook

Two volume collection, available as a pair in the bookstore:

A First Course on Database Systems: Ullman & Widom

Database System Implementation: Garcia-Molina, Ullman and Widom.

A few comments about the books.

Page 6: Database Management Systems CSE 594 Introduction September 28, 2000

Other Useful Texts

Database Management Systems: Ramakrishnan and Gehrke

Foundations of Databases (Abiteboul, Hull & Vianu) Parallel and Distributed DBMS (Ozsu and Valduriez) Transaction Processing (Gray and Reuter) Database Systems (Silberschatz, Korth and

Sudarshan) Principles of Transaction Processing (Bernstein and

Newcomer) Readings in Database Systems (Stonebraker and

Hellerstein) Proceedings of SIGMOD, VLDB, PODS conferences.

Page 7: Database Management Systems CSE 594 Introduction September 28, 2000

Prerequisites

Page 8: Database Management Systems CSE 594 Introduction September 28, 2000

Real Prerequisites

Operating systemsData structures and

algorithmsDistributed systemsComplexity theoryMathematical LogicKnowledge

Representation

User interface design

Programming languages

Artificial Intelligence (Search)

Greek, Hebrew, French

Page 9: Database Management Systems CSE 594 Introduction September 28, 2000

Why use a DBMS?

Suppose we are building a system to store the information pertaining to the university.

Several questions arise:how do we store the data? (file organization, etc.)how do we query the data? (write programs…)make sure that updates don’t mess things up?Provide different views on the data? (registrar versus

students)how do we deal with crashes?

Way too complicated! Go buy a database system!

Page 10: Database Management Systems CSE 594 Introduction September 28, 2000

Why Use a DBMS?

• Large amounts of data (Giga’s, Tera’s) • Data is very structured• Persistent data• Valuable data• Performance requirements• Concurrent access to the data• Restricted access to data

All programs manipulate data, so why use a database?Many data manipulation tasks involve recurringoperations:

Page 11: Database Management Systems CSE 594 Introduction September 28, 2000

Functionality of a DBMS

Persistent storage managementTransaction managementResiliency: recovery from crashes.Separation between logical and physical

views of the data. High level query and data manipulation

language. Efficient query processing

Interface with programming languages

Page 12: Database Management Systems CSE 594 Introduction September 28, 2000

Bird’s Eye View of

How to build a database application

The different components of a database system.

Page 13: Database Management Systems CSE 594 Introduction September 28, 2000

Building an Application with a Database System

Requirements modeling (conceptual, pictures) Decide what entities should be part of the application

and how they should be linked.

Schema design and implementation Decide on a set of tables, attributes. Define the tables in the database system. Populate database (insert tuples).

Write application programs using the DBMS way easier now that the data management is taken

care of.

Page 14: Database Management Systems CSE 594 Introduction September 28, 2000

address name field

Professor

Advises

Takes

Teaches

CourseStudent

name category

quarter

name

ssn

Conceptual Modeling

Page 15: Database Management Systems CSE 594 Introduction September 28, 2000

Relational Terminology

Name Price Category Manufacturer

gizmo $19.99 gadgets GizmoWorks

Power gizmo $29.99 gadgets GizmoWorks

SingleTouch $149.99 photography Canon

MultiTouch $203.99 household Hitachi

tuples

Attribute namesProduct (relation name)

Product(name: string, Price: real, category: enum, Manufacturer: string)

(Arity=4)

Page 16: Database Management Systems CSE 594 Introduction September 28, 2000

Schema Design and Implementation

Table Students

Note: Separation of the logical view from the physical view of the data.

Normalization (theory).

Student Course Quarter

Charles CS 444 Fall, 1997

Dan CS 142 Winter,1998

… … …

Page 17: Database Management Systems CSE 594 Introduction September 28, 2000

Querying a Database

Find all the students who have taken CSE444 in Fall, 1997.

S(tructured) Q(uery) L(anguage) select E.name from Enroll E where E.course=CS444 and E.quarter=“Fall, 1997”

Query processor figures out how to answer the query efficiently.

An acquired taste…Other query languages exist (OO, OR, datalog)

Page 18: Database Management Systems CSE 594 Introduction September 28, 2000

Writing Application Code

Use ODBC/JDBC.Create a connection with a database.Embed SQL in application code.Specify transaction bordersMay need physical tuning of the

database.

Page 19: Database Management Systems CSE 594 Introduction September 28, 2000

Query optimizer

Execution engine

Index/record mgr.

Buffer manager

Storage manager

storage

User/Application

Queryupdate

Query executionplan

Record, indexrequests

Page commands

Read/writepages

Page 20: Database Management Systems CSE 594 Introduction September 28, 2000

Storage Management

Becomes a hard problem because of the interaction with the other levels of the DBMS: What are we storing? Efficient indexing, single and multi-

dimensional Exploit “semantic” knowledge

Issue: interaction with the operating system. Should we rely on the OS?

Page 21: Database Management Systems CSE 594 Introduction September 28, 2000
Page 22: Database Management Systems CSE 594 Introduction September 28, 2000
Page 23: Database Management Systems CSE 594 Introduction September 28, 2000

TP and RecoveryFor efficient use of resources, we want

concurrent access to data.Systems sometimes crash.A “real” database guarantees ACIDACID:

Atomicity: all or nothing of a transaction. Consistency: always leave the DB consistent. Isolation: every transaction runs as if it’s the only

one in the system. Durability: if committed, we really mean it.

Do we really want ACID?

Page 24: Database Management Systems CSE 594 Introduction September 28, 2000

ReviewsSh ip p in gO rd ersIn ven toryBooks

m ybooks .com M edia ted S chem a

W e s t

...

F e dE x

W A N

a lt.bo o ks .re v ie w s

In te rne tIn te rne t In te rne t

UP S

E a s t O rde rs C us to me rR e v ie w s

NY Time s

...

M o rga n-K a ufma n

P re ntic e -Ha ll

Data Integration

Uniform query capability across autonomous, heterogeneous data sources on LAN, WAN, or Internet

Page 25: Database Management Systems CSE 594 Introduction September 28, 2000

XML: Semi-structured Data

Emerging format for data exchange on the web and between applications.

<db> <book> <title>Complete Guide to DB2</title> <author>Chamberlin</author> </book> <book> <title>Transaction Processing</title> <author>Bernstein</author> <author>Newcomer</author> </book> <publisher> <name>Morgan Kaufman</name> <state>CA</state> </publisher></db>

eXtensible Markup Language:

Page 26: Database Management Systems CSE 594 Introduction September 28, 2000

Database Industry

Relational databases are a great success of theoretical ideas.

Oracle has a market cap of over $200BOther players: IBM, MS, Sybase, InformixTrends:

warehousing and decision support data integration XML, XML, XML.

Page 27: Database Management Systems CSE 594 Introduction September 28, 2000

Course (Rough) Outline

The basics: (quickly) The relational model SQL Views, integrity constraints

XMLPhysical representation:

Index structures.

Page 28: Database Management Systems CSE 594 Introduction September 28, 2000

Course Outline (cont)

Query execution: (Zack Ives) Algorithms for joins, selections,

projections.Query OptimizationData Integrationsemi-structured dataTransaction processing and recovery

(Phil Bernstein)

Page 29: Database Management Systems CSE 594 Introduction September 28, 2000

Projects

Goal: identify and solve a problem in database systems.

(almost) anything goes.Groups of 2-3Groups assembled end of week 2;Proposals, end of week 3.Touch base with me: every two weeks.Example projects on web site.Start Early.