36
IMS1907 Database Systems Summer Semester 2004/2005 Lecture 13.2 Unit Review

IMS1907 Database Systems Summer Semester 2004/2005

Embed Size (px)

DESCRIPTION

 

Citation preview

Page 1: IMS1907 Database Systems Summer Semester 2004/2005

IMS1907 Database Systems

Summer Semester 2004/2005

Lecture 13.2 Unit Review

Page 2: IMS1907 Database Systems Summer Semester 2004/2005

Monash University 2004 2

Basic Concepts

Central concepts for understanding database systems– Database– Data– Information– Data vs Information– Metadata– DBMS

Page 3: IMS1907 Database Systems Summer Semester 2004/2005

Monash University 2004 3

Basic ConceptsFile processing systems vs DBMS– data sharing– speed of access and retrieval– security– integrity, quality, consistency of data– data independence– maintenance, productivity– multiple users, complex data– backup and recovery

Page 4: IMS1907 Database Systems Summer Semester 2004/2005

Monash University 2004 4

Basic Concepts

The database system environment– DBMS– database– metadata (repository)– application software– CASE tools– user interfaces– users, developers, administrators

Page 5: IMS1907 Database Systems Summer Semester 2004/2005

Monash University 2004 5

Personal DatabasesWorkgroup DatabasesDepartment DatabasesEnterprise DatabasesInternet, Intranet, and Extranet DatabasesData warehouses

Types of Database Systems

Page 6: IMS1907 Database Systems Summer Semester 2004/2005

Monash University 2004 6

Need for new, specialised personnelInstallation cost and complexityMaintenance cost and complexityConversion costs from legacy systemsCritical need for explicit backup and recoveryOrganisational conflict and change

Costs and Risks of Database Systems

Page 7: IMS1907 Database Systems Summer Semester 2004/2005

Monash University 2004 7

Relational database systems organise the database as groups of related tables– table or relation– record – field– primary key– secondary key– foreign key– table structure

Forms, reports, queries

Relational DBMS Software

Page 8: IMS1907 Database Systems Summer Semester 2004/2005

Monash University 2004 8

Database System Development

Database development requires a focus on the information needs of a businessInformation Engineering (IE) is a popular, data-oriented methodology used to develop database systems– data are modelled in the organisational context, not in

the usage, processing or technology context– business context changes slowly stable databases– top-down planning

Page 9: IMS1907 Database Systems Summer Semester 2004/2005

Monash University 2004 9

Database System Development

Database systems planning– the three steps in the IE Planning phase

• identify strategic planning factors• identify corporate planning objects• develop an enterprise model

Enterprise data model– needed for top-down plans and bottom-up

requests– organisation-wide perspective

Page 10: IMS1907 Database Systems Summer Semester 2004/2005

Monash University 2004 10

Enterprise modelling

Database Development and the SDLC

Initiation

Analysis

Design

Implementation

Review

Maintenance

Conceptual data modelling

Logical database design

Database implementation

Physical database design and definition

Database review

Database maintenance

Page 11: IMS1907 Database Systems Summer Semester 2004/2005

Monash University 2004 11

Enterprise modelling– the organisational perspective

Conceptual data modelling– scope identification, ER modelling

Logical database design– transform conceptual model into logical data

model– start to specify logic for maintaining and querying

database– populate repository

Database Development and the SDLC

Page 12: IMS1907 Database Systems Summer Semester 2004/2005

Monash University 2004 12

Physical database design and definition– define database for specific DBMS used– organisation of data, database processing programs

Database implementation– install database and processing programs– develop procedures, load data, turn on!

Database maintenance– tune and fix the database, keep it running and

evolvingPackaged data models – universal, industry specific

Database Development and the SDLC

Page 13: IMS1907 Database Systems Summer Semester 2004/2005

Monash University 2004 13

Data modellingBusiness rulesER modelling– entities or ‘things of interest’– relationships– properties or attributes– rules and constraints affecting integrity of

entities

ER Modelling

Page 14: IMS1907 Database Systems Summer Semester 2004/2005

Monash University 2004 14

Entities– strong, weak

RelationshipsAssociative entitiesAttributes– multi-valued, derived, composite

Degree– unary, binary, ternary, n-ary

ER Modelling

Page 15: IMS1907 Database Systems Summer Semester 2004/2005

Monash University 2004 15

Cardinality– one-to-one, one-to-many, many-to-many

Cardinality constraints– optional, mandatory

Time dependent dataEntity types and sub-typesER quality issues

ER Modelling

Page 16: IMS1907 Database Systems Summer Semester 2004/2005

Monash University 2004 16

Detailed data modellingRelational database theory– considers data structure, manipulation, integrity

RelationPrimary keyComposite keyForeign keyIntegrity constraints– domain constraints, entity integrity, referential

integrity

Relational Database Theory

Page 17: IMS1907 Database Systems Summer Semester 2004/2005

Monash University 2004 17

A well-structured relation– is robust, stable and flexible– contains a minimum amount of redundancy– allows users to insert, modify, and delete rows in

a table without errors or inconsistenciesThree types of anomaly are possible- insertion - deletion - modification

Relational Database Theory

Page 18: IMS1907 Database Systems Summer Semester 2004/2005

Monash University 2004 18

Representing entities and relationships as relationsNormalisation is a process for converting complex data structures into simple, stable data structures in the form of relationsFunctional dependencyAccomplished in stages, each of which corresponds to a “normal form”

Normalisation

Page 19: IMS1907 Database Systems Summer Semester 2004/2005

Monash University 2004 19

First normal form (1NF)– identify PK, identify and remove repeating

groupsSecond normal form (2NF)– remove partial dependencies

Third normal form (3NF)– remove transitive dependencies

Merging relationsData structure diagrams (DSD)

Normalisation

Page 20: IMS1907 Database Systems Summer Semester 2004/2005

Monash University 2004 20

Has become de facto language for creating and querying relational databasesBenefits and disadvantages of SQLThe SQL environment– catalog– schema– data definition language (DDL)– data manipulation language (DML)– data control language (DCL)– data types

Structured Query Language (SQL)

Page 21: IMS1907 Database Systems Summer Semester 2004/2005

Monash University 2004 21

DDL– CREATE statements

• database, table, view, ….• assigning constraints

– DROP statements• database, table, view, ….

– ALTER statements• database, table, view, column, ….

Structured Query Language (SQL)

Page 22: IMS1907 Database Systems Summer Semester 2004/2005

Monash University 2004 22

DML– INSERT, LOAD DATA statements to populate tables– SHOW, DESCRIBE statements to view structures– retrieving data – queries

• SELECT …. FROM …. WHERE ….– aggregate operators

• COUNT, SUM, AVG, MIN, MAX, DISTINCT• GROUP BY• ordering query results with ORDER BY

Structured Query Language (SQL)

Page 23: IMS1907 Database Systems Summer Semester 2004/2005

Monash University 2004 23

DML– matching patterns with LIKE– joining tables– sub-queries– outer joins using LEFT JOIN– query format– How joins are processed

• Cartesian product

Structured Query Language (SQL)

Page 24: IMS1907 Database Systems Summer Semester 2004/2005

Monash University 2004 24

ViewsSchemaANSI/SPARC three-schema architecture standard– external schema

• user views– conceptual schema

• single, coherent definition of enterprise data– internal schema

• physical storage structures

Database Systems Architecture

Page 25: IMS1907 Database Systems Summer Semester 2004/2005

Monash University 2004 25

Data independence– logical– physical

Network architecture– client–server tiered architecture– distributed databases

Database Systems Architecture

Page 26: IMS1907 Database Systems Summer Semester 2004/2005

Monash University 2004 26

Database Systems Performance Issues

The ultimate measures of database performance are– response time to queries– the speed of updates

We also need to consider– data accessibility, security, integrity– usability– recoverability

Physical database design translates conceptual and external schemas into physical designs aimed at storing data in a way that provides adequate performance

Page 27: IMS1907 Database Systems Summer Semester 2004/2005

Monash University 2004 27

Guided by the nature of the data and its intended useTuning the database is often performed during operation but good performance starts with a strong physical design Critical decisions during physical design– choice of storage format – data type– grouping of attributes into physical records– arranging similarly structured records in secondary

memory– indexes, clusters, architectures– strategies for query handling based on indexes, records

Physical Database Design

Page 28: IMS1907 Database Systems Summer Semester 2004/2005

Monash University 2004 28

Data volume and usage analysis – workloadsChoice of data typesDesigning physical records– page size, blocking factor

Denormalisation– combining attributes into a single table– partitioning a table into several physical records

Physical file organisation– sequential, indexed, hashed

Clusters, indexesImproving file access - RAID

Physical Database Design

Page 29: IMS1907 Database Systems Summer Semester 2004/2005

Monash University 2004 29

Database Systems Performance

Choosing an appropriate database architectures– hierarchical database model– network database model– relational database model– object-oriented database model– multidimensional database model

Optimising query performance– good query design

Page 30: IMS1907 Database Systems Summer Semester 2004/2005

Monash University 2004 30

Data is viewed as a corporate assetAs with any asset, management is essential to exploit the resource to the maximum benefitEffective management of data provides support for operations and decision making at all organisational levelsThe roles of data administration and database administration have evolved to meet the complex task of– achieving effective management of data resources– leveraging those resources to the greatest advantage

Information Resource Management

Page 31: IMS1907 Database Systems Summer Semester 2004/2005

Monash University 2004 31

Information Resource Management

There are three major roles in information resource management– data administration

• planning, analysis– database administration

• physical design and operational use– application development

• systems design and implementation

Page 32: IMS1907 Database Systems Summer Semester 2004/2005

Monash University 2004 32

Information Resource Management

Ineffective data administration leads to poor data utilisationNew technologies and trends are driving the evolution of the roles of data administrator and database administratorRoles of the– data administrator– database administrator

Evolving roles of the DA and DBA

Page 33: IMS1907 Database Systems Summer Semester 2004/2005

Monash University 2004 33

Final Exam

3 hour exam, 10 minute reading timeTen questions

– 1 question consisting of ten short answer questions (10 x 1 mark)

– 6 short to medium length questions (1 x 5 marks, 6 x 10 marks)

– ER modelling (10 marks)– normalisation (15 marks)

– Attempt all questions!

Page 34: IMS1907 Database Systems Summer Semester 2004/2005

Monash University 2004 34

Exam Strategy

Know the date, time and location of your exam – it’s your responsibility!Know your seat number Make sure you have your student ID card with youGet to the exam earlyEnsure you have adequate writing materials with youNo text books or notes allowedRelax – there’s really not much to worry about– whatever you have to do (within reason) to help

you relax is ok

Page 35: IMS1907 Database Systems Summer Semester 2004/2005

Monash University 2004 35

Exam Strategy3 hour, 10 minute reading time, 100 marks– you’ve got 180 minutes to earn 100 marks!

Convert marks to minutes 1.8 minutes/markCalculate time available for each questionIt is a guide to the amount of effort I expect you to spend on each question Once the available time for a question is up, stop writing!If you finish a question within the available time, return to any incomplete answersMake sure you understand the questions!

Page 36: IMS1907 Database Systems Summer Semester 2004/2005

Monash University 2004 36

Study StrategyGive yourself sufficient time for revision– don’t wait till the day before the exam to start studying

Study all topics covered in lectures

Re-read lecture notes, your notes, text books, tutorial notes

Do all exercises especially revision exercises

Attempt previous exams

Consult tutors or lecturer before exam

Get plenty of sleep, drink lots of water, eat green vegetables– it’s not quite time to party yet!