48
Fundamentals, Design, and Implementation, 9/e COS 346 Day 25

Fundamentals, Design, and Implementation, 9/e COS 346 Day 25

  • View
    218

  • Download
    3

Embed Size (px)

Citation preview

Fundamentals, Design, and Implementation, 9/e

COS 346

Day 25

Chapter 15/2 Copyright © 2004

Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke

Agenda

Assignment 9 Corrected – 4 A’s, 1 D and one non-submit

Assignment 10 Due Assignment 11 Posted

– Due May 6 10 AM Quiz 4

– May 6 10 AM – Chap 9, 10, 12, 13 & 15

Capstones Presentation May 6 10AM– All deliverables Due

Today – Discussion on Sharing Enterprise Data

Fundamentals, Design, and Implementation, 9/e

Chapter 15Sharing Enterprise Data

Chapter 15/4 Copyright © 2004

Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke

Database Processing Architectures System architectures for enterprise

database processing:– Teleprocessing system– Client-server system– File-sharing system– Distributed system

Chapter 15/5 Copyright © 2004

Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke

Teleprocessing Systems

Classic architecture for multi-user database processing

Users operate dumb terminals or PC that emulate dumb terminals– User interface is usually simple and primitive

A single centralized computer processes communications control program, application programs, DBMS, and operating system

Chapter 15/6 Copyright © 2004

Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke

Teleprocessing Systems

Chapter 15/7 Copyright © 2004

Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke

Client-Server Systems

A client-server system consists of a network of computers connected via a LAN

Clients are personal computers used to process application programs

Servers are PCs or mainframes that stores DBMS and the data-management portion of the operating system

Chapter 15/8 Copyright © 2004

Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke

Client-Server Systems

Chapter 15/9 Copyright © 2004

Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke

File-Sharing Systems

This architecture was developed before the client-server architecture

File server and user computers are connected through LAN– File server provides access to files and other resources– User computers must contain a copy of DBMS and

application programs DBMS on user’s computer sends requests to the

data management portion of the operating system on the file server for file-level processing– This cause more traffic across LAN than the client-server

system

Chapter 15/10 Copyright © 2004

Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke

File-Sharing Systems

Chapter 15/11 Copyright © 2004

Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke

Distributed Database Systems

Distributed database systems use multiple computers to process the same database

Distributed processing: use multiple computers for applications or DBMS processing– Database is centralized– E.g., file-sharing, client-server, and distributed

database system Distributed database processing: distribute

database to multiple computers – E.g., distributed database system

Chapter 15/12 Copyright © 2004

Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke

Distributed Database Systems

Chapter 15/13 Copyright © 2004

Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke

Database Partitioning

A vertical partition, or vertical fragment, refers to a table that is broken into two or more sets of columns

A horizontal partition, or horizontal fragment, refers to a table that is broken into two or more sets of rows

Mixed partition refers to a database broken into both horizontal and vertical partitions

Chapter 15/14 Copyright © 2004

Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke

Types of Distributed Databases

Types of distributed database:– Nonpartitioned, nonreplicate– Partitioned, nonreplicated– Nonpartitioned, replicated– Partitioned, replicated

The greater the degree of partitioning and replication– The greater the flexibility, independence, and reliability– The greater the expense, control difficulty, and security

problems

Chapter 15/15 Copyright © 2004

Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke

Types of Distributed Databases

Chapter 15/16 Copyright © 2004

Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke

Comparing DB Distribution Alternatives

Chapter 15/17 Copyright © 2004

Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke

Distributed Processing Techniques

Three types of distributed database processing Downloading of read-only data: only one computer

updates data, but multiple computers are sent copies to process

Updating by a designated computer: allows data update requests to originate on multiple computers, but to transmit those update requests to a designated computer for processing– Database at the designated computers must be

periodically synchronized

Chapter 15/18 Copyright © 2004

Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke

Distributed Processing Techniques (cont.) Updating by multiple computers: allows multiple

updates on the same data at multiple sites– Three types of distributed update conflict can occur:

• Loss of uniqueness• Lost updates due to concurrent transactions• Updates of deleted data

Coordinating distributed atomic transactions is difficult and requires a two-phase commit

The OLE Distributed Transaction Server and Java Enterprise Beans are two technologies for dealing with these problems

Chapter 15/19 Copyright © 2004

Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke

Downloading Data

Powerful personal computers enable user to download enterprise data for local processing

Users can query and report on downloaded data using DBMS products on their machines

Normally, users are not allowed to update and return data to prevent data integrity problems

A Web server can be used to publish downloaded data

Chapter 15/20 Copyright © 2004

Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke

Potential Problems of Downloaded Databases Coordination

– Downloaded data must conform to database constraints

– Local updates must be coordinated with downloads

Consistency– In general, downloaded data should not be

updated– Applications need features to prevent updating– Users should be made aware of possible

problems

Chapter 15/21 Copyright © 2004

Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke

Potential Problems of Downloaded Databases (cont.) Access Control

– Data may be replicated on many computers– Procedures to control data access are more

complicated

Potential for Computer Crime– Illegal copying is difficult to prevent– Diskettes and illegal online access are easy to

conceal– Risk may prevent the development of

downloaded data applications

Chapter 15/22 Copyright © 2004

Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke

Processing Downloaded Data with a Web Server

Chapter 15/23 Copyright © 2004

Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke

OLAP

On Line Analytical Processing (OLAP) is a new way of presenting information

With it, data is viewed in cubes that have axes, dimensions, measures, slices, and levels

Cube refers to – Underlying semantic structure that is used to

interpret data– A particular materialization of data in such a

semantic structure

Chapter 15/24 Copyright © 2004

Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke

Example: Relational Source Data

Chapter 15/25 Copyright © 2004

Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke

Example: OLAP Cube

Chapter 15/26 Copyright © 2004

Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke

OLAP Terminology

OLAP hypercube: means a data display with an unlimited number of axes (Dimensions)

Chapter 15/27 Copyright © 2004

Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke

Example: Cube Data Definition

CREATE CUBE HousingSalesCube (DIMENSION Time TYPE TIME,

LEVEL Year TYPE YEAR,LEVEL Quarter TYPE QUARTER,LEVEL Month TYPE MONTH,

DIMENSION Location,LEVEL USA TYPE ALL,LEVEL State,LEVEL City,

DIMENSION HousingCategory,DIMENSION HousingType,MEASURE SalesPrice,

FUNCTION AVG

MEASURE AskingPrice, FUNCTION AVG)

Chapter 15/28 Copyright © 2004

Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke

Example: Multidimensional SELECT Statement

SELECT CROSSJOIN ({Existing Structure, New Construction},{California.Children, Nevada})ON COLUMNS,{1998.Q1.Children, 1998.Q2, 1998.Q3, 1998.Q4,1999.Q1.Children, 1999.Q2, 1999.Q3, 1999.Q4}ON ROWS

FROM HousingSalesCubeWHERE (SalesPrice, HousingType =

‘SingleFamily’)

Chapter 15/29 Copyright © 2004

Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke

OLAP Schema Structures

Star schema: every dimension table is adjacent to the table storing the measure values– These tables may or may not be normalized

Snowflake schema: there can be multilevel, normalized tables

In general, the star schema requires greater storage, but it is faster to process than the snowflake schema

Chapter 15/30 Copyright © 2004

Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke

Example: Star Schema

Chapter 15/31 Copyright © 2004

Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke

Example: Snowflake Schema

Chapter 15/32 Copyright © 2004

Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke

OLAP Storage Alternatives

Three different means for storing OLAP data ROLAP (relational OLAP): relational DBMS with

extensions is sufficient to meet OLAP requirements

MOLAP (multidimensional OLAP): a specialized multidimensional processor is necessary to produce acceptable OLAP performance

HOLAP (hybrid OLAP): both DBMS products and specialized OLAP engines have a role and can be used to advantage

Chapter 15/33 Copyright © 2004

Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke

Microsoft OLAP Architecture

Microsoft has extended OLE DB and ADO for OLAP OLE DB for OLAP includes

– A dataset object– ADO MD: new objects for processing dataset objects in

ways similar to recordset objects– Pivot Table Service: an OLAP processor available as Office

2000/XP and Windows 2000/XP service

Microsoft’s architecture moves much OLAP processing to client computers

Whether this will be acceptable for the processing of data on enterprise servers is as yet unknown

Chapter 15/34 Copyright © 2004

Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke

Microsoft OLAP Architecture

Chapter 15/35 Copyright © 2004

Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke

Client Side Pivot Tables

Chapter 15/36 Copyright © 2004

Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke

Data Warehouse

A data warehouse is a store of enterprise data that is designed to facilitate management decision-making

Goal: to increase the value of the organization’s data asset

Role: to store extracts from operational data and make those extracts available to users in a useful format– Data can be extracts from databases, files,

images, recordings, photos, external data, etc.

Chapter 15/37 Copyright © 2004

Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke

Data Warehouse

Chapter 15/38 Copyright © 2004

Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke

Data Warehouse Components

Data extraction tools Extracted data Metadata of warehouse contents Warehouse DBMS(s) and OLAP servers Warehouse data management tools Data delivery programs End-user analysis tools User training courses and materials Warehouse consultants

Chapter 15/39 Copyright © 2004

Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke

Data Warehouse Requirements

Queries and reports with variable structure User-specified data aggregation User-specified drill down Graphical outputs Integration with domain-specific programs

Chapter 15/40 Copyright © 2004

Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke

Challenges for Data Warehouses

Inconsistent data– When data are integrated, inconsistencies can

develop due to timing and domain differences– Solution: create metadata to describe both

timing and domains of source data Tool Integration

– Because of the many tools required in a data warehouse, tools will have different user interfaces and inconsistent means of importing and exporting data, and it may be difficult to obtain technical support

Chapter 15/41 Copyright © 2004

Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke

Challenges for Data Warehouses

Lack of tools for managing the data warehouse– The organization may have to develop its own

tools for managing non-relational data and for maintaining appropriate metadata. Such development is difficult and expensive

Ad hoc nature of requirements – Such requests are difficult to satisfy– Solution: create datamart, i.e,, limited-scope

data warehouses

Chapter 15/42 Copyright © 2004

Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke

Data Marts

A data mart is a limited-scope data warehouse

A data mart is easier to manage than the enterprise data warehouse because – It has a much smaller domain– It can be restricted

• To a particular type of input data• To a particular business function• To a particular business unit or geographic area

Chapter 15/43 Copyright © 2004

Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke

Enterprise Data Sharing Continuum

Chapter 15/44 Copyright © 2004

Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke

Data Administration

Data are an important organizational asset that can support both operations and management decisionmaking

The purpose of offices of data administration is to guard and protect the data and to ensure that they are used effectively

Chapter 15/45 Copyright © 2004

Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke

Data Administration Challenges

Many types of data exist Basic categories of data are not obvious The same data can have many names The same data can have many

descriptions and formats Data are changed often concurrently Political-organizational issues complicate

operational issues

Chapter 15/46 Copyright © 2004

Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke

Functions of Data Administration

Marketing– Communicate existence of data administration to

organization– Explain reason for existence of standards, policies, and

guidelines– Describe in a positive light the services provided

Data Standards– Establish standard means for describing data items.

Standards include name, definition, description, processing restrictions, etc.

– Establish data proponents

Chapter 15/47 Copyright © 2004

Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke

Functions of Data Administration

Data Policies– Establish organization-wide data policy, e.g., security,

data proponency, and distribution

Forum for Data Conflict Resolution– Establish procedures for reporting conflicts– Provide means for hearing all perspectives and views– Have authority to make decision to resolve conflict

Return on Organization's Data Investment– Focus attention on value of data investment– Investigate new methodologies and technologies– Take proactive attitude toward information management

Fundamentals, Design, and Implementation, 9/e

Chapter 15Sharing Enterprise Data