Chapter 12

Distributed Database Management Systems

Database Systems: Design, Implementation, and Management,

Seventh Edition, Rob and Coronel

In this chapter, you will learn:

• What a distributed database management system (DDBMS) is and what its components are

• How database implementation is affected by different levels of data and process distribution

• How transactions are managed in a distributed database environment

• How database design is affected by the distributed database environment

The Evolution of Distributed Database Management Systems

• Distributed database management system (DDBMS)
  – Governs storage and processing of logically related data over interconnected computer systems in which both data and processing functions are distributed among several sites

The Evolution of Distributed Database Management Systems (continued)

• Centralized database required that corporate data be stored in a single central site

• Dynamic business environment and centralized database’s shortcomings spawned a demand for applications based on data access from different sources at multiple locations

DDBMS Advantages and Disadvantages

• Advantages include:
  – Data are located near “greatest demand” site
  – Faster data access
  – Faster data processing
  – Growth facilitation
  – Improved communications

DDBMS Advantages and Disadvantages (continued)

• Advantages include (continued):
  – Reduced operating costs
  – User-friendly interface
  – Less danger of a single-point failure
  – Processor independence

DDBMS Advantages and Disadvantages (continued)

• Disadvantages include:
  – Complexity of management and control
  – Security
  – Lack of standards
  – Increased storage requirements
  – Increased training cost

Characteristics of Distributed Management Systems

• Application interface

• Validation

• Transformation

• Query optimization

• Mapping

• I/O interface

Characteristics of Distributed Management Systems (continued)

• Formatting

• Security

• Backup and recovery

• DB administration

• Concurrency control

• Transaction management

Characteristics of Distributed Management Systems (continued)

• Must perform all the functions of centralized DBMS

• Must handle all necessary functions imposed by distribution of data and processing
  – Must perform these additional functions transparently to the end user

DDBMS Components

• Must include (at least) the following components:
  – Computer workstations
  – Network hardware and software
  – Communications media
  – Transaction processor (application processor, transaction manager)
    • Software component found in each computer that requests data

DDBMS Components (continued)

• Must include (at least) the following components (continued):
  – Data processor or data manager
    • Software component residing on each computer that stores and retrieves data located at the site
    • May be a centralized DBMS

Levels of Data and Process Distribution

Single-Site Processing, Single-Site Data (SPSD)

• All processing is done on single CPU or host computer (mainframe, midrange, or PC)

• All data are stored on host computer’s local disk

• Processing cannot be done on end user’s side of system

Single-Site Processing, Single-Site Data (SPSD) (continued)

• Typical of most mainframe and midrange computer DBMSs

• DBMS is located on host computer, which is accessed by dumb terminals connected to it

• Also typical of first generation of single-user microcomputer databases

Multiple-Site Processing, Single-Site Data (MPSD)

• Multiple processes run on different computers sharing single data repository

• MPSD scenario requires network file server running conventional applications that are accessed through LAN

• Many multiuser accounting applications, running under personal computer network, fit such a description

Multiple-Site Processing, Multiple-Site Data (MPMD)

• Fully distributed database management system with support for multiple data processors and transaction processors at multiple sites

• Classified as either homogeneous or heterogeneous

• Homogeneous DDBMSs
  – Integrate only one type of centralized DBMS over a network

Multiple-Site Processing, Multiple-Site Data (MPMD) (continued)

• Heterogeneous DDBMSs
  – Integrate different types of centralized DBMSs over a network

• Fully heterogeneous DDBMS
  – Support different DBMSs that may even support different data models (relational, hierarchical, or network) running under different computer systems, such as mainframes and microcomputers

Distributed Database Transparency Features

• Allow end user to feel like database’s only user

• Features include:
  – Distribution transparency
  – Transaction transparency
  – Failure transparency
  – Performance transparency
  – Heterogeneity transparency

Distribution Transparency

• Allows management of physically dispersed database as though it were a centralized database

• The following three levels of distribution transparency are recognized:
  – Fragmentation transparency
  – Location transparency
  – Local mapping transparency

Transaction Transparency

• Ensures database transactions will maintain distributed database’s integrity and consistency

Distributed Requests and Distributed Transactions

• Distributed transaction
  – Can update or request data from several different remote sites on network

• Remote request
  – Lets single SQL statement access data to be processed by single remote database processor

• Remote transaction
  – Accesses data at single remote site

Distributed Requests and Distributed Transactions (continued)

• Distributed transaction
  – Allows transaction to reference several different (local or remote) DP sites

• Distributed request
  – Lets single SQL statement reference data located at several different local or remote DP sites
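
The four access types differ in two respects: how many SQL statements are involved and how many DP sites each statement references. The sketch below makes that distinction explicit. The function name `classify_access` and the modelling of a unit of work as a list of per-statement site sets are assumptions made for illustration, not part of any DDBMS interface. It also assumes that a single referenced site is a remote one, and it labels a unit of work by its most demanding statement.

```python
def classify_access(statements):
    """Classify a unit of work by the DP sites its SQL statements reference.

    `statements` holds one entry per SQL statement; each entry is the set of
    site names that statement references (a hypothetical model).
    """
    if not statements or any(not sites for sites in statements):
        raise ValueError("every statement must reference at least one site")

    # A single statement that spans several sites is a distributed request.
    if any(len(sites) > 1 for sites in statements):
        return "distributed request"

    all_sites = set().union(*statements)
    # Each statement stays at one site, but together they span several sites.
    if len(all_sites) > 1:
        return "distributed transaction"

    # Everything happens at one site: either one statement or several.
    return "remote request" if len(statements) == 1 else "remote transaction"
```

For example, `classify_access([{"B"}, {"C"}])` describes two statements that each reference a different site, while `classify_access([{"B", "C"}])` describes one statement that references both.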

Distributed Concurrency Control

• Multisite, multiple-process operations are much more likely to create data inconsistencies and deadlocked transactions than are single-site systems

Two-Phase Commit Protocol

• Distributed databases make it possible for transaction to access data at several sites

• Final COMMIT must not be issued until all sites have committed their parts of transaction

• Two-phase commit protocol requires each individual DP’s transaction log entry be written before database fragment is actually updated
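
A minimal in-memory sketch of the idea follows. The `DataProcessor` class, the `two_phase_commit` function, and the log entry format are all hypothetical names chosen for illustration; a real DDBMS would add durable logs, timeouts, and recovery after a crash, none of which are modelled here. What the sketch does show is that each DP writes its log entry before touching its fragment, and that no fragment is updated unless every DP has reported that it is prepared.

```python
class DataProcessor:
    """Hypothetical in-memory stand-in for one DP site."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # simulates a site that cannot prepare
        self.log = []                 # transaction log
        self.data = {}                # the local database fragment
        self.pending = None

    def prepare(self, txn_id, updates):
        # Write-ahead: the log entry is recorded before any data is touched.
        if not self.can_commit:
            self.log.append((txn_id, "NOT PREPARED"))
            return False
        self.log.append((txn_id, "PREPARED", dict(updates)))
        self.pending = dict(updates)
        return True

    def commit(self, txn_id):
        self.data.update(self.pending)
        self.pending = None
        self.log.append((txn_id, "COMMITTED"))

    def abort(self, txn_id):
        self.pending = None
        self.log.append((txn_id, "ABORTED"))


def two_phase_commit(txn_id, work):
    """`work` maps each DataProcessor to the updates it should apply."""
    # Phase 1 (preparation): every site logs its part and votes.
    votes = [dp.prepare(txn_id, updates) for dp, updates in work.items()]

    # Phase 2: commit only on a unanimous yes; otherwise abort everywhere.
    if all(votes):
        for dp in work:
            dp.commit(txn_id)
        return "COMMIT"
    for dp in work:
        dp.abort(txn_id)
    return "ABORT"
```

If any one site passed to `two_phase_commit` is constructed with `can_commit=False`, the function returns `"ABORT"` and the fragments at the other sites are left unchanged.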

Performance Transparency and Query Optimization

• Objective of query optimization routine is to minimize total cost associated with execution of request

• Costs associated with request are function of:
  – Access time (I/O) cost
  – Communication cost
  – CPU time cost

• Must provide distribution transparency as well as replica transparency
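
As a rough illustration of minimizing total cost, the sketch below sums the three cost components for each candidate plan and picks the cheapest. The names `PlanCost` and `cheapest_plan` are hypothetical, the units are arbitrary, and the unweighted sum is an assumption; an actual optimizer may weight or estimate these components quite differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanCost:
    """Estimated costs for one candidate access plan (arbitrary units)."""
    name: str
    io: float             # access time (I/O) cost
    communication: float  # cost of moving data between sites
    cpu: float            # CPU time cost

    @property
    def total(self):
        # Assumption: the three components are simply added together.
        return self.io + self.communication + self.cpu


def cheapest_plan(plans):
    """Return the candidate plan with the lowest total estimated cost."""
    return min(plans, key=lambda p: p.total)


# Invented numbers, for illustration only.
plans = [
    PlanCost("ship whole table to requesting site", io=10, communication=50, cpu=5),
    PlanCost("filter at remote site, ship result", io=12, communication=8, cpu=6),
]
best = cheapest_plan(plans)
```

With these invented figures the second plan wins because its lower communication cost outweighs its slightly higher I/O and CPU cost.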

Performance Transparency and Query Optimization (continued)

• Replica transparency
  – DDBMS’s ability to hide existence of multiple copies of data from user

• Query optimization techniques include:
  – Manual or automatic
  – Static or dynamic
  – Statistically based or rule-based algorithms

Distributed Database Design

• Data fragmentation – How to partition database into fragments

• Data replication – Which fragments to replicate

• Data allocation – Where to locate those fragments and replicas

Data Fragmentation

• Breaks single object into two or more segments or fragments

• Each fragment can be stored at any site over computer network

• Information about data fragmentation is stored in distributed data catalog (DDC), from which it is accessed by TP to process user requests

Data Fragmentation (continued)

• Strategies
  – Horizontal fragmentation
    • Division of a relation into subsets (fragments) of tuples (rows)
  – Vertical fragmentation
    • Division of a relation into attribute (column) subsets
  – Mixed fragmentation
    • Combination of horizontal and vertical strategies
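
The three strategies can be sketched on a small in-memory relation. The `customers` table, its column names, and the helper functions `horizontal_fragment` and `vertical_fragment` are invented for illustration. Repeating the key column in every vertical fragment is an assumption of this sketch, made so that the fragments can be joined back together.

```python
def horizontal_fragment(rows, predicate):
    """Keep whole rows (tuples) that satisfy `predicate`."""
    return [row for row in rows if predicate(row)]


def vertical_fragment(rows, key, columns):
    """Keep only `columns` of every row, plus `key` so rows can be rejoined."""
    return [{c: row[c] for c in (key, *columns)} for row in rows]


# Hypothetical relation used only for illustration.
customers = [
    {"cus_num": 10, "name": "Alpha", "state": "TN", "balance": 1200.0},
    {"cus_num": 11, "name": "Beta", "state": "FL", "balance": 0.0},
    {"cus_num": 12, "name": "Gamma", "state": "TN", "balance": 340.0},
]

# Horizontal: a subset of complete rows, here the customers in one state.
tn_rows = horizontal_fragment(customers, lambda r: r["state"] == "TN")

# Vertical: attribute subsets that share the key column.
contact = vertical_fragment(customers, "cus_num", ["name", "state"])
billing = vertical_fragment(customers, "cus_num", ["balance"])

# Mixed: fragment horizontally first, then vertically within that subset.
tn_billing = vertical_fragment(tn_rows, "cus_num", ["balance"])
```

Here `tn_rows` holds two full rows, `contact` and `billing` each hold three rows with fewer columns, and `tn_billing` holds two rows with only the key and the balance.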

Data Replication

• Storage of data copies at multiple sites served by computer network

• Fragment copies can be stored at several sites to serve specific information requirements
  – Can enhance data availability and response time
  – Can help to reduce communication and total query costs

Data Replication (continued)

• Replication scenarios
  – Fully replicated database
    • Stores multiple copies of each database fragment at multiple sites
    • Can be impractical due to amount of overhead
  – Partially replicated database
    • Stores multiple copies of some database fragments at multiple sites
    • Most DDBMSs are able to handle the partially replicated database well

Data Replication (continued)

• Replication scenarios (continued)
  – Unreplicated database
    • Stores each database fragment at single site
    • No duplicate database fragments
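
The three replication scenarios can be told apart by counting how many sites hold each fragment. The sketch below assumes a hypothetical `placement` mapping from fragment name to the set of sites storing a copy; the function name `classify_replication` is likewise invented for illustration.

```python
def classify_replication(placement):
    """`placement` maps each fragment name to the set of sites storing a copy."""
    if not placement or any(not sites for sites in placement.values()):
        raise ValueError("every fragment must be stored at one or more sites")

    copies = [len(sites) for sites in placement.values()]
    if all(n > 1 for n in copies):
        return "fully replicated"      # every fragment has multiple copies
    if any(n > 1 for n in copies):
        return "partially replicated"  # only some fragments have copies
    return "unreplicated"              # each fragment lives at a single site
```

For example, `classify_replication({"F1": {"A", "B"}, "F2": {"C"}})` describes a database in which only fragment `F1` is copied to a second site.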

Data Allocation

• Deciding where to locate data

• Allocation strategies
  – Centralized data allocation
    • Entire database is stored at one site
  – Partitioned data allocation
    • Database is divided into several disjoint parts (fragments) and stored at several sites

Data Allocation (continued)

• Allocation strategies (continued)
  – Replicated data allocation
    • Copies of one or more database fragments are stored at several sites

• Data distribution over computer network is achieved through data partition, data replication, or combination of both

Client/Server vs. DDBMS

• Client/server architecture refers to the way in which computers interact to form a system

• Features a user of resources, or client, and a provider of resources, or server

• Can be used to implement a DBMS in which the client is the TP and the server is the DP

Client/Server vs. DDBMS (continued)

• Client/server advantages
  – Less expensive than alternate minicomputer or mainframe solutions
  – Allow end user to use microcomputer’s GUI, thereby improving functionality and simplicity
  – More people in job market have PC skills than mainframe skills
  – PC is well established in workplace

Client/Server vs. DDBMS (continued)

• Client/server advantages (continued)
  – Numerous data analysis and query tools exist to facilitate interaction with DBMSs available in PC market
  – Considerable cost advantage to offloading applications development from mainframe to powerful PCs

Client/Server vs. DDBMS (continued)

• Client/server disadvantages
  – Creates more complex environment
    • Different platforms (LANs, operating systems, and so on) are often difficult to manage
  – An increase in number of users and processing sites often paves the way for security problems

Client/Server vs. DDBMS (continued)

• Client/server disadvantages (continued)
  – Possible to spread data access to much wider circle of users
    • Increases demand for people with broad knowledge of computers and software
    • Increases burden of training and cost of maintaining the environment

C. J. Date’s Twelve Commandments for Distributed Databases

• Local site independence

• Central site independence

• Failure independence

• Location transparency

• Fragmentation transparency

• Replication transparency

C. J. Date’s Twelve Commandments for Distributed Databases (continued)

• Distributed query processing

• Distributed transaction processing

• Hardware independence

• Operating system independence

• Network independence

• Database independence

Summary

• Distributed database stores logically related data in two or more physically independent sites connected via computer network

• Distributed processing is division of logical database processing among two or more network nodes

• Distributed databases require distributed processing

• Main components of DDBMS are transaction processor and data processor

Summary (continued)

• Current database systems can be classified by extent to which they support processing and data distribution

• Homogeneous distributed database system integrates only one particular type of DBMS over computer network

• Heterogeneous distributed database system integrates several different types of DBMSs over computer network

Summary (continued)

• DDBMS characteristics are best described as set of transparencies

• Transaction is formed by one or more database requests

• Distributed concurrency control is required in network of distributed databases

• Distributed DBMS evaluates every data request to find optimum access path in distributed database

Summary (continued)

• The design of distributed database must consider fragmentation and replication of data

• Database can be replicated over several different sites on computer network

• Client/server architecture refers to way in which two computers interact over computer network to form a system