Week 5 Lecture
Distributed Database Management Systems
Samuel Conn, Asst Professor
2
In this lecture, you will learn:
What a distributed database management system (DDBMS) is and what its components are
How database implementation is affected by different levels of data and process distribution
How transactions are managed in a distributed database environment
How database design is affected by the distributed database environment
3
Evolution of DDBMS
Distributed database management system (DDBMS)
• Interconnected computer systems
• Data and processing functions reside on multiple sites
1970s: Centralized DBMS
1980s: Social and technical changes
• Ad hoc capability required
• Decentralized management structure common
1990s: New forces
• Internet and the World Wide Web used for data access and distribution
• Data analysis through data mining and data warehousing
4
DDBMS Advantages
Data located near site with greatest demand
Faster data access
Faster data processing
Growth facilitation
Improved communications
Reduced operating costs
User-friendly interface
Less danger of single-point failure
Processor independence
5
DDBMS Disadvantages
Complexity of management and control
Security
Lack of standards
Increased storage requirements
Greater difficulty in managing data environment
Increased training costs
6
Distributed Processing
Shares the database’s logical processing among two or more physically independent sites connected through a network
Figure 10.1
7
Distributed Database
Stores a logically related database over two or more physically independent sites
Figure 10.2
8
Distributed Database vs. Distributed Processing
Distributed processing
• Does not require a distributed database
• May be based on a single database on a single computer
• Copies or parts of database processing functions must be distributed to all data storage sites
Distributed database
• Requires distributed processing
Both
• Require a network to connect components
9
Functions of DDBMS
Application/end user interface
Validation to analyze data requests
Transformation to determine request components
Query optimization to find the best access strategy
Mapping to determine the data location
I/O interface to read or write data
Formatting to prepare the data for presentation
Security to provide data privacy
Backup and recovery
DB administration
Concurrency control
Transaction management
10
Centralized Database
Figure 10.3
11
Fully Distributed Database Management System
Figure 10.4
12
DDBMS Components
Computer workstations
Network hardware and software components
Communications media
Transaction processor (TP)
• Also called application processor (AP) or transaction manager (TM)
Data processor (DP)
• Also called data manager (DM)
13
Distributed Database Components
Figure 10.5
14
DDBMS Protocols
Interface with network to transport data and commands between DPs and TPs
Synchronize data received from DPs and route to appropriate TPs
Ensure common database functions
• Security
• Concurrency control
• Backup and recovery
15
Levels of Data and Process Distribution
Database systems can be classified based on process distribution and data distribution
Table 10.1
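The classification can be sketched as a small function. This is an illustrative sketch only: the function name `classify_distribution` and the use of site counts as inputs are assumptions, not part of the slides. The "not applicable" case follows from the earlier slide stating that a distributed database requires distributed processing.

```python
def classify_distribution(process_sites: int, data_sites: int) -> str:
    """Classify a database system by process and data distribution.

    process_sites: number of sites that perform processing (hypothetical input)
    data_sites: number of sites that store data (hypothetical input)
    """
    if process_sites < 1 or data_sites < 1:
        raise ValueError("site counts must be at least 1")
    multi_process = process_sites > 1
    multi_data = data_sites > 1
    if not multi_process and not multi_data:
        return "SPSD"  # single-site processing, single-site data
    if multi_process and not multi_data:
        return "MPSD"  # multiple-site processing, single-site data
    if multi_process and multi_data:
        return "MPMD"  # multiple-site processing, multiple-site data
    # Multiple-site data with single-site processing: a distributed
    # database requires distributed processing, so this is not applicable.
    return "not applicable"
```

For example, a host DBMS reached through dumb terminals is `classify_distribution(1, 1)`, while a file server shared by several LAN workstations is `classify_distribution(3, 1)`.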
16
Single-Site Processing, Single-Site Data (SPSD)
All processing on single CPU or host computer
All data are stored on host computer disk
DBMS located on the host computer
DBMS accessed by dumb terminals
Typical of mainframe and minicomputer DBMSs
Typical of 1st generation of single-user microcomputer databases
17
Single-Site Processing, Single-Site Data (con’t.)
Figure 10.6
18
Multiple-Site Processing, Single-Site Data (MPSD)
• Requires network file server
• Applications accessed through LAN
• Variation known as client/server architecture
Figure 10.7
19
Multiple-Site Processing, Multiple-Site Data (MPMD)
Fully distributed DDBMS with support for multiple DPs and TPs at multiple sites
Homogeneous
• Integrate one type of centralized DBMS over the network
Heterogeneous
• Integrate different types of centralized DBMSs over a network
20
Heterogeneous Distributed Database Scenario
Figure 10.8
21
Distributed DB Transparency
Allows end users to feel like the database’s only user
Hides complexities of distributed database
Transparency features
• Distribution
• Transaction
• Failure
• Performance
• Heterogeneity
22
Distribution Transparency
Allows management of a physically dispersed database as though it were centralized
Three levels
• Fragmentation transparency
• Location transparency
• Local mapping transparency
Table 10.2
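The three levels differ in how much the end user must know about fragments and their locations. The sketch below illustrates this with in-memory data. The table name `EMPLOYEE`, fragment names `E1` and `E2`, site names, rows, and all function names are hypothetical, and a real DDBMS would consult its distributed data dictionary rather than Python dictionaries.

```python
# Hypothetical data: EMPLOYEE is split into fragments E1 and E2 at two sites.
SITES = {
    "NewYork": {"E1": [{"name": "Ann", "born": 1958}, {"name": "Bo", "born": 1975}]},
    "Atlanta": {"E2": [{"name": "Cy", "born": 1949}]},
}
# Assumed stand-in for the distributed data dictionary.
FRAGMENTS = {"EMPLOYEE": ["E1", "E2"]}
LOCATION = {"E1": "NewYork", "E2": "Atlanta"}


def local_mapping_query(fragment_sites, predicate):
    """Local mapping transparency: caller names each fragment AND its site."""
    rows = []
    for fragment, site in fragment_sites:
        rows += [r for r in SITES[site][fragment] if predicate(r)]
    return rows


def location_transparent_query(fragments, predicate):
    """Location transparency: caller names fragments; sites are looked up."""
    return local_mapping_query([(f, LOCATION[f]) for f in fragments], predicate)


def fragmentation_transparent_query(table, predicate):
    """Fragmentation transparency: caller names only the table."""
    return location_transparent_query(FRAGMENTS[table], predicate)


def born_before_1960(row):
    return row["born"] < 1960
```

All three calls return the same rows; only the amount of detail the caller must supply changes: `fragmentation_transparent_query("EMPLOYEE", born_before_1960)`, `location_transparent_query(["E1", "E2"], born_before_1960)`, and `local_mapping_query([("E1", "NewYork"), ("E2", "Atlanta")], born_before_1960)`.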
23
Transaction Transparency
Ensures that transactions maintain the database’s integrity and consistency
A transaction is completed only if all involved database sites complete their part of the transaction
Management mechanisms
• Remote request
• Remote transaction
• Distributed transaction
• Distributed request
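The four mechanisms can be told apart by how many requests a transaction issues and how many sites each request touches. The sketch below assumes the usual textbook definitions, which the slides themselves do not spell out: a remote request is one request to one remote site, a remote transaction is several requests to one remote site, a distributed transaction is several requests where each touches a single site but different requests touch different sites, and a distributed request is a request that references more than one site. The function name and input shape are hypothetical.

```python
def classify_transaction(requests):
    """Classify a transaction given the sites each request references.

    requests: list of sets of site names, one set per SQL request
              (an assumed representation for illustration).
    """
    if not requests or any(not sites for sites in requests):
        raise ValueError("each request must reference at least one site")
    # Any single request spanning several sites makes it a distributed request.
    if any(len(sites) > 1 for sites in requests):
        return "distributed request"
    all_sites = set().union(*requests)
    if len(all_sites) > 1:
        return "distributed transaction"
    return "remote request" if len(requests) == 1 else "remote transaction"
```

For instance, `classify_transaction([{"B"}, {"C"}])` is a distributed transaction, while `classify_transaction([{"B", "C"}])` is a distributed request.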
24
Remote Request
Figure 10.10
25
Remote Transaction
Figure 10.11
26
Distributed Transaction
Figure 10.12
27
Distributed Requests
Figure 10.13
28
Distributed Requests (con’t.)
Figure 10.14
29
Distributed Concurrency Control
Multi-site, multiple-process operations are more likely to create data inconsistencies and deadlocked transactions
Problems
• Transaction committed by local DP
• One DP could not commit transaction’s result
• Yields inconsistent database
30
Two-Phase Commit Protocol
DO-UNDO-REDO protocol
Write-ahead protocol
Two kinds of nodes
• Coordinator
• Subordinates
Phases
• Preparation
  • Coordinator sends message to all subordinates
  • Confirms all are ready to commit or abort
• Final commit
  • Ensures all subordinates have committed or aborted
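The two phases can be sketched in a few lines. This is a simplified, single-process illustration under stated assumptions: the class `Subordinate`, the function `two_phase_commit`, and the log entry strings are hypothetical, and the sketch ignores network failures, timeouts, and the coordinator's own log, all of which a real implementation must handle.

```python
from dataclasses import dataclass, field


@dataclass
class Subordinate:
    name: str
    can_commit: bool = True          # hypothetical flag: is this site able to commit?
    state: str = "ACTIVE"
    log: list = field(default_factory=list)

    def prepare(self) -> bool:
        # Write-ahead: record the vote in the local log before replying.
        vote = "PREPARED" if self.can_commit else "NOT PREPARED"
        self.log.append(vote)
        self.state = vote
        return self.can_commit

    def commit(self) -> None:
        self.log.append("COMMIT")
        self.state = "COMMITTED"

    def abort(self) -> None:
        self.log.append("ABORT")
        self.state = "ABORTED"


def two_phase_commit(subordinates) -> str:
    """Coordinator logic: commit only if every subordinate votes to commit."""
    # Phase 1 (preparation): ask every subordinate whether it is ready.
    votes = [s.prepare() for s in subordinates]
    # Phase 2 (final commit): broadcast the same decision to all subordinates.
    if all(votes):
        for s in subordinates:
            s.commit()
        return "COMMITTED"
    for s in subordinates:
        s.abort()
    return "ABORTED"
```

If any one subordinate cannot prepare, every subordinate aborts, which avoids the inconsistent state in which some sites have committed and others have not.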
31
Performance Transparency and Query Optimization
Objective: Minimize total cost associated with execution of request
Main costs
• Access time
• Communication
• CPU time
Basis for query optimization algorithms
• Optimum execution order
• Sites accessed to minimize communication costs
Dynamic or static optimization
Statistically based vs. rule-based query optimization algorithms
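The objective can be illustrated by summing the three main costs for each candidate access strategy and choosing the cheapest. The sketch below assumes a simple additive cost model with made-up numbers; the `Plan` class, the `cheapest` function, and the plan descriptions are hypothetical, and real optimizers estimate these costs from statistics or rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    name: str
    access_time: float      # I/O cost (illustrative units)
    communication: float    # cost of moving data between sites
    cpu_time: float         # processing cost

    @property
    def total_cost(self) -> float:
        # Assumed model: total cost is the plain sum of the three main costs.
        return self.access_time + self.communication + self.cpu_time


def cheapest(plans):
    """Return the plan with the lowest total cost."""
    return min(plans, key=lambda p: p.total_cost)


# Made-up numbers: shipping a whole table is dominated by communication cost.
plans = [
    Plan("ship whole table to site A", access_time=2.0, communication=9.0, cpu_time=1.0),
    Plan("filter at site B, ship result", access_time=3.0, communication=1.5, cpu_time=1.5),
]
best = cheapest(plans)
```

With these numbers the second plan wins because filtering before shipping cuts the communication cost, even though its access and CPU costs are slightly higher.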
32
Distributed Database Design
Partition database into fragments
• Horizontal
• Vertical
• Mixed
Fragments to replicate
• Storage of data copies at multiple sites
• Fully, partially, un-replicated databases
Data allocation
• Where to locate data
• Centralized, partitioned, replicated
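The three fragmentation strategies can be sketched on a small in-memory table. The `CUSTOMER` rows, column names, and helper functions are hypothetical. Horizontal fragments are subsets of rows, vertical fragments are subsets of columns that each keep the key so the table can be rebuilt, and a mixed fragment applies both.

```python
# Hypothetical table used only for illustration.
CUSTOMER = [
    {"id": 1, "name": "Ann", "state": "TN", "balance": 100},
    {"id": 2, "name": "Bo", "state": "FL", "balance": 250},
    {"id": 3, "name": "Cy", "state": "TN", "balance": 0},
]


def horizontal(rows, predicate):
    """Horizontal fragment: the subset of rows satisfying a predicate."""
    return [dict(r) for r in rows if predicate(r)]


def vertical(rows, key, columns):
    """Vertical fragment: a subset of columns; the key is always kept."""
    return [{c: r[c] for c in (key, *columns)} for r in rows]


def rejoin(left, right, key):
    """Rebuild rows from two vertical fragments by joining on the key."""
    by_key = {r[key]: r for r in right}
    return [{**l, **by_key[l[key]]} for l in left]


tn_rows = horizontal(CUSTOMER, lambda r: r["state"] == "TN")
fl_rows = horizontal(CUSTOMER, lambda r: r["state"] == "FL")
service = vertical(CUSTOMER, "id", ["name", "state"])
billing = vertical(CUSTOMER, "id", ["balance"])
# Mixed: a horizontal fragment that is then split vertically.
tn_billing = vertical(tn_rows, "id", ["balance"])
```

The union of the horizontal fragments and the key join of the vertical fragments both reproduce the original table, which is the property a fragmentation design needs to preserve.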
33
Client/Server Advantages Over DDBMS
Client/server less expensive
Client/server solutions allow use of microcomputer’s GUI
More people with PC skills than mainframe skills
PC is well established in workplace
Numerous data analysis and query tools exist
Considerable cost advantages to off-loading application development
34
Client/Server Disadvantages
Creates more complex environment with different platforms
Increased number of users and sites creates security problems
Training issues become more complex and expensive
35
Date’s 12 Commandments for Distributed Databases
1. Local Site Independence
2. Central Site Independence
3. Failure Independence
4. Location Transparency
5. Fragmentation Transparency
6. Replication Transparency
36
Date’s 12 Commandments for Distributed Databases (con’t.)
7. Distributed Query Processing
8. Distributed Transaction Processing
9. Hardware Independence
10. Operating System Independence
11. Network Independence
12. Database Independence