Massively Distributed Database Systems - Distributed DBS

Spring 2014
Ki-Joune Li

http://isel.cs.pusan.ac.kr/~lik
Pusan National University

Pros and Cons

Advantages

• Reliability and Availability
• Local Control
• Incremental Growth
• Communication Costs
• Fast Response

Disadvantages

• Software Cost and Complexity
• Processing Overhead
• Data Integrity
• Slow Response

3-layer model of databases

• External Layer: View Definitions
• Conceptual Layer: Conceptual Schema
• Physical Layer: Data Storage Format

[Figure: the three layers of a local database; additional figure labels: "Modeling", "Implementation - Systems".]

Distributed Databases as Logical Layers

[Figure: three local databases, each with its own External Layer, Conceptual Layer, and Physical Layer, and each with its own "view from client". A Global Conceptual Layer and a Global Physical Layer make up the global database. The figure also carries a "??" label.]

Issues

• Replication vs. Partitioning
• Distributed DBMS
• Transparency
• Query Optimization
• Transaction Management

Replication vs. Partitioning

• Replication
• Partitioning
• Vertical vs. Horizontal
• Hybrid

Replication

• Replicate all or parts of databases to local DB

Site 1: DB-1, DB-2, DB-3
Site 2: DB-1, DB-2
Site 3: DB-2, DB-3
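The placement in the figure can be sketched as a small lookup table. This is only an illustration: the site names (`site1`, `site2`, `site3`) and the helper functions are assumptions, not part of the slides. It shows which sites must receive an update when a replicated database changes.

```python
# Hypothetical placement: which databases each site replicates.
placement = {
    "site1": {"DB-1", "DB-2", "DB-3"},   # full replica
    "site2": {"DB-1", "DB-2"},           # partial replica
    "site3": {"DB-2", "DB-3"},           # partial replica
}

def sites_holding(db):
    """Return the sorted list of sites that keep a copy of `db`."""
    return sorted(site for site, dbs in placement.items() if db in dbs)

def sites_to_update(db, origin):
    """Sites that must receive an update made to `db` at `origin`."""
    return [s for s in sites_holding(db) if s != origin]

print(sites_holding("DB-1"))
print(sites_to_update("DB-2", "site2"))
```

The more sites hold a copy, the faster local reads become, but every update has to reach more sites.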

How to manage replicated DBs?

• Issue 1 – Consistency
  • If an update occurs at one site, how do we maintain the integrity of the global database?

• Issue 2 – How to duplicate
  • All or only some parts
  • Factors to consider

Replication – Management of Consistency

• Snapshot replication
  • Store all update logs at a central site from a given time
  • Periodically send the relevant logs to local sites
  • Each local site applies the update logs to its local DB

• Near Real-Time replication
  • When an update occurs, it triggers updates at other sites

• Pull Replication
  • Instead of a push protocol, each local site requests update logs when necessary
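The difference between periodic push (snapshot) and pull replication can be sketched with an in-memory update log. This is a minimal sketch under stated assumptions: the classes `CentralLog` and `LocalSite`, the sequence numbers, and the function `push_snapshot` are all hypothetical names chosen for illustration, and a real system would also have to handle failures and concurrent updates.

```python
class CentralLog:
    """Central site: stores every update as (sequence number, key, value)."""
    def __init__(self):
        self.entries = []

    def record(self, key, value):
        self.entries.append((len(self.entries) + 1, key, value))

    def since(self, seq):
        """Return the log entries with a sequence number greater than `seq`."""
        return [e for e in self.entries if e[0] > seq]


class LocalSite:
    """Local site: keeps a local copy and remembers the last applied entry."""
    def __init__(self):
        self.data = {}
        self.last_seq = 0

    def apply(self, entries):
        for seq, key, value in entries:
            self.data[key] = value
            self.last_seq = seq

    def pull(self, log):
        """Pull replication: the site asks for the logs it is missing."""
        self.apply(log.since(self.last_seq))


def push_snapshot(log, sites):
    """Snapshot replication: the central site sends pending logs to the sites."""
    for site in sites:
        site.apply(log.since(site.last_seq))


log = CentralLog()
a, b = LocalSite(), LocalSite()
log.record("x", 1)
log.record("y", 2)
push_snapshot(log, [a])      # periodic push reaches site a only
log.record("x", 3)
b.pull(log)                  # site b pulls on demand
print(a.data, b.data)
```

After the last step, site `a` still holds the older value of `x` until the next push or pull, which is the consistency window these schemes trade for lower communication cost.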

Replication – Management of Consistency

• Exclusive ownership vs. Shared ownership
• Single update vs. Multiple update
• Synchronous updates vs. Asynchronous updates
• Simple snapshot vs. Multiple snapshot

Replication – How to replicate

• Fast Response
• Communication Overhead
• Security
• Query Optimization

Partitioning – Horizontal Partitioning

Split a table into several subtables

Partitioning – Horizontal Partitioning

• How to split a table?
  • Efficiency
  • Local Optimization
  • Communication Overhead
  • Security

• Dynamic reconfiguration of Partitioning
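Horizontal partitioning can be sketched as splitting rows by the value of one attribute. The `customers` table, its columns, and the helper names below are assumptions made for illustration. The sketch also shows that the original table is recovered as the union of the fragments.

```python
# Hypothetical table: each row is a dict; "region" is the partitioning attribute.
customers = [
    {"id": 1, "name": "Kim", "region": "Busan"},
    {"id": 2, "name": "Lee", "region": "Seoul"},
    {"id": 3, "name": "Park", "region": "Busan"},
]

def horizontal_partition(rows, key):
    """Split rows into subtables, one per distinct value of `key`."""
    fragments = {}
    for row in rows:
        fragments.setdefault(row[key], []).append(row)
    return fragments

def reconstruct(fragments):
    """The original table is the union of its horizontal fragments."""
    rows = [row for frag in fragments.values() for row in frag]
    return sorted(rows, key=lambda r: r["id"])

fragments = horizontal_partition(customers, "region")
print(len(fragments["Busan"]), len(fragments["Seoul"]))
```

Choosing the partitioning attribute so that most queries touch a single fragment keeps work local and reduces communication overhead.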

Partitioning – Vertical Partitioning

Split a DB into several disjoint tables

Shared Primary Keys – Join operations are inevitable
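Vertical partitioning can be sketched as splitting a table by columns while repeating the primary key in every fragment. The `employees` table and the helper names are hypothetical. The sketch shows why a join on the shared key is needed to rebuild a full row.

```python
# Hypothetical table with primary key "id".
employees = [
    {"id": 1, "name": "Kim", "dept": "R&D", "salary": 5000},
    {"id": 2, "name": "Lee", "dept": "Sales", "salary": 4200},
]

def vertical_partition(rows, pk, column_groups):
    """Split rows by columns; every fragment keeps the primary key `pk`."""
    return [
        [{c: row[c] for c in [pk] + cols} for row in rows]
        for cols in column_groups
    ]

def join_on_pk(fragments, pk):
    """Rebuild full rows by joining the fragments on the shared primary key."""
    merged = {}
    for frag in fragments:
        for row in frag:
            merged.setdefault(row[pk], {}).update(row)
    return [merged[k] for k in sorted(merged)]

public, private = vertical_partition(employees, "id", [["name", "dept"], ["salary"]])
print(private)
```

Any query that needs columns from both fragments pays for the join, possibly across sites.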

Comparison between Replication and Partitioning

Distributed DBMS

• What a distributed DBMS should do
  ⁻ Management of Data Dictionary
  ⁻ Resolving Heterogeneity: Schema, query language (QL), DBMS
  ⁻ Keeping distributed DBs secure and consistent: transaction management (TM)
  ⁻ Transparency: single logical view to user
  ⁻ Dynamic load balancing
  ⁻ Query processing (Optimization)
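How a data dictionary supports transparency can be sketched as follows. Everything here is an assumption for illustration: the site names, the `customers` table, and the `query` function are hypothetical, and a real distributed DBMS would also optimize which sites to contact. The user names only the logical table; the dictionary decides where the fragments live.

```python
# Hypothetical local databases: site name -> rows stored at that site.
local_dbs = {
    "site1": [{"id": 1, "city": "Busan"}, {"id": 3, "city": "Busan"}],
    "site2": [{"id": 2, "city": "Seoul"}],
}

# Hypothetical data dictionary: logical table -> sites holding its fragments.
data_dictionary = {"customers": ["site1", "site2"]}

def query(table, predicate):
    """Run `predicate` against every fragment and merge the results."""
    result = []
    for site in data_dictionary[table]:
        result.extend(row for row in local_dbs[site] if predicate(row))
    return sorted(result, key=lambda r: r["id"])

print(query("customers", lambda r: r["id"] >= 2))
```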
