1---DistributedDBDesign-Basic Concepts


  • 8/7/2019 1---DistributedDBDesign-Basic Concepts


    Distributed Database Design

    BASICS


    Introduction

    Distributed database (definition):

    A logically interrelated collection of shared data, physically distributed over a computer network.

    Distributed DBMS (definition):

    The software system that permits the management of the distributed database and makes the distribution transparent to users.


    Distributed Systems

    Data is spread over multiple machines (also referred to as sites or nodes).

    A network interconnects the machines.

    Data is shared by users on multiple machines.


    Advantages

    Organisational structure - many organisations cover several sites

    Shareability - users at different sites can share data

    Local autonomy - each site is able to retain a degree of control over data stored locally

    Improved availability - a node failure will not make the system inoperable

    Improved reliability - replicated data keeps data accessible

    Improved performance - data is located near the site that uses it

    Modular growth - easier expansion


    Disadvantages

    Complexity - more complex than a centralised system

    Cost - added network and maintenance costs

    Security - the network must be made secure

    Integrity control - more difficult to ensure proper coordination among sites

    Lack of standards and experience - no established tools or methodologies

    Complex design - database design is more complex


    Classification of DDBMS: Homogeneous DDBMS

    Same software/schema on all sites; data may be partitioned among sites.

    Goal: provide a view of a single database, hiding details of distribution.

    Characteristics:

        All sites use the same DBMS product.

        Much easier to design and manage.

        Approach provides incremental growth and allows increased performance.


    Heterogeneous DDBMS

    Different software/schema on different sites.

    Goal: integrate existing databases to provide useful functionality.

    Characteristics:

        Sites may run different DBMS products, with possibly different underlying data models.

        Occurs when sites have implemented their own databases and integration is considered later.

        Translations are required to allow for:

            Different hardware.

            Different DBMS products.

            Different hardware and different DBMS products.

        Typical solution is to use gateways.


    Classification (contd.)

    Examples of typical applications:

    Type of DBMS     LAN network                                   WAN network
    Homogeneous      Data management and financial applications    Travel management and financial applications
    Heterogeneous    Inter-divisional information systems          Integrated banking and inter-banking systems


    Design Issues with DDBMS

    In designing a distributed database, the same issues are faced as for a centralized database, plus:

    Fragmentation: a relation may be divided into a number of sub-relations, which are then distributed.

    Allocation: each fragment is stored at the site that gives an "optimal" distribution.

    Replication: a copy of a fragment may be maintained at several sites.
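
    The three design steps can be illustrated with a minimal Python sketch. It assumes horizontal fragmentation by an attribute value and a hand-written placement; the relation, the site names and the helper functions are hypothetical, and no real "optimal" allocation algorithm is attempted.

```python
from collections import defaultdict

# Hypothetical relation: rows of an "employee" table (illustrative data only).
employees = [
    {"id": 1, "name": "Ann", "branch": "London"},
    {"id": 2, "name": "Bob", "branch": "Glasgow"},
    {"id": 3, "name": "Cat", "branch": "London"},
]


def fragment_horizontally(rows, key):
    """Fragmentation: split a relation into one sub-relation per value of `key`."""
    fragments = defaultdict(list)
    for row in rows:
        fragments[row[key]].append(row)
    return dict(fragments)


def allocate(fragments, replicas):
    """Allocation and replication: place each fragment at its home site
    (assumed here to be named after the fragment) plus any replica sites.

    `replicas` maps a fragment name to extra sites that hold a copy.
    """
    sites = defaultdict(dict)
    for name, rows in fragments.items():
        for site in [name, *replicas.get(name, [])]:
            sites[site][name] = list(rows)  # each site stores its own copy
    return dict(sites)


fragments = fragment_horizontally(employees, "branch")
placement = allocate(fragments, replicas={"London": ["Glasgow"]})
```

    Here the Glasgow site ends up holding its own fragment and a replica of the London fragment, while the London site holds only its own rows.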


    Functions of DDBMS

    Functions of a centralised DBMS plus:

    Extended communication services to allow the transfer of queries and data among sites

    Extended system catalog to store data distribution details

    Distributed query processing, including query optimisation

    Extended concurrency control to maintain consistency of replicated data

    Extended recovery services to take account of failures of individual sites and communication links


    Components of a DDBMS

    Component architecture:

    Local DBMS (LDBMS) - responsible for local data

    Transaction, buffer and recovery managers, and scheduler

    Data communications (DC) component - allows all sites to communicate with each other

    Global system catalog (GSC) - catalog information on the fragmentation and allocation schemas

    Distributed DBMS (DDBMS) - controlling unit of the entire system
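
    As a rough illustration of what the global system catalog is for, the sketch below models it as a plain dictionary from fragment name to the sites holding a copy. The fragment and site names and the `choose_site` helper are assumptions made for this example, not part of any particular DDBMS.

```python
# Hypothetical global system catalog: fragment name -> sites holding a copy.
GLOBAL_CATALOG = {
    "employee_london": ["site_a", "site_b"],
    "employee_glasgow": ["site_b"],
}


def choose_site(fragment, local_site, catalog=GLOBAL_CATALOG):
    """Pick a site to read `fragment` from, preferring a local copy."""
    sites = catalog.get(fragment)
    if not sites:
        raise KeyError(f"fragment {fragment!r} is not in the catalog")
    # Use the local copy when one exists; otherwise fall back to the first listed site.
    return local_site if local_site in sites else sites[0]
```

    A query issued at `site_a` for `employee_glasgow` would be routed to `site_b`, which is where the data communications component comes in.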


    Reference Architecture for DDBMS

    Due to diversity, there is no accepted architecture equivalent to the ANSI/SPARC 3-level architecture.

    A reference architecture consists of:

        Set of global external schemas (GES).

        Global conceptual schema (GCS).

        Fragmentation schema.

        Allocation schema.

        Set of schemas for each local DBMS, conforming to the 3-level ANSI/SPARC architecture.

    Some levels may be missing, depending on the levels of transparency supported.


    Local and Global Transactions

    A local transaction accesses data only at the single site at which the transaction was initiated.

    A global transaction either accesses data at a site different from the one at which the transaction was initiated, or accesses data at several different sites.
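
    The distinction can be written down as a small predicate. This is only a sketch of the definition above; the function name and the idea of passing the accessed sites as a list are assumptions for illustration.

```python
def classify_transaction(origin_site, accessed_sites):
    """Classify a transaction as 'local' or 'global'.

    Local: every access is at the site where the transaction was initiated.
    Global: it touches a different site, or several different sites.
    """
    touched = set(accessed_sites)
    return "local" if touched <= {origin_site} else "global"
```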


    Implementation Issues for Distributed Databases

    Atomicity is needed even for transactions that update data at multiple sites.

    The two-phase commit protocol (2PC) is used to ensure atomicity.

        Basic idea: each site executes the transaction until just before commit, and then leaves the final decision to a coordinator.

        Each site must follow the decision of the coordinator, even if there is a failure while waiting for the coordinator's decision.

    Distributed concurrency control (and deadlock detection) is required.

    Data items may be replicated to improve data availability.
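
    The basic idea of 2PC can be sketched in a few lines of Python. This is a simplified in-memory model under stated assumptions: the `Participant` class and its methods are hypothetical, and the logging, timeouts and failure recovery that a real protocol needs are left out.

```python
class Participant:
    """One site taking part in a transaction (hypothetical, in-memory only)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "active"

    def prepare(self):
        # Phase 1: do the work up to just before commit, then vote yes or no.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def finish(self, decision):
        # Phase 2: every site follows the coordinator's decision.
        self.state = decision


def two_phase_commit(participants):
    """Coordinator: commit only if every participant votes yes."""
    votes = [p.prepare() for p in participants]  # collect all votes first
    decision = "committed" if all(votes) else "aborted"
    for p in participants:
        p.finish(decision)
    return decision
```

    A single "no" vote is enough to abort the transaction at every site, which is what gives the all-or-nothing behaviour.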


    DDBMS Network Types

    Local-area networks (LANs) - composed of processors that are distributed over small geographical areas, such as a single building or a few adjacent buildings.

    Wide-area networks (WANs) - composed of processors distributed over a large geographical area.


    Network Types (contd.)

    WANs with continuous connection (e.g. the Internet) are needed for implementing distributed database systems.

    Groupware applications such as Lotus Notes can work on WANs with discontinuous connection:

        Data is replicated.

        Updates are propagated to replicas periodically.

        Copies of data may be updated independently.

        Non-serializable executions can thus result.

        Resolution is application dependent.
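
    The effect of independently updated copies can be seen in a small sketch. It assumes two replicas held as dictionaries and an application-supplied `resolve` function; the names and the "keep the lowest figure" rule are hypothetical and only stand in for whatever policy an application chooses.

```python
def reconcile(replicas, key, resolve):
    """Bring all replicas of `key` to one value during periodic propagation.

    `resolve` is an application-supplied function that picks the winner
    when copies were updated independently and now disagree.
    """
    values = [replica[key] for replica in replicas]
    winner = values[0] if len(set(values)) == 1 else resolve(values)
    for replica in replicas:
        replica[key] = winner
    return winner


# Two disconnected copies of the same item, each updated on its own.
office = {"stock": 10}
laptop = {"stock": 10}
office["stock"] -= 3  # sale recorded at the office
laptop["stock"] -= 4  # sale recorded on the disconnected laptop

# Hypothetical application rule: keep the lowest stock figure.
final = reconcile([office, laptop], "stock", resolve=min)
```

    Running the two sales one after the other against a single copy would leave 3 in stock, whereas the replicas hold 7 and 6 before reconciliation and settle on 6 afterwards. No serial order produces that result, which is why the resolution rule has to come from the application.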


    Other Categories

    Open Database Access and Interoperability

    The Open Group has formed a Working Group to provide specifications that will create a database infrastructure environment where there is:

        A common SQL API that allows client applications to be written that do not need to know the vendor of the DBMS they are accessing.

        A common database protocol that enables a DBMS from one vendor to communicate directly with a DBMS from another vendor without the need for a gateway.

        A common network protocol that allows communications between different DBMSs.

    The most ambitious goal is to find a way to enable a transaction to span DBMSs from different vendors without the use of a gateway.
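
    As a loose analogy for the "common SQL API" goal (not the Open Group specification itself), Python's DB-API 2.0 lets client code be written against a generic connection and cursor interface. The sketch below uses the standard-library `sqlite3` driver; the `count_rows` helper and the `staff` table are made up for illustration, and the `?` parameter marker is specific to drivers that use the qmark style.

```python
import sqlite3


def count_rows(connection, table):
    """Count rows using only the generic DB-API cursor interface."""
    cursor = connection.cursor()
    # The table name is interpolated directly, so it must come from trusted code.
    cursor.execute(f"SELECT COUNT(*) FROM {table}")
    return cursor.fetchone()[0]


# In-memory SQLite database standing in for "some vendor's DBMS".
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE staff (id INTEGER PRIMARY KEY, name TEXT)")
cur.executemany("INSERT INTO staff (name) VALUES (?)", [("Ann",), ("Bob",)])
conn.commit()

total = count_rows(conn, "staff")
```

    `count_rows` never names the vendor; in principle the same function could be handed a connection from another DB-API driver, although SQL dialect differences still leak through in practice.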


    Other Categories (contd.)

    MultiDatabase System (MDBS)

    A DDBMS in which each site maintains complete autonomy.

    A DBMS that resides transparently on top of existing database and file systems and presents a single database to its users.

    Allows users to access and share data without requiring physical database integration.

    Two categories: unfederated MDBS (no local users) and federated MDBS.