Introduction Deﬁnition Distributed System: Deﬁnition · 2017. 5. 23. · Partition data and computations across multiple machines: Move computations to clients (Java applets)

Introduction Definition

Distributed System: Definition

A distributed system is a piece of software that ensures that:

a collection of independent computers appears to its users asa single coherent system

Two aspects: (1) independent computers and(2) single system⇒ middleware.

Local OS 1 Local OS 2 Local OS 3 Local OS 4

Appl. A Application B Appl. C

Computer 1 Computer 2 Computer 4Computer 3

Network

Distributed system layer (middleware)

2 / 25

Introduction Goals

Scale in Distributed Systems

ObservationMany developers of modern distributed system easily use the adjective“scalable” without making clear why their system actually scales.

Scalability

At least three components:

Number of users and/or processes (size scalability)Maximum distance between nodes (geographical scalability)Number of administrative domains (administrative scalability)

ObservationMost systems account only, to a certain extent, for size scalability. The(non)solution: powerful servers. Today, the challenge lies in geographical andadministrative scalability.

8 / 25

Introduction Goals



Scalability




8 / 25

Introduction Goals



Scalability




8 / 25

Introduction Goals

Techniques for Scaling

DistributionPartition data and computations across multiple machines:

Move computations to clients (Java applets)Decentralized naming services (DNS)Decentralized information systems (WWW)

10 / 25

Introduction Goals

Techniques for Scaling

Replication/cachingMake copies of data available at different machines:

Replicated file servers and databasesMirrored Web sitesWeb caches (in browsers and proxies)File caching (at server and client)

11 / 25

Introduction Goals

Scaling – The Problem

ObservationApplying scaling techniques is easy, except for one thing:

Having multiple copies (cached or replicated), leads toinconsistencies: modifying one copy makes that copy differentfrom the rest.Always keeping copies consistent and in a general way requiresglobal synchronization on each modification.Global synchronization precludes large-scale solutions.

ObservationIf we can tolerate inconsistencies, we may reduce the need for globalsynchronization, but tolerating inconsistencies is applicationdependent.

12 / 25

Introduction Goals





12 / 25

Introduction Goals





12 / 25

Introduction Goals





12 / 25

Introduction Goals





12 / 25

Architectures Architectural styles

Architectural styles

Basic ideaOrganize into logically different components, and distribute thosecomponents over the various machines.

Layer N

Layer N-1

Layer 1

Layer 2

Request flow

Response flow

(a) (b)

Object

Object

Object

Object

Object

Method call

(a) Layered style is used for client-server system(b) Object-based style for distributed object systems.

2 / 25

Architectures System Architectures

Centralized Architectures

Basic Client–Server ModelCharacteristics:

There are processes offering services (servers)There are processes that use services (clients)Clients and servers can be on different machinesClients follow request/reply model wrt to using services

Client

Request Reply

ServerProvide service Time

Wait for result

4 / 25


Application Layering

Traditional three-layered viewUser-interface layer contains units for an application’s userinterfaceProcessing layer contains the functions of an application, i.e.without specific dataData layer contains the data that a client wants to manipulatethrough the application components

ObservationThis layering is found in many distributed information systems, usingtraditional database technology and accompanying applications.

5 / 25



Traditional three-layered viewUser-interface layer contains units for an application’s userinterfaceProcessing layer contains the functions of an application, i.e.without specific dataData layer contains the data that a client wants to manipulatethrough the application components

ObservationThis layering is found in many distributed information systems, usingtraditional database technology and accompanying applications.

5 / 25



Databasewith Web pages

Querygenerator

Rankingalgorithm

HTMLgenerator

User interface

Keyword expression

Database queries

Web page titleswith meta-information

Ranked listof page titles

HTML pagecontaining list

Processinglevel

User-interfacelevel

Data level

6 / 25


Multi-Tiered Architectures

Single-tiered: dumb terminal/mainframe configurationTwo-tiered: client/single server configurationThree-tiered: each layer on separate machine

Traditional two-tiered configurations:

User interface User interface User interface

Application

User interface

Application

User interface

Application

Database

ApplicationApplication Application

Database Database Database Database Database

User interface

(a) (b) (c) (d) (e)

Client machine

Server machine

7 / 25

Processes Virtualizaton

Virtualization

ObservationVirtualization is becoming increasingly important:

Hardware changes faster than softwareEase of portability and code migrationIsolation of failing or attacked components

Hardware/software system A

Interface A

Program

Interface A

Program

Implementation of mimicking A on B

Hardware/software system B

Interface B

(a) (b)

9 / 33


Architecture of VMs

ObservationVirtualization can take place at very different levels, strongly dependingon the interfaces as offered by various systems components:

Privileged instructions

System calls

Library functions

General instructions

Hardware

Operating system

Library

Application

10 / 33


Process VMs versus VM Monitors

Runtime systemRuntime system

Hardware

Operating system

Hardware

Operating systemOperating system

Operating system

Applications

Virtual machine monitor

(a) (b)

Runtime system

Application

Process VM: A program is compiled to intermediate (portable)code, which is then executed by a runtime system (Example: JavaVM).VM Monitor: A separate software layer mimics the instruction setof hardware⇒ a complete operating system and its applicationscan be supported (Example: VMware, VirtualBox).

11 / 33


Hybrid Architectures: Client-server combined with P2P

ExampleEdge-server architectures, which are often used for Content DeliveryNetworks

Edge server

Core Internet

Enterprise network

ISPISP

Client Content provider

17 / 25

Distributed Web-Based Systems Architecture

Distributed Web-based systems

EssenceThe WWW is a huge client-server system with millions of servers; eachserver hosting thousands of hyperlinked documents.

Documents are often represented in text (plain text, HTML, XML)Alternative types: images, audio, video, applications (PDF, PS)Documents may contain scripts, executed by client-side software

Client machine

Browser

OS

Server machine

Web server

1. Get document request (HTTP)

3. Response

2. Server fetchesdocument fromlocal file

1 / 71

Distributed Web-Based Systems Architecture

Multi-tiered architectures

ObservationWeb sites were soon organised into three tiers.

Web server Database serverCGI process

CGI program

1. Get request

3. Start process to fetch document

5. HTML document created

HTTP request handler6. Return result

4. Database interaction

2 / 71

Distributed Web-Based Systems Processes

Clients: Web browsers

ObservationBrowsers form the Web’s most important client-side sofware. Theyused to be simple, but that is long ago.

User interface

Browser engine

Rendering engine

Network comm.

HTML/XML parser

Display back end

Client-side script

interpreter

3 / 71


Apache Web server

Observation: More than 52% of all 185 million Web sites are Apache.

The server is internally organised more or less according to the steps neededto process an HTTP request.

Hook Hook Hook Hook

Function

... ... ...

Module Module Module

Apache coreFunctions called per hook

Link between function and hook

Request Response4 / 71

Processes Threads

Threads and Distributed Systems

Multithreaded Web clientHiding network latencies:

Web browser scans an incoming HTML page, and finds that more filesneed to be fetched.Each file is fetched by a separate thread, each doing a (blocking) HTTPrequest.As files come in, the browser displays them.

Multiple request-response calls to other machines (RPC)

A client does several calls at the same time, each one by a differentthread.It then waits until all results have been returned.Note: if calls are to different servers, we may have a linear speed-up.

6 / 33

Processes Threads


Multithreaded ServersAlthough there are important benefits to multithreaded clients, the mainbenefits of multithreading in distributed systems are on the server side.

7 / 33

Processes Threads


Improve performance

Starting a thread is much cheaper than starting a new process.Having a single-threaded server prohibits simple scale-up to amultiprocessor system.As with clients: hide network latency by reacting to next request whileprevious one is being replied.

Better structure

Most servers have high I/O demands. Using simple, well-understoodblocking calls simplifies the overall structure.Multithreaded programs tend to be smaller and easier to understand dueto simplified flow of control.

8 / 33


Server clusters

EssenceTo improve performance and availability, WWW servers are often clustered ina way that is transparent to clients.

Frontend

Webserver

Webserver

Webserver

Webserver

Request Response

Front end handlesall incoming requestsand outgoing responses

LAN

5 / 71

Processes Servers

Server clusters: three different tiers

Logical switch (possibly multiple)

Application/compute servers Distributed file/database

system

Client requests

Dispatched request

First tier Second tier Third tier

Crucial elementThe first tier is generally responsible for passing requests to anappropriate server.

22 / 33


Server clusters

ProblemThe front end may easily get overloaded, so that special measuresneed to be taken.

Transport-layer switching: Front end simply passes the TCPrequest to one of the servers, taking some performance metricinto account.Content-aware distribution: Front end reads the content of theHTTP request and then selects the best server.

6 / 71

Processes Servers

Request Handling

ObservationHaving the first tier handle all communication from/to the cluster maylead to a bottleneck.

SolutionVarious, but one popular one is TCP-handoff

Switch Client

Server

Server

RequestRequest

(handed off)

ResponseLogically asingle TCPconnection

23 / 33


Server Clusters

QuestionWhy can content-aware distribution be so much better?

SwitchClient

Webserver

Webserver

Distributor

Distributor

Dis-patcher

1. Pass setup requestto a distributor

2. Dispatcher selectsserver

3. Hand offTCP connection

4. InformswitchSetup request

Other messages

5. Forwardothermessages

6. Server responses

7 / 71

Naming Naming Entities

Naming Entities

Names, identifiers, and addressesName resolutionName space implementation

1 / 37


Naming

EssenceNames are used to denote entities in a distributed system. To operateon an entity, we need to access it at an access point. Access pointsare entities that are named by means of an address.

NoteA location-independent name for an entity E , is independent from theaddresses of the access points offered by E .

2 / 37


Identifiers

Pure nameA name that has no meaning at all; it is just a random string. Purenames can be used for comparison only.

IdentifierA name having the following properties:

P1: Each identifier refers to at most one entityP2: Each entity is referred to by at most one identifierP3: An identifier always refers to the same entity (prohibits reusingan identifier)

ObservationAn identifier need not necessarily be a pure name, i.e., it may havecontent.

3 / 37

Naming Flat Naming

Hierarchical Location Services (HLS)

Basic ideaBuild a large-scale search tree for which the underlying network isdivided into hierarchical domains. Each domain is represented by aseparate directory node.

A leaf domain, contained in S

Directory nodedir(S) of domain S

A subdomain Sof top-level domain T(S is contained in T)

Top-leveldomain T

The root directorynode dir(T)

13 / 37

Naming Structured Naming

Name resolution

ProblemTo resolve a name we need a directory node. How do we actually find that(initial) node?

Closure mechanismThe mechanism to select the implicit context from which to start nameresolution:

www.cs.vu.nl: start at a DNS name server/home/steen/mbox: start at the local NFS file server (possible recursivesearch)0031204447784: dial a phone number130.37.24.8: route to the VU’s Web server

QuestionWhy are closure mechanisms always implicit?

19 / 37


Name resolution





19 / 37


Name resolution





19 / 37


Name linking

Hard linkWhat we have described so far as a path name: a name that isresolved by following a specific path in a naming graph from one nodeto another.

20 / 37


Name-space implementation

Basic issueDistribute the name resolution process as well as name space managementacross multiple machines, by distributing nodes of the naming graph.

Distinguish three levels

Global level: Consists of the high-level directory nodes. Main aspect isthat these directory nodes have to be jointly managed by differentadministrationsAdministrational level: Contains mid-level directory nodes that can begrouped in such a way that each group can be assigned to a separateadministration.Managerial level: Consists of low-level directory nodes within a singleadministration. Main issue is effectively mapping directory nodes to localname servers.

23 / 37






23 / 37






23 / 37






23 / 37






23 / 37



org netjp us

nl

sun

eng

yale

eng

ai linda

robot

acm

jack jill

ieee

keio

cs

cs

pc24

co

nec

csl

oce vu

cs

ftp www

ac

com edugov mil

pub

globe

index.txt

Mana-geriallayer

Adminis-trational

layer

Globallayer

Zone

24 / 37



Item Global Administrational Managerial1 Worldwide Organization Department2 Few Many Vast numbers3 Seconds Milliseconds Immediate4 Lazy Immediate Immediate5 Many None or few None6 Yes Yes Sometimes

1: Geographical scale 4: Update propagation2: # Nodes 5: # Replicas3: Responsiveness 6: Client-side caching?

25 / 37


Replication of records

DefinitionReplicated at level i – record is replicated to all nodes with i matchingprefixes. Note: # hops for looking up record at level i is generally i .

ObservationLet xi denote the fraction of most popular DNS names of which therecords should be replicated at level i , then:

xi =

[d i(logN−C)

1+d + · · ·+d logN−1

]1/(1−α)

with N is the total number of nodes, d = b(1−α)/α and α ≈ 1, assumingthat popularity follows a Zipf distribution:The frequency of the n-th ranked item is proportional to 1/nα

34 / 37


Replication of records

MeaningIf you want to reach an average of C = 1 hops when looking up a DNSrecord, then with b = 4, α = 0.9, N = 10,000 and 1,000,000 recordsthat

61 most popular records should bereplicated at level 0

284 next most popular records at level 11323 next most popular records at level 26177 next most popular records at level 3

28826 next most popular records at level 4134505 next most popular records at level 5the rest should not be replicated

35 / 37

Consistency & Replication Client-Centric Consistency Models

Consistency for mobile users

ExampleConsider a distributed database to which you have access throughyour notebook. Assume your notebook acts as a front end to thedatabase.

At location A you access the database doing reads and updates.At location B you continue your work, but unless you access thesame server as the one at location A, you may detectinconsistencies:

your updates at A may not have yet been propagated to Byou may be reading newer entries than the ones available at Ayour updates at B may eventually conflict with those at A

12 / 40


Consistency for mobile users

NoteThe only thing you really want is that the entries you updated and/orread at A, are in B the way you left them in A. In that case, thedatabase will appear to be consistent to you.

13 / 40


Basic architecture

Read and write operations

Client moves to other locationand (transparently) connects toother replica

Wide-area network

Replicas need to maintainclient-centric consistency

Portable computer

Distributed and replicated database

14 / 40

Consistency & Replication Replica Management

Distribution protocols

Replica server placementContent replication and placementContent distribution

22 / 40


Replica placement

EssenceFigure out what the best K places are out of N possible locations.

Select best location out of N−K for which the average distance toclients is minimal. Then choose the next best server. (Note: Thefirst chosen location minimizes the average distance to all clients.)Computationally expensive.Select the K -th largest autonomous system and place a server atthe best-connected host. Computationally expensive.Position nodes in a d-dimensional geometric space, wheredistance reflects latency. Identify the K regions with highestdensity and place a server in every one. Computationally cheap.

23 / 40


Replica placement



23 / 40


Replica placement



23 / 40


Replica placement



23 / 40


Content replication

Distinguish different processesA process is capable of hosting a replica of an object or data:

Permanent replicas: Process/machine always having a replicaServer-initiated replica: Process that can dynamically host areplica on request of another server in the data storeClient-initiated replica: Process that can dynamically host areplica on request of a client (client cache)

24 / 40


Content replication

Permanentreplicas

Server-initiated replicas

Client-initiated replicas

Clients

Client-initiated replicationServer-initiated replication

25 / 40


Server-initiated replicas

Server withoutcopy of file F

Client Server withcopy of F

PQ

C1

C2

Server Q counts access from C andC as if they would come from P

12

File F

Keep track of access counts per file, aggregated by consideringserver closest to requesting clientsNumber of accesses drops below threshold D ⇒ drop fileNumber of accesses exceeds threshold R ⇒ replicate fileNumber of access between D and R ⇒ migrate file

26 / 40


Content distribution

ModelConsider only a client-server combination:

Propagate only notification/invalidation of update (often used forcaches)Transfer data from one copy to another (distributed databases)Propagate the update operation to other copies (also called activereplication)

NoteNo single approach is the best, but depends highly on availablebandwidth and read-to-write ratio at replicas.

27 / 40