Module 1 - Distributed System Architectures & Models
CS454/654 1-2
System Architecture
n Defines the structure of the system
l components identified
l functions of each component defined
l interrelationships and interactions between components defined
CS454/654 1-3
StandardizationReference Model
l A conceptual framework whose purpose is to divide standardization work into manageable pieces and to show at a general level how these pieces are related to one another.
Approachesl Component-based
à Components of the system are defined together with the interrelationships between components.
à Good for design and implementation of the system.l Function-based
à Classes of users are identified together with the functionality that the system will provide for each class.
à The objectives of the system are clearly identified. But how do you achieve these objectives?
à Example: ISO/OSI Model
CS454/654 1-4
Software Layers
Platform
From Tanenbaum and van Steen, Distributed Systems: Principles and Paradigms© Prentice-Hall, Inc. 2002
CS454/654 1-5
Layersn Platform
l Fundamental communication and resource management servicesl We won’t be worried about these
n Middlewarel Provides a service layer that hides the details and heterogeneity of the
underlying platforml Provides an “easier” API for the applications and servicesl Can be as simple as RPC or as complex as OMA
à RPC (Remote Procedure Call): simple procedure call across remote machine boundaries
à OMA (Object Management Architecture) : an object-oriented platform for building distributed applications
n Applicationsl Distributed applications, servicesl Examples: e-mail, ftp, etc
CS454/654 1-6
Hardware Organization
From Tanenbaum and van Steen, Distributed Systems: Principles and Paradigms© Prentice-Hall, Inc. 2002
CS454/654 1-7
Software Organization
From Tanenbaum and van Steen, Distributed Systems: Principles and Paradigms© Prentice-Hall, Inc. 2002
Provide distribution transparency
Additional layer atop of NOS implementing general-purpose services
Middleware
Offer local services to remote clients
Loosely-coupled operating system for heterogeneous multicomputers (LAN and WAN)
NOS
Hide and manage hardware resources
Tightly-coupled operating system for multi-processors and homogeneous multicomputers
DOS
Main GoalDescriptionSystem
CS454/654 1-8
System Architectures
n Client-serverl Multiple-client/single-serverl Multiple-client/multiple-serversl Mobile clientsl Thin clients
n Multitier systemsn Peer-to-peer systems
CS454/654 1-9
Multiple-Client/Single Server
Communications
Server Services
Network
Communications
ClientServices
Applications
Communications
ClientServices
Applications
Data
n User interfacen Programmatic interfacen other application
support environments
CS454/654 1-10
Advantages of Client/Server Computing
n More efficient division of labor
n Horizontal and vertical scaling of resources
n Better price/performance on client machines
n Ability to use familiar tools on client machines
n Client access to remote data (via standards)
n Full DBMS functionality provided to client workstations
n Overall better system price/performance
CS454/654 1-11
Client-Server Communication
Client
Client
Server
Process Computer
Request (invocation)
Result
Request (invocation)
Result
CS454/654 1-12
Client-Server Timingn General interaction between a client and a server.
From Tanenbaum and van Steen, Distributed Systems: Principles and Paradigms© Prentice-Hall, Inc. 2002
CS454/654 1-13
An Example Client and Server (1)n The header.h file used by the client and server.
From Tanenbaum and van Steen, Distributed Systems: Principles and Paradigms© Prentice-Hall, Inc. 2002
CS454/654 1-14
An Example Client and Server (2)n A sample server.
From Tanenbaum and van Steen, Distributed Systems: Principles and Paradigms© Prentice-Hall, Inc. 2002
CS454/654 1-15
An Example Client and Server (3)n A client using the server to copy a file.
1-27 b
From Tanenbaum and van Steen, Distributed Systems: Principles and Paradigms© Prentice-Hall, Inc. 2002
CS454/654 1-16
Problems With Multiple-Client/Single Server
n Server forms bottleneck n Server forms single point of failuren System scaling difficult
CS454/654 1-17
Multiple Clients/Multiple Servers
Communications
Server Services
Network
Communications
Server Services
Communications
ClientServices
Applications
Data Data
CS454/654 1-18
Multiple-Client/Multiple-Server Communication
Server
Client
Client
invocation
result
Serverinvocation
result
From Coulouris, Dollimore and Kindberg, Distributed Systems: Concepts and Design, 3rd ed. © Addison-Wesley Publishers 2000
CS454/654 1-19
Service Across Multiple Servers
Server
Server
Server
Service
Client
Client
From Coulouris, Dollimore and Kindberg, Distributed Systems: Concepts and Design, 3rd ed. © Addison-Wesley Publishers 2000
CS454/654 1-20
Laptop
PDA
Printer Camera
Internet
Host intranet Home intranetWAP Wireless LAN Gateway
Host site
Mobile Computing
Database
CS454/654 1-21
Example Mobile Computing Environment
Internet
gateway
PDA
service
Music service
serviceDiscovery
Alarm
Camera
Guestsdevices
LaptopTV/PC
Hotel wirelessnetwork
These types of environments are commonly called “spontaneous systems” or “pervasive computing”
From Coulouris, Dollimore and Kindberg, Distributed Systems: Concepts and Design, 3rd ed. © Addison-Wesley Publishers 2000
CS454/654 1-22
ThinClient
ApplicationProcess
Network computer or PCCompute server
network
“Thin” Clientsn Thin Clients and Compute Servers
l Executing graphical user interface on local computer while application executes on compute server
l Example: X11 server (run on the application client side)l In reality: Palm Pilots, Mobile phones
From Coulouris, Dollimore and Kindberg, Distributed Systems: Concepts and Design, 3rd ed. © Addison-Wesley Publishers 2000
CS454/654 1-23
Multitier Systems
n Servers are clients of other serversn Example:
l Web proxy servers
Client
Proxy
Web
server
Web
server
serverClient
From Coulouris, Dollimore and Kindberg, Distributed Systems: Concepts and Design, 3rd ed. © Addison-Wesley Publishers 2000
CS454/654 1-24
Multitier Systems (2)
n Example: Internet Search Engines
From Tanenbaum and van Steen, Distributed Systems: Principles and Paradigms© Prentice-Hall, Inc. 2002
CS454/654 1-25
Multitier System Alternatives
From Tanenbaum and van Steen, Distributed Systems: Principles and Paradigms© Prentice-Hall, Inc. 2002
CS454/654 1-26
Communication in Multitier Systems
From Tanenbaum and van Steen, Distributed Systems: Principles and Paradigms© Prentice-Hall, Inc. 2002
CS454/654 1-27
Peer-to-Peer Systems
Coordination
Application
code
Coordination
Application
code
Coordination
Application
code
From Coulouris, Dollimore and Kindberg, Distributed Systems: Concepts and Design, 3rd ed. © Addison-Wesley Publishers 2000
CS454/654 1-28
Example Client/Server Middleware
n Remote Procedure Call (RPC)l Uses the well-known procedure call semantics.l The caller makes a procedure call and then waits. If it is a local
procedure call, then it is handled normally; if it is a remote procedure, then it is handled as a remote procedure call.
l Caller semantics is blocked send; callee semantics is blocked receive to get the parameters and a nonblocked send at the end to transmit results.
Callerprocedure
Callerstub
Comm.
Transmit
Receive
Caller
Calleeprocedure
Calleestub
Comm.
Marshallresults
Transmit
Unmarshallarguments
Receive
Callee
Call
Return
Call packet(s)
Result packet(s)
Call(…)Marshallparams
Unmarshallresults
.
.
.
CS454/654 1-29
Example Client/Server Middlewaren Object Management Architecture (OMA)
l Object-oriented
Application objects Common facilities
Common Object Services (COSS)
CASE tool Query Browser Email
Persistence Name Service
Common Object Request Broker(CORBA)
…
CS454/654 1-30
OMA Modules
n Object Request Broker: directs requests and answers between objects
n Object Services: basic functions for object management (e.g., a name service)
n Common Facilities: generic object-oriented tools for various applications (e.g., a class browser)
n Application Objects: classes specific to an application domain (e.g., a CASE tool)
CS454/654 1-31
OMG Object Modeln Generalized object model including objects, values (object
names and handles), operations, signatures, types and classesn An object is an abstraction with a state and a set of operationsn A request is an operation call with one or more parameters,
any of which may identify an object (multi-targeting); Arguments and results are passed by value
request
object operation
type
identifies identifies
has has
CS454/654 1-32
n Communications substraten A specific programmer interface (no implementation)n Multi-vendor ORBs to interoperate (CORBA-2)n Layered on top of other communication substrates
(RPC, byte streams, IPC,…)n Language mapping (C, C++, Ada, Smalltalk, Java,
Cobol)n Object adaptersn ORB interfacen Interface repository (IR)
CORBA Features
CS454/654 1-33
OMA’s Client-Server Structure
ORB
Request
Client Object Implementation
Response
CS454/654 1-34
ORB Functionsn Request dispatch to determine the identity of a method to be calledn Parameter encoding to convey local representations of parameter valuesn Delivery of request and result messages to the proper nodesn Synchronization of the requesters with the responses of the requestsn Activation and deactivation of persistent objects, e.g., Using the ODBMS
serversn Exception handling to report various failures to requesters and serversn Security mechanisms to assure the secure conveyance of messages among
objects
CS454/654 1-35
ORB StructureClient Object Implementation
ORB-dependent interface
DynamicInvocation
IDLStubs
ORBInterface
Common ORB interface
Different stub and skeleton foreach object type
Multiple object adapters
ORB Core
IDLSkeleton
ObjectAdapter
DynamicSkeletonInterface
RepositoryImpl.
Repository
CS454/654 1-36
CORBA Method Invocationsn Static
l Fasterà Everything at compile time
l Easier to programà Invoke a method on an identified object
l Static type checkingl Code is self-documenting
n Dynamicl Flexible
à Code genericity
l Run-time addition of classesà Needed for tools (e.g., Browsers)
CS454/654 1-37
Design Requirements of Distributed Systems
n Performancel Metricsl Load balancingl Scalabilityl Use of caching and replication
n Quality of Servicel Functional as well as performance
n Dependabilityl Reliabilityl Fault tolerance
n Security
CS454/654 1-38
Performance
n Performance metricsl Response time
à How long it takes from a user’s request until the result is returned to the user
à Average response time is usually meaningful
l Throughput à Number of jobs per unit time
l System utilizationl Amount of network capacity consumed
n Performance Probleml Exchange of messages is typically slow (about 1 msec return trip time
on a LAN, mainly due to protocol handling)
CS454/654 1-39
Messaging Overhead
100 Mbps 1 Gbps10 Mbps
HardwareOverhead
SoftwareOverhead
New Protocols(e.g., Infiniband,FiberChannel)
Changing Network Characteristics
CS454/654 1-40
Date Computers
1979, Dec. 188
1989, July 130,000
1999, July 56,218,000
Date Computers Web servers Percentage
1993, July 1,776,000 130 0.008
1995, July 6,642,000 23,500 0.4
1997, July 19,540,000 1,203,096 6
1999, July 56,218,000 6,598,697 12
Computers in the InternetComputers vs. Web servers in the Internet
Scalabilityn Distributed systems operate at many different scales
l Two workstations and a file serverl The CS computers…
n Most current distributed systems designed to work with few hundred CPUs n Future systems will be orders of magnitude larger
CS454/654 1-41
Guiding Performance Principlesn Distribution provides opportunities for parallelism that can improve
performancel Inter-process vs intra-process parallelisml Pay considerable attention to the size of computations
à Fine-grained parallelism: large number of small computations, interact highly with one another ⇒ potential trouble
à Coarse-grained parallelism: large computations, low interaction rates, little data ⇒maybe better
n Often the more important question is not can you scale, but can you scale well
l Solutions for 200 machines will fail miserably for 2,000,000n Designers should try to avoid the potential bottlenecks:
l Centralized components (e.g., A single mail server for all users)l Centralized tables (e.g., A single Directory Name Server)l Centralized algorithms (e.g., Routing based on complete information)
n Difference between bandwidth and latencyn Use caching and data replication for performance reasons (replication can
be used for dependability reasons as well)
CS454/654 1-42
Quality of Service (QoS)
n How well the system can meet the service requirements specified for it
n Functionall Whether the system can perform the specified functions
n Non-functionall How well the system can perform these functions with respect to
à Performanceà Reliabilityà Availability
n Examplesl Frame rates, jitter in video transmissionl Real-time constraintsl Database access time requirementsl 24/7 requirements
CS454/654 1-43
n Reliabilityl A measure of success with which a system conforms to some
authoritative specification of its behavior.
l Probability that the system has not experienced any failures within a given time period.
l Typically used to describe systems that cannot be repaired or where the continuous operation of the system is critical.
n Availabilityl The fraction of the time that a system meets its specification.
l The probability that the system is operational at a given time t.
Dependability
CS454/654 1-44
How to Improve Dependability
n Hardware and software redundancy - maskingl Take a distributed system with 4 file servers, each with a 0.95 chance of
being up at any instant l The probability of all 4 being down simultaneously is 0.054 = 0.000006l So the probability of at least one being available (i.e., the reliability of
the full system) is 0.999994, far better than 0.95l If there are 2 servers, then the reliability of the system is (1-0.052) =
0.9975
n A design that does not require simultaneous functioning of a substantial number of critical components
CS454/654 1-45
Hardware Redundancy
n Two computers are employed for a single application, one acting as a standby
l Very costly, but often very effective solution
n Redundancy can be planned at a finer grainl Individual servers can be replicatedl Redundant hardware can be used for non-critical activities when no
faults are presentl Redundant routes in network
CS454/654 1-46
Software Redundancy
n Distributed Systems are foremost highly complex software systems
l Nortel Networks DMS- 100 switch: 25- 30 million lines of code, 3000 software developers, 20 years life cycle to date.
l Motorola: 20% of engineers produce hardware, 80% produce softwarel Subject to all kinds of software engineering problems
n Software redundancy requiresl Process redundancy
à When one process fails, another one can take overl Data redundancy
à the state of permanent data can be recovered or “rolled back” when a fault is detected ⇒ transaction processing
à Dilemma: Data replication improves availability, but increses the chances that they will be inconsistent, especially if updates are frequent
– Replication protocols
CS454/654 1-47
Securityn Security is a big issue in computing in general, but even more so in
distributed computingl Communicationl Distributed resourcesl Infrastructure attacks
n Files/resources must be protected from unauthorized usagen Problems
l Threats to processesl Threats to communication channelsl Denial of service attacks
n Some issuesl Privacyl Authenticationl Availabilityl Integrity
CS454/654 1-48
Summary
n Distributed systems consist of autonomous CPUs that work together to make the complete system look like a single computer
n System Modelsl Client-Server: Most widely used
à Variants of Client-Server Model– Service provided by multiple servers, proxy servers, mobile code, mobile
agents, network computers, thin clients, “spontaneous” (or “pervasive”) networking
l Peer-to-peer Model
n Key issues when designing distributed systems:l Transparency, reliability, performance, scalability, …