40
Introduction to Distributed Systems Slides for CSCI 3171 Lectures E. W. Grundke

Introduction to Distributed Systems Slides for CSCI 3171 Lectures E. W. Grundke

Embed Size (px)

Citation preview

Page 1: Introduction to Distributed Systems Slides for CSCI 3171 Lectures E. W. Grundke

Introduction to Distributed Systems

Slides for CSCI 3171 Lectures E. W. Grundke

Page 2: Introduction to Distributed Systems Slides for CSCI 3171 Lectures E. W. Grundke

2

References

A. Tanenbaum and M. van Steen (TvS)Distributed Systems: Principles and ParadigmsPrentice-Hall 2002

G. Coulouris, J. Dollimore and T. Kindberg (CDK)Distributed System: Concepts and DesignAddison-Wesley 2001

Acknowledgment: Some slides from TvS:

http://www.prenhall.com/divisions/esm/app/author_tanenbaum/custom/dist_sys_1e/

CDK: http://www.cdk3.net/ig/beida/index.html

Page 3: Introduction to Distributed Systems Slides for CSCI 3171 Lectures E. W. Grundke

3

What is a Distributed System?

A collection of independent computers that appears to its users as a single coherent system.

Examples: Distributed object-based systems (CORBA, DCOM)Distributed file systems (NFS)etc.

TvS 1.2

Page 4: Introduction to Distributed Systems Slides for CSCI 3171 Lectures E. W. Grundke

4

Heterogeneity

Applies to all of the following:– networks

• Internet protocols mask the differences between networks– computer hardware

• e.g. data types such as integers can be represented differently

– operating systems• e.g. the API to IP differs from one OS to another

– programming languages• data structures (arrays, records) can be represented

differently– implementations by different developers

• they need agreed standards so as to be able to interwork

CDK Ch. 1.4

Page 5: Introduction to Distributed Systems Slides for CSCI 3171 Lectures E. W. Grundke

5

Transparency in a Distributed System

Different forms of transparency in a distributed system.

Transparency Description

AccessHide differences in data representation and how a resource is accessed

Location Hide where a resource is located

Migration Hide that a resource may move to another location

RelocationHide that a resource may be moved to another location while in use

ReplicationHide that a resource may be shared by several competitive users

ConcurrencyHide that a resource may be shared by several competitive users

Failure Hide the failure and recovery of a resource

PersistenceHide whether a (software) resource is in memory or on disk

TvS 1.4

Page 6: Introduction to Distributed Systems Slides for CSCI 3171 Lectures E. W. Grundke

6

Layered Protocols: IPLayers, interfaces, and protocols in the OSI model.

Page 7: Introduction to Distributed Systems Slides for CSCI 3171 Lectures E. W. Grundke

7

Layered Protocols: OSILayers, interfaces, and protocols in the OSI model.

2-1

TvS 2.2

Page 8: Introduction to Distributed Systems Slides for CSCI 3171 Lectures E. W. Grundke

8

Middleware Protocols

An adapted reference model for networked communication.

2-5

TvS 2.6

Page 9: Introduction to Distributed Systems Slides for CSCI 3171 Lectures E. W. Grundke

9

Middleware

A software layer that – masks the heterogeneity of systems– provides a convenient programming abstraction– provides protocols for providing general-purpose services

to more specific applications, eg.• authentication protocols• authorization protocols• distributed commit protocols• distributed locking protocols• high-level communication protocols

– remote procedure calls (RPC)– remote method invocation (RMI)

Page 10: Introduction to Distributed Systems Slides for CSCI 3171 Lectures E. W. Grundke

10

MiddlewareGeneral structure of a distributed system as middleware.

1-22

TvS 1.24

Page 11: Introduction to Distributed Systems Slides for CSCI 3171 Lectures E. W. Grundke

11

Middleware and Openness

In an open middleware-based distributed system, the protocols used by each middleware layer should be the same, as well as the interfaces they offer to applications.

1.23

TvS 1.25

Page 12: Introduction to Distributed Systems Slides for CSCI 3171 Lectures E. W. Grundke

12

Middleware programming models

Remote Calls– remote Procedure Calls (RPC) – distributed objects and Remote Method

Invocation (RMI)• eg. Java RMI

Common Object Request Broker Architecture (CORBA) – cross-language RMI

Other programming models– remote event notification– remote SQL access– distributed transaction processing

CDK Ch 1

Page 13: Introduction to Distributed Systems Slides for CSCI 3171 Lectures E. W. Grundke

External Data Representation

See Coulouris, Dollimore and Kindberg (CDK), Sec. 4.3

Page 14: Introduction to Distributed Systems Slides for CSCI 3171 Lectures E. W. Grundke

14

Motivation

Data in running programs:Not just primitives, but

arrays, pointers, lists, trees, etc.In general:

complex graphs of interconnected structures or objects

Data being transmitted:Sequential! Pointers make no sense. Structures must be flattened.

All the heterogeneities must be masked! (endian, binary formats, etc.)

CDK 4.3

Page 15: Introduction to Distributed Systems Slides for CSCI 3171 Lectures E. W. Grundke

15

Motivation

Data in running programs:Not just primitives, but

arrays, pointers, lists, trees, etc.In general:

complex graphs of interconnected structures or objects

Data being transmitted:Sequential! Pointers make no sense. Structures must be flattened.

All the heterogeneities must be masked! (endian, binary formats, etc.)

CDK 4.3

Page 16: Introduction to Distributed Systems Slides for CSCI 3171 Lectures E. W. Grundke

16

What is an External Data Representation?

“An agreed standard for the representation of data structures and primitive values.”

Internal to external: “marshalling”External to internal: “unmarshalling”

Examples:CORBA’s Common Data Representation (CDR)Java Object SerializationSun XDR (RFC 1832)

Page 17: Introduction to Distributed Systems Slides for CSCI 3171 Lectures E. W. Grundke

17

CORBA CDR

Defined in CORBA 2.0 in 1998Primitive types:

Standard data types, both big/little endian, conversion by the receiver.

Constructed types:sequence, string, array, struct, enumerated, union (not objects)

Data types are not specified in the external format: receiver is assumed to have access to the definition (via IDL).(unlike Java Object Serialization!)

CDK 4.3

Page 18: Introduction to Distributed Systems Slides for CSCI 3171 Lectures E. W. Grundke

18

CORBA CDR– only defined in CORBA 2.0 in 1998, before that, each

implementation of CORBA had an external data representation, but they could not generally work with one another. That is:• the heterogeneity of hardware was masked• but not the heterogeneity due to different

programmers (until CORBA 2)– CORBA CDR represents simple and constructed data

types (sequence, string, array, struct, enum and union)• note that it does not deal with objects

– it requires an IDL specification of data to be serialised

CDK 4.3

Page 19: Introduction to Distributed Systems Slides for CSCI 3171 Lectures E. W. Grundke

19

CORBA CDR Example

The flattened form represents a Person struct with value:{”Smith”, ”London”, 1934}

CDK 4.3

0-3 5 Length of string

4-7 ”Smit” “Smith”

8-11 ”h____”

12-15 6 Length of string

16-19 ”Lond” “London”

20-23 ”on__”

24-27 1934 Unsigned long

Index in sequence 4 bytes wide Notesof bytes

Page 20: Introduction to Distributed Systems Slides for CSCI 3171 Lectures E. W. Grundke

Remote Procedure Calls (RPC)

Page 21: Introduction to Distributed Systems Slides for CSCI 3171 Lectures E. W. Grundke

21

What is RPC?

Call a procedure (function, subroutine, method, …) in a program running on a remote machine,

while hiding communication details from the programmer.

Note: Think C, not java! We deal with objects later!

Page 22: Introduction to Distributed Systems Slides for CSCI 3171 Lectures E. W. Grundke

22

Conventional Procedure Call

a) Parameter passing in a local procedure call: the stack before the call to readb) The stack while the called procedure is active

TvS 2.7

Page 23: Introduction to Distributed Systems Slides for CSCI 3171 Lectures E. W. Grundke

23

Parameter Passing Techniques

Call-by-value

Call-by-reference

Call-by-copy/restore

Page 24: Introduction to Distributed Systems Slides for CSCI 3171 Lectures E. W. Grundke

24

Client and Server StubsPrinciple of RPC between a client and server program.

TvS 2.8

Page 25: Introduction to Distributed Systems Slides for CSCI 3171 Lectures E. W. Grundke

25

Steps of a Remote Procedure Call

1. Client procedure calls client stub in normal way2. Client stub builds message, calls local OS3. Client's OS sends message to remote OS4. Remote OS gives message to server stub5. Server stub unpacks parameters, calls server6. Server does work, returns result to the stub7. Server stub packs it in message, calls local OS8. Server's OS sends message to client's OS9. Client's OS gives message to client stub10. Stub unpacks result, returns to client

TvS 2.9

Page 26: Introduction to Distributed Systems Slides for CSCI 3171 Lectures E. W. Grundke

26

Passing Value ParametersSteps involved in doing remote computation through RPC

2-8

TvS 2.10

Page 27: Introduction to Distributed Systems Slides for CSCI 3171 Lectures E. W. Grundke

27

Passing Value Parameters:Data Representation Issues

a) Original message on the Pentiumb) The message after receipt on the SPARCc) The message after being inverted. The little numbers in boxes

indicate the address of each byte

BUT: This is not usually a problem with strings! (E.W.G.)TvS 2.11

Page 28: Introduction to Distributed Systems Slides for CSCI 3171 Lectures E. W. Grundke

28

Parameter Specification and Stub Generation

a) A procedureb) The corresponding message.

TvS 2.12

Page 29: Introduction to Distributed Systems Slides for CSCI 3171 Lectures E. W. Grundke

29

Passing Reference Parameters

Reference variables (pointers):pointers to arrayspointers to structures (objects without methods)

What if the structure contains other pointers?The server may need a whole “graph” of structures!

“Parameter marshalling”

Interface Definition Language (IDL):Specifies types, constants, procedures and

parameter data types,compiled into client and server stubs.

Page 30: Introduction to Distributed Systems Slides for CSCI 3171 Lectures E. W. Grundke

30

Asynchronous RPC

a) The interconnection between client and server in a traditional RPCb) The interaction using asynchronous RPC

2-12

TvS 2.14

Page 31: Introduction to Distributed Systems Slides for CSCI 3171 Lectures E. W. Grundke

31

Asynchronous RPC:Deferred Synchronous RPC

A client and server interacting through two asynchronous RPCs

TvS 2.15

Page 32: Introduction to Distributed Systems Slides for CSCI 3171 Lectures E. W. Grundke

32

Distributed Computing Environment (DCE)

A middleware systemDeveloped by The Open Group (previously OSF)Includes

distributed file servicedirectory servicesecurity servicedistributed time service

Adopted by Microsoft for distributed computing

Page 33: Introduction to Distributed Systems Slides for CSCI 3171 Lectures E. W. Grundke

33

DCE: Binding a Client to a Server

2-15

TvS 2.17

Page 34: Introduction to Distributed Systems Slides for CSCI 3171 Lectures E. W. Grundke

Remote Method Invocation (RMI)

Page 35: Introduction to Distributed Systems Slides for CSCI 3171 Lectures E. W. Grundke

35

What is RMI?

RPC to a method in an object on another machine.

Note: Now think Java!

Page 36: Introduction to Distributed Systems Slides for CSCI 3171 Lectures E. W. Grundke

36

Object Orientation:Remote Method Invocation (RMI)

An object encapsulatesState (fields or instance variables)Methods (often described by an interface)

Distributed object:An interface known locally may describe an object on another machine.

Page 37: Introduction to Distributed Systems Slides for CSCI 3171 Lectures E. W. Grundke

37

Distributed Objects

Common organization of a remote object with client-side proxy.2-16

TvS 2.18

Page 38: Introduction to Distributed Systems Slides for CSCI 3171 Lectures E. W. Grundke

38

Binding a Client to an Object

a) An example with implicit binding using only global references

b) An example with explicit binding using global and local references

Distr_object* obj_ref; //Declare a systemwide object referenceobj_ref = …; // Initialize the reference to a distributed objectobj_ref-> do_something(); // Implicitly bind and invoke a method

(a)

Distr_object objPref; //Declare a systemwide object referenceLocal_object* obj_ptr; //Declare a pointer to local objectsobj_ref = …; //Initialize the reference to a distributed objectobj_ptr = bind(obj_ref); //Explicitly bind and obtain a pointer to the local proxyobj_ptr -> do_something(); //Invoke a method on the local proxy

(b)

TvS 2.19

Page 39: Introduction to Distributed Systems Slides for CSCI 3171 Lectures E. W. Grundke

39

Parameter PassingThe situation when passing an object by reference or by value.

2-18

TvS 2.20

Page 40: Introduction to Distributed Systems Slides for CSCI 3171 Lectures E. W. Grundke

40

The DCE Distributed-Object Modela) Distributed dynamic objects in DCE.b) Distributed named objects

TvS 2.21