Middleware for Active Reduction Operations in Distributed Systems

By Nitin Bahadur and Gokul Nadathur

Department of Computer Sciences

University of Wisconsin-Madison

Spring 2000

Multicast / Reduction Trees, Spring 2000

Talk Outline

• Motivation and Goals
• General Architecture of the middleware
• Components of the middleware
• Providing reliability - handling of node failures
• Applications developed using the middleware
• Performance
• Conclusions and possible extensions


Motivation and Goals

• A middleware for applications that follow the Master - Worker paradigm
• Scalable framework for communication and for computing client responses (“Reduction”)
• Unicast does not scale, so use multicast
• Introduce reduction operations dynamically in clients
• A general framework for communication among clients


The Big Picture...

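[Diagram: a Master App and several Client Apps, each running on top of ARTL and connected in a tree.]

Master App (on ARTL):
• Sends queries
• Reduces results
• Hands back results to application

Intermediate Client App (on ARTL):
• Executes responses to queries
• Forwards queries downstream
• Reduces incoming results
• Sends reduced results to master

Leaf Client App (on ARTL):
• Executes responses to queries
• Sends back results towards master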


ART - Library Architecture

[Diagram: layered library architecture. Labeled components: Network, ARTL Communication Layer, Event Handler (framework for processing messages), Application API, Application. Labeled message flow: Incoming Packet, ARTL specific message, Application specific callbacks, Reduction functions, Outgoing message.]

ARTL messages:
1. Query from master
2. Response from downstream nodes


Communication Subsystem

• Connection setup: connect nodes as a binomial tree
• Send and receive ARTL and application messages
• Detect node failure and act accordingly
• Integrate a restarted node into the current tree structure


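Why Use a Binomial Tree

[Diagram: a Master App and three Client Apps, connected once as a binomial tree and once by direct unicast from the master. Edge labels give the time step at which each client receives the query.]

• Binomial tree: query propagation time = 2
• Unicast mechanism: query propagation time = 3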
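The comparison above can be checked with a small sketch. It assumes 0-based ranks with the master at rank 0, the usual binomial-tree rule (a node's parent is its rank with the lowest set bit cleared), and one send per node per time step. The function names are hypothetical and are not part of ARTL.

```python
def binomial_parent(rank):
    """Parent of a 0-based rank: clear its lowest set bit. Rank 0 is the root."""
    if rank == 0:
        return None
    return rank & (rank - 1)


def binomial_children(rank, n):
    """Children of `rank` in an n-node binomial tree, largest subtree first."""
    children = []
    bit = 1
    # Children are rank + 2^k for every 2^k below rank's lowest set bit
    # (the root has no such limit), as long as the child exists.
    while (rank == 0 or bit < (rank & -rank)) and rank + bit < n:
        children.append(rank + bit)
        bit <<= 1
    return children[::-1]


def receive_rounds(n):
    """Time step at which each node receives a query sent by rank 0.

    Assumption: a node sends to one child per step, largest subtree first.
    """
    rounds = {0: 0}
    stack = [0]
    while stack:
        node = stack.pop()
        for k, child in enumerate(binomial_children(node, n)):
            rounds[child] = rounds[node] + k + 1
            stack.append(child)
    return rounds


def unicast_rounds(n):
    """The master sends to each of the n - 1 clients in turn."""
    return n - 1


if __name__ == "__main__":
    for n in (4, 8, 16):
        print(n, max(receive_rounds(n).values()), unicast_rounds(n))
```

For four nodes this gives 2 steps for the binomial tree and 3 for unicast, the same figures as on the slide. Adding 1 to every rank of the eight-node tree gives root 1 with children 5, 3 and 2, the numbering used in the later tree diagrams.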


Reduction

[Diagram: eight nodes, numbered 1 to 8, arranged as a tree rooted at node 1. Responses flow up the tree, with reduction at nodes 5 and 3.]

Example reduction operations: Min(), Max()
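A minimal sketch of reduction up the tree, under assumptions: the child map below is one reading of the diagram (root 1 with children 5, 3 and 2), and `reduce_up` is a hypothetical helper, not an ARTL call. Each node combines its own value with the already-reduced values of its children, so the master only ever sees one value per direct child.

```python
# Assumed tree shape: node -> list of children. Leaves are omitted.
CHILDREN = {1: [5, 3, 2], 5: [7, 6], 7: [8], 3: [4]}


def reduce_up(node, local_value, reduce_fn, children=CHILDREN):
    """Reduce the values in the subtree rooted at `node`.

    `local_value` maps node -> that node's own response.
    `reduce_fn` takes a list of values and returns one value (e.g. min, max).
    """
    values = [local_value[node]]
    for child in children.get(node, []):
        # A child forwards only its reduced result, never raw responses.
        values.append(reduce_up(child, local_value, reduce_fn, children))
    return reduce_fn(values)


if __name__ == "__main__":
    loads = {1: 4, 2: 9, 3: 2, 4: 7, 5: 5, 6: 1, 7: 8, 8: 3}
    print(reduce_up(1, loads, min), reduce_up(1, loads, max))
```

This works for Min() and Max() because they are associative: reducing partial results gives the same answer as reducing all responses at the master.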

Tree connection setup

[Diagram sequence: the eight-node tree (nodes 1 to 8, rooted at node 1) is connected in three phases, labeled Tree Setup - Phase I, Phase II and Phase III, with TCP connections set up in each phase.]

Inter node communication

• Unicast and multicast data transmission
• ARTL receives application messages for which no receive has been posted
– these are passed to a callback function registered by the application
• ARTL receives data on behalf of the application when the application explicitly posts a receive

[Diagram: a message consists of an ARTL header and data.]
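The two receive paths above can be sketched as follows. Everything here is an assumption made for illustration: the `Endpoint` class, its method names, and the use of a message tag to match posted receives are hypothetical and are not the ARTL API.

```python
from collections import deque


class Endpoint:
    """Routes incoming application messages to a posted receive or a callback."""

    def __init__(self, unexpected_callback):
        # Called as unexpected_callback(tag, data) when no receive is posted.
        self._unexpected_callback = unexpected_callback
        self._posted = {}  # tag -> queue of buffers waiting for data

    def post_receive(self, tag):
        """The application asks for the next message with this tag."""
        buffer = []
        self._posted.setdefault(tag, deque()).append(buffer)
        return buffer

    def deliver(self, tag, data):
        """Called by the communication layer for each incoming message."""
        waiting = self._posted.get(tag)
        if waiting:
            # A receive was posted: hand the data to the oldest one.
            waiting.popleft().append(data)
        else:
            # No receive posted: fall back to the registered callback.
            self._unexpected_callback(tag, data)


if __name__ == "__main__":
    endpoint = Endpoint(lambda tag, data: print("callback:", tag, data))
    endpoint.deliver("status", b"up")      # goes to the callback
    buf = endpoint.post_receive("chunk")
    endpoint.deliver("chunk", b"payload")  # fills the posted receive
    print(buf)
```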

ART - Library Architecture

[Diagram: the library architecture again (Network, ARTL Communication Layer, Event Handler, Application API, Application), with the incoming packet labeled as an ARTL encapsulated message.]

Reduction Functions

• Implemented as shared objects

• Sent to clients during the setup phase

• Each reduction function is associated with the particular response it reduces
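As a rough analogue of the idea, the sketch below loads a reduction function at run time from a file received during setup and associates it with one response type. It substitutes a Python source file for a shared object, so it illustrates the mechanism rather than the ARTL implementation. `load_reduction` and `ReductionRegistry` are hypothetical names.

```python
import importlib.util


def load_reduction(path, func_name):
    """Load the function `func_name` from the Python source file at `path`."""
    spec = importlib.util.spec_from_file_location("artl_reduction_plugin", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return getattr(module, func_name)


class ReductionRegistry:
    """Maps each response type to the reduction function that reduces it."""

    def __init__(self):
        self._by_response = {}

    def install(self, response_type, path, func_name):
        # Done once per function, when the code arrives during setup.
        self._by_response[response_type] = load_reduction(path, func_name)

    def reduce(self, response_type, values):
        return self._by_response[response_type](values)
```

A client would call `install` once for each function it receives, then call `reduce` whenever responses of that type arrive from downstream nodes.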


Event Handler

[Diagram: the Event Handler sits between the Network and the Application. It keeps a table containing the query id and callback information for currently registered queries. Responses from downstream nodes for a table entry go onto a run queue of reduction/response operations served by a thread pool. The reduced response is sent upstream. On the application side, a response callback is invoked.]


Multithreaded Architecture

• No prior knowledge about the behavior of a reduction function

• Exploit concurrency: multiple processors per node

• Static pool of threads: creating and destroying threads is expensive (Firefly RPC)
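A minimal sketch of an event handler with a query table and a static thread pool, assuming each query expects a fixed number of responses. The class and method names are hypothetical. Python's `ThreadPoolExecutor` stands in for the pool, and its internal work queue plays the role of the run queue.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class EventHandler:
    def __init__(self, expected, send_upstream, workers=4):
        self._expected = expected            # responses needed per query
        self._send_upstream = send_upstream  # called as send_upstream(id, result)
        # Fixed-size pool: threads are created once and reused.
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._lock = threading.Lock()
        self._table = {}                     # query id -> (reduce_fn, responses)

    def register(self, query_id, reduce_fn):
        with self._lock:
            self._table[query_id] = (reduce_fn, [])

    def on_response(self, query_id, value):
        """Record one response; queue the reduction once all have arrived."""
        with self._lock:
            reduce_fn, responses = self._table[query_id]
            responses.append(value)
            if len(responses) < self._expected:
                return None
            del self._table[query_id]
        # Run outside the lock: the reduction's cost is not known in advance.
        return self._pool.submit(self._run, query_id, reduce_fn, responses)

    def _run(self, query_id, reduce_fn, responses):
        result = reduce_fn(responses)
        self._send_upstream(query_id, result)
        return result

    def shutdown(self):
        self._pool.shutdown(wait=True)
```

Running reductions on pool threads keeps a slow reduction function from blocking the thread that reads from the network.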

Crash Reconfiguration

[Diagram sequence: the eight-node tree rooted at node 1, before and after node crashes. Some frames show the tree without node 2. The last frame shows the tree without node 5, with the remaining nodes (1, 3, 2, 7, 6, 8, 4) rearranged. Frames are labeled "Crash Reconfiguration at depth 1" and "Crash Reconfiguration at depth 2".]

Crash Detection

• A break in the TCP connection with a parent or child
– a signal is received at the other end of the connection

• Periodic refresh messages inform the parent that a child is up and running
– useful in WAN environments
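The refresh-message check can be sketched as a small timeout tracker kept by the parent. The timeout value, the class name and the explicit `now` argument are assumptions made to keep the example deterministic. A real implementation would read a clock.

```python
class RefreshTracker:
    """Tracks the last refresh time of each child and reports silent ones."""

    def __init__(self, timeout):
        self._timeout = timeout  # seconds without a refresh before suspicion
        self._last_seen = {}     # child id -> time of last refresh

    def refresh(self, child, now):
        """Record a refresh message from `child` received at time `now`."""
        self._last_seen[child] = now

    def forget(self, child):
        """Stop tracking a child, e.g. after its failure has been reported."""
        self._last_seen.pop(child, None)

    def failed_children(self, now):
        """Children whose last refresh is older than the timeout."""
        return sorted(
            child
            for child, seen in self._last_seen.items()
            if now - seen > self._timeout
        )


if __name__ == "__main__":
    tracker = RefreshTracker(timeout=10)
    tracker.refresh(5, now=0)
    tracker.refresh(3, now=0)
    tracker.refresh(3, now=8)
    print(tracker.failed_children(now=12))
```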


Crash Handling

• The parent of the failed node informs the master
• All nodes are informed of the node failure
• The master recomputes the tree

– If a leaf node is down, no reconfiguration is required

– If an intermediate node is down, some reconfiguration is required
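One possible recomputation policy is sketched below: a failed leaf is simply dropped, and for a failed intermediate node its first child is promoted into its place and adopts the remaining children. This policy is an assumption chosen for illustration. The slides do not state the rule the master uses, and `remove_failed` is a hypothetical name.

```python
def remove_failed(children, failed):
    """Return a new child map (node -> list of children) without `failed`.

    Assumed policy: the first child of the failed node takes its place and
    adopts its siblings. Failure of the root is not handled here.
    """
    tree = {node: list(kids) for node, kids in children.items() if node != failed}
    orphans = children.get(failed, [])
    parent = next((node for node, kids in tree.items() if failed in kids), None)
    if parent is None:
        raise ValueError("failed node is the root or is not in the tree")
    index = tree[parent].index(failed)
    if orphans:
        heir, rest = orphans[0], orphans[1:]
        tree[parent][index] = heir                # heir replaces the failed node
        tree.setdefault(heir, []).extend(rest)    # and adopts its siblings
    else:
        tree[parent].pop(index)                   # leaf: just drop it
    return tree


if __name__ == "__main__":
    tree = {1: [5, 3, 2], 5: [7, 6], 7: [8], 3: [4]}
    print(remove_failed(tree, 2))  # leaf failure
    print(remove_failed(tree, 5))  # intermediate failure
```

With the eight-node tree, removing node 5 under this policy leaves node 1 with children 7, 3 and 2, and node 7 with children 8 and 6.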


Node Restart

• The restarted node contacts the master to report its restart

• The master sends it the current state of the network and the shared object(s)

• All nodes are informed of the node restart
• The master recomputes the tree and informs the new node’s parent about its new child
• Parent and child establish connections


SysMon - A System monitor

Monitors the load average from /proc and displays the Min, Max and Average loads

Per-node load is also displayed

ARTL reduction operations: Min, Max and Average
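Min and Max can be reduced directly at every level of the tree, but an average of averages is wrong when subtrees differ in size. One common approach, assumed here and not confirmed by the slides, is to forward a (min, max, sum, count) summary and divide only at the master. The function names are hypothetical.

```python
from functools import reduce


def summarize(load):
    """Summary for a single node's load: (min, max, sum, count)."""
    return (load, load, load, 1)


def combine(a, b):
    """Merge two summaries; associative, so it can run at any tree level."""
    return (min(a[0], b[0]), max(a[1], b[1]), a[2] + b[2], a[3] + b[3])


def average(summary):
    """Final step, done once at the master."""
    return summary[2] / summary[3]


if __name__ == "__main__":
    # A subtree of two nodes and a single node, merged at the parent.
    subtree = reduce(combine, [summarize(1.0), summarize(2.0)])
    total = combine(subtree, summarize(6.0))
    print(total[0], total[1], average(total))
```

Here the true average is 3.0, while averaging the two partial averages (1.5 and 6.0) would give 3.75.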


Node failures are detected and SysMon pops up an alert


File Transfer Application

• Transfers a file from the master to all clients
• The file can be executed at the clients (if required)

– execution can start immediately on receiving the file
– execution can be delayed until all nodes have received the file

File Transfer Performance

[Figure: File transfer time for a 40 MB file. X-axis: total number of nodes (2, 4, 8, 16). Y-axis: time in seconds (0 to 180). Series: unicast file transfer time, multicast file transfer time, expected multicast file transfer time.]

Total Startup Time vs Number of Nodes

[Figure: Total startup time. X-axis: number of nodes (2, 4, 8, 16, 32). Y-axis: startup time in seconds (0 to 20).]

Client processes were started using ssh on different machines


Conclusions and Extensions

• A middleware for dynamic operations
• Support for crash detection, recovery and dynamic processes
• Demonstrated near-optimal speedup using real applications

• Making response functions dynamic: active services
• Differential scheduling in the thread scheduler for QoS
• Making dynamic code secure
