Java-Based Parallel Computing on the Internet: Javelin 2.0 & Beyond

Michael Neary & Peter Cappello, Computer Science, UCSB

Introduction Goals

• Service parallel applications that are:
– Large: too big for a cluster
– Coarse-grain: to hide communication latency

• Simplicity of use
– Design focus: decomposition [composition] of computation.

• Scalable high performance
– despite large communication latency

• Fault-tolerance
– 1000s of hosts, each dynamically [dis]associates.

Introduction Some Related Work

Introduction Some Applications

• Search for extra-terrestrial life
• Computer-generated animation
• Computer modeling of drugs for:
– Influenza
– Cancer
– Reducing chemotherapy’s side-effects
• Financial modeling
• Storing nuclear waste

Outline

• Architecture

• Model of Computation

• API

• Scalable Computation

• Experimental Results

• Conclusions & Future Work

Architecture Basic Components

Brokers

Clients

Hosts

Architecture Broker Discovery

[Figure, five animation frames: a network of brokers (B), a BrokerNamingSystem, and a host (H). The fourth frame adds the message label PING(BID?).]

Architecture Network of Broker-Managed Host Trees

• Each broker manages a tree of hosts
• Brokers form a network
• Client contacts broker
• Client gets host trees

Scalable Computation Deterministic Work-Stealing Scheduler

[Figure: a HOST holding a task container with three operations: addTask( task ), getTask( ), and stealTask( ).]

Scalable Computation Deterministic Work-Stealing Scheduler

Task getWork( ) {
    if ( my deque has a task ) return task;
    else if ( any child has a task ) return child’s task;
    else return parent.getWork( );
}

[Figure labels: CLIENT, HOSTS]
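The getWork( ) pseudocode above can be sketched in Java roughly as follows. This is a minimal illustration, not Javelin's code: the class name StealingHost, the use of strings as tasks, and the choice of which deque end the owner and the thief use are all assumptions made for the example. Only the method names addTask, getTask, stealTask, and getWork come from the slides.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical host in a tree of hosts; names are illustrative, not Javelin's API.
class StealingHost {
    private final StealingHost parent;                       // null at the root of the tree
    private final List<StealingHost> children = new ArrayList<>();
    private final Deque<String> deque = new ArrayDeque<>();  // tasks are plain strings here

    StealingHost(StealingHost parent) {
        this.parent = parent;
        if (parent != null) parent.children.add(this);
    }

    // Owner side: add new work at the tail of the local deque.
    synchronized void addTask(String task) { deque.addLast(task); }

    // Owner side: take the most recently added task (null if the deque is empty).
    // Assumption: the owner works LIFO; the slides do not say which end is used.
    synchronized String getTask() { return deque.pollLast(); }

    // Thief side: take the oldest task (null if the deque is empty).
    synchronized String stealTask() { return deque.pollFirst(); }

    // Mirrors the slide: own deque first, then the children in a fixed order,
    // then ask the parent. The fixed order is what makes the search deterministic.
    String getWork() {
        String task = getTask();
        if (task != null) return task;
        for (StealingHost child : children) {
            task = child.stealTask();
            if (task != null) return task;
        }
        return parent == null ? null : parent.getWork();
    }
}
```

In this sketch a host with an empty deque reaches a sibling's work through the parent, since the parent's getWork( ) scans all of its children.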

Models of Computation

• Master-slave

– As far as we know, this covers all proposed commercial applications

• Branch-&-bound optimization

– A generalization of master-slave.

Models of Computation Branch & Bound

[Figure, seven animation frames: a search tree whose nodes are labeled with costs. Root: 0. Next level: 2, 7. Next level: 3, 6, 10, 8. Leaves: 3, 4, 8, 7, 12, 10, 9, 10. The bounds shown in each frame are:]

Frame 1: UPPER = [not recoverable from the slide text], LOWER = 0
Frame 2: UPPER = [not recoverable from the slide text], LOWER = 2
Frame 3: UPPER = [not recoverable from the slide text], LOWER = 3
Frame 4: UPPER = 4, LOWER = 4
Frame 5: UPPER = 3, LOWER = 3
Frame 6: UPPER = 3, LOWER = 6
Frame 7: UPPER = 3, LOWER = 7

Models of Computation Branch & Bound

• Tasks created dynamically
• Upper bound is shared
• To detect termination: scheduler detects tasks that have been:
– Completed
– Killed (“bounded”)
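To make the pruning rule concrete, here is a small sequential sketch of branch & bound over the tree from the preceding slides. It is only an illustration under stated assumptions: the classes BbNode and BranchAndBound are hypothetical, the tree is assumed to be binary and read left to right, and an internal node's label is treated as its lower bound while a leaf's label is its solution cost. The visiting order need not match the slide animation.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical search-tree node: 'bound' is a lower bound for internal nodes
// and the actual solution cost for leaves.
class BbNode {
    final int bound;
    final BbNode[] children;
    BbNode(int bound, BbNode... children) { this.bound = bound; this.children = children; }
    boolean isComplete() { return children.length == 0; }
}

class BranchAndBound {
    int upperBound = Integer.MAX_VALUE; // cost of the best complete solution so far
    int visited = 0;                    // nodes that were not killed

    int solve(BbNode root) {
        Deque<BbNode> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            BbNode node = stack.pop();
            if (node.bound >= upperBound) continue;  // bound improved since the push: kill
            visited++;
            if (node.isComplete()) {
                upperBound = node.bound;             // new best solution
            } else {
                // Push right-to-left so the leftmost child is explored first.
                for (int i = node.children.length - 1; i >= 0; i--)
                    if (node.children[i].bound < upperBound) stack.push(node.children[i]);
            }
        }
        return upperBound;
    }

    // The tree from the slides, assuming a binary tree read left to right.
    static BbNode slideTree() {
        return new BbNode(0,
            new BbNode(2,
                new BbNode(3, new BbNode(3), new BbNode(4)),
                new BbNode(6, new BbNode(8), new BbNode(7))),
            new BbNode(7,
                new BbNode(10, new BbNode(12), new BbNode(10)),
                new BbNode(8, new BbNode(9), new BbNode(10))));
    }
}
```

Calling new BranchAndBound().solve(BranchAndBound.slideTree()) returns 3 after visiting 4 of the 15 nodes; every other node is killed because its bound is not below the upper bound.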

API

public class Host implements Runnable {
    . . .
    public void run() {
        // keep asking the scheduler for work until none is left
        while ( (node = jDM.getWork()) != null ) {
            if ( isAtomic() )
                compute(); // search space; return result
            else {
                child = node.branch(); // put children in child array
                for (int i = 0; i < node.numChildren; i++)
                    if ( child[i].setLowerBound() < UpperBound )
                        jDM.addWork( child[i] ); // else child is killed implicitly
            }
        }
    }

API

    private void compute() {
        . . .
        boolean newBest = false;

        while ( (node = stack.pop()) != null ) {
            if ( node.isComplete() ) {
                if ( node.getCost() < UpperBound ) {
                    newBest = true;
                    UpperBound = node.getCost();
                    jDM.propagateValue( UpperBound ); // share the new bound with other hosts
                    best = node; // remember the best complete solution found so far
                }
            } else {
                child = node.branch();
                for (int i = 0; i < node.numChildren; i++)
                    if ( child[i].setLowerBound() < UpperBound )
                        stack.push( child[i] ); // else child is killed implicitly
            }
        }
        if ( newBest )
            jDM.returnResult( best );
    }
}

Scalable Computation Weak Shared Memory Model

• Slow propagation of bound affects performance, not correctness.

[Figure, five animation frames: Propagate bound]
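The claim that a late bound costs work but not correctness can be illustrated with a small sketch. Everything here is hypothetical: WeakBound stands for a host's local copy of the shared upper bound, and scan simulates a remote bound that arrives only after a given number of candidates has been examined. It is not how Javelin propagates values, only a model of the effect.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical per-host copy of the shared upper bound.
class WeakBound {
    private final AtomicInteger value = new AtomicInteger(Integer.MAX_VALUE);

    // Merge a bound found locally or received from another host; bounds only decrease.
    int offer(int candidate) { return value.accumulateAndGet(candidate, Math::min); }

    int get() { return value.get(); }
}

class WeakBoundDemo {
    // Scans solution costs, pruning against a local bound that receives the
    // remote bound only after 'delay' items have been examined.
    // Returns { best cost known at the end, number of costs fully evaluated }.
    static int[] scan(int[] costs, int remoteBound, int delay) {
        WeakBound local = new WeakBound();
        int evaluated = 0;
        for (int i = 0; i < costs.length; i++) {
            if (i == delay) local.offer(remoteBound); // late arrival of the propagated bound
            if (costs[i] >= local.get()) continue;    // pruned by the current local bound
            evaluated++;
            local.offer(costs[i]);
        }
        local.offer(remoteBound); // the remote bound arrives eventually in any case
        return new int[] { local.get(), evaluated };
    }
}
```

With costs {9, 8, 7, 6, 5} and a remote bound of 4, a delay of 0 evaluates nothing and a delay of 3 evaluates three candidates, yet both runs end with the same best cost, 4.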

Scalable Computation Fault Tolerance via Eager Scheduling

When:

• All tasks have been assigned
• Some results have not been reported
• A host wants a new task

Re-assign a task!

• Eager scheduling tolerates faults & balances the load.
– The computation completes if at least 1 host communicates with the client.
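The re-assignment rule above might be sketched as follows. This is a simplified illustration with assumed names (EagerScheduler, requestTask, reportResult) and integer task ids; it ignores the tree structure and branch & bound specifics of the actual system.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical eager scheduler: task ids are ints; not Javelin's actual classes.
class EagerScheduler {
    private final List<Integer> unassigned = new ArrayList<>();
    private final Set<Integer> outstanding = new LinkedHashSet<>(); // assigned, no result yet
    private int next = 0; // round-robin cursor over outstanding tasks

    EagerScheduler(int taskCount) {
        for (int id = 0; id < taskCount; id++) unassigned.add(id);
    }

    // Called when a host asks for work. Returns null only when every result is in.
    synchronized Integer requestTask() {
        if (!unassigned.isEmpty()) {
            Integer id = unassigned.remove(0);
            outstanding.add(id);
            return id;
        }
        if (outstanding.isEmpty()) return null;
        // All tasks are assigned but some results are missing: re-assign one of them.
        List<Integer> pending = new ArrayList<>(outstanding);
        return pending.get(next++ % pending.size());
    }

    // Called when a host reports a result; returns false for a duplicate report.
    synchronized boolean reportResult(int id) { return outstanding.remove(id); }

    synchronized boolean isDone() { return unassigned.isEmpty() && outstanding.isEmpty(); }
}
```

If one host takes task 0 and then disappears, any surviving host that keeps calling requestTask() is eventually handed task 0 again, so a single live host suffices to finish. The same mechanism balances load, because fast hosts pick up the tasks that slow hosts have not yet completed.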

Scalable Computation Fault Tolerance via Eager Scheduling

• Scheduler must know which:
– Tasks have completed
– Nodes have been killed

• Performance balance
– Centralized schedule info
– Decentralized computation

Experimental Results

[Figure: Speedup versus Processors, both axes running from 0 to 100, with curves labeled graph22, graph24, and ideal.]

Experimental Results

[Figure: the branch & bound search tree from the earlier slides, captioned: Example of a “bad” graph]

Conclusions

• Javelin 2 relieves the designer/programmer of managing a set of [inter-]networked processors that is:
– Dynamic
– Faulty

• A wide set of applications is covered by:
– Master-slave model
– Branch & bound model

• Weak shared memory performs well.
• Use multicast (?) for:
– Code distribution
– Propagating values

Future Work

• Improve support for long-lived computation:
– Do not require that the client run continuously.

• A dag model of computation
– with limited weak shared memory.

Future Work Jini/JavaSpaces Technology

[Figure: a TaskManager (aka Broker) and eight hosts (H).]

“Continuously” disperse Tasks among brokers via a physics model

Future Work Jini/JavaSpaces Technology

• TaskManager uses persistent JavaSpace
– Host management: trivial
– Eager scheduling: simple

• No single point of failure
– Fat tree topology

Future Work Advanced Issues

• Privacy of data & algorithm
• Algorithms
– New computation-communication complexity model
– N-body problem, …

• Accounting: Associate specific work with specific host
– Correctness
– Compensation (how to quantify?)

• Create open source organization
– System infrastructure
– Application codes
