40
Using Small Using Small Abstractions Abstractions to Program to Program Large Distributed Large Distributed Systems Systems Douglas Thain Douglas Thain University of Notre Dame University of Notre Dame 11 December 2008 11 December 2008

Using Small Abstractions to Program Large Distributed Systems

  • Upload
    alma

  • View
    25

  • Download
    0

Embed Size (px)

DESCRIPTION

Using Small Abstractions to Program Large Distributed Systems. Douglas Thain University of Notre Dame 11 December 2008. Using Small Abstractions to Program Large Distributed Systems. (And multicore computers!). Douglas Thain University of Notre Dame 11 December 2008. - PowerPoint PPT Presentation

Citation preview

Page 1: Using Small Abstractions to Program Large Distributed Systems

Using Small AbstractionsUsing Small Abstractionsto Programto Program

Large Distributed SystemsLarge Distributed Systems

Douglas ThainDouglas Thain

University of Notre DameUniversity of Notre Dame

11 December 200811 December 2008

Page 2: Using Small Abstractions to Program Large Distributed Systems

Using Small AbstractionsUsing Small Abstractionsto Programto Program

Large Distributed SystemsLarge Distributed Systems

Douglas ThainDouglas Thain

University of Notre DameUniversity of Notre Dame

11 December 200811 December 2008

(And multicore computers!)

Page 3: Using Small Abstractions to Program Large Distributed Systems

Clusters, clouds, and gridsClusters, clouds, and gridsgive us access to zillions of CPUs.give us access to zillions of CPUs.

How do we write How do we write programsprograms that can that canrun effectively in large systems?run effectively in large systems?

Page 4: Using Small Abstractions to Program Large Distributed Systems

I have 10,000 iris images acquired in my research lab. I want to reduce each one to a feature space, and then compare all of them to each other. I want to spend my time doing science, not struggling with computers.

I have a laptop.

I own a few machines I can buy time from Amazon or TeraGrid.

Now What?

Page 5: Using Small Abstractions to Program Large Distributed Systems

How do a I program a CPU?How do a I program a CPU?

I write the algorithm in a language that I find I write the algorithm in a language that I find convenient: C, Fortran, Python, etc…convenient: C, Fortran, Python, etc…

The compiler chooses instructions for the The compiler chooses instructions for the CPU, even if I don’t know assembly.CPU, even if I don’t know assembly.

The operating system allocates memory, The operating system allocates memory, moves data between disk and memory, and moves data between disk and memory, and manages the cache.manages the cache.

To move to a different CPU, recompile or use To move to a different CPU, recompile or use a VM, but a VM, but don’t change the programdon’t change the program..

Page 6: Using Small Abstractions to Program Large Distributed Systems

How do I program the grid/cloud?How do I program the grid/cloud?

Split the workload into pieces.Split the workload into pieces.– How much work to put in a single job?How much work to put in a single job?

Decide how to move data.Decide how to move data.– Demand paging, streaming, file transfer?Demand paging, streaming, file transfer?

Express the problem in a workflow Express the problem in a workflow language or programming environment.language or programming environment.– DAG / MPI / Pegasus / Taverna / Swift ?DAG / MPI / Pegasus / Taverna / Swift ?

Babysit the problem as it runs.Babysit the problem as it runs.– Worry about disk / network / failures…Worry about disk / network / failures…

Page 7: Using Small Abstractions to Program Large Distributed Systems

How do I program on 128 cores?How do I program on 128 cores?

Split the workload into pieces.Split the workload into pieces.– How much work to put in a single thread?How much work to put in a single thread?

Decide how to move data.Decide how to move data.– Shared memory, message passing, streaming?Shared memory, message passing, streaming?

Express the problem in a workflow language Express the problem in a workflow language or programming environment.or programming environment.– OpenMP, MPI, PThreads, Cilk, …OpenMP, MPI, PThreads, Cilk, …

Babysit the problem as it runs.Babysit the problem as it runs.– Implement application level checkpoints.Implement application level checkpoints.

Page 8: Using Small Abstractions to Program Large Distributed Systems

Tomorrow’s distributed systems will Tomorrow’s distributed systems will be clouds of multicore computers.be clouds of multicore computers.

So, we have to solve So, we have to solve bothboth problems problemswith a single model.with a single model.

Page 9: Using Small Abstractions to Program Large Distributed Systems

ObservationObservation

In a given field of study, a single person In a given field of study, a single person may repeat the same may repeat the same pattern of work of work many times, making slight changes to the many times, making slight changes to the data and algorithms.data and algorithms.

Examples everyone knows:Examples everyone knows:– Parameter sweep on a simulation code.Parameter sweep on a simulation code.– BLAST search across multiple databases.BLAST search across multiple databases.

Are there other examples?.Are there other examples?.

Page 10: Using Small Abstractions to Program Large Distributed Systems

AbstractionsAbstractionsfor Distributed Computingfor Distributed Computing

Abstraction: a Abstraction: a declarative specificationdeclarative specification of the computation and data of a workload.of the computation and data of a workload.

A A restricted patternrestricted pattern, not meant to be a , not meant to be a general purpose programming language.general purpose programming language.

UsesUses data structures instead of files. instead of files.

Provide users with a Provide users with a bright path..

Regular structure makes it tractable to Regular structure makes it tractable to model and model and predict performance.predict performance.

Page 11: Using Small Abstractions to Program Large Distributed Systems

All-Pairs AbstractionAll-Pairs Abstraction

AllPairs( set A, set B, function F )AllPairs( set A, set B, function F )

returns matrix M wherereturns matrix M where

M[i][j] = F( A[i], B[j] ) for all i,jM[i][j] = F( A[i], B[j] ) for all i,j

B1

B2

B3

A1 A2 A3

F F F

A1A1

An

B1B1

Bn

F

AllPairs(A,B,F)F

F F

F F

FMoretti, Bulosan, Flynn, Thain,AllPairs: An Abstraction… IPDPS 2008

allpairs A B F.exe

Page 12: Using Small Abstractions to Program Large Distributed Systems

Example ApplicationExample Application

Goal: Design robust face comparison function.Goal: Design robust face comparison function.

F

0.05

F

0.97

Page 13: Using Small Abstractions to Program Large Distributed Systems

Similarity Matrix ConstructionSimilarity Matrix Construction

11 .8.8 .1.1 00 00 .1.1

11 00 .1.1 .1.1 00

11 00 .1.1 .3.3

11 00 00

11 .1.1

11

F

Current Workload:4000 images256 KB each10s per F(five days)

Future Workload:60000 images1MB each1s per F(three months)

Page 14: Using Small Abstractions to Program Large Distributed Systems

http://www.cse.nd.edu/~ccl/viz

Page 15: Using Small Abstractions to Program Large Distributed Systems

Non-Expert User Using 500 CPUsNon-Expert User Using 500 CPUsTry 1: Each F is a batch job.Failure: Dispatch latency >> F runtime.

HN

CPU CPU CPU CPUF F F FCPUF

Try 2: Each row is a batch job.Failure: Too many small ops on FS.

HN

CPU CPU CPU CPUF F F FCPUFFFF FF

FFFF

FFF

FFF

Try 3: Bundle all files into one package.Failure: Everyone loads 1GB at once.

HN

CPU CPU CPU CPUF F F FCPUFFFF FF

FFFF

FFF

FFF

Try 4: User gives up and attemptsto solve an easier or smaller problem.

Page 16: Using Small Abstractions to Program Large Distributed Systems

All-Pairs AbstractionAll-Pairs Abstraction

AllPairs( set A, set B, function F )AllPairs( set A, set B, function F )

returns matrix M wherereturns matrix M where

M[i][j] = F( A[i], B[j] ) for all i,jM[i][j] = F( A[i], B[j] ) for all i,j

B1

B2

B3

A1 A2 A3

F F F

A1A1

An

B1B1

Bn

F

AllPairs(A,B,F)F

F F

F F

F

Page 17: Using Small Abstractions to Program Large Distributed Systems
Page 18: Using Small Abstractions to Program Large Distributed Systems

How Many CPUs on the Cloud?How Many CPUs on the Cloud?

Page 19: Using Small Abstractions to Program Large Distributed Systems

What is the right metric?What is the right metric?

Page 20: Using Small Abstractions to Program Large Distributed Systems

How to measure in clouds?How to measure in clouds?

Speedup?Speedup?– Seq Runtime / Parallel RuntimeSeq Runtime / Parallel Runtime

Parallel Efficiency?Parallel Efficiency?– Speedup / N CPUs?Speedup / N CPUs?

Neither works, because the number of CPUs Neither works, because the number of CPUs varies over time and between runs.varies over time and between runs.

An Alternative: Cost EfficiencyAn Alternative: Cost Efficiency– Work Completed / Resources ConsumedWork Completed / Resources Consumed– Person-Miles / GallonPerson-Miles / Gallon– Results / CPU-hoursResults / CPU-hours– Results / $$$Results / $$$

Page 21: Using Small Abstractions to Program Large Distributed Systems

All-Pairs Abstraction

Page 22: Using Small Abstractions to Program Large Distributed Systems

Wavefront ( R[x,0], R[0,y], F(x,y,d) )Wavefront ( R[x,0], R[0,y], F(x,y,d) )

R[4,2]

R[3,2] R[4,3]

R[4,4]R[3,4]R[2,4]

R[4,0]R[3,0]R[2,0]R[1,0]R[0,0]

R[0,1]

R[0,2]

R[0,3]

R[0,4]

Fx

yd

Fx

yd

Fx

yd

Fx

yd

Fx

yd

Fx

yd

F

F

y

y

x

x

d

d

x F Fx

yd yd

Page 23: Using Small Abstractions to Program Large Distributed Systems

Implementing WavefrontImplementing Wavefront

CompleteInput

CompleteOutput

DistributedMaster

MulticoreMaster

MulticoreMaster

F

F

F

F

Input

Output

Input

Output

F

F

F

F

Page 24: Using Small Abstractions to Program Large Distributed Systems

The Performance ProblemThe Performance Problem

Dispatch latency really matters: a delay in one Dispatch latency really matters: a delay in one holds up all of its children.holds up all of its children.If we dispatch larger sub-problems:If we dispatch larger sub-problems:– Concurrency on each node increases.Concurrency on each node increases.– Distributed concurrency decreases.Distributed concurrency decreases.

If we dispatch smaller sub-problems:If we dispatch smaller sub-problems:– Concurrency on each node decreases.Concurrency on each node decreases.– Spend more time waiting for jobs to be dispatched.Spend more time waiting for jobs to be dispatched.

So, model the system to choose the block size.So, model the system to choose the block size.And, build a fast-dispatch execution system.And, build a fast-dispatch execution system.

Page 25: Using Small Abstractions to Program Large Distributed Systems

Model of 1000x1000 WavefrontModel of 1000x1000 Wavefront

Page 26: Using Small Abstractions to Program Large Distributed Systems

500x500 Wavefront on ~200 CPUs500x500 Wavefront on ~200 CPUs

Page 27: Using Small Abstractions to Program Large Distributed Systems

Wavefront on a 200-CPU GridWavefront on a 200-CPU Grid

Page 28: Using Small Abstractions to Program Large Distributed Systems

Wavefront on a 32-Core CPUWavefront on a 32-Core CPU

Page 29: Using Small Abstractions to Program Large Distributed Systems

T2

Classify AbstractionClassify Abstraction

Classify( T, R, N, P, F )Classify( T, R, N, P, F )

T = testing setT = testing set R = training setR = training set

N = # of partitionsN = # of partitions F = classifierF = classifier

P

T1

T3

F

F

F

T

R

V1

V2

V3

C V

Moretti, Steinhauser, Thain, Chawla,Scaling up Classifiers to Cloud Computers, ICDM 2008.

Page 30: Using Small Abstractions to Program Large Distributed Systems
Page 31: Using Small Abstractions to Program Large Distributed Systems

From AbstractionsFrom Abstractionsto a Distributed Languageto a Distributed Language

Page 32: Using Small Abstractions to Program Large Distributed Systems

What Other AbstractionsWhat Other AbstractionsMight Be Useful?Might Be Useful?

Map( set S, F(s) )Map( set S, F(s) )Explore( F(x), x: [a….b] )Explore( F(x), x: [a….b] )Minimize( F(x), delta )Minimize( F(x), delta )Minimax( state s, A(s), B(s) )Minimax( state s, A(s), B(s) )Search( state s, F(s), IsTerminal(s) )Search( state s, F(s), IsTerminal(s) )Query( properties ) -> set of objectsQuery( properties ) -> set of objects

FluidFlow( V[x,y,z], F(v), delta )FluidFlow( V[x,y,z], F(v), delta )

Page 33: Using Small Abstractions to Program Large Distributed Systems

How do we connect multiple How do we connect multiple abstractions together?abstractions together?

Need a meta-language, perhaps with its own Need a meta-language, perhaps with its own atomic operations for simple tasks:atomic operations for simple tasks:Need to manage (possibly large) intermediate Need to manage (possibly large) intermediate storage between operations.storage between operations.Need to handle data type conversions between Need to handle data type conversions between almost-compatible components.almost-compatible components.Need type reporting and error checking to avoid Need type reporting and error checking to avoid expensive errors.expensive errors.If abstractions are feasible to model, then it may If abstractions are feasible to model, then it may be feasible to model entire programs.be feasible to model entire programs.

Page 34: Using Small Abstractions to Program Large Distributed Systems

Connecting Abstractions in BXGridConnecting Abstractions in BXGrid

B1

B2

B3

A1 A2 A3

F F F

F

F F

F F

F

L brown

L blue

R brown

R brown

S1

S2

S3

eye color

F

F

F

ROCCurve

S = Select( color=“brown” )

B = Transform( S,F )

M = AllPairs( A, B, F )

Bui, Thomas, Kelly, Lyon, Flynn, ThainBXGrid: A Repository and Experimental Abstraction… poster at IEEE eScience 2008

Page 35: Using Small Abstractions to Program Large Distributed Systems

Implementing AbstractionsImplementing Abstractions

S = Select( color=“brown” )

B = Transform( S,F )

M = AllPairs( A, B, F )

DBMS

Relational Database (2x)

Active Storage Cluster (16x)

CPU

Relational Database

CPU CPU CPU

CPU CPU CPU CPU

Condor Pool (500x)

Page 36: Using Small Abstractions to Program Large Distributed Systems
Page 37: Using Small Abstractions to Program Large Distributed Systems

What is the Most Useful ABI?What is the Most Useful ABI?

Functions can come in many forms:Functions can come in many forms:– Unix ProgramUnix Program– C function in source formC function in source form– Java function in binary formJava function in binary form

Datasets come in many forms:Datasets come in many forms:– Set: list of files, delimited single file, or database query.Set: list of files, delimited single file, or database query.– Matrix: sparse elem list, or binary layoutMatrix: sparse elem list, or binary layout

Our current implementations require a particular Our current implementations require a particular form. With a carefully stated ABI, abstractions form. With a carefully stated ABI, abstractions could work with many different user communities.could work with many different user communities.

Page 38: Using Small Abstractions to Program Large Distributed Systems

What is the type system?What is the type system?

Files have an obvious technical type:Files have an obvious technical type:– JPG, BMP, TXT, PTS, …JPG, BMP, TXT, PTS, …

But they also have a But they also have a logicallogical type: type:– JPG: Face, Iris, Fingerprint etc.JPG: Face, Iris, Fingerprint etc.– (This comes out of the BXGrid repository.)(This comes out of the BXGrid repository.)

The meta-language can easily perform The meta-language can easily perform automatic conversions between technical types, automatic conversions between technical types, and between some logical types:and between some logical types:– JPG/Face -> BMP/Face via ImageMagickJPG/Face -> BMP/Face via ImageMagick– JPG/Iris -> BIN/IrisCode via ComputeIrisCodeJPG/Iris -> BIN/IrisCode via ComputeIrisCode– JPG/Face -> JPG/Iris is not allowed.JPG/Face -> JPG/Iris is not allowed.

Page 39: Using Small Abstractions to Program Large Distributed Systems

Abstractions ReduxAbstractions Redux

Mapping general-purpose programs to Mapping general-purpose programs to arbitrary distributed/multicore systems is arbitrary distributed/multicore systems is algorithmically complex and full of pitfalls.algorithmically complex and full of pitfalls.

But, mapping a single abstraction is a But, mapping a single abstraction is a tractable problem that can be optimized, tractable problem that can be optimized, modeled, and re-used.modeled, and re-used.

Can we combine multiple abstractions Can we combine multiple abstractions together to achieve both expressive power together to achieve both expressive power and tractable performance?and tractable performance?

Page 40: Using Small Abstractions to Program Large Distributed Systems

AcknowledgmentsAcknowledgments

Cooperative Computing LabCooperative Computing Lab– http://www.cse.nd.edu/~cclhttp://www.cse.nd.edu/~ccl

Grad StudentsGrad Students– Chris MorettiChris Moretti– Hoang Bui Hoang Bui – Karsten Karsten

SteinhauserSteinhauser– Li YuLi Yu– Michael AlbrechtMichael Albrecht

Faculty:Faculty:– Patrick FlynnPatrick Flynn– Nitesh ChawlaNitesh Chawla– Kenneth JuddKenneth Judd– Scott EmrichScott Emrich

NSF Grants CCF-0621434, CNS-0643229NSF Grants CCF-0621434, CNS-0643229

UndergradsUndergrads– Mike KellyMike Kelly– Rory CarmichaelRory Carmichael– Mark PasquierMark Pasquier– Christopher LyonChristopher Lyon– Jared BulosanJared Bulosan