Programming Distributed Systems with High Level Abstractions
Douglas Thain
University of Notre Dame
23 October 2008
Distributed Systems
Scale: 2 – 100s – 1000s – millions
Domains: Single or Multi
Users: 1 – 10 – 100 – 1000 – 10000
Naming: Direct, Virtual
Scheduling: Timesharing / Space Sharing
Interface: Allocate CPU / Execute Job
Security: None / IP / PKI / KRB …
Storage: Embedded / External
Cloud Computing?
Scale: 2 – 100s – 1000s – 10000s
Domains: Single or Multi
Users: 1 – 10 – 100 – 1000 – 10000
Naming: Direct, Virtual
Scheduling: Timesharing / Space Sharing
Interface: Allocate CPU / Execute Job
Security: None / IP / PKI / KRB …
Storage: Embedded / External
Grid Computing?
Scale: 2 – 100s – 1000s – 10000s
Domains: Single or Multi
Users: 1 – 10 – 100 – 1000 – 10000
Naming: Direct, Virtual
Scheduling: Timesharing / Space Sharing
Interface: Allocate CPU / Execute Job
Security: None / IP / PKI / KRB …
Storage: Embedded / External
An Assembly Language of Distributed Computing
Fundamental Operations
– TransferFile( source, destination )
– ExecuteJob( host, exe, input, output )
– AllocateVM( cpu, mem, disk, opsys )
Semantics of Assembly are Subtle:
– When do instructions commit?
– Delay slots before control transfers?
– What exceptions are valid for each opcode?
– Precise or imprecise exceptions?
– What is the cost of each instruction?
Programming in Assembly Stinks
You know the problems:
– Stack management.
– Garbage collection.
– Type checking.
– Co-location of data and computation.
– Query optimizations.
– Function shipping or data shipping?
– How many nodes should I harness?
Abstractions for Distributed Computing
Abstraction: a declarative specification of the computation and data of a workload.
A restricted pattern, not meant to be a general purpose programming language.
Avoid the really terrible cases.
Provide users with a bright path.
Data structures instead of file systems.
All-Pairs Abstraction
AllPairs( set A, set B, function F )
returns matrix M where
M[i][j] = F( A[i], B[j] ) for all i,j
[Figure: AllPairs(A,B,F) applies F to every pairing of A1 … An with B1 … Bn, filling a grid of results.]
Moretti, Bulosan, Flynn, Thain, AllPairs: An Abstraction… IPDPS 2008
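The AllPairs definition above can be written down directly as a sequential reference version. This is only a minimal sketch of the stated semantics, not the distributed implementation the talk is about; the name `all_pairs` and the toy function in the example are assumptions made for illustration.

```python
from typing import Callable, List, Sequence, TypeVar

A = TypeVar("A")
B = TypeVar("B")
R = TypeVar("R")


def all_pairs(set_a: Sequence[A], set_b: Sequence[B],
              f: Callable[[A, B], R]) -> List[List[R]]:
    """Return matrix M where M[i][j] = f(set_a[i], set_b[j]) for all i, j."""
    return [[f(a, b) for b in set_b] for a in set_a]


# Toy example: F is addition (hypothetical; any two-argument function works).
m = all_pairs([1, 2, 3], [10, 20], lambda a, b: a + b)
print(m)
```

Everything an implementation is free to decide (where data lives, how calls are batched, how many nodes to use) is absent from this signature, which is the point of a declarative abstraction.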
Example Application
Goal: Design robust face comparison function.
[Figure: two example applications of F, producing scores of 0.05 and 0.97.]
Similarity Matrix Construction
1   .8  .1  0   0   .1
    1   0   .1  .1  0
        1   0   .1  .3
            1   0   0
                1   .1
                    1
Current Workload: 4000 images, 256 KB each, 10s per F (five days)
Future Workload: 60000 images, 1MB each, 1s per F (three months)
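The slide does not say how the durations in parentheses were obtained. As a rough check, the sketch below computes the total CPU time for a full N x N comparison and the wall-clock time if that work divided perfectly across a pool. The 500-CPU figure is borrowed from the next slide and its use here is an assumption, as are the function names.

```python
SECONDS_PER_DAY = 86400.0


def all_pairs_cpu_seconds(n_images: int, seconds_per_f: float) -> float:
    """CPU time to compare every image against every image (n * n calls)."""
    return n_images * n_images * seconds_per_f


def ideal_days(cpu_seconds: float, n_cpus: int) -> float:
    """Wall-clock days if the work split perfectly across n_cpus (an idealization)."""
    return cpu_seconds / n_cpus / SECONDS_PER_DAY


def dataset_bytes(n_images: int, bytes_per_image: int) -> int:
    """Total size of the input set."""
    return n_images * bytes_per_image


current = all_pairs_cpu_seconds(4000, 10.0)   # 1.6e8 CPU-seconds
future = all_pairs_cpu_seconds(60000, 1.0)    # 3.6e9 CPU-seconds

print(ideal_days(current, 500))               # about 3.7 days on 500 CPUs
print(ideal_days(future, 500))                # about 83 days on 500 CPUs
print(dataset_bytes(4000, 256 * 1024))        # about 1 GB of input images
```

Under these assumptions the results are of the same order as the slide's "five days" and "three months", and the current input set comes to roughly 1 GB.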
Non-Expert User Using 500 CPUs
Try 1: Each F is a batch job.
Failure: Dispatch latency >> F runtime.
[Figure: HN dispatching one F to each CPU.]
Try 2: Each row is a batch job.
Failure: Too many small ops on FS.
[Figure: HN dispatching rows of F to each CPU.]
Try 3: Bundle all files into one package.
Failure: Everyone loads 1GB at once.
[Figure: HN dispatching rows of F to each CPU.]
Try 4: User gives up and attempts to solve an easier or smaller problem.
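The failure in Try 1 can be made concrete with a toy cost model: each batch job pays a fixed dispatch latency before doing any useful work. The model and the numbers below (1 s per F, 30 s dispatch latency) are illustrative assumptions, not measurements from the talk.

```python
def job_efficiency(calls_per_job: int, f_runtime_s: float,
                   dispatch_latency_s: float) -> float:
    """Fraction of a job's wall-clock time spent running F.

    Toy model: one fixed dispatch delay per job, then calls_per_job
    sequential calls to F. Ignores data transfer and contention.
    """
    useful = calls_per_job * f_runtime_s
    return useful / (useful + dispatch_latency_s)


# Hypothetical numbers: F takes 1 s, dispatching a job takes 30 s.
one_f_per_job = job_efficiency(1, 1.0, 30.0)       # Try 1: a single F per job
one_row_per_job = job_efficiency(4000, 1.0, 30.0)  # Try 2: a row of 4000 F calls

print(round(one_f_per_job, 3), round(one_row_per_job, 3))
```

Batching by row removes the dispatch overhead in this model, which is why Try 2 then runs into a different bottleneck (the file system) that the model does not capture.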
All-Pairs Abstraction
AllPairs( set A, set B, function F )
returns matrix M where
M[i][j] = F( A[i], B[j] ) for all i,j
[Figure: AllPairs(A,B,F) grid of F over A1 … An and B1 … Bn.]
What is the right metric?
Speedup?
– Seq Runtime / Parallel Runtime
Parallel Efficiency?
– Speedup / N CPUs?
Neither works, because the number of CPUs varies over time and between runs.
Cost Efficiency
– Work Completed / Resources Consumed
– Person-Miles / Gallon
– Results / CPU-hours
– Results / $$$
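The three metrics above can be written as small functions. The sketch also shows why cost efficiency still works when the CPU count changes during a run: resources are summed over intervals instead of assuming one fixed N. The interval representation and all numbers are assumptions made for illustration.

```python
from typing import Iterable, Tuple


def speedup(seq_runtime: float, parallel_runtime: float) -> float:
    """Seq Runtime / Parallel Runtime."""
    return seq_runtime / parallel_runtime


def parallel_efficiency(seq_runtime: float, parallel_runtime: float,
                        n_cpus: int) -> float:
    """Speedup / N CPUs. Only meaningful if n_cpus is constant for the run."""
    return speedup(seq_runtime, parallel_runtime) / n_cpus


def cpu_hours(usage: Iterable[Tuple[int, float]]) -> float:
    """Total CPU-hours from (cpus_held, hours) intervals."""
    return sum(cpus * hours for cpus, hours in usage)


def cost_efficiency(work_completed: float, resources_consumed: float) -> float:
    """Work Completed / Resources Consumed, e.g. results per CPU-hour."""
    return work_completed / resources_consumed


# Hypothetical run: the pool size changes three times.
usage = [(100, 2.0), (400, 1.0), (50, 4.0)]
total = cpu_hours(usage)                       # 800 CPU-hours
print(cost_efficiency(16_000_000, total))      # results per CPU-hour
```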
Classify Abstraction
Classify( T, R, N, P, F )
T = testing set
R = training set
N = # of partitions
F = classifier
[Figure: Classify dataflow, with labels T, R, P, T1–T3, F, V1–V3, C, V.]
Moretti, Steinhauser, Thain, Chawla, Scaling up Classifiers to Cloud Computers, ICDM 2008.
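One way to read the Classify signature is sketched below. The slide does not define P; here it is assumed to be a partitioning function that splits the training set R into N parts. Combining the per-partition results by majority vote, the round-robin partitioner, and the nearest-neighbour classifier are likewise assumptions chosen to keep the example small.

```python
from collections import Counter


def classify(T, R, N, P, F):
    """Sketch of Classify(T, R, N, P, F).

    P(R, N) -> list of N training partitions (assumed meaning of P).
    F(partition, T) -> one predicted label per item of T.
    The per-partition predictions are combined by majority vote (assumed).
    """
    partitions = P(R, N)
    votes = [F(part, T) for part in partitions]
    return [Counter(item_votes).most_common(1)[0][0]
            for item_votes in zip(*votes)]


def round_robin(R, N):
    """Hypothetical partitioner: deal training examples out in turn."""
    return [R[i::N] for i in range(N)]


def nearest_neighbour(train, test):
    """Hypothetical classifier: label of the closest training value."""
    return [min(train, key=lambda ex: abs(ex[0] - x))[1] for x in test]


R = [(0.0, "a"), (0.1, "a"), (0.2, "a"), (1.0, "b"), (1.1, "b"), (1.2, "b")]
T = [0.05, 1.05]
print(classify(T, R, 3, round_robin, nearest_neighbour))
```

Each call to F touches only its own partition, so the N calls are independent and could be placed on separate nodes.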
BXGrid Abstractions
[Figure: records labeled by eye and eye color (L brown, L blue, R brown, R brown) are selected into S1–S3, transformed by F, compared with an AllPairs grid over A1–A3 and B1–B3, and summarized as a ROC curve.]
S = Select( color=“brown” )
B = Transform( S,F )
M = AllPairs( A, B, F )
Bui, Thomas, Kelly, Lyon, Flynn, Thain, BXGrid: A Repository and Experimental Abstraction… in review 2008.
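The three-step pipeline above (Select, Transform, AllPairs) can be mimicked in memory to show how the abstractions compose. This is not BXGrid's interface: the record layout, the function names, and the toy feature and comparison functions are all assumptions. The slide reuses the name F for both the transform and the comparison and does not define A, so the sketch uses two separate functions and compares the transformed set against itself.

```python
def select(records, **criteria):
    """Keep records whose fields equal every given criterion."""
    return [r for r in records
            if all(r.get(k) == v for k, v in criteria.items())]


def transform(selected, g):
    """Apply g to each selected record."""
    return [g(r) for r in selected]


def all_pairs(set_a, set_b, f):
    """M[i][j] = f(set_a[i], set_b[j])."""
    return [[f(a, b) for b in set_b] for a in set_a]


# Hypothetical repository contents, loosely following the figure's labels.
records = [
    {"eye": "L", "color": "brown", "data": [1, 2, 3]},
    {"eye": "L", "color": "blue", "data": [4, 4, 4]},
    {"eye": "R", "color": "brown", "data": [2, 2, 2]},
    {"eye": "R", "color": "brown", "data": [0, 1, 1]},
]

S = select(records, color="brown")
B = transform(S, lambda r: sum(r["data"]))       # toy feature extraction
M = all_pairs(B, B, lambda x, y: abs(x - y))     # toy comparison
print(M)
```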
Implementing Abstractions
S = Select( color=“brown” )
B = Transform( S,F )
M = AllPairs( A, B, F )
[Figure labels: DBMS; Relational Database (2x); Active Storage Cluster (16x); Condor Pool (500x).]
Compatibility of Abstractions?
[Figure: Map-Reduce, All-Pairs, and Classify layered on top of Assembly Language.]
Compatibility of Abstractions?
[Figure: All-Pairs and Classify layered on top of Map-Reduce, on top of Assembly Language ???]
Mismatch: MR relies on data partition. AP relies on data re-use.
Mismatch: Classify partitions logically. MR partitions physically.
Compatibility of Abstractions?
[Figure: Map-Reduce, All-Pairs, and Classify on top of Assembly Language, alongside Swift and Dryad.]
Swift, Dryad: More General, Less Optimized?
From Clouds to Multicore
Next Step: AP Implementation that runs well on Single CPU, Multicore, Cloud, or Cloud of Multicores.
[Figure: the abstraction stack (Map-Reduce, All-Pairs, Classify, Dryad, Swift over Assembly Language) drawn twice: once over four CPUs, and once over four CPUs each with $$$ above a shared RAM.]
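A small sketch of what keeping the abstraction fixed while swapping the execution back end could look like on one machine: the same row-wise AllPairs runs on whatever executor it is handed. This is an illustration using Python's standard `concurrent.futures` interface, not the implementation the slide proposes; the function name is an assumption.

```python
from concurrent.futures import Executor, ThreadPoolExecutor


def all_pairs_rows(set_a, set_b, f, executor: Executor):
    """Compute M[i][j] = f(set_a[i], set_b[j]), one task per row.

    executor.map returns results in input order, so rows line up with set_a.
    A ProcessPoolExecutor would need a picklable, module-level worker
    instead of this closure.
    """
    def row(a):
        return [f(a, b) for b in set_b]

    return list(executor.map(row, set_a))


with ThreadPoolExecutor(max_workers=4) as pool:
    m = all_pairs_rows([1, 2, 3], [10, 20], lambda a, b: a * b, pool)
print(m)
```

In CPython, threads will not speed up a CPU-bound F; the thread pool is used here only so the example is self-contained. Each row reuses all of set B, which fits the earlier remark that AP relies on data re-use.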
Acknowledgments
Cooperative Computing Lab
– http://www.cse.nd.edu/~ccl
Grad Students:
– Chris Moretti
– Hoang Bui
– Michael Albrecht
– Li Yu
NSF Grants CCF-0621434, CNS-0643229
Undergraduate Students
– Mike Kelly
– Rory Carmichael
– Mark Pasquier
– Christopher Lyon
– Jared Bulosan