Evaluating Cloud Computing for HPC Applications
Lavanya Ramakrishnan, CRD & NERSC
• What are the unique needs and features of a science cloud?
  – NERSC Magellan user survey
• What applications can efficiently run on a cloud?
  – Benchmarking cloud technologies (Hadoop, Eucalyptus) and platforms (Amazon EC2, Azure)
• Are cloud computing programming models such as Hadoop effective for scientific applications?
  – Experimentation with early applications
• Can scientific applications use a data-as-a-service or software-as-a-service model?
• What are the security implications of user-controlled cloud images?
• Is it practical to deploy a single logical cloud across multiple DOE sites?
• What is the cost and energy efficiency of clouds?
Magellan Research Agenda
Magellan User Survey
Respondents by program office: Advanced Scientific Computing Research 17%, Biological and Environmental Research 9%, Basic Energy Sciences - Chemical Sciences 10%, Fusion Energy Sciences 10%, High Energy Physics 20%, Nuclear Physics 13%, Advanced Networking Initiative (ANI) Project 3%, Other 14%.

Surveyed features (original slide shows a bar chart, axis 0-90%):
• Access to additional resources
• Access to on-demand (commercial) paid resources closer to deadlines
• Ability to control software environments specific to my application
• Ability to share setup of software or experiments with collaborators
• Ability to control groups/users
• Exclusive access to the computing resources / ability to schedule independently of other groups/users
• Easier to acquire/operate than a local cluster
• Cost associativity (i.e., I can get 10 CPUs for 1 hr now or 2 CPUs for 5 hrs at the same cost)
• MapReduce programming model / Hadoop
• Hadoop File System
• User interfaces / science gateways: use of clouds to host science gateways and/or access to cloud resources through science gateways
• Infrastructure as a Service (IaaS)
  – Provides seemingly unlimited access to data storage and compute cycles
  – e.g., Amazon EC2, Eucalyptus
• Platform as a Service (PaaS)
  – Delivery of a computing platform/software stack
  – Containers/images for specific user groups
  – e.g., Hadoop, Azure
• Software as a Service (SaaS)
  – A specific function provided for use across multiple user groups (e.g., science gateways)
Cloud Computing Services
Magellan Software
[Diagram: Magellan software stack on the Magellan Cluster - Batch Queues, Virtual Machines (Eucalyptus), Hadoop, Private Clusters, Public or Remote Cloud, Science and Storage Gateways, with ANI connectivity]
• Web-service API to an IaaS offering (usage sketch follows this slide)
• Uses Xen paravirtualization
  – the Cluster Compute instance type uses hardware-assisted virtualization
• Non-persistent local disk in the VM
• Simple Storage Service (S3)
  – scalable, persistent object store
• Elastic Block Store (EBS)
  – persistent, block-level storage
Amazon Web Services
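A minimal sketch (not from the original slides) of driving this web-service API with the modern boto3 Python library; the AMI ID, instance type, key name, and availability zone are placeholders.

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    # Launch one instance; ImageId, InstanceType, and KeyName are placeholders.
    run = ec2.run_instances(
        ImageId="ami-00000000",
        InstanceType="c1.xlarge",
        KeyName="my-keypair",
        MinCount=1,
        MaxCount=1,
    )
    instance_id = run["Instances"][0]["InstanceId"]
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])

    # The instance's local disk is non-persistent; create and attach a
    # persistent EBS volume for data that must outlive the instance.
    vol = ec2.create_volume(Size=100, AvailabilityZone="us-east-1a")
    ec2.get_waiter("volume_available").wait(VolumeIds=[vol["VolumeId"]])
    ec2.attach_volume(VolumeId=vol["VolumeId"], InstanceId=instance_id,
                      Device="/dev/sdf")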
• Open source IaaS implementation
  – API-compatible with Amazon AWS
  – manages virtual machines
• Walrus & Block Storage
  – interfaces compatible with S3 & EBS
• Available to users on the Magellan testbed
• Private virtual clusters
  – scripts to manage dynamic virtual clusters (NFS, Torque, etc.); see the sketch below
Eucalyptus
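Because the Eucalyptus API is EC2-compatible, the dynamic virtual cluster scripts can be sketched (illustratively, not the actual Magellan scripts) with boto3 pointed at the private cloud front end; the endpoint URL, EMI ID, instance type, key name, and node count are all assumptions.

    import boto3

    # EC2-compatible client aimed at the Eucalyptus front end (URL is an assumption).
    euca = boto3.client(
        "ec2",
        endpoint_url="https://magellan-euca.example.gov:8773/services/Eucalyptus",
        region_name="eucalyptus",
    )

    # Start a small virtual cluster from a Eucalyptus machine image (emi-...).
    nodes = euca.run_instances(ImageId="emi-12345678", InstanceType="c1.medium",
                               KeyName="cluster-key", MinCount=5, MaxCount=5)
    ids = [i["InstanceId"] for i in nodes["Instances"]]
    euca.get_waiter("instance_running").wait(InstanceIds=ids)

    # Collect private IPs so a post-boot script can export NFS from the head
    # node and register the remaining nodes with Torque.
    desc = euca.describe_instances(InstanceIds=ids)
    ips = [i["PrivateIpAddress"] for r in desc["Reservations"] for i in r["Instances"]]
    print("\n".join(ips))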
• Platforms
  – Amazon, Azure, Lawrencium (LBNL IT cluster), Magellan
• Interconnect/virtualization configurations: IB, TCP over IB, TCP over Ethernet, VM
• Workloads
  – HPCC
  – NERSC-6 benchmarks
  – application pipelines (JGI, Supernova Factory)
• Metrics
  – performance, cost, reliability, programmability
Virtualization Impact
NERSC-6 Benchmark Performance [1/2]
[Chart: runtime relative to Carver for GAMESS, GTC, IMPACT, fvCAM, and MAESTRO256 on Amazon EC2, Lawrencium, EC2-Beta, EC2-Beta-Opt, Franklin, and Carver]
NERSC-6 Benchmark Performance [2/2]
[Chart: runtime relative to Carver for MILC and PARATEC on Amazon EC2, Lawrencium, EC2-Beta, EC2-Beta-Opt, Franklin, and Carver]
Magellan: NERSC6 Application Benchmarks
[Chart: percentage performance relative to native for GTC, PARATEC, and CAM under TCP over IB, TCP over Ethernet, and VM]
Magellan: HPCC
[Charts: percentage performance relative to native. Latency panel: Ping Pong and RandomRing latency under TCP over IB, TCP over Ethernet, and IB/VM. Bandwidth/compute panel: DGEMM, Ping Pong bandwidth, HPL, and PTRANS under TCP over IB, TCP over Ethernet, and VM]
• James Hamilton's cost model
  – http://perspectives.mvdirona.com/2010/09/18/OverallDataCenterCosts.aspx
  – expanded for HPC environments
• Quantify the difference in cost between IB and Ethernet (worked example below)
Performance-Cost Tradeoffs
Application class                                       Performance increase / cost increase
Tightly coupled with I/O (e.g., CAM)                    4.36
Tightly coupled with minimal I/O (e.g., PARATEC, GTC)   1.2
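One plausible reading of this metric is the ratio of the runtime improvement from InfiniBand to the added cost of the InfiniBand fabric. A small illustrative calculation follows; the runtimes and the 10% cost premium are invented for illustration, and only the resulting ratios come from the slide.

    # Performance increase / cost increase for choosing InfiniBand over Ethernet.
    # Assume (for illustration only) the IB fabric adds 10% to system cost.
    COST_INCREASE = 1.10

    def perf_per_cost(runtime_eth_hours, runtime_ib_hours, cost_increase=COST_INCREASE):
        """Ratio of performance gain to cost increase (values > 1 favor IB)."""
        performance_increase = runtime_eth_hours / runtime_ib_hours
        return performance_increase / cost_increase

    # Tightly coupled, I/O-heavy code (CAM-like): large runtime gain on IB.
    print(perf_per_cost(runtime_eth_hours=9.6, runtime_ib_hours=2.0))   # ~4.36
    # Tightly coupled code with minimal I/O (PARATEC/GTC-like): modest gain.
    print(perf_per_cost(runtime_eth_hours=2.64, runtime_ib_hours=2.0))  # ~1.2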
• Tools to measure the expansion history of the Universe and explore the nature of Dark Energy
  – largest data-volume supernova search
• Data pipeline
  – custom data analysis codes coordinated by Python scripts
  – runs on a standard Linux batch-queue cluster
• Cloud provides
  – control over OS versions
  – root access and a shared "group" account
  – immunity to externally enforced OS or architecture changes
Nearby Supernova Factory
Experiments on Amazon EC2
Experiment matrix:

Input data                               Output data
EBS via NFS                              Local storage to EBS via NFS
Staged to local storage from EBS         Local storage to EBS via NFS
EBS via NFS                              EBS via NFS
Staged to local storage from EBS         EBS via NFS
EBS via NFS                              Local storage to S3
Staged to local storage from S3          Local storage to S3
[Chart: total cost of an experiment (instance cost + storage cost/month) for the experiment matrix configurations EBS-A1, EBS-A2, EBS-B1, EBS-B2, S3-A, and S3-B, broken down into data cost per month (80), instance cost (80), data cost per month (40), and instance cost (40)]
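The total cost in the chart combines instance-hours with monthly storage charges; a hedged sketch of that bookkeeping follows. The function and every rate and usage figure below are placeholders for illustration, not the prices used in the study.

    def experiment_cost(instance_hours, instance_rate,
                        stored_gb, storage_rate_per_gb_month):
        """Cost of one run: compute time plus one month of data storage."""
        compute = instance_hours * instance_rate
        storage = stored_gb * storage_rate_per_gb_month
        return compute, storage, compute + storage

    # Example: 80 instances for 3 hours, 500 GB kept on EBS for a month
    # (all numbers, including the hourly and per-GB rates, are assumptions).
    print(experiment_cost(instance_hours=80 * 3, instance_rate=0.68,
                          stored_gb=500, storage_rate_per_gb_month=0.10))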
Magellan Software
[Architecture diagram repeated from the earlier Magellan Software slide: Batch Queues, Virtual Machines (Eucalyptus), Hadoop, Private Clusters, Public or Remote Cloud, Science and Storage Gateways, and ANI on the Magellan Cluster]
Hadoop Stack
• Open source, reliable, scalable distributed computing
  – implementation of MapReduce
  – Hadoop Distributed File System (HDFS)
[Diagram: Hadoop ecosystem stack - Core and Avro; MapReduce and HDFS; Pig, Chukwa, Hive, HBase; ZooKeeper. Source: Hadoop: The Definitive Guide]
Hadoop for Science
• Advantages of Hadoop
  – transparent data replication and data-locality-aware scheduling
  – fault-tolerance capabilities
• Mode of operation
  – use streaming to launch a script that calls the executable (see the sketch below)
  – HDFS for input; a shared file system is needed for the binary and database
  – input formats must handle multi-line inputs (BLAST sequences) and binary data (High Energy Physics)
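A minimal sketch of this streaming mode of operation (not the actual Magellan scripts; the executable path and one-record-per-line input format are assumptions): a small Python wrapper ships as the mapper and shells out to the scientific binary for each input record read from HDFS via stdin.

    #!/usr/bin/env python
    # wrapper.py: Hadoop streaming mapper that calls a scientific executable
    # once per input line delivered on stdin.
    import subprocess
    import sys

    for line in sys.stdin:
        record = line.rstrip("\n")
        # The executable (and any database it needs) lives on a shared file
        # system rather than in HDFS; the path is an assumption.
        output = subprocess.check_output(["/global/shared/bin/science_code", record])
        sys.stdout.write(output.decode())

A job like this is typically submitted with the streaming jar, e.g. hadoop jar .../hadoop-streaming-*.jar -input /data/in -output /data/out -mapper wrapper.py -file wrapper.py -reducer NONE (the jar location varies by Hadoop release).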
• Compare traditional parallel file systems to HDFS
  – TeraGen and TeraSort used to compare file system performance
  – 32 maps for TeraGen and 64 reduces for TeraSort over a terabyte of data (invocations sketched after the chart)
Hadoop Benchmarking: Early Results [1/2]
[Chart: time in seconds for TeraGen and TeraSort (64 reduces) on GPFS, HDFS, and Lustre]
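For reference, an illustrative way to drive the same TeraGen/TeraSort runs from Python; the examples-jar name, output paths, and 0.20-era property names are assumptions that vary by Hadoop release.

    import subprocess

    # TeraGen: 10^10 100-byte rows (~1 TB) written with 32 map tasks.
    subprocess.check_call([
        "hadoop", "jar", "hadoop-examples.jar", "teragen",
        "-Dmapred.map.tasks=32", "10000000000", "/benchmarks/teragen",
    ])

    # TeraSort: sort the generated data with 64 reduce tasks.
    subprocess.check_call([
        "hadoop", "jar", "hadoop-examples.jar", "terasort",
        "-Dmapred.reduce.tasks=64", "/benchmarks/teragen", "/benchmarks/terasort",
    ])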
Hadoop Benchmarking: Early Results [2/2]
[Chart: write throughput (MB/s) versus number of concurrent writers for 10 MB and 10 GB file sizes]
TestDFSIO was used to understand concurrency behavior at the default block size (invocation sketched below).
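An illustrative TestDFSIO invocation for such a sweep; the test-jar name varies by Hadoop release, and the file count and size shown are assumptions.

    import subprocess

    # Write 60 files of 10 GB each (TestDFSIO takes the size in MB) to measure
    # aggregate write throughput at the default HDFS block size.
    subprocess.check_call([
        "hadoop", "jar", "hadoop-test.jar", "TestDFSIO",
        "-write", "-nrFiles", "60", "-fileSize", "10240",
    ])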
[Diagram: IMG Systems genome & metagenome data flow. Recoverable figures: 5,115 genomes / 6.5 million genes; ~350-500 genomes (~0.5-1 million genes) added every 4 months; +330 genomes including 158 GEBA, 8.2 million genes, on demand; 65 samples from 21 studies (IMG + 2.6 million genes, 9.1 million total) updated monthly; +287 samples from ~105 studies (+12.5 million genes, 19 million total) on demand]
BLAST on Hadoop
• NCBI BLAST (2.2.22)
  – reference: IMG genomes, 6.5 million genes (~3 GB database)
  – full input set: 12.5 million metagenome genes searched against the reference
• BLAST on Hadoop
  – uses streaming to manage the input data sequences (see the mapper sketch below)
  – binary and databases kept on a shared file system
• BLAST task-farming implementation
  – server reads inputs and manages the tasks
  – client runs BLAST, copies the database to local disk or ramdisk once on startup, and pushes back results
  – advantages: fault-resilient, and allows incremental expansion as resources become available
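A hedged sketch of the streaming BLAST mapper described above (not the production NERSC code): the database path and the one-sequence-per-line, tab-separated input format are assumptions; the blastall flags are the legacy NCBI BLAST 2.2.22 interface.

    #!/usr/bin/env python
    # blast_mapper.py: Hadoop streaming mapper for NCBI BLAST 2.2.22.
    import os
    import subprocess
    import sys
    import tempfile

    DB = "/global/shared/img/reference_genes"  # ~3 GB database on shared storage (assumed path)

    for line in sys.stdin:
        seq_id, seq = line.rstrip("\n").split("\t", 1)  # one gene per line (assumed format)
        with tempfile.NamedTemporaryFile("w", suffix=".fa", delete=False) as query:
            query.write(">%s\n%s\n" % (seq_id, seq))
        result = subprocess.check_output(
            ["blastall", "-p", "blastp", "-d", DB, "-i", query.name, "-m", "8"])
        sys.stdout.write(result.decode())
        os.remove(query.name)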
BLAST Performance
[Chart: time in minutes versus number of processors (16, 32, 64, 128) for EC2-taskFarmer, Franklin-taskFarmer, EC2-Hadoop, and Azure]
• Initial configuration
  – Hadoop memory ulimit issues; Hadoop memory limits increased to accommodate high-memory tasks (illustrative settings follow this slide)
  – 1 map per node for high-memory tasks to reduce contention
  – thrashing when the database does not fit in memory
• NFS shared file system for the common database
  – moved the database to the local nodes (copy to local /tmp)
  – the initial copy takes 2 hours, but the BLAST job then completes in under 10 minutes
  – performance is equivalent to other cloud environments
  – future: experiment with DistributedCache
• Time to solution varies: no guarantee of simultaneous availability of resources
BLAST on Yahoo! M45 Hadoop
Strong user-group and sysadmin support was key to working through these issues.
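The memory-related fixes can be expressed as Hadoop 0.20-era configuration properties. The property names below are real, but the values are assumptions for illustration; the slides do not list the exact settings used on M45.

    # Illustrative settings for high-memory streaming tasks (values are assumptions).
    hadoop_memory_tuning = {
        # Raise the per-task JVM heap so the BLAST database fits without thrashing.
        "mapred.child.java.opts": "-Xmx4096m",
        # Raise the per-task virtual-memory ulimit (value in KB; assumed).
        "mapred.child.ulimit": "8388608",
        # One map slot per node to reduce memory contention between tasks.
        "mapred.tasktracker.map.tasks.maximum": "1",
    }
    # The mapred.child.* properties can be set per job (e.g., via -D options);
    # the tasktracker slot limit is a site-level setting. DistributedCache
    # (noted as future work) would push the database to each node instead of
    # the manual copy to local /tmp.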
• Porting applications still requires significant work
• Public clouds
  – virtualization has a performance impact
  – failures when creating large instances
  – data costs tend to be overwhelming
• Eucalyptus
  – learning curve and stability at scale
• Alternate stacks and trying version 2.0
  – SSE instructions were not exposed in the VM
• Additional benchmarking
Summary: Virtualization for Science
• Deployment challenges
  – all jobs run as user "hadoop", affecting file permissions
  – less control over how many nodes are used, which affects allocation policies
  – file system performance for large file sizes
• Programming challenges: no turn-key solution
  – using existing code bases, managing input formats and data
• Performance
  – BLAST over Hadoop: performance is comparable to existing systems
  – existing parallel file systems can be used through Hadoop On Demand
• Additional benchmarking and tuning needed; plug-ins for science
Summary: Hadoop for Science
Acknowledgements
This work was funded in part by the Advanced Scientific Computing Research (ASCR) program in the DOE Office of Science under contract number DE-AC02-05CH11231.

Thanks to CITRIS/UC, Yahoo! M45, Amazon EC2 Education and Research Grants, Microsoft Research, Wei Lu, Dennis Gannon, Masoud Nikravesh, and Greg Bell.

Magellan: Shane Canon, Iwona Sakrejda
Magellan benchmarking: Shane Canon, Nick Wright
EC2 benchmarking: Keith Jackson, Krishna Muriki, John Shalf, Shane Canon, Harvey Wasserman, Shreyas Cholia, Nick Wright
BLAST: Shreyas Cholia, Keith Jackson, Shane Canon, John Shalf
Supernova Factory: Keith Jackson, Rollin Thomas, Karl Runge
Questions? [email protected]