1. Innovation Recherche: Feedback on Big Compute & HPC on Windows Azure
Antoine Poliakov, HPC Consultant, ANEO
[email protected] | http://blog.aneo.eu
2. Introduction: HPC, a challenge for the cloud
Cloud: on-demand access, through a telecommunications network, to shared and user-configurable IT resources.
HPC (High Performance Computing): a branch of computer science concerned with maximizing software efficiency, in particular in terms of execution speed.
Raw computing power doubles every 1.5 - 2 years; network throughput doubles every 2 - 3 years; the compute/network gap thus doubles every 5 years.
HPC in the cloud makes computing power accessible to all (SMEs, research labs, etc.) and fosters innovation.
Our question: can the cloud offer sufficient performance for HPC workloads? CPU: 100% of native speed. RAM: 99% of native speed. Network: ???
3. Introduction: 3 ingredients yield an answer through experimentation
- Technology: an HPC-oriented cloud
- Use-case: HPC software
- Experiments
Together they give a state of the art of HPC in the cloud.
4. Introduction: experimenting on HPC in the cloud, our approach
1. Identify technologies and partners: an HPC software use-case, an efficient cloud computing service
2. Port the applicative HPC code (cluster → cloud): skills improvement, feedback on the technologies
3. Experiment and measure performance: scaling, data transfers
5. Introduction: a collaborative project with 3 complementary actors
- Research: established HPC research teams (distributed software & big data; machine learning and interactive systems). Goals: is the cloud ready for scientific computing? What is specific to deploying in the cloud? Performance?
- Provider: Windows Azure provides a cloud solution aimed at HPC workloads, Azure Big Compute. Goals: pre-release feedback; an inside view of an HPC cluster → cloud transition.
- Consulting firm: organization and technologies; HPC practice, fast/massive information processing for finance and industry. Goals: identify the most relevant use-cases for our clients; estimate the complexity of porting and deploying an app; evaluate whether the solution is production-ready.
6. Introduction: dedicated and competent teams, thank you all!
- Research (use-case: distributed audio segmentation): Stéphane Rossignol, Assistant Professor, Signal processing; Stéphane Vialle, Professor, Computer science; Kévin Dehlinger, Computer science intern, CNAM
- Provider (created the technical solution, made available notable computational power): Xavier Pillons, Principal Program Manager, Windows Azure CAT
- Consulting (ported and deployed the application in the cloud, led the benchmarks and experiments analysis): Constantinos Makassikis, HPC Consultant; Wilfried Kirschenmann, HPC Consultant; Antoine Poliakov, HPC Consultant
7. Presentation contents
1. Technical context
2. Feedback on porting the application
3. Optimizations
4. Results
8. 1. TECHNICAL CONTEXT
a. Azure Big Compute
b. ParSon
9. Azure Big Compute = new Azure nodes + HPC Pack
New nodes: A8 and A9
- 2×8 Sandy Bridge cores (Intel Xeon E5-2670 @ 2.6 GHz), 112 GB DDR3 @ 1.6 GHz
- InfiniBand (Network Direct @ 40 Gbit/s): RDMA via MS-MPI @ 3.5 GB/s, ~3 µs latency
- IP over Ethernet @ 10 Gbit/s; 2 TB HDD @ 250 MB/s
- Azure hypervisor
HPC Pack
- Task scheduler middleware: Cluster Manager + SDK
- Tested with 50k cores in Azure
- Free extension pack: any Windows Server install can be a node
10. Azure Big Compute: HPC Pack, on-premise cluster
Active Directory, manager and nodes in a privately managed infrastructure. Cluster dimensioned w.r.t. maximal workload. Administration: hardware + software.
[Diagram: on-premise Active Directory (AD), manager (M) and compute nodes (N)]
11. Azure Big Compute: HPC Pack, in the Azure Big Compute cloud
Active Directory and manager in the cloud (IaaS VMs); PaaS nodes. Node allocation and pricing on demand. Administration: software only.
[Diagram: AD and manager as IaaS VMs, PaaS compute nodes, driven through remote desktop/CLI]
12. Azure Big Compute: HPC Pack, hybrid deployment
Active Directory and manager on premise; nodes both in the datacenter and in the cloud, connected through a VPN. Local dimensioning w.r.t. average load; dynamic cloud dimensioning absorbs peaks. Administration: software + hardware.
[Diagram: on-premise AD, manager and nodes, plus cloud nodes behind a VPN]
13. ParSon: an audio segmentation scientific software
ParSon = audio segmentation algorithm: voice / music
1. Supervised training on known audio samples to calibrate the classifier
2. Classification based on spectral analysis (FFT) on sliding windows, as sketched below
[Diagram: digital audio → ParSon segmentation and classification → voice and music segments]
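To make step 2 concrete, here is a minimal sketch of the spectral analysis of one sliding window, assuming fftw (listed later among ParSon's dependencies); the helper name, window length and Hann weighting are our illustration, not ParSon source:

    // Illustrative sketch, not ParSon code: magnitude spectrum of one analysis
    // window. ParSon slides such windows along the signal and feeds the spectra
    // to the classifier calibrated in step 1.
    #include <fftw3.h>
    #include <cmath>
    #include <complex>
    #include <cstddef>
    #include <vector>

    std::vector<double> windowSpectrum(const float* samples, int n) {
        const double PI = 3.14159265358979323846;
        std::vector<double> in(n);
        for (int i = 0; i < n; ++i)          // Hann window limits spectral leakage
            in[i] = samples[i] * 0.5 * (1.0 - std::cos(2.0 * PI * i / (n - 1)));

        // FFTW documents std::complex<double> as layout-compatible with fftw_complex
        std::vector<std::complex<double>> out(n / 2 + 1);
        fftw_plan plan = fftw_plan_dft_r2c_1d(
            n, in.data(), reinterpret_cast<fftw_complex*>(out.data()), FFTW_ESTIMATE);
        fftw_execute(plan);
        fftw_destroy_plan(plan);

        std::vector<double> mag(out.size()); // |X(k)| for each frequency bin
        for (std::size_t k = 0; k < out.size(); ++k)
            mag[k] = std::abs(out[k]);
        return mag;
    }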
14. ParSon is distributed with OpenMP + MPI, on a Linux cluster
1. Upload input files to the NAS
2. OAR reserves N computers
3. Input deployment to the reserved computers
4. MPI exec
5. Tasks run, with heavy inter-communications
6. Get outputs
[Diagram: data flows through the NAS; OAR and MPI carry the control flow]
15. ParSon: performance is limited by data transfers
[Figure: best runtime (s) vs. number of nodes (1 to 256, log scales), cold cache. When nodes read from the NAS over the network, the run is IO bound; when nodes read locally, it scales.]
16. 2. PORTING THE APPLICATION
a. Porting C++ code: Linux → Windows
b. Porting the distribution strategy: cluster → HPC Cluster Manager
c. Porting and adapting deployment scripts
17. Porting: standards conformance = easy Linux → Windows porting
ParSon and Visual C++ conform to the C++ standard → few code changes. Dependencies are the standard libraries and cross-platform scientific libraries: libsndfile, fftw. Thanks to MS-MPI, the inter-process communication code doesn't change. Visual Studio natively supports OpenMP. The only task left was translating build files: Makefiles → Visual C++ projects.
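As an illustration of why the port was cheap, here is a minimal sketch (ours, not ParSon code) of the standard MPI + OpenMP pattern in question: the same source builds against MPICH or Open MPI with GCC on Linux and against MS-MPI with Visual C++ (/openmp) on Windows:

    // Hybrid pattern: MPI distributes work across nodes, OpenMP threads within
    // a node. Every call below is standard MPI / OpenMP, so nothing changes
    // between the Linux and Windows builds.
    #include <mpi.h>
    #include <omp.h>
    #include <cstdio>

    int main(int argc, char** argv) {
        MPI_Init(&argc, &argv);
        int rank = 0, size = 1;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        double local = 0.0;
        #pragma omp parallel for reduction(+:local)  // intra-node threading
        for (int i = rank; i < 1000000; i += size)   // cyclic split across ranks
            local += 1.0 / (1.0 + i);

        double total = 0.0;                          // combine partial sums
        MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
        if (rank == 0) std::printf("sum = %f\n", total);
        MPI_Finalize();
        return 0;
    }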
18. Porting: ParSon in the cluster
1. Upload input file to the NAS
2. OAR reserves N computers
3. Input deployment
4. MPI exec
5. Run and inter-communication
6. Get output
[Diagram: Linux cluster of reserved computers; data through the NAS, control through OAR/MPI]
19. Porting: ParSon in the Azure cloud
1. Upload input file (HPC Pack SDK)
2. HPC Cluster Manager reserves N nodes
3. Input deployment through Azure Storage
4. MPI exec
5. Run and inter-communication
6. Get output
[Diagram: provisioned A9 nodes (PaaS, Big Compute); AD domain controller (IaaS); manager (PaaS); data through Azure Storage, control through the Cluster Manager]
20. Porting: deployment within Azure
At every software update: package + send to the cloud
1. Send to the manager: either through Azure Storage (Set-AzureStorageBlobContent / Get-AzureStorageBlobContent, or hpcpack create ; hpcpack upload ; hpcpack download), or through a normal transfer (an internet-accessible fileserver: FileZilla, etc.)
2. Packaging script: mkdir, copy, etc. ; hpcpack create
3. Send to Azure Storage: hpcpack upload
At every node provisioning: local copy
1. Remotely execute on the nodes from the manager with clusrun
2. hpcpack download
3. powershell -command "Set-ExecutionPolicy RemoteSigned" ; Invoke-Command -FilePath ... -Credential ... ; Start-Process powershell -Verb runAs -ArgumentList ...
4. Installation: %deployedPath%\deployScript.ps1
21. Porting: this first working setup has some limitations
- Transferring the input file takes longer than the sequential computation on a single thread
- On many cores, computation time is negligible compared to transfers
- WAV format headers and the ParSon code limit input size to 4 GB
23. Optimizations: methodology, suppress the bottleneck
The identified bottleneck is the input file transfer.
1. Disk write throughput (300 MB/s) → we use a RAMFS
2. Azure Storage access (QoS ~1.6 Gb/s) → download only once from the storage account, then broadcast through InfiniBand
3. Large input files (60 GB) → FLAC level-8 lossless compression halves the size and is not limited to 4 GB; declare all counters as 64-bit ints in the C++ code (see the sketch below)
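On point 3, a tiny sketch (our illustration; the 60 GB figure matches the input size above) of why byte counters must become 64-bit once inputs pass 4 GB:

    // A 32-bit counter wraps modulo 2^32 (4 GB); for a 60 GB input it wraps
    // to exactly 0, since 60 GB is a multiple of 4 GB.
    #include <cstdint>
    #include <iostream>

    int main() {
        const std::uint64_t inputBytes = 60ULL * 1024 * 1024 * 1024; // 60 GB
        std::uint32_t c32 = static_cast<std::uint32_t>(inputBytes);  // wraps: 0
        std::uint64_t c64 = inputBytes;                              // full size
        std::cout << "32-bit counter: " << c32 << " bytes (wrapped)\n"
                  << "64-bit counter: " << c64 << " bytes\n";
    }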
24. Optimizations: accelerating local data access with a RAM filesystem
RAMFS = a filesystem stored in a RAM block: very fast, but limited capacity and non-persistent.
ImDisk: lightweight (driver + service + command line), open-source and signed for Win64.
Scripted silent install (packaged with hpcpack create), run at every node provisioning:

    # Install the ImDisk driver silently
    rundll32 setupapi.dll,InstallHinfSection DefaultInstall 128 disk.inf
    # Start the service, create a 30 GB RAM disk on F: and format it
    Start-Service -InputObject $(Get-Service -Name imdisk)
    imdisk.exe -a -t vm -s 30G -m F: -o rw
    format F: /fs:ntfs /x /q /Y
    # Grant write access to everyone on the RAM disk
    $acl = Get-Acl F:
    $rule = New-Object System.Security.AccessControl.FileSystemAccessRule("Everyone", "Write", "Allow")
    $acl.AddAccessRule($rule)
    Set-Acl F: $acl
25. Optimizations: accelerating input file deployment
All standard transfer systems go through the Ethernet interface: Azure Storage access via the Azure and HPC Pack SDKs, Windows shares or CIFS network drives, standard file transfer protocols (FTP, NFS, etc.). The simplest way to leverage InfiniBand is through MPI.
1. On one node, download the input file: Azure → RAMFS
2. mpiexec broadcast.exe, 1 process per node. We developed a command-line utility in C++ / MPI (sketched below): if id = 0, it reads the RAMFS by 4 MB blocks and sends them to the other nodes through InfiniBand with MPI_Bcast; if id ≠ 0, it receives the data blocks and saves them on the RAMFS. It uses the Win32 API: faster than the standard library abstractions.
3. The input data ends up in the RAM of all nodes, accessible as a file from the application.
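A compact reconstruction of that utility, as a sketch: the real code used the Win32 API and error handling, while we use portable iostreams for brevity; the 4 MB block size and mpiexec usage come from the slide, F: is the RAMFS mount from the previous slide, and the input file name is hypothetical:

    // Usage: mpiexec -n <nodes> broadcast.exe F:\input.flac
    // Rank 0 reads the file from its RAMFS in 4 MB blocks and broadcasts each
    // block over InfiniBand via MS-MPI; the other ranks write their local copy.
    #include <mpi.h>
    #include <algorithm>
    #include <cstdint>
    #include <fstream>
    #include <vector>

    int main(int argc, char** argv) {
        MPI_Init(&argc, &argv);
        int rank = 0;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        const std::int64_t BLOCK = 4 * 1024 * 1024;   // 4 MB blocks

        std::ifstream in;
        std::ofstream out;
        std::int64_t fileSize = 0;                    // 64-bit: inputs exceed 4 GB
        if (rank == 0) {                              // sender: open and measure
            in.open(argv[1], std::ios::binary | std::ios::ate);
            fileSize = in.tellg();
            in.seekg(0);
        } else {                                      // receivers: create local copy
            out.open(argv[1], std::ios::binary);
        }
        MPI_Bcast(&fileSize, 1, MPI_INT64_T, 0, MPI_COMM_WORLD);

        std::vector<char> buf(BLOCK);
        for (std::int64_t done = 0; done < fileSize; done += BLOCK) {
            int n = static_cast<int>(std::min(BLOCK, fileSize - done));
            if (rank == 0) in.read(buf.data(), n);
            MPI_Bcast(buf.data(), n, MPI_BYTE, 0, MPI_COMM_WORLD);
            if (rank != 0) out.write(buf.data(), n);
        }
        MPI_Finalize();
        return 0;
    }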
27. Results: computations scale well, especially for bigger files
[Figures: computation time scaling (computation time in sec vs. number of cores, log-log plot); computation efficiency (real speedup / ideal speedup vs. number of cores, log scale) for different input sizes]
28. Results: input file transfer makes global scaling worse
[Figures: efficiency (real speedup / ideal speedup vs. number of cores, log scale) for raw compute vs. compute including transfers; time decomposition (sec, log scale) for an hour of input audio]
29. Results: consistent storage throughput (220 MB/s), though latency may be high; broadcast constant at ~700 MB/s
[Figures: Azure Storage download performance (download time in min vs. file size in GB); broadcast time scaling (broadcast time in sec, log scale, vs. number of machines)]
31. Our feedback on the Big Compute technology
Strengths:
- HPC standards conformance (C++, OpenMP, MPI): ported in 10 work days
- Solid performance: compute (CPU, RAM) and network (InfiniBand between nodes)
- Reactive support: community, Microsoft
- Intuitive user interfaces (manage.windowsazure.com, HPC Cluster Manager); everything is scriptable & programmable
- Cloud is more flexible than a cluster: unified management of cloud and on-premise
Weak points:
- Data transfers: Azure Storage latency is sometimes high; Azure Storage QoS is limited, so users must implement striping across multiple accounts; HDDs are slow (for HPC), even on A9
- Node administration: node ↔ manager transfers must go through Azure Storage, less convenient than conventional remote file systems; provisioning time must be taken into account (~7 min)
32. Azure Big Compute for research and business
For research:
- Access to compute without any barrier (paperwork, finance, etc.): start your workload in minutes, e.g. for squeezing a few more results in before the (extended) deadline of that conference
- Well suited to researchers in distributed computing; parametric experiments
- A supercomputer for all, without investment
For business:
- Elastic scaling: on-demand sizing, interoperable with Windows clusters; the cloud absorbs peaks: best of both worlds
- Datacenters in the EU: Ireland + Netherlands
- Predictable, pay-what-you-use cost model
- Modern design, extensive documentation, efficient support
- Decreased need for administration, though it is still needed on the software side
33. Thank you for your attention. A question? Don't hesitate!
Antoine Poliakov, [email protected] | Stéphane Vialle, [email protected] | ANEO: http://aneo.eu, http://blog.aneo.eu
Meet us at TechDays! ANEO booth, Thursday 11:30 - 13:00, session "Au cœur du SI > Infrastructure moderne avec Azure" (At the heart of the IS > Modern infrastructure with Azure).
All our thanks to Microsoft for lending us the nodes.