Feedback on Big Compute & HPC on Windows Azure

  1. 1. Innovation Recherche Feedback on Big Compute & HPC on Windows Azure Antoine Poliakov HPC Consultant ANEO [email protected] http://blog.aneo.eu
  2. 2. #mstechdays Innovation Recherche#3 Cloud: on-demand access, through a telecommunications network, to shared and user-configurable IT resources. HPC (High Performance Computing): a branch of computer science concerned with maximizing software efficiency, in particular execution speed. Raw computing power doubles every 1.5 - 2 years; network throughput doubles every 2 - 3 years; the compute/network gap doubles every 5 years. HPC in the cloud makes computing power accessible to all (SMEs, research labs, etc.) and fosters innovation. Our question: can the cloud offer sufficient performance for HPC workloads? CPU: 100% of native speed; RAM: 99% of native speed; network: ??? HPC: a challenge for the cloud Introduction
  3. 3. #mstechdays Innovation Recherche#4 State of the art of HPC in the cloud: 3 ingredients yield an answer through experimentation - the technology (an HPC-oriented cloud), a use-case (HPC software), and the experiments. Introduction
  4. 4. #mstechdays Innovation Recherche#5 Experimenting on HPC in the cloud: our approach. Identify technologies and partners: an HPC software use-case and an efficient cloud computing service. Port the HPC application code from cluster to cloud: skills improvement, feedback on the technologies. Experiment and measure performance: scaling, data transfers. Introduction
  5. 5. #mstechdays Innovation Recherche#6 A collaborative project with 3 complementary actors. Introduction Established HPC research teams: distributed software & big data, machine learning and interactive systems. Goals: is the cloud ready for scientific computing? What is specific about deploying in the cloud? Performance. Windows Azure provides a cloud solution aimed at HPC workloads: Azure Big Compute. Goals: pre-release feedback, inside view of an HPC cluster → cloud transition. Consulting firm: organization and technologies; HPC practice: fast/massive information processing for finance and industry. Goals: identify the most relevant use-cases for our clients, estimate the complexity of porting and deploying an app, evaluate whether the solution is production-ready.
  6. 6. #mstechdays Innovation Recherche#7 Dedicated and competent teams: thank you all! Introduction Research: use-case (distributed audio segmentation), experiment analysis. Provider: created the technical solution, made significant computational power available. Consulting: ported and deployed the application in the cloud, led the benchmarks. Constantinos Makassikis, HPC Consultant; Wilfried Kirschenmann, HPC Consultant; Antoine Poliakov, HPC Consultant; Stéphane Rossignol, Assistant Professor, Signal processing; Stéphane Vialle, Professor, Computer science; Xavier Pillons, Principal Program Manager, Windows Azure CAT; Kévin Dehlinger, Computer science intern, CNAM
  7. 7. #mstechdays Innovation Recherche#8 Presentation contents: 1. Technical context 2. Feedback on porting the application 3. Optimizations 4. Results
  8. 8. Innovation Recherche#mstechdays #9 1. TECHNICAL CONTEXT a. Azure Big Compute b. ParSon
  9. 9. #mstechdays Innovation Recherche#10 Azure Big Compute = new Azure nodes + HPC Pack. New nodes, A8 and A9: 2x8-core Sandy Bridge E5-2670 @ 2.6 GHz, 112 GB DDR3 @ 1.6 GHz; InfiniBand (Network Direct @ 40 Gbit/s): RDMA via MS-MPI @ 3.5 GB/s, ~3 µs latency; IP over Ethernet @ 10 Gbit/s; HDD 2 TB @ 250 MB/s; Azure hypervisor. HPC Pack: task scheduler middleware (Cluster Manager + SDK); tested with 50k cores in Azure; free extension pack: any Windows Server install can be a node. Azure Big Compute
  10. 10. #mstechdays Innovation Recherche#11 HPC Pack: on-premise cluster. Active Directory, manager, and nodes in a privately managed infrastructure. Cluster dimensioned w.r.t. the maximal workload. Administration: hardware + software. [Diagram: AD, manager, and compute nodes on premise] Azure Big Compute
  11. 11. #mstechdays Innovation Recherche#12 HPC Pack: in the Azure Big Compute cloud. Active Directory and manager in the cloud (VMs). Node allocation and pricing on demand. Admin: software only. [Diagram: AD and manager as IaaS VMs, PaaS compute nodes, access via remote desktop/CLI] Azure Big Compute
  12. 12. #mstechdays Innovation Recherche#13 HPC Pack: hybrid deployment. Active Directory and manager on premise. Nodes both in the datacenter and in the cloud. Local dimensioning w.r.t. the average load; dynamic cloud dimensioning absorbs peaks. Admin: software + hardware. [Diagram: on-premise AD, manager, and nodes connected to cloud nodes through a VPN] Azure Big Compute
  13. 13. #mstechdays Innovation Recherche#14 ParSon: an audio segmentation scientific software. ParSon = audio segmentation algorithm: voice / music. 1. Supervised training on known audio samples to calibrate the classifier. 2. Classification based on spectral analysis (FFT) over sliding windows (see the sketch below). [Diagram: digital audio → ParSon segmentation and classification → voice / music] ParSon
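  To make the classification stage more concrete, here is a minimal, hypothetical sketch of sliding-window spectral analysis with FFTW (one of the scientific libraries ParSon relies on, per the porting slide). The window size, hop size, and the spectral-centroid feature are illustrative assumptions, not ParSon's actual parameters or features.

      // Hypothetical sketch: sliding-window FFT feature extraction (not ParSon source code).
      #include <fftw3.h>
      #include <cmath>
      #include <cstddef>
      #include <vector>

      // Returns one crude spectral feature (the spectral centroid) per analysis window.
      std::vector<double> spectral_features(const std::vector<double>& samples,
                                            std::size_t win = 1024, std::size_t hop = 512)
      {
          std::vector<double> features;
          double* in = static_cast<double*>(fftw_malloc(sizeof(double) * win));
          fftw_complex* out = static_cast<fftw_complex*>(fftw_malloc(sizeof(fftw_complex) * (win / 2 + 1)));
          fftw_plan plan = fftw_plan_dft_r2c_1d(static_cast<int>(win), in, out, FFTW_ESTIMATE);

          for (std::size_t start = 0; start + win <= samples.size(); start += hop) {
              for (std::size_t i = 0; i < win; ++i)
                  in[i] = samples[start + i];          // a real system would apply a window function here
              fftw_execute(plan);                      // real-to-complex FFT of the current window

              double num = 0.0, den = 0.0;
              for (std::size_t k = 0; k < win / 2 + 1; ++k) {
                  double mag = std::hypot(out[k][0], out[k][1]);
                  num += static_cast<double>(k) * mag;
                  den += mag;
              }
              features.push_back(den > 0.0 ? num / den : 0.0);
          }

          fftw_destroy_plan(plan);
          fftw_free(in);
          fftw_free(out);
          return features;
      }

  A classifier calibrated during the training phase would then map such per-window features to the voice / music decision.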
  14. 14. #mstechdays Innovation Recherche#15 ParSon is distributed with OpenMP + MPI. ParSon [Diagram of the Linux cluster workflow: 1. upload input files to the NAS; 2. OAR reserves N computers; 3. input deployment to the reserved computers; 4. MPI Exec; 5. tasks with heavy inter-communications; 6. get outputs; data and control paths shown separately]
  15. 15. #mstechdays Innovation Recherche#16 Performance is limited by data transfers. ParSon [Plot: best runtime (s) vs. number of nodes, log-log, for 1 to 256 nodes; two series: nodes read from the NAS (over the network, cold cache) and nodes read locally (cold cache); the NAS case is IO-bound]
  16. 16. Innovation Recherche#mstechdays #17 2. PORTING THE APPLICATION a. Porting C++ code: Linux → Windows b. Porting the distribution strategy: cluster → HPC Cluster Manager c. Porting and adapting deployment scripts
  17. 17. #mstechdays Innovation Recherche#18 Standards conformance = easy Linux → Windows porting. ParSon and Visual C++ conform to the C++ standard → few code changes. Dependencies are the standard libraries and cross-platform scientific libraries: libsnd, fftw. Thanks to MS-MPI, the inter-process communication code doesn't change. Visual Studio natively supports OpenMP. The only task left was translating the build files: Makefiles → Visual C++ projects (a minimal illustration follows). Porting
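  As an illustration of that standards conformance (a minimal sketch, not ParSon code), the following hybrid OpenMP + MPI program builds unchanged with g++ and an MPI library on Linux and with Visual C++ and MS-MPI on Windows, because it only uses the standard MPI and OpenMP APIs.

      // Minimal hybrid OpenMP + MPI example: portable across Linux and Windows toolchains.
      #include <mpi.h>
      #include <omp.h>
      #include <cstdio>

      int main(int argc, char** argv)
      {
          MPI_Init(&argc, &argv);
          int rank = 0, nprocs = 1;
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
          MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

          // Each MPI process computes a partial sum with its OpenMP threads.
          double local = 0.0;
          #pragma omp parallel for reduction(+:local)
          for (int i = rank; i < 1000000; i += nprocs)
              local += 1.0 / (1.0 + i);

          // The partial results are then combined across processes over MPI.
          double total = 0.0;
          MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
          if (rank == 0)
              std::printf("sum = %f (%d processes, up to %d threads each)\n",
                          total, nprocs, omp_get_max_threads());

          MPI_Finalize();
          return 0;
      }

  With MS-MPI such a source is typically built in Visual Studio with the /openmp option and linked against msmpi.lib; on Linux, mpicxx -fopenmp does the equivalent.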
  18. 18. #mstechdays Innovation Recherche#19 ParSon in the cluster. Porting [Diagram of the Linux cluster workflow: 1. upload input file to the NAS; 2. OAR reserves N computers; 3. input deployment; 4. MPI Exec; 5. run and inter-communication; 6. get output; data and control paths shown separately]
  19. 19. #mstechdays Innovation Recherche#20 ParSon in the Azure cloud (HPC Pack SDK). Porting [Diagram of the Azure workflow: 1. upload input file to Azure Storage; 2. HPC Cluster Manager reserves N nodes (provisioned A9 PaaS Big Compute nodes); 3. input deployment; 4. MPI Exec; 5. run and inter-communication; 6. get output; the AD domain controller runs as an IaaS VM; data and control paths shown separately]
  20. 20. #mstechdays Innovation Recherche#21 Deployment within Azure. At every software update: package + send to the cloud. 1. Send to the manager, either through Azure Storage (Set-AzureStorageBlobContent / Get-AzureStorageBlobContent, or hpcpack create ; hpcpack upload / hpcpack download) or with a normal transfer via an internet-accessible file server (FileZilla, etc.). 2. Packaging script: mkdir, copy, etc. ; hpcpack create. 3. Send to Azure Storage: hpcpack upload. At every node provisioning: local copy. 1. Remotely execute on the nodes from the manager with clusrun. 2. hpcpack download. 3. powershell -command "Set-ExecutionPolicy RemoteSigned" ; Invoke-Command -FilePath -Credential ; Start-Process powershell -Verb runAs -ArgumentList. 4. Installation: %deployedPath%\deployScript.ps1. Porting
  21. 21. #mstechdays Innovation Recherche#22 This first working setup has some limitations: transferring the input file takes longer than the sequential computation on a single thread; on many cores, computation time is negligible compared to transfers; WAV format headers and the ParSon code limit the input size to 4 GB. Porting
  22. 22. Innovation Recherche#mstechdays #23 3. OPTIMIZATIONS
  23. 23. #mstechdays Innovation Recherche#24 Methodology: suppress the bottleneck. The identified bottleneck is the input file transfer. 1. Disk write throughput (300 Mb/s) → we use a RAMFS. 2. Azure Storage access (QoS 1.6 Gb/s) → download only once from the storage account, then broadcast through InfiniBand. 3. Large input files (60 GB) → FLAC (compression level 8) lossless compression halves the size and is not limited to 4 GB; declare all counters as 64-bit integers in the C++ code (see the sketch below). Optimizations
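  A small, hypothetical illustration of the counter issue (the file size below is an arbitrary example): WAV stores sizes in 32-bit header fields, and any 32-bit offset or counter silently wraps around past 4 GB, which is why all size and offset counters were widened to 64 bits.

      // Illustration only: why 32-bit counters break for inputs larger than 4 GB.
      #include <cstdint>
      #include <cstdio>

      int main()
      {
          const std::uint64_t fileSize = 60ULL * 1024 * 1024 * 1024;  // e.g. a 60 GB input file
          const std::uint32_t chunk = 4u * 1024 * 1024;               // processed in 4 MB chunks

          std::uint32_t offset32 = 0;   // old-style 32-bit counter: wraps around after 4 GB
          std::uint64_t offset64 = 0;   // 64-bit counter used after the fix
          for (std::uint64_t done = 0; done < fileSize; done += chunk) {
              offset32 += chunk;
              offset64 += chunk;
          }
          std::printf("32-bit counter: %u bytes, 64-bit counter: %llu bytes\n",
                      static_cast<unsigned>(offset32),
                      static_cast<unsigned long long>(offset64));
          return 0;
      }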
  24. 24. #mstechdays Innovation Recherche#25 Accelerating local data access with a RAM filesystem. RAMFS = filesystem stored in a RAM block: very fast, but limited capacity and non-persistent. ImDisk: lightweight (driver + service + command line), open-source but signed for Win64. Scripted silent install, packaged with hpcpack create and run at every node provisioning:
      rundll32 setupapi.dll,InstallHinfSection DefaultInstall 128 disk.inf
      Start-Service -InputObject $(Get-Service -Name imdisk)
      imdisk.exe -a -t vm -s 30G -m F: -o rw
      format F: /fs:ntfs /x /q /Y
      $acl = Get-Acl F:
      $rule = New-Object System.Security.AccessControl.FileSystemAccessRule("Everyone", "Write", "Allow")
      $acl.AddAccessRule($rule)
      Set-Acl F: $acl
  Optimizations
  25. 25. #mstechdays Innovation Recherche#26 Accelerating input file deployment. All standard transfer systems go through the Ethernet interface: Azure Storage access via the Azure and HPC Pack SDKs, Windows shares or CIFS network drives, standard file transfer protocols (FTP, NFS, etc.). The simplest way to leverage InfiniBand is through MPI. 1. On one node, download the input file: Azure → RAMFS. 2. mpiexec broadcast.exe, 1 process per node. We developed a command line utility in C++ / MPI: if id = 0, it reads the RAMFS in 4 MB blocks and sends them to the other nodes through InfiniBand (MPI_Bcast); if id ≠ 0, it receives the data blocks and saves them to the RAMFS; it uses the Win32 API, which is faster than the standard library abstractions (a simplified sketch follows). 3. The input data is then in the RAM of all nodes, accessible as a file from the application. Optimizations
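  Below is a simplified, hypothetical reconstruction of such a broadcast utility, based only on the description above: rank 0 streams the input from its RAM disk in 4 MB blocks and MPI_Bcast distributes each block; every other rank writes the blocks to its own RAM disk. For brevity this sketch uses C standard I/O rather than the Win32 API mentioned on the slide, and the default path is a placeholder.

      // Hypothetical sketch of a broadcast.exe-style utility (not the actual ANEO tool).
      #include <mpi.h>
      #include <algorithm>
      #include <cstddef>
      #include <cstdio>
      #include <vector>

      int main(int argc, char** argv)
      {
          MPI_Init(&argc, &argv);
          int rank = 0;
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);

          // Placeholder RAMFS path; a real tool would take it on the command line.
          const char* path = (argc > 1) ? argv[1] : "F:\\input.dat";
          const long long blockSize = 4LL * 1024 * 1024;   // 4 MB blocks, as described on the slide

          // Rank 0 reads the existing file; every other rank creates its local copy.
          std::FILE* f = std::fopen(path, rank == 0 ? "rb" : "wb");
          if (!f)
              MPI_Abort(MPI_COMM_WORLD, 1);

          long long fileSize = 0;
          if (rank == 0) {                                 // the source node measures the size...
              std::fseek(f, 0, SEEK_END);
              fileSize = std::ftell(f);                    // note: a real tool needs 64-bit offsets (e.g. GetFileSizeEx)
              std::fseek(f, 0, SEEK_SET);
          }
          MPI_Bcast(&fileSize, 1, MPI_LONG_LONG, 0, MPI_COMM_WORLD);  // ...and shares it with all ranks

          std::vector<char> buf(static_cast<std::size_t>(blockSize));
          for (long long done = 0; done < fileSize; done += blockSize) {
              const long long n = std::min(blockSize, fileSize - done);
              if (rank == 0)
                  (void)std::fread(buf.data(), 1, static_cast<std::size_t>(n), f);   // read one block on the source
              // One collective broadcast per block; with MS-MPI this rides the InfiniBand fabric.
              MPI_Bcast(buf.data(), static_cast<int>(n), MPI_BYTE, 0, MPI_COMM_WORLD);
              if (rank != 0)
                  (void)std::fwrite(buf.data(), 1, static_cast<std::size_t>(n), f);  // write it on every other node
          }

          std::fclose(f);
          MPI_Finalize();
          return 0;
      }

  From the manager it would be launched with one MPI process per node (mpiexec broadcast.exe <ramfs path>), matching step 2 above.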
  26. 26. Innovation Recherche#mstechdays #27 4. RESULTS
  27. 27. #mstechdays Innovation Recherche#28 Computations scale well, especially for bigger files. Results [Plots: computation time scaling (log-log, computation time in seconds vs. number of cores) and computation efficiency (real speedup / ideal speedup vs. number of cores) for different input sizes]
  28. 28. #mstechdays Innovation Recherche#29 Input file transfers make global scaling worse. Results [Plots: efficiency (real speedup / ideal speedup vs. number of cores) for raw compute only vs. including transfers; time decomposition (seconds, log scale, vs. number of cores) for an hour of input audio]
  29. 29. #mstechdays Innovation Recherche#30 Storage throughput is consistent (220 Mb/s) although latency may be high; the broadcast is constant @ 700 Mb/s. Results [Plots: Azure Storage download performance (download time in minutes vs. file size in GB) and broadcast time scaling (broadcast time in seconds, log scale, vs. number of machines)]
  30. 30. Innovation Recherche#mstechdays #31 5. CONCLUSION
  31. 31. #mstechdays Innovation Recherche#32 Our feedback on the Big Compute technology. Strengths: HPC standards conformance (C++, OpenMP, MPI), ported in 10 work days; solid performance (compute: CPU, RAM; network: InfiniBand between nodes); responsive support (community, Microsoft); intuitive user interfaces (manage.windowsazure.com, HPC Cluster Manager); everything is scriptable & programmable; the cloud is more flexible than a cluster; unified management of cloud and on-premise. Limitations: data transfers (Azure Storage latency is sometimes high; Azure Storage QoS is limited, so users must implement striping across multiple storage accounts; HDDs are slow for HPC, even on A9); node administration (node-to-manager transfers must go through Azure Storage, which is less convenient than conventional remote file systems; provisioning time must be taken into account, ~7 min).
  32. 32. #mstechdays Innovation Recherche#33 Azure Big Compute for research and business. For research: access to compute without barriers (paperwork, finance, etc.); start your workload in minutes, e.g. to squeeze in a few more results before the (extended) deadline of that conference; well suited to researchers in distributed computing; parametric experiments; a supercomputer for all, without investment. For business: elastic scaling (on-demand sizing); interoperable with Windows clusters, so the cloud absorbs peaks: the best of both worlds; datacenters in the EU (Ireland + Netherlands); predictable, pay-what-you-use cost model; modern design, extensive documentation, efficient support; decreased need for administration, though it is still needed on the software side.
  33. 33. #mstechdays Innovation Recherche#34 Thank you for your attention. Antoine Poliakov [email protected] Stéphane Vialle [email protected] ANEO http://aneo.eu http://blog.aneo.eu Meet us at TechDays! ANEO booth, Thursday 11:30 - 13:00, "Au cœur du SI > Infrastructure moderne avec Azure". All our thanks to Microsoft for lending us the nodes. A question? Don't hesitate!
  34. 34. Digital is business