© 2013 IBM Corporation
IBM Systems & Technology Group
LRZ SuperMUC – One Year of Operation
IBM Deep Computing
13.03.2013 Klaus Gottschalk – IBM HPC Architect
The Leibniz Supercomputing Centre's (LRZ) new HPC system is now installed and operational
SuperMUC Technical Highlights
• 3 PFLOP/s Computer in Germany within the Gauss Centre for Supercomputing (GCS)
• 9414 Nodes with 2 Intel Sandy Bridge EP processors each
• 209 Nodes with 4 Intel Westmere EX processors each
• 3 PFLOP/s Peak Performance (see the back-of-envelope sketch after this list)
• 327 TB Memory
• InfiniBand Interconnect
• Large File Space for multiple purposes
• 10 PByte File Space based on IBM GPFS with 200 GByte/s aggregated I/O Bandwidth
• 2 PByte NAS Storage with 10 GByte/s aggregated I/O Bandwidth
• No GPGPUs or other Accelerator Technology
• Innovative Technology for Energy Efficient Computing
• Hot Water Cooling
• Energy Aware Scheduling
• Most Energy- and Cooling-Efficient High-End HPC System: PUE 1.1
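The quoted peak performance can be sanity-checked with a back-of-envelope calculation. A minimal Python sketch: the node count and the 2.7 GHz upper clock rate are taken from this deck, while the 16 cores per node and 8 double-precision FLOP per core per cycle (AVX) are assumptions, and the Westmere EX fat nodes are ignored for simplicity.

# Back-of-envelope check of the ~3 PFLOP/s peak figure.
thin_nodes = 9414          # node count quoted above
cores_per_node = 16        # assumption: 2 sockets x 8 Sandy Bridge EP cores
clock_hz = 2.7e9           # upper clock rate quoted on the EAS slide below
flop_per_cycle = 8         # assumption: AVX, double precision

peak = thin_nodes * cores_per_node * clock_hz * flop_per_cycle
print(f"peak ~ {peak / 1e15:.1f} PFLOP/s")   # ~3.3 PFLOP/s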
SuperMUC Energy Efficiency Goals
• Stable, highly scalable, efficient Hardware
based on standard x86 components
• Save 40% of Energy compared to air-cooled HPC systems
• Hot Water Cooling allowing for Free Cooling all year round
• Using standard components
• Easily serviceable
• Frequency controlled nodes
• Optimize Application Energy consumption during use
• Energy saving if not in use
• Power Aware Job Scheduling
• Run Application at optimal clock rate – according to predefined policies
• Deliver an energy report after each job run (see the sketch below)
End of a LINPACK Run on 240 Nodes
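The per-job energy report mentioned above amounts to integrating node power over the job's runtime and summing across nodes. A minimal sketch, assuming per-node power samples are available (for instance from the outlet-level iPDU metering described later in this deck); the function names and sample values are hypothetical, not the actual implementation.

# Hypothetical per-job energy report: integrate sampled node power over time
# (trapezoidal rule) and sum across the nodes assigned to the job.
def node_energy_kwh(samples):
    # samples: list of (timestamp_seconds, power_watts) tuples for one node
    joules = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        joules += 0.5 * (p0 + p1) * (t1 - t0)
    return joules / 3.6e6   # joules -> kWh

def job_energy_report(job_samples):
    # job_samples: dict mapping node name -> list of power samples
    per_node = {node: node_energy_kwh(s) for node, s in job_samples.items()}
    return per_node, sum(per_node.values())

# Example with made-up numbers: two nodes, one hour, around 300 W each.
samples = [(0, 290.0), (1800, 310.0), (3600, 300.0)]
per_node, total = job_energy_report({"node001": samples, "node002": samples})
print(per_node, f"total ~ {total:.2f} kWh")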
iDataPlex dx360 M4, water-cooled – with Intel Sandy Bridge CPUs
Island Architecture – Multicluster GPFS
[Diagram: Multicluster GPFS island topology. Each compute island (up to Island 18) has its own core switch serving its compute nodes (N1 … N72 shown); 126 spine switches interconnect the island core switches and the I/O island, where the GPFS server cluster connects to two SFA 12K storage controllers over point-to-point InfiniBand (IB P2P) links.]
Option: Direct Water Cooling (DWC)
Direct water cooling at the heat source (95% of heat) – no media change
Less noise in machine room – no spinning fans
Cooling of system without need for Chillers – PUE 1.05 – 1.10
Node inlet temperature between 18 – 45°C
Inlet temperature can vary with seasons based on achievable temperature
Lower and more stable CPU Core temperature (max 70°C)
About 10% less leakage current compared to air-cooled systems
Similar pipework requirements as rear door heat exchangers*
Clear Advantages:
Less energy consumption for cooling of the system (about 40%; see the sketch below)
Less energy consumption of the CPU (10%)
Enables usage of Turbo Mode with all Cores
Better TCO and higher efficiency of compute power usage
(*) see ASHRAE Technical Committee 9.9 Whitepaper: 2011 Thermal Guidelines for Liquid Cooled Data Processing Environments
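The quoted savings can be related to the PUE figures with simple arithmetic. A minimal sketch: only the PUE of 1.1 and the roughly 10% lower CPU energy come from this deck; the air-cooled baseline PUE values are assumptions for illustration.

# How the quoted figures combine: ~10% less CPU energy (less leakage) plus PUE 1.1
# versus an assumed air-cooled baseline. PUE = total facility energy / IT energy.
def total_energy(it_energy, pue):
    return it_energy * pue

it_air = 1.0              # normalised IT energy of an air-cooled system
it_dwc = 0.9 * it_air     # ~10% less CPU energy with direct water cooling

for pue_air in (1.5, 1.8):    # assumed air-cooled baselines
    saving = 1 - total_energy(it_dwc, 1.1) / total_energy(it_air, pue_air)
    print(f"baseline PUE {pue_air}: total energy saving ~ {saving:.0%}")
# -> ~34% and ~45%, bracketing the roughly 40% saving quoted in this deck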
Option: Energy Aware Scheduling (EAS)
Policy based steering of node CPU clock for user batch jobs
Batch scheduler estimates application run time based on clock rate
Admin defined policies determine node clock rate at application execution time
Currently unused nodes will be powered down
EAS is part of IBM LoadLeveler and xCAT and will be ported to LSF
Clear Advantages:
Less power consumption for applications that cannot gain performance from high clock rates
Reduction of power consumption of idle nodes
Observation of operational limits
Example SuperMUC :
• Default clock rate is 2.2 GHz
• Higher rates up to 2.7 GHz (or Turbo Mode) for applications that will gain performance (see the sketch below)
• LINPACK measurement done with Intel Turbo Mode
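A minimal sketch of the kind of policy decision described above, not the actual LoadLeveler/xCAT implementation: raise the clock only while the application converts frequency into real speedup. The frequency steps, the speedup-prediction interface and the threshold are assumptions.

# Hypothetical policy-based clock selection in the spirit of the EAS description above.
DEFAULT_HZ = 2.2e9                            # default clock rate quoted on this slide
CANDIDATES = [2.2e9, 2.4e9, 2.5e9, 2.7e9]     # assumed admin-defined frequency steps

def pick_frequency(predicted_speedup, min_gain_per_step=0.8):
    # predicted_speedup(f): estimated speedup at frequency f relative to DEFAULT_HZ.
    # Policy: raise the clock only while at least `min_gain_per_step` of the
    # frequency increase shows up as real performance gain.
    chosen = DEFAULT_HZ
    for f in CANDIDATES:
        if f <= chosen:
            continue
        freq_gain = f / DEFAULT_HZ - 1.0
        perf_gain = predicted_speedup(f) - 1.0
        if freq_gain > 0 and perf_gain / freq_gain >= min_gain_per_step:
            chosen = f
    return chosen

# Example: a memory-bound code stays at 2.2 GHz, a compute-bound code gets 2.7 GHz.
memory_bound = lambda f: 1.0 + 0.2 * (f / DEFAULT_HZ - 1.0)
compute_bound = lambda f: f / DEFAULT_HZ
print(pick_frequency(memory_bound) / 1e9)    # 2.2
print(pick_frequency(compute_bound) / 1e9)   # 2.7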
LINPACK on May 31, 2012
0:================================================================================
0:T/V N NB P Q Time Gflops
0:--------------------------------------------------------------------------------
0:WR01C2R4 5201920 160 256 576 32387.59 2.897e+06
0:--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV-
0:Max aggregated wall time rfact . . . : 7.38
0:+ Max aggregated wall time pfact . . : 6.52
0:+ Max aggregated wall time mxswp . . : 6.31
0:Max aggregated wall time update . . : 31901.91
0:+ Max aggregated wall time laswp . . : 4309.07
0:Max aggregated wall time up tr sv . : 10.35
0:--------------------------------------------------------------------------------
0:||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)= 0.0006563 ...... PASSED
0:============================================================================
0:
0:Finished 1 tests with the following results:
0: 1 tests completed and passed residual checks,
0: 0 tests completed and failed residual checks,
0: 0 tests skipped because of illegal input values.
0:----------------------------------------------------------------------------
0:
0:End of Tests.
0:============================================================================
------- done running linpack at 05-31-12--06:43:59 --------
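The Gflops figure in this output can be reproduced from the problem size N and the wall time using the standard HPL operation count of 2/3 * N^3 + 2 * N^2 floating-point operations; a small Python check:

# Reproduce the HPL Gflops figure above from the problem size and wall time.
N = 5_201_920          # problem size from the run above
time_s = 32_387.59     # wall time in seconds from the run above

flops = (2.0 / 3.0) * N**3 + 2.0 * N**2
gflops = flops / time_s / 1e9
print(f"{gflops:.3e} Gflops")   # ~2.898e+06, matching the reported 2.897e+06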
Value of the SuperMUC System
• SuperMUC represents a tightly integrated, innovative solution with a value proposition that reduces the client's total cost of ownership and addresses growth areas of x86 and green computing.
• Energy- and cooling-efficiency characteristics of the hardware and HPC software stack provide a quantifiable cost reduction – PUE 1.1 (SuperMUC incl. cooling)
• Holistic view of the supercomputer hardware, software and applications
• Client's running cost reduced by 40% compared to a standard HPC system of similar size
• Scalability, functionality and quality of hardware, software and service provide a quantifiable cost advantage
• Fewer problems because of leveraging experience from other platforms
• Faster problem resolution because of integrating development and support
• Client's running cost reduced through less downtime
• Client's running cost reduced through less management effort
One Year of Operation
• Direct water cooling is reliable and stable – Summer and Winter
• LRZ Decision: Inlet Temperature varies with outdoor temperature
between 18 – 45°C
• Energy Saving Goal of LRZ and IBM is achieved
• Hardware Failures are below the expected Range
• Island based architecture for Infiniband, xCAT, GPFS, LoadLeveler
proves its scalability for large systems
• Power Consumption metering based on iPDUs down to outlet level
• Hardware and Service Monitoring based on Icinga
• Automated Call Home on Failure for all system parts
• Log file analysis based on Splunk