‘Big data’ and Computer Architecture
Why is my computer not beige anymore?
Dr. Michael Browne, ICHEC Technical Manager
Stop Press!
Fionn – new system, €4.1M investment
• SGI ICE X
  - 7,680 E5-2660v2 cores, c. 140 Tflops
  - 20.5 TB RAM
  - Mellanox FDR interconnect
• Accelerated region
  - 640 E5-2660v2 cores
  - 32 Xeon Phi 5110P
  - 32 NVIDIA K20m
• SGI UV2000 – shared memory
  - 112 Xeon cores + 2 Xeon Phi 5110P
  - 1.7 TB RAM
• DDN SFA12k-20 storage
  - 550 TB (formatted)
• Offsite failover system for critical services
Stoney – now legacy GPU system
• Bull NS R422-E2
  - 512 X5560 cores
  - 5.7 Tflops
  - 3 TB RAM in total
  - 48 NVIDIA M2090s
• Due for conversion to Hadoop system in December
Short tour after lunch
• Ground floor, IT building, at 14:00
What can I do with it?
10^36 or 1,000,000,000,000,000,000,000,000,000,000,000,000
1x10^-15 m
1x10^21 m
4x10^7 m
5x10^-9 m
1x10^-2 m
1x10^debatable m
It is up to YOU!
Access is FREE and subject to Peer Review
Performance More is better, right?
Performance* of the world’s 500 fastest systems over 20 years
[Chart: Flops on a log scale from 1.0E+08 to 1.0E+19 (Giga, Tera, Peta and Exa levels marked) against year, 1993 to 2020. Series: Sum, First, Last. Source: Top500.org]
Great, my code keeps running faster…
What happened since 2005?
Performance growth has been maintained by adding cores, not by increasing serial speed. Source: top500.org
Not just academia & research institutes: by system count, 53.8% of Top500 systems were based in industry in 2013. Source: top500.org
Who uses it?
So more is better, but more is also different
• More is what we’ll get whether we want it or not, but this will mostly be in ever more parallel HPC and cloud style systems.
  - Squeezed from the top
• R&D / industry momentum is directed towards the mobile space, i.e. low power and lower performance.
  - Squeezed from below
• Can only exploit “more” if software is VERY parallel.
• Is a future workstation a laptop or less?
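Why "VERY parallel" matters can be made concrete with Amdahl's law, a standard textbook model that is not on the original slides and is brought in here only for illustration. The sketch below assumes the parallel part of a code scales perfectly and the rest stays serial; the core counts of 24 (a dual 12-core workstation) and 7,680 (Fionn) come from the slides, while the parallel fractions are made-up examples.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal speedup on `cores` cores under Amdahl's law.

    `parallel_fraction` is the share of the single-core runtime that
    parallelises perfectly; the remainder is assumed to stay serial.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must lie in [0, 1]")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Hypothetical parallel fractions; core counts taken from the slides.
    for fraction in (0.90, 0.99, 0.999):
        for cores in (24, 7680):
            speedup = amdahl_speedup(fraction, cores)
            print(f"{fraction:.1%} parallel on {cores:>5} cores: {speedup:8.1f}x")
```

Under this model a code that is 99% parallel cannot reach a 100x speedup however many cores it is given, which is the sense in which "more is different".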
Hardware Landscape
What kit can we throw at our problems today?
Scale to today’s limits, today
• Workstations typically have dual processors, and each processor has 10-12 cores. Are you using them all?
• Look at licensing costs for scaling up; there are many different models out there.
• Look at TCO calculations for old equipment & consider upgrades.
• Look at remote options for performance critical tasks or bursts, ICHEC, AWS, etc.
• Be aware of constraints your licence puts on you.
• Xeon Phi 5110P
  - up to 8 GB GDDR5
  - 352 GB/s
  - 2.147 TFLOPS (SP)
  - 1.074 TFLOPS (DP)
  - PCIe 2.0
  - 225 Watt
• Tesla K20X
  - 6 GB GDDR5
  - 250 GB/s
  - 3.95 TFLOPS (SP)
  - 1.31 TFLOPS (DP)
  - PCIe 2.0
  - 225 Watt
Tesla & Xeon Phi: up to 1000x speedup
‘Most Innovative Use Of HPC in Financial Services’
Interconnect
• Ethernet: typically 1 Gbps in the office; 10 Gbps is affordable
• InfiniBand FDR: 56 Gbps with much lower latency, so better suited to compute
Storage
• DDN SFA12K-20 (Fionn) 20GB/s raw, ~12GB/s write performance
• Generic SATA HDD 0.6GB/s raw, ~0.15GB/s write performance
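To put the interconnect and storage figures side by side, the sketch below estimates how long it takes to move a data set at each quoted rate. The 1 TB data set size is an assumption chosen for illustration, and the estimate ignores protocol overhead and latency, so real transfers will be slower. Note that network rates are quoted in bits per second and storage rates in bytes per second.

```python
def gbps_to_bytes_per_s(gigabits_per_s: float) -> float:
    """Convert a network rate in Gbps (bits) to bytes per second."""
    return gigabits_per_s * 1e9 / 8


def gbytes_to_bytes_per_s(gigabytes_per_s: float) -> float:
    """Convert a storage rate in GB/s to bytes per second."""
    return gigabytes_per_s * 1e9


def transfer_seconds(size_bytes: float, rate_bytes_per_s: float) -> float:
    """Best-case time to move `size_bytes` at a sustained rate."""
    if rate_bytes_per_s <= 0:
        raise ValueError("rate must be positive")
    return size_bytes / rate_bytes_per_s


DATASET_BYTES = 1e12  # assumption: a 1 TB data set, not a figure from the slides

# Rates as quoted on the slides.
RATES = {
    "Ethernet 1 Gbps": gbps_to_bytes_per_s(1),
    "Ethernet 10 Gbps": gbps_to_bytes_per_s(10),
    "InfiniBand FDR 56 Gbps": gbps_to_bytes_per_s(56),
    "DDN SFA12K-20 write, ~12 GB/s": gbytes_to_bytes_per_s(12),
    "SATA HDD write, ~0.15 GB/s": gbytes_to_bytes_per_s(0.15),
}

if __name__ == "__main__":
    for name, rate in RATES.items():
        minutes = transfer_seconds(DATASET_BYTES, rate) / 60
        print(f"{name:<32} {minutes:8.1f} min")
```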
Remote Workstations
• Batch processing and remote Office apps are solved problems, but remote CAE GUIs and rendering are not.
• NVIDIA Grid provides remote virtualised GPUs.
• Very new but seems to work well with a decent network.
• Large auto firms adopting the technology rapidly.
• Facilitates offsite working.
• Will change what cloud providers can offer.
NVIDIA GRID K2 Graphics Board (BD-06580-001_v02)
OVERVIEW
The NVIDIA GRID™ K2 is a dual-slot 10.5 inch PCI Express Gen3 graphics card with two high-end NVIDIA® Kepler™ graphics processing units (GPUs). The NVIDIA GRID K2 has 8 GB of GDDR5 memory (4 GB per GPU), and a 225 W maximum power limit. The NVIDIA GRID K2 graphics board uses a passive heat sink that requires system airflow to properly operate the card within thermal limits. It is designed to accelerate graphics in virtual remote workstation and virtual desktop environments.
The NVIDIA GRID K2 graphics board can be configured to enable or disable ECC (error correcting codes) that can fix single-bit errors and detect double-bit errors. Enabling ECC will cause some of the memory to be used for the ECC bits, so the user available memory will decrease by 10%. ECC protection is for DRAM only.
Figure 1. NVIDIA GRID K2 Graphics Board (GK104 / P2055)
Data centres change everything
Big Big Data
Not just warehouses
Now just add low cost & power
• ARM, dominant in TV & mobile
• Intel Quark, IoT
• Designed in Ireland
Software
The Common Challenge
• Learn to think in parallel… (much) harder as number of CPUs increases.
• Main obstacle is limited scalability of software:
  - Scale to 10,000s+, not just 100s.
  - Large-scale software initiatives urgently required.
  - Licensing models should encourage adoption.
• But opportunities exist today:
  - Parameter studies can be done readily.
  - Enterprise IT consolidation and cloud services are possible.
  - Does any other economic sector face such change?
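A parameter study is the easiest way to use every core, because the cases are independent of each other. The following is a minimal sketch using only the Python standard library; `run_case` is a hypothetical stand-in for a real simulation (here a simple numerical integral), and the list of parameter values is made up for illustration.

```python
import os
from concurrent.futures import ProcessPoolExecutor


def run_case(exponent: float, steps: int = 100_000) -> float:
    """Stand-in for one simulation run.

    Integrates x**exponent over [0, 1] with the midpoint rule; the exact
    answer is 1 / (exponent + 1), which makes the result easy to check.
    """
    width = 1.0 / steps
    return width * sum(((i + 0.5) * width) ** exponent for i in range(steps))


def parameter_study(exponents, parallel: bool = True) -> dict:
    """Run one case per parameter value and return {parameter: result}.

    The cases share no data, so they can be farmed out to separate
    processes. `parallel=False` runs them serially, which is handy for
    debugging or for comparing timings.
    """
    exponents = list(exponents)
    if not parallel:
        return dict(zip(exponents, map(run_case, exponents)))
    with ProcessPoolExecutor() as pool:  # defaults to one worker per core
        return dict(zip(exponents, pool.map(run_case, exponents)))


if __name__ == "__main__":
    print(f"Cores visible to Python: {os.cpu_count()}")
    for exponent, value in parameter_study([0.5, 1.0, 2.0, 3.0, 4.0]).items():
        print(f"exponent={exponent:<4} integral={value:.6f}")
```

The same pattern carries over to a batch system, where each case becomes its own job rather than its own process.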
Thank you
What happens if applications don’t exploit parallelism?
The future rate of discovery will slow. Looking back, the rate of discovery has been driven by the staggering improvement in technology combined with an equally staggering reduction in cost.
In the 15-year period from 1997 to 2012, the cost per floating-point operation of a TOP500 supercomputer has decreased 100x, while the performance has increased 1500x.
The result? Intensive computing solutions once reserved for only an elite few researchers have become broadly accessible to academic, industrial, and government markets. The benefit of this compute is significantly faster time-to-insight through increasingly accurate calculations critical to success across all markets.
Intel Corp.
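The figures quoted above can be turned into yearly rates with a little arithmetic. The sketch below assumes smooth exponential change over the 15 years, which is a simplification; the helper names are illustrative only. It also derives the implied growth in total system cost, which the quote does not state directly.

```python
import math


def annual_factor(total_factor: float, years: float) -> float:
    """Constant yearly multiplier that compounds to `total_factor` over `years`."""
    return total_factor ** (1.0 / years)


def doubling_time_years(total_factor: float, years: float) -> float:
    """Years needed to double (or halve, for a reduction) at that constant rate."""
    return years * math.log(2) / math.log(total_factor)


YEARS = 15                 # 1997 to 2012, from the quote
PERFORMANCE_GAIN = 1500.0  # performance increase, from the quote
COST_PER_FLOP_DROP = 100.0 # cost-per-flop decrease, from the quote

if __name__ == "__main__":
    print(f"Performance growth per year: {annual_factor(PERFORMANCE_GAIN, YEARS):.2f}x")
    print(f"Performance doubling time:   {doubling_time_years(PERFORMANCE_GAIN, YEARS):.2f} years")
    print(f"Cost-per-flop halving time:  {doubling_time_years(COST_PER_FLOP_DROP, YEARS):.2f} years")
    # Derived, not quoted: 1500x the flops at 1/100 the unit cost.
    print(f"Implied system cost growth:  {PERFORMANCE_GAIN / COST_PER_FLOP_DROP:.0f}x")
```

On these assumptions, performance grew by roughly 1.6x per year, so a serial code that cannot use the extra cores falls behind quickly.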