Presented by
Infrastructure
Shane Canon, Leadership Computing Facility
Oak Ridge National Laboratory, U.S. Department of Energy
Summary
LCF Roadmap
Infrastructure for the Petascale: Networking, File Systems, Archival Storage, Data Analytics
Hardware Roadmap
As it looks to the future, the NCCS expects to lead the accelerating field of high-performance computing. Upgrades will boost Jaguar’s performance fivefold—to 250 teraflops—by the end of 2007, followed by installation of a petascale system in 2009.
Network
Shifting to a hybrid InfiniBand/Ethernet network
InfiniBand-based network helps meet the bandwidth and scaling needs for the center
Wide-area network will scale to meet user demand using currently deployed routers and switches
2007: 100 TF, 60 GB/s LAN, 3 GB/s WAN
2008: 250 TF, 200 GB/s LAN, 3 GB/s WAN
2009: 1000 TF, 200 GB/s LAN, 4 GB/s WAN
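To put these LAN and WAN figures in perspective, here is a minimal sketch comparing bulk-transfer times at each year's bandwidth. The 100 TB dataset size is a hypothetical example, not a figure from this roadmap, and the rates are treated as sustained, decimal (GB = 10^9 bytes) values.

```python
# Compare how long a bulk transfer would take on the LAN vs. the WAN for each
# year of the roadmap above. The 100 TB dataset is illustrative only.
DATASET_TB = 100  # hypothetical simulation output size

roadmap = {
    2007: {"lan_GBps": 60,  "wan_GBps": 3},
    2008: {"lan_GBps": 200, "wan_GBps": 3},
    2009: {"lan_GBps": 200, "wan_GBps": 4},
}

for year, rates in roadmap.items():
    size_gb = DATASET_TB * 1000  # TB -> GB (decimal)
    lan_hours = size_gb / rates["lan_GBps"] / 3600
    wan_hours = size_gb / rates["wan_GBps"] / 3600
    print(f"{year}: {DATASET_TB} TB takes ~{lan_hours:.2f} h on the LAN, "
          f"~{wan_hours:.1f} h over the WAN")
```

The gap (minutes on the LAN versus many hours on the WAN) is why the center-wide file system and archive sit on the InfiniBand core rather than behind the wide-area links.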
NCCS network roadmap summary
Ethernet core [O(10 GB/s)]: scaled to match wide-area connectivity and the archive
InfiniBand core [O(100 GB/s)]: scaled to match the center-wide file system and data transfer
Attached resources: Jaguar, Baker, Viz, Lustre (center-wide file system), HPSS (High-Performance Storage System), and the wide-area gateway
Center-Wide File System (Spider)
Connected systems: Baker, Jaguar (Cray XT3), Phoenix (Cray X1E), Data Analysis & Visualization, NFS servers, and HPSS
Wide-area connectivity: ESnet, USN, TeraGrid, Internet2, NLR
2007: 1 PB, 30 GB/s (aggregate)
2008: 10 PB, 200 GB/s (aggregate)
Center-Wide File System (Spider)
Increase scientific productivity by providing single repository for simulation data
Connect to all major LCF Resources
Connected to both InfiniBand and Ethernet networks
Potentially becomes the file system for the 1000 TF System
2007: 100 TF, 100 TB, 10 GB/s
2008: 250 TF, 1 PB, 30 GB/s
2009: 1000 TF, 10 PB, 200 GB/s
Center-Wide File System (Spider)
                     FY07   FY08   FY09
OSS                  20     30     160
Controller pairs     4      8      40
Capacity (PB)        0.2    1      10
Bandwidth (GB/s)     10     30     200
*End of FY, in production
Lustre-based file system
Can natively utilize the InfiniBand network
Already running on today's XT3 with 10k+ clients
The external system will use Lustre's network-layer (LNET) routers to route between the InfiniBand fabric and SeaStar/Gemini
External System already demonstrated on current XT systems
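As a quick sanity check, a sketch below divides the aggregate bandwidth targets from the table evenly across the OSS counts (a simplification; real load is never perfectly balanced) and compares the result against the FY08 milestone, listed later, of demonstrating 1.5 GB/s sustained from a single OSS node.

```python
# Per-OSS bandwidth implied by the Spider roadmap table, assuming the aggregate
# target is spread evenly across the OSS nodes in each phase.
phases = {
    "FY07": {"oss": 20,  "agg_GBps": 10},
    "FY08": {"oss": 30,  "agg_GBps": 30},
    "FY09": {"oss": 160, "agg_GBps": 200},
}

SINGLE_OSS_DEMO = 1.5  # GB/s sustained, the FY08 single-OSS demonstration target

for phase, p in phases.items():
    per_oss = p["agg_GBps"] / p["oss"]
    status = "within" if per_oss <= SINGLE_OSS_DEMO else "above"
    print(f"{phase}: {per_oss:.2f} GB/s per OSS "
          f"({status} the {SINGLE_OSS_DEMO} GB/s single-OSS demonstration)")
```

Even the FY09 configuration needs only about 1.25 GB/s per OSS, so a 1.5 GB/s single-node demonstration leaves some headroom.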
Data Storage – Past Usage
Data growth is explosive
Doubling stored data every year since 1998!
Over 1 PB stored today, and adding almost 2 TB of data per day
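As a rough illustration of what that trend means for capacity planning, the sketch below projects the archive forward from roughly 1 PB today, assuming the historical doubling per year simply continues. The continuation of the trend is an assumption, not a commitment from the roadmap.

```python
# Illustrative projection of archive growth, assuming the historical pattern of
# doubling every year continues from ~1 PB stored in 2007.
current_pb = 1.0
start_year = 2007

for years_out in range(1, 4):
    projected = current_pb * 2 ** years_out
    print(f"{start_year + years_out}: ~{projected:.0f} PB if doubling continues")

# Today's ingest of ~2 TB/day is roughly 0.7 PB/year; a doubling archive implies
# the ingest rate itself must keep growing as well.
ingest_tb_per_day = 2
print(f"Current ingest: ~{ingest_tb_per_day * 365 / 1000:.1f} PB/year")
```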
Archival Storage
HPSS Software has already demonstrated ability to scale to many PB
Add 2 silos per year
Tape capacity and bandwidth, and disk capacity and bandwidth, are all scaled to maintain a balanced system
Utilize new methods to improve data transfer speeds between parallel file systems and the archival system
2007: 100 TF, 4 PB tape capacity, 4 GB/s aggregate bandwidth
2008: 250 TF, 10 PB tape capacity, 8 GB/s aggregate bandwidth
2009: 1000 TF, 18 PB tape capacity, 18 GB/s aggregate bandwidth
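To give a feel for the scale behind these tape capacities, the sketch below converts each target into an approximate cartridge count, assuming everything lands on T10K ("Titanium A") media at the 500 GB per cartridge figure quoted later in the HPSS hardware slides. Older 9840/9940 media are ignored, so this is an upper bound on the T10K slot demand rather than an exact inventory.

```python
# Approximate T10K cartridge counts for the tape capacity targets above,
# assuming 500 GB per cartridge (the Titanium A figure from the HPSS slides).
CARTRIDGE_GB = 500

targets_pb = {2007: 4, 2008: 10, 2009: 18}

for year, pb in targets_pb.items():
    cartridges = pb * 1_000_000 / CARTRIDGE_GB  # PB -> GB, then per cartridge
    print(f"{year}: {pb} PB is roughly {cartridges:,.0f} T10K cartridges")
```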
Archival Storage
                        FY07   FY08   FY09
Silos                   2*     3      4
Tape drives (T10K)      16     24     32
Tape capacity (PB)      4      10     18
Disk cache (TB)         200    1000   1100
Tape BW (agg. GB/s)     1.9    3.8    7.6
Disk BW (agg. GB/s)     4      10     19
* Does not include older silos. Note: mid-FY, in production.
Data Analytics
Existing Resources
Visualization Cluster (64 nodes, Quadrics)
End-to-End Cluster (80 nodes, InfiniBand)
Recently Deployed
32 nodes with 4X DDR InfiniBand
Connected to Center-wide File System
Data Analytics – Strategies
Jaguar (250TF) (FY08)
Utilize portion of system for data analysis (50 TF/20 TB)
Baker (FY08/09)
Utilize Jaguar as analysis resource (250 TF/50 TB)
Provision fraction of Baker for analysis
Milestones – FY08
First Half FY08
Perform “Bake-off” of Storage for Center-wide File System
Expand IB network
Demonstrate 1.5 GB/s sustained with single OSS node (dual socket QC)
Deploy HPSS Upgrades
Second Half FY08
Select storage system and Procure next Phase of Center-wide Storage (200 GB/s)
Deploy next phase Center-wide File System
Operations Infrastructure Systems: Now and Future Estimates

Archival Storage          FY07   FY08   FY09
Capacity (PB)             4      10     20
Bandwidth (GB/s)          4      10     19

Viz/End-to-End            FY07   FY08   FY09
I/O B/W (GB/s)            10     15     60
Memory (TB)               0.5    2+20   69

Central Storage           FY07   FY08   FY09
Capacity (PB)             0.10   1.0    10.0
Bandwidth (GB/s)          10     30     200

Networking                FY07   FY08   FY09
External B/W (GB/s)       3      3      4
LAN B/W (GB/s)            60     200    200
Contacts
Shane Canon
Technology Integration
Center for Computational Sciences
(864) 241-2889
[email protected]
CCS WAN overview
Core equipment: Juniper T640, Juniper T320, Ciena CoreStream, Ciena CoreDirector, Force10 E600, Cisco 6509 (NSTG)
Wide-area links: OC-192 to TeraGrid, 2 x OC-192 to USN (DOE UltraScience Net), OC-192 to CHEETAH (NSF), OC-192 to Internet2, OC-192 to ESnet
Internal links: 10G and 1G Ethernet within the CCS
CCS Network 2007 (diagram)
InfiniBand and Ethernet cores connecting Jaguar, Spider60, Spider10, Viz, HPSS, Devel, and E2E, with per-system link aggregates ranging from a few SDR links up to 128 DDR InfiniBand connections.
CCS IB network 2008 (diagram)
InfiniBand fabric connecting Jaguar, Baker, Spider240, Spider10, Viz, HPSS, E2E, and Devel; link aggregates include 87 SDR to Jaguar, 300 DDR (50 DDR per link), 48-96 SDR (16-32 SDR per link), and assorted 20-64 DDR/SDR bundles.
Code Coupling/Workflow Automation
GTC runs on the teraflop-to-petaflop Cray, coupled over a 40 Gbps link to an end-to-end system (160 processors) where M3D runs on 64 processors and monitoring routines execute.
Workflow stages include data archiving, data replication, large data analysis, user monitoring, and post-processing.
HPSS – 2Q07
Core services: IBM p550 running HPSS 6.2; IBM Ndapi server (1 GbE)
Disk movers: 8 Dell 2950 (10 GbE, 4 Gb Fibre Channel) fronting DataDirect S2A9500 arrays (38.4 TB FC, 100 TB SATA)
Tape movers: 8 Dell 2950 (10 GbE, 4 Gb Fibre) driving T10K drives; 6 Dell 2850 (10 GbE, 2 Gb Fibre) driving 9840/9940 drives
Tape libraries: 6 STK 9310 silos holding mixes of 9840A (SCSI and FC), 9940A/B, and T10K drives; 9940B drives being replaced by T10K
SAN: Brocade switches (2 SilkWorm 3800, 2 SilkWorm 3900, 1 SilkWorm 4100)
Drive characteristics: 9840A 10-35 MB/s, 20 GB; 9840C 30-70 MB/s, 40 GB; 9940B 30-70 MB/s, 200 GB; Titanium (T10K) A 120+ MB/s, 500 GB
Totals: 6 STK 9310 silos, 16 Dell 2950 servers, 16 STK T10K tape drives, 1 DDN S2A9550 (100 TB)
HPSS – 1Q08
Core services: IBM p550 running HPSS 6.2; IBM Ndapi server (1 GbE)
Disk movers: 14 Dell 2950 (10 GbE, 4 Gb Fibre Channel) fronting DataDirect S2A9500 arrays (38.4 TB FC, 100 TB SATA)
Tape movers: 16 Dell 2950 (10 GbE, 4 Gb Fibre) driving T10K drives; 6 Dell 2850 (10 GbE, 2 Gb Fibre) driving 9840/9940 drives
Tape libraries: 6 STK 9310 silos (9840A, 9940A/B, and T10K mixes; 9940B being replaced by T10K) plus 2 SL8500 libraries with (8) T10K drives each
SAN: Brocade switches (2 SilkWorm 3800, 2 SilkWorm 3900, 1 SilkWorm 4100)
Drive characteristics: 9840A 10-35 MB/s, 20 GB; 9840C 30-70 MB/s, 40 GB; 9940B 30-70 MB/s, 200 GB; Titanium (T10K) A 120+ MB/s, 500 GB
Totals: 6 STK 9310 silos, 2 SL8500 libraries, 30 Dell 2950 servers, 32 STK T10K tape drives, 2 DDN S2A9550 (100 TB)
CCS network roadmap
FY 2006-FY 2007
Scaled Ethernet to meet wide-area needs and current-day local-area data movement
Developed wire-speed, low-latency perimeter security to fully utilize 10 G production and research WAN connections
Began IB LAN deployment
FY 2007-FY 2008
Continue building out the IB LAN infrastructure to satisfy the file system needs of the Baker system
Test Lustre over the IB WAN for possible deployment
Summary: a hybrid Ethernet/InfiniBand (IB) network provides both high-speed wide-area connectivity and very high-speed local-area data movement