CSE 403, 2/6/06
Architecture Up Close
Gail Alverson
Cray Inc.
Who am I?
• Gail Alverson
• Lurker in your Winter 403 class
• Computer Science Professor
• Senior Engineering Manager at Cray Inc.
• Your entertainer for the next 50 minutes!
Outline
Part 1 – Case Study: Story of an engineer
What’s my software engineering history?
Part 2 – Architecture up close at Cray
Some examples
Some challenges
Some lessons learned
Government-Classified
Government-Research
Weather/Environmental
Automotive
Life Sciences
Aerospace
Academic Research
Cray Solutions for:
Security
Design Engineering
Environmental
Life Sciences
Data Management/SAN
Focus on systems to solve huge computationally intense problems
Cray Inc.
My SW Engineering roles
Technical Manager for the VP of Engineering
Senior Manager, OS and PE components
Manager, Programming Environments (PE)
Project lead/Software engineer, libraries and tools
Software Engineer – libraries and debugger
TIME
Can you think of a different SW career track?
My last project: Red Storm system
• Massively parallel processing supercomputer system used for analysis and stewardship of nuclear weapons, for Sandia National Lab ($93M)
• System requirements included
– 10,000 AMD processors
– High speed custom interconnect
– LOW communication latency
– Fast custom microkernel on compute nodes
– Unix OS on service nodes
– High performance IO
– 2 ½ year development timeline
– No single point of failure
Can you think of some SW challenges? Risks?
How did we architect this beast?
Identify major components
While (1) {
    Understand your component (and overall) requirements
    Design major interfaces with dependent components == interact with other architects
    Architect your component (the what)
    Design review internally and with customer
    [Prototype/further design (the how)]
}
Here are some examples of what an architecture can capture and ways it can be presented
Hardware Architecture
[Figure: node hardware block diagram. Labels: 2 GHz AMD Opteron™ (processor core, RAM); 1 to 4 GB DRAM; HyperTransport; Cray System Chip; Router; DMA Engine; SSI Block; L0 Controller; HSN Communication Links. Legend: Cray Design, 3rd Party Integration.]
[Figure: network diagram. Compute nodes and Service & IO nodes are joined by HSN communication links in a 3D mesh, 27 x 16 x 24 (x, y, z), with link directions +X, -X, +Y, -Y, +Z, -Z. Other labels: RAID Controllers, 10 GigE Network, PCI-X.]
MPI Latency
• 2 us nearest neighbor
• 5 us furthest point
Bi-section Bandwidth
• 1.5 TB/s bi-directional
Aggregate Sustained Bandwidth
• 120 TB/s
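A quick arithmetic check on the mesh dimensions quoted for the network: the sketch below multiplies out the 27 x 16 x 24 grid and computes the longest hop count between two corners. It assumes a plain mesh with no wraparound links in any dimension, and that every grid position holds a node; neither is stated on the slide, and the function names are made up for illustration.

```python
# Mesh dimensions (x, y, z) as quoted on the slide.
DIMS = (27, 16, 24)


def node_count(dims):
    """Number of grid positions in the mesh."""
    total = 1
    for d in dims:
        total *= d
    return total


def hops(a, b):
    """Hop count between two coordinates.

    ASSUMPTION: plain mesh, no wraparound (torus) links, so the
    distance is the sum of per-axis differences.
    """
    return sum(abs(p - q) for p, q in zip(a, b))


def diameter(dims):
    """Longest shortest path: corner to opposite corner."""
    return sum(d - 1 for d in dims)


if __name__ == "__main__":
    print(node_count(DIMS))                    # 10368
    print(hops((0, 0, 0), (26, 15, 23)))       # 64
    print(diameter(DIMS))                      # 64
```

The 10,368 grid positions line up with the "10K nodes" figure, and under the no-wraparound assumption the "furthest point" latency corresponds to a 64-hop path versus a single hop for the nearest neighbor.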
Network Architecture
Can you remember some ways to view SW architecture?
• Code oriented views
– Module view
– Software layer view
• Execution oriented views
– Data-flow view
– Operational view
• Deployment views
Software Architecture (SA)
What type of view is this?
[Figure: system split into a Compute partition (LWK) and a Service & IO partition (Linux). Other labels: RAID Controllers (Fiber Channel, PCI-X); Network (PCI-X, 10 GigE).]
Compute Partition (10K nodes)
• Sandia LWK OS (Catamount)
Service Partition (256 nodes)
• Linux OS
• Specialized Linux nodes
Login Nodes ~ 10
Dedicated Functional Nodes:
– IO Server Nodes ~ 190
– Network Server Nodes ~ 50
– FS Metadata Server Nodes ~ 4
– Database Server Nodes ~ 2
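The service partition breakdown can be written down as data and tallied. This is only a sketch: the dictionary keys are invented names, and the counts are the approximate ("~") figures from the slide, so the exact match with the 256-node partition size should be read as a sanity check rather than a specification.

```python
# Approximate node counts from the slide (each was marked "~").
# Key names are hypothetical labels chosen for this sketch.
SERVICE_PARTITION = {
    "login": 10,
    "io_server": 190,
    "network_server": 50,
    "fs_metadata_server": 4,
    "database_server": 2,
}

# Partition size quoted on the slide.
SERVICE_PARTITION_SIZE = 256


def total_nodes(partition):
    """Sum the node counts across all roles."""
    return sum(partition.values())


def share(partition, role):
    """Fraction of the partition dedicated to one role."""
    return partition[role] / total_nodes(partition)


if __name__ == "__main__":
    print(total_nodes(SERVICE_PARTITION) == SERVICE_PARTITION_SIZE)
    print(f"IO servers: {share(SERVICE_PARTITION, 'io_server'):.0%}")
```

Laid out this way, the deployment makes one design choice visible: roughly three quarters of the service partition is IO servers.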
SA: Service Nodes
What type of view is this?
[Figure: layered software stack on the service nodes. Labels:
User’s Environment (user space), Login Node: Compiler, PerfTool, TotalView, Showmesh, Job Launch, Allocator, PBS mom, Linux Libraries.
Server Node (Dedicated Server Functions): Accounting, DB, PBS srv.
IO Node: IO & FS daemons.
Support Services & daemons: Boot, RCA daemon.
Common Operating System (runs on all Service and IO Nodes): Portals, Portals Driver, IP, Portals Driver Support, RCA, Parallel File Systems.
IO & Network Nodes only (Dedicated Network & External RAID IO, specialized by node function): FC Driver, 10GigE Driver.
Legend: 3rd Party, Cray, Sandia, Open Source, 3rd Party/Cray Integration.]
SA: Message Passing
What type of view is this?
[Figure: Service Nodes, each running a Linux Kernel with RCA and Portals, and Compute Nodes, each running the LWK (Quintessential Kernel, QK) with RCA and Portals. Both sides use Portals 3.2 and connect through the SeaStar and its SSI, exchanging data packets over the High Speed Network.]
SA: more Message Passing
[Figure: message path between the LWK on the compute side and Linux on the service side, drawn across User, Kernel, and HW layers. Labels: Application; MPI Library; MPI Library or SML Library; API; Portals 3.2; Kernel Lib; LWK Portals Module; Portals State; SSNAL; Firmware Lib; Firmware; Driver; State; SeaStar; Messages over High Speed Network.]
What could make this view more meaningful to the customer?
SA: Job Launch
[Figure: job launch flow across a Login Node (Linux, running Job Launch and the Allocator), a Database Node (CPU Inventory Database), Compute Nodes (Catamount/QK, running PCT and the user application), and IO Nodes (IO daemons).]
1. Red Storm user: login & start app
2. Request CPUs (the Allocator gets a CPU list from the CPU Inventory Database)
3. Load application
4. Fan out application
5. Application runs (IO requests from compute nodes; IO nodes implement the requests)
6. Application exit & cleanup
7. Return CPUs
Is there enough detail to start coding from this view?
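One way to see how much detail the job launch view leaves open is to try sketching it in code. The version below is a toy, single-process stand-in: `CpuInventory` and `launch_job` are invented names, the "database" is an in-memory set, and "fan out" is a plain loop rather than anything resembling the real launcher. The only thing it takes from the slide is the order of the steps, including returning the CPUs on exit.

```python
class CpuInventory:
    """Hypothetical in-memory stand-in for the CPU Inventory Database."""

    def __init__(self, node_ids):
        self._free = set(node_ids)

    def allocate(self, count):
        """Hand out `count` free CPUs, or fail if there are too few."""
        if count > len(self._free):
            raise RuntimeError("not enough free CPUs")
        chosen = sorted(self._free)[:count]
        self._free.difference_update(chosen)
        return chosen

    def release(self, cpus):
        """Put CPUs back into the free pool."""
        self._free.update(cpus)

    def free_count(self):
        return len(self._free)


def launch_job(inventory, app, cpu_count):
    """Follow the slide's steps: request CPUs, load and fan out, run, return CPUs.

    `app` is any callable taking (rank, size); it stands in for the
    user application running on each compute node.
    """
    cpus = inventory.allocate(cpu_count)          # Request CPUs -> CPU list
    try:
        # Load + fan out + run: one call per allocated compute node.
        return {cpu: app(rank, len(cpus)) for rank, cpu in enumerate(cpus)}
    finally:
        inventory.release(cpus)                   # Exit & cleanup: return CPUs


if __name__ == "__main__":
    inv = CpuInventory(range(8))
    print(launch_job(inv, lambda rank, size: rank * 10, 3))
    print(inv.free_count())
```

Even this toy forces decisions the diagram does not make: what happens when too few CPUs are free, and whether CPUs are returned when the application fails (here the `finally` clause returns them either way).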
Diagrams rock, but you need to specify the APIs as well
Flowcharts help define the usage
APIs specify enough of the “what” for the next development phase: designing the “how”
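A small illustration of the "what" versus "how" split, using Python's `typing.Protocol`. The `Allocator` interface and `FirstFitAllocator` class are hypothetical names invented for this sketch, not Red Storm's actual API: the protocol states what callers may rely on, and the class is one of many possible implementations behind it.

```python
from typing import List, Protocol, runtime_checkable


@runtime_checkable
class Allocator(Protocol):
    """The "what": operations a caller may rely on, and nothing more."""

    def request_cpus(self, count: int) -> List[int]:
        """Return exactly `count` CPU ids, or raise RuntimeError."""
        ...

    def return_cpus(self, cpus: List[int]) -> None:
        """Give previously requested CPUs back."""
        ...


class FirstFitAllocator:
    """One possible "how": hand out the lowest-numbered free CPUs."""

    def __init__(self, total: int) -> None:
        self._free = list(range(total))

    def request_cpus(self, count: int) -> List[int]:
        if count > len(self._free):
            raise RuntimeError("not enough free CPUs")
        granted, self._free = self._free[:count], self._free[count:]
        return granted

    def return_cpus(self, cpus: List[int]) -> None:
        self._free = sorted(self._free + list(cpus))


if __name__ == "__main__":
    alloc = FirstFitAllocator(4)
    print(isinstance(alloc, Allocator))  # structural check: methods are present
    print(alloc.request_cpus(2))
```

A dependent component can be designed and reviewed against `Allocator` alone, and the implementation can later be replaced without touching its callers.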
One last example of a SA: System boot
Requirement 5.11: The full system must be able to be booted in less than 15 minutes after a clean shutdown.
Service Nodes
1. RAS broadcasts kernel to Service nodes
2. Service nodes verify Unix image is valid
   If not, the first loads the image from RAS and the others load from a peer
3. Service nodes are up and running
Compute Nodes
1. RAS broadcasts kernel to cage controllers
2. Cage controllers broadcast to other nodes
3. Compute nodes are up and running
[Figure: RAS holding the golden Unix kernel, golden Unix image, and golden Cougar kernel, distributing to nodes labeled C, R, I, and L.]
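The two-stage distribution for compute nodes (RAS to cage controllers, cage controllers to nodes) can be compared with a single sender using a back-of-the-envelope model. Everything numeric below except the 15-minute budget is an assumption made up for illustration: the number of cage controllers, the per-transfer time, and the pessimistic choice to model each "broadcast" as a series of one-at-a-time sends.

```python
import math

# Requirement 5.11: full system boot in under 15 minutes.
BOOT_BUDGET_S = 15 * 60


def direct_boot_time(nodes, transfer_s):
    """A single sender (RAS) delivers the kernel to every node in turn."""
    return nodes * transfer_s


def two_level_boot_time(nodes, cages, transfer_s):
    """RAS serves each cage controller in turn; the controllers then
    serve their own nodes in turn, all cages working in parallel.

    ASSUMPTION: sends are sequential unicasts; a true hardware
    broadcast would finish sooner than this model predicts.
    """
    nodes_per_cage = math.ceil(nodes / cages)
    return cages * transfer_s + nodes_per_cage * transfer_s


if __name__ == "__main__":
    # Hypothetical figures: 10,000 nodes, 400 cage controllers, 0.5 s per send.
    print(direct_boot_time(10_000, 0.5))          # 5000.0
    print(two_level_boot_time(10_000, 400, 0.5))  # 212.5
```

With these made-up numbers the single-sender scheme takes 5000 s, well over the 900 s budget, while the two-level scheme takes 212.5 s, which suggests why the hierarchy matters for the boot requirement.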
How is software different?
Pictures tend to be part of an overall architecture document
Software Architecture
• Introduction
– Revision history!
– Scope
– Requirements
• System description
• Deployment view
– Intro
– Top level diagram
– Interfaces
• Functional view
• Software layer view
• …
Can you think of other elements?
– Issues to resolve (risks, mitigations)
Stepping back, what was important to capture in the architecture?
Can you see any common themes?
• Multiple views of the components
• Diagrams, diagrams, diagrams
• Focus on interfaces and their requirements: the “what” of the integration points
What was most important in the architecture process?
What do you think?
• Reviews
– Both internal and with the customer
• Iteration
Epilogue
• Red Storm was made into a product, Cray XT3
• Full delivery was 3 ½+ years, but got something to the customer in 3
• Software effort was much more complex than expected
• Rearchitected at least two major SW components after getting experience with them
• Product is successful and in demand (yah!)