A Framework for Visualizing Science at the Petascale and Beyond
Kelly Gaither
Research Scientist
Associate Director, Data and Information Analysis
Texas Advanced Computing Center
Outline of Presentation
• Science at the Petascale
  – Scaling Resources
  – Scaling Applications
  – Access Mechanisms
• Issues and Impediments
• Framework
Science at the Petascale
• Global Weather Prediction
• Understanding Chains of Reactions within Living Cells
• Formation and Evolution of Stars and Galaxies in the Early Universe
Scaling HPC Resources
• Mission
  – Provide greater computational capacity/capability to the science community to compute ever larger simulations
• Enablers:
  – Commodity multi-core chip sets with low power/cooling requirements
  – Efficient packaging for a compact footprint
  – High-speed commodity interconnects for fast communications
  – Affordable! (Nodes with 8 cores and 2 GB/core of memory in the $2K/node price range)
TeraGrid Network Map: Before Track2 (BT2)
• Largest single machine: 96 TF
TeraGrid Network Map: After Track2 and Track1 (AT2/AT1)
• Ranger: Feb 2008 (0.5 PF)
• Kraken: June 2008 (~1 PF)
• Track2C: 2010 (>1 PF)
• Track1: 2010 (10 PF)
Scaling Analysis Resources
• Mission
  – Provide an interactive interface allowing users to manipulate/view the results of their science
• Enablers:
  – Commodity chips with low power/cooling requirements? Commodity graphics chips yes; low power/cooling, no!
  – Efficient packaging for a compact footprint? Until recently, desktop box packaging; now available in rack-mounted 2U boxes
  – High-speed commodity interconnects for fast communications? Yes!
  – Affordable? (Nodes with 8 cores and 6 GB/core of memory in the $10K/node price range) No!
TeraGrid Network Map: Before Track2 (BT2)
• Largest single machine: 96 TF
• Maverick: 0.5 TB shared memory, 16 GPUs, 128 cores
• UIC/ANL Cluster: 96 nodes, 4 GB/node, 96 GPUs
TeraGrid Network Map: After Track2 (AT2)
• Spur: 1 TB aggregate memory, 32 GPUs, 128 cores
• UIC/ANL Cluster: 96 nodes, 4 GB/node, 96 GPUs
• Compute systems as above: Ranger (0.5 PF), Kraken (~1 PF), Track2C (>1 PF), Track1 (10 PF)
Impediments to Scaling Analysis Resources
• Power and cooling requirements: 10x more power needed for the analysis resource!
• Footprint: 2x more space needed for the analysis resource!
• Cost: 5x more money needed for a comparable analysis resource!
Scaling HPC Applications
• Mission
  – As the number of processing cores increases, scale as close to linearly as possible
• Enablers:
  – Science-driven need to solve larger and larger problems, so a significant intellectual body of work has been applied to scaling applications
  – There is basic information that you know ahead of time (see the sketch below):
    • Size of the problem you want to solve
    • Number of unknowns that you are trying to solve for
    • Decomposition strategy
    • Communication patterns between nodes
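A minimal sketch (mine, not from the talk) of the static plan this foreknowledge permits: a 1-D block decomposition with ghost-cell exchange, where the global size, each rank's block, and the neighbor communication pattern are all fixed before the first timestep. The array size and the even-divisibility assumption are illustrative.

```c
/* Hypothetical sketch: 1-D block decomposition with halo exchange.
 * Everything is known ahead of time: global size N, each rank's
 * block, and the fixed left/right communication pattern.           */
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, nprocs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    const int N = 1 << 20;              /* global problem size (illustrative) */
    int local_n = N / nprocs;           /* assume N divides evenly            */
    double *u = calloc(local_n + 2, sizeof *u);  /* +2 ghost cells            */

    int left  = (rank > 0)          ? rank - 1 : MPI_PROC_NULL;
    int right = (rank < nprocs - 1) ? rank + 1 : MPI_PROC_NULL;

    /* Exchange boundary cells with fixed neighbors; this pattern
     * never changes over the life of the run.                      */
    MPI_Sendrecv(&u[1],           1, MPI_DOUBLE, left,  0,
                 &u[local_n + 1], 1, MPI_DOUBLE, right, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    MPI_Sendrecv(&u[local_n],     1, MPI_DOUBLE, right, 1,
                 &u[0],           1, MPI_DOUBLE, left,  1,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    free(u);
    MPI_Finalize();
    return 0;
}
```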
Application Examples: DNS/Turbulence
Courtesy: P.K. Yeung, Diego Donzis, TG 2008
Application Example: Earth Sciences, Mantle Convection (AMR Method)
Courtesy: Omar Ghattas et al.
Scaling Analysis Applications
• Mission
  – As the number of processing cores increases, scale as close to linearly as possible
• Enablers:
  – Science-driven need to solve larger and larger problems? Yes, but it's more complicated than that
  – Is there basic information that you know ahead of time?
    • Size of the problem you want to analyze? Yes
    • Decomposition strategy? Tricky!
    • Communication patterns between nodes? Dependent on your decomposition strategy!
Impediments to Scaling Analysis Applications
• The decomposition strategy is a moving target: it is tied to the viewpoint (see the sketch below)
• Analysis carries an additional requirement for interactive frame-rate performance!
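A hedged illustration (the names and the crude visibility test are mine, not SCOREVIS code) of why the decomposition moves: the set of data bricks a rank must process is recomputed from the camera, so every rotation or zoom can reshuffle the work assignment.

```c
/* Hypothetical sketch: view-dependent work assignment. Unlike the
 * fixed HPC halo pattern above, this partition must be rebuilt
 * every time the viewpoint changes.                                */
typedef struct { float cx, cy, cz, r; } Brick;   /* bounding sphere */
typedef struct { float px, py, pz;               /* eye position    */
                 float dx, dy, dz; } View;       /* view direction  */

/* Crude visibility test: keep bricks not entirely behind the eye
 * plane (a real system would test the full view frustum).          */
static int brick_visible(const Brick *b, const View *v)
{
    float ox = b->cx - v->px, oy = b->cy - v->py, oz = b->cz - v->pz;
    return ox * v->dx + oy * v->dy + oz * v->dz > -b->r;
}

/* Recompute this rank's work list for the current frame.           */
int assign_visible(const Brick *bricks, int n, const View *v,
                   int *work_list)
{
    int m = 0;
    for (int i = 0; i < n; i++)
        if (brick_visible(&bricks[i], v))
            work_list[m++] = i;
    return m;
}
```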
Accessing HPC Applications
• Mission:
  – Provide mechanisms for submitting jobs and perhaps monitoring job performance
• Enablers:
  – Schedulers for submitting jobs (they come with a price!)
• Impediments:
  – Weak support for interactive applications
  – Still in the mode of hypothesize, run, check…
Accessing Analysis Applications
• Mission:
  – Provide mechanisms for interactively running applications to analyze data
• Enablers:
  – Lots of intellectual capital in remote and collaborative access mechanisms; this is where we are ahead of the HPC community
    • Remote desktop
    • VNC
    • AccessGrid
Impediments to Reaching the Petascale
• 10x power requirement
• 2x space requirement
• 5x more expensive
• Tenuous balance between the requirement for interactive performance and the need to scale to more processing cores
• Retrofitting our access mechanisms to work with batch schedulers
Requirements for Designing a Framework for Visualizing at the Petascale
• 10x power requirement
• 2x space requirement
• 5x more expensive
• Address the balance between the requirement for interactive performance and the need to scale to more processing cores
• Retrofit our access mechanisms to work with batch schedulers
(The first three items are not something I can address in the short term.)
Requirements for Designing a Framework for Visualizing at the Petascale
• Minimize data movement: users can generate 100s of TB of data but can't move it off the storage local to the machine it was generated on (see the worked figure below)
• Optimize for the platforms that we can run on; data-starved cores become much more apparent
• Reduce the barriers to entry
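A hedged back-of-the-envelope figure for the first requirement (the link speed is an assumption, not a number from the talk): moving 100 TB over a sustained 10 Gb/s wide-area link takes 8 × 10^14 bits ÷ 10^10 bits/s = 80,000 s, roughly 22 hours, plus a second 100 TB of storage at the destination. Analyzing the data in place on the file system it was written to avoids both costs.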
SCOREVIS Software Stack
• Scalable, Collaborative and Remote Visualization
• NSF STCI-funded project that began March 1
• Balance Goals:
  – Accessibility: provide remote and collaborative access to visualization applications over common networks with standard communications protocols.
  – Rendering: include data decomposition, the transformation from data primitives to geometric primitives, and the transformation from geometric primitives to pixels.
  – Scalability: choose between image decomposition and data decomposition depending on the underlying size of the data and the number of processors available (a sketch of such a policy follows this list).
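A minimal sketch of the scalability goal as a policy function; the function name, the memory threshold, and the decision rule are illustrative assumptions, not measured SCOREVIS behavior.

```c
/* Hypothetical policy: pick image-space or data-space decomposition
 * from the data size and processor count, per the scalability goal. */
#include <stddef.h>

typedef enum { IMAGE_DECOMP, DATA_DECOMP } DecompMode;

DecompMode choose_decomposition(size_t data_bytes, int nprocs,
                                size_t mem_per_proc)
{
    /* If one processor's share of the data no longer fits in its
     * memory, the data itself must be split across nodes; otherwise
     * splitting the image keeps every processor busy with less
     * inter-node traffic. The threshold is an assumption.           */
    if (data_bytes / (size_t)nprocs > mem_per_proc)
        return DATA_DECOMP;
    return IMAGE_DECOMP;
}
```

The trade-off behind the rule: image (sort-first) decomposition moves little data but load-balances poorly when geometry clusters on screen, while data (sort-last) decomposition scales with data size at the cost of a final compositing step.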
SCOREVIS Requirements
• Minimize data movement
• Address the balance between the requirement for interactive performance and the need to scale to more processing cores
• Retrofit our access mechanisms to work with batch schedulers
• Optimize for the platforms that we can run on; data-starved cores become much more apparent
• Reduce the barriers to entry
SCOREVIS Approach
• Minimize data movement: move the analysis to where the data is generated
• Address the balance between interactive performance and scaling to more processing cores: address data decomposition and the scaling of applications and core algorithms
• Retrofit our access mechanisms to work with batch schedulers: allow remote and collaborative access
• Reduce the barriers to entry: a phased approach providing access to familiar, OpenGL-based applications
Traditional OpenGL Architecture
[Diagram] Two paths through the traditional OpenGL stack:
• Local rendering, all on one application node: Application → OpenGL → Hardware → Screen
• Indirect rendering: the application on the client host calls libGL/Xlib, which speaks GLX and X protocol across the network to the X server driving the OpenGL hardware and screen on the user's local display (a minimal code sketch of this indirect path follows)
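A minimal GLX sketch of the indirect path described above; error handling and proper visual/colormap matching for the window are omitted, and the snippet is mine rather than from the talk.

```c
/* Hypothetical sketch of the traditional indirect path: the app calls
 * libGL/Xlib, and rendering commands travel as GLX/X protocol to the
 * X server that owns the OpenGL hardware and the screen.             */
#include <X11/Xlib.h>
#include <GL/glx.h>

int main(void)
{
    Display *dpy = XOpenDisplay(NULL);  /* may point at a remote X server */
    if (!dpy) return 1;

    int attrs[] = { GLX_RGBA, GLX_DOUBLEBUFFER, None };
    XVisualInfo *vi = glXChooseVisual(dpy, DefaultScreen(dpy), attrs);
    if (!vi) return 1;

    /* direct = False requests an indirect context, so GL commands are
     * serialized over the GLX wire protocol.                          */
    GLXContext ctx = glXCreateContext(dpy, vi, NULL, False);

    /* A real program must create the window with vi->visual and a
     * matching colormap; the default visual is used here for brevity. */
    Window win = XCreateSimpleWindow(dpy, DefaultRootWindow(dpy),
                                     0, 0, 640, 480, 0, 0, 0);
    XMapWindow(dpy, win);
    glXMakeCurrent(dpy, win, ctx);

    /* ... issue OpenGL calls here; they execute on the server side ... */

    glXMakeCurrent(dpy, None, NULL);
    glXDestroyContext(dpy, ctx);
    XCloseDisplay(dpy);
    return 0;
}
```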
SCoReViS Architecture
[Diagram] SCoReViS stack: application processing (in some cases) and application/UI rendering run under Chromium, with a Chromium server performing hardware-accelerated OpenGL or Mesa software rendering and Chromium handling compositing; a VNC server on the login node (or a compute node) serves the resulting display to one or more VNC clients.
SCOREVIS To Date
• We have been successful in providing remote and collaborative access to visualization applications based on OpenGL, with a caveat (ParaView, VisIt, and home-grown codes)
  – We did get "interactive" frame rates of 6-10 fps
• We have been successful in profiling to better understand where the bottlenecks exist in the analysis pipeline (see the worked figure below):
  – I/O (Lustre parallel file system, ~32 GB/sec)
  – Core visualization algorithms (current apps do not do a good job of load balancing)
  – Rendering in Mesa (we quickly found that native Mesa does not handle multiple cores)
• We are also developing quick-and-dirty ways to handle in-situ analysis
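A hedged sanity check on that I/O rate (the 100 TB dataset size is illustrative, not a measured case): one pass over 100 TB at ~32 GB/sec is 10^14 B ÷ 3.2 × 10^10 B/s ≈ 3,100 s, nearly an hour before any visualization work starts, which is why data-starved cores show up so quickly in the profiles.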
Acknowledgements:
• Thanks to the National Science Foundation Office of Cyberinfrastructure for supporting this work through STCI Grant #0751397.