Abhinav Bhatele, Laxmikant V. Kale (University of Illinois at Urbana-Champaign)
Sameer Kumar (IBM T. J. Watson Research Center)


Molecular Dynamics

A system of [charged] atoms with bonds
Use Newtonian mechanics to find the positions and velocities of atoms
Each time step is typically in femtoseconds
At each time step:
  calculate the forces on all atoms
  calculate the velocities and move atoms around
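The per-step loop above can be sketched with a standard velocity Verlet integrator. This is an illustrative toy (a single particle in a harmonic well), not NAMD's actual integrator, and `forces_fn`, `mass`, and `dt` are names chosen here for the sketch:

```python
import numpy as np

def velocity_verlet(pos, vel, forces_fn, mass, dt, steps):
    """Advance the system `steps` time steps with velocity Verlet."""
    f = forces_fn(pos)
    for _ in range(steps):
        vel += 0.5 * dt * f / mass   # half-kick from current forces
        pos += dt * vel              # drift: move atoms around
        f = forces_fn(pos)           # recompute forces at new positions
        vel += 0.5 * dt * f / mass   # second half-kick
    return pos, vel

# Toy example: one particle in a harmonic well (force = -x, k = m = 1).
harmonic = lambda x: -x
p, v = velocity_verlet(np.array([1.0]), np.array([0.0]),
                       harmonic, 1.0, 0.01, 10)
```

For the harmonic well the total energy 0.5*v^2 + 0.5*x^2 should stay close to its initial value of 0.5, which is a quick sanity check on the integrator.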

September 9th, 2008 Abhinav S Bhatele 2

NAMD: NAnoscale Molecular Dynamics

Naïve force calculation is O(N²)
Reduced to O(N log N) by calculating:
  Bonded forces
  Non-bonded forces, using a cutoff radius
    Short-range: calculated every time step
    Long-range: calculated every fourth time step (PME)
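The cutoff-radius idea for short-range non-bonded forces can be illustrated with a cell (bin) list: binning atoms into cells of side at least the cutoff means each atom only checks its own and neighboring cells, rather than all N-1 partners. This is a minimal sketch without periodic boundaries, not NAMD's implementation:

```python
import itertools
import numpy as np

def cutoff_pairs(pos, box, rc):
    """Return all atom pairs closer than cutoff rc, via a cell list.

    pos: (N, 3) array of coordinates in [0, box)^3; rc: cutoff radius.
    Work is roughly O(N) at fixed density, versus O(N^2) naively.
    """
    ncell = max(1, int(box // rc))      # cells per dimension, side >= rc
    side = box / ncell
    cells = {}
    for i, p in enumerate(pos):
        key = tuple(np.minimum((p // side).astype(int), ncell - 1))
        cells.setdefault(key, []).append(i)
    pairs = set()
    for key, atoms in cells.items():
        # Check this cell and its 26 neighbors only.
        for d in itertools.product((-1, 0, 1), repeat=3):
            nkey = tuple(k + dd for k, dd in zip(key, d))
            for i in atoms:
                for j in cells.get(nkey, []):
                    if j > i and np.linalg.norm(pos[i] - pos[j]) < rc:
                        pairs.add((i, j))
    return pairs
```

A quick check is to compare the result against a brute-force O(N²) scan on random coordinates; the two must agree exactly.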


NAMD’s Parallel Design

Hybrid of spatial and force decomposition


Parallelization using Charm++


Static Mapping

Load Balancing

Bhatele, A., Kumar, S., Mei, C., Phillips, J. C., Zheng, G. & Kale, L. V. 2008 Overcoming Scaling Challenges in Biomolecular Simulations across Multiple Platforms. In Proceedings of IEEE International Parallel and Distributed Processing Symposium, Miami, FL, USA, April 2008.

Communication in NAMD

Each patch multicasts its information to many computes
Each compute is a target of only two multicasts
Use ‘proxies’ to send data to different computes on the same processor
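The saving from proxies can be made concrete by counting messages per patch: without proxies, a patch sends one message per target compute; with proxies, one message per distinct processor, and the proxy forwards locally. The data structures and names below are illustrative, not NAMD's API:

```python
def messages_with_proxies(targets):
    """Count multicast messages without and with per-processor proxies.

    targets maps each patch to a list of (processor, compute) pairs
    that need its atom data. With proxies, the patch sends once per
    distinct processor and the proxy fans out to local computes.
    """
    without = sum(len(t) for t in targets.values())
    with_proxies = sum(len({proc for proc, _ in t})
                       for t in targets.values())
    return without, with_proxies

# One patch feeding six computes spread over two processors:
demo = {"patch0": [(0, "c1"), (0, "c2"), (0, "c3"),
                   (1, "c4"), (1, "c5"), (1, "c6")]}
```

Here the patch would send 6 messages without proxies but only 2 with them, one per processor.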


Topology Aware Techniques

Static Placement of Patches


Topology Aware Techniques (contd.)

Placement of Computes


Load Balancing in Charm++

Principle of Persistence:
  Object communication patterns and computational loads tend to persist over time
Measurement-based Load Balancing:
  Instrument computation time and communication volume at runtime
  Use the database to make new load balancing decisions
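The principle of persistence is what makes measured loads useful: the load an object showed in the last interval is a reasonable prediction for the next one. A minimal sketch of such a measurement database (illustrative only, not Charm++'s actual load balancing interface) might look like:

```python
class LoadDB:
    """Toy measurement database for persistence-based load balancing.

    Records the last measured load per object; since loads tend to
    persist, these measurements can drive the next migration decision.
    """

    def __init__(self):
        self.samples = {}            # object id -> last measured load

    def record(self, obj, load):
        """Store the load instrumented for `obj` in the last interval."""
        self.samples[obj] = load

    def heaviest(self, n):
        """The n most loaded objects: candidates to consider moving."""
        return sorted(self.samples, key=self.samples.get,
                      reverse=True)[:n]
```

A real system would also record communication volume per object pair, so the balancer can weigh migration against added communication.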


NAMD’s Load Balancing Strategy

NAMD uses a dynamic, centralized, greedy strategy
There are two schemes in play:
  A comprehensive strategy (called once)
  A refinement scheme (called several times during a run)
Algorithm: pick a compute and find a “suitable” processor to place it on


Choice of a Suitable Processor

Among underloaded processors, try to:
  1. Find a processor with the two patches or their proxies (highest priority)
  2. Find a processor with one patch or a proxy
  3. Pick any underloaded processor (lowest priority)
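The priority tiers above can be sketched as a greedy choice for a single compute. The dictionaries and field names here are invented for the illustration, and ties are broken by least load, which is one reasonable policy rather than NAMD's exact rule:

```python
def choose_processor(compute, procs, avg_load):
    """Greedy placement for one compute (illustrative sketch).

    compute: {"patches": (p1, p2)} - the two patches it reads.
    procs: list of {"id", "load", "patches"} where "patches" is the
           set of patches (or proxies) already resident there.
    Prefer an underloaded processor holding both patches, then one
    holding one of them, then any underloaded processor.
    """
    p1, p2 = compute["patches"]
    under = [p for p in procs if p["load"] < avg_load]
    for want in (2, 1, 0):                      # priority tiers
        tier = [p for p in under
                if len({p1, p2} & p["patches"]) >= want]
        if tier:
            # Break ties toward the least loaded candidate.
            return min(tier, key=lambda p: p["load"])["id"]
    return None   # no underloaded processor available
```

Placing a compute next to its patches avoids creating new proxies, which is exactly why the tiers are ordered this way.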

Load Balancing Metrics

Load Balance: bring the max-to-avg load ratio close to 1
Communication Volume: minimize the number of proxies
Communication Traffic: minimize hop-bytes
  Hop-bytes = message size × distance traveled by the message
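The hop-bytes metric is straightforward to compute once a distance function for the interconnect is fixed. The sketch below assumes a 3D torus (as on Blue Gene-class machines) with shortest-path wrap-around routing; the function names and message tuple layout are choices made for this illustration:

```python
def torus_hops(a, b, dims):
    """Shortest-path hop count between nodes a and b on a 3D torus.

    Each dimension can be traversed either way, so the distance per
    dimension is the smaller of the direct and wrap-around paths.
    """
    return sum(min(abs(x - y), d - abs(x - y))
               for x, y, d in zip(a, b, dims))

def hop_bytes(messages, dims):
    """Total hop-bytes = sum over messages of size * hops traveled.

    messages: iterable of (src_coords, dst_coords, size_in_bytes);
    dims: torus dimensions, e.g. (8, 8, 8) for a 512-node machine.
    """
    return sum(size * torus_hops(src, dst, dims)
               for src, dst, size in messages)
```

Minimizing hop-bytes keeps large messages on nearby nodes, which reduces link contention even when the total communication volume is unchanged.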


Agarwal, T., Sharma, A. and Kale, L. V. 2006. Topology-aware task mapping for reducing communication contention on large parallel machines. In Proceedings of the IEEE International Parallel and Distributed Processing Symposium, Rhodes Island, Greece, April 2006.

Results: Hop-bytes


Results: Performance


Simulation of WW Domain

WW: a 30,591-atom simulation on NCSA’s Abe cluster


Freddolino, P. L., Liu, F., Gruebele, M. and Schulten, K. 2008. Ten-microsecond MD simulation of a fast-folding WW domain. Biophysical Journal 94:L75-L77.

Future Work

A scalable distributed load balancing strategy
Generalized scenario:
  multicasts: each object is the target of multiple multicasts
  use topological information to minimize communication
Understanding the effect of various factors on load balancing in detail


NAMD Development Team:
  Parallel Programming Lab, UIUC: Abhinav Bhatele, Sameer Kumar, David Kunzman, Chee Wai Lee, Chao Mei, Gengbin Zheng, Laxmikant V. Kale
  Theoretical and Computational Biophysics Group: Jim Phillips, Klaus Schulten

Acknowledgments:
  Argonne National Laboratory, Pittsburgh Supercomputing Center (Shawn Brown, Chad Vizino, Brian Johanson), TeraGrid