Harnessing Grid-Based Parallel Computing Resources for Molecular Dynamics Simulations Josh Hursey


Villin Folding



Overview

Folding@Clusters is an adaptive framework for harnessing low-latency parallel compute resources for protein folding research.

It combines capability discovery, load balancing, process monitoring, and checkpoint/restart services to provide a platform for molecular dynamics simulations on a range of grid-based parallel computing resources, including clusters, SMP machines, and clusters of SMP machines (sometimes known as constellations).
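To make the checkpoint/restart idea concrete, here is a minimal, generic sketch of the pattern. It is not the Folding@Clusters implementation: the function names, the JSON state file, and the toy "work" step are all assumptions made for illustration. It shows a job that saves its progress periodically and, after a failure, resumes from the last saved state rather than from the beginning.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state atomically: write a temp file, then rename it over the target."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        json.dump(state, handle)
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return the saved state, or a fresh starting state if no checkpoint exists."""
    if os.path.exists(path):
        with open(path) as handle:
            return json.load(handle)
    return {"step": 0, "value": 0.0}


def run(path, total_steps, checkpoint_every, fail_at=None):
    """Advance a toy job, checkpointing every `checkpoint_every` steps.

    `fail_at` simulates a crash once that many steps have completed
    (hypothetical, for illustration only).
    """
    state = load_checkpoint(path)
    while state["step"] < total_steps:
        if fail_at is not None and state["step"] == fail_at:
            raise RuntimeError("simulated node failure")
        state["step"] += 1
        state["value"] += 0.5  # stand-in for one unit of real work
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state
```

A run that fails after step 7 with a checkpoint interval of 3 leaves a checkpoint at step 6, so calling `run` again with the same path repeats only the work done since then.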


Design Goals

Provide an easy-to-use, open source interface to significant computing resources for scientists performing molecular dynamics simulations on large biomolecular systems.

•Automate the process of running molecular systems on a variety of parallel computing resources
•Handle failures gracefully & automatically
•Don't hinder performance
•Ease of use for scientists, system administrators, & contributors
•Provide low-friction install, configuration, & run-time interfaces
•Sustain tight linkage with the Folding@Home project


Open Source Building Blocks

•GROMACS: Molecular dynamics software package. Primary scientific core.
•FFTW: Fast Fourier Transform library. Used internally in GROMACS.
•LAM/MPI: Message Passing Interface implementation. Supports the MPI-2.0 specification.
•COSM: Distributed computing library to aid in portability. Provides capability discovery, logging, & base utilities.
•NetPipe: Common tool for measuring bandwidth & latency. Used in capability discovery.
•Folding@Home: Large-scale distributed computing project. Foundation for this project.


Contributor Setup

1. Create a user to run Folding@Clusters
2. Download & unpack the distribution
3. Confirm LAM/MPI installation & configuration
4. Start LAM/MPI: $ lamboot
5. Configure Folding@Clusters using mother.conf
6. Start Folding@Clusters: $ mpirun -np 1 bin/mother

$ lamnodes
n0 c1.cluster.earlham.edu:2:origin,this_node
n1 c2.cluster.earlham.edu:2:
n2 c3.cluster.earlham.edu:2:
n3 c4.cluster.earlham.edu:2:
n4 c5.cluster.earlham.edu:2:
n5 c6.cluster.earlham.edu:2:
n6 c7.cluster.earlham.edu:2:
n7 c8.cluster.earlham.edu:2:
n8 c9.cluster.earlham.edu:2:
n9 c10.cluster.earlham.edu:2:

$ cat conf/mother.conf
[Network]
LamHosts=n0,n1,n2,n3,n4,n5,n6
LamMother=n0
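The relationship between the lamnodes listing and mother.conf can be checked with a short script. The sketch below is not part of Folding@Clusters; the helper names are hypothetical, and it assumes that LamHosts and LamMother name LAM node identifiers as printed by lamnodes, which is how the example on this slide reads. Only the two keys shown on the slide are used.

```python
import configparser

# Text copied from the slide.
MOTHER_CONF = """\
[Network]
LamHosts=n0,n1,n2,n3,n4,n5,n6
LamMother=n0
"""

LAMNODES_OUTPUT = """\
n0 c1.cluster.earlham.edu:2:origin,this_node
n1 c2.cluster.earlham.edu:2:
n2 c3.cluster.earlham.edu:2:
n3 c4.cluster.earlham.edu:2:
n4 c5.cluster.earlham.edu:2:
n5 c6.cluster.earlham.edu:2:
n6 c7.cluster.earlham.edu:2:
n7 c8.cluster.earlham.edu:2:
n8 c9.cluster.earlham.edu:2:
n9 c10.cluster.earlham.edu:2:
"""


def parse_lamnodes(text):
    """Map each LAM node id (e.g. 'n0') to a (hostname, cpu_count) pair."""
    nodes = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        node_id, rest = line.split(None, 1)
        hostname, cpus, _flags = rest.split(":", 2)
        nodes[node_id] = (hostname, int(cpus))
    return nodes


def check_conf(conf_text, lamnodes_text):
    """Validate the [Network] section against the booted LAM nodes.

    Returns (hosts, mother, total_cpus) or raises ValueError.
    """
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    network = parser["Network"]
    hosts = [h.strip() for h in network["LamHosts"].split(",")]
    mother = network["LamMother"].strip()

    booted = parse_lamnodes(lamnodes_text)
    missing = [h for h in hosts if h not in booted]
    if missing:
        raise ValueError(f"LamHosts entries not booted: {missing}")
    if mother not in hosts:
        raise ValueError(f"LamMother {mother!r} is not listed in LamHosts")

    total_cpus = sum(booted[h][1] for h in hosts)
    return hosts, mother, total_cpus


if __name__ == "__main__":
    hosts, mother, total_cpus = check_conf(MOTHER_CONF, LAMNODES_OUTPUT)
    print(f"{len(hosts)} hosts, mother on {mother}, {total_cpus} CPUs")
```

With the slide's values, the configuration selects seven of the ten booted nodes, each reporting two CPUs.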


Testing Environment: Cairo

•Network Fabric: 2 Netgear GSM712 1000 Mb switches, linked together by dual GBIC/1000 BT RJ45 modules
•OS: Yellow Dog Linux (4.0 Release, 2.6.8-1 SMP Kernel)
•GCC: 3.3.3-16

Nodes            16 Apple Xserves
Processor        Dual G4 PowerPC 999 MHz
L2 Cache         256 KB
L3 Cache         2 MB
Front Side Bus   133 MHz
RAM              1 GB PC2100 DDR RAM
NIC              1 on-board 10/100/1000 BT, 1 PCI 10/100/1000 BT
Hard Drive       60 GB IBM Ultra ATA/100 7200 RPM w/ 2 MB cache
Motherboard      Apple Proprietary


Testing Environment: Molecules

Molecule              Mass Points   Description
DPPC                  121,856       A phospholipid membrane, consisting of 1024 dipalmitoylphosphatidylcholine (DPPC) lipids in a bilayer configuration with 23 water molecules per lipid.
Proteasome (Stable)   119,507       A peptide in a proteasome with explicit solvent and a Coulomb type of reaction field.
Villin                9,389         The Villin headpiece, a 35 residue peptide, simulated with 3000 water molecules.
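The DPPC mass-point count can be reproduced from the description with simple arithmetic. The sketch below rests on two assumptions that the slide does not state: a united-atom lipid model with 50 interaction sites per DPPC molecule, and a 3-site water model. Under those assumptions the total matches the 121,856 listed for DPPC.

```python
LIPIDS = 1024            # from the slide
WATERS_PER_LIPID = 23    # from the slide
SITES_PER_LIPID = 50     # assumption: united-atom DPPC model
SITES_PER_WATER = 3      # assumption: 3-site water model


def dppc_mass_points(lipids=LIPIDS,
                     waters_per_lipid=WATERS_PER_LIPID,
                     sites_per_lipid=SITES_PER_LIPID,
                     sites_per_water=SITES_PER_WATER):
    """Total interaction sites: lipid sites plus water sites."""
    lipid_sites = lipids * sites_per_lipid
    water_sites = lipids * waters_per_lipid * sites_per_water
    return lipid_sites + water_sites


if __name__ == "__main__":
    # 1024 * 50 = 51,200 lipid sites; 1024 * 23 * 3 = 70,656 water sites
    print(dppc_mass_points())
```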


Performance

[Plots: DPPC, Proteasome (Stable), Villin]


Future Directions

•New scientific cores (Amber, NAMD, etc…)

•Remove dependencies on pre-installed software

•Extend testing suite of molecules

•Extend range of parallel compute resources used in testing

•Abstract the @Clusters framework

•Investigate load balancing & resource usage improvements

•Architecture addition: Grandmothers

•Beta Release!


About Us

Charles Peck

Josh McCoy

John Schaefer

Vijay Pande

Erik Lindahl

Adam Beberg

Josh Hursey


Questions


Speedup

[Plots: DPPC, Proteasome (Stable), Villin]
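For readers without the plots, speedup is conventionally defined as the single-processor run time divided by the run time on p processors, and parallel efficiency as speedup divided by p. The sketch below applies those standard definitions. The timings in it are invented placeholders chosen only to exercise the functions; they are not measurements from this talk.

```python
def speedup(t_serial, t_parallel):
    """Speedup S(p) = T(1) / T(p)."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, procs):
    """Parallel efficiency E(p) = S(p) / p, where 1.0 is ideal scaling."""
    return speedup(t_serial, t_parallel) / procs


def scaling_table(timings):
    """Turn {procs: seconds} into rows of (procs, speedup, efficiency).

    The entry for 1 processor is used as the serial baseline.
    """
    t_serial = timings[1]
    return [
        (procs, speedup(t_serial, t), efficiency(t_serial, t, procs))
        for procs, t in sorted(timings.items())
    ]


if __name__ == "__main__":
    # Hypothetical wall-clock times in seconds, NOT results from the talk.
    placeholder_timings = {1: 100.0, 2: 50.0, 4: 25.0, 8: 20.0}
    for procs, s, e in scaling_table(placeholder_timings):
        print(f"{procs:>2} procs  speedup {s:.2f}  efficiency {e:.2f}")
```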


Testing Environment: Bazaar

•Network Fabric: 2 switches (3Com 3300XM 100 Mb, 3Com 3300 100 Mb), linked together by a 3Com MultiLink cable
•OS: SuSE Linux (2.6.4-52 SMP Kernel)
•GCC: 3.3.3

Nodes            16 VA Linux 2200s
Processor        Dual Pentium III 500 MHz
L1 Cache         32 KB
L2 Cache         512 KB
Front Side Bus   100 MHz
RAM              512 MB SDRAM
NIC              Intel Pro 10/100B/100+ Ethernet (on-board)
Hard Drive       18 GB WD Caviar 7200
Motherboard      Intel L330GX+ Server Board


Motivation