
NWChem: Computational Chemistry Software for Parallel Computers

Edoardo Aprà
William R. Wiley Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory

Outline of Talk

Overview of NWChem

Parallel Performance

Hardware and Software requirements

Molecular Science Software Group

Gary Black, Liz Jurrus, Carina Lansing, Bruce Palmer, Karen Schuchardt, Eric Stephan, Erich Vorpagel

Edoardo Aprà, Eric Bylaska, Mahin Hackler, Bert de Jong, So Hirata, Lisa Pollack, Tjerk Straatsma, Marat Valiev, Theresa Windus

Manoj Kumar Krishnan, Jarek Nieplocha, Bruce Palmer, Vinod Tipparaju

Why NWChem Was Developed

Developed as part of the construction of the Environmental Molecular Sciences Laboratory (EMSL)

Envisioned to be used as an integrated component in solving DOE’s Grand Challenge environmental restoration problems

Designed and developed to be a highly efficient and portable Massively Parallel computational chemistry package

Provides computational chemistry solutions that are scalable with respect to chemical system size as well as MPP hardware size

NWChem Distribution

NWChem Architecture

[Architecture diagram: Generic Tasks (energy, structure, ...) sit on top of the Molecular Calculation Modules (SCF energy, gradient, ...; DFT energy, gradient, ...; MD, NMR, Solvation, ...; Optimize, Dynamics, ...), which are built on the Molecular Modeling Toolkit (Integral API, Geometry Object, Basis Set Object, PeIGS, ...) and the Molecular Software Development Toolkit (Run-time database, Global Arrays, Memory Allocator, Parallel IO).]

• Object-oriented design
  • abstraction, data hiding, APIs
• Parallel programming model
  • non-uniform memory access, global arrays, MPI
• Infrastructure
  • GA, Parallel I/O, RTDB, MA, ...
• Program modules
  • communication only through the database
  • persistence for easy restart (see the sketch below)
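For example, the persistence provided by the run-time database is visible at the input level: a follow-up job can pick up everything a previous run stored by using the restart directive in place of start. A minimal sketch, assuming an earlier job was started as "start h2o_example" (the job name is illustrative):

```
# assumes an earlier job began with:  start h2o_example
restart h2o_example     # reopen that job's run-time database
task dft optimize       # continue, e.g., with a geometry optimization
```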

Global Arrays

Distributed dense arrays that can be accessed through a shared-memory-like style

Single, shared data structure with global indexing: e.g., access A(4,3) rather than buf(7) on task 2 (see the code sketch below)

[Figure: the same logical array viewed as physically distributed data and as a single global address space.]
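To make the global-indexing idea concrete, here is a minimal sketch using the C interface of the Global Arrays toolkit (GA built on top of MPI); the 100 x 100 array, the element (4,3), and the memory-allocator sizes are illustrative choices, not details taken from the slides.

```c
#include <stdio.h>
#include <mpi.h>
#include "ga.h"
#include "macdecls.h"

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    GA_Initialize();
    MA_init(C_DBL, 1000000, 1000000);   /* memory allocator used internally by GA */

    int dims[2]  = {100, 100};          /* global 100 x 100 matrix                */
    int chunk[2] = {-1, -1};            /* let GA choose the data distribution    */
    int g_a = NGA_Create(C_DBL, 2, dims, "A", chunk);
    GA_Zero(g_a);

    if (GA_Nodeid() == 0) {
        /* Write element A(4,3) by its global indices (0-based in the C API);
           no need to know which task owns it or its local buffer offset.     */
        int lo[2] = {4, 3}, hi[2] = {4, 3}, ld[1] = {1};
        double val = 1.0;
        NGA_Put(g_a, lo, hi, &val, ld);
    }
    GA_Sync();                          /* make the update globally visible       */

    /* Any task reads the same element with the same global indices. */
    int lo[2] = {4, 3}, hi[2] = {4, 3}, ld[1] = {1};
    double val;
    NGA_Get(g_a, lo, hi, &val, ld);
    printf("task %d sees A(4,3) = %f\n", GA_Nodeid(), val);

    GA_Destroy(g_a);
    GA_Terminate();
    MPI_Finalize();
    return 0;
}
```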

Global Arrays (cont.)

Shared memory model in the context of distributed dense arrays

Complete environment for parallel code development

Compatible with MPI

Data locality control similar to distributed memory/message passing model

Extensible and scalable

Compatible with other libraries: ScaLAPACK, PeIGS, etc.

Part of a bigger system for NUMA programming:

  parallel I/O extensions: Disk Resident Arrays

  per-processor private files: Exclusive Access Files

Structure of GA

[Layer diagram: the application programming language interfaces (Fortran 77, C, C++, Babel, F90, Python, Java) sit on top of the distributed arrays layer (memory management, index translation), which relies on Message Passing for global operations and on ARMCI for portable 1-sided communication (put, get, locks, etc.), implemented over system-specific interfaces (LAPI, GM/Myrinet, threads, VIA, ...).]

Global Arrays and MPI are completely interoperable: code can contain calls to both libraries.
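As a sketch of what this interoperability looks like in practice, the fragment below mixes an ordinary MPI collective with Global Arrays one-sided calls in the same routine; g_work is a hypothetical handle to an already-created one-dimensional global array, and the values being reduced are made up for illustration.

```c
#include <mpi.h>
#include "ga.h"

/* Sketch: plain MPI calls and Global Arrays calls coexisting in one code path.
   g_work is assumed to be the handle of an existing 1-D global array with at
   least as many elements as there are processes.                              */
void mixed_mpi_ga(int g_work)
{
    int me;
    MPI_Comm_rank(MPI_COMM_WORLD, &me);      /* ordinary MPI query              */

    /* MPI collective: global sum of a per-process scalar. */
    double local = (double)(me + 1), total = 0.0;
    MPI_Allreduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

    /* Global Arrays one-sided update: accumulate the same scalar into element
       'me' of the shared array (alpha is the scale factor applied to the data). */
    int lo[1] = {me}, hi[1] = {me}, ld[1] = {1};
    double alpha = 1.0;
    NGA_Acc(g_work, lo, hi, &local, ld, &alpha);

    GA_Sync();                               /* GA collective synchronization   */
}
```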

Mirrored Arrays

Create Global Arrays that are replicated between SMP nodes but distributed within SMP nodes

Aimed at fast nodes connected by relatively slow networks (e.g. Beowulf clusters)

Use memory to hide latency

Most of the operations supported on ordinary Global Arrays are also supported for mirrored arrays

Global Array toolkit augmented by a merge operation that adds all copies of mirrored arrays together

Easy conversion between mirrored and distributed arrays
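A rough sketch of how this might look through the GA C interface follows; the mirrored-group and merge entry points (GA_Pgroup_get_mirror, GA_Set_pgroup, GA_Merge_mirrored) and the array dimensions are stated here as assumptions about the toolkit's API rather than details taken from the slides.

```c
#include "ga.h"
#include "macdecls.h"

/* Sketch (assumed mirrored-array API): create an array that is replicated
   across SMP nodes but distributed within each node, let every node update
   its own copy locally, then merge (add) all copies together.               */
int create_and_merge_mirrored(void)
{
    int dims[2] = {1000, 1000};                 /* arbitrary example size     */

    int g_m = GA_Create_handle();               /* attribute-based creation   */
    GA_Set_data(g_m, 2, dims, C_DBL);
    GA_Set_pgroup(g_m, GA_Pgroup_get_mirror()); /* mark the array as mirrored */
    GA_Allocate(g_m);
    GA_Zero(g_m);

    /* ... each node fills its own copy with ordinary NGA_Put / NGA_Acc calls,
       e.g. its contribution to a Fock or Kohn-Sham matrix ...                 */

    GA_Merge_mirrored(g_m);                     /* add the node-local copies so
                                                   every copy holds the global
                                                   result                      */
    return g_m;
}
```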

Mirrored Arrays (cont.)

[Diagram comparing three data layouts: Distributed, Mirrored, and Replicated.]

Mirrored Arrays in NWChem/DFT

NWChem Capabilities

Quantum mechanical capabilities:
  Hartree-Fock and density functional theory at the local and nonlocal levels (with N^3 and N^4 formal scaling): energies, gradients, and second derivatives; linear-scaling quadrature and exchange; TDDFT
  Multiconfiguration self-consistent field (MCSCF) energies and gradients
  Many-body perturbation theory energies and gradients
  Effective core potential energies, gradients, and second derivatives
  Coupled cluster [CCSD and CCSD(T)] and configuration interaction energies
  Tensor Contraction Engine module that can generate unrestricted CISD, CISDT, CISDTQ, LCCD, CCD, LCCSD, CCSD, QCISD, CCSDT, CCSDTQ, MBPT(2), MBPT(3), MBPT(4) wavefunctions
  Segmented and generally contracted basis sets, including the correlation-consistent basis sets
  Plane-wave pseudopotential codes (periodic and free-space) with dynamics; PAW

Classical mechanical capabilities:
  Energy minimization; molecular dynamics simulation; ab initio dynamics
  Free energy calculation, supporting variations such as multiconfiguration thermodynamic integration or multiple-step thermodynamic perturbation, first-order or self-consistent electronic polarization, simple reaction field or particle-mesh Ewald, and quantum force-field dynamics
  Mixed QM + MM models and ONIOM
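As an illustration of how these capabilities are driven at the input level, a minimal NWChem input deck for a DFT energy calculation might look like the sketch below (molecule, basis set, and functional are arbitrary example choices):

```
start h2o_example
title "water B3LYP/6-31G* energy"

geometry units angstroms
  O   0.000   0.000   0.000
  H   0.757   0.586   0.000
  H  -0.757   0.586   0.000
end

basis
  * library 6-31G*
end

dft
  xc b3lyp
end

task dft energy
```

Replacing the final line with, e.g., "task mp2 gradient" or "task md dynamics" hands the same system to a different calculation module without changing the rest of the deck.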

Supported Platforms

CRAY T3D and T3E
IBM SP
SGI MIPS and Altix SMP systems
SUN workstations running Solaris
HP PA-RISC workstations running HP-UX
Fujitsu VX/VPP
HP Alpha SMP servers running Tru64 or Linux; HP Alpha SC series
Workstation networks
IA32-based workstations running Linux or FreeBSD
IA32 Linux clusters with Giganet switch using the VIA protocol
IA32 Linux clusters with Myrinet switch using the GM software
IA32-based PCs running Microsoft 32-bit Windows
PowerPC workstations running Linux
Itanium 1/2 servers running Linux or HP-UX
Linux IA32, Itanium, and Alpha clusters with Elan3 switch
AMD Opteron port in progress
Itanium cluster with Elan4 port in progress

Parallel scaling of the DFT code of NWChem

[Scaling plot: Si8O7H18, 347 basis functions, LDA wavefunction.]

Parallel scaling of the DFT code of NWChem

[Scaling plot: Si28O67H30, 1687 basis functions, LDA wavefunction.]

Parallel scaling of the DFT code of NWChem

[Scaling plot: Si28O148H66, 3554 basis functions, LDA wavefunction.]

Parallel scaling of the MP2 code of NWChem

[Scaling plot: (H2O)7, 287 basis functions, MP2 energy & gradient.]

NWChem Molecular Dynamics

[Plot: Haloalkane dehalogenase (41,259 atoms), accumulated time per step versus number of processors on the IBM-SP and HP systems.]

NWChem Molecular Dynamics

[Plots: Haloalkane dehalogenase time per step and parallel scaling versus number of processors on the IBM-SP and HP systems, compared against ideal linear scaling.]

Plane-wave module – Parallel Efficiency

Parallel efficiencies of the NWChem PSPW module for calculations of C40 and (H2O)16 clusters. Calculations performed on a Linux cluster made up of dual-processor 500 MHz Pentium nodes connected via a Giganet high-speed network and on the EMSL IBM SP.

[Plot "Efficiency of PSPW Module": efficiency versus number of processors (1 to 128) for C40 (84 Ry) on Colony, 16 H2O (100 Ry) on Colony, and 16 H2O (100 Ry) on NWMpp1.]

Hardware and Software requirements

Low latency and high bandwidth for I/O and communication
Availability of a large amount of aggregate memory and disk
Flat I/O model using local disk
64-bit addressing (required by highly correlated methods)
Extensive use of linear algebra: BLAS, FFT, eigensolvers
Use of other scientific libraries (e.g., MASS for exp and erfc)

Acknowledgements

NWChem Development Team
Global Array developers
Environmental Molecular Sciences Laboratory (EMSL) operations supported by DOE's Office of Biological and Environmental Research

How do you get NWChem?

http://www.emsl.pnl.gov/pub/docs/nwchem => Register

Website with lots of other NWChem information

Print, fill out, and sign the site agreement form and fax it back to PNNL, where the form will be signed by a PNNL official and download information will be sent via fax

[email protected] for HELP!

Mailing lists: [email protected] and [email protected]

NWChem authors

T. P. Straatsma, E. Aprà, T. L. Windus, M. Dupuis, E. J. Bylaska, W. de Jong, S. Hirata, D. M. A. Smith, M. T. Hackler, L. Pollack, R. J. Harrison, J. Nieplocha, V. Tipparaju, M. Krishnan, A. Auer, E. Brown, G. Cisneros, G. I. Fann, H. Fruchtl, J. Garza, K. Hirao, R. Kendall, J. A. Nichols, M. Nooijen, K. Tsemekhman, M. Valiev, K. Wolinski, J. Anchell, D. Bernholdt, et al.