Excited State Spectroscopy using GPUs

Robert Edwards Jefferson Lab

Hadronic & Nuclear Physics with LQCD

• Hadronic spectroscopy
  – Hadron resonance determinations
  – Exotic meson spectrum (JLab 12 GeV)

• Hadronic structure
  – 3-D picture of hadrons from gluon & quark spin+flavor distributions
  – Ground & excited E&M transition form-factors (JLab 6 GeV + 12 GeV + Mainz)
  – E&M polarizabilities of hadrons (Duke+CERN+Lund)

• Nuclear interactions
  – Nuclear processes relevant for stellar evolution
  – Hyperon-hyperon scattering
  – 3 & 4 nucleon interaction properties [Collab. w/LLNL] (JLab+LLNL)

• Beyond the Standard Model
  – Neutron decay constraints on BSM from Ultra Cold Neutron source (LANL)

Spectroscopy

Spectroscopy reveals fundamental aspects of hadronic physics
  – Essential degrees of freedom?
  – Gluonic excitations in mesons: exotic states of matter?

• Status
  – Can extract excited hadron energies & identify spins
  – Pursuing full QCD calculations with realistic quark masses

• New spectroscopy programs world-wide
  – E.g., BES III, GSI/Panda
  – Crucial complement to 12 GeV program at JLab
    • Excited nucleon spectroscopy (JLab)
    • JLab GlueX: search for gluonic excitations

Excited states: anisotropy+operators+variational

• Anisotropic lattices with Nf=2+1 dynamical fermions
  – Temporal lattice spacing a_t < a_s (spatial lattice spacing)
  – High temporal resolution -> Resolve noisy & excited states
  – Major project within USQCD: Hadron Spectrum Collab.

• Extended operators
  – Subduction: sufficient derivatives -> nonzero overlap at origin

• Variational method:
  – Distillation: matrix of correlators -> project onto excited states

• PRD 78 (2008) & PRD 79 (2009)

• PRD 76 (2007), PRD 77 (2008), PRD 80 (2009), arxiv:1002.0818

• PRD 72 (2005), PRD 72 (2005), PRL 103 (2009)
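
The variational step can be illustrated with a toy model. In its common formulation, the method solves the generalized eigenvalue problem C(t) v = lambda C(t0) v for a matrix of correlators C_ij(t), and the eigenvalues behave as lambda_n ~ exp(-E_n (t - t0)). The sketch below is not the production analysis code: it assumes a 2x2 operator basis, and the energies, overlaps Z, and time slices are hypothetical values chosen only for illustration.

```python
import math

def toy_correlator(t, energies, Z):
    """C_ij(t) = sum_n Z[i][n] * Z[j][n] * exp(-E_n * t) for a 2x2 operator basis."""
    return [[sum(Z[i][n] * Z[j][n] * math.exp(-energies[n] * t) for n in range(2))
             for j in range(2)] for i in range(2)]

def gevp_eigenvalues_2x2(C_t, C_t0):
    """Eigenvalues of C(t0)^{-1} C(t), largest first (largest = lowest energy)."""
    (a, b), (c, d) = C_t0
    det0 = a * d - b * c
    inv0 = [[d / det0, -b / det0], [-c / det0, a / det0]]
    M = [[sum(inv0[i][k] * C_t[k][j] for k in range(2)) for j in range(2)]
         for i in range(2)]
    trace = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    disc = math.sqrt(trace * trace - 4.0 * det)
    return (trace + disc) / 2.0, (trace - disc) / 2.0

# Hypothetical inputs: two states (in lattice units) and operator overlaps.
energies = [0.20, 0.35]
Z = [[1.0, 0.5], [0.3, 1.0]]
t0, t = 2, 5

lams = gevp_eigenvalues_2x2(toy_correlator(t, energies, Z),
                            toy_correlator(t0, energies, Z))
extracted = [-math.log(lam) / (t - t0) for lam in lams]
print(extracted)  # close to the input energies, ground state first
```

With as many operators as states, the toy problem recovers both input energies; with real data, contamination from higher states makes the choice of t0 and the fit window matter.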

Gauge Generation: Cost Scaling

• Cost: reasonable statistics, box size and “physical” pion mass

• Extrapolate in lattice spacings: 10 ~ 100 PF-yr

[Figure: cost scaling in PF-years. Annotations: State-of-Art; Today, 10 TF-yr; 2011 (100 TF-yr)]

Computational Requirements

Gauge generation : Analysis

Current calculations
• Weak matrix elements: 1 : 1
• Baryon spectroscopy: 1 : 10
• Nuclear structure: 1 : 4

Computational Requirements: Gauge Generation : Analysis
• 10 : 1 (2005)
• 1 : 3 (2010)

Core work: Dirac inverters - use GPU-s

SciDAC Software Stack

QCD friendly API’s/libs: http://www.usqcd.org

[Diagram: software stack. Labels: Data parallel C/C++; Architectural level; High-level (linpack-like); GPU-s; Application level]

SciDAC Impact

• Software development
  – QCD friendly API’s and libraries: enables high user productivity
  – Allows rapid prototyping & optimization
  – Significant software effort for GPU-s

• Algorithm improvements
  – Operators & contractions: clusters (Distillation: PRL (2009))
  – Mixed-precision Dirac-solvers: INCITE+clusters+GPU-s, 2-3X
  – Adaptive multi-grid solvers: clusters, ~8X (?)

• Hardware development via USQCD Facilities
  – Adding support for new hardware
  – GPU-s
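
The idea behind a mixed-precision solver can be sketched with iterative refinement (defect correction): residuals and the accumulated solution are kept in double precision, while the expensive inner solve runs in reduced precision. The example below is an illustration under stated assumptions, not the solver used in these calculations. It uses a small hypothetical 3x3 system in place of the Dirac matrix, Jacobi sweeps in place of a Krylov solver, and emulates single precision by rounding through `struct`.

```python
import struct

def to_single(x):
    """Round a Python float (double) to the nearest IEEE single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def jacobi_single(A, r, sweeps=20):
    """Approximate solve of A e = r; stored values are rounded to single precision."""
    n = len(r)
    e = [0.0] * n
    for _ in range(sweeps):
        e = [to_single((to_single(r[i])
                        - sum(to_single(A[i][j] * e[j]) for j in range(n) if j != i))
                       / A[i][i])
             for i in range(n)]
    return e

def mixed_precision_solve(A, b, tol=1e-12, max_outer=50):
    """Outer loop in double precision, inner correction in emulated single precision."""
    x = [0.0] * len(b)
    for outer in range(max_outer):
        r = [bi - axi for bi, axi in zip(b, matvec(A, x))]  # double-precision residual
        if max(abs(ri) for ri in r) < tol:
            return x, outer
        e = jacobi_single(A, r)                             # cheap low-precision solve
        x = [xi + ei for xi, ei in zip(x, e)]               # accumulate in double
    return x, max_outer

# Hypothetical diagonally dominant test system standing in for the Dirac operator.
A = [[4.0, 1.0, 0.0], [1.0, 4.0, 1.0], [0.0, 1.0, 4.0]]
b = [1.0, 2.0, 3.0]
solution, outer_iterations = mixed_precision_solve(A, b)
final_residual = max(abs(bi - axi) for bi, axi in zip(b, matvec(A, solution)))
print(solution, outer_iterations, final_residual)
```

Each outer pass gains roughly single-precision accuracy, so a few passes reach double-precision residuals while most arithmetic stays in the cheaper format.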

Inverter Strong Scaling: V = 32^3 x 256

Local volume on GPU too small (I/O bottleneck)

3 Tflops

New Science Reach in 2010-2011

QCD Spectrum

• Gauge generation: (next dataset)
  – INCITE: Crays & BG/P-s, ~ 16K-24K cores
  – Double precision

• Analysis (existing dataset): two classes
  – Propagators (Dirac matrix inversions)
    • Few GPU level
    • Single + half precision
    • No memory error-correction
  – Contractions:
    • Clusters: few cores
    • Double precision + large memory footprint

Cost (TF-yr)

New: 10 TF-yr
Old: 1 TF-yr

10 TF-yr

1 TF-yr

Isovector Meson Spectrum

Exotic matter?

Can we observe exotic matter? Excited string

• QED

• QCD

Exotic matter

Exotics: world summary

Exotic matter

Suggests (many) exotics within range of JLab Hall D

Previous work: photo-production rates high

Current GPU work: (strong) decays - important experimental input

Exotics: first GPU results

Baryon Spectrum

“Missing resonance problem”
• What are collective modes?
• What is the structure of the states?
  – Major focus of (and motivation for) JLab Hall B
  – Not resolved experimentally @ 6 GeV

Nucleon & Delta Spectrum

First results from GPU-s

< 2% error bars

[Figure: nucleon & Delta spectrum. Band labels, each shown twice: [56,2+] D-wave; [70,1-] P-wave]

Discern structure: wave-function overlaps

Change at light quark mass? Decays!

Suggests spectrum at least as dense as quark model

Towards resonance determinations

• Augment with multi-particle operators
  – Needs “annihilation diagrams”, provided by Distillation
  – Ideally suited for GPU-s

• Resonance determination
  – Scattering in a finite box: discrete energy levels
  – Lüscher finite volume techniques
  – Phase shifts -> Width

• First results (partially from GPU-s)
  – Seems practical

arxiv:0905.2160
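
The last step in the chain above, phase shifts -> width, can be sketched as follows. The Lüscher quantization condition itself is not reproduced here. The example only assumes that the extracted phase shift follows a simple Breit-Wigner form, tan(delta) = (Gamma/2) / (E_R - E), so that delta passes through 90 degrees at the resonance energy with slope 2/Gamma. The resonance parameters are hypothetical, in arbitrary energy units.

```python
import math

def breit_wigner_phase(E, E_R, width):
    """Phase shift in radians, with tan(delta) = (width / 2) / (E_R - E)."""
    return math.atan2(width / 2.0, E_R - E)

def width_from_slope(phase, E_R, h=1e-4):
    """Recover the width from d(delta)/dE = 2 / width at E = E_R (central difference)."""
    slope = (phase(E_R + h) - phase(E_R - h)) / (2.0 * h)
    return 2.0 / slope

# Hypothetical resonance parameters, chosen only for illustration.
E_R, WIDTH = 0.80, 0.10

def phase(E):
    return breit_wigner_phase(E, E_R, WIDTH)

delta_at_resonance = math.degrees(phase(E_R))
width_estimate = width_from_slope(phase, E_R)
print(delta_at_resonance, width_estimate)
```

In practice the phase shift is known only at the discrete energies allowed by the finite box, so the width comes from a fit to those points rather than from a numerical derivative.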

Phase Shifts: demonstration

Prospects

• Anisotropic gauge production:
  – Useful for hadronic & nuclear physics

• Spectrum determination
  – Looks promising! Significant progress in last year
  – Possible with new correlator and operator constructions: Distillation + Subduction
  – Framework for multi-particle decays: on-going work
  – Not discussed: photon decays -> internal probe of structure

• GPU-s
  – Powerful resource for inversions
  – New ECC + double precision -> handle contractions

Extending science reach

• USQCD:
  – Next calculations: physical quark masses: 100 TF to 1 PF-yr
  – New INCITE + Early Science application (ANL+ORNL+NERSC)
  – NSF Blue Waters Petascale (PRAC)

• Need SciDAC-3
  – Significant software effort for next generation GPU-s & heterogeneous environments
  – Participate in emerging ASCR Exascale initiatives

• INCITE + LQCD synergy:
  – ARRA GPU system well matched to current leadership facilities

Summary

Capability + Capacity + SciDAC
  – Deliver science & HEP + NP milestones

Petascale (leadership) + Petascale (capacity) + SciDAC-3
  – Spectrum + decays
  – First contact with experimental resolution

Exascale (leadership) + Exascale (capacity) + SciDAC-3
  – Full resolution
  – Spectrum + transitions
  – Nuclear structure

Collaborative efforts: USQCD + JLab user communities

Backup slides

• The end

Dirac Inverter with Parallel GPU-s

Divide problem among nodes:

• Trade-offs
  – On-node vs off-node bandwidths
  – Locality vs memory bandwidth

• Efficient at large problem size per node
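
The trade-off can be made concrete by counting how many boundary sites each node must exchange relative to its local volume. The sketch below assumes an even split of the 32^3 x 256 lattice quoted for the strong-scaling test and a nearest-neighbour stencil. The node grids are hypothetical, and the function names are illustrative rather than taken from any library.

```python
def local_extents(global_dims, grid):
    """Split each global lattice extent evenly over the node grid."""
    assert all(g % p == 0 for g, p in zip(global_dims, grid))
    return tuple(g // p for g, p in zip(global_dims, grid))

def halo_fraction(local_dims, grid):
    """Boundary sites sent to neighbours, as a fraction of the local volume."""
    volume = 1
    for d in local_dims:
        volume *= d
    # Two faces per partitioned direction; each face holds volume / extent sites.
    surface = sum(2 * volume // d for d, p in zip(local_dims, grid) if p > 1)
    return surface / volume

GLOBAL = (32, 32, 32, 256)
# Hypothetical node grids: more nodes means a smaller local volume per node.
for grid in [(1, 1, 1, 4), (1, 1, 1, 16), (1, 1, 2, 16)]:
    local = local_extents(GLOBAL, grid)
    print(grid, local, halo_fraction(local, grid))
```

As the local volume shrinks, a growing share of sites sits on a boundary and has to cross the slower off-node links, which is consistent with the solver being efficient only at large problem size per node.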

Interpretation of Meson Spectrum

Future: incorporate in bound-state model phenomenology

Future: probe with photon decays

Distillation: annihilation diagrams

• Two-meson creation op

• Correlator

arxiv:0905.2160

Operators and contractions

• New operator technique: Subduction
  – Derivative-based continuum ops -> lattice irreps
  – Operators at rest or in-flight, mesons & baryons

• Large basis of operators -> lots of contractions
  – E.g., nucleon H_g: 49 ops up through 2 derivs
  – Order 10000 two-point correlators

• Feed all this to variational method
  – Diagonalization: handles near degeneracies

PRL 103 (2009)
