Surrogate Modeling Solutions for Cosmological Parameter Inference of Hydrogen Intensity Mapping Surveys

Nick Kern, UC Berkeley

NASA Machine Learning Workshop, August 30th, 2017

Overview

1. Science: Radio intensity mapping of cosmic hydrogen and the quest to detect the Epoch of Reionization (EoR)

2. Machine Learning: Cosmological parameter inference and how surrogate modeling enables more robust constraints

3. Application: Forecast of future constraints from the Hydrogen Epoch of Reionization Array (HERA, reionization.org), a $15M international project to build a radio telescope capable of detecting the EoR

Kern et al. 2017, arXiv:1705.04688

Nick Kern NASA ML Workshop 8/30/2017

Science

Cosmic History Timeline

[Figure: timeline of cosmic history spanning 13.7 Gyr, with the Cosmic Microwave Background and the Sloan Digital Sky Survey marked and the epoch between them labeled "what happened here?". Credit: Alvarez et al. 2009]

Hydrogen’s 21cm “Spin Flip” Transition

21cm “Spin Flip” Transition for 3D Tomographic Mapping

[Figure: tomographic maps with axes labeled redshift and frequency; panels at z = 7, z = 9, and z = 11]
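The redshift and frequency axes in the figure are two labels for the same coordinate: 21cm emission from redshift z arrives at the rest frequency divided by (1 + z). A minimal sketch of that conversion follows; the function names are illustrative and not taken from the talk.

```python
# Rest-frame frequency of the neutral hydrogen hyperfine ("spin flip") line, in MHz.
NU_21CM_MHZ = 1420.405751768


def redshift_to_frequency(z):
    """Observed frequency (MHz) of 21cm emission that originated at redshift z."""
    return NU_21CM_MHZ / (1.0 + z)


def frequency_to_redshift(nu_mhz):
    """Redshift of 21cm emission observed at frequency nu_mhz (MHz)."""
    return NU_21CM_MHZ / nu_mhz - 1.0


if __name__ == "__main__":
    for z in (7, 9, 11):
        print(f"z = {z:2d} -> {redshift_to_frequency(z):6.1f} MHz")
    # z =  7 ->  177.6 MHz
    # z =  9 ->  142.0 MHz
    # z = 11 ->  118.4 MHz
```

Each frequency channel of the telescope therefore images a different epoch, which is what makes the mapping three-dimensional.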

Radio Intensity Mapping Experiments

21cm power spectrum

Machine Learning

The last step: How do we interpret our data?

• We want to constrain cosmological models:

[Figure: data compared against model. Credit: Ali et al. 2015]

• Maximize likelihood for parameter constraints

[Equation: likelihood, with terms labeled data, model, and observational error]

Problem: what if our models are sophisticated & expensive simulations?
- performing MCMC directly with the simulation is not practical (or even feasible) with limited time and resources
- example: t_RUN ~ 24 hours, N_RUN ~ 10^4, t_MCMC > 6 years on 100-core cluster

Solution:
- use surrogate models to describe the simulation output over the space of its input parameters
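The likelihood step above weighs the mismatch between data and model by the observational error. The slide's own equation is not recoverable here, so the sketch below assumes independent Gaussian errors, which is a common choice but an assumption on our part. The toy model, data values, and function names are hypothetical.

```python
import math


def gaussian_log_likelihood(data, model, sigma):
    """Log-likelihood of data given model, assuming independent Gaussian errors sigma."""
    if not (len(data) == len(model) == len(sigma)):
        raise ValueError("data, model, and sigma must have the same length")
    total = 0.0
    for d, m, s in zip(data, model, sigma):
        total += -0.5 * ((d - m) / s) ** 2 - math.log(s) - 0.5 * math.log(2.0 * math.pi)
    return total


def toy_model(k_values, amplitude):
    """Hypothetical one-parameter model: a straight line through the origin."""
    return [amplitude * k for k in k_values]


# Made-up measurements, for illustration only.
k = [0.1, 0.2, 0.3, 0.4]
data = [1.1, 1.9, 3.2, 3.9]
sigma = [0.2, 0.2, 0.2, 0.2]

# Brute-force scan over the single parameter; each trial needs one model evaluation.
candidates = [8.0 + 0.1 * i for i in range(41)]
best = max(candidates, key=lambda a: gaussian_log_likelihood(data, toy_model(k, a), sigma))
print(f"maximum-likelihood amplitude on the grid: {best:.1f}")
```

Every candidate parameter value costs one model evaluation. With a model that takes a day to run, even this one-dimensional scan would be slow, which is the problem the surrogate is meant to solve.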

Surrogate Modeling aka Emulation

[Figure: emulator schematic with labels: emulator, training set, cross validation set]

Considerations:
• training set sampling
• choice of regression model
• cross validation
• error propagation

Benefits:
• parameter constraints with complex simulations
• orders of magnitude faster

Costs:
• approximate
• bound by training set

Kern et al. 2017, arXiv:1705.04688

github.com/nkern/emupy

Gaussian Process Regression
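As a rough illustration of Gaussian process regression used as an emulator, here is a minimal NumPy sketch. The squared-exponential kernel, the fixed hyperparameters, and the names `rbf_kernel`, `gp_fit`, and `gp_predict` are all assumptions made for this example; this is not the emupy API, and a real emulator would also optimize the hyperparameters.

```python
import numpy as np


def rbf_kernel(xa, xb, length_scale=1.0, amplitude=1.0):
    """Squared-exponential covariance between point sets of shape (n, d) and (m, d)."""
    sq_dist = np.sum((xa[:, None, :] - xb[None, :, :]) ** 2, axis=-1)
    return amplitude ** 2 * np.exp(-0.5 * sq_dist / length_scale ** 2)


def gp_fit(x_train, y_train, length_scale=1.0, amplitude=1.0, noise=1e-6):
    """Precompute the Cholesky factor and weights for a zero-mean GP."""
    cov = rbf_kernel(x_train, x_train, length_scale, amplitude)
    cov[np.diag_indices_from(cov)] += noise  # jitter for numerical stability
    chol = np.linalg.cholesky(cov)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    return {"x": x_train, "chol": chol, "alpha": alpha,
            "length_scale": length_scale, "amplitude": amplitude}


def gp_predict(gp, x_new):
    """Posterior mean and variance of the GP at the points x_new, shape (m, d)."""
    k_star = rbf_kernel(x_new, gp["x"], gp["length_scale"], gp["amplitude"])
    mean = k_star @ gp["alpha"]
    v = np.linalg.solve(gp["chol"], k_star.T)
    var = gp["amplitude"] ** 2 - np.sum(v ** 2, axis=0)
    return mean, np.maximum(var, 0.0)


# Toy "simulation": a smooth function of a single input parameter.
x_train = np.linspace(0.0, 5.0, 12)[:, None]
y_train = np.sin(x_train[:, 0])
gp = gp_fit(x_train, y_train)

x_test = np.array([[1.3], [2.7], [3.9]])
mean, var = gp_predict(gp, x_test)
print("max abs error inside the training range:", np.max(np.abs(mean - np.sin(x_test[:, 0]))))

_, var_far = gp_predict(gp, np.array([[20.0]]))
print("predictive variance far from the training set:", var_far[0])
```

The predictive variance grows back to the prior variance away from the training points. That is one concrete sense in which an emulator is bound by its training set.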

Forecasting HERA Constraints

Hydrogen Epoch of Reionization Array (HERA)

PI: Parsons

Training an Emulator on an EoR Simulation

Mesinger et al. 2011

• start with an eleven parameter model
  - six astrophysical: flat priors
  - five cosmological: Planck CMB priors

• generate Gaussian training set

• emulate 21cm power spectra

• cross validate

• HERA instrumental simulation
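The training workflow listed above (generate a training set, emulate the spectra, cross validate) can be sketched on a toy problem. Everything here is a stand-in: `toy_simulation` is a hypothetical two-parameter function rather than the EoR simulation, the emulator is a quadratic least-squares fit rather than the regression used in the talk, and the HERA instrumental simulation step is not represented.

```python
import numpy as np

rng = np.random.default_rng(0)
K_BINS = np.linspace(0.1, 1.0, 8)  # hypothetical wavenumber bins


def toy_simulation(theta):
    """Hypothetical stand-in for an expensive simulation: a smooth 'power spectrum'."""
    amp, tilt = theta
    return amp * K_BINS ** tilt


def design_matrix(thetas):
    """Quadratic polynomial features of the two input parameters."""
    a, t = thetas[:, 0], thetas[:, 1]
    return np.column_stack([np.ones_like(a), a, t, a * t, a ** 2, t ** 2])


# 1. Generate a Gaussian-distributed training set around a fiducial point.
center = np.array([10.0, 1.0])
width = np.array([1.0, 0.1])
thetas = center + width * rng.standard_normal((200, 2))
spectra = np.array([toy_simulation(th) for th in thetas])

# 2. Hold out part of the samples as a cross validation set.
train_x, cv_x = thetas[:160], thetas[160:]
train_y, cv_y = spectra[:160], spectra[160:]

# 3. Fit the emulator: one set of regression coefficients per k bin.
coeffs, *_ = np.linalg.lstsq(design_matrix(train_x), train_y, rcond=None)

# 4. Cross validate: compare emulator predictions with held-out simulation outputs.
pred = design_matrix(cv_x) @ coeffs
frac_err = np.abs(pred - cv_y) / np.abs(cv_y)
print("median fractional emulator error:", np.median(frac_err))
```

The cross validation error is the quantity one would propagate into the likelihood as an additional uncertainty on the model.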

Joint Posterior Distribution

Marginalized Posterior Distribution

[Figure: marginalized posterior distributions for the cosmological parameters and the astrophysical parameters]

Summary

• Radio intensity mapping surveys are poised to make a first detection of primordial hydrogen at the EoR and subsequently produce strong constraints on astrophysical parameters

• Challenges of MCMC with complex numerical simulations can be overcome by developing surrogate models that approximate the input-output mapping of the simulation, which can then be used to accelerate MCMC sampling

• Surrogate modeling can be used to extract information from the data (i.e., parameter constraints) but, viewed the other way, can also be used to extract information from the simulation (i.e., model calibration)
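To make the second summary point concrete, here is a minimal Metropolis sampler that calls a cheap surrogate in place of the simulation. It is a sketch under stated assumptions: `surrogate_model` is a hypothetical one-parameter emulator, the data are made up, the errors are taken to be Gaussian, and the prior is flat. The talk's actual sampler and emulator are not shown in the slides.

```python
import math
import random

K = [0.1, 0.2, 0.3, 0.4]
DATA = [1.1, 1.9, 3.2, 3.9]  # made-up measurements, for illustration only
SIGMA = 0.2                   # assumed observational error on each point


def surrogate_model(theta):
    """Hypothetical emulator prediction; stands in for a trained surrogate."""
    return [theta * k for k in K]


def log_posterior(theta):
    """Flat prior on (0, 20) times a Gaussian likelihood, up to a constant."""
    if not 0.0 < theta < 20.0:
        return -math.inf
    model = surrogate_model(theta)
    return -0.5 * sum(((d - m) / SIGMA) ** 2 for d, m in zip(DATA, model))


def metropolis(log_post, start, step, n_steps, seed=0):
    """Random-walk Metropolis sampler for a one-dimensional parameter."""
    rng = random.Random(seed)
    chain = [start]
    current_lp = log_post(start)
    for _ in range(n_steps):
        proposal = chain[-1] + rng.gauss(0.0, step)
        proposal_lp = log_post(proposal)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            chain.append(proposal)
            current_lp = proposal_lp
        else:
            chain.append(chain[-1])
    return chain


chain = metropolis(log_posterior, start=5.0, step=0.5, n_steps=20000)
samples = chain[2000:]  # discard burn-in
post_mean = sum(samples) / len(samples)
post_std = math.sqrt(sum((s - post_mean) ** 2 for s in samples) / len(samples))
print(f"posterior mean = {post_mean:.2f}, std = {post_std:.2f}")
```

The chain above makes 20,000 model calls. That count is affordable only because each call is a surrogate evaluation rather than a full simulation run.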

Comparison against brute-force MCMC
