
MNRAS 000, 1–38 (2020) Preprint 22 September 2020 Compiled using MNRAS LaTeX style file v3.0

Stellar Parameter Determination from Photometry using Invertible Neural Networks

Victor F. Ksoll,1,2⋆ Lynton Ardizzone,3 Ralf Klessen,1,2 Ullrich Koethe,3 Elena Sabbi,5 Massimo Robberto,5 Dimitrios Gouliermis,1,4 Carsten Rother,3 Peter Zeidler5,6 and Mario Gennaro5

1 Universität Heidelberg, Zentrum für Astronomie, Institut für Theoretische Astrophysik, Albert-Ueberle-Str. 2, 69120 Heidelberg, Germany
2 Universität Heidelberg, Interdisziplinäres Zentrum für Wissenschaftliches Rechnen, Im Neuenheimer Feld 205, 69120 Heidelberg, Germany
3 Universität Heidelberg, Heidelberg Collaboratory for Image Processing, Visual Learning Lab, Berliner Str. 43, 69120 Heidelberg, Germany
4 Max Planck Institute for Astronomy, Königstuhl 17, 69117 Heidelberg, Germany
5 Space Telescope Science Institute, 3700 San Martin Drive, Baltimore, MD 21218, USA
6 Department of Physics and Astronomy, Johns Hopkins University, Baltimore, MD 21218, USA

    Accepted XXX. Received YYY; in original form ZZZ

ABSTRACT
Photometric surveys with the Hubble Space Telescope (HST) allow us to study stellar populations with high resolution and deep coverage, with estimates of the physical parameters of the constituent stars being typically obtained by comparing the survey data with adequate stellar evolutionary models. This is a highly non-trivial task due to effects such as differential extinction, photometric errors, low filter coverage, or uncertainties in the stellar evolution calculations. These introduce degeneracies that are difficult to detect and break. To improve this situation, we introduce a novel deep learning approach, called conditional invertible neural network (cINN), to solve the inverse problem of predicting physical parameters from photometry on an individual star basis and to obtain the full posterior distributions. We build a carefully curated synthetic training data set derived from the PARSEC stellar evolution models to predict stellar age, initial/current mass, luminosity, effective temperature and surface gravity. We perform tests on synthetic data from the MIST and Dartmouth models, and benchmark our approach on HST data of two well-studied stellar clusters, Westerlund 2 and NGC 6397. For the synthetic data we find overall excellent performance, and note that age is the most difficult parameter to constrain. For the benchmark clusters we retrieve reasonable results and confirm previous findings for Westerlund 2 on cluster age ($1.04^{+8.48}_{-0.90}$ Myr), mass segregation, and the stellar initial mass function. For NGC 6397 we recover plausible estimates for masses, luminosities and temperatures; however, discrepancies between stellar evolution models and observations prevent an acceptable recovery of age for old stars.

Key words: methods: data analysis – methods: statistical – stars: formation – stars: fundamental parameters – stars: pre-main-sequence – galaxies: clusters: individual: Westerlund 2, NGC 6397.

    1 INTRODUCTION

Machine learning (ML) employs statistical models to predict the characteristics of a dataset using samples of previously collected data without relying on physical models of the system. The introduction of ML for solving regression, classification and clustering problems has revolutionised scientific research, and in particular has provided effective methods for analyzing big astronomical data (Feigelson & Babu 2012; Ivezic et al. 2014). In order to construct a model from observed data, machine learning methods rely on human-defined classifiers or 'feature extractors' (Hastie et al. 2009). However, complex problems require algorithms that automate the creation of feature extractors using large amounts of data. These algorithms represent a family of ML techniques, named deep learning, and they are based on the construction of artificial neural networks (NNs; Goodfellow et al. 2016). While training NNs requires significant computational power, they achieve far higher levels of accuracy than classic ML for many non-linear problems. In this pilot study we employ invertible NNs to infer stellar ages and masses from Hubble Space Telescope (HST) imaging of two well-studied stellar clusters. Our aim is to explore the efficiency of NNs in extracting stellar physical parameters from photometry alone. We train our networks on the relations between physical and observable stellar properties provided by theoretical evolutionary models.

⋆ E-mail: [email protected]

© 2020 The Authors

arXiv:2007.08391v2 [astro-ph.SR] 21 Sep 2020

Star clusters, the building blocks of galaxies, are the signposts guiding our understanding of the formation and evolution of stars. This understanding stems from the physical properties of stars in clusters, which are deduced from detailed comparisons of photometric observations to theoretical evolutionary models. The interface where observations meet theory is often provided by the observational colour–magnitude diagram (CMD) and its theoretical counterpart, the Hertzsprung-Russell diagram (HRD). In the HRD two physical properties of stars, the effective temperature and the luminosity, are compared to stellar evolutionary models to determine fundamental stellar parameters, the initial mass and the age of the star, which are not directly accessible by observations alone. This comparison can be directly performed through fitting of isochronal evolutionary models to the observed CMDs. This method, however, lacks a proper statistical basis because the relations between observables and physical properties may present degeneracies that need to be accounted for. More advanced methods, based on Bayesian statistics, probabilistically derive the cumulative properties of stellar populations, such as the mean age, in terms of posterior probability distribution functions of the properties of individual stars, e.g. the age (see Valls-Gabaud 2014, and references therein). These methods provide a significant improvement by tackling the intrinsic model degeneracies through priors on the stellar initial mass function, binary fraction, or extinction distribution (e.g. Jørgensen & Lindegren 2005; Da Rio et al. 2010).
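In schematic form, such Bayesian approaches evaluate, star by star, the posterior of the physical parameters given the observed magnitudes. The display below is a generic statement of this setup rather than a formula taken from the works cited above; the symbols $\boldsymbol{\theta}$ (physical parameters, e.g. mass and age) and $\boldsymbol{m}$ (measured magnitudes) are our own shorthand:
\[
  p(\boldsymbol{\theta} \mid \boldsymbol{m}) \;\propto\; p(\boldsymbol{m} \mid \boldsymbol{\theta}) \, p(\boldsymbol{\theta}),
\]
where the likelihood $p(\boldsymbol{m} \mid \boldsymbol{\theta})$ follows from the evolutionary models together with the photometric error model, and the prior $p(\boldsymbol{\theta})$ encodes precisely the assumptions just listed, such as the initial mass function, binary fraction, or extinction distribution.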

Bayesian inference encompasses a specific class of machine learning models, i.e. those based on strong prior intuitions. However, these priors add little value in the case of big data, and the corresponding methods are computationally expensive and slow. As a consequence, other ML methods are employed to infer stellar physical parameters from photometry. The most successful techniques developed so far are generally based on time-domain observations, such as light curves using photometric-brightness variations (e.g. Miller et al. 2015) or time-series asteroseismic observations (e.g. Bellinger et al. 2016). These methods make use of various instances of each specific target star in time, a dataset which cannot be easily obtained for rich stellar samples in compact clusters. Investigations of stars in clusters normally rely only on 'static' rather than time-dependent imaging, a setting that classic ML methods cannot easily address. Moreover, it is now well understood that parameter degeneracies encoded in the evolutionary models make the problem of inferring stellar masses and ages from photometric measurements a non-linear problem. The solution of such problems calls for the employment of artificial NNs.

There have been several recent studies that employ neural network approaches to solve prediction tasks in astronomy similar to the problem that we analyze in this paper. Sharma et al. (2020) train a convolutional neural network on a suite of spectral libraries in order to classify stellar spectra according to the Harvard scheme and successfully apply their approach to data from the Sloan Digital Sky Survey (SDSS) database. Kounkel et al. (2020) leverage Gaia DR2 photometry and parallaxes to construct a neural network that predicts age, extinction and distance of stellar clusters in the Milky Way, allowing them to study the star formation activity in the spiral arms. Cantat-Gaudin et al. (2020) use a similar neural network approach, also predicting physical parameters of stellar clusters from Gaia data, but use 2D histograms of the observed CMDs as inputs. Olney et al. (2020) use a deep convolutional neural network to predict surface temperature, metallicity and surface gravity of young stellar objects (YSOs) based on spectra from APOGEE. Within their training set construction they employ another convolutional neural network to infer physical parameters of YSOs, i.e. ages, masses, extinction, surface temperature/gravity, from photometry in 9 bands of the Gaia system, as well as distance, stellar radius and luminosity. This auxiliary network is trained on synthetic isochrone data and successfully recovers surface temperatures for YSOs on real Gaia observations.

For many applications in natural sciences, the forward process of determining measurements from a set of underlying physical parameters is well-defined, whereas the inverse problem is ambiguous because multiple parameter sets can result in the same observation (i.e., degeneracies). Classical neural networks attempt to address this ambiguity by solving the inverse problem directly. However, to fully characterise degeneracies, the full posterior parameter distribution, conditioned on an observed measurement, must be determined. A particular class of neural networks, so-called invertible neural networks (INNs), is well suited for this task (e.g. Ardizzone et al. 2019a). Unlike classical neural networks, INNs learn the forward process, using additional latent output variables to capture the information otherwise lost. This invertibility allows a model of the corresponding inverse process to be learned implicitly, providing the full parameter posterior distribution for a given observation and corresponding distribution of the latent variables. INNs are therefore a powerful tool for identifying multi-modalities, parameter correlations, and unrecoverable parameters.
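To illustrate the mechanics, the following is a minimal PyTorch sketch of one conditional affine coupling block, the standard building unit of networks in this family. It is a toy under our own assumptions, not the implementation used in this paper: the class name, dimensions and layer sizes are illustrative, and a practical cINN stacks many such blocks with permutations between them and trains with a maximum-likelihood loss that includes the Jacobian term.

import torch
import torch.nn as nn

class ConditionalAffineCoupling(nn.Module):
    # Half of the input passes through unchanged and, together with the
    # condition (here: photometry), parametrises an invertible scale-and-shift
    # of the other half. The inverse pass therefore comes for free.
    def __init__(self, dim, cond_dim, hidden=64):
        super().__init__()
        self.d = dim // 2
        # Subnetwork predicting log-scale s and shift t for the second half.
        self.net = nn.Sequential(
            nn.Linear(self.d + cond_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * (dim - self.d)),
        )

    def forward(self, x, cond, reverse=False):
        x1, x2 = x[:, :self.d], x[:, self.d:]
        s, t = self.net(torch.cat([x1, cond], dim=1)).chunk(2, dim=1)
        s = torch.tanh(s)  # bound the log-scales for numerical stability
        if not reverse:
            y2 = x2 * torch.exp(s) + t      # forward: parameters -> latent
        else:
            y2 = (x2 - t) * torch.exp(-s)   # inverse: latent -> parameters
        return torch.cat([x1, y2], dim=1)

# Posterior sampling for one star: run the inverse pass on many draws of the
# latent variable z ~ N(0, I), all conditioned on the same observed magnitudes
# (placeholder numbers: 6 physical parameters, 4 photometric bands).
block = ConditionalAffineCoupling(dim=6, cond_dim=4)
obs = torch.randn(1, 4).repeat(1000, 1)
z = torch.randn(1000, 6)
posterior_samples = block(z, obs, reverse=True)

The key design point this sketch captures is that the network only ever learns the well-defined forward direction; the spread of posterior_samples over repeated latent draws is what exposes degeneracies such as multi-modal mass or age solutions.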

In this paper we present the application of invertible neural networks to the regression problem of predicting physical parameters of individual stars based on observed photometry. Note that we do not perform an exhaustive analysis of the approach, but rather aim to provide an introduction to the method, highlighting our first successes. This paper is the first in a series, in which we adapt and develop the approach, as well as explore its limitations.

As mentioned above, in general this regression task is prone to errors due to the many sources of degeneracy in the mapping from physical to observable space, such as metallicity, extinction, variability, binarity and the intrinsic overlap of certain phases in stellar evolution in the observable space, e.g. the red giant branch and the pre-main-sequence. Since our primary goal is to test the viability of the method, in this paper we neglect some of these factors, adopting the following simplifying assumptions: 1) We only deal with single metallicity populations, 2) we obtain an estimate of the


[Figure: colour–magnitude diagram, F814W (mag) against F814W − F160W (mag); only the axis labels survived extraction.]