Shallow Transits - Deep Learning

Exoplanets II

Shay Zucker, Raja Giryes

Elad Dvash

(Tel Aviv University)

Yam Peleg

(Deep Trading)

Red Noise and Transit Detection

Deep transits: traditional methods (BLS) work well

Shallow transits: inter-temporal correlations might mask the signal

Pont, Zucker & Queloz 2006, MNRAS, 373, 231

BLS: Kovács, Zucker & Mazeh 2002, A&A, 391, 369
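As a rough illustration of the box-fitting idea behind BLS, the sketch below phase-folds a light curve at each trial period and scans a grid of box positions and widths for the largest signal residue. It is a simplified brute-force version, not the optimized algorithm of Kovács, Zucker & Mazeh (2002); the function name `bls_power`, the bin count, and the width grid are illustrative assumptions.

```python
import numpy as np

def bls_power(t, flux, periods, n_bins=50, max_width_bins=5):
    """Brute-force box search: best signal residue for each trial period."""
    flux = flux - np.mean(flux)              # work with zero-mean flux
    n = len(flux)
    power = np.zeros(len(periods))
    for k, period in enumerate(periods):
        # Phase-fold and assign each point to a phase bin.
        phase = (t % period) / period
        bins = np.minimum((phase * n_bins).astype(int), n_bins - 1)
        s = np.bincount(bins, weights=flux, minlength=n_bins)  # flux sum per bin
        c = np.bincount(bins, minlength=n_bins)                # points per bin
        # Duplicate the bins so that a box may wrap around phase 1 -> 0.
        cs = np.concatenate([[0.0], np.cumsum(np.concatenate([s, s]))])
        cc = np.concatenate([[0], np.cumsum(np.concatenate([c, c]))])
        best = 0.0
        for width in range(1, max_width_bins + 1):
            in_sum = cs[width:width + n_bins] - cs[:n_bins]    # flux inside box
            in_cnt = cc[width:width + n_bins] - cc[:n_bins]    # points inside box
            r = in_cnt / n                                     # fraction in box
            valid = (r > 0) & (r < 1)
            sr = np.zeros(n_bins)
            sr[valid] = (in_sum[valid] / n) ** 2 / (r[valid] * (1 - r[valid]))
            best = max(best, sr.max())
        power[k] = np.sqrt(best)
    return power
```

The trial period with the highest returned value is the best box-shaped candidate; for deep transits in nearly white noise this peak stands out clearly.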

Gaussian Processes

• An elegant way to model inter-temporal correlations

• Use a kernel function to model the correlation

• A kernel is parameterized by a few hyperparameters

• Fitting is very hard (involves inversion of huge matrices)

• Simultaneous GP fitting and transit search even harder…

• Rasmussen & Williams 2006 (textbook)

• Aigrain et al. 2016 (application to K2 light curves)

• Foreman-Mackey et al. 2017 (approximate fast fitting)
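To make the cost of fitting concrete, the sketch below evaluates the standard log marginal likelihood of a zero-mean GP using a Cholesky factorization, which scales as O(n³) in the number of data points. Each step of a hyperparameter fit repeats this factorization. The squared-exponential-plus-white kernel and the names `sq_exp_kernel` and `gp_log_likelihood` are illustrative choices, not taken from the talk.

```python
import numpy as np

def sq_exp_kernel(t, amp, scale, white):
    """Squared-exponential covariance plus white noise on the diagonal."""
    dt = t[:, None] - t[None, :]
    return amp**2 * np.exp(-(dt / scale) ** 2) + white**2 * np.eye(len(t))

def gp_log_likelihood(y, K):
    """Zero-mean GP log marginal likelihood via Cholesky (O(n^3) in len(y))."""
    L = np.linalg.cholesky(K)                              # K = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))    # K^{-1} y
    log_det = 2.0 * np.sum(np.log(np.diag(L)))             # log|K|
    n = len(y)
    return -0.5 * y @ alpha - 0.5 * log_det - 0.5 * n * np.log(2 * np.pi)
```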

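$$
k(t_i - t_j) = A_s^2 \exp\left[-\left(\frac{t_i - t_j}{\lambda_s}\right)^2\right] + A_q^2 \exp\left[-\frac{\sin^2\left(\pi (t_i - t_j)/T_q\right)}{2} - \left(\frac{t_i - t_j}{\lambda_q}\right)^2\right] + A_w^2\,\delta(t_i - t_j)
$$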
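The kernel on this slide (a squared-exponential term, a quasi-periodic term, and a white-noise term) can be written directly as a function of the time differences. The sketch below builds the covariance matrix and draws one correlated-noise realization from it. The function name `composite_kernel`, the hyperparameter values, and the sampling grid are arbitrary illustrative choices, not values from the talk.

```python
import numpy as np

def composite_kernel(t, A_s, lam_s, A_q, lam_q, T_q, A_w):
    """Covariance matrix: squared-exponential + quasi-periodic + white noise."""
    dt = t[:, None] - t[None, :]
    sq_exp = A_s**2 * np.exp(-(dt / lam_s) ** 2)
    quasi = A_q**2 * np.exp(-np.sin(np.pi * dt / T_q) ** 2 / 2 - (dt / lam_q) ** 2)
    white = A_w**2 * np.eye(len(t))   # delta term, assuming distinct time stamps
    return sq_exp + quasi + white

rng = np.random.default_rng(0)
t = np.linspace(0.0, 20.0, 200)
K = composite_kernel(t, A_s=1.0, lam_s=5.0, A_q=0.5, lam_q=10.0, T_q=2.0, A_w=0.3)

# One realization of correlated noise with this covariance.
noise = rng.multivariate_normal(np.zeros(len(t)), K)
```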

Deep Learning

Neural Networks

“a set of computational heuristics to train highly nonlinear parametric functions structured in a layered form to perform a certain task”

Biological Neuron

McCulloch-Pitts Neuron
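For concreteness, a McCulloch-Pitts neuron can be sketched as a threshold unit over binary inputs: it fires when the weighted sum of its inputs reaches a threshold. The function name and the AND/OR examples are illustrative.

```python
import numpy as np

def mcculloch_pitts(inputs, weights, threshold):
    """Return 1 if the weighted input sum reaches the threshold, else 0."""
    return int(np.dot(inputs, weights) >= threshold)

# With unit weights, the threshold alone decides which logic gate results.
pairs = [(a, b) for a in (0, 1) for b in (0, 1)]
and_gate = [mcculloch_pitts(p, [1, 1], threshold=2) for p in pairs]
or_gate = [mcculloch_pitts(p, [1, 1], threshold=1) for p in pairs]
```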

Deep Learning in a Nutshell

• Supervised learning: given examples with ground truth (‘training set’)

• Loss function (error quantification)

• Loss function depends analytically on the synaptic weights

• Backpropagation of derivatives (chain rule) through layers

• Slowly update the synaptic weights (e.g. gradient descent, Metropolis-Hastings, etc.) to minimize the loss function

• Essential ingredients: nonlinearity and layered structure

• A growing multitude of neural network architectures
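These ingredients can be seen together in a minimal sketch: a two-layer network trained on a labeled toy set with plain gradient descent, where the gradients come from applying the chain rule layer by layer. The toy data, layer sizes, learning rate, and function names are assumptions made for illustration; this is not the network used in the talk, and only gradient descent is shown.

```python
import numpy as np

def init_params(rng, n_in=2, n_hidden=8):
    """Small random weights for a 2-layer network (sizes are arbitrary)."""
    return {"W1": rng.normal(scale=0.5, size=(n_in, n_hidden)), "b1": np.zeros(n_hidden),
            "W2": rng.normal(scale=0.5, size=(n_hidden, 1)), "b2": np.zeros(1)}

def forward(params, X):
    h = np.tanh(X @ params["W1"] + params["b1"])               # nonlinear hidden layer
    p = 1.0 / (1.0 + np.exp(-(h @ params["W2"] + params["b2"])))  # sigmoid output
    return h, p

def train_step(params, X, y, lr=0.5):
    """One gradient-descent update; returns the loss before the update."""
    h, p = forward(params, X)
    eps = 1e-12
    loss = -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
    # Backpropagation: chain rule from the output layer back to the input layer.
    dz2 = (p - y) / len(X)              # d(loss)/d(output pre-activation)
    dW2, db2 = h.T @ dz2, dz2.sum(axis=0)
    dz1 = (dz2 @ params["W2"].T) * (1 - h**2)   # through tanh
    dW1, db1 = X.T @ dz1, dz1.sum(axis=0)
    for name, grad in (("W1", dW1), ("b1", db1), ("W2", dW2), ("b2", db2)):
        params[name] -= lr * grad       # small step against the gradient
    return loss

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))                                  # toy training set
y = (X[:, 0] * X[:, 1] > 0).astype(float).reshape(-1, 1)       # ground truth
params = init_params(rng)
losses = [train_step(params, X, y) for _ in range(500)]
```

The label here depends on the product of the two inputs, so a purely linear model cannot separate the classes; the hidden nonlinearity is what makes the task learnable.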

Feasibility Study

• Zucker & Giryes 2018, AJ, 155, 4

• Fictitious planet-hunting space telescope

• Noise simulated by GP

– White noise

– Red noise (squared exponential)

– Quasi periodic noise

– Hyperparameters drawn randomly
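A labeled training set of this kind might be generated along the following lines: draw GP hyperparameters at random, sample a noise realization, and inject a transit into some of the light curves. Everything specific here is an assumption made for illustration and is not taken from Zucker & Giryes (2018): the hyperparameter ranges, the box-shaped transit, the random class labels, and the function names.

```python
import numpy as np

def noise_cov(t, rng):
    """Noise covariance with randomly drawn hyperparameters (arbitrary ranges)."""
    dt = t[:, None] - t[None, :]
    A_s, A_q, A_w = rng.uniform(0.1, 1.0, size=3)
    lam_s, lam_q = rng.uniform(1.0, 10.0, size=2)
    T_q = rng.uniform(1.0, 5.0)
    red = A_s**2 * np.exp(-(dt / lam_s) ** 2)                       # red noise
    quasi = A_q**2 * np.exp(-np.sin(np.pi * dt / T_q) ** 2 / 2 - (dt / lam_q) ** 2)
    return red + quasi + A_w**2 * np.eye(len(t))                    # + white noise

def box_transit(t, period, epoch, duration, depth):
    """Periodic box-shaped dip: -depth during transit, 0 elsewhere."""
    phase = (t - epoch) % period
    return np.where(phase < duration, -depth, 0.0)

def make_dataset(n_curves, t, rng):
    """Simulated light curves and labels (1 = transit injected, 0 = noise only)."""
    X = np.empty((n_curves, len(t)))
    labels = rng.integers(0, 2, size=n_curves)
    for i in range(n_curves):
        flux = rng.multivariate_normal(np.zeros(len(t)), noise_cov(t, rng))
        if labels[i]:
            flux = flux + box_transit(t, period=rng.uniform(2.0, 8.0),
                                      epoch=rng.uniform(0.0, 2.0),
                                      duration=0.2, depth=rng.uniform(0.2, 1.0))
        X[i] = flux
    return X, labels
```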

[Figure: Receiver Operating Characteristic (ROC) curve, Deep Learning vs. HPF+BLS]
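For reference, an ROC curve is traced by sweeping a detection threshold over the classifier scores and recording the true-positive rate against the false-positive rate. The sketch below does this with NumPy; the function names `roc_curve_points` and `auc` are illustrative, and tied scores are not treated specially.

```python
import numpy as np

def roc_curve_points(scores, labels):
    """Return (fpr, tpr) arrays obtained by lowering the threshold score by score."""
    order = np.argsort(-np.asarray(scores, dtype=float))   # descending score
    labels = np.asarray(labels)[order]
    tp = np.cumsum(labels == 1)        # true positives accepted so far
    fp = np.cumsum(labels == 0)        # false positives accepted so far
    tpr = np.concatenate([[0.0], tp / max(tp[-1], 1)])
    fpr = np.concatenate([[0.0], fp / max(fp[-1], 1)])
    return fpr, tpr

def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2))
```

A curve that rises faster toward the top-left corner, or equivalently a larger area under the curve, indicates the better detector.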

[Figure: adding outliers and discontinuities, Deep Learning vs. outlier removal + HPF + BLS]
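The traditional preprocessing chain named on this slide can be sketched as outlier clipping followed by a running-median high-pass filter. The slides do not specify which outlier rejection or which filter was used, so the clipping threshold, the MAD-based scatter estimate, the window lengths, and the function names below are all assumptions.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def running_median(flux, window):
    """Running median with edge padding; `window` must be odd."""
    half = window // 2
    padded = np.pad(flux, half, mode="edge")
    return np.median(sliding_window_view(padded, window), axis=1)

def clip_outliers(flux, n_sigma=5.0, window=11):
    """Replace points far from the local median by the local median."""
    trend = running_median(flux, window)
    resid = flux - trend
    # Robust scatter estimate from the median absolute deviation (MAD).
    sigma = 1.4826 * np.median(np.abs(resid - np.median(resid)))
    bad = np.abs(resid) > n_sigma * sigma
    cleaned = flux.copy()
    cleaned[bad] = trend[bad]
    return cleaned, bad

def high_pass(flux, window=51):
    """Remove slow trends by subtracting a running median."""
    return flux - running_median(flux, window)
```

The filter window has to be long compared with the transit duration, otherwise the running median absorbs the transit itself along with the trend.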

Sample detections (FPR=0.01)

Sample false detections (FPR=0.01)
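Operating at a fixed false-positive rate amounts to choosing the score threshold from the distribution of scores on transit-free light curves. A minimal sketch, with a hypothetical function name and synthetic scores standing in for real classifier output:

```python
import numpy as np

def threshold_at_fpr(null_scores, fpr=0.01):
    """Score above which only a fraction `fpr` of transit-free curves fall."""
    return np.quantile(null_scores, 1.0 - fpr)

rng = np.random.default_rng(0)
null_scores = rng.normal(size=10_000)   # synthetic scores, transit-free curves
thr = threshold_at_fpr(null_scores, fpr=0.01)

# Fraction of transit-free curves that would be flagged at this threshold.
empirical_fpr = np.mean(null_scores > thr)
```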

TESS ETE-6-based test

• Time sampling provided in ETE-6 (with gaps)

• White noise more dominant

• Red noise, same as in previous study

• DL still outperforms BLS, but less convincingly

• First attempts in estimating period and detrending

[Figure: DL vs. BLS+HPF. DL still outperforms BLS, but less convincingly.]

[Figures: estimating period, first attempts]

What next?

• Work in progress: use DL to:

- Detrend light curves

- Characterize transit signals

- Identify individual transits

• Introduce complications (gaps, TTV, multis etc.)

• Mine old data for hidden planets (Kepler, CoRoT)

• Use DL to fit GPs

• Apply Deep Learning to RV (to overcome activity)

• Prepare for PLATO

Related Works

• Shallue & Vanderburg 2018, AJ, 155, 94

- ‘Identifying’ – not ‘detecting’…

- Traditional approach to detect TCEs in resonant systems

- Deep learning for vetting, not detecting

• Pearson, Palafox & Griffith 2018, MNRAS, 474, 478

- Discrete grid of transit parameters (not distributions)

- Quasi-periodicity+white noise, not GP

Summary

• Deep learning neural networks are the future!

• May achieve unprecedented performance, specifically for small planets with long periods around G-type stars

• A fundamentally different approach (nonlinear)

• Zucker & Giryes 2018, AJ, 155, 4
