22
Gal Matijevič, Carsten Denker, Andrea Diercke, Christoph Kuckein, Ekaterina Dineva, Horst Balthasar, Ioannis Kontogiannis, and Partha S. Pal Classification of High-resolution Solar Hα Spectra using t-SNE Meetu Verma

Classification of High-resolution Solar HαSpectra using t-SNE · Visualizing High-Dimensional Data Using t-SNE. Journal of Machine Learning Research 9(Nov):2579-2605, 2008 L.J.P

  • Upload
    others

  • View
    4

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Classification of High-resolution Solar HαSpectra using t-SNE · Visualizing High-Dimensional Data Using t-SNE. Journal of Machine Learning Research 9(Nov):2579-2605, 2008 L.J.P

Gal Matijevič, Carsten Denker, Andrea Diercke,

Christoph Kuckein, Ekaterina Dineva, Horst Balthasar,

Ioannis Kontogiannis, and Partha S. Pal

Classification of High-resolution

Solar Hα Spectra using t-SNE

Meetu Verma

Page 2: Classification of High-resolution Solar HαSpectra using t-SNE · Visualizing High-Dimensional Data Using t-SNE. Journal of Machine Learning Research 9(Nov):2579-2605, 2008 L.J.P

Why do we need classification? On one observing day time-series of 21 Hα spatio-spectral data

cubes.

2Machine Learning in Heliophysics2019 September 20

Page 3: Classification of High-resolution Solar HαSpectra using t-SNE · Visualizing High-Dimensional Data Using t-SNE. Journal of Machine Learning Research 9(Nov):2579-2605, 2008 L.J.P

3D data Science in every pixel Contains about 8.7 million intensity and contrast profiles.

3Machine Learning in Heliophysics2019 September 20

Page 4: Classification of High-resolution Solar HαSpectra using t-SNE · Visualizing High-Dimensional Data Using t-SNE. Journal of Machine Learning Research 9(Nov):2579-2605, 2008 L.J.P

t-SNE Appropriate tool to

classify spectra Probabilistic approach

Dimensionality reduction

t-SNE result of classifying on

3000 256-dimensional

grayscale images of

handwritten digits.

Classes are quite well

separated even though t-SNE

had no information about class

labels.

Within each class,

properties like orientation,

skew and stroke thickness

tend to vary smoothly

across the space.

4Machine Learning in Heliophysics2019 September 20

L.J.P. van der Maaten and G.E. Hinton. Visualizing High-Dimensional Data

Using t-SNE. Journal of Machine Learning Research 9(Nov):2579-2605, 2008

L.J.P. van der Maaten. Accelerating t-SNE using Tree-Based Algorithms.

Journal of Machine Learning Research 15(Oct):3221-3245, 2014.

Page 5: Classification of High-resolution Solar HαSpectra using t-SNE · Visualizing High-Dimensional Data Using t-SNE. Journal of Machine Learning Research 9(Nov):2579-2605, 2008 L.J.P

t-SNE From profiles to classification

5Machine Learning in Heliophysics2019 September 20

t-SNE projection of 630 x 660 spectral profiles with 601 wavelength points.

The choice of parameters perplexity = 50, theta = 0.5, number of iterations = 1000

Q1 Is the default

choice ok?

Q2 Is the projection

different for profiles

and PCA coefficients?

Q3 Is the projection

affected by seeing

conditions?

https://distill.pub/2016/misread-tsne/

Page 6: Classification of High-resolution Solar HαSpectra using t-SNE · Visualizing High-Dimensional Data Using t-SNE. Journal of Machine Learning Research 9(Nov):2579-2605, 2008 L.J.P

Parameter study – Careful selection

Three parameters we can change.

Theta, Perplexity, Number of Iterations θ = 0.3, P = 50, n = 1000

6Machine Learning in Heliophysics2019 September 20

Page 7: Classification of High-resolution Solar HαSpectra using t-SNE · Visualizing High-Dimensional Data Using t-SNE. Journal of Machine Learning Research 9(Nov):2579-2605, 2008 L.J.P

Parameter study – Careful selection

7Machine Learning in Heliophysics2019 September 20

Three parameters we can change.

Theta, Perplexity, Number of Iterations θ = 0.4, P = 50, n = 1000

Page 8: Classification of High-resolution Solar HαSpectra using t-SNE · Visualizing High-Dimensional Data Using t-SNE. Journal of Machine Learning Research 9(Nov):2579-2605, 2008 L.J.P

Parameter study – Careful selection

8Machine Learning in Heliophysics2019 September 20

Three parameters we can change.

Theta, Perplexity, Number of Iterations θ = 0.7, P = 50, n = 1000

Page 9: Classification of High-resolution Solar HαSpectra using t-SNE · Visualizing High-Dimensional Data Using t-SNE. Journal of Machine Learning Research 9(Nov):2579-2605, 2008 L.J.P

Parameter study – Careful selection

9Machine Learning in Heliophysics2019 September 20

Three parameters we can change.

Theta, Perplexity, Number of Iterations θ = 0.8, P = 50, n = 1000

Page 10: Classification of High-resolution Solar HαSpectra using t-SNE · Visualizing High-Dimensional Data Using t-SNE. Journal of Machine Learning Research 9(Nov):2579-2605, 2008 L.J.P

Parameter study – Careful selection

10Machine Learning in Heliophysics2019 September 20

Three parameters we can change.

Theta, Perplexity, Number of Iterations θ = 0.5, P = 10, n = 1000

Page 11: Classification of High-resolution Solar HαSpectra using t-SNE · Visualizing High-Dimensional Data Using t-SNE. Journal of Machine Learning Research 9(Nov):2579-2605, 2008 L.J.P

Parameter study – Careful selection

11Machine Learning in Heliophysics2019 September 20

Three parameters we can change.

Theta, Perplexity, Number of Iterations θ = 0.5, P = 30, n = 1000

Page 12: Classification of High-resolution Solar HαSpectra using t-SNE · Visualizing High-Dimensional Data Using t-SNE. Journal of Machine Learning Research 9(Nov):2579-2605, 2008 L.J.P

Parameter study – Careful selection

12Machine Learning in Heliophysics2019 September 20

Three parameters we can change.

Theta, Perplexity, Number of Iterations θ = 0.5, P = 50, n = 1000

Page 13: Classification of High-resolution Solar HαSpectra using t-SNE · Visualizing High-Dimensional Data Using t-SNE. Journal of Machine Learning Research 9(Nov):2579-2605, 2008 L.J.P

Parameter study – Careful selection

13Machine Learning in Heliophysics2019 September 20

Three parameters we can change.

Theta, Perplexity, Number of Iterations θ = 0.5, P = 80, n = 1000

Page 14: Classification of High-resolution Solar HαSpectra using t-SNE · Visualizing High-Dimensional Data Using t-SNE. Journal of Machine Learning Research 9(Nov):2579-2605, 2008 L.J.P

Parameter study – Careful selection

14Machine Learning in Heliophysics2019 September 20

Three parameters we can change.

Theta, Perplexity, Number of Iterations θ = 0.5, P = 50, n = 200

Page 15: Classification of High-resolution Solar HαSpectra using t-SNE · Visualizing High-Dimensional Data Using t-SNE. Journal of Machine Learning Research 9(Nov):2579-2605, 2008 L.J.P

Parameter study – Careful selection

15Machine Learning in Heliophysics2019 September 20

Three parameters we can change.

Theta, Perplexity, Number of Iterations θ = 0.5, P = 50, n = 400

Page 16: Classification of High-resolution Solar HαSpectra using t-SNE · Visualizing High-Dimensional Data Using t-SNE. Journal of Machine Learning Research 9(Nov):2579-2605, 2008 L.J.P

Parameter study – Careful selection

Three parameters we can change.

Theta, Perplexity, Number of Iterations, θ = 0.5, P = 50, n = 2000

16Machine Learning in Heliophysics2019 September 20

Page 17: Classification of High-resolution Solar HαSpectra using t-SNE · Visualizing High-Dimensional Data Using t-SNE. Journal of Machine Learning Research 9(Nov):2579-2605, 2008 L.J.P

Parameter study – Careful selection

17Machine Learning in Heliophysics2019 September 20

Three parameters we can change.

Theta, Perplexity, Number of Iterations θ = 0.5, P = 50, n = 4000

Page 18: Classification of High-resolution Solar HαSpectra using t-SNE · Visualizing High-Dimensional Data Using t-SNE. Journal of Machine Learning Research 9(Nov):2579-2605, 2008 L.J.P

18Machine Learning in Heliophysics2019 September 20

Parameter study – Careful selection

A1 The default parameters are fine, maybe the number of iteration has to be

improved for large datasets

Page 19: Classification of High-resolution Solar HαSpectra using t-SNE · Visualizing High-Dimensional Data Using t-SNE. Journal of Machine Learning Research 9(Nov):2579-2605, 2008 L.J.P

19Machine Learning in Heliophysics2019 September 20

Parameter study – Careful selection

A1 The default parameters are fine, maybe the number of iteration has to be

improved for large datasets

Page 20: Classification of High-resolution Solar HαSpectra using t-SNE · Visualizing High-Dimensional Data Using t-SNE. Journal of Machine Learning Research 9(Nov):2579-2605, 2008 L.J.P

Good or Bad Seeing

PCA or Observed

20Machine Learning in Heliophysics2019 September 20

A2 Seeing does affect the projection but not much

A3 PCA coefficients and observed contrast profiles lead to similar results

Bad Seeing

Contrast profiles

Good Seeing

PCA coefficients

Good Seeing

Observed profiles

Good Seeing

Contrast profiles

UMAP

Page 21: Classification of High-resolution Solar HαSpectra using t-SNE · Visualizing High-Dimensional Data Using t-SNE. Journal of Machine Learning Research 9(Nov):2579-2605, 2008 L.J.P

Back mapping

Selected the regions after

using threshold of 0.9 and

better.

21Machine Learning in Heliophysics2019 September 20

These are the regions where

the profiles can surely be

inverted using cloud model.

Page 22: Classification of High-resolution Solar HαSpectra using t-SNE · Visualizing High-Dimensional Data Using t-SNE. Journal of Machine Learning Research 9(Nov):2579-2605, 2008 L.J.P

Conclusion

t-SNE is a powerful tool to classify

spectra.

No prior information is needed.

It classify good vs. bad profiles for

inversion.

Best settings perplexity = 50,

theta = 0.5 and number of

iterations = 1000 based on time

for computation and discerning

power.

Contrast as well as line profiles,

PCA coefficients, PCA denoised or

observed profiles lead to similar

projection.

22Machine Learning in Heliophysics2019 September 20

Does show some differences for

good and bad seeing.

The regions which can surely be

inverted using cloud models are

discernible.

OUTLOOK

Performing cloud model

inversions of selected regions in

t-SNE projection.

Projecting more data points on

the already projected map.

Comparison with UMAP.