
Using the Minimum String Length Statistic - Rice University, sparky.rice.edu/~hartigan/astr600/old/hw4.pdf


Spectral analysis in astronomy generally focuses on processing unevenly spaced data, since many observations, especially those from surveys, are intrinsically unevenly spaced. Fourier periodograms, especially the Lomb-Scargle periodogram, are most commonly used to process unevenly spaced data. When looking at nonsinusoidal signals, Fourier methods may have difficulties.

One nonparametric statistic used to perform spectral analysis on unevenly spaced data is the minimum string length of the dataset. The string length is the sum of the lengths of the line segments connecting successive values of a time series when the data are folded on a trial period and plotted against phase. An incorrect period scatters the signal when plotted against phase and increases the string length. By computing the string length for a range of periods and locating its minimum, we can estimate the period of a dataset without Fourier analysis.
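In symbols, one way to write the statistic described above is the following. The notation ($L$, $P$, $\phi_i$, $m_i$, $N$) is an assumption introduced here for illustration and does not come from the assignment.

$$
L(P) = \sum_{i=1}^{N-1} \sqrt{\left(\phi_{i+1} - \phi_i\right)^2 + \left(m_{i+1} - m_i\right)^2}
$$

Here $\phi_i$ are the phases of the $N$ observations for trial period $P$, sorted in increasing order, and $m_i$ are the corresponding magnitudes. This form uses phase and magnitude unscaled; since the two axes have different units, rescaling the magnitudes before summing is a choice left to the implementer.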

1. Write a code that outputs a plot of string length vs. period for a dataset including time and magnitude. This code should iteratively phase the data (with phase running from 0 to 1) for a range of periods and then calculate the string length for each period. Phase is defined as $\frac{\text{time}}{\text{period}} - \mathrm{floor}\left(\frac{\text{time}}{\text{period}}\right)$. Attach your code.
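A minimal Python sketch of the procedure in item 1, using only the standard library. The function names `phase`, `string_length`, and `scan_periods` are illustrative choices, not part of the assignment. The sketch also makes two assumptions the text does not settle: magnitudes and phases are used unscaled, and the string is not closed between the last and first phase points.

```python
import math


def phase(time, period):
    """Fractional part of time/period, in the range [0, 1)."""
    cycles = time / period
    return cycles - math.floor(cycles)


def string_length(times, mags, period):
    """Total length of the line joining (phase, magnitude) points sorted by phase.

    Assumption: no rescaling of magnitudes and no wrap-around segment.
    """
    folded = sorted((phase(t, period), m) for t, m in zip(times, mags))
    return sum(
        math.hypot(p2 - p1, m2 - m1)
        for (p1, m1), (p2, m2) in zip(folded, folded[1:])
    )


def scan_periods(times, mags, periods):
    """String length for each trial period, in the same order as `periods`."""
    return [string_length(times, mags, p) for p in periods]
```

The list returned by `scan_periods`, paired with the trial periods, can be handed to any plotting routine to produce the string length vs. period plot.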

2. Using the DECam imager at the CTIO in Chile, Wilson collected data over 4 nights looking at the Carina Nebula for eclipsing binaries and T Tauri rotation periods, amongst other things. The attached file obs.txt lists observations of a potential eclipsing binary by date (in MJD), r-magnitude, and error in magnitude.

a. Looking at the photometric errors, do you think that excluding errors from spectral analysis will significantly affect your results?

b. Using the date and r-magnitude, run the potential eclipsing binary data through your minimum string length code and attach the output plot of string length vs. period. Use a range of periods from 0.1 days to 3 days with a resolution of 0.01 days.
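The input handling for item 2 might be sketched as below. The layout of obs.txt is not specified beyond its three quantities, so this assumes whitespace-separated columns in the order date, r-magnitude, error, with optional `#` comments. The names `load_observations` and `period_grid` are hypothetical helpers, not part of the assignment.

```python
def load_observations(path):
    """Read (date, magnitude, error) columns from a whitespace-delimited file.

    Assumption: three numeric columns per row; '#' starts a comment.
    """
    dates, mags, errs = [], [], []
    with open(path) as handle:
        for line in handle:
            fields = line.split("#", 1)[0].split()
            if len(fields) < 3:
                continue  # blank line, comment, or short row
            try:
                date, mag, err = (float(x) for x in fields[:3])
            except ValueError:
                continue  # non-numeric header row
            dates.append(date)
            mags.append(mag)
            errs.append(err)
    return dates, mags, errs


def period_grid(start=0.1, stop=3.0, step=0.01):
    """Trial periods from start to stop inclusive.

    Built from an integer index so rounding error does not accumulate.
    """
    count = int(round((stop - start) / step)) + 1
    return [start + i * step for i in range(count)]
```

With the defaults, `period_grid()` covers the range requested in part b. The error column returned by `load_observations` can be compared with the spread of the magnitudes when considering part a.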

3. As with any periodogram, the results are open to interpretation. In this case, the minimum string length may not correspond to the proper period.

a. Using the period corresponding to the minimum string length from your code, plot the r-magnitude data vs. phase. What period did you use? Does this look like an eclipsing binary? If not, what’s wrong?

b. There should be two other possible periods that come close to the minimum string length. Make 2 more plots with these periods. Looking at all 3 plots, is one plot clearly better? Can any be ruled out? If so, why? What information is still missing from the best plot?
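For item 3, candidate periods can be picked out of the string length curve and the data folded on each one for plotting. The sketch below is one possible approach, not a prescribed method. `lowest_local_minima` and `fold` are hypothetical names, and treating "close to the minimum" as "the deepest local minima" is an assumption.

```python
import math


def lowest_local_minima(periods, lengths, count=3):
    """Return up to `count` (period, length) pairs at the deepest local minima.

    A point counts as a local minimum if it is below its left neighbour
    and not above its right neighbour; the two end points are ignored.
    """
    minima = [
        (lengths[i], periods[i])
        for i in range(1, len(lengths) - 1)
        if lengths[i] < lengths[i - 1] and lengths[i] <= lengths[i + 1]
    ]
    minima.sort()
    return [(period, length) for length, period in minima[:count]]


def fold(times, mags, period):
    """Return (phase, magnitude) pairs sorted by phase for one period."""
    return sorted(
        (t / period - math.floor(t / period), m) for t, m in zip(times, mags)
    )
```

Each period returned by `lowest_local_minima` can be passed to `fold`, and the resulting pairs plotted as magnitude vs. phase for visual comparison.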

Thanks to Wilson for the data!