
Budapest May 27, 2008

Unifying mixed linear models and the MASH algorithm for breakpoint detection and correction

Anders Grimvall, Sackmone Sirisack, Agne Burauskaite-Harju, and Karl Wahlin

Department of Computer and Information Science, Linköping University, SE-58183 Linköping, Sweden

E-mail: [email protected]


Objective of our work

Combine the best ideas of

a class of Mixed Linear Models (MLM) suggested by Picard et al.

and

Multiple Analysis of Series for Homogenization (MASH)

Provide a unified notation and theoretical framework for breakpoint detection and correction

Discuss further development of the cited models/methods


Parametric vs nonparametric approaches

Parametric approaches are needed to capture the abruptness of a change

Nonparametric approaches are suitable for tests of smooth trends in corrected data


Checklist for describing methods for breakpoint detection and correction

1. Candidate-reference comparisons

Pairwise differences or differences between candidate series and optimally weighted reference series

2. Probability model of observed data

Mean function (observed values adjusted for meteorological variability)

Variance-covariance matrix (meteorological variability and relationship between observations made at different locations and/or different occasions)

3. Estimators of breakpoints and other model parameters for a given number of breakpoints

Joint estimation of all model parameters or sequential identification of breakpoints

Theoretically optimal estimators or ad-hoc methods


Checklist for describing methods for breakpoint detection and correction (continued)

4. Stopping rule for the number of breakpoints

Hypothesis testing or information measures

5. Numerical algorithms for the chosen estimators

Numerical stability and computational cost

6. Loss function for the performance of the breakpoint correction

Minimizing the risk of erroneous estimates of individual breakpoints or false trends in the corrected series

All the listed items should be documented in any assessment of methods for breakpoint detection and correction!


Candidate-reference comparisons

Mixed Linear Models (MLM)

Candidate-reference comparisons are determined a priori

Multiple Analysis of Series for Homogenization (MASH)

“Optimally weighted” references are created during the data analysis


Probability model of observed data - the mean function

MASH

The mean function of candidate-reference differences is stepwise constant (multiple breakpoints can be accommodated)

MLM

The mean function of candidate-reference differences is stepwise constant (multiple breakpoints can be accommodated)


Probability model of observed data - the variance-covariance matrix

MASH

The spatio-temporal covariance is split into spatial covariance and noise

Candidate-reference differences observed at different occasions are assumed to be statistically independent

MLM

The spatio-temporal covariance of observed data is expressed by nested random components

A time series of random components common to all sites in a local neighbourhood introduces both spatial and temporal correlations

Noise (independent random components) adds to the variability of observed data


Probability model of observed data - distributional assumptions

MASH

The candidate-reference differences are assumed to form a Gaussian vector of independent random variables

MLM

All random components are assumed to be independent and to have a Gaussian distribution


Estimators of breakpoints and other model parameters for a given number of breakpoints

MASH

Method based on the idea that breakpoints are most easily detected if each candidate series is compared to an optimally selected reference series

Breakpoints are estimated one at a time (?), given the previously detected breakpoints

MLM

Joint estimation of all model parameters, including the breakpoints

The estimator is defined as the argument maximizing the likelihood of observed data (Maximum-Likelihood estimation)
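Illustration (not part of the original slides): if the candidate-reference differences are independent Gaussian variables with a common variance, maximizing the likelihood over the breakpoint positions for a fixed number of segments is the same as minimizing the within-segment residual sum of squares. The sketch below does this jointly for all breakpoints by dynamic programming. It is a simplified stand-in that ignores the random components of the MLM, and the function name `best_segmentation` and the simulated data are assumptions made for the example.

```python
import numpy as np

def best_segmentation(x, n_segments):
    """Return (breakpoints, rss): the start indices of segments 2..K that
    minimize the within-segment residual sum of squares of a stepwise mean."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Prefix sums give the cost of any segment x[i:j] in constant time.
    s1 = np.concatenate(([0.0], np.cumsum(x)))
    s2 = np.concatenate(([0.0], np.cumsum(x ** 2)))

    def cost(i, j):
        # Residual sum of squares of x[i:j] around its own mean (j > i).
        return s2[j] - s2[i] - (s1[j] - s1[i]) ** 2 / (j - i)

    # best[k, j] = minimal RSS when x[:j] is split into k + 1 segments.
    best = np.full((n_segments, n + 1), np.inf)
    argmin = np.zeros((n_segments, n + 1), dtype=int)
    for j in range(1, n + 1):
        best[0, j] = cost(0, j)
    for k in range(1, n_segments):
        for j in range(k + 1, n + 1):
            cands = [best[k - 1, i] + cost(i, j) for i in range(k, j)]
            i_best = int(np.argmin(cands)) + k
            best[k, j] = cands[i_best - k]
            argmin[k, j] = i_best
    # Backtrack from the end of the series to recover the breakpoints.
    breaks, j = [], n
    for k in range(n_segments - 1, 0, -1):
        j = argmin[k, j]
        breaks.append(int(j))
    return breaks[::-1], float(best[n_segments - 1, n])

# Hypothetical difference series with one shift of size 1.0 after 30 occasions.
rng = np.random.default_rng(1)
diff = np.concatenate([np.zeros(30), np.full(20, 1.0)]) + rng.normal(0, 0.3, 50)
breaks, rss = best_segmentation(diff, n_segments=2)
print(breaks, round(rss, 3))
```

The number of segments is an input here; choosing it is the task of the stopping rule (item 4 of the checklist).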


Numerical algorithms

MASH

The estimators used are defined by their numerical algorithms

MLM

Parameter estimates are computed using an Expectation-Maximization (EM) algorithm in which segmentation of observed data is alternated with estimation of model parameters for a given segmentation


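A Mixed Linear Model of data from m stations observed at n occasions

\[
\begin{pmatrix} Y_{11} \\ \vdots \\ Y_{1n} \\ \vdots \\ Y_{m1} \\ \vdots \\ Y_{mn} \end{pmatrix}
=
\begin{pmatrix}
T_1(t_1) & \cdots & 0 \\
\vdots & & \vdots \\
T_1(t_n) & \cdots & 0 \\
\vdots & \ddots & \vdots \\
0 & \cdots & T_m(t_1) \\
\vdots & & \vdots \\
0 & \cdots & T_m(t_n)
\end{pmatrix}
\begin{pmatrix} \mu_{11} \\ \vdots \\ \mu_{1K_1} \\ \vdots \\ \mu_{m1} \\ \vdots \\ \mu_{mK_m} \end{pmatrix}
+
\begin{pmatrix} I_n \\ \vdots \\ I_n \end{pmatrix}
\begin{pmatrix} U_1 \\ \vdots \\ U_n \end{pmatrix}
+
\begin{pmatrix} e_{11} \\ \vdots \\ e_{1n} \\ \vdots \\ e_{m1} \\ \vdots \\ e_{mn} \end{pmatrix}
\]

Y_{ij}: Observed values (station i, sampling occasion j)

First matrix: Incidence by station and segment; each row T_i(t_j) is a vector of zeros and ones indicating the segment of each observation

\mu_{ik}: Means by station and segment (K_i segments at station i)

Second matrix: Incidence by sampling occasion (m stacked n x n identity matrices I_n)

U_j: Random components by sampling occasion

e_{ij}: Noise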


Matrix representation used by Picard et al.

Model:

\[
Y = T\mu + ZU + E
\]

The matrix T defines the segmentation of the study period

U is a zero mean normal vector with covariance matrix G

E is a zero mean normal vector with diagonal covariance matrix R

U and E are independent, implying that Y has covariance matrix

\[
V = ZGZ' + R
\]
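Illustration (not part of the original slides): a minimal NumPy sketch that builds T and Z for a small example, simulates Y = Tμ + ZU + E, and forms V = ZGZ' + R. The helper name `build_design`, the station-by-station ordering of Y, the numerical values, and the choices G = σ_u² I and R = σ_e² I are assumptions made for the example; the model itself allows a general G and any diagonal R.

```python
import numpy as np

def build_design(segment_index, n_segments_per_station):
    """Build T and Z for observations ordered station by station.

    segment_index: (m, n) integer array; entry [i, t] is the (0-based)
    segment that station i belongs to at occasion t.
    """
    m, n = segment_index.shape
    # Column offset of each station's block of segment means in mu.
    offsets = np.concatenate(([0], np.cumsum(n_segments_per_station)[:-1]))
    T = np.zeros((m * n, int(np.sum(n_segments_per_station))))
    for i in range(m):
        for t in range(n):
            T[i * n + t, offsets[i] + segment_index[i, t]] = 1.0
    Z = np.tile(np.eye(n), (m, 1))  # m stacked copies of the identity I_n
    return T, Z

m, n = 3, 6
segment_index = np.zeros((m, n), dtype=int)
segment_index[0, 4:] = 1                 # station 1 changes segment at occasion 5
T, Z = build_design(segment_index, [2, 1, 1])

mu = np.array([0.0, 1.5, 0.3, -0.2])     # hypothetical means by station and segment
sigma_u, sigma_e = 1.0, 0.5              # assumed standard deviations
G = sigma_u ** 2 * np.eye(n)
R = sigma_e ** 2 * np.eye(m * n)
V = Z @ G @ Z.T + R                      # covariance matrix of Y

rng = np.random.default_rng(0)
U = rng.multivariate_normal(np.zeros(n), G)
E = rng.normal(0.0, sigma_e, m * n)
Y = T @ mu + Z @ U + E
```

With these choices, two stations observed at the same occasion have covariance σ_u², while observations made at different occasions are uncorrelated.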


Implicit model of candidate-reference differences

Introduce the (nm)x(nm) matrix

\[
W = \begin{pmatrix}
I_n & w_{12} I_n & \cdots & w_{1m} I_n \\
w_{21} I_n & I_n & \cdots & w_{2m} I_n \\
\vdots & & \ddots & \vdots \\
w_{m1} I_n & \cdots & w_{m,m-1} I_n & I_n
\end{pmatrix}
\]

where n is the number of sampling occasions, m is the number of stations, and I_n is the n x n identity matrix.

Provided that the row sums of W are zero, we get the matrix equation

\[
Y^* = WY = WT\mu + WE
\]
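Illustration (not part of the original slides): a short NumPy check that a weight matrix W with ones on the diagonal and zero row sums removes the random components shared by all stations, so that WY = WTμ + WE does not depend on U. The sketch assumes that the observations are ordered station by station, so that W can be written as a Kronecker product with the n x n identity matrix; the function name `difference_operator` and the equal reference weights are assumptions for the example.

```python
import numpy as np

def difference_operator(weights, n):
    """Expand an (m, m) weight matrix (ones on the diagonal, reference
    weights w_ij elsewhere) into the (nm) x (nm) matrix W."""
    weights = np.asarray(weights, dtype=float)
    return np.kron(weights, np.eye(n))

m, n = 3, 5
# Equal weights: each candidate is compared with the mean of the other stations,
# so the off-diagonal weights are -1/(m-1) and every row sums to zero.
weights = (1 + 1 / (m - 1)) * np.eye(m) - np.ones((m, m)) / (m - 1)
W = difference_operator(weights, n)
Z = np.tile(np.eye(n), (m, 1))           # incidence by sampling occasion

rng = np.random.default_rng(0)
signal = rng.normal(size=m * n)          # stands in for T mu + E
U_a, U_b = rng.normal(size=n), rng.normal(size=n)

# The same transformed series is obtained whatever the value of U.
Y_star_a = W @ (signal + Z @ U_a)
Y_star_b = W @ (signal + Z @ U_b)
print(np.allclose(W @ Z, 0.0), np.allclose(Y_star_a, Y_star_b))
```

The price of removing U is that the transformed noise WE is no longer independent across stations, which is worth keeping in mind when the differences are modelled as independent.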


Alternating algorithms for joint estimation of all model parameters

The entire space of parameters is searched by altering some of the coordinates at a time

Each cycle of the alternating algorithm contains:

i. a segmentation step (S)

ii. an estimation step for a given segmentation of the data (E)

iii. an optional step for deriving an “optimal” reference to each time series of data (O)

S E O S E O S E O
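Illustration (not part of the original slides): a minimal sketch of the S-E cycle for an (m, n) array of observations. To stay short it departs from the EM algorithm of Picard et al. in several ways, all of which are assumptions of this example: the occasion effects U are treated as fixed effects fitted by least squares, every station is given exactly one breakpoint, and the optional O step is omitted. The function names are hypothetical.

```python
import numpy as np

def segment_step(resid):
    """S: return the split index b (1..n-1) that minimizes the residual sum
    of squares of a two-level stepwise mean fitted to one series."""
    n = len(resid)
    rss = [np.sum((resid[:b] - resid[:b].mean()) ** 2)
           + np.sum((resid[b:] - resid[b:].mean()) ** 2) for b in range(1, n)]
    return int(np.argmin(rss)) + 1

def fitted_means(resid, b):
    """Stepwise mean function of one series for a given split index b."""
    fit = np.empty_like(resid)
    fit[:b], fit[b:] = resid[:b].mean(), resid[b:].mean()
    return fit

def alternate(Y, n_cycles=10):
    """Alternate S and E steps; return breakpoints, fitted means,
    occasion effects and the residual sum of squares after each cycle."""
    m, n = Y.shape
    U = np.zeros(n)
    history = []
    for _ in range(n_cycles):
        # S: re-segment every station after removing the occasion effects.
        breaks = [segment_step(Y[i] - U) for i in range(m)]
        # E: update the segment means, then the occasion effects.
        fit = np.vstack([fitted_means(Y[i] - U, breaks[i]) for i in range(m)])
        U = (Y - fit).mean(axis=0)
        history.append(float(np.sum((Y - fit - U) ** 2)))
    return breaks, fit, U, history

# Hypothetical data: common occasion effects, one shift at station 1.
rng = np.random.default_rng(2)
m, n = 4, 40
U_true = rng.normal(0.0, 1.0, n)
Y = U_true + rng.normal(0.0, 0.2, (m, n))
Y[0, 25:] += 2.0
breaks, fit, U_hat, history = alternate(Y)
print(breaks[0], history[0], history[-1])
```

Each step minimizes the same sum of squares over one block of parameters while the others are held fixed, so the criterion cannot increase from one cycle to the next. Convergence to the global optimum is not guaranteed.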


Remarks on alternating algorithms for joint estimation of all model parameters

One does not need to maximize with respect to all of the latent parameters at once, but could instead maximize over one or a few of them at a time, alternating with the maximization step

The algorithm can be made adaptive by altering the return time for different parts of the full cycle

Additional constraints may be imposed on the structure of the variance-covariance matrix

The mean function can be modified to accommodate mean functions that are non-constant between breakpoints

Covariates can be introduced into the model


Proposed basis for a unified approach

1. A joint probabilistic framework composed of multivariate normal distributions expressed as mixed linear models

2. Explicitly defined mean functions and variance-covariance matrices (stepwise constant or linear mean functions, spatial and temporal correlations etc)

3. Joint ML-estimation of all model parameters (including the location of breakpoints) is adopted as a desirable standard

4. Optimal weighting of references and other schemes for candidate-reference comparisons are offered as options to all models

5. Various stopping rules for the number of breakpoints are offered as options to all models

6. The detection and correction for breakpoints should be regarded as a filter that reduces the risk of false conclusions regarding temporal trends


Some remarks on temporal scales

Homogenizing subannual data may have three objectives:

Facilitate the detection of breakpoints that occur in the middle of a year

Facilitate the detection of breakpoints by using meteorological covariates

Facilitate the detection of changepoints in extremes


Additional remark on parametric vs nonparametric approaches

Parametric approaches are often a must when data are sparse

Observations of extreme events are sparse

The joint occurrence of shifts in the mean and higher percentiles calls for parametric modelling


Conclusions

We need a checklist for describing all methods considered

Mixed linear models provide a framework and generic notation for unifying “all” parametric approaches from SNHT to Caussinus & Mestre and MASH

The choice of principles for parameter estimation should be separated from the construction of numerical algorithms

Options for candidate-reference comparisons and stopping rules for the number of breakpoints should be offered to all underlying models
