Parameter Estimation in Autologistic Regression Models for Detection of Smoke in Satellite Images

Mark Wolters, Shanghai Center for Mathematical Sciences
Charmaine Dean, Western University

June 15, 2015, SSC 2015, Halifax




Outline

[email protected] 2/19

1. Introduction: application; goals
2. Modelling Framework: CRF approach; autologistic regression; challenges
3. Approach to Estimation: independence model + optimal smoothing
4. Some Results: simulated data; preliminary findings
5. Discussion: possible improvements; estimation in a prediction setting


1. Introduction


Earth-orbiting satellites help study large-scale environmental phenomena.

Our interest: smoke from forest fires.

Data: MODIS images
- 1 per day, 143 days
- 1.2 Mp each
- centered at Kelowna, BC
- hand-drawn smoke areas

Goal: classify pixels into smoke/nonsmoke

Why?
- Health studies
- Model input or validation
- Monitoring & archiving


Data characteristics

Hyperspectral images.

Spectra at each pixel are covariates for predicting smoke.

High-dimensional predictor space.

Expect spatial association.

35 image planes + higher-order terms


2. Modelling Framework


Joint PMF for an image: autologistic regression (ALR)

Pr(Y = y) ∝ exp( (Xβ)ᵀ y + (λ/2) yᵀ A y )

y: class labels (n-vector), y_i ∈ {L, H}
X: model matrix (spectral data, n × p); the linear predictors Xβ are the unary coefficients
π_i: P(pixel i is smoke | neighbour pixels)
G = (V, E): graph structure for dependence among pixels (regular 4-connected grid)
λ: pairwise association parameter
A: adjacency matrix of G
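As a hedged illustration (not the authors' code), the unnormalized log-PMF above can be evaluated on a toy 4-connected grid; `grid_adjacency`, `log_potential`, and the toy configurations are assumptions made for this sketch:

```python
import numpy as np

def grid_adjacency(nr, nc):
    """Adjacency matrix A of a regular 4-connected nr x nc pixel grid."""
    n = nr * nc
    A = np.zeros((n, n))
    for r in range(nr):
        for c in range(nc):
            i = r * nc + c
            if c + 1 < nc:                  # edge to right neighbour
                A[i, i + 1] = A[i + 1, i] = 1.0
            if r + 1 < nr:                  # edge to lower neighbour
                A[i, i + nc] = A[i + nc, i] = 1.0
    return A

def log_potential(y, X, beta, lam, A):
    """(X beta)^T y + (lam/2) y^T A y, i.e. log Pr(Y = y) up to the constant."""
    return (X @ beta) @ y + 0.5 * lam * (y @ A @ y)

A = grid_adjacency(3, 3)                    # 12 edges in a 3x3 grid
y_smooth = np.ones(9)                       # all one class, (L, H) = (-1, +1)
y_rough = np.array([1, -1, 1, -1, 1, -1, 1, -1, 1.0])  # checkerboard

# With no covariate signal, lam > 0 favours the smooth configuration:
X, beta = np.zeros((9, 1)), np.zeros(1)
print(log_potential(y_smooth, X, beta, 1.0, A))  # 12.0
print(log_potential(y_rough, X, beta, 1.0, A))   # -12.0
```

The pairwise term rewards agreement across edges, which is why the all-one-class image gets the higher potential; the intractable part is only the normalizing constant, not this evaluation.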


Autologistic regression

Notes:

Intractable normalizing constant

ALR is a conditional random field (CRF) model (Lafferty et al., 2001): given X, Y is a Markov random field.

Conditional logit form:

log( π_i / (1 − π_i) ) = (H − L) ( x_iᵀ β + λ Σ_{j∼i} y_j )

logistic regression ⇐⇒ λ = 0

Spatial effect is homogeneous, isotropic

For Y_i ∈ {0, 1}, the sum Σ_{j∼i} y_j increases the log-odds unless all neighbours are zero.

⇒ Estimates of β and λ are strongly coupled (Caragea and Kaiser, 2009; Hughes et al., 2011).
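A minimal sketch of the conditional logit form above with the (L, H) = (−1, +1) coding, so H − L = 2; the function name and the toy numbers are illustrative, not from the slides:

```python
import numpy as np

def smoke_prob(xb_i, neigh_sum, lam, L=-1.0, H=1.0):
    """P(pixel i is smoke | neighbours) from the conditional logit form."""
    eta = (H - L) * (xb_i + lam * neigh_sum)
    return 1.0 / (1.0 + np.exp(-eta))

# A pixel whose covariates alone say 50/50 (x_i' beta = 0):
print(smoke_prob(0.0, 0.0, lam=0.5))   # 0.5: neighbour labels balance out
print(smoke_prob(0.0, 4.0, lam=0.5))   # ~0.98: four smoke neighbours
print(smoke_prob(0.0, -4.0, lam=0.5))  # ~0.02: four nonsmoke neighbours
```

With ±1 coding a balanced neighbourhood contributes nothing, unlike the {0, 1} coding discussed above, where the neighbour sum can only push the log-odds up.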


Extensions

The centered ALR model (Caragea and Kaiser, 2009) aims to correct for the asymmetry of the pairwise term when Y_i ∈ {0, 1}:

logit(π_i) = x_iᵀ β + λ Σ_{j∼i} (y_j − μ_j)

where μ_j = E[Y_j | λ = 0] is the independence expectation.

Claim: just use Y_i ∈ {−1, +1} to get the same effect.

Proposal: let λ = λ_ij = λ(x_i, x_j) for adaptive smoothing.

Then

logit(π_i) = 2 ( x_iᵀ β + Σ_{j∼i} λ(x_i, x_j) y_j )
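One way such an adaptive weight λ(x_i, x_j) could be specified is a Gaussian kernel on covariate distance, so spectrally similar neighbours are smoothed strongly and likely boundaries are not. The kernel form, `lam0`, and `bandwidth` are assumptions for this sketch, not the authors' specification:

```python
import numpy as np

def adaptive_lam(x_i, x_j, lam0=1.0, bandwidth=1.0):
    """lambda(x_i, x_j) = lam0 * exp(-||x_i - x_j||^2 / (2 h^2)).

    Similar spectra -> weight near lam0 (strong smoothing across the edge);
    dissimilar spectra (likely a smoke boundary) -> weight near 0.
    """
    d2 = np.sum((np.asarray(x_i) - np.asarray(x_j)) ** 2)
    return lam0 * np.exp(-d2 / (2.0 * bandwidth ** 2))

print(adaptive_lam([1.0, 0.0], [1.0, 0.0]))         # 1.0 for identical spectra
print(adaptive_lam([1.0, 0.0], [5.0, 0.0]) < 0.01)  # True: near 0 across edges
```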


3. Approach to Estimation


Existing possibilities

1. Ignore spatial association (logistic regression, large n, large p).

2. Pseudolikelihood (PL): L(β, λ) ≈ ∏_images ∏_{i=1}^n Pr(Y_i = y_i | y_{j∼i})

3. Monte Carlo ML

4. Bayesian approach
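The PL objective in option 2 can be sketched as follows, under the assumption of (L, H) = (−1, +1) coding; `X`, `y`, and `A` are hypothetical stand-ins, and in practice the function would be handed to a numerical optimizer:

```python
import numpy as np

def neg_log_pl(params, X, y, A):
    """Negative log-pseudolikelihood, -sum_i log P(Y_i = y_i | neighbours),
    using the conditional logit form with (L, H) = (-1, +1) coding."""
    beta, lam = params[:-1], params[-1]
    eta = 2.0 * (X @ beta + lam * (A @ y))        # per-pixel logit of P(smoke)
    # For y_i in {-1, +1}: log P(Y_i = y_i | ...) = -log(1 + exp(-y_i * eta_i))
    return np.sum(np.logaddexp(0.0, -y * eta))

# Sanity check: with no edges and zero parameters, each pixel is a coin flip,
# so -log PL = n * log 2.
rng = np.random.default_rng(1)
n = 16
X = rng.normal(size=(n, 2))
y = rng.choice([-1.0, 1.0], size=n)
print(np.isclose(neg_log_pl(np.zeros(3), X, y, np.zeros((n, n))), n * np.log(2)))  # True
```

Because the product conditions each pixel on its observed neighbours, no normalizing constant is needed, which is what makes PL attractive at this scale.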

Problems

We have ∼ 10⁸ pixels

We have thousands of predictors, need model selection

We're still developing models; rapid evaluation of candidates is beneficial.

Hughes et al. (2011) use perfect sampling and recommend PL for large n.


Proposal: plug-in estimation

a) Use the independence (logistic) model to get β
- including model selection
- sample pixels to reduce n to a manageable size

b) Choose λ to optimize predictive power

Rationale

Treat λ as a smoothing parameter.

Assuming independence, β captures how information in X can be used to predict Y.

For fixed β, tuning λ will optimally reduce noise in the predicted probabilities.
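A minimal sketch of step (b): with the linear predictors from β̂ held fixed, grid-search λ against validation misclassification. The one-step conditional classification rule and all names here are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def choose_lambda(xb_val, y_val, A_val, lam_grid):
    """Pick lambda minimizing validation misclassification, beta held fixed.

    xb_val: fixed linear predictors x_i' beta-hat on validation pixels;
    y_val:  true labels in {-1, +1}; A_val: adjacency of validation pixels.
    Classifies each pixel from the conditional logit given its neighbours'
    true labels (an illustrative stand-in for full prediction).
    """
    best_lam, best_err = None, np.inf
    for lam in lam_grid:
        eta = 2.0 * (xb_val + lam * (A_val @ y_val))
        y_hat = np.where(eta > 0, 1.0, -1.0)
        err = np.mean(y_hat != y_val)
        if err < best_err:
            best_lam, best_err = lam, err
    return best_lam, best_err

# Toy chain of 4 pixels, all truly smoke; two are misclassified by their
# covariates alone, and a little smoothing fixes them.
A = np.array([[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0.0]])
xb = np.array([0.5, -0.1, -0.1, 0.5])
y = np.ones(4)
best_lam, best_err = choose_lambda(xb, y, A, [0.0, 0.5])
print(best_lam, best_err)   # 0.5 0.0
```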


Proposed procedure:

[Flowchart] The images are split into training, validation, and test sets. Training pixels feed model selection and fitting by logistic regression, giving the best independence model. Validation pixels are used to choose λ to minimize prediction error (autologistic regression), giving the best ALR model. Test images then give the final performance results.


4. Results


Simulated Images

Predictors: R, G, B

Random ellipses = class 1 (smoke)

Background = class 2 (nonsmoke)

90 images at 3 sizes: 100, 200, 400 pixels square
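A hypothetical generator in the spirit of this simulation (the ellipse sizes, class means, and noise level are assumptions, not the study's settings): random ellipses define class 1, and noisy R, G, B planes serve as the predictors.

```python
import numpy as np

def ellipse_image(size, n_ellipses, rng):
    """Return (labels, rgb): class 1 inside any ellipse, class 2 outside."""
    rr, cc = np.mgrid[0:size, 0:size]
    inside = np.zeros((size, size), dtype=bool)
    for _ in range(n_ellipses):
        cy, cx = rng.uniform(0, size, size=2)            # random centre
        a, b = rng.uniform(size / 20, size / 5, size=2)  # random semi-axes
        inside |= ((rr - cy) / a) ** 2 + ((cc - cx) / b) ** 2 <= 1.0
    labels = np.where(inside, 1, 2)
    mean = np.where(inside[..., None], 0.7, 0.3)         # class-dependent mean
    rgb = np.clip(mean + 0.1 * rng.normal(size=(size, size, 3)), 0.0, 1.0)
    return labels, rgb

labels, rgb = ellipse_image(100, 5, np.random.default_rng(0))
print(labels.shape, rgb.shape)   # (100, 100) (100, 100, 3)
```

Ellipses give smooth, blob-like "smoke" regions, so spatial smoothing should genuinely help, which is the property the simulation is probing.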


A) Plug-in vs. PL

pixels   method    R      G      B      λ     error rate (%)   time (min)
100²     plug-in  −2.20  −2.00   1.95   0.45      19.2            1.2*
100²     PL       −2.04  −1.99   2.06   0.99      23.3            0.9
200²     plug-in  −1.65  −1.36   1.71   0.50      20.8            5.0*
200²     PL       −1.61  −1.30   1.70   1.19      48.2            2.8
400²     plug-in  −2.00  −1.41   1.64   0.60      20.8           21.5*
400²     PL       −2.08  −1.40   1.68   1.36      50.5           16.1

* time per candidate λ value


Example predicted probabilities

[Figure: true labels (Truth) beside plug-in and PL predicted probabilities]


B) Effect of coding & centering

[Figure: prediction error rate (roughly 0.20 to 0.50) vs. pairwise parameter λ (0.0 to 1.2) for 200 × 200 images; curves for centered (0, 1), standard (0, 1), and standard (−1, 1) codings]

logit(π_i) = (H − L) ( x_iᵀ β + λ Σ_{j∼i} (y_j − μ_j) )


C) Preliminary smoke results

[Figure: RGB image with logistic-model and autologistic-model predictions; ALR model with 50 predictors]

1. Smoke-free areas: OK
2. Clouds vs. smoke: OK
3. Snow vs. smoke: OK
4. Spatial smoothing: OK
5. Smoke + cloud: Problem


Preliminary smoke results (continued)

[Figure: RGB image with logistic and autologistic predictions]

6. “Thin” smoke: Problem
7. Original masks (training data): Problem


5. Discussion


Advantages of working on a data-rich prediction problem

If you get low out-of-sample prediction error, the following are of little concern:

- the “truth” of your model
- the complexity or statistical efficiency of your model
- whether or not your parameters are statistically significant
- whether or not your parameter estimates are stable

Computational feasibility & run time become paramount.

How would things change if we’re interested in interpretation?

- Trade predictive power for model simplicity.
- The “plug-in” estimation approach is no longer helpful.
- The issue of model centering and coding becomes critical.


Future plans: smoke

Improve the base logistic regression model

Address “true” label ambiguity

Future plans: models

Computational improvements:
- low-level code
- parallelization

Revisit adaptive smoothing

A beta CRF for direct modelling of probabilities

Multi-class (autobinomial) extension


References


Caragea, P. C. and Kaiser, M. S. (2009), “Autologistic models with interpretable parameters,” Journal of Agricultural, Biological, and Environmental Statistics, 14, 281–300.

Hughes, J., Haran, M., and Caragea, P. C. (2011), “Autologistic models for binary data on a lattice,” Environmetrics, 22, 857–871.

Lafferty, J. D., McCallum, A., and Pereira, F. C. N. (2001), “Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data,” in Proceedings of the Eighteenth International Conference on Machine Learning, ICML ’01, San Francisco, CA, USA: Morgan Kaufmann Publishers Inc., pp. 282–289.
