IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, VOL. 53, NO. 3, MARCH 2015 1135

Nonparametric Regression for Estimation of Spatiotemporal Mountain Glacier Retreat From Satellite Images

Nezamoddin N. Kachouie, Travis Gerke, Peter Huybers, and Armin Schwartzman

Abstract—Historical variations in the extent of mountain glaciers give insight into natural and forced changes of these bellwethers of the climate. Because ground observations are few relative to the number of glaciers, it is useful to develop techniques for monitoring glacier systems using satellite imagery. Here, we propose a new approach for identifying the glacier terminus over time from Landsat images. The proposed method detects inflection points in multispectral satellite imagery taken along a glacier’s flow path in order to identify candidate terminus locations. A gated tracking algorithm is then applied to identify the best candidate for the glacier terminus location through time. Finally, the long-term trend of the terminus position is estimated with uncertainty bounds by applying nonparametric regression to the temporal sequence of estimated terminus locations. The method is shown to give results consistent with ground-based observations for the Franz Josef and Gorner glaciers and is further applied to estimate the retreat of Viedma, a glacier with no available ground measurements.

Index Terms—Curve fitting, glacier terminus, global warming, local polynomial regression, plug-in bandwidth selection, spatiotemporal analysis, tracking.

I. INTRODUCTION

A glacier terminus marks the end of a glacier’s flow path, where the ice surface often gives way to debris, bare ground, or water if snow cover is absent. Trends in the terminus position of individual glacier systems depend strongly on local basin geometry and local variations in temperature and precipitation (e.g., [1] and [2]). This implies that, although a general retreat of mountain glacier systems has been identified in relation to centennial trends toward warmer temperatures [3]–[5], a great deal more information regarding regional variations in climate could be extracted by mapping the time history of the terminus position. Ground measurements are available for certain glaciers (e.g., [6] and [7]), but because each glacier system is likely to be distinct both in the climate variations that it has experienced and in its response to these variations, augmenting ground-based observations with satellite-based estimates has value.

Manuscript received July 15, 2013; revised May 31, 2014; accepted June 16, 2014.

N. N. Kachouie is with the Department of Mathematical Sciences, Florida Institute of Technology, Melbourne, FL 32901 USA (e-mail: nezamoddin@fit.edu).

T. Gerke is with the Department of Epidemiology, Harvard School of Public Health, Boston, MA 02115 USA.

P. Huybers is with the Department of Earth and Planetary Sciences, Harvard University, Cambridge, MA 02138 USA.

A. Schwartzman is with the Department of Statistics, North Carolina State University, Raleigh, NC 27695 USA.

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TGRS.2014.2334643

The geometry of many individual glaciers has been estimated using remote sensing techniques (e.g., [8]–[13]), including various semiautomated image analysis techniques such as supervised classification [14], region segmentation [15], and other methods [16], [17]. In this paper, we introduce a general approach for quantifying mountain glacier variations. The proposed method detects and tracks the location of the terminal point of the glacier over time from a temporal sequence of multispectral satellite images. This work is motivated by the availability of satellite images: specifically, an inventory of Landsat images [18] covering the past several decades that has been publicly available since 2009 [19], [20]. Unlike common methods [8]–[17], the method presented here does not rely on detecting the glacier boundaries on the 2-D image. Instead, it needs only to determine the boundary of the 1-D intensity profile at the terminus, which is found via inflection point detection, tracking, and temporal smoothing.

The proposed statistical method incorporates both spatial and temporal information. Data are gathered as a time sequence of spatially registered multispectral intensity profiles extracted along the glacier path. Within each extracted intensity profile, a transition is expected to occur where ice terminates and gives way to another surface having distinct reflective characteristics, such as ground or water. In high-resolution images with high contrast and a clear view of the glacier, the terminus can be identified as the strongest transition along the path. Therefore, in our previous work [21], we introduced a new processed band with increased contrast and relied on the detection of a single inflection point along the glacier path at each time frame to identify the terminus using a constrained bandwidth selection method.

The visibility of the glacier path and its terminus in satellite images is, however, affected by many factors that produce partial loss of contrast and variability in the extracted intensity profile. Some are easy to recognize visually, such as the effect of mountain shadows and clouds, while others are more subtle, such as the presence of snow, variations in temperature and humidity, debris atop glacial ice, and image acquisition noise. These sources of uncertainty tend to produce several inflection points along the glacier path in addition to the one corresponding to the true terminus.

0196-2892 © 2014 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

To extend the previous work and address the aforementioned problems, we present here a new statistical method that explicitly accounts for the often multimodal distribution associated with estimating terminus locations and that is robust against the effects of clouds and shadows. The influence of shadows is further reduced by using Landsat frequency bands that are less affected by them, particularly the thermal frequency band B62 and a processed band called the Normalized Difference Snow Index (NDSI), defined as the difference between the visual frequency band B20 and the infrared frequency band B50, divided by their sum [11], [22], [23].
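The NDSI definition above can be sketched in a few lines. This is a minimal illustration, assuming the two bands have already been read into co-registered arrays; the function name `ndsi`, the `eps` guard, and the sample pixel values are hypothetical and not part of the original method.

```python
import numpy as np

def ndsi(visible, infrared, eps=1e-12):
    """Normalized Difference Snow Index: (visible - infrared) / (visible + infrared)."""
    visible = np.asarray(visible, dtype=float)
    infrared = np.asarray(infrared, dtype=float)
    total = visible + infrared
    # Pixels where both bands are (numerically) zero carry no information: mark as NaN.
    out = np.full(total.shape, np.nan)
    valid = np.abs(total) > eps
    out[valid] = (visible[valid] - infrared[valid]) / total[valid]
    return out

# Hypothetical digital numbers for band B20 (visible) and band B50 (infrared).
b20 = np.array([[200.0, 180.0], [60.0, 0.0]])
b50 = np.array([[20.0, 30.0], [55.0, 0.0]])
index = ndsi(b20, b50)
```

Bright ice or snow pixels, which reflect strongly in the visible band and weakly in the infrared band, give values near 1, while bare ground gives values near 0.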

Contributions. In this paper, we improve and extend our previous work [21] on locating the glacier terminus and estimating glacier variations over time from satellite imagery. Specifically, we account for two different sources of uncertainty to allow more accurate estimation in the temporal analysis. The first source of uncertainty arises when, within each time frame, multiple transitions (inflection points) are detected along the glacier path, giving rise to several terminus candidates. This issue is resolved by tracking the terminus location over time through the pool of terminus candidates. Second, each terminus candidate’s location has an associated uncertainty that is accounted for in estimating the change in terminus location over time.

The proposed approach involves four steps:

1) path selection along the flow axis of the glacier (see Section II);

2) the direct estimation of the first derivative of the image intensity along the glacier path at each time frame (see Section III);

3) inflection point detection to locate glacier terminus candidates at each time frame and the tracking of the terminus location over time (see Section IV);

4) temporal smoothing for the estimation of long-term terminus retreat (see Section V).

In Step 1, the path is manually selected along the flow axis of the glacier and extends across the terminus. Path selection is an obvious candidate for additional work, possibly in conjunction with high-resolution elevation data, but such improvements are not pursued further in this paper. In Step 2, nonparametric local polynomial regression is used at each time frame to directly estimate the first derivative of the image intensity profile. Spatial correlation along the path is taken into account to determine the plug-in smoothing bandwidth as well as to compute standard errors of the estimation. In Step 3, inflection points of the intensity profile are detected by finding the local optima of the estimated first derivative. The detected inflection points are considered as candidates for the glacier terminus location, and the unique estimated terminus location at each time frame is then determined by tracking over time. Finally, in Step 4, the long-term trend in glacier retreat is estimated by nonparametric local polynomial regression. In this case, a heteroscedastic model is assumed to account for the variability in the estimation accuracy of the glacier terminus location at different time frames. This model is used for determining the plug-in smoothing bandwidth and computing estimation standard errors.

Nonparametric regression, applied in the aforementioned Steps 2 and 4, has been widely used to estimate an unknown function from noisy observations [24]–[29]. In particular, Steps 2 and 3 address the problem of detecting the inflection points of the underlying function. Inflection points represent sharp transitions along a continuous curve [30]. In contrast, most of the related literature concerns the detection of change points, commonly understood as jump discontinuities [31]–[36]. The detection of inflection points requires estimating the first derivative of the underlying function. In general, the first derivative $\hat{r}^{(1)}$ estimated directly via local polynomial regression is not the same as the derivative $(\hat{r})'$ of the estimated function [29]. For this reason, we prefer to estimate the first derivative directly using a plug-in bandwidth [27], [29], [37]. This is in contrast with cross-validation bandwidth selection methods (alone or in conjunction with the bootstrap) that have been widely used for change point, or jump, detection [31], [34], [35].

II. DATA DESCRIPTION

Image intensity profiles are obtained using a MATLAB graphical user interface that has been specifically developed to provide data preprocessing, visualization, and analysis capabilities. Details are given in [21]. Briefly, for a given mountain glacier of interest, Landsat scenes taken at different time points are spatially registered and cropped around the glacier using user-defined geographical coordinates. Then, the user outlines the glacier path through the interior of the ice surface once, and this outline is applied to all of the spatially registered images to extract intensity profiles for each time frame via local spatial averaging.

The extracted glacier intensity profile at each time $t_j$, $j = 1, \ldots, m$, is an array of intensities $Y_{ij}$, $i = 1, \ldots, n$, corresponding to spatial locations $s_i$, $i = 1, \ldots, n$, along the glacier path, where $m$ is the number of time frames (columns) and $n$ is the number of data points along the glacier path (rows). Variables and parameters are summarized in Table I.

III. SPATIAL ESTIMATION

Terminus candidates are first located on the intensity profile within each time frame. Assuming a signal plus noise model, nonparametric local polynomial regression is applied to the noisy intensity observations along the glacier path in order to directly estimate the first derivative of the glacier intensity profile.

A. Nonparametric Local Polynomial Regression

For any given time frame, the extracted image intensity profile along the glacier path consists of $n$ observed noisy samples $Y_i$, $i = 1, 2, \ldots, n$, at spatial locations $s_i$, $i = 1, 2, \ldots, n$, along the path, related by the signal plus noise model

$$Y_i = r(s_i) + \varepsilon_i, \qquad \varepsilon_i \sim (0, \Sigma_\varepsilon), \qquad i = 1, 2, \ldots, n \tag{1}$$

where $\varepsilon_i$ is correlated noise with mean zero and covariance $\Sigma_\varepsilon$, $r$ is an unknown regression function, and the $s_i$’s are equally spaced in arc length over a segment of length $|S|$. The time frame index $j$ has been omitted in order to simplify notation. There are many factors that affect the observation pairs $(s_i, Y_i)$. In model (1), the unknown function $r$ is intended to include important systematic sources of bias, such as the effect of shadows from adjacent mountains and clouds. The residual noise is intended to capture nonsystematic variability in the intensity, such as instrument measurement noise.

TABLE I: DEFINITION OF VARIABLES AND PARAMETERS

Using local polynomial regression [29], the unknown function $r$ can be locally approximated at any point $s$ by a polynomial of order $p$. For locations $z$ in a neighborhood of $s$, we define

$$g(z; A_s) = a(s) + a'(s)(z - s) + a''(s)\frac{(z - s)^2}{2!} + \cdots + a^{(p)}(s)\frac{(z - s)^p}{p!} \tag{2}$$

where $A_s = (a(s), a'(s), a''(s), \ldots, a^{(p)}(s))^T$ is estimated by $\hat{A}_s = (\hat{a}(s), \hat{a}'(s), \hat{a}''(s), \ldots, \hat{a}^{(p)}(s))^T$, chosen to minimize

$$\sum_{i=1}^{n} w_i(s)\,\bigl(Y_i - g(s_i; A_s)\bigr)^2 = (Y - X_s A_s)^T W_s (Y - X_s A_s) \tag{3}$$

where $Y = (Y_1, \ldots, Y_n)^T$, $X_s$ is an $n \times (p+1)$ matrix whose $i$th row is $[1, (s_i - s), \ldots, (s_i - s)^p/p!]$, $W_s$ is an $n \times n$ diagonal matrix whose $i$th diagonal entry is $w_i(s) = K_\gamma(s_i - s)$, and $K_\gamma(x) = (1/\gamma)K(x/\gamma)$ is a kernel with bandwidth $\gamma$ for which $\int K(x)\,dx = 1$. Equation (3) is a least squares problem, and through minimization, we obtain $\hat{A}_s = (X_s^T W_s X_s)^{-1} X_s^T W_s Y = L_s Y$, where $L_s = (X_s^T W_s X_s)^{-1} X_s^T W_s$ is the smoothing matrix. The estimate $\hat{r}(s) = \hat{a}(s)$ is the inner product of the first row of $L_s$ with $Y$:

$$\hat{r}(s) = ([1\ 0\ 0\ \cdots\ 0] \times L_s)\,Y = e_1 L_s Y = \sum_{i=1}^{n} l_{1,i}(s)\,Y_i. \tag{4}$$

The variance of this estimator is

$$\operatorname{var}\bigl(\hat{r}(s)\bigr) = l_1^T(s)\,\Sigma_\varepsilon\, l_1(s) \tag{5}$$

where $l_1^T(s) = e_1 L_s$ is the vector defined by $l_1(s) = (l_{1,1}(s), l_{1,2}(s), \ldots, l_{1,n}(s))^T$. As in (4), the $d$th derivative $r^{(d)}(s)$ can be estimated by the inner product of $Y$ with the $(d+1)$th row of $L_s$ as

$$\hat{r}^{(d)}(s) = ([0\ \cdots\ 1\ 0\ \cdots\ 0] \times L_s)\,Y = e_{d+1} L_s Y = \sum_{i=1}^{n} l_{d+1,i}(s)\,Y_i \tag{6}$$

where $e_{d+1}$ is a vector with a single 1 in the $(d+1)$th position and $l_{d+1,i}$ is the $i$th entry of the $(d+1)$th row of $L_s$, denoted $l_{d+1}(s)$. Its associated variance is computed in a manner similar to (5):

$$\operatorname{var}\bigl(\hat{r}^{(d)}(s)\bigr) = l_{d+1}^T(s)\,\Sigma_\varepsilon\, l_{d+1}(s). \tag{7}$$

In our case, we are mainly interested in directly estimating the first derivative ($d = 1$) of the glacier intensity profile, whose local minima will be used as candidates for the glacier terminus. However, we will also need estimates of the second and third derivatives in order to estimate the standard error of the location of those candidates. We therefore use local polynomial order $p = 3$, the smallest order that produces the desired quantities.

The tricube function $K_{\gamma=1}(x) = (1 - |x|^3)^3\,\mathbf{1}_{\{|x| \le 1\}}$ [29] is used as a smoothing kernel for two reasons. First, it provides finite support for smoothing in order to limit the spread of the signal region into neighboring regions. Second, this smoothing kernel has continuous first, second, and third derivatives within its support and at the boundaries, facilitating the identification of local minima. These characteristics ensure smooth estimates of the function derivatives and reduce edge effects. The selection of the plug-in smoothing bandwidth $\gamma_s$ is described in Section III-C.
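The estimator in (3)–(6) can be sketched as follows. This is a minimal illustration, assuming a fixed bandwidth supplied by the caller; the function names `tricube`, `smoothing_matrix`, and `local_poly_derivative` are hypothetical, and the tricube kernel is used as printed above, without a normalizing constant (a constant factor in the weights does not change the least squares solution).

```python
import math
import numpy as np

def tricube(x):
    """Tricube kernel (1 - |x|^3)^3 on |x| <= 1, zero elsewhere."""
    ax = np.abs(x)
    return np.where(ax <= 1.0, (1.0 - ax**3) ** 3, 0.0)

def smoothing_matrix(s_grid, s0, bandwidth, p=3):
    """Return L_s = (X^T W X)^{-1} X^T W at location s0, with shape (p + 1, n)."""
    dz = s_grid - s0
    # Column k holds (s_i - s0)^k / k!, so row d of L_s (0-based) estimates r^(d)(s0).
    X = np.column_stack([dz**k / math.factorial(k) for k in range(p + 1)])
    w = tricube(dz / bandwidth) / bandwidth  # K_gamma(s_i - s0)
    XtW = X.T * w                            # X^T W without forming the diagonal matrix
    return np.linalg.solve(XtW @ X, XtW)

def local_poly_derivative(s_grid, y, bandwidth, d=1, p=3):
    """Estimate the d-th derivative of the profile at every grid location, as in (6)."""
    return np.array(
        [smoothing_matrix(s_grid, s0, bandwidth, p)[d] @ y for s0 in s_grid]
    )
```

Because a local cubic fit reproduces any cubic exactly, applying `local_poly_derivative` to samples of a cubic polynomial returns its exact derivative, which is a convenient sanity check before using the sketch on noisy profiles.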

B. Spatial Error Associated With the Estimated Derivative

Given a smoothing bandwidth $\gamma_s$, the standard error of the derivative is estimated as follows. Because the $\varepsilon_i$’s (spatial noise) are correlated, the spatial error associated with $\hat{r}^{(d)}(s)$ can be estimated using (7) by

$$\widehat{\operatorname{var}}\bigl(\hat{r}^{(d)}(s)\bigr) = l_{d+1}^T(s)\,\hat{\Sigma}_\varepsilon\, l_{d+1}(s) \tag{8}$$

where $\hat{\Sigma}_\varepsilon$ is the estimate of the noise autocovariance matrix $\Sigma_\varepsilon$. A time series analysis of the nonparametric regression residuals $\hat{\varepsilon}_i = Y_i - \hat{r}(s_i)$ for our data indicates that the residuals are well modeled as a second-order autoregressive (AR2) process

$$\varepsilon_i = a_1 \varepsilon_{i-1} + a_2 \varepsilon_{i-2} + \theta_i \tag{9}$$

where $a_1$ and $a_2$ are AR2 model coefficients and $\theta_i$ is zero-mean white noise. The autoregressive (AR) model parameters are estimated by applying the Yule–Walker method [38]. Given the AR model, the estimated noise covariance matrix $\hat{\Sigma}_\varepsilon$ is a symmetric Toeplitz matrix whose first row is obtained by utilizing the estimates of the AR parameters $a_1$, $a_2$, and $\sigma_\theta^2$ in the formulas

$$\begin{aligned}
\Sigma_{1,1}(a_1, a_2, \sigma^2) &= \frac{a_1 (1 - a_2)\,\sigma^2}{(1 + a_2)\bigl((1 - a_2)^2 - a_1^2\bigr)(1 - a_2)} \\
\Sigma_{1,2}(a_1, a_2, \sigma^2) &= a_1 \Sigma_{1,1}(a_1, a_2, \sigma^2) + \frac{a_2 (1 - a_2)\,\sigma^2}{(1 + a_2)\bigl((1 - a_2)^2 - a_1^2\bigr)} \\
\Sigma_{1,j}(a_1, a_2, \sigma^2) &= a_1 \Sigma_{1,(j-1)}(a_1, a_2, \sigma^2) + a_2 \Sigma_{1,(j-2)}(a_1, a_2, \sigma^2), \qquad j \in [3, n]
\end{aligned} \tag{10}$$

applied to $\Sigma = \hat{\Sigma}_\varepsilon$. The standard error associated with $\hat{r}^{(d)}(s)$ is finally estimated by (8).
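The two ingredients of this subsection, Yule–Walker estimation of the AR2 parameters and the resulting Toeplitz covariance, can be sketched as below. This is an illustration rather than the authors' code: it uses the textbook autocovariances of a stationary AR(2) process with the lag-0 variance as the first entry of the Toeplitz row, a lag convention that is an assumption of this sketch. The function names `yule_walker_ar2` and `ar2_covariance` are hypothetical.

```python
import numpy as np

def yule_walker_ar2(resid):
    """Yule-Walker estimates (a1, a2, sigma2) for an AR(2) model of the residuals."""
    e = np.asarray(resid, dtype=float)
    e = e - e.mean()
    n = e.size
    # Biased sample autocovariances at lags 0, 1, 2.
    c = [np.dot(e[: n - k], e[k:]) / n for k in range(3)]
    a1, a2 = np.linalg.solve([[c[0], c[1]], [c[1], c[0]]], [c[1], c[2]])
    sigma2 = c[0] - a1 * c[1] - a2 * c[2]  # innovation (white noise) variance
    return a1, a2, sigma2

def ar2_covariance(a1, a2, sigma2, n):
    """n x n symmetric Toeplitz covariance of a stationary AR(2) process."""
    gamma = np.empty(n)
    # Lag-0 variance, then lag 1, then the AR recursion for higher lags.
    gamma[0] = (1 - a2) * sigma2 / ((1 + a2) * ((1 - a2) ** 2 - a1**2))
    if n > 1:
        gamma[1] = a1 * gamma[0] / (1 - a2)
    for k in range(2, n):
        gamma[k] = a1 * gamma[k - 1] + a2 * gamma[k - 2]
    lags = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
    return gamma[lags]
```

With a matrix from `ar2_covariance` and a row $l_{d+1}(s)$ of the smoothing matrix, the variance in (8) is the quadratic form `l @ Sigma @ l`.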

C. Spatial Bandwidth Selection—Plug-In Bandwidth

Cross validation (CV) is often used for choosing the smoothing bandwidth for estimating the underlying function in nonparametric regression. However, because we are interested in directly estimating the first derivative, we instead use a plug-in bandwidth designed for that purpose. In addition, although CV performs well for uncorrelated errors, with correlated errors it selects bandwidths that are too small, resulting in overfitting to the observations (this is discussed in detail in Section VII-C). We therefore expect the plug-in method to perform better than CV, given the spatial correlations in the present scenario. This is discussed further in Section VII.

Assuming that the spatial locations $s_i$ are equidistant over a segment of length $|S|$, the plug-in bandwidth [27]–[29], [37] to estimate the $q$th derivative of $r$ using a $p$th-order local polynomial regression, where $q \in [0, p]$, is

$$\gamma_s = \left( \frac{C_S \iint \Sigma_{s,u}\,ds\,du}{n \int \bigl(r^{(p+1)}(s)\bigr)^2 ds} \right)^{\frac{1}{2p+3}} \approx \left( \frac{C_S\,|S| \sum_{i=1}^{n} \sum_{j=1}^{n} \tilde{\Sigma}_{\varepsilon ij}}{\sum_{i=1}^{n} \bigl(\tilde{r}^{(p+1)}(s_i)\bigr)^2} \right)^{\frac{1}{2p+3}} \tag{11}$$

where

$$C_S(p, q) = \frac{(p+1)!^2\,(2q+1) \int K^2(u)\,du}{2\,(p+1-q) \left( \int u^{p+1} K(u)\,du \right)^2}$$

is a constant computed for the kernel $K$, $\Sigma_{s,u}$ is the covariance matrix of the noise $\varepsilon$, and $r^{(p+1)}$ is the $(p+1)$th derivative of $r$. The second expression in (11) is an estimator of the plug-in bandwidth, which uses a pilot estimate $\tilde{r}$ of the regression function $r$; then, $\tilde{\Sigma}_{\varepsilon ij}$ is an estimator of $\Sigma_{s,u}$ based on the pilot residuals $\tilde{\varepsilon}_i = Y_i - \tilde{r}(s_i)$.

Often, the pilot estimate $\tilde{r}$ is computed by fitting a global polynomial of order $p_G \ge 4$ [29]. As further discussed in Section VII, we use global polynomial order $p_G = 2p + 3 = 9$. As in (9), to estimate the noise autocovariance $\Sigma_\varepsilon$, the pilot residuals $\tilde{\varepsilon}_i$ are modeled as an AR model of order 2, $\tilde{\varepsilon}_i = b_1 \tilde{\varepsilon}_{i-1} + b_2 \tilde{\varepsilon}_{i-2} + \eta_i$, where $b_1$ and $b_2$ are the AR model coefficients and $\eta_i$ is zero-mean white noise. The Yule–Walker method is applied to estimate the AR2 parameters. The pilot noise covariance matrix $\tilde{\Sigma}_\varepsilon$ is obtained by utilizing the estimates of the AR2 parameters $b_1$, $b_2$, and $\sigma_\eta^2$ in (10) and assigning $\Sigma = \tilde{\Sigma}_\varepsilon$, $a_1 = b_1$, $a_2 = b_2$, and $\sigma^2 = \sigma_\eta^2$.

IV. LOCATING THE GLACIER TERMINUS

We work under the assumption that the glacier terminus can be identified by locating the transitions in the intensity profile along the glacier path. Specifically, we assume that the terminus is represented by a change from high to low intensity, i.e., a negative inflection point. This holds in the processed band NDSI. In B62, where the ice is darker than the soil, we compute the inverse image with respect to the maximum intensity of this band.

A. Inflection Point Detection Using the Estimated Derivative of Intensity Profile

We define a negative inflection point as any point on the glacier intensity profile where the first derivative $r^{(1)}(s)$ of the underlying function $r$ has a local minimum and $r^{(1)}(s) < 0$. At such an inflection point, the second derivative $r^{(2)}(s)$ crosses zero from negative ($r^{(2)}(s) < 0$) to positive ($r^{(2)}(s) > 0$). Thus, the local minima of the estimated first derivative $\hat{r}^{(1)}$ may be found by locating the zero crossings of the second derivative. However, as discussed in Section VII, when both the first and second derivatives are estimated directly via local polynomial regression, the local minima of the former do not necessarily match the zero crossings of the latter. To ensure a match, the local minima of the first derivative are found by locating the zero crossings of the numerical second derivative, denoted by $(\hat{r}^{(1)})^{(1)}$ and obtained as the numerical derivative of $\hat{r}^{(1)}$, rather than the zero crossings of the estimated second derivative $\hat{r}^{(2)}$ given by (6) with $d = 2$. Specifically, applying (6) with $d = 1$, we have

$$\bigl(\hat{r}^{(1)}\bigr)^{(1)}(s_i) \approx \frac{\Delta \hat{r}^{(1)}(s_i)}{\Delta s_i} = \frac{\hat{r}^{(1)}(s_i) - \hat{r}^{(1)}(s_{i-1})}{s_i - s_{i-1}} = \frac{[l_2(s_i) - l_2(s_{i-1})]^T\, Y}{s_i - s_{i-1}} \tag{12}$$

which is a linear function of the data. By construction, any point $s_i$ such that $(\hat{r}^{(1)})^{(1)}(s_i) < 0$ and $(\hat{r}^{(1)})^{(1)}(s_{i+1}) > 0$ will correspond to a local minimum of the estimated first derivative $\hat{r}^{(1)}$, as desired. Since inflection points are located as zero crossings of the numerical second derivative, the standard error of any located inflection point at $s = D$ is estimated by

$$\mathrm{Se}(D) \cong \frac{\mathrm{Se}\bigl(\hat{r}^{(2)}(D)\bigr)}{\hat{r}^{(3)}(D)} \cong \frac{\mathrm{Se}\bigl((\hat{r}^{(1)})^{(1)}(D)\bigr)}{(\hat{r}^{(1)})^{(2)}(D)} \tag{13}$$

where $\mathrm{Se}\bigl((\hat{r}^{(1)})^{(1)}(D)\bigr) = \sqrt{\operatorname{var}\bigl((\hat{r}^{(1)})^{(1)}(D)\bigr)}$ is the standard error of the numerical second derivative at the inflection point, and the third derivative $(\hat{r}^{(1)})^{(2)}(D)$ is the numerical second derivative of the estimated first derivative (see Fig. 1). As discussed in Section VII, the latter serves as an estimate of the third derivative that can be directly interpreted as the slope of the second derivative at its zero crossings, as opposed to the estimated third derivative $\hat{r}^{(3)}$ from the local polynomial regression via (6) with $d = 3$. To compute the numerator of (13), the required variance of the numerical second derivative is estimated using (7) by

$$\operatorname{var}\Bigl(\bigl(\hat{r}^{(1)}\bigr)^{(1)}(s_i)\Bigr) \approx \operatorname{var}\left(\frac{\Delta \hat{r}^{(1)}(s_i)}{\Delta s_i}\right) = \frac{[l_2(s_i) - l_2(s_{i-1})]^T\, \hat{\Sigma}_\varepsilon\, [l_2(s_i) - l_2(s_{i-1})]}{\Delta s_i^2}. \tag{14}$$

Fig. 1. (Blue) Second and (red) third derivatives of the intensity profile along the glacier path, where the slope of the second derivative, i.e., the third derivative, at the estimated inflection point is approximately $\hat{r}^{(3)}(D) \cong \mathrm{Se}(\hat{r}^{(2)}(D))/\mathrm{Se}(D)$.
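The detection rule of this subsection can be sketched as follows: form the numerical second derivative as a backward difference of the estimated first derivative, as in (12), and keep grid points where it changes sign from negative to positive while the first derivative is negative. The function name `negative_inflection_points` is hypothetical, and the synthetic profile in the usage note is an assumption made for illustration only.

```python
import numpy as np

def negative_inflection_points(s, d1):
    """Indices i where the estimated first derivative d1 has a negative local minimum.

    s  : grid locations along the path
    d1 : estimated first derivative at those locations
    """
    s = np.asarray(s, dtype=float)
    d1 = np.asarray(d1, dtype=float)
    # d2[k] is the backward difference (12) evaluated at grid index k + 1.
    d2 = np.diff(d1) / np.diff(s)
    found = []
    for k in range(len(d2) - 1):
        i = k + 1
        # Numerical second derivative negative at s_i and positive at s_{i+1}.
        if d2[k] < 0 and d2[k + 1] > 0 and d1[i] < 0:
            found.append(i)
    return np.array(found, dtype=int)
```

For a smooth high-to-low transition such as $r(s) = -\tanh(s - 5)$, whose first derivative is $-\operatorname{sech}^2(s - 5)$, the rule returns the single grid point closest to $s = 5$.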

B. Tracking Terminus Position Over Time

The presence of shadows, clouds, and debris may produce many transitions in the intensity profiles in addition to the one corresponding to the true terminus. Hence, for each time frame $t_j$, we may have $K_j$ located inflection points at locations $D_j^k$, $k = 1, \ldots, K_j$. From these candidates, we must identify a single one as the estimate of the terminus location $D_j$ at each time point $t_j$. The collection of such time–location pairs $(t_j, D_j)$, $j = 1, 2, \ldots, m$, of identified terminus points represents the glacier variations over time.

Intuitively, the most likely terminus location at a typical time frame $t_j$ is the inflection point with maximum change. However, as strong inflection points may also be produced by transitory features such as shadows, this is not necessarily the case in every time frame. The effect of shadows is attenuated in the B62 and NDSI Landsat bands, which we chose to analyze for this reason. To further minimize the selection of wrong inflection points, we approach the terminus detection problem over time as a tracking problem. Temporal information helps guide terminus identification at each time point $t_j$ so that a relatively smooth track is maintained while the selection of false terminal points is avoided. A nearest neighbor tracking method with gating is proposed as follows.

Conservatively, the terminus candidate $(t_J, D_J)$ with maximum absolute first derivative among all terminus candidates over time ($j = 1, 2, \ldots, J, \ldots, m$) is identified to initiate the terminus path. Because this candidate might be located anywhere on the time axis, the track is constructed both forward for $t_j = t_J, t_{J+1}, \ldots, t_m$ and backward for $t_j = t_J, t_{J-1}, \ldots, t_1$. To identify the terminus at time point $t_j$ for $j = J+1, J+2, \ldots, m$, we compute the absolute rate of change between the identified terminus location at time point $t_{j-1}$ and each terminus candidate $D_j^k$ at time point $t_j$ as $|\Delta^k_{j-1,j}| = |D_{j-1} - D_j^k|/(t_j - t_{j-1})$. The terminus location at time point $t_j$ is identified as the terminus candidate $D_j^k$ with the smallest rate of change within the validation gate ($|\Delta^k_{j-1,j}| \le G$):

$$D_j = \arg\min_{D_j^k} \bigl\{ |\Delta^k_{j-1,j}| : |\Delta^k_{j-1,j}| \le G \bigr\}. \tag{15}$$

The gate size $G$ represents the maximum possible annual terminus retreat or advance. To prevent biasing results toward showing too little change, we use a large constant value of $G = 2000$ m/year. If there is no terminus candidate at time point $t_j$, i.e., $K_j = 0$, or none of the terminus candidates $D_j^k$ falls inside the gate, we set $D_j = D_{j-1}$. The backward tracking is applied similarly, where the rates are computed by $|\Delta^k_{j,j-1}| = |D_j - D^k_{j-1}|/|t_{j-1} - t_j|$ for $t_j = t_J, t_{J-1}, \ldots, t_1$.
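The gated nearest neighbor tracking described above can be sketched as follows. The data layout (one list of candidate locations and one list of first-derivative values per time frame) and the function name `track_terminus` are assumptions made for this illustration.

```python
import numpy as np

def track_terminus(times, candidates, strengths, gate=2000.0):
    """Gated nearest-neighbour track through per-frame terminus candidates.

    times      : sequence of m acquisition times (years)
    candidates : list of m sequences of candidate locations (meters), possibly empty
    strengths  : list of m sequences with the first derivative at each candidate
    gate       : maximum allowed rate of change G (meters per year)
    """
    m = len(times)
    # Seed: the candidate with the largest |first derivative| over all frames.
    J, kJ = max(
        ((j, k) for j in range(m) for k in range(len(candidates[j]))),
        key=lambda jk: abs(strengths[jk[0]][jk[1]]),
    )
    track = np.empty(m)
    track[J] = candidates[J][kJ]

    def step(prev_j, j):
        cand = np.asarray(candidates[j], dtype=float)
        if cand.size:
            rate = np.abs(track[prev_j] - cand) / abs(times[j] - times[prev_j])
            k = int(np.argmin(rate))
            if rate[k] <= gate:      # rule (15): smallest rate inside the gate
                track[j] = cand[k]
                return
        track[j] = track[prev_j]     # no candidate inside the gate: carry over

    for j in range(J + 1, m):        # forward pass
        step(j - 1, j)
    for j in range(J - 1, -1, -1):   # backward pass
        step(j + 1, j)
    return track
```

For example, with times `[2000, 2001, 2002, 2003]`, candidates `[[1000, 3000], [1050, 5000], [], [1100, 9000]]`, and strengths `[[0.5, 0.9], [2.0, 1.0], [], [0.3, 0.8]]`, the track is seeded at 1050 m in the second frame, the empty third frame inherits that value, and the distant candidates at 5000 m and 9000 m are never selected.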

V. TEMPORAL SMOOTHING

The glacier terminus location estimates ($D_j$) obtained via tracking are directly related to the change in glacier length over time; however, these locations are estimated with error. Furthermore, they contain short-term variability, while we are foremost interested in the long-term trends. To estimate the long-term temporal trend, we apply a nonparametric local linear regression model over time as described below. In contrast with the spatial noise model, we assume independent heteroscedastic temporal noise between time frames. As discussed further in Section VII-A, temporal variance and correlation parameters cannot be estimated simultaneously at each time frame, and in this case, the heteroscedasticity is more important to consider.

A. Temporal Estimation of the Terminus Location

Suppose that $m$ pairs of identified terminus points $(t_j, D_j)$, $j = 1, 2, \ldots, m$, are obtained from the tracking step, where $D_j$ is the glacier terminus location (meters) and $t_j$ is the time frame (years). The terminus location $D_j$ and the time frame $t_j$ are related by the model

$$D_j = \mu(t_j) + \delta_j, \qquad \delta_j \sim \left(0, \alpha\sigma^2_{t_j}\right) \tag{16}$$

where $\mu$ is an unknown regression function and the $\delta_j$'s are independent noise variables, each with mean zero and a different variance $\alpha\sigma^2_{t_j}$. In contrast with the spatial estimation step (see Section III), in which the spatial noise was assumed homoscedastic at each time frame, here, the temporal noise variance $\alpha\sigma^2_{t_j}$ is modeled as having two components: a time-varying heteroscedastic factor $\sigma^2_{t_j} = \operatorname{var}(D_j(t))$, representing the estimation accuracy of the terminus location from the spatial estimation step, itself estimated from spatial analysis, and a common homoscedastic variance factor $\alpha$, representing the unknown short-term temporal variability. In the noise variance model, $\alpha\sigma^2_{t_j}$ allows the variance at each time frame to be proportional to the estimation accuracy of the terminus location, while allowing an unknown absolute magnitude. In this way, the proposed temporal noise model propagates the spatial uncertainties associated with the estimated terminal location at each time frame through temporal estimation, while allowing for additional variability unaccounted for in the previous steps.

To solve (16), the regression function $\mu$ can be locally estimated at a point $u$ in a neighborhood of $t$ by a polynomial of order $q$

$$\mu(u) = \mu(t) + \mu'(t)(u - t) + \mu''(t)\frac{(u - t)^2}{2!} + \cdots + \mu^{(q)}(t)\frac{(u - t)^q}{q!} \tag{17}$$

where $M_t = (\mu, \mu', \mu'', \ldots, \mu^{(q)})^T$ are estimated by $\hat{M}_t(u) = (\hat\mu, \hat\mu', \hat\mu'', \ldots, \hat\mu^{(q)})^T$ to minimize

$$\sum_{j=1}^{m} w_j(t)\left(D_j - \mu(t_j)\right)^2 = (D - X_t M_t)^T W_t (D - X_t M_t) \tag{18}$$

where $D = (D_1, \ldots, D_m)^T$, $X_t$ is an $m \times (q+1)$ matrix whose $j$th row is $[1, (t_j - t), \ldots, (t_j - t)^q/q!]$, and $W_t$ is an $m \times m$ diagonal matrix whose $j$th diagonal entry is the weight $w_j(t) = K_\gamma(t_j - t)\,\omega(t_j)$, with $K_\gamma(x) = (1/\gamma)K(x/\gamma)$ a kernel with bandwidth $\gamma$. The $\omega(t_j)$'s are weights given by the inverse spatial variances, $\omega(t_j) = 1/\sigma^2_{t_j}$. Equation (18) is a least squares problem, and through minimization, we obtain $\hat{M}_t = (X_t^T W_t X_t)^{-1} X_t^T W_t D = L_t D$, where $L_t = (X_t^T W_t X_t)^{-1} X_t^T W_t$ is the smoothing matrix and $\hat\mu(t)$ is the inner product of the first row of $L_t$ with $D$

$$\hat\mu(t) = ([1\ 0\ 0\ \cdots\ 0] \times L_t) D = e_1^T L_t D = \sum_{j=1}^{m} l_{1,j}(t) D_j. \tag{19}$$

The variance of this estimator is

$$\operatorname{var}(\hat\mu(t)) = \alpha \sum_{j=1}^{m} l^2_{1,j}(t)\,\sigma^2_{t_j}. \tag{20}$$

In the same way, $\mu^{(1)}(t), \mu^{(2)}(t), \ldots, \mu^{(q)}(t)$ are estimated by the inner products of the second row, third row, ..., and $(q+1)$th row of $L_t$ with $D$, respectively. Note that the nonparametric regression estimate $\hat\mu(t)$ does not depend on the unknown parameter $\alpha$, but its variance does.

In our case, we are mainly interested in the long-term trend of glacier retreat represented by the function $\mu(t)$ and apply the aforementioned local linear regression solution with order $q = 1$. As a kernel, we use the same tricube kernel function as in the spatial smoothing step but with a different bandwidth selection, as described hereinafter.
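As an illustration of (18) and (19) with $q = 1$, the following NumPy sketch evaluates the first row of the smoothing matrix and the resulting estimate on a time grid. It is a sketch under assumptions rather than the authors' code: the function names are hypothetical, the tricube kernel is taken in its standard normalized form $K(x) = (70/81)(1 - |x|^3)^3$ on $[-1, 1]$, and the bandwidth is assumed large enough that every evaluation window contains at least two observations (otherwise the linear system is singular).

```python
import numpy as np

def tricube(x):
    """Tricube kernel K(x) = (70/81) (1 - |x|^3)^3 on [-1, 1], zero elsewhere."""
    ax = np.abs(x)
    return np.where(ax < 1.0, (70.0 / 81.0) * (1.0 - ax**3) ** 3, 0.0)

def local_linear_row(t0, t, sigma2, bandwidth):
    """First row l_1(t0) of L_t = (X^T W X)^{-1} X^T W for a local linear fit."""
    t = np.asarray(t, dtype=float)
    sigma2 = np.asarray(sigma2, dtype=float)
    X = np.column_stack([np.ones_like(t), t - t0])          # m x 2 design matrix
    w = tricube((t - t0) / bandwidth) / bandwidth / sigma2   # K_gamma(t_j - t0) * omega(t_j)
    XtW = X.T * w                                            # X^T W without forming W
    return np.linalg.solve(XtW @ X, XtW)[0]

def smooth_terminus(t_grid, t, D, sigma2, bandwidth):
    """Local linear estimate of mu on t_grid, as in (19)."""
    D = np.asarray(D, dtype=float)
    return np.array([local_linear_row(t0, t, sigma2, bandwidth) @ D for t0 in t_grid])
```

A useful sanity check of such an implementation is that a local linear smoother reproduces exactly linear data whatever the weights, and that each row $l_1(t)$ sums to one.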

B. Estimation Error

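Given a temporal smoothing bandwidth, the standard error of the estimated temporal curve $\hat\mu$ is $\operatorname{se}(\hat\mu(t)) = \sqrt{\widehat{\operatorname{var}}(\hat\mu(t))}$, where the variance in (20) is estimated by

$$\widehat{\operatorname{var}}(\hat\mu(t)) = \hat\alpha \sum_{j=1}^{m} l^2_{1,j}(t)\,\sigma^2_{t_j}. \tag{21}$$

The factors $\sigma^2_{t_j}$ are assumed to be known from the spatial smoothing step. The parameter $\alpha$ is estimated by the method of moments as

$$\hat\alpha = \frac{1}{m - 2v_t + \tilde{v}_t} \sum_{j=1}^{m} \frac{\left(D_j(t) - \hat\mu_j(t)\right)^2}{\sigma^2_{t_j}} \tag{22}$$

where $v_t = \operatorname{trace}(L)$, $\tilde{v}_t = \operatorname{trace}(L^T L) = \sum_{j=1}^{m} \|l_1(t_j)\|^2$, and $(m - 2v_t + \tilde{v}_t)$ is the number of degrees of freedom.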
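The moment estimate (22) and the standard error (21) can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the function names are hypothetical, $L$ is assumed to be the $m \times m$ matrix whose $j$th row is $l_1(t_j)$ (the smoother evaluated at the observation times), and the tricube kernel is taken in its standard normalized form.

```python
import numpy as np

def tricube(x):
    ax = np.abs(x)
    return np.where(ax < 1.0, (70.0 / 81.0) * (1.0 - ax**3) ** 3, 0.0)

def smoother_row(t0, t, sigma2, bandwidth):
    """First row of (X^T W X)^{-1} X^T W for a local linear fit at t0."""
    X = np.column_stack([np.ones_like(t), t - t0])
    w = tricube((t - t0) / bandwidth) / bandwidth / sigma2
    XtW = X.T * w
    return np.linalg.solve(XtW @ X, XtW)[0]

def alpha_and_se(t_grid, t, D, sigma2, bandwidth):
    """Return (alpha_hat, se of mu_hat on t_grid) following (21) and (22)."""
    t, D, sigma2 = (np.asarray(a, dtype=float) for a in (t, D, sigma2))
    # Smoothing matrix at the observation times: row j is l_1(t_j).
    L = np.vstack([smoother_row(tj, t, sigma2, bandwidth) for tj in t])
    resid = D - L @ D
    v = np.trace(L)
    v_tilde = np.trace(L.T @ L)
    alpha = np.sum(resid**2 / sigma2) / (len(t) - 2.0 * v + v_tilde)
    # Equation (21): var(mu_hat(t0)) = alpha * sum_j l_{1,j}(t0)^2 * sigma2_j
    var = [alpha * np.sum(smoother_row(t0, t, sigma2, bandwidth) ** 2 * sigma2)
           for t0 in t_grid]
    return alpha, np.sqrt(np.array(var))
```

Pointwise bands then follow as $\hat\mu(t) \pm z\,\operatorname{se}(\hat\mu(t))$ for a chosen normal quantile $z$; as the text notes for the spatial step, such bands reflect variability only, not bias.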

C. Temporal Bandwidth Selection—Plug-In Bandwidth

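As discussed in Section III-C, CV performs poorly and yields small bandwidths. Assigning $p = 1$ and $q = 0$ in (11) and assuming that the time points $t_j$ are randomly selected from a uniform density over a segment of length $|T|$, the plug-in bandwidth using local linear regression with a polynomial of order 1 is [27]–[29], [37]

$$\gamma_t = \left( \frac{C_T \int \sigma_t^2(t)\,dt}{k \int \left(\mu^{(2)}(t)\right)^2 dt} \right)^{1/5} \approx \left( \frac{C_T\,|T| \sum_{j=1}^{m} \Sigma_{\delta_{jj}}}{\sum_{j=1}^{m} \left(\tilde\mu^{(2)}(T_j)\right)^2} \right)^{1/5} = \left( \frac{C_T\,|T| \sum_{j=1}^{m} \hat\beta\sigma^2_{t_j}}{\sum_{j=1}^{m} \left(\tilde\mu^{(2)}(T_j)\right)^2} \right)^{1/5} \tag{23}$$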

where $C_T = \int K^2(u)\,du \big/ \left(\int u^2 K(u)\,du\right)^2$ is a constant computed for the kernel $K_\gamma$, $\tilde\mu^{(2)}$ is the second derivative of $\tilde\mu$, and $\tilde\mu$ is a pilot estimate of $\mu$ obtained by fitting a global polynomial of order $p_G = 4$ [29]. The noise autocovariance is a diagonal matrix whose diagonal entries are $\Sigma_{\delta_{jj}} = \beta\sigma^2_{t_j}$, following model (16), where $\sigma^2_{t_j}$ is the spatial noise variance at time frame $t_j$. Here, the factors $\sigma^2_{t_j}$ are known from the spatial smoothing step, while $\beta$ plays the role of $\alpha$ and is estimated in a similar way to (22) as

$$\hat\beta = \frac{1}{m - (p_G + 1)} \sum_{j=1}^{m} \frac{\left(D_j(t) - \tilde\mu_j(t)\right)^2}{\sigma^2_{t_j}} \tag{24}$$

where the residuals $D_j(t) - \tilde\mu_j(t)$ are estimated using the pilot estimate ($\tilde\mu$) of $\mu$ and $(p_G + 1)$ is the number of degrees of freedom.
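A minimal sketch of the discrete form of (23) together with (24) is given below. Several choices here are assumptions made for illustration and are not confirmed by the text: the pilot polynomial is fitted by weighted least squares with weights $1/\sigma_{t_j}$, time is centered before fitting for numerical conditioning, the pilot second derivative is evaluated at the observation times, $|T|$ is taken as the span of the observation times, and the kernel constant is obtained by a simple Riemann sum for the normalized tricube kernel. All function names are hypothetical.

```python
import numpy as np

def tricube(u):
    au = np.abs(u)
    return np.where(au < 1.0, (70.0 / 81.0) * (1.0 - au**3) ** 3, 0.0)

def kernel_constant(n=200001):
    """C_T = int K^2 du / (int u^2 K du)^2, by a Riemann sum on [-1, 1]."""
    u = np.linspace(-1.0, 1.0, n)
    du = u[1] - u[0]
    K = tricube(u)
    return (np.sum(K**2) * du) / (np.sum(u**2 * K) * du) ** 2

def plug_in_bandwidth(t, D, sigma2, pilot_order=4):
    """Plug-in bandwidth following the last expression in (23)."""
    t, D, sigma2 = (np.asarray(a, dtype=float) for a in (t, D, sigma2))
    m = len(t)
    tc = t - t.mean()                       # centering does not change derivatives
    # Pilot fit: global polynomial of order p_G (np.polyfit expects w = 1/sigma).
    coef = np.polyfit(tc, D, pilot_order, w=1.0 / np.sqrt(sigma2))
    resid = D - np.polyval(coef, tc)
    beta = np.sum(resid**2 / sigma2) / (m - (pilot_order + 1))   # equation (24)
    mu2 = np.polyval(np.polyder(coef, 2), tc)                    # pilot second derivative
    span = t.max() - t.min()                                     # |T|
    return (kernel_constant() * span * np.sum(beta * sigma2) / np.sum(mu2**2)) ** 0.2
```

Note that multiplying all $\sigma^2_{t_j}$ by a common constant leaves the result unchanged, because $\hat\beta$ rescales inversely; only the relative precisions of the time frames matter for the bandwidth.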

VI. RESULTS

In this section, the proposed method is used to estimate the terminus location over a few decades using Landsat 5 and Landsat 7 multispectral images for three glaciers: Franz Josef (west coast of New Zealand’s South Island), Gorner (canton of Valais, Switzerland), and Viedma (Patagonia, Argentina). The performance of the proposed method is evaluated using the available ground measurements for the terminus location of the Franz Josef and Gorner glaciers. The same method is then applied to Viedma, for which ground measurements are not available.

Fig. 2. Application of the proposed methods to a Landsat 7 multispectral image of the Franz Josef glacier where the glacier path is partially occluded with shadows. (Left column) Visualization of the glacier using false color (RGB = B50, B40, B30): Manually drawn glacier path is shown in yellow, located inflection points are marked by purple disks, the estimated terminus location is marked by a yellow disk, whereas the estimated terminus location from the previous time frame is superimposed and marked by a blue disk [(bottom) is the zoom of (top)]. (Right column) Smoothed intensity profile applying the local polynomial regression and its associated derivatives. (Top) (Purple) Extracted noisy intensity profile along the glacier path, (blue) estimated intensity profile by (4) using the plug-in bandwidth estimated by (11), (black) pointwise confidence bands computed using (5), and (yellow disks) estimated terminus location candidates obtained using (12). (Bottom) (Blue) Estimated first derivative and (red) numerically computed second derivative of the intensity profile along the glacier path that are used to locate the glacier terminus candidates.

A. Spatial Analysis for Specific Time Frames

Fig. 2 shows a typical Landsat 7 multispectral scene of Franz Josef for a single time frame. Images are shown in false color with the Red, Green, Blue (RGB) colors assigned to three specific Landsat frequency bands: R = B50, G = B40, and B = B30. This image has high contrast, but the glacier path is partially occluded by shadows from the adjacent mountains, so the terminus is not clearly visible. The noisy observations [$Y_i$'s in (1)] are extracted from the intensity profile along the manually sketched glacier path (see Fig. 2). The smoothed intensity profile is estimated by applying (4), for which the plug-in bandwidth is obtained by (11) for $p = 3$ and $q = 1$. The pointwise confidence bands are estimated using (5), where $\Sigma_\varepsilon$ is estimated using (10). The confidence bands reflect the variability of the estimates but not their bias.

Candidates for the terminus position along the glacier intensity profile are marked in Fig. 2. The first derivative $r^{(1)}(s)$ of the true intensity profile $r(s)$ along the glacier path is estimated using (6) for $d = 1$. To find the local minima of $\widehat{r^{(1)}}(s)$, we locate the zero crossings of the numerical second derivative $(\widehat{r^{(1)}}(s))^{(1)}$ (12). Note that, as a result of discrete sampling, the numerical second derivative is shifted to the right by half a pixel, and therefore, the local minima of the first derivative always correspond to the sample point on the numerical second derivative that is immediately before the zero crossing.
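The half-pixel remark can be made concrete with a short sketch. Equation (12) is not reproduced in this section, so the code below assumes a backward difference stored at the right-hand sample, which is one discretization consistent with the stated rightward shift; the function name is hypothetical, and exactly flat stretches of the derivative (zero differences) are not treated.

```python
import numpy as np

def first_derivative_minima(r1):
    """Indices of local minima of a sampled first derivative r1.

    The numerical second derivative is the backward difference
    d2[i] = r1[i] - r1[i-1], stored at sample i. A local minimum of r1 at
    index i then shows up as d2[i] < 0 followed by d2[i+1] > 0, i.e., the
    sample immediately before the negative-to-positive zero crossing of d2.
    """
    r1 = np.asarray(r1, dtype=float)
    d2 = np.full(r1.shape, np.nan)
    d2[1:] = np.diff(r1)
    idx = np.arange(1, len(r1) - 1)
    is_min = (d2[idx] < 0) & (d2[idx + 1] > 0)
    return idx[is_min]

# Hypothetical sampled first derivative with minima at samples 2 and 6.
print(first_derivative_minima([0, -1, -3, -2, 0, -1, -4, -1]))
```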

In this scene, it is visually apparent that the glacier terminus corresponds to the inflection point with the largest absolute first derivative. There are, however, other inflection points with comparable magnitudes caused by shadows. The automatic identification of the terminus point is achieved by applying the proposed tracking approach using (15). Fig. 2 visually indicates changes in the location of the terminus by superimposing a blue disk indicating the estimated location of the terminus from the previous frame.

Fig. 3. Application of the proposed methods to a Landsat 5 multispectral image of the Gorner glacier where the multispectral scene has a low resolution and poor contrast. (Left column) Visualization of the glacier using false color (RGB = B50, B40, B30): Manually drawn glacier path is shown in yellow, located inflection points are marked by purple disks, the estimated terminus location is marked by a yellow disk, whereas the estimated terminus location from the previous time frame is superimposed and marked by a blue disk [(bottom) is the zoom of (top)]. (Right column) Smoothed intensity profile applying the local polynomial regression and its associated derivatives. (Top) (Purple) Extracted noisy intensity profile along the glacier path, (blue) estimated intensity profile by (4) using the plug-in bandwidth estimated by (11), (black) pointwise confidence bands computed using (5), and (yellow disks) estimated terminus location candidates obtained using (12). (Bottom) (Blue) Estimated first derivative and (red) numerically computed second derivative of the intensity profile along the glacier path that are used to locate the glacier terminus candidates.

The same panels and color scheme as those in Fig. 2 are used for Figs. 3 and 4. A typical multispectral Landsat 5 image of the Gorner glacier is shown in Fig. 3. This image has poor contrast relative to the Landsat 7 image shown in Fig. 2 for the Franz Josef glacier, but the terminus appears to be clearly identifiable. Applying the same analysis procedure as discussed for Franz Josef results in the identification of a clear inflection point whose absolute first derivative is significantly greater than those of the other terminus candidates in this time frame.

A shadow-free multispectral Landsat 7 image of the Viedma glacier is shown in Fig. 4. In comparison with Figs. 2 and 3, this image has high contrast, and the glacier terminus is clearly visible in blue (see Fig. 4, left). As in Figs. 2 and 3, the extracted noisy glacier intensity profile, smoothed profile, estimated first derivative, numerically computed second derivative, and inflection points along the glacier path are obtained. A single sharp transition (see Fig. 4, top right) with a strong inflection point can be observed at this time frame, for which the associated local minimum of the first derivative (blue in Fig. 4, bottom right) is significantly larger in magnitude than those of the other terminus candidates (yellow disks).

B. Temporal Analysis

The estimated terminus locations over time are presented in Fig. 5 for Franz Josef, Gorner, and Viedma. Scene identifications (IDs) and dates for the Landsat images used for these glaciers are presented in Table II. The terminus location is selected from among all candidate positions using the tracking approach described in Section IV-B. To study the long-term trend, we suppress the short-term terminus changes using local linear regression, as described in Section V-A, where the plug-in bandwidth is obtained by (23). Notice that the smoothed terminus curve (blue) is estimated over an equally spaced time grid with 0.2-year intervals. The pointwise confidence bands (red) of the smoothed terminus change curve are computed by (20), where $\sigma^2_{t_j}$ for each time frame $t_j$ is taken to be known from the spatial smoothing step.

Fig. 4. Application of the proposed methods to a shadow-free Landsat 7 multispectral image of the Viedma glacier. (Left column) Visualization of the glacier using false color (RGB = B50, B40, B30): Manually drawn glacier path is shown in yellow, located inflection points are marked by purple disks, the estimated terminus location is marked by a yellow disk, whereas the estimated terminus location from the previous time frame is superimposed and marked by a blue disk [(bottom) is the zoom of (top)]. (Right column) Smoothed intensity profile applying the local polynomial regression and its associated derivatives. (Top) (Purple) Extracted noisy intensity profile along the glacier path, (blue) estimated intensity profile by (4) using the plug-in bandwidth estimated by (11), (black) pointwise confidence bands computed using (5), and (yellow disks) estimated terminus location candidates obtained using (12). (Bottom) (Blue) Estimated first derivative and (red) numerically computed second derivative of the intensity profile along the glacier path that are used to locate the glacier terminus candidates.

The performance of the proposed method is evaluated by comparing the estimated terminus change over time with ground measurements (see Fig. 5). Our primary interest is in determining relative changes, and the estimated terminus change and the ground measurements are compared after shifting the ground measurements vertically to minimize the mean square difference between the two curves, where both curves are sampled on the same equally spaced temporal grid mentioned earlier. This results in the superimposed curves having the same temporal average.
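Minimizing the mean square difference over a vertical shift $c$ has a closed-form solution: $c$ is the mean difference between the two curves on the common grid, which is why the shifted curves share the same temporal average. The sketch below illustrates this. It rests on assumptions that the text does not specify: both curves are resampled by linear interpolation, and the grid is restricted to the overlap of the two records; the function name is hypothetical.

```python
import numpy as np

def align_ground_measurements(t_est, est, t_ground, ground, step=0.2):
    """Shift ground measurements vertically to best match the estimated curve.

    Returns the shift c minimizing mean((est - (ground + c))^2) on a common
    equally spaced grid, the grid itself, and the shifted ground curve.
    """
    lo = max(t_est[0], t_ground[0])
    hi = min(t_est[-1], t_ground[-1])
    grid = np.arange(lo, hi + 1e-9, step)
    e = np.interp(grid, t_est, est)         # estimated curve on the grid
    g = np.interp(grid, t_ground, ground)   # ground curve on the grid
    shift = np.mean(e - g)                  # least squares solution for c
    return shift, grid, g + shift
```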

The estimated terminus change over time from year 2000 to year 2010 for the Franz Josef glacier is compared with available terminus measurements from 1999 to 2010. Based on the ground measurements, Franz Josef’s terminus retreated from 1999 to 2005, losing about 500 m of its length. Due to high snowfall accumulation, however, Franz Josef advanced from 2005 to 2010. The terminus changes estimated by our method closely follow the ground-based estimates, showing a retreat in terminus position from 2000 to 2004 and an advance from 2004 to 2010. The ground measurement curve is completely contained within the estimated pointwise confidence bands, despite the fact that the bands reflect only the variability in the estimates and not their bias. In this case, there are 31 Landsat images available but only 8 ground measurements, permitting our method to discern shorter term details like the sharp increase between 2003 and 2006.

The estimated terminus locations for the Gorner glacier, obtained from 10 Landsat images from 1985 to 2005, are compared with 21 ground measurements from 1984 to 2005. The ground measurements indicate that Gorner’s terminus smoothly retreated over this time period, losing almost 350 m in length. Terminus trends estimated by our method show a smooth decline comparable to the ground measurements, where the latter are again completely contained within the estimated pointwise confidence bands. Note that there is a 10-year gap from 1989 to 1999 in imagery and that this leads to a larger temporal smoothing bandwidth of 6.25 years. Also, the terminus location cannot be estimated between 1992 and 1997, where the smoothed terminus change curve is discontinuous and the estimation errors are larger.


Fig. 5. Estimated terminus locations over time (years) for the three glaciers: (First row) Franz Josef, (second row) Gorner, and (third row) Viedma. Shown are (purple dots) the located terminus candidates, (purple line) the terminus change curve obtained by tracking, (blue) the smoothed terminus curve obtained by (19), (red) the pointwise confidence bands of the smoothed terminus change curve computed by (20), and (black) ground measurements. Vertical axes are relative change in meters. The right column is the zoom of the left one.

Finally, we apply our method to estimate terminus changes from 44 Landsat images between 1984 and 2010 for the Viedma glacier. We are not aware of ground-based observations for the terminus position of this glacier, highlighting the value of the proposed method for studying otherwise unknown glacier change over time. We observe that Viedma has lost 1800 m of its length but not at a constant rate. As with Gorner, there is an 11-year gap from 1988 to 1999 in our data set after removing corrupted Landsat images, which results in a relatively large temporal smoothing bandwidth of 4.6 years. The terminus location cannot be estimated between 1989 and 1997, where the smoothed terminus change curve is broken and the estimation error is large, but we can estimate a total decline in this period of about 700 m.

VII. DISCUSSION

We have proposed a statistical method for estimating the location of a glacier terminus over time from a sequence of satellite images. The proposed method is robust in both a spatial and a temporal sense. Spatially, the method is able to tolerate image effects such as partial occlusion by mountain shadows. Temporally, it is able to tolerate gaps in the data due to missing images or images corrupted by clouds. The results obtained by the proposed method are promising, as the estimated terminus changes essentially follow the same trend as the ground measurements available in the case of two glaciers. There remain, however, several methodological issues worthy of more discussion.

A. Modeling Considerations

Spatial Modeling: Partial contrast loss and variability in the extracted intensity profile are caused by different sources of uncertainty such as mountain shadows, clouds, snow, temperature variations, and image acquisition noise. These tend to produce several inflection points along the glacier path in addition to the true terminus.


TABLE II
SCENE IDS FOR THE PROCESSED GLACIERS

We model the extracted intensity profile along the glacier path in (1) as a nonparametric smooth curve in order to permit flexibility in handling the most important sources of variability. Because most of the modeling flexibility is given to the signal, model identifiability requires the noise term in (1) to be less flexible. Here, we model this residual noise as being homoscedastic, spatially correlated, and stationary, which accounts for some of the more subtle sources of variability mentioned earlier. Assuming spatially correlated noise, the plug-in bandwidth is estimated using (11) rather than using the CV bandwidth, as discussed in Section VII-C.

Temporal Modeling: The sources of variability noted above degrade the accuracy of the terminus location estimates over time. In addition, the terminus location is affected by short-term climatological effects such as seasonal variations. In order to capture the long-term changes in the terminus location, we model the terminus location over time in (16) as a nonparametric smooth curve. However, in contrast with the homoscedastic spatial noise model, a heteroscedastic noise model is assumed for the temporal noise. The heteroscedasticity is justified given that the terminus location estimates have different precisions for different time frames. In addition, it is safe to assume that many of the subtle sources of variability mentioned earlier and contributing to the temporal noise are essentially independent between images taken several weeks apart.

Fig. 6. Typical intensity profile along the glacier path extracted from a Landsat image of Viedma, showing the mismatch between the directly estimated and numerically computed second derivatives. (Left) (Purple) Extracted noisy intensity profile along the glacier path, (blue) estimated intensity profile by (4), (black) pointwise confidence bands computed using (5), and (yellow disks) estimated terminus location candidates obtained using (12). (Right) (Dotted blue) Estimated first derivative by (6) for d = 1 using the plug-in bandwidth estimated by (11), (dotted red) estimated second derivative by (6) for d = 2, and (solid red) numerically computed second derivative using (12).

Fig. 7. CV versus plug-in bandwidth. (Top) CV bandwidth. (Bottom) Plug-in bandwidth. (Left) (Purple) Typical extracted noisy intensity profile along the glacier path for the Gorner glacier at a specific time frame, (blue) the estimated intensity profile by (4), (black) pointwise confidence bands computed using (5), and (yellow disks) estimated terminus location candidates obtained using (12). (Right) Estimated terminus locations over time (years) for Gorner: Shown are (purple dots) the located terminus candidates, (purple line) the terminus change curve, (blue) the smoothed terminus curve obtained by (19), (red) the pointwise confidence bands of the smoothed terminus change curve computed by (20), and (black) ground measurements. Vertical axes are relative change in meters. The right column is the zoom of the left one.

The residual noise term reflects both the precision of the terminus location estimates and the short-term variability. Because the terminus location estimates at each time frame have different precisions, it is reasonable to assume that the noise variance at each time frame should be proportional to the corresponding estimation precision from the spatial smoothing step, which we assume as prior information at this step. The additional variance due to the short-term variability is captured in the model by an unknown common scaling factor $\alpha$. This factor has to be constant for identifiability of the model; it cannot be flexible because the flexibility has already been given to the temporal signal.

Fig. 8. Autocorrelation function and partial autocorrelation function plots for Franz Josef glacier for eight scenes between years 2000 and 2003.

B. Estimating Derivatives and Zero Crossings

Numerical Derivatives and Zero Crossings: As noted in Section IV-A and [29], the first derivative $\widehat{r^{(1)}}$ estimated directly via local polynomial regression is not the same as the derivative $(\hat{r})^{(1)}$ of the estimated function. The differencing operation that is applied to compute $(\hat{r})^{(1)}$ amplifies the noise, whereas the summation operator that is applied to estimate $\widehat{r^{(1)}}$ averages the spatial noise and alleviates spurious transitions that may introduce false inflection points. Therefore, to estimate the inflection point locations, for which we locate the local minima of the first derivative of the underlying function $r$, we prefer to directly estimate the first derivative ($\widehat{r^{(1)}}$). For this reason, we estimate the plug-in bandwidth for directly estimating the first derivative ($q = 1$) in (11).

To identify the local minima of $\widehat{r^{(1)}}$, we need to locate the zero crossings of its derivative $(\widehat{r^{(1)}})^{(1)}$. As before, $(\widehat{r^{(1)}})^{(1)}$ and the estimated second derivative $\widehat{r^{(2)}}$ are not the same. As an example, Fig. 6 shows that the zero crossings of $\widehat{r^{(2)}}$ do not exactly match the local minima of $\widehat{r^{(1)}}$, while the zero crossings of $(\widehat{r^{(1)}})^{(1)}$ do. The reason for the mismatch is that the plug-in bandwidth for directly estimating the first derivative $\widehat{r^{(1)}}$, which is the one that we use, is not the same as the plug-in bandwidth for directly estimating the second derivative $\widehat{r^{(2)}}$. Thus, to locate the inflection points, we prefer to compute $(\widehat{r^{(1)}})^{(1)}$ numerically by (12) rather than use the estimated second derivative $\widehat{r^{(2)}}$ in (6) for $d = 2$. As a consequence, the variance associated with $(\widehat{r^{(1)}})^{(1)}$ at the inflection point is estimated by (14) rather than by (7).

To estimate the standard error of the located inflection points using (13), we need the third derivative. Following the aforementioned reasoning and as depicted in Fig. 1, we prefer to use an estimate of the third derivative that can be interpreted as the slope of the second derivative at its zero crossings, rather than the estimated third derivative $\widehat{r^{(3)}}$ from the local polynomial regression via (6) with $d = 3$. Hence, we numerically compute the second derivative of the estimated first derivative $(\widehat{r^{(1)}})^{(2)}$ (D) instead, relying on the fact that we have set the plug-in bandwidth to directly estimate the first derivative ($q = 1$) in (11). This bandwidth is large enough that the numerical third derivative is smooth enough for our purposes.

C. Plug-In Bandwidth versus CV

Since we need the first, second, and third derivatives of the underlying function, fitting a smooth curve to the noisy observations is crucial. CV produces small bandwidths for correlated errors, which results in overfitting to the noisy observations. As an example, Fig. 7 (top left) shows that the estimated bandwidth using CV is very small (120 m), producing a wiggly estimated curve with many closely located inflection points (total of 15). It is apparent that only the inflection point corresponding to the sharpest transition should correspond to the terminus location. The presence of many false inflection points at each time frame can lead the tracking algorithm astray [see Fig. 7 (top right)].

In contrast with CV, the estimated plug-in bandwidth is large enough (727 m) to smooth out the noise, providing a few inflection points (total of 4) that are distantly located along the intensity profile [see Fig. 7 (bottom left)]. Decreasing the number of candidates per time frame using the plug-in bandwidth permits the correct tracking of the glacier terminus [see Fig. 7 (bottom right)]. This is particularly critical for images corrupted with the mask effects of shadows (e.g., Fig. 2), where the terminus location cannot be clearly associated with an inflection point and can only be resolved via tracking.

D. AR Modeling of the Residuals

As noted in Section III-B, the time series analysis of the residuals $\varepsilon_i = Y_i(s) - \hat{r}_i(s)$ presented in Fig. 8 indicates that an AR(2) process $\varepsilon_i = a_1\varepsilon_{i-1} + a_2\varepsilon_{i-2} + \theta_i$ is adequate to model the residuals. As shown in Fig. 8 for Franz Josef, partial autocorrelation coefficients are significant up to order 2 in all cases, justifying the AR(2) model.
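For readers who wish to reproduce this kind of check, the AR(2) coefficients of a residual series can be estimated with an ordinary least squares regression of $\varepsilon_i$ on its two lags. This is a simple stand-in chosen for illustration; the text does not state which estimator was used, and the function name is hypothetical.

```python
import numpy as np

def fit_ar2(eps):
    """Least squares estimates (a1, a2) for eps_i = a1 eps_{i-1} + a2 eps_{i-2} + theta_i."""
    eps = np.asarray(eps, dtype=float)
    eps = eps - eps.mean()                       # work with centered residuals
    X = np.column_stack([eps[1:-1], eps[:-2]])   # lag-1 and lag-2 regressors
    y = eps[2:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef
```

Applied to a simulated stationary AR(2) series of a few thousand samples, the estimates should land close to the generating coefficients, which gives a quick sanity check before the function is applied to real residuals.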

E. Tracking

In Section IV-B, we proposed to address the tracking problem by a gated nearest neighbor method. A more sophisticated tracking approach such as probabilistic data association and Kalman filtering would require us to model the velocity of the terminus location for purposes of prediction at the next time step. This is difficult because the short-term glacier retreat is influenced by complex seasonal, climatological, and environmental factors, and the influence of these factors on the short-term glacier retreat is either unknown or not confidently determined. Our proposed approach is a conservative, simpler method that does not require directly modeling the time tendency of the terminus position.

F. Future Work

Future work will involve the extension of the method, more thorough testing against observations, and wider application in determining terminus variability where ground-based observations are lacking. In addition to the considerations mentioned in the foregoing sections, a high priority for methodological improvement is the development of an automatic selection algorithm for the glacier path, which is done manually in the present analysis. The method should also be tested upon a fuller suite of glacier observations having ground-based measurements. If those trials appear successful, it should be possible to track terminus locations on a large number of otherwise unmonitored glaciers so as to more fully describe historical changes in glaciation. A well-resolved estimate of changes in glacier length in both space and time should provide further insight into how and why glacier systems are changing.

ACKNOWLEDGMENT

Credit for all Landsat images: U.S. Geological Survey, Department of the Interior/United States Geological Survey (USGS). Credit for glacier ground measurements: for Franz Josef, to T. Chinn, Dr. I. Owens (Department of Geography, University of Canterbury), and B. Anderson (Victoria University of Wellington); for Gorner, to The Laboratory of Hydraulics, Hydrology and Glaciology (VAW)/scnat agency from 2001 to 2005.

REFERENCES

[1] J. Oerlemans, “Extracting a climate signal from 169 glacier records,” Science, vol. 308, no. 5722, pp. 675–677, Apr. 2005.

[2] K. M. Huybers and G. H. Roe, “Glacier response to regional patterns of climate variability,” J. Climate, vol. 22, pp. 4606–4620, 2009.

[3] G. H. Roe, “What do glaciers tell us about climate variability and climate change?” J. Glaciol., vol. 57, no. 203, pp. 567–578, 2011.

[4] R. Bintanja, R. S. Van de Wal, and J. Oerlemans, “Modelled atmospheric temperatures and global sea levels over the past million years,” Nature, vol. 437, no. 7055, pp. 125–128, Sep. 2005.

[5] M. E. Mann, R. S. Bradley, and M. K. Hughes, “Northern hemisphere temperatures during the past millennium: Inferences, uncertainties, and limitations,” Geophys. Res. Lett., vol. 26, no. 6, pp. 759–762, Mar. 1999.

[6] The Swiss Glacier Inventory 2000 (SGI 2000). [Online]. Available: http://www.geo.unizh.ch/~fpaul/SGI2000/

[7] World Glacier Monitoring Service. [Online]. Available: http://www.geo.uzh.ch/microsite/wgms/index.html

[8] N. J. Cullen et al., “Kilimanjaro glaciers: Recent areal extent from satellite data and new interpretation of observed 20th century retreat rates,” Geophys. Res. Lett., vol. 33, no. 16, pp. 1–6, 2006.

[9] M. Erdenetuya, P. Khishigsuren, G. Davaa, and M. Otgontogs, “Glacier change estimation using Landsat TM data,” in Int. Archives Photogramm., Remote Sens. Spatial Inf. Sci., 2006, vol. XXXVI (Part 6), pp. 240–243.


[10] S. Hastenrath and L. Greischar, “Glacier recession Kilimanjaro, East Africa, 1912–89,” J. Glaciol., vol. 43, no. 145, pp. 455–459, 1997.

[11] R. Irish, “Landsat 7 automatic cloud cover assessment,” in Proc. SPIE Algorithms Multispectr., Hyperspectr., Ultraspectr. Imagery IV, S. S. Shen and M. R. Descour, Eds., 2000, vol. 4049, pp. 1–8.

[12] A. Kääb et al., “The new remote sensing derived Swiss glacier inventory: II. First results,” Ann. Glaciology, vol. 34, pp. 362–366, 2002.

[13] M. S. Moussavi et al., “Change detection of mountain glacier surface using aerial and satellite imagery: A case study in Iran, Alamchal Glacier,” Int. Archives Photogramm., Remote Sens. Spatial Inf. Sci., vol. 37, pp. 1013–1016, 2008.

[14] R. W. Sidjak and R. D. Wheate, “Glacier mapping of the Illecillewaet icefield, British Columbia, Canada, using Landsat TM and digital elevation data,” Int. J. Remote Sens., vol. 20, no. 2, pp. 273–284, 1999.

[15] B. Höfle et al., “Glacier surface segmentation using airborne laser scanning point cloud and intensity data,” in Proc. IAPRS, Sep. 2007, vol. XXXVI (3/W52), pp. 195–200.

[16] J. Gao and Y. Liu, “Applications of remote sensing, GIS and GPS in glaciology: A review,” Progress Phys. Geography, vol. 25, no. 4, pp. 520–540, Dec. 2001.

[17] N. S. Arnold et al., “Evaluating the potential of high-resolution airborne LiDAR in glaciology,” Int. J. Remote Sens., vol. 27, no. 5–6, pp. 1233–1251, Mar. 2006.

[18] U.S. Geological Survey (USGS). [Online]. Available: http://landsat.usgs.gov/

[19] Landsat, U.S. National Aeronautics and Space Administration (NASA). [Online]. Available: http://landsat.gsfc.nasa.gov/

[20] Landsat.org. [Online]. Available: http://landsat.org/

[21] N. N. Kachouie, P. Huybers, and A. Schwartzman, “Localization of mountain glacier termini in Landsat multi-spectral images,” Pattern Recog. Lett., vol. 34, no. 1, pp. 94–106, Jan. 2013.

[22] R. Irish, “Landsat 7 automatic cloud cover assessment,” Remote Sens. Environ., vol. 112, no. 3, pp. 955–969, 2008.

[23] V. V. Salomonson and I. Appel, “Estimating fractional snow cover from MODIS using the normalized difference snow index,” Remote Sens. Environ., vol. 89, pp. 351–360, 2004.

[24] E. A. Nadaraya, “On estimating regression,” Theory Probab. Appl., vol. 9, no. 1, pp. 141–142, 1964.

[25] G. S. Watson, “Smooth regression analysis,” Sankhya Series A, vol. 26, no. 4, pp. 359–372, Dec. 1964.

[26] W. Härdle, Applied Nonparametric Regression. Cambridge, U.K.: Cambridge Univ. Press, 1990.

[27] N. S. Altman, “Kernel smoothing of data with correlated errors,” J. Amer. Statist. Assoc., vol. 85, no. 411, pp. 749–759, Sep. 1990.

[28] J. Fan and I. Gijbels, “Local polynomial modelling and its applications,” in Monographs on Statistics and Applied Probability, vol. xi. London, U.K.: Chapman & Hall, 1996, 341 p.

[29] L. Wasserman, All of Nonparametric Statistics, vol. xii. New York, NY, USA: Springer, 2006.

[30] D. Manocha and J. F. Canny, “Detecting cusps and inflection points in curves,” Comput. Aided Geom. Des., vol. 9, no. 1, pp. 1–24, 1992.

[31] I. Gijbels and A. C. Goderniaux, “Bandwidth selection for change point estimation in nonparametric regression,” Technometrics, vol. 46, no. 1, pp. 76–86, Feb. 2004.

[32] H. G. Müller, “Change-points in nonparametric regression analysis,” Ann. Statist., vol. 20, no. 2, pp. 737–761, Jun. 1992.

[33] C. R. Loader, “Change point estimation using nonparametric regression,” Ann. Statist., vol. 24, no. 4, pp. 1667–1678, Aug. 1996.

[34] I. Gijbels, A. Lambert, and P. Qiu, “Jump-preserving regression and smoothing using local linear fitting: A compromise,” Ann. Inst. Statist. Math., vol. 59, no. 2, pp. 235–272, Jun. 2007.

[35] J. Joo and P. Qiu, “Jump detection in a regression curve and its derivative,” Technometrics, vol. 51, no. 3, pp. 289–305, 2009.

[36] J. S. Wu and C. K. Chu, “Nonparametric function estimation and bandwidth selection for discontinuous regression functions,” Statist. Sinica, vol. 3, pp. 557–576, 1993.

[37] J. Fan, I. Gijbels, T. C. Hu, and L. S. Huang, “A study of variable bandwidth selection for local polynomial regression,” Statist. Sinica, vol. 6, pp. 113–127, 1996.

[38] B. Friedlander and B. Porat, “The modified Yule-Walker method of ARMA spectral estimation,” IEEE Trans. Aerosp. Electron. Syst., vol. TAES-20, no. 2, pp. 158–173, Mar. 1984.

Nezamoddin N. Kachouie received the B.S. and M.S. degrees in electrical and computer engineering and the Ph.D. degree in systems design engineering from the University of Waterloo, Waterloo, ON, Canada.

He is currently an Assistant Professor with the Department of Mathematical Sciences at Florida Institute of Technology. He is an imaging statistician with a background in electrical, systems design, and biomedical engineering. His research interest lies at the interface of statistical analysis, pattern recognition, and digital image processing with applications to environmental studies, cancer research, and public health.

Travis Gerke is currently working toward the Sc.D. degree in the Department of Epidemiology at Harvard School of Public Health.

His methodological interests concern classification algorithms for high-dimensional data, and his substantive work is focused on the discovery and evaluation of cancer biomarkers.

Peter Huybers received the B.S. degree in physics and the Ph.D. degree in climate chemistry and physics from the Massachusetts Institute of Technology, Cambridge, MA, USA.

He is currently a Professor of Earth and Planetary Sciences and Environmental Science and Engineering with Harvard University, Cambridge, MA, USA. His work, published in Science and Nature, centers on the analysis of modern and ancient climate variability.

Armin Schwartzman received the B.S. and M.S. degrees in electrical engineering and the Ph.D. degree in statistics from Stanford University, Stanford, CA, USA.

He was an Assistant Professor of Biostatistics at Harvard and is currently an Associate Professor of Statistics at North Carolina State University, Raleigh, NC, USA. His work involves the development of statistical methods for signal and image analysis with biomedical and environmental applications.