Applications of Spatial Data Analysis

Earvin Balderama

Department of Mathematics and Statistics
Loyola University Chicago

April 23, 2015

© 2015 by Earvin Balderama <[email protected]>

A little philosophy...

“Nothing puzzles me more than time and space; and yet nothing troubles me less... as I never think of them.”

Charles Lamb (1810)

Spatial Statistics

1. Studies dependencies in data due to their proximity.
2. Answers the questions of “When?” and “Where?”

Fact: “There would be no History without Geography.”

What’s so spatial about spatial data?

We usually think of $Y_1, Y_2, \ldots, Y_n$ as independent observations. But if the $Y_i$'s are from locations in $\mathbb{R}^d$, they may be spatially correlated.

Three main types of spatial data

1. Point-referenced data (geostatistics)
2. Point pattern data
3. Areal data

1. Point-referenced data (geostatistics)
   Data: The observed values from fixed locations.
   Goal: Interpolation over continuous space,
   e.g., predicting the values between locations.

2. Point pattern data
   Data: The random locations of an observed point process.
   Goal: Model the underlying data-generating process,
   e.g., predicting the location of certain values.

3. Areal data
   Data: Observations on a (regular or irregular) lattice,
   e.g., equal-sized square grid cells, or regions bounded by county or state lines.
   Applications: disease mapping, etc.

Where are all the old people? (Areal data example)

Spatial data not restricted to locations on Earth

Red Banana

Red Bananas

Red Banana Data

Red bananas found in Costa Rica

1. Species Musa velutina
   1. Native to India and parts of southeast Asia.
   2. Usually grown as an ornamental plant.
2. 788 observed plants
   1. Dec 2006 – Jan 2008
   2. La Selva Biological Station
   3. Measured height, GPS location
3. 318 selected for weekly height measurements; the rest were dug out.
   1. Empirical growth rate
   2. Estimate birth times

[Figure: map of observed plant locations, longitude (−84.02 to −84.00) vs. latitude (10.43 to 10.44), colored by height (m) on a scale of 1 to 6.]

Spread of red banana beginning in 2002

[Figure sequence: maps of plant locations, longitude (−84.02 to −84.00) vs. latitude (10.43 to 10.44), showing the spread after 6 months, 1 year, 2 years, 3 years, and 4 years.]

Finding models for spread

Goal: Find a model to characterize the spread of an invasive species in space-time.

Epidemic-type aftershock sequence (ETAS) [1]

Models a sequence of earthquakes via a conditional intensity:

$$\lambda(t, x, y \mid H_t) = \mu(x, y) + \sum_{\{i:\, t_i < t\}} g(t - t_i,\, x - x_i,\, y - y_i;\, M_i)$$

1. $\lambda$ is the infinitesimal rate at which events are expected to occur, conditioned on the prior history of the process,
2. $\mu$ is a non-homogeneous background rate,
3. $g$ explains how aftershocks are “triggered.”

[1] Ogata, Y., 1998. Space-Time Point-Process Models for Earthquake Occurrences. Annals of the Institute of Statistical Mathematics, 50, 379-402.

Estimate background rate using a kernel smooth

[Figure: plant locations plotted by longitude (−84.02 to −84.00) and latitude (10.43 to 10.44), used for the kernel-smoothed background rate.]
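A kernel smooth of the background rate can be sketched in a few lines. The slides do not say which kernel or bandwidth was used, so the isotropic Gaussian kernel, the `bandwidth` argument, and the toy coordinates below are all assumptions for illustration only.

```python
import math

def gaussian_kernel_intensity(x, y, points, bandwidth):
    """Kernel-smoothed intensity estimate at (x, y).

    Each observed point contributes an isotropic bivariate Gaussian bump
    with standard deviation `bandwidth`. The bumps are summed, so over the
    whole plane the surface integrates to the number of points.
    """
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2)
    total = 0.0
    for px, py in points:
        sq_dist = (x - px) ** 2 + (y - py) ** 2
        total += norm * math.exp(-sq_dist / (2.0 * bandwidth ** 2))
    return total

# Hypothetical planar coordinates (not the La Selva data).
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
near_cluster = gaussian_kernel_intensity(0.3, 0.3, points, bandwidth=1.0)
far_away = gaussian_kernel_intensity(10.0, 10.0, points, bandwidth=1.0)
```

Here `near_cluster` is larger than `far_away`, which is the behavior one would want from an estimate of $\mu(x, y)$: a high background rate where plants are dense and a low one elsewhere.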

Finding models for spread

Space-time ETAS triggering function

$$g(t, x, y; M) = \frac{K_0}{(t + c)^p} \cdot \frac{e^{-\alpha(M - M_0)}}{(x^2 + y^2 + d)^q}$$

Omori’s law [2]

The frequency of aftershocks at time $t$ follows a power law.

$$n(t) = \frac{K}{(t + c)^p}$$

Magnitude frequency law [3]

Magnitudes of earthquakes follow an exponential distribution.

$$P(\text{Mag} > M) = e^{-\beta M}$$

[2] Omori, F., 1894. On the aftershocks of earthquakes. Journal of the College of Science, Imperial University of Tokyo, 7, 111-200.

[3] Gutenberg, B., Richter, C., 1944. Frequency of Earthquakes in California. Bulletin of the Seismological Society of America, 34, 185-188.

Red Banana Modified ETAS

Spatial and temporal clustering

Inter-event distance: Distance between pairs of plants born within 6 weeks of each other.

[Figure: density of inter-event squared distance (km²), x-axis 0 to 6.]

Inter-event time: Time between pairs of plants that are less than 100 meters apart.

[Figure: density of inter-event birth time (weeks), x-axis 0 to 150.]

Modified ETAS for red banana

Triggering density function

$$g(t, x, y) = \frac{\alpha\beta}{\pi}\, e^{-\alpha t - \beta(x^2 + y^2)}$$

Conditional intensity

$$\lambda(t, x, y \mid H_t) = (1 - p)\,\mu(x, y) + \frac{p\,\alpha\beta}{\pi} \sum_{\{i:\, t_i < t\}} e^{-\alpha(t - t_i) - \beta\{(x - x_i)^2 + (y - y_i)^2\}}$$

1. $\alpha$ is the temporal clustering parameter,
2. $\beta$ is the spatial clustering parameter,
3. $p$ is the proportion of plants that were “triggered” by previous plants.
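The conditional intensity above translates almost directly into code. The following is a minimal sketch, not the author's implementation: the function names, the constant background rate, and the toy history are hypothetical, and the parameter values are the estimates reported later in the talk, with units left unspecified as in the slides.

```python
import math

def trigger_density(dt, dx, dy, alpha, beta):
    """g(t, x, y) = (alpha * beta / pi) * exp(-alpha * t - beta * (x^2 + y^2)), for t > 0."""
    if dt <= 0:
        return 0.0  # a plant cannot be triggered by a later (or simultaneous) plant
    return (alpha * beta / math.pi) * math.exp(-alpha * dt - beta * (dx * dx + dy * dy))

def conditional_intensity(t, x, y, history, background, alpha, beta, p):
    """lambda(t, x, y | H_t) = (1 - p) * mu(x, y) + p * sum over earlier events of g."""
    triggered = sum(
        trigger_density(t - ti, x - xi, y - yi, alpha, beta)
        for ti, xi, yi in history
        if ti < t
    )
    return (1.0 - p) * background(x, y) + p * triggered

def constant_background(x, y):
    """Hypothetical flat background rate; the talk uses a kernel smooth instead."""
    return 1e-4

# Toy history of (birth time, x, y) triples; not real data.
history = [(0.0, 0.0, 0.0), (3.0, 2.0, 1.0)]
near = conditional_intensity(5.0, 1.0, 1.0, history, constant_background, 0.076, 0.029, 0.577)
far = conditional_intensity(5.0, 50.0, 50.0, history, constant_background, 0.076, 0.029, 0.577)
```

In this toy setup `near` exceeds `far`, because the triggering term decays with both elapsed time and squared distance from earlier plants.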

Red Banana Estimation

Estimate model parameters by MLE

Log-likelihood

$$\ell = \sum_{i=1}^{n} \log \lambda(t_i, x_i, y_i) - \iint_A \int_0^\infty \lambda(t, x, y)\, dt\, dx\, dy$$

$$\hat{\alpha} = 0.076\ (0.005), \qquad \hat{\beta} = 0.029\ (0.002), \qquad \hat{p} = 0.577\ (0.019)$$

1. About 58% of plants are direct descendants of previous plants.
2. Model fit evaluated using super-thinning [4] with the R package stppresid.

[4] Clements, R., Schoenberg, F., Veen, A., 2012. Evaluation of space-time point process models using super-thinning. Environmetrics, 23, 606-616.
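To make the log-likelihood concrete, here is a rough sketch under several simplifying assumptions that are not in the slides: a constant background rate `mu`, a finite observation window of length `t_end` in place of the infinite upper limit, and no spatial edge correction, so each plant's triggering kernel integrates to $1 - e^{-\alpha(T - t_i)}$ over the window. All names are hypothetical.

```python
import math

def log_likelihood(events, alpha, beta, p, mu, area, t_end):
    """Approximate log-likelihood of the modified ETAS model.

    events: list of (t, x, y) triples observed in [0, t_end].
    mu:     constant background rate per unit area per unit time (assumed, must be > 0).
    area:   area of the spatial region A.
    """
    events = sorted(events)  # tuples sort by time first
    log_sum = 0.0
    for j, (tj, xj, yj) in enumerate(events):
        # Sum of exponential kernels over strictly earlier events.
        triggered = 0.0
        for ti, xi, yi in events[:j]:
            if ti < tj:
                sq_dist = (xj - xi) ** 2 + (yj - yi) ** 2
                triggered += math.exp(-alpha * (tj - ti) - beta * sq_dist)
        lam = (1.0 - p) * mu + p * alpha * beta / math.pi * triggered
        log_sum += math.log(lam)
    # Integral of the intensity: background part plus the time-truncated
    # mass of each triggering kernel (spatial edge effects ignored).
    integral = (1.0 - p) * mu * area * t_end
    integral += p * sum(1.0 - math.exp(-alpha * (t_end - ti)) for ti, _, _ in events)
    return log_sum - integral
```

Maximizing this function over `alpha`, `beta`, and `p` with a general-purpose optimizer would give estimates in the spirit of the MLE above; the numbers would match the reported ones only if the background rate and window were handled the same way.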

Simulating 100 years

[Figure: six panels of simulated plant locations (x vs. y, both axes 0 to 2500) after 1, 2, 5, 10, 50, and 100 years.]
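One way such a simulation could be set up is as a branching process. The triggering density factorizes into an exponential waiting time with rate $\alpha$ and a bivariate normal displacement with variance $1/(2\beta)$ per coordinate, and the factor $p$ can be read as the mean number of direct offspring per plant. The sketch below relies on that reading and on a hypothetical constant background rate `mu` over a square of side `side`; it is not the code behind the figures.

```python
import math
import random

def poisson_draw(mean, rng):
    """Knuth's multiplication method; adequate for small means such as p < 1."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def simulate(alpha, beta, p, mu, side, t_end, seed=0):
    """Simulate (t, x, y) events on [0, t_end); offspring may land outside the square."""
    rng = random.Random(seed)
    # Background plants: homogeneous Poisson process with total rate (1 - p) * mu * area.
    rate = (1.0 - p) * mu * side * side
    events = []
    t = rng.expovariate(rate)
    while t < t_end:
        events.append((t, rng.uniform(0.0, side), rng.uniform(0.0, side)))
        t += rng.expovariate(rate)
    # Offspring: Poisson(p) children per plant, exponential delay, Gaussian displacement.
    sd = math.sqrt(1.0 / (2.0 * beta))
    queue = list(events)
    while queue:
        ti, xi, yi = queue.pop()
        for _ in range(poisson_draw(p, rng)):
            child_t = ti + rng.expovariate(alpha)
            if child_t < t_end:
                child = (child_t, xi + rng.gauss(0.0, sd), yi + rng.gauss(0.0, sd))
                events.append(child)
                queue.append(child)
    return sorted(events)

# Illustrative run: times in weeks (an assumption), mu and side are made up.
sample = simulate(alpha=0.076, beta=0.029, p=0.577, mu=1e-4, side=100.0, t_end=520.0, seed=1)
```

Because $p < 1$, each family of descendants dies out on its own, so the loop terminates; long-run growth in this sketch comes from the steady arrival of background plants.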

Sea birds

Sea Birds

[Image: Northern Gannet]

Sea birds Motivation

Development of offshore wind energy facilities

Detrimental effects on seabird life

Statistical motivation

Northern Gannet data

[Figure: histogram of counts, x-axis Count (0 to 1500), y-axis Frequency (0 to 8000).]

Observed count    Frequency
0                 9553
1–10              1778
11–100            184
101–1000          11
1001+             1

Sea birds Data

Data collection

1. Boat and aerial surveys
2. 1992–2010
3. 43,701 transects
4. 133,890 separate sightings
5. > 2 million total birds
6. ∼150 unique species

Goal: Model the space-time distribution of seabirds in the Atlantic Ocean, and create maps that assess risk.

Discretize spatial domain

15984 sites
4 × 4 km each site

[Figure: map of the site grid, longitude (−76 to −64) vs. latitude (35.0 to 42.5).]

Consider data from July 2002 to November 2010

[Figure: map of the amount of survey effort per site, longitude (−76 to −64) vs. latitude (35.0 to 42.5); legend categories 0, 1–4, 5–9, 10+.]

Sea birds Model

Mixture models

1. Must account for excess zeros.
   1. Zero-inflated models.
   2. Hurdle models.
2. Must account for over-dispersion.
   1. Negative Binomial (NB) instead of Poisson for “typical” counts.
   2. Generalized Pareto distribution (GPD) for “extreme” values.

Generalized Pareto distribution (GPD) properties

1. $\mu$ is the lower bound (threshold)
2. $\sigma > 0$ is the scale
3. $\xi$ is the shape
   1. if $\xi < 0$, the distribution is bounded above
   2. if $\xi \ge 0.5$, the variance is infinite
   3. if $\xi \ge 1$, the mean is infinite
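The role of the shape parameter can be checked numerically with the standard closed forms for the GPD. The helper names below are hypothetical, and this is only a sketch of the textbook formulas, not code from the analysis.

```python
import math

def gpd_cdf(y, mu, sigma, xi):
    """CDF of the generalized Pareto distribution with threshold mu."""
    if y <= mu:
        return 0.0
    z = (y - mu) / sigma
    if xi == 0.0:
        return 1.0 - math.exp(-z)  # exponential limit as xi -> 0
    base = 1.0 + xi * z
    if base <= 0.0:  # only possible when xi < 0 and y is at or past the upper bound
        return 1.0
    return 1.0 - base ** (-1.0 / xi)

def gpd_upper_bound(mu, sigma, xi):
    """Finite upper endpoint mu - sigma / xi when xi < 0, otherwise unbounded."""
    return mu - sigma / xi if xi < 0 else math.inf

def gpd_mean(mu, sigma, xi):
    """Mean mu + sigma / (1 - xi); infinite once xi reaches 1."""
    return mu + sigma / (1.0 - xi) if xi < 1 else math.inf

def gpd_variance(sigma, xi):
    """Variance sigma^2 / ((1 - xi)^2 (1 - 2 xi)); infinite once xi reaches 0.5."""
    return sigma ** 2 / ((1.0 - xi) ** 2 * (1.0 - 2.0 * xi)) if xi < 0.5 else math.inf
```

For example, `gpd_upper_bound(0.0, 1.0, -0.5)` is 2.0, while a shape of 0.75 gives a finite mean but an infinite variance, which is the heavy-tailed regime relevant for rare, very large counts.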

Double-hurdle model

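$$P(Y_{ij} \mid \theta) = \begin{cases} p_{ij} & \text{if } Y_{ij} = 0, \\ (1 - p_{ij}) \cdot (1 - q_{ij}) \cdot \mathrm{NB}(m_{ij}, r) & \text{if } 1 \le Y_{ij} < \mu, \\ (1 - p_{ij}) \cdot q_{ij} \cdot \mathrm{GPD}(\mu, \sigma, \xi) & \text{if } Y_{ij} \ge \mu. \end{cases}$$

1. $p$ = Pr(zero-count)
   $\mathrm{logit}(p) = X\beta^{(p)} + S^{(p)}$

2. $m$ = mean of typical-count distribution.
   $\log(m) = \log(E) + X\beta^{(m)} + S^{(m)}$

3. $q$ = Pr(large-count | nonzero-count)
   $\mathrm{logit}(q) = X\beta^{(q)}$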
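A small sketch may help show how the three pieces of the double-hurdle model combine into a single probability mass function. The slides do not say how the NB and GPD components are normalized on their ranges, so two assumptions are made here: the NB is truncated to $1 \le y < \mu$, and the continuous GPD is discretized as $P(y \le Z < y + 1)$ with $\xi > 0$. Function names and parameter values are hypothetical.

```python
import math

def nb_pmf(y, m, r):
    """Negative binomial pmf with mean m and size (dispersion) r."""
    log_pmf = (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
               + r * math.log(r / (r + m)) + y * math.log(m / (r + m)))
    return math.exp(log_pmf)

def gpd_sf(y, mu, sigma, xi):
    """Survival function P(Z >= y) of a continuous GPD, for xi > 0 and y >= mu."""
    return (1.0 + xi * (y - mu) / sigma) ** (-1.0 / xi)

def double_hurdle_pmf(y, p, q, m, r, mu, sigma, xi):
    """P(Y = y) with integer threshold mu >= 2 separating typical and extreme counts."""
    if y == 0:
        return p
    if y < mu:
        # NB truncated to 1..mu-1 (assumed normalization).
        norm = sum(nb_pmf(k, m, r) for k in range(1, mu))
        return (1.0 - p) * (1.0 - q) * nb_pmf(y, m, r) / norm
    # Discretized GPD: mass of the unit interval [y, y + 1) (assumed discretization).
    return (1.0 - p) * q * (gpd_sf(y, mu, sigma, xi) - gpd_sf(y + 1, mu, sigma, xi))

# Illustrative values only; not fitted estimates.
params = dict(p=0.8, q=0.1, m=3.0, r=0.5, mu=7, sigma=5.0, xi=0.5)
prob_zero = double_hurdle_pmf(0, **params)
prob_large = sum(double_hurdle_pmf(y, **params) for y in range(7, 20000))
```

With these choices the three branches carry total mass $p$, $(1 - p)(1 - q)$, and $(1 - p)q$, so the probabilities sum to one and $q$ keeps its interpretation as Pr(large-count | nonzero-count).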

Sea birds Spatial model

Modeling spatial random effects

1. Gaussian Markov random field

   $$\pi(S \mid \tau) \propto \tau^{\mathrm{rank}(Q)/2} \exp\left(-\frac{\tau}{2}\, S' Q S\right)$$

2. Inverse covariance matrix $Q = D - \rho A$
   $D$ is diagonal with entries the number of neighbors,
   $A$ is the adjacency matrix,
   $\rho = 1$ specifies the intrinsic CAR prior.

3. $Q$ is a 15984 × 15984 matrix
   Eigen-decompose $Q = V \Lambda V^{-1}$,
   $V$ is a matrix whose columns are eigenvectors,
   $\Lambda$ is diagonal whose entries are eigenvalues.
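The structure of $Q = D - \rho A$ is easy to see on a toy lattice. The sketch below builds it for a small rectangular grid, assuming rook (shared-edge) neighbors; the neighborhood definition used for the 15984 survey sites is not stated in the slides, and the function name is hypothetical.

```python
def grid_car_precision(n_rows, n_cols, rho=1.0):
    """Return Q = D - rho * A for an n_rows x n_cols grid with rook adjacency."""
    n = n_rows * n_cols
    adjacency = [[0] * n for _ in range(n)]
    for row in range(n_rows):
        for col in range(n_cols):
            i = row * n_cols + col
            # Link each cell to its right and lower neighbor; symmetry fills in the rest.
            for d_row, d_col in ((0, 1), (1, 0)):
                rr, cc = row + d_row, col + d_col
                if rr < n_rows and cc < n_cols:
                    j = rr * n_cols + cc
                    adjacency[i][j] = adjacency[j][i] = 1
    # Diagonal entries are neighbor counts; off-diagonals are -rho for neighbors.
    return [
        [(sum(adjacency[i]) if i == j else 0) - rho * adjacency[i][j] for j in range(n)]
        for i in range(n)
    ]

Q = grid_car_precision(3, 3)         # 9 x 9 matrix for a 3 x 3 grid
row_sums = [sum(row) for row in Q]   # all zero when rho = 1
```

With $\rho = 1$ every row of $Q$ sums to zero, so $Q$ is singular; this is why the density above uses $\mathrm{rank}(Q)$ rather than the full dimension.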

Hierarchical modeling

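Spatial random effects, $S = V_{n \times n} \cdot \alpha_{n \times 1} \approx V_{n \times k} \cdot \alpha_{k \times 1}$

1. Choose the first $k \ll n$ eigenvectors ($k = 50$ explains 67% of variance),

$$\mathrm{logit}(p) = X\beta^{(p)} + V\alpha^{(p)}$$

$$\log(m) = \log(E) + X\beta^{(m)} + V\alpha^{(m)}$$

$$\mathrm{logit}(q) = X\beta^{(q)}$$

Biophysical covariates, $X = [1, x_1, x_2, \ldots, x_7]$

1. Bathymetry, Distance-to-shore,
2. Sea surface temperature, Chlorophyll,
3. Fourier basis $\sin(\frac{\pi}{6}\,\text{month})$, $\cos(\frac{\pi}{6}\,\text{month})$ for temporal variation.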
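The Fourier basis terms give the model a smooth annual cycle. As a brief sketch (the function name and the 1 to 12 month coding are assumptions), the two covariates for a given month could be computed as follows.

```python
import math

def month_basis(month):
    """Return (sin, cos) of (pi / 6) * month, a pair with a 12-month period."""
    angle = math.pi / 6.0 * month
    return math.sin(angle), math.cos(angle)

# One row of seasonal covariates per calendar month (1 = January, assumed coding).
seasonal_columns = [month_basis(month) for month in range(1, 13)]
```

Since the angle advances by $\pi/6$ per month, month 13 maps to the same pair as month 1, so December and January end up adjacent in covariate space rather than eleven units apart.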

Sea birds Results

Predictive maps for northern gannet

[Figure sequence: monthly predictive maps for January through December, longitude (−76 to −64) vs. latitude (35.0 to 42.5). Each month shows two panels: P(y ≥ 1) on a color scale of 0.00 to 1.00, and P(y ≥ 7) on a color scale of 0.0 to 0.3.]

What was the most popular NYTimes.com article of 2013?

How Y’all, Youse, and You Guys Talk

Fun:
http://spark.rstudio.com/jkatz/DialectMap/
http://www.nytimes.com/interactive/2013/12/20/sunday-review/dialect-quiz-map.html

My Dialect Map

How Y’all, Youse, and You Guys Talk

1. Examine regional variation in English dialect in the continental US.
2. Important in linguistic research.
3. Point-referenced data, coded by zip code.
4. Estimate $p_t$, the probability vector for any location $t$.
5. k-nearest neighbor kernel smoothing.
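The idea of k-nearest neighbor kernel smoothing can be sketched as follows. The details of the actual dialect map (kernel, distance, choice of k) are not given in the slides, so the Gaussian kernel, the bandwidth set to the distance of the k-th neighbor, the planar coordinates, and the toy survey answers below are all assumptions.

```python
import math
from collections import defaultdict

def knn_probability_vector(location, responses, k):
    """Estimate the probability of each answer at `location`.

    responses: list of ((x, y), answer) pairs.
    Only the k nearest respondents contribute, each weighted by a Gaussian
    kernel whose bandwidth is the distance to the k-th nearest respondent.
    """
    by_distance = sorted((math.dist(location, point), answer) for point, answer in responses)
    nearest = by_distance[:k]
    bandwidth = nearest[-1][0]
    weights = defaultdict(float)
    for distance, answer in nearest:
        # If all k neighbors coincide with the location, fall back to equal weights.
        weight = 1.0 if bandwidth == 0 else math.exp(-0.5 * (distance / bandwidth) ** 2)
        weights[answer] += weight
    total = sum(weights.values())
    return {answer: weight / total for answer, weight in weights.items()}

# Toy responses on a made-up planar map.
responses = [
    ((0.0, 0.0), "y'all"), ((0.0, 1.0), "y'all"), ((1.0, 0.0), "y'all"),
    ((10.0, 10.0), "you guys"), ((10.0, 11.0), "you guys"), ((11.0, 10.0), "you guys"),
]
local = knn_probability_vector((0.0, 0.0), responses, k=3)
wider = knn_probability_vector((0.0, 0.0), responses, k=4)
```

With `k=3` only nearby respondents count, so `local` puts all its mass on one answer; with `k=4` a distant respondent enters with a small weight. Letting the bandwidth follow the k-th neighbor makes the smoothing adapt to how densely an area was sampled.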

Fall Course

Coming this Fall...
STAT 388/488 - Applied Spatial Statistics

1. No required textbook
2. Data analyses using R
3. Project-based course
4. Final poster presentation

Thank you!
