Copyright © 2016 by Luc Anselin, All Rights Reserved
Luc Anselin
Spatial Clusters of Rates
http://spatial.uchicago.edu
• concepts
• EBI local Moran
• scan statistics
Concepts
• Rates as Risk
• from counts (spatially extensive) to rates (spatially intensive)
• rate = number of events / population
• rate as a measure of risk (a probability)
• crude rate: O_i / P_i
• relative risk: O_i / E_i, observed relative to expected
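As a small illustration of the two measures, the sketch below computes crude rates and observed-to-expected ratios for four made-up areas. The data are hypothetical, and the expected count is assumed to be the overall rate times the local population (constant risk), which the slides do not spell out.

```python
import numpy as np

# Hypothetical event counts O_i and populations P_i for four areas.
events = np.array([2, 15, 40, 3], dtype=float)
population = np.array([100, 1000, 5000, 150], dtype=float)

# Crude rate: O_i / P_i.
crude_rate = events / population

# Assumption: expected count E_i under constant risk is the
# overall rate (total events / total population) times P_i.
overall_rate = events.sum() / population.sum()
expected = overall_rate * population

# Relative measure: observed relative to expected, O_i / E_i.
relative_risk = events / expected

for r, rr in zip(crude_rate, relative_risk):
    print(f"crude rate = {r:.4f}   O/E = {rr:.2f}")
```

Values of O/E above 1 indicate more events than expected under constant risk, values below 1 fewer.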
• The Problem with Rates
• r = O / P
• O number of events
• P population (at risk)
• O is a random variable, P is not
• variance of r depends inversely on P
• Moments of the Binomial Variable
• mean: E[O] = π P
• risk times population
• variance: Var[O] = π (1 - π) P
• variance depends on population P
• Moments of the Rate
• P is just a constant
• E[r] = E[O]/P = π P / P = π
• crude rate is unbiased estimator for risk
• Var[r] = Var[O] / P^2 = π (1 - π) P / P^2 = π (1 - π) / P
• Non-Standard Features of Rate Variance
• variance depends on the mean (= risk)
• numerator π (1 - π) = π - π^2 ≈ π for small π
• higher risk implies greater variance
• variance depends inversely on population P
• P in the denominator
• smaller places (smaller P) have larger variance
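The moments of the rate can be checked with a short simulation. The sketch below draws binomial counts for a fixed risk and several population sizes, then compares the simulated mean and variance of the rate with π and π (1 - π) / P. The risk value, population sizes, number of draws and the function name are all illustrative assumptions.

```python
import numpy as np

def simulate_rate_moments(population, risk, n_draws, rng):
    """Simulate O ~ Binomial(P, risk); return mean and variance of r = O / P."""
    counts = rng.binomial(population, risk, size=n_draws)
    rates = counts / population
    return rates.mean(), rates.var()

rng = np.random.default_rng(0)
risk = 0.05  # hypothetical underlying risk

for population in (50, 500, 5000):
    mean_r, var_r = simulate_rate_moments(population, risk, 200_000, rng)
    theory = risk * (1 - risk) / population
    print(f"P={population:5d}  mean={mean_r:.4f}  "
          f"var={var_r:.2e}  theory={theory:.2e}")
```

The simulated mean should stay close to the risk for every population size, while the variance shrinks roughly in proportion to 1 / P.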
[Figure: effect of variance instability on outliers (schools/population): crude rate map vs. Empirical Bayes (EB) smoothed map]
• Approaches
• variance instability violates the constant-variance assumption that underlies spatial autocorrelation analysis
• solutions
• standardized local indicators of spatial autocorrelation (EBI LISA)
• scan statistics
EBI Local Moran
• Correcting Variance Instability
• NOT by smoothing rates and applying standard Moran’s I
• smoothing induces spatial correlation
• BUT by adjusting the Moran’s I statistic directly
• several proposals: constant risk hypothesis (Walter 1992), Tango’s I (1995), Oden’s I_pop (1995) and Assunção-Reis EBI (1999)
• Empirical Bayes Index - EBI
• standardizing the rate variable using an Empirical Bayes (EB) logic
• z_i = (r_i - b) / s_i, with r_i as the original rate (x_i / p_i), b as a mean and s_i as a standard deviation
• use local Moran with standardized rates z_i
• EBI Adjustment
• mean b = Σ_i x_i / Σ_i p_i for i = 1,...,R, i.e., total sum of cases / total population, not the mean of the rates
• variance_i = a + b / p_i, with a = {[Σ_i p_i (r_i - b)^2] / P_tot} - b / P_av
• P_tot = Σ_i p_i and P_av = P_tot / R, average population by region
• s_i = square root of variance_i
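The standardization can be sketched in a few lines of NumPy. The sketch uses the Assunção-Reis form of the variance, v_i = a + b / p_i with a = [Σ_i p_i (r_i - b)^2] / P_tot - b / P_av, where the b / p_i term is what makes the standard deviation specific to each area. The function name and the data are hypothetical, and the sketch does not handle the case where an estimated variance is not positive.

```python
import numpy as np

def ebi_standardize(events, population):
    """Return EB-standardized rates z_i = (r_i - b) / s_i."""
    events = np.asarray(events, dtype=float)
    population = np.asarray(population, dtype=float)

    rates = events / population              # r_i = x_i / p_i
    p_tot = population.sum()                 # total population
    p_av = p_tot / population.size           # average population per region
    b = events.sum() / p_tot                 # overall rate, not the mean of rates

    # Population-weighted variance of the rates, minus b / P_av.
    a = (population * (rates - b) ** 2).sum() / p_tot - b / p_av

    # Area-specific variance; can be non-positive for some inputs
    # (not handled in this sketch).
    variance = a + b / population
    return (rates - b) / np.sqrt(variance)

# Hypothetical counts and populations for four areas.
events = [2, 15, 40, 3]
population = [100, 1000, 5000, 150]
z = ebi_standardize(events, population)
print(np.round(z, 3))
```

In this example the second area has a lower crude rate than the first but a larger population, so its standardized value ends up higher: the same kind of deviation from b counts for more when it is estimated more precisely.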
[Figure: local Moran for crude rate vs. EBI local Moran (schools/population)]
Scan Statistics
• Scan Statistics
• count events within a given shape
• typically based on centroids and circle
• count until a given number of events is reached: Besag-Newell
• count until a given aggregate population is reached: Kulldorff
Besag-Newell
• Principle
• aggregate areal units until a chosen number of events has been reached
• then carry out a hypothesis test with the Poisson expected count as the null
• what is the probability that the observed count in the aggregated areal units comes from a Poisson distribution with the expected count as its mean
• aggregate with highest significance (lowest p-value) is a cluster
• Implementation
• typically carried out using the centroids of areal units
• sort the neighbors in order of increasing distance
• add the number of events until the critical threshold (k) is exceeded
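The steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the window grows until the cumulative count reaches k (at least k events), the expected count is the overall rate times the window population, and the p-value is the Poisson probability of k or more events. Function names and data are hypothetical.

```python
import math
import numpy as np

def poisson_upper_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam), via the complement of the CDF."""
    term = math.exp(-lam)          # P(X = 0)
    cdf = 0.0
    for s in range(k):
        cdf += term                # add P(X = s)
        term *= lam / (s + 1)      # move to P(X = s + 1)
    return 1.0 - cdf

def besag_newell(coords, events, population, k):
    """For each centroid, add nearest areas until at least k events."""
    coords = np.asarray(coords, dtype=float)
    events = np.asarray(events, dtype=float)
    population = np.asarray(population, dtype=float)
    overall_rate = events.sum() / population.sum()

    results = []
    for i in range(len(events)):
        # Sort all areas (including i itself) by distance to centroid i.
        dist = np.linalg.norm(coords - coords[i], axis=1)
        order = np.argsort(dist, kind="stable")
        cum_events = np.cumsum(events[order])
        reached = np.nonzero(cum_events >= k)[0]
        if reached.size == 0:
            continue               # fewer than k events in the whole map
        members = order[: reached[0] + 1]
        # Assumption: expected count = overall rate * window population.
        lam = overall_rate * population[members].sum()
        results.append((i, members.tolist(), poisson_upper_tail(k, lam)))

    # Lowest p-value first.
    return sorted(results, key=lambda item: item[2])

# Hypothetical data: five areas on a line, equal populations.
coords = [[0, 0], [1, 0], [2, 0], [10, 0], [11, 0]]
events = [5, 4, 6, 0, 1]
population = [100, 100, 100, 100, 100]
clusters = besag_newell(coords, events, population, k=8)
for center, members, p_value in clusters:
    print(center, members, round(p_value, 4))
```

Each centroid yields its own test, so the resulting p-values are subject to the multiple-comparison caveats that apply to the method in general.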
[Figure: Besag-Newell clusters (schools/population): cluster 1, cluster 2]
• Interpretation
• care is needed to interpret the p-values
• multiple comparisons
• sequential tests
• clusters are overlapping
• same areal unit can appear in multiple clusters
Kulldorff Scan Statistic
• Principle
• aggregate areal units until a target population is reached
• likelihood ratio test of events within the “cluster” against events outside of the “cluster”
• null hypothesis is Poisson distribution with expected counts
• select cluster with max likelihood ratio
• Likelihood Ratio Test
• T = max (O_i / E_i)^O_i (O_o / E_o)^O_o for O_i / E_i > O_o / E_o
• count within region (i) versus outside (o)
• O_i, O_o observed in/out; E_i, E_o expected in/out
• inference based on randomization
• T_r computed for each simulated data set under constant risk
• compare reference distribution of T_r to observed T
• pseudo p-value = proportion of T_r that exceeds T
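A compact sketch of the test is given below, working with the logarithm of the likelihood ratio. Several choices here are assumptions rather than part of the slides: circular windows are built from nearest centroids and capped at 50% of the total population, data under constant risk are simulated with a multinomial draw proportional to population, and the pseudo p-value uses the common (M + 1) / (R + 1) convention, a slight variation on the simple proportion. All function names and data are hypothetical.

```python
import numpy as np

def log_lr(o_in, e_in, o_tot):
    """Log likelihood ratio for one window; 0 unless O_i/E_i > O_o/E_o."""
    o_out, e_out = o_tot - o_in, o_tot - e_in
    if e_in <= 0 or e_out <= 0 or o_in / e_in <= o_out / e_out:
        return 0.0
    llr = o_in * np.log(o_in / e_in)
    if o_out > 0:                       # 0 * log(0) is treated as 0
        llr += o_out * np.log(o_out / e_out)
    return float(llr)

def candidate_windows(coords, population, max_share=0.5):
    """Nearest-neighbor windows around each centroid, capped by population."""
    coords = np.asarray(coords, dtype=float)
    limit = max_share * population.sum()
    windows = []
    for i in range(len(population)):
        dist = np.linalg.norm(coords - coords[i], axis=1)
        order = np.argsort(dist, kind="stable")
        cum_pop = np.cumsum(population[order])
        for m in range(1, len(order) + 1):
            if cum_pop[m - 1] > limit:
                break
            windows.append(order[:m])
    return windows

def max_llr(windows, events, population):
    """Largest log likelihood ratio over all windows, with its members."""
    events = np.asarray(events, dtype=float)
    expected = events.sum() * population / population.sum()
    best, best_window = 0.0, None
    for w in windows:
        value = log_lr(events[w].sum(), expected[w].sum(), events.sum())
        if value > best:
            best, best_window = value, w
    return best, best_window

def kulldorff_scan(coords, events, population, n_sim=199, seed=0):
    population = np.asarray(population, dtype=float)
    events = np.asarray(events)
    windows = candidate_windows(coords, population)
    t_obs, cluster = max_llr(windows, events, population)
    # Reference distribution: reallocate all events under constant risk.
    rng = np.random.default_rng(seed)
    share = population / population.sum()
    total = int(events.sum())
    t_sim = np.array([
        max_llr(windows, rng.multinomial(total, share), population)[0]
        for _ in range(n_sim)
    ])
    p_value = float((1 + np.sum(t_sim >= t_obs)) / (n_sim + 1))
    return t_obs, cluster, p_value

# Hypothetical data: six areas on a line, equal populations.
coords = [[0, 0], [1, 0], [2, 0], [3, 0], [4, 0], [5, 0]]
events = [20, 18, 2, 3, 2, 1]
population = [100] * 6
t_obs, cluster, p_value = kulldorff_scan(coords, events, population)
print(round(t_obs, 2), sorted(cluster.tolist()), p_value)
```

Maximizing the log of the ratio selects the same window as maximizing the ratio itself, since the logarithm is monotone.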
[Figure: Kulldorff scan clusters (schools/population): cluster 1, cluster 2]
• Interpretation
• most likely cluster has highest log-likelihood ratio
• p-value based on Monte Carlo simulation
• other clusters ranked in order of log-likelihood ratio
• p-values suffer from multiple comparisons and sequential testing