Title: Spatial Statistics for Point Processes and Lattice Data (Part II)
Point Processes
Tonglin Zhang
Tonglin Zhang, Department of Statistics, Purdue University Spatial Statistics for Point and Lattice Data (Part II)
Outline
- Simulated Examples
- Interesting Problems
- Analysis under Stationarity
- Analysis under Nonstationarity
- Likelihood Analysis
- Stochastic Integral
- Asymptotic Frameworks
Simulated Examples
Poisson Processes
Let $S$ be the study area. $N$ is a Poisson process on $S$ if, for any disjoint subsets $A_1, \dots, A_k$ of $S$, the counts $N(A_1), \dots, N(A_k)$ are independent Poisson random variables with means $\mu(A_1), \dots, \mu(A_k)$.
Extensions:
- Cox process: $\mu$ is a random measure (e.g. $\log \mu$ is given by a Gaussian random field).
- Mixed Poisson process: the mean measure is given by $Y\mu$, where $Y$ is a nonnegative random variable.
Figure: A Homogeneous Poisson Point Process on $[0, 1]^2$ (plot title: Homogeneous Poisson Process; axes: x, y).
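A pattern like the one in the figure can be produced with a few lines of NumPy. This is only a minimal sketch: it assumes the study area is the unit square and uses the fact that, given the Poisson count, the points are independent and uniform. The function name and the intensity value 100 are illustrative choices, not taken from the slides.

```python
import numpy as np

def simulate_homogeneous_poisson(lam, rng):
    """Simulate a homogeneous Poisson process with intensity lam on [0, 1]^2."""
    # The total count on the unit square (area 1) is Poisson(lam * 1).
    n = rng.poisson(lam)
    # Given the count, the points are i.i.d. uniform on the square.
    return rng.uniform(0.0, 1.0, size=(n, 2))

rng = np.random.default_rng(0)
points = simulate_homogeneous_poisson(lam=100.0, rng=rng)
print(points.shape)  # (number of points, 2); the count varies with the seed
```

Plotting `points[:, 0]` against `points[:, 1]` gives a scatter comparable to the figure.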
Neyman Scott Cluster Process
A Neyman-Scott cluster process has:
- A parent process and an offspring process.
- The parent process is Poisson.
- The offspring process is derived from the parent process: each parent point generates a Poisson number of offspring points around it.
- Example: Let $s_1, \dots, s_m$ be the parent points. Then,
  - There are $m$ clusters.
  - Cluster 1 is around $s_1$; cluster 2 is around $s_2$; and so on.
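The construction above can be sketched in Python as follows. Several details are assumptions made only for illustration: the study area is the unit square, offspring are scattered around their parent with an isotropic Gaussian displacement of standard deviation `sigma`, parents are not part of the final pattern, and offspring falling outside the square are dropped (parents outside the square are not simulated, so there is a mild edge effect).

```python
import numpy as np

def simulate_neyman_scott(parent_rate, mean_offspring, sigma, rng):
    """Sketch of a Neyman-Scott process on [0, 1]^2 with Gaussian offspring scatter."""
    # Parent process: homogeneous Poisson with intensity parent_rate.
    m = rng.poisson(parent_rate)
    parents = rng.uniform(0.0, 1.0, size=(m, 2))
    # Each parent gets a Poisson number of offspring.
    counts = rng.poisson(mean_offspring, size=m)
    # Repeat each parent location once per offspring, then add Gaussian displacements.
    centers = np.repeat(parents, counts, axis=0)
    offspring = centers + rng.normal(0.0, sigma, size=centers.shape)
    # Keep only offspring inside the study area.
    inside = np.all((offspring >= 0.0) & (offspring <= 1.0), axis=1)
    return parents, offspring[inside]

rng = np.random.default_rng(0)
parents, offspring = simulate_neyman_scott(parent_rate=50.0, mean_offspring=5.0, sigma=0.02, rng=rng)
print(len(parents), len(offspring))
```

Cluster `i` consists of the offspring generated from `parents[i]`, matching the example with $m$ clusters.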
Figure: A Homogeneous Neyman Scott Cluster Process on $[0, 1]^2$ (plot title: Homogeneous Poisson Cluster Process; axes: x, y).
How to generate an inhomogeneous Poisson point process
Let $f$ be a PDF on $S$. Then, we can
- Generate $n \sim \text{Poisson}(b)$.
- Generate $n$ observations independently from the PDF $f$. Let them be $s_1, \dots, s_n$.
- Then, $\{s_1, \dots, s_n\}$ is a sample from a Poisson point process with intensity
$$\lambda(s) = b f(s).$$
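The two-step recipe translates directly into code. The sketch below is hedged: the sampler `sample_beta_uniform` is a hypothetical choice of $f$ (a Beta(2, 2) density in the x-coordinate and a uniform density in the y-coordinate on the unit square), used only to have something concrete to draw from.

```python
import numpy as np

def simulate_inhomogeneous_poisson(b, sample_f, rng):
    """Draw n ~ Poisson(b), then n i.i.d. points from the density f."""
    n = rng.poisson(b)
    return sample_f(n, rng)

def sample_beta_uniform(n, rng):
    # Hypothetical f on [0, 1]^2: Beta(2, 2) in x, uniform in y.
    x = rng.beta(2.0, 2.0, size=n)
    y = rng.uniform(0.0, 1.0, size=n)
    return np.column_stack([x, y])

rng = np.random.default_rng(0)
points = simulate_inhomogeneous_poisson(b=1000.0, sample_f=sample_beta_uniform, rng=rng)
print(points.shape)
```

With this choice the resulting intensity is $\lambda(s) = b f(s)$, so points concentrate around $x = 0.5$ and thin out toward $x = 0$ and $x = 1$.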
Other Processes
Compound Point Process
- Generate $n$ from a nonnegative discrete random variable.
- Generate $n$ observations independently from $f$.
Interesting Research Problems
- Stationarity: $\lambda(s)$ is a constant; $\lambda_2(s_1, s_2) = \lambda_2(s_1 - s_2)$; $g(s_1, s_2) = g(s_1 - s_2)$; etc.
- Isotropy: stationarity, and $\lambda_2(s_1, s_2) = \lambda_2(\|s_1 - s_2\|)$.
- Nonstationary process: estimation of intensity functions.
- Stationary temporal point process: return (or recurrence) intervals.
- Second-order analysis: $K$-function, $L$-function, pair correlation function.
- Return intervals.
- Marked point processes: relationship between points and marks.
Analysis under Stationarity
The K -Function (Ripley 1976)
Let $\{s_1, \dots, s_n\}$ be observed points from a stationary point process with intensity $\lambda$. The $K$-function describes the second-order properties of the process. It is
$$K(t) = 2\pi \int_0^t u\, g(u)\, du,$$
where $g$ is the pair correlation function.
We can estimate
$$\hat{K}(t) = \frac{1}{|A|\lambda^2} \sum_{i=1}^{n} \sum_{j \neq i} w_{ij}\, I(d_{ij} \le t),$$
where $A$ is a disc, $w_{ij}$ is the weight for edge correction, and $d_{ij} = \|s_i - s_j\|$ ($\lambda$ may be replaced by $\hat{\lambda}$ if $\lambda$ is unknown).
There is also a local version, called the local $K$-function:
$$\hat{K}_{\text{local}}(t; s_i) = \frac{1}{|A|\lambda} \sum_{j \neq i} w_{ij}\, I(d_{ij} \le t).$$
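The global estimator can be sketched in NumPy. This version is deliberately naive and rests on stated assumptions: all edge-correction weights are set to $w_{ij} = 1$ (so the estimate is biased slightly downward near the boundary), $\lambda$ is replaced by $\hat{\lambda} = n/|A|$, and the window is taken to be the unit square rather than a disc. The function name `ripley_k` is illustrative.

```python
import numpy as np

def ripley_k(points, t_values, area):
    """Naive K-function estimate with all edge weights w_ij set to 1."""
    n = len(points)
    lam_hat = n / area                       # plug-in estimate of the intensity
    diff = points[:, None, :] - points[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=2))     # pairwise distances d_ij
    np.fill_diagonal(d, np.inf)              # exclude the terms with i == j
    counts = np.array([(d <= t).sum() for t in t_values])
    return counts / (area * lam_hat ** 2)

rng = np.random.default_rng(0)
pts = rng.uniform(size=(500, 2))             # one uniform pattern on [0, 1]^2
t = np.array([0.02, 0.05])
k_hat = ripley_k(pts, t, area=1.0)
print(k_hat, np.pi * t ** 2)                 # estimate next to the Poisson value pi t^2
```

For a completely random pattern the two printed arrays should be of similar size, with the estimate somewhat smaller because no edge correction is applied.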
If $N$ is a homogeneous Poisson point process, then
$$K(t) = 2\pi \int_0^t u\, du = \pi t^2.$$
For any $s$ in $S$, let
$$U_{s,t} = \{s' : 0 < \|s - s'\| \le t\}.$$
If there is a point at $s$, then
$$E[N(U_{s,t}) \mid s] = \lambda K(t) = \lambda \pi t^2.$$
Therefore, the $L$-function
$$L(t) = \sqrt{\frac{K(t)}{\pi}} - t$$
is useful: it equals $0$ for a homogeneous Poisson process.
If $N$ is a homogeneous cluster point process with cluster size $k$, then
$$g(u) \approx k.$$
Therefore,
$$K(t) \approx k\pi t^2.$$
Then,
$$E[N(U_{s,t}) \mid s] \approx k\lambda\pi t^2.$$
There is
$$L(t) \approx (\sqrt{k} - 1)t > 0.$$
For any $A \subset S$, there is
$$E[N(A)] = \lambda|A|$$
and
$$V[N(A)] \approx kE[N(A)] = k\lambda|A|.$$
Pair Correlation Function
If $N$ is a regular (inhibition) process, the opposite of a Neyman-Scott cluster process, then we roughly have
$$g(u) \approx k < 1$$
for some $k$. Then, $L(t) < 0$.
If $N$ is strongly stationary, then (under conditions for weak dependence) for any $A$, we roughly have
$$V[N(A)] = \phi E[N(A)],$$
where
$$\phi = \lambda \int_{\mathbb{R}^d} [g(u) - 1]\, du + 1.$$
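A short derivation may help show where $\phi$ comes from. It is only a sketch, assuming that $g(u) - 1$ is integrable and that $A$ is large relative to the range of dependence so that boundary effects are negligible; these conditions are one reading of "weak dependence", not a statement taken from the slides.

```latex
\begin{aligned}
V[N(A)] &= \int_A \lambda\, ds + \int_A \int_A \left[\lambda_2(s, s') - \lambda^2\right] ds'\, ds \\
        &= \lambda |A| + \lambda^2 \int_A \int_A \left[g(s - s') - 1\right] ds'\, ds \\
        &\approx \lambda |A| \left\{ 1 + \lambda \int_{\mathbb{R}^d} \left[g(u) - 1\right] du \right\}
         = \phi\, E[N(A)].
\end{aligned}
```

The last step replaces the inner integral over $A$ by the integral over $\mathbb{R}^d$, which is accurate for points $s$ that are not close to the boundary of $A$.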
A Simulated Example
We evaluate the value $\phi = V[N(A)]/E[N(A)]$ based on the following example.
- $N$ is a cluster process on $[0, 1]^2$ with cluster size $k$.
- Each cluster is generated from a bivariate normal distribution centered at its parent point with standard deviation $\sigma$.
- The parameters are $(\lambda, \sigma, k)$.
- $A$ is the square with vertices $(0.2, 0.2)$, $(0.2, 0.8)$, $(0.8, 0.2)$, and $(0.8, 0.8)$.
- The simulation was run 1000 times.
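An experiment of this kind could be sketched as below. The slides do not spell out the simulation details, so the following are assumptions: every parent has exactly $k$ offspring, parents are Poisson with intensity $\lambda/k$ on $[0, 1]^2$ (so the overall intensity is $\lambda$), parents themselves are not counted, and the number of replications is kept at 2000 to stay fast.

```python
import numpy as np

def count_in_window(lam, sigma, k, rng, lo=0.2, hi=0.8):
    """One realization: number of cluster-process points falling in A = [lo, hi]^2."""
    # Parents: Poisson with intensity lam / k, so that the overall intensity is lam.
    m = rng.poisson(lam / k)
    parents = rng.uniform(0.0, 1.0, size=(m, 2))
    # Exactly k offspring per parent, bivariate normal around the parent.
    centers = np.repeat(parents, k, axis=0)
    pts = centers + rng.normal(0.0, sigma, size=centers.shape)
    inside = np.all((pts >= lo) & (pts <= hi), axis=1)
    return int(inside.sum())

def dispersion_index(lam, sigma, k, n_rep, rng):
    """Monte Carlo estimates of E[N(A)], V[N(A)] and phi = V / E."""
    counts = np.array([count_in_window(lam, sigma, k, rng) for _ in range(n_rep)])
    mean_n = counts.mean()
    var_n = counts.var(ddof=1)
    return mean_n, var_n, var_n / mean_n

rng = np.random.default_rng(0)
mean_n, var_n, phi = dispersion_index(lam=1000.0, sigma=0.005, k=4, n_rep=2000, rng=rng)
print(mean_n, var_n, phi)
```

Under these assumptions the mean count should be close to $\lambda|A| = 360$ and $\phi$ close to the cluster size $k$.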
Figure: An Example for Dispersion Effect under Stationarity (plot title: Example for Dispersion Effect; axes: x, y).
Table: Averages under Stationarity in 10,000-run simulations

λ     σ      k  E[N(A)]   V[N(A)]   ϕ
1000  0.005  4  359.872   1413.705  3.928
1000  0.005  2  360.600   695.630   1.929
1000  0.005  1  360.188   355.049   0.986
1000  0.001  4  359.594   1426.759  3.968
1000  0.001  2  360.050   710.812   1.974
1000  0.001  1  360.346   368.514   1.023
5000  0.005  4  1800.936  7131.298  3.960
5000  0.005  2  1799.848  3582.422  1.990
5000  0.005  1  1800.289  1769.029  0.983
5000  0.001  4  1800.547  7301.358  4.055
5000  0.001  2  1800.527  3561.236  1.978
5000  0.001  1  1800.356  1823.486  1.013
Return Intervals.
Let $N$ be a stationary point process on $\mathbb{R}$ with intensity $\lambda$. Let $\tau$ be the time until the next occurrence. If $N$ is Poisson, then the average return time of a point is
$$E(\tau) = \frac{1}{\lambda}.$$
If $N$ is a cluster process with cluster size $k$, then we should consider the return of parent points. The parent points have intensity $\lambda/k$, so it is about
$$E(\tau) = \frac{k}{\lambda}.$$
If there is a return, then it is a return of a cluster.
Analysis under Nonstationarity
Assume $N$ is not stationary, with first-order intensity $\lambda(s)$ and second-order intensity $\lambda_2(s_1, s_2)$. Most recent research focuses on second-order intensity-reweighted stationary or isotropic processes, where the first is defined by
$$g(s_1, s_2) = \frac{\lambda_2(s_1, s_2)}{\lambda(s_1)\lambda(s_2)} = g(s_1 - s_2)$$
and the second is defined by
$$g(s_1, s_2) = g(\|s_1 - s_2\|).$$
For a second-order intensity-reweighted stationary process and any $A \subseteq S$, there is
$$V[N(A)] \approx (\phi - 1)E^2[N(A)] + E[N(A)]$$
for a certain $\phi$.
If $N$ is an inhomogeneous cluster process, then we roughly have
$$V[N(A)] \approx kE[N(A)],$$
where $k$ is the cluster size. Therefore, an inhomogeneous cluster process is not a second-order intensity-reweighted stationary process.
If $\lambda(s)$ is given, then the $K$-function is estimated by
$$\hat{K}(t, \lambda) = \frac{1}{|A|} \sum_{i=1}^{n} \sum_{j \neq i} \frac{w_{ij}\, I(d_{ij} \le t)}{\lambda(s_i)\lambda(s_j)},$$
where $A$ is a disc and $w_{ij}$ is the edge-correction weight.
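The intensity-reweighted estimator differs from the stationary one only in the per-pair weight $1/\{\lambda(s_i)\lambda(s_j)\}$. A minimal sketch, assuming all edge-correction weights are $w_{ij} = 1$ and that the known intensity is passed in as a hypothetical callable `intensity` that maps an array of points to an array of intensity values:

```python
import numpy as np

def inhomogeneous_k(points, intensity, t_values, area):
    """Intensity-reweighted K-function estimate with all edge weights w_ij set to 1."""
    lam = intensity(points)                  # lambda(s_i) at each observed point
    diff = points[:, None, :] - points[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=2))     # pairwise distances d_ij
    np.fill_diagonal(d, np.inf)              # exclude the terms with i == j
    weights = 1.0 / np.outer(lam, lam)       # 1 / (lambda(s_i) * lambda(s_j))
    return np.array([weights[d <= t].sum() for t in t_values]) / area

rng = np.random.default_rng(0)
pts = rng.uniform(size=(400, 2))
# Constant intensity as a sanity check: this reduces to the stationary estimator.
k_hat = inhomogeneous_k(pts, lambda p: np.full(len(p), 400.0), np.array([0.05]), area=1.0)
print(k_hat, np.pi * 0.05 ** 2)
```

With a constant intensity the result should be close to $\pi t^2$ for a uniform pattern, slightly smaller because edge effects are ignored.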
One should
- Estimate $\lambda(s)$;
- Estimate $\lambda_2(s_1, s_2)$ by estimating $g(u)$ or $K(u)$;
- It is hard to provide nonparametric estimation for both the first-order and the second-order intensity functions. Therefore, one often (e.g. Guan (2009) JASA, 1482-1491) considers:
  - a parametric analysis for $\lambda(s)$; and
  - a nonparametric analysis for $\lambda_2(s_1, s_2)$.
A Simulated Example
We generate data from a cluster process with $\lambda(s) = b\,\beta(2, 2)$.
Figure: An Example for Dispersion Effect under Nonstationarity (plot title: Inhomogeneous Poisson Cluster Process; axes: x, y).
Table: Averages under Nonstationarity in 10,000 Simulations

b     σ      k  E[N(A)]   V[N(A)]   ϕ
1000  0.005  4  472.797   1857.836  3.929
1000  0.005  2  472.561   935.265   1.979
1000  0.005  1  472.442   467.561   0.990
1000  0.001  4  471.662   1876.383  3.978
1000  0.001  2  472.554   957.339   2.026
1000  0.001  1  472.861   482.426   1.020
5000  0.005  4  2362.713  9401.313  3.979
5000  0.005  2  2362.378  4717.934  1.997
5000  0.005  1  2363.058  2343.867  0.992
5000  0.001  4  2362.672  9274.963  3.926
5000  0.001  2  2361.637  4607.397  1.951
5000  0.001  1  2363.759  2387.218  1.010
Likelihood Analysis
If $N$ is an inhomogeneous Poisson process with intensity function $\lambda_\theta(s)$, then the log-likelihood function is
$$\ell(\theta) = \sum_{i=1}^{n} \log \lambda_\theta(s_i) - \int_S \lambda_\theta(u)\, du.$$
If $N$ is not Poisson, the above is the composite log-likelihood function, which can also provide a consistent estimator.
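To make the log-likelihood concrete, the sketch below maximizes it for one hypothetical parametric model, $\lambda_\theta(s) = \exp(\theta_0 + \theta_1 x)$ on $S = [0, 1]^2$, where $x$ is the first coordinate of $s$. The model, the Newton-Raphson iteration, and the midpoint-rule approximation of the integral are all illustrative assumptions, not the method of the slides.

```python
import numpy as np

def fit_loglinear_intensity(x, n_grid=2000, n_iter=25):
    """Maximize l(theta) for lambda_theta(s) = exp(theta0 + theta1 * x) on [0, 1]^2."""
    # Midpoint rule in x; the y-integral over [0, 1] contributes a factor of 1.
    grid = (np.arange(n_grid) + 0.5) / n_grid
    z_grid = np.column_stack([np.ones(n_grid), grid])   # (1, x) on the grid
    z_obs = np.column_stack([np.ones(len(x)), x])       # (1, x_i) at the points
    theta = np.array([np.log(len(x)), 0.0])             # start at the homogeneous fit
    for _ in range(n_iter):
        lam_grid = np.exp(z_grid @ theta) / n_grid      # intensity times quadrature weight
        score = z_obs.sum(axis=0) - z_grid.T @ lam_grid # gradient of l(theta)
        info = (z_grid * lam_grid[:, None]).T @ z_grid  # minus the Hessian of l(theta)
        theta = theta + np.linalg.solve(info, score)    # Newton-Raphson step
    return theta

# Simulate from the model with known parameters, then refit.
rng = np.random.default_rng(1)
theta_true = np.array([6.0, 1.5])
mass = np.exp(theta_true[0]) * (np.exp(theta_true[1]) - 1.0) / theta_true[1]
n = rng.poisson(mass)                                   # total count is Poisson(integral of lambda)
u = rng.uniform(size=n)
x = np.log1p(u * (np.exp(theta_true[1]) - 1.0)) / theta_true[1]  # inverse-CDF draw of x
theta_hat = fit_loglinear_intensity(x)
print(theta_hat)
```

Only the x-coordinates enter the fit because the assumed intensity does not depend on y; the estimate should land near `theta_true`, up to sampling error.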
Stochastic Integral
Let $f(s)$ be a function. The stochastic integral of $f$ with respect to $N$ is
$$U = \sum_{i=1}^{n} f(s_i) = \int_S f(s)\, N(ds).$$
Then,
$$E(U) = \int_S f(s)\lambda(s)\, ds$$
and
$$E(U^2) = \int_S \int_S f(s)f(s')\lambda_2(s, s')\, ds'\, ds + \int_S f^2(s)\lambda(s)\, ds.$$
Therefore,
$$V(U) = \int_S \int_S f(s)f(s')[g(s, s') - 1]\lambda(s)\lambda(s')\, ds'\, ds + \int_S f^2(s)\lambda(s)\, ds.$$
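These moment formulas can be checked numerically in the simplest case. The sketch below assumes a homogeneous Poisson process on $[0, 1]^2$ with intensity 200 (so $g \equiv 1$ and the double integral in $V(U)$ vanishes) and the hypothetical test function $f(s) = x + y$; both choices are for illustration only. In that case $E(U) = 200 \int (x + y) = 200$ and $V(U) = 200 \int (x + y)^2 = 200 \cdot 7/6$.

```python
import numpy as np

def stochastic_integral(points, f):
    """U = sum_i f(s_i), i.e. the integral of f with respect to N."""
    return float(f(points).sum())

def f(points):
    # Hypothetical test function f(s) = x + y.
    return points[:, 0] + points[:, 1]

rng = np.random.default_rng(2)
lam = 200.0
# Each replication draws a fresh Poisson pattern and evaluates U on it.
u_values = np.array([
    stochastic_integral(rng.uniform(size=(rng.poisson(lam), 2)), f)
    for _ in range(4000)
])
print(u_values.mean(), lam * 1.0)              # Monte Carlo mean next to E(U)
print(u_values.var(ddof=1), lam * 7.0 / 6.0)   # Monte Carlo variance next to V(U)
```

For a clustered process the variance would also pick up the double-integral term involving $g(s, s') - 1$.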
For a bivariate function $f(s, s')$, there is
$$U = \int_S \int_S f(s, s')\, N(ds)\, N(ds').$$
Then,
$$E(U) = \int_S \int_S f(s, s')\lambda_2(s, s')\, ds\, ds' + \int_S f(s, s)\lambda(s)\, ds.$$
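In addition, there is
$$\begin{aligned}
E(U^2) ={}& \int_S \int_S \int_S \int_S f(s, s')f(s'', s''')\lambda_4(s, s', s'', s''')\, ds\, ds'\, ds''\, ds''' \\
&+ \int_S \int_S \int_S [2f(s, s')f(s'', s'') + f(s, s')f(s, s'') \\
&\qquad + f(s, s')f(s'', s') + 2f(s, s')f(s'', s)]\lambda_3(s, s', s'')\, ds\, ds'\, ds'' \\
&+ \int_S \int_S [f(s, s)f(s', s') + f(s, s')f(s, s') + f(s, s')f(s', s) \\
&\qquad + 2f(s, s)f(s, s') + 2f(s, s)f(s', s)]\lambda_2(s, s')\, ds\, ds' \\
&+ \int_S f^2(s, s)\lambda(s)\, ds.
\end{aligned}$$
The $\lambda_2$ integrand includes the terms in which three of the four arguments coincide, so that for $f \equiv 1$ and a Poisson process with mean $\mu$ the formula reduces to the fourth moment $\mu^4 + 6\mu^3 + 7\mu^2 + \mu$.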
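For $\ell(\theta)$, there is
$$\ell(\theta) = \int_S \log \lambda_\theta(s)\, N(ds) - \int_S \lambda_\theta(u)\, du.$$
Then, we have
$$\frac{\partial \ell(\theta)}{\partial \theta_j} = \int_S \frac{1}{\lambda_\theta(s)} \frac{\partial \lambda_\theta(s)}{\partial \theta_j}\, N(ds) - \int_S \frac{\partial \lambda_\theta(u)}{\partial \theta_j}\, du.$$
Thus,
$$E\left[\frac{\partial \ell(\theta)}{\partial \theta_j}\right] = 0.$$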
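The step from the score to its zero expectation can be written out using the first-moment formula $E(U) = \int_S f(s)\lambda(s)\, ds$ from the stochastic integral slides. The sketch assumes the expectation is taken at the true parameter, so that the intensity of $N$ is $\lambda_\theta$.

```latex
\begin{aligned}
E\left[\int_S \frac{1}{\lambda_\theta(s)} \frac{\partial \lambda_\theta(s)}{\partial \theta_j}\, N(ds)\right]
  &= \int_S \frac{1}{\lambda_\theta(s)} \frac{\partial \lambda_\theta(s)}{\partial \theta_j}\, \lambda_\theta(s)\, ds
   = \int_S \frac{\partial \lambda_\theta(s)}{\partial \theta_j}\, ds, \\
E\left[\frac{\partial \ell(\theta)}{\partial \theta_j}\right]
  &= \int_S \frac{\partial \lambda_\theta(s)}{\partial \theta_j}\, ds
   - \int_S \frac{\partial \lambda_\theta(u)}{\partial \theta_j}\, du = 0.
\end{aligned}
```

Only the first-order intensity is used, so the argument does not require $N$ to be Poisson, which is consistent with the composite log-likelihood remark.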
Asymptotic Frameworks
There are two different types of asymptotic frameworks:
- Increasing domain: $S$ increases to $\mathbb{R}^d$ but the intensity functions do not vary (the focus of most research).
- Fixed domain: $S$ does not vary but the intensity functions go to infinity (e.g. $\lambda(s) = b\lambda_0(s)$ as $b \to \infty$; rarely discussed).