
doi: 10.1111/j.1467-9469.2007.00583.x
© Board of the Foundation of the Scandinavian Journal of Statistics 2008. Published by Blackwell Publishing Ltd, 9600 Garsington Road, Oxford OX4 2DQ, UK and 350 Main Street, Malden, MA 02148, USA, 2008

Intensity Estimation for Spatial Point Processes Observed with Noise

LIONEL CUCALA

Institut de Mathématiques de Toulouse

ABSTRACT. This article introduces a kernel estimator of the intensity function of spatial point processes that takes location errors into account. The asymptotic properties of the estimator are derived and a bandwidth selection procedure is described. A simulation study compares our results with those of the classical kernel estimator and shows that the edge-corrected deconvoluting kernel estimator is more appropriate.

Key words: deconvolution, kernel density estimation, measurement error, spatial point processes

1. Introduction

In the kernel density estimation framework, many authors have considered the issue of estimating the density from noisy observations. Indeed, one may consider that each measurement reflects the true value altered by the addition of a stochastic error. This problem is usually handled by a deconvolution method, when the distribution of the errors is either known (Stefanski & Carroll, 1990) or unknown (Diggle & Hall, 1993).

When dealing with a spatial point pattern, a systematic exploratory tool is the estimation of the intensity function, which reflects the first-order characteristics of the underlying point process. Some authors propose parametric estimations of the intensity function. Let us mention the approaches based on Poisson processes (Ogata & Katsura, 1986) and more recently on Cox processes (Møller & Waagepetersen, 2004). But non-parametric estimators are also very frequent. These are mainly kernel and nearest-neighbour estimators (Cressie, 1993) derived from multivariate density estimation theory. Recently, a new approach based on a hierarchical Bayesian model and the Voronoi tessellation has also been studied (Heikkinen & Arjas, 1998; Byers & Raftery, 2002).

All of these methods use location data of the events. These are often difficult to collect and consequently their measurements are subject to errors. Lund & Rudemo (2000) tried to make inference on such point processes observed with noise. Bar-Hen et al. (2005) studied the influence of measurement errors on descriptive statistics for testing complete spatial randomness. In this paper, we consider a standard additive error model (Carroll et al., 1999) and we introduce a kernel estimator of the intensity function which takes the location errors into account by a deconvolution method. For simplicity, we develop this estimator in the case of two-dimensional point processes.

Section 2 describes the framework and introduces the new estimator. We present an asymptotic study in section 3 and a bandwidth selection procedure adapted to this specific problem in section 4. The interest of the estimator is assessed by its application to simulated data in section 5 and to real data in section 6. The paper concludes with a short discussion in section 7.


2. A deconvoluting kernel intensity estimator

2.1. Model and notations

To fix the notation, all two-dimensional vectors are written in bold and all random variables are written in capitals. Consider a point process $Y$ in $\mathbb{R}^2$ with unknown intensity function $\lambda_Y$. We assume that this function is twice differentiable and that its first- and second-order partial derivatives are bounded.

We only observe the point pattern $\{\mathbf{z}_1, \ldots, \mathbf{z}_n\}$ in the domain $D \subset \mathbb{R}^2$ according to the additive error model

$$\mathbf{z}_i = \mathbf{y}_i + \mathbf{e}_i, \tag{1}$$

where $\{\mathbf{y}_i : i = 1, \ldots, n\}$ are events issued from the process $Y$ and the location errors $\{\mathbf{e}_i : i = 1, \ldots, n\}$ are independent and identically distributed with known isometric density function $g$. We will also assume that the errors $\mathbf{e}_i$ are independent of the true locations $\mathbf{y}_i$.

The intensity function of the observed process $Z$ is denoted by $\lambda_Z$. We use $N$ and $\{\mathbf{Z}_1, \ldots, \mathbf{Z}_N\}$ to denote the random variables corresponding respectively to the number and positions of observations in the domain $D$.

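The two-dimensional Fourier transform of the density function $g$ is

$$\mathcal{F}(g)(\mathbf{t}) = \int_{\mathbb{R}^2} e^{-i\mathbf{t}'\mathbf{z}} g(\mathbf{z})\,\nu(d\mathbf{z}),$$

where $\nu$ denotes Lebesgue measure, $\mathbf{z} = (z_{(1)}, z_{(2)})'$, $\mathbf{t} = (t_{(1)}, t_{(2)})'$ and $\mathbf{t}'\mathbf{z} = t_{(1)}z_{(1)} + t_{(2)}z_{(2)}$. Its inverse Fourier transform is

$$\mathcal{F}^{-1}(g)(\mathbf{t}) = \frac{1}{(2\pi)^2}\int_{\mathbb{R}^2} e^{i\mathbf{t}'\mathbf{z}} g(\mathbf{z})\,\nu(d\mathbf{z}).$$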
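To make explicit why Fourier transforms enter the problem, it may help to write down what model (1) implies for the intensities. With errors independent of the true locations, the observed intensity is the convolution of the true intensity with the error density (the same identity is used for the normalized intensities in the Appendix). Assuming in addition that the true intensity is integrable, so that its Fourier transform exists, a short sketch is:

```latex
\lambda_Z(\mathbf{z}) = \int_{\mathbb{R}^2} \lambda_Y(\mathbf{z}-\mathbf{e})\, g(\mathbf{e})\,\nu(d\mathbf{e})
\quad\Longrightarrow\quad
\mathcal{F}(\lambda_Z)(\mathbf{t}) = \mathcal{F}(\lambda_Y)(\mathbf{t})\,\mathcal{F}(g)(\mathbf{t}).
```

Dividing the transform of an estimate of the observed intensity by the transform of $g$, wherever the latter does not vanish, is the deconvolution step used in section 2.2.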

2.2. Intensity estimators

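Based on the observations, the basic kernel estimator of $\lambda_Z(\mathbf{s})$ is

$$\hat\lambda_h(\mathbf{s}) = \sum_{j=1}^{N} \frac{1}{h^2} K\!\left(\frac{\mathbf{s}-\mathbf{Z}_j}{h}\right),$$

where $K$ is an even two-dimensional kernel function and $h > 0$ is the bandwidth. We will assume that $K$ is a product kernel such that

$$K(\mathbf{u}) = K_0(u_{(1)})K_0(u_{(2)}), \quad \forall \mathbf{u} = (u_{(1)}, u_{(2)})' \in \mathbb{R}^2,$$

where the one-dimensional kernel $K_0$ satisfies

$$\int_{\mathbb{R}} K_0(x)\,\nu(dx) = 1 \tag{2}$$

and

$$\int_A x^p K_0(x)^q\,\nu(dx) < \infty, \quad \forall p = 1, 2, 3,\ \forall q = 1, 2,\ \forall A \subseteq \mathbb{R}. \tag{3}$$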

As the events can only be observed on the bounded domain $D$, Diggle (1985) introduced the edge-corrected kernel estimator

$$\hat\lambda_{EC,h}(\mathbf{s}) = \begin{cases} \hat\lambda_h(\mathbf{s})/p_h(\mathbf{s}) & \text{if } p_h(\mathbf{s}) \neq 0,\\ 0 & \text{otherwise,} \end{cases}$$

where

$$p_h(\mathbf{s}) = \int_D \frac{1}{h^2} K\!\left(\frac{\mathbf{s}-\mathbf{u}}{h}\right)\nu(d\mathbf{u}).$$

The normalization by $p_h(\mathbf{s})$ ensures that the estimator is asymptotically unbiased in the specific framework described in section 3, and its practical interest has also been shown (Zheng et al., 2004).
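The edge correction can be illustrated with a minimal sketch. It assumes the unit square as observation domain and a Gaussian product kernel, chosen here only because the edge term then has a closed form through the normal distribution function; this is not the band-limited kernel used later in the paper, and all function names are hypothetical.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def gaussian_k0(x: float) -> float:
    """One-dimensional Gaussian kernel (an assumption of this sketch)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def edge_mass(s, h):
    """Edge term p_h(s) on D = [0, 1]^2: kernel mass centred at s that falls inside D."""
    p = 1.0
    for coord in s:
        p *= normal_cdf(coord / h) - normal_cdf((coord - 1.0) / h)
    return p


def basic_estimate(s, points, h):
    """Uncorrected estimate: sum over observed points of h^-2 K((s - Z_j) / h)."""
    return sum(
        gaussian_k0((s[0] - z[0]) / h) * gaussian_k0((s[1] - z[1]) / h)
        for z in points
    ) / (h * h)


def edge_corrected_estimate(s, points, h):
    """Edge-corrected estimate: basic estimate divided by p_h(s), or 0 if p_h(s) = 0."""
    p = edge_mass(s, h)
    return basic_estimate(s, points, h) / p if p != 0.0 else 0.0
```

With a small bandwidth, the edge term is close to 1 in the middle of the square and close to 1/4 at a corner, so the correction multiplies the raw estimate at a corner by about four.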

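In the density estimation framework, Stefanski & Carroll (1990) introduced a deconvoluting kernel estimator adapted to noisy observations. Based on this, we introduce the deconvoluting kernel estimator of $\lambda_Y(\mathbf{s})$ defined by

$$\hat\lambda^*_h(\mathbf{s}) = \mathcal{F}^{-1}\big(\mathcal{F}(\hat\lambda_h)(\mathbf{t})/\mathcal{F}(g)(\mathbf{t})\big)(\mathbf{s}) \tag{4}$$

$$= \sum_{j=1}^{N} \frac{1}{(2\pi)^2}\int_{\mathbb{R}^2} e^{i\mathbf{s}'\mathbf{t}} \left\{ \int_{\mathbb{R}^2} e^{-i\mathbf{t}'\mathbf{z}}\, \frac{1}{h^2} K\!\left(\frac{\mathbf{z}-\mathbf{Z}_j}{h}\right)\nu(d\mathbf{z}) \Big/ \mathcal{F}(g)(\mathbf{t}) \right\}\nu(d\mathbf{t})$$

$$= \sum_{j=1}^{N} \frac{1}{h^2} K^*_h\!\left(\frac{\mathbf{s}-\mathbf{Z}_j}{h}\right), \tag{5}$$

where

$$K^*_h(\mathbf{s}) = \frac{1}{(2\pi)^2}\int_{\mathbb{R}^2} e^{i\mathbf{s}'\mathbf{t}}\, \mathcal{F}(K)(\mathbf{t})/\mathcal{F}(g)(\mathbf{t}/h)\,\nu(d\mathbf{t})$$

is the two-dimensional version of the deconvoluting kernel. We assume that $K$ is a band-limited kernel, that is, its Fourier transform has compact support, and that

$$|\mathcal{F}(g)(\mathbf{t})| > 0, \quad \forall \mathbf{t} \in \mathbb{R}^2, \tag{6}$$

where $|\cdot|$ denotes the absolute value. These assumptions ensure that the inverse Fourier transform can be applied. Notice that $K^*_h$ is a kernel since

$$\int_{\mathbb{R}^2} K^*_h(\mathbf{s})\,\nu(d\mathbf{s}) = 1. \tag{7}$$

The proofs of expressions (5) and (7) are given in the one-dimensional setting by Stefanski & Carroll (1990).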
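As a worked illustration of the deconvoluting kernel, assume (for this example only) centred Gaussian errors with covariance matrix $\sigma^2$ times the identity. Their Fourier transform, $\exp(-\sigma^2\|\mathbf{t}\|^2/2)$, never vanishes, so condition (6) holds. Since both the product kernel and this error density factorize over the two coordinates, so does the deconvoluting kernel:

```latex
K^*_h(\mathbf{s}) = \prod_{k=1}^{2} \frac{1}{2\pi}\int_{\mathbb{R}} e^{i s_{(k)} t}\,
\mathcal{F}(K_0)(t)\, \exp\!\left(\frac{\sigma^2 t^2}{2h^2}\right)\nu(dt).
```

The factor $\exp(\sigma^2 t^2/(2h^2))$ grows very quickly as $h$ decreases, which illustrates why the transform of $K_0$ is required to have compact support: each one-dimensional integral then runs over a bounded interval and stays finite.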

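Thus, the estimator $\hat\lambda_{EC,h}$ is adapted to observations on a bounded domain and the estimator $\hat\lambda^*_h$ is adapted to noisy observations. Our objective is to combine these estimators in order to build an estimator adapted to noisy observations on a bounded domain. A first possibility consists of inserting an edge-correction term before carrying out the deconvolution. This can be done by substituting the edge-corrected estimator $\hat\lambda_{EC,h}$ for the basic estimator $\hat\lambda_h$ in expression (4). Thus we obtain

$$\tilde\lambda_h(\mathbf{s}) = \mathcal{F}^{-1}\big(\mathcal{F}(\hat\lambda_{EC,h})(\mathbf{t})/\mathcal{F}(g)(\mathbf{t})\big)(\mathbf{s})$$

$$= \sum_{j=1}^{N} \frac{1}{(2\pi)^2}\int_{\mathbb{R}^2} e^{i\mathbf{s}'\mathbf{t}} \left\{ \int_{G_h} e^{-i\mathbf{t}'\mathbf{z}}\, \frac{\frac{1}{h^2} K\!\left(\frac{\mathbf{z}-\mathbf{Z}_j}{h}\right)}{p_h(\mathbf{z})}\,\nu(d\mathbf{z}) \Big/ \mathcal{F}(g)(\mathbf{t}) \right\}\nu(d\mathbf{t}),$$

where $G_h = \{\mathbf{s} \in \mathbb{R}^2 : p_h(\mathbf{s}) \neq 0\}$. Unfortunately, because of the edge-correction term $p_h(\mathbf{z})$, the inverse Fourier transform may not be well defined. This is a major difference from the estimator $\hat\lambda^*_h$ introduced previously, and it prevents any practical use. That is why we give up this first possibility and focus on another one. We decide to insert an edge-correction term after the deconvolution process, which leads to the following edge-corrected deconvoluting kernel estimator

$$\hat\lambda^*_{EC,h}(\mathbf{s}) = \begin{cases} \hat\lambda^*_h(\mathbf{s})/p^*_h(\mathbf{s}) & \text{if } p^*_h(\mathbf{s}) \neq 0,\\ 0 & \text{otherwise,} \end{cases}$$

where

$$p^*_h(\mathbf{s}) = \int_D \frac{1}{h^2} K^*_h\!\left(\frac{\mathbf{s}-\mathbf{u}}{h}\right)\nu(d\mathbf{u}).$$

The quantity $p^*_h(\mathbf{s})$ is an edge-correction term corresponding to Diggle's edge-correction term for the deconvoluting kernel $K^*_h$.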
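The edge-correction term for the deconvoluting kernel can be evaluated numerically in simple settings. The sketch below rests on three assumptions that are not part of the definition above: the domain is the unit square, the errors are centred Gaussian with covariance $\sigma^2$ times the identity, and the one-dimensional kernel has Fourier transform $(1-t^2)^3$ on $[-1,1]$ (the kernel used in section 5). Under these assumptions the term factorizes into two one-dimensional integrals, each computed here with a composite Simpson rule; function names are hypothetical.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    step = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * step)
    return total * step / 3.0


def edge_factor_1d(s, h, sigma):
    """Integral over u in [0, 1] of (1/h) K0*_h((s - u) / h) for Gaussian errors.

    Swapping the order of integration gives
    (1/pi) * int_0^1 (1 - t^2)^3 exp(sigma^2 t^2 / (2 h^2)) (sin(b t) - sin(a t)) / t dt
    with a = (s - 1) / h and b = s / h.
    """
    a, b = (s - 1.0) / h, s / h

    def integrand(t):
        weight = (1.0 - t * t) ** 3 * math.exp(sigma * sigma * t * t / (2.0 * h * h))
        if t == 0.0:
            return weight * (b - a)  # limit of (sin(b t) - sin(a t)) / t as t -> 0
        return weight * (math.sin(b * t) - math.sin(a * t)) / t

    return simpson(integrand, 0.0, 1.0) / math.pi


def p_star(s, h, sigma):
    """Edge term p*_h(s) on the unit square as a product of two 1-D factors."""
    return edge_factor_1d(s[0], h, sigma) * edge_factor_1d(s[1], h, sigma)
```

As with Diggle's original term, the value is close to 1 far from the boundary and close to 1/4 at a corner when the bandwidth is small relative to the domain.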

3. Asymptotic properties

In this section we establish the asymptotic properties of the edge-corrected deconvolutingkernel estimator �∗

EC,h in a specific asymptotic framework. For simplicity, we assume that Yis a Poisson point process.

3.1. The asymptotic framework

In most papers dealing with kernel density estimation in $\mathbb{R}^d$, the asymptotic framework is the following: the deterministic sample size $n$ tends to infinity and the bandwidth $h$ tends to 0 such that $nh^d \to \infty$. Thus, the estimated density at each point depends on an expected number of observations tending to infinity.

In point process theory, one often assumes that the expected number of observed events tends to infinity with the size of the observation domain. This is called the increasing-domain asymptotic framework (Cressie, 1993). However, in this case, if the bandwidth $h$ tends to 0, the estimated intensity at each point depends on an expected number of events tending to 0.

The solution adopted by Diggle & Marron (1988) consists of setting up an increasing-intensity asymptotic framework. Let $m_Y = \int_{\mathbb{R}^2} \lambda_Y(\mathbf{s})\,\nu(d\mathbf{s})$ denote the expected number of events. In the asymptotics $m_Y \to \infty$, Cucala & Thomas-Agnan (forthcoming) have obtained a consistent kernel estimator of $\lambda_Y(\mathbf{s})/m_Y$ in the error-free unbounded-domain case (no measurement error, unbounded domain). The quantity $\lambda_Y(\mathbf{s})/m_Y$ is the density of the event locations. This framework is quite similar to the increasing-time asymptotic framework defined by Ellis (1991) for temporal point processes.

Following the idea of Lahiri et al. (1999), another option is to set up a mixed asymptotic framework: both the intensity and the observation domain increase to infinity, but the intensity increases faster than the domain. A final option consists of considering several realizations of the process on a finite domain and letting the number of realizations tend to infinity (Kutoyants, 1998). However, this does not seem relevant as only one realization is usually observed.

To deal with our specific problem, we set up the following scheme. Let

$$m_Z = \int_D \lambda_Z(\mathbf{s})\,\nu(d\mathbf{s}) = E(N).$$

We study the asymptotic behaviour of

$$\hat\lambda^{*0}_{EC,h}(\mathbf{s}) = \frac{\hat\lambda^*_{EC,h}(\mathbf{s})}{N}\, I_{[N \neq 0]}$$

when $m_Z$ tends to infinity and $h$ tends to 0. This can be seen as the estimator of the density of the event locations. We also observe what happens when $D = \mathbb{R}^2$, that is, when the domain is unbounded.



3.2. Preliminary results

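Let $\lambda^0_Z(\mathbf{s}) = \lambda_Z(\mathbf{s})/m_Z$ and $\lambda^0_Y(\mathbf{s}) = \lambda_Y(\mathbf{s})/m_Z$. We introduce the random variable

$$X = \frac{1}{N}\sum_{i=1}^{N} f(\mathbf{Z}_i)\, I_{[N \neq 0]},$$

where $f$ is any given measurable function. As $Z$ is also a Poisson process, it has been shown (Cucala & Thomas-Agnan, forthcoming) that

$$E(X) = (1 - e^{-m_Z})\int_D f(\mathbf{s})\lambda^0_Z(\mathbf{s})\,\nu(d\mathbf{s}) \tag{8}$$

and

$$\mathrm{var}(X) = \int_D f^2(\mathbf{s})\lambda^0_Z(\mathbf{s})\,\nu(d\mathbf{s})\, A(m_Z) - \left(\int_D f(\mathbf{s})\lambda^0_Z(\mathbf{s})\,\nu(d\mathbf{s})\right)^2 \big(A(m_Z) - e^{-m_Z} + e^{-2m_Z}\big), \tag{9}$$

where

$$A(m_Z) = e^{-m_Z}\sum_{k=1}^{\infty} \frac{m_Z^k}{k\,k!} = E\left[\frac{1}{N}\, I_{[N \neq 0]}\right].$$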
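The constant $A(m_Z)$ drives the variance in (9), so it can be useful to see its size. The following sketch evaluates the series directly, working with logarithms of the terms to avoid overflow; the truncation point is an assumption of this sketch (far in the Poisson tail), and the function name is hypothetical.

```python
import math


def a_of_m(m: float) -> float:
    """A(m) = exp(-m) * sum_{k >= 1} m^k / (k * k!), i.e. E[(1/N) 1{N != 0}] for N ~ Poisson(m)."""
    if m <= 0.0:
        return 0.0
    # Truncate far beyond the bulk of the Poisson(m) distribution.
    k_max = int(m + 40.0 * math.sqrt(m) + 50.0)
    log_m = math.log(m)
    total = 0.0
    for k in range(1, k_max + 1):
        log_poisson_weight = -m + k * log_m - math.lgamma(k + 1.0)
        total += math.exp(log_poisson_weight) / k
    return total
```

For a mean of 100, the value is slightly above 1/100, which is consistent with estimating this constant by the reciprocal of the observed number of points, as is done in section 4.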

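3.3. Asymptotic expectation of the estimator $\hat\lambda^{*0}_{EC,h}(\mathbf{s})$

From (8) it follows that

$$E\big(\hat\lambda^{*0}_{EC,h}(\mathbf{s})\big) = \frac{1 - e^{-m_Z}}{(2\pi)^2\, p^*_h(\mathbf{s})} \int_{\mathbb{R}^2} e^{i\mathbf{s}'\mathbf{t}} \left\{ \int_{\mathbb{R}^2} e^{-i\mathbf{t}'\mathbf{z}} \int_D \frac{1}{h^2} K\!\left(\frac{\mathbf{z}-\mathbf{x}}{h}\right)\lambda^0_Z(\mathbf{x})\,\nu(d\mathbf{x})\,\nu(d\mathbf{z}) \Big/ \mathcal{F}(g)(\mathbf{t}) \right\}\nu(d\mathbf{t}), \tag{10}$$

for all $\mathbf{s} \in G^*_h = \{\mathbf{s} \in \mathbb{R}^2 : p^*_h(\mathbf{s}) \neq 0\}$. The asymptotic expression of this expectation is obtained by a Taylor expansion (for details see the Appendix). We get that

$$E\big(\hat\lambda^{*0}_{EC,h}(\mathbf{s})\big) = \frac{1 - e^{-m_Z}}{(2\pi)^2\, p^*_h(\mathbf{s})} \left\{ \int_{\mathbb{R}^2} e^{i\mathbf{s}'\mathbf{t}} f_{D,h}(\mathbf{t})\,\nu(d\mathbf{t}) + O(h) \right\}, \tag{11}$$

where

$$f_{D,h}(\mathbf{t}) = \frac{\displaystyle\int_{\mathbb{R}^2}\int_{\mathbb{R}^2}\int_{B_{\mathbf{z},h}} K(\mathbf{u})\,\nu(d\mathbf{u})\, e^{-i\mathbf{t}'(\mathbf{z}-\mathbf{e})}\lambda^0_Y(\mathbf{z}-\mathbf{e})\,\nu(d\mathbf{z})\, e^{-i\mathbf{t}'\mathbf{e}} g(\mathbf{e})\,\nu(d\mathbf{e})}{\displaystyle\int_{\mathbb{R}^2} e^{-i\mathbf{t}'\mathbf{e}} g(\mathbf{e})\,\nu(d\mathbf{e})}$$

and $B_{\mathbf{z},h} = \{(\mathbf{z}-\mathbf{x})/h : \mathbf{x} \in D\}$. This expression depends on the observation domain $D$ and cannot be simplified, as $\lim_{h \to 0} B_{\mathbf{z},h}$ also depends on $D$.

In the case of $D = \mathbb{R}^2$, many terms become simpler and there remains

$$E\big(\hat\lambda^{*0}_{EC,h}(\mathbf{s})\big) = \lambda^0_Y(\mathbf{s}) + \frac{h^2}{2}\int_{\mathbb{R}} x^2 K_0(x)\,\nu(dx) \left( \frac{\partial^2\lambda^0_Y}{\partial s_{(1)}^2}(\mathbf{s}) + \frac{\partial^2\lambda^0_Y}{\partial s_{(2)}^2}(\mathbf{s}) \right) + O(h^3). \tag{12}$$

Hence $\hat\lambda^{*0}_{EC,h}(\mathbf{s})$ is an asymptotically unbiased estimator of $\lambda^0_Y(\mathbf{s})$.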

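3.4. Asymptotic variance of the estimator $\hat\lambda^{*0}_{EC,h}(\mathbf{s})$

From (9) it follows that

$$\mathrm{var}\big(\hat\lambda^{*0}_{EC,h}(\mathbf{s})\big) = \frac{p^*_h(\mathbf{s})^{-2}}{h^4} \left[ \left( \int_D K^*_h\!\left(\frac{\mathbf{s}-\mathbf{z}}{h}\right)^2 \lambda^0_Z(\mathbf{z})\,\nu(d\mathbf{z}) \right) A(m_Z) - \left( \int_D K^*_h\!\left(\frac{\mathbf{s}-\mathbf{z}}{h}\right) \lambda^0_Z(\mathbf{z})\,\nu(d\mathbf{z}) \right)^2 \big(A(m_Z) - e^{-m_Z} + e^{-2m_Z}\big) \right],$$

for all $\mathbf{s} \in G^*_h$. The asymptotic expression of this variance is obtained by a Taylor expansion (see the Appendix for details). We get that

$$\mathrm{var}\big(\hat\lambda^{*0}_{EC,h}(\mathbf{s})\big) = \frac{A(m_Z)}{p^*_h(\mathbf{s})^2}\Big[O\big(h^{-2}V_{K,g}(h)\big)\Big], \tag{13}$$

where

$$V_{K,g}(h) = \int_{\mathbb{R}^2} K^*_h(\mathbf{u})^2\,\nu(d\mathbf{u}) = \int_{\mathbb{R}^2} \frac{\mathcal{F}(K)(\mathbf{t})^2}{|\mathcal{F}(g)(\mathbf{t}/h)|^2}\,\nu(d\mathbf{t})$$

from Parseval's identity. The asymptotic behaviour of $V_{K,g}(h)$ depends on the smoothness of the error density function $g$. As in the density estimation framework (Fan, 1991), $h$ must tend to 0 and $m_Z$ to infinity such that

$$A(m_Z)\,h^{-2}V_{K,g}(h) = o(1). \tag{14}$$

In the case of $D = \mathbb{R}^2$, there remains

$$\mathrm{var}\big(\hat\lambda^{*0}_{EC,h}(\mathbf{s})\big) = A(m_Z)\left[ \frac{1}{h^2}\int_{\mathbb{R}^2} \lambda^0_Z(\mathbf{s}-h\mathbf{u})\,K^*_h(\mathbf{u})^2\,\nu(d\mathbf{u}) + O(1) \right] \tag{15}$$

and assumption (14) ensures that $\hat\lambda^{*0}_{EC,h}(\mathbf{s})$ is a consistent estimator of $\lambda^0_Y(\mathbf{s})$. Notice that, when $D = \mathbb{R}^2$, $\int_{\mathbb{R}^2} \lambda_Y(\mathbf{s})\,\nu(d\mathbf{s}) = m_Z$, so that $\lambda^0_Y$ is the density function of the event locations.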

4. The bandwidth selection procedure

Every kernel method is subject to the crucial choice of the bandwidth, which is much more important than the analytical form of the kernel (Silverman, 1986). This choice has been extensively discussed in the literature and original procedures have been proposed either for the deconvolution kernel density estimation problem (Delaigle & Gijbels, 2004) or for the kernel intensity estimation problem (Xu et al., 2003).

Let us look for the bandwidth minimizing the mean integrated squared error (MISE for short)

$$\mathrm{MISE}(h) = E\left\{ \int_D \big\{\hat\lambda^{*0}_{EC,h}(\mathbf{s}) - \lambda^0_Y(\mathbf{s})\big\}^2\,\nu(d\mathbf{s}) \right\}.$$

Because of their complexity and their dependence on the domain $D$, we do not use the general expressions of the asymptotic bias and variance to set up a bandwidth selection procedure. Instead, we rely on the expressions obtained in the specific case $D = \mathbb{R}^2$. Indeed, we believe that the global choice of the bandwidth is not much affected by a correction that mainly acts on the edges of the domain.

The expression of the asymptotic MISE (AMISE for short) is obtained from expressions (12) and (15). As $\int_{\mathbb{R}^2} \lambda^0_Z(\mathbf{s})\,\nu(d\mathbf{s}) = 1$, we get that

$$\mathrm{AMISE}(h) = \frac{A(m_Z)}{(2\pi h)^2}\int_{\mathbb{R}^2} \frac{\mathcal{F}(K)(\mathbf{t})^2}{|\mathcal{F}(g)(\mathbf{t}/h)|^2}\,\nu(d\mathbf{t}) + \frac{h^4}{4}\,\mu^2 \int_{\mathbb{R}^2} \big(\nabla^2\lambda^0_Y(\mathbf{s})\big)^2\,\nu(d\mathbf{s}),$$

where $\mu = \int_{\mathbb{R}} x^2 K_0(x)\,\nu(dx)$.

Before minimizing this expression, we need to estimate two terms. The first one is

$$A(m_Z) = E\left[\frac{1}{N}\, I_{(N \neq 0)}\right]$$

and is estimated by $1/n$. For the second one,

$$I = \int_{\mathbb{R}^2} \big(\nabla^2\lambda^0_Y(\mathbf{s})\big)^2\,\nu(d\mathbf{s}),$$

we propose to use a normal-reference rule. Assuming that $\lambda^0_Y$ is the density of a Gaussian distribution with covariance matrix

$$\Sigma_Y = \begin{pmatrix} \sigma^2_{Y,1} & \rho_Y\sigma_{Y,1}\sigma_{Y,2} \\ \rho_Y\sigma_{Y,1}\sigma_{Y,2} & \sigma^2_{Y,2} \end{pmatrix},$$

we can find a function denoted $H$ such that

$$I = H(\sigma_{Y,1}, \sigma_{Y,2}, \rho_Y).$$

Letting $\hat\sigma_{Y,1}$, $\hat\sigma_{Y,2}$ and $\hat\rho_Y$ denote the standard estimators of the parameters $\sigma_{Y,1}$, $\sigma_{Y,2}$ and $\rho_Y$, we can thus estimate $I$ by

$$H(\hat\sigma_{Y,1}, \hat\sigma_{Y,2}, \hat\rho_Y).$$

The complete expressions of $H$ and of the parameter estimators are given in the Appendix.
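The selection rule of this section can be sketched numerically under simplifying assumptions that go beyond the text: centred Gaussian errors with covariance $\sigma_e^2$ times the identity, the band-limited kernel of section 5 (for which the second moment of $K_0$ equals 6, since its Fourier transform is $(1-t^2)^3 = 1 - 3t^2 + \cdots$), and an isotropic, uncorrelated Gaussian reference with standard deviation $\sigma_Y$, for which the roughness term reduces to $1/(2\pi\sigma_Y^6)$. A plain grid search stands in for whatever minimizer is used in practice; all function names and the grid bounds are hypothetical.

```python
import math

MU = 6.0  # second moment of K0 when F(K0)(t) = (1 - t^2)^3 on [-1, 1]


def simpson(f, a, b, n=400):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    step = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * step)
    return total * step / 3.0


def variance_integral(h, sigma_e):
    """Integral of F(K)(t)^2 / |F(g)(t/h)|^2 over R^2 for Gaussian errors.

    Both factors are products over coordinates, so the 2-D integral is the
    square of a 1-D integral. Very small h relative to sigma_e overflows exp.
    """
    c = (sigma_e / h) ** 2
    one_dim = 2.0 * simpson(lambda t: (1.0 - t * t) ** 6 * math.exp(c * t * t), 0.0, 1.0)
    return one_dim ** 2


def amise(h, n, sigma_e, sigma_y):
    """AMISE(h) with A(m_Z) estimated by 1/n and an isotropic Gaussian reference."""
    roughness = 1.0 / (2.0 * math.pi * sigma_y ** 6)
    variance_term = variance_integral(h, sigma_e) / (n * (2.0 * math.pi * h) ** 2)
    squared_bias_term = h ** 4 / 4.0 * MU ** 2 * roughness
    return variance_term + squared_bias_term


def select_bandwidth(n, sigma_e, sigma_y, h_min=0.01, h_max=0.5, steps=491):
    """Grid search for the AMISE-minimizing bandwidth (a stand-in minimizer)."""
    grid = [h_min + i * (h_max - h_min) / (steps - 1) for i in range(steps)]
    return min(grid, key=lambda h: amise(h, n, sigma_e, sigma_y))
```

In this sketch a larger error standard deviation makes the variance term fall more steeply in $h$, so the selected bandwidth does not decrease when the errors get larger.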

5. A simulation study

In the examples of this section and of section 6, we applied the product kernel built from

$$K_0(x) = \frac{48}{\pi}\,\frac{x^3\cos(x) - 6x^2\sin(x) + 15\sin(x) - 15x\cos(x)}{x^7},$$

which is the one-dimensional band-limited kernel used by Delaigle & Gijbels (2004) and satisfies assumptions (2) and (3).

Its Fourier transform is

$$\mathcal{F}(K)(\mathbf{t}) = (1 - t_{(1)}^2)^3 (1 - t_{(2)}^2)^3\, I_{[-1,1]^2}(\mathbf{t}).$$

The Fourier transform of the error density function $g$ can usually be calculated analytically. As in Stefanski & Carroll (1990), the inverse Fourier transforms were evaluated by a numerical Simpson procedure, which is slower but more accurate than the Fast Fourier Transform (FFT) procedure.
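A minimal sketch of the Simpson-based inversion, in one dimension: it recovers $K_0$ from its Fourier transform and compares the result with the closed form above, then applies the same routine to a deconvoluting kernel. The Gaussian error model in the last function and all function names are assumptions of this sketch, not a description of the programs used for the paper.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    step = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * step)
    return total * step / 3.0


def k0_closed_form(x):
    """Closed form of K0; unstable near x = 0 because of cancellation against x^7."""
    numerator = (x ** 3 * math.cos(x) - 6.0 * x ** 2 * math.sin(x)
                 + 15.0 * math.sin(x) - 15.0 * x * math.cos(x))
    return 48.0 / math.pi * numerator / x ** 7


def k0_by_inversion(x):
    """K0(x) = (1/pi) * int_0^1 (1 - t^2)^3 cos(t x) dt, using evenness of the transform."""
    return simpson(lambda t: (1.0 - t * t) ** 3 * math.cos(t * x), 0.0, 1.0) / math.pi


def deconvoluting_k0(x, h, sigma):
    """1-D deconvoluting kernel for Gaussian errors with standard deviation sigma."""
    def integrand(t):
        return ((1.0 - t * t) ** 3
                * math.exp(sigma * sigma * t * t / (2.0 * h * h))
                * math.cos(t * x))
    return simpson(integrand, 0.0, 1.0) / math.pi
```

The inversion route is also the safer way to evaluate $K_0$ near the origin, where the closed form suffers from cancellation; at the origin it gives $16/(35\pi)$.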

An inhomogeneous Poisson process with intensity

$$\lambda_Y(\mathbf{s}) = C\left[1 + 0.7\cos\big(2\pi(\|\mathbf{s}\| - 0.5)\big)\right]$$

was simulated in $[0,1]^2$ enlarged by a guard area, using an acceptance-rejection method (Gentle, 2002). The constant $C$ was chosen such that the expected number of events in $[0,1]^2$ is 100. Then the location errors $\{\mathbf{e}_i\}$ were simulated and added to the simulated locations $\{\mathbf{y}_i\}$ according to model (1). We only retained the events $\{\mathbf{z}_i\}$ located in $[0,1]^2$.
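The simulation steps above can be sketched as follows. The guard width (0.2), the Gaussian error model, the midpoint rule used to compute the constant $C$ and all function names are assumptions of this sketch; the acceptance-rejection step is implemented as thinning of a homogeneous Poisson process with the dominating rate $1.7\,C$.

```python
import math
import random


def shape(s):
    """Unnormalised intensity 1 + 0.7 cos(2 pi (||s|| - 0.5)); it lies in [0.3, 1.7]."""
    return 1.0 + 0.7 * math.cos(2.0 * math.pi * (math.hypot(s[0], s[1]) - 0.5))


def normalising_constant(expected=100.0, grid=200):
    """C such that the intensity integrates to `expected` over [0, 1]^2 (midpoint rule)."""
    cell = 1.0 / grid
    integral = sum(
        shape(((i + 0.5) * cell, (j + 0.5) * cell))
        for i in range(grid) for j in range(grid)
    ) * cell * cell
    return expected / integral


def poisson_draw(mean, rng):
    """Poisson variate: number of unit-rate exponential arrivals before time `mean`."""
    count, arrival = 0, rng.expovariate(1.0)
    while arrival < mean:
        count += 1
        arrival += rng.expovariate(1.0)
    return count


def simulate_observed_pattern(sigma, guard=0.2, seed=0):
    """Simulate true events on the enlarged square, add errors, keep points in [0, 1]^2."""
    rng = random.Random(seed)
    c = normalising_constant()
    lo, hi = -guard, 1.0 + guard
    n_candidates = poisson_draw(1.7 * c * (hi - lo) ** 2, rng)
    observed = []
    for _ in range(n_candidates):
        y = (rng.uniform(lo, hi), rng.uniform(lo, hi))
        if rng.random() * 1.7 > shape(y):
            continue  # candidate rejected by thinning
        z = (y[0] + rng.gauss(0.0, sigma), y[1] + rng.gauss(0.0, sigma))
        if 0.0 <= z[0] <= 1.0 and 0.0 <= z[1] <= 1.0:
            observed.append(z)
    return observed
```

The guard area matters because true events just outside the unit square can be displaced into it by their errors; without it, the observed pattern would be artificially sparse near the boundary.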

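From the simulated sample, we computed the estimates $\hat\lambda_{EC,h_{opt}}$, $\hat\lambda^*_{h^*}$ and $\hat\lambda^*_{EC,h^*}$, where $h_{opt}$ is the bandwidth obtained by the classical normal-reference rule (Silverman, 1986, p. 85) and $h^*$ is the bandwidth obtained by the procedure described in section 4. Let

$$\mathrm{ISE}_{EC} = \int_{[0,1]^2} \big(\hat\lambda_{EC,h_{opt}}(\mathbf{s}) - \lambda_Y(\mathbf{s})\big)^2\,\nu(d\mathbf{s}),$$

$$\mathrm{ISE}^{*} = \int_{[0,1]^2} \big(\hat\lambda^*_{h^*}(\mathbf{s}) - \lambda_Y(\mathbf{s})\big)^2\,\nu(d\mathbf{s}),$$

and

$$\mathrm{ISE}^{*}_{EC} = \int_{[0,1]^2} \big(\hat\lambda^*_{EC,h^*}(\mathbf{s}) - \lambda_Y(\mathbf{s})\big)^2\,\nu(d\mathbf{s})$$

denote the integrated squared errors (ISE) associated with each estimator.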
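Each of these integrals can be approximated on a regular grid. The sketch below uses a midpoint rule on the unit square; the grid size and the function name are assumptions, and the two arguments are any callables mapping a point to an intensity value.

```python
def integrated_squared_error(estimate, truth, grid=100):
    """Midpoint-rule approximation of the integral of (estimate - truth)^2 over [0, 1]^2."""
    cell = 1.0 / grid
    total = 0.0
    for i in range(grid):
        for j in range(grid):
            s = ((i + 0.5) * cell, (j + 0.5) * cell)
            diff = estimate(s) - truth(s)
            total += diff * diff
    return total * cell * cell
```

The empirical quartiles reported in the tables are then quartiles of such values over repeated simulations.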



Table 1. Gaussian errors (σ = 0.05, σ = 0.1)

σ = 0.05                 ISE_EC    ISE*      ISE*_EC
First quartile (×10³)    0.7923    1.1019    0.7844
Median (×10³)            0.9537    1.3250    0.9398
Third quartile (×10³)    1.2341    1.5001    1.2116

This procedure was repeated 100 times and we computed the empirical quartiles of $\mathrm{ISE}_{EC}$, $\mathrm{ISE}^{*}$ and $\mathrm{ISE}^{*}_{EC}$. Table 1 gives the results when the location errors are Gaussian random variables with mean $(0, 0)'$ and covariance matrix

$$\Sigma = \begin{pmatrix} \sigma^2 & 0 \\ 0 & \sigma^2 \end{pmatrix}.$$

Table 2 gives the results with Laplace errors having the same mean and covariance matrix. The results of the estimator $\hat\lambda^*_{h^*}$ are worse than the ones obtained by the classical Diggle estimator $\hat\lambda_{EC,h_{opt}}$. This suggests that, when the error variance is not too large, it is more important to deal with edge effects than with location errors. But, in each case, the estimator $\hat\lambda^*_{EC,h^*}$ leads to slightly improved results, so that both deconvolution and edge correction should be considered when dealing with perturbed locations in a bounded domain.

Finally, we consider how to handle uniform location errors. In this case, condition (6) is not satisfied and there is no appropriate deconvoluting intensity estimator. A solution consists of using a deconvoluting estimator adapted to another error distribution. To illustrate this, Table 3 shows the results obtained when applying the deconvoluting kernel estimator adapted to Laplace (with exponent $*L$) or Gaussian (with exponent $*G$) errors to data perturbed by uniform errors. The simulation procedure remains the same.

It appears that, even when the error distribution is mis-specified, the deconvoluting kernel estimator remains useful. This agrees with the results obtained by Hesse (1999) in the deconvoluting kernel density estimation framework: he asserts that it is more important to specify the error variance than the error distribution.

Table 2. Laplace errors (σ = 0.05, σ = 0.1)

                         ISE_EC    ISE*      ISE*_EC

Table 3. Uniform errors (σ = 0.05)

                         ISE_EC    ISE*L     ISE*L_EC    ISE*G     ISE*G_EC
First quartile (×10³)    0.6423    1.2746    0.6214      1.2861    0.6009
Median (×10³)            0.9647    1.6796    0.9541      1.7024    0.9488
Third quartile (×10³)    1.1079    1.8236    1.0967      1.8462    1.0884



6. An application to real data

In this section we illustrate our method on the spatial distribution of trees observed at the Paracou site. This experimental station is located in the coastal part of French Guiana and is used for various ecological studies. It contains 14 experimental permanent sample plots of 6.25 ha each and one of 16 ha. In 1984, on each plot, all trees with a diameter at breast height greater than 10 cm were located by Cartesian coordinates and botanically identified, when possible. These data are provided by the Forest Department of CIRAD (Gourlet-Fleury et al., 2004).

As GPS does not work well around the equator and is not at all precise under canopy, the trees were located using a rope-axis system, a decametre and a compass. Thus the tree locations were estimated independently, and the location error is the sum of the metrology error, a poor estimation of the centre of the tree (as the trunk could be far from circular in this tropical context) and various entry errors. Finally, the location errors are suspected to be Gaussian random variables with standard deviation equal to 4 m.

Figure 1 presents the results obtained when applying both the classical Diggle estimator (on the left) and the edge-corrected deconvoluting kernel estimator (on the right) to one of the data sets from Paracou. This data set represents the spatial distribution of a tree species called Dicorynia. We notice that many of the observed trees, each represented by a single dot, are located close to the boundary of the observation window, so that edge correction seems essential. The estimated standard deviation of the location errors is quite large compared with the size of the plot, so that using deconvolution seems appealing. However, as in the simulation study, the difference between $\hat\lambda_{EC,h_{opt}}$ and $\hat\lambda^*_{EC,h^*}$ is not striking. Both estimators exhibit three high-intensity zones, which we may call clusters, and which are slightly less pronounced when the location errors are taken into account. The one located at the top of the observation window is the most significant and the estimated intensity in this area is much higher than anywhere else. It would be interesting to analyse the characteristics of this specific area that lead to such a high concentration of Dicorynia.

Fig. 1. Left: Contours of $\hat\lambda_{EC,h_{opt}}$. Right: Contours of $\hat\lambda^*_{EC,h^*}$.

7. Discussion

As point patterns are usually observed in a bounded domain and located with errors, using an intensity estimator adapted to these two limitations seems necessary. The edge-corrected deconvoluting kernel estimator presented here is a starting point for solving this problem.

Theoretically speaking, it is not consistent, but we believe that this comes from the difficulty of the problem. Indeed, if the support of the density function $g$ is $\mathbb{R}^2$, the estimation of $\lambda_Y$ at each point of $D$ requires the estimation of $\lambda_Z$ at each point of $\mathbb{R}^2$. Here $\lambda_Z$ may be highly underestimated in $\bar{D}$, the complement of $D$, as events in this domain are not observed. However, we should not forget that this asymptotic result has been obtained under the Poisson assumption. It would be worth comparing it with the results obtained under alternative hypotheses, such as Cox processes.

Practically speaking, we may wonder whether the bandwidth selected by the procedure described in section 4 is close to the ideal bandwidth. An adaptation of the specific procedures introduced by Delaigle & Gijbels (2004) in the density estimation framework should be considered.

Finally, we may sometimes suspect that location errors are spatially dependent, for example when the observation domain is heterogeneous. In that case, the deconvolution method is no longer valid and a new way to handle the problem still has to be found.

Acknowledgements

This research project is within the scope of my PhD thesis work under the direction of Christine Thomas-Agnan. I gratefully thank her for her constant help and support. It was initiated by my visit to Ohio State University and my rewarding discussions with Noel Cressie. I gratefully acknowledge an associate editor and two referees for their valuable remarks on a first version of the paper, which led to a considerable improvement of the original manuscript. I also thank Renaud Marty for his advice on English writing. All programs are written in R and are available on request.

References

Bar-Hen, A., Chadœuf, J., Dessard, H. & Monestiez, P. (2005). Estimating distance functions of point processes with known independent noise. Unpublished manuscript.

Byers, S. D. & Raftery, A. E. (2002). Bayesian estimation and segmentation of spatial point processes using Voronoi tilings. In Spatial cluster modelling (eds A. B. Lawson & D. G. T. Denison), 109–121. Chapman and Hall, London.

Carroll, R., Maca, J. & Ruppert, D. (1999). Nonparametric regression in the presence of measurement error. Biometrika 86, 541–554.

Cressie, N. A. C. (1993). Statistics for spatial data. Wiley, New York.

Cucala, L. & Thomas-Agnan, C. (forthcoming). Données spatiales. In Approches non-paramétriques en régression (eds J.-J. Droesbeke & G. Saporta). Éditions Technip, Paris (to appear).

Delaigle, A. & Gijbels, I. (2004). Practical bandwidth selection in deconvolution kernel density estimation. Comput. Stat. Data Anal. 45, 249–267.

Diggle, P. J. (1985). A kernel method for smoothing point process data. J. Roy. Statist. Soc. Ser. C 34, 138–147.

Diggle, P. J. & Hall, P. (1993). A Fourier approach to nonparametric deconvolution of a density estimate. J. Roy. Statist. Soc. Ser. B Stat. Methodol. 55, 523–531.

Diggle, P. J. & Marron, J. S. (1988). Equivalence of smoothing parameter selectors in density and intensity estimation. J. Amer. Statist. Assoc. 83, 793–800.

Ellis, S. P. (1991). Density estimation for point processes. Stochast. Process. Appl. 39, 345–358.

Fan, J. (1991). On the optimal rates of convergence for nonparametric deconvolution problems. Ann. Statist. 19, 1257–1272.

Gentle, J. E. (2002). Elements of computational statistics. Springer-Verlag, New York.

Gourlet-Fleury, S., Ferry, B., Molino, J.-F., Petronelli, P. & Schmitt, L. (2004). Experimental plots: key features. In Ecology and management of a neotropical rainforest. Lessons drawn from Paracou, a long-term experimental research site in French Guiana (eds S. Gourlet-Fleury, J.-M. Guehl & O. Laroussinie). Elsevier, Paris.

Heikkinen, J. & Arjas, E. (1998). Non-parametric Bayesian estimation of a spatial Poisson intensity. Scand. J. Statist. 25, 435–450.

Hesse, C. (1999). Data-driven deconvolution. J. Nonparametr. Statist. 10, 343–373.

Kutoyants, Y. A. (1998). Statistical inference for spatial point processes. Lecture Notes in Statist., 134. Springer-Verlag, New York.

Lahiri, S. N., Kaiser, M. S., Cressie, N. & Hsu, N. (1999). Prediction of spatial cumulative distribution functions using subsampling. With comments and a rejoinder by the authors. J. Amer. Statist. Assoc. 94, 86–110.

Lund, J. & Rudemo, M. (2000). Models for point processes observed with noise. Biometrika 87, 235–249.

Møller, J. & Waagepetersen, R. P. (2004). Statistical inference and simulation for spatial point processes. Chapman & Hall, Boca Raton.

Ogata, Y. & Katsura, K. (1986). Point-process models with linearly parameterized intensity for application to earthquake data. J. Appl. Probab. 23A, 291–310.

Silverman, B. W. (1986). Density estimation for statistics and data analysis. Chapman and Hall, London.

Stefanski, L. & Carroll, R. J. (1990). Deconvoluting kernel density estimators. Statistics 21, 169–184.

Xu, C., Dowd, P. A., Mardia, K. V. & Fowell, R. J. (2003). Stochastic approaches to fracture modelling. Proceedings of IAMG 2003. Portsmouth, UK.

Zheng, P., Durr, P. & Diggle, P. (2004). Edge-correction for spatial kernel smoothing – when is it necessary? Proceedings of the GisVet Conference 2004, University of Guelph, Ontario, Canada.

Received May 2006, in final form October 2007

Lionel Cucala, Université Paul Sabatier, Laboratoire de Statistique et Probabilités, UMR C5583, 31062 Toulouse Cedex 9, France.
E-mail: [email protected]

Appendix

Proof of formulas (11) and (12)

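Let

$$J = \int_D \frac{1}{h^2} K\!\left(\frac{\mathbf{z}-\mathbf{x}}{h}\right)\lambda^0_Z(\mathbf{x})\,\nu(d\mathbf{x}) = \int_{\mathbb{R}^2}\int_D \frac{1}{h^2} K\!\left(\frac{\mathbf{z}-\mathbf{x}}{h}\right)\lambda^0_Y(\mathbf{x}-\mathbf{e})\,\nu(d\mathbf{x})\, g(\mathbf{e})\,\nu(d\mathbf{e})$$

$$= \int_{\mathbb{R}^2}\int_{B_{\mathbf{z},h}} K(\mathbf{u})\,\lambda^0_Y(\mathbf{z}-h\mathbf{u}-\mathbf{e})\,\nu(d\mathbf{u})\, g(\mathbf{e})\,\nu(d\mathbf{e}).$$

A Taylor–Lagrange expansion gives

$$\lambda^0_Y(\mathbf{z}-h\mathbf{u}-\mathbf{e}) = \lambda^0_Y(\mathbf{z}-\mathbf{e}) - h\left( u_{(1)}\frac{\partial\lambda^0_Y}{\partial s_{(1)}}(\tilde{\mathbf{z}}) + u_{(2)}\frac{\partial\lambda^0_Y}{\partial s_{(2)}}(\tilde{\mathbf{z}}) \right),$$

where $\tilde{\mathbf{z}} = (z_{(1)} - e_{(1)} - \theta_1 h u_{(1)},\ z_{(2)} - e_{(2)} - \theta_2 h u_{(2)})'$, $\theta_1 \in [0,1]$, $\theta_2 \in [0,1]$. This leads to

$$J = \int_{\mathbb{R}^2} \left\{ \lambda^0_Y(\mathbf{z}-\mathbf{e})\int_{B_{\mathbf{z},h}} K(\mathbf{u})\,\nu(d\mathbf{u}) - h\frac{\partial\lambda^0_Y}{\partial s_{(1)}}(\tilde{\mathbf{z}})\int_{B_{\mathbf{z},h}} u_{(1)}K(\mathbf{u})\,\nu(d\mathbf{u}) - h\frac{\partial\lambda^0_Y}{\partial s_{(2)}}(\tilde{\mathbf{z}})\int_{B_{\mathbf{z},h}} u_{(2)}K(\mathbf{u})\,\nu(d\mathbf{u}) \right\} g(\mathbf{e})\,\nu(d\mathbf{e})$$

$$= \int_{\mathbb{R}^2} \left\{ \lambda^0_Y(\mathbf{z}-\mathbf{e})\int_{B_{\mathbf{z},h}} K(\mathbf{u})\,\nu(d\mathbf{u}) \right\} g(\mathbf{e})\,\nu(d\mathbf{e}) + O(h)$$

from hypothesis (3) and the assumptions concerning $\lambda_Y$. Expression (11) is obtained by substituting $J$ into expression (10).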

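Now, in order to prove expression (12), we use the second-order expansion

$$\lambda^0_Y(\mathbf{z}-h\mathbf{u}-\mathbf{e}) = \lambda^0_Y(\mathbf{z}-\mathbf{e}) - h\left( u_{(1)}\frac{\partial\lambda^0_Y}{\partial s_{(1)}}(\mathbf{z}-\mathbf{e}) + u_{(2)}\frac{\partial\lambda^0_Y}{\partial s_{(2)}}(\mathbf{z}-\mathbf{e}) \right)$$

$$\quad + h^2\left( \frac{u_{(1)}^2}{2}\frac{\partial^2\lambda^0_Y}{\partial s_{(1)}^2}(\mathbf{z}-\mathbf{e}) + \frac{u_{(2)}^2}{2}\frac{\partial^2\lambda^0_Y}{\partial s_{(2)}^2}(\mathbf{z}-\mathbf{e}) + u_{(1)}u_{(2)}\frac{\partial^2\lambda^0_Y}{\partial s_{(1)}\partial s_{(2)}}(\mathbf{z}-\mathbf{e}) \right) + O(h^3).$$

When $D = \mathbb{R}^2$, $B_{\mathbf{z},h} = \mathbb{R}^2$ and the term $J$ becomes

$$\int_{\mathbb{R}^2} \Bigg\{ \lambda^0_Y(\mathbf{z}-\mathbf{e})\int_{\mathbb{R}^2} K(\mathbf{u})\,\nu(d\mathbf{u}) - h\frac{\partial\lambda^0_Y}{\partial s_{(1)}}(\mathbf{z}-\mathbf{e})\int_{\mathbb{R}^2} u_{(1)}K(\mathbf{u})\,\nu(d\mathbf{u}) - h\frac{\partial\lambda^0_Y}{\partial s_{(2)}}(\mathbf{z}-\mathbf{e})\int_{\mathbb{R}^2} u_{(2)}K(\mathbf{u})\,\nu(d\mathbf{u})$$

$$\quad + \frac{h^2}{2}\frac{\partial^2\lambda^0_Y}{\partial s_{(1)}^2}(\mathbf{z}-\mathbf{e})\int_{\mathbb{R}^2} u_{(1)}^2 K(\mathbf{u})\,\nu(d\mathbf{u}) + \frac{h^2}{2}\frac{\partial^2\lambda^0_Y}{\partial s_{(2)}^2}(\mathbf{z}-\mathbf{e})\int_{\mathbb{R}^2} u_{(2)}^2 K(\mathbf{u})\,\nu(d\mathbf{u}) + O(h^3) \Bigg\}\, g(\mathbf{e})\,\nu(d\mathbf{e})$$

$$= \int_{\mathbb{R}^2} \left\{ \lambda^0_Y(\mathbf{z}-\mathbf{e}) + \frac{h^2}{2}\int_{\mathbb{R}} x^2 K_0(x)\,\nu(dx) \left( \frac{\partial^2\lambda^0_Y}{\partial s_{(1)}^2}(\mathbf{z}-\mathbf{e}) + \frac{\partial^2\lambda^0_Y}{\partial s_{(2)}^2}(\mathbf{z}-\mathbf{e}) \right) + O(h^3) \right\} g(\mathbf{e})\,\nu(d\mathbf{e})$$

from the properties of the kernel $K$: it integrates to 1 and, being an even product kernel, its first-order moments and its mixed moment vanish. As $D = \mathbb{R}^2$, we also have that $p^*_h(\mathbf{s}) = 1$ for all $\mathbf{s} \in \mathbb{R}^2$, and we obtain expression (12).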

Proof of formulas (13) and (15)

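The usual change of variables leads to

$$\frac{1}{h^2}\int_D K^*_h\!\left(\frac{\mathbf{s}-\mathbf{z}}{h}\right)^2 \lambda^0_Z(\mathbf{z})\,\nu(d\mathbf{z}) = \int_{B_{\mathbf{s},h}} K^*_h(\mathbf{u})^2\,\lambda^0_Z(\mathbf{s}-h\mathbf{u})\,\nu(d\mathbf{u}) = O\big(V_{K,g}(h)\big)$$

since $\lambda_Z$ is bounded, as is $\lambda_Y$. From expression (7), we also get that

$$\frac{1}{h^2}\int_D K^*_h\!\left(\frac{\mathbf{s}-\mathbf{z}}{h}\right)\lambda^0_Z(\mathbf{z})\,\nu(d\mathbf{z}) = O(1).$$

As $e^{-m_Z} = o\big(A(m_Z)\big)$, expression (13) is obtained.

When $D = \mathbb{R}^2$, then $B_{\mathbf{s},h} = \mathbb{R}^2$ and $p^*_h(\mathbf{s}) = 1$ for all $\mathbf{s} \in \mathbb{R}^2$. Expression (15) follows.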



Details for the bandwidth selection procedure

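The Gaussian assumption leads to

$$\begin{aligned}
\int_{\mathbb{R}^2} \big(\nabla^2\lambda^0_Y(\mathbf{s})\big)^2\,\nu(d\mathbf{s})
&= \frac{(\sigma^2_{Y,1} + \sigma^2_{Y,2})^2}{4\pi\,\sigma^5_{Y,1}\sigma^5_{Y,2}(1-\rho^2_Y)^{5/2}}
- \frac{\sigma^2_{Y,1} + \sigma^2_{Y,2}}{4\pi\,\sigma^3_{Y,1}\sigma^5_{Y,2}(1-\rho^2_Y)^{3/2}}
+ \frac{3}{16\pi\,\sigma_{Y,1}\sigma^5_{Y,2}\sqrt{1-\rho^2_Y}} \\
&\quad - \frac{\sqrt{1-\rho^2_Y}\,\big(\sigma^4_{Y,1}\rho^2_Y + \sigma^2_{Y,1}\sigma^2_{Y,2}(1+\rho^2_Y) + \sigma^4_{Y,2}\big)}{4\pi\,\sigma^5_{Y,1}\sigma^5_{Y,2}(1-\rho^2_Y)^3}
+ \frac{(3\rho^2_Y\sigma^2_{Y,1} + \sigma^2_{Y,2})\sqrt{1-\rho^2_Y}}{8\pi\,\sigma^3_{Y,1}\sigma^5_{Y,2}(1-\rho^2_Y)^2} \\
&\quad + \frac{3\sqrt{1-\rho^2_Y}\,(\rho^2_Y\sigma^2_{Y,1} + \sigma^2_{Y,2})^2}{16\pi\,\sigma^5_{Y,1}\sigma^5_{Y,2}(1-\rho^2_Y)^3} \\
&= H(\sigma_{Y,1}, \sigma_{Y,2}, \rho_Y).
\end{aligned}$$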

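Let

$$\hat\Sigma_Z = \begin{pmatrix} \hat\sigma^2_{Z,1} & \hat\rho_Z\hat\sigma_{Z,1}\hat\sigma_{Z,2} \\ \hat\rho_Z\hat\sigma_{Z,1}\hat\sigma_{Z,2} & \hat\sigma^2_{Z,2} \end{pmatrix}$$

denote the empirical covariance matrix computed from the observations $\{\mathbf{z}_1, \ldots, \mathbf{z}_n\}$ and

$$\Sigma_e = \begin{pmatrix} \sigma^2_{e,1} & \rho_e\sigma_{e,1}\sigma_{e,2} \\ \rho_e\sigma_{e,1}\sigma_{e,2} & \sigma^2_{e,2} \end{pmatrix}$$

denote the covariance matrix associated with the error density function $g$. From expression (1) and the independence assumptions, natural estimators for the parameters of $\lambda^0_Y$ are given by

$$\hat\sigma^2_{Y,1} = \hat\sigma^2_{Z,1} - \sigma^2_{e,1},$$

$$\hat\sigma^2_{Y,2} = \hat\sigma^2_{Z,2} - \sigma^2_{e,2},$$

$$\hat\rho_Y = \frac{\hat\rho_Z\hat\sigma_{Z,1}\hat\sigma_{Z,2} - \rho_e\sigma_{e,1}\sigma_{e,2}}{\sqrt{(\hat\sigma^2_{Z,1} - \sigma^2_{e,1})(\hat\sigma^2_{Z,2} - \sigma^2_{e,2})}}.$$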
