Hindawi Publishing Corporation
Journal of Applied Mathematics
Volume 2013, Article ID 531062, 7 pages
http://dx.doi.org/10.1155/2013/531062

Research Article
Generalized Linear Spatial Models to Predict Slate Exploitability

Angeles Saavedra,1,2 Javier Taboada,3 María Araújo,3 and Eduardo Giráldez3

1 Department of Statistics, University of Vigo, 36310 Vigo, Spain
2 E.T.S.I. MINAS, Universidad de Vigo, Campus Lagoas-Marcosende, Rúa Maxwell, 36310 Vigo, Spain
3 Department of Natural Resources, University of Vigo, 36310 Vigo, Spain

Correspondence should be addressed to Angeles Saavedra; saavedra@uvigo.es

Received 11 November 2012; Revised 3 May 2013; Accepted 26 May 2013

Academic Editor: Zhiping Qiu

Copyright © 2013 Angeles Saavedra et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

The aim of this research was to determine the variables that characterize slate exploitability and to model its spatial distribution. A generalized linear spatial model (GLSM) was fitted in order to explore the relationship between exploitability and different explanatory variables that characterize slate quality. Modelling the influence of these variables and analysing the spatial distribution of the model residuals yielded a GLSM that allows slate exploitability to be predicted more effectively than when using generalized linear models (GLMs), which do not take spatial dependence into account. Studying the residuals and comparing the prediction capacities of the two models led us to conclude that the GLSM is more appropriate when the response variable is spatially distributed.

1. Introduction

The exploitability of a slate deposit depends on many quality-determining factors that are spatially correlated. Knowledge and study of these factors are essential for the evaluation of deposits [1, 2]. Therefore, it can be fairly safely assumed that better evaluations of quality parameters and better predictions of slate exploitability could be obtained by using specific statistical models that take spatial correlation into account.

Traditionally, the main aim of geostatistical models has been to predict a spatially correlated response variable. Under this approach, estimating the parameters of the geostatistical model is not usually the main interest. However, estimating and inferring parameters enables a more precise identification of the factors influencing the geographical distribution of exploitable slate, thus allowing greater knowledge to be gained regarding the response variable of interest.

In our research, the model-based geostatistics methodology was adapted in the analysis of slate exploitability using a generalized linear spatial model (GLSM). With this type of model, the objective of inference can be focused on the parameters of the regression function, on the properties of the residuals, or on the distribution of the residuals conditionally on the response variable.

A brief description of the statistical models used in this study is given in Section 2. The data studied and the formulated model are described in Section 3. The statistical results and analyses are presented in Section 4, and some comments and the discussion can be found in Section 5.

2. Statistical Analysis Methods

2.1. Generalized Linear Spatial Models. Generalized linear models (GLMs) were introduced by [3] and studied in depth by [4] and later by several authors (see [5–9]).

In a GLM, a response variable $Y = (Y_1, Y_2, \ldots, Y_n)$ is assumed such that the variables $Y_1, Y_2, \ldots, Y_n$ are mutually independent, with expected value related to a linear predictor through $E[Y] = g^{-1}(d^T \beta)$, where $\beta \in \mathbb{R}^p$ is a vector of unknown regression parameters, $d$ are known explanatory variables, and $g$ is a known function called the link function.
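The relation $E[Y] = g^{-1}(d^T \beta)$ can be illustrated with a minimal Python sketch. It assumes a logit link (the link adopted later for the binary response); the function names, the design matrix `D`, and the coefficient values are hypothetical and are not taken from the paper, whose analysis was carried out in R.

```python
import numpy as np

def inverse_logit(eta):
    """Inverse of the logit link g(p) = log(p / (1 - p))."""
    return 1.0 / (1.0 + np.exp(-eta))

def glm_mean(D, beta, inverse_link=inverse_logit):
    """Expected response E[Y] = g^{-1}(D beta) for an n x p design matrix D."""
    eta = D @ beta          # linear predictor, one value per observation
    return inverse_link(eta)

# Hypothetical design matrix: intercept column plus one explanatory variable.
D = np.array([[1.0, -1.0],
              [1.0,  0.0],
              [1.0,  2.0]])
beta = np.array([0.5, 1.0])   # illustrative values, not estimates from the paper
mu = glm_mean(D, beta)        # one expected value per row of D
```

Passing a different `inverse_link` (for example `np.exp` for a log link) changes the response scale without touching the linear predictor, which is the separation the GLM formulation relies on.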

An important extension of the GLM is the generalized linear mixed model (GLMM) [10], in which the response variables are considered independent of one another conditionally on the values of a set of latent variables. The generalized linear spatial model (GLSM) [11] is basically a GLMM in which the latent variables derive from a spatial process. The term "model-based geostatistics" was first used by these authors to describe an approach to geostatistical problems based on formal statistical models and inference procedures. This leads to the following specification of the model.

Consider $n$ different locations $\{x_1, \ldots, x_n\} \subset I \subset \mathbb{R}^2$, and assume that a realization $y = (y_1, \ldots, y_n)^T$ of $Y$ is observed, where $y_i = Y(x_i)$.

Let $S = \{S(x) : x \in I\}$, $I \subset \mathbb{R}^2$, be a Gaussian process with mean function $E[S(x)] = d(x)^T \beta$ and covariance $\operatorname{cov}(S(x), S(x')) = \sigma^2 \rho(x, x'; \varphi) + \tau^2 \mathbf{1}\{x = x'\}$, where $\beta \in \mathbb{R}^p$ is, as in the GLM case, a vector of unknown regression parameters; $d(x)$ are known explanatory variables, now with spatial dependence; $\rho(x, x'; \varphi)$ is a correlation function in $\mathbb{R}^2$; $\varphi$ is a scale parameter that controls the speed at which spatial correlation approaches 0 as the distance between locations grows; and, finally, $\tau^2 \ge 0$ is known as the nugget effect, in accordance with the usual geostatistical terminology. The nugget effect can be interpreted as a measurement error, a microscale variation, or a combination of both.
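The covariance structure above can be sketched numerically. The example below assumes an exponential correlation function $\rho(u; \varphi) = \exp(-u/\varphi)$, the form adopted in Section 4, and distinct locations, so that the nugget only affects the diagonal. The function name, coordinates, and parameter values are illustrative assumptions.

```python
import numpy as np

def exponential_covariance(coords, sigma2, phi, tau2):
    """Covariance sigma2 * exp(-u / phi) + tau2 * 1{x = x'} for distinct 2-D locations."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    u = np.sqrt((diff ** 2).sum(axis=-1))      # pairwise Euclidean distances
    cov = sigma2 * np.exp(-u / phi)            # spatially structured part
    cov[np.diag_indices_from(cov)] += tau2     # nugget acts only when x = x'
    return cov

# Three hypothetical locations in the plane.
coords = [(0.0, 0.0), (0.0, 1.0), (3.0, 4.0)]
Sigma = exponential_covariance(coords, sigma2=1.5, phi=0.8, tau2=0.1)
```

The diagonal equals $\sigma^2 + \tau^2$, and off-diagonal entries decay towards 0 at a rate governed by $\varphi$, which matches the roles the text assigns to the three parameters.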

Conditionally on $S$, the process $\{Y(x) : x \in I\}$ consists of mutually independent random variables and, for each location $x \in I$, the error distribution $[Y(x) \mid S]$ has a density that depends only on the conditional mean $E[Y(x_i) \mid S(x_i)]$. A known link function $g$ relates the conditional mean and $S(x)$, so that $E[Y(x_i) \mid S(x_i)] = g^{-1}(S(x_i))$.
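The two-stage structure (a latent Gaussian process, then conditionally independent responses) can be simulated in a few lines. This is only a sketch under stated assumptions: one-dimensional locations instead of $\mathbb{R}^2$, an exponential correlation, a constant mean, a logit link, and arbitrary parameter values, none of which come from the paper's data.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 200
x = np.sort(rng.uniform(0.0, 10.0, size=n))        # hypothetical 1-D locations
u = np.abs(x[:, None] - x[None, :])                 # pairwise distances
cov = 1.0 * np.exp(-u / 0.8) + 0.05 * np.eye(n)     # illustrative sigma2, phi, tau2
mean = np.full(n, -0.5)                             # constant mean d(x)^T beta

S = rng.multivariate_normal(mean, cov)              # latent Gaussian process at the x_i
p = 1.0 / (1.0 + np.exp(-S))                        # conditional mean g^{-1}(S(x_i))
Y = rng.binomial(1, p)                              # independent draws given S
```

Although the `Y` values are drawn independently given `S`, nearby locations share similar values of `S` and therefore similar success probabilities, which is how spatial dependence enters the observed data.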

When the regression parameters $\beta$ are of interest, it is important to remember that their interpretation is conditional rather than marginal. In particular, $E[Y(x_i) \mid S(x_i)]$ and $E[Y(x_i)]$ differ in terms of their structural dependence on the explanatory variables $d(x_i)$; thus, the interpretation of $\beta$ calls for caution. Only in the case where $Y(x_i) \mid S(x_i)$ is Gaussian and the link function is the identity is parameter comparison direct. The need to distinguish between conditional and marginal regression parameters, which does not arise in Gaussian linear models, is well known in the context of GLMs for longitudinal data (see, e.g., [12]).
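The gap between conditional and marginal means under a nonlinear link can be seen with a small Monte Carlo sketch. It assumes a logit link and a single location whose latent value is Gaussian with mean `beta0` and variance `sigma2`; both values are arbitrary and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def inverse_logit(eta):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + np.exp(-eta))

beta0, sigma2 = 2.0, 1.5                      # illustrative values only
S = beta0 + np.sqrt(sigma2) * rng.standard_normal(200_000)

conditional_at_mean = inverse_logit(beta0)    # E[Y | S] evaluated at S = beta0
marginal = inverse_logit(S).mean()            # Monte Carlo estimate of E[Y]
```

In this setting the marginal mean is pulled towards 0.5 relative to the conditional mean at the centre of the latent distribution, so a coefficient that reproduces the marginal probabilities would be smaller in magnitude than the conditional one. With an identity link the two quantities would coincide.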

To estimate the parameters of the GLSM, and because the stationary Gaussian process $S(x)$ is not observable, the likelihood function cannot be obtained in closed form; it can only be written as a high-dimensional integral. Reference [11] suggests using algorithms based on Markov chain Monte Carlo (MCMC) to calculate GLSM parameters in a Bayesian framework. This is the approach used in our analysis, implemented using the geoR and geoRglm packages (free, open-source programs for use with the R statistical software [13]).
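As a sketch of why no closed form is available, the likelihood can be written by integrating the latent process out. The notation below is an assumption introduced for illustration and is not taken from the paper: $s = (S(x_1), \ldots, S(x_n))^T$, $f$ is the conditional density of $Y(x_i)$ given $S(x_i)$, $D$ is the matrix with rows $d(x_i)^T$, $\Sigma(\theta)$ is the covariance matrix of $s$, and $\mathcal{N}_n$ is the $n$-variate Gaussian density.

```latex
L(\beta, \theta; y) = \int_{\mathbb{R}^n} \prod_{i=1}^{n} f(y_i \mid s_i) \, \mathcal{N}_n\bigl(s; D\beta, \Sigma(\theta)\bigr) \, ds
```

The integral has dimension $n$, one coordinate per location, which is what motivates simulation-based methods such as MCMC.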

2.2. ROC Curves. When the marginal distribution (in the GLM) or conditional distribution (in the GLSM) of the response variable $Y$ is binomial, the models can be regarded as binary classification systems. The accuracy of a diagnostic test for a binary classification system can be summarized as a receiver operating characteristic (ROC) curve, which is a graphic representation of the true positive rate versus the false positive rate as the discrimination threshold is varied. Within the framework of binary GLMs, it is normal to estimate the ROC curves of models in which one or more explanatory variables have been excluded so as to evaluate the effects of these variables. Analysing ROC curves provides tools for comparing and selecting the best models. More precisely, the area under the ROC curve (AUC) is usually calculated in order to compare the different binary models and thereby select the explanatory variables to be included in the model. Reference [14] described a bootstrap-based method for testing whether covariates have a significant effect on the ROC curve.

We used the AUC and residual semivariograms to demonstrate the goodness of fit of the binary GLSM compared with the binary GLM when working with spatially correlated data.
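The ROC construction described above can be sketched as follows: the threshold is swept over the predicted scores, the true and false positive rates are recorded at each step, and the AUC is obtained with the trapezoidal rule. The function names and the toy labels and scores are hypothetical and only serve to show the mechanics.

```python
import numpy as np

def roc_curve_points(y_true, scores):
    """False and true positive rates as the threshold sweeps over all scores."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    thresholds = np.r_[np.inf, np.unique(scores)[::-1]]   # from strictest to loosest
    pos, neg = (y_true == 1).sum(), (y_true == 0).sum()
    tpr = np.array([((scores >= t) & (y_true == 1)).sum() / pos for t in thresholds])
    fpr = np.array([((scores >= t) & (y_true == 0)).sum() / neg for t in thresholds])
    return fpr, tpr

def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

y = [0, 0, 1, 1]                 # toy labels
s = [0.1, 0.4, 0.35, 0.8]        # toy predicted probabilities
fpr, tpr = roc_curve_points(y, s)
area = auc(fpr, tpr)             # 0.75 for this toy example
```

An AUC of 1 corresponds to perfect separation of the two classes and 0.5 to a classifier no better than chance, which is the scale on which the models are compared in Section 4.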

3. Data Description and Model Formulation

3.1. The Studied Area and the Geographic Database. The data used to build the proposed model were collected from borehole samples taken from slate deposits in Baja Cabrera Leonesa (northwest Spain), an area with a long tradition of extracting, processing, and exporting roofing slate.

When surveying a slate deposit, in-depth studies of the rock are performed by taking continuous borehole samples, which enable geologists to study the living rock and analyse the possibility of using it as ornamental slate (see [15]); these samples also reveal the degree of fracturing inside the rock mass.

The specific borehole logging process was based on manual and visual inspection of the borehole by an expert who, after evaluating the aesthetic and functional defects and properties of the slate, differentiated between seams of commercial and unusable slate. The survey was performed by taking a control sample every 25 centimetres; rock quality designation (RQD), however, was defined by homogeneously fractured sections.

A total of 313 equally spaced in-depth observations were obtained, resulting from prior evaluation of various parameters affecting the ornamental quality of the slate and from direct binary values (0 or 1) assigned by the expert to indicate exploitation potential. The 9 specific variables that affected the results of borehole logging were as follows.

(i) RQD: borehole core samples recovered in pieces greater than 10 cm long, as a percentage of the total borehole length. This is an indicator of the degree of rock mass fracturing.

(ii) Veins: presence of microfractures filled with quartz that determine the breakage resistance of a commercial slab.

(iii) Crenulations: effect of crenulation cleavage on the main schistosity planes. This increases the roughness of the foliation surfaces of the slate and reduces fissility.

(iv) Kink bands: presence of microfolding caused by late Variscan deformations.

(v) Sandy laminations: presence of sedimentary sand layers, which cut the schistosity planes and have a negative effect on fissility.

(vi) Microfractures: presence of barely visible fractures, which determine the breakage resistance of slabs measuring 3–5 mm thick.

(vii) Pyrite: presence of iron sulphides.

(viii) Oxidation: degree of oxidation of iron sulphides in the slate.

(ix) Rough cleavage: slate with poor fissility due to textural heterogeneity.

In-depth knowledge of the variability and distribution of exploitable slate, together with the possible correlation between properties, favours the use of a GLSM to spatially model the geographic database.

Table 1 shows the correlation matrix of the explanatory variables given above, which indicates the degree of dependence between them.

Table 1: Correlation matrix of the $p = 9$ explanatory variables that affected the results of borehole logging.

(a)

                     RQD        Veins      Crenulation   Kink bands   Sandy laminations
RQD                  1
Veins                0.1624     1
Crenulation          0.3038     0.1631     1
Kink bands           0.1040     0.0408     0.1962        1
Sandy laminations    0.0177     0.0187     0.0200        0.0736       1
Microfractures       0.6696     0.0035     0.1284        -0.0590      -0.0138
Pyrite               -0.1953    -0.0403    -0.1221       -0.0891      -0.1202
Oxidation            0.2053     0.0837     0.1722        0.0821       -0.0645
Rough cleavage       -0.0908    -0.0593    0.2096        -0.0058      -0.0454

(b)

                     Microfractures   Pyrite     Oxidation   Rough cleavage
Microfractures       1
Pyrite               -0.1587          1
Oxidation            0.1162           -0.1152    1
Rough cleavage       -0.0989          0.0065     -0.0366     1
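The RQD definition given in item (i) of the variable list can be sketched as a short function. The function name, the strict "greater than 10 cm" comparison, and the example core run are assumptions made for illustration; the paper does not describe how the authors computed the index in practice.

```python
def rqd_percent(piece_lengths_cm, total_length_cm, min_piece_cm=10.0):
    """RQD: summed length of core pieces longer than min_piece_cm, as a % of the run."""
    if total_length_cm <= 0:
        raise ValueError("total_length_cm must be positive")
    sound = sum(length for length in piece_lengths_cm if length > min_piece_cm)
    return 100.0 * sound / total_length_cm

# Hypothetical core run of 100 cm recovered in five pieces.
rqd = rqd_percent([25.0, 8.0, 40.0, 5.0, 22.0], total_length_cm=100.0)
```

Here the pieces of 8 cm and 5 cm are excluded, so the index is 87%; more fractured runs yield lower values.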

3.2. Model Formulation. The response variable $Y(x)$ takes the values 0 or 1 to indicate disposable or exploitable slate, respectively, at a particular location $x$. It is assumed in what follows that slate exploitability is a spatial phenomenon that can be modelled using a GLSM. In other words, conditional on the Gaussian process $S(x)$, the data $Y(x_i)$, $i = 1, \ldots, n$, follow the classic GLM. The role of $S(x)$ is therefore to explain the residual spatial variation after considering all the known explanatory variables. It is also reasonable to assume that the conditional distribution of exploitability can be modelled as a binomial distribution, which is why a binomial error distribution was considered in our study.

A binomial error distribution was used by Diggle et al. [6] and Zhang [8]. A class of transformations that can be used as link functions for this distribution was described by [16]. We assume the stationary Gaussian process $S(\cdot)$ to be the basis for a model of spatial variation in the probability $P(x)$ that the slate at $x$ is exploitable, but with a logit transformation to map the domain of $S(\cdot)$ onto the unit interval. Thus,

$$g[P(x)] = \log \frac{P(x)}{1 - P(x)} = \mu + S(x). \qquad (1)$$
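For reference, solving the logit relation (1) for the probability gives the inverse link. This is a standard algebraic step added for illustration rather than a formula quoted from the paper:

```latex
P(x) = \frac{\exp\{\mu + S(x)\}}{1 + \exp\{\mu + S(x)\}} = \frac{1}{1 + \exp\{-(\mu + S(x))\}}
```

For instance, $\mu + S(x) = 0$ gives $P(x) = 1/2$, and any real value of $\mu + S(x)$ is mapped into the interval $(0, 1)$.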

The regression function $E[Y(x_i) \mid S(x_i)]$ varies spatially only through $S(x)$ at the locations $x_i$.

We adopted a Bayesian framework for inference and prediction of the parameters, using algorithms based on MCMC.

The parameters of this binomial GLSM are $\theta = (\sigma^2, \varphi)$ and $\beta = (\beta_0, \ldots, \beta_p)$, where $\beta_0$ is the independent term and $\beta_1, \ldots, \beta_p$ are the regression coefficients corresponding to each known explanatory variable.

4. Statistical Analysis

We initially included all the variables that characterize slate exploitability, namely, RQD, veins, crenulations, kink bands, sandy laminations, microfractures, pyrite, oxidation, and rough cleavage (poor fissility). Taking these data and considering slate exploitability as the response variable, we fitted a binary GLM, called GLM1. A ROC curve was estimated for this complete binary model, and the AUC was 0.99.

Next, binary GLMs were fitted to different groups of explanatory variables in an attempt to find the minimum number of variables that would provide a high AUC value, that is, close to 0.99. The model called GLM2, fitted with the RQD, crenulation, kink band, and microfracture variables, obtained an AUC of 0.92. Figure 1 shows the ROC curves and the corresponding AUC values for both GLM1 and GLM2. The study was continued with the four variables included in GLM2, given that the reduction in the number of variables did not overly affect the accuracy of the model. A binary nonspatial GLM was fitted using Bayesian methods and the MCMClogit function from the MCMCpack package (R language). The vector $\beta$ of the parameters fitted for the GLM2 model was $\beta = (-35.6747, 1.2826, 0.5875, 0.5668, 2.5815)$, where the first value is the independent term and the remaining values are the corresponding coefficients of the explanatory variables in the same order as they were mentioned above.

Table 2 shows the estimated coefficients and the 2.5%, 25%, 75%, and 97.5% quantiles for the fitted model. It can be observed that, for a significance level of $\alpha = 0.05$, the only nonsignificant variable is the RQD.

Figure 1: ROC curves and the corresponding AUC values for the two binary GLMs included in the study. [ROC plot; horizontal axis: 1-specificity (false positives); vertical axis: sensitivity (true positives); AUC: 0.99 for GLM1, 0.92 for GLM2.]

Table 2: Estimated coefficients and 2.5%, 25%, 75%, and 97.5% quantiles for the GLM.

                             Coefficient   2.5% quantile   25% quantile   75% quantile   97.5% quantile
$\beta_0$ (intercept)        -35.6747      -77.6804        -49.0098       -26.0904       -18.6385
$\beta_1$ (RQD)              1.2826        -1.1476         0.4192         2.0516         3.9633
$\beta_2$ (crenulation)      0.5875        0.3288          0.5089         0.6761         0.8352
$\beta_3$ (kink band)        0.5668        0.3965          0.5019         0.6395         0.7857
$\beta_4$ (microfracture)    2.5815        0.9221          1.6093         3.9181         6.7967

Table 3: Estimated coefficients and 2.5%, 25%, 75%, and 97.5% quantiles for the GLSM.

                             Coefficient   2.5% quantile   25% quantile   75% quantile   97.5% quantile
$\beta_0$ (intercept)        -27.7021      -32.0664        -29.6753       -24.7446       -17.7297
$\beta_1$ (RQD)              0.2521        -0.0385         0.1291         0.3726         0.5388
$\beta_2$ (crenulation)      0.7334        0.3762          0.6121         0.8283         1.0691
$\beta_3$ (kink band)        0.9189        0.5881          0.7802         1.0274         1.2261
$\beta_4$ (microfracture)    1.2911        0.8834          1.0125         1.4480         1.6261
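One way to read the significance statement for the GLM coefficients is to check whether the interval between the 2.5% and 97.5% quantiles contains zero. The sketch below applies that reading to the values reported in Table 2; treating "significant at $\alpha = 0.05$" as "the central 95% interval excludes zero" is an assumption about the authors' criterion, and the helper name is hypothetical. The outcome depends only on the signs of the interval endpoints.

```python
# 2.5% and 97.5% posterior quantiles for the GLM coefficients, as read from Table 2.
glm_intervals = {
    "intercept": (-77.6804, -18.6385),
    "RQD": (-1.1476, 3.9633),
    "crenulation": (0.3288, 0.8352),
    "kink band": (0.3965, 0.7857),
    "microfracture": (0.9221, 6.7967),
}

def interval_excludes_zero(lower, upper):
    """True when zero lies outside the closed interval [lower, upper]."""
    return lower > 0.0 or upper < 0.0

# Coefficients whose interval contains zero are flagged as nonsignificant.
nonsignificant = [name for name, (lo, hi) in glm_intervals.items()
                  if not interval_excludes_zero(lo, hi)]
```

Under this reading only RQD is flagged, which agrees with the conclusion stated in the text.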

The study of the residuals of the GLM2 model fitted with four explanatory variables detected a spatial dependence that could be modelled using an exponential theoretical semivariogram with range 0.81 and sill 1.61. Figure 2 shows the experimental semivariogram for the GLM2 residuals together with the corresponding fitted theoretical model.

Figure 2: Experimental semivariogram for the GLM2 residuals (points) together with the theoretical model fitted according to an exponential model (continuous line). [Horizontal axis: distance; vertical axis: semivariance.]
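The semivariogram analysis of the residuals can be sketched as follows. Two assumptions are made loudly here: the exponential model is written as $\gamma(h) = \text{sill} \cdot (1 - \exp(-h/\text{range}))$, which is one common parameterization and may differ from the one used by the authors' software, and the positions are treated as one-dimensional (for example, depth along a borehole). Function names are hypothetical.

```python
import numpy as np

def exponential_semivariogram(h, sill, range_):
    """gamma(h) = sill * (1 - exp(-h / range_)), one common exponential form."""
    h = np.asarray(h, dtype=float)
    return sill * (1.0 - np.exp(-h / range_))

def empirical_semivariogram(positions, residuals, bin_edges):
    """Average of 0.5 * (r_i - r_j)^2 over pairs whose separation falls in each bin."""
    positions = np.asarray(positions, dtype=float)
    residuals = np.asarray(residuals, dtype=float)
    i, j = np.triu_indices(len(positions), k=1)       # each unordered pair once
    dist = np.abs(positions[i] - positions[j])
    half_sq = 0.5 * (residuals[i] - residuals[j]) ** 2
    which = np.digitize(dist, bin_edges) - 1          # bin index of each pair
    gamma = np.full(len(bin_edges) - 1, np.nan)       # NaN marks empty bins
    for b in range(len(bin_edges) - 1):
        in_bin = which == b
        if in_bin.any():
            gamma[b] = half_sq[in_bin].mean()
    return gamma

# Theoretical curve evaluated with the range and sill reported in the text.
gamma_model = exponential_semivariogram([0.0, 0.81, 25.0], sill=1.61, range_=0.81)
```

A semivariogram that rises with distance and then levels off at the sill, as in Figure 2, signals spatially correlated residuals; a flat one signals that no spatial structure remains.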

Given the presence of spatial dependence in the residuals, it then made sense to fit a GLSM, maintaining four explanatory variables and assuming an exponential model for the process $S(x)$. The covariance function considered was of the type $\operatorname{cov}(S(x), S(x')) = \sigma^2 \rho(x, x'; \varphi) + \tau^2 \mathbf{1}\{x = x'\}$, with $\rho(u; \varphi) = \exp(-u/\varphi)$, where $u$ is the distance between $x$ and $x'$. The fit was made using Bayesian inference implemented via MCMC algorithms. The first 10,000 sample observations from the simulation were discarded as burn-in, at which point convergence was considered to have been achieved. The subsequent samples were used to obtain the posterior distribution of the parameters of interest. The chain was sampled every 100 iterations over 50,000 iterations to obtain samples containing 500 values (for further details, see [17]).

Figure 3: Two-dimensional likelihood profile for $(\varphi, \tau^2)$ for the GLSM. [Contour plot; horizontal axis: $\varphi$; vertical axis: $\tau^2_{\mathrm{rel}}$.]
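The burn-in and thinning scheme described above can be sketched with plain array slicing. The sketch assumes that the 50,000 sampled iterations follow the 10,000 burn-in iterations, and it uses random numbers as a stand-in for real MCMC output; the function name is hypothetical.

```python
import numpy as np

def burn_and_thin(chain, burn_in, thin):
    """Drop the first burn_in draws, then keep every thin-th remaining draw."""
    chain = np.asarray(chain)
    return chain[burn_in:][thin - 1::thin]

rng = np.random.default_rng(2)
raw_chain = rng.standard_normal(60_000)      # stand-in for 10,000 + 50,000 MCMC draws
kept = burn_and_thin(raw_chain, burn_in=10_000, thin=100)

# Posterior summaries at the quantile levels used in Tables 2 and 3.
summary = np.quantile(kept, [0.025, 0.25, 0.75, 0.975])
```

With these settings 500 values are retained, matching the sample size quoted in the text. Thinning reduces the autocorrelation between retained draws at the cost of discarding most of the chain.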

Using this procedure, the parameters were estimated as $\varphi = 0.82$, $\sigma^2 = 1.46$, $\tau^2 = 0.09$, and $\beta = (-27.7021, 0.2521, 0.7334, 0.9189, 1.2911)$. Table 3 shows the estimated $\beta$ coefficients and their 2.5%, 25%, 75%, and 97.5% quantiles. Once again, we can see how, for a significance level of $\alpha = 0.05$, only the RQD variable was not significant.

It is important to remember that these parameters should be interpreted conditionally and not marginally and so should not be directly compared with the parameters estimated for the GLM. Direct comparison of a spatial and a nonspatial GLM could lead to erroneous conclusions, as the estimation methods are fundamentally different. Nonetheless, there is a certain agreement in the conclusions to be drawn from these tables, with variables such as crenulation, kink band, and microfracture remaining significant. RQD, on the other hand, was not significant at the 5% level.

Figure 3 shows the likelihood profile in two dimensions for the parameters $(\varphi, \tau^2)$ of the model, illustrating the flatness of the likelihood surface obtained using the MCMC algorithms.

A study of the spatial dependence of the GLSM residuals indicated that the spatial component had been correctly modelled on this occasion. Figure 4 shows an experimental semivariogram of the GLSM residuals and a theoretical model fitted according to a nugget effect model. It is clear that the empirical semivariogram is essentially flat, which suggests a suitable fit to the spatial structure. A direct comparison between Figures 2 and 4 leads to the conclusion that the spatial dependence has been properly captured by the stationary Gaussian process $S$.

The ROC curve and AUC were calculated for the GLSM. The AUC of the binary spatial model was 0.99, which indicates a substantial improvement in the precision of the GLSM. This improvement is reflected in Figure 5, which depicts the ROC curves and AUC values for the GLM2 and GLSM binary models.

Figure 4: Experimental semivariogram for the GLSM residuals (points) together with the theoretical model fitted according to a nugget effect model (continuous line). [Horizontal axis: distance; vertical axis: semivariance, of the order of $10^{-6}$.]

Table 4: Error rates for the binary models for three scenarios representing different levels of prediction difficulty.

         5% test   10% test   15% test
GLM      6.6E-2    6.3E-2     5.6E-2
GLSM     4.1E-2    5.1E-2     4.6E-2

The comparison between the two models was completed with a simulation study designed to compare the reliability of the predictions in three scenarios of varying levels of difficulty. For the first scenario, 95% of the 313 initial observations were randomly selected to compose the training set used to fit the GLM and GLSM. The fitted models were then validated with the remaining 5% of the observations. This procedure was repeated 100 times, and the number of errors in the slate exploitability prediction was recorded for each repetition. For the second scenario, we randomly selected 90% of the observations for the training set, and the remaining 10% made up the test set. This simulation was also repeated 100 times. The procedure for the third scenario was similar, but this time 85% and 15% of observations made up the training and test sets, respectively.

Table 4 displays the error rates for the three scenarios described, with the error rate calculated as the ratio between the number of prediction errors and the total number of predictions in the test set.
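The repeated random-split procedure can be sketched generically. The classifier below is a deliberately simple stand-in (a threshold on a single synthetic covariate), not the GLM or GLSM fitted in the paper, and the data are simulated; only the sample size of 313, the 100 repetitions, and the three test fractions follow the description above.

```python
import numpy as np

def repeated_holdout_error(X, y, fit, predict, test_fraction, n_repeats=100, seed=0):
    """Mean misclassification rate over repeated random train/test splits."""
    rng = np.random.default_rng(seed)
    n = len(y)
    n_test = max(1, int(round(test_fraction * n)))
    rates = []
    for _ in range(n_repeats):
        perm = rng.permutation(n)                     # fresh random split each time
        test, train = perm[:n_test], perm[n_test:]
        model = fit(X[train], y[train])
        errors = np.sum(predict(model, X[test]) != y[test])
        rates.append(errors / n_test)
    return float(np.mean(rates))

# Hypothetical stand-in classifier: threshold at the midpoint of the class means.
def fit_threshold(X, y):
    return 0.5 * (X[y == 1, 0].mean() + X[y == 0, 0].mean())

def predict_threshold(threshold, X):
    return (X[:, 0] > threshold).astype(int)

rng = np.random.default_rng(3)
y = rng.integers(0, 2, size=313)                          # synthetic binary response
X = (y + 0.3 * rng.standard_normal(313)).reshape(-1, 1)   # synthetic covariate
error_rates = {f: repeated_holdout_error(X, y, fit_threshold, predict_threshold, f)
               for f in (0.05, 0.10, 0.15)}
```

Swapping in other `fit` and `predict` functions allows two models to be compared on the same splits by reusing the same `seed`.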

In all the cases, it can be observed that the GLSM provided a better explanation not only of the effect of the variables determining slate quality but also of the spatial behaviour of exploitable slate, thereby producing lower prediction error rates.

Figure 5: ROC curves and corresponding AUC values for the binary GLM2 and the GLSM. [ROC plot; horizontal axis: 1-specificity (false positives); vertical axis: sensitivity (true positives); AUC: 0.99 for GLSM, 0.92 for GLM.]

5. Conclusions

A general interpretation of the GLSM used in our analysis is that the spatial term $S$ represents the cumulative effect of possible explanatory variables with an undetermined spatial structure, which have therefore not been observed.

In GLMs, the fact that the spatial correlation of the variables is not taken into account can significantly affect the quality of statistical results. Our study highlights the potential risk of using GLMs when the data are spatially structured.

The conclusion reached after comparing ROC curves and their corresponding AUCs is that GLSMs predict slate exploitability better than GLMs. Therefore, it would seem essential to include unexplained spatial variation when modelling spatially correlated variables.

Based on the comparison of the semivariograms of the GLM and GLSM residuals, we would like to draw attention to the presence of spatial dependence in the GLM residuals, in contrast to what occurs when a GLSM is implemented. This indicates that spatial dependence has been captured correctly by the stationary Gaussian process $S$. We can therefore conclude that a GLSM is more suitable for modelling spatial dependence, which is overlooked by classic GLMs.

The simulation study demonstrates that, for varying levels of prediction difficulty, the GLSM had lower error rates than the GLM.

Although the parameters of the GLSM must be interpreted conditionally on $S$ rather than marginally, the results of the statistical analysis denote the broader potential of the GLSM compared to the classic GLM in analysing spatial data. They also underline the potential risk of reaching erroneous conclusions when using nonspatial models to analyse spatially structured data.

Acknowledgments

This work was funded partly by the Project INCITE10REM304009PR of the Xunta de Galicia and by the Projects MTM2008-03129 and MTM2011-23204 of the Ministry of Science and Innovation.

References

[1] J. M. Matías, A. Vaamonde, J. Taboada, and W. González-Manteiga, "Support vector machines and gradient boosting for graphical estimation of a slate deposit," Stochastic Environmental Research and Risk Assessment, vol. 18, no. 5, pp. 309–323, 2004.

[2] F. G. Bastante, J. Taboada, L. Alejano, and E. Alonso, "Optimization tools and simulation methods for designing and evaluating a mining operation," Stochastic Environmental Research and Risk Assessment, vol. 22, no. 6, pp. 727–735, 2008.

[3] J. A. Nelder and R. W. M. Wedderburn, "Generalized linear models," Journal of the Royal Statistical Society A, vol. 135, pp. 370–384, 1972.

[4] P. McCullagh and J. A. Nelder, Generalized Linear Models, Chapman and Hall, London, UK, 1989.

[5] O. F. Christensen and R. Waagepetersen, "Bayesian prediction of spatial count data using generalized linear mixed models," Biometrics, vol. 58, no. 2, pp. 280–286, 2002.

[6] P. Diggle, R. Moyeed, B. Rowlingson, and M. Thomson, "Childhood malaria in the Gambia: a case-study in model-based geostatistics," Journal of the Royal Statistical Society C, vol. 51, no. 4, pp. 493–506, 2002.

[7] P. J. Diggle, P. J. Ribeiro, and O. F. Christensen, "An introduction to model-based geostatistics," in Spatial Statistics and Computational Methods, J. Møller, Ed., pp. 43–86, Springer, New York, NY, USA, 2003.

[8] H. Zhang, "On estimation and prediction for spatial generalized linear mixed models," Biometrics, vol. 58, no. 1, pp. 129–136, 2002.

[9] H. Zhang, "Optimal interpolation and the appropriateness of cross-validating variogram in spatial generalized linear mixed models," Journal of Computational and Graphical Statistics, vol. 12, no. 3, pp. 698–713, 2003.

[10] N. Breslow and D. Clayton, "Approximate inference in generalized linear mixed models," Journal of the American Statistical Association, vol. 88, pp. 9–25, 1993.

[11] P. J. Diggle, J. A. Tawn, and R. A. Moyeed, "Model-based geostatistics," Journal of the Royal Statistical Society C, vol. 47, no. 3, pp. 299–350, 1998.

[12] P. J. Diggle, K. Y. Liang, and S. L. Zeger, The Analysis of Longitudinal Data, Clarendon, Oxford, UK, 1994.

[13] R Development Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria, 2012, http://www.r-project.org.

[14] M. X. Rodríguez-Álvarez, J. Roca-Pardiñas, and C. Cadarso-Suárez, "ROC curve and covariates: extending induced methodology to the non-parametric framework," Statistics and Computing, vol. 21, no. 4, pp. 483–499, 2011.

[15] J. Taboada, A. Vaamonde, A. Saavedra, and A. Argüelles, "Quality index for ornamental slate deposits," Engineering Geology, vol. 50, no. 1-2, pp. 203–210, 1998.

[16] V. de Oliveira, B. Kedem, and D. A. Short, "Bayesian prediction of transformed Gaussian random fields," Journal of the American Statistical Association, vol. 92, no. 440, pp. 1422–1433, 1997.

[17] O. F. Christensen, "Monte Carlo maximum likelihood in model-based geostatistics," Journal of Computational and Graphical Statistics, vol. 13, no. 3, pp. 702–718, 2004.



problems based on formal statistical models and inference procedures. This leads to the following specification of the model.

Consider $n$ different locations $\{x_1, \ldots, x_n\} \subset I \subset \mathbb{R}^2$ and assume that a realization $y = (y_1, \ldots, y_n)^T$ of $Y$ is observed, where $y_i = Y(x_i)$.

Let $S = \{S(x) : x \in I\}$, $I \subset \mathbb{R}^2$, be a Gaussian process with mean function $E[S(x)] = d(x)^T \beta$ and covariance $\mathrm{cov}(S(x), S(x')) = \sigma^2 \rho(x, x'; \varphi) + \tau^2 1\{x = x'\}$, where $\beta \in \mathbb{R}^p$ is, as in the GLM case, a vector of unknown regression parameters; $d(x)$ are known explanatory variables, now with spatial dependence; $\rho(x, x'; \varphi)$ is a correlation function in $\mathbb{R}^2$; $\varphi$ is a scale parameter that controls the speed at which spatial correlation approaches 0 as the distance between locations grows; and, finally, $\tau^2 \ge 0$ is known as the nugget effect, in accordance with the usual geostatistical terminology. The nugget effect can be interpreted as a measurement error, a microscale variation, or a combination of both.
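
To make the covariance structure above concrete, the following Python sketch builds the covariance matrix for a handful of locations. It is only an illustration: it assumes an exponential correlation function $\rho(u; \varphi) = \exp(-u/\varphi)$, and the function name, the coordinates, and the parameter values are hypothetical rather than taken from the analysis.

```python
import numpy as np

def exponential_covariance(coords, sigma2, phi, tau2):
    """Covariance sigma2 * exp(-d / phi) + tau2 * 1{x = x'} for locations in R^2."""
    coords = np.asarray(coords, dtype=float)
    # Pairwise Euclidean distances between all locations.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    cov = sigma2 * np.exp(-dist / phi)
    # The nugget contributes only where the two locations coincide.
    cov[dist == 0.0] += tau2
    return cov

# Hypothetical locations and parameter values, chosen only for illustration.
coords = [(0.0, 0.0), (0.0, 1.0), (0.0, 2.0)]
cov = exponential_covariance(coords, sigma2=1.5, phi=0.8, tau2=0.1)
```

The diagonal of `cov` equals $\sigma^2 + \tau^2$, and the off-diagonal entries decay towards 0 as the distance grows, at a speed governed by $\varphi$.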

Conditionally on $S$, the process $\{Y(x) : x \in I\}$ consists of mutually independent random variables and, for each location $x \in I$, the error distribution, or $[Y(x) \mid S]$, has a density that depends only on the conditional mean $E[Y(x_i) \mid S(x_i)]$. A known link function $g$ relates the conditional mean and $S(x)$, so that $E[Y(x_i) \mid S(x_i)] = g^{-1}(S(x_i))$.

When the regression parameters $\beta$ are of interest, it is important to remember that their interpretation is more conditional than marginal. In particular, $E[Y(x_i) \mid S(x_i)]$ and $E[Y(x_i)]$ differ in terms of the structural dependence of the explanatory variables $d(x_i)$; thus, the interpretation of $\beta$ calls for caution. Only in the case where $Y(x_i) \mid S(x_i)$ is Gaussian and the link function is the identity is parameter comparison direct. The need to distinguish between conditional and marginal regression parameters, which is not possible in Gaussian linear models, is well known in the context of GLMs for longitudinal data (see, e.g., [12]).
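
The gap between conditional and marginal means can be checked numerically. The sketch below is a minimal illustration and not part of the original analysis: it assumes a logit link, writes the linear predictor as a fixed value `eta` plus a zero-mean Gaussian term with variance `sigma2`, and uses hypothetical values for both.

```python
import numpy as np

def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + np.exp(-z))

def marginal_mean(eta, sigma2, n_nodes=80):
    """E[expit(eta + S)] for S ~ N(0, sigma2), by Gauss-Hermite quadrature."""
    nodes, weights = np.polynomial.hermite_e.hermegauss(n_nodes)
    # hermegauss uses the weight exp(-z^2 / 2); dividing by sqrt(2 * pi)
    # turns the weighted sum into an expectation under N(0, 1).
    values = expit(eta + np.sqrt(sigma2) * nodes)
    return float(np.sum(weights * values) / np.sqrt(2.0 * np.pi))

eta = 1.0                                   # hypothetical linear predictor
conditional = expit(eta)                    # mean given S = 0
marginal = marginal_mean(eta, sigma2=1.5)   # mean averaged over S
```

Under these assumptions the marginal mean is pulled towards 0.5 relative to the conditional mean, which is why coefficients on the conditional (logit) scale do not carry over directly to the marginal scale.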

To estimate the parameters for the GLSM, and due to the fact that the stationary Gaussian process $S(x)$ is not observable, it is not possible to obtain a closed-form likelihood function except as a high-dimension integral. Reference [11] suggests using algorithms based on Markov chain Monte Carlo (MCMC) to calculate GLSM parameters in a Bayesian framework. This is the approach used in our analysis, implemented using the geoR and geoRglm packages (free, open-source programs for use with the R statistical software [13]).
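
For reference, the high-dimensional integral mentioned above can be written out as follows. This is a sketch in the notation of Section 2.1; the symbols $D$ (the $n \times p$ matrix with rows $d(x_i)^T$), $R(\varphi)$ (the $n \times n$ matrix with entries $\rho(x_i, x_j; \varphi)$), $f$ (the conditional density of $Y(x_i)$ given $S(x_i) = s_i$), and $\phi_n$ (the $n$-variate Gaussian density) are introduced here for illustration only.

```latex
L(\beta, \sigma^2, \varphi, \tau^2; y)
  = \int_{\mathbb{R}^n}
    \Bigl[ \prod_{i=1}^{n} f\bigl(y_i \mid s_i\bigr) \Bigr]
    \, \phi_n\bigl(s;\; D\beta,\; \sigma^2 R(\varphi) + \tau^2 I_n\bigr)
    \, ds
```

The integral runs over the $n$ unobserved values $s = (S(x_1), \ldots, S(x_n))$, one per sampled location, which is what makes direct evaluation impractical and motivates simulation-based methods.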

2.2. ROC Curves. When the marginal distribution (in the GLM) or conditional distribution (in the GLSM) of the response variable $Y$ follows a binomial distribution, the models can be called binary classification systems. The accuracy of a diagnostic test for a binary classification system can be summarized as a receiver operating characteristic (ROC) curve, which is a graphic representation of true positive versus false positive rates when the discrimination threshold is varied. Within the framework of binary GLMs, it is normal to estimate the ROC curves of models in which one or more explanatory variables have been excluded, so as to evaluate the effects of these variables. Analysing ROC curves provides tools for comparing and selecting the best models. More precisely, the area under the curve (AUC) of the ROC curve is usually calculated in order to compare the different binary models and thereby select the explanatory variables to be included in the model. Reference [14] described a bootstrap-based method for testing the significant effect of covariates on the ROC curve.

We used the AUC and residual semivariograms to demonstrate the goodness-of-fit of the binary GLSM compared with the binary GLM when working with spatially correlated data.
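
The construction of a ROC curve and its AUC described in this section can be sketched in a few lines of Python. This is a generic illustration, not the code used in the study; the function names and the four-observation toy data are assumptions made for the example.

```python
import numpy as np

def roc_curve_points(y_true, scores):
    """False-positive and true-positive rates as the threshold is lowered."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    thresholds = np.unique(scores)[::-1]   # distinct scores, highest first
    n_pos = np.sum(y_true == 1)
    n_neg = np.sum(y_true == 0)
    fpr, tpr = [0.0], [0.0]
    for t in thresholds:
        predicted = scores >= t            # classify as positive at this threshold
        tpr.append(np.sum(predicted & (y_true == 1)) / n_pos)
        fpr.append(np.sum(predicted & (y_true == 0)) / n_neg)
    return np.array(fpr), np.array(tpr)

def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

# Toy data: observed classes and predicted probabilities.
y = [0, 0, 1, 1]
p = [0.1, 0.4, 0.35, 0.8]
fpr, tpr = roc_curve_points(y, p)
area = auc(fpr, tpr)   # 0.75 for this toy example
```

An AUC of 0.5 corresponds to a classifier no better than chance and an AUC of 1 to perfect separation, which is the scale on which the models in Section 4 are compared.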

3. Data Description and Model Formulation

3.1. The Studied Area and the Geographic Database. The data used to build the proposed model were collected from borehole samples taken from slate deposits in Baja Cabrera Leonesa (northwest Spain), an area with a long tradition of extracting, processing, and exporting roofing slate.

When surveying a slate deposit, in-depth studies of the rock are performed by taking continuous borehole samples, which enable geologists to study the living rock and analyse the possibility of using it as ornamental slate (see [15]); these samples also reveal the degree of fracturing inside the rock mass.

The specific borehole logging process was based on manual and visual inspection of the borehole by an expert who, after evaluating the aesthetic and functional defects and properties of the slate, differentiated between seams of commercial and unusable slate. The survey was performed by taking a control sample every 25 centimetres; rock quality designation (RQD), however, was defined by homogeneously fractured sections.

A total of 313 equally spaced in-depth observations were obtained, resulting from prior evaluation of various parameters affecting the ornamental quality of the slate and from direct binary values (0 or 1) assigned by the expert to indicate exploitation potential. The 9 specific variables that affected the results of borehole logging were as follows:

(i) RQD: borehole core samples recovered in pieces greater than 10 cm long, as a percentage of the total borehole length. This is an indicator of the degree of rock mass fracturing.

(ii) Veins: presence of microfractures filled with quartz that determine the breakage resistance of a commercial slab.

(iii) Crenulations: effect of crenulation cleavage on the main schistosity planes. This increases the roughness of the foliation surfaces of the slate and reduces fissility.

(iv) Kink bands: presence of microfolding caused by late Variscan deformations.

(v) Sandy laminations: presence of sedimentary sand layers, which cut the schistosity planes and have a negative effect on fissility.

(vi) Microfractures: presence of barely visible fractures, which determine the breakage resistance of slabs measuring 3–5 mm thick.

(vii) Pyrite: presence of iron sulphides.

(viii) Oxidation: degree of oxidation of iron sulphides in the slate.

(ix) Rough cleavage: slate with poor fissility due to textural heterogeneity.

Table 1: Correlation matrix of the $p = 9$ explanatory variables that affected the results of borehole logging.

(a)
                     RQD       Veins     Crenulation  Kink bands  Sandy laminations
RQD                  1
Veins                0.1624    1
Crenulation          0.3038    0.1631    1
Kink bands           0.1040    0.0408    0.1962       1
Sandy laminations    0.0177    0.0187    0.0200       0.0736      1
Microfractures       0.6696    0.0035    0.1284       −0.0590     −0.0138
Pyrite               −0.1953   −0.0403   −0.1221      −0.0891     −0.1202
Oxidation            0.2053    0.0837    0.1722       0.0821      −0.0645
Rough cleavage       −0.0908   −0.0593   0.2096       −0.0058     −0.0454

(b)
                     Microfractures  Pyrite    Oxidation  Rough cleavage
Microfractures       1
Pyrite               −0.1587         1
Oxidation            0.1162          −0.1152   1
Rough cleavage       −0.0989         0.0065    −0.0366    1

In-depth knowledge of the variability and distribution of exploitable slate, and of possible correlation between properties, is conducive to the use of a GLSM to spatially model the geographic database.

Table 1 shows the correlation matrix of the explanatory variables given above, which indicates the degree of dependence between them.

3.2. Model Formulation. The response variable $Y(x)$ takes the values 0 or 1 to indicate disposable or exploitable slate, respectively, in a particular location $x$. It is assumed in what follows that slate exploitability is a spatial phenomenon that can be modelled using a GLSM. In other words, conditional on the Gaussian process $S(x)$, the data $Y(x_i)$, $i = 1, \ldots, n$, follow the classic GLM. The role of $S(x)$ is therefore to explain the residual spatial variation after considering all the known explanatory variables. It is also reasonable to assume that the conditional distribution of exploitability can be modelled as a binomial distribution, which is why a binomial error distribution was considered in our study.

Binomial error distribution was used by Diggle et al. [6] and Zhang [8]. A class of transformations that can be used as link functions for this distribution was described by [16]. We assume the stationary Gaussian process $S(\cdot)$ to be the basis for a model of spatial variation in the probability $P(x)$ that the slate in $x$ is exploitable, but with a logit transformation to map the domain of $S(\cdot)$ onto the unit interval. Thus,

$$g[P(x)] = \log \frac{P(x)}{1 - P(x)} = \mu + S(x). \qquad (1)$$
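
As a short worked step, equation (1) can be inverted to express the exploitability probability directly in terms of the linear predictor. The rearrangement is standard algebra and uses only the symbols of equation (1).

```latex
\log \frac{P(x)}{1 - P(x)} = \mu + S(x)
\quad \Longleftrightarrow \quad
P(x) = \frac{\exp\{\mu + S(x)\}}{1 + \exp\{\mu + S(x)\}}
     = \frac{1}{1 + \exp\{-(\mu + S(x))\}}
```

For instance, $\mu + S(x) = 0$ gives $P(x) = 0.5$, and $\mu + S(x) = 2$ gives $P(x) \approx 0.88$, so any real value of the Gaussian process is mapped into the unit interval.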

The regression function $E[Y(x_i) \mid S(x_i)]$ varies spatially only through $S(x)$ in the locations $x_i$.

We adopted a Bayesian framework for inference and prediction of the parameters, using algorithms based on MCMC.

The parameters of this binomial GLSM are $\theta = (\sigma^2, \varphi)$ and $\beta = (\beta_0, \ldots, \beta_p)$, where $\beta_0$ is the independent term and $\beta_1, \ldots, \beta_p$ are the regression coefficients corresponding to each known explanatory variable.

4. Statistical Analysis

We initially included all the variables that characterize slate exploitability, namely, RQD, veins, crenulations, kink bands, sandy laminations, microfractures, pyrite, oxidation, and rough cleavage (poor fissility). Taking these data and considering slate exploitability as the response variable, we fitted a binary GLM called GLM1. A ROC curve was estimated for this complete binary model, and the AUC was 0.99.

Next, binary GLMs were fitted to different groups of explanatory variables in an attempt to find the minimum number of variables that would provide a high AUC value, that is, close to 0.99. The model called GLM2, fitted with the RQD, crenulation, kink band, and microfracture variables, obtained an AUC of 0.92. Figure 1 shows the ROC curves and the corresponding AUC values for both GLM1 and GLM2. The study was continued with the four variables included in GLM2, given that the reduction in the number of variables did not overly affect the accuracy of the model. A binary nonspatial GLM was fitted using Bayesian methods and the MCMClogit function from the MCMCpack package (R language). The vector $\beta$ of the parameters fitted for the GLM2 model was $\beta = (-35.6747, 1.2826, 0.5875, 0.5668, 2.5815)$, where the first value is the independent term and the remaining values are the corresponding coefficients of the explanatory variables in the same order as they were mentioned above.

Table 2: Estimated coefficients and 2.5%, 25%, 75%, and 97.5% quantiles for the GLM.

                            Coefficient  2.5% quantile  25% quantile  75% quantile  97.5% quantile
$\beta_0$ (intercept)       −35.6747     −77.6804       −49.0098      −26.0904      −18.6385
$\beta_1$ (RQD)             1.2826       −1.1476        0.4192        2.0516        3.9633
$\beta_2$ (crenulation)     0.5875       0.3288         0.5089        0.6761        0.8352
$\beta_3$ (kink band)       0.5668       0.3965         0.5019        0.6395        0.7857
$\beta_4$ (microfracture)   2.5815       0.9221         1.6093        3.9181        6.7967

Table 3: Estimated coefficients and 2.5%, 25%, 75%, and 97.5% quantiles for the GLSM.

                            Coefficient  2.5% quantile  25% quantile  75% quantile  97.5% quantile
$\beta_0$ (intercept)       −27.7021     −32.0664       −29.6753      −24.7446      −17.7297
$\beta_1$ (RQD)             0.2521       −0.0385        0.1291        0.3726        0.5388
$\beta_2$ (crenulation)     0.7334       0.3762         0.6121        0.8283        1.0691
$\beta_3$ (kink band)       0.9189       0.5881         0.7802        1.0274        1.2261
$\beta_4$ (microfracture)   1.2911       0.8834         1.0125        1.4480        1.6261

Figure 1: ROC curves and the corresponding AUC values for the two binary GLMs included in the study (sensitivity against 1 − specificity; AUC 0.99 for GLM1 and 0.92 for GLM2).

Table 2 shows the estimated coefficients and the 2.5%, 25%, 75%, and 97.5% quantiles for the fitted model. It can be observed that, for a significance level of $\alpha = 0.05$, the only nonsignificant variable is the RQD.
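
One plausible reading of the significance statement above is that a coefficient is treated as significant at $\alpha = 0.05$ when its central 95% interval, bounded by the 2.5% and 97.5% quantiles, excludes zero. The Python sketch below applies that rule to the GLM2 quantiles as read from Table 2, with decimal points restored; the rule itself is an assumption about how the table was interpreted, and the names are hypothetical.

```python
# (2.5% quantile, 97.5% quantile) for each GLM2 coefficient, as read from Table 2.
intervals_95 = {
    "intercept": (-77.6804, -18.6385),
    "RQD": (-1.1476, 3.9633),
    "crenulation": (0.3288, 0.8352),
    "kink band": (0.3965, 0.7857),
    "microfracture": (0.9221, 6.7967),
}

def excludes_zero(lower, upper):
    """True when the interval lies entirely on one side of zero."""
    return lower > 0.0 or upper < 0.0

# Coefficients whose 95% interval contains zero.
not_significant = [
    name for name, (lower, upper) in intervals_95.items()
    if not excludes_zero(lower, upper)
]
```

Under this rule `not_significant` contains only RQD, in agreement with the statement in the text.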

The study of the residuals of the GLM2 model fitted with four explanatory variables detected a spatial dependence that could be modelled using an exponential theoretical semivariogram with range 0.81 and sill 1.61. Figure 2 shows the experimental semivariogram for the GLM2 residuals together with the corresponding fitted theoretical model.

Figure 2: Experimental semivariogram for the GLM2 residuals (points) together with the theoretical model fitted according to an exponential model (continuous line); semivariance is plotted against distance.
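
The two ingredients of this residual check, an experimental semivariogram and an exponential theoretical model, can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used in the study: positions are taken to be one-dimensional (depth along a borehole), the range is read as the scale parameter of $\exp(-h/\text{range})$ rather than a practical range, and all names and toy values are hypothetical.

```python
import numpy as np

def exponential_semivariogram(h, sill, range_par, nugget=0.0):
    """gamma(h) = nugget + (sill - nugget) * (1 - exp(-h / range_par)) for h > 0."""
    h = np.asarray(h, dtype=float)
    gamma = nugget + (sill - nugget) * (1.0 - np.exp(-h / range_par))
    return np.where(h > 0.0, gamma, 0.0)

def empirical_semivariogram(positions, values, lags, tol):
    """Mean of 0.5 * (z_i - z_j)^2 over pairs separated by each lag (within tol)."""
    positions = np.asarray(positions, dtype=float)
    values = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(values), k=1)       # all distinct pairs
    dist = np.abs(positions[i] - positions[j])
    half_sq = 0.5 * (values[i] - values[j]) ** 2
    return np.array([half_sq[np.abs(dist - lag) <= tol].mean() for lag in lags])

# Toy residuals on an equally spaced 1-D grid.
positions = np.arange(6.0)
values = np.array([0.0, 1.0, 0.0, 1.0, 0.0, 1.0])
gamma_hat = empirical_semivariogram(positions, values, lags=[1.0, 2.0], tol=0.25)

# Theoretical exponential model with the range and sill reported in the text.
gamma_model = exponential_semivariogram([0.81, 5.0, 25.0], sill=1.61, range_par=0.81)
```

A semivariogram that rises with distance before levelling off at the sill, as in Figure 2, signals spatially correlated residuals; a flat one signals that no spatial structure remains.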

Given the presence of spatial dependence in the residuals, it then made sense to fit a GLSM, maintaining four explanatory variables and assuming an exponential model for the process $S(x)$. The covariance function considered was of the type $\mathrm{cov}(S(x), S(x')) = \sigma^2 \rho(x, x'; \varphi) + \tau^2 1\{x = x'\}$, with correlation function $\rho(u; \varphi) = \exp(-u/\varphi)$. The fit was made using Bayesian inference implemented via the MCMC algorithms. The first 10,000 sample observations from the simulation were discarded as burn-in, at which point it was considered that convergence had been achieved. The subsequent samples were used to obtain the posterior distribution of the parameters of interest. The chain was sampled at every 100th of the 50,000 iterations to obtain samples containing 500 values (for further details, see [17]).

Figure 3: Two-dimensional likelihood profile for $(\varphi, \tau^2)$ for the GLSM.
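
The burn-in and thinning scheme described above amounts to simple index arithmetic, sketched here in Python. The random trace is only a stand-in for real MCMC output, and the sketch assumes that the 50,000 iterations are counted after the 10,000 burn-in draws, which the text does not state explicitly.

```python
import numpy as np

burn_in = 10_000   # initial draws discarded
n_iter = 50_000    # iterations retained for sampling (assumed to follow burn-in)
thin = 100         # keep every 100th draw

rng = np.random.default_rng(0)
chain = rng.normal(size=burn_in + n_iter)   # stand-in for a real MCMC trace

# Drop the burn-in draws, then keep every 100th of the remaining ones.
kept = chain[burn_in::thin]

# Posterior summaries of the kind reported in Tables 2 and 3.
lower, q25, q75, upper = np.quantile(kept, [0.025, 0.25, 0.75, 0.975])
```

Here `kept` holds 50,000 / 100 = 500 values, matching the sample size quoted in the text. Thinning reduces the autocorrelation between retained draws at the cost of discarding most of the chain.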

Using this procedure, the parameters were estimated as $\varphi = 0.82$, $\sigma^2 = 1.46$, $\tau^2 = 0.09$, and $\beta = (-27.7021, 0.2521, 0.7334, 0.9189, 1.2911)$. Table 3 shows the estimated $\beta$ coefficients and their 2.5%, 25%, 75%, and 97.5% quantiles. Once again, we can see how, for a significance level of $\alpha = 0.05$, only the RQD variable was not significant.

It is important to remember that these parameters should be interpreted conditionally and not marginally, and so should not be directly compared with the parameters estimated for the GLM. Direct comparison of a spatial and nonspatial GLM could lead to erroneous conclusions, as the estimation methods are fundamentally different. Nonetheless, there is a certain agreement in the conclusions to be drawn from these tables, with variables such as crenulation, kink band, and microfracture remaining significant. RQD, on the other hand, was not significant at the 5% level.

Figure 3 shows the likelihood profile in two dimensions for parameters $(\varphi, \tau^2)$ of the model, illustrating the flatness of the likelihood surface obtained using the MCMC algorithms.

A study of the spatial dependence of the GLSM residuals indicated that the spatial component had been correctly modelled on this occasion. Figure 4 shows an experimental semivariogram of the GLSM residuals and a theoretical model fitted according to a nugget effect model. It is clear that the empirical semivariogram is essentially flat, which suggests a suitable fit to the spatial structure. A direct comparison between Figures 2 and 4 leads to the conclusion that the spatial dependence has been properly captured by the stationary Gaussian process $S$.

The ROC curve and AUC were calculated for the GLSM. The AUC of the binary spatial model was 0.99, which indicates a substantial improvement in precision over GLM2 (AUC of 0.92). This improvement is reflected in Figure 5, which depicts the ROC curves and AUC values for the GLM2 and GLSM binary models.

Figure 4: Experimental semivariogram for the GLSM residuals (points) together with the theoretical model fitted according to a nugget effect model (continuous line); semivariance is plotted against distance.

Table 4: Error rates for the binary models for three scenarios representing different levels of prediction difficulty.

        5% test   10% test  15% test
GLM     6.6E−2    6.3E−2    5.6E−2
GLSM    4.1E−2    5.1E−2    4.6E−2

The comparison between the two models was completed with a simulation study designed to compare the reliability of the predictions in three scenarios of varying levels of difficulty. For the first scenario, 95% of the 313 initial observations were randomly selected to form the training set used to fit the GLM and GLSM. The fitted models were then validated with the remaining 5% of the observations. This procedure was repeated 100 times, and the number of errors in the slate exploitability prediction was recorded for each repetition. For the second scenario, we randomly selected 90% of the observations for the training set, and the remaining 10% made up the test set. This simulation was also repeated 100 times. The procedure for the third scenario was similar, but this time 85% and 15% of observations made up the training and test sets, respectively.

Table 4 displays the error rates for the three scenarios described, with the error rate calculated as the ratio between the number of prediction errors and the total number of predictions in the test set.
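
The repeated random-split procedure behind Table 4 can be sketched generically in Python. The sketch does not reproduce the GLM or GLSM fits: the threshold classifier, the simulated data, and all function names are hypothetical stand-ins, and only the splitting scheme and the error-rate definition follow the description in the text.

```python
import numpy as np

def repeated_holdout_error(X, y, fit, predict, test_fraction, n_repeats=100, seed=0):
    """Mean misclassification rate over repeated random train/test splits."""
    rng = np.random.default_rng(seed)
    n = len(y)
    n_test = max(1, int(round(test_fraction * n)))
    rates = []
    for _ in range(n_repeats):
        perm = rng.permutation(n)
        test, train = perm[:n_test], perm[n_test:]
        model = fit(X[train], y[train])
        errors = np.sum(predict(model, X[test]) != y[test])
        rates.append(errors / n_test)   # errors over predictions in the test set
    return float(np.mean(rates))

# Hypothetical stand-in classifier: threshold one covariate at the midpoint
# between the two class means observed in the training data.
def fit_threshold(X, y):
    return 0.5 * (X[y == 1, 0].mean() + X[y == 0, 0].mean())

def predict_threshold(threshold, X):
    return (X[:, 0] > threshold).astype(int)

# Simulated data with the same number of observations as the study (313).
rng = np.random.default_rng(1)
y = rng.integers(0, 2, size=313)
X = (y + rng.normal(scale=0.5, size=313)).reshape(-1, 1)

rates = {
    fraction: repeated_holdout_error(X, y, fit_threshold, predict_threshold, fraction)
    for fraction in (0.05, 0.10, 0.15)
}
```

Replacing `fit_threshold` and `predict_threshold` with wrappers around two competing models would give one row of error rates per model, in the layout of Table 4.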

Figure 5: ROC curves and corresponding AUC values for the binary GLM2 and the GLSM (sensitivity against 1 − specificity; AUC 0.99 for the GLSM and 0.92 for the GLM).

In all cases, it can be observed that the GLSM provided a better explanation not only of the effect of the variables determining slate quality but also of the spatial behaviour of exploitable slate, thereby producing lower prediction error rates.

5. Conclusions

A general interpretation of the GLSM used in our analysis is that the spatial term $S$ represents the cumulative effect of possible explanatory variables with an undetermined spatial structure, which have therefore not been observed.

In GLMs, the fact that the spatial correlation of the variables is not taken into account can significantly affect the quality of statistical results. Our study highlights the potential risk of using GLMs when the data are spatially structured.

The conclusion reached after comparing ROC curves and their corresponding AUCs is that GLSMs predict slate exploitability better than GLMs. Therefore, it would seem essential to include unexplained spatial variation when modelling spatially correlated variables.

Based on the comparison of the semivariograms of the GLM and GLSM residuals, we would like to draw attention to the presence of spatial dependence in the GLM residuals, in contrast to what occurs when a GLSM is implemented. This indicates that spatial dependence has been captured correctly by the stationary Gaussian process $S$. We can therefore conclude that a GLSM is more suitable for modelling spatial dependence, which is overlooked by classic GLMs.

The simulation study demonstrates that, for varying levels of prediction difficulty, the GLSM had lower error rates than the GLM.

Although the parameters of the GLSM must be interpreted conditionally on $S$ rather than marginally, the results of the statistical analysis denote the broader potential of the GLSM compared to the classic GLM in analysing spatial data. They also underline the potential risk of reaching erroneous conclusions when using nonspatial models to analyse spatially structured data.

Acknowledgments

This work was funded partly by the Project INCITE10REM304009PR of the Xunta de Galicia and by the Projects MTM2008-03129 and MTM2011-23204 of the Ministry of Science and Innovation.

References

[1] J. M. Matías, A. Vaamonde, J. Taboada, and W. González-Manteiga, “Support vector machines and gradient boosting for graphical estimation of a slate deposit,” Stochastic Environmental Research and Risk Assessment, vol. 18, no. 5, pp. 309–323, 2004.

[2] F. G. Bastante, J. Taboada, L. Alejano, and E. Alonso, “Optimization tools and simulation methods for designing and evaluating a mining operation,” Stochastic Environmental Research and Risk Assessment, vol. 22, no. 6, pp. 727–735, 2008.

[3] J. A. Nelder and R. W. M. Wedderburn, “Generalized linear models,” Journal of the Royal Statistical Society A, vol. 135, pp. 370–384, 1972.

[4] P. McCullagh and J. A. Nelder, Generalized Linear Models, Chapman and Hall, London, UK, 1989.

[5] O. F. Christensen and R. Waagepetersen, “Bayesian prediction of spatial count data using generalized linear mixed models,” Biometrics, vol. 58, no. 2, pp. 280–286, 2002.

[6] P. Diggle, R. Moyeed, B. Rowlingson, and M. Thomson, “Childhood malaria in the Gambia: a case-study in model-based geostatistics,” Journal of the Royal Statistical Society C, vol. 51, no. 4, pp. 493–506, 2002.

[7] P. J. Diggle, P. J. Ribeiro, and O. F. Christensen, “An introduction to model-based geostatistics,” in Spatial Statistics and Computational Methods, J. Møller, Ed., pp. 43–86, Springer, New York, NY, USA, 2003.

[8] H. Zhang, “On estimation and prediction for spatial generalized linear mixed models,” Biometrics, vol. 58, no. 1, pp. 129–136, 2002.

[9] H. Zhang, “Optimal interpolation and the appropriateness of cross-validating variogram in spatial generalized linear mixed models,” Journal of Computational and Graphical Statistics, vol. 12, no. 3, pp. 698–713, 2003.

[10] N. Breslow and D. Clayton, “Approximate inference in generalized linear mixed models,” Journal of the American Statistical Association, vol. 88, pp. 9–25, 1993.

[11] P. J. Diggle, J. A. Tawn, and R. A. Moyeed, “Model-based geostatistics,” Journal of the Royal Statistical Society C, vol. 47, no. 3, pp. 299–350, 1998.

[12] P. J. Diggle, K. Y. Liang, and S. L. Zeger, The Analysis of Longitudinal Data, Clarendon, Oxford, UK, 1994.

[13] R Development Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria, 2012, http://www.r-project.org.

[14] M. X. Rodríguez-Álvarez, J. Roca-Pardiñas, and C. Cadarso-Suárez, “ROC curve and covariates: extending induced methodology to the non-parametric framework,” Statistics and Computing, vol. 21, no. 4, pp. 483–499, 2011.

[15] J. Taboada, A. Vaamonde, A. Saavedra, and A. Argüelles, “Quality index for ornamental slate deposits,” Engineering Geology, vol. 50, no. 1-2, pp. 203–210, 1998.

[16] V. de Oliveira, B. Kedem, and D. A. Short, “Bayesian prediction of transformed Gaussian random fields,” Journal of the American Statistical Association, vol. 92, no. 440, pp. 1422–1433, 1997.

[17] O. F. Christensen, “Monte Carlo maximum likelihood in model-based geostatistics,” Journal of Computational and Graphical Statistics, vol. 13, no. 3, pp. 702–718, 2004.


Page 3: estadistica spatial

Journal of Applied Mathematics 3

Table 1 Correlation matrix of the 119901 = 9 explanatory variables that affected the results of borehole logging

(a)

RQD Veins Crenulation Kink bands Sandy laminationsRQD 1Veins 01624 1Crenulation 03038 01631 1Kink bands 01040 00408 01962 1Sandy laminations 00177 00187 00200 00736 1Microfractures 06696 00035 01284 minus00590 minus00138

Pyrite minus01953 minus00403 minus01221 minus00891 minus01202

Oxidation 02053 00837 01722 00821 minus00645

Rough cleavage minus00908 minus00593 02096 minus00058 minus00454

(b)

Microfractures Pyrite Oxidation Rough cleavageMicrofractures 1Pyrite minus01587 1Oxidation 01162 minus01152 1Rough cleavage minus00989 00065 minus00366 1

(vi) Microfractures presence of barely visible fractureswhich determine the breakage resistance of slabsmeasuring 3ndash5mm thick

(vii) Pyrite presence of iron sulphides

(viii) Oxidation degree of oxidation of iron sulphides in theslate

(ix) Rough cleavage slatewith poor fissility due to texturalheterogeneity

In-depth knowledge of the variability and distribution ofexploitable slate and possible correlation between propertiesare conducive to the use of GLSM to spatially model thegeographic database

Table 1 shows the correlation matrix of the explanatoryvariables given above in order to know the degree of depen-dence between them

32 Model Formulation The response variable 119884(119909) takesthe values 0 or 1 to indicate disposable or exploitable slaterespectively in a particular location 119909 It is assumed in whatfollows that slate exploitability is a spatial phenomenon thatcan bemodelled using aGLSM In otherwords conditional tothe Gaussian process 119878(119909) the data 119884(119909

119894) 119894 = 1 119899 follow

the classic GLM The role of 119878(119909) is therefore to explainthe residual spatial variation after considering all the knownexplanatory variables It is also reasonable to assume thatthe conditional distribution of exploitability can be modelledas a binomial distribution which is why a binomial errordistribution was considered in our study

Binomial error distribution was used by Diggle et al [6]and Zhang [8] A class of transformations that can be used aslink functions for this distribution was described by [16] Weassume the stationary Gaussian process 119878(sdot) to be the basis fora model of spatial variation in the probability 119875(119909) that the

slate in119909 is exploitable but with a logit transformation tomapthe domain of 119878(sdot) onto the unit interval Thus

119892 [119875 (119909)] = log 119875 (119909)[1 minus 119875 (119909)]

= 120583 + 119878 (119909) (1)

The regression function 119864[119884(119909119894) | 119878(119909

119894)] varies spatially only

through 119878(119909) in the locations 119909119894

We adopted a Bayesian framework for inference andprediction of the parameters using algorithms based onMCMC

The parameters of this binomial GLSM are 120579 = (1205902 120593)and 120573 = (120573

0 120573

119901) where 120573

0is the independent term

and 1205731 120573

119901are the regression coefficients corresponding

to each known dependent variable

4 Statistical Analysis

We initially included all the variables that characterizeslate exploitability namely RQD veins crenulations kinkbands sandy laminations microfractures pyrite oxidationand poor fissility Taking this data and considering slateexploitability as the response variable we fitted a binaryGLM called GLM1 A ROC curve was estimated for thiscomplete binary model and the AUC was 099

Next binary GLMs were fitted to different groups ofdependent variables in an attempt to find the minimumnumber of variables that would provide a high AUC valuethat is close to 099 The model called GLM2 fitted withthe RQD crenulation kink band andmicrofracture variablesobtained an AUC of 092 Figure 1 shows the ROC curves andthe corresponding AUC values for both GLM1 and GLM2The study was continued with the four variables included inGLM2 given that the reduction in the number of variablesdid not overly affect the accuracy of the model A binarynonspatial GLM was fitted using Bayesian methods and the

4 Journal of Applied Mathematics

Table 2 Estimated coefficients and 25 25 75 and 975 quantiles for the GLM

Coefficient 25 quantile 25 quantile 75 quantile 975 quantile1205730(intercept) minus356747 minus776804 minus490098 minus260904 minus186385

1205731(RQD) 12826 minus11476 04192 20516 396331205732(crenulation) 05875 03288 05089 06761 083521205733(kink band) 05668 03965 05019 06395 078571205734(microfracture) 25815 09221 16093 39181 67967

Table 3 Estimated coefficients and 25 25 75 and 975 quantiles for the GLSM

Coefficient 25 quantile 25 quantile 75 quantile 975 quantile1205730(intercept) minus277021 minus320664 minus296753 minus247446 minus177297

1205731(RQD) 02521 minus00385 01291 03726 053881205732(crenulation) 07334 03762 06121 08283 106911205733(kink band) 09189 05881 07802 10274 122611205734(microfracture) 12911 08834 10125 14480 16261

00 02 04 06 08 10

00

02

04

06

08

10

ROC plot

Sens

itivi

ty (t

rue p

ositi

ves)

AUC099 GLM1092 GLM2

1-specificity (false positives)

Figure 1 ROC curves and the corresponding AUC values for thetwo binary GLMs included in the study

MCMClogit function from the MCMCpack (R language)The vector120573 of the parameters fitted for theGLM2model was120573 = (minus356747 12826 05875 05668 25815) where the firstvalue is the independent term and the remaining values arethe corresponding coefficients of the explanatory variables inthe same order as they were mentioned above

Table 2 shows the estimated coefficients and the 2525 75 and 975 quantiles for the fitted model It can beobserved that for a significance level of 120572 = 005 the onlynonsignificant variable is the RQD

The study of the residuals of the GLM2 model fitted with four explanatory variables detected a spatial dependence that could be modelled using an exponential theoretical semivariogram with range 0.81 and sill 1.61. Figure 2 shows the experimental semivariogram for the GLM2 residuals together with the corresponding fitted theoretical model.

Figure 2: Experimental semivariogram for the GLM2 residuals (points) together with the theoretical model fitted according to an exponential model (continuous line). Axes: distance against semivariance.
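The fitted theoretical model can be written out explicitly. The sketch below evaluates an exponential semivariogram $\gamma(h) = c\,(1 - e^{-h/a})$ with the reported sill $c = 1.61$ and range $a = 0.81$. It assumes that the reported range is the scale parameter $a$ rather than the practical range (about $3a$ for this model), and that there is no nugget; the text does not settle either point, and the function name is illustrative.

```python
import math


def exponential_semivariogram(h, sill, range_param, nugget=0.0):
    """Exponential semivariogram: nugget + (sill - nugget) * (1 - exp(-h / a)).

    `sill` is the total sill; `range_param` is the scale parameter a.
    By convention the semivariogram is exactly zero at lag zero.
    """
    if h < 0:
        raise ValueError("lag distance must be non-negative")
    if h == 0:
        return 0.0
    return nugget + (sill - nugget) * (1.0 - math.exp(-h / range_param))


# Values at a few lags for the reported parameters (sill 1.61, range 0.81).
for lag in (0.0, 0.81, 2.43, 10.0):
    print(lag, exponential_semivariogram(lag, sill=1.61, range_param=0.81))
```

At a lag equal to the scale parameter the curve reaches about 63% of the sill, and at three times that lag about 95%, which is why the practical range is often quoted as $3a$.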

Given the presence of spatial dependence in the residuals, it then made sense to fit a GLSM, maintaining four explanatory variables and assuming an exponential model for the process $S(x)$. The covariance function considered was of the type $\operatorname{cov}(S(x), S(x')) = \sigma^2 \rho(x, x'; \varphi) + \tau^2 \mathbf{1}_{\{x = x'\}}$, with $\rho(u; \varphi) = \exp(-u/\varphi)$, where $u$ denotes the distance between $x$ and $x'$. The fit was made using Bayesian inference implemented via MCMC algorithms. The first 10,000 sample observations from the simulation were discarded as burn-in, at which point it was considered that convergence had been achieved. The subsequent samples were used to obtain the posterior distribution of the parameters of interest. The chain was sampled every 100 of the 50,000 iterations to obtain samples containing 500 values (for further details see [17]).

Figure 3: Two-dimensional likelihood profile for $(\varphi, \tau^2)$ for the GLSM. Axes: $\varphi$ against $\tau^2_{\mathrm{rel}}$.
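The burn-in and thinning arithmetic described here (10,000 discarded draws, then one draw kept out of every 100 over 50,000 iterations, giving 500 values) can be checked with a short Python sketch. The chain below is a stand-in made of iteration indices rather than real posterior draws. The helper names are hypothetical, and the assumption that the 50,000 iterations come after the burn-in follows from 50,000 / 100 = 500.

```python
def thin_chain(chain, burn_in, thin):
    """Drop the first `burn_in` draws, then keep every `thin`-th draw."""
    if burn_in < 0 or thin < 1:
        raise ValueError("burn_in must be >= 0 and thin must be >= 1")
    return chain[burn_in::thin]


def quantile(sorted_values, q):
    """Linear-interpolation quantile of an already sorted list (0 <= q <= 1)."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    weight = position - lower
    return (1 - weight) * sorted_values[lower] + weight * sorted_values[upper]


# Stand-in for 60,000 MCMC draws of one parameter (here just the iteration index).
draws = list(range(60_000))
kept = thin_chain(draws, burn_in=10_000, thin=100)
print(len(kept))  # 500

# Posterior summaries of the kind reported in Tables 2 and 3.
ordered = sorted(kept)
summary = {q: quantile(ordered, q) for q in (0.025, 0.25, 0.5, 0.75, 0.975)}
```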

Using this procedure, the parameters were estimated as $\varphi = 0.82$, $\sigma^2 = 1.46$, $\tau^2 = 0.09$, and $\beta = (-27.7021, 0.2521, 0.7334, 0.9189, 1.2911)$. Table 3 shows the estimated $\beta$ coefficients and their 2.5%, 25%, 75%, and 97.5% quantiles. Once again, we can see that, for a significance level of $\alpha = 0.05$, only the RQD variable was not significant.
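With these estimates, the covariance structure assumed for the spatial process, $\sigma^2 \exp(-u/\varphi)$ plus a nugget $\tau^2$ at coincident locations, can be evaluated for any set of sampling points. The following Python sketch builds the covariance matrix for three invented coordinates. The coordinates and the function name are hypothetical, and the indicator $\mathbf{1}_{\{x = x'\}}$ is implemented by index equality, which assumes distinct sampling locations.

```python
import math


def covariance_matrix(coords, sigma2, phi, tau2):
    """Exponential covariance sigma2 * exp(-u / phi), plus nugget tau2 on the diagonal."""
    n = len(coords)
    cov = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            u = math.dist(coords[i], coords[j])  # Euclidean distance
            cov[i][j] = sigma2 * math.exp(-u / phi)
            if i == j:
                cov[i][j] += tau2  # nugget only where x = x'
    return cov


# Invented locations; the second lies exactly one phi away from the first.
points = [(0.0, 0.0), (0.82, 0.0), (0.0, 1.64)]
cov = covariance_matrix(points, sigma2=1.46, phi=0.82, tau2=0.09)
for row in cov:
    print([round(value, 4) for value in row])
```

The diagonal equals $\sigma^2 + \tau^2 = 1.55$, and the covariance between two points separated by a distance $\varphi$ drops to $\sigma^2 e^{-1}$, roughly 37% of $\sigma^2$.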

It is important to remember that these parameters should be interpreted conditionally, not marginally, and so should not be directly compared with the parameters estimated for the GLM. Direct comparison of a spatial and a nonspatial GLM could lead to erroneous conclusions, as the estimation methods are fundamentally different. Nonetheless, there is a certain agreement in the conclusions to be drawn from these tables: crenulation, kink band, and microfracture remain significant, whereas RQD is not significant at the 5% level.
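The distinction between conditional and marginal interpretation can be illustrated numerically. In a logistic model with a Gaussian random effect $S \sim N(0, \sigma^2)$, the probability conditional on $S = 0$ is $\operatorname{logit}^{-1}(\eta)$, whereas the marginal probability averages $\operatorname{logit}^{-1}(\eta + S)$ over $S$ and is pulled towards 0.5. The sketch below is a generic illustration under that simplified one-dimensional assumption, not a computation from the paper; it uses $\sigma^2 = 1.46$ only as a convenient value and integrates with a plain trapezoidal rule.

```python
import math


def expit(eta):
    """Numerically stable inverse logit."""
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)
    return e / (1.0 + e)


def marginal_probability(eta, sigma2, n_grid=2001):
    """E[expit(eta + S)] for S ~ N(0, sigma2), by the trapezoidal rule on +/- 8 sd."""
    sd = math.sqrt(sigma2)
    low, high = -8.0 * sd, 8.0 * sd
    step = (high - low) / (n_grid - 1)
    total = 0.0
    for k in range(n_grid):
        s = low + k * step
        weight = 0.5 if k in (0, n_grid - 1) else 1.0
        density = math.exp(-0.5 * (s / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))
        total += weight * expit(eta + s) * density * step
    return total


eta = 2.0  # hypothetical value of the linear predictor
conditional = expit(eta)
marginal = marginal_probability(eta, sigma2=1.46)
print(conditional, marginal)  # the marginal value lies between 0.5 and the conditional one
```

Because of this attenuation, a coefficient of a given size implies a weaker marginal effect in the spatial model than the same coefficient would in a nonspatial GLM, which is one reason the two sets of estimates are not directly comparable.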

Figure 3 shows the likelihood profile in two dimensions for the parameters $(\varphi, \tau^2)$ of the model, illustrating the flatness of the likelihood surface obtained using the MCMC algorithms.

A study of the spatial dependence of the GLSM residuals indicated that the spatial component had been correctly modelled on this occasion. Figure 4 shows an experimental semivariogram of the GLSM residuals and a theoretical model fitted according to a nugget effect model. It is clear that the empirical semivariogram is essentially flat, which suggests a suitable fit to the spatial structure. A direct comparison between Figures 2 and 4 leads to the conclusion that the spatial dependence has been properly captured by the stationary Gaussian process $S$.
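The experimental semivariograms used in this residual check can be computed with the classical (Matheron) estimator, which averages half the squared differences of residual pairs within each distance class. The Python sketch below shows that estimator on a tiny invented data set. The binning scheme, the function name, and the data are illustrative assumptions, and the source does not state which estimator or software routine was actually used.

```python
import math


def empirical_semivariogram(coords, values, bin_width, n_bins):
    """Classical estimator: per distance bin, the mean of 0.5 * (z_i - z_j)^2.

    Returns a list of (bin centre, semivariance) for the non-empty bins.
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    n = len(values)
    for i in range(n):
        for j in range(i + 1, n):
            h = math.dist(coords[i], coords[j])
            k = int(h // bin_width)  # index of the distance class
            if k < n_bins:
                sums[k] += 0.5 * (values[i] - values[j]) ** 2
                counts[k] += 1
    return [
        ((k + 0.5) * bin_width, sums[k] / counts[k])
        for k in range(n_bins)
        if counts[k] > 0
    ]


# Three invented residuals on a line with a linear trend: the semivariance
# grows with distance, the signature of unmodelled spatial structure.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
residuals = [0.0, 1.0, 2.0]
print(empirical_semivariogram(coords, residuals, bin_width=1.0, n_bins=3))
# [(1.5, 0.5), (2.5, 2.0)]
```

A semivariogram that stays roughly constant across bins, as reported for the GLSM residuals, corresponds to a pure nugget model with no remaining spatial correlation.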

The ROC curve and AUC were calculated for the GLSM. The AUC of the binary spatial model was 0.99, compared with 0.92 for GLM2, which indicates a substantial improvement in the precision of the GLSM. This improvement is reflected in Figure 5, which depicts the ROC curves and AUC values for the GLM2 and GLSM binary models.

Figure 4: Experimental semivariogram for the GLSM residuals (points) together with the theoretical model fitted according to a nugget effect model (continuous line). Axes: distance against semivariance.

Table 4: Error rates for the binary models for three scenarios representing different levels of prediction difficulty.

| Model | 5% test | 10% test | 15% test |
|---|---|---|---|
| GLM | 6.6E−2 | 6.3E−2 | 5.6E−2 |
| GLSM | 4.1E−2 | 5.1E−2 | 4.6E−2 |

The comparison between the two models was completed with a simulation study designed to compare the reliability of the predictions in three scenarios of varying levels of difficulty. For the first scenario, 95% of the 313 initial observations were randomly selected to form the training set used to fit the GLM and the GLSM. The fitted models were then validated with the remaining 5% of the observations. This procedure was repeated 100 times, and the number of errors in the slate exploitability prediction was recorded for each repetition. For the second scenario, we randomly selected 90% of the observations for the training set, and the remaining 10% made up the test set. This simulation was also repeated 100 times. The procedure for the third scenario was similar, but this time 85% and 15% of the observations made up the training and test sets, respectively.
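The repeated random-split procedure described above can be sketched generically in Python. The code is a sketch under stated assumptions, not the authors' implementation: `fit` and `predict` are placeholders for any binary model, the toy threshold classifier and the synthetic data are invented, the test-set size is obtained by rounding the fraction of the sample, and errors are pooled over all repetitions before dividing by the total number of predictions.

```python
import random


def repeated_holdout_error(data, fit, predict, test_fraction, repetitions=100, seed=0):
    """Pooled error rate over repeated random train/test splits."""
    rng = random.Random(seed)
    n_test = max(1, round(test_fraction * len(data)))
    errors = 0
    total = 0
    for _ in range(repetitions):
        shuffled = data[:]
        rng.shuffle(shuffled)
        test, train = shuffled[:n_test], shuffled[n_test:]
        model = fit(train)
        errors += sum(1 for x, y in test if predict(model, x) != y)
        total += len(test)
    return errors / total


def fit_threshold(train):
    """Toy model: threshold halfway between the two class means."""
    pos = [x for x, y in train if y == 1]
    neg = [x for x, y in train if y == 0]
    if not pos or not neg:
        return 0.5  # fallback when one class is missing from the training set
    return 0.5 * (sum(pos) / len(pos) + sum(neg) / len(neg))


def predict_threshold(threshold, x):
    return 1 if x > threshold else 0


# Synthetic data with the same sample size as the study (313 observations).
data = [(i / 312, 1 if i / 312 > 0.5 else 0) for i in range(313)]
for fraction in (0.05, 0.10, 0.15):
    rate = repeated_holdout_error(data, fit_threshold, predict_threshold, fraction)
    print(fraction, rate)
```

Fixing the seed makes the comparison between models reproducible; in a comparison of two models it would also be natural to evaluate both on exactly the same splits.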

Table 4 displays the error rates for the three scenarios described, with the error rate calculated as the ratio between the number of prediction errors and the total number of predictions in the test set.

Figure 5: ROC curves and corresponding AUC values for the binary GLM2 (AUC = 0.92) and the GLSM (AUC = 0.99). Axes: 1 − specificity (false positives) against sensitivity (true positives).

In all cases, it can be observed that the GLSM provided a better explanation, not only of the effect of the variables determining slate quality but also of the spatial behaviour of exploitable slate, thereby producing lower prediction error rates.

5. Conclusions

A general interpretation of the GLSM used in our analysis is that the spatial term $S$ represents the cumulative effect of possible explanatory variables that have an undetermined spatial structure and have therefore not been observed.

In GLMs, the fact that the spatial correlation of the variables is not taken into account can significantly affect the quality of the statistical results. Our study highlights the potential risk of using GLMs when the data are spatially structured.

The conclusion reached after comparing ROC curves and their corresponding AUCs is that GLSMs predict slate exploitability better than GLMs. Therefore, it would seem essential to include unexplained spatial variation when modelling spatially correlated variables.

Based on the comparison of the semivariograms of the GLM and GLSM residuals, we would like to draw attention to the presence of spatial dependence in the GLM residuals, in contrast to what occurs when a GLSM is implemented. This indicates that spatial dependence has been captured correctly by the stationary Gaussian process $S$. We can therefore conclude that a GLSM is more suitable for modelling spatial dependence, which is overlooked by classic GLMs.

The simulation study demonstrates that, for varying levels of prediction difficulty, the GLSM had lower error rates than the GLM.

Although the parameters of the GLSM must be interpreted conditionally on $S$ rather than marginally, the results of the statistical analysis point to the broader potential of the GLSM compared to the classic GLM in analysing spatial data. They also underline the potential risk of reaching erroneous conclusions when using nonspatial models to analyse spatially structured data.

Acknowledgments

This work was funded partly by the Project INCITE10REM304009PR of the Xunta de Galicia and by the Projects MTM2008-03129 and MTM2011-23204 of the Ministry of Science and Innovation.

References

[1] J. M. Matías, A. Vaamonde, J. Taboada, and W. González-Manteiga, “Support vector machines and gradient boosting for graphical estimation of a slate deposit,” Stochastic Environmental Research and Risk Assessment, vol. 18, no. 5, pp. 309–323, 2004.

[2] F. G. Bastante, J. Taboada, L. Alejano, and E. Alonso, “Optimization tools and simulation methods for designing and evaluating a mining operation,” Stochastic Environmental Research and Risk Assessment, vol. 22, no. 6, pp. 727–735, 2008.

[3] J. A. Nelder and R. W. M. Wedderburn, “Generalized linear models,” Journal of the Royal Statistical Society A, vol. 135, pp. 370–384, 1972.

[4] P. McCullagh and J. A. Nelder, Generalized Linear Models, Chapman and Hall, London, UK, 1989.

[5] O. F. Christensen and R. Waagepetersen, “Bayesian prediction of spatial count data using generalized linear mixed models,” Biometrics, vol. 58, no. 2, pp. 280–286, 2002.

[6] P. Diggle, R. Moyeed, B. Rowlingson, and M. Thomson, “Childhood malaria in the Gambia: a case-study in model-based geostatistics,” Journal of the Royal Statistical Society C, vol. 51, no. 4, pp. 493–506, 2002.

[7] P. J. Diggle, P. J. Ribeiro, and O. F. Christensen, “An introduction to model-based geostatistics,” in Spatial Statistics and Computational Methods, J. Møller, Ed., pp. 43–86, Springer, New York, NY, USA, 2003.

[8] H. Zhang, “On estimation and prediction for spatial generalized linear mixed models,” Biometrics, vol. 58, no. 1, pp. 129–136, 2002.

[9] H. Zhang, “Optimal interpolation and the appropriateness of cross-validating variogram in spatial generalized linear mixed models,” Journal of Computational and Graphical Statistics, vol. 12, no. 3, pp. 698–713, 2003.

[10] N. Breslow and D. Clayton, “Approximate inference in generalized linear mixed models,” Journal of the American Statistical Association, vol. 88, pp. 9–25, 1993.

[11] P. J. Diggle, J. A. Tawn, and R. A. Moyeed, “Model-based geostatistics,” Journal of the Royal Statistical Society C, vol. 47, no. 3, pp. 299–350, 1998.

[12] P. J. Diggle, K. Y. Liang, and S. L. Zeger, The Analysis of Longitudinal Data, Clarendon, Oxford, UK, 1994.

[13] R Development Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria, 2012, http://www.r-project.org.

[14] M. X. Rodríguez-Álvarez, J. Roca-Pardiñas, and C. Cadarso-Suárez, “ROC curve and covariates: extending induced methodology to the non-parametric framework,” Statistics and Computing, vol. 21, no. 4, pp. 483–499, 2011.

[15] J. Taboada, A. Vaamonde, A. Saavedra, and A. Argüelles, “Quality index for ornamental slate deposits,” Engineering Geology, vol. 50, no. 1-2, pp. 203–210, 1998.

[16] V. de Oliveira, B. Kedem, and D. A. Short, “Bayesian prediction of transformed Gaussian random fields,” Journal of the American Statistical Association, vol. 92, no. 440, pp. 1422–1433, 1997.

[17] O. F. Christensen, “Monte Carlo maximum likelihood in model-based geostatistics,” Journal of Computational and Graphical Statistics, vol. 13, no. 3, pp. 702–718, 2004.



Page 5: estadistica spatial

Journal of Applied Mathematics 5

minus5

minus4

minus4

minus4

minus3

minus3

minus2

minus2

minus2

3

10

20 40

00 05 10 15

000

005

010

015

020

025

030

120593

1205912 re

l

Figure 3 Two-dimensional likelihood profile for (120593 1205912) for theGLSM

interest The chain was sampled for each 100 of the 50000iterations to obtain samples containing 500 values (for furtherdetails see [17])

Using this procedure the parameters were estimated as120593 = 082 120590

2= 146 120591

2= 009 and 120573 = (minus277021

02521 07334 09189 12911) Table 3 shows the estimated 120573coefficients and their 25 25 75 and 975 quantilesOnce againwe can see how for a significance level of120572 = 005only the RQD variable was not significant

It is important to remember that these parametersshould be conditionally and not marginally interpreted andso should not be directly compared with the parametersestimated for the GLM Direct comparison of a spatial andnonspatial GLM could lead to erroneous conclusions as theestimation methods are fundamentally different Nonethe-less there is a certain correlation in the conclusions to bedrawn from these tables with variables such as crenulationkink band andmicrofracture remaining significant RQD onthe other hand was not significant at 5 level

Figure 3 shows the likelihood profile in two dimensionsfor parameters (120593 1205912) of the model illustrating the flatness ofthe likelihood surface obtained using theMCMC algorithms

A study of the spatial dependence of the GLSM residualsindicated that the spatial component had been correctlymodelled on this occasion Figure 4 shows an experimentalsemivariogram of the GLSM residuals and a theoreticalmodel fitted according to a nugget effect model It is clearthat the empirical semivariogram is essentially flat whichsuggests a suitable fit to the spatial structure A directcomparison between Figures 2 and 4 leads to the conclusionthat the spatial dependence has been properly captured by thestationary Gaussian process 119878

The ROC curve and AUC were calculated for the GLSMThe AUC of the binary spatial model was 099 which

Distance

Sem

ivar

ianc

e5 10 15 20 25

1eminus06

2eminus06

3eminus06

4eminus06

Figure 4 Experimental semivariogram for the GLSM residuals(points) together with the theoretical model fitted according to anugget effect model (continuous line)

Table 4 Error rates for the binary models for three scenariosrepresenting different levels of prediction difficulty

5 test 10 test 15 testGLM 66119864 minus 2 63119864 minus 2 56119864 minus 2

GLSM 41119864 minus 2 51119864 minus 2 46119864 minus 2

indicates a substantial improvement in the precision of theGLSM This improvement is reflected in Figure 5 whichdepicts the ROC curves and AUC values for the GLM2 andGLSM binary models

The comparison between the two models was completedwith a simulation study designed to compare the reliabilityof the predictions in three scenarios of varying levels ofdifficulty Randomly selected for the first scenario was 95 ofthe 313 initial observations composing the training set thatwas used to fit the GLM and GLSM The fitted models werethen validated with the remaining 5 of the observationsThis procedure was repeated 100 times and the numberof errors in the slate exploitability prediction was recordedfor each repetition For the second scenario we randomlyselected 90 of the observations for the training set and theremaining 10made up the test set This simulation was alsorepeated 100 times The procedure for the third scenario wassimilar but this time 85 and 15 of observations made upthe training and test sets respectively

Table 4 displays the error rates for the three scenariosdescribed with the error rate calculated as the ratio betweenthe number of prediction errors and the total number ofpredictions in the test set

6 Journal of Applied Mathematics

00 02 04 06 08 10

00

02

04

06

08

10

ROC plot

Sens

itivi

ty (t

rue p

ositi

ves)

AUC

099 GLSM092 GLM

1-specificity (false positives)

Figure 5 ROC curves and correspondingAUCvalues for the binaryGLM2 and the GLSM

In all the cases it can be observed that theGLSMprovideda better explanation not only of the effect of the variablesdetermining slate quality but also of the spatial behaviourof exploitable slate thereby producing lower prediction errorrates

5 Conclusions

A general interpretation of the GLSM used in our analysis isthat the spatial term 119878 represents the accumulative effect ofpossible explanatory variables with an undetermined spatialstructure which have therefore not been observed

In GLMs the fact that the spatial correlation of thevariables is not taken into account can significantly affect thequality of statistical results Our study highlights the potentialrisk of using GLMs when the data is spatially structured

The conclusion reached after comparing ROC curvesand their corresponding AUCs is that GLSMs predict slateexploitability better than GLMs Therefore it would seemessential to include unexplained spatial variation when mod-elling spatially correlated variables

Based on the comparison of the semivariograms of theGLM and GLSM residuals we would like to draw attentionto the presence of spatial dependence in the GLM residualsin contrast towhat occurswhen aGLSM is implementedThisindicates that spatial dependence has been captured correctlyby the stationary Gaussian process 119878 We can therefore

conclude that a GLSM is more suitable for modelling spatialdependence which is overlooked by classic GLMs

The simulation study demonstrates that for varying levelsof prediction difficulty the GLSM had lower error rates thanthe GLM

Although the parameters of the GLSM must be inter-preted conditionally rather than marginally to 119878 the resultsof the statistical analysis denote the broader potential of theGLSM compared to the classic GLM in analysing spatialdata They also underline the potential risk of reachingerroneous conclusions when using nonspatial models toanalyse spatially structured data

Acknowledgments

This work was funded partly by the Project INCITE10REM304009PR of the Xunta de Galicia and by the Projects MTM2008-03129 and MTM2011-23204 of the Ministry of Science and Innovation.

References

[1] J. M. Matías, A. Vaamonde, J. Taboada, and W. González-Manteiga, "Support vector machines and gradient boosting for graphical estimation of a slate deposit," Stochastic Environmental Research and Risk Assessment, vol. 18, no. 5, pp. 309–323, 2004.

[2] F. G. Bastante, J. Taboada, L. Alejano, and E. Alonso, "Optimization tools and simulation methods for designing and evaluating a mining operation," Stochastic Environmental Research and Risk Assessment, vol. 22, no. 6, pp. 727–735, 2008.

[3] J. A. Nelder and R. W. M. Wedderburn, "Generalized linear models," Journal of the Royal Statistical Society A, vol. 135, pp. 370–384, 1972.

[4] P. McCullagh and J. A. Nelder, Generalized Linear Models, Chapman and Hall, London, UK, 1989.

[5] O. F. Christensen and R. Waagepetersen, "Bayesian prediction of spatial count data using generalized linear mixed models," Biometrics, vol. 58, no. 2, pp. 280–286, 2002.

[6] P. Diggle, R. Moyeed, B. Rowlingson, and M. Thomson, "Childhood malaria in the Gambia: a case-study in model-based geostatistics," Journal of the Royal Statistical Society C, vol. 51, no. 4, pp. 493–506, 2002.

[7] P. J. Diggle, P. J. Ribeiro, and O. F. Christensen, "An introduction to model-based geostatistics," in Spatial Statistics and Computational Methods, J. Møller, Ed., pp. 43–86, Springer, New York, NY, USA, 2003.

[8] H. Zhang, "On estimation and prediction for spatial generalized linear mixed models," Biometrics, vol. 58, no. 1, pp. 129–136, 2002.

[9] H. Zhang, "Optimal interpolation and the appropriateness of cross-validating variogram in spatial generalized linear mixed models," Journal of Computational and Graphical Statistics, vol. 12, no. 3, pp. 698–713, 2003.

[10] N. Breslow and D. Clayton, "Approximate inference in generalized linear mixed models," Journal of the American Statistical Association, vol. 88, pp. 9–25, 1993.

[11] P. J. Diggle, J. A. Tawn, and R. A. Moyeed, "Model-based geostatistics," Journal of the Royal Statistical Society C, vol. 47, no. 3, pp. 299–350, 1998.


[12] P. J. Diggle, K. Y. Liang, and S. L. Zeger, The Analysis of Longitudinal Data, Clarendon, Oxford, UK, 1994.

[13] R Development Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria, 2012, http://www.r-project.org.

[14] M. X. Rodríguez-Álvarez, J. Roca-Pardiñas, and C. Cadarso-Suárez, "ROC curve and covariates: extending induced methodology to the non-parametric framework," Statistics and Computing, vol. 21, no. 4, pp. 483–499, 2011.

[15] J. Taboada, A. Vaamonde, A. Saavedra, and A. Argüelles, "Quality index for ornamental slate deposits," Engineering Geology, vol. 50, no. 1-2, pp. 203–210, 1998.

[16] V. de Oliveira, B. Kedem, and D. A. Short, "Bayesian prediction of transformed Gaussian random fields," Journal of the American Statistical Association, vol. 92, no. 440, pp. 1422–1433, 1997.

[17] O. F. Christensen, "Monte Carlo maximum likelihood in model-based geostatistics," Journal of Computational and Graphical Statistics, vol. 13, no. 3, pp. 702–718, 2004.



