Upload
julio-davalos-vasquez
View
8
Download
0
Tags:
Embed Size (px)
DESCRIPTION
estadistica del espacio
Citation preview
Hindawi Publishing CorporationJournal of Applied MathematicsVolume 2013 Article ID 531062 7 pageshttpdxdoiorg1011552013531062
Research ArticleGeneralized Linear Spatial Models to Predict Slate Exploitability
Angeles Saavedra12 Javier Taboada3 Mariacutea Arauacutejo3 and Eduardo Giraacuteldez3
1 Department of Statistics University of Vigo 36310 Vigo Spain2 ETSI MINAS Universidad de Vigo Campus Lagoas-Marcosende Rua Maxwell 36310 Vigo Spain3 Department of Natural Resources University of Vigo 36310 Vigo Spain
Correspondence should be addressed to Angeles Saavedra saavedrauvigoes
Received 11 November 2012 Revised 3 May 2013 Accepted 26 May 2013
Academic Editor Zhiping Qiu
Copyright copy 2013 Angeles Saavedra et al This is an open access article distributed under the Creative Commons AttributionLicense which permits unrestricted use distribution and reproduction in any medium provided the original work is properlycited
The aim of this research was to determine the variables that characterize slate exploitability and to model spatial distribution Ageneralized linear spatialmodel (GLSMs)was fitted in order to explore relationship between exploitability and different explanatoryvariables that characterize slate qualityModelling the influence of these variables and analysing the spatial distribution of themodelresiduals yielded a GLSM that allows slate exploitability to be predicted more effectively than when using generalized linear models(GLM) which do not take spatial dependence into account Studying the residuals and comparing the prediction capacities of thetwo models lead us to conclude that the GLSM is more appropriate when the response variable presents spatial distribution
1 Introduction
The exploitability of a slate deposit depends onmany quality-determining factors that are spatially correlated Knowledgeand study of these factors are essential for the evaluationof deposits [1 2] Therefore it can be fairly safely assumedthat better evaluations of quality parameters and betterpredictions of slate exploitability could be obtained by usingspecific statistical models that take spatial correlation intoaccount
Traditionally the main aim of geostatistical models hasbeen to predict a spatially correlated response variable Underthis approach estimating the parameters of the geostatisticalmodel is not usually the main interest However estimatingand inferring parameters enables a more precise identifica-tion of the factors influencing the geographical distributionof exploitable slate thus allowing greater knowledge to begained regarding the response variable of interest
In our research the model-based geostatistics methodol-ogy was adapted in the analysis of slate exploitability usinga generalized linear spatial model (GLSM) With this typeof model the objective of inference can be focused on theparameters of the regression function on the properties of theresiduals or on the distribution of the residuals conditionallyon the response variable
A brief description of the statistical models used inthis study is given in Section 2 The data studied and theformulated model are described in Section 3 The statisticalresults and analyses are presented in Section 4 and somecomments and the discussion can be found in Section 5
2 Statistical Analysis Methods
21 Generalized Linear Spatial Models Generalized linearmodels (GLMs) were introduced by [3] and studied in depthby [4] and later by several authors (see [5ndash9])
In a GLM a response variable 119884 = (1198841 1198842 119884
119899)
is assumed so that the variables 1198841 1198842 119884
119899are mutually
independent and with its expected value related to a linearpredictor 119864[119884] = 119892minus1(119889119879120573) where 120573 isin R119901 is a vector ofunknown regression parameters d are known explanatoryvariables and 119892 is a known function called link function
An important extension of the GLM is the generalizedlinear mixed model (GLMM) [10] in which the responsevariables are considered independent of one another con-ditionally on the values for a set of latent variables Thegeneralized linear spatial model (GLSM) [11] is basically aGLMM in as much as the latent variables derive from aspatial processThe term ldquomodel-based geostatisticsrdquo was firstused by these authors to describe an approach to geostatistical
2 Journal of Applied Mathematics
problems based on formal statistical models and inferenceprocedures This leads to the following specification of themodel
Consider 119899 different locations 1199091 119909
119899 sub 119868 sub R2 and
assume that a realization 119910 = (1199101 119910
119899)119879 of 119884 is observed
where 119910 = 119884(119909119894)
Let 119878 = 119878(119909) 119909 isin 119868 119868 sub R2 be a Gaussian processwith mean function 119864[119878(119909)] = 119889(119909)119879120573 and covariancecov(119878(119909) 119878(1199091015840)) = 1205902120588(119909 1199091015840 120593) + 12059121119909 = 1199091015840 where 120573 isin R119901is as in theGLMcase a vector of unknown regression param-eters 119889(119909) are known explanatory variables now with spatialdependence 120588(119909 1199091015840 120593) is a correlation function in R2 120593 isa scale parameter that controls the speed at which spatialcorrelation approaches 0 as the distance between locationsgrows and finally 1205912 ge 0 is known as the nugget effectin accordance with the usual geostatistical terminology Thenugget effect can be interpreted as a measurement error or amicroscale variation or a combination of both
Conditionally on 119878 the process 119884(119909) 119909 isin 119868 consistsof random mutually independent variables and for eachlocation 119909 isin 119868 the error distribution or [119884(119909) | 119878] has adensity that depends only on the conditional mean 119864[119884(119909
119894) |
119878(119909119894)] A known link function 119892 relates the conditional mean
and 119878(119909) so that 119864[119884(119909119894) | 119878(119909
119894)] = 119892
minus1(119878(119909119894))
When the regression parameters 120573 are of interest it isimportant to remember that their interpretation is moreconditional than marginal In particular 119864[119884(119909
119894) | 119878(119909
119894)] and
119864[119884(119909119894)] differ in terms of the structural dependence of the
explanatory variables 119889(119909119894) thus the interpretation of 120573 calls
for caution Only in the case where 119884(119909119894) | 119878(119909
119894) is Gaussian
and the link function is identity parameter comparison isdirect The need to distinguish between conditional andmarginal regression parameters which is not possible inGaussian linearmodels is well known in the context of GLMsfor longitudinal data (see eg [12])
To estimate the parameters for the GLSM and dueto the fact that the stationary Gaussian process 119878(119909) isnot observable it is not possible to obtain a closed-formlikelihood function except as a high-dimension integralReference [11] suggests using algorithms based on Markovchain Monte Carlo (MCMC) to calculate GLSM parametersin a Bayesian framework This is the approach used in ouranalysis implemented using geoR and geoRglm packages(free open-source programs for use with119877 statistical software[13])
22 ROC Curves When the marginal distribution (in theGLM) or conditional distribution (in the GLSM) of theresponse variable119884 follows a binomial distribution themod-els can be called binary classification systems The exactitudeof a diagnostic test for a binary classification system canbe summarized as a receiver operating characteristic (ROC)curve which is a graphic representation of true positiveversus false positive rates when the discrimination thresholdis varied Within the framework of binary GLMs it is normalto estimate the ROC curves of models in which one or moreexplanatory variables have been excluded so as to evaluatethe effects of these variables Analysing ROC curves provides
tools for comparing and selecting the best models Moreprecisely the area under the curve (AUC) of the ROC curveis usually calculated in order to compare the different binarymodels and thereby select the explanatory variables to beincluded in the model Reference [14] described a bootstrap-based method for testing the significant effect of dependentvariables on the ROC curve
We used the AUC and residual semivariograms todemonstrate the goodness-of-fit of the binary GLSM com-pared with the binary GLM when working with spatiallycorrelated data
3 Data Description and Model Formulation
31 The Studied Area and the Geographic Database Thedata used to build the proposed model was collected fromborehole samples taken from slate deposits in Baja CabreraLeonesa (northwest Spain) an area with a long tradition ofextracting processing and exporting roofing slate
When surveying a slate deposit in-depth studies of therock are performed by taking continuous borehole sampleswhich enable geologists to study the living rock and analysethe possibility of using it as ornamental slate see [15] thesesamples also reveal the degree of fracturation inside the rockmass
The specific borehole logging process was based onmanual and visual inspection of the borehole by an expertwho after evaluating the aesthetic and functional defectsand properties of the slate differentiated between seams ofcommercial and unusable slate The survey was performedby taking a control sample every 25 centimetres rock qualitydesignation (RQD) however was defined by homogeneouslyfractured sections
A total of 313 equally spaced in-depth observationswere obtained resulting from prior evaluation of variousparameters affecting the ornamental quality of the slate andfrom direct binary values (0 or 1) assigned by the expert toindicate exploitation potential The 9 specific variables thataffected the results of borehole logging were as follows
(i) RQD borehole core samples recovered in piecesgreater than 10 cm long as a percentage of the totalborehole length This is an indicator of the degree ofrock mass fracturing
(ii) Veins presence of microfractures filled with quartzthat determine the breakage resistance of a commer-cial slab
(iii) Crenulations effect of crenulation cleavage on themain schistosity planes This increases the roughnessof the foliation surfaces of the slate and reducesfissility
(iv) Kink bands Presence of microfolding caused by lateVariscan deformations
(v) Sandy laminations presence of sedimentary sandlayers which cut the schistosity planes and have anegative effect on fissility
Journal of Applied Mathematics 3
Table 1 Correlation matrix of the 119901 = 9 explanatory variables that affected the results of borehole logging
(a)
RQD Veins Crenulation Kink bands Sandy laminationsRQD 1Veins 01624 1Crenulation 03038 01631 1Kink bands 01040 00408 01962 1Sandy laminations 00177 00187 00200 00736 1Microfractures 06696 00035 01284 minus00590 minus00138
Pyrite minus01953 minus00403 minus01221 minus00891 minus01202
Oxidation 02053 00837 01722 00821 minus00645
Rough cleavage minus00908 minus00593 02096 minus00058 minus00454
(b)
Microfractures Pyrite Oxidation Rough cleavageMicrofractures 1Pyrite minus01587 1Oxidation 01162 minus01152 1Rough cleavage minus00989 00065 minus00366 1
(vi) Microfractures presence of barely visible fractureswhich determine the breakage resistance of slabsmeasuring 3ndash5mm thick
(vii) Pyrite presence of iron sulphides
(viii) Oxidation degree of oxidation of iron sulphides in theslate
(ix) Rough cleavage slatewith poor fissility due to texturalheterogeneity
In-depth knowledge of the variability and distribution ofexploitable slate and possible correlation between propertiesare conducive to the use of GLSM to spatially model thegeographic database
Table 1 shows the correlation matrix of the explanatoryvariables given above in order to know the degree of depen-dence between them
32 Model Formulation The response variable 119884(119909) takesthe values 0 or 1 to indicate disposable or exploitable slaterespectively in a particular location 119909 It is assumed in whatfollows that slate exploitability is a spatial phenomenon thatcan bemodelled using aGLSM In otherwords conditional tothe Gaussian process 119878(119909) the data 119884(119909
119894) 119894 = 1 119899 follow
the classic GLM The role of 119878(119909) is therefore to explainthe residual spatial variation after considering all the knownexplanatory variables It is also reasonable to assume thatthe conditional distribution of exploitability can be modelledas a binomial distribution which is why a binomial errordistribution was considered in our study
Binomial error distribution was used by Diggle et al [6]and Zhang [8] A class of transformations that can be used aslink functions for this distribution was described by [16] Weassume the stationary Gaussian process 119878(sdot) to be the basis fora model of spatial variation in the probability 119875(119909) that the
slate in119909 is exploitable but with a logit transformation tomapthe domain of 119878(sdot) onto the unit interval Thus
119892 [119875 (119909)] = log 119875 (119909)[1 minus 119875 (119909)]
= 120583 + 119878 (119909) (1)
The regression function 119864[119884(119909119894) | 119878(119909
119894)] varies spatially only
through 119878(119909) in the locations 119909119894
We adopted a Bayesian framework for inference andprediction of the parameters using algorithms based onMCMC
The parameters of this binomial GLSM are 120579 = (1205902 120593)and 120573 = (120573
0 120573
119901) where 120573
0is the independent term
and 1205731 120573
119901are the regression coefficients corresponding
to each known dependent variable
4 Statistical Analysis
We initially included all the variables that characterizeslate exploitability namely RQD veins crenulations kinkbands sandy laminations microfractures pyrite oxidationand poor fissility Taking this data and considering slateexploitability as the response variable we fitted a binaryGLM called GLM1 A ROC curve was estimated for thiscomplete binary model and the AUC was 099
Next binary GLMs were fitted to different groups ofdependent variables in an attempt to find the minimumnumber of variables that would provide a high AUC valuethat is close to 099 The model called GLM2 fitted withthe RQD crenulation kink band andmicrofracture variablesobtained an AUC of 092 Figure 1 shows the ROC curves andthe corresponding AUC values for both GLM1 and GLM2The study was continued with the four variables included inGLM2 given that the reduction in the number of variablesdid not overly affect the accuracy of the model A binarynonspatial GLM was fitted using Bayesian methods and the
4 Journal of Applied Mathematics
Table 2 Estimated coefficients and 25 25 75 and 975 quantiles for the GLM
Coefficient 25 quantile 25 quantile 75 quantile 975 quantile1205730(intercept) minus356747 minus776804 minus490098 minus260904 minus186385
1205731(RQD) 12826 minus11476 04192 20516 396331205732(crenulation) 05875 03288 05089 06761 083521205733(kink band) 05668 03965 05019 06395 078571205734(microfracture) 25815 09221 16093 39181 67967
Table 3 Estimated coefficients and 25 25 75 and 975 quantiles for the GLSM
Coefficient 25 quantile 25 quantile 75 quantile 975 quantile1205730(intercept) minus277021 minus320664 minus296753 minus247446 minus177297
1205731(RQD) 02521 minus00385 01291 03726 053881205732(crenulation) 07334 03762 06121 08283 106911205733(kink band) 09189 05881 07802 10274 122611205734(microfracture) 12911 08834 10125 14480 16261
00 02 04 06 08 10
00
02
04
06
08
10
ROC plot
Sens
itivi
ty (t
rue p
ositi
ves)
AUC099 GLM1092 GLM2
1-specificity (false positives)
Figure 1 ROC curves and the corresponding AUC values for thetwo binary GLMs included in the study
MCMClogit function from the MCMCpack (R language)The vector120573 of the parameters fitted for theGLM2model was120573 = (minus356747 12826 05875 05668 25815) where the firstvalue is the independent term and the remaining values arethe corresponding coefficients of the explanatory variables inthe same order as they were mentioned above
Table 2 shows the estimated coefficients and the 2525 75 and 975 quantiles for the fitted model It can beobserved that for a significance level of 120572 = 005 the onlynonsignificant variable is the RQD
The study of the residuals of the GLM2 model fittedwith four explanatory variables detected a spatial dependence
Distance
Sem
ivar
ianc
e
05
10
15
20
5 10 15 20 25
Figure 2 Experimental semivariogram for the GLM2 residuals(points) together with the fitted theoretical model fitted accordingto an exponential model (continuous line)
that could be modelled using an exponential theoreticalsemivariogram with range 081 and sill 161 Figure 2 showsthe experimental semivariogram for the GLM2 residualstogether with the corresponding fitted theoretical model
Given the presence of spatial dependence in the resid-uals it then made sense to fit a GLSM maintaining fourexplanatory variables and assuming an exponential modelfor the process 119878(119909) The correlation function considered wasof the type cov(119878(119909) 119878(1199091015840)) = 1205902120588(119909 1199091015840 120593) + 12059121119909 = 1199091015840with 120588(119906 120593) = exp(minus119906120593) The fit was made using Bayesianinference implemented via the MCMC algorithms The first10000 sample observations from the simulationwere ignoredas burn-in at which point it was considered that convergencetime had been achieved The subsequent samples were usedto obtain the subsequent distribution of the parameters of
Journal of Applied Mathematics 5
minus5
minus4
minus4
minus4
minus3
minus3
minus2
minus2
minus2
3
10
20 40
00 05 10 15
000
005
010
015
020
025
030
120593
1205912 re
l
Figure 3 Two-dimensional likelihood profile for (120593 1205912) for theGLSM
interest The chain was sampled for each 100 of the 50000iterations to obtain samples containing 500 values (for furtherdetails see [17])
Using this procedure the parameters were estimated as120593 = 082 120590
2= 146 120591
2= 009 and 120573 = (minus277021
02521 07334 09189 12911) Table 3 shows the estimated 120573coefficients and their 25 25 75 and 975 quantilesOnce againwe can see how for a significance level of120572 = 005only the RQD variable was not significant
It is important to remember that these parametersshould be conditionally and not marginally interpreted andso should not be directly compared with the parametersestimated for the GLM Direct comparison of a spatial andnonspatial GLM could lead to erroneous conclusions as theestimation methods are fundamentally different Nonethe-less there is a certain correlation in the conclusions to bedrawn from these tables with variables such as crenulationkink band andmicrofracture remaining significant RQD onthe other hand was not significant at 5 level
Figure 3 shows the likelihood profile in two dimensionsfor parameters (120593 1205912) of the model illustrating the flatness ofthe likelihood surface obtained using theMCMC algorithms
A study of the spatial dependence of the GLSM residualsindicated that the spatial component had been correctlymodelled on this occasion Figure 4 shows an experimentalsemivariogram of the GLSM residuals and a theoreticalmodel fitted according to a nugget effect model It is clearthat the empirical semivariogram is essentially flat whichsuggests a suitable fit to the spatial structure A directcomparison between Figures 2 and 4 leads to the conclusionthat the spatial dependence has been properly captured by thestationary Gaussian process 119878
The ROC curve and AUC were calculated for the GLSMThe AUC of the binary spatial model was 099 which
Distance
Sem
ivar
ianc
e5 10 15 20 25
1eminus06
2eminus06
3eminus06
4eminus06
Figure 4 Experimental semivariogram for the GLSM residuals(points) together with the theoretical model fitted according to anugget effect model (continuous line)
Table 4 Error rates for the binary models for three scenariosrepresenting different levels of prediction difficulty
5 test 10 test 15 testGLM 66119864 minus 2 63119864 minus 2 56119864 minus 2
GLSM 41119864 minus 2 51119864 minus 2 46119864 minus 2
indicates a substantial improvement in the precision of theGLSM This improvement is reflected in Figure 5 whichdepicts the ROC curves and AUC values for the GLM2 andGLSM binary models
The comparison between the two models was completedwith a simulation study designed to compare the reliabilityof the predictions in three scenarios of varying levels ofdifficulty Randomly selected for the first scenario was 95 ofthe 313 initial observations composing the training set thatwas used to fit the GLM and GLSM The fitted models werethen validated with the remaining 5 of the observationsThis procedure was repeated 100 times and the numberof errors in the slate exploitability prediction was recordedfor each repetition For the second scenario we randomlyselected 90 of the observations for the training set and theremaining 10made up the test set This simulation was alsorepeated 100 times The procedure for the third scenario wassimilar but this time 85 and 15 of observations made upthe training and test sets respectively
Table 4 displays the error rates for the three scenariosdescribed with the error rate calculated as the ratio betweenthe number of prediction errors and the total number ofpredictions in the test set
6 Journal of Applied Mathematics
00 02 04 06 08 10
00
02
04
06
08
10
ROC plot
Sens
itivi
ty (t
rue p
ositi
ves)
AUC
099 GLSM092 GLM
1-specificity (false positives)
Figure 5 ROC curves and correspondingAUCvalues for the binaryGLM2 and the GLSM
In all the cases it can be observed that theGLSMprovideda better explanation not only of the effect of the variablesdetermining slate quality but also of the spatial behaviourof exploitable slate thereby producing lower prediction errorrates
5 Conclusions
A general interpretation of the GLSM used in our analysis isthat the spatial term 119878 represents the accumulative effect ofpossible explanatory variables with an undetermined spatialstructure which have therefore not been observed
In GLMs the fact that the spatial correlation of thevariables is not taken into account can significantly affect thequality of statistical results Our study highlights the potentialrisk of using GLMs when the data is spatially structured
The conclusion reached after comparing ROC curvesand their corresponding AUCs is that GLSMs predict slateexploitability better than GLMs Therefore it would seemessential to include unexplained spatial variation when mod-elling spatially correlated variables
Based on the comparison of the semivariograms of theGLM and GLSM residuals we would like to draw attentionto the presence of spatial dependence in the GLM residualsin contrast towhat occurswhen aGLSM is implementedThisindicates that spatial dependence has been captured correctlyby the stationary Gaussian process 119878 We can therefore
conclude that a GLSM is more suitable for modelling spatialdependence which is overlooked by classic GLMs
The simulation study demonstrates that for varying levelsof prediction difficulty the GLSM had lower error rates thanthe GLM
Although the parameters of the GLSM must be inter-preted conditionally rather than marginally to 119878 the resultsof the statistical analysis denote the broader potential of theGLSM compared to the classic GLM in analysing spatialdata They also underline the potential risk of reachingerroneous conclusions when using nonspatial models toanalyse spatially structured data
Acknowledgments
This work was funded partly by the ProjectINCITE10REM304009PR of the Xunta de Galicia and by theProjects MTM2008-03129 and MTM2011-23204 of theMinistry of Science and Innovation
References
[1] J M Matıas A Vaamonde J Taboada and W Gonzalez-Manteiga ldquoSupport vector machines and gradient boosting forgraphical estimation of a slate depositrdquo Stochastic Environmen-tal Research and Risk Assessment vol 18 no 5 pp 309ndash3232004
[2] F G Bastante J Taboada L Alejano and E Alonso ldquoOptimiza-tion tools and simulationmethods for designing and evaluatinga mining operationrdquo Stochastic Environmental Research andRisk Assessment vol 22 no 6 pp 727ndash735 2008
[3] J A Nelder and R W M Wedderburn ldquoGeneralized linearmodelsrdquo Journal of the Royal Statistical Society A vol 135 pp370ndash384 1972
[4] P McCullagh and J A Nelder Generalized Linear ModelsChapman and Hall London UK 1989
[5] O F Christensen and R Waagepetersen ldquoBayesian predictionof spatial count data using generalized linear mixed modelsrdquoBiometrics vol 58 no 2 pp 280ndash286 2002
[6] P Diggle R Moyeed B Rowlingson andMThomson ldquoChild-hood malaria in the Gambia a case-study in model-basedgeostatisticsrdquo Journal of the Royal Statistical Society C vol 51no 4 pp 493ndash506 2002
[7] P J Diggle P J Ribeiro andO F Christensen ldquoAn introductionto model-based geostatisticsrdquo in Spatial Statistics and Compu-tational Methods J Moslashller Ed pp 43ndash86 Springer New YorkNY USA 2003
[8] H Zhang ldquoOn estimation and prediction for spatial generalizedlinear mixed modelsrdquo Biometrics vol 58 no 1 pp 129ndash1362002
[9] H Zhang ldquoOptimal interpolation and the appropriateness ofcross-validating variogram in spatial generalized linear mixedmodelsrdquo Journal of Computational and Graphical Statistics vol12 no 3 pp 698ndash713 2003
[10] N Breslow and D Clayton ldquoApproximate inference in gener-alized linear mixed modelsrdquo Journal of the American StatisticalAssociation vol 88 pp 9ndash25 1993
[11] P J Diggle J A Tawn and R A Moyeed ldquoModel-basedgeostatisticsrdquo Journal of the Royal Statistical Society C vol 47no 3 pp 299ndash350 1998
Journal of Applied Mathematics 7
[12] P J Diggle K Y Liang and S L Zeger The Analysis ofLongitudinal Data Clarendon Oxford UK 1994
[13] RDevelopment Core Team R A Language and Environment forStatistical Computing R Foundation for Statistical ComputingVienna Austria 2012 httpwwwr-projectorg
[14] M X Rodrıguez-Alvarez J Roca-Pardinas and C Cadarso-Suarez ldquoROCcurve and covariates extending inducedmethod-ology to the non-parametric frameworkrdquo Statistics andComput-ing vol 21 no 4 pp 483ndash499 2011
[15] J Taboada A Vaamonde A Saavedra and A Arguelles ldquoQual-ity index for ornamental slate depositsrdquo Engineering Geologyvol 50 no 1-2 pp 203ndash210 1998
[16] V de Oliveira B Kedem and D A Short ldquoBayesian predictionof transformed Gaussian random fieldsrdquo Journal of the Ameri-can Statistical Association vol 92 no 440 pp 1422ndash1433 1997
[17] O F Christensen ldquoMonteCarlomaximum likelihood inmodel-based geostatisticsrdquo Journal of Computational and GraphicalStatistics vol 13 no 3 pp 702ndash718 2004
Submit your manuscripts athttpwwwhindawicom
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Mathematical Problems in Engineering
Hindawi Publishing Corporationhttpwwwhindawicom
Differential EquationsInternational Journal of
Volume 2014
Applied MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Mathematical PhysicsAdvances in
Complex AnalysisJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
OptimizationJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Operations ResearchAdvances in
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Function Spaces
Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of Mathematics and Mathematical Sciences
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Algebra
Discrete Dynamics in Nature and Society
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Decision SciencesAdvances in
Discrete MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom
Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Stochastic AnalysisInternational Journal of
2 Journal of Applied Mathematics
problems based on formal statistical models and inferenceprocedures This leads to the following specification of themodel
Consider 119899 different locations 1199091 119909
119899 sub 119868 sub R2 and
assume that a realization 119910 = (1199101 119910
119899)119879 of 119884 is observed
where 119910 = 119884(119909119894)
Let 119878 = 119878(119909) 119909 isin 119868 119868 sub R2 be a Gaussian processwith mean function 119864[119878(119909)] = 119889(119909)119879120573 and covariancecov(119878(119909) 119878(1199091015840)) = 1205902120588(119909 1199091015840 120593) + 12059121119909 = 1199091015840 where 120573 isin R119901is as in theGLMcase a vector of unknown regression param-eters 119889(119909) are known explanatory variables now with spatialdependence 120588(119909 1199091015840 120593) is a correlation function in R2 120593 isa scale parameter that controls the speed at which spatialcorrelation approaches 0 as the distance between locationsgrows and finally 1205912 ge 0 is known as the nugget effectin accordance with the usual geostatistical terminology Thenugget effect can be interpreted as a measurement error or amicroscale variation or a combination of both
Conditionally on 119878 the process 119884(119909) 119909 isin 119868 consistsof random mutually independent variables and for eachlocation 119909 isin 119868 the error distribution or [119884(119909) | 119878] has adensity that depends only on the conditional mean 119864[119884(119909
119894) |
119878(119909119894)] A known link function 119892 relates the conditional mean
and 119878(119909) so that 119864[119884(119909119894) | 119878(119909
119894)] = 119892
minus1(119878(119909119894))
When the regression parameters 120573 are of interest it isimportant to remember that their interpretation is moreconditional than marginal In particular 119864[119884(119909
119894) | 119878(119909
119894)] and
119864[119884(119909119894)] differ in terms of the structural dependence of the
explanatory variables 119889(119909119894) thus the interpretation of 120573 calls
for caution Only in the case where 119884(119909119894) | 119878(119909
119894) is Gaussian
and the link function is identity parameter comparison isdirect The need to distinguish between conditional andmarginal regression parameters which is not possible inGaussian linearmodels is well known in the context of GLMsfor longitudinal data (see eg [12])
To estimate the parameters for the GLSM and dueto the fact that the stationary Gaussian process 119878(119909) isnot observable it is not possible to obtain a closed-formlikelihood function except as a high-dimension integralReference [11] suggests using algorithms based on Markovchain Monte Carlo (MCMC) to calculate GLSM parametersin a Bayesian framework This is the approach used in ouranalysis implemented using geoR and geoRglm packages(free open-source programs for use with119877 statistical software[13])
22 ROC Curves When the marginal distribution (in theGLM) or conditional distribution (in the GLSM) of theresponse variable119884 follows a binomial distribution themod-els can be called binary classification systems The exactitudeof a diagnostic test for a binary classification system canbe summarized as a receiver operating characteristic (ROC)curve which is a graphic representation of true positiveversus false positive rates when the discrimination thresholdis varied Within the framework of binary GLMs it is normalto estimate the ROC curves of models in which one or moreexplanatory variables have been excluded so as to evaluatethe effects of these variables Analysing ROC curves provides
tools for comparing and selecting the best models Moreprecisely the area under the curve (AUC) of the ROC curveis usually calculated in order to compare the different binarymodels and thereby select the explanatory variables to beincluded in the model Reference [14] described a bootstrap-based method for testing the significant effect of dependentvariables on the ROC curve
We used the AUC and residual semivariograms todemonstrate the goodness-of-fit of the binary GLSM com-pared with the binary GLM when working with spatiallycorrelated data
3 Data Description and Model Formulation
31 The Studied Area and the Geographic Database Thedata used to build the proposed model was collected fromborehole samples taken from slate deposits in Baja CabreraLeonesa (northwest Spain) an area with a long tradition ofextracting processing and exporting roofing slate
When surveying a slate deposit in-depth studies of therock are performed by taking continuous borehole sampleswhich enable geologists to study the living rock and analysethe possibility of using it as ornamental slate see [15] thesesamples also reveal the degree of fracturation inside the rockmass
The specific borehole logging process was based onmanual and visual inspection of the borehole by an expertwho after evaluating the aesthetic and functional defectsand properties of the slate differentiated between seams ofcommercial and unusable slate The survey was performedby taking a control sample every 25 centimetres rock qualitydesignation (RQD) however was defined by homogeneouslyfractured sections
A total of 313 equally spaced in-depth observationswere obtained resulting from prior evaluation of variousparameters affecting the ornamental quality of the slate andfrom direct binary values (0 or 1) assigned by the expert toindicate exploitation potential The 9 specific variables thataffected the results of borehole logging were as follows
(i) RQD borehole core samples recovered in piecesgreater than 10 cm long as a percentage of the totalborehole length This is an indicator of the degree ofrock mass fracturing
(ii) Veins presence of microfractures filled with quartzthat determine the breakage resistance of a commer-cial slab
(iii) Crenulations effect of crenulation cleavage on themain schistosity planes This increases the roughnessof the foliation surfaces of the slate and reducesfissility
(iv) Kink bands Presence of microfolding caused by lateVariscan deformations
(v) Sandy laminations presence of sedimentary sandlayers which cut the schistosity planes and have anegative effect on fissility
Journal of Applied Mathematics 3
Table 1 Correlation matrix of the 119901 = 9 explanatory variables that affected the results of borehole logging
(a)
RQD Veins Crenulation Kink bands Sandy laminationsRQD 1Veins 01624 1Crenulation 03038 01631 1Kink bands 01040 00408 01962 1Sandy laminations 00177 00187 00200 00736 1Microfractures 06696 00035 01284 minus00590 minus00138
Pyrite minus01953 minus00403 minus01221 minus00891 minus01202
Oxidation 02053 00837 01722 00821 minus00645
Rough cleavage minus00908 minus00593 02096 minus00058 minus00454
(b)
Microfractures Pyrite Oxidation Rough cleavageMicrofractures 1Pyrite minus01587 1Oxidation 01162 minus01152 1Rough cleavage minus00989 00065 minus00366 1
(vi) Microfractures presence of barely visible fractureswhich determine the breakage resistance of slabsmeasuring 3ndash5mm thick
(vii) Pyrite presence of iron sulphides
(viii) Oxidation degree of oxidation of iron sulphides in theslate
(ix) Rough cleavage slatewith poor fissility due to texturalheterogeneity
In-depth knowledge of the variability and distribution ofexploitable slate and possible correlation between propertiesare conducive to the use of GLSM to spatially model thegeographic database
Table 1 shows the correlation matrix of the explanatoryvariables given above in order to know the degree of depen-dence between them
32 Model Formulation The response variable 119884(119909) takesthe values 0 or 1 to indicate disposable or exploitable slaterespectively in a particular location 119909 It is assumed in whatfollows that slate exploitability is a spatial phenomenon thatcan bemodelled using aGLSM In otherwords conditional tothe Gaussian process 119878(119909) the data 119884(119909
119894) 119894 = 1 119899 follow
the classic GLM The role of 119878(119909) is therefore to explainthe residual spatial variation after considering all the knownexplanatory variables It is also reasonable to assume thatthe conditional distribution of exploitability can be modelledas a binomial distribution which is why a binomial errordistribution was considered in our study
Binomial error distribution was used by Diggle et al [6]and Zhang [8] A class of transformations that can be used aslink functions for this distribution was described by [16] Weassume the stationary Gaussian process 119878(sdot) to be the basis fora model of spatial variation in the probability 119875(119909) that the
slate in119909 is exploitable but with a logit transformation tomapthe domain of 119878(sdot) onto the unit interval Thus
119892 [119875 (119909)] = log 119875 (119909)[1 minus 119875 (119909)]
= 120583 + 119878 (119909) (1)
The regression function 119864[119884(119909119894) | 119878(119909
119894)] varies spatially only
through 119878(119909) in the locations 119909119894
We adopted a Bayesian framework for inference andprediction of the parameters using algorithms based onMCMC
The parameters of this binomial GLSM are 120579 = (1205902 120593)and 120573 = (120573
0 120573
119901) where 120573
0is the independent term
and 1205731 120573
119901are the regression coefficients corresponding
to each known dependent variable
4 Statistical Analysis
We initially included all the variables that characterizeslate exploitability namely RQD veins crenulations kinkbands sandy laminations microfractures pyrite oxidationand poor fissility Taking this data and considering slateexploitability as the response variable we fitted a binaryGLM called GLM1 A ROC curve was estimated for thiscomplete binary model and the AUC was 099
Next binary GLMs were fitted to different groups ofdependent variables in an attempt to find the minimumnumber of variables that would provide a high AUC valuethat is close to 099 The model called GLM2 fitted withthe RQD crenulation kink band andmicrofracture variablesobtained an AUC of 092 Figure 1 shows the ROC curves andthe corresponding AUC values for both GLM1 and GLM2The study was continued with the four variables included inGLM2 given that the reduction in the number of variablesdid not overly affect the accuracy of the model A binarynonspatial GLM was fitted using Bayesian methods and the
4 Journal of Applied Mathematics
Table 2 Estimated coefficients and 25 25 75 and 975 quantiles for the GLM
Coefficient 25 quantile 25 quantile 75 quantile 975 quantile1205730(intercept) minus356747 minus776804 minus490098 minus260904 minus186385
1205731(RQD) 12826 minus11476 04192 20516 396331205732(crenulation) 05875 03288 05089 06761 083521205733(kink band) 05668 03965 05019 06395 078571205734(microfracture) 25815 09221 16093 39181 67967
Table 3 Estimated coefficients and 25 25 75 and 975 quantiles for the GLSM
Coefficient 25 quantile 25 quantile 75 quantile 975 quantile1205730(intercept) minus277021 minus320664 minus296753 minus247446 minus177297
1205731(RQD) 02521 minus00385 01291 03726 053881205732(crenulation) 07334 03762 06121 08283 106911205733(kink band) 09189 05881 07802 10274 122611205734(microfracture) 12911 08834 10125 14480 16261
00 02 04 06 08 10
00
02
04
06
08
10
ROC plot
Sens
itivi
ty (t
rue p
ositi
ves)
AUC099 GLM1092 GLM2
1-specificity (false positives)
Figure 1 ROC curves and the corresponding AUC values for thetwo binary GLMs included in the study
MCMClogit function from the MCMCpack (R language)The vector120573 of the parameters fitted for theGLM2model was120573 = (minus356747 12826 05875 05668 25815) where the firstvalue is the independent term and the remaining values arethe corresponding coefficients of the explanatory variables inthe same order as they were mentioned above
Table 2 shows the estimated coefficients and the 2525 75 and 975 quantiles for the fitted model It can beobserved that for a significance level of 120572 = 005 the onlynonsignificant variable is the RQD
The study of the residuals of the GLM2 model fittedwith four explanatory variables detected a spatial dependence
Distance
Sem
ivar
ianc
e
05
10
15
20
5 10 15 20 25
Figure 2 Experimental semivariogram for the GLM2 residuals(points) together with the fitted theoretical model fitted accordingto an exponential model (continuous line)
that could be modelled using an exponential theoreticalsemivariogram with range 081 and sill 161 Figure 2 showsthe experimental semivariogram for the GLM2 residualstogether with the corresponding fitted theoretical model
Given the presence of spatial dependence in the resid-uals it then made sense to fit a GLSM maintaining fourexplanatory variables and assuming an exponential modelfor the process 119878(119909) The correlation function considered wasof the type cov(119878(119909) 119878(1199091015840)) = 1205902120588(119909 1199091015840 120593) + 12059121119909 = 1199091015840with 120588(119906 120593) = exp(minus119906120593) The fit was made using Bayesianinference implemented via the MCMC algorithms The first10000 sample observations from the simulationwere ignoredas burn-in at which point it was considered that convergencetime had been achieved The subsequent samples were usedto obtain the subsequent distribution of the parameters of
Journal of Applied Mathematics 5
minus5
minus4
minus4
minus4
minus3
minus3
minus2
minus2
minus2
3
10
20 40
00 05 10 15
000
005
010
015
020
025
030
120593
1205912 re
l
Figure 3 Two-dimensional likelihood profile for (120593 1205912) for theGLSM
interest The chain was sampled for each 100 of the 50000iterations to obtain samples containing 500 values (for furtherdetails see [17])
Using this procedure the parameters were estimated as120593 = 082 120590
2= 146 120591
2= 009 and 120573 = (minus277021
02521 07334 09189 12911) Table 3 shows the estimated 120573coefficients and their 25 25 75 and 975 quantilesOnce againwe can see how for a significance level of120572 = 005only the RQD variable was not significant
It is important to remember that these parametersshould be conditionally and not marginally interpreted andso should not be directly compared with the parametersestimated for the GLM Direct comparison of a spatial andnonspatial GLM could lead to erroneous conclusions as theestimation methods are fundamentally different Nonethe-less there is a certain correlation in the conclusions to bedrawn from these tables with variables such as crenulationkink band andmicrofracture remaining significant RQD onthe other hand was not significant at 5 level
Figure 3 shows the likelihood profile in two dimensionsfor parameters (120593 1205912) of the model illustrating the flatness ofthe likelihood surface obtained using theMCMC algorithms
A study of the spatial dependence of the GLSM residualsindicated that the spatial component had been correctlymodelled on this occasion Figure 4 shows an experimentalsemivariogram of the GLSM residuals and a theoreticalmodel fitted according to a nugget effect model It is clearthat the empirical semivariogram is essentially flat whichsuggests a suitable fit to the spatial structure A directcomparison between Figures 2 and 4 leads to the conclusionthat the spatial dependence has been properly captured by thestationary Gaussian process 119878
The ROC curve and AUC were calculated for the GLSMThe AUC of the binary spatial model was 099 which
Distance
Sem
ivar
ianc
e5 10 15 20 25
1eminus06
2eminus06
3eminus06
4eminus06
Figure 4 Experimental semivariogram for the GLSM residuals(points) together with the theoretical model fitted according to anugget effect model (continuous line)
Table 4 Error rates for the binary models for three scenariosrepresenting different levels of prediction difficulty
5 test 10 test 15 testGLM 66119864 minus 2 63119864 minus 2 56119864 minus 2
GLSM 41119864 minus 2 51119864 minus 2 46119864 minus 2
indicates a substantial improvement in the precision of theGLSM This improvement is reflected in Figure 5 whichdepicts the ROC curves and AUC values for the GLM2 andGLSM binary models
The comparison between the two models was completedwith a simulation study designed to compare the reliabilityof the predictions in three scenarios of varying levels ofdifficulty Randomly selected for the first scenario was 95 ofthe 313 initial observations composing the training set thatwas used to fit the GLM and GLSM The fitted models werethen validated with the remaining 5 of the observationsThis procedure was repeated 100 times and the numberof errors in the slate exploitability prediction was recordedfor each repetition For the second scenario we randomlyselected 90 of the observations for the training set and theremaining 10made up the test set This simulation was alsorepeated 100 times The procedure for the third scenario wassimilar but this time 85 and 15 of observations made upthe training and test sets respectively
Table 4 displays the error rates for the three scenariosdescribed with the error rate calculated as the ratio betweenthe number of prediction errors and the total number ofpredictions in the test set
6 Journal of Applied Mathematics
00 02 04 06 08 10
00
02
04
06
08
10
ROC plot
Sens
itivi
ty (t
rue p
ositi
ves)
AUC
099 GLSM092 GLM
1-specificity (false positives)
Figure 5 ROC curves and correspondingAUCvalues for the binaryGLM2 and the GLSM
In all the cases it can be observed that theGLSMprovideda better explanation not only of the effect of the variablesdetermining slate quality but also of the spatial behaviourof exploitable slate thereby producing lower prediction errorrates
5 Conclusions
A general interpretation of the GLSM used in our analysis isthat the spatial term 119878 represents the accumulative effect ofpossible explanatory variables with an undetermined spatialstructure which have therefore not been observed
In GLMs the fact that the spatial correlation of thevariables is not taken into account can significantly affect thequality of statistical results Our study highlights the potentialrisk of using GLMs when the data is spatially structured
The conclusion reached after comparing ROC curvesand their corresponding AUCs is that GLSMs predict slateexploitability better than GLMs Therefore it would seemessential to include unexplained spatial variation when mod-elling spatially correlated variables
Based on the comparison of the semivariograms of theGLM and GLSM residuals we would like to draw attentionto the presence of spatial dependence in the GLM residualsin contrast towhat occurswhen aGLSM is implementedThisindicates that spatial dependence has been captured correctlyby the stationary Gaussian process 119878 We can therefore
conclude that a GLSM is more suitable for modelling spatialdependence which is overlooked by classic GLMs
The simulation study demonstrates that for varying levelsof prediction difficulty the GLSM had lower error rates thanthe GLM
Although the parameters of the GLSM must be inter-preted conditionally rather than marginally to 119878 the resultsof the statistical analysis denote the broader potential of theGLSM compared to the classic GLM in analysing spatialdata They also underline the potential risk of reachingerroneous conclusions when using nonspatial models toanalyse spatially structured data
Acknowledgments
This work was funded partly by the ProjectINCITE10REM304009PR of the Xunta de Galicia and by theProjects MTM2008-03129 and MTM2011-23204 of theMinistry of Science and Innovation
References
[1] J M Matıas A Vaamonde J Taboada and W Gonzalez-Manteiga ldquoSupport vector machines and gradient boosting forgraphical estimation of a slate depositrdquo Stochastic Environmen-tal Research and Risk Assessment vol 18 no 5 pp 309ndash3232004
[2] F G Bastante J Taboada L Alejano and E Alonso ldquoOptimiza-tion tools and simulationmethods for designing and evaluatinga mining operationrdquo Stochastic Environmental Research andRisk Assessment vol 22 no 6 pp 727ndash735 2008
[3] J A Nelder and R W M Wedderburn ldquoGeneralized linearmodelsrdquo Journal of the Royal Statistical Society A vol 135 pp370ndash384 1972
[4] P McCullagh and J A Nelder Generalized Linear ModelsChapman and Hall London UK 1989
[5] O F Christensen and R Waagepetersen ldquoBayesian predictionof spatial count data using generalized linear mixed modelsrdquoBiometrics vol 58 no 2 pp 280ndash286 2002
[6] P Diggle R Moyeed B Rowlingson andMThomson ldquoChild-hood malaria in the Gambia a case-study in model-basedgeostatisticsrdquo Journal of the Royal Statistical Society C vol 51no 4 pp 493ndash506 2002
[7] P J Diggle P J Ribeiro andO F Christensen ldquoAn introductionto model-based geostatisticsrdquo in Spatial Statistics and Compu-tational Methods J Moslashller Ed pp 43ndash86 Springer New YorkNY USA 2003
[8] H Zhang ldquoOn estimation and prediction for spatial generalizedlinear mixed modelsrdquo Biometrics vol 58 no 1 pp 129ndash1362002
[9] H Zhang ldquoOptimal interpolation and the appropriateness ofcross-validating variogram in spatial generalized linear mixedmodelsrdquo Journal of Computational and Graphical Statistics vol12 no 3 pp 698ndash713 2003
[10] N Breslow and D Clayton ldquoApproximate inference in gener-alized linear mixed modelsrdquo Journal of the American StatisticalAssociation vol 88 pp 9ndash25 1993
[11] P J Diggle J A Tawn and R A Moyeed ldquoModel-basedgeostatisticsrdquo Journal of the Royal Statistical Society C vol 47no 3 pp 299ndash350 1998
Journal of Applied Mathematics 7
[12] P J Diggle K Y Liang and S L Zeger The Analysis ofLongitudinal Data Clarendon Oxford UK 1994
[13] RDevelopment Core Team R A Language and Environment forStatistical Computing R Foundation for Statistical ComputingVienna Austria 2012 httpwwwr-projectorg
[14] M X Rodrıguez-Alvarez J Roca-Pardinas and C Cadarso-Suarez ldquoROCcurve and covariates extending inducedmethod-ology to the non-parametric frameworkrdquo Statistics andComput-ing vol 21 no 4 pp 483ndash499 2011
[15] J Taboada A Vaamonde A Saavedra and A Arguelles ldquoQual-ity index for ornamental slate depositsrdquo Engineering Geologyvol 50 no 1-2 pp 203ndash210 1998
[16] V de Oliveira B Kedem and D A Short ldquoBayesian predictionof transformed Gaussian random fieldsrdquo Journal of the Ameri-can Statistical Association vol 92 no 440 pp 1422ndash1433 1997
[17] O F Christensen ldquoMonteCarlomaximum likelihood inmodel-based geostatisticsrdquo Journal of Computational and GraphicalStatistics vol 13 no 3 pp 702ndash718 2004
Submit your manuscripts athttpwwwhindawicom
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Mathematical Problems in Engineering
Hindawi Publishing Corporationhttpwwwhindawicom
Differential EquationsInternational Journal of
Volume 2014
Applied MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Mathematical PhysicsAdvances in
Complex AnalysisJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
OptimizationJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Operations ResearchAdvances in
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Function Spaces
Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of Mathematics and Mathematical Sciences
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Algebra
Discrete Dynamics in Nature and Society
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Decision SciencesAdvances in
Discrete MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom
Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Stochastic AnalysisInternational Journal of
Journal of Applied Mathematics 3
Table 1 Correlation matrix of the 119901 = 9 explanatory variables that affected the results of borehole logging
(a)
RQD Veins Crenulation Kink bands Sandy laminationsRQD 1Veins 01624 1Crenulation 03038 01631 1Kink bands 01040 00408 01962 1Sandy laminations 00177 00187 00200 00736 1Microfractures 06696 00035 01284 minus00590 minus00138
Pyrite minus01953 minus00403 minus01221 minus00891 minus01202
Oxidation 02053 00837 01722 00821 minus00645
Rough cleavage minus00908 minus00593 02096 minus00058 minus00454
(b)
Microfractures Pyrite Oxidation Rough cleavageMicrofractures 1Pyrite minus01587 1Oxidation 01162 minus01152 1Rough cleavage minus00989 00065 minus00366 1
(vi) Microfractures presence of barely visible fractureswhich determine the breakage resistance of slabsmeasuring 3ndash5mm thick
(vii) Pyrite presence of iron sulphides
(viii) Oxidation degree of oxidation of iron sulphides in theslate
(ix) Rough cleavage slatewith poor fissility due to texturalheterogeneity
In-depth knowledge of the variability and distribution ofexploitable slate and possible correlation between propertiesare conducive to the use of GLSM to spatially model thegeographic database
Table 1 shows the correlation matrix of the explanatoryvariables given above in order to know the degree of depen-dence between them
32 Model Formulation The response variable 119884(119909) takesthe values 0 or 1 to indicate disposable or exploitable slaterespectively in a particular location 119909 It is assumed in whatfollows that slate exploitability is a spatial phenomenon thatcan bemodelled using aGLSM In otherwords conditional tothe Gaussian process 119878(119909) the data 119884(119909
119894) 119894 = 1 119899 follow
the classic GLM The role of 119878(119909) is therefore to explainthe residual spatial variation after considering all the knownexplanatory variables It is also reasonable to assume thatthe conditional distribution of exploitability can be modelledas a binomial distribution which is why a binomial errordistribution was considered in our study
Binomial error distribution was used by Diggle et al [6]and Zhang [8] A class of transformations that can be used aslink functions for this distribution was described by [16] Weassume the stationary Gaussian process 119878(sdot) to be the basis fora model of spatial variation in the probability 119875(119909) that the
slate in119909 is exploitable but with a logit transformation tomapthe domain of 119878(sdot) onto the unit interval Thus
119892 [119875 (119909)] = log 119875 (119909)[1 minus 119875 (119909)]
= 120583 + 119878 (119909) (1)
The regression function 119864[119884(119909119894) | 119878(119909
119894)] varies spatially only
through 119878(119909) in the locations 119909119894
We adopted a Bayesian framework for inference andprediction of the parameters using algorithms based onMCMC
The parameters of this binomial GLSM are 120579 = (1205902 120593)and 120573 = (120573
0 120573
119901) where 120573
0is the independent term
and 1205731 120573
119901are the regression coefficients corresponding
to each known dependent variable
4 Statistical Analysis
We initially included all the variables that characterizeslate exploitability namely RQD veins crenulations kinkbands sandy laminations microfractures pyrite oxidationand poor fissility Taking this data and considering slateexploitability as the response variable we fitted a binaryGLM called GLM1 A ROC curve was estimated for thiscomplete binary model and the AUC was 099
Next binary GLMs were fitted to different groups ofdependent variables in an attempt to find the minimumnumber of variables that would provide a high AUC valuethat is close to 099 The model called GLM2 fitted withthe RQD crenulation kink band andmicrofracture variablesobtained an AUC of 092 Figure 1 shows the ROC curves andthe corresponding AUC values for both GLM1 and GLM2The study was continued with the four variables included inGLM2 given that the reduction in the number of variablesdid not overly affect the accuracy of the model A binarynonspatial GLM was fitted using Bayesian methods and the
4 Journal of Applied Mathematics
Table 2 Estimated coefficients and 25 25 75 and 975 quantiles for the GLM
Coefficient 25 quantile 25 quantile 75 quantile 975 quantile1205730(intercept) minus356747 minus776804 minus490098 minus260904 minus186385
1205731(RQD) 12826 minus11476 04192 20516 396331205732(crenulation) 05875 03288 05089 06761 083521205733(kink band) 05668 03965 05019 06395 078571205734(microfracture) 25815 09221 16093 39181 67967
Table 3 Estimated coefficients and 25 25 75 and 975 quantiles for the GLSM
Coefficient 25 quantile 25 quantile 75 quantile 975 quantile1205730(intercept) minus277021 minus320664 minus296753 minus247446 minus177297
1205731(RQD) 02521 minus00385 01291 03726 053881205732(crenulation) 07334 03762 06121 08283 106911205733(kink band) 09189 05881 07802 10274 122611205734(microfracture) 12911 08834 10125 14480 16261
00 02 04 06 08 10
00
02
04
06
08
10
ROC plot
Sens
itivi
ty (t
rue p
ositi
ves)
AUC099 GLM1092 GLM2
1-specificity (false positives)
Figure 1 ROC curves and the corresponding AUC values for thetwo binary GLMs included in the study
MCMClogit function from the MCMCpack (R language)The vector120573 of the parameters fitted for theGLM2model was120573 = (minus356747 12826 05875 05668 25815) where the firstvalue is the independent term and the remaining values arethe corresponding coefficients of the explanatory variables inthe same order as they were mentioned above
Table 2 shows the estimated coefficients and the 2525 75 and 975 quantiles for the fitted model It can beobserved that for a significance level of 120572 = 005 the onlynonsignificant variable is the RQD
The study of the residuals of the GLM2 model fittedwith four explanatory variables detected a spatial dependence
Distance
Sem
ivar
ianc
e
05
10
15
20
5 10 15 20 25
Figure 2 Experimental semivariogram for the GLM2 residuals(points) together with the fitted theoretical model fitted accordingto an exponential model (continuous line)
that could be modelled using an exponential theoreticalsemivariogram with range 081 and sill 161 Figure 2 showsthe experimental semivariogram for the GLM2 residualstogether with the corresponding fitted theoretical model
Given the presence of spatial dependence in the resid-uals it then made sense to fit a GLSM maintaining fourexplanatory variables and assuming an exponential modelfor the process 119878(119909) The correlation function considered wasof the type cov(119878(119909) 119878(1199091015840)) = 1205902120588(119909 1199091015840 120593) + 12059121119909 = 1199091015840with 120588(119906 120593) = exp(minus119906120593) The fit was made using Bayesianinference implemented via the MCMC algorithms The first10000 sample observations from the simulationwere ignoredas burn-in at which point it was considered that convergencetime had been achieved The subsequent samples were usedto obtain the subsequent distribution of the parameters of
Journal of Applied Mathematics 5
minus5
minus4
minus4
minus4
minus3
minus3
minus2
minus2
minus2
3
10
20 40
00 05 10 15
000
005
010
015
020
025
030
120593
1205912 re
l
Figure 3 Two-dimensional likelihood profile for (120593 1205912) for theGLSM
interest The chain was sampled for each 100 of the 50000iterations to obtain samples containing 500 values (for furtherdetails see [17])
Using this procedure the parameters were estimated as120593 = 082 120590
2= 146 120591
2= 009 and 120573 = (minus277021
02521 07334 09189 12911) Table 3 shows the estimated 120573coefficients and their 25 25 75 and 975 quantilesOnce againwe can see how for a significance level of120572 = 005only the RQD variable was not significant
It is important to remember that these parametersshould be conditionally and not marginally interpreted andso should not be directly compared with the parametersestimated for the GLM Direct comparison of a spatial andnonspatial GLM could lead to erroneous conclusions as theestimation methods are fundamentally different Nonethe-less there is a certain correlation in the conclusions to bedrawn from these tables with variables such as crenulationkink band andmicrofracture remaining significant RQD onthe other hand was not significant at 5 level
Figure 3 shows the likelihood profile in two dimensionsfor parameters (120593 1205912) of the model illustrating the flatness ofthe likelihood surface obtained using theMCMC algorithms
A study of the spatial dependence of the GLSM residualsindicated that the spatial component had been correctlymodelled on this occasion Figure 4 shows an experimentalsemivariogram of the GLSM residuals and a theoreticalmodel fitted according to a nugget effect model It is clearthat the empirical semivariogram is essentially flat whichsuggests a suitable fit to the spatial structure A directcomparison between Figures 2 and 4 leads to the conclusionthat the spatial dependence has been properly captured by thestationary Gaussian process 119878
The ROC curve and AUC were calculated for the GLSMThe AUC of the binary spatial model was 099 which
Distance
Sem
ivar
ianc
e5 10 15 20 25
1eminus06
2eminus06
3eminus06
4eminus06
Figure 4 Experimental semivariogram for the GLSM residuals(points) together with the theoretical model fitted according to anugget effect model (continuous line)
Table 4 Error rates for the binary models for three scenariosrepresenting different levels of prediction difficulty
5 test 10 test 15 testGLM 66119864 minus 2 63119864 minus 2 56119864 minus 2
GLSM 41119864 minus 2 51119864 minus 2 46119864 minus 2
indicates a substantial improvement in the precision of theGLSM This improvement is reflected in Figure 5 whichdepicts the ROC curves and AUC values for the GLM2 andGLSM binary models
The comparison between the two models was completedwith a simulation study designed to compare the reliabilityof the predictions in three scenarios of varying levels ofdifficulty Randomly selected for the first scenario was 95 ofthe 313 initial observations composing the training set thatwas used to fit the GLM and GLSM The fitted models werethen validated with the remaining 5 of the observationsThis procedure was repeated 100 times and the numberof errors in the slate exploitability prediction was recordedfor each repetition For the second scenario we randomlyselected 90 of the observations for the training set and theremaining 10made up the test set This simulation was alsorepeated 100 times The procedure for the third scenario wassimilar but this time 85 and 15 of observations made upthe training and test sets respectively
Table 4 displays the error rates for the three scenariosdescribed with the error rate calculated as the ratio betweenthe number of prediction errors and the total number ofpredictions in the test set
6 Journal of Applied Mathematics
00 02 04 06 08 10
00
02
04
06
08
10
ROC plot
Sens
itivi
ty (t
rue p
ositi
ves)
AUC
099 GLSM092 GLM
1-specificity (false positives)
Figure 5 ROC curves and correspondingAUCvalues for the binaryGLM2 and the GLSM
In all the cases it can be observed that theGLSMprovideda better explanation not only of the effect of the variablesdetermining slate quality but also of the spatial behaviourof exploitable slate thereby producing lower prediction errorrates
5 Conclusions
A general interpretation of the GLSM used in our analysis isthat the spatial term 119878 represents the accumulative effect ofpossible explanatory variables with an undetermined spatialstructure which have therefore not been observed
In GLMs the fact that the spatial correlation of thevariables is not taken into account can significantly affect thequality of statistical results Our study highlights the potentialrisk of using GLMs when the data is spatially structured
The conclusion reached after comparing ROC curvesand their corresponding AUCs is that GLSMs predict slateexploitability better than GLMs Therefore it would seemessential to include unexplained spatial variation when mod-elling spatially correlated variables
Based on the comparison of the semivariograms of theGLM and GLSM residuals we would like to draw attentionto the presence of spatial dependence in the GLM residualsin contrast towhat occurswhen aGLSM is implementedThisindicates that spatial dependence has been captured correctlyby the stationary Gaussian process 119878 We can therefore
conclude that a GLSM is more suitable for modelling spatialdependence which is overlooked by classic GLMs
The simulation study demonstrates that for varying levelsof prediction difficulty the GLSM had lower error rates thanthe GLM
Although the parameters of the GLSM must be inter-preted conditionally rather than marginally to 119878 the resultsof the statistical analysis denote the broader potential of theGLSM compared to the classic GLM in analysing spatialdata They also underline the potential risk of reachingerroneous conclusions when using nonspatial models toanalyse spatially structured data
Acknowledgments
This work was funded partly by the ProjectINCITE10REM304009PR of the Xunta de Galicia and by theProjects MTM2008-03129 and MTM2011-23204 of theMinistry of Science and Innovation
References
[1] J M Matıas A Vaamonde J Taboada and W Gonzalez-Manteiga ldquoSupport vector machines and gradient boosting forgraphical estimation of a slate depositrdquo Stochastic Environmen-tal Research and Risk Assessment vol 18 no 5 pp 309ndash3232004
[2] F G Bastante J Taboada L Alejano and E Alonso ldquoOptimiza-tion tools and simulationmethods for designing and evaluatinga mining operationrdquo Stochastic Environmental Research andRisk Assessment vol 22 no 6 pp 727ndash735 2008
[3] J A Nelder and R W M Wedderburn ldquoGeneralized linearmodelsrdquo Journal of the Royal Statistical Society A vol 135 pp370ndash384 1972
[4] P McCullagh and J A Nelder Generalized Linear ModelsChapman and Hall London UK 1989
[5] O F Christensen and R Waagepetersen ldquoBayesian predictionof spatial count data using generalized linear mixed modelsrdquoBiometrics vol 58 no 2 pp 280ndash286 2002
[6] P Diggle R Moyeed B Rowlingson andMThomson ldquoChild-hood malaria in the Gambia a case-study in model-basedgeostatisticsrdquo Journal of the Royal Statistical Society C vol 51no 4 pp 493ndash506 2002
[7] P J Diggle P J Ribeiro andO F Christensen ldquoAn introductionto model-based geostatisticsrdquo in Spatial Statistics and Compu-tational Methods J Moslashller Ed pp 43ndash86 Springer New YorkNY USA 2003
[8] H Zhang ldquoOn estimation and prediction for spatial generalizedlinear mixed modelsrdquo Biometrics vol 58 no 1 pp 129ndash1362002
[9] H Zhang ldquoOptimal interpolation and the appropriateness ofcross-validating variogram in spatial generalized linear mixedmodelsrdquo Journal of Computational and Graphical Statistics vol12 no 3 pp 698ndash713 2003
[10] N Breslow and D Clayton ldquoApproximate inference in gener-alized linear mixed modelsrdquo Journal of the American StatisticalAssociation vol 88 pp 9ndash25 1993
[11] P J Diggle J A Tawn and R A Moyeed ldquoModel-basedgeostatisticsrdquo Journal of the Royal Statistical Society C vol 47no 3 pp 299ndash350 1998
Journal of Applied Mathematics 7
[12] P J Diggle K Y Liang and S L Zeger The Analysis ofLongitudinal Data Clarendon Oxford UK 1994
[13] RDevelopment Core Team R A Language and Environment forStatistical Computing R Foundation for Statistical ComputingVienna Austria 2012 httpwwwr-projectorg
[14] M X Rodrıguez-Alvarez J Roca-Pardinas and C Cadarso-Suarez ldquoROCcurve and covariates extending inducedmethod-ology to the non-parametric frameworkrdquo Statistics andComput-ing vol 21 no 4 pp 483ndash499 2011
[15] J Taboada A Vaamonde A Saavedra and A Arguelles ldquoQual-ity index for ornamental slate depositsrdquo Engineering Geologyvol 50 no 1-2 pp 203ndash210 1998
[16] V de Oliveira B Kedem and D A Short ldquoBayesian predictionof transformed Gaussian random fieldsrdquo Journal of the Ameri-can Statistical Association vol 92 no 440 pp 1422ndash1433 1997
[17] O F Christensen ldquoMonteCarlomaximum likelihood inmodel-based geostatisticsrdquo Journal of Computational and GraphicalStatistics vol 13 no 3 pp 702ndash718 2004
Submit your manuscripts athttpwwwhindawicom
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Mathematical Problems in Engineering
Hindawi Publishing Corporationhttpwwwhindawicom
Differential EquationsInternational Journal of
Volume 2014
Applied MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Mathematical PhysicsAdvances in
Complex AnalysisJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
OptimizationJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Operations ResearchAdvances in
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Function Spaces
Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of Mathematics and Mathematical Sciences
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Algebra
Discrete Dynamics in Nature and Society
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Decision SciencesAdvances in
Discrete MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom
Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Stochastic AnalysisInternational Journal of
4 Journal of Applied Mathematics
Table 2 Estimated coefficients and 25 25 75 and 975 quantiles for the GLM
Coefficient 25 quantile 25 quantile 75 quantile 975 quantile1205730(intercept) minus356747 minus776804 minus490098 minus260904 minus186385
1205731(RQD) 12826 minus11476 04192 20516 396331205732(crenulation) 05875 03288 05089 06761 083521205733(kink band) 05668 03965 05019 06395 078571205734(microfracture) 25815 09221 16093 39181 67967
Table 3 Estimated coefficients and 25 25 75 and 975 quantiles for the GLSM
Coefficient 25 quantile 25 quantile 75 quantile 975 quantile1205730(intercept) minus277021 minus320664 minus296753 minus247446 minus177297
1205731(RQD) 02521 minus00385 01291 03726 053881205732(crenulation) 07334 03762 06121 08283 106911205733(kink band) 09189 05881 07802 10274 122611205734(microfracture) 12911 08834 10125 14480 16261
00 02 04 06 08 10
00
02
04
06
08
10
ROC plot
Sens
itivi
ty (t
rue p
ositi
ves)
AUC099 GLM1092 GLM2
1-specificity (false positives)
Figure 1 ROC curves and the corresponding AUC values for thetwo binary GLMs included in the study
MCMClogit function from the MCMCpack (R language)The vector120573 of the parameters fitted for theGLM2model was120573 = (minus356747 12826 05875 05668 25815) where the firstvalue is the independent term and the remaining values arethe corresponding coefficients of the explanatory variables inthe same order as they were mentioned above
Table 2 shows the estimated coefficients and the 2525 75 and 975 quantiles for the fitted model It can beobserved that for a significance level of 120572 = 005 the onlynonsignificant variable is the RQD
The study of the residuals of the GLM2 model fittedwith four explanatory variables detected a spatial dependence
Distance
Sem
ivar
ianc
e
05
10
15
20
5 10 15 20 25
Figure 2 Experimental semivariogram for the GLM2 residuals(points) together with the fitted theoretical model fitted accordingto an exponential model (continuous line)
that could be modelled using an exponential theoreticalsemivariogram with range 081 and sill 161 Figure 2 showsthe experimental semivariogram for the GLM2 residualstogether with the corresponding fitted theoretical model
Given the presence of spatial dependence in the resid-uals it then made sense to fit a GLSM maintaining fourexplanatory variables and assuming an exponential modelfor the process 119878(119909) The correlation function considered wasof the type cov(119878(119909) 119878(1199091015840)) = 1205902120588(119909 1199091015840 120593) + 12059121119909 = 1199091015840with 120588(119906 120593) = exp(minus119906120593) The fit was made using Bayesianinference implemented via the MCMC algorithms The first10000 sample observations from the simulationwere ignoredas burn-in at which point it was considered that convergencetime had been achieved The subsequent samples were usedto obtain the subsequent distribution of the parameters of
Journal of Applied Mathematics 5
minus5
minus4
minus4
minus4
minus3
minus3
minus2
minus2
minus2
3
10
20 40
00 05 10 15
000
005
010
015
020
025
030
120593
1205912 re
l
Figure 3 Two-dimensional likelihood profile for (120593 1205912) for theGLSM
interest The chain was sampled for each 100 of the 50000iterations to obtain samples containing 500 values (for furtherdetails see [17])
Using this procedure the parameters were estimated as120593 = 082 120590
2= 146 120591
2= 009 and 120573 = (minus277021
02521 07334 09189 12911) Table 3 shows the estimated 120573coefficients and their 25 25 75 and 975 quantilesOnce againwe can see how for a significance level of120572 = 005only the RQD variable was not significant
It is important to remember that these parametersshould be conditionally and not marginally interpreted andso should not be directly compared with the parametersestimated for the GLM Direct comparison of a spatial andnonspatial GLM could lead to erroneous conclusions as theestimation methods are fundamentally different Nonethe-less there is a certain correlation in the conclusions to bedrawn from these tables with variables such as crenulationkink band andmicrofracture remaining significant RQD onthe other hand was not significant at 5 level
Figure 3 shows the likelihood profile in two dimensionsfor parameters (120593 1205912) of the model illustrating the flatness ofthe likelihood surface obtained using theMCMC algorithms
A study of the spatial dependence of the GLSM residualsindicated that the spatial component had been correctlymodelled on this occasion Figure 4 shows an experimentalsemivariogram of the GLSM residuals and a theoreticalmodel fitted according to a nugget effect model It is clearthat the empirical semivariogram is essentially flat whichsuggests a suitable fit to the spatial structure A directcomparison between Figures 2 and 4 leads to the conclusionthat the spatial dependence has been properly captured by thestationary Gaussian process 119878
The ROC curve and AUC were calculated for the GLSMThe AUC of the binary spatial model was 099 which
Distance
Sem
ivar
ianc
e5 10 15 20 25
1eminus06
2eminus06
3eminus06
4eminus06
Figure 4 Experimental semivariogram for the GLSM residuals(points) together with the theoretical model fitted according to anugget effect model (continuous line)
Table 4 Error rates for the binary models for three scenariosrepresenting different levels of prediction difficulty
5 test 10 test 15 testGLM 66119864 minus 2 63119864 minus 2 56119864 minus 2
GLSM 41119864 minus 2 51119864 minus 2 46119864 minus 2
indicates a substantial improvement in the precision of theGLSM This improvement is reflected in Figure 5 whichdepicts the ROC curves and AUC values for the GLM2 andGLSM binary models
The comparison between the two models was completedwith a simulation study designed to compare the reliabilityof the predictions in three scenarios of varying levels ofdifficulty Randomly selected for the first scenario was 95 ofthe 313 initial observations composing the training set thatwas used to fit the GLM and GLSM The fitted models werethen validated with the remaining 5 of the observationsThis procedure was repeated 100 times and the numberof errors in the slate exploitability prediction was recordedfor each repetition For the second scenario we randomlyselected 90 of the observations for the training set and theremaining 10made up the test set This simulation was alsorepeated 100 times The procedure for the third scenario wassimilar but this time 85 and 15 of observations made upthe training and test sets respectively
Table 4 displays the error rates for the three scenariosdescribed with the error rate calculated as the ratio betweenthe number of prediction errors and the total number ofpredictions in the test set
6 Journal of Applied Mathematics
00 02 04 06 08 10
00
02
04
06
08
10
ROC plot
Sens
itivi
ty (t
rue p
ositi
ves)
AUC
099 GLSM092 GLM
1-specificity (false positives)
Figure 5 ROC curves and correspondingAUCvalues for the binaryGLM2 and the GLSM
In all the cases it can be observed that theGLSMprovideda better explanation not only of the effect of the variablesdetermining slate quality but also of the spatial behaviourof exploitable slate thereby producing lower prediction errorrates
5 Conclusions
A general interpretation of the GLSM used in our analysis isthat the spatial term 119878 represents the accumulative effect ofpossible explanatory variables with an undetermined spatialstructure which have therefore not been observed
In GLMs the fact that the spatial correlation of thevariables is not taken into account can significantly affect thequality of statistical results Our study highlights the potentialrisk of using GLMs when the data is spatially structured
The conclusion reached after comparing ROC curvesand their corresponding AUCs is that GLSMs predict slateexploitability better than GLMs Therefore it would seemessential to include unexplained spatial variation when mod-elling spatially correlated variables
Based on the comparison of the semivariograms of theGLM and GLSM residuals we would like to draw attentionto the presence of spatial dependence in the GLM residualsin contrast towhat occurswhen aGLSM is implementedThisindicates that spatial dependence has been captured correctlyby the stationary Gaussian process 119878 We can therefore
conclude that a GLSM is more suitable for modelling spatialdependence which is overlooked by classic GLMs
The simulation study demonstrates that for varying levelsof prediction difficulty the GLSM had lower error rates thanthe GLM
Although the parameters of the GLSM must be inter-preted conditionally rather than marginally to 119878 the resultsof the statistical analysis denote the broader potential of theGLSM compared to the classic GLM in analysing spatialdata They also underline the potential risk of reachingerroneous conclusions when using nonspatial models toanalyse spatially structured data
Acknowledgments
This work was funded partly by the ProjectINCITE10REM304009PR of the Xunta de Galicia and by theProjects MTM2008-03129 and MTM2011-23204 of theMinistry of Science and Innovation
References
[1] J M Matıas A Vaamonde J Taboada and W Gonzalez-Manteiga ldquoSupport vector machines and gradient boosting forgraphical estimation of a slate depositrdquo Stochastic Environmen-tal Research and Risk Assessment vol 18 no 5 pp 309ndash3232004
[2] F G Bastante J Taboada L Alejano and E Alonso ldquoOptimiza-tion tools and simulationmethods for designing and evaluatinga mining operationrdquo Stochastic Environmental Research andRisk Assessment vol 22 no 6 pp 727ndash735 2008
[3] J A Nelder and R W M Wedderburn ldquoGeneralized linearmodelsrdquo Journal of the Royal Statistical Society A vol 135 pp370ndash384 1972
[4] P McCullagh and J A Nelder Generalized Linear ModelsChapman and Hall London UK 1989
[5] O F Christensen and R Waagepetersen ldquoBayesian predictionof spatial count data using generalized linear mixed modelsrdquoBiometrics vol 58 no 2 pp 280ndash286 2002
[6] P Diggle R Moyeed B Rowlingson andMThomson ldquoChild-hood malaria in the Gambia a case-study in model-basedgeostatisticsrdquo Journal of the Royal Statistical Society C vol 51no 4 pp 493ndash506 2002
[7] P J Diggle P J Ribeiro andO F Christensen ldquoAn introductionto model-based geostatisticsrdquo in Spatial Statistics and Compu-tational Methods J Moslashller Ed pp 43ndash86 Springer New YorkNY USA 2003
[8] H Zhang ldquoOn estimation and prediction for spatial generalizedlinear mixed modelsrdquo Biometrics vol 58 no 1 pp 129ndash1362002
[9] H Zhang ldquoOptimal interpolation and the appropriateness ofcross-validating variogram in spatial generalized linear mixedmodelsrdquo Journal of Computational and Graphical Statistics vol12 no 3 pp 698ndash713 2003
[10] N Breslow and D Clayton ldquoApproximate inference in gener-alized linear mixed modelsrdquo Journal of the American StatisticalAssociation vol 88 pp 9ndash25 1993
[11] P J Diggle J A Tawn and R A Moyeed ldquoModel-basedgeostatisticsrdquo Journal of the Royal Statistical Society C vol 47no 3 pp 299ndash350 1998
Journal of Applied Mathematics 7
[12] P J Diggle K Y Liang and S L Zeger The Analysis ofLongitudinal Data Clarendon Oxford UK 1994
[13] RDevelopment Core Team R A Language and Environment forStatistical Computing R Foundation for Statistical ComputingVienna Austria 2012 httpwwwr-projectorg
[14] M X Rodrıguez-Alvarez J Roca-Pardinas and C Cadarso-Suarez ldquoROCcurve and covariates extending inducedmethod-ology to the non-parametric frameworkrdquo Statistics andComput-ing vol 21 no 4 pp 483ndash499 2011
[15] J Taboada A Vaamonde A Saavedra and A Arguelles ldquoQual-ity index for ornamental slate depositsrdquo Engineering Geologyvol 50 no 1-2 pp 203ndash210 1998
[16] V de Oliveira B Kedem and D A Short ldquoBayesian predictionof transformed Gaussian random fieldsrdquo Journal of the Ameri-can Statistical Association vol 92 no 440 pp 1422ndash1433 1997
[17] O F Christensen ldquoMonteCarlomaximum likelihood inmodel-based geostatisticsrdquo Journal of Computational and GraphicalStatistics vol 13 no 3 pp 702ndash718 2004
Submit your manuscripts athttpwwwhindawicom
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Mathematical Problems in Engineering
Hindawi Publishing Corporationhttpwwwhindawicom
Differential EquationsInternational Journal of
Volume 2014
Applied MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Mathematical PhysicsAdvances in
Complex AnalysisJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
OptimizationJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Operations ResearchAdvances in
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Function Spaces
Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of Mathematics and Mathematical Sciences
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Algebra
Discrete Dynamics in Nature and Society
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Decision SciencesAdvances in
Discrete MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom
Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Stochastic AnalysisInternational Journal of
Journal of Applied Mathematics 5
minus5
minus4
minus4
minus4
minus3
minus3
minus2
minus2
minus2
3
10
20 40
00 05 10 15
000
005
010
015
020
025
030
120593
1205912 re
l
Figure 3 Two-dimensional likelihood profile for (120593 1205912) for theGLSM
interest The chain was sampled for each 100 of the 50000iterations to obtain samples containing 500 values (for furtherdetails see [17])
Using this procedure the parameters were estimated as120593 = 082 120590
2= 146 120591
2= 009 and 120573 = (minus277021
02521 07334 09189 12911) Table 3 shows the estimated 120573coefficients and their 25 25 75 and 975 quantilesOnce againwe can see how for a significance level of120572 = 005only the RQD variable was not significant
It is important to remember that these parametersshould be conditionally and not marginally interpreted andso should not be directly compared with the parametersestimated for the GLM Direct comparison of a spatial andnonspatial GLM could lead to erroneous conclusions as theestimation methods are fundamentally different Nonethe-less there is a certain correlation in the conclusions to bedrawn from these tables with variables such as crenulationkink band andmicrofracture remaining significant RQD onthe other hand was not significant at 5 level
Figure 3 shows the likelihood profile in two dimensionsfor parameters (120593 1205912) of the model illustrating the flatness ofthe likelihood surface obtained using theMCMC algorithms
A study of the spatial dependence of the GLSM residualsindicated that the spatial component had been correctlymodelled on this occasion Figure 4 shows an experimentalsemivariogram of the GLSM residuals and a theoreticalmodel fitted according to a nugget effect model It is clearthat the empirical semivariogram is essentially flat whichsuggests a suitable fit to the spatial structure A directcomparison between Figures 2 and 4 leads to the conclusionthat the spatial dependence has been properly captured by thestationary Gaussian process 119878
The ROC curve and AUC were calculated for the GLSMThe AUC of the binary spatial model was 099 which
Distance
Sem
ivar
ianc
e5 10 15 20 25
1eminus06
2eminus06
3eminus06
4eminus06
Figure 4 Experimental semivariogram for the GLSM residuals(points) together with the theoretical model fitted according to anugget effect model (continuous line)
Table 4 Error rates for the binary models for three scenariosrepresenting different levels of prediction difficulty
5 test 10 test 15 testGLM 66119864 minus 2 63119864 minus 2 56119864 minus 2
GLSM 41119864 minus 2 51119864 minus 2 46119864 minus 2
indicates a substantial improvement in the precision of theGLSM This improvement is reflected in Figure 5 whichdepicts the ROC curves and AUC values for the GLM2 andGLSM binary models
The comparison between the two models was completedwith a simulation study designed to compare the reliabilityof the predictions in three scenarios of varying levels ofdifficulty Randomly selected for the first scenario was 95 ofthe 313 initial observations composing the training set thatwas used to fit the GLM and GLSM The fitted models werethen validated with the remaining 5 of the observationsThis procedure was repeated 100 times and the numberof errors in the slate exploitability prediction was recordedfor each repetition For the second scenario we randomlyselected 90 of the observations for the training set and theremaining 10made up the test set This simulation was alsorepeated 100 times The procedure for the third scenario wassimilar but this time 85 and 15 of observations made upthe training and test sets respectively
Table 4 displays the error rates for the three scenariosdescribed with the error rate calculated as the ratio betweenthe number of prediction errors and the total number ofpredictions in the test set
6 Journal of Applied Mathematics
00 02 04 06 08 10
00
02
04
06
08
10
ROC plot
Sens
itivi
ty (t
rue p
ositi
ves)
AUC
099 GLSM092 GLM
1-specificity (false positives)
Figure 5 ROC curves and correspondingAUCvalues for the binaryGLM2 and the GLSM
In all the cases it can be observed that theGLSMprovideda better explanation not only of the effect of the variablesdetermining slate quality but also of the spatial behaviourof exploitable slate thereby producing lower prediction errorrates
5 Conclusions
A general interpretation of the GLSM used in our analysis isthat the spatial term 119878 represents the accumulative effect ofpossible explanatory variables with an undetermined spatialstructure which have therefore not been observed
In GLMs the fact that the spatial correlation of thevariables is not taken into account can significantly affect thequality of statistical results Our study highlights the potentialrisk of using GLMs when the data is spatially structured
The conclusion reached after comparing ROC curvesand their corresponding AUCs is that GLSMs predict slateexploitability better than GLMs Therefore it would seemessential to include unexplained spatial variation when mod-elling spatially correlated variables
Based on the comparison of the semivariograms of theGLM and GLSM residuals we would like to draw attentionto the presence of spatial dependence in the GLM residualsin contrast towhat occurswhen aGLSM is implementedThisindicates that spatial dependence has been captured correctlyby the stationary Gaussian process 119878 We can therefore
conclude that a GLSM is more suitable for modelling spatialdependence which is overlooked by classic GLMs
The simulation study demonstrates that for varying levelsof prediction difficulty the GLSM had lower error rates thanthe GLM
Although the parameters of the GLSM must be inter-preted conditionally rather than marginally to 119878 the resultsof the statistical analysis denote the broader potential of theGLSM compared to the classic GLM in analysing spatialdata They also underline the potential risk of reachingerroneous conclusions when using nonspatial models toanalyse spatially structured data
Acknowledgments
This work was funded partly by the ProjectINCITE10REM304009PR of the Xunta de Galicia and by theProjects MTM2008-03129 and MTM2011-23204 of theMinistry of Science and Innovation
References
[1] J M Matıas A Vaamonde J Taboada and W Gonzalez-Manteiga ldquoSupport vector machines and gradient boosting forgraphical estimation of a slate depositrdquo Stochastic Environmen-tal Research and Risk Assessment vol 18 no 5 pp 309ndash3232004
[2] F G Bastante J Taboada L Alejano and E Alonso ldquoOptimiza-tion tools and simulationmethods for designing and evaluatinga mining operationrdquo Stochastic Environmental Research andRisk Assessment vol 22 no 6 pp 727ndash735 2008
[3] J A Nelder and R W M Wedderburn ldquoGeneralized linearmodelsrdquo Journal of the Royal Statistical Society A vol 135 pp370ndash384 1972
[4] P McCullagh and J A Nelder Generalized Linear ModelsChapman and Hall London UK 1989
[5] O F Christensen and R Waagepetersen ldquoBayesian predictionof spatial count data using generalized linear mixed modelsrdquoBiometrics vol 58 no 2 pp 280ndash286 2002
[6] P Diggle R Moyeed B Rowlingson andMThomson ldquoChild-hood malaria in the Gambia a case-study in model-basedgeostatisticsrdquo Journal of the Royal Statistical Society C vol 51no 4 pp 493ndash506 2002
[7] P J Diggle P J Ribeiro andO F Christensen ldquoAn introductionto model-based geostatisticsrdquo in Spatial Statistics and Compu-tational Methods J Moslashller Ed pp 43ndash86 Springer New YorkNY USA 2003
[8] H Zhang ldquoOn estimation and prediction for spatial generalizedlinear mixed modelsrdquo Biometrics vol 58 no 1 pp 129ndash1362002
[9] H Zhang ldquoOptimal interpolation and the appropriateness ofcross-validating variogram in spatial generalized linear mixedmodelsrdquo Journal of Computational and Graphical Statistics vol12 no 3 pp 698ndash713 2003
[10] N Breslow and D Clayton ldquoApproximate inference in gener-alized linear mixed modelsrdquo Journal of the American StatisticalAssociation vol 88 pp 9ndash25 1993
[11] P J Diggle J A Tawn and R A Moyeed ldquoModel-basedgeostatisticsrdquo Journal of the Royal Statistical Society C vol 47no 3 pp 299ndash350 1998
Journal of Applied Mathematics 7
[12] P J Diggle K Y Liang and S L Zeger The Analysis ofLongitudinal Data Clarendon Oxford UK 1994
[13] RDevelopment Core Team R A Language and Environment forStatistical Computing R Foundation for Statistical ComputingVienna Austria 2012 httpwwwr-projectorg
[14] M X Rodrıguez-Alvarez J Roca-Pardinas and C Cadarso-Suarez ldquoROCcurve and covariates extending inducedmethod-ology to the non-parametric frameworkrdquo Statistics andComput-ing vol 21 no 4 pp 483ndash499 2011
[15] J Taboada A Vaamonde A Saavedra and A Arguelles ldquoQual-ity index for ornamental slate depositsrdquo Engineering Geologyvol 50 no 1-2 pp 203ndash210 1998
[16] V de Oliveira B Kedem and D A Short ldquoBayesian predictionof transformed Gaussian random fieldsrdquo Journal of the Ameri-can Statistical Association vol 92 no 440 pp 1422ndash1433 1997
[17] O F Christensen ldquoMonteCarlomaximum likelihood inmodel-based geostatisticsrdquo Journal of Computational and GraphicalStatistics vol 13 no 3 pp 702ndash718 2004
Submit your manuscripts athttpwwwhindawicom
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Mathematical Problems in Engineering
Hindawi Publishing Corporationhttpwwwhindawicom
Differential EquationsInternational Journal of
Volume 2014
Applied MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Mathematical PhysicsAdvances in
Complex AnalysisJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
OptimizationJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Operations ResearchAdvances in
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Function Spaces
Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of Mathematics and Mathematical Sciences
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Algebra
Discrete Dynamics in Nature and Society
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Decision SciencesAdvances in
Discrete MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom
Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Stochastic AnalysisInternational Journal of
6 Journal of Applied Mathematics
00 02 04 06 08 10
00
02
04
06
08
10
ROC plot
Sens
itivi
ty (t
rue p
ositi
ves)
AUC
099 GLSM092 GLM
1-specificity (false positives)
Figure 5 ROC curves and correspondingAUCvalues for the binaryGLM2 and the GLSM
In all the cases it can be observed that theGLSMprovideda better explanation not only of the effect of the variablesdetermining slate quality but also of the spatial behaviourof exploitable slate thereby producing lower prediction errorrates
5 Conclusions
A general interpretation of the GLSM used in our analysis isthat the spatial term 119878 represents the accumulative effect ofpossible explanatory variables with an undetermined spatialstructure which have therefore not been observed
In GLMs the fact that the spatial correlation of thevariables is not taken into account can significantly affect thequality of statistical results Our study highlights the potentialrisk of using GLMs when the data is spatially structured
The conclusion reached after comparing ROC curvesand their corresponding AUCs is that GLSMs predict slateexploitability better than GLMs Therefore it would seemessential to include unexplained spatial variation when mod-elling spatially correlated variables
Based on the comparison of the semivariograms of theGLM and GLSM residuals we would like to draw attentionto the presence of spatial dependence in the GLM residualsin contrast towhat occurswhen aGLSM is implementedThisindicates that spatial dependence has been captured correctlyby the stationary Gaussian process 119878 We can therefore
conclude that a GLSM is more suitable for modelling spatialdependence which is overlooked by classic GLMs
The simulation study demonstrates that for varying levelsof prediction difficulty the GLSM had lower error rates thanthe GLM
Although the parameters of the GLSM must be inter-preted conditionally rather than marginally to 119878 the resultsof the statistical analysis denote the broader potential of theGLSM compared to the classic GLM in analysing spatialdata They also underline the potential risk of reachingerroneous conclusions when using nonspatial models toanalyse spatially structured data
Acknowledgments
This work was funded partly by the ProjectINCITE10REM304009PR of the Xunta de Galicia and by theProjects MTM2008-03129 and MTM2011-23204 of theMinistry of Science and Innovation
References
[1] J M Matıas A Vaamonde J Taboada and W Gonzalez-Manteiga ldquoSupport vector machines and gradient boosting forgraphical estimation of a slate depositrdquo Stochastic Environmen-tal Research and Risk Assessment vol 18 no 5 pp 309ndash3232004
[2] F G Bastante J Taboada L Alejano and E Alonso ldquoOptimiza-tion tools and simulationmethods for designing and evaluatinga mining operationrdquo Stochastic Environmental Research andRisk Assessment vol 22 no 6 pp 727ndash735 2008
[3] J A Nelder and R W M Wedderburn ldquoGeneralized linearmodelsrdquo Journal of the Royal Statistical Society A vol 135 pp370ndash384 1972
[4] P McCullagh and J A Nelder Generalized Linear ModelsChapman and Hall London UK 1989
[5] O F Christensen and R Waagepetersen ldquoBayesian predictionof spatial count data using generalized linear mixed modelsrdquoBiometrics vol 58 no 2 pp 280ndash286 2002
[6] P Diggle R Moyeed B Rowlingson andMThomson ldquoChild-hood malaria in the Gambia a case-study in model-basedgeostatisticsrdquo Journal of the Royal Statistical Society C vol 51no 4 pp 493ndash506 2002
[7] P J Diggle P J Ribeiro andO F Christensen ldquoAn introductionto model-based geostatisticsrdquo in Spatial Statistics and Compu-tational Methods J Moslashller Ed pp 43ndash86 Springer New YorkNY USA 2003
[8] H Zhang ldquoOn estimation and prediction for spatial generalizedlinear mixed modelsrdquo Biometrics vol 58 no 1 pp 129ndash1362002
[9] H Zhang ldquoOptimal interpolation and the appropriateness ofcross-validating variogram in spatial generalized linear mixedmodelsrdquo Journal of Computational and Graphical Statistics vol12 no 3 pp 698ndash713 2003
[10] N Breslow and D Clayton ldquoApproximate inference in gener-alized linear mixed modelsrdquo Journal of the American StatisticalAssociation vol 88 pp 9ndash25 1993
[11] P J Diggle J A Tawn and R A Moyeed ldquoModel-basedgeostatisticsrdquo Journal of the Royal Statistical Society C vol 47no 3 pp 299ndash350 1998
Journal of Applied Mathematics 7
[12] P J Diggle K Y Liang and S L Zeger The Analysis ofLongitudinal Data Clarendon Oxford UK 1994
[13] RDevelopment Core Team R A Language and Environment forStatistical Computing R Foundation for Statistical ComputingVienna Austria 2012 httpwwwr-projectorg
[14] M X Rodrıguez-Alvarez J Roca-Pardinas and C Cadarso-Suarez ldquoROCcurve and covariates extending inducedmethod-ology to the non-parametric frameworkrdquo Statistics andComput-ing vol 21 no 4 pp 483ndash499 2011
[15] J Taboada A Vaamonde A Saavedra and A Arguelles ldquoQual-ity index for ornamental slate depositsrdquo Engineering Geologyvol 50 no 1-2 pp 203ndash210 1998
[16] V de Oliveira B Kedem and D A Short ldquoBayesian predictionof transformed Gaussian random fieldsrdquo Journal of the Ameri-can Statistical Association vol 92 no 440 pp 1422ndash1433 1997
[17] O F Christensen ldquoMonteCarlomaximum likelihood inmodel-based geostatisticsrdquo Journal of Computational and GraphicalStatistics vol 13 no 3 pp 702ndash718 2004
Submit your manuscripts athttpwwwhindawicom
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Mathematical Problems in Engineering
Hindawi Publishing Corporationhttpwwwhindawicom
Differential EquationsInternational Journal of
Volume 2014
Applied MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Mathematical PhysicsAdvances in
Complex AnalysisJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
OptimizationJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Operations ResearchAdvances in
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Function Spaces
Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of Mathematics and Mathematical Sciences
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Algebra
Discrete Dynamics in Nature and Society
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Decision SciencesAdvances in
Discrete MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom
Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Stochastic AnalysisInternational Journal of
Journal of Applied Mathematics 7
[12] P J Diggle K Y Liang and S L Zeger The Analysis ofLongitudinal Data Clarendon Oxford UK 1994
[13] RDevelopment Core Team R A Language and Environment forStatistical Computing R Foundation for Statistical ComputingVienna Austria 2012 httpwwwr-projectorg
[14] M X Rodrıguez-Alvarez J Roca-Pardinas and C Cadarso-Suarez ldquoROCcurve and covariates extending inducedmethod-ology to the non-parametric frameworkrdquo Statistics andComput-ing vol 21 no 4 pp 483ndash499 2011
[15] J Taboada A Vaamonde A Saavedra and A Arguelles ldquoQual-ity index for ornamental slate depositsrdquo Engineering Geologyvol 50 no 1-2 pp 203ndash210 1998
[16] V de Oliveira B Kedem and D A Short ldquoBayesian predictionof transformed Gaussian random fieldsrdquo Journal of the Ameri-can Statistical Association vol 92 no 440 pp 1422ndash1433 1997
[17] O F Christensen ldquoMonteCarlomaximum likelihood inmodel-based geostatisticsrdquo Journal of Computational and GraphicalStatistics vol 13 no 3 pp 702ndash718 2004
Submit your manuscripts athttpwwwhindawicom
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Mathematical Problems in Engineering
Hindawi Publishing Corporationhttpwwwhindawicom
Differential EquationsInternational Journal of
Volume 2014
Applied MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Mathematical PhysicsAdvances in
Complex AnalysisJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
OptimizationJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Operations ResearchAdvances in
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Function Spaces
Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of Mathematics and Mathematical Sciences
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Algebra
Discrete Dynamics in Nature and Society
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Decision SciencesAdvances in
Discrete MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom
Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Stochastic AnalysisInternational Journal of
Submit your manuscripts athttpwwwhindawicom
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Mathematical Problems in Engineering
Hindawi Publishing Corporationhttpwwwhindawicom
Differential EquationsInternational Journal of
Volume 2014
Applied MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Mathematical PhysicsAdvances in
Complex AnalysisJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
OptimizationJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Operations ResearchAdvances in
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Function Spaces
Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of Mathematics and Mathematical Sciences
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Algebra
Discrete Dynamics in Nature and Society
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Decision SciencesAdvances in
Discrete MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom
Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Stochastic AnalysisInternational Journal of