

Geoderma 274 (2016) 18–27

Contents lists available at ScienceDirect

Geoderma

journal homepage: www.elsevier.com/locate/geoderma

Digital soil mapping at local scale using a multi-depth Vis–NIR spectral library and terrain attributes

Rodnei Rizzo a, José A.M. Demattê b,⁎, Igo F. Lepsch c, Bruna C. Gallo b, Caio T. Fongaro b

a Environmental Analysis and Geoprocessing Laboratory, Center for Nuclear Energy in Agriculture, University of São Paulo, Av. Centenário, 303, PO Box 96, 13416-000 Piracicaba, SP, Brazil
b College of Agriculture “Luiz de Queiroz”, University of São Paulo, ESALQ/USP, Department of Soil Science, Av. Padua Dias 11, PO Box 9, CEP 13418-900 Piracicaba, SP, Brazil
c Visiting Soil Scientist, ESALQ/USP, Department of Soil Science, Av. Padua Dias 11, PO Box 9, CEP 13418-900 Piracicaba, SP, Brazil

⁎ Corresponding author.
E-mail addresses: [email protected] (R. Rizzo), [email protected] (J.A.M. Demattê), [email protected] (I.F. Lepsch), [email protected] (B.C. Gallo), [email protected] (C.T. Fongaro).

http://dx.doi.org/10.1016/j.geoderma.2016.03.019
0016-7061/© 2016 Elsevier B.V. All rights reserved.

Article info

Article history:
Received 11 June 2015
Received in revised form 10 February 2016
Accepted 20 March 2016
Available online xxxx

Abstract

Conventional soil mapping is costly and time consuming. Therefore, the development of quick, cheap, but accurate methods is required. Several studies highlight the importance of developing regional soil spectral libraries for digital soil mapping, but few studies report on the use of these libraries to aid digital mapping of soil types. This study aims to produce a digital soil map using as training set Visible and Near Infra-Red (Vis–NIR) spectra from local soil samples, a regional spectral library and terrain attributes. The soils were sampled in 162 locations on a 270-ha farm in the municipality of Piracicaba, São Paulo, Brazil. Spectra from topsoil and subsoil were measured in the laboratory (400–2500 nm) and arranged as multi-depth spectra. Information was summarized by principal component analysis. Regression tree models were calibrated to predict principal component (PC) scores based on terrain attributes. After calibration, the models were applied to the entire study site, resulting in PC score maps. Fuzzy c-means and PC maps were used to define the soil mapping units (MU). Based on fuzzy centroids, representative samples (RS) were defined for the MU. Munsell soil color and soil order were predicted from soil spectra and used to characterize the MU. The regression tree model had a good fit for PC1, with an r² of 0.92, and a satisfactory r² for PC3, PC4 and PC5 (0.58, 0.66 and 0.53, respectively). The fuzzy clustering defined seven MU. The R² values for Munsell color predictions were 0.94 (hue), 0.96 (value) and 0.73 (chroma). Soil order had good agreement in validation, with a kappa coefficient of 0.41. The methodology indicates the potential of Vis–NIR spectra to improve soil mapping campaigns and consequently provides a product similar to a conventional soil map.

© 2016 Elsevier B.V. All rights reserved.

Keywords:
Regional soil spectral library
Regression tree
Fuzzy c-means
Munsell color
Brazilian Soil Classification System

1. Introduction

Choosing species and crop varieties adapted to the various kinds of soils delineated in maps could provide higher crop yields. However, soil maps at a scale adequate for crop management are scarce (Mendonça-Santos and Dos Santos, 2006). Ben-Dor et al. (2008) reported on the high cost associated with soil surveying and mapping. One alternative to reduce costs could be the adoption of Digital Soil Mapping (DSM). A number of studies describe DSM techniques to create maps of soil attributes or even soil types (identified as taxonomic classes). In most cases, these maps are derived from a calibration set (point information related to chemical and physical properties or soil classification) and environmental covariates, such as terrain attributes and satellite images (Adhikari et al., 2014; Lagacherie et al., 2012; Vasques et al., 2015).


Visible and near-infrared (Vis–NIR) spectroscopy can be a useful indicator of soil variability (Demattê and Terra, 2014). The ability to obtain a large amount of information at lower cost or in a shorter time allows increasing the number of observations and consequently improving digital soil mapping (Viscarra Rossel et al., 2009). Recently, the joint effort of researchers from several countries has resulted in the establishment of a global soil spectral library (Viscarra Rossel et al., 2016). These databases have great potential to improve the accuracy of digital soil maps, providing information about the most relevant soil attributes and enabling spatio-temporal monitoring of soils in many regions worldwide.

Given that soil spectra carry information about many soil attributes (Soriano-Disla et al., 2014), studies have suggested that the spectra could also be used to measure similarities between soil types and consequently provide soil classification (Vasques et al., 2014; Viscarra Rossel and Webster, 2011). Bellinaso et al. (2010) used a regional soil spectral library to describe and classify soil profiles according to the Brazilian Soil Classification System (SiBCS) (Embrapa, 2013). Ben-Dor et al. (2008) developed the 3S-HeD, a device able to improve reflectance data measurement in the field. The authors attached a field spectrometer to this device and performed a quantitative profile description based on Vis–NIR spectra. Vasques et al. (2015) elaborated a digital soil map at the SiBCS suborder level based on satellite images, terrain attributes and interpolated average reflectance from soil Vis–NIR spectra.

While many studies apply Vis–NIR spectra to improve digital mapping of soil attributes, the synergy between proximal Vis–NIR sensing and soil types has been little explored. Clearly, there is a need for strategies using Vis–NIR spectra in DSM of soil types (identified as soil classes). The aim of our study was to test a digital mapping technique that uses as training set (i) soil spectra from local samples, (ii) a regional spectral library and (iii) terrain attributes. Spectra of local samples are used in many steps of the mapping process to (a) define the mapping units (MU), (b) select representative samples (RS) from each MU and (c) classify the soil types according to the SiBCS order level (Embrapa, 2013).

2. Material and methods

2.1. Study site

The study site is a 270 ha farm located in the municipality of Piracicaba, São Paulo State, Brazil, between the coordinates 22°42′30″–22°43′27″S and 47°33′32″–47°34′45″W (Fig. 1). The lithology comprises diabases from the Serra Geral Formation, argillaceous siltstones and argillites from the Tatuí Formation and argillites from the Irati Formation (Vidal-Torrado et al., 1999). The climate is “Cwa”, subtropical with dry winters and rainy summers (Köppen classification). Annual rainfall ranges from 1250 to 1500 mm. Relief consists of two interconnected hills with dominantly convex slopes ranging from 2% (on the hilltops) to 12% (on the foothills).

2.2. Data acquisition

A 30-meter resolution digital elevation model (DEM) of the study site was obtained from a topographical chart (1:10,000 scale) (Hutchinson, 1993). Later, SAGA GIS (System for Automated Geoscientific Analyses) was used to derive the following terrain attributes: Altitude Above Channel Network (AACN), Aspect (ASP), Catchment Area (CA), Channel Network Base Level (CNBL), Curvature (CUR) (Zevenbergen and Thorne, 1987), Hillshade (HIL), Topographic factor (LSF), Slope (SLOP) (Horn, 1981), Stream Power (SP) (Moore et al., 1993), Terrain Roughness (TR), Topographic Wetness Index (TWI) (Moore et al., 1993), Vector Terrain Roughness (VTR) (Hoffman and Krotkov, 1990) and Wetness Index (WI) (Moore et al., 1993).

Fig. 1. Location of the study site, the samples collected in the site and samples from the regional spectral library.

Soils were sampled with an auger along five toposequences, at 30-meter intervals, at two depths (0–20 cm and 80–100 cm). We collected 324 samples that were dried at 50 °C and sieved through a 2-mm mesh. Fractions smaller than 2 mm were used for laboratory analyses. The color of dry soil was measured with a Minolta colorimeter (CR-300), adjusted to the Munsell color system (Campos et al., 2003). The soils were classified at the suborder level according to the Brazilian Soil Classification System (SiBCS) (Embrapa, 2013). The corresponding World Reference Base (IUSS Working Group WRB, 2014) and Soil Taxonomy (Soil Survey Staff, 2014) classes are shown in Table 1.

The spectra of the soil samples were measured in the laboratory using a FieldSpec Pro spectrometer (Analytical Spectral Devices, Boulder, CO) over a spectral range between 400 and 2500 nm. In the system geometry, the sensor was positioned perpendicular to the sample at a distance of 27 cm. The light source was positioned at 61 cm from the sample and at an angle of 20° with the zenith. The absolute reference standard used was a white Spectralon plate.

2.3. Spatial modeling of soil multi-depth spectra

Spectra from the two sampled soil depths were joined in sequence to create a pseudo multi-depth soil spectrum (Vasques et al., 2014). Principal component analysis (PCA) was applied to summarize the information in the spectra, resulting in 5 principal components (PC). The PCA was performed using the iterative NIPALS algorithm (Martens and Naes, 1989) implemented in ParLeS 3.01 (Viscarra Rossel, 2008).
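The joining of the two depths and the NIPALS decomposition can be illustrated with a minimal sketch. It is not the ParLeS implementation: the function names, the convergence settings and the six-band toy reflectance values are assumptions made only for illustration, and the code uses plain Python lists rather than a chemometrics package.

```python
import math

def multi_depth_spectrum(topsoil, subsoil):
    """Join topsoil and subsoil reflectance in sequence (pseudo multi-depth spectrum)."""
    return list(topsoil) + list(subsoil)

def center_columns(X):
    """Subtract the column mean from every band."""
    n = len(X)
    means = [sum(row[j] for row in X) / n for j in range(len(X[0]))]
    return [[row[j] - means[j] for j in range(len(row))] for row in X]

def nipals_pca(X, n_components, tol=1e-10, max_iter=500):
    """Return (scores, loadings) of a column-centred matrix X, one PC at a time."""
    E = [row[:] for row in X]  # residual matrix, deflated after each component
    n, m = len(E), len(E[0])
    scores, loadings = [], []
    for _ in range(n_components):
        # start from the band with the largest residual sum of squares
        j0 = max(range(m), key=lambda j: sum(row[j] ** 2 for row in E))
        t = [row[j0] for row in E]
        for _ in range(max_iter):
            tt = sum(v * v for v in t)
            p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(m)]
            norm = math.sqrt(sum(v * v for v in p))
            p = [v / norm for v in p]  # unit-length loading vector
            t_new = [sum(E[i][j] * p[j] for j in range(m)) for i in range(n)]
            diff = sum((a - b) ** 2 for a, b in zip(t_new, t))
            t = t_new
            if diff < tol * sum(v * v for v in t):
                break
        for i in range(n):  # deflation: remove the extracted component
            for j in range(m):
                E[i][j] -= t[i] * p[j]
        scores.append(t)
        loadings.append(p)
    return scores, loadings

# Hypothetical reflectance values: 4 sampling locations, 3 bands per depth.
top = [[0.10, 0.20, 0.30], [0.15, 0.25, 0.36], [0.30, 0.42, 0.50], [0.32, 0.47, 0.58]]
sub = [[0.12, 0.25, 0.35], [0.20, 0.28, 0.41], [0.33, 0.50, 0.61], [0.36, 0.49, 0.66]]
X = center_columns([multi_depth_spectrum(t, s) for t, s in zip(top, sub)])
scores, loadings = nipals_pca(X, n_components=2)
```

Each entry of `scores` holds one PC score per sampling location, which is the quantity later modeled from the terrain attributes.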


Table 1
Soil suborders according to the Brazilian Soil Classification System (SiBCS; EMBRAPA, 2013) and corresponding World Reference Base (WRB; IUSS Working Group WRB, 2014) and Soil Taxonomy (Soil Survey Staff, 2014) classes. Source: adapted from Vasques et al. (2014).

| Identif. | SiBCS classification | WRB classification | Soil taxonomy classification |
|---|---|---|---|
| CX | Cambissolo Háplico | Cambisol | Udepts |
| LV | Latossolo Vermelho | Ferralsol | Udox |
| LVA | Latossolo Vermelho-Amarelo | Ferralsol | Udox |
| NV | Nitossolo Vermelho | Nitisol | Udalfs, Udults |
| NX | Nitossolo Háplico | Nitisol | Udalfs, Udults |
| PA | Argissolo Amarelo | Lixisol | Udalfs, Udults |
| PV | Argissolo Vermelho | Lixisol | Udalfs, Udults |
| PVA | Argissolo Vermelho-Amarelo | Lixisol | Udalfs, Udults |
| R | Neossolo | Leptosol, Regosol | Lithic Udorthents, Lithic Udipsamments, Udorthents, Udipsamments |


The next step consisted of calibrating a model to estimate the PC according to the predictor variables, i.e. terrain attributes. A model tree based on the M5 method (Quinlan, 1992) and implemented in the R package Cubist (Kuhn et al., 2014) was used. The algorithm applies recursive partitioning to the predictor variable space using a divide-and-conquer strategy to build a tree-like model (Viscarra Rossel and Chen, 2011). The partitions are generated according to a set of binary decision rules that minimize the intra-subset variation at each node (Henderson et al., 2005). This technique divides the variable space into smaller regions, where interactions are usually easier to describe, and it creates a multivariate linear least-squares model for each of the partitions.
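The partitioning idea can be sketched for a single predictor. The code below is not Cubist and not the M5 algorithm in full: it only searches for the one binary rule that minimizes the intra-subset variation and then fits a simple least-squares line in a partition. The function names and the elevation and PC1 values are hypothetical.

```python
def sse(values):
    """Sum of squared deviations from the mean (intra-subset variation)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(x, y):
    """Return (threshold, total_sse) for the rule x <= threshold that minimises
    the summed intra-subset variation of y."""
    pairs = sorted(zip(x, y))
    best = (None, float("inf"))
    for k in range(1, len(pairs)):
        if pairs[k - 1][0] == pairs[k][0]:
            continue  # no threshold can separate identical predictor values
        threshold = (pairs[k - 1][0] + pairs[k][0]) / 2
        left = [v for _, v in pairs[:k]]
        right = [v for _, v in pairs[k:]]
        total = sse(left) + sse(right)
        if total < best[1]:
            best = (threshold, total)
    return best

def fit_line(x, y):
    """Ordinary least-squares (intercept, slope) for one partition."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((u - mx) ** 2 for u in x)
    slope = sum((u - mx) * (v - my) for u, v in zip(x, y)) / sxx
    return my - slope * mx, slope

# Hypothetical data: elevation (m) and PC1 score at eight sampling locations.
elevation = [520, 530, 540, 550, 600, 610, 620, 630]
pc1 = [4.0, 4.2, 3.9, 4.1, -3.0, -3.2, -2.9, -3.1]
threshold, total = best_split(elevation, pc1)
low_part = [(e, s) for e, s in zip(elevation, pc1) if e <= threshold]
intercept, slope = fit_line([e for e, _ in low_part], [s for _, s in low_part])
```

A model tree repeats this search recursively over all predictors and fits a multivariate linear model in each final partition; the sketch stops after one split to keep the logic visible.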

During the calibration process, the effect of model parameters on prediction performance was tested with a 10-fold cross-validation, using 130 observations (80% of the dataset). Up to 20 committees and 100 rules were tested to select the model with the best performance. The selected model was validated with an independent set, that is, the remaining 32 observations (20%). The statistics considered in assessing the predictions were the Spearman's rank correlation coefficient (Cr), average error (AE), relative error (RE) and root mean-square error (RMSE). Cr measures the strength of the association between observed and predicted values, AE is the mean error and indicates how close the predictions are to the measured values, and RE corresponds to:

$$\mathrm{RE} = \frac{\frac{1}{N}\sum_{i=1}^{N}\left|y_i - \hat{y}_i\right|}{\frac{1}{N}\sum_{i=1}^{N}\left|y_i - \bar{y}\right|}$$

and RMSE to:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2}$$

where $y_i$ and $\hat{y}_i$ are the observed and predicted values, $\bar{y}$ is the average of predicted values and $N$ is the number of observations. In general, useful models have an RE value smaller than 1; otherwise, there is little predictive capacity.
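As a worked illustration of these statistics, the sketch below computes AE, RE and RMSE for a small set of made-up values. Two points are assumptions rather than statements from the text: AE is taken as the mean absolute error (the numerator of RE), and the reference value in the RE denominator defaults to the average of the predicted values, following the definition given above, but can be overridden through the hypothetical `y_bar` argument.

```python
import math

def average_error(obs, pred):
    """Mean absolute difference between observed and predicted values (assumed AE)."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)

def relative_error(obs, pred, y_bar=None):
    """Ratio of the average error to the error of always predicting y_bar."""
    if y_bar is None:
        y_bar = sum(pred) / len(pred)  # average of predicted values, as defined in the text
    baseline = sum(abs(o - y_bar) for o in obs) / len(obs)
    return average_error(obs, pred) / baseline

def rmse(obs, pred):
    """Root mean-square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

# Made-up observed and predicted PC scores.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]
ae = average_error(observed, predicted)   # 0.25
re = relative_error(observed, predicted)  # 0.25
error = rmse(observed, predicted)         # sqrt(0.125)
```

With these numbers RE is well below 1, the situation the text describes as a useful model.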

The Cubist model also provides the set of rules and the usage frequency of each predictor in both the conditions and the linear models. This information indicates the importance of each terrain attribute in explaining the variation of multi-depth spectra in the study site. After testing, the models with a reasonable fit were generalized to the entire site using the terrain attributes.

2.4. Digitally mapping soil units

Assuming that the combined spectral information is a good indicator of soil variability, the multi-depth spectra maps were used in a clustering technique. A fuzzy c-means algorithm (Bezdek, 1981) was applied, dividing the landscape into different groups, i.e. defining the MU according to the multi-depth spectra.

The fuzzy c-means algorithm was applied several times, varying the number of clusters from 5 to 10. Two indexes were evaluated to establish the best number of groups: the Xie and Beni index (XB) (Xie and Beni, 1991) and the partition coefficient (PTC) proposed by Bezdek (1981). The best partition corresponded to the lowest XB value and the highest PTC value. After calculating the validation indexes and defining the optimal number of clusters, the map representing the spatial distribution of fuzzy groups (i.e. MU) was defined.
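The selection of the number of clusters can be sketched as follows. This is a bare-bones fuzzy c-means written for illustration, not the software used in the study. The fuzziness exponent m = 2, the fixed number of iterations, the deterministic initialization from evenly spaced data points and the two-dimensional toy scores are all assumptions, and the toy example scans 2 to 5 clusters instead of 5 to 10.

```python
def sqdist(a, b):
    """Squared Euclidean distance."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def memberships(X, centroids, m):
    """Fuzzy membership of every observation in every cluster."""
    U = []
    for x in X:
        # inverse squared distances; the small floor avoids division by zero
        inv = [max(sqdist(x, v), 1e-12) ** (-1.0 / (m - 1.0)) for v in centroids]
        total = sum(inv)
        U.append([w / total for w in inv])
    return U

def fuzzy_c_means(X, c, m=2.0, n_iter=100):
    """Alternate membership and centroid updates for a fixed number of iterations."""
    n = len(X)
    centroids = [list(X[k * n // c]) for k in range(c)]  # evenly spaced starting points
    for _ in range(n_iter):
        U = memberships(X, centroids, m)
        centroids = [
            [sum(U[i][k] ** m * X[i][j] for i in range(n)) /
             sum(U[i][k] ** m for i in range(n))
             for j in range(len(X[0]))]
            for k in range(c)
        ]
    return memberships(X, centroids, m), centroids

def partition_coefficient(U):
    """PTC: mean of squared memberships; higher means a crisper partition."""
    return sum(u * u for row in U for u in row) / len(U)

def xie_beni(X, U, centroids, m=2.0):
    """XB: compactness divided by separation; lower is better."""
    compactness = sum(U[i][k] ** m * sqdist(X[i], centroids[k])
                      for i in range(len(X)) for k in range(len(centroids)))
    separation = min(sqdist(a, b)
                     for i, a in enumerate(centroids) for b in centroids[i + 1:])
    return float("inf") if separation == 0 else compactness / (len(X) * separation)

# Hypothetical PC scores forming three well-separated groups.
X = [(0.0, 0.0), (0.5, 0.2), (0.2, 0.6), (0.6, 0.5),
     (10.0, 0.0), (10.4, 0.3), (9.7, 0.5), (10.2, 0.7),
     (5.0, 9.0), (5.3, 9.4), (4.8, 9.5), (5.4, 8.8)]
results = {}
for c in range(2, 6):
    U, V = fuzzy_c_means(X, c)
    results[c] = (xie_beni(X, U, V), partition_coefficient(U))
best_xb = min(results, key=lambda c: results[c][0])
best_ptc = max(results, key=lambda c: results[c][1])
```

When both indexes point to the same number of clusters, as they are expected to for such clearly grouped data, that partition is retained.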

2.5. Characterizing soil mapping units

After the MU were defined, the next step consisted of identifying the soil order and Munsell color of each unit (Fig. 2). Soil color is an important characteristic and is used as a criterion in the SiBCS suborder (2nd level) classification.

First, a representative soil sample had to be defined for each MU. Considering that the fuzzy algorithm creates a centroid for every MU, the soil samples most similar to the centroids were chosen as RS. A soil sample (superficial and sub-superficial) for each MU was selected based on the highest fuzzy membership values.

The spectral library developed by Bellinaso et al. (2010) was the basis to identify the SiBCS soil order of the RS. This library was built using soil samples collected in field surveys in the Piracicaba region. A subset of this library (SRSSL) was defined and used in our study. For this subset, soil samples developed from the same parent materials as those of our study site were selected. A total of 191 sampled locations (including soil pits and auger borings) (Fig. 1) were used in the SRSSL.

To predict soil order, the SRSSL information from both diagnostic horizons (superficial and subsuperficial) was combined (multi-depth spectra). Later, soil order centroids were calculated by averaging the characteristic values (principal component scores) of each soil class in the database. The classification was based on the Manhattan distance between the RS and the soil order centroids from the SRSSL, where the representative sample adopts the soil order of the nearest SRSSL centroid.
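The nearest-centroid rule can be written in a few lines. The sketch below is only an illustration of the idea: the function names, the two-component scores and the library entries are hypothetical and do not come from the SRSSL.

```python
def class_centroids(scores_by_class):
    """Average the principal component scores of each soil class."""
    centroids = {}
    for label, rows in scores_by_class.items():
        n = len(rows)
        centroids[label] = [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]
    return centroids

def manhattan(a, b):
    """Manhattan (city-block) distance between two score vectors."""
    return sum(abs(p - q) for p, q in zip(a, b))

def nearest_order(sample, centroids):
    """Soil order whose centroid is closest to the representative sample."""
    return min(centroids, key=lambda label: manhattan(sample, centroids[label]))

# Hypothetical library scores (two PCs per sample) grouped by soil order.
library = {
    "Latossolo": [[-3.0, 0.2], [-2.6, 0.4]],
    "Nitossolo": [[-0.5, 1.0], [-0.1, 1.4]],
    "Cambissolo": [[2.5, -0.8], [3.1, -0.4]],
}
centroids = class_centroids(library)
predicted_order = nearest_order([-2.4, 0.5], centroids)
```

In this toy case the representative sample lies 0.6 units from the Latossolo centroid and much farther from the other two, so it is assigned to that order.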

The predictions of Munsell color required only the spectra from subsoils, here considered as the subsurface diagnostic horizon. The Munsell color prediction combined the methodologies described by Torrent and Barrón (1993) and Viscarra Rossel et al. (2006). Briefly, RS spectra in the visible range were converted to the XYZ color system by integrating the spectra with the color-matching functions (x, y, z) of the Standard Observer D65 (CIE, 1931). The XYZ coordinates were converted to L*a*b*; the a* and b* coordinates allowed the calculation of hue angle and chroma, while value was estimated from L*. Using a color conversion look-up table (LUT), the hue angle was converted to Munsell notation (Viscarra Rossel et al., 2006).
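The step from XYZ to hue angle and chroma can be sketched with the standard CIELAB equations. The sketch makes several assumptions: the D65 white point for the 2° observer is used as reference, the XYZ input is a made-up example, and the value estimate L*/10 is a common rule of thumb rather than the relation used by the authors. The integration of the spectra with the color-matching functions and the LUT conversion to Munsell hue are not reproduced.

```python
import math

# Reference white for illuminant D65, CIE 1931 2-degree observer (assumed here).
D65_WHITE = (95.047, 100.0, 108.883)

def _f(t):
    """Piecewise cube-root function of the CIELAB definition."""
    delta = 6.0 / 29.0
    return t ** (1.0 / 3.0) if t > delta ** 3 else t / (3.0 * delta ** 2) + 4.0 / 29.0

def xyz_to_lab(x, y, z, white=D65_WHITE):
    """Convert CIE XYZ tristimulus values to CIE L*a*b*."""
    fx, fy, fz = _f(x / white[0]), _f(y / white[1]), _f(z / white[2])
    return 116.0 * fy - 16.0, 500.0 * (fx - fy), 200.0 * (fy - fz)

def hue_angle_and_chroma(a, b):
    """Hue angle in degrees (0 to 360) and chroma C*ab from a* and b*."""
    return math.degrees(math.atan2(b, a)) % 360.0, math.hypot(a, b)

def approximate_munsell_value(lightness):
    """Rule-of-thumb estimate of Munsell value from L* (an assumption)."""
    return lightness / 10.0

# Made-up tristimulus values for a reddish subsoil sample.
L, a, b = xyz_to_lab(20.0, 17.0, 8.0)
hue, chroma = hue_angle_and_chroma(a, b)
value = approximate_munsell_value(L)
```

A small hue angle indicates a redder sample and a larger one a yellower sample, which is the information the LUT translates into Munsell hue notation.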


Table 2
Tree model performance for the prediction of 5 principal components, considering an independent validation set. The parameters considered were r², RMSE, correlation coefficient (Cr), average error (AE) and relative error (RE).

| PC | Comm. | r² | Cr | RMSE | AE | RE |
|---|---|---|---|---|---|---|
| PC1 | 10 | 0.92 | 0.96 | 1.25 | 1.53 | 0.27 |
| PC2 | 10 | 0.29 | 0.02 | 0.82 | 1.16 | 1.43 |
| PC3 | 20 | 0.58 | 0.72 | 0.12 | 0.16 | 0.75 |
| PC4 | 20 | 0.66 | 0.53 | 0.26 | 0.41 | 1.05 |
| PC5 | 10 | 0.53 | 0.43 | 0.20 | 0.20 | 0.92 |

Fig. 2. Flowchart of the methodology employed on digital soil mapping.


2.6. Validation

To measure the fuzzy clustering ambiguity, the confusion index (CI) was calculated (Burrough and McDonnell, 1998) for each cell of the spatial grid. Burrough et al. (1997) defined the CI as the ratio between the second and the first highest membership values at an observation. If the CI value is close to 0, the observation is strongly associated with the first dominant class and consequently there is little confusion. However, if the CI is close to 1, the difference between the two membership values is small, implying more confusion. Higher confusion indexes might be associated with an intergrade between the two dominant classes.
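Following the ratio definition given above, the CI of one grid cell can be computed directly from its membership vector. The function name and the membership values below are hypothetical.

```python
def confusion_index(memberships):
    """Ratio between the second-highest and the highest membership value."""
    first, second = sorted(memberships, reverse=True)[:2]
    return second / first

# Hypothetical membership vectors for two grid cells.
clear_cell = confusion_index([0.8, 0.1, 0.1])      # close to 0: little confusion
mixed_cell = confusion_index([0.45, 0.40, 0.15])   # close to 1: likely intergrade
```

Applying the function to every cell of the membership maps yields the confusion index map.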

An independent validation set corresponding to 32 sampled locations was used to validate the technique. The soil color of each MU was predicted from the representative samples at 80–100 cm depth. On the other hand, the validation considered predictions from topsoil and subsoil. Soil color predictions were compared to colorimeter measurements. Soil orders were validated by comparing, at each of the 32 sampled locations, the identification of the digital soil map polygons with the soil classification performed by a pedologist. A confusion matrix and the kappa coefficient (Congalton et al., 1983) were calculated.
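The kappa coefficient follows from the confusion matrix as the observed agreement corrected for the agreement expected by chance. The sketch below uses a made-up 3 × 3 matrix, not the one obtained in this study, with rows taken as the pedologist's classes and columns as the map classes (an arbitrary convention chosen here).

```python
def cohen_kappa(matrix):
    """Kappa coefficient from a square confusion matrix of counts."""
    n = sum(sum(row) for row in matrix)
    size = len(matrix)
    observed = sum(matrix[i][i] for i in range(size)) / n
    expected = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix) for i in range(size)
    ) / n ** 2
    return (observed - expected) / (1 - expected)

# Made-up confusion matrix for three soil orders.
confusion = [
    [10, 2, 0],
    [3, 8, 1],
    [0, 2, 6],
]
kappa = cohen_kappa(confusion)
```

For this matrix the observed agreement is 0.75 and the chance agreement about 0.35, giving a kappa of roughly 0.62.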

2.7. Comparison between conventional and digital soil maps

A conventional soil map (CM) (1:10,000 scale) was designed and compared with the digital map. To allow a consistent comparison, the digital soil map classification was adjusted to the suborder level by following the specifications of the 2nd categorical level of SiBCS (predicted soil order and Munsell color). Later, the agreement rate between the maps was calculated. Note that the CM was not considered as a “ground truth”, but rather as an alternative soil map of the same site.

3. Results and discussion

3.1. Spatial variability of soil multi-depth spectra

The PCA was performed on the multi-depth spectra and PC1 to PC5 were selected. These PC explained, respectively, 98.5%, 0.97%, 0.21%, 0.13% and 0.05% of data variability. The PC predictions had good to satisfactory results (Table 2). The r² ranged from 0.29 to 0.92 in the independent validation. PC1 had the best r² (0.92), while the r² of PC2 was 0.29. The relative error was best for PC1 and PC3; PC4 presented an RE value of 0.93, and for PC2 and PC5 the RE values were higher than 1. Due to the unsatisfactory performance of the PC2 estimates (RE = 1.43), this component was removed from the analysis. On the other hand, PC5 presented a reasonable r² and error, besides a coherent spatial distribution. Therefore, it was kept in the analysis despite its RE value of 1.05.

The PC loadings (Fig. 3) are important indicators of the influence of each wavelength on the variability of the PC scores, highlighting which soil properties are represented by the scores. The influence of soil attributes on the spectra is well discussed in several studies (Ben-Dor et al., 1999; Clark et al., 1990; Stenberg et al., 2010); thus, we provide only a brief description.

The first PC loadings (Fig. 3a) indicate that, at both soil depths, the spectral albedo explains most of the PC variation. In general, the soil albedo is related to texture, organic carbon (OC) and mineralogy (magnetite and ilmenite). However, since the site has been cultivated with sugarcane for the last 20 years, low OC contents are observed. Consequently, only texture and mineralogy could be responsible for differences in the albedo. Variability in these attributes is conditioned by the parent material, which, in turn, is closely related to elevation. Soils located at higher elevations were formed from diabases and showed higher contents of opaque sand minerals and Fe oxides. At intermediate to low elevations, argillites and siltstones predominate, forming soils with medium-to-high clay contents and low contents of iron oxides (Vidal-Torrado et al., 1999). This agrees with the PC1 map (Fig. 4), which has low PC1 values at the summit and higher scores on the sloping foothill locations.


Fig. 3. Loadings from (a) PC1, (b) PC3, (c) PC4 and (d) PC5, for the sampled depths A (0–20 cm) and B (80–100 cm).

Fig. 4. Principal component maps of the multi-depth spectra and aerial photo of the field.


The most important terrain attributes for PC1 prediction (Fig. 5a) were elevation and altitude above channels, with usage of 88% and 62% in the linear models, respectively. Other attributes with relevant contributions were slope (49%), aspect and hillshade (41%).

According to the PC3 loadings (Fig. 3b), superficial and sub-superficial soil spectra both contributed to variation in this PC; however, subsoils had higher loading values. Loadings from topsoil spectra are pronounced between 500 and 1000 nm, mainly at 680 nm, indicating the influence of soils with high Fe oxide contents. Moreover, within the range 1200–2400 nm, the observed small positive values are probably related to variations in clay contents. In subsoils, loadings from 500 to 1800 nm presented a pattern similar to that of the surface loadings. Due to kaolinite and 2:1 minerals, features at 1900–2200 nm were more pronounced in the subsoil. Comparison between the PC3 map (Fig. 4) and an aerial photo (bare soil) (Fig. 4) shows that lower PC3 values occur where the red soils are, that is, at the summit, while higher PC3 values occur where the yellow soils are. Terrain attributes with higher contribution to the linear models were DEM (75%), AACN (60%), slope (40%), wetness index (33%) and channel network base level (31%) (Fig. 5b). Besides elevation, which is indicative of geology, terrain attributes such as slope and wetness index are important to understand water dynamics in these soils and have a strong influence on the weathering process, regulating soil mineralogy.

PC4 has positive loadings from 400 to 1850 nm and negative values for wavelengths greater than 1900 nm. This pattern is observed in both topsoil and subsoil samples (Fig. 3c). Features at 530 and 880 nm confirm hematite influence, while features at 1900 and 2200 nm show influences of 2:1 clay minerals and kaolinite. Unlike PC3, the loadings ranging from 1000 nm to 1800 nm are not so relevant. However, the spectral range from 1900 nm to 2400 nm contributed to PC variability. Terrain attributes that most contributed to the linear model were elevation (78%) and slope (49%) (Fig. 5c). The areas at the summits presented the highest scores, while the lowest values were observed at the foothills (Fig. 4).

Fig. 5. Contribution of individual predictors (terrain attributes) to the model trees in both the conditions and linear models used to predict (a) principal component 1, (b) principal component 3, (c) principal component 4 and (d) principal component 5.

Fig. 6. Values of the Xie & Beni index and Partition Coefficient, indicating the performance of fuzzy c-means when clustering the data into 5 to 10 partitions.

In the PC5 loadings (Fig. 3d), the spectral features that most influenced topsoil scores were 530, 880 and 2200 nm, besides the range from 1800 to 2400 nm. These wavelengths are related to hematite and kaolinite, while the range 1800–2400 nm also corresponds to variations in soil texture. Based on loadings from sub-superficial samples, three features influenced PC5, namely 530 nm, 700 nm (related to goethite) and 1900 nm (hygroscopic water in 2:1 clay minerals). Terrain attributes that most contributed to the PC5 linear models were elevation (66%), altitude above channel network (59%), slope (53%) and terrain ruggedness (48%) (Fig. 5d). The PC5 spatial distribution (Fig. 4) is mainly related to channel network and terrain ruggedness, where the highest PC values are found at the stream bottom while low values occur at flat and high locations.

3.2. Mapping units and representative samples

The MU were established based on the PC maps. The optimal number of units was based on the XB and PTC indexes (Fig. 6). According to the XB index, clustering the data into 5 to 7 units tends to improve compactness and separation; however, dividing the area into 8 to 10 MU increases the index and consequently worsens the results. For the PTC index, dividing the dataset into 5 to 7 units also provides better results, i.e. it creates units with lower fuzziness and reduces heterogeneity. From 8 to 10 MU, the PTC index decreases and presents a small variation, confirming that the best condition is reached at 7 units.

The membership value maps (Fig. 7a) represent the spatial distribution and similarities between the MU. In general, the MU have a well-defined position in the landscape and low confusion. For example, MU 1 and 2 had high membership values at the summit and shoulder, respectively, and low values at the footslopes. MU 3 and 5 presented an intermediate degree of similarity when compared to all other MU. Probably these MU are an intergrade between MU 1 and 7. MU 4, 6 and 7 occur in soils formed from argillites and siltstones and consequently have low similarity with MU 1 and 2, derived from diabase. Furthermore, MU 6 and 7 are located at a steeper foothill area, which explains not only the similarity between these two MU, but also the difference from summit soils.

Fig. 7. Parameters of the fuzzy c-means analysis. (a) Membership value maps for the 7 mapping units; and (b) crisp soil units map, the confusion index map and representative samples selected for each MU.

Based on fuzzy membership values, soil samples with the highest similarity to fuzzy centroids were selected as representative. The spatial distribution of these samples comprises soils derived from different geologic materials and diverse landscape features (Fig. 7b). In general, representative samples corresponded to areas with low confusion index, that is, samples from MU 1, 2, 4, 5 and 6 with CI values between 0.13 and 0.28. Samples from MU 3 and 7 had CI values of approximately 0.6, indicating that the 1st and 2nd highest membership values have a ratio lower than 0.4; therefore, in these sampling locations the similarity between two of the MU is high.
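The selection of representative samples and the confusion index can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes that the representative sample of a unit is the sample with the highest membership in that unit, and it uses one common form of the confusion index, CI = 1 − (highest membership − second highest membership). The section does not restate the formula actually used, so this form is an assumption. The membership values are hypothetical.

```python
def representative_samples(memberships):
    """For each unit, return the index of the sample with the highest membership."""
    n_units = len(memberships[0])
    return [
        max(range(len(memberships)), key=lambda k: memberships[k][i])
        for i in range(n_units)
    ]


def confusion_index(row):
    """Assumed form: CI = 1 - (highest membership - second highest membership)."""
    first, second = sorted(row, reverse=True)[:2]
    return 1.0 - (first - second)


# Hypothetical memberships of four samples in three units (each row sums to 1).
U = [
    [0.85, 0.10, 0.05],
    [0.10, 0.80, 0.10],
    [0.50, 0.10, 0.40],
    [0.05, 0.15, 0.80],
]

reps = representative_samples(U)
ci = [confusion_index(row) for row in U]
print(reps, [round(v, 2) for v in ci])
```

Under this assumed form, a sample whose two highest memberships are nearly equal (the third row) has a CI close to 1, whereas a sample dominated by a single unit has a CI close to 0.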

3.3. MU identification and validation

MU identification consisted of defining soil Munsell color and soil order. The soil color predictions were similar to the colorimeter measurements (Fig. 8), with R2 of 0.95 and 0.96 for hue number and value, respectively, while chroma had a determination coefficient of 0.73. The RMSE for hue, value and chroma were 0.57, 0.19 and 0.29, respectively, indicating a good performance of the color prediction model, in agreement with Post et al. (1993).

Fig. 8. Comparison of the estimated hue number, value and chroma with the colorimeter measurements. Hue number correspondence: 10 = 10R; 12.5 = 2.5YR; 15 = 5YR; 17.5 = 7.5YR; 20 = 10YR; 22.5 = 2.5Y.

The soil order prediction had a good agreement with the conventional classification, with a kappa coefficient of 0.41 (Table 3). The method identified the three most prominent soil orders of the study site, Latossolos, Nitossolos and Cambissolos, which altogether occupy more than 80% of the site. These orders were accurately identified in 70.6, 83.3 and 100% of the observations, respectively. In the validation set, there were also observations identified as Gleissolo and Argissolo, but the method was not able to accurately predict them. The SRSSL does not contain Gleissolo observations and consequently it was not possible to identify this soil type. On the other hand, the database has a considerable number of Argissolo observations; nevertheless, in the study site, this soil order was assigned as Nitossolo or Cambissolo. In the study site, Argissolos occur rarely and seem to be intergrades between the Nitossolos and Cambissolos, which might explain the confusion. According to Vasques et al. (2014), the formation of both Argissolos and Nitossolos in this region is related to processes of clay accumulation in the B horizon and consequently the soils display a good degree of similarity.

Table 3
Confusion matrix and kappa coefficient of observed and predicted soil orders, calculated from the validation set.

Soil classif. a     Observed                                                   Kappa
Predicted           Cambissolo  Gleissolo  Latossolo  Nitossolo  Argissolo
Cambissolo          2           2          0          0          2             0.41
Gleissolo           0           0          0          0          0
Latossolo           0           0          12         1          0
Nitossolo           0           0          5          5          3
Argissolo           0           0          0          0          0
Per cent correct    100.0       0.0        70.6       83.3       0.0           -

a Soil orders and corresponding World Reference Base (WRB; IUSS Working Group WRB, 2014) and Soil Taxonomy (Soil Survey Staff, 2014) classes: Cambissolo–Cambisol, Udepts; Latossolo–Ferralsol, Udox; Nitossolo–Nitisol, Udalfs/Udults; Argissolo–Lixisol, Udalfs/Udults.
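The agreement statistics of Table 3 can be recomputed from the confusion matrix itself. The sketch below assumes the standard Cohen's kappa (observed agreement corrected for chance agreement); the function names are illustrative. With the counts of Table 3 it gives a kappa of approximately 0.417, in line with the reported 0.41, and reproduces the per cent correct row.

```python
# Confusion matrix from Table 3: rows = predicted, columns = observed.
# Class order: Cambissolo, Gleissolo, Latossolo, Nitossolo, Argissolo.
table3 = [
    [2, 2, 0, 0, 2],
    [0, 0, 0, 0, 0],
    [0, 0, 12, 1, 0],
    [0, 0, 5, 5, 3],
    [0, 0, 0, 0, 0],
]


def cohen_kappa(matrix):
    """Cohen's kappa: (observed - expected agreement) / (1 - expected agreement)."""
    n = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(len(matrix))) / n
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return (observed - expected) / (1 - expected)


def percent_correct(matrix):
    """Share of each observed class (column) that was predicted correctly."""
    col_totals = [sum(col) for col in zip(*matrix)]
    return [100.0 * matrix[j][j] / col_totals[j] for j in range(len(matrix))]


kappa = cohen_kappa(table3)
correct = [round(p, 1) for p in percent_correct(table3)]
print(round(kappa, 3), correct)
```

The two empty rows (Gleissolo and Argissolo were never predicted) lower the observed agreement without contributing to the chance agreement, which is why the kappa stays moderate despite the high per-class accuracy of the three dominant orders.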

3.4. Comparison between traditional and digital soil maps

Conventional and digital soil maps presented a good agreement, with a similar spatial distribution of MU and soil orders (Fig. 9). In both maps, soils at the summit were classified as Latossolo Vermelho (LV), but the MU in the CM covered a larger area and corresponded to two different MU of the digital map. In the non-corresponding area, the digital map classified the soils as Nitossolo Vermelho (NV) (Table 4). According to Cooper and Vidal-Torrado (2005), these soil classes might have high similarity and are distinguished basically by the structure of the diagnostic horizon. According to Bellinaso et al. (2010), distinction between LV and NV based on the spectrum is possible; however, it requires evaluation of the feature at 2265 nm. LV usually contains a higher gibbsite content due to severe weathering processes, resulting in differences in the soil spectra.
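A cross tabulation of the kind reported in Table 4 can be sketched as below. The example assumes that both maps are available as equal-length sequences of class labels, one per pixel of equal area, and it expresses each conventional class as percentages distributed over the digital classes. The function name `cross_tabulate` and the pixel labels are hypothetical and are not the study data.

```python
from collections import Counter


def cross_tabulate(conventional, digital):
    """Percentage of each conventional class falling in each digital class.

    Returns {conventional_class: {digital_class: percent}}; the percentages
    of every conventional class sum to 100.
    """
    pairs = Counter(zip(conventional, digital))
    totals = Counter(conventional)
    table = {}
    for (conv, dig), count in pairs.items():
        table.setdefault(conv, {})[dig] = 100.0 * count / totals[conv]
    return table


# Hypothetical pixel labels for two small maps.
conventional = ["LV", "LV", "LV", "LV", "LV", "CX", "CX", "CX", "CX"]
digital = ["LV", "LV", "LV", "NV", "NV", "CX", "CX", "CX", "R"]

table = cross_tabulate(conventional, digital)
for conv_class, shares in table.items():
    print(conv_class, shares)
```

Normalizing by the conventional class totals corresponds to the layout of Table 4, in which every conventional map column adds up to 100%.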

Another misclassification occurred between Gleissolos (GX) and Cambissolos (CX). During the field survey, an MU corresponding to soils with lithic contact (at approximately 2 m deep) was delineated and located in slightly depressed areas. The MU presented a waterlogging condition and consequently it is a hydromorphic soil. Although this soil had some similarities with the descriptions of the Cambissolos order, it was classified as Gleissolo Háplico (GX) due to criteria established by the SiBCS. The SRSSL contained no GX and consequently the method was not able to identify the MU as such. The method assigned the most similar class, that is, CX, which is coherent with soil survey observations.

Fig. 9. (a) Soil units map, (b) digital soil map and (c) conventional soil map.

In the DSM, there is an MU defined as R (Neossolo), which was identified as CX in the conventional map (Fig. 9; Table 4). The SRSSL from the municipality of Piracicaba contains mainly observations of Neossolo Litólico, which is a shallow soil with a low degree of weathering and without a B horizon. The CX from the study site are also soils with a low degree of weathering; however, they differ from R due to the presence of a thin B horizon. Considering that the DSM used information from only two layers (0–20 cm and 80–100 cm), the sampling process may have missed the B horizon in these areas and affected the classification results.

This work corroborates the first observations of Demattê et al. (2001), in which the use of aerial photographs (which evaluate relief) revealed a greater number of polygons than a traditional survey. Indeed, Demattê et al. (2004) presented a pioneering practical method for using spectroscopy in soil mapping. These authors generated a soil map at a detailed scale using spectral sensing at various depths, associated with relief. Their results are in accordance with ours, which were obtained with upgraded methodology and associated geotechnologies.

Table 4
Cross tabulation between conventional and digital soil maps indicating the coincident and non-coincident areas, in percentage.

Soil classif. a     Conventional soil map
Digital soil map    PA      GX      NV      CX      LV      NX      PVA     FF
PA                  0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0
GX                  0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0
NV                  0.0     3.2     63.7    3.7     39.4    37.9    9.7     93.9
CX                  100.0   96.8    6.1     64.9    0.0     14.8    62.2    0.0
LV                  0.0     0.0     4.9     0.1     59.8    0.0     0.0     0.0
NX                  0.0     0.0     25.3    4.0     0.8     45.3    28.1    6.1
PVA                 0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0
FF                  0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0
R                   0.0     0.0     0.0     27.3    0.0     2.0     0.0     0.0
Total               100.0   100.0   100.0   100.0   100.0   100.0   100.0   100.0

a Soil suborders and corresponding World Reference Base (WRB; IUSS Working Group WRB, 2014) and Soil Taxonomy (Soil Survey Staff, 2014) classes: CX, Cambissolo–Cambisol, Udepts; LV, Latossolo Vermelho–Ferralsol, Udox; NV, Nitossolo Vermelho; NX, Nitossolo Háplico–Nitisol, Udalfs/Udults; PA, Argissolo Amarelo; PV, PVA, Argissolo Vermelho Amarelo–Lixisol, Udalfs/Udults; R, Neossolo–Leptosol, Lithic Udorthents/Lithic Udipsamments; Udorthents/Udipsamments.

4. Conclusions

The use of regional spectral libraries combined with local soil spectra allows pedologists to perform rapid predictions of soil attributes. Moreover, information from multiple depths enables the identification of soils according to a classification system. In this study, a kappa coefficient of 0.41 indicated that the soil order classification had good agreement with the observed classes. In addition, Munsell color predictions for hue had excellent performance, with R2 of 0.95 when compared to a colorimeter. On the other hand, the SRSSL from the Piracicaba region still needs to be improved to include as many soil orders as possible and become more representative.

The conventional soil mapping process is a complex activity in which the pedologist relies on tacit knowledge to delineate the MU and select sampling locations. Spatially explicit information, such as multi-depth spectra, could be used to support the decision making of soil scientists, especially at detailed scales. Validation of the PC1 score map, for example, shows that such information is reliable.

Considering that spectrometers are portable devices, further work should focus on applying this strategy to spectra measured in the field.

Acknowledgments

We thank the Department of Soil Science at the University of São Paulo, the Coordination for the Improvement of Higher Education Personnel (CAPES) at the Brazilian Ministry of Education and the State of São Paulo Research Foundation for financial support (proc. numbers 2009-54144-8 and 2014-22262-0). We also thank the Geotechnologies in Soil Science Group (GeoSS; http://esalqgeocis.wix.com/english) at the Soil Science Department at the University of São Paulo.

References

Adhikari, K., Minasny, B., Greve, M.B., Greve, M.H., 2014. Constructing a soil class map of Denmark based on the FAO legend using digital techniques. Geoderma 214–215, 101–113.

Bellinaso, H., Demattê, J.A.M., Romeiro, S.A., 2010. Soil spectral library and its use in soil classification. Braz. J. Soil Sci. 34, 861–870.

Ben-Dor, E., Irons, J.R., Epema, G.F., 1999. Soil reflectance. In: Rencz, A.N. (Ed.), Remote Sensing for the Earth Science: Manual of Remote Sensing. John Wiley, New York, pp. 111–188.

Ben-Dor, E., Heller, D., Chudnovsky, A., 2008. A novel method of classifying soil profiles in the field using optical means. Soil Sci. Soc. Am. J. 72, 1–13.

Bezdek, J.C., 1981. Pattern Recognition with Fuzzy Objective Function Algorithms. Plenum Press, New York.

Burrough, P.A., McDonnell, R.A., 1998. Principles of Geographical Information Systems. Oxford University Press, New York.

Burrough, P.A., Van Gaans, P.F.M., Hootsmans, R., 1997. Continuous classification in soil survey: spatial correlation, confusion and boundaries. Geoderma 77, 115–135.

Campos, R.C., Demattê, J.A.M., Quartaroli, C.F., 2003. Determinação indireta do teor de hematita na fração argila de solos a partir de dados de colorimetria e radiometria. Pesq. Agrop. Brasileira 38, 521–528.

Clark, R.N., Gallagher, A.J., Swayze, G.A., 1990. Material absorption band depth mapping of imaging spectrometer data using the complete band shape least-squares algorithm simultaneously fit to multiple spectral features from multiple materials. Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) Workshop. JPL Publication, pp. 176–186.

Commission Internationale de L'Eclairage, 1931. Proceedings of the eighth session. Bureau Central de la CIE, Cambridge.

Congalton, R.G., Oderwald, R.G., Mead, R.A., 1983. Assessing Landsat classification accuracy using discrete multivariate analysis statistical techniques. Photogramm. Eng. Rem. S. 49, 1671–1678.

Cooper, M., Vidal-Torrado, P., 2005. Caracterização morfológica, micromorfológica e físico-hídrica de solos com horizonte B nítico [Morphological, micromorphological and hydro-physical characterization of soils with a nitic B horizon]. Braz. J. Soil Sci. 29, 581–595.

Demattê, J.A.M., Demattê, J.L.I., Camargo, W.P., Fiorio, P.R., Nanni, M.R., 2001. Remote sensing in the recognition and mapping of tropical soils developed on topographic sequences. Mapp. Sci. Remote. Sens. 38, 79–102.

Demattê, J.A.M., Campos, R.C., Alves, M.C., Fiorio, P.R., Nanni, M.R., 2004. Visible-NIR reflectance: a new approach on soil evaluation. Geoderma 121, 95–112.

Demattê, J.A.M., Terra, F.S., 2014. Spectral pedology: a new perspective on evaluation of soils along pedogenetic alterations. Geoderma 217–218, 190–200.

Henderson, B.L., Bui, E.N., Moran, C.J., Simon, D.A.P., 2005. Australia-wide predictions of soil properties using decision trees. Geoderma 124, 383–398.

Hoffman, R., Krotkov, E., 1990. Terrain roughness measurement from elevation maps. Advances in Intelligent Robotics Systems Conference. International Society for Optics and Photonics, pp. 104–114.

Horn, B.K.P., 1981. Hill shading and the reflectance map. Proceedings IEEE Cambridge 69, pp. 14–47.

Hutchinson, M.F., Lock, M.D., 1993. On thin plate splines and kriging. In: Tarter, M.E. (Ed.), Computing Science and Statistics 25. Interface Foundation of North America, Berkeley, pp. 55–62.

Kuhn, M., Weston, S., Keefer, C., Coulter, N., 2014. Cubist: Rule- and Instance-Based Regression Modeling. (R package version 0.0.18e. http://CRAN.R-project.org/package=Cubist).

Lagacherie, P., Bailly, J.S., Monestiez, P., Gomez, C., 2012. Using scattered hyperspectral imagery data to map the soil properties of a region. Eur. J. Soil Sci. 63, 110–119.

Martens, H., Naes, T., 1989. Multivariate Calibration. John Wiley, New York.

Mendonça-Santos, M.L., Dos Santos, H.G., 2006. The state of the art of Brazilian soil mapping and prospects for digital soil mapping. In: Lagacherie, P., McBratney, A.B., Voltz, M. (Eds.), Developments in Soil Science. Elsevier, Amsterdam, pp. 39–54.

Moore, I.D., Gessler, P.E., Nielsen, G.A., Peterson, G., 1993. Soil attribute prediction using terrain analysis. Soil Sci. Soc. Am. J. 57, 443–452.

Post, D.F., Bryant, R.B., Batchily, A.K., Huete, A.R., Levine, S.J., Mays, M.D., Escadafal, R., 1993. Correlations between field and laboratory measurements of soil color. Soil Color 35–49.

Quinlan, J.R., 1992. Learning with continuous classes. In: Adams, S. (Ed.), Proceedings of the 5th Australian Joint Conference on Artificial Intelligence. World Scientific, Singapore, pp. 343–348.

EMBRAPA (Empresa Brasileira de Pesquisa Agropecuária), 2013. Sistema Brasileiro de Classificação de Solos. third ed. EMBRAPA, Brasília.

Soil Survey Staff, 2014. Keys to Soil Taxonomy. 12th ed. USDA-Natural Resources Conservation Service, Washington, p. 362.

Soriano-Disla, J.M., Janik, L.J., Viscarra Rossel, R.A., MacDonald, L.M., McLaughlin, M.J., 2014. The performance of visible, near-, and mid-infrared reflectance spectroscopy for prediction of soil physical, chemical, and biological properties. Appl. Spectrosc. 49, 139–186.

Stenberg, B., Viscarra Rossel, R.A., Mouazen, A.M., Wetterlind, J., 2010. Visible and near infrared spectroscopy in soil science. Adv. Agron. 107, 163–215.

Torrent, J., Barrón, V., 1993. Laboratory measurement of soil color: theory and practice. Soil color. Soil Sci. Soc. Agron. 31, 21–33.

Vasques, G.M., Demattê, J.A.M., Viscarra Rossel, R.A., Ramírez-López, L., Terra, F.S., 2014. Soil classification using visible/near-infrared diffuse reflectance spectra from multiple depths. Geoderma 223–225, 73–78.

Vasques, G.M., Demattê, J.A.M., Viscarra Rossel, R.A., Ramírez-López, L., Terra, F.S., Rizzo, R., Souza Filho, B., 2015. Integrating geospatial and multi-depth laboratory spectral data for mapping soil classes in a geologically complex area in southeastern Brazil. Eur. J. Soil Sci. 66, 767–779.

Vidal-Torrado, P., Lepsch, I.F., Castro, S.S., Cooper, M., 1999. Pedogênese em uma seqüência latossolo-podzólico na borda de um platô na depressão periférica paulista (Pedogenesis of an oxisol-ultisol-alfisol sequence on the border of a plateau at the paulista peripherical depression in Brazil). Braz. J. Soil Sci. 23, 909–921.

Viscarra Rossel, R.A., 2008. ParLeS: software for chemometric analysis of spectroscopic data. Chemom. Intell. Lab. Syst. 90, 72–83.

Viscarra Rossel, R.A., Chen, C., 2011. Digitally mapping the information content of visible–near infrared spectra of surficial Australian soils. Remote Sens. Environ. 115, 1443–1455.

Viscarra Rossel, R.A., Webster, R., 2011. Discrimination of Australian soil horizons and classes from their visible–near infrared spectra. Eur. J. Soil Sci. 62, 637–647.

Viscarra Rossel, R.A., Cattle, S.R., Ortega, A., Fouad, Y., 2009. In situ measurements of soil colour, mineral composition and clay content by vis–NIR spectroscopy. Geoderma 3–4, 253–266.

Viscarra Rossel, R.A., Minasny, B., Roudier, P., McBratney, A.B., 2006. Colour space models for soil science. Geoderma 133, 320–337.

Viscarra Rossel, R.A., Behrens, T., Ben-Dor, E., Brown, D.J., Demattê, J.A.M., Shepherd, K.D., Shi, Z., Stenberg, B., Stevens, A., Adamchuk, V., Aïchi, H., Barthès, B.G., Bartholomeus, H.M., Bayer, A.D., Bernoux, M., Böttcher, K., Brodský, L., Du, C.W., Chappell, A., Fouad, Y., Genot, V., Gomez, C., Grunwald, S., Gubler, A., Guerrero, C., Hedley, C.B., Knadel, M., Morrás, H.J.M., Nocita, M., Ramirez-Lopez, L., Roudier, P., Rufasto Campos, E.M., Sanborn, P., Sellitto, V.M., Sudduth, K.A., Rawlins, B.G., Walter, C., Winowiecki, L.A., Hong, S.Y., Ji, W., 2016. A global spectral library to characterize the world's soil. Earth-Sci Rev. 155, 198–230.

IUSS Working Group WRB, 2014. World Reference Base for Soil Resources 2014: International Soil Classification System for Naming Soils and Creating Legends for Soil Maps. World Soil Resources Report No 106. Rome, Food and Agriculture Organization.

Xie, X.L., Beni, G., 1991. A validity measure for fuzzy clustering. IEEE Trans. Pattern Anal. Mach. Intell. 13, 841–847.

Zevenbergen, L.W., Thorne, C.R., 1987. Quantitative analysis of land surface topography. Earth Surf. Processes 12, 47–56.