

Journal of Hydrology 513 (2014) 283–300

Contents lists available at ScienceDirect

Journal of Hydrology

journal homepage: www.elsevier.com/locate/jhydrol

Application of GIS based data driven evidential belief function model to predict groundwater potential zonation

http://dx.doi.org/10.1016/j.jhydrol.2014.02.053
0022-1694/© 2014 Elsevier B.V. All rights reserved.

⇑ Corresponding author. Tel.: +60 3 89466383; fax: +60 3 89468470.
E-mail addresses: [email protected], [email protected] (B. Pradhan).

Haleh Nampak a, Biswajeet Pradhan a,⇑, Mohammad Abd Manap b

a Faculty of Engineering, Department of Civil Engineering, Geospatial Information Science Research Centre (GISRC), University Putra Malaysia, Serdang, Selangor Darul Ehsan 43400, Malaysia
b Minerals and Geoscience Department (JMG), 19-22th Floor, Bangunan Tabung Haji, Jalan Tun Razak, Kuala Lumpur 50658, Malaysia

Article info

Article history:
Received 29 September 2013
Received in revised form 17 February 2014
Accepted 19 February 2014
Available online 20 March 2014
This manuscript was handled by Geoff Syme, Editor-in-Chief, with the assistance of Craig T. Simmons, Associate Editor

Keywords:
Groundwater potential
Evidential belief function (EBF)
Logistic regression (LR)
GIS
Remote sensing
Malaysia

Summary

The objective of this paper is to explore the application of an evidential belief function (EBF) model to the spatial prediction of groundwater productivity in the Langat basin area, Malaysia, using geographic information system (GIS) techniques. About 125 groundwater yield records were collected from well locations. The groundwater yield was divided into high (≥11 m³/h) and low (<11 m³/h) classes, based on the groundwater classification standard recommended by the Department of Mineral and Geosciences (JMG), Malaysia. Of all the borehole data, only 60 wells had a high yield of ≥11 m³/h. These wells were randomly divided into a training dataset (70%, 42 wells) for training the model and a validation dataset (the remaining 30%, 18 wells). To perform cross-validation, the frequency ratio (FR) approach was applied to the remaining low-yield groundwater wells to show their spatial correlation with the low potential zones of groundwater productivity. A total of twelve groundwater conditioning factors that affect groundwater storage were derived from various data sources such as satellite-based imagery, topographic maps and associated databases. These twelve factors are elevation, slope, curvature, stream power index (SPI), topographic wetness index (TWI), drainage density, lithology, lineament density, land use, normalized difference vegetation index (NDVI), soil and rainfall. The Dempster–Shafer theory of evidence model was then applied to prepare the groundwater potential map. Finally, the groundwater potential map derived from the belief map was validated using the testing data. To compare the performance of the EBF result, a logistic regression (LR) model was also applied. Success-rate and prediction-rate curves were computed to estimate the efficiency of the EBF model compared to the LR method. The validation results showed that the success rates for the EBF and LR methods were 83% and 82%, respectively. The areas under the curve for the prediction rate of the EBF and LR methods were 78% and 72%, respectively. The results of this research demonstrate the efficiency of EBF in groundwater potential mapping.

© 2014 Elsevier B.V. All rights reserved.

1. Introduction

Groundwater is one of the most important natural resources worldwide, serving as a major source of water for communities, industries and agriculture (Ayazi et al., 2010; Manap et al., 2012, 2013; Neshat et al., 2013; Pradhan, 2009). Groundwater is defined as water in the saturated zone (Fitts, 2002), which fills the pore spaces among mineral grains or the cracks and fractures in a rock mass. Groundwater is usually formed by rain or snowmelt that seeps down through the soil into the underlying rocks (Banks et al., 2002; Saraf and Choudhury, 1998).

The traditional approaches to groundwater exploration through drilling and geological, hydrogeological and geophysical methods are costly and time consuming (Sander et al., 1996; Singh and Prakash, 2002). A common method of preparing groundwater potential maps is based mainly on ground surveys (Ganapuram et al., 2009). Recently, with the widespread use of geographic information systems (GISs) and remote sensing (RS) based technologies, groundwater potential mapping has become an easier procedure (Singh and Prakash, 2002). GIS is a powerful tool for handling large amounts of spatial data and can be used in decision making in a number of fields such as geology and environmental management. Information about surface features related to groundwater, such as landforms, land use and lineaments, can be extracted from RS data. These data can easily be entered into a GIS and integrated with other associated tabular data, followed by spatial analysis and visual interpretation (Jha et al., 2007).

In Malaysia, groundwater has been a pressing issue, especially during prolonged drought periods. Selangor state faced a long period of drought in 1998 due to El Nino effects. In other states of Malaysia such as Kelantan, Perlis, Terengganu, Pahang, Sarawak and Sabah, groundwater has been utilized as a main source of water supply (Suratman, 2004). Moreover, it is being exploited by the private sector for commercial production of mineral water. The failure to recognize the vast potential zones is the main reason for the underutilization of groundwater resources in Malaysia.

More recently, many studies have used index-based models for groundwater potential mapping (Dar et al., 2010; Madrucci et al., 2008; Nag et al., 2012; Prasad et al., 2008). In some studies, probabilistic models such as multi-criteria decision analysis (Chenini et al., 2010; Gupta and Srivastava, 2010; Murthy and Mamo, 2009), weights-of-evidence (Corsini et al., 2009; Lee et al., 2012), frequency ratio (FR) (Oh et al., 2011), and analytical hierarchy process (AHP) (Chowdhury et al., 2009; Pradhan, 2009) have been used for groundwater potential mapping. In recent years, soft computing techniques such as fuzzy logic (Shahid et al., 2002; Ghayoumian et al., 2007), numerical modelling and decision tree (DT) (Chenini and Mammou, 2010) approaches have also been applied. Magesh et al. (2012) carried out a weighted overlay analysis using multi-influencing factors and assigned weights to various groundwater conditioning factors.

In this paper, an EBF model was applied for groundwater potential mapping (Shafer, 1976; Dempster, 2008). The EBF approach has been widely used in mineral potential mapping (Moon, 1990). Carranza and Hale (2003) proposed a data-driven approach based on Dempster’s rule of combination using GIS for mineral potential mapping (Carranza and Castro, 2006). Similarly, the multivariate logistic regression (LR) model has been applied in groundwater potential mapping (Ozdemir, 2011). The LR model is useful for describing the significance and correlation of groundwater occurrence with each conditioning factor.

The main aim of the present study is to evaluate the efficiency of the EBF model for groundwater potential mapping. To assess the robustness of the proposed EBF model, a well-known LR model was applied to identify the significant groundwater conditioning factors, and the EBF model was subsequently re-run to check its efficiency. Through this analysis, the relationships between wells and each conditioning factor are quantitatively defined. The main difference between this research and the approaches described in the aforementioned publications is that an EBF model is applied and the result is validated for groundwater potential mapping in the Langat basin, Malaysia. The application of EBF to groundwater potential mapping constitutes the originality of this study.

2. Study area

The Langat River catchment is located in the southeast part of Selangor state. It is considered the most urbanized river basin in Malaysia, providing two thirds of the water in Selangor state. However, with the large-scale physical and economic development in the area, water scarcity and water quality deterioration have emerged in recent years. Bringemeier (2006) reported that the number of water-intensive enterprises (e.g. steel works, pulp and paper industry, power plants) plays a vital role in water shortage and groundwater quality reduction in this region. This trend will increase the water crisis problem within the next 20 years.

The study area lies between longitudes 101°19′20″E and 102°1′10″E and latitudes 2°40′15″N and 3°16′15″N, covering an area of about 2100 km² (Fig. 1). The basin includes several districts such as Kuala Langat, Sepang, Hulu Langat and part of Seremban district.

Topographically, the area is divided into three distinct regions (Manap et al., 2012). The Langat River has several tributaries, the principal ones being the Semenyih River, the Lui River, and the Beranang River. The main river and its tributaries flow westward to the Malacca strait and create a flat alluvial zone where the soils are mostly peat with clay and silt. A high and constant average annual temperature, coupled with high precipitation and high humidity, affects the hydrology and geomorphology of the study area. In addition, the weather conditions are influenced by the southwest monsoon that blows across the Strait of Malacca (Juahir et al., 2010), and the study area experiences a wet season from April to November and a relatively drier period from January to March. According to the Malaysian Meteorological Services Department, the mean annual temperature of the Hulu Langat area is 32 °C and the annual rainfall is about 2316.5–4223.4 mm. The bedrock geology of the study area mainly consists of granite, sedimentary, alluvium and volcanic rocks (Fig. 2).

3. Data used

3.1. Groundwater occurrence characteristics

The groundwater data, such as topography, number of wells, yield and depth, were obtained from the Department of Mineral and Geosciences (JMG), Malaysia. Groundwater yield is based on actual pumping tests of the groundwater wells and is expressed in m³/h, whereas groundwater potential is based on a prediction of the best potential for groundwater extraction in the study area. There are 125 individual groundwater borehole wells in the study area. In 2007, JMG set up its own standard of groundwater potential classes, which are based on the yield value of well production. The high productivity class corresponds to a yield of ≥11 m³/h. The groundwater productivity data from the 60 wells in this class were randomly divided into a training dataset (70%, 42 groundwater wells) and a validation dataset (30%, 18 groundwater wells). Additionally, the same number of points (42) were selected as non-well occurrence pixels (i.e. the absence of a well over the area) and assigned the value 0 for the LR model. Including non-well occurrence pixels in the logistic regression analysis is expected to improve the accuracy of the outcomes. The remaining 18 well occurrences were used for validation. The remaining groundwater wells, with a productivity of less than 11 m³/h, were used for FR modelling as a cross-validation and to compare the modelling results.

Fig. 3 shows the groundwater well locations of the study area.
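The random 70/30 partition of the 60 high-yield wells, and the selection of an equal number of non-well cells for the LR model, can be sketched as follows. This is only a minimal illustration: the well IDs, the small grid of candidate cells, the seed and the function names `split_wells` and `sample_non_well_cells` are hypothetical and are not taken from the study.

```python
import random


def split_wells(well_ids, train_fraction=0.7, seed=0):
    """Shuffle well IDs and split them into training and validation lists."""
    rng = random.Random(seed)          # fixed seed so the split is repeatable
    shuffled = list(well_ids)
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


def sample_non_well_cells(all_cells, well_cells, n, seed=0):
    """Pick n grid cells that contain no well (dependent value 0 for LR)."""
    rng = random.Random(seed)
    candidates = sorted(set(all_cells) - set(well_cells))
    return rng.sample(candidates, n)


# Hypothetical IDs standing in for the 60 wells with yield >= 11 m3/h.
high_yield_wells = list(range(60))
train, validation = split_wells(high_yield_wells)

# Hypothetical 100 x 100 grid; wells placed on the diagonal for illustration.
all_cells = [(r, c) for r in range(100) for c in range(100)]
well_cells = [(i, i) for i in high_yield_wells]
non_wells = sample_non_well_cells(all_cells, well_cells, n=len(train))

print(len(train), len(validation), len(non_wells))  # 42 18 42
```

Fixing the seed is a design choice that makes the partition reproducible; the source does not say how the random selection was seeded.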

3.2. Groundwater conditioning factors

Generally, the occurrence and movement of groundwater in a given area are governed by various conditioning factors. These factors include topography, lithology, geological structure, fracture density, aperture and connectivity of fractures, secondary porosity, groundwater table distribution, groundwater recharge, slope, drainage pattern, landform, land use and land cover, and climatic conditions (Mukherjee, 1996; Oh et al., 2011; Ozdemir, 2011). To create a groundwater potential map, the spatial database was considered to be a set of related spatial conditioning factors that influence groundwater occurrence. In this study, a total of twelve conditioning factors are used: elevation, slope, curvature, stream power index (SPI), topographic wetness index (TWI), rainfall, normalized difference vegetation index (NDVI), drainage density, lineament density, geology, land use and soil (Table 1).

Fig. 1. Location map of the study area.

3.2.1. Topographic factors

A digital elevation model (DEM) with 20 m resolution, derived from the 1:25,000 scale topographic map, was used to derive various topographic factors such as elevation (Fig. 4a), slope angle (Fig. 4b) and slope curvature (Fig. 4c) using ArcGIS.

3.2.2. Water related factors

Various factors such as drainage density, TWI and SPI were obtained from the DEM as measures of surface water, sub-surface water and groundwater. The TWI has been widely applied to explain the impact of topography on the location and size of saturated source zones of runoff generation. Eq. (1), proposed by Moore et al. (1991), was used for TWI computation:

$$\mathrm{TWI} = \ln(A_S / \tan\beta) \tag{1}$$

where $A_S$ is the specific catchment area (m²/m) and $\beta$ is the slope gradient (in degrees). The TWI map is illustrated in Fig. 4d.

Fig. 4e presents the SPI, which is a measure of the erosive power of water flow based on the assumption that discharge is proportional to the specific catchment area (Eq. (2)) (Moore et al., 1991):

$$\mathrm{SPI} = A_S \times \tan\beta \tag{2}$$
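As a small worked illustration of Eqs. (1) and (2), the sketch below evaluates both indices for a single cell. It is not the ArcGIS workflow used in the study. The function names, the sample values and the `min_slope_deg` clamp (added here so that flat cells do not divide by zero) are assumptions made for the example.

```python
import math


def twi(specific_catchment_area, slope_deg, min_slope_deg=0.1):
    """Topographic wetness index, Eq. (1): ln(A_S / tan(beta))."""
    # Assumption: clamp the slope so flat cells do not divide by zero.
    beta = math.radians(max(slope_deg, min_slope_deg))
    return math.log(specific_catchment_area / math.tan(beta))


def spi(specific_catchment_area, slope_deg):
    """Stream power index, Eq. (2): A_S * tan(beta)."""
    return specific_catchment_area * math.tan(math.radians(slope_deg))


# Hypothetical cell: A_S = 100 m2/m on a 45 degree slope (tan = 1).
print(round(twi(100.0, 45.0), 3))  # 4.605
print(round(spi(100.0, 45.0), 3))  # 100.0
```

Note that the slope is converted from degrees to radians before the tangent is taken. For a fixed catchment area, a gentler slope gives a larger TWI and a smaller SPI.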

Geology is one of the most significant indicators of hydrogeological features (Charon, 1974). The relationship between infiltration and runoff is controlled largely by permeability, which is in turn a function of the rock type and fracturing of the underlying rock or surface bedrock (Fig. 4f). High river density values indicate high surface runoff in an area (Prasad et al., 2008), and such zones are regarded as favorable for arresting excessive runoff (Krishnamurthy et al., 1996).

Fig. 2. Lithological map showing the lineaments in the study area.

3.2.3. Geological factors

The lineament map was provided by the Department of Mineral & Geosciences, Malaysia, and the lineament density map was generated using the ArcGIS Spatial Analyst extension (Fig. 4g). The bedrock in the mountainous area consists of Permian igneous rock and Pre-Devonian schist and phyllite (Hutchison and Tan, 2009). The bedrock of the hilly part includes the Kajang formation and the Kenny Hill formation, which consist of phyllite, shale and quartzite (Yin, 1976). Unconsolidated gravel, sand, silt and clay form the Simpang formation in the lowlands (Fig. 4h).

3.2.4. Land use/Land cover

The land use map (Fig. 4i) was obtained from the Department of Agriculture, Malaysia at a scale of 1:25,000. For analysis, the land use map was constructed with ten classes, which mainly include built-up area, mining, rubber, oil palm, plantation, grassland, forest and swamps.

The NDVI map was produced from RS imagery and shows the surface vegetation coverage and density. The NDVI value was computed using Eq. (3) (Pradhan et al., 2010). Fig. 4j shows the NDVI map.

$$\mathrm{NDVI} = (\mathrm{NIR} - \mathrm{VIS}) / (\mathrm{NIR} + \mathrm{VIS}) \tag{3}$$

where VIS and NIR stand for the spectral reflectance measurements acquired in the visible (red) and near-infrared regions, respectively.
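Eq. (3) can be illustrated per pixel with the short sketch below. The reflectance values are made up for the example, and returning `None` when both bands are zero is an assumption of this sketch rather than something stated in the source.

```python
def ndvi(nir, vis):
    """Eq. (3): (NIR - VIS) / (NIR + VIS) for one pixel."""
    total = nir + vis
    if total == 0:
        return None  # assumption: treat a pixel with no signal as no data
    return (nir - vis) / total


# Hypothetical reflectances: dense vegetation, then a water-like surface.
print(round(ndvi(0.50, 0.10), 3))  # 0.667
print(round(ndvi(0.10, 0.12), 3))  # -0.091
```

For non-negative reflectances the result lies between -1 and 1, with higher values indicating denser green vegetation.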

3.2.5. Soil

The soil map was obtained from the Department of Agriculture, Malaysia. The soil cover varies between highland and lowland zones (Fig. 4k) and contains coarse to fine sandy loam. The main river and its tributaries flow westward to the Malacca strait, creating a flat alluvial zone where the soils are mostly peat with clay and silt.

Fig. 3. Groundwater well locations in the study area.

Table 1
The spatial database construction.

Classification   Sub-classification   Data type                          Scale
Groundwater      Tube well            Point                              1:25,000
Base map         Topographic map      Point, line and polygon coverage   1:25,000
                 Geological map       Polygon                            1:63,360
                 Lineament map        Polyline                           1:63,360
                 Land use map         Grid                               20 × 20
                 Soil map             Grid                               20 × 20
                 NDVI                 Grid                               20 × 20
                 Rainfall map         Grid                               20 × 20

3.2.6. Rainfall

The precipitation amount determines the amount of water that percolates into the groundwater system as the major source of recharge. For that reason, an annual rainfall map of the study area was prepared using the historical rainfall data of the past 29 years (1981–2010) measured at the meteorological stations located around the study area (Fig. 4l).

All the aforementioned groundwater conditioning factors were converted to a grid of 20 × 20 m cells with 3285 rows by 4001 columns (the total number of cells within the study area is 6,117,068).

4. Methodology

Groundwater potential zonation mapping consists of four main steps: (1) data collection and spatial database construction for the groundwater related conditioning factors, (2) assessment of groundwater potential using the relationships between wells and groundwater conditioning factors, (3) validation of the results, and (4) description and visual interpretation of the results. Fig. 5 illustrates the general methodological flow chart used in this study.

4.1. Evidential belief function (EBF) model

To apply the EBF model, the thematic layers (groundwater conditioning factors) are first transformed into evidential data layers. They can then be integrated to produce a predictive groundwater potential index (GWPI) map using the quantitative knowledge of the spatial relationship between the wells and the groundwater conditioning factors. The EBF model comprises Bel (degree of belief), Dis (degree of disbelief), Unc (degree of uncertainty) and Pls (degree of plausibility), each in the range [0, 1] (Carranza and Hale, 2003; Althuwaynee et al., 2012; Pradhan et al., 2014).

The primary components of the theory are Bel and Pls, which act as lower and upper probabilities, respectively, and the basic probability assignment function (bpa or m), which maps the power set to the interval [0, 1].

Eq. (4) states the Dempster–Shafer theory of evidence as synthesized from Carranza and Hale (2003), Park (2011) and Althuwaynee et al. (2012).

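$$m : P(\Theta) \to [0, 1], \qquad m(\varnothing) = 0, \qquad \sum_{H \subseteq \Theta} m(H) = 1 \tag{4}$$

where $P(\Theta)$ is the power set of the frame of discernment $\Theta$.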

Belief is a lower probability, defined as the sum of all the masses committed by m to every subset of B, and plausibility expresses the degree to which the evidence remains plausible (Park, 2011).

Based on the bpa or mass function, the belief and plausibility functions can be expressed as in Eq. (5).

Fig. 4. Groundwater conditioning factors: (a) elevation, (b) slope, (c) curvature, (d) TWI, (e) SPI, (f) river density, (g) lineament density, (h) lithology, (i) land use, (j) NDVI, (k) soil, and (l) rainfall.

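$$\mathrm{Bel}(B) = \sum_{H \subseteq B} m(H), \qquad \mathrm{Pls}(B) = \sum_{H \cap B \neq \varnothing} m(H) \tag{5}$$

These two functions have the following properties:

$$\mathrm{Bel}(B) \le \mathrm{Pls}(B), \qquad \mathrm{Pls}(B) = 1 - \mathrm{Bel}(\bar{B})$$

As shown in Eq. (6), $\bar{B}$ is the negation (classical complement) of $B$, and $\mathrm{Bel}(\bar{B})$ is called the disbelief function.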


$$\mathrm{Pls}(B) = \sum_{H \cap B \neq \varnothing} m(H) = 1 - \sum_{H \subseteq \bar{B}} m(H) = 1 - \mathrm{Bel}(\bar{B}) \tag{6}$$

Eq. (7) gives the difference between plausibility and belief, which indicates the uncertainty (ignorance) (Park, 2011).

$$\mathrm{Pls}(B) - \mathrm{Bel}(B) = \mathrm{Unc} \quad \text{(ignorance or doubt)} \tag{7}$$

If Unc = 0, then Pls = Bel.

The schematic relationships of the three mass functions are presented in Fig. 6.
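The definitions in Eqs. (5)–(7) can be checked on a tiny frame with two hypotheses. The sketch below is illustrative only. It assumes that mass is assigned to "well", "no well" and the whole frame, and it borrows the Bel, Dis and Unc values of the first elevation class in Table 2 as example masses. The names `TP`, `NOT_TP`, `THETA`, `belief` and `plausibility` are hypothetical.

```python
TP, NOT_TP = "Tp", "not_Tp"
THETA = frozenset({TP, NOT_TP})          # frame of discernment


def belief(mass, b):
    """Bel(B): total mass of the non-empty subsets H contained in B."""
    return sum(m for h, m in mass.items() if h and h <= b)


def plausibility(mass, b):
    """Pls(B): total mass of the subsets H that intersect B."""
    return sum(m for h, m in mass.items() if h & b)


# Example masses (assumed mapping): belief, disbelief, uncertainty.
mass = {
    frozenset({TP}): 0.204,
    frozenset({NOT_TP}): 0.065,
    THETA: 0.731,
}

b = frozenset({TP})
bel, pls = belief(mass, b), plausibility(mass, b)
print(round(bel, 3), round(pls, 3), round(pls - bel, 3))  # 0.204 0.935 0.731
```

Here Pls − Bel recovers the mass left on the whole frame, which is the uncertainty of Eq. (7), and Pls equals one minus the belief in the complement.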

4.2. Logistic regression (LR) model

LR involves a multivariate regression between a dependent variable and various independent variables (Hosmer and Lemeshow, 2000). The aim of LR is to identify a suitable model that defines the relationship between the dependent variable and the conditioning factors and produces a coefficient for each variable (Umar et al., 2014). The ratio of each conditioning factor can be estimated using the coefficients derived from logistic regression. LR has an advantage over linear regression because the independent variables can be a mixture of continuous and categorical or binary variables. LR is useful for predicting the presence or absence of a feature or outcome based on the values of the predictor variables (Lee, 2005). The LR model belongs to the family of generalized linear models and can be computed as follows:

$$P = e^{Z} / (1 + e^{Z}) \tag{8}$$

where P is the probability of an occurrence and Z is a value from −∞ to +∞, defined by the following equation:

$$Z = a + B_1 X_1 + B_2 X_2 + \cdots + B_n X_n \tag{9}$$

where Z is a linear combination of the conditioning factors, a is the intercept of the model, n is the number of conditioning factors, and $B_1, B_2, \ldots, B_n$ are coefficients that measure the contribution of the conditioning factors $X_1, X_2, \ldots, X_n$ (Ayalew and Yamagishi, 2005; Akgun, 2012). A positive coefficient implies a positive correlation between the dependent variable and the conditioning factor, and a negative coefficient implies the opposite. In the LR model, the dependent variable can be expressed as:

$$Z = \log_e\left(\frac{P}{1 - P}\right) = \mathrm{logit}(P) \tag{10}$$

where P is the probability of well occurrence, the binary dependent variable that represents the presence or absence of groundwater by the values 1 and 0; Z is the natural logarithm of P/(1 − P), called logit(P); and P/(1 − P) represents the odds or likelihood ratio. Probabilities range between 0 and 1. As the value of Z increases, the probability P strictly increases. As a probability approaches 1, the numerator of the odds becomes larger relative to the denominator, and the odds become an increasingly large number. Conversely, as a probability approaches 0, the numerator of the odds becomes smaller relative to the denominator (Ayalew and Yamagishi, 2005; Kavzoglu et al., 2013).

Fig. 5. Methodological flowchart of the study.

Fig. 6. Schematic relationships of evidential belief functions (Carranza et al., 2005).
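The relationship between Eqs. (8)–(10) can be illustrated with a few lines of code. This is only a sketch of the algebra, not the fitted model of the study: the intercept, coefficients and factor values are hypothetical, and the function names are chosen for this example.

```python
import math


def linear_predictor(intercept, coefficients, factors):
    """Eq. (9): Z = a + B1*X1 + ... + Bn*Xn."""
    return intercept + sum(b * x for b, x in zip(coefficients, factors))


def probability(z):
    """Eq. (8): P = e^Z / (1 + e^Z), written to avoid overflow for large |Z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    return math.exp(z) / (1.0 + math.exp(z))


def logit(p):
    """Eq. (10): Z = ln(P / (1 - P))."""
    return math.log(p / (1.0 - p))


# Hypothetical intercept, coefficients and factor values for one pixel.
z = linear_predictor(0.5, [-0.02, 0.8], [30.0, 1.5])
p = probability(z)
print(round(z, 3), round(p, 3))
print(round(logit(p), 3))  # recovers Z, since logit inverts Eq. (8)
```

The two branches in `probability` are algebraically the same expression; the split only keeps `math.exp` from overflowing when Z is very large in magnitude.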

4.3. Frequency ratio (FR) model

The frequency ratio (FR) model, a bivariate statistical technique, can be applied as a simple geospatial assessment tool to compute the probabilistic relationship between dependent and independent variables (Oh et al., 2011). In this research, the FR model was applied as a cross-validation approach to quantify the relationship between the distribution of low-yield groundwater occurrences and the conditioning factors. The calculation and output processes are simple and can be readily understood:

$$\mathrm{FR} = (A/B)/(C/D) = E/F \tag{11}$$

where A is the area of a class for each groundwater conditioning factor; B is the total area of each factor; C is the number of groundwater occurrences with low yield in the class area of the factor; D is the total number of groundwater occurrences with low yield in the study area; E is the percentage for area with regard to a class for the factor; F is the percentage for the entire domain; and FR is the ratio of the area where groundwater occurred to the total area, so that a value of 1 is an average value. A value greater than 1 indicates a higher correlation, and a value lower than 1 indicates a lower correlation (Pradhan and Lee, 2010).
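A small numerical sketch may help with the interpretation of FR. As written, A/B is the area share of a class and C/D is its share of wells. The reading that FR > 1 means a higher correlation matches the more common convention of dividing the well share by the area share, and that orientation is what the sketch below assumes. The function name and the sample class are hypothetical; the total of 65 low-yield wells follows from 125 wells minus the 60 high-yield wells.

```python
def frequency_ratio(wells_in_class, total_wells, class_area, total_area):
    """Share of wells in a class divided by the share of area in that class."""
    well_share = wells_in_class / total_wells
    area_share = class_area / total_area
    return well_share / area_share


# Hypothetical class: 13 of 65 low-yield wells on 10% of the study area.
print(round(frequency_ratio(13, 65, 10.0, 100.0), 2))  # 2.0
```

Under this assumed orientation, a class that holds wells exactly in proportion to its area gives FR = 1, and a class with no wells gives FR = 0.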

5. Results and discussion

5.1. Application of EBF to groundwater potential mapping

Once the three mass functions for all input data layers were prepared, Dempster’s rule of combination was applied to obtain four combined mass functions: the belief, disbelief, ignorance, and plausibility functions.
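This section does not spell out the combination formulas, so the sketch below uses the standard form of Dempster's rule for a frame with two hypotheses, as it is commonly written in the EBF literature. Treat it as an assumption about the general technique rather than a record of the exact implementation used in the study. The function name `combine` is hypothetical, and the two input triples are the first elevation and slope classes of Table 2, used only as example numbers.

```python
def combine(e1, e2):
    """Combine two (Bel, Dis, Unc) triples with Dempster's rule."""
    bel1, dis1, unc1 = e1
    bel2, dis2, unc2 = e2
    # Normalising factor: one minus the conflicting mass.
    beta = 1.0 - bel1 * dis2 - dis1 * bel2
    bel = (bel1 * bel2 + bel1 * unc2 + bel2 * unc1) / beta
    dis = (dis1 * dis2 + dis1 * unc2 + dis2 * unc1) / beta
    unc = (unc1 * unc2) / beta
    return bel, dis, unc


elevation = (0.204, 0.065, 0.731)   # example values, elevation 0-20 m
slope = (0.306, 0.042, 0.652)       # example values, slope 0-1 degree
bel, dis, unc = combine(elevation, slope)
pls = 1.0 - dis
print(round(bel + dis + unc, 6))    # 1.0
```

Combining further layers would repeat the call with the previous result as the first argument. The combined uncertainty is lower than that of either input because two pieces of evidence are pooled.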

Eq. (12) shows the Dempster–Shafer theory direct mass function.

$$m : 2^{\Theta} = \{\varnothing, T_p, \bar{T}_p, \Theta\}, \qquad \Theta = \{T_p, \bar{T}_p\} \tag{12}$$

where $T_p$ and $\bar{T}_p$ express the following:
• $T_p$ = class pixels containing groundwater wells.
• $\bar{T}_p$ = class pixels not containing groundwater wells.

Here, multiple spatial data layers are considered as evidence ($E_{ij}$), where i denotes the layer and j denotes the class attribute within that layer, so that each class is evaluated individually.

The integrated EBF values of the groundwater conditioning factors were computed one after another using Eqs. (13)–(18). Table 2 lists the estimated EBF values for the 12 groundwater conditioning factors.

Eqs. (13) and (14) show how the Bel results are derived:

$$\lambda(T_p)_{E_{ij}} = \frac{N(L \cap E_{ij}) / N(L)}{[N(E_{ij}) - N(L \cap E_{ij})] / [N(A) - N(L)]} = \frac{N}{D} \tag{13}$$

$$\mathrm{Bel} = \frac{\lambda(T_p)_{E_{ij}}}{\sum \lambda(T_p)_{E_{ij}}} \tag{14}$$

where $N(L \cap E_{ij})$ is the number of groundwater occurrence pixels in class $E_{ij}$; $N(L)$, or equivalently $\sum N(L \cap E_{ij})$, is the total number of groundwater occurrence pixels; $N(E_{ij})$ is the number of pixels in class $E_{ij}$; $N(A)$, or equivalently $\sum N(E_{ij})$, is the total number of pixels in the domain; N is the proportion of groundwater occurrence; and D is the proportion of non-groundwater occurrence area.

Correspondingly, the Dis value was computed using Eqs. (15) and (16):

$$\lambda(\bar{T}_p)_{E_{ij}} = \frac{[N(L) - N(L \cap E_{ij})] / N(L)}{[N(A) - N(L) - N(E_{ij}) + N(L \cap E_{ij})] / [N(A) - N(L)]} = \frac{K}{H} \tag{15}$$

$$\mathrm{Dis} = \frac{\lambda(\bar{T}_p)_{E_{ij}}}{\sum \lambda(\bar{T}_p)_{E_{ij}}} \tag{16}$$

where K is the proportion of groundwater occurrences that do not occur in the class and H is the proportion of non-groundwater occurrence areas in the other attributes outside the class.

Eqs. (13) and (15) were applied to all the groundwater conditioning factor classes, and then Eqs. (14) and (16) were applied to produce the Bel and Dis results.

The Unc and Pls values were obtained using Eqs. (17) and (18).

$$\mathrm{Unc} = 1 - \mathrm{Dis} - \mathrm{Bel} \tag{17}$$

$$\mathrm{Pls} = 1 - \mathrm{Dis} \tag{18}$$

The values of belief and plausibility range between 0 and 1. According to Park (2011), an important constraint of the EBF is that a Bel value of zero in a certain class indicates that there is no groundwater occurrence in that class. The estimated EBF results for the three mass functions of belief, disbelief and uncertainty are listed in Table 2. The values were derived from the wells with a yield ≥ 11 m³/h.
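Eqs. (13)–(18) can be checked numerically against the elevation rows of Table 2. The sketch below is an illustration rather than the authors' code: the function name `ebf` is hypothetical, and it assumes that the numerator of H in Eq. (15) is N(A) − N(L) − N(E_ij) + N(L ∩ E_ij). It also assumes every class contains at least one non-well pixel, so no denominator is zero.

```python
def ebf(class_pixels, well_pixels):
    """Return per-class (bel, dis, unc, pls) lists following Eqs. (13)-(18)."""
    n_a = sum(class_pixels)  # N(A): all pixels in the domain
    n_l = sum(well_pixels)   # N(L): all groundwater occurrence pixels
    lam_tp, lam_not_tp = [], []
    for n_e, n_le in zip(class_pixels, well_pixels):
        # Eq. (13): N / D
        n = n_le / n_l
        d = (n_e - n_le) / (n_a - n_l)
        lam_tp.append(n / d)
        # Eq. (15): K / H
        k = (n_l - n_le) / n_l
        h = (n_a - n_l - n_e + n_le) / (n_a - n_l)
        lam_not_tp.append(k / h)
    bel = [v / sum(lam_tp) for v in lam_tp]          # Eq. (14)
    dis = [v / sum(lam_not_tp) for v in lam_not_tp]  # Eq. (16)
    unc = [1.0 - d - b for b, d in zip(bel, dis)]    # Eq. (17)
    pls = [1.0 - d for d in dis]                     # Eq. (18)
    return bel, dis, unc, pls


# Elevation classes from Table 2: class pixel counts and high-yield well pixel counts.
class_pixels = [2477881, 172438, 497618, 319257, 497704,
                441197, 430492, 428265, 429521, 422695]
well_pixels = [10400, 400, 2800, 1600, 800, 800, 0, 0, 0, 0]

bel, dis, unc, pls = ebf(class_pixels, well_pixels)
for b, d, u, p in zip(bel, dis, unc, pls):
    print(f"Bel={b:.3f} Dis={d:.3f} Unc={u:.3f} Pls={p:.3f}")
```

With these inputs the first class (0–20 m) evaluates to roughly Bel 0.204, Dis 0.065 and Unc 0.731, in line with Table 2, and the four classes without wells receive a Bel of exactly zero.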

A comparatively high Bel value implies a higher probability of groundwater potential, while a low Bel value indicates a lower probability. The degree of belief is higher at lower elevation, on mild slopes, and for flat curvature. This is consistent with higher elevation, steeper slopes, and convex or concave curvature producing a higher rainfall–runoff rate and lower infiltration, and thus possibly a lower groundwater potential. The previous study by Manap et al. (2012) confirmed that areas with a flat surface are more suitable zones for groundwater exploration.


Table 2. The estimated EBF values for conditioning parameters and logistic regression coefficients for significant parameters.

| Factor | Class | No. of class pixels | No. of yield ≥ 11 m³/h pixels | % of yield ≥ 11 m³/h pixels | Bel | Dis | Unc | Logistic regression coefficient |
|---|---|---|---|---|---|---|---|---|
| Elevation (m) ᵃ | 0–20 | 2,477,881 | 10,400 | 62 | 0.204 | 0.065 | 0.731 | −9.374 |
| | 20–39 | 172,438 | 400 | 2 | 0.113 | 0.102 | 0.786 | |
| | 39–40 | 497,618 | 2800 | 17 | 0.273 | 0.092 | 0.634 | |
| | 40–59 | 319,257 | 1600 | 10 | 0.244 | 0.097 | 0.660 | |
| | 59–60 | 497,704 | 800 | 5 | 0.078 | 0.105 | 0.817 | |
| | 60–80 | 441,197 | 800 | 5 | 0.088 | 0.104 | 0.808 | |
| | 80–141 | 430,492 | 0 | 0 | 0.000 | 0.109 | 0.891 | |
| | 141–246 | 428,265 | 0 | 0 | 0.000 | 0.109 | 0.891 | |
| | 246–400 | 429,521 | 0 | 0 | 0.000 | 0.109 | 0.891 | |
| | 400–1440 | 422,695 | 0 | 0 | 0.000 | 0.109 | 0.891 | |
| Slope angle (degree) ᵃ | 0–1 | 3,601,727 | 14,000 | 83 | 0.306 | 0.042 | 0.652 | −0.27 |
| | 1–3 | 375,683 | 1200 | 7 | 0.252 | 0.102 | 0.646 | |
| | 3–6 | 310,625 | 0 | 0 | 0.000 | 0.109 | 0.891 | |
| | 6–10 | 334,617 | 1200 | 7 | 0.282 | 0.102 | 0.616 | |
| | 10–14 | 304,669 | 0 | 0 | 0.000 | 0.109 | 0.891 | |
| | 14–18 | 311,385 | 0 | 0 | 0.000 | 0.109 | 0.891 | |
| | 18–21 | 235,649 | 0 | 0 | 0.000 | 0.108 | 0.892 | |
| | 21–25 | 279,597 | 0 | 0 | 0.000 | 0.108 | 0.892 | |
| | 25–29 | 196,952 | 400 | 2 | 0.160 | 0.104 | 0.736 | |
| | 29–82 | 166,164 | 0 | 0 | 0.000 | 0.106 | 0.894 | |
| Curvature | Concave | 1,053,679 | 1600 | 10 | 0.247 | 0.405 | 0.348 | |
| | Flat | 3,965,553 | 14,000 | 83 | 0.575 | 0.175 | 0.249 | |
| | Convex | 1,097,836 | 1200 | 7 | 0.178 | 0.420 | 0.403 | |
| Topographic wetness index (TWI) ᵃ | −8.86 to −2.36 | 601,340 | 800 | 5 | 0.047 | 0.106 | 0.847 | −0.6931 |
| | −2.36 to 1.65 | 621,009 | 800 | 5 | 0.046 | 0.106 | 0.848 | |
| | 1.65–2.95 | 630,354 | 0 | 0 | 0.000 | 0.111 | 0.889 | |
| | 2.95–5.78 | 638,769 | 1200 | 7 | 0.067 | 0.104 | 0.829 | |
| | 5.78–8.62 | 610,574 | 1200 | 7 | 0.070 | 0.103 | 0.827 | |
| | 8.62–10.04 | 583,255 | 2800 | 17 | 0.172 | 0.092 | 0.736 | |
| | 10.04–10.98 | 591,805 | 2800 | 17 | 0.169 | 0.092 | 0.738 | |
| | 10.98–11.81 | 597,268 | 2800 | 17 | 0.168 | 0.092 | 0.740 | |
| | 11.81–12.52 | 577,112 | 2800 | 17 | 0.174 | 0.092 | 0.734 | |
| | 12.52–21.38 | 665,582 | 1600 | 10 | 0.086 | 0.102 | 0.813 | |
| Stream power index (SPI) | −13.2 to −8.2 | 581,272 | 1200 | 7 | 0.077 | 0.103 | 0.821 | |
| | −8.2 to −5.1 | 632,041 | 1200 | 7 | 0.071 | 0.104 | 0.826 | |
| | −5.1 to −3.6 | 620,965 | 3200 | 19 | 0.192 | 0.090 | 0.718 | |
| | −3.6 to −2.7 | 623,492 | 4400 | 26 | 0.264 | 0.082 | 0.654 | |
| | −2.7 to −1.9 | 634,166 | 800 | 5 | 0.047 | 0.106 | 0.847 | |
| | −1.9 to −1.3 | 676,363 | 2800 | 17 | 0.154 | 0.094 | 0.752 | |
| | −1.3 to −0.8 | 686,841 | 1200 | 7 | 0.065 | 0.105 | 0.830 | |
| | −0.8 to −0.06 | 602,202 | 1200 | 7 | 0.074 | 0.103 | 0.823 | |
| | −0.06 to 0.9 | 542,601 | 400 | 2 | 0.027 | 0.107 | 0.865 | |
| | 0.9–13.25 | 517,125 | 400 | 2 | 0.029 | 0.107 | 0.865 | |
| Drainage density (km/km²) | 0–0.000114 | 2,252,376 | 7600 | 45 | 0.113 | 0.087 | 0.800 | |
| | 0.000114–0.000305 | 1,041,203 | 1600 | 10 | 0.051 | 0.109 | 0.839 | |
| | 0.000305–0.000467 | 1,029,373 | 2000 | 12 | 0.065 | 0.106 | 0.829 | |
| | 0.000467–0.00062 | 771,116 | 1600 | 10 | 0.069 | 0.104 | 0.827 | |
| | 0.00062–0.000792 | 491,426 | 2400 | 14 | 0.163 | 0.093 | 0.743 | |
| | 0.000792–0.001002 | 231,927 | 800 | 5 | 0.115 | 0.099 | 0.785 | |
| | 0.001002–0.00125 | 147,461 | 0 | 0 | 0.000 | 0.103 | 0.897 | |
| | 0.00125–0.001545 | 89,479 | 400 | 2 | 0.150 | 0.099 | 0.751 | |
| | 0.001545–0.001927 | 49,114 | 400 | 2 | 0.274 | 0.099 | 0.628 | |
| | 0.001927–0.002442 | 13,607 | 0 | 0 | 0.000 | 0.100 | 0.900 | |
| Lineament density (km/km²) | 0–0.000052 | 4,163,391 | 14,400 | 86 | 0.315 | 0.046 | 0.639 | |
| | 0.000052–0.000164 | 449,805 | 800 | 5 | 0.162 | 0.107 | 0.732 | |
| | 0.000164–0.000281 | 367,530 | 0 | 0 | 0.000 | 0.110 | 0.890 | |
| | 0.000282–0.000392 | 330,289 | 1200 | 7 | 0.331 | 0.102 | 0.567 | |
| | 0.000392–0.000504 | 325,684 | 0 | 0 | 0.000 | 0.110 | 0.890 | |
| | 0.000504–0.000628 | 188,473 | 400 | 2 | 0.193 | 0.105 | 0.703 | |
| | 0.000628–0.000785 | 138,547 | 0 | 0 | 0.000 | 0.106 | 0.894 | |
| | 0.000785–0.000985 | 83,966 | 0 | 0 | 0.000 | 0.105 | 0.895 | |
| | 0.000985–0.001236 | 51,932 | 0 | 0 | 0.000 | 0.105 | 0.895 | |
| | 0.001236–0.001675 | 17,464 | 0 | 0 | 0.000 | 0.104 | 0.896 | |
| NDVI | −0.6 to −0.17 | 612,309 | 4400 | 26 | 0.262 | 0.082 | 0.656 | |
| | −0.17 to −0.002 | 609,351 | 3200 | 19 | 0.191 | 0.090 | 0.719 | |
| | −0.002 to 0.13 | 635,412 | 1600 | 10 | 0.091 | 0.101 | 0.808 | |
| | 0.13–0.21 | 614,554 | 2000 | 12 | 0.118 | 0.098 | 0.784 | |
| | 0.21–0.26 | 599,660 | 1600 | 10 | 0.097 | 0.100 | 0.803 | |
| | 0.26–0.29 | 580,580 | 1600 | 10 | 0.100 | 0.100 | 0.800 | |
| | 0.29–0.32 | 608,795 | 800 | 5 | 0.048 | 0.106 | 0.847 | |
| | 0.32–0.34 | 683,032 | 800 | 5 | 0.042 | 0.107 | 0.850 | |
| | 0.34–0.37 | 580,953 | 800 | 5 | 0.050 | 0.105 | 0.845 | |

(continued)

H. Nampak et al. / Journal of Hydrology 513 (2014) 283–300 293


Table 2 (continued)

| Factor | Class | No. of class pixels | No. of yield ≥ 11 m³/h pixels | % of yield ≥ 11 m³/h pixels | Bel | Dis | Unc | Logistic regression coefficient |
|---|---|---|---|---|---|---|---|---|
| NDVI (continued) | 0.37–0.53 | 602,481 | 0 | 0 | 0.000 | 0.111 | 0.889 | |
| Land use/land cover ᵃ | Urban area | 1,027,250 | 8000 | 48 | 0.253 | 0.063 | 0.684 | 7.693 |
| | Other crop | 497,219 | 1600 | 10 | 0.104 | 0.098 | 0.798 | −12.99 |
| | Rubber | 709,056 | 1600 | 10 | 0.073 | 0.102 | 0.825 | −11.28 |
| | Oil palm | 1,714,702 | 4000 | 24 | 0.075 | 0.106 | 0.819 | 2.067 |
| | Grass | 165,448 | 0 | 0 | 0.000 | 0.103 | 0.897 | 0.000 |
| | Forest | 1,350,150 | 0 | 0 | 0.000 | 0.129 | 0.871 | −30.72 |
| | Swamp | 277,377 | 800 | 5 | 0.093 | 0.100 | 0.807 | 2.928 |
| | Clear land | 155,956 | 0 | 0 | 0.000 | 0.102 | 0.898 | 0.000 |
| | Water bodies | 64,987 | 800 | 5 | 0.402 | 0.096 | 0.502 | 0.000 |
| | Mines area | 92,083 | 0 | 0 | 0.000 | 0.101 | 0.899 | 0.000 |
| Soil ᵃ | Clay | 2,168,614 | 8000 | 48 | 0.075 | 0.081 | 0.843 | 2.131 |
| | Sandy loam sandy clay | 343,039 | 400 | 2 | 0.024 | 0.103 | 0.873 | −34.65 |
| | Gravel clay-silty clay–clay | 68,824 | 1200 | 7 | 0.360 | 0.094 | 0.546 | −14.5 |
| | Fine sandy clay | 362,355 | 2400 | 14 | 0.135 | 0.091 | 0.774 | 3.282 |
| | Fine sandy clay loam to sandy clay–clay | 271,021 | 800 | 5 | 0.060 | 0.100 | 0.840 | −55.32 |
| | Coarse sandy clay–clay | 843,144 | 800 | 5 | 0.019 | 0.111 | 0.870 | −13.26 |
| | Fine sandy clay loam | 301,460 | 0 | 0 | 0.000 | 0.105 | 0.895 | 0.000 |
| | Coarse sandy clay | 1,404,502 | 400 | 2 | 0.006 | 0.127 | 0.867 | 6.000 |
| | Sandy clay | 201,510 | 2400 | 14 | 0.245 | 0.089 | 0.667 | −14.8 |
| | Sand | 107,187 | 400 | 2 | 0.076 | 0.099 | 0.825 | 0.000 |
| Lithology | Schist | 132,655 | 0 | 0 | 0 | 0.101 | 0.899 | |
| | Acid to intermediate | 7386 | 0 | 0 | 0 | 0.099 | 0.901 | |
| | Schist, phyllite, slate and limestone | 17,047 | 0 | 0 | 0 | 0.099 | 0.901 | |
| | Limestone/marble | 26,656 | 0 | 0 | 0 | 0.099 | 0.901 | |
| | Phyllite, schist and slate | 1,173,771 | 7200 | 43 | 0.175 | 0.070 | 0.756 | |
| | Acid intrusives (undifferentiated) | 2,480,347 | 1200 | 7 | 0.014 | 0.154 | 0.832 | |
| | Peat, humic clay and silt | 619,293 | 3200 | 19 | 0.147 | 0.089 | 0.764 | |
| | Phyllite, slate, shale and sandstone | 447,393 | 1200 | 7 | 0.076 | 0.099 | 0.825 | |
| | Clay, silt, sand and gravel | 45,247 | 800 | 5 | 0.510 | 0.094 | 0.396 | |
| | Clay and silt (marine) | 1,162,530 | 3200 | 19 | 0.078 | 0.098 | 0.823 | |
| Rainfall (mm) ᵃ | 2070–2162 | 599,637 | 0 | 0 | 0.000 | 0.111 | 0.889 | −1.327 |
| | 2162–2239 | 610,656 | 2400 | 14 | 0.143 | 0.095 | 0.761 | |
| | 2239–2296 | 607,363 | 1600 | 10 | 0.096 | 0.100 | 0.804 | |
| | 2296–2349 | 620,282 | 2800 | 17 | 0.165 | 0.093 | 0.742 | |
| | 2349–2390 | 626,851 | 2400 | 14 | 0.140 | 0.095 | 0.765 | |
| | 2390–2428 | 626,529 | 1600 | 10 | 0.093 | 0.101 | 0.806 | |
| | 2428–2465 | 618,099 | 2000 | 12 | 0.118 | 0.098 | 0.784 | |
| | 2465–2501 | 599,190 | 1600 | 10 | 0.097 | 0.100 | 0.802 | |
| | 2501–2551 | 589,201 | 2000 | 12 | 0.124 | 0.097 | 0.779 | |
| | 2551–2686 | 619,260 | 400 | 2 | 0.024 | 0.109 | 0.868 | |

ᵃ Statistically significant parameter.


According to Beven and Kirkby (1979), TWI relates the upslope area, as a measure of water flowing towards a certain point, to the local slope, which is a measure of the subsurface lateral transmissivity. A higher TWI value represents a lower slope and a larger upslope area. Therefore, there is a positive correlation between groundwater occurrence and TWI, which indicates a higher groundwater potential with increasing TWI. In contrast, there is a negative correlation between SPI and groundwater productivity; the class between −3.6 and −2.7 had the highest groundwater potential. For drainage density, the results showed a positive relationship between denser drainage and the degree of belief, and hence a greater groundwater potential. The Bel value was highest for drainage density between 0.001545 and 0.001927 km/km². Accordingly, topography plays a significant role in hydrogeological systems.
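The statement that a higher TWI corresponds to a lower slope and a larger upslope area can be illustrated with the usual form of the index, TWI = ln(a / tan β). This sketch assumes that definition, with a the upslope contributing area per unit contour width and β the local slope; the function name `twi` and the sample values are hypothetical and are not taken from the study area.

```python
import math


def twi(specific_catchment_area, slope_deg):
    """Topographic wetness index ln(a / tan(beta)).

    specific_catchment_area: upslope area per unit contour width (m).
    slope_deg: local slope in degrees, must be greater than 0.
    """
    if slope_deg <= 0.0:
        raise ValueError("slope must be positive; flat cells need a small floor value")
    return math.log(specific_catchment_area / math.tan(math.radians(slope_deg)))


# Same contributing area, gentle versus steep slope.
gentle = twi(1000.0, 2.0)
steep = twi(1000.0, 20.0)
# Same slope, large versus small contributing area.
large_area = twi(10000.0, 5.0)
small_area = twi(100.0, 5.0)
print(f"gentle={gentle:.2f} steep={steep:.2f} large={large_area:.2f} small={small_area:.2f}")
```

Gently sloping cells that drain a large area receive the highest values, which is why high TWI classes tend to coincide with zones where water accumulates and infiltrates.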

For lineament density, groundwater occurrence generally increased as the lineament density increased. For lineament density between 0.000281 and 0.000392 km/km², the Bel value indicated a high probability. However, for lineament density between 0.000504 and 0.000628 km/km², the Bel value was lower, which was due to the smaller number of lineaments present in the study area.

In the case of lithology, the Bel value was highest in clay, silt, sand and gravel areas, which suggests a higher probability of groundwater occurrence than in other lithological units. The probability was lower in acid intrusive (undifferentiated) areas. Igneous rocks are assumed to have poor groundwater potential because groundwater movement through them is difficult (Thakur and Raghuwanshi, 2008).

For the land use/land cover factor, water bodies had the highest Bel value, indicating the highest probability of groundwater potential, followed by urban areas, other crops, swamp, oil palm and rubber. For the soil type factor, the highest values were found for gravel clay-silty clay–clay and sandy clay, while lower values were observed for the remaining soil types. For the vegetation index, Bel values were lower for NDVI values above 0 than for NDVI values below 0, which implies that groundwater potential decreased as the vegetation index increased. The results also indicated that high rainfall was favorable for high groundwater potential.

The integrated results are shown in Fig. 7. Comparing the belief map (Fig. 7a) with the disbelief map (Fig. 7b) showed that belief values were high in areas where disbelief values were low, and vice versa. This indicates that groundwater potential was high in areas with a high degree of belief and a low degree of disbelief for the occurrence. The uncertainty map (Fig. 7c) shows where information is lacking to provide real proof of groundwater occurrence. High uncertainty values were found in areas where belief values were low. The plausibility map (Fig. 7d) shows high values for areas where both belief and uncertainty values are high. Finally, the groundwater potential index (GWPI) was prepared based on the belief function, as shown in Fig. 8.

Fig. 7. Integrated results of EBF model; (a) belief, (b) disbelief, (c) uncertainty, and (d) plausibility.

5.2. Application of logistic regression model

The multivariate statistical estimation method assesses the relative strength and significance of each variable. The LR analysis also estimated the coefficient of each conditioning factor. Among the twelve groundwater conditioning factors, six (elevation, land use, soil, rainfall, topographic wetness index and slope) were chosen because they were statistically significant (Fig. 9) based on the LR model.


Fig. 8. Groundwater productivity potential index map.


For groundwater potential mapping, after calculating the LR coefficients of the six groundwater conditioning factors (Table 2), the values were computed in the raster calculator of the ArcGIS software as follows:

$$Z = (-9.374 \times \mathrm{elevation}) + \mathrm{landuse}_b + \mathrm{soil}_b + (-1.327 \times \mathrm{rainfall}) + (-0.6931 \times \mathrm{TWI}) + (-0.27 \times \mathrm{slope}) + 49.07 \tag{19}$$

where $\mathrm{landuse}_b$ and $\mathrm{soil}_b$ are the logistic multiple regression coefficients listed in Table 2.
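Eq. (19) gives the linear predictor Z; in a logistic regression model this is converted to a probability with p = 1 / (1 + e^(−Z)). The sketch below evaluates both steps for a single pixel. The coefficients are those printed in Eq. (19) and Table 2, but the scaling of the continuous inputs is not stated in this section, so the input values are purely hypothetical and the function names are illustrative.

```python
import math

# Coefficients and intercept as printed in Eq. (19).
COEF = {"elevation": -9.374, "rainfall": -1.327, "twi": -0.6931, "slope": -0.27}
INTERCEPT = 49.07

# Class coefficients for the two categorical factors (subset copied from Table 2).
LANDUSE_B = {"Urban area": 7.693, "Oil palm": 2.067, "Forest": -30.72}
SOIL_B = {"Clay": 2.131, "Sandy clay": -14.8}


def linear_predictor(elevation, rainfall, twi, slope, landuse, soil):
    """Evaluate Z of Eq. (19) for one pixel."""
    return (COEF["elevation"] * elevation
            + LANDUSE_B[landuse]
            + SOIL_B[soil]
            + COEF["rainfall"] * rainfall
            + COEF["twi"] * twi
            + COEF["slope"] * slope
            + INTERCEPT)


def probability(z):
    """Numerically stable logistic function 1 / (1 + exp(-z))."""
    if z >= 0.0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Hypothetical, already rescaled input values for two pixels that differ only in land use.
z_urban = linear_predictor(5.0, 1.0, 1.0, 1.0, "Urban area", "Clay")
z_forest = linear_predictor(5.0, 1.0, 1.0, 1.0, "Forest", "Clay")
p_urban, p_forest = probability(z_urban), probability(z_forest)
print(f"urban: Z={z_urban:.4f} p={p_urban:.4f}")
print(f"forest: Z={z_forest:.4f} p={p_forest:.3e}")
```

The two branches in `probability` avoid overflow in `math.exp` when Z is strongly negative, which can happen with class coefficients as large in magnitude as those in Table 2.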

The probability index ranges from 0 to 1. Fig. 10a represents the probability map generated from the LR model. This index indicates the predicted probability of groundwater occurrence for each pixel given a set of conditioning factors. The probability map was grouped into five classes, very high (10%), high (10%), moderate (20%), low (20%) and very low (40%), using the quantile method of classification (Pradhan and Lee, 2010; Pradhan, 2013; Tehrany et al., 2014). The probability map shows that almost 45% and 52% of the total area are located in the relatively high and no-well classes, respectively. Subsequently, the non-significant groundwater conditioning factors were removed from the EBF, the model was re-run, and a new final EBF map was generated. In the final map (Fig. 10b), derived from the EBF model using the six significant parameters, 38.5% of the total area is allocated to the high and very high GWPI zones. The moderate, low and no-well GWPI zones constitute 19%, 13% and 22% of the total area, respectively.
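The five-class grouping by area share described above can be sketched as follows. This is an illustration only: the study used the classification tools of its GIS software, whereas the helper names `class_breaks` and `classify`, the rank-based break rule, and the synthetic pixel values here are assumptions made for the example.

```python
from bisect import bisect_left

LABELS = ["very low", "low", "moderate", "high", "very high"]
# Cumulative area share at the top of each class except the last:
# very low 40%, low 20%, moderate 20%, high 10%, very high 10%.
CUM_SHARES = [0.40, 0.60, 0.80, 0.90]


def class_breaks(values, cum_shares=CUM_SHARES):
    """Return the upper break value of each class except the last."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[max(0, min(n - 1, round(s * n) - 1))] for s in cum_shares]


def classify(value, breaks):
    """Label a value; a value equal to a break belongs to the lower class."""
    return LABELS[bisect_left(breaks, value)]


# Hypothetical probability index values, one per pixel.
probabilities = [i / 999 for i in range(1000)]
breaks = class_breaks(probabilities)
counts = {label: 0 for label in LABELS}
for p in probabilities:
    counts[classify(p, breaks)] += 1
print(counts)
```

When many pixels share the same index value, the realised class shares can deviate from the nominal percentages, because tied values always fall in the same class.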

5.3. Validation of the groundwater potential map

5.3.1. Success-rate and prediction-rate

Validation is considered to be the most important process of modelling; without validation, the models have no scientific significance (Chang-Jo and Fabbri, 2003). By overlaying the groundwater potential maps with the groundwater well locations in the training dataset, the cumulative percentage of groundwater occurrence (starting from the highest to the lowest GWPI values) was calculated (Fig. 11). The success-rate curves were then obtained by plotting the cumulative percentage of the potential map on the x axis and the cumulative percentage of groundwater occurrence on the y axis. Evaluation of prediction and success rate is a required outcome for every scheme (Tehrany et al., 2013).

Fig. 9. Statistically significant groundwater conditioning factors.

Fig. 10. Groundwater productivity potential index map from (a) LR method; and (b) EBF from LR method.

Fig. 11 shows the success-rate curve of the EBF model using all twelve conditioning factors. The validation of the model indicated that the area under the curve (AUC) was 0.830, which corresponds to a success accuracy of 83%. Because the success-rate method uses the training groundwater well locations that were already used to build the EBF model, it is not a suitable method for assessing the prediction capability of the model (Pradhan, 2013). However, the success-rate method may help to determine how well the resulting groundwater potential map has classified the areas of existing wells.
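The construction of a success-rate or prediction-rate curve and its AUC can be sketched as below. The function names `rate_curve` and `auc` and the four-pixel data set are hypothetical; the sketch ranks pixels from the highest to the lowest index value, accumulates the share of wells captured, and integrates with the trapezoidal rule. Passing training wells gives a success-rate curve, and passing held-out wells gives a prediction-rate curve.

```python
def rate_curve(index_values, is_well):
    """Return cumulative (area fraction, well fraction) lists.

    Pixels are ranked from the highest to the lowest index value.
    """
    order = sorted(range(len(index_values)), key=lambda i: index_values[i], reverse=True)
    total_wells = sum(is_well)
    x, y = [0.0], [0.0]
    hits = 0
    for rank, i in enumerate(order, start=1):
        hits += is_well[i]
        x.append(rank / len(order))   # cumulative share of the map
        y.append(hits / total_wells)  # cumulative share of wells captured
    return x, y


def auc(x, y):
    """Area under the curve by the trapezoidal rule."""
    return sum((x[k] - x[k - 1]) * (y[k] + y[k - 1]) / 2.0 for k in range(1, len(x)))


# Hypothetical four-pixel map: both wells sit on the two highest index values.
scores = [0.9, 0.8, 0.1, 0.2]
wells = [1, 1, 0, 0]
x, y = rate_curve(scores, wells)
print(round(auc(x, y), 3))
```

Pixels with identical index values are ranked in input order here; for a real raster it is more robust to aggregate pixels into equal-area bins before accumulating.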

On the other hand, the prediction-rate describes how well the model and predictor variables anticipate the groundwater occurrence (Pradhan, 2013). For quantitative comparison, the areas under the prediction-rate curves (AUC) were calculated. The prediction validation was carried out using the groundwater occurrence dataset that was not used in the training phase (i.e. 18 groundwater well samples). The AUC for the prediction curve of the groundwater potential map was 0.779 (77.9%), which is reasonable for regional groundwater potential mapping. Moreover, Fig. 11 shows the validation of the two results obtained using the significant conditioning factors: the LR method generated an AUC of 0.820, which indicates a success accuracy of 82%. The AUC for prediction of the groundwater potential index was 0.720, implying a prediction accuracy of 72%. For the EBF model based on the six significant conditioning factors, the AUC values for success and prediction accuracy were 0.750 and 0.680, respectively.

Fig. 11. The success and prediction rate curve for groundwater potential map; (a) success rate and (b) prediction rate.

Fig. 12. The frequency ratio histogram for low potential wells.

As can be seen, the success-rate accuracies of LR and EBF are almost the same, while EBF using all groundwater conditioning factors has the higher prediction accuracy. However, the EBF model built on the significant parameters alone does not perform as well as the LR model; these results indicate that EBF with only the significant parameters is a relatively poor estimator compared to LR. In the study of Ozdemir (2011), LR was a poor estimator for spring potential mapping. The performance of the employed EBF model is in agreement with results obtained by other researchers in various environmental and natural hazard studies (Althuwaynee et al., 2014; Bui et al., 2012, 2013; Lee et al., 2013; Mohammady et al., 2012).

5.3.2. Cross-validation using low yield well and FR model

Additionally, the groundwater productivity potential index was cross-validated with the aid of the low yield wells, using the FR approach. The FR approach is based on the observed relationships between the remaining 65 low yield wells (<11 m³/h) and the groundwater conditioning factors. In the histogram of GWPI derived from the FR (Fig. 12), the groundwater locations with low productivity coincided with the sites falling in the no-well and low classes. This result showed a negative relationship between the low yield well locations and the potential area. Relative frequencies for the different groundwater potential zones were calculated from the ratio. Ideally, the FR value should decrease from the no-well zone to the very high-potential zone. The histogram therefore indicates how well the EBF model and predictor variables predicted the groundwater potential zones.
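The cross-validation idea can be sketched with the usual frequency ratio, taken here as the share of low yield wells falling in a zone divided by the share of the map area occupied by that zone. The well counts and area shares below are hypothetical (only the total of 65 wells matches the text), and the function name `frequency_ratio` is illustrative.

```python
def frequency_ratio(wells_per_zone, area_per_zone):
    """Return FR per zone: (share of wells in zone) / (share of area in zone)."""
    total_wells = sum(wells_per_zone.values())
    total_area = sum(area_per_zone.values())
    return {
        zone: (wells_per_zone[zone] / total_wells) / (area_per_zone[zone] / total_area)
        for zone in area_per_zone
    }


# Hypothetical distribution of 65 low yield wells and zone areas (percent of map).
wells = {"no well": 30, "low": 15, "moderate": 10, "high": 6, "very high": 4}
area = {"no well": 25.0, "low": 15.0, "moderate": 20.0, "high": 20.0, "very high": 20.0}

fr = frequency_ratio(wells, area)
for zone, value in fr.items():
    print(f"{zone}: FR={value:.2f}")
```

An FR above 1 means low yield wells are over-represented in a zone relative to its area; a map that separates low from high potential well should therefore show FR values that fall steadily from the no-well zone to the very high zone, as in this constructed example.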

6. Conclusion

Groundwater is one of the most important natural resources and plays an increasingly significant role in water supply throughout the world. Therefore, detecting and predicting the spatial distribution of potential locations for groundwater exploration has become an important topic for private, government, and research institutions worldwide. In this paper, a data-driven EBF model was successfully applied to delineate the potential groundwater zones in the Langat basin, Malaysia. The validation results indicated that the success-rate for the EBF model (Bel map using 12 conditioning parameters) was 83%, with a prediction-rate of 78%. In addition, an FR model was applied to cross-verify the results derived from the EBF model using low yield wells. In this method, in addition to the high yield wells, the low yield wells were used to show where the low groundwater potential zones are located.


To map groundwater potential zones, the primary step was the preparation of the groundwater conditioning factors which affect the groundwater potential. The groundwater conditioning factors were then integrated in a spatial database using the EBF model, which indicated the relationship between groundwater yield values and the conditioning factors. Hence, the quantitative relationships between known groundwater occurrence and hydrogeological layers are useful in transforming hydrogeological map data into evidential data layers, which can be integrated to produce a predictive groundwater potential map. The main advantage of the Dempster–Shafer theory is that the EBF allows not only predictive mapping of favorable zones for groundwater occurrence but also modelling of the degree of uncertainty in the prediction.

To check the robustness of the proposed EBF model, a well-known LR model was applied using the twelve groundwater conditioning parameters. Out of the twelve conditioning factors, six were selected as significant parameters and a coefficient value was assigned to each of them. From these, a final map was produced showing the probability from 0 to 1. In the next step, the EBF model was re-run using those six significant conditioning factors to check the efficiency of the model.

In summary, the results of this study showed that the EBF model can be successfully applied in groundwater potential mapping. The results obtained in this study might be useful for related agencies in Malaysia for comprehensive evaluation of groundwater exploration, development and environmental management for future planning. The proposed method provided rapid, accurate and cost-effective results. Furthermore, the analysis may be transferable to other districts with similar topographic and hydrogeological characteristics.

Acknowledgments

This research was supported by RUGS Grant/05-02-12-2195 RU at the University Putra Malaysia (Vote No. 9376500). The authors would like to thank the Department of Survey and Mapping Malaysia (JUPEM), the Minerals and Geosciences Department Malaysia (JMG) and the Department of Agriculture for providing various data sets for the research. Thanks to Mahyat Shafapour Tehrany, Mustafa Neamah Jebur and Omar Althuwaynee for their valuable contribution while preparing the manuscript. Thanks to the anonymous reviewers for their valuable inputs, which were useful to improve the quality of the manuscript.

References

Akgun, A., 2012. A comparison of landslide susceptibility maps produced by logistic regression, multi-criteria decision, and likelihood ratio methods: a case study at İzmir, Turkey. Landslides 9 (1), 93–106.

Althuwaynee, O.F., Pradhan, B., Lee, S., 2012. Application of an evidential belief function model in landslide susceptibility mapping. Comput. Geosci. 44, 120–135.

Althuwaynee, O.F., Pradhan, B., Park, H.J., Lee, J.H., 2014. A novel ensemble bivariate statistical evidential belief function with knowledge-based analytical hierarchy process and multivariate statistical logistic regression for landslide susceptibility mapping. Catena 114, 21–36.

Ayalew, L., Yamagishi, H., 2005. The application of GIS-based logistic regression for landslide susceptibility mapping in the Kakuda-Yahiko Mountains, Central Japan. Geomorphology 65, 15–31.

Ayazi, M.H., Pirasteh, S., Arvin, A.K.P., Pradhan, B., Nikouravan, B., Mansor, S., 2010. Disasters and risk reduction in groundwater: Zagros Mountain Southwest Iran using geoinformatics techniques. Disaster Adv. 3 (1), 51–57.

Banks, D., Robins, N.S., Robins, N., 2002. An Introduction to Groundwater in Crystalline Bedrock. Norges Geologiske Undersokelse.

Beven, K.J., Kirkby, M.J., 1979. A physically based, variable contributing area model of basin hydrology/Un modèle à base physique de zone d’appel variable de l’hydrologie du bassin versant. Hydrol. Sci. J. 24 (1), 43–69.

Bringemeier, 2006. Groundwater exploration adjacent to the Kuala Lumpur International Airport/Malaysia – challenges and chances of exploring a fractured rock aquifer, <http://www.coffey.com/>.

Bui, D.T., Pradhan, B., Lofman, O., Revhaug, I., Dick, O.B., 2012. Spatial prediction of landslide hazards in Hoa Binh province (Vietnam): a comparative assessment of the efficacy of evidential belief functions and fuzzy logic models. Catena 96, 28–40.

Bui, D.T., Pradhan, B., Revhaug, I., Nguyen, D.B., Pham, H.V., Bui, Q.N., 2013. A novel hybrid evidential belief function-based fuzzy logic model in spatial prediction of rainfall-induced shallow landslides in the Lang Son city area (Vietnam). Geomatics, Nat. Hazard Risk, 1–30. http://dx.doi.org/10.1080/19475705.2013.843206 (article online first available).

Carranza, E.J.M., Castro, O.T., 2006. Predicting lahar-inundation zones: case study in West Mount Pinatubo, Philippines. Nat. Hazards 37 (3), 331–372.

Carranza, E.J.M., Hale, M., 2003. Evidential belief functions for data-driven geologically constrained mapping of gold potential, Baguio district, Philippines. Ore Geol. Rev. 22 (1), 117–132.

Carranza, E.J.M., Woldai, T., Chikambwe, E.M., 2005. Application of data-driven evidential belief functions to prospectivity mapping for aquamarine-bearing pegmatites, Lundazi district, Zambia. Nat. Resour. Res. 14 (1), 47–63.

Chang-Jo, F., Fabbri, A.G., 2003. Validation of spatial prediction models for landslide hazard mapping. Nat. Hazards 30 (3), 451–472.

Charon, J.E., 1974. Hydrogeological applications of ERTS satellite imagery. In: Proc. UN/FAO Regional Seminar on Remote Sensing of Earth Resources and Environment, Cairo. Commonwealth Science Council, pp. 439–456.

Chenini, I., Mammou, A.B., 2010. Groundwater recharge study in arid region: an approach using GIS techniques and numerical modelling. Comput. Geosci. 36 (6), 801–817.

Chenini, I., Mammou, A.B., Moufida, E.M., 2010. Groundwater recharge zone mapping using GIS-based multi-criteria analysis: a case study in Central Tunisia (Maknassy Basin). Water Resour. Manage. 24 (5), 921–939.

Chowdhury, A., Jha, M.K., Chowdary, V.M., Mal, B.C., 2009. Integrated remote sensing and GIS-based approach for assessing groundwater potential in West Medinipur district, West Bengal, India. Int. J. Remote Sens. 30 (1), 231–250.

Corsini, A., Cervi, F., Ronchetti, F., 2009. Weight of evidence and artificial neural networks for potential groundwater spring mapping: an application to the Mt. Modino area (Northern Apennines, Italy). Geomorphology 111 (1), 79–87.

Dar, I.A., Sankar, K., Dar, M.A., 2010. Remote sensing technology and geographic information system modeling: an integrated approach towards the mapping of groundwater potential zones in Hardrock terrain, Mamundiyar basin. J. Hydrol. 394, 285–295.

Dempster, A., 2008. Upper and lower probabilities induced by a multivalued mapping. In: Yager, R., Liu, L., Dempster, A.P., Shafer, G. (Eds.), Classic Works of the Dempster-Shafer Theory of Belief Functions. Springer, Berlin, Heidelberg, pp. 57–72 (Chapter 3).

Fitts, C.R., 2002. Groundwater Science. Academic Press.

Ganapuram, S., Kumar, G.T., Krishna, I.V., Kahya, E., Demirel, M.C., 2009. Mapping of groundwater potential zones in the Musi basin using remote sensing data and GIS. Adv. Eng. Software 40 (7), 506–518.

Ghayoumian, J., MohseniSaravi, M., Feiznia, S., Nouri, B., Malekian, A., 2007. Application of GIS techniques to determine areas most suitable for artificial groundwater recharge in a coastal aquifer in southern Iran. J. Asian Earth Sci. 30 (2), 364–374.

Gupta, M., Srivastava, P.K., 2010. Integrating GIS and remote sensing for identification of groundwater potential zones in the hilly terrain of Pavagarh, Gujarat, India. Water Int. 35 (2), 233–245.

Hosmer, D.W., Lemeshow, S., 2000. Applied Logistic Regression, second ed. John Wiley & Sons Inc., New York.

Hutchison, C.S., Tan, D.N.K., 2009. Geology of Peninsular Malaysia, University of Malaya.

Jha, M.K., Chowdhury, A., Chowdary, V.M., Peiffer, S., 2007. Groundwater management and development by integrated remote sensing and geographic information systems: prospects and constraints. Water Resour. Manage. 21 (2), 427–467.

Juahir, H., Zain, S.M., Aris, A.Z., Yusof, M.K., Samah, M.A.A., Mokhtar, M., 2010. Hydrological trend analysis due to land use changes at Langat River Basin. Environ. Asia 3 (2010), 20–31.

Kavzoglu, T., Sahin, E.K., Colkesen, I., 2013. Landslide susceptibility mapping using GIS-based multi-criteria decision analysis, support vector machines, and logistic regression. Landslides, 1–15.

Krishnamurthy, J., Kumar, N.V., Jayaraman, V., Manivel, M., 1996. An approach to demarcate groundwater potential zones through remote sensing and a geographic information system. Int. J. Remote Sens. 17 (10), 1867–1884.

Lee, S., 2005. Application of logistic regression model and its validation for landslide susceptibility mapping using GIS and remote sensing data. Int. J. Remote Sens. 26, 1477–1491.

Lee, S., Kim, Y.S., Oh, H.J., 2012. Application of a weights-of-evidence method and GIS to regional groundwater productivity potential mapping. J. Environ. Manage. 96 (1), 91–105.

Lee, S., Hwang, J., Park, I., 2013. Application of data-driven evidential belief functions to landslide susceptibility mapping in Jinbu, Korea. Catena 100, 15–30.

Madrucci, V., Taioli, F., Cesar de Araujo, C., 2008. Groundwater favourability map using GIS multi criteria data analysis on crystalline terrain, Sao Paulo State, Brazil. J. Hydrol. 357, 153–173.

Magesh, N.S., Chandrasekar, N., Soundranayagam, J.P., 2012. Delineation of groundwater potential zones in Theni district, Tamil Nadu, using remote sensing, GIS and MIF techniques. Geosci. Frontiers 3 (2), 189–196.

Manap, M.A., Nampak, H., Pradhan, B., Lee, S., Sulaiman, W.N.A., Ramli, M.F., 2012. Application of probabilistic-based frequency ratio model in groundwater potential mapping using remote sensing data and GIS. Arab. J. Geosci., 1–14. http://dx.doi.org/10.1007/s12517-012-0795-z.


Manap, M.A., Sulaiman, W.N.A., Ramli, M.F., Pradhan, B., Surip, N., 2013. Aknowledge-driven GIS modeling technique for groundwater potentialmapping at the Upper Langat Basin, Malaysia. Arab. J. Geosci. 6 (5), 1621–1637. http://dx.doi.org/10.1007/s12517-011-0469-2.

Mohammady, M., Pourghasemi, H.R., Pradhan, B., 2012. Landslide susceptibilitymapping at Golestan Province, Iran: a comparison between frequency ratio,Dempster–Shafer, and weights-of-evidence models. J. Asian Earth Sci. 61, 221–236.

Moon, W.M., 1990. Integration of geophysical and geological data using evidentialbelief function. IEEE Trans. Geosci. Remote Sens. 28 (4), 711–720.

Moore, I.D., Grayson, R.B., Ladson, A.R., 1991. Digital terrain modelling: a review ofhydrological, geomorphological, and biological applications. Hydrol. Proc. 5 (1),3–30.

Mukherjee, S., 1996. Targetting saline aquifer by remote sensing andgeophysical methods in a part of Hamirpur-Kanpur, India. Hydrol. J. 19,1867–1884.

Murthy, K.S.R., Mamo, A.G., 2009. Multi-criteria decision evaluation in groundwaterzones identification in Moyale-Teltele sub-basin, South Ethiopia. Int. J. RemoteSens. 30 (11), 2729–2740.

Nag, A., Ghosh, S., Biswas, S., Sarkar, D., Sarkar, P.P., 2012. An image steganographytechnique using X-box mapping. In: Advances in Engineering, Science andManagement (ICAESM), 2012 International Conference on pp. 709–713. IEEE.

Neshat, A., Pradhan, B., Pirasteh, S., Shafri, H.Z.M., 2013. Estimating groundwatervulnerability to pollution using a modified DRASTIC model in the Kermanagricultural area, Iran. Environ. Earth Sci. 1–13. http://dx.doi.org/10.1007/s12665-013-2690-7.

Oh, H.J., Kim, Y.S., Choi, J.K., Park, E., Lee, S., 2011. GIS mapping of regionalprobabilistic groundwater potential in the area of Pohang City, Korea. J. Hydrol.399 (3), 158–172.

Ozdemir, A., 2011. Using a binary logistic regression method and GIS for evaluatingand mapping the groundwater spring potential in the Sultan Mountains(Aksehir, Turkey). J. Hydrol. 405 (1), 123–136.

Park, N.W., 2011. Application of Dempster–Shafer theory of evidence to GIS-basedlandslide susceptibility analysis. Environ. Earth Sci. 62 (2), 367–376.

Pradhan, B., 2009. Groundwater potential zonation for basaltic watersheds usingsatellite remote sensing data and GIS techniques. Cent. Eur. J. Geosci. 1 (1), 120–129.

Pradhan, B., 2013. A comparative study on the predictive ability of the decision tree,support vector machine and neuro-fuzzy models in landslide susceptibilitymapping using GIS. Comput. Geosci. 51, 350–365. http://dx.doi.org/10.1016/j.cageo.2012.08.023.

Pradhan, B., Abokharima, M.H., Jebur, M.N., Tehrany, M.S., 2014. Land subsidencesusceptibility mapping at Kinta Valley (Malaysia) using the evidential belief

function model in GIS. Nat. Hazards, 1–24. http://dx.doi.org/10.1007/s11069-014-1128-1.

Pradhan, B., Oh, H.J., Buchroithner, M., 2010. Weights-of-evidence model applied tolandslide susceptibility mapping in a tropical hilly area. Geomat. Nat. Haz. Risk1 (3), 199–223.

Pradhan, B., Lee, S., 2010. Regional landslide susceptibility analysis using back-propagation neural network model at Cameron Highland, Malaysia. Landslides 7 (1), 13–30.

Prasad, R.K., Mondal, N.C., Banerjee, P., Nandakumar, M.V., Singh, V.S., 2008. Deciphering potential groundwater zone in hard rock through the application of GIS. Environ. Geol. 55 (3), 467–475.

Sander, P., Chesley, M.M., Minor, T.B., 1996. Groundwater assessment using remote sensing and GIS in a rural groundwater project in Ghana: lessons learned. Hydrogeol. J. 4 (3), 40–49.

Saraf, A.K., Choudhury, P.R., 1998. Integrated remote sensing and GIS for groundwater exploration and identification of artificial recharge sites. Int. J. Remote Sens. 19 (10), 1825–1841.

Shafer, G., 1976. A Mathematical Theory of Evidence, vol. 1. Princeton University Press, Princeton.

Shahid, S., Nath, S.K., Kamal, A.S.M.M., 2002. GIS integration of remote sensing and topographic data using fuzzy logic for ground water assessment in Midnapur District, India. Geocarto Int. 17 (3), 69–74.

Singh, A.K., Prakash, S.R., 2002. An integrated approach of remote sensing, geophysics and GIS to evaluation of groundwater potentiality of Ojhala sub-watershed, Mirjapur district, UP, India. In: Asian Conference on GIS, GPS, Aerial Photography and Remote Sensing, Bangkok-Thailand.

Suratman, S., 2004. IWRM: Managing the Groundwater Component. Paper presented at Malaysia Water Forum, Kuala Lumpur, Malaysia.

Tehrany, M.S., Pradhan, B., Jebur, M.N., 2013. Spatial prediction of flood susceptible areas using rule based decision tree (DT) and a novel ensemble bivariate and multivariate statistical models in GIS. J. Hydrol. 504, 69–79.

Tehrany, M.S., Pradhan, B., Jebur, M.N., 2014. Flood susceptibility mapping using a novel ensemble weights-of-evidence and support vector machine models in GIS. J. Hydrol. 512, 332–343. http://dx.doi.org/10.1016/j.jhydrol.2014.03.008.

Thakur, G.S., Raghuwanshi, R.S., 2008. Perspect and assessment of groundwater resources using remote sensing techniques in and around Choral river basin, Indore and Khargone districts, MP. J. Indian Soc. Remote Sens. 36 (2), 217–225.

Umar, Z., Pradhan, B., Ahmad, A., Jebur, M.N., Tehrany, M.S., 2014. Earthquake induced landslide susceptibility mapping using an integrated ensemble frequency ratio and logistic regression models in West Sumatera Province, Indonesia. Catena 118, 124–135. http://dx.doi.org/10.1016/j.catena.2014.02.005.

Yin, E.H., 1976. Geologic map of Selangor, Sheet 94 (Kuala Lumpur). Scale 1:63,360.