

A novel approach to validate satellite soil moisture retrievals using precipitation data

L. Karthikeyan¹ and D. Nagesh Kumar¹

¹Department of Civil Engineering, Indian Institute of Science, Bangalore, India

Abstract

A novel approach is proposed that attempts to validate passive microwave soil moisture retrievals using precipitation data (applied over India). It is based on the concept that the expectation of precipitation conditioned on soil moisture follows a sigmoidal convex-concave-shaped curve, the characteristic of which was recently shown to be represented by mutual information estimated between soil moisture and precipitation. On this basis, with an emphasis on distribution-free nonparametric computations, a new measure called Copula-Kernel Density Estimator based Mutual Information (CKDEMI) is introduced. The validation approach is generic in nature and utilizes CKDEMI in tandem with a couple of proposed bootstrap strategies, to check accuracy of any two soil moisture products (here Advanced Microwave Scanning Radiometer–EOS sensor's Vrije Universiteit Amsterdam-NASA (VUAN) and University of Montana (MONT) products) using precipitation (India Meteorological Department) data. The proposed technique yields a “best choice soil moisture product” map which contains locations where any one of the two/none of the two/both the products have produced accurate retrievals. The results indicated that in general, VUA-NASA product has performed well over University of Montana's product for India. The best choice soil moisture map is then integrated with land use land cover and elevation information using a novel probability density function-based procedure to gain insight on conditions under which each of the products has performed well. Finally, the impact of using a different precipitation (Asian Precipitation-Highly-Resolved Observational Data Integration Towards Evaluation of Water Resources) data set over the best choice soil moisture product map is also analyzed. The proposed methodology assists researchers and practitioners in selecting the appropriate soil moisture product for various assimilation strategies at both basin and continental scales.

1. Introduction

Soil moisture quantifies the amount of water present in the pore spaces of unsaturated soil matrix. Though soil moisture accounts for approximately 0.001% of Earth's total water content, the presence of it at land atmosphere interface strongly affects the energy and water exchanges. Soil moisture forms a major source of water for processes that lead to land surface evapotranspiration which in turn contributes to water circulation in the atmosphere [Seneviratne et al., 2010]. Furthermore, soil moisture at the surface (typically top 0–5 cm of soil) is an important variable to estimate the distribution of precipitation into storage and surface runoff [Wagner et al., 2003].

These influences lead to usage of soil moisture data in several hydrological, meteorological, and climate applications such as flood forecasting [Brocca et al., 2012; Lievens et al., 2015], monitoring of droughts [Bolten et al., 2010; Zhang and Jia, 2013], climate studies [Bisselink et al., 2011; Dorigo et al., 2012], and calibration of hydrologic models [Wanders et al., 2014]. Successful outcome of these applications demands accurate soil moisture observations which are often difficult to obtain both spatially and temporally due to the highly heterogeneous nature of soil moisture that cripples the ability to generalize the findings to a regional scale based on at-site analysis [Prigent et al., 2005]. This leads to satellite-based retrieval of soil moisture data. Scanning Multichannel Microwave Radiometer, Special Sensor Microwave Imager (SSM/I), TMI-TRMM Microwave Imager where TRMM is Tropical Rainfall Measuring Mission, Advanced Microwave Scanning Radiometer–EOS (AMSR-E), Soil Moisture and Ocean Salinity (SMOS), and Soil Moisture Active Passive (SMAP) are the important passive microwave sensors [Njoku et al., 2003; Entekhabi et al., 2010; Kerr et al., 2012; Lettenmaier et al., 2015] deployed so far which contain appropriate frequency bands (L band at 1–2 GHz, C band at 4–8 GHz, and X band at 8–12 GHz) for global-scale soil moisture retrievals.

The measurements (satellite brightness temperatures) recorded by these passive microwave sensors at a location are converted into soil moisture value using a retrieval algorithm (which is primarily based on the concepts of radiative transfer theory and dielectric mixing model) [Jackson et al., 1982; Owe et al., 2001; Pellarin et al., 2003; Jones and Kimball, 2011; Kerr et al., 2012; Pan et al., 2014]. Recently, Mladenova et al. [2014] provided an excellent review on various retrieval algorithms proposed that are applicable to observations of AMSR-E sensor. It has to be noted that despite the source of satellite data (brightness temperature) remaining the same, these algorithms vary (in terms of their sample moments) in producing outputs. These variations could be due to (a) the concept based on which the algorithm is formulated, (b) soil physics involved in dielectric models, and (c) the assumptions and parameterization involved in the algorithm. Covering these aspects, a sensitivity analysis study over the parameters of retrieval algorithm has recently been carried out by Neelam and Mohanty [2015].

KARTHIKEYAN AND KUMAR SATELLITE SOIL MOISTURE RETRIEVALS 11,516

Journal of Geophysical Research: Atmospheres

RESEARCH ARTICLE
10.1002/2016JD024829

Key Points:
• A novel approach is introduced to validate satellite soil moisture retrievals using rainfall data
• VUA-NASA product in general performed well compared to University of Montana's product over India
• Results are integrated with LULC and elevation data; impact of reference rainfall data is analyzed

Supporting Information:
• Supporting Information S1

Correspondence to: D. N. Kumar, [email protected]

Citation: Karthikeyan, L., and D. N. Kumar (2016), A novel approach to validate satellite soil moisture retrievals using precipitation data, J. Geophys. Res. Atmos., 121, 11,516–11,535, doi:10.1002/2016JD024829.

Received 20 JAN 2016
Accepted 18 AUG 2016
Accepted article online 23 AUG 2016
Published online 12 OCT 2016

©2016. American Geophysical Union. All Rights Reserved.

The validation of retrieved soil moisture observations is in general carried out using in situ observations [Reichle et al., 2007; Chen et al., 2013; Dorigo et al., 2015] subject to data availability. Currently, the International Soil Moisture Network [Dorigo et al., 2011] hosts, on public domain, the data corresponding to 2153 stations (as of 19 May 2015) spread across the world. Although this kind of validation gives a better idea about the performance of the soil moisture products, it is often crippled by lack of in situ data. India, as an example, is devoid of a comprehensive soil moisture station network due to which in situ validation of satellite products is not possible. In this regard, alternate validation techniques have been developed in the recent past which do not explicitly require ground data. Existing techniques include triple collocation [Scipal et al., 2008; Su et al., 2014] and Rvalue approach [Crow, 2007; Crow and Zhan, 2007; Crow et al., 2010]. Triple collocation is used to compute root-mean-square error of two soil moisture products with reference to the reanalysis/observed soil moisture data set. The method is based on assumptions of (a) orthogonality in products' errors, (b) independence between the errors of three data sets, and (c) errors in the products that are linearly related to the reference data set. Recent studies by Yilmaz and Crow [2014] and Gruber et al. [2016] ascertained that violation of assumptions of triple collocation can impact the final error estimates of the soil moisture products. Rvalue approach, on the other hand, assumes correlation between antecedent precipitation indices (API) simulated by assimilating soil moisture retrievals into water balance model and observed API (computed from precipitation data) as a proxy to determine the accuracy of the soil moisture product. The Rvalue approach requires an additional modeling of water balance equation along with prespecified model parameters which could introduce certain uncertainty in the final outcome. Apart from these two techniques, recently, Tuttle and Salvucci [2014] have proposed an approach to validate soil moisture products solely based on precipitation data set using mutual information measure which is based on a conceptual framework proposed by Salvucci [2001]. In the current work, we have developed a new mutual information measure called “Copula-Kernel Density Estimator based Mutual Information” with emphasis to compute the measure in a nonparametric environment that is devoid of any of the aforesaid assumptions. Also, a novel attempt has been made to check if one or both soil moisture products are indeed accurate at a location under statistical basis through a couple of bootstrap procedures. Synthetic studies have been carried out to assess the robustness of the proposed measure and associated bootstrap tests. The proposed measure, integrated with the bootstrap tests, can validate any two soil moisture products based on observations of precipitation at a location. The selection of soil moisture products is independent of sensor and the frequency band by which retrievals are made (e.g., active or passive microwave sensors at L/C/X band frequencies). The method can also be used to include reanalysis/modeled soil moisture data sets such as Global Land Data Assimilation System (GLDAS) land surface model soil moisture simulations [Rodell et al., 2004].

Two soil moisture retrieval algorithms using AMSR-E sensor data have been validated through the proposed validation algorithm over India by employing precipitation data obtained from India Meteorological Department (IMD). The results thus obtained are integrated with land use land cover and elevation information to evaluate the performance of the retrieval algorithms. Finally, the impact of changing the reference precipitation data set used for product validation is also worked out. The current work thus forms a viable addition to the existing validation techniques and contains novelty in the manner by which one can carry out the following analyses: (i) validate and select the best performing soil moisture product solely based on precipitation data using nonparametric approach with strong conceptual backing, (ii) test the significance of selected product (at a location) for independence with regard to precipitation, (iii) check if multiple products have performed well at the location with regard to precipitation, and (iv) comment on advantages and fallacies of the algorithms thereby providing a scope for development of better soil moisture retrievals in the future. The proposed methodology and the underlying theory are presented in section 2 followed by data description and overview of the algorithm used for the analysis. The paper is concluded with a discussion regarding how retrieval algorithms have performed under varied physiographical conditions over India.

2. Methodology

Recently, Tuttle and Salvucci [2014] have proposed that the mutual information measure can be used as an effective proxy to ascertain the relationship between soil moisture and precipitation data sets. Their work is centered on analysis carried out by Salvucci [2001] which examined the properties of soil moisture and precipitation observations based on the concepts of water balance and stationarity. A brief outline of their analysis is presented below:

Equation (1) presents the water balance equation of a unit area of land extending over a depth Z.

$$
\frac{dSM}{dt} = P - ET - Q \tag{1}
$$

where $dSM/dt$ is the change in soil moisture storage with time; P, ET, and Q are fluxes ($LT^{-1}$) of precipitation, evapotranspiration, and combined losses due to runoff and drainage. Salvucci [2001] proved mathematically (supported by simulation studies and analysis with observed data) that expectation of change in soil moisture storage with time, conditioned over mean soil moisture (SM) over that time interval, is zero (E[dSM/dt|SM] = 0). Thus, the water balance equation upon conditioning on soil moisture will be modified as

$$
E[P\,|\,SM] = E[ET\,|\,SM] + E[Q\,|\,SM] \tag{2}
$$

which implies that for a given value of soil moisture, the precipitation that occurred over soil surface will get distributed into ET dissipating from soil and losses in terms of runoff and drainage from soil. Later on, Salvucci [2001] determined that the graphs of E[P|SM] possess a characteristic sigmoidal convex-concave-shaped curve. The shape of this curve is attributed to the cumulative effects of dependence of ET and Q on soil moisture. The sigmoidal behavior of E[P|SM] curve has been observed by Salvucci's group at both plot scale and satellite scale, and the relevant findings have been presented in various studies [Saleem and Salvucci, 2002; Arrigo and Salvucci, 2005; Sun et al., 2011, 2012; Tuttle and Salvucci, 2014]. The typical curve of E[P|SM] along with associated time series plots of precipitation and moisture (in the inset) is presented in Figure 1.

According to Tuttle and Salvucci [2014], E[P|SM] plot exhibits characteristic sigmoidal convex-concave-shaped curve only when the soil moisture data set is accurately measured (retrieved); i.e., the agreement between precipitation and corresponding soil moisture observations is not perturbed. However, during the retrieval process, the measurements of soil moisture may not exactly align with ground-based observations at a location due to several aforementioned issues pertaining to retrieval algorithms and sensor errors. Under these circumstances, E[P|SM] curve may not exhibit the described behavior indicating lack of accuracy in the concerned soil moisture product at the location under study. In this context, Tuttle and Salvucci [2014] indicated that mutual information computed from soil moisture and precipitation data sets can be used as a proxy that comprises the E[P|SM] curve. Through a set of simulation studies, the authors have concluded that the soil moisture product with highest value of mutual information will have the most accurate pairwise agreement between moisture and precipitation which results in accurate E[P|SM] curve. In their work, the authors have computed mutual information between soil moisture (obtained at daily scale from three different products) and precipitation data sets by transforming the variables into Gaussian distribution.

Figure 1. Typical plot of E[P|SM] curve plotted from soil moisture data (VUA-NASA soil moisture product) and IMD precipitation (data corresponding to location 25.75°N 82.75°E in India). Initially, soil moisture observations are normalized using $(SM - SM_{min})/(SM_{max} - SM_{min})$. Normalized soil moisture observations are binned, and precipitation observations falling in corresponding bins are averaged to finally form E[P|SM] curve. The uncertainty band at each bin is obtained by $\sigma/\sqrt{n}$ where $\sigma$ and $n$ are standard deviation and number of precipitation observations falling in the concerned bin. The figure in the inset shows the time series plots of soil moisture and precipitation data sets from which the E[P|SM] curve has been constructed. The plot here clearly depicts characteristic sigmoidal convex-concave-shaped curve with former at low soil moisture values and latter at high soil moisture values.
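The binning procedure described in the Figure 1 caption can be sketched in a few lines of Python. This is only an illustration on synthetic data: the function name `conditional_mean_curve`, the number of bins, and the use of the population standard deviation (`ddof=0`) are assumptions, since the text does not specify them.

```python
import numpy as np

def conditional_mean_curve(sm, p, n_bins=10):
    """Bin normalized soil moisture and average precipitation within each bin."""
    sm = np.asarray(sm, dtype=float)
    p = np.asarray(p, dtype=float)
    # Normalize soil moisture to [0, 1]: (SM - SMmin) / (SMmax - SMmin).
    sm_norm = (sm - sm.min()) / (sm.max() - sm.min())
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Assign each observation to a bin; the maximum value goes in the last bin.
    idx = np.clip(np.digitize(sm_norm, edges) - 1, 0, n_bins - 1)
    centers = 0.5 * (edges[:-1] + edges[1:])
    mean_p = np.full(n_bins, np.nan)
    stderr = np.full(n_bins, np.nan)
    for b in range(n_bins):
        in_bin = p[idx == b]
        if in_bin.size > 0:
            mean_p[b] = in_bin.mean()
            # Uncertainty band: sigma / sqrt(n) for the observations in the bin
            # (ddof=0 is an assumption; the text only says "standard deviation").
            stderr[b] = in_bin.std(ddof=0) / np.sqrt(in_bin.size)
    return centers, mean_p, stderr

# Synthetic data only (hypothetical values, not the IMD or VUA-NASA series).
rng = np.random.default_rng(0)
sm_demo = rng.uniform(0.05, 0.45, size=1000)
p_demo = rng.exponential(scale=1.0 + 20.0 * sm_demo)
centers, mean_p, stderr = conditional_mean_curve(sm_demo, p_demo, n_bins=8)
```

Plotting `mean_p` against `centers`, with `stderr` as the band, gives an empirical E[P|SM] curve for whichever soil moisture and precipitation series are supplied.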

But it is known that the distribution of precipitation at daily scale is generally skewed due to the presence of a large number of nonrainy days with bounds being [0, ∞). Soil moisture, on the other hand, is strictly bounded. Its value ranges between residual soil moisture $\theta_r$ (the amount of water which cannot be drained from soil even under high tension head) and porosity $\eta$ (the ratio between volume of voids present and total soil volume) and hence with bounds of $[\theta_r, \eta]$. Due to these inherent properties that both the variables possess, it may not be sensible to transform the variables into Gaussian space. Hence, the current work proposes an assumption-free nonparametric approach to calculate mutual information between soil moisture and precipitation, based on the concept of copulas and kernel density estimators, which is termed the Copula-Kernel Density Estimator based Mutual Information (CKDEMI) measure. Although mutual information is superior to other dependency measures such as correlation coefficient, due to its computational complexity, there is no statistical testing procedure, to the best of our knowledge, that can examine the significance of dependence between the variables under study. Hence, we have proposed a bootstrap strategy to test the independence through CKDEMI in an attempt to investigate if a particular soil moisture product has significant dependence on precipitation which eventually translates into correct representation of the E[P|SM] curve. The robustness of the proposed measure and the bootstrap strategy are tested with a couple of simulation experiments. We also hypothesize that there could be few locations wherein the performance of multiple soil moisture products could be comparable; i.e., multiple products could have retrieved soil moisture accurately. This hypothesis is tested by another bootstrap procedure, the details of which will be presented subsequently. The following subsections provide a description of copulas which form key components in the formulation of CKDEMI measure.

2.1. Copulas

A copula is a statistical tool used to obtain multivariate distribution functions from one-dimensional marginal distribution functions which lie in the interval [0, 1] [Nelsen, 2013]. As the name suggests, a copula stores the dependency structure between the random variables. Under a bivariate scenario, consider two continuous random variables X, Y (where $x, y \in \mathbb{R}$) with their cumulative distribution functions (CDFs) and joint distribution function being $F_X(x) = P[X \le x]$, $F_Y(y) = P[Y \le y]$, and $F_{XY}(x, y) = P[X \le x, Y \le y]$, respectively. Using a copula function C, a joint CDF of X and Y can be estimated using equation (3).

$$
F_{XY}(x, y) = C(F_X(x), F_Y(y)) \tag{3}
$$

If $u_x = F_X(x)$ and $u_y = F_Y(y)$, then equation (3) can be written as $F_{XY}(F_X^{-1}(u_x), F_Y^{-1}(u_y)) = C(u_x, u_y)$. A variety of copula families have been developed in bivariate scenario in literature [Joe, 1997; Nelsen, 2013] of which Gaussian copula and Archimedean family of copulas have been extensively used in the field of hydrology. The general structure of these copula families is presented in the supporting information (Table S1).

The computation of mutual information between two random variables requires corresponding joint probability density function (PDF) $f_{XY}(x, y)$ (the details regarding mutual information are presented in section 2.2). The following set of equations describe the manner in which $f_{XY}(x, y)$ can be computed using equation (3).

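$$
\begin{aligned}
f_{XY}(x, y) &= \frac{\partial^2 F_{XY}(x, y)}{\partial x\, \partial y} && \text{[obtaining joint PDF from joint CDF]} \\
&= \frac{\partial^2 C(F_X(x), F_Y(y))}{\partial F_X(x)\, \partial F_Y(y)} \times \frac{\partial F_X(x)}{\partial x} \times \frac{\partial F_Y(y)}{\partial y} && \text{[using equation (3)]} \\
&= \frac{\partial^2 C(u_x, u_y)}{\partial u_x\, \partial u_y} \times f_X(x) \times f_Y(y) \\
&= c(u_x, u_y) \times f_X(x) \times f_Y(y) \\
\therefore\; f_{XY}(x, y) &= c(u_x, u_y) \times f_X(x) \times f_Y(y) \\
\Rightarrow\; c(u_x, u_y) &= \frac{f_{XY}(x, y)}{f_X(x) \times f_Y(y)}
\end{aligned}
\tag{4}
$$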
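As a quick numerical check of equation (4), the snippet below uses a case where both sides are known in closed form: a standard bivariate normal distribution, whose copula is the Gaussian copula. The correlation value and the evaluation point are arbitrary choices made for illustration.

```python
import numpy as np
from scipy.stats import norm, multivariate_normal

def gaussian_copula_density(u, v, rho):
    """Density c(u, v) of the bivariate Gaussian copula with correlation rho."""
    a, b = norm.ppf(u), norm.ppf(v)
    return np.exp(-(rho**2 * (a**2 + b**2) - 2.0 * rho * a * b)
                  / (2.0 * (1.0 - rho**2))) / np.sqrt(1.0 - rho**2)

rho = 0.6          # arbitrary illustrative correlation
x, y = 0.3, -1.1   # arbitrary evaluation point

# Product form from equation (4): c(u_x, u_y) * f_X(x) * f_Y(y).
u_x, u_y = norm.cdf(x), norm.cdf(y)
product_form = gaussian_copula_density(u_x, u_y, rho) * norm.pdf(x) * norm.pdf(y)

# Joint PDF f_XY(x, y) of the standard bivariate normal, computed directly.
joint_pdf = multivariate_normal(mean=[0.0, 0.0],
                                cov=[[1.0, rho], [rho, 1.0]]).pdf([x, y])
```

The two quantities `product_form` and `joint_pdf` agree to floating-point precision, which is what equation (4) states for this particular choice of copula and marginals.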


where $c(u_x, u_y)$ is called the copula density function obtained by partially differentiating copula function with respect to marginal CDFs. Hence, equation (4) suggests that a joint PDF between two random variables can be obtained by the use of copulas through copula density function and corresponding marginal PDFs (typical plots of $C(u_x, u_y)$ and $c(u_x, u_y)$ are presented in the supporting information Figure S1). In the current work, eight copulas which belong to Archimedean family along with Gaussian copula have been employed for the analysis. Under Archimedean family, four one-parameter copulas, namely, Clayton, Gumbel, Frank, and Joe copulas followed by four two-parameter copulas, namely, Clayton-Gumbel (ClGu), Joe-Gumbel (JoGu), Joe-Clayton (JoCl), and Joe-Frank (JoFr) copulas were utilized. Further information regarding the formulation and associated properties of these copulas is presented in the supporting information (Text S1). The copula function, as mentioned before, ties the marginals with a copula parameter(s) that stores a dependency structure between the random variables. Hence, an accurate estimation of joint CDF necessitates accurate copula parameter(s) along with marginals. In the current work, the estimation of parameters of copula is carried out using maximum likelihood estimation method [Genest et al., 2009], the information of which is briefed in the supporting information (Text S2). Furthermore, the selection of best copula among the nine copulas is carried out using Akaike information criterion (AIC) [Akaike, 1974]. It has to be noted that computation of copula density for a bivariate data set requires corresponding inputs in the form of marginal CDFs which, in the current work, are obtained using nonparametric Kernel Density Estimator (KDE).
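To make the fitting and selection step concrete, here is a minimal sketch for one of the nine candidates, the one-parameter Clayton copula. It is not the authors' implementation. The helper names, the search bounds, and the synthetic sample are assumptions, and the pseudo-observations are drawn directly from the copula rather than obtained from KDE-based marginal CDFs.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def clayton_log_density(u, v, theta):
    """Log of the Clayton copula density for theta > 0."""
    s = u**(-theta) + v**(-theta) - 1.0
    return (np.log1p(theta) - (1.0 + theta) * (np.log(u) + np.log(v))
            - (2.0 + 1.0 / theta) * np.log(s))

def sample_clayton(n, theta, rng):
    """Draw n pairs from a Clayton copula by conditional inversion."""
    u = rng.uniform(size=n)
    w = rng.uniform(size=n)
    v = ((w**(-theta / (1.0 + theta)) - 1.0) * u**(-theta) + 1.0)**(-1.0 / theta)
    return u, v

def fit_clayton(u, v):
    """Maximum likelihood estimate of theta, with log-likelihood and AIC."""
    neg_loglik = lambda t: -np.sum(clayton_log_density(u, v, t))
    # Search bounds are an illustrative assumption.
    res = minimize_scalar(neg_loglik, bounds=(1e-3, 20.0), method="bounded")
    loglik = -res.fun
    aic = 2.0 * 1 - 2.0 * loglik  # AIC = 2k - 2 log L with k = 1 parameter
    return res.x, loglik, aic

# Synthetic pseudo-observations with a known parameter (hypothetical data).
rng = np.random.default_rng(1)
u_demo, v_demo = sample_clayton(2000, 2.0, rng)
theta_hat, loglik, aic = fit_clayton(u_demo, v_demo)
```

Repeating `fit_*` for each candidate family and keeping the copula with the lowest AIC would mirror the selection step described above; two-parameter families would use k = 2 in the AIC.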

Comprehensive discussion on KDE can be found in Wand and Jones [1994] and Tsybakov [2008]. Let $\{x_i, i = 1, \ldots, n\}$ denote a sample of size n drawn from a population of random variable X with CDF $F_X$. The kernel estimator ($\hat{F}_X$) of CDF $F_X$ is given by equation (5).

$$
\hat{F}_X(x \mid H) = \frac{1}{n} \sum_{i=1}^{n} K\!\left(\frac{x - x_i}{H}\right) \tag{5}
$$

where $K(\cdot)$ is the CDF of the kernel function, which is a symmetric PDF $\kappa(\cdot)$ ($K(x) = \int_{-\infty}^{x} \kappa(y)\,dy$), with $H$ as the smoothing parameter or bandwidth. Since the choice of kernel function is not critical [Duong, 2007], the standard normal kernel function $\kappa(x) = \frac{1}{\sqrt{2\pi}} \exp(-x^2/2)$ is considered for the analysis. The parameter H is estimated based on optimum bandwidth selector suggested in Duong [2007, 2015]. The marginal CDFs estimated using equation (5) are used to compute copula density function (equation (4)) which, in turn, is used to estimate CKDEMI. The following subsection provides an outline on the concept of mutual information followed by description of proposed CKDEMI measure.
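A minimal sketch of the kernel CDF estimator with a standard normal kernel follows. The estimate is the sample average of $K((x - x_i)/H)$, which keeps it within [0, 1]. The bandwidth used here is the common normal-reference rule of thumb, an assumption adopted for brevity; it is not the Duong selector cited in the text. The gamma-distributed sample is synthetic.

```python
import numpy as np
from scipy.stats import norm

def kernel_cdf(x_eval, sample, bandwidth):
    """Kernel estimate of the CDF at x_eval using a standard normal kernel."""
    x_eval = np.atleast_1d(np.asarray(x_eval, dtype=float))
    sample = np.asarray(sample, dtype=float)
    # K((x - x_i) / H) for every evaluation point and sample point, averaged over i.
    z = (x_eval[:, None] - sample[None, :]) / bandwidth
    return norm.cdf(z).mean(axis=1)

def rule_of_thumb_bandwidth(sample):
    """Normal-reference bandwidth; a stand-in, NOT the selector cited in the text."""
    sample = np.asarray(sample, dtype=float)
    return 1.06 * sample.std(ddof=1) * sample.size ** (-1.0 / 5.0)

rng = np.random.default_rng(0)
x_demo = rng.gamma(shape=2.0, scale=1.5, size=500)  # synthetic, skewed sample
H_demo = rule_of_thumb_bandwidth(x_demo)
grid = np.linspace(x_demo.min() - 5.0, x_demo.max() + 5.0, 50)
F_demo = kernel_cdf(grid, x_demo, H_demo)
```

Evaluating `kernel_cdf` at the sample points themselves yields the values $u_x$ and $u_y$ that are passed to the copula density.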

2.2. Mutual Information

The notion of mutual information is derived from the concept of entropy which is a measure of randomness of a random variable. Originally articulated by Shannon [1948], the concept of entropy is widely applied in the field of hydrology (a review of relevant applications can be found in Singh [2013, 2015]). In the case of continuous random variables, the entropy is termed as differential entropy. Dependency between two random variables can be estimated using measures such as correlation coefficient, cross correlation, and Kendall's tau. However, these measures quantify only certain aspects of dependence such as linearity, periodicity, or monotonicity. In this context, mutual information (MI) measure has an ability to describe complete dependence between any two random variables; i.e., MI between two random variables quantifies the amount of information one can know about one variable in the presence of the other. Hence, the higher the value of MI, the lower the uncertainty in one variable explaining the other variable or vice versa. Consider two continuous random variables X (sample space $\Omega_X$) and Y (sample space $\Omega_Y$) which have PDFs $f_X(x)$ and $f_Y(y)$, respectively, along with joint PDF $f(x, y)$. In bivariate scenario, MI can be computed using the following equation.

$$
MI(X; Y) = h(X) - h(X \mid Y) = h(X) + h(Y) - h(X, Y) = \iint_{\Omega_X, \Omega_Y} f(x, y) \log \frac{f(x, y)}{f_X(x)\, f_Y(y)}\, dx\, dy \tag{6}
$$

where h(X) and h(Y) are the differential entropies of X and Y, respectively (given by $h(X) = -\int_{\Omega_X} f_X(x) \log f_X(x)\, dx$); h(X|Y) is the conditional entropy of X given Y (given by $h(X \mid Y) = -\iint_{\Omega_X \Omega_Y} f(x, y) \log f(x \mid y)\, dx\, dy$); h(X, Y) is the joint entropy of X and Y (given by $h(X, Y) = -\iint_{\Omega_X \Omega_Y} f(x, y) \log f(x, y)\, dx\, dy$).
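The entropy form of equation (6) can be checked on a case with a known answer. For a bivariate normal pair with correlation $\rho$, the differential entropies have closed forms and the mutual information is $-\tfrac{1}{2}\log(1 - \rho^2)$. The snippet below is an illustrative check under that Gaussian assumption, which is made only for this example and is not the setting of the proposed measure.

```python
import numpy as np

def gaussian_entropy(cov):
    """Differential entropy (in nats) of a normal distribution with covariance cov."""
    cov = np.atleast_2d(np.asarray(cov, dtype=float))
    d = cov.shape[0]
    return 0.5 * (d * np.log(2.0 * np.pi * np.e) + np.log(np.linalg.det(cov)))

rho = 0.7  # arbitrary illustrative correlation
cov_xy = np.array([[1.0, rho], [rho, 1.0]])

h_x = gaussian_entropy(cov_xy[0, 0])   # h(X)
h_y = gaussian_entropy(cov_xy[1, 1])   # h(Y)
h_xy = gaussian_entropy(cov_xy)        # h(X, Y)

# MI(X; Y) = h(X) + h(Y) - h(X, Y), compared with the Gaussian closed form.
mi_from_entropies = h_x + h_y - h_xy
mi_closed_form = -0.5 * np.log(1.0 - rho**2)
```

Setting `rho = 0` gives an MI of zero, the independence case discussed in the text.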


It can be observed from equation (6) that through MI one can estimate the average distance between actual joint PDF and joint PDF assuming that the variables are independent, thereby providing an estimate of how close the joint distribution is toward independence. MI is a scalar quantity with range [0, ∞), where 0 indicates complete independence between random variables. A representation of MI using a Venn diagram is presented in the supporting information (Figure S3). Mutual information, though promising, is found to have limited applications, to the best of our knowledge, in the field of hydrology. Harrold et al. [2001] have estimated MI to model dependence using kernel density estimator. Fernando et al. [2009] carried out selection of input variables required in artificial neural networks using a variant of MI. Fahle et al. [2015] used MI to achieve optimum groundwater station network. The limited usage of MI could be due to two reasons: (a) requirement of estimating marginal as well as joint PDFs which might be ambiguous and (b) lack of a test of hypothesis to check if the obtained value of MI is indeed significant. In the current work, the first issue is addressed with the use of copulas which resulted in proposed CKDEMI measure. And in order to tackle the second problem, a bootstrap simulation technique is proposed which statistically tests the null hypothesis that two random variables are independent (MI = 0). Using the components described till now, the following subsection presents proposed Copula-Kernel Density Estimator based Mutual Information (CKDEMI) measure.

2.2.1. Copula-Kernel Density Estimator Based Mutual Information Measure

Let $\{x_i, i = 1, \ldots, n\}$ and $\{y_i, i = 1, \ldots, n\}$ be two samples each of size n drawn from the populations of corresponding random variables X and Y, respectively. Consider $\hat{f}_X(x_i)$, $\hat{f}_Y(y_i)$, $\hat{f}(x_i, y_i)$, $\forall\, i = 1, \ldots, n$ to be the estimated marginal and joint PDFs of X and Y. Equation (6) can be rewritten in terms of expectation function with respect to joint PDF which turns out to be equation (7).

$$\mathrm{MI}(X;Y) = E_{\hat{f}(x,y)}\left[\log \frac{\hat{f}(x,y)}{\hat{f}_X(x)\,\hat{f}_Y(y)}\right] \tag{7}$$

Using the weak law of large numbers, the expectation in equation (7) can be approximated by an average over the available bivariate samples (equation (8)).

$$\mathrm{MI}(X;Y) \approx \frac{1}{n}\sum_{i=1}^{n} \log \frac{\hat{f}(x_i,y_i)}{\hat{f}_X(x_i)\,\hat{f}_Y(y_i)} \tag{8}$$

The derivation of the above equation from equation (6) is based on the asymptotic equipartition property of information theory [Cover and Thomas, 2006], which is a consequence of the weak law of large numbers. The most important component of equation (8) is the joint PDF. It is estimated using the copula density function (equation (4)), which leads to the following formulation. The derived Copula-Kernel Density Estimator based Mutual Information measure is shown in equation (9).

$$
\begin{aligned}
\mathrm{MI}(X;Y) &\approx \frac{1}{n}\sum_{i=1}^{n} \log \frac{\hat{f}(x_i,y_i)}{\hat{f}_X(x_i)\,\hat{f}_Y(y_i)} \\
&= \frac{1}{n}\sum_{i=1}^{n} \log \frac{c(u_{x,i},u_{y,i}) \cdot \hat{f}_X(x_i) \cdot \hat{f}_Y(y_i)}{\hat{f}_X(x_i)\,\hat{f}_Y(y_i)} \quad \text{(using equation (4))} \\
\mathrm{CKDEMI}(X;Y) &= \frac{1}{n}\sum_{i=1}^{n} \log c(u_{x,i},u_{y,i})
\end{aligned}
\tag{9}
$$

The measure primarily requires the marginal CDFs of the two random variables, which are fed to the copula density function. Note that log in equation (9) refers to the natural logarithm. The marginal CDFs are estimated using the kernel-based CDF estimator (equation (5)). CKDEMI is a simple and elegant measure which offers robustness through nonparametric computation of the joint and marginal distributions using copulas and KDE, respectively. An interesting aspect is that the form of CKDEMI is similar to that of the log likelihood function (refer to equation (S4) of the supporting information), which is a key component in parameter estimation (of copulas) and in the selection of the best performing copula (AIC criterion). This property implies that CKDEMI is inherently tied to these background processes, which in turn assist the estimation of the proposed measure.
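As an illustration of equation (9), the following is a minimal Python sketch rather than the authors' implementation. It assumes a single Gaussian copula instead of the battery of copulas with AIC-based selection, a rule-of-thumb kernel bandwidth, and a moment-type estimate of the copula parameter from normal scores; the function names (`kde_cdf`, `ckdemi_gaussian`, and so on) are hypothetical.

```python
import math
import random
from statistics import NormalDist, stdev

STD_NORMAL = NormalDist()  # standard normal: used as kernel and for normal scores


def kde_cdf(sample):
    """Gaussian-kernel CDF estimate evaluated at every sample point."""
    n = len(sample)
    # Rule-of-thumb bandwidth: an assumption, not necessarily the paper's choice.
    h = 1.06 * stdev(sample) * n ** (-0.2)
    return [sum(STD_NORMAL.cdf((x - xj) / h) for xj in sample) / n for x in sample]


def pearson(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sab = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    saa = sum((p - ma) ** 2 for p in a)
    sbb = sum((q - mb) ** 2 for q in b)
    return sab / math.sqrt(saa * sbb)


def gaussian_copula_log_density(a, b, rho):
    """log c(u, v) of the Gaussian copula, where a and b are the normal scores of u and v."""
    d = 1.0 - rho * rho
    return -0.5 * math.log(d) - (rho * rho * (a * a + b * b) - 2.0 * rho * a * b) / (2.0 * d)


def ckdemi_gaussian(x, y):
    """Equation (9) with a Gaussian copula: the sample mean of log c(u_x, u_y)."""
    a = [STD_NORMAL.inv_cdf(u) for u in kde_cdf(x)]
    b = [STD_NORMAL.inv_cdf(u) for u in kde_cdf(y)]
    # Simplified parameter estimate (assumption); the paper fits copulas by likelihood.
    rho = max(-0.999, min(0.999, pearson(a, b)))
    return sum(gaussian_copula_log_density(p, q, rho) for p, q in zip(a, b)) / len(a)


def bivariate_normal(n, rho, rng):
    """Draw n pairs from a standard bivariate normal with correlation rho."""
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [rho * x + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0) for x in xs]
    return xs, ys


rng = random.Random(0)
x_dep, y_dep = bivariate_normal(300, 0.8, rng)
x_ind, y_ind = bivariate_normal(300, 0.0, rng)
dependent_value = ckdemi_gaussian(x_dep, y_dep)
independent_value = ckdemi_gaussian(x_ind, y_ind)
print(dependent_value, independent_value)
```

Under these assumptions the strongly dependent sample should give a clearly larger value than the independent one, which is the property the validation approach relies on.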

Journal of Geophysical Research: Atmospheres 10.1002/2016JD024829


In the current work, CKDEMI is employed to quantify the accuracy of satellite soil moisture products with the aid of a precipitation data set. If two soil moisture products SM1 and SM2 need to be validated, CKDEMI is computed between precipitation and SM1 and then between precipitation and SM2. The product that gives the greater value of CKDEMI is selected as the accurate product for the location of study. During the computation of CKDEMI for each product, the marginal CDFs of precipitation and of the concerned soil moisture product are computed using the kernel-based CDF estimator and fed to the copula density functions. At this point, a battery of copulas (refer to supporting information Table S2) is initially employed, out of which the best copula is selected for the estimation of the copula density function.

There is a need to ascertain how well CKDEMI is able to quantify the dependency between two random variables with regard to sample size and the amount of actual dependence (i.e., variations from complete independence to high dependence between the random variables). For this purpose, a simulation experiment is carried out. Consider bivariate normal random variables which have a sample correlation of $\hat{\rho}$. Under this setup, Cover and Thomas [2006] derived the relationship between the theoretical MI (denoted as $\mathrm{MI}_{\mathrm{norm}}$) and $\hat{\rho}$, which is shown in equation (10).

$$\mathrm{MI}_{\mathrm{norm}} = -\frac{1}{2}\log\left(1 - \hat{\rho}^{2}\right) \tag{10}$$

To conduct the experiment, bivariate normal random variables are generated with sample size and correlation coefficient as the key inputs. Sample sizes of 10, 100, 1000, and 10,000 along with correlation coefficients of 0, 0.25, 0.5, 0.75, and 0.95 are considered for the analysis. At each of the 20 combinations of sample size and correlation, 1000 ensembles of bivariate normal random samples are generated. For each of these ensembles, CKDEMI is estimated, which is considered to be the predicted quantity of dependence. MInorm is computed using the sample correlation coefficient (equation (10)) and acts as the observed value of MI. For each ensemble across the aforementioned combinations, CKDEMI is estimated using a battery of copulas (refer to supporting information Table S2) and the kernel-based CDF estimator. If the CKDEMI estimates closely follow their true counterparts (MInorm), it can be concluded that the CKDEMI estimates are reliable and accurate. The scatterplots of the ensembles for each combination, with MInorm and CKDEMI along the horizontal and vertical axes, respectively, are presented in Figure 2. The performance metrics (R², percent R bias) of the simulations with respect to the corresponding observed values are computed for each combination of correlation and sample size and are also presented in Figure 2 (the numbers in the top left box and bottom right box are the R² and percent R bias metrics, respectively). It can be observed from the figure that CKDEMI has estimated the dependencies with reasonable accuracy, although some overestimation took place when the sample size is small. As the sample size increases, CKDEMI becomes more accurate in estimating the dependencies, which demonstrates that the measure is robust in estimating the overall dependency between the two random variables. Under a similar experimental setup, CKDEMI is compared with five existing measures of MI, and it is found that CKDEMI outperformed its counterparts under all scenarios of sample size and correlation (results are presented in the supporting information Text S3).

Figure 2. Scatterplots with observed mutual information (MInorm) on the horizontal axis and simulated value (CKDEMI) on the vertical axis, prepared for each of the combinations of correlation (R) and sample size (Size), wherein R² and percent R bias metrics have been estimated between observed and simulated values of MI. The numbers in the top left box and bottom right box of each scatterplot indicate R² and percent R bias, respectively.
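The observed side of the simulation experiment can be sketched in Python as follows. This is only an illustrative sketch: it evaluates equation (10) on the sample correlation of generated bivariate normal samples and enumerates the 20 combinations of sample size and correlation. The function names and the small ensemble count in the demonstration are assumptions made here for brevity; the paper uses 1000 ensembles per combination.

```python
import math
import random


def mi_norm(rho):
    """Equation (10): theoretical MI of a bivariate normal with correlation rho."""
    return -0.5 * math.log(1.0 - rho * rho)


def sample_correlation(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def bivariate_normal(n, rho, rng):
    """Draw n pairs from a standard bivariate normal with population correlation rho."""
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [rho * x + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0) for x in xs]
    return xs, ys


SAMPLE_SIZES = (10, 100, 1000, 10000)
CORRELATIONS = (0.0, 0.25, 0.5, 0.75, 0.95)
grid = [(n, rho) for n in SAMPLE_SIZES for rho in CORRELATIONS]  # 20 combinations


def observed_mi(n, rho, ensembles, rng):
    """MInorm from the sample correlation of each generated ensemble."""
    values = []
    for _ in range(ensembles):
        x, y = bivariate_normal(n, rho, rng)
        values.append(mi_norm(sample_correlation(x, y)))
    return values


rng = random.Random(1)
demo = observed_mi(100, 0.5, 20, rng)  # 20 ensembles only, to keep the demo fast
print(len(grid), mi_norm(0.5), sum(demo) / len(demo))
```

For reference, equation (10) gives approximately 0.032, 0.144, 0.413, and 1.164 for correlations of 0.25, 0.5, 0.75, and 0.95, respectively, and exactly 0 for zero correlation.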

In the process of comparing two soil moisture products (using time- and space-coherent precipitation data), as mentioned earlier, the product with the greater value of the CKDEMI measure is selected for that particular location. Following the product selection, there is a need to check whether the obtained value is statistically significant, i.e., whether the two random variables exhibit statistically significant dependence. To the best of our knowledge, in the case of continuous random variables, there are no statistical tests to verify this hypothesis using mutual information. In an attempt to verify our initial hypothesis that there might exist a few locations wherein soil moisture retrievals may not be accurate with regard to the corresponding precipitation observations, a bootstrap simulation procedure is framed in the current work following the principles outlined by Efron and Tibshirani [1994], which is presented in the following subsection.

2.2.2. Bootstrap Test of Independence Between Two Random Variables Using CKDEMI (BTICKDEMI)

Let $x = \{x_i, i = 1, \ldots, n\}$ and $y = \{y_i, i = 1, \ldots, n\}$ be two samples, each of size n, drawn from the populations of the corresponding random variables X and Y, respectively. Let Z and ξ be the size of each bootstrap sample and the number of bootstrap simulations, respectively. Once CKDEMI is computed between X and Y, this procedure tests the significance of CKDEMI at the $(1 - \alpha)$ confidence level under the following null ($H_0$) and alternate ($H_a$) hypotheses.

$H_0$: X and Y are independent (CKDEMI = 0)
$H_a$: $\neg H_0$ (CKDEMI > 0)

1. Draw Z values randomly from x with replacement. Let this be X′.
2. Draw Z values randomly from y with replacement. Let this be Y′.
3. Compute CKDEMI between X′ and Y′. Let it be $\mathrm{CKDEMI}_i$ (where $1 \le i \le \xi$).
4. Repeat steps 1–3 ξ times. This step yields $\widehat{\mathrm{CKDEMI}} = \{\mathrm{CKDEMI}_i : i = 1, \ldots, \xi\}$.
5. Order the values of $\widehat{\mathrm{CKDEMI}}$ obtained from step 4 such that $\mathrm{CKDEMI}^{*}_{1} \le \ldots \le \mathrm{CKDEMI}^{*}_{\xi}$ (superscript * indicates the ordered sample). The critical value $\mathrm{CKDEMI}_{\mathrm{critical}}$ at the $(1 - \alpha)$ confidence level is estimated from the ordered sample using the following equation.

$$\mathrm{CKDEMI}_{\mathrm{critical}} = \mathrm{CKDEMI}^{*}_{\xi(1-\alpha)} \tag{11}$$

6. If CKDEMI > $\mathrm{CKDEMI}_{\mathrm{critical}}$, the null hypothesis of CKDEMI = 0 can be rejected at the $(1 - \alpha)$ confidence level. Else, failure to reject the null hypothesis indicates that X and Y are independent.
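The six steps above can be sketched in Python as follows. This is a hedged sketch, not the authors' code: the dependence measure is passed in as a callable, and the stand-in used in the demonstration (`gaussian_mi`, equation (10) applied to the sample correlation) is an assumption chosen only to keep the example short; in the paper the measure is CKDEMI. Rounding ξ(1 − α) to the nearest integer when it is not a whole number is also an assumption.

```python
import math
import random


def gaussian_mi(x, y):
    """Stand-in dependence measure (assumption): equation (10) on the sample correlation."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0  # degenerate resample with no variability
    r2 = min(sxy * sxy / (sxx * syy), 0.999999)
    return -0.5 * math.log(1.0 - r2)


def bootstrap_independence_test(x, y, measure, z, xi, alpha=0.05, rng=None):
    """Return (observed, critical, reject_H0) following steps 1-6."""
    if rng is None:
        rng = random.Random()
    observed = measure(x, y)
    null_values = []
    for _ in range(xi):                      # step 4: repeat xi times
        x_star = rng.choices(x, k=z)         # step 1: resample x with replacement
        y_star = rng.choices(y, k=z)         # step 2: resample y separately, breaking the pairing
        null_values.append(measure(x_star, y_star))  # step 3
    null_values.sort()                       # step 5: ordered sample
    rank = min(xi, max(1, round(xi * (1.0 - alpha))))  # 1-based rank xi * (1 - alpha)
    critical = null_values[rank - 1]         # equation (11)
    return observed, critical, observed > critical     # step 6


rng = random.Random(42)
x = [rng.gauss(0.0, 1.0) for _ in range(200)]
y = [0.9 * v + math.sqrt(1.0 - 0.81) * rng.gauss(0.0, 1.0) for v in x]
observed, critical, reject = bootstrap_independence_test(
    x, y, gaussian_mi, z=200, xi=100, rng=rng
)
print(observed, critical, reject)
```

Because x and y are resampled separately, each bootstrap pair mimics independence, so the ordered values approximate the distribution of the measure under the null hypothesis.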

The flowchart of the BTICKDEMI procedure is presented in Figure 3. The notations of parameters and variables in the figure are consistent with the aforementioned algorithm. Furthermore, there is a need to check the validity of the formulated method, since this test plays a crucial role in validating the soil moisture products. The method is verified using a synthetic case study. It is known that if two random variables which are jointly normally distributed carry a zero correlation between them, then the random variables are completely independent [Montgomery and Runger, 2010]. Using this property, if the hypothesis test is carried out over an arbitrary bivariate normal sample of correlation 0 to check if significant dependency exists between the variables, it is expected that there will be a failure to reject the null hypothesis (that they are independent). So, initially, random samples of size 1000 and population correlation parameter 0 are generated from a bivariate normal distribution. This sample is subjected to the BTICKDEMI testing procedure and also tested in parallel for significance of the sample correlation (which follows approximately a t distribution). This experiment is repeated 1000 times. The result, as mentioned before, is expected to show that no dependency exists between the two variables. It is observed that out of 1000 instances, the standard test for significance of correlation failed to reject the null hypothesis 945 times, whereas BTICKDEMI, which uses the proposed CKDEMI measure, failed to reject the null hypothesis 959 times; i.e., BTICKDEMI came up with the right result in a greater number of instances than the traditional correlation test. Although a failure to reject the null hypothesis is expected in all the instances, sampling effects might have played some role in the few remaining cases, and these effects are observed to have less influence on the CKDEMI-based BTICKDEMI testing procedure than on standard correlation analysis. The output from the above analysis indicates the accuracy of CKDEMI and the strength of the BTICKDEMI procedure in testing for independence.

Following the BTICKDEMI analysis, it is also hypothesized that there could be locations wherein the two soil moisture products fare equally well with respect to the precipitation data set. Once CKDEMI is obtained for the two soil moisture products under study, if the respective values of CKDEMI are statistically close to each other, then it can be concluded that both soil moisture products have exhibited the characteristic E[P|SM] curve. The equality of the concerned CKDEMI values is tested using a separate bootstrap technique, the algorithm of which is summarized in the following subsection. Since the selection of soil moisture products is primarily carried out by comparing the magnitudes of independently obtained CKDEMI values, this testing procedure addresses the situation wherein the two products at a location result in CKDEMI values which are very close to each other, but only one of the products has been selected because its CKDEMI is only marginally greater than that of its competitor.

2.2.3. Bootstrap Test for Checking the Equality of Two Soil Moisture Products Through CKDEMI (BTECKDEMI)

Consider $\mathrm{SM1} = \{\mathrm{SM}_{1,i} : i = 1, \ldots, n\}$, $\mathrm{SM2} = \{\mathrm{SM}_{2,i} : i = 1, \ldots, n\}$, and $P = \{P_i : i = 1, \ldots, n\}$ to be two soil moisture products and precipitation data, respectively, each of size n, drawn from the respective populations of the random variables SM1, SM2, and P. Let Z and ξ be the size of each bootstrap sample and the number of bootstrap simulations, respectively. The following procedure tests the equality in performance of SM1 and SM2 at the $(1 - \alpha)$ confidence level.

1. Collocate SM1 and P to form Λ, an n × 2 matrix with elements $\Lambda_{i,j}$, $i = 1, \ldots, n$, $j = 1, 2$.
2. Collocate SM2 and P to form Ω, an n × 2 matrix with elements $\Omega_{i,j}$, $i = 1, \ldots, n$, $j = 1, 2$.
3. Draw Z pairs randomly from Λ with replacement. Let the output be Λ′. Note that Λ′ is a Z × 2 matrix with columns 1 and 2 containing values corresponding to the random variables SM1 and P, respectively. In this way, the bivariate pairs are kept intact even after random sampling.
4. Similar to step 3, draw Z pairs randomly from Ω with replacement. Let the output be Ω′.
5. Compute CKDEMI between the first and second columns of Λ′. Let it be $\mathrm{CKDEMI}_{\Lambda}$.
6. Repeat steps 1–5 ξ times. This step yields $\widehat{\mathrm{CKDEMI}}_{\Lambda} = \{\mathrm{CKDEMI}_{\Lambda,\upsilon} : \upsilon = 1, \ldots, \xi\}$. These values indicate the range in which CKDEMI can lie due to uncertainties in the input data.
7. Compute CKDEMI between the first and second columns of Ω′. Let it be $\mathrm{CKDEMI}_{\Omega}$.
8. Repeat steps 1–4 and 7 ξ times. This step yields $\widehat{\mathrm{CKDEMI}}_{\Omega} = \{\mathrm{CKDEMI}_{\Omega,\upsilon} : \upsilon = 1, \ldots, \xi\}$.
9. Conduct a two-tailed two-sample t test to check for equality of population means (μ) between $\widehat{\mathrm{CKDEMI}}_{\Lambda}$ and $\widehat{\mathrm{CKDEMI}}_{\Omega}$ at the $(1 - \alpha)$ confidence level with the following null and alternate hypotheses.

$$H_0 : \mu_{\widehat{\mathrm{CKDEMI}}_{\Lambda}} = \mu_{\widehat{\mathrm{CKDEMI}}_{\Omega}}$$

$$H_a : \mu_{\widehat{\mathrm{CKDEMI}}_{\Lambda}} \neq \mu_{\widehat{\mathrm{CKDEMI}}_{\Omega}}$$

10. If the null hypothesis fails to be rejected, it can be concluded that the two soil moisture products SM1 and SM2 have performed equally well with regard to the time- and space-coherent precipitation data set. Else, select the product which has the greater value of CKDEMI as the most accurate soil moisture product.

Figure 3. Flowchart of BTICKDEMI: bootstrap test of independence between two random variables using CKDEMI.
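The BTECKDEMI steps can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the dependence measure is a callable (the demonstration uses a hypothetical stand-in, `gaussian_mi`, instead of CKDEMI), and the two-sample t statistic is compared against a standard normal critical value, which is a large-sample approximation to the t test of step 9 rather than the exact test.

```python
import math
import random
from statistics import NormalDist, mean, variance


def gaussian_mi(x, y):
    """Stand-in dependence measure (assumption): equation (10) on the sample correlation."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0  # degenerate resample with no variability
    r2 = min(sxy * sxy / (sxx * syy), 0.999999)
    return -0.5 * math.log(1.0 - r2)


def paired_bootstrap(sm, p, measure, z, xi, rng):
    """Resample (SM, P) rows jointly xi times and evaluate the measure each time."""
    n = len(sm)
    values = []
    for _ in range(xi):
        rows = [rng.randrange(n) for _ in range(z)]  # same rows for SM and P keep pairs intact
        values.append(measure([sm[i] for i in rows], [p[i] for i in rows]))
    return values


def equality_test(sm1, sm2, p, measure, z, xi, alpha=0.05, rng=None):
    """Return (t_stat, equal); equal is True when H0 of equal means is not rejected."""
    if rng is None:
        rng = random.Random()
    a = paired_bootstrap(sm1, p, measure, z, xi, rng)   # bootstrap values for Lambda
    b = paired_bootstrap(sm2, p, measure, z, xi, rng)   # bootstrap values for Omega
    standard_error = math.sqrt(variance(a) / xi + variance(b) / xi)
    t_stat = (mean(a) - mean(b)) / standard_error
    # Normal critical value: large-sample approximation to the two-tailed t test.
    critical = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return t_stat, abs(t_stat) <= critical


rng = random.Random(7)
p = [rng.gauss(0.0, 1.0) for _ in range(200)]
sm1 = [0.9 * v + math.sqrt(1.0 - 0.81) * rng.gauss(0.0, 1.0) for v in p]
sm2 = [0.2 * v + math.sqrt(1.0 - 0.04) * rng.gauss(0.0, 1.0) for v in p]
t_stat, equal = equality_test(sm1, sm2, p, gaussian_mi, z=200, xi=100, rng=rng)
print(t_stat, equal)
```

In this synthetic example the first series is far more strongly tied to p than the second, so the two sets of bootstrap values should be clearly separated and the hypothesis of equal means rejected.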

The flowchart of the BTECKDEMI procedure is presented in Figure 4. With the analyses described in the above subsections, two soil moisture products over an area can be compared to decide (a) which one of the two products is accurate (section 2.2.1), (b) whether the selected product has indeed exhibited significant dependency on precipitation (section 2.2.2), and (c) whether the two products perform comparably well with respect to precipitation (section 2.2.3). Blending these analyses, the following subsection explains the algorithm proposed in the current work and presents its flowchart.

2.3. Proposed Algorithm

This section presents the validation methodology applied in the current work, which synthesizes the concepts and algorithms presented in the previous subsections. The method validates two soil moisture products, denoted by SM1 and SM2, using precipitation data P. All three data sets need to be time- and space-coherent with equal lengths of record for the method to be applicable. After preparing the data sets appropriately for a location, the following algorithm is applied to select the accurate soil moisture product for that location.

1. Consider P and SM1. Compute their respective marginals $u_P$ and $u_{\mathrm{SM1}}$ (equation (5), using the kernel-based CDF estimator described in section 2.1).
2. Estimate the copula density function of the marginals computed in step 1, $c(u_P, u_{\mathrm{SM1}})$ (using the battery of copulas with the parameter estimation and copula selection criteria presented in section 2.1).
3. Estimate $\mathrm{CKDEMI}_{\mathrm{SM1}}$ from the output obtained in step 2 using equation (9).
4. Repeat steps 1–3 using P and SM2 to estimate $\mathrm{CKDEMI}_{\mathrm{SM2}}$.
5. If $\mathrm{CKDEMI}_{\mathrm{SM1}} > \mathrm{CKDEMI}_{\mathrm{SM2}}$, select the SM1 product and apply the bootstrap test of independence between two random variables using CKDEMI (BTICKDEMI; section 2.2.2) over the P and SM1 data sets to check if the value of $\mathrm{CKDEMI}_{\mathrm{SM1}}$ is statistically significant. If the obtained value is found to be insignificant from the test (i.e., $\mathrm{CKDEMI}_{\mathrm{SM1}} = 0$), it indicates that SM1 has not produced accurate retrieval for the location. Else, proceed to step 7.
6. If $\mathrm{CKDEMI}_{\mathrm{SM2}} > \mathrm{CKDEMI}_{\mathrm{SM1}}$, select the SM2 product and apply BTICKDEMI over the P and SM2 data sets to check if the value of $\mathrm{CKDEMI}_{\mathrm{SM2}}$ is statistically significant. If the obtained value is found to be insignificant from the test (i.e., $\mathrm{CKDEMI}_{\mathrm{SM2}} = 0$), it indicates that SM2 has not produced accurate retrieval for the location. Else, proceed to step 7.
7. Check if both SM1 and SM2 have retrieved comparably accurate soil moisture observations for the location. To check for this aspect, apply the bootstrap test for checking the equality of two soil moisture products through CKDEMI (BTECKDEMI; section 2.2.3). If $\mathrm{CKDEMI}_{\mathrm{SM1}} = \mathrm{CKDEMI}_{\mathrm{SM2}}$, it indicates that both SM1 and SM2 have produced comparably accurate soil moisture retrievals. Else, proceed to step 8.
8. If $\mathrm{CKDEMI}_{\mathrm{SM1}} > \mathrm{CKDEMI}_{\mathrm{SM2}}$, it indicates that SM1 is the more accurately retrieved soil moisture product for the location. Else, it indicates that SM2 is the more accurately retrieved soil moisture product for the location.

Figure 4. Flowchart of BTECKDEMI: bootstrap test for checking the equality of two soil moisture products through CKDEMI.

The flowchart of the algorithm is presented in Figure 5.
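The decision logic of steps 5–8 can be summarized in a short Python sketch. The function name, the callable arguments, and the returned labels are hypothetical; the two callables stand for the outcomes of the BTICKDEMI and BTECKDEMI tests, which are assumed to be computed elsewhere. The algorithm does not state what happens when the two CKDEMI values are exactly equal, so the tie handling below (SM2 is examined) is an assumption.

```python
def best_choice(ckdemi_sm1, ckdemi_sm2, is_significant, are_equal):
    """Return 'SM1', 'SM2', 'both', or 'none' for one grid location.

    is_significant(name) -> bool : BTICKDEMI outcome for the named product.
    are_equal() -> bool          : True when BTECKDEMI fails to reject equality.
    """
    # Steps 5 and 6: provisionally select the product with the larger CKDEMI.
    # An exact tie falls through to SM2 (assumption, not specified in the text).
    selected = "SM1" if ckdemi_sm1 > ckdemi_sm2 else "SM2"
    if not is_significant(selected):
        return "none"   # dependence on precipitation is not significant
    # Step 7: check whether both products perform comparably well.
    if are_equal():
        return "both"
    # Step 8: keep the product with the larger CKDEMI.
    return selected


# Usage with hard-coded test outcomes, for illustration only.
print(best_choice(0.40, 0.10, lambda name: True, lambda: False))  # SM1
print(best_choice(0.10, 0.40, lambda name: False, lambda: True))  # none
print(best_choice(0.40, 0.38, lambda name: True, lambda: True))   # both
```

The four labels correspond to the four possible outcomes at a grid: one product selected, the other selected, neither accurate, or both comparably accurate.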

3. Data

The validation algorithm presented in the previous section is tested to check the performance of two soil moisture products over India obtained from the AMSR-E brightness temperature data. AMSR-E is a passive radiometer sensor onboard the polar-orbiting Aqua satellite (under operation from 19 June 2002 to 3 October 2011). One soil moisture product is obtained through the retrieval algorithm proposed by Owe et al. [2008], developed jointly by Vrije Universiteit Amsterdam and NASA (National Aeronautics and Space Administration), hereafter referred to as VUAN. The other soil moisture product is obtained based on the retrieval algorithm proposed by Jones et al. [2010], developed at the University of Montana [Jones and Kimball, 2011], hereafter referred to as MONT. Both VUAN and MONT soil moisture products are level 3 data sets gridded globally at 0.25° × 0.25° spatial resolution, concurrently available at daily scale from 19 June 2002 to 27 September 2011 (a total of 3386 days). The VUAN product utilizes AMSR-E's X band (10.65 GHz) brightness temperatures and the Land Parameter Retrieval Model (LPRM) algorithm for soil moisture retrievals. MONT, on the other hand, is obtained using C band (6.925 GHz) brightness temperature data, which are replaced with X band-based retrievals in case the C band data are contaminated by radio frequency interference.

Figure 5. Flowchart of the proposed algorithm.


The precipitation data used to validate the VUAN and MONT products (according to the flowchart in Figure 5) are obtained from the India Meteorological Department (hereafter referred to as IMD) data set developed by Pai et al. [2014]. This data set is available over India at 0.25° × 0.25° spatial resolution, spanning a period of 113 years from 1901 to 2013 at a daily scale. The data preparation involved 6995 rain gauge stations spread across the country, which were subjected to the necessary quality control before their usage for gridding. Due to the high density of rain gauge stations involved, coupled with the quality checking procedures, it is assumed that these precipitation data are accurate enough for the current analysis with no need for corrections. The precipitation data pertaining to the time period for which the VUAN and MONT products are available, i.e., 19 June 2002 to 27 September 2011, are utilized for the present analysis.

In order to achieve spatial consistency across the three data sets, the grid system of IMD is considered as a reference, with respect to which the MONT and VUAN data sets have been resampled using the inverse distance weighted averaging technique. Care has been taken to carry out the resampling from the closest grids, which resulted in sensible values with minimal loss of data. Each of the two soil moisture products has two data sets, one corresponding to the ascending pass and the other to the descending pass. For AMSR-E, the ascending and descending passes cross the equator at approximately 1:30 P.M. and 1:30 A.M., respectively. The soil moisture retrieval may vary according to the pass due to changes in climate as well as soil conditions. Hence, in order to estimate the total amount of soil moisture recorded in a day, the average of the ascending and descending passes' retrievals has been computed and utilized, with the assumption that the resultant soil moisture corresponds to the amount of precipitation recorded by IMD on a particular day. This step is carried out after resampling, for each grid, temporally across both soil moisture products VUAN and MONT.
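The two preparation steps described above can be sketched in Python as follows. This is a minimal sketch under assumptions that the text does not specify: distances are Euclidean in degrees, the weighting power is 2, at most four nearest grid centers are used, a missing value is represented by `None`, and a daily mean is treated as missing when either pass is missing. The function names are hypothetical.

```python
import math


def idw(target, centers, values, power=2.0, k=4):
    """Inverse distance weighted average of the k nearest source grid centers.

    target and centers are (lon, lat) tuples; power and k are assumed settings.
    """
    nearest = sorted(zip((math.dist(target, c) for c in centers), values))[:k]
    if nearest[0][0] == 0.0:
        return nearest[0][1]          # target coincides with a source grid center
    weights = [d ** -power for d, _ in nearest]
    return sum(w * v for w, (_, v) in zip(weights, nearest)) / sum(weights)


def daily_mean(ascending, descending):
    """Average of ascending and descending pass retrievals, day by day.

    Days on which either pass is missing (None) are kept as missing (assumption).
    """
    return [
        None if a is None or d is None else 0.5 * (a + d)
        for a, d in zip(ascending, descending)
    ]


# A target midway between two source grid centers receives their plain average.
midpoint_value = idw((0.5, 0.0), [(0.0, 0.0), (1.0, 0.0)], [1.0, 3.0])
means = daily_mean([0.2, None, 0.1], [0.4, 0.3, 0.3])
print(midpoint_value, means)
```

Restricting the average to the nearest grid centers mirrors the statement that resampling was carried out from the closest grids, although the exact neighborhood used is not given.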

Subsequently, missing data across the three data sets (MONT, VUAN, and IMD) are filtered out at each grid, such that if a value does not exist in even one of the three data sets on a particular date, that date is eliminated for that grid. With this step, the data sets obtained are ensured to be of equal length (which is a prerequisite for the current analysis) with temporal and spatial coherence. Data filling (for missing data) is not adopted in the current work because the intention is to test the soil moisture retrievals with minimal corrections, which otherwise can contribute additional uncertainty to the final output. Finally, a total of 4632 grids, with each grid containing three data sets (MONT, VUAN, and IMD), have been obtained for the analysis. Figure 6 presents the daily mean values of the IMD, VUAN, and MONT products.
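The date-wise filtering described above can be sketched for a single grid as follows. The representation of missing values as `None`, the function name, and the example values are assumptions made for illustration.

```python
def drop_incomplete_dates(dates, *series):
    """Keep only the dates on which every series has a value (None marks missing)."""
    keep = [i for i in range(len(dates)) if all(s[i] is not None for s in series)]
    kept_dates = [dates[i] for i in keep]
    kept_series = [[s[i] for i in keep] for s in series]
    return kept_dates, kept_series


# Hypothetical values for one grid over four days.
dates = ["2002-06-19", "2002-06-20", "2002-06-21", "2002-06-22"]
imd = [0.0, 12.5, None, 3.1]
vuan = [0.12, 0.25, 0.22, None]
mont = [0.10, 0.18, 0.17, 0.15]

kept_dates, (imd_f, vuan_f, mont_f) = drop_incomplete_dates(dates, imd, vuan, mont)
print(kept_dates, imd_f, vuan_f, mont_f)
```

After filtering, the three series have equal length and share the same dates, which is the prerequisite stated for the analysis.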

It can be observed from the mean daily precipitation map (Figure 6a) that the general characteristics of the Indian precipitation pattern are faithfully displayed, e.g., high precipitation along the southwest coast (Western Ghats) and the northeastern parts of India, and scarce precipitation in Rajasthan (northwest) and south central India (which is observed to be drought prone). The eastern coastal region is observed to have recorded moderate daily precipitation. Corresponding to the same period of precipitation records, spatially coherent maps of the daily mean VUAN and MONT products are plotted in Figures 6b and 6c. It is observed that both products have in general replicated the spatial characteristics that correspond to the precipitation patterns. In the case of the VUAN product, soil moisture is observed to be on the higher side in the regions of Jammu and Kashmir state, northeast India, and the southwest coast. The MONT product, on the other hand, is observed to retrieve low soil moisture in the northeastern stretches, leading to the speculation that the MONT product may not perform well in this region. Additionally, in the case of Rajasthan, which is subjected to low precipitation, it is seen that both the VUAN and MONT products have retrieved soil moisture values of lower magnitude. In the region of south central India (along the states of Madhya Pradesh, Maharashtra, and Karnataka), the VUAN product exhibits comparable soil moisture conditions in the presence of the scanty precipitation in the region. Furthermore, in the stretches of Himachal Pradesh and Uttarakhand, the MONT product retrieves very low soil moisture, in contrast to the moderate precipitation conditions prevalent in the region. This precipitation characteristic is visually observed to be correctly reflected by the VUAN product, leading to the supposition that this product might have retrieved soil moisture with relatively better accuracy. Finally, one peculiar observation has been made in the case of the MONT product. In the region of Delhi, the MONT product is seen to exhibit high mean soil moisture retrievals (shown as blue pixels), in contrast to the low precipitation conditions observed in this area. Due to this behavior, it is expected that the current analysis may not select the MONT product for this region. A further analysis has to be carried out using the proposed methodology to gain further insights regarding the performance of the two soil moisture products across the country. The following section presents the results obtained along with relevant discussion.

Figure 6. Daily mean maps of (a) IMD precipitation (in mm/day), (b) VUAN soil moisture product (m³/m³), and (c) MONT soil moisture product (m³/m³).

4. Results and Discussion

The proposed algorithm discussed in section 2.3 is employed to validate the performance of the VUAN and MONT products over India. As mentioned before, the methodology, if applied over a grid containing the three data sets (VUAN, MONT, and IMD), will yield one of the following results: (a) the VUAN product is selected, (b) the MONT product is selected, (c) none of the products made an accurate retrieval, or (d) both the VUAN and MONT products have produced accurate retrievals. Although it is observed that there could be some overestimation by CKDEMI under low sample sizes (Figure 2), it is assumed that its effect is minimal when comparing the two soil moisture products with the corresponding precipitation data, provided the length of data remains the same. The analysis is carried out over 4632 grids encompassing India, and a best choice soil moisture product map is obtained. The bootstrap simulation parameters Z and ξ in both the BTICKDEMI and BTECKDEMI algorithms are fixed as 1000 and 300, respectively. Also, the statistical testing in the two procedures is carried out at the 95% confidence level (BTICKDEMI: section 2.2.2, step 5; BTECKDEMI: section 2.2.3, step 9).

Figure 7 presents the best choice soil moisture product map plotted along with the associated values of CKDEMI at each grid over India. It is observed that out of 4632 grids, the methodology could not be applied for 99 grids (~2%; shown as NaN in Figure 7a) due to lack of data after filtering. These grids are primarily located in the snow-fed region of Jammu and Kashmir. Among the rest of the grids (4533), approximately 60% (2734) selected the VUAN product, 32% (1470) selected the MONT product, 88 grid locations are found to have retrievals which may be inaccurate, and 9% of the grids had both products performing equally well.

Figure 7. (a) Best choice soil moisture product map of India. (b) Gridwise CKDEMI values corresponding to selected soil moisture products. (In the legend of Figure 7a, "Independent" indicates that none of the products got selected in that particular location, and "NaN" indicates grids where the methodology could not be applied due to lack of data.)

The results indicate the superior performance of the VUAN data set in retrieving accurate soil moisture observations over India. The VUAN product is seen to be predominantly selected in the central India region along with the state of Gujarat. It is also observed to be dominant in the regions of the southwest coast and the northeast, which are the heavy rainfall regions of the country. In the high elevation stretches of Jammu and Kashmir, apart from snow-laden areas wherein lack of data constrained the computation of CKDEMI, the grids selected a mixture of the VUAN and MONT products. The MONT product, on the other hand, is observed to prevail in the upper reaches of the agriculturally intense Ganges basin along the states of Punjab, Haryana, Uttar Pradesh, Bihar, and West Bengal. This product is also observed to be selected in the states of Chhattisgarh and Orissa and in the areas of south central India along with much of the Tamil Nadu region. Incidentally, these regions are observed to receive moderate mean daily precipitation of around 2.45 mm. In light of these results, it can be said that the VUAN product is predominant in the regions of heavy precipitation, whereas the MONT product performed well under moderate precipitation conditions. This phenomenon could be attributed to the fact that the VUAN product's soil moisture is based on X band brightness temperature observations, which have a smaller sensing depth than the MONT product's C band-based soil moisture data. This makes the former product more responsive to precipitation dynamics than the latter, resulting in better performance of VUAN under heavy precipitation conditions. Apart from the grids where either of these products got selected, it is observed that there exist a few grids wherein none of the soil moisture products is found to be accurate (through the BTICKDEMI procedure); these are clustered in the regions of Rajasthan and Mizoram, both of which are characterized by low-precipitation conditions. In the case of Delhi, as speculated previously, the VUAN product got selected. This indicates the strength of the proposed analysis in filtering out anomalous retrievals. Furthermore, the grids where both VUAN and MONT are accurate (obtained from the BTECKDEMI procedure) are spatially scattered across India, which calls for further analysis to gain a deeper understanding in this regard. It is observed from Figure 7a that the product selection has taken place mostly in clusters, suggesting that there could be an underlying physical pattern impacting the product selection.

Hence, further attempts have been made to link the results obtained with physiographical characteristics. It is known that soil moisture retrievals depend on satellite brightness temperature observations, which in turn are influenced by factors on the ground such as vegetation (the denser the vegetation, the lower the penetration capability of passive microwave frequencies ≤ 11 GHz) and surface elevation. Although the quantity of biomass is known to influence the final soil moisture output, it may not be possible at this stage to integrate seasonally varying biomass information to assess its impact on the best choice soil moisture product map (Figure 7a). Hence, the best choice soil moisture product map is integrated with a Land Use Land Cover (LULC) map (which may not exhibit drastic changes over a 10 year period) and also with an elevation map. The LULC map was recently developed at 0.5 km spatial resolution by the United States Geological Survey Land Cover Institute [Broxton et al., 2014]. The classification is based on 10 years (2001–2010, which nearly coincides with the current analysis time period) of collection 5.1 MCD12Q1 type land cover data. The categorization has been carried out into 17 land cover types. The elevation map, on the other hand, is obtained from the Shuttle Radar Topography Mission (SRTM) Digital Elevation Model (DEM), which provides elevation above mean sea level globally at 90 m spatial resolution [Jarvis et al., 2008]. The procured LULC and elevation maps are resampled to 0.25° × 0.25° spatial resolution to match the grid system of the current analysis and are presented in Figure 8. The details regarding land cover types are provided in the legend of Figure 8a.

Journal of Geophysical Research: Atmospheres 10.1002/2016JD024829

The maps clearly depict the underlying pattern with which the product selection is carried out to obtain the best choice soil moisture product map. For example, the ridge between low-elevation croplands and high-elevation savannas, woody savannas, and grasslands formed along the borders of Uttar Pradesh, Haryana and Punjab, and Uttarakhand and Himachal Pradesh is clearly observed in the best choice soil moisture product map, wherein the MONT product was selected in the former and VUAN in the latter conditions of land cover and elevation. Also, a few grids in the state of Mizoram where none of the products got selected are found to be categorized as evergreen broadleaf forest. Since it is known that dense vegetation attenuates the microwave emission from the soil surface, which might degrade the quality of soil moisture observations at that location, the land cover coupled with poor precipitation conditions might have degraded the E[P|SM] curve, leading to none of the products getting selected. Although such an observation is limited to a few grids, it could be an important characteristic correctly picked up by the current methodology. In the case of the Rajasthan region, the locations where neither of the two products got selected mostly pertain to the barren or sparsely vegetated category with elevation below 325 m. It is observed that in these grid locations, the E[P|SM] curve has a negative slope (not presented in the manuscript) in both VUAN and MONT products for soil moisture values on the “drier” side, indicating an unexpected decrease in soil moisture observations for an increase in precipitation. Since the grids of this region are expected to receive mostly low precipitation (Figure 6a), the negative slope might have played some role in rejecting both soil moisture products.
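
The negative dry-side slope of the E[P|SM] curve described above can be checked with a short sketch. It assumes the conditional expectation is approximated by averaging precipitation within equal-count soil moisture bins, which is a simplification chosen for illustration and not necessarily the estimator used in this work. The function names and the synthetic values are hypothetical.

```python
def conditional_mean_curve(sm, precip, n_bins):
    """Return (mean SM, mean P) for each equal-count bin of soil moisture."""
    pairs = sorted(zip(sm, precip))
    size = len(pairs) // n_bins
    curve = []
    for b in range(n_bins):
        start = b * size
        # The last bin absorbs any leftover samples.
        stop = (b + 1) * size if b < n_bins - 1 else len(pairs)
        chunk = pairs[start:stop]
        sm_mean = sum(s for s, _ in chunk) / len(chunk)
        p_mean = sum(p for _, p in chunk) / len(chunk)
        curve.append((sm_mean, p_mean))
    return curve


def dry_side_slope(curve):
    """Slope of the binned E[P|SM] curve between the two driest bins."""
    (sm0, p0), (sm1, p1) = curve[0], curve[1]
    return (p1 - p0) / (sm1 - sm0)


# Synthetic series: precipitation falls as soil moisture rises on the dry side.
sm = [0.05, 0.06, 0.07, 0.08, 0.20, 0.22, 0.24, 0.26]
precip = [4.0, 3.0, 2.0, 1.0, 5.0, 6.0, 7.0, 8.0]

curve = conditional_mean_curve(sm, precip, n_bins=4)
slope = dry_side_slope(curve)
print(curve)
print("dry-side slope is negative:", slope < 0)
```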

To quantify the number of grids that resulted in each of the four possible outcomes across the 17 land cover categories, a bar chart (Figure 9) has been prepared, which indicates the ratio of grids selected in each outcome to the total number of grids present in each land cover category. Above each bar, the percentage of India's area occupied by that land cover category is specified. The bar chart indicates that in forested regions, i.e., evergreen needleleaf forest, evergreen broadleaf forest, deciduous needleleaf forest, and mixed forests, the VUAN product has obtained more accurate retrievals than the MONT product. The MONT product, however, computes its retrievals using C band observations, which have better penetration capability than X band observations, so better performance could have been expected of the MONT product in these land cover categories. The dominance of VUAN indicates the greater influence of retrieval algorithms and the associated parameterizations in obtaining accurate soil moisture retrievals, although caution must be exercised while dealing with retrievals in vegetated regions. A similar observation regarding the VUAN product's superior performance over the MONT product in densely vegetated regions was made by Tuttle and Salvucci [2014] over the CONUS (Contiguous United States) region. The scattered nature of grids where both VUAN and MONT products got selected, on the other hand, is reflected in Figure 9. These grids are more pronounced across croplands (C), cropland/natural vegetation mosaic (C/N), barren or sparsely vegetated (B), woody savannas (WS), and open shrublands (OS). When the elevation information in these grids is integrated, it is observed that most grids are situated in low-elevation regions (<1000 m). This prompts the inference that the aforementioned land cover categories situated in low-elevation areas, which may not have steep slopes (that can hinder the satellite observations), might provide favorable conditions for retrieval of accurate soil moisture observations. It is observed that C together with WS, C/N, and B land cover categories constitute approximately 82% of the area of India, and hence further analysis of the best choice soil moisture product map is focused exclusively on these land cover classes. Yi et al. [2011] compared VUAN and MONT products with in situ observations (AmeriFlux network) across different LULC classes in the U.S., and our results regarding the products' performance mostly agree with their findings.
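
The quantities shown in the bar chart can be tabulated as in the following sketch, which computes, for each land cover category, the ratio of grids falling into each of the four outcomes and the share of grids occupied by that category. The per-grid labels are invented for illustration, and the area share assumes equal-area grid cells, which is only an approximation on a 0.25° × 0.25° grid.

```python
from collections import Counter, defaultdict

OUTCOMES = ("VUAN", "MONT", "Independent", "VUAN/MONT")


def outcome_ratios(land_cover, outcome):
    """Ratio of grids in each outcome to all grids of the same land cover class."""
    counts = defaultdict(Counter)
    for lc, out in zip(land_cover, outcome):
        counts[lc][out] += 1
    ratios = {}
    for lc, tally in counts.items():
        total = sum(tally.values())
        ratios[lc] = {o: tally[o] / total for o in OUTCOMES}
    return ratios


def area_share(land_cover):
    """Percentage of grids per land cover class (equal-area cells assumed)."""
    total = len(land_cover)
    return {lc: 100.0 * n / total for lc, n in Counter(land_cover).items()}


# Hypothetical per-grid labels.
land_cover = ["C", "C", "C", "C", "WS", "WS", "B", "B"]
outcome = ["VUAN", "VUAN", "MONT", "VUAN/MONT",
           "VUAN", "MONT", "Independent", "VUAN"]

ratios = outcome_ratios(land_cover, outcome)
shares = area_share(land_cover)
print(ratios["C"])
print(shares)
```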

Figure 8. Maps of (a) land use land cover (LULC) and (b) elevation. It has to be noted that these maps are resampled to 0.25° × 0.25° spatial resolution to match the grid system of the current analysis.

To assess the performance of VUAN and MONT products in the aforementioned four land cover conditions, as a novel attempt, the PDFs of elevation for grids under MONT and VUAN products in each land cover category are studied, and the results are presented in Figure 10. For example, the grids under the C category are first identified across India and the elevation in the selected grids is noted. From these locations, two sets of grids are formed, along with the corresponding elevation data, for the VUAN and MONT products. The elevation PDFs (which characterize distribution information) of three data sets (grids belonging to C land cover, grids selected by the VUAN product belonging to C land cover, and grids selected by the MONT product belonging to C land cover) are estimated using KDE (using the PDF version of equation (5)) and plotted in Figure 10a. Similar PDF plots have been prepared for WS, C/N, and B land cover categories (Figures 10b–10d). Owing to the relatively small proportion of grids that resulted in the two other outcomes (none got selected and both got selected), the corresponding elevation distributions have not been plotted (due to discontinuities) in these figures.
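
The elevation PDF comparison can be sketched as follows. The sketch substitutes a univariate Gaussian kernel with a normal reference rule-of-thumb bandwidth for the estimator of equation (5), so both the kernel and the bandwidth rule are assumptions made here for illustration. The elevation values and function names are hypothetical.

```python
import math
from statistics import stdev


def normal_reference_bandwidth(data):
    """Rule-of-thumb bandwidth h = 1.06 * sigma * n^(-1/5) (assumed choice)."""
    return 1.06 * stdev(data) * len(data) ** (-0.2)


def gaussian_kde(data, bandwidth):
    """Return a function x -> kernel density estimate with a Gaussian kernel."""
    n = len(data)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def pdf(x):
        return norm * sum(math.exp(-0.5 * ((x - d) / bandwidth) ** 2)
                          for d in data)

    return pdf


# Hypothetical cropland grid elevations (m): all grids, and the subsets
# where MONT or VUAN was selected.
elev_all = [50, 80, 120, 150, 200, 350, 400, 500, 650, 800]
elev_mont = [50, 80, 120, 150, 200]
elev_vuan = [350, 400, 500, 650, 800]

pdf_all = gaussian_kde(elev_all, normal_reference_bandwidth(elev_all))
pdf_mont = gaussian_kde(elev_mont, normal_reference_bandwidth(elev_mont))
pdf_vuan = gaussian_kde(elev_vuan, normal_reference_bandwidth(elev_vuan))

for z in (100.0, 300.0, 600.0):
    print(z, pdf_all(z), pdf_mont(z), pdf_vuan(z))
```

Comparing the three curves at a given elevation shows which product's grids are over-represented there relative to the land cover category as a whole.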

Figure 9. Bar chart displaying each land cover category on the horizontal axis, with each category containing the proportion of grids that got selected into four possible outcomes (VUAN; MONT; none getting selected, labeled Independent; and both getting selected, labeled VUAN/MONT). The percentage on each bar indicates the area occupied by that particular land cover category in India. Wa, water; EN, evergreen needleleaf forest; EB, evergreen broadleaf forest; DN, deciduous needleleaf forest; DB, deciduous broadleaf forest; M, mixed forests; CS, closed shrublands; OS, open shrublands; WS, woody savannas; S, savannas; G, grasslands; W, permanent wetland; C, croplands; U, urban and built up; C/N, cropland/natural vegetation mosaic; SI, snow and ice; and B, barren or sparsely vegetated. Note that DN and CS land cover classes do not exist over India and hence appear as 0%.

Figure 10. PDF plots of elevation data of a land cover category (shown as blue line) and elevation data of grids selected with VUAN and MONT products in that land cover category (shown as red and yellow lines, respectively). Such plots are prepared for (a) cropland (C), (b) woody savannas (WS), (c) cropland/natural vegetation mosaic (C/N), and (d) barren or sparsely vegetated (B) land cover categories.

The figures indicate that in C, WS, and C/N regions, MONT is dominant over its counterpart at lower elevations. Particularly, in the case of C and C/N, this behavior supports MONT being selected in the dominant portions of the Ganges basin and also in the state of Tamil Nadu. MONT is also seen to possess higher probability density for WS in higher-elevation regions, although the number of grids is very low (indicated by the blue line). Interestingly, in the regions of Chhattisgarh and Orissa, a few grids belonging to the low-elevation WS category selected MONT, but where similar conditions existed along the stretches of the southwest coast, the VUAN product outperformed MONT. This difference could be attributed to the varied precipitation conditions in these regions (the southwest region receives intense precipitation compared to the Chhattisgarh and Orissa region). This suggests that MONT can provide better soil moisture retrievals under moderate vegetation (croplands) and moderate precipitation conditions in flatter terrains, whereas the VUAN product performs better under similar terrain and land cover conditions but with higher precipitation. The selection of the VUAN product, on the other hand, became more frequent as the elevation increased in these land cover categories. Under C and C/N classes, VUAN was predominantly selected at grid elevations lying approximately between 300 m and 1000 m, which incidentally covers a major area of India (depicted in Figures 8a and 9). As mentioned before, the Deccan plateau region in south central India, despite belonging mostly to C and C/N categories, has a reasonably good number of grids which selected the MONT product. These are observed to be primarily drought-affected regions with dry soils as well as conditions of moderate precipitation. This observation, along with the previously obtained inferences (in the Ganges, Chhattisgarh, Orissa, and Tamil Nadu regions), leads to the conclusion that although the MONT product has a better probability of selection in low-elevation terrains occupied by C and C/N, lower precipitation conditions might be playing a more important role in its selection.

Two important inferences can be drawn from this analysis regarding the performance of the products over grids of the barren or sparsely vegetated category. One, the peak of the elevation PDF of all grids in this category (blue line) surpasses the peaks of the other two PDFs (at an elevation of 185 m) due to the presence of grids that resulted in none of the products getting selected in this region (Rajasthan). This PDF is followed by VUAN's PDF, which is due to the selection of the VUAN product under similar land cover and elevation conditions in the Gujarat region. And two, as the elevation increases, a reversal has occurred, leading to more grids selecting the MONT product (in the Jammu and Kashmir region). From the analysis of the two soil moisture products, it is observed that the VUAN product performs reasonably well under conditions of dense vegetation (evergreen forests), moderate vegetation (croplands) with elevation ranging between 300 m and 1000 m, and sparsely vegetated (barren) flatter terrains. Furthermore, the MONT product is found to perform well under low-elevation conditions coupled with moderate vegetation, and in sparsely vegetated higher-elevation terrains.

Since precipitation data inherently play a crucial role in the preparation of the best choice soil moisture product map, it is necessary to check the influence of the reference data set on the final output. To check this effect, the current analysis is replicated to validate VUAN and MONT products over India using the precipitation data set developed under the Asian Precipitation-Highly-Resolved Observational Data Integration Towards Evaluation of Water Resources (APHRODITE) project [Yatagai et al., 2012]. The data set is available at a daily scale spanning from 1951 to 2007 at 0.25° × 0.25° spatial resolution. Due to nonavailability of precipitation data beyond 2007, the analysis has been restricted to the time period from 19 June 2002 to 30 December 2007. The APHRODITE data are resampled to the grid locations of the IMD precipitation data using the inverse distance weighted average method. The proposed algorithm (section 2.2.3) is applied to validate VUAN and MONT data sets using the APHRODITE precipitation data set over India. All the parameters (in BTICKDEMI and BTECKDEMI) are kept the same as those for the IMD precipitation-based analysis. Figure 11 presents the best choice soil moisture product map obtained from the analysis using APHRODITE data.
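
The inverse distance weighted resampling mentioned above can be sketched as follows. The power parameter (2), the number of neighbours (4), and the use of plain Euclidean distance in degrees are assumptions made for this illustration, since the text does not specify them. The function name and coordinates are hypothetical.

```python
import math


def idw_interpolate(target, sources, power=2.0, k=4):
    """Inverse-distance-weighted average of the k nearest source points.

    target  : (lat, lon) of the destination grid point
    sources : iterable of (lat, lon, value) from the source grid
    """
    nearest = sorted(
        (math.hypot(lat - target[0], lon - target[1]), value)
        for lat, lon, value in sources
    )[:k]
    # A coincident source point takes precedence (avoids division by zero).
    if nearest[0][0] == 0.0:
        return nearest[0][1]
    weights = [1.0 / d ** power for d, _ in nearest]
    return sum(w * v for w, (_, v) in zip(weights, nearest)) / sum(weights)


# Hypothetical source cell corners with daily precipitation (mm).
sources = [(10.0, 75.0, 2.0), (10.0, 75.25, 4.0),
           (10.25, 75.0, 6.0), (10.25, 75.25, 8.0)]

centre_value = idw_interpolate((10.125, 75.125), sources)
corner_value = idw_interpolate((10.0, 75.0), sources)
print(centre_value, corner_value)
```

At the centre of the cell all four neighbours are equidistant, so the result reduces to their plain average; at a coincident point the source value is returned unchanged.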

It is observed that out of 4632 grids over India, CKDEMI could be computed at over 97.5% of grids, of which approximately 73% selected the VUAN product, 22% selected the MONT product, 34 grids (0.75%) resulted in neither product being selected, and 3.5% led to the selection of both VUAN and MONT. Overall, it is seen that the best choice soil moisture product map has retained most of the features which were seen in Figure 7a. Noticeable aspects include the selection of the MONT product in the Ganges basin along with a few areas of Chhattisgarh, Orissa, and Tamil Nadu states, which is reasonably consistent with the best choice soil moisture map obtained from the IMD precipitation-based analysis. Also, the patterns in the northeast India region are intact, suggesting that there may not be much difference in the precipitation patterns of the two products in this area. However, two important differences have been observed upon comparing the outputs generated by changing the precipitation product. One, many portions of the Deccan plateau region, which previously selected the MONT product, have now selected the VUAN product. And two, a few grids in the Rajasthan region which previously selected neither of the two products have now selected the MONT product. It has to be noted that the analysis using APHRODITE data is carried out over a data length which is 4 years shorter than that of the IMD precipitation-based analysis. Also, APHRODITE data are not subjected to any sort of bias correction (which could also be the reason for the difference between the two outputs), and it is hypothesized that the impact of using different reference data sets can be properly evaluated only when the original data set is used for the analysis. Hence, these factors might have played a significant role in influencing temporal patterns of precipitation, which finally affected the selection of the MONT product in these two regions.
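
One way to quantify how far two best choice maps agree is a grid-by-grid cross-tabulation of outcomes, sketched below. This is an illustrative addition and not part of the analysis reported here; the outcome labels per grid are invented, and `None` is used as a hypothetical marker for grids where CKDEMI could not be computed.

```python
from collections import Counter


def cross_tabulate(map_a, map_b):
    """Count grids by (outcome in map A, outcome in map B), skipping missing cells."""
    return Counter(
        (a, b) for a, b in zip(map_a, map_b)
        if a is not None and b is not None
    )


def agreement(table):
    """Fraction of jointly available grids with the same outcome in both maps."""
    total = sum(table.values())
    same = sum(n for (a, b), n in table.items() if a == b)
    return same / total


# Hypothetical outcomes for six grids under the two reference data sets.
imd_map = ["MONT", "MONT", "VUAN", "Independent", "VUAN", None]
aphrodite_map = ["VUAN", "MONT", "VUAN", "MONT", "VUAN", "VUAN"]

table = cross_tabulate(imd_map, aphrodite_map)
share_same = agreement(table)
print(dict(table))
print(share_same)
```

Off-diagonal entries such as ("MONT", "VUAN") or ("Independent", "MONT") correspond to the kinds of regional switches discussed above.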

5. Conclusions

A novel approach to validate the soil moisture retrievals of satellite products using precipitation observations is presented. In this process, a new measure called Copula-Kernel Density Estimator based Mutual Information (CKDEMI) is introduced along with a couple of bootstrap strategies (BTICKDEMI and BTECKDEMI). This validation approach is designed to be generic in nature and can evaluate any two soil moisture products (here AMSR-E's VUAN and MONT products using IMD precipitation) to yield a “best choice soil moisture product map,” which contains information about locations where any one of the two, none of the two, or both products have produced accurate soil moisture retrievals.

The results indicated that the VUAN product has shown superior performance over the MONT product over India. Some of the key observations over India include the following: (a) the VUAN product has good chances of getting selected in regions of heavy precipitation, whereas the MONT product performed well under moderate precipitation conditions; (b) the VUAN product performs reasonably well under conditions of dense/moderate vegetation at 300–1000 m elevation or barren land cover conditions in low-elevation regions, whereas the MONT product got selected where moderate vegetation conditions prevail at low elevations and also in highlands of sparsely vegetated areas; (c) the proposed method has shown some ability to reject the products at locations where possibly erroneous soil moisture observations might have been retrieved and in areas occupied by dense vegetation (which can degrade the quality of soil moisture retrievals); and (d) changing the reference precipitation data set has little impact on the spatial pattern of product selection, although one or two of the differences still need to be addressed.

Figure 11. Best choice soil moisture product map prepared using APHRODITE data spanning from 19 June 2002 to 30 December 2007 gridded at 0.25° × 0.25° spatial resolution. Map legend is consistent with that of Figure 7a.

We are attempting to integrate the proposed validation algorithm with other existing approaches such as the Rvalue approach and triple collocation at regional scale, which could be helpful in selecting the best performing soil moisture product for a location and in creating a merged data set of these products for the region. The proposed methodology will be extended to include more soil moisture products, so that more than two products (including retrievals from other sensors such as SMOS and SMAP) can be validated simultaneously. We also intend to check the impacts of filtering out missing data, the length of the data used for analysis at a location, and the effect of bias correction applied to precipitation data. Since the present analysis was carried out at a daily scale, efforts will be made to check whether the methodology holds at other time scales such as 3 day average, weekly, and monthly levels. The results obtained will also be evaluated in terms of how much the density of vegetation affects the output by integrating time-varying biomass information. Efforts will be invested in gaining a deeper understanding of the parameterization of retrieval algorithms in an attempt to improve the quality of retrievals (increase the value of CKDEMI). The present research assists hydrologists (climate scientists) who attempt to assimilate soil moisture data into their hydrologic (climate) models at both basin and continental scales in obtaining spatially and temporally accurate soil moisture observations (using the best choice soil moisture product map).

Acknowledgments

We thank the three anonymous reviewers for providing excellent comments that strengthened the work and brought the manuscript to current form. Thanks to Owe et al. [2008], Jones et al. [2010], Pai et al. [2014], Yatagai et al. [2012], Jarvis et al. [2008], Broxton et al. [2014], Vrije Universiteit Amsterdam, NASA National Snow and Ice Data Center (NSIDC) Distributed Active Archive Center (DAAC) for MONT data (http://goo.gl/a3AlV1), NASA Goddard Earth Sciences (GES) Data and Information Services Center (DISC) for VUAN data (http://goo.gl/HJKjZN), India Meteorological Department for RAIN data (http://goo.gl/abmTDT), United States Geological Survey Land Cover Institute LULC project team for LULC data (http://goo.gl/r4JWRg), Shuttle Radar Topography Mission (SRTM) Digital Elevation Model (DEM) task team for elevation data (http://goo.gl/i6I5Ia), and the project team of APHRODITE data (http://goo.gl/2vMrCC). For further details regarding data used in the work, contact [email protected]. The second author wishes to thank the support given by the Ministry of Earth Sciences, Government of India, through the project MOES/ATMOS/PP-IX/09.

References

Akaike, H. (1974), A new look at the statistical model identification, IEEE Trans. Autom. Control, 19(6), 716–723.

Arrigo, J. A. S., and G. D. Salvucci (2005), Investigation hydrologic scaling: Observed effects of heterogeneity and nonlocal processes across hillslope, watershed, and regional scales, Water Resour. Res., 41, W11417, doi:10.1029/2005WR004032.

Bisselink, B., E. Van Meijgaard, A. Dolman, and R. De Jeu (2011), Initializing a regional climate model with satellite-derived soil moisture, J. Geophys. Res., 116, D02121, doi:10.1029/2010JD014534.

Bolten, J. D., W. T. Crow, X. Zhan, T. J. Jackson, and C. A. Reynolds (2010), Evaluating the utility of remotely sensed soil moisture retrievals for operational agricultural drought monitoring, IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens., 3(1), 57–66, doi:10.1109/jstars.2009.2037163.

Brocca, L., T. Moramarco, F. Melone, W. Wagner, S. Hasenauer, and S. Hahn (2012), Assimilation of surface- and root-zone ASCAT soil moisture products into rainfall-runoff modeling, IEEE Trans. Geosci. Remote Sens., 50(7), 2542–2555, doi:10.1109/tgrs.2011.2177468.

Broxton, P. D., X. Zeng, D. Sulla-Menashe, and P. A. Troch (2014), A global land cover climatology using MODIS data, J. Appl. Meteorol. Climatol., 53(6), 1593–1605, doi:10.1175/jamc-d-13-0270.1.

Chen, Y., K. Yang, J. Qin, L. Zhao, W. Tang, and M. Han (2013), Evaluation of AMSR-E retrievals and GLDAS simulations against observations of a soil moisture network on the central Tibetan Plateau, J. Geophys. Res. Atmos., 118, 4466–4475, doi:10.1002/jgrd.50301.

Cover, T. M., and J. A. Thomas (2006), Elements of Information Theory, 2nd ed., Wiley-Intersci., Hoboken, N. J.

Crow, W. T. (2007), A novel method for quantifying value in spaceborne soil moisture retrievals, J. Hydrometeorol., 8(1), 56–67, doi:10.1175/jhm553.1.

Crow, W. T., and X. Zhan (2007), Continental-scale evaluation of remotely sensed soil moisture products, IEEE Geosci. Remote Sens. Lett., 4(3), 451–455, doi:10.1109/lgrs.2007.896533.

Crow, W. T., D. G. Miralles, and M. H. Cosh (2010), A quasi-global evaluation system for satellite-based surface soil moisture retrievals, IEEE Trans. Geosci. Remote Sens., 48(6), 2516–2527, doi:10.1109/tgrs.2010.2040481.

Dorigo, W., R. Jeu, D. Chung, R. Parinussa, Y. Liu, W. Wagner, and D. Fernández-Prieto (2012), Evaluating global trends (1988–2010) in harmonized multi-satellite surface soil moisture, Geophys. Res. Lett., 39, L18405, doi:10.1029/2012GL052988.

Dorigo, W., A. Gruber, R. De Jeu, W. Wagner, T. Stacke, A. Loew, C. Albergel, L. Brocca, D. Chung, and R. Parinussa (2015), Evaluation of the ESA CCI soil moisture product using ground-based observations, Remote Sens. Environ., 162, 380–395.

Dorigo, W. A., R. Jeu, D. Chung, R. Parinussa, Y. Liu, W. Wagner, and D. Fernández-Prieto (2011), The International Soil Moisture Network: A data hosting facility for global in situ soil moisture measurements, Hydrol. Earth Syst. Sci., 15(5), 1675–1698, doi:10.5194/hess-15-1675-2011.

Duong, T. (2007), ks: Kernel density estimation and kernel discriminant analysis for multivariate data in R, J. Stat. Software, 21(7), 1–16.

Duong, T. (2015), Non-parametric smoothed estimation of multivariate cumulative distribution and survival functions, and receiver operating characteristic curves, J. Korean Stat. Soc., doi:10.1016/j.jkss.2015.06.002.

Efron, B., and R. J. Tibshirani (1994), An Introduction to the Bootstrap, Chapman and Hall/CRC, Fla.

Entekhabi, D., et al. (2010), The Soil Moisture Active Passive (SMAP) mission, Proc. IEEE, 98(5), 704–716, doi:10.1109/jproc.2010.2043918.

Fahle, M., T. L. Hohenbrink, O. Dietrich, and G. Lischeid (2015), Temporal variability of the optimal monitoring setup assessed using information theory, Water Resour. Res., 51, 7723–7743, doi:10.1002/2015wr017137.

Fernando, T. M. K. G., H. R. Maier, and G. C. Dandy (2009), Selection of input variables for data driven models: An average shifted histogram partial mutual information estimator approach, J. Hydrol., 367(3–4), 165–176, doi:10.1016/j.jhydrol.2008.10.019.

Genest, C., B. Remillard, and D. Beaudoin (2009), Goodness-of-fit tests for copulas: A review and a power study, Insur. Math. Econ., 44(2), 199–213, doi:10.1016/j.insmatheco.2007.10.005.

Gruber, A., C.-H. Su, S. Zwieback, W. Crow, W. Dorigo, and W. Wagner (2016), Recent advances in (soil moisture) triple collocation analysis, Int. J. Appl. Earth Obs. Geoinformation, 45, 200–211.

Harrold, T. I., A. Sharma, and S. Sheather (2001), Selection of a kernel bandwidth for measuring dependence in hydrologic time series using the mutual information criterion, Stochastic Environ. Res. Risk Assess., 15(4), 310–324, doi:10.1007/s004770100073.

Jackson, T. J., T. J. Schmugge, and J. R. Wang (1982), Passive microwave sensing of soil-moisture under vegetation canopies, Water Resour. Res., 18, 1137–1142, doi:10.1029/WR018i004p01137.

Jarvis, A., H. I. Reuter, A. Nelson, and E. Guevara (2008), Hole-filled SRTM for the globe Version 4. [Available at the CGIAR-CSI SRTM 90 m Database (http://srtm.csi.cgiar.org).]

Joe, H. (1997), Multivariate Models and Multivariate Dependence Concepts, Springer Sci. + Business Media, Fla.

Jones, L. A., C. R. Ferguson, J. S. Kimball, K. Zhang, S. T. K. Chan, K. C. McDonald, E. G. Njoku, and E. F. Wood (2010), Satellite microwave remote sensing of daily land surface air temperature minima and maxima from AMSR-E, IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens., 3(1), 111–123.

Jones, L., and J. Kimball (2011), Daily Global Land Surface Parameters Derived From AMSR-E Version 1.2 Update 1, Natl. Snow and Ice Data Cent., Boulder, Colo.

Kerr, Y. H., et al. (2012), The SMOS soil moisture retrieval algorithm, IEEE Trans. Geosci. Remote Sens., 50(5), 1384–1403, doi:10.1109/tgrs.2012.2184548.

Lettenmaier, D. P., D. Alsdorf, J. Dozier, G. J. Huffman, M. Pan, and E. F. Wood (2015), Inroads of remote sensing into hydrologic science during the WRR era, Water Resour. Res., 51(9), 7309–7342, doi:10.1002/2015wr017616.

Lievens, H., et al. (2015), SMOS soil moisture assimilation for improved hydrologic simulation in the Murray Darling Basin, Australia, Remote Sens. Environ., 168, 146–162, doi:10.1016/j.rse.2015.06.025.

Mladenova, I. E., et al. (2014), Remote monitoring of soil moisture using passive microwave-based techniques—Theoretical basis and overview of selected algorithms for AMSR-E, Remote Sens. Environ., 144, 197–213, doi:10.1016/j.rse.2014.01.013.

Montgomery, D. C., and G. C. Runger (2010), Applied Statistics and Probability for Engineers, John Wiley, New York.

Neelam, M., and B. P. Mohanty (2015), Global sensitivity analysis of the radiative transfer model, Water Resour. Res., 51, 2428–2443, doi:10.1002/2014wr016534.

Nelsen, R. B. (2013), An Introduction to Copulas, Springer Sci. + Business Media, New York.

Njoku, E. G., T. J. Jackson, V. Lakshmi, T. K. Chan, and S. V. Nghiem (2003), Soil moisture retrieval from AMSR-E, IEEE Trans. Geosci. Remote Sens., 41(2), 215–229, doi:10.1109/tgrs.2002.808243.

Owe, M., R. de Jeu, and J. Walker (2001), A methodology for surface soil moisture and vegetation optical depth retrieval using the microwave polarization difference index, IEEE Trans. Geosci. Remote Sens., 39(8), 1643–1654, doi:10.1109/36.942542.

Owe, M., R. de Jeu, and T. Holmes (2008), Multisensor historical climatology of satellite-derived global land surface moisture, J. Geophys. Res., 113, F01002, doi:10.1029/2007JF000769.

Pai, D. S., L. Sridhar, M. Rajeevan, O. P. Sreejith, N. S. Satbhai, and B. Mukhopadhyay (2014), Development of a new high spatial resolution (0.25° × 0.25°) long period (1901–2010) daily gridded rainfall data set over India and its comparison with existing data sets over the region, Mausam, 65(1), 1–18.

Pan, M., A. K. Sahoo, and E. F. Wood (2014), Improving soil moisture retrievals from a physically-based radiative transfer model, Remote Sens. Environ., 140, 130–140, doi:10.1016/j.rse.2013.08.020.

Pellarin, T., J. P. Wigneron, J. C. Calvet, and P. Waldteufel (2003), Global soil moisture retrieval from a synthetic L-band brightness temperature data set, J. Geophys. Res., 108(D12), 4364, doi:10.1029/2002JD003086.

Prigent, C., F. Aires, W. B. Rossow, and A. Robock (2005), Sensitivity of satellite microwave and infrared observations to soil moisture at a global scale: Relationship of satellite observations to in situ soil moisture measurements, J. Geophys. Res., 110, D07110, doi:10.1029/2004JD005087.

Reichle, R. H., R. D. Koster, P. Liu, S. P. Mahanama, E. G. Njoku, and M. Owe (2007), Comparison and assimilation of global soil moisture retrievals from the Advanced Microwave Scanning Radiometer for the Earth Observing System (AMSR-E) and the Scanning Multichannel Microwave Radiometer (SMMR), J. Geophys. Res., 112, D09108, doi:10.1029/2006JD008033.

Rodell, M., P. Houser, U. E. A. Jambor, J. Gottschalck, K. Mitchell, C. Meng, K. Arsenault, B. Cosgrove, J. Radakovich, and M. Bosilovich (2004), The global land data assimilation system, Bull. Am. Meteorol. Soc., 85(3), 381–394.

Saleem, J. A., and G. D. Salvucci (2002), Comparison of soil wetness indices for inducing functional similarity of hydrologic response across sites in Illinois, J. Hydrometeorol., 3(1), 80–91, doi:10.1175/1525-7541(2002)003<0080:coswif>2.0.co;2.

Salvucci, G. D. (2001), Estimating the moisture dependence of root zone water loss using conditionally averaged precipitation, Water Resour. Res., 37, 1357–1365, doi:10.1029/2000WR900336.

Scipal, K., T. Holmes, R. de Jeu, V. Naeimi, and W. Wagner (2008), A possible solution for the problem of estimating the error structure of global soil moisture data sets, Geophys. Res. Lett., 35, L24403, doi:10.1029/2008GL035599.

Seneviratne, S. I., T. Corti, E. L. Davin, M. Hirschi, E. B. Jaeger, I. Lehner, B. Orlowsky, and A. J. Teuling (2010), Investigating soil moisture-climate interactions in a changing climate: A review, Earth Sci. Rev., 99(3–4), 125–161, doi:10.1016/j.earscirev.2010.02.004.

Shannon, C. (1948), A mathematical theory of communication, Bell Syst. Tech. J., 27, 379–423.

Singh, V. (2015), Introduction to Entropy Theory in Hydrologic Science and Engineering, McGraw-Hill, New York.

Singh, V. P. (2013), Entropy Theory and Its Application in Environmental and Water Engineering, John Wiley, Hoboken, N. J.

Su, C. H., D. Ryu, W. T. Crow, and A. W. Western (2014), Beyond triple collocation: Applications to soil moisture monitoring, J. Geophys. Res. Atmos., 119, 6419–6439.

Sun, J., G. D. Salvucci, D. Entekhabi, and L. Farhadi (2011), Parameter estimation of coupled water and energy balance models based on stationary constraints of surface states, Water Resour. Res., 47, W02512, doi:10.1029/2010WR009293.

Sun, J., G. D. Salvucci, and D. Entekhabi (2012), Estimates of evapotranspiration from MODIS and AMSR-E land surface temperature and moisture over the Southern Great Plains, Remote Sens. Environ., 127, 44–59, doi:10.1016/j.rse.2012.08.020.

Tsybakov, A. B. (2008), Introduction to Nonparametric Estimation, Springer Sci. + Business Media, New York.

Tuttle, S. E., and G. D. Salvucci (2014), A new approach for validating satellite estimates of soil moisture using large-scale precipitation: Comparing AMSR-E products, Remote Sens. Environ., 142, 207–222, doi:10.1016/j.rse.2013.12.002.

Wagner, W., K. Scipal, C. Pathe, D. Gerten, W. Lucht, and B. Rudolf (2003), Evaluation of the agreement between the first global remotely sensed soil moisture data with model and precipitation data, J. Geophys. Res., 108(D19), 4611, doi:10.1029/2003JD003663.

Wand, M. P., and M. C. Jones (1994), Kernel Smoothing, CRC Press, Fla.

Wanders, N., M. F. P. Bierkens, S. M. de Jong, A. de Roo, and D. Karssenberg (2014), The benefits of using remotely sensed soil moisture in parameter identification of large-scale hydrological models, Water Resour. Res., 50, 6874–6891, doi:10.1002/2013wr014639.

Yatagai, A., K. Kamiguchi, O. Arakawa, A. Hamada, N. Yasutomi, and A. Kitoh (2012), APHRODITE constructing a long-term daily gridded precipitation dataset for Asia based on a dense network of rain gauges, Bull. Am. Meteorol. Soc., 93(9), 1401–1415, doi:10.1175/bams-d-11-00122.1.

Yi, Y., J. S. Kimball, L. A. Jones, R. H. Reichle, and K. C. McDonald (2011), Evaluation of MERRA land surface estimates in preparation for the soil moisture active passive mission, J. Clim., 24(15), 3797–3816.

Yilmaz, T. M., and W. Crow (2014), Evaluation of assumptions in soil moisture triple collocation analysis, paper presented at EGU General Assembly Conference Abstracts.

Zhang, A., and G. Jia (2013), Monitoring meteorological drought in semiarid regions using multi-sensor microwave remote sensing data, Remote Sens. Environ., 134, 12–23, doi:10.1016/j.rse.2013.02.023.
