This article was downloaded by: [173.11.43.178] On: 24 September 2013, At: 06:26 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK Geocarto International Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/tgei20 Statistical data fusion of multi-sensor AOD over the Continental United States Sweta Jinnagara Puttaswamy a , Hai M. Nguyen b , Amy Braverman b , Xuefei Hu a & Yang Liu a a Department of Environmental Health , Emory University, Rollins School of Public Health , Atlanta , GA , USA b Jet Propulsion Laboratory , Pasadena , CA , USA Accepted author version posted online: 06 Aug 2013.Published online: 10 Sep 2013. To cite this article: Sweta Jinnagara Puttaswamy , Hai M. Nguyen , Amy Braverman , Xuefei Hu & Yang Liu , Geocarto International (2013): Statistical data fusion of multi-sensor AOD over the Continental United States, Geocarto International, DOI: 10.1080/10106049.2013.827750 To link to this article: http://dx.doi.org/10.1080/10106049.2013.827750 PLEASE SCROLL DOWN FOR ARTICLE Taylor & Francis makes every effort to ensure the accuracy of all the information (the “Content”) contained in the publications on our platform. However, Taylor & Francis, our agents, and our licensors make no representations or warranties whatsoever as to the accuracy, completeness, or suitability for any purpose of the Content. Any opinions and views expressed in this publication are the opinions and views of the authors, and are not the views of or endorsed by Taylor & Francis. The accuracy of the Content should not be relied upon and should be independently verified with primary sources of information. 
Taylor and Francis shall not be liable for any losses, actions, claims, proceedings, demands, costs, expenses, damages, and other liabilities whatsoever or howsoever caused arising directly or indirectly in connection with, in relation to or arising out of the use of the Content. This article may be used for research, teaching, and private study purposes. Any substantial or systematic reproduction, redistribution, reselling, loan, sub-licensing, systematic supply, or distribution in any form to anyone is expressly forbidden. Terms & Conditions of access and use can be found at http://www.tandfonline.com/page/terms-and-conditions


Statistical data fusion of multi-sensor AOD over the Continental United States

Sweta Jinnagara Puttaswamy^a, Hai M. Nguyen^b, Amy Braverman^b, Xuefei Hu^a and Yang Liu^a*

^a Department of Environmental Health, Emory University, Rollins School of Public Health, Atlanta, GA, USA; ^b Jet Propulsion Laboratory, Pasadena, CA, USA

(Received 4 June 2012; accepted 18 July 2013)

This article illustrates two techniques for merging daily aerosol optical depth (AOD) measurements from satellite and ground-based data sources to achieve optimal data quality and spatial coverage. The first technique is a traditional Universal Kriging (UK) approach employed to predict AOD from multi-sensor aerosol products that are aggregated on a reference grid with AERONET as ground truth. The second technique is spatial statistical data fusion (SSDF), a method designed for massive satellite data interpolation. Traditional kriging has computational complexity $O(N^3)$, making it impractical for large datasets. Our version of UK accommodates massive data inputs by performing kriging locally, while SSDF accommodates massive data inputs by modelling their covariance structure with a low-rank linear model. In this study, we use aerosol data products from two satellite instruments: the Moderate Resolution Imaging Spectroradiometer (MODIS) and the Geostationary Operational Environmental Satellite (GOES), covering the Continental United States.

Keywords: aerosol optical depth; MODIS; GOES; AERONET; universal kriging; data fusion

1. Introduction

Since the 1990s, epidemiologic studies have provided increasing evidence linking non-fatal health outcomes in sensitive sub-populations (e.g. seniors, infants and young children) with exposure to ambient air pollution (Metzger et al. 2004; Pope et al. 2009). Of particular interest is the generation of accurate PM2.5 (airborne particles less than or equal to 2.5 μm in aerodynamic diameter) exposure estimates to support research on health endpoints related to acute exposures, such as emergency department visits (Peel et al. 2007), childhood asthma (Brauer et al. 2007) and birth defects (Darrow et al. 2009). As ground PM2.5 measurements are relatively sparse in space and time, most studies rely on data from a central monitor. Consequently, such analyses do not consider the spatial contrast in population exposure. Strickland et al. (2011) argued that although the differences among various metrics of spatially heterogeneous pollutants do not seem to alter the exposure-response estimates in a health effects study, these differences are important for health benefits analyses, where results from epidemiological studies on the health effects of pollutants (per unit change in concentration) are used to predict the health impacts of a reduction in pollutant concentrations. The errors in estimated PM2.5 exposure-response functions have been evaluated using time series models (Goldman et al. 2010). Error due to undetected spatial variability caused by limited coverage of ground stations was estimated to result in substantial attenuation of detected health effects, i.e. 16% for PM2.5 sulphate, nitrate and ammonium, and 43–68% for PM2.5 BC. Intermediate impacts were found for PM10, total PM2.5 and PM2.5 OC.

*Corresponding author. Email: [email protected]

© 2013 Taylor & Francis

Due to their comprehensive spatial coverage, satellite aerosol data products are potentially valuable for regional air quality (Engel-Cox et al. 2004; Hutchison et al. 2005; Liu et al. 2007) and health effects studies (Hu 2009; Hu & Rao 2009), in which accurate characterizations of particles' spatial and temporal distributions are important. Aerosol optical depth (AOD) observed from satellites is commonly used to derive PM2.5. AOD is defined as the integral of aerosol extinction coefficients along the atmospheric column from the ground to the top of the atmosphere. When particles are well mixed in the boundary layer, AOD is strongly linked to PM2.5 mass concentration. Measurement accuracy and uncertainty in satellite data-derived PM2.5 estimates are specific to the sensor technology. In general, satellites are coarse but comprehensive in space while ground networks tend to be localized; satellites are temporally sparse while ground networks usually produce data that are dense in time. Therefore, an integrated data product that exploits the strengths of these two observing strategies could improve inferences linking the derived PM2.5 to the health effects of interest in global assessments. However, polar-orbiting satellite instruments, such as MODIS, are missing AOD data 50% of the time on average. Fusing AOD data from multiple sensors can potentially reduce data missingness, but directly merging AOD data products from different sensors is problematic due to their different error structures. Therefore, calibration with ground truth is necessary. Research in this area has been lacking. Our main objective is to compare two statistical methods of fusing AOD products on a daily basis.

There have been other attempts to combine several satellite sensor aerosol products with the ground-based Aerosol Robotic Network (AERONET). Gupta et al. (2008) evaluate satellite data fusion for level 2 aerosol products from the Multi-angle Imaging SpectroRadiometer (MISR) and the Moderate Resolution Imaging Spectroradiometer (MODIS) with Clouds and the Earth's Radiant Energy System (CERES) Single Scanner Footprint (SSF) data and assess their data fusion scheme through case studies. Nirala (2008) uses a maximum likelihood approach and spatial interpolation techniques to fuse gridded level 3 MODIS aerosol products with each other. Zubko et al. (2010) use the same methodology as Nirala (2008) with a simulated data-set; they verify the limitations of the optimal interpolation method for filling gaps and suggest that the interpolation method be applied to the combined data-set to minimize prediction error. Xu et al. (2005) apply a neural network regression model derived from an integration of MISR and MODIS aerosol products, along with their associated geometric and radiance attributes, to predict spatially and temporally sparse AERONET AOD. Kinne (2009) develops a scoring concept to create a satellite composite data-set using annual averages of the AOD data from numerous sensors combined with monthly statistics of AERONET AOD for increased accuracy. Chatterjee et al. (2010) apply a geostatistical fusion approach using Universal Kriging (UK) methods to integrate ground-based aerosol data with satellite data. Aerosol data from MISR, MODIS and AERONET during each two-week time window are averaged so that there are sufficient spatially collocated data from all three data-sets. This coarser temporal resolution is suitable for studying aerosol spatial patterns at seasonal and annual scales; however, it is insufficient to support air quality and health effects studies that emphasize acute exposures to fine particles over short time periods (i.e. one day to one week). In addition, traditional kriging methods do not account for instrumental differences such as whether measurements accrue to point locations or represent spatial averages of various sorts, or for different sampling and measurement error characteristics. Adjusting for these heterogeneities in ways that capitalize on the instruments' individual strengths could improve the quality of the fused output data.

Spatial Statistical Data Fusion (SSDF; Nguyen et al. 2012) is a relatively new approach that does account for heterogeneities in the input data by using an underlying statistical model that explicitly addresses these issues, starting with the question of change-of-support. Change-of-support involves inferring a spatial process at one resolution from data at another resolution. Gotway and Young (2002) reviewed existing methodologies focusing on upscaling and downscaling from a single data-set. Wikle and Berliner (2005) propose changing resolution on a single data-set using Bayesian change-of-resolution. Fuentes and Raftery (2005) combined observations and numerical model output at different resolutions using Bayesian melding. Berrocal et al. (2010) downscale from areal-level and point-level data through Bayesian regression. All these approaches have computational complexities that are quadratic or higher with respect to data size, making them ill-suited for massive data-sets like those obtained in remote sensing. Massive data size can be a computational bottleneck for traditional kriging as well, which has computational complexity $O(N^3)$ because of the need to invert an $N \times N$ covariance matrix ($N$ = number of observations). Some spatial inferential methodologies that are scalable include modelling non-stationary covariance functions with multi-resolutional wavelet models (Wikle et al. 2002), approximate optimal prediction with dimension reduction through a small set of space-filling locations (Banerjee et al. 2008), modelling non-stationary covariance models using the discrete Fourier transform (Jun & Stein 2008) and Fixed Rank Kriging (FRK) (Cressie & Johannesson 2008). SSDF leverages the computational efficiency provided by FRK to solve the data fusion problem when the data sources are massive.

In this article, we apply both the UK and SSDF techniques to MODIS and GASP AOD data over the Continental United States for the years 2005 and 2006. Our objectives are twofold. First, we evaluate the abilities of these two approaches to reduce the prediction errors in the satellite data by comparing the estimated AOD values with AERONET measurements. Second, we compare the outputs of the two techniques to evaluate the potential benefits of merging the two for optimal data quality and coverage. The remainder of this article is organized as follows: Section 2 introduces the AERONET, MODIS and GASP AOD data. Section 3 describes the two data fusion methodologies. Section 4 presents results of the UK and SSDF techniques, respectively, and Section 5 gives our concluding remarks.

2. Data

2.1. The AERONET level 2 data product

The AERONET is a global optical ground-based aerosol monitoring network and data archive supported by NASA's Earth Observing System and other international institutions (Dubovik et al. 2000). AERONET data are used in various satellite and model validation studies as the reference standard for AOD measurements mainly because of their high accuracy. We download data from the online AERONET data archive (http://aeronet.gsfc.nasa.gov/). Since there is no AERONET AOD reported at 550 nm, we calculate it using the spectral dependence of AOD at the two nearest wavelengths, generally 675 and 500 nm: we solve for the Ångström exponent from these two channels and then interpolate logarithmically to 550 nm. Full details of the interpolation procedure are given elsewhere (Liu et al. 2004). About 33 sites that have data coverage over six months are used in our analysis, as shown in Figure 1.
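The interpolation to 550 nm can be sketched as follows. This is a minimal illustration that assumes the usual power-law spectral dependence, $\tau(\lambda) \propto \lambda^{-\alpha}$, between the two channels; the function names and the sample AOD values are hypothetical, and the exact procedure used in the study is the one described in Liu et al. (2004).

```python
import math

def angstrom_exponent(tau_a, lambda_a, tau_b, lambda_b):
    """Angstrom exponent from AOD at two wavelengths (wavelengths in the same units)."""
    return -math.log(tau_a / tau_b) / math.log(lambda_a / lambda_b)

def aod_at_wavelength(tau_a, lambda_a, tau_b, lambda_b, target=550.0):
    """Interpolate AOD to `target`, assuming tau ~ lambda**(-alpha) between the two channels."""
    alpha = angstrom_exponent(tau_a, lambda_a, tau_b, lambda_b)
    return tau_a * (target / lambda_a) ** (-alpha)

# Hypothetical AOD values at 500 nm and 675 nm (not taken from the study)
tau_550 = aod_at_wavelength(0.20, 500.0, 0.14, 675.0)
print(round(tau_550, 4))
```

Because the interpolation is linear in log-log space, the result always lies between the two input AOD values when the target wavelength lies between the two channels.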

2.2. MODIS level 2 aerosol products

The MODIS instruments, aboard the EOS Terra and Aqua satellites, cross the equator on the daylight side of Earth at approximately 10:30 a.m. and 1:30 p.m. local time, respectively (Levy et al. 2007, 2010). Remer et al. (2008) showed that both MODIS-TERRA and MODIS-AQUA AOD retrievals over land are highly correlated with ground truth (correlation coefficients ~0.9) and show little bias. Operational MODIS level 2 aerosol products have a spatial resolution of 10 km at nadir. Although the swaths of the polar-orbiting MODIS sensors overlap, a high proportion of retrievals is missing due to cloud cover, resulting in relatively sparse coverage over time. Collection 5.1 data from 2005 to 2006 over the Continental United States are downloaded from the web interface to the Level 1 and Atmosphere Archive and Distribution System (LAADS Web; http://ladsweb.nascom.nasa.gov/). AOD at 0.55 μm for both land and ocean (corrected) of good quality (QA Confidence Flag = 2 or 3) is extracted from the raw data files for further processing. We use both quality flags 2 and 3 to improve the spatial coverage of MODIS. Our preliminary analysis showed that including quality flag 2 does not cause any substantial deterioration in data quality. MODIS AOD values range from −0.05 to 3.

2.3. GOES aerosol/smoke product

The Geostationary Operational Environmental Satellite (GOES) is the major weather satellite operated by the National Oceanic and Atmospheric Administration (NOAA).

Figure 1. AERONET Sites located within the Continental United States.


GOES Aerosol/Smoke Product (GASP) AOD based on GOES-12 (East) imager data is estimated using lookup tables generated by radiative-transfer models and surface reflectance calculated from clear-sky composite visible imagery, resulting in a product of approximately 4 km resolution (Knapp et al. 2005). The satellite's geostationary orbit permits AOD retrievals at 30-min intervals between sunrise and sunset under cloud-free conditions. GASP AOD retrievals are reasonably well correlated with AERONET in the eastern USA in summer (Prados et al. 2007). GASP AOD at 550 nm at 4 km spatial resolution was obtained from the GASP team at NOAA's National Environmental Satellite, Data, and Information Service (NESDIS). Data screening criteria for outliers and residual cloud contamination followed the work of Prados and Kondragunta (Prados et al. 2007; Kondragunta et al. 2008). Valid GASP AOD values range from −0.5 to 2. Negative retrievals in the GASP AOD result from estimation error in surface reflectivity when the observed AOD is very low. We keep them in our analysis to represent low AOD, expecting that our models will then be able to predict low values (Paciorek et al. 2008). AOD measurements from 10 am to 2 pm local time are averaged to generate daily AOD data (Liu et al. 2009).

2.4. Data processing at CMAQ grid level

We average MODIS and GASP AOD data over the 12 km modelling grid that is often used by the Community Multi-scale Air Quality (CMAQ) modelling system. Unlike GOES pixels, MODIS pixel locations shift constantly in space. Therefore, we fix grid locations by using a standard grid such as CMAQ to compute AOD estimates from both sensors. Having the AOD estimates on a common grid allows us to use them collectively. The CMAQ grid is used as the source grid as well as the target grid.
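The spatial averaging step can be illustrated with a short sketch. It assumes that pixel centres have already been projected into a planar coordinate system measured in kilometres and that grid cells are simple 12 km squares aligned with the axes; the function name, the coordinate convention and the sample pixels are hypothetical and do not reproduce the actual CMAQ grid definition.

```python
from collections import defaultdict

def grid_average(points, cell_km=12.0):
    """Average the AOD retrievals that fall inside each square grid cell.

    points: iterable of (x_km, y_km, aod) in a projected coordinate system.
    Returns a dict mapping (column, row) cell indices to the mean AOD.
    """
    sums = defaultdict(lambda: [0.0, 0])  # cell -> [running sum, count]
    for x, y, aod in points:
        key = (int(x // cell_km), int(y // cell_km))
        sums[key][0] += aod
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}

# Three hypothetical pixels: two fall in cell (0, 0), one in cell (1, 0)
pixels = [(1.0, 2.0, 0.10), (5.0, 11.0, 0.20), (13.0, 2.0, 0.30)]
gridded = grid_average(pixels)
```

The same routine can be applied to each sensor separately, so that both products end up indexed by identical cell keys.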

To generate a consistent AOD surface after spatially averaging MODIS and GASP AOD on the CMAQ grid, we select a temporal window of 10 am to 2 pm local time to combine them temporally. This time window is chosen based on the equator crossing times of the MODIS sensors to calculate daytime mean AOD. For consistency, we use the same time period to average values of GASP AOD. Paciorek et al. (2008) indicate that two instantaneous AOD measurements are sufficient to represent the mean AOD calculated using all available GASP AOD measurements retrieved at 30-minute intervals. Due to different viewing conditions and changing cloud cover, we observe two scenarios when overlaying the two MODIS products on the CMAQ grid: one where CMAQ grid locations contain only one of the MODIS products and the other where they contain both. At locations that contain both MODIS-TERRA and MODIS-AQUA data, the mean represents the mean of the AOD distribution from 10 am to 2 pm local time. In the other scenario, the mean AOD is biased towards either morning or afternoon conditions. To avoid this issue, we use simple linear regression to define the temporal relationship between daily mean AOD values of MODIS-TERRA and MODIS-AQUA, and then apply the regression coefficients shown in Table 1 at each CMAQ grid location to predict the mean value of the missing data product from the one that is present. As a result, each CMAQ grid cell contains a mean value that is more representative of the mean AOD distribution from 10 am to 2 pm local time. We map GASP AOD onto the CMAQ grid using a simple spatial average, since GOES observes AOD values continuously throughout the selected time period within a day. AERONET AOD data are also collected continuously within the selected time window at point locations.
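The gap-filling step can be sketched using the 2005 coefficients from Table 1. The sketch assumes that the predicted value of the missing product is simply averaged with the observed one to form the cell mean; the function name and the use of `None` for a missing retrieval are illustrative choices, not part of the original processing chain.

```python
# Table 1 coefficients for 2005, stored as (intercept, slope)
AQUA_FROM_TERRA = (0.013, 0.879)   # tau_AQUA  = a1 + b1 * tau_TERRA
TERRA_FROM_AQUA = (0.019, 0.839)   # tau_TERRA = a2 + b2 * tau_AQUA

def daytime_mean_aod(terra, aqua):
    """Mean of Terra and Aqua AOD for one grid cell.

    A missing value (None) is first predicted from the product that is present.
    Returns None when both products are missing.
    """
    if terra is None and aqua is None:
        return None
    if aqua is None:
        intercept, slope = AQUA_FROM_TERRA
        aqua = intercept + slope * terra
    elif terra is None:
        intercept, slope = TERRA_FROM_AQUA
        terra = intercept + slope * aqua
    return 0.5 * (terra + aqua)

both = daytime_mean_aod(0.10, 0.30)        # both products present
terra_only = daytime_mean_aod(0.20, None)  # Aqua predicted from Terra
aqua_only = daytime_mean_aod(None, 0.20)   # Terra predicted from Aqua
```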


3. Methodology

We aim to improve the quality of satellite AOD by integrating ground truth data in the two data fusion approaches. UK can account for the trend component exhibited by the spatially varying AOD values. Since we assume a linear relationship between AERONET and satellite AOD values, the minimum error variance estimator of AOD in the UK approach can include a global trend term derived from satellite data such as GASP AOD. We then compute Ordinary Least Squares residuals between the trend surface and the source data, i.e. AERONET AOD, at collocated points where both GOES data and AERONET data are available. Kriging is performed on the residuals, and the interpolated residuals are added to the trend to compute the estimated AOD.
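The trend-plus-residual procedure described above can be sketched in a few lines. This is a regression-kriging style approximation, not the gstat UK solver used in the study: it fits the trend by OLS, then applies simple kriging to the residuals with an exponential covariance whose sill and range are assumed to be known. All function names, parameter values and the four sample sites are hypothetical.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def exp_cov(h, sill, rng):
    """Exponential covariance C(h) = sill * exp(-h / range)."""
    return sill * math.exp(-h / rng)

def distance(p, q):
    """Euclidean distance between the (x, y) parts of two tuples."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def regression_krige(sites, target_xy, target_gasp, sill=1.0, rng=300.0):
    """sites: list of (x_km, y_km, aeronet_aod, gasp_aod). Returns predicted AOD."""
    n = len(sites)
    g = [s[3] for s in sites]
    y = [s[2] for s in sites]
    g_mean, y_mean = sum(g) / n, sum(y) / n
    # OLS trend: aeronet = b0 + b1 * gasp
    b1 = (sum((gi - g_mean) * (yi - y_mean) for gi, yi in zip(g, y))
          / sum((gi - g_mean) ** 2 for gi in g))
    b0 = y_mean - b1 * g_mean
    resid = [yi - (b0 + b1 * gi) for gi, yi in zip(g, y)]
    # Simple kriging of the zero-mean residuals
    C = [[exp_cov(distance(si, sj), sill, rng) for sj in sites] for si in sites]
    c0 = [exp_cov(distance(si, target_xy), sill, rng) for si in sites]
    weights = solve(C, c0)
    return b0 + b1 * target_gasp + sum(w * r for w, r in zip(weights, resid))

sites = [(0.0, 0.0, 0.12, 0.10), (100.0, 0.0, 0.20, 0.22),
         (0.0, 150.0, 0.15, 0.12), (200.0, 200.0, 0.30, 0.27)]
pred = regression_krige(sites, (50.0, 50.0), 0.15)
```

With no nugget term, this predictor reproduces the AERONET value exactly at a site location, which is a convenient sanity check on any implementation.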

Estimation of AOD using SSDF follows a slightly different procedure. We first build a data-set of coincident AERONET, MODIS and GOES pixels. We then build two calibration models for the bias between AERONET and MODIS and between AERONET and GOES, as functions of latitude, longitude and time of year. These bias calibration models are applied to the raw MODIS and GOES data to produce bias-corrected versions of these data. We then apply our data fusion methodology (see Section 3.2) to the bias-corrected data to produce optimal estimates of AOD on the CMAQ grid.

3.1. Universal Kriging

To model the spatially varying AOD, we rely on the theoretical framework of spatial statistics that assumes stationarity and isotropy in the true geophysical process. The following explanation of the framework is adapted from Bailey and Gatrell (1995).

Suppose $\{Y(s) : s \in D \subset \mathbb{R}^d\}$ is a real-valued spatial process defined on a domain $D$ of the $d$-dimensional Euclidean space, with mean function $E(Y(s)) = \mu(s)$ and variance $\mathrm{VAR}(Y(s)) = \sigma^2(s)$. The covariance of the process at any two observed points $s_i$ and $s_j$ is defined as $C(s_i, s_j) = E\big((Y(s_i) - \mu(s_i))(Y(s_j) - \mu(s_j))\big)$, so that $C(s, s) = \sigma^2(s)$. Such a process is said to be stationary if the mean and variance are independent of location and constant throughout $\mathbb{R}^d$. Here we consider a weaker assumption, called intrinsic stationarity, which assumes a constant mean and variance in the differences between values separated by $h$. This means that the covariance function depends only on the separation, $h$, between $s_i$ and $s_j$. Additionally, the process is isotropic if this dependence is purely a function of the length of $h$ and not its direction. Then $C(s_i, s_j) = C(h)$. That is, $E(Y(s+h) - Y(s)) = 0$ and $\mathrm{VAR}(Y(s+h) - Y(s)) = 2\gamma(h)$, where $\gamma(h) = \sigma^2 - C(h)$ is the variogram of the spatial process. Also, the covariance function of the spatial process needs to be symmetric, that is:

$$C(s_i, s_j) = C(s_j, s_i) \qquad (1)$$

Table 1. Slope and intercept values from regression of MODIS-AQUA on MODIS-TERRA and vice versa.

                          τ_AQUA = a1 + b1 τ_TERRA     τ_TERRA = a2 + b2 τ_AQUA
Data year   No. of days   a1      b1      R²           a2      b2      R²
2005        364           0.013   0.879   0.74         0.019   0.839   0.74
2006        364           0.014   0.836   0.78         0.011   0.939   0.78

Note: τ – Aerosol optical depth.


and non-negative definite, that is

$$\sum_{i=1}^{n} \sum_{j=1}^{n} a_i a_j C(s_i, s_j) \ge 0 \quad \text{for all } n,\ a_1, \ldots, a_n \text{ and } s_1, \ldots, s_n. \qquad (2)$$

Let us parameterize the mean function to include a trend through a spatial regression model, $\mu(s) = x^T(s)\beta$, with $x(s) = (x_1(s), \ldots, x_p(s))^T$ representing explanatory variables at $s$.

The minimum mean square error unbiased predictor $\hat{Y}(s)$ is then defined in the UK approach as a weighted linear combination of the observed values $y_i$ at locations $s_i$, given as $\hat{Y}(s) = \sum_{i=1}^{n} \omega_i(s) Y(s_i)$, where $\omega_i(s)$ are the kriging weights for the $n$ observed locations, subject to the constraint $\omega^T \mathbf{1} = 1$ (the sum of the $n$ weights is unity). These weights are obtained by solving a system of linear kriging equations that minimizes the mean square error, assuming that the estimated variogram parameters, from which we derive the covariance matrix $C$, are known. We do not give the details of the derivation here.

We assume some form of stationarity in the observed process in order to estimate the covariance function, and hence we are restricted to using a valid theoretical variogram model that conforms to the aforementioned conditions (symmetric positive definite) in (1) and (2). Therefore, the empirical variogram is not employed directly but is only used to fit the theoretical variogram model. An automatic function in the statistical computing environment R, from the package automap (Hiemstra et al. 2009), is used to fit the theoretical variogram model to the MODIS empirical variogram. The function iterates over a group of standard theoretical models and chooses the model that has the smallest residual sum of squares (RSS) with the empirical variogram. In our analysis, four theoretical models, namely the Exponential, Gaussian, Spherical and Matérn (M. Stein's parameterization) models, all of which are monotonic, are selected most often. We use functions from the R package gstat (Pebesma 2004) to perform UK on AERONET AOD with GASP AOD as the auxiliary data, with the variogram parameters derived from the MODIS empirical variogram.
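The selection rule described above, choosing the candidate model with the smallest RSS against the empirical variogram, can be illustrated with a small sketch. It is not the automap implementation: it replaces the package's fitting routine with a coarse grid search, omits the Matérn family (which needs Bessel functions), and uses synthetic variogram values. All names and parameter grids are hypothetical.

```python
import math

def exponential(h, nugget, psill, rng):
    return nugget + psill * (1.0 - math.exp(-h / rng))

def gaussian(h, nugget, psill, rng):
    return nugget + psill * (1.0 - math.exp(-(h / rng) ** 2))

def spherical(h, nugget, psill, rng):
    if h >= rng:
        return nugget + psill
    r = h / rng
    return nugget + psill * (1.5 * r - 0.5 * r ** 3)

MODELS = {"Exp": exponential, "Gau": gaussian, "Sph": spherical}

def rss(model, params, lags, gammas):
    """Residual sum of squares between a model and the empirical variogram."""
    return sum((model(h, *params) - g) ** 2 for h, g in zip(lags, gammas))

def select_model(lags, gammas, nuggets, psills, ranges):
    """Grid-search every candidate model; return (rss, name, params) with the smallest RSS."""
    best = None
    for name, model in MODELS.items():
        for nugget in nuggets:
            for psill in psills:
                for rng in ranges:
                    score = rss(model, (nugget, psill, rng), lags, gammas)
                    if best is None or score < best[0]:
                        best = (score, name, (nugget, psill, rng))
    return best

# Synthetic empirical variogram generated from a spherical model (lags in km)
lags = [50.0 * k for k in range(1, 13)]
gammas = [spherical(h, 0.0, 0.02, 400.0) for h in lags]
best = select_model(lags, gammas, nuggets=[0.0, 0.005],
                    psills=[0.01, 0.02, 0.03], ranges=[200.0, 400.0, 600.0])
```

In practice the parameters would be estimated by a least-squares fit rather than a fixed grid, but the comparison across model families works the same way.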

3.1.1. Variogram modelling

The variogram model is not fitted over residuals since modelling the overall trend in a large domain matters only in extrapolation situations (Journel & Rossi 1989). Our study domain covers the entire Continental United States. It is appropriate to model the global AOD trend using our satellite data since they have better spatial coverage than the sparser ground-based AERONET data. In addition, the kriging error distribution associated with a variogram fitted over satellite data has fewer outliers than that fitted over detrended residuals. It is also important to note that variogram modelling can have a significant effect on the kriging weights and the prediction variance. Most importantly, the UK approach is applied to daily AOD data-sets to avoid degradation of the correlations between satellite data and ground-based AERONET (Anderson et al. 2003). MODIS covers about 24% of the domain area on a given day, so the MODIS empirical variogram is used to model the spatial correlation of the global AOD distribution. On the other hand, the empirical variogram developed using raw GASP AOD showed a large nugget effect (a discontinuity at the origin, or micro-scale variation), which indicates possible measurement error (Cressie 1993). Fitting such a model would result in over-smoothing of the predicted AOD surface.


Variogram modelling can be performed assuming isotropy in the empirical variogram, or by taking the average in all directions to remove any directional effect, or by assuming maximum correlation in one particular direction, i.e. the anisotropy case. In all three cases, the model variogram fits the data closely, so we investigate the predicted values and predictor variance in each case. No major differences are noticed when comparing the maximum predictor variance and the predicted AOD estimates in all three cases. Although the choice of the variogram modelling is not critical with a large number of data points in the study domain (Stein 1988), the AOD distribution is subject to anisotropy related to terrain and weather (e.g. major mountains, persistent wind direction, etc.). An ideal strategy would be to account for anisotropy when it exists. We use an automatic R function from the package intamap that detects anisotropy and tests for its significance (Pebesma et al. 2011). If significant, the function corrects for global anisotropy by transforming the coordinates of CMAQ locations (containing MODIS AOD) to an isotropic space before estimating the variogram parameters.

3.2. Spatial statistical data fusion

In this section, we give an overview of the notation and mechanics of SSDF. The methodology is a variant of kriging specifically designed to optimally combine information from two or more massive data-sets. Essentially, the methodology fits the covariance function with a low-rank linear model, allowing for quick inversion of the full covariance matrix in computing the kriging coefficients. It deals with inputs from two data-sets by appending them into a single meta-data-set, reducing the problem to a more tractable one-data-set problem. Complete details are given elsewhere (Nguyen et al. 2012).

Let $\{Y(s) : s \in D\}$ be a hidden, real-valued spatial process on the domain $D$. For simplicity, we assume that both data-sets are observed at point level, and that observations for data-set $k$ are generated according to the following model:

$$Z_k(s_{km}) = Y(s_{km}) + \epsilon_k(s_{km}), \quad s_{km} \in \mathbb{R}^d, \qquad (3)$$

where $s_{km}$ represents the $m$-th footprint in data-set $k$ and $\epsilon_k(s_{km})$ is a Gaussian error process associated with the measurement for footprint $s_{km}$; $m = 1, \ldots, N_k$; $k = 1, 2$. The $k$-th measurement error process is assumed to be Gaussian with known mean and standard deviation. The true process $Y(\cdot)$ is assumed to follow the linear mixed model

$$Y(s) = t(s)'\alpha + S(s)'\eta + \xi(s), \qquad (4)$$

where the first term on the right-hand side of (4) accounts for an assumed linear model in the trend, $t(\cdot) \equiv (t_1(\cdot), \ldots, t_p(\cdot))'$ is a vector of $p$ known covariates, and the vector of linear coefficients, $\alpha$, is unknown. The middle term, $S(s)'\eta$, captures the spatial covariance and is assumed to have mean zero and finite, possibly heteroskedastic variance. The term $S(s)'$ is a spatial expansion of the location $s$ using a set of $r$ known spatial basis functions, and the term $\eta$ is assumed to follow an $r$-dimensional Gaussian distribution $N(0, K)$. The last term, $\xi(\cdot)$, describes the variability of the process at the scale of the BAU, and it is assumed to be an independent Gaussian process with mean zero and variance $\sigma^2_\xi$.

The parameterization of the spatial covariance process, $S(s)'\eta$, in (4) is a dimension reduction technique that models the spatial covariance structure with an $r$-dimensional Gaussian random effects model. This parameterization has the attractive property that the variance/covariance matrix of the data can be inverted exactly and quickly with computational complexity $O(Nr^2)$. In vector notation, the data vector for data-set $k$, $Z_k = (Z_k(s_{k1}), \ldots, Z_k(s_{kN_k}))'$, may be written as

$$Z_k = Y_k + \epsilon_k, \quad Y_k = T_k\alpha + S_k\eta + \xi_k, \quad k = 1, 2, \qquad (5)$$

where $Y_k$, $T_k$, $\epsilon_k$, $S_k\eta$ and $\xi_k$ are, respectively, the true process, covariate process, measurement error, smooth spatial variation and fine-scale variation processes evaluated at the footprints in data-set $k$.
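The $O(Nr^2)$ cost quoted above follows from a standard matrix identity. As a sketch, assume that the fine-scale and measurement-error terms are independent across footprints, so that their combined covariance $D_k$ is diagonal. The symbols $\Sigma_k$, $D_k$ and $V_{\epsilon,k}$ are introduced here for illustration and are not the paper's notation; $\sigma^2_\xi$ is the fine-scale variance from (4). The covariance of data-set $k$ and its inverse, via the Sherman-Morrison-Woodbury formula, are then

$$\Sigma_k \equiv \mathrm{var}(Z_k) = S_k K S_k' + D_k, \qquad D_k = \sigma^2_\xi I_{N_k} + V_{\epsilon,k},$$

$$\Sigma_k^{-1} = D_k^{-1} - D_k^{-1} S_k \left(K^{-1} + S_k' D_k^{-1} S_k\right)^{-1} S_k' D_k^{-1}.$$

Here $S_k$ is the $N_k \times r$ matrix whose rows are the basis functions evaluated at the footprints, and $V_{\epsilon,k}$ is the diagonal matrix of measurement-error variances. Inverting $D_k$ is trivial because it is diagonal, forming $S_k' D_k^{-1} S_k$ costs $O(N_k r^2)$, and the only matrix inverted numerically is $r \times r$, so no $N_k \times N_k$ inversion is needed.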

As in traditional kriging, we construct an estimate of the true process at a location $s_0$ as a linear combination of the data vectors, $\hat{Y}(s_0) = a_1'Z_1 + a_2'Z_2$. We need to solve for the vectors of kriging coefficients $a_1$ and $a_2$ that minimize the expected squared prediction error

$$E\left(\left|Y(s_0) - \hat{Y}(s_0)\right|^2\right)$$

subject to the unbiasedness condition $E(\hat{Y}(s_0)) = E(Y(s_0))$. The solution to this minimisation problem can be simplified by considering an alternative formulation of our problem. Given the data vectors $Z_1$ and $Z_2$, we can stack them to form the following model

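$$\begin{pmatrix} Z_1 \\ Z_2 \end{pmatrix} = \begin{pmatrix} T_1 \\ T_2 \end{pmatrix}\alpha + \begin{pmatrix} S_1\eta \\ S_2\eta \end{pmatrix} + \begin{pmatrix} \xi_1 \\ \xi_2 \end{pmatrix} + \begin{pmatrix} \epsilon_1 \\ \epsilon_2 \end{pmatrix}, \qquad (6)$$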

or equivalently

$$Z_F = T_F\alpha + S_F\eta + \xi_F + \epsilon_F, \qquad (7)$$

where $Z_F$, $T_F$, $S_F$, $\xi_F$ and $\epsilon_F$ each denote the stacked version of the corresponding vector or matrix. The formulation in (7) is essentially a single data-set model, and it is straightforward to derive the optimal kriging coefficients and the corresponding prediction error. Due to the parameterization of the spatial covariance process in (4), computation of the inverse of the covariance matrix, $\mathrm{var}(Z_F)^{-1}$, has computational complexity $O(Nr^2)$; consequently, estimates of $Y(\cdot)$ using the entire data-set $Z_F$ can be computed very quickly. Upscaling and downscaling from the data is also straightforward due to the linear nature of (4) (see Nguyen et al. 2012 for details).
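The stacking idea can be illustrated with a small NumPy sketch. It is a simplified illustration, not the authors' implementation: it assumes a zero trend and known parameters (simple kriging rather than the unbiasedness-constrained estimator), a one-dimensional domain, and Gaussian bump basis functions. The names `basis`, `woodbury_solve` and `fuse_predict` and all numeric values are hypothetical.

```python
import numpy as np

def basis(s, centers, width):
    """Evaluate r Gaussian bump basis functions at locations s -> (N, r)."""
    s = np.asarray(s, dtype=float).reshape(-1, 1)
    return np.exp(-0.5 * ((s - centers.reshape(1, -1)) / width) ** 2)

def woodbury_solve(S, K, d, B):
    """Solve (S K S' + diag(d)) X = B for an (N, m) matrix B.

    Only an r x r system is solved, so the cost is O(N r^2).
    """
    Dinv_B = B / d[:, None]
    Dinv_S = S / d[:, None]
    inner = np.linalg.inv(K) + S.T @ Dinv_S           # r x r
    return Dinv_B - Dinv_S @ np.linalg.solve(inner, S.T @ Dinv_B)

def fuse_predict(s0, s1, z1, sd1, s2, z2, sd2, centers, width, K, var_xi):
    """Predict Y(s0) from two stacked data-sets; returns (mean, variance)."""
    S_F = np.vstack([basis(s1, centers, width), basis(s2, centers, width)])
    Z_F = np.concatenate([z1, z2])
    # Diagonal part: fine-scale variance plus each data-set's error variance.
    d = np.concatenate([np.full(len(z1), var_xi + sd1 ** 2),
                        np.full(len(z2), var_xi + sd2 ** 2)])
    S0 = basis(s0, centers, width)
    # cov(Z_F, Y(s0)), assuming s0 does not coincide with a data location.
    c = S_F @ K @ S0.T
    pred = (c.T @ woodbury_solve(S_F, K, d, Z_F[:, None])).ravel()
    prior_var = np.sum((S0 @ K) * S0, axis=1) + var_xi
    var = prior_var - np.sum(c * woodbury_solve(S_F, K, d, c), axis=0)
    return pred, var

# Hypothetical data: a sparse low-noise data-set and a dense high-noise one.
rng = np.random.default_rng(0)
centers, width, var_xi = np.linspace(0.0, 1.0, 5), 0.2, 0.001
K = 0.05 * np.eye(5)
eta = rng.multivariate_normal(np.zeros(5), K)
s1, s2 = rng.uniform(0, 1, 20), rng.uniform(0, 1, 200)
z1 = basis(s1, centers, width) @ eta + rng.normal(0, np.sqrt(var_xi + 0.02 ** 2), 20)
z2 = basis(s2, centers, width) @ eta + rng.normal(0, np.sqrt(var_xi + 0.10 ** 2), 200)
pred, var = fuse_predict(np.array([0.25, 0.5, 0.75]), s1, z1, 0.02,
                         s2, z2, 0.10, centers, width, K, var_xi)
```

Each observation enters through the diagonal term `d`, so a data-set with a smaller error standard deviation receives more weight per observation, while a much larger data-set can still dominate through sheer volume.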

4. Results

4.1. UK results

Our study domain is divided into three sub-regions, namely eastern (east of 90°W), central (between 90° and 108°W) and western (west of 108°W) regions. Although GOES provides an average coverage of 52% nationwide at the daily level, GASP AOD is known to have spatially varying retrieval errors when compared with AERONET observations (Prados et al. 2007; Remer et al. 2008). Mean differences between AERONET AOD and GASP AOD at the collocated locations in both the western and central regions are high (−0.17), indicating overestimation of AOD by GOES. These differences arise mostly from satellite retrieval errors that are a direct result of assumptions on surface reflectance and atmospheric and aerosol properties. However, negative GOES-AOD retrievals are reduced by 88% in the estimated AOD through UK. MODIS AOD is used in the variogram modelling process, and thus its data characteristics are equally important. In the western USA, where AERONET sites are located over coastal and desert regions that are characterized by a larger than average MODIS retrieval error (Levy et al. 2005), MODIS AOD estimates are high because of surface effects. MODIS-AERONET differences at the collocated locations will increase with increase in their known biases (Levy et al. 2010), and therefore the estimated AOD will also exhibit extreme highs and lows.

Figure 2. Example plot panel consisting of daily mean distribution of GASP AOD (top row), AOD estimates of UK (middle row) and SSDF (bottom row) for 16 July 2006.

We compared the AOD distributions of the raw satellite estimates and kriged estimates with ground-based AOD to visualize the improvement in data quality. Figure 2 shows the daily mean distribution of GASP AOD compared with estimates from UK and SSDF for 16 July 2006. The mean AOD estimates of UK appear to calibrate the AOD surface to the observed value around each AERONET site (17 sites). On this particular day, GOES has coverage of about 89%. With only a few AERONET sites covering the entire nation, UK is unable to capture fine variability in the spatial trend. However, UK performs well in estimating the low AOD values that are otherwise retrieved as negative AOD by GOES. Spatial coverage of UK estimates is limited to that of GOES on a given day, but SSDF ensures complete coverage. Overall, with the aid of AERONET, kriging considerably reduces the standard deviation of the AOD, up to 0.084 in spring, 0.13 in summer, 0.11 in autumn and 0.057 in winter. During the summer season, less than 75% of the estimated AOD values are below the observed median of AERONET, indicating that kriging underestimates high levels of AOD. Also, the kriging model overestimates lower values of AERONET AOD in winter.

4.1.1. UK cross-validation results

Leave-one-out cross-validation (LOOCV) is useful in assessing the fit of the model covariance function to the observed AOD surface. We assumed that the model covariance function is close to the true covariance function. LOOCV essentially involves treating one data point as validation data and the remaining data points as training data; the process is repeated so that every data point is treated as validation data exactly once. Summary statistics are computed for days that included more than three collocated data points per day. These are shown in Table 1. MODIS overestimates raw AOD in the western region of the study domain throughout the year. Hence, estimated AOD also exhibits a similar trend. While the range of GOES estimates resembles that of AERONET AOD, there are outliers in the distribution with high kriging errors.
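The LOOCV loop described above can be sketched in a few lines of Python. The sketch is generic: an inverse-distance-weighted predictor (`idw_predict`) stands in for the kriging predictor, R² is taken as the squared Pearson correlation between observed and predicted values (an assumption about how the summary statistic is defined), and the site coordinates and AOD values are made up.

```python
import math

def idw_predict(train, target_xy, power=2.0):
    """Inverse-distance-weighted prediction; a stand-in for kriging."""
    num = 0.0
    den = 0.0
    for (x, y), value in train:
        dist = math.hypot(x - target_xy[0], y - target_xy[1])
        if dist == 0.0:
            return value
        weight = dist ** -power
        num += weight * value
        den += weight
    return num / den

def loocv(data, predict):
    """Hold out each point once, predict it from the rest; return (rmse, r2)."""
    observed = [value for _, value in data]
    predicted = [predict(data[:i] + data[i + 1:], xy)
                 for i, (xy, _) in enumerate(data)]
    n = len(data)
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    sxy = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    sxx = sum((o - mean_o) ** 2 for o in observed)
    syy = sum((p - mean_p) ** 2 for p in predicted)
    # Squared Pearson correlation; undefined if either series is constant.
    r2 = sxy ** 2 / (sxx * syy) if sxx > 0 and syy > 0 else float("nan")
    return rmse, r2

# Hypothetical collocated sites: ((x, y), observed AOD)
sites = [((0.0, 0.0), 0.10), ((1.0, 0.0), 0.12), ((0.0, 1.0), 0.15),
         ((1.0, 1.0), 0.20), ((0.5, 0.5), 0.14)]
rmse, r2 = loocv(sites, idw_predict)
```

Passing the predictor as a function keeps the cross-validation loop independent of the interpolation method, so the same loop could wrap either UK or SSDF predictions.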

4.2. SSDF results

We apply SSDF to MODIS and GOES data to produce fused estimates of AOD. Unlike UK, where bias calibration is integrated into the procedure, we first apply a bias calibration algorithm to GOES and MODIS data, and then apply the data fusion methodology described in Section 3.2 to the bias-corrected data.

We account for the biases by building a repository of collocated MODIS-AERONET and GOES-AERONET pixels over the study period. This results in 5857 matched data points for GOES and 5052 matched data points for MODIS. We then build two random forest models (Breiman 2001), one for each sensor. We choose random forest because it is known to be a robust learning algorithm (Caruana et al. 2008). Based on cross-validation analysis, we restrict the predictors to season and longitude. These random forest bias models are then applied to the raw GOES and MODIS data to produce two bias-corrected data-sets. We apply SSDF (Section 3.2) to the bias-corrected data-sets to produce fused, bias-corrected estimates of AOD. In Figure 4, we display the fused AOD seasonal predictions from SSDF.
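The bias-correction workflow (fit a bias model on collocated match-ups, then subtract the predicted bias from the raw retrievals) can be sketched as follows. To keep the example dependency-free, a table of group means by season and longitude bin is used as a deliberately simplified stand-in for the random forest; it is not the model used in the study. The function names, the 10° bin width and the match-up values are hypothetical.

```python
from collections import defaultdict

def lon_bin(lon, width=10.0):
    """Index of the longitude bin containing lon (floor division)."""
    return int(lon // width)

def fit_bias_table(matchups, width=10.0):
    """Mean (satellite - AERONET) difference per (season, longitude bin).

    matchups: iterable of (season, lon, satellite_aod, aeronet_aod).
    """
    sums = defaultdict(lambda: [0.0, 0])
    for season, lon, sat, ref in matchups:
        cell = sums[(season, lon_bin(lon, width))]
        cell[0] += sat - ref
        cell[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}

def correct(season, lon, sat, bias_table, width=10.0):
    """Subtract the estimated bias; unchanged if the group has no match-ups."""
    return sat - bias_table.get((season, lon_bin(lon, width)), 0.0)

# Hypothetical collocated match-ups for one sensor
matchups = [("summer", -105.0, 0.30, 0.20),
            ("summer", -103.0, 0.26, 0.18),
            ("winter", -85.0, 0.05, 0.07)]
table = fit_bias_table(matchups)
corrected = correct("summer", -104.0, 0.25, table)
```

One such table (or, in the study, one random forest) would be fitted per sensor, and the corrected retrievals would then be passed to the fusion step.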

The data fusion AOD predictions are influenced primarily by GOES. While SSDF does take into account the smaller measurement error standard deviation of MODIS, GOES has about one order of magnitude more data, and hence is able to exert a stronger influence on the SSDF outputs. The SSDF seasonal distribution maps are smoother than the corresponding UK maps. This is because SSDF assumes a covariance structure with a larger correlation length with respect to space.

To better understand how SSDF performs, we look at a specific daily interpolated map for 16 July 2006 (see Figure 2). The top panel shows the raw GASP AOD, while the bottom panel shows the SSDF AOD estimates. The GASP AOD map has generally clear scenes everywhere except for low-visibility AOD plumes over southern California, South Dakota, Georgia and Texas. The SSDF daily map is able to reproduce the AOD plumes over southern California, South Dakota and Georgia, but it overly smooths the AOD plume over Texas, most likely because AOD observations over southern Texas tend to be a mixture of both low and medium AOD. The GASP AOD map also has a region of negative AOD over Virginia, and here SSDF faithfully reproduces negative AOD in the same region.

Like UK, SSDF significantly reduces noise, allowing the analyst to better judge the spatial/seasonal distribution of AOD. SSDF outputs have significantly narrower ranges than those of the GOES or MODIS AOD inputs. We observe about a 79% reduction in negative AOD after data fusion. In the next section, we examine a direct comparison between SSDF and UK in terms of performance (reduction in root mean square error) using leave-one-out cross-validation.

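Table 2. Summary statistics of observed vs. predicted AOD in the years 2005–2006 by season and region. The SSDF and UK columns are cross-validation results.

| Group | Year | AERONET mean | N | SSDF mean | SSDF R² | SSDF RMSE | UK mean | UK R² | UK RMSE |
|---|---|---|---|---|---|---|---|---|---|
| Region | | | | | | | | | |
| Eastern USA | 2005 | 0.181 | 1253 | 0.180 | 0.33 | 0.150 | 0.160 | 0.64 | 0.110 |
| Eastern USA | 2006 | 0.150 | 1492 | 0.150 | 0.23 | 0.126 | 0.137 | 0.57 | 0.096 |
| Central USA | 2005 | 0.115 | 649 | 0.116 | 0.13 | 0.087 | 0.140 | 0.1 | 0.089 |
| Central USA | 2006 | 0.108 | 837 | 0.108 | 0.15 | 0.072 | 0.120 | 0.1 | 0.074 |
| Western USA | 2005 | 0.094 | 610 | 0.093 | 0.161 | 0.065 | 0.133 | 0.003 | 0.071 |
| Western USA | 2006 | 0.093 | 1120 | 0.092 | 0.024 | 0.087 | 0.109 | 0.03 | 0.086 |
| Season | | | | | | | | | |
| Spring | 2005 | 0.133 | 532 | 0.170 | 0.12 | 0.092 | 0.133 | 0.48 | 0.071 |
| Spring | 2006 | 0.114 | 843 | 0.144 | 0.05 | 0.070 | 0.116 | 0.3 | 0.060 |
| Summer | 2005 | 0.224 | 740 | 0.202 | 0.35 | 0.162 | 0.236 | 0.3 | 0.169 |
| Summer | 2006 | 0.172 | 1003 | 0.175 | 0.27 | 0.128 | 0.175 | 0.34 | 0.122 |
| Autumn | 2005 | 0.106 | 974 | 0.097 | 0.16 | 0.099 | 0.106 | 0.41 | 0.083 |
| Autumn | 2006 | 0.113 | 880 | 0.078 | 0.11 | 0.119 | 0.114 | 0.26 | 0.108 |
| Winter | 2005 | 0.070 | 266 | 0.087 | 0.09 | 0.051 | 0.086 | 0.001 | 0.053 |
| Winter | 2006 | 0.069 | 723 | 0.072 | 0.05 | 0.051 | 0.072 | 0.03 | 0.052 |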

4.2.1. SSDF cross-validation

We repeat the leave-one-out cross-validation exercise outlined in Section 4.1.1. We use SSDF to make estimates of AOD for the same cross-validation AERONET observations used in UK, and we display the aggregated statistical summaries with respect to region and season in Table 2.

One salient feature of the statistical summaries is that SSDF performs similarly to UK with respect to RMSE in the western and central regions, and worse than UK in the eastern region. The eastern seaboard is composed of dense urban centres with high spatial variability in AOD. The particular parameterization we used for SSDF in this analysis corresponds to a covariance structure with large correlation lengths, and consequently it is likely that we are unable to capture that high spatial variability in the eastern seaboard, while local UK was able to do so. SSDF estimates made in the interior and western regions of the USA are more stable, and the performance at those locations is comparable in quality to that of the corresponding UK estimates.

When statistical summaries are aggregated over season instead of region, UK consistently has comparable or smaller RMSE. This is because the computation of RMSE over the course of a season includes data from AERONET stations in all three regions, thereby averaging the differential efficiencies between the two methodologies with respect to the three regions.
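The averaging effect can be checked with a short calculation. The sketch below pools the 2005 regional RMSE values and sample sizes from Table 2 into a single RMSE over all points, assuming the regional groups are disjoint (the regional and seasonal sample sizes for 2005 both sum to 2512, which is consistent with this). The function name `pooled_rmse` is introduced here for illustration.

```python
import math

def pooled_rmse(groups):
    """Combine per-group (n, rmse) pairs into the RMSE over all points."""
    total_n = sum(n for n, _ in groups)
    total_sq_err = sum(n * rmse ** 2 for n, rmse in groups)
    return math.sqrt(total_sq_err / total_n)

# 2005 regional rows of Table 2 as (N, RMSE): eastern, central, western USA
ssdf_2005 = [(1253, 0.150), (649, 0.087), (610, 0.065)]
uk_2005 = [(1253, 0.110), (649, 0.089), (610, 0.071)]

ssdf_pooled = pooled_rmse(ssdf_2005)
uk_pooled = pooled_rmse(uk_2005)
print(round(ssdf_pooled, 3), round(uk_pooled, 3))
```

The pooled values come out at about 0.119 for SSDF and 0.096 for UK. Because the eastern region contributes the most points and shows the largest gap in favour of UK, it dominates any pooled score, which is in line with UK having comparable or smaller RMSE once regions are mixed within a season.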

4.3. Discussion

A quick comparison of Figures 3 and 4 illustrates the differences between UK and SSDF. UK has limited coverage; its output coverage is about 47% in the western region, 52% in the central region and 57% in the eastern region. In the seasonal distribution of UK estimates shown in Figure 3, there are blank spots in the areas immediately west of the Rocky Mountains (top-left and bottom-right panels), indicating lack of coverage. SSDF has much better coverage; in principle, SSDF is able to make estimates of AOD at any location in the domain, although in practice it is prudent to make predictions only near locations where we actually have data (interpolation) and to refrain from estimating AOD in areas far away from observed data (extrapolation). Both methodologies produce estimates of prediction standard error, allowing the user to assess the uncertainty of the prediction. In SSDF’s case, the prediction standard error can be combined with a reasonable threshold to judge whether an estimate at a location should be considered an interpolation or an extrapolation.

Figure 3. Multi-annual seasonal distribution of predicted AOD obtained by UK approach. ‘D’ denotes AERONET sites. Symbol colour represents seasonal mean AOD values reported by AERONET.
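The idea of screening SSDF predictions by their prediction standard error can be sketched as a simple mask. The threshold value, the function name `flag_extrapolation` and the example numbers are hypothetical; the paper does not specify a threshold.

```python
def flag_extrapolation(predictions, std_errors, max_se):
    """Keep predictions whose standard error is at most max_se.

    Entries judged to be extrapolations are replaced by None.
    """
    return [pred if se <= max_se else None
            for pred, se in zip(predictions, std_errors)]

# Hypothetical predicted AOD values and their prediction standard errors
aod = [0.12, 0.15, 0.30]
se = [0.02, 0.03, 0.12]
masked = flag_extrapolation(aod, se, max_se=0.05)
```

A suitable threshold would have to be chosen from the distribution of standard errors in the region of interest, for instance relative to the errors observed close to data locations.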

Although SSDF has better coverage, UK is more stable when making estimates of AOD near the edge of the domain. This is because the parameterization for SSDF relies on estimating a larger number of parameters for the covariance structure, and thus the methodology is more susceptible to overfitting. UK does not have this shortcoming; a quick examination of Figure 3 shows that UK remains quite stable near the boundary of the eastern USA.

UK and SSDF each have strengths that are quite complementary. The former performs quite well near the domain boundary, while the latter has better coverage. A good case can be made for using the two methodologies in conjunction with one another to produce a more complete and accurate characterisation of the AOD distribution.

Figure 4. Multi-annual seasonal distribution of predicted AOD obtained by SSDF approach. ‘D’ denotes AERONET sites. Symbol colour represents seasonal mean AOD values reported by AERONET.

Although the UK approach employs daily AERONET calibration, and the SSDF approach performs calibration before interpolation, satellite data still have a large impact on the quality of the fusion product. With high-quality satellite data, AOD estimates will be closer to ground truth. The large variability in the retrieval quality of satellite AOD is a function of aerosol loading, location, season and prevalent aerosol type (Levy et al. 2010). We observe this predominantly in the western region of the domain and during winter. If the satellite retrieval biases are minimized to a reasonable extent, AOD estimates obtained using high-quality satellite inputs will have lower levels of uncertainty.

5. Summary

We implicitly calibrated raw satellite AOD using ground truth through UK and SSDF techniques on a daily basis. Cross-validation was performed on each of the resulting AOD estimates to assess the overall accuracy of the fitted model. It suggests that UK performs better than SSDF in the eastern domain, with high R² values (0.64 in 2005 and 0.57 in 2006). Daily mean values within the study domain closely match those of ground-based AERONET. We achieve better spatial coverage with SSDF at a daily level. Negative AOD in the satellite data was reduced by 88% through the UK approach and by 79% with SSDF. SSDF tries to overcome the change-of-support problem to minimize the bias in the merged product. However, the UK methodology is based on the association of sensors with AERONET and not with one another. Our results from both techniques confirm that satellite data are more biased in the western region of the domain and also during winter. Our analysis shows that the UK method is feasible but depends on the availability of ground truth and accurate sensor data. However, it is possible to decrease model uncertainty by including additional auxiliary variables at the expense of limited spatial coverage. That is to say, on a given day, spatially collocated pixels from multiple sensors can be included in the model, but the number of collocated pixels will be restricted by the sensor with the least spatial coverage. Finally, the propagation of errors in the covariance model to the fitted AOD surface needs to be investigated, but that is beyond the scope of this paper.

Acknowledgements

We thank the (PI investigators) and their staff for establishing and maintaining the sites used in this investigation within the Continental United States. The work of Jinnagara Puttaswamy, Hu and Liu was supported by the NASA Applied Sciences Program managed by John Haynes and Sue Estes (grant no. NNX09AT52G).

References

Anderson TL, Charlson RJ, Winker DM, Ogren JA, Holmen K. 2003. Mesoscale variations of tropospheric aerosols. J Atmos Sci. 60:119–136.

Bailey TC, Gatrell AC. 1995. Interactive spatial data analysis. Harlow Essex; New York: Longman Scientific & Technical; J. Wiley.

Banerjee S, Gelfand AE, Finley AO, Sang H. 2008. Gaussian prediction process models for large spatial data sets. J R Stat Soc Ser B. 70:825–848.

Berrocal VJ, Gelfand AE, Holland DM. 2010. A spatio-temporal downscaler for output from numerical models. J Agric Biol Environ Stat. 15:176–197.

Brauer M, Hoek G, Smit HA, de Jongste JC, Gerritsen J, Postma DS, Kerkhof M, Brunekreef B. 2007. Air pollution and development of asthma, allergy and infections in a birth cohort. Eur Respir J. 29:879–888.

Breiman L. 2001. Random forests. Mach Learn. 45:5–32.

Caruana R, Karampatziakis N, Yessenalina A. 2008. An empirical evaluation of supervised learning in high dimensions. Proceedings of the 25th international conference on Machine learning. Helsinki, Finland: ACM.

Chatterjee A, Michalak AM, Kahn RA, Paradise SR, Braverman AJ, Miller CE. 2010. A geostatistical data fusion technique for merging remote sensing and ground-based observations of aerosol optical thickness. J Geophys Res. 115:12.

Cressie NAC. 1993. Statistics for spatial data. New York (NY): Wiley.

Cressie N, Johannesson G. 2008. Fixed rank kriging for very large spatial data sets. J R Stat Soc Series B. 70:209–226.

Darrow LA, Strickland MJ, Klein M, Waller LA, Flanders WD, Correa A, Marcus M, Tolbert PE. 2009. Seasonality of birth and implications for temporal studies of preterm birth. Epidemiology. 20:699–706.

Dubovik O, Smirnov A, Holben B, King MD, Kaufman YJ, Eck TF, Slutsker I. 2000. Accuracy assessments of aerosol optical properties retrieved from Aerosol Robotic Network (AERONET) Sun and sky radiance measurements. J Geophys Res-Atmos. 105:9791–9806.

Engel-Cox JA, Holloman CH, Coutant BW, Hoff RM. 2004. Qualitative and quantitative evaluation of MODIS satellite sensor data for regional and urban scale air quality. Atmos Environ. 38:2495–2509.

Fuentes M, Raftery AE. 2005. Model evaluation and spatial interpolation by Bayesian combination of observations with outputs from numerical models. Biometrics. 61:36–45.

Goldman GT, Mulholland JA, Russell AG, Srivastava A, Strickland MI, Klein M, Waller LA, Tolbert PE, Edgerton ES. 2010. Ambient air pollutant measurement error: characterization and impacts in a time-series epidemiologic Study in Atlanta. Environ Sci Technol. 44:7692–7698.

Gotway CA, Young LJ. 2002. Combining incompatible spatial data. J Am Stat Assoc. 97:632–648.

Gupta P, Patadia F, Christopher SA. 2008. Multisensor data product fusion for aerosol research. Geosci Remot Sens IEEE Trans. 46:1407–1415.

Hiemstra PH, Pebesma EJ, Twenhofel CJW, Heuvelink GBM. 2009. Real-time automatic interpolation of ambient gamma dose rates from the Dutch radioactivity monitoring network. Comput Geosci. 35:1711–1721.

Hu ZY. 2009. Spatial analysis of MODIS aerosol optical depth, PM(2.5), and chronic coronary heart disease. Int J Health Geogr. 8:10.

Hu ZY, Rao KR. 2009. Particulate air pollution and chronic ischemic heart disease in the eastern United States: a county level ecological study using satellite aerosol data. Environ Health. 8:10.

Hutchison KD, Smith S, Faruqui SJ. 2005. Correlating MODIS aerosol optical thickness data with ground-based PM2.5 observations across Texas for use in a real-time air quality prediction system. Atmos Environ. 39:7190–7203.

Journel AG, Rossi ME. 1989. When do we need a trend model in kriging. Math Geol. 21:715–739.

Jun M, Stein ML. 2008. Nonstationary covariance models For global data. Ann Appl Stat. 2:1271–1289.

Kinne S. 2009. Remote sensing data combinations: superior global maps for aerosol optical depth. In: Kokhanovsky A, Leeuw G, editors. Satellite aerosol remote sensing over land. Berlin Heidelberg: Springer.

Knapp KR, Frouin R, Kondragunta S, Prados A. 2005. Toward aerosol optical depth retrievals over land from GOES visible radiances: determining surface reflectance. Int J Remote Sens. 26:4097–4116.

Kondragunta S, Lee P, McQueen J, Kittaka C, Prados AI, Ciren P, Laszlo I, Pierce RB, Hoff R, Szykman JJ. 2008. Air quality forecast verification using satellite data. J Appl Meteorol Climatol. 47:425–442.

Levy RC, Remer LA, Martins JV, Kaufman YJ, Plana-Fattori A, Redemann J, Wenny B. 2005. Evaluation of the MODIS aerosol retrievals over ocean and land during CLAMS. J Atmos Sci. 62:974–992.

Levy RC, Remer LA, Vermote EF, Mattoo S, Kaufman YJ. 2007. Second-generation operational algorithm: retrieval of aerosol properties over land from inversion of moderate resolution imaging spectroradiometer spectral reflectance. J Geophys Res Atmos. 112:D13211.

Levy RC, Remer LA, Kleidman RG, Mattoo S, Ichoku C, Kahn R, Eck TF. 2010. Global evaluation of the Collection 5 MODIS dark-target aerosol products over land. Atmos Chem Phys. 10:10399–10420.

Liu Y, Franklin M, Kahn R, Koutrakis P. 2007. Using aerosol optical thickness to predict ground-level PM2.5 concentrations in the St. Louis area: a comparison between MISR and MODIS. Remote Sens Environ. 107:33–44.

Liu Y, Paciorek CJ, Koutrakis P. 2009. Estimating regional spatial and temporal variability of PM2.5 concentrations using satellite data, meteorology, and land use information. Environ Health Perspect. 117:886–892.

Liu Y, Sarnat JA, Coull BA, Koutrakis P, Jacob DJ. 2004. Validation of Multiangle Imaging Spectroradiometer (MISR) aerosol optical thickness measurements using Aerosol Robotic Network (AERONET) observations over the contiguous United States. J Geophys Res. 109:D06205.

Metzger KB, Tolbert PE, Klein M, Peel JL, Flanders WD, Todd K, Mulholland JA, Ryan PB, Frumkin H. 2004. Ambient air pollution and cardiovascular emergency department visits. Epidemiology. 15:46–56.

Nguyen H, Cressie N, Braverman A. 2012. Spatial statistical data fusion for remote sensing applications. J Am Stat Assoc. 107:1004–1018.

Nirala M. 2008. Technical note - multi-sensor data fusion of aerosol optical thickness. Int J Remote Sens. 29:2127–2136.

Paciorek CJ, Liu Y, Moreno-Macias H, Kondragunta S. 2008. Spatio-temporal associations between GOES aerosol optical depth retrievals and ground-level PM2.5. Environ Sci Technol. 42:5800–5806.

Pebesma EJ. 2004. Multivariable geostatistics in S: the gstat package. Comput Geosci. 30:683–691.

Pebesma E, Cornford D, Dubois G, Heuvelink GBM, Hristopulos D, Pilz J, Stohlker U, Morin G, Skoien JO. 2011. INTAMAP: the design and implementation of an interoperable automated interpolation web service. Comput Geosci. 37:343–352.

Peel JL, Metzger KB, Klein M, Flanders WD, Mulholland JA, Tolbert PE. 2007. Ambient air pollution and cardiovascular emergency department visits in potentially sensitive groups. Am J Epidemiol. 165:625–633.

Pope CA, Ezzati M, Dockery DW. 2009. Fine-particulate air pollution and life expectancy in the United States. N Engl J Med. 360:376–386.

Prados AI, Kondragunta S, Ciren P, Knapp KR. 2007. GOES aerosol/smoke product (GASP) over North America: Comparisons to AERONET and MODIS observations. J Geophys Res Atmos. 112:D15201.

Remer LA, Kleidman RG, Levy RC, Kaufman YJ, Tanre D, Mattoo S, Martins JV, Ichoku C, Koren I, Yu HB, Holben BN. 2007. GOES aerosol/smoke product (GASP) over North America: Comparisons to AERONET and MODIS observations. J Geophys Res Atmos. 112:D15201.

Stein ML. 1988. Asymptotically efficient prediction of a random field with a misspecified covariance function. Ann Stat. 16:55–63.

Strickland M, Darrow L, Mulholland J, Klein M, Flanders WD, Winquist A, Tolbert P. 2011. Implications of different approaches for characterizing ambient air pollutant concentrations within the urban airshed for time-series studies and health benefits analyses. Environ Health. 10:36.

Wikle CK, Berliner LM. 2005. Combining information across spatial scales. Technometrics. 47:80–91.

Wikle CK, Nychka D, Royle J. 2002. Multiresolution models for nonstationary spatial covariance functions. Stat Model. 2:315–331.

Xu QF, Obradovic Z, Han B, Li Y, Braverman A, Vucetic S. 2005. Improving aerosol retrieval accuracy by integrating AERONET, MISR and MODIS data. New York: IEEE.

Zubko V, Leptoukh GG, Gopalan A. 2010. Study of data-merging and interpolation methods for use in an interactive online analysis system: MODIS terra and aqua daily aerosol case. IEEE Trans Geosci Remote Sens. 48:4219–4235.
