11
Development and case study of a science-based software platform to support policy making on air quality Yun Zhu 1, 2, , Yanwen Lao 3 , Carey Jang 4 , Chen-Jen Lin 1, 5 , Jia Xing 2 , Shuxiao Wang 2 , Joshua S. Fu 6 , Shuang Deng 7 , Junping Xie 3 , Shicheng Long 3 1. Guangdong Provincial Key Laboratory of Atmospheric Environment and Pollution Control, College of Environmental and Energy, South China University of Technology, Guangzhou Higher Education Mega Center, Guangzhou 510006, China. E-mail: [email protected] 2. State Environmental Protection Key Laboratory of Sources and Control of Air Pollution Complex, School of Environment, Tsinghua University, Beijing 100084, China 3. College of Environmental and Energy, South China University of Technology, Guangzhou Higher Education Mega Center, Guangzhou 510006, China 4. USEPA/Office of Air Quality Planning & Standards, RTP, NC 27711, USA 5. Department of Civil Engineering, Lamar University, Beaumont, TX 77710-0024, USA 6. Department of Civil & Environmental Engineering, University of Tennessee, Knoxville, TN 37996-2010, USA 7. Chinese Research Academy of Environmental Sciences, Beijing 100012, China ARTICLE INFO ABSTRACT Article history: Received 3 February 2014 Revised 15 July 2014 Accepted 28 July 2014 Available online 26 November 2014 This article describes the development and implementations of a novel software platform that supports real-time, science-based policy making on air quality through a user-friendly interface. The software, RSM-VAT, uses a response surface modeling (RSM) methodology and serves as a visualization and analysis tool (VAT) for three-dimensional air quality data obtained by atmospheric models. The software features a number of powerful and intuitive data visualization functions for illustrating the complex nonlinear relationship between emission reductions and air quality benefits. The case study of contiguous U.S. demonstrates that the enhanced RSM-VAT is capable of reproducing the air quality model results with Normalized Mean Bias <2% and assisting in air quality policy making in near real time. © 2014 The Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences. Published by Elsevier B.V. Keywords: Air quality Policy making Response surface modeling Emission control scenarios Data visualization Introduction Continued economic development has led to air quality degradation in urban areas throughout the world. Air pollution is a result of complex interactions among air pollutant emis- sions, meteorological conditions and a wide variety of atmo- spheric processes including transport, chemical transformation and deposition. To improve air quality without impairing economic development, strategies of air pollutant emission control must be carefully formulated by assessing the fate of air pollutant emissions in the atmosphere. Typically, such assess- ments are achieved by air quality modeling using science-based atmospheric models, such as the Community Multi-scale Air Quality (CMAQ) model (Byun and Schere, 2006; Wang et al., JOURNAL OF ENVIRONMENTAL SCIENCES 27 (2015) 97 107 Corresponding author. E-mail address: [email protected] (Y. Zhu). http://dx.doi.org/10.1016/j.jes.2014.08.016 1001-0742/© 2014 The Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences. Published by Elsevier B.V. Available online at www.sciencedirect.com ScienceDirect www.journals.elsevier.com/journal-of-environmental-sciences

Development and case study of a science-based software ...acs.engr.utk.edu/publications/2015_zhu_1_jes.pdfUS. The emission projections for three future years (2010, 2015 and 2020)

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Development and case study of a science-based software ...acs.engr.utk.edu/publications/2015_zhu_1_jes.pdfUS. The emission projections for three future years (2010, 2015 and 2020)

J O U R N A L O F E N V I R O N M E N T A L S C I E N C E S 2 7 ( 2 0 1 5 ) 9 7 – 1 0 7

Ava i l ab l e on l i ne a t www.sc i enced i r ec t . com

ScienceDirect

www. jou rna l s . e l sev i e r . com/ jou rna l -o f - env i r onmenta l - sc i ences

Development and case study of a science-based softwareplatform to support policy making on air quality

Yun Zhu1,2,⁎, Yanwen Lao3, Carey Jang4, Chen-Jen Lin1,5, Jia Xing2, Shuxiao Wang2,Joshua S. Fu6, Shuang Deng7, Junping Xie3, Shicheng Long3

1. Guangdong Provincial Key Laboratory of Atmospheric Environment and Pollution Control, College of Environmental and Energy, South ChinaUniversity of Technology, Guangzhou Higher Education Mega Center, Guangzhou 510006, China. E-mail: [email protected]. State Environmental Protection Key Laboratory of Sources and Control of Air Pollution Complex, School of Environment, Tsinghua University,Beijing 100084, China3. College of Environmental and Energy, South China University of Technology, Guangzhou Higher Education Mega Center,Guangzhou 510006, China4. USEPA/Office of Air Quality Planning & Standards, RTP, NC 27711, USA5. Department of Civil Engineering, Lamar University, Beaumont, TX 77710-0024, USA6. Department of Civil & Environmental Engineering, University of Tennessee, Knoxville, TN 37996-2010, USA7. Chinese Research Academy of Environmental Sciences, Beijing 100012, China

A R T I C L E I N F O

⁎ Corresponding author.E-mail address: [email protected] (Y. Z

http://dx.doi.org/10.1016/j.jes.2014.08.0161001-0742/© 2014 The Research Center for Ec

A B S T R A C T

Article history:Received 3 February 2014Revised 15 July 2014Accepted 28 July 2014Available online 26 November 2014

This article describes the development and implementations of a novel software platformthat supports real-time, science-based policy making on air quality through a user-friendlyinterface. The software, RSM-VAT, uses a response surface modeling (RSM) methodologyand serves as a visualization and analysis tool (VAT) for three-dimensional air quality dataobtained by atmospheric models. The software features a number of powerful and intuitivedata visualization functions for illustrating the complex nonlinear relationship betweenemission reductions and air quality benefits. The case study of contiguous U.S.demonstrates that the enhanced RSM-VAT is capable of reproducing the air quality modelresults with Normalized Mean Bias <2% and assisting in air quality policy making in nearreal time.© 2014 The Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences.

Published by Elsevier B.V.

Keywords:Air qualityPolicy makingResponse surface modelingEmission control scenariosData visualization

Introduction

Continued economic development has led to air qualitydegradation in urban areas throughout the world. Air pollutionis a result of complex interactions among air pollutant emis-sions, meteorological conditions and a wide variety of atmo-spheric processes including transport, chemical transformation

hu).

o-Environmental Science

and deposition. To improve air quality without impairingeconomic development, strategies of air pollutant emissioncontrol must be carefully formulated by assessing the fate of airpollutant emissions in the atmosphere. Typically, such assess-ments are achieved by air quality modeling using science-basedatmospheric models, such as the Community Multi-scale AirQuality (CMAQ) model (Byun and Schere, 2006; Wang et al.,

s, Chinese Academy of Sciences. Published by Elsevier B.V.

Page 2: Development and case study of a science-based software ...acs.engr.utk.edu/publications/2015_zhu_1_jes.pdfUS. The emission projections for three future years (2010, 2015 and 2020)

98 J O U R N A L O F E N V I R O N M E N T A L S C I E N C E S 2 7 ( 2 0 1 5 ) 9 7 – 1 0 7

2014a, 2014b), for multiple emission scenarios. However, thehigh computational cost associated with the modeling effortoften becomes the bottleneck of the assessment process. Inaddition, most atmospheric models used in air qualitymanagement lack a user-friendly interface that can synthe-size the model results produced from different air pollutantemission inputs, and do not offer adequate data visualizationfor supporting policy making.

To improve the efficiency of data synthesis in air qualitymanagement, it is necessary to develop a tool to translate themassive amount of model data to policy-relevant visualization.Ideally, such policy decision support tools should be efficient,easy-to-use, and capable of estimating air quality changes forgiven emission reduction scenarios. In addition, the tools shouldoffer intuitive policy-related visualizations. Some earlier proto-types, such as: the IMPAQT— IntegratedModular Program forAirQuality Tools (Lim et al., 2005) and the GIS-CALPUFF coupled airquality decision support system (Fraser et al., 2013), havedemonstrated the benefits of a decision support tool for urbanair quality assessmentwith limited visualization features. To ourknowledge, there has not been a comprehensive package thathas been demonstrated to support air quality management atboth urban and regional scales. Since many air quality problemsare of regional scale (Wang et al., 2014a, 2014b), for exampleregional haze and trans-boundary transport, an efficient andpowerful tool for supporting decision making in regional airquality is in great need.

To address this need, we rewrote the code of the re-sponse surface model (RSM) (Lao et al., 2012) with C# using thehigh-dimensional Kriging techniques (Mohammadi et al., 2012;Xiao et al., 2009) to interpolate the model results of CMAQfor policy making support. The RSM serves as a statisticalmetamodeling interface for rapid assessment of the massivedata of atmosphericmodel results (Ashok et al., 2013). Followingthe data assessment, it is essential to couple the model resultsin a graphical user interface such as USEPA's Visual PolicyAnalyzer (VPA), which allows using slider bars and a mouse tochange emission factors for visualizing the air quality response(U.S.EPA, 2006). However, VPA has incurred user-friendlinessand data accuracy issues. At this moment, the developmentand support of VPA have been discontinued.

In this work, an enhanced response surface model-visualization and analysis tool (RSM-VAT) is developed forbetter understanding and analyzing the established statisticalrelationship (RSM) between air pollution emission reductionsand air pollutant concentrations. The statistically generalizedRSM surface by the method of MPerK (MATLAB ParametricEmpirical Kriging) reported in our previous work has demon-strated that the nonlinear relationship between emissionsources and air quality concentration of PM2.5 (Wang et al.,2011) or O3 (Xing et al., 2011) can be well analyzed by RSM. Theestablished nonlinear relationship is not only a foundation ofdecision making, but also critical for formulating emissioncontrol strategies. The development of enhanced RSM-VAT isthe extension of our previous work, it aims to provide a seriesadvanced visualization and analysis functions for validatingRSM prediction results and for examining the nonlinearinteractions among multiple air pollutant precursors, in addi-tion to instantly generate the air pollutant concentrationsurface response to air pollutant emission changes.

1. Methodology

Fig. 1 shows the block diagram of the development andapplication of the RSM-VAT. The modules of control matrix,RSM, and quality assurance (QA) are designed to guide the usersthrough theRSMdevelopment andaccuracy validation.Most endusers (policy makers) would directly use the RSM data analysismodule for policy decisions. It greatly improves theuser interfaceand visualization accuracy ofVPA for the RSM results. The designof control matrix, response surface modeling, quality assurance,and RSM data analysis are described here.

1.1. Design of emission control matrix

RSM aggregates the results of pre-specified CMAQ simula-tions into a multi-dimensional “response surface” that alsoincorporates the acceptable uncertainties in environmentaldecision-making. This allows a rapid assessment of airquality impacts caused by different combinations of emis-sion sources. These sources are selected as emission controlfactors as shown in Table 1. The CMAQ simulations areperformed at a defined set of distributed emission invento-ries (control scenarios) in a high-dimensional experimentaldesign space. For example, as one of the source targetssubject to emission control, Electricity Generating Units(EGUs) NOx emission is set at 0–120% of the base-year level.The response surface of the CMAQ results attempts tomaintain the model accuracy while minimizing the compu-tationally expensive CMAQ run at a given emission scenariowithin the experimental design space. The RSMmethod usesstatistical techniques to relate a response variable (e.g., theannual PM2.5 at multiple U.S. receptor sites) to its influencingfactors (e.g., the emission of a pollutant precursor, NOx) fromlocal and regional emission sources.

The control matrix module (Fig. 1) is to create theexperiment design of emission control matrix. The matrix iscreated by sampling the control factors in the design space(e.g., from 0.00 to 1.20) using Latin Hypercube Sampling— LHS(Hirabayashi et al., 2011). LHS is often applied to generatethe experimental design for a Kriging model (Kleijnen, 2009).It is a statistical method for generating a sample of plausiblecollections of parameter values from a multidimensionaldistribution which is more efficient than random samplingfor a large number of factors. Table 2 shows an exampleof emission control matrix created by LHS. The base-yearemission inventory is factored by the weight shown in thematrix (Table 2), forming 181 emission control scenarios forCMAQmodel runs. These emission inventory factors, togeth-er with the identical meteorology and other model input, areapplied in CMAQ simulations to provide the annual PM2.5

concentration estimates for the 181 emission control scenarios.In order to reduce computational time for such a large numberof annual model runs, the results of 4-monthly (February, April,July, and October 2001) CMAQ runs over the contiguous US areutilized in the response surface modeling. These months werechosen based on greatest predictability of the quarterlymean inUS. The emission projections for three future years (2010, 2015and 2020) together with 2001 meteorology data (U.S.EPA, 2005)are used in the CMAQ runs (U.S.EPA, 2006).

Page 3: Development and case study of a science-based software ...acs.engr.utk.edu/publications/2015_zhu_1_jes.pdfUS. The emission projections for three future years (2010, 2015 and 2020)

CMAQ Simulation

ErrorAcceptable?

No

Yes

Emission inventory 3-D MeteorologicalData

Final outputReal-time Data Analysis PlotGraphical Decision Support

RSM DATAANALYSIS MODULE

QUALITY ASSURANCE MODULE

RSM MODULE

CONTROL MATRIX MODULE

System Framework

Phase IRandomized

Experimental Design

Phase IIAtmospheric Model

Simulation

Phase IIIStatistical

Development of RSM

Phase IV

RSM Validation

Phase VData Analysis and

Visualization

RSM-VAT (GUI)

Emission Control Matrix

Fig. 1 – The primary components and the typical application process of the developed response surface model-visualizationand analysis tool (RSM-VAT).

Table 1 – Twelve emissions control factors selected forresponse surface modeling (RSM) modeling.

Controlfactors

Factor description

NOx/EGU NOx IPM (Integrated Planning Model) EGU(Electricity Generation Unit) point sourceemissions

NOx/NonEGU + area

NOx IPM non-EGU point source, area source, andagricultural source emissions

NOx/mobile NOx nonroad source and mobile source emissionsSOx/EGU SOx IPM EGU point source emissionsSOx/NonEGU_Point

SOx IPM Non-EGU point source emissions

SOx/area SOx area source and agricultural source emissionsVOC/all Volatile organic carbon IPM EGU point source, IPM

Non-EGU point source, area source, agriculturalsource, nonroad source, and mobile sourceemissions

NH3/area Ammonia area source and agricultural sourceemissions

NH3/mobile Ammonia non-road source and mobile sourcesources

POC & PEC/EGU + NonEGU

Elemental carbon and organic carbon IPM EGUpoint source and IPM Non-EGU point sourceemissions

POC & PEC/mobile

Elemental carbon and organic carbon nonroadsource and mobile source emissions

POC & PEC/area Elemental carbon and organic carbon areasource and agricultural source emissions

99J O U R N A L O F E N V I R O N M E N T A L S C I E N C E S 2 7 ( 2 0 1 5 ) 9 7 – 1 0 7

Twelve emission sources, which are treated separately inlocal and regional area, are selected for the RSM modeldevelopment (Table 1). All the urban emissions are controlledusing the same control factor. Under this circumstance, thewithin-city emission is considered “local” and the emissions inother cities are considered “regional” such that the emission inone city does not significantly influence the air quality in anothercity. This is a reasonable assumption as long as the selectedurban areas are sufficiently distant from each other (Fig. 2). The 8urban areas include NewYork/Philadelphia (combined), Chicago,San Joaquin, Atlanta, Salt Lake City, Phoenix, Seattle, and Dallas.

The emission control factors range from 0 to 120%. A zerocontrol factor means that the emission of interest is reducedto zero while a 120% factor indicates that the emission is in-creased by 20%. The LHS method ensures a uniformly distrib-uted sampling in the high-dimensional experimental designspace such that it can capture the linear and nonlinearinteractions among pollutants. The emission control factors inthe emission control matrix (Tables 1, 2) are selected based onemission types and source categories. This allows the generatedRSM to evaluate air quality changes resulting from adjustingeach of the twelve emission sources (Table 1) on urban orregional basis. This experimental design ensures the represen-tativeness of the data used in the response surface regression.

1.2. Development of response surface modeling

A high-dimensional Kriging approach is applied for theresponse surface regression of the CMAQ simulation results

Page 4: Development and case study of a science-based software ...acs.engr.utk.edu/publications/2015_zhu_1_jes.pdfUS. The emission projections for three future years (2010, 2015 and 2020)

Table 2 – Emission control matrix created by controlmatrix module using Latin Hypercube Sampling (LHS)method.

Runa R-1(base)

… R-178 R-180 R-181

(1) Local/NOx/EGU 1 … 0.110483 1.022551 1.087 341(2) Local/NOx/

NonEGU + area1 … 0.027116 0.06448 0.109 682

(3) Local/NOx/mobile 1 … 0.043133 1.049465 0.029 364(4) Local/SOx/EGU 1 … 1.048884 1.008878 1.023 087(5) Local/SOx/

NonEGU_Point1 … 1.086225 0.555295 1.024 059

(6) Local/Sox/area 1 … 0.108373 1.099252 0.353 902(7) Local/VOC/all 1 … 1.021591 0.036443 1.091 375(8) Local/NH3/area 1 … 1.049548 1.082221 1.008 858(9) Local/NH3/mobile 1 … 1.080622 0.521056 0.357 506(10) Local/POC & PEC/

EGU + NonEGU1 … 1.084612 0.357084 0.637 807

(11) Local/POC &PEC/mobile

1 … 0.965029 0.106736 0.522 265

(12) Local/POC &PEC/area

1 … 1.084325 0.034402 0.281 576

(13) Region/NOx/EGU 1 … 0.762628 1.087254 0.718 672(14) Region/NOx/

NonEGU + area1 … 0.765136 0.283111 0.392 529

(15) Region/NOx/mobile

1 … 0.930531 0.318877 0.085 269

(16) Region/SOx/EGU 1 … 1.155187 1.086086 0.763 455(17) Region/SOx/

NonEGU_Point1 … 0.717255 0.260321 0.418 828

(18) Region/SOx/area 1 … 1.013104 0.035435 1.089 675(19) Region/VOC/all 1 … 0.389028 0.608322 0.266 133(20) Region/NH3/area 1 … 1.184101 0.032021 0.062 598(21) Region/NH3/

mobile1 … 0.060786 0.714711 0.260 525

(22) Region/POC & PEC/EGU + NonEGU

1 … 0.107094 0.556018 0.036 105

(23) Region/POC &PEC/mobile

1 … 1.084976 0.635975 0.285 701

(24) Region/POC &PEC/area

1 … 1.009483 0.634597 1.089 613

a Run #1–171 are used for creating RSM and run #172–181 are usedfor out-of-sample validation.

100 J O U R N A L O F E N V I R O N M E N T A L S C I E N C E S 2 7 ( 2 0 1 5 ) 9 7 – 1 0 7

(Ginsbourger et al., 2013; Jia and Taflanidis, 2013). Kriging is ageostatistical method based on an exponentially weightedsum of the sample runs results, which is reported superior todeterministic methods (Eldrandaly and Abu-Zaid, 2011). Itcan approximate highly nonlinear surfaces as long as they arelocally continuous. The predicted changes in the ambientconcentration of PM2.5 in each CMAQ grid cell are modeled asa function of the weighted average of the modeled responsesamong the experimental design space. The weight assigned toa model output depends on the Euclidean distance betweenthe factor levels covering the emission control ranges andthe factor levels defining the CMAQ experimental runs. AMaximum Likelihood Estimation–Experimental Best LinearUnbiased Predictors (MLE–EBLUPs) Kriging Interpolationis adopted. We recoded the Parametric Empirical Krigingalgorithm (Kleijnen, 2009; Peng et al., 2006) using C# languageand accelerated the calculation efficiency by multi-threading

technology. The RSM predicted value Y!

x0ð Þ can be expressedas:

Y!

x0ð Þ ¼Xdj¼1

f j xð Þβ!j þ z xð Þ ≡ f T0 β!þ γ!T

0 R!−1

Yn−F β!� �

ð1Þ

where, f0 is a d × 1 vector of the regression functions for Y!

x0ð Þ,and β

!is a d × 1 vector for the unknown regression

coefficients; γ0 is the n × 1 vector of correlations of Yn withY!

x0ð Þ; R!

is a n × n matrix of correlations among the Yn; F is an × d matrix of regression functions for the training data; z(x)is a zero-mean Gaussian stochastic process with correlationfunction R(h|ξ) and correlation parameters ξ. R can be estimatedby the product power exponential correlation function:

R hjξð Þ ¼ ∏d

i¼1exp −θijhijpi½ � ð2Þ

where, ξ = (θ, p) = (θ1, …, θd, p1, …, pd) with θi ≥ 0 and 0 ≤ pi ≤ 2,and the correlation parameter ξ is obtained by maximumlikelihood estimation (Li and Sudjianto, 2005; Xing et al.,2011). Since the RSM is built from CMAQ simulation results,it has the same strengths and limitations of the CMAQmodeling system as well as its input data driving the modelruns.

1.3. Quality assurance of RSM

The developed RSM is validated using a number of modelperformance evaluation metrics as shown in Table 1. Indi-vidual grad cell scatterplot and visual examination of RSMprediction maps is also conducted to ensure that the spatialdistribution of air quality obtained by the RSM is consistentwith the original CMAQ model results. Cross validation (CV)was performed. During each CV routine, one of the experi-mental runs is left out for creating the RSM (CV-RSM). TheCV-RSM predicted data are then verified against the CMAQresults that are left out. This comparison process is sequen-tially carried out for 171 times. It is found that the predictionsof CV-RSM and RSM do not have observable differences.Out-of-sample validation (OOS) is also conducted to evaluatethe RSM performance, which compares RSM-predicted valuesto those of CMAQ for a set ofmodel runs not used in developingthe RSM. Ten OOS model runs are evaluated in the case study.The same performance evaluation metrics (Table 3) were usedfor both CV and OOSmethods.

The users also have the options of using add-on samplingto generate additional runs for CMAQ simulations if the RSMprediction bias does not meet the desired tolerance. Com-bined with the original model runs, a new sampling matrixand additional CMAQ simulation results can be applied tocreate a new response relationship. The new RSM can beexamined through the validation process until the desiredaccuracy is achieved. Theoretically, the more experimentalruns are used for creating the RSM, the more accurate the RSMresults will be. However, the improvement of accuracy and thecomputational cost of CMAQ simulations should be balanced.The examples of input data file, including: CMAQmodel outputconcentration files, emission factor information file, state

Page 5: Development and case study of a science-based software ...acs.engr.utk.edu/publications/2015_zhu_1_jes.pdfUS. The emission projections for three future years (2010, 2015 and 2020)

NY/PhiladelphiaChicagoSan JoaquinAtlantaSalt lake CityPhoenixSeattleDallas

New York CountyPhiladelphia County

Fig. 2 – Study domain and 8 cities of interest in 36 km grid.

101J O U R N A L O F E N V I R O N M E N T A L S C I E N C E S 2 7 ( 2 0 1 5 ) 9 7 – 1 0 7

shape file (region boundary map file), area information file(target area or region file) and controlmatrix file are provided byU.S.EPA (2006) and can be downloaded by visiting http://www.abacas-dss.com/. Those data files are utilized to create the RSM

Table 3 – Evaluation metrics of RSM (Response SurfaceModel).

Performancemetric

Equationa

Mean Bias(MB, μg/m3)

MB ¼ 1N ∑

N

i¼1Cri−Ccið Þ

Mean Error(ME, μg/m3)

ME ¼ 1N ∑

N

i¼1Cri−Ccij j

MeanNormalizedBias (MNB,−100% to +∞)

MeanNormalizedError (MNE,0% to +∞)

MNB ¼ 1N ∑

N

i¼1

Cri−CciCci

� �MNE ¼ 1

N ∑N

i¼1

Cri−CciCci

��������

MeanFractional Bias(MFB, −200%to +200%)

Mean FractionalError (MFE, 0%to +200%)

MFB ¼ 1N ∑

N

i¼1

Cri−Ccið ÞCriþCci

2

� � MFE ¼ 1N ∑

N

i¼1

Cri−Ccij jCriþCci

2

� �

NormalizedMean Bias(NMB, −100%to + ∞)

NormalizedMean Error(NME, 0%to + ∞)

NMB ¼∑N

i¼1Cri−Ccið Þ

∑N

i¼1Cci NME ¼

∑N

i¼1Cri−Ccij j

∑N

i¼1Cci

a N is the number of grid cells, Cri is the RSM predicted value in gridi, and Cci is the CMAQ value in grid i.

and to support the visualization and analysis tool in the casestudy.

2. Results & discussion

2.1. Performance of the RSM

The PM2.5 results are used as an example for illustratingthe RSM model performance. Comparison of RSM predic-tions to the CMAQ simulation results shows good agree-ment in both the 171 CV cases and the 10 OOS cases(Fig. 3). The median values of NMB, NME, MFB and MFE forthe OOS cases (0.18–0.85%) in the box-plot are slightlyhigher than those of CV cases (−0.31%–0.87%). We chooseCase #178 whose MFB is the highest at 3.69% and Case #180whose MFB is the lowest at −0.11% (Fig. 3, line chart) tofurther analyze the model accuracy. Shown as Fig. 4a, theRSM accurately reproduces the results of Case #180and slightly over-predicts those of Case #178. Theover-predictions of Case #178 can be viewed clearly whenwe compare the base case concentration reduction inFig. 4b because of the amplified coordinate axis. Fig. 5shows that the concentration distribution map of RSMpredicted values can well coordinate with that of theCMAQ simulation results. The slight biases seem moreeasily occurred in the regions with high PM2.5 concentra-tion because of the greater emission reduction (Fig. 5,RSM–CMAQ), while the biases of these regions are less than0.25 μg/m3 compared to their ambient PM2.5 concentration from11.58 to 16.54 μg/m3 using San Joaquin and New York/Philadel-phia as examples. Comparing the LHS sampling values betweenthe control factors of Case #178 and Case #180 listed in Table 2,the control factor of “region/NH3/Area” in Case #178 is 1.184101and only 0.032021 in Case #180. Additional experiment runs forthe emission control factors having a significant impact on

Page 6: Development and case study of a science-based software ...acs.engr.utk.edu/publications/2015_zhu_1_jes.pdfUS. The emission projections for three future years (2010, 2015 and 2020)

a

178172 180

a b

MFB

, MFE

, NM

B&

NM

E ( %

)

-2

0

2

4

6

OOS_MFB OOS_MFE OOS_MNB OOS_MNE OOS_NMB OOS_NME CV_MFB CV_MFE CV_MNB CV_MNE CV_NMB CV_NME

5

4

3

2

1

0

-1

Mea

n fr

actio

nal b

ias (

MFB

%) 5

4

3

2

1

0

-1

-2Mea

n fr

actio

nal e

rror

s (M

FB%

)

1 11 31 51 71 91 111 131 15121 41 61 81 101 121 141 161 171Case number

b

Fig. 3 – Normalized Mean Bias (NMB), Normalized Mean Error (NME), Mean Fractional Bias (MFB) and Mean Fractional Error(MFE) over (a) the 10 outside of the experimental design runs by out-of-sample (OOS) validation, and (b) the 171 controlscenarios used to create RSM by cross-validation (CV); the line chart evaluates the comparison percentage values (Y) withindividual case number (X), and the box-plot demonstrates the statistical analysis of the bias for all of the 10 OOS or the 171 CVcases.

30

20

10

0

RSM

0 10 20 30 40CMAQ

a 10

-1-2-3-4-5R

SM-C

MA

Q b

ase

case

RSM

-CM

AQ

bas

e ca

se

-6 -5 -4 -3 -2 -1 0

30

25

20

15

10

5

00 5 10 15 20 25 30

CMAQ

1-1-3-5-7-9

-11-13

-13 -11 -9 -7 -5 -3 -1 1CMAQ-CMAQ base case

CMAQ-CMAQ base case

RSM

r = 0.9999733

r = 0.9990226 r = 0.9670373

r = 0.9993302

aconc. 178

a aconc. 180

b aconc. 178

b aconc. 180

Fig. 4 – X–Y scatterplot for the concentrations of PM2.5 (μg/m3) simulated by CMAQ (X, Community Multi-scale Air Quality) andpredicted by RSM (Y, Response Surface Model). (a) RSM vs. CMAQ, (b) the CMAQ base case concentration reduction (predicted orsimulated value subtracts the concentration of CMAQ base case) of RSM vs. CMAQ.

102 J O U R N A L O F E N V I R O N M E N T A L S C I E N C E S 2 7 ( 2 0 1 5 ) 9 7 – 1 0 7

Page 7: Development and case study of a science-based software ...acs.engr.utk.edu/publications/2015_zhu_1_jes.pdfUS. The emission projections for three future years (2010, 2015 and 2020)

#180 CMAQ #180 RSM #180 RSM -CMAQ

0 15μg/m3 0 0.3μg/m3

Fig. 5 – Concentration and different (RSM subtract CMAQ) maps for CMAQ simulated results and RSM predicted values createdby the function of “comparison plot”.

103J O U R N A L O F E N V I R O N M E N T A L S C I E N C E S 2 7 ( 2 0 1 5 ) 9 7 – 1 0 7

PM2.5 are necessary if the RSM model accuracy needs to befurther improved.

Validation & QA module also summarizes the values of theperformancemetrics for the RSM (Table 4). Themean bias is <2%and the maximum bias is <6% in the test cases. Increasing thesampling density will further improve the RSM predictionaccuracy (Xing et al., 2011). The add-on sampling function ofthe control matrix module allows users to choose the factors,and to set the number of samples and the factor range. Themodule uses LHS to generate a sub-matrix supplementing theexisting-matrix for additional sampling runs. For example, tofurther reduce the difference between RSM and CMAQ results,one can use the function of “add-on sampling” to generateadditional 20 experiment runs. The CMAQ simulation results ofthe 20 additional runs together are thenused to recreate theRSM.

2.2. RSM data analysis and visualization

The developed RSM-VAT provides a series of visualization andanalytical functions for supporting decisionmaking. Althoughthe quality assurance module also provides visualized verifi-cations for the RSM prediction against CMAQ results (Fig. 5)and for different CMAQ experiment runs, decision support ismainly provided by the RSM data analysis module. Fig. 6 liststhe different input data files or configurations for differentfunction modules, and shows the screenshots of the instant/real-time visualization and analysis features of RSM-VAT,whichinclude 2D and 3D maps (Fig. 6a, b) for instant overview of airquality response, bar charts (Fig. 6c, d, e, f) for quantitativelyanalyzing and comparing the effectiveness of emission reduc-tion, and contour plots (Fig. 6g, h) for dynamically description of

Table 4 – Results of cross- and out-of-sample validations gener

Performance metric Cross-validation (171 scenario

Mean Minimum Ma

MB (μg/m3) 0.0005 −0.0587ME (μg/m3) 0.0328 0.0098MNB (%) 0.03% −1.41%MNE (%) 0.80% 0.22%MFB (%) 0.02% −1.43%MFE (%) 0.79% 0.23%NMB (%) 0.03% −2.04%NME (%) 1.13% 0.33%

the non-liner or liner relationship between two single/groups ofemission control factors.

The RSM-VAT allows instantaneous visualization of theresponse surface of CMAQ outputs corresponding to emissionchanges. For example, users canchangeanyemission factors andview the impact on the ambient concentration of PM2.5 instantly.Fig. 7 displays the 2D and 3Dmap examples of the RSM-VAT. The3Dmap has an instant three-dimensional visual effect for the airquality changes, while 2D map can identify these changes indifferent administrative regions intuitively. Although VPA alsoprovides thevisualizationof 2Dand3Dmaps, it doesnothave theother visualization functions to support quantitative evaluationof the interrelationship among emission control factors. The barcharts and contour plots are important features for analyzing thecomplex relationship between local/regional emission reductionand their air quality effect.

2.3. Case study for decision making support

A test case is employed to illustrate how the RSM-VAT can beused to assist in air quality decisionmaking support. Considerthat a policy maker attempts to know: (1) what emissionreduction is more effective for achieving the targeted airquality, (2) which emission source has a greater impact on airquality, and (3) what is the most effective emission reductionstrategy for improving air quality? The answers can beprovided with the RSM-VAT as demonstrated.

All of the emission sources (control factors) listed inTable 2 has different degrees of contribution to PM2.5 which isa public-focused complex pollutant (Zhuang et al., 2014). Therelative contributions to PM2.5 of regional and urban

ated by the function listed in “evaluation metrics of RSM”.

s) Out-of-sample validation (10 scenarios)

ximum Mean Minimum Maximum

0.1513 0.0448 −0.0209 0.19330.1517 0.0588 0.0084 0.19333.77% 1.03% −0.51% 3.81%3.78% 1.39% 0.25% 3.81%3.62% 0.99% −0.51% 3.69%3.63% 1.35% 0.25% 3.69%5.31% 1.52% −0.75% 5.93%5.31% 2.00% 0.30% 5.93%

Page 8: Development and case study of a science-based software ...acs.engr.utk.edu/publications/2015_zhu_1_jes.pdfUS. The emission projections for three future years (2010, 2015 and 2020)

Functional interfaces for RSM data analysis

Fig. 6 – Visualization and analysis features of the RSM-VAT. The main functional modules are: (a) 2-D map, (b) 3-D map, (c)regional and local emission sources control effects, (d) air quality improvement on selected regions/cities, (e) comparison ofmulti-regions/cities for multi-emission–reduction-ratios, (f) regional and local emission sources control effects for chosenemission sources, (g) 2-D color filled contour, (h) 2-D contour.

104 J O U R N A L O F E N V I R O N M E N T A L S C I E N C E S 2 7 ( 2 0 1 5 ) 9 7 – 1 0 7

emission sources for the eight urban areas (Fig. 2) can beobtained by using the regional and local emission controleffect comparison plot function shown in Fig. 6f. Thecomparison result shows that the regional emissions havea greater air quality impact in Chicago while the urbanemissions in San Joaquin are more important than theregional emissions. Additional information can be obtainedby investigating the response curves of PM2.5 caused by thechanges of regional and local emissions using the contourplot. For example, Fig. 8a shows that reducing regionalemissions is more effective in controlling PM2.5 comparedto reducing local emissions in Chicago. In contrast, thebenefit caused by the local emission reduction in San Joaquinis slightly greater than that of regional emissions (Fig. 8b).Analyses can also be carried out by source categories andemission types for a subset of the geographic areas. Thefunctions shown in Fig. 9 allows user to compare thecumulative impact and the different source contributions toPM2.5 decrease caused by regional and local emissionreduction under different emission control scenarios. Theresults in Fig. 9 show that the regional area-NH3 emissionand EGU-SOx emission have a greater impact on PM2.5 inChicago, while the local emission of area-NH3 and regionalemission ofmobile NOx have a greater impact on PM2.5 in San

Joaquin. Therefore, we can use a combination of the contourplot (Fig. 8) and bar chart (Fig. 9) to provide sound andquantitative air quality improvement suggestions for differ-ent cities; for example, if we want to lower the concentrationof annual PM2.5 around 1.5 μg/m3 in Chicago and San Joaquin,both of the regional and local emission sources of Chicagowill be reduced down to 83.5% (Figs. 8a, 9), while those of SanJoaquin will be reduced down to 52.8% (Figs. 8b, 9); inaddition, the regional EGU-SOx and area-NH3 for Chicago,and local area-NH3, regional and local mobile NOx for SanJoaquin should be focused on (Fig. 9). An additional CMAQsimulation for the emission reduction scenario of Chicagodown to 83.5% was run to further investigate the accuracy ofRSM. The comparison results show that RSM can wellreproduce the simulation results of CMAQ (Fig. 10a) withthe deviation of PM2.5 concentrations less than 0.3 μg/m3

(Fig. 10b) and MFB = 0.072%.

3. Conclusions

This article describes the development and application of aresponse surface modeling-visualization and analysis tool

Page 9: Development and case study of a science-based software ...acs.engr.utk.edu/publications/2015_zhu_1_jes.pdfUS. The emission projections for three future years (2010, 2015 and 2020)

a Emission control factors 0.50 b Emission control factors 1.20

Fig. 7 – Visualization effects of 2D and 3D contour for PM2.5 concentration at emission control factors set to (a) 0.50 and (b) 1.2.

105J O U R N A L O F E N V I R O N M E N T A L S C I E N C E S 2 7 ( 2 0 1 5 ) 9 7 – 1 0 7

(RSM-VAT) for supporting the decision making of air qualitypolicy. The RSM is based on a new approach that aggregatespre-specified air quality modeling results into a multi-dimensional “response surface”. A sophisticated, high-dimensional Kriging interpolation approach is implement-ed in the development. It is shown that the RSM canreproduce the results from individual CMAQ simulationsaccurately.

-1.804 9.368

1.200

1.000

0.800

0.600

0.400

0.200

0.000

Reg

inal

sour

ces

0.000 0.200 0.400 0.600 0.800 1.000 1.200Local sources

a

Fig. 8 – Contour curves for annual average PM2.5 in (a) Chicago anand local (X) emission sources.

TheRSM-VAT is implementedonWindowsoperation systemandhas a user-friendly interface. TheU.S. case study shown thatusers can easily obtain the graphic decision supports by enteringemission control factors and/or clicking themouse. The decisionsupport features include air quality benefits resulted fromemission reductions, determination of the relative importanceof regional and local emission sources, and identification ofimportant emission sources. RSM-VAT can also be used

-0.426 2.524

1.200

1.000

0.800

0.600

0.400

0.200

0.000

Reg

inal

sour

ces

0.000 0.200 0.400 0.600 0.800 1.000 1.200Local sources

b

d (b) San Joaquin area response on the control of regional (Y)

Page 10: Development and case study of a science-based software ...acs.engr.utk.edu/publications/2015_zhu_1_jes.pdfUS. The emission projections for three future years (2010, 2015 and 2020)

Accumulative impact Different sources contribution

Chicago: accumulative contribution of individualsources reduced down to 83.5%

1.4

1.2

1.0

0.8

0.6

0.4

0.2

0.0

-0.2

PM2.

5 (μg

/m3 )

PM2.

5 (μg

/m3 )

PM2.

5 (μg

/m3 )

Regional control Local control

Regional control Local control Regional control Local control

Regional control Local control

Chicago: contribution of individual sourcesreduced down to 83.5%

San Joaquin: contribution of individual sourcesreduced down to 52.8%

0.30

0.25

0.20

0.15

0.10

0.05

0.00

-0.05

PM2.

5 (μg

/m3 )

0.30

0.25

0.20

0.15

0.10

0.05

0.00

San Joaquin: accumulative contribution of individualreduced down to 52.8%

0.90.80.70.60.50.40.30.20.10.0

NOx/EGUNOx/NonEGU+AreaNOx/Mobile

SOx/EGUSOx/NonEGU+AreaSOx/Area

VOC/AllNH3/AreaNH3/Mobile

POC&PEC/EGU+NonEGUPOC&PEC/MobilePOC&PEC/Area

Fig. 9 – Accumulative impact and different sources contribution comparison of regional and local emission sources reduceddown to 83.5% in Chicago and 52.8% in San Joaquin for lowering 1.5 μg/m3 annual PM2.5.

106 J O U R N A L O F E N V I R O N M E N T A L S C I E N C E S 2 7 ( 2 0 1 5 ) 9 7 – 1 0 7

internationally for various model domains after redesigning theemission control factors and the experimental model runs. Weare trying to use RSM-VAT to assist evaluation of the emissioncontrol impacts on air quality among the cites/areas withinYangtze River Delta (YRD), China. The current YRD experi-ment results for two months (January and August, 2010) canbe downloaded fromourwebsite of http://www.abacas-dss.com/.

Chicago: R&L emission down to 83.5%

R = 0.9996782

0

-2

-4

-6

-8

-10

-12

RSM

-CM

AQ

bas

e ca

se

CMAQ-CMAQ base case-12 -10 -8 -6 -4 -2

Fig. 10 – Comparison of the PM2.5 (μg/m3) concentrations simulasources inside and outside Chicago down to 83.5%. (a) X–Y scatteCMAQ, and (b) the concentration difference (RSM subtract CMAQ

Further improvement on the accuracy of RSM canbe achieved by additional sampling model runs,although the added computational cost should also beconsidered. The RSM methodology and the visualizationtool demonstrated in this work represent a stepforward to effective and efficient science-based decisionsupport.

Chicago: RSM-CMAQ

ab0

0.30 μg/m3

ted by CMAQ and predicted by RSM when reducing emissionrplot for CMAQ base case concentration reduction of RSM vs.) maps for CMAQ and RSM predicted values.

Page 11: Development and case study of a science-based software ...acs.engr.utk.edu/publications/2015_zhu_1_jes.pdfUS. The emission projections for three future years (2010, 2015 and 2020)

107J O U R N A L O F E N V I R O N M E N T A L S C I E N C E S 2 7 ( 2 0 1 5 ) 9 7 – 1 0 7

Acknowledgments

Financial and data support for this work is provided by theU.S. Environmental Protection Agency (No. GS-10F-0205T).This work is also partly supported by the funding of GuangdongProvincial Key Laboratory of Atmospheric Environment andPollution Control (No. h2xjD612004Ш), the funding of StateEnvironmental Protection Key Laboratory of Sources and Controlof Air Pollution Complex (No. SCAPC201308), and the project ofAtmospheric Haze Collaboration Control Technology Design (No.XDB05030400) from Chinese Academy of Sciences.

R E F E R E N C E S

Ashok, A., Lee, I.H., Arunachalam, S., Waitz, I.A., Yim, S.H.L.,Barrett, S.R.H., 2013. Development of a response surface modelof aviation's air quality impacts in the United States. Atmos.Environ. 77, 445–452.

Byun, D., Schere, K.L., 2006. Review of the governing equations,computational algorithms, and other components of themodels-3 Community Multiscale Air Quality (CMAQ) modelingsystem. Appl. Mech. Rev. 59 (1–6), 51–76.

Eldrandaly, K.A., Abu-Zaid, M.S., 2011. Comparison of sixGIS-based spatial interpolation methods for estimating airtemperature in Western Saudi Arabia. J. Environ. Inform. 18(1), 38–45.

Fraser, S., Marceau, D.J., Visscher, A.D., Roth, S.H., 2013. Estimatingexposure by loose-coupling an air dispersion model and ageospatial information system. J. Environ. Inform. 21 (2), 84–92.

Ginsbourger, D., Rosspopoff, B., Pirot, G., Durrande, N., Renard, P.,2013. Distance-based kriging relying on proxy simulations forinverse conditioning. Adv. Water Resour. 52, 275–291.

Hirabayashi, S., Kroll, C.N., Nowak, D.J., 2011. Component-baseddevelopment and sensitivity analyses of an air pollutant drydeposition model. Environ. Model Softw. 26 (6), 804–816.

Jia, G., Taflanidis, A.A., 2013. Kriging metamodeling forapproximation of high-dimensional wave and surge responsesin real-time storm/hurricane risk assessment. Comput. MethodsAppl. Mech. Eng. 261–262, 24–38.

Kleijnen, J.P.C., 2009. Kriging metamodeling in simulation: areview. Eur. J. Oper. Res. 192 (3), 707–716.

Lao, Y.W., Zhu, Y., Jang, C., Lin, C.J., Xing, J., Chen, Z.R., Xie, J.P., etal., 2012. Research and development of regional air pollution

control decision support tool based on response surfacemodel. Acta Sci. Circum. 32 (8), 1913–1922.

Li, R., Sudjianto, A., 2005. Analysis of computer experiments usingpenalized likelihood in Gaussian kriging models.Technometrics 47 (2), 111–120.

Lim, L.L., Hughes, S.J., Hellawell, E.E., 2005. Integrated decisionsupport system for urban air quality assessment. Environ.Model Softw. 20 (7), 947–954.

Mohammadi, H., Seifi, A., Foroud, T., 2012. A robust Kriging modelfor predicting accumulative outflow from a mature reservoirconsidering a new horizontal well. J. Petrol. Sci. Eng. 82–83,113–119.

Peng, X.L., Yin, H., Li, R., Fang, K.T., 2006. The application ofKriging and empirical Kriging based on the variables selectedby SCAD. Anal. Chim. Acta. 578 (2), 178–185.

U.S.EPA, 2005. Clean air interstate rule emissions inventorytechnical support document. Available at:, http://www.epa.gov/interstateairquality/pdfs/finaltech01.pdf (Date accessed: 4March 2005).

U.S.EPA, 2006. Technical support document for the proposed PMNAAQS rule: response surface modeling. Available at:, http://www.epa.gov/ttn/scram/reports/pmnaaqs_tsd_rsm_all_021606.pdf (Date accessed: February, 2006).

Wang, S.L., Gao, J., Zhang, Y.C., Zhang, J.Q., Cha, F., Wang, T., etal., 2014a. Impact of emission control on regional air quality:an observational study of air pollutants before, during andafter the Beijing Olympic Games. J. Environ. Sci. 26 (1),175–180.

Wang, S.X., Xing, J., Jang, C., Zhu, Y., Fu, J.S., Hao, J.M., 2011. Impactassessment of ammonia emissions on inorganic aerosols inEast China using response surface modeling technique.Environ. Sci. Technol. 45 (21), 9293–9300.

Wang, S.X., Xing, J., Zhao, B., Jang, C., Hao, J.M., 2014b. Effectivenessof national air pollution control policies on the air quality inmetropolitan areas of China. J. Environ. Sci. 26 (1), 13–22.

Xiao, M., Breitkopf, P., Filomeno Coelho, R., Knopf-Lenoir, C.,Sidorkiewicz, M., Villon, P., 2009. Model reduction by CPOD andKriging. Struct. Multidiscip. Optim. 41 (4), 555–574.

Xing, J., Wang, S.X., Jang, C., Zhu, Y., Hao, J.M., 2011. Nonlinearresponse of ozone to precursor emission changes in China: amodeling study using response surface methodology. Atmos.Chem. Phys. 11 (10), 5027–5044.

Zhuang, X., Wang, Y., He, H., Liu, J., Wang, X., Zhu, T., et al., 2014.Haze insights and mitigation in China: an overview. J. Environ.Sci. 26 (1), 2–12.