Simultaneous Stochastic Simulation of Monthly Mean Daily Global Solar Radiation and Sunshine Duration Hours Using Copulas

Javad Bazrafshan 1; Nafiseh Heidari 2; Isaac Moradi 3; and Zahra Aghashariatmadary 4

1 Assistant Professor, Dept. of Irrigation and Reclamation Engineering, Univ. of Tehran, Karaj 31587-77871, Iran (corresponding author).
2 M.Sc. Student, Dept. of Irrigation and Reclamation Engineering, Univ. of Tehran, Karaj 31587-77871, Iran.
3 Assistant Research Scientist, Earth System Science Interdisciplinary Center (ESSIC), Univ. of Maryland, College Park, MD 20740.
4 Assistant Professor, Dept. of Irrigation and Reclamation Engineering, Univ. of Tehran, Karaj 31587-77871, Iran.

Note. This manuscript was submitted on November 8, 2013; approved on June 23, 2014; published online on August 18, 2014. Discussion period open until January 18, 2015; separate discussions must be submitted for individual papers. This paper is part of the Journal of Hydrologic Engineering, © ASCE, ISSN 1084-0699/04014061(11)/$25.00.

Abstract: In this paper, copula functions are used to model the dependence structure of monthly mean solar radiation R and sunshine duration hours D. The efficiency of five well-known bivariate parametric copula functions, namely (1) normal, (2) Student's t, (3) Clayton, (4) Frank, and (5) Gumbel, is evaluated on a seasonal basis for nine radiometric stations in Iran. First, the most appropriate marginal probability distributions for R and D were individually selected from 16 univariate distributions; then, the performance of the parametric copulas in modeling the dependence structure of the R-D joint empirical probability distribution was assessed using two criteria. Finally, based on the appropriate parametric copulas, the joint simulation of the marginal variables was accomplished using the conditional sampling technique. The results show that the best marginal distribution fitted to the original D and R data is the normal distribution after transformation with the Johnson function (in more than half of the cases). Because of the high (low) correlation of R and D in the left (right) tail of the scatter diagram, the Clayton model fitted the empirical copula better than the other models. The joint simulation using appropriate parametric copula functions indicated that the Clayton copula yields better performance in terms of the slope of the relation between R and D. Besides, this model does not introduce unreasonable data. Therefore, the Clayton model is proposed as an appropriate copula model for simulating R and D data. DOI: 10.1061/(ASCE)HE.1943-5584.0001051. © 2014 American Society of Civil Engineers.

Author keywords: Joint simulation; Total solar radiation; Sunshine duration; Copulas; Iran.

Introduction

Incoming solar radiation R and sunshine duration hours D play an important role in determining crop water requirements, utilizing renewable energies, and specifying building climate and human bioclimate criteria, especially in arid and semiarid regions where the incoming solar radiation is remarkable throughout the year. However, the sparse distribution of solar radiation stations (measuring both solar radiation and sunshine duration) is a problem in many arid and semiarid regions like Iran. On the other hand, the network for measuring sunshine duration has a higher density than the one measuring solar radiation. Because of the significant relation between solar radiation and sunshine duration, as well as the availability of sunshine duration data, the well-known relation between these variables (i.e., the Ångström equation) is widely used around the world (e.g., Falayi and Robio 2005; Skeiker 2006; Liu et al. 2009; Duzen and Aydin 2012; Khorasanizadeh and Mohammadi 2013; Suehrcke et al. 2013).

Solar radiation can be simulated using stochastic models such as Markov chain models (Aguiar et al. 1988), autoregressive models (Aguiar and Collares-Pereira 1992; Mora-Lopez and Sidrach-de-Cardona 1997), the probabilistic finite automata (PFA) model (Mora-Lopez and Sidrach-de-Cardona 2003), and the coupled autoregressive and dynamical system (CARDS) model (Huang et al. 2013), or based on other climate variables such as cloud observations [using a stochastic model (Ehnberg and Bollen 2005)], sunshine duration hours [using a combination of a hidden Markov model and a fuzzy model (Bhardwaj et al. 2013)], daily air temperature [using a hidden Markov model (Hocaoğlu 2011)], daily range of air temperature [using the ClimGen weather generator (Stöckle et al. 1999)], and minimum and maximum air temperatures along with geographical characteristics [using multiple regression models (Li et al. 2013)].

There are two important issues with the solar radiation models mentioned above: (1) Some models generate the solar radiation variable based only on the solar radiation data measured at a given site. Because these models ignore other meteorological variables, the generated solar radiation data are highly randomized. (2) Other models consider the dependency of solar radiation on other meteorological variables such as temperature, cloudiness, and sunshine hours. Before running these types of models, it is necessary to simulate the other meteorological variables using a given model and then enter the simulated variables into those models to generate solar radiation data. As a result, the uncertainty of the solar radiation data is restricted by the uncertainty of the simulated meteorological variables, despite the fact that the uncertainties of meteorological variables in the climate system are not identical (for example, uncertainty in precipitation is always wider than uncertainty in temperature or solar radiation). Therefore, the dispersion of the solar radiation data generated by these models may not be satisfactorily close to the dispersion of the measured solar radiation data.

Simultaneous simulation of variables in K-dimensional space is the property of copula functions. These functions are able to model the dependence structure of K dependent variables (Nelsen 2006).


Therefore, the uncertainty of simulation of each variable in K-dimensional space is not restricted by the uncertainty of the other variables. The high level of correlation between solar radiation and sunshine duration allows researchers to use copulas for modeling and simulating the dependence structure of these variables simultaneously.

A review of the literature shows that copulas have not so far been employed in the simultaneous simulation of solar radiation and sunshine duration; however, they have been applied in a large number of drought and precipitation studies (e.g., Shiau 2006; Wong et al. 2010). Therefore, the present paper is intended to assess the efficiency of copulas in the joint simulation of solar radiation and sunshine duration data at some radiometric stations in Iran. The next section presents the data and methodology, followed by the results and discussion, and then the conclusion.

Data and Methods

Data Collection and Quality Control

Twenty-five stations in Iran measure solar radiation and sunshine duration using pyranometers and heliographs. Preliminary analyses revealed that the solar radiation R and sunshine duration D data are missing in some years. Quality control of the data was carried out using the algorithm developed by Moradi (2009) and indicated that some data do not have acceptable quality for the present study. After eliminating missing values and poor-quality data, the remaining data were used for modeling the dependence structure of the R and D variables.

After quality control of the data, nine out of the 25 stations were selected for this paper. They are located in very arid to humid regions of Iran, and their characteristics are presented in Table 1. Some of the main points from Table 1 are as follows:

• The stations of interest are located at different altitudes, ranging from −20.0 m (the Ramsar station) to 1,484.0 m (the Shiraz station). The stations are well distributed over latitudes from 27.22° N to 37.67° N and longitudes from 45.05° E to 60.88° E.

• The Ramsar station, located in a cloudy humid region in the north of Iran, has recorded the minimum values of daily R and D [1183.4 J cm−2 day−1 (283.1 cal cm−2 day−1) and 4.4 h, respectively] among the stations of interest. The maximum value of daily R [2113.4 J cm−2 day−1 (505.6 cal cm−2 day−1)] was recorded at the Yazd station, whereas the maximum value of daily D (9.1 h) occurred at the Shiraz station.

• The observation period of the stations ranges from 14 to 25 years and covers the period 1981–2005. The percentage of missing data varies from 2.2 to 35.1%. The percentage of available data at the stations, after the data quality control, varies from 49.1 to 88.1%. The study was conducted on a seasonal basis to increase the number of observations.

Mathematical Formulation of Copula Functions

Nelsen (2006) stated Sklar's theorem for two random variables X and Y as follows. The cumulative distribution function (CDF) of any pair (X, Y) of continuous random variables can be expressed as

$$H(x, y) = C[F(x), G(y)] = C(u, v), \quad x, y \in \mathbb{R} \ \text{and} \ F(x), G(y) \in [0, 1] \qquad (1)$$

where F(x) = u = marginal distribution of x; G(y) = v = marginal distribution of y; H(x, y) = joint distribution of x and y; and C[F(x), G(y)] = copula function of the marginal distributions F(x) and G(y).

The main advantage of copulas over classical joint distributions is that the type of the marginal probability distributions of the random variables is not important in modeling their dependence structure, because these models are based on the marginal probability values of the variables (Genest and Favre 2007).

Like classical joint distributions, copula functions can be stated empirically and theoretically. The empirical copula for two variables X and Y is defined as

$$C_n(u, v) = \frac{1}{n} \sum_{i=1}^{n} I\left(\frac{Q_i}{n + 1} \le u,\ \frac{P_i}{n + 1} \le v\right) \qquad (2)$$

where Q_i and P_i = ranks of the two variables X_i and Y_i, respectively; n = sample size; I(A) is the indicator function of set A, which takes a value of 0 if A is false and 1 if A is true; and C_n(u, v) = empirical copula for u, v ∈ [0, 1]. A is the set of conditions within the parentheses on the right side of Eq. (2).
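Eq. (2) can be evaluated directly from the ranks of the paired observations. The following is a minimal sketch using only the Python standard library; the function names and the five data pairs are hypothetical illustrations (not values from the study), and ties are assumed to be absent.

```python
def ranks(values):
    """Return 1-based ranks of values (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def empirical_copula(x, y, u, v):
    """Evaluate C_n(u, v) of Eq. (2) for paired samples x and y."""
    n = len(x)
    q, p = ranks(x), ranks(y)
    # Count the pairs whose scaled ranks fall at or below (u, v)
    hits = sum(1 for qi, pi in zip(q, p)
               if qi / (n + 1) <= u and pi / (n + 1) <= v)
    return hits / n


# Hypothetical illustrative values, not data from the study
d_hours = [4.1, 6.3, 7.8, 5.2, 8.9]
r_rad = [1100.0, 1500.0, 1900.0, 1300.0, 2100.0]
c_mid = empirical_copula(d_hours, r_rad, 0.5, 0.5)
```

Dividing the ranks by n + 1 rather than n keeps the scaled ranks strictly inside (0, 1), which matters later when they are passed to a copula density.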

The empirical copula C_n(u, v) is a rank-based estimator of the unknown quantity C(u, v); its large-sample distribution is normal and centered at C(u, v) (Genest and Favre 2007). This function can be used in estimating the parameters of theoretical copula functions. Different families of theoretical copula functions impose different dependence structures on data. For information about the different kinds of parametric copula functions, the reader is referred to Nelsen (2006). In the present study, two well-known families of bivariate copula functions are used: (1) elliptical (normal and Student's t), and (2) Archimedean (Clayton, Frank, and Gumbel). The mathematical formulae of the chosen copulas and their parameter domains are presented in Appendix I.

Table 1. Characteristics of the Stations under Consideration

| Station | Longitude (°E) | Latitude (°N) | Altitude (m) | Length of record period (years) | Missing data (%) | Percentage of available data after quality control | R (J cm−2 day−1) | D (h) | Climate |
|---|---|---|---|---|---|---|---|---|---|
| Ramsar | 50.67 | 36.90 | −20.0 | 20 | 10.8 | 62.1 | 1183.4 | 4.4 | Humid |
| Yazd | 54.28 | 31.90 | 1,273.2 | 14 | 35.1 | 63.1 | 2113.4 | 8.8 | Very arid |
| Zahedan | 60.88 | 29.47 | 1,370.0 | 25 | 15.7 | 58.0 | 1851.8 | 9.0 | Very arid |
| Mashhad | 59.63 | 36.27 | 999.2 | 25 | 8.3 | 78.0 | 1601.4 | 7.8 | Semiarid |
| Bandar Abbas | 56.37 | 27.22 | 9.8 | 19 | 2.2 | 88.1 | 1780.7 | 8.6 | Arid |
| Shiraz | 52.60 | 29.53 | 1,484.0 | 20 | 37.1 | 60.0 | 1993.5 | 9.1 | Semiarid |
| Bojnord | 57.27 | 37.47 | 1,112.0 | 22 | 9.5 | 72.3 | 1242.3 | 7.6 | Semiarid |
| Oroomieh | 45.05 | 37.67 | 1,328.0 | 20 | 18.8 | 49.1 | 1986.8 | 7.8 | Semiarid |
| Kermanshah | 47.15 | 34.35 | 1,318.6 | 24 | 10.1 | 70.1 | 1767.3 | 8.3 | Semiarid |

An important property of copulas is their relationship with dependence measures such as Spearman's rho (ρs) and Kendall's tau (ρτ) correlation coefficients, which are defined as (Nelsen 2006)

$$\rho_s = 12 \int_0^1 \int_0^1 [C(u, v) - uv]\, du\, dv \qquad (3)$$

$$\rho_\tau = 4 \int_0^1 \int_0^1 C(u, v)\, dC(u, v) - 1 \qquad (4)$$

Eqs. (3) and (4) show that both rank correlations depend on the variables only through their copula, which is why copulas are said to model the dependence structure of variables. Pearson's correlation coefficient is not related to the copula function in this way, since it is calculated from the raw values of the variables rather than from their marginal probabilities. The mathematical form of Pearson's correlation coefficient between two variables X and Y, i.e., r(X, Y), is defined as

$$r(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2}\ \sqrt{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}} \qquad (5)$$

where $\bar{X}$ and $\bar{Y}$ = means of the variables X and Y, respectively; and n = sample size. Therefore, the two rank correlation coefficients ρs and ρτ were used in this paper to assess the level of dependence between the two variables, solar radiation R and sunshine duration D. As the absolute value of Spearman's rho or Kendall's tau increases, the shape of the joint distribution of the R and D data changes from circular (for low correlation values, i.e., values near zero) to elliptical (for high positive or negative correlation values, i.e., values near ±1).
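For a finite sample, the two rank correlations are usually computed from their sample formulas rather than from the integrals in Eqs. (3) and (4). The sketch below uses the classical no-ties formulas (rank-difference form for Spearman's rho, concordant minus discordant pairs for Kendall's tau); these sample estimators and the function names are assumptions made for illustration, since the paper does not state which estimators were used.

```python
def ranks(values):
    """Return 1-based ranks of values (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman_rho(x, y):
    """Sample Spearman's rho from rank differences (no ties)."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


def kendall_tau(x, y):
    """Sample Kendall's tau: (concordant - discordant) / total pairs."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            score += (prod > 0) - (prod < 0)
    return 2.0 * score / (n * (n - 1))


# Hypothetical illustrative values, not data from the study
d_hours = [4.1, 5.2, 6.3, 7.8, 8.9]
r_rad = [1300.0, 1100.0, 1900.0, 1500.0, 2100.0]
rho_s = spearman_rho(d_hours, r_rad)
tau = kendall_tau(d_hours, r_rad)
```

Both functions depend only on the ordering of the data, so any strictly increasing transformation of R or D leaves the results unchanged, unlike Pearson's coefficient in Eq. (5).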

Stochastic Simulation Using Bivariate Copulas

The stochastic simulation of the monthly averaged daily solar radiation R and sunshine duration D data was implemented with a bivariate copula function for each season in the following four steps:

1. Determining an appropriate marginal distribution for each variable. Although the type of marginal probability distribution is not important for the copula function, accurate estimation of the probability values can be important for representing the dependence structure and simulating the data. Hence, the fit of 16 marginal probability distributions to the R and D values was assessed using the Anderson-Darling goodness-of-fit test. The candidates were normal (without variable transformation, after Box-Cox transformation, and after Johnson transformation), log-normal (two-parametric and three-parametric types), Weibull (two-parametric and three-parametric types), exponential (one-parametric and two-parametric types), gamma (two-parametric and three-parametric types), the largest extreme values, the smallest extreme values, logistic, and log-logistic (two-parametric and three-parametric types).

2. Choosing an appropriate parametric copula function. The most suitable parametric copula function is the one that agrees most closely with the empirical copula. Two criteria, (1) the Akaike information criterion (AIC), and (2) the Schwartz information criterion (SIC), were used to measure how closely a parametric copula function agrees with the empirical copula.

3. Generating or simulating random pairs of marginal probabilities (u, v) based on an appropriate copula model. To do this, the conditional sampling technique (Nelsen 2006) was used; the details of data simulation using this method are given in Appendix I.

4. Transforming the marginal probabilities (u, v) to the corresponding original data D and R. Choosing the right probability distribution (mentioned in the first step) can increase the accuracy of the inverse transformations. For example, suppose that the normal and Weibull distributions are the appropriate distributions for D and R, respectively. If the randomly generated probability values u1 and v1 correspond to the variables D and R, then D1 is calculated by taking the inverse normal CDF at u1, and R1 by taking the inverse Weibull CDF at v1.
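Steps 3 and 4 can be sketched as follows for the Clayton copula, using the standard closed-form inverse of its conditional distribution C(v | u). This is a minimal illustration, not the authors' implementation (their procedure is in Appendix I, which is not reproduced here). The normal parameters for D are the winter values for Yazd in Table 3; the copula parameter theta and the Weibull shape and scale for R are hypothetical.

```python
import math
import random
from statistics import NormalDist


def clayton_conditional_sample(theta, rng):
    """Draw one (u, v) pair from a Clayton copula (theta > 0).

    u is uniform; t is a uniform draw for the conditional CDF C(v | u),
    which is inverted in closed form to obtain v.
    """
    u = 1.0 - rng.random()  # uniform on (0, 1]
    t = 1.0 - rng.random()  # uniform on (0, 1]
    v = (u ** -theta * (t ** (-theta / (1.0 + theta)) - 1.0) + 1.0) ** (-1.0 / theta)
    return u, v


def clamp(p, eps=1e-12):
    """Keep a probability strictly inside (0, 1) for the inverse CDFs."""
    return min(max(p, eps), 1.0 - eps)


def inverse_weibull(p, shape, scale):
    """Inverse CDF of the two-parameter Weibull distribution."""
    return scale * (-math.log(1.0 - clamp(p))) ** (1.0 / shape)


def simulate_pairs(n, theta, d_mean, d_sd, r_shape, r_scale, seed=0):
    """Simulate n (D, R) pairs: Clayton dependence, normal D, Weibull R."""
    rng = random.Random(seed)
    d_dist = NormalDist(d_mean, d_sd)
    pairs = []
    for _ in range(n):
        u, v = clayton_conditional_sample(theta, rng)
        pairs.append((d_dist.inv_cdf(clamp(u)),
                      inverse_weibull(v, r_shape, r_scale)))
    return pairs


# theta, r_shape, and r_scale are hypothetical; d_mean and d_sd follow Table 3
pairs = simulate_pairs(500, theta=2.0, d_mean=6.7, d_sd=1.1,
                       r_shape=5.0, r_scale=1800.0, seed=42)
```

Because the dependence is generated on the probability scale and the marginals are applied afterward, swapping the Weibull for any other fitted marginal only changes the inverse CDF in the last step.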

Estimating Copula’s Parameters

The exact maximum likelihood (EML) method was used here for estimating the copula parameters. The log-likelihood function of a copula, L(θ), is defined as (Frees and Wang 2005)

$$L(\theta) = \sum_{i=1}^{n} \log c(u_i, v_i) \qquad (6)$$

where n = number of data; θ = copula parameter; and c(u, v) = copula density function of the variables u and v, which is obtained by taking the mixed partial derivative of the copula C(u, v) with respect to u and v

$$c(u, v) = \frac{\partial^2}{\partial u\, \partial v} C(u, v) \qquad (7)$$

Taking the derivative of L(θ) with respect to θ and setting it to zero, $\hat{\theta}$ (i.e., an estimate of θ) can be calculated as (Frees and Wang 2005)

$$\hat{\theta} = \arg\max_{\theta} [L(\theta)] \qquad (8)$$

According to Eq. (8), $\hat{\theta}$ is the value of θ for which L(θ) is maximized.
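As an illustration of Eqs. (6)–(8), the sketch below maximizes the log-likelihood of the Clayton copula, whose density has the standard closed form c(u, v) = (1 + θ)(uv)^(−(1+θ)) (u^(−θ) + v^(−θ) − 1)^(−(2+1/θ)). The golden-section search, the search interval, and the sample values are assumptions made for this example (the paper does not say which optimizer was used), and the search presumes that L(θ) has a single peak on the interval.

```python
import math


def clayton_log_density(u, v, theta):
    """log c(u, v) for the Clayton copula, theta > 0, 0 < u, v <= 1."""
    s = u ** -theta + v ** -theta - 1.0
    return (math.log1p(theta)
            - (1.0 + theta) * (math.log(u) + math.log(v))
            - (2.0 + 1.0 / theta) * math.log(s))


def log_likelihood(theta, us, vs):
    """Eq. (6): sum of log copula densities over all pairs."""
    return sum(clayton_log_density(u, v, theta) for u, v in zip(us, vs))


def fit_theta(us, vs, lo=1e-3, hi=20.0, tol=1e-6):
    """Eq. (8) by golden-section search on [lo, hi] (assumed bounds)."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - g * (b - a), a + g * (b - a)
    fc, fd = log_likelihood(c, us, vs), log_likelihood(d, us, vs)
    while b - a > tol:
        if fc < fd:  # the maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + g * (b - a)
            fd = log_likelihood(d, us, vs)
        else:        # the maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - g * (b - a)
            fc = log_likelihood(c, us, vs)
    theta_hat = 0.5 * (a + b)
    return theta_hat, log_likelihood(theta_hat, us, vs)


# Hypothetical pseudo-observations in (0, 1), not data from the study
u_obs = [0.10, 0.25, 0.40, 0.55, 0.70, 0.85]
v_obs = [0.20, 0.15, 0.45, 0.50, 0.80, 0.75]
theta_hat, l_max = fit_theta(u_obs, v_obs)
```

The second return value is the maximized log-likelihood, which is the quantity the model selection criteria in the next section are built on.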

Criteria Used for Choosing Appropriate Copula

Two well-known statistical criteria are used here for the assessment and comparison of copula functions (in terms of modeling the dependence structure of the data): (1) the Akaike information criterion (Akaike 1974), and (2) the Schwartz information criterion (Schwartz 1997). The mathematical formulations of these criteria are

$$\mathrm{AIC} = [2n/(n - k - 1)]\,k - 2 \ln(L_{\max}) \qquad (9)$$

$$\mathrm{SIC} = k \ln(n) - 2 \ln(L_{\max}) \qquad (10)$$

where n = number of data; k = number of parameters of the copula model; and L_max = maximum value of the log-likelihood, L(θ), for the estimated model (obtained as described in the previous section). The AIC is preferred in small samples and the SIC in large samples (Shumway and Stoffer 2011). The copula with the lowest values of these criteria is the most appropriate function and is subsequently used for simulating the dependence structure of the solar radiation and sunshine duration data.

To support the choice of the best copula, two statistical measures based on the AIC and SIC were employed: (1) the AIC difference (AICD), and (2) the SIC difference (SICD), which are defined as (Burnham and Anderson 2002)

$$\mathrm{AICD}(i) = \mathrm{AIC}(i) - \mathrm{AIC}_{\min} \qquad (11)$$


$$\mathrm{SICD}(i) = \mathrm{SIC}(i) - \mathrm{SIC}_{\min} \qquad (12)$$

where i = given copula model; and AIC_min and SIC_min are the lowest values of AIC and SIC, respectively. The larger AICD or SICD is, the less plausible it is that the ith model is the best model. According to Burnham and Anderson (2002), two models are not distinguishable if their AICD or SICD is less than 2. The best model has AICD or SICD = 0. If one or several models have AICD or SICD < 2 (the writers call them the suitable models), they can be considered as alternatives to the best model. For the models with 0 < {AICD or SICD} < 2 (i.e., the suitable models), there is no significant difference between each of them and the best model (i.e., the model with AICD or SICD = 0).
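The selection rule in Eqs. (9)–(12) can be sketched as below. The code assumes that the term ln(L_max) in Eqs. (9) and (10) is the maximized log-likelihood value returned by the fitting step; the log-likelihood values, the sample size, and the function names are hypothetical.

```python
import math


def aic(log_lik, k, n):
    """Eq. (9): AIC with the small-sample factor 2n / (n - k - 1)."""
    return 2.0 * n * k / (n - k - 1) - 2.0 * log_lik


def sic(log_lik, k, n):
    """Eq. (10): Schwartz information criterion."""
    return k * math.log(n) - 2.0 * log_lik


def differences(scores):
    """Eqs. (11) and (12): each score minus the lowest score."""
    best = min(scores.values())
    return {name: value - best for name, value in scores.items()}


def suitable_models(diffs, threshold=2.0):
    """Models indistinguishable from the best one (difference < 2)."""
    return sorted(name for name, d in diffs.items() if d < threshold)


# Hypothetical maximized log-likelihoods for one station and season
n_obs = 60
fits = {"Clayton": (25.0, 1), "Frank": (24.5, 1), "Gumbel": (20.0, 1)}
aic_scores = {m: aic(ll, k, n_obs) for m, (ll, k) in fits.items()}
sic_scores = {m: sic(ll, k, n_obs) for m, (ll, k) in fits.items()}
aicd = differences(aic_scores)
sicd = differences(sic_scores)
candidates = suitable_models(aicd)
```

When all candidate copulas have the same number of parameters, as in this example, the penalty terms cancel and AICD and SICD reduce to twice the difference in log-likelihood.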

Results and Discussion

Analyzing the Dependency between Solar Radiation and Sunshine Duration

Before using copula models, it is important to consider the association of the monthly mean of daily solar radiation R with sunshine duration D. If there is a poor correlation between the two variables, or the variables are independent, the process of fitting parametric copulas to the data reduces to the calculation of the independence copula, or product copula [i.e., C(u, v) = uv]. This means that, under independence of the variables, the copula C(u, v) is simply equal to the product of the marginal probabilities of the variables, and it is therefore not necessary to evaluate any other copula. Hence, the values of two nonparametric measures of dependence, namely Spearman's rho and Kendall's tau rank-correlation coefficients between the R and D variables, were calculated for each season at the selected stations (Table 2). The results showed that both coefficients were significant at the 5% level at all stations. Several points can be inferred from Table 2: (1) The value of Spearman's rho correlation coefficient is always higher than Kendall's tau coefficient in each case. This result is related fundamentally to the way the correlation coefficients are calculated and to the dependence level of the two variables; the maximum ratio of Spearman's rho to Kendall's tau reaches 3/2 when the variables approach independence (the mathematical details of the problem are given in Fredrick and Nelson 2007). (2) The values of both coefficients were generally higher in autumn than in other seasons. This result demonstrates that the dependence structure of the solar radiation parameters is stronger in autumn than in other seasons. (3) The correlation coefficient values for the Ramsar station (located near the Caspian Sea, in the north of Iran) are generally lower than those of the other stations. This condition is related to the high level of humidity and cloudiness of the region, which is under two prevailing synoptic regimes: the Mediterranean low pressure (from the west of the country) and the Siberian high pressure (from the north of the country).

Fitting the Best Marginal Distribution to Data

The fit of 16 probability distributions to the monthly averaged daily solar radiation R and sunshine duration D data was assessed using the Anderson-Darling goodness-of-fit test (AD test) in each season. The best marginal distributions are normal after Johnson's data transformation, normal after Box-Cox's data transformation, normal (without data transformation), Weibull (three-parametric type), log-normal, and Weibull (two-parametric type). More details on the best-fitting distribution for each season at the Ramsar and Yazd stations are given in Table 3. The differences between the best-fitting distribution and the empirical distribution of R and D are not significant at the 5% level (Table 3) because the p-values corresponding to the AD statistics are all greater than 0.05 (the significance level). The p-value refers to the exceedance probability of a given AD value obtained from the sampling distribution of the AD statistic. Table 3 also shows the location and scale parameters of the best-fitting distributions at the Ramsar and Yazd stations for each season of the year. They can be used for the inverse transformation of the marginal probabilities of R and D to their original values.

Table 4 gives the number of stations at which a specific probability distribution for R and D was the best in each season of the year (based on the Anderson-Darling goodness-of-fit test). An important point of Table 4 is that applying the Box-Cox and Johnson transformations brought the marginal distributions of more than half of the D and R data series close to normality. In Fig. 1, the effect of the Johnson transformation on the normality of the winter R data is illustrated for the Mashhad station. Fig. 1(a) shows that a few data points were outside the 95% confidence interval, and the transformation of the data with an optimized Johnson function moved them inside the interval [Fig. 1(c)]. The right-side panels of Fig. 1 show how the Johnson function was optimally selected [Fig. 1(b)] and the parameter estimates of the best transformation function [Fig. 1(d)]; details of the algorithm and the interpretation of the results are given in Appendix II. A similar situation occurred for the Box-Cox transformation, which is displayed for the winter D data at the Zahedan station in Fig. 2 as an example. Fig. 2 shows that the Box-Cox transformation not only moved the single D data point lying outside the 95% confidence interval [Fig. 2(a)] into the interval [Fig. 2(b)] but also moved the other transformed data points closer to the line of the normal distribution than the original data. In Fig. 2(b), the optimized form of the Box-Cox transformation function is f(D) = D^λ = D^2. The Box-Cox transformation is discussed in more detail by Wilks (2011).
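When the best marginal is "normal after Box-Cox transformation", a simulated probability has to be mapped back in two stages: first through the inverse normal CDF on the transformed scale, then through the inverse of the power function. The sketch below assumes the plain power form f(D) = D^λ quoted above with λ = 2; the location and scale values and the function names are hypothetical, and transformed values at or below zero are rejected because the power inverse is not defined for them here.

```python
from statistics import NormalDist


def power_transform(x, lam):
    """Power form f(x) = x**lam quoted in the text (lam != 0, x > 0)."""
    return x ** lam


def probability_to_original(p, loc, scale, lam):
    """Map a marginal probability p back to the original data scale.

    Assumes x**lam follows a normal distribution with the given
    location and scale.
    """
    y = NormalDist(loc, scale).inv_cdf(p)
    if y <= 0.0:
        raise ValueError("transformed value must be positive")
    return y ** (1.0 / lam)


# Hypothetical parameters on the transformed (squared) scale
lam, loc, scale = 2.0, 60.0, 10.0
d_value = 8.0  # hypothetical sunshine duration, in hours
p = NormalDist(loc, scale).cdf(power_transform(d_value, lam))
d_back = probability_to_original(p, loc, scale, lam)
```

The round trip from a data value to its probability and back returns the starting value, which is a quick check that the location and scale are being applied on the transformed scale rather than the original one.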

Table 2. Kendall's Tau and Spearman's Rho Correlation Coefficients between R and D for Different Seasons of the Year in the Selected Stations

| Station | Spearman's rho: Winter | Spearman's rho: Spring | Spearman's rho: Summer | Spearman's rho: Autumn | Kendall's tau: Winter | Kendall's tau: Spring | Kendall's tau: Summer | Kendall's tau: Autumn |
|---|---|---|---|---|---|---|---|---|
| Bandar Abbas | 0.462 | 0.657 | 0.517 | 0.881 | 0.318 | 0.491 | 0.374 | 0.710 |
| Bojnord | 0.643 | 0.817 | 0.634 | 0.915 | 0.441 | 0.646 | 0.458 | 0.746 |
| Kermanshah | 0.710 | 0.872 | 0.421 | 0.762 | 0.523 | 0.701 | 0.287 | 0.591 |
| Mashhad | 0.313 | 0.706 | 0.709 | 0.816 | 0.205 | 0.548 | 0.513 | 0.615 |
| Oroomieh | 0.535 | 0.840 | 0.735 | 0.878 | 0.381 | 0.638 | 0.515 | 0.714 |
| Shiraz | 0.364 | 0.825 | 0.483 | 0.900 | 0.257 | 0.614 | 0.341 | 0.735 |
| Zahedan | 0.334 | 0.447 | 0.601 | 0.766 | 0.231 | 0.316 | 0.424 | 0.568 |
| Ramsar | 0.312 | 0.367 | 0.529 | 0.538 | 0.221 | 0.241 | 0.366 | 0.377 |
| Yazd | 0.508 | 0.761 | 0.415 | 0.847 | 0.373 | 0.544 | 0.266 | 0.671 |


Table 3. Best-Fit Marginal Distributions for R and D in Ramsar and Yazd Stations

| Station | Season | Variable | Suitable marginal distribution | Anderson-Darling statistic | p-value | Location | Scale |
|---|---|---|---|---|---|---|---|
| Ramsar | Winter | D | Normal | 0.166 | 0.932 | 3.314 | 0.9 |
| | | R | Normal after Johnson transformation | 0.215 | 0.834 | −0.047 | 1.0 |
| | Spring | D | Normal | 0.218 | 0.830 | 5.358 | 1.4 |
| | | R | Normal | 0.379 | 0.389 | 352.3 | 90.8 |
| | Summer | D | Normal after Box-Cox transformation | 0.351 | 0.453 | 2.2 | 0.3 |
| | | R | Normal after Box-Cox transformation | 0.518 | 0.179 | 17.9 | 3.0 |
| | Autumn | D | Normal after Box-Cox transformation | 0.178 | 0.913 | 1.8 | 0.2 |
| | | R | Normal | 0.125 | 0.984 | 162.6 | 54.3 |
| Yazd | Winter | D | Normal | 0.199 | 0.870 | 6.7 | 1.1 |
| | | R | Normal after Box-Cox transformation | 0.277 | 0.626 | 5.8 | 0.2 |
| | Spring | D | Normal after Johnson transformation | 0.483 | 0.212 | 0.01 | 0.9 |
| | | R | Normal after Box-Cox transformation | 0.179 | 0.909 | 0.00004 | 97,431.9 |
| | Summer | D | Normal after Box-Cox transformation | 0.394 | 0.353 | 0.00002 | 34,552.4 |
| | | R | Normal after Box-Cox transformation | 0.187 | 0.896 | 0.001 | 0.0001 |
| | Autumn | D | Normal after Box-Cox transformation | 0.399 | 0.336 | 1.9 | 0.2 |
| | | R | Normal after Johnson transformation | 0.535 | 0.151 | 0.05 | 1.0 |

Note: Location and Scale are the parameters of the fitted distribution.

Table 4. Number of Stations in Which a Given Probability Distribution for Each of the Two Variables R and D Was the Best in Each Season

| Variable | Season | Normal | Log-normal | Weibull, two parameters | Normal after Box-Cox transformation | Normal after Johnson transformation | Weibull, three parameters |
|---|---|---|---|---|---|---|---|
| D | Winter | 4 | 0 | 0 | 3 | 2 | 0 |
| | Spring | 3 | 0 | 0 | 0 | 6 | 0 |
| | Summer | 3 | 0 | 0 | 5 | 1 | 0 |
| | Autumn | 1 | 0 | 0 | 3 | 4 | 1 |
| R | Winter | 1 | 1 | 0 | 4 | 3 | 0 |
| | Spring | 3 | 1 | 0 | 1 | 3 | 1 |
| | Summer | 2 | 0 | 1 | 4 | 1 | 1 |
| | Autumn | 2 | 0 | 1 | 1 | 4 | 1 |

Fig. 1. Effect of Johnson transformation on normality of the solar radiation data for winter season in Mashhad station: (a) fitting the normal distribution to the original data; (b) diagram of p-value versus corresponding Z value for selection of the best transformation function; (c) fitting the normal distribution to the transformed data; (d) selected Johnson transformation function along with the largest p-value and its corresponding Z value


Choosing the Most Appropriate Copula Models

Five copula functions, (1) normal, (2) student’s t, (3) Clayton, (4) Gumbel, and (5) Frank, were assessed for modeling the dependence structure of solar radiation R and sunshine duration D. As an example, Table 5 gives the results of choosing the best models, along with their estimated parameters, the two information criteria, (1) AIC and (2) SIC, and their corresponding differences (AICD and SICD), for the two stations having the lowest (Ramsar) and the highest (Yazd) solar radiation values. Clayton is the best copula (its AICD and SICD = 0) for summer and autumn at the Ramsar station and for winter and autumn at the Yazd station, and none of the other copulas can be introduced as an alternative to it (Table 5). Although Clayton is the best for spring at the Yazd station, Gumbel is suitable too (because its AICD and SICD < 2). A remarkable point from Table 5 is that, in the seasons in which Clayton was not the best copula (i.e., winter and spring at the Ramsar station and summer at the Yazd station), its difference from the best copula was not significant, and therefore it can be introduced as an alternative to the best copula. For example, according to the AICD and SICD measures, although the best copula for winter at the Ramsar station is student’s t, the three copulas (1) Clayton, (2) Gumbel, and (3) Frank (refer to the corresponding cells in Table 5 for winter) do not differ significantly from the student’s t copula. Among the candidate copulas, the normal copula was neither the best-fitting nor even a suitable model in any season at the two stations.
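The selection rule above (difference of zero for the best model, difference below 2 for a suitable alternative) can be sketched as follows. The function name and labels are hypothetical; the AIC values are those listed in Table 5 for winter at the Ramsar station.

```python
def rank_by_information_criterion(scores, threshold=2.0):
    """Return {model: (difference, label)} from information-criterion values.

    The difference is each score minus the smallest score. A model is
    'best' at zero difference and 'suitable' when the difference is
    below the threshold.
    """
    best_score = min(scores.values())
    result = {}
    for model, score in scores.items():
        diff = round(score - best_score, 2)
        if diff == 0:
            label = "best"
        elif diff < threshold:
            label = "suitable"
        else:
            label = "not suitable"
        result[model] = (diff, label)
    return result


# AIC values for winter at the Ramsar station (Table 5).
aic_ramsar_winter = {
    "Normal": -2.09,
    "Student's t": -4.46,
    "Clayton": -4.44,
    "Gumbel": -4.43,
    "Frank": -4.41,
}

ranking = rank_by_information_criterion(aic_ramsar_winter)
for model, (diff, label) in ranking.items():
    print(f"{model:12s} AICD = {diff:4.2f}  {label}")
```

The same function applies unchanged to the SIC column; a model would be reported as suitable only when both differences fall below the threshold.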

For an overview, the frequencies of stations in which a specific copula function is seasonally selected as the best function (i.e., its

Fig. 2. Effect of Box-Cox transformation on normality of the D data in Zahedan station for winter season: (a) fitting the normal distribution to the original data of D; (b) fitting the normal distribution to the transformed data (with lambda = 2)

Table 5. Evaluation of Copula Functions Using the Two Information Criteria AIC and SIC and Their Corresponding Differences, i.e., AICD and SICD, along with Their Parameters in the Ramsar and Yazd Stations for Each Season of the Year

| Station | Season | Copula | Parameters | SIC | AIC | SICD | AICD |
|---|---|---|---|---|---|---|---|
| Ramsar | Winter | Normal | 0.041 | −3.42 | −2.09 | 3.56 | 2.37 |
| | | Student’s t | 0.221^a, 40^b | −6.98 | −4.46 | 0 | 0 |
| | | Clayton | 0.008 | −6.96 | −4.44 | 0.02 | 0.02 |
| | | Gumbel | 1.004 | −6.94 | −4.43 | 0.04 | 0.03 |
| | | Frank | 0.036 | −6.93 | −4.41 | 0.05 | 0.05 |
| | Spring | Normal | 0.34 | 1.46 | 3.07 | 4.92 | 3.41 |
| | | Student’s t | 0.241^a, 40^b | −2.50 | 0.61 | 0.96 | 0.95 |
| | | Clayton | 0.63 | −2.38 | 0.73 | 1.08 | 1.07 |
| | | Gumbel | 1.31 | −3.46 | −0.34 | 0 | 0 |
| | | Frank | 2.28 | −2.04 | 1.07 | 1.42 | 1.41 |
| | Summer | Normal | 0.53 | 9.95 | 11.61 | 9.89 | 8.32 |
| | | Student’s t | 0.366^a, 40^b | 5.72 | 8.94 | 5.66 | 5.65 |
| | | Clayton | 1.15 | 0.06 | 3.29 | 0 | 0 |
| | | Gumbel | 1.57 | 2.36 | 5.58 | 2.3 | 2.29 |
| | | Frank | 3.71 | 4.73 | 7.96 | 4.67 | 4.67 |
| | Autumn | Normal | 0.54 | 8.22 | 9.59 | 8.15 | 6.93 |
| | | Student’s t | 0.377^a, 40^b | 4.60 | 7.19 | 4.53 | 4.53 |
| | | Clayton | 1.20 | 0.07 | 2.66 | 0 | 0 |
| | | Gumbel | 1.60 | 2.20 | 4.80 | 2.13 | 2.14 |
| | | Frank | 3.85 | 3.53 | 6.12 | 3.46 | 3.46 |
| Yazd | Winter | Normal | 0.51 | 4.13 | 5.17 | 5.41 | 4.56 |
| | | Student’s t | 0.373^a, 40^b | 1.14 | 3.04 | 2.42 | 2.43 |
| | | Clayton | 1.19 | −1.28 | 0.61 | 0 | 0 |
| | | Gumbel | 1.59 | 0.80 | 2.69 | 2.08 | 2.08 |
| | | Frank | 3.80 | 1.11 | 3.01 | 2.39 | 2.4 |
| | Spring | Normal | 0.77 | 17.20 | 18.34 | 5.54 | 4.59 |
| | | Student’s t | 0.544^a, 40^b | 13.72 | 15.81 | 2.06 | 2.06 |
| | | Clayton | 2.38 | 11.66 | 13.75 | 0 | 0 |
| | | Gumbel | 2.19 | 13.54 | 15.63 | 1.88 | 1.88 |
| | | Frank | 6.60 | 14.11 | 16.20 | 2.45 | 2.45 |
| | Summer | Normal | 0.40 | 1.53 | 2.75 | 4.36 | 3.31 |
| | | Student’s t | 0.266^a, 40^b | −2.08 | 0.19 | 0.75 | 0.75 |
| | | Clayton | 2.54 | −1.96 | 0.31 | 0.87 | 0.87 |
| | | Gumbel | 0.72 | −2.40 | −0.13 | 0.43 | 0.43 |
| | | Frank | 1.36 | −2.83 | −0.56 | 0 | 0 |
| | Autumn | Normal | 0.88 | 23.74 | 24.63 | 15.46 | 14.8 |
| | | Student’s t | 0.671^a, 40^b | 20.46 | 22.01 | 12.18 | 12.18 |
| | | Clayton | 4.07 | 8.28 | 9.83 | 0 | 0 |
| | | Gumbel | 3.03 | 17.04 | 18.59 | 8.76 | 8.76 |
| | | Frank | 10.19 | 19.06 | 20.61 | 10.78 | 10.78 |

Note: The bolded zero values in the two last columns of the table indicate the best model in a given season, and italicized values in each season show the model or the set of models having no significant differences with the best model.
^a Correlation coefficient.
^b Degrees of freedom.

Table 6. Number of Stations in Which a Given Copula is Introduced as the Best and the Suitable for Each Season

| Season | Normal | Student’s t | Clayton | Gumbel | Frank |
|---|---|---|---|---|---|
| Winter | 0, 0 | 1, 2 | 3, 3 | 0, 3 | 5, 2 |
| Spring | 0, 1 | 0, 2 | 6, 1 | 1, 3 | 2, 2 |
| Summer | 0, 0 | 0, 4 | 6, 1 | 0, 1 | 3, 0 |
| Autumn | 0, 0 | 1, 0 | 7, 0 | 0, 0 | 1, 0 |

Note: In each set of entries, the best is the first value and the suitable is the second value.


Fig. 3. Best theoretical copula function versus the empirical copula in Yazd station: (a) winter; (b) spring; (c) summer; (d) autumn; the best theoretical copula function versus the empirical copula in Ramsar station: (e) winter; (f) spring; (g) summer; (h) autumn


Fig. 4. Simulation of monthly mean of daily solar radiation and sunshine duration in Yazd station using the best theoretical copula function: (a) winter, Clayton; (b) spring, Clayton; (c) summer, Frank; (d) autumn, Clayton; simulation of monthly mean of daily solar radiation and sunshine duration in Ramsar station using the best theoretical copula function: (e) winter, student’s t; (f) spring, Gumbel; (g) summer, Clayton; (h) autumn, Clayton


AICD and SICD = 0) and the suitable function (i.e., its AICD and SICD greater than zero and less than 2), are given (Table 6). There are 36 cases (i.e., the combination of nine stations and four seasons) to be fitted by the five candidate copulas. Clayton is the best-fitting copula for 22 cases, Frank for 11 cases, student’s t for two cases, and Gumbel for one case, and they rank from 1 to 4, respectively (Table 6). When the suitable cases (the second entries in Table 6) are also counted, Clayton can be introduced for 27 cases, Frank for 15 cases, student’s t for 10 cases, and Gumbel for eight cases. The normal copula is never selected as the best function and is a suitable function in only one case, in spring. Generally, the Clayton function was better than the other functions except in winter, when the Frank function was most often the best (Table 6).
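The totals quoted above follow directly from the entries of Table 6. As a quick arithmetic check, the short sketch below sums the (best, suitable) counts per copula; the dictionary layout and function name are illustrative choices, while the numbers are those of Table 6.

```python
# Table 6 entries as (best, suitable) station counts per season and copula.
table6 = {
    "Winter": {"Normal": (0, 0), "Student's t": (1, 2), "Clayton": (3, 3),
               "Gumbel": (0, 3), "Frank": (5, 2)},
    "Spring": {"Normal": (0, 1), "Student's t": (0, 2), "Clayton": (6, 1),
               "Gumbel": (1, 3), "Frank": (2, 2)},
    "Summer": {"Normal": (0, 0), "Student's t": (0, 4), "Clayton": (6, 1),
               "Gumbel": (0, 1), "Frank": (3, 0)},
    "Autumn": {"Normal": (0, 0), "Student's t": (1, 0), "Clayton": (7, 0),
               "Gumbel": (0, 0), "Frank": (1, 0)},
}


def tally(table):
    """Sum the best and suitable counts over all seasons for each copula."""
    totals = {}
    for season_counts in table.values():
        for copula, (best, suitable) in season_counts.items():
            b, s = totals.get(copula, (0, 0))
            totals[copula] = (b + best, s + suitable)
    return totals


totals = tally(table6)
for copula, (best, suitable) in totals.items():
    print(f"{copula:12s} best: {best:2d}  best or suitable: {best + suitable:2d}")
```

The "best" counts sum to the 36 station-season cases, and adding the "suitable" counts reproduces the 27, 15, 10, and eight cases reported for Clayton, Frank, student's t, and Gumbel.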

In Fig. 3, empirical copula functions are plotted against the best theoretical copula functions, as an example, for the Yazd [Figs. 3(a–d)] and Ramsar [Figs. 3(e–h)] stations in each season of the year. According to Fig. 3, there is good agreement between the theoretical and empirical copula functions at both stations. Similar results were observed at the other stations of interest.

Joint Simulation of Solar Radiation and Sunshine Duration

For simulation of monthly averaged daily solar radiation R and sunshine duration D data, 1,000 pairs of marginal probabilities (u, v) were generated using the most appropriate parametric copula at each station for all seasons. Then, the pairs (u, v) were transformed to the corresponding values of D and R using the inverse functions of the marginal distributions of D and R. As an example, Fig. 4 presents the simulations of D and R for the Yazd station [Figs. 4(a–d)] and the Ramsar station [Figs. 4(e–h)] in each season. Fig. 4(c) shows that the Frank copula, although the best function, could not effectively simulate the dependence structure of the data at the Yazd station in summer. This scatter diagram [Fig. 4(c)] shows a negative correlation between R and D, which is unreasonable. For example, a point with 9 h of sunshine received a total solar radiation of approximately 3344.0 J cm⁻² day⁻¹ (800 cal cm⁻² day⁻¹), whereas a point with 12 h of sunshine received a total of 2090.0 J cm⁻² day⁻¹ (500 cal cm⁻² day⁻¹). The behavior of the student’s t copula [Fig. 4(e)] is notable in simulating the dependence structure of the D and R data at the Ramsar station in winter. Fig. 4(e) shows that the student’s t copula could not reproduce the slope of the data in the R-D scatter diagram because it generated an R value of 418.0 J cm⁻² day⁻¹ (100 cal cm⁻² day⁻¹) for D values in the range of 1–6 h. Although the simulation result of the Gumbel copula [Fig. 4(f)] was better than those of Frank and student’s t, it exaggerated the uncertainty around the observed data. In particular, the Gumbel copula generated some unreasonable points in the left tail of the data, where the correlation of R and D was high. Among the five selected copula functions, only the Clayton copula [Figs. 4(a, b, d, g, and h)] appears to give reasonable simulations, for two reasons: (1) it keeps the slope of the data in the scatter diagrams, and (2) it preserves the correlation of the observed R and D data in different parts of the scatter diagrams, especially in the left and right tails.
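The simulation procedure at the start of this section (copula pairs, then inverse marginals) can be sketched as below. This is an illustration under stated assumptions rather than the authors' code: it uses the closed-form inverse of the Clayton conditional distribution, which is a standard result not written out in the paper, and it pairs the normal marginals reported for spring at Ramsar in Table 3 with the Clayton parameter 0.63 from Table 5 (a suitable, not the best, copula for that season). All function names are hypothetical.

```python
import random
from statistics import NormalDist


def clayton_conditional_inverse(u, t, theta):
    """Solve t = P(V <= v | U = u) for v under a Clayton copula (theta > 0)."""
    inner = u ** (-theta) * (t ** (-theta / (1.0 + theta)) - 1.0) + 1.0
    return inner ** (-1.0 / theta)


def open_uniform(rng):
    """Draw from the open interval (0, 1) so that inverse CDFs stay finite."""
    x = rng.random()
    while x == 0.0:
        x = rng.random()
    return x


def simulate_pairs(n, theta, marginal_d, marginal_r, seed=0):
    """Return n simulated (D, R) pairs from a Clayton copula and two marginals."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        u = open_uniform(rng)
        t = open_uniform(rng)
        v = clayton_conditional_inverse(u, t, theta)
        v = min(max(v, 1e-12), 1.0 - 1e-12)  # guard against rounding to 0 or 1
        # Map the probabilities back to the data scale with the inverse CDFs.
        pairs.append((marginal_d.inv_cdf(u), marginal_r.inv_cdf(v)))
    return pairs


# Illustrative inputs: Ramsar, spring (location and scale from Table 3).
d_marginal = NormalDist(5.358, 1.4)
r_marginal = NormalDist(352.3, 90.8)
simulated = simulate_pairs(1000, 0.63, d_marginal, r_marginal)
print(simulated[:3])
```

For series whose marginal is normal only after a Box-Cox or Johnson transformation, the inverse transformation would have to be applied after `inv_cdf`; that step is omitted here.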

Irrational results obtained in simulating R and D with some of the copulas could be related, on the one hand, to how the data are distributed on the coordinate plane of the variables R and D (i.e., the dependence structure of the variables) and, on the other hand, to the type of copula function fitted to the data. Besides, the small sample size may be another cause of the unreasonable simulations. The dispersion of the data in the two-dimensional space (R, D) determines the level of dependence of the variables, which is expressed by dependence measures such as the Spearman’s rho and Kendall’s tau rank-correlation coefficients. These correlation coefficients measure the overall strength of the association but give no information about how it varies across the distribution. Through the choice of copula, a good deal of control can be exercised over the parts of the distributions in which the variables are more strongly associated. The Clayton copula captures the left tail dependence of the distribution of the data. Therefore, it is not a suitable copula for the observed data in Figs. 4(c, e, and f) because of the poor correlation in the left tail of the data. The Gumbel copula is used to model asymmetric dependence in the data. This copula is known for its ability to capture strong right tail dependence and weak left tail dependence. If outcomes are expected to be strongly correlated at high values but less correlated at low values, then the Gumbel copula is an appropriate choice. Accordingly, this copula cannot successfully simulate the dependence structure of the data in Fig. 4(f) because of the low correlation in both the left and right tails of the data. Unlike the Clayton and Gumbel copulas, the Frank copula allows the maximum range of dependence. This means that the dependence parameter of the Frank copula permits the approximation of the upper and lower Fréchet-Hoeffding bounds, and thus the Frank copula permits modeling positive as well as negative dependence in the data. When θ (in the Frank equation in Appendix I) approaches +∞ and −∞, the Fréchet-Hoeffding upper and lower bounds are attained, respectively. The independence case is attained when θ approaches zero. However, the Frank copula has neither lower nor upper tail dependence. The Frank copula is thus suitable for modeling data characterized by weak tail dependence, as in Fig. 4(c). The negative slope of the simulated data in Fig. 4(c) can be related to the flexibility of the Frank copula in modeling both negative and positive dependence in the middle part of the data, as previously stated. In the case of Fig. 4(c), because the Frank copula identified negative dependence in the middle part of the observed data, the slope of the simulated data tended to be negative. The student’s t copula is appropriate for modeling data with left and right tail dependence, which can be set by changing the degrees of freedom parameter θ1 (in the student’s t equation in Appendix I). Fig. 4(e) shows that the correlation in both the left and right tails of the data, as well as the overall correlation of the data, is poor. Therefore, the student’s t copula cannot be expected to fit the bivariate distribution of the data in Fig. 4(e) satisfactorily.
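The qualitative statements about tail behavior can be made concrete with the standard closed-form tail dependence coefficients of the Archimedean copulas. These formulas are textbook results and are not taken from the paper; the function names are hypothetical, and the parameters used in the example are the autumn values for Yazd from Table 5.

```python
def clayton_lower_tail(theta):
    """Lower tail dependence coefficient of the Clayton copula (theta > 0)."""
    return 2.0 ** (-1.0 / theta)


def gumbel_upper_tail(theta):
    """Upper tail dependence coefficient of the Gumbel copula (theta >= 1)."""
    return 2.0 - 2.0 ** (1.0 / theta)


def kendall_tau_clayton(theta):
    """Kendall's tau implied by a Clayton parameter."""
    return theta / (theta + 2.0)


def kendall_tau_gumbel(theta):
    """Kendall's tau implied by a Gumbel parameter."""
    return 1.0 - 1.0 / theta


# Yazd, autumn (Table 5): Clayton theta = 4.07, Gumbel theta = 3.03.
print(f"Clayton lower-tail dependence: {clayton_lower_tail(4.07):.3f}")
print(f"Gumbel upper-tail dependence:  {gumbel_upper_tail(3.03):.3f}")
print(f"Clayton implied tau: {kendall_tau_clayton(4.07):.3f}")
print(f"Gumbel implied tau:  {kendall_tau_gumbel(3.03):.3f}")
```

The Clayton copula has no upper tail dependence, the Gumbel copula has no lower tail dependence, and the Frank copula has neither, which is why only the first two are coded here. Both implied tau values are about 0.67, close to the Kendall's tau of 0.671 listed for Yazd in autumn in Table 2, so the copulas differ mainly in where they place the dependence rather than in its overall strength.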

Conclusions

In the present paper, five copula functions from two families, (1) elliptical (normal and student’s t) and (2) Archimedean (Clayton, Frank, and Gumbel), were assessed in terms of modeling and simulating the joint behavior of monthly averaged daily solar radiation R and sunshine duration D data at nine stations (located in very arid to humid regions) in Iran. To this end, the paper presents a step-by-step methodology that can easily be followed and implemented by users. Before considering copulas, the quality of the R and D data was checked using a statistical method, and the fits of the 16 marginal probability distributions to the variables R and D were individually evaluated using the Anderson-Darling goodness-of-fit test. Copula parameters were estimated using the exact maximum likelihood method. The best copula function for modeling the dependence structure of the variables R and D was determined using two information criteria, (1) AIC and (2) SIC, along with their differences (i.e., AICD and SICD). Simultaneous simulations of the


R and D data were accomplished using the conditional sampling technique.

The main results of the study are as follows: (1) statistical analyses showed that, in most cases, the best marginal probability distribution for both R and D is the normal distribution after the data are transformed with Johnson functions; (2) assessment of the copula functions in modeling the dependence structure of the pairs (D, R) showed that the Clayton function is mostly better than the other copula functions; (3) although the Gumbel, Frank, or student’s t copula was the best model in some cases, its ability to simulate R and D data was low; and (4) the normal copula had no success in modeling and simulating the solar radiation parameters at the sites of interest.

According to the results, the Clayton function is suggested as the most suitable copula function for simultaneous simulation of R and D data in the study area. For future research, it is recommended to develop a copula-based weather data generator that simulates other meteorological variables, such as temperature and precipitation, in addition to the solar radiation parameters.

Appendix I. Bivariate Copula Functions and Simulation

Mathematical Formulations of Parametric Copulas

Table 7 shows the mathematical formulations and parameter ranges of the bivariate copula functions used in this paper.

Bivariate Simulation Based on Parametric Copula Functions

The conditional sampling technique was used in the present paper to generate the pairs of random variables u and v from the parametric copula functions. The four simulation steps of the conditional sampling method (Nelsen 2006) are as follows:
1. Generating two independent uniform U(0,1) variates (u, t);
2. Determining the conditional copula function $c_u(v) = P(V \le v \mid U = u) = \partial C(u, v)/\partial u$;
3. Setting $v = c_u^{(-1)}(t)$, where $c_u^{(-1)}$ denotes the quasi-inverse of $c_u$; and
4. The desired pair is (u, v).

An important point in the random generation of copula variates is the determination of the conditional copula function $c_u(v)$ and its inverse function $c_u^{(-1)}(t)$. These functions can be determined analytically for the Archimedean copula functions, but approximate methods should be used for the normal and student’s t copula functions. For example, the conditional function of the Frank copula and its inverse are, respectively,

$$c_u(v) = \frac{\exp(\theta)\,[-1 + \exp(\theta v)]}{-\exp(\theta) + \exp(\theta + \theta u) - \exp(\theta u + \theta v) + \exp(\theta + \theta v)} \qquad (13)$$

$$c_u^{(-1)}(t) = -\frac{1}{\theta} \ln\left\{ 1 + \frac{t\,[1 - \exp(-\theta)]}{t\,[\exp(-\theta u) - 1] - \exp(-\theta u)} \right\} \qquad (14)$$
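The four steps and Eqs. (13) and (14) translate almost line for line into code. The following is a minimal sketch, assuming a fixed seed and the hypothetical function names shown; it is meant to illustrate the conditional sampling method for the Frank copula, not to reproduce the authors' implementation.

```python
import math
import random


def frank_conditional(u, v, theta):
    """Eq. (13): c_u(v) = P(V <= v | U = u) for the Frank copula."""
    num = math.exp(theta) * (math.exp(theta * v) - 1.0)
    den = (-math.exp(theta) + math.exp(theta + theta * u)
           - math.exp(theta * u + theta * v) + math.exp(theta + theta * v))
    return num / den


def frank_conditional_inverse(u, t, theta):
    """Eq. (14): the value v that satisfies c_u(v) = t."""
    ratio = t * (1.0 - math.exp(-theta)) / (
        t * (math.exp(-theta * u) - 1.0) - math.exp(-theta * u))
    return -math.log(1.0 + ratio) / theta


def sample_frank(n, theta, seed=0):
    """Generate n pairs (u, v) from a Frank copula by conditional sampling."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        u, t = rng.random(), rng.random()            # step 1
        v = frank_conditional_inverse(u, t, theta)   # steps 2 and 3
        pairs.append((u, v))                         # step 4
    return pairs


# Frank parameter reported for winter at the Yazd station (Table 5).
pairs = sample_frank(1000, 3.80)
print(pairs[:3])
```

A useful sanity check is the round trip: applying Eq. (13) to the output of Eq. (14) should return the original t for any u in (0, 1) and any nonzero θ.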

Appendix II. Johnson Transformation

The Johnson transformation optimally selects one of three families of distributions, (1) SB, (2) SL, and (3) SU, where B, L, and U refer to the variable being bounded, lognormal, and unbounded, respectively (Chou et al. 1998). Table 8 gives the mathematical formulations and parameter ranges of these distributions. The four-step algorithm of data transformation using Johnson functions is as follows:
1. Choosing a given function form from Table 8;
2. Estimating the parameters of the selected function based on the method described in Chou et al. (1998);
3. Transforming the original data using the function obtained from the previous step; and
4. Fitting the normal distribution to the transformed data and then calculating the Anderson-Darling statistic and the corresponding p-value.

These four steps are repeated iteratively to select the transformation function with the largest p-value that is greater than the p-value criterion [default is 0.10; Fig. 1(b)]. Otherwise,

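Table 8. Mathematical Forms and Parameters’ Ranges of Johnson Transformation Functions

| Johnson’s family | Transformation function | Parameters’ ranges |
|---|---|---|
| SB | $\gamma + \eta \ln[(x - \varepsilon)/(\lambda + \varepsilon - x)]$ | $\eta, \lambda > 0$, $-\infty < \gamma < \infty$, $-\infty < \varepsilon < \infty$, $\varepsilon < x < \varepsilon + \lambda$ |
| SL | $\gamma + \eta \ln(x - \varepsilon)$ | $\eta > 0$, $-\infty < \gamma < \infty$, $-\infty < \varepsilon < \infty$, $\varepsilon < x$ |
| SU | $\gamma + \eta \sinh^{-1}[(x - \varepsilon)/\lambda]$, where $\sinh^{-1}(x) = \ln(x + \sqrt{1 + x^{2}})$ | $\eta, \lambda > 0$, $-\infty < \gamma < \infty$, $-\infty < \varepsilon < \infty$, $-\infty < x < \infty$ |

Note: Data from Chou et al. (1998).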

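Table 7. Mathematical Forms and Parameters’ Ranges of the Theoretical Copula Functions Used in This Paper

| Copula type | Mathematical formula | θ-domain |
|---|---|---|
| Normal | $C(u, v; \theta) = \int_{-\infty}^{\varphi^{-1}(u)} \int_{-\infty}^{\varphi^{-1}(v)} \frac{1}{2\pi(1 - \theta^{2})^{0.5}} \exp\left[\frac{-(s^{2} - 2\theta s t + t^{2})}{2(1 - \theta^{2})}\right] ds\,dt$ | $(-1, 1)$ |
| Student’s t | $C(u, v; \theta_1, \theta_2) = \int_{-\infty}^{t_{\theta_1}^{-1}(u)} \int_{-\infty}^{t_{\theta_1}^{-1}(v)} \frac{1}{2\pi(1 - \theta_2^{2})^{0.5}} \left[1 + \frac{s^{2} - 2\theta_2 s t + t^{2}}{\theta_1(1 - \theta_2^{2})}\right]^{-(\theta_1 + 2)/2} ds\,dt$ | $\theta_1 \in (0, \infty)$^a, $\theta_2 \in (-1, 1)$^b |
| Clayton | $C(u, v; \theta) = (u^{-\theta} + v^{-\theta} - 1)^{-1/\theta}$ | $(0, \infty)$ |
| Frank | $C(u, v; \theta) = -\theta^{-1} \log\left\{1 + \frac{[\exp(-\theta u) - 1][\exp(-\theta v) - 1]}{\exp(-\theta) - 1}\right\}$ | $(-\infty, \infty) \setminus \{0\}$ |
| Gumbel | $C(u, v; \theta) = \exp\{-[(-\ln u)^{\theta} + (-\ln v)^{\theta}]^{1/\theta}\}$ | $[1, \infty)$ |

Note: $\varphi^{-1}(\cdot)$ and $t_{\theta_1}^{-1}(\cdot)$ denote the inverse of the CDF of the standard univariate normal and t-distribution with $\theta_1$ degrees of freedom, respectively. The variables u and v belong to the range [0, 1]. Data from Nelsen (2006).
^a Degrees of freedom.
^b Correlation coefficient.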


Johnson transformation is not appropriate. The selected Johnson function is then used to transform the data so that they follow a normal distribution. More details on the Johnson transformation are given in Chou et al. (1998).
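The three transformation functions of Table 8 and the selection rule of this appendix can be sketched as follows. The function names, the parameter ordering, and the example p-values are hypothetical. Parameter estimation (Chou et al. 1998) and the Anderson-Darling p-value calculation are not implemented here, so the p-values are passed in as given.

```python
import math


def johnson_sb(x, gamma, eta, epsilon, lam):
    """Bounded family SB, valid for epsilon < x < epsilon + lam."""
    return gamma + eta * math.log((x - epsilon) / (lam + epsilon - x))


def johnson_sl(x, gamma, eta, epsilon):
    """Lognormal family SL, valid for x > epsilon."""
    return gamma + eta * math.log(x - epsilon)


def johnson_su(x, gamma, eta, epsilon, lam):
    """Unbounded family SU, valid for any real x."""
    return gamma + eta * math.asinh((x - epsilon) / lam)


def select_transformation(p_values, criterion=0.10):
    """Return the candidate with the largest p-value above the criterion.

    Returns None when no candidate exceeds the criterion, in which case
    the Johnson transformation is considered not appropriate.
    """
    best = max(p_values, key=p_values.get)
    return best if p_values[best] > criterion else None


# Hypothetical p-values for three fitted candidates.
candidates = {"SB": 0.05, "SL": 0.30, "SU": 0.20}
print(select_transformation(candidates))
```

In a full implementation the dictionary of p-values would be built by looping over the four steps listed above for each candidate function.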

References

Aguiar, R. J., and Collares-Pereira, M. (1992). “Tag: A time-dependent, autoregressive, Gaussian model for generating synthetic hourly radiation.” Solar Energy, 49(3), 167–174.

Aguiar, R. J., Collares-Pereira, M., and Conde, J. P. (1988). “Simple procedure for generating sequences of daily radiation values using a library of Markov transition matrix.” Solar Energy, 40(3), 269–279.

Akaike, H. (1974). “A new look at the statistical model identification.” Proc., IEEE Transactions on Automatic Control, New York, 716–723.

Bhardwaj, S., et al. (2013). “Estimation of solar radiation using a combination of hidden Markov model and generalized fuzzy model.” Solar Energy, 93, 43–54.

Burnham, K. P., and Anderson, D. R. (2002). Model selection and multimodel inference: A practical information-theoretical approach, 2nd Ed., Springer, New York.

Chou, Y., Polansky, A. M., and Mason, R. L. (1998). “Transforming nonnormal data to normality in statistical process control.” J. Qual. Technol., 30(2), 133–141.

Duzen, H., and Aydin, H. (2012). “Sunshine-based estimation of global solar radiation on horizontal surface at lake Van region (Turkey).” Energy Convers. Manage., 58, 35–46.

Ehnberg, J. S. G., and Bollen, H. J. (2005). “Simulation of global solar radiation based on cloud observations.” Solar Energy, 78(2), 157–162.

Falayi, E., and Robio, A. (2005). “Modeling global solar radiation using sunshine duration data.” Niger. J. Phys., 175, 181–186.

Fredricks, G. A., and Nelsen, R. B. (2007). “On the relationship between Spearman’s rho and Kendall’s tau for pairs of continuous random variables.” J. Stat. Plann. Inf., 137(7), 2143–2150.

Frees, E. W., and Wang, P. (2005). “Credibility using copulas.” N. Am. Actuarial J., 9(2), 31–48.

Genest, C., and Favre, A. C. (2007). “Everything you always wanted to know about copula modeling but were afraid to ask.” J. Hydrol. Eng., 10.1061/(ASCE)1084-0699(2007)12:4(347), 347–368.

Hocaoğlu, O. F. (2011). “Stochastic approach for daily solar radiation modeling.” Solar Energy, 85(2), 278–287.

Huang, J., Korolkiewicz, M., Agrawal, M., and Boland, J. (2013). “Forecasting solar radiation on an hourly time scale using a coupled autoregressive and dynamical system (CARDS) model.” Solar Energy, 87, 136–149.

Khorasanizadeh, H., and Mohammadi, K. (2013). “Introducing the best model for predicting the monthly mean global solar radiation over six major cities of Iran.” Energy, 51, 257–266.

Li, M. F., Fan, L., Liu, H. B., Guo, P. T., and Wu, W. (2013). “A general model for estimation of daily global solar radiation using air temperatures and site geographic parameters in southwest China.” J. Atmos. Solar-Terrestrial Phys., 92, 145–150.

Liu, X., et al. (2009). “Calibration of the Angstrom-Prescott coefficients (a, b) on the different time scales and their impacts in estimating global solar radiation in the Yellow River basin.” Agric. Forest Meteorol., 149(3–4), 697–710.

Moradi, I. (2009). “Quality control of global solar radiation using sunshine duration hours.” Energy, 34(1), 1–6.

Mora-Lopez, L., and Sidrach-de-Cardona, M. (1997). “Characterization and simulation of hourly exposure series of global radiation.” Solar Energy, 60(5), 257–270.

Mora-Lopez, L., and Sidrach-de-Cardona, M. (2003). “Using probabilistic finite automata to simulate hourly series of global radiation.” Solar Energy, 74(3), 235–244.

Nelsen, R. B. (2006). An introduction to copulas, 2nd Ed., Springer, New York.

Schwartz, E. S. (1997). “The stochastic behavior of commodity prices: Implications for valuation and hedging.” J. Finance, 52(3), 923–973.

Shiau, J. T. (2006). “Fitting drought duration and severity with two-dimensional copulas.” Water Resour. Manage., 20(5), 795–815.

Shumway, R. H., and Stoffer, D. S. (2011). Time series analysis and its applications, Springer, New York.

Skeiker, K. (2006). “Correlation of global solar radiation with common geographical and meteorological parameters for Damascus province, Syria.” Energy Convers. Manage., 47(4), 331–345.

Stöckle, C. O., Campbell, G. S., and Nelson, R. (1999). ClimGen manual, Biological Systems Engineering Dept., Washington State Univ., Pullman, WA.

Suehrcke, H., Bowden, R. S., and Hollands, K. G. T. (2013). “Relationship between sunshine duration and solar radiation.” Solar Energy, 92, 160–171.

Wilks, D. S. (2011). Statistical methods in the atmospheric sciences, 3rd Ed., Academic, New York.

Wong, M. F., Leonard, M., and Metcalfe, A. V. (2010). “Drought analysis using trivariate copulas conditional on climatic states.” J. Hydrol. Eng., 10.1061/(ASCE)HE.1943-5584.0000169, 129–141.
