Prediction of hourly and daily diffuse solar fraction in the city of Fez (Morocco)

Theor Appl Climatol
DOI 10.1007/s00704-014-1207-y

ORIGINAL PAPER

Prediction of hourly and daily diffuse solar fraction in the city of Fez (Morocco)

B. Ihya · A. Mechaqrane · R. Tadili · M. N. Bargach

Received: 23 January 2014 / Accepted: 14 June 2014
© Springer-Verlag Wien 2014

Abstract In this paper, three-layer MLP (Multi-Layer Perceptron) Artificial Neural Network (ANN) models have been developed and tested for predicting hourly and daily diffuse solar fractions at Fez city in Morocco. In parallel, some empirical models were tested. Three years of data (2009–2011) were used to establish the parameters of all tested models and 1 year (2012) to test their prediction performances. To select the best ANN (three-layer MLP) architecture, we conducted several tests using different combinations of inputs and varying the number of neurons in the hidden layer. The only output is the diffuse solar fraction. The performance of each model was assessed on the basis of four statistical indicators: mean absolute error (MAE), relative mean bias error (RMBE), relative root mean square error (RRMSE), and the degree of agreement (DA). Additionally, the coefficient of correlation (R) is used to test the linear relation between predicted and observed data. The results indicate that the ANN model is more suitable for predicting the diffuse solar fraction than the tested empirical models at Fez city in Morocco.

B. Ihya (✉) · A. Mechaqrane
Laboratory of signals systems and components, FST Street Immouzar, B.P. 2202 Fez, Morocco
e-mail: [email protected]

A. Mechaqrane
e-mail: [email protected]

R. Tadili · M. N. Bargach
Laboratory of solar energy and environments, FS 4 avenue Ibn Battouta, B.P. 1014 Rabat, Morocco

Nomenclature

Hd      Diffuse solar radiation (W/m2)
Hg      Global solar radiation (W/m2)
H0      Extra-terrestrial solar radiation (W/m2)
Isc     Solar constant (= 1367 W/m2)
Ig,c    Global horizontal irradiation by Page clear sky model
Id,c    Diffuse horizontal irradiation by Page clear sky model
Id,oc   Diffuse horizontal irradiation by Page overcast sky model
Kt      Daily clearness index
kt      Hourly clearness index
Kd      Daily diffuse fraction
kd      Hourly diffuse fraction
T       Temperature (°C)
RH      Relative humidity (%)
WS      Wind speed (m/s)
WD      Wind direction (°)
Rf      Rainfall (mm)
Dn      The number of the day of the year, starting from 1 January
h       Hour of the day
DL      Day length
ANN     Artificial Neural Networks
MLP     Multi-Layers Perceptron
R       Correlation coefficient
DA      Degree of agreement
RRMSE   Relative root mean square error
RMBE    Relative mean bias error
α       Solar altitude (°)
δ       Solar declination (°)
λ       Latitude of measurement site (°)


B. Ihya et al.

1 Introduction

An accurate knowledge of the solar energy reaching the ground is necessary for sizing and optimizing the performance of solar installations such as flat-plate collectors, photovoltaic systems, and other solar energy collectors. In meteorological stations, the most commonly measured solar component is the global solar radiation on a horizontal surface. Knowledge of this component is sufficient to size some systems, such as flat-plate thermal collectors. However, other systems, such as concentrating solar power (CSP) plants or concentrated photovoltaics (CPV), require knowledge of the direct solar component. This component is difficult to measure because it requires a pyrheliometer equipped with a solar tracking system, which is very costly. An alternative is to use one pyranometer to measure the global solar component and a second pyranometer equipped with a shadow band to measure the diffuse solar component. The direct component is then obtained by subtracting the diffuse component from the global component. Another, less expensive alternative is to measure only the global solar component and use a model to calculate the other components.

In general, the methods used consist of empirical relationships between the diffuse fraction (kd) and the clearness index (kt), defined as follows:

$$k_d = \frac{H_d}{H_g} \tag{1}$$

$$k_t = \frac{H_g}{H_0} \tag{2}$$

where Hg, Hd, and H0 are, respectively, the global, the diffuse, and the extra-terrestrial solar irradiation on a horizontal surface over a given time scale, usually hourly (Ruiz-Arias et al. 2010; Jacovides et al. 2006; Hamdy 2007; Erbs et al. 1981; Oliveira et al. 2002; Muneer 2004) or daily (Hamdy 2007; Erbs et al. 1981; Oliveira et al. 2002; Muneer 2004; Collares-Pereira and Rabl 1979; Jin et al. 2004; Nfaoui and Buret 1993).
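As a minimal sketch of Eqs. (1) and (2), the two ratios can be computed directly from irradiation values. The function names and the sample numbers below are hypothetical and serve only as an illustration; they are not measurements from the Fez station.

```python
import numpy as np

def diffuse_fraction(h_d, h_g):
    """Eq. (1): k_d = H_d / H_g."""
    return np.asarray(h_d, dtype=float) / np.asarray(h_g, dtype=float)

def clearness_index(h_g, h_0):
    """Eq. (2): k_t = H_g / H_0."""
    return np.asarray(h_g, dtype=float) / np.asarray(h_0, dtype=float)

# Hypothetical hourly values in W/m2 (illustrative only, not measured data).
h_0 = np.array([1000.0, 800.0])   # extra-terrestrial
h_g = np.array([700.0, 200.0])    # global
h_d = np.array([140.0, 180.0])    # diffuse

k_t = clearness_index(h_g, h_0)
k_d = diffuse_fraction(h_d, h_g)
# Direct component on the horizontal plane, obtained by subtraction
# as described in the Introduction.
h_b = h_g - h_d
```

The first sample resembles a clear hour (high kt, low kd) and the second a cloudy hour (low kt, high kd), which is the general trend the empirical models in Section 4 try to capture.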

Boland et al. (2008) have developed a logistic model for some Australian locations.

Ruiz-Arias et al. (2010) have used a regressive model based on the sigmoid function to obtain kd, with the clearness index and the pressure-corrected optical air mass as predictor variables. Other authors have used empirical relations to estimate the monthly average daily diffuse solar radiation from the clearness index (Hamdy 2007; Erbs et al. 1981; Oliveira et al. 2002; Ulgen and Hepbasli 2009; Jiang 2008).

Artificial neural network (ANN) models have also been used by some authors for modeling and estimating solar radiation from meteorological data and astronomical parameters. Hamdy et al. (2007) have proposed an ANN model to predict the diffuse solar fraction at hourly and daily scales, using as inputs the global solar radiation and other meteorological parameters such as long-wave atmospheric emission, air temperature, relative humidity, and atmospheric pressure. Soares et al. (2004) applied a perceptron neural network technique to estimate hourly values of the diffuse solar radiation in Sao Paulo City, Brazil, using as inputs the global solar radiation and other meteorological parameters. Alam et al. (2009) have developed, for some Indian stations, ANN models for estimating monthly mean hourly and daily diffuse solar radiation, using as inputs the latitude, longitude, altitude, time, month of the year, air temperature, relative humidity, rainfall, wind speed, and net long-wave radiation (infrared radiation). Jiang (2008) has also used an ANN model for estimating monthly mean daily diffuse solar radiation.

In this work, we test a set of ANN and empirical models to estimate hourly and daily values of the diffuse solar fraction from the clearness index, other astronomical parameters, and meteorological data.

2 Database

In this study, a database containing hourly measured values of global and diffuse solar irradiation and other meteorological parameters is used. The measurements cover the period from 1 January 2009 to 31 December 2012 and were made by a radiometric station placed on top of the Faculty of Sciences and Technics building of Sidi Mohamed Ben Abdellah University in Fez (latitude 33°56′ N, longitude 4°59′ W, altitude 579 m). The city of Fez is located in the Saiss plain between the Middle Atlas and the Rif Mountains. It is characterized by a continental climate.

The devices used to collect the data are a Kipp & Zonen CM-11 pyranometer for measuring the global solar irradiance and a second, identical pyranometer equipped with a shadow band for measuring the diffuse solar irradiance (Fig. 1). The shadow ring has a width of 7.6 cm and a radius of 31 cm. Precipitation is measured with a rain gauge, air temperature and relative humidity with a thermo-hygrometer, and wind speed and direction with an anemometer.

Given that the diffuse solar irradiance is measured by a pyranometer equipped with a shadow band, we applied the correction procedure proposed by Drummond (1964) to account for the part of the diffuse component blocked by the shadow band.

Before applying the models for estimating the diffuse solar fraction, some quality tests must be applied to eliminate erroneous data and establish a reliable dataset.


Fig. 1 Kipp and Zonen CM-11 equipped with a shadow band

3 Quality control procedure

It is well known that measurements of diffuse solar irradiation can be unreliable from time to time. Among the reasons for this, we can cite:

– The reading apparatus is known to fail from time to time and will then give infeasible values for diffuse radiation (Boland and Ridley 2007).

– The solar shadow band can suffer from misalignment from time to time.

Figure 2 shows the diffuse fraction plotted against the clearness index for the hourly raw data. Some erroneous data are clearly visible in this figure. To construct an accurate and reliable dataset, we apply four well-known tests (Ruiz-Arias et al. 2010; Muneer 2004; Younes et al. 2005).

Fig. 2 The scatter plot of the diffuse solar fraction against the clearness index for the hourly raw data (x-axis: hourly clearness index kt; y-axis: hourly diffuse solar fraction kd)

3.1 First test

In this test, as recommended by Younes et al. (2005), all the data values corresponding to a solar altitude α below 7° have been eliminated.

3.2 Second test

In this test, only values of the clearness index and of the diffuse solar fraction that are positive and less than one are retained:

0 < kt < 1 and 0 < kd < 1
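The first two tests amount to simple element-wise filters. The sketch below is one possible way to express them with NumPy; the function name, argument names, and sample values are hypothetical.

```python
import numpy as np

def passes_first_two_tests(alpha_deg, k_t, k_d, min_altitude_deg=7.0):
    """Boolean mask of the records kept by the first and second tests.

    First test: drop records whose solar altitude is below 7 degrees.
    Second test: keep only 0 < k_t < 1 and 0 < k_d < 1.
    """
    alpha_deg = np.asarray(alpha_deg, dtype=float)
    k_t = np.asarray(k_t, dtype=float)
    k_d = np.asarray(k_d, dtype=float)
    altitude_ok = alpha_deg >= min_altitude_deg
    range_ok = (k_t > 0) & (k_t < 1) & (k_d > 0) & (k_d < 1)
    return altitude_ok & range_ok

# Hypothetical records: low sun, valid, k_t above one, k_d equal to zero.
mask = passes_first_two_tests(
    alpha_deg=[5.0, 30.0, 30.0, 30.0],
    k_t=[0.5, 0.5, 1.2, 0.6],
    k_d=[0.5, 0.4, 0.3, 0.0],
)
```

Only the second record satisfies both conditions in this example.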

3.3 Third test

In this test, we only retain the data values satisfying the following inequalities:

Id,oc ≤ Hd ≤ Id,c and Hg ≤ Ig,c

where Id,c and Id,oc are, respectively, the upper and the lower limits of the diffuse solar irradiance Hd. Ig,c is the upper limit of the global solar irradiance Hg. Id,c, Id,oc, and Ig,c are calculated using the Page model (Ruiz-Arias et al. 2010; Younes et al. 2005).

The global horizontal irradiance for clear sky Ig,c consists of two components, the beam component Ib,c and the diffuse component Id,c:

$$I_{g,c} = I_{b,c} + I_{d,c} \tag{3}$$

The beam irradiance on a horizontal surface for clear sky Ib,c is given by (Muneer 2004):

$$I_{b,c} = I_{sc}\,\sin\alpha\,\left[1.0 + 0.03344\cos\left(\frac{D_n}{365.25} - 2.80\right)\right]\exp\left(-0.8662\,m\,T_L\,\delta_r\right) \tag{4}$$

where TL is the Linke turbidity factor for an air mass equal to 2 and m is the relative optical air mass, calculated as a function of the site elevation z and the scale height of the Rayleigh atmosphere near the Earth's surface zh (Kasten 1993):

$$m = \frac{\exp(-z/z_h)}{\sin(\alpha_{true}) + 0.50572\,(\alpha_{true} + 6.07995)^{-1.6364}} \tag{5}$$

$$\alpha_{true} = \alpha + 0.061359\,\frac{180}{\pi}\;\frac{0.1594 + 1.123\,\alpha\,\frac{\pi}{180} + 0.065656\left(\alpha\,\frac{\pi}{180}\right)^{2}}{1 + 28.9344\,\alpha\,\frac{\pi}{180} + 277.3971\left(\alpha\,\frac{\pi}{180}\right)^{2}} \tag{6}$$

δr is the integral Rayleigh optical thickness, given by (Kasten 1993):

$$\delta_r = \left[6.6296 + 1.7513\,m - 0.1202\,m^{2} + 0.0065\,m^{3} - 0.00013\,m^{4}\right]^{-1} \tag{7}$$
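Equations (5) to (7) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, α is taken in degrees (as the π/180 factors in Eq. (6) suggest), and the default scale height of 8434.5 m is an assumed value, since the text does not give zh.

```python
import math

def true_solar_altitude_deg(alpha_deg):
    """Eq. (6): solar altitude corrected for refraction, in degrees."""
    a = math.radians(alpha_deg)
    num = 0.1594 + 1.123 * a + 0.065656 * a**2
    den = 1.0 + 28.9344 * a + 277.3971 * a**2
    return alpha_deg + math.degrees(0.061359 * num / den)

def relative_air_mass(alpha_deg, z_m, z_h_m=8434.5):
    """Eq. (5): relative optical air mass at site elevation z_m (metres).

    z_h_m is an ASSUMED scale height; the source text gives no value.
    """
    a_true = true_solar_altitude_deg(alpha_deg)
    denom = (math.sin(math.radians(a_true))
             + 0.50572 * (a_true + 6.07995) ** -1.6364)
    return math.exp(-z_m / z_h_m) / denom

def rayleigh_optical_thickness(m):
    """Eq. (7): integral Rayleigh optical thickness for air mass m."""
    return 1.0 / (6.6296 + 1.7513 * m - 0.1202 * m**2
                  + 0.0065 * m**3 - 0.00013 * m**4)

# Example at the station altitude quoted in Section 2 (579 m).
m_fez = relative_air_mass(30.0, 579.0)
delta_r = rayleigh_optical_thickness(m_fez)
```

As a sanity check, the air mass is close to 1 for an overhead sun at sea level and close to 2 for a solar altitude of 30°, and it decreases with site elevation.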


The diffuse horizontal irradiance for clear sky Id,c is determined by (Muneer 2004):

$$I_{d,c} = \left[1.0 + 0.03344\cos\left(\frac{D_n}{365.25} - 2.80\right)\right] T_{rd}\,F(\alpha) \tag{8}$$

where Trd is the diffuse transmittance function at zenith and F(α) is the solar elevation function.

$$T_{rd} = -21.657 + 41.752\,T_L + 0.51905\,T_L^{2} \tag{9}$$

The solar elevation function F(α) is a polynomial function of the sine of the solar elevation and is evaluated using (Muneer 2004):

$$F(\alpha) = x_0 + x_1\sin\alpha + x_2\sin^{2}\alpha \tag{10}$$

The coefficients x0, x1, and x2 are given by:

$$\begin{cases} x_0 = 0.26463 - 0.061581\,T_L + 0.0031408\,T_L^{2} \\ x_1 = 2.0402 + 0.018945\,T_L - 0.011161\,T_L^{2} \\ x_2 = -1.3025 + 0.039231\,T_L + 0.0085079\,T_L^{2} \end{cases} \tag{11}$$

with a condition on x0:

$$\text{if } x_0\,T_{rd} < 2\cdot 10^{-3}, \text{ then } x_0 = \frac{2\cdot 10^{-3}}{T_{rd}} \tag{12}$$

The diffuse irradiance under overcast skies Id,oc is calculated by (Muneer 2004):

$$I_{d,oc} = 572\,\alpha \tag{13}$$
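The building blocks of Eqs. (9) to (13) can be sketched as follows. The function names are hypothetical, the Linke turbidity value used in the example is arbitrary, and Eq. (13) is evaluated here with α converted to radians, which is an assumption because the text does not state the unit.

```python
import math

def diffuse_transmittance(t_l):
    """Eq. (9): diffuse transmittance function at zenith."""
    return -21.657 + 41.752 * t_l + 0.51905 * t_l**2

def solar_elevation_function(alpha_deg, t_l):
    """Eqs. (10)-(12): polynomial in sin(alpha) with the condition on x0."""
    x0 = 0.26463 - 0.061581 * t_l + 0.0031408 * t_l**2
    x1 = 2.0402 + 0.018945 * t_l - 0.011161 * t_l**2
    x2 = -1.3025 + 0.039231 * t_l + 0.0085079 * t_l**2
    t_rd = diffuse_transmittance(t_l)
    if x0 * t_rd < 2e-3:          # Eq. (12)
        x0 = 2e-3 / t_rd
    s = math.sin(math.radians(alpha_deg))
    return x0 + x1 * s + x2 * s**2

def overcast_diffuse(alpha_deg):
    """Eq. (13), ASSUMING the solar altitude enters in radians."""
    return 572.0 * math.radians(alpha_deg)

# Arbitrary illustrative turbidity T_L = 3 (not a value from the paper).
t_rd = diffuse_transmittance(3.0)
f_zenith = solar_elevation_function(90.0, 3.0)
```

The clear-sky diffuse irradiance of Eq. (8) is then the product of these two quantities with the bracketed day-of-year correction factor.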

3.4 Fourth test

After the third test, some erroneous data still remain (Fig. 3; third test). To address this, we applied the standard deviation-based procedure proposed by Claywell et al. (2005), which consists of a statistical outlier analysis. The whole hourly clearness index kt range (0 to 1) is split into ten equally spaced intervals. For each interval, we calculate the mean ($\bar{k}_d$) and the standard deviation (σd) of the corresponding kd values. We retain only the values for which:

$$\bar{k}_d - 2\sigma_d \le k_d \le \bar{k}_d + 2\sigma_d$$

Figure 3 shows the retained (blue dots) and eliminated (green dots) data values. The number of hourly data values that passed all the tests is 7,848.

Fig. 3 Quality control for data collected in Fez between 2009 and 2012 (panels: first test, second test, third test, fourth test; x-axis: clearness index kt; y-axis: diffuse solar fraction kd). The blue dots are values that pass the tests. The green dots are values that do not pass the tests
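The fourth test can be sketched as a per-interval filter. The code below is a minimal illustration with hypothetical names; it uses the population standard deviation (NumPy's default), which is an assumption since the text does not say which estimator was used.

```python
import numpy as np

def sigma_outlier_mask(k_t, k_d, n_bins=10, n_sigma=2.0):
    """Keep k_d values within n_sigma standard deviations of the mean
    k_d of their k_t interval (k_t range 0 to 1, n_bins equal intervals)."""
    k_t = np.asarray(k_t, dtype=float)
    k_d = np.asarray(k_d, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Interior edges map every k_t to an interval index 0 .. n_bins - 1.
    bin_idx = np.digitize(k_t, edges[1:-1])
    keep = np.zeros(k_t.shape, dtype=bool)
    for b in range(n_bins):
        in_bin = bin_idx == b
        if not in_bin.any():
            continue
        mean = k_d[in_bin].mean()
        std = k_d[in_bin].std()   # population standard deviation (assumed)
        keep[in_bin] = np.abs(k_d[in_bin] - mean) <= n_sigma * std
    return keep

# Hypothetical interval content: nine similar values and one outlier.
k_t_demo = np.full(10, 0.55)
k_d_demo = np.array([0.5] * 9 + [1.0])
keep_demo = sigma_outlier_mask(k_t_demo, k_d_demo)
```

In this example the last value lies 0.45 above the interval mean of 0.55 while 2σd is 0.30, so it is rejected and the other nine values are kept.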

In Fig. 3, we can see that the third test rejects several outliers, although it also seems to eliminate some a priori good points. This was already noticed by Ruiz-Arias et al. (2010), who considered this test statistically positive overall given the large number of data points.

For the quality control of daily data, we first calculate the daily global (G) and diffuse (D) solar radiations by summing the hourly data satisfying the first test (solar elevation > 7°). We then apply the following quality tests:

– The daily clearness index Kt and diffuse fraction Kd must stay between 0 and 1.

– The daily solar radiations G and D must satisfy the Page test:

Dd,oc ≤ D ≤ Dd,c and G ≤ Gg,c

where Dd,oc and Dd,c are, respectively, the lower and the upper limits of the daily diffuse horizontal irradiation and Gg,c is the upper limit of the daily global horizontal irradiation. These quantities are calculated from the hourly values of the Page model (cf. Section 3.3).

– The statistical outlier analysis.

4 Models of estimating

In this work, we have developed models based on artificial neural networks and compared them with some empirical models (polynomial, Erbs et al. 1981, Collares-Pereira and Rabl 1979, Nfaoui and Buret 1993, and Boland et al. 2008), which we present first.

4.1 Polynomial models

The diffuse solar radiation can be obtained through various empirical correlations, usually expressed as nth-order polynomial relationships between the diffuse fraction and the clearness index at hourly and daily scales. In this work, the following third-order polynomial is proposed:

$$k_d = a + b\,k_t + c\,k_t^{2} + d\,k_t^{3} \tag{14}$$

where a, b, c, and d are empirical constants.
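One way to establish the constants of Eq. (14) is an ordinary least-squares fit, sketched below with `numpy.polyfit`. The fitting method is an assumption, since the text does not say how the constants were obtained, and the function names are hypothetical. The demonstration data are synthetic, generated from the hourly polynomial coefficients reported in Table 4.

```python
import numpy as np

def fit_cubic(k_t, k_d):
    """Least-squares fit of Eq. (14); returns (a, b, c, d)."""
    # np.polyfit returns the highest power first, so unpack in reverse.
    d, c, b, a = np.polyfit(k_t, k_d, deg=3)
    return a, b, c, d

def predict_cubic(k_t, coeffs):
    """Evaluate k_d = a + b k_t + c k_t^2 + d k_t^3."""
    a, b, c, d = coeffs
    k_t = np.asarray(k_t, dtype=float)
    return a + b * k_t + c * k_t**2 + d * k_t**3

# Synthetic, noise-free data built from the Table 4 coefficients.
table4_coeffs = (0.773, 2.3, -8.62, 6.01)
k_t_demo = np.linspace(0.05, 0.95, 20)
k_d_demo = predict_cubic(k_t_demo, table4_coeffs)
fitted = fit_cubic(k_t_demo, k_d_demo)
```

On noise-free data the fit recovers the generating coefficients; with real measurements the scatter around the curve determines the error statistics of Section 5.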

4.2 Nfaoui and Buret model

Nfaoui and Buret developed empirical correlations to establish a relationship between the daily diffuse fraction (Kd) and the daily clearness index (Kt) for the city of Rabat in Morocco (Nfaoui and Buret 1993).

$$K_d = \begin{cases} 0.98 & K_t < 0.1 \\ 0.98 + 0.15\,K_t - 1.48\,K_t^{2} & 0.1 \le K_t \end{cases} \tag{15}$$

4.3 Erbs et al. model

Erbs et al. (1981) established the following relationships between the hourly diffuse solar fraction and the hourly clearness index for some US locations:

$$k_d = \begin{cases} 1.0 - 0.09\,k_t & k_t \le 0.22 \\ 0.9511 - 0.1604\,k_t + 4.388\,k_t^{2} - 16.638\,k_t^{3} + 12.336\,k_t^{4} & 0.22 < k_t \le 0.8 \\ 0.165 & 0.8 < k_t \end{cases} \tag{16}$$
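The piecewise correlation of Eq. (16) translates directly into a vectorised function. The sketch below uses the coefficients as printed in Eq. (16); the function name is hypothetical.

```python
import numpy as np

def erbs_diffuse_fraction(k_t):
    """Eq. (16): hourly diffuse fraction from the hourly clearness index."""
    k_t = np.asarray(k_t, dtype=float)
    low = 1.0 - 0.09 * k_t
    mid = (0.9511 - 0.1604 * k_t + 4.388 * k_t**2
           - 16.638 * k_t**3 + 12.336 * k_t**4)
    high = 0.165
    return np.where(k_t <= 0.22, low, np.where(k_t <= 0.8, mid, high))

# One value in each of the three k_t ranges.
k_d_demo = erbs_diffuse_fraction([0.1, 0.5, 0.9])
```

The other piecewise models of this section (Eqs. 15 and 17) can be written in the same way by changing the breakpoints and coefficients.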

4.4 Collares-Pereira and Rabl model

Collares-Pereira and Rabl (1979) established a fourth-order polynomial model between the daily diffuse solar fraction Kd and the daily clearness index Kt:

$$K_d = \begin{cases} 0.99 & K_t \le 0.17 \\ 1.188 - 2.272\,K_t + 9.473\,K_t^{2} - 21.856\,K_t^{3} + 14.648\,K_t^{4} & 0.17 < K_t \le 0.8 \end{cases} \tag{17}$$

4.5 Boland et al. model

The model developed by Boland et al. (2008) is given by:

$$K_d = \frac{1}{1 + e^{\,a + b K_t}} \tag{18}$$

where a and b are empirical constants.
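Because Eq. (18) is a logistic function, it can be linearised as ln(1/Kd − 1) = a + b Kt, so a and b can be estimated by a straight-line fit. The sketch below uses this linearisation as one possible approach; it is not necessarily the procedure followed by the authors, and the function names are hypothetical. The demonstration data are synthetic, generated from the constants reported in Table 4.

```python
import numpy as np

def boland(k_t, a, b):
    """Eq. (18): logistic diffuse-fraction model."""
    return 1.0 / (1.0 + np.exp(a + b * np.asarray(k_t, dtype=float)))

def fit_boland_linearized(k_t, k_d):
    """Estimate (a, b) from ln(1/k_d - 1) = a + b k_t.

    Requires 0 < k_d < 1, which the second quality test guarantees.
    """
    k_t = np.asarray(k_t, dtype=float)
    k_d = np.asarray(k_d, dtype=float)
    y = np.log(1.0 / k_d - 1.0)
    b, a = np.polyfit(k_t, y, deg=1)   # slope first, then intercept
    return a, b

# Synthetic, noise-free data built from the Table 4 constants.
k_t_demo = np.linspace(0.1, 0.9, 9)
k_d_demo = boland(k_t_demo, -4.38, 8.15)
a_fit, b_fit = fit_boland_linearized(k_t_demo, k_d_demo)
```

A nonlinear least-squares fit on kd itself would weight the errors differently from this linearised version, so the two approaches can give slightly different constants on noisy data.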

4.6 Artificial neural networks

Artificial neural networks (ANNs) are very simplified models of biological neural networks. The aim of such models is to mimic some interesting properties of the human brain, in particular its learning and generalization abilities. Currently, ANNs are used in many applications such as speech processing, computer vision, time series prediction, robotics, and character recognition.

One of the most important aspects of ANN design is the selection of the topology (architecture and nature of the connections, feedforward or recurrent). Because of their simplicity and their important property of being universal approximators (Hornik et al. 1989), feedforward multilayer perceptrons (MLPs) are the most commonly used ANNs. Many studies have shown the ability of MLPs to solve complex and diverse problems in physics, chemistry, econometrics, biology, etc.

Fig. 4 Typical 3-layer MLP architecture with n inputs, N hidden neurons and 1 output neuron

In this work, three-layer MLP ANNs (Fig. 4) based on the back-propagation algorithm are developed, trained, and tested for estimating hourly and daily diffuse solar fractions.

The three-layer MLP ANNs considered here consist of several neurons in the input layer (each one representing one input feature), several neurons in the hidden layer, and one neuron in the output layer representing the diffuse solar fraction (Fig. 4).

In Fig. 4, the notation inside each neuron represents its output. wij is the weight of the connection from input i to hidden neuron j, and wjo is the weight of the connection from neuron j in the hidden layer to the output neuron. kd is the output of the MLP. A bias or threshold bj is associated with each neuron j of the network. A bias is treated as a normal weight whose input is clamped at −1. f and h are the activation (transfer) functions of the hidden and output neurons, respectively. In this study, the neurons of the hidden layer have a sigmoid transfer function, while the output neuron has a linear transfer function.

Every neuron in the network sums its weighted inputs to produce an internal activity level sk. For hidden neuron j, the internal activity can be written as:

$$s_j = \sum_{i=1}^{n} w_{ij}\,x_i - b_j \tag{19}$$

For the output neuron, the internal activity is:

$$s_o = \sum_{j=1}^{N} w_{jo}\,f(s_j) - b_o \tag{20}$$

The ANN output kd = h(so) is calculated knowing the weights of the different connections and the transfer function of each layer.
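The forward pass described by Eqs. (19) and (20) can be sketched with NumPy as follows. The logistic function is assumed for the sigmoid f, since the text does not specify which sigmoid was used, and the function names, array shapes, and weight values are hypothetical.

```python
import numpy as np

def sigmoid(s):
    """Logistic sigmoid, ASSUMED here for the hidden transfer function f."""
    return 1.0 / (1.0 + np.exp(-s))

def mlp_forward(x, w_ih, b_h, w_ho, b_o):
    """Forward pass of a three-layer MLP with one linear output neuron.

    x: inputs, shape (n,); w_ih: input-to-hidden weights, shape (n, N);
    b_h: hidden biases, shape (N,); w_ho: hidden-to-output weights,
    shape (N,); b_o: output bias (scalar).
    """
    s_hidden = x @ w_ih - b_h                  # Eq. (19)
    s_out = sigmoid(s_hidden) @ w_ho - b_o     # Eq. (20)
    return s_out                               # linear output: h(s) = s

# Hypothetical untrained network with n = 3 inputs and N = 4 hidden neurons.
k_d_demo = mlp_forward(
    x=np.zeros(3),
    w_ih=np.zeros((3, 4)),
    b_h=np.zeros(4),
    w_ho=np.ones(4),
    b_o=0.5,
)
```

With all hidden activities equal to zero, each hidden neuron outputs 0.5, so the output is 4 × 0.5 − 0.5 = 1.5, which makes the sign convention of the biases in Eqs. (19) and (20) easy to check.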

The parameters (weights and biases) are adjusted in the training phase, in which known patterns are presented to the network. The adjustment is carried out by comparing the responses of the network (outputs) with the corresponding targets (desired outputs), until the outputs match the targets as closely as possible. The standard back-propagation algorithm (Rumelhart et al. 1986) was used for training the MLP.

Table 1 Average performances obtained for the validation hourly dataset on 10 runs (candidate inputs: kt, Dn, h, α, δ, T, RH, WS, WD, Rf; each X marks one input used)

Inputs used            R (%)   RMBE (%)  RRMSE (%)  DA (%)  MAE     MSE     RMSE
X                      88.63   −2.82     26.96      93.97   0.0984  0.0164  0.1282
X X                    89.47   −1.27     25.94      94.50   0.0945  0.0152  0.1234
X X                    93.80   0.45      20.62      96.78   0.0740  0.0096  0.0978
X X                    90.33   −0.67     25.16      94.98   0.0922  0.0144  0.1198
X X X                  94.07   1.09      20.29      96.90   0.0732  0.0092  0.0961
X X X                  93.83   0.74      20.60      96.79   0.0747  0.0096  0.0980
X X X                  93.88   0.73      20.38      96.83   0.0734  0.0094  0.0970
X X X                  93.76   0.93      20.62      96.76   0.0743  0.0096  0.0981
X X X                  93.96   0.58      20.33      96.87   0.0733  0.0094  0.0970
X X X                  93.82   0.68      20.58      96.79   0.0738  0.0095  0.0977
X X X                  93.78   0.50      20.56      96.78   0.0739  0.0096  0.0978
X X X X X X X X X X    93.71   −1.83     23.11      96.73   0.0792  0.0110  0.1047

Table 2 Average performances obtained on the validation hourly dataset for 10 runs

Number of hidden neurons   R (%)   RMBE (%)  RRMSE (%)  DA (%)  MAE     MSE     RMSE
1                          93.73   0.24      20.57      96.76   0.0738  0.0096  0.0979
2                          93.78   0.31      20.52      96.78   0.0740  0.0096  0.0979
3                          93.86   0.51      20.50      96.81   0.0738  0.0095  0.0974
4                          93.96   0.61      20.39      96.86   0.0737  0.0094  0.0971
5                          94.07   1.09      20.29      96.90   0.0732  0.0092  0.0961
6                          94.09   1.04      20.23      96.91   0.0733  0.0093  0.0963
7                          94.13   0.91      20.09      96.95   0.0730  0.0091  0.0955
8                          94.16   1.21      20.11      96.95   0.0731  0.0092  0.0958
9                          94.16   1.15      20.13      96.95   0.0728  0.0091  0.0956
10                         94.21   1.30      20.02      96.97   0.0723  0.0091  0.0952
11                         94.21   1.43      20.05      96.97   0.0728  0.0091  0.0954
12                         94.16   0.97      20.05      96.96   0.0727  0.0091  0.0952
13                         94.18   1.11      20.03      96.97   0.0724  0.0091  0.0951
14                         94.25   1.27      19.94      97.00   0.0720  0.0089  0.0945
15                         94.14   0.98      20.06      96.95   0.0731  0.0091  0.0955
16                         94.15   0.86      20.05      96.96   0.0725  0.0090  0.0950
17                         94.13   1.05      20.00      96.95   0.0729  0.0091  0.0954
18                         94.16   0.88      19.97      96.97   0.0724  0.0091  0.0952
19                         94.16   1.13      20.02      96.96   0.0726  0.0091  0.0952
20                         94.13   0.82      20.03      96.95   0.0728  0.0091  0.0955

We can see in Fig. 4 that the complexity of the architecture (number of connections or number of parameters) depends on the number of neurons in the input and hidden layers. The choice of the complexity is critical because it acts directly on the generalization ability of the ANN: a network that is not sufficiently complex can fail to detect the true function between inputs and outputs in a complicated dataset, leading to underfitting, while a network that is too complex may also fit the noise in the data, leading to overfitting.

So, when applying ANNs, it is always useful to search for the smallest architecture that accurately represents the relationship between the inputs and the outputs of the studied system.
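To make the notion of complexity concrete, the number of adjustable parameters of a three-layer MLP can be counted from the layer sizes: each hidden neuron has one weight per input plus a bias, and the output neuron has one weight per hidden neuron plus a bias. The helper below is a hypothetical illustration of this count.

```python
def mlp_parameter_count(n_inputs, n_hidden, n_outputs=1):
    """Number of weights and biases in a fully connected three-layer MLP."""
    hidden_params = (n_inputs + 1) * n_hidden    # weights + one bias each
    output_params = (n_hidden + 1) * n_outputs   # weights + one bias each
    return hidden_params + output_params

# Three inputs with 5 or 14 hidden neurons, and ten inputs with 5.
count_3_5 = mlp_parameter_count(3, 5)
count_3_14 = mlp_parameter_count(3, 14)
count_10_5 = mlp_parameter_count(10, 5)
```

For three inputs, going from 5 to 14 hidden neurons raises the count from 26 to 71 parameters, while using ten inputs with 5 hidden neurons gives 61.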

Table 3 The best statistical performances on 10 runs obtained for the training and validation datasets by using the best selected ANN model

Dataset              R (%)   RMBE (%)  RRMSE (%)  DA (%)
Training dataset     94.48   −0.08     18.54      97.11
Validation dataset   94.32   0.94      19.73      97.04

5 Performance evaluation

The accuracy of the different models is assessed by means of five widely used statistical parameters (Posadillo and Lopez Luque 2009): mean absolute error (MAE), relative mean bias error (RMBE), relative root mean square error (RRMSE), degree of agreement (DA), and the correlation coefficient (R), which is used to test the linear relation between predicted and observed data.

$$\mathrm{RMBE} = 100\,\frac{\sum_{i=1}^{N}(p_i - m_i)}{\sum_{i=1}^{N} m_i} \tag{21}$$

$$\mathrm{MAE} = \frac{\sum_{i=1}^{N}\lvert p_i - m_i\rvert}{N} \tag{22}$$

$$\mathrm{RRMSE} = \frac{100}{\bar{m}}\sqrt{\frac{\sum_{i=1}^{N}(p_i - m_i)^{2}}{N}} \tag{23}$$

Fig. 5 Estimated vs measured hourly diffuse solar fraction (x-axis: measured diffuse solar fraction; y-axis: calculated diffuse solar fraction). Training dataset: RRMSE = 18.54 %, DA = 97.11 %, R = 94.48 %. Validation dataset: RRMSE = 19.73 %, DA = 97.04 %, R = 94.32 %

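$$\mathrm{DA} = 1 - \frac{\sum_{i=1}^{N}(p_i - m_i)^{2}}{\sum_{i=1}^{N}\left(\lvert p_i - \bar{m}\rvert + \lvert m_i - \bar{m}\rvert\right)^{2}} \tag{24}$$

$$R = \frac{\sum_{i=1}^{N}(p_i - \bar{p})(m_i - \bar{m})}{\sqrt{\sum_{i=1}^{N}(p_i - \bar{p})^{2}\,\sum_{i=1}^{N}(m_i - \bar{m})^{2}}} \tag{25}$$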

where N is the number of data points, pi is the ith predicted value, mi is the ith measured value, and $\bar{m}$ and $\bar{p}$ are, respectively, the measured and calculated mean values. DA and R are reported in percent in the tables and figures.

The RMBE parameter is used here only to determine whether our models tend to overestimate or underestimate the measured values. The other parameters are used to evaluate the performances of the models.
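The five indicators of Eqs. (21) to (25) can be computed together from arrays of predicted and measured values. The sketch below follows the equations term by term; the function name and the sample values are hypothetical, and DA and R are returned as fractions, so they need to be multiplied by 100 to compare with the percentages in the tables.

```python
import numpy as np

def evaluation_metrics(p, m):
    """Return RMBE (%), MAE, RRMSE (%), DA and R for predictions p
    and measurements m."""
    p = np.asarray(p, dtype=float)
    m = np.asarray(m, dtype=float)
    err = p - m
    p_bar, m_bar = p.mean(), m.mean()
    rmbe = 100.0 * err.sum() / m.sum()                       # Eq. (21)
    mae = np.abs(err).mean()                                 # Eq. (22)
    rrmse = 100.0 / m_bar * np.sqrt((err**2).mean())         # Eq. (23)
    da = 1.0 - (err**2).sum() / (
        (np.abs(p - m_bar) + np.abs(m - m_bar))**2).sum()    # Eq. (24)
    r = ((p - p_bar) * (m - m_bar)).sum() / np.sqrt(
        ((p - p_bar)**2).sum() * ((m - m_bar)**2).sum())     # Eq. (25)
    return {"RMBE": rmbe, "MAE": mae, "RRMSE": rrmse, "DA": da, "R": r}

# Hypothetical values: predictions shifted upwards by a constant 0.1.
measured = np.array([0.2, 0.4, 0.6])
biased = evaluation_metrics(measured + 0.1, measured)
perfect = evaluation_metrics(measured, measured)
```

The shifted example shows why RMBE and R are read separately: a constant overestimation gives a positive RMBE of 25 % while R stays equal to 1.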

6 Results and discussions

As noted above, three-layer MLP ANNs are used in this work. All networks were trained using 3 years of data (January 2009–December 2011) and validated using 1 year (2012), which was not presented during the network training phase. The same datasets were used to establish the coefficients of the empirical models and to test them.

Table 4 Established parameters and the corresponding performances of the best selected ANN model and the three empirical tested models in hourly scale

Model                                       a       b       c       d       e       R (%)   RMBE (%)  RRMSE (%)  DA (%)
Polynomial                                  0.773   2.3     −8.62   6.01            87.20   2.776     31.66      93.20
Erbs et al.                                                                         87.28   3.72      31.32      93.00
  kt < 0.22                                 1.02    −0.55
  0.22 ≤ kt < 0.8                           0.596   3.34    −10.7   7.31    0.153
  0.8 ≤ kt                                  0.198
Boland et al.                               −4.38   8.15                            86.32   4.748     34.30      92.69
MLP, input (kt, h, α), 14 hidden neurons                                            94.32   0.94      19.73      97.04

Fig. 6 Frequency distributions of the differences between calculated and measured kd for the four models: polynomial model, Erbs et al. model, Boland et al. model, and ANN model (x-axis: kd(measured) − kd(calculated); y-axis: frequency of occurrence, %)

6.1 Hourly scale

In order to determine the optimal MLP network architecture, we performed several tests by varying the number of neurons in the input and hidden layers.

6.1.1 Choice of input variables

The choice of input variables involves selecting the relevant variables that have a significant influence on the diffuse solar fraction (the output). We tested several combinations of input variables for estimating hourly diffuse solar fractions, using the clearness index (kt), day number (Dn), hour of the day (h), solar altitude (α), solar declination (δ), temperature (T), relative humidity (RH), wind speed (WS), wind direction (WD), and rainfall (Rf). To select the best combination of input variables, we used an MLP with five hidden neurons. Such an architecture is relatively simple and allows several runs for each combination of inputs within an acceptable time. Table 1 shows, for each input combination, the average performances on the validation hourly dataset for 10 runs. We can see that the combination [kt, h, α] gives the best performances in terms of the R, RRMSE, DA, and MAE values. The RMBE is small for all models, with values varying between −2.82 and 1.09 %. So, we can consider [kt, h, α] the best input combination. We then take these variables as the most relevant ones governing the variations of the hourly diffuse solar fraction.

6.1.2 Choice of the number of hidden neurons

The aim of this choice is to find the smallest architecture that accurately represents the relationship between the inputs and the output. As for the choice of the inputs, we performed several experiments by varying the number of neurons in the hidden layer. In this study, we adopted a constructive approach: we start with a hidden layer containing a single neuron and then add neurons one at a time. [kt, h, α] are used as inputs in all runs.

Fig. 7 Scatter plot between daily values of Kt and Kd for data collected in Fez between 2009 and 2012 (x-axis: daily clearness index Kt; y-axis: daily diffuse solar fraction Kd)

Table 5 Average performances obtained for the validation daily dataset on 10 runs (candidate inputs: Kt, Dn, DL, δ, T, RH, WS, Rf; each X marks one input used)

Inputs used        R (%)   RMBE (%)  RRMSE (%)  DA (%)  MAE     MSE     RMSE
X                  94.87   −5.68     21.10      97.18   0.0600  0.0059  0.0765
X X                95.57   −7.25     20.32      97.41   0.0569  0.0055  0.0741
X X                95.71   −7.02     19.92      97.50   0.0549  0.0051  0.0716
X X                95.70   −7.03     20.09      97.50   0.0560  0.0054  0.0733
X X X              95.70   −7.43     20.31      97.45   0.0553  0.0053  0.0726
X X X              95.64   −7.49     20.27      97.42   0.0560  0.0053  0.0726
X X X              95.69   −7.17     20.26      97.47   0.0559  0.0053  0.0729
X X X              95.63   −6.60     20.19      97.49   0.0559  0.0052  0.0724
X X X              95.47   −7.03     20.42      97.38   0.0565  0.0055  0.0741
X X X X X X X X    95.69   −7.82     20.29      97.42   0.0558  0.0053  0.0729

Table 2 shows, for each tested architecture, the average performances obtained on the validation hourly dataset for 10 runs. We can see that the best R, RRMSE, DA, and MAE values are obtained with the 14-hidden-neuron architecture. The RMBE values obtained by all models are positive and do not exceed 1.43 %, which indicates a slight overestimation. So, we can consider the combination [kt, h, α] with 14 hidden neurons a good architecture for estimating kd at the hourly scale.

The best statistical performances on 10 runs obtained for the training and validation datasets by using the best selected ANN model ([kt, h, α] as inputs and 14 hidden neurons) are summarized in Table 3.

For the training dataset, the performances on 10 runs range over 94.29–94.48 % for R, 18.54–18.87 % for the RRMSE, and 96.98–97.11 % for the DA. For the validation dataset, these ranges are 94.17–94.32 % for R, 19.73–20.15 % for the RRMSE, and 96.95–97.04 % for the DA.

Table 6 Average performances obtained on the validation daily dataset for 10 runs

Number of hidden neurons   R (%)   RMBE (%)  RRMSE (%)  DA (%)  MAE     MSE     RMSE
1                          95.46   −6.21     20.17      97.45   0.0566  0.0053  0.0731
2                          95.61   −7.10     20.32      97.45   0.0569  0.0054  0.0736
3                          95.65   −7.15     20.21      97.46   0.0564  0.0054  0.0732
4                          95.67   −7.17     20.12      97.47   0.0559  0.0053  0.0729
5                          95.71   −7.02     19.92      97.50   0.0557  0.0053  0.0728
6                          95.70   −6.48     19.88      97.54   0.0550  0.0052  0.0720
7                          95.72   −6.93     19.95      97.52   0.0550  0.0052  0.0723
8                          95.71   −6.79     19.94      97.52   0.0550  0.0052  0.0723
9                          95.72   −7.05     20.02      97.51   0.0552  0.0053  0.0725
10                         95.72   −6.65     19.87      97.54   0.0546  0.0052  0.0720
11                         95.72   −6.71     19.91      97.54   0.0549  0.0052  0.0721
12                         95.72   −6.95     20.03      97.52   0.0551  0.0053  0.0726
13                         95.69   −7.04     20.05      97.49   0.0553  0.0053  0.0727
14                         95.68   −6.79     19.96      97.51   0.0551  0.0052  0.0723
15                         95.66   −7.35     20.26      97.45   0.0558  0.0054  0.0734
16                         95.64   −7.09     20.19      97.46   0.0555  0.0054  0.0732
17                         95.64   −7.27     20.33      97.44   0.0561  0.0054  0.0737
18                         95.62   −6.45     20.02      97.50   0.0548  0.0053  0.0725
19                         95.63   −6.30     19.92      97.52   0.0549  0.0052  0.0722
20                         95.56   −6.71     20.14      97.45   0.0551  0.0053  0.0727


Table 7 Established parameters and the corresponding performances of the best selected ANN model and the three empirical tested models in daily scale

Model                                       a       b       c       d       e       R (%)   RMBE (%)  RRMSE (%)  DA (%)
Polynomial                                  0.895   1.025   −5.52   3.619           94.96   −6.49     21.24      97.17
Collares-Pereira and Rabl                                                           94.88   −6.04     21.33      97.16
  Kt ≤ 0.17                                 0.939
  0.17 < Kt ≤ 0.8                           1.23    −2.33   6.398   −14.14  9.468
Nfaoui and Buret                                                                    95.06   −7.60     21.64      97.11
  Kt < 0.1                                  0.92
  0.1 ≤ Kt                                  1.1317  −0.95   −0.69
MLP, input (Kt, DL), 10 hidden neurons                                              95.96   −4.29     18.73      97.82

Figure 5 shows the scatter plots of the training and validation datasets for the run (among the ten) that gives the best performances on the validation dataset.

For the three tested empirical models (third-order polynomial, Erbs et al., and Boland et al. models), the results obtained are shown in Table 4. We can see that R is, at best, equal to 87.28 %, the RRMSE is higher than 30 %, and the DA is around 93 %. The RMBE values are positive and do not exceed 4.748 %.

Fig. 8 Comparison between measured and calculated daily diffuse solar fractions for data collected in Fez between 2009 and 2012 by using four models (x-axis: measured diffuse solar fraction; y-axis: calculated diffuse solar fraction). ANN model: RRMSE = 18.73 %, DA = 97.82 %, R = 95.96 %. Third-order polynomial model: RRMSE = 21.24 %, DA = 97.17 %, R = 94.96 %. Collares-Pereira and Rabl model: RRMSE = 21.33 %, DA = 97.16 %, R = 94.88 %. Nfaoui and Buret model: RRMSE = 21.64 %, DA = 97.11 %, R = 95.06 %

Figure 6 shows the frequency distributions of the differences between measured and calculated kd for our four models. About 94 % of cases are estimated with a deviation of less than 0.2 by the ANN model. This percentage is about 83 % for the third-order polynomial model, 84 % for the Erbs et al. model, and 80 % for the Boland et al. model.
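The percentages quoted from the frequency distributions correspond to the share of cases whose absolute deviation stays below a threshold. A minimal sketch of that computation, with a hypothetical function name and made-up sample values, is:

```python
import numpy as np

def share_within(measured, calculated, tol):
    """Percentage of cases with |measured - calculated| below tol."""
    diff = np.asarray(measured, dtype=float) - np.asarray(calculated, dtype=float)
    return 100.0 * np.mean(np.abs(diff) < tol)

# Hypothetical values: three of the four deviations are below 0.2.
share_demo = share_within(
    measured=[0.5, 0.5, 0.5, 0.5],
    calculated=[0.45, 0.6, 0.9, 0.5],
    tol=0.2,
)
```

Applied to the validation year with a tolerance of 0.2, this is the kind of statistic that separates the ANN model from the empirical ones in the hourly comparison.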

From all the above results, we can say that the MLP (ANN) constitutes a good model for estimating the hourly diffuse solar fraction and that it performs much better than the three tested empirical models.

6.2 Daily scale

In this section, we are interested in estimating the daily dif-fuse solar fraction Kd from the daily clearness index Kt

and others measured and/or calculated parameters. We testthree empirical models (a third-order polynomial model,Collares-Pereira and Rabl model and Nfaoui and Buretmodel) and we construct an optimal three-layer MLP ANN.

All models have been constructed using data of 3 years(January 2009–December 2011) and tested on data of 1 year(2012).

The three empirical models use the daily clearness indexKt to estimate Kd . The scatter plot between Kt and Kd isrepresented in Fig. 7.

For the ANN model, we have tested several architectures by varying the number of neurons in the input and hidden layers. For the input layer, we have tested different combinations of the following parameters: daily clearness index Kt, day of the year, day length (DL), solar declination, mean daily temperature, mean daily relative humidity, and mean daily wind speed. Table 5 shows, for each input combination, the average performances on the daily validation dataset over 10 runs.
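Two of the candidate inputs, the solar declination and the day length, are astronomical quantities that follow from the day of the year and the latitude. The sketch below uses Cooper's approximation for the declination and the standard sunset hour angle relation for the day length. These formulas and the approximate latitude of 34.0° N taken for Fez are assumptions for illustration, and the expressions actually used in this work may differ.

```python
import math

FEZ_LATITUDE_DEG = 34.0  # assumed approximate latitude of Fez, for illustration


def solar_declination_deg(day_of_year):
    """Cooper's approximation of the solar declination, in degrees."""
    return 23.45 * math.sin(math.radians(360.0 * (284 + day_of_year) / 365.0))


def day_length_hours(day_of_year, latitude_deg=FEZ_LATITUDE_DEG):
    """Astronomical day length DL = (2/15) * arccos(-tan(lat) * tan(decl))."""
    lat = math.radians(latitude_deg)
    decl = math.radians(solar_declination_deg(day_of_year))
    cos_ws = -math.tan(lat) * math.tan(decl)
    cos_ws = max(-1.0, min(1.0, cos_ws))  # clamp for polar day or polar night
    return (2.0 / 15.0) * math.degrees(math.acos(cos_ws))


# Near the March equinox, the June solstice and the December solstice.
for n in (81, 172, 355):
    print(n, round(solar_declination_deg(n), 2), round(day_length_hours(n), 2))
```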

The best performances in terms of the R, RRMSE, DA, and MAE values are obtained with the input combination [Kt, DL]. The RMBE values obtained by all models are negative and vary between −5.68 and −7.82 %. The average performances obtained for each tested architecture with the input combination [Kt, DL] on the daily validation dataset over 10 runs are summarized in Table 6. All architectures give almost the same results, with slightly better performance in terms of R, RRMSE, DA, and MAE obtained with ten hidden neurons. From these results, we consider the architecture using [Kt, DL] as input and ten hidden neurons to be the best architecture.
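To make the retained architecture concrete, the sketch below builds a three-layer MLP with two inputs [Kt, DL], ten hidden neurons and one output, and evaluates a single forward pass. The hyperbolic tangent hidden activation, the logistic output, the scaling of DL by 24 h and the random weights are all assumptions for illustration. The weights are not the trained weights of this study.

```python
import math
import random


def init_mlp(n_inputs=2, n_hidden=10, seed=0):
    """Random weights for a 3-layer MLP (input, one hidden layer, one output)."""
    rng = random.Random(seed)
    # Each hidden row holds n_inputs weights followed by one bias.
    hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
              for _ in range(n_hidden)]
    # The output row holds n_hidden weights followed by one bias.
    output = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    return hidden, output


def forward(params, inputs):
    """Map scaled inputs, e.g. [Kt, DL / 24], to an estimate of Kd in (0, 1)."""
    hidden, output = params
    # zip stops at the last input, so the trailing bias is added separately.
    activations = [math.tanh(sum(w * x for w, x in zip(row, inputs)) + row[-1])
                   for row in hidden]
    z = sum(w * a for w, a in zip(output, activations)) + output[-1]
    return 1.0 / (1.0 + math.exp(-z))  # logistic output (assumed)


params = init_mlp()
kd_hat = forward(params, [0.55, 12.0 / 24.0])  # hypothetical day: Kt = 0.55, DL = 12 h
print(kd_hat)
```

Repeating the training from several random initializations, as done here with 10 runs per configuration, reduces the influence of any single initialization on the comparison between architectures.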

Table 7 summarizes the performances of the four models and the parameters established for them.
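The indicators reported in these tables can be computed from paired measured and calculated values as sketched below. The definitions are common textbook forms and are assumptions here: errors are taken as calculated minus measured, relative indicators are normalized by the mean measured value, and DA is taken to be Willmott's index of agreement. The definitions given earlier in the paper take precedence if they differ.

```python
import math


def evaluate(measured, calculated):
    """Return MAE, RMBE (%), RRMSE (%), DA (%) and R (%) for paired values."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_c = sum(calculated) / n
    errors = [c - m for m, c in zip(measured, calculated)]
    mae = sum(abs(e) for e in errors) / n
    rmbe = 100.0 * (sum(errors) / n) / mean_m
    rrmse = 100.0 * math.sqrt(sum(e * e for e in errors) / n) / mean_m
    # Willmott's index of agreement, assumed here to stand for DA.
    denom = sum((abs(c - mean_m) + abs(m - mean_m)) ** 2
                for m, c in zip(measured, calculated))
    da = 100.0 * (1.0 - sum(e * e for e in errors) / denom)
    # Pearson correlation coefficient, expressed in percent.
    cov = sum((m - mean_m) * (c - mean_c) for m, c in zip(measured, calculated))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_c = sum((c - mean_c) ** 2 for c in calculated)
    r = 100.0 * cov / math.sqrt(var_m * var_c)
    return {"MAE": mae, "RMBE": rmbe, "RRMSE": rrmse, "DA": da, "R": r}


# Made-up values: every estimate is 0.1 below the measurement.
scores = evaluate([0.2, 0.4, 0.6], [0.1, 0.3, 0.5])
print(scores)
```

With this sign convention, a negative RMBE indicates that the model underestimates the measured diffuse fraction on average.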

The comparison between the measured Kd and the Kd calculated using the three empirical models and the ANN model is shown in Fig. 8.

Figure 9 shows the frequency distributions of the differences between measured and calculated Kd for the four models. With the ANN model, 97.27 % of cases are estimated with a deviation of less than 0.16. This value is 95.56 % for the third-order polynomial model and the Collares-Pereira and Rabl model, and 96.59 % for the Nfaoui and Buret model.

Fig. 9 Frequency distributions of the differences between calculated and measured Kd for the four models (horizontal axis: Kd(Measured) − Kd(Calculated), from −0.5 to 0.5; vertical axis: frequency of occurrence (%), from 0 to 60; curves: Polynomial model, Collares-Pereira and Rabl model, Nfaoui and Buret model, ANN model)


From these results, it may be observed that, for the daily values, the ANN model yields the best results, although the results of the empirical models are also acceptable.

7 Conclusion

In the present work, MLP ANN models have been developed for predicting the hourly and daily diffuse solar fraction for the city of Fez in Morocco. In parallel, some empirical models have been tested.

For the hourly scale, the ANN model performs significantly better than the three tested empirical models. In particular, the RRMSE is about 20 % for the ANN model, while it exceeds 30 % for the empirical models. The correlation coefficient and the degree of agreement are, respectively, around 94 and 97 % for the ANN model, while they are around 87 and 93 % for the three tested empirical models. Also, 94.62 % of hourly values were estimated with a deviation of less than 0.2 by the ANN model, while this value is less than 84 % for all the empirical models.

For the daily scale, the ANN model yields the best results, although the results of the empirical models are also acceptable. The correlation coefficient, the RRMSE, and the degree of agreement are, respectively, 95.96, 18.73, and 97.82 % for the best selected ANN model. These values are, respectively, 94.96, 21.24, and 97.17 % for the third-order polynomial model, 94.88, 21.33, and 97.16 % for the Collares-Pereira and Rabl model, and 95.06, 21.64, and 97.11 % for the Nfaoui and Buret model.

These results show that neural networks are more suitable than the tested empirical models for predicting hourly and daily diffuse solar fractions. In particular, at the hourly scale, the ANN performs much better than the tested empirical models.

The ANN models developed in this work can be generalized to the whole Saiss plain, which has the same climate as Fez. For other sites, care must be taken to account for differences in regional climate.

Acknowledgments The authors wish to thank the reviewers for their useful comments and constructive suggestions.

References

Ruiz-Arias JA, Alsamamra H, Tovar-Pescador J, Pozo-Vazquez D (2010) Proposal of a regressive model for the hourly diffuse solar radiation under all sky conditions. Energy Convers Manag 51:881–893

Jacovides CP, Tymvios FS, Assimakopoulos VD, Kaltsounides NA (2006) Comparative study of various correlations in estimating hourly diffuse fraction of global solar radiation. Renew Energy 31:2492–2504

Hamdy KE (2007) Experimental and theoretical investigation of diffuse solar radiation: data and models quality tested for Egyptian sites. Energy 32:73–82

Erbs DG, Klein SA, Duffie JA (1981) Estimation of the diffuse radiation fraction for hourly, daily and monthly average global radiation. Sol Energy 28:293–302

Oliveira AP, Escobedo JF, Machado AJ, Soares J (2002) Correlation models of diffuse solar-radiation applied to the city of Sao Paulo, Brazil. Appl Energy 71:59–73

Muneer T (2004) Solar radiation and daylight models, 2nd edn. Elsevier Butterworth-Heinemann

Collares-Pereira M, Rabl A (1979) The average distribution of solar radiation: correlations between diffuse and hemispherical and between daily and hourly insolation values. Sol Energy 22(2):155–164

Jin Z, Yezheng W, Gang Y (2004) Estimation of daily diffuse solar radiation in China. Renew Energy 29:1537–1548

Boland J, Ridley B, Brown B (2008) Models of diffuse solar radiation. Renew Energy 33:575–584

Ulgen K, Hepbasli A (2009) Diffuse solar radiation estimation models for Turkey’s big cities. Energy Convers Manag 50:149–156

Jiang Y (2008) Prediction of monthly mean daily diffuse solar radiation using artificial neural networks and comparison with other empirical models. Energy Policy 36:3833–3837

Hamdy KE, Yosry AA, Farag IY (2007) Prediction of hourly and daily diffuse fraction using neural network, as compared to linear regression models. Energy 32:1513–1523

Soares J, Oliveira AP, Boznar MZ, Mlakar P, Escobedo JF, Machado AJ (2004) Modeling hourly diffuse solar-radiation in the city of Sao Paulo using a neural-network technique. Appl Energy 79:201–214

Alam S, Kaushik SC, Garg SN (2009) Assessment of diffuse solar energy under general sky condition using artificial neural network. Appl Energy 86:554–564

Drummond AJ (1964) Comments on “sky radiation measurement and corrections”. J Appl Meteorol 3:810–811

Boland J, Ridley B (2007) Recent advances in modeling diffuse radiation. In: Proceedings of ISES solar world congress, solar energy and human settlement

Younes S, Claywell R, Muneer T (2005) Quality control of solar radiation data: present status and proposed new approaches. Energy 30:1533–1549

Hornik K, Stinchcombe M, White H (1989) Multilayer feedforward networks are universal approximators. Neural Netw 2:359–366

Rumelhart DE, Hinton GE, Williams RJ (1986) Learning internal representations by error propagation. In: Rumelhart DE, McClelland JL, the PDP Research Group (eds) Parallel distributed processing: explorations in the microstructure of cognition. MIT Press, Cambridge, pp 318–362

Nfaoui H, Buret J (1993) Estimation of daily and monthly direct, diffuse and global solar radiation in Rabat. Renew Energy 3:923–930

Posadillo R, Lopez Luque R (2009) Hourly distributions of the diffuse fraction of global solar irradiation in Cordoba (Spain). Energy Convers Manag 50:223–231

Claywell R, Muneer T, Asif M (2005) An efficient method for assessing the quality of large solar irradiance datasets. ASME: JSEE 127(1):150–152

Kasten F (1993) Discussion on the relative air mass. Light Res Tech 25:129
