


STATISTICAL EVALUATION AND FORECASTING OF THE DUST PARTICLES CONCENTRATIONS DEPENDING ON THE METEOROLOGICAL CONDITIONS

Marko I. Uzunov, Dimitar I. Pilev, Evgenia M. Savova-Videnova

ABSTRACT

Multiple linear regression with additional interaction and quadratic terms (MLR+NP) is applied to model the hourly concentrations of PM10 in Sofia as a function of four meteorological indices, temperature (T), humidity (W), wind velocity (V) and solar radiation (R), and one pollutant, CO. The study covers a one-year period, from 01.04.2016 to 31.03.2017. The results show that 70 % to 89 % of the variation of PM10 can be explained by the factors T, W, V, R and CO, their quadratic functions and their interactions. The monthly models forecast the PM10 concentrations for the separate months with greater precision and reliability than the model built on the data for the whole year. The error is below 11 % when forecasting the maximal daily concentration of PM10 for the first day of each month. The models are a potential instrument both for forecasting PM10 concentrations and for developing systems for pollution control and management.

Keywords: air quality modelling, PM10 concentrations, meteorology, multiple linear regression, future prediction.

Received 08 March 2018
Accepted 20 July 2018

Journal of Chemical Technology and Metallurgy, 54, 1, 2019, 149-162

University of Chemical Technology and Metallurgy
8 Kl. Ohridski, 1756 Sofia, Bulgaria
E-mail: [email protected]

INTRODUCTION

The continuous development and growth of the urban population lead to problems of environmental pollution: discharge of toxic substances, solid waste, harmful emissions, etc. Air pollution in the cities has become so severe that there is an urgent need for timely information on changes in the contamination level.

Dust particles are among the substances most harmful to human health. Outdoor particulate matter consists mainly of metals, organic compounds, materials of biological origin and elemental carbon [1, 2]. PM10 are solid or liquid particles with a diameter of up to 10 µm that cause a number of diseases. Continuous exposure to PM10 harms the respiratory system, damages the lung tissue, and causes cancer and premature death [3 - 7]. Dust particles also have a negative impact on plants, especially on those with short vegetative cycles [8]. The main sources of PM10 are the exhaust gases of motor vehicles, thermal power plants, industrial enterprises, open-air combustion and household use of solid fuels [9, 10].

The amount of PM10 suspended in the air depends on many factors: meteorological conditions, topological and geographical peculiarities, sources of emissions, physical and chemical properties of the particles, etc. [11]. The meteorological factors are very important because they are very often used for explaining and forecasting the PM10 concentration. The meteorological conditions can increase the concentration of the existing pollutants and lead to the formation of new ones.

A large number of investigations worldwide deal with the influence of various meteorological factors on the concentration of dust particles, with the aim of elaborating a model for predicting the concentration of this pollutant [12].


The processes occurring in the atmospheric boundary layer (ABL) affect the concentration of PM10. There, the wind velocity (V), the temperature (T) and the air humidity (W) undergo fast changes. The turbulence causes horizontal and vertical mixing of the air layers. As a result, the particles suspended in the ABL also undergo physicochemical transformations under the specific atmospheric conditions [13].

The meteorological variables most often used in the models for forecasting PM10 are the wind velocity and the ambient temperature. The direction of the wind determines the pathway followed by the particles, while its velocity determines the distance over which they are transported. The effects of wind velocity and direction on the level of PM10 vary depending on the geographical characteristics of the location. In principle, low wind speed leads to high levels of PM10 [14]. Low temperature increases the probability of a temperature inversion (a cold air layer covered by a warm one) near the earth's surface. The inversion prevents the cold air from moving upwards and retains PM10 near ground level. As a result, high concentrations of PM10 are often observed at low temperatures. High concentrations also occur at high temperatures under cloudless skies and stable atmospheric conditions, as well as when the difference between the maximal and the minimal daily temperature is large and the ABL height is low [15, 16].

The levels of PM10 decrease as a result of rain, snow, fog or ice. The rain drops capture the particles, and the degree of removal depends on the duration and intensity of the rainfall. Although rain is an important factor for the PM10 concentration, it has not been widely used in the models because in many countries there is no rain for extended periods of time [17].

The relative humidity is very often used in PM10 forecast models, although the relation between PM10 and W also depends on other meteorological conditions. For example, if the humidity is high and there is abundant rainfall, W is negatively related to PM10 because of the purifying effect of the rain. When the high humidity is not associated with rainfall and is accompanied by high temperature, it contributes to higher levels of PM10. Some authors assume that W affects the PM10 concentration only when it is above 55 % [18].

Solar radiation has been found to decrease the concentration of PM10: the ground is heated and the intensive heat exchange produces turbulent eddies that scatter the suspended particles [13].

Automotive transport, the particularities of the studied region (markets, residential complexes, industrial zones, etc.) and other air pollutants such as NOx, CO and SO2 influence the PM10 levels. Road motor vehicles emit exhaust gases that give rise to fine dust particles suspended in the air. When traffic data are missing, CO or NOx data can be used as a proxy for the traffic intensity [19].

Most PM10 models are designed to predict short-term hourly, average daily or maximal daily concentrations of PM10 one day in advance. A great variety of mathematical techniques are used for modeling and forecasting the concentration of PM10, including statistical ones (Multiple Linear Regression, MLR) and machine-learning ones (Artificial Neural Networks, ANN). Zia et al. [20 - 23] apply hybrid models combining MLR and ANN with PCA (Principal Component Analysis). In this way they successfully predict the concentrations of PM10 for a period of several days. The application of ANN to air quality forecasting is much more difficult than the use of the classical linear models [24 - 26].

The elaboration of efficient models for forecasting the air quality in urban regions is very important. Although many models exist and some of them have already been applied in practice, more precise and simpler ones are still needed. It is of practical interest to predict the hourly PM10 concentration from the meteorological data available in online weather forecasts. This will make it possible to forecast PM10 air pollution and to disseminate the data.

Sofia has one of the best developed air quality monitoring networks in the country. There are 16 measurement sites on the territory of the city; four of them are automated, while three take manual air samples. The data are recorded in real time and transferred to the National Database. They are then published on the web site of the Sofia Municipality [27].

The aim of this paper is to present the results of applying MLR analysis to the prediction of the PM10 concentration as a function of four meteorological parameters. In order to account for the influence of the traffic intensity on PM10, one pollutant (CO), which is emitted as a result of incomplete combustion of fuels, is also included.

EXPERIMENTAL

Area of the investigation

In recent years, Bulgaria has been one of the European countries with the highest measured PM10 levels [28 - 31]. A World Health Organization study of particulate matter pollution points out that Bulgaria ranks second in the world in mortality attributed to air pollution, with 118 reported deaths per 100 000 people [32].

Some Bulgarian cities, such as Sofia, Plovdiv, Varna, Gorna Oryahovitsa, Shumen, Pernik and Pleven, occupy leading positions for air pollution in the ranking of the European Environment Agency.

Sofia is the 15th largest city in the European Union, with a population of about 2 million. It is located in the central part of Western Bulgaria. Its total area is 492 km², and its altitude is between 500 m and 699 m. The climate of Sofia is moderate continental with an average annual temperature of 10.6 °C. The problem of air pollution in Sofia is also connected with its geographical location: the Sofia valley is surrounded by mountains, which limit the self-purification of the atmosphere. The air in the capital is polluted mainly by dust particles and nitrogen oxides. These are generated mainly by automotive transport, household use of solid and liquid fuels, contaminated road pavements and thermal power plants.

In combination with the unfavorable meteorological conditions during the winter months of 2016, 2017 and 2018, the number of days on which the PM10 threshold value of 50 μg m-3 was exceeded amounted to 32, 43 and 31, respectively. It is worth noting that this number is not expected to exceed 35 [33].
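The exceedance count quoted above can be illustrated with a minimal Python sketch. It assumes that the 50 μg m-3 threshold is applied to the daily mean of the hourly readings; the function name and the sample readings are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from statistics import mean

DAILY_LIMIT = 50.0  # ug/m3, the PM10 threshold quoted in the text


def count_exceedance_days(hourly, limit=DAILY_LIMIT):
    """Count the days whose mean PM10 exceeds `limit`.

    `hourly` is an iterable of (date_string, pm10_value) pairs.
    Assumption: the threshold applies to the daily mean.
    """
    by_day = defaultdict(list)
    for day, value in hourly:
        by_day[day].append(value)
    return sum(1 for values in by_day.values() if mean(values) > limit)


# Hypothetical readings, for illustration only
sample = [
    ("2017-01-01", 40.0), ("2017-01-01", 80.0),  # daily mean 60.0
    ("2017-01-02", 30.0), ("2017-01-02", 45.0),  # daily mean 37.5
]
print(count_exceedance_days(sample))
```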

For prevention purposes, the National Institute of Meteorology and Hydrology and the National Institute of Geophysics, Geodesy and Geography publish on their web pages a 72-hour forecast of six air pollutants, including PM10 [34]. The system is based on a prediction color code.

During the winter of 2018 the Sofia Municipality provided, for the first time, the average daily PM10 concentration levels two days in advance [27]. The main drawback of the system is its insufficient accuracy, because the forecast is expressed as a color code.

Parameters monitoring

The meteorological indices used in the investigation are the temperature, T (°C), the humidity, W (%), the wind velocity, V (m s-1), and the solar radiation, R (W m-2); the pollutants are carbon monoxide, CO (mg m-3), and particulate matter, PM10 (μg m-3). The data for the period from April 2016 to March 2017 are taken from the web site of the Sofia Municipality. They are submitted every hour, in real time, by six stations in Sofia. To predict the level of PM10 pollution, the average values of the observed parameters are calculated and used in the models.

Fig. 1. Location of Sofia city.

Meteorological conditions in the investigated area

The investigation covers a period of 12 months with different meteorological characteristics. The monthly minimal, average and maximal values of the meteorological parameters W, R, T and V are presented in Fig. 2(a), (b), (c), (d). The average humidity varies from 62.3 % (April) to 78.9 % (January), with an average annual value of 71.9 %. The minimal annual value of the humidity, 20 %, is registered in September, while the maximal value of 95 % is registered in May. The average solar radiation varies from 44.7 W m-2 (December) to 277.9 W m-2 (July), with an average annual value of 156.3 W m-2. The minimal solar radiation is zero for all the months, while the maximal value of 1100 W m-2 is observed during the period from May to July. The average temperature varies from -5.7 °C (January) to 22.3 °C (July), with an average annual value of 10.8 °C. The minimal annual temperature, -15.5 °C, is registered in January, while the maximal one amounts to 36 °C in April. The average wind velocity varies from 1.39 m s-1 (September) to 1.59 m s-1 (March, April), with an average annual value of 1.52 m s-1. The minimal annual wind velocity of 0.30 m s-1 is observed in November, while the maximal one of 4.2 m s-1 is registered in February, September and November. It is worth noting that the wind velocity, especially its minimal and average values, is almost constant over the entire period. This is attributed to the geographic location of the city.

The analysis shows that each month is characterized by specific values of the meteorological indices. Therefore, in order to forecast the PM10 concentration from the meteorological indices with sufficiently high accuracy, it is necessary to model the data for each month separately.

Fig. 2. Monthly minimal, average and maximal values of the meteorological parameters: W, % (a), R, W m-2 (b), T, °C (c) and V, m s-1 (d).

Modeling method

Multiple linear regression is one of the most often applied statistical methods for determining the relationship between air pollution and the meteorological indices. The general equation of the model (the regression dependence) is:

$$Y = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_k x_k + \varepsilon_i$$

where $b_i$ are the regression coefficients, $x_i$ are the independent variables (descriptors), and $\varepsilon_i$ are the residuals, i.e. the differences between the predicted and the actual values of the dependent variable (target function). The regression coefficients are calculated by the least squares method.
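As an illustration of the least squares step, the following Python sketch estimates the coefficients by solving the normal equations $(X^T X)\,b = X^T Y$. It is a minimal sketch and not the procedure of the statistical packages used in the study; the helper names `solve` and `fit_mlr` and the noise-free sample data are assumptions made for the example.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_mlr(xs, y):
    """Return [b0, b1, ..., bk] minimising the sum of squared residuals."""
    rows = [[1.0] + list(r) for r in xs]  # leading 1.0 models the intercept b0
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical, noise-free data generated from y = 2 + 3*x1 - x2
xs = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
y = [2 + 3 * x1 - x2 for x1, x2 in xs]
coeffs = fit_mlr(xs, y)
print(coeffs)
```

Because the sample data contain no noise, the fit recovers the generating coefficients up to rounding error; with real measurements the residuals would be nonzero.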

The values of the coefficients in the correlation matrix are the criteria for multicollinearity. When multicollinearity exists, the estimated regression coefficients are not efficient (the significance level of the coefficients is larger than α = 0.05). In this case one has to look for ways to eliminate this negative influence. Stepwise multiple linear regression (MLR) and multiple regression involving new variables of interaction and quadratic functions (MLR+NP) are among them. The software packages SPSS, ver. 19 and MultiTab are used to process the data.

The following indices are used to validate the models: the mean absolute error (MAE), the normalized absolute error (NAE), the index of agreement (IA), the prediction accuracy (PA), the coefficient of determination (R2) and the root mean square error (RMSE), Table 1 [35].

Table 1. Performance indicators.

Mean absolute error (MAE)
$$MAE = \frac{1}{n}\sum_{i=1}^{n}\left|P_i - O_i\right|$$
A MAE value closer to zero indicates a better method.

Normalized absolute error (NAE)
$$NAE = \frac{\sum_{i=1}^{n}\left|P_i - O_i\right|}{\sum_{i=1}^{n} O_i}$$
A NAE value closer to zero indicates a better method.

Index of agreement (IA)
An IA value closer to 1 indicates a better method.

Prediction accuracy (PA)
$$PA = \frac{\sum_{i=1}^{n}\left(P_i - \bar{O}\right)^2}{\sum_{i=1}^{n}\left(O_i - \bar{O}\right)^2}$$
A PA value closer to 1 indicates a better method.

Coefficient of determination (R2)
$$R^2 = \left(\frac{\sum_{i=1}^{n}\left(P_i - \bar{P}\right)\left(O_i - \bar{O}\right)}{n \cdot S_{pred} \cdot S_{obs}}\right)^2$$
An R2 value closer to 1 indicates a better method.

Root mean square error (RMSE)
$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(O_i - P_i\right)^2}$$
An RMSE value closer to zero indicates good adequacy of the method.

Here $n$ is the total number of annual measurements of a particular site; $P_i$ and $O_i$ are the predicted and the observed values of one set of annual monitoring records; $\bar{P}$ and $\bar{O}$ are the means of the predicted and the observed values; $S_{pred}$ and $S_{obs}$ are the standard deviations of the predicted and the observed values.
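The indicators of Table 1 can be computed with a few lines of Python. The sketch below rests on several assumptions: the standard deviations are taken in their population form (division by n), so that R2 equals the squared Pearson correlation; RMSE includes the square root; and, since the equation for IA is not given in Table 1 here, the widely used Willmott form is assumed. The function name and the sample values are hypothetical.

```python
from math import sqrt


def indicators(obs, pred):
    """Return the Table 1 performance indicators as a dict."""
    n = len(obs)
    o_mean = sum(obs) / n
    p_mean = sum(pred) / n
    # Population standard deviations (an assumption, see the lead-in)
    s_obs = sqrt(sum((o - o_mean) ** 2 for o in obs) / n)
    s_pred = sqrt(sum((p - p_mean) ** 2 for p in pred) / n)
    abs_err = sum(abs(p - o) for o, p in zip(obs, pred))
    sq_err = sum((o - p) ** 2 for o, p in zip(obs, pred))
    covariance_sum = sum((p - p_mean) * (o - o_mean) for o, p in zip(obs, pred))
    return {
        "MAE": abs_err / n,
        "NAE": abs_err / sum(obs),
        "RMSE": sqrt(sq_err / n),
        "PA": sum((p - o_mean) ** 2 for p in pred)
        / sum((o - o_mean) ** 2 for o in obs),
        "R2": (covariance_sum / (n * s_pred * s_obs)) ** 2,
        # Willmott's index of agreement (assumed form)
        "IA": 1 - sq_err
        / sum((abs(p - o_mean) + abs(o - o_mean)) ** 2 for o, p in zip(obs, pred)),
    }


# Hypothetical observed and predicted PM10 values (ug/m3)
observed = [10.0, 20.0, 30.0]
predicted = [12.0, 18.0, 33.0]
result = indicators(observed, predicted)
print(result)
```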

RESULTS AND DISCUSSION

Temporal dependence of PM10 concentration

The change of the PM10 concentration is studied by hour of the day and by month during the period from April 2016 to March 2017. For this purpose, 5383 averaged hourly values of PM10, CO and the meteorological indices, downloaded from the National Database for Sofia [27], are used.

The average hourly values of PM10 are presented in Fig. 3. During the period of study the concentration varies from 1 μg m-3 to 430 μg m-3. The monthly variations of the maximal, the minimal and the average PM10 concentration are illustrated in Fig. 4. The data show that the concentrations of dust particles are the highest during the period from November to February inclusive, with the extreme level in January, which corresponds to a mean value of 72.58 μg m-3. The lowest levels of PM10 are observed from the end of spring until the middle of autumn (May to October). They are in the interval from 19.669 μg m-3 to 25.709 μg m-3. Occasional increases of the PM10 concentration above the maximal admissible concentration (MAC) are observed during the whole period, and the number of such incidents is the greatest from November to February. The high concentrations of PM10 during this period are due to the intensive household use of solid fuels in combination with the specific winter meteorological conditions, which favor the pollution. The fluctuations of the PM10 concentration during the hot months are due to the increased road repair activities, fire accidents and some meteorological conditions favoring the pollution.

Fig. 3. Averaged hourly values of PM10 for the survey period.

The descriptive statistics are presented in Table 2. The variance of PM10 is the lowest in August and the highest in January. The skewness is the highest for February. The high values of these statistical indices during the winter months are most probably due to the extreme events discussed above.

Fig. 5 presents the day and night variations of the average PM10 concentration for the studied period, while Fig. 6 illustrates the distribution of the maximal monthly PM10 concentration for every hour of a twenty-four-hour interval.

It is seen from Fig. 5 that the 24-hour variation of the average PM10 concentration comprises different zones.

Fig. 4. Variation of the maximal, the minimal and the average value of РМ10 concentration month by month.

Table 2. Descriptive statistical data.

Month   Mean     StDev    Variance   Skewness    Kurtosis   N     Min      Med      Max/h
Jan     72.580   63.642   4050.268   1.3646      1.2309     515   10.000   42.000   320.00/21
Feb     60.636   55.694   3101.818   2.9787      12.8408    497   12.000   41.000   430.00/24
Mar     27.871   11.055   122.206    1.00689     2.18272    316   8.000    26.000   71.00/10
Apr     30.772   17.475   305.380    0.732935    -0.17194   320   3.000    27.000   80.00/19
May     19.669   7.732    59.786     0.310125    -0.00351   416   1.000    20.000   42.00/20
Jun     23.814   7.387    54.564     0.596015    -0.31956   381   10.000   22.500   47.00/23
Jul     22.069   6.369    40.562     0.253027    0.017835   543   8.000    22.000   46.00/10
Aug     20.480   5.523    30.500     0.434161    0.688944   473   9.000    20.000   45.00/10
Sep     25.709   5.812    33.775     -0.360647   -0.01826   423   10.000   26.000   42.00/10
Oct     25.670   9.765    95.352     0.461004    -0.22098   479   10.000   25.000   62.00/21
Nov     37.632   26.680   711.822    2.18205     8.36721    519   7.000    30.500   240.00/3
Dec     40.688   28.174   793.769    1.11190     0.95389    501   8.500    33.000   150.00/20
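The monthly spread and asymmetry of PM10 can be read off Table 2 directly. The following minimal Python sketch takes the Variance column as the measure of spread and copies the Variance and Skewness values from the table; the variable names are illustrative and do not come from the paper.

```python
# (month, variance, skewness) copied from Table 2
TABLE2 = [
    ("Jan", 4050.268, 1.3646), ("Feb", 3101.818, 2.9787),
    ("Mar", 122.206, 1.00689), ("Apr", 305.380, 0.732935),
    ("May", 59.786, 0.310125), ("Jun", 54.564, 0.596015),
    ("Jul", 40.562, 0.253027), ("Aug", 30.500, 0.434161),
    ("Sep", 33.775, -0.360647), ("Oct", 95.352, 0.461004),
    ("Nov", 711.822, 2.18205), ("Dec", 793.769, 1.11190),
]

lowest_variance = min(TABLE2, key=lambda row: row[1])[0]
highest_variance = max(TABLE2, key=lambda row: row[1])[0]
highest_skewness = max(TABLE2, key=lambda row: row[2])[0]

print(lowest_variance, highest_variance, highest_skewness)  # Aug Jan Feb
```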


The highest concentration of PM10 is observed between 8 pm and 3 am, which could be explained by the accumulation of emissions from motor vehicles during the whole day. High concentration levels are also measured between 8 am and 10 am and at about 3 pm; they are connected with the intense traffic at these times. Thereafter the concentration increases gradually until it reaches its maximum value.

For most months the maximal PM10 concentrations are reached at night, between 8 pm and midnight; the exceptions are March, July, August and September, for which the maximum is observed at 10 am (Fig. 6).

Regression analysis between PM10, some meteorological parameters and the concentration of CO

The strength of the relationships between the variables is represented by the correlation coefficients in Table 3. The results reveal that the PM10 concentration is negatively correlated with T, V and R (r2 = 15.4 %, 11.4 % and 2.6 %, respectively) and positively correlated with W (r2 = 2.5 %).

Low temperature, low solar radiation and low wind velocity favor the accumulation of PM10, while low humidity leads to a decrease of PM10. This is because at high humidity the dust particles are retained in the fog in the lower part of the ABL.

The PM10 concentration depends strongly on that of CO, with a coefficient of determination of 71.3 %. CO is emitted as a result of incomplete combustion, which as a rule is accompanied by the formation of soot. The high concentration of PM10 during the winter months is due not only to the intensified traffic, but also to the use of fuels of non-regulated composition in some households. This is the only explanation of the higher concentration of PM10 in the suburbs of Sofia in comparison with the city center, where the traffic is usually extremely intense [36].

Concentrations of PM10 above 100 μg m-3 are observed at low temperatures (during the winter), from -10 °C to 10 °C, at solar radiation from 0 W m-2 to 100 W m-2 (mainly during the dark hours of the night) and from 100 W m-2 to 500 W m-2 (in the daytime, 8 h - 10 h), when the air humidity is high (70 % - 90 %) and the wind velocity is low (below 2 m s-1).

Fig. 5. Day and night variations of the average value of the concentration of PM10 for the studied period.

Fig. 6. Distribution of the maximal monthly concentrations of PM10 hour by hour for 24 hours.

Table 3. Correlation matrix of all variables (upper value: correlation coefficient; lower value: significance level).

       PM10     CO       T        W        V
CO     0.845
       0.000
T      -0.393   -0.542
       0.000    0.000
W      0.159    0.217    -0.546
       0.000    0.000    0.000
V      -0.337   -0.231   0.092    -0.267
       0.000    0.000    0.000    0.000
R      -0.162   -0.171   0.480    -0.539   0.322
       0.000    0.000    0.000    0.000    0.000
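The r2 percentages quoted in the text follow from squaring the correlation coefficients in the PM10 column of Table 3. A minimal Python sketch of this arithmetic is given below; the dictionary name is illustrative. Note that the rounded coefficient for CO gives about 71.4 %, so the reported 71.3 % was presumably obtained from the unrounded coefficient.

```python
# Correlation of each variable with PM10, copied from Table 3
r_with_pm10 = {"CO": 0.845, "T": -0.393, "W": 0.159, "V": -0.337, "R": -0.162}

# Coefficient of determination of each pairwise relation, in percent
r2_percent = {name: round(100 * r * r, 1) for name, r in r_with_pm10.items()}

for name, value in r2_percent.items():
    direction = "positive" if r_with_pm10[name] > 0 else "negative"
    print(f"{name}: r2 = {value} % ({direction} correlation)")
```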


Table 4. Linear regression analysis between PM10, the meteorological parameters and CO for the research period (X1 = CO, X2 = T, X3 = W, X4 = V, X5 = R).

Jan (n = 515)
PM10m = 818 - 113.1 X1 + 21.68 X2 - 16.17 X3 - 24.94 X4 - 0.1390 X5 + 6.03 X1^2 + 0.0770 X3^2 + 4.67 X4^2 + 1.969 X1*X3 + 0.1203 X1*X5 - 0.2762 X2*X3

Feb (n = 497)
PM10m = -91.9 + 170.5 X1 + 0.724 X2 + 1.898 X3 + 19.7 X4 + 0.0572 X5 + 16.57 X1^2 + 5.60 X4^2 + 0.000140 X5^2 - 1.441 X1*X2 - 2.028 X1*X3 - 11.50 X1*X4 - 0.00290 X2*X5 - 0.562 X3*X4 - 0.0462 X4*X5

Mar (n = 316)
PM10m = 196.0 - 73.9 X1 - 11.91 X2 - 3.74 X3 + 1.437 X4 - 0.0138 X5 - 26.2 X1^2 + 0.1592 X2^2 + 0.01404 X3^2 + 1.972 X1*X3 + 0.0750 X1*X5 + 0.1518 X2*X3 - 0.00805 X4*X5

Apr (n = 320)
PM10m = 88.9 - 19.8 X1 - 7.16 X2 + 0.181 X3 - 49.47 X4 + 0.0620 X5 + 0.2270 X2^2 - 0.01136 X3^2 + 6.53 X4^2 + 0.588 X1*X3 + 21.92 X1*X4 + 0.0602 X2*X3 - 0.002748 X2*X5 + 0.2225 X3*X4 - 0.000835 X3*X5

May (n = 416)
PM10m = 265.0 + 128.5 X1 - 16.24 X2 - 4.593 X3 - 12.63 X4 + 0.01614 X5 - 29.02 X1^2 + 0.1774 X2^2 + 0.02074 X3^2 + 1.101 X4^2 - 0.708 X1*X3 - 9.39 X1*X4 + 0.14638 X2*X3 + 0.5833 X2*X4 - 0.000269 X3*X5

Jun (n = 381)
PM10m = 102.9 - 97.7 X1 - 6.77 X2 - 0.747 X3 + 5.29 X4 - 0.01280 X5 + 147.0 X1^2 + 0.1006 X2^2 + 0.000010 X5^2 - 15.90 X1*X4 + 0.06122 X2*X3

Jul (n = 543)
PM10m = 29.51 + 6.54 X1 - 1.021 X2 - 0.2708 X3 - 9.52 X4 - 0.02127 X5 - 14.58 X1^2 - 0.000402 X3^2 + 0.000010 X5^2 + 18.88 X1*X4 + 0.03906 X1*X5 + 0.03034 X2*X3 - 0.000294 X3*X5 + 0.00502 X4*X5

Aug (n = 473)
PM10m = -65.44 + 6.02 X1 + 1.607 X2 + 1.355 X3 + 4.62 X4 + 0.00347 X5 + 42.03 X1^2 - 0.00747 X3^2 + 0.000006 X5^2 - 0.625 X1*X2 + 0.01501 X1*X5 - 0.3059 X2*X4 - 0.000301 X3*X5

Sep (n = 423)
PM10m = -27.67 + 25.86 X1 + 5.682 X2 + 0.4203 X3 - 37.38 X4 - 0.03780 X5 - 14.75 X1^2 - 0.10617 X2^2 - 1.039 X4^2 + 0.04448 X1*X5 - 0.03444 X2*X3 + 0.7576 X2*X4 + 0.3048 X3*X4 + 0.01104 X4*X5

Oct (n = 479)
PM10m = -62.8 - 1.79 X1 + 1.788 X2 + 2.232 X3 - 15.15 X4 - 0.03916 X5 + 20.40 X1^2 - 0.01444 X3^2 + 2.573 X4^2 + 0.000031 X5^2 - 0.732 X1*X2 + 0.02723 X1*X5

Nov (n = 519)
PM10m = 79.6 - 56.4 X1 + 4.956 X2 - 0.471 X3 - 24.04 X4 - 0.02775 X5 + 14.73 X1^2 - 0.1342 X2^2 + 2.753 X4^2 - 3.086 X1*X2 + 0.782 X1*X3 + 0.00294 X2*X5

Dec (n = 501)
PM10m = -113.8 + 141.5 X1 - 0.978 X2 + 2.74 X3 + 4.28 X4 - 6.970 X1^2 - 0.01673 X3^2 - 0.867 X1*X3 - 18.33 X1*X4

Year 2016 (n = 5383)
PM10m = 40.03 + 55.39 X1 - 0.012 X2 - 0.4603 X3 - 4.85 X4 - 0.05326 X5 + 3.939 X1^2 + 0.000150 X3^2 - 0.000006 X5^2 - 1.6538 X1*X2 - 17.312 X1*X4 + 0.06526 X1*X5 + 0.01126 X2*X3 + 0.000643 X2*X5 + 0.0949 X3*X4 + 0.00673 X4*X5
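To show how a model from Table 4 is applied, the following Python sketch evaluates the December equation, which contains no radiation (X5) terms. The function name and the input values are hypothetical and serve only as an illustration; the units are those given in the Parameters monitoring section.

```python
def pm10_december(co, t, w, v):
    """December model of Table 4 (X1 = CO, X2 = T, X3 = W, X4 = V).

    co in mg/m3, t in deg C, w in %, v in m/s; returns PM10 in ug/m3.
    """
    return (
        -113.8
        + 141.5 * co
        - 0.978 * t
        + 2.74 * w
        + 4.28 * v
        - 6.970 * co ** 2
        - 0.01673 * w ** 2
        - 0.867 * co * w
        - 18.33 * co * v
    )


# Hypothetical conditions: CO = 1.0 mg/m3, T = 0 deg C, W = 80 %, V = 1.0 m/s
estimate = pm10_december(co=1.0, t=0.0, w=80.0, v=1.0)
print(round(estimate, 3))  # 49.448
```

Such a function should only be used within the range of conditions observed in the month for which the model was fitted, since a regression with quadratic terms can behave poorly when extrapolated.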


Computational methods

Because the correlation found between PM10 and the meteorological factors is poor, some quadratic functions (T^2, CO^2, R^2, etc.) and interactions (CO*T, CO*V, T*V, T*W, etc.) are additionally included as variables.

The dependence of the PM10 level on the meteorological factors and the concentration of CO is described by the MLR+NP approach (stepwise multiple regression with quadratic functions and variable interactions).
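The set of candidate variables implied by this approach can be sketched in a few lines of Python. The sketch only builds the linear, quadratic and pairwise interaction terms for one hourly record; the stepwise selection that decides which terms stay in each model is not shown. The function name `candidate_terms` and the dictionary-based record format are assumptions made for illustration.

```python
from itertools import combinations

FACTORS = ("CO", "T", "W", "V", "R")  # X1..X5 in Table 4


def candidate_terms(row):
    """Return linear, quadratic and pairwise interaction terms for one record.

    `row` is a dict holding one value per factor, e.g. {"CO": ..., "T": ...}.
    """
    terms = {name: row[name] for name in FACTORS}           # 5 linear terms
    for name in FACTORS:
        terms[f"{name}^2"] = row[name] ** 2                 # 5 quadratic terms
    for a, b in combinations(FACTORS, 2):
        terms[f"{a}*{b}"] = row[a] * row[b]                 # 10 interactions
    return terms
```

With five factors this yields 20 candidate terms, of which each equation in Table 4 retains only a subset.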

The models for every month, as well as for the year, are described by the equations presented in Table 4.

Statistical indicators comparing the observed PM10 concentrations with the values calculated by the models are listed in Table 5.
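For orientation, a minimal Python sketch of four of these indicators is given below. It assumes the definitions commonly used in the PM10 modelling literature: MAE as the mean absolute error, NAE as the sum of absolute errors divided by the sum of observations, RMSE with division by N, and IA as Willmott's index of agreement. The definitions actually used for Table 5 are not restated in this section and may differ, for example by dividing by N - 1 in RMSE. PA and R2 are omitted, and the function name is hypothetical.

```python
from math import sqrt


def performance_indicators(observed, predicted):
    """Compute MAE, NAE, RMSE and IA under the assumptions stated above."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    abs_sum = sum(abs(e) for e in errors)
    sq_sum = sum(e * e for e in errors)
    mean_obs = sum(observed) / n
    # Denominator of the index of agreement (Willmott).
    ia_den = sum(
        (abs(p - mean_obs) + abs(o - mean_obs)) ** 2
        for o, p in zip(observed, predicted)
    )
    return {
        "MAE": abs_sum / n,
        "NAE": abs_sum / sum(observed),
        "RMSE": sqrt(sq_sum / n),
        "IA": 1.0 - sq_sum / ia_den,
    }
```

Under these definitions, values of NAE, MAE and RMSE closer to zero and values of IA closer to one indicate a better model.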

The coefficients of determination of the monthly models vary between 70 % and 89 %. This means that 70 % to 89 % of the variation in PM10 levels can be explained by the chosen factors (T, W, V, R and CO), their quadratic functions and their interactions.

The coefficient of determination of the model calculated from the data for the whole year is 81.2 %. All the models can be accepted as adequate because the significance level sig.F is 0.000, which is less than the accepted error α (0.05). All the regression coefficients included in the models are statistically significant (sig.F = 0.000). The standardized β-coefficients reveal that the influence of the meteorological parameters on PM10 levels in the air follows the sequence: wind velocity > temperature > humidity > solar radiation.

The data listed in Table 5 show that the best models for predicting PM10 pollution are those for December, January and February. They possess the highest values of precision: IA varies from 0.7137 to 0.7295, PA from 0.8248 to 0.8932 and the coefficient of determination R2 from 84.0 % to 89.4 %. Their errors, however, are high: they vary from 0.2145 to 0.2916. The same holds true for the model calculated from the data for the whole year.

Month   NAE      MAE       RMSE      R2, % (monthly models)   R2, % (annual model)   IA       PA
Jan     0,2082   14,3844   20,1863   89,41                    86,79                  0,7295   0,8932
Feb     0,2438   10,2944   16,4384   87,45                    82,72                  0,7259   0,8772
Mar     0,1664   2,5414    4,2062    73,16                    22,37                  0,6970   0,7243
Apr     0,2449   6,6799    8,5096    73,29                    31,73                  0,6920   0,7276
May     0,1633   2,7087    3,6391    73,67                    33,93                  0,6937   0,7394
Jun     0,1365   2,8219    3,6180    72,31                    7,46                   0,6862   0,7198
Jul     0,1912   3,4226    3,0552    71,59                    23,04                  0,6853   0,7069
Aug     0,1182   1,9237    2,6246    71,40                    24,98                  0,6856   0,7039
Sep     0,0993   2,1211    2,8419    71,16                    34,63                  0,6865   0,7114
Oct     0,1713   3,8729    5,0036    70,12                    42,11                  0,6847   0,7020
Nov     0,2916   10,0620   13,2713   72,96                    57,44                  0,6923   0,7291
Dec     0,2145   8,0544    10,7244   84,01                    73,96                  0,7137   0,8248
Year    0,2915   9,5282    14,2022   -                        81,25                  0,7216   0,8827

Table 5. Performance indicators for PM10 prediction models.

The low coefficients of determination obtained for the period from March until November when the annual model is used (R2 varies from 7.5 % to 57.4 %) are of particular interest. The monthly concentrations of PM10 are therefore better forecast with the monthly models. The comparison between the forecast values (from the monthly and the annual models) and the observed values of the dust particle concentrations is illustrated in Fig. 7.

Several models were obtained as a result of the investigation and the applied methodology. They possess comparatively high coefficients of determination and include not only meteorological factors but also the pollutant CO. The coefficients of determination of the regression equations decrease considerably when CO is excluded from the model.

The forecast values of the meteorological indices can be downloaded from different weather forecasting web platforms for a period up to 25 days.

The concentration of CO can be predicted using appropriately developed models or determined through elimination. The latter procedure consists in discarding the CO values that are least probable for the given month, hour and forecast meteorological data.

Fig. 7. Comparison of PM10 concentrations measured and predicted by monthly and annual models.

Fig. 8. Measured maximum daily concentrations of PM10 and those calculated by the monthly models.

Fig. 9. Relative errors in predicting peak concentrations of PM10.

It is of practical importance to forecast the maximal concentration of PM10 for a given day of the month from the available weather forecast. For this purpose, the monthly models derived above are used to forecast the maximal daily PM10 concentration for the first day of each month of the year. The hour (t) at which the concentration is highest is taken from the dependence presented in Fig. 6. The predicted values of the temperature (Tpr), the humidity (Wpr), the wind velocity (Vpr) and the solar radiation (Rpr) corresponding to this hour (tpr) are downloaded from a web-based weather forecast. The value of CO is determined by the method of elimination, which proceeds in stages. All CO values in the database for that month for which t ≠ tpr, V is outside the interval Vpr ± 0.5 m s-1, T is outside the interval Tpr ± 5 °C, W is outside the interval Wpr ± 10 % or R is outside the interval Rpr ± 100 W m-2 are eliminated successively, one criterion at a time. The CO values left after the elimination are averaged.
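The elimination procedure described above can be sketched in Python as follows. The tolerances are those quoted in the text. The record layout (a list of dicts with the keys `hour`, `V`, `T`, `W`, `R` and `CO`), the function name `estimate_co` and the handling of the case in which no record survives are assumptions made for illustration only.

```python
from statistics import mean

# Half-widths of the acceptance intervals quoted in the text:
# V in m s-1, T in degrees C, W in %, R in W m-2.
TOLERANCES = {"V": 0.5, "T": 5.0, "W": 10.0, "R": 100.0}


def estimate_co(records, hour_pr, forecast):
    """Average the CO values that survive the successive elimination.

    `records` holds the hourly data of the month in question and
    `forecast` the predicted V, T, W and R for the hour `hour_pr`.
    """
    # Stage 1: discard every record not measured at the forecast hour.
    kept = [r for r in records if r["hour"] == hour_pr]
    # Stages 2-5: apply the V, T, W and R intervals one after another.
    for key in ("V", "T", "W", "R"):
        kept = [r for r in kept
                if abs(r[key] - forecast[key]) <= TOLERANCES[key]]
    if not kept:
        return None  # no comparable record; the text does not cover this case
    return mean(r["CO"] for r in kept)
```

The averaged CO value, together with Tpr, Wpr, Vpr and Rpr, then supplies the five inputs of the monthly model.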

The maximal daily concentrations of PM10 measured and calculated on the basis of the monthly models are presented in Fig. 8, while Fig. 9 shows the relative errors in forecasting the maximal concentrations of PM10.

Evidently, the pollution with dust particles can be forecast well for all the months: most of the errors are below 10 %. A slightly higher error, between 10 % and 11 %, is observed in the forecast of the maximal daily concentration for April, July, August and November.

CONCLUSIONS

The aim of the investigation was to elaborate models for forecasting the PM10 concentration in the air of the city of Sofia by means of the MLR+NP approach. Four meteorological factors (V, T, W and R) and one pollutant (CO), all correlated with PM10 levels, were used as independent variables. The data were acquired from 01.04.2016 until 31.03.2017.

The acquired data showed that the concentration of PM10 in Sofia varied between 1 μg m-3 and 430 μg m-3, with an average annual value of 34.5 μg m-3. The concentrations of the dust particles were highest from November to February inclusive, with the highest monthly mean, 72.58 μg m-3, observed in January. The concentrations were lowest between May and October, when the monthly mean values ranged from 19.67 μg m-3 to 25.67 μg m-3. Concentrations of PM10 above 100 μg m-3 were observed at temperatures from -10 °C to 10 °C, solar radiation from 0 W m-2 to 500 W m-2, air humidity in the range 70 % - 90 % and wind velocity below 2 m s-1.

The quality and the reliability of the elaborated models were evaluated on the basis of the following statistical indices: MAE, NAE, IA, PA, R2 and RMSE. The results show that 70 % to 89 % of the variation of PM10 concentration could be explained by the factors T, W, V, R and CO, their quadratic functions and their interactions. The evaluation of the effectiveness of these models showed that the monthly models could forecast the PM10 concentration with greater precision and reliability than the model obtained from the annual data.

The experimental test, which comprised forecasting the maximal daily concentration of PM10 for the first day of each month, showed errors below 11 %.

The multiple linear regression involving new variables of interaction and quadratic functions proved to be an efficient technique for forecasting PM10 concentration from meteorological indices and one pollutant, CO. The results obtained reveal that the elaborated models could be used both for forecasting PM10 concentration and for the development of systems for control and management of environmental contamination.

REFERENCES

1. L. Hu, J. Liu, Z. He, Self-Adaptive Revised Land Use Regression Models for Estimating PM2.5 Concentrations in Beijing, China. Sustainability, 8, 8, 2016, 786.

2. D. Markovic, A. Markovic, A. Jovanovic, L. Lazic, Z. Mijic, Determination of O3, NO2, SO2, CO and PM10 measured in Belgrade urban area, Environ. Monit. Assess., 145, 1, 2008, 349-359.

3. D. Dockery, C. Pope, Acute Respiratory Effects of Particulate Air Pollution, Annu. Rev. Public Health, 15, 1994, 107-132.

4. M. Caselli, L. Trizio, P. Ielpo, A simple feedforward neural network for the PM10 forecasting: comparison with a radial basis function network and a multivariate linear regression model, Water, Air, Soil Pollut., 201, 1-4, 2009, 365-377.

5. G. Krstic, A reanalysis of fine particulate matter air pollution versus life expectancy in the United States, J. Air Waste Manag. Assoc., 62, 9, 2012, 989-991.

6. P. Giorgini, P. Di Giosia, D. Grassi, M. Rubenfire, R. Brook, C. Ferri, Air pollution exposure and blood pressure: An updated review of the literature, Curr. Pharm. Des., 22, 1, 2016, 28-51.


7. A. Lakshmanan, Y. Chiu, B. Coull, A. Just, S. Maxwell, J. Schwartz, A. Gryparis, I. Kloog, R. Wright, R. Wright, Associations between prenatal traffic-related air pollution exposure and birth-weight: Modification by sex and maternal pre-pregnancy body mass index, Environ. Res., 137, 2015, 268-277.

8. J. Sedek, N. Ramli, A. Yahaya, Air quality predictions using log normal distribution functions of particulate matter in Kuala Lumpur, Malaysian J. Environ. Manage., 7, 2006, 33-41.

9. A. Ul-Saufie, A. Yahaya, N. Ramli, H. Hamid, Robust regression models for predicting PM10 concentration in an industrial area, Int. J. Eng. Technol., 2, 3, 2012, 364-370.

10. S. Li, L. Zhai, B. Zou, H. Sang, X. Fang, A. Ul-Saufie, A. Yahaya, N. Ramli, N. Rosaida, H. Hamid, Future daily PM10 concentrations prediction by combining regression models and feedforward backpropagation models with principle component analysis (PCA), Atmos. Environ., 77, 2013, 621-630.

11. J. Whalley, S. Zandi, Particulate Matter Sampling Techniques and Data Modelling Methods, Chapter 2, 2016, 30-54, https://dx.doi.org/10.5772/65054.

12. S. Tiwari, D.M. Chate, M. Srivastava, P. Safai, A. Srivastava, D. Bisht, B. Padmanabhamurty, Statistical evaluation of PM10 and distribution of PM1, PM2.5, and PM10 in ambient air due to extreme fireworks episodes (Deepawali festivals) in megacity Delhi, Nat. Hazards, 61, 2012, 521-531, DOI: 10.1007/s11069-011-9931-4.

13. A. Afzali, M. Rashid, B. Sabariah, M. Ramli, PM10 pollution: its prediction and meteorological influence in Pasir Gudang, Johor, IOP Conference Series: Earth and Environmental Science, 18, 2014, 012100, DOI: 10.1088/1755-1315/18/1/012100.

14. G. Grivas, A. Chaloulakou, Artificial neural network models for prediction of PM10 hourly concentrations, in the Greater Area of Athens, Greece, Atmos. Environ., 40, 7, 2006, 1216-1229.

15. P. Perez, J. Reyes, Prediction of maximum of 24h average of PM10 concentrations 30 h in advance in Santiago, Chile, Atmos. Environ., 36, 28, 2002, 4555-4561.

16. D. Papanastasiou, D. Melas, I. Kioutsioukis, Development and assessment of neural network and multiple regression models in order to predict PM10 levels in a medium sized Mediterranean city, Water, Air, Soil Pollut., 182, 1-4, 2007, 325-334.

17. A. Sayegh, S. Munir, T. Habeebullah, Comparing the performance of statistical models for predicting PM10 concentrations, Aerosol Air Qual. Res., 14, 3, 2014, 653-665. DOI: 10.4209/aaqr.2013.07.0259.

18. A. Ul-Saufie, A. Yahaya, N. Ramli, N. Rosaida, H. Hamid, Future daily PM10 concentrations prediction by combining regression models and feedforward backpropagation models with principle component analysis (PCA), Atmos. Environ., 77, 2013, 621-630.

19. S. Munir, T. Habeebullah, A. Seroji, E. Morsy, A. Mohammed, W. Saud, Modeling particulate matter concentrations in Makkah, applying a statistical modeling approach, Aerosol Air Qual. Res., 13, 3, 2013, 901-910.

20. A. Ul-Saufie, A. Yahya, N. Ramli, H. Hamid, Comparison Between Multiple Linear Regression And Feed forward Back propagation Neural Network Models For Predicting PM10 Concentration Level Based On Gaseous And Meteorological Parameters, Int. J. Appl. Sci., Eng. Technol., 1, 4, 2011, 42-49.

21. S. Li, L. Zhai, B. Zou, H. Sang, X. Fang, A Generalized Additive Model Combining Principal Component Analysis for PM2.5 Concentration Estimation, Int. J. Geo-Inf., 6, 8, 2017, 248-262.

22. E. Andric, J. Brana, V. Gvozdic, Impact of meteorological factors on ozone concentrations modelled by time series analysis and multivariate statistical methods, Ecological Informatics, 4, 2, 2009, 117-122.

23. V. Gvozdic, E. Kovac-Andric, J. Brana, Influence of meteorological factors NO2, SO2, CO and PM10 on the concentration of O3 in the urban atmosphere of Eastern Croatia, Environmental Modeling and Assessment, 16, 5, 2011, 491-501.

24. W. Pao, L. Grace, Simulation of the daily average PM10 concentrations at Ta-Liao with Box–Jenkins time series models and multivariate analysis, Atmos. Environ., 43, 13, 2009, 2104-2113.

25. U. Brunelli, V. Piazza, L. Pignato, F. Sorbello, S. Vitabile, Two-days ahead prediction of daily maximum concentrations of SO2, O3, PM10, NO2, CO in the urban area of Palermo, Italy, Atmos. Environ., 41, 14, 2007, 2967-2995.

26. T. Slini, K. Karatzas, N. Moussiopoulos, Correlation of air pollution and meteorological data using neural networks, P.J. Sturm (ed), Proc. 8th International Conference on Harmonisation within Atmospheric Dispersion Modelling for Regulatory Purposes, Sofia, 1998, 368-372.

27. https://www.sofia.bg/components-environment-air

28. "Air quality in Europe - 2015 report", EEA Report No 5/2015.

29. "Air quality in Europe - 2016 report", EEA Report No 28/2016.

30. "Air quality in Europe - 2017 report", EEA Report No 13/2017.

31. https://eur-lex.europa.eu/legal-content/EN/ALL/?uri=CELEX%3A62015CJ0488

32. "Ambient air pollution: A global assessment of exposure and burden of disease", World Health Organization, 2016.

33. eea.government.bg/bg/legislation/air/Naredba_12_Normi_KAV.pdf

34. http://www.niggg.bas.bg/cw3/index.php?pol=PM10&steps=72&dom=4

35. H. Lu, Estimating the Emission Source Reduction of PM10 in Central Taiwan, J. of Chemosphere, 54, 7, 2004, 805-814.

36. https://airbg.info/