AN ANALYSIS OF SHORT-TERM LOAD FORECASTING ON RESIDENTIAL
BUILDINGS USING DEEP LEARNING MODELS
SREERAG SURESH
THESIS SUBMITTED TO THE FACULTY OF THE VIRGINIA POLYTECHNIC
INSTITUTE AND STATE UNIVERSITY IN PARTIAL FULFILLMENT OF THE
REQUIREMENTS FOR THE DEGREE OF
MASTER OF SCIENCE
IN
ENVIRONMENTAL ENGINEERING
FARROKH JAZIZADEH KARIMI, CHAIR
LINSEY C MARR
GABRIEL ISAACMAN-VANWERTZ
MAY 21ST, 2020
BLACKSBURG, VIRGINIA
KEYWORDS: LOAD FORECASTING, BUILDING ENERGY, CNN, DEEP
LEARNING, LSTM
Copyright © 2020, Sreerag Suresh
AN ANALYSIS OF SHORT-TERM LOAD FORECASTING ON RESIDENTIAL
BUILDINGS USING DEEP LEARNING MODELS
SREERAG SURESH
ABSTRACT
Building energy load forecasting is becoming an increasingly important task with the
rapid deployment of smart homes, the integration of renewables into the grid and the
advent of decentralized energy systems. Residential load forecasting has been a
challenging task since the residential load is highly stochastic. Deep learning models
have shown tremendous promise in the fields of time-series and sequential data and have
been successfully used for short-term load forecasting at the building level.
Although other studies have applied deep learning models to building energy
forecasting, most have looked at a limited number of homes or an aggregate load of a
collection of homes. This study aims to address this gap and serves as an investigation
into selecting the better deep learning model architecture for short-term load
forecasting on three communities of residential buildings. The deep learning models
CNN and LSTM are used in the study. For 15-minute-ahead forecasting on a collection
of homes, it was found that homes with higher variance were better predicted by CNN
models, while LSTM showed better performance for homes with lower variance. The
effect of adding weather variables on 24-hour-ahead forecasting was studied, and it
was observed that adding weather parameters did not improve forecasting performance.
In all the homes, the deep learning models are shown to outperform the simple ANN model.
AN ANALYSIS OF DEEP LEARNING MODELS FOR SHORT TERM LOAD
FORECASTING ON RESIDENTIAL BUILDINGS
SREERAG SURESH
GENERAL AUDIENCE ABSTRACT
Building energy load forecasting is becoming an increasingly important task with the
rapid deployment of smart homes, the integration of renewables into the grid and the
advent of decentralized energy systems. Residential load forecasting has been a
challenging task since residential load is highly stochastic. Deep learning models
have shown tremendous promise in the fields of time-series and sequential data and
have been successfully used for short-term load forecasting. Although other studies
have applied deep learning models to building energy forecasting, most have looked
at only a single home or an aggregate load of a collection of homes. This study aims
to address this gap and serves as an analysis of short-term load forecasting on three
communities of residential buildings. The model performances across all homes are
analyzed in detail. Deep learning models are used in this study and their efficacy
is measured against a simple ANN model.
ACKNOWLEDGMENTS
I would like to express my sincere gratitude to my advisor, Dr. Farrokh Jazizadeh for
his constant support throughout this study. I would not have been able to complete this
research without his guidance. I would also like to express my gratitude to Dr. Linsey
Marr and Dr. Gabriel Isaacman-VanWertz for serving on my committee and providing
their valuable insights.
I am extremely grateful to all my colleagues at Virginia Tech, especially all members at
the INFORM lab for their constant support. I would also like to express my sincere
gratitude to Cristiano Ronaldo, Kobe Bryant (RIP Mamba) and Uzumaki Naruto. They
have been an immense source of inspiration and constantly motivate me to do my best.
I am thankful to my friends Manu Krishnan, Amal Shaj, Nevedita Sankararaman,
Venkatesh Modi and Prachi Jain for providing me their valuable feedback and helping
me improve my thesis.
I would like to dedicate this thesis to my parents Mr. Suresh Babu and Mrs. Deepa
Suresh for their unconditional love and support.
Table of Contents
1. Introduction ............................................................................................................. 1
2. Literature Review .................................................................................................... 4
3. Methodology ........................................................................................................... 9
3.1 Problem Formulation .......................................................................................... 9
3.2 Theoretical Background – Deep Learning ........................................................ 10
3.3 Theoretical Background – Long Short-Term Networks (LSTMs) ................... 11
3.4 Theoretical Background – Convolutional Neural Networks (CNNs) ............... 13
3.5 Theoretical Background – Multi Layered Perceptron (MLPs) ......................... 14
3.5.1 Optimization Algorithm ................................................................................ 15
3.5.2 Regularization ............................................................................................... 16
4. Evaluation Method ................................................................................................ 17
4.1 Data Collection and Characteristics .................................................................. 17
4.1.1 Residential Homes Data ................................................................................... 17
4.1.2 Weather Data ................................................................................................... 17
4.2 Data Description and Pre-processing ................................................................ 18
4.2.1 Austin ............................................................................................................ 18
4.2.2 California ...................................................................................................... 20
4.2.3 New York ...................................................................................................... 22
4.3 Feature Engineering .......................................................................................... 25
4.4 Implementation Setup ....................................................................................... 26
4.5 Evaluation Metric .............................................................................................. 28
5. Results and Discussion ......................................................................................... 29
5.1 Introduction ....................................................................................................... 29
5.2 Multiple Home Analysis ................................................................................... 29
5.2.1 Austin and California .................................................................................... 30
5.2.2 New York ...................................................................................................... 37
5.2.3 Variation of RMSE with Variance ................................................................ 41
5.3 Multistep ahead forecasting .............................................................................. 43
5.3.1 Grid Data Forecasting ................................................................................... 43
5.3.2 Use Data Forecasting .................................................................................... 45
5.3.3 Variation with Lookback .............................................................................. 47
6. Conclusions ........................................................................................................... 49
References .................................................................................................................... 51
1. Introduction
According to the US Energy Information Administration, the global energy demand is projected
to increase by 50 %, mostly led by developing economies in Asia. This increasing demand
would put a significant load on the present energy infrastructure and also possibly deteriorate
the world environmental health with an increase in emissions of greenhouse gases from
conventional power sources[1]. Across the United States and Europe, it is estimated that 39 %
and 40 % of electricity consumption, and 38 % and 36 % of CO2 emissions, respectively,
come from the building sector alone[2]. Therefore, regulating and managing building-sector
energy demand is an important task for a transition to a more sustainable use of our limited
energy resources. Increasing the use of renewable energy sources in buildings and improving
building designs so that they are more energy efficient are two ways of reducing building
energy demand[1]. Energy load forecasting is another method that can be very useful in
regulating building energy demand.
Energy load forecasting has advantages both economically and for the energy
infrastructure. Predicting future consumption helps utility companies plan ahead
and make economically sound decisions about resource planning and future
generation. Large amounts of money are involved in energy budgets, so providing
reliable forecasts is of significant importance to engineers[3]. For buildings, load forecasting is
instrumental in efficient building load management, commissioning load systems by
detecting system faults, building load operation and avoiding blackouts[4]. Short-term load
forecasting at the individual household level is important for implementing demand-side
management, improving energy use, increasing cost savings and ultimately reducing
environmental impact.
In the last decade, there has been an uptick in the adoption of renewables and distributed
generation sources in the grid, along with growing deployment of smart grids and smart
buildings to meet rising energy demands effectively. Integrating these distributed
energy resources without causing disturbances in the grid requires accurate load forecasts across
different time horizons[5].
Although load forecasting is a fairly mature field, there is a shortage of studies using data-driven
methods for load forecasting at the individual building level in the US[6]. Load forecasting at
the individual building level is challenging compared to aggregate building load
forecasting because individual building loads are highly stochastic, with multiple factors
affecting energy consumption. Building energy consumption depends on factors such
as occupant behaviour, weather parameters, use of appliances, location of the building and the
structural characteristics of the building[7].
Our overarching goal in this study is to use deep learning models, i.e. 1-D Convolutional Neural
Networks (CNNs) and Long Short-Term Memory networks (LSTMs), for short-term load
forecasting. 1-D CNNs have been successfully used for studying sequential data such as audio
recordings and for analyzing time series of sensor data[8]. The strength of the CNN is its capacity
to learn relevant features effectively from raw data without preprocessing[9].
Both CNNs and LSTMs have proved to be useful tools for load forecasting, showing
promising results. In this study, we aim to answer the following questions with regard to short
term load forecasting:
• For a collection of homes, is there any value in setting up an individual deep learning
forecasting model architecture for each home?
• For 24-hour-ahead forecasts, is there any benefit in adding weather-based and date-based
features to improve performance?
2. Literature Review
Load forecasting can save utilities up to $100,000 USD annually for a city[10], giving
utilities an incentive to put focused effort into improving forecasting accuracy.
The economic and energy-infrastructure advantages of load forecasting have led to approaches
as early as 1956, in which the authors tried to predict the daily peak load 24 hours ahead[11].
In this study, a systematic literature review was first conducted to identify trends in load
forecasting research. It was found that a large number of research papers focused on the
system level, whereas comparatively few studies focused on individual buildings or low
levels of aggregation. This was identified as a gap in the literature. The
figure below shows the number of studies classified into 5 categories based on the geographic
scope of prediction: (i) State level – load forecasting done at a state, province or national
level, (ii) City level – forecasting done at a city level, (iii) Building level – load forecasting
done at an individual building level, (iv) Neighborhood level – load forecasting done for an
aggregate number of homes and (v) Review papers, denoted below as N/A.
Studies classified as 'state level' looked at forecasting system-level loads, such as loads
for the state of California, New South Wales, Chandigarh, Hubei province in China and
Singapore, using traditional machine learning models such as SVM, SVR and simple ANN,
deep learning models, and hybrid optimization models to good effect ([12],[13],[14],[15]).
City-level studies looked at forecasting loads of cities such as Johor, Langfang, Rome and
Sydney using methods such as CNNs, general regression neural networks and echo state
networks with PCA decomposition ([16],[17],[18],[19]). At the neighborhood scale,
forecasting was studied for collections of homes, mostly using the CER-Ireland and UK-DALE
datasets, which contain a few hundred homes ([20],[21],[22]). The review studies
on forecasting discuss the different methodologies, applications and challenges faced in
load forecasting ([23],[24]).
Figure 1 : Preliminary literature review categorized based on the geographic scope of prediction.
Electric load forecasting can be categorized based on the time horizon of forecasting.
Mocanu et al. categorized it into (i) short-term forecasting, which involves a
prediction horizon in the range of one hour to one week, (ii) medium-term forecasting, which
involves a range of one week up to one year, and (iii) long-term forecasting, which involves
a prediction horizon greater than a year. Short-term forecasting is useful for demand-side
management (DSM), generation capacity scheduling, renewable energy source
(RES) integration and energy storage system applications[25, 26]. In this study, the prediction
time horizon in focus is short-term load forecasting.
Building-level load forecasting methods mainly comprise two types: (i) physical modelling
approaches and (ii) data-driven approaches. Physical models, or white-box models, rely
on thermodynamic laws for energy modelling and analysis. Software that utilizes physical
models for building energy simulation includes EnergyPlus and Ecotect. These modelling
packages use comprehensive building and environmental parameters such as building
construction details; operation schedules; HVAC design information; and climate, sky, solar
and shading information to calculate building energy consumption[2]. Such detailed data about
the buildings may not always be available, which can result in poor performance during
simulation.
Data-driven forecasting methods, on the other hand, do not require such a comprehensive
collection of features but instead learn from historical data. Data-driven
forecasting models include statistical models, hybrid models and machine learning models.
Traditionally, statistical models like ARIMA, SARIMA and SARIMAX have been used for short-term
load forecasting[27],[28],[29]. But with the popularity of machine learning, the
improvements in computing power and more data available, forecasting has shifted to
more computational models. Machine learning models like SVR[30], SVM[31] and k-NN[32]
have been used to forecast energy loads to good effect. The development of intelligent
optimization technologies has led to different types of smart optimization algorithms being
applied to the field of building energy forecasting[7]. Hybrid models combining these
optimization algorithms with machine learning models have also been used to good effect
[4],[33].
In the past couple of years, deep learning methods have achieved tremendous success in
handling complex sequential data[34],[35]. Deep learning methods have therefore also found
application in load forecasting and have been shown to be capable of
surpassing various benchmark models, such as simple ANNs and traditional statistical time
series methods like ARIMA and SARIMA[9]. With increased computational power, larger
datasets and higher-granularity data available, deep learning models are well positioned to
dominate the space of load forecasting. The electricity load dataset, the forecast horizon
and the evaluation metric used in each deep learning study on building-level forecasting are
described in the table below.
Table 1 : Deep learning studies at a building level, datasets used and their prediction horizons.

Deep Learning Model | Dataset Used | Forecast Horizon | Evaluation Metric
LSTM-S2S[36] | UCI Single Household | 60 hr. | RMSE
CNN[37] | UCI Single Household | 1 hr., 60 hr. | RMSE
RBM[25] | UCI Single Household | 1 d, 1 wk. | RMSE
Autoencoders and GAN[38] | Educational Building in Hong Kong | 1 hr. | RMSE
GRU, LSTM[39] | Educational Building in Hong Kong | 1 d | RMSE
Gated CNN and RNN[40] | BEMOSS project and EnerNOC buildings | 1 d | CV-RMSE
LSTM[26] | Canadian Household | 1 hr. | RMSE
LSTM[41] | 48 non-residential Chinese buildings/industries | 1 d | RMSE
LSTM[42] | CER Ireland (920 homes) | 1 d | RMSE
RBM[43] | 40 industrial customers of KEPCO Korea | 1 d | RMSE
CNN[44] | Pecan Street (220 homes) | 6 h, 1 d | RMSE
CNN[9] | Pecan Street (220 homes) | 1 d | CV-RMSE
LSTM-S2S[5] | Pecan Street (30 homes) and Single Building in Utah | 1 wk., 1 yr. | CV-RMSE
3. Methodology
The CNN, LSTM and MLP models used in this study for short-term load forecasting all belong
to the time series category. Being time series models, they do not require additional
time-indexing parameters. As a result, these models can reveal time dependencies
intrinsically embedded in the input data and circumvent possible problems caused by
incorrect time-index labelling[40]. The deep learning models compared in this study have
been successfully used in the study of sequential and time-series data.
3.1 Problem Formulation
All the models used in the study are formulated in the same supervised learning framework
to ensure a fair comparison in answering the following research questions:
i. Is there any value in setting up individual deep learning forecasting models for each
home?
ii. Does adding weather-based and time-based features to the load data improve
forecasting performance for multi-step-ahead predictions?
For all the experiments conducted in this study, the input matrix X consists of the historical
load profile along with weather-based and time-based features, and the output vector Y is the
predicted load profile. Both the input historical load window and the output vector (prediction
horizon) are configurable.
There are three approaches to multi-step-ahead time series forecasting: (i) the direct
approach, (ii) the recursive approach and (iii) the multi-input multi-output (MIMO) approach. The
recursive approach may suffer from error accumulation, while the direct approach
requires more computational power than the recursive approach. The MIMO approach, on
the other hand, circumvents the error accumulation drawback of the recursive
method and also avoids the conditional independence assumption used in the direct
approach[39]. For the day-ahead (multi-step-ahead) load prediction in this study, the MIMO
approach is used to forecast the load at each hour of the next day.
Figure 2 : MIMO approach for multi-step ahead predictions [39]
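The MIMO framing can be sketched as a simple windowing operation. The helper below is illustrative only (not code from the thesis); it slices a load series into lookback windows X and joint multi-step targets Y:

```python
import numpy as np

def make_mimo_windows(series, lookback=96, horizon=24):
    """Slice a 1-D load series into (X, Y) pairs for MIMO forecasting.

    Each X row holds `lookback` past readings; each Y row holds the
    next `horizon` readings, which the model predicts jointly in one
    forward pass (avoiding recursive error accumulation).
    """
    X, Y = [], []
    for start in range(len(series) - lookback - horizon + 1):
        X.append(series[start:start + lookback])
        Y.append(series[start + lookback:start + lookback + horizon])
    return np.array(X), np.array(Y)

# Example: two days of hourly load, 12-hour lookback, 24-hour horizon
load = np.arange(48, dtype=float)
X, Y = make_mimo_windows(load, lookback=12, horizon=24)
print(X.shape, Y.shape)  # (13, 12) (13, 24)
```

Each training example thus maps one historical window to the entire next-day profile, rather than to a single step.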
3.2 Theoretical Background – Deep Learning
According to Yann LeCun et al., "Deep learning allows computational models that are composed
of multiple processing layers to learn representations of data with multiple levels of
abstraction"[45]. Although the idea of 'deep learning' has been floating around for decades, it
had often been considered a fanciful concept rather than a feasible technology. This was mainly
due to three constraints: (i) lack of sufficient training data, (ii) lack of computing power and (iii)
lack of efficient training algorithms[42]. With advancements in the semiconductor industry
producing powerful graphics processing units (GPUs), and the rapid digitalization of the world,
these constraints have largely been removed. Moreover, Geoffrey Hinton's breakthrough in
efficient neural network training[46] made deep learning implementations
feasible. In the past few years, deep learning models have been extremely popular in
computer vision, speech recognition, machine translation and board game programs, where
they have produced results that match or even exceed expert human performance[47].
The huge advantage that deep learning models have over traditional machine learning models
is that they learn high-level features from the data incrementally, which removes
the need for subject knowledge and laborious feature extraction[48]. The main
rationale for using deep learning models in this study is their superior ability, compared to
traditional neural networks, (i) to learn highly non-linear relationships and (ii) to learn
shared uncertainties.
3.3 Theoretical Background – Long Short-Term Networks (LSTMs)
LSTMs, or Long Short-Term Memory networks, belong to a class of networks called recurrent
neural networks that can learn the order dependence between items in a sequence. Recurrent
neural networks (RNNs) are specifically created for dealing with sequential data and have been
used effectively in machine translation, speech synthesis and time series
prediction[49]. Traditional RNNs often suffer from the problem of vanishing gradients, which
reduces their efficacy on long data sequences. LSTMs partially mitigate
this problem with gates that control the flow of information, making them
well suited for time series data with long temporal dependencies.

An LSTM memory cell comprises three gates, i.e. an input gate, an output gate and a
forget gate, which regulate the flow of information within the cell. The gates use
sigmoid activation functions that squash values between 0 and 1, which is useful for updating
or forgetting information. The forget gate decides what information should be kept or forgotten,
taking as inputs the previous hidden state and the current input passed through the
activation function. The input gate is then used to update the cell state, and the output gate
computes the new hidden state of the LSTM cell[50]. This mechanism of forgetting and
keeping information within a cell makes the LSTM well suited to sequential data.
Figure 3 : LSTM Unit Cell and LSTM architecture used for time series forecasting.
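The gate mechanics described above can be written out directly. The following NumPy sketch of a single LSTM cell step is illustrative only (random placeholder weights, not a trained model or the thesis implementation):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM cell update. W, U, b hold stacked parameters for the
    forget, input and output gates and the candidate cell state."""
    z = W @ x + U @ h_prev + b          # pre-activations, shape (4*hidden,)
    hid = h_prev.size
    f = sigmoid(z[0*hid:1*hid])         # forget gate: keep/discard old memory
    i = sigmoid(z[1*hid:2*hid])         # input gate: admit new information
    o = sigmoid(z[2*hid:3*hid])         # output gate: expose memory as hidden state
    g = np.tanh(z[3*hid:4*hid])         # candidate cell state
    c = f * c_prev + i * g              # updated cell state
    h = o * np.tanh(c)                  # updated hidden state
    return h, c

rng = np.random.default_rng(0)
hid, n_in = 4, 1
W = rng.normal(size=(4*hid, n_in))
U = rng.normal(size=(4*hid, hid))
b = np.zeros(4*hid)
h = c = np.zeros(hid)
for x_t in [0.5, 0.7, 0.2]:             # a short load sequence
    h, c = lstm_step(np.array([x_t]), h, c, W, U, b)
print(h.shape)  # (4,)
```

Because the output gate and tanh both produce bounded values, the hidden state stays bounded, which is part of what keeps long sequences trainable.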
3.4 Theoretical Background – Convolutional Neural Networks (CNNs)
Convolutional Neural Networks belong to a class of deep learning networks used for
processing data with a grid-like topology[49]. This includes time-series data and image data,
which can be thought of as 1-D and 2-D data grids respectively. They have been successfully
used in computer vision, human activity recognition, natural language
processing, drug discovery, time series forecasting and more ([51],[8],[52],[53],[40]). A CNN uses a
specialized linear mathematical operation called convolution in at least one of its layers[49]. In
CNNs, the convolution operation is performed by repeatedly applying filters, or kernels, to the
input data to obtain a feature map.

Three operations take place in the convolutional layer. The first, described
above, produces the feature map. The second is activation of the elements
of the feature map using a non-linear activation function, most commonly the ReLU
(rectified linear unit)[49]. In the third step, a pooling operation smooths and reduces
the dimensions of the feature map. Max pooling is used in this study; it returns
the maximum value within each rectangular neighborhood of the previous
layer's output[49]. A CNN may consist of one or more convolutional layers. The outputs of the
convolutional layers are then passed to the hidden, or fully connected, layers. The output
layer follows the hidden layers and performs a role identical to that of an output layer in a
conventional neural network[37].
Figure 4 : CNN Architecture for time series data2
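The three operations described above (convolution, ReLU activation, max pooling) can be sketched in plain NumPy; the filter and data here are toy values for illustration, not the thesis configuration:

```python
import numpy as np

def conv1d_valid(x, kernel):
    """'Valid' 1-D convolution (cross-correlation, as in CNN layers)."""
    k = len(kernel)
    return np.array([np.dot(x[i:i+k], kernel) for i in range(len(x) - k + 1)])

def relu(x):
    """Rectified linear activation: negative responses are zeroed."""
    return np.maximum(x, 0.0)

def max_pool(x, size=2):
    """Non-overlapping max pooling; any leftover tail is trimmed."""
    n = (len(x) // size) * size
    return x[:n].reshape(-1, size).max(axis=1)

load = np.array([1.0, 3.0, 2.0, 5.0, 4.0, 1.0, 0.0, 2.0])
kernel = np.array([1.0, -1.0])           # a simple difference filter
fmap = relu(conv1d_valid(load, kernel))  # feature map after activation
pooled = max_pool(fmap, size=2)          # reduced, smoothed representation
print(fmap, pooled)  # [0. 1. 0. 1. 3. 1. 0.] [1. 1. 3.]
```

In a real 1-D CNN the kernel weights are learned rather than fixed, and many kernels run in parallel to produce multiple feature maps.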
3.5 Theoretical Background – Multi Layered Perceptron (MLPs)
Multilayer perceptrons, also known as feedforward or artificial neural networks, are the
archetypal deep learning models. They are powerful machine learning models
used for learning non-linear relationships within data and are highly flexible
universal approximators[54]. They are extremely useful to machine learning practitioners and
form the basis of many commercial machine learning applications[49]. They have been
successfully used in load forecasting and other time series applications[2].

At a high level, a simple neural network consists of an input layer, a hidden layer and an output
layer. Unlike recurrent neural networks, they have no feedback connections through which outputs of
the model are fed back into itself[49]. Networks with just one hidden layer, also known as vanilla
artificial neural networks, are used in this study. Detailed information on the workings of the
multilayer perceptron can be found in the literature[49].
Figure 5 : Simple Multi Layered Perceptron1
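A forward pass through such a one-hidden-layer network can be sketched as follows (layer sizes and random weights are illustrative only, not the thesis configuration):

```python
import numpy as np

def mlp_forward(x, W1, b1, W2, b2):
    """Forward pass of a one-hidden-layer (vanilla) neural network."""
    h = np.maximum(W1 @ x + b1, 0.0)    # hidden layer with ReLU activation
    return W2 @ h + b2                  # linear output layer (regression)

rng = np.random.default_rng(1)
n_in, n_hid, n_out = 96, 32, 24         # lookback window in, 24-step forecast out
W1, b1 = rng.normal(scale=0.1, size=(n_hid, n_in)), np.zeros(n_hid)
W2, b2 = rng.normal(scale=0.1, size=(n_out, n_hid)), np.zeros(n_out)
y = mlp_forward(rng.normal(size=n_in), W1, b1, W2, b2)
print(y.shape)  # (24,)
```

Training consists of adjusting W1, b1, W2, b2 by backpropagation so that the output vector matches the observed next-day load.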
3.5.1 Optimization Algorithm
In this study, all the models use the Adam optimization algorithm to optimize the weights of
each layer. This adaptive learning rate optimization algorithm shows quicker convergence than
traditional SGD[55]. It is a first-order gradient-based optimization algorithm that is
intuitive, computationally efficient and well suited to models with large numbers of
parameters. Unlike stochastic gradient descent, which naively updates the weights
with a constant learning rate, the Adam optimization algorithm computes individual adaptive
learning rates from the moments of the gradients. Further details about the Adam optimization
algorithm can be found in the literature[55].
1,2 These figures are generated using the web application: http://alexlenail.me/NN-SVG/index.html
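The Adam update described above can be sketched directly from the moment estimates (a minimal illustration of the update rule in [55], not the Keras implementation used in the study):

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update: per-parameter learning rates from gradient moments."""
    m = b1 * m + (1 - b1) * grad            # first moment (mean of gradients)
    v = b2 * v + (1 - b2) * grad**2         # second moment (uncentered variance)
    m_hat = m / (1 - b1**t)                 # bias correction for early steps
    v_hat = v / (1 - b2**t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Minimize f(x) = x^2 (gradient 2x) starting from x = 1.0
theta, m, v = np.array([1.0]), np.zeros(1), np.zeros(1)
for t in range(1, 2001):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t)
print(theta)
```

Dividing by the square root of the second moment gives each parameter an effective step size that shrinks where gradients are consistently large, which is what distinguishes Adam from plain SGD.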
3.5.2 Regularization
Machine learning models often suffer from overfitting, which results in testing
errors much worse than errors on the training data. This occurs when the model fits the
training data too well, resulting in high variance and low bias. Strategies that decrease
the testing error, sometimes at the cost of increased training error, are known as
regularization[49]. In this study, weight decay regularization is used to address the problem
of overfitting. The regularization parameter lambda is set to 0.01, the default value in
Keras[56].
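A minimal sketch of how weight decay enters the loss and its gradient, assuming the same lambda = 0.01 (the optimizer here is plain gradient descent for brevity, not the Adam optimizer used in the study):

```python
import numpy as np

lam = 0.01  # regularization strength, matching the lambda used in the study

def ridge_loss_grad(w, X, y):
    """MSE loss with an L2 weight-decay penalty, and its gradient.

    The lam * ||w||^2 term shrinks weights toward zero, trading a little
    training error for better generalization."""
    resid = X @ w - y
    loss = np.mean(resid**2) + lam * np.sum(w**2)
    grad = 2 * X.T @ resid / len(y) + 2 * lam * w
    return loss, grad

# Toy regression problem with known weights
rng = np.random.default_rng(2)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=50)
w = np.zeros(3)
for _ in range(500):                    # plain gradient descent, for illustration
    loss, grad = ridge_loss_grad(w, X, y)
    w -= 0.05 * grad
print(np.round(w, 2))
```

The extra 2 * lam * w term in the gradient is the "decay": at every step each weight is pulled slightly toward zero in addition to following the data gradient.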
4. Evaluation Method
4.1 Data Collection and Characteristics
4.1.1 Residential Homes Data
The residential building data used for the study were obtained from the Pecan Street Inc.
Dataport[57] for Austin, New York and California. The dataset is publicly available and was
downloaded using the free student license. One year of load data at 15-minute resolution is
used for the study, with 25 homes from Austin, 24 homes from New York and 23 homes from
California selected. All the Austin homes used in the study cover the range 1 January 2018 to
1 January 2019, whereas the California homes each consist of one year of data anywhere
between 1 January 2014 and 1 January 2019. In the case of New York, the data spans
1 January 2019 to 31 October 2019.
4.1.2 Weather Data
Weather data are used only for the city of Austin. One year of weather data, covering
1 January 2018 to 1 January 2019, was obtained from the website openweathermap.org[58].
The weather data consist of temperature, humidity and atmospheric pressure.
4.2 Data Description and Pre-processing
A total of 72 homes (25, 24 and 23) from Austin, New York and California, respectively,
were initially selected. All 72 homes were checked for missing values, since the deep
learning models will not run with missing values present in the dataset. It was found that
none of the California homes had missing values, whereas several homes in Austin did. In
the case of New York, large sections of the data are missing, and that collection of homes
is analysed as a separate case study.
4.2.1 Austin
The home types and missing-value statistics for the city of Austin are provided in the table
below. Homes in Austin with more than 0.5 % missing values (approximately 170 missing
values out of 35,000 readings) are omitted, and homes with missing values below 0.5 % are
linearly interpolated to fill the gaps. The houses omitted from Austin are highlighted in
orange in the table below. This leaves 20 homes from Austin (after interpolation). All the
Austin homes belong to the same building type, with a few homes having solar generation
capacity.
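The screening rule above can be sketched in pandas (a hypothetical helper illustrating the 0.5 % threshold and linear interpolation, not the exact preprocessing code used in the study):

```python
import numpy as np
import pandas as pd

def screen_and_fill(load, threshold=0.005):
    """Drop a home if its missing-value fraction exceeds `threshold`
    (0.5 %); otherwise fill the gaps by linear interpolation."""
    frac_missing = load.isna().mean()
    if frac_missing > threshold:
        return None                      # home omitted from the study
    return load.interpolate(method="linear")

# Toy 15-minute load series with two missing readings out of 1000
idx = pd.date_range("2018-01-01", periods=1000, freq="15min")
s = pd.Series(np.ones(1000), index=idx)
s.iloc[[10, 500]] = np.nan
filled = screen_and_fill(s)
print(filled.isna().sum())  # 0
```

A home with, say, 1 % of readings missing would return None under the same rule and be excluded, mirroring the orange-highlighted rows in the table.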
Table 2 : List of residential buildings from Austin. Source: Pecan Street Inc, Dataport[57]
House ID % Missing Values Building Type Solar Available
661 0.95 Single-Family Home Yes
1642 0.52 Single-Family Home Yes
2335 0 Single-Family Home Yes
2361 0 Single-Family Home Yes
2818 0 Single-Family Home Yes
3039 1.5 Single-Family Home No
3456 0.01 Single-Family Home Yes
3538 0 Single-Family Home Yes
4031 0.01 Single-Family Home Yes
4373 0.01 Single-Family Home Yes
4767 0 Single-Family Home Yes
5746 0.38 Single-Family Home No
6139 0 Single-Family Home Yes
7536 0.01 Single-Family Home Yes
7719 0 Single-Family Home Yes
7800 0.06 Single-Family Home Yes
7901 0.01 Single-Family Home No
7951 0 Single-Family Home No
8156 0.01 Single-Family Home Yes
8386 0.01 Single-Family Home No
8565 0 Single-Family Home No
9019 0 Single-Family Home Yes
9160 0 Single-Family Home Yes
9278 7.08 Single-Family Home No
9922 4.26 Single-Family Home Yes
Figure 6 : 1 year of load data for a home (with solar, ID=2361) in Austin.
Figure 7 : 1-week of load data for a home (with solar, ID=2361) in Austin.
4.2.2 California
It is found that all 23 homes in California contained no missing values, but they consist of
different building types: single-family home, townhome or apartment. Only one home in
California had solar generation capacity. The home data are given below.
Table 3 : List of residential buildings from California. Source: Pecan Street Inc, Dataport[57]
House ID Missing Values Building Type Solar Available
203 0 Single-Family Home No
1450 0 Town Home No
1524 0 Single-Family Home No
1731 0 Town Home No
2606 0 Town Home No
3687 0 Town Home No
3864 0 Town Home No
3938 0 Apartment No
4495 0 Apartment No
4934 0 Town Home No
5938 0 Town Home No
6377 0 Apartment No
6547 0 Town Home No
7062 0 Town Home No
7114 0 Town Home No
8061 0 Town Home No
8342 0 Town Home No
8574 0 Apartment No
8733 0 Apartment No
9213 0 Apartment No
9612 0 Town Home No
9775 0 Apartment No
9836 0 Town Home Yes
Figure 8 : 1 year of load data for a home (no solar, ID = 1450) in California
Figure 9 : 1 week of load data for a home (no solar, ID = 1450) in California
4.2.3 New York
In the case of the New York dataset, it was found that for all 24 homes data was present only between January 1, 2019 and October 31, 2019, with large portions of the data missing over similar periods. This can be observed in Figure 10 for home ID=914; other homes show a similar pattern. All the homes are single-family homes, and about half of them have solar generation capacity.
Table 4 : List of residential buildings from New York. Source: Pecan Street Inc, Dataport[57]
House ID % Missing Values Building Type Solar Available
27 0 Single-Family Home Yes
387 0 Single-Family Home Yes
558 0 Single-Family Home No
914 0 Single-Family Home Yes
950 0 Single-Family Home Yes
1222 0 Single-Family Home Yes
1240 0 Single-Family Home No
1417 0 Single-Family Home No
2096 0 Single-Family Home No
2318 0 Single-Family Home No
2358 0 Single-Family Home No
3000 0 Single-Family Home Yes
3488 0 Single-Family Home Yes
3517 0 Single-Family Home Yes
3700 0 Single-Family Home No
3996 0 Single-Family Home No
4283 0 Single-Family Home No
4550 0 Single-Family Home No
5058 0 Single-Family Home Yes
5587 0 Single-Family Home Yes
5679 0 Single-Family Home Yes
5982 0 Single-Family Home No
5997 0 Single-Family Home Yes
9053 0 Single-Family Home No
Figure 10 : 1 year of load data for a home (with solar, ID=914) in New York
Figure 11 : 1-week of load data for a home (with solar, ID=914) in New York
4.3 Feature Engineering
To test the efficacy of the multistep forecasting (24-hour ahead) models with different combinations of features (multivariate forecasting), features had to be manually added to the load data. This experiment is carried out only for a single home in Austin (ID=2361), and the 24-hour ahead prediction is done for both the grid and use data. Both the grid and use data were rescaled from 15-min frequency to 1-hour frequency, as the weather data was available only at the latter frequency. The weather-based features temperature, humidity and pressure, and the time-based features day of the week, weekend/day and holiday are used in this study. All these features cover the time range of 1 January 2018 to 1 January 2019. The weather and holiday data, obtained from an external source, is manually appended to the load data. The 'day of the week' and 'weekend/day' features are constructed from the datetime index of the load file. For the 'day of the week' feature, values of 0 to 6 are assigned to the days Monday to Sunday. For the 'weekend/day' feature, a value of 1 is assigned to weekdays and 0 to weekends.
The above-mentioned features were added to the load data after rescaling, and the Pearson correlations between the load data and the features were studied. It is observed that the time-based features, i.e. 'day of the week' and 'weekend/day', show close to zero correlation with the load data. Similarly, the pressure variable shows minimal correlation with the load data. Thus, these features are not considered in the multivariate-multistep forecast study.
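The time-based feature construction and correlation check described above can be sketched with pandas (a minimal example on synthetic hourly data; the column name 'use' is an assumption):

```python
import pandas as pd

# Synthetic hourly load indexed by datetime, standing in for the rescaled
# 1-hour frequency data.
idx = pd.date_range("2018-01-01", "2019-01-01", freq="h")
df = pd.DataFrame({"use": range(len(idx))}, index=idx)

# 'Day of the week': 0 (Monday) to 6 (Sunday), taken from the datetime index
df["day_of_week"] = df.index.dayofweek

# 'Weekend/day' flag: 1 for weekdays, 0 for weekends
df["weekend_day"] = (df.index.dayofweek < 5).astype(int)

# Pearson correlation of each feature with the load column
correlations = df.corr()["use"]
```

Features whose correlation with the load is near zero, as with 'day of the week' and 'weekend/day' here, can then be dropped before training.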
4.4 Implementation Setup
All the models are developed on top of a Keras API running on top of a TensorFlow version
1.0 backend[59]. The analysis is done using Google Colab (Colaboratory) online compiler
which gives us the access to their external graphic processing units.7 different models are used
in this study to carry out the multiple home analysis that include a multi-layer perceptron,
LSTM and CNN networks. The models have been named MLP-1, CNN-1, CNN-2, CNN-3,
CNN-4, LSTM-1 and LSTM-2 respectively. For all the models, 70 % of the data is used for
training and 30 % of the data is used for testing the model. The MLP-1 model consists of 3
layers with 32 units in the input layer, 16 units in the hidden and single output unit. The other
architectures of the models used is provided in the table.
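The supervised setup used by all these models (a sliding lookback window over the load series with a chronological 70/30 split) can be sketched as follows; this is a minimal illustration on synthetic data, not the study's exact pipeline:

```python
import numpy as np

def make_windows(series, lookback=24):
    # Turn a 1-D load series into (samples, lookback) inputs and
    # next-step targets for single-step (15-min ahead) forecasting.
    X = np.stack([series[i:i + lookback] for i in range(len(series) - lookback)])
    y = series[lookback:]
    return X, y

load = np.sin(np.linspace(0, 20, 1000))  # stand-in for one home's 15-min load
X, y = make_windows(load)                # 24 steps = 6 hours of 15-min data

split = int(0.7 * len(X))                # chronological 70 % train / 30 % test
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]
```

Keeping the split chronological (rather than shuffled) avoids leaking future load values into training.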
Table 5 : Correlation with 'Grid' data
Feature Correlation with 'Grid'
Temperature 0.424
Humidity -0.335
Pressure -0.169
'Day of the week' -0.0049
'Weekend/day' -0.0038

Table 6 : Correlation with 'Use' data
Feature Correlation with 'Use'
Temperature 0.479
Humidity -0.313
Pressure -0.168
'Day of the week' -0.0068
'Weekend/day' -0.0054
Table 8 : CNN model architectures used for multiple home analysis.
CNN Model Filters Kernel Sizes Pooling Filters Hidden Layers
1 [32] [[3]] [2] [1]
2 [64,32] [[3,3]] [2,2] [1]
3 [32,32,64,64,32,32] [[3,3,3,3,3,3,3]] [2,2] [1]
4 [32,8] [[4,4]] [3] [1]
Table 9 : LSTM architectures used for multiple home analysis.
LSTM Model LSTM Units Dropout Hidden Layers
1 [30] 0.2 1
2 [30,15] 0.2,0.2 1
Figure 12 : Code block of CNN-4 architecture compiled in Python in Google Colab
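Since the code listing in Figure 12 is not reproduced here, the following is a hedged Keras sketch of the CNN-4 architecture from Table 8 (filters [32, 8], kernel size 4, one pooling layer of size 3, one hidden dense layer); the exact layer ordering, hidden-layer width and activation choices are assumptions:

```python
import numpy as np
from tensorflow.keras import layers, models

LOOKBACK = 24  # 6 hours of 15-min data, as used for the single-step forecasts

def build_cnn4(lookback=LOOKBACK, n_features=1):
    model = models.Sequential([
        layers.Input(shape=(lookback, n_features)),
        layers.Conv1D(32, kernel_size=4, activation="relu"),
        layers.Conv1D(8, kernel_size=4, activation="relu"),
        layers.MaxPooling1D(pool_size=3),
        layers.Flatten(),
        layers.Dense(16, activation="relu"),  # hidden-layer width assumed
        layers.Dense(1),                      # single-step (15-min ahead) output
    ])
    model.compile(optimizer="adam", loss="mse")
    return model

model = build_cnn4()
pred = model.predict(np.zeros((2, LOOKBACK, 1)), verbose=0)
```

The other CNN variants in Table 8 differ only in the filter counts, kernel sizes and pooling sizes passed to the same layer types.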
4.5 Evaluation Metric
To assess the efficacy of the models, the accuracy metric used is the RMSE, or root mean squared error. It is a scale-dependent accuracy metric.

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n}\left(y_{pred,i} - y_{act,i}\right)^{2}}{n}}$$

where y_pred are the predicted values, y_act are the actual values and n is the number of samples.
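The metric can be computed directly, e.g. with NumPy:

```python
import numpy as np

def rmse(y_act, y_pred):
    # Root mean squared error: scale dependent, in the units of the load (kW)
    y_act, y_pred = np.asarray(y_act, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_pred - y_act) ** 2)))

error = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])  # sqrt(4/3), about 1.155
```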
5. Results and Discussion
5.1 Introduction
This study has been conducted to answer the research questions stated at the end of Section 1. This section describes the results of the experiments for the two major research questions, i.e. evaluating the need for individual deep learning models for each home in a collection of homes, and evaluating multistep-ahead (24-hour ahead) forecasting for an individual home using different features.
5.2 Multiple Home Analysis
In this experiment, 7 different models comprising ANN, CNN and LSTM architectures are used for single-step (15-min ahead) univariate forecasting across all the homes in the 3 locations, i.e. Austin, California and New York, from the Pecan Street data. The models are trained and tested on individual homes, and the forecasting performance is evaluated using RMSE values. To answer the research question, the RMSE values of all the homes for all the models are tabulated and the overall best model is identified as the one with the minimum average RMSE. Then, for each home, the overall best model's RMSE is compared with every other model's RMSE to detect any significant difference from the best model; a significant difference in RMSE would indicate a need for a separate model for that particular home. The multiple home analysis is done as 2 separate case studies: one for Austin and California, which after preprocessing contain no missing values, and a second for New York, where all the homes contain significant chunks of missing data.
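The per-home comparison can be sketched as follows, using the RMSE values of two Austin homes from Table 10 (only two of the seven models are shown for brevity). The percentage difference here is taken relative to the best overall model's RMSE, one formula consistent with the tabulated values:

```python
# Test RMSEs (kW) for two Austin homes, taken from Table 10
rmse = {
    "LSTM-2": {"2335": 0.854, "7719": 1.033},  # best overall model
    "CNN-3":  {"2335": 0.842, "7719": 0.930},
}
best = "LSTM-2"

diffs = {}
for home in rmse[best]:
    min_rmse = min(per_model[home] for per_model in rmse.values())
    # percent difference between best overall model and the per-home minimum
    diffs[home] = round(100 * (rmse[best][home] - min_rmse) / rmse[best][home], 2)

# A difference > 5 % flags a home that may warrant its own model.
```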
5.2.1 Austin and California
5.2.1.1 Austin
The 7 model architectures, trained on each home, are run across all 20 homes in Austin. It is observed that the LSTM-2 architecture shows the best overall performance, whereas MLP-1 shows the worst performance among the 7 models. It is also observed that increasing model complexity by adding more layers, for both CNN and LSTM, does not yield any significant improvement in forecasting performance. The table below shows the test RMSE values obtained for all 7 models on the Austin homes for single-step (15-min ahead) predictions. The best overall model (LSTM-2) is compared with every other model's RMSE for each home, and the percentage difference between the best model's RMSE and the minimum RMSE for that home is noted in the table. In the Austin dataset, only 3 homes show a significant difference (> 5 % in RMSE) between the best overall model and the minimum RMSE for the home.
Table 10 : Test RMSE (in kW) values of 20 homes in Austin.
House ID: Austin | MLP-1 | CNN-1 | CNN-2 | CNN-3 | CNN-4 | LSTM-1 | LSTM-2 | Variance of load (kW2) | Min RMSE for that home (kW) | % Difference between best model and min RMSE for the home
2335 0.897 0.877 0.88 0.842 0.866 0.857 0.854 3.48 0.842 1.41
2361 0.756 0.753 0.753 0.733 0.763 0.761 0.743 3.57 0.733 1.35
2818 0.509 0.487 0.488 0.516 0.493 0.512 0.492 2.33 0.487 1.02
3456 0.555 0.564 0.548 0.547 0.536 0.552 0.533 1.82 0.533 0
3538 0.505 0.483 0.489 0.469 0.464 0.47 0.466 1.11 0.464 0.43
4031 0.651 0.628 0.651 0.639 0.636 0.628 0.62 1.94 0.62 0
4373 0.925 0.917 0.928 0.905 0.913 0.904 0.903 4.33 0.903 0
4767 1.014 0.995 1.003 0.908 0.992 0.974 0.975 5.22 0.908 6.87
5746 0.355 0.328 0.334 0.333 0.323 0.33 0.328 0.74 0.323 1.52
6139 0.821 0.813 0.811 0.809 0.815 0.811 0.81 2.36 0.809 0.12
7536 0.892 0.819 0.779 0.842 0.808 0.802 0.79 2.28 0.779 1.39
7719 1.08 1.034 1.101 0.93 1.054 1.039 1.033 3.63 0.93 9.97
7800 0.801 0.72 0.745 0.69 0.727 0.704 0.696 2 0.69 0.86
7901 0.667 0.588 0.597 0.551 0.583 0.618 0.612 1.24 0.551 9.97
7951 1.13 1.108 1.106 1.082 1.11 1.091 1.095 4.21 1.082 1.19
8156 1.001 0.941 0.909 0.981 0.914 0.937 0.906 4.64 0.906 0
8386 0.421 0.422 0.419 0.4 0.422 0.41 0.409 0.52 0.4 2.2
8565 0.523 0.532 0.522 0.53 0.504 0.508 0.5 1.06 0.5 0
9019 0.466 0.462 0.475 0.442 0.464 0.46 0.455 0.7 0.442 2.86
9160 0.541 0.503 0.504 0.6 0.493 0.513 0.496 1.17 0.493 0.6
Avg 0.7255 0.6987 0.7021 0.6874 0.6939 0.694 0.686
Table 11 : Homes showing significant and moderate differences in RMSE values w.r.t best model.
Homes with significant RMSE difference (> 5 %) 3
Homes with a moderate RMSE difference (2-5 %) 2
Total homes in Austin dataset 20
Figures 13, 14 and 15 show the actual and predicted load values for MLP-1, CNN-3 and LSTM-2 over 200 timesteps. Only a slight variation in performance is visible in the figures; on closer inspection, however, CNN-3 and LSTM-2 predict the peaks more accurately than MLP-1.
Figure 13 : Predicted and actual values for MLP-1 for a home in Austin (ID=4031)
Figure 14 :Predicted and actual values for CNN-3 for a home in Austin (ID=4031).
Figure 15 : Predicted and actual values for LSTM-2 for a home in Austin (ID=4031).
5.2.1.2 California
In the California homes dataset, as in Austin, LSTM-2 and MLP-1 show the best and worst performance respectively, although CNN-4 shows the best performance among the CNN models. As with the Austin dataset, increasing model complexity does not improve performance; in fact, LSTM-2 and CNN-4 perform better than their more complex counterparts. The overall average RMSE is significantly lower than for the Austin dataset, which can be explained by the fact that all but one of the homes in California have no solar generation capacity. After comparing the best overall model (LSTM-2) with the minimum RMSE for each home, it is found that only 2 homes show a significant difference from the best overall model.
Table 12 : Test RMSE (in kW) values of 23 homes in California
House ID | MLP-1 | CNN-1 | CNN-2 | CNN-3 | CNN-4 | LSTM-1 | LSTM-2 | Variance of load (kW2) | Min RMSE for that home (kW) | % Difference between best model and min RMSE for the home
203 0.43 0.393 0.396 0.381 0.392 0.405 0.396 0.24 0.381 3.79
1450 0.674 0.659 0.68 0.614 0.659 0.671 0.664 0.85 0.614 7.53
1524 0.486 0.47 0.479 0.477 0.461 0.433 0.433 0.47 0.433 0
1731 0.222 0.23 0.195 0.194 0.237 0.199 0.195 0.3 0.194 0.51
2606 0.609 0.601 0.58 0.539 0.594 0.579 0.585 0.76 0.539 7.86
3687 0.387 0.389 0.42 0.403 0.377 0.386 0.382 0.53 0.377 1.31
3864 0.388 0.371 0.384 0.429 0.377 0.369 0.381 0.3 0.369 3.15
3938 0.195 0.187 0.192 0.211 0.189 0.188 0.187 0.08 0.187 0
4495 0.107 0.105 0.106 0.108 0.105 0.104 0.106 0.03 0.104 1.89
4934 0.16 0.159 0.156 0.153 0.157 0.16 0.159 0.16 0.153 3.77
5938 0.162 0.159 0.166 0.171 0.161 0.176 0.158 0.04 0.158 0
6377 0.164 0.167 0.166 0.167 0.168 0.165 0.164 0.11 0.164 0
6547 0.355 0.367 0.375 0.373 0.372 0.369 0.368 0.17 0.355 3.53
7062 0.408 0.393 0.399 0.388 0.398 0.39 0.392 0.35 0.388 1.02
7114 0.402 0.396 0.398 0.477 0.384 0.373 0.373 0.35 0.373 0
8061 0.323 0.335 0.324 0.328 0.337 0.326 0.322 0.3 0.322 0
8342 0.253 0.258 0.258 0.256 0.251 0.258 0.258 0.31 0.251 2.71
8574 0.272 0.268 0.277 0.277 0.264 0.268 0.267 0.2 0.264 1.12
8733 0.253 0.252 0.248 0.237 0.25 0.246 0.244 0.39 0.237 2.87
9213 0.188 0.191 0.201 0.197 0.192 0.196 0.187 0.13 0.187 0
9612 0.279 0.29 0.279 0.277 0.285 0.282 0.283 0.49 0.277 2.12
9775 0.238 0.231 0.227 0.226 0.228 0.226 0.226 0.21 0.226 0
9836 .192 0.186 0.192 0.221 0.198 0.195 0.193 0.45 0.186 3.63
Average 0.3102 0.307 0.3086 0.3086 0.3049 0.3027 0.3020
Table 13 : List of homes showing significant and moderate differences in RMSE values w.r.t best model.
Homes with significant RMSE difference (> 5 %) 2
Homes with a moderate RMSE difference (2-5 %) 8
Total homes in California dataset 23
Figures 16, 17 and 18 show the actual and predicted load values of MLP-1, CNN-3 and LSTM-2 for a home in California. From the three figures, it is apparent that the deep learning models outperform the simple multilayer perceptron model.
Figure 16 : Predicted and actual values for MLP-1 for a home in California (ID=1731)
Figure 17 : Predicted and actual values for CNN-3 for a home in California (ID=1731)
Figure 18 : Predicted and actual values for LSTM-2 for a home in California (ID=1731)
5.2.2 New York
In the case of the New York dataset, it is observed that all 24 homes contain large chunks of missing data in the first half of the year, as shown in Figure 10. The 7 models are run on these homes under the reasonable assumption that the missing values should not affect single-step prediction, since the lookback window is only 6 hours, or 24 timesteps. Interestingly, CNN-3 shows the best overall performance on the New York dataset. After comparing the best overall model for New York (CNN-3) with the minimum RMSE for each home, it is found that 7 homes show a significant difference from the best overall model.
From the table below, it can be observed that the CNN-3 model tends to do better for homes with high variances, while the LSTM-2 model tends to perform best for homes with low variances. It is also observed that, despite the large chunks of missing data in the initial months, the models produce RMSE values comparable to those observed for the Austin dataset. This could be because, for single-step ahead prediction with a 6-hour lookback window, the large missing chunks of data are not a significant enough factor to affect performance.
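The thesis does not detail how windows overlapping the gaps are handled; one plausible approach, shown here as an assumption, is to discard any training window that touches missing data:

```python
import numpy as np

def make_windows_skipping_gaps(series, lookback=24):
    # Build single-step windows, discarding any window that touches a gap (NaN)
    X, y = [], []
    for i in range(len(series) - lookback):
        window = series[i:i + lookback + 1]
        if not np.isnan(window).any():
            X.append(window[:-1])
            y.append(window[-1])
    return np.array(X), np.array(y)

series = np.ones(100)
series[40:60] = np.nan  # a large missing chunk, as in the New York homes
X, y = make_windows_skipping_gaps(series)
```

Because the lookback is short, even large gaps only invalidate the windows that span them, which is consistent with the comparable RMSE values observed here.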
Table 14 : Test RMSE (in kW) values of 24 homes in New York.
House ID | MLP-1 | CNN-1 | CNN-2 | CNN-3 | CNN-4 | LSTM-1 | LSTM-2 | Variance of load (kW2) | Min RMSE for that home (kW) | % Difference between best model and min RMSE for the home
27 0.855 0.926 0.958 0.793 0.836 0.815 0.945 10.36 0.793 0
387 0.316 0.316 0.308 0.307 0.31 0.313 0.34 0.69 0.307 0
558 0.351 0.352 0.355 0.352 0.345 0.35 0.557 0.75 0.345 1.99
914 0.704 0.654 0.683 0.657 0.666 0.705 0.896 6.69 0.654 0.46
950 2.133 2.039 2.086 1.926 2.017 2.036 2.136 23.5 1.926 0
1222 1.106 1.059 1.055 1.05 1.061 1.069 1.027 4.67 1.027 2.19
1240 0.514 0.504 0.51 0.516 0.498 0.499 0.377 0.4 0.377 26.94
1417 0.4 0.395 0.391 0.389 0.397 0.394 0.481 0.57 0.389 0
2096 0.583 0.549 0.554 0.573 0.549 0.555 0.54 0.35 0.54 5.76
2318 0.402 0.396 0.398 0.402 0.397 0.395 0.385 0.4 0.385 4.23
2358 0.366 0.355 0.356 0.36 0.361 0.36 0.278 0.16 0.278 22.78
3000 0.665 0.62 0.61 0.602 0.642 0.639 0.677 3.83 0.602 0
3488 0.518 0.491 0.505 0.496 0.502 0.503 0.684 1.46 0.491 1.01
3517 0.424 0.396 0.399 0.389 0.391 0.384 0.475 1.53 0.384 1.29
3700 0.163 0.166 0.16 0.159 0.164 0.164 0.13 0.06 0.13 18.24
3996 0.701 0.699 0.695 0.71 0.701 0.702 0.706 0.71 0.695 2.11
4283 0.526 0.503 0.502 0.512 0.504 0.501 0.469 0.67 0.469 8.4
4550 0.348 0.343 0.344 0.351 0.344 0.346 0.284 0.23 0.284 19.09
5058 0.46 0.459 0.451 0.446 0.467 0.464 0.436 1.47 0.436 2.24
5587 0.666 0.651 0.661 0.646 0.656 0.667 0.648 2.76 0.646 0
5679 0.857 0.826 0.847 0.815 0.828 0.837 0.827 3.19 0.815 0
5982 0.43 0.429 0.426 0.42 0.428 0.425 0.472 0.76 0.42 0
5997 0.761 0.751 0.861 0.744 0.796 0.749 0.795 5.67 0.744 0
9053 0.402 0.398 0.397 0.402 0.401 0.403 0.347 0.61 0.347 13.68
Average 0.6104 0.5948 0.6033 0.5853 0.5942 0.5947 0.5928
Table 15 : List of homes showing significant and moderate differences in RMSE values w.r.t best model.
Homes with significant RMSE difference (> 5 %) 7
Homes with a moderate RMSE difference (2-5 %) 4
Total homes in New York dataset 24
Figure 19 : Predicted and actual values for MLP-1 for a home in New York (ID=27)
Figure 20 : Predicted and actual values for CNN-3 for a home in New York (ID=27)
Figure 21 : Predicted and actual values for LSTM-2 for a home in New York (ID=27)
5.2.3 Variation of RMSE with Variance
The minimum RMSE for each of the homes in the Austin and California dataset (total = 43)
were plotted. The New York dataset is not considered for this analysis as it contains large
missing chunks of data for all the homes. A linear relationship is observed between the variance
of the home and minimum RMSE. Higher variance in the load is shown to give poor minimum
RMSE values.
Figure 22 : Variance vs Min. RMSE for all the homes in Austin and California
The variance, minimum RMSE and corresponding best model for all the homes in Austin and California are noted in the table below, arranged in increasing order of variance. Although these results are not conclusive, it is observed that for homes with higher variance the CNN models tend to perform better, while for homes with lower variance the LSTM models tend to work better.
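The fit in Figure 22 can be reproduced in principle as follows; the values below are a small made-up subset for illustration, not the study's 43-home data:

```python
import numpy as np

# Illustrative (made-up) pairs of minimum RMSE (kW) and load variance (kW^2)
min_rmse = np.array([0.10, 0.19, 0.38, 0.49, 0.69, 0.93, 1.08])
variance = np.array([0.03, 0.08, 0.24, 0.49, 0.70, 3.63, 4.21])

# Least-squares line of variance against minimum RMSE, and its R^2
slope, intercept = np.polyfit(min_rmse, variance, 1)
fitted = slope * min_rmse + intercept
ss_res = np.sum((variance - fitted) ** 2)
ss_tot = np.sum((variance - variance.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot
```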
(The linear fit in Figure 22 has R² = 0.8382.)
Table 16 : Best models for each home in Austin and California with their variances
Min RMSE (in kW) Variance (in kW2), lowest to highest Best Model Best Model Type
0.104 0.03 LSTM-1 LSTM
0.158 0.04 LSTM-2 LSTM
0.187 0.08 LSTM-2 LSTM
0.164 0.11 LSTM-2 LSTM
0.187 0.13 LSTM-2 LSTM
0.153 0.16 CNN-3 CNN
0.355 0.17 LSTM-2 LSTM
0.264 0.2 CNN-4 CNN
0.226 0.21 LSTM-2 LSTM
0.381 0.24 CNN-3 CNN
0.194 0.3 CNN-3 CNN
0.369 0.3 LSTM-1 LSTM
0.322 0.3 LSTM-2 LSTM
0.251 0.31 CNN-4 CNN
0.388 0.35 CNN-3 CNN
0.373 0.35 LSTM-2 LSTM
0.237 0.39 CNN-3 CNN
0.186 0.45 CNN-1 CNN
0.433 0.47 LSTM-2 LSTM
0.277 0.49 CNN-3 CNN
0.4 0.52 CNN-3 CNN
0.377 0.53 CNN-4 CNN
0.442 0.7 CNN-3 CNN
0.323 0.74 CNN-4 CNN
0.539 0.76 CNN-3 CNN
0.614 0.85 CNN-3 CNN
0.5 1.06 LSTM-2 LSTM
0.464 1.11 CNN-4 CNN
0.493 1.17 CNN-4 CNN
0.551 1.24 CNN-3 CNN
0.533 1.82 LSTM-2 LSTM
0.62 1.94 LSTM-2 LSTM
0.69 2 CNN-3 CNN
0.779 2.28 CNN-2 CNN
0.487 2.33 CNN-1 CNN
0.809 2.36 CNN-3 CNN
0.842 3.48 CNN-3 CNN
0.733 3.57 CNN-3 CNN
0.93 3.63 CNN-3 CNN
1.082 4.21 CNN-3 CNN
0.903 4.33 LSTM-2 LSTM
0.906 4.64 LSTM-2 LSTM
0.908 5.22 CNN-3 CNN
5.3 Multistep ahead forecasting
In this experiment, the effect of adding weather-based variables on multistep-ahead (24-hour ahead) forecasting is studied. The experiments are carried out on a single home in Austin (ID=2361) with solar generation capacity, using 1-hour frequency data. The 24-hour ahead forecasting is done for both the 'grid' and 'electricity use' data of the home. After trial and error, the best CNN model is identified for the multistep forecast, and the effects of adding the weather features and changing the length of the sliding window are studied in this section.
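The multistep setup can be sketched as a sliding-window transform that produces a 24-value output vector per sample (a minimal illustration; the 168-hour lookback shown is one value from the range explored in Section 5.3.3):

```python
import numpy as np

def make_multistep_windows(data, lookback=168, horizon=24):
    # data: (timesteps, n_features); column 0 is the load series to forecast
    X, Y = [], []
    for i in range(len(data) - lookback - horizon + 1):
        X.append(data[i:i + lookback])                          # input window
        Y.append(data[i + lookback:i + lookback + horizon, 0])  # next 24 hours
    return np.array(X), np.array(Y)

hours = 1000
load = np.arange(hours, dtype=float).reshape(-1, 1)  # stand-in hourly load
temp = np.arange(hours, dtype=float).reshape(-1, 1)  # stand-in temperature
X, Y = make_multistep_windows(np.hstack([load, temp]))
```

Adding a weather feature simply adds a column to the stacked input; the target remains the load column alone.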
5.3.1 Grid Data Forecasting
The CNN model is used for forecasting grid data for the home (ID=2361). Using different combinations of features for 24-hour ahead forecasting, it is found that adding weather-based features such as 'temperature' and 'humidity' to the grid data does not yield any performance improvement compared to using only the 'grid' data for forecasting. This could be because 'temperature' and 'humidity' do not show a strong correlation with the grid data. It is also observed that the model with all the features combined shows the worst results, which could be due to overfitting.
Table 17 : RMSE values with different combination of features for 'grid' data
Features Used Test RMSE Values (in kW)
Only Grid 0.9120
Grid with Temperature 0.9367
Grid with Humidity 0.9410
Grid with Temperature & Humidity 0.9841
Figure 23 : 24-hr ahead forecast with 'Only grid'
Figure 24 : 24-hr ahead forecast with 'Grid' and 'Temperature'
Figure 25 : 24-hr ahead forecast with 'Grid' and 'Humidity'
Figure 26 : 24-hr ahead forecast with all the features.
5.3.2 Use Data Forecasting
The same CNN model used for grid forecasting is used for forecasting the 24-hour ahead 'electricity use' data. Similar to the results obtained for the 'grid' data, it is found that adding the weather-based features 'temperature' and 'humidity' does not improve forecasting performance. Although the correlation of 'temperature' with 'use' is higher than with 'grid', as shown in Tables 5 and 6, it is still not strong enough to improve the forecasting performance.
Table 18 : RMSE values with different combinations of features for 'use' data
Features Used Test RMSE Values (in kW)
Only Use 0.5090
Use with Temperature 0.5115
Use with Humidity 0.5244
Use with Temperature & Humidity 0.5419
Figure 27 : 24-hr ahead forecast for 'Use'
Figure 28 : 24-hr ahead forecast for Use and 'Temperature'
Figure 29 : 24-hr ahead forecast for 'Use' and 'Humidity'
Figure 30 : 24-hr ahead forecast for all the features.
5.3.3 Variation with Lookback
In this experiment, the effect of different 'lookback' values, i.e. the number of timesteps (hours, in this case) used as input to the forecast, is analyzed. The effect is studied for predicting the 'electricity use' values of a home in Austin (ID=2361). It is observed that the ideal lookback for a 24-hour ahead forecast lies in the range of 4-8 days (96-192 hours). This behaviour is observed in 3 experiments involving forecasting using 'Only Use', 'Use and Temperature' and 'Use and Humidity', as shown in Figures 31, 32 and 33 below. Similar results are obtained for 24-hour ahead forecasting using the 'grid' data. Thus, a lookback of 4-8 days gives the best results for 24-hour ahead forecasts.
Figure 31 : RMSE vs Lookback for 24-hr forecast using 'Only Use'
Figure 32 : RMSE vs Lookback for 24-hr forecast using 'Use' and 'Temperature'
Figure 33 : RMSE vs Lookback for 24-hr forecast using 'Use' and 'Humidity'
6. Conclusions
With the growth of smart grids, smart homes and decentralized energy production, load forecasting at the individual building level becomes an increasingly important task. Deep learning models have been shown to surpass traditional statistical as well as hybrid models, and with more data available and increased computational power, their performance is only set to improve. This study reinforces the use of deep learning models for building energy forecasting. Whereas other studies focus on models for an individual building or an aggregate building load, this study addresses that gap by analyzing short-term load forecasting on communities of residential buildings. The following conclusions are drawn from this study:
1. It was observed that the deep learning models outperform the ANN model in all cases.
2. For the multiple home analysis, it is found that LSTM-2 was the best overall model for Austin and California. Only 5 homes out of a total of 43 (Austin and California combined) show a significant difference (> 5 % in RMSE) from the best-performing overall model. This indicates that there is no pressing need for individual models for each home: the best overall model can be applied across all homes with satisfactory results.
3. For the multiple home analysis, it was also found that the CNN models give better performance for homes with higher variance, while the LSTM models perform better for homes with lower variance.
4. In the case of the New York dataset, all the homes have large chunks of missing data in the initial months of the year. It is found that the missing data does not affect performance for single-step forecasting, possibly because the lookback used is only 6 hours.
5. In the case of 24-hour ahead forecasting, it is found that adding weather-based features such as temperature and humidity did not improve forecasting performance. This could be because the load does not correlate strongly enough with the weather-based features; the literature[40] shows that adding temperature can improve forecasting performance when the correlation between temperature and load is as high as 0.74.
6. For the 24-hour forecast, it is also observed that forecast performance depends on the lookback window used; a lookback window in the range of 4 to 8 days gives the best results.
7. A forecasting competition on a public residential dataset could be used to compare the different models already published, as most studies focus on an individual home and do not report results on more than a single home.
References
1. Omer, A.M., Energy use and environmental impacts: A general review. Journal of renewable and Sustainable Energy, 2009. 1(5): p. 053101.
2. Amasyali, K. and N.M. El-Gohary, A review of data-driven building energy consumption prediction studies. Renewable and Sustainable Energy Reviews, 2018. 81: p. 1192-1205.
3. Namlı, E., H. Erdal, and H.I. Erdal, Artificial Intelligence-Based Prediction Models for Energy Performance of Residential Buildings, in Recycling and Reuse Approaches for Better Sustainability. 2019, Springer. p. 141-149.
4. Li, K., et al., A hybrid teaching-learning artificial neural network for building electrical energy consumption prediction. Energy and Buildings, 2018. 174: p. 323-334.
5. Rahman, A., V. Srikumar, and A.D. Smith, Predicting electricity consumption for commercial and residential buildings using deep recurrent neural networks. Applied energy, 2018. 212: p. 372-385.
6. Dong, B., et al., A hybrid model approach for forecasting future residential electricity consumption. Energy and Buildings, 2016. 117: p. 341-351.
7. Li, K., et al., Building's electricity consumption prediction using optimized artificial neural networks and principal component analysis. Energy and Buildings, 2015. 108: p. 106-113.
8. Um, T.T., V. Babakeshizadeh, and D. Kulić. Exercise motion classification from large-scale wearable sensor data using convolutional neural networks. in 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). 2017. IEEE.
9. Voß, M., C. Bender-Saebelkampf, and S. Albayrak. Residential short-term load forecasting using convolutional neural networks. in 2018 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm). 2018. IEEE.
10. American Public Power Association. More accurate load forecasts help utilities save. 2018, September 25; Available from: https://www.publicpower.org/periodical/article/more-accurate-load-forecasts-help-utilities-save.
11. Gillies, D., B. Bernholtz, and P. Sandiford, A New Approach to Forecasting Daily Peak Loads. Transactions of the American Institute of Electrical Engineers. Part III: Power Apparatus and Systems, 1956. 75(3): p. 382-387.
12. Che, J. and J. Wang, Short-term load forecasting using a kernel-based support vector regression combination model. Applied energy, 2014. 132: p. 602-609.
13. Bedi, J. and D. Toshniwal, Deep learning framework to forecast electricity demand. Applied Energy, 2019. 238: p. 1312-1326.
14. He, F., et al., A hybrid short-term load forecasting model based on variational mode decomposition and long short-term memory networks considering relevant factors with Bayesian optimization algorithm. Applied energy, 2019. 237: p. 103-116.
15. Wu, Z., et al., A hybrid model based on modified multi-objective cuckoo search algorithm for short-term load forecasting. Applied energy, 2019. 237: p. 896-909.
16. Sadaei, H.J., et al., Short-term load forecasting by using a combined method of convolutional neural networks and fuzzy time series. Energy, 2019. 175: p. 365-377.
17. Liang, Y., D. Niu, and W.-C. Hong, Short term load forecasting based on feature extraction and improved general regression neural network model. Energy, 2019. 166: p. 653-663.
18. Bianchi, F.M., et al., Short-term electric load forecasting using echo state networks and PCA decomposition. Ieee Access, 2015. 3: p. 1931-1943.
19. Johannesen, N.J., M. Kolhe, and M. Goodwin, Relative evaluation of regression tools for urban area electrical energy demand forecasting. Journal of cleaner production, 2019. 218: p. 555-564.
20. Hayes, B., J. Gruber, and M. Prodanovic. Short-term load forecasting at the local level using smart meter data. in 2015 IEEE Eindhoven PowerTech. 2015. IEEE.
21. Sauter, P.S., et al. Load Forecasting in Distribution Grids with High Renewable Energy Penetration for Predictive Energy Management Systems. in 2018 IEEE PES Innovative Smart Grid Technologies Conference Europe (ISGT-Europe). 2018. IEEE.
22. Ebrahim, A.F. and O.A. Mohammed, Pre-Processing of Energy Demand Disaggregation Based Data Mining Techniques for Household Load Demand Forecasting. Inventions, 2018. 3(3): p. 45.
23. Wang, Y., et al., Review of smart meter data analytics: Applications, methodologies, and challenges. IEEE Transactions on Smart Grid, 2018. 10(3): p. 3125-3148.
24. Srivastava, A., A.S. Pandey, and D. Singh. Short-term load forecasting methods: A review. in 2016 International Conference on Emerging Trends in Electrical Electronics & Sustainable Energy Systems (ICETEESES). 2016. IEEE.
25. Mocanu, E., et al., Deep learning for estimating building energy consumption. Sustainable Energy, Grids and Networks, 2016. 6: p. 91-99.
26. Kong, W., et al., Short-term residential load forecasting based on resident behaviour learning. IEEE Transactions on Power Systems, 2017. 33(1): p. 1087-1088.
27. Alberg, D. and M. Last, Short-term load forecasting in smart meters with sliding window-based ARIMA algorithms. Vietnam Journal of Computer Science, 2018. 5(3-4): p. 241-249.
28. Chakhchoukh, Y., P. Panciatici, and P. Bondon. Robust estimation of SARIMA models: Application to short-term load forecasting. in 2009 IEEE/SP 15th Workshop on Statistical Signal Processing. 2009. IEEE.
29. Bercu, S. and F. Proïa, A SARIMAX coupled modelling applied to individual load curves intraday forecasting. Journal of Applied Statistics, 2013. 40(6): p. 1333-1348.
30. Hu, Z., Y. Bao, and T. Xiong, Electricity load forecasting using support vector regression with memetic algorithms. The Scientific World Journal, 2013. 2013.
31. Jain, A. and B. Satish. Clustering based short term load forecasting using support vector machines. in 2009 IEEE Bucharest PowerTech. 2009. IEEE.
32. Voß, M., A. Haja, and S. Albayrak. Adjusted feature-aware k-nearest neighbors: Utilizing local permutation-based error for short-term residential building load forecasting. in 2018 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm). 2018. IEEE.
33. Li, K., H. Su, and J. Chu, Forecasting building energy consumption using neural networks and hybrid neuro-fuzzy system: A comparative study. Energy and Buildings, 2011. 43(10): p. 2893-2899.
34. Baccouche, M., et al. Sequential deep learning for human action recognition. in International workshop on human behavior understanding. 2011. Springer.
35. Yu, D. and L. Deng, Deep learning and its applications to signal and information processing. IEEE Signal Processing Magazine, 2010. 28(1): p. 145-154.
36. Marino, D.L., K. Amarasinghe, and M. Manic. Building energy load forecasting using deep neural networks. in IECON 2016-42nd Annual Conference of the IEEE Industrial Electronics Society. 2016. IEEE.
37. Amarasinghe, K., D.L. Marino, and M. Manic. Deep neural networks for energy load forecasting. in 2017 IEEE 26th International Symposium on Industrial Electronics (ISIE). 2017. IEEE.
38. Fan, C., et al., Deep learning-based feature engineering methods for improved building energy prediction. Applied Energy, 2019. 240: p. 35-45.
39. Fan, C., et al., Assessment of deep recurrent neural network-based strategies for short-term building energy predictions. Applied Energy, 2019. 236: p. 700-710.
40. Cai, M., M. Pipattanasomporn, and S. Rahman, Day-ahead building-level load forecasts using deep learning vs. traditional time-series techniques. Applied Energy, 2019. 236: p. 1078-1088.
41. Jiao, R., et al., Short-term non-residential load forecasting based on multiple sequences LSTM recurrent neural network. IEEE Access, 2018. 6: p. 59438-59448.
42. Shi, H., M. Xu, and R. Li, Deep learning for household load forecasting—A novel pooling deep RNN. IEEE Transactions on Smart Grid, 2017. 9(5): p. 5271-5280.
43. Ryu, S., J. Noh, and H. Kim, Deep neural network based demand side short term load forecasting. Energies, 2017. 10(1): p. 3.
44. Elvers, A., M. Voß, and S. Albayrak. Short-term probabilistic load forecasting at low aggregation levels using convolutional neural networks. in 2019 IEEE Milan PowerTech. 2019. IEEE.
45. LeCun, Y., Y. Bengio, and G. Hinton, Deep learning. Nature, 2015. 521(7553): p. 436-444.
46. Hinton, G.E., S. Osindero, and Y.-W. Teh, A fast learning algorithm for deep belief nets. Neural Computation, 2006. 18(7): p. 1527-1554.
47. Silver, D., et al., Mastering the game of Go without human knowledge. Nature, 2017. 550(7676): p. 354-359.
48. Cao, Y., et al., Predicting Long-Term Health-Related Quality of Life after Bariatric Surgery Using a Conventional Neural Network: A Study Based on the Scandinavian Obesity Surgery Registry. Journal of Clinical Medicine, 2019. 8(12): p. 2149.
49. Goodfellow, I., Y. Bengio, and A. Courville, Deep learning. 2016: MIT Press.
50. Olah, C. Understanding LSTM Networks. 2015 [cited 2020]; Available from: http://colah.github.io/posts/2015-08-Understanding-LSTMs/.
51. Garg, R., et al. Unsupervised CNN for single view depth estimation: Geometry to the rescue. in European Conference on Computer Vision. 2016. Springer.
52. Zhang, Y., S. Roller, and B. Wallace, MGNC-CNN: A simple approach to exploiting multiple word embeddings for sentence classification. arXiv preprint arXiv:1603.00968, 2016.
53. Gawehn, E., J.A. Hiss, and G. Schneider, Deep learning in drug discovery. Molecular informatics, 2016. 35(1): p. 3-14.
54. Pao, J.J. and D.S. Sullivan, Time Series Sales Forecasting. Final Year Project, 2017.
55. Kingma, D.P. and J. Ba, Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
56. Chollet, F. Keras: Deep Learning for humans. 2015 [cited 2020]; Available from: https://github.com/keras-team/keras.
57. Pecan Street Inc. Dataport: Researcher access to Pecan Street's groundbreaking energy and water data. [cited 2020]; Available from: https://www.pecanstreet.org/dataport/.
58. OpenWeather Ltd. OpenWeatherMap. 2020 [cited March 20, 2020]; Available from: https://openweathermap.org/.
59. TensorFlow: An end-to-end open source machine learning platform. [cited 2019]; Available from: https://www.tensorflow.org/.