BIG DATA SPAIN 2016
© 2015 The MathWorks, Inc.
Turning an idea into a Data-Driven Production System
An Energy Load Forecasting Case Study
Lucas García
Senior Application Engineer
MathWorks
What is Energy Forecasting?
From Wikipedia:
Energy forecasting is a broad term that refers to
"forecasting in the energy industry".
It includes - but is not limited to - forecasting demand
(load) and price of electricity, fossil fuels (natural
gas, oil, coal) and renewable energy sources (RES;
hydro, wind, solar).
What is Data Analytics?
• What happened? Descriptive
• Why did it happen? Diagnostic
• What will happen? Predictive
• What should be done? Prescriptive
Turn large volumes of complex data into actionable information
Data → Decisions
Data Analytics – Using Data to Make Better Decisions
• Access and Explore Data
• Preprocess Data
• Develop Predictive Models
• Integrate Analytics with Systems
Case Study: Day-Ahead Energy Load Forecasting
Goal:
Implement a tool for easy and accurate computation of day-ahead system load forecast
Requirements:
– Acquire and clean data from multiple sources
– Accurate predictive model
– Easily deploy to production environment
The Data
• NYISO Energy Load Data: mis.nyiso.com/public/
• National Climatic Data Center Weather Data: cdo.ncdc.noaa.gov/qclcd_ascii/
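Using load and weather data together usually means aligning both on a common hourly timestamp. The following is a minimal Python sketch of that alignment, not the method used in the talk. The record layout (timestamp plus one value) and all sample values are made up for illustration and do not reflect the actual NYISO or NCDC file formats.

```python
from datetime import datetime


def join_on_hour(load_rows, weather_rows):
    """Inner-join load and weather records that fall in the same clock hour."""
    # Index weather readings by their timestamp truncated to the hour.
    weather_by_hour = {}
    for ts, temp_c in weather_rows:
        hour = ts.replace(minute=0, second=0, microsecond=0)
        weather_by_hour[hour] = temp_c

    joined = []
    for ts, load_mw in load_rows:
        hour = ts.replace(minute=0, second=0, microsecond=0)
        if hour in weather_by_hour:
            joined.append((hour, load_mw, weather_by_hour[hour]))
    return joined


# Hypothetical sample records: (timestamp, load in MW) and (timestamp, temperature in C).
load_rows = [
    (datetime(2016, 1, 1, 0, 0), 5000.0),
    (datetime(2016, 1, 1, 1, 0), 4800.0),
    (datetime(2016, 1, 1, 2, 0), 4700.0),
]
weather_rows = [
    (datetime(2016, 1, 1, 0, 51), 2.0),
    (datetime(2016, 1, 1, 1, 51), 1.5),
]

joined = join_on_hour(load_rows, weather_rows)
```

Hours without a weather reading are dropped here; a production pipeline might interpolate them instead.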
Data Analytics Workflow
• Access and Explore Data: Files, Databases, Sensors
• Preprocess Data: Working with Messy Data, Data Reduction/Transformation, Feature Extraction
• Develop Predictive Models: Model Creation (e.g. Machine Learning), Model Validation, Parameter Optimization
• Integrate Analytics with Systems: Desktop Apps, Enterprise Scale Systems, Embedded Devices and Hardware
Data Analytics Workflow: Access, Explore and Preprocess Data
Business and Transactional Data
– Repositories: SQL, NoSQL, etc.
– File I/O: Text, Spreadsheet, etc.
– Web Sources: RESTful, JSON, etc.
Engineering, Scientific and Field Data
– Real-Time Sources: Sensors, GPS, etc.
– File I/O: Image, Audio, etc.
– Communication Protocols: OPC (OLE for Process Control), CAN (Controller Area Network), etc.
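Sources such as text files and JSON web responses generally have to be brought into one common record shape before they can be explored together. A minimal Python sketch of that step follows. The field names `time` and `load_mw` and the sample payloads are hypothetical and not taken from the talk.

```python
import csv
import io
import json


def records_from_csv(text):
    """Parse CSV text with a header row into a list of uniform records."""
    reader = csv.DictReader(io.StringIO(text))
    return [{"time": row["time"], "load_mw": float(row["load_mw"])} for row in reader]


def records_from_json(text):
    """Parse a JSON array of objects into the same record shape."""
    payload = json.loads(text)
    return [{"time": item["time"], "load_mw": float(item["load_mw"])} for item in payload]


# Hypothetical inputs standing in for a text file and a web response.
csv_text = "time,load_mw\n2016-01-01T00:00,5000\n2016-01-01T01:00,4800\n"
json_text = '[{"time": "2016-01-01T02:00", "load_mw": 4700}]'

records = records_from_csv(csv_text) + records_from_json(json_text)
```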
Data Analytics Workflow
Challenges: Access and Explore Data, Preprocess Data
• Data aggregation
– Different sources (files, web, etc.)
– Different types (images, text, audio, etc.)
• Data clean up
– Poorly formatted files
– Irregularly sampled data
– Redundant data, outliers, missing data, etc.
• Data specific processing
– Signals: smoothing, resampling, denoising, wavelet transforms, etc.
– Images: image registration, morphological filtering, deblurring, etc.
• Dealing with out of memory data (big data)
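Two of the clean-up issues above, missing data and outliers, can be illustrated in a few lines of Python. This sketch assumes linear interpolation for gaps and a median absolute deviation (MAD) rule for outliers. Both techniques, the threshold of 5, and the sample series are illustrative choices, not the ones used in the talk.

```python
from statistics import median


def fill_missing(values):
    """Linearly interpolate interior None gaps; edge gaps take the nearest known value."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("no known values to interpolate from")
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            frac = (i - left) / (right - left)
            filled[i] = values[left] + frac * (values[right] - values[left])
    return filled


def flag_outliers(values, threshold=5.0):
    """Flag points whose distance from the median exceeds threshold times the MAD."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return [False] * len(values)
    return [abs(v - med) / mad > threshold for v in values]


# Made-up hourly load readings with two gaps and one spike.
raw = [100.0, None, 104.0, 106.0, 900.0, 110.0, None]
clean = fill_missing(raw)
outliers = flag_outliers(clean)
```

Here only the 900.0 reading is flagged. Whether a flagged point is removed, capped, or kept is a judgment for the domain expert.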
Data Analytics Workflow
1. MATLAB Analytics work with business and engineering data
• Point and click tools to access a variety of data sources (files, signals, databases, images)
• High-performance environment for big data
• Built-in algorithms for data preprocessing, including sensor, image, audio, video and other real-time data
Data Analytics Workflow
Challenges: Develop Predictive Models
• Lack of data science expertise
• Feature Extraction: How to transform data to best represent the system?
– Requires subject matter expertise
– No right way of designing features
• Feature Selection: What attributes or subset of data to use?
– Entails a lot of iteration (trial and error)
– Difficult to evaluate features
• Model Development
– Many different models
– Model Validation and Tuning
• Time required to conduct the analysis
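Feature extraction and model validation for a day-ahead load forecast can be sketched as follows. The features (hour of day, weekday, load 24 hours earlier), the naive "same hour yesterday" baseline, and the MAPE metric are assumptions chosen for illustration. The talk does not specify its features or models, and the series below is synthetic.

```python
from datetime import datetime, timedelta


def make_features(timestamps, loads, lag_hours=24):
    """Build (features, target) rows: hour of day, weekday, load lag_hours earlier."""
    rows = []
    for i in range(lag_hours, len(loads)):
        ts = timestamps[i]
        features = (ts.hour, ts.weekday(), loads[i - lag_hours])
        rows.append((features, loads[i]))
    return rows


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


# Synthetic hourly series (made up): a daily ramp with a small day-to-day increase.
start = datetime(2016, 1, 4)
timestamps = [start + timedelta(hours=h) for h in range(96)]
loads = [5000.0 + 100.0 * (h % 24) + 10.0 * (h // 24) for h in range(96)]

rows = make_features(timestamps, loads)

# Hold out the final day in time order; a real model would be fitted on train_rows.
split = len(rows) - 24
train_rows, holdout_rows = rows[:split], rows[split:]

# Baseline "model": predict the load observed 24 hours earlier (the third feature).
predicted = [features[2] for features, _ in holdout_rows]
actual = [target for _, target in holdout_rows]
error = mape(actual, predicted)
```

Splitting by time rather than at random keeps future observations out of the training set, which matters for forecasting problems. A baseline error like this gives candidate models a number to beat.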
Data Analytics Workflow
2. MATLAB enables domain experts to do Data Science
Apps and Language
• Easy to use apps
• Wide breadth of tools to facilitate domain specific analysis
• Examples/videos to get started
• Automatic MATLAB code generation
• High speed processing of large data sets
Data Analytics Workflow
Challenges: Integrate Analytics with Systems
• End user: operators, analysts, administrative staff, customers, etc.
• Different target platforms:
– Cluster or Cloud environment
– Standalone desktop applications
– Server based Web and enterprise systems
– Embedded hardware
• Different interfaces: C++, Java, Python, .NET, etc.
• Need to translate analytics to production environment
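One common way to reduce the translation effort is to put the model behind a small, language-neutral interface, for example JSON in and JSON out, so that clients written in any language can call it. The Python sketch below illustrates the idea only. The function names, the `loads` and `forecast` fields, and the placeholder model are hypothetical and do not represent any MathWorks API.

```python
import json


def forecast_next_day(last_day_loads):
    """Placeholder model: repeat yesterday's 24 hourly loads as tomorrow's forecast."""
    if len(last_day_loads) != 24:
        raise ValueError("expected 24 hourly load values")
    return list(last_day_loads)


def handle_request(body):
    """Decode a JSON request, run the model, and encode a JSON response."""
    try:
        payload = json.loads(body)
        forecast = forecast_next_day(payload["loads"])
    except (ValueError, KeyError, TypeError) as exc:
        # Malformed JSON, a missing field, or a wrongly typed field.
        return json.dumps({"status": "error", "message": str(exc)})
    return json.dumps({"status": "ok", "forecast": forecast})


# Hypothetical request carrying the previous day's hourly loads.
request = json.dumps({"loads": [5000.0 + 10.0 * h for h in range(24)]})
response = json.loads(handle_request(request))

# A request without the expected field yields an error response instead of a crash.
bad = json.loads(handle_request("{}"))
```

Keeping the handler free of any web framework means the same function can sit behind a web service, a batch job, or a desktop front end.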
Integrate analytics with systems
3. MATLAB Analytics run anywhere
• Embedded Hardware: C, C++, HDL, PLC
• Enterprise Systems (MATLAB Runtime): Standalone Application, Excel Add-in, Hadoop/Spark, C/C++, Java, .NET, Python, MATLAB Production Server
MATLAB Production Server
[Architecture diagram. Labeled components: MATLAB Desktop (Train in MATLAB: Weather Data, Energy Data, Predictive Models); Deployed Analytics (CTF); MATLAB Production Server (Request Broker); Web Application Server (Apache Tomcat, Web Server/Webservice)]
Key Takeaways
1. Utilize all of your data: MATLAB Analytics work with business and engineering data
2. Apply advanced analytics techniques: MATLAB enables domain experts to do Data Science
3. Operationalize analytics to enterprise systems and embedded devices: MATLAB Analytics run anywhere
Thank you!
Stay tuned: Twitter: @MATLAB | LinkedIn: https://www.linkedin.com/company/the-mathworks_2
% Send me your feedback:
% Twitter: @mathinking