Turning an idea into a Data-Driven Production System: An Energy Load Forecasting Case Study by Lucas García

BIG DATA SPAIN 2016

© 2015 The MathWorks, Inc.

Turning an idea into a Data-Driven Production System

An Energy Load Forecasting Case Study

Lucas García

Senior Application Engineer

MathWorks

What is Energy Forecasting?

From Wikipedia:

Energy forecasting is a broad term that refers to "forecasting in the energy industry". It includes - but is not limited to - forecasting demand (load) and price of electricity, fossil fuels (natural gas, oil, coal) and renewable energy sources (RES; hydro, wind, solar).

What is Data Analytics?

• What happened? Descriptive
• Why did it happen? Diagnostics
• What will happen? Predictive
• What should be done? Prescriptive

Turn large volumes of complex data into actionable information

Data → Decisions

Data Analytics – Using Data to Make Better Decisions

• Access and Explore Data
• Preprocess Data
• Develop Predictive Models
• Integrate Analytics with Systems

Case Study: Day-Ahead Energy Load Forecasting

Goal:
• Implement a tool for easy and accurate computation of day-ahead system load forecast

Requirements:
• Acquire and clean data from multiple sources
• Accurate predictive model
• Easily deploy to production environment

The Data

• NYISO Energy Load Data: mis.nyiso.com/public/
• National Climatic Data Center Weather Data: cdo.ncdc.noaa.gov/qclcd_ascii/
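A minimal sketch of how the two sources might be combined, assuming each has already been parsed into (timestamp, value) pairs. It is written in Python for illustration rather than in MATLAB as used in the talk. The function name `join_on_hour` and all sample values are hypothetical and are not taken from the NYISO or NCDC files.

```python
from datetime import datetime


def join_on_hour(load_rows, weather_rows):
    """Inner-join load and temperature readings on the hour they fall in.

    Both inputs are lists of (datetime, float) pairs. Weather readings that
    share an hour are averaged; load rows with no weather match are dropped.
    """
    def to_hour(ts):
        return ts.replace(minute=0, second=0, microsecond=0)

    # Group temperature readings by the hour they belong to.
    weather_by_hour = {}
    for ts, temp in weather_rows:
        weather_by_hour.setdefault(to_hour(ts), []).append(temp)

    joined = []
    for ts, load in load_rows:
        temps = weather_by_hour.get(to_hour(ts))
        if temps:
            joined.append((to_hour(ts), load, sum(temps) / len(temps)))
    return joined


# Made-up sample values, for illustration only.
load_rows = [
    (datetime(2016, 1, 1, 0, 0), 5200.0),
    (datetime(2016, 1, 1, 1, 0), 5050.0),
    (datetime(2016, 1, 1, 2, 0), 4980.0),
]
weather_rows = [
    (datetime(2016, 1, 1, 0, 51), 2.0),
    (datetime(2016, 1, 1, 1, 51), 1.0),
]
joined = join_on_hour(load_rows, weather_rows)
```

The third load row is dropped here because no weather reading falls in its hour; whether to drop or fill such rows is a preprocessing decision.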

Data Analytics Workflow

Access and Explore Data
– Files
– Databases
– Sensors

Preprocess Data
– Working with Messy Data
– Data Reduction/Transformation
– Feature Extraction

Develop Predictive Models
– Model Creation, e.g. Machine Learning
– Model Validation
– Parameter Optimization

Integrate Analytics with Systems
– Desktop Apps
– Enterprise Scale Systems
– Embedded Devices and Hardware

Data Analytics Workflow

Access and Explore Data; Preprocess Data

Business and Transactional Data
• Repositories – SQL, NoSQL, etc.
• File I/O – Text, Spreadsheet, etc.
• Web Sources – RESTful, JSON, etc.

Engineering, Scientific and Field Data
• Real-Time Sources – Sensors, GPS, etc.
• File I/O – Image, Audio, etc.
• Communication Protocols – OPC (OLE for Process Control), CAN (Controller Area Network), etc.

Data Analytics Workflow

Access and Explore Data; Preprocess Data

Challenges
• Data aggregation
  – Different sources (files, web, etc.)
  – Different types (images, text, audio, etc.)
• Data clean up
  – Poorly formatted files
  – Irregularly sampled data
  – Redundant data, outliers, missing data, etc.
• Data specific processing
  – Signals: Smoothing, resampling, denoising, Wavelet transforms, etc.
  – Images: Image registration, morphological filtering, deblurring, etc.
• Dealing with out of memory data (big data)
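As an illustration of the "missing data" clean-up item, one common approach is linear interpolation between known neighbours. The sketch below is in Python rather than MATLAB and assumes a regularly sampled series in which gaps are marked as `None`. The function name `interpolate_gaps` and the sample values are hypothetical, and the talk does not state which gap-filling method was actually used.

```python
def interpolate_gaps(values):
    """Fill None entries by linear interpolation between known neighbours.

    Gaps at the start or end of the series, which have only one known
    neighbour, are filled with that nearest known value.
    """
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("no known values to interpolate from")

    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            # Weight by the position of i between the two known samples.
            w = (i - left) / (right - left)
            filled[i] = values[left] + w * (values[right] - values[left])
    return filled


# Made-up hourly load readings with two gaps.
hourly_load = [5200.0, None, 5000.0, None]
filled = interpolate_gaps(hourly_load)
```

Interpolation suits short gaps in a smooth signal such as load; long outages are usually better handled by excluding the affected period.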

Data Analytics Workflow

Access and Explore Data; Preprocess Data

1. MATLAB Analytics work with business and engineering data
• Point and click tools to access a variety of data sources
• High-performance environment for big data
• Built-in algorithms for data preprocessing, including sensor, image, audio, video and other real-time data

Data types shown: Files, Signals, Databases, Images

Data Analytics Workflow

Preprocess Data; Develop Predictive Models

Challenges
• Lack of data science expertise
• Feature Extraction – How to transform data to best represent the system?
  – Requires subject matter expertise
  – No right way of designing features
• Feature Selection – What attributes or subset of data to use?
  – Entails a lot of iteration – Trial and error
  – Difficult to evaluate features
• Model Development
  – Many different models
  – Model Validation and Tuning
• Time required to conduct the analysis
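To make feature extraction and model validation concrete for day-ahead load, the sketch below derives a few calendar and lagged-load features and scores a simple persistence baseline with mean absolute percentage error (MAPE). It is in Python rather than MATLAB. The feature set, the function names `make_features` and `mape`, and the synthetic series are assumptions for illustration, not the features or models used in the case study.

```python
from datetime import datetime, timedelta


def make_features(series):
    """Build (features, target) rows from a dict of hourly datetime -> load."""
    rows = []
    for ts in sorted(series):
        prev_day = ts - timedelta(days=1)
        if prev_day not in series:
            continue  # no history yet for this hour
        features = {
            "hour": ts.hour,
            "weekday": ts.weekday(),  # Monday = 0
            "is_weekend": ts.weekday() >= 5,
            "load_prev_day": series[prev_day],
        }
        rows.append((features, series[ts]))
    return rows


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


# Synthetic two-day series: day 2 is 10% above day 1 (made-up values).
start = datetime(2016, 1, 1)
series = {}
for h in range(24):
    base = 5000.0 + 100.0 * h
    series[start + timedelta(hours=h)] = base
    series[start + timedelta(days=1, hours=h)] = 1.1 * base

rows = make_features(series)
actual = [target for _, target in rows]

# Persistence baseline: predict the same hour of the previous day.
baseline = [features["load_prev_day"] for features, _ in rows]
baseline_mape = mape(actual, baseline)
```

A baseline like this gives a reference error against which any candidate model can be compared during validation.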

Data Analytics Workflow

Preprocess Data; Develop Predictive Models

2. MATLAB enables domain experts to do Data Science

Apps and Language
• Easy to use apps
• Wide breadth of tools to facilitate domain-specific analysis
• Examples/videos to get started
• Automatic MATLAB code generation
• High-speed processing of large data sets

Data Analytics Workflow

Develop Predictive Models; Integrate Analytics with Systems

Challenges
• End users: operators, analysts, administrative staff, customers, etc.
• Different target platforms:
  – Cluster or Cloud environment
  – Standalone desktop applications
  – Server-based Web and enterprise systems
  – Embedded hardware
• Different interfaces: C++, Java, Python, .NET, etc.
• Need to translate analytics to production environment

Integrate analytics with systems

3. MATLAB Analytics run anywhere

• Embedded Hardware: C, C++, HDL, PLC
• Enterprise Systems (MATLAB Runtime): Standalone Application, Excel Add-in, Hadoop/Spark, C/C++, Java, .NET, Python, MATLAB Production Server

Deployed Analytics: MATLAB Production Server

[Architecture diagram]
• MATLAB Desktop: Weather Data, Energy Data, Train in MATLAB, Predictive Models
• MATLAB Production Server: Request Broker, CTF
• Web Application Server: Apache Tomcat, Web Server/Webservice
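In this architecture the web application tier reaches the deployed model through the server's request broker. The sketch below shows in Python how a client might assemble such a request, assuming the JSON-over-HTTP interface that MATLAB Production Server documents (a POST to `/<archive>/<function>` with `nargout` and `rhs` fields). The host, port, archive name `LoadForecast`, function name `forecastLoad` and its arguments are all hypothetical, and the payload shape should be checked against the server documentation for the release in use. The sketch only builds the request and does not send it.

```python
import json


def build_forecast_request(base_url, archive, function, args, nargout=1):
    """Assemble the URL, headers and JSON body for one function call.

    Assumption: the server accepts {"nargout": N, "rhs": [...]} as the body,
    where "rhs" holds the function's input arguments in order.
    """
    url = f"{base_url.rstrip('/')}/{archive}/{function}"
    headers = {"Content-Type": "application/json"}
    body = json.dumps({"nargout": nargout, "rhs": list(args)})
    return url, headers, body


# Hypothetical deployment details, for illustration only.
url, headers, body = build_forecast_request(
    "http://localhost:9910/",
    archive="LoadForecast",
    function="forecastLoad",
    args=["2016-11-18", 24],  # hypothetical inputs: forecast date, hours ahead
)
```

Keeping request construction separate from sending makes the payload easy to inspect and test before a live server is available.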

Key Takeaways

1. MATLAB Analytics work with business and engineering data
   – Utilize all of your data
2. MATLAB enables domain experts to do Data Science
   – Apply advanced analytics techniques
3. MATLAB Analytics run anywhere
   – Operationalize analytics to enterprise systems and embedded devices

Thank you!

Stay tuned: Twitter: @MATLAB | LinkedIn: https://www.linkedin.com/company/the-mathworks_2

% Send me your feedback:

% [email protected]

% Twitter: @mathinking