Grid Analytics Europe 2016

Pedro Ferreira, EDP Inovação

5-6 April 2016



Agenda

1. Introduction to EDP

2. Motivation

3. Project Predis – Load and Generation disaggregated forecast in real time

4. Project SINAPSE - Improving operations in conventional grids in the Industrial Internet of Things age: How EDP Distribuição detects low-voltage outages near real-time

5. EDP Future IT Architecture

6. Conclusions


EDP Group - from a local electricity incumbent to a global energy player with a strong presence in Europe, Brazil and considerable investments in the USA…

Countries shown on the map: UK, USA, Canada, Portugal, Brazil, Angola, Spain, Italy, France, Belgium, Poland, Romania, China (中国)

# Present in the Electric Sector in Dow Jones Sustainability Indexes

#3 World wind energy company

#1 Europe hydro project (+3,5 GW under development)

#1 Portugal industrial group

Key figures by country:

• USA/Canada: 260 employees; 3 422 MW installed capacity; 9 330 GWh net generation; 100% generation from renewable sources
• Brazil: 2 635 employees; 2 831 651 electricity customers; 1 874 MW installed capacity; 8 043 GWh net generation; 100% generation from renewable sources; 24 544 GWh electricity distribution
• Portugal: 7 252 employees; 6 053 509 electricity customers; 271 576 gas customers; 10 992 MW installed capacity; 34 364 GWh net generation; 51% generation from renewable sources; 46 508 GWh electricity distribution; 7 138 GWh gas distribution
• France/Belgium: 34 employees; 363 MW installed capacity; 705 GWh net generation; 100% generation from renewable sources
• Italy: 14 employees
• United Kingdom: 21 employees
• Poland/Romania: 51 employees; 475 MW installed capacity; 621 GWh net generation; 100% generation from renewable sources
• Spain: 2 038 employees; 1 015 543 electricity customers; 787 869 gas customers; 6 087 MW installed capacity; 15 331 GWh net generation; 37% generation from renewable sources; 9 517 GWh electricity distribution; 48 447 GWh gas distribution
• Mexico


EDP Distribuição and EDP Inovação – facts and figures

Distribution network approximate length: 245,000 km

Approximate number of customers served: 6 million

Percent of the electricity distribution network owned in mainland Portugal

EDP Distribuição is the EDP Group company operating in the regulated distribution and supply businesses in Portugal. EDP's distribution activity is regulated by ERSE (Entidade Reguladora dos Serviços Energéticos), which defines the tariffs, parameters and prices for electricity and other services in Portugal.

EDP Inovação is the innovation arm of EDP Group, promoting value-adding innovation within the Group by leading the adoption of new technological evolutions and practices.

Open innovation approach, supported by an Entrepreneurship & Venture Capital ecosystem

5 strategic innovation areas:

• Client-focused Solutions
• Smarter Grids
• Cleaner Energy
• Data Leap
• Storage

2. Motivation


Smart Grid

The transformation of the energy sector brings new challenges for the DSO and demands new strategies for the distribution power grid, which is becoming progressively more intelligent.

Historical challenges:

• Quality of service
• Operational efficiency

New challenges:

• Advanced Metering Infrastructure
• Network automation & sensing
• Energy efficiency and new business models
• Electric vehicle
• Renewables and distributed generation


To face these new challenges we need to increase visibility over the LV network, reducing the existing gap compared with the HV and MV networks.

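[Figure: the distribution network, from the VHV/HV station through the HV network, HV/MV substation, MV network, MV/LV secondary substation and LV network to the retailer/consumer/producer (EDP Box; HAN, LAN, WAN), with the level of monitoring and automation shown against the network assets.]

Network assets:

• HV: 9,000 km of lines; 412 HV/MV substations
• MV: 74,000 km of lines; 66,000 MV/LV secondary substations
• LV: 140,000 km of lines; 6,000,000 users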

The ability to collect information from different, mostly scattered sources (internal and external, structured and unstructured) has huge potential to improve the operational activities of a utility.


In 2013 EDP started to look at big data and advanced analytics, comparing the performance of a conventional database with that of Hadoop.

Profiling + aggregation                  Technology    Time          Notes
Current architecture                     Oracle        Around 8 h    4 million points
SQL with Big Data                        Hive, Impala  1 to 4 h      Inadequate
Customized programming without Big Data  Java          Around 5 min  One machine (multi-core)
Customized programming with Big Data     Spark         < 5 min       Multiple machines with Big Data; higher resilience; parallelization

National Energy Consumption (with load curves) by voltage level*

System    Nodes [#]  Cores [#]  RAM [GB]  Cluster          Readings [10^6]  Volume [MB]  Processing time [h:min:sec]
BO (EDP)  4          96         202       Local            12 x 6           72           3:45:00
Hadoop    21         42         157       Virtual / Cloud  96 x 6           576          00:09:37

*This Proof of Concept was run in the cloud, paid for with a credit card, and cost around $30.
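The table above can be turned into a rough throughput comparison. The sketch below is only arithmetic on the two rows of the table (volume in MB and processing time); the helper names are illustrative and not part of the original proof of concept, and throughput per MB is one possible way to compare runs of different size.

```python
def to_seconds(hms: str) -> int:
    """Convert an 'h:min:sec' string, as used in the table, to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# Values copied from the table: volume in MB and processing time.
runs = {
    "BO (EDP)": {"volume_mb": 72, "time": "3:45:00"},
    "Hadoop": {"volume_mb": 576, "time": "00:09:37"},
}


def throughput_mb_per_s(run: dict) -> float:
    """Processed volume divided by wall-clock processing time."""
    return run["volume_mb"] / to_seconds(run["time"])


bo_throughput = throughput_mb_per_s(runs["BO (EDP)"])
hadoop_throughput = throughput_mb_per_s(runs["Hadoop"])
speedup = hadoop_throughput / bo_throughput

print(f"BO (EDP): {bo_throughput:.4f} MB/s")
print(f"Hadoop:   {hadoop_throughput:.4f} MB/s")
print(f"Throughput ratio: about {speedup:.0f}x")
```

Under this reading, the Hadoop run processed eight times the volume in roughly 1/23 of the time, which is where the ratio of about 187x comes from.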

Main conclusions:

• The Hadoop cluster is resilient by nature and coped with node failures.

• Processing times can be greatly improved over the traditional architecture.

• There is a high need for customization.

• The choice of tool from the Hadoop ecosystem depends highly on the type of calculations to be made.

3. Project Predis – Load and Generation disaggregated forecast in real time


With the results obtained, we set up a project called PREDIS to produce load and generation forecasts at a disaggregated level in real time (refreshed every 15 minutes).

PREDIS needs to:

• Connect to different data sources from EDP Distribuição (GIS, SCADA, Oracle, SAP)

• Develop an adequate cluster to perform all the computation (open source software, and R for analytics)

• Integrate the information in a data model for forecasting

• Develop the analytics and processes to compute this information in a timely manner

PREDIS Project Goals:

• Forecast of electrical load for the next 72 hours

• Forecast of disaggregated renewable energy sources (wind, solar) for the next 72 hours

• Deal with a universe of approx. 6 million points (substations, distribution transformers, LV clients)

• Forecast update every 15 minutes

• Incorporate dynamic grid topology


We followed several steps to find a model that forecasts load with good enough accuracy:

1. Review of existing load forecast models

2. Test the model over national load

3. Improve the model with additional explanatory variables

4. Define models for different times of year


After having chosen the model we identified a set of explanatory variables and tested the model over National Demand

Explanatory variables:

• Year, month, day

• Day of week

• Public holiday

• Season (Spring, Summer, Autumn, Winter)

• Daylight saving time (TRUE, FALSE)

• Time of year

• Time of day (48 half-hour intervals)

• Temperature (from the NOAA website)
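The calendar variables in the list above can be derived from a timestamp alone. The following is a minimal sketch, not the authors' R implementation: it assumes timestamps in UTC, meteorological seasons by month, the EU summer-time rule (last Sunday of March to last Sunday of October, switching at 01:00 UTC), and a hypothetical two-entry holiday set. Temperature would be joined in from an external source and is omitted.

```python
from datetime import date, datetime, time, timedelta

# Hypothetical holiday set for illustration only; a real list would be longer.
PUBLIC_HOLIDAYS = {date(2012, 1, 1), date(2012, 12, 25)}

# Assumption: meteorological seasons assigned by calendar month.
SEASON_BY_MONTH = {
    12: "Winter", 1: "Winter", 2: "Winter",
    3: "Spring", 4: "Spring", 5: "Spring",
    6: "Summer", 7: "Summer", 8: "Summer",
    9: "Autumn", 10: "Autumn", 11: "Autumn",
}


def last_sunday(year: int, month: int) -> date:
    """Return the date of the last Sunday of the given month."""
    first_of_next = date(year + (month == 12), month % 12 + 1, 1)
    last_day = first_of_next - timedelta(days=1)
    # weekday(): Monday=0 ... Sunday=6, so step back to the previous Sunday.
    return last_day - timedelta(days=(last_day.weekday() + 1) % 7)


def is_summer_time(ts_utc: datetime) -> bool:
    """EU rule: summer time runs between the last Sundays of March and October."""
    start = datetime.combine(last_sunday(ts_utc.year, 3), time(1))
    end = datetime.combine(last_sunday(ts_utc.year, 10), time(1))
    return start <= ts_utc < end


def calendar_features(ts_utc: datetime) -> dict:
    """Build the calendar explanatory variables for one half-hourly reading."""
    return {
        "year": ts_utc.year,
        "month": ts_utc.month,
        "day": ts_utc.day,
        "day_of_week": ts_utc.weekday(),            # Monday=0 ... Sunday=6
        "public_holiday": ts_utc.date() in PUBLIC_HOLIDAYS,
        "season": SEASON_BY_MONTH[ts_utc.month],
        "daylight_saving_time": is_summer_time(ts_utc),
        "day_of_year": ts_utc.timetuple().tm_yday,  # "time of year"
        "half_hour_of_day": ts_utc.hour * 2 + ts_utc.minute // 30,  # 0..47
    }


print(calendar_features(datetime(2012, 7, 15, 14, 30)))
```

Encoding the time of day as an index from 0 to 47 matches the half-hourly resolution of the dataset described on this slide.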

Dataset:

• Half-hourly electricity measurements

• National demand (mainland Portugal)

• From 2006 to 2011 – Data for calibration

• From 2012 to 2014 – Data for test

[Chart annotations:]

High temperature ~ demand peak (4th highest heat wave since 1981)

Low temperature ~ demand peak (2012 European cold wave due to the Siberian High)


To increase model accuracy, we looked at the largest residuals and started a trial-and-error process to identify their principal causes and the features that reduce the model errors.

[Figure: model residuals (vertical axis from -500 to 500), with the largest errors annotated: August; Christmas and New Year period; public holiday on Sunday; Gong storm.]

[Figure: deviance explained (the higher the better) per iteration as features are added/combined; vertical axis from 0.65 to 1.]

Iteration  Variable
1          24h lagged load
2          Temperature combined with time of day
3          48h lagged load
4          Day of week
5          Public holidays
6          Intra-day effect dependent on the day type
7          Day of the year
8          24h lagged temperature + min and max temperature of the last 24h
9          Days off before Christmas and Carnival
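Two of the ingredients on this slide, lagged load features and the "deviance explained" score, can be sketched in a few lines. This is an illustration, not the authors' R code: it assumes half-hourly data (2 steps per hour) and a Gaussian model, for which deviance explained reduces to 1 minus the ratio of residual to total sum of squares. Function and column names are hypothetical.

```python
def build_lagged_rows(load, steps_per_hour=2, lags_h=(24, 48)):
    """Pair each reading with the load observed 24h and 48h earlier.

    Rows start only once the longest lag is available.
    """
    max_lag = max(lags_h) * steps_per_hour
    rows = []
    for t in range(max_lag, len(load)):
        row = {"load": load[t]}
        for hours in lags_h:
            row[f"load_lag_{hours}h"] = load[t - hours * steps_per_hour]
        rows.append(row)
    return rows


def deviance_explained(actual, fitted):
    """Gaussian case (assumption): 1 - residual deviance / null deviance."""
    mean_actual = sum(actual) / len(actual)
    residual = sum((a - f) ** 2 for a, f in zip(actual, fitted))
    null = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - residual / null


# Toy series standing in for half-hourly load readings.
toy_load = list(range(200))
rows = build_lagged_rows(toy_load)
print(rows[0])
print(deviance_explained([1, 2, 3], [1, 2, 3]))
```

Each iteration in the table would add one more column to rows like these and re-fit the model, keeping the feature if the deviance explained rises.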


But there were still some issues with the forecast. After special days like Christmas the model shouldn’t use the load of the previous day to make the forecast.

This led to a new approach using a weighted majority algorithm.


In this approach, several algorithms were each trained for specific conditions, and the model automatically chooses the one that minimizes the error for each period.

Iteration  Model
1          General-purpose model
2          General-purpose model reviewed
3          Weekends' model
4          August's model
5          Public holidays' model
6          Spring and Summer's model
7          Autumn and Winter's model
8          Christmas and New Year's model
9          Carnival's model
10         Easter's model
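The selection among specialised models can be illustrated with a multiplicative-weights scheme. The sketch below is a Hedge-style variant adapted to regression, not necessarily the exact weighted majority scheme used in PREDIS: each period type (the `context` key, a hypothetical name) keeps its own weights, every model is penalised by `exp(-eta * absolute percentage error)` after each observation, and the highest-weighted model is chosen. The learning rate `eta` and the model names are assumptions.

```python
import math
from collections import defaultdict


class ModelSelector:
    """Pick, per period type, the model with the best recent track record."""

    def __init__(self, model_names, eta=5.0):
        self.model_names = list(model_names)
        self.eta = eta  # assumed learning rate: higher reacts faster to errors
        # One weight vector per context; every model starts with weight 1.
        self.weights = defaultdict(
            lambda: {name: 1.0 for name in self.model_names}
        )

    def choose(self, context):
        """Return the currently highest-weighted model for this context."""
        weights = self.weights[context]
        return max(self.model_names, key=lambda name: weights[name])

    def update(self, context, forecasts, actual):
        """Penalise each model by its absolute percentage error (actual != 0)."""
        weights = self.weights[context]
        for name, forecast in forecasts.items():
            ape = abs(forecast - actual) / abs(actual)
            weights[name] *= math.exp(-self.eta * ape)
        total = sum(weights.values())
        for name in weights:
            weights[name] /= total  # keep weights normalised


selector = ModelSelector(["general", "weekend"])
first_choice = selector.choose("weekend")
selector.update("weekend", {"general": 120.0, "weekend": 101.0}, actual=100.0)
print(first_choice, "->", selector.choose("weekend"))
```

Keeping separate weights per context is one way to get the behaviour described on the slide, where a different model wins for weekends, August or holiday periods.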

[Figure: one-year timeline (Jan to Dec) showing the periods covered by the specialised models: Carnival period, Easter period, Spring, August, Autumn, Christmas and New Year period, weekends, other public holidays.]

[Figure: MAPE (%) (the lower the better) per iteration as models are added/combined; vertical axis from 2 to 2.7.]

We have a working algorithm with an error (MAPE) of around 2% for the aggregated national load.
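For reference, MAPE (mean absolute percentage error) is the mean of the absolute forecast errors expressed as a percentage of the actual values. A minimal sketch, with made-up numbers chosen so the result is 2%:

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent. Actual values must be non-zero."""
    if not actual or len(actual) != len(forecast):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Illustrative values only, not data from the project.
example_mape = mape([100.0, 200.0], [98.0, 204.0])
print(f"{example_mape:.1f}%")
```

Because each error is divided by the actual value, MAPE suits national load, which is never near zero, better than it suits intermittent generation.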


In parallel, we also implemented in R a wind generation forecast model based on wind velocity, air pressure and the energy supplied by the wind farm.

[Figures: forecast versus actual wind generation for D+1, D+2 and D+3.]

NMAE (normalized mean absolute error): 7% (D+1), 8% (D+2), 12% (D+3)

Test conditions:

• 9 months of calibration data + 1 month of validation data

• Hourly generation measurements and forecasts of wind velocity at 10 m and pressure at mean sea level (MSL), with a 72h time horizon at 3h intervals
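The slide does not state what the NMAE is normalised by. A common choice for wind generation is the installed capacity of the wind farm, and the sketch below assumes that choice; the numbers are made up for illustration.

```python
def nmae(actual, forecast, installed_capacity):
    """Normalized mean absolute error, in percent.

    Assumption: the mean absolute error is divided by installed capacity.
    """
    if not actual or len(actual) != len(forecast):
        raise ValueError("inputs must be non-empty and of equal length")
    mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    return 100.0 * mae / installed_capacity


# Illustrative hourly generation (MW) for a hypothetical 50 MW wind farm.
example_nmae = nmae([10.0, 20.0, 30.0], [12.0, 18.0, 33.0], installed_capacity=50.0)
print(f"{example_nmae:.2f}%")
```

Normalising by capacity avoids the division by near-zero actual values that would distort a percentage error in calm hours.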


New challenges on load forecasting when we decrease the voltage level (substations and distribution transformers)

[Figures: two load examples at lower voltage levels, annotated with "August", "Christmas and New Year" and "Network reconfigurations?".]

Done so far:

• Implemented 2 Big Data Clusters (Hadoop)

• Developed an architecture for the Project

• Developed a Load forecast model with ~2% MAPE for national load

• Developed a Wind forecast model with ~10% error

Next Steps:

• Improve the existing models

• Incorporate network configurations on the forecast module (state estimation, network status)

• Cluster different types of load by voltage level, load typification etc.

• Wind farms state estimation

• PV model definition and implementation

• Collect data from the source systems in a continuous way

4. Project SINAPSE - Improving operations in conventional grids in the Industrial Internet of Things age: How EDP Distribuição detects low-voltage outages near real-time


The issue at hand: outage time in conventional low-voltage distribution grids is prolonged by the need for human intervention, resulting in avoidable losses.

Sinapse created an automatic communication channel to report low-voltage anomalies, adding a smart layer to the conventional distribution grid.

❶ An anomaly occurs in the distribution grid, causing an outage

❷ Customers in the geographic area affected by the outage call EDP's call centers

❸ Geographic location alerts are automatically sent to EDP via the internet

❹ Clients can send alerts by SMS, Twitter, email…

❺ The volume of alerts combined with their geographical location allows a precise location of the affected area

❻ A technical team is automatically sent to the area to solve the problem

[Diagram labels: Cable Operator, Mobile Operator, Security Company, Internet]


A suitable way to process data streams in real time is to use a Complex Event Processing (CEP) engine that implements a spatio-temporal analysis to examine and correlate recent events.

[Figure: a data stream of ON/OFF events from sources A, B, C… enters the CEP engine (input); a sliding window over time and longitude groups recent events, and the correlated events are the output.]

Key impacting variables: sliding window length (time)

Solution selected and implementation status:

• The open-source ESPER CEP engine was selected for SINAPSE, for cost-effectiveness and suitability

• Data analysis is ongoing to determine optimal length and width parameters for the sliding window
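The sliding-window idea can be sketched without a CEP engine. The code below is plain Python, not Esper or its query language: it keeps recent "OFF" events in a time window and reports a correlated group when enough of them fall within a given distance of the newest event. The parameter names (`window_s`, `radius_km`, `min_events`), their default values, and the reading of the window "width" as a spatial radius are assumptions; events are assumed to arrive in time order.

```python
import math
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    timestamp: float  # seconds since an arbitrary origin
    lat: float
    lon: float
    state: str        # "OFF" or "ON"


def distance_km(a: Event, b: Event) -> float:
    """Great-circle (haversine) distance between two events."""
    phi1, phi2 = math.radians(a.lat), math.radians(b.lat)
    dphi = phi2 - phi1
    dlmb = math.radians(b.lon - a.lon)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(h))


class SlidingWindowCorrelator:
    def __init__(self, window_s=600.0, radius_km=1.0, min_events=3):
        self.window_s = window_s      # window length in time
        self.radius_km = radius_km    # spatial extent (assumed "width")
        self.min_events = min_events  # alerts needed to flag an outage
        self.window = deque()

    def push(self, event: Event) -> list:
        """Add an event; return the correlated OFF events, or [] if below threshold."""
        if event.state != "OFF":
            return []
        # Evict events that have fallen out of the time window.
        while self.window and event.timestamp - self.window[0].timestamp > self.window_s:
            self.window.popleft()
        self.window.append(event)
        nearby = [e for e in self.window if distance_km(e, event) <= self.radius_km]
        return nearby if len(nearby) >= self.min_events else []


correlator = SlidingWindowCorrelator()
stream = [
    Event(0, 41.1500, -8.6100, "OFF"),
    Event(60, 41.1510, -8.6110, "OFF"),
    Event(90, 38.7200, -9.1400, "OFF"),   # far away, should not correlate
    Event(120, 41.1495, -8.6095, "OFF"),
]
results = [correlator.push(e) for e in stream]
print(len(results[-1]), "correlated OFF events")
```

Tuning the window length and radius trades detection speed against false alarms, which is the parameter study the slide describes as ongoing.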


Early results from data analysis: May 4th weather anomaly case study

Portuguese news report severe storm hitting northwest Portugal on May 4, 2015

[News clipping dated 5 May 2015]

[Figure: SINAPSE "OFF" events versus Rede Activa (outage management system) incidents on May 4, from 2 AM to 5 PM; series: Sinapse, Rede Activa; vertical axis from 0 to 200.]

Analysis of event volumes during the storm shows a correlation over time between Sinapse and EDP's outage management system, suggesting that Sinapse anticipates some of the peaks.
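One way to quantify such anticipation is to correlate the two event-count series at several time shifts and see which shift fits best. The sketch below uses synthetic counts, not the May 4 data: the trailing series is constructed as the leading series delayed by one interval, so the best lag is 1 by construction. Function names are illustrative.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def lagged_correlation(leading, trailing, max_lag):
    """Correlate leading[t] with trailing[t + lag] for lag = 0..max_lag."""
    result = {}
    for lag in range(max_lag + 1):
        x = leading[: len(leading) - lag]
        y = trailing[lag:]
        result[lag] = pearson(x, y)
    return result


# Synthetic event counts per interval (NOT project data).
sinapse_off = [2, 5, 20, 60, 120, 80, 30, 10, 4, 2, 1, 1]
# Built as the same curve delayed by one interval.
rede_activa = [1] + sinapse_off[:-1]

correlations = lagged_correlation(sinapse_off, rede_activa, max_lag=3)
best_lag = max(correlations, key=correlations.get)
print(f"best lag: {best_lag} interval(s)")
```

On real data, a best lag greater than zero would support the reading that Sinapse events precede the incidents logged in the outage management system.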

5. EDP Future IT Architecture


These projects revealed a series of constraints that currently exist, mainly in the IT systems:

• How to spread analytics knowledge?

• How to do massive extractions of data without impacting the performance of existing systems?

• How to avoid a proliferation of interfaces, each one with a specific function?

• How to interpret the data that exist in the systems?

• How to "democratize" access to data so that multi-source analytics can be developed?

These needs drove the development of new approaches to the IT systems.


Traditionally, a utility has analytic solutions based on a "traditional" silo-oriented BI architecture that is not suited to high volumes of data and unstructured information.

[Diagram: layered architecture (Operation, Application, Integration, Information, Usage). Operational level: software applications and SAP applications. Analytic level: BW, fed through SAP Extractors and ETL / Active Data Guard / Golden Gate.]

Issues:

• Redundancy of information

• Lack of connectivity between the different information "silos"

• Little or no related information at a disaggregated level

This was the vision of an IT architecture up to 2010. Meanwhile the IT world has changed, but the business needs are still the same. A strategic, tactical and operational vision is necessary.


[Diagram: the same layers (Operation, Application, Integration, Information, Usage). Operational level: software applications, SAP applications and external sources. Analytic level: BW and a "Data Lake" with MDU GR, MDU GA and MDU GE, fed through SAP Extractors and ETL / Golden Gate. Additional label: Interaction.]

Operational systems create more data every day along the 3 Vs that characterize Big Data (Volume, Variety, Velocity). A conventional infrastructure cannot handle both operational activities and advanced analytics in due time.

Big Data is an option that allows data ingestion and advanced analytics on high volumes of data, oriented to one of the 3 Vs (Volume, Variety, Velocity), while the operational systems keep up their normal activities.


The architecture vision is now centred on the development of a Data Lake that holds the information from the different IT/OT systems, is fed through CDC (Change Data Capture) interfaces, and is more focused on analytics.
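The principle behind a CDC feed can be sketched as follows: the source system emits an ordered stream of change events, and the data lake side applies them to its own copy instead of re-extracting whole tables. This is a generic illustration, not the interface of any specific CDC product; the `ChangeEvent` fields and the meter records are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ChangeEvent:
    sequence: int               # position in the source system's change log
    op: str                     # "insert", "update" or "delete"
    key: str                    # primary key of the affected row
    row: Optional[dict] = None  # changed columns (None for deletes)


def apply_changes(replica: dict, events, last_applied: int = 0) -> int:
    """Apply change events in log order; return the last applied sequence.

    Events at or below last_applied are skipped, so replays are harmless.
    """
    for event in sorted(events, key=lambda e: e.sequence):
        if event.sequence <= last_applied:
            continue
        if event.op in ("insert", "update"):
            # Merge changed columns into the existing row, if any.
            replica[event.key] = {**replica.get(event.key, {}), **(event.row or {})}
        elif event.op == "delete":
            replica.pop(event.key, None)
        else:
            raise ValueError(f"unknown operation: {event.op}")
        last_applied = event.sequence
    return last_applied


replica = {}
events = [
    ChangeEvent(1, "insert", "meter-1", {"kwh": 10}),
    ChangeEvent(2, "update", "meter-1", {"kwh": 12}),
    ChangeEvent(3, "insert", "meter-2", {"kwh": 5}),
    ChangeEvent(4, "delete", "meter-2"),
]
checkpoint = apply_changes(replica, events)
print(replica, checkpoint)
```

Because only the changes travel, the load on the source system stays small, which is the concern raised about massive extractions from operational systems.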

[Diagram: target architecture. Inputs: "real time" data, internal data, external and unstructured data. Components: Event Engine; Data Lake ("Big Data"); transformations; Data Warehouse; Data Mining and Advanced Analytics; reports, dashboards and exploration; Business Activity Monitoring. Flows: data and events; discovery output; information. Areas: operation and execution; discovery and innovation. Levels: real-time, operational, tactical, strategic. Critical data governance* spans the architecture.]

* Focused on commercially sensitive information, commercially irrelevant information, etc.


Internal sources of data from operational systems* (IT/OT): GENESys, Sysgrid, SITR, WFM, Grid Control, Billing, EDM (SGL), Ei-Server, sensor data (Sensorização), SCADA-BI, …

Operational systems and their connections maintain their "Business as Usual".

External sources of data*: IPMA, Telcos, Sinapse, …

Big Data Ecosystem (Data Lake, MDU, Analytics): this ecosystem will collect information from the source systems, feed the UDM and provide access to data and processing power to the use cases that may need it.

Recent and under-development apps that will use the Data Lake and return enriched information: Predis, Sinapse, Planning tools, Revenue Assurance, UpGrid, Rede Activa, others (…)

Future apps that use the full potential of the Data Lake***: Business Process Assurance, Situational Awareness, Business Activity Monitoring

*Non exhaustive


While the data lake is not yet available, and to gather knowledge on the subject, we developed two infrastructures: an enterprise-level cluster and a low-cost cluster.

ENTERPRISE-LEVEL CLUSTER

GOALS: Assembly of an enterprise-level internal cluster for project support.

• Data confidentiality guaranteed

• Hardware quality (enterprise level)

• Prepared to scale horizontally

• Cloudera Hadoop and R

• 7 nodes/servers (dimensioned for the Predis project)

LOW-COST CLUSTER

GOALS: Assembly of an internal test and development cluster.

• Big Data platform knowledge development

• Low-cost platform

• Cloudera Hadoop and R

• 48 nodes/servers


This is the internal architecture of the PREDIS system.

[Diagram: PREDIS internal architecture.

Data extract and loading (Sqoop, Kafka; files and CDC) from the source systems SIT, EDM (SGL), Ei-Server, Rede Activa and SCADA-BI, and from external data sources (IPMA).

Big Data Platform/Cluster: HDFS (Hadoop Distributed File System, storage); HBase (columnar store); Mahout (machine learning); Hive (SQL query); Impala / Spark (in-memory); MapReduce / YARN (distributed processing framework and resource management).

Predis Forecast: Model 1 and Model 2 implemented in R; new models implemented in R are deployed, and the results of the newly deployed models are returned.

API for data access and data modeling; web app; external access to data download for System A, System B and files.]


Conclusions:

• Analytics is a continuous learning process and a cultural change. To overcome the lack of knowledge in this area, an Advanced Analytics and Machine Learning course in R is being taught at EDP.

• Access to data can be hard when dealing with often overloaded source systems. CDC extractors seem the best way to extract data from the source systems and have it available in near real time.

• The development of a Data Lake will decrease the number of interfaces between the systems.

• The Unified Data Model, where information is cataloged, allows for a unique understanding of the available data (single source of truth). But we need Data Governance!


Thank you!