Powering a Virtual Power Station with Big Data Michael Bironneau April 2016


Page 1: Powering a Virtual Power Station with Big Data

Powering a Virtual Power Station with Big Data

Michael Bironneau

April 2016

Page 2: Powering a Virtual Power Station with Big Data

[Bar chart: Installed Capacity (GW) and Generation (GW), axis 0 to 35, by source: CCGT, Coal, Nuclear, Wind, Interconnectors, OCGT, Pumped Storage, Solar, Oil, Biomass, Hydro]

Page 3: Powering a Virtual Power Station with Big Data
Page 4: Powering a Virtual Power Station with Big Data
Page 5: Powering a Virtual Power Station with Big Data
Page 6: Powering a Virtual Power Station with Big Data

[Chart: Total Power (MW), axis 0 to 20]

Average upwards flex: 120%

Average downwards flex: 35%

Page 7: Powering a Virtual Power Station with Big Data

?

?

Page 8: Powering a Virtual Power Station with Big Data

Open Energi in the coming year:

• 25-40k messages processed per second
• Total size of data 500TB-800TB

Page 9: Powering a Virtual Power Station with Big Data

Open Energi in the coming year:

• 25-40k messages processed per second
• Total size of data 500TB-800TB

Perspective: here’s what “big data” means to Boeing [1]:

• ~64k messages per second from each aircraft
• Total size of data over 100 petabytes

[1]: http://bit.ly/18kQlMn
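
For a sense of scale, the figures above can be turned into rough yearly message counts and a size ratio. This is a back-of-envelope sketch only: it assumes the quoted rates are sustained for a full 365-day year and takes 1 PB = 1000 TB, and the function names are illustrative.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: non-leap year, constant rate
TB_PER_PB = 1000                       # assumption: decimal units


def messages_per_year(rate_per_second):
    """Messages accumulated in a year at a constant per-second rate."""
    return rate_per_second * SECONDS_PER_YEAR


def size_ratio(boeing_pb, open_energi_tb):
    """How many times larger the Boeing figure is than the Open Energi one."""
    return boeing_pb * TB_PER_PB / open_energi_tb


low, high = messages_per_year(25_000), messages_per_year(40_000)
print(f"{low:.2e} to {high:.2e} messages per year")
print(f"100 PB is {size_ratio(100, 800):.0f}x to {size_ratio(100, 500):.0f}x the projected total")
```

Under these assumptions the message count comes to roughly 0.8 to 1.3 trillion per year, and Boeing's quoted lower bound of 100 PB is 125 to 200 times the projected Open Energi total.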

Page 10: Powering a Virtual Power Station with Big Data

Our data is not huge at the moment…

[Bar chart: Size of data (PB), Open Energi vs. Boeing, axis 0 to 120]

Page 11: Powering a Virtual Power Station with Big Data

…but after domestic demand-side response (or something else on that scale)

[Bar chart: Size of data (PB), Open Energi vs. Boeing, axis 0 to 120]

Page 12: Powering a Virtual Power Station with Big Data

Why Hortonworks Data Platform

• Can scale quickly to respond to market demands
• Interoperability with existing code
• Fantastic data integration
• Knowledgeable technical support
• Security and data governance

Page 13: Powering a Virtual Power Station with Big Data

Batch | Our HDP setup

Flume

Asset Data

National Electricity Data

Market data

Other “live” timeseries data

Hive Streaming

Hive

Other applications

Page 14: Powering a Virtual Power Station with Big Data

Real-time | (Work ongoing)

Asset Data

ML models

HDFS, cache, Elasticsearch…

Update ML Models

Correlate Events

Enrich
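
The stages named on this slide are described as work in progress, so the following is only a sketch of what an enrich step and a simple event-correlation step could look like in plain Python, not Open Energi's implementation. The message fields, the `ASSET_METADATA` cache, and the 5-second window are all hypothetical.

```python
from collections import defaultdict

# Hypothetical metadata cache keyed by asset id (stands in for the "cache" store).
ASSET_METADATA = {
    "pump-1": {"site": "site-a", "type": "pump"},
    "fan-7": {"site": "site-a", "type": "fan"},
    "pump-9": {"site": "site-b", "type": "pump"},
}


def enrich(message, metadata=ASSET_METADATA):
    """Return a copy of the message with any known asset metadata merged in."""
    enriched = dict(message)
    enriched.update(metadata.get(message["asset_id"], {}))
    return enriched


def correlate(messages, window_seconds=5):
    """Group messages by (site, fixed time bucket); keep groups with more than one asset."""
    buckets = defaultdict(list)
    for msg in messages:
        key = (msg.get("site"), msg["timestamp"] // window_seconds)
        buckets[key].append(msg)
    return {
        key: group
        for key, group in buckets.items()
        if len({m["asset_id"] for m in group}) > 1
    }


# Made-up sample messages, for illustration only.
raw = [
    {"asset_id": "pump-1", "timestamp": 100, "power_kw": 4.2},
    {"asset_id": "fan-7", "timestamp": 102, "power_kw": 1.1},
    {"asset_id": "pump-9", "timestamp": 103, "power_kw": 3.0},
]
enriched = [enrich(m) for m in raw]
groups = correlate(enriched)
```

Here the two site-a assets fall into the same bucket and are reported together, while the lone site-b message is dropped. Fixed buckets are the simplest choice; a sliding window would catch events that straddle a bucket boundary.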

Page 15: Powering a Virtual Power Station with Big Data

Apache Hive | Example

Index semi-structured data (Elasticsearch)

CREATE EXTERNAL TABLE semi_structured_stuff (...)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.resource' = 'semi/structured',
              'es.index.auto.create' = 'false');

Use Hive to integrate this with timeseries data and other metadata

SELECT something
FROM semi_structured_stuff
JOIN metadata m ON …
LEFT JOIN timeseries t ON …

Farm out complex analytics to Python

SELECT transform(something)
USING 'insane_maths.py'
AS (result)
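
Hive's transform step streams each selected row to the script's standard input as a tab-separated line and reads rows back from its standard output, with NULL encoded as `\N` by default. A minimal sketch of what a script such as `insane_maths.py` might look like, assuming a single numeric input column; the square-root calculation in `compute` is a hypothetical stand-in, not the analytics from the talk.

```python
#!/usr/bin/env python
"""Hypothetical stand-in for insane_maths.py, invoked through Hive TRANSFORM."""
import math
import sys

HIVE_NULL = "\\N"  # Hive's default text encoding of NULL


def compute(value):
    """Placeholder calculation (an assumption): square root of the input."""
    return math.sqrt(value)


def transform_line(line):
    """Map one tab-separated input row to one single-column output row."""
    fields = line.rstrip("\n").split("\t")
    raw = fields[0]
    if raw == HIVE_NULL:
        return HIVE_NULL
    try:
        return str(compute(float(raw)))
    except ValueError:
        # Non-numeric or out-of-domain input: emit NULL instead of failing the job.
        return HIVE_NULL


def main(stdin=sys.stdin, stdout=sys.stdout):
    for line in stdin:
        stdout.write(transform_line(line) + "\n")


if __name__ == "__main__":
    main()
```

The script typically has to be shipped to the cluster first (for example with Hive's `ADD FILE`) and be executable, or be invoked through an interpreter in the `USING` clause.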

Page 16: Powering a Virtual Power Station with Big Data

Benefits

• Reduced storage cost compared to SAN + SQL Server
• Better utilisation of infrastructure thanks to YARN
• Pain-free integration of multiple data sources with external tables in Hive
• Scale up/down on demand
• Re-use existing Python code = low development overhead

Page 17: Powering a Virtual Power Station with Big Data

Dynamic Demand

• Simulations
• Insights via web
• Machine learning
• Statistical Analysis
• Event correlation
• Expert system
• Real-time aggregation
• Real-time web feed

Page 18: Powering a Virtual Power Station with Big Data


Page 19: Powering a Virtual Power Station with Big Data

Thanks for listening. Any questions?