

SUNSEED project is partially funded by EC FP7 programme under grant agreement #619437.

Big Data Stream Mining
• Maintain summaries of the streams, sufficient to answer the expected queries about the data
  • Summaries can take various forms: clusters (flat or hierarchical), statistical aggregates, …
• Maintain a sliding window of the most recently arrived data
  • Operations on a sliding window mimic more traditional database/mining operations
• Sampling – obtain a representative data sample (i.e., one on which the required operations can be performed correctly)
  • Smart sampling (x % from a stream of multiple data sources; alternatively, take y % of selected data sources)
• Similarity comparison – smart indexing
• Incremental updating of predictive models
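The sliding-window and sampling items above can be illustrated with a minimal Python sketch. It is not taken from the SUNSEED implementation: the class `SlidingWindowMean`, the function `reservoir_sample`, and the toy readings are hypothetical names and values chosen for illustration. The window keeps a running mean of the most recent readings, and the sampler uses standard reservoir sampling (Algorithm R), which is one common way to draw a uniform sample from a stream of unknown length.

```python
import random
from collections import deque


class SlidingWindowMean:
    """Keep the last `size` readings and their mean (hypothetical helper)."""

    def __init__(self, size):
        self.window = deque(maxlen=size)
        self.total = 0.0

    def add(self, value):
        # If the window is full, the oldest reading is about to be evicted,
        # so remove it from the running total first.
        if len(self.window) == self.window.maxlen:
            self.total -= self.window[0]
        self.window.append(value)
        self.total += value

    def mean(self):
        if not self.window:
            raise ValueError("no readings in the window yet")
        return self.total / len(self.window)


def reservoir_sample(stream, k, rng):
    """Uniform random sample of k items from a stream of unknown length."""
    sample = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            sample.append(item)
        else:
            # Keep the new item with probability k / (i + 1).
            j = rng.randint(0, i)  # both ends inclusive
            if j < k:
                sample[j] = item
    return sample


if __name__ == "__main__":
    window = SlidingWindowMean(size=3)
    for reading in [10.0, 12.0, 11.0, 15.0, 14.0]:  # toy load readings
        window.add(reading)
    print("mean of last 3 readings:", window.mean())
    print("sample:", reservoir_sample(range(1000), 5, random.Random(42)))
```

Both structures use memory that depends only on the window size or sample size, not on the length of the stream, which is the property the summaries above rely on.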

M. Skrjanc, B. Kazic
{Maja.Skrjanc, Blaz.Kazic}@ijs.si,
Jozef Stefan Institute, Jamova ul. 39, Ljubljana, Slovenia

Forecasting in Smart Grids
• Types of forecasting problems:
  • Electricity load (short term, medium term, long term)
  • Renewable sources generation
  • Electricity prices
  • Customer segmentation
• Input sources:
  • Historical load variables: used for learning models and detecting short-term trends
  • Meteorological data: known to be correlated with load (depends on location)
  • Static data: such as special calendar data (holidays, summer season) and the topology of the electrical grid
• Methods used:
  • Naive approach: localized averages, previous values. Computationally undemanding, fast, robust and easy to maintain. Can work surprisingly well.
  • Classical approaches: autoregressive (ARIMA) and regression-based statistical methods. Based on historical data. Can take advantage of seasonality trends, but usually don't include other data sources.
  • Computational intelligence approaches: artificial neural networks, support vector machines. Data-driven approach that can take advantage of various heterogeneous data sources.
  • Hybrid methods: combine two or more different approaches in order to take advantage of each method's benefits and overcome their drawbacks.
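The naive approach listed above (previous values and localized averages) can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the forecasting code used in the project: the function names, the `period` and `n_periods` parameters, and the toy series are hypothetical. A "localized average" is interpreted here as the average of the values observed at the same position in previous cycles, for example the same hour on previous days when `period=24` for hourly load.

```python
def persistence_forecast(history):
    """Predict the next value as the most recently observed value."""
    if not history:
        raise ValueError("history is empty")
    return history[-1]


def localized_average_forecast(history, period, n_periods):
    """Predict the next value as the average of the values seen at the
    same position in up to `n_periods` previous cycles of length `period`."""
    target = len(history)  # index of the value being predicted
    same_slot = [
        history[target - k * period]
        for k in range(1, n_periods + 1)
        if target - k * period >= 0  # skip cycles older than the history
    ]
    if not same_slot:
        raise ValueError("history is shorter than one period")
    return sum(same_slot) / len(same_slot)


if __name__ == "__main__":
    # Toy series with a cycle of 4 steps and a slow upward trend.
    load = [1, 2, 3, 4, 2, 3, 4, 5, 3, 4, 5, 6]
    print("persistence:", persistence_forecast(load))
    print("localized average:", localized_average_forecast(load, period=4, n_periods=2))
```

Baselines like these need no training, so they are useful as a reference point when judging whether a classical or computational intelligence model is worth its extra cost.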