WhoAmI
Entrepreneurial geek
Micro-clusters for BigData enthusiasts
BigData Services
Belgian Community
Agenda
Part 1 : BigData
Part 2 : IoT and BigData
Part 3 : Discussions
BigData
Part 1
The Buzz
Started by Google
● GFS Paper (2003)
● MapReduce Paper (2004)
Storing and processing “Big” Data
~ multiple petabytes
The Need
Large amount of data
The Need
High speed at which data becomes available
The Need
The different ways data is structured
Technologies
Batch Processing: Hadoop
Realtime Processing: Storm
Platform for Batch Processing
Yahoo!
Doug Cutting
Mike Cafarella
Fetching and processing webpages
Storage: HDFS
Resource management: YARN
Processing: MapReduce
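As an illustration of the MapReduce model named above, here is a minimal single-process sketch in plain Python. It is not Hadoop code; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample documents are assumptions made for this example. On a real cluster the map and reduce steps would run in parallel on the nodes that hold the data.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values of one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce one after the other (hypothetical driver)."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


counts = word_count(["big data is big", "data is everywhere"])
```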
Apache Storm
Platform for Realtime Processing
Looks like Complex Event Processing (CEP)
BackType
Nathan Marz
Acquired by Twitter
Like “ever-running” MapReduce jobs
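The “ever-running” idea can be sketched with plain Python generators. This is not the Storm API; it only borrows Storm's terms "spout" (a source of tuples) and "bolt" (a processing step). The names `sensor_spout` and `counting_bolt` and the fake sensor readings are assumptions made for this example.

```python
import itertools
from collections import Counter


def sensor_spout():
    """Hypothetical unbounded source: yields one reading after another, forever."""
    for i in itertools.count():
        yield {"sensor": f"s{i % 3}", "value": i}


def counting_bolt(stream):
    """Keep running state and emit an updated count after every incoming tuple."""
    counts = Counter()
    for event in stream:
        counts[event["sensor"]] += 1
        yield event["sensor"], counts[event["sensor"]]


# The pipeline never finishes on its own; take the first six results to inspect it.
first_results = list(itertools.islice(counting_bolt(sensor_spout()), 6))
```

Unlike a batch job, nothing here waits for the input to end: every result is available as soon as its tuple has passed through.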
Myths
“BigData is all about storing data”
“BigData is only for big companies having lots of data”
“There are no BigData use cases in Belgium”
“BigData is only being used for analytics”
Dangers
People are hard to find
Technologies are “different”
Too many technologies
You need hardware
Incorrect information
IoT & BigData
Part 2
IoT as data source
Use data from “Things”
As raw as possible
No matter which format
Store everything
IoT as data source
Adapt the processing flow based on the data flowing through it:
- Disable faulty sensors
- Take processing shortcuts
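A minimal sketch of a flow that adapts itself, assuming a sensor counts as faulty after three out-of-range readings in a row and that an unchanged reading can skip the expensive step. The class `AdaptiveFlow`, the threshold and the valid range are all hypothetical values chosen for this example.

```python
from collections import defaultdict

MAX_BAD_READINGS = 3         # assumed threshold before a sensor is disabled
VALID_RANGE = (-40.0, 85.0)  # assumed plausible range for a reading


class AdaptiveFlow:
    """Processing step that changes its own behaviour based on the data it sees."""

    def __init__(self):
        self.bad_streak = defaultdict(int)  # consecutive invalid readings per sensor
        self.disabled = set()               # sensors that are ignored from now on
        self.last_value = {}                # last valid reading per sensor

    def process(self, sensor, value):
        if sensor in self.disabled:
            return "dropped"
        low, high = VALID_RANGE
        if not (low <= value <= high):
            self.bad_streak[sensor] += 1
            if self.bad_streak[sensor] >= MAX_BAD_READINGS:
                self.disabled.add(sensor)   # disable the faulty sensor
            return "invalid"
        self.bad_streak[sensor] = 0
        if self.last_value.get(sensor) == value:
            return "shortcut"               # nothing changed: skip the heavy work
        self.last_value[sensor] = value
        return "processed"
```

For example, three calls like `flow.process("t2", 999.0)` each return `"invalid"`, after which every later reading from `"t2"` is `"dropped"`.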
IoT as data source
Control other “Things” based on the data flowing through:
- Rain predicted => disable sprinklers
- Check-out at work, trip home will take 60 min => start heating in 30 min
It would be great to link this with a rule engine (e.g. Drools)
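The two example rules can be sketched as plain Python functions. This is not Drools; it only illustrates the shape of "event in, commands out". The event fields (`type`, `value`, `trip_minutes`), the action tuples and the 30-minute heating lead time are assumptions made for this example.

```python
HEATING_LEAD_MINUTES = 30  # assumed time the heating needs before arrival


def rain_rule(event):
    """Rain predicted => disable sprinklers immediately."""
    if event.get("type") == "forecast" and event.get("value") == "rain":
        return [("sprinklers", "disable", 0)]
    return []


def checkout_rule(event):
    """Check-out at work => start heating so the house is warm on arrival."""
    if event.get("type") == "work_checkout":
        delay = max(0, event["trip_minutes"] - HEATING_LEAD_MINUTES)
        return [("heating", "start", delay)]
    return []


RULES = [rain_rule, checkout_rule]


def evaluate(event):
    """Run every rule and collect (thing, command, delay_minutes) actions."""
    return [action for rule in RULES for action in rule(event)]
```

With a 60-minute trip home, `evaluate({"type": "work_checkout", "trip_minutes": 60})` asks the heating to start in 30 minutes.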
IoT as data source
Store data for more complex processing:
- Pattern or behaviour recognition
- Forecasting
- Prevention
IoT as data source
Storage:
Every event is a record
Every event is immutable
Every event has:
- a source
- a timestamp
- a type
- a value
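The event record described above could look like this in Python. The field names follow the slide; the class name `Event`, the in-memory `event_log` list and the `record` helper are assumptions made for this example, standing in for whatever store is actually used.

```python
from dataclasses import dataclass, FrozenInstanceError
from datetime import datetime, timezone


@dataclass(frozen=True)  # frozen: fields cannot be changed after creation
class Event:
    source: str          # which "Thing" produced the event
    timestamp: datetime  # when it was recorded
    type: str            # what kind of measurement or signal it is
    value: object        # kept as raw as possible, whatever the format


event_log = []  # hypothetical append-only store: events are added, never updated


def record(source, event_type, value):
    """Create an immutable event stamped with the current UTC time and store it."""
    event = Event(source, datetime.now(timezone.utc), event_type, value)
    event_log.append(event)
    return event
```

Trying to assign to a field of a stored event, for example `event.value = 0`, raises `FrozenInstanceError`, so a correction has to be recorded as a new event.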
IoT as Platform - Idea
Quantity > Quality
One platform to
- Store
- Process
Nodes can be
- Anything with a controller
- Anything that can store data
IoT as Platform - Risks
Complex Resource Management
- Exploit Capabilities
Complex Data Management
- Fragmented Data
Complex Security Model
- Obfuscate Data
IoT as Platform - Gains
Keep your friends close, but your data closer.
Delegate processing
Use resources in the best way possible
What do you think?
Part 3