Copyright : Futuretext Ltd. London 1 International Workshop on Big Data Applications and Principles Madrid By Ajit Jaokar Sep 2014 @ajitjaokar [email protected]



www.opengardensblog.futuretext.com

World Economic Forum. Spoken at MWC (5 times), CEBIT, CTIA, Web 2.0, CNN, BBC, Oxford Uni, Uni St Gallen, European Parliament. @feynlabs – teaching kids Computer Science. Advisory – Connected Liverpool


Machine Learning for IoT and Telecoms

futuretext applies machine learning techniques to complex problems in the IoT (Internet of Things) and Telecoms domains.

We aim to provide a distinct competitive advantage to our customers through the application of machine learning techniques.

Philosophy: Think of NEST. NEST has no interface. Its interface is based on ‘machine learning’, i.e. it learns and becomes better with use. This will be common with ALL products and will determine the competitive advantage of companies. It's a winner-takes-all game! Every product will have a ‘self learning’ interface/component, and the product which learns best will win!


• IoT

• Machine Learning

• IoT and Machine Learning

• Case studies and applications


www.futuretext.com @AjitJaokar [email protected]


Image source: Guardian


IOT - THE INDUSTRY- STATE OF PLAY


State of play - 2014
• Our industry is exciting – but mature. It is now a two horse race for devices, with Samsung around 70% of Android
• Spectrum allocations and ‘G’ cycles are predictable - 5G around 2020
• 50 billion connected devices by 2020
• ITU World Radiocommunication Conference, November 2015
• IOT has taken off, not because of EU and corporate efforts, but because of mobile, Kickstarter, health apps, iBeacon and of course NEST (acquired by Google)


Stage one: Early innovation (1999 – 2007)
• Regulatory innovation – net neutrality
• Device innovation (Nokia 7110 and Ericsson T68i)
• Operator innovation (pricing, bundling, Enterprise)
• Connectivity innovation (SMS, BBM)
• Content innovation (ringtones, games, EMS, MMS)
• Ecosystem innovation (iPhone)

Stage two: Ecosystem innovation – iPhone and Android (2007 – 2010)
• Social innovation
• Platform innovation
• Community innovation
• Long tail innovation
• Application innovation


Stage three: Market consolidation (2010 – 2013)
And then there were two ...
• Platform innovation and consolidation
• Security innovation
• App innovation

Stage four: Three dimensions (2014 onwards)
• Horizontal – apps (iPhone and Android)
• Vertical (across the stack) – hardware, security, data
• Network – 5G and pricing


Many of the consumer IOT cases will happen with iBeacon in the next two years


And 5G will provide the WAN connectivity (5G image source: Ericsson)


Samsung Gear Fit named “Best Mobile Device” of Mobile World Congress

Notification or Quantification? – Displays (LED, e-paper, Mirasol, OLED and LCD) - Touchscreen or hardware controls? - Battery life and charging


Hotspot 2.0


Three parallel ecosystems

IOT is connecting things to the Internet – which is not the same as connecting things to the cellular network! The difference is money, and customers realise it.

• IOT local/personal (iBeacon, Kickstarter, health apps)
• M2M – Machine to Machine
• IOT – pervasive (5G, Hotspot 2.0)

Perspectives
• 2014 – 2015 (radio conference) – 2020 (5G)
• 2014 – iBeacon (motivates retailers to open WiFi)
• Hotspot 2.0 – connects the cellular and WiFi worlds
• Default WiFi and local world?
• Operator world – (Big) Data, corporate, pervasive apps – will really happen beyond 2020
• So 5G will be timed well. The ecosystems will develop and they will be connected by 5G


IOT – INTERNET OF THINGS


As the term Internet of Things (IOT) implies, IOT is about smart objects.

For an object (say a chair) to be ‘smart’ it must have three things:
- An identity (to be uniquely identifiable, e.g. via IPv6)
- A communication mechanism (i.e. a radio) and
- A set of sensors / actuators

For example, the chair may have a pressure sensor indicating that it is occupied. Now, if it is able to know who is sitting, it could correlate more data by connecting to the person’s profile. If it is in a cafe, whole new data sets can be correlated (about the venue, about who else is there etc).

Thus, IOT is all about Data ..

IoT != M2M (M2M is a subset of IoT)


Sensors lead to a LOT of data (relative to mobile) (source: David Wood blog)

By 2020, we are expected to have 50 billion connected devices. To put this in context:
• The first commercial citywide cellular network was launched in Japan by NTT in 1979
• The milestone of 1 billion mobile phone connections was reached in 2002
• The 2 billion mobile phone connections milestone was reached in 2005
• The 3 billion mobile phone connections milestone was reached in 2007
• The 4 billion mobile phone connections milestone was reached in February 2009

Gartner: IoT will unearth more than $1.9 trillion in revenue before 2020; Cisco thinks there will be upwards of 50 billion connected devices by the same date; IDC estimates technology and services revenue will grow worldwide to $7.3 trillion by 2017 (up from $4.8 trillion in 2012).


So, 50 billion by 2020 is a large number.

Smart cities can be seen as an application domain of IOT.

In 2008, for the first time in history, more than half of the world’s population was living in towns and cities. By 2030 this number is expected to swell to almost 5 billion, with urban growth concentrated in Africa and Asia, including many mega-cities (10 million+ inhabitants). By 2050, 70% of humanity is expected to live in cities.

That’s a profound change and will require a different management approach from what is possible today.

Also, the economic wealth of a nation could be seen as Energy + Entrepreneurship + Connectivity (sensor level + network level + application level). Hence, if IOT is seen as a part of a network, then it is a core component of GDP.


Machine Learning


What is Machine Learning?

Mitchell's Machine Learning

Tom Mitchell, in his book Machine Learning: “The field of machine learning is concerned with the question of how to construct computer programs that automatically improve with experience.”

Formally: “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.”

Think of it as a design tool where we need to understand:
• What data to collect for the experience (E)
• What decisions the software needs to make (T) and
• How we will evaluate its results (P)

A programmer's perspective: Machine Learning involves
a) Training a model from data
b) Predicting / extrapolating a decision
c) Against a performance measure
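The E / T / P framing can be made concrete with a very small sketch. Everything here is an illustrative assumption rather than something from the slides: the temperature readings are made up, and the "model" is just a single threshold placed midway between the two class means.

```python
# Toy illustration of Mitchell's E / T / P framing (all data is made up).
# T: decide whether a temperature reading is "hot" (1) or "normal" (0)
# E: labelled example readings
# P: accuracy on held-out readings

def train(examples):
    """a) Train a model: one threshold midway between the two class means."""
    hot = [x for x, label in examples if label == 1]
    normal = [x for x, label in examples if label == 0]
    return (sum(hot) / len(hot) + sum(normal) / len(normal)) / 2

def predict(threshold, x):
    """b) Predict a decision for a new reading."""
    return 1 if x > threshold else 0

def accuracy(threshold, examples):
    """c) Performance measure: fraction of correct decisions."""
    correct = sum(predict(threshold, x) == label for x, label in examples)
    return correct / len(examples)

experience = [(18.0, 0), (20.0, 0), (22.0, 0), (30.0, 1), (32.0, 1), (34.0, 1)]
held_out = [(19.0, 0), (21.0, 0), (31.0, 1), (33.0, 1)]

model = train(experience)
print("threshold:", model, "accuracy:", accuracy(model, held_out))
```

The three functions map one-to-one onto steps a), b) and c) above; a real project would swap in a richer model but keep the same shape.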


What Problems Can Machine Learning Address? (source: Jason Brownlee)

• Spam Detection
• Credit Card Fraud Detection
• Digit Recognition
• Speech Understanding
• Face Detection
• Product Recommendation
• Medical Diagnosis
• Stock Trading
• Customer Segmentation
• Shape Detection


Types of Problems

• Classification: Data is labelled, meaning it is assigned a class, for example spam/non spam or fraud/non fraud. The decision being modelled is to assign labels to new unlabelled pieces of data. This can be thought of as a discrimination problem, modelling the differences or similarities between groups.

• Regression: Data is labelled with a real value rather than a label. Examples that are easy to understand are time series data like the price of a stock over time. The decision being modelled is the relationship between inputs and outputs.

• Clustering: Data is not labelled, but can be divided into groups based on similarity and other measures of natural structure in the data. An example from the above list would be organising pictures by faces without names, where the human user has to assign names to groups, like iPhoto on the Mac.

• Rule Extraction: Data is used as the basis for the extraction of propositional rules (antecedent/consequent or if-then).

It is often necessary to work backwards from a problem to the algorithm and then work with data. Hence, you need a depth of domain experience and also algorithm experience.


What Algorithms Does Machine Learning Provide?

• Regression
• Instance-based Methods
• Decision Tree Learning
• Bayesian
• Kernel Methods
• Clustering Methods
• Association Rule Learning
• Artificial Neural Networks
• Deep Learning
• Dimensionality Reduction
• Ensemble Methods


An Algorithmic Perspective

Marsland adopts the Mitchell definition of Machine Learning in his book Machine Learning: An Algorithmic Perspective.

“One of the most interesting features of machine learning is that it lies on the boundary of several different academic disciplines, principally computer science, statistics, mathematics, and engineering (multidisciplinary).

…machine learning is usually studied as part of artificial intelligence, which puts it firmly into computer science …understanding why these algorithms work requires a certain amount of statistical and mathematical sophistication that is often missing from computer science undergraduates.”


Definition of Machine Learning

A one sentence definition is: “Machine Learning is the training of a model from data that generalizes a decision against a performance measure.”

1) Training a model suggests training examples.
2) A model suggests state acquired through experience.
3) Generalizes a decision suggests the capability to make a decision based on inputs and anticipating unseen inputs in the future for which a decision will be required.
4) Finally, against a performance measure suggests a targeted need and directed quality to the model being prepared.


Key concepts: Data

Instance: A single row of data is called an instance. It is an observation from the domain.

Feature: A single column of data is called a feature. It is a component of an observation and is also called an attribute of a data instance. Some features may be inputs to a model (the predictors) and others may be outputs or the features to be predicted.

Data Type: Features have a data type. They may be real or integer valued, or may have a categorical or ordinal value. You can have strings, dates, times, and more complex types, but typically they are reduced to real or categorical values when working with traditional machine learning methods.

Datasets: A collection of instances is a dataset, and when working with machine learning methods we typically need a few datasets for different purposes.

Training Dataset: A dataset that we feed into our machine learning algorithm to train our model.

Testing Dataset: A dataset that we use to validate the accuracy of our model but is not used to train the model. It may be called the validation dataset.
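As a rough sketch of these terms, the snippet below builds a tiny dataset of instances (rows) with named features (columns) and splits it into a training dataset and a testing dataset. The feature names, the values, and the 70/30 split are all assumptions chosen for illustration.

```python
import random

# Each dict is one instance (row); each key is one feature (column).
# "occupied" plays the role of the feature to be predicted.
dataset = [
    {"temperature": 20.0 + i, "humidity": 40 + i % 5, "occupied": i % 2}
    for i in range(10)
]

def split(instances, test_fraction=0.3, seed=0):
    """Shuffle a copy of the instances, then hold back a fraction for testing."""
    shuffled = instances[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

training, testing = split(dataset)
print(len(training), "training instances,", len(testing), "testing instances")
```

The testing instances are never shown to the learning algorithm, which is what makes them useful for estimating how the model behaves on unseen data.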


Learning

Machine learning is indeed about automated learning with algorithms. In this section we will consider a few high level concepts about learning.

Induction: Machine learning algorithms learn through a process called induction or inductive learning. Induction is a reasoning process that makes generalizations (a model) from specific information (training data).

Generalization: Generalization is required because the model that is prepared by a machine learning algorithm needs to make predictions or decisions based on specific data instances that were not seen during training.

Over Learning: When a model learns the training data too closely and does not generalize, this is called over learning. The result is poor performance on data other than the training dataset. This is also called over fitting.

Under Learning: When a model has not learned enough structure from the data because the learning process was terminated early, this is called under learning. The result is a model that behaves consistently across datasets but performs poorly on all data, including the training dataset. This is also called under fitting.
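The difference between over learning and under learning can be seen on a toy problem. In this sketch the data, the deliberately mislabelled training points, and all three "models" are assumptions made up for illustration: a memorizer that copies its nearest training point, a constant predictor, and a simple threshold.

```python
# Made-up 1-D data. The true rule is "label 1 when x > 5"; two training
# labels (at x = 3 and x = 8) are deliberately wrong to simulate noise.
train = [(0.5, 0), (1, 0), (2, 0), (3, 1), (4, 0), (6, 1), (7, 1), (8, 0), (9, 1)]
test = [(1.5, 0), (2.6, 0), (3.4, 0), (5.5, 1), (7.6, 1), (8.4, 1)]

def memorizer(x):
    """Over learning: copy the label of the single nearest training point."""
    return min(train, key=lambda point: abs(point[0] - x))[1]

def constant(x):
    """Under learning: ignore x and always answer with the most common label."""
    labels = [label for _, label in train]
    return max(set(labels), key=labels.count)

def threshold(x):
    """A middle ground: split at the midpoint between the two class means."""
    zeros = [v for v, label in train if label == 0]
    ones = [v for v, label in train if label == 1]
    cut = (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2
    return 1 if x > cut else 0

def accuracy(model, data):
    return sum(model(x) == label for x, label in data) / len(data)

for model in (memorizer, constant, threshold):
    print(model.__name__, accuracy(model, train), accuracy(model, test))
```

The memorizer is perfect on the training dataset but reproduces the noise on new points, while the constant predictor is equally mediocre everywhere; the threshold model gives up some training accuracy and generalizes better.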


Online Learning: Online learning is when a method is updated with data instances from the domain as they become available. Online learning requires methods that are robust to noisy data, but can produce models that are more in tune with the current state of the domain.

Offline Learning: Offline learning is when a method is created on pre-prepared data and is then used operationally on unobserved data. The training process can be controlled and can be tuned carefully because the scope of the training data is known. The model is not updated after it has been prepared, and performance may decrease if the domain changes.
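A minimal sketch of the online/offline distinction, using the simplest possible "model": the typical value of a sensor reading. The class name, function name and readings are hypothetical. The online version updates itself one instance at a time; the offline version is computed once from a prepared batch.

```python
class OnlineMean:
    """Online learning: the estimate is updated as each reading arrives."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, reading):
        self.count += 1
        # Incremental mean: nudge the estimate toward the new reading.
        self.mean += (reading - self.mean) / self.count
        return self.mean

def offline_mean(readings):
    """Offline learning: the estimate is fitted once on pre-prepared data."""
    return sum(readings) / len(readings)

readings = [21.0, 22.0, 20.0, 25.0]

online = OnlineMean()
for r in readings:
    online.update(r)

print(online.mean, offline_mean(readings))
```

Both arrive at the same value on this batch, but only the online model keeps tracking the domain if further readings arrive after deployment.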

Supervised Learning: This is a learning process for generalizing on problems where a prediction is required. A "teaching process" compares predictions by the model to known answers and makes corrections in the model.

Unsupervised Learning: This is a learning process for generalizing the structure in the data where no prediction is required. Natural structures are identified and exploited for relating instances to each other.


Technique: Classification

Applicability: Most commonly used technique for predicting a specific outcome such as response / no-response, high / medium / low-value customer, likely to buy / not buy.

Algorithms:
• Logistic Regression: classic statistical technique, now available inside the Oracle Database, and supports text and transactional data
• Naive Bayes: fast, simple, commonly applicable
• Support Vector Machine: next generation, supports text and wide data
• Decision Tree: popular, provides human-readable rules

Source: Oracle


Technique: Regression

Applicability: Technique for predicting a continuous numerical outcome such as customer lifetime value, house value, process yield rates.

Algorithms:
• Multiple Regression: classic statistical technique, now available inside the Oracle Database, and supports text and transactional data
• Support Vector Machine: next generation, supports text and wide data

Technique: Attribute Importance

Applicability: Ranks attributes according to strength of relationship with the target attribute. Use cases include finding factors most associated with customers who respond to an offer, and factors most associated with healthy patients.

Algorithms:
• Minimum Description Length: considers each attribute as a simple predictive model of the target class

Source: Oracle


Technique: Anomaly Detection

Applicability: Identifies unusual or suspicious cases based on deviation from the norm. Common examples include health care fraud, expense report fraud, and tax compliance.

Algorithms:
• One-Class Support Vector Machine: trains on "normal" cases to flag unusual cases

Technique: Clustering

Applicability: Useful for exploring data and finding natural groupings. Members of a cluster are more like each other than they are like members of a different cluster. Common examples include finding new customer segments, and life sciences discovery.

Algorithms:
• Enhanced K-Means: supports text mining, hierarchical clustering, distance based
• Orthogonal Partitioning Clustering: hierarchical clustering, density based
• Expectation Maximization: clustering technique that performs well in mixed data (dense and sparse) data mining problems

Source: Oracle


Technique: Association

Applicability: Finds rules associated with frequently co-occurring items, used for market basket analysis, cross-sell, root cause analysis. Useful for product bundling, in-store placement, and defect analysis.

Algorithms:
• Apriori: industry standard for market basket analysis

Technique: Feature Selection and Extraction

Applicability: Produces new attributes as linear combinations of existing attributes. Applicable for text data, latent semantic analysis, data compression, data decomposition and projection, and pattern recognition.

Algorithms:
• Non-negative Matrix Factorization: next generation, maps the original data into the new set of attributes
• Principal Components Analysis (PCA): creates a smaller set of new composite attributes that represent all the attributes
• Singular Value Decomposition: established feature extraction method that has a wide range of applications

Source: Oracle


Recap: machine learning with IoT

Supervised Learning

In supervised learning, a labeled training set (i.e., predefined inputs and known outputs) is used to build the system model. This model represents the learned relation between the input, output and system parameters.

K-nearest neighbor (k-NN): This supervised learning algorithm classifies a data sample (called a query point) based on the labels (i.e., the output values) of the nearby data samples. For example, missing readings of a sensor node can be predicted using the average measurements of neighboring sensors within specific diameter limits. There are several functions to determine the nearest set of nodes. A simple method is to use the Euclidean distance between different sensors.
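The missing-reading example can be sketched in a few lines. This is only an illustration under stated assumptions: sensors are given made-up 2-D positions, `impute_reading` is a hypothetical helper, and the choices of `k` and `max_distance` are arbitrary.

```python
import math

def impute_reading(target_pos, neighbours, k=3, max_distance=10.0):
    """Estimate a missing reading as the mean of the k nearest sensors in range.

    neighbours: list of ((x, y), reading) pairs for sensors that did report.
    Returns None when no sensor lies within max_distance of target_pos.
    """
    in_range = []
    for pos, reading in neighbours:
        distance = math.dist(target_pos, pos)  # Euclidean distance
        if distance <= max_distance:
            in_range.append((distance, reading))
    if not in_range:
        return None
    nearest = sorted(in_range)[:k]
    return sum(reading for _, reading in nearest) / len(nearest)

# Made-up deployment: three nearby sensors and one far-away outlier.
sensors = [
    ((0.0, 1.0), 20.0),
    ((1.0, 0.0), 22.0),
    ((3.0, 4.0), 24.0),
    ((50.0, 50.0), 90.0),
]
estimate = impute_reading((0.0, 0.0), sensors)
print(estimate)
```

The distance limit keeps the far-away sensor from influencing the estimate even though it reported a value.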

Decision tree (DT): It is a classification method for predicting labels of data by iterating the input data through a learning tree. During this process, the feature properties are compared against decision conditions to reach a specific category. For example, DT provides a simple but efficient method to identify link reliability in WSNs by identifying a few critical features such as loss rate, corruption rate, mean time to failure (MTTF) and mean time to restore (MTTR).


Neural networks (NNs): This learning algorithm can be constructed by cascading chains of decision units (e.g., perceptrons or radial basis functions) used to recognize non-linear and complex functions. In WSNs, using neural networks in a distributed manner is still not widespread due to the high computational requirements for learning the network weights, as well as the high management overhead. However, in centralized solutions, neural networks can learn multiple outputs and decision boundaries at once, which makes them suitable for solving several network challenges using the same model.

Support vector machines (SVMs): It is a machine learning algorithm that learns to classify data points using labeled training samples. For example, one approach for detecting malicious behavior of a node is to use SVM to investigate temporal and spatial correlations of data. To illustrate, given a WSN's observations as points in the feature space, SVM divides the space into parts. These parts are separated by margins that are as wide as possible (i.e., separation gaps), and new readings are classified based on which side of the gaps they fall on.


Bayesian statistics: Unlike most machine learning algorithms, Bayesian inference requires a relatively small number of training samples. One application of Bayesian inference in WSNs is assessing event consistency (θ) using incomplete data sets (D) by investigating prior knowledge about the environment.
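For reference, using the same symbols as the paragraph above (θ for event consistency, D for the incomplete data), the standard form of Bayes' rule is shown below. It is included as a general reminder, not as a formula taken from the slides: the prior p(θ) encodes knowledge about the environment, and the likelihood p(D | θ) is what the observed data contributes.

```latex
p(\theta \mid D) = \frac{p(D \mid \theta)\, p(\theta)}{p(D)},
\qquad
p(D) = \int p(D \mid \theta)\, p(\theta)\, d\theta
```

Because the prior carries part of the information, a useful posterior can be formed from comparatively few samples.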

Unsupervised Learning

Unsupervised learners are not provided with labels (i.e., there is no output vector). Basically, the goal of an unsupervised learning algorithm is to classify the sample set into different groups by investigating the similarity between them. This class of learning algorithms is widely used in node clustering and data aggregation problems.


K-means clustering: The k-means algorithm is used to group data into different classes (known as clusters). This unsupervised learning algorithm is widely used in the sensor node clustering problem due to its linear complexity and simple implementation. The k-means steps to resolve such a node clustering problem are: (a) randomly choose k nodes to be the initial centroids for different clusters; (b) label each node with the closest centroid using a distance function; (c) re-compute the centroids using the current node memberships; and (d) stop if the convergence condition is valid (e.g., a predefined threshold for the sum of distances between nodes and their respective centroids), otherwise go back to step (b).
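Steps (a) to (d) can be sketched as follows. Several details are assumptions made for this sketch: nodes are represented by made-up 2-D coordinates, the distance function is Euclidean, and the convergence condition in (d) is interpreted as "the sum of distances improved by less than a threshold since the previous pass".

```python
import math
import random

def k_means(nodes, k, threshold=1e-6, seed=0, max_iter=100):
    rng = random.Random(seed)
    # (a) randomly choose k nodes as the initial centroids
    centroids = rng.sample(nodes, k)
    previous_cost = math.inf
    labels = []
    for _ in range(max_iter):
        # (b) label each node with the index of its closest centroid
        labels = [
            min(range(k), key=lambda c: math.dist(node, centroids[c]))
            for node in nodes
        ]
        # (c) re-compute each centroid as the mean of its current members
        for c in range(k):
            members = [n for n, label in zip(nodes, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
        # (d) stop once the sum of node-to-centroid distances stops improving
        cost = sum(math.dist(n, centroids[label]) for n, label in zip(nodes, labels))
        if previous_cost - cost < threshold:
            break
        previous_cost = cost
    return centroids, labels

# Made-up node positions forming two obvious groups.
nodes = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, labels = k_means(nodes, k=2)
print(centroids, labels)
```

Which cluster receives index 0 depends on the random initial choice, so only the grouping, not the numbering, is meaningful.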


Principal component analysis (PCA): It is a multivariate method for data compression and dimensionality reduction that aims to extract important information from data and present it as a set of new orthogonal variables called principal components. For example, PCA reduces the amount of data transmitted among sensor nodes by finding a small set of uncorrelated linear combinations of the original readings. Furthermore, the PCA method simplifies problem solving by considering only a few conditions in very large variable problems (i.e., turning big data into a tiny data representation).
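A dependency-free sketch of the idea: find the first principal component of a set of readings and compress each reading to a single score along it. The use of power iteration, the function names, and the made-up readings are assumptions for illustration; a production system would normally call a linear algebra library instead.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (feature means, unit direction of greatest variance)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centred = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [
        [sum(r[i] * r[j] for r in centred) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Power iteration: repeatedly multiply by the covariance and renormalise.
    # Assumes the start vector is not orthogonal to the leading direction.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return means, v

def project(row, means, component):
    """Compress one reading vector to a single score along the component."""
    return sum((x - m) * c for x, m, c in zip(row, means, component))

# Made-up readings from two perfectly correlated sensors (second = 2 x first).
readings = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
means, pc1 = first_principal_component(readings)
scores = [project(r, means, pc1) for r in readings]
print(pc1, scores)
```

Here two correlated sensor values per reading are replaced by one score, which is the kind of reduction in transmitted data the paragraph describes.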

Reinforcement Learning: Reinforcement learning enables an agent (e.g., a sensor node) to learn by interacting with its environment. The agent learns to take the best actions that maximize its long-term rewards by using its own experience. The best-known reinforcement learning technique is Q-learning, in which an agent regularly updates its estimate of the achieved reward based on the action taken in a given state.
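The Q-learning update itself is short. In this sketch the state names, action names, reward, learning rate `alpha` and discount `gamma` are all hypothetical values chosen for illustration; only the update rule is the standard one.

```python
from collections import defaultdict

def q_update(q, state, action, reward, next_state, actions, alpha=0.5, gamma=0.9):
    """One Q-learning step.

    Moves Q(state, action) toward reward + gamma * max over a' of Q(next_state, a').
    """
    best_next = max(q[(next_state, a)] for a in actions)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]

actions = ["sleep", "transmit"]   # hypothetical sensor-node actions
q = defaultdict(float)            # unseen (state, action) pairs start at 0.0
q[("idle", "sleep")] = 1.0        # pretend this value was learned earlier

# The node transmitted while its buffer was full, earned reward 2.0,
# and ended up in the "idle" state.
new_value = q_update(q, "buffer_full", "transmit", 2.0, "idle", actions)
print(new_value)
```

Repeating this update over many interactions is what lets the agent's table of values converge toward the long-term reward of each action in each state.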


IoT and Machine Learning


The basic idea of machine learning is to build a mathematical model based on training data (learning stage), predict results for new data (prediction stage) and tweak the model based on new conditions.

What type of model? Predictive, Classification, Clustering, Decision Oriented, Associative

IoT and Machine Learning
• On one hand, IoT creates a lot of contextual data which complements existing processes
• On the other hand, the sheer scale of IoT calls for unique solutions

Types of problems:
• Apply existing Machine Learning algorithms to IoT data
• Use IoT data to complement existing processes
• Use the scale of IoT data to gain new insights
• Consider some unique characteristics of IoT data (e.g. streaming)


IoT: from traditional computing to ..

We have gone from making smart things smarter (traditional computing) to a) making dumb things smarter and b) making living things more robust.

3 Domains: Consumer, Enterprise, Public infrastructure

1) Consumer – bio sensors (real time tracking), Quantified Self – focussing on benefits

2) Enterprise – complex machinery (preventative maintenance), asset efficiency – reducing assets, increasing efficiency of existing assets. Move from transactions to relationships (real time context awareness).

3) Public infrastructure (e.g. dynamically adjusting traffic lights). Dis-economies of scale (bad things also scale in cities) – thanks John Hagel III


Three key areas:
a) Move from exception handling to patterns of exceptions over time (are some exceptions occurring repeatedly? Do I need to redesign my product? Is that a new product?)
b) Move from optimization to disruption – ownership to rentalship (where are all these dynamic assets?)
c) Move to self learning – robotics: from assembly line to self learning robots (Boston Dynamics), autonomous helicopters

Four examples of differences: Sensor fusion - Deep Learning - Real time - Streaming


Sensor fusion

Sensor fusion is the combining of sensory data or data derived from sensory data from disparate sources such that the resulting information is in some sense better than would be possible when these sources were used individually. The data sources for a fusion process are not specified to originate from identical sensors. Sensor fusion is a term that covers a number of methods and algorithms, including: Central Limit Theorem, Kalman filter, Bayesian networks, Dempster-Shafer.
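One of the simplest instances of this idea is inverse-variance weighting of two independent measurements of the same quantity, which is also the measurement-update step of a one-dimensional Kalman filter. The sketch below assumes the noise variance of each sensor is known; the function name and the numbers are made up for illustration.

```python
def fuse(estimate_a, var_a, estimate_b, var_b):
    """Fuse two independent noisy measurements of the same quantity.

    Each measurement is weighted by the other's variance, so the less
    noisy sensor gets more say. Returns (fused estimate, fused variance).
    """
    weight_a = var_b / (var_a + var_b)
    fused = weight_a * estimate_a + (1 - weight_a) * estimate_b
    fused_var = (var_a * var_b) / (var_a + var_b)
    return fused, fused_var

# A coarse thermometer (variance 4.0) and a more precise one (variance 1.0).
fused, fused_var = fuse(20.0, 4.0, 22.0, 1.0)
print(fused, fused_var)
```

The fused variance is smaller than either input variance, which is the sense in which the combined information is "better than would be possible when these sources were used individually".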

Example: http://www.camgian.com/ http://www.egburt.com/


Deep learning

Google's acquisition of DeepMind Technologies

In 2011, Stanford computer science professor Andrew Ng founded Google’s Google Brain project, which created a neural network trained with deep learning algorithms. It famously proved capable of recognizing high level concepts, such as cats, just from watching YouTube videos, without ever having been told what a “cat” is.

A smart-object recognition algorithm that doesn’t need humans
http://www.kurzweilai.net/a-smart-object-recognition-algorithm-that-doesnt-need-humans
A feature construction method for general object recognition (Kirt Lillywhite, Dah-Jye Lee, Beau Tippetts, James Archibald)


Real time: Beyond ‘Hadoop’ (non-Hadoopable) – the BDAS stack

BDAS: Berkeley Data Analytics Stack

Spark – an open source, in-memory, cluster computing framework.
• Integrated with Hadoop (can work with files stored in HDFS)
• Written in Scala


Real time (Stream processing)


Spark comes with tools: interactive query analysis (Shark), graph processing and analysis (Bagel) and real-time analysis (Spark Streaming).

RDDs (Resilient Distributed Datasets) are the fundamental data objects used in Spark. RDDs are distributed objects that can be cached in-memory, across a cluster of compute nodes.
• Scales to 100s of nodes
• Can achieve second-scale latencies


Source: Tathagata Das (TD) UC Berkeley


Survey paper: Machine Learning in Wireless Sensor Networks: Algorithms, Strategies, and Applications. Mohammad Abu Alsheikh et al., School of Computer Engineering, Nanyang Technological University, Singapore 639798

Event detection and query processing

Monitoring can be classified as event-driven, continuous, or query-driven. Fundamentally, machine learning offers solutions to restrict query areas and assess event validity for efficient event detection and query processing mechanisms.

Advantages:

• Optimize limited resources, e.g. storage and processing
• Assess accuracy using simple classifiers
• Narrow down the search region (avoid flooding the network)
• Go beyond simple threshold detection (the simplest case)

Event recognition through Bayesian algorithms: Krishnamachari and Iyengar used WSNs to detect environmental phenomena in a distributed manner. Readings are considered faulty if their values exceed a specific threshold. The study employs decentralized Bayesian learning that detects up to 95 percent of the faults, which in turn makes it possible to recognize the event region.
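The idea of letting neighbours vouch for a reading can be sketched with Bayes' rule. This is a minimal illustration, not the authors' exact scheme: it assumes binary readings, a symmetric per-reading fault probability `p_fault`, and it uses the fraction of neighbours reporting an event as the prior. The function names and default values are hypothetical.

```python
def event_posterior(reading, k, n, p_fault):
    """Posterior probability that a node truly lies in the event region.

    reading -- the node's own binary reading (1 = event, 0 = normal)
    k       -- number of neighbours (out of n) that report an event
    n       -- number of neighbours consulted
    p_fault -- assumed probability that any single reading is wrong
    """
    if n <= 0:
        raise ValueError("need at least one neighbour")
    prior = k / n  # neighbourhood evidence used as the prior (assumption)
    # Likelihood of the observed reading under each hypothesis.
    like_event = (1 - p_fault) if reading == 1 else p_fault
    like_normal = p_fault if reading == 1 else (1 - p_fault)
    evidence = like_event * prior + like_normal * (1 - prior)
    return like_event * prior / evidence if evidence > 0 else 0.0


def corrected_reading(reading, k, n, p_fault=0.1, threshold=0.5):
    """Return the reading the node should report after the Bayesian check."""
    return 1 if event_posterior(reading, k, n, p_fault) > threshold else 0
```

With these assumptions, a node that reports an event while none of its neighbours do is corrected to "normal", and a silent node surrounded by event reports is corrected to "event".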


Zappi et al.: a real-time approach for activity recognition using WSNs that accurately detects body gesture and motion.

• Initially, the nodes, which are spread throughout the body, detect the organ motion using an accelerometer sensor with three-axis measurements (positive, negative and null). These measurements are used by a hidden Markov model (HMM) to predict the activity at each sensor.

• Sensor activation and selection rely on the sensor's potential contribution to classifier accuracy (i.e., select the sensors that provide the most informative description of the gesture).

• To generate a final gesture decision, a naive Bayes classifier is used to combine the independent node predictions so as to maximize the posterior probability given by Bayes' theorem.
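The final fusion step can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation from the study: each node is assumed to have a known table of P(predicted label | true gesture), and the data layout and the name `fuse_predictions` are hypothetical.

```python
import math


def fuse_predictions(node_predictions, confusion, priors):
    """Pick the gesture with the highest posterior given per-node labels.

    node_predictions -- {node_id: predicted_label}
    confusion        -- {node_id: {true_label: {predicted_label: prob}}}
    priors           -- {true_label: prior probability}
    """
    scores = {}
    for label, prior in priors.items():
        log_post = math.log(prior)
        for node, predicted in node_predictions.items():
            # Naive Bayes: nodes are treated as conditionally independent.
            # A small floor avoids log(0) for combinations never observed.
            prob = confusion[node][label].get(predicted, 0.0)
            log_post += math.log(max(prob, 1e-9))
        scores[label] = log_post
    return max(scores, key=scores.get)
```

Working in log space keeps the product of many small probabilities numerically stable, and a more reliable node (a sharper confusion table) automatically carries more weight in the decision.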


Forest fire detection through neural network:

WSNs have been actively used in fire detection and rescue systems. Yu et al. presented a real-time forest fire detection scheme based on a neural network method. Data processing is distributed to cluster heads, and only important information is aggregated to a final decision maker.

Although the idea is creative and beneficial to the environment, the classification task and system core are hardly interpretable when introducing such systems to decision makers.


Query processing through k-nearest neighbors:

The k-nearest neighbor query is considered a highly effective query processing technique in WSNs. Winter et al. developed an in-network query processing solution using the k-nearest neighbor algorithm, namely the “K-NN Boundary Tree” (KBT) algorithm. Each node that is aware of its location determines its k-NN search region whenever a query is received from the application manager.

Jayaraman et al. extended the query processing design. “3D-KNN” is a query processing scheme for WSNs that adopts the k-nearest neighbor algorithm. This approach restricts the query region so that it bounds at least the k nearest nodes deployed within a 3D space. In addition, signal-to-noise ratio (SNR) and distance measurements are used to refine the k-nearest neighbor selection.

The primary concerns with such k-NN-based algorithms for query processing are the large memory footprint required to store every collected sample and the high processing delay in large-scale sensor networks.
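The notion of a bounded k-NN search region can be sketched centrally in a few lines. This is a simplified illustration, not the KBT or 3D-KNN protocol: it assumes all node coordinates are known in one place, and the function name `knn_query` is hypothetical.

```python
import math


def knn_query(nodes, query_point, k):
    """Return the k nodes nearest to query_point and the search radius.

    nodes -- {node_id: (x, y, z)} known node locations
    """
    if k <= 0 or k > len(nodes):
        raise ValueError("k must be between 1 and the number of nodes")
    ranked = sorted(nodes, key=lambda n: math.dist(nodes[n], query_point))
    nearest = ranked[:k]
    # The query only needs to be forwarded inside this radius.
    radius = math.dist(nodes[nearest[-1]], query_point)
    return nearest, radius
```

The sort over every node is also a reminder of the cost noted above: a naive k-NN query touches all stored samples, which is what in-network schemes try to avoid.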


Distributed event detection for disaster management using decision tree: Bahrepour et al. developed decision tree-based event detection and recognition for sensor network disaster prevention systems. The main application of this decentralized mechanism is fire detection in residential areas. The final event detection decision is made by a simple vote among the highest-reputation nodes.
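The voting step can be sketched as below. This is a minimal illustration, assuming each node has already produced a local label (for example from its own decision tree) and carries a numeric reputation score; the names `reputation_vote` and `top_n` are hypothetical and the study's reputation mechanism is not reproduced here.

```python
from collections import Counter


def reputation_vote(decisions, reputation, top_n=3):
    """Majority vote among the top_n nodes with the highest reputation.

    decisions  -- {node_id: label}, each node's local decision
    reputation -- {node_id: score}, higher means more trusted
    """
    if not decisions:
        raise ValueError("no decisions to vote on")
    voters = sorted(decisions, key=lambda n: reputation.get(n, 0.0), reverse=True)[:top_n]
    tally = Counter(decisions[n] for n in voters)
    # most_common breaks ties by first-seen order, i.e. by reputation rank.
    return tally.most_common(1)[0][0]
```

Restricting the vote to trusted nodes means a handful of low-reputation (possibly faulty) nodes cannot outvote the reliable ones.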


Query optimization using principal component analysis (PCA):

Malik et al. optimized traditional query processing in WSNs using data attributes and PCA, thus reducing the overhead of such a process. PCA is used to dynamically detect important attributes (i.e., dominant principal components) among the whole correlated data set.

The proposed algorithm proceeds in four fundamental steps, including:

• An SQL request containing the human-intelligible attributes is sent to the database management and optimization system. There, the original query is optimized: the high-variance components are extracted from historical data using the PCA algorithm.

• The optimized query is diffused to the wireless sensor network to extract the sensory data. Later, the original attributes (i.e., the human-intelligible attributes) can be recovered from the optimized attributes by reversing the PCA process.
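The compress-then-reverse idea can be sketched for the smallest possible case: two correlated attributes reduced to one principal component. This is an illustrative sketch only, not the authors' system; it assumes a non-empty list of (x, y) historical samples, uses the closed-form eigen decomposition of a 2x2 covariance matrix, and all function names are hypothetical.

```python
import math


def fit_pca_2d(samples):
    """Return (mean, unit direction) of the dominant principal component."""
    n = len(samples)
    mx = sum(x for x, _ in samples) / n
    my = sum(y for _, y in samples) / n
    # Entries of the 2x2 covariance matrix [[a, b], [b, c]].
    a = sum((x - mx) ** 2 for x, _ in samples) / n
    c = sum((y - my) ** 2 for _, y in samples) / n
    b = sum((x - mx) * (y - my) for x, y in samples) / n
    # Largest eigenvalue of a symmetric 2x2 matrix, in closed form.
    lam = (a + c) / 2 + math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    if abs(b) > 1e-12:
        vx, vy = b, lam - a
    else:
        # Uncorrelated attributes: the dominant axis is the one with more variance.
        vx, vy = (1.0, 0.0) if a >= c else (0.0, 1.0)
    norm = math.hypot(vx, vy)
    return (mx, my), (vx / norm, vy / norm)


def compress(sample, mean, direction):
    """Project one (x, y) sample onto the dominant component."""
    return (sample[0] - mean[0]) * direction[0] + (sample[1] - mean[1]) * direction[1]


def reconstruct(score, mean, direction):
    """Reverse the projection to recover approximate original attributes."""
    return (mean[0] + score * direction[0], mean[1] + score * direction[1])
```

Each node would then transmit a single score instead of two attribute values. Reconstruction is exact only when the attributes are perfectly correlated; otherwise the discarded component is lost.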


Reportedly, the algorithm achieves a 25 percent improvement in the energy savings of network nodes while reaching 93 percent accuracy. However, this improvement comes at the cost of accuracy in the collected data, as some of the data components are ignored. This solution may therefore not be ideal for applications with high accuracy and precision requirements.


Localization and Object Targeting

Localization is the process of determining the geographic coordinates of the network's nodes and components.

Position awareness of sensor nodes is an important capability, since most sensor network operations are based on location.

In most large-scale systems, it is financially infeasible to use global positioning system (GPS) hardware in each node for this purpose. Moreover, GPS service may not be available in the observed environment (e.g., indoors). Relative location measurement is sufficient for certain uses.

However, by using the absolute locations of a small group of nodes, relative locations can be transformed into absolute ones.
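The relative-to-absolute conversion can be sketched for the simplest geometric case. The example assumes a 2D deployment, two anchor nodes whose absolute positions are known, and a relative map that differs from the absolute one only by rotation, uniform scale and translation (no mirroring). The function names are hypothetical.

```python
def fit_transform(anchor_rel, anchor_abs):
    """Fit abs = a * rel + b from two anchors, treating 2D points as complex numbers."""
    (r1, r2), (w1, w2) = anchor_rel, anchor_abs
    z1, z2 = complex(*r1), complex(*r2)
    t1, t2 = complex(*w1), complex(*w2)
    if z1 == z2:
        raise ValueError("anchors must have distinct relative positions")
    a = (t2 - t1) / (z2 - z1)  # rotation and scale
    b = t1 - a * z1            # translation
    return a, b


def to_absolute(point_rel, a, b):
    """Map one relative (x, y) position into absolute coordinates."""
    w = a * complex(*point_rel) + b
    return (w.real, w.imag)
```

Once `a` and `b` are fitted from the anchors, every other node's relative position can be converted without any further positioning hardware.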


Sensor nodes may encounter changes in their location after deployment (e.g., due to movement).

The benefits of using machine learning algorithms in the sensor node localization process can be summarized as follows:

• Converting the relative locations of nodes to absolute ones using a few anchor points. This eliminates the need for range measurement hardware to obtain distance estimations.

• In surveillance and object targeting systems, machine learning can be used to divide the monitored sites into a number of clusters, where each cluster represents a specific location indicator.


Bayesian node localization: Morelande et al. [21] used a Bayesian algorithm to develop a localization scheme for WSNs using only a few anchor points. This study focuses on the enhancement of progressive correction, which is a method for predicting samples from likelihoods to get closer to the posterior likelihood. The idea of using the Bayesian algorithm for localization is appealing as it can handle incomplete data sets by investigating prior knowledge and probabilities.


Robust location-aware activity recognition: Lu and Fu addressed the problem of sensor and activity localization in smart homes without direct sensing.

The activities of interest include using the phone, listening to music, using the refrigerator, studying, etc.

The proposed framework, named “Ambient Intelligence Compliant Object” (AICO), uses multiple naive Bayes classifiers to determine the resident's current location and to evaluate the reliability of the system by detecting any malfunctioning sensors.

The designers must predefine the set of supported activities in advance.

There are also unsupervised machine learning algorithms for automatic feature extraction, such as deep learning methods and the non-negative matrix factorization algorithm, where activities are determined without prior training.


Localization based on neural network: Shareef et al. compared three localization schemes that are based on different types of neural networks. In particular, this study considers WSN localization using multi-layer perceptron (MLP), radial basis function (RBF), and recurrent neural networks (RNN). In summary, the RBF neural network results in the minimum error at the cost of high resource requirements. In contrast, MLP consumes the minimum computational and memory resources.


Security and Anomaly Intrusion Detection

• Save nodes' energy and significantly extend WSN lifetime by preventing the transmission of outlier, misleading data.

• Enhance network reliability by eliminating faulty and malicious readings. In the same way, avoid discovering unexpected knowledge that would be converted into important, and often critical, actions.

• Online learning and prevention (without human intervention) of malicious attacks and vulnerabilities.

[Figure: sensor measurement (e.g., temperature, pressure, etc.) plotted against sensor's location indicator, showing expected readings and anomalous readings]


Outlier detection using Bayesian belief network: Janakiram et al. used Bayesian belief networks (BBNs) to develop an outlier detection scheme. Given that the majority of a node's neighbors will have similar readings (i.e., temporal and spatial correlations), it is reasonable to use this phenomenon to build conditional dependencies among nodes' readings. BBNs infer the conditional relationships among the observations to discover any potential outliers in the collected data. Furthermore, this method can be used to estimate missing values.

Outlier detection using k-nearest neighbors: Branch et al. developed an in-network outlier detection method in WSNs using k-nearest neighbors. Moreover, any missing node readings are replaced by the average value of the k nearest nodes. However, such a non-parametric, k-NN-based algorithm requires large memory to store every collected reading from the monitored environment.
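The two k-NN ideas, scoring outliers and filling in missing readings, can be sketched as follows. This is a generic, centralized illustration rather than the in-network method of Branch et al.: the outlier score used here is the distance from a reading to its k-th nearest other reading, `k` is assumed to be smaller than the number of readings, and the function names are hypothetical.

```python
import math


def knn_outlier_scores(readings, k):
    """Score each reading by its distance to its k-th nearest other reading.

    readings -- {node_id: value}; a large score suggests an outlier
    """
    scores = {}
    for node, value in readings.items():
        gaps = sorted(abs(value - other) for n, other in readings.items() if n != node)
        scores[node] = gaps[k - 1]
    return scores


def impute_missing(node, positions, readings, k):
    """Estimate a missing reading as the mean of the k spatially nearest nodes.

    positions -- {node_id: (x, y)} for every node, including the one missing data
    readings  -- {node_id: value} for nodes that did report
    """
    candidates = sorted(
        (n for n in readings if n != node),
        key=lambda n: math.dist(positions[n], positions[node]),
    )
    nearest = candidates[:k]
    return sum(readings[n] for n in nearest) / len(nearest)
```

A reasonable order of operations is to drop high-scoring outliers first, so that a faulty neighbour does not contaminate the imputed average. Both functions scan every stored reading, which illustrates the memory concern raised above.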


Quality of Service, Data Integrity and Fault Detection

Advantages:

• Different machine learning classifiers are used to recognize different types of streams, thus eliminating the need for flow-aware management techniques.

• The requirements for QoS guarantee, data integrity and fault detection depend on the network service and application. Machine learning methods are able to handle much of this while ensuring efficient resource utilization, mainly bandwidth and power utilization.


QoS estimation using neural network: Snow et al. introduced a method to estimate a sensor network dependability metric using a neural network method. Dependability is a metric that represents the availability, reliability, maintainability, and survivability of a sensor network. Several attributes are used to estimate such a metric, including mean time between failures (MTBF) and mean time to repair (MTTR).
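As a point of reference for how MTBF and MTTR relate to one ingredient of dependability, the standard steady-state availability formula is MTBF / (MTBF + MTTR). The sketch below shows that textbook relation only; it is not the neural network model of Snow et al., and the function names and the series-composition assumption (the network is up only if every node is up) are illustrative.

```python
import math


def steady_state_availability(mtbf_hours, mttr_hours):
    """Fraction of time a component is up: MTBF / (MTBF + MTTR)."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


def series_availability(availabilities):
    """Availability of a chain that needs every component up (assumes independence)."""
    return math.prod(availabilities)
```

For example, a node with an MTBF of 990 hours and an MTTR of 10 hours is available 99 percent of the time, and two such nodes in series are jointly available about 98 percent of the time.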

 

Moustapha and Selmic introduced a dynamic fault detection model for WSNs. This model captures the nodes' dynamic behavior and their effects on other nodes. In addition, a neural network trained using the back-propagation method was used for node identification and fault detection. The study yields an effective nonlinear sensor model that suits applications with fault detection requirements.


Air quality monitoring using neural networks: Postolache et al. proposed a neural network-based method for measuring air pollution levels using inexpensive gas sensor nodes, while eliminating the effects of temperature and humidity on sensor readings. This solution detects the air quality and gas concentration using neural networks implemented in JavaScript (JS). As a result, the solution is able to distribute processing between the web server and end-user computers (i.e., a combination of client-side and server-side scripts).

Intelligent lighting control using neural networks: Gao et al. introduced a new standard for lighting control in smart buildings using the neural network algorithm. A radial basis function (RBF) neural network is used to extract a new mathematical expression, called the “Illuminance Matrix” (I-matrix), to measure the degree of illuminance in the lighted area. Fundamentally, in the field of lighting control, converting the data collected from the photosensors to a form that is suitable for digital signal processing is a crucial issue and can strongly affect the performance of the developed system. The article shows that using the I-matrix scheme can achieve about 60% more accuracy compared to the standard methods.


Ajit Jaokar

www.futuretext.com
@AjitJaokar [email protected]