Machine Learning - slides.yowconference.com · DNN: Feature Engineering Anything humans can do in...

Copyright Cognomotiv 2016

Machine Learning

No: It Can’t Do That!

Hadi Nahari

hadi@cognomotiv.com

hadinahari

“Friends, Romans, countrymen, lend me your ears;

I come to bury Caesar, not to praise him.

The evil that men do lives after them…”

Julius Caesar

Act 3, Scene II

• ML + NetSec

National Academy of Engineering: Grand Challenges for the 21st Century

“The best minds of my generation are thinking about how to make people click ads.”

Jeff Hammerbacher

Agenda

• Motivations

• Machine Learning 101

• ML & Network Security

• What Works, What Doesn’t

• Conclusion

MOTIVATIONS

ML Is NOT New

• This is the 5th round…

ML is HOT!!

• VCs fund ML-companies like crazy

• Amazing new fields have opened

– Autonomous driving, behavior analytics, etc.

• A ton of existing fields have been revived

– Search, personalization/customization, audio processing, image processing, etc., etc.

• Mainly because…

Code Complexity

• Space Shuttle: ~400K LOC

• F22 Raptor fighter: ~2M LOC

• Linux kernel 2.2: ~2.5M LOC

• Hubble telescope: ~3M LOC

• Android core: ~12M LOC

• Future Combat Sys.: ~63M LOC

• Connected car: ~100M LOC

• Autonomous vehicle: ~300M LOC

Large Hadron Collider: 60 M LOC

50 M LOC

Use Case Complexity

service provider

on avg. only five passwords per 40 online accounts per user

where to store the tokens???

Data Procreation

• >2 billion GB of new data is created every day

– 2.3283006436538696 B GB to be exact

• Sparse data: mainly 0s

• In ‘93 the information on the internet surpassed all information that humanity had created before it

Stack Proliferation

HW Architecture(s)

Applications

Algorithms

ML 101

Machine Learning (ML)

• Study of pattern recognition & computational learning theory in Artificial Intelligence (AI)

• Algorithms to learn from, and make predictions on data

• As opposed to following strictly static program instructions

ML Models

• Supervised learning

• Unsupervised learning

• (Semi-supervised learning)

• Reinforcement learning

Supervised Learning

• {(labeled) Input} [map] {Expected Output}

• Find [map]
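A minimal sketch of "find [map]", assuming the map is a straight line fitted by least squares. The toy data and the names `fit_line` and `predict` are illustrative assumptions, not something the slides specify.

```python
def fit_line(inputs, expected_outputs):
    """Least-squares fit of expected_output ~ slope * input + intercept."""
    n = len(inputs)
    mean_x = sum(inputs) / n
    mean_y = sum(expected_outputs) / n
    covariance = sum(
        (x - mean_x) * (y - mean_y) for x, y in zip(inputs, expected_outputs)
    )
    variance = sum((x - mean_x) ** 2 for x in inputs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the learned map to a new, unlabeled input."""
    slope, intercept = model
    return slope * x + intercept


# Labeled training pairs (made up): each input comes with its expected output.
inputs = [1.0, 2.0, 3.0, 4.0]
expected_outputs = [3.0, 5.0, 7.0, 9.0]

model = fit_line(inputs, expected_outputs)  # the learned [map]
print(predict(model, 5.0))                  # prediction for an unseen input
```

Real supervised models differ mainly in the family of maps they search over; the labeled-pairs-in, map-out shape stays the same.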

Supervised Learning Model

Unsupervised Learning

• {(unlabeled) Input} [map] {Output}

• Find structure (patterns) in {Input}
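As one hedged illustration of finding structure without labels, here is a tiny one-dimensional k-means with two clusters. The data, the choice of k-means, and the min/max initialization are assumptions made for the sketch; the slides do not name an algorithm.

```python
def kmeans_1d(points, iterations=10):
    """Two-cluster k-means on a list of numbers.

    Centroids start at the smallest and largest point (a simplifying,
    deterministic choice for this sketch).
    """
    centroids = [min(points), max(points)]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            0 if abs(p - centroids[0]) <= abs(p - centroids[1]) else 1
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for k in (0, 1):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                centroids[k] = sum(members) / len(members)
    return centroids, labels


# Unlabeled input (made up): two obvious groups, but nobody told the algorithm.
points = [1.0, 1.2, 0.8, 9.0, 9.5, 8.7]
centroids, labels = kmeans_1d(points)
print(centroids, labels)
```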

Unsupervised Learning Model

Reinforcement Learning

• No correct {Input}/{Output}

• Action, environment, reward
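The action/environment/reward loop can be sketched with a two-action agent. Everything here is an assumption for illustration: the `environment` function, its fixed rewards, and the epsilon-greedy strategy are not taken from the slides.

```python
import random


def environment(action):
    """Hypothetical environment: action 1 pays more than action 0."""
    return 1.0 if action == 1 else 0.2


def run_agent(steps=200, epsilon=0.1, seed=0):
    """Epsilon-greedy agent that learns a running-average reward per action."""
    rng = random.Random(seed)
    value = [0.0, 0.0]  # estimated reward of each action
    count = [0, 0]      # how often each action was taken
    for step in range(steps):
        if step < 2:
            action = step                     # try each action once first
        elif rng.random() < epsilon:
            action = rng.randrange(2)         # explore
        else:
            action = value.index(max(value))  # exploit the best estimate
        reward = environment(action)
        count[action] += 1
        # Incremental average: nudge the estimate toward the observed reward.
        value[action] += (reward - value[action]) / count[action]
    return value


values = run_agent()
best_action = values.index(max(values))
print(values, best_action)
```

No one ever tells the agent which action is "correct"; it only sees rewards and adjusts.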

Reinforcement Learning Model

Main ML Approaches

• Decision Tree Learning, Association Rule Learning

• Inductive Logic Programming, Support Vector Machines, Clustering, Bayesian Networks

• Representation Learning, Genetic Algorithms

• Similarity and Metric Learning, Sparse Dictionary Learning

• Artificial Neural Networks (ANN), Deep Learning (DL)

Neural Network

• Interpret an Artificial Intelligence (AI) task as the evaluation of complex functions

– Facial Recognition: Map a bunch of pixels to a name

– Handwriting Recognition: Image to a character

• NN: Network of interconnected simple neurons

The Neuron

Feed-forward system, made up of two stages:

• Linear transformation of the data

• Point-wise application of a non-linear function

$y_i = F\left(\sum_j W_{ij} X_j\right)$

$F(x) = \max(0, x)$

(This F is the Rectified Linear Unit (ReLU); sigmoid, etc. are also used)
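A minimal sketch of the two stages for a single neuron: a weighted sum of the inputs, then a non-linear function applied to the result. The weights and inputs are made-up values, and the function names are illustrative.

```python
import math


def relu(x):
    """Rectified Linear Unit: max(0, x)."""
    return max(0.0, x)


def sigmoid(x):
    """Logistic sigmoid, an alternative non-linearity."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron(weights, inputs, activation=relu):
    # Stage 1: linear transformation (weighted sum of the inputs).
    z = sum(w * x for w, x in zip(weights, inputs))
    # Stage 2: point-wise non-linear function.
    return activation(z)


weights = [0.5, -1.0, 2.0]                 # hypothetical learned weights
print(neuron(weights, [2.0, 0.0, 1.0]))    # weighted sum is positive
print(neuron(weights, [1.0, 2.0, 0.5]))    # weighted sum is negative, ReLU clips it
print(neuron(weights, [1.0, 2.0, 0.5], activation=sigmoid))
```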

Artificial Neural Network (ANN)

• Layers and layers of neurons, with many connections

Input:

Output:
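Stacking neurons into layers can be sketched as a forward pass where each layer's output feeds the next layer's input. The network shape (2 inputs, 2 hidden neurons, 1 output), the weights, and the ReLU choice are all assumptions for the sketch.

```python
def relu(x):
    return max(0.0, x)


def layer(weight_rows, inputs):
    """One layer: every row of weights is one neuron looking at all inputs."""
    return [
        relu(sum(w * x for w, x in zip(row, inputs)))
        for row in weight_rows
    ]


def forward(network, inputs):
    """Feed the input through each layer in turn."""
    activations = inputs
    for weight_rows in network:
        activations = layer(weight_rows, activations)
    return activations


# Hypothetical weights: a hidden layer with two neurons, then one output neuron.
network = [
    [[1.0, -1.0], [0.5, 0.5]],  # hidden layer
    [[1.0, 2.0]],               # output layer
]

print(forward(network, [2.0, 1.0]))
```

Training (finding good weights) is a separate step that this sketch leaves out.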

Deep Learning (DL)

• Branch of ML based on a set of algorithms that:

• Attempt to model high-level data abstractions

• Are based on learning representations of data

• Use complex architectures with multiple non-linear transformations

• Some representations make it easier to learn tasks from examples (e.g. AlphaGo)

DNN: Learning Feature Representation

Input Result

DNN: Feature Engineering

Anything humans can do in 0.1 sec, the right, big 10-layer network can do too

Images/video → Vision features → Detection

Audio → Audio features → Speaker ID

Text → Text features → Text classification, machine translation, information retrieval, ...

ML/DL Improve With Scale

[Figure: performance versus data & compute over time (Past, Present, Future), with curves for ML/DL and for many previous methods]

ML & NETSEC

Intrusion & Intrusion Detection

“Intrusion is an attempt to compromise CIA (Confidentiality, Integrity, Availability), or to bypass the security mechanisms of a computer or network”

“Intrusion detection is the process of monitoring the events occurring in a computer system or network, and analyzing them for signs of an intrusion”

3 Main Detection Methodologies

• Signature-based Detection (SD)

– Signature: pattern corresponding to a known attack or threat

– SD: process of comparing patterns against captured events

– A.k.a. “Knowledge-based Detection”

• Anomaly-based Detection (AD)

– Anomaly: a deviation from “normal” behavior

– Profile of normal is derived from monitoring network traffic

– AD compares the normal profile with observed events

• Stateful Protocol Analysis (SPA)

– Vendor-developed generic profiles for specific protocols
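The contrast between the first two methodologies can be sketched in a few lines. This is a toy under stated assumptions: the signature strings, the idea of profiling "normal" as a set of (protocol, port) pairs, and all names are hypothetical, and real detectors are far more involved.

```python
# Hypothetical signatures: substrings that correspond to known attacks.
SIGNATURES = ["../../etc/passwd", "' OR '1'='1"]


def signature_detect(event):
    """Signature-based: compare known patterns against a captured event."""
    return any(signature in event for signature in SIGNATURES)


def build_normal_profile(observed_flows):
    """Anomaly-based, step 1: derive a profile of normal from monitoring."""
    return set(observed_flows)


def anomaly_detect(flow, normal_profile):
    """Anomaly-based, step 2: flag anything that deviates from the profile."""
    return flow not in normal_profile


# Made-up monitoring period: (protocol, destination port) pairs seen so far.
profile = build_normal_profile([("tcp", 443), ("tcp", 80), ("udp", 53)])

print(signature_detect("GET /../../etc/passwd HTTP/1.1"))  # known attack pattern
print(signature_detect("GET /index.html HTTP/1.1"))        # no signature matches
print(anomaly_detect(("tcp", 4444), profile))              # never seen before
print(anomaly_detect(("tcp", 443), profile))               # part of normal
```

The signature detector only catches what it already knows; the anomaly detector only catches what differs from whatever it was shown as normal.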

Cybersecurity System

• Attacks evolve, ergo building defense systems is nontrivial

• Thus, higher-level & adaptive methodologies are required

Adaptive Cybersecurity

• Data-capturing tools (Libpcap, Winpcap, etc.) capture events from the audit trails of information sources (e.g. network)

• Data-preprocessing module filters out the attacks for which good signatures have already been learned

• A feature extractor derives basic features (sequence of syscalls, start time, NetFlow duration, src/dest IP/port, protocol, byte and packet counts)

• Analysis engine implements detection methods for infrastructure anomalies, which may or may not have appeared before
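The preprocessing and feature-extraction stages above might look roughly like this. The `FlowRecord` fields, the signature list, and the derived `bytes_per_packet` feature are assumptions chosen for the sketch; capture (e.g. via Libpcap) and the analysis engine are out of scope here.

```python
from dataclasses import dataclass


@dataclass
class FlowRecord:
    """Hypothetical captured event; field names are illustrative."""
    start_time: float
    end_time: float
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: str
    byte_count: int
    packet_count: int
    payload: str = ""


KNOWN_SIGNATURES = ["../../etc/passwd"]  # hypothetical, already-learned signatures


def preprocess(records):
    """Drop events that match a known signature; those are already handled."""
    return [
        r for r in records
        if not any(sig in r.payload for sig in KNOWN_SIGNATURES)
    ]


def extract_features(record):
    """Derive basic per-flow features for the analysis engine."""
    return {
        "start_time": record.start_time,
        "duration": record.end_time - record.start_time,
        "src": (record.src_ip, record.src_port),
        "dst": (record.dst_ip, record.dst_port),
        "protocol": record.protocol,
        "byte_count": record.byte_count,
        "packet_count": record.packet_count,
        "bytes_per_packet": record.byte_count / max(record.packet_count, 1),
    }


records = [
    FlowRecord(10.0, 12.0, "10.0.0.5", 50000, "10.0.0.9", 443, "tcp", 600, 4),
    FlowRecord(11.0, 11.5, "10.0.0.7", 50001, "10.0.0.9", 80, "tcp", 300, 2,
               payload="GET /../../etc/passwd"),
]

features = [extract_features(r) for r in preprocess(records)]
print(features)
```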

WHAT WORKS, WHAT DOESN’T

Curse of Dimensionality

• Data volume is massive

– min. ~100M events per day

• Much of the data is streaming data

– Requires inline, real-time analysis

• Feature space is high dimensional
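One commonly cited symptom of high-dimensional feature spaces is that distances between points become nearly indistinguishable, which hurts any detector built on "far from everything else". The sketch below measures that on uniformly random points; the point counts, dimensions, and the `distance_contrast` measure are assumptions for illustration, not something the slides prescribe.

```python
import math
import random


def distance_contrast(dim, n_points=200, seed=0):
    """(farthest - nearest) / nearest distance from one query to random points.

    A large value means near and far neighbours are easy to tell apart;
    a value close to zero means everything looks about equally far away.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    distances = [
        math.dist(query, [rng.random() for _ in range(dim)])
        for _ in range(n_points)
    ]
    return (max(distances) - min(distances)) / min(distances)


low_dim = distance_contrast(dim=2)
high_dim = distance_contrast(dim=1000)
print(low_dim, high_dim)  # the contrast shrinks as the dimension grows
```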

$/Detection Performance Abysmal

• Looking for “every anomaly” is cost prohibitive

– if at all [practically] possible

• Narrowing down the criteria too much

– results in false negatives

• Reference data is hard to obtain due to privacy concerns

– Simulated data is useless

• ML was supposed to be better than the signature era
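A back-of-the-envelope sketch of why cost per detection gets out of hand. Only the ~100M events per day figure comes from the slides; the false-positive rate and the triage time per alert are hypothetical numbers picked to show the arithmetic.

```python
EVENTS_PER_DAY = 100_000_000         # the ~100M events/day floor from the slides
FALSE_POSITIVES_PER_MILLION = 1_000  # hypothetical: a 0.1% false-positive rate
MINUTES_PER_ALERT = 5                # hypothetical analyst triage time per alert


def false_alerts_per_day(events, false_positives_per_million):
    """Benign events wrongly flagged per day (integer arithmetic)."""
    return events * false_positives_per_million // 1_000_000


def analyst_hours_per_day(alerts, minutes_per_alert):
    """Human time needed just to look at those alerts."""
    return alerts * minutes_per_alert / 60


alerts = false_alerts_per_day(EVENTS_PER_DAY, FALSE_POSITIVES_PER_MILLION)
hours = analyst_hours_per_day(alerts, MINUTES_PER_ALERT)
print(alerts, round(hours))
```

Under these assumed numbers, a detector that is right 99.9% of the time on benign traffic still produces far more alerts per day than any team could review.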

Husky Recognition

• We built an effective snow recognition model…

Learned Features
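A toy reconstruction of how a model can end up recognizing snow instead of the animal. Everything here is an assumption: the two binary features, the made-up training set in which the target animal always appears on snow, and the one-feature "stump" learner.

```python
FEATURES = ["snow_background", "pointed_ears"]

# ((snow_background, pointed_ears), label); label 1 = target animal.
# Made-up training photos: the target animal was always photographed on snow.
TRAINING = [
    ((1, 1), 1), ((1, 1), 1), ((1, 0), 1), ((1, 1), 1),
    ((0, 1), 0), ((0, 0), 0), ((0, 1), 0), ((0, 0), 0),
]


def accuracy_of_feature(index, examples):
    """Training accuracy of the rule 'predict the value of this feature'."""
    hits = sum(1 for features, label in examples if features[index] == label)
    return hits / len(examples)


def train_stump(examples):
    """Pick the single feature that best separates the training labels."""
    scores = [accuracy_of_feature(i, examples) for i in range(len(FEATURES))]
    return scores.index(max(scores))


def predict(stump, features):
    return features[stump]


stump = train_stump(TRAINING)
learned_feature = FEATURES[stump]
print(learned_feature)          # which feature the model latched onto
print(predict(stump, (0, 1)))   # target animal on grass
print(predict(stump, (1, 0)))   # different animal on snow
```

The model scores perfectly on its training data and is still wrong about what it learned, which is only visible once the background changes.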

Models: Simple Correlations

• Simple models are also (usually) wrong

Network Anomalies

• Malicious data packets have a small variety (low type-count), but occur at high frequency

– Current models are not good at detecting this type of anomaly

• Anomaly/outlier varies among application domains

• Labeled anomalies are not available for training/validation
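One possible reading of the first bullet, sketched with a rarity-based detector that treats "rare" as "anomalous". The traffic mix, the packet type names, and the 1% threshold are invented for the example; this is an illustration of the failure mode, not a description of any specific product.

```python
from collections import Counter


def rare_types(packet_types, max_share=0.01):
    """Flag packet types whose share of traffic is below max_share."""
    counts = Counter(packet_types)
    total = len(packet_types)
    return {ptype for ptype, n in counts.items() if n / total < max_share}


# Made-up capture: a flood of one malicious type dominates the traffic.
traffic = (
    ["syn_flood"] * 900    # malicious: low variety, high frequency
    + ["http_get"] * 80    # benign
    + ["dns_query"] * 19   # benign
    + ["odd_probe"] * 1    # a genuinely rare event
)

flagged = rare_types(traffic)
print(flagged)
```

Because the flood is the most common thing on the wire, a rarity-based detector sees it as the definition of normal.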

Baselining

• Using ML to detect anomalies is easy when the baseline is well-defined and follows a simple mathematical model (e.g. Normal Distribution)

• Most real-world systems don’t yield a simple baseline (i.e. their behavior is very complex)

• [!] Sanctity of baseline: “nearly 100% of networks are compromised”
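A sketch of the easy case and of what goes wrong when the baseline itself is compromised. It assumes a roughly normal baseline and a three-standard-deviation rule; the traffic numbers and the `is_anomaly` helper are made up for illustration.

```python
from statistics import mean, pstdev


def is_anomaly(value, baseline, threshold=3.0):
    """Flag values more than `threshold` standard deviations from the mean."""
    mu = mean(baseline)
    sigma = pstdev(baseline)
    return abs(value - mu) > threshold * sigma


# Hypothetical requests-per-minute observed while building the baseline.
clean_baseline = [100, 102, 98, 101, 99]

# Same system, but the attacker was already active while the baseline was taken.
compromised_baseline = clean_baseline + [150, 150, 150, 150, 150]

print(is_anomaly(150, clean_baseline))        # stands out against a clean baseline
print(is_anomaly(150, compromised_baseline))  # blends into a compromised one
```

With a clean baseline the spike is obvious; once attack traffic is part of what the model calls normal, the same spike passes unnoticed.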

Time Shifting

• “Window problem”: algorithms are limited to ingesting data in chunks small enough to be processed

– What if the anomaly is seeded outside that window?

• Network traffic diversity: usage varies in every session and with new applications

– window should also be shifted for recurring training

• Serious impact on performance, real-time, and security
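The window problem can be sketched with a sliding-window counter. The rule (alert on three or more suspicious events within 60 seconds), the timestamps, and the function name are assumptions for the example.

```python
from collections import deque


def windowed_alerts(timestamps, window_seconds=60, threshold=3):
    """Return the times at which `threshold` events fall inside one window."""
    window = deque()
    alerts = []
    for t in timestamps:
        window.append(t)
        # Forget events that have slid out of the window.
        while window and window[0] <= t - window_seconds:
            window.popleft()
        if len(window) >= threshold:
            alerts.append(t)
    return alerts


burst_attack = [0, 10, 20]          # three events inside one window
slow_attack = [0, 120, 240, 360]    # same kind of events, spaced beyond the window

print(windowed_alerts(burst_attack))
print(windowed_alerts(slow_attack))
```

An attacker who knows (or guesses) the window size only has to pace the activity so that no single window ever holds enough evidence.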

There’s More…

• How do you trust what the model predicts?

– i.e. how do we know the model works correctly (husky)?

• Designing sound evaluation schemes can be more difficult than the detector itself

• We really don’t know how ML works

• … or how to reason about ML models

• … or how to debug them

• For now it’s just magic & voodoo

CONCLUSION

Summary

• ML is a great and necessary technology

• ML really shines for some classes of problems

• ML is NOT the best solution for every problem (e.g. NetSec)

• Obtaining (and training with) useful data remains a challenge

• ML is just one initial building block of Machine Cognition and Artificial Understanding: there are many more

• Still a long way before machines can replicate humans!

THANK YOU!

Hadi Nahari

hadi@cognomotiv.com

hadinahari

Backup


Are Humans Getting Smarter?

• IQ scores are rising

• Underlying biological “HW” declining

• “Intelligence” is in decline
