Gartner Hype Cycle for Emerging Technologies | 2017
[Figure: hype cycle curve with Machine Learning highlighted. Axes: Time (x), Expectations (y). Phases: Innovation Trigger, Peak of Inflated Expectations, Trough of Disillusionment, Slope of Enlightenment, Plateau of Productivity. Source: Gartner (July 2017).]
Vendors Got Us Here
“Our machines detect threats others cannot”
“100% predictive”
“Advanced Threats are no match for A.I.”
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
How We Disservice Machine Learning
Silver Bullet Marketing
No Explanation or Discussion
Limited Guidance
“Field of study that gives computers the ability to learn without being explicitly programmed.”
Arthur Samuel’s definition of machine learning in 1959
NERD ALERT
Let’s define the helpful data science terms
[Word cloud: bayesian, regression, machine learning algorithms, regularization, clustering, neural network, deep learning, ensemble, rule system, decision tree, instance based, dimensionality reduction, classifier, ground truth]
Machine Learning: The Big Picture
Artificial Intelligence
  Machine Learning
    Supervised Learning
    Unsupervised Learning
    Reinforcement Learning
Machine Learning: Common Techniques
Supervised Learning: When you know the question you are trying to ask and have examples of it being asked and answered correctly
Unsupervised Learning: You don't have answers and may not fully know the questions
Reinforcement Learning: “The other” category. Trial and error behavior effective in game scenarios
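To make the supervised/unsupervised contrast concrete, here is a minimal Python sketch. It assumes made-up one-dimensional data (hypothetical "megabytes uploaded per session" values) and two deliberately simple stand-in techniques: a nearest-centroid classifier for the supervised case and a two-cluster k-means for the unsupervised case. These are illustrative choices, not techniques named in the slides.

```python
from statistics import mean

# Hypothetical data: megabytes uploaded per session (values invented for illustration).
labeled = [(1.0, "normal"), (2.0, "normal"), (3.0, "normal"),
           (40.0, "suspicious"), (50.0, "suspicious")]
unlabeled = [1.5, 2.5, 3.5, 45.0, 55.0]

def train_nearest_centroid(examples):
    """Supervised: learn one average value (centroid) per known label."""
    by_label = {}
    for value, label in examples:
        by_label.setdefault(label, []).append(value)
    return {label: mean(values) for label, values in by_label.items()}

def classify(centroids, value):
    """Predict the label whose centroid is closest to the new value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))

def two_means(values, rounds=10):
    """Unsupervised: split values into two groups without using any labels."""
    low, high = min(values), max(values)  # initial cluster centers
    for _ in range(rounds):
        group_low = [v for v in values if abs(v - low) <= abs(v - high)]
        group_high = [v for v in values if abs(v - low) > abs(v - high)]
        low, high = mean(group_low), mean(group_high)
    return group_low, group_high

centroids = train_nearest_centroid(labeled)   # uses the answers
prediction = classify(centroids, 42.0)        # "suspicious"
clusters = two_means(unlabeled)               # finds groups, but cannot name them
```

The supervised path returns a named answer because the labels supplied it; the unsupervised path only reports that two groups exist, and a person still has to decide what each group means.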
Supervised Learning: 75%
Unsupervised Learning: 15%
Other (Reinforcement Learning, etc.): 10%
What did we do before Machine Learning?
Simple Pattern Matching
Statistical Methods
Rules and First Order Logic (FOL)
Use in combination with Machine Learning
“Field of study that gives computers the ability to learn without being explicitly programmed.”
Translation:
“Field of study that gives computers the ability to be implicitly programmed.”
Training Classifiers
Training Data -> Machine Learning Algorithm -> Classifier
New Data -> Classifier -> Prediction
Ground Truth Used in Supervised Learning
The 'Ground Truth' is the pairing of example questions and answers.
Supervised learning fits when you can phrase a problem as 'we know this is right; learn a way to answer more questions of this type'.
Success depends greatly on the dataset expressing the Question -> Answer mapping.
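As a rough illustration of the training flow, the Python sketch below learns a one-feature threshold classifier (a decision stump) from ground-truth pairs and then applies it to new data. The feature ("failed logins per hour"), the values, and the function names are all hypothetical, and the stump is just one simple stand-in for a machine learning algorithm.

```python
def train_stump(ground_truth):
    """Pick the threshold that makes the fewest mistakes on the ground truth.

    ground_truth is a list of (question, answer) pairs: the question is a
    number, the answer is True (attack) or False (benign).
    """
    candidates = sorted(question for question, _ in ground_truth)
    best_threshold, best_errors = None, len(ground_truth) + 1
    for threshold in candidates:
        errors = sum((q >= threshold) != answer for q, answer in ground_truth)
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold

def predict(threshold, question):
    """The trained classifier: flag anything at or above the learned threshold."""
    return question >= threshold

# Hypothetical ground truth: failed logins per hour -> was it an attack?
ground_truth = [(0, False), (1, False), (2, False), (3, False),
                (20, True), (35, True), (50, True)]

threshold = train_stump(ground_truth)                  # training
new_data = [1, 42]
predictions = [predict(threshold, q) for q in new_data]  # prediction on new data
```

Nobody wrote the threshold by hand; it came out of the question and answer pairs. If those pairs were mislabeled or unrepresentative, the learned threshold would be wrong in the same way.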
One Size Does Not Fit All
Other ML Application | Security
NERD ALERT
Warning: Success in one domain does not guarantee success in another
What Is At Stake Matters
Because you watched Deadpool, you might like…
Deadpool, X-Men: First Class, The Flash, Captain America: The First Avenger
How did you come to that conclusion?
“The Explainability Problem”
Normal Workflow: CFO daily calendar
Irregular Activity: ML detects “suspicious” activity and suggests remediation
Quarantined: However, ML cannot articulate *why* it wants to remediate
Loss of time and resources
How We Know Machine Learning is Working
Precision: When my classifier predicts an instance in a certain class, how often does the instance belong to that class?
Accuracy: How often does my classifier give me the correct answer?
NERD ALERT
Root Mean Squared Error & Logistic Regression
Translation: On average, how far away are my predictions from what we later know to be true values?
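The three measures can be written out in a few lines of Python. This is a minimal sketch using the standard textbook definitions; the labels and scores are invented for illustration, and returning 0.0 when a class is never predicted is a convention chosen here.

```python
from math import sqrt

def accuracy(predicted, actual):
    """Fraction of all predictions that match the true answer."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)

def precision(predicted, actual, target):
    """Of the instances predicted as `target`, the fraction that truly are."""
    flagged = [a for p, a in zip(predicted, actual) if p == target]
    if not flagged:
        return 0.0  # convention chosen here: the class was never predicted
    return sum(a == target for a in flagged) / len(flagged)

def rmse(predicted, actual):
    """Square root of the mean squared gap between prediction and truth."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return sqrt(sum(squared) / len(squared))

# Hypothetical classifier output versus what was later confirmed.
predicted_labels = ["threat", "threat", "benign", "benign", "benign"]
actual_labels = ["threat", "benign", "benign", "benign", "threat"]

# Hypothetical numeric predictions versus the later-known true values.
predicted_scores = [3.0, 5.0]
actual_scores = [1.0, 7.0]

acc = accuracy(predicted_labels, actual_labels)              # 3 of 5 correct
prec = precision(predicted_labels, actual_labels, "threat")  # 1 of 2 alerts real
err = rmse(predicted_scores, actual_scores)                  # each miss is 2.0
```

Because the errors are squared before averaging, root mean squared error weighs large misses more heavily than a plain average distance would.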
Why is Machine Learning so useful in Security?
Evolving Security: The security domain is always evolving, has a large amount of variability, and is not well understood
Static: Has limited variability or is well understood
Insider Threats and Behavioral Security Analytics
Detecting: Through novelty and outliers
Attackers: They’re not breaking in, they are logging in
Events: Turn weak signals into a strong one
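One simple way to picture weak signals adding up to a strong one is a noisy-OR combination, sketched below in Python. The slides do not say how signals are combined, so this is only an assumed illustration: the scores are invented, and the formula treats the signals as independent, which real events rarely are.

```python
from math import prod

def combine_weak_signals(probabilities):
    """Noisy-OR: chance that at least one of several independent signals is real."""
    return 1.0 - prod(1.0 - p for p in probabilities)

# Hypothetical suspicion scores for four separate events on one account.
weak_signals = [0.2, 0.3, 0.25, 0.3]

combined = combine_weak_signals(weak_signals)  # about 0.706
```

No single event here would justify an alert on its own, yet together they point at the same account with much higher confidence than any one of them.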
Classify the Observable World and Infer the Rest
Normal Activity
Weird Stuff (but not threat related)
Threat Actor Activity
Billions of connections
• Statistical Methods
• Information-Theoretical Methods
• 70+ Unsupervised Anomaly Detectors
• Dynamic Adaptive Ensemble Creation
• Multiple-Instance Learning
• Neural Networks
• Rule Mining
• Random Forests
• Boosting
• ML: Supervised Learning
• Probabilistic Threat Propagation
• Graph-Statistical Methods
• Random Graphs
• Graph Methods
• Supervised Classifier Training
Anomaly Detection and Trust Modeling
Event Classification and Entity Modeling
Relationship Modeling
Cascade of specialized layers of Machine Learning algorithms
Multi-layer Analytical Pipeline
Security that Shows its Work
[Figure: incident timeline, Oct. 3 to Oct. 28, with labeled events: Spam tracking #CSPM02 (New), C&C url, Information Stealer #CDCH01, Anomalous http, Heavy uploader (Dropbox.com), Malicious http (Recurring). Summary label: Malware: sality, Dec. 9 | 28 days]
Measure the Right Things
Efficacy of the Assertions
True/False Positive
True/False Negative
Root Mean Squared Error
Overfitting/Underfitting
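A short Python sketch shows why the choice of measure matters. It counts true and false positives and negatives for a hypothetical day of traffic in which threats are rare, using a deliberately useless classifier that never raises an alert. All numbers are invented for illustration.

```python
def confusion_counts(predicted, actual):
    """Count true/false positives and negatives for boolean labels."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for p, a in zip(predicted, actual):
        if p and a:
            counts["TP"] += 1   # alerted, and it was a threat
        elif p and not a:
            counts["FP"] += 1   # alerted, but it was benign
        elif not p and a:
            counts["FN"] += 1   # stayed silent, but it was a threat
        else:
            counts["TN"] += 1   # stayed silent, and it was benign
    return counts

# Hypothetical day of traffic: 990 benign events and 10 real threats.
actual = [False] * 990 + [True] * 10
# A useless "classifier" that never raises an alert.
never_alert = [False] * 1000

counts = confusion_counts(never_alert, actual)
accuracy = (counts["TP"] + counts["TN"]) / len(actual)
```

The result is 99% accuracy with every one of the 10 threats missed, so the false negative count says far more about usefulness than the headline accuracy does.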
What to Ask Your Vendor
How are you applying Machine Learning in your product and why?
How do you measure its effectiveness?
Regarding supervised learning, what are you using for ‘ground truth’?
What non-machine learning are you using and why?
What papers or open-source have you published regarding your analytics?
For the ML-based assertions, what entailments are provided?
A Good Machine Learning Approach
Be Pragmatic
Entailments
Success is Domain Specific
Analytical pipeline over single technique
Measure helpfulness, not mathematical accuracy