107
Machine learning and AI in the datacenter and how it will affect you Nick Chase Head of Content, Mirantis Editor in Chief, Open Cloud Digest Author, Machine Learning for Mere Mortals

you datacenter and how it will affect Machine learning and

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Machine learning and AI in the datacenter and how it will affect you

Nick ChaseHead of Content, MirantisEditor in Chief, Open Cloud DigestAuthor, Machine Learning for Mere Mortals

Who am I?

Nick Chase• Long-time programmer• Head of Technical and Marketing Content for

Mirantis• Editor in Chief of Open Cloud Digest• Author of Machine Learning for Mere Mortals• Absolutely NOT a math major• nchase at mirantis dot com

2

Shhhhhh.....

Turns out, this is not as tough as you think.

Special Bonus

• Free early access to Machine Learning for Mere Mortals for live attendees

http://bit.ly/machine_learning_course

4

Agenda

• What is machine learning?• Types of machine learning• Problems facing the datacenter and how ML can

solve them • You, yes, YOU, can do this (and I'll prove it)• Intelligent Delivery datacenter and how to start• Q&A

5

Introduction Just what is machine learning,

anyway?

6

What is machine learning?

Machine learning is the study of enabling a computer to perform in situations it may not have

seen before.

7

Where can we see ML today?

• Analytics• Advertising• Bots• Pattern recognition• NLP• Search

8

Where can we see ML today?

• Analytics• Advertising• Bots• Pattern recognition• NLP• Search

9

• Image recognition • Video analysis• Threat detection• Medicine• Self driving cars

Lots of hype

• Existing companies http://indeedhi.re/2nFRKY0

10

● Amazon.com (1252)● JPMorgan Chase (733)● Goldman Sachs (674)● Microsoft (580)● Apple (322)● Google (217)● Facebook (217)● KPMG (148)● Jobspring Partners (124)● Capital One (118)● NVIDIA (116)● Strategic IT Staffing (111)● Oracle (110)● Lockheed Martin (106)● IBM (101)

Lots of hype

• New companies (http://bit.ly/2E2qFoU)• Newly accessible

• Tensorflow• SciKit-Learn• etc.

11

Machine learning vs artificial intelligence

12

What does ML have to do with "cloud native"?

• Dynamic environments

13

What does ML have to do with "cloud native"?

• Dynamic environments

• High-velocity management

14

What does ML have to do with "cloud native"?

• Dynamic environments

• High-velocity management

• Continuous automation

15

What does ML have to do with "cloud native"?

• Dynamic environments

• High-velocity management

• Continuous automation

• Observability

16

What does ML have to do with "cloud native"?

• Dynamic environments

• High-velocity management

• Continuous automation

• Observability

• Efficiency

17

What can ML do for the datacenter?

Continuous Delivery infrastructure

18

What can ML do for the datacenter?

Continuous Delivery infrastructure

Intelligent Delivery infrastructure

19

Intelligent Delivery infrastructure

• Defined architecture

20

Intelligent Delivery infrastructure

• Defined architecture

• Flexible but controllable infrastructure

21

Intelligent Delivery infrastructure

• Defined architecture

• Flexible but controllable infrastructure

• Intelligent oversight

22

Intelligent Delivery infrastructure

• Defined architecture

• Flexible but controllable infrastructure

• Intelligent oversight

• Secure footing

23

Types of Machine Learning

24

Supervised learning

Training a computer to understand data it hasn't seen by teaching it about data

that has already been labeled

25

Supervised learning: Examples

• Image recognition

26

Supervised learning: Examples

• Image recognition• Speech recognition

27

Supervised learning: Examples

• Image recognition• Speech recognition• Medical outcome prediction

28

Supervised learning: Examples

• Image recognition• Speech recognition• Medical outcome prediction• Spam filtering

29

Supervised learning: Examples

• Image recognition• Speech recognition• Medical outcome prediction• Spam filtering• Stock market/crypto pricing

30

Unsupervised learning

Letting the computer discover patterns and relationships in unlabeled data.

31

Unsupervised learning: example

• Clustering

32

Unsupervised learning: example

• Clustering • Customers into categories

33

Unsupervised learning: example

• Clustering • Customers into categories• Chemical compounds

34

Unsupervised learning: example

• Clustering • Customers into categories• Chemical compounds

• Anomaly detection

35

Unsupervised learning: example

• Clustering • Customers into categories• Chemical compounds

• Anomaly detection• Signal separation

36

Reinforcement learning

A combination of supervised and unsupervised learning, where the computer is given an objective and attempts to reach that objective, discovering

tactics as it goes along.

37

Reinforcement learning: example

• Just about anything humans do

38

Reinforcement learning: example

• Just about anything humans do• Self-driving cars

39

Reinforcement learning: example

• Just about anything humans do• Self-driving cars• Strategy development

40

Reinforcement learning: example

• Just about anything humans do• Self-driving cars• Strategy development

• Games

41

Reinforcement learning: example

• Just about anything humans do• Self-driving cars• Strategy development

• Games• Scheduling

42

Problems Facing the Datacenter

43

Configuration: potential issues

• Self-service

44

Configuration: potential issues

• Self-service • Microservices architectures

45

Configuration: potential issues

• Self-service• Microservices architectures• Dynamic environments and drift issues

46

Configuration: how ML/AI can help

• App configuration based on past history

47

Configuration: how ML/AI can help

• App configuration based on past history• Reinforcement learning to retroactively improve

configuration

48

Performance optimization: Potential issues

• Lots of factors in play

49

Performance optimization: Potential issues

• Lots of factors• Dynamic environments create a changing

landscape

50

Performance optimization: how ML/AI can help

• Analyze multiple factors

51

Performance optimization: how ML/AI can help

• Analyze multiple factors• Predict future load and move or scale workloads

based on patterns

52

Performance optimization: how ML/AI can help

• Analyze multiple factors• Predict future load and move or scale workloads

based on patterns• Move workloads to closer geographical resources

based on patterns

53

Cost optimization: potential issues

• Complex environments

54

Cost optimization: potential issues

• Complex environments• Changing costs

55

Cost optimization: potential issues

• Complex environments• Changing costs• Soaring storage requirements

56

Cost optimization: how AI/ML can help

• Keep track of multiple environments and costs

57

Cost optimization: how AI/ML can help

• Keep track of multiple environments and costs• Adapt to changing conditions

58

Cost optimization: how AI/ML can help

• Keep track of multiple environments and costs• Adapt to changing conditions• Optimize what data is kept

59

Fault detection: potential issues

• Hardware failure

60

Fault detection: potential issues

• Hardware failure• Software configuration drift

61

Fault detection: potential issues

• Hardware failure• Software configuration drift• Data corruption

62

Fault detection: potential issues

• Hardware failure• Software configuration drift• Data corruption

No, I am not making that up: http://bit.ly/cosmic-corruption

63

Fault detection: potential issues

• Hardware failure• Software configuration drift• Data corruption• Resource overrun

64

Fault detection: how AI/ML can help

• Autoscaling

65

Fault detection: how AI/ML can help

• Autoscaling• Pattern recognition

66

Fault detection: how AI/ML can help

• Autoscaling• Pattern recognition• Anomaly detection

67

Security: potential issues

• DDOS

68

Security: potential issues

• Distributed Denial Of Service attacks• Advanced Persistent Threats

69

Security: potential issues

• DDOS• APT• Inside jobs

70

Security: how ML/AI can help

• Anomaly detection to find APTs and inside jobs

71

Security: how ML/AI can help

• Anomaly detection to find APTs and inside jobs• Pattern recognition to predict DDOS attacks

before they're in full swing

72

Try It Yourself

73

Once upon a time...

74

Once upon a time...

• Self-configuring

75

Once upon a time...

• Self-configuring• Self-optimizing

76

Once upon a time...

• Self-configuring• Self-optimizing• Self-healing

77

Once upon a time...

• Self-configuring• Self-optimizing• Self-healing• Self-protecting

78

Once upon a time...

79

What are we trying to do?

80

What are we trying to do?

81

Defining the error

82

Defining the error

83

Defining the error

84

y

Defining the error

85

y

ypredicted

Defining the error

86

y

ypredicted

abs(y - ypredicted)

Defining the error

87

errortotal = sum(abs(y - ypredicted))

Defining the error

88

errortotal = sum(abs(y - (mx + b))

Hot and cold

89

Hot and cold

90

Hot and cold

91

Hot and cold

92

Hot and cold

93

Hot and cold

94

Hot and cold

95

errortotal = sum(abs(y - (mx + b))

Gradient descent

An iterative algorithm to find the minimum of the cost/error function and that parameters that

achieve that minimum.

96

Putting it together

• Data• Python3• Tensorflow• Matplotlib (optional)• Ubuntu 16.04

97

What we'll do

• Read the data• Create the model• Define the error• Define the optimizer• Repeatedly run the optimizer until we get as close

as we can

98

DEMO

99

The actual answer

y = .37x + 12m = .37b = 12

100

DEMO

101

Where do we go from here?

102

More practically

• Don't expect to go from B&W to VR in one step• Smaller steps in between

103

More practically

• Don't expect to go from B&W to VR in one step• Smaller steps in between

• Start with analytics

104

More practically

• Don't expect to go from B&W to VR in one step• Smaller steps in between

• Start with analytics• Ultimately, know what you're trying to achieve

105

Q&A

106

Thank You

107