Machine Learning Engineering Anatoly Levenchuk Copyright © 2016 by Anatoly Levenchuk. Permission granted to DeepHack and INCOSE to publish and use.


Page 1: A.Levenchuk -- Machine learning engineering

Machine Learning Engineering

Anatoly Levenchuk

Copyright © 2016 by Anatoly Levenchuk. Permission granted to DeepHack and INCOSE to publish and use.

Page 2: A.Levenchuk -- Machine learning engineering

What is machine learning as a human activity?

• Ontological question (Aristotelian definition: via class-subclass specialization)

• Why is it important?
– How to pay? [grants, investments, charity]
– How to teach? [science, engineering, arts/crafts]
– How to name and distinguish in communication (hiring: participation in the division of labor)?

What do you call yourself to colleagues when hacking on a machine learning system?

Page 3: A.Levenchuk -- Machine learning engineering

Machine learning is a…
• Science! MSc. in Machine Learning [BigData]
• Research?
• Engineering?
• Art?

Programming is a…
• Science? Computer science, MSc.
• Research? Computer science, MSc.
• Engineering? Software engr., MSc. and MSE!
• Art? Master of Art in Mathematics!

http://www.computer.org/web/education/professional-competency-certifications

https://www.kaggle.com/competitions

Page 4: A.Levenchuk -- Machine learning engineering

Test: Why is my program not working?

• Do you need to know why?
– To repair the compiler? Software engineer (systems)
– To advance theory? Computer scientist
• Do you need the program to work properly? Software engineer (application)

Page 5: A.Levenchuk -- Machine learning engineering

Science

Resulting in models, descriptions (theories), ontologies:
• M0 – manufacturing (not science!). Programmers are engineers: software is a physical system!
• M1 – design/applied research (Edison)
• M2 – basic research (Einstein)
• M3 – philosophical logic/mathematics

• There are multiple meta-levels.
• Scientists produce these meta-descriptions.

Page 6: A.Levenchuk -- Machine learning engineering

Engineering

• Engineering – discipline, art, skill and profession of acquiring and applying scientific, mathematical, economic, social, and practical knowledge, in order to design and build structures, machines, devices, systems, materials and processes that safely realize improvements to the lives of people.

• Engineering is the application of mathematics, empirical evidence and scientific, economic, social, and practical knowledge in order to invent, innovate, design, build, maintain, research, and improve structures, machines, tools, systems, components, materials, and processes.

https://en.wikipedia.org/wiki/Engineering

https://en.wikipedia.org/wiki/Outline_of_engineering

Page 7: A.Levenchuk -- Machine learning engineering


Data scientists – ML Engineers

Model/Theory [metamodel]

Engineering/Applied Research

Reality/Data/Model

Science/Basic Research

Unless it is about budgeting and social status, there is no need to distinguish science from engineering! Practice both of them!

Page 8: A.Levenchuk -- Machine learning engineering


Engineering for science

http://blogs.nvidia.com/blog/2016/01/12/accelerating-ai-artificial-intelligence-gpus/

Scientists are mere owner-operators of instruments. Who built the Large Hadron Collider? Experiments are ordered by scientists, built and carried out by engineers, and interpreted by scientists.

Page 9: A.Levenchuk -- Machine learning engineering

The sunset of the professions, not jobs!

Profession:
• Life-long
• Special education
• No other professions in the mix

Competency:
• Several years long
• Additional training
• One competency in the mix

Machine learning engineering is not a profession. It is a competency!

Page 10: A.Levenchuk -- Machine learning engineering

Machine learning (systems) engineering

• Control (systems) engineering
• Machine Learning (systems) engineering?

http://www.payscale.com/research/US/Job=Controls_Engineer/Salary

• Systems Engineer (IT)
• Cognitive/Machine Intelligence Systems Engineer?

Page 11: A.Levenchuk -- Machine learning engineering

What about jobs?

Algorithms + Data Structures = Programs (Niklaus Wirth)

A scientist is not an engineer, data is not a system

Page 12: A.Levenchuk -- Machine learning engineering

Kinds of engineering
• Mechanical engineering
• Agricultural engineering
• Aerospace engineering – aircraft architecture
• Systems engineering
• System of systems engineering
• …
• Software engineering
• Control [systems] engineering – control [system] architecture
• Knowledge engineering -- architecture
• Machine learning [system] engineering
• …
• Neural engineering
• Neural network engineering -- neural [network] architecture
• Feature engineering -- ???

Page 13: A.Levenchuk -- Machine learning engineering

Systems, Software, Machine Learning Engineering
• Systems engineering [Bell Labs in the 1940s, boosted as a profession by NCOSE in 1990]
• Software engineering [term appeared in 1965, boosted as a profession by NATO in 1968]
• Machine learning engineering [term appeared in 2011]

https://www.google.com/trends/explore#q=machine%20learning%20engineering&cmpt=q&tz=Etc%2FGMT-3

Page 14: A.Levenchuk -- Machine learning engineering

Convergence of engineering disciplines and disruption of engineering disciplines

[Diagram: Systems Engineering, Control Engineering, Software Engineering + Machine Learning Engineering → ???]

Janos Sztipanovits. Convergence: Model-Based Software, Systems And Control Engineering
http://www.infoq.com/presentations/Model-Based-Design-Janos-Sztipanovits

Léon Bottou: «Machine Learning disrupts software engineering»
http://leon.bottou.org/slides/2challenges/2challenges.pdf

We can add:
• Machine learning disrupts systems engineering
• Machine learning disrupts control engineering
• …
• Machine learning disrupts contemporary engineering

Page 15: A.Levenchuk -- Machine learning engineering

Can we use systems and software engineering wisdom in MLE?

Léon Bottou, http://leon.bottou.org/slides/2challenges/2challenges.pdf

• Models as modules: problematic due to weak contracts (models behave differently on different input data)

• Learning algorithms as modules: problematic because the output depends on the training data, which itself depends on every other module
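
The "weak contracts" point can be illustrated with a toy sketch. It is not taken from Bottou's slides: the one-dimensional threshold classifier, the function names and the data below are all hypothetical, chosen only to show that a learned module's behavior is tied to the input distribution it was trained on, unlike a function with a fixed specification.

```python
from statistics import mean

def train_threshold(samples):
    """Fit a 1-D classifier: threshold at the midpoint of the two class means.

    `samples` is a list of (value, label) pairs with labels 0 or 1.
    """
    mean0 = mean(v for v, y in samples if y == 0)
    mean1 = mean(v for v, y in samples if y == 1)
    return (mean0 + mean1) / 2

def accuracy(threshold, samples):
    """Fraction of samples for which `value > threshold` matches the label."""
    hits = sum((v > threshold) == bool(y) for v, y in samples)
    return hits / len(samples)

# Hypothetical training data: class 0 clusters near 1, class 1 near 5.
train = [(0.8, 0), (1.0, 0), (1.2, 0), (4.8, 1), (5.0, 1), (5.2, 1)]
# Same task, but every input is shifted by +3 (e.g. a recalibrated sensor).
shifted = [(v + 3.0, y) for v, y in train]

threshold = train_threshold(train)
acc_train = accuracy(threshold, train)      # the module "works" on familiar data
acc_shifted = accuracy(threshold, shifted)  # and silently degrades on shifted data
print(threshold, acc_train, acc_shifted)
```

The module's interface (a number in, a label out) is unchanged between the two evaluations; only the input data moved, which is exactly what an ordinary interface contract does not capture.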

Engineering is not only about modularity and modular synthesis! What about other aspects?!
• More attention to the left part of the V-diagram
• Optimizations later
• …
• What else?

Page 16: A.Levenchuk -- Machine learning engineering

Technical Debt

Machine Learning: The High-Interest Credit Card of Technical Debt
http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43146.pdf

Hidden Technical Debt in Machine Learning Systems
http://papers.nips.cc/paper/5656-hidden-technical-debt-in-machine-learning-systems.pdf

• Hack now, pay later (with interest, of course!).
• Based on heuristics from software engineering (the same approach as ours: applying software and systems engineering wisdom to machine learning engineering).
• A set of domain-specific heuristics for machine learning

Page 17: A.Levenchuk -- Machine learning engineering

Bionics and machine learning systems engineering
• In short: the brain is only an inspiration, not a model to reproduce!
• There are other “learning systems engineerings”: e.g. neural engineering (https://en.wikipedia.org/wiki/Neural_engineering).
• AGI (artificial general intelligence) is a distant goal, but a magnet for freaks of all sorts. Better not to mention it.
• Biologically plausible machine learning is about science, not engineering.

Page 18: A.Levenchuk -- Machine learning engineering

Knowledge engineering
• Ontology engineering (manual)
• Solutions are (manually) programmed.
• Example: robot-«butterfly», https://youtu.be/kyvW5sOcZHU, https://youtu.be/V30e77x8BQA
– Every type of movement has to be programmed anew
– Not adaptable to changes in the environment or the device
– The best science available to date!
– Perfect if a CPS performs only one or two movements. Definitely not for robots!
• No learning!

Page 19: A.Levenchuk -- Machine learning engineering

Tribes

Shallow Learning

Big Data

Deep Learning

Neuroevolution

Bayes Army

Symbolic

Page 20: A.Levenchuk -- Machine learning engineering

Our definition of complexity

Complex system: one that does not fit in a single engineer’s head, so team collaboration and automation of knowledge work are mandatory.

E.g.:
• Aircraft
• Programming-in-the-small vs. programming-in-the-large
• VLSI (very large scale integration): more than 1000 transistors on a single chip (the transistor count is now more than 20 billion: FPGA Virtex-Ultrascale XCVU440)
• Artificial neural network: 16 billion parameters

Page 21: A.Levenchuk -- Machine learning engineering

Complexity

• Systems engineering: a complex system does not fit into one (or a hundred) heads for its development. Stellarators, tokamaks, the LHC, aerospace and VLSI engineering.
http://787updates.newairplane.com/787-Suppliers/World-Class-Supplier-Quality

• Machine learning: IBM Watson (up to 2011) had a team of 40. Still not very complex from an engineering point of view.
http://josephpcohen.com/w/visualizing-cnn-architectures-side-by-side-with-mxnet/

Page 22: A.Levenchuk -- Machine learning engineering

CNN architecture/complexity growth

[Figure: CNN architectures visualized side by side. Dates shown: 1998, 2012, 9/2014, 2/2015, 12/2015, 9/2014. Panels: LeNet 28×28, VGG 224×224, GoogLeNet 224×224, Inception V3 299×299, Inception BN 224×224.]

http://josephpcohen.com/w/visualizing-cnn-architectures-side-by-side-with-mxnet/

Page 23: A.Levenchuk -- Machine learning engineering

AutoML
• Generative design/architecting of networks
• Bayesian convergence
• Neuroevolution
• Dynamic neural description languages (e.g. Chainer)

Automation of machine learning, CAMLE (computer-aided machine learning engineering), is the main trend of today and tomorrow!

Page 24: A.Levenchuk -- Machine learning engineering

Master Algorithm

Pedro Domingos [module/construction], http://www.amazon.com/dp/0465065708/
• Symbolic
• Evolution
• Connectionist
• Bayesian
• Analogy
[No free lunch!]

Sarath Chandar [component/function], http://apsarath.github.io/2016/01/19/agi/
• multi-task learning
• transfer learning
• zero-shot/one-shot learning
• multi-modal learning
• reinforcement learning

Page 25: A.Levenchuk -- Machine learning engineering

Intellect-stack is only about one aspect of a whole intellect system.

Intellect-stack is about Platforms (modules) = «how to make it»

Based on Fig.3 ISO 81346-1

-Modules

=Components

+Allocations


Modules and interfaces: platforms/layers

Stack

Page 26: A.Levenchuk -- Machine learning engineering

Platform

• This is the module viewpoint («how to make»)
• A platform is a technology stack layer
• A cohesive set of modules with a published API
• Can be built on top of another platform
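
The platform definition above (a cohesive set of modules behind a published API, possibly built on top of another platform) can be sketched as a small data structure. This is an illustrative sketch only: the `Platform` class, its `call` and `layers` methods and the two example layers are hypothetical names, not something taken from the slides or from any real framework.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional

@dataclass
class Platform:
    """A technology-stack layer: a named set of modules behind a published API."""
    name: str
    api: Dict[str, Callable] = field(default_factory=dict)
    base: Optional["Platform"] = None  # the platform this one is built on, if any

    def call(self, operation, *args):
        """Look up `operation` in this layer's API, falling back to lower layers."""
        layer = self
        while layer is not None:
            if operation in layer.api:
                return layer.api[operation](*args)
            layer = layer.base
        raise KeyError(f"{operation!r} is not published by {self.name} or its base platforms")

    def layers(self):
        """Names of the stack from this platform down to the bottom."""
        names, layer = [], self
        while layer is not None:
            names.append(layer.name)
            layer = layer.base
        return names

# Hypothetical two-layer stack: a computational library under a learning platform.
def dot(xs, ys):
    return sum(x * y for x, y in zip(xs, ys))

library = Platform("computational library", api={"dot": dot})
learning = Platform(
    "learning algorithm platform",
    api={"predict": lambda weights, inputs: library.call("dot", weights, inputs) > 0},
    base=library,
)

stack_names = learning.layers()
prediction = learning.call("predict", [1.0, -2.0], [3.0, 1.0])
```

A user of the upper layer sees only its published operations; the lower layer stays reachable but is not what the user programs against.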

Page 27: A.Levenchuk -- Machine learning engineering

Intelligence Platform Stack and machine learning engineering in it

Application (domain) Platform

Cognitive Architecture Platform

Learning Algorithm Platform

Computational library

General Computer Language

CPU

GPU/FPGA/Physical computation Drivers

GPU/FPGA/Physical computation Accelerator

Neurocompiler

Neuromorphic driver

Neuromorphic chip

[Arrows along the stack: Disruption enablers, Disruption demand]

Thanks to computer gamers, whose disruption demand gave us a disruption enabler such as the GPU!

Page 28: A.Levenchuk -- Machine learning engineering

Alternative deep learning stack (as viewed by GPU hardware people)

http://www.nextplatform.com/2015/12/07/gpu-platforms-emerge-for-longer-deep-learning-reach/

• No cognitive and application levels
• Languages unimportant
• Chassis, backplane, blades important (separate layer)
• No neuromorphic processing

Page 29: A.Levenchuk -- Machine learning engineering

Hardware Acceleration (except GPU)
Is this machine learning engineering? No! But…

• Algorithm-dependent
• Needs compilation (drivers)
• Speed rules
• Power rules
• Scale rules

• GPU
• FPGA
• ASIC
• Neuromorphic chips
• Physical computing

http://lighton.io/

• Approximating kernels at the speed of light http://arxiv.org/abs/1510.06664
An analog optical device that performs random projections literally at the speed of light without having to store any matrix in memory. This is achieved using the physical properties of multiple coherent scattering of coherent light in random media.

• Towards Trainable Media: Using Waves for Neural Network-style Learning http://arxiv.org/abs/1510.03776

• Bitwise Neural Networks http://arxiv.org/abs/1601.06071

• Conversion of Artificial Recurrent Neural Networks to Spiking Neural Networks for Low-power Neuromorphic Hardware http://arxiv.org/abs/1601.04187
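
Why bitwise representations are attractive for hardware can be shown with a generic arithmetic identity: for two vectors of +1/-1 values, the dot product equals the vector length minus twice the number of positions where the signs differ, so it reduces to an XOR and a bit count. The sketch below illustrates that identity only; it is not claimed to be the method of the Bitwise Neural Networks paper listed above, and the helper names `pack_signs` and `binary_dot` are hypothetical.

```python
def pack_signs(signs):
    """Pack a list of +1/-1 values into an int, one bit per value (1 encodes +1)."""
    bits = 0
    for i, s in enumerate(signs):
        if s == 1:
            bits |= 1 << i
        elif s != -1:
            raise ValueError("values must be +1 or -1")
    return bits

def binary_dot(a_bits, b_bits, n):
    """Dot product of two packed +1/-1 vectors of length n using XOR and popcount."""
    differing = bin(a_bits ^ b_bits).count("1")   # positions where the signs differ
    return n - 2 * differing                      # agreements minus disagreements

a = [1, -1, 1, 1, -1, -1, 1, -1]
b = [1, 1, -1, 1, -1, 1, 1, -1]
fast = binary_dot(pack_signs(a), pack_signs(b), len(a))
reference = sum(x * y for x, y in zip(a, b))      # ordinary multiply-accumulate
```

Both computations give the same value; the bitwise form needs no multiplier, which is the kind of property that makes an algorithm hardware-dependent in the sense of the list above.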

Page 30: A.Levenchuk -- Machine learning engineering

General Computer Language
Computer science + Software engineering

• Important! A separate layer in the intellect-stack!
• Two-language problem
– Experiment and production, as in deep learning frameworks (speed)
– «Wrappers» in libraries (thresholds in understanding the full stack down to the hardware at the bottom)
• My preference: Julia (http://julialang.org/)
– Scientific computing is a design goal of Julia; MATLAB-like syntax
– Two-language problem solved (speed of computation as in C, speed of writing code as in Python)
– Extensive mathematical function library; Base library and external packages in native Julia
– Parallel computing supported (GPU supported too)
– Not object-oriented; uses multiple dispatch as a solution to the expression problem (good modularity)
– Version 0.4.3 now (1.0 expected in one year)
– Caution: slightly more complex than Python, should not be your first computer language…
– The MXNet deep learning framework has a Julia wrapper
• A DSL for deep learning is not a General Computer Language
– Probabilistic programming languages -- http://probabilistic-programming.org/wiki/Home
– DNN description languages, as in CNTK -- https://github.com/Microsoft/CNTK
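
Multiple dispatch, mentioned above as Julia's answer to the expression problem, means that the implementation is chosen from the types of all arguments rather than of the first one only. The following Python sketch emulates the idea with a hand-written registry. The `multimethod` decorator and the `Dense`/`Sparse` classes are hypothetical helpers written for this illustration (not a standard library API), and Julia's built-in dispatch is considerably more general: it also handles subtypes and ambiguity resolution, which this sketch does not.

```python
_registry = {}

def multimethod(*types):
    """Register an implementation of a function for an exact tuple of argument types."""
    def decorator(func):
        name = func.__name__
        _registry.setdefault(name, {})[types] = func

        def dispatcher(*args):
            # Select the implementation from the runtime types of ALL arguments.
            key = tuple(type(a) for a in args)
            try:
                impl = _registry[name][key]
            except KeyError:
                raise TypeError(f"no method {name}{key}") from None
            return impl(*args)

        dispatcher.__name__ = name
        return dispatcher
    return decorator

class Dense: ...
class Sparse: ...

@multimethod(Dense, Dense)
def multiply(a, b):
    return "dense*dense kernel"

@multimethod(Dense, Sparse)
def multiply(a, b):
    return "dense*sparse kernel"

@multimethod(Sparse, Sparse)
def multiply(a, b):
    return "sparse*sparse kernel"

result_dd = multiply(Dense(), Dense())
result_ds = multiply(Dense(), Sparse())
```

New type combinations can be added by registering another method, without editing the existing classes or the existing methods, which is the modularity benefit the slide refers to.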

Page 31: A.Levenchuk -- Machine learning engineering

Computation libraries/frameworks/platforms
Not machine learning engineering!

• Computation libraries over drivers + hardware (GPU, clusters)
• Linear algebra, optimization, autodiff, symbolic computations, etc.
• Can be a standalone platform, and thus differ from machine learning libraries (general algorithms for multiple purposes: bioinformatics, physics, astronomy, engineering, machine learning etc.)
• Deep learning frameworks often include such a library (Torch, Theano, …).

• Scikit (NumPy, SciPy, and matplotlib)
• Nd4j (n-dimensional arrays for Java)
• Julia packages
• …
• Non-opensource: Mathematica, Maple…

Machine learning is “yet another set of domain modules and a DSL” for them!
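
As an illustration of one of the general-purpose capabilities listed above, autodiff, here is a minimal forward-mode sketch based on dual numbers. It is a teaching example in pure Python under simplifying assumptions (only addition and multiplication are supported); the `Dual` class and `derivative` helper are hypothetical names and do not reproduce the API of any of the libraries named on this slide.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Dual:
    """A number `value + deriv * eps` with eps**2 == 0 (forward-mode autodiff)."""
    value: float
    deriv: float = 0.0

    @staticmethod
    def lift(x):
        """Treat plain numbers as constants (derivative 0)."""
        return x if isinstance(x, Dual) else Dual(float(x))

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = Dual.lift(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__

def derivative(f, x):
    """Evaluate df/dx at x by seeding the derivative part with 1."""
    return f(Dual(float(x), 1.0)).deriv

def poly(x):
    return 3 * x * x + 2 * x + 1   # analytically, d/dx = 6x + 2

slope = derivative(poly, 2.0)
```

The same `poly` function works on ordinary floats and on `Dual` values, which is why such a library can serve physics or engineering code as readily as machine learning code.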

Page 32: A.Levenchuk -- Machine learning engineering

Learning algorithm frameworks (not systems)!
Machine learning engineering!

• Gentleman's set of algorithms (CNN, RNN, …)
• Updated at the rhythm of arxiv.org papers!
• Network description language: a DSL for machine learning engineering
• Experiments and production (scalable!)
• Extensibility (on the basis of a general computing language and a scientific computing library: on the basis of another layer platform in the intellect-stack)
• Presented as The Machine Learning Platform (including all lower levels assembled and tuned)
• There are hundreds of them: no fewer than there were «web frameworks» in the early web

http://www.slideshare.net/yutakashino/ss-56291783

• Google
• Facebook
• Microsoft
• Baidu
• IBM
• Samsung
• …

+ standard datasets for comparisons and benchmarking

+ other tribes' platforms
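
The "network description language" bullet above can be made concrete with a toy sketch: the network is declared as data, and tools (here, a parameter counter) interpret that description. The dictionary format, layer types and the `count_parameters` function are hypothetical and are not the syntax of any real framework's DSL.

```python
# Each layer is described declaratively; nothing here is tied to a real framework.
network = [
    {"type": "input", "units": 784},
    {"type": "dense", "units": 128, "activation": "relu"},
    {"type": "dense", "units": 10, "activation": "softmax"},
]

def count_parameters(description):
    """Count weights and biases of the dense layers in a network description."""
    total = 0
    previous_units = None
    for layer in description:
        if layer["type"] == "input":
            previous_units = layer["units"]
        elif layer["type"] == "dense":
            if previous_units is None:
                raise ValueError("a dense layer needs an input or a preceding layer")
            # weights (inputs x outputs) plus one bias per output unit
            total += previous_units * layer["units"] + layer["units"]
            previous_units = layer["units"]
        else:
            raise ValueError(f"unknown layer type: {layer['type']}")
    return total

parameter_count = count_parameters(network)
```

Because the description is plain data, the same declaration could in principle be handed to a trainer, a visualizer or a code generator, which is the appeal of a DSL layer on top of a general computing language.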

Page 33: A.Levenchuk -- Machine learning engineering

Construction (type of modules) in machine learning

• Deep learning classics (DSL in deep learning frameworks)
• Probabilistic languages http://probabilistic-programming.org/, https://probmods.org/
• Deep learning and Bayesian convergence -- http://www.nextplatform.com/2015/09/24/deep-learning-and-a-new-bayesian-golden-age/, http://blog.shakirm.com/2015/10/bayesian-reasoning-and-deep-learning/, http://arxiv.org/abs/1512.05287
• Differentiable languages and datatypes http://colah.github.io/posts/2015-09-NN-Types-FP/, http://www.blackboxworkshop.org/pdf/nips2015blackbox_zenna.pdf, http://arxiv.org/abs/1506.02516
• …
• Blends and hybrids of many other learning architectures

Varieties in representations: in deep learning, abstraction is architecturally layered; in other approaches it is different!

Page 34: A.Levenchuk -- Machine learning engineering

Algorithm platform + Hardware platform = Algorithm platform (hardware is not visible to a platform user, but it still matters!)

http://blogs.microsoft.com/next/2016/01/25/microsoft-releases-cntk-its-open-source-deep-learning-toolkit-on-github/

Page 35: A.Levenchuk -- Machine learning engineering

Cognitive systems/architectures
Learning, communications, reasoning, planning

• Cognitive = knowledge processing. Knowledge is information that is useful in a variety of situations.
• A cognitive architecture/system is a platform for multiple application systems.
• Ensembles of learning algorithms are close to cognitive systems engineering.

• Cognitive systems engineering is machine learning systems engineering plus something else.
• Something else: e.g. knowledge engineering, the manual coding (formalization) of knowledge.
• Machine learning systems engineering is not cognitive systems engineering; it is smaller!

Page 36: A.Levenchuk -- Machine learning engineering

Machine Learning and Cognitive Level

• «deep learning research is likely to continue its expansion from traditional pattern recognition jobs to full-scale AI tasks involving symbolic manipulation, memory, planning and reasoning. This will be important for reaching to full understanding of natural language and dialogue with humans (i.e., pass the Turing test). Similarly, we are seeing deep learning expanding into the territories of reinforcement learning, control and robotics and that is just the beginning» -- Yoshua Bengio

https://www.quora.com/Where-is-deep-learning-research-headed

If we can learn to reason, plan, model, act – then machine learning engineering will be cognitive systems engineering!

Machine intelligence vs. artificial intelligence

Page 37: A.Levenchuk -- Machine learning engineering

Example: MANIC, A Minimal Architecture for General Cognition (http://arxiv.org/abs/1508.00019)

• Keywords: action, planning, observation, decisions, knowledge, …
• Are these keywords for learning systems engineering?

Page 38: A.Levenchuk -- Machine learning engineering

Application level of intellect-stack
• The killer application for learning systems is here!
• Domain specificity and data are here!
• End users and money are here!
• Systems engineering is here!

This chart is only about the enterprise AI systems market.
https://www.tractica.com/newsroom/press-releases/artificial-intelligence-for-enterprise-applications-to-reach-11-1-billion-in-market-value-by-2024/

If you have no application of interest, there will be no data, no money, no developments, no engineering.

Most machine learning engineering is applied. Only a small part is machine learning platform development.

Page 39: A.Levenchuk -- Machine learning engineering

Application level: systems engineering
• Strategizing and conceptual design
• Requirements engineering
• System architecture
• V&V
• Configuration management

• A machine learning engineer is one of several engineers who participate in a cyber-physical system project team.

[Diagram labels: Sensors, Consoles, Actuators, Monitors]

http://www.nist.gov/el/nist-releases-draft-framework-cyber-physical-systems-developers.cfm

Page 40: A.Levenchuk -- Machine learning engineering

Life cycle stages dictionary

• Conception
• Design
• Manufacturing
• Integration
• Validation and verification
• Operation

Machine learning | Systems engineering
Conception and requirements | Conception and requirements
Architecture and Design | Architecture and Design
Training | Manufacturing
Transfer learning, ensembling | Integration
Validation and verification | Validation and verification
Inference | Operation

Page 41: A.Levenchuk -- Machine learning engineering

Stakeholder concerns

Domain-specific concerns:
• Expressivity
• Computational efficiency
• Trainability
• Good generalization (not overfitting)

Traditional concerns:
• Composability – layering, ensembling
• Compositionality – transfer learning
• Resilience
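
Ensembling, listed above under composability, can be sketched in its simplest form as a majority vote over independently built classifiers. The three "models" below are hypothetical stand-ins (plain functions with deliberately biased thresholds), and `majority_vote` is an illustrative helper, not a reference to any particular library.

```python
from collections import Counter

def majority_vote(models, x):
    """Combine classifiers by returning the label most of them predict for x."""
    votes = Counter(model(x) for model in models)
    label, _count = votes.most_common(1)[0]
    return label

# Three hypothetical, deliberately imperfect "models" for the sign of a number.
models = [
    lambda x: "positive" if x > 0 else "negative",
    lambda x: "positive" if x > 1 else "negative",   # biased threshold
    lambda x: "positive" if x > -1 else "negative",  # biased the other way
]

ensemble_label = majority_vote(models, 0.5)
```

With an odd number of two-class models there are no ties; the ensemble is itself a module composed of modules, which is why the slide places it next to layering.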

Page 42: A.Levenchuk -- Machine learning engineering

Intellect-stack and machine learning (systems) engineering
• Machine learning (systems) engineering now covers only a small part of the whole intellect-stack but interacts with all levels.
• No single Googbookdu can develop all levels of the intellect-stack platforms (from hardware accelerators at the bottom up to applications at the top) by itself. Except maybe IBM, which can span from TrueNorth to IBM Watson applications ;-)
• Interfaces of supporting platforms will be stabilizing and… constantly updated (as with software engineering APIs: everything changes once every 5 years).
• Technology disruption starts at the low (enabling) levels of a stack, while demand comes from the upper levels, so nobody in the middle can ignore developments in the other layers' platforms.

Page 43: A.Levenchuk -- Machine learning engineering

Intellect-Stack

Application (domain) Platform

Cognitive Architecture Platform

Learning Algorithm Platform

Computational library

General Computer Language

CPU

GPU/FPGA/Physical computation Drivers

GPU/FPGA/Physical computation Accelerator

Neurocompiler

Neuromorphic driver

Neuromorphic chip

[Arrows along the stack: Disruption enablers, Disruption demand]

Where are you now? Where will you be tomorrow?

Page 44: A.Levenchuk -- Machine learning engineering

Thank you!

Anatoly Levenchuk
TechInvestLab, president
INCOSE Russian chapter, research director
https://ru.linkedin.com/in/[email protected]

Blog in Russian: http://ailev.ru