Deep Nets, Bayes and the story of AI (continued)
David Barber
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Intelligent Machinery
1948 Turing and Champernowne 'paper and pencil' chess
Intelligent Machinery
1951 Prinz mate-in-two moves chess machine
1952 Strachey programs first computer draughts algorithm
Learning Machines
1951 Oettinger makes first program that 'learns'
1955 Samuel adds 'learning' to his draughts algorithm
Logical Intelligence
1968 Risch's algorithm for integration in calculus
1972 Prolog for general logical reasoning
1997 Deep Blue defeats Kasparov
Other forms of intelligence
But is this getting us to where we'd like to be? Selfridge-Shannon film clip
Speech Recognition
Visual Processing
Natural Language modelling
Planning and decision in uncertain environments
Perhaps a different approach would be useful
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Astonishing Hypothesis (Crick)
"A person's mental activities are entirely due to the behaviour of nerve cells and the molecules that make them up and influence them."
Neurons
Visual Pathway
Information Processing in Brains
[Figure: information flows from the Real World through feature layers (Layer 1, Layer 2) to high-level concepts.]
Hierarchical, Modular, Binary, Parallel, Noisy
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Artificial Neuron (Perceptron)
[Figure: a perceptron; input neurons 1–7 feed a single output neuron through weights 1–7.]
Training an artificial neural network
Want to generalise to new images with high accuracy
Artificial Network
1957 Rosenblatt's perceptron
perceptron film clip
Connectionism
1960 Realised a perceptron can only solve simple tasks
1970 Decline in interest
1980 New computing power made training multilayer networks feasible
outputinputs
Each node (or 'neuron') computes a function of a weighted combination of parental nodes, $h_j = \sigma\left(\sum_i w_{ij} h_i\right)$
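To make the node computation concrete, here is a minimal sketch (mine, not from the talk) in Python/numpy, taking σ to be the logistic function; the seven inputs and the weights are arbitrary placeholder values:

```python
import numpy as np

def sigma(x):
    # logistic activation function
    return 1.0 / (1.0 + np.exp(-x))

def node_output(h_parents, w):
    # h_j = sigma(sum_i w_ij * h_i): a weighted combination of parental nodes
    return sigma(h_parents @ w)

rng = np.random.default_rng(0)
h = rng.random(7)            # activations of seven parental 'neurons'
w = rng.standard_normal(7)   # weights 1..7 into the output neuron
print(node_output(h, w))
```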
Neural Networks and Deep Learning
Historical Problems with Neural Nets (1990s)
NNs are difficult to train (many local optima)
Particularly difficult to train a NN with a large number of layers (say larger than around 10)
'Gradient Diffusion Problem' – difficult to assign responsibility of errors to individual 'neurons'
Machine Learning (up to 2006)
A large section of the machine learning community abandoned NNs
More principled and computationally better understood techniques (SVMs and related convex methods) replaced them
Bayesian AI (1990s onwards)
From the mid 1990s there was a realisation that pattern recognition is not sufficient for all AI purposes
Uncertainty and reasoning are not naturally representable using standard feed-forward nets
Explosion in more 'symbolic' Bayesian AI
Deep Learning
NNs have resurged in interest in the last few years (Hinton, Bengio, ...)
Also called 'deep learning'
Sense that very complex tasks (object recognition, learning complex structure in data) require going beyond simple (convex) statistical techniques
The brain uses hierarchical distributed processing, and it is likely to be for a good reason
Many problems have a hierarchical structure: images are made of parts, language is hierarchical, etc.
Why now?
New computing resources (GPU processing)
Availability of large amounts of data means that we can train nets with many parameters ($10^{10}$)
Recent evidence suggests local optima are not particularly problematic
Autoencoder
[Figure: an autoencoder; inputs $y_1, \dots, y_5$ are encoded through hidden layers ($h_1, h_2, h_3$, then a bottleneck $h_4, h_5$) and decoded ($h_6, h_7, h_8$) back to reconstructions $y_1, \dots, y_5$.]
The bottleneck forces the network to try to find a low dimensional representation of the data
Useful for unsupervised learning
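As a rough illustration of the bottleneck idea, a minimal sketch (my own; the data, layer sizes and learning rate are arbitrary assumptions, and real autoencoders are deeper) of training a tiny autoencoder by gradient descent on the reconstruction error:

```python
import numpy as np

rng = np.random.default_rng(0)
Y = rng.random((100, 5))                 # 100 data points y with D = 5 components
D, H = 5, 3                              # visible dimension and bottleneck dimension

W1 = 0.1 * rng.standard_normal((D, H))   # encoder weights
W2 = 0.1 * rng.standard_normal((H, D))   # decoder weights

for step in range(2000):
    Hcode = np.tanh(Y @ W1)              # low dimensional code h
    Yhat = Hcode @ W2                    # reconstruction of y
    err = Yhat - Y
    # gradients of the (halved) mean squared reconstruction error
    gW2 = Hcode.T @ err / len(Y)
    gW1 = Y.T @ ((err @ W2.T) * (1 - Hcode**2)) / len(Y)
    W1 -= 0.5 * gW1
    W2 -= 0.5 * gW2

print(np.mean(err**2))                   # reconstruction error after training
```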
Autoencoder on MNIST digits (Hinton 2006 Science)
Figure: Reconstructions using H = 30 components. From the top: original image, Autoencoder 1, Autoencoder 2, PCA.
60,000 training images ($28 \times 28 = 784$ pixels)
Use a form of autoencoder to find a lower (30) dimensional representation
At the time, the special layerwise training procedure was considered fundamental to the success of this approach. Now not deemed necessary, provided we use a sensible initialisation
Google Cats
10 million YouTube video frames ($200 \times 200$ pixel images)
Use a specialised autoencoder with 9 layers (1 billion weights)
2000 computers + two weeks of computing
Examine units to see what images they most respond to
Google Autoencoder
From Nando De Freitas
Convolutional NNs
CNNs are particularly popular in image processing
Often the feature maps correspond not to macro features (such as bicycles) but to micro features
For example, in handwritten digit recognition they correspond to small constituent parts of the digits
These are then used to process the image into a representation that is better for recognition
NNs in NLP
Bag of Words
We have D words in a dictionary (aardvark, ..., zorro) so that we can relate each word with its dictionary index
We can also think of this as a Euclidean embedding e
aardvark $\to e_{aardvark} = (1, 0, \dots, 0)^{\mathsf{T}}$, ..., zorro $\to e_{zorro} = (0, \dots, 0, 1)^{\mathsf{T}}$
Word Embeddings
Idea is to replace the Euclidean embeddings e with embeddings (vectors) v that are learned
The objective is, for example, next-word prediction accuracy
These are often called 'neural language models'
NNs in NLP
Each word w in the dictionary has an associated embedding vector $v_w$. Usually around 200-dimensional vectors are used
Consider the sentence
the cat sat on the mat
and that we wish to predict the word 'on' given the two preceding words 'cat sat' and the two succeeding words 'the mat'
We can use a network that has inputs $v_{cat}, v_{sat}, v_{the}, v_{mat}$
The output of the network is a probability over all words in the dictionary, $p(w|v_{\text{inputs}})$. We want $p(w = \text{on}|v_{cat}, v_{sat}, v_{the}, v_{mat})$ to be high
The overall objective is then to learn all the word embeddings and network parameters subject to predicting the word correctly based on the context
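To make the setup concrete, a minimal sketch (assumptions: a toy five-word dictionary, 8-dimensional embeddings instead of the ~200 used in practice, and a single softmax layer; training is omitted):

```python
import numpy as np

rng = np.random.default_rng(0)
words = ["the", "cat", "sat", "on", "mat"]     # toy dictionary
idx = {w: i for i, w in enumerate(words)}
D, E = len(words), 8                           # dictionary size, embedding dimension

V = 0.1 * rng.standard_normal((D, E))          # learnable word embeddings v_w
W = 0.1 * rng.standard_normal((4 * E, D))      # maps the concatenated context to word scores

def p_word_given_context(context):
    # concatenate the four context embeddings, then apply a softmax layer
    x = np.concatenate([V[idx[w]] for w in context])
    scores = x @ W
    e = np.exp(scores - scores.max())
    return e / e.sum()

p = p_word_given_context(["cat", "sat", "the", "mat"])
print(dict(zip(words, np.round(p, 3))))        # training would push p("on") up
```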
Word Embeddings
Given a word (France, for example) we can find which words w have embedding vectors closest to $v_{France}$. From Ronan Collobert (2011)
Word Embeddings
There appears to be a natural 'geometry' to the embeddings. For example, there are directions that correspond to gender:
$v_{woman} - v_{man} \approx v_{aunt} - v_{uncle}$
$v_{woman} - v_{man} \approx v_{queen} - v_{king}$
From Mikolov (2013)
Word Embeddings Analogies
Given a relationship France–Paris, we get the 'relationship' embedding
$$v = v_{Paris} - v_{France}$$
Given Italy, we can calculate $v_{Italy} + v$ and find the word in the dictionary which has the closest embedding to this (it turns out to be Rome). From Mikolov (2013)
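A minimal sketch of this nearest-embedding lookup (the embeddings below are random placeholders; with trained vectors one would hope $v_{Italy} + v$ lands near $v_{Rome}$):

```python
import numpy as np

def closest_word(v, emb, exclude=()):
    # return the word whose embedding has the highest cosine similarity to v
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
    return max((w for w in emb if w not in exclude), key=lambda w: cos(emb[w], v))

rng = np.random.default_rng(0)
words = ["France", "Paris", "Italy", "Rome", "Germany", "Berlin"]
emb = {w: rng.standard_normal(200) for w in words}   # placeholder, untrained embeddings

v = emb["Paris"] - emb["France"]                     # the 'relationship' direction
print(closest_word(emb["Italy"] + v, emb, exclude={"Italy", "Paris", "France"}))
```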
Word Embeddings Constrained Embeddings
We can learn embeddings for English words and embeddings for Chinese words.
However, when we know that a Chinese and an English word have a similar meaning, we add a constraint that the word embeddings $v_{ChineseWord}$ and $v_{EnglishWord}$ should be close.
We have only a small amount of labelled 'similar' Chinese–English words (these are the green border boxes in the above; they are standard translations of the corresponding Chinese character).
We can visualise the embedding vectors in 2D (using t-SNE). See Socher (2013).
Word Embeddings Constrained Embeddings
Recursive Nets and Embeddings
Stanford Sentiment Treebank: consists of parsed sentences with sentiment labels (−−, −, 0, +, ++) for each node (phrase) in the tree; 215,000 labelled phrases (obtained from three humans)
Recursive Nets and Embeddings
Idea is to recursively combine embeddings such that they accurately predictthe sentiment at each node
Recursive Nets and Embeddings
Training
We have a softmax classifier for each node in the tree to predict the sentiment of the phrase beneath this node in the tree
The weights of this classifier are shared across all nodes
At the leaf nodes at the bottom of the tree, the inputs to the classifiers are the word embeddings
The embeddings are combined by another network g with common parameters, which forms the input to the sentiment classifier
We then learn all the embeddings, shared classifier parameters and shared combination parameters to maximise the classification accuracy
Prediction
For a new movie review, the review is first parsed using a standard grammar tree parser
This forms the tree, which can be used to recursively form the sentiment class label for the review
Currently the best sentiment classifier. Socher (2013)
Recursive Nets and Embeddings

[Figure: parse trees with a predicted sentiment label at each node, for positive and negative example sentences and their negations; from Socher (2013).]
Recurrent Nets
[Figure: an RNN unrolled through time; inputs $x_1, x_2, x_3$, hidden states $h_1, h_2, h_3$ and outputs $y_1, y_2, y_3$, with weight matrices A, B, C shared across timesteps.]
RNNs are used in timeseries applications
The basic idea is that the hidden units at time t (and possibly the output $y_t$) depend on the previous state of the network $h_{t-1}, x_{t-1}, y_{t-1}$, for inputs $x_t$ and outputs $y_t$
In the above network I 'unrolled the net through time' to give a standard NN diagram
I omitted the potential links from $x_{t-1}, y_{t-1}$ to $h_t$
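A minimal sketch of reading the unrolled diagram as code (my labelling assumption: A maps the previous hidden state, B the current input and C the hidden-to-output readout; all sizes are arbitrary):

```python
import numpy as np

def rnn_step(h_prev, x_t, A, B, C):
    # h_t depends on the previous hidden state and the current input;
    # y_t is read out from h_t, with weights shared across time
    h_t = np.tanh(A @ h_prev + B @ x_t)
    y_t = C @ h_t
    return h_t, y_t

rng = np.random.default_rng(0)
Hdim, Xdim, Ydim, T = 4, 3, 2, 5
A = 0.5 * rng.standard_normal((Hdim, Hdim))
B = 0.5 * rng.standard_normal((Hdim, Xdim))
C = 0.5 * rng.standard_normal((Ydim, Hdim))

h = np.zeros(Hdim)
for t in range(T):                        # unroll the net through time
    h, y = rnn_step(h, rng.standard_normal(Xdim), A, B, C)
    print(t, y)
```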
Handwriting Generation using a RNN
Some training examples
Handwriting Generation using a RNN
Some generated examples. The top line is real handwriting, for comparison. See Alex Graves's work
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reasons research in deep learning has exploded
Much greater compute power (GPU)
Much larger datasets
AutoDiff
What is AutoDiff?
AutoDiff takes a function f(x) and returns an exact value (up to machine accuracy) for the gradient

$$g_i(x) \equiv \left.\frac{\partial f}{\partial x_i}\right|_{x}$$
Note that this is not the same as a numerical approximation (such as central differences) for the gradient
One can show that, if done efficiently, one can always calculate the gradient in less than 5 times the time it takes to compute f(x)
Reverse Differentiation

A useful graphical representation is that the total derivative of f with respect to x is given by the sum over all path values from x to f, where each path value is the product of the partial derivatives of the functions on the edges:

$$\frac{df}{dx} = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial g}\frac{dg}{dx}$$

[Figure: graph from x to f, directly with edge $\partial f/\partial x$, and via g with edges $dg/dx$ and $\partial f/\partial g$.]
Example

For $f(x) = x^2 + xgh$ where $g = x^2$ and $h = xg^2$:

[Figure: computation graph from x to f, directly and via g and h, with edge derivatives $2x + gh$ ($x \to f$), $2x$ ($x \to g$), $g^2$ ($x \to h$), $2gx$ ($g \to h$), $xh$ ($g \to f$) and $xg$ ($h \to f$).]

$$f'(x) = (2x + gh) + (g^2 \cdot xg) + (2x \cdot 2gx \cdot xg) + (2x \cdot xh) = 2x + 8x^7$$
Reverse Differentiation

Consider

$$f(x_1, x_2) = \cos(\sin(x_1 x_2))$$

We can represent this computationally using an Abstract Syntax Tree (AST):

[Figure: AST with leaves $x_1, x_2$ feeding node $f_1$, which feeds $f_2$, which feeds $f_3$.]

$$f_1(x_1, x_2) = x_1 x_2, \qquad f_2(x) = \sin(x), \qquad f_3(x) = \cos(x)$$

Given values for $x_1, x_2$, we first run forwards through the tree so that we can associate each node with an actual function value
Reverse Differentiation

$$\frac{df_3}{dx_1} = \frac{\partial f_3}{\partial f_2}\frac{df_2}{dx_1} = \underbrace{\frac{\partial f_3}{\partial f_2}\frac{df_2}{df_1}}_{df_3/df_1}\frac{df_1}{dx_1}$$

Similarly,

$$\frac{df_3}{dx_2} = \underbrace{\frac{\partial f_3}{\partial f_2}\frac{df_2}{df_1}}_{df_3/df_1}\frac{df_1}{dx_2}$$

The two derivatives share the same computation branch, and we want to exploit this
Reverse Differentiation

The node derivatives are

$$\frac{\partial f_1}{\partial x_1} = x_2, \quad \frac{\partial f_1}{\partial x_2} = x_1, \quad \frac{\partial f_2}{\partial f_1} = \cos(f_1), \quad \frac{\partial f_3}{\partial f_2} = -\sin(f_2)$$

1. Find the reverse ancestral (backwards) schedule of nodes $(f_3, f_2, f_1, x_1, x_2)$
2. Start with the first node $n_1$ in the reverse schedule and define $t_{n_1} = 1$
3. For the next node n in the reverse schedule, find the child nodes $\mathrm{ch}(n)$. Then define
$$t_n = \sum_{c \in \mathrm{ch}(n)} \frac{\partial f_c}{\partial f_n} t_c$$
4. The total derivatives of f with respect to the root nodes of the tree (here $x_1$ and $x_2$) are given by the values of t at those nodes
This is a general procedure that can be used to automatically define a subroutine to efficiently compute the gradient. It is efficient because information is collected at nodes in the tree and split between parents only when required
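A minimal sketch of the reverse schedule applied by hand to the AST above, checked against central differences (the numerical inputs are arbitrary):

```python
import numpy as np

def reverse_diff(x1, x2):
    # forward pass: associate each node of f(x1, x2) = cos(sin(x1 * x2)) with a value
    f1 = x1 * x2
    f2 = np.sin(f1)
    f3 = np.cos(f2)
    # backward pass: t_n = sum over children c of (df_c/df_n) * t_c
    t_f3 = 1.0
    t_f2 = -np.sin(f2) * t_f3    # df3/df2
    t_f1 = np.cos(f1) * t_f2     # df2/df1; this branch is shared by both derivatives
    t_x1 = x2 * t_f1             # df1/dx1
    t_x2 = x1 * t_f1             # df1/dx2
    return f3, (t_x1, t_x2)

f, grad = reverse_diff(0.3, 0.7)
print(f, grad)
eps = 1e-6                       # compare with a central-difference approximation
print((reverse_diff(0.3 + eps, 0.7)[0] - reverse_diff(0.3 - eps, 0.7)[0]) / (2 * eps))
```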
Limitations of forward reasoning
World Representation
Recognising patterns (perceptron style) is only one form of intelligence
Solving chess problems is another, and requires complex reasoning using some form of internal model
The world is noisy and information may be conflicting
Recognised that new approaches are required
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Limitations of forward reasoning
World Representation
Models help us to fantasise about the world
Models
Models
Burglar Problem
Creaks and Bumps
Creak Bump
Burglar Model
[Figure: belief network; burglar positions $pos_1, \dots, pos_4$ with emitted sounds $snd_1, \dots, snd_4$.]
pos – position in kitchen; snd – sound
Finding the Burglar

[Figure, shown over three slides: the kitchen grid at successive timesteps, with the observed creaks and bumps and the inferred distribution over the burglar's position.]
Stubby Fingers
Stubby Fingers
[Figure: belief network; intended keys $int_1, \dots, int_4$ with hit keys $hit_1, \dots, hit_4$.]
int – intended key; hit – hit key
Stubby Fingers: errors

[Figure: error matrix $p(hit|int)$ over the keys a–z; values range roughly from 0.05 to 0.55.]
Stubby Fingers: language

[Figure: letter transition matrix $p(int_t|int_{t-1})$ over the keys a–z; values range from 0 to 0.9.]
Stubby Fingers
Given the typed sequence 'cwsykcak', what is the most likely word that this corresponds to?
List the 200 most likely hidden sequences
Discard those that are not in a standard English dictionary
Take the most likely proper English word as the intended typed word
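As a simplification of this N-best procedure, a minimal sketch that scores candidate dictionary words directly (the per-key error model, its parameter and the tiny dictionary are placeholder assumptions; a real run would use the error and language matrices above and a full English dictionary):

```python
import numpy as np

def p_emit(intended, hit, p_correct=0.7):
    # placeholder error model: the intended key is hit with probability p_correct,
    # otherwise any one of the 25 other keys uniformly
    return p_correct if hit == intended else (1 - p_correct) / 25

def log_score(word, typed):
    # log p(hit_{1:T} | int_{1:T} = word), assuming a uniform prior over candidates
    return sum(np.log(p_emit(a, b)) for a, b in zip(word, typed))

typed = "cwsykcak"
dictionary = ["carrycot", "casebook", "cashdesk"]   # placeholder English dictionary
candidates = [w for w in dictionary if len(w) == len(typed)]
print(max(candidates, key=lambda w: log_score(w, typed)))
```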
Speech Recognition: raw signal

[Figure: raw audio waveform; amplitude against time in seconds.]

'neural' representation

[Figure: time–frequency ('neural') representation of the same signal.]
Speech Recognition
[Figure: belief network; phonemes $pho_1, \dots, pho_4$ with audio signals $aud_1, \dots, aud_4$.]
pho – phoneme (letter); aud – audio signal (neural representation)
Medical Diagnosis
[Figure: belief network; diseases (tumour, flu, meningitis) as parents of findings (headache, fever, appetite, x-ray).]
Combine known medical knowledge with patient-specific information
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability?
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems such as the Ising Model (1920) and in AI applications such as the HMM (Baum 1966, Stratonovich 1960)
The need for structure
We often want to make a probabilistic description of many objects (electron spins, neurons, customers, etc.)
Typically the representational and computational cost of probabilistic models grows exponentially with the number of objects represented
Without introducing strong structural limitations about how these objects can interact, probability is a non-starter
For this reason, computationally 'simpler' alternatives (such as fuzzy logic) were introduced to try to avoid some of these difficulties – however, these are typically frowned upon by purists
Graphical Models
We can use graphs to represent how objects can probabilistically interact with each other
Graphical Models are then a marriage between Graph and Probability theory
Many of the quantities that we would like to compute in a probability distribution can then be related to operations on the graph
The computational complexity of operations can often be related to the structure of the graph
Graphical Models are now used as a standard framework in Engineering, Statistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty and are therefore widely applicable
Uses in Industry
Microsoft: used to estimate the skill distribution of players in online games (the world's largest graphical model)
Hospitals use Belief Nets to encode knowledge about diseases and symptoms to aid medical diagnosis
Google, Microsoft, Facebook: used in many places, including advertising, video game prediction and speech recognition
Used to estimate the inherent desirability of products in consumer retail
Microsoft and others: attempt to go beyond simple A/B testing by using Graphical Models to model the whole company–user relationship
Conditional Probability and Bayes' Rule

The probability of event x conditioned on knowing event y (or, more shortly, the probability of x given y) is defined as

$$p(x|y) \equiv \frac{p(x, y)}{p(y)} = \frac{p(y|x)p(x)}{p(y)} \quad \text{(Bayes' rule)}$$
Throwing darts

$$p(\text{region 5}\,|\,\text{not region 20}) = \frac{p(\text{region 5},\, \text{not region 20})}{p(\text{not region 20})} = \frac{p(\text{region 5})}{p(\text{not region 20})} = \frac{1/20}{19/20} = \frac{1}{19}$$
Interpretation

p(A = a|B = b) should not be interpreted as 'Given the event B = b has occurred, p(A = a|B = b) is the probability of the event A = a occurring'. The correct interpretation should be 'p(A = a|B = b) is the probability of A being in state a under the constraint that B is in state b'
Battleships
Assume there are 2 ships, 1 vertical (ship 1) and 1 horizontal (ship 2), of 5 pixels each
They can be placed anywhere on the $10 \times 10$ grid, but cannot overlap
Let $s_1$ be the origin of ship 1 and $s_2$ the origin of ship 2
Data D is a collection of query 'hit' or 'miss' responses
$$p(s_1, s_2|D) = \frac{p(D|s_1, s_2)p(s_1, s_2)}{p(D)}$$

Let X be the matrix of pixel occupancy:

$$p(X|D) = \sum_{s_1, s_2} p(X, s_1, s_2|D) = \sum_{s_1, s_2} p(X|s_1, s_2)p(s_1, s_2|D)$$

demo: demoBattleships.m
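A minimal Python sketch of this posterior by brute-force enumeration (a stand-in for the demoBattleships.m demo; the query data D below is made up):

```python
import numpy as np
from itertools import product

G, L = 10, 5                     # grid size and ship length

def cells(origin, vertical):
    r, c = origin
    return [(r + i, c) if vertical else (r, c + i) for i in range(L)]

def valid(s1, s2):
    c1, c2 = cells(s1, True), cells(s2, False)
    inside = all(0 <= r < G and 0 <= c < G for r, c in c1 + c2)
    return inside and not set(c1) & set(c2)   # on the grid and non-overlapping

D = [((4, 4), True), ((0, 0), False)]         # made-up 'hit'/'miss' query responses

def consistent(s1, s2):
    occupied = set(cells(s1, True)) | set(cells(s2, False))
    return all((q in occupied) == hit for q, hit in D)

# uniform prior over valid placements; posterior p(s1, s2 | D) by enumeration
post = {(s1, s2): 1.0
        for s1, s2 in product(product(range(G), repeat=2), repeat=2)
        if valid(s1, s2) and consistent(s1, s2)}
Z = sum(post.values())

X = np.zeros((G, G))                          # marginal occupancy p(X | D)
for (s1, s2), w in post.items():
    for r, c in set(cells(s1, True)) | set(cells(s2, False)):
        X[r, c] += w / Z
print(np.round(X, 2))
```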
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated the conditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditional probabilities:

$$p(A, B, C, D, E) = p(A)p(B)p(C|A, B)p(D|C)p(E|B, C)$$

[Figure: DAG with roots A and B; C a child of A and B; D a child of C; E a child of B and C.]
Example – Part I

Sally's burglar Alarm is sounding. Has she been Burgled, or was the alarm triggered by an Earthquake? She turns the car Radio on for news of earthquakes
Choosing an ordering

Without loss of generality, we can write

$$p(A, R, E, B) = p(A|R, E, B)p(R, E, B) = p(A|R, E, B)p(R|E, B)p(E, B) = p(A|R, E, B)p(R|E, B)p(E|B)p(B)$$

Assumptions:

The alarm is not directly influenced by any report on the radio: $p(A|R, E, B) = p(A|E, B)$
The radio broadcast is not directly influenced by the burglar variable: $p(R|E, B) = p(R|E)$
Burglaries don't directly 'cause' earthquakes: $p(E|B) = p(E)$

Therefore

$$p(A, R, E, B) = p(A|E, B)p(R|E)p(E)p(B)$$
Example – Part II: Specifying the Tables

[Figure: DAG with B and E parents of A, and E parent of R.]

p(A|B, E):

Alarm = 1   Burglar   Earthquake
0.9999      1         1
0.99        1         0
0.99        0         1
0.0001      0         0

p(R|E):

Radio = 1   Earthquake
1           1
0           0

The remaining tables are p(B = 1) = 0.01 and p(E = 1) = 0.000001. The tables and graphical structure fully specify the distribution
Example Part III: Inference

Initial Evidence: The alarm is sounding

$$p(B = 1|A = 1) = \frac{\sum_{E,R} p(B = 1, E, A = 1, R)}{\sum_{B,E,R} p(B, E, A = 1, R)} = \frac{\sum_{E,R} p(A = 1|B = 1, E)p(B = 1)p(E)p(R|E)}{\sum_{B,E,R} p(A = 1|B, E)p(B)p(E)p(R|E)} \approx 0.99$$

Additional Evidence: The radio broadcasts an earthquake warning

A similar calculation gives $p(B = 1|A = 1, R = 1) \approx 0.01$

Initially, because the alarm sounds, Sally thinks that she's been burgled. However, this probability drops dramatically when she hears that there has been an earthquake
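A minimal sketch that reproduces these numbers by brute-force enumeration over the tables above:

```python
from itertools import product

pB = {1: 0.01, 0: 0.99}
pE = {1: 0.000001, 0: 0.999999}
pA1 = {(1, 1): 0.9999, (1, 0): 0.99, (0, 1): 0.99, (0, 0): 0.0001}  # p(A=1 | B, E)
pR1 = {1: 1.0, 0: 0.0}                                              # p(R=1 | E)

def posterior_burglar(r=None):
    # p(B=1 | A=1) if r is None, else p(B=1 | A=1, R=r), by summing the joint
    num = den = 0.0
    for b, e in product((0, 1), repeat=2):
        for rr in ((0, 1) if r is None else (r,)):
            joint = pB[b] * pE[e] * pA1[b, e] * (pR1[e] if rr == 1 else 1 - pR1[e])
            den += joint
            if b == 1:
                num += joint
    return num / den

print(posterior_burglar())      # ~0.99: the alarm alone suggests a burglary
print(posterior_burglar(r=1))   # ~0.01: the earthquake report explains the alarm away
```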
Markov Models
For timeseries data $v_1, \dots, v_T$ we need a model $p(v_{1:T})$. For causal consistency it is meaningful to consider the decomposition

$$p(v_{1:T}) = \prod_{t=1}^{T} p(v_t|v_{1:t-1})$$

with the convention $p(v_t|v_{1:t-1}) = p(v_1)$ for $t = 1$.

[Figure: cascade belief network on $v_1, \dots, v_4$.]
Independence assumptions

It is often natural to assume that the influence of the immediate past is more relevant than the remote past, and in Markov models only a limited number of previous observations are required to predict the future
Markov Chain
Only the recent past is relevant
$$p(v_t|v_1, \dots, v_{t-1}) = p(v_t|v_{t-L}, \dots, v_{t-1})$$

where $L \ge 1$ is the order of the Markov chain:

$$p(v_{1:T}) = p(v_1)p(v_2|v_1)p(v_3|v_2) \cdots p(v_T|v_{T-1})$$

For a stationary Markov chain the transitions $p(v_t = s'|v_{t-1} = s) = f(s', s)$ are time-independent ('homogeneous')
[Figure: (a) first-order Markov chain; (b) second-order Markov chain.]
Markov Chains
[Figure: first-order Markov chain $v_1 \to v_2 \to v_3 \to v_4$.]

$$p(v_1, \dots, v_T) = \underbrace{p(v_1)}_{\text{initial}} \prod_{t=2}^{T} \underbrace{p(v_t|v_{t-1})}_{\text{transition}}$$

State transition diagram: nodes represent states of the variable v, and arcs non-zero elements of the transition $p(v_t|v_{t-1})$

[Figure: state transition diagram on states 1–9.]
Most probable and shortest paths
[Figure: state transition diagram on states 1–9.]

The shortest (unweighted) path from state 1 to state 7 is 1−2−7

The most probable path from state 1 to state 7 is 1−8−9−7 (assuming uniform transition probabilities). The latter path is longer but more probable, since for the path 1−2−7 the probability of exiting state 2 into state 7 is 1/5
Equilibrium distribution
It is interesting to know how the marginal $p(x_t)$ evolves through time:

$$p(x_t = i) = \sum_j \underbrace{p(x_t = i|x_{t-1} = j)}_{M_{ij}} p(x_{t-1} = j)$$

$p(x_t = i)$ is the frequency that we visit state i at time t, given we started from $p(x_1)$ and randomly drew samples from the transition $p(x_\tau|x_{\tau-1})$. As we repeatedly sample a new state from the chain, the distribution at time t for an initial distribution $p_1(i)$ is

$$p_t = M^{t-1} p_1$$

If, for $t \to \infty$, $p_\infty$ is independent of the initial distribution $p_1$, then $p_\infty$ is called the equilibrium distribution of the chain:

$$p_\infty = M p_\infty$$

The equilibrium distribution is proportional to the eigenvector with unit eigenvalue of the transition matrix
PageRank
Define the matrix

$$A_{ij} = \begin{cases} 1 & \text{if website } j \text{ has a hyperlink to website } i \\ 0 & \text{otherwise} \end{cases}$$

From this we can define a Markov transition matrix with elements

$$M_{ij} = \frac{A_{ij}}{\sum_{i'} A_{i'j}}$$

If we jump from website to website, the equilibrium distribution component $p_\infty(i)$ is the relative number of times we will visit website i. This has a natural interpretation as the 'importance' of website i.

For each website i a list of words associated with that website is collected. After doing this for all websites, one can make an 'inverse' list of which websites contain word w. When a user searches for word w, the list of websites that contain that word is then returned, ranked according to the importance of the site.
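A minimal sketch of computing the equilibrium distribution by repeatedly applying M (power iteration), on a made-up four-site link matrix:

```python
import numpy as np

# A[i, j] = 1 if website j has a hyperlink to website i (a made-up web of 4 sites)
A = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
M = A / A.sum(axis=0)            # M_ij = A_ij / sum_i' A_i'j (column-normalised)

p = np.full(4, 0.25)             # any initial distribution p_1
for _ in range(200):             # p_t = M^{t-1} p_1 approaches p_inf = M p_inf
    p = M @ p
print(p)                         # relative 'importance' of each site
```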
Hidden Markov Models
The HMM defines a Markov chain on hidden (or 'latent') variables $h_{1:T}$. The observed (or 'visible') variables are dependent on the hidden variables through an emission $p(v_t|h_t)$. This defines a joint distribution

$$p(h_{1:T}, v_{1:T}) = p(v_1|h_1)p(h_1) \prod_{t=2}^{T} p(v_t|h_t)p(h_t|h_{t-1})$$

For a stationary HMM the transition $p(h_t|h_{t-1})$ and emission $p(v_t|h_t)$ distributions are constant through time.

[Figure: a first-order hidden Markov model with 'hidden' variables $\mathrm{dom}(h_t) = \{1, \dots, H\}$, $t = 1, \dots, T$. The 'visible' variables $v_t$ can be either discrete or continuous.]
The classical inference problems
Filtering (inferring the present): $p(h_t|v_{1:t})$
Prediction (inferring the future): $p(h_t|v_{1:s})$, $t > s$
Smoothing (inferring the past): $p(h_t|v_{1:u})$, $t < u$
Likelihood: $p(v_{1:T})$
Most likely path (Viterbi alignment): $\arg\max_{h_{1:T}} p(h_{1:T}|v_{1:T})$

For prediction, one is also often interested in $p(v_t|v_{1:s})$ for $t > s$
Inference in Hidden Markov Models
Belief network representation of an HMM:

[Figure: hidden chain $h_1 \to h_2 \to h_3 \to h_4$ with emissions $v_1, \dots, v_4$.]

Filtering, Smoothing and Viterbi are all computationally efficient, scaling linearly with the length of the timeseries (but quadratically with the number of hidden states)
The algorithms are variants of 'message passing on factor graphs'
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing for approximate inference in multiply-connected graphs (e.g. low-density parity-check codes)
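A minimal sketch of one such efficient recursion, filtering via the forward algorithm, for a toy two-state HMM (the transition and emission tables are made up):

```python
import numpy as np

def filtering(v, p1, T, E):
    # p(h_t | v_{1:t}): T[i, j] = p(h_t=i | h_{t-1}=j), E[k, i] = p(v_t=k | h_t=i)
    alpha = p1 * E[v[0]]
    alpha /= alpha.sum()
    out = [alpha.copy()]
    for vt in v[1:]:
        alpha = E[vt] * (T @ alpha)   # predict with the transition, weight by the emission
        alpha /= alpha.sum()          # normalise to get the filtered posterior
        out.append(alpha.copy())
    return np.array(out)              # linear in T, quadratic in the number of states

T = np.array([[0.9, 0.2],
              [0.1, 0.8]])
E = np.array([[0.7, 0.1],             # p(v=0 | h=0), p(v=0 | h=1)
              [0.3, 0.9]])            # p(v=1 | h=0), p(v=1 | h=1)
print(filtering([0, 1, 1, 0], np.array([0.5, 0.5]), T, E))
```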
HMMs for speech recognition
$h_t$ is the phoneme at time t; $p(h_t|h_{t-1})$ – language model; $p(v_t|h_t)$ – speech signal model
Deep Nets and HMMs
[Figure: HMM belief network; hidden phonemes $h_1, \dots, h_4$ emitting observations $v_1, \dots, v_4$.]
Recently, companies including Google have made big advances in speech recognition
The breakthrough is to model $p(v_t|h_t)$ as a Gaussian whose mean is some function of the phoneme, $\mu(h_t; \theta)$
This function is a deep neural network, trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deep networks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
[Figure: generative belief network; latent variables $h_1, h_2$ as parents of visible variables $v_1, \dots, v_4$.]

It is natural to consider that objects (images, for example) can be constructed on the basis of a low dimensional representation
Note that this is a Graphical Model, not a Function
The latent variables h can be sampled from using p(h), and then an image sampled from p(v|h). One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v)) and parameter learning are intractable in these models
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational Inference

Consider a distribution

$$p(v|\theta) = \int_h p(v|h, \theta)p(h)$$

and that we wish to learn θ to maximise the probability this model generates observed data:

$$\log p(v|\theta) \ge -\int_h q(h|v, \phi) \log q(h|v, \phi) + \int_h q(h|v, \phi) \log p(v|h, \theta) + \text{const}$$

The idea is to choose a 'variational' distribution q(h|v, φ) such that we can either calculate the bound analytically or sample it efficiently
We then jointly maximise the bound with respect to φ and θ
We can parameterise p(v|h, θ) using a deep network
Very popular approach – see the 'variational autoencoder' and also attention mechanisms
Extension to semi-supervised method using $p(v) = \int_h \sum_c p(v|h, c)p(c)p(h)$
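A minimal sketch (entirely illustrative: a factorised Gaussian q, a standard normal prior, a Gaussian likelihood whose mean comes from a tiny one-layer network, and normalisation constants dropped throughout) of a Monte Carlo estimate of this bound:

```python
import numpy as np

rng = np.random.default_rng(0)

def decoder_mean(h, theta):
    # p(v | h, theta): Gaussian whose mean is a tiny 'deep' network of h
    W1, W2 = theta
    return W2 @ np.tanh(W1 @ h)

def elbo_estimate(v, phi, theta, n_samples=200):
    mu, log_sigma = phi                       # q(h | v, phi) = N(mu, diag(sigma^2))
    sigma = np.exp(log_sigma)
    total = 0.0
    for _ in range(n_samples):
        h = mu + sigma * rng.standard_normal(mu.shape)             # sample h ~ q
        log_p_v = -0.5 * np.sum((v - decoder_mean(h, theta))**2)   # log p(v|h,theta) + const
        log_p_h = -0.5 * np.sum(h**2)                              # log p(h) + const
        log_q = -0.5 * np.sum(((h - mu) / sigma)**2) - np.sum(log_sigma)
        total += log_p_v + log_p_h - log_q
    return total / n_samples                  # lower-bounds log p(v|theta), up to constants

Hdim, Vdim = 2, 4
theta = (0.1 * rng.standard_normal((3, Hdim)), 0.1 * rng.standard_normal((Vdim, 3)))
phi = (np.zeros(Hdim), np.zeros(Hdim))
print(elbo_estimate(rng.standard_normal(Vdim), phi, theta))
```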
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games?
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A, we need to decide which action to take for any state of W that will be best for our long-term goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deep generative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state-of-the-art results in Speech Recognition, Image Analysis and Game Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representation learning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
https://reinfer.io
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Intelligent Machinery
1948 Turing and Champernowne lsquopaper and pencilrsquo chess
Intelligent Machinery
1951 Prinz mate-in-two moves chess machine
1952 Strachey programs first computer draughts algorithm
Learning Machines
1951 Oettinger makes first program that lsquolearnsrsquo
1955 Samuel adds lsquolearningrsquo to his draughts algorithm
Logical Intelligence
1968 Rischrsquos algorithm for integration in calculus
1972 Prolog for general logical reasoning
1997 Deep Blue defeats Kasparov
Other forms of intelligence
But is this getting us to where wersquod like to beSelfridge-Shannon film clip
Speech Recognition
Visual Processing
Natural Language modelling
Planning and decision in uncertain environments
Perhaps a different approach would be useful
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Astonishing Hypothesis Crick
ldquoA personrsquos mental activities are entirely due to the behaviour of nervecells and the molecules that make them up and influence themrdquo
Neurons
Visual Pathway
Information Processing in Brains
Neurons
Re
al
Wo
rld
Layer 1 Layer 2 Highminuslevel
Concepts
Feature
Hierarchical Modular Binary Parallel Noisy
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Artificial Neuron (Perceptron)
weight 7
output neuron
neuron 1neuron 2neuron 3neuron 4
neuron 7neuron 6neuron 5
inputs
weight 1
Training an artificial neural network
Want to generalise to new images with high accuracy
Artificial Network
1957 Rosenblattrsquos perceptron
perceptron film clip
Connectionism
1960 Realised a perceptron can only solve simple tasks
1970 Decline in interest
1980 New computing power made training multilayer networks feasible
outputinputs
Each node (or lsquoneuronrsquo) computes a function of a weighted combination ofparental nodes hj = σ(
sumi wijhi)
Neural Networks and Deep LearningHistorical Problems with Neural Nets (1990s)
NNs are difficult to train (many local optima)
Particularly difficult to train a NN with a large number of layers (say largerthan around 10)
lsquoGradient Diffusion Problemrsquo ndash difficult to assign responsibility of errors toindividual lsquoneuronsrsquo
Machine Learning (up to 2006)
A large section of the machine learning community abandoned NNs
More principled and computationally better understood techniques (SVMsand related convex methods) replaced them
Bayesian AI (1990s onwards)
From mid 1990s there was a realisation that pattern recognition is notsufficient for all AI purposes
Uncertainty and reasoning are not naturally representable using standardfeed-forward nets
Explosion in more lsquosymbolicrsquo Bayesian AI
Deep Learning
NNs have resurged in interest in the last few years (Hinton Bengio )
Also called lsquodeep learningrsquo
Sense that very complex tasks (object recognition learning complex structurein data) requires going beyond simple (convex) statistical techniques
The brain uses hierarchical distributed processing and it is likely to be for agood reason
Many problems have a hierarchical structure images are made of partslanguage is hierarchical etc
Why now
New computing resources (GPU processing)
Availability of large amount of data means that we can train nets with manyparameters (1010)
Recent evidence suggests local optima are not particularly problematic
Autoencoder
y1 y2 y3 y4 y5
h1 h2 h3
h4 h5
y1 y2 y3 y4 y5
h6 h7 h8
The bottleneck forces the network to try to find a low dimensionalrepresentation of the data
Useful for unsupervised learning
Autoencoder on MNIST digits (Hinton 2006 Science)
Figure Reconstructions using H = 30 components From the Top Original imageAutoencoder1 Autoencoder2 PCA
60000 training images (28times 28 = 784 pixels)
Use a form of autoencoder to find a lower (30) dimensional representation
At the time the special layerwise training procedure was consideredfundamental to the success of this approach Now not deemed necessaryprovided we use a sensible initialisation
Google Cats
10 Million Youtube video frames (200x200 pixel images)
Use a specialised autoencoder with 9 layers (1 billion weights)
2000 computers + two weeks of computing
Examine units to see what images they most respond to
Google Autoencoder
From Nando De Freitas
Convolutional NNs
CNNs are particularly popular in image processing
Often the feature maps correspond (not to macro features such as bicycles)but micro features
For example in handwritten digit recognition they correspond to smallconstituent parts of the digits
These are used then to process the image into a representation that is betterfor recognition
NNs in NLP
Bag of Words
We have D words in a dictionary aardvark zorro so that we can relateeach word with its dictionary index
We can also think of this as a Euclidean embedding e
aardvarkrarr eaardvark =
100
zorrorarr ezorro =
001
Word Embeddings
Idea is to replace the Euclidean embeddings e with embeddings (vectors) vthat are learned
Objective is for example next word prediction accuracy
These are often called lsquoneural language modelsrsquo
NNs in NLP
Each word w in the dictionary has an associated embedding vector vwUsually around 200 dimensional vectors are used
Consider the sentence
the cat sat on the mat
and that we wish to predict the word on given the two preceding cat sat
and two succeeding words the mat
We can use a network that has inputs vcat vsat vthe vmat
The output of the network is a probability over all words in the dictionaryp(w| vinputs)We want p(w = on|vcatvsatvthevmat) to be high
The overall objective is then to learn all the word embeddings and networkparameters subject to predicting the word correctly based on the context
Word Embeddings
Given a word (France for example) we can find which words w have embeddingvectors closest to vFrance From Ronan Collabert (2011)
Word Embeddings
There appears to be a natural lsquogeometryrsquo to the embeddings For example thereare directions that correspond to gender
vwoman minus vman asymp vaunt minus vuncle
vwoman minus vman asymp vqueen minus vking
From Mikolov (2013)
Word Embeddings Analogies
Given a relationship France-Paris we get the lsquorelationshiprsquo embedding
v = vParis minus vFrance
Given Italy we can calculate vItaly + v and find the word in the dictionary whichhas closest embedding to this (it turns out to be Rome) From Mikolov (2013)
Word Embeddings Constrained Embeddings
We can learn embeddings for English words and embeddings for ChinesewordsHowever when we know that a Chinese and English word have a similarmeaning we add a constraint that the word embeddings vChineseWord andvEnglishWord should be closeWe have only a small amount of labelled lsquosimilarrsquo Chinese-English words(these are the green border boxes in the above they are standard translationsof the corresponding Chinese character)We can visualise in 2D (using t-SNE) the embedding vectors See Socher(2013)
Word Embeddings Constrained Embeddings
Recursive Nets and Embeddings
Stanford Sentiment Treebank Consists of parsed sentences with sentiment labels(minusminusminus 0+++) for each node (phrase) in the tree 215000 labelled phrases(obtained from three humans)
Recursive Nets and Embeddings
Idea is to recursively combine embeddings such that they accurately predictthe sentiment at each node
Recursive Nets and EmbeddingsTraining
We have a softmax classifier for each node in the tree to predict thesentiment of the phrase beneath this node in the tree
The weights of this classifier are shared across all nodes
At the leaf nodes at the bottom of the tree the inputs to the classifiers arethe word embeddings
The embeddings are combined by another network g with commonparameters which forms the input to the sentiment classifier
We then learn all the embeddings shared classifier parameters and sharedcombination parameters to maximise the classification accuracy
Prediction
For a new movie review the review is first parsed using a standard grammartree parser
This forms the tree which can be used to recursively form the sentiment classlabel for the review
Currently the best sentiment classifier Socher (2013)
Recursive Nets and Embeddingsotilde otilde
eth
eth
Icircplusmnsup1raquoreg
eth
Uumlplusmnfrac14sup1raquoreg
otilde
otilde
eth
middotshy
otilde
eth
plusmnsup2raquo
otilde
eth
plusmnordm
otilde
otilde
eth
notcedilraquo
otilde
otilde
eth
sup3plusmnshynot
otilde
frac12plusmnsup3degraquoacuteacutemiddotsup2sup1
eth
ordfiquestregmiddotiquestnotmiddotplusmnsup2shy
eth
eth
plusmnsup2
eth
eth
notcedilmiddotshy
eth
notcedilraquosup3raquo
eth
ograve
yen
eth
eth
Icircplusmnsup1raquoreg
eth
Uumlplusmnfrac14sup1raquoreg
yen
yen
eth
middotshy
yen
eth
plusmnsup2raquo
yen
eth
plusmnordm
yen
yen
eth
notcedilraquo
yen
yen
yen
acuteraquoiquestshynot
otilde
frac12plusmnsup3degraquoacuteacutemiddotsup2sup1
eth
ordfiquestregmiddotiquestnotmiddotplusmnsup2shy
eth
eth
plusmnsup2
eth
eth
notcedilmiddotshy
eth
notcedilraquosup3raquo
eth
ograve
otilde
eth
times
otilde
otilde
otilde
acutemiddotmicroraquofrac14
eth
eth
eth
raquoordfraquoregsect
eth
eth
shymiddotsup2sup1acuteraquo
eth
sup3middotsup2laquonotraquo
eth
eth
plusmnordm
eth
eth
notcedilmiddotshy
eth
eth
ograve
yen
eth
times
yen
yen
eth
eth
frac14middotfrac14
eth
sup2ugravenot
eth
eth
acutemiddotmicroraquo
eth
eth
eth
iquest
eth
eth
shymiddotsup2sup1acuteraquo
eth
sup3middotsup2laquonotraquo
eth
eth
plusmnordm
eth
eth
notcedilmiddotshy
eth
eth
ograve
yen
eth
timesnot
yen
yen
eth
eth
ugraveshy
eth
paralaquoshynot
yen
otilde
middotsup2frac12regraquofrac14middotfrac34acutesect
yen yen
frac14laquoacuteacute
eth
ograve
eth
eth
timesnot
eth
eth
eth
eth
eth
ugraveshy
otilde
yen
sup2plusmnnot
yen yen
frac14laquoacuteacute
eth
ograve
Uacutemiddotsup1laquoregraquo ccedilaelig IcircOgraveIgraveOgrave degregraquofrac14middotfrac12notmiddotplusmnsup2 plusmnordm degplusmnshymiddotnotmiddotordfraquo iquestsup2frac14 sup2raquosup1iquestnotmiddotordfraquo oslashfrac34plusmnnotnotplusmnsup3 regmiddotsup1cedilnotdivide shyraquosup2notraquosup2frac12raquoshy iquestsup2frac14 notcedilraquomiddotreg sup2raquosup1iquestnotmiddotplusmnsup2ograve
Recurrent Nets
x1 x2 x3
h1 h2 h3
y1 y2 y3
A A A
C C C
B B
RNNs are used in timeseries applications
The basic idea is that the hidden units at time ht (and possibly output yt)depend on the previous state of the network htminus1 xtminus1 ytminus1 for inputs xt andoutputs yt
In the above network I lsquounrolled the net through timersquo to give a standard NNdiagram
I omitted the potential links from xtminus1 ytminus1 to ht
Handwriting Generation using a RNN
Some training examples
Handwriting Generation using a RNN
Some generated examples Top line is real handwriting for comparison See AlexGraversquos work
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reasons research in deep learning has exploded
Much greater compute power (GPU)
Much larger datasets
AutoDiff
What is AutoDiff
AutoDiff takes a function f(x) and returns an exact value (up to machineaccuracy) for the gradient
gi(x) equivpart
partxif
∥∥∥∥x
Note that this is not the same as a numerical approximation (such as centraldifferences) for the gradient
One can show that if done efficiently one can always calculate the gradient inless than 5 times the time it takes to compute f(x)
Reverse DifferentiationA useful graphical representation is that the total derivative of f with respect to xis given by the sum over all path values from x to f where each path value is theproduct of the partial derivatives of the functions on the edges
df
dx=partf
partx+partf
partg
dg
dx
x
f
gpartfpartx
dgdx
partfpartg
Example
For f(x) = x2 + xgh where g =x2 and h = xg2
x
f
gh2x+ gh
2x
xh
2gx
xg
g2
f prime(x) = (2x+ gh) + (g2xg) + (2x2gxxg) + (2xxh) = 2x+ 8x7
Reverse DifferentiationConsider
f(x1 x2) = cos (sin(x1x2))
We can represent this computationally using an Abstract Syntax Tree (AST)
x1 x2
f1
f2
f3
f1(x1 x2) = x1x2
f2(x) = sin(x)
f3(x) = cos(x)
Given values for x1 x2 we first run forwards through the tree so that we canassociate each node with an actual function value
Reverse Differentiation
x1 x2
f1
f2
f3
df3dx1
=partf3partf2
df2dx1
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx1
Similarly
df3dx2
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx2
The two derivatives share the same computation branch andwe want to exploit this
Reverse Differentiation
x1 x2
f1
f2
f3
partf1partx1
= x2partf1partx2
= x1
partf2partf1
= cos(f1)
partf3partf2
= minus sin(f2)
1 Find the reverse ancestral (backwards) scheduleof nodes (f3 f2 f1 x1 x2)
2 Start with the first node n1 in the reverseschedule and define tn1 = 1
3 For the next node n in the reverse schedule findthe child nodes ch (n) Then define
tn =sum
cisinch(n)
partfcpartfn
tc
4 The total derivatives of f with respect to theroot nodes of the tree (here x1 and x2) are givenby the values of t at those nodes
This is a general procedure that can be used to automatically define a subroutineto efficiently compute the gradient It is efficient because information is collectedat nodes in the tree and split between parents only when required
Limitations of forward reasoning
World Representation
Recognising patterns (perceptron style) is only one form of intelligence
Solving chess problems is another and requires complex reasoning using someform of internal model
The world is noisy and information may be conflicting
Recognised that new approaches are required
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Limitations of forward reasoning
World Representation
Models help us to fantasise about the world
Models
Models
Burglar Problem
Creaks and Bumps
Creak Bump
Burglar Model
pos1 pos2 pos3 pos4
snd1 snd2 snd3 snd4
pos - position in kitchensnd ndash sound
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Stubby Fingers
Stubby Fingers
int1 int2 int3 int4
hit1 hit2 hit3 hit4
int - intended keyhit ndash hit key
Stubby Fingers errors
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz
005
01
015
02
025
03
035
04
045
05
055
Stubby Fingers language
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz 0
01
02
03
04
05
06
07
08
09
Stubby Fingers
Given the typed sequence cwsykcak what is the most likely word that thiscorresponds to
List the 200 most likely hidden sequences
Discard those that are not in a standard English dictionary
Take the most likely proper English word as the intended typed word
Speech Recognition raw signal
0 01 02 03 04 05 06 07 08 09minus02
minus015
minus01
minus005
0
005
01
015
02
025
03
Time
lsquoneuralrsquo representation
10 20 30 40 50 60 70 80
5
10
15
20
25
Speech Recognition
pho1 pho2 pho3 pho4
aud1 aud2 aud3 aud4
pho phoneme (letter)aud audio signal (neural representation)
Medical Diagnosis
tumour flu meningitis
headache fever appetite x-ray
Combine known medical knowledge with patient specific information
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems such as the IsingModel (1920) and in AI applications such as the HMM (Baum 1966Stratonovich 1960)
The need for structure
We often want to make a probabilistic description of many objects (electronspins neurons customers etc )
Typically the representational and computational cost of probabilistic modelsgrows exponentially with the number of objects represented
Without introducing strong structural limitations about how these objects caninteract probability is a non-starter
For this reason computationally lsquosimplerrsquo alternatives (such as fuzzy logic)were introduced to try to avoid some of these difficulties ndash however these aretypically frowed on by purists
Graphical Models
We can use graphs to represent how objects can probabilistically interact witheach other
Graphical Models and then a marriage between Graph and Probability theory
Many of the quantities that we would like to compute in a probabilitydistribution can then be related to operations on the graph
The computational complexity of operations can often be related to thestructure of the graph
Graphical Models are now used as a standard framework in EngineeringStatistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty and aretherefore widely applicable
Uses in Industry
Microsoft used to estimate the skill distribution of players in online games(the worlds largest graphical model)
Hospitals use Belief Nets to encode knowledge about diseases and symptomsto aid medical diagnosis
Google Microsoft Facebook used in many places including advertisingvideo game prediction speech recognition
Used to estimate inherent desirability of products in consumer retail
Microsoft and others Attempt to go beyond simple AB testing by usesGraphical Models to model the whole companyuser relationship
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or more shortly theprobability of x given y) is defined as
p(x|y) equiv p(x y)
p(y)=p(y|x)p(x)
p(y)(Bayesrsquo rule)
Throwing darts
p(region 5|not region 20) =p(region 5 not region 20)
p(not region 20)
=p(region 5)
p(not region 20)=
120
1920=
1
19
Interpretationp(A = a|B = b) should not be interpreted as lsquoGiven the event B = b has occurredp(A = a|B = b) is the probability of the event A = a occurringrsquo The correctinterpretation should be lsquop(A = a|B = b) is the probability of A being in state aunder the constraint that B is in state brsquo
Battleships
Assume there are 2 ships 1 vertical (ship 1) and 1 horizontal (ship 2) of 5pixels each
Can be placed anywhere on the 10times10 grid but cannot overlap
Let s1 is the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query lsquohitrsquo or lsquomissrsquo responses
p(s1 s2|D) =p(D|s1 s2)p(s1 s2)
p(D)Let X be the matrix of pixel occupancy
p(X|D) =sums1s2
p(X s1 s2|D) =sums1s2
p(X|s1 s2)p(s1 s2|D)
demoBattleshipsm
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition, Image Analysis, Game Playing.
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representation learning.
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
https://reinfer.io
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
Intelligent Machinery
1951 Prinz mate-in-two moves chess machine
1952 Strachey programs first computer draughts algorithm
Learning Machines
1951 Oettinger makes first program that lsquolearnsrsquo
1955 Samuel adds lsquolearningrsquo to his draughts algorithm
Logical Intelligence
1968 Rischrsquos algorithm for integration in calculus
1972 Prolog for general logical reasoning
1997 Deep Blue defeats Kasparov
Other forms of intelligence
But is this getting us to where wersquod like to beSelfridge-Shannon film clip
Speech Recognition
Visual Processing
Natural Language modelling
Planning and decision in uncertain environments
Perhaps a different approach would be useful
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Astonishing Hypothesis Crick
ldquoA personrsquos mental activities are entirely due to the behaviour of nervecells and the molecules that make them up and influence themrdquo
Neurons
Visual Pathway
Information Processing in Brains
Neurons
Re
al
Wo
rld
Layer 1 Layer 2 Highminuslevel
Concepts
Feature
Hierarchical Modular Binary Parallel Noisy
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Artificial Neuron (Perceptron)
weight 7
output neuron
neuron 1neuron 2neuron 3neuron 4
neuron 7neuron 6neuron 5
inputs
weight 1
Training an artificial neural network
Want to generalise to new images with high accuracy
Artificial Network
1957 Rosenblattrsquos perceptron
perceptron film clip
Connectionism
1960 Realised a perceptron can only solve simple tasks
1970 Decline in interest
1980 New computing power made training multilayer networks feasible
outputinputs
Each node (or lsquoneuronrsquo) computes a function of a weighted combination ofparental nodes hj = σ(
sumi wijhi)
Neural Networks and Deep LearningHistorical Problems with Neural Nets (1990s)
NNs are difficult to train (many local optima)
Particularly difficult to train a NN with a large number of layers (say largerthan around 10)
lsquoGradient Diffusion Problemrsquo ndash difficult to assign responsibility of errors toindividual lsquoneuronsrsquo
Machine Learning (up to 2006)
A large section of the machine learning community abandoned NNs
More principled and computationally better understood techniques (SVMsand related convex methods) replaced them
Bayesian AI (1990s onwards)
From mid 1990s there was a realisation that pattern recognition is notsufficient for all AI purposes
Uncertainty and reasoning are not naturally representable using standardfeed-forward nets
Explosion in more lsquosymbolicrsquo Bayesian AI
Deep Learning
NNs have resurged in interest in the last few years (Hinton Bengio )
Also called lsquodeep learningrsquo
Sense that very complex tasks (object recognition learning complex structurein data) requires going beyond simple (convex) statistical techniques
The brain uses hierarchical distributed processing and it is likely to be for agood reason
Many problems have a hierarchical structure images are made of partslanguage is hierarchical etc
Why now
New computing resources (GPU processing)
Availability of large amount of data means that we can train nets with manyparameters (1010)
Recent evidence suggests local optima are not particularly problematic
Autoencoder
y1 y2 y3 y4 y5
h1 h2 h3
h4 h5
y1 y2 y3 y4 y5
h6 h7 h8
The bottleneck forces the network to try to find a low dimensionalrepresentation of the data
Useful for unsupervised learning
Autoencoder on MNIST digits (Hinton 2006 Science)
Figure Reconstructions using H = 30 components From the Top Original imageAutoencoder1 Autoencoder2 PCA
60000 training images (28times 28 = 784 pixels)
Use a form of autoencoder to find a lower (30) dimensional representation
At the time the special layerwise training procedure was consideredfundamental to the success of this approach Now not deemed necessaryprovided we use a sensible initialisation
Google Cats
10 Million Youtube video frames (200x200 pixel images)
Use a specialised autoencoder with 9 layers (1 billion weights)
2000 computers + two weeks of computing
Examine units to see what images they most respond to
Google Autoencoder
From Nando De Freitas
Convolutional NNs
CNNs are particularly popular in image processing
Often the feature maps correspond (not to macro features such as bicycles)but micro features
For example in handwritten digit recognition they correspond to smallconstituent parts of the digits
These are used then to process the image into a representation that is betterfor recognition
NNs in NLP
Bag of Words
We have D words in a dictionary aardvark zorro so that we can relateeach word with its dictionary index
We can also think of this as a Euclidean embedding e
aardvarkrarr eaardvark =
100
zorrorarr ezorro =
001
Word Embeddings
Idea is to replace the Euclidean embeddings e with embeddings (vectors) vthat are learned
Objective is for example next word prediction accuracy
These are often called lsquoneural language modelsrsquo
NNs in NLP
Each word w in the dictionary has an associated embedding vector vwUsually around 200 dimensional vectors are used
Consider the sentence
the cat sat on the mat
and that we wish to predict the word on given the two preceding cat sat
and two succeeding words the mat
We can use a network that has inputs vcat vsat vthe vmat
The output of the network is a probability over all words in the dictionaryp(w| vinputs)We want p(w = on|vcatvsatvthevmat) to be high
The overall objective is then to learn all the word embeddings and networkparameters subject to predicting the word correctly based on the context
Word Embeddings
Given a word (France for example) we can find which words w have embeddingvectors closest to vFrance From Ronan Collabert (2011)
Word Embeddings
There appears to be a natural lsquogeometryrsquo to the embeddings For example thereare directions that correspond to gender
vwoman minus vman asymp vaunt minus vuncle
vwoman minus vman asymp vqueen minus vking
From Mikolov (2013)
Word Embeddings Analogies
Given a relationship France-Paris we get the lsquorelationshiprsquo embedding
v = vParis minus vFrance
Given Italy we can calculate vItaly + v and find the word in the dictionary whichhas closest embedding to this (it turns out to be Rome) From Mikolov (2013)
Word Embeddings Constrained Embeddings
We can learn embeddings for English words and embeddings for ChinesewordsHowever when we know that a Chinese and English word have a similarmeaning we add a constraint that the word embeddings vChineseWord andvEnglishWord should be closeWe have only a small amount of labelled lsquosimilarrsquo Chinese-English words(these are the green border boxes in the above they are standard translationsof the corresponding Chinese character)We can visualise in 2D (using t-SNE) the embedding vectors See Socher(2013)
Word Embeddings Constrained Embeddings
Recursive Nets and Embeddings
Stanford Sentiment Treebank Consists of parsed sentences with sentiment labels(minusminusminus 0+++) for each node (phrase) in the tree 215000 labelled phrases(obtained from three humans)
Recursive Nets and Embeddings
Idea is to recursively combine embeddings such that they accurately predictthe sentiment at each node
Recursive Nets and EmbeddingsTraining
We have a softmax classifier for each node in the tree to predict thesentiment of the phrase beneath this node in the tree
The weights of this classifier are shared across all nodes
At the leaf nodes at the bottom of the tree the inputs to the classifiers arethe word embeddings
The embeddings are combined by another network g with commonparameters which forms the input to the sentiment classifier
We then learn all the embeddings shared classifier parameters and sharedcombination parameters to maximise the classification accuracy
Prediction
For a new movie review the review is first parsed using a standard grammartree parser
This forms the tree which can be used to recursively form the sentiment classlabel for the review
Currently the best sentiment classifier Socher (2013)
Recursive Nets and Embeddingsotilde otilde
eth
eth
Icircplusmnsup1raquoreg
eth
Uumlplusmnfrac14sup1raquoreg
otilde
otilde
eth
middotshy
otilde
eth
plusmnsup2raquo
otilde
eth
plusmnordm
otilde
otilde
eth
notcedilraquo
otilde
otilde
eth
sup3plusmnshynot
otilde
frac12plusmnsup3degraquoacuteacutemiddotsup2sup1
eth
ordfiquestregmiddotiquestnotmiddotplusmnsup2shy
eth
eth
plusmnsup2
eth
eth
notcedilmiddotshy
eth
notcedilraquosup3raquo
eth
ograve
yen
eth
eth
Icircplusmnsup1raquoreg
eth
Uumlplusmnfrac14sup1raquoreg
yen
yen
eth
middotshy
yen
eth
plusmnsup2raquo
yen
eth
plusmnordm
yen
yen
eth
notcedilraquo
yen
yen
yen
acuteraquoiquestshynot
otilde
frac12plusmnsup3degraquoacuteacutemiddotsup2sup1
eth
ordfiquestregmiddotiquestnotmiddotplusmnsup2shy
eth
eth
plusmnsup2
eth
eth
notcedilmiddotshy
eth
notcedilraquosup3raquo
eth
ograve
otilde
eth
times
otilde
otilde
otilde
acutemiddotmicroraquofrac14
eth
eth
eth
raquoordfraquoregsect
eth
eth
shymiddotsup2sup1acuteraquo
eth
sup3middotsup2laquonotraquo
eth
eth
plusmnordm
eth
eth
notcedilmiddotshy
eth
eth
ograve
yen
eth
times
yen
yen
eth
eth
frac14middotfrac14
eth
sup2ugravenot
eth
eth
acutemiddotmicroraquo
eth
eth
eth
iquest
eth
eth
shymiddotsup2sup1acuteraquo
eth
sup3middotsup2laquonotraquo
eth
eth
plusmnordm
eth
eth
notcedilmiddotshy
eth
eth
ograve
yen
eth
timesnot
yen
yen
eth
eth
ugraveshy
eth
paralaquoshynot
yen
otilde
middotsup2frac12regraquofrac14middotfrac34acutesect
yen yen
frac14laquoacuteacute
eth
ograve
eth
eth
timesnot
eth
eth
eth
eth
eth
ugraveshy
otilde
yen
sup2plusmnnot
yen yen
frac14laquoacuteacute
eth
ograve
Uacutemiddotsup1laquoregraquo ccedilaelig IcircOgraveIgraveOgrave degregraquofrac14middotfrac12notmiddotplusmnsup2 plusmnordm degplusmnshymiddotnotmiddotordfraquo iquestsup2frac14 sup2raquosup1iquestnotmiddotordfraquo oslashfrac34plusmnnotnotplusmnsup3 regmiddotsup1cedilnotdivide shyraquosup2notraquosup2frac12raquoshy iquestsup2frac14 notcedilraquomiddotreg sup2raquosup1iquestnotmiddotplusmnsup2ograve
Recurrent Nets
x1 x2 x3
h1 h2 h3
y1 y2 y3
A A A
C C C
B B
RNNs are used in timeseries applications
The basic idea is that the hidden units at time ht (and possibly output yt)depend on the previous state of the network htminus1 xtminus1 ytminus1 for inputs xt andoutputs yt
In the above network I lsquounrolled the net through timersquo to give a standard NNdiagram
I omitted the potential links from xtminus1 ytminus1 to ht
Handwriting Generation using a RNN
Some training examples
Handwriting Generation using a RNN
Some generated examples Top line is real handwriting for comparison See AlexGraversquos work
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reasons research in deep learning has exploded
Much greater compute power (GPU)
Much larger datasets
AutoDiff
What is AutoDiff
AutoDiff takes a function f(x) and returns an exact value (up to machineaccuracy) for the gradient
gi(x) equivpart
partxif
∥∥∥∥x
Note that this is not the same as a numerical approximation (such as centraldifferences) for the gradient
One can show that if done efficiently one can always calculate the gradient inless than 5 times the time it takes to compute f(x)
Reverse DifferentiationA useful graphical representation is that the total derivative of f with respect to xis given by the sum over all path values from x to f where each path value is theproduct of the partial derivatives of the functions on the edges
df
dx=partf
partx+partf
partg
dg
dx
x
f
gpartfpartx
dgdx
partfpartg
Example
For f(x) = x2 + xgh where g =x2 and h = xg2
x
f
gh2x+ gh
2x
xh
2gx
xg
g2
f prime(x) = (2x+ gh) + (g2xg) + (2x2gxxg) + (2xxh) = 2x+ 8x7
Reverse DifferentiationConsider
f(x1 x2) = cos (sin(x1x2))
We can represent this computationally using an Abstract Syntax Tree (AST)
x1 x2
f1
f2
f3
f1(x1 x2) = x1x2
f2(x) = sin(x)
f3(x) = cos(x)
Given values for x1 x2 we first run forwards through the tree so that we canassociate each node with an actual function value
Reverse Differentiation
x1 x2
f1
f2
f3
df3dx1
=partf3partf2
df2dx1
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx1
Similarly
df3dx2
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx2
The two derivatives share the same computation branch andwe want to exploit this
Reverse Differentiation
x1 x2
f1
f2
f3
partf1partx1
= x2partf1partx2
= x1
partf2partf1
= cos(f1)
partf3partf2
= minus sin(f2)
1 Find the reverse ancestral (backwards) scheduleof nodes (f3 f2 f1 x1 x2)
2 Start with the first node n1 in the reverseschedule and define tn1 = 1
3 For the next node n in the reverse schedule findthe child nodes ch (n) Then define
tn =sum
cisinch(n)
partfcpartfn
tc
4 The total derivatives of f with respect to theroot nodes of the tree (here x1 and x2) are givenby the values of t at those nodes
This is a general procedure that can be used to automatically define a subroutineto efficiently compute the gradient It is efficient because information is collectedat nodes in the tree and split between parents only when required
Limitations of forward reasoning
World Representation
Recognising patterns (perceptron style) is only one form of intelligence
Solving chess problems is another and requires complex reasoning using someform of internal model
The world is noisy and information may be conflicting
Recognised that new approaches are required
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Limitations of forward reasoning
World Representation
Models help us to fantasise about the world
Models
Models
Burglar Problem
Creaks and Bumps
Creak Bump
Burglar Model
pos1 pos2 pos3 pos4
snd1 snd2 snd3 snd4
pos - position in kitchensnd ndash sound
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Stubby Fingers
Stubby Fingers
int1 int2 int3 int4
hit1 hit2 hit3 hit4
int - intended keyhit ndash hit key
Stubby Fingers errors
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz
005
01
015
02
025
03
035
04
045
05
055
Stubby Fingers language
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz 0
01
02
03
04
05
06
07
08
09
Stubby Fingers
Given the typed sequence cwsykcak what is the most likely word that thiscorresponds to
List the 200 most likely hidden sequences
Discard those that are not in a standard English dictionary
Take the most likely proper English word as the intended typed word
Speech Recognition raw signal
0 01 02 03 04 05 06 07 08 09minus02
minus015
minus01
minus005
0
005
01
015
02
025
03
Time
lsquoneuralrsquo representation
10 20 30 40 50 60 70 80
5
10
15
20
25
Speech Recognition
pho1 pho2 pho3 pho4
aud1 aud2 aud3 aud4
pho phoneme (letter)aud audio signal (neural representation)
Medical Diagnosis
tumour flu meningitis
headache fever appetite x-ray
Combine known medical knowledge with patient specific information
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems such as the IsingModel (1920) and in AI applications such as the HMM (Baum 1966Stratonovich 1960)
The need for structure
We often want to make a probabilistic description of many objects (electronspins neurons customers etc )
Typically the representational and computational cost of probabilistic modelsgrows exponentially with the number of objects represented
Without introducing strong structural limitations about how these objects caninteract probability is a non-starter
For this reason computationally lsquosimplerrsquo alternatives (such as fuzzy logic)were introduced to try to avoid some of these difficulties ndash however these aretypically frowed on by purists
Graphical Models
We can use graphs to represent how objects can probabilistically interact witheach other
Graphical Models and then a marriage between Graph and Probability theory
Many of the quantities that we would like to compute in a probabilitydistribution can then be related to operations on the graph
The computational complexity of operations can often be related to thestructure of the graph
Graphical Models are now used as a standard framework in EngineeringStatistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty and aretherefore widely applicable
Uses in Industry
Microsoft used to estimate the skill distribution of players in online games(the worlds largest graphical model)
Hospitals use Belief Nets to encode knowledge about diseases and symptomsto aid medical diagnosis
Google Microsoft Facebook used in many places including advertisingvideo game prediction speech recognition
Used to estimate inherent desirability of products in consumer retail
Microsoft and others Attempt to go beyond simple AB testing by usesGraphical Models to model the whole companyuser relationship
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or more shortly theprobability of x given y) is defined as
p(x|y) equiv p(x y)
p(y)=p(y|x)p(x)
p(y)(Bayesrsquo rule)
Throwing darts
p(region 5|not region 20) =p(region 5 not region 20)
p(not region 20)
=p(region 5)
p(not region 20)=
120
1920=
1
19
Interpretationp(A = a|B = b) should not be interpreted as lsquoGiven the event B = b has occurredp(A = a|B = b) is the probability of the event A = a occurringrsquo The correctinterpretation should be lsquop(A = a|B = b) is the probability of A being in state aunder the constraint that B is in state brsquo
Battleships
Assume there are 2 ships 1 vertical (ship 1) and 1 horizontal (ship 2) of 5pixels each
Can be placed anywhere on the 10times10 grid but cannot overlap
Let s1 is the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query lsquohitrsquo or lsquomissrsquo responses
p(s1 s2|D) =p(D|s1 s2)p(s1 s2)
p(D)Let X be the matrix of pixel occupancy
p(X|D) =sums1s2
p(X s1 s2|D) =sums1s2
p(X|s1 s2)p(s1 s2|D)
demoBattleshipsm
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state-of-the-art results in Speech Recognition, Image Analysis, Game Playing.
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representation learning.
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company, reinfer: https://reinfer.io
Logical Intelligence
1968 Rischrsquos algorithm for integration in calculus
1972 Prolog for general logical reasoning
1997 Deep Blue defeats Kasparov
Other forms of intelligence
But is this getting us to where wersquod like to beSelfridge-Shannon film clip
Speech Recognition
Visual Processing
Natural Language modelling
Planning and decision in uncertain environments
Perhaps a different approach would be useful
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Astonishing Hypothesis Crick
ldquoA personrsquos mental activities are entirely due to the behaviour of nervecells and the molecules that make them up and influence themrdquo
Neurons
Visual Pathway
Information Processing in Brains
Neurons
Re
al
Wo
rld
Layer 1 Layer 2 Highminuslevel
Concepts
Feature
Hierarchical Modular Binary Parallel Noisy
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Artificial Neuron (Perceptron)
weight 7
output neuron
neuron 1neuron 2neuron 3neuron 4
neuron 7neuron 6neuron 5
inputs
weight 1
Training an artificial neural network
Want to generalise to new images with high accuracy
Artificial Network
1957 Rosenblattrsquos perceptron
perceptron film clip
Connectionism
1960 Realised a perceptron can only solve simple tasks
1970 Decline in interest
1980 New computing power made training multilayer networks feasible
outputinputs
Each node (or lsquoneuronrsquo) computes a function of a weighted combination ofparental nodes hj = σ(
sumi wijhi)
Neural Networks and Deep LearningHistorical Problems with Neural Nets (1990s)
NNs are difficult to train (many local optima)
Particularly difficult to train a NN with a large number of layers (say largerthan around 10)
lsquoGradient Diffusion Problemrsquo ndash difficult to assign responsibility of errors toindividual lsquoneuronsrsquo
Machine Learning (up to 2006)
A large section of the machine learning community abandoned NNs
More principled and computationally better understood techniques (SVMsand related convex methods) replaced them
Bayesian AI (1990s onwards)
From mid 1990s there was a realisation that pattern recognition is notsufficient for all AI purposes
Uncertainty and reasoning are not naturally representable using standardfeed-forward nets
Explosion in more lsquosymbolicrsquo Bayesian AI
Deep Learning
NNs have resurged in interest in the last few years (Hinton Bengio )
Also called lsquodeep learningrsquo
Sense that very complex tasks (object recognition learning complex structurein data) requires going beyond simple (convex) statistical techniques
The brain uses hierarchical distributed processing and it is likely to be for agood reason
Many problems have a hierarchical structure images are made of partslanguage is hierarchical etc
Why now
New computing resources (GPU processing)
Availability of large amount of data means that we can train nets with manyparameters (1010)
Recent evidence suggests local optima are not particularly problematic
Autoencoder
y1 y2 y3 y4 y5
h1 h2 h3
h4 h5
y1 y2 y3 y4 y5
h6 h7 h8
The bottleneck forces the network to try to find a low dimensionalrepresentation of the data
Useful for unsupervised learning
Autoencoder on MNIST digits (Hinton 2006 Science)
Figure Reconstructions using H = 30 components From the Top Original imageAutoencoder1 Autoencoder2 PCA
60000 training images (28times 28 = 784 pixels)
Use a form of autoencoder to find a lower (30) dimensional representation
At the time the special layerwise training procedure was consideredfundamental to the success of this approach Now not deemed necessaryprovided we use a sensible initialisation
Google Cats
10 Million Youtube video frames (200x200 pixel images)
Use a specialised autoencoder with 9 layers (1 billion weights)
2000 computers + two weeks of computing
Examine units to see what images they most respond to
Google Autoencoder
From Nando De Freitas
Convolutional NNs
CNNs are particularly popular in image processing
Often the feature maps correspond (not to macro features such as bicycles)but micro features
For example in handwritten digit recognition they correspond to smallconstituent parts of the digits
These are used then to process the image into a representation that is betterfor recognition
NNs in NLP
Bag of Words
We have D words in a dictionary aardvark zorro so that we can relateeach word with its dictionary index
We can also think of this as a Euclidean embedding e
aardvarkrarr eaardvark =
100
zorrorarr ezorro =
001
Word Embeddings
Idea is to replace the Euclidean embeddings e with embeddings (vectors) vthat are learned
Objective is for example next word prediction accuracy
These are often called lsquoneural language modelsrsquo
NNs in NLP
Each word w in the dictionary has an associated embedding vector vwUsually around 200 dimensional vectors are used
Consider the sentence
the cat sat on the mat
and that we wish to predict the word on given the two preceding cat sat
and two succeeding words the mat
We can use a network that has inputs vcat vsat vthe vmat
The output of the network is a probability over all words in the dictionaryp(w| vinputs)We want p(w = on|vcatvsatvthevmat) to be high
The overall objective is then to learn all the word embeddings and networkparameters subject to predicting the word correctly based on the context
Word Embeddings
Given a word (France for example) we can find which words w have embeddingvectors closest to vFrance From Ronan Collabert (2011)
Word Embeddings
There appears to be a natural lsquogeometryrsquo to the embeddings For example thereare directions that correspond to gender
vwoman minus vman asymp vaunt minus vuncle
vwoman minus vman asymp vqueen minus vking
From Mikolov (2013)
Word Embeddings Analogies
Given a relationship France-Paris we get the lsquorelationshiprsquo embedding
v = vParis minus vFrance
Given Italy we can calculate vItaly + v and find the word in the dictionary whichhas closest embedding to this (it turns out to be Rome) From Mikolov (2013)
Word Embeddings Constrained Embeddings
We can learn embeddings for English words and embeddings for ChinesewordsHowever when we know that a Chinese and English word have a similarmeaning we add a constraint that the word embeddings vChineseWord andvEnglishWord should be closeWe have only a small amount of labelled lsquosimilarrsquo Chinese-English words(these are the green border boxes in the above they are standard translationsof the corresponding Chinese character)We can visualise in 2D (using t-SNE) the embedding vectors See Socher(2013)
Word Embeddings Constrained Embeddings
Recursive Nets and Embeddings
Stanford Sentiment Treebank Consists of parsed sentences with sentiment labels(minusminusminus 0+++) for each node (phrase) in the tree 215000 labelled phrases(obtained from three humans)
Recursive Nets and Embeddings
Idea is to recursively combine embeddings such that they accurately predictthe sentiment at each node
Recursive Nets and EmbeddingsTraining
We have a softmax classifier for each node in the tree to predict thesentiment of the phrase beneath this node in the tree
The weights of this classifier are shared across all nodes
At the leaf nodes at the bottom of the tree the inputs to the classifiers arethe word embeddings
The embeddings are combined by another network g with commonparameters which forms the input to the sentiment classifier
We then learn all the embeddings shared classifier parameters and sharedcombination parameters to maximise the classification accuracy
Prediction
For a new movie review the review is first parsed using a standard grammartree parser
This forms the tree which can be used to recursively form the sentiment classlabel for the review
Currently the best sentiment classifier Socher (2013)
Recursive Nets and Embeddingsotilde otilde
eth
eth
Icircplusmnsup1raquoreg
eth
Uumlplusmnfrac14sup1raquoreg
otilde
otilde
eth
middotshy
otilde
eth
plusmnsup2raquo
otilde
eth
plusmnordm
otilde
otilde
eth
notcedilraquo
otilde
otilde
eth
sup3plusmnshynot
otilde
frac12plusmnsup3degraquoacuteacutemiddotsup2sup1
eth
ordfiquestregmiddotiquestnotmiddotplusmnsup2shy
eth
eth
plusmnsup2
eth
eth
notcedilmiddotshy
eth
notcedilraquosup3raquo
eth
ograve
yen
eth
eth
Icircplusmnsup1raquoreg
eth
Uumlplusmnfrac14sup1raquoreg
yen
yen
eth
middotshy
yen
eth
plusmnsup2raquo
yen
eth
plusmnordm
yen
yen
eth
notcedilraquo
yen
yen
yen
acuteraquoiquestshynot
otilde
frac12plusmnsup3degraquoacuteacutemiddotsup2sup1
eth
ordfiquestregmiddotiquestnotmiddotplusmnsup2shy
eth
eth
plusmnsup2
eth
eth
notcedilmiddotshy
eth
notcedilraquosup3raquo
eth
ograve
otilde
eth
times
otilde
otilde
otilde
acutemiddotmicroraquofrac14
eth
eth
eth
raquoordfraquoregsect
eth
eth
shymiddotsup2sup1acuteraquo
eth
sup3middotsup2laquonotraquo
eth
eth
plusmnordm
eth
eth
notcedilmiddotshy
eth
eth
ograve
yen
eth
times
yen
yen
eth
eth
frac14middotfrac14
eth
sup2ugravenot
eth
eth
acutemiddotmicroraquo
eth
eth
eth
iquest
eth
eth
shymiddotsup2sup1acuteraquo
eth
sup3middotsup2laquonotraquo
eth
eth
plusmnordm
eth
eth
notcedilmiddotshy
eth
eth
ograve
yen
eth
timesnot
yen
yen
eth
eth
ugraveshy
eth
paralaquoshynot
yen
otilde
middotsup2frac12regraquofrac14middotfrac34acutesect
yen yen
frac14laquoacuteacute
eth
ograve
eth
eth
timesnot
eth
eth
eth
eth
eth
ugraveshy
otilde
yen
sup2plusmnnot
yen yen
frac14laquoacuteacute
eth
ograve
Uacutemiddotsup1laquoregraquo ccedilaelig IcircOgraveIgraveOgrave degregraquofrac14middotfrac12notmiddotplusmnsup2 plusmnordm degplusmnshymiddotnotmiddotordfraquo iquestsup2frac14 sup2raquosup1iquestnotmiddotordfraquo oslashfrac34plusmnnotnotplusmnsup3 regmiddotsup1cedilnotdivide shyraquosup2notraquosup2frac12raquoshy iquestsup2frac14 notcedilraquomiddotreg sup2raquosup1iquestnotmiddotplusmnsup2ograve
Recurrent Nets
x1 x2 x3
h1 h2 h3
y1 y2 y3
A A A
C C C
B B
RNNs are used in timeseries applications
The basic idea is that the hidden units at time ht (and possibly output yt)depend on the previous state of the network htminus1 xtminus1 ytminus1 for inputs xt andoutputs yt
In the above network I lsquounrolled the net through timersquo to give a standard NNdiagram
I omitted the potential links from xtminus1 ytminus1 to ht
Handwriting Generation using a RNN
Some training examples
Handwriting Generation using a RNN
Some generated examples Top line is real handwriting for comparison See AlexGraversquos work
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reasons research in deep learning has exploded
Much greater compute power (GPU)
Much larger datasets
AutoDiff
What is AutoDiff
AutoDiff takes a function f(x) and returns an exact value (up to machineaccuracy) for the gradient
gi(x) equivpart
partxif
∥∥∥∥x
Note that this is not the same as a numerical approximation (such as centraldifferences) for the gradient
One can show that if done efficiently one can always calculate the gradient inless than 5 times the time it takes to compute f(x)
Reverse DifferentiationA useful graphical representation is that the total derivative of f with respect to xis given by the sum over all path values from x to f where each path value is theproduct of the partial derivatives of the functions on the edges
df
dx=partf
partx+partf
partg
dg
dx
x
f
gpartfpartx
dgdx
partfpartg
Example
For f(x) = x2 + xgh where g =x2 and h = xg2
x
f
gh2x+ gh
2x
xh
2gx
xg
g2
f prime(x) = (2x+ gh) + (g2xg) + (2x2gxxg) + (2xxh) = 2x+ 8x7
Reverse DifferentiationConsider
f(x1 x2) = cos (sin(x1x2))
We can represent this computationally using an Abstract Syntax Tree (AST)
x1 x2
f1
f2
f3
f1(x1 x2) = x1x2
f2(x) = sin(x)
f3(x) = cos(x)
Given values for x1 x2 we first run forwards through the tree so that we canassociate each node with an actual function value
Reverse Differentiation
x1 x2
f1
f2
f3
df3dx1
=partf3partf2
df2dx1
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx1
Similarly
df3dx2
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx2
The two derivatives share the same computation branch andwe want to exploit this
Reverse Differentiation
x1 x2
f1
f2
f3
partf1partx1
= x2partf1partx2
= x1
partf2partf1
= cos(f1)
partf3partf2
= minus sin(f2)
1 Find the reverse ancestral (backwards) scheduleof nodes (f3 f2 f1 x1 x2)
2 Start with the first node n1 in the reverseschedule and define tn1 = 1
3 For the next node n in the reverse schedule findthe child nodes ch (n) Then define
tn =sum
cisinch(n)
partfcpartfn
tc
4 The total derivatives of f with respect to theroot nodes of the tree (here x1 and x2) are givenby the values of t at those nodes
This is a general procedure that can be used to automatically define a subroutineto efficiently compute the gradient It is efficient because information is collectedat nodes in the tree and split between parents only when required
Limitations of forward reasoning
World Representation
Recognising patterns (perceptron style) is only one form of intelligence
Solving chess problems is another and requires complex reasoning using someform of internal model
The world is noisy and information may be conflicting
Recognised that new approaches are required
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Limitations of forward reasoning
World Representation
Models help us to fantasise about the world
Models
Models
Burglar Problem
Creaks and Bumps
Creak Bump
Burglar Model
pos1 pos2 pos3 pos4
snd1 snd2 snd3 snd4
pos - position in kitchensnd ndash sound
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Stubby Fingers
Stubby Fingers
int1 int2 int3 int4
hit1 hit2 hit3 hit4
int - intended keyhit ndash hit key
Stubby Fingers errors
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz
005
01
015
02
025
03
035
04
045
05
055
Stubby Fingers language
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz 0
01
02
03
04
05
06
07
08
09
Stubby Fingers
Given the typed sequence cwsykcak what is the most likely word that thiscorresponds to
List the 200 most likely hidden sequences
Discard those that are not in a standard English dictionary
Take the most likely proper English word as the intended typed word
Speech Recognition raw signal
0 01 02 03 04 05 06 07 08 09minus02
minus015
minus01
minus005
0
005
01
015
02
025
03
Time
lsquoneuralrsquo representation
10 20 30 40 50 60 70 80
5
10
15
20
25
Speech Recognition
pho1 pho2 pho3 pho4
aud1 aud2 aud3 aud4
pho phoneme (letter)aud audio signal (neural representation)
Medical Diagnosis
tumour flu meningitis
headache fever appetite x-ray
Combine known medical knowledge with patient specific information
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems such as the IsingModel (1920) and in AI applications such as the HMM (Baum 1966Stratonovich 1960)
The need for structure
We often want to make a probabilistic description of many objects (electronspins neurons customers etc )
Typically the representational and computational cost of probabilistic modelsgrows exponentially with the number of objects represented
Without introducing strong structural limitations about how these objects caninteract probability is a non-starter
For this reason computationally lsquosimplerrsquo alternatives (such as fuzzy logic)were introduced to try to avoid some of these difficulties ndash however these aretypically frowed on by purists
Graphical Models
We can use graphs to represent how objects can probabilistically interact witheach other
Graphical Models and then a marriage between Graph and Probability theory
Many of the quantities that we would like to compute in a probabilitydistribution can then be related to operations on the graph
The computational complexity of operations can often be related to thestructure of the graph
Graphical Models are now used as a standard framework in EngineeringStatistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty and aretherefore widely applicable
Uses in Industry
Microsoft used to estimate the skill distribution of players in online games(the worlds largest graphical model)
Hospitals use Belief Nets to encode knowledge about diseases and symptomsto aid medical diagnosis
Google Microsoft Facebook used in many places including advertisingvideo game prediction speech recognition
Used to estimate inherent desirability of products in consumer retail
Microsoft and others Attempt to go beyond simple AB testing by usesGraphical Models to model the whole companyuser relationship
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or more shortly theprobability of x given y) is defined as
p(x|y) equiv p(x y)
p(y)=p(y|x)p(x)
p(y)(Bayesrsquo rule)
Throwing darts
p(region 5|not region 20) =p(region 5 not region 20)
p(not region 20)
=p(region 5)
p(not region 20)=
120
1920=
1
19
Interpretationp(A = a|B = b) should not be interpreted as lsquoGiven the event B = b has occurredp(A = a|B = b) is the probability of the event A = a occurringrsquo The correctinterpretation should be lsquop(A = a|B = b) is the probability of A being in state aunder the constraint that B is in state brsquo
Battleships
Assume there are 2 ships 1 vertical (ship 1) and 1 horizontal (ship 2) of 5pixels each
Can be placed anywhere on the 10times10 grid but cannot overlap
Let s1 is the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query lsquohitrsquo or lsquomissrsquo responses
p(s1 s2|D) =p(D|s1 s2)p(s1 s2)
p(D)Let X be the matrix of pixel occupancy
p(X|D) =sums1s2
p(X s1 s2|D) =sums1s2
p(X|s1 s2)p(s1 s2|D)
demoBattleshipsm
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state-of-the-art results in Speech Recognition, Image Analysis, Game Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve the interaction between reinforcement learning and representation learning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
https://reinfer.io
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Astonishing Hypothesis Crick
ldquoA personrsquos mental activities are entirely due to the behaviour of nervecells and the molecules that make them up and influence themrdquo
Neurons
Visual Pathway
Information Processing in Brains
Neurons
Re
al
Wo
rld
Layer 1 Layer 2 Highminuslevel
Concepts
Feature
Hierarchical Modular Binary Parallel Noisy
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Artificial Neuron (Perceptron)
weight 7
output neuron
neuron 1neuron 2neuron 3neuron 4
neuron 7neuron 6neuron 5
inputs
weight 1
Training an artificial neural network
Want to generalise to new images with high accuracy
Artificial Network
1957 Rosenblattrsquos perceptron
perceptron film clip
Connectionism
1960 Realised a perceptron can only solve simple tasks
1970 Decline in interest
1980 New computing power made training multilayer networks feasible
outputinputs
Each node (or lsquoneuronrsquo) computes a function of a weighted combination ofparental nodes hj = σ(
sumi wijhi)
Neural Networks and Deep LearningHistorical Problems with Neural Nets (1990s)
NNs are difficult to train (many local optima)
Particularly difficult to train a NN with a large number of layers (say largerthan around 10)
lsquoGradient Diffusion Problemrsquo ndash difficult to assign responsibility of errors toindividual lsquoneuronsrsquo
Machine Learning (up to 2006)
A large section of the machine learning community abandoned NNs
More principled and computationally better understood techniques (SVMsand related convex methods) replaced them
Bayesian AI (1990s onwards)
From mid 1990s there was a realisation that pattern recognition is notsufficient for all AI purposes
Uncertainty and reasoning are not naturally representable using standardfeed-forward nets
Explosion in more lsquosymbolicrsquo Bayesian AI
Deep Learning
NNs have resurged in interest in the last few years (Hinton Bengio )
Also called lsquodeep learningrsquo
Sense that very complex tasks (object recognition learning complex structurein data) requires going beyond simple (convex) statistical techniques
The brain uses hierarchical distributed processing and it is likely to be for agood reason
Many problems have a hierarchical structure images are made of partslanguage is hierarchical etc
Why now
New computing resources (GPU processing)
Availability of large amount of data means that we can train nets with manyparameters (1010)
Recent evidence suggests local optima are not particularly problematic
Autoencoder
y1 y2 y3 y4 y5
h1 h2 h3
h4 h5
y1 y2 y3 y4 y5
h6 h7 h8
The bottleneck forces the network to try to find a low dimensionalrepresentation of the data
Useful for unsupervised learning
Autoencoder on MNIST digits (Hinton 2006 Science)
Figure Reconstructions using H = 30 components From the Top Original imageAutoencoder1 Autoencoder2 PCA
60000 training images (28times 28 = 784 pixels)
Use a form of autoencoder to find a lower (30) dimensional representation
At the time the special layerwise training procedure was consideredfundamental to the success of this approach Now not deemed necessaryprovided we use a sensible initialisation
Google Cats
10 Million Youtube video frames (200x200 pixel images)
Use a specialised autoencoder with 9 layers (1 billion weights)
2000 computers + two weeks of computing
Examine units to see what images they most respond to
Google Autoencoder
From Nando De Freitas
Convolutional NNs
CNNs are particularly popular in image processing
Often the feature maps correspond (not to macro features such as bicycles)but micro features
For example in handwritten digit recognition they correspond to smallconstituent parts of the digits
These are used then to process the image into a representation that is betterfor recognition
NNs in NLP
Bag of Words
We have D words in a dictionary aardvark zorro so that we can relateeach word with its dictionary index
We can also think of this as a Euclidean embedding e
aardvarkrarr eaardvark =
100
zorrorarr ezorro =
001
Word Embeddings
Idea is to replace the Euclidean embeddings e with embeddings (vectors) vthat are learned
Objective is for example next word prediction accuracy
These are often called lsquoneural language modelsrsquo
NNs in NLP
Each word w in the dictionary has an associated embedding vector vwUsually around 200 dimensional vectors are used
Consider the sentence
the cat sat on the mat
and that we wish to predict the word on given the two preceding cat sat
and two succeeding words the mat
We can use a network that has inputs vcat vsat vthe vmat
The output of the network is a probability over all words in the dictionaryp(w| vinputs)We want p(w = on|vcatvsatvthevmat) to be high
The overall objective is then to learn all the word embeddings and networkparameters subject to predicting the word correctly based on the context
Word Embeddings
Given a word (France for example) we can find which words w have embeddingvectors closest to vFrance From Ronan Collabert (2011)
Word Embeddings
There appears to be a natural lsquogeometryrsquo to the embeddings For example thereare directions that correspond to gender
vwoman minus vman asymp vaunt minus vuncle
vwoman minus vman asymp vqueen minus vking
From Mikolov (2013)
Word Embeddings Analogies
Given a relationship France-Paris we get the lsquorelationshiprsquo embedding
v = vParis minus vFrance
Given Italy we can calculate vItaly + v and find the word in the dictionary whichhas closest embedding to this (it turns out to be Rome) From Mikolov (2013)
Word Embeddings Constrained Embeddings
We can learn embeddings for English words and embeddings for ChinesewordsHowever when we know that a Chinese and English word have a similarmeaning we add a constraint that the word embeddings vChineseWord andvEnglishWord should be closeWe have only a small amount of labelled lsquosimilarrsquo Chinese-English words(these are the green border boxes in the above they are standard translationsof the corresponding Chinese character)We can visualise in 2D (using t-SNE) the embedding vectors See Socher(2013)
Word Embeddings Constrained Embeddings
Recursive Nets and Embeddings
Stanford Sentiment Treebank Consists of parsed sentences with sentiment labels(minusminusminus 0+++) for each node (phrase) in the tree 215000 labelled phrases(obtained from three humans)
Recursive Nets and Embeddings
Idea is to recursively combine embeddings such that they accurately predictthe sentiment at each node
Recursive Nets and EmbeddingsTraining
We have a softmax classifier for each node in the tree to predict thesentiment of the phrase beneath this node in the tree
The weights of this classifier are shared across all nodes
At the leaf nodes at the bottom of the tree the inputs to the classifiers arethe word embeddings
The embeddings are combined by another network g with commonparameters which forms the input to the sentiment classifier
We then learn all the embeddings shared classifier parameters and sharedcombination parameters to maximise the classification accuracy
Prediction
For a new movie review the review is first parsed using a standard grammartree parser
This forms the tree which can be used to recursively form the sentiment classlabel for the review
Currently the best sentiment classifier Socher (2013)
Recursive Nets and Embeddingsotilde otilde
eth
eth
Icircplusmnsup1raquoreg
eth
Uumlplusmnfrac14sup1raquoreg
otilde
otilde
eth
middotshy
otilde
eth
plusmnsup2raquo
otilde
eth
plusmnordm
otilde
otilde
eth
notcedilraquo
otilde
otilde
eth
sup3plusmnshynot
otilde
frac12plusmnsup3degraquoacuteacutemiddotsup2sup1
eth
ordfiquestregmiddotiquestnotmiddotplusmnsup2shy
eth
eth
plusmnsup2
eth
eth
notcedilmiddotshy
eth
notcedilraquosup3raquo
eth
ograve
yen
eth
eth
Icircplusmnsup1raquoreg
eth
Uumlplusmnfrac14sup1raquoreg
yen
yen
eth
middotshy
yen
eth
plusmnsup2raquo
yen
eth
plusmnordm
yen
yen
eth
notcedilraquo
yen
yen
yen
acuteraquoiquestshynot
otilde
frac12plusmnsup3degraquoacuteacutemiddotsup2sup1
eth
ordfiquestregmiddotiquestnotmiddotplusmnsup2shy
eth
eth
plusmnsup2
eth
eth
notcedilmiddotshy
eth
notcedilraquosup3raquo
eth
ograve
otilde
eth
times
otilde
otilde
otilde
acutemiddotmicroraquofrac14
eth
eth
eth
raquoordfraquoregsect
eth
eth
shymiddotsup2sup1acuteraquo
eth
sup3middotsup2laquonotraquo
eth
eth
plusmnordm
eth
eth
notcedilmiddotshy
eth
eth
ograve
yen
eth
times
yen
yen
eth
eth
frac14middotfrac14
eth
sup2ugravenot
eth
eth
acutemiddotmicroraquo
eth
eth
eth
iquest
eth
eth
shymiddotsup2sup1acuteraquo
eth
sup3middotsup2laquonotraquo
eth
eth
plusmnordm
eth
eth
notcedilmiddotshy
eth
eth
ograve
yen
eth
timesnot
yen
yen
eth
eth
ugraveshy
eth
paralaquoshynot
yen
otilde
middotsup2frac12regraquofrac14middotfrac34acutesect
yen yen
frac14laquoacuteacute
eth
ograve
eth
eth
timesnot
eth
eth
eth
eth
eth
ugraveshy
otilde
yen
sup2plusmnnot
yen yen
frac14laquoacuteacute
eth
ograve
Uacutemiddotsup1laquoregraquo ccedilaelig IcircOgraveIgraveOgrave degregraquofrac14middotfrac12notmiddotplusmnsup2 plusmnordm degplusmnshymiddotnotmiddotordfraquo iquestsup2frac14 sup2raquosup1iquestnotmiddotordfraquo oslashfrac34plusmnnotnotplusmnsup3 regmiddotsup1cedilnotdivide shyraquosup2notraquosup2frac12raquoshy iquestsup2frac14 notcedilraquomiddotreg sup2raquosup1iquestnotmiddotplusmnsup2ograve
Recurrent Nets
x1 x2 x3
h1 h2 h3
y1 y2 y3
A A A
C C C
B B
RNNs are used in timeseries applications
The basic idea is that the hidden units at time ht (and possibly output yt)depend on the previous state of the network htminus1 xtminus1 ytminus1 for inputs xt andoutputs yt
In the above network I lsquounrolled the net through timersquo to give a standard NNdiagram
I omitted the potential links from xtminus1 ytminus1 to ht
Handwriting Generation using a RNN
Some training examples
Handwriting Generation using a RNN
Some generated examples Top line is real handwriting for comparison See AlexGraversquos work
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reasons research in deep learning has exploded
Much greater compute power (GPU)
Much larger datasets
AutoDiff
What is AutoDiff
AutoDiff takes a function f(x) and returns an exact value (up to machineaccuracy) for the gradient
gi(x) equivpart
partxif
∥∥∥∥x
Note that this is not the same as a numerical approximation (such as centraldifferences) for the gradient
One can show that if done efficiently one can always calculate the gradient inless than 5 times the time it takes to compute f(x)
Reverse DifferentiationA useful graphical representation is that the total derivative of f with respect to xis given by the sum over all path values from x to f where each path value is theproduct of the partial derivatives of the functions on the edges
df
dx=partf
partx+partf
partg
dg
dx
x
f
gpartfpartx
dgdx
partfpartg
Example
For f(x) = x2 + xgh where g =x2 and h = xg2
x
f
gh2x+ gh
2x
xh
2gx
xg
g2
f prime(x) = (2x+ gh) + (g2xg) + (2x2gxxg) + (2xxh) = 2x+ 8x7
Reverse DifferentiationConsider
f(x1 x2) = cos (sin(x1x2))
We can represent this computationally using an Abstract Syntax Tree (AST)
x1 x2
f1
f2
f3
f1(x1 x2) = x1x2
f2(x) = sin(x)
f3(x) = cos(x)
Given values for x1 x2 we first run forwards through the tree so that we canassociate each node with an actual function value
Reverse Differentiation
x1 x2
f1
f2
f3
df3dx1
=partf3partf2
df2dx1
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx1
Similarly
df3dx2
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx2
The two derivatives share the same computation branch andwe want to exploit this
Reverse Differentiation
x1 x2
f1
f2
f3
partf1partx1
= x2partf1partx2
= x1
partf2partf1
= cos(f1)
partf3partf2
= minus sin(f2)
1 Find the reverse ancestral (backwards) scheduleof nodes (f3 f2 f1 x1 x2)
2 Start with the first node n1 in the reverseschedule and define tn1 = 1
3 For the next node n in the reverse schedule findthe child nodes ch (n) Then define
tn =sum
cisinch(n)
partfcpartfn
tc
4 The total derivatives of f with respect to theroot nodes of the tree (here x1 and x2) are givenby the values of t at those nodes
This is a general procedure that can be used to automatically define a subroutineto efficiently compute the gradient It is efficient because information is collectedat nodes in the tree and split between parents only when required
Limitations of forward reasoning
World Representation
Recognising patterns (perceptron style) is only one form of intelligence
Solving chess problems is another and requires complex reasoning using someform of internal model
The world is noisy and information may be conflicting
Recognised that new approaches are required
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Limitations of forward reasoning
World Representation
Models help us to fantasise about the world
Models
Models
Burglar Problem
Creaks and Bumps
Creak Bump
Burglar Model
pos1 pos2 pos3 pos4
snd1 snd2 snd3 snd4
pos - position in kitchensnd ndash sound
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Stubby Fingers
Stubby Fingers
int1 int2 int3 int4
hit1 hit2 hit3 hit4
int - intended keyhit ndash hit key
Stubby Fingers errors
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz
005
01
015
02
025
03
035
04
045
05
055
Stubby Fingers language
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz 0
01
02
03
04
05
06
07
08
09
Stubby Fingers
Given the typed sequence cwsykcak what is the most likely word that thiscorresponds to
List the 200 most likely hidden sequences
Discard those that are not in a standard English dictionary
Take the most likely proper English word as the intended typed word
Speech Recognition raw signal
0 01 02 03 04 05 06 07 08 09minus02
minus015
minus01
minus005
0
005
01
015
02
025
03
Time
lsquoneuralrsquo representation
10 20 30 40 50 60 70 80
5
10
15
20
25
Speech Recognition
pho1 pho2 pho3 pho4
aud1 aud2 aud3 aud4
pho phoneme (letter)aud audio signal (neural representation)
Medical Diagnosis
tumour flu meningitis
headache fever appetite x-ray
Combine known medical knowledge with patient specific information
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems such as the IsingModel (1920) and in AI applications such as the HMM (Baum 1966Stratonovich 1960)
The need for structure
We often want to make a probabilistic description of many objects (electronspins neurons customers etc )
Typically the representational and computational cost of probabilistic modelsgrows exponentially with the number of objects represented
Without introducing strong structural limitations about how these objects caninteract probability is a non-starter
For this reason computationally lsquosimplerrsquo alternatives (such as fuzzy logic)were introduced to try to avoid some of these difficulties ndash however these aretypically frowed on by purists
Graphical Models
We can use graphs to represent how objects can probabilistically interact witheach other
Graphical Models and then a marriage between Graph and Probability theory
Many of the quantities that we would like to compute in a probabilitydistribution can then be related to operations on the graph
The computational complexity of operations can often be related to thestructure of the graph
Graphical Models are now used as a standard framework in EngineeringStatistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty and aretherefore widely applicable
Uses in Industry
Microsoft used to estimate the skill distribution of players in online games(the worlds largest graphical model)
Hospitals use Belief Nets to encode knowledge about diseases and symptomsto aid medical diagnosis
Google Microsoft Facebook used in many places including advertisingvideo game prediction speech recognition
Used to estimate inherent desirability of products in consumer retail
Microsoft and others Attempt to go beyond simple AB testing by usesGraphical Models to model the whole companyuser relationship
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or more shortly theprobability of x given y) is defined as
p(x|y) equiv p(x y)
p(y)=p(y|x)p(x)
p(y)(Bayesrsquo rule)
Throwing darts
p(region 5|not region 20) =p(region 5 not region 20)
p(not region 20)
=p(region 5)
p(not region 20)=
120
1920=
1
19
Interpretationp(A = a|B = b) should not be interpreted as lsquoGiven the event B = b has occurredp(A = a|B = b) is the probability of the event A = a occurringrsquo The correctinterpretation should be lsquop(A = a|B = b) is the probability of A being in state aunder the constraint that B is in state brsquo
Battleships
Assume there are 2 ships 1 vertical (ship 1) and 1 horizontal (ship 2) of 5pixels each
Can be placed anywhere on the 10times10 grid but cannot overlap
Let s1 is the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query lsquohitrsquo or lsquomissrsquo responses
p(s1 s2|D) =p(D|s1 s2)p(s1 s2)
p(D)Let X be the matrix of pixel occupancy
p(X|D) =sums1s2
p(X s1 s2|D) =sums1s2
p(X|s1 s2)p(s1 s2|D)
demoBattleshipsm
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
Word Embeddings Constrained Embeddings
Recursive Nets and Embeddings
Stanford Sentiment Treebank Consists of parsed sentences with sentiment labels(minusminusminus 0+++) for each node (phrase) in the tree 215000 labelled phrases(obtained from three humans)
Recursive Nets and Embeddings
Idea is to recursively combine embeddings such that they accurately predictthe sentiment at each node
Recursive Nets and EmbeddingsTraining
We have a softmax classifier for each node in the tree to predict thesentiment of the phrase beneath this node in the tree
The weights of this classifier are shared across all nodes
At the leaf nodes at the bottom of the tree the inputs to the classifiers arethe word embeddings
The embeddings are combined by another network g with commonparameters which forms the input to the sentiment classifier
We then learn all the embeddings shared classifier parameters and sharedcombination parameters to maximise the classification accuracy
Prediction
For a new movie review the review is first parsed using a standard grammartree parser
This forms the tree which can be used to recursively form the sentiment classlabel for the review
Currently the best sentiment classifier Socher (2013)
Recursive Nets and Embeddingsotilde otilde
eth
eth
Icircplusmnsup1raquoreg
eth
Uumlplusmnfrac14sup1raquoreg
otilde
otilde
eth
middotshy
otilde
eth
plusmnsup2raquo
otilde
eth
plusmnordm
otilde
otilde
eth
notcedilraquo
otilde
otilde
eth
sup3plusmnshynot
otilde
frac12plusmnsup3degraquoacuteacutemiddotsup2sup1
eth
ordfiquestregmiddotiquestnotmiddotplusmnsup2shy
eth
eth
plusmnsup2
eth
eth
notcedilmiddotshy
eth
notcedilraquosup3raquo
eth
ograve
yen
eth
eth
Icircplusmnsup1raquoreg
eth
Uumlplusmnfrac14sup1raquoreg
yen
yen
eth
middotshy
yen
eth
plusmnsup2raquo
yen
eth
plusmnordm
yen
yen
eth
notcedilraquo
yen
yen
yen
acuteraquoiquestshynot
otilde
frac12plusmnsup3degraquoacuteacutemiddotsup2sup1
eth
ordfiquestregmiddotiquestnotmiddotplusmnsup2shy
eth
eth
plusmnsup2
eth
eth
notcedilmiddotshy
eth
notcedilraquosup3raquo
eth
ograve
otilde
eth
times
otilde
otilde
otilde
acutemiddotmicroraquofrac14
eth
eth
eth
raquoordfraquoregsect
eth
eth
shymiddotsup2sup1acuteraquo
eth
sup3middotsup2laquonotraquo
eth
eth
plusmnordm
eth
eth
notcedilmiddotshy
eth
eth
ograve
yen
eth
times
yen
yen
eth
eth
frac14middotfrac14
eth
sup2ugravenot
eth
eth
acutemiddotmicroraquo
eth
eth
eth
iquest
eth
eth
shymiddotsup2sup1acuteraquo
eth
sup3middotsup2laquonotraquo
eth
eth
plusmnordm
eth
eth
notcedilmiddotshy
eth
eth
ograve
yen
eth
timesnot
yen
yen
eth
eth
ugraveshy
eth
paralaquoshynot
yen
otilde
middotsup2frac12regraquofrac14middotfrac34acutesect
yen yen
frac14laquoacuteacute
eth
ograve
eth
eth
timesnot
eth
eth
eth
eth
eth
ugraveshy
otilde
yen
sup2plusmnnot
yen yen
frac14laquoacuteacute
eth
ograve
Uacutemiddotsup1laquoregraquo ccedilaelig IcircOgraveIgraveOgrave degregraquofrac14middotfrac12notmiddotplusmnsup2 plusmnordm degplusmnshymiddotnotmiddotordfraquo iquestsup2frac14 sup2raquosup1iquestnotmiddotordfraquo oslashfrac34plusmnnotnotplusmnsup3 regmiddotsup1cedilnotdivide shyraquosup2notraquosup2frac12raquoshy iquestsup2frac14 notcedilraquomiddotreg sup2raquosup1iquestnotmiddotplusmnsup2ograve
Recurrent Nets
x1 x2 x3
h1 h2 h3
y1 y2 y3
A A A
C C C
B B
RNNs are used in timeseries applications
The basic idea is that the hidden units at time ht (and possibly output yt)depend on the previous state of the network htminus1 xtminus1 ytminus1 for inputs xt andoutputs yt
In the above network I lsquounrolled the net through timersquo to give a standard NNdiagram
I omitted the potential links from xtminus1 ytminus1 to ht
Handwriting Generation using a RNN
Some training examples
Handwriting Generation using a RNN
Some generated examples Top line is real handwriting for comparison See AlexGraversquos work
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reasons research in deep learning has exploded
Much greater compute power (GPU)
Much larger datasets
AutoDiff
What is AutoDiff
AutoDiff takes a function f(x) and returns an exact value (up to machineaccuracy) for the gradient
gi(x) equivpart
partxif
∥∥∥∥x
Note that this is not the same as a numerical approximation (such as centraldifferences) for the gradient
One can show that if done efficiently one can always calculate the gradient inless than 5 times the time it takes to compute f(x)
Reverse DifferentiationA useful graphical representation is that the total derivative of f with respect to xis given by the sum over all path values from x to f where each path value is theproduct of the partial derivatives of the functions on the edges
df
dx=partf
partx+partf
partg
dg
dx
x
f
gpartfpartx
dgdx
partfpartg
Example
For f(x) = x2 + xgh where g =x2 and h = xg2
x
f
gh2x+ gh
2x
xh
2gx
xg
g2
f prime(x) = (2x+ gh) + (g2xg) + (2x2gxxg) + (2xxh) = 2x+ 8x7
Reverse DifferentiationConsider
f(x1 x2) = cos (sin(x1x2))
We can represent this computationally using an Abstract Syntax Tree (AST)
x1 x2
f1
f2
f3
f1(x1 x2) = x1x2
f2(x) = sin(x)
f3(x) = cos(x)
Given values for x1 x2 we first run forwards through the tree so that we canassociate each node with an actual function value
Reverse Differentiation
x1 x2
f1
f2
f3
df3dx1
=partf3partf2
df2dx1
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx1
Similarly
df3dx2
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx2
The two derivatives share the same computation branch andwe want to exploit this
Reverse Differentiation
x1 x2
f1
f2
f3
partf1partx1
= x2partf1partx2
= x1
partf2partf1
= cos(f1)
partf3partf2
= minus sin(f2)
1 Find the reverse ancestral (backwards) scheduleof nodes (f3 f2 f1 x1 x2)
2 Start with the first node n1 in the reverseschedule and define tn1 = 1
3 For the next node n in the reverse schedule findthe child nodes ch (n) Then define
tn =sum
cisinch(n)
partfcpartfn
tc
4 The total derivatives of f with respect to theroot nodes of the tree (here x1 and x2) are givenby the values of t at those nodes
This is a general procedure that can be used to automatically define a subroutineto efficiently compute the gradient It is efficient because information is collectedat nodes in the tree and split between parents only when required
Limitations of forward reasoning
World Representation
Recognising patterns (perceptron style) is only one form of intelligence
Solving chess problems is another and requires complex reasoning using someform of internal model
The world is noisy and information may be conflicting
Recognised that new approaches are required
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Limitations of forward reasoning
World Representation
Models help us to fantasise about the world
Models
Models
Burglar Problem
Creaks and Bumps
Creak Bump
Burglar Model
pos1 pos2 pos3 pos4
snd1 snd2 snd3 snd4
pos - position in kitchensnd ndash sound
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Stubby Fingers
Stubby Fingers
int1 int2 int3 int4
hit1 hit2 hit3 hit4
int - intended keyhit ndash hit key
Stubby Fingers errors
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz
005
01
015
02
025
03
035
04
045
05
055
Stubby Fingers language
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz 0
01
02
03
04
05
06
07
08
09
Stubby Fingers
Given the typed sequence cwsykcak what is the most likely word that thiscorresponds to
List the 200 most likely hidden sequences
Discard those that are not in a standard English dictionary
Take the most likely proper English word as the intended typed word
Speech Recognition raw signal
0 01 02 03 04 05 06 07 08 09minus02
minus015
minus01
minus005
0
005
01
015
02
025
03
Time
lsquoneuralrsquo representation
10 20 30 40 50 60 70 80
5
10
15
20
25
Speech Recognition
pho1 pho2 pho3 pho4
aud1 aud2 aud3 aud4
pho phoneme (letter)aud audio signal (neural representation)
Medical Diagnosis
tumour flu meningitis
headache fever appetite x-ray
Combine known medical knowledge with patient specific information
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems such as the IsingModel (1920) and in AI applications such as the HMM (Baum 1966Stratonovich 1960)
The need for structure
We often want to make a probabilistic description of many objects (electronspins neurons customers etc )
Typically the representational and computational cost of probabilistic modelsgrows exponentially with the number of objects represented
Without introducing strong structural limitations about how these objects caninteract probability is a non-starter
For this reason computationally lsquosimplerrsquo alternatives (such as fuzzy logic)were introduced to try to avoid some of these difficulties ndash however these aretypically frowed on by purists
Graphical Models
We can use graphs to represent how objects can probabilistically interact witheach other
Graphical Models and then a marriage between Graph and Probability theory
Many of the quantities that we would like to compute in a probabilitydistribution can then be related to operations on the graph
The computational complexity of operations can often be related to thestructure of the graph
Graphical Models are now used as a standard framework in EngineeringStatistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty and aretherefore widely applicable
Uses in Industry
Microsoft used to estimate the skill distribution of players in online games(the worlds largest graphical model)
Hospitals use Belief Nets to encode knowledge about diseases and symptomsto aid medical diagnosis
Google Microsoft Facebook used in many places including advertisingvideo game prediction speech recognition
Used to estimate inherent desirability of products in consumer retail
Microsoft and others Attempt to go beyond simple AB testing by usesGraphical Models to model the whole companyuser relationship
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or more shortly theprobability of x given y) is defined as
p(x|y) equiv p(x y)
p(y)=p(y|x)p(x)
p(y)(Bayesrsquo rule)
Throwing darts
p(region 5|not region 20) =p(region 5 not region 20)
p(not region 20)
=p(region 5)
p(not region 20)=
120
1920=
1
19
Interpretationp(A = a|B = b) should not be interpreted as lsquoGiven the event B = b has occurredp(A = a|B = b) is the probability of the event A = a occurringrsquo The correctinterpretation should be lsquop(A = a|B = b) is the probability of A being in state aunder the constraint that B is in state brsquo
Battleships
Assume there are 2 ships 1 vertical (ship 1) and 1 horizontal (ship 2) of 5pixels each
Can be placed anywhere on the 10times10 grid but cannot overlap
Let s1 is the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query lsquohitrsquo or lsquomissrsquo responses
p(s1 s2|D) =p(D|s1 s2)p(s1 s2)
p(D)Let X be the matrix of pixel occupancy
p(X|D) =sums1s2
p(X s1 s2|D) =sums1s2
p(X|s1 s2)p(s1 s2|D)
demoBattleshipsm
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
Information Processing in Brains
Neurons
Re
al
Wo
rld
Layer 1 Layer 2 Highminuslevel
Concepts
Feature
Hierarchical Modular Binary Parallel Noisy
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Artificial Neuron (Perceptron)
weight 7
output neuron
neuron 1neuron 2neuron 3neuron 4
neuron 7neuron 6neuron 5
inputs
weight 1
Training an artificial neural network
Want to generalise to new images with high accuracy
Artificial Network
1957 Rosenblattrsquos perceptron
perceptron film clip
Connectionism
1960 Realised a perceptron can only solve simple tasks
1970 Decline in interest
1980 New computing power made training multilayer networks feasible
outputinputs
Each node (or lsquoneuronrsquo) computes a function of a weighted combination ofparental nodes hj = σ(
sumi wijhi)
Neural Networks and Deep LearningHistorical Problems with Neural Nets (1990s)
NNs are difficult to train (many local optima)
Particularly difficult to train a NN with a large number of layers (say largerthan around 10)
lsquoGradient Diffusion Problemrsquo ndash difficult to assign responsibility of errors toindividual lsquoneuronsrsquo
Machine Learning (up to 2006)
A large section of the machine learning community abandoned NNs
More principled and computationally better understood techniques (SVMsand related convex methods) replaced them
Bayesian AI (1990s onwards)
From mid 1990s there was a realisation that pattern recognition is notsufficient for all AI purposes
Uncertainty and reasoning are not naturally representable using standardfeed-forward nets
Explosion in more lsquosymbolicrsquo Bayesian AI
Deep Learning
NNs have resurged in interest in the last few years (Hinton Bengio )
Also called lsquodeep learningrsquo
Sense that very complex tasks (object recognition learning complex structurein data) requires going beyond simple (convex) statistical techniques
The brain uses hierarchical distributed processing and it is likely to be for agood reason
Many problems have a hierarchical structure images are made of partslanguage is hierarchical etc
Why now
New computing resources (GPU processing)
Availability of large amount of data means that we can train nets with manyparameters (1010)
Recent evidence suggests local optima are not particularly problematic
Autoencoder
y1 y2 y3 y4 y5
h1 h2 h3
h4 h5
y1 y2 y3 y4 y5
h6 h7 h8
The bottleneck forces the network to try to find a low dimensionalrepresentation of the data
Useful for unsupervised learning
Autoencoder on MNIST digits (Hinton 2006 Science)
Figure Reconstructions using H = 30 components From the Top Original imageAutoencoder1 Autoencoder2 PCA
60000 training images (28times 28 = 784 pixels)
Use a form of autoencoder to find a lower (30) dimensional representation
At the time the special layerwise training procedure was consideredfundamental to the success of this approach Now not deemed necessaryprovided we use a sensible initialisation
Google Cats
10 Million Youtube video frames (200x200 pixel images)
Use a specialised autoencoder with 9 layers (1 billion weights)
2000 computers + two weeks of computing
Examine units to see what images they most respond to
Google Autoencoder
From Nando De Freitas
Convolutional NNs
CNNs are particularly popular in image processing
Often the feature maps correspond (not to macro features such as bicycles)but micro features
For example in handwritten digit recognition they correspond to smallconstituent parts of the digits
These are used then to process the image into a representation that is betterfor recognition
NNs in NLP
Bag of Words
We have D words in a dictionary aardvark zorro so that we can relateeach word with its dictionary index
We can also think of this as a Euclidean embedding e
aardvarkrarr eaardvark =
100
zorrorarr ezorro =
001
Word Embeddings
Idea is to replace the Euclidean embeddings e with embeddings (vectors) vthat are learned
Objective is for example next word prediction accuracy
These are often called lsquoneural language modelsrsquo
NNs in NLP
Each word w in the dictionary has an associated embedding vector vwUsually around 200 dimensional vectors are used
Consider the sentence
the cat sat on the mat
and that we wish to predict the word on given the two preceding cat sat
and two succeeding words the mat
We can use a network that has inputs vcat vsat vthe vmat
The output of the network is a probability over all words in the dictionaryp(w| vinputs)We want p(w = on|vcatvsatvthevmat) to be high
The overall objective is then to learn all the word embeddings and networkparameters subject to predicting the word correctly based on the context
Word Embeddings
Given a word (France for example) we can find which words w have embeddingvectors closest to vFrance From Ronan Collabert (2011)
Word Embeddings
There appears to be a natural lsquogeometryrsquo to the embeddings For example thereare directions that correspond to gender
vwoman minus vman asymp vaunt minus vuncle
vwoman minus vman asymp vqueen minus vking
From Mikolov (2013)
Word Embeddings Analogies
Given a relationship France-Paris we get the lsquorelationshiprsquo embedding
v = vParis minus vFrance
Given Italy we can calculate vItaly + v and find the word in the dictionary whichhas closest embedding to this (it turns out to be Rome) From Mikolov (2013)
Word Embeddings Constrained Embeddings
We can learn embeddings for English words and embeddings for ChinesewordsHowever when we know that a Chinese and English word have a similarmeaning we add a constraint that the word embeddings vChineseWord andvEnglishWord should be closeWe have only a small amount of labelled lsquosimilarrsquo Chinese-English words(these are the green border boxes in the above they are standard translationsof the corresponding Chinese character)We can visualise in 2D (using t-SNE) the embedding vectors See Socher(2013)
Word Embeddings Constrained Embeddings
Recursive Nets and Embeddings
Stanford Sentiment Treebank Consists of parsed sentences with sentiment labels(minusminusminus 0+++) for each node (phrase) in the tree 215000 labelled phrases(obtained from three humans)
Recursive Nets and Embeddings
Idea is to recursively combine embeddings such that they accurately predictthe sentiment at each node
Recursive Nets and EmbeddingsTraining
We have a softmax classifier for each node in the tree to predict thesentiment of the phrase beneath this node in the tree
The weights of this classifier are shared across all nodes
At the leaf nodes at the bottom of the tree the inputs to the classifiers arethe word embeddings
The embeddings are combined by another network g with commonparameters which forms the input to the sentiment classifier
We then learn all the embeddings shared classifier parameters and sharedcombination parameters to maximise the classification accuracy
Prediction
For a new movie review the review is first parsed using a standard grammartree parser
This forms the tree which can be used to recursively form the sentiment classlabel for the review
Currently the best sentiment classifier Socher (2013)
Recursive Nets and Embeddingsotilde otilde
eth
eth
Icircplusmnsup1raquoreg
eth
Uumlplusmnfrac14sup1raquoreg
otilde
otilde
eth
middotshy
otilde
eth
plusmnsup2raquo
otilde
eth
plusmnordm
otilde
otilde
eth
notcedilraquo
otilde
otilde
eth
sup3plusmnshynot
otilde
frac12plusmnsup3degraquoacuteacutemiddotsup2sup1
eth
ordfiquestregmiddotiquestnotmiddotplusmnsup2shy
eth
eth
plusmnsup2
eth
eth
notcedilmiddotshy
eth
notcedilraquosup3raquo
eth
ograve
yen
eth
eth
Icircplusmnsup1raquoreg
eth
Uumlplusmnfrac14sup1raquoreg
yen
yen
eth
middotshy
yen
eth
plusmnsup2raquo
yen
eth
plusmnordm
yen
yen
eth
notcedilraquo
yen
yen
yen
acuteraquoiquestshynot
otilde
frac12plusmnsup3degraquoacuteacutemiddotsup2sup1
eth
ordfiquestregmiddotiquestnotmiddotplusmnsup2shy
eth
eth
plusmnsup2
eth
eth
notcedilmiddotshy
eth
notcedilraquosup3raquo
eth
ograve
otilde
eth
times
otilde
otilde
otilde
acutemiddotmicroraquofrac14
eth
eth
eth
raquoordfraquoregsect
eth
eth
shymiddotsup2sup1acuteraquo
eth
sup3middotsup2laquonotraquo
eth
eth
plusmnordm
eth
eth
notcedilmiddotshy
eth
eth
ograve
yen
eth
times
yen
yen
eth
eth
frac14middotfrac14
eth
sup2ugravenot
eth
eth
acutemiddotmicroraquo
eth
eth
eth
iquest
eth
eth
shymiddotsup2sup1acuteraquo
eth
sup3middotsup2laquonotraquo
eth
eth
plusmnordm
eth
eth
notcedilmiddotshy
eth
eth
ograve
yen
eth
timesnot
yen
yen
eth
eth
ugraveshy
eth
paralaquoshynot
yen
otilde
middotsup2frac12regraquofrac14middotfrac34acutesect
yen yen
frac14laquoacuteacute
eth
ograve
eth
eth
timesnot
eth
eth
eth
eth
eth
ugraveshy
otilde
yen
sup2plusmnnot
yen yen
frac14laquoacuteacute
eth
ograve
Uacutemiddotsup1laquoregraquo ccedilaelig IcircOgraveIgraveOgrave degregraquofrac14middotfrac12notmiddotplusmnsup2 plusmnordm degplusmnshymiddotnotmiddotordfraquo iquestsup2frac14 sup2raquosup1iquestnotmiddotordfraquo oslashfrac34plusmnnotnotplusmnsup3 regmiddotsup1cedilnotdivide shyraquosup2notraquosup2frac12raquoshy iquestsup2frac14 notcedilraquomiddotreg sup2raquosup1iquestnotmiddotplusmnsup2ograve
Recurrent Nets
x1 x2 x3
h1 h2 h3
y1 y2 y3
A A A
C C C
B B
RNNs are used in timeseries applications
The basic idea is that the hidden units at time ht (and possibly output yt)depend on the previous state of the network htminus1 xtminus1 ytminus1 for inputs xt andoutputs yt
In the above network I lsquounrolled the net through timersquo to give a standard NNdiagram
I omitted the potential links from xtminus1 ytminus1 to ht
Handwriting Generation using a RNN
Some training examples
Handwriting Generation using a RNN
Some generated examples Top line is real handwriting for comparison See AlexGraversquos work
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reasons research in deep learning has exploded
Much greater compute power (GPU)
Much larger datasets
AutoDiff
What is AutoDiff
AutoDiff takes a function f(x) and returns an exact value (up to machineaccuracy) for the gradient
gi(x) equivpart
partxif
∥∥∥∥x
Note that this is not the same as a numerical approximation (such as centraldifferences) for the gradient
One can show that if done efficiently one can always calculate the gradient inless than 5 times the time it takes to compute f(x)
Reverse DifferentiationA useful graphical representation is that the total derivative of f with respect to xis given by the sum over all path values from x to f where each path value is theproduct of the partial derivatives of the functions on the edges
df
dx=partf
partx+partf
partg
dg
dx
x
f
gpartfpartx
dgdx
partfpartg
Example
For f(x) = x2 + xgh where g =x2 and h = xg2
x
f
gh2x+ gh
2x
xh
2gx
xg
g2
f prime(x) = (2x+ gh) + (g2xg) + (2x2gxxg) + (2xxh) = 2x+ 8x7
Reverse DifferentiationConsider
f(x1 x2) = cos (sin(x1x2))
We can represent this computationally using an Abstract Syntax Tree (AST)
x1 x2
f1
f2
f3
f1(x1 x2) = x1x2
f2(x) = sin(x)
f3(x) = cos(x)
Given values for x1 x2 we first run forwards through the tree so that we canassociate each node with an actual function value
Reverse Differentiation
x1 x2
f1
f2
f3
df3dx1
=partf3partf2
df2dx1
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx1
Similarly
df3dx2
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx2
The two derivatives share the same computation branch andwe want to exploit this
Reverse Differentiation
x1 x2
f1
f2
f3
partf1partx1
= x2partf1partx2
= x1
partf2partf1
= cos(f1)
partf3partf2
= minus sin(f2)
1 Find the reverse ancestral (backwards) scheduleof nodes (f3 f2 f1 x1 x2)
2 Start with the first node n1 in the reverseschedule and define tn1 = 1
3 For the next node n in the reverse schedule findthe child nodes ch (n) Then define
tn =sum
cisinch(n)
partfcpartfn
tc
4 The total derivatives of f with respect to theroot nodes of the tree (here x1 and x2) are givenby the values of t at those nodes
This is a general procedure that can be used to automatically define a subroutineto efficiently compute the gradient It is efficient because information is collectedat nodes in the tree and split between parents only when required
Limitations of forward reasoning
World Representation
Recognising patterns (perceptron style) is only one form of intelligence
Solving chess problems is another and requires complex reasoning using someform of internal model
The world is noisy and information may be conflicting
Recognised that new approaches are required
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Limitations of forward reasoning
World Representation
Models help us to fantasise about the world
Models
Models
Burglar Problem
Creaks and Bumps
Creak Bump
Burglar Model
pos1 pos2 pos3 pos4
snd1 snd2 snd3 snd4
pos - position in kitchensnd ndash sound
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Stubby Fingers
Stubby Fingers
int1 int2 int3 int4
hit1 hit2 hit3 hit4
int - intended keyhit ndash hit key
Stubby Fingers errors
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz
005
01
015
02
025
03
035
04
045
05
055
Stubby Fingers language
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz 0
01
02
03
04
05
06
07
08
09
Stubby Fingers
Given the typed sequence cwsykcak what is the most likely word that thiscorresponds to
List the 200 most likely hidden sequences
Discard those that are not in a standard English dictionary
Take the most likely proper English word as the intended typed word
Speech Recognition raw signal
0 01 02 03 04 05 06 07 08 09minus02
minus015
minus01
minus005
0
005
01
015
02
025
03
Time
lsquoneuralrsquo representation
10 20 30 40 50 60 70 80
5
10
15
20
25
Speech Recognition
pho1 pho2 pho3 pho4
aud1 aud2 aud3 aud4
pho phoneme (letter)aud audio signal (neural representation)
Medical Diagnosis
tumour flu meningitis
headache fever appetite x-ray
Combine known medical knowledge with patient specific information
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems such as the IsingModel (1920) and in AI applications such as the HMM (Baum 1966Stratonovich 1960)
The need for structure
We often want to make a probabilistic description of many objects (electronspins neurons customers etc )
Typically the representational and computational cost of probabilistic modelsgrows exponentially with the number of objects represented
Without introducing strong structural limitations about how these objects caninteract probability is a non-starter
For this reason computationally lsquosimplerrsquo alternatives (such as fuzzy logic)were introduced to try to avoid some of these difficulties ndash however these aretypically frowed on by purists
Graphical Models
We can use graphs to represent how objects can probabilistically interact witheach other
Graphical Models and then a marriage between Graph and Probability theory
Many of the quantities that we would like to compute in a probabilitydistribution can then be related to operations on the graph
The computational complexity of operations can often be related to thestructure of the graph
Graphical Models are now used as a standard framework in EngineeringStatistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty and aretherefore widely applicable
Uses in Industry
Microsoft used to estimate the skill distribution of players in online games(the worlds largest graphical model)
Hospitals use Belief Nets to encode knowledge about diseases and symptomsto aid medical diagnosis
Google Microsoft Facebook used in many places including advertisingvideo game prediction speech recognition
Used to estimate inherent desirability of products in consumer retail
Microsoft and others Attempt to go beyond simple AB testing by usesGraphical Models to model the whole companyuser relationship
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or more shortly theprobability of x given y) is defined as
p(x|y) equiv p(x y)
p(y)=p(y|x)p(x)
p(y)(Bayesrsquo rule)
Throwing darts
p(region 5|not region 20) =p(region 5 not region 20)
p(not region 20)
=p(region 5)
p(not region 20)=
120
1920=
1
19
Interpretationp(A = a|B = b) should not be interpreted as lsquoGiven the event B = b has occurredp(A = a|B = b) is the probability of the event A = a occurringrsquo The correctinterpretation should be lsquop(A = a|B = b) is the probability of A being in state aunder the constraint that B is in state brsquo
Battleships
Assume there are 2 ships 1 vertical (ship 1) and 1 horizontal (ship 2) of 5pixels each
Can be placed anywhere on the 10times10 grid but cannot overlap
Let s1 is the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query lsquohitrsquo or lsquomissrsquo responses
p(s1 s2|D) =p(D|s1 s2)p(s1 s2)
p(D)Let X be the matrix of pixel occupancy
p(X|D) =sums1s2
p(X s1 s2|D) =sums1s2
p(X|s1 s2)p(s1 s2|D)
demoBattleshipsm
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Artificial Neuron (Perceptron)
weight 7
output neuron
neuron 1neuron 2neuron 3neuron 4
neuron 7neuron 6neuron 5
inputs
weight 1
Training an artificial neural network
Want to generalise to new images with high accuracy
Artificial Network
1957 Rosenblattrsquos perceptron
perceptron film clip
Connectionism
1960 Realised a perceptron can only solve simple tasks
1970 Decline in interest
1980 New computing power made training multilayer networks feasible
outputinputs
Each node (or lsquoneuronrsquo) computes a function of a weighted combination ofparental nodes hj = σ(
sumi wijhi)
Neural Networks and Deep LearningHistorical Problems with Neural Nets (1990s)
NNs are difficult to train (many local optima)
Particularly difficult to train a NN with a large number of layers (say largerthan around 10)
lsquoGradient Diffusion Problemrsquo ndash difficult to assign responsibility of errors toindividual lsquoneuronsrsquo
Machine Learning (up to 2006)
A large section of the machine learning community abandoned NNs
More principled and computationally better understood techniques (SVMsand related convex methods) replaced them
Bayesian AI (1990s onwards)
From mid 1990s there was a realisation that pattern recognition is notsufficient for all AI purposes
Uncertainty and reasoning are not naturally representable using standardfeed-forward nets
Explosion in more lsquosymbolicrsquo Bayesian AI
Deep Learning
NNs have resurged in interest in the last few years (Hinton Bengio )
Also called lsquodeep learningrsquo
Sense that very complex tasks (object recognition learning complex structurein data) requires going beyond simple (convex) statistical techniques
The brain uses hierarchical distributed processing and it is likely to be for agood reason
Many problems have a hierarchical structure images are made of partslanguage is hierarchical etc
Why now
New computing resources (GPU processing)
Availability of large amount of data means that we can train nets with manyparameters (1010)
Recent evidence suggests local optima are not particularly problematic
Autoencoder
y1 y2 y3 y4 y5
h1 h2 h3
h4 h5
y1 y2 y3 y4 y5
h6 h7 h8
The bottleneck forces the network to try to find a low dimensionalrepresentation of the data
Useful for unsupervised learning
Autoencoder on MNIST digits (Hinton 2006 Science)
Figure Reconstructions using H = 30 components From the Top Original imageAutoencoder1 Autoencoder2 PCA
60000 training images (28times 28 = 784 pixels)
Use a form of autoencoder to find a lower (30) dimensional representation
At the time the special layerwise training procedure was consideredfundamental to the success of this approach Now not deemed necessaryprovided we use a sensible initialisation
Google Cats
10 Million Youtube video frames (200x200 pixel images)
Use a specialised autoencoder with 9 layers (1 billion weights)
2000 computers + two weeks of computing
Examine units to see what images they most respond to
Google Autoencoder
From Nando De Freitas
Convolutional NNs
CNNs are particularly popular in image processing
Often the feature maps correspond (not to macro features such as bicycles)but micro features
For example in handwritten digit recognition they correspond to smallconstituent parts of the digits
These are used then to process the image into a representation that is betterfor recognition
NNs in NLP
Bag of Words
We have D words in a dictionary aardvark zorro so that we can relateeach word with its dictionary index
We can also think of this as a Euclidean embedding e
aardvarkrarr eaardvark =
100
zorrorarr ezorro =
001
Word Embeddings
Idea is to replace the Euclidean embeddings e with embeddings (vectors) vthat are learned
Objective is for example next word prediction accuracy
These are often called lsquoneural language modelsrsquo
NNs in NLP
Each word w in the dictionary has an associated embedding vector vwUsually around 200 dimensional vectors are used
Consider the sentence
the cat sat on the mat
and that we wish to predict the word on given the two preceding cat sat
and two succeeding words the mat
We can use a network that has inputs vcat vsat vthe vmat
The output of the network is a probability over all words in the dictionaryp(w| vinputs)We want p(w = on|vcatvsatvthevmat) to be high
The overall objective is then to learn all the word embeddings and networkparameters subject to predicting the word correctly based on the context
Word Embeddings
Given a word (France, for example) we can find which words w have embedding vectors closest to v_France. From Ronan Collobert (2011)
Word Embeddings
There appears to be a natural 'geometry' to the embeddings. For example, there are directions that correspond to gender:
v_woman − v_man ≈ v_aunt − v_uncle
v_woman − v_man ≈ v_queen − v_king
From Mikolov (2013)
Word Embeddings Analogies
Given a relationship France–Paris, we get the 'relationship' embedding
v = v_Paris − v_France
Given Italy, we can calculate v_Italy + v and find the word in the dictionary which has the closest embedding to this (it turns out to be Rome). From Mikolov (2013)
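A sketch of this analogy lookup with toy 2-dimensional vectors (the numbers are invented purely for illustration):

```python
import numpy as np

# Toy embedding table; real models learn ~200-dimensional vectors.
emb = {
    "France": np.array([1.0, 0.0]), "Paris": np.array([1.0, 1.0]),
    "Italy":  np.array([0.0, 0.2]), "Rome":  np.array([0.1, 1.1]),
}

def closest(query, exclude):
    """Word whose embedding has the highest cosine similarity to `query`."""
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
    return max((w for w in emb if w not in exclude),
               key=lambda w: cos(emb[w], query))

v = emb["Paris"] - emb["France"]                 # the 'capital-of' direction
print(closest(emb["Italy"] + v, exclude={"Italy", "Paris", "France"}))  # Rome
```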
Word Embeddings Constrained Embeddings
We can learn embeddings for English words and embeddings for Chinese words. However, when we know that a Chinese and an English word have a similar meaning, we add a constraint that the word embeddings v_ChineseWord and v_EnglishWord should be close. We have only a small amount of labelled 'similar' Chinese–English words (these are the green border boxes in the figure; they are standard translations of the corresponding Chinese character). We can visualise the embedding vectors in 2D (using t-SNE). See Socher (2013)
Word Embeddings Constrained Embeddings
Recursive Nets and Embeddings
Stanford Sentiment Treebank: consists of parsed sentences with sentiment labels (−−, −, 0, +, ++) for each node (phrase) in the tree; 215,000 labelled phrases (obtained from three humans)
Recursive Nets and Embeddings
Idea is to recursively combine embeddings such that they accurately predictthe sentiment at each node
Recursive Nets and Embeddings: Training
We have a softmax classifier for each node in the tree to predict the sentiment of the phrase beneath that node
The weights of this classifier are shared across all nodes
At the leaf nodes at the bottom of the tree, the inputs to the classifiers are the word embeddings
The embeddings are combined by another network g, with common parameters, which forms the input to the sentiment classifier
We then learn all the embeddings, shared classifier parameters and shared combination parameters to maximise the classification accuracy, as in the sketch below
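A minimal sketch of the recursion, assuming a simple concatenate-and-squash combination g (the actual model in Socher (2013) is a more elaborate tensor network; all shapes and weights here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
K, C = 4, 5                        # embedding dim, sentiment classes (-- .. ++)
Wg = rng.normal(size=(K, 2 * K))   # shared combination network g
Ws = rng.normal(size=(C, K))       # shared softmax sentiment classifier

def combine(v_left, v_right):
    """Parent phrase embedding from its two children (shared parameters)."""
    return np.tanh(Wg @ np.concatenate([v_left, v_right]))

def sentiment(v):
    """Softmax class distribution for the phrase embedding v."""
    s = Ws @ v
    e = np.exp(s - s.max())
    return e / e.sum()

v_not, v_dull = rng.normal(size=K), rng.normal(size=K)  # leaf word embeddings
v_phrase = combine(v_not, v_dull)                       # embed 'not dull'
print(sentiment(v_phrase))    # training fits Wg, Ws and the leaf embeddings
```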
Prediction
For a new movie review, the review is first parsed using a standard grammar tree parser
This forms the tree, which can be used to recursively form the sentiment class label for the review
Currently the best sentiment classifier; see Socher (2013)
Recursive Nets and Embeddings

[Figure: RNTN prediction of positive and negative (bottom right) sentences and their negation, e.g. 'Roger Dodger is one of the most compelling variations on this theme.' versus 'Roger Dodger is one of the least compelling variations on this theme.']
Recurrent Nets
[Figure: an RNN unrolled through time, with inputs x1, x2, x3, hidden units h1, h2, h3, outputs y1, y2, y3 and shared weight matrices A, B, C.]
RNNs are used in timeseries applications
The basic idea is that the hidden units h_t at time t (and possibly the output y_t) depend on the previous state of the network h_{t−1}, x_{t−1}, y_{t−1}, for inputs x_t and outputs y_t
In the above network I 'unrolled the net through time' to give a standard NN diagram
I omitted the potential links from x_{t−1}, y_{t−1} to h_t
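A minimal numpy sketch of one common parameterisation, matching the A, B, C labels in the diagram (the sizes and the tanh nonlinearity are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
Dx, Dh, Dy = 3, 5, 2                    # input, hidden, output sizes
A = rng.normal(size=(Dh, Dx)) * 0.1     # input -> hidden
B = rng.normal(size=(Dh, Dh)) * 0.1     # hidden -> hidden (the recurrence)
C = rng.normal(size=(Dy, Dh)) * 0.1     # hidden -> output

def rnn(xs):
    """Unroll h_t = tanh(A x_t + B h_{t-1}), y_t = C h_t through time."""
    h = np.zeros(Dh)
    ys = []
    for x in xs:
        h = np.tanh(A @ x + B @ h)
        ys.append(C @ h)
    return ys

print(rnn([rng.normal(size=Dx) for _ in range(4)]))
```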
Handwriting Generation using a RNN
Some training examples
Handwriting Generation using a RNN
Some generated examples. The top line is real handwriting, for comparison. See Alex Graves's work
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reasons research in deep learning has exploded
Much greater compute power (GPU)
Much larger datasets
AutoDiff
What is AutoDiff?
AutoDiff takes a function f(x) and returns an exact value (up to machine accuracy) for the gradient

g_i(x) ≡ ∂f/∂x_i |_x
Note that this is not the same as a numerical approximation (such as central differences) for the gradient
One can show that, if done efficiently, one can always calculate the gradient in less than 5 times the time it takes to compute f(x)
Reverse Differentiation

A useful graphical representation: the total derivative of f with respect to x is given by the sum over all path values from x to f, where each path value is the product of the partial derivatives of the functions on the edges:

df/dx = ∂f/∂x + (∂f/∂g)(dg/dx)

[Diagram: node x connects directly to f with edge label ∂f/∂x, and via g with edge labels dg/dx (x → g) and ∂f/∂g (g → f).]
Example

For f(x) = x² + xgh, where g = x² and h = xg²:

[Diagram: edge labels x → f: 2x + gh; x → g: 2x; x → h: g²; g → h: 2gx; g → f: xh; h → f: xg.]

f′(x) = (2x + gh) + (g² · xg) + (2x · 2gx · xg) + (2x · xh) = 2x + 8x⁷
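A quick numerical sanity check of this result, comparing central differences against the path-sum answer:

```python
# Check f'(x) = 2x + 8x^7 for f(x) = x^2 + xgh, i.e. f(x) = x^2 + x^8
# (substituting g = x^2 and h = x g^2 = x^5).
def f(x):
    g = x ** 2
    h = x * g ** 2
    return x ** 2 + x * g * h

x, eps = 1.3, 1e-6
numeric = (f(x + eps) - f(x - eps)) / (2 * eps)   # central differences
exact = 2 * x + 8 * x ** 7
print(numeric, exact)    # agree to around 1e-5
```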
Reverse Differentiation

Consider

f(x1, x2) = cos(sin(x1 x2))

We can represent this computationally using an Abstract Syntax Tree (AST):

[Diagram: leaves x1, x2 feed f1; f1 feeds f2; f2 feeds f3.]

f1(x1, x2) = x1 x2
f2(x) = sin(x)
f3(x) = cos(x)
Given values for x1, x2, we first run forwards through the tree so that we can associate each node with an actual function value
Reverse Differentiation

df3/dx1 = (∂f3/∂f2)(df2/dx1) = (∂f3/∂f2)(df2/df1) · df1/dx1

Similarly,

df3/dx2 = (∂f3/∂f2)(df2/df1) · df1/dx2

The shared factor (∂f3/∂f2)(df2/df1) = df3/df1 means the two derivatives share the same computation branch, and we want to exploit this
Reverse Differentiation

The local derivatives on the edges of the tree are

∂f1/∂x1 = x2,  ∂f1/∂x2 = x1
∂f2/∂f1 = cos(f1)
∂f3/∂f2 = −sin(f2)
1. Find the reverse ancestral (backwards) schedule of nodes: (f3, f2, f1, x1, x2).
2. Start with the first node n1 in the reverse schedule and define t_{n1} = 1.
3. For the next node n in the reverse schedule, find the child nodes ch(n). Then define
   t_n = Σ_{c ∈ ch(n)} (∂f_c/∂f_n) t_c
4. The total derivatives of f with respect to the root nodes of the tree (here x1 and x2) are given by the values of t at those nodes.
This is a general procedure that can be used to automatically define a subroutine to efficiently compute the gradient. It is efficient because information is collected at nodes in the tree and split between parents only when required
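A hand-unrolled sketch of the four steps for f(x1, x2) = cos(sin(x1 x2)); a general implementation would build the schedule from the AST, but here it is written out explicitly:

```python
import math

# Forward pass: associate every node with its value.
x1, x2 = 0.7, -1.2
f1 = x1 * x2
f2 = math.sin(f1)
f3 = math.cos(f2)

# Local partial derivatives on each edge of the tree.
d = {
    ("f3", "f2"): -math.sin(f2),
    ("f2", "f1"): math.cos(f1),
    ("f1", "x1"): x2,
    ("f1", "x2"): x1,
}

# Reverse schedule (f3, f2, f1, x1, x2): t_n sums contributions of children.
t = {"f3": 1.0}
t["f2"] = d[("f3", "f2")] * t["f3"]
t["f1"] = d[("f2", "f1")] * t["f2"]
t["x1"] = d[("f1", "x1")] * t["f1"]   # = df3/dx1
t["x2"] = d[("f1", "x2")] * t["f1"]   # = df3/dx2

# Check against the closed form -sin(sin(x1 x2)) cos(x1 x2) * {x2, x1}.
print(t["x1"], -math.sin(math.sin(x1 * x2)) * math.cos(x1 * x2) * x2)
print(t["x2"], -math.sin(math.sin(x1 * x2)) * math.cos(x1 * x2) * x1)
```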
Limitations of forward reasoning
World Representation
Recognising patterns (perceptron style) is only one form of intelligence
Solving chess problems is another, and requires complex reasoning using some form of internal model
The world is noisy and information may be conflicting
Recognised that new approaches are required
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Limitations of forward reasoning
World Representation
Models help us to fantasise about the world
Models
Burglar Problem
Creaks and Bumps
Creak Bump
Burglar Model
pos1 pos2 pos3 pos4
snd1 snd2 snd3 snd4
pos – position in kitchen; snd – sound
Finding the Burglar

[Figure, shown across three slides: the inferred position of the burglar in the kitchen, updated as the sequence of creak and bump observations unfolds.]
Stubby Fingers
int1 int2 int3 int4
hit1 hit2 hit3 hit4
int – intended key; hit – hit key
Stubby Fingers: errors

[Figure: matrix of p(hit|intended) over the letters a–z; error probabilities range from roughly 0.05 to 0.55.]
Stubby Fingers: language

[Figure: letter-transition matrix of the language model over a–z; probabilities range from 0 to 0.9.]
Stubby Fingers
Given the typed sequence cwsykcak, what is the most likely word that this corresponds to?
List the 200 most likely hidden sequences
Discard those that are not in a standard English dictionary
Take the most likely proper English word as the intended typed word
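The core computation behind this is an HMM decoding recursion. A sketch of the single-best version (the 200-best list generalises the same recursion by keeping N candidates per state; the tables below are illustrative, not the slide's letter model):

```python
import numpy as np

def viterbi(obs, p_init, p_trans, p_emit):
    """Most likely hidden sequence argmax_h p(h_{1:T} | v_{1:T}) for an HMM.

    p_init[h], p_trans[h_t, h_{t-1}] and p_emit[v, h] are probability tables.
    """
    logd = np.log(p_init) + np.log(p_emit[obs[0]])     # delta_1(h)
    back = []
    for v in obs[1:]:
        cand = np.log(p_trans) + logd[None, :]         # [h_t, h_{t-1}]
        back.append(cand.argmax(axis=1))               # best predecessor
        logd = cand.max(axis=1) + np.log(p_emit[v])
    # Backtrack from the best final state.
    h = [int(logd.argmax())]
    for b in reversed(back):
        h.append(int(b[h[-1]]))
    return h[::-1]

# Tiny 2-state, 2-symbol example.
p_init = np.array([0.6, 0.4])
p_trans = np.array([[0.7, 0.3], [0.3, 0.7]])           # p(h_t | h_{t-1})
p_emit = np.array([[0.9, 0.2], [0.1, 0.8]])            # p(v | h)
print(viterbi([0, 0, 1, 0], p_init, p_trans, p_emit))
```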
Speech Recognition: raw signal

[Figure: raw audio waveform; amplitude roughly −0.2 to 0.3 over 0 to 0.9 seconds.]
'neural' representation

[Figure: time–frequency ('neural') representation of the signal, around 25 channels over 80 frames.]
Speech Recognition
pho1 pho2 pho3 pho4
aud1 aud2 aud3 aud4
pho – phoneme (letter); aud – audio signal (neural representation)
Medical Diagnosis
tumour flu meningitis
headache fever appetite x-ray
Combine known medical knowledge with patient-specific information
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability?
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems, such as the Ising Model (1920), and in AI applications such as the HMM (Baum 1966; Stratonovich 1960)
The need for structure
We often want to make a probabilistic description of many objects (electron spins, neurons, customers, etc.)
Typically the representational and computational cost of probabilistic modelsgrows exponentially with the number of objects represented
Without introducing strong structural limitations about how these objects can interact, probability is a non-starter
For this reason, computationally 'simpler' alternatives (such as fuzzy logic) were introduced to try to avoid some of these difficulties – however, these are typically frowned upon by purists
Graphical Models
We can use graphs to represent how objects can probabilistically interact with each other
Graphical Models are then a marriage between graph theory and probability theory
Many of the quantities that we would like to compute in a probability distribution can then be related to operations on the graph
The computational complexity of operations can often be related to thestructure of the graph
Graphical Models are now used as a standard framework in Engineering, Statistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty and are therefore widely applicable
Uses in Industry
Microsoft: used to estimate the skill distribution of players in online games (the world's largest graphical model)
Hospitals use Belief Nets to encode knowledge about diseases and symptoms to aid medical diagnosis
Google, Microsoft, Facebook: used in many places, including advertising, video game prediction and speech recognition
Used to estimate inherent desirability of products in consumer retail
Microsoft and others: attempts to go beyond simple A/B testing by using Graphical Models to model the whole company–user relationship
Conditional Probability and Bayes' Rule
The probability of event x conditioned on knowing event y (or, more shortly, the probability of x given y) is defined as

p(x|y) ≡ p(x, y)/p(y) = p(y|x) p(x)/p(y)    (Bayes' rule)
Throwing darts

p(region 5 | not region 20) = p(region 5, not region 20) / p(not region 20)
                            = p(region 5) / p(not region 20)
                            = (1/20) / (19/20)
                            = 1/19
Interpretation

p(A = a|B = b) should not be interpreted as 'Given the event B = b has occurred, p(A = a|B = b) is the probability of the event A = a occurring'. The correct interpretation should be 'p(A = a|B = b) is the probability of A being in state a under the constraint that B is in state b'.
Battleships
Assume there are 2 ships, 1 vertical (ship 1) and 1 horizontal (ship 2), of 5 pixels each
Can be placed anywhere on the 10 × 10 grid, but cannot overlap
Let s1 be the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query 'hit' or 'miss' responses
p(s1, s2|D) = p(D|s1, s2) p(s1, s2) / p(D)

Let X be the matrix of pixel occupancy:

p(X|D) = Σ_{s1,s2} p(X, s1, s2|D) = Σ_{s1,s2} p(X|s1, s2) p(s1, s2|D)
demoBattleships.m
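A brute-force sketch of this computation, assuming noise-free hit/miss responses (so the likelihood of the data is 1 for consistent ship placements and 0 otherwise); this is an illustration, not the referenced demo:

```python
import numpy as np

G, L = 10, 5                       # grid size, ship length

def cells(s, vertical):
    r, c = s
    return [(r + i, c) if vertical else (r, c + i) for i in range(L)]

def placements(vertical):
    lim = G - L + 1
    return [(r, c) for r in range(lim if vertical else G)
                   for c in range(G if vertical else lim)]

D = {(4, 4): True, (0, 0): False}  # query pixel -> hit?

# Uniform prior over non-overlapping placements, zeroed where inconsistent.
post = {}
for s1 in placements(True):
    for s2 in placements(False):
        occ = set(cells(s1, True)) | set(cells(s2, False))
        if len(occ) < 2 * L:                       # ships overlap: prior 0
            continue
        if all((q in occ) == hit for q, hit in D.items()):
            post[(s1, s2)] = 1.0                   # consistent with the data
Z = sum(post.values())

# Marginal occupancy p(X_ij = occupied | D) by summing over (s1, s2).
X = np.zeros((G, G))
for (s1, s2), w in post.items():
    for cell in set(cells(s1, True)) | set(cells(s2, False)):
        X[cell] += w / Z
print(X.round(2))
```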
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated the conditional probability of the node given its parents.
The joint distribution is obtained by taking the product of the conditional probabilities:
p(A, B, C, D, E) = p(A) p(B) p(C|A, B) p(D|C) p(E|B, C)
[Diagram: DAG with edges A → C, B → C, C → D, B → E, C → E.]
Example – Part I

Sally's burglar Alarm is sounding. Has she been Burgled, or was the alarm triggered by an Earthquake? She turns the car Radio on for news of earthquakes.
Choosing an ordering

Without loss of generality, we can write

p(A, R, E, B) = p(A|R, E, B) p(R, E, B)
             = p(A|R, E, B) p(R|E, B) p(E, B)
             = p(A|R, E, B) p(R|E, B) p(E|B) p(B)
Assumptions
The alarm is not directly influenced by any report on the radio: p(A|R, E, B) = p(A|E, B)
The radio broadcast is not directly influenced by the burglar variable: p(R|E, B) = p(R|E)
Burglaries don't directly 'cause' earthquakes: p(E|B) = p(E)
Therefore
p(A, R, E, B) = p(A|E, B) p(R|E) p(E) p(B)
Example – Part II: Specifying the Tables
[Diagram: B → A ← E, with E → R.]
p(A|B, E):

Alarm = 1   Burglar   Earthquake
0.9999      1         1
0.99        1         0
0.99        0         1
0.0001      0         0

p(R|E):

Radio = 1   Earthquake
1           1
0           0

The remaining tables are p(B = 1) = 0.01 and p(E = 1) = 0.000001. The tables and graphical structure fully specify the distribution.
Example – Part III: Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) = Σ_{E,R} p(B = 1, E, A = 1, R) / Σ_{B,E,R} p(B, E, A = 1, R)
             = Σ_{E,R} p(A = 1|B = 1, E) p(B = 1) p(E) p(R|E) / Σ_{B,E,R} p(A = 1|B, E) p(B) p(E) p(R|E) ≈ 0.99
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1, R = 1) ≈ 0.01
Initially, because the alarm sounds, Sally thinks that she's been burgled. However, this probability drops dramatically when she hears that there has been an earthquake
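These numbers can be reproduced by brute-force summation over the joint distribution; a sketch:

```python
pB, pE = 0.01, 1e-6
pA1 = {(1, 1): 0.9999, (1, 0): 0.99, (0, 1): 0.99, (0, 0): 0.0001}  # p(A=1|B,E)
pR1 = {1: 1.0, 0: 0.0}                                              # p(R=1|E)

def joint(b, e, a, r):
    """p(A,R,E,B) = p(A|E,B) p(R|E) p(E) p(B) for binary states."""
    p = (pB if b else 1 - pB) * (pE if e else 1 - pE)
    p *= pA1[(b, e)] if a else 1 - pA1[(b, e)]
    p *= pR1[e] if r else 1 - pR1[e]
    return p

def p_burglar_given(a, r=None):
    """p(B=1 | A=a[, R=r]) by summing the joint over the hidden variables."""
    rs = [0, 1] if r is None else [r]
    num = sum(joint(1, e, a, rr) for e in (0, 1) for rr in rs)
    den = sum(joint(b, e, a, rr) for b in (0, 1) for e in (0, 1) for rr in rs)
    return num / den

print(p_burglar_given(a=1))       # ~0.99
print(p_burglar_given(a=1, r=1))  # ~0.01
```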
Markov Models
For timeseries data v_1, …, v_T we need a model p(v_{1:T}). For causal consistency it is meaningful to consider the decomposition

p(v_{1:T}) = ∏_{t=1}^{T} p(v_t|v_{1:t−1})

with the convention p(v_t|v_{1:t−1}) = p(v_1) for t = 1.
v1 v2 v3 v4
Independence assumptions

It is often natural to assume that the influence of the immediate past is more relevant than the remote past; in Markov models only a limited number of previous observations are required to predict the future.
Markov Chain
Only the recent past is relevant:

p(v_t|v_1, …, v_{t−1}) = p(v_t|v_{t−L}, …, v_{t−1})

where L ≥ 1 is the order of the Markov chain. For a first order chain (L = 1),

p(v_{1:T}) = p(v_1) p(v_2|v_1) p(v_3|v_2) ⋯ p(v_T|v_{T−1})

For a stationary Markov chain the transitions p(v_t = s′|v_{t−1} = s) = f(s′, s) are time-independent ('homogeneous').
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure: (a) First order Markov chain. (b) Second order Markov chain.
Markov Chains
v1 v2 v3 v4
p(v_1, …, v_T) = p(v_1) ∏_{t=2}^{T} p(v_t|v_{t−1}), with p(v_1) the initial distribution and p(v_t|v_{t−1}) the transition
State transition diagram

Nodes represent states of the variable v, and arcs non-zero elements of the transition p(v_t|v_{t−1})
[Diagram: state transition graph on states 1–9.]
Most probable and shortest paths
[Diagram: the same state transition graph on states 1–9.]
The shortest (unweighted) path from state 1 to state 7 is 1–2–7
The most probable path from state 1 to state 7 is 1–8–9–7 (assuming uniform transition probabilities). The latter path is longer but more probable, since for the path 1–2–7 the probability of exiting state 2 into state 7 is 1/5
Equilibrium distribution
It is interesting to know how the marginal p(x_t) evolves through time:

p(x_t = i) = Σ_j p(x_t = i|x_{t−1} = j) p(x_{t−1} = j) ≡ Σ_j M_ij p(x_{t−1} = j)
p(x_t = i) is the frequency with which we visit state i at time t, given we started from p(x_1) and randomly drew samples from the transition p(x_τ|x_{τ−1}). As we repeatedly sample a new state from the chain, the distribution at time t for an initial distribution p_1(i) is

p_t = M^{t−1} p_1

If, for t → ∞, p_∞ is independent of the initial distribution p_1, then p_∞ is called the equilibrium distribution of the chain:

p_∞ = M p_∞

The equilibrium distribution is proportional to the eigenvector with unit eigenvalue of the transition matrix.
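A small sketch: iterating p_t = M p_{t−1} and comparing with the unit-eigenvalue eigenvector (the 2-state transition matrix is invented for illustration):

```python
import numpy as np

M = np.array([[0.9, 0.5],
              [0.1, 0.5]])        # columns are p(x_t = i | x_{t-1} = j)

p = np.array([1.0, 0.0])          # some initial distribution p_1
for _ in range(100):
    p = M @ p                     # p_t = M p_{t-1}
print(p)                          # -> [0.833..., 0.166...]

vals, vecs = np.linalg.eig(M)
v = vecs[:, np.argmax(vals.real)].real
print(v / v.sum())                # same answer from the unit eigenvector
```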
PageRank
Define the matrix

A_ij = 1 if website j has a hyperlink to website i, and 0 otherwise.

From this we can define a Markov transition matrix with elements

M_ij = A_ij / Σ_{i′} A_{i′j}
If we jump from website to website, the equilibrium distribution component p_∞(i) is the relative number of times we will visit website i. This has a natural interpretation as the 'importance' of website i
For each website i, a list of words associated with that website is collected. After doing this for all websites, one can make an 'inverse' list of which websites contain word w. When a user searches for word w, the list of websites that contain that word is then returned, ranked according to the importance of the site
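A toy PageRank sketch on an invented 4-site web, building M from A and power-iterating to the equilibrium distribution:

```python
import numpy as np

# A[i, j] = 1 if site j links to site i (links invented for illustration).
A = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 1, 0, 0],
              [0, 0, 1, 0]], dtype=float)

M = A / A.sum(axis=0, keepdims=True)   # M_ij = A_ij / sum_i' A_i'j

p = np.full(4, 0.25)                   # start uniform, iterate p <- M p
for _ in range(200):
    p = M @ p
print(p)   # equilibrium distribution = 'importance' of each site
```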
Hidden Markov Models
The HMM defines a Markov chain on hidden (or 'latent') variables h_{1:T}. The observed (or 'visible') variables are dependent on the hidden variables through an emission p(v_t|h_t). This defines a joint distribution

p(h_{1:T}, v_{1:T}) = p(v_1|h_1) p(h_1) ∏_{t=2}^{T} p(v_t|h_t) p(h_t|h_{t−1})

For a stationary HMM the transition p(h_t|h_{t−1}) and emission p(v_t|h_t) distributions are constant through time.
v1 v2 v3 v4
h1 h2 h3 h4

Figure: A first order hidden Markov model with 'hidden' variables dom(h_t) = {1, …, H}, t = 1, …, T. The 'visible' variables v_t can be either discrete or continuous.
The classical inference problems
Filtering (inferring the present): p(h_t|v_{1:t})
Prediction (inferring the future): p(h_t|v_{1:s}), t > s
Smoothing (inferring the past): p(h_t|v_{1:u}), t < u
Likelihood: p(v_{1:T})
Most likely path (Viterbi alignment): argmax_{h_{1:T}} p(h_{1:T}|v_{1:T})

For prediction, one is also often interested in p(v_t|v_{1:s}) for t > s
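A sketch of filtering via the standard alpha recursion (the tables are invented for illustration); smoothing and Viterbi have the same linear-in-T structure:

```python
import numpy as np

def filtering(obs, p_init, p_trans, p_emit):
    """Filtered posteriors p(h_t | v_{1:t}) for all t."""
    a = p_emit[obs[0]] * p_init              # alpha_1(h) ∝ p(v_1|h) p(h_1)
    out = [a / a.sum()]
    for v in obs[1:]:
        a = p_emit[v] * (p_trans @ a)        # propagate, then correct
        out.append(a / a.sum())
    return out

p_init = np.array([0.5, 0.5])
p_trans = np.array([[0.8, 0.4], [0.2, 0.6]])   # p(h_t | h_{t-1}) in columns
p_emit = np.array([[0.9, 0.3], [0.1, 0.7]])    # p(v | h), rows indexed by v
print(filtering([0, 1, 1], p_init, p_trans, p_emit))
```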
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering, Smoothing and Viterbi are all computationally efficient, scaling linearly with the length of the timeseries (but quadratically with the number of hidden states)
The algorithms are variants of 'message passing on factor graphs'
The algorithms are guaranteed to work if the graph is singly-connected
There has been a huge research effort in the last 15 years to apply message passing for approximate inference in multiply-connected graphs (e.g. low-density parity-check codes)
HMMs for speech recognition
h_t is the phoneme at time t; p(h_t|h_{t−1}) – language model; p(v_t|h_t) – speech signal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently, companies including Google have made big advances in speech recognition
The breakthrough is to model p(v_t|h_t) as a Gaussian whose mean is some function of the phoneme, μ(h_t; θ)
This function is a deep neural network, trained on a large amount of data
There is a goldrush at the moment to find similar breakthrough applications of deep networks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images, for example) can be constructed on the basis of a low dimensional representation
Note that this is a Graphical Model, not a Function
The latent variables h can be sampled using p(h), and then an image sampled from p(v|h). One cannot use an autoencoder to generate new images
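A sketch of that ancestral sampling, with a toy decoder standing in for p(v|h) (the Gaussian forms and sizes are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)
H, V = 2, 4                            # latent and visible dimensions
W = rng.normal(size=(V, H))            # decoder weights (illustrative)

def sample_image(n):
    """Ancestral sampling: h ~ p(h) = N(0, I), then v ~ p(v|h)."""
    h = rng.normal(size=(n, H))                      # sample the latent cause
    mean = np.tanh(h @ W.T)                          # decoder network mean
    return mean + 0.1 * rng.normal(size=(n, V))      # v ~ N(mean, 0.1^2 I)

print(sample_image(3))    # three 'fantasised' observations
```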
The bad news
Inference (computing p(h|v)) and parameter learning are intractable in these models
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method – much faster for inference
Variational Inference

Consider a distribution

p(v|θ) = ∫_h p(v|h, θ) p(h)

and that we wish to learn θ to maximise the probability this model generates observed data:

log p(v|θ) ≥ −∫_h q(h|v, φ) log q(h|v, φ) + ∫_h q(h|v, φ) log p(v|h, θ) + const
The idea is to choose a 'variational' distribution q(h|v, φ) such that we can either calculate the bound analytically or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h, θ) using a deep network
Very popular approach – see the 'variational autoencoder' and also attention mechanisms
Extension to a semi-supervised method using p(v) = ∫_h Σ_c p(v|h, c) p(c) p(h)
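A sketch of the bound for a toy model where everything is one-dimensional and Gaussian, estimating it by sampling from q (reparameterised, as in the variational autoencoder); the exact log p(v|θ) is available here for comparison:

```python
import numpy as np

rng = np.random.default_rng(4)

def elbo_estimate(v, theta, mu, log_sigma, n=1000):
    """Monte-Carlo estimate of the variational bound for the toy model
    p(h) = N(0,1), p(v|h) = N(theta*h, 1), q(h|v) = N(mu, sigma^2)."""
    sigma = np.exp(log_sigma)
    h = mu + sigma * rng.normal(size=n)      # samples from q (reparameterised)
    log_p_v_h = -0.5 * (v - theta * h) ** 2 - 0.5 * np.log(2 * np.pi)
    log_p_h = -0.5 * h ** 2 - 0.5 * np.log(2 * np.pi)
    log_q = -0.5 * ((h - mu) / sigma) ** 2 - np.log(sigma) - 0.5 * np.log(2 * np.pi)
    return np.mean(log_p_v_h + log_p_h - log_q)

# The bound approaches log p(v|theta) as q approaches the true posterior.
v, theta = 1.5, 0.8
print(elbo_estimate(v, theta, mu=0.7, log_sigma=-0.3))
print(-0.5 * v**2 / (1 + theta**2) - 0.5 * np.log(2 * np.pi * (1 + theta**2)))
```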
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games?
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A, we need to decide which action to take in any state of W that will be best for our long-term goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deep generative model)
Then learn which action to take given the low dimensional representation, as sketched below
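The talk leaves the action-learning step unspecified; one standard choice is Q-learning on the encoded state. A toy sketch in which every piece (the encoder, the environment and the rewards) is an invented stand-in:

```python
import numpy as np

rng = np.random.default_rng(5)

n_states, n_actions = 4, 2
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.1, 0.9, 0.1

def encode(screen):
    """Hypothetical encoder mapping raw pixels to a discrete code.
    In deep RL this would be a learned generative/feature model."""
    return int(screen.sum() * 10) % n_states

def step(s, a):
    """Toy environment dynamics and reward (illustrative only)."""
    return (s + a + 1) % n_states, float(s == n_states - 1)

s = encode(rng.random(16))          # 'pixels' -> low-dimensional state
for _ in range(5000):
    a = rng.integers(n_actions) if rng.random() < eps else int(Q[s].argmax())
    s2, r = step(s, a)
    # Q-learning update towards the one-step bootstrapped return.
    Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
    s = s2
print(Q)
```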
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state-of-the-art results in Speech Recognition, Image Analysis and Game Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representation learning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer:
https://reinfer.io
Artificial Neuron (Perceptron)
weight 7
output neuron
neuron 1neuron 2neuron 3neuron 4
neuron 7neuron 6neuron 5
inputs
weight 1
Training an artificial neural network
Want to generalise to new images with high accuracy
Artificial Network
1957 Rosenblattrsquos perceptron
perceptron film clip
Connectionism
1960 Realised a perceptron can only solve simple tasks
1970 Decline in interest
1980 New computing power made training multilayer networks feasible
outputinputs
Each node (or lsquoneuronrsquo) computes a function of a weighted combination ofparental nodes hj = σ(
sumi wijhi)
Neural Networks and Deep LearningHistorical Problems with Neural Nets (1990s)
NNs are difficult to train (many local optima)
Particularly difficult to train a NN with a large number of layers (say largerthan around 10)
lsquoGradient Diffusion Problemrsquo ndash difficult to assign responsibility of errors toindividual lsquoneuronsrsquo
Machine Learning (up to 2006)
A large section of the machine learning community abandoned NNs
More principled and computationally better understood techniques (SVMsand related convex methods) replaced them
Bayesian AI (1990s onwards)
From mid 1990s there was a realisation that pattern recognition is notsufficient for all AI purposes
Uncertainty and reasoning are not naturally representable using standardfeed-forward nets
Explosion in more lsquosymbolicrsquo Bayesian AI
Deep Learning
NNs have resurged in interest in the last few years (Hinton Bengio )
Also called lsquodeep learningrsquo
Sense that very complex tasks (object recognition learning complex structurein data) requires going beyond simple (convex) statistical techniques
The brain uses hierarchical distributed processing and it is likely to be for agood reason
Many problems have a hierarchical structure images are made of partslanguage is hierarchical etc
Why now
New computing resources (GPU processing)
Availability of large amount of data means that we can train nets with manyparameters (1010)
Recent evidence suggests local optima are not particularly problematic
Autoencoder
y1 y2 y3 y4 y5
h1 h2 h3
h4 h5
y1 y2 y3 y4 y5
h6 h7 h8
The bottleneck forces the network to try to find a low dimensionalrepresentation of the data
Useful for unsupervised learning
Autoencoder on MNIST digits (Hinton 2006 Science)
Figure Reconstructions using H = 30 components From the Top Original imageAutoencoder1 Autoencoder2 PCA
60000 training images (28times 28 = 784 pixels)
Use a form of autoencoder to find a lower (30) dimensional representation
At the time the special layerwise training procedure was consideredfundamental to the success of this approach Now not deemed necessaryprovided we use a sensible initialisation
Google Cats
10 Million Youtube video frames (200x200 pixel images)
Use a specialised autoencoder with 9 layers (1 billion weights)
2000 computers + two weeks of computing
Examine units to see what images they most respond to
Google Autoencoder
From Nando De Freitas
Convolutional NNs
CNNs are particularly popular in image processing
Often the feature maps correspond (not to macro features such as bicycles)but micro features
For example in handwritten digit recognition they correspond to smallconstituent parts of the digits
These are used then to process the image into a representation that is betterfor recognition
NNs in NLP
Bag of Words
We have D words in a dictionary aardvark zorro so that we can relateeach word with its dictionary index
We can also think of this as a Euclidean embedding e
aardvarkrarr eaardvark =
100
zorrorarr ezorro =
001
Word Embeddings
Idea is to replace the Euclidean embeddings e with embeddings (vectors) vthat are learned
Objective is for example next word prediction accuracy
These are often called lsquoneural language modelsrsquo
NNs in NLP
Each word w in the dictionary has an associated embedding vector vwUsually around 200 dimensional vectors are used
Consider the sentence
the cat sat on the mat
and that we wish to predict the word on given the two preceding cat sat
and two succeeding words the mat
We can use a network that has inputs vcat vsat vthe vmat
The output of the network is a probability over all words in the dictionaryp(w| vinputs)We want p(w = on|vcatvsatvthevmat) to be high
The overall objective is then to learn all the word embeddings and networkparameters subject to predicting the word correctly based on the context
Word Embeddings
Given a word (France for example) we can find which words w have embeddingvectors closest to vFrance From Ronan Collabert (2011)
Word Embeddings
There appears to be a natural lsquogeometryrsquo to the embeddings For example thereare directions that correspond to gender
vwoman minus vman asymp vaunt minus vuncle
vwoman minus vman asymp vqueen minus vking
From Mikolov (2013)
Word Embeddings Analogies
Given a relationship France-Paris we get the lsquorelationshiprsquo embedding
v = vParis minus vFrance
Given Italy we can calculate vItaly + v and find the word in the dictionary whichhas closest embedding to this (it turns out to be Rome) From Mikolov (2013)
Word Embeddings Constrained Embeddings
We can learn embeddings for English words and embeddings for ChinesewordsHowever when we know that a Chinese and English word have a similarmeaning we add a constraint that the word embeddings vChineseWord andvEnglishWord should be closeWe have only a small amount of labelled lsquosimilarrsquo Chinese-English words(these are the green border boxes in the above they are standard translationsof the corresponding Chinese character)We can visualise in 2D (using t-SNE) the embedding vectors See Socher(2013)
Word Embeddings Constrained Embeddings
Recursive Nets and Embeddings
Stanford Sentiment Treebank Consists of parsed sentences with sentiment labels(minusminusminus 0+++) for each node (phrase) in the tree 215000 labelled phrases(obtained from three humans)
Recursive Nets and Embeddings
Idea is to recursively combine embeddings such that they accurately predictthe sentiment at each node
Recursive Nets and EmbeddingsTraining
We have a softmax classifier for each node in the tree to predict thesentiment of the phrase beneath this node in the tree
The weights of this classifier are shared across all nodes
At the leaf nodes at the bottom of the tree the inputs to the classifiers arethe word embeddings
The embeddings are combined by another network g with commonparameters which forms the input to the sentiment classifier
We then learn all the embeddings shared classifier parameters and sharedcombination parameters to maximise the classification accuracy
Prediction
For a new movie review the review is first parsed using a standard grammartree parser
This forms the tree which can be used to recursively form the sentiment classlabel for the review
Currently the best sentiment classifier Socher (2013)
Recursive Nets and Embeddingsotilde otilde
eth
eth
Icircplusmnsup1raquoreg
eth
Uumlplusmnfrac14sup1raquoreg
otilde
otilde
eth
middotshy
otilde
eth
plusmnsup2raquo
otilde
eth
plusmnordm
otilde
otilde
eth
notcedilraquo
otilde
otilde
eth
sup3plusmnshynot
otilde
frac12plusmnsup3degraquoacuteacutemiddotsup2sup1
eth
ordfiquestregmiddotiquestnotmiddotplusmnsup2shy
eth
eth
plusmnsup2
eth
eth
notcedilmiddotshy
eth
notcedilraquosup3raquo
eth
ograve
yen
eth
eth
Icircplusmnsup1raquoreg
eth
Uumlplusmnfrac14sup1raquoreg
yen
yen
eth
middotshy
yen
eth
plusmnsup2raquo
yen
eth
plusmnordm
yen
yen
eth
notcedilraquo
yen
yen
yen
acuteraquoiquestshynot
otilde
frac12plusmnsup3degraquoacuteacutemiddotsup2sup1
eth
ordfiquestregmiddotiquestnotmiddotplusmnsup2shy
eth
eth
plusmnsup2
eth
eth
notcedilmiddotshy
eth
notcedilraquosup3raquo
eth
ograve
otilde
eth
times
otilde
otilde
otilde
acutemiddotmicroraquofrac14
eth
eth
eth
raquoordfraquoregsect
eth
eth
shymiddotsup2sup1acuteraquo
eth
sup3middotsup2laquonotraquo
eth
eth
plusmnordm
eth
eth
notcedilmiddotshy
eth
eth
ograve
yen
eth
times
yen
yen
eth
eth
frac14middotfrac14
eth
sup2ugravenot
eth
eth
acutemiddotmicroraquo
eth
eth
eth
iquest
eth
eth
shymiddotsup2sup1acuteraquo
eth
sup3middotsup2laquonotraquo
eth
eth
plusmnordm
eth
eth
notcedilmiddotshy
eth
eth
ograve
yen
eth
timesnot
yen
yen
eth
eth
ugraveshy
eth
paralaquoshynot
yen
otilde
middotsup2frac12regraquofrac14middotfrac34acutesect
yen yen
frac14laquoacuteacute
eth
ograve
eth
eth
timesnot
eth
eth
eth
eth
eth
ugraveshy
otilde
yen
sup2plusmnnot
yen yen
frac14laquoacuteacute
eth
ograve
Uacutemiddotsup1laquoregraquo ccedilaelig IcircOgraveIgraveOgrave degregraquofrac14middotfrac12notmiddotplusmnsup2 plusmnordm degplusmnshymiddotnotmiddotordfraquo iquestsup2frac14 sup2raquosup1iquestnotmiddotordfraquo oslashfrac34plusmnnotnotplusmnsup3 regmiddotsup1cedilnotdivide shyraquosup2notraquosup2frac12raquoshy iquestsup2frac14 notcedilraquomiddotreg sup2raquosup1iquestnotmiddotplusmnsup2ograve
Recurrent Nets
x1 x2 x3
h1 h2 h3
y1 y2 y3
A A A
C C C
B B
RNNs are used in timeseries applications
The basic idea is that the hidden units at time ht (and possibly output yt)depend on the previous state of the network htminus1 xtminus1 ytminus1 for inputs xt andoutputs yt
In the above network I lsquounrolled the net through timersquo to give a standard NNdiagram
I omitted the potential links from xtminus1 ytminus1 to ht
Handwriting Generation using a RNN
Some training examples
Handwriting Generation using a RNN
Some generated examples Top line is real handwriting for comparison See AlexGraversquos work
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reasons research in deep learning has exploded
Much greater compute power (GPU)
Much larger datasets
AutoDiff
What is AutoDiff
AutoDiff takes a function f(x) and returns an exact value (up to machineaccuracy) for the gradient
gi(x) equivpart
partxif
∥∥∥∥x
Note that this is not the same as a numerical approximation (such as centraldifferences) for the gradient
One can show that if done efficiently one can always calculate the gradient inless than 5 times the time it takes to compute f(x)
Reverse DifferentiationA useful graphical representation is that the total derivative of f with respect to xis given by the sum over all path values from x to f where each path value is theproduct of the partial derivatives of the functions on the edges
df
dx=partf
partx+partf
partg
dg
dx
x
f
gpartfpartx
dgdx
partfpartg
Example
For f(x) = x2 + xgh where g =x2 and h = xg2
x
f
gh2x+ gh
2x
xh
2gx
xg
g2
f prime(x) = (2x+ gh) + (g2xg) + (2x2gxxg) + (2xxh) = 2x+ 8x7
Reverse DifferentiationConsider
f(x1 x2) = cos (sin(x1x2))
We can represent this computationally using an Abstract Syntax Tree (AST)
x1 x2
f1
f2
f3
f1(x1 x2) = x1x2
f2(x) = sin(x)
f3(x) = cos(x)
Given values for x1 x2 we first run forwards through the tree so that we canassociate each node with an actual function value
Reverse Differentiation
x1 x2
f1
f2
f3
df3dx1
=partf3partf2
df2dx1
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx1
Similarly
df3dx2
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx2
The two derivatives share the same computation branch andwe want to exploit this
Reverse Differentiation
x1 x2
f1
f2
f3
partf1partx1
= x2partf1partx2
= x1
partf2partf1
= cos(f1)
partf3partf2
= minus sin(f2)
1 Find the reverse ancestral (backwards) scheduleof nodes (f3 f2 f1 x1 x2)
2 Start with the first node n1 in the reverseschedule and define tn1 = 1
3 For the next node n in the reverse schedule findthe child nodes ch (n) Then define
tn =sum
cisinch(n)
partfcpartfn
tc
4 The total derivatives of f with respect to theroot nodes of the tree (here x1 and x2) are givenby the values of t at those nodes
This is a general procedure that can be used to automatically define a subroutineto efficiently compute the gradient It is efficient because information is collectedat nodes in the tree and split between parents only when required
Limitations of forward reasoning
World Representation
Recognising patterns (perceptron style) is only one form of intelligence
Solving chess problems is another and requires complex reasoning using someform of internal model
The world is noisy and information may be conflicting
Recognised that new approaches are required
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Limitations of forward reasoning
World Representation
Models help us to fantasise about the world
Models
Models
Burglar Problem
Creaks and Bumps
Creak Bump
Burglar Model
pos1 pos2 pos3 pos4
snd1 snd2 snd3 snd4
pos - position in kitchensnd ndash sound
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Stubby Fingers
Stubby Fingers
int1 int2 int3 int4
hit1 hit2 hit3 hit4
int - intended keyhit ndash hit key
Stubby Fingers errors
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz
005
01
015
02
025
03
035
04
045
05
055
Stubby Fingers language
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz 0
01
02
03
04
05
06
07
08
09
Stubby Fingers
Given the typed sequence cwsykcak what is the most likely word that thiscorresponds to
List the 200 most likely hidden sequences
Discard those that are not in a standard English dictionary
Take the most likely proper English word as the intended typed word
Speech Recognition raw signal
0 01 02 03 04 05 06 07 08 09minus02
minus015
minus01
minus005
0
005
01
015
02
025
03
Time
lsquoneuralrsquo representation
10 20 30 40 50 60 70 80
5
10
15
20
25
Speech Recognition
pho1 pho2 pho3 pho4
aud1 aud2 aud3 aud4
pho phoneme (letter)aud audio signal (neural representation)
Medical Diagnosis
tumour flu meningitis
headache fever appetite x-ray
Combine known medical knowledge with patient specific information
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems such as the IsingModel (1920) and in AI applications such as the HMM (Baum 1966Stratonovich 1960)
The need for structure
We often want to make a probabilistic description of many objects (electronspins neurons customers etc )
Typically the representational and computational cost of probabilistic modelsgrows exponentially with the number of objects represented
Without introducing strong structural limitations about how these objects caninteract probability is a non-starter
For this reason computationally lsquosimplerrsquo alternatives (such as fuzzy logic)were introduced to try to avoid some of these difficulties ndash however these aretypically frowed on by purists
Graphical Models
We can use graphs to represent how objects can probabilistically interact witheach other
Graphical Models and then a marriage between Graph and Probability theory
Many of the quantities that we would like to compute in a probabilitydistribution can then be related to operations on the graph
The computational complexity of operations can often be related to thestructure of the graph
Graphical Models are now used as a standard framework in EngineeringStatistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty and aretherefore widely applicable
Uses in Industry
Microsoft used to estimate the skill distribution of players in online games(the worlds largest graphical model)
Hospitals use Belief Nets to encode knowledge about diseases and symptomsto aid medical diagnosis
Google Microsoft Facebook used in many places including advertisingvideo game prediction speech recognition
Used to estimate inherent desirability of products in consumer retail
Microsoft and others Attempt to go beyond simple AB testing by usesGraphical Models to model the whole companyuser relationship
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or more shortly theprobability of x given y) is defined as
p(x|y) equiv p(x y)
p(y)=p(y|x)p(x)
p(y)(Bayesrsquo rule)
Throwing darts
p(region 5|not region 20) =p(region 5 not region 20)
p(not region 20)
=p(region 5)
p(not region 20)=
120
1920=
1
19
Interpretationp(A = a|B = b) should not be interpreted as lsquoGiven the event B = b has occurredp(A = a|B = b) is the probability of the event A = a occurringrsquo The correctinterpretation should be lsquop(A = a|B = b) is the probability of A being in state aunder the constraint that B is in state brsquo
Battleships
Assume there are 2 ships 1 vertical (ship 1) and 1 horizontal (ship 2) of 5pixels each
Can be placed anywhere on the 10times10 grid but cannot overlap
Let s1 is the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query lsquohitrsquo or lsquomissrsquo responses
p(s1 s2|D) =p(D|s1 s2)p(s1 s2)
p(D)Let X be the matrix of pixel occupancy
p(X|D) =sums1s2
p(X s1 s2|D) =sums1s2
p(X|s1 s2)p(s1 s2|D)
demoBattleshipsm
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
Training an artificial neural network
Want to generalise to new images with high accuracy
Artificial Network
1957 Rosenblattrsquos perceptron
perceptron film clip
Connectionism
1960 Realised a perceptron can only solve simple tasks
1970 Decline in interest
1980 New computing power made training multilayer networks feasible
outputinputs
Each node (or lsquoneuronrsquo) computes a function of a weighted combination ofparental nodes hj = σ(
sumi wijhi)
Neural Networks and Deep LearningHistorical Problems with Neural Nets (1990s)
NNs are difficult to train (many local optima)
Particularly difficult to train a NN with a large number of layers (say largerthan around 10)
lsquoGradient Diffusion Problemrsquo ndash difficult to assign responsibility of errors toindividual lsquoneuronsrsquo
Machine Learning (up to 2006)
A large section of the machine learning community abandoned NNs
More principled and computationally better understood techniques (SVMsand related convex methods) replaced them
Bayesian AI (1990s onwards)
From mid 1990s there was a realisation that pattern recognition is notsufficient for all AI purposes
Uncertainty and reasoning are not naturally representable using standardfeed-forward nets
Explosion in more lsquosymbolicrsquo Bayesian AI
Deep Learning
NNs have resurged in interest in the last few years (Hinton Bengio )
Also called lsquodeep learningrsquo
Sense that very complex tasks (object recognition learning complex structurein data) requires going beyond simple (convex) statistical techniques
The brain uses hierarchical distributed processing and it is likely to be for agood reason
Many problems have a hierarchical structure images are made of partslanguage is hierarchical etc
Why now
New computing resources (GPU processing)
Availability of large amount of data means that we can train nets with manyparameters (1010)
Recent evidence suggests local optima are not particularly problematic
Autoencoder
y1 y2 y3 y4 y5
h1 h2 h3
h4 h5
y1 y2 y3 y4 y5
h6 h7 h8
The bottleneck forces the network to try to find a low dimensionalrepresentation of the data
Useful for unsupervised learning
Autoencoder on MNIST digits (Hinton 2006 Science)
Figure Reconstructions using H = 30 components From the Top Original imageAutoencoder1 Autoencoder2 PCA
60000 training images (28times 28 = 784 pixels)
Use a form of autoencoder to find a lower (30) dimensional representation
At the time the special layerwise training procedure was consideredfundamental to the success of this approach Now not deemed necessaryprovided we use a sensible initialisation
Google Cats
10 Million Youtube video frames (200x200 pixel images)
Use a specialised autoencoder with 9 layers (1 billion weights)
2000 computers + two weeks of computing
Examine units to see what images they most respond to
Google Autoencoder
From Nando De Freitas
Convolutional NNs
CNNs are particularly popular in image processing
Often the feature maps correspond (not to macro features such as bicycles)but micro features
For example in handwritten digit recognition they correspond to smallconstituent parts of the digits
These are used then to process the image into a representation that is betterfor recognition
NNs in NLP
Bag of Words
We have D words in a dictionary aardvark zorro so that we can relateeach word with its dictionary index
We can also think of this as a Euclidean embedding e
aardvarkrarr eaardvark =
100
zorrorarr ezorro =
001
Word Embeddings
Idea is to replace the Euclidean embeddings e with embeddings (vectors) vthat are learned
Objective is for example next word prediction accuracy
These are often called lsquoneural language modelsrsquo
NNs in NLP
Each word w in the dictionary has an associated embedding vector vwUsually around 200 dimensional vectors are used
Consider the sentence
the cat sat on the mat
and that we wish to predict the word on given the two preceding cat sat
and two succeeding words the mat
We can use a network that has inputs vcat vsat vthe vmat
The output of the network is a probability over all words in the dictionaryp(w| vinputs)We want p(w = on|vcatvsatvthevmat) to be high
The overall objective is then to learn all the word embeddings and networkparameters subject to predicting the word correctly based on the context
Word Embeddings
Given a word (France for example) we can find which words w have embeddingvectors closest to vFrance From Ronan Collabert (2011)
Word Embeddings
There appears to be a natural lsquogeometryrsquo to the embeddings For example thereare directions that correspond to gender
vwoman minus vman asymp vaunt minus vuncle
vwoman minus vman asymp vqueen minus vking
From Mikolov (2013)
Word Embeddings Analogies
Given a relationship France-Paris we get the lsquorelationshiprsquo embedding
v = vParis minus vFrance
Given Italy we can calculate vItaly + v and find the word in the dictionary whichhas closest embedding to this (it turns out to be Rome) From Mikolov (2013)
Word Embeddings Constrained Embeddings
We can learn embeddings for English words and embeddings for ChinesewordsHowever when we know that a Chinese and English word have a similarmeaning we add a constraint that the word embeddings vChineseWord andvEnglishWord should be closeWe have only a small amount of labelled lsquosimilarrsquo Chinese-English words(these are the green border boxes in the above they are standard translationsof the corresponding Chinese character)We can visualise in 2D (using t-SNE) the embedding vectors See Socher(2013)
Word Embeddings Constrained Embeddings
Recursive Nets and Embeddings
Stanford Sentiment Treebank Consists of parsed sentences with sentiment labels(minusminusminus 0+++) for each node (phrase) in the tree 215000 labelled phrases(obtained from three humans)
Recursive Nets and Embeddings
Idea is to recursively combine embeddings such that they accurately predictthe sentiment at each node
Recursive Nets and EmbeddingsTraining
We have a softmax classifier for each node in the tree to predict thesentiment of the phrase beneath this node in the tree
The weights of this classifier are shared across all nodes
At the leaf nodes at the bottom of the tree the inputs to the classifiers arethe word embeddings
The embeddings are combined by another network g with commonparameters which forms the input to the sentiment classifier
We then learn all the embeddings shared classifier parameters and sharedcombination parameters to maximise the classification accuracy
Prediction
For a new movie review the review is first parsed using a standard grammartree parser
This forms the tree which can be used to recursively form the sentiment classlabel for the review
Currently the best sentiment classifier Socher (2013)
Recursive Nets and Embeddings

[Figure: RNTN predictions of positive and negative (bottom right) sentences and their negation, e.g. 'Roger Dodger is one of the most compelling variations on this theme' vs. 'Roger Dodger is one of the least compelling variations on this theme'. From Socher (2013).]
Recurrent Nets
[Figure: RNN unrolled through time; inputs x1, x2, x3, hidden units h1, h2, h3, outputs y1, y2, y3, with shared weight matrices A (input to hidden), B (hidden to hidden) and C (hidden to output)]
RNNs are used in timeseries applications
The basic idea is that the hidden units at time t, h_t (and possibly the output y_t), depend on the previous state of the network h_{t−1}, x_{t−1}, y_{t−1}, for inputs x_t and outputs y_t.
In the above network I 'unrolled the net through time' to give a standard NN diagram.
I omitted the potential links from x_{t−1}, y_{t−1} to h_t.
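A minimal sketch of the unrolled recurrence, with assumed shapes: the same weights A, B, C are reused at every time step.

```python
# h_t depends on x_t and h_{t-1} through shared weights A, B; C reads out y_t.
import numpy as np

rng = np.random.default_rng(3)
Dx, Dh, Dy, T = 3, 5, 2, 4
A = rng.normal(scale=0.3, size=(Dh, Dx))   # input -> hidden
B = rng.normal(scale=0.3, size=(Dh, Dh))   # hidden -> next hidden (recurrence)
C = rng.normal(scale=0.3, size=(Dy, Dh))   # hidden -> output

def unroll(xs):
    h = np.zeros(Dh)
    ys = []
    for x in xs:                      # same A, B, C at every time step
        h = np.tanh(A @ x + B @ h)    # h_t = f(x_t, h_{t-1})
        ys.append(C @ h)              # y_t read off the hidden state
    return ys

xs = [rng.normal(size=Dx) for _ in range(T)]
print(unroll(xs))
```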
Handwriting Generation using a RNN
Some training examples
Handwriting Generation using a RNN
Some generated examples. Top line is real handwriting, for comparison. See Alex Graves's work.
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reasons research in deep learning has exploded
Much greater compute power (GPU)
Much larger datasets
AutoDiff
What is AutoDiff?

AutoDiff takes a function f(x) and returns an exact value (up to machine accuracy) for the gradient

g_i(x) ≡ ∂f/∂x_i, evaluated at x

Note that this is not the same as a numerical approximation (such as central differences) for the gradient.
One can show that, if done efficiently, one can always calculate the gradient in less than 5 times the time it takes to compute f(x).
Reverse Differentiation

A useful graphical representation is that the total derivative of f with respect to x is given by the sum over all path values from x to f, where each path value is the product of the partial derivatives of the functions on the edges:

df/dx = ∂f/∂x + (∂f/∂g)(dg/dx)

[Figure: nodes x, g, f; edges labelled ∂f/∂x (x→f), dg/dx (x→g), ∂f/∂g (g→f)]
Example
For f(x) = x^2 + xgh, where g = x^2 and h = xg^2:

[Figure: computation graph with nodes x, g, h, f; edge labels 2x + gh (x→f), 2x (x→g), xh (g→f), 2gx (g→h), xg (h→f), g^2 (x→h)]

f′(x) = (2x + gh) + (g^2 · xg) + (2x · 2gx · xg) + (2x · xh) = 2x + 8x^7
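A quick numeric sanity check (an illustrative addition): the path-sum result agrees with a central-difference estimate.

```python
# Verify f'(x) = 2x + 8x^7 for f(x) = x^2 + x*g*h with g = x^2, h = x*g^2.
def f(x):
    g = x ** 2
    h = x * g ** 2
    return x ** 2 + x * g * h          # equals x^2 + x^8

x, eps = 1.3, 1e-6
numeric = (f(x + eps) - f(x - eps)) / (2 * eps)   # central differences
exact = 2 * x + 8 * x ** 7
print(numeric, exact)                  # agree to numerical accuracy
```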
Reverse Differentiation

Consider

f(x1, x2) = cos(sin(x1 x2))

We can represent this computationally using an Abstract Syntax Tree (AST):

[Figure: AST with leaves x1, x2 feeding f1, then f2, then f3]

f1(x1, x2) = x1 x2
f2(x) = sin(x)
f3(x) = cos(x)

Given values for x1, x2, we first run forwards through the tree so that we can associate each node with an actual function value.
Reverse Differentiation

[Figure: the same AST]

df3/dx1 = (∂f3/∂f2) df2/dx1 = (∂f3/∂f2)(df2/df1) df1/dx1, where (∂f3/∂f2)(df2/df1) = df3/df1

Similarly,

df3/dx2 = (∂f3/∂f2)(df2/df1) df1/dx2

The two derivatives share the same computation branch, and we want to exploit this.
Reverse Differentiation

[Figure: the same AST, annotated with the local derivatives]

∂f1/∂x1 = x2,  ∂f1/∂x2 = x1,  ∂f2/∂f1 = cos(f1),  ∂f3/∂f2 = −sin(f2)

1. Find the reverse ancestral (backwards) schedule of nodes (f3, f2, f1, x1, x2).
2. Start with the first node n1 in the reverse schedule and define t_{n1} = 1.
3. For the next node n in the reverse schedule, find the child nodes ch(n). Then define

t_n = Σ_{c ∈ ch(n)} (∂f_c/∂f_n) t_c

4. The total derivatives of f with respect to the root nodes of the tree (here x1 and x2) are given by the values of t at those nodes.

This is a general procedure that can be used to automatically define a subroutine to efficiently compute the gradient. It is efficient because information is collected at nodes in the tree and split between parents only when required.
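A hedged sketch of this reverse schedule for f(x1, x2) = cos(sin(x1 x2)), written out by hand rather than with an AutoDiff library.

```python
# One forward sweep to value the nodes, one reverse sweep to accumulate t_n.
import math

def grad_f(x1, x2):
    # forward pass: associate every AST node with its value
    f1 = x1 * x2
    f2 = math.sin(f1)
    f3 = math.cos(f2)
    # reverse pass over the schedule (f3, f2, f1, x1, x2),
    # t_n = sum over children c of (df_c/df_n) * t_c, starting from t_{f3} = 1
    t3 = 1.0
    t2 = -math.sin(f2) * t3        # ∂f3/∂f2 = -sin(f2)
    t1 = math.cos(f1) * t2         # ∂f2/∂f1 = cos(f1)
    tx1 = x2 * t1                  # ∂f1/∂x1 = x2
    tx2 = x1 * t1                  # ∂f1/∂x2 = x1
    return f3, (tx1, tx2)

value, (g1, g2) = grad_f(0.7, -1.2)
print(value, g1, g2)               # exact gradient from a single reverse sweep
```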
Limitations of forward reasoning
World Representation
Recognising patterns (perceptron style) is only one form of intelligence
Solving chess problems is another, and requires complex reasoning using some form of internal model.
The world is noisy and information may be conflicting.
It was recognised that new approaches are required.
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Limitations of forward reasoning
World Representation
Models help us to fantasise about the world
Models
Models
Burglar Problem
Creaks and Bumps
Creak Bump
Burglar Model
pos1 pos2 pos3 pos4
snd1 snd2 snd3 snd4
pos – position in kitchen; snd – sound
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Stubby Fingers
Stubby Fingers
int1 int2 int3 int4
hit1 hit2 hit3 hit4
int – intended key; hit – hit key
Stubby Fingers: errors

[Figure: error matrix p(hit | intended) over the alphabet a–z]
Stubby Fingers: language

[Figure: letter-transition (language model) matrix over the alphabet a–z]
Stubby Fingers
Given the typed sequence cwsykcak, what is the most likely word that this corresponds to?
List the 200 most likely hidden sequences
Discard those that are not in a standard English dictionary
Take the most likely proper English word as the intended typed word
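A hedged noisy-channel sketch of this procedure, with a toy error model, toy dictionary and flat word prior; the slides' error and letter-transition matrices would replace these.

```python
# Score each candidate word w by p(typed | w) * p(w) and take the argmax.
def p_hit(hit, intended):
    # assumed error model: the intended key is struck with probability 0.6,
    # any other key with a small uniform leak
    return 0.6 if hit == intended else 0.4 / 25

def score(word, typed):
    # p(typed | word) * p(word), with a flat word prior for simplicity
    if len(word) != len(typed):
        return 0.0
    s = 1.0
    for h, i in zip(typed, word):
        s *= p_hit(h, i)
    return s

dictionary = ["the", "two", "who"]
typed = "thw"
print(max(dictionary, key=lambda w: score(w, typed)))   # -> 'the'
```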
Speech Recognition: raw signal

[Figure: raw audio waveform, amplitude vs. time (seconds)]
'neural' representation

[Figure: time–frequency ('neural') representation of the signal]
Speech Recognition
pho1 pho2 pho3 pho4
aud1 aud2 aud3 aud4
pho – phoneme (letter); aud – audio signal (neural representation)
Medical Diagnosis
[Graph: diseases (tumour, flu, meningitis) with arrows to symptoms (headache, fever, appetite, x-ray)]

Combine known medical knowledge with patient-specific information.
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems, such as the Ising Model (1920), and in AI applications, such as the HMM (Baum 1966; Stratonovich 1960).
The need for structure
We often want to make a probabilistic description of many objects (electron spins, neurons, customers, etc.).
Typically the representational and computational cost of probabilistic models grows exponentially with the number of objects represented.
Without introducing strong structural limitations about how these objects can interact, probability is a non-starter.
For this reason, computationally 'simpler' alternatives (such as fuzzy logic) were introduced to try to avoid some of these difficulties – however, these are typically frowned upon by purists.
Graphical Models
We can use graphs to represent how objects can probabilistically interact with each other.
Graphical Models are then a marriage between Graph and Probability theory.
Many of the quantities that we would like to compute in a probability distribution can then be related to operations on the graph.
The computational complexity of operations can often be related to the structure of the graph.
Graphical Models are now used as a standard framework in Engineering, Statistics and Computer Science.
Graphical Models are used to perform reasoning under uncertainty and are therefore widely applicable.
Uses in Industry
Microsoft: used to estimate the skill distribution of players in online games (the world's largest graphical model).
Hospitals use Belief Nets to encode knowledge about diseases and symptoms to aid medical diagnosis.
Google, Microsoft, Facebook: used in many places, including advertising, video game prediction, speech recognition.
Used to estimate the inherent desirability of products in consumer retail.
Microsoft and others: attempts to go beyond simple A/B testing by using Graphical Models to model the whole company–user relationship.
Conditional Probability and Bayes' Rule

The probability of event x conditioned on knowing event y (or, more shortly, the probability of x given y) is defined as

p(x|y) ≡ p(x, y)/p(y) = p(y|x) p(x)/p(y)    (Bayes' rule)
Throwing darts
p(region 5 | not region 20) = p(region 5, not region 20)/p(not region 20)
                            = p(region 5)/p(not region 20)
                            = (1/20)/(19/20) = 1/19
Interpretation

p(A = a|B = b) should not be interpreted as 'given the event B = b has occurred, p(A = a|B = b) is the probability of the event A = a occurring'. The correct interpretation is 'p(A = a|B = b) is the probability of A being in state a under the constraint that B is in state b'.
Battleships
Assume there are 2 ships, 1 vertical (ship 1) and 1 horizontal (ship 2), of 5 pixels each.
Each can be placed anywhere on the 10×10 grid, but they cannot overlap.
Let s1 be the origin of ship 1 and s2 the origin of ship 2.
The data D is a collection of query 'hit' or 'miss' responses.

p(s1, s2|D) = p(D|s1, s2) p(s1, s2)/p(D)

Let X be the matrix of pixel occupancy:

p(X|D) = Σ_{s1,s2} p(X, s1, s2|D) = Σ_{s1,s2} p(X|s1, s2) p(s1, s2|D)
demoBattleships.m
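The slides refer to a MATLAB demo; here is a hedged Python sketch of the same posterior computation, with made-up query responses.

```python
# Posterior pixel occupancy p(X|D) under a uniform prior over valid placements.
import numpy as np

N, L = 10, 5
def cells(s, vertical):
    r, c = s
    return [(r + i, c) if vertical else (r, c + i) for i in range(L)]

def valid(s1, s2):
    c1, c2 = cells(s1, True), cells(s2, False)
    inside = all(0 <= r < N and 0 <= c < N for r, c in c1 + c2)
    return inside and not set(c1) & set(c2)          # ships may not overlap

# D: queried pixel -> 'hit' or 'miss' (illustrative observations)
D = {(4, 4): "hit", (0, 0): "miss"}

def consistent(s1, s2):
    occupied = set(cells(s1, True)) | set(cells(s2, False))
    return all((q in occupied) == (r == "hit") for q, r in D.items())

post = np.zeros((N, N))
total = 0
for s1 in [(r, c) for r in range(N) for c in range(N)]:
    for s2 in [(r, c) for r in range(N) for c in range(N)]:
        if valid(s1, s2) and consistent(s1, s2):
            total += 1
            for cell in set(cells(s1, True)) | set(cells(s2, False)):
                post[cell] += 1.0
print(post / total)    # probability each pixel is occupied, given the data
```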
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated the conditional probability of the node given its parents.
The joint distribution is obtained by taking the product of the conditional probabilities:

p(A, B, C, D, E) = p(A) p(B) p(C|A, B) p(D|C) p(E|B, C)

[Graph: A and B are parents of C; C is a parent of D and E; B is also a parent of E]
Example – Part I

Sally's burglar Alarm is sounding. Has she been Burgled, or was the alarm triggered by an Earthquake? She turns the car Radio on for news of earthquakes.
Choosing an ordering

Without loss of generality, we can write

p(A, R, E, B) = p(A|R, E, B) p(R, E, B)
             = p(A|R, E, B) p(R|E, B) p(E, B)
             = p(A|R, E, B) p(R|E, B) p(E|B) p(B)
Assumptions
The alarm is not directly influenced by any report on the radio: p(A|R, E, B) = p(A|E, B).
The radio broadcast is not directly influenced by the burglar variable: p(R|E, B) = p(R|E).
Burglaries don't directly 'cause' earthquakes: p(E|B) = p(E).

Therefore

p(A, R, E, B) = p(A|E, B) p(R|E) p(E) p(B)
Example – Part II: Specifying the Tables

[Graph: B → A ← E; E → R]

p(A|B, E):

Alarm = 1   Burglar   Earthquake
0.9999      1         1
0.99        1         0
0.99        0         1
0.0001      0         0

p(R|E):

Radio = 1   Earthquake
1           1
0           0

The remaining tables are p(B = 1) = 0.01 and p(E = 1) = 0.000001. The tables and graphical structure fully specify the distribution.
Example – Part III: Inference

Initial Evidence: the alarm is sounding.

p(B = 1|A = 1) = Σ_{E,R} p(B = 1, E, A = 1, R) / Σ_{B,E,R} p(B, E, A = 1, R)
             = Σ_{E,R} p(A = 1|B = 1, E) p(B = 1) p(E) p(R|E) / Σ_{B,E,R} p(A = 1|B, E) p(B) p(E) p(R|E) ≈ 0.99

Additional Evidence: the radio broadcasts an earthquake warning.

A similar calculation gives p(B = 1|A = 1, R = 1) ≈ 0.01.

Initially, because the alarm sounds, Sally thinks that she's been burgled. However, this probability drops dramatically when she hears that there has been an earthquake.
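These numbers can be reproduced by brute-force enumeration of the joint; a minimal sketch:

```python
# Enumerate p(A,R,E,B) = p(A|E,B) p(R|E) p(E) p(B) with the tables above.
from itertools import product

pB = {1: 0.01, 0: 0.99}
pE = {1: 0.000001, 0: 0.999999}
pA1 = {(1, 1): 0.9999, (1, 0): 0.99, (0, 1): 0.99, (0, 0): 0.0001}  # p(A=1|B,E)
pR1 = {1: 1.0, 0: 0.0}                                               # p(R=1|E)

def joint(b, e, a, r):
    pa = pA1[(b, e)] if a == 1 else 1.0 - pA1[(b, e)]
    pr = pR1[e] if r == 1 else 1.0 - pR1[e]
    return pB[b] * pE[e] * pa * pr

def p_burglar(radio=None):
    num = den = 0.0
    for b, e, r in product([0, 1], repeat=3):
        if radio is not None and r != radio:
            continue
        w = joint(b, e, 1, r)          # clamp the evidence A = 1
        den += w
        num += w * b
    return num / den

print(p_burglar())          # p(B=1 | A=1)       ~ 0.99
print(p_burglar(radio=1))   # p(B=1 | A=1, R=1)  ~ 0.01
```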
Markov Models
For timeseries data v1, . . . , vT we need a model p(v1:T). For causal consistency, it is meaningful to consider the decomposition

p(v1:T) = Π_{t=1}^{T} p(vt|v1:t−1)

with the convention p(vt|v1:t−1) = p(v1) for t = 1.

[Chain: v1 → v2 → v3 → v4]

Independence assumptions

It is often natural to assume that the influence of the immediate past is more relevant than the remote past, and in Markov models only a limited number of previous observations are required to predict the future.
Markov Chain
Only the recent past is relevant:

p(vt|v1, . . . , vt−1) = p(vt|vt−L, . . . , vt−1)

where L ≥ 1 is the order of the Markov chain. For L = 1,

p(v1:T) = p(v1) p(v2|v1) p(v3|v2) · · · p(vT|vT−1)

For a stationary Markov chain the transitions p(vt = s′|vt−1 = s) = f(s′, s) are time-independent ('homogeneous').
[Figure: (a) first order Markov chain v1 → v2 → v3 → v4; (b) second order Markov chain]
Markov Chains
[Chain: v1 → v2 → v3 → v4]

p(v1, . . . , vT) = p(v1) Π_{t=2}^{T} p(vt|vt−1)

with p(v1) the initial distribution and p(vt|vt−1) the transition.
State transition diagram

Nodes represent states of the variable v, and arcs non-zero elements of the transition p(vt|vt−1).

[Figure: state transition diagram over states 1–9]
Most probable and shortest paths
[Figure: the same state transition diagram over states 1–9]

The shortest (unweighted) path from state 1 to state 7 is 1−2−7.
The most probable path from state 1 to state 7 is 1−8−9−7 (assuming uniform transition probabilities). The latter path is longer but more probable, since for the path 1−2−7 the probability of exiting state 2 into state 7 is 1/5.
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time:

p(xt = i) = Σ_j p(xt = i|xt−1 = j) p(xt−1 = j),  with Mij ≡ p(xt = i|xt−1 = j)

p(xt = i) is the frequency with which we visit state i at time t, given we started from p(x1) and randomly drew samples from the transition p(xτ|xτ−1). As we repeatedly sample a new state from the chain, the distribution at time t for an initial distribution p1(i) is

pt = M^{t−1} p1

If, for t → ∞, p∞ is independent of the initial distribution p1, then p∞ is called the equilibrium distribution of the chain:

p∞ = M p∞

The equilibrium distribution is proportional to the eigenvector with unit eigenvalue of the transition matrix.
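A tiny sketch with a toy 2×2 transition matrix: iterating pt = M pt−1 converges to the same distribution as the unit-eigenvalue eigenvector.

```python
# Power iteration vs. eigenvector for the equilibrium distribution.
import numpy as np

M = np.array([[0.9, 0.5],     # M[i, j] = p(x_t = i | x_{t-1} = j); columns sum to 1
              [0.1, 0.5]])
p = np.array([1.0, 0.0])      # arbitrary initial distribution p_1
for _ in range(50):
    p = M @ p                 # p_t = M p_{t-1}
print(p)                      # -> [0.833..., 0.166...]

vals, vecs = np.linalg.eig(M)
v = np.real(vecs[:, np.argmax(np.real(vals))])   # eigenvalue 1 eigenvector
print(v / v.sum())            # the same equilibrium distribution
```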
PageRank
Define the matrix

Aij = 1 if website j has a hyperlink to website i, 0 otherwise

From this we can define a Markov transition matrix with elements

Mij = Aij / Σ_{i′} A_{i′j}

If we jump from website to website, the equilibrium distribution component p∞(i) is the relative number of times we will visit website i. This has a natural interpretation as the 'importance' of website i.
For each website i, a list of words associated with that website is collected. After doing this for all websites, one can make an 'inverse' list of which websites contain word w. When a user searches for word w, the list of websites that contain that word is then returned, ranked according to the importance of the site.
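A hedged sketch of this construction on an assumed toy link matrix:

```python
# Build M from the link matrix A and find the importance ranking.
import numpy as np

# A[i, j] = 1 if site j links to site i (toy 4-site web graph)
A = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 1, 0, 0],
              [0, 0, 1, 0]], dtype=float)
M = A / A.sum(axis=0)         # M[i, j] = A[i, j] / sum_i' A[i', j]

p = np.full(4, 0.25)
for _ in range(100):          # power iteration towards the equilibrium
    p = M @ p
print(np.argsort(-p), p)      # sites ranked by 'importance' p_inf(i)
```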
Hidden Markov Models
The HMM defines a Markov chain on hidden (or 'latent') variables h1:T. The observed (or 'visible') variables depend on the hidden variables through an emission p(vt|ht). This defines a joint distribution

p(h1:T, v1:T) = p(v1|h1) p(h1) Π_{t=2}^{T} p(vt|ht) p(ht|ht−1)

For a stationary HMM, the transition p(ht|ht−1) and emission p(vt|ht) distributions are constant through time.

[Figure: a first order hidden Markov model with 'hidden' variables dom(ht) = {1, . . . , H}, t = 1:T. The 'visible' variables vt can be either discrete or continuous.]
The classical inference problems
Filtering (inferring the present): p(ht|v1:t)
Prediction (inferring the future): p(ht|v1:s), t > s
Smoothing (inferring the past): p(ht|v1:u), t < u
Likelihood: p(v1:T)
Most likely path (Viterbi alignment): argmax_{h1:T} p(h1:T|v1:T)

For prediction, one is also often interested in p(vt|v1:s) for t > s.
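A hedged sketch of filtering via the standard forward recursion, with randomly generated toy transition and emission tables:

```python
# Forward recursion: alpha_t(h) proportional to p(v_t|h) sum_h' p(h|h') alpha_{t-1}(h').
import numpy as np

H = 3
rng = np.random.default_rng(4)
trans = rng.dirichlet(np.ones(H), size=H).T   # trans[i, j] = p(h_t=i | h_{t-1}=j)
emit = rng.dirichlet(np.ones(5), size=H).T    # emit[v, h]  = p(v_t=v | h_t=h)
prior = np.full(H, 1.0 / H)

def filtering(vs):
    alpha = prior * emit[vs[0]]
    alpha /= alpha.sum()
    posts = [alpha]
    for v in vs[1:]:
        alpha = emit[v] * (trans @ alpha)     # predict, then correct
        alpha /= alpha.sum()                  # normalise -> p(h_t | v_{1:t})
        posts.append(alpha)
    return posts

print(filtering([0, 3, 1, 4]))                # one posterior per time step
```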
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering, Smoothing and Viterbi are all computationally efficient, scaling linearly with the length of the timeseries (but quadratically with the number of hidden states).
The algorithms are variants of 'message passing on factor graphs'.
The algorithm is guaranteed to work if the graph is singly-connected.
There has been a huge research effort in the last 15 years to apply message passing for approximate inference in multiply-connected graphs (e.g. low-density parity-check codes).
HMMs for speech recognition
ht is the phoneme at time t; p(ht|ht−1) – language model; p(vt|ht) – speech signal model.
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently, companies including Google have made big advances in speech recognition.
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is some function of the phoneme, μ(ht; θ).
This function is a deep neural network, trained on a large amount of data.
There is a goldrush at the moment to find similar breakthrough applications of deep networks in reasoning systems.
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images, for example) can be constructed on the basis of a low dimensional representation.
Note that this is a Graphical Model, not a Function.
The latent variables h can be sampled from p(h), and then an image sampled from p(v|h). One cannot use an autoencoder to generate new images.
The bad news
Inference (computing p(h|v)) and parameter learning are intractable in these models.
Statisticians typically use sampling as an approximation.
It is very popular in ML to use a variational method – much faster for inference.
Variational Inference

Consider a distribution

p(v|θ) = ∫_h p(v|h, θ) p(h)

and that we wish to learn θ to maximise the probability that this model generates the observed data:

log p(v|θ) ≥ −∫_h q(h|v, φ) log q(h|v, φ) + ∫_h q(h|v, φ) log p(v|h, θ) + const.

The idea is to choose a 'variational' distribution q(h|v, φ) such that we can either calculate the bound analytically or sample it efficiently.
We then jointly maximise the bound w.r.t. φ and θ.
We can parameterise p(v|h, θ) using a deep network.
This is a very popular approach – see the 'variational autoencoder' and also attention mechanisms.
Extension to a semi-supervised method using p(v) = ∫_h Σ_c p(v|h, c) p(c) p(h).
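A hedged sketch of the bound for an assumed toy linear-Gaussian model, where the exact log-likelihood is available for comparison; the Monte Carlo estimate of the bound never exceeds it.

```python
# Toy model: p(h) = N(0,1), p(v|h,theta) = N(theta*h, 1), q(h|v,phi) = N(mu, s^2).
import numpy as np

rng = np.random.default_rng(5)

def elbo(v, theta, mu, s, n=10000):
    h = rng.normal(mu, s, size=n)                       # h ~ q(h|v, phi)
    log_q = -0.5 * ((h - mu) / s) ** 2 - np.log(s) - 0.5 * np.log(2 * np.pi)
    log_p_h = -0.5 * h ** 2 - 0.5 * np.log(2 * np.pi)
    log_p_v = -0.5 * (v - theta * h) ** 2 - 0.5 * np.log(2 * np.pi)
    return np.mean(log_p_v + log_p_h - log_q)           # lower-bounds log p(v|theta)

# exact log p(v|theta) for this model: v ~ N(0, theta^2 + 1)
v, theta = 1.5, 0.8
exact = -0.5 * v ** 2 / (theta ** 2 + 1) - 0.5 * np.log(2 * np.pi * (theta ** 2 + 1))
print(exact, elbo(v, theta, mu=0.5, s=0.7))             # bound <= exact
```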
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A, we need to decide which action to take, for any state of W, that will be best for our long-term goals.
The problem is that the number of pixel states is enormous.
We need to learn a low dimensional representation of the screen (use a deep generative model).
Then learn which action to take given the low dimensional representation.
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state-of-the-art results in Speech Recognition, Image Analysis, Game Playing.
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve the interaction between reinforcement learning and representation learning.
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
https://reinfer.io
Artificial Network
1957 Rosenblattrsquos perceptron
perceptron film clip
Connectionism
1960 Realised a perceptron can only solve simple tasks
1970 Decline in interest
1980 New computing power made training multilayer networks feasible
outputinputs
Each node (or lsquoneuronrsquo) computes a function of a weighted combination ofparental nodes hj = σ(
sumi wijhi)
Neural Networks and Deep LearningHistorical Problems with Neural Nets (1990s)
NNs are difficult to train (many local optima)
Particularly difficult to train a NN with a large number of layers (say largerthan around 10)
lsquoGradient Diffusion Problemrsquo ndash difficult to assign responsibility of errors toindividual lsquoneuronsrsquo
Machine Learning (up to 2006)
A large section of the machine learning community abandoned NNs
More principled and computationally better understood techniques (SVMsand related convex methods) replaced them
Bayesian AI (1990s onwards)
From mid 1990s there was a realisation that pattern recognition is notsufficient for all AI purposes
Uncertainty and reasoning are not naturally representable using standardfeed-forward nets
Explosion in more lsquosymbolicrsquo Bayesian AI
Deep Learning
NNs have resurged in interest in the last few years (Hinton Bengio )
Also called lsquodeep learningrsquo
Sense that very complex tasks (object recognition learning complex structurein data) requires going beyond simple (convex) statistical techniques
The brain uses hierarchical distributed processing and it is likely to be for agood reason
Many problems have a hierarchical structure images are made of partslanguage is hierarchical etc
Why now
New computing resources (GPU processing)
Availability of large amount of data means that we can train nets with manyparameters (1010)
Recent evidence suggests local optima are not particularly problematic
Autoencoder
y1 y2 y3 y4 y5
h1 h2 h3
h4 h5
y1 y2 y3 y4 y5
h6 h7 h8
The bottleneck forces the network to try to find a low dimensionalrepresentation of the data
Useful for unsupervised learning
Autoencoder on MNIST digits (Hinton 2006 Science)
Figure Reconstructions using H = 30 components From the Top Original imageAutoencoder1 Autoencoder2 PCA
60000 training images (28times 28 = 784 pixels)
Use a form of autoencoder to find a lower (30) dimensional representation
At the time the special layerwise training procedure was consideredfundamental to the success of this approach Now not deemed necessaryprovided we use a sensible initialisation
Google Cats
10 Million Youtube video frames (200x200 pixel images)
Use a specialised autoencoder with 9 layers (1 billion weights)
2000 computers + two weeks of computing
Examine units to see what images they most respond to
Google Autoencoder
From Nando De Freitas
Convolutional NNs
CNNs are particularly popular in image processing
Often the feature maps correspond (not to macro features such as bicycles)but micro features
For example in handwritten digit recognition they correspond to smallconstituent parts of the digits
These are used then to process the image into a representation that is betterfor recognition
NNs in NLP
Bag of Words
We have D words in a dictionary aardvark zorro so that we can relateeach word with its dictionary index
We can also think of this as a Euclidean embedding e
aardvarkrarr eaardvark =
100
zorrorarr ezorro =
001
Word Embeddings
Idea is to replace the Euclidean embeddings e with embeddings (vectors) vthat are learned
Objective is for example next word prediction accuracy
These are often called lsquoneural language modelsrsquo
NNs in NLP
Each word w in the dictionary has an associated embedding vector vwUsually around 200 dimensional vectors are used
Consider the sentence
the cat sat on the mat
and that we wish to predict the word on given the two preceding cat sat
and two succeeding words the mat
We can use a network that has inputs vcat vsat vthe vmat
The output of the network is a probability over all words in the dictionaryp(w| vinputs)We want p(w = on|vcatvsatvthevmat) to be high
The overall objective is then to learn all the word embeddings and networkparameters subject to predicting the word correctly based on the context
Word Embeddings
Given a word (France for example) we can find which words w have embeddingvectors closest to vFrance From Ronan Collabert (2011)
Word Embeddings
There appears to be a natural lsquogeometryrsquo to the embeddings For example thereare directions that correspond to gender
vwoman minus vman asymp vaunt minus vuncle
vwoman minus vman asymp vqueen minus vking
From Mikolov (2013)
Word Embeddings Analogies
Given a relationship France-Paris we get the lsquorelationshiprsquo embedding
v = vParis minus vFrance
Given Italy we can calculate vItaly + v and find the word in the dictionary whichhas closest embedding to this (it turns out to be Rome) From Mikolov (2013)
Word Embeddings Constrained Embeddings
We can learn embeddings for English words and embeddings for ChinesewordsHowever when we know that a Chinese and English word have a similarmeaning we add a constraint that the word embeddings vChineseWord andvEnglishWord should be closeWe have only a small amount of labelled lsquosimilarrsquo Chinese-English words(these are the green border boxes in the above they are standard translationsof the corresponding Chinese character)We can visualise in 2D (using t-SNE) the embedding vectors See Socher(2013)
Word Embeddings Constrained Embeddings
Recursive Nets and Embeddings
Stanford Sentiment Treebank Consists of parsed sentences with sentiment labels(minusminusminus 0+++) for each node (phrase) in the tree 215000 labelled phrases(obtained from three humans)
Recursive Nets and Embeddings
Idea is to recursively combine embeddings such that they accurately predictthe sentiment at each node
Recursive Nets and EmbeddingsTraining
We have a softmax classifier for each node in the tree to predict thesentiment of the phrase beneath this node in the tree
The weights of this classifier are shared across all nodes
At the leaf nodes at the bottom of the tree the inputs to the classifiers arethe word embeddings
The embeddings are combined by another network g with commonparameters which forms the input to the sentiment classifier
We then learn all the embeddings shared classifier parameters and sharedcombination parameters to maximise the classification accuracy
Prediction
For a new movie review the review is first parsed using a standard grammartree parser
This forms the tree which can be used to recursively form the sentiment classlabel for the review
Currently the best sentiment classifier Socher (2013)
Recursive Nets and Embeddingsotilde otilde
eth
eth
Icircplusmnsup1raquoreg
eth
Uumlplusmnfrac14sup1raquoreg
otilde
otilde
eth
middotshy
otilde
eth
plusmnsup2raquo
otilde
eth
plusmnordm
otilde
otilde
eth
notcedilraquo
otilde
otilde
eth
sup3plusmnshynot
otilde
frac12plusmnsup3degraquoacuteacutemiddotsup2sup1
eth
ordfiquestregmiddotiquestnotmiddotplusmnsup2shy
eth
eth
plusmnsup2
eth
eth
notcedilmiddotshy
eth
notcedilraquosup3raquo
eth
ograve
yen
eth
eth
Icircplusmnsup1raquoreg
eth
Uumlplusmnfrac14sup1raquoreg
yen
yen
eth
middotshy
yen
eth
plusmnsup2raquo
yen
eth
plusmnordm
yen
yen
eth
notcedilraquo
yen
yen
yen
acuteraquoiquestshynot
otilde
frac12plusmnsup3degraquoacuteacutemiddotsup2sup1
eth
ordfiquestregmiddotiquestnotmiddotplusmnsup2shy
eth
eth
plusmnsup2
eth
eth
notcedilmiddotshy
eth
notcedilraquosup3raquo
eth
ograve
otilde
eth
times
otilde
otilde
otilde
acutemiddotmicroraquofrac14
eth
eth
eth
raquoordfraquoregsect
eth
eth
shymiddotsup2sup1acuteraquo
eth
sup3middotsup2laquonotraquo
eth
eth
plusmnordm
eth
eth
notcedilmiddotshy
eth
eth
ograve
yen
eth
times
yen
yen
eth
eth
frac14middotfrac14
eth
sup2ugravenot
eth
eth
acutemiddotmicroraquo
eth
eth
eth
iquest
eth
eth
shymiddotsup2sup1acuteraquo
eth
sup3middotsup2laquonotraquo
eth
eth
plusmnordm
eth
eth
notcedilmiddotshy
eth
eth
ograve
yen
eth
timesnot
yen
yen
eth
eth
ugraveshy
eth
paralaquoshynot
yen
otilde
middotsup2frac12regraquofrac14middotfrac34acutesect
yen yen
frac14laquoacuteacute
eth
ograve
eth
eth
timesnot
eth
eth
eth
eth
eth
ugraveshy
otilde
yen
sup2plusmnnot
yen yen
frac14laquoacuteacute
eth
ograve
Uacutemiddotsup1laquoregraquo ccedilaelig IcircOgraveIgraveOgrave degregraquofrac14middotfrac12notmiddotplusmnsup2 plusmnordm degplusmnshymiddotnotmiddotordfraquo iquestsup2frac14 sup2raquosup1iquestnotmiddotordfraquo oslashfrac34plusmnnotnotplusmnsup3 regmiddotsup1cedilnotdivide shyraquosup2notraquosup2frac12raquoshy iquestsup2frac14 notcedilraquomiddotreg sup2raquosup1iquestnotmiddotplusmnsup2ograve
Recurrent Nets
x1 x2 x3
h1 h2 h3
y1 y2 y3
A A A
C C C
B B
RNNs are used in timeseries applications
The basic idea is that the hidden units at time ht (and possibly output yt)depend on the previous state of the network htminus1 xtminus1 ytminus1 for inputs xt andoutputs yt
In the above network I lsquounrolled the net through timersquo to give a standard NNdiagram
I omitted the potential links from xtminus1 ytminus1 to ht
Handwriting Generation using a RNN
Some training examples
Handwriting Generation using a RNN
Some generated examples Top line is real handwriting for comparison See AlexGraversquos work
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reasons research in deep learning has exploded
Much greater compute power (GPU)
Much larger datasets
AutoDiff
What is AutoDiff
AutoDiff takes a function f(x) and returns an exact value (up to machineaccuracy) for the gradient
gi(x) equivpart
partxif
∥∥∥∥x
Note that this is not the same as a numerical approximation (such as centraldifferences) for the gradient
One can show that if done efficiently one can always calculate the gradient inless than 5 times the time it takes to compute f(x)
Reverse DifferentiationA useful graphical representation is that the total derivative of f with respect to xis given by the sum over all path values from x to f where each path value is theproduct of the partial derivatives of the functions on the edges
df
dx=partf
partx+partf
partg
dg
dx
x
f
gpartfpartx
dgdx
partfpartg
Example
For f(x) = x2 + xgh where g =x2 and h = xg2
x
f
gh2x+ gh
2x
xh
2gx
xg
g2
f prime(x) = (2x+ gh) + (g2xg) + (2x2gxxg) + (2xxh) = 2x+ 8x7
Reverse DifferentiationConsider
f(x1 x2) = cos (sin(x1x2))
We can represent this computationally using an Abstract Syntax Tree (AST)
x1 x2
f1
f2
f3
f1(x1 x2) = x1x2
f2(x) = sin(x)
f3(x) = cos(x)
Given values for x1 x2 we first run forwards through the tree so that we canassociate each node with an actual function value
Reverse Differentiation
x1 x2
f1
f2
f3
df3dx1
=partf3partf2
df2dx1
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx1
Similarly
df3dx2
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx2
The two derivatives share the same computation branch andwe want to exploit this
Reverse Differentiation
x1 x2
f1
f2
f3
partf1partx1
= x2partf1partx2
= x1
partf2partf1
= cos(f1)
partf3partf2
= minus sin(f2)
1 Find the reverse ancestral (backwards) scheduleof nodes (f3 f2 f1 x1 x2)
2 Start with the first node n1 in the reverseschedule and define tn1 = 1
3 For the next node n in the reverse schedule findthe child nodes ch (n) Then define
tn =sum
cisinch(n)
partfcpartfn
tc
4 The total derivatives of f with respect to theroot nodes of the tree (here x1 and x2) are givenby the values of t at those nodes
This is a general procedure that can be used to automatically define a subroutineto efficiently compute the gradient It is efficient because information is collectedat nodes in the tree and split between parents only when required
Limitations of forward reasoning
World Representation
Recognising patterns (perceptron style) is only one form of intelligence
Solving chess problems is another and requires complex reasoning using someform of internal model
The world is noisy and information may be conflicting
Recognised that new approaches are required
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Limitations of forward reasoning
World Representation
Models help us to fantasise about the world
Models
Models
Burglar Problem
Creaks and Bumps
Creak Bump
Burglar Model
pos1 pos2 pos3 pos4
snd1 snd2 snd3 snd4
pos - position in kitchensnd ndash sound
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Stubby Fingers
Stubby Fingers
int1 int2 int3 int4
hit1 hit2 hit3 hit4
int - intended keyhit ndash hit key
Stubby Fingers errors
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz
005
01
015
02
025
03
035
04
045
05
055
Stubby Fingers language
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz 0
01
02
03
04
05
06
07
08
09
Stubby Fingers
Given the typed sequence cwsykcak what is the most likely word that thiscorresponds to
List the 200 most likely hidden sequences
Discard those that are not in a standard English dictionary
Take the most likely proper English word as the intended typed word
Speech Recognition raw signal
0 01 02 03 04 05 06 07 08 09minus02
minus015
minus01
minus005
0
005
01
015
02
025
03
Time
lsquoneuralrsquo representation
10 20 30 40 50 60 70 80
5
10
15
20
25
Speech Recognition
pho1 pho2 pho3 pho4
aud1 aud2 aud3 aud4
pho phoneme (letter)aud audio signal (neural representation)
Medical Diagnosis
tumour flu meningitis
headache fever appetite x-ray
Combine known medical knowledge with patient specific information
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems such as the IsingModel (1920) and in AI applications such as the HMM (Baum 1966Stratonovich 1960)
The need for structure
We often want to make a probabilistic description of many objects (electronspins neurons customers etc )
Typically the representational and computational cost of probabilistic modelsgrows exponentially with the number of objects represented
Without introducing strong structural limitations about how these objects caninteract probability is a non-starter
For this reason computationally lsquosimplerrsquo alternatives (such as fuzzy logic)were introduced to try to avoid some of these difficulties ndash however these aretypically frowed on by purists
Graphical Models
We can use graphs to represent how objects can probabilistically interact witheach other
Graphical Models and then a marriage between Graph and Probability theory
Many of the quantities that we would like to compute in a probabilitydistribution can then be related to operations on the graph
The computational complexity of operations can often be related to thestructure of the graph
Graphical Models are now used as a standard framework in EngineeringStatistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty and aretherefore widely applicable
Uses in Industry
Microsoft used to estimate the skill distribution of players in online games(the worlds largest graphical model)
Hospitals use Belief Nets to encode knowledge about diseases and symptomsto aid medical diagnosis
Google Microsoft Facebook used in many places including advertisingvideo game prediction speech recognition
Used to estimate inherent desirability of products in consumer retail
Microsoft and others Attempt to go beyond simple AB testing by usesGraphical Models to model the whole companyuser relationship
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or more shortly theprobability of x given y) is defined as
p(x|y) equiv p(x y)
p(y)=p(y|x)p(x)
p(y)(Bayesrsquo rule)
Throwing darts
p(region 5|not region 20) =p(region 5 not region 20)
p(not region 20)
=p(region 5)
p(not region 20)=
120
1920=
1
19
Interpretationp(A = a|B = b) should not be interpreted as lsquoGiven the event B = b has occurredp(A = a|B = b) is the probability of the event A = a occurringrsquo The correctinterpretation should be lsquop(A = a|B = b) is the probability of A being in state aunder the constraint that B is in state brsquo
Battleships
Assume there are 2 ships 1 vertical (ship 1) and 1 horizontal (ship 2) of 5pixels each
Can be placed anywhere on the 10times10 grid but cannot overlap
Let s1 is the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query lsquohitrsquo or lsquomissrsquo responses
p(s1 s2|D) =p(D|s1 s2)p(s1 s2)
p(D)Let X be the matrix of pixel occupancy
p(X|D) =sums1s2
p(X s1 s2|D) =sums1s2
p(X|s1 s2)p(s1 s2|D)
demoBattleshipsm
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
Connectionism
1960 Realised a perceptron can only solve simple tasks
1970 Decline in interest
1980 New computing power made training multilayer networks feasible
outputinputs
Each node (or lsquoneuronrsquo) computes a function of a weighted combination ofparental nodes hj = σ(
sumi wijhi)
Neural Networks and Deep LearningHistorical Problems with Neural Nets (1990s)
NNs are difficult to train (many local optima)
Particularly difficult to train a NN with a large number of layers (say largerthan around 10)
lsquoGradient Diffusion Problemrsquo ndash difficult to assign responsibility of errors toindividual lsquoneuronsrsquo
Machine Learning (up to 2006)
A large section of the machine learning community abandoned NNs
More principled and computationally better understood techniques (SVMsand related convex methods) replaced them
Bayesian AI (1990s onwards)
From mid 1990s there was a realisation that pattern recognition is notsufficient for all AI purposes
Uncertainty and reasoning are not naturally representable using standardfeed-forward nets
Explosion in more lsquosymbolicrsquo Bayesian AI
Deep Learning
NNs have resurged in interest in the last few years (Hinton Bengio )
Also called lsquodeep learningrsquo
Sense that very complex tasks (object recognition learning complex structurein data) requires going beyond simple (convex) statistical techniques
The brain uses hierarchical distributed processing and it is likely to be for agood reason
Many problems have a hierarchical structure images are made of partslanguage is hierarchical etc
Why now
New computing resources (GPU processing)
Availability of large amount of data means that we can train nets with manyparameters (1010)
Recent evidence suggests local optima are not particularly problematic
Autoencoder
y1 y2 y3 y4 y5
h1 h2 h3
h4 h5
y1 y2 y3 y4 y5
h6 h7 h8
The bottleneck forces the network to try to find a low dimensionalrepresentation of the data
Useful for unsupervised learning
Autoencoder on MNIST digits (Hinton 2006 Science)
Figure Reconstructions using H = 30 components From the Top Original imageAutoencoder1 Autoencoder2 PCA
60000 training images (28times 28 = 784 pixels)
Use a form of autoencoder to find a lower (30) dimensional representation
At the time the special layerwise training procedure was consideredfundamental to the success of this approach Now not deemed necessaryprovided we use a sensible initialisation
Google Cats
10 Million Youtube video frames (200x200 pixel images)
Use a specialised autoencoder with 9 layers (1 billion weights)
2000 computers + two weeks of computing
Examine units to see what images they most respond to
Google Autoencoder
From Nando De Freitas
Convolutional NNs
CNNs are particularly popular in image processing
Often the feature maps correspond (not to macro features such as bicycles)but micro features
For example in handwritten digit recognition they correspond to smallconstituent parts of the digits
These are used then to process the image into a representation that is betterfor recognition
NNs in NLP
Bag of Words
We have D words in a dictionary aardvark zorro so that we can relateeach word with its dictionary index
We can also think of this as a Euclidean embedding e
aardvarkrarr eaardvark =
100
zorrorarr ezorro =
001
Word Embeddings
Idea is to replace the Euclidean embeddings e with embeddings (vectors) vthat are learned
Objective is for example next word prediction accuracy
These are often called lsquoneural language modelsrsquo
NNs in NLP
Each word w in the dictionary has an associated embedding vector vwUsually around 200 dimensional vectors are used
Consider the sentence
the cat sat on the mat
and that we wish to predict the word on given the two preceding cat sat
and two succeeding words the mat
We can use a network that has inputs vcat vsat vthe vmat
The output of the network is a probability over all words in the dictionaryp(w| vinputs)We want p(w = on|vcatvsatvthevmat) to be high
The overall objective is then to learn all the word embeddings and networkparameters subject to predicting the word correctly based on the context
Word Embeddings
Given a word (France for example) we can find which words w have embeddingvectors closest to vFrance From Ronan Collabert (2011)
Word Embeddings
There appears to be a natural lsquogeometryrsquo to the embeddings For example thereare directions that correspond to gender
vwoman minus vman asymp vaunt minus vuncle
vwoman minus vman asymp vqueen minus vking
From Mikolov (2013)
Word Embeddings Analogies
Given a relationship France-Paris we get the lsquorelationshiprsquo embedding
v = vParis minus vFrance
Given Italy we can calculate vItaly + v and find the word in the dictionary whichhas closest embedding to this (it turns out to be Rome) From Mikolov (2013)
Word Embeddings Constrained Embeddings
We can learn embeddings for English words and embeddings for ChinesewordsHowever when we know that a Chinese and English word have a similarmeaning we add a constraint that the word embeddings vChineseWord andvEnglishWord should be closeWe have only a small amount of labelled lsquosimilarrsquo Chinese-English words(these are the green border boxes in the above they are standard translationsof the corresponding Chinese character)We can visualise in 2D (using t-SNE) the embedding vectors See Socher(2013)
Word Embeddings Constrained Embeddings
Recursive Nets and Embeddings
Stanford Sentiment Treebank Consists of parsed sentences with sentiment labels(minusminusminus 0+++) for each node (phrase) in the tree 215000 labelled phrases(obtained from three humans)
Recursive Nets and Embeddings
Idea is to recursively combine embeddings such that they accurately predictthe sentiment at each node
Recursive Nets and EmbeddingsTraining
We have a softmax classifier for each node in the tree to predict thesentiment of the phrase beneath this node in the tree
The weights of this classifier are shared across all nodes
At the leaf nodes at the bottom of the tree the inputs to the classifiers arethe word embeddings
The embeddings are combined by another network g with commonparameters which forms the input to the sentiment classifier
We then learn all the embeddings shared classifier parameters and sharedcombination parameters to maximise the classification accuracy
Prediction
For a new movie review the review is first parsed using a standard grammartree parser
This forms the tree which can be used to recursively form the sentiment classlabel for the review
Currently the best sentiment classifier Socher (2013)
Recursive Nets and Embeddingsotilde otilde
eth
eth
Icircplusmnsup1raquoreg
eth
Uumlplusmnfrac14sup1raquoreg
otilde
otilde
eth
middotshy
otilde
eth
plusmnsup2raquo
otilde
eth
plusmnordm
otilde
otilde
eth
notcedilraquo
otilde
otilde
eth
sup3plusmnshynot
otilde
frac12plusmnsup3degraquoacuteacutemiddotsup2sup1
eth
ordfiquestregmiddotiquestnotmiddotplusmnsup2shy
eth
eth
plusmnsup2
eth
eth
notcedilmiddotshy
eth
notcedilraquosup3raquo
eth
ograve
yen
eth
eth
Icircplusmnsup1raquoreg
eth
Uumlplusmnfrac14sup1raquoreg
yen
yen
eth
middotshy
yen
eth
plusmnsup2raquo
yen
eth
plusmnordm
yen
yen
eth
notcedilraquo
yen
yen
yen
acuteraquoiquestshynot
otilde
frac12plusmnsup3degraquoacuteacutemiddotsup2sup1
eth
ordfiquestregmiddotiquestnotmiddotplusmnsup2shy
eth
eth
plusmnsup2
eth
eth
notcedilmiddotshy
eth
notcedilraquosup3raquo
eth
ograve
otilde
eth
times
otilde
otilde
otilde
acutemiddotmicroraquofrac14
eth
eth
eth
raquoordfraquoregsect
eth
eth
shymiddotsup2sup1acuteraquo
eth
sup3middotsup2laquonotraquo
eth
eth
plusmnordm
eth
eth
notcedilmiddotshy
eth
eth
ograve
yen
eth
times
yen
yen
eth
eth
frac14middotfrac14
eth
sup2ugravenot
eth
eth
acutemiddotmicroraquo
eth
eth
eth
iquest
eth
eth
shymiddotsup2sup1acuteraquo
eth
sup3middotsup2laquonotraquo
eth
eth
plusmnordm
eth
eth
notcedilmiddotshy
eth
eth
ograve
yen
eth
timesnot
yen
yen
eth
eth
ugraveshy
eth
paralaquoshynot
yen
otilde
middotsup2frac12regraquofrac14middotfrac34acutesect
yen yen
frac14laquoacuteacute
eth
ograve
eth
eth
timesnot
eth
eth
eth
eth
eth
ugraveshy
otilde
yen
sup2plusmnnot
yen yen
frac14laquoacuteacute
eth
ograve
Uacutemiddotsup1laquoregraquo ccedilaelig IcircOgraveIgraveOgrave degregraquofrac14middotfrac12notmiddotplusmnsup2 plusmnordm degplusmnshymiddotnotmiddotordfraquo iquestsup2frac14 sup2raquosup1iquestnotmiddotordfraquo oslashfrac34plusmnnotnotplusmnsup3 regmiddotsup1cedilnotdivide shyraquosup2notraquosup2frac12raquoshy iquestsup2frac14 notcedilraquomiddotreg sup2raquosup1iquestnotmiddotplusmnsup2ograve
Recurrent Nets
x1 x2 x3
h1 h2 h3
y1 y2 y3
A A A
C C C
B B
RNNs are used in timeseries applications
The basic idea is that the hidden units at time ht (and possibly output yt)depend on the previous state of the network htminus1 xtminus1 ytminus1 for inputs xt andoutputs yt
In the above network I lsquounrolled the net through timersquo to give a standard NNdiagram
I omitted the potential links from xtminus1 ytminus1 to ht
Handwriting Generation using a RNN
Some training examples
Handwriting Generation using a RNN
Some generated examples Top line is real handwriting for comparison See AlexGraversquos work
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reasons research in deep learning has exploded
Much greater compute power (GPU)
Much larger datasets
AutoDiff
What is AutoDiff
AutoDiff takes a function f(x) and returns an exact value (up to machineaccuracy) for the gradient
gi(x) equivpart
partxif
∥∥∥∥x
Note that this is not the same as a numerical approximation (such as centraldifferences) for the gradient
One can show that if done efficiently one can always calculate the gradient inless than 5 times the time it takes to compute f(x)
Reverse DifferentiationA useful graphical representation is that the total derivative of f with respect to xis given by the sum over all path values from x to f where each path value is theproduct of the partial derivatives of the functions on the edges
df
dx=partf
partx+partf
partg
dg
dx
x
f
gpartfpartx
dgdx
partfpartg
Example
For f(x) = x2 + xgh where g =x2 and h = xg2
x
f
gh2x+ gh
2x
xh
2gx
xg
g2
f prime(x) = (2x+ gh) + (g2xg) + (2x2gxxg) + (2xxh) = 2x+ 8x7
Reverse Differentiation
Consider

f(x1, x2) = cos(sin(x1 x2))
We can represent this computationally using an Abstract Syntax Tree (AST)
[AST: x1, x2 → f1 → f2 → f3]

f1(x1, x2) = x1 x2
f2(x) = sin(x)
f3(x) = cos(x)
Given values for x1, x2, we first run forwards through the tree so that we can associate each node with an actual function value
Reverse Differentiation
[AST: x1, x2 → f1 → f2 → f3]

df3/dx1 = (∂f3/∂f2)(df2/dx1) = (∂f3/∂f2)(df2/df1)(df1/dx1), where (∂f3/∂f2)(df2/df1) = df3/df1

Similarly,

df3/dx2 = (∂f3/∂f2)(df2/df1)(df1/dx2)

The two derivatives share the same computation branch, df3/df1, and we want to exploit this
Reverse Differentiation
[AST: x1, x2 → f1 → f2 → f3, with local derivatives]

∂f1/∂x1 = x2,  ∂f1/∂x2 = x1,  ∂f2/∂f1 = cos(f1),  ∂f3/∂f2 = -sin(f2)

1. Find the reverse ancestral (backwards) schedule of nodes (f3, f2, f1, x1, x2)
2. Start with the first node n1 in the reverse schedule and define t_{n1} = 1
3. For the next node n in the reverse schedule, find the child nodes ch(n). Then define

t_n = Σ_{c ∈ ch(n)} (∂f_c/∂f_n) t_c

4. The total derivatives of f with respect to the root nodes of the tree (here x1 and x2) are given by the values of t at those nodes
This is a general procedure that can be used to automatically define a subroutine to efficiently compute the gradient. It is efficient because information is collected at nodes in the tree and split between parents only when required
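For the cos(sin(x1 x2)) example above, the whole procedure fits in a few lines; this is a hand-coded sketch of the reverse schedule rather than a general implementation:

import math

def value_and_grad(x1, x2):
    # forward pass: associate each node with an actual function value
    f1 = x1 * x2
    f2 = math.sin(f1)
    f3 = math.cos(f2)
    # reverse pass over the schedule (f3, f2, f1, x1, x2); t_n accumulates df3/dn
    t_f3 = 1.0
    t_f2 = -math.sin(f2) * t_f3  # ∂f3/∂f2 · t_f3
    t_f1 = math.cos(f1) * t_f2   # ∂f2/∂f1 · t_f2: the shared branch df3/df1, computed once
    t_x1 = x2 * t_f1             # ∂f1/∂x1 · t_f1
    t_x2 = x1 * t_f1             # ∂f1/∂x2 · t_f1
    return f3, (t_x1, t_x2)

print(value_and_grad(0.3, 0.7))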
Limitations of forward reasoning
World Representation
Recognising patterns (perceptron style) is only one form of intelligence
Solving chess problems is another, and requires complex reasoning using some form of internal model
The world is noisy and information may be conflicting
Recognised that new approaches are required
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Limitations of forward reasoning
World Representation
Models help us to fantasise about the world
Models
Burglar Problem
Creaks and Bumps
Creak Bump
Burglar Model
pos1 pos2 pos3 pos4
snd1 snd2 snd3 snd4
pos - position in kitchen; snd - sound
Finding the Burglar
[Figure: kitchen grid showing the inferred burglar position given the observed 'creak' and 'bump' sounds at each timestep]
Stubby Fingers
int1 int2 int3 int4
hit1 hit2 hit3 hit4
int - intended key; hit - hit key
Stubby Fingers: errors

[Figure: keystroke error matrix p(hit|int) over the letters a-z, probability scale 0.05 to 0.55]
Stubby Fingers: language

[Figure: letter-transition matrix of the language model over the letters a-z, probability scale 0 to 0.9]
Stubby Fingers
Given the typed sequence cwsykcak, what is the most likely word that this corresponds to?
List the 200 most likely hidden sequences
Discard those that are not in a standard English dictionary
Take the most likely proper English word as the intended typed word
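A sketch of the same idea, though scoring same-length dictionary words directly rather than listing the 200 most likely hidden sequences; the error and language-model tables here are invented stand-ins for the matrices plotted above:

import numpy as np

letters = 'abcdefghijklmnopqrstuvwxyz'
idx = {c: i for i, c in enumerate(letters)}

# invented stand-ins: a diagonally dominant error model and a random language model
error = np.full((26, 26), 0.01) + 0.7 * np.eye(26)  # rows: intended key, columns: hit key
error /= error.sum(axis=1, keepdims=True)           # normalise to p(hit | int)
lang = np.random.default_rng(1).dirichlet(np.ones(26), size=26)  # p(int_t | int_{t-1})

def score(word, typed):
    """Joint probability of an intended word and the observed keystrokes."""
    s = 1.0 / 26                                    # uniform distribution over the first letter
    for t, (w, v) in enumerate(zip(word, typed)):
        if t > 0:
            s *= lang[idx[word[t - 1]], idx[w]]     # language-model transition
        s *= error[idx[w], idx[v]]                  # keystroke error model
    return s

typed = 'cwsykcak'
dictionary = ['casually', 'cossacks', 'keyboard']   # toy same-length dictionary
print(max(dictionary, key=lambda w: score(w, typed)))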
Speech Recognition: raw signal

[Figure: raw audio waveform, amplitude roughly -0.2 to 0.3, against time 0 to 0.9 s]
'neural' representation

[Figure: time-frequency representation of the same signal]
Speech Recognition
pho1 pho2 pho3 pho4
aud1 aud2 aud3 aud4
pho - phoneme (letter); aud - audio signal (neural representation)
Medical Diagnosis
tumour flu meningitis
headache fever appetite x-ray
Combine known medical knowledge with patient-specific information
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems, such as the Ising Model (1920), and in AI applications, such as the HMM (Baum 1966; Stratonovich 1960)
The need for structure
We often want to make a probabilistic description of many objects (electron spins, neurons, customers, etc.)
Typically the representational and computational cost of probabilistic models grows exponentially with the number of objects represented
Without introducing strong structural limitations on how these objects can interact, probability is a non-starter
For this reason computationally 'simpler' alternatives (such as fuzzy logic) were introduced to try to avoid some of these difficulties; however, these are typically frowned upon by purists
Graphical Models
We can use graphs to represent how objects can probabilistically interact with each other
Graphical Models are then a marriage between Graph theory and Probability theory
Many of the quantities that we would like to compute in a probability distribution can then be related to operations on the graph
The computational complexity of operations can often be related to the structure of the graph
Graphical Models are now used as a standard framework in Engineering, Statistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty, and are therefore widely applicable
Uses in Industry
Microsoft: used to estimate the skill distribution of players in online games (the world's largest graphical model)
Hospitals: use Belief Nets to encode knowledge about diseases and symptoms, to aid medical diagnosis
Google, Microsoft, Facebook: used in many places, including advertising, video game prediction and speech recognition
Used to estimate the inherent desirability of products in consumer retail
Microsoft and others: attempts to go beyond simple A/B testing by using Graphical Models to model the whole company-user relationship
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or, more shortly, the probability of x given y) is defined as

p(x|y) ≡ p(x, y)/p(y) = p(y|x)p(x)/p(y)    (Bayes' rule)
Throwing darts
p(region 5 | not region 20) = p(region 5, not region 20)/p(not region 20) = p(region 5)/p(not region 20) = (1/20)/(19/20) = 1/19
Interpretation
p(A = a|B = b) should not be interpreted as 'Given the event B = b has occurred, p(A = a|B = b) is the probability of the event A = a occurring'. The correct interpretation should be 'p(A = a|B = b) is the probability of A being in state a under the constraint that B is in state b'
Battleships
Assume there are 2 ships, 1 vertical (ship 1) and 1 horizontal (ship 2), of 5 pixels each
Each can be placed anywhere on the 10×10 grid, but they cannot overlap
Let s1 be the origin of ship 1 and s2 the origin of ship 2
The data D is a collection of query 'hit' or 'miss' responses
p(s1, s2|D) = p(D|s1, s2) p(s1, s2) / p(D)

Let X be the matrix of pixel occupancy:

p(X|D) = Σ_{s1,s2} p(X, s1, s2|D) = Σ_{s1,s2} p(X|s1, s2) p(s1, s2|D)

demoBattleships.m
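demoBattleships.m itself is not reproduced here; the computation can be sketched in a few lines, assuming a uniform prior over valid (non-overlapping) placements and noiseless query responses:

import numpy as np

def placements():
    """All non-overlapping placements: ship 1 vertical, ship 2 horizontal, length 5."""
    for r1 in range(6):
        for c1 in range(10):
            m1 = np.zeros((10, 10), bool)
            m1[r1:r1 + 5, c1] = True
            for r2 in range(10):
                for c2 in range(6):
                    m2 = np.zeros((10, 10), bool)
                    m2[r2, c2:c2 + 5] = True
                    if not (m1 & m2).any():
                        yield m1 | m2

D = {(0, 0): False, (4, 3): True}                 # example query responses: (row, col) -> hit?

post, n = np.zeros((10, 10)), 0
for X in placements():                            # sum over s1, s2
    if all(X[q] == hit for q, hit in D.items()):  # p(D|s1, s2) is 1 or 0 for noiseless queries
        post += X
        n += 1
print((post / n).round(2))                        # p(X_ij occupied | D)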
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated the conditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditional probabilities:

p(A, B, C, D, E) = p(A)p(B)p(C|A, B)p(D|C)p(E|B, C)

[DAG: A→C, B→C, B→E, C→D, C→E]
Example - Part I
Sally's burglar Alarm is sounding. Has she been Burgled, or was the alarm triggered by an Earthquake? She turns the car Radio on for news of earthquakes
Choosing an ordering
Without loss of generality, we can write

p(A, R, E, B) = p(A|R, E, B)p(R, E, B)
             = p(A|R, E, B)p(R|E, B)p(E, B)
             = p(A|R, E, B)p(R|E, B)p(E|B)p(B)

Assumptions:

The alarm is not directly influenced by any report on the radio: p(A|R, E, B) = p(A|E, B)
The radio broadcast is not directly influenced by the burglar variable: p(R|E, B) = p(R|E)
Burglaries don't directly 'cause' earthquakes: p(E|B) = p(E)

Therefore

p(A, R, E, B) = p(A|E, B)p(R|E)p(E)p(B)
Example - Part II: Specifying the Tables

[DAG: B→A, E→A, E→R]

p(A = 1|B, E):
B = 1, E = 1: 0.9999
B = 1, E = 0: 0.99
B = 0, E = 1: 0.99
B = 0, E = 0: 0.0001

p(R = 1|E):
E = 1: 1
E = 0: 0

The remaining tables are p(B = 1) = 0.01 and p(E = 1) = 0.000001. The tables and graphical structure fully specify the distribution
Example - Part III: Inference
Initial Evidence: The alarm is sounding
p(B = 1|A = 1) = Σ_{E,R} p(B = 1, E, A = 1, R) / Σ_{B,E,R} p(B, E, A = 1, R)
             = Σ_{E,R} p(A = 1|B = 1, E)p(B = 1)p(E)p(R|E) / Σ_{B,E,R} p(A = 1|B, E)p(B)p(E)p(R|E) ≈ 0.99
Additional Evidence: The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1, R = 1) ≈ 0.01
Initially, because the alarm sounds, Sally thinks that she's been burgled. However, this probability drops dramatically when she hears that there has been an earthquake
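Both numbers are easy to verify by brute-force enumeration of the tables above (a sketch):

import itertools

pB = {1: 0.01, 0: 0.99}
pE = {1: 0.000001, 0: 0.999999}
pA1 = {(1, 1): 0.9999, (1, 0): 0.99, (0, 1): 0.99, (0, 0): 0.0001}  # p(A=1 | B,E)
pR1 = {1: 1.0, 0: 0.0}                                              # p(R=1 | E)

def p(a, r, e, b):
    """Joint p(A=a, R=r, E=e, B=b) = p(A|E,B) p(R|E) p(E) p(B)."""
    pa = pA1[(b, e)] if a == 1 else 1 - pA1[(b, e)]
    pr = pR1[e] if r == 1 else 1 - pR1[e]
    return pa * pr * pE[e] * pB[b]

# p(B=1 | A=1): sum out E and R
num = sum(p(1, r, e, 1) for r, e in itertools.product([0, 1], repeat=2))
den = sum(p(1, r, e, b) for r, e, b in itertools.product([0, 1], repeat=3))
print(num / den)  # ~0.99

# p(B=1 | A=1, R=1): also condition on the radio report
num = sum(p(1, 1, e, 1) for e in [0, 1])
den = sum(p(1, 1, e, b) for e, b in itertools.product([0, 1], repeat=2))
print(num / den)  # ~0.01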
Markov Models
For timeseries data v1, …, vT we need a model p(v_{1:T}). For causal consistency it is meaningful to consider the decomposition

p(v_{1:T}) = ∏_{t=1}^{T} p(v_t|v_{1:t-1})

with the convention p(v_t|v_{1:t-1}) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptions
It is often natural to assume that the influence of the immediate past is more relevant than the remote past, and in Markov models only a limited number of previous observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(v_t|v_1, …, v_{t-1}) = p(v_t|v_{t-L}, …, v_{t-1})

where L ≥ 1 is the order of the Markov chain. For a first order chain,

p(v_{1:T}) = p(v1)p(v2|v1)p(v3|v2) ⋯ p(vT|v_{T-1})

For a stationary Markov chain the transitions p(v_t = s'|v_{t-1} = s) = f(s', s) are time-independent ('homogeneous')
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1, …, vT) = p(v1) ∏_{t=2}^{T} p(v_t|v_{t-1})    (initial × transition)

State transition diagram
Nodes represent states of the variable v, and arcs represent non-zero elements of the transition p(v_t|v_{t-1})

[Figure: state-transition diagram on states 1 to 9]
Most probable and shortest paths
[Figure: the same state-transition diagram on states 1 to 9]

The shortest (unweighted) path from state 1 to state 7 is 1-2-7
The most probable path from state 1 to state 7 is 1-8-9-7 (assuming uniform transition probabilities). The latter path is longer but more probable, since for the path 1-2-7 the probability of exiting state 2 into state 7 is 1/5
Equilibrium distribution
It is interesting to know how the marginal p(x_t) evolves through time:

p(x_t = i) = Σ_j p(x_t = i|x_{t-1} = j) p(x_{t-1} = j),    M_ij ≡ p(x_t = i|x_{t-1} = j)

p(x_t = i) is the frequency that we visit state i at time t, given we started from p(x1) and randomly drew samples from the transition p(x_τ|x_{τ-1}). As we repeatedly sample a new state from the chain, the distribution at time t for an initial distribution p1(i) is

p_t = M^{t-1} p1

If, for t → ∞, p_∞ is independent of the initial distribution p1, then p_∞ is called the equilibrium distribution of the chain:

p_∞ = M p_∞

The equilibrium distribution is proportional to the eigenvector with unit eigenvalue of the transition matrix
PageRank
Define the matrix

A_ij = 1 if website j has a hyperlink to website i, and 0 otherwise

From this we can define a Markov transition matrix with elements

M_ij = A_ij / Σ_{i'} A_{i'j}

If we jump from website to website, the equilibrium distribution component p_∞(i) is the relative number of times we will visit website i. This has a natural interpretation as the 'importance' of website i
For each website i a list of words associated with that website is collected. After doing this for all websites, one can make an 'inverse' list of which websites contain word w. When a user searches for word w, the list of websites containing that word is then returned, ranked according to the importance of the site
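A sketch of both ideas together: build M from a small invented link matrix and find the equilibrium distribution by repeatedly applying the transition (power iteration):

import numpy as np

# invented toy web: A[i, j] = 1 if website j links to website i
A = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 1, 0, 0],
              [0, 0, 1, 0]], float)
M = A / A.sum(axis=0, keepdims=True)  # M_ij = A_ij / sum_i' A_i'j

p = np.full(4, 0.25)                  # any initial distribution p1
for _ in range(200):
    p = M @ p                         # p_t = M^(t-1) p1
print(p)                              # equilibrium distribution p = M p: the 'importance' of each site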
Hidden Markov Models
The HMM defines a Markov chain on hidden (or 'latent') variables h_{1:T}. The observed (or 'visible') variables are dependent on the hidden variables through an emission p(v_t|h_t). This defines a joint distribution

p(h_{1:T}, v_{1:T}) = p(v1|h1)p(h1) ∏_{t=2}^{T} p(v_t|h_t)p(h_t|h_{t-1})

For a stationary HMM, the transition p(h_t|h_{t-1}) and emission p(v_t|h_t) distributions are constant through time
v1 v2 v3 v4
h1 h2 h3 h4

Figure: a first order hidden Markov model with 'hidden' variables dom(h_t) = {1, …, H}, t = 1, …, T. The 'visible' variables v_t can be either discrete or continuous
The classical inference problems
Filtering (inferring the present): p(h_t|v_{1:t})
Prediction (inferring the future): p(h_t|v_{1:s}), t > s
Smoothing (inferring the past): p(h_t|v_{1:u}), t < u
Likelihood: p(v_{1:T})
Most likely path (Viterbi alignment): argmax_{h_{1:T}} p(h_{1:T}|v_{1:T})

For prediction, one is also often interested in p(v_t|v_{1:s}) for t > s
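Filtering, for example, is a single forward sweep; a minimal sketch for discrete states, with transition[i, j] = p(h_t = i|h_{t-1} = j) and emission[v, h] = p(v_t = v|h_t = h):

import numpy as np

def filtering(obs, p1, transition, emission):
    """Return p(h_t | v_{1:t}) for each t."""
    alpha = emission[obs[0]] * p1                        # correct the prior with the first observation
    alphas = [alpha / alpha.sum()]
    for v in obs[1:]:
        alpha = emission[v] * (transition @ alphas[-1])  # predict one step, then correct
        alphas.append(alpha / alpha.sum())
    return np.array(alphas)

# toy 2-state, 2-symbol HMM
transition = np.array([[0.9, 0.2], [0.1, 0.8]])
emission = np.array([[0.8, 0.3], [0.2, 0.7]])
print(filtering([0, 0, 1, 1], np.array([0.5, 0.5]), transition, emission))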
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering, Smoothing and Viterbi are all computationally efficient, scaling linearly with the length of the timeseries (but quadratically with the number of hidden states)
The algorithms are variants of 'message passing on factor graphs'
The algorithms are guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing for approximate inference in multiply-connected graphs (e.g. low-density parity-check codes)
HMMs for speech recognition
h_t is the phoneme at time t; p(h_t|h_{t-1}): language model; p(v_t|h_t): speech signal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speech recognition
The breakthrough is to model p(v_t|h_t) as a Gaussian whose mean is some function of the phoneme, μ(h_t; θ)
This function is a deep neural network, trained on a large amount of data
There is a goldrush at the moment to find similar breakthrough applications of deep networks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images, for example) can be constructed on the basis of a low dimensional representation
Note that this is a Graphical Model, not a Function
The latent variables h can be sampled from using p(h), and an image then sampled from p(v|h). One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v)) and parameter learning are intractable in these models
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method: much faster for inference
Variational Inference
Consider a distribution

p(v|θ) = ∫_h p(v|h, θ)p(h)

and that we wish to learn θ to maximise the probability this model generates observed data. The bound is

log p(v|θ) ≥ -∫_h q(h|v, φ) log q(h|v, φ) + ∫_h q(h|v, φ) log [p(v|h, θ)p(h)]
The idea is to choose a 'variational' distribution q(h|v, φ) such that we can either calculate the bound analytically or sample it efficiently
We then jointly maximise the bound with respect to φ and θ
We can parameterise p(v|h, θ) using a deep network
Very popular approach: see the 'variational autoencoder' and also attention mechanisms
Extension to a semi-supervised method using p(v) = ∫_h Σ_c p(v|h, c)p(c)p(h)
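As a toy instance of the bound, take p(h) = N(0, 1), a 'decoder' p(v|h, θ) = N(θh, 1) and a Gaussian q(h|v, φ) with φ = (μ, σ); the bound can then be estimated by sampling from q (all numbers below are illustrative):

import numpy as np

rng = np.random.default_rng(0)

def elbo(v, theta, mu, log_sigma, n_samples=10000):
    """Monte-Carlo estimate of the bound for a single datapoint v."""
    sigma = np.exp(log_sigma)
    h = mu + sigma * rng.standard_normal(n_samples)     # samples from q(h|v, phi)
    log_p_v_given_h = -0.5 * (v - theta * h) ** 2 - 0.5 * np.log(2 * np.pi)
    log_p_h = -0.5 * h ** 2 - 0.5 * np.log(2 * np.pi)
    log_q = -0.5 * ((h - mu) / sigma) ** 2 - np.log(sigma) - 0.5 * np.log(2 * np.pi)
    return np.mean(log_p_v_given_h + log_p_h - log_q)   # the two integrals of the bound, sampled

print(elbo(v=1.5, theta=0.8, mu=0.9, log_sigma=-0.5))   # maximise this jointly over theta and (mu, sigma)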
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games?
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A, we need to decide which action to take, for any state of W, that will be best for our long-term goals
The problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deep generative model)
Then learn which action to take given the low dimensional representation
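The Atari systems are far more involved, but the core of 'deciding which action is best for long-term goals' can be sketched with tabular Q-learning on a toy 5-state chain (everything here is illustrative, not the Atari setup):

import numpy as np

# chain world: states 0..4, actions 0 (left) / 1 (right), reward 1 on reaching state 4
n_states, gamma, alpha = 5, 0.9, 0.5
Q = np.ones((n_states, 2))                      # optimistic initialisation drives exploration

for episode in range(200):
    s = 0
    for _ in range(100):                        # step cap keeps episodes finite
        a = int(Q[s].argmax())                  # greedy action under current value estimates
        s2 = max(s - 1, 0) if a == 0 else s + 1
        r = 1.0 if s2 == n_states - 1 else 0.0
        target = r + (0.0 if s2 == n_states - 1 else gamma * Q[s2].max())
        Q[s, a] += alpha * (target - Q[s, a])   # TD update toward the long-term return
        if s2 == n_states - 1:
            break
        s = s2

print(Q[:-1].argmax(axis=1))                    # learned policy: move right in every non-terminal state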
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state-of-the-art results in Speech Recognition, Image Analysis and Game Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representation learning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company, reinfer:
https://reinfer.io
Recursive Nets and Embeddingsotilde otilde
eth
eth
Icircplusmnsup1raquoreg
eth
Uumlplusmnfrac14sup1raquoreg
otilde
otilde
eth
middotshy
otilde
eth
plusmnsup2raquo
otilde
eth
plusmnordm
otilde
otilde
eth
notcedilraquo
otilde
otilde
eth
sup3plusmnshynot
otilde
frac12plusmnsup3degraquoacuteacutemiddotsup2sup1
eth
ordfiquestregmiddotiquestnotmiddotplusmnsup2shy
eth
eth
plusmnsup2
eth
eth
notcedilmiddotshy
eth
notcedilraquosup3raquo
eth
ograve
yen
eth
eth
Icircplusmnsup1raquoreg
eth
Uumlplusmnfrac14sup1raquoreg
yen
yen
eth
middotshy
yen
eth
plusmnsup2raquo
yen
eth
plusmnordm
yen
yen
eth
notcedilraquo
yen
yen
yen
acuteraquoiquestshynot
otilde
frac12plusmnsup3degraquoacuteacutemiddotsup2sup1
eth
ordfiquestregmiddotiquestnotmiddotplusmnsup2shy
eth
eth
plusmnsup2
eth
eth
notcedilmiddotshy
eth
notcedilraquosup3raquo
eth
ograve
otilde
eth
times
otilde
otilde
otilde
acutemiddotmicroraquofrac14
eth
eth
eth
raquoordfraquoregsect
eth
eth
shymiddotsup2sup1acuteraquo
eth
sup3middotsup2laquonotraquo
eth
eth
plusmnordm
eth
eth
notcedilmiddotshy
eth
eth
ograve
yen
eth
times
yen
yen
eth
eth
frac14middotfrac14
eth
sup2ugravenot
eth
eth
acutemiddotmicroraquo
eth
eth
eth
iquest
eth
eth
shymiddotsup2sup1acuteraquo
eth
sup3middotsup2laquonotraquo
eth
eth
plusmnordm
eth
eth
notcedilmiddotshy
eth
eth
ograve
yen
eth
timesnot
yen
yen
eth
eth
ugraveshy
eth
paralaquoshynot
yen
otilde
middotsup2frac12regraquofrac14middotfrac34acutesect
yen yen
frac14laquoacuteacute
eth
ograve
eth
eth
timesnot
eth
eth
eth
eth
eth
ugraveshy
otilde
yen
sup2plusmnnot
yen yen
frac14laquoacuteacute
eth
ograve
Uacutemiddotsup1laquoregraquo ccedilaelig IcircOgraveIgraveOgrave degregraquofrac14middotfrac12notmiddotplusmnsup2 plusmnordm degplusmnshymiddotnotmiddotordfraquo iquestsup2frac14 sup2raquosup1iquestnotmiddotordfraquo oslashfrac34plusmnnotnotplusmnsup3 regmiddotsup1cedilnotdivide shyraquosup2notraquosup2frac12raquoshy iquestsup2frac14 notcedilraquomiddotreg sup2raquosup1iquestnotmiddotplusmnsup2ograve
Recurrent Nets
x1 x2 x3
h1 h2 h3
y1 y2 y3
A A A
C C C
B B
RNNs are used in timeseries applications
The basic idea is that the hidden units at time ht (and possibly output yt)depend on the previous state of the network htminus1 xtminus1 ytminus1 for inputs xt andoutputs yt
In the above network I lsquounrolled the net through timersquo to give a standard NNdiagram
I omitted the potential links from xtminus1 ytminus1 to ht
Handwriting Generation using a RNN
Some training examples
Handwriting Generation using a RNN
Some generated examples Top line is real handwriting for comparison See AlexGraversquos work
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reasons research in deep learning has exploded
Much greater compute power (GPU)
Much larger datasets
AutoDiff
What is AutoDiff
AutoDiff takes a function f(x) and returns an exact value (up to machineaccuracy) for the gradient
gi(x) equivpart
partxif
∥∥∥∥x
Note that this is not the same as a numerical approximation (such as centraldifferences) for the gradient
One can show that if done efficiently one can always calculate the gradient inless than 5 times the time it takes to compute f(x)
Reverse DifferentiationA useful graphical representation is that the total derivative of f with respect to xis given by the sum over all path values from x to f where each path value is theproduct of the partial derivatives of the functions on the edges
df
dx=partf
partx+partf
partg
dg
dx
x
f
gpartfpartx
dgdx
partfpartg
Example
For f(x) = x2 + xgh where g =x2 and h = xg2
x
f
gh2x+ gh
2x
xh
2gx
xg
g2
f prime(x) = (2x+ gh) + (g2xg) + (2x2gxxg) + (2xxh) = 2x+ 8x7
Reverse DifferentiationConsider
f(x1 x2) = cos (sin(x1x2))
We can represent this computationally using an Abstract Syntax Tree (AST)
x1 x2
f1
f2
f3
f1(x1 x2) = x1x2
f2(x) = sin(x)
f3(x) = cos(x)
Given values for x1 x2 we first run forwards through the tree so that we canassociate each node with an actual function value
Reverse Differentiation
x1 x2
f1
f2
f3
df3dx1
=partf3partf2
df2dx1
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx1
Similarly
df3dx2
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx2
The two derivatives share the same computation branch andwe want to exploit this
Reverse Differentiation
x1 x2
f1
f2
f3
partf1partx1
= x2partf1partx2
= x1
partf2partf1
= cos(f1)
partf3partf2
= minus sin(f2)
1 Find the reverse ancestral (backwards) scheduleof nodes (f3 f2 f1 x1 x2)
2 Start with the first node n1 in the reverseschedule and define tn1 = 1
3 For the next node n in the reverse schedule findthe child nodes ch (n) Then define
tn =sum
cisinch(n)
partfcpartfn
tc
4 The total derivatives of f with respect to theroot nodes of the tree (here x1 and x2) are givenby the values of t at those nodes
This is a general procedure that can be used to automatically define a subroutineto efficiently compute the gradient It is efficient because information is collectedat nodes in the tree and split between parents only when required
Limitations of forward reasoning
World Representation
Recognising patterns (perceptron style) is only one form of intelligence
Solving chess problems is another and requires complex reasoning using someform of internal model
The world is noisy and information may be conflicting
Recognised that new approaches are required
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Limitations of forward reasoning
World Representation
Models help us to fantasise about the world
Models
Models
Burglar Problem
Creaks and Bumps
Creak Bump
Burglar Model
pos1 pos2 pos3 pos4
snd1 snd2 snd3 snd4
pos - position in kitchensnd ndash sound
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Stubby Fingers
Stubby Fingers
int1 int2 int3 int4
hit1 hit2 hit3 hit4
int - intended keyhit ndash hit key
Stubby Fingers errors
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz
005
01
015
02
025
03
035
04
045
05
055
Stubby Fingers language
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz 0
01
02
03
04
05
06
07
08
09
Stubby Fingers
Given the typed sequence cwsykcak what is the most likely word that thiscorresponds to
List the 200 most likely hidden sequences
Discard those that are not in a standard English dictionary
Take the most likely proper English word as the intended typed word
Speech Recognition raw signal
0 01 02 03 04 05 06 07 08 09minus02
minus015
minus01
minus005
0
005
01
015
02
025
03
Time
lsquoneuralrsquo representation
10 20 30 40 50 60 70 80
5
10
15
20
25
Speech Recognition
pho1 pho2 pho3 pho4
aud1 aud2 aud3 aud4
pho phoneme (letter)aud audio signal (neural representation)
Medical Diagnosis
tumour flu meningitis
headache fever appetite x-ray
Combine known medical knowledge with patient specific information
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems such as the IsingModel (1920) and in AI applications such as the HMM (Baum 1966Stratonovich 1960)
The need for structure
We often want to make a probabilistic description of many objects (electronspins neurons customers etc )
Typically the representational and computational cost of probabilistic modelsgrows exponentially with the number of objects represented
Without introducing strong structural limitations about how these objects caninteract probability is a non-starter
For this reason computationally lsquosimplerrsquo alternatives (such as fuzzy logic)were introduced to try to avoid some of these difficulties ndash however these aretypically frowed on by purists
Graphical Models
We can use graphs to represent how objects can probabilistically interact witheach other
Graphical Models and then a marriage between Graph and Probability theory
Many of the quantities that we would like to compute in a probabilitydistribution can then be related to operations on the graph
The computational complexity of operations can often be related to thestructure of the graph
Graphical Models are now used as a standard framework in EngineeringStatistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty and aretherefore widely applicable
Uses in Industry
Microsoft used to estimate the skill distribution of players in online games(the worlds largest graphical model)
Hospitals use Belief Nets to encode knowledge about diseases and symptomsto aid medical diagnosis
Google Microsoft Facebook used in many places including advertisingvideo game prediction speech recognition
Used to estimate inherent desirability of products in consumer retail
Microsoft and others Attempt to go beyond simple AB testing by usesGraphical Models to model the whole companyuser relationship
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or more shortly theprobability of x given y) is defined as
p(x|y) equiv p(x y)
p(y)=p(y|x)p(x)
p(y)(Bayesrsquo rule)
Throwing darts
p(region 5|not region 20) =p(region 5 not region 20)
p(not region 20)
=p(region 5)
p(not region 20)=
120
1920=
1
19
Interpretationp(A = a|B = b) should not be interpreted as lsquoGiven the event B = b has occurredp(A = a|B = b) is the probability of the event A = a occurringrsquo The correctinterpretation should be lsquop(A = a|B = b) is the probability of A being in state aunder the constraint that B is in state brsquo
Battleships
Assume there are 2 ships 1 vertical (ship 1) and 1 horizontal (ship 2) of 5pixels each
Can be placed anywhere on the 10times10 grid but cannot overlap
Let s1 is the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query lsquohitrsquo or lsquomissrsquo responses
p(s1 s2|D) =p(D|s1 s2)p(s1 s2)
p(D)Let X be the matrix of pixel occupancy
p(X|D) =sums1s2
p(X s1 s2|D) =sums1s2
p(X|s1 s2)p(s1 s2|D)
demoBattleshipsm
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
Deep Learning
NNs have resurged in interest in the last few years (Hinton Bengio )
Also called lsquodeep learningrsquo
Sense that very complex tasks (object recognition learning complex structurein data) requires going beyond simple (convex) statistical techniques
The brain uses hierarchical distributed processing and it is likely to be for agood reason
Many problems have a hierarchical structure images are made of partslanguage is hierarchical etc
Why now
New computing resources (GPU processing)
Availability of large amount of data means that we can train nets with manyparameters (1010)
Recent evidence suggests local optima are not particularly problematic
Autoencoder
y1 y2 y3 y4 y5
h1 h2 h3
h4 h5
y1 y2 y3 y4 y5
h6 h7 h8
The bottleneck forces the network to try to find a low dimensionalrepresentation of the data
Useful for unsupervised learning
Autoencoder on MNIST digits (Hinton 2006 Science)
Figure Reconstructions using H = 30 components From the Top Original imageAutoencoder1 Autoencoder2 PCA
60000 training images (28times 28 = 784 pixels)
Use a form of autoencoder to find a lower (30) dimensional representation
At the time the special layerwise training procedure was consideredfundamental to the success of this approach Now not deemed necessaryprovided we use a sensible initialisation
Google Cats
10 Million Youtube video frames (200x200 pixel images)
Use a specialised autoencoder with 9 layers (1 billion weights)
2000 computers + two weeks of computing
Examine units to see what images they most respond to
Google Autoencoder
From Nando De Freitas
Convolutional NNs
CNNs are particularly popular in image processing
Often the feature maps correspond (not to macro features such as bicycles)but micro features
For example in handwritten digit recognition they correspond to smallconstituent parts of the digits
These are used then to process the image into a representation that is betterfor recognition
NNs in NLP
Bag of Words
We have D words in a dictionary aardvark zorro so that we can relateeach word with its dictionary index
We can also think of this as a Euclidean embedding e
aardvarkrarr eaardvark =
100
zorrorarr ezorro =
001
Word Embeddings
Idea is to replace the Euclidean embeddings e with embeddings (vectors) vthat are learned
Objective is for example next word prediction accuracy
These are often called lsquoneural language modelsrsquo
NNs in NLP
Each word w in the dictionary has an associated embedding vector vwUsually around 200 dimensional vectors are used
Consider the sentence
the cat sat on the mat
and that we wish to predict the word on given the two preceding cat sat
and two succeeding words the mat
We can use a network that has inputs vcat vsat vthe vmat
The output of the network is a probability over all words in the dictionaryp(w| vinputs)We want p(w = on|vcatvsatvthevmat) to be high
The overall objective is then to learn all the word embeddings and networkparameters subject to predicting the word correctly based on the context
Word Embeddings
Given a word (France for example) we can find which words w have embeddingvectors closest to vFrance From Ronan Collabert (2011)
Word Embeddings
There appears to be a natural lsquogeometryrsquo to the embeddings For example thereare directions that correspond to gender
vwoman minus vman asymp vaunt minus vuncle
vwoman minus vman asymp vqueen minus vking
From Mikolov (2013)
Word Embeddings Analogies
Given a relationship France-Paris we get the lsquorelationshiprsquo embedding
v = vParis minus vFrance
Given Italy we can calculate vItaly + v and find the word in the dictionary whichhas closest embedding to this (it turns out to be Rome) From Mikolov (2013)
Word Embeddings Constrained Embeddings
We can learn embeddings for English words and embeddings for ChinesewordsHowever when we know that a Chinese and English word have a similarmeaning we add a constraint that the word embeddings vChineseWord andvEnglishWord should be closeWe have only a small amount of labelled lsquosimilarrsquo Chinese-English words(these are the green border boxes in the above they are standard translationsof the corresponding Chinese character)We can visualise in 2D (using t-SNE) the embedding vectors See Socher(2013)
Word Embeddings Constrained Embeddings
Recursive Nets and Embeddings
Stanford Sentiment Treebank Consists of parsed sentences with sentiment labels(minusminusminus 0+++) for each node (phrase) in the tree 215000 labelled phrases(obtained from three humans)
Recursive Nets and Embeddings
Idea is to recursively combine embeddings such that they accurately predictthe sentiment at each node
Recursive Nets and EmbeddingsTraining
We have a softmax classifier for each node in the tree to predict thesentiment of the phrase beneath this node in the tree
The weights of this classifier are shared across all nodes
At the leaf nodes at the bottom of the tree the inputs to the classifiers arethe word embeddings
The embeddings are combined by another network g with commonparameters which forms the input to the sentiment classifier
We then learn all the embeddings shared classifier parameters and sharedcombination parameters to maximise the classification accuracy
Prediction
For a new movie review the review is first parsed using a standard grammartree parser
This forms the tree which can be used to recursively form the sentiment classlabel for the review
Currently the best sentiment classifier Socher (2013)
Recursive Nets and Embeddingsotilde otilde
eth
eth
Icircplusmnsup1raquoreg
eth
Uumlplusmnfrac14sup1raquoreg
otilde
otilde
eth
middotshy
otilde
eth
plusmnsup2raquo
otilde
eth
plusmnordm
otilde
otilde
eth
notcedilraquo
otilde
otilde
eth
sup3plusmnshynot
otilde
frac12plusmnsup3degraquoacuteacutemiddotsup2sup1
eth
ordfiquestregmiddotiquestnotmiddotplusmnsup2shy
eth
eth
plusmnsup2
eth
eth
notcedilmiddotshy
eth
notcedilraquosup3raquo
eth
ograve
yen
eth
eth
Icircplusmnsup1raquoreg
eth
Uumlplusmnfrac14sup1raquoreg
yen
yen
eth
middotshy
yen
eth
plusmnsup2raquo
yen
eth
plusmnordm
yen
yen
eth
notcedilraquo
yen
yen
yen
acuteraquoiquestshynot
otilde
frac12plusmnsup3degraquoacuteacutemiddotsup2sup1
eth
ordfiquestregmiddotiquestnotmiddotplusmnsup2shy
eth
eth
plusmnsup2
eth
eth
notcedilmiddotshy
eth
notcedilraquosup3raquo
eth
ograve
otilde
eth
times
otilde
otilde
otilde
acutemiddotmicroraquofrac14
eth
eth
eth
raquoordfraquoregsect
eth
eth
shymiddotsup2sup1acuteraquo
eth
sup3middotsup2laquonotraquo
eth
eth
plusmnordm
eth
eth
notcedilmiddotshy
eth
eth
ograve
yen
eth
times
yen
yen
eth
eth
frac14middotfrac14
eth
sup2ugravenot
eth
eth
acutemiddotmicroraquo
eth
eth
eth
iquest
eth
eth
shymiddotsup2sup1acuteraquo
eth
sup3middotsup2laquonotraquo
eth
eth
plusmnordm
eth
eth
notcedilmiddotshy
eth
eth
ograve
yen
eth
timesnot
yen
yen
eth
eth
ugraveshy
eth
paralaquoshynot
yen
otilde
middotsup2frac12regraquofrac14middotfrac34acutesect
yen yen
frac14laquoacuteacute
eth
ograve
eth
eth
timesnot
eth
eth
eth
eth
eth
ugraveshy
otilde
yen
sup2plusmnnot
yen yen
frac14laquoacuteacute
eth
ograve
Uacutemiddotsup1laquoregraquo ccedilaelig IcircOgraveIgraveOgrave degregraquofrac14middotfrac12notmiddotplusmnsup2 plusmnordm degplusmnshymiddotnotmiddotordfraquo iquestsup2frac14 sup2raquosup1iquestnotmiddotordfraquo oslashfrac34plusmnnotnotplusmnsup3 regmiddotsup1cedilnotdivide shyraquosup2notraquosup2frac12raquoshy iquestsup2frac14 notcedilraquomiddotreg sup2raquosup1iquestnotmiddotplusmnsup2ograve
Recurrent Nets
x1 x2 x3
h1 h2 h3
y1 y2 y3
A A A
C C C
B B
RNNs are used in timeseries applications
The basic idea is that the hidden units at time ht (and possibly output yt)depend on the previous state of the network htminus1 xtminus1 ytminus1 for inputs xt andoutputs yt
In the above network I lsquounrolled the net through timersquo to give a standard NNdiagram
I omitted the potential links from xtminus1 ytminus1 to ht
Handwriting Generation using a RNN
Some training examples
Handwriting Generation using a RNN
Some generated examples Top line is real handwriting for comparison See AlexGraversquos work
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reasons research in deep learning has exploded
Much greater compute power (GPU)
Much larger datasets
AutoDiff
What is AutoDiff
AutoDiff takes a function f(x) and returns an exact value (up to machineaccuracy) for the gradient
gi(x) equivpart
partxif
∥∥∥∥x
Note that this is not the same as a numerical approximation (such as centraldifferences) for the gradient
One can show that if done efficiently one can always calculate the gradient inless than 5 times the time it takes to compute f(x)
Reverse DifferentiationA useful graphical representation is that the total derivative of f with respect to xis given by the sum over all path values from x to f where each path value is theproduct of the partial derivatives of the functions on the edges
df
dx=partf
partx+partf
partg
dg
dx
x
f
gpartfpartx
dgdx
partfpartg
Example
For f(x) = x2 + xgh where g =x2 and h = xg2
x
f
gh2x+ gh
2x
xh
2gx
xg
g2
f prime(x) = (2x+ gh) + (g2xg) + (2x2gxxg) + (2xxh) = 2x+ 8x7
Reverse DifferentiationConsider
f(x1 x2) = cos (sin(x1x2))
We can represent this computationally using an Abstract Syntax Tree (AST)
x1 x2
f1
f2
f3
f1(x1 x2) = x1x2
f2(x) = sin(x)
f3(x) = cos(x)
Given values for x1 x2 we first run forwards through the tree so that we canassociate each node with an actual function value
Reverse Differentiation
x1 x2
f1
f2
f3
df3dx1
=partf3partf2
df2dx1
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx1
Similarly
df3dx2
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx2
The two derivatives share the same computation branch andwe want to exploit this
Reverse Differentiation
x1 x2
f1
f2
f3
partf1partx1
= x2partf1partx2
= x1
partf2partf1
= cos(f1)
partf3partf2
= minus sin(f2)
1 Find the reverse ancestral (backwards) scheduleof nodes (f3 f2 f1 x1 x2)
2 Start with the first node n1 in the reverseschedule and define tn1 = 1
3 For the next node n in the reverse schedule findthe child nodes ch (n) Then define
tn =sum
cisinch(n)
partfcpartfn
tc
4 The total derivatives of f with respect to theroot nodes of the tree (here x1 and x2) are givenby the values of t at those nodes
This is a general procedure that can be used to automatically define a subroutineto efficiently compute the gradient It is efficient because information is collectedat nodes in the tree and split between parents only when required
Limitations of forward reasoning
World Representation
Recognising patterns (perceptron style) is only one form of intelligence
Solving chess problems is another and requires complex reasoning using someform of internal model
The world is noisy and information may be conflicting
Recognised that new approaches are required
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Limitations of forward reasoning
World Representation
Models help us to fantasise about the world
Models
Models
Burglar Problem
Creaks and Bumps
Creak Bump
Burglar Model
pos1 pos2 pos3 pos4
snd1 snd2 snd3 snd4
pos - position in kitchensnd ndash sound
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Stubby Fingers
Stubby Fingers
int1 int2 int3 int4
hit1 hit2 hit3 hit4
int - intended keyhit ndash hit key
Stubby Fingers errors
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz
005
01
015
02
025
03
035
04
045
05
055
Stubby Fingers language
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz 0
01
02
03
04
05
06
07
08
09
Stubby Fingers
Given the typed sequence cwsykcak what is the most likely word that thiscorresponds to
List the 200 most likely hidden sequences
Discard those that are not in a standard English dictionary
Take the most likely proper English word as the intended typed word
Speech Recognition raw signal
0 01 02 03 04 05 06 07 08 09minus02
minus015
minus01
minus005
0
005
01
015
02
025
03
Time
lsquoneuralrsquo representation
10 20 30 40 50 60 70 80
5
10
15
20
25
Speech Recognition
pho1 pho2 pho3 pho4
aud1 aud2 aud3 aud4
pho phoneme (letter)aud audio signal (neural representation)
Medical Diagnosis
tumour flu meningitis
headache fever appetite x-ray
Combine known medical knowledge with patient specific information
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems such as the IsingModel (1920) and in AI applications such as the HMM (Baum 1966Stratonovich 1960)
The need for structure
We often want to make a probabilistic description of many objects (electronspins neurons customers etc )
Typically the representational and computational cost of probabilistic modelsgrows exponentially with the number of objects represented
Without introducing strong structural limitations about how these objects caninteract probability is a non-starter
For this reason computationally lsquosimplerrsquo alternatives (such as fuzzy logic)were introduced to try to avoid some of these difficulties ndash however these aretypically frowed on by purists
Graphical Models
We can use graphs to represent how objects can probabilistically interact with each other
Graphical Models are then a marriage between Graph and Probability theory
Many of the quantities that we would like to compute in a probability distribution can then be related to operations on the graph
The computational complexity of operations can often be related to the structure of the graph
Graphical Models are now used as a standard framework in Engineering, Statistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty and are therefore widely applicable
Uses in Industry
Microsoft: used to estimate the skill distribution of players in online games (the world's largest graphical model)
Hospitals use Belief Nets to encode knowledge about diseases and symptoms to aid medical diagnosis
Google, Microsoft, Facebook: used in many places, including advertising, video game prediction and speech recognition
Used to estimate the inherent desirability of products in consumer retail
Microsoft and others: attempts to go beyond simple A/B testing by using Graphical Models to model the whole company/user relationship
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or, more shortly, the probability of x given y) is defined as
p(x|y) ≡ p(x, y)/p(y) = p(y|x) p(x)/p(y)    (Bayes' rule)
Throwing darts
p(region 5 | not region 20) = p(region 5, not region 20) / p(not region 20)
                            = p(region 5) / p(not region 20)
                            = (1/20) / (19/20)
                            = 1/19
Interpretation: p(A = a|B = b) should not be interpreted as 'Given the event B = b has occurred, p(A = a|B = b) is the probability of the event A = a occurring'. The correct interpretation is 'p(A = a|B = b) is the probability of A being in state a under the constraint that B is in state b'.
Battleships
Assume there are 2 ships, 1 vertical (ship 1) and 1 horizontal (ship 2), of 5 pixels each
Can be placed anywhere on the 10×10 grid, but cannot overlap
Let s1 be the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query lsquohitrsquo or lsquomissrsquo responses
p(s1, s2|D) = p(D|s1, s2) p(s1, s2) / p(D)
Let X be the matrix of pixel occupancy:
p(X|D) = Σ_{s1,s2} p(X, s1, s2|D) = Σ_{s1,s2} p(X|s1, s2) p(s1, s2|D)
demoBattleships.m
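demoBattleships.m is not reproduced here; the following is a minimal Python sketch of the same computation, assuming a uniform prior over non-overlapping placements and noise-free hit/miss answers.

    import numpy as np

    G, L = 10, 5  # grid size, ship length

    def cells(origin, vertical):
        r, c = origin
        return [(r + k, c) if vertical else (r, c + k) for k in range(L)]

    # Enumerate all joint placements (ship 1 vertical, ship 2 horizontal), no overlap.
    placements = []
    for r1 in range(G - L + 1):
        for c1 in range(G):
            for r2 in range(G):
                for c2 in range(G - L + 1):
                    s1, s2 = cells((r1, c1), True), cells((r2, c2), False)
                    if not set(s1) & set(s2):
                        placements.append((s1, s2))

    def posterior(data):
        # data: list of ((row, col), hit_bool) query responses; p(D|s1,s2) is 0 or 1
        occupancy, n = np.zeros((G, G)), 0
        for s1, s2 in placements:
            occupied = set(s1) | set(s2)
            if all((q in occupied) == hit for q, hit in data):
                n += 1
                for r, c in occupied:
                    occupancy[r, c] += 1
        return occupancy / n  # p(X_ij = 1 | D)

    p = posterior([((0, 0), False), ((5, 5), True)])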
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated the conditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditional probabilities
p(A, B, C, D, E) = p(A) p(B) p(C|A, B) p(D|C) p(E|B, C)
(Figure: the corresponding DAG on nodes A, B, C, D, E.)
Example – Part I
Sally's burglar Alarm is sounding. Has she been Burgled, or was the alarm triggered by an Earthquake? She turns the car Radio on for news of earthquakes.
Choosing an ordering
Without loss of generality, we can write
p(A, R, E, B) = p(A|R, E, B) p(R, E, B)
             = p(A|R, E, B) p(R|E, B) p(E, B)
             = p(A|R, E, B) p(R|E, B) p(E|B) p(B)
Assumptions:
The alarm is not directly influenced by any report on the radio: p(A|R, E, B) = p(A|E, B)
The radio broadcast is not directly influenced by the burglar variable: p(R|E, B) = p(R|E)
Burglaries don't directly 'cause' earthquakes: p(E|B) = p(E)
Therefore
p(A, R, E, B) = p(A|E, B) p(R|E) p(E) p(B)
Example – Part II: Specifying the Tables
(Figure: the DAG with edges B → A, E → A, E → R.)
p(A = 1|B, E):
  B  E  p(A = 1|B, E)
  1  1  0.9999
  1  0  0.99
  0  1  0.99
  0  0  0.0001
p(R = 1|E):
  E  p(R = 1|E)
  1  1
  0  0
The remaining tables are p(B = 1) = 0.01 and p(E = 1) = 0.000001. The tables and graphical structure fully specify the distribution.
Example – Part III: Inference
Initial Evidence: The alarm is sounding.
p(B = 1|A = 1) = Σ_{E,R} p(B = 1, E, A = 1, R) / Σ_{B,E,R} p(B, E, A = 1, R)
              = Σ_{E,R} p(A = 1|B = 1, E) p(B = 1) p(E) p(R|E) / Σ_{B,E,R} p(A = 1|B, E) p(B) p(E) p(R|E) ≈ 0.99
Additional Evidence: The radio broadcasts an earthquake warning.
A similar calculation gives p(B = 1|A = 1, R = 1) ≈ 0.01
Initially, because the alarm sounds, Sally thinks that she's been burgled. However, this probability drops dramatically when she hears that there has been an earthquake.
Markov Models
For timeseries data v_1, ..., v_T we need a model p(v_{1:T}). For causal consistency it is meaningful to consider the decomposition
p(v_{1:T}) = ∏_{t=1}^{T} p(v_t|v_{1:t−1})
with the convention p(v_t|v_{1:t−1}) = p(v_1) for t = 1.
v1 v2 v3 v4
Independence assumptions
It is often natural to assume that the influence of the immediate past is more relevant than the remote past, and in Markov models only a limited number of previous observations are required to predict the future.
Markov Chain
Only the recent past is relevant
p(v_t|v_1, ..., v_{t−1}) = p(v_t|v_{t−L}, ..., v_{t−1})
where L ≥ 1 is the order of the Markov chain. For a first order chain,
p(v_{1:T}) = p(v_1) p(v_2|v_1) p(v_3|v_2) ... p(v_T|v_{T−1})
For a stationary Markov chain the transitions p(v_t = s′|v_{t−1} = s) = f(s′, s) are time-independent ('homogeneous').
(Figure: (a) a first order Markov chain v1 → v2 → v3 → v4; (b) a second order Markov chain.)
Markov Chains
v1 v2 v3 v4
p(v_1, ..., v_T) = p(v_1) ∏_{t=2}^{T} p(v_t|v_{t−1})   (initial term × transition terms)
State transition diagram
Nodes represent states of the variable v, and arcs the non-zero elements of the transition p(v_t|v_{t−1}).
(Figure: a state transition diagram on states 1–9.)
Most probable and shortest paths
(Figure: the same state transition diagram on states 1–9.)
The shortest (unweighted) path from state 1 to state 7 is 1−2−7
The most probable path from state 1 to state 7 is 1−8−9−7 (assuming uniform transition probabilities). The latter path is longer but more probable, since for the path 1−2−7 the probability of exiting state 2 into state 7 is 1/5
Equilibrium distribution
It is interesting to know how the marginal p(x_t) evolves through time:
p(x_t = i) = Σ_j p(x_t = i|x_{t−1} = j) p(x_{t−1} = j),  with M_ij ≡ p(x_t = i|x_{t−1} = j)
p(x_t = i) is the frequency with which we visit state i at time t, given that we started from p(x_1) and randomly drew samples from the transition p(x_τ|x_{τ−1}). As we repeatedly sample a new state from the chain, the distribution at time t for an initial distribution p_1(i) is
p_t = M^{t−1} p_1
If, for t → ∞, p_∞ is independent of the initial distribution p_1, then p_∞ is called the equilibrium distribution of the chain:
p_∞ = M p_∞
The equilibrium distribution is proportional to the eigenvector with unit eigenvalue of the transition matrix.
PageRank
Define the matrix
A_ij = 1 if website j has a hyperlink to website i, and 0 otherwise
From this we can define a Markov transition matrix with elements
M_ij = A_ij / Σ_{i′} A_{i′j}
If we jump from website to website, the equilibrium distribution component p_∞(i) is the relative number of times we will visit website i. This has a natural interpretation as the 'importance' of website i.
For each website i a list of words associated with that website is collected. After doing this for all websites, one can make an 'inverse' list of which websites contain word w. When a user searches for word w, the list of websites containing that word is then returned, ranked according to the importance of the site.
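A minimal sketch of both ideas, using a small invented link matrix: build M from A, then run the power iteration p_t = M p_{t−1} until it settles.

    import numpy as np

    A = np.array([[0, 1, 1],
                  [1, 0, 1],
                  [1, 1, 0]], dtype=float)   # A[i, j] = 1 if site j links to site i
    M = A / A.sum(axis=0, keepdims=True)     # M[i, j] = A[i, j] / sum_i' A[i', j]

    p = np.full(3, 1 / 3)                    # any initial distribution p_1
    for _ in range(100):
        p = M @ p                            # converges to p_inf with M p_inf = p_inf
    print(p)                                 # the 'importance' of each site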
Hidden Markov Models
The HMM defines a Markov chain on hidden (or 'latent') variables h_{1:T}. The observed (or 'visible') variables are dependent on the hidden variables through an emission p(v_t|h_t). This defines a joint distribution
p(h_{1:T}, v_{1:T}) = p(v_1|h_1) p(h_1) ∏_{t=2}^{T} p(v_t|h_t) p(h_t|h_{t−1})
For a stationary HMM the transition p(h_t|h_{t−1}) and emission p(v_t|h_t) distributions are constant through time.
Figure: a first order hidden Markov model with 'hidden' variables dom(h_t) = {1, ..., H}, t = 1, ..., T. The 'visible' variables v_t can be either discrete or continuous.
The classical inference problems
Filtering (inferring the present): p(h_t|v_{1:t})
Prediction (inferring the future): p(h_t|v_{1:s}), t > s
Smoothing (inferring the past): p(h_t|v_{1:u}), t < u
Likelihood: p(v_{1:T})
Most likely path (Viterbi alignment): argmax_{h_{1:T}} p(h_{1:T}|v_{1:T})
For prediction, one is also often interested in p(v_t|v_{1:s}) for t > s.
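Filtering, for example, reduces to a simple forward recursion, α_t(h) ∝ p(v_t|h_t = h) Σ_{h′} p(h_t = h|h_{t−1} = h′) α_{t−1}(h′); a minimal sketch, with assumed table shapes:

    import numpy as np

    def filtering(trans, emit, p1, observations):
        # trans[i, j] = p(h_t = i | h_{t-1} = j); emit[v, h] = p(v_t = v | h_t = h)
        alpha = emit[observations[0]] * p1        # alpha_1(h) ∝ p(v_1|h) p(h_1)
        alpha /= alpha.sum()
        filtered = [alpha.copy()]
        for v in observations[1:]:
            alpha = emit[v] * (trans @ alpha)     # propagate, then weight by emission
            alpha /= alpha.sum()                  # normalise to get p(h_t | v_{1:t})
            filtered.append(alpha.copy())
        return np.array(filtered)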
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering, Smoothing and Viterbi are all computationally efficient, scaling linearly with the length of the timeseries (but quadratically with the number of hidden states)
The algorithms are variants of 'message passing on factor graphs'
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing for approximate inference in multiply-connected graphs (e.g. low-density parity-check codes)
HMMs for speech recognition
h_t is the phoneme at time t; p(h_t|h_{t−1}) – language model; p(v_t|h_t) – speech signal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speech recognition
The breakthrough is to model p(v_t|h_t) as a Gaussian whose mean is some function of the phoneme, μ(h_t; θ)
This function is a deep neural network, trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deep networks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructed on the basis of a low dimensional representation
Note that this is a Graphical Model, not a Function
The latent variables h can be sampled from p(h), and then an image sampled from p(v|h). One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v)) and parameter learning are intractable in these models
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method – much faster for inference
Variational Inference
Consider a distribution
p(v|θ) = ∫_h p(v|h, θ) p(h)
and suppose that we wish to learn θ to maximise the probability that this model generates the observed data. Then
log p(v|θ) ≥ −∫_h q(h|v, φ) log q(h|v, φ) + ∫_h q(h|v, φ) log [p(v|h, θ) p(h)]
The idea is to choose a 'variational' distribution q(h|v, φ) such that we can either calculate the bound analytically or sample it efficiently.
We then jointly maximise the bound with respect to φ and θ.
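For reference, this bound is the standard Jensen's inequality step; a sketch in LaTeX form:

\log p(v|\theta) = \log \int_h q(h|v,\phi)\, \frac{p(v|h,\theta)\, p(h)}{q(h|v,\phi)}
\geq -\int_h q(h|v,\phi) \log q(h|v,\phi) + \int_h q(h|v,\phi) \log \left[ p(v|h,\theta)\, p(h) \right]

with equality when q(h|v,\phi) = p(h|v,\theta).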
We can parameterise p(v|h θ) using a deep network
Very popular approach – see the 'variational autoencoder' and also attention mechanisms
Extension to a semi-supervised method using p(v) = ∫_h Σ_c p(v|h, c) p(c) p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A, we need to decide which action to take, for any state of W, that will be best for our long-term goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deep generative model)
Then learn which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state-of-the-art results in Speech Recognition, Image Analysis, Game Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve the interaction between reinforcement learning and representation learning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
https://reinfer.io
Autoencoder
(Figure: an autoencoder network: inputs y1, ..., y5 pass through hidden layers with a narrow bottleneck and back out to reconstructions y1, ..., y5.)
The bottleneck forces the network to try to find a low dimensional representation of the data
Useful for unsupervised learning
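A minimal numpy sketch of such a bottleneck network, trained by gradient descent on the squared reconstruction error; all sizes and data below are illustrative only (784 → 30 → 784, as in the MNIST example that follows).

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.random((100, 784))               # stand-in for a batch of images
    W1 = rng.normal(0, 0.01, (784, 30))      # encoder weights
    W2 = rng.normal(0, 0.01, (30, 784))      # decoder weights

    for _ in range(200):
        H = np.tanh(X @ W1)                  # low dimensional code h
        Y = H @ W2                           # reconstruction
        dY = 2 * (Y - X) / len(X)            # d loss / d Y, loss = mean ||Y - X||^2
        dW2 = H.T @ dY
        dH = dY @ W2.T * (1 - H**2)          # backprop through tanh
        dW1 = X.T @ dH
        W1 -= 0.1 * dW1                      # gradient descent step
        W2 -= 0.1 * dW2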
Autoencoder on MNIST digits (Hinton 2006 Science)
Figure: Reconstructions using H = 30 components. From the top: original image, Autoencoder (1), Autoencoder (2), PCA.
60,000 training images (28×28 = 784 pixels)
Use a form of autoencoder to find a lower (30) dimensional representation
At the time, the special layerwise training procedure was considered fundamental to the success of this approach. Now not deemed necessary, provided we use a sensible initialisation
Google Cats
10 million YouTube video frames (200×200 pixel images)
Use a specialised autoencoder with 9 layers (1 billion weights)
2000 computers + two weeks of computing
Examine units to see what images they most respond to
Google Autoencoder
From Nando De Freitas
Convolutional NNs
CNNs are particularly popular in image processing
Often the feature maps correspond not to macro features (such as bicycles) but to micro features
For example, in handwritten digit recognition they correspond to small constituent parts of the digits
These are then used to process the image into a representation that is better for recognition
NNs in NLP
Bag of Words
We have D words in a dictionary {aardvark, ..., zorro}, so that we can relate each word with its dictionary index
We can also think of this as a Euclidean embedding e:
aardvark → e_aardvark = (1, 0, ..., 0)ᵀ
zorro → e_zorro = (0, ..., 0, 1)ᵀ
Word Embeddings
Idea is to replace the Euclidean embeddings e with embeddings (vectors) v that are learned
The objective is, for example, next-word prediction accuracy
These are often called lsquoneural language modelsrsquo
NNs in NLP
Each word w in the dictionary has an associated embedding vector v_w. Usually around 200-dimensional vectors are used
Consider the sentence
the cat sat on the mat
and suppose that we wish to predict the word 'on' given the two preceding words 'cat sat' and the two succeeding words 'the mat'
We can use a network that has inputs v_cat, v_sat, v_the, v_mat
The output of the network is a probability over all words in the dictionary, p(w|v_inputs). We want p(w = on|v_cat, v_sat, v_the, v_mat) to be high
The overall objective is then to learn all the word embeddings and network parameters, subject to predicting the word correctly based on the context
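A minimal sketch of such a predictor, averaging the four context embeddings and applying a softmax output layer; the sizes and dictionary indices below are hypothetical.

    import numpy as np

    D, dim = 10000, 200                       # dictionary size, embedding size
    rng = np.random.default_rng(0)
    V = rng.normal(0, 0.01, (D, dim))         # learnable word embeddings v_w
    W = rng.normal(0, 0.01, (dim, D))         # learnable output layer

    def predict(context_ids):                 # e.g. ids of [cat, sat, the, mat]
        h = V[context_ids].mean(axis=0)       # combine the context embeddings
        logits = h @ W
        e = np.exp(logits - logits.max())
        return e / e.sum()                    # p(w | v_inputs) over all D words

    p_w = predict([11, 42, 7, 99])            # hypothetical dictionary indices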
Word Embeddings
Given a word (France, for example) we can find which words w have embedding vectors closest to v_France. From Ronan Collobert (2011)
Word Embeddings
There appears to be a natural 'geometry' to the embeddings. For example, there are directions that correspond to gender:
v_woman − v_man ≈ v_aunt − v_uncle
v_woman − v_man ≈ v_queen − v_king
From Mikolov (2013)
Word Embeddings Analogies
Given a relationship France–Paris, we get the 'relationship' embedding
v = v_Paris − v_France
Given Italy, we can calculate v_Italy + v and find the word in the dictionary which has the closest embedding to this (it turns out to be Rome). From Mikolov (2013)
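A minimal sketch of the analogy lookup, assuming an embedding matrix V with one row per dictionary word:

    import numpy as np

    def analogy(V, words, a, b, c):
        # words: list of words; row i of V is the embedding of words[i].
        # Returns the word closest (by cosine similarity) to v_b - v_a + v_c,
        # e.g. analogy(V, words, 'France', 'Paris', 'Italy') should give 'Rome'.
        idx = {w: i for i, w in enumerate(words)}
        v = V[idx[b]] - V[idx[a]] + V[idx[c]]
        sims = (V @ v) / (np.linalg.norm(V, axis=1) * np.linalg.norm(v) + 1e-9)
        for i in np.argsort(-sims):           # skip the query words themselves
            if words[i] not in (a, b, c):
                return words[i]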
Word Embeddings Constrained Embeddings
We can learn embeddings for English words and embeddings for Chinese words.
However, when we know that a Chinese and an English word have a similar meaning, we add a constraint that the word embeddings v_ChineseWord and v_EnglishWord should be close.
We have only a small amount of labelled 'similar' Chinese-English words (these are the green border boxes in the above; they are standard translations of the corresponding Chinese character).
We can visualise the embedding vectors in 2D (using t-SNE). See Socher (2013).
Word Embeddings Constrained Embeddings
Recursive Nets and Embeddings
Stanford Sentiment Treebank: consists of parsed sentences with sentiment labels (−−, −, 0, +, ++) for each node (phrase) in the tree; 215,000 labelled phrases (obtained from three humans)
Recursive Nets and Embeddings
Idea is to recursively combine embeddings such that they accurately predict the sentiment at each node
Recursive Nets and Embeddings: Training
We have a softmax classifier for each node in the tree to predict the sentiment of the phrase beneath this node in the tree
The weights of this classifier are shared across all nodes
At the leaf nodes, at the bottom of the tree, the inputs to the classifiers are the word embeddings
The embeddings are combined by another network g, with common parameters, which forms the input to the sentiment classifier
We then learn all the embeddings, shared classifier parameters and shared combination parameters to maximise the classification accuracy
Prediction
For a new movie review, the review is first parsed using a standard grammar tree parser
This forms the tree, which can be used to recursively form the sentiment class label for the review
Currently the best sentiment classifier; see Socher (2013)
Recursive Nets and Embeddings
(Figure: RNTN prediction of positive and negative (bottom right) sentences and their negation, on example sentences such as 'Roger Dodger is one of the most compelling variations on this theme'. From Socher (2013).)
Recurrent Nets
(Figure: an RNN unrolled through time: inputs x1, x2, x3, hidden units h1, h2, h3 and outputs y1, y2, y3, with the weight matrices A, B, C shared across time steps.)
RNNs are used in timeseries applications
The basic idea is that the hidden units at time t (and possibly the output y_t) depend on the previous state of the network h_{t−1}, x_{t−1}, y_{t−1}, for inputs x_t and outputs y_t
In the above network I 'unrolled the net through time' to give a standard NN diagram
I omitted the potential links from x_{t−1}, y_{t−1} to h_t
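A minimal sketch of this unrolled computation, with the shared weight matrices named A, B, C as in the diagram (sizes are illustrative):

    import numpy as np

    def rnn_forward(xs, A, B, C, h0):
        h, ys = h0, []
        for x in xs:                       # the same A, B, C are reused at every step
            h = np.tanh(A @ x + B @ h)     # hidden state carries the past forward
            ys.append(C @ h)               # output y_t
        return ys

    dim_x, dim_h, dim_y = 3, 5, 2
    rng = np.random.default_rng(0)
    A, B, C = (rng.normal(size=s) for s in [(dim_h, dim_x), (dim_h, dim_h), (dim_y, dim_h)])
    ys = rnn_forward([rng.normal(size=dim_x) for _ in range(4)], A, B, C, np.zeros(dim_h))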
Handwriting Generation using a RNN
Some training examples
Handwriting Generation using a RNN
Some generated examples. The top line is real handwriting, for comparison. See Alex Graves's work
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reasons research in deep learning has exploded
Much greater compute power (GPU)
Much larger datasets
AutoDiff
What is AutoDiff?
AutoDiff takes a function f(x) and returns an exact value (up to machine accuracy) for the gradient
g_i(x) ≡ ∂f/∂x_i, evaluated at x
Note that this is not the same as a numerical approximation (such as central differences) for the gradient
One can show that, if done efficiently, one can always calculate the gradient in less than 5 times the time it takes to compute f(x)
Reverse Differentiation
A useful graphical representation is that the total derivative of f with respect to x is given by the sum over all path values from x to f, where each path value is the product of the partial derivatives of the functions on the edges:
df/dx = ∂f/∂x + (∂f/∂g)(dg/dx)
(Figure: nodes x, g, f, with edge labels ∂f/∂x on x → f, dg/dx on x → g, and ∂f/∂g on g → f.)
Example
For f(x) = x² + xgh, where g = x² and h = xg²:
(Figure: nodes x, g, h, f, with edge labels 2x + gh on x → f, 2x on x → g, g² on x → h, 2gx on g → h, xh on g → f, and xg on h → f.)
f′(x) = (2x + gh) + (g² · xg) + (2x · 2gx · xg) + (2x · xh) = 2x + 8x⁷
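The path-sum result can be checked directly, for example with sympy:

    import sympy as sp

    # Verify that f'(x) = 2x + 8x^7 for f = x^2 + x*g*h with g = x^2, h = x*g^2.
    x = sp.symbols('x')
    g = x**2
    h = x * g**2
    f = x**2 + x * g * h
    print(sp.expand(sp.diff(f, x)))   # 8*x**7 + 2*x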
Reverse Differentiation
Consider
f(x1, x2) = cos(sin(x1 x2))
We can represent this computationally using an Abstract Syntax Tree (AST):
(Figure: x1, x2 → f1 → f2 → f3.)
f1(x1, x2) = x1 x2
f2(x) = sin(x)
f3(x) = cos(x)
Given values for x1, x2 we first run forwards through the tree, so that we can associate each node with an actual function value.
Reverse Differentiation
(Figure: the same AST, x1, x2 → f1 → f2 → f3.)
df3/dx1 = (∂f3/∂f2)(df2/dx1) = (∂f3/∂f2)(df2/df1)(df1/dx1)
Similarly,
df3/dx2 = (∂f3/∂f2)(df2/df1)(df1/dx2)
The two derivatives share the same computation branch, df3/df1 = (∂f3/∂f2)(df2/df1), and we want to exploit this.
Reverse Differentiation
The local partial derivatives are
∂f1/∂x1 = x2,  ∂f1/∂x2 = x1,  ∂f2/∂f1 = cos(f1),  ∂f3/∂f2 = −sin(f2)
1. Find the reverse ancestral (backwards) schedule of nodes (f3, f2, f1, x1, x2).
2. Start with the first node n1 in the reverse schedule and define t_{n1} = 1.
3. For the next node n in the reverse schedule, find the child nodes ch(n). Then define
   t_n = Σ_{c ∈ ch(n)} (∂f_c/∂f_n) t_c
4. The total derivatives of f with respect to the root nodes of the tree (here x1 and x2) are given by the values of t at those nodes.
This is a general procedure that can be used to automatically define a subroutine to efficiently compute the gradient. It is efficient because information is collected at nodes in the tree and split between parents only when required.
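A minimal sketch of this procedure for the running example f = cos(sin(x1 x2)): a forward pass stores the node values, and a single reverse sweep accumulates t_n = Σ_{c ∈ ch(n)} (∂f_c/∂f_n) t_c.

    import math

    def f_and_grad(x1, x2):
        # Forward pass: associate each AST node with its value.
        f1 = x1 * x2
        f2 = math.sin(f1)
        f3 = math.cos(f2)
        # Reverse sweep over the schedule (f3, f2, f1, x1, x2).
        t = {'f3': 1.0}
        t['f2'] = -math.sin(f2) * t['f3']   # df3/df2 = -sin(f2)
        t['f1'] = math.cos(f1) * t['f2']    # df2/df1 = cos(f1)
        t['x1'] = x2 * t['f1']              # df1/dx1 = x2
        t['x2'] = x1 * t['f1']              # df1/dx2 = x1
        return f3, t['x1'], t['x2']         # value and the two total derivatives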
Limitations of forward reasoning
World Representation
Recognising patterns (perceptron style) is only one form of intelligence
Solving chess problems is another and requires complex reasoning using someform of internal model
The world is noisy and information may be conflicting
Recognised that new approaches are required
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Limitations of forward reasoning
World Representation
Models help us to fantasise about the world
Models
Models
Burglar Problem
Creaks and Bumps
Creak Bump
Burglar Model
pos1 pos2 pos3 pos4
snd1 snd2 snd3 snd4
pos - position in kitchensnd ndash sound
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Stubby Fingers
Stubby Fingers
int1 int2 int3 int4
hit1 hit2 hit3 hit4
int - intended keyhit ndash hit key
Stubby Fingers errors
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz
005
01
015
02
025
03
035
04
045
05
055
Stubby Fingers language
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz 0
01
02
03
04
05
06
07
08
09
Stubby Fingers
Given the typed sequence cwsykcak what is the most likely word that thiscorresponds to
List the 200 most likely hidden sequences
Discard those that are not in a standard English dictionary
Take the most likely proper English word as the intended typed word
Speech Recognition raw signal
0 01 02 03 04 05 06 07 08 09minus02
minus015
minus01
minus005
0
005
01
015
02
025
03
Time
lsquoneuralrsquo representation
10 20 30 40 50 60 70 80
5
10
15
20
25
Speech Recognition
pho1 pho2 pho3 pho4
aud1 aud2 aud3 aud4
pho phoneme (letter)aud audio signal (neural representation)
Medical Diagnosis
tumour flu meningitis
headache fever appetite x-ray
Combine known medical knowledge with patient specific information
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems such as the IsingModel (1920) and in AI applications such as the HMM (Baum 1966Stratonovich 1960)
The need for structure
We often want to make a probabilistic description of many objects (electronspins neurons customers etc )
Typically the representational and computational cost of probabilistic modelsgrows exponentially with the number of objects represented
Without introducing strong structural limitations about how these objects caninteract probability is a non-starter
For this reason computationally lsquosimplerrsquo alternatives (such as fuzzy logic)were introduced to try to avoid some of these difficulties ndash however these aretypically frowed on by purists
Graphical Models
We can use graphs to represent how objects can probabilistically interact witheach other
Graphical Models and then a marriage between Graph and Probability theory
Many of the quantities that we would like to compute in a probabilitydistribution can then be related to operations on the graph
The computational complexity of operations can often be related to thestructure of the graph
Graphical Models are now used as a standard framework in EngineeringStatistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty and aretherefore widely applicable
Uses in Industry
Microsoft used to estimate the skill distribution of players in online games(the worlds largest graphical model)
Hospitals use Belief Nets to encode knowledge about diseases and symptomsto aid medical diagnosis
Google Microsoft Facebook used in many places including advertisingvideo game prediction speech recognition
Used to estimate inherent desirability of products in consumer retail
Microsoft and others Attempt to go beyond simple AB testing by usesGraphical Models to model the whole companyuser relationship
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or more shortly theprobability of x given y) is defined as
p(x|y) equiv p(x y)
p(y)=p(y|x)p(x)
p(y)(Bayesrsquo rule)
Throwing darts
p(region 5|not region 20) =p(region 5 not region 20)
p(not region 20)
=p(region 5)
p(not region 20)=
120
1920=
1
19
Interpretationp(A = a|B = b) should not be interpreted as lsquoGiven the event B = b has occurredp(A = a|B = b) is the probability of the event A = a occurringrsquo The correctinterpretation should be lsquop(A = a|B = b) is the probability of A being in state aunder the constraint that B is in state brsquo
Battleships
Assume there are 2 ships 1 vertical (ship 1) and 1 horizontal (ship 2) of 5pixels each
Can be placed anywhere on the 10times10 grid but cannot overlap
Let s1 is the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query lsquohitrsquo or lsquomissrsquo responses
p(s1 s2|D) =p(D|s1 s2)p(s1 s2)
p(D)Let X be the matrix of pixel occupancy
p(X|D) =sums1s2
p(X s1 s2|D) =sums1s2
p(X|s1 s2)p(s1 s2|D)
demoBattleshipsm
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
Autoencoder on MNIST digits (Hinton 2006 Science)
Figure Reconstructions using H = 30 components From the Top Original imageAutoencoder1 Autoencoder2 PCA
60000 training images (28times 28 = 784 pixels)
Use a form of autoencoder to find a lower (30) dimensional representation
At the time the special layerwise training procedure was consideredfundamental to the success of this approach Now not deemed necessaryprovided we use a sensible initialisation
Google Cats
10 Million Youtube video frames (200x200 pixel images)
Use a specialised autoencoder with 9 layers (1 billion weights)
2000 computers + two weeks of computing
Examine units to see what images they most respond to
Google Autoencoder
From Nando De Freitas
Convolutional NNs
CNNs are particularly popular in image processing
Often the feature maps correspond (not to macro features such as bicycles)but micro features
For example in handwritten digit recognition they correspond to smallconstituent parts of the digits
These are used then to process the image into a representation that is betterfor recognition
NNs in NLP
Bag of Words
We have D words in a dictionary aardvark zorro so that we can relateeach word with its dictionary index
We can also think of this as a Euclidean embedding e
aardvarkrarr eaardvark =
100
zorrorarr ezorro =
001
Word Embeddings
Idea is to replace the Euclidean embeddings e with embeddings (vectors) vthat are learned
Objective is for example next word prediction accuracy
These are often called lsquoneural language modelsrsquo
NNs in NLP
Each word w in the dictionary has an associated embedding vector vwUsually around 200 dimensional vectors are used
Consider the sentence
the cat sat on the mat
and that we wish to predict the word on given the two preceding cat sat
and two succeeding words the mat
We can use a network that has inputs vcat vsat vthe vmat
The output of the network is a probability over all words in the dictionaryp(w| vinputs)We want p(w = on|vcatvsatvthevmat) to be high
The overall objective is then to learn all the word embeddings and networkparameters subject to predicting the word correctly based on the context
Word Embeddings
Given a word (France for example) we can find which words w have embeddingvectors closest to vFrance From Ronan Collabert (2011)
Word Embeddings
There appears to be a natural lsquogeometryrsquo to the embeddings For example thereare directions that correspond to gender
vwoman minus vman asymp vaunt minus vuncle
vwoman minus vman asymp vqueen minus vking
From Mikolov (2013)
Word Embeddings Analogies
Given a relationship France-Paris we get the lsquorelationshiprsquo embedding
v = vParis minus vFrance
Given Italy we can calculate vItaly + v and find the word in the dictionary whichhas closest embedding to this (it turns out to be Rome) From Mikolov (2013)
Word Embeddings Constrained Embeddings
We can learn embeddings for English words and embeddings for ChinesewordsHowever when we know that a Chinese and English word have a similarmeaning we add a constraint that the word embeddings vChineseWord andvEnglishWord should be closeWe have only a small amount of labelled lsquosimilarrsquo Chinese-English words(these are the green border boxes in the above they are standard translationsof the corresponding Chinese character)We can visualise in 2D (using t-SNE) the embedding vectors See Socher(2013)
Word Embeddings Constrained Embeddings
Recursive Nets and Embeddings
Stanford Sentiment Treebank Consists of parsed sentences with sentiment labels(minusminusminus 0+++) for each node (phrase) in the tree 215000 labelled phrases(obtained from three humans)
Recursive Nets and Embeddings
Idea is to recursively combine embeddings such that they accurately predictthe sentiment at each node
Recursive Nets and EmbeddingsTraining
We have a softmax classifier for each node in the tree to predict thesentiment of the phrase beneath this node in the tree
The weights of this classifier are shared across all nodes
At the leaf nodes at the bottom of the tree the inputs to the classifiers arethe word embeddings
The embeddings are combined by another network g with commonparameters which forms the input to the sentiment classifier
We then learn all the embeddings shared classifier parameters and sharedcombination parameters to maximise the classification accuracy
Prediction
For a new movie review the review is first parsed using a standard grammartree parser
This forms the tree which can be used to recursively form the sentiment classlabel for the review
Currently the best sentiment classifier Socher (2013)
Recursive Nets and Embeddingsotilde otilde
eth
eth
Icircplusmnsup1raquoreg
eth
Uumlplusmnfrac14sup1raquoreg
otilde
otilde
eth
middotshy
otilde
eth
plusmnsup2raquo
otilde
eth
plusmnordm
otilde
otilde
eth
notcedilraquo
otilde
otilde
eth
sup3plusmnshynot
otilde
frac12plusmnsup3degraquoacuteacutemiddotsup2sup1
eth
ordfiquestregmiddotiquestnotmiddotplusmnsup2shy
eth
eth
plusmnsup2
eth
eth
notcedilmiddotshy
eth
notcedilraquosup3raquo
eth
ograve
yen
eth
eth
Icircplusmnsup1raquoreg
eth
Uumlplusmnfrac14sup1raquoreg
yen
yen
eth
middotshy
yen
eth
plusmnsup2raquo
yen
eth
plusmnordm
yen
yen
eth
notcedilraquo
yen
yen
yen
acuteraquoiquestshynot
otilde
frac12plusmnsup3degraquoacuteacutemiddotsup2sup1
eth
ordfiquestregmiddotiquestnotmiddotplusmnsup2shy
eth
eth
plusmnsup2
eth
eth
notcedilmiddotshy
eth
notcedilraquosup3raquo
eth
ograve
otilde
eth
times
otilde
otilde
otilde
acutemiddotmicroraquofrac14
eth
eth
eth
raquoordfraquoregsect
eth
eth
shymiddotsup2sup1acuteraquo
eth
sup3middotsup2laquonotraquo
eth
eth
plusmnordm
eth
eth
notcedilmiddotshy
eth
eth
ograve
yen
eth
times
yen
yen
eth
eth
frac14middotfrac14
eth
sup2ugravenot
eth
eth
acutemiddotmicroraquo
eth
eth
eth
iquest
eth
eth
shymiddotsup2sup1acuteraquo
eth
sup3middotsup2laquonotraquo
eth
eth
plusmnordm
eth
eth
notcedilmiddotshy
eth
eth
ograve
yen
eth
timesnot
yen
yen
eth
eth
ugraveshy
eth
paralaquoshynot
yen
otilde
middotsup2frac12regraquofrac14middotfrac34acutesect
yen yen
frac14laquoacuteacute
eth
ograve
eth
eth
timesnot
eth
eth
eth
eth
eth
ugraveshy
otilde
yen
sup2plusmnnot
yen yen
frac14laquoacuteacute
eth
ograve
Uacutemiddotsup1laquoregraquo ccedilaelig IcircOgraveIgraveOgrave degregraquofrac14middotfrac12notmiddotplusmnsup2 plusmnordm degplusmnshymiddotnotmiddotordfraquo iquestsup2frac14 sup2raquosup1iquestnotmiddotordfraquo oslashfrac34plusmnnotnotplusmnsup3 regmiddotsup1cedilnotdivide shyraquosup2notraquosup2frac12raquoshy iquestsup2frac14 notcedilraquomiddotreg sup2raquosup1iquestnotmiddotplusmnsup2ograve
Recurrent Nets
x1 x2 x3
h1 h2 h3
y1 y2 y3
A A A
C C C
B B
RNNs are used in timeseries applications
The basic idea is that the hidden units at time ht (and possibly output yt)depend on the previous state of the network htminus1 xtminus1 ytminus1 for inputs xt andoutputs yt
In the above network I lsquounrolled the net through timersquo to give a standard NNdiagram
I omitted the potential links from xtminus1 ytminus1 to ht
Handwriting Generation using a RNN
Some training examples
Handwriting Generation using a RNN
Some generated examples Top line is real handwriting for comparison See AlexGraversquos work
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reasons research in deep learning has exploded
Much greater compute power (GPU)
Much larger datasets
AutoDiff
What is AutoDiff
AutoDiff takes a function f(x) and returns an exact value (up to machineaccuracy) for the gradient
gi(x) equivpart
partxif
∥∥∥∥x
Note that this is not the same as a numerical approximation (such as centraldifferences) for the gradient
One can show that if done efficiently one can always calculate the gradient inless than 5 times the time it takes to compute f(x)
Reverse DifferentiationA useful graphical representation is that the total derivative of f with respect to xis given by the sum over all path values from x to f where each path value is theproduct of the partial derivatives of the functions on the edges
df
dx=partf
partx+partf
partg
dg
dx
x
f
gpartfpartx
dgdx
partfpartg
Example
For f(x) = x2 + xgh where g =x2 and h = xg2
x
f
gh2x+ gh
2x
xh
2gx
xg
g2
f prime(x) = (2x+ gh) + (g2xg) + (2x2gxxg) + (2xxh) = 2x+ 8x7
Reverse DifferentiationConsider
f(x1 x2) = cos (sin(x1x2))
We can represent this computationally using an Abstract Syntax Tree (AST)
x1 x2
f1
f2
f3
f1(x1 x2) = x1x2
f2(x) = sin(x)
f3(x) = cos(x)
Given values for x1 x2 we first run forwards through the tree so that we canassociate each node with an actual function value
Reverse Differentiation
x1 x2
f1
f2
f3
df3dx1
=partf3partf2
df2dx1
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx1
Similarly
df3dx2
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx2
The two derivatives share the same computation branch andwe want to exploit this
Reverse Differentiation
x1 x2
f1
f2
f3
partf1partx1
= x2partf1partx2
= x1
partf2partf1
= cos(f1)
partf3partf2
= minus sin(f2)
1 Find the reverse ancestral (backwards) scheduleof nodes (f3 f2 f1 x1 x2)
2 Start with the first node n1 in the reverseschedule and define tn1 = 1
3 For the next node n in the reverse schedule findthe child nodes ch (n) Then define
tn =sum
cisinch(n)
partfcpartfn
tc
4 The total derivatives of f with respect to theroot nodes of the tree (here x1 and x2) are givenby the values of t at those nodes
This is a general procedure that can be used to automatically define a subroutineto efficiently compute the gradient It is efficient because information is collectedat nodes in the tree and split between parents only when required
Limitations of forward reasoning
World Representation
Recognising patterns (perceptron style) is only one form of intelligence
Solving chess problems is another and requires complex reasoning using someform of internal model
The world is noisy and information may be conflicting
Recognised that new approaches are required
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Limitations of forward reasoning
World Representation
Models help us to fantasise about the world
Models
Models
Burglar Problem
Creaks and Bumps
Creak Bump
Burglar Model
pos1 pos2 pos3 pos4
snd1 snd2 snd3 snd4
pos - position in kitchensnd ndash sound
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Stubby Fingers
Stubby Fingers
int1 int2 int3 int4
hit1 hit2 hit3 hit4
int - intended keyhit ndash hit key
Stubby Fingers errors
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz
005
01
015
02
025
03
035
04
045
05
055
Stubby Fingers language
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz 0
01
02
03
04
05
06
07
08
09
Stubby Fingers
Given the typed sequence cwsykcak what is the most likely word that thiscorresponds to
List the 200 most likely hidden sequences
Discard those that are not in a standard English dictionary
Take the most likely proper English word as the intended typed word
Speech Recognition raw signal
0 01 02 03 04 05 06 07 08 09minus02
minus015
minus01
minus005
0
005
01
015
02
025
03
Time
lsquoneuralrsquo representation
10 20 30 40 50 60 70 80
5
10
15
20
25
Speech Recognition
pho1 pho2 pho3 pho4
aud1 aud2 aud3 aud4
pho phoneme (letter)aud audio signal (neural representation)
Medical Diagnosis
tumour flu meningitis
headache fever appetite x-ray
Combine known medical knowledge with patient specific information
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems such as the IsingModel (1920) and in AI applications such as the HMM (Baum 1966Stratonovich 1960)
The need for structure
We often want to make a probabilistic description of many objects (electronspins neurons customers etc )
Typically the representational and computational cost of probabilistic modelsgrows exponentially with the number of objects represented
Without introducing strong structural limitations about how these objects caninteract probability is a non-starter
For this reason computationally lsquosimplerrsquo alternatives (such as fuzzy logic)were introduced to try to avoid some of these difficulties ndash however these aretypically frowed on by purists
Graphical Models
We can use graphs to represent how objects can probabilistically interact witheach other
Graphical Models and then a marriage between Graph and Probability theory
Many of the quantities that we would like to compute in a probabilitydistribution can then be related to operations on the graph
The computational complexity of operations can often be related to thestructure of the graph
Graphical Models are now used as a standard framework in EngineeringStatistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty and aretherefore widely applicable
Uses in Industry
Microsoft used to estimate the skill distribution of players in online games(the worlds largest graphical model)
Hospitals use Belief Nets to encode knowledge about diseases and symptomsto aid medical diagnosis
Google Microsoft Facebook used in many places including advertisingvideo game prediction speech recognition
Used to estimate inherent desirability of products in consumer retail
Microsoft and others Attempt to go beyond simple AB testing by usesGraphical Models to model the whole companyuser relationship
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or more shortly theprobability of x given y) is defined as
p(x|y) equiv p(x y)
p(y)=p(y|x)p(x)
p(y)(Bayesrsquo rule)
Throwing darts
p(region 5|not region 20) =p(region 5 not region 20)
p(not region 20)
=p(region 5)
p(not region 20)=
120
1920=
1
19
Interpretationp(A = a|B = b) should not be interpreted as lsquoGiven the event B = b has occurredp(A = a|B = b) is the probability of the event A = a occurringrsquo The correctinterpretation should be lsquop(A = a|B = b) is the probability of A being in state aunder the constraint that B is in state brsquo
Battleships
Assume there are 2 ships 1 vertical (ship 1) and 1 horizontal (ship 2) of 5pixels each
Can be placed anywhere on the 10times10 grid but cannot overlap
Let s1 is the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query lsquohitrsquo or lsquomissrsquo responses
p(s1 s2|D) =p(D|s1 s2)p(s1 s2)
p(D)Let X be the matrix of pixel occupancy
p(X|D) =sums1s2
p(X s1 s2|D) =sums1s2
p(X|s1 s2)p(s1 s2|D)
demoBattleshipsm
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (inferring the present): p(h_t | v_{1:t})
Prediction (inferring the future): p(h_t | v_{1:s}), t > s
Smoothing (inferring the past): p(h_t | v_{1:u}), t < u
Likelihood: p(v_{1:T})
Most likely path (Viterbi alignment): \argmax_{h_{1:T}} p(h_{1:T} | v_{1:T})

For prediction, one is also often interested in p(v_t | v_{1:s}) for t > s.
Inference in Hidden Markov Models
Belief network representation of a HMM
(Figure: first order chain h_1 \to h_2 \to h_3 \to h_4, with each h_t emitting v_t.)
Filtering, Smoothing and Viterbi are all computationally efficient, scaling linearly with the length of the timeseries (but quadratically with the number of hidden states); see the sketch below.
The algorithms are variants of 'message passing on factor graphs'.
The algorithms are guaranteed to work if the graph is singly connected.
Huge research effort in the last 15 years to apply message passing for approximate inference in multiply-connected graphs (e.g. low-density parity-check codes).
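A minimal sketch of filtering (the classical forward algorithm) for a discrete HMM; the tables and function names here are my own made-up stand-ins. Each update costs O(H^2) and the whole pass is linear in T, matching the scaling quoted above.

import numpy as np

def filtering(v, p_h1, p_trans, p_emit):
    """Forward algorithm: return p(h_t | v_{1:t}) for each t.

    p_h1[i]       = p(h_1 = i)
    p_trans[i, j] = p(h_t = i | h_{t-1} = j)
    p_emit[k, i]  = p(v_t = k | h_t = i)
    """
    alpha = p_emit[v[0]] * p_h1
    alpha /= alpha.sum()
    filtered = [alpha]
    for vt in v[1:]:
        alpha = p_emit[vt] * (p_trans @ alpha)   # O(H^2) per step
        alpha /= alpha.sum()                     # normalise to get p(h_t | v_{1:t})
        filtered.append(alpha)
    return np.array(filtered)

# Made-up two-state example with binary observations.
p_h1 = np.array([0.5, 0.5])
p_trans = np.array([[0.9, 0.2],
                    [0.1, 0.8]])     # columns sum to 1
p_emit = np.array([[0.8, 0.3],
                   [0.2, 0.7]])      # rows: observation value; columns: hidden state
print(filtering([0, 0, 1, 0], p_h1, p_trans, p_emit))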
HMMs for speech recognition
h_t is the phoneme at time t; p(h_t|h_{t-1}) – language model; p(v_t|h_t) – speech signal model.
Deep Nets and HMMs
(Figure: HMM with hidden phonemes h_1, \ldots, h_4 emitting signals v_1, \ldots, v_4.)
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(v_t|h_t) as a Gaussian whose mean is some function of the phoneme, \mu(h_t; \theta).
This function is a deep neural network trained on a large amount of data
There is a gold rush at the moment to find similar breakthrough applications of deep networks in reasoning systems.
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
(Figure: generative belief network with latent variables h_1, h_2 as parents of the visible variables v_1, \ldots, v_4.)
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model, not a function.
The latent variables h can be sampled from using p(h), and then an image sampled from p(v|h). One cannot use an autoencoder to generate new images.
The bad news
Inference (computing p(h|v)) and parameter learning are intractable in these models.
Statisticians typically use sampling as an approximation
It is very popular in ML to use a variational method – much faster for inference.
Variational Inference

Consider a distribution

p(v|\theta) = \int_h p(v|h, \theta) p(h)

and suppose that we wish to learn \theta to maximise the probability that this model generates the observed data.
\log p(v|\theta) \ge -\int_h q(h|v, \phi) \log q(h|v, \phi) + \int_h q(h|v, \phi) \log p(v|h, \theta) + \text{const}
The idea is to choose a 'variational' distribution q(h|v, \phi) such that we can either calculate the bound analytically or sample it efficiently.
We then jointly maximise the bound with respect to \phi and \theta.
We can parameterise p(v|h, \theta) using a deep network.
Very popular approach – see the 'variational autoencoder' and also attention mechanisms.
Extension to a semi-supervised method using p(v) = \int_h \sum_c p(v|h, c) p(c) p(h).
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A, we need to decide which action to take, for any state of W, that will be best for our long-term goals.
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Then learn which action to take given the low-dimensional representation (see the sketch below).
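The slides leave the decision-making method unspecified; purely as an illustration, here is a minimal tabular Q-learning sketch, assuming the screen has already been mapped to a small discrete representation. The environment dynamics here are a hypothetical stand-in.

import numpy as np

n_states, n_actions = 10, 4            # toy discretised representation (hypothetical)
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.1, 0.99, 0.1     # learning rate, discount, exploration
rng = np.random.default_rng(0)

def step(s, a):
    """Hypothetical toy dynamics: the action moves the state; reward in state 9."""
    s_next = (s + a + 1) % n_states
    return s_next, float(s_next == 9)

s = 0
for _ in range(10000):
    a = int(rng.integers(n_actions)) if rng.random() < eps else int(Q[s].argmax())
    s_next, r = step(s, a)
    # Q-learning: move Q(s,a) towards r + gamma * max_a' Q(s',a')
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
    s = s_next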
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state-of-the-art results in Speech Recognition, Image Analysis, Game Playing.
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve the interaction between reinforcement learning and representation learning.
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer:
https://reinfer.io
Google Cats
10 million YouTube video frames (200×200 pixel images)
Use a specialised autoencoder with 9 layers (1 billion weights)
2000 computers + two weeks of computing
Examine units to see what images they most respond to
Google Autoencoder
From Nando De Freitas
Convolutional NNs
CNNs are particularly popular in image processing
Often the feature maps correspond not to macro features (such as bicycles) but to micro features.
For example, in handwritten digit recognition they correspond to small constituent parts of the digits.
These are then used to process the image into a representation that is better for recognition.
NNs in NLP
Bag of Words
We have D words in a dictionary {aardvark, \ldots, zorro}, so that we can relate each word with its dictionary index.
We can also think of this as a Euclidean embedding e
aardvark \to e_{aardvark} = (1, 0, \ldots, 0)^{\mathsf{T}}

zorro \to e_{zorro} = (0, \ldots, 0, 1)^{\mathsf{T}}
Word Embeddings
Idea is to replace the Euclidean embeddings e with embeddings (vectors) vthat are learned
The objective is, for example, next-word prediction accuracy.
These are often called 'neural language models'.
NNs in NLP
Each word w in the dictionary has an associated embedding vector v_w. Usually around 200-dimensional vectors are used.
Consider the sentence
the cat sat on the mat
and that we wish to predict the word 'on' given the two preceding words 'cat sat' and the two succeeding words 'the mat'.
We can use a network that has inputs v_{cat}, v_{sat}, v_{the}, v_{mat}.
The output of the network is a probability over all words in the dictionary, p(w | v_{inputs}). We want p(w = on | v_{cat}, v_{sat}, v_{the}, v_{mat}) to be high.
The overall objective is then to learn all the word embeddings and networkparameters subject to predicting the word correctly based on the context
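A minimal sketch of this setup: tiny dictionary, a single linear layer and a softmax. All sizes and word indices are illustrative, and training by gradient ascent on log p(target | context) is omitted.

import numpy as np

rng = np.random.default_rng(0)
D, d, C = 50, 20, 4                 # dictionary size, embedding dim, context words
V = rng.normal(0, 0.1, (D, d))      # learned word embeddings v_w
W = rng.normal(0, 0.1, (D, C * d))  # network mapping the context to word scores

def predict(context_ids):
    """p(w | v_inputs): softmax over all dictionary words."""
    x = np.concatenate([V[i] for i in context_ids])   # e.g. v_cat, v_sat, v_the, v_mat
    scores = W @ x
    e = np.exp(scores - scores.max())
    return e / e.sum()

# Gradient ascent on log p(target | context) would update both W and the rows
# of V used in the context -- embeddings and network are learned jointly.
p = predict([3, 17, 5, 8])          # made-up word indices
print(p.shape, p.sum())             # (50,) 1.0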
Word Embeddings
Given a word (France, for example) we can find which words w have embedding vectors closest to v_{France}. From Ronan Collobert (2011).
Word Embeddings
There appears to be a natural 'geometry' to the embeddings. For example, there are directions that correspond to gender:

v_{woman} - v_{man} \approx v_{aunt} - v_{uncle}
v_{woman} - v_{man} \approx v_{queen} - v_{king}
From Mikolov (2013)
Word Embeddings: Analogies
Given a relationship France–Paris, we get the 'relationship' embedding

v = v_{Paris} - v_{France}

Given Italy, we can calculate v_{Italy} + v and find the word in the dictionary with the closest embedding to this (it turns out to be Rome). From Mikolov (2013).
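The lookup itself is just a nearest-neighbour search in embedding space; here is a sketch with random stand-in embeddings (a trained model would be needed for the answer to actually be Rome).

import numpy as np

rng = np.random.default_rng(0)
words = ['France', 'Paris', 'Italy', 'Rome', 'cat', 'mat']   # toy vocabulary
V = rng.normal(0, 1, (len(words), 50))                       # stand-in embeddings
idx = {w: i for i, w in enumerate(words)}

def closest_word(vec, exclude=()):
    """Word whose embedding has the highest cosine similarity to vec."""
    sims = V @ vec / (np.linalg.norm(V, axis=1) * np.linalg.norm(vec))
    return max((w for w in words if w not in exclude), key=lambda w: sims[idx[w]])

v = V[idx['Paris']] - V[idx['France']]                       # relationship embedding
print(closest_word(V[idx['Italy']] + v, exclude={'Italy', 'Paris', 'France'}))
# With real trained embeddings this returns 'Rome'; with random stand-ins
# the answer is of course meaningless.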
Word Embeddings: Constrained Embeddings

We can learn embeddings for English words and embeddings for Chinese words.
However, when we know that a Chinese and an English word have a similar meaning, we add a constraint that the word embeddings v_{ChineseWord} and v_{EnglishWord} should be close.
We have only a small amount of labelled 'similar' Chinese–English words (these are the green border boxes in the figure; they are standard translations of the corresponding Chinese character).
We can visualise the embedding vectors in 2D (using t-SNE). See Socher (2013).
Word Embeddings: Constrained Embeddings

(Figure: 2D t-SNE visualisation of the jointly embedded Chinese and English words.)
Recursive Nets and Embeddings
Stanford Sentiment Treebank: parsed sentences with a sentiment label (−−, −, 0, +, ++) for each node (phrase) in the tree; 215,000 labelled phrases (obtained from three humans).
Recursive Nets and Embeddings
Idea is to recursively combine embeddings such that they accurately predictthe sentiment at each node
Recursive Nets and Embeddings: Training
We have a softmax classifier for each node in the tree to predict thesentiment of the phrase beneath this node in the tree
The weights of this classifier are shared across all nodes
At the leaf nodes at the bottom of the tree the inputs to the classifiers arethe word embeddings
The embeddings are combined by another network g with common parameters, which forms the input to the sentiment classifier.
We then learn all the embeddings shared classifier parameters and sharedcombination parameters to maximise the classification accuracy
Prediction
For a new movie review the review is first parsed using a standard grammartree parser
This forms the tree which can be used to recursively form the sentiment classlabel for the review
Currently the best sentiment classifier; see Socher (2013). A toy sketch of the recursive computation follows.
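A minimal sketch of the recursive scheme on a toy parse tree, with a shared combiner g and a shared softmax classifier. The weights are random and the tanh combiner is my assumption; this is for illustration only.

import numpy as np

rng = np.random.default_rng(0)
d, n_classes = 8, 5                       # embedding dim; sentiment classes --,-,0,+,++
Wg = rng.normal(0, 0.1, (d, 2 * d))       # shared combination network g
Wc = rng.normal(0, 0.1, (n_classes, d))   # shared softmax classifier

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def sentiment(node, embed):
    """node: word string (leaf) or (left, right) tuple; returns (vector, class probs)."""
    if isinstance(node, str):
        h = embed[node]                               # leaf input: the word embedding
    else:
        hl, _ = sentiment(node[0], embed)
        hr, _ = sentiment(node[1], embed)
        h = np.tanh(Wg @ np.concatenate([hl, hr]))    # combine the children with g
    return h, softmax(Wc @ h)                         # per-node sentiment prediction

embed = {w: rng.normal(0, 0.1, d) for w in ['not', 'very', 'good']}
tree = ('not', ('very', 'good'))                      # toy parse of "not very good"
vec, probs = sentiment(tree, embed)
print(probs)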
Recursive Nets and Embeddings

(Figure: RNTN predictions of positive and negative (bottom right) sentences and their negations, e.g. 'Roger Dodger is one of the most compelling variations on this theme' versus '... one of the least compelling variations ...'; from Socher (2013).)
Recurrent Nets
(Figure: an RNN unrolled through time, with inputs x_1, x_2, x_3, hidden units h_1, h_2, h_3, outputs y_1, y_2, y_3, and shared weight matrices A, B, C.)
RNNs are used in timeseries applications
The basic idea is that the hidden units at time t (and possibly the output y_t) depend on the previous state of the network h_{t-1}, x_{t-1}, y_{t-1}, for inputs x_t and outputs y_t.
In the above network I 'unrolled the net through time' to give a standard NN diagram.
I omitted the potential links from x_{t-1}, y_{t-1} to h_t. A minimal sketch of the unrolled computation follows.
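A minimal sketch of the forward pass. Which letter labels which connection in the diagram is my guess, and tanh hidden units with linear outputs are an assumption.

import numpy as np

rng = np.random.default_rng(0)
dx, dh, dy, T = 3, 5, 2, 4
A = rng.normal(0, 0.5, (dh, dh))   # h_{t-1} -> h_t (shared across time)
B = rng.normal(0, 0.5, (dh, dx))   # x_t     -> h_t
C = rng.normal(0, 0.5, (dy, dh))   # h_t     -> y_t

x = rng.normal(0, 1, (T, dx))
h = np.zeros(dh)
for t in range(T):
    h = np.tanh(A @ h + B @ x[t])  # hidden state depends on h_{t-1} and x_t
    y = C @ h
    print(t, y)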
Handwriting Generation using a RNN
Some training examples
Handwriting Generation using a RNN
Some generated examples. The top line is real handwriting, for comparison. See Alex Graves's work.
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reasons research in deep learning has exploded
Much greater compute power (GPU)
Much larger datasets
AutoDiff
What is AutoDiff?
AutoDiff takes a function f(x) and returns an exact value (up to machine accuracy) for the gradient

g_i(x) \equiv \left. \frac{\partial f}{\partial x_i} \right|_{x}
Note that this is not the same as a numerical approximation (such as centraldifferences) for the gradient
One can show that if done efficiently one can always calculate the gradient inless than 5 times the time it takes to compute f(x)
Reverse Differentiation

A useful graphical representation: the total derivative of f with respect to x is given by the sum over all path values from x to f, where each path value is the product of the partial derivatives of the functions on the edges:

\frac{df}{dx} = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial g} \frac{dg}{dx}

(Figure: graph with nodes x, g, f; edge x \to f labelled \partial f/\partial x, edge x \to g labelled dg/dx, edge g \to f labelled \partial f/\partial g.)
Example

For f(x) = x^2 + x g h, where g = x^2 and h = x g^2:

(Figure: path diagram with edge labels x \to f: 2x + gh; x \to g: 2x; g \to f: xh; g \to h: 2gx; x \to h: g^2; h \to f: xg.)

f'(x) = (2x + gh) + (g^2 \cdot xg) + (2x \cdot 2gx \cdot xg) + (2x \cdot xh) = 2x + 8x^7
Reverse Differentiation

Consider

f(x_1, x_2) = \cos(\sin(x_1 x_2))

We can represent this computationally using an Abstract Syntax Tree (AST):

(Figure: AST with leaves x_1, x_2 feeding f_1, which feeds f_2, which feeds f_3.)

f_1(x_1, x_2) = x_1 x_2
f_2(x) = \sin(x)
f_3(x) = \cos(x)
Given values for x_1, x_2, we first run forwards through the tree so that we can associate each node with an actual function value.
Reverse Differentiation

\frac{df_3}{dx_1} = \frac{\partial f_3}{\partial f_2} \frac{df_2}{dx_1} = \underbrace{\frac{\partial f_3}{\partial f_2} \frac{df_2}{df_1}}_{df_3/df_1} \frac{df_1}{dx_1}

Similarly,

\frac{df_3}{dx_2} = \underbrace{\frac{\partial f_3}{\partial f_2} \frac{df_2}{df_1}}_{df_3/df_1} \frac{df_1}{dx_2}

The two derivatives share the same computation branch, and we want to exploit this.
Reverse Differentiation

The local derivatives on the edges of the tree are

\frac{\partial f_1}{\partial x_1} = x_2, \quad \frac{\partial f_1}{\partial x_2} = x_1, \quad \frac{\partial f_2}{\partial f_1} = \cos(f_1), \quad \frac{\partial f_3}{\partial f_2} = -\sin(f_2)

1. Find the reverse ancestral (backwards) schedule of nodes (f_3, f_2, f_1, x_1, x_2).
2. Start with the first node n_1 in the reverse schedule and define t_{n_1} = 1.
3. For the next node n in the reverse schedule, find the child nodes ch(n). Then define
   t_n = \sum_{c \in ch(n)} \frac{\partial f_c}{\partial f_n} t_c
4. The total derivatives of f with respect to the root nodes of the tree (here x_1 and x_2) are given by the values of t at those nodes.
This is a general procedure that can be used to automatically define a subroutineto efficiently compute the gradient It is efficient because information is collectedat nodes in the tree and split between parents only when required
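Following the schedule above on f(x_1, x_2) = cos(sin(x_1 x_2)), here is a minimal sketch, checked against central differences (which, as noted earlier, give only an approximation):

import numpy as np

# Forward pass: associate each node of the AST with its value.
x1, x2 = 0.7, -1.3
f1 = x1 * x2            # f1 = x1 x2
f2 = np.sin(f1)
f3 = np.cos(f2)

# Local partial derivatives on the edges of the tree.
df1_dx1 = x2
df1_dx2 = x1
df2_df1 = np.cos(f1)
df3_df2 = -np.sin(f2)

# Reverse schedule (f3, f2, f1, x1, x2): each node's t-value sums
# (edge partial) * (child t) over its children in the tree.
t_f3 = 1.0
t_f2 = df3_df2 * t_f3
t_f1 = df2_df1 * t_f2        # shared branch df3/df1, computed once
t_x1 = df1_dx1 * t_f1        # total derivative df3/dx1
t_x2 = df1_dx2 * t_f1        # total derivative df3/dx2

# Check against a central-difference approximation.
eps = 1e-6
num = (np.cos(np.sin((x1 + eps) * x2)) - np.cos(np.sin((x1 - eps) * x2))) / (2 * eps)
print(t_x1, num)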
Limitations of forward reasoning
World Representation
Recognising patterns (perceptron style) is only one form of intelligence
Solving chess problems is another and requires complex reasoning using someform of internal model
The world is noisy and information may be conflicting
Recognised that new approaches are required
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Limitations of forward reasoning
World Representation
Models help us to fantasise about the world
Models
Burglar Problem
Creaks and Bumps

(Figure: illustrations labelled 'Creak' and 'Bump'.)
Burglar Model
(Figure: HMM with positions pos_1, \ldots, pos_4 emitting sounds snd_1, \ldots, snd_4.)

pos – position in kitchen; snd – sound
Finding the Burglar

(Figure: the kitchen grid shown over successive time steps, with the observed creak/bump sequence and the inferred burglar position at each step.)
Stubby Fingers
(Figure: HMM with intended keys int_1, \ldots, int_4 emitting hit keys hit_1, \ldots, hit_4.)

int – intended key; hit – hit key
Stubby Fingers: errors

(Figure: heat map of the error model p(hit|int) over the keys a–z, with values ranging from about 0.05 to 0.55.)
Stubby Fingers: language

(Figure: heat map of the first order language model p(int_t | int_{t-1}) over the keys a–z, with values ranging from 0 to about 0.9.)
Stubby Fingers
Given the typed sequence 'cwsykcak', what is the most likely word that this corresponds to?
List the 200 most likely hidden sequences (see the Viterbi sketch below).
Discard those that are not in a standard English dictionary
Take the most likely proper English word as the intended typed word
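A sketch of the most-likely-sequence computation (Viterbi) behind the first step, on a tiny three-key alphabet; the tables are made-up stand-ins for the error and language models plotted above.

import numpy as np

def viterbi(hits, p_int1, p_lang, p_err):
    """argmax over int_{1:T} of p(int_{1:T}, hit_{1:T}) for a first order HMM.

    p_int1[i]    = p(int_1 = i)
    p_lang[i, j] = p(int_t = i | int_{t-1} = j)
    p_err[k, i]  = p(hit_t = k | int_t = i)
    """
    logd = np.log(p_err[hits[0]]) + np.log(p_int1)
    back = []
    for k in hits[1:]:
        scores = np.log(p_lang) + logd          # scores[i, j]: come from j, go to i
        back.append(scores.argmax(axis=1))
        logd = np.log(p_err[k]) + scores.max(axis=1)
    path = [int(logd.argmax())]
    for bp in reversed(back):
        path.append(int(bp[path[-1]]))
    return path[::-1]

# Three 'keys' 0, 1, 2 with a noisy hit model (made-up numbers).
p_int1 = np.array([0.4, 0.3, 0.3])
p_lang = np.array([[0.6, 0.2, 0.3],
                   [0.2, 0.6, 0.3],
                   [0.2, 0.2, 0.4]])   # columns sum to 1
p_err = np.array([[0.8, 0.1, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.1, 0.1, 0.8]])
print(viterbi([0, 1, 1, 2], p_int1, p_lang, p_err))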
Speech Recognition: raw signal

(Figure: raw speech waveform; amplitude between −0.2 and 0.3, plotted against time from 0 to 0.9 seconds.)
'Neural' representation

(Figure: the corresponding time–frequency ('neural') representation of the signal.)
Speech Recognition
(Figure: HMM with phonemes pho_1, \ldots, pho_4 emitting audio features aud_1, \ldots, aud_4.)

pho – phoneme (letter); aud – audio signal ('neural' representation)
Medical Diagnosis
(Figure: belief network in which the diseases tumour, flu and meningitis are parents of the symptoms and tests headache, fever, appetite and x-ray.)
Combine known medical knowledge with patient-specific information.
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
A natural framework to use in models of physical systems, such as the Ising model (1920), and in AI applications, such as the HMM (Baum 1966; Stratonovich 1960).
The need for structure
We often want to make a probabilistic description of many objects (electron spins, neurons, customers, etc.).
Typically the representational and computational cost of probabilistic modelsgrows exponentially with the number of objects represented
Without introducing strong structural limitations about how these objects caninteract probability is a non-starter
For this reason, computationally 'simpler' alternatives (such as fuzzy logic) were introduced to try to avoid some of these difficulties – however, these are typically frowned upon by purists.
Graphical Models
We can use graphs to represent how objects can probabilistically interact witheach other
Graphical Models are then a marriage between graph theory and probability theory.
Many of the quantities that we would like to compute in a probabilitydistribution can then be related to operations on the graph
The computational complexity of operations can often be related to thestructure of the graph
Graphical Models are now used as a standard framework in EngineeringStatistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty and aretherefore widely applicable
Uses in Industry
Microsoft: used to estimate the skill distribution of players in online games (the world's largest graphical model).
Hospitals use Belief Nets to encode knowledge about diseases and symptomsto aid medical diagnosis
Google, Microsoft, Facebook: used in many places, including advertising, video game prediction and speech recognition.
Used to estimate inherent desirability of products in consumer retail
Microsoft and others: attempts to go beyond simple A/B testing by using Graphical Models to model the whole company/user relationship.
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or, more shortly, the probability of x given y) is defined as

p(x|y) \equiv \frac{p(x, y)}{p(y)} = \frac{p(y|x) p(x)}{p(y)} \quad \text{(Bayes' rule)}
Throwing darts

p(\text{region 5} \,|\, \text{not region 20}) = \frac{p(\text{region 5}, \text{not region 20})}{p(\text{not region 20})} = \frac{p(\text{region 5})}{p(\text{not region 20})} = \frac{1/20}{19/20} = \frac{1}{19}
Interpretation: p(A = a|B = b) should not be interpreted as 'given the event B = b has occurred, p(A = a|B = b) is the probability of the event A = a occurring'. The correct interpretation is 'p(A = a|B = b) is the probability of A being in state a under the constraint that B is in state b'.
Battleships
Assume there are 2 ships, one vertical (ship 1) and one horizontal (ship 2), of 5 pixels each.
They can be placed anywhere on the 10×10 grid, but cannot overlap.
Let s_1 be the origin of ship 1 and s_2 the origin of ship 2.
Data D is a collection of query 'hit' or 'miss' responses.
p(s_1, s_2|D) = \frac{p(D|s_1, s_2) p(s_1, s_2)}{p(D)}

Let X be the matrix of pixel occupancy:

p(X|D) = \sum_{s_1, s_2} p(X, s_1, s_2|D) = \sum_{s_1, s_2} p(X|s_1, s_2) p(s_1, s_2|D)
demoBattleships.m
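The MATLAB demo is not reproduced here; a minimal Python sketch of the same computation (enumerate the legal origins, weight each by the data likelihood, then form the pixel marginals) might look like the following. The observations in D are made up.

import numpy as np
from itertools import product

G = 10                                   # grid size

def cells1(s1):
    r, c = s1
    return {(r + k, c) for k in range(5)}   # ship 1: vertical, 5 pixels

def cells2(s2):
    r, c = s2
    return {(r, c + k) for k in range(5)}   # ship 2: horizontal, 5 pixels

# Enumerate legal, non-overlapping placements of the two origins.
placements = [(s1, s2)
              for s1 in product(range(G - 4), range(G))
              for s2 in product(range(G), range(G - 4))
              if not cells1(s1) & cells2(s2)]

# Data D: queried pixel -> hit (True) / miss (False); made-up observations.
D = {(2, 3): True, (0, 0): False}

def likelihood(s1, s2):
    occ = cells1(s1) | cells2(s2)
    return float(all((q in occ) == hit for q, hit in D.items()))

w = np.array([likelihood(s1, s2) for s1, s2 in placements])
post = w / w.sum()                       # p(s1, s2 | D) under a uniform prior

# p(X | D): marginal pixel occupancy, summing over placements.
X = np.zeros((G, G))
for p_, (s1, s2) in zip(post, placements):
    for cell in cells1(s1) | cells2(s2):
        X[cell] += p_
print(X.round(2))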
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated with it the conditional probability of the node given its parents.
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(A, B, C, D, E) = p(A) p(B) p(C|A, B) p(D|C) p(E|B, C)

(Figure: the corresponding DAG, with A and B the parents of C, C the parent of D, and B and C the parents of E.)
Example – Part I

Sally's burglar Alarm is sounding. Has she been Burgled, or was the alarm triggered by an Earthquake? She turns the car Radio on for news of earthquakes.
Choosing an ordering

Without loss of generality, we can write

p(A, R, E, B) = p(A|R, E, B) p(R, E, B)
= p(A|R, E, B) p(R|E, B) p(E, B)
= p(A|R, E, B) p(R|E, B) p(E|B) p(B)
Assumptions:

The alarm is not directly influenced by any report on the radio: p(A|R, E, B) = p(A|E, B).
The radio broadcast is not directly influenced by the burglar variable: p(R|E, B) = p(R|E).
Burglaries don't directly 'cause' earthquakes: p(E|B) = p(E).

Therefore

p(A, R, E, B) = p(A|E, B) p(R|E) p(E) p(B)
Example – Part II: Specifying the Tables

(Figure: DAG with B and E the parents of A, and E the parent of R.)

p(A = 1|B, E):

B = 1, E = 1: 0.9999
B = 1, E = 0: 0.99
B = 0, E = 1: 0.99
B = 0, E = 0: 0.0001

p(R = 1|E):

E = 1: 1
E = 0: 0

The remaining tables are p(B = 1) = 0.01 and p(E = 1) = 0.000001. The tables and graphical structure fully specify the distribution.
Example – Part III: Inference

Initial evidence: the alarm is sounding.

p(B = 1|A = 1) = \frac{\sum_{E,R} p(B = 1, E, A = 1, R)}{\sum_{B,E,R} p(B, E, A = 1, R)} = \frac{\sum_{E,R} p(A = 1|B = 1, E) p(B = 1) p(E) p(R|E)}{\sum_{B,E,R} p(A = 1|B, E) p(B) p(E) p(R|E)} \approx 0.99
Additional evidence: the radio broadcasts an earthquake warning.

A similar calculation gives p(B = 1|A = 1, R = 1) ≈ 0.01.
Initially, because the alarm sounds, Sally thinks that she's been burgled. However, this probability drops dramatically when she hears that there has been an earthquake.
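These numbers are easy to verify by direct enumeration over the four binary variables; a short Python check of both posteriors:

from itertools import product

pB, pE = 0.01, 0.000001

def pA(b, e):                      # p(A = 1 | B, E) from the table above
    return {(1, 1): 0.9999, (1, 0): 0.99, (0, 1): 0.99, (0, 0): 0.0001}[(b, e)]

def pR(r, e):                      # p(R | E): the radio reports iff there is an earthquake
    return 1.0 if r == e else 0.0

def joint(b, e, a, r):
    pb = pB if b else 1 - pB
    pe = pE if e else 1 - pE
    pa = pA(b, e) if a else 1 - pA(b, e)
    return pa * pR(r, e) * pe * pb

# p(B = 1 | A = 1): marginalise E and R.
num = sum(joint(1, e, 1, r) for e, r in product([0, 1], repeat=2))
den = sum(joint(b, e, 1, r) for b, e, r in product([0, 1], repeat=3))
print(num / den)                   # approx 0.99

# p(B = 1 | A = 1, R = 1): the earthquake report 'explains away' the alarm.
num = sum(joint(1, e, 1, 1) for e in [0, 1])
den = sum(joint(b, e, 1, 1) for b, e in product([0, 1], repeat=2))
print(num / den)                   # approx 0.01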
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
Google Autoencoder
From Nando De Freitas
Convolutional NNs
CNNs are particularly popular in image processing
Often the feature maps correspond (not to macro features such as bicycles)but micro features
For example in handwritten digit recognition they correspond to smallconstituent parts of the digits
These are used then to process the image into a representation that is betterfor recognition
NNs in NLP
Bag of Words
We have D words in a dictionary aardvark zorro so that we can relateeach word with its dictionary index
We can also think of this as a Euclidean embedding e
aardvarkrarr eaardvark =
100
zorrorarr ezorro =
001
Word Embeddings
Idea is to replace the Euclidean embeddings e with embeddings (vectors) vthat are learned
Objective is for example next word prediction accuracy
These are often called lsquoneural language modelsrsquo
NNs in NLP
Each word w in the dictionary has an associated embedding vector vwUsually around 200 dimensional vectors are used
Consider the sentence
the cat sat on the mat
and that we wish to predict the word on given the two preceding cat sat
and two succeeding words the mat
We can use a network that has inputs vcat vsat vthe vmat
The output of the network is a probability over all words in the dictionaryp(w| vinputs)We want p(w = on|vcatvsatvthevmat) to be high
The overall objective is then to learn all the word embeddings and networkparameters subject to predicting the word correctly based on the context
Word Embeddings
Given a word (France for example) we can find which words w have embeddingvectors closest to vFrance From Ronan Collabert (2011)
Word Embeddings
There appears to be a natural lsquogeometryrsquo to the embeddings For example thereare directions that correspond to gender
vwoman minus vman asymp vaunt minus vuncle
vwoman minus vman asymp vqueen minus vking
From Mikolov (2013)
Word Embeddings Analogies
Given a relationship France-Paris we get the lsquorelationshiprsquo embedding
v = vParis minus vFrance
Given Italy we can calculate vItaly + v and find the word in the dictionary whichhas closest embedding to this (it turns out to be Rome) From Mikolov (2013)
Word Embeddings Constrained Embeddings
We can learn embeddings for English words and embeddings for ChinesewordsHowever when we know that a Chinese and English word have a similarmeaning we add a constraint that the word embeddings vChineseWord andvEnglishWord should be closeWe have only a small amount of labelled lsquosimilarrsquo Chinese-English words(these are the green border boxes in the above they are standard translationsof the corresponding Chinese character)We can visualise in 2D (using t-SNE) the embedding vectors See Socher(2013)
Word Embeddings Constrained Embeddings
Recursive Nets and Embeddings
Stanford Sentiment Treebank Consists of parsed sentences with sentiment labels(minusminusminus 0+++) for each node (phrase) in the tree 215000 labelled phrases(obtained from three humans)
Recursive Nets and Embeddings
Idea is to recursively combine embeddings such that they accurately predictthe sentiment at each node
Recursive Nets and EmbeddingsTraining
We have a softmax classifier for each node in the tree to predict thesentiment of the phrase beneath this node in the tree
The weights of this classifier are shared across all nodes
At the leaf nodes at the bottom of the tree the inputs to the classifiers arethe word embeddings
The embeddings are combined by another network g with commonparameters which forms the input to the sentiment classifier
We then learn all the embeddings shared classifier parameters and sharedcombination parameters to maximise the classification accuracy
Prediction
For a new movie review the review is first parsed using a standard grammartree parser
This forms the tree which can be used to recursively form the sentiment classlabel for the review
Currently the best sentiment classifier Socher (2013)
Recursive Nets and Embeddingsotilde otilde
eth
eth
Icircplusmnsup1raquoreg
eth
Uumlplusmnfrac14sup1raquoreg
otilde
otilde
eth
middotshy
otilde
eth
plusmnsup2raquo
otilde
eth
plusmnordm
otilde
otilde
eth
notcedilraquo
otilde
otilde
eth
sup3plusmnshynot
otilde
frac12plusmnsup3degraquoacuteacutemiddotsup2sup1
eth
ordfiquestregmiddotiquestnotmiddotplusmnsup2shy
eth
eth
plusmnsup2
eth
eth
notcedilmiddotshy
eth
notcedilraquosup3raquo
eth
ograve
yen
eth
eth
Icircplusmnsup1raquoreg
eth
Uumlplusmnfrac14sup1raquoreg
yen
yen
eth
middotshy
yen
eth
plusmnsup2raquo
yen
eth
plusmnordm
yen
yen
eth
notcedilraquo
yen
yen
yen
acuteraquoiquestshynot
otilde
frac12plusmnsup3degraquoacuteacutemiddotsup2sup1
eth
ordfiquestregmiddotiquestnotmiddotplusmnsup2shy
eth
eth
plusmnsup2
eth
eth
notcedilmiddotshy
eth
notcedilraquosup3raquo
eth
ograve
otilde
eth
times
otilde
otilde
otilde
acutemiddotmicroraquofrac14
eth
eth
eth
raquoordfraquoregsect
eth
eth
shymiddotsup2sup1acuteraquo
eth
sup3middotsup2laquonotraquo
eth
eth
plusmnordm
eth
eth
notcedilmiddotshy
eth
eth
ograve
yen
eth
times
yen
yen
eth
eth
frac14middotfrac14
eth
sup2ugravenot
eth
eth
acutemiddotmicroraquo
eth
eth
eth
iquest
eth
eth
shymiddotsup2sup1acuteraquo
eth
sup3middotsup2laquonotraquo
eth
eth
plusmnordm
eth
eth
notcedilmiddotshy
eth
eth
ograve
yen
eth
timesnot
yen
yen
eth
eth
ugraveshy
eth
paralaquoshynot
yen
otilde
middotsup2frac12regraquofrac14middotfrac34acutesect
yen yen
frac14laquoacuteacute
eth
ograve
eth
eth
timesnot
eth
eth
eth
eth
eth
ugraveshy
otilde
yen
sup2plusmnnot
yen yen
frac14laquoacuteacute
eth
ograve
Uacutemiddotsup1laquoregraquo ccedilaelig IcircOgraveIgraveOgrave degregraquofrac14middotfrac12notmiddotplusmnsup2 plusmnordm degplusmnshymiddotnotmiddotordfraquo iquestsup2frac14 sup2raquosup1iquestnotmiddotordfraquo oslashfrac34plusmnnotnotplusmnsup3 regmiddotsup1cedilnotdivide shyraquosup2notraquosup2frac12raquoshy iquestsup2frac14 notcedilraquomiddotreg sup2raquosup1iquestnotmiddotplusmnsup2ograve
Recurrent Nets
x1 x2 x3
h1 h2 h3
y1 y2 y3
A A A
C C C
B B
RNNs are used in timeseries applications
The basic idea is that the hidden units at time ht (and possibly output yt)depend on the previous state of the network htminus1 xtminus1 ytminus1 for inputs xt andoutputs yt
In the above network I lsquounrolled the net through timersquo to give a standard NNdiagram
I omitted the potential links from xtminus1 ytminus1 to ht
Handwriting Generation using a RNN
Some training examples
Handwriting Generation using a RNN
Some generated examples Top line is real handwriting for comparison See AlexGraversquos work
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reasons research in deep learning has exploded
Much greater compute power (GPU)
Much larger datasets
AutoDiff
What is AutoDiff
AutoDiff takes a function f(x) and returns an exact value (up to machineaccuracy) for the gradient
gi(x) equivpart
partxif
∥∥∥∥x
Note that this is not the same as a numerical approximation (such as centraldifferences) for the gradient
One can show that if done efficiently one can always calculate the gradient inless than 5 times the time it takes to compute f(x)
Reverse DifferentiationA useful graphical representation is that the total derivative of f with respect to xis given by the sum over all path values from x to f where each path value is theproduct of the partial derivatives of the functions on the edges
df
dx=partf
partx+partf
partg
dg
dx
x
f
gpartfpartx
dgdx
partfpartg
Example
For f(x) = x2 + xgh where g =x2 and h = xg2
x
f
gh2x+ gh
2x
xh
2gx
xg
g2
f prime(x) = (2x+ gh) + (g2xg) + (2x2gxxg) + (2xxh) = 2x+ 8x7
Reverse DifferentiationConsider
f(x1 x2) = cos (sin(x1x2))
We can represent this computationally using an Abstract Syntax Tree (AST)
x1 x2
f1
f2
f3
f1(x1 x2) = x1x2
f2(x) = sin(x)
f3(x) = cos(x)
Given values for x1 x2 we first run forwards through the tree so that we canassociate each node with an actual function value
Reverse Differentiation
x1 x2
f1
f2
f3
df3dx1
=partf3partf2
df2dx1
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx1
Similarly
df3dx2
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx2
The two derivatives share the same computation branch andwe want to exploit this
Reverse Differentiation
x1 x2
f1
f2
f3
partf1partx1
= x2partf1partx2
= x1
partf2partf1
= cos(f1)
partf3partf2
= minus sin(f2)
1 Find the reverse ancestral (backwards) scheduleof nodes (f3 f2 f1 x1 x2)
2 Start with the first node n1 in the reverseschedule and define tn1 = 1
3 For the next node n in the reverse schedule findthe child nodes ch (n) Then define
tn =sum
cisinch(n)
partfcpartfn
tc
4 The total derivatives of f with respect to theroot nodes of the tree (here x1 and x2) are givenby the values of t at those nodes
This is a general procedure that can be used to automatically define a subroutineto efficiently compute the gradient It is efficient because information is collectedat nodes in the tree and split between parents only when required
Limitations of forward reasoning
World Representation
Recognising patterns (perceptron style) is only one form of intelligence
Solving chess problems is another and requires complex reasoning using someform of internal model
The world is noisy and information may be conflicting
Recognised that new approaches are required
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Limitations of forward reasoning
World Representation
Models help us to fantasise about the world
Models
Models
Burglar Problem
Creaks and Bumps
Creak Bump
Burglar Model
pos1 pos2 pos3 pos4
snd1 snd2 snd3 snd4
pos - position in kitchensnd ndash sound
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Stubby Fingers
Stubby Fingers
int1 int2 int3 int4
hit1 hit2 hit3 hit4
int - intended keyhit ndash hit key
Stubby Fingers errors
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz
005
01
015
02
025
03
035
04
045
05
055
Stubby Fingers language
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz 0
01
02
03
04
05
06
07
08
09
Stubby Fingers
Given the typed sequence cwsykcak what is the most likely word that thiscorresponds to
List the 200 most likely hidden sequences
Discard those that are not in a standard English dictionary
Take the most likely proper English word as the intended typed word
Speech Recognition raw signal
0 01 02 03 04 05 06 07 08 09minus02
minus015
minus01
minus005
0
005
01
015
02
025
03
Time
lsquoneuralrsquo representation
10 20 30 40 50 60 70 80
5
10
15
20
25
Speech Recognition
pho1 pho2 pho3 pho4
aud1 aud2 aud3 aud4
pho phoneme (letter)aud audio signal (neural representation)
Medical Diagnosis
tumour flu meningitis
headache fever appetite x-ray
Combine known medical knowledge with patient specific information
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems such as the IsingModel (1920) and in AI applications such as the HMM (Baum 1966Stratonovich 1960)
The need for structure
We often want to make a probabilistic description of many objects (electronspins neurons customers etc )
Typically the representational and computational cost of probabilistic modelsgrows exponentially with the number of objects represented
Without introducing strong structural limitations about how these objects caninteract probability is a non-starter
For this reason computationally lsquosimplerrsquo alternatives (such as fuzzy logic)were introduced to try to avoid some of these difficulties ndash however these aretypically frowed on by purists
Graphical Models
We can use graphs to represent how objects can probabilistically interact witheach other
Graphical Models and then a marriage between Graph and Probability theory
Many of the quantities that we would like to compute in a probabilitydistribution can then be related to operations on the graph
The computational complexity of operations can often be related to thestructure of the graph
Graphical Models are now used as a standard framework in EngineeringStatistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty and aretherefore widely applicable
Uses in Industry
Microsoft used to estimate the skill distribution of players in online games(the worlds largest graphical model)
Hospitals use Belief Nets to encode knowledge about diseases and symptomsto aid medical diagnosis
Google Microsoft Facebook used in many places including advertisingvideo game prediction speech recognition
Used to estimate inherent desirability of products in consumer retail
Microsoft and others Attempt to go beyond simple AB testing by usesGraphical Models to model the whole companyuser relationship
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or more shortly theprobability of x given y) is defined as
p(x|y) equiv p(x y)
p(y)=p(y|x)p(x)
p(y)(Bayesrsquo rule)
Throwing darts
p(region 5|not region 20) =p(region 5 not region 20)
p(not region 20)
=p(region 5)
p(not region 20)=
120
1920=
1
19
Interpretationp(A = a|B = b) should not be interpreted as lsquoGiven the event B = b has occurredp(A = a|B = b) is the probability of the event A = a occurringrsquo The correctinterpretation should be lsquop(A = a|B = b) is the probability of A being in state aunder the constraint that B is in state brsquo
Battleships
Assume there are 2 ships 1 vertical (ship 1) and 1 horizontal (ship 2) of 5pixels each
Can be placed anywhere on the 10times10 grid but cannot overlap
Let s1 is the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query lsquohitrsquo or lsquomissrsquo responses
p(s1 s2|D) =p(D|s1 s2)p(s1 s2)
p(D)Let X be the matrix of pixel occupancy
p(X|D) =sums1s2
p(X s1 s2|D) =sums1s2
p(X|s1 s2)p(s1 s2|D)
demoBattleshipsm
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
Convolutional NNs
CNNs are particularly popular in image processing
Often the feature maps correspond (not to macro features such as bicycles)but micro features
For example in handwritten digit recognition they correspond to smallconstituent parts of the digits
These are used then to process the image into a representation that is betterfor recognition
NNs in NLP
Bag of Words
We have D words in a dictionary aardvark zorro so that we can relateeach word with its dictionary index
We can also think of this as a Euclidean embedding e
aardvarkrarr eaardvark =
100
zorrorarr ezorro =
001
Word Embeddings
Idea is to replace the Euclidean embeddings e with embeddings (vectors) vthat are learned
Objective is for example next word prediction accuracy
These are often called lsquoneural language modelsrsquo
NNs in NLP
Each word w in the dictionary has an associated embedding vector vwUsually around 200 dimensional vectors are used
Consider the sentence
the cat sat on the mat
and that we wish to predict the word on given the two preceding cat sat
and two succeeding words the mat
We can use a network that has inputs vcat vsat vthe vmat
The output of the network is a probability over all words in the dictionaryp(w| vinputs)We want p(w = on|vcatvsatvthevmat) to be high
The overall objective is then to learn all the word embeddings and networkparameters subject to predicting the word correctly based on the context
Word Embeddings
Given a word (France for example) we can find which words w have embeddingvectors closest to vFrance From Ronan Collabert (2011)
Word Embeddings
There appears to be a natural lsquogeometryrsquo to the embeddings For example thereare directions that correspond to gender
vwoman minus vman asymp vaunt minus vuncle
vwoman minus vman asymp vqueen minus vking
From Mikolov (2013)
Word Embeddings Analogies
Given a relationship France-Paris we get the lsquorelationshiprsquo embedding
v = vParis minus vFrance
Given Italy we can calculate vItaly + v and find the word in the dictionary whichhas closest embedding to this (it turns out to be Rome) From Mikolov (2013)
Word Embeddings Constrained Embeddings
We can learn embeddings for English words and embeddings for ChinesewordsHowever when we know that a Chinese and English word have a similarmeaning we add a constraint that the word embeddings vChineseWord andvEnglishWord should be closeWe have only a small amount of labelled lsquosimilarrsquo Chinese-English words(these are the green border boxes in the above they are standard translationsof the corresponding Chinese character)We can visualise in 2D (using t-SNE) the embedding vectors See Socher(2013)
Word Embeddings Constrained Embeddings
Recursive Nets and Embeddings
Stanford Sentiment Treebank Consists of parsed sentences with sentiment labels(minusminusminus 0+++) for each node (phrase) in the tree 215000 labelled phrases(obtained from three humans)
Recursive Nets and Embeddings
Idea is to recursively combine embeddings such that they accurately predictthe sentiment at each node
Recursive Nets and EmbeddingsTraining
We have a softmax classifier for each node in the tree to predict thesentiment of the phrase beneath this node in the tree
The weights of this classifier are shared across all nodes
At the leaf nodes at the bottom of the tree the inputs to the classifiers arethe word embeddings
The embeddings are combined by another network g with commonparameters which forms the input to the sentiment classifier
We then learn all the embeddings shared classifier parameters and sharedcombination parameters to maximise the classification accuracy
Prediction
For a new movie review the review is first parsed using a standard grammartree parser
This forms the tree which can be used to recursively form the sentiment classlabel for the review
Currently the best sentiment classifier Socher (2013)
Recursive Nets and Embeddings

[Figure (from Socher (2013)): RNTN predictions of positive and negative (bottom right) sentences and their negations.]
Recurrent Nets
[Diagram: an RNN unrolled through time, with inputs x_1, x_2, x_3, hidden units h_1, h_2, h_3, outputs y_1, y_2, y_3, and weight matrices A, B, C shared across time steps.]
RNNs are used in timeseries applications
The basic idea is that the hidden units h_t at time t (and possibly the output y_t) depend on the previous state of the network h_{t−1}, x_{t−1}, y_{t−1}, for inputs x_t and outputs y_t.
In the above network I 'unrolled the net through time' to give a standard NN diagram.
I omitted the potential links from x_{t−1}, y_{t−1} to h_t. (A small sketch of the forward pass follows.)
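A sketch of the unrolled computation in numpy; the exact roles of the weight matrices A, B, C in the figure are an assumption here (input-to-hidden, hidden-to-hidden and hidden-to-output respectively):

```python
# Forward pass of a simple RNN: h_t = tanh(A x_t + B h_{t-1}), y_t = C h_t.
import numpy as np

rng = np.random.default_rng(3)
Dx, Dh, Dy, T = 4, 8, 3, 5
A = rng.normal(scale=0.3, size=(Dh, Dx))   # input -> hidden
B = rng.normal(scale=0.3, size=(Dh, Dh))   # hidden -> hidden (recurrence)
C = rng.normal(scale=0.3, size=(Dy, Dh))   # hidden -> output

x = rng.normal(size=(T, Dx))               # an input timeseries
h = np.zeros(Dh)
for t in range(T):
    h = np.tanh(A @ x[t] + B @ h)          # new hidden state
    y = C @ h                              # output at time t
    print(t, y)
```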
Handwriting Generation using a RNN
Some training examples
Handwriting Generation using a RNN
Some generated examples. Top line is real handwriting, for comparison. See Alex Graves's work.
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reasons research in deep learning has exploded
Much greater compute power (GPU)
Much larger datasets
AutoDiff
What is AutoDiff?
AutoDiff takes a function f(x) and returns an exact value (up to machine accuracy) for the gradient

g_i(x) ≡ (∂f/∂x_i)|_x

Note that this is not the same as a numerical approximation (such as central differences) for the gradient.
One can show that, if done efficiently, one can always calculate the gradient in less than 5 times the time it takes to compute f(x).
Reverse Differentiation
A useful graphical representation is that the total derivative of f with respect to x is given by the sum over all path values from x to f, where each path value is the product of the partial derivatives of the functions on the edges:

df/dx = ∂f/∂x + (∂f/∂g)(dg/dx)

[Diagram: nodes x, g, f; edge x → f labelled ∂f/∂x, edge x → g labelled dg/dx, edge g → f labelled ∂f/∂g.]
Example
For f(x) = x² + xgh, where g = x² and h = xg²:

[Diagram: nodes x, g, h, f; edge labels: x → f: 2x + gh; x → g: 2x; x → h: g²; g → h: 2gx; g → f: xh; h → f: xg.]

f′(x) = (2x + gh) + (g² · xg) + (2x · 2gx · xg) + (2x · xh) = 2x + 8x⁷
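The example can be checked mechanically, for instance with sympy: substituting g = x² and h = xg² gives f = x² + x⁸, whose derivative matches the path sum:

```python
# Symbolic check of the path-sum example.
import sympy as sp

x = sp.symbols('x')
g = x**2
h = x * g**2
f = x**2 + x * g * h
print(sp.expand(f))             # x**8 + x**2
print(sp.diff(f, x))            # 8*x**7 + 2*x
```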
Reverse Differentiation
Consider

f(x_1, x_2) = cos(sin(x_1 x_2))
We can represent this computationally using an Abstract Syntax Tree (AST):

[AST diagram: x_1, x_2 → f_1 → f_2 → f_3]

f_1(x_1, x_2) = x_1 x_2
f_2(x) = sin(x)
f_3(x) = cos(x)
Given values for x_1, x_2, we first run forwards through the tree so that we can associate each node with an actual function value.
Reverse Differentiation

df_3/dx_1 = (∂f_3/∂f_2)(df_2/dx_1) = (∂f_3/∂f_2)(df_2/df_1) · (df_1/dx_1),  where (∂f_3/∂f_2)(df_2/df_1) = df_3/df_1

Similarly,

df_3/dx_2 = (∂f_3/∂f_2)(df_2/df_1) · (df_1/dx_2) = (df_3/df_1)(df_1/dx_2)
The two derivatives share the same computation branch, and we want to exploit this.
Reverse Differentiation

[AST diagram as before: x_1, x_2 → f_1 → f_2 → f_3]

∂f_1/∂x_1 = x_2,  ∂f_1/∂x_2 = x_1,  ∂f_2/∂f_1 = cos(f_1),  ∂f_3/∂f_2 = −sin(f_2)
1. Find the reverse ancestral (backwards) schedule of nodes (f_3, f_2, f_1, x_1, x_2).
2. Start with the first node n_1 in the reverse schedule and define t_{n_1} = 1.
3. For the next node n in the reverse schedule, find the child nodes ch(n). Then define

t_n = Σ_{c ∈ ch(n)} (∂f_c/∂f_n) t_c

4. The total derivatives of f with respect to the root nodes of the tree (here x_1 and x_2) are given by the values of t at those nodes.
This is a general procedure that can be used to automatically define a subroutine to efficiently compute the gradient. It is efficient because information is collected at nodes in the tree and split between parents only when required; a sketch in code is given below.
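A sketch of this procedure in Python for the running example f(x_1, x_2) = cos(sin(x_1 x_2)); the node names follow the AST above:

```python
# Reverse pass: each t_n accumulates sum over children c of (df_c/df_n) * t_c.
import math

def grad_f(x1, x2):
    # Forward pass: evaluate every node in the AST.
    f1 = x1 * x2
    f2 = math.sin(f1)
    f3 = math.cos(f2)

    # Local partial derivatives on the edges of the tree.
    partial = {
        ('f3', 'f2'): -math.sin(f2),
        ('f2', 'f1'): math.cos(f1),
        ('f1', 'x1'): x2,
        ('f1', 'x2'): x1,
    }
    children = {'f2': ['f3'], 'f1': ['f2'], 'x1': ['f1'], 'x2': ['f1']}

    # Reverse schedule (f3, f2, f1, x1, x2), starting with t_{f3} = 1.
    t = {'f3': 1.0}
    for n in ['f2', 'f1', 'x1', 'x2']:
        t[n] = sum(partial[(c, n)] * t[c] for c in children[n])
    return f3, t['x1'], t['x2']

val, g1, g2 = grad_f(0.3, 0.7)
# Check df/dx1 against the chain rule done by hand:
print(g1, -math.sin(math.sin(0.3 * 0.7)) * math.cos(0.3 * 0.7) * 0.7)
```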
Limitations of forward reasoning
World Representation
Recognising patterns (perceptron style) is only one form of intelligence
Solving chess problems is another, and requires complex reasoning using some form of internal model.
The world is noisy and information may be conflicting
Recognised that new approaches are required
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Limitations of forward reasoning
World Representation
Models help us to fantasise about the world
Models
Models
Burglar Problem
Creaks and Bumps
[Images: 'Creak' and 'Bump'.]
Burglar Model
[Diagram: Markov chain pos_1 → pos_2 → pos_3 → pos_4 with emissions pos_t → snd_t.]

pos – position in kitchen; snd – sound
Finding the Burglar

[Figure (three slides): a grid of kitchen positions over time, annotated with the observed sounds (creaks and bumps) at each step; successive slides show the posterior over the burglar's position being updated as more sounds arrive.]
Stubby Fingers
Stubby Fingers
[Diagram: Markov chain int_1 → int_2 → int_3 → int_4 with emissions int_t → hit_t.]

int – intended key; hit – hit key
Stubby Fingers: errors

[Figure: the key-error distribution p(hit | int) over the letters a–z, shown as a matrix; colour scale from 0.05 to 0.55.]
Stubby Fingers: language

[Figure: the letter-transition (language) model over a–z, shown as a matrix; colour scale from 0 to 0.9.]
Stubby Fingers
Given the typed sequence cwsykcak, what is the most likely word that this corresponds to?
List the 200 most likely hidden sequences.
Discard those that are not in a standard English dictionary.
Take the most likely proper English word as the intended typed word. (A simplified sketch of this idea follows.)
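A heavily simplified sketch of the idea: rather than listing the 200 most likely hidden sequences, score every dictionary word of the right length directly under a toy error model. The neighbour map, the error probabilities and the three-word dictionary are all illustrative assumptions:

```python
# Score dictionary words by the log-probability of producing the typed keys.
import math

# Partial QWERTY neighbour map (illustrative).
neighbours = {'c': 'xvdf', 'w': 'qeas', 's': 'adwez', 'y': 'tugh',
              'k': 'jlim', 'a': 'qwsz'}

def log_p(hit, intended):
    """Toy error model: correct key with prob 0.7, a nearby key otherwise."""
    if hit == intended:
        return math.log(0.7)
    if intended in neighbours.get(hit, '') or hit in neighbours.get(intended, ''):
        return math.log(0.05)
    return math.log(1e-6)          # unlikely, but not impossible

typed = "cwsykcak"
dictionary = ["castaway", "disliked", "breakage"]   # stand-in for a real dictionary
best = max((w for w in dictionary if len(w) == len(typed)),
           key=lambda w: sum(log_p(h, i) for h, i in zip(typed, w)))
print(best)
```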
Speech Recognition: raw signal

[Figure: raw audio waveform; amplitude (≈ −0.2 to 0.3) against time (0 to 0.9 seconds).]
'Neural' representation

[Figure: spectrogram-like representation; ≈ 25 channels over ≈ 80 time frames.]
Speech Recognition
[Diagram: Markov chain pho_1 → pho_2 → pho_3 → pho_4 with emissions pho_t → aud_t.]

pho – phoneme (letter); aud – audio signal (neural representation)
Medical Diagnosis
[Diagram: belief network with diseases (tumour, flu, meningitis) as parents of symptoms (headache, fever, appetite, x-ray).]

Combine known medical knowledge with patient-specific information.
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability?
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems, such as the Ising Model (1920), and in AI applications, such as the HMM (Baum 1966; Stratonovich 1960).
The need for structure
We often want to make a probabilistic description of many objects (electron spins, neurons, customers, etc.).
Typically the representational and computational cost of probabilistic models grows exponentially with the number of objects represented.
Without introducing strong structural limitations on how these objects can interact, probability is a non-starter.
For this reason, computationally 'simpler' alternatives (such as fuzzy logic) were introduced to try to avoid some of these difficulties – however, these are typically frowned upon by purists.
Graphical Models
We can use graphs to represent how objects can probabilistically interact with each other.
Graphical Models are then a marriage between Graph and Probability theory.
Many of the quantities that we would like to compute in a probability distribution can then be related to operations on the graph.
The computational complexity of operations can often be related to the structure of the graph.
Graphical Models are now used as a standard framework in Engineering, Statistics and Computer Science.
Graphical Models are used to perform reasoning under uncertainty, and are therefore widely applicable.
Uses in Industry
Microsoft: used to estimate the skill distribution of players in online games (the world's largest graphical model).
Hospitals: use Belief Nets to encode knowledge about diseases and symptoms, to aid medical diagnosis.
Google, Microsoft, Facebook: used in many places, including advertising, video game prediction, speech recognition.
Used to estimate the inherent desirability of products in consumer retail.
Microsoft and others: attempt to go beyond simple A/B testing by using Graphical Models to model the whole company–user relationship.
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or, more shortly, the probability of x given y) is defined as

p(x|y) ≡ p(x, y)/p(y) = p(y|x) p(x)/p(y)    (Bayes' rule)
Throwing darts
p(region 5 | not region 20) = p(region 5, not region 20) / p(not region 20)
                            = p(region 5) / p(not region 20)
                            = (1/20) / (19/20) = 1/19
Interpretation: p(A = a|B = b) should not be interpreted as 'Given the event B = b has occurred, p(A = a|B = b) is the probability of the event A = a occurring'. The correct interpretation is 'p(A = a|B = b) is the probability of A being in state a under the constraint that B is in state b'.
Battleships
Assume there are 2 ships, 1 vertical (ship 1) and 1 horizontal (ship 2), of 5 pixels each.
Each can be placed anywhere on the 10×10 grid, but they cannot overlap.
Let s_1 be the origin of ship 1 and s_2 the origin of ship 2.
The data D is a collection of query 'hit' or 'miss' responses.

p(s_1, s_2|D) = p(D|s_1, s_2) p(s_1, s_2) / p(D)

Let X be the matrix of pixel occupancy:

p(X|D) = Σ_{s_1,s_2} p(X, s_1, s_2|D) = Σ_{s_1,s_2} p(X|s_1, s_2) p(s_1, s_2|D)

demoBattleships.m
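The deck's demo is in MATLAB; below is a Python sketch of the same posterior. Placements inconsistent with the data get zero weight, and averaging the occupancy maps of the consistent placements gives p(X|D) under a uniform prior. The two example queries are made up:

```python
# Enumerate non-overlapping placements of the two ships and keep those
# consistent with the observed hit/miss queries (deterministic likelihood).
import numpy as np

G, L = 10, 5                       # grid size, ship length

def cells(origin, vertical):
    r, c = origin
    return [(r + i, c) if vertical else (r, c + i) for i in range(L)]

def placements(vertical):
    rmax = G - L if vertical else G - 1
    cmax = G - 1 if vertical else G - L
    return [(r, c) for r in range(rmax + 1) for c in range(cmax + 1)]

data = {(4, 4): True, (0, 0): False}   # queried pixel -> hit? (illustrative)

post = np.zeros((G, G)); n = 0
for s1 in placements(True):            # ship 1: vertical
    for s2 in placements(False):       # ship 2: horizontal
        occ = set(cells(s1, True)) | set(cells(s2, False))
        if len(occ) < 2 * L:           # ships overlap: prior probability zero
            continue
        if any((q in occ) != hit for q, hit in data.items()):
            continue                   # inconsistent with the data
        for (r, c) in occ:
            post[r, c] += 1.0
        n += 1
post /= n                              # p(pixel occupied | D), uniform prior
print(np.round(post, 2))
```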
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated the conditional probability of the node given its parents.
The joint distribution is obtained by taking the product of the conditional probabilities:

p(A, B, C, D, E) = p(A) p(B) p(C|A, B) p(D|C) p(E|B, C)

[Diagram: DAG with A → C, B → C, C → D, B → E, C → E.]
Example – Part I: Sally's burglar Alarm is sounding. Has she been Burgled, or was the alarm triggered by an Earthquake? She turns the car Radio on for news of earthquakes.
Choosing an ordering
Without loss of generality, we can write

p(A, R, E, B) = p(A|R, E, B) p(R, E, B)
             = p(A|R, E, B) p(R|E, B) p(E, B)
             = p(A|R, E, B) p(R|E, B) p(E|B) p(B)
Assumptions
The alarm is not directly influenced by any report on the radio: p(A|R, E, B) = p(A|E, B).
The radio broadcast is not directly influenced by the burglar variable: p(R|E, B) = p(R|E).
Burglaries don't directly 'cause' earthquakes: p(E|B) = p(E).
Therefore
p(A, R, E, B) = p(A|E, B) p(R|E) p(E) p(B)
Example – Part II: Specifying the Tables

[Diagram: B → A ← E, with E → R.]

p(A = 1|B, E):
  Burglar  Earthquake  Alarm = 1
  1        1           0.9999
  1        0           0.99
  0        1           0.99
  0        0           0.0001

p(R = 1|E):
  Earthquake  Radio = 1
  1           1
  0           0

The remaining tables are p(B = 1) = 0.01 and p(E = 1) = 0.000001. The tables and graphical structure fully specify the distribution.
Example – Part III: Inference
Initial Evidence: the alarm is sounding.

p(B = 1|A = 1) = Σ_{E,R} p(B = 1, E, A = 1, R) / Σ_{B,E,R} p(B, E, A = 1, R)
             = Σ_{E,R} p(A = 1|B = 1, E) p(B = 1) p(E) p(R|E) / Σ_{B,E,R} p(A = 1|B, E) p(B) p(E) p(R|E) ≈ 0.99

Additional Evidence: the radio broadcasts an earthquake warning.
A similar calculation gives p(B = 1|A = 1, R = 1) ≈ 0.01.
Initially, because the alarm sounds, Sally thinks that she's been burgled. However, this probability drops dramatically when she hears that there has been an earthquake. (A brute-force check of these numbers is sketched below.)
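These numbers are easy to check by enumerating the joint distribution of the four binary variables:

```python
# Brute-force inference in the burglar belief net, using the tables above.
import itertools

pB = {1: 0.01, 0: 0.99}
pE = {1: 0.000001, 0: 0.999999}
pA1 = {(1, 1): 0.9999, (1, 0): 0.99, (0, 1): 0.99, (0, 0): 0.0001}  # p(A=1|B,E)
pR1 = {1: 1.0, 0: 0.0}                                              # p(R=1|E)

def joint(a, r, e, b):
    pa = pA1[(b, e)] if a == 1 else 1.0 - pA1[(b, e)]
    pr = pR1[e] if r == 1 else 1.0 - pR1[e]
    return pa * pr * pE[e] * pB[b]

def p_burglar(a=None, r=None):
    """p(B = 1 | observed evidence), summing out the unobserved variables."""
    num = den = 0.0
    for aa, rr, e, b in itertools.product([0, 1], repeat=4):
        if (a is not None and aa != a) or (r is not None and rr != r):
            continue
        w = joint(aa, rr, e, b)
        den += w
        num += w * b
    return num / den

print(p_burglar(a=1))        # ~0.99
print(p_burglar(a=1, r=1))   # ~0.01
```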
Markov Models
For timeseries data v_1, …, v_T we need a model p(v_{1:T}). For causal consistency, it is meaningful to consider the decomposition

p(v_{1:T}) = Π_{t=1}^T p(v_t|v_{1:t−1})

with the convention p(v_t|v_{1:t−1}) = p(v_1) for t = 1.

[Diagram: cascade graph on v_1, v_2, v_3, v_4, with edges from every earlier variable to every later one.]
Independence assumptions
It is often natural to assume that the influence of the immediate past is more relevant than the remote past, and in Markov models only a limited number of previous observations are required to predict the future.
Markov Chain
Only the recent past is relevant:

p(v_t|v_1, …, v_{t−1}) = p(v_t|v_{t−L}, …, v_{t−1})

where L ≥ 1 is the order of the Markov chain. For L = 1,

p(v_{1:T}) = p(v_1) p(v_2|v_1) p(v_3|v_2) ⋯ p(v_T|v_{T−1})

For a stationary Markov chain the transitions p(v_t = s′|v_{t−1} = s) = f(s′, s) are time-independent ('homogeneous').
[Figure: (a) first-order Markov chain v_1 → v_2 → v_3 → v_4; (b) second-order Markov chain.]
Markov Chains
[Diagram: first-order chain v_1 → v_2 → v_3 → v_4.]

p(v_1, …, v_T) = p(v_1) Π_{t=2}^T p(v_t|v_{t−1}),  with p(v_1) the initial distribution and p(v_t|v_{t−1}) the transition.
State transition diagram
Nodes represent states of the variable v, and arcs the non-zero elements of the transition p(v_t|v_{t−1}).

[Figure: state transition diagram on states 1–9.]
Most probable and shortest paths
[Figure: the same state transition diagram on states 1–9.]
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1−8−9−7 (assuming uniform transition probabilities). The latter path is longer but more probable, since for the path 1−2−7 the probability of exiting state 2 into state 7 is 1/5.
Equilibrium distribution
It is interesting to know how the marginal p(x_t) evolves through time:

p(x_t = i) = Σ_j M_ij p(x_{t−1} = j),  where M_ij ≡ p(x_t = i|x_{t−1} = j)
p(x_t = i) is the frequency that we visit state i at time t, given we started from p(x_1) and randomly drew samples from the transition p(x_τ|x_{τ−1}). As we repeatedly sample a new state from the chain, the distribution at time t for an initial distribution p_1(i) is

p_t = M^{t−1} p_1

If, for t → ∞, p_∞ is independent of the initial distribution p_1, then p_∞ is called the equilibrium distribution of the chain:

p_∞ = M p_∞

The equilibrium distribution is proportional to the eigenvector with unit eigenvalue of the transition matrix.
PageRank
Define the matrix
A_ij = { 1 if website j has a hyperlink to website i; 0 otherwise }

From this we can define a Markov transition matrix with elements

M_ij = A_ij / Σ_{i′} A_{i′j}
If we jump from website to website, the equilibrium distribution component p_∞(i) is the relative number of times we will visit website i. This has a natural interpretation as the 'importance' of website i.
For each website i, a list of words associated with that website is collected. After doing this for all websites, one can make an 'inverse' list of which websites contain word w. When a user searches for word w, the list of websites that contain that word is then returned, ranked according to the importance of the site. (A sketch of the equilibrium calculation follows.)
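A sketch of the equilibrium calculation on a tiny, made-up link graph; power iteration converges to the unit-eigenvalue eigenvector:

```python
# PageRank-style equilibrium: column-normalise the adjacency matrix and
# power-iterate p_t = M p_{t-1} until it stops changing.
import numpy as np

# A[i, j] = 1 if site j links to site i (4 illustrative websites).
A = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

M = A / A.sum(axis=0)            # Markov transition matrix, columns sum to 1
p = np.full(4, 0.25)             # start from the uniform distribution
for _ in range(100):
    p = M @ p                    # p_t = M^(t-1) p_1
print(p)                         # equilibrium: the 'importance' of each site
```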
Hidden Markov Models
The HMM defines a Markov chain on hidden (or 'latent') variables h_{1:T}. The observed (or 'visible') variables are dependent on the hidden variables through an emission p(v_t|h_t). This defines a joint distribution

p(h_{1:T}, v_{1:T}) = p(v_1|h_1) p(h_1) Π_{t=2}^T p(v_t|h_t) p(h_t|h_{t−1})
For a stationary HMM, the transition p(h_t|h_{t−1}) and emission p(v_t|h_t) distributions are constant through time.

[Figure: a first-order hidden Markov model with 'hidden' variables dom(h_t) = {1, …, H}, t = 1, …, T. The 'visible' variables v_t can be either discrete or continuous.]
The classical inference problems
Filtering (inferring the present): p(h_t|v_{1:t})
Prediction (inferring the future): p(h_t|v_{1:s}), t > s
Smoothing (inferring the past): p(h_t|v_{1:u}), t < u
Likelihood: p(v_{1:T})
Most likely path (Viterbi alignment): argmax_{h_{1:T}} p(h_{1:T}|v_{1:T})

For prediction, one is also often interested in p(v_t|v_{1:s}) for t > s.
Inference in Hidden Markov Models
Belief network representation of a HMM
[Diagram: the HMM belief network, h_1 → h_2 → h_3 → h_4 with emissions h_t → v_t.]

Filtering, Smoothing and Viterbi are all computationally efficient, scaling linearly with the length of the timeseries (but quadratically with the number of hidden states).
The algorithms are variants of 'message passing on factor graphs'.
The algorithm is guaranteed to work if the graph is singly connected.
Huge research effort in the last 15 years to apply message passing for approximate inference in multiply connected graphs (e.g. low-density parity-check codes). (A filtering sketch follows.)
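As an illustration, a sketch of filtering (the forward algorithm) in numpy; the transition, emission and observation sequence are made-up toy values:

```python
# Forward algorithm: alpha_t(h) ∝ p(h_t, v_{1:t}); normalising gives p(h_t|v_{1:t}).
# Each step costs O(H^2), so the whole pass is linear in T.
import numpy as np

H = 3
trans = np.array([[0.7, 0.2, 0.1],     # trans[i, j] = p(h_t = i | h_{t-1} = j)
                  [0.2, 0.6, 0.3],
                  [0.1, 0.2, 0.6]])
emit = np.array([[0.9, 0.3, 0.5],      # emit[v, h] = p(v_t = v | h_t = h)
                 [0.1, 0.7, 0.5]])
prior = np.full(H, 1.0 / H)

obs = [0, 1, 1, 0]
alpha = emit[obs[0]] * prior
alpha /= alpha.sum()                   # p(h_1 | v_1)
for v in obs[1:]:
    alpha = emit[v] * (trans @ alpha)  # predict with the chain, correct with v
    alpha /= alpha.sum()               # filtered posterior at this time step
print(alpha)                           # p(h_T | v_{1:T})
```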
HMMs for speech recognition
h_t is the phoneme at time t; p(h_t|h_{t−1}) – language model; p(v_t|h_t) – speech signal model.
Deep Nets and HMMs
[Diagram: the same HMM structure, h_1 → … → h_4 with emissions to v_1, …, v_4.]

Recently, companies including Google have made big advances in speech recognition.
The breakthrough is to model p(v_t|h_t) as a Gaussian whose mean is some function of the phoneme, μ(h_t; θ).
This function is a deep neural network, trained on a large amount of data.
There is a goldrush at the moment to find similar breakthrough applications of deep networks in reasoning systems.
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
[Diagram: latent variables h_1, h_2 with edges to visible variables v_1, …, v_4.]

It is natural to consider that objects (images, for example) can be constructed on the basis of a low-dimensional representation.
Note that this is a Graphical Model, not a Function.
The latent variables h can be sampled from using p(h), and then an image sampled from p(v|h).
One cannot use an autoencoder to generate new images.
The bad news
Inference (computing p(h|v)) and parameter learning are intractable in these models.
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method – much faster for inference.
Variational Inference
Consider a distribution

p(v|θ) = ∫_h p(v|h, θ) p(h)

and that we wish to learn θ to maximise the probability that this model generates the observed data. The log-likelihood has a lower bound:

log p(v|θ) ≥ −∫_h q(h|v, φ) log q(h|v, φ) + ∫_h q(h|v, φ) log p(v|h, θ) + const.
Idea is to choose a 'variational' distribution q(h|v, φ) such that we can either calculate the bound analytically or sample it efficiently.
We then jointly maximise the bound w.r.t. φ and θ.
We can parameterise p(v|h, θ) using a deep network.
Very popular approach – see the 'variational autoencoder', and also attention mechanisms.
Extension to a semi-supervised method using p(v) = ∫_h Σ_c p(v|h, c) p(c) p(h). (A toy numerical check of the bound is sketched below.)
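A toy numerical check of the bound, for a model where everything is Gaussian and log p(v) is available in closed form (p(h) = N(0, 1), p(v|h) = N(h, 1), q(h|v, φ) = N(m, s²), all illustrative assumptions); the Monte Carlo estimate of the bound stays below the exact log-likelihood:

```python
# ELBO check: E_q[log p(v,h) - log q(h)] <= log p(v), with p(v) = N(0, 2) here.
import numpy as np

rng = np.random.default_rng(4)
v = 1.3
m, s = 0.5, 0.8                           # variational parameters phi

h = rng.normal(m, s, size=100000)         # samples from q(h|v, phi)
log_q = -0.5 * np.log(2 * np.pi * s**2) - (h - m)**2 / (2 * s**2)
log_p = (-0.5 * np.log(2 * np.pi) - h**2 / 2          # log p(h)
         - 0.5 * np.log(2 * np.pi) - (v - h)**2 / 2)  # log p(v|h)

bound = np.mean(log_p - log_q)
exact = -0.5 * np.log(2 * np.pi * 2) - v**2 / 4       # log N(v; 0, 2)
print(bound, "<=", exact)
```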
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games?
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A, we need to decide which action to take, for any state of W, that will be best for our long-term goals.
Problem is that the number of pixel states is enormous
Need to learn a low-dimensional representation of the screen (use a deep generative model).
Then learn which action to take given the low-dimensional representation.
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state-of-the-art results in Speech Recognition, Image Analysis, Game Playing.
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve the interaction between reinforcement learning and representation learning.
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company, reinfer:
https://reinfer.io
NNs in NLP
Bag of Words
We have D words in a dictionary aardvark zorro so that we can relateeach word with its dictionary index
We can also think of this as a Euclidean embedding e
aardvarkrarr eaardvark =
100
zorrorarr ezorro =
001
Word Embeddings
Idea is to replace the Euclidean embeddings e with embeddings (vectors) vthat are learned
Objective is for example next word prediction accuracy
These are often called lsquoneural language modelsrsquo
NNs in NLP
Each word w in the dictionary has an associated embedding vector vwUsually around 200 dimensional vectors are used
Consider the sentence
the cat sat on the mat
and that we wish to predict the word on given the two preceding cat sat
and two succeeding words the mat
We can use a network that has inputs vcat vsat vthe vmat
The output of the network is a probability over all words in the dictionaryp(w| vinputs)We want p(w = on|vcatvsatvthevmat) to be high
The overall objective is then to learn all the word embeddings and networkparameters subject to predicting the word correctly based on the context
Word Embeddings
Given a word (France for example) we can find which words w have embeddingvectors closest to vFrance From Ronan Collabert (2011)
Word Embeddings
There appears to be a natural lsquogeometryrsquo to the embeddings For example thereare directions that correspond to gender
vwoman minus vman asymp vaunt minus vuncle
vwoman minus vman asymp vqueen minus vking
From Mikolov (2013)
Word Embeddings Analogies
Given a relationship France-Paris we get the lsquorelationshiprsquo embedding
v = vParis minus vFrance
Given Italy we can calculate vItaly + v and find the word in the dictionary whichhas closest embedding to this (it turns out to be Rome) From Mikolov (2013)
Word Embeddings Constrained Embeddings
We can learn embeddings for English words and embeddings for ChinesewordsHowever when we know that a Chinese and English word have a similarmeaning we add a constraint that the word embeddings vChineseWord andvEnglishWord should be closeWe have only a small amount of labelled lsquosimilarrsquo Chinese-English words(these are the green border boxes in the above they are standard translationsof the corresponding Chinese character)We can visualise in 2D (using t-SNE) the embedding vectors See Socher(2013)
Word Embeddings Constrained Embeddings
Recursive Nets and Embeddings
Stanford Sentiment Treebank Consists of parsed sentences with sentiment labels(minusminusminus 0+++) for each node (phrase) in the tree 215000 labelled phrases(obtained from three humans)
Recursive Nets and Embeddings
Idea is to recursively combine embeddings such that they accurately predictthe sentiment at each node
Recursive Nets and EmbeddingsTraining
We have a softmax classifier for each node in the tree to predict thesentiment of the phrase beneath this node in the tree
The weights of this classifier are shared across all nodes
At the leaf nodes at the bottom of the tree the inputs to the classifiers arethe word embeddings
The embeddings are combined by another network g with commonparameters which forms the input to the sentiment classifier
We then learn all the embeddings shared classifier parameters and sharedcombination parameters to maximise the classification accuracy
Prediction
For a new movie review the review is first parsed using a standard grammartree parser
This forms the tree which can be used to recursively form the sentiment classlabel for the review
Currently the best sentiment classifier Socher (2013)
Recursive Nets and Embeddingsotilde otilde
eth
eth
Icircplusmnsup1raquoreg
eth
Uumlplusmnfrac14sup1raquoreg
otilde
otilde
eth
middotshy
otilde
eth
plusmnsup2raquo
otilde
eth
plusmnordm
otilde
otilde
eth
notcedilraquo
otilde
otilde
eth
sup3plusmnshynot
otilde
frac12plusmnsup3degraquoacuteacutemiddotsup2sup1
eth
ordfiquestregmiddotiquestnotmiddotplusmnsup2shy
eth
eth
plusmnsup2
eth
eth
notcedilmiddotshy
eth
notcedilraquosup3raquo
eth
ograve
yen
eth
eth
Icircplusmnsup1raquoreg
eth
Uumlplusmnfrac14sup1raquoreg
yen
yen
eth
middotshy
yen
eth
plusmnsup2raquo
yen
eth
plusmnordm
yen
yen
eth
notcedilraquo
yen
yen
yen
acuteraquoiquestshynot
otilde
frac12plusmnsup3degraquoacuteacutemiddotsup2sup1
eth
ordfiquestregmiddotiquestnotmiddotplusmnsup2shy
eth
eth
plusmnsup2
eth
eth
notcedilmiddotshy
eth
notcedilraquosup3raquo
eth
ograve
otilde
eth
times
otilde
otilde
otilde
acutemiddotmicroraquofrac14
eth
eth
eth
raquoordfraquoregsect
eth
eth
shymiddotsup2sup1acuteraquo
eth
sup3middotsup2laquonotraquo
eth
eth
plusmnordm
eth
eth
notcedilmiddotshy
eth
eth
ograve
yen
eth
times
yen
yen
eth
eth
frac14middotfrac14
eth
sup2ugravenot
eth
eth
acutemiddotmicroraquo
eth
eth
eth
iquest
eth
eth
shymiddotsup2sup1acuteraquo
eth
sup3middotsup2laquonotraquo
eth
eth
plusmnordm
eth
eth
notcedilmiddotshy
eth
eth
ograve
yen
eth
timesnot
yen
yen
eth
eth
ugraveshy
eth
paralaquoshynot
yen
otilde
middotsup2frac12regraquofrac14middotfrac34acutesect
yen yen
frac14laquoacuteacute
eth
ograve
eth
eth
timesnot
eth
eth
eth
eth
eth
ugraveshy
otilde
yen
sup2plusmnnot
yen yen
frac14laquoacuteacute
eth
ograve
Uacutemiddotsup1laquoregraquo ccedilaelig IcircOgraveIgraveOgrave degregraquofrac14middotfrac12notmiddotplusmnsup2 plusmnordm degplusmnshymiddotnotmiddotordfraquo iquestsup2frac14 sup2raquosup1iquestnotmiddotordfraquo oslashfrac34plusmnnotnotplusmnsup3 regmiddotsup1cedilnotdivide shyraquosup2notraquosup2frac12raquoshy iquestsup2frac14 notcedilraquomiddotreg sup2raquosup1iquestnotmiddotplusmnsup2ograve
Recurrent Nets
x1 x2 x3
h1 h2 h3
y1 y2 y3
A A A
C C C
B B
RNNs are used in timeseries applications
The basic idea is that the hidden units at time ht (and possibly output yt)depend on the previous state of the network htminus1 xtminus1 ytminus1 for inputs xt andoutputs yt
In the above network I lsquounrolled the net through timersquo to give a standard NNdiagram
I omitted the potential links from xtminus1 ytminus1 to ht
Handwriting Generation using a RNN
Some training examples
Handwriting Generation using a RNN
Some generated examples Top line is real handwriting for comparison See AlexGraversquos work
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reasons research in deep learning has exploded
Much greater compute power (GPU)
Much larger datasets
AutoDiff
What is AutoDiff
AutoDiff takes a function f(x) and returns an exact value (up to machineaccuracy) for the gradient
gi(x) equivpart
partxif
∥∥∥∥x
Note that this is not the same as a numerical approximation (such as centraldifferences) for the gradient
One can show that if done efficiently one can always calculate the gradient inless than 5 times the time it takes to compute f(x)
Reverse DifferentiationA useful graphical representation is that the total derivative of f with respect to xis given by the sum over all path values from x to f where each path value is theproduct of the partial derivatives of the functions on the edges
df
dx=partf
partx+partf
partg
dg
dx
x
f
gpartfpartx
dgdx
partfpartg
Example
For f(x) = x2 + xgh where g =x2 and h = xg2
x
f
gh2x+ gh
2x
xh
2gx
xg
g2
f prime(x) = (2x+ gh) + (g2xg) + (2x2gxxg) + (2xxh) = 2x+ 8x7
Reverse DifferentiationConsider
f(x1 x2) = cos (sin(x1x2))
We can represent this computationally using an Abstract Syntax Tree (AST)
x1 x2
f1
f2
f3
f1(x1 x2) = x1x2
f2(x) = sin(x)
f3(x) = cos(x)
Given values for x1 x2 we first run forwards through the tree so that we canassociate each node with an actual function value
Reverse Differentiation
x1 x2
f1
f2
f3
df3dx1
=partf3partf2
df2dx1
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx1
Similarly
df3dx2
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx2
The two derivatives share the same computation branch andwe want to exploit this
Reverse Differentiation
x1 x2
f1
f2
f3
partf1partx1
= x2partf1partx2
= x1
partf2partf1
= cos(f1)
partf3partf2
= minus sin(f2)
1 Find the reverse ancestral (backwards) scheduleof nodes (f3 f2 f1 x1 x2)
2 Start with the first node n1 in the reverseschedule and define tn1 = 1
3 For the next node n in the reverse schedule findthe child nodes ch (n) Then define
tn =sum
cisinch(n)
partfcpartfn
tc
4 The total derivatives of f with respect to theroot nodes of the tree (here x1 and x2) are givenby the values of t at those nodes
This is a general procedure that can be used to automatically define a subroutineto efficiently compute the gradient It is efficient because information is collectedat nodes in the tree and split between parents only when required
Limitations of forward reasoning
World Representation
Recognising patterns (perceptron style) is only one form of intelligence
Solving chess problems is another and requires complex reasoning using someform of internal model
The world is noisy and information may be conflicting
Recognised that new approaches are required
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Limitations of forward reasoning
World Representation
Models help us to fantasise about the world
Models
Models
Burglar Problem
Creaks and Bumps
Creak Bump
Burglar Model
pos1 pos2 pos3 pos4
snd1 snd2 snd3 snd4
pos - position in kitchensnd ndash sound
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Stubby Fingers
Stubby Fingers
int1 int2 int3 int4
hit1 hit2 hit3 hit4
int - intended keyhit ndash hit key
Stubby Fingers errors
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz
005
01
015
02
025
03
035
04
045
05
055
Stubby Fingers language
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz 0
01
02
03
04
05
06
07
08
09
Stubby Fingers
Given the typed sequence cwsykcak what is the most likely word that thiscorresponds to
List the 200 most likely hidden sequences
Discard those that are not in a standard English dictionary
Take the most likely proper English word as the intended typed word
Speech Recognition raw signal
0 01 02 03 04 05 06 07 08 09minus02
minus015
minus01
minus005
0
005
01
015
02
025
03
Time
lsquoneuralrsquo representation
10 20 30 40 50 60 70 80
5
10
15
20
25
Speech Recognition
pho1 pho2 pho3 pho4
aud1 aud2 aud3 aud4
pho phoneme (letter)aud audio signal (neural representation)
Medical Diagnosis
tumour flu meningitis
headache fever appetite x-ray
Combine known medical knowledge with patient specific information
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems such as the IsingModel (1920) and in AI applications such as the HMM (Baum 1966Stratonovich 1960)
The need for structure
We often want to make a probabilistic description of many objects (electronspins neurons customers etc )
Typically the representational and computational cost of probabilistic modelsgrows exponentially with the number of objects represented
Without introducing strong structural limitations about how these objects caninteract probability is a non-starter
For this reason computationally lsquosimplerrsquo alternatives (such as fuzzy logic)were introduced to try to avoid some of these difficulties ndash however these aretypically frowed on by purists
Graphical Models
We can use graphs to represent how objects can probabilistically interact witheach other
Graphical Models and then a marriage between Graph and Probability theory
Many of the quantities that we would like to compute in a probabilitydistribution can then be related to operations on the graph
The computational complexity of operations can often be related to thestructure of the graph
Graphical Models are now used as a standard framework in EngineeringStatistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty and aretherefore widely applicable
Uses in Industry
Microsoft used to estimate the skill distribution of players in online games(the worlds largest graphical model)
Hospitals use Belief Nets to encode knowledge about diseases and symptomsto aid medical diagnosis
Google Microsoft Facebook used in many places including advertisingvideo game prediction speech recognition
Used to estimate inherent desirability of products in consumer retail
Microsoft and others Attempt to go beyond simple AB testing by usesGraphical Models to model the whole companyuser relationship
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or more shortly theprobability of x given y) is defined as
p(x|y) equiv p(x y)
p(y)=p(y|x)p(x)
p(y)(Bayesrsquo rule)
Throwing darts
p(region 5|not region 20) =p(region 5 not region 20)
p(not region 20)
=p(region 5)
p(not region 20)=
120
1920=
1
19
Interpretationp(A = a|B = b) should not be interpreted as lsquoGiven the event B = b has occurredp(A = a|B = b) is the probability of the event A = a occurringrsquo The correctinterpretation should be lsquop(A = a|B = b) is the probability of A being in state aunder the constraint that B is in state brsquo
Battleships
Assume there are 2 ships 1 vertical (ship 1) and 1 horizontal (ship 2) of 5pixels each
Can be placed anywhere on the 10times10 grid but cannot overlap
Let s1 is the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query lsquohitrsquo or lsquomissrsquo responses
p(s1 s2|D) =p(D|s1 s2)p(s1 s2)
p(D)Let X be the matrix of pixel occupancy
p(X|D) =sums1s2
p(X s1 s2|D) =sums1s2
p(X|s1 s2)p(s1 s2|D)
demoBattleshipsm
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
NNs in NLP
Each word w in the dictionary has an associated embedding vector vwUsually around 200 dimensional vectors are used
Consider the sentence
the cat sat on the mat
and that we wish to predict the word on given the two preceding cat sat
and two succeeding words the mat
We can use a network that has inputs vcat vsat vthe vmat
The output of the network is a probability over all words in the dictionaryp(w| vinputs)We want p(w = on|vcatvsatvthevmat) to be high
The overall objective is then to learn all the word embeddings and networkparameters subject to predicting the word correctly based on the context
Word Embeddings
Given a word (France for example) we can find which words w have embeddingvectors closest to vFrance From Ronan Collabert (2011)
Word Embeddings
There appears to be a natural lsquogeometryrsquo to the embeddings For example thereare directions that correspond to gender
vwoman minus vman asymp vaunt minus vuncle
vwoman minus vman asymp vqueen minus vking
From Mikolov (2013)
Word Embeddings Analogies
Given a relationship France-Paris we get the lsquorelationshiprsquo embedding
v = vParis minus vFrance
Given Italy we can calculate vItaly + v and find the word in the dictionary whichhas closest embedding to this (it turns out to be Rome) From Mikolov (2013)
Word Embeddings Constrained Embeddings
We can learn embeddings for English words and embeddings for ChinesewordsHowever when we know that a Chinese and English word have a similarmeaning we add a constraint that the word embeddings vChineseWord andvEnglishWord should be closeWe have only a small amount of labelled lsquosimilarrsquo Chinese-English words(these are the green border boxes in the above they are standard translationsof the corresponding Chinese character)We can visualise in 2D (using t-SNE) the embedding vectors See Socher(2013)
Word Embeddings Constrained Embeddings
Recursive Nets and Embeddings
Stanford Sentiment Treebank Consists of parsed sentences with sentiment labels(minusminusminus 0+++) for each node (phrase) in the tree 215000 labelled phrases(obtained from three humans)
Recursive Nets and Embeddings
Idea is to recursively combine embeddings such that they accurately predictthe sentiment at each node
Recursive Nets and EmbeddingsTraining
We have a softmax classifier for each node in the tree to predict thesentiment of the phrase beneath this node in the tree
The weights of this classifier are shared across all nodes
At the leaf nodes at the bottom of the tree the inputs to the classifiers arethe word embeddings
The embeddings are combined by another network g with commonparameters which forms the input to the sentiment classifier
We then learn all the embeddings shared classifier parameters and sharedcombination parameters to maximise the classification accuracy
Prediction
For a new movie review the review is first parsed using a standard grammartree parser
This forms the tree which can be used to recursively form the sentiment classlabel for the review
Currently the best sentiment classifier Socher (2013)
Recursive Nets and Embeddingsotilde otilde
eth
eth
Icircplusmnsup1raquoreg
eth
Uumlplusmnfrac14sup1raquoreg
otilde
otilde
eth
middotshy
otilde
eth
plusmnsup2raquo
otilde
eth
plusmnordm
otilde
otilde
eth
notcedilraquo
otilde
otilde
eth
sup3plusmnshynot
otilde
frac12plusmnsup3degraquoacuteacutemiddotsup2sup1
eth
ordfiquestregmiddotiquestnotmiddotplusmnsup2shy
eth
eth
plusmnsup2
eth
eth
notcedilmiddotshy
eth
notcedilraquosup3raquo
eth
ograve
yen
eth
eth
Icircplusmnsup1raquoreg
eth
Uumlplusmnfrac14sup1raquoreg
yen
yen
eth
middotshy
yen
eth
plusmnsup2raquo
yen
eth
plusmnordm
yen
yen
eth
notcedilraquo
yen
yen
yen
acuteraquoiquestshynot
otilde
frac12plusmnsup3degraquoacuteacutemiddotsup2sup1
eth
ordfiquestregmiddotiquestnotmiddotplusmnsup2shy
eth
eth
plusmnsup2
eth
eth
notcedilmiddotshy
eth
notcedilraquosup3raquo
eth
ograve
otilde
eth
times
otilde
otilde
otilde
acutemiddotmicroraquofrac14
eth
eth
eth
raquoordfraquoregsect
eth
eth
shymiddotsup2sup1acuteraquo
eth
sup3middotsup2laquonotraquo
eth
eth
plusmnordm
eth
eth
notcedilmiddotshy
eth
eth
ograve
yen
eth
times
yen
yen
eth
eth
frac14middotfrac14
eth
sup2ugravenot
eth
eth
acutemiddotmicroraquo
eth
eth
eth
iquest
eth
eth
shymiddotsup2sup1acuteraquo
eth
sup3middotsup2laquonotraquo
eth
eth
plusmnordm
eth
eth
notcedilmiddotshy
eth
eth
ograve
yen
eth
timesnot
yen
yen
eth
eth
ugraveshy
eth
paralaquoshynot
yen
otilde
middotsup2frac12regraquofrac14middotfrac34acutesect
yen yen
frac14laquoacuteacute
eth
ograve
eth
eth
timesnot
eth
eth
eth
eth
eth
ugraveshy
otilde
yen
sup2plusmnnot
yen yen
frac14laquoacuteacute
eth
ograve
Uacutemiddotsup1laquoregraquo ccedilaelig IcircOgraveIgraveOgrave degregraquofrac14middotfrac12notmiddotplusmnsup2 plusmnordm degplusmnshymiddotnotmiddotordfraquo iquestsup2frac14 sup2raquosup1iquestnotmiddotordfraquo oslashfrac34plusmnnotnotplusmnsup3 regmiddotsup1cedilnotdivide shyraquosup2notraquosup2frac12raquoshy iquestsup2frac14 notcedilraquomiddotreg sup2raquosup1iquestnotmiddotplusmnsup2ograve
Recurrent Nets
x1 x2 x3
h1 h2 h3
y1 y2 y3
A A A
C C C
B B
RNNs are used in timeseries applications
The basic idea is that the hidden units at time ht (and possibly output yt)depend on the previous state of the network htminus1 xtminus1 ytminus1 for inputs xt andoutputs yt
In the above network I lsquounrolled the net through timersquo to give a standard NNdiagram
I omitted the potential links from xtminus1 ytminus1 to ht
Handwriting Generation using a RNN
Some training examples
Handwriting Generation using a RNN
Some generated examples Top line is real handwriting for comparison See AlexGraversquos work
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reasons research in deep learning has exploded
Much greater compute power (GPU)
Much larger datasets
AutoDiff
What is AutoDiff
AutoDiff takes a function f(x) and returns an exact value (up to machineaccuracy) for the gradient
gi(x) equivpart
partxif
∥∥∥∥x
Note that this is not the same as a numerical approximation (such as centraldifferences) for the gradient
One can show that if done efficiently one can always calculate the gradient inless than 5 times the time it takes to compute f(x)
Reverse DifferentiationA useful graphical representation is that the total derivative of f with respect to xis given by the sum over all path values from x to f where each path value is theproduct of the partial derivatives of the functions on the edges
df
dx=partf
partx+partf
partg
dg
dx
x
f
gpartfpartx
dgdx
partfpartg
Example
For f(x) = x2 + xgh where g =x2 and h = xg2
x
f
gh2x+ gh
2x
xh
2gx
xg
g2
f prime(x) = (2x+ gh) + (g2xg) + (2x2gxxg) + (2xxh) = 2x+ 8x7
Reverse DifferentiationConsider
f(x1 x2) = cos (sin(x1x2))
We can represent this computationally using an Abstract Syntax Tree (AST)
x1 x2
f1
f2
f3
f1(x1 x2) = x1x2
f2(x) = sin(x)
f3(x) = cos(x)
Given values for x1 x2 we first run forwards through the tree so that we canassociate each node with an actual function value
Reverse Differentiation
x1 x2
f1
f2
f3
df3dx1
=partf3partf2
df2dx1
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx1
Similarly
df3dx2
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx2
The two derivatives share the same computation branch andwe want to exploit this
Reverse Differentiation
x1 x2
f1
f2
f3
partf1partx1
= x2partf1partx2
= x1
partf2partf1
= cos(f1)
partf3partf2
= minus sin(f2)
1 Find the reverse ancestral (backwards) scheduleof nodes (f3 f2 f1 x1 x2)
2 Start with the first node n1 in the reverseschedule and define tn1 = 1
3 For the next node n in the reverse schedule findthe child nodes ch (n) Then define
tn =sum
cisinch(n)
partfcpartfn
tc
4 The total derivatives of f with respect to theroot nodes of the tree (here x1 and x2) are givenby the values of t at those nodes
This is a general procedure that can be used to automatically define a subroutineto efficiently compute the gradient It is efficient because information is collectedat nodes in the tree and split between parents only when required
Limitations of forward reasoning
World Representation
Recognising patterns (perceptron style) is only one form of intelligence
Solving chess problems is another and requires complex reasoning using someform of internal model
The world is noisy and information may be conflicting
Recognised that new approaches are required
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Limitations of forward reasoning
World Representation
Models help us to fantasise about the world
Models
Models
Burglar Problem
Creaks and Bumps
Creak Bump
Burglar Model
pos1 pos2 pos3 pos4
snd1 snd2 snd3 snd4
pos - position in kitchensnd ndash sound
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Stubby Fingers
Stubby Fingers
int1 int2 int3 int4
hit1 hit2 hit3 hit4
int - intended keyhit ndash hit key
Stubby Fingers errors
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz
005
01
015
02
025
03
035
04
045
05
055
Stubby Fingers language
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz 0
01
02
03
04
05
06
07
08
09
Stubby Fingers
Given the typed sequence cwsykcak what is the most likely word that thiscorresponds to
List the 200 most likely hidden sequences
Discard those that are not in a standard English dictionary
Take the most likely proper English word as the intended typed word
Speech Recognition raw signal
0 01 02 03 04 05 06 07 08 09minus02
minus015
minus01
minus005
0
005
01
015
02
025
03
Time
lsquoneuralrsquo representation
10 20 30 40 50 60 70 80
5
10
15
20
25
Speech Recognition
pho1 pho2 pho3 pho4
aud1 aud2 aud3 aud4
pho phoneme (letter)aud audio signal (neural representation)
Medical Diagnosis
tumour flu meningitis
headache fever appetite x-ray
Combine known medical knowledge with patient specific information
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems such as the IsingModel (1920) and in AI applications such as the HMM (Baum 1966Stratonovich 1960)
The need for structure
We often want to make a probabilistic description of many objects (electronspins neurons customers etc )
Typically the representational and computational cost of probabilistic modelsgrows exponentially with the number of objects represented
Without introducing strong structural limitations about how these objects caninteract probability is a non-starter
For this reason computationally lsquosimplerrsquo alternatives (such as fuzzy logic)were introduced to try to avoid some of these difficulties ndash however these aretypically frowed on by purists
Graphical Models
We can use graphs to represent how objects can probabilistically interact witheach other
Graphical Models and then a marriage between Graph and Probability theory
Many of the quantities that we would like to compute in a probabilitydistribution can then be related to operations on the graph
The computational complexity of operations can often be related to thestructure of the graph
Graphical Models are now used as a standard framework in EngineeringStatistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty and aretherefore widely applicable
Uses in Industry
Microsoft used to estimate the skill distribution of players in online games(the worlds largest graphical model)
Hospitals use Belief Nets to encode knowledge about diseases and symptomsto aid medical diagnosis
Google Microsoft Facebook used in many places including advertisingvideo game prediction speech recognition
Used to estimate inherent desirability of products in consumer retail
Microsoft and others Attempt to go beyond simple AB testing by usesGraphical Models to model the whole companyuser relationship
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or more shortly theprobability of x given y) is defined as
p(x|y) equiv p(x y)
p(y)=p(y|x)p(x)
p(y)(Bayesrsquo rule)
Throwing darts
p(region 5|not region 20) =p(region 5 not region 20)
p(not region 20)
=p(region 5)
p(not region 20)=
120
1920=
1
19
Interpretationp(A = a|B = b) should not be interpreted as lsquoGiven the event B = b has occurredp(A = a|B = b) is the probability of the event A = a occurringrsquo The correctinterpretation should be lsquop(A = a|B = b) is the probability of A being in state aunder the constraint that B is in state brsquo
Battleships
Assume there are 2 ships, 1 vertical (ship 1) and 1 horizontal (ship 2), of 5 pixels each
Can be placed anywhere on the 10×10 grid, but cannot overlap
Let s1 be the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query 'hit' or 'miss' responses
p(s1, s2|D) = p(D|s1, s2) p(s1, s2) / p(D)

Let X be the matrix of pixel occupancy:

p(X|D) = Σ_{s1,s2} p(X, s1, s2|D) = Σ_{s1,s2} p(X|s1, s2) p(s1, s2|D)
demoBattleships.m
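demoBattleships.m itself is not reproduced here, but the posterior is small enough to compute exactly. A sketch, assuming noise-free responses (so p(D|s1, s2) is 1 for placements consistent with every hit/miss query and 0 otherwise) and a uniform prior over legal placements:

```python
import numpy as np

def cells(origin, vertical):
    r, c = origin
    return [(r + i, c) if vertical else (r, c + i) for i in range(5)]

def legal(s1, s2):
    c1, c2 = cells(s1, True), cells(s2, False)
    on_grid = all(0 <= r < 10 and 0 <= c < 10 for r, c in c1 + c2)
    return on_grid and not set(c1) & set(c2)     # must not overlap

def posterior_occupancy(data):   # data: list of ((row, col), hit_boolean)
    X, total = np.zeros((10, 10)), 0.0
    origins = [(r, c) for r in range(10) for c in range(10)]
    for s1 in origins:
        for s2 in origins:
            if not legal(s1, s2):
                continue                          # zero prior weight
            occ = set(cells(s1, True)) | set(cells(s2, False))
            if all((q in occ) == hit for q, hit in data):   # 0/1 likelihood
                total += 1.0
                for r, c in occ:
                    X[r, c] += 1.0
    return X / total    # p(pixel occupied | D), summing over s1, s2

print(posterior_occupancy([((0, 0), False), ((5, 5), True)]).round(2))
```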
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has an associated conditional probability of that node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(A, B, C, D, E) = p(A) p(B) p(C|A, B) p(D|C) p(E|B, C)

[figure: DAG over the nodes A, B, C, D, E]
Example – Part I
Sally's burglar Alarm is sounding. Has she been Burgled, or was the alarm triggered by an Earthquake? She turns the car Radio on for news of earthquakes.
Choosing an ordering
Without loss of generality we can write

p(A, R, E, B) = p(A|R, E, B) p(R, E, B)
             = p(A|R, E, B) p(R|E, B) p(E, B)
             = p(A|R, E, B) p(R|E, B) p(E|B) p(B)
Assumptions
The alarm is not directly influenced by any report on the radio: p(A|R, E, B) = p(A|E, B)
The radio broadcast is not directly influenced by the burglar variable: p(R|E, B) = p(R|E)
Burglaries don't directly 'cause' earthquakes: p(E|B) = p(E)
Therefore
p(A, R, E, B) = p(A|E, B) p(R|E) p(E) p(B)
Example – Part II: Specifying the Tables

[figure: belief network with edges B → A, E → A, E → R]

p(A = 1|B, E):

  Burglar  Earthquake  p(Alarm = 1)
  1        1           0.9999
  1        0           0.99
  0        1           0.99
  0        0           0.0001

p(R = 1|E):

  Earthquake  p(Radio = 1)
  1           1
  0           0

The remaining tables are p(B = 1) = 0.01 and p(E = 1) = 0.000001. The tables and graphical structure fully specify the distribution.
Example – Part III: Inference

Initial Evidence: the alarm is sounding

p(B = 1|A = 1) = Σ_{E,R} p(B = 1, E, A = 1, R) / Σ_{B,E,R} p(B, E, A = 1, R)
             = Σ_{E,R} p(A = 1|B = 1, E) p(B = 1) p(E) p(R|E) / Σ_{B,E,R} p(A = 1|B, E) p(B) p(E) p(R|E)
             ≈ 0.99
Additional Evidence: the radio broadcasts an earthquake warning

A similar calculation gives p(B = 1|A = 1, R = 1) ≈ 0.01

Initially, because the alarm sounds, Sally thinks that she's been burgled. However, this probability drops dramatically when she hears that there has been an earthquake.
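Both numbers can be checked by brute-force enumeration of the sixteen joint states, using the tables above. A sketch:

```python
from itertools import product

pB = {1: 0.01, 0: 0.99}
pE = {1: 0.000001, 0: 0.999999}
pA = {(1, 1): 0.9999, (1, 0): 0.99, (0, 1): 0.99, (0, 0): 0.0001}  # p(A=1|B,E)
pR = {1: 1.0, 0: 0.0}                                              # p(R=1|E)

def joint(b, e, a, r):
    pa = pA[(b, e)] if a == 1 else 1 - pA[(b, e)]
    pr = pR[e] if r == 1 else 1 - pR[e]
    return pB[b] * pE[e] * pa * pr

def posterior_burglar(evidence):     # e.g. {'A': 1} or {'A': 1, 'R': 1}
    num = den = 0.0
    for b, e, a, r in product([0, 1], repeat=4):
        state = {'B': b, 'E': e, 'A': a, 'R': r}
        if any(state[k] != v for k, v in evidence.items()):
            continue                 # inconsistent with the evidence
        p = joint(b, e, a, r)
        den += p
        num += p * (b == 1)
    return num / den

print(posterior_burglar({'A': 1}))           # approx 0.99
print(posterior_burglar({'A': 1, 'R': 1}))   # approx 0.01
```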
Markov Models
For timeseries data v1, . . . , vT we need a model p(v1:T). For causal consistency it is meaningful to consider the decomposition

p(v1:T) = ∏_{t=1}^{T} p(vt|v1:t−1)

with the convention p(vt|v1:t−1) = p(v1) for t = 1.

[figure: chain v1 → v2 → v3 → v4]
Independence assumptions
It is often natural to assume that the influence of the immediate past is more relevant than the remote past; in Markov models only a limited number of previous observations are required to predict the future.
Markov Chain
Only the recent past is relevant
p(vt|v1, . . . , vt−1) = p(vt|vt−L, . . . , vt−1)

where L ≥ 1 is the order of the Markov chain. For a first order (L = 1) chain,

p(v1:T) = p(v1) p(v2|v1) p(v3|v2) · · · p(vT|vT−1)
For a stationary Markov chain the transitions p(vt = s′|vt−1 = s) = f(s′, s) are time-independent ('homogeneous')
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1, . . . , vT) = p(v1) ∏_{t=2}^{T} p(vt|vt−1)

with p(v1) the initial distribution and p(vt|vt−1) the transition.
State transition diagram
Nodes represent states of the variable v, and arcs non-zero elements of the transition p(vt|vt−1).

[figure: state transition diagram over states 1–9]
Most probable and shortest paths
[figure: the same state transition diagram over states 1–9]

The shortest (unweighted) path from state 1 to state 7 is 1−2−7.
The most probable path from state 1 to state 7 is 1−8−9−7 (assuming uniform transition probabilities). The latter path is longer but more probable, since for the path 1−2−7 the probability of exiting state 2 into state 7 is 1/5.
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) = Σ_j M_ij p(xt−1 = j),  where M_ij ≡ p(xt = i|xt−1 = j)

p(xt = i) is the frequency with which we visit state i at time t, given we started from p(x1) and randomly drew samples from the transition p(xτ|xτ−1). As we repeatedly sample a new state from the chain, the distribution at time t for an initial distribution p1(i) is

pt = M^{t−1} p1

If, for t → ∞, p∞ is independent of the initial distribution p1, then p∞ is called the equilibrium distribution of the chain:

p∞ = M p∞

The equilibrium distribution is thus proportional to the eigenvector with unit eigenvalue of the transition matrix.
PageRank
Define the matrix

A_ij = 1 if website j has a hyperlink to website i, and 0 otherwise

From this we can define a Markov transition matrix with elements

M_ij = A_ij / Σ_{i′} A_{i′j}
If we jump from website to website, the equilibrium distribution component p∞(i) is the relative number of times we will visit website i. This has a natural interpretation as the 'importance' of website i.
For each website i a list of words associated with that website is collected. After doing this for all websites, one can make an 'inverse' list of which websites contain word w. When a user searches for word w, the list of websites that contain word w is then returned, ranked according to the importance of the site.
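A sketch of both ideas with a made-up four-site link matrix: build M by normalising the columns of A, then find p∞ by power iteration (equivalently, the unit-eigenvalue eigenvector of M). The real PageRank also mixes in a small uniform 'teleport' probability so the equilibrium distribution exists for any link structure.

```python
import numpy as np

A = np.array([[0, 1, 1, 0],    # A[i, j] = 1 if site j links to site i
              [1, 0, 0, 1],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
M = A / A.sum(axis=0, keepdims=True)   # M[i, j] = A[i, j] / sum_i' A[i', j]

p = np.full(4, 0.25)       # any initial distribution p1
for _ in range(200):       # p_t = M^(t-1) p_1
    p = M @ p
print(p)                   # equilibrium p_infinity: the 'importance' of each site
```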
Hidden Markov Models
The HMM defines a Markov chain on hidden (or 'latent') variables h1:T. The observed (or 'visible') variables are dependent on the hidden variables through an emission p(vt|ht). This defines a joint distribution

p(h1:T, v1:T) = p(v1|h1) p(h1) ∏_{t=2}^{T} p(vt|ht) p(ht|ht−1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
[figure: a first order hidden Markov model, with 'hidden' variables dom(ht) = {1, . . . , H}, t = 1:T; the 'visible' variables vt can be either discrete or continuous]
The classical inference problems
Filtering (inferring the present): p(ht|v1:t)
Prediction (inferring the future): p(ht|v1:s), t > s
Smoothing (inferring the past): p(ht|v1:u), t < u
Likelihood: p(v1:T)
Most likely path (Viterbi alignment): argmax_{h1:T} p(h1:T|v1:T)

For prediction, one is also often interested in p(vt|v1:s) for t > s.
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering, Smoothing and Viterbi are all computationally efficient, scaling linearly with the length of the timeseries (but quadratically with the number of hidden states)
The algorithms are variants of 'message passing on factor graphs'
Algorithms guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing for approximate inference in multiply-connected graphs (e.g. low-density parity-check codes)
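As an illustration, filtering is a single forward sweep; a sketch with small arbitrary matrices (the linear-in-T, quadratic-in-H cost is visible as one matrix-vector product per step):

```python
import numpy as np

def filtering(p_init, p_trans, p_emit, observations):
    """Return p(h_t | v_{1:t}) for each t."""
    alpha = p_init * p_emit[:, observations[0]]
    alpha /= alpha.sum()
    out = [alpha]
    for v in observations[1:]:
        alpha = p_emit[:, v] * (p_trans @ alpha)   # predict, then correct
        alpha /= alpha.sum()                       # normalise at each step
        out.append(alpha)
    return np.array(out)

p_init = np.array([0.5, 0.5])
p_trans = np.array([[0.9, 0.2],       # p_trans[i, j] = p(h_t = i | h_{t-1} = j)
                    [0.1, 0.8]])
p_emit = np.array([[0.7, 0.2, 0.1],   # p_emit[i, v] = p(v_t = v | h_t = i)
                   [0.1, 0.3, 0.6]])
print(filtering(p_init, p_trans, p_emit, [0, 0, 2, 1]).round(3))
```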
HMMs for speech recognition
ht is the phoneme at time t; p(ht|ht−1) – language model; p(vt|ht) – speech signal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently, companies including Google have made big advances in speech recognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is some function of the phoneme, μ(ht; θ)
This function is a deep neural network, trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deep networks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images, for example) can be constructed on the basis of a low dimensional representation
Note that this is a Graphical Model, not a Function
The latent variables h can be sampled from using p(h), and then an image sampled from p(v|h). One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v)) and parameter learning are intractable in these models
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational Inference
Consider a distribution

p(v|θ) = ∫_h p(v|h, θ) p(h)

and that we wish to learn θ to maximise the probability this model generates observed data.

log p(v|θ) ≥ −∫_h q(h|v, φ) log q(h|v, φ) + ∫_h q(h|v, φ) log p(v|h, θ) + const
Idea is to choose a 'variational' distribution q(h|v, φ) such that we can either calculate the bound analytically or sample it efficiently
We then jointly maximise the bound w.r.t. φ and θ
We can parameterise p(v|h, θ) using a deep network
Very popular approach – see the 'variational autoencoder' and also attention mechanisms
Extension to a semi-supervised method, using p(v) = ∫_h Σ_c p(v|h, c) p(c) p(h)
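A minimal sketch of this setup in PyTorch, assuming a Gaussian q(h|v, φ), a N(0, I) prior p(h) and Bernoulli pixels for p(v|h, θ); the sizes and architecture are illustrative only, not those of any model in the talk:

```python
import torch
import torch.nn as nn

class VAE(nn.Module):
    def __init__(self, v_dim=784, h_dim=20):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(v_dim, 200), nn.ReLU())
        self.mu = nn.Linear(200, h_dim)        # q(h|v, phi): mean
        self.logvar = nn.Linear(200, h_dim)    # q(h|v, phi): log variance
        self.dec = nn.Sequential(nn.Linear(h_dim, 200), nn.ReLU(),
                                 nn.Linear(200, v_dim))   # p(v|h, theta)

    def elbo(self, v):
        e = self.enc(v)
        mu, logvar = self.mu(e), self.logvar(e)
        h = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # sample q(h|v)
        logits = self.dec(h)
        # E_q[log p(v|h, theta)], one-sample estimate with Bernoulli pixels
        rec = -nn.functional.binary_cross_entropy_with_logits(
            logits, v, reduction='sum')
        # KL(q(h|v) || N(0, I)) in closed form for a Gaussian q
        kl = -0.5 * torch.sum(1 + logvar - mu ** 2 - logvar.exp())
        return rec - kl      # the bound, maximised jointly w.r.t. phi and theta

vae = VAE()
v = torch.rand(8, 784).round()   # stand-in binary 'images'
(-vae.elbo(v)).backward()        # gradients flow to both phi and theta
```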
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A, we need to decide which action to take in any state of W that will be best for our long-term goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deep generative model)
Then learn which action to take, given the low dimensional representation
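The 'learn which action to take' step can be sketched with tabular Q-learning on a toy five-state chain, where the tabular state stands in for the learned low dimensional representation (deep RL replaces the table with a network):

```python
import numpy as np

n_states, n_actions = 5, 2          # actions: 0 = left, 1 = right
Q = np.zeros((n_states, n_actions))
rng = np.random.default_rng(0)

for episode in range(500):
    s = 0
    while s != n_states - 1:        # rightmost state is terminal
        # epsilon-greedy action choice
        a = rng.integers(n_actions) if rng.random() < 0.1 else int(Q[s].argmax())
        s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
        r = 1.0 if s2 == n_states - 1 else 0.0   # reward only at the goal
        # update towards the discounted long-term return
        Q[s, a] += 0.1 * (r + 0.9 * Q[s2].max() - Q[s, a])
        s = s2

print(Q.argmax(axis=1))   # learned policy: move right in every non-terminal state
```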
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state-of-the-art results in Speech Recognition, Image Analysis, Game Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representation learning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
https://reinfer.io
Word Embeddings
Given a word (France, for example) we can find which words w have embedding vectors closest to vFrance. From Ronan Collobert (2011).
Word Embeddings
There appears to be a natural 'geometry' to the embeddings. For example, there are directions that correspond to gender:

vwoman − vman ≈ vaunt − vuncle
vwoman − vman ≈ vqueen − vking
From Mikolov (2013)
Word Embeddings Analogies
Given a relationship France-Paris we get the lsquorelationshiprsquo embedding
v = vParis − vFrance

Given Italy we can calculate vItaly + v and find the word in the dictionary which has the closest embedding to this (it turns out to be Rome). From Mikolov (2013).
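A sketch of the analogy computation with made-up 3-dimensional embeddings, engineered so the arithmetic works out exactly; real embeddings have hundreds of dimensions and are learned from text:

```python
import numpy as np

emb = {
    "paris":  np.array([0.9, 0.1, 0.2]),
    "france": np.array([0.8, 0.0, 0.7]),
    "italy":  np.array([0.7, 0.1, 0.8]),
    "rome":   np.array([0.8, 0.2, 0.3]),
    "banana": np.array([0.0, 0.9, 0.4]),
}

def closest(v, exclude=()):
    """Word whose embedding has the highest cosine similarity with v."""
    cos = lambda a, b: a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    return max((w for w in emb if w not in exclude), key=lambda w: cos(v, emb[w]))

v = emb["paris"] - emb["france"]   # the 'capital-of' direction
print(closest(emb["italy"] + v, exclude=("italy", "paris", "france")))  # rome
```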
Word Embeddings Constrained Embeddings
We can learn embeddings for English words and embeddings for Chinese words.
However, when we know that a Chinese and an English word have a similar meaning, we add a constraint that the word embeddings vChineseWord and vEnglishWord should be close.
We have only a small amount of labelled 'similar' Chinese-English words (these are the green border boxes in the above; they are standard translations of the corresponding Chinese character).
We can visualise the embedding vectors in 2D (using t-SNE). See Socher (2013).
Word Embeddings Constrained Embeddings
Recursive Nets and Embeddings
Stanford Sentiment Treebank: consists of parsed sentences with sentiment labels (−−, −, 0, +, ++) for each node (phrase) in the tree; 215,000 labelled phrases (obtained from three humans).
Recursive Nets and Embeddings
Idea is to recursively combine embeddings such that they accurately predict the sentiment at each node

Recursive Nets and Embeddings: Training
We have a softmax classifier at each node in the tree to predict the sentiment of the phrase beneath this node in the tree
The weights of this classifier are shared across all nodes
At the leaf nodes at the bottom of the tree, the inputs to the classifiers are the word embeddings
The embeddings are combined by another network g with common parameters, which forms the input to the sentiment classifier
We then learn all the embeddings, shared classifier parameters, and shared combination parameters to maximise the classification accuracy
Prediction
For a new movie review, the review is first parsed using a standard grammar tree parser
This forms the tree, which can be used to recursively form the sentiment class label for the review
Currently the best sentiment classifier: Socher (2013)
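A sketch of the recursive forward pass with random placeholder weights; learning the shared parameters (by backpropagating the node classification errors) is omitted:

```python
import numpy as np

d, n_classes = 4, 5
rng = np.random.default_rng(0)
Wg = rng.standard_normal((d, 2 * d)) * 0.1       # shared combiner g
Ws = rng.standard_normal((n_classes, d)) * 0.1   # shared softmax classifier

def softmax(z):
    z = np.exp(z - z.max())
    return z / z.sum()

def node_vector(tree, emb, preds):
    """tree is a word (leaf) or a (left, right) pair; appends each node's
    predicted sentiment distribution to preds and returns its embedding."""
    if isinstance(tree, str):
        h = emb[tree]                                # leaf: word embedding
    else:
        hl = node_vector(tree[0], emb, preds)
        hr = node_vector(tree[1], emb, preds)
        h = np.tanh(Wg @ np.concatenate([hl, hr]))   # g combines the children
    preds.append((tree, softmax(Ws @ h)))            # classify this phrase
    return h

emb = {w: rng.standard_normal(d) for w in ["not", "a", "great", "movie"]}
preds = []
node_vector(("not", ("a", ("great", "movie"))), emb, preds)
for phrase, p in preds:
    print(phrase, p.round(2))
```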
Recursive Nets and Embeddings

[figure: predicted sentiment over parse trees, showing prediction of positive and negative (bottom right) sentences and their negation, e.g. 'Roger Dodger is one of the most compelling variations on this theme' / 'Roger Dodger is one of the least compelling variations on this theme', 'I liked every single minute of this film' / 'I didn't like a single minute of this film', and 'It's just incredibly dull' / 'It's not incredibly dull']
Recurrent Nets
[figure: RNN unrolled through time, with inputs x1, x2, x3, hidden units h1, h2, h3, outputs y1, y2, y3, and shared weight matrices A (input→hidden), C (hidden→output), B (hidden→hidden)]
RNNs are used in timeseries applications
The basic idea is that the hidden units ht at time t (and possibly the output yt) depend on the previous state of the network ht−1, xt−1, yt−1, for inputs xt and outputs yt
In the above network I 'unrolled the net through time' to give a standard NN diagram
I omitted the potential links from xt−1, yt−1 to ht
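A sketch of the forward pass of the unrolled net in numpy, assuming A, B and C act on the input, recurrent and output connections respectively; the sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
dx, dh, dy, T = 3, 5, 2, 4
A = rng.standard_normal((dh, dx)) * 0.1   # input  -> hidden
B = rng.standard_normal((dh, dh)) * 0.1   # hidden -> hidden (recurrence)
C = rng.standard_normal((dy, dh)) * 0.1   # hidden -> output

x = rng.standard_normal((T, dx))
h = np.zeros(dh)
for t in range(T):
    h = np.tanh(A @ x[t] + B @ h)   # h_t depends on x_t and h_{t-1}
    y = C @ h                       # output y_t
    print(t, y.round(3))
```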
Handwriting Generation using a RNN
Some training examples
Handwriting Generation using a RNN
Some generated examples. The top line is real handwriting, for comparison. See Alex Graves's work.
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reasons research in deep learning has exploded
Much greater compute power (GPU)
Much larger datasets
AutoDiff
What is AutoDiff
AutoDiff takes a function f(x) and returns an exact value (up to machine accuracy) for the gradient

g_i(x) ≡ ∂f/∂x_i, evaluated at x
Note that this is not the same as a numerical approximation (such as centraldifferences) for the gradient
One can show that if done efficiently one can always calculate the gradient inless than 5 times the time it takes to compute f(x)
Reverse Differentiation
A useful graphical representation is that the total derivative of f with respect to x is given by the sum over all path values from x to f, where each path value is the product of the partial derivatives of the functions on the edges:

df/dx = ∂f/∂x + (∂f/∂g)(dg/dx)

[figure: graph with nodes x, g, f and edges labelled ∂f/∂x (x→f), dg/dx (x→g), ∂f/∂g (g→f)]
Example

For f(x) = x^2 + xgh, where g = x^2 and h = xg^2:

[figure: graph with nodes x, g, h, f; edge labels 2x + gh (x→f), 2x (x→g), xh (g→f), 2gx (g→h), xg (h→f), g^2 (x→h)]

f′(x) = (2x + gh) + (g^2 · xg) + (2x · 2gx · xg) + (2x · xh) = 2x + 8x^7
Reverse Differentiation

Consider

f(x1, x2) = cos(sin(x1 x2))

We can represent this computationally using an Abstract Syntax Tree (AST):

[figure: AST with leaves x1, x2 feeding f1, then f2, then f3]

f1(x1, x2) = x1 x2
f2(x) = sin(x)
f3(x) = cos(x)
Given values for x1 x2 we first run forwards through the tree so that we canassociate each node with an actual function value
Reverse Differentiation

[figure: the same AST, with leaves x1, x2 feeding f1, then f2, then f3]

df3/dx1 = (∂f3/∂f2)(df2/dx1) = (∂f3/∂f2)(df2/df1)(df1/dx1), where (∂f3/∂f2)(df2/df1) = df3/df1

Similarly,

df3/dx2 = (∂f3/∂f2)(df2/df1)(df1/dx2) = (df3/df1)(df1/dx2)

The two derivatives share the same computation branch, and we want to exploit this.
Reverse Differentiation
[figure: the same AST]

∂f1/∂x1 = x2,  ∂f1/∂x2 = x1,  ∂f2/∂f1 = cos(f1),  ∂f3/∂f2 = −sin(f2)

1. Find the reverse ancestral (backwards) schedule of nodes (f3, f2, f1, x1, x2).
2. Start with the first node n1 in the reverse schedule and define t_{n1} = 1.
3. For the next node n in the reverse schedule, find the child nodes ch(n). Then define

   t_n = Σ_{c ∈ ch(n)} (∂f_c/∂f_n) t_c

4. The total derivatives of f with respect to the root nodes of the tree (here x1 and x2) are given by the values of t at those nodes.
This is a general procedure that can be used to automatically define a subroutine to efficiently compute the gradient. It is efficient because information is collected at nodes in the tree and split between parents only when required.
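Applying the schedule by hand to f(x1, x2) = cos(sin(x1 x2)) gives the following sketch; the reverse pass accumulates t_n = Σ_c (∂f_c/∂f_n) t_c, and the result is checked against central differences:

```python
import math

def f_and_grad(x1, x2):
    # forward pass: associate each AST node with its value
    f1 = x1 * x2
    f2 = math.sin(f1)
    f3 = math.cos(f2)
    # backward pass in reverse ancestral order: f3, f2, f1, x1, x2
    t_f3 = 1.0
    t_f2 = -math.sin(f2) * t_f3    # df3/df2 = -sin(f2)
    t_f1 = math.cos(f1) * t_f2     # df2/df1 = cos(f1)
    t_x1 = x2 * t_f1               # df1/dx1 = x2
    t_x2 = x1 * t_f1               # df1/dx2 = x1
    return f3, (t_x1, t_x2)

val, grad = f_and_grad(0.3, 0.7)
eps = 1e-6                         # central-difference check
num = (f_and_grad(0.3 + eps, 0.7)[0] - f_and_grad(0.3 - eps, 0.7)[0]) / (2 * eps)
print(grad[0], num)                # the two agree to high accuracy
```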
Limitations of forward reasoning
World Representation
Recognising patterns (perceptron style) is only one form of intelligence
Solving chess problems is another, and requires complex reasoning using some form of internal model
The world is noisy and information may be conflicting
Recognised that new approaches are required
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Limitations of forward reasoning
World Representation
Models help us to fantasise about the world
Models
Models
Burglar Problem
Creaks and Bumps
[figure: panels labelled 'Creak' and 'Bump']
Burglar Model
pos1 pos2 pos3 pos4
snd1 snd2 snd3 snd4
pos – position in kitchen; snd – sound
Finding the Burglar
[figure: kitchen grid with the observed sequence of creaks and bumps; the slide is repeated three times as the inference is animated]
Stubby Fingers
Stubby Fingers
int1 int2 int3 int4
hit1 hit2 hit3 hit4
int – intended key; hit – hit key
Stubby Fingers errors
[figure: error model, a matrix of p(hit|int) over the letters a–z]
Stubby Fingers language
[figure: language model, a matrix of letter-transition probabilities over a–z]
Stubby Fingers
Given the typed sequence cwsykcak what is the most likely word that thiscorresponds to
List the 200 most likely hidden sequences
Discard those that are not in a standard English dictionary
Take the most likely proper English word as the intended typed word
Speech Recognition raw signal
0 01 02 03 04 05 06 07 08 09minus02
minus015
minus01
minus005
0
005
01
015
02
025
03
Time
lsquoneuralrsquo representation
10 20 30 40 50 60 70 80
5
10
15
20
25
Speech Recognition
pho1 pho2 pho3 pho4
aud1 aud2 aud3 aud4
pho phoneme (letter)aud audio signal (neural representation)
Medical Diagnosis
tumour flu meningitis
headache fever appetite x-ray
Combine known medical knowledge with patient specific information
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems such as the IsingModel (1920) and in AI applications such as the HMM (Baum 1966Stratonovich 1960)
The need for structure
We often want to make a probabilistic description of many objects (electronspins neurons customers etc )
Typically the representational and computational cost of probabilistic modelsgrows exponentially with the number of objects represented
Without introducing strong structural limitations about how these objects caninteract probability is a non-starter
For this reason computationally lsquosimplerrsquo alternatives (such as fuzzy logic)were introduced to try to avoid some of these difficulties ndash however these aretypically frowed on by purists
Graphical Models
We can use graphs to represent how objects can probabilistically interact witheach other
Graphical Models and then a marriage between Graph and Probability theory
Many of the quantities that we would like to compute in a probabilitydistribution can then be related to operations on the graph
The computational complexity of operations can often be related to thestructure of the graph
Graphical Models are now used as a standard framework in EngineeringStatistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty and aretherefore widely applicable
Uses in Industry
Microsoft used to estimate the skill distribution of players in online games(the worlds largest graphical model)
Hospitals use Belief Nets to encode knowledge about diseases and symptomsto aid medical diagnosis
Google Microsoft Facebook used in many places including advertisingvideo game prediction speech recognition
Used to estimate inherent desirability of products in consumer retail
Microsoft and others Attempt to go beyond simple AB testing by usesGraphical Models to model the whole companyuser relationship
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or more shortly theprobability of x given y) is defined as
p(x|y) equiv p(x y)
p(y)=p(y|x)p(x)
p(y)(Bayesrsquo rule)
Throwing darts
p(region 5|not region 20) =p(region 5 not region 20)
p(not region 20)
=p(region 5)
p(not region 20)=
120
1920=
1
19
Interpretationp(A = a|B = b) should not be interpreted as lsquoGiven the event B = b has occurredp(A = a|B = b) is the probability of the event A = a occurringrsquo The correctinterpretation should be lsquop(A = a|B = b) is the probability of A being in state aunder the constraint that B is in state brsquo
Battleships
Assume there are 2 ships 1 vertical (ship 1) and 1 horizontal (ship 2) of 5pixels each
Can be placed anywhere on the 10times10 grid but cannot overlap
Let s1 is the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query lsquohitrsquo or lsquomissrsquo responses
p(s1 s2|D) =p(D|s1 s2)p(s1 s2)
p(D)Let X be the matrix of pixel occupancy
p(X|D) =sums1s2
p(X s1 s2|D) =sums1s2
p(X|s1 s2)p(s1 s2|D)
demoBattleshipsm
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
Word Embeddings
There appears to be a natural lsquogeometryrsquo to the embeddings For example thereare directions that correspond to gender
vwoman minus vman asymp vaunt minus vuncle
vwoman minus vman asymp vqueen minus vking
From Mikolov (2013)
Word Embeddings Analogies
Given a relationship France-Paris we get the lsquorelationshiprsquo embedding
v = vParis minus vFrance
Given Italy we can calculate vItaly + v and find the word in the dictionary whichhas closest embedding to this (it turns out to be Rome) From Mikolov (2013)
Word Embeddings Constrained Embeddings
We can learn embeddings for English words and embeddings for ChinesewordsHowever when we know that a Chinese and English word have a similarmeaning we add a constraint that the word embeddings vChineseWord andvEnglishWord should be closeWe have only a small amount of labelled lsquosimilarrsquo Chinese-English words(these are the green border boxes in the above they are standard translationsof the corresponding Chinese character)We can visualise in 2D (using t-SNE) the embedding vectors See Socher(2013)
Word Embeddings Constrained Embeddings
Recursive Nets and Embeddings
Stanford Sentiment Treebank Consists of parsed sentences with sentiment labels(minusminusminus 0+++) for each node (phrase) in the tree 215000 labelled phrases(obtained from three humans)
Recursive Nets and Embeddings
Idea is to recursively combine embeddings such that they accurately predictthe sentiment at each node
Recursive Nets and EmbeddingsTraining
We have a softmax classifier for each node in the tree to predict thesentiment of the phrase beneath this node in the tree
The weights of this classifier are shared across all nodes
At the leaf nodes at the bottom of the tree the inputs to the classifiers arethe word embeddings
The embeddings are combined by another network g with commonparameters which forms the input to the sentiment classifier
We then learn all the embeddings shared classifier parameters and sharedcombination parameters to maximise the classification accuracy
Prediction
For a new movie review the review is first parsed using a standard grammartree parser
This forms the tree which can be used to recursively form the sentiment classlabel for the review
Currently the best sentiment classifier Socher (2013)
Recursive Nets and Embeddingsotilde otilde
eth
eth
Icircplusmnsup1raquoreg
eth
Uumlplusmnfrac14sup1raquoreg
otilde
otilde
eth
middotshy
otilde
eth
plusmnsup2raquo
otilde
eth
plusmnordm
otilde
otilde
eth
notcedilraquo
otilde
otilde
eth
sup3plusmnshynot
otilde
frac12plusmnsup3degraquoacuteacutemiddotsup2sup1
eth
ordfiquestregmiddotiquestnotmiddotplusmnsup2shy
eth
eth
plusmnsup2
eth
eth
notcedilmiddotshy
eth
notcedilraquosup3raquo
eth
ograve
yen
eth
eth
Icircplusmnsup1raquoreg
eth
Uumlplusmnfrac14sup1raquoreg
yen
yen
eth
middotshy
yen
eth
plusmnsup2raquo
yen
eth
plusmnordm
yen
yen
eth
notcedilraquo
yen
yen
yen
acuteraquoiquestshynot
otilde
frac12plusmnsup3degraquoacuteacutemiddotsup2sup1
eth
ordfiquestregmiddotiquestnotmiddotplusmnsup2shy
eth
eth
plusmnsup2
eth
eth
notcedilmiddotshy
eth
notcedilraquosup3raquo
eth
ograve
otilde
eth
times
otilde
otilde
otilde
acutemiddotmicroraquofrac14
eth
eth
eth
raquoordfraquoregsect
eth
eth
shymiddotsup2sup1acuteraquo
eth
sup3middotsup2laquonotraquo
eth
eth
plusmnordm
eth
eth
notcedilmiddotshy
eth
eth
ograve
yen
eth
times
yen
yen
eth
eth
frac14middotfrac14
eth
sup2ugravenot
eth
eth
acutemiddotmicroraquo
eth
eth
eth
iquest
eth
eth
shymiddotsup2sup1acuteraquo
eth
sup3middotsup2laquonotraquo
eth
eth
plusmnordm
eth
eth
notcedilmiddotshy
eth
eth
ograve
yen
eth
timesnot
yen
yen
eth
eth
ugraveshy
eth
paralaquoshynot
yen
otilde
middotsup2frac12regraquofrac14middotfrac34acutesect
yen yen
frac14laquoacuteacute
eth
ograve
eth
eth
timesnot
eth
eth
eth
eth
eth
ugraveshy
otilde
yen
sup2plusmnnot
yen yen
frac14laquoacuteacute
eth
ograve
Uacutemiddotsup1laquoregraquo ccedilaelig IcircOgraveIgraveOgrave degregraquofrac14middotfrac12notmiddotplusmnsup2 plusmnordm degplusmnshymiddotnotmiddotordfraquo iquestsup2frac14 sup2raquosup1iquestnotmiddotordfraquo oslashfrac34plusmnnotnotplusmnsup3 regmiddotsup1cedilnotdivide shyraquosup2notraquosup2frac12raquoshy iquestsup2frac14 notcedilraquomiddotreg sup2raquosup1iquestnotmiddotplusmnsup2ograve
Recurrent Nets
x1 x2 x3
h1 h2 h3
y1 y2 y3
A A A
C C C
B B
RNNs are used in timeseries applications
The basic idea is that the hidden units at time ht (and possibly output yt)depend on the previous state of the network htminus1 xtminus1 ytminus1 for inputs xt andoutputs yt
In the above network I lsquounrolled the net through timersquo to give a standard NNdiagram
I omitted the potential links from xtminus1 ytminus1 to ht
Handwriting Generation using a RNN
Some training examples
Handwriting Generation using a RNN
Some generated examples Top line is real handwriting for comparison See AlexGraversquos work
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reasons research in deep learning has exploded
Much greater compute power (GPU)
Much larger datasets
AutoDiff
What is AutoDiff
AutoDiff takes a function f(x) and returns an exact value (up to machineaccuracy) for the gradient
gi(x) equivpart
partxif
∥∥∥∥x
Note that this is not the same as a numerical approximation (such as centraldifferences) for the gradient
One can show that if done efficiently one can always calculate the gradient inless than 5 times the time it takes to compute f(x)
Reverse DifferentiationA useful graphical representation is that the total derivative of f with respect to xis given by the sum over all path values from x to f where each path value is theproduct of the partial derivatives of the functions on the edges
df
dx=partf
partx+partf
partg
dg
dx
x
f
gpartfpartx
dgdx
partfpartg
Example
For f(x) = x2 + xgh where g =x2 and h = xg2
x
f
gh2x+ gh
2x
xh
2gx
xg
g2
f prime(x) = (2x+ gh) + (g2xg) + (2x2gxxg) + (2xxh) = 2x+ 8x7
Reverse DifferentiationConsider
f(x1 x2) = cos (sin(x1x2))
We can represent this computationally using an Abstract Syntax Tree (AST)
x1 x2
f1
f2
f3
f1(x1 x2) = x1x2
f2(x) = sin(x)
f3(x) = cos(x)
Given values for x1 x2 we first run forwards through the tree so that we canassociate each node with an actual function value
Reverse Differentiation
x1 x2
f1
f2
f3
df3dx1
=partf3partf2
df2dx1
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx1
Similarly
df3dx2
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx2
The two derivatives share the same computation branch andwe want to exploit this
Reverse Differentiation
x1 x2
f1
f2
f3
partf1partx1
= x2partf1partx2
= x1
partf2partf1
= cos(f1)
partf3partf2
= minus sin(f2)
1 Find the reverse ancestral (backwards) scheduleof nodes (f3 f2 f1 x1 x2)
2 Start with the first node n1 in the reverseschedule and define tn1 = 1
3 For the next node n in the reverse schedule findthe child nodes ch (n) Then define
tn =sum
cisinch(n)
partfcpartfn
tc
4 The total derivatives of f with respect to theroot nodes of the tree (here x1 and x2) are givenby the values of t at those nodes
This is a general procedure that can be used to automatically define a subroutineto efficiently compute the gradient It is efficient because information is collectedat nodes in the tree and split between parents only when required
Limitations of forward reasoning
World Representation
Recognising patterns (perceptron style) is only one form of intelligence
Solving chess problems is another and requires complex reasoning using someform of internal model
The world is noisy and information may be conflicting
Recognised that new approaches are required
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Limitations of forward reasoning
World Representation
Models help us to fantasise about the world
Models
Models
Burglar Problem
Creaks and Bumps
Creak Bump
Burglar Model
pos1 pos2 pos3 pos4
snd1 snd2 snd3 snd4
pos - position in kitchensnd ndash sound
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Stubby Fingers
Stubby Fingers
int1 int2 int3 int4
hit1 hit2 hit3 hit4
int - intended keyhit ndash hit key
Stubby Fingers errors
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz
005
01
015
02
025
03
035
04
045
05
055
Stubby Fingers language
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz 0
01
02
03
04
05
06
07
08
09
Stubby Fingers
Given the typed sequence cwsykcak what is the most likely word that thiscorresponds to
List the 200 most likely hidden sequences
Discard those that are not in a standard English dictionary
Take the most likely proper English word as the intended typed word
Speech Recognition raw signal
0 01 02 03 04 05 06 07 08 09minus02
minus015
minus01
minus005
0
005
01
015
02
025
03
Time
lsquoneuralrsquo representation
10 20 30 40 50 60 70 80
5
10
15
20
25
Speech Recognition
pho1 pho2 pho3 pho4
aud1 aud2 aud3 aud4
pho phoneme (letter)aud audio signal (neural representation)
Medical Diagnosis
tumour flu meningitis
headache fever appetite x-ray
Combine known medical knowledge with patient specific information
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems such as the IsingModel (1920) and in AI applications such as the HMM (Baum 1966Stratonovich 1960)
The need for structure
We often want to make a probabilistic description of many objects (electronspins neurons customers etc )
Typically the representational and computational cost of probabilistic modelsgrows exponentially with the number of objects represented
Without introducing strong structural limitations about how these objects caninteract probability is a non-starter
For this reason computationally lsquosimplerrsquo alternatives (such as fuzzy logic)were introduced to try to avoid some of these difficulties ndash however these aretypically frowed on by purists
Graphical Models
We can use graphs to represent how objects can probabilistically interact witheach other
Graphical Models and then a marriage between Graph and Probability theory
Many of the quantities that we would like to compute in a probabilitydistribution can then be related to operations on the graph
The computational complexity of operations can often be related to thestructure of the graph
Graphical Models are now used as a standard framework in EngineeringStatistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty and aretherefore widely applicable
Uses in Industry
Microsoft used to estimate the skill distribution of players in online games(the worlds largest graphical model)
Hospitals use Belief Nets to encode knowledge about diseases and symptomsto aid medical diagnosis
Google Microsoft Facebook used in many places including advertisingvideo game prediction speech recognition
Used to estimate inherent desirability of products in consumer retail
Microsoft and others Attempt to go beyond simple AB testing by usesGraphical Models to model the whole companyuser relationship
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or more shortly theprobability of x given y) is defined as
p(x|y) equiv p(x y)
p(y)=p(y|x)p(x)
p(y)(Bayesrsquo rule)
Throwing darts
p(region 5|not region 20) =p(region 5 not region 20)
p(not region 20)
=p(region 5)
p(not region 20)=
120
1920=
1
19
Interpretationp(A = a|B = b) should not be interpreted as lsquoGiven the event B = b has occurredp(A = a|B = b) is the probability of the event A = a occurringrsquo The correctinterpretation should be lsquop(A = a|B = b) is the probability of A being in state aunder the constraint that B is in state brsquo
Battleships
Assume there are 2 ships 1 vertical (ship 1) and 1 horizontal (ship 2) of 5pixels each
Can be placed anywhere on the 10times10 grid but cannot overlap
Let s1 is the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query lsquohitrsquo or lsquomissrsquo responses
p(s1 s2|D) =p(D|s1 s2)p(s1 s2)
p(D)Let X be the matrix of pixel occupancy
p(X|D) =sums1s2
p(X s1 s2|D) =sums1s2
p(X|s1 s2)p(s1 s2|D)
demoBattleshipsm
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
Word Embeddings Analogies
Given a relationship France-Paris we get the lsquorelationshiprsquo embedding
v = vParis minus vFrance
Given Italy we can calculate vItaly + v and find the word in the dictionary whichhas closest embedding to this (it turns out to be Rome) From Mikolov (2013)
Word Embeddings Constrained Embeddings
We can learn embeddings for English words and embeddings for ChinesewordsHowever when we know that a Chinese and English word have a similarmeaning we add a constraint that the word embeddings vChineseWord andvEnglishWord should be closeWe have only a small amount of labelled lsquosimilarrsquo Chinese-English words(these are the green border boxes in the above they are standard translationsof the corresponding Chinese character)We can visualise in 2D (using t-SNE) the embedding vectors See Socher(2013)
Word Embeddings Constrained Embeddings
Recursive Nets and Embeddings
Stanford Sentiment Treebank Consists of parsed sentences with sentiment labels(minusminusminus 0+++) for each node (phrase) in the tree 215000 labelled phrases(obtained from three humans)
Recursive Nets and Embeddings
Idea is to recursively combine embeddings such that they accurately predict the sentiment at each node
Recursive Nets and Embeddings: Training
We have a softmax classifier for each node in the tree to predict the sentiment of the phrase beneath this node in the tree
The weights of this classifier are shared across all nodes
At the leaf nodes at the bottom of the tree, the inputs to the classifiers are the word embeddings
The embeddings are combined by another network g with common parameters, which forms the input to the sentiment classifier
We then learn all the embeddings, shared classifier parameters and shared combination parameters to maximise the classification accuracy
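A minimal sketch of the shared combiner g and shared softmax classifier, with a tanh combination layer assumed in the spirit of Socher (2013); sizes and initialisation are illustrative:

import numpy as np

d, n_classes = 10, 5                                 # embedding size; 5 sentiment classes
rng = np.random.default_rng(0)
Wg = rng.normal(scale=0.1, size=(d, 2 * d))          # shared combination network g
Ws = rng.normal(scale=0.1, size=(n_classes, d))      # shared softmax classifier weights

def combine(left, right):
    # parent embedding from the two child embeddings (assumed tanh combiner)
    return np.tanh(Wg @ np.concatenate([left, right]))

def sentiment_probs(node_vec):
    z = Ws @ node_vec
    e = np.exp(z - z.max())
    return e / e.sum()

def node_embedding(tree):
    # a tree is either a leaf word vector or a (left, right) pair of subtrees
    if isinstance(tree, np.ndarray):
        return tree
    left, right = tree
    return combine(node_embedding(left), node_embedding(right))

# example: ((w1, w2), w3) with random stand-in 'word embeddings'
w1, w2, w3 = rng.normal(size=d), rng.normal(size=d), rng.normal(size=d)
print(sentiment_probs(node_embedding(((w1, w2), w3))))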
Prediction
For a new movie review, the review is first parsed using a standard grammar tree parser
This forms the tree, which can be used to recursively form the sentiment class label for the review
Currently the best sentiment classifier; see Socher (2013)
Recursive Nets and Embeddings
[Figure, from Socher (2013): RNTN prediction of positive and negative (bottom right) sentences and their negation; example sentence pair: 'Roger Dodger is one of the most compelling variations on this theme' vs. 'Roger Dodger is one of the least compelling variations on this theme']
Recurrent Nets
[Diagram: an RNN unrolled through time; inputs x1, x2, x3 feed hidden units h1, h2, h3 through shared weights A, successive hidden units are coupled through shared weights B, and outputs y1, y2, y3 are read out through shared weights C]
RNNs are used in timeseries applications
The basic idea is that the hidden units at time t (and possibly the output y_t) depend on the previous state of the network h_{t−1}, x_{t−1}, y_{t−1}, for inputs x_t and outputs y_t
In the above network I 'unrolled the net through time' to give a standard NN diagram
I omitted the potential links from x_{t−1}, y_{t−1} to h_t
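A minimal unrolling matching the diagram, with shared input, hidden-to-hidden and output weights A, B, C; all sizes are illustrative:

import numpy as np

Dx, Dh, Dy, T = 3, 4, 2, 3
rng = np.random.default_rng(0)
A = rng.normal(scale=0.1, size=(Dh, Dx))   # input weights, shared across time
B = rng.normal(scale=0.1, size=(Dh, Dh))   # hidden-to-hidden weights
C = rng.normal(scale=0.1, size=(Dy, Dh))   # hidden-to-output weights

xs = rng.normal(size=(T, Dx))
h = np.zeros(Dh)
for t in range(T):
    h = np.tanh(A @ xs[t] + B @ h)         # h_t depends on x_t and h_{t-1}
    y = C @ h                              # y_t read out from h_t
    print(t, y)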
Handwriting Generation using a RNN
Some training examples
Handwriting Generation using a RNN
Some generated examples. The top line is real handwriting, for comparison. See Alex Graves' work
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reasons research in deep learning has exploded
Much greater compute power (GPU)
Much larger datasets
AutoDiff
What is AutoDiff
AutoDiff takes a function f(x) and returns an exact value (up to machine accuracy) for the gradient
g_i(x) ≡ ∂f/∂x_i, evaluated at x
Note that this is not the same as a numerical approximation (such as central differences) for the gradient
One can show that, if done efficiently, one can always calculate the gradient in less than 5 times the time it takes to compute f(x)
Reverse Differentiation
A useful graphical representation is that the total derivative of f with respect to x is given by the sum over all path values from x to f, where each path value is the product of the partial derivatives of the functions on the edges:
df/dx = ∂f/∂x + (∂f/∂g)(dg/dx)
[Diagram: nodes x, g, f; edge x → f labelled ∂f/∂x, edge x → g labelled dg/dx, edge g → f labelled ∂f/∂g]
Example
For f(x) = x² + xgh, where g = x² and h = xg²:
[Diagram: AST over x, g, h, f; edge values x → f: 2x + gh, x → g: 2x, x → h: g², g → h: 2gx, g → f: xh, h → f: xg]
f′(x) = (2x + gh) + (g² · xg) + (2x · 2gx · xg) + (2x · xh) = 2x + 8x⁷
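The result is easy to check numerically; central differences only approximate the gradient, whereas the path-sum value is exact:

# Checking the path-sum result f'(x) = 2x + 8x^7 against central differences
def f(x):
    g = x ** 2
    h = x * g ** 2
    return x ** 2 + x * g * h        # = x^2 + x^8

x, delta = 1.3, 1e-5
numeric = (f(x + delta) - f(x - delta)) / (2 * delta)
exact = 2 * x + 8 * x ** 7
print(numeric, exact)                # agree to high accuracy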
Reverse Differentiation
Consider
f(x1, x2) = cos(sin(x1 x2))
We can represent this computationally using an Abstract Syntax Tree (AST):
[Diagram: x1, x2 → f1 → f2 → f3]
f1(x1, x2) = x1 x2
f2(x) = sin(x)
f3(x) = cos(x)
Given values for x1, x2, we first run forwards through the tree so that we can associate each node with an actual function value
Reverse Differentiation
[Diagram: the same AST x1, x2 → f1 → f2 → f3]
df3/dx1 = (∂f3/∂f2)(df2/dx1) = (∂f3/∂f2)(df2/df1) · (df1/dx1), where (∂f3/∂f2)(df2/df1) = df3/df1
Similarly,
df3/dx2 = (∂f3/∂f2)(df2/df1) · (df1/dx2) = (df3/df1) · (df1/dx2)
The two derivatives share the same computation branch and we want to exploit this
Reverse Differentiation
[Diagram: the same AST x1, x2 → f1 → f2 → f3]
∂f1/∂x1 = x2, ∂f1/∂x2 = x1, ∂f2/∂f1 = cos(f1), ∂f3/∂f2 = −sin(f2)
1. Find the reverse ancestral (backwards) schedule of nodes (f3, f2, f1, x1, x2)
2. Start with the first node n1 in the reverse schedule and define t_{n1} = 1
3. For the next node n in the reverse schedule, find the child nodes ch(n). Then define
t_n = Σ_{c ∈ ch(n)} (∂f_c/∂f_n) t_c
4. The total derivatives of f with respect to the root nodes of the tree (here x1 and x2) are given by the values of t at those nodes
This is a general procedure that can be used to automatically define a subroutine to efficiently compute the gradient. It is efficient because information is collected at nodes in the tree and split between parents only when required
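A direct transcription of this schedule for f = cos(sin(x1 x2)), checked against the closed form; a sketch of the idea, not a general AutoDiff library:

import numpy as np

x1, x2 = 0.7, -1.2

# forward pass through the AST, storing node values
f1 = x1 * x2
f2 = np.sin(f1)
f3 = np.cos(f2)

# local partial derivatives on the edges
d_f3_f2 = -np.sin(f2)
d_f2_f1 = np.cos(f1)
d_f1_x1 = x2
d_f1_x2 = x1

# reverse schedule: t(f3) = 1, then each node sums (local partial) * t(child)
t_f3 = 1.0
t_f2 = d_f3_f2 * t_f3
t_f1 = d_f2_f1 * t_f2            # the branch shared by both derivatives
t_x1 = d_f1_x1 * t_f1            # = df3/dx1
t_x2 = d_f1_x2 * t_f1            # = df3/dx2

# check against d/dxi cos(sin(x1 x2))
print(t_x1, -np.sin(np.sin(x1 * x2)) * np.cos(x1 * x2) * x2)
print(t_x2, -np.sin(np.sin(x1 * x2)) * np.cos(x1 * x2) * x1)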
Limitations of forward reasoning
World Representation
Recognising patterns (perceptron style) is only one form of intelligence
Solving chess problems is another, and requires complex reasoning using some form of internal model
The world is noisy and information may be conflicting
Recognised that new approaches are required
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Limitations of forward reasoning
World Representation
Models help us to fantasise about the world
Models
Burglar Problem
Creaks and Bumps
Creak Bump
Burglar Model
[Diagram: HMM with positions pos1 → pos2 → pos3 → pos4 emitting sounds snd1, snd2, snd3, snd4]
pos – position in kitchen; snd – sound
Finding the Burglar
[Figure, shown over three slides: sequences of observed 'creak' and 'bump' sounds on the kitchen grid; the posterior over the burglar's position is updated as the evidence arrives]
Stubby Fingers
Stubby Fingers
[Diagram: HMM with intended keys int1 → int2 → int3 → int4 emitting hit keys hit1, hit2, hit3, hit4]
int – intended key; hit – hit key
Stubby Fingers: errors
[Figure: confusion matrix p(hit|intended) over the letters a–z; colour scale 0.05–0.55]
Stubby Fingers: language
[Figure: letter transition matrix of the language model over a–z; colour scale 0–0.9]
Stubby Fingers
Given the typed sequence 'cwsykcak', what is the most likely word that this corresponds to?
List the 200 most likely hidden sequences
Discard those that are not in a standard English dictionary
Take the most likely proper English word as the intended typed word
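A hedged sketch of this recipe: score candidate intended words under a noisy-typing model and keep the best dictionary word. The error model, bigram model and word list below are uniform placeholders, not the lecture's learned tables:

import numpy as np

letters = 'abcdefghijklmnopqrstuvwxyz'
idx = {c: i for i, c in enumerate(letters)}

p_hit = np.full((26, 26), 0.5 / 25)     # p(hit|intended): half the mass on wrong keys...
np.fill_diagonal(p_hit, 0.5)            # ...half on the intended key (assumption)
p_trans = np.full((26, 26), 1.0 / 26)   # letter bigram language model (uniform placeholder)

def log_score(word, typed):
    # log p(intended word, typed sequence) under the HMM
    if len(word) != len(typed):
        return -np.inf
    s = np.log(1.0 / 26) + np.log(p_hit[idx[word[0]], idx[typed[0]]])
    for prev, cur, t in zip(word, word[1:], typed[1:]):
        s += np.log(p_trans[idx[prev], idx[cur]]) + np.log(p_hit[idx[cur], idx[t]])
    return s

dictionary = ['chemical', 'causally', 'baseball']   # stand-in for a real word list
typed = 'cwsykcak'
print(max(dictionary, key=lambda w: log_score(w, typed)))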
Speech Recognition: raw signal
[Figure: raw audio waveform, amplitude between −0.2 and 0.3, plotted against time 0–0.9 s]
'neural' representation
[Figure: time–frequency ('neural') representation of the same signal, about 25 channels over about 80 frames]
Speech Recognition
[Diagram: HMM with phonemes pho1 → pho2 → pho3 → pho4 emitting audio signals aud1, aud2, aud3, aud4]
pho – phoneme (letter); aud – audio signal (neural representation)
Medical Diagnosis
[Diagram: belief network with diseases tumour, flu, meningitis as parents of symptoms headache, fever, appetite, x-ray]
Combine known medical knowledge with patient-specific information
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems such as the Ising Model (1920) and in AI applications such as the HMM (Baum 1966; Stratonovich 1960)
The need for structure
We often want to make a probabilistic description of many objects (electron spins, neurons, customers, etc.)
Typically the representational and computational cost of probabilistic models grows exponentially with the number of objects represented
Without introducing strong structural limitations on how these objects can interact, probability is a non-starter
For this reason computationally 'simpler' alternatives (such as fuzzy logic) were introduced to try to avoid some of these difficulties – however, these are typically frowned upon by purists
Graphical Models
We can use graphs to represent how objects can probabilistically interact with each other
Graphical Models are then a marriage between Graph and Probability theory
Many of the quantities that we would like to compute in a probability distribution can then be related to operations on the graph
The computational complexity of operations can often be related to the structure of the graph
Graphical Models are now used as a standard framework in Engineering, Statistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty and are therefore widely applicable
Uses in Industry
Microsoft: used to estimate the skill distribution of players in online games (the world's largest graphical model)
Hospitals: use Belief Nets to encode knowledge about diseases and symptoms to aid medical diagnosis
Google, Microsoft, Facebook: used in many places, including advertising, video game prediction, speech recognition
Used to estimate the inherent desirability of products in consumer retail
Microsoft and others: attempts to go beyond simple A/B testing by using Graphical Models to model the whole company/user relationship
Conditional Probability and Bayes' Rule
The probability of event x conditioned on knowing event y (or, more shortly, the probability of x given y) is defined as
p(x|y) ≡ p(x, y) / p(y) = p(y|x) p(x) / p(y)   (Bayes' rule)
Throwing darts
p(region 5|not region 20) = p(region 5, not region 20) / p(not region 20)
= p(region 5) / p(not region 20) = (1/20) / (19/20) = 1/19
Interpretation
p(A = a|B = b) should not be interpreted as 'Given the event B = b has occurred, p(A = a|B = b) is the probability of the event A = a occurring'. The correct interpretation should be 'p(A = a|B = b) is the probability of A being in state a under the constraint that B is in state b'
Battleships
Assume there are 2 ships, 1 vertical (ship 1) and 1 horizontal (ship 2), of 5 pixels each
They can be placed anywhere on the 10×10 grid, but cannot overlap
Let s1 be the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query 'hit' or 'miss' responses
p(s1, s2|D) = p(D|s1, s2) p(s1, s2) / p(D)
Let X be the matrix of pixel occupancy:
p(X|D) = Σ_{s1,s2} p(X, s1, s2|D) = Σ_{s1,s2} p(X|s1, s2) p(s1, s2|D)
demoBattleships.m
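A small Python re-creation in the spirit of demoBattleships.m (which is MATLAB): enumerate all non-overlapping placements, keep those consistent with the hit/miss data, and average occupancy to get p(X|D). The query data here is invented for illustration:

import numpy as np
from itertools import product

G, L = 10, 5
def cells(origin, vertical):
    r, c = origin
    return [(r + i, c) for i in range(L)] if vertical else [(r, c + i) for i in range(L)]

placements = []
for r1, c1, r2, c2 in product(range(G), repeat=4):
    if r1 + L <= G and c2 + L <= G:                  # ship 1 vertical, ship 2 horizontal
        s1, s2 = cells((r1, c1), True), cells((r2, c2), False)
        if not set(s1) & set(s2):                    # ships cannot overlap
            placements.append(s1 + s2)

data = {(4, 4): True, (0, 0): False}                 # query -> hit? (illustrative)
post, n = np.zeros((G, G)), 0
for occ in placements:
    occ_set = set(occ)
    if all((q in occ_set) == hit for q, hit in data.items()):
        for (r, c) in occ_set:
            post[r, c] += 1
        n += 1
post /= n                                            # p(pixel occupied | D)
print(post.round(2))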
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated the conditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditional probabilities:
p(A, B, C, D, E) = p(A) p(B) p(C|A, B) p(D|C) p(E|B, C)
[Diagram: DAG with edges A → C, B → C, C → D, B → E, C → E]
Example – Part I
Sally's burglar Alarm is sounding. Has she been Burgled, or was the alarm triggered by an Earthquake? She turns the car Radio on for news of earthquakes
Choosing an ordering
Without loss of generality, we can write
p(A, R, E, B) = p(A|R, E, B) p(R, E, B)
= p(A|R, E, B) p(R|E, B) p(E, B)
= p(A|R, E, B) p(R|E, B) p(E|B) p(B)
Assumptions
The alarm is not directly influenced by any report on the radio: p(A|R, E, B) = p(A|E, B)
The radio broadcast is not directly influenced by the burglar variable: p(R|E, B) = p(R|E)
Burglaries don't directly 'cause' earthquakes: p(E|B) = p(E)
Therefore
p(A, R, E, B) = p(A|E, B) p(R|E) p(E) p(B)
Example – Part II: Specifying the Tables
[Diagram: belief network B → A ← E, E → R]
p(A = 1|B, E):
  B = 1, E = 1: 0.9999
  B = 1, E = 0: 0.99
  B = 0, E = 1: 0.99
  B = 0, E = 0: 0.0001
p(R = 1|E):
  E = 1: 1
  E = 0: 0
The remaining tables are p(B = 1) = 0.01 and p(E = 1) = 0.000001. The tables and graphical structure fully specify the distribution
Example – Part III: Inference
Initial Evidence: The alarm is sounding
p(B = 1|A = 1) = Σ_{E,R} p(B = 1, E, A = 1, R) / Σ_{B,E,R} p(B, E, A = 1, R)
= Σ_{E,R} p(A = 1|B = 1, E) p(B = 1) p(E) p(R|E) / Σ_{B,E,R} p(A = 1|B, E) p(B) p(E) p(R|E) ≈ 0.99
Additional Evidence: The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1, R = 1) ≈ 0.01
Initially, because the alarm sounds, Sally thinks that she's been burgled. However, this probability drops dramatically when she hears that there has been an earthquake
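Both numbers can be reproduced by brute-force enumeration of the joint p(A, R, E, B) = p(A|E, B) p(R|E) p(E) p(B) using the tables above:

from itertools import product

pB = {1: 0.01, 0: 0.99}
pE = {1: 0.000001, 0: 0.999999}
pA = {(1, 1): 0.9999, (1, 0): 0.99, (0, 1): 0.99, (0, 0): 0.0001}   # p(A=1|B,E)
pR = {1: 1.0, 0: 0.0}                                               # p(R=1|E)

def joint(a, r, e, b):
    pa = pA[(b, e)] if a == 1 else 1 - pA[(b, e)]
    pr = pR[e] if r == 1 else 1 - pR[e]
    return pa * pr * pE[e] * pB[b]

def posterior_B(evidence):
    # evidence is a dict over the observed variables, e.g. {'A': 1, 'R': 1}
    num = den = 0.0
    for a, r, e, b in product([0, 1], repeat=4):
        if any(val != {'A': a, 'R': r}[k] for k, val in evidence.items()):
            continue
        p = joint(a, r, e, b)
        den += p
        if b == 1:
            num += p
    return num / den

print(posterior_B({'A': 1}))            # ~0.99
print(posterior_B({'A': 1, 'R': 1}))    # ~0.01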
Markov Models
For timeseries data v_1, ..., v_T we need a model p(v_{1:T}). For causal consistency it is meaningful to consider the decomposition
p(v_{1:T}) = Π_{t=1}^{T} p(v_t|v_{1:t−1})
with the convention p(v_t|v_{1:t−1}) = p(v_1) for t = 1
[Diagram: cascade of edges from all earlier variables into each v_t, over v1, v2, v3, v4]
Independence assumptions
It is often natural to assume that the influence of the immediate past is more relevant than the remote past, and in Markov models only a limited number of previous observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(v_t|v_1, ..., v_{t−1}) = p(v_t|v_{t−L}, ..., v_{t−1})
where L ≥ 1 is the order of the Markov chain. For a first order (L = 1) chain,
p(v_{1:T}) = p(v_1) p(v_2|v_1) p(v_3|v_2) · · · p(v_T|v_{T−1})
For a stationary Markov chain the transitions p(v_t = s′|v_{t−1} = s) = f(s′, s) are time-independent ('homogeneous')
[Diagram: (a) chain v1 → v2 → v3 → v4; (b) the same chain with additional edges v_{t−2} → v_t]
Figure: (a) First order Markov chain. (b) Second order Markov chain
Markov Chains
[Diagram: first order chain v1 → v2 → v3 → v4]
p(v_1, ..., v_T) = p(v_1) Π_{t=2}^{T} p(v_t|v_{t−1}), with initial distribution p(v_1) and transition p(v_t|v_{t−1})
State transition diagram
Nodes represent states of the variable v and arcs non-zero elements of the transition p(v_t|v_{t−1})
[Diagram: directed graph on states 1–9]
Most probable and shortest paths
[Diagram: the same directed graph on states 1–9]
The shortest (unweighted) path from state 1 to state 7 is 1−2−7
The most probable path from state 1 to state 7 is 1−8−9−7 (assuming uniform transition probabilities). The latter path is longer but more probable, since for the path 1−2−7 the probability of exiting state 2 into state 7 is 1/5
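Since the most probable path maximises a product of transition probabilities, it is the shortest path under edge weights −log p. A sketch with the edge set read loosely off the diagram (an assumption) and uniform transitions out of each state:

import heapq
from math import log

edges = {1: [2, 8], 2: [1, 3, 4, 5, 7], 3: [2, 4], 4: [2, 3, 5], 5: [2, 4, 6],
         6: [5], 7: [2, 9], 8: [1, 9], 9: [7, 8]}

def most_probable_path(src, dst):
    # Dijkstra on weights -log p(next|cur), with uniform p = 1/out-degree
    pq, best = [(0.0, src, [src])], {}
    while pq:
        cost, s, path = heapq.heappop(pq)
        if s == dst:
            return path, cost
        if s in best and best[s] <= cost:
            continue
        best[s] = cost
        for nxt in edges[s]:
            heapq.heappush(pq, (cost - log(1.0 / len(edges[s])), nxt, path + [nxt]))

path, cost = most_probable_path(1, 7)
print(path)   # favours 1-8-9-7: exiting state 2 costs log 5, exiting 8 or 9 only log 2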
Equilibrium distribution
It is interesting to know how the marginal p(x_t) evolves through time:
p(x_t = i) = Σ_j M_{ij} p(x_{t−1} = j), where M_{ij} ≡ p(x_t = i|x_{t−1} = j)
p(x_t = i) is the frequency with which we visit state i at time t, given we started from p(x_1) and randomly drew samples from the transition p(x_τ|x_{τ−1}). As we repeatedly sample a new state from the chain, the distribution at time t for an initial distribution p_1(i) is
p_t = M^{t−1} p_1
If, for t → ∞, p_∞ is independent of the initial distribution p_1, then p_∞ is called the equilibrium distribution of the chain:
p_∞ = M p_∞
The equilibrium distribution is proportional to the eigenvector with unit eigenvalue of the transition matrix
PageRank
Define the matrix
A_{ij} = 1 if website j has a hyperlink to website i, and 0 otherwise
From this we can define a Markov transition matrix with elements
M_{ij} = A_{ij} / Σ_{i′} A_{i′j}
If we jump from website to website, the equilibrium distribution component p_∞(i) is the relative number of times we will visit website i. This has a natural interpretation as the 'importance' of website i
For each website i a list of words associated with that website is collected. After doing this for all websites, one can make an 'inverse' list of which websites contain word w. When a user searches for word w, the list of websites containing that word is then returned, ranked according to the importance of the site
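The equilibrium distribution can be found by power iteration, p_t = M^{t−1} p_1; the 4-site link matrix below is a made-up example:

import numpy as np

A = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)   # A[i, j] = 1 if site j links to site i
M = A / A.sum(axis=0)                       # column-normalise: M[i, j] = A[i, j] / sum_i' A[i', j]

p = np.full(4, 0.25)                        # any initial distribution p_1
for _ in range(100):
    p = M @ p                               # converges to p_inf = M p_inf
print(p)                                    # the 'importance' of each website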
Hidden Markov Models
The HMM defines a Markov chain on hidden (or 'latent') variables h_{1:T}. The observed (or 'visible') variables are dependent on the hidden variables through an emission p(v_t|h_t). This defines a joint distribution
p(h_{1:T}, v_{1:T}) = p(v_1|h_1) p(h_1) Π_{t=2}^{T} p(v_t|h_t) p(h_t|h_{t−1})
For a stationary HMM the transition p(h_t|h_{t−1}) and emission p(v_t|h_t) distributions are constant through time
[Diagram: chain h1 → h2 → h3 → h4 with emissions h_t → v_t]
Figure: A first order hidden Markov model with 'hidden' variables dom(h_t) = {1, ..., H}, t = 1, ..., T. The 'visible' variables v_t can be either discrete or continuous
The classical inference problems
Filtering (inferring the present): p(h_t|v_{1:t})
Prediction (inferring the future): p(h_t|v_{1:s}), t > s
Smoothing (inferring the past): p(h_t|v_{1:u}), t < u
Likelihood: p(v_{1:T})
Most likely path (Viterbi alignment): argmax_{h_{1:T}} p(h_{1:T}|v_{1:T})
For prediction, one is also often interested in p(v_t|v_{1:s}) for t > s
Inference in Hidden Markov Models
Belief network representation of a HMM
[Diagram: belief network h1 → h2 → h3 → h4 with emissions h_t → v_t]
Filtering, Smoothing and Viterbi are all computationally efficient, scaling linearly with the length of the timeseries (but quadratically with the number of hidden states)
The algorithms are variants of 'message passing on factor graphs'
The algorithms are guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing for approximate inference in multiply-connected graphs (e.g. low-density parity-check codes)
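A minimal forward-filtering recursion for a discrete HMM, showing the linear-in-T, quadratic-in-H cost; the transition, emission and observations below are randomly generated placeholders:

import numpy as np

H, V = 3, 4
rng = np.random.default_rng(1)
trans = rng.dirichlet(np.ones(H), size=H).T   # trans[i, j] = p(h_t = i | h_{t-1} = j)
emit = rng.dirichlet(np.ones(V), size=H).T    # emit[v, h]  = p(v_t = v | h_t = h)
prior = np.full(H, 1.0 / H)
obs = [0, 2, 1, 3]

alpha = prior * emit[obs[0]]
alpha /= alpha.sum()                          # p(h_1 | v_1)
for v in obs[1:]:
    alpha = emit[v] * (trans @ alpha)         # predict with the transition, correct with the emission
    alpha /= alpha.sum()                      # p(h_t | v_{1:t})
print(alpha)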
HMMs for speech recognition
h_t is the phoneme at time t; p(h_t|h_{t−1}) – language model; p(v_t|h_t) – speech signal model
Deep Nets and HMMs
[Diagram: HMM with hidden phonemes h1–h4 emitting audio features v1–v4]
Recently companies including Google have made big advances in speech recognition
The breakthrough is to model p(v_t|h_t) as a Gaussian whose mean is some function of the phoneme, μ(h_t; θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deep networks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
[Diagram: latent variables h1, h2 with edges to visibles v1, v2, v3, v4]
It is natural to consider that objects (images, for example) can be constructed on the basis of a low dimensional representation
Note that this is a Graphical Model, not a Function
The latent variables h can be sampled from using p(h), and then an image sampled from p(v|h)
One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in these models
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method – much faster for inference
Variational Inference
Consider a distribution
p(v|θ) = ∫_h p(v|h, θ) p(h)
and that we wish to learn θ to maximise the probability this model generates observed data. A variational lower bound is
log p(v|θ) ≥ −∫_h q(h|v, φ) log q(h|v, φ) + ∫_h q(h|v, φ) log p(v|h, θ) + const.
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
Word Embeddings Constrained Embeddings
We can learn embeddings for English words and embeddings for ChinesewordsHowever when we know that a Chinese and English word have a similarmeaning we add a constraint that the word embeddings vChineseWord andvEnglishWord should be closeWe have only a small amount of labelled lsquosimilarrsquo Chinese-English words(these are the green border boxes in the above they are standard translationsof the corresponding Chinese character)We can visualise in 2D (using t-SNE) the embedding vectors See Socher(2013)
Word Embeddings Constrained Embeddings
Recursive Nets and Embeddings
Stanford Sentiment Treebank Consists of parsed sentences with sentiment labels(minusminusminus 0+++) for each node (phrase) in the tree 215000 labelled phrases(obtained from three humans)
Recursive Nets and Embeddings
Idea is to recursively combine embeddings such that they accurately predictthe sentiment at each node
Recursive Nets and EmbeddingsTraining
We have a softmax classifier for each node in the tree to predict thesentiment of the phrase beneath this node in the tree
The weights of this classifier are shared across all nodes
At the leaf nodes at the bottom of the tree the inputs to the classifiers arethe word embeddings
The embeddings are combined by another network g with commonparameters which forms the input to the sentiment classifier
We then learn all the embeddings shared classifier parameters and sharedcombination parameters to maximise the classification accuracy
Prediction
For a new movie review the review is first parsed using a standard grammartree parser
This forms the tree which can be used to recursively form the sentiment classlabel for the review
Currently the best sentiment classifier Socher (2013)
Recursive Nets and Embeddingsotilde otilde
eth
eth
Icircplusmnsup1raquoreg
eth
Uumlplusmnfrac14sup1raquoreg
otilde
otilde
eth
middotshy
otilde
eth
plusmnsup2raquo
otilde
eth
plusmnordm
otilde
otilde
eth
notcedilraquo
otilde
otilde
eth
sup3plusmnshynot
otilde
frac12plusmnsup3degraquoacuteacutemiddotsup2sup1
eth
ordfiquestregmiddotiquestnotmiddotplusmnsup2shy
eth
eth
plusmnsup2
eth
eth
notcedilmiddotshy
eth
notcedilraquosup3raquo
eth
ograve
yen
eth
eth
Icircplusmnsup1raquoreg
eth
Uumlplusmnfrac14sup1raquoreg
yen
yen
eth
middotshy
yen
eth
plusmnsup2raquo
yen
eth
plusmnordm
yen
yen
eth
notcedilraquo
yen
yen
yen
acuteraquoiquestshynot
otilde
frac12plusmnsup3degraquoacuteacutemiddotsup2sup1
eth
ordfiquestregmiddotiquestnotmiddotplusmnsup2shy
eth
eth
plusmnsup2
eth
eth
notcedilmiddotshy
eth
notcedilraquosup3raquo
eth
ograve
otilde
eth
times
otilde
otilde
otilde
acutemiddotmicroraquofrac14
eth
eth
eth
raquoordfraquoregsect
eth
eth
shymiddotsup2sup1acuteraquo
eth
sup3middotsup2laquonotraquo
eth
eth
plusmnordm
eth
eth
notcedilmiddotshy
eth
eth
ograve
yen
eth
times
yen
yen
eth
eth
frac14middotfrac14
eth
sup2ugravenot
eth
eth
acutemiddotmicroraquo
eth
eth
eth
iquest
eth
eth
shymiddotsup2sup1acuteraquo
eth
sup3middotsup2laquonotraquo
eth
eth
plusmnordm
eth
eth
notcedilmiddotshy
eth
eth
ograve
yen
eth
timesnot
yen
yen
eth
eth
ugraveshy
eth
paralaquoshynot
yen
otilde
middotsup2frac12regraquofrac14middotfrac34acutesect
yen yen
frac14laquoacuteacute
eth
ograve
eth
eth
timesnot
eth
eth
eth
eth
eth
ugraveshy
otilde
yen
sup2plusmnnot
yen yen
frac14laquoacuteacute
eth
ograve
Uacutemiddotsup1laquoregraquo ccedilaelig IcircOgraveIgraveOgrave degregraquofrac14middotfrac12notmiddotplusmnsup2 plusmnordm degplusmnshymiddotnotmiddotordfraquo iquestsup2frac14 sup2raquosup1iquestnotmiddotordfraquo oslashfrac34plusmnnotnotplusmnsup3 regmiddotsup1cedilnotdivide shyraquosup2notraquosup2frac12raquoshy iquestsup2frac14 notcedilraquomiddotreg sup2raquosup1iquestnotmiddotplusmnsup2ograve
Recurrent Nets
x1 x2 x3
h1 h2 h3
y1 y2 y3
A A A
C C C
B B
RNNs are used in timeseries applications
The basic idea is that the hidden units at time ht (and possibly output yt)depend on the previous state of the network htminus1 xtminus1 ytminus1 for inputs xt andoutputs yt
In the above network I lsquounrolled the net through timersquo to give a standard NNdiagram
I omitted the potential links from xtminus1 ytminus1 to ht
Handwriting Generation using a RNN
Some training examples
Handwriting Generation using a RNN
Some generated examples Top line is real handwriting for comparison See AlexGraversquos work
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reasons research in deep learning has exploded
Much greater compute power (GPU)
Much larger datasets
AutoDiff
What is AutoDiff
AutoDiff takes a function f(x) and returns an exact value (up to machineaccuracy) for the gradient
gi(x) equivpart
partxif
∥∥∥∥x
Note that this is not the same as a numerical approximation (such as centraldifferences) for the gradient
One can show that if done efficiently one can always calculate the gradient inless than 5 times the time it takes to compute f(x)
Reverse DifferentiationA useful graphical representation is that the total derivative of f with respect to xis given by the sum over all path values from x to f where each path value is theproduct of the partial derivatives of the functions on the edges
df
dx=partf
partx+partf
partg
dg
dx
x
f
gpartfpartx
dgdx
partfpartg
Example
For f(x) = x2 + xgh where g =x2 and h = xg2
x
f
gh2x+ gh
2x
xh
2gx
xg
g2
f prime(x) = (2x+ gh) + (g2xg) + (2x2gxxg) + (2xxh) = 2x+ 8x7
Reverse DifferentiationConsider
f(x1 x2) = cos (sin(x1x2))
We can represent this computationally using an Abstract Syntax Tree (AST)
x1 x2
f1
f2
f3
f1(x1 x2) = x1x2
f2(x) = sin(x)
f3(x) = cos(x)
Given values for x1 x2 we first run forwards through the tree so that we canassociate each node with an actual function value
Reverse Differentiation
x1 x2
f1
f2
f3
df3dx1
=partf3partf2
df2dx1
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx1
Similarly
df3dx2
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx2
The two derivatives share the same computation branch andwe want to exploit this
Reverse Differentiation
x1 x2
f1
f2
f3
partf1partx1
= x2partf1partx2
= x1
partf2partf1
= cos(f1)
partf3partf2
= minus sin(f2)
1 Find the reverse ancestral (backwards) scheduleof nodes (f3 f2 f1 x1 x2)
2 Start with the first node n1 in the reverseschedule and define tn1 = 1
3 For the next node n in the reverse schedule findthe child nodes ch (n) Then define
tn =sum
cisinch(n)
partfcpartfn
tc
4 The total derivatives of f with respect to theroot nodes of the tree (here x1 and x2) are givenby the values of t at those nodes
This is a general procedure that can be used to automatically define a subroutineto efficiently compute the gradient It is efficient because information is collectedat nodes in the tree and split between parents only when required
Limitations of forward reasoning
World Representation
Recognising patterns (perceptron style) is only one form of intelligence
Solving chess problems is another and requires complex reasoning using someform of internal model
The world is noisy and information may be conflicting
Recognised that new approaches are required
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Limitations of forward reasoning
World Representation
Models help us to fantasise about the world
Models
Models
Burglar Problem
Creaks and Bumps
Creak Bump
Burglar Model
pos1 pos2 pos3 pos4
snd1 snd2 snd3 snd4
pos - position in kitchensnd ndash sound
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Stubby Fingers
Stubby Fingers
int1 int2 int3 int4
hit1 hit2 hit3 hit4
int - intended keyhit ndash hit key
Stubby Fingers errors
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz
005
01
015
02
025
03
035
04
045
05
055
Stubby Fingers language
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz 0
01
02
03
04
05
06
07
08
09
Stubby Fingers
Given the typed sequence cwsykcak what is the most likely word that thiscorresponds to
List the 200 most likely hidden sequences
Discard those that are not in a standard English dictionary
Take the most likely proper English word as the intended typed word
Speech Recognition raw signal
0 01 02 03 04 05 06 07 08 09minus02
minus015
minus01
minus005
0
005
01
015
02
025
03
Time
lsquoneuralrsquo representation
10 20 30 40 50 60 70 80
5
10
15
20
25
Speech Recognition
pho1 pho2 pho3 pho4
aud1 aud2 aud3 aud4
pho phoneme (letter)aud audio signal (neural representation)
Medical Diagnosis
tumour flu meningitis
headache fever appetite x-ray
Combine known medical knowledge with patient specific information
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems such as the IsingModel (1920) and in AI applications such as the HMM (Baum 1966Stratonovich 1960)
The need for structure
We often want to make a probabilistic description of many objects (electronspins neurons customers etc )
Typically the representational and computational cost of probabilistic modelsgrows exponentially with the number of objects represented
Without introducing strong structural limitations about how these objects caninteract probability is a non-starter
For this reason computationally lsquosimplerrsquo alternatives (such as fuzzy logic)were introduced to try to avoid some of these difficulties ndash however these aretypically frowed on by purists
Graphical Models
We can use graphs to represent how objects can probabilistically interact witheach other
Graphical Models and then a marriage between Graph and Probability theory
Many of the quantities that we would like to compute in a probabilitydistribution can then be related to operations on the graph
The computational complexity of operations can often be related to thestructure of the graph
Graphical Models are now used as a standard framework in EngineeringStatistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty and aretherefore widely applicable
Uses in Industry
Microsoft used to estimate the skill distribution of players in online games(the worlds largest graphical model)
Hospitals use Belief Nets to encode knowledge about diseases and symptomsto aid medical diagnosis
Google Microsoft Facebook used in many places including advertisingvideo game prediction speech recognition
Used to estimate inherent desirability of products in consumer retail
Microsoft and others Attempt to go beyond simple AB testing by usesGraphical Models to model the whole companyuser relationship
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or more shortly theprobability of x given y) is defined as
p(x|y) equiv p(x y)
p(y)=p(y|x)p(x)
p(y)(Bayesrsquo rule)
Throwing darts
p(region 5|not region 20) =p(region 5 not region 20)
p(not region 20)
=p(region 5)
p(not region 20)=
120
1920=
1
19
Interpretationp(A = a|B = b) should not be interpreted as lsquoGiven the event B = b has occurredp(A = a|B = b) is the probability of the event A = a occurringrsquo The correctinterpretation should be lsquop(A = a|B = b) is the probability of A being in state aunder the constraint that B is in state brsquo
Battleships
Assume there are 2 ships 1 vertical (ship 1) and 1 horizontal (ship 2) of 5pixels each
Can be placed anywhere on the 10times10 grid but cannot overlap
Let s1 is the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query lsquohitrsquo or lsquomissrsquo responses
p(s1 s2|D) =p(D|s1 s2)p(s1 s2)
p(D)Let X be the matrix of pixel occupancy
p(X|D) =sums1s2
p(X s1 s2|D) =sums1s2
p(X|s1 s2)p(s1 s2|D)
demoBattleshipsm
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
Word Embeddings Constrained Embeddings
Recursive Nets and Embeddings
Stanford Sentiment Treebank Consists of parsed sentences with sentiment labels(minusminusminus 0+++) for each node (phrase) in the tree 215000 labelled phrases(obtained from three humans)
Recursive Nets and Embeddings
Idea is to recursively combine embeddings such that they accurately predictthe sentiment at each node
Recursive Nets and EmbeddingsTraining
We have a softmax classifier for each node in the tree to predict thesentiment of the phrase beneath this node in the tree
The weights of this classifier are shared across all nodes
At the leaf nodes at the bottom of the tree the inputs to the classifiers arethe word embeddings
The embeddings are combined by another network g with commonparameters which forms the input to the sentiment classifier
We then learn all the embeddings shared classifier parameters and sharedcombination parameters to maximise the classification accuracy
Prediction
For a new movie review the review is first parsed using a standard grammartree parser
This forms the tree which can be used to recursively form the sentiment classlabel for the review
Currently the best sentiment classifier Socher (2013)
Recursive Nets and Embeddingsotilde otilde
eth
eth
Icircplusmnsup1raquoreg
eth
Uumlplusmnfrac14sup1raquoreg
otilde
otilde
eth
middotshy
otilde
eth
plusmnsup2raquo
otilde
eth
plusmnordm
otilde
otilde
eth
notcedilraquo
otilde
otilde
eth
sup3plusmnshynot
otilde
frac12plusmnsup3degraquoacuteacutemiddotsup2sup1
eth
ordfiquestregmiddotiquestnotmiddotplusmnsup2shy
eth
eth
plusmnsup2
eth
eth
notcedilmiddotshy
eth
notcedilraquosup3raquo
eth
ograve
yen
eth
eth
Icircplusmnsup1raquoreg
eth
Uumlplusmnfrac14sup1raquoreg
yen
yen
eth
middotshy
yen
eth
plusmnsup2raquo
yen
eth
plusmnordm
yen
yen
eth
notcedilraquo
yen
yen
yen
acuteraquoiquestshynot
otilde
frac12plusmnsup3degraquoacuteacutemiddotsup2sup1
eth
ordfiquestregmiddotiquestnotmiddotplusmnsup2shy
eth
eth
plusmnsup2
eth
eth
notcedilmiddotshy
eth
notcedilraquosup3raquo
eth
ograve
otilde
eth
times
otilde
otilde
otilde
acutemiddotmicroraquofrac14
eth
eth
eth
raquoordfraquoregsect
eth
eth
shymiddotsup2sup1acuteraquo
eth
sup3middotsup2laquonotraquo
eth
eth
plusmnordm
eth
eth
notcedilmiddotshy
eth
eth
ograve
yen
eth
times
yen
yen
eth
eth
frac14middotfrac14
eth
sup2ugravenot
eth
eth
acutemiddotmicroraquo
eth
eth
eth
iquest
eth
eth
shymiddotsup2sup1acuteraquo
eth
sup3middotsup2laquonotraquo
eth
eth
plusmnordm
eth
eth
notcedilmiddotshy
eth
eth
ograve
yen
eth
timesnot
yen
yen
eth
eth
ugraveshy
eth
paralaquoshynot
yen
otilde
middotsup2frac12regraquofrac14middotfrac34acutesect
yen yen
frac14laquoacuteacute
eth
ograve
eth
eth
timesnot
eth
eth
eth
eth
eth
ugraveshy
otilde
yen
sup2plusmnnot
yen yen
frac14laquoacuteacute
eth
ograve
Uacutemiddotsup1laquoregraquo ccedilaelig IcircOgraveIgraveOgrave degregraquofrac14middotfrac12notmiddotplusmnsup2 plusmnordm degplusmnshymiddotnotmiddotordfraquo iquestsup2frac14 sup2raquosup1iquestnotmiddotordfraquo oslashfrac34plusmnnotnotplusmnsup3 regmiddotsup1cedilnotdivide shyraquosup2notraquosup2frac12raquoshy iquestsup2frac14 notcedilraquomiddotreg sup2raquosup1iquestnotmiddotplusmnsup2ograve
Recurrent Nets
x1 x2 x3
h1 h2 h3
y1 y2 y3
A A A
C C C
B B
RNNs are used in timeseries applications
The basic idea is that the hidden units at time ht (and possibly output yt)depend on the previous state of the network htminus1 xtminus1 ytminus1 for inputs xt andoutputs yt
In the above network I lsquounrolled the net through timersquo to give a standard NNdiagram
I omitted the potential links from xtminus1 ytminus1 to ht
Handwriting Generation using a RNN
Some training examples
Handwriting Generation using a RNN
Some generated examples Top line is real handwriting for comparison See AlexGraversquos work
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reasons research in deep learning has exploded
Much greater compute power (GPU)
Much larger datasets
AutoDiff
What is AutoDiff
AutoDiff takes a function f(x) and returns an exact value (up to machineaccuracy) for the gradient
gi(x) equivpart
partxif
∥∥∥∥x
Note that this is not the same as a numerical approximation (such as centraldifferences) for the gradient
One can show that if done efficiently one can always calculate the gradient inless than 5 times the time it takes to compute f(x)
Reverse DifferentiationA useful graphical representation is that the total derivative of f with respect to xis given by the sum over all path values from x to f where each path value is theproduct of the partial derivatives of the functions on the edges
df
dx=partf
partx+partf
partg
dg
dx
x
f
gpartfpartx
dgdx
partfpartg
Example
For f(x) = x2 + xgh where g =x2 and h = xg2
x
f
gh2x+ gh
2x
xh
2gx
xg
g2
f prime(x) = (2x+ gh) + (g2xg) + (2x2gxxg) + (2xxh) = 2x+ 8x7
Reverse DifferentiationConsider
f(x1 x2) = cos (sin(x1x2))
We can represent this computationally using an Abstract Syntax Tree (AST)
x1 x2
f1
f2
f3
f1(x1 x2) = x1x2
f2(x) = sin(x)
f3(x) = cos(x)
Given values for x1 x2 we first run forwards through the tree so that we canassociate each node with an actual function value
Reverse Differentiation
x1 x2
f1
f2
f3
df3dx1
=partf3partf2
df2dx1
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx1
Similarly
df3dx2
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx2
The two derivatives share the same computation branch andwe want to exploit this
Reverse Differentiation
x1 x2
f1
f2
f3
partf1partx1
= x2partf1partx2
= x1
partf2partf1
= cos(f1)
partf3partf2
= minus sin(f2)
1 Find the reverse ancestral (backwards) scheduleof nodes (f3 f2 f1 x1 x2)
2 Start with the first node n1 in the reverseschedule and define tn1 = 1
3 For the next node n in the reverse schedule findthe child nodes ch (n) Then define
tn =sum
cisinch(n)
partfcpartfn
tc
4 The total derivatives of f with respect to theroot nodes of the tree (here x1 and x2) are givenby the values of t at those nodes
This is a general procedure that can be used to automatically define a subroutineto efficiently compute the gradient It is efficient because information is collectedat nodes in the tree and split between parents only when required
Limitations of forward reasoning
World Representation
Recognising patterns (perceptron style) is only one form of intelligence
Solving chess problems is another and requires complex reasoning using someform of internal model
The world is noisy and information may be conflicting
Recognised that new approaches are required
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Limitations of forward reasoning
World Representation
Models help us to fantasise about the world
Models
Models
Burglar Problem
Creaks and Bumps
Creak Bump
Burglar Model
pos1 pos2 pos3 pos4
snd1 snd2 snd3 snd4
pos - position in kitchensnd ndash sound
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Stubby Fingers
Stubby Fingers
int1 int2 int3 int4
hit1 hit2 hit3 hit4
int - intended keyhit ndash hit key
Stubby Fingers errors
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz
005
01
015
02
025
03
035
04
045
05
055
Stubby Fingers language
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz 0
01
02
03
04
05
06
07
08
09
Stubby Fingers
Given the typed sequence cwsykcak what is the most likely word that thiscorresponds to
List the 200 most likely hidden sequences
Discard those that are not in a standard English dictionary
Take the most likely proper English word as the intended typed word
Speech Recognition raw signal
0 01 02 03 04 05 06 07 08 09minus02
minus015
minus01
minus005
0
005
01
015
02
025
03
Time
lsquoneuralrsquo representation
10 20 30 40 50 60 70 80
5
10
15
20
25
Speech Recognition
pho1 pho2 pho3 pho4
aud1 aud2 aud3 aud4
pho phoneme (letter)aud audio signal (neural representation)
Medical Diagnosis
tumour flu meningitis
headache fever appetite x-ray
Combine known medical knowledge with patient specific information
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems such as the IsingModel (1920) and in AI applications such as the HMM (Baum 1966Stratonovich 1960)
The need for structure
We often want to make a probabilistic description of many objects (electronspins neurons customers etc )
Typically the representational and computational cost of probabilistic modelsgrows exponentially with the number of objects represented
Without introducing strong structural limitations about how these objects caninteract probability is a non-starter
For this reason computationally lsquosimplerrsquo alternatives (such as fuzzy logic)were introduced to try to avoid some of these difficulties ndash however these aretypically frowed on by purists
Graphical Models
We can use graphs to represent how objects can probabilistically interact witheach other
Graphical Models and then a marriage between Graph and Probability theory
Many of the quantities that we would like to compute in a probabilitydistribution can then be related to operations on the graph
The computational complexity of operations can often be related to thestructure of the graph
Graphical Models are now used as a standard framework in EngineeringStatistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty and aretherefore widely applicable
Uses in Industry
Microsoft used to estimate the skill distribution of players in online games(the worlds largest graphical model)
Hospitals use Belief Nets to encode knowledge about diseases and symptomsto aid medical diagnosis
Google Microsoft Facebook used in many places including advertisingvideo game prediction speech recognition
Used to estimate inherent desirability of products in consumer retail
Microsoft and others Attempt to go beyond simple AB testing by usesGraphical Models to model the whole companyuser relationship
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or more shortly theprobability of x given y) is defined as
p(x|y) equiv p(x y)
p(y)=p(y|x)p(x)
p(y)(Bayesrsquo rule)
Throwing darts
p(region 5|not region 20) =p(region 5 not region 20)
p(not region 20)
=p(region 5)
p(not region 20)=
120
1920=
1
19
Interpretationp(A = a|B = b) should not be interpreted as lsquoGiven the event B = b has occurredp(A = a|B = b) is the probability of the event A = a occurringrsquo The correctinterpretation should be lsquop(A = a|B = b) is the probability of A being in state aunder the constraint that B is in state brsquo
Battleships
Assume there are 2 ships 1 vertical (ship 1) and 1 horizontal (ship 2) of 5pixels each
Can be placed anywhere on the 10times10 grid but cannot overlap
Let s1 is the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query lsquohitrsquo or lsquomissrsquo responses
p(s1 s2|D) =p(D|s1 s2)p(s1 s2)
p(D)Let X be the matrix of pixel occupancy
p(X|D) =sums1s2
p(X s1 s2|D) =sums1s2
p(X|s1 s2)p(s1 s2|D)
demoBattleshipsm
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain

Only the recent past is relevant:

p(v_t|v_1, ..., v_{t−1}) = p(v_t|v_{t−L}, ..., v_{t−1})

where L ≥ 1 is the order of the Markov chain. For a first order (L = 1) chain,

p(v_{1:T}) = p(v1) p(v2|v1) p(v3|v2) ... p(vT|v_{T−1})

For a stationary Markov chain the transitions p(v_t = s′|v_{t−1} = s) = f(s′, s) are time-independent ('homogeneous').

Figure: (a) first order Markov chain; (b) second order Markov chain.
Markov Chains

p(v1, ..., vT) = p(v1) [initial] × Π_{t=2}^{T} p(v_t|v_{t−1}) [transition]

State transition diagram
Nodes represent states of the variable v, and arcs non-zero elements of the transition p(v_t|v_{t−1}).

(Figure: state transition diagram on states 1–9.)
Most probable and shortest paths

(Figure: the same state transition diagram on states 1–9.)

The shortest (unweighted) path from state 1 to state 7 is 1−2−7.
The most probable path from state 1 to state 7 is 1−8−9−7 (assuming uniform transition probabilities). The latter path is longer but more probable, since for the path 1−2−7 the probability of exiting state 2 into state 7 is 1/5.
Equilibrium distribution

It is interesting to know how the marginal p(x_t) evolves through time:

p(x_t = i) = Σ_j p(x_t = i|x_{t−1} = j) p(x_{t−1} = j),   with M_ij ≡ p(x_t = i|x_{t−1} = j)

p(x_t = i) is the frequency that we visit state i at time t, given we started from p(x1) and randomly drew samples from the transition p(x_τ|x_{τ−1}). As we repeatedly sample a new state from the chain, the distribution at time t for an initial distribution p1(i) is

p_t = M^{t−1} p_1

If, for t → ∞, p_∞ is independent of the initial distribution p_1, then p_∞ is called the equilibrium distribution of the chain:

p_∞ = M p_∞

The equilibrium distribution is proportional to the eigenvector with unit eigenvalue of the transition matrix.
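A small sketch of these statements, for an assumed toy two-state chain with column-stochastic M (M[i, j] = p(x_t = i | x_{t−1} = j)):

import numpy as np

M = np.array([[0.9, 0.5],
              [0.1, 0.5]])
p = np.array([1.0, 0.0])           # start deterministically in state 1
for _ in range(100):               # power iteration: p_t = M^(t-1) p_1
    p = M @ p
print(p)                           # equilibrium (about [0.833, 0.167])

# Same answer from the eigenvector with unit eigenvalue:
vals, vecs = np.linalg.eig(M)
v = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
print(v / v.sum())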
PageRank
Define the matrix

A_ij = 1 if website j has a hyperlink to website i, 0 otherwise

From this we can define a Markov transition matrix with elements

M_ij = A_ij / Σ_{i′} A_{i′j}

If we jump from website to website, the equilibrium distribution component p_∞(i) is the relative number of times we will visit website i. This has a natural interpretation as the 'importance' of website i.
For each website i, a list of words associated with that website is collected. After doing this for all websites, one can make an 'inverse' list of which websites contain word w. When a user searches for word w, the list of websites that contain the word is then returned, ranked according to the importance of the site.
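A toy version of this construction on an assumed four-site link graph (real PageRank adds a damping term and handles pages with no out-links; here every column of A is non-zero so M is well defined):

import numpy as np

# A[i, j] = 1 if site j links to site i
A = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 1, 0, 0],
              [0, 0, 1, 0]], dtype=float)
M = A / A.sum(axis=0, keepdims=True)   # column-normalise: M_ij = A_ij / sum_i' A_i'j

p = np.ones(4) / 4
for _ in range(200):                   # power iteration to the equilibrium p_inf
    p = M @ p
print(np.argsort(-p), p.round(3))      # sites ranked by 'importance'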
Hidden Markov Models

The HMM defines a Markov chain on hidden (or 'latent') variables h_{1:T}. The observed (or 'visible') variables are dependent on the hidden variables through an emission p(v_t|h_t). This defines a joint distribution

p(h_{1:T}, v_{1:T}) = p(v1|h1) p(h1) Π_{t=2}^{T} p(v_t|h_t) p(h_t|h_{t−1})

For a stationary HMM the transition p(h_t|h_{t−1}) and emission p(v_t|h_t) distributions are constant through time.

Figure: a first order hidden Markov model with 'hidden' variables dom(h_t) = {1, ..., H}, t = 1, ..., T. The 'visible' variables v_t can be either discrete or continuous.
The classical inference problems:

Filtering (inferring the present): p(h_t|v_{1:t})
Prediction (inferring the future): p(h_t|v_{1:s}), t > s
Smoothing (inferring the past): p(h_t|v_{1:u}), t < u
Likelihood: p(v_{1:T})
Most likely path (Viterbi alignment): argmax_{h_{1:T}} p(h_{1:T}|v_{1:T})

For prediction, one is also often interested in p(v_t|v_{1:s}) for t > s.
Inference in Hidden Markov Models
Belief network representation of a HMM:

(Figure: chain h1 → h2 → h3 → h4 with emissions h_t → v_t.)
Filtering, Smoothing and Viterbi are all computationally efficient, scaling linearly with the length of the timeseries (but quadratically with the number of hidden states).
The algorithms are variants of 'message passing on factor graphs'.
The algorithm is guaranteed to work if the graph is singly-connected.
There has been a huge research effort in the last 15 years to apply message passing for approximate inference in multiply-connected graphs (e.g. low-density parity-check codes).
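A sketch of the filtering recursion (the forward algorithm) for a discrete HMM, with assumed toy transition and emission tables; the cost is O(T H²), matching the scaling just described:

import numpy as np

H, V = 3, 4
rng = np.random.default_rng(0)
trans = rng.dirichlet(np.ones(H), size=H).T   # trans[i, j] = p(h_t=i | h_{t-1}=j)
emit = rng.dirichlet(np.ones(V), size=H).T    # emit[v, h]  = p(v_t=v | h_t=h)
prior = np.ones(H) / H

def filtering(obs):
    alphas = []
    alpha = emit[obs[0]] * prior              # correct on the first observation
    alpha /= alpha.sum()
    alphas.append(alpha)
    for v in obs[1:]:
        alpha = emit[v] * (trans @ alpha)     # predict one step, then correct
        alpha /= alpha.sum()                  # normalised: p(h_t | v_{1:t})
        alphas.append(alpha)
    return np.array(alphas)

print(filtering([0, 2, 3, 1]))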
HMMs for speech recognition

h_t is the phoneme at time t; p(h_t|h_{t−1}) – language model; p(v_t|h_t) – speech signal model.
Deep Nets and HMMs
(Figure: the same HMM chain, h_t → v_t.)

Recently, companies including Google have made big advances in speech recognition.
The breakthrough is to model p(v_t|h_t) as a Gaussian whose mean is some function of the phoneme, μ(h_t; θ).
This function is a deep neural network, trained on a large amount of data.
There is a goldrush at the moment to find similar breakthrough applications of deep networks in reasoning systems.
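A rough sketch of such an emission model, with an assumed small random network and arbitrary sizes (real systems are trained, far larger, and operate on acoustic features):

import numpy as np

H, D, hidden = 40, 25, 64           # phonemes, acoustic dims, hidden units
rng = np.random.default_rng(1)
W1, b1 = rng.normal(0, 0.1, (hidden, H)), np.zeros(hidden)
W2, b2 = rng.normal(0, 0.1, (D, hidden)), np.zeros(D)
sigma2 = 0.1

def mu(h):
    # neural-network mean mu(h_t; theta), here from a one-hot phoneme code
    one_hot = np.eye(H)[h]
    return W2 @ np.tanh(W1 @ one_hot + b1) + b2

def log_emission(v, h):
    # log N(v; mu(h), sigma^2 I)
    d = v - mu(h)
    return -0.5 * (d @ d) / sigma2 - 0.5 * D * np.log(2 * np.pi * sigma2)

v = rng.normal(size=D)
print([round(log_emission(v, h), 2) for h in range(3)])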
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model

(Figure: latents h1, h2 with edges to visibles v1, v2, v3, v4.)

It is natural to consider that objects (images, for example) can be constructed on the basis of a low dimensional representation.
Note that this is a Graphical Model, not a Function.
The latent variables h can be sampled from using p(h), and then an image sampled from p(v|h).
One cannot use an autoencoder to generate new images.
The bad news
Inference (computing p(h|v)) and parameter learning are intractable in these models.
Statisticians typically use sampling as an approximation.
It is very popular in ML to use a variational method – much faster for inference.
Variational Inference
Consider a distribution

p(v|θ) = ∫_h p(v|h, θ) p(h)

and suppose that we wish to learn θ to maximise the probability that this model generates the observed data. Then

log p(v|θ) ≥ −∫_h q(h|v, φ) log q(h|v, φ) + ∫_h q(h|v, φ) log [p(v|h, θ) p(h)]

The idea is to choose a 'variational' distribution q(h|v, φ) such that we can either calculate the bound analytically or sample it efficiently.
We then jointly maximise the bound w.r.t. φ and θ.
We can parameterise p(v|h, θ) using a deep network.
Very popular approach – see 'variational autoencoder' and also attention mechanisms.
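A minimal illustration of the bound for a toy model where everything is Gaussian, so the exact marginal likelihood is available for comparison; in a variational autoencoder, m and s would be produced by an encoder network and θ would parameterise a decoder network:

import numpy as np

# Toy model: p(h) = N(0, 1), p(v|h, theta) = N(theta * h, 1),
# variational q(h|v, phi) = N(m, s^2). All numbers are illustrative.
rng = np.random.default_rng(2)

def log_normal(x, mean, var):
    return -0.5 * ((x - mean) ** 2 / var + np.log(2 * np.pi * var))

def elbo(v, theta, m, s, n_samples=10000):
    h = m + s * rng.normal(size=n_samples)       # samples from q(h|v, phi)
    log_joint = log_normal(v, theta * h, 1.0) + log_normal(h, 0.0, 1.0)
    log_q = log_normal(h, m, s ** 2)
    return np.mean(log_joint - log_q)            # Monte Carlo lower bound

v, theta = 1.3, 0.8
# Exact marginal for this linear-Gaussian model: v ~ N(0, theta^2 + 1)
print(elbo(v, theta, m=0.6, s=0.8))              # close to, and below, ...
print(log_normal(v, 0.0, theta ** 2 + 1.0))      # ... the exact log p(v|theta)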
Extension to a semi-supervised method using p(v) = ∫_h Σ_c p(v|h, c) p(c) p(h).
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games?
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A, we need to decide which action to take, for any state of W, that will be best for our long term goals.
The problem is that the number of pixel states is enormous.
We need to learn a low dimensional representation of the screen (use a deep generative model).
Then learn which action to take, given the low dimensional representation.
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition, Image Analysis, Game Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve the interaction between reinforcement learning and representation learning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer:
https://reinfer.io
Recursive Nets and Embeddings
Stanford Sentiment Treebank: consists of parsed sentences with sentiment labels (−−, −, 0, +, ++) for each node (phrase) in the tree; 215,000 labelled phrases (obtained from three humans).
Recursive Nets and Embeddings
The idea is to recursively combine embeddings such that they accurately predict the sentiment at each node.
Recursive Nets and Embeddings
Training
We have a softmax classifier for each node in the tree, to predict the sentiment of the phrase beneath this node in the tree.
The weights of this classifier are shared across all nodes.
At the leaf nodes, at the bottom of the tree, the inputs to the classifiers are the word embeddings.
The embeddings are combined by another network g, with common parameters, which forms the input to the sentiment classifier.
We then learn all the embeddings, shared classifier parameters and shared combination parameters to maximise the classification accuracy.
Prediction
For a new movie review, the review is first parsed using a standard grammar tree parser.
This forms the tree, which can then be used to recursively form the sentiment class label for the review.
Currently the best sentiment classifier: Socher (2013).
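A sketch of this recursive construction, with random parameters and a simple concatenate-and-tanh combination g (the actual model of Socher (2013) uses a tensor-based combination and is, of course, trained rather than random):

import numpy as np

d, classes = 10, 5
rng = np.random.default_rng(3)
Wg = rng.normal(0, 0.1, (d, 2 * d))        # shared combination network g
Wc = rng.normal(0, 0.1, (classes, d))      # shared sentiment classifier

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def node_vector(tree, embeddings, predictions):
    if isinstance(tree, str):               # leaf: a word embedding
        vec = embeddings[tree]
    else:                                   # internal node: (left, right)
        left = node_vector(tree[0], embeddings, predictions)
        right = node_vector(tree[1], embeddings, predictions)
        vec = np.tanh(Wg @ np.concatenate([left, right]))
    predictions.append(softmax(Wc @ vec))   # sentiment predicted at every node
    return vec

embeddings = {w: rng.normal(size=d) for w in ['not', 'very', 'good']}
preds = []
node_vector(('not', ('very', 'good')), embeddings, preds)
print(len(preds), preds[-1].round(2))       # one prediction per tree node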
Recursive Nets and Embeddings

(Figure, from Socher (2013), garbled in extraction: RNTN prediction of positive and negative (bottom right) sentences and their negation. The example trees include 'Roger Dodger is one of the most compelling variations on this theme' / 'Roger Dodger is one of the least compelling variations on this theme', 'I liked every single minute of this film' / 'I didn't like a single minute of this film', and 'It's just incredibly dull' / 'It's not dull'.)
Recurrent Nets
(Figure: RNN unrolled through time, with inputs x1, x2, x3, hidden units h1, h2, h3, outputs y1, y2, y3, and shared weights A, B, C.)

RNNs are used in timeseries applications.
The basic idea is that the hidden units at time t (and possibly output y_t) depend on the previous state of the network h_{t−1}, x_{t−1}, y_{t−1}, for inputs x_t and outputs y_t.
In the above network I 'unrolled the net through time' to give a standard NN diagram.
I omitted the potential links from x_{t−1}, y_{t−1} to h_t.
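A minimal unrolled RNN in this spirit; reading the diagram's shared weights as A: input to hidden, B: hidden to hidden, C: hidden to output is my assumption, and all sizes are arbitrary:

import numpy as np

dx, dh, dy, T = 4, 8, 3, 5
rng = np.random.default_rng(4)
A = rng.normal(0, 0.3, (dh, dx))
B = rng.normal(0, 0.3, (dh, dh))
C = rng.normal(0, 0.3, (dy, dh))

h = np.zeros(dh)
for t in range(T):
    x = rng.normal(size=dx)       # input at time t (random placeholder)
    h = np.tanh(A @ x + B @ h)    # hidden state carries information from the past
    y = C @ h                     # output at time t
    print(t, y.round(2))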
Handwriting Generation using a RNN
Some training examples
Handwriting Generation using a RNN
Some generated examples. Top line is real handwriting, for comparison. See Alex Graves's work.
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reasons research in deep learning has exploded
Much greater compute power (GPU)
Much larger datasets
AutoDiff
What is AutoDiff?

AutoDiff takes a function f(x) and returns an exact value (up to machine accuracy) for the gradient

g_i(x) ≡ ∂f/∂x_i, evaluated at x

Note that this is not the same as a numerical approximation (such as central differences) for the gradient.
One can show that, if done efficiently, one can always calculate the gradient in less than 5 times the time it takes to compute f(x).
Reverse Differentiation
A useful graphical representation is that the total derivative of f with respect to x is given by the sum over all path values from x to f, where each path value is the product of the partial derivatives of the functions on the edges:

df/dx = ∂f/∂x + (∂f/∂g)(dg/dx)

(Figure: nodes x, g, f with edges labelled ∂f/∂x, dg/dx and ∂f/∂g.)
Example

For f(x) = x² + x g h, where g = x² and h = x g²:

(Figure: nodes x, g, h, f with edge labels 2x + gh (x → f), 2x (x → g), xh (g → f), 2gx (g → h), xg (h → f), g² (x → h).)

f′(x) = (2x + gh) + (g² · xg) + (2x · 2gx · xg) + (2x · xh) = 2x + 8x⁷
Reverse Differentiation
Consider

f(x1, x2) = cos(sin(x1 x2))

We can represent this computationally using an Abstract Syntax Tree (AST):

(Figure: tree with leaves x1, x2 feeding f1, then f2, then f3.)

f1(x1, x2) = x1 x2
f2(x) = sin(x)
f3(x) = cos(x)

Given values for x1, x2, we first run forwards through the tree, so that we can associate each node with an actual function value.
Reverse Differentiation

(Figure: the same AST.)

df3/dx1 = (∂f3/∂f2)(df2/dx1) = (∂f3/∂f2)(df2/df1) (df1/dx1), where (∂f3/∂f2)(df2/df1) = df3/df1

Similarly,

df3/dx2 = (∂f3/∂f2)(df2/df1) (df1/dx2) = (df3/df1)(df1/dx2)

The two derivatives share the same computation branch, and we want to exploit this.
Reverse Differentiation

For the AST above:

∂f1/∂x1 = x2,  ∂f1/∂x2 = x1
∂f2/∂f1 = cos(f1)
∂f3/∂f2 = −sin(f2)

1. Find the reverse ancestral (backwards) schedule of nodes (f3, f2, f1, x1, x2).
2. Start with the first node n1 in the reverse schedule and define t_{n1} = 1.
3. For the next node n in the reverse schedule, find the child nodes ch(n). Then define

t_n = Σ_{c ∈ ch(n)} (∂f_c/∂f_n) t_c

4. The total derivatives of f with respect to the root nodes of the tree (here x1 and x2) are given by the values of t at those nodes.

This is a general procedure that can be used to automatically define a subroutine to efficiently compute the gradient. It is efficient because information is collected at nodes in the tree and split between parents only when required.
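Run on the AST above, the procedure looks as follows; this is a hand-unrolled sketch, whereas a real AutoDiff system would build the reverse schedule automatically from the computation graph:

import math

def f_and_grad(x1, x2):
    # forward pass: associate each AST node with a value
    f1 = x1 * x2
    f2 = math.sin(f1)
    f3 = math.cos(f2)
    # reverse schedule: f3, f2, f1, x1, x2, accumulating t_n = sum_c (df_c/df_n) t_c
    t_f3 = 1.0
    t_f2 = -math.sin(f2) * t_f3        # df3/df2
    t_f1 = math.cos(f1) * t_f2         # df2/df1 -- the branch shared by both inputs
    t_x1 = x2 * t_f1                   # df1/dx1
    t_x2 = x1 * t_f1                   # df1/dx2
    return f3, (t_x1, t_x2)

val, grad = f_and_grad(0.7, 1.1)
print(val, grad)

# sanity check against central differences (the numerical approximation
# mentioned earlier, which AutoDiff is *not*)
eps = 1e-6
num = (f_and_grad(0.7 + eps, 1.1)[0] - f_and_grad(0.7 - eps, 1.1)[0]) / (2 * eps)
print(num)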
Limitations of forward reasoning
World Representation
Recognising patterns (perceptron style) is only one form of intelligence
Solving chess problems is another, and requires complex reasoning using some form of internal model.
The world is noisy and information may be conflicting
Recognised that new approaches are required
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Limitations of forward reasoning
World Representation
Models help us to fantasise about the world
Models
Burglar Problem
Creaks and Bumps
Creak Bump
Burglar Model
(Figure: HMM with hidden positions pos1 → pos2 → pos3 → pos4 and emitted sounds snd_t.)
pos – position in kitchen; snd – sound
Finding the Burglar

(Figure, shown over several animation frames: a grid of 'creak' and 'bump' observations accumulating over time, from which the burglar's position in the kitchen is inferred.)
Stubby Fingers
Stubby Fingers
(Figure: HMM with intended keys int1 → int2 → int3 → int4 and hit keys hit_t.)
int – intended key; hit – hit key
Stubby Fingers: errors

(Figure: 26×26 heat map of the error model p(hit key | intended key) over the letters a–z; colour scale roughly 0.05–0.55.)
Stubby Fingers: language

(Figure: 26×26 heat map of the letter-to-letter transition probabilities p(int_t | int_{t−1}) over the letters a–z; colour scale roughly 0–0.9.)
Stubby Fingers
Given the typed sequence cwsykcak, what is the most likely word that this corresponds to?
List the 200 most likely hidden sequences
Discard those that are not in a standard English dictionary
Take the most likely proper English word as the intended typed word (a sketch of this recipe follows).
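A sketch of the recipe, with beam search standing in for exact N-best enumeration, and random stand-in tables for the two heat maps above; with the real error and language models the dictionary filter would recover the intended word, whereas here it will typically return nothing:

import numpy as np

letters = 'abcdefghijklmnopqrstuvwxyz'
K = len(letters)
rng = np.random.default_rng(5)
trans = rng.dirichlet(np.ones(K), size=K)  # trans[i, j] = p(int_t=j | int_{t-1}=i)
emit = rng.dirichlet(np.ones(K), size=K)   # emit[i, j]  = p(hit=j  | int=i)

def nbest(typed, n=200):
    beams = [((), 0.0)]                    # (intended letters so far, log prob)
    for ch in typed:
        v = letters.index(ch)
        scored = []
        for seq, lp in beams:
            for h in range(K):
                lt = np.log(trans[seq[-1], h]) if seq else -np.log(K)
                scored.append((seq + (h,), lp + lt + np.log(emit[h, v])))
        scored.sort(key=lambda x: -x[1])
        beams = scored[:n]                 # keep the n most likely partial sequences
    return [''.join(letters[h] for h in seq) for seq, lp in beams]

candidates = nbest('cwsykcak')
dictionary = {'casually', 'cosmetic'}      # stand-in for a real English dictionary
print(candidates[:3])
print([w for w in candidates if w in dictionary])   # likely empty with random tables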
Speech Recognition: raw signal

(Figure: raw audio waveform, amplitude roughly −0.2 to 0.3 over about 0.9 seconds.)

'neural' representation

(Figure: spectrogram-like 'neural' representation, roughly 25 channels over 80 frames.)
Speech Recognition
(Figure: HMM with phonemes pho1 → pho2 → pho3 → pho4 and audio observations aud_t.)
pho – phoneme (letter); aud – audio signal (neural representation)
Medical Diagnosis
(Figure: belief network with diseases tumour, flu, meningitis as parents of the symptoms/tests headache, fever, appetite, x-ray.)
Combine known medical knowledge with patient specific information
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems, such as the Ising Model (1920), and in AI applications, such as the HMM (Baum 1966; Stratonovich 1960).
The need for structure
We often want to make a probabilistic description of many objects (electron spins, neurons, customers, etc.).
Typically, the representational and computational cost of probabilistic models grows exponentially with the number of objects represented.
Without introducing strong structural limitations on how these objects can interact, probability is a non-starter.
For this reason, computationally 'simpler' alternatives (such as fuzzy logic) were introduced to try to avoid some of these difficulties – however, these are typically frowned upon by purists.
Graphical Models
We can use graphs to represent how objects can probabilistically interact with each other.
Graphical Models are then a marriage between Graph and Probability theory.
Many of the quantities that we would like to compute in a probability distribution can then be related to operations on the graph.
The computational complexity of operations can often be related to the structure of the graph.
Graphical Models are now used as a standard framework in Engineering, Statistics and Computer Science.
Graphical Models are used to perform reasoning under uncertainty, and are therefore widely applicable.
Uses in Industry
Microsoft: used to estimate the skill distribution of players in online games (the world's largest graphical model).
Hospitals: use Belief Nets to encode knowledge about diseases and symptoms, to aid medical diagnosis.
Google, Microsoft, Facebook: used in many places, including advertising, video game prediction, speech recognition.
Used to estimate the inherent desirability of products in consumer retail.
Microsoft and others: attempts to go beyond simple A/B testing by using Graphical Models to model the whole company/user relationship.
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or more shortly theprobability of x given y) is defined as
p(x|y) equiv p(x y)
p(y)=p(y|x)p(x)
p(y)(Bayesrsquo rule)
Throwing darts
p(region 5|not region 20) =p(region 5 not region 20)
p(not region 20)
=p(region 5)
p(not region 20)=
120
1920=
1
19
Interpretationp(A = a|B = b) should not be interpreted as lsquoGiven the event B = b has occurredp(A = a|B = b) is the probability of the event A = a occurringrsquo The correctinterpretation should be lsquop(A = a|B = b) is the probability of A being in state aunder the constraint that B is in state brsquo
Battleships
Assume there are 2 ships 1 vertical (ship 1) and 1 horizontal (ship 2) of 5pixels each
Can be placed anywhere on the 10times10 grid but cannot overlap
Let s1 is the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query lsquohitrsquo or lsquomissrsquo responses
p(s1 s2|D) =p(D|s1 s2)p(s1 s2)
p(D)Let X be the matrix of pixel occupancy
p(X|D) =sums1s2
p(X s1 s2|D) =sums1s2
p(X|s1 s2)p(s1 s2|D)
demoBattleshipsm
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
Recursive Nets and Embeddings
Idea is to recursively combine embeddings such that they accurately predictthe sentiment at each node
Recursive Nets and EmbeddingsTraining
We have a softmax classifier for each node in the tree to predict thesentiment of the phrase beneath this node in the tree
The weights of this classifier are shared across all nodes
At the leaf nodes at the bottom of the tree the inputs to the classifiers arethe word embeddings
The embeddings are combined by another network g with commonparameters which forms the input to the sentiment classifier
We then learn all the embeddings shared classifier parameters and sharedcombination parameters to maximise the classification accuracy
Prediction
For a new movie review the review is first parsed using a standard grammartree parser
This forms the tree which can be used to recursively form the sentiment classlabel for the review
Currently the best sentiment classifier Socher (2013)
Recursive Nets and Embeddingsotilde otilde
eth
eth
Icircplusmnsup1raquoreg
eth
Uumlplusmnfrac14sup1raquoreg
otilde
otilde
eth
middotshy
otilde
eth
plusmnsup2raquo
otilde
eth
plusmnordm
otilde
otilde
eth
notcedilraquo
otilde
otilde
eth
sup3plusmnshynot
otilde
frac12plusmnsup3degraquoacuteacutemiddotsup2sup1
eth
ordfiquestregmiddotiquestnotmiddotplusmnsup2shy
eth
eth
plusmnsup2
eth
eth
notcedilmiddotshy
eth
notcedilraquosup3raquo
eth
ograve
yen
eth
eth
Icircplusmnsup1raquoreg
eth
Uumlplusmnfrac14sup1raquoreg
yen
yen
eth
middotshy
yen
eth
plusmnsup2raquo
yen
eth
plusmnordm
yen
yen
eth
notcedilraquo
yen
yen
yen
acuteraquoiquestshynot
otilde
frac12plusmnsup3degraquoacuteacutemiddotsup2sup1
eth
ordfiquestregmiddotiquestnotmiddotplusmnsup2shy
eth
eth
plusmnsup2
eth
eth
notcedilmiddotshy
eth
notcedilraquosup3raquo
eth
ograve
otilde
eth
times
otilde
otilde
otilde
acutemiddotmicroraquofrac14
eth
eth
eth
raquoordfraquoregsect
eth
eth
shymiddotsup2sup1acuteraquo
eth
sup3middotsup2laquonotraquo
eth
eth
plusmnordm
eth
eth
notcedilmiddotshy
eth
eth
ograve
yen
eth
times
yen
yen
eth
eth
frac14middotfrac14
eth
sup2ugravenot
eth
eth
acutemiddotmicroraquo
eth
eth
eth
iquest
eth
eth
shymiddotsup2sup1acuteraquo
eth
sup3middotsup2laquonotraquo
eth
eth
plusmnordm
eth
eth
notcedilmiddotshy
eth
eth
ograve
yen
eth
timesnot
yen
yen
eth
eth
ugraveshy
eth
paralaquoshynot
yen
otilde
middotsup2frac12regraquofrac14middotfrac34acutesect
yen yen
frac14laquoacuteacute
eth
ograve
eth
eth
timesnot
eth
eth
eth
eth
eth
ugraveshy
otilde
yen
sup2plusmnnot
yen yen
frac14laquoacuteacute
eth
ograve
Uacutemiddotsup1laquoregraquo ccedilaelig IcircOgraveIgraveOgrave degregraquofrac14middotfrac12notmiddotplusmnsup2 plusmnordm degplusmnshymiddotnotmiddotordfraquo iquestsup2frac14 sup2raquosup1iquestnotmiddotordfraquo oslashfrac34plusmnnotnotplusmnsup3 regmiddotsup1cedilnotdivide shyraquosup2notraquosup2frac12raquoshy iquestsup2frac14 notcedilraquomiddotreg sup2raquosup1iquestnotmiddotplusmnsup2ograve
Recurrent Nets
x1 x2 x3
h1 h2 h3
y1 y2 y3
A A A
C C C
B B
RNNs are used in timeseries applications
The basic idea is that the hidden units at time ht (and possibly output yt)depend on the previous state of the network htminus1 xtminus1 ytminus1 for inputs xt andoutputs yt
In the above network I lsquounrolled the net through timersquo to give a standard NNdiagram
I omitted the potential links from xtminus1 ytminus1 to ht
Handwriting Generation using a RNN
Some training examples
Handwriting Generation using a RNN
Some generated examples Top line is real handwriting for comparison See AlexGraversquos work
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reasons research in deep learning has exploded
Much greater compute power (GPU)
Much larger datasets
AutoDiff
What is AutoDiff
AutoDiff takes a function f(x) and returns an exact value (up to machineaccuracy) for the gradient
gi(x) equivpart
partxif
∥∥∥∥x
Note that this is not the same as a numerical approximation (such as centraldifferences) for the gradient
One can show that if done efficiently one can always calculate the gradient inless than 5 times the time it takes to compute f(x)
Reverse DifferentiationA useful graphical representation is that the total derivative of f with respect to xis given by the sum over all path values from x to f where each path value is theproduct of the partial derivatives of the functions on the edges
df
dx=partf
partx+partf
partg
dg
dx
x
f
gpartfpartx
dgdx
partfpartg
Example
For f(x) = x2 + xgh where g =x2 and h = xg2
x
f
gh2x+ gh
2x
xh
2gx
xg
g2
f prime(x) = (2x+ gh) + (g2xg) + (2x2gxxg) + (2xxh) = 2x+ 8x7
Reverse DifferentiationConsider
f(x1 x2) = cos (sin(x1x2))
We can represent this computationally using an Abstract Syntax Tree (AST)
x1 x2
f1
f2
f3
f1(x1 x2) = x1x2
f2(x) = sin(x)
f3(x) = cos(x)
Given values for x1 x2 we first run forwards through the tree so that we canassociate each node with an actual function value
Reverse Differentiation
x1 x2
f1
f2
f3
df3dx1
=partf3partf2
df2dx1
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx1
Similarly
df3dx2
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx2
The two derivatives share the same computation branch andwe want to exploit this
Reverse Differentiation
x1 x2
f1
f2
f3
partf1partx1
= x2partf1partx2
= x1
partf2partf1
= cos(f1)
partf3partf2
= minus sin(f2)
1 Find the reverse ancestral (backwards) scheduleof nodes (f3 f2 f1 x1 x2)
2 Start with the first node n1 in the reverseschedule and define tn1 = 1
3 For the next node n in the reverse schedule findthe child nodes ch (n) Then define
tn =sum
cisinch(n)
partfcpartfn
tc
4 The total derivatives of f with respect to theroot nodes of the tree (here x1 and x2) are givenby the values of t at those nodes
This is a general procedure that can be used to automatically define a subroutineto efficiently compute the gradient It is efficient because information is collectedat nodes in the tree and split between parents only when required
Limitations of forward reasoning
World Representation
Recognising patterns (perceptron style) is only one form of intelligence
Solving chess problems is another and requires complex reasoning using someform of internal model
The world is noisy and information may be conflicting
Recognised that new approaches are required
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Limitations of forward reasoning
World Representation
Models help us to fantasise about the world
Models
Models
Burglar Problem
Creaks and Bumps
Creak Bump
Burglar Model
pos1 pos2 pos3 pos4
snd1 snd2 snd3 snd4
pos - position in kitchensnd ndash sound
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Stubby Fingers
Stubby Fingers
int1 int2 int3 int4
hit1 hit2 hit3 hit4
int - intended keyhit ndash hit key
Stubby Fingers errors
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz
005
01
015
02
025
03
035
04
045
05
055
Stubby Fingers language
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz 0
01
02
03
04
05
06
07
08
09
Stubby Fingers
Given the typed sequence cwsykcak what is the most likely word that thiscorresponds to
List the 200 most likely hidden sequences
Discard those that are not in a standard English dictionary
Take the most likely proper English word as the intended typed word
Speech Recognition raw signal
0 01 02 03 04 05 06 07 08 09minus02
minus015
minus01
minus005
0
005
01
015
02
025
03
Time
lsquoneuralrsquo representation
10 20 30 40 50 60 70 80
5
10
15
20
25
Speech Recognition
pho1 pho2 pho3 pho4
aud1 aud2 aud3 aud4
pho phoneme (letter)aud audio signal (neural representation)
Medical Diagnosis
tumour flu meningitis
headache fever appetite x-ray
Combine known medical knowledge with patient specific information
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems such as the IsingModel (1920) and in AI applications such as the HMM (Baum 1966Stratonovich 1960)
The need for structure
We often want to make a probabilistic description of many objects (electronspins neurons customers etc )
Typically the representational and computational cost of probabilistic modelsgrows exponentially with the number of objects represented
Without introducing strong structural limitations about how these objects caninteract probability is a non-starter
For this reason computationally lsquosimplerrsquo alternatives (such as fuzzy logic)were introduced to try to avoid some of these difficulties ndash however these aretypically frowed on by purists
Graphical Models
We can use graphs to represent how objects can probabilistically interact witheach other
Graphical Models and then a marriage between Graph and Probability theory
Many of the quantities that we would like to compute in a probabilitydistribution can then be related to operations on the graph
The computational complexity of operations can often be related to thestructure of the graph
Graphical Models are now used as a standard framework in EngineeringStatistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty and aretherefore widely applicable
Uses in Industry
Microsoft used to estimate the skill distribution of players in online games(the worlds largest graphical model)
Hospitals use Belief Nets to encode knowledge about diseases and symptomsto aid medical diagnosis
Google Microsoft Facebook used in many places including advertisingvideo game prediction speech recognition
Used to estimate inherent desirability of products in consumer retail
Microsoft and others Attempt to go beyond simple AB testing by usesGraphical Models to model the whole companyuser relationship
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or more shortly theprobability of x given y) is defined as
p(x|y) equiv p(x y)
p(y)=p(y|x)p(x)
p(y)(Bayesrsquo rule)
Throwing darts
p(region 5|not region 20) =p(region 5 not region 20)
p(not region 20)
=p(region 5)
p(not region 20)=
120
1920=
1
19
Interpretationp(A = a|B = b) should not be interpreted as lsquoGiven the event B = b has occurredp(A = a|B = b) is the probability of the event A = a occurringrsquo The correctinterpretation should be lsquop(A = a|B = b) is the probability of A being in state aunder the constraint that B is in state brsquo
Battleships
Assume there are 2 ships 1 vertical (ship 1) and 1 horizontal (ship 2) of 5pixels each
Can be placed anywhere on the 10times10 grid but cannot overlap
Let s1 is the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query lsquohitrsquo or lsquomissrsquo responses
p(s1 s2|D) =p(D|s1 s2)p(s1 s2)
p(D)Let X be the matrix of pixel occupancy
p(X|D) =sums1s2
p(X s1 s2|D) =sums1s2
p(X|s1 s2)p(s1 s2|D)
demoBattleshipsm
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
Recursive Nets and EmbeddingsTraining
We have a softmax classifier for each node in the tree to predict thesentiment of the phrase beneath this node in the tree
The weights of this classifier are shared across all nodes
At the leaf nodes at the bottom of the tree the inputs to the classifiers arethe word embeddings
The embeddings are combined by another network g with commonparameters which forms the input to the sentiment classifier
We then learn all the embeddings shared classifier parameters and sharedcombination parameters to maximise the classification accuracy
Prediction
For a new movie review the review is first parsed using a standard grammartree parser
This forms the tree which can be used to recursively form the sentiment classlabel for the review
Currently the best sentiment classifier Socher (2013)
Recursive Nets and Embeddingsotilde otilde
eth
eth
Icircplusmnsup1raquoreg
eth
Uumlplusmnfrac14sup1raquoreg
otilde
otilde
eth
middotshy
otilde
eth
plusmnsup2raquo
otilde
eth
plusmnordm
otilde
otilde
eth
notcedilraquo
otilde
otilde
eth
sup3plusmnshynot
otilde
frac12plusmnsup3degraquoacuteacutemiddotsup2sup1
eth
ordfiquestregmiddotiquestnotmiddotplusmnsup2shy
eth
eth
plusmnsup2
eth
eth
notcedilmiddotshy
eth
notcedilraquosup3raquo
eth
ograve
yen
eth
eth
Icircplusmnsup1raquoreg
eth
Uumlplusmnfrac14sup1raquoreg
yen
yen
eth
middotshy
yen
eth
plusmnsup2raquo
yen
eth
plusmnordm
yen
yen
eth
notcedilraquo
yen
yen
yen
acuteraquoiquestshynot
otilde
frac12plusmnsup3degraquoacuteacutemiddotsup2sup1
eth
ordfiquestregmiddotiquestnotmiddotplusmnsup2shy
eth
eth
plusmnsup2
eth
eth
notcedilmiddotshy
eth
notcedilraquosup3raquo
eth
ograve
otilde
eth
times
otilde
otilde
otilde
acutemiddotmicroraquofrac14
eth
eth
eth
raquoordfraquoregsect
eth
eth
shymiddotsup2sup1acuteraquo
eth
sup3middotsup2laquonotraquo
eth
eth
plusmnordm
eth
eth
notcedilmiddotshy
eth
eth
ograve
yen
eth
times
yen
yen
eth
eth
frac14middotfrac14
eth
sup2ugravenot
eth
eth
acutemiddotmicroraquo
eth
eth
eth
iquest
eth
eth
shymiddotsup2sup1acuteraquo
eth
sup3middotsup2laquonotraquo
eth
eth
plusmnordm
eth
eth
notcedilmiddotshy
eth
eth
ograve
yen
eth
timesnot
yen
yen
eth
eth
ugraveshy
eth
paralaquoshynot
yen
otilde
middotsup2frac12regraquofrac14middotfrac34acutesect
yen yen
frac14laquoacuteacute
eth
ograve
eth
eth
timesnot
eth
eth
eth
eth
eth
ugraveshy
otilde
yen
sup2plusmnnot
yen yen
frac14laquoacuteacute
eth
ograve
Uacutemiddotsup1laquoregraquo ccedilaelig IcircOgraveIgraveOgrave degregraquofrac14middotfrac12notmiddotplusmnsup2 plusmnordm degplusmnshymiddotnotmiddotordfraquo iquestsup2frac14 sup2raquosup1iquestnotmiddotordfraquo oslashfrac34plusmnnotnotplusmnsup3 regmiddotsup1cedilnotdivide shyraquosup2notraquosup2frac12raquoshy iquestsup2frac14 notcedilraquomiddotreg sup2raquosup1iquestnotmiddotplusmnsup2ograve
Recurrent Nets
x1 x2 x3
h1 h2 h3
y1 y2 y3
A A A
C C C
B B
RNNs are used in timeseries applications
The basic idea is that the hidden units at time ht (and possibly output yt)depend on the previous state of the network htminus1 xtminus1 ytminus1 for inputs xt andoutputs yt
In the above network I lsquounrolled the net through timersquo to give a standard NNdiagram
I omitted the potential links from xtminus1 ytminus1 to ht
Handwriting Generation using a RNN
Some training examples
Handwriting Generation using a RNN
Some generated examples Top line is real handwriting for comparison See AlexGraversquos work
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reasons research in deep learning has exploded
Much greater compute power (GPU)
Much larger datasets
AutoDiff
What is AutoDiff
AutoDiff takes a function f(x) and returns an exact value (up to machineaccuracy) for the gradient
gi(x) equivpart
partxif
∥∥∥∥x
Note that this is not the same as a numerical approximation (such as centraldifferences) for the gradient
One can show that if done efficiently one can always calculate the gradient inless than 5 times the time it takes to compute f(x)
Reverse DifferentiationA useful graphical representation is that the total derivative of f with respect to xis given by the sum over all path values from x to f where each path value is theproduct of the partial derivatives of the functions on the edges
df
dx=partf
partx+partf
partg
dg
dx
x
f
gpartfpartx
dgdx
partfpartg
Example
For f(x) = x2 + xgh where g =x2 and h = xg2
x
f
gh2x+ gh
2x
xh
2gx
xg
g2
f prime(x) = (2x+ gh) + (g2xg) + (2x2gxxg) + (2xxh) = 2x+ 8x7
Reverse DifferentiationConsider
f(x1 x2) = cos (sin(x1x2))
We can represent this computationally using an Abstract Syntax Tree (AST)
x1 x2
f1
f2
f3
f1(x1 x2) = x1x2
f2(x) = sin(x)
f3(x) = cos(x)
Given values for x1 x2 we first run forwards through the tree so that we canassociate each node with an actual function value
Reverse Differentiation
x1 x2
f1
f2
f3
df3dx1
=partf3partf2
df2dx1
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx1
Similarly
df3dx2
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx2
The two derivatives share the same computation branch andwe want to exploit this
Reverse Differentiation
x1 x2
f1
f2
f3
partf1partx1
= x2partf1partx2
= x1
partf2partf1
= cos(f1)
partf3partf2
= minus sin(f2)
1 Find the reverse ancestral (backwards) scheduleof nodes (f3 f2 f1 x1 x2)
2 Start with the first node n1 in the reverseschedule and define tn1 = 1
3 For the next node n in the reverse schedule findthe child nodes ch (n) Then define
tn =sum
cisinch(n)
partfcpartfn
tc
4 The total derivatives of f with respect to theroot nodes of the tree (here x1 and x2) are givenby the values of t at those nodes
This is a general procedure that can be used to automatically define a subroutineto efficiently compute the gradient It is efficient because information is collectedat nodes in the tree and split between parents only when required
Limitations of forward reasoning
World Representation
Recognising patterns (perceptron style) is only one form of intelligence
Solving chess problems is another and requires complex reasoning using someform of internal model
The world is noisy and information may be conflicting
Recognised that new approaches are required
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Limitations of forward reasoning
World Representation
Models help us to fantasise about the world
Models
Models
Burglar Problem
Creaks and Bumps
Creak Bump
Burglar Model
pos1 pos2 pos3 pos4
snd1 snd2 snd3 snd4
pos - position in kitchensnd ndash sound
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Stubby Fingers
Stubby Fingers
int1 int2 int3 int4
hit1 hit2 hit3 hit4
int - intended keyhit ndash hit key
Stubby Fingers errors
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz
005
01
015
02
025
03
035
04
045
05
055
Stubby Fingers: language

[Figure: letter transition matrix p(int_t | int_{t-1}) for English over the letters a-z, with probabilities ranging from 0 to 0.9]
Stubby Fingers
Given the typed sequence 'cwsykcak', what is the most likely word that this corresponds to?
List the 200 most likely hidden sequences
Discard those that are not in a standard English dictionary
Take the most likely proper English word as the intended typed word
Speech Recognition: raw signal

[Figure: raw audio waveform, amplitude against time (seconds)]
'neural' representation

[Figure: time-frequency representation of the same audio signal]
Speech Recognition
pho1 pho2 pho3 pho4
aud1 aud2 aud3 aud4
pho – phoneme (letter); aud – audio signal (neural representation)
Medical Diagnosis
tumour flu meningitis
headache fever appetite x-ray
Combine known medical knowledge with patient-specific information
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems, such as the Ising Model (1920), and in AI applications, such as the HMM (Baum 1966; Stratonovich 1960)
The need for structure
We often want to make a probabilistic description of many objects (electron spins, neurons, customers, etc.)
Typically the representational and computational cost of probabilistic models grows exponentially with the number of objects represented
Without introducing strong structural limitations about how these objects can interact, probability is a non-starter
For this reason computationally 'simpler' alternatives (such as fuzzy logic) were introduced to try to avoid some of these difficulties – however these are typically frowned on by purists
Graphical Models
We can use graphs to represent how objects can probabilistically interact witheach other
Graphical Models are then a marriage between Graph and Probability theory
Many of the quantities that we would like to compute in a probability distribution can then be related to operations on the graph
The computational complexity of operations can often be related to the structure of the graph
Graphical Models are now used as a standard framework in Engineering, Statistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty and are therefore widely applicable
Uses in Industry
Microsoft: used to estimate the skill distribution of players in online games (the world's largest graphical model)
Hospitals use Belief Nets to encode knowledge about diseases and symptoms to aid medical diagnosis
Google, Microsoft, Facebook: used in many places, including advertising, video game prediction, speech recognition
Used to estimate inherent desirability of products in consumer retail
Microsoft and others: attempts to go beyond simple A/B testing by using Graphical Models to model the whole company/user relationship
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or more shortly theprobability of x given y) is defined as
$p(x|y) \equiv \frac{p(x, y)}{p(y)} = \frac{p(y|x)p(x)}{p(y)} \quad \text{(Bayes' rule)}$
Throwing darts
$p(\text{region 5}|\text{not region 20}) = \frac{p(\text{region 5}, \text{not region 20})}{p(\text{not region 20})} = \frac{p(\text{region 5})}{p(\text{not region 20})} = \frac{1/20}{19/20} = \frac{1}{19}$
Interpretation
p(A = a|B = b) should not be interpreted as 'Given the event B = b has occurred, p(A = a|B = b) is the probability of the event A = a occurring'. The correct interpretation is 'p(A = a|B = b) is the probability of A being in state a under the constraint that B is in state b'.
Battleships
Assume there are 2 ships, 1 vertical (ship 1) and 1 horizontal (ship 2), of 5 pixels each
Can be placed anywhere on the 10×10 grid, but cannot overlap
Let s1 be the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query 'hit' or 'miss' responses
$p(s_1, s_2|D) = \frac{p(D|s_1, s_2)p(s_1, s_2)}{p(D)}$

Let X be the matrix of pixel occupancy:

$p(X|D) = \sum_{s_1, s_2} p(X, s_1, s_2|D) = \sum_{s_1, s_2} p(X|s_1, s_2)p(s_1, s_2|D)$

demoBattleships.m
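A Python sketch in the spirit of demoBattleships.m (the data D below is an assumed example, and the prior p(s1, s2) is taken uniform over valid placements): enumerate every non-overlapping pair of placements, keep those consistent with the hit/miss data, and average the pixel occupancy.

```python
import itertools
import numpy as np

G, L = 10, 5                                    # grid size, ship length
D = {(0, 0): 'miss', (4, 5): 'hit'}             # (row, col) -> query response

def cells(r, c, vertical):
    return {(r + i, c) if vertical else (r, c + i) for i in range(L)}

consistent = []
for r1, c1 in itertools.product(range(G - L + 1), range(G)):
    ship1 = cells(r1, c1, True)                 # vertical ship 1
    for r2, c2 in itertools.product(range(G), range(G - L + 1)):
        ship2 = cells(r2, c2, False)            # horizontal ship 2
        if ship1 & ship2:
            continue                            # ships cannot overlap
        occ = ship1 | ship2
        # p(D|s1, s2) is 1 if every response matches the placement, else 0
        if all((q in occ) == (resp == 'hit') for q, resp in D.items()):
            consistent.append(occ)

X = np.zeros((G, G))                            # p(pixel occupied | D)
for occ in consistent:
    for r, c in occ:
        X[r, c] += 1.0 / len(consistent)
print(X.round(2))
```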
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated the conditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditional probabilities
$p(A, B, C, D, E) = p(A)p(B)p(C|A, B)p(D|C)p(E|B, C)$

[Figure: DAG with edges A → C, B → C, C → D, B → E, C → E]
Example – Part I
Sally's burglar Alarm is sounding. Has she been Burgled, or was the alarm triggered by an Earthquake? She turns the car Radio on for news of earthquakes.
Choosing an ordering
Without loss of generality, we can write

$p(A, R, E, B) = p(A|R, E, B)p(R, E, B)$
$= p(A|R, E, B)p(R|E, B)p(E, B)$
$= p(A|R, E, B)p(R|E, B)p(E|B)p(B)$
Assumptions
The alarm is not directly influenced by any report on the radio: p(A|R, E, B) = p(A|E, B)
The radio broadcast is not directly influenced by the burglar variable: p(R|E, B) = p(R|E)
Burglaries don't directly 'cause' earthquakes: p(E|B) = p(E)
Therefore
$p(A, R, E, B) = p(A|E, B)p(R|E)p(E)p(B)$
Example – Part II: Specifying the Tables

[Figure: DAG with edges B → A, E → A, E → R]

p(A = 1|B, E):

Alarm = 1   Burglar   Earthquake
0.9999      1         1
0.99        1         0
0.99        0         1
0.0001      0         0

p(R = 1|E):

Radio = 1   Earthquake
1           1
0           0

The remaining tables are p(B = 1) = 0.01 and p(E = 1) = 0.000001. The tables and graphical structure fully specify the distribution.
Example – Part III: Inference
Initial Evidence: The alarm is sounding.
$p(B = 1|A = 1) = \frac{\sum_{E,R} p(B = 1, E, A = 1, R)}{\sum_{B,E,R} p(B, E, A = 1, R)} = \frac{\sum_{E,R} p(A = 1|B = 1, E)p(B = 1)p(E)p(R|E)}{\sum_{B,E,R} p(A = 1|B, E)p(B)p(E)p(R|E)} \approx 0.99$
Additional Evidence: The radio broadcasts an earthquake warning.
A similar calculation gives p(B = 1|A = 1, R = 1) ≈ 0.01.
Initially, because the alarm sounds, Sally thinks that she's been burgled. However, this probability drops dramatically when she hears that there has been an earthquake.
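These numbers can be checked by brute-force enumeration over the four binary variables; a small sketch using the tables above (my code, not the talk's):

```python
import itertools

pB = {1: 0.01,     0: 0.99}
pE = {1: 0.000001, 0: 0.999999}
pA = {(1, 1): 0.9999, (1, 0): 0.99, (0, 1): 0.99, (0, 0): 0.0001}  # p(A=1|B,E)
pR = {1: 1.0, 0: 0.0}                                              # p(R=1|E)

def joint(b, e, a, r):
    pa = pA[b, e] if a == 1 else 1 - pA[b, e]
    pr = pR[e] if r == 1 else 1 - pR[e]
    return pB[b] * pE[e] * pa * pr   # p(A,R,E,B) = p(A|E,B)p(R|E)p(E)p(B)

def posterior_burglar(**evidence):
    num = den = 0.0
    for b, e, a, r in itertools.product([0, 1], repeat=4):
        state = dict(B=b, E=e, A=a, R=r)
        if any(state[k] != v for k, v in evidence.items()):
            continue                 # sum only over states matching the evidence
        p = joint(b, e, a, r)
        den += p
        if b == 1:
            num += p
    return num / den

print(posterior_burglar(A=1))        # approximately 0.99
print(posterior_burglar(A=1, R=1))   # approximately 0.01
```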
Markov Models
For time-series data v_1, ..., v_T we need a model p(v_{1:T}). For causal consistency it is meaningful to consider the decomposition
$p(v_{1:T}) = \prod_{t=1}^{T} p(v_t|v_{1:t-1})$

with the convention $p(v_t|v_{1:t-1}) = p(v_1)$ for $t = 1$.
v1 v2 v3 v4
Independence assumptions
It is often natural to assume that the influence of the immediate past is more relevant than the remote past, and in Markov models only a limited number of previous observations are required to predict the future.
Markov Chain
Only the recent past is relevant:

$p(v_t|v_1, \ldots, v_{t-1}) = p(v_t|v_{t-L}, \ldots, v_{t-1})$

where $L \ge 1$ is the order of the Markov chain. For $L = 1$,

$p(v_{1:T}) = p(v_1)p(v_2|v_1)p(v_3|v_2) \cdots p(v_T|v_{T-1})$

For a stationary Markov chain the transitions $p(v_t = s'|v_{t-1} = s) = f(s', s)$ are time-independent ('homogeneous'); a small sampling sketch follows.
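A sketch of sampling such a homogeneous chain (the 3-state transition matrix below is an assumed example):

```python
import numpy as np

rng = np.random.default_rng(0)
M = np.array([[0.90, 0.20, 0.10],
              [0.05, 0.70, 0.30],
              [0.05, 0.10, 0.60]])   # M[s2, s1] = p(v_t = s2 | v_{t-1} = s1)

v = [0]                              # v_1
for t in range(1, 50):
    v.append(rng.choice(3, p=M[:, v[-1]]))  # draw v_t given v_{t-1}
print(v)
```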
[Figure: (a) First-order Markov chain v_1 → v_2 → v_3 → v_4. (b) Second-order Markov chain.]
Markov Chains
[Figure: first-order Markov chain v_1 → v_2 → v_3 → v_4]

$p(v_1, \ldots, v_T) = \underbrace{p(v_1)}_{\text{initial}} \prod_{t=2}^{T} \underbrace{p(v_t|v_{t-1})}_{\text{transition}}$
State transition diagram
Nodes represent states of the variable v, and arcs non-zero elements of the transition p(v_t|v_{t-1}).

[Figure: state transition diagram on states 1-9]
Most probable and shortest paths
[Figure: the same state transition diagram on states 1-9]
The shortest (unweighted) path from state 1 to state 7 is 1-2-7.
The most probable path from state 1 to state 7 is 1-8-9-7 (assuming uniform transition probabilities). The latter path is longer but more probable, since for the path 1-2-7 the probability of exiting state 2 into state 7 is 1/5.
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
$p(x_t = i) = \sum_j \underbrace{p(x_t = i|x_{t-1} = j)}_{M_{ij}} p(x_{t-1} = j)$
p(x_t = i) is the frequency that we visit state i at time t, given we started from p(x_1) and randomly drew samples from the transition p(x_τ|x_{τ-1}). As we repeatedly sample a new state from the chain, the distribution at time t for an initial distribution p_1(i) is

$p_t = M^{t-1} p_1$
If, for t → ∞, p_∞ is independent of the initial distribution p_1, then p_∞ is called the equilibrium distribution of the chain:

$p_\infty = M p_\infty$

The equilibrium distribution is proportional to the eigenvector with unit eigenvalue of the transition matrix.
PageRank
Define the matrix
$A_{ij} = \begin{cases} 1 & \text{if website } j \text{ has a hyperlink to website } i \\ 0 & \text{otherwise} \end{cases}$
From this we can define a Markov transition matrix with elements
$M_{ij} = \frac{A_{ij}}{\sum_{i'} A_{i'j}}$
If we jump from website to website, the equilibrium distribution component p_∞(i) is the relative number of times we will visit website i. This has a natural interpretation as the 'importance' of website i.
For each website i a list of words associated with that website is collected. After doing this for all websites, one can make an 'inverse' list of which websites contain word w. When a user searches for word w, the list of websites that contain the word is then returned, ranked according to the importance of the site.
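A toy power-iteration sketch (the 4-site link matrix A below is an illustrative assumption): repeatedly applying M to any initial distribution converges to the equilibrium p_∞.

```python
import numpy as np

A = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)  # A[i, j] = 1 if site j links to site i

M = A / A.sum(axis=0)          # column-normalise: M[i, j] = p(next = i | current = j)
p = np.full(4, 0.25)           # any initial distribution p_1
for _ in range(100):           # p_t = M^(t-1) p_1
    p = M @ p
print(p)                       # equilibrium p_inf = M p_inf: the site 'importance'
```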
Hidden Markov Models
The HMM defines a Markov chain on hidden (or 'latent') variables h_{1:T}. The observed (or 'visible') variables are dependent on the hidden variables through an emission p(v_t|h_t). This defines a joint distribution
$p(h_{1:T}, v_{1:T}) = p(v_1|h_1)p(h_1) \prod_{t=2}^{T} p(v_t|h_t)p(h_t|h_{t-1})$
For a stationary HMM, the transition p(h_t|h_{t-1}) and emission p(v_t|h_t) distributions are constant through time.
[Figure: hidden chain h_1 → h_2 → h_3 → h_4 emitting v_1, ..., v_4]

Figure: A first-order hidden Markov model with 'hidden' variables dom(h_t) = {1, ..., H}, t = 1, ..., T. The 'visible' variables v_t can be either discrete or continuous.
The classical inference problems
Filtering (Inferring the present): p(h_t|v_{1:t})
Prediction (Inferring the future): p(h_t|v_{1:s}), t > s
Smoothing (Inferring the past): p(h_t|v_{1:u}), t < u
Likelihood: p(v_{1:T})
Most likely path (Viterbi alignment): $\mathrm{argmax}_{h_{1:T}} p(h_{1:T}|v_{1:T})$
For prediction, one is also often interested in p(v_t|v_{1:s}) for t > s.
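As a concrete instance, a forward-recursion sketch of filtering (toy transition and emission matrices assumed; my illustration, not the talk's code):

```python
import numpy as np

H, T = 3, 4                        # number of hidden states, sequence length
p1 = np.full(H, 1.0 / H)           # p(h_1)
Ptra = np.array([[0.8, 0.1, 0.1],
                 [0.1, 0.8, 0.1],
                 [0.1, 0.1, 0.8]])  # Ptra[i, j] = p(h_t = i | h_{t-1} = j)
Pemi = np.array([[0.9, 0.2, 0.1],
                 [0.1, 0.8, 0.9]])  # Pemi[v, i] = p(v_t = v | h_t = i)
v = [0, 0, 1, 1]                   # observed symbols

alpha = Pemi[v[0]] * p1            # alpha_1(i) propto p(v_1|h_1 = i) p(h_1 = i)
alpha /= alpha.sum()
for t in range(1, T):
    alpha = Pemi[v[t]] * (Ptra @ alpha)  # propagate one step, then weight by emission
    alpha /= alpha.sum()           # normalise: alpha_t(i) = p(h_t = i | v_{1:t})
print(alpha)                       # filtered posterior p(h_T | v_{1:T})
```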
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering, Smoothing and Viterbi are all computationally efficient, scaling linearly with the length of the time-series (but quadratically with the number of hidden states)
The algorithms are variants of 'message passing on factor graphs'
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing for approximate inference in multiply-connected graphs (e.g. low-density parity-check codes)
HMMs for speech recognition
h_t is the phoneme at time t; p(h_t|h_{t-1}) – language model; p(v_t|h_t) – speech signal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speech recognition
The breakthrough is to model p(v_t|h_t) as a Gaussian whose mean is some function of the phoneme, μ(h_t, θ) (sketched below)
This function is a deep neural network trained on a large amount of data
There is a goldrush at the moment to find similar breakthrough applications of deep networks in reasoning systems
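A sketch of such an emission model, with assumed toy sizes and random (untrained) weights; a real system would train the network on data, this only shows the scoring:

```python
import numpy as np

# p(v_t | h_t) = N(v_t; mu(h_t, theta), sigma^2 I), mu a small neural network
rng = np.random.default_rng(0)
n_pho, n_hid, n_obs = 10, 32, 25
W1 = 0.1 * rng.standard_normal((n_hid, n_pho))
W2 = 0.1 * rng.standard_normal((n_obs, n_hid))

def mu(h):                               # h: phoneme index -> predicted mean
    return W2 @ np.tanh(W1 @ np.eye(n_pho)[h])

def log_emission(v, h, sigma2=1.0):      # log N(v; mu(h), sigma^2 I)
    d = v - mu(h)
    return -0.5 * (n_obs * np.log(2 * np.pi * sigma2) + d @ d / sigma2)

v = rng.standard_normal(n_obs)           # one frame of the audio representation
print(max(range(n_pho), key=lambda h: log_emission(v, h)))  # best-scoring phoneme
```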
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images, for example) can be constructed on the basis of a low-dimensional representation
Note that this is a Graphical Model, not a Function
The latent variables h can be sampled from using p(h), and then an image sampled from p(v|h)
One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v)) and parameter learning are intractable in these models
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method – much faster for inference
Variational Inference
Consider a distribution

$p(v|\theta) = \int_h p(v|h, \theta)p(h)$
and that we wish to learn θ to maximise the probability this model generates the observed data.
$\log p(v|\theta) \ge -\int_h q(h|v, \phi) \log q(h|v, \phi) + \int_h q(h|v, \phi) \log\left(p(v|h, \theta)p(h)\right)$
Idea is to choose a 'variational' distribution q(h|v, φ) such that we can either calculate the bound analytically or sample it efficiently
We then jointly maximise the bound with respect to φ and θ (a toy Monte-Carlo sketch follows below)
We can parameterise p(v|h, θ) using a deep network
Very popular approach – see the 'variational autoencoder' and also attention mechanisms
Extension to semi-supervised method using

$p(v) = \int_h \sum_c p(v|h, c)p(c)p(h)$
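A toy Monte-Carlo estimate of the bound for a fully Gaussian 1-D model; all the densities below are my illustrative assumptions, chosen so the exact log-likelihood is available for comparison:

```python
import numpy as np

# Assumed model: p(h) = N(0,1), p(v|h,theta) = N(theta*h, 1), q(h|v,phi) = N(phi*v, 1)
rng = np.random.default_rng(0)

def log_normal(x, mean, var=1.0):
    return -0.5 * (np.log(2 * np.pi * var) + (x - mean) ** 2 / var)

def bound(v, theta, phi, n=100000):
    h = phi * v + rng.standard_normal(n)       # samples h ~ q(h|v, phi)
    log_q = log_normal(h, phi * v)             # log q(h|v, phi)
    log_p = log_normal(v, theta * h) + log_normal(h, 0.0)  # log p(v|h) + log p(h)
    return np.mean(log_p - log_q)              # <= log p(v|theta)

v = 1.5
print(bound(v, theta=1.0, phi=0.5))            # maximise jointly over theta and phi
print(log_normal(v, 0.0, var=2.0))             # exact log p(v|theta=1): v ~ N(0, 2)
```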
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games?
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A, we need to decide which action to take, for any state of W, that will be best for our long-term goals
Problem is that the number of pixel states is enormous
Need to learn a low-dimensional representation of the screen (use a deep generative model)
Then learn which action to take given the low-dimensional representation (a toy tabular sketch follows below)
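A minimal tabular Q-learning sketch on a tiny chain world; this is an illustrative stand-in for learning which action to take given a (low-dimensional) state, not the method used for Atari:

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 5, 2            # states 0..4; actions 0 = left, 1 = right
Q = np.zeros((n_states, n_actions))   # state-action values
alpha, gamma, eps = 0.1, 0.9, 0.5     # learning rate, discount, exploration

for episode in range(200):
    s = 0
    for step in range(200):           # state 4 is terminal with reward 1
        greedy = int(Q[s].argmax())
        a = rng.integers(n_actions) if rng.random() < eps else greedy
        s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
        r = 1.0 if s2 == n_states - 1 else 0.0
        # move Q(s, a) towards the bootstrapped target r + gamma max_a' Q(s', a')
        Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
        s = s2
        if s == n_states - 1:
            break

print(Q.argmax(axis=1)[:-1])          # greedy action in non-terminal states: 'right'
```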
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state-of-the-art results in Speech Recognition, Image Analysis, Game Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representation learning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer:
https://reinfer.io
Recursive Nets and Embeddings

[Figure: prediction of positive and negative (bottom right) sentences and their negation]
Recurrent Nets
[Figure: RNN unrolled through time: inputs x_1, x_2, x_3, hidden units h_1, h_2, h_3, outputs y_1, y_2, y_3, with shared weight matrices A (input→hidden), B (hidden→hidden), C (hidden→output)]
RNNs are used in time-series applications
The basic idea is that the hidden units h_t at time t (and possibly the output y_t) depend on the previous state of the network h_{t-1}, x_{t-1}, y_{t-1}, for inputs x_t and outputs y_t
In the above network I 'unrolled the net through time' to give a standard NN diagram (a minimal forward-pass sketch follows below)
I omitted the potential links from x_{t-1}, y_{t-1} to h_t
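A minimal forward pass matching the unrolled diagram; sizes and random weights are assumptions, with A, B, C the shared input-to-hidden, hidden-to-hidden and hidden-to-output matrices:

```python
import numpy as np

rng = np.random.default_rng(0)
nx, nh, ny, T = 3, 4, 2, 5
A = rng.standard_normal((nh, nx))   # input -> hidden
B = rng.standard_normal((nh, nh))   # hidden -> hidden (previous time step)
C = rng.standard_normal((ny, nh))   # hidden -> output

x = rng.standard_normal((T, nx))    # input sequence x_1..x_T
h = np.zeros(nh)                    # initial hidden state
for t in range(T):
    h = np.tanh(A @ x[t] + B @ h)   # h_t depends on x_t and h_{t-1}
    y = C @ h                       # output y_t
    print(t, y)
```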
Handwriting Generation using a RNN
Some training examples
Handwriting Generation using a RNN
Some generated examples. The top line is real handwriting, for comparison. See Alex Graves' work.
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reasons research in deep learning has exploded
Much greater compute power (GPU)
Much larger datasets
AutoDiff
What is AutoDiff
AutoDiff takes a function f(x) and returns an exact value (up to machineaccuracy) for the gradient
gi(x) equivpart
partxif
∥∥∥∥x
Note that this is not the same as a numerical approximation (such as centraldifferences) for the gradient
One can show that if done efficiently one can always calculate the gradient inless than 5 times the time it takes to compute f(x)
Reverse DifferentiationA useful graphical representation is that the total derivative of f with respect to xis given by the sum over all path values from x to f where each path value is theproduct of the partial derivatives of the functions on the edges
df
dx=partf
partx+partf
partg
dg
dx
x
f
gpartfpartx
dgdx
partfpartg
Example
For f(x) = x2 + xgh where g =x2 and h = xg2
x
f
gh2x+ gh
2x
xh
2gx
xg
g2
f prime(x) = (2x+ gh) + (g2xg) + (2x2gxxg) + (2xxh) = 2x+ 8x7
Reverse DifferentiationConsider
f(x1 x2) = cos (sin(x1x2))
We can represent this computationally using an Abstract Syntax Tree (AST)
x1 x2
f1
f2
f3
f1(x1 x2) = x1x2
f2(x) = sin(x)
f3(x) = cos(x)
Given values for x1 x2 we first run forwards through the tree so that we canassociate each node with an actual function value
Reverse Differentiation
x1 x2
f1
f2
f3
df3dx1
=partf3partf2
df2dx1
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx1
Similarly
df3dx2
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx2
The two derivatives share the same computation branch andwe want to exploit this
Reverse Differentiation
x1 x2
f1
f2
f3
partf1partx1
= x2partf1partx2
= x1
partf2partf1
= cos(f1)
partf3partf2
= minus sin(f2)
1 Find the reverse ancestral (backwards) scheduleof nodes (f3 f2 f1 x1 x2)
2 Start with the first node n1 in the reverseschedule and define tn1 = 1
3 For the next node n in the reverse schedule findthe child nodes ch (n) Then define
tn =sum
cisinch(n)
partfcpartfn
tc
4 The total derivatives of f with respect to theroot nodes of the tree (here x1 and x2) are givenby the values of t at those nodes
This is a general procedure that can be used to automatically define a subroutineto efficiently compute the gradient It is efficient because information is collectedat nodes in the tree and split between parents only when required
Limitations of forward reasoning
World Representation
Recognising patterns (perceptron style) is only one form of intelligence
Solving chess problems is another and requires complex reasoning using someform of internal model
The world is noisy and information may be conflicting
Recognised that new approaches are required
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Limitations of forward reasoning
World Representation
Models help us to fantasise about the world
Models
Models
Burglar Problem
Creaks and Bumps
Creak Bump
Burglar Model
pos1 pos2 pos3 pos4
snd1 snd2 snd3 snd4
pos - position in kitchensnd ndash sound
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Stubby Fingers
Stubby Fingers
int1 int2 int3 int4
hit1 hit2 hit3 hit4
int - intended keyhit ndash hit key
Stubby Fingers errors
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz
005
01
015
02
025
03
035
04
045
05
055
Stubby Fingers language
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz 0
01
02
03
04
05
06
07
08
09
Stubby Fingers
Given the typed sequence cwsykcak what is the most likely word that thiscorresponds to
List the 200 most likely hidden sequences
Discard those that are not in a standard English dictionary
Take the most likely proper English word as the intended typed word
Speech Recognition raw signal
0 01 02 03 04 05 06 07 08 09minus02
minus015
minus01
minus005
0
005
01
015
02
025
03
Time
lsquoneuralrsquo representation
10 20 30 40 50 60 70 80
5
10
15
20
25
Speech Recognition
pho1 pho2 pho3 pho4
aud1 aud2 aud3 aud4
pho phoneme (letter)aud audio signal (neural representation)
Medical Diagnosis
tumour flu meningitis
headache fever appetite x-ray
Combine known medical knowledge with patient specific information
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems such as the IsingModel (1920) and in AI applications such as the HMM (Baum 1966Stratonovich 1960)
The need for structure
We often want to make a probabilistic description of many objects (electronspins neurons customers etc )
Typically the representational and computational cost of probabilistic modelsgrows exponentially with the number of objects represented
Without introducing strong structural limitations about how these objects caninteract probability is a non-starter
For this reason computationally lsquosimplerrsquo alternatives (such as fuzzy logic)were introduced to try to avoid some of these difficulties ndash however these aretypically frowed on by purists
Graphical Models
We can use graphs to represent how objects can probabilistically interact witheach other
Graphical Models and then a marriage between Graph and Probability theory
Many of the quantities that we would like to compute in a probabilitydistribution can then be related to operations on the graph
The computational complexity of operations can often be related to thestructure of the graph
Graphical Models are now used as a standard framework in EngineeringStatistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty and aretherefore widely applicable
Uses in Industry
Microsoft used to estimate the skill distribution of players in online games(the worlds largest graphical model)
Hospitals use Belief Nets to encode knowledge about diseases and symptomsto aid medical diagnosis
Google Microsoft Facebook used in many places including advertisingvideo game prediction speech recognition
Used to estimate inherent desirability of products in consumer retail
Microsoft and others Attempt to go beyond simple AB testing by usesGraphical Models to model the whole companyuser relationship
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or more shortly theprobability of x given y) is defined as
p(x|y) equiv p(x y)
p(y)=p(y|x)p(x)
p(y)(Bayesrsquo rule)
Throwing darts
p(region 5|not region 20) =p(region 5 not region 20)
p(not region 20)
=p(region 5)
p(not region 20)=
120
1920=
1
19
Interpretationp(A = a|B = b) should not be interpreted as lsquoGiven the event B = b has occurredp(A = a|B = b) is the probability of the event A = a occurringrsquo The correctinterpretation should be lsquop(A = a|B = b) is the probability of A being in state aunder the constraint that B is in state brsquo
Battleships
Assume there are 2 ships 1 vertical (ship 1) and 1 horizontal (ship 2) of 5pixels each
Can be placed anywhere on the 10times10 grid but cannot overlap
Let s1 is the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query lsquohitrsquo or lsquomissrsquo responses
p(s1 s2|D) =p(D|s1 s2)p(s1 s2)
p(D)Let X be the matrix of pixel occupancy
p(X|D) =sums1s2
p(X s1 s2|D) =sums1s2
p(X|s1 s2)p(s1 s2|D)
demoBattleshipsm
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
Recurrent Nets
x1 x2 x3
h1 h2 h3
y1 y2 y3
A A A
C C C
B B
RNNs are used in timeseries applications
The basic idea is that the hidden units at time ht (and possibly output yt)depend on the previous state of the network htminus1 xtminus1 ytminus1 for inputs xt andoutputs yt
In the above network I lsquounrolled the net through timersquo to give a standard NNdiagram
I omitted the potential links from xtminus1 ytminus1 to ht
Handwriting Generation using a RNN
Some training examples
Handwriting Generation using a RNN
Some generated examples Top line is real handwriting for comparison See AlexGraversquos work
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reasons research in deep learning has exploded
Much greater compute power (GPU)
Much larger datasets
AutoDiff
What is AutoDiff
AutoDiff takes a function f(x) and returns an exact value (up to machineaccuracy) for the gradient
gi(x) equivpart
partxif
∥∥∥∥x
Note that this is not the same as a numerical approximation (such as centraldifferences) for the gradient
One can show that if done efficiently one can always calculate the gradient inless than 5 times the time it takes to compute f(x)
Reverse DifferentiationA useful graphical representation is that the total derivative of f with respect to xis given by the sum over all path values from x to f where each path value is theproduct of the partial derivatives of the functions on the edges
df
dx=partf
partx+partf
partg
dg
dx
x
f
gpartfpartx
dgdx
partfpartg
Example
For f(x) = x2 + xgh where g =x2 and h = xg2
x
f
gh2x+ gh
2x
xh
2gx
xg
g2
f prime(x) = (2x+ gh) + (g2xg) + (2x2gxxg) + (2xxh) = 2x+ 8x7
Reverse DifferentiationConsider
f(x1 x2) = cos (sin(x1x2))
We can represent this computationally using an Abstract Syntax Tree (AST)
x1 x2
f1
f2
f3
f1(x1 x2) = x1x2
f2(x) = sin(x)
f3(x) = cos(x)
Given values for x1 x2 we first run forwards through the tree so that we canassociate each node with an actual function value
Reverse Differentiation
x1 x2
f1
f2
f3
df3dx1
=partf3partf2
df2dx1
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx1
Similarly
df3dx2
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx2
The two derivatives share the same computation branch andwe want to exploit this
Reverse Differentiation
x1 x2
f1
f2
f3
partf1partx1
= x2partf1partx2
= x1
partf2partf1
= cos(f1)
partf3partf2
= minus sin(f2)
1 Find the reverse ancestral (backwards) scheduleof nodes (f3 f2 f1 x1 x2)
2 Start with the first node n1 in the reverseschedule and define tn1 = 1
3 For the next node n in the reverse schedule findthe child nodes ch (n) Then define
tn =sum
cisinch(n)
partfcpartfn
tc
4 The total derivatives of f with respect to theroot nodes of the tree (here x1 and x2) are givenby the values of t at those nodes
This is a general procedure that can be used to automatically define a subroutineto efficiently compute the gradient It is efficient because information is collectedat nodes in the tree and split between parents only when required
Limitations of forward reasoning
World Representation
Recognising patterns (perceptron style) is only one form of intelligence
Solving chess problems is another and requires complex reasoning using someform of internal model
The world is noisy and information may be conflicting
Recognised that new approaches are required
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Limitations of forward reasoning
World Representation
Models help us to fantasise about the world
Models
Models
Burglar Problem
Creaks and Bumps
Creak Bump
Burglar Model
pos1 pos2 pos3 pos4
snd1 snd2 snd3 snd4
pos - position in kitchensnd ndash sound
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Stubby Fingers
Stubby Fingers
int1 int2 int3 int4
hit1 hit2 hit3 hit4
int - intended keyhit ndash hit key
Stubby Fingers errors
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz
005
01
015
02
025
03
035
04
045
05
055
Stubby Fingers language
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz 0
01
02
03
04
05
06
07
08
09
Stubby Fingers
Given the typed sequence cwsykcak what is the most likely word that thiscorresponds to
List the 200 most likely hidden sequences
Discard those that are not in a standard English dictionary
Take the most likely proper English word as the intended typed word
Speech Recognition raw signal
0 01 02 03 04 05 06 07 08 09minus02
minus015
minus01
minus005
0
005
01
015
02
025
03
Time
lsquoneuralrsquo representation
10 20 30 40 50 60 70 80
5
10
15
20
25
Speech Recognition
pho1 pho2 pho3 pho4
aud1 aud2 aud3 aud4
pho phoneme (letter)aud audio signal (neural representation)
Medical Diagnosis
tumour flu meningitis
headache fever appetite x-ray
Combine known medical knowledge with patient specific information
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems such as the IsingModel (1920) and in AI applications such as the HMM (Baum 1966Stratonovich 1960)
The need for structure
We often want to make a probabilistic description of many objects (electronspins neurons customers etc )
Typically the representational and computational cost of probabilistic modelsgrows exponentially with the number of objects represented
Without introducing strong structural limitations about how these objects caninteract probability is a non-starter
For this reason computationally lsquosimplerrsquo alternatives (such as fuzzy logic)were introduced to try to avoid some of these difficulties ndash however these aretypically frowed on by purists
Graphical Models
We can use graphs to represent how objects can probabilistically interact witheach other
Graphical Models and then a marriage between Graph and Probability theory
Many of the quantities that we would like to compute in a probabilitydistribution can then be related to operations on the graph
The computational complexity of operations can often be related to thestructure of the graph
Graphical Models are now used as a standard framework in EngineeringStatistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty and aretherefore widely applicable
Uses in Industry
Microsoft used to estimate the skill distribution of players in online games(the worlds largest graphical model)
Hospitals use Belief Nets to encode knowledge about diseases and symptomsto aid medical diagnosis
Google Microsoft Facebook used in many places including advertisingvideo game prediction speech recognition
Used to estimate inherent desirability of products in consumer retail
Microsoft and others Attempt to go beyond simple AB testing by usesGraphical Models to model the whole companyuser relationship
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or more shortly theprobability of x given y) is defined as
p(x|y) equiv p(x y)
p(y)=p(y|x)p(x)
p(y)(Bayesrsquo rule)
Throwing darts
p(region 5|not region 20) =p(region 5 not region 20)
p(not region 20)
=p(region 5)
p(not region 20)=
120
1920=
1
19
Interpretationp(A = a|B = b) should not be interpreted as lsquoGiven the event B = b has occurredp(A = a|B = b) is the probability of the event A = a occurringrsquo The correctinterpretation should be lsquop(A = a|B = b) is the probability of A being in state aunder the constraint that B is in state brsquo
Battleships
Assume there are 2 ships 1 vertical (ship 1) and 1 horizontal (ship 2) of 5pixels each
Can be placed anywhere on the 10times10 grid but cannot overlap
Let s1 is the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query lsquohitrsquo or lsquomissrsquo responses
p(s1 s2|D) =p(D|s1 s2)p(s1 s2)
p(D)Let X be the matrix of pixel occupancy
p(X|D) =sums1s2
p(X s1 s2|D) =sums1s2
p(X|s1 s2)p(s1 s2|D)
demoBattleshipsm
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
Handwriting Generation using a RNN
Some training examples
Handwriting Generation using a RNN
Some generated examples Top line is real handwriting for comparison See AlexGraversquos work
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reasons research in deep learning has exploded
Much greater compute power (GPU)
Much larger datasets
AutoDiff
What is AutoDiff
AutoDiff takes a function f(x) and returns an exact value (up to machineaccuracy) for the gradient
gi(x) equivpart
partxif
∥∥∥∥x
Note that this is not the same as a numerical approximation (such as centraldifferences) for the gradient
One can show that if done efficiently one can always calculate the gradient inless than 5 times the time it takes to compute f(x)
Reverse DifferentiationA useful graphical representation is that the total derivative of f with respect to xis given by the sum over all path values from x to f where each path value is theproduct of the partial derivatives of the functions on the edges
df
dx=partf
partx+partf
partg
dg
dx
x
f
gpartfpartx
dgdx
partfpartg
Example
For f(x) = x2 + xgh where g =x2 and h = xg2
x
f
gh2x+ gh
2x
xh
2gx
xg
g2
f prime(x) = (2x+ gh) + (g2xg) + (2x2gxxg) + (2xxh) = 2x+ 8x7
Reverse DifferentiationConsider
f(x1 x2) = cos (sin(x1x2))
We can represent this computationally using an Abstract Syntax Tree (AST)
x1 x2
f1
f2
f3
f1(x1 x2) = x1x2
f2(x) = sin(x)
f3(x) = cos(x)
Given values for x1 x2 we first run forwards through the tree so that we canassociate each node with an actual function value
Reverse Differentiation
x1 x2
f1
f2
f3
df3dx1
=partf3partf2
df2dx1
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx1
Similarly
df3dx2
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx2
The two derivatives share the same computation branch andwe want to exploit this
Reverse Differentiation
x1 x2
f1
f2
f3
partf1partx1
= x2partf1partx2
= x1
partf2partf1
= cos(f1)
partf3partf2
= minus sin(f2)
1 Find the reverse ancestral (backwards) scheduleof nodes (f3 f2 f1 x1 x2)
2 Start with the first node n1 in the reverseschedule and define tn1 = 1
3 For the next node n in the reverse schedule findthe child nodes ch (n) Then define
tn =sum
cisinch(n)
partfcpartfn
tc
4 The total derivatives of f with respect to theroot nodes of the tree (here x1 and x2) are givenby the values of t at those nodes
This is a general procedure that can be used to automatically define a subroutineto efficiently compute the gradient It is efficient because information is collectedat nodes in the tree and split between parents only when required
Limitations of forward reasoning
World Representation
Recognising patterns (perceptron style) is only one form of intelligence
Solving chess problems is another and requires complex reasoning using someform of internal model
The world is noisy and information may be conflicting
Recognised that new approaches are required
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Limitations of forward reasoning
World Representation
Models help us to fantasise about the world
Models
Models
Burglar Problem
Creaks and Bumps
Creak Bump
Burglar Model
pos1 pos2 pos3 pos4
snd1 snd2 snd3 snd4
pos - position in kitchensnd ndash sound
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Stubby Fingers
Stubby Fingers
int1 int2 int3 int4
hit1 hit2 hit3 hit4
int - intended keyhit ndash hit key
Stubby Fingers errors
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz
005
01
015
02
025
03
035
04
045
05
055
Stubby Fingers language
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz 0
01
02
03
04
05
06
07
08
09
Stubby Fingers
Given the typed sequence cwsykcak what is the most likely word that thiscorresponds to
List the 200 most likely hidden sequences
Discard those that are not in a standard English dictionary
Take the most likely proper English word as the intended typed word
Speech Recognition raw signal
0 01 02 03 04 05 06 07 08 09minus02
minus015
minus01
minus005
0
005
01
015
02
025
03
Time
lsquoneuralrsquo representation
10 20 30 40 50 60 70 80
5
10
15
20
25
Speech Recognition
pho1 pho2 pho3 pho4
aud1 aud2 aud3 aud4
pho phoneme (letter)aud audio signal (neural representation)
Medical Diagnosis

[Belief network: diseases (tumour, flu, meningitis) as parents of findings (headache, fever, appetite, x-ray)]

Combine known medical knowledge with patient-specific information.
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability

Why Probability?

Probability is a logical calculus of uncertainty.

Natural framework to use in models of physical systems, such as the Ising Model (1920), and in AI applications, such as the HMM (Baum 1966; Stratonovich 1960).
The need for structure

We often want to make a probabilistic description of many objects (electron spins, neurons, customers, etc.).

Typically the representational and computational cost of probabilistic models grows exponentially with the number of objects represented.

Without introducing strong structural limitations on how these objects can interact, probability is a non-starter.

For this reason, computationally 'simpler' alternatives (such as fuzzy logic) were introduced to try to avoid some of these difficulties – however, these are typically frowned upon by purists.
Graphical Models

We can use graphs to represent how objects can probabilistically interact with each other.

Graphical Models are then a marriage between Graph and Probability theory.

Many of the quantities that we would like to compute in a probability distribution can then be related to operations on the graph.

The computational complexity of operations can often be related to the structure of the graph.

Graphical Models are now used as a standard framework in Engineering, Statistics and Computer Science.

Graphical Models are used to perform reasoning under uncertainty and are therefore widely applicable.
Uses in Industry

Microsoft: used to estimate the skill distribution of players in online games (the world's largest graphical model).

Hospitals: use Belief Nets to encode knowledge about diseases and symptoms to aid medical diagnosis.

Google, Microsoft, Facebook: used in many places, including advertising, video game prediction, speech recognition.

Used to estimate the inherent desirability of products in consumer retail.

Microsoft and others: attempts to go beyond simple A/B testing by using Graphical Models to model the whole company/user relationship.
Conditional Probability and Bayes' Rule

The probability of event x conditioned on knowing event y (or, more shortly, the probability of x given y) is defined as

p(x|y) \equiv \frac{p(x, y)}{p(y)} = \frac{p(y|x)p(x)}{p(y)} \quad \text{(Bayes' rule)}
Throwing darts

p(\text{region 5} \mid \text{not region 20}) = \frac{p(\text{region 5}, \text{not region 20})}{p(\text{not region 20})} = \frac{p(\text{region 5})}{p(\text{not region 20})} = \frac{1/20}{19/20} = \frac{1}{19}
Interpretation

p(A = a|B = b) should not be interpreted as 'Given the event B = b has occurred, p(A = a|B = b) is the probability of the event A = a occurring'. The correct interpretation is 'p(A = a|B = b) is the probability of A being in state a under the constraint that B is in state b'.
Battleships

Assume there are 2 ships, 1 vertical (ship 1) and 1 horizontal (ship 2), of 5 pixels each.

Each can be placed anywhere on the 10×10 grid, but they cannot overlap.

Let s_1 be the origin of ship 1 and s_2 the origin of ship 2.

The data D is a collection of query 'hit' or 'miss' responses.

p(s_1, s_2|D) = \frac{p(D|s_1, s_2)p(s_1, s_2)}{p(D)}

Let X be the matrix of pixel occupancy:

p(X|D) = \sum_{s_1, s_2} p(X, s_1, s_2|D) = \sum_{s_1, s_2} p(X|s_1, s_2)p(s_1, s_2|D)

demoBattleships.m
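demoBattleships.m refers to a MATLAB demo; a small Python re-sketch of the same posterior computation (deterministic hit/miss likelihood, uniform prior over valid placements — my assumptions, matching the slide's setup) might look like this:

import numpy as np

N, L = 10, 5   # grid size and ship length

def cells(origin, vertical):
    r, c = origin
    return {(r + i, c) if vertical else (r, c + i) for i in range(L)}

v_origins = [(r, c) for r in range(N - L + 1) for c in range(N)]   # ship 1 (vertical)
h_origins = [(r, c) for r in range(N) for c in range(N - L + 1)]   # ship 2 (horizontal)

def posterior_occupancy(data):
    # data: {(row, col): True for a hit, False for a miss}
    occ, weight = np.zeros((N, N)), 0.0
    for s1 in v_origins:
        c1 = cells(s1, True)
        for s2 in h_origins:
            c2 = cells(s2, False)
            if c1 & c2:
                continue                    # ships cannot overlap
            both = c1 | c2
            if any((cell in both) != hit for cell, hit in data.items()):
                continue                    # placement inconsistent with D
            for (r, c) in both:             # accumulate occupancy, uniform prior
                occ[r, c] += 1.0
            weight += 1.0
    return occ / weight                     # p(X_rc = 1 | D)

print(posterior_occupancy({(0, 0): False, (4, 4): True}).round(2))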
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)

A belief network is a directed acyclic graph in which each node has associated the conditional probability of the node given its parents.

The joint distribution is obtained by taking the product of the conditional probabilities:

p(A, B, C, D, E) = p(A)p(B)p(C|A, B)p(D|C)p(E|B, C)

[DAG: A \to C \leftarrow B, C \to D, B \to E \leftarrow C]
Example – Part I

Sally's burglar Alarm is sounding. Has she been Burgled, or was the alarm triggered by an Earthquake? She turns the car Radio on for news of earthquakes.
Choosing an ordering

Without loss of generality, we can write

p(A, R, E, B) = p(A|R, E, B)p(R, E, B)
             = p(A|R, E, B)p(R|E, B)p(E, B)
             = p(A|R, E, B)p(R|E, B)p(E|B)p(B)

Assumptions:

The alarm is not directly influenced by any report on the radio: p(A|R, E, B) = p(A|E, B).
The radio broadcast is not directly influenced by the burglar variable: p(R|E, B) = p(R|E).
Burglaries don't directly 'cause' earthquakes: p(E|B) = p(E).

Therefore

p(A, R, E, B) = p(A|E, B)p(R|E)p(E)p(B)
Example – Part II: Specifying the Tables

[DAG: B \to A \leftarrow E, E \to R]

p(A = 1|B, E):

Burglar B   Earthquake E   Alarm = 1
1           1              0.9999
1           0              0.99
0           1              0.99
0           0              0.0001

p(R = 1|E):

Earthquake E   Radio = 1
1              1
0              0

The remaining tables are p(B = 1) = 0.01 and p(E = 1) = 0.000001. The tables and graphical structure fully specify the distribution.
Example – Part III: Inference

Initial evidence: the alarm is sounding.

p(B = 1|A = 1) = \frac{\sum_{E,R} p(B = 1, E, A = 1, R)}{\sum_{B,E,R} p(B, E, A = 1, R)} = \frac{\sum_{E,R} p(A = 1|B = 1, E)p(B = 1)p(E)p(R|E)}{\sum_{B,E,R} p(A = 1|B, E)p(B)p(E)p(R|E)} \approx 0.99

Additional evidence: the radio broadcasts an earthquake warning.

A similar calculation gives p(B = 1|A = 1, R = 1) \approx 0.01.

Initially, because the alarm sounds, Sally thinks that she's been burgled. However, this probability drops dramatically when she hears that there has been an earthquake.
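These numbers can be reproduced by brute-force enumeration of the joint distribution; a short Python check, using the table values from the previous slide:

import itertools

p_B = {1: 0.01, 0: 0.99}
p_E = {1: 0.000001, 0: 0.999999}
pA1 = {(1, 1): 0.9999, (1, 0): 0.99, (0, 1): 0.99, (0, 0): 0.0001}  # p(A=1 | B, E)
pR1 = {1: 1.0, 0: 0.0}                                              # p(R=1 | E)

def joint(b, e, a, r):
    pa = pA1[b, e] if a == 1 else 1 - pA1[b, e]
    pr = pR1[e] if r == 1 else 1 - pR1[e]
    return p_B[b] * p_E[e] * pa * pr        # p(A|B,E) p(R|E) p(E) p(B)

def p_burglar(evidence):
    num = den = 0.0
    for b, e, a, r in itertools.product((0, 1), repeat=4):
        state = {'B': b, 'E': e, 'A': a, 'R': r}
        if all(state[k] == v for k, v in evidence.items()):
            p = joint(b, e, a, r)
            den += p
            if b == 1:
                num += p
    return num / den

print(p_burglar({'A': 1}))          # approx 0.99
print(p_burglar({'A': 1, 'R': 1}))  # approx 0.01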
Markov Models

For timeseries data v_1, \dots, v_T, we need a model p(v_{1:T}). For causal consistency, it is meaningful to consider the decomposition

p(v_{1:T}) = \prod_{t=1}^{T} p(v_t|v_{1:t-1})

with the convention p(v_t|v_{1:t-1}) = p(v_1) for t = 1.

[Figure: cascade belief network on v_1, v_2, v_3, v_4]

Independence assumptions: it is often natural to assume that the influence of the immediate past is more relevant than the remote past, and in Markov models only a limited number of previous observations are required to predict the future.
Markov Chain

Only the recent past is relevant:

p(v_t|v_1, \dots, v_{t-1}) = p(v_t|v_{t-L}, \dots, v_{t-1})

where L \ge 1 is the order of the Markov chain. For L = 1,

p(v_{1:T}) = p(v_1)p(v_2|v_1)p(v_3|v_2) \cdots p(v_T|v_{T-1})

For a stationary Markov chain the transitions p(v_t = s'|v_{t-1} = s) = f(s', s) are time-independent ('homogeneous').

Figure: (a) first order Markov chain; (b) second order Markov chain.
Markov Chains

p(v_1, \dots, v_T) = \underbrace{p(v_1)}_{\text{initial}} \prod_{t=2}^{T} \underbrace{p(v_t|v_{t-1})}_{\text{transition}}

State transition diagram: nodes represent states of the variable v, and arcs the non-zero elements of the transition p(v_t|v_{t-1}).

[Figure: transition diagram on states 1–9]
Most probable and shortest paths

[Figure: the same transition diagram on states 1–9]

The shortest (unweighted) path from state 1 to state 7 is 1 − 2 − 7.

The most probable path from state 1 to state 7 is 1 − 8 − 9 − 7 (assuming uniform transition probabilities). The latter path is longer but more probable, since for the path 1 − 2 − 7 the probability of exiting state 2 into state 7 is 1/5.
Equilibrium distribution

It is interesting to know how the marginal p(x_t) evolves through time:

p(x_t = i) = \sum_j \underbrace{p(x_t = i|x_{t-1} = j)}_{M_{ij}} p(x_{t-1} = j)

p(x_t = i) is the frequency with which we visit state i at time t, given that we started from p(x_1) and randomly drew samples from the transition p(x_\tau|x_{\tau-1}). As we repeatedly sample a new state from the chain, the distribution at time t for an initial distribution p_1(i) is

p_t = M^{t-1} p_1

If, for t \to \infty, p_\infty is independent of the initial distribution p_1, then p_\infty is called the equilibrium distribution of the chain:

p_\infty = M p_\infty

The equilibrium distribution is proportional to the eigenvector with unit eigenvalue of the transition matrix.
PageRank

Define the matrix

A_{ij} = 1 if website j has a hyperlink to website i, and 0 otherwise.

From this we can define a Markov transition matrix with elements

M_{ij} = \frac{A_{ij}}{\sum_{i'} A_{i'j}}

If we jump from website to website, the equilibrium distribution component p_\infty(i) is the relative number of times we will visit website i. This has a natural interpretation as the 'importance' of website i.

For each website i, a list of words associated with that website is collected. After doing this for all websites, one can make an 'inverse' list of which websites contain word w. When a user searches for word w, the list of websites that contain that word is then returned, ranked according to the importance of the site.
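A toy sketch of both ideas in numpy (the link matrix here is invented for illustration): build M from the adjacency matrix A, then power-iterate to the equilibrium distribution, whose components rank the sites:

import numpy as np

# A[i, j] = 1 if site j links to site i (a made-up 4-site web)
A = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

M = A / A.sum(axis=0)        # column-normalise: M[i, j] = p(next = i | current = j)

p = np.full(4, 0.25)         # any initial distribution p_1
for _ in range(200):         # p_t = M^{t-1} p_1 converges to p_inf
    p = M @ p
print(p)                     # equilibrium 'importance' of each site: p = M p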
Hidden Markov Models

The HMM defines a Markov chain on hidden (or 'latent') variables h_{1:T}. The observed (or 'visible') variables are dependent on the hidden variables through an emission p(v_t|h_t). This defines a joint distribution

p(h_{1:T}, v_{1:T}) = p(v_1|h_1)p(h_1) \prod_{t=2}^{T} p(v_t|h_t)p(h_t|h_{t-1})

For a stationary HMM, the transition p(h_t|h_{t-1}) and emission p(v_t|h_t) distributions are constant through time.

Figure: a first order hidden Markov model with 'hidden' variables dom(h_t) = \{1, \dots, H\}, t = 1, \dots, T. The 'visible' variables v_t can be either discrete or continuous.
The classical inference problems

Filtering (inferring the present): p(h_t|v_{1:t})
Prediction (inferring the future): p(h_t|v_{1:s}), t > s
Smoothing (inferring the past): p(h_t|v_{1:u}), t < u
Likelihood: p(v_{1:T})
Most likely path (Viterbi alignment): \mathrm{argmax}_{h_{1:T}} p(h_{1:T}|v_{1:T})

For prediction, one is also often interested in p(v_t|v_{1:s}) for t > s.
Inference in Hidden Markov Models

[Figure: belief network representation of an HMM: chain h_1 \to h_2 \to h_3 \to h_4, each h_t emitting v_t]

Filtering, Smoothing and Viterbi are all computationally efficient, scaling linearly with the length of the timeseries (but quadratically with the number of hidden states).

The algorithms are variants of 'message passing on factor graphs'.

The algorithms are guaranteed to work if the graph is singly connected.

There has been a huge research effort in the last 15 years to apply message passing for approximate inference in multiply connected graphs (e.g. low-density parity-check codes).
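For example, filtering is a single left-to-right sweep (the forward algorithm); a minimal numpy sketch, with made-up 2-state transition and emission tables:

import numpy as np

def filtering(M, E, p1, obs):
    # M[i, j] = p(h_t = i | h_{t-1} = j); E[k, i] = p(v_t = k | h_t = i)
    alpha = E[obs[0]] * p1                # p(h_1, v_1)
    alphas = [alpha / alpha.sum()]        # normalised: p(h_1 | v_1)
    for v in obs[1:]:
        alpha = E[v] * (M @ alphas[-1])   # transition step, then emission correction
        alphas.append(alpha / alpha.sum())
    return np.array(alphas)               # row t is p(h_t | v_{1:t})

M = np.array([[0.9, 0.2],                 # toy transition (columns sum to 1)
              [0.1, 0.8]])
E = np.array([[0.8, 0.3],                 # toy emission over 2 observation symbols
              [0.2, 0.7]])
p1 = np.array([0.5, 0.5])
print(filtering(M, E, p1, [0, 0, 1, 1]))

Each step costs O(H^2), so the sweep is linear in T and quadratic in the number of hidden states, as claimed above.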
HMMs for speech recognition

h_t is the phoneme at time t; p(h_t|h_{t-1}) – language model; p(v_t|h_t) – speech signal model.
Deep Nets and HMMs

[Figure: HMM with hidden phoneme states h_1, \dots, h_4 and observed audio v_1, \dots, v_4]

Recently, companies including Google have made big advances in speech recognition.

The breakthrough is to model p(v_t|h_t) as a Gaussian whose mean is some function \mu(h_t; \theta) of the phoneme.

This function is a deep neural network, trained on a large amount of data.

There is a gold rush at the moment to find similar breakthrough applications of deep networks in reasoning systems.
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model

[Figure: belief network with latent variables h_1, h_2 as parents of visible variables v_1, \dots, v_4]

It is natural to consider that objects (images, for example) can be constructed on the basis of a low dimensional representation.

Note that this is a Graphical Model, not a Function.

The latent variables h can be sampled from using p(h), and then an image sampled from p(v|h).

One cannot use an autoencoder to generate new images.
The bad news

Inference (computing p(h|v)) and parameter learning are intractable in these models.

Statisticians typically use sampling as an approximation.

It is very popular in ML to use a variational method – much faster for inference.
Variational Inference

Consider a distribution

p(v|\theta) = \int_h p(v|h, \theta)p(h)

and suppose that we wish to learn \theta to maximise the probability that this model generates the observed data. Then

\log p(v|\theta) \ge -\int_h q(h|v, \phi)\log q(h|v, \phi) + \int_h q(h|v, \phi)\log\left(p(v|h, \theta)p(h)\right)

The idea is to choose a 'variational' distribution q(h|v, \phi) such that we can either calculate the bound analytically or sample it efficiently.

We then jointly maximise the bound with respect to \phi and \theta.

We can parameterise p(v|h, \theta) using a deep network.

Very popular approach – see the 'variational autoencoder' and also attention mechanisms.

Extension to a semi-supervised method using p(v) = \int_h \sum_c p(v|h, c)p(c)p(h).
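As a tiny worked instance of the bound (a one-dimensional linear-Gaussian model invented for illustration, chosen so that \log p(v|\theta) is also available exactly), here is a Monte Carlo estimate with a reparameterised Gaussian q:

import numpy as np

rng = np.random.default_rng(0)

def log_gauss(x, mu, var):
    return -0.5 * np.log(2 * np.pi * var) - (x - mu) ** 2 / (2 * var)

theta = 1.5        # model: p(h) = N(0, 1), p(v|h, theta) = N(theta * h, 1)
v = 2.0            # a single observed datapoint
m, s = 0.8, 0.7    # variational parameters phi of q(h|v, phi) = N(m, s^2)

h = m + s * rng.standard_normal(100_000)      # reparameterised samples from q
elbo = np.mean(log_gauss(v, theta * h, 1.0)   #  E_q[ log p(v|h, theta) ]
               + log_gauss(h, 0.0, 1.0)       # + E_q[ log p(h) ]
               - log_gauss(h, m, s ** 2))     # - E_q[ log q(h|v, phi) ]

exact = log_gauss(v, 0.0, theta ** 2 + 1.0)   # here log p(v|theta) is closed-form
print(elbo, exact)                            # elbo <= exact; equal iff q = p(h|v)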
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games?
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A, we need to decide which action to take, for any state of W, that will be best for our long term goals.
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
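A minimal flavour of the 'learn which action to take' step, with tabular Q-learning on a toy 5-state corridor (the environment and all constants are invented for illustration; deep RL replaces the table with a network over the learned low dimensional representation):

import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 5, 2                  # corridor states 0..4; actions: left, right
Q = np.zeros((n_states, n_actions))
gamma, lr, eps = 0.95, 0.1, 0.2

def choose(q):                              # epsilon-greedy with random tie-breaking
    if rng.random() < eps or q[0] == q[1]:
        return int(rng.integers(n_actions))
    return int(q.argmax())

for episode in range(500):
    s = 0
    for _ in range(1000):                   # cap episode length
        a = choose(Q[s])
        s2 = max(0, s - 1) if a == 0 else min(n_states - 1, s + 1)
        r = 1.0 if s2 == n_states - 1 else 0.0   # reward only at the right end
        # Bellman update towards r + gamma * max_a' Q(s2, a')
        Q[s, a] += lr * (r + gamma * Q[s2].max() - Q[s, a])
        s = s2
        if s == n_states - 1:
            break

print(Q.argmax(axis=1))   # learned policy: action 1 (right) in states 0..3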
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state-of-the-art results in Speech Recognition, Image Analysis, Game Playing.
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve the interaction between reinforcement learning and representation learning.
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer: https://reinfer.io
Handwriting Generation using a RNN

Some generated examples. The top line is real handwriting, for comparison. See Alex Graves's work.
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reasons research in deep learning has exploded
Much greater compute power (GPU)
Much larger datasets
AutoDiff
What is AutoDiff?

AutoDiff takes a function f(x) and returns an exact value (up to machine accuracy) for the gradient

g_i(x) \equiv \left.\frac{\partial f}{\partial x_i}\right|_{x}

Note that this is not the same as a numerical approximation (such as central differences) for the gradient.

One can show that, if done efficiently, one can always calculate the gradient in less than 5 times the time it takes to compute f(x).
Reverse DifferentiationA useful graphical representation is that the total derivative of f with respect to xis given by the sum over all path values from x to f where each path value is theproduct of the partial derivatives of the functions on the edges
df
dx=partf
partx+partf
partg
dg
dx
x
f
gpartfpartx
dgdx
partfpartg
Example
For f(x) = x2 + xgh where g =x2 and h = xg2
x
f
gh2x+ gh
2x
xh
2gx
xg
g2
f prime(x) = (2x+ gh) + (g2xg) + (2x2gxxg) + (2xxh) = 2x+ 8x7
Reverse DifferentiationConsider
f(x1 x2) = cos (sin(x1x2))
We can represent this computationally using an Abstract Syntax Tree (AST)
x1 x2
f1
f2
f3
f1(x1 x2) = x1x2
f2(x) = sin(x)
f3(x) = cos(x)
Given values for x1 x2 we first run forwards through the tree so that we canassociate each node with an actual function value
Reverse Differentiation
x1 x2
f1
f2
f3
df3dx1
=partf3partf2
df2dx1
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx1
Similarly
df3dx2
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx2
The two derivatives share the same computation branch andwe want to exploit this
Reverse Differentiation
x1 x2
f1
f2
f3
partf1partx1
= x2partf1partx2
= x1
partf2partf1
= cos(f1)
partf3partf2
= minus sin(f2)
1 Find the reverse ancestral (backwards) scheduleof nodes (f3 f2 f1 x1 x2)
2 Start with the first node n1 in the reverseschedule and define tn1 = 1
3 For the next node n in the reverse schedule findthe child nodes ch (n) Then define
tn =sum
cisinch(n)
partfcpartfn
tc
4 The total derivatives of f with respect to theroot nodes of the tree (here x1 and x2) are givenby the values of t at those nodes
This is a general procedure that can be used to automatically define a subroutineto efficiently compute the gradient It is efficient because information is collectedat nodes in the tree and split between parents only when required
Limitations of forward reasoning
World Representation
Recognising patterns (perceptron style) is only one form of intelligence
Solving chess problems is another and requires complex reasoning using someform of internal model
The world is noisy and information may be conflicting
Recognised that new approaches are required
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Limitations of forward reasoning
World Representation
Models help us to fantasise about the world
Models
Models
Burglar Problem
Creaks and Bumps
Creak Bump
Burglar Model
pos1 pos2 pos3 pos4
snd1 snd2 snd3 snd4
pos - position in kitchensnd ndash sound
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Stubby Fingers
Stubby Fingers
int1 int2 int3 int4
hit1 hit2 hit3 hit4
int - intended keyhit ndash hit key
Stubby Fingers errors
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz
005
01
015
02
025
03
035
04
045
05
055
Stubby Fingers language
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz 0
01
02
03
04
05
06
07
08
09
Stubby Fingers
Given the typed sequence cwsykcak what is the most likely word that thiscorresponds to
List the 200 most likely hidden sequences
Discard those that are not in a standard English dictionary
Take the most likely proper English word as the intended typed word
Speech Recognition raw signal
0 01 02 03 04 05 06 07 08 09minus02
minus015
minus01
minus005
0
005
01
015
02
025
03
Time
lsquoneuralrsquo representation
10 20 30 40 50 60 70 80
5
10
15
20
25
Speech Recognition
pho1 pho2 pho3 pho4
aud1 aud2 aud3 aud4
pho phoneme (letter)aud audio signal (neural representation)
Medical Diagnosis
tumour flu meningitis
headache fever appetite x-ray
Combine known medical knowledge with patient specific information
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems such as the IsingModel (1920) and in AI applications such as the HMM (Baum 1966Stratonovich 1960)
The need for structure
We often want to make a probabilistic description of many objects (electronspins neurons customers etc )
Typically the representational and computational cost of probabilistic modelsgrows exponentially with the number of objects represented
Without introducing strong structural limitations about how these objects caninteract probability is a non-starter
For this reason computationally lsquosimplerrsquo alternatives (such as fuzzy logic)were introduced to try to avoid some of these difficulties ndash however these aretypically frowed on by purists
Graphical Models
We can use graphs to represent how objects can probabilistically interact witheach other
Graphical Models and then a marriage between Graph and Probability theory
Many of the quantities that we would like to compute in a probabilitydistribution can then be related to operations on the graph
The computational complexity of operations can often be related to thestructure of the graph
Graphical Models are now used as a standard framework in EngineeringStatistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty and aretherefore widely applicable
Uses in Industry
Microsoft used to estimate the skill distribution of players in online games(the worlds largest graphical model)
Hospitals use Belief Nets to encode knowledge about diseases and symptomsto aid medical diagnosis
Google Microsoft Facebook used in many places including advertisingvideo game prediction speech recognition
Used to estimate inherent desirability of products in consumer retail
Microsoft and others Attempt to go beyond simple AB testing by usesGraphical Models to model the whole companyuser relationship
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or more shortly theprobability of x given y) is defined as
p(x|y) equiv p(x y)
p(y)=p(y|x)p(x)
p(y)(Bayesrsquo rule)
Throwing darts
p(region 5|not region 20) =p(region 5 not region 20)
p(not region 20)
=p(region 5)
p(not region 20)=
120
1920=
1
19
Interpretationp(A = a|B = b) should not be interpreted as lsquoGiven the event B = b has occurredp(A = a|B = b) is the probability of the event A = a occurringrsquo The correctinterpretation should be lsquop(A = a|B = b) is the probability of A being in state aunder the constraint that B is in state brsquo
Battleships
Assume there are 2 ships 1 vertical (ship 1) and 1 horizontal (ship 2) of 5pixels each
Can be placed anywhere on the 10times10 grid but cannot overlap
Let s1 is the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query lsquohitrsquo or lsquomissrsquo responses
p(s1 s2|D) =p(D|s1 s2)p(s1 s2)
p(D)Let X be the matrix of pixel occupancy
p(X|D) =sums1s2
p(X s1 s2|D) =sums1s2
p(X|s1 s2)p(s1 s2|D)
demoBattleshipsm
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reasons research in deep learning has exploded
Much greater compute power (GPU)
Much larger datasets
AutoDiff
What is AutoDiff
AutoDiff takes a function f(x) and returns an exact value (up to machineaccuracy) for the gradient
gi(x) equivpart
partxif
∥∥∥∥x
Note that this is not the same as a numerical approximation (such as centraldifferences) for the gradient
One can show that if done efficiently one can always calculate the gradient inless than 5 times the time it takes to compute f(x)
Reverse DifferentiationA useful graphical representation is that the total derivative of f with respect to xis given by the sum over all path values from x to f where each path value is theproduct of the partial derivatives of the functions on the edges
df
dx=partf
partx+partf
partg
dg
dx
x
f
gpartfpartx
dgdx
partfpartg
Example
For f(x) = x2 + xgh where g =x2 and h = xg2
x
f
gh2x+ gh
2x
xh
2gx
xg
g2
f prime(x) = (2x+ gh) + (g2xg) + (2x2gxxg) + (2xxh) = 2x+ 8x7
Reverse DifferentiationConsider
f(x1 x2) = cos (sin(x1x2))
We can represent this computationally using an Abstract Syntax Tree (AST)
x1 x2
f1
f2
f3
f1(x1 x2) = x1x2
f2(x) = sin(x)
f3(x) = cos(x)
Given values for x1 x2 we first run forwards through the tree so that we canassociate each node with an actual function value
Reverse Differentiation
x1 x2
f1
f2
f3
df3dx1
=partf3partf2
df2dx1
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx1
Similarly
df3dx2
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx2
The two derivatives share the same computation branch andwe want to exploit this
Reverse Differentiation
x1 x2
f1
f2
f3
partf1partx1
= x2partf1partx2
= x1
partf2partf1
= cos(f1)
partf3partf2
= minus sin(f2)
1 Find the reverse ancestral (backwards) scheduleof nodes (f3 f2 f1 x1 x2)
2 Start with the first node n1 in the reverseschedule and define tn1 = 1
3 For the next node n in the reverse schedule findthe child nodes ch (n) Then define
tn =sum
cisinch(n)
partfcpartfn
tc
4 The total derivatives of f with respect to theroot nodes of the tree (here x1 and x2) are givenby the values of t at those nodes
This is a general procedure that can be used to automatically define a subroutineto efficiently compute the gradient It is efficient because information is collectedat nodes in the tree and split between parents only when required
Limitations of forward reasoning
World Representation
Recognising patterns (perceptron style) is only one form of intelligence
Solving chess problems is another and requires complex reasoning using someform of internal model
The world is noisy and information may be conflicting
Recognised that new approaches are required
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Limitations of forward reasoning
World Representation
Models help us to fantasise about the world
Models
Models
Burglar Problem
Creaks and Bumps
Creak Bump
Burglar Model
pos1 pos2 pos3 pos4
snd1 snd2 snd3 snd4
pos - position in kitchensnd ndash sound
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Stubby Fingers
Stubby Fingers
int1 int2 int3 int4
hit1 hit2 hit3 hit4
int - intended keyhit ndash hit key
Stubby Fingers errors
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz
005
01
015
02
025
03
035
04
045
05
055
Stubby Fingers language
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz 0
01
02
03
04
05
06
07
08
09
Stubby Fingers
Given the typed sequence cwsykcak what is the most likely word that thiscorresponds to
List the 200 most likely hidden sequences
Discard those that are not in a standard English dictionary
Take the most likely proper English word as the intended typed word
Speech Recognition raw signal
0 01 02 03 04 05 06 07 08 09minus02
minus015
minus01
minus005
0
005
01
015
02
025
03
Time
lsquoneuralrsquo representation
10 20 30 40 50 60 70 80
5
10
15
20
25
Speech Recognition
pho1 pho2 pho3 pho4
aud1 aud2 aud3 aud4
pho phoneme (letter)aud audio signal (neural representation)
Medical Diagnosis
tumour flu meningitis
headache fever appetite x-ray
Combine known medical knowledge with patient specific information
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems such as the IsingModel (1920) and in AI applications such as the HMM (Baum 1966Stratonovich 1960)
The need for structure
We often want to make a probabilistic description of many objects (electronspins neurons customers etc )
Typically the representational and computational cost of probabilistic modelsgrows exponentially with the number of objects represented
Without introducing strong structural limitations about how these objects caninteract probability is a non-starter
For this reason computationally lsquosimplerrsquo alternatives (such as fuzzy logic)were introduced to try to avoid some of these difficulties ndash however these aretypically frowed on by purists
Graphical Models
We can use graphs to represent how objects can probabilistically interact witheach other
Graphical Models and then a marriage between Graph and Probability theory
Many of the quantities that we would like to compute in a probabilitydistribution can then be related to operations on the graph
The computational complexity of operations can often be related to thestructure of the graph
Graphical Models are now used as a standard framework in EngineeringStatistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty and aretherefore widely applicable
Uses in Industry
Microsoft used to estimate the skill distribution of players in online games(the worlds largest graphical model)
Hospitals use Belief Nets to encode knowledge about diseases and symptomsto aid medical diagnosis
Google Microsoft Facebook used in many places including advertisingvideo game prediction speech recognition
Used to estimate inherent desirability of products in consumer retail
Microsoft and others Attempt to go beyond simple AB testing by usesGraphical Models to model the whole companyuser relationship
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or more shortly theprobability of x given y) is defined as
p(x|y) equiv p(x y)
p(y)=p(y|x)p(x)
p(y)(Bayesrsquo rule)
Throwing darts
p(region 5|not region 20) =p(region 5 not region 20)
p(not region 20)
=p(region 5)
p(not region 20)=
120
1920=
1
19
Interpretationp(A = a|B = b) should not be interpreted as lsquoGiven the event B = b has occurredp(A = a|B = b) is the probability of the event A = a occurringrsquo The correctinterpretation should be lsquop(A = a|B = b) is the probability of A being in state aunder the constraint that B is in state brsquo
Battleships
Assume there are 2 ships 1 vertical (ship 1) and 1 horizontal (ship 2) of 5pixels each
Can be placed anywhere on the 10times10 grid but cannot overlap
Let s1 is the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query lsquohitrsquo or lsquomissrsquo responses
p(s1 s2|D) =p(D|s1 s2)p(s1 s2)
p(D)Let X be the matrix of pixel occupancy
p(X|D) =sums1s2
p(X s1 s2|D) =sums1s2
p(X|s1 s2)p(s1 s2|D)
demoBattleshipsm
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
Reasons research in deep learning has exploded
Much greater compute power (GPU)
Much larger datasets
AutoDiff
What is AutoDiff
AutoDiff takes a function f(x) and returns an exact value (up to machineaccuracy) for the gradient
gi(x) equivpart
partxif
∥∥∥∥x
Note that this is not the same as a numerical approximation (such as centraldifferences) for the gradient
One can show that if done efficiently one can always calculate the gradient inless than 5 times the time it takes to compute f(x)
Reverse DifferentiationA useful graphical representation is that the total derivative of f with respect to xis given by the sum over all path values from x to f where each path value is theproduct of the partial derivatives of the functions on the edges
df
dx=partf
partx+partf
partg
dg
dx
x
f
gpartfpartx
dgdx
partfpartg
Example
For f(x) = x2 + xgh where g =x2 and h = xg2
x
f
gh2x+ gh
2x
xh
2gx
xg
g2
f prime(x) = (2x+ gh) + (g2xg) + (2x2gxxg) + (2xxh) = 2x+ 8x7
Reverse DifferentiationConsider
f(x1 x2) = cos (sin(x1x2))
We can represent this computationally using an Abstract Syntax Tree (AST)
x1 x2
f1
f2
f3
f1(x1 x2) = x1x2
f2(x) = sin(x)
f3(x) = cos(x)
Given values for x1 x2 we first run forwards through the tree so that we canassociate each node with an actual function value
Reverse Differentiation
x1 x2
f1
f2
f3
df3dx1
=partf3partf2
df2dx1
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx1
Similarly
df3dx2
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx2
The two derivatives share the same computation branch andwe want to exploit this
Reverse Differentiation
x1 x2
f1
f2
f3
partf1partx1
= x2partf1partx2
= x1
partf2partf1
= cos(f1)
partf3partf2
= minus sin(f2)
1 Find the reverse ancestral (backwards) scheduleof nodes (f3 f2 f1 x1 x2)
2 Start with the first node n1 in the reverseschedule and define tn1 = 1
3 For the next node n in the reverse schedule findthe child nodes ch (n) Then define
tn =sum
cisinch(n)
partfcpartfn
tc
4 The total derivatives of f with respect to theroot nodes of the tree (here x1 and x2) are givenby the values of t at those nodes
This is a general procedure that can be used to automatically define a subroutineto efficiently compute the gradient It is efficient because information is collectedat nodes in the tree and split between parents only when required
Limitations of forward reasoning
World Representation
Recognising patterns (perceptron style) is only one form of intelligence
Solving chess problems is another and requires complex reasoning using someform of internal model
The world is noisy and information may be conflicting
Recognised that new approaches are required
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Limitations of forward reasoning
World Representation
Models help us to fantasise about the world
Models
Models
Burglar Problem
Creaks and Bumps
Creak Bump
Burglar Model
pos1 pos2 pos3 pos4
snd1 snd2 snd3 snd4
pos - position in kitchensnd ndash sound
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Stubby Fingers
Stubby Fingers
int1 int2 int3 int4
hit1 hit2 hit3 hit4
int - intended keyhit ndash hit key
Stubby Fingers errors
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz
005
01
015
02
025
03
035
04
045
05
055
Stubby Fingers language
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz 0
01
02
03
04
05
06
07
08
09
Stubby Fingers
Given the typed sequence cwsykcak what is the most likely word that thiscorresponds to
List the 200 most likely hidden sequences
Discard those that are not in a standard English dictionary
Take the most likely proper English word as the intended typed word
Speech Recognition raw signal
0 01 02 03 04 05 06 07 08 09minus02
minus015
minus01
minus005
0
005
01
015
02
025
03
Time
lsquoneuralrsquo representation
10 20 30 40 50 60 70 80
5
10
15
20
25
Speech Recognition
pho1 pho2 pho3 pho4
aud1 aud2 aud3 aud4
pho phoneme (letter)aud audio signal (neural representation)
Medical Diagnosis
tumour flu meningitis
headache fever appetite x-ray
Combine known medical knowledge with patient specific information
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems such as the IsingModel (1920) and in AI applications such as the HMM (Baum 1966Stratonovich 1960)
The need for structure
We often want to make a probabilistic description of many objects (electronspins neurons customers etc )
Typically the representational and computational cost of probabilistic modelsgrows exponentially with the number of objects represented
Without introducing strong structural limitations about how these objects caninteract probability is a non-starter
For this reason computationally lsquosimplerrsquo alternatives (such as fuzzy logic)were introduced to try to avoid some of these difficulties ndash however these aretypically frowed on by purists
Graphical Models
We can use graphs to represent how objects can probabilistically interact witheach other
Graphical Models and then a marriage between Graph and Probability theory
Many of the quantities that we would like to compute in a probabilitydistribution can then be related to operations on the graph
The computational complexity of operations can often be related to thestructure of the graph
Graphical Models are now used as a standard framework in EngineeringStatistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty and aretherefore widely applicable
Uses in Industry
Microsoft used to estimate the skill distribution of players in online games(the worlds largest graphical model)
Hospitals use Belief Nets to encode knowledge about diseases and symptomsto aid medical diagnosis
Google Microsoft Facebook used in many places including advertisingvideo game prediction speech recognition
Used to estimate inherent desirability of products in consumer retail
Microsoft and others Attempt to go beyond simple AB testing by usesGraphical Models to model the whole companyuser relationship
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or more shortly theprobability of x given y) is defined as
p(x|y) equiv p(x y)
p(y)=p(y|x)p(x)
p(y)(Bayesrsquo rule)
Throwing darts
p(region 5|not region 20) =p(region 5 not region 20)
p(not region 20)
=p(region 5)
p(not region 20)=
120
1920=
1
19
Interpretationp(A = a|B = b) should not be interpreted as lsquoGiven the event B = b has occurredp(A = a|B = b) is the probability of the event A = a occurringrsquo The correctinterpretation should be lsquop(A = a|B = b) is the probability of A being in state aunder the constraint that B is in state brsquo
Battleships
Assume there are 2 ships 1 vertical (ship 1) and 1 horizontal (ship 2) of 5pixels each
Can be placed anywhere on the 10times10 grid but cannot overlap
Let s1 is the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query lsquohitrsquo or lsquomissrsquo responses
p(s1 s2|D) =p(D|s1 s2)p(s1 s2)
p(D)Let X be the matrix of pixel occupancy
p(X|D) =sums1s2
p(X s1 s2|D) =sums1s2
p(X|s1 s2)p(s1 s2|D)
demoBattleshipsm
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an ordering
Without loss of generality, we can write

p(A, R, E, B) = p(A|R, E, B) p(R, E, B)
             = p(A|R, E, B) p(R|E, B) p(E, B)
             = p(A|R, E, B) p(R|E, B) p(E|B) p(B)
Assumptions
The alarm is not directly influenced by any report on the radio: p(A|R, E, B) = p(A|E, B).
The radio broadcast is not directly influenced by the burglar variable: p(R|E, B) = p(R|E).
Burglaries don't directly 'cause' earthquakes: p(E|B) = p(E).
Therefore
p(A, R, E, B) = p(A|E, B) p(R|E) p(E) p(B)
Example – Part II: Specifying the Tables
[Figure: DAG with edges B→A, E→A and E→R]
p(A = 1|B, E):

Burglar  Earthquake  Alarm = 1
1        1           0.9999
1        0           0.99
0        1           0.99
0        0           0.0001
p(R = 1|E):

Earthquake  Radio = 1
1           1
0           0
The remaining tables are p(B = 1) = 0.01 and p(E = 1) = 0.000001. The tables and graphical structure fully specify the distribution.
Example – Part III: Inference
Initial Evidence: The alarm is sounding.
p(B = 1|A = 1) = Σ_{E,R} p(B = 1, E, A = 1, R) / Σ_{B,E,R} p(B, E, A = 1, R)
             = Σ_{E,R} p(A = 1|B = 1, E) p(B = 1) p(E) p(R|E) / Σ_{B,E,R} p(A = 1|B, E) p(B) p(E) p(R|E)
             ≈ 0.99
Additional Evidence: The radio broadcasts an earthquake warning.
A similar calculation gives p(B = 1|A = 1, R = 1) ≈ 0.01.
Initially, because the alarm sounds, Sally thinks that she's been burgled. However, this probability drops dramatically when she hears that there has been an earthquake.
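Since the network is tiny, these numbers can be verified by brute-force enumeration of the joint. A minimal Python sketch using the tables above (the code layout is mine, not from the slides):

import itertools

pB = {1: 0.01, 0: 0.99}                 # p(B = 1) = 0.01
pE = {1: 1e-6, 0: 1 - 1e-6}             # p(E = 1) = 0.000001
pA = {(1, 1): 0.9999, (1, 0): 0.99,     # p(A = 1 | B, E)
      (0, 1): 0.99, (0, 0): 0.0001}
pR = {1: 1.0, 0: 0.0}                   # p(R = 1 | E)

def joint(b, e, a, r):
    pa = pA[(b, e)] if a == 1 else 1 - pA[(b, e)]
    pr = pR[e] if r == 1 else 1 - pR[e]
    return pB[b] * pE[e] * pa * pr      # p(A,R,E,B) = p(A|E,B) p(R|E) p(E) p(B)

def p_burglar(evidence):
    """p(B = 1 | evidence), where evidence is e.g. {'A': 1} or {'A': 1, 'R': 1}."""
    num = den = 0.0
    for b, e, a, r in itertools.product([0, 1], repeat=4):
        state = {'B': b, 'E': e, 'A': a, 'R': r}
        if any(state[k] != v for k, v in evidence.items()):
            continue                    # sum only over states matching the evidence
        p = joint(b, e, a, r)
        den += p
        if b == 1:
            num += p
    return num / den

print(p_burglar({'A': 1}))              # ~0.99
print(p_burglar({'A': 1, 'R': 1}))      # ~0.01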
Markov Models
For timeseries data v1, ..., vT we need a model p(v1:T). For causal consistency it is meaningful to consider the decomposition

p(v1:T) = ∏_{t=1}^{T} p(vt|v1:t−1)

with the convention p(vt|v1:t−1) = p(v1) for t = 1.

[Figure: cascade belief network over v1, v2, v3, v4]
Independence assumptions
It is often natural to assume that the influence of the immediate past is more relevant than the remote past, and in Markov models only a limited number of previous observations are required to predict the future.
Markov Chain
Only the recent past is relevant
p(vt|v1, ..., vt−1) = p(vt|vt−L, ..., vt−1)

where L ≥ 1 is the order of the Markov chain. For a first order chain (L = 1),

p(v1:T) = p(v1) p(v2|v1) p(v3|v2) ··· p(vT|vT−1)

For a stationary Markov chain the transitions p(vt = s′|vt−1 = s) = f(s′, s) are time-independent ('homogeneous').
[Figure: (a) First order Markov chain; (b) Second order Markov chain]
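As a small illustration, the sketch below samples from a made-up homogeneous three-state chain and evaluates the log-probability of the sampled sequence:

import math
import random

# M[j] is the distribution of the next state given current state j,
# i.e. M[j][i] = p(v_t = i | v_{t-1} = j); rows sum to 1 (homogeneous chain).
M = [[0.9, 0.1, 0.0],
     [0.2, 0.7, 0.1],
     [0.0, 0.3, 0.7]]
p1 = [1.0, 0.0, 0.0]                     # initial distribution p(v_1)

def sample_chain(T, rng=random.Random(0)):
    v = rng.choices(range(3), weights=p1)[0]
    seq = [v]
    for _ in range(T - 1):
        v = rng.choices(range(3), weights=M[v])[0]   # draw v_t | v_{t-1}
        seq.append(v)
    return seq

def log_prob(seq):
    # log p(v_1) + sum_t log p(v_t | v_{t-1})
    lp = math.log(p1[seq[0]])
    for a, b in zip(seq, seq[1:]):
        lp += math.log(M[a][b])
    return lp

seq = sample_chain(10)
print(seq, log_prob(seq))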
Markov Chains
[Figure: first order Markov chain v1→v2→v3→v4]

p(v1, ..., vT) = p(v1) ∏_{t=2}^{T} p(vt|vt−1),  with p(v1) the initial distribution and p(vt|vt−1) the transition.
State transition diagram: nodes represent states of the variable v, and arcs non-zero elements of the transition p(vt|vt−1).
[Figure: state transition diagram over states 1–9]
Most probable and shortest paths
[Figure: the same state transition diagram over states 1–9]
The shortest (unweighted) path from state 1 to state 7 is 1−2−7.
The most probable path from state 1 to state 7 is 1−8−9−7 (assuming uniform transition probabilities). The latter path is longer but more probable, since for the path 1−2−7 the probability of exiting state 2 into state 7 is 1/5.
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) = Σ_j p(xt = i|xt−1 = j) p(xt−1 = j),  where Mij ≡ p(xt = i|xt−1 = j)
p(xt = i) is the frequency that we visit state i at time t, given we started from p(x1) and randomly drew samples from the transition p(xτ|xτ−1). As we repeatedly sample a new state from the chain, the distribution at time t for an initial distribution p1(i) is

pt = M^{t−1} p1
If, for t→∞, p∞ is independent of the initial distribution p1, then p∞ is called the equilibrium distribution of the chain:

p∞ = M p∞
The equilibrium distribution is proportional to the eigenvector of the transition matrix with unit eigenvalue.
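A quick numerical check of this statement, using power iteration on an arbitrary 3×3 column-stochastic matrix and comparing with the unit-eigenvalue eigenvector:

import numpy as np

# Column-stochastic transition matrix: M[i, j] = p(x_t = i | x_{t-1} = j).
M = np.array([[0.9, 0.2, 0.0],
              [0.1, 0.7, 0.3],
              [0.0, 0.1, 0.7]])

p = np.array([1.0, 0.0, 0.0])   # any initial distribution p_1
for _ in range(1000):           # p_t = M^{t-1} p_1
    p = M @ p
print(p)                        # the equilibrium distribution, satisfying p = M p

# Cross-check: eigenvector of M with eigenvalue 1, normalised to sum to 1.
w, V = np.linalg.eig(M)
v = np.real(V[:, np.argmax(np.real(w))])
print(v / v.sum())              # matches the power-iteration result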
PageRank
Define the matrix
Aij = 1 if website j has a hyperlink to website i, and 0 otherwise.
From this we can define a Markov transition matrix with elements

Mij = Aij / Σ_{i′} A_{i′j}
If we jump from website to website, the equilibrium distribution component p∞(i) is the relative number of times we will visit website i. This has a natural interpretation as the 'importance' of website i.
For each website i a list of words associated with that website is collected. After doing this for all websites, one can make an 'inverse' list of which websites contain word w. When a user searches for word w, the list of websites that contain that word is then returned, ranked according to the importance of the site.
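A toy sketch of this construction, assuming a four-site web in which every site has at least one outgoing link (practical PageRank also adds a damping factor, not discussed here):

import numpy as np

# Hypothetical 4-site web: A[i, j] = 1 if site j links to site i.
A = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 1, 0, 0],
              [0, 0, 1, 0]], dtype=float)

M = A / A.sum(axis=0)           # M[i, j] = A[i, j] / sum_i' A[i', j]
p = np.full(4, 0.25)            # start from the uniform distribution
for _ in range(500):            # power iteration towards p = M p
    p = M @ p
print(p)                        # 'importance' of each site; rank sites by this score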
Hidden Markov Models
The HMM defines a Markov chain on hidden (or 'latent') variables h1:T. The observed (or 'visible') variables are dependent on the hidden variables through an emission p(vt|ht). This defines a joint distribution

p(h1:T, v1:T) = p(v1|h1) p(h1) ∏_{t=2}^{T} p(vt|ht) p(ht|ht−1)
For a stationary HMM the transition p(ht|ht−1) and emission p(vt|ht) distributions are constant through time.
[Figure: a first order hidden Markov model. The 'hidden' variables have dom(ht) = {1, ..., H}, t = 1, ..., T; the 'visible' variables vt can be either discrete or continuous.]
The classical inference problems
Filtering (inferring the present): p(ht|v1:t)
Prediction (inferring the future): p(ht|v1:s), t > s
Smoothing (inferring the past): p(ht|v1:u), t < u
Likelihood: p(v1:T)
Most likely path (Viterbi alignment): argmax_{h1:T} p(h1:T|v1:T)
For prediction, one is also often interested in p(vt|v1:s) for t > s.
Inference in Hidden Markov Models
Belief network representation of a HMM:

[Figure: chain h1→h2→h3→h4 with emissions ht→vt]
Filtering, Smoothing and Viterbi are all computationally efficient, scaling linearly with the length of the timeseries (but quadratically with the number of hidden states).
The algorithms are variants of 'message passing on factor graphs'.
Algorithms are guaranteed to work if the graph is singly-connected.
Huge research effort in the last 15 years to apply message passing for approximate inference in multiply-connected graphs (e.g. low-density parity-check codes).
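For concreteness, a minimal sketch of filtering via the forward algorithm on an invented stationary HMM; each step costs O(H²), giving the linear-in-T, quadratic-in-H scaling quoted above:

import numpy as np

H, V = 3, 2
trans = np.array([[0.8, 0.1, 0.1],   # trans[i, j] = p(h_t = j | h_{t-1} = i)
                  [0.1, 0.8, 0.1],
                  [0.1, 0.1, 0.8]])
emit = np.array([[0.9, 0.1],         # emit[i, v] = p(v_t = v | h_t = i)
                 [0.5, 0.5],
                 [0.1, 0.9]])
ph1 = np.full(H, 1.0 / H)            # prior p(h_1)

def filtering(obs):
    """Forward algorithm: returns p(h_t | v_{1:t}) for each t."""
    alpha = ph1 * emit[:, obs[0]]
    alpha /= alpha.sum()
    out = [alpha]
    for v in obs[1:]:
        alpha = emit[:, v] * (trans.T @ alpha)   # predict, then correct
        alpha /= alpha.sum()                     # normalise for numerical stability
        out.append(alpha)
    return np.array(out)

print(filtering([0, 0, 1, 1, 1]))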
HMMs for speech recognition
ht is the phoneme at time t; p(ht|ht−1) – language model; p(vt|ht) – speech signal model.
Deep Nets and HMMs
[Figure: HMM with hidden phonemes h1, ..., h4 and visible acoustic frames v1, ..., v4]
Recently, companies including Google have made big advances in speech recognition.
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is some function of the phoneme, μ(ht; θ).
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deep networks in reasoning systems.
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
[Figure: generative model with latent variables h1, h2 and visible variables v1, ..., v4]
It is natural to consider that objects (images, for example) can be constructed on the basis of a low dimensional representation.
Note that this is a Graphical Model, not a Function.
The latent variables h can be sampled from using p(h), and then an image sampled from p(v|h).
One cannot use an autoencoder to generate new images.
The bad news
Inference (computing p(h|v)) and parameter learning are intractable in these models.
Statisticians typically use sampling as an approximation.
Very popular in ML to use a variational method – much faster for inference.
Variational Inference
Consider a distribution

p(v|θ) = ∫_h p(v|h, θ) p(h)

and that we wish to learn θ to maximise the probability this model generates observed data. Then

log p(v|θ) ≥ −∫_h q(h|v, φ) log q(h|v, φ) + ∫_h q(h|v, φ) log p(v|h, θ) + const.
Idea is to choose a 'variational' distribution q(h|v, φ) such that we can either calculate the bound analytically or sample it efficiently.
We then jointly maximise the bound w.r.t. φ and θ.
We can parameterise p(v|h, θ) using a deep network.
Very popular approach – see 'variational autoencoder' and also attention mechanisms.
Extension to semi-supervised method using p(v) = ∫_h Σ_c p(v|h, c) p(c) p(h).
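A minimal numerical sketch of the bound for a toy model with p(h) = N(0, 1), p(v|h, θ) = N(θ + h, 1) and a Gaussian q(h|v, φ). Everything is Gaussian, so the exact log-likelihood is available in closed form; with q set to the true posterior, the Monte Carlo estimate of the bound recovers it up to sampling noise:

import numpy as np

rng = np.random.default_rng(0)

def log_gauss(x, mu, sigma):
    return -0.5 * np.log(2 * np.pi * sigma**2) - (x - mu)**2 / (2 * sigma**2)

def elbo(v, theta, mu_phi, sigma_phi, n_samples=10_000):
    # Sample h ~ q(h|v, phi) = N(mu_phi, sigma_phi^2), then estimate
    # E_q[ log p(v|h,theta) + log p(h) - log q(h|v,phi) ].
    h = mu_phi + sigma_phi * rng.standard_normal(n_samples)
    return np.mean(log_gauss(v, theta + h, 1.0)
                   + log_gauss(h, 0.0, 1.0)
                   - log_gauss(h, mu_phi, sigma_phi))

v = 2.0
# True posterior for theta = 0 is N(v/2, 1/2), so this choice makes the bound tight:
print(elbo(v, theta=0.0, mu_phi=1.0, sigma_phi=np.sqrt(0.5)))
print(log_gauss(v, 0.0, np.sqrt(2.0)))   # exact log p(v|theta) = log N(v; theta, sqrt(2))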
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games?
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A, we need to decide which action to take, for any state of W, that will be best for our long-term goals.
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deep generative model).
Then learn which action to take given the low dimensional representation.
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition, Image Analysis, Game Playing.
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representation learning.
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer:
https://reinfer.io
Reverse Differentiation
A useful graphical representation is that the total derivative of f with respect to x is given by the sum over all path values from x to f, where each path value is the product of the partial derivatives of the functions on the edges:

df/dx = ∂f/∂x + (∂f/∂g)(dg/dx)

[Figure: two paths from x to f – a direct edge labelled ∂f/∂x, and a path through g with edges dg/dx and ∂f/∂g]
Example
For f(x) = x² + x g h, where g = x² and h = x g²:

[Figure: computation graph from x to f with edge labels 2x + gh (x→f), 2x (x→g), g² (x→h), xh (g→f), 2gx (g→h) and xg (h→f)]
f′(x) = (2x + gh) + (g² · xg) + (2x · 2gx · xg) + (2x · xh) = 2x + 8x⁷
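This path-sum result is easy to sanity-check with a central finite difference:

def f(x):
    g = x**2
    h = x * g**2          # h = x^5, so f(x) = x^2 + x^8
    return x**2 + x * g * h

def fprime(x):
    return 2 * x + 8 * x**7   # the path-sum result

x, eps = 1.3, 1e-6
numeric = (f(x + eps) - f(x - eps)) / (2 * eps)
print(numeric, fprime(x))     # agree to several significant figures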
Reverse Differentiation
Consider

f(x1, x2) = cos(sin(x1 x2))

We can represent this computationally using an Abstract Syntax Tree (AST):

[Figure: AST with leaves x1, x2 and internal nodes f1, f2, f3]

f1(x1, x2) = x1 x2
f2(x) = sin(x)
f3(x) = cos(x)

Given values for x1, x2, we first run forwards through the tree so that we can associate each node with an actual function value.
Reverse Differentiation

df3/dx1 = (∂f3/∂f2) (df2/dx1) = (∂f3/∂f2) (df2/df1) (df1/dx1),  where (∂f3/∂f2)(df2/df1) = df3/df1

Similarly,

df3/dx2 = (∂f3/∂f2) (df2/df1) (df1/dx2)

The two derivatives share the same computation branch, and we want to exploit this.
Reverse Differentiation

∂f1/∂x1 = x2,  ∂f1/∂x2 = x1,  ∂f2/∂f1 = cos(f1),  ∂f3/∂f2 = −sin(f2)

1. Find the reverse ancestral (backwards) schedule of nodes (f3, f2, f1, x1, x2).
2. Start with the first node n1 in the reverse schedule and define t_{n1} = 1.
3. For the next node n in the reverse schedule, find the child nodes ch(n). Then define

t_n = Σ_{c ∈ ch(n)} (∂f_c/∂f_n) t_c

4. The total derivatives of f with respect to the root nodes of the tree (here x1 and x2) are given by the values of t at those nodes.
This is a general procedure that can be used to automatically define a subroutine to efficiently compute the gradient. It is efficient because information is collected at nodes in the tree and split between parents only when required.
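A sketch of this procedure for the running example f3 = cos(sin(x1 x2)), with the tree structure hard-coded for clarity; t accumulates exactly as in step 3:

import math

# Forward pass: associate each node with its value.
x1, x2 = 0.7, -1.2
f1 = x1 * x2
f2 = math.sin(f1)
f3 = math.cos(f2)

# Local partial derivatives on the edges of the tree.
d = {('f3', 'f2'): -math.sin(f2),
     ('f2', 'f1'): math.cos(f1),
     ('f1', 'x1'): x2,
     ('f1', 'x2'): x1}

# ch(n): the nodes that consume n (closer to the root f3).
children = {'f2': ['f3'], 'f1': ['f2'], 'x1': ['f1'], 'x2': ['f1']}

# Reverse pass over the schedule (f3, f2, f1, x1, x2), with t_{f3} = 1.
t = {'f3': 1.0}
for n in ['f2', 'f1', 'x1', 'x2']:
    t[n] = sum(d[(c, n)] * t[c] for c in children[n])

print(t['x1'], t['x2'])
# Check against the closed form df3/dx1 = -sin(sin(x1 x2)) cos(x1 x2) x2:
print(-math.sin(math.sin(x1 * x2)) * math.cos(x1 * x2) * x2)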
Limitations of forward reasoning
World Representation
Recognising patterns (perceptron style) is only one form of intelligence
Solving chess problems is another, and requires complex reasoning using some form of internal model.
The world is noisy and information may be conflicting
Recognised that new approaches are required
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Limitations of forward reasoning
World Representation
Models help us to fantasise about the world
Models
Burglar Problem
Creaks and Bumps
Creak Bump
Burglar Model
[Figure: HMM with hidden positions pos1, ..., pos4 and observed sounds snd1, ..., snd4]
pos – position in kitchen; snd – sound
Finding the Burglar

[Figure: grid of creak/bump observations over time; the slide repeats as an animation, updating the inferred burglar position after each observation]
Stubby Fingers

[Figure: HMM with intended keys int1, ..., int4 and observed hit keys hit1, ..., hit4]
int – intended key; hit – hit key
Stubby Fingers: errors

[Figure: error matrix p(hit|int) over the letters a–z; colour scale from 0.05 to 0.55]
Stubby Fingers: language

[Figure: letter transition matrix of the language model over a–z; colour scale from 0 to 0.9]
Stubby Fingers
Given the typed sequence cwsykcak, what is the most likely word that this corresponds to?
List the 200 most likely hidden sequences
Discard those that are not in a standard English dictionary
Take the most likely proper English word as the intended typed word
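The heart of this pipeline is most-likely-sequence inference in the HMM. Below is a minimal Viterbi sketch over a three-letter alphabet with invented transition and emission tables (the slides' model uses the full alphabet and the matrices shown above):

import math

letters = 'abc'
# trans[a][b] = p(int_t = b | int_{t-1} = a): toy language model.
trans = {'a': {'a': 0.1, 'b': 0.6, 'c': 0.3},
         'b': {'a': 0.4, 'b': 0.2, 'c': 0.4},
         'c': {'a': 0.5, 'b': 0.4, 'c': 0.1}}
# emit[a][b] = p(hit = b | int = a): most mass on the intended key.
emit = {'a': {'a': 0.8, 'b': 0.1, 'c': 0.1},
        'b': {'a': 0.1, 'b': 0.8, 'c': 0.1},
        'c': {'a': 0.1, 'b': 0.1, 'c': 0.8}}

def viterbi(hits):
    """Most likely intended sequence given the typed (hit) sequence."""
    delta = {s: math.log(1 / len(letters)) + math.log(emit[s][hits[0]])
             for s in letters}
    back = []
    for v in hits[1:]:
        new, ptr = {}, {}
        for s in letters:
            prev = max(letters, key=lambda r: delta[r] + math.log(trans[r][s]))
            new[s] = delta[prev] + math.log(trans[prev][s]) + math.log(emit[s][v])
            ptr[s] = prev
        delta, back = new, back + [ptr]
    # Trace back the best path from the best final state.
    s = max(letters, key=lambda r: delta[r])
    path = [s]
    for ptr in reversed(back):
        s = ptr[s]
        path.append(s)
    return ''.join(reversed(path))

print(viterbi('abcb'))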
Speech Recognition: raw signal

[Figure: raw speech waveform, amplitude against time (0–0.9 s)]
'neural' representation

[Figure: time–frequency ('neural') representation of the speech signal]
Speech Recognition

[Figure: HMM with hidden phonemes pho1, ..., pho4 and observed audio aud1, ..., aud4]
pho – phoneme (letter); aud – audio signal (neural representation)
Medical Diagnosis
[Figure: belief network with diseases tumour, flu, meningitis and symptoms headache, fever, appetite, x-ray]
Combine known medical knowledge with patient specific information
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems such as the IsingModel (1920) and in AI applications such as the HMM (Baum 1966Stratonovich 1960)
The need for structure
We often want to make a probabilistic description of many objects (electronspins neurons customers etc )
Typically the representational and computational cost of probabilistic modelsgrows exponentially with the number of objects represented
Without introducing strong structural limitations about how these objects caninteract probability is a non-starter
For this reason computationally lsquosimplerrsquo alternatives (such as fuzzy logic)were introduced to try to avoid some of these difficulties ndash however these aretypically frowed on by purists
Graphical Models
We can use graphs to represent how objects can probabilistically interact witheach other
Graphical Models and then a marriage between Graph and Probability theory
Many of the quantities that we would like to compute in a probabilitydistribution can then be related to operations on the graph
The computational complexity of operations can often be related to thestructure of the graph
Graphical Models are now used as a standard framework in EngineeringStatistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty and aretherefore widely applicable
Uses in Industry
Microsoft used to estimate the skill distribution of players in online games(the worlds largest graphical model)
Hospitals use Belief Nets to encode knowledge about diseases and symptomsto aid medical diagnosis
Google Microsoft Facebook used in many places including advertisingvideo game prediction speech recognition
Used to estimate inherent desirability of products in consumer retail
Microsoft and others Attempt to go beyond simple AB testing by usesGraphical Models to model the whole companyuser relationship
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or more shortly theprobability of x given y) is defined as
p(x|y) equiv p(x y)
p(y)=p(y|x)p(x)
p(y)(Bayesrsquo rule)
Throwing darts
p(region 5|not region 20) =p(region 5 not region 20)
p(not region 20)
=p(region 5)
p(not region 20)=
120
1920=
1
19
Interpretationp(A = a|B = b) should not be interpreted as lsquoGiven the event B = b has occurredp(A = a|B = b) is the probability of the event A = a occurringrsquo The correctinterpretation should be lsquop(A = a|B = b) is the probability of A being in state aunder the constraint that B is in state brsquo
Battleships
Assume there are 2 ships 1 vertical (ship 1) and 1 horizontal (ship 2) of 5pixels each
Can be placed anywhere on the 10times10 grid but cannot overlap
Let s1 is the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query lsquohitrsquo or lsquomissrsquo responses
p(s1 s2|D) =p(D|s1 s2)p(s1 s2)
p(D)Let X be the matrix of pixel occupancy
p(X|D) =sums1s2
p(X s1 s2|D) =sums1s2
p(X|s1 s2)p(s1 s2|D)
demoBattleshipsm
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
Reverse DifferentiationConsider
f(x1 x2) = cos (sin(x1x2))
We can represent this computationally using an Abstract Syntax Tree (AST)
x1 x2
f1
f2
f3
f1(x1 x2) = x1x2
f2(x) = sin(x)
f3(x) = cos(x)
Given values for x1 x2 we first run forwards through the tree so that we canassociate each node with an actual function value
Reverse Differentiation
x1 x2
f1
f2
f3
df3dx1
=partf3partf2
df2dx1
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx1
Similarly
df3dx2
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx2
The two derivatives share the same computation branch andwe want to exploit this
Reverse Differentiation
x1 x2
f1
f2
f3
partf1partx1
= x2partf1partx2
= x1
partf2partf1
= cos(f1)
partf3partf2
= minus sin(f2)
1 Find the reverse ancestral (backwards) scheduleof nodes (f3 f2 f1 x1 x2)
2 Start with the first node n1 in the reverseschedule and define tn1 = 1
3 For the next node n in the reverse schedule findthe child nodes ch (n) Then define
tn =sum
cisinch(n)
partfcpartfn
tc
4 The total derivatives of f with respect to theroot nodes of the tree (here x1 and x2) are givenby the values of t at those nodes
This is a general procedure that can be used to automatically define a subroutineto efficiently compute the gradient It is efficient because information is collectedat nodes in the tree and split between parents only when required
Limitations of forward reasoning
World Representation
Recognising patterns (perceptron style) is only one form of intelligence
Solving chess problems is another and requires complex reasoning using someform of internal model
The world is noisy and information may be conflicting
Recognised that new approaches are required
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Limitations of forward reasoning
World Representation
Models help us to fantasise about the world
Models
Models
Burglar Problem
Creaks and Bumps
Creak Bump
Burglar Model
pos1 pos2 pos3 pos4
snd1 snd2 snd3 snd4
pos - position in kitchensnd ndash sound
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Stubby Fingers
Stubby Fingers
int1 int2 int3 int4
hit1 hit2 hit3 hit4
int - intended keyhit ndash hit key
Stubby Fingers errors
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz
005
01
015
02
025
03
035
04
045
05
055
Stubby Fingers language
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz 0
01
02
03
04
05
06
07
08
09
Stubby Fingers
Given the typed sequence cwsykcak what is the most likely word that thiscorresponds to
List the 200 most likely hidden sequences
Discard those that are not in a standard English dictionary
Take the most likely proper English word as the intended typed word
Speech Recognition raw signal
0 01 02 03 04 05 06 07 08 09minus02
minus015
minus01
minus005
0
005
01
015
02
025
03
Time
lsquoneuralrsquo representation
10 20 30 40 50 60 70 80
5
10
15
20
25
Speech Recognition
pho1 pho2 pho3 pho4
aud1 aud2 aud3 aud4
pho phoneme (letter)aud audio signal (neural representation)
Medical Diagnosis
tumour flu meningitis
headache fever appetite x-ray
Combine known medical knowledge with patient specific information
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems such as the IsingModel (1920) and in AI applications such as the HMM (Baum 1966Stratonovich 1960)
The need for structure
We often want to make a probabilistic description of many objects (electronspins neurons customers etc )
Typically the representational and computational cost of probabilistic modelsgrows exponentially with the number of objects represented
Without introducing strong structural limitations about how these objects caninteract probability is a non-starter
For this reason computationally lsquosimplerrsquo alternatives (such as fuzzy logic)were introduced to try to avoid some of these difficulties ndash however these aretypically frowed on by purists
Graphical Models
We can use graphs to represent how objects can probabilistically interact witheach other
Graphical Models and then a marriage between Graph and Probability theory
Many of the quantities that we would like to compute in a probabilitydistribution can then be related to operations on the graph
The computational complexity of operations can often be related to thestructure of the graph
Graphical Models are now used as a standard framework in EngineeringStatistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty and aretherefore widely applicable
Uses in Industry
Microsoft used to estimate the skill distribution of players in online games(the worlds largest graphical model)
Hospitals use Belief Nets to encode knowledge about diseases and symptomsto aid medical diagnosis
Google Microsoft Facebook used in many places including advertisingvideo game prediction speech recognition
Used to estimate inherent desirability of products in consumer retail
Microsoft and others Attempt to go beyond simple AB testing by usesGraphical Models to model the whole companyuser relationship
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or more shortly theprobability of x given y) is defined as
p(x|y) equiv p(x y)
p(y)=p(y|x)p(x)
p(y)(Bayesrsquo rule)
Throwing darts
p(region 5|not region 20) =p(region 5 not region 20)
p(not region 20)
=p(region 5)
p(not region 20)=
120
1920=
1
19
Interpretationp(A = a|B = b) should not be interpreted as lsquoGiven the event B = b has occurredp(A = a|B = b) is the probability of the event A = a occurringrsquo The correctinterpretation should be lsquop(A = a|B = b) is the probability of A being in state aunder the constraint that B is in state brsquo
Battleships
Assume there are 2 ships 1 vertical (ship 1) and 1 horizontal (ship 2) of 5pixels each
Can be placed anywhere on the 10times10 grid but cannot overlap
Let s1 is the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query lsquohitrsquo or lsquomissrsquo responses
p(s1 s2|D) =p(D|s1 s2)p(s1 s2)
p(D)Let X be the matrix of pixel occupancy
p(X|D) =sums1s2
p(X s1 s2|D) =sums1s2
p(X|s1 s2)p(s1 s2|D)
demoBattleshipsm
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
Reverse Differentiation
x1 x2
f1
f2
f3
df3dx1
=partf3partf2
df2dx1
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx1
Similarly
df3dx2
=partf3partf2
df2df1︸ ︷︷ ︸
df3df1
df1dx2
The two derivatives share the same computation branch andwe want to exploit this
Reverse Differentiation
x1 x2
f1
f2
f3
partf1partx1
= x2partf1partx2
= x1
partf2partf1
= cos(f1)
partf3partf2
= minus sin(f2)
1 Find the reverse ancestral (backwards) scheduleof nodes (f3 f2 f1 x1 x2)
2 Start with the first node n1 in the reverseschedule and define tn1 = 1
3 For the next node n in the reverse schedule findthe child nodes ch (n) Then define
tn =sum
cisinch(n)
partfcpartfn
tc
4 The total derivatives of f with respect to theroot nodes of the tree (here x1 and x2) are givenby the values of t at those nodes
This is a general procedure that can be used to automatically define a subroutineto efficiently compute the gradient It is efficient because information is collectedat nodes in the tree and split between parents only when required
Limitations of forward reasoning
World Representation
Recognising patterns (perceptron style) is only one form of intelligence
Solving chess problems is another and requires complex reasoning using someform of internal model
The world is noisy and information may be conflicting
Recognised that new approaches are required
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Limitations of forward reasoning
World Representation
Models help us to fantasise about the world
Models
Models
Burglar Problem
Creaks and Bumps
Creak Bump
Burglar Model
pos1 pos2 pos3 pos4
snd1 snd2 snd3 snd4
pos - position in kitchensnd ndash sound
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Stubby Fingers
Stubby Fingers
int1 int2 int3 int4
hit1 hit2 hit3 hit4
int - intended keyhit ndash hit key
Stubby Fingers errors
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz
005
01
015
02
025
03
035
04
045
05
055
Stubby Fingers language
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz 0
01
02
03
04
05
06
07
08
09
Stubby Fingers
Given the typed sequence cwsykcak what is the most likely word that thiscorresponds to
List the 200 most likely hidden sequences
Discard those that are not in a standard English dictionary
Take the most likely proper English word as the intended typed word
Speech Recognition raw signal
0 01 02 03 04 05 06 07 08 09minus02
minus015
minus01
minus005
0
005
01
015
02
025
03
Time
lsquoneuralrsquo representation
10 20 30 40 50 60 70 80
5
10
15
20
25
Speech Recognition
pho1 pho2 pho3 pho4
aud1 aud2 aud3 aud4
pho phoneme (letter)aud audio signal (neural representation)
Medical Diagnosis
tumour flu meningitis
headache fever appetite x-ray
Combine known medical knowledge with patient specific information
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems such as the IsingModel (1920) and in AI applications such as the HMM (Baum 1966Stratonovich 1960)
The need for structure
We often want to make a probabilistic description of many objects (electronspins neurons customers etc )
Typically the representational and computational cost of probabilistic modelsgrows exponentially with the number of objects represented
Without introducing strong structural limitations about how these objects caninteract probability is a non-starter
For this reason computationally lsquosimplerrsquo alternatives (such as fuzzy logic)were introduced to try to avoid some of these difficulties ndash however these aretypically frowed on by purists
Graphical Models
We can use graphs to represent how objects can probabilistically interact witheach other
Graphical Models and then a marriage between Graph and Probability theory
Many of the quantities that we would like to compute in a probabilitydistribution can then be related to operations on the graph
The computational complexity of operations can often be related to thestructure of the graph
Graphical Models are now used as a standard framework in EngineeringStatistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty and aretherefore widely applicable
Uses in Industry
Microsoft used to estimate the skill distribution of players in online games(the worlds largest graphical model)
Hospitals use Belief Nets to encode knowledge about diseases and symptomsto aid medical diagnosis
Google Microsoft Facebook used in many places including advertisingvideo game prediction speech recognition
Used to estimate inherent desirability of products in consumer retail
Microsoft and others Attempt to go beyond simple AB testing by usesGraphical Models to model the whole companyuser relationship
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or more shortly theprobability of x given y) is defined as
p(x|y) equiv p(x y)
p(y)=p(y|x)p(x)
p(y)(Bayesrsquo rule)
Throwing darts
p(region 5|not region 20) =p(region 5 not region 20)
p(not region 20)
=p(region 5)
p(not region 20)=
120
1920=
1
19
Interpretationp(A = a|B = b) should not be interpreted as lsquoGiven the event B = b has occurredp(A = a|B = b) is the probability of the event A = a occurringrsquo The correctinterpretation should be lsquop(A = a|B = b) is the probability of A being in state aunder the constraint that B is in state brsquo
Battleships
Assume there are 2 ships 1 vertical (ship 1) and 1 horizontal (ship 2) of 5pixels each
Can be placed anywhere on the 10times10 grid but cannot overlap
Let s1 is the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query lsquohitrsquo or lsquomissrsquo responses
p(s1 s2|D) =p(D|s1 s2)p(s1 s2)
p(D)Let X be the matrix of pixel occupancy
p(X|D) =sums1s2
p(X s1 s2|D) =sums1s2
p(X|s1 s2)p(s1 s2|D)
demoBattleshipsm
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational Inference
Consider a distribution

p(v|θ) = ∫_h p(v|h, θ) p(h)

and suppose that we wish to learn θ to maximise the probability that this model generates the observed data:

log p(v|θ) ≥ −∫_h q(h|v, φ) log q(h|v, φ) + ∫_h q(h|v, φ) log [p(v|h, θ) p(h)]

The idea is to choose a 'variational' distribution q(h|v, φ) such that we can either calculate the bound analytically or sample it efficiently.
We then jointly maximise the bound with respect to φ and θ.
We can parameterise p(v|h, θ) using a deep network.
Very popular approach – see the 'variational autoencoder' and also attention mechanisms.
Extension to semi-supervised learning using p(v) = ∫_h Σ_c p(v|h, c) p(c) p(h).
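A minimal numpy sketch of a single-sample estimate of this bound, in the variational-autoencoder style: q(h|v, φ) is Gaussian and the 'deep network' decoder is reduced to one linear layer for brevity. All parameter names are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def elbo_estimate(v, phi_mu, phi_logvar, theta_W, theta_b):
    """One-sample Monte Carlo estimate of the bound for a single binary
    observation v, with Gaussian q(h|v, phi) and a Bernoulli decoder."""
    eps = rng.standard_normal(phi_mu.shape)
    h = phi_mu + np.exp(0.5 * phi_logvar) * eps            # h ~ q(h|v, phi)
    p = 1.0 / (1.0 + np.exp(-(theta_W @ h + theta_b)))     # decoder 'network'
    log_p_v_given_h = np.sum(v * np.log(p) + (1 - v) * np.log(1 - p))
    log_p_h = -0.5 * np.sum(h**2 + np.log(2 * np.pi))      # N(0, I) prior
    log_q_h = -0.5 * np.sum((h - phi_mu)**2 / np.exp(phi_logvar)
                            + phi_logvar + np.log(2 * np.pi))
    return log_p_v_given_h + log_p_h - log_q_h             # <= log p(v|theta)

# example shapes: 2-dim latent, 4 binary pixels
v = np.array([1.0, 0.0, 0.0, 1.0])
print(elbo_estimate(v, np.zeros(2), np.zeros(2),
                    rng.standard_normal((4, 2)), np.zeros(4)))
```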
DRAW (a recurrent variational autoencoder with attention that generates images step by step)
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games?
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A, we need to decide which action to take in each state of W that will be best for our long-term goals.
The problem is that the number of pixel states is enormous.
We need to learn a low-dimensional representation of the screen (using a deep generative model).
Then learn which action to take given the low-dimensional representation, as in the sketch below.
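The full Atari pipeline is beyond a slide, but the 'learn which action to take' step can be illustrated with tabular Q-learning on an invented toy chain world (the states standing in for a learned low-dimensional representation):

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 5, 2            # toy chain; action 0 = left, 1 = right
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.1, 0.9, 0.1     # learning rate, discount, exploration

def step(s, a):
    s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    return s2, (1.0 if s2 == n_states - 1 else 0.0)   # reward at the right end

for episode in range(500):
    s = 0
    for t in range(20):
        # epsilon-greedy action choice
        a = int(rng.integers(n_actions)) if rng.random() < eps else int(Q[s].argmax())
        s2, r = step(s, a)
        Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])   # TD update
        s = s2

print(Q.argmax(axis=1))   # learned action per state: should prefer 'right'
```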
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state-of-the-art results in Speech Recognition, Image Analysis, Game Playing.
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representation learning.
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
https://reinfer.io
Reverse Differentiation
(Figure: computation graph with inputs x_1, x_2 and nodes f_1 = x_1 x_2, f_2 = sin(f_1), f_3 = cos(f_2), as implied by the derivatives below.)

∂f_1/∂x_1 = x_2,  ∂f_1/∂x_2 = x_1
∂f_2/∂f_1 = cos(f_1)
∂f_3/∂f_2 = −sin(f_2)

1. Find the reverse ancestral (backwards) schedule of nodes (f_3, f_2, f_1, x_1, x_2).
2. Start with the first node n_1 in the reverse schedule and define t_{n_1} = 1.
3. For the next node n in the reverse schedule, find the child nodes ch(n). Then define

t_n = Σ_{c ∈ ch(n)} (∂f_c/∂f_n) t_c

4. The total derivatives of f with respect to the root nodes of the tree (here x_1 and x_2) are given by the values of t at those nodes.
This is a general procedure that can be used to automatically define a subroutine to efficiently compute the gradient. It is efficient because information is collected at nodes in the tree and split between parents only when required.
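A sketch of this procedure in Python, applied to the graph above (f_1 = x_1 x_2, f_2 = sin f_1, f_3 = cos f_2, as implied by the listed derivatives), with a finite-difference check:

```python
import numpy as np

def grad_f3(x1, x2):
    # forward pass: evaluate every node
    f1 = x1 * x2
    f2 = np.sin(f1)
    f3 = np.cos(f2)
    # reverse pass: t_n = sum over children c of (df_c/df_n) * t_c
    t_f3 = 1.0                       # first node of the reverse schedule
    t_f2 = -np.sin(f2) * t_f3        # df3/df2 = -sin(f2)
    t_f1 = np.cos(f1) * t_f2         # df2/df1 = cos(f1)
    t_x1 = x2 * t_f1                 # df1/dx1 = x2
    t_x2 = x1 * t_f1                 # df1/dx2 = x1
    return f3, (t_x1, t_x2)          # value and total derivatives at the roots

# check the reverse-mode gradient against a finite difference
v, (g1, g2) = grad_f3(0.7, -1.3)
h = 1e-6
num = (grad_f3(0.7 + h, -1.3)[0] - v) / h
print(g1, num)                       # should agree to ~1e-5
```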
Limitations of forward reasoning
World Representation
Recognising patterns (perceptron style) is only one form of intelligence
Solving chess problems is another, and requires complex reasoning using some form of internal model.
The world is noisy and information may be conflicting
Recognised that new approaches are required
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Limitations of forward reasoning
World Representation
Models help us to fantasise about the world
Models
Burglar Problem
Creaks and Bumps
(Figure panels: 'creak' and 'bump' sound examples.)
Burglar Model
(Figure: HMM with hidden positions pos_1:4 and observed sounds snd_1:4.)
pos – position in kitchen; snd – sound
Finding the Burglar
(Figure: a sequence of kitchen-grid snapshots showing the filtered distribution over the burglar's position as 'creak' and 'bump' observations arrive over time.)
Stubby Fingers
(Figure: HMM with hidden intended keys int_1:4 and observed hit keys hit_1:4.)
int – intended key; hit – hit key
Stubby Fingers: errors
(Figure: the error model p(hit|int) as a matrix over the keys a–z.)
Stubby Fingers: language
(Figure: the language model p(int_t|int_{t-1}) as a matrix over the keys a–z.)
Stubby Fingers
Given the typed sequence 'cwsykcak', what is the most likely word that this corresponds to? A simple recipe (see the sketch below):
List the 200 most likely hidden sequences.
Discard those that are not in a standard English dictionary.
Take the most likely proper English word as the intended typed word.
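The recipe above uses an N-best list; as a simpler sketch of the same machinery, here is plain Viterbi decoding (the single most likely intended sequence) for a discrete HMM. The tables pi, A and B stand for the language and error models and are assumed given.

```python
import numpy as np

def viterbi(pi, A, B, obs):
    """Most likely hidden sequence argmax_{h_{1:T}} p(h_{1:T} | v_{1:T}).

    pi : (H,)   p(int_1)
    A  : (H, H) A[i, j] = p(int_t = i | int_{t-1} = j)   (language model)
    B  : (V, H) B[v, i] = p(hit = v | int = i)           (error model)
    Assumes strictly positive entries so the logs are finite.
    """
    T, H = len(obs), len(pi)
    logd = np.log(pi) + np.log(B[obs[0]])
    back = np.zeros((T, H), dtype=int)
    for t in range(1, T):
        scores = np.log(A) + logd[None, :]     # scores[i, j]: arrive at i from j
        back[t] = scores.argmax(axis=1)
        logd = np.log(B[obs[t]]) + scores.max(axis=1)
    path = [int(logd.argmax())]
    for t in range(T - 1, 0, -1):              # backtrack the best predecessors
        path.append(int(back[t][path[-1]]))
    return path[::-1]
```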
Speech Recognition: raw signal
(Figure: raw audio waveform, amplitude against time in seconds.)
'neural' representation
(Figure: the corresponding 'neural' feature representation of the signal.)
Speech Recognition
(Figure: HMM with hidden phonemes pho_1:4 and observed audio features aud_1:4.)
pho – phoneme (letter); aud – audio signal (neural representation)
Medical Diagnosis
(Figure: belief network in which the diseases tumour, flu and meningitis are parents of the observations headache, fever, appetite and x-ray.)
Combine known medical knowledge with patient-specific information.
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems, such as the Ising Model (1920), and in AI applications, such as the HMM (Baum 1966; Stratonovich 1960).
The need for structure
We often want to make a probabilistic description of many objects (electron spins, neurons, customers, etc.).
Typically, the representational and computational cost of probabilistic models grows exponentially with the number of objects represented.
Without introducing strong structural limitations on how these objects can interact, probability is a non-starter.
For this reason, computationally 'simpler' alternatives (such as fuzzy logic) were introduced to try to avoid some of these difficulties – however, these are typically frowned upon by purists.
Graphical Models
We can use graphs to represent how objects can probabilistically interact with each other.
Graphical Models are then a marriage between graph theory and probability theory.
Many of the quantities that we would like to compute in a probability distribution can then be related to operations on the graph.
The computational complexity of these operations can often be related to the structure of the graph.
Graphical Models are now used as a standard framework in Engineering, Statistics and Computer Science.
Graphical Models are used to perform reasoning under uncertainty and are therefore widely applicable.
Uses in Industry
Microsoft: used to estimate the skill distribution of players in online games (the world's largest graphical model).
Hospitals: use Belief Nets to encode knowledge about diseases and symptoms to aid medical diagnosis.
Google, Microsoft, Facebook: used in many places, including advertising, video game prediction, speech recognition.
Used to estimate the inherent desirability of products in consumer retail.
Microsoft and others: attempts to go beyond simple A/B testing by using Graphical Models to model the whole company/user relationship.
Conditional Probability and Bayes' Rule
The probability of event x conditioned on knowing event y (or, more shortly, the probability of x given y) is defined as

p(x|y) ≡ p(x, y)/p(y) = p(y|x) p(x)/p(y)   (Bayes' rule)
Throwing darts

p(region 5 | not region 20) = p(region 5, not region 20)/p(not region 20) = p(region 5)/p(not region 20) = (1/20)/(19/20) = 1/19
Interpretation
p(A = a|B = b) should not be interpreted as 'given the event B = b has occurred, p(A = a|B = b) is the probability of the event A = a occurring'. The correct interpretation is 'p(A = a|B = b) is the probability of A being in state a under the constraint that B is in state b'.
Battleships
Assume there are 2 ships, 1 vertical (ship 1) and 1 horizontal (ship 2), of 5 pixels each.
They can be placed anywhere on the 10×10 grid, but cannot overlap.
Let s1 be the origin of ship 1 and s2 the origin of ship 2.
The data D is a collection of query 'hit' or 'miss' responses.

p(s1, s2|D) = p(D|s1, s2) p(s1, s2)/p(D)

Let X be the matrix of pixel occupancy:

p(X|D) = Σ_{s1,s2} p(X, s1, s2|D) = Σ_{s1,s2} p(X|s1, s2) p(s1, s2|D)

demoBattleships.m
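A small numpy sketch of this posterior, assuming a uniform prior over non-overlapping placements and noise-free hit/miss answers (so p(D|s1, s2) is 0 or 1); this is roughly what a demo like demoBattleships.m computes:

```python
import numpy as np

G, L = 10, 5                                   # grid size, ship length

def board(s1, s2):
    X = np.zeros((G, G), dtype=bool)
    X[s1[0]:s1[0] + L, s1[1]] = True           # ship 1: vertical from s1
    X[s2[0], s2[1]:s2[1] + L] = True           # ship 2: horizontal from s2
    return X

placements = [(s1, s2)
              for s1 in [(r, c) for r in range(G - L + 1) for c in range(G)]
              for s2 in [(r, c) for r in range(G) for c in range(G - L + 1)]
              if board(s1, s2).sum() == 2 * L]  # keep non-overlapping pairs

def posterior_occupancy(D):
    """p(X_ij occupied | D) by summing over placements consistent with D."""
    post, n = np.zeros((G, G)), 0
    for s1, s2 in placements:
        X = board(s1, s2)
        if all(X[q] == hit for q, hit in D):   # p(D | s1, s2) is 0 or 1
            post += X
            n += 1
    return post / n

# two example queries: a hit at (4, 4) and a miss at (0, 0)
print(posterior_occupancy([((4, 4), True), ((0, 0), False)]).round(2))
```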
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated the conditional probability of the node given its parents.
The joint distribution is obtained by taking the product of the conditional probabilities:
p(A, B, C, D, E) = p(A) p(B) p(C|A, B) p(D|C) p(E|B, C)

(Figure: the corresponding DAG, with A, B → C, C → D and B, C → E.)
Example – Part I
Sally's burglar Alarm is sounding. Has she been Burgled, or was the alarm triggered by an Earthquake? She turns the car Radio on for news of earthquakes.
Choosing an ordering
Without loss of generality, we can write

p(A, R, E, B) = p(A|R, E, B) p(R, E, B)
             = p(A|R, E, B) p(R|E, B) p(E, B)
             = p(A|R, E, B) p(R|E, B) p(E|B) p(B)
Assumptions
The alarm is not directly influenced by any report on the radio: p(A|R, E, B) = p(A|E, B).
The radio broadcast is not directly influenced by the burglar variable: p(R|E, B) = p(R|E).
Burglaries don't directly 'cause' earthquakes: p(E|B) = p(E).
Therefore

p(A, R, E, B) = p(A|E, B) p(R|E) p(E) p(B)
Example – Part II: Specifying the Tables

(Figure: the DAG B → A ← E, E → R.)

p(A = 1|B, E):
Burglar  Earthquake  Alarm = 1
1        1           0.9999
1        0           0.99
0        1           0.99
0        0           0.0001

p(R = 1|E):
Earthquake  Radio = 1
1           1
0           0

The remaining tables are p(B = 1) = 0.01 and p(E = 1) = 0.000001. The tables and graphical structure fully specify the distribution.
Example – Part III: Inference

Initial evidence: the alarm is sounding.

p(B = 1|A = 1) = Σ_{E,R} p(B = 1, E, A = 1, R) / Σ_{B,E,R} p(B, E, A = 1, R)
             = Σ_{E,R} p(A = 1|B = 1, E) p(B = 1) p(E) p(R|E) / Σ_{B,E,R} p(A = 1|B, E) p(B) p(E) p(R|E) ≈ 0.99

Additional evidence: the radio broadcasts an earthquake warning.
A similar calculation gives p(B = 1|A = 1, R = 1) ≈ 0.01.
Initially, because the alarm sounds, Sally thinks that she's been burgled. However, this probability drops dramatically when she hears that there has been an earthquake.
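The calculation is easy to verify by brute-force enumeration over the eight joint states; a short Python check using the tables above:

```python
import itertools

p_B = {1: 0.01, 0: 0.99}
p_E = {1: 0.000001, 0: 0.999999}
p_R_given_E = {1: {1: 1.0, 0: 0.0}, 0: {1: 0.0, 0: 1.0}}    # p(R | E)
p_A1_given_BE = {(1, 1): 0.9999, (1, 0): 0.99,
                 (0, 1): 0.99, (0, 0): 0.0001}              # p(A = 1 | B, E)

def joint(a, r, e, b):
    pa = p_A1_given_BE[(b, e)] if a == 1 else 1 - p_A1_given_BE[(b, e)]
    return pa * p_R_given_E[e][r] * p_E[e] * p_B[b]

def p_burglar(r_evidence=None):
    num = den = 0.0
    for b, e, r in itertools.product([0, 1], repeat=3):
        if r_evidence is not None and r != r_evidence:
            continue                                         # condition on R
        p = joint(1, r, e, b)                                # A = 1 observed
        den += p
        num += p if b == 1 else 0.0
    return num / den

print(p_burglar())               # p(B=1 | A=1)        ~ 0.99
print(p_burglar(r_evidence=1))   # p(B=1 | A=1, R=1)   ~ 0.01
```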
Markov Models
For timeseries data v_1, ..., v_T we need a model p(v_{1:T}). For causal consistency, it is meaningful to consider the decomposition

p(v_{1:T}) = ∏_{t=1}^{T} p(v_t|v_{1:t-1})

with the convention p(v_t|v_{1:t-1}) = p(v_1) for t = 1.

(Figure: belief network for the full decomposition, in which each v_t depends on all earlier variables.)

Independence assumptions
It is often natural to assume that the influence of the immediate past is more relevant than the remote past, and in Markov models only a limited number of previous observations are required to predict the future.
Markov Chain
Only the recent past is relevant:

p(v_t|v_1, ..., v_{t-1}) = p(v_t|v_{t-L}, ..., v_{t-1})

where L ≥ 1 is the order of the Markov chain. For a first-order chain,

p(v_{1:T}) = p(v_1) p(v_2|v_1) p(v_3|v_2) ... p(v_T|v_{T-1})

For a stationary Markov chain the transitions p(v_t = s'|v_{t-1} = s) = f(s', s) are time-independent ('homogeneous').

(Figure: (a) first-order Markov chain; (b) second-order Markov chain.)
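Under the stationarity assumption, the transition table can be estimated by simple counting; a quick sketch (the toy data are invented for illustration):

```python
import numpy as np

def fit_first_order(seq, n_states):
    """ML estimate of a stationary first-order transition matrix:
    count consecutive pairs, then normalise each column."""
    C = np.zeros((n_states, n_states))
    for j, i in zip(seq[:-1], seq[1:]):
        C[i, j] += 1                                       # transition j -> i
    col = np.maximum(C.sum(axis=0, keepdims=True), 1)      # avoid 0/0
    return C / col                  # M[i, j] = p(v_t = i | v_{t-1} = j)

seq = [0, 1, 2, 1, 0, 1, 2, 2, 1, 0]
print(fit_first_order(seq, 3))
```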
Markov Chains
(Figure: first-order chain v_1 → v_2 → v_3 → v_4.)

p(v_1, ..., v_T) = p(v_1) ∏_{t=2}^{T} p(v_t|v_{t-1})   (initial distribution × transitions)

State transition diagram
Nodes represent states of the variable v, and arcs non-zero elements of the transition p(v_t|v_{t-1}).

(Figure: a state-transition diagram on the states 1–9.)
Most probable and shortest paths
(Figure: the same state-transition diagram on states 1–9.)

The shortest (unweighted) path from state 1 to state 7 is 1–2–7.
The most probable path from state 1 to state 7 is 1–8–9–7 (assuming uniform transition probabilities). The latter path is longer but more probable, since for the path 1–2–7 the probability of exiting state 2 into state 7 is only 1/5.
Equilibrium distribution
It is interesting to know how the marginal p(x_t) evolves through time:

p(x_t = i) = Σ_j p(x_t = i|x_{t-1} = j) p(x_{t-1} = j),  with M_ij ≡ p(x_t = i|x_{t-1} = j)

p(x_t = i) is the frequency with which we visit state i at time t, given we started from p(x_1) and randomly drew samples from the transition p(x_τ|x_{τ-1}). As we repeatedly sample a new state from the chain, the distribution at time t for an initial distribution p_1(i) is

p_t = M^{t-1} p_1

If, for t → ∞, p_∞ is independent of the initial distribution p_1, then p_∞ is called the equilibrium distribution of the chain:

p_∞ = M p_∞

The equilibrium distribution is proportional to the eigenvector with unit eigenvalue of the transition matrix.
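A quick power-iteration sketch: repeatedly applying M to any initial distribution converges to p∞ when the equilibrium exists, and the fixed point satisfies p∞ = M p∞. The example matrix is invented.

```python
import numpy as np

def equilibrium(M, tol=1e-12, max_iter=10000):
    p = np.full(M.shape[0], 1.0 / M.shape[0])   # any initial distribution p_1
    for _ in range(max_iter):
        p_new = M @ p                           # p_t = M p_{t-1}
        if np.abs(p_new - p).max() < tol:
            break
        p = p_new
    return p

M = np.array([[0.9, 0.5],
              [0.1, 0.5]])     # columns sum to 1: M[i, j] = p(i | j)
p = equilibrium(M)
print(p, M @ p)                # at equilibrium, p and Mp agree (~[0.833, 0.167])
```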
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
Limitations of forward reasoning
World Representation
Recognising patterns (perceptron style) is only one form of intelligence
Solving chess problems is another and requires complex reasoning using someform of internal model
The world is noisy and information may be conflicting
Recognised that new approaches are required
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Limitations of forward reasoning
World Representation
Models help us to fantasise about the world
Models
Models
Burglar Problem
Creaks and Bumps
Creak Bump
Burglar Model
pos1 pos2 pos3 pos4
snd1 snd2 snd3 snd4
pos - position in kitchensnd ndash sound
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Stubby Fingers
Stubby Fingers
int1 int2 int3 int4
hit1 hit2 hit3 hit4
int - intended keyhit ndash hit key
Stubby Fingers errors
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz
005
01
015
02
025
03
035
04
045
05
055
Stubby Fingers language
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz 0
01
02
03
04
05
06
07
08
09
Stubby Fingers
Given the typed sequence cwsykcak what is the most likely word that thiscorresponds to
List the 200 most likely hidden sequences
Discard those that are not in a standard English dictionary
Take the most likely proper English word as the intended typed word
Speech Recognition raw signal
0 01 02 03 04 05 06 07 08 09minus02
minus015
minus01
minus005
0
005
01
015
02
025
03
Time
lsquoneuralrsquo representation
10 20 30 40 50 60 70 80
5
10
15
20
25
Speech Recognition
pho1 pho2 pho3 pho4
aud1 aud2 aud3 aud4
pho phoneme (letter)aud audio signal (neural representation)
Medical Diagnosis
tumour flu meningitis
headache fever appetite x-ray
Combine known medical knowledge with patient specific information
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems such as the IsingModel (1920) and in AI applications such as the HMM (Baum 1966Stratonovich 1960)
The need for structure
We often want to make a probabilistic description of many objects (electronspins neurons customers etc )
Typically the representational and computational cost of probabilistic modelsgrows exponentially with the number of objects represented
Without introducing strong structural limitations about how these objects caninteract probability is a non-starter
For this reason computationally lsquosimplerrsquo alternatives (such as fuzzy logic)were introduced to try to avoid some of these difficulties ndash however these aretypically frowed on by purists
Graphical Models
We can use graphs to represent how objects can probabilistically interact witheach other
Graphical Models and then a marriage between Graph and Probability theory
Many of the quantities that we would like to compute in a probabilitydistribution can then be related to operations on the graph
The computational complexity of operations can often be related to thestructure of the graph
Graphical Models are now used as a standard framework in EngineeringStatistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty and aretherefore widely applicable
Uses in Industry
Microsoft used to estimate the skill distribution of players in online games(the worlds largest graphical model)
Hospitals use Belief Nets to encode knowledge about diseases and symptomsto aid medical diagnosis
Google Microsoft Facebook used in many places including advertisingvideo game prediction speech recognition
Used to estimate inherent desirability of products in consumer retail
Microsoft and others Attempt to go beyond simple AB testing by usesGraphical Models to model the whole companyuser relationship
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or more shortly theprobability of x given y) is defined as
p(x|y) equiv p(x y)
p(y)=p(y|x)p(x)
p(y)(Bayesrsquo rule)
Throwing darts
p(region 5|not region 20) =p(region 5 not region 20)
p(not region 20)
=p(region 5)
p(not region 20)=
120
1920=
1
19
Interpretationp(A = a|B = b) should not be interpreted as lsquoGiven the event B = b has occurredp(A = a|B = b) is the probability of the event A = a occurringrsquo The correctinterpretation should be lsquop(A = a|B = b) is the probability of A being in state aunder the constraint that B is in state brsquo
Battleships
Assume there are 2 ships 1 vertical (ship 1) and 1 horizontal (ship 2) of 5pixels each
Can be placed anywhere on the 10times10 grid but cannot overlap
Let s1 is the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query lsquohitrsquo or lsquomissrsquo responses
p(s1 s2|D) =p(D|s1 s2)p(s1 s2)
p(D)Let X be the matrix of pixel occupancy
p(X|D) =sums1s2
p(X s1 s2|D) =sums1s2
p(X|s1 s2)p(s1 s2|D)
demoBattleshipsm
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Limitations of forward reasoning
World Representation
Models help us to fantasise about the world
Models
Models
Burglar Problem
Creaks and Bumps
Creak Bump
Burglar Model
pos1 pos2 pos3 pos4
snd1 snd2 snd3 snd4
pos - position in kitchensnd ndash sound
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Stubby Fingers
Stubby Fingers
int1 int2 int3 int4
hit1 hit2 hit3 hit4
int - intended keyhit ndash hit key
Stubby Fingers errors
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz
005
01
015
02
025
03
035
04
045
05
055
Stubby Fingers language
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz 0
01
02
03
04
05
06
07
08
09
Stubby Fingers
Given the typed sequence cwsykcak what is the most likely word that thiscorresponds to
List the 200 most likely hidden sequences
Discard those that are not in a standard English dictionary
Take the most likely proper English word as the intended typed word
Speech Recognition raw signal
0 01 02 03 04 05 06 07 08 09minus02
minus015
minus01
minus005
0
005
01
015
02
025
03
Time
lsquoneuralrsquo representation
10 20 30 40 50 60 70 80
5
10
15
20
25
Speech Recognition
pho1 pho2 pho3 pho4
aud1 aud2 aud3 aud4
pho phoneme (letter)aud audio signal (neural representation)
Medical Diagnosis
tumour flu meningitis
headache fever appetite x-ray
Combine known medical knowledge with patient specific information
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems such as the IsingModel (1920) and in AI applications such as the HMM (Baum 1966Stratonovich 1960)
The need for structure
We often want to make a probabilistic description of many objects (electronspins neurons customers etc )
Typically the representational and computational cost of probabilistic modelsgrows exponentially with the number of objects represented
Without introducing strong structural limitations about how these objects caninteract probability is a non-starter
For this reason computationally lsquosimplerrsquo alternatives (such as fuzzy logic)were introduced to try to avoid some of these difficulties ndash however these aretypically frowed on by purists
Graphical Models
We can use graphs to represent how objects can probabilistically interact witheach other
Graphical Models and then a marriage between Graph and Probability theory
Many of the quantities that we would like to compute in a probabilitydistribution can then be related to operations on the graph
The computational complexity of operations can often be related to thestructure of the graph
Graphical Models are now used as a standard framework in EngineeringStatistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty and aretherefore widely applicable
Uses in Industry
Microsoft used to estimate the skill distribution of players in online games(the worlds largest graphical model)
Hospitals use Belief Nets to encode knowledge about diseases and symptomsto aid medical diagnosis
Google Microsoft Facebook used in many places including advertisingvideo game prediction speech recognition
Used to estimate inherent desirability of products in consumer retail
Microsoft and others Attempt to go beyond simple AB testing by usesGraphical Models to model the whole companyuser relationship
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or more shortly theprobability of x given y) is defined as
p(x|y) equiv p(x y)
p(y)=p(y|x)p(x)
p(y)(Bayesrsquo rule)
Throwing darts
p(region 5|not region 20) =p(region 5 not region 20)
p(not region 20)
=p(region 5)
p(not region 20)=
120
1920=
1
19
Interpretationp(A = a|B = b) should not be interpreted as lsquoGiven the event B = b has occurredp(A = a|B = b) is the probability of the event A = a occurringrsquo The correctinterpretation should be lsquop(A = a|B = b) is the probability of A being in state aunder the constraint that B is in state brsquo
Battleships
Assume there are 2 ships 1 vertical (ship 1) and 1 horizontal (ship 2) of 5pixels each
Can be placed anywhere on the 10times10 grid but cannot overlap
Let s1 is the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query lsquohitrsquo or lsquomissrsquo responses
p(s1 s2|D) =p(D|s1 s2)p(s1 s2)
p(D)Let X be the matrix of pixel occupancy
p(X|D) =sums1s2
p(X s1 s2|D) =sums1s2
p(X|s1 s2)p(s1 s2|D)
demoBattleshipsm
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
Limitations of forward reasoning
World Representation
Models help us to fantasise about the world
Models
Models
Burglar Problem
Creaks and Bumps
Creak Bump
Burglar Model
pos1 pos2 pos3 pos4
snd1 snd2 snd3 snd4
pos - position in kitchensnd ndash sound
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Stubby Fingers
Stubby Fingers
int1 int2 int3 int4
hit1 hit2 hit3 hit4
int - intended keyhit ndash hit key
Stubby Fingers errors
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz
005
01
015
02
025
03
035
04
045
05
055
Stubby Fingers language
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz 0
01
02
03
04
05
06
07
08
09
Stubby Fingers
Given the typed sequence cwsykcak what is the most likely word that thiscorresponds to
List the 200 most likely hidden sequences
Discard those that are not in a standard English dictionary
Take the most likely proper English word as the intended typed word
Speech Recognition raw signal
0 01 02 03 04 05 06 07 08 09minus02
minus015
minus01
minus005
0
005
01
015
02
025
03
Time
lsquoneuralrsquo representation
10 20 30 40 50 60 70 80
5
10
15
20
25
Speech Recognition
pho1 pho2 pho3 pho4
aud1 aud2 aud3 aud4
pho phoneme (letter)aud audio signal (neural representation)
Medical Diagnosis
tumour flu meningitis
headache fever appetite x-ray
Combine known medical knowledge with patient specific information
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems such as the IsingModel (1920) and in AI applications such as the HMM (Baum 1966Stratonovich 1960)
The need for structure
We often want to make a probabilistic description of many objects (electronspins neurons customers etc )
Typically the representational and computational cost of probabilistic modelsgrows exponentially with the number of objects represented
Without introducing strong structural limitations about how these objects caninteract probability is a non-starter
For this reason computationally lsquosimplerrsquo alternatives (such as fuzzy logic)were introduced to try to avoid some of these difficulties ndash however these aretypically frowed on by purists
Graphical Models
We can use graphs to represent how objects can probabilistically interact witheach other
Graphical Models and then a marriage between Graph and Probability theory
Many of the quantities that we would like to compute in a probabilitydistribution can then be related to operations on the graph
The computational complexity of operations can often be related to thestructure of the graph
Graphical Models are now used as a standard framework in EngineeringStatistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty and aretherefore widely applicable
Uses in Industry
Microsoft used to estimate the skill distribution of players in online games(the worlds largest graphical model)
Hospitals use Belief Nets to encode knowledge about diseases and symptomsto aid medical diagnosis
Google Microsoft Facebook used in many places including advertisingvideo game prediction speech recognition
Used to estimate inherent desirability of products in consumer retail
Microsoft and others Attempt to go beyond simple AB testing by usesGraphical Models to model the whole companyuser relationship
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or more shortly theprobability of x given y) is defined as
p(x|y) equiv p(x y)
p(y)=p(y|x)p(x)
p(y)(Bayesrsquo rule)
Throwing darts
p(region 5|not region 20) =p(region 5 not region 20)
p(not region 20)
=p(region 5)
p(not region 20)=
120
1920=
1
19
Interpretationp(A = a|B = b) should not be interpreted as lsquoGiven the event B = b has occurredp(A = a|B = b) is the probability of the event A = a occurringrsquo The correctinterpretation should be lsquop(A = a|B = b) is the probability of A being in state aunder the constraint that B is in state brsquo
Battleships
Assume there are 2 ships 1 vertical (ship 1) and 1 horizontal (ship 2) of 5pixels each
Can be placed anywhere on the 10times10 grid but cannot overlap
Let s1 is the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query lsquohitrsquo or lsquomissrsquo responses
p(s1 s2|D) =p(D|s1 s2)p(s1 s2)
p(D)Let X be the matrix of pixel occupancy
p(X|D) =sums1s2
p(X s1 s2|D) =sums1s2
p(X|s1 s2)p(s1 s2|D)
demoBattleshipsm
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website, the equilibrium distribution component p∞(i) is the relative number of times we will visit website i. This has a natural interpretation as the 'importance' of website i.

For each website i a list of words associated with that website is collected. After doing this for all websites, one can make an 'inverse' list of which websites contain word w. When a user searches for word w, the list of websites that contain word w is then returned, ranked according to the importance of the site.
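A toy, self-contained sketch of this construction (the 3-site link matrix is made up for illustration; real PageRank also adds damping to handle dangling pages, omitted here):

import numpy as np

A = np.array([[0, 1, 1],
              [1, 0, 0],
              [1, 1, 0]], dtype=float)   # A[i, j] = 1 if site j links to site i
M = A / A.sum(axis=0)                    # column-normalise: M[i, j] = p(next = i | current = j)

p = np.ones(3) / 3
for _ in range(100):                     # power iteration towards the equilibrium
    p = M @ p
print(p / p.sum())                       # relative 'importance' of each site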
Hidden Markov Models
The HMM defines a Markov chain on hidden (or 'latent') variables h1:T. The observed (or 'visible') variables are dependent on the hidden variables through an emission p(vt|ht). This defines a joint distribution
p(h1:T, v1:T) = p(v1|h1) p(h1) ∏_{t=2}^{T} p(vt|ht) p(ht|ht−1)
For a stationary HMM the transition p(ht|ht−1) and emission p(vt|ht) distributions are constant through time.
Figure: a first-order hidden Markov model with 'hidden' variables dom(ht) = {1, . . . , H}, t = 1, . . . , T. The 'visible' variables vt can be either discrete or continuous.
The classical inference problems
Filtering (inferring the present): p(ht|v1:t)
Prediction (inferring the future): p(ht|v1:s), t > s
Smoothing (inferring the past): p(ht|v1:u), t < u
Likelihood: p(v1:T)
Most likely path (Viterbi alignment): argmax_{h1:T} p(h1:T|v1:T)

For prediction one is also often interested in p(vt|v1:s) for t > s.
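As an illustration, filtering follows a simple forward recursion: αt ∝ p(vt|ht) Σ_{h_{t−1}} p(ht|h_{t−1}) α_{t−1}. A minimal sketch for a discrete HMM (the array conventions and toy numbers are my own, not the slides'):

import numpy as np

def filtering(trans, emis, obs, p1):
    # trans[i, j] = p(h_t = i | h_{t-1} = j); emis[v, h] = p(v_t = v | h_t = h); p1[h] = p(h_1)
    alpha = emis[obs[0]] * p1
    alpha /= alpha.sum()
    out = [alpha]
    for v in obs[1:]:
        alpha = emis[v] * (trans @ alpha)
        alpha /= alpha.sum()               # normalised: p(h_t | v_{1:t})
        out.append(alpha)
    return np.array(out)

trans = np.array([[0.8, 0.3],
                  [0.2, 0.7]])
emis = np.array([[0.9, 0.1],
                 [0.1, 0.9]])
print(filtering(trans, emis, [0, 0, 1], np.array([0.5, 0.5])))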
Inference in Hidden Markov Models
Belief network representation of an HMM: [Figure: chain h1 → h2 → h3 → h4 with emissions ht → vt]
Filtering, Smoothing and Viterbi are all computationally efficient, scaling linearly with the length of the timeseries (but quadratically with the number of hidden states).

The algorithms are variants of 'message passing on factor graphs'.

The algorithms are guaranteed to work if the graph is singly-connected.

There has been a huge research effort in the last 15 years to apply message passing for approximate inference in multiply-connected graphs (e.g. low-density parity-check codes).
HMMs for speech recognition
ht is the phoneme at time t; p(ht|ht−1) – language model; p(vt|ht) – speech signal model.
Deep Nets and HMMs
[Figure: HMM h1 → h2 → h3 → h4 with emissions ht → vt]
Recently companies including Google have made big advances in speech recognition.
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is some function of the phoneme, μ(ht; θ).
This function is a deep neural network trained on a large amount of data
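A toy sketch of this emission model, with one hidden layer standing in for 'deep' and made-up sizes (real systems are far larger):

import numpy as np

def neural_mean(h_onehot, W1, W2):
    # mu(h; theta): a small neural network applied to a one-hot phoneme encoding
    return W2 @ np.tanh(W1 @ h_onehot)

def log_emission(v, h_onehot, W1, W2, sigma2=1.0):
    # log N(v; mu(h), sigma2 * I)
    mu = neural_mean(h_onehot, W1, W2)
    return -0.5 * np.sum((v - mu) ** 2) / sigma2 - 0.5 * v.size * np.log(2 * np.pi * sigma2)

rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(8, 40)), rng.normal(size=(25, 8))  # 40 phonemes, 25-dim audio features
print(log_emission(rng.normal(size=25), np.eye(40)[3], W1, W2))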
There is a goldrush at the moment to find similar breakthrough applications of deep networks in reasoning systems.
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
[Figure: latent variables h1, h2 with edges to v1, . . . , v4]
It is natural to consider that objects (images, for example) can be constructed on the basis of a low-dimensional representation.

Note that this is a Graphical Model, not a Function.

The latent variables h can be sampled using p(h), and an image then sampled from p(v|h). One cannot use an autoencoder to generate new images.
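This sampling procedure is ancestral sampling: draw the parent, then the child. A linear-Gaussian toy version (the decoder weights and noise scale are invented for illustration):

import numpy as np

rng = np.random.default_rng(0)
H, V = 2, 4
W = rng.normal(size=(V, H))              # illustrative decoder weights

h = rng.normal(size=H)                   # h ~ p(h) = N(0, I)
v = rng.normal(loc=W @ h, scale=0.1)     # v ~ p(v|h) = N(W h, 0.1^2 I)
print(v)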
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in these models.

Statisticians typically use sampling as an approximation.

It is very popular in ML to use a variational method – much faster for inference.
Variational Inference
Consider a distribution

p(v|θ) = ∫_h p(v|h, θ) p(h)

and suppose that we wish to learn θ to maximise the probability that this model generates the observed data. A variational lower bound is

log p(v|θ) ≥ −∫_h q(h|v, φ) log q(h|v, φ) + ∫_h q(h|v, φ) log p(v|h, θ) + const.
The idea is to choose a 'variational' distribution q(h|v, φ) such that we can either calculate the bound analytically or sample it efficiently.

We then jointly maximise the bound w.r.t. φ and θ.

We can parameterise p(v|h, θ) using a deep network.

This is a very popular approach – see the 'variational autoencoder', and also attention mechanisms.
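For intuition, here is a single-sample, reparameterised estimate of this bound for a one-dimensional toy model; the choices q(h|v,φ) = N(μφ, σφ²), p(h) = N(0,1) and p(v|h,θ) = N(θh, 1) are mine, not the slides':

import numpy as np

def elbo_estimate(v, mu_phi, log_sig_phi, theta, rng):
    eps = rng.normal()
    h = mu_phi + np.exp(log_sig_phi) * eps      # reparameterised sample h ~ q(h|v, phi)
    log_q = -0.5 * ((h - mu_phi) / np.exp(log_sig_phi)) ** 2 - log_sig_phi - 0.5 * np.log(2 * np.pi)
    log_prior = -0.5 * h ** 2 - 0.5 * np.log(2 * np.pi)
    log_lik = -0.5 * (v - theta * h) ** 2 - 0.5 * np.log(2 * np.pi)
    return log_lik + log_prior - log_q          # unbiased single-sample bound estimate

rng = np.random.default_rng(0)
print(elbo_estimate(v=1.3, mu_phi=0.5, log_sig_phi=-1.0, theta=1.0, rng=rng))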
Extension to semi-supervised methods using p(v) = ∫_h Σ_c p(v|h, c) p(c) p(h).
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games?
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A, we need to decide which action to take in any state of W that will be best for our long-term goals.

The problem is that the number of pixel states is enormous.

We need to learn a low-dimensional representation of the screen (using a deep generative model).

Then learn which action to take given the low-dimensional representation.
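A minimal sketch of the second step, tabular Q-learning over a discrete low-dimensional state; `env` and `encode` are assumed stand-ins (an environment with reset/step, and a learned encoder mapping pixels to a small state index), not the slides' actual method:

import numpy as np

def q_learning(env, encode, n_states, n_actions, episodes=500,
               alpha=0.1, gamma=0.99, eps=0.1, rng=np.random.default_rng(0)):
    Q = np.zeros((n_states, n_actions))
    for _ in range(episodes):
        s, done = encode(env.reset()), False
        while not done:
            # epsilon-greedy action choice
            a = rng.integers(n_actions) if rng.random() < eps else int(Q[s].argmax())
            obs, r, done = env.step(a)
            s2 = encode(obs)
            # temporal-difference update towards the long-term return
            Q[s, a] += alpha * (r + gamma * Q[s2].max() * (not done) - Q[s, a])
            s = s2
    return Q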
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state-of-the-art results in Speech Recognition, Image Analysis, Game Playing.
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representation learning.
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
https://reinfer.io
Models
Burglar Problem
Creaks and Bumps
[Figure panels: 'creak', 'bump']
Burglar Model
[Figure: HMM pos1 → pos2 → pos3 → pos4 with emissions post → sndt]
pos – position in kitchen; snd – sound
Finding the Burglar

[Figure: the filtered distribution over the burglar's kitchen position, updated as the observed sequence of creaks and bumps arrives]
Stubby Fingers

[Figure: HMM int1 → int2 → int3 → int4 with emissions intt → hitt]
int – intended key; hit – hit key
Stubby Fingers: errors

[Figure: emission matrix p(hit|int) over the keys a–z; values range from about 0.05 to 0.55]
Stubby Fingers: language

[Figure: transition matrix p(intt|intt−1) over the keys a–z; values range from 0 to about 0.9]
Stubby Fingers
Given the typed sequence 'cwsykcak', what is the most likely word that this corresponds to?
List the 200 most likely hidden sequences
Discard those that are not in a standard English dictionary
Take the most likely proper English word as the intended typed word
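A sketch of this decode-and-filter procedure, using beam search as an approximation to listing the K most likely hidden sequences (the toy tables and two-letter 'dictionary' below are invented):

def top_k_sequences(obs, states, trans, emis, p1, k=200):
    # beams: list of (hidden sequence, joint probability), kept sorted, best first
    beams = sorted((((s,), p1[s] * emis[s][obs[0]]) for s in states),
                   key=lambda b: -b[1])[:k]
    for v in obs[1:]:
        beams = [(seq + (s2,), p * trans[seq[-1]][s2] * emis[s2][v])
                 for seq, p in beams for s2 in states]
        beams = sorted(beams, key=lambda b: -b[1])[:k]
    return beams

def best_word(beams, dictionary):
    for seq, p in beams:                  # beams are sorted by probability
        word = "".join(seq)
        if word in dictionary:
            return word
    return None

states = ("a", "b")
trans = {"a": {"a": 0.7, "b": 0.3}, "b": {"a": 0.4, "b": 0.6}}
emis = {"a": {"x": 0.9, "y": 0.1}, "b": {"x": 0.2, "y": 0.8}}
p1 = {"a": 0.5, "b": 0.5}
print(best_word(top_k_sequences("xy", states, trans, emis, p1, k=4), {"ab", "ba"}))  # 'ab'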
Speech Recognition: raw signal

[Figure: raw audio waveform, amplitude against time in seconds]
'Neural' representation

[Figure: 'neural' representation of the same signal (time–feature image)]
Speech Recognition
[Figure: HMM pho1 → pho2 → pho3 → pho4 with emissions phot → audt]
pho – phoneme (letter); aud – audio signal (neural representation)
Medical Diagnosis
[Figure: belief network with diseases (tumour, flu, meningitis) as parents of symptoms (headache, fever, appetite, x-ray)]
Combine known medical knowledge with patient-specific information.
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems, such as the Ising Model (1920), and in AI applications such as the HMM (Baum 1966; Stratonovich 1960).
The need for structure
We often want to make a probabilistic description of many objects (electron spins, neurons, customers, etc.).

Typically the representational and computational cost of probabilistic models grows exponentially with the number of objects represented.

Without introducing strong structural limitations on how these objects can interact, probability is a non-starter.

For this reason, computationally 'simpler' alternatives (such as fuzzy logic) were introduced to try to avoid some of these difficulties – however, these are typically frowned upon by purists.
Graphical Models
We can use graphs to represent how objects can probabilistically interact with each other.

Graphical Models are then a marriage between Graph and Probability theory.

Many of the quantities that we would like to compute in a probability distribution can then be related to operations on the graph.

The computational complexity of operations can often be related to the structure of the graph.

Graphical Models are now used as a standard framework in Engineering, Statistics and Computer Science.

Graphical Models are used to perform reasoning under uncertainty and are therefore widely applicable.
Uses in Industry
Microsoft: used to estimate the skill distribution of players in online games (the world's largest graphical model).

Hospitals: use Belief Nets to encode knowledge about diseases and symptoms to aid medical diagnosis.

Google, Microsoft, Facebook: used in many places, including advertising, video game prediction and speech recognition.

Used to estimate the inherent desirability of products in consumer retail.

Microsoft and others: attempts to go beyond simple A/B testing by using Graphical Models to model the whole company–user relationship.
Conditional Probability and Bayes' Rule
The probability of event x conditioned on knowing event y (or more shortly theprobability of x given y) is defined as
p(x|y) ≡ p(x, y)/p(y) = p(y|x) p(x)/p(y)   (Bayes' rule)
Throwing darts
p(region 5|not region 20) = p(region 5, not region 20)/p(not region 20) = p(region 5)/p(not region 20) = (1/20)/(19/20) = 1/19
Interpretation
p(A = a|B = b) should not be interpreted as 'given the event B = b has occurred, p(A = a|B = b) is the probability of the event A = a occurring'. The correct interpretation is 'p(A = a|B = b) is the probability of A being in state a under the constraint that B is in state b'.
Battleships
Assume there are 2 ships, 1 vertical (ship 1) and 1 horizontal (ship 2), of 5 pixels each.

Each can be placed anywhere on the 10×10 grid, but they cannot overlap.

Let s1 be the origin of ship 1 and s2 the origin of ship 2.

The data D is a collection of query 'hit' or 'miss' responses.

p(s1, s2|D) = p(D|s1, s2) p(s1, s2)/p(D)

Let X be the matrix of pixel occupancy:

p(X|D) = Σ_{s1,s2} p(X, s1, s2|D) = Σ_{s1,s2} p(X|s1, s2) p(s1, s2|D)
demoBattleships.m
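The slides' demo is MATLAB (demoBattleships.m); the same posterior can be sketched in a few lines of Python by enumerating all consistent placements under a uniform prior (the example queries at the end are invented):

import numpy as np

def placements():
    for i in range(6):                    # vertical ship: rows i..i+4 of column j
        for j in range(10):
            s1 = {(i + k, j) for k in range(5)}
            for a in range(10):           # horizontal ship: columns b..b+4 of row a
                for b in range(6):
                    s2 = {(a, b + k) for k in range(5)}
                    if not s1 & s2:       # ships cannot overlap
                        yield s1 | s2

def occupancy_posterior(data):
    # data: {(row, col): True for 'hit', False for 'miss'}
    post, total = np.zeros((10, 10)), 0
    for occ in placements():
        if all((cell in occ) == hit for cell, hit in data.items()):
            for cell in occ:
                post[cell] += 1           # uniform prior over consistent placements
            total += 1
    return post / total                   # p(X_ij = 1 | D)

print(occupancy_posterior({(0, 0): False, (4, 5): True}).round(2))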
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated the conditional probability of the node given its parents.

The joint distribution is obtained by taking the product of the conditional probabilities:

p(A, B, C, D, E) = p(A) p(B) p(C|A, B) p(D|C) p(E|B, C)
[Figure: DAG with edges A → C, B → C, C → D, B → E, C → E; node E carries the table p(E|B, C)]
Example – Part I
Sally's burglar Alarm is sounding. Has she been Burgled, or was the alarm triggered by an Earthquake? She turns the car Radio on for news of earthquakes.
Choosing an ordering
Without loss of generality, we can write

p(A, R, E, B) = p(A|R, E, B) p(R, E, B)
             = p(A|R, E, B) p(R|E, B) p(E, B)
             = p(A|R, E, B) p(R|E, B) p(E|B) p(B)
Assumptions
The alarm is not directly influenced by any report on the radio: p(A|R,E,B) = p(A|E,B)
The radio broadcast is not directly influenced by the burglar variable: p(R|E,B) = p(R|E)
Burglaries don't directly 'cause' earthquakes: p(E|B) = p(E)
Therefore
p(A,R,E,B) = p(A|E,B) p(R|E) p(E) p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
Models
Burglar Problem
Creaks and Bumps
Creak Bump
Burglar Model
pos1 pos2 pos3 pos4
snd1 snd2 snd3 snd4
pos - position in kitchensnd ndash sound
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Stubby Fingers
Stubby Fingers
int1 int2 int3 int4
hit1 hit2 hit3 hit4
int - intended keyhit ndash hit key
Stubby Fingers errors
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz
005
01
015
02
025
03
035
04
045
05
055
Stubby Fingers language
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz 0
01
02
03
04
05
06
07
08
09
Stubby Fingers
Given the typed sequence cwsykcak what is the most likely word that thiscorresponds to
List the 200 most likely hidden sequences
Discard those that are not in a standard English dictionary
Take the most likely proper English word as the intended typed word
Speech Recognition raw signal
0 01 02 03 04 05 06 07 08 09minus02
minus015
minus01
minus005
0
005
01
015
02
025
03
Time
lsquoneuralrsquo representation
10 20 30 40 50 60 70 80
5
10
15
20
25
Speech Recognition
pho1 pho2 pho3 pho4
aud1 aud2 aud3 aud4
pho phoneme (letter)aud audio signal (neural representation)
Medical Diagnosis
tumour flu meningitis
headache fever appetite x-ray
Combine known medical knowledge with patient specific information
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems such as the IsingModel (1920) and in AI applications such as the HMM (Baum 1966Stratonovich 1960)
The need for structure
We often want to make a probabilistic description of many objects (electronspins neurons customers etc )
Typically the representational and computational cost of probabilistic modelsgrows exponentially with the number of objects represented
Without introducing strong structural limitations about how these objects caninteract probability is a non-starter
For this reason computationally lsquosimplerrsquo alternatives (such as fuzzy logic)were introduced to try to avoid some of these difficulties ndash however these aretypically frowed on by purists
Graphical Models
We can use graphs to represent how objects can probabilistically interact witheach other
Graphical Models and then a marriage between Graph and Probability theory
Many of the quantities that we would like to compute in a probabilitydistribution can then be related to operations on the graph
The computational complexity of operations can often be related to thestructure of the graph
Graphical Models are now used as a standard framework in EngineeringStatistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty and aretherefore widely applicable
Uses in Industry
Microsoft used to estimate the skill distribution of players in online games(the worlds largest graphical model)
Hospitals use Belief Nets to encode knowledge about diseases and symptomsto aid medical diagnosis
Google Microsoft Facebook used in many places including advertisingvideo game prediction speech recognition
Used to estimate inherent desirability of products in consumer retail
Microsoft and others Attempt to go beyond simple AB testing by usesGraphical Models to model the whole companyuser relationship
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or more shortly theprobability of x given y) is defined as
p(x|y) equiv p(x y)
p(y)=p(y|x)p(x)
p(y)(Bayesrsquo rule)
Throwing darts
p(region 5|not region 20) =p(region 5 not region 20)
p(not region 20)
=p(region 5)
p(not region 20)=
120
1920=
1
19
Interpretationp(A = a|B = b) should not be interpreted as lsquoGiven the event B = b has occurredp(A = a|B = b) is the probability of the event A = a occurringrsquo The correctinterpretation should be lsquop(A = a|B = b) is the probability of A being in state aunder the constraint that B is in state brsquo
Battleships
Assume there are 2 ships 1 vertical (ship 1) and 1 horizontal (ship 2) of 5pixels each
Can be placed anywhere on the 10times10 grid but cannot overlap
Let s1 is the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query lsquohitrsquo or lsquomissrsquo responses
p(s1 s2|D) =p(D|s1 s2)p(s1 s2)
p(D)Let X be the matrix of pixel occupancy
p(X|D) =sums1s2
p(X s1 s2|D) =sums1s2
p(X|s1 s2)p(s1 s2|D)
demoBattleshipsm
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
Burglar Problem
Creaks and Bumps
Creak Bump
Burglar Model
pos1 pos2 pos3 pos4
snd1 snd2 snd3 snd4
pos - position in kitchensnd ndash sound
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Stubby Fingers
Stubby Fingers
int1 int2 int3 int4
hit1 hit2 hit3 hit4
int - intended keyhit ndash hit key
Stubby Fingers errors
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz
005
01
015
02
025
03
035
04
045
05
055
Stubby Fingers language
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz 0
01
02
03
04
05
06
07
08
09
Stubby Fingers
Given the typed sequence cwsykcak what is the most likely word that thiscorresponds to
List the 200 most likely hidden sequences
Discard those that are not in a standard English dictionary
Take the most likely proper English word as the intended typed word
Speech Recognition raw signal
0 01 02 03 04 05 06 07 08 09minus02
minus015
minus01
minus005
0
005
01
015
02
025
03
Time
lsquoneuralrsquo representation
10 20 30 40 50 60 70 80
5
10
15
20
25
Speech Recognition
pho1 pho2 pho3 pho4
aud1 aud2 aud3 aud4
pho phoneme (letter)aud audio signal (neural representation)
Medical Diagnosis
tumour flu meningitis
headache fever appetite x-ray
Combine known medical knowledge with patient specific information
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems such as the IsingModel (1920) and in AI applications such as the HMM (Baum 1966Stratonovich 1960)
The need for structure
We often want to make a probabilistic description of many objects (electronspins neurons customers etc )
Typically the representational and computational cost of probabilistic modelsgrows exponentially with the number of objects represented
Without introducing strong structural limitations about how these objects caninteract probability is a non-starter
For this reason computationally lsquosimplerrsquo alternatives (such as fuzzy logic)were introduced to try to avoid some of these difficulties ndash however these aretypically frowed on by purists
Graphical Models
We can use graphs to represent how objects can probabilistically interact witheach other
Graphical Models and then a marriage between Graph and Probability theory
Many of the quantities that we would like to compute in a probabilitydistribution can then be related to operations on the graph
The computational complexity of operations can often be related to thestructure of the graph
Graphical Models are now used as a standard framework in EngineeringStatistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty and aretherefore widely applicable
Uses in Industry
Microsoft used to estimate the skill distribution of players in online games(the worlds largest graphical model)
Hospitals use Belief Nets to encode knowledge about diseases and symptomsto aid medical diagnosis
Google Microsoft Facebook used in many places including advertisingvideo game prediction speech recognition
Used to estimate inherent desirability of products in consumer retail
Microsoft and others Attempt to go beyond simple AB testing by usesGraphical Models to model the whole companyuser relationship
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or more shortly theprobability of x given y) is defined as
p(x|y) equiv p(x y)
p(y)=p(y|x)p(x)
p(y)(Bayesrsquo rule)
Throwing darts
p(region 5|not region 20) =p(region 5 not region 20)
p(not region 20)
=p(region 5)
p(not region 20)=
120
1920=
1
19
Interpretationp(A = a|B = b) should not be interpreted as lsquoGiven the event B = b has occurredp(A = a|B = b) is the probability of the event A = a occurringrsquo The correctinterpretation should be lsquop(A = a|B = b) is the probability of A being in state aunder the constraint that B is in state brsquo
Battleships
Assume there are 2 ships 1 vertical (ship 1) and 1 horizontal (ship 2) of 5pixels each
Can be placed anywhere on the 10times10 grid but cannot overlap
Let s1 is the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query lsquohitrsquo or lsquomissrsquo responses
p(s1 s2|D) =p(D|s1 s2)p(s1 s2)
p(D)Let X be the matrix of pixel occupancy
p(X|D) =sums1s2
p(X s1 s2|D) =sums1s2
p(X|s1 s2)p(s1 s2|D)
demoBattleshipsm
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
Creaks and Bumps
Creak Bump
Burglar Model
pos1 pos2 pos3 pos4
snd1 snd2 snd3 snd4
pos - position in kitchensnd ndash sound
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Stubby Fingers
Stubby Fingers
int1 int2 int3 int4
hit1 hit2 hit3 hit4
int - intended keyhit ndash hit key
Stubby Fingers errors
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz
005
01
015
02
025
03
035
04
045
05
055
Stubby Fingers language
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz 0
01
02
03
04
05
06
07
08
09
Stubby Fingers
Given the typed sequence cwsykcak what is the most likely word that thiscorresponds to
List the 200 most likely hidden sequences
Discard those that are not in a standard English dictionary
Take the most likely proper English word as the intended typed word
Speech Recognition raw signal
0 01 02 03 04 05 06 07 08 09minus02
minus015
minus01
minus005
0
005
01
015
02
025
03
Time
lsquoneuralrsquo representation
10 20 30 40 50 60 70 80
5
10
15
20
25
Speech Recognition
pho1 pho2 pho3 pho4
aud1 aud2 aud3 aud4
pho phoneme (letter)aud audio signal (neural representation)
Medical Diagnosis
tumour flu meningitis
headache fever appetite x-ray
Combine known medical knowledge with patient specific information
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems such as the IsingModel (1920) and in AI applications such as the HMM (Baum 1966Stratonovich 1960)
The need for structure
We often want to make a probabilistic description of many objects (electronspins neurons customers etc )
Typically the representational and computational cost of probabilistic modelsgrows exponentially with the number of objects represented
Without introducing strong structural limitations about how these objects caninteract probability is a non-starter
For this reason computationally lsquosimplerrsquo alternatives (such as fuzzy logic)were introduced to try to avoid some of these difficulties ndash however these aretypically frowed on by purists
Graphical Models
We can use graphs to represent how objects can probabilistically interact witheach other
Graphical Models and then a marriage between Graph and Probability theory
Many of the quantities that we would like to compute in a probabilitydistribution can then be related to operations on the graph
The computational complexity of operations can often be related to thestructure of the graph
Graphical Models are now used as a standard framework in EngineeringStatistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty and aretherefore widely applicable
Uses in Industry
Microsoft used to estimate the skill distribution of players in online games(the worlds largest graphical model)
Hospitals use Belief Nets to encode knowledge about diseases and symptomsto aid medical diagnosis
Google Microsoft Facebook used in many places including advertisingvideo game prediction speech recognition
Used to estimate inherent desirability of products in consumer retail
Microsoft and others Attempt to go beyond simple AB testing by usesGraphical Models to model the whole companyuser relationship
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or more shortly theprobability of x given y) is defined as
p(x|y) equiv p(x y)
p(y)=p(y|x)p(x)
p(y)(Bayesrsquo rule)
Throwing darts
p(region 5|not region 20) =p(region 5 not region 20)
p(not region 20)
=p(region 5)
p(not region 20)=
120
1920=
1
19
Interpretationp(A = a|B = b) should not be interpreted as lsquoGiven the event B = b has occurredp(A = a|B = b) is the probability of the event A = a occurringrsquo The correctinterpretation should be lsquop(A = a|B = b) is the probability of A being in state aunder the constraint that B is in state brsquo
Battleships
Assume there are 2 ships 1 vertical (ship 1) and 1 horizontal (ship 2) of 5pixels each
Can be placed anywhere on the 10times10 grid but cannot overlap
Let s1 is the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query lsquohitrsquo or lsquomissrsquo responses
p(s1 s2|D) =p(D|s1 s2)p(s1 s2)
p(D)Let X be the matrix of pixel occupancy
p(X|D) =sums1s2
p(X s1 s2|D) =sums1s2
p(X|s1 s2)p(s1 s2|D)
demoBattleshipsm
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time:

p(xt = i) = Σ_j M_ij p(xt−1 = j),   where M_ij ≡ p(xt = i|xt−1 = j)

p(xt = i) is the frequency that we visit state i at time t, given we started from p(x1) and randomly drew samples from the transition p(xτ|xτ−1). As we repeatedly sample a new state from the chain, the distribution at time t for an initial distribution p1 is

pt = M^{t−1} p1

If, for t → ∞, p∞ is independent of the initial distribution p1, then p∞ is called the equilibrium distribution of the chain:

p∞ = M p∞

The equilibrium distribution is proportional to the eigenvector with unit eigenvalue of the transition matrix.
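To illustrate pt = M^{t−1} p1 converging to p∞, here is a small numpy sketch with an invented 3-state transition matrix (columns sum to one, following the M_ij convention above):

```python
import numpy as np

# Hypothetical transition matrix: column j holds p(x_t = . | x_{t-1} = j).
M = np.array([[0.5, 0.2, 0.1],
              [0.3, 0.6, 0.2],
              [0.2, 0.2, 0.7]])

# Power iteration: repeatedly applying M washes out the initial distribution.
p = np.array([1.0, 0.0, 0.0])        # arbitrary initial distribution p1
for _ in range(200):
    p = M @ p
print(p)                             # the equilibrium distribution p_inf

# Equivalently: the eigenvector of M with eigenvalue 1, normalised to sum to 1.
vals, vecs = np.linalg.eig(M)
v = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
print(v / v.sum())
```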
PageRank
Define the matrix
A_ij = 1 if website j has a hyperlink to website i, and 0 otherwise.

From this we can define a Markov transition matrix with elements

M_ij = A_ij / Σ_{i′} A_{i′j}

If we jump from website to website, the equilibrium distribution component p∞(i) is the relative number of times we will visit website i. This has a natural interpretation as the 'importance' of website i.

For each website i, a list of words associated with that website is collected. After doing this for all websites, one can make an 'inverse' list of which websites contain word w. When a user searches for word w, the list of websites that contain the word is then returned, ranked according to the importance of the site.
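A sketch of this construction on an invented four-website link matrix (the production algorithm also adds a damping factor to handle dangling and disconnected pages, omitted here):

```python
import numpy as np

# A[i, j] = 1 if website j has a hyperlink to website i (toy data).
A = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 1, 0, 0],
              [0, 0, 1, 0]], dtype=float)

M = A / A.sum(axis=0)     # M_ij = A_ij / sum_i' A_i'j (column-normalise)
p = np.full(4, 0.25)      # start from a uniform distribution over sites
for _ in range(100):      # power iteration to the equilibrium distribution
    p = M @ p
print(p)                  # p_inf(i): the 'importance' of website i
```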
Hidden Markov Models
The HMM defines a Markov chain on hidden (or 'latent') variables h1:T. The observed (or 'visible') variables are dependent on the hidden variables through an emission p(vt|ht). This defines a joint distribution

p(h1:T, v1:T) = p(v1|h1) p(h1) ∏_{t=2}^{T} p(vt|ht) p(ht|ht−1)

For a stationary HMM the transition p(ht|ht−1) and emission p(vt|ht) distributions are constant through time.

[Figure: a first order hidden Markov model with 'hidden' variables dom(ht) = {1, . . . , H}, t = 1, . . . , T. The 'visible' variables vt can be either discrete or continuous.]
The classical inference problems
Filtering (inferring the present): p(ht|v1:t)
Prediction (inferring the future): p(ht|v1:s), t > s
Smoothing (inferring the past): p(ht|v1:u), t < u
Likelihood: p(v1:T)
Most likely path (Viterbi alignment): argmax_{h1:T} p(h1:T|v1:T)

For prediction, one is also often interested in p(vt|v1:s) for t > s.
Inference in Hidden Markov Models
Belief network representation of a HMM:

[Figure: hidden chain h1 → h2 → h3 → h4 with emissions ht → vt]

Filtering, Smoothing and Viterbi are all computationally efficient, scaling linearly with the length of the timeseries (but quadratically with the number of hidden states).

The algorithms are variants of 'message passing on factor graphs'.

The algorithms are guaranteed to work if the graph is singly-connected.

There has been a huge research effort in the last 15 years to apply message passing for approximate inference in multiply-connected graphs (e.g. low-density parity-check codes).
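As an example of these message passing recursions, here is a minimal numpy sketch of the filtering (forward) pass; the matrix conventions are assumptions of this sketch, not fixed by the slides:

```python
import numpy as np

def filtering(v, p1, T, E):
    """Forward pass for a discrete HMM.
    v  : observed symbols (list of ints)
    p1 : initial distribution p(h1), shape (H,)
    T  : transition matrix, T[i, j] = p(h_t = i | h_{t-1} = j)
    E  : emission matrix,  E[k, i] = p(v_t = k | h_t = i)
    Returns the filtered posteriors p(h_t | v_{1:t}) and log p(v_{1:T})."""
    alphas, loglik, prior = [], 0.0, p1
    for vt in v:
        f = E[vt] * prior          # unnormalised p(h_t | v_{1:t})
        loglik += np.log(f.sum())  # f.sum() = p(v_t | v_{1:t-1})
        alpha = f / f.sum()
        alphas.append(alpha)
        prior = T @ alpha          # one-step prediction p(h_{t+1} | v_{1:t})
    return np.array(alphas), loglik

# Toy two-state example (numbers invented for illustration).
T = np.array([[0.9, 0.2],
              [0.1, 0.8]])
E = np.array([[0.8, 0.3],
              [0.2, 0.7]])
print(filtering([0, 0, 1], np.array([0.5, 0.5]), T, E))
```

Each update costs O(H²), which is the quadratic scaling in the number of hidden states noted above.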
HMMs for speech recognition
ht is the phoneme at time t; p(ht|ht−1) – language model; p(vt|ht) – speech signal model.
Deep Nets and HMMs
[Figure: hidden chain h1 → h2 → h3 → h4 with emissions ht → vt]

Recently, companies including Google have made big advances in speech recognition.

The breakthrough is to model p(vt|ht) as a Gaussian whose mean is some function of the phoneme, μ(ht; θ).

This function is a deep neural network, trained on a large amount of data.

There is a goldrush at the moment to find similar breakthrough applications of deep networks in reasoning systems.
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
[Figure: generative model with latent variables h1, h2 as parents of visibles v1, . . . , v4]

It is natural to consider that objects (images, for example) can be constructed on the basis of a low dimensional representation.

Note that this is a Graphical Model, not a Function: the latent variables h can be sampled from p(h), and then an image sampled from p(v|h). One cannot use an autoencoder to generate new images.
The bad news
Inference (computing p(h|v)) and parameter learning are intractable in these models.

Statisticians typically use sampling as an approximation.

It is very popular in ML to use a variational method – much faster for inference.
Variational Inference
Consider a distribution

p(v|θ) = ∫_h p(v|h, θ) p(h)

and suppose that we wish to learn θ to maximise the probability this model generates observed data. Then

log p(v|θ) ≥ −∫_h q(h|v, φ) log q(h|v, φ) + ∫_h q(h|v, φ) log p(v|h, θ) + const.

The idea is to choose a 'variational' distribution q(h|v, φ) such that we can either calculate the bound analytically or sample it efficiently.

We then jointly maximise the bound w.r.t. φ and θ.

We can parameterise p(v|h, θ) using a deep network.

This is a very popular approach – see the 'variational autoencoder' and also attention mechanisms.

There is an extension to a semi-supervised method using p(v) = ∫_h Σ_c p(v|h, c) p(c) p(h).
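To make the bound concrete, here is a toy Monte Carlo check in numpy for a one-dimensional linear-Gaussian model, where the exact marginal likelihood is available in closed form; the model and the Gaussian q are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def log_normal(x, mean, var):
    return -0.5 * (np.log(2 * np.pi * var) + (x - mean) ** 2 / var)

def elbo(v, theta, m, s, n=100_000):
    """Monte Carlo estimate of the variational lower bound on log p(v|theta),
    for p(h) = N(0, 1), p(v|h) = N(theta*h, 1), q(h|v) = N(m, s^2)."""
    h = m + s * rng.standard_normal(n)              # samples from q(h|v)
    return np.mean(log_normal(v, theta * h, 1.0)    #  E_q[log p(v|h, theta)]
                   + log_normal(h, 0.0, 1.0)        # +E_q[log p(h)]
                   - log_normal(h, m, s ** 2))      # -E_q[log q(h|v)]

v, theta = 1.5, 2.0
exact = log_normal(v, 0.0, theta ** 2 + 1.0)  # log p(v|theta) = log N(v; 0, theta^2+1)
pv = 1.0 / (1.0 + theta ** 2)                 # exact posterior variance
print(elbo(v, theta, m=theta * v * pv, s=np.sqrt(pv)), exact)  # bound is tight here
print(elbo(v, theta, m=0.0, s=1.0))                            # a worse q: strictly lower
```

With q set to the exact posterior the bound matches log p(v|θ); any other q gives a smaller value, which is what makes joint maximisation over (φ, θ) sensible.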
DRAW (Deep Recurrent Attentive Writer)
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games?
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A, we need to decide which action to take, for any state of W, that will be best for our long-term goals.

The problem is that the number of pixel states is enormous.

We need to learn a low dimensional representation of the screen (using a deep generative model).

Then learn which action to take given the low dimensional representation.
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state-of-the-art results in Speech Recognition, Image Analysis, Game Playing.
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve the interaction between reinforcement learning and representation learning.
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer:
https://reinfer.io
Burglar Model
[Figure: HMM with hidden positions pos1 → pos2 → pos3 → pos4 and observed sounds snd1, . . . , snd4]

pos – position in kitchen; snd – sound
Finding the Burglar

[Figure: filtering on the kitchen grid, with the observed 'creak' and 'bump' sounds marked on squares over time]
Stubby Fingers

[Figure: HMM with hidden intended keys int1 → int2 → int3 → int4 and observed hit keys hit1, . . . , hit4]

int – intended key; hit – hit key
Stubby Fingers: errors

[Figure: heatmap of the key-error distribution p(hit|int) over keys a–z; colour scale 0.05–0.55]
Stubby Fingers: language

[Figure: heatmap of the letter language model (transitions between keys a–z); colour scale 0–0.9]
Stubby Fingers
Given the typed sequence cwsykcak, what is the most likely word that this corresponds to?
List the 200 most likely hidden sequences
Discard those that are not in a standard English dictionary
Take the most likely proper English word as the intended typed word
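The slides use a 200-best list; as a simpler sketch of the same machinery, here is a 1-best Viterbi decoder in numpy (conventions as in the filtering sketch above; probabilities are assumed strictly positive so the logs stay finite):

```python
import numpy as np

def viterbi(v, p1, T, E):
    """argmax_h p(h_{1:T} | v_{1:T}) for a discrete HMM.
    T[i, j] = p(h_t = i | h_{t-1} = j);  E[k, i] = p(v_t = k | h_t = i)."""
    n = len(v)
    delta = np.log(p1) + np.log(E[v[0]])       # best log-prob ending in each state
    back = np.zeros((n, len(p1)), dtype=int)   # backpointers
    for t in range(1, n):
        scores = np.log(T) + delta             # scores[i, j]: come from j, land in i
        back[t] = scores.argmax(axis=1)
        delta = np.log(E[v[t]]) + scores.max(axis=1)
    path = [int(delta.argmax())]
    for t in range(n - 1, 0, -1):              # trace the backpointers
        path.append(int(back[t, path[-1]]))
    return path[::-1]
```

For the typing model, the states would be the 26 intended keys, T the letter language model, and E the stubby-fingers error matrix; Viterbi gives the single best intended sequence, which the slides generalise to a 200-best list filtered through a dictionary.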
Speech Recognition: raw signal

[Figure: raw audio waveform, amplitude against time (s)]
'Neural' representation

[Figure: the 'neural' representation of the signal, 25 channels over 80 time frames]
Speech Recognition
[Figure: HMM with hidden phonemes pho1 → pho2 → pho3 → pho4 and observed audio aud1, . . . , aud4]

pho – phoneme (letter); aud – audio signal (neural representation)
Medical Diagnosis
[Figure: belief network with diseases (tumour, flu, meningitis) as parents of symptoms and tests (headache, fever, appetite, x-ray)]

Combine known medical knowledge with patient-specific information.
Data D is a collection of query lsquohitrsquo or lsquomissrsquo responses
p(s1 s2|D) =p(D|s1 s2)p(s1 s2)
p(D)Let X be the matrix of pixel occupancy
p(X|D) =sums1s2
p(X s1 s2|D) =sums1s2
p(X|s1 s2)p(s1 s2|D)
demoBattleshipsm
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Stubby Fingers
Stubby Fingers
int1 int2 int3 int4
hit1 hit2 hit3 hit4
int - intended keyhit ndash hit key
Stubby Fingers errors
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz
005
01
015
02
025
03
035
04
045
05
055
Stubby Fingers language
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz 0
01
02
03
04
05
06
07
08
09
Stubby Fingers
Given the typed sequence cwsykcak what is the most likely word that thiscorresponds to
List the 200 most likely hidden sequences
Discard those that are not in a standard English dictionary
Take the most likely proper English word as the intended typed word
Speech Recognition raw signal
0 01 02 03 04 05 06 07 08 09minus02
minus015
minus01
minus005
0
005
01
015
02
025
03
Time
lsquoneuralrsquo representation
10 20 30 40 50 60 70 80
5
10
15
20
25
Speech Recognition
pho1 pho2 pho3 pho4
aud1 aud2 aud3 aud4
pho phoneme (letter)aud audio signal (neural representation)
Medical Diagnosis
tumour flu meningitis
headache fever appetite x-ray
Combine known medical knowledge with patient specific information
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems such as the IsingModel (1920) and in AI applications such as the HMM (Baum 1966Stratonovich 1960)
The need for structure
We often want to make a probabilistic description of many objects (electronspins neurons customers etc )
Typically the representational and computational cost of probabilistic modelsgrows exponentially with the number of objects represented
Without introducing strong structural limitations about how these objects caninteract probability is a non-starter
For this reason computationally lsquosimplerrsquo alternatives (such as fuzzy logic)were introduced to try to avoid some of these difficulties ndash however these aretypically frowed on by purists
Graphical Models
We can use graphs to represent how objects can probabilistically interact witheach other
Graphical Models and then a marriage between Graph and Probability theory
Many of the quantities that we would like to compute in a probabilitydistribution can then be related to operations on the graph
The computational complexity of operations can often be related to thestructure of the graph
Graphical Models are now used as a standard framework in EngineeringStatistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty and aretherefore widely applicable
Uses in Industry
Microsoft used to estimate the skill distribution of players in online games(the worlds largest graphical model)
Hospitals use Belief Nets to encode knowledge about diseases and symptomsto aid medical diagnosis
Google Microsoft Facebook used in many places including advertisingvideo game prediction speech recognition
Used to estimate inherent desirability of products in consumer retail
Microsoft and others Attempt to go beyond simple AB testing by usesGraphical Models to model the whole companyuser relationship
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or more shortly theprobability of x given y) is defined as
p(x|y) equiv p(x y)
p(y)=p(y|x)p(x)
p(y)(Bayesrsquo rule)
Throwing darts
p(region 5|not region 20) =p(region 5 not region 20)
p(not region 20)
=p(region 5)
p(not region 20)=
120
1920=
1
19
Interpretationp(A = a|B = b) should not be interpreted as lsquoGiven the event B = b has occurredp(A = a|B = b) is the probability of the event A = a occurringrsquo The correctinterpretation should be lsquop(A = a|B = b) is the probability of A being in state aunder the constraint that B is in state brsquo
Battleships
Assume there are 2 ships 1 vertical (ship 1) and 1 horizontal (ship 2) of 5pixels each
Can be placed anywhere on the 10times10 grid but cannot overlap
Let s1 is the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query lsquohitrsquo or lsquomissrsquo responses
p(s1 s2|D) =p(D|s1 s2)p(s1 s2)
p(D)Let X be the matrix of pixel occupancy
p(X|D) =sums1s2
p(X s1 s2|D) =sums1s2
p(X|s1 s2)p(s1 s2|D)
demoBattleshipsm
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Stubby Fingers
Stubby Fingers
int1 int2 int3 int4
hit1 hit2 hit3 hit4
int - intended keyhit ndash hit key
Stubby Fingers errors
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz
005
01
015
02
025
03
035
04
045
05
055
Stubby Fingers language
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz 0
01
02
03
04
05
06
07
08
09
Stubby Fingers
Given the typed sequence cwsykcak what is the most likely word that thiscorresponds to
List the 200 most likely hidden sequences
Discard those that are not in a standard English dictionary
Take the most likely proper English word as the intended typed word
Speech Recognition raw signal
0 01 02 03 04 05 06 07 08 09minus02
minus015
minus01
minus005
0
005
01
015
02
025
03
Time
lsquoneuralrsquo representation
10 20 30 40 50 60 70 80
5
10
15
20
25
Speech Recognition
pho1 pho2 pho3 pho4
aud1 aud2 aud3 aud4
pho phoneme (letter)aud audio signal (neural representation)
Medical Diagnosis
tumour flu meningitis
headache fever appetite x-ray
Combine known medical knowledge with patient specific information
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems such as the IsingModel (1920) and in AI applications such as the HMM (Baum 1966Stratonovich 1960)
The need for structure
We often want to make a probabilistic description of many objects (electronspins neurons customers etc )
Typically the representational and computational cost of probabilistic modelsgrows exponentially with the number of objects represented
Without introducing strong structural limitations about how these objects caninteract probability is a non-starter
For this reason computationally lsquosimplerrsquo alternatives (such as fuzzy logic)were introduced to try to avoid some of these difficulties ndash however these aretypically frowed on by purists
Graphical Models
We can use graphs to represent how objects can probabilistically interact witheach other
Graphical Models and then a marriage between Graph and Probability theory
Many of the quantities that we would like to compute in a probabilitydistribution can then be related to operations on the graph
The computational complexity of operations can often be related to thestructure of the graph
Graphical Models are now used as a standard framework in EngineeringStatistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty and aretherefore widely applicable
Uses in Industry
Microsoft used to estimate the skill distribution of players in online games(the worlds largest graphical model)
Hospitals use Belief Nets to encode knowledge about diseases and symptomsto aid medical diagnosis
Google Microsoft Facebook used in many places including advertisingvideo game prediction speech recognition
Used to estimate inherent desirability of products in consumer retail
Microsoft and others Attempt to go beyond simple AB testing by usesGraphical Models to model the whole companyuser relationship
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or more shortly theprobability of x given y) is defined as
p(x|y) equiv p(x y)
p(y)=p(y|x)p(x)
p(y)(Bayesrsquo rule)
Throwing darts
p(region 5|not region 20) =p(region 5 not region 20)
p(not region 20)
=p(region 5)
p(not region 20)=
120
1920=
1
19
Interpretationp(A = a|B = b) should not be interpreted as lsquoGiven the event B = b has occurredp(A = a|B = b) is the probability of the event A = a occurringrsquo The correctinterpretation should be lsquop(A = a|B = b) is the probability of A being in state aunder the constraint that B is in state brsquo
Battleships
Assume there are 2 ships 1 vertical (ship 1) and 1 horizontal (ship 2) of 5pixels each
Can be placed anywhere on the 10times10 grid but cannot overlap
Let s1 is the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query lsquohitrsquo or lsquomissrsquo responses
p(s1 s2|D) =p(D|s1 s2)p(s1 s2)
p(D)Let X be the matrix of pixel occupancy
p(X|D) =sums1s2
p(X s1 s2|D) =sums1s2
p(X|s1 s2)p(s1 s2|D)
demoBattleshipsm
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
Finding the Burglar
creak creak
bump
creak
bump bump
creak
bump bump bump
creak
bump
Stubby Fingers
Stubby Fingers
int1 int2 int3 int4
hit1 hit2 hit3 hit4
int - intended keyhit ndash hit key
Stubby Fingers errors
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz
005
01
015
02
025
03
035
04
045
05
055
Stubby Fingers language
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz 0
01
02
03
04
05
06
07
08
09
Stubby Fingers
Given the typed sequence cwsykcak what is the most likely word that thiscorresponds to
List the 200 most likely hidden sequences
Discard those that are not in a standard English dictionary
Take the most likely proper English word as the intended typed word
Speech Recognition raw signal
0 01 02 03 04 05 06 07 08 09minus02
minus015
minus01
minus005
0
005
01
015
02
025
03
Time
lsquoneuralrsquo representation
10 20 30 40 50 60 70 80
5
10
15
20
25
Speech Recognition
pho1 pho2 pho3 pho4
aud1 aud2 aud3 aud4
pho phoneme (letter)aud audio signal (neural representation)
Medical Diagnosis
tumour flu meningitis
headache fever appetite x-ray
Combine known medical knowledge with patient specific information
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems such as the IsingModel (1920) and in AI applications such as the HMM (Baum 1966Stratonovich 1960)
The need for structure
We often want to make a probabilistic description of many objects (electronspins neurons customers etc )
Typically the representational and computational cost of probabilistic modelsgrows exponentially with the number of objects represented
Without introducing strong structural limitations about how these objects caninteract probability is a non-starter
For this reason computationally lsquosimplerrsquo alternatives (such as fuzzy logic)were introduced to try to avoid some of these difficulties ndash however these aretypically frowed on by purists
Graphical Models
We can use graphs to represent how objects can probabilistically interact witheach other
Graphical Models and then a marriage between Graph and Probability theory
Many of the quantities that we would like to compute in a probabilitydistribution can then be related to operations on the graph
The computational complexity of operations can often be related to thestructure of the graph
Graphical Models are now used as a standard framework in EngineeringStatistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty and aretherefore widely applicable
Uses in Industry
Microsoft used to estimate the skill distribution of players in online games(the worlds largest graphical model)
Hospitals use Belief Nets to encode knowledge about diseases and symptomsto aid medical diagnosis
Google Microsoft Facebook used in many places including advertisingvideo game prediction speech recognition
Used to estimate inherent desirability of products in consumer retail
Microsoft and others Attempt to go beyond simple AB testing by usesGraphical Models to model the whole companyuser relationship
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or more shortly theprobability of x given y) is defined as
p(x|y) equiv p(x y)
p(y)=p(y|x)p(x)
p(y)(Bayesrsquo rule)
Throwing darts
p(region 5|not region 20) =p(region 5 not region 20)
p(not region 20)
=p(region 5)
p(not region 20)=
120
1920=
1
19
Interpretationp(A = a|B = b) should not be interpreted as lsquoGiven the event B = b has occurredp(A = a|B = b) is the probability of the event A = a occurringrsquo The correctinterpretation should be lsquop(A = a|B = b) is the probability of A being in state aunder the constraint that B is in state brsquo
Battleships
Assume there are 2 ships 1 vertical (ship 1) and 1 horizontal (ship 2) of 5pixels each
Can be placed anywhere on the 10times10 grid but cannot overlap
Let s1 is the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query lsquohitrsquo or lsquomissrsquo responses
p(s1 s2|D) =p(D|s1 s2)p(s1 s2)
p(D)Let X be the matrix of pixel occupancy
p(X|D) =sums1s2
p(X s1 s2|D) =sums1s2
p(X|s1 s2)p(s1 s2|D)
demoBattleshipsm
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
[graph: v1 → v2 → v3 → v4]

p(v_1, ..., v_T) = p(v_1) ∏_{t=2}^{T} p(v_t|v_{t−1})    (initial term × transition terms)

State transition diagram
Nodes represent states of the variable v, and arcs non-zero elements of the transition p(v_t|v_{t−1}).

[state transition diagram over states 1–9]
Most probable and shortest paths
[state transition diagram over states 1–9]

The shortest (unweighted) path from state 1 to state 7 is 1–2–7.
The most probable path from state 1 to state 7 is 1–8–9–7 (assuming uniform transition probabilities). The latter path is longer but more probable, since for the path 1–2–7 the probability of exiting state 2 into state 7 is only 1/5.
Equilibrium distribution
It is interesting to know how the marginal p(x_t) evolves through time:

p(x_t = i) = Σ_j p(x_t = i|x_{t−1} = j) p(x_{t−1} = j),    with M_ij ≡ p(x_t = i|x_{t−1} = j)

p(x_t = i) is the frequency with which we visit state i at time t, given we started from p(x_1) and randomly drew samples from the transition p(x_τ|x_{τ−1}). As we repeatedly sample a new state from the chain, the distribution at time t for an initial distribution p_1(i) is

p_t = M^{t−1} p_1

If, for t → ∞, p_∞ is independent of the initial distribution p_1, then p_∞ is called the equilibrium distribution of the chain:

p_∞ = M p_∞

The equilibrium distribution is proportional to the eigenvector with unit eigenvalue of the transition matrix.
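Both characterisations of p_∞ are easy to check numerically; a small sketch (the 3-state transition matrix is invented for illustration):

    import numpy as np

    M = np.array([[0.9, 0.2, 0.1],
                  [0.05, 0.7, 0.3],
                  [0.05, 0.1, 0.6]])

    # Power iteration: p_t = M^{t-1} p_1 converges to the equilibrium distribution.
    p = np.array([1.0, 0.0, 0.0])            # any initial distribution p_1
    for _ in range(1000):
        p = M @ p
    print(p)

    # Equivalently, the eigenvector of M with eigenvalue 1, normalised to sum to 1.
    vals, vecs = np.linalg.eig(M)
    v = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
    print(v / v.sum())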
PageRank

Define the matrix

A_ij = 1 if website j has a hyperlink to website i, and 0 otherwise.

From this we can define a Markov transition matrix with elements

M_ij = A_ij / Σ_{i′} A_{i′j}

If we jump from website to website, the equilibrium distribution component p_∞(i) is the relative number of times we will visit website i. This has a natural interpretation as the 'importance' of website i.

For each website i, a list of words associated with that website is collected. After doing this for all websites, one can make an 'inverse' list of which websites contain word w. When a user searches for word w, the list of websites that contain word w is then returned, ranked according to the importance of the site.
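A sketch on a toy link graph (the adjacency matrix is invented, and the damping term used in practice is omitted):

    import numpy as np

    # A[i, j] = 1 if site j links to site i.
    A = np.array([[0, 1, 1, 0],
                  [1, 0, 0, 1],
                  [1, 1, 0, 1],
                  [0, 0, 1, 0]], dtype=float)

    M = A / A.sum(axis=0, keepdims=True)    # column-normalise: M_ij = A_ij / sum_i' A_i'j

    p = np.full(4, 0.25)                    # start uniform
    for _ in range(200):                    # power iterate to the equilibrium p_inf
        p = M @ p
    print(p)                                # 'importance' of each site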
Hidden Markov Models

The HMM defines a Markov chain on hidden (or 'latent') variables h_{1:T}. The observed (or 'visible') variables are dependent on the hidden variables through an emission p(v_t|h_t). This defines a joint distribution

p(h_{1:T}, v_{1:T}) = p(v_1|h_1) p(h_1) ∏_{t=2}^{T} p(v_t|h_t) p(h_t|h_{t−1})

For a stationary HMM the transition p(h_t|h_{t−1}) and emission p(v_t|h_t) distributions are constant through time.

Figure: a first-order hidden Markov model with 'hidden' variables dom(h_t) = {1, ..., H}, t = 1, ..., T. The 'visible' variables v_t can be either discrete or continuous.
The classical inference problems

Filtering (inferring the present): p(h_t|v_{1:t})
Prediction (inferring the future): p(h_t|v_{1:s}), t > s
Smoothing (inferring the past): p(h_t|v_{1:u}), t < u
Likelihood: p(v_{1:T})
Most likely path (Viterbi alignment): argmax_{h_{1:T}} p(h_{1:T}|v_{1:T})

For prediction, one is also often interested in p(v_t|v_{1:s}) for t > s.
Inference in Hidden Markov Models

Belief network representation of a HMM:

[graph: chain h1 → h2 → h3 → h4, with emissions h_t → v_t]

Filtering, Smoothing and Viterbi are all computationally efficient, scaling linearly with the length of the timeseries (but quadratically with the number of hidden states).
The algorithms are variants of 'message passing on factor graphs'.
The algorithms are guaranteed to work if the graph is singly connected.
There has been a huge research effort in the last 15 years to apply message passing for approximate inference in multiply connected graphs (e.g. low-density parity-check codes).
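A minimal sketch of the filtering recursion (the forward algorithm) for discrete states and observations; the transition and emission tables here are randomly generated placeholders:

    import numpy as np

    H, V = 3, 4                                   # number of hidden states, observations
    rng = np.random.default_rng(1)
    p1 = np.full(H, 1.0 / H)                      # p(h_1)
    trans = rng.dirichlet(np.ones(H), size=H).T   # trans[i, j] = p(h_t = i | h_{t-1} = j)
    emit = rng.dirichlet(np.ones(V), size=H).T    # emit[v, h] = p(v_t = v | h_t = h)

    def filtering(obs):
        # Returns p(h_t | v_{1:t}) for each t; cost is linear in the sequence
        # length and quadratic in the number of hidden states, as stated above.
        alpha = emit[obs[0]] * p1                 # proportional to p(h_1 | v_1)
        out = [alpha / alpha.sum()]
        for v in obs[1:]:
            alpha = emit[v] * (trans @ out[-1])   # predict with trans, correct with emit
            out.append(alpha / alpha.sum())
        return np.array(out)

    print(filtering([0, 2, 3, 1]))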
HMMs for speech recognition
h_t is the phoneme at time t; p(h_t|h_{t−1}) – language model; p(v_t|h_t) – speech signal model.
Deep Nets and HMMs

[graph: chain h1 → h2 → h3 → h4, with emissions h_t → v_t]

Recently, companies including Google have made big advances in speech recognition.
The breakthrough is to model p(v_t|h_t) as a Gaussian whose mean is some function of the phoneme, μ(h_t; θ).
This function is a deep neural network, trained on a large amount of data.
There is a goldrush at the moment to find similar breakthrough applications of deep networks in reasoning systems.
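As a rough sketch of such an emission model (the network architecture and dimensions are invented for illustration, not taken from the systems mentioned):

    import numpy as np

    H, D = 40, 13                                 # phonemes, acoustic feature dimension
    rng = np.random.default_rng(0)
    W1, b1 = rng.normal(size=(64, H)) * 0.1, np.zeros(64)
    W2, b2 = rng.normal(size=(D, 64)) * 0.1, np.zeros(D)

    def mu(h):
        # Deep-net mean mu(h; theta): one-hot phoneme -> hidden layer -> mean vector.
        x = np.zeros(H)
        x[h] = 1.0
        return W2 @ np.tanh(W1 @ x + b1) + b2

    def log_emission(v, h, sigma2=1.0):
        # log p(v_t | h_t) = log N(v_t; mu(h_t), sigma2 * I)
        d = v - mu(h)
        return -0.5 * (D * np.log(2 * np.pi * sigma2) + d @ d / sigma2)

    print(log_emission(rng.normal(size=D), h=7))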
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
[graph: latents h1, h2 each with edges to v1, v2, v3, v4]

It is natural to consider that objects (images, for example) can be constructed on the basis of a low-dimensional representation.
Note that this is a graphical model, not a function.
The latent variables h can be sampled from p(h), and an image then sampled from p(v|h). One cannot use an autoencoder to generate new images.
The bad news

Inference (computing p(h|v)) and parameter learning are intractable in these models.
Statisticians typically use sampling as an approximation.
It is very popular in ML to use a variational method – much faster for inference.
Variational Inference

Consider a distribution

p(v|θ) = ∫_h p(v|h, θ) p(h)

and suppose that we wish to learn θ to maximise the probability that this model generates the observed data. A variational lower bound is

log p(v|θ) ≥ −∫_h q(h|v, φ) log q(h|v, φ) + ∫_h q(h|v, φ) log p(v|h, θ)p(h)

The idea is to choose a 'variational' distribution q(h|v, φ) such that we can either calculate the bound analytically or sample it efficiently.
We then jointly maximise the bound with respect to φ and θ.
We can parameterise p(v|h, θ) using a deep network.
This is a very popular approach – see the 'variational autoencoder', and also attention mechanisms.
An extension to a semi-supervised method uses p(v) = ∫_h Σ_c p(v|h, c) p(c) p(h).
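A minimal sketch of estimating this bound by sampling, for a Gaussian prior p(h) = N(0, I), a factorised Gaussian q(h|v, φ), and a toy linear 'decoder' for p(v|h, θ) (all names and dimensions here are illustrative assumptions):

    import numpy as np

    rng = np.random.default_rng(0)
    Dh, Dv = 2, 5
    W = rng.normal(size=(Dv, Dh)) * 0.5           # decoder: p(v|h) = N(W h, I)

    def elbo(v, mu_q, log_var_q, n_samples=100):
        # log p(v) >= E_q[ log p(v|h) + log p(h) - log q(h|v) ]
        total = 0.0
        for _ in range(n_samples):
            eps = rng.normal(size=Dh)
            h = mu_q + np.exp(0.5 * log_var_q) * eps          # reparameterised sample
            log_p_v_h = -0.5 * np.sum((v - W @ h) ** 2) - 0.5 * Dv * np.log(2 * np.pi)
            log_p_h = -0.5 * np.sum(h ** 2) - 0.5 * Dh * np.log(2 * np.pi)
            log_q = -0.5 * np.sum((h - mu_q) ** 2 / np.exp(log_var_q)) \
                    - 0.5 * np.sum(log_var_q) - 0.5 * Dh * np.log(2 * np.pi)
            total += log_p_v_h + log_p_h - log_q
        return total / n_samples

    # In practice one maximises this jointly over phi = (mu_q, log_var_q) and theta = W.
    v = rng.normal(size=Dv)
    print(elbo(v, mu_q=np.zeros(Dh), log_var_q=np.zeros(Dh)))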
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games?
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A, we need to decide which action to take in any state of W so as to best serve our long-term goals.
The problem is that the number of pixel states is enormous.
We need to learn a low-dimensional representation of the screen (using a deep generative model).
We then learn which action to take given the low-dimensional representation, as in the sketch below.
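The action-selection step can be illustrated with tabular Q-learning over such a compact state space (a generic sketch with an invented toy environment – not the deep RL method actually used for Atari):

    import numpy as np

    rng = np.random.default_rng(0)
    n_states, n_actions = 16, 4
    Q = np.zeros((n_states, n_actions))
    alpha, gamma, eps = 0.1, 0.95, 0.1

    def step(s, a):
        # Hypothetical toy environment: random transition, reward only in the last state.
        s_next = int(rng.integers(n_states))
        return s_next, float(s_next == n_states - 1)

    s = 0
    for _ in range(10000):
        # Epsilon-greedy exploration over the learned action values.
        a = int(rng.integers(n_actions)) if rng.random() < eps else int(np.argmax(Q[s]))
        s_next, r = step(s, a)
        # Q-learning update: move Q(s,a) towards r + gamma * max_a' Q(s',a').
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next

    print(Q.max(axis=1))                  # value of each state under the greedy policy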
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state-of-the-art results in Speech Recognition, Image Analysis, Game Playing.
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve the interaction between reinforcement learning and representation learning.
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company, reinfer: https://reinfer.io
Stubby Fingers

[graph: chain int1 → int2 → int3 → int4, with emissions int_t → hit_t]

int – intended key; hit – hit key.
Stubby Fingers: errors

[plot: key confusion matrix p(hit|int) over a–z (intended) × a–z (hit); probability scale 0.05–0.55]
Stubby Fingers: language

[plot: letter transition matrix p(int_t|int_{t−1}) over a–z × a–z; probability scale 0–0.9]
Stubby Fingers

Given the typed sequence 'cwsykcak', what is the most likely word that this corresponds to?

List the 200 most likely hidden sequences.
Discard those that are not in a standard English dictionary.
Take the most likely proper English word as the intended typed word.
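A sketch of this recipe, using a beam search to approximate the N most likely hidden sequences under the HMM above (the transition/emission tables and the dictionary are placeholder assumptions):

    import heapq
    import numpy as np

    # Assumed inputs over the 26 letters: trans[i, j] = p(int_t = i | int_{t-1} = j),
    # emit[o, i] = p(hit = o | int = i), p1[i] = p(int_1 = i), plus a word list.
    rng = np.random.default_rng(0)
    p1 = np.full(26, 1 / 26)
    trans = rng.dirichlet(np.ones(26), size=26).T   # placeholder language model
    emit = rng.dirichlet(np.ones(26), size=26).T    # placeholder error model
    dictionary = {'bayesian', 'networks'}           # placeholder 8-letter word list

    def n_best(observed, n=200, beam=500):
        # Beam search over hidden letter sequences, keeping the `beam` best prefixes.
        obs = [ord(c) - ord('a') for c in observed]
        beams = [(np.log(p1[h]) + np.log(emit[obs[0], h]), (h,)) for h in range(26)]
        for o in obs[1:]:
            ext = [(lp + np.log(trans[h, seq[-1]]) + np.log(emit[o, h]), seq + (h,))
                   for lp, seq in beams for h in range(26)]
            beams = heapq.nlargest(beam, ext)
        return heapq.nlargest(n, beams)

    def decode(observed):
        candidates = n_best(observed)
        for lp, seq in candidates:
            word = ''.join(chr(h + ord('a')) for h in seq)
            if word in dictionary:
                return word                          # most likely dictionary word
        # Fall back to the single most likely sequence if no dictionary word is found.
        return ''.join(chr(h + ord('a')) for h in candidates[0][1])

    print(decode('cwsykcak'))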
Speech Recognition: raw signal

[plot: raw audio amplitude (≈ −0.2 to 0.3) against time, 0–0.9 s]

'neural' representation

[plot: feature-channel representation, ≈ 25 channels over ≈ 80 frames]
Speech Recognition

[graph: chain pho1 → pho2 → pho3 → pho4, with emissions pho_t → aud_t]

pho – phoneme (letter); aud – audio signal (neural representation).
Medical Diagnosis

[graph: disease nodes (tumour, flu, meningitis) with edges to symptom nodes (headache, fever, appetite, x-ray)]

Combine known medical knowledge with patient-specific information.
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems such as the IsingModel (1920) and in AI applications such as the HMM (Baum 1966Stratonovich 1960)
The need for structure
We often want to make a probabilistic description of many objects (electronspins neurons customers etc )
Typically the representational and computational cost of probabilistic modelsgrows exponentially with the number of objects represented
Without introducing strong structural limitations about how these objects caninteract probability is a non-starter
For this reason computationally lsquosimplerrsquo alternatives (such as fuzzy logic)were introduced to try to avoid some of these difficulties ndash however these aretypically frowed on by purists
Graphical Models
We can use graphs to represent how objects can probabilistically interact witheach other
Graphical Models and then a marriage between Graph and Probability theory
Many of the quantities that we would like to compute in a probabilitydistribution can then be related to operations on the graph
The computational complexity of operations can often be related to thestructure of the graph
Graphical Models are now used as a standard framework in EngineeringStatistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty and aretherefore widely applicable
Uses in Industry
Microsoft used to estimate the skill distribution of players in online games(the worlds largest graphical model)
Hospitals use Belief Nets to encode knowledge about diseases and symptomsto aid medical diagnosis
Google Microsoft Facebook used in many places including advertisingvideo game prediction speech recognition
Used to estimate inherent desirability of products in consumer retail
Microsoft and others Attempt to go beyond simple AB testing by usesGraphical Models to model the whole companyuser relationship
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or more shortly theprobability of x given y) is defined as
p(x|y) equiv p(x y)
p(y)=p(y|x)p(x)
p(y)(Bayesrsquo rule)
Throwing darts
p(region 5|not region 20) =p(region 5 not region 20)
p(not region 20)
=p(region 5)
p(not region 20)=
120
1920=
1
19
Interpretationp(A = a|B = b) should not be interpreted as lsquoGiven the event B = b has occurredp(A = a|B = b) is the probability of the event A = a occurringrsquo The correctinterpretation should be lsquop(A = a|B = b) is the probability of A being in state aunder the constraint that B is in state brsquo
Battleships
Assume there are 2 ships 1 vertical (ship 1) and 1 horizontal (ship 2) of 5pixels each
Can be placed anywhere on the 10times10 grid but cannot overlap
Let s1 is the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query lsquohitrsquo or lsquomissrsquo responses
p(s1 s2|D) =p(D|s1 s2)p(s1 s2)
p(D)Let X be the matrix of pixel occupancy
p(X|D) =sums1s2
p(X s1 s2|D) =sums1s2
p(X|s1 s2)p(s1 s2|D)
demoBattleshipsm
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
Stubby Fingers
int1 int2 int3 int4
hit1 hit2 hit3 hit4
int - intended keyhit ndash hit key
Stubby Fingers errors
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz
005
01
015
02
025
03
035
04
045
05
055
Stubby Fingers language
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz 0
01
02
03
04
05
06
07
08
09
Stubby Fingers
Given the typed sequence cwsykcak what is the most likely word that thiscorresponds to
List the 200 most likely hidden sequences
Discard those that are not in a standard English dictionary
Take the most likely proper English word as the intended typed word
Speech Recognition raw signal
0 01 02 03 04 05 06 07 08 09minus02
minus015
minus01
minus005
0
005
01
015
02
025
03
Time
lsquoneuralrsquo representation
10 20 30 40 50 60 70 80
5
10
15
20
25
Speech Recognition
pho1 pho2 pho3 pho4
aud1 aud2 aud3 aud4
pho phoneme (letter)aud audio signal (neural representation)
Medical Diagnosis
tumour flu meningitis
headache fever appetite x-ray
Combine known medical knowledge with patient specific information
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems such as the IsingModel (1920) and in AI applications such as the HMM (Baum 1966Stratonovich 1960)
The need for structure
We often want to make a probabilistic description of many objects (electronspins neurons customers etc )
Typically the representational and computational cost of probabilistic modelsgrows exponentially with the number of objects represented
Without introducing strong structural limitations about how these objects caninteract probability is a non-starter
For this reason computationally lsquosimplerrsquo alternatives (such as fuzzy logic)were introduced to try to avoid some of these difficulties ndash however these aretypically frowed on by purists
Graphical Models
We can use graphs to represent how objects can probabilistically interact witheach other
Graphical Models and then a marriage between Graph and Probability theory
Many of the quantities that we would like to compute in a probabilitydistribution can then be related to operations on the graph
The computational complexity of operations can often be related to thestructure of the graph
Graphical Models are now used as a standard framework in EngineeringStatistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty and aretherefore widely applicable
Uses in Industry
Microsoft used to estimate the skill distribution of players in online games(the worlds largest graphical model)
Hospitals use Belief Nets to encode knowledge about diseases and symptomsto aid medical diagnosis
Google Microsoft Facebook used in many places including advertisingvideo game prediction speech recognition
Used to estimate inherent desirability of products in consumer retail
Microsoft and others Attempt to go beyond simple AB testing by usesGraphical Models to model the whole companyuser relationship
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or more shortly theprobability of x given y) is defined as
p(x|y) equiv p(x y)
p(y)=p(y|x)p(x)
p(y)(Bayesrsquo rule)
Throwing darts
p(region 5|not region 20) =p(region 5 not region 20)
p(not region 20)
=p(region 5)
p(not region 20)=
120
1920=
1
19
Interpretationp(A = a|B = b) should not be interpreted as lsquoGiven the event B = b has occurredp(A = a|B = b) is the probability of the event A = a occurringrsquo The correctinterpretation should be lsquop(A = a|B = b) is the probability of A being in state aunder the constraint that B is in state brsquo
Battleships
Assume there are 2 ships 1 vertical (ship 1) and 1 horizontal (ship 2) of 5pixels each
Can be placed anywhere on the 10times10 grid but cannot overlap
Let s1 is the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query lsquohitrsquo or lsquomissrsquo responses
p(s1 s2|D) =p(D|s1 s2)p(s1 s2)
p(D)Let X be the matrix of pixel occupancy
p(X|D) =sums1s2
p(X s1 s2|D) =sums1s2
p(X|s1 s2)p(s1 s2|D)
demoBattleshipsm
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
Stubby Fingers errors
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz
005
01
015
02
025
03
035
04
045
05
055
Stubby Fingers language
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz 0
01
02
03
04
05
06
07
08
09
Stubby Fingers
Given the typed sequence cwsykcak what is the most likely word that thiscorresponds to
List the 200 most likely hidden sequences
Discard those that are not in a standard English dictionary
Take the most likely proper English word as the intended typed word
Speech Recognition raw signal
0 01 02 03 04 05 06 07 08 09minus02
minus015
minus01
minus005
0
005
01
015
02
025
03
Time
lsquoneuralrsquo representation
10 20 30 40 50 60 70 80
5
10
15
20
25
Speech Recognition
pho1 pho2 pho3 pho4
aud1 aud2 aud3 aud4
pho phoneme (letter)aud audio signal (neural representation)
Medical Diagnosis
tumour flu meningitis
headache fever appetite x-ray
Combine known medical knowledge with patient specific information
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems such as the IsingModel (1920) and in AI applications such as the HMM (Baum 1966Stratonovich 1960)
The need for structure
We often want to make a probabilistic description of many objects (electronspins neurons customers etc )
Typically the representational and computational cost of probabilistic modelsgrows exponentially with the number of objects represented
Without introducing strong structural limitations about how these objects caninteract probability is a non-starter
For this reason computationally lsquosimplerrsquo alternatives (such as fuzzy logic)were introduced to try to avoid some of these difficulties ndash however these aretypically frowed on by purists
Graphical Models
We can use graphs to represent how objects can probabilistically interact witheach other
Graphical Models and then a marriage between Graph and Probability theory
Many of the quantities that we would like to compute in a probabilitydistribution can then be related to operations on the graph
The computational complexity of operations can often be related to thestructure of the graph
Graphical Models are now used as a standard framework in EngineeringStatistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty and aretherefore widely applicable
Uses in Industry
Microsoft used to estimate the skill distribution of players in online games(the worlds largest graphical model)
Hospitals use Belief Nets to encode knowledge about diseases and symptomsto aid medical diagnosis
Google Microsoft Facebook used in many places including advertisingvideo game prediction speech recognition
Used to estimate inherent desirability of products in consumer retail
Microsoft and others Attempt to go beyond simple AB testing by usesGraphical Models to model the whole companyuser relationship
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or more shortly theprobability of x given y) is defined as
p(x|y) equiv p(x y)
p(y)=p(y|x)p(x)
p(y)(Bayesrsquo rule)
Throwing darts
p(region 5|not region 20) =p(region 5 not region 20)
p(not region 20)
=p(region 5)
p(not region 20)=
120
1920=
1
19
Interpretationp(A = a|B = b) should not be interpreted as lsquoGiven the event B = b has occurredp(A = a|B = b) is the probability of the event A = a occurringrsquo The correctinterpretation should be lsquop(A = a|B = b) is the probability of A being in state aunder the constraint that B is in state brsquo
Battleships
Assume there are 2 ships 1 vertical (ship 1) and 1 horizontal (ship 2) of 5pixels each
Can be placed anywhere on the 10times10 grid but cannot overlap
Let s1 is the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query lsquohitrsquo or lsquomissrsquo responses
p(s1 s2|D) =p(D|s1 s2)p(s1 s2)
p(D)Let X be the matrix of pixel occupancy
p(X|D) =sums1s2
p(X s1 s2|D) =sums1s2
p(X|s1 s2)p(s1 s2|D)
demoBattleshipsm
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
Stubby Fingers language
a b c d e f g h i j k l m n o p q r s t u v w x y z
abcdefghijkl
mnopqrstuvwxyz 0
01
02
03
04
05
06
07
08
09
Stubby Fingers
Given the typed sequence cwsykcak what is the most likely word that thiscorresponds to
List the 200 most likely hidden sequences
Discard those that are not in a standard English dictionary
Take the most likely proper English word as the intended typed word
Speech Recognition raw signal
0 01 02 03 04 05 06 07 08 09minus02
minus015
minus01
minus005
0
005
01
015
02
025
03
Time
lsquoneuralrsquo representation
10 20 30 40 50 60 70 80
5
10
15
20
25
Speech Recognition
pho1 pho2 pho3 pho4
aud1 aud2 aud3 aud4
pho phoneme (letter)aud audio signal (neural representation)
Medical Diagnosis
tumour flu meningitis
headache fever appetite x-ray
Combine known medical knowledge with patient specific information
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems such as the IsingModel (1920) and in AI applications such as the HMM (Baum 1966Stratonovich 1960)
The need for structure
We often want to make a probabilistic description of many objects (electronspins neurons customers etc )
Typically the representational and computational cost of probabilistic modelsgrows exponentially with the number of objects represented
Without introducing strong structural limitations about how these objects caninteract probability is a non-starter
For this reason computationally lsquosimplerrsquo alternatives (such as fuzzy logic)were introduced to try to avoid some of these difficulties ndash however these aretypically frowed on by purists
Graphical Models
We can use graphs to represent how objects can probabilistically interact witheach other
Graphical Models and then a marriage between Graph and Probability theory
Many of the quantities that we would like to compute in a probabilitydistribution can then be related to operations on the graph
The computational complexity of operations can often be related to thestructure of the graph
Graphical Models are now used as a standard framework in EngineeringStatistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty and aretherefore widely applicable
Uses in Industry
Microsoft used to estimate the skill distribution of players in online games(the worlds largest graphical model)
Hospitals use Belief Nets to encode knowledge about diseases and symptomsto aid medical diagnosis
Google Microsoft Facebook used in many places including advertisingvideo game prediction speech recognition
Used to estimate inherent desirability of products in consumer retail
Microsoft and others Attempt to go beyond simple AB testing by usesGraphical Models to model the whole companyuser relationship
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or more shortly theprobability of x given y) is defined as
p(x|y) equiv p(x y)
p(y)=p(y|x)p(x)
p(y)(Bayesrsquo rule)
Throwing darts
p(region 5|not region 20) =p(region 5 not region 20)
p(not region 20)
=p(region 5)
p(not region 20)=
120
1920=
1
19
Interpretationp(A = a|B = b) should not be interpreted as lsquoGiven the event B = b has occurredp(A = a|B = b) is the probability of the event A = a occurringrsquo The correctinterpretation should be lsquop(A = a|B = b) is the probability of A being in state aunder the constraint that B is in state brsquo
Battleships
Assume there are 2 ships 1 vertical (ship 1) and 1 horizontal (ship 2) of 5pixels each
Can be placed anywhere on the 10times10 grid but cannot overlap
Let s1 is the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query lsquohitrsquo or lsquomissrsquo responses
p(s1 s2|D) =p(D|s1 s2)p(s1 s2)
p(D)Let X be the matrix of pixel occupancy
p(X|D) =sums1s2
p(X s1 s2|D) =sums1s2
p(X|s1 s2)p(s1 s2|D)
demoBattleshipsm
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time:
p(xt = i) = Σ_j p(xt = i|xt−1 = j) p(xt−1 = j),  defining Mij ≡ p(xt = i|xt−1 = j)
p(xt = i) is the frequency that we visit state i at time t, given we started from p(x1) and randomly drew samples from the transition p(xτ|xτ−1). As we repeatedly sample a new state from the chain, the distribution at time t for an initial distribution p1(i) is
pt = M^{t−1} p1
If, for t → ∞, p∞ is independent of the initial distribution p1, then p∞ is called the equilibrium distribution of the chain:
p∞ = M p∞
The equilibrium distribution is proportional to the eigenvector with unit eigenvalue of the transition matrix.
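A minimal numerical check, reusing the made-up two-state chain from above: power iteration and the unit-eigenvalue eigenvector agree on p∞:

import numpy as np

M = np.array([[0.9, 0.5],
              [0.1, 0.5]])                 # columns sum to 1

p = np.array([1.0, 0.0])                   # any initial distribution p_1
for _ in range(100):
    p = M @ p                              # p_t = M^{t-1} p_1
print(p)                                   # ~ [0.833, 0.167]

vals, vecs = np.linalg.eig(M)
v = np.real(vecs[:, np.argmax(np.real(vals))])   # eigenvector for eigenvalue 1
print(v / v.sum())                         # the same equilibrium distribution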
PageRank
Define the matrix
Aij = 1 if website j has a hyperlink to website i, and 0 otherwise.
From this we can define a Markov transition matrix with elements
Mij = Aij / Σ_{i′} Ai′j
If we jump from website to website, the equilibrium distribution component p∞(i) is the relative number of times we will visit website i. This has a natural interpretation as the 'importance' of website i.
For each website i, a list of words associated with that website is collected. After doing this for all websites, one can make an 'inverse' list of which websites contain word w. When a user searches for word w, the list of websites containing that word is then returned, ranked according to the importance of the site.
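A minimal sketch with a hypothetical four-site link matrix (the production PageRank algorithm also adds a damping factor, omitted here):

import numpy as np

# Hypothetical links: A[i, j] = 1 if site j links to site i
A = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

M = A / A.sum(axis=0, keepdims=True)    # M_ij = A_ij / sum_i' A_i'j

p = np.full(4, 0.25)
for _ in range(200):                    # iterate towards p_inf = M p_inf
    p = M @ p
print(p)                                # the 'importance' of each site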
Hidden Markov Models
The HMM defines a Markov chain on hidden (or 'latent') variables h1:T. The observed (or 'visible') variables are dependent on the hidden variables through an emission p(vt|ht). This defines a joint distribution
p(h1:T, v1:T) = p(v1|h1) p(h1) ∏_{t=2}^T p(vt|ht) p(ht|ht−1)
For a stationary HMM, the transition p(ht|ht−1) and emission p(vt|ht) distributions are constant through time.
Figure: A first order hidden Markov model with 'hidden' variables dom(ht) = {1, ..., H}, t = 1, ..., T. The 'visible' variables vt can be either discrete or continuous.
The classical inference problems
Filtering (inferring the present): p(ht|v1:t)
Prediction (inferring the future): p(ht|v1:s), t > s
Smoothing (inferring the past): p(ht|v1:u), t < u
Likelihood: p(v1:T)
Most likely path (Viterbi alignment): argmax_{h1:T} p(h1:T|v1:T)
For prediction, one is also often interested in p(vt|v1:s) for t > s.
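As a concrete instance of the first problem, a minimal sketch of filtering (the forward recursion) for a small discrete HMM; the particular transition and emission tables here are made up:

import numpy as np

init = np.array([0.6, 0.3, 0.1])            # p(h_1)
trans = np.array([[0.7, 0.2, 0.1],
                  [0.2, 0.6, 0.2],
                  [0.1, 0.2, 0.7]])          # trans[i, j] = p(h_t = i | h_{t-1} = j)
emit = np.array([[0.9, 0.4, 0.1],
                 [0.1, 0.6, 0.9]])           # emit[v, h] = p(v_t = v | h_t = h)

def filtering(obs):
    """p(h_t | v_{1:t}) for each t, plus the log-likelihood log p(v_{1:T})."""
    alpha = emit[obs[0]] * init
    loglik = np.log(alpha.sum())
    alpha /= alpha.sum()
    posteriors = [alpha]
    for v in obs[1:]:
        alpha = emit[v] * (trans @ alpha)    # propagate, then weight by evidence
        loglik += np.log(alpha.sum())
        alpha /= alpha.sum()
        posteriors.append(alpha)
    return np.array(posteriors), loglik

posts, loglik = filtering([0, 0, 1, 1])
print(posts[-1], loglik)

The per-step normalisation keeps the recursion numerically stable and yields the likelihood as a by-product, which is why the cost is linear in T.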
Inference in Hidden Markov Models
Belief network representation of a HMM:
[Figure: chain h1 → h2 → h3 → h4, with emissions ht → vt]
Filtering, Smoothing and Viterbi are all computationally efficient, scaling linearly with the length of the timeseries (but quadratically with the number of hidden states).
The algorithms are variants of 'message passing on factor graphs'.
The algorithms are guaranteed to work if the graph is singly-connected.
Huge research effort in the last 15 years to apply message passing for approximate inference in multiply-connected graphs (e.g. low-density parity-check codes).
HMMs for speech recognition
ht is the phoneme at time t; p(ht|ht−1) – language model; p(vt|ht) – speech signal model.
Deep Nets and HMMs
[Figure: HMM chain h1, ..., h4 with emissions v1, ..., v4]
Recently, companies including Google have made big advances in speech recognition.
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is some function of the phoneme, μ(ht; θ).
This function is a deep neural network, trained on a large amount of data.
Goldrush at the moment to find similar breakthrough applications of deep networks in reasoning systems.
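A minimal sketch of such an emission model, with a tiny random two-layer network standing in for the deep net μ(h; θ); all sizes and weights here are illustrative, not the trained systems referred to above:

import numpy as np

rng = np.random.default_rng(0)
H, D = 10, 4                                  # number of phonemes, feature dim
W1 = rng.normal(size=(16, H))                 # theta: weights of a 2-layer net
W2 = rng.normal(size=(D, 16))

def mu(h):
    return W2 @ np.tanh(W1 @ np.eye(H)[h])    # mean as a function of phoneme h

def log_emission(v, h):                       # log N(v; mu(h), I)
    r = v - mu(h)
    return -0.5 * (r @ r + D * np.log(2 * np.pi))

v = rng.normal(size=D)                        # a dummy acoustic feature vector
print(max(range(H), key=lambda h: log_emission(v, h)))  # most likely phoneme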
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
[Figure: belief network with latent variables h1, h2 as parents of pixels v1, ..., v4]
It is natural to consider that objects (images, for example) can be constructed on the basis of a low dimensional representation.
Note that this is a Graphical Model, not a Function.
The latent variables h can be sampled from using p(h), and then an image sampled from p(v|h).
One cannot use an autoencoder to generate new images.
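A minimal sketch of this two-step ('ancestral') sampling, for a toy model with a hypothetical weight matrix W and Bernoulli pixels:

import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 2))                  # hypothetical model parameters

h = rng.standard_normal(2)                   # h ~ p(h) = N(0, I)
pix = 1 / (1 + np.exp(-W @ h))               # p(v_i = 1 | h)
v = rng.binomial(1, pix)                     # v ~ p(v|h), a binary 'image'
print(h, v)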
The bad news
Inference (computing p(h|v)) and parameter learning are intractable in these models.
Statisticians typically use sampling as an approximation.
Very popular in ML to use a variational method – much faster for inference.
Variational Inference
Consider a distribution
p(v|θ) = ∫_h p(v|h, θ) p(h)
and suppose that we wish to learn θ to maximise the probability that this model generates the observed data. The variational lower bound is
log p(v|θ) ≥ −∫_h q(h|v, φ) log q(h|v, φ) + ∫_h q(h|v, φ) log [p(v|h, θ) p(h)]
The idea is to choose a 'variational' distribution q(h|v, φ) such that we can either calculate the bound analytically or sample it efficiently.
We then jointly maximise the bound w.r.t. φ and θ.
We can parameterise p(v|h, θ) using a deep network.
Very popular approach – see the 'variational autoencoder', and also attention mechanisms.
Extension to a semi-supervised method, using p(v) = ∫_h Σ_c p(v|h, c) p(c) p(h).
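A minimal sketch of estimating the bound by sampling, for a toy model p(h) = N(0, 1), p(v|h, θ) = N(θh, 1) and a Gaussian q(h|v, φ) with φ = (m, log s); all of these modelling choices are illustrative:

import numpy as np

def elbo_estimate(v, theta, m, log_s, n=10000, seed=0):
    rng = np.random.default_rng(seed)
    s = np.exp(log_s)
    h = m + s * rng.standard_normal(n)            # reparameterised draws from q
    log_p_v_h = -0.5 * ((v - theta * h) ** 2 + np.log(2 * np.pi))
    log_p_h = -0.5 * (h ** 2 + np.log(2 * np.pi))
    log_q = -0.5 * (((h - m) / s) ** 2 + np.log(2 * np.pi)) - log_s
    return np.mean(log_p_v_h + log_p_h - log_q)   # Monte-Carlo lower bound

print(elbo_estimate(v=1.3, theta=0.8, m=0.6, log_s=-0.5))

In practice one would follow gradients of this estimate w.r.t. (θ, m, log s), which is exactly the joint maximisation over θ and φ described above.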
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games?
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A, we need to decide which action to take, for any state of W, that will be best for our long-term goals.
The problem is that the number of pixel states is enormous.
Need to learn a low dimensional representation of the screen (use a deep generative model).
Then learn which action to take, given the low dimensional representation.
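A minimal sketch of the 'learn which action' step, using tabular Q-learning on a made-up five-state world whose states stand in for the low dimensional representation:

import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 5, 2                 # actions: 0 = left, 1 = right
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.1, 0.9, 0.1

for episode in range(500):
    s = 0
    while s != n_states - 1:               # rightmost state: reward 1, episode ends
        greedy = int(Q[s].argmax())
        a = int(rng.integers(n_actions)) if rng.random() < eps else greedy
        s_next = max(0, s - 1) if a == 0 else s + 1
        r = 1.0 if s_next == n_states - 1 else 0.0
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next
print(Q)                                   # 'move right' emerges as the best action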
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition, Image Analysis, Game Playing.
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representation learning.
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company, reinfer:
https://reinfer.io
Stubby Fingers
Given the typed sequence 'cwsykcak', what is the most likely word that this corresponds to?
List the 200 most likely hidden sequences.
Discard those that are not in a standard English dictionary.
Take the most likely proper English word as the intended typed word.
Speech Recognition: raw signal
[Figure: raw audio waveform, amplitude against Time (s)]
'neural' representation
[Figure: time-frequency 'neural' representation of the same signal]
Speech Recognition
[Figure: HMM with phoneme chain pho1 → pho2 → pho3 → pho4 emitting audio aud1, aud2, aud3, aud4]
pho: phoneme (letter); aud: audio signal (neural representation)
Medical Diagnosis
[Figure: belief network with diseases (tumour, flu, meningitis) as parents of symptoms and tests (headache, fever, appetite, x-ray)]
Combine known medical knowledge with patient-specific information.
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems, such as the Ising Model (1920), and in AI applications, such as the HMM (Baum 1966; Stratonovich 1960).
The need for structure
We often want to make a probabilistic description of many objects (electron spins, neurons, customers, etc.).
Typically the representational and computational cost of probabilistic models grows exponentially with the number of objects represented.
Without introducing strong structural limitations about how these objects can interact, probability is a non-starter.
For this reason, computationally 'simpler' alternatives (such as fuzzy logic) were introduced to try to avoid some of these difficulties – however, these are typically frowned upon by purists.
Graphical Models
We can use graphs to represent how objects can probabilistically interact with each other.
Graphical Models are then a marriage between Graph and Probability theory.
Many of the quantities that we would like to compute in a probability distribution can then be related to operations on the graph.
The computational complexity of operations can often be related to the structure of the graph.
Graphical Models are now used as a standard framework in Engineering, Statistics and Computer Science.
Graphical Models are used to perform reasoning under uncertainty, and are therefore widely applicable.
Uses in Industry
Microsoft: used to estimate the skill distribution of players in online games (the world's largest graphical model).
Hospitals: use Belief Nets to encode knowledge about diseases and symptoms, to aid medical diagnosis.
Google, Microsoft, Facebook: used in many places, including advertising, video game prediction, speech recognition.
Used to estimate the inherent desirability of products in consumer retail.
Microsoft and others: attempts to go beyond simple A/B testing by using Graphical Models to model the whole company/user relationship.
Conditional Probability and Bayes' Rule
The probability of event x conditioned on knowing event y (or, more shortly, the probability of x given y) is defined as
p(x|y) ≡ p(x, y) / p(y) = p(y|x) p(x) / p(y)   (Bayes' rule)
Throwing darts:
p(region 5 | not region 20) = p(region 5, not region 20) / p(not region 20)
= p(region 5) / p(not region 20)
= (1/20) / (19/20)
= 1/19
Interpretation
p(A = a|B = b) should not be interpreted as 'Given the event B = b has occurred, p(A = a|B = b) is the probability of the event A = a occurring'. The correct interpretation should be 'p(A = a|B = b) is the probability of A being in state a under the constraint that B is in state b'.
Battleships
Assume there are 2 ships, 1 vertical (ship 1) and 1 horizontal (ship 2), of 5 pixels each.
Each can be placed anywhere on the 10×10 grid, but they cannot overlap.
Let s1 be the origin of ship 1 and s2 the origin of ship 2.
The data D is a collection of query 'hit' or 'miss' responses.
p(s1, s2|D) = p(D|s1, s2) p(s1, s2) / p(D)
Let X be the matrix of pixel occupancy:
p(X|D) = Σ_{s1,s2} p(X, s1, s2|D) = Σ_{s1,s2} p(X|s1, s2) p(s1, s2|D)
demoBattleships.m
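In the spirit of demoBattleships.m, a minimal Python sketch: enumerate all legal placements, keep those consistent with the (here hypothetical) hit/miss data, and average the occupancy maps:

import numpy as np
from itertools import product

def cells(origin, vertical):
    r, c = origin
    return {(r + k, c) if vertical else (r, c + k) for k in range(5)}

vert = list(product(range(6), range(10)))      # origins keeping ship 1 on the grid
horz = list(product(range(10), range(6)))      # origins keeping ship 2 on the grid

D = [((4, 4), True), ((0, 0), False)]          # hypothetical 'hit'/'miss' data

post = np.zeros((10, 10))                      # accumulates p(X | D)
n_consistent = 0
for s1 in vert:
    occ1 = cells(s1, True)
    for s2 in horz:
        occ2 = cells(s2, False)
        if occ1 & occ2:                        # ships cannot overlap
            continue
        occ = occ1 | occ2
        if all((pix in occ) == hit for pix, hit in D):   # p(D|s1,s2) is 0 or 1
            n_consistent += 1                  # uniform prior over placements
            for r, c in occ:
                post[r, c] += 1.0
print(post / n_consistent)                     # marginal pixel-occupancy posterior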
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
Speech Recognition raw signal
0 01 02 03 04 05 06 07 08 09minus02
minus015
minus01
minus005
0
005
01
015
02
025
03
Time
lsquoneuralrsquo representation
10 20 30 40 50 60 70 80
5
10
15
20
25
Speech Recognition
pho1 pho2 pho3 pho4
aud1 aud2 aud3 aud4
pho phoneme (letter)aud audio signal (neural representation)
Medical Diagnosis
tumour flu meningitis
headache fever appetite x-ray
Combine known medical knowledge with patient specific information
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems such as the IsingModel (1920) and in AI applications such as the HMM (Baum 1966Stratonovich 1960)
The need for structure
We often want to make a probabilistic description of many objects (electronspins neurons customers etc )
Typically the representational and computational cost of probabilistic modelsgrows exponentially with the number of objects represented
Without introducing strong structural limitations about how these objects caninteract probability is a non-starter
For this reason computationally lsquosimplerrsquo alternatives (such as fuzzy logic)were introduced to try to avoid some of these difficulties ndash however these aretypically frowed on by purists
Graphical Models
We can use graphs to represent how objects can probabilistically interact witheach other
Graphical Models and then a marriage between Graph and Probability theory
Many of the quantities that we would like to compute in a probabilitydistribution can then be related to operations on the graph
The computational complexity of operations can often be related to thestructure of the graph
Graphical Models are now used as a standard framework in EngineeringStatistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty and aretherefore widely applicable
Uses in Industry
Microsoft used to estimate the skill distribution of players in online games(the worlds largest graphical model)
Hospitals use Belief Nets to encode knowledge about diseases and symptomsto aid medical diagnosis
Google Microsoft Facebook used in many places including advertisingvideo game prediction speech recognition
Used to estimate inherent desirability of products in consumer retail
Microsoft and others Attempt to go beyond simple AB testing by usesGraphical Models to model the whole companyuser relationship
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or more shortly theprobability of x given y) is defined as
p(x|y) equiv p(x y)
p(y)=p(y|x)p(x)
p(y)(Bayesrsquo rule)
Throwing darts
p(region 5|not region 20) =p(region 5 not region 20)
p(not region 20)
=p(region 5)
p(not region 20)=
120
1920=
1
19
Interpretationp(A = a|B = b) should not be interpreted as lsquoGiven the event B = b has occurredp(A = a|B = b) is the probability of the event A = a occurringrsquo The correctinterpretation should be lsquop(A = a|B = b) is the probability of A being in state aunder the constraint that B is in state brsquo
Battleships
Assume there are 2 ships 1 vertical (ship 1) and 1 horizontal (ship 2) of 5pixels each
Can be placed anywhere on the 10times10 grid but cannot overlap
Let s1 is the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query lsquohitrsquo or lsquomissrsquo responses
p(s1 s2|D) =p(D|s1 s2)p(s1 s2)
p(D)Let X be the matrix of pixel occupancy
p(X|D) =sums1s2
p(X s1 s2|D) =sums1s2
p(X|s1 s2)p(s1 s2|D)
demoBattleshipsm
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
lsquoneuralrsquo representation
10 20 30 40 50 60 70 80
5
10
15
20
25
Speech Recognition
pho1 pho2 pho3 pho4
aud1 aud2 aud3 aud4
pho phoneme (letter)aud audio signal (neural representation)
Medical Diagnosis
tumour flu meningitis
headache fever appetite x-ray
Combine known medical knowledge with patient specific information
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems such as the IsingModel (1920) and in AI applications such as the HMM (Baum 1966Stratonovich 1960)
The need for structure
We often want to make a probabilistic description of many objects (electronspins neurons customers etc )
Typically the representational and computational cost of probabilistic modelsgrows exponentially with the number of objects represented
Without introducing strong structural limitations about how these objects caninteract probability is a non-starter
For this reason computationally lsquosimplerrsquo alternatives (such as fuzzy logic)were introduced to try to avoid some of these difficulties ndash however these aretypically frowed on by purists
Graphical Models
We can use graphs to represent how objects can probabilistically interact witheach other
Graphical Models and then a marriage between Graph and Probability theory
Many of the quantities that we would like to compute in a probabilitydistribution can then be related to operations on the graph
The computational complexity of operations can often be related to thestructure of the graph
Graphical Models are now used as a standard framework in EngineeringStatistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty and aretherefore widely applicable
Uses in Industry
Microsoft used to estimate the skill distribution of players in online games(the worlds largest graphical model)
Hospitals use Belief Nets to encode knowledge about diseases and symptomsto aid medical diagnosis
Google Microsoft Facebook used in many places including advertisingvideo game prediction speech recognition
Used to estimate inherent desirability of products in consumer retail
Microsoft and others Attempt to go beyond simple AB testing by usesGraphical Models to model the whole companyuser relationship
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or more shortly theprobability of x given y) is defined as
p(x|y) equiv p(x y)
p(y)=p(y|x)p(x)
p(y)(Bayesrsquo rule)
Throwing darts
p(region 5|not region 20) =p(region 5 not region 20)
p(not region 20)
=p(region 5)
p(not region 20)=
120
1920=
1
19
Interpretationp(A = a|B = b) should not be interpreted as lsquoGiven the event B = b has occurredp(A = a|B = b) is the probability of the event A = a occurringrsquo The correctinterpretation should be lsquop(A = a|B = b) is the probability of A being in state aunder the constraint that B is in state brsquo
Battleships
Assume there are 2 ships 1 vertical (ship 1) and 1 horizontal (ship 2) of 5pixels each
Can be placed anywhere on the 10times10 grid but cannot overlap
Let s1 is the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query lsquohitrsquo or lsquomissrsquo responses
p(s1 s2|D) =p(D|s1 s2)p(s1 s2)
p(D)Let X be the matrix of pixel occupancy
p(X|D) =sums1s2
p(X s1 s2|D) =sums1s2
p(X|s1 s2)p(s1 s2|D)
demoBattleshipsm
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
Speech Recognition
pho1 pho2 pho3 pho4
aud1 aud2 aud3 aud4
pho phoneme (letter)aud audio signal (neural representation)
Medical Diagnosis
tumour flu meningitis
headache fever appetite x-ray
Combine known medical knowledge with patient specific information
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems such as the IsingModel (1920) and in AI applications such as the HMM (Baum 1966Stratonovich 1960)
The need for structure
We often want to make a probabilistic description of many objects (electronspins neurons customers etc )
Typically the representational and computational cost of probabilistic modelsgrows exponentially with the number of objects represented
Without introducing strong structural limitations about how these objects caninteract probability is a non-starter
For this reason computationally lsquosimplerrsquo alternatives (such as fuzzy logic)were introduced to try to avoid some of these difficulties ndash however these aretypically frowed on by purists
Graphical Models
We can use graphs to represent how objects can probabilistically interact witheach other
Graphical Models and then a marriage between Graph and Probability theory
Many of the quantities that we would like to compute in a probabilitydistribution can then be related to operations on the graph
The computational complexity of operations can often be related to thestructure of the graph
Graphical Models are now used as a standard framework in EngineeringStatistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty and aretherefore widely applicable
Uses in Industry
Microsoft used to estimate the skill distribution of players in online games(the worlds largest graphical model)
Hospitals use Belief Nets to encode knowledge about diseases and symptomsto aid medical diagnosis
Google Microsoft Facebook used in many places including advertisingvideo game prediction speech recognition
Used to estimate inherent desirability of products in consumer retail
Microsoft and others Attempt to go beyond simple AB testing by usesGraphical Models to model the whole companyuser relationship
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or more shortly theprobability of x given y) is defined as
p(x|y) equiv p(x y)
p(y)=p(y|x)p(x)
p(y)(Bayesrsquo rule)
Throwing darts
p(region 5|not region 20) =p(region 5 not region 20)
p(not region 20)
=p(region 5)
p(not region 20)=
120
1920=
1
19
Interpretationp(A = a|B = b) should not be interpreted as lsquoGiven the event B = b has occurredp(A = a|B = b) is the probability of the event A = a occurringrsquo The correctinterpretation should be lsquop(A = a|B = b) is the probability of A being in state aunder the constraint that B is in state brsquo
Battleships
Assume there are 2 ships 1 vertical (ship 1) and 1 horizontal (ship 2) of 5pixels each
Can be placed anywhere on the 10times10 grid but cannot overlap
Let s1 is the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query lsquohitrsquo or lsquomissrsquo responses
p(s1 s2|D) =p(D|s1 s2)p(s1 s2)
p(D)Let X be the matrix of pixel occupancy
p(X|D) =sums1s2
p(X s1 s2|D) =sums1s2
p(X|s1 s2)p(s1 s2|D)
demoBattleshipsm
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images, for example) can be constructed on the basis of a low dimensional representation.
Note that this is a Graphical Model, not a Function.
The latent variables h can be sampled from using p(h), and then an image sampled from p(v|h). One cannot use an autoencoder to generate new images.
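For concreteness, ancestral sampling in a deliberately simple stand-in model (a linear-Gaussian p(h) and p(v|h); a real image model would use a deep network for the mean):

    import numpy as np

    rng = np.random.default_rng(1)
    h_dim, v_dim, sigma = 2, 4, 0.1
    W = rng.normal(size=(v_dim, h_dim))          # invented model parameters

    h = rng.normal(size=h_dim)                   # sample the latent: h ~ p(h) = N(0, I)
    v = W @ h + sigma * rng.normal(size=v_dim)   # then the 'image': v ~ p(v|h)
    print(v)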
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in these models.
Statisticians typically use sampling as an approximation.
Very popular in ML to use a variational method – much faster for inference.
Variational Inference
Consider a distribution

p(v|θ) = ∫_h p(v|h, θ) p(h)

and that we wish to learn θ to maximise the probability this model generates observed data:

log p(v|θ) ≥ −∫_h q(h|v, φ) log q(h|v, φ) + ∫_h q(h|v, φ) log p(v|h, θ) + const

Idea is to choose a 'variational' distribution q(h|v, φ) such that we can either calculate the bound analytically or sample it efficiently.
We then jointly maximise the bound w.r.t. φ and θ.
We can parameterise p(v|h, θ) using a deep network.
Very popular approach – see 'variational autoencoder' and also attention mechanisms.
Extension to semi-supervised method using p(v) = ∫_h Σ_c p(v|h, c) p(c) p(h).
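A Monte Carlo sketch of the bound for the toy linear-Gaussian model above, with a Gaussian q(h|v, φ) (here μ_q and log σ_q stand in for the variational parameters; in a variational autoencoder they would be outputs of a second network and optimised by gradient ascent):

    import numpy as np

    rng = np.random.default_rng(2)
    h_dim, v_dim, s = 2, 4, 0.1
    W = rng.normal(size=(v_dim, h_dim))   # invented model parameters
    v = rng.normal(size=v_dim)            # a stand-in 'observed' data point

    def bound_estimate(mu_q, log_std_q, n=1000):
        # E_q[log p(v|h) + log p(h) - log q(h|v)], estimated with n samples from q.
        std = np.exp(log_std_q)
        h = mu_q + std * rng.normal(size=(n, h_dim))
        log_q = -0.5 * np.sum(((h - mu_q) / std) ** 2 + 2 * log_std_q
                              + np.log(2 * np.pi), axis=1)
        log_ph = -0.5 * np.sum(h ** 2 + np.log(2 * np.pi), axis=1)
        r = v - h @ W.T
        log_pvh = (-0.5 * np.sum(r ** 2, axis=1) / s ** 2
                   - 0.5 * v_dim * np.log(2 * np.pi * s ** 2))
        return np.mean(log_pvh + log_ph - log_q)

    print(bound_estimate(np.zeros(h_dim), np.zeros(h_dim)))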
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games?
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A, we need to decide which action to take in any state of W that will be best for our long-term goals.
Problem is that the number of pixel states is enormous.
Need to learn a low dimensional representation of the screen (use a deep generative model).
Then learn which action to take given the low dimensional representation (a minimal sketch follows).
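A tabular Q-learning sketch on an invented 5-state corridor; deep reinforcement learning replaces the table with a network acting on the learned screen representation:

    import numpy as np

    rng = np.random.default_rng(3)
    n_states, n_actions = 5, 2       # actions: 0 = left, 1 = right; state 4 = goal
    Q = np.zeros((n_states, n_actions))
    alpha, gamma, eps = 0.1, 0.9, 0.1

    for episode in range(500):
        s = 0
        while s != 4:
            # epsilon-greedy action choice
            a = rng.integers(n_actions) if rng.random() < eps else int(np.argmax(Q[s]))
            s2 = max(s - 1, 0) if a == 0 else s + 1
            r = 1.0 if s2 == 4 else 0.0
            # move Q(s, a) towards r + gamma * max_a' Q(s', a')
            Q[s, a] += alpha * (r + gamma * np.max(Q[s2]) - Q[s, a])
            s = s2

    print(Q)   # the 'right' column dominates: always head towards the goal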
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition, Image Analysis, Game Playing.
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representation learning.
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
https://reinfer.io
Medical Diagnosis
Figure: a belief network relating diseases (tumour, flu, meningitis) to findings (headache, fever, appetite, x-ray).
Combine known medical knowledge with patient specific information
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems such as the Ising Model (1920) and in AI applications such as the HMM (Baum 1966; Stratonovich 1960).
The need for structure
We often want to make a probabilistic description of many objects (electron spins, neurons, customers, etc.).
Typically the representational and computational cost of probabilistic models grows exponentially with the number of objects represented.
Without introducing strong structural limitations about how these objects can interact, probability is a non-starter.
For this reason computationally 'simpler' alternatives (such as fuzzy logic) were introduced to try to avoid some of these difficulties – however, these are typically frowned upon by purists.
Graphical Models
We can use graphs to represent how objects can probabilistically interact with each other.
Graphical Models are then a marriage between Graph and Probability theory.
Many of the quantities that we would like to compute in a probability distribution can then be related to operations on the graph.
The computational complexity of operations can often be related to the structure of the graph.
Graphical Models are now used as a standard framework in Engineering, Statistics and Computer Science.
Graphical Models are used to perform reasoning under uncertainty and are therefore widely applicable.
Uses in Industry
Microsoft: used to estimate the skill distribution of players in online games (the world's largest graphical model).
Hospitals use Belief Nets to encode knowledge about diseases and symptoms to aid medical diagnosis.
Google, Microsoft, Facebook: used in many places, including advertising, video game prediction, speech recognition.
Used to estimate the inherent desirability of products in consumer retail.
Microsoft and others: attempts to go beyond simple A/B testing by using Graphical Models to model the whole company/user relationship.
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or, more shortly, the probability of x given y) is defined as

p(x|y) ≡ p(x, y)/p(y) = p(y|x) p(x)/p(y)   (Bayes' rule)
Throwing darts
p(region 5|not region 20) = p(region 5, not region 20)/p(not region 20)
= p(region 5)/p(not region 20)
= (1/20)/(19/20)
= 1/19
Interpretation
p(A = a|B = b) should not be interpreted as 'given the event B = b has occurred, p(A = a|B = b) is the probability of the event A = a occurring'. The correct interpretation should be 'p(A = a|B = b) is the probability of A being in state a under the constraint that B is in state b'.
Battleships
Assume there are 2 ships, 1 vertical (ship 1) and 1 horizontal (ship 2), of 5 pixels each.
Each can be placed anywhere on the 10×10 grid, but they cannot overlap.
Let s1 be the origin of ship 1 and s2 the origin of ship 2.
Data D is a collection of query 'hit' or 'miss' responses.
p(s1, s2|D) = p(D|s1, s2) p(s1, s2)/p(D)

Let X be the matrix of pixel occupancy:

p(X|D) = Σ_{s1,s2} p(X, s1, s2|D) = Σ_{s1,s2} p(X|s1, s2) p(s1, s2|D)

demoBattleships.m
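The original demo is MATLAB (demoBattleships.m); a Python sketch of the same posterior computation, with an invented set of query responses D:

    import numpy as np
    from itertools import product

    def cells(s, vertical):
        # The 5 pixels occupied by a ship with origin s = (row, col).
        r, c = s
        return [(r + k, c) for k in range(5)] if vertical else [(r, c + k) for k in range(5)]

    placements1 = list(product(range(6), range(10)))   # vertical ship origins
    placements2 = list(product(range(10), range(6)))   # horizontal ship origins
    D = {(4, 4): True, (0, 0): False}                  # invented hit/miss data

    # Uniform prior over non-overlapping placements; keep those consistent with D.
    consistent = []
    for s1 in placements1:
        for s2 in placements2:
            occ = set(cells(s1, True)) | set(cells(s2, False))
            if len(occ) == 10 and all((pix in occ) == hit for pix, hit in D.items()):
                consistent.append(occ)

    # Marginal occupancy p(X_ij = 1 | D): average occupancy over the posterior.
    pX = np.zeros((10, 10))
    for occ in consistent:
        for pix in occ:
            pX[pix] += 1 / len(consistent)
    print(np.round(pX, 2))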
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated the conditional probability of the node given its parents.
The joint distribution is obtained by taking the product of the conditional probabilities:

p(A, B, C, D, E) = p(A) p(B) p(C|A, B) p(D|C) p(E|B, C)

Figure: the corresponding DAG, with edges A→C, B→C, C→D, B→E, C→E.
Example – Part I
Sally's burglar Alarm is sounding. Has she been Burgled, or was the alarm triggered by an Earthquake? She turns the car Radio on for news of earthquakes.
Choosing an ordering
Without loss of generality, we can write

p(A, R, E, B) = p(A|R, E, B) p(R, E, B)
= p(A|R, E, B) p(R|E, B) p(E, B)
= p(A|R, E, B) p(R|E, B) p(E|B) p(B)
Assumptions
The alarm is not directly influenced by any report on the radio: p(A|R, E, B) = p(A|E, B)
The radio broadcast is not directly influenced by the burglar variable: p(R|E, B) = p(R|E)
Burglaries don't directly 'cause' earthquakes: p(E|B) = p(E)
Therefore
p(A, R, E, B) = p(A|E, B) p(R|E) p(E) p(B)
Example – Part II: Specifying the Tables

Figure: belief network with edges B→A, E→A, E→R.

p(A = 1|B, E):
B = 1, E = 1: 0.9999
B = 1, E = 0: 0.99
B = 0, E = 1: 0.99
B = 0, E = 0: 0.0001

p(R = 1|E): 1 if E = 1, 0 if E = 0.

The remaining tables are p(B = 1) = 0.01 and p(E = 1) = 0.000001. The tables and graphical structure fully specify the distribution.
Example – Part III: Inference
Initial Evidence: the alarm is sounding.
p(B = 1|A = 1) = Σ_{E,R} p(B = 1, E, A = 1, R) / Σ_{B,E,R} p(B, E, A = 1, R)
= Σ_{E,R} p(A = 1|B = 1, E) p(B = 1) p(E) p(R|E) / Σ_{B,E,R} p(A = 1|B, E) p(B) p(E) p(R|E) ≈ 0.99
Additional Evidence: the radio broadcasts an earthquake warning.
A similar calculation gives p(B = 1|A = 1, R = 1) ≈ 0.01.
Initially, because the alarm sounds, Sally thinks that she's been burgled. However, this probability drops dramatically when she hears that there has been an earthquake.
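These two posteriors can be checked by brute-force enumeration; a sketch using the tables above:

    from itertools import product

    # Tables from the slides: p(A=1|B,E), p(R=1|E), p(B=1), p(E=1).
    pA1 = {(1, 1): 0.9999, (1, 0): 0.99, (0, 1): 0.99, (0, 0): 0.0001}
    pR1 = {1: 1.0, 0: 0.0}
    pB1, pE1 = 0.01, 0.000001

    def joint(a, r, e, b):
        # p(A=a, R=r, E=e, B=b) = p(A|E,B) p(R|E) p(E) p(B)
        pa = pA1[(b, e)] if a else 1 - pA1[(b, e)]
        pr = pR1[e] if r else 1 - pR1[e]
        pe = pE1 if e else 1 - pE1
        pb = pB1 if b else 1 - pB1
        return pa * pr * pe * pb

    def p_burglar(a, r=None):
        # p(B=1 | A=a[, R=r]) by enumerating the remaining variables.
        num = den = 0.0
        for rr, e, b in product([0, 1], repeat=3):
            if r is not None and rr != r:
                continue
            p = joint(a, rr, e, b)
            den += p
            num += p if b else 0.0
        return num / den

    print(p_burglar(a=1))        # ~ 0.99
    print(p_burglar(a=1, r=1))   # ~ 0.01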
Markov Models
For timeseries data v_1, …, v_T we need a model p(v_{1:T}). For causal consistency it is meaningful to consider the decomposition

p(v_{1:T}) = ∏_{t=1}^T p(v_t|v_{1:t−1})

with the convention p(v_t|v_{1:t−1}) = p(v_1) for t = 1.
v1 v2 v3 v4
Independence assumptions
It is often natural to assume that the influence of the immediate past is more relevant than the remote past, and in Markov models only a limited number of previous observations are required to predict the future.
Markov Chain
Only the recent past is relevant:

p(v_t|v_1, …, v_{t−1}) = p(v_t|v_{t−L}, …, v_{t−1})

where L ≥ 1 is the order of the Markov chain. For L = 1:

p(v_{1:T}) = p(v_1) p(v_2|v_1) p(v_3|v_2) ⋯ p(v_T|v_{T−1})

For a stationary Markov chain the transitions p(v_t = s′|v_{t−1} = s) = f(s′, s) are time-independent ('homogeneous').
Figure: (a) first order Markov chain; (b) second order Markov chain.
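A short sketch of drawing a trajectory from a homogeneous first order chain (the 2-state transition matrix is invented):

    import numpy as np

    rng = np.random.default_rng(4)
    M = np.array([[0.9, 0.2],    # M[s2, s1] = p(v_t = s2 | v_{t-1} = s1)
                  [0.1, 0.8]])
    p1 = np.array([0.5, 0.5])

    v = [rng.choice(2, p=p1)]                    # v_1 ~ p(v_1)
    for t in range(1, 20):
        v.append(rng.choice(2, p=M[:, v[-1]]))   # v_t ~ p(v_t | v_{t-1})
    print(v)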
Markov Chains
v1 v2 v3 v4
p(v_1, …, v_T) = p(v_1) ∏_{t=2}^T p(v_t|v_{t−1})

where p(v_1) is the initial distribution and p(v_t|v_{t−1}) the transition.
State transition diagram
Nodes represent states of the variable v, and arcs the non-zero elements of the transition p(v_t|v_{t−1}).
Figure: state transition diagram on states 1, …, 9.
Most probable and shortest paths
Figure: the same state transition diagram on states 1, …, 9.
The shortest (unweighted) path from state 1 to state 7 is 1−2−7.
The most probable path from state 1 to state 7 is 1−8−9−7 (assuming uniform transition probabilities). The latter path is longer but more probable, since for the path 1−2−7 the probability of exiting state 2 into state 7 is 1/5.
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems such as the IsingModel (1920) and in AI applications such as the HMM (Baum 1966Stratonovich 1960)
The need for structure
We often want to make a probabilistic description of many objects (electronspins neurons customers etc )
Typically the representational and computational cost of probabilistic modelsgrows exponentially with the number of objects represented
Without introducing strong structural limitations about how these objects caninteract probability is a non-starter
For this reason computationally lsquosimplerrsquo alternatives (such as fuzzy logic)were introduced to try to avoid some of these difficulties ndash however these aretypically frowed on by purists
Graphical Models
We can use graphs to represent how objects can probabilistically interact witheach other
Graphical Models and then a marriage between Graph and Probability theory
Many of the quantities that we would like to compute in a probabilitydistribution can then be related to operations on the graph
The computational complexity of operations can often be related to thestructure of the graph
Graphical Models are now used as a standard framework in EngineeringStatistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty and aretherefore widely applicable
Uses in Industry
Microsoft used to estimate the skill distribution of players in online games(the worlds largest graphical model)
Hospitals use Belief Nets to encode knowledge about diseases and symptomsto aid medical diagnosis
Google Microsoft Facebook used in many places including advertisingvideo game prediction speech recognition
Used to estimate inherent desirability of products in consumer retail
Microsoft and others Attempt to go beyond simple AB testing by usesGraphical Models to model the whole companyuser relationship
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or more shortly theprobability of x given y) is defined as
p(x|y) equiv p(x y)
p(y)=p(y|x)p(x)
p(y)(Bayesrsquo rule)
Throwing darts
p(region 5|not region 20) =p(region 5 not region 20)
p(not region 20)
=p(region 5)
p(not region 20)=
120
1920=
1
19
Interpretationp(A = a|B = b) should not be interpreted as lsquoGiven the event B = b has occurredp(A = a|B = b) is the probability of the event A = a occurringrsquo The correctinterpretation should be lsquop(A = a|B = b) is the probability of A being in state aunder the constraint that B is in state brsquo
Battleships
Assume there are 2 ships 1 vertical (ship 1) and 1 horizontal (ship 2) of 5pixels each
Can be placed anywhere on the 10times10 grid but cannot overlap
Let s1 is the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query lsquohitrsquo or lsquomissrsquo responses
p(s1 s2|D) =p(D|s1 s2)p(s1 s2)
p(D)Let X be the matrix of pixel occupancy
p(X|D) =sums1s2
p(X s1 s2|D) =sums1s2
p(X|s1 s2)p(s1 s2|D)
demoBattleshipsm
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
Probability
Why Probability
Probability is a logical calculus of uncertainty
Natural framework to use in models of physical systems such as the IsingModel (1920) and in AI applications such as the HMM (Baum 1966Stratonovich 1960)
The need for structure
We often want to make a probabilistic description of many objects (electronspins neurons customers etc )
Typically the representational and computational cost of probabilistic modelsgrows exponentially with the number of objects represented
Without introducing strong structural limitations about how these objects caninteract probability is a non-starter
For this reason computationally lsquosimplerrsquo alternatives (such as fuzzy logic)were introduced to try to avoid some of these difficulties ndash however these aretypically frowed on by purists
Graphical Models
We can use graphs to represent how objects can probabilistically interact witheach other
Graphical Models and then a marriage between Graph and Probability theory
Many of the quantities that we would like to compute in a probabilitydistribution can then be related to operations on the graph
The computational complexity of operations can often be related to thestructure of the graph
Graphical Models are now used as a standard framework in EngineeringStatistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty and aretherefore widely applicable
Uses in Industry
Microsoft used to estimate the skill distribution of players in online games(the worlds largest graphical model)
Hospitals use Belief Nets to encode knowledge about diseases and symptomsto aid medical diagnosis
Google Microsoft Facebook used in many places including advertisingvideo game prediction speech recognition
Used to estimate inherent desirability of products in consumer retail
Microsoft and others Attempt to go beyond simple AB testing by usesGraphical Models to model the whole companyuser relationship
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or more shortly theprobability of x given y) is defined as
p(x|y) equiv p(x y)
p(y)=p(y|x)p(x)
p(y)(Bayesrsquo rule)
Throwing darts
p(region 5|not region 20) =p(region 5 not region 20)
p(not region 20)
=p(region 5)
p(not region 20)=
120
1920=
1
19
Interpretationp(A = a|B = b) should not be interpreted as lsquoGiven the event B = b has occurredp(A = a|B = b) is the probability of the event A = a occurringrsquo The correctinterpretation should be lsquop(A = a|B = b) is the probability of A being in state aunder the constraint that B is in state brsquo
Battleships
Assume there are 2 ships 1 vertical (ship 1) and 1 horizontal (ship 2) of 5pixels each
Can be placed anywhere on the 10times10 grid but cannot overlap
Let s1 is the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query lsquohitrsquo or lsquomissrsquo responses
p(s1 s2|D) =p(D|s1 s2)p(s1 s2)
p(D)Let X be the matrix of pixel occupancy
p(X|D) =sums1s2
p(X s1 s2|D) =sums1s2
p(X|s1 s2)p(s1 s2|D)
demoBattleshipsm
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
Graphical Models
We can use graphs to represent how objects can probabilistically interact witheach other
Graphical Models and then a marriage between Graph and Probability theory
Many of the quantities that we would like to compute in a probabilitydistribution can then be related to operations on the graph
The computational complexity of operations can often be related to thestructure of the graph
Graphical Models are now used as a standard framework in EngineeringStatistics and Computer Science
Graphical Models are used to perform reasoning under uncertainty and aretherefore widely applicable
Uses in Industry
Microsoft used to estimate the skill distribution of players in online games(the worlds largest graphical model)
Hospitals use Belief Nets to encode knowledge about diseases and symptomsto aid medical diagnosis
Google Microsoft Facebook used in many places including advertisingvideo game prediction speech recognition
Used to estimate inherent desirability of products in consumer retail
Microsoft and others Attempt to go beyond simple AB testing by usesGraphical Models to model the whole companyuser relationship
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or more shortly theprobability of x given y) is defined as
p(x|y) equiv p(x y)
p(y)=p(y|x)p(x)
p(y)(Bayesrsquo rule)
Throwing darts
p(region 5|not region 20) =p(region 5 not region 20)
p(not region 20)
=p(region 5)
p(not region 20)=
120
1920=
1
19
Interpretationp(A = a|B = b) should not be interpreted as lsquoGiven the event B = b has occurredp(A = a|B = b) is the probability of the event A = a occurringrsquo The correctinterpretation should be lsquop(A = a|B = b) is the probability of A being in state aunder the constraint that B is in state brsquo
Battleships
Assume there are 2 ships 1 vertical (ship 1) and 1 horizontal (ship 2) of 5pixels each
Can be placed anywhere on the 10times10 grid but cannot overlap
Let s1 is the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query lsquohitrsquo or lsquomissrsquo responses
p(s1 s2|D) =p(D|s1 s2)p(s1 s2)
p(D)Let X be the matrix of pixel occupancy
p(X|D) =sums1s2
p(X s1 s2|D) =sums1s2
p(X|s1 s2)p(s1 s2|D)
demoBattleshipsm
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
Uses in Industry
Microsoft used to estimate the skill distribution of players in online games(the worlds largest graphical model)
Hospitals use Belief Nets to encode knowledge about diseases and symptomsto aid medical diagnosis
Google Microsoft Facebook used in many places including advertisingvideo game prediction speech recognition
Used to estimate inherent desirability of products in consumer retail
Microsoft and others Attempt to go beyond simple AB testing by usesGraphical Models to model the whole companyuser relationship
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or more shortly theprobability of x given y) is defined as
p(x|y) equiv p(x y)
p(y)=p(y|x)p(x)
p(y)(Bayesrsquo rule)
Throwing darts
p(region 5|not region 20) =p(region 5 not region 20)
p(not region 20)
=p(region 5)
p(not region 20)=
120
1920=
1
19
Interpretationp(A = a|B = b) should not be interpreted as lsquoGiven the event B = b has occurredp(A = a|B = b) is the probability of the event A = a occurringrsquo The correctinterpretation should be lsquop(A = a|B = b) is the probability of A being in state aunder the constraint that B is in state brsquo
Battleships
Assume there are 2 ships 1 vertical (ship 1) and 1 horizontal (ship 2) of 5pixels each
Can be placed anywhere on the 10times10 grid but cannot overlap
Let s1 is the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query lsquohitrsquo or lsquomissrsquo responses
p(s1 s2|D) =p(D|s1 s2)p(s1 s2)
p(D)Let X be the matrix of pixel occupancy
p(X|D) =sums1s2
p(X s1 s2|D) =sums1s2
p(X|s1 s2)p(s1 s2|D)
demoBattleshipsm
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example – Part III: Inference
Initial Evidence: The alarm is sounding.
p(B = 1 | A = 1) = Σ_{E,R} p(B = 1, E, A = 1, R) / Σ_{B,E,R} p(B, E, A = 1, R)
                 = Σ_{E,R} p(A = 1|B = 1, E) p(B = 1) p(E) p(R|E) / Σ_{B,E,R} p(A = 1|B, E) p(B) p(E) p(R|E)
                 ≈ 0.99
Additional Evidence: The radio broadcasts an earthquake warning.
A similar calculation gives p(B = 1 | A = 1, R = 1) ≈ 0.01.
Initially, because the alarm sounds, Sally thinks that she's been burgled. However, this probability drops dramatically when she hears that there has been an earthquake.
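These two numbers can be checked by brute-force enumeration over the joint distribution; a minimal sketch using the tables above:

```python
import itertools

# Conditional probability tables from the slides.
pB = {1: 0.01, 0: 0.99}
pE = {1: 0.000001, 0: 0.999999}
pA1 = {(1, 1): 0.9999, (1, 0): 0.99, (0, 1): 0.99, (0, 0): 0.0001}  # p(A=1|B,E)
pR1 = {1: 1.0, 0: 0.0}                                              # p(R=1|E)

def joint(a, r, e, b):
    pa = pA1[(b, e)] if a == 1 else 1 - pA1[(b, e)]
    pr = pR1[e] if r == 1 else 1 - pR1[e]
    return pa * pr * pE[e] * pB[b]

def posterior_burglar(evidence):
    """p(B = 1 | evidence); evidence maps variable name -> observed value."""
    num = den = 0.0
    for a, r, e, b in itertools.product([0, 1], repeat=4):
        state = dict(A=a, R=r, E=e, B=b)
        if any(state[k] != v for k, v in evidence.items()):
            continue
        p = joint(a, r, e, b)
        den += p
        num += p * (b == 1)
    return num / den

print(posterior_burglar({"A": 1}))          # ~0.99
print(posterior_burglar({"A": 1, "R": 1}))  # ~0.01
```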
Markov Models
For timeseries data v_1, ..., v_T we need a model p(v_{1:T}). For causal consistency, it is meaningful to consider the decomposition
p(v_{1:T}) = Π_{t=1}^{T} p(v_t | v_{1:t−1})

with the convention p(v_t | v_{1:t−1}) = p(v_1) for t = 1.

[Figure: cascade belief network over v_1, ..., v_4]
Independence assumptions
It is often natural to assume that the influence of the immediate past is more relevant than the remote past, and in Markov models only a limited number of previous observations are required to predict the future.
Markov Chain
Only the recent past is relevant
p(v_t | v_1, ..., v_{t−1}) = p(v_t | v_{t−L}, ..., v_{t−1})

where L ≥ 1 is the order of the Markov chain

p(v_{1:T}) = p(v_1) p(v_2|v_1) p(v_3|v_2) ··· p(v_T|v_{T−1})
For a stationary Markov chain, the transitions p(v_t = s′ | v_{t−1} = s) = f(s′, s) are time-independent ('homogeneous').
[Figure: (a) first-order Markov chain; (b) second-order Markov chain]
Markov Chains
[Figure: first-order Markov chain v_1 → v_2 → v_3 → v_4]

p(v_1, ..., v_T) = p(v_1) Π_{t=2}^{T} p(v_t | v_{t−1})

with p(v_1) the initial distribution and p(v_t | v_{t−1}) the transition.
State transition diagram
Nodes represent states of the variable v, and arcs non-zero elements of the transition p(v_t | v_{t−1}).
[Figure: state transition diagram on states 1–9]
Most probable and shortest paths
[Figure: the same state transition diagram on states 1–9]
The shortest (unweighted) path from state 1 to state 7 is 1 − 2 − 7.
The most probable path from state 1 to state 7 is 1 − 8 − 9 − 7 (assuming uniform transition probabilities). The latter path is longer but more probable, since for the path 1 − 2 − 7 the probability of exiting state 2 into state 7 is 1/5.
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(x_t = i) = Σ_j M_ij p(x_{t−1} = j), where M_ij = p(x_t = i | x_{t−1} = j)
p(x_t = i) is the frequency with which we visit state i at time t, given that we started from p(x_1) and randomly drew samples from the transition p(x_τ | x_{τ−1}). As we repeatedly sample a new state from the chain, the distribution at time t for an initial distribution p_1(i) is

p_t = M^{t−1} p_1

If, for t → ∞, p_∞ is independent of the initial distribution p_1, then p_∞ is called the equilibrium distribution of the chain:

p_∞ = M p_∞
The equilibrium distribution is proportional to the eigenvector of the transition matrix with unit eigenvalue.
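A short sketch of both routes to the equilibrium distribution, on an assumed 3-state transition matrix (columns sum to one):

```python
import numpy as np

# M[i, j] = p(x_t = i | x_{t-1} = j); an assumed toy chain.
M = np.array([[0.90, 0.20, 0.10],
              [0.05, 0.70, 0.30],
              [0.05, 0.10, 0.60]])

# Route 1: power up the chain, p_t = M^(t-1) p_1.
p = np.array([1.0, 0.0, 0.0])  # initial distribution p_1
for _ in range(200):
    p = M @ p
print(p)

# Route 2: the eigenvector of M with unit eigenvalue, normalised.
eigvals, eigvecs = np.linalg.eig(M)
v = np.real(eigvecs[:, np.argmin(np.abs(eigvals - 1))])
print(v / v.sum())  # matches the powered-up distribution
```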
PageRank
Define the matrix
A_ij = 1 if website j has a hyperlink to website i, and 0 otherwise
From this we can define a Markov transition matrix with elements
M_ij = A_ij / Σ_{i′} A_{i′j}
If we jump from website to website, the equilibrium distribution component p_∞(i) is the relative number of times we will visit website i. This has a natural interpretation as the 'importance' of website i.
For each website i, a list of words associated with that website is collected. After doing this for all websites, one can make an 'inverse' list of which websites contain word w. When a user searches for word w, the list of websites that contain the word is then returned, ranked according to the importance of the site.
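A minimal PageRank sketch along these lines; the 4-site link matrix A is an assumption for illustration:

```python
import numpy as np

# A[i, j] = 1 if website j has a hyperlink to website i -- assumed toy web.
A = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 1, 0, 0],
              [0, 0, 1, 0]], dtype=float)

M = A / A.sum(axis=0)    # M_ij = A_ij / sum_i' A_i'j (column-normalised)

rank = np.full(4, 0.25)  # start from a uniform distribution
for _ in range(100):     # power iteration towards p_inf = M p_inf
    rank = M @ rank
print(rank)              # the 'importance' of each site
```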
Hidden Markov Models
The HMM defines a Markov chain on hidden (or 'latent') variables h_{1:T}. The observed (or 'visible') variables are dependent on the hidden variables through an emission p(v_t|h_t). This defines a joint distribution
p(h_{1:T}, v_{1:T}) = p(v_1|h_1) p(h_1) Π_{t=2}^{T} p(v_t|h_t) p(h_t|h_{t−1})
For a stationary HMM, the transition p(h_t|h_{t−1}) and emission p(v_t|h_t) distributions are constant through time.
[Figure: a first-order hidden Markov model, with 'hidden' variables dom(h_t) = {1, ..., H}, t = 1, ..., T. The 'visible' variables v_t can be either discrete or continuous.]
The classical inference problems
Filtering (Inferring the present): p(h_t | v_{1:t})
Prediction (Inferring the future): p(h_t | v_{1:s}), t > s
Smoothing (Inferring the past): p(h_t | v_{1:u}), t < u
Likelihood: p(v_{1:T})
Most likely path (Viterbi alignment): argmax_{h_{1:T}} p(h_{1:T} | v_{1:T})
For prediction, one is also often interested in p(v_t | v_{1:s}) for t > s.
Inference in Hidden Markov Models
Belief network representation of a HMM
[Figure: belief network of an HMM, with Markov chain h_1 → h_2 → h_3 → h_4 and emissions h_t → v_t]
Filtering, Smoothing and Viterbi are all computationally efficient, scaling linearly with the length of the timeseries (but quadratically with the number of hidden states).
The algorithms are variants of 'message passing on factor graphs'.
The algorithms are guaranteed to work if the graph is singly-connected.
Huge research effort in the last 15 years to apply message passing for approximate inference in multiply-connected graphs (e.g. low-density parity-check codes).
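As an illustration of the linear-in-T, quadratic-in-H cost, below is a sketch of the forward (filtering) recursion for a small discrete HMM; the transition, emission and observation sequence are assumptions:

```python
import numpy as np

H = 3  # number of hidden states
trans = np.array([[0.8, 0.1, 0.1],   # trans[i, j] = p(h_t = i | h_{t-1} = j)
                  [0.1, 0.8, 0.1],
                  [0.1, 0.1, 0.8]])
emit = np.array([[0.9, 0.2, 0.5],    # emit[v, h] = p(v_t = v | h_t = h)
                 [0.1, 0.8, 0.5]])
p_h1 = np.full(H, 1 / 3)             # prior p(h_1)

def filtering(obs):
    """Return p(h_t | v_{1:t}) for each t; cost O(T H^2)."""
    alpha = emit[obs[0]] * p_h1
    alpha /= alpha.sum()
    out = [alpha]
    for v in obs[1:]:
        alpha = emit[v] * (trans @ alpha)  # predict, then correct
        alpha /= alpha.sum()               # normalise to a distribution
        out.append(alpha)
    return np.array(out)

print(filtering([0, 0, 1, 1, 1]))
```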
HMMs for speech recognition
h_t is the phoneme at time t; p(h_t|h_{t−1}) – language model; p(v_t|h_t) – speech signal model.
Deep Nets and HMMs
[Figure: HMM belief network, as above]
Recently, companies including Google have made big advances in speech recognition.
The breakthrough is to model p(v_t|h_t) as a Gaussian whose mean is some function µ(h_t, θ) of the phoneme.
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deep networks in reasoning systems.
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
[Figure: generative belief network with latent variables h_1, h_2 and visible variables v_1, ..., v_4]
It is natural to consider that objects (images, for example) can be constructed on the basis of a low-dimensional representation.
Note that this is a Graphical Model, not a Function.
The latent variables h can be sampled from using p(h), and then an image sampled from p(v|h).
One cannot use an autoencoder to generate new images.
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in these models.
Statisticians typically use sampling as an approximation.
Very popular in ML to use a variational method – much faster for inference.
Variational Inference
Consider a distribution

p(v|θ) = ∫_h p(v|h, θ) p(h)
and that we wish to learn θ to maximise the probability that this model generates the observed data.
log p(v|θ) ≥ −∫_h q(h|v, φ) log q(h|v, φ) + ∫_h q(h|v, φ) log p(v|h, θ) + const
The idea is to choose a 'variational' distribution q(h|v, φ) such that we can either calculate the bound analytically or sample it efficiently.
We then jointly maximise the bound w.r.t. φ and θ.
We can parameterise p(v|h, θ) using a deep network.
Very popular approach – see the 'variational autoencoder' and also attention mechanisms.
Extension to a semi-supervised method using p(v) = ∫_h Σ_c p(v|h, c) p(c) p(h)
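A Monte-Carlo sketch of the bound on a toy linear-Gaussian model, where the exact log-likelihood is available for comparison; the model p(h) = N(0, 1), p(v|h, θ) = N(θh, 1) and the Gaussian q(h|v, φ) = N(µ, σ²) are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def log_normal(x, mean, var):
    return -0.5 * (np.log(2 * np.pi * var) + (x - mean) ** 2 / var)

def elbo(v, theta, mu, sigma, n_samples=100_000):
    h = mu + sigma * rng.standard_normal(n_samples)  # h ~ q(h|v, phi)
    # E_q[ log p(v|h, theta) + log p(h) - log q(h|v, phi) ]
    return np.mean(log_normal(v, theta * h, 1.0)
                   + log_normal(h, 0.0, 1.0)
                   - log_normal(h, mu, sigma ** 2))

v, theta = 1.5, 1.0
print(elbo(v, theta, mu=0.0, sigma=1.0))             # a loose bound
print(elbo(v, theta, mu=v / 2, sigma=np.sqrt(0.5)))  # q = exact posterior
print(log_normal(v, 0.0, theta ** 2 + 1.0))          # exact log p(v|theta)
```

With q set to the exact posterior N(v/2, 1/2) the bound is tight, matching the exact log p(v|θ); other settings of (µ, σ) give strictly smaller values.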
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games?
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A, we need to decide which action to take, in any state of W, that will be best for our long-term goals.
The problem is that the number of pixel states is enormous.
We need to learn a low-dimensional representation of the screen (using a deep generative model).
Then learn which action to take given the low-dimensional representation (a minimal sketch follows below).
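A minimal tabular Q-learning sketch of that second step, assuming the screen has already been encoded into a small discrete state z by a learned generative model; the toy environment and all names here are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 10, 4          # z = encode(screen) assumed given
Q = np.zeros((n_states, n_actions))  # action values on the low-dim states
alpha, gamma, eps = 0.1, 0.99, 0.1

def env_step(z, a):
    """Stand-in for the game: returns (next_state, reward)."""
    z_next = (z + (1 if a == 0 else -1)) % n_states
    return z_next, float(z_next == 0)

z = 3
for _ in range(5000):
    # epsilon-greedy action selection
    a = int(rng.integers(n_actions)) if rng.random() < eps else int(Q[z].argmax())
    z_next, r = env_step(z, a)
    # temporal-difference update towards r + gamma * max_a' Q(z', a')
    Q[z, a] += alpha * (r + gamma * Q[z_next].max() - Q[z, a])
    z = z_next

print(Q.argmax(axis=1))  # greedy action in each encoded state
```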
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state-of-the-art results in Speech Recognition, Image Analysis, Game Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representation learning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company, reinfer:
https://reinfer.io
Conditional Probability and Bayesrsquo Rule
The probability of event x conditioned on knowing event y (or more shortly theprobability of x given y) is defined as
p(x|y) equiv p(x y)
p(y)=p(y|x)p(x)
p(y)(Bayesrsquo rule)
Throwing darts
p(region 5|not region 20) =p(region 5 not region 20)
p(not region 20)
=p(region 5)
p(not region 20)=
120
1920=
1
19
Interpretationp(A = a|B = b) should not be interpreted as lsquoGiven the event B = b has occurredp(A = a|B = b) is the probability of the event A = a occurringrsquo The correctinterpretation should be lsquop(A = a|B = b) is the probability of A being in state aunder the constraint that B is in state brsquo
Battleships
Assume there are 2 ships 1 vertical (ship 1) and 1 horizontal (ship 2) of 5pixels each
Can be placed anywhere on the 10times10 grid but cannot overlap
Let s1 is the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query lsquohitrsquo or lsquomissrsquo responses
p(s1 s2|D) =p(D|s1 s2)p(s1 s2)
p(D)Let X be the matrix of pixel occupancy
p(X|D) =sums1s2
p(X s1 s2|D) =sums1s2
p(X|s1 s2)p(s1 s2|D)
demoBattleshipsm
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
Battleships
Assume there are 2 ships 1 vertical (ship 1) and 1 horizontal (ship 2) of 5pixels each
Can be placed anywhere on the 10times10 grid but cannot overlap
Let s1 is the origin of ship 1 and s2 the origin of ship 2
Data D is a collection of query lsquohitrsquo or lsquomissrsquo responses
p(s1 s2|D) =p(D|s1 s2)p(s1 s2)
p(D)Let X be the matrix of pixel occupancy
p(X|D) =sums1s2
p(X s1 s2|D) =sums1s2
p(X|s1 s2)p(s1 s2|D)
demoBattleshipsm
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
Belief Networks (Bayesian Networks)
A belief network is a directed acyclic graph in which each node has associated theconditional probability of the node given its parents
The joint distribution is obtained by taking the product of the conditionalprobabilities
p(ABCDE) = p(A)p(B)p(C|AB)p(D|C)p(E|BC)
p(E|BC)
A B
C
DE
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
Example ndash Part ISallyrsquos burglar Alarm is sounding Has she been Burgled or was the alarmtriggered by an Earthquake She turns the car Radio on for news of earthquakes
Choosing an orderingWithout loss of generality we can write
p(AREB) = p(A|REB)p(REB)
= p(A|REB)p(R|EB)p(EB)
= p(A|REB)p(R|EB)p(E|B)p(B)
Assumptions
The alarm is not directly influenced by any report on the radiop(A|REB) = p(A|EB)The radio broadcast is not directly influenced by the burglar variablep(R|EB) = p(R|E)Burglaries donrsquot directly lsquocausersquo earthquakes p(E|B) = p(E)
Therefore
p(AREB) = p(A|EB)p(R|E)p(E)p(B)
Example ndash Part II Specifying the Tables
B
A
E
R
p(A|BE)
Alarm = 1 Burglar Earthquake09999 1 1
099 1 0099 0 1
00001 0 0
p(R|E)
Radio = 1 Earthquake1 10 0
The remaining tables are p(B = 1) = 001 and p(E = 1) = 0000001 The tablesand graphical structure fully specify the distribution
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deep generative model).
Then learn which action to take given the low dimensional representation.
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state-of-the-art results in Speech Recognition, Image Analysis, Game Playing.
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representation learning.
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
https://reinfer.io
Example – Part II: Specifying the Tables
(Belief network: B → A ← E and E → R; Burglar, Alarm, Earthquake, Radio.)
p(A|B,E):

Alarm = 1   Burglar   Earthquake
0.9999      1         1
0.99        1         0
0.99        0         1
0.0001      0         0
p(R|E):

Radio = 1   Earthquake
1           1
0           0
The remaining tables are p(B = 1) = 0.01 and p(E = 1) = 0.000001. The tables and graphical structure fully specify the distribution.
Example – Part III: Inference
Initial Evidence: the alarm is sounding.
p(B = 1 | A = 1) = Σ_{E,R} p(B = 1, E, A = 1, R) / Σ_{B,E,R} p(B, E, A = 1, R)
                 = Σ_{E,R} p(A = 1 | B = 1, E) p(B = 1) p(E) p(R|E) / Σ_{B,E,R} p(A = 1 | B, E) p(B) p(E) p(R|E) ≈ 0.99
Additional Evidence: the radio broadcasts an earthquake warning.
A similar calculation gives p(B = 1 | A = 1, R = 1) ≈ 0.01.
Initially, because the alarm sounds, Sally thinks that she's been burgled. However, this probability drops dramatically when she hears that there has been an earthquake.
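The example is small enough to verify by brute-force enumeration; a sketch using the tables above:

```python
# Conditional probability tables from the slides (1 = true, 0 = false).
p_B = {1: 0.01, 0: 0.99}
p_E = {1: 0.000001, 0: 0.999999}
p_A1 = {(1, 1): 0.9999, (1, 0): 0.99, (0, 1): 0.99, (0, 0): 0.0001}  # p(A=1 | B, E)
p_R1 = {1: 1.0, 0: 0.0}                                              # p(R=1 | E)

def joint(b, e, a, r):
    """p(B=b, E=e, A=a, R=r) = p(A|B,E) p(R|E) p(B) p(E)."""
    pa = p_A1[(b, e)] if a == 1 else 1.0 - p_A1[(b, e)]
    pr = p_R1[e] if r == 1 else 1.0 - p_R1[e]
    return pa * pr * p_B[b] * p_E[e]

def p_burglar_given(a, r=None):
    """p(B=1 | A=a) or p(B=1 | A=a, R=r), summing out the unobserved variables."""
    rs = (0, 1) if r is None else (r,)
    num = sum(joint(1, e, a, rr) for e in (0, 1) for rr in rs)
    den = sum(joint(b, e, a, rr) for b in (0, 1) for e in (0, 1) for rr in rs)
    return num / den

print(p_burglar_given(a=1))        # ~0.99: the alarm alone suggests a burglary
print(p_burglar_given(a=1, r=1))   # ~0.01: the earthquake report 'explains away' the alarm
```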
Markov Models
For timeseries data v_1, . . . , v_T we need a model p(v_{1:T}). For causal consistency it is meaningful to consider the decomposition
p(v_{1:T}) = ∏_{t=1}^{T} p(v_t | v_{1:t-1})

with the convention p(v_t | v_{1:t-1}) = p(v_1) for t = 1.
Independence assumptions

It is often natural to assume that the influence of the immediate past is more relevant than the remote past; in Markov models only a limited number of previous observations are required to predict the future.
Markov Chain
Only the recent past is relevant
p(v_t | v_1, . . . , v_{t-1}) = p(v_t | v_{t-L}, . . . , v_{t-1})

where L ≥ 1 is the order of the Markov chain. For a first-order chain,

p(v_{1:T}) = p(v_1) p(v_2|v_1) p(v_3|v_2) · · · p(v_T | v_{T-1})
For a stationary Markov chain the transitions p(v_t = s′ | v_{t-1} = s) = f(s′, s) are time-independent ('homogeneous').
Figure: (a) First-order Markov chain. (b) Second-order Markov chain.
Markov Chains
p(v_1, . . . , v_T ) = p(v_1) ∏_{t=2}^{T} p(v_t | v_{t-1})

with p(v_1) the initial distribution and p(v_t | v_{t-1}) the transition.
State transition diagram: nodes represent states of the variable v, and arcs non-zero elements of the transition p(v_t | v_{t-1}). (Figure: a transition diagram on states 1, . . . , 9.)
Most probable and shortest paths
The shortest (unweighted) path from state 1 to state 7 is 1 − 2 − 7.
The most probable path from state 1 to state 7 is 1 − 8 − 9 − 7 (assuming uniform transition probabilities). The latter path is longer but more probable, since for the path 1 − 2 − 7 the probability of exiting state 2 into state 7 is only 1/5.
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(x_t = i) = Σ_j p(x_t = i | x_{t-1} = j) p(x_{t-1} = j) = Σ_j M_{ij} p(x_{t-1} = j)
p(x_t = i) is the frequency with which we visit state i at time t, given we started from p(x_1) and randomly drew samples from the transition p(x_τ | x_{τ−1}). As we repeatedly sample a new state from the chain, the distribution at time t for an initial distribution p_1(i) is

p_t = M^{t−1} p_1
If, for t → ∞, p_∞ is independent of the initial distribution p_1, then p_∞ is called the equilibrium distribution of the chain:

p_∞ = M p_∞
The equilibrium distribution is proportional to the eigenvector with unit eigenvalue of the transition matrix.
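Both statements are easy to check numerically; a sketch with an invented 2-state transition matrix:

```python
import numpy as np

M = np.array([[0.9, 0.2],
              [0.1, 0.8]])   # column-stochastic: M[i, j] = p(x_t = i | x_{t-1} = j)

p = np.array([1.0, 0.0])     # some initial distribution p_1
for _ in range(100):
    p = M @ p                # iterate p_t = M^{t-1} p_1
print(p)                     # converges to the equilibrium distribution, here [2/3, 1/3]

vals, vecs = np.linalg.eig(M)
v = np.real(vecs[:, np.argmax(np.real(vals))])   # eigenvector with eigenvalue 1
print(v / v.sum())           # the same distribution, after normalisation
```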
PageRank
Define the matrix

A_{ij} = 1 if website j has a hyperlink to website i, and 0 otherwise.
From this we can define a Markov transition matrix with elements

M_{ij} = A_{ij} / Σ_{i′} A_{i′j}
If we jump from website to website, the equilibrium distribution component p_∞(i) is the relative number of times we will visit website i. This has a natural interpretation as the 'importance' of website i.
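A sketch on an invented four-site link graph: build A, column-normalise it into M, and power-iterate to the importance scores. (Real PageRank additionally mixes in a small uniform 'damping' term, omitted here as on the slide.)

```python
import numpy as np

# A[i, j] = 1 if website j links to website i (an invented 4-site example).
A = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 1, 0, 0],
              [0, 0, 1, 0]], dtype=float)

M = A / A.sum(axis=0)   # M[i, j] = A[i, j] / sum_i' A[i', j]

p = np.full(4, 0.25)    # start the random surfer from a uniform distribution
for _ in range(200):
    p = M @ p           # hop along links until the distribution settles
print(p)                # equilibrium component p_inf(i) = importance of site i
```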
For each website i a list of words associated with that website is collected. After doing this for all websites, one can make an 'inverse' list of which websites contain word w. When a user searches for word w, the list of websites that contain the word is then returned, ranked according to the importance of the site.
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
Example Part III Inference
Initial Evidence The alarm is sounding
p(B = 1|A = 1) =
sumER p(B = 1 EA = 1 R)sumBER p(BEA = 1 R)
=
sumER p(A = 1|B = 1 E)p(B = 1)p(E)p(R|E)sum
BER p(A = 1|BE)p(B)p(E)p(R|E)asymp 099
Additional Evidence The radio broadcasts an earthquake warning
A similar calculation gives p(B = 1|A = 1 R = 1) asymp 001
Initially because the alarm sounds Sally thinks that shersquos been burgledHowever this probability drops dramatically when she hears that there hasbeen an earthquake
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
Markov Models
For timeseries data v1 vT we need a model p(v1T ) For causal consistency itis meaningful to consider the decomposition
p(v1T ) =
Tprodt=1
p(vt|v1tminus1)
with the convention p(vt|v1tminus1) = p(v1) for t = 1
v1 v2 v3 v4
Independence assumptionsIt is often natural to assume that the influence of the immediate past is morerelevant than the remote past and in Markov models only a limited number ofprevious observations are required to predict the future
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
Markov Chain
Only the recent past is relevant
p(vt|v1 vtminus1) = p(vt|vtminusL vtminus1)
where L ge 1 is the order of the Markov chain
p(v1T ) = p(v1)p(v2|v1)p(v3|v2) p(vT |vTminus1)
For a stationary Markov chain the transitions p(vt = sprime|vtminus1 = s) = f(sprime s) aretime-independent (lsquohomogeneousrsquo)
v1 v2 v3 v4
(a)
v1 v2 v3 v4
(b)
Figure (a) First order Markov chain (b) Second order Markov chain
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
Markov Chains
v1 v2 v3 v4
p(v1 vT ) = p(v1)︸ ︷︷ ︸initial
Tprodt=2
p(vt|vtminus1)︸ ︷︷ ︸Transition
State transition diagramNodes represent states of the variable v and arcs non-zero elements of thetransition p(vt|vtminus1)
1 2
34
56
7
8 9
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
Most probable and shortest paths
1 2
34
56
7
8 9
The shortest (unweighted) path from state 1 to state 7 is 1minus 2minus 7
The most probable path from state 1 to state 7 is 1minus 8minus 9minus 7 (assuminguniform transition probabilities) The latter path is longer but more probablesince for the path 1minus 2minus 7 the probability of exiting state 2 into state 7 is15
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
Equilibrium distribution
It is interesting to know how the marginal p(xt) evolves through time
p(xt = i) =sumj
p(xt = i|xtminus1 = j)︸ ︷︷ ︸Mij
p(xtminus1 = j)
p(xt = i) is the frequency that we visit state i at time t given we startedfrom p(x1) and randomly drew samples from the transition p(xτ |xτminus1)As we repeatedly sample a new state from the chain the distribution at timet for an initial distribution p1(i) is
pt = Mtminus1p1
If for trarrinfin pinfin is independent of the initial distribution p1 then pinfin iscalled the equilibrium distribution of the chain
pinfin = Mpinfin
The equil distribution is proportional to the eigenvector with unit eigenvalueof the transition matrix
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing p(h|v) and parameter learning) is intractable in thesemodels
Statisticians typically use sampling as an approximation
Very popular in ML to use a variational method ndash much faster for inference
Variational InferenceConsider a distribution
p(v|θ) =inth
p(v|h θ)p(h)
and that we wish to learn θ to maximise the probability this model generatesobserved data
log p(v|θ) ge minusintq(h|v φ) log q(h|v φ) +
inth
q(h|v φ)p(v|h θ) + const
Idea is to choose a lsquovariationalrsquo distribution q(h|v φ) such that we can eithercalculate analytically the bound or sample it efficiently
We then jointly maximise the bound wrt φ and θ
We can parameterise p(v|h θ) using a deep network
Very popular approach ndash see lsquovariational autoencoderrsquo and also attentionmechanisms
Extension to semi-supervised method using p(v) =inth
sumc p(v|h c)p(c)p(h)
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A we need todecide which action to taken for any state of W that will be best for our longterm goals
Problem is that the number of pixel states is enormous
Need to learn a low dimensional representation of the screen (use a deepgenerative model)
Learn then which action to take given the low dimensional representation
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state of the art results in Speech Recognition Image AnalysisGame Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representationlearning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company reinfer
httpsreinferio
PageRank
Define the matrix
Aij =
1 if website j has a hyperlink to website i0 otherwise
From this we can define a Markov transition matrix with elements
Mij =Aijsumiprime Aiprimej
If we jump from website to website the equilibrium distribution componentpinfin(i) is the relative number of times we will visit website i This has anatural interpretation as the lsquoimportancersquo of website i
For each website i a list of words associated with that website is collectedAfter doing this for all websites one can make an lsquoinversersquo list of whichwebsites contain word w When a user searches for word w the list ofwebsites that contain word is then returned ranked according to theimportance of the site
Hidden Markov Models
The HMM defines a Markov chain on hidden (or lsquolatentrsquo) variables h1T Theobserved (or lsquovisiblersquo) variables are dependent on the hidden variables through anemission p(vt|ht) This defines a joint distribution
p(h1T v1T ) = p(v1|h1)p(h1)Tprodt=2
p(vt|ht)p(ht|htminus1)
For a stationary HMM the transition p(ht|htminus1) and emission p(vt|ht) distributionsare constant through time
v1 v2 v3 v4
h1 h2 h3 h4 Figure A first order hidden Markov modelwith lsquohiddenrsquo variablesdom(ht) = 1 H t = 1 T Thelsquovisiblersquo variables vt can be either discrete orcontinuous
The classical inference problems
Filtering (Inferring the present) p(ht|v1t)Prediction (Inferring the future) p(ht|v1s) t gt sSmoothing (Inferring the past) p(ht|v1u) t lt uLikelihood p(v1T )Most likely path (Viterbi alignment) argmax
h1T
p(h1T |v1T )
For prediction one is also often interested in p(vt|v1s) for t gt s
Inference in Hidden Markov Models
Belief network representation of a HMM
h1 h2 h3 h4
v1 v2 v3 v4
Filtering Smoothing and Viterbi are all computationally efficient scalinglinearly with the length of the timeseries (but quadratically with the numberof hidden states)
The algorithms are variants of lsquomessage passing on factor graphsrsquo
Algorithm guaranteed to work if the graph is singly-connected
Huge research effort in the last 15 years to apply message passing forapproximate inference in multiply-connected graphs (eg low-densityparity-check codes)
HMMs for speech recognition
ht is the phoneme at time t p(ht|htminus1) ndash language model p(vt|ht) ndash speechsignal model
Deep Nets and HMMs
h1 h2 h3 h4
v1 v2 v3 v4
Recently companies including Google have made big advances in speechrecognition
The breakthrough is to model p(vt|ht) as a Gaussian whose mean is somefunction of the phoneme micro(ht θ)
This function is a deep neural network trained on a large amount of data
Goldrush at the moment to find similar breakthrough applications of deepnetworks in reasoning systems
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Generative Model
h1 h2
v1 v2 v3 v4
It is natural to consider that objects (images for example) can be constructedon the basis of a low dimensional representation
Note that this is a Graphical Model not a Function
The latent variables h can be sampled from using p(h) and then an imagesampled from p(v|h)One cannot use an autoencoder to generate new images
The bad news
Inference (computing $p(h|v)$) and parameter learning are intractable in these models.
Statisticians typically use sampling as an approximation.
It is very popular in ML to use a variational method, which is much faster for inference.
Variational Inference
Consider a distribution
$$p(v|\theta) = \int_h p(v|h, \theta)\, p(h)$$
and suppose that we wish to learn $\theta$ to maximise the probability that this model generates the observed data.
By Jensen's inequality, for any distribution $q(h|v, \phi)$,
$$\log p(v|\theta) \ge -\int_h q(h|v, \phi) \log q(h|v, \phi) + \int_h q(h|v, \phi) \log p(v|h, \theta) + \text{const.}$$
The idea is to choose a 'variational' distribution $q(h|v, \phi)$ such that we can either calculate the bound analytically or sample it efficiently (see the sketch after this list).
We then jointly maximise the bound with respect to $\phi$ and $\theta$.
We can parameterise $p(v|h, \theta)$ using a deep network.
This is a very popular approach; see the 'variational autoencoder' and also attention mechanisms.
Extension to a semi-supervised method using $p(v) = \int_h \sum_c p(v|h, c)\, p(c)\, p(h)$.
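A minimal sketch of a single-sample Monte Carlo estimate of the bound, assuming a diagonal-Gaussian $q(h|v,\phi)$, a standard Gaussian prior $p(h)$, and a Bernoulli decoder $p(v|h,\theta)$; all weights are invented placeholders rather than a trained variational autoencoder. In expectation the returned value lower-bounds $\log p(v|\theta)$, and the reparameterised sample is what makes gradient-based maximisation over $\phi$ possible.

```python
import numpy as np

rng = np.random.default_rng(2)
D, K = 8, 2          # visible dim, latent dim

# Placeholder parameters: theta (decoder) and phi (encoder); learned in practice.
W_dec = rng.normal(0, 0.5, (D, K))
W_mu  = rng.normal(0, 0.1, (K, D))
W_ls  = rng.normal(0, 0.1, (K, D))

def elbo_estimate(v):
    """Single-sample Monte Carlo estimate of the variational lower bound."""
    mu, log_s = W_mu @ v, W_ls @ v                   # q(h|v, phi): diagonal Gaussian
    eps = rng.normal(size=K)
    h = mu + np.exp(log_s) * eps                     # reparameterised sample h ~ q
    p = 1 / (1 + np.exp(-(W_dec @ h)))               # p(v|h, theta): Bernoulli means
    log_pv_h = np.sum(v * np.log(p) + (1 - v) * np.log(1 - p))
    log_ph = -0.5 * np.sum(h**2) - 0.5 * K * np.log(2 * np.pi)
    log_q = np.sum(-log_s - 0.5 * np.log(2 * np.pi) - 0.5 * eps**2)
    return log_pv_h + log_ph - log_q                 # E[.] >= log p(v|theta)

v = (rng.random(D) < 0.5).astype(float)              # a fake binary observation
print(elbo_estimate(v))
```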
DRAW
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Reinforcement Learning
Can we teach computers to play Atari video games?
Deep Reinforcement Learning
Given a state of the world W and a set of possible actions A, we need to decide which action to take in any state of W that will be best for our long-term goals.
The problem is that the number of pixel states is enormous.
We need to learn a low-dimensional representation of the screen (using a deep generative model).
Then learn which action to take given the low-dimensional representation; see the sketch below.
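A toy sketch of the second step: tabular Q-learning over indices of an assumed, already-discretised low-dimensional representation. The environment dynamics here are invented purely for illustration; systems like DQN instead represent Q itself with a deep network over the learned representation.

```python
import numpy as np

rng = np.random.default_rng(3)
n_states, n_actions = 10, 4          # states = discretised low-dim screen codes
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.1, 0.99, 0.1   # learning rate, discount, exploration rate

def step(s, a):
    """Stand-in environment: returns (next_state, reward). Purely illustrative."""
    s_next = (s + a) % n_states
    return s_next, float(s_next == 0)

s = int(rng.integers(n_states))
for _ in range(5000):
    a = int(rng.integers(n_actions)) if rng.random() < eps else int(np.argmax(Q[s]))
    s_next, r = step(s, a)
    # Q-learning update towards the one-step bootstrapped target.
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
    s = s_next

print(np.argmax(Q, axis=1))          # greedy action for each representation index
```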
Tetris
Table of Contents
History of the AI dream
How do brains work
Connectionism
AutoDiff
Fantasy Machines
Probability
Directed Graphical Models
Variational Generative Models
Reinforcement Learning
Outlook
Outlook
Machine Learning is in a boom period
Renewed interest and hope in creating AI
Combine new computational power with suitable hierarchical representations
Impressive state-of-the-art results in Speech Recognition, Image Analysis, Game Playing
Challenges
Improve understanding of optimisation for deep learning
Learn how to more efficiently exploit computational resources
Learn how to exploit massive databases
Improve interaction between reinforcement learning and representation learning
Marry non-symbolic (neural) with symbolic (Bayesian reasoning)
Emphasis is on scalability
Feel free to contact me at UCL or at my AI company, reinfer:
https://reinfer.io