Introduction to the Boltzmann Machine. We describe how the idea of the Boltzmann machine arises from the idea of Hopfield nets.
Boltzmann Machine: A Brief Introduction
Ritajit Majumdar, Arunabha Saha
University of Calcutta
November 6, 2013
Ritajit Majumdar Arunabha Saha (CU) Boltzmann Machine November 6, 2013 1 / 44
1 Hopfield Net
2 Stochastic Hopfield Nets with Hidden Units
3 Boltzmann Machine
4 Learning Algorithm for Boltzmann Machine
5 Applications of Boltzmann Machine
6 Restricted Boltzmann Machine
7 Reference
Hopfield Network
Figure: Two dimensional representation of motion in state space
Hopfield Net
A Hopfield net is composed of binary threshold units with recurrent connections between them.
Recurrent networks of non-linear units are hard to analyze, since they can behave in many different ways:
1 Settle to a stable state.
2 Oscillate.
3 Follow a chaotic trajectory.
John Hopfield introduced a global energy function for networks with symmetric connections.
1 Each binary configuration of the whole network has an energy.
2 The binary threshold decision rule causes the network to settle to a minimum of this energy function.
Energy Function
The global energy is defined as
$$E = -\sum_i s_i b_i - \sum_{i<j} s_i s_j w_{ij} \qquad (1)$$
where $b_i$ is the bias of the $i$-th unit, $s_i$ is 0 or 1¹ depending on whether the unit is turned off or on respectively, and $w_{ij}$ is the weight of the connection between units $i$ and $j$.
From this energy function, each unit can compute locally how changing its state will affect the global energy.
The energy gap is defined as
$$\Delta E_i = E(s_i = 0) - E(s_i = 1) = b_i + \sum_j s_j w_{ij} \qquad (2)$$
¹ For bipolar inputs the states will be −1 and 1 respectively.
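Equations (1) and (2) can be checked numerically. The following is a minimal sketch, assuming NumPy, a symmetric weight matrix with zero diagonal, and a made-up 3-unit network; the function names `energy` and `energy_gap` are illustrative, not taken from the slides.

```python
import numpy as np

def energy(s, W, b):
    """Global energy E = -sum_i s_i b_i - sum_{i<j} s_i s_j w_ij, states in {0, 1}."""
    # W is symmetric with a zero diagonal, so 0.5 * s^T W s counts each pair i<j once.
    return float(-(s @ b) - 0.5 * (s @ W @ s))

def energy_gap(i, s, W, b):
    """Energy gap Delta E_i = E(s_i=0) - E(s_i=1) = b_i + sum_j s_j w_ij."""
    return float(b[i] + W[i] @ s)

# Hypothetical network: weights and biases are made up for illustration.
W = np.array([[0.0, 2.0, -1.0],
              [2.0, 0.0, 3.0],
              [-1.0, 3.0, 0.0]])
b = np.array([0.5, -1.0, 0.0])
s = np.array([1.0, 0.0, 1.0])

# Compare the local formula with the definition for unit 1.
s_off, s_on = s.copy(), s.copy()
s_off[1], s_on[1] = 0.0, 1.0
gap_from_definition = energy(s_off, W, b) - energy(s_on, W, b)
gap_local = energy_gap(1, s, W, b)
print(gap_from_definition, gap_local)
```

The two values agree because every term of (1) that does not involve $s_i$ cancels in the difference, which is why each unit only needs its own bias and incoming weights.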
Settling to an Energy Minimum
The net is initially in a random state, i.e., the units are on or off randomly. The binary threshold decision rule updates units one at a time in a random order.
Update each unit to whichever of its two states gives the lower global energy.
Use binary threshold units, i.e., the states can be either 0 or 1.
Figure: example network state, −E = goodness = 4.
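The settling procedure can be sketched as follows. This is only an illustration under stated assumptions: NumPy, random symmetric weights with no self-connections, and a hypothetical helper `settle` that stops once a full sweep changes no unit. A unit is switched on exactly when its energy gap is positive, which is the state with lower global energy.

```python
import numpy as np

def energy(s, W, b):
    """Global energy of a 0/1 state vector (symmetric W, zero diagonal)."""
    return float(-(s @ b) - 0.5 * (s @ W @ s))

def settle(s, W, b, rng, max_sweeps=100):
    """Update binary threshold units one at a time until no unit changes."""
    s = s.copy()
    for _ in range(max_sweeps):
        changed = False
        for i in rng.permutation(len(s)):      # random update order
            gap = b[i] + W[i] @ s              # Delta E_i = E(s_i=0) - E(s_i=1)
            new_state = 1.0 if gap > 0 else 0.0
            if new_state != s[i]:
                s[i] = new_state
                changed = True
        if not changed:                        # every unit is in its lower-energy state
            break
    return s

# Hypothetical random network, for illustration only.
rng = np.random.default_rng(0)
n = 6
A = rng.normal(size=(n, n))
W = (A + A.T) / 2                              # symmetric connections
np.fill_diagonal(W, 0.0)                       # no self-connections
b = rng.normal(size=n)

s0 = rng.integers(0, 2, size=n).astype(float)  # random initial state
s_final = settle(s0, W, b, rng)
print(energy(s0, W, b), energy(s_final, W, b))
```

Because no single update can raise the energy, the final energy is never above the initial one, but the minimum reached depends on the starting state and the update order.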
Using This Type of Computation as Memory
Hopfield proposed that memories could be energy minima of a neural net.
- The binary threshold decision rule can be used to “clean up” incomplete or corrupted memories.
Using energy minima to represent memories gives a content-addressable memory.
- An item can be accessed just by knowing part of it.
- It is robust against hardware damage.
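A small sketch of the “clean up” idea follows. The slides do not say how a memory is written into the weights, so this example assumes the common Hebbian outer-product rule, bipolar states (−1 and 1, as in the footnote to the energy function), zero biases, and a fixed update order for reproducibility. The names `store` and `recall` are hypothetical.

```python
import numpy as np

def store(patterns):
    """Hebbian outer-product weights for bipolar patterns (an assumed storage rule)."""
    n = patterns.shape[1]
    W = np.zeros((n, n))
    for p in patterns:
        W += np.outer(p, p)
    np.fill_diagonal(W, 0.0)       # no self-connections
    return W

def recall(s, W, sweeps=5):
    """Clean up a probe with repeated one-at-a-time threshold updates (zero biases)."""
    s = s.copy()
    for _ in range(sweeps):
        for i in range(len(s)):
            s[i] = 1.0 if W[i] @ s >= 0 else -1.0
    return s

memory = np.array([[1, -1, 1, 1, -1, -1, 1, -1]], dtype=float)
W = store(memory)

probe = memory[0].copy()
probe[[0, 3]] *= -1                # corrupt two of the eight units
restored = recall(probe, W)
print(np.array_equal(restored, memory[0]))
```

With a single stored pattern the corrupted probe lies in the basin of the stored memory, so the threshold updates restore it; with many stored patterns, recall can fail or land in a spurious minimum.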
Stochastic Hopfield Nets with Hidden Units
Hopfield Net with Hidden Units
Why add hidden units?
A Hopfield net has no hidden units. By adding hidden units, the focus can shift from just storing memories to constructing interpretations of the inputs.
Use the net to construct interpretations of sensory input.
The input is represented by the visible units.
The interpretation is represented by the states of the hidden units.
Noisy Networks
The binary threshold decision rule always goes downhill, i.e., it never increases the energy.
Hence it is impossible to escape from a local minimum.
Solution: use random noise to escape from poor, shallow minima.
- Start with a lot of noise to escape the energy barriers of poor local minima.
- Slowly reduce the noise so that the system ends up in a deep minimum.
- This process is called “simulated annealing”.
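The annealing idea can be sketched in a few lines. Several choices here are assumptions rather than anything stated on the slide: the noise is injected with the logistic rule from the “Stochastic Binary Units” slide, the temperature follows a geometric cooling schedule, and the hypothetical function `anneal` also remembers the lowest-energy state it has visited, which is a convenience added for this example.

```python
import numpy as np

def energy(s, W, b):
    """Global energy of a 0/1 state vector (symmetric W, zero diagonal)."""
    return float(-(s @ b) - 0.5 * (s @ W @ s))

def anneal(s, W, b, rng, t_start=10.0, t_end=0.1, cooling=0.9, sweeps_per_t=20):
    """Noisy unit updates with a slowly decreasing temperature."""
    s = s.copy()
    best, best_e = s.copy(), energy(s, W, b)
    t = t_start
    while t > t_end:
        for _ in range(sweeps_per_t):
            for i in rng.permutation(len(s)):
                gap = b[i] + W[i] @ s
                p_on = 1.0 / (1.0 + np.exp(-gap / t))   # high t: close to a coin flip
                s[i] = 1.0 if rng.random() < p_on else 0.0
            e = energy(s, W, b)
            if e < best_e:                              # remember the best state seen
                best, best_e = s.copy(), e
        t *= cooling                                    # slowly reduce the noise
    return best, best_e

# Hypothetical random network, for illustration only.
rng = np.random.default_rng(1)
n = 8
A = rng.normal(size=(n, n))
W = (A + A.T) / 2
np.fill_diagonal(W, 0.0)
b = rng.normal(size=n)

s0 = rng.integers(0, 2, size=n).astype(float)
best, best_e = anneal(s0, W, b, rng)
print(energy(s0, W, b), best_e)
```

The starting temperature, final temperature, and cooling factor are arbitrary here; in practice they would need tuning to the scale of the energy gaps.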
Stochastic Binary Units
How to add noise?
1 Replace the binary threshold units by binary stochastic units that make biased random decisions.
2 The “temperature” $T$ controls the amount of noise.
3 Unit $i$ then turns on with the probability given by the logistic function
$$\mathrm{prob}(s_i = 1) = \frac{1}{1 + e^{-\Delta E_i / T}} \qquad (3)$$
$T \to 0$: deterministic (Hopfield net)
$T \to \infty$: complete chaos (each unit is on with probability 1/2)
$T = 1$: approaches the Boltzmann distribution
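The effect of the temperature in equation (3) is easy to see numerically. A minimal sketch, using only the standard library; the function name `p_on` and the example gap of 2.0 are illustrative choices.

```python
import math

def p_on(gap, T):
    """Probability that a unit turns on, given its energy gap and temperature T > 0."""
    return 1.0 / (1.0 + math.exp(-gap / T))

gap = 2.0   # hypothetical energy gap favouring the "on" state
for T in (0.01, 1.0, 100.0):
    print(f"T={T:>6}: prob(s_i = 1) = {p_on(gap, T):.4f}")
```

At very low temperature the unit behaves almost like a binary threshold unit, while at very high temperature the energy gap hardly matters and the decision is close to a fair coin flip.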
Boltzmann Machine
Statistical Mechanics
Consider a physical system with a large number of possible states and many degrees of freedom.
Let $p_i$ be the occurrence probability of state $i$:
$$p_i \ge 0 \text{ for all } i \quad \text{and} \quad \sum_i p_i = 1.$$
At thermal equilibrium, state $i$ occurs with probability
$$p_i = \frac{1}{Z} \exp\left(-\frac{E_i}{k_B T}\right)$$
where $E_i$ is the energy of the system in state $i$ and $k_B = 1.38 \times 10^{-23}$ J/K is the Boltzmann constant.
$\exp\left(-\frac{E_i}{k_B T}\right)$ is the Boltzmann factor.
Partition function (Zustandssumme):
$$Z = \sum_i \exp\left(-\frac{E_i}{k_B T}\right)$$
Statistical Mechanics
Gibbs distribution:
1 States of low energy have a higher occurrence probability than states of high energy.
2 As $T$ is reduced, the probability is concentrated on a smaller subset of low-energy states.
In the context of neural nets, the parameter $T$ (which controls thermal fluctuations) represents the effect of synaptic noise. Hence $k_B = 1$ is set, and $p_i$ and $Z$ take the form
$$p_i = \frac{1}{Z} \exp\left(-\frac{E_i}{T}\right), \qquad Z = \sum_i \exp\left(-\frac{E_i}{T}\right)$$
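Both properties of the Gibbs distribution can be observed on a toy example. The sketch below assumes NumPy and four made-up state energies; `boltzmann` is a hypothetical helper that evaluates $p_i$ with $k_B = 1$.

```python
import numpy as np

def boltzmann(energies, T):
    """p_i = exp(-E_i / T) / Z with k_B = 1."""
    e = np.asarray(energies, dtype=float)
    # Subtracting min(E) leaves p_i unchanged and avoids overflow at small T.
    weights = np.exp(-(e - e.min()) / T)
    return weights / weights.sum()          # division by Z normalises the probabilities

energies = [0.0, 1.0, 2.0, 5.0]             # hypothetical state energies
p_hot = boltzmann(energies, T=10.0)
p_cold = boltzmann(energies, T=0.1)
print(p_hot)
print(p_cold)
```

At the higher temperature the probabilities are spread over all four states, still ordered by energy; at the lower temperature almost all of the probability sits on the lowest-energy state.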
What is a Boltzmann Machine?
Boltzmann Machine: a Hopfield net consisting of binary stochastic neurons with hidden units is called a Boltzmann machine.
A Boltzmann machine is a network of symmetrically connected, neuron-like units that make stochastic decisions about whether to be on or off.
Structure of Boltzmann Machine
The stochastic neurons of a Boltzmann machine are in two groups: visible and hidden.
Visible neurons provide an interface between the net and its environment.
Structure of Boltzmann Machine
During the training phase, the visible neurons are clamped; the hidden neurons always operate freely and are used to explain underlying constraints in the environmental input vectors.
The hidden units do this (explain underlying constraints) by capturing higher-order correlations between the clamping vectors.
Boltzmann machine learning may be viewed as an unsupervised learning procedure for modelling a distribution that is specified by the clamping patterns.
The network can perform pattern completion: when a vector bearing part of the information is clamped onto a subset of the visible neurons, the network completes the pattern on the remaining visible neurons (if it has learnt properly).
Modelling Binary Data
The objective of a Boltzmann machine is modelling binary data:
Given a training set of binary vectors, fit a model that will assign a probability to every possible binary vector.
When unit $i$ is given the opportunity to update its state, it first computes its total input $z_i$,
$$z_i = b_i + \sum_j s_j w_{ij} \qquad (4)$$
Unit $i$ turns on with probability
$$\mathrm{prob}(s_i = 1) = \frac{1}{1 + e^{-z_i}} \qquad (5)$$
If the units are updated sequentially in random order, the network will eventually reach a Boltzmann distribution, also called its stationary distribution.
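The claim about the stationary distribution can be checked empirically on a network small enough to enumerate. The sketch below assumes NumPy, a made-up 3-unit network, and temperature 1. It runs the sequential updates of equations (4) and (5) and compares the observed state frequencies with $p(s) \propto e^{-E(s)}$ computed by brute force. The function names, burn-in length, and number of sweeps are illustrative assumptions.

```python
import itertools
import numpy as np

def gibbs_sample(W, b, rng, n_sweeps=5000, burn_in=500):
    """Update units in random order; return the observed frequency of each state."""
    n = len(b)
    s = rng.integers(0, 2, size=n).astype(float)
    counts = {}
    for sweep in range(n_sweeps):
        for i in rng.permutation(n):
            z = b[i] + W[i] @ s                       # total input, equation (4)
            p_on = 1.0 / (1.0 + np.exp(-z))           # equation (5)
            s[i] = 1.0 if rng.random() < p_on else 0.0
        if sweep >= burn_in:                          # discard the initial transient
            key = tuple(int(x) for x in s)
            counts[key] = counts.get(key, 0) + 1
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}

def exact_distribution(W, b):
    """Boltzmann distribution p(s) proportional to exp(-E(s)), by enumeration."""
    n = len(b)
    states = [np.array(c, dtype=float) for c in itertools.product([0, 1], repeat=n)]
    weights = np.array([np.exp((s @ b) + 0.5 * (s @ W @ s)) for s in states])
    probs = weights / weights.sum()
    return {tuple(int(x) for x in s): float(p) for s, p in zip(states, probs)}

# Hypothetical 3-unit network, for illustration only.
W = np.array([[0.0, 1.0, -1.0],
              [1.0, 0.0, 0.5],
              [-1.0, 0.5, 0.0]])
b = np.array([0.2, -0.3, 0.1])

rng = np.random.default_rng(0)
est = gibbs_sample(W, b, rng)
exact = exact_distribution(W, b)
for state, p in exact.items():
    print(state, round(p, 3), round(est.get(state, 0.0), 3))
```

Enumeration is only feasible for a handful of units, since the number of states doubles with each unit; for larger networks sampling is the only practical way to approach the distribution.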
How Boltzmann Machine Generates Data
It is not a causal generative model.
Everything is defined in terms of energies of joint configurations of the visible and hidden units.
The energies of the joint configurations are related to their probabilities in two ways:
- either define the probability directly as $p(v, h) \propto e^{-E(v, h)}$,
- or define the probability to be the probability of finding the network in that joint configuration after all of the stochastic binary units have been updated many times.
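The first definition can be made concrete by enumerating a tiny network. The sketch below assumes NumPy and a made-up network with two visible units and one hidden unit. It takes $E(v, h)$ to be the global energy of equation (1) applied to the concatenated state, and it also sums out the hidden unit to obtain $p(v)$, a step this slide does not state. The function names are hypothetical.

```python
import itertools
import numpy as np

def joint_energy(v, h, W, b):
    """Energy of a joint configuration; v and h are concatenated into one state."""
    s = np.concatenate([v, h])
    return float(-(s @ b) - 0.5 * (s @ W @ s))

def joint_and_marginal(n_visible, n_hidden, W, b):
    """Enumerate p(v, h) = exp(-E(v, h)) / Z and the marginal p(v) = sum_h p(v, h)."""
    configs = []
    for v in itertools.product([0.0, 1.0], repeat=n_visible):
        for h in itertools.product([0.0, 1.0], repeat=n_hidden):
            e = joint_energy(np.array(v), np.array(h), W, b)
            configs.append((v, h, np.exp(-e)))
    Z = sum(w for _, _, w in configs)              # partition function
    joint = {(v, h): w / Z for v, h, w in configs}
    marginal = {}
    for (v, h), p in joint.items():
        marginal[v] = marginal.get(v, 0.0) + p
    return joint, marginal

# Hypothetical network: units 0 and 1 are visible, unit 2 is hidden.
W = np.array([[0.0, 0.5, 2.0],
              [0.5, 0.0, -1.0],
              [2.0, -1.0, 0.0]])
b = np.array([0.0, 0.1, -0.5])

joint, marginal = joint_and_marginal(2, 1, W, b)
for v, p in marginal.items():
    print(v, round(float(p), 3))
```

The normalising constant $Z$ requires a sum over every joint configuration, which is why this direct definition is only usable for very small networks.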
How Boltzmann Machine Generates Data
In a Boltzmann Machine, everything is defined in terms of the energies of joint configurations of the visible (v) and hidden (h) units.
The energies of joint configurations are related to their probabilities as
$$p(v, h) \propto e^{-E(v,h)} \qquad (6)$$
Probabilities in Terms of Energies
The probability of a joint configuration over both visible and hidden units depends on the energy of that joint configuration compared with the energies of all other joint configurations.
The probability of a configuration of the visible units is the sum of the probabilities of all the joint configurations that contain it.
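In symbols, $p(v, h) = e^{-E(v,h)} / Z$, where $Z$ sums $e^{-E}$ over all joint configurations, and $p(v) = \sum_h p(v, h)$. The sketch below computes both by brute-force enumeration for a tiny network. It assumes the energy function $E = -\sum_i s_i b_i - \sum_{i<j} s_i s_j w_{ij}$ used in these slides and assumes the visible units come first in the state vector. The helper names are hypothetical.

```python
import itertools
import math


def energy(state, weights, biases):
    """E = -sum_i s_i b_i - sum_{i<j} s_i s_j w_ij (upper triangle of weights)."""
    n = len(state)
    e = -sum(state[i] * biases[i] for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            e -= state[i] * state[j] * weights[i][j]
    return e


def joint_distribution(weights, biases):
    """Exact probability of every joint configuration (2**n terms)."""
    n = len(biases)
    configs = list(itertools.product([0, 1], repeat=n))
    unnormalized = [math.exp(-energy(s, weights, biases)) for s in configs]
    z = sum(unnormalized)  # partition function
    return {s: u / z for s, u in zip(configs, unnormalized)}


def visible_marginal(joint, n_visible):
    """p(v): sum the joint probabilities over all hidden configurations."""
    marginal = {}
    for state, p in joint.items():
        v = state[:n_visible]
        marginal[v] = marginal.get(v, 0.0) + p
    return marginal
```

Enumeration is only practical for a handful of units, since the number of joint configurations doubles with every unit added.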
An Example
Figure: Credit: Geoffrey Hinton, Neural Networks for Machine Learning
Limitation and Solution
What if the network is very large?
If there is a large number of hidden units, we cannot calculate the partition function exactly, because it is a sum of exponentially many terms.
Instead, we need to draw samples, and to do this we can use Markov Chain Monte Carlo (MCMC).
- Starting from a random global configuration, pick units at random and allow them to stochastically update their states based on their energy gaps.
Run the Markov chain until it reaches its stationary distribution (thermal equilibrium at a temperature of 1).
- The probability of a global configuration is then related to its energy by $p(v, h) \propto e^{-E(v,h)}$.
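The MCMC procedure above can be sketched as follows. The energy gap of unit i is taken to be $E(s_i = 0) - E(s_i = 1)$ with all other units held fixed, which for the energy function in these slides equals the total input $z_i$. The function names, the `temperature` parameter and the fixed step count are assumptions made for illustration.

```python
import math
import random


def energy(state, weights, biases):
    """E = -sum_i s_i b_i - sum_{i<j} s_i s_j w_ij (upper triangle of weights)."""
    n = len(state)
    e = -sum(state[i] * biases[i] for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            e -= state[i] * state[j] * weights[i][j]
    return e


def energy_gap(i, state, weights, biases):
    """E(s_i = 0) - E(s_i = 1), all other units held fixed."""
    off = list(state)
    off[i] = 0
    on = list(state)
    on[i] = 1
    return energy(off, weights, biases) - energy(on, weights, biases)


def run_chain(weights, biases, n_steps, rng, temperature=1.0):
    """Start from a random global configuration and update random units."""
    n = len(biases)
    state = [rng.randint(0, 1) for _ in range(n)]
    for _ in range(n_steps):
        i = rng.randrange(n)
        gap = energy_gap(i, state, weights, biases)
        p_on = 1.0 / (1.0 + math.exp(-gap / temperature))
        state[i] = 1 if rng.random() < p_on else 0
    return state
```

Recomputing the full energy twice per update keeps the sketch close to the definition. A practical sampler would compute the gap directly from the total input of unit i instead.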
Boltzmann Learning
“A surprising feature of this rule is that it uses only locally available information. The change of weight depends only on the behaviour of the two units it connects, even though the change optimizes a global measure.”
- Ackley, Hinton 1985
Goal of Learning
The learning algorithm for the Boltzmann Machine is an unsupervised learning algorithm. Unlike the backpropagation algorithm, where the training set consists of input vectors and desired outputs, in a Boltzmann Machine only the input vectors are provided.
We want to maximize the product of the probabilities that the Boltzmann Machine assigns to the binary vectors in the training set.
This is equivalent to maximizing the probability that we would obtain exactly the N training cases if we sampled a visible vector from the model's stationary distribution N independent times.
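Because the logarithm is monotonic, maximizing the product of probabilities is the same as maximizing the sum of log probabilities, which is the form that the learning rule differentiates. A short restatement, where $v^{(n)}$ is assumed notation for the n-th training vector and $W$ stands for the weights (neither symbol appears in the original slides):

```latex
\max_{W}\; \prod_{n=1}^{N} p\!\left(v^{(n)}\right)
\;\;\Longleftrightarrow\;\;
\max_{W}\; \sum_{n=1}^{N} \log p\!\left(v^{(n)}\right)
```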
Assumptions
The goal of Boltzmann learning is to produce a neural network that correctly models the input patterns according to a Boltzmann distribution. Two assumptions are made:
1. Each environmental vector persists long enough for the network to reach thermal equilibrium.
2. There is no structure in the sequence in which environmental vectors are clamped to the visible units of the network.
Why Learning could be difficult
Consider a chain of hidden units with visible units attached at the two ends.
Training: We want the two visible units to be in opposite states.
Solution: The product of all the weights must be negative. If all of them are positive, then turning on one unit will tend to turn on the next unit, and eventually the two visible units will be in the same state.
Difficulty: To modify w1 and w5, we need to know w3 (and the weights between the other hidden units too).
This is because, if w3 is negative, we need to modify w1 in a different way from what we would do if w3 were positive.
So to change one weight in the right direction, we need to know all the other weights.
Boltzmann Learning Algorithm
The learning procedure is divided into three main phases:
1. Clamping phase
2. Free-running phase
3. Learning phase
Algorithm
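1. Initialization: Set the weights to random numbers in [-1, 1].
2. Clamping Phase: Present the net with the mapping it is supposed to learn by clamping the input and output units to the patterns. For each pattern, perform simulated annealing on the hidden units through the temperature sequence $T_0, T_1, \ldots, T_{final}$. At the final temperature, collect statistics to estimate the correlations
$$\rho_{ji}^{+} = \langle s_j s_i \rangle^{+} \quad (j \neq i)$$
where
$$\langle s_j s_i \rangle^{+} = \sum_{s_\alpha \in \mathcal{T}} \sum_{s_\beta} P(S_\beta = s_\beta \mid S_\alpha = s_\alpha)\, s_j s_i$$
Here $s_\alpha$ and $s_\beta$ denote the state vectors of the visible and hidden neurons respectively, and $\mathcal{T}$ is the set of training patterns.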
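3. Free-running Phase: Repeat the calculations of step 2, but this time clamp only the input units. At the final temperature, estimate the correlations
$$\rho_{ji}^{-} = \langle s_j s_i \rangle^{-} \quad (j \neq i)$$
where
$$\langle s_j s_i \rangle^{-} = \sum_{s_\alpha \in \mathcal{T}} \sum_{s_\beta} P(S_\beta = s_\beta)\, s_j s_i$$
and the outer sum runs over the set of training patterns $\mathcal{T}$.
4. Learning Phase: Update the weights using the learning rule
$$\Delta w_{ji} = \eta\,(\rho_{ji}^{+} - \rho_{ji}^{-})$$
where $\eta$ is a learning-rate parameter that depends on the temperature T ($\eta = \varepsilon / T$).
5. Iteration: Repeat steps 2 to 4 until the learning procedure converges, that is, until there are no more changes in the synaptic weights $w_{ji}$ for all j, i.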
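The statistics-gathering and weight-update steps can be sketched in Python as below. This assumes that state samples have already been collected at the final temperature of each phase. The annealing itself is not shown, and the names `correlations` and `update_weights` are illustrative rather than taken from the slides.

```python
def correlations(samples):
    """Estimate rho_ji = <s_j s_i> (j != i) by averaging over sampled states."""
    n = len(samples[0])
    rho = [[0.0] * n for _ in range(n)]
    for s in samples:
        for j in range(n):
            for i in range(n):
                if i != j:
                    rho[j][i] += s[j] * s[i]
    return [[value / len(samples) for value in row] for row in rho]


def update_weights(weights, rho_plus, rho_minus, eta):
    """Learning rule: w_ji <- w_ji + eta * (rho_plus_ji - rho_minus_ji)."""
    n = len(weights)
    return [
        [weights[j][i] + eta * (rho_plus[j][i] - rho_minus[j][i]) for i in range(n)]
        for j in range(n)
    ]
```

For example, `update_weights(w, correlations(clamped_samples), correlations(free_samples), eta)` would perform one pass of step 4, where `clamped_samples` and `free_samples` are hypothetical lists of state vectors gathered in steps 2 and 3.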
Learning Algorithm
Correlation
Everything that one weight needs to know about the other weights and the data is contained in the difference of two correlations:
$$\frac{\partial \log p(v)}{\partial w_{ij}} = \langle s_i s_j \rangle_v - \langle s_i s_j \rangle_{model} \qquad (7)$$
where $\langle \cdot \rangle$ denotes the expectation value.
The first term on the R.H.S. is the expectation value of the product of states at equilibrium when the state vector (or data) v is clamped on the visible units.
The second term on the R.H.S. is the expectation value of the product of states at equilibrium without any clamping.
So we can change each weight by
$$\Delta w_{ij} \propto \langle s_i s_j \rangle_v - \langle s_i s_j \rangle_{model} \qquad (8)$$
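For a network small enough to enumerate, both expectations in equation (7) can be computed exactly, which gives a way to check the rule numerically. The sketch below swaps exact enumeration in for the equilibrium sampling described in the slides, so it only suits toy networks. It assumes the first `len(v)` units are visible and that the energy uses the upper triangle of the weight matrix. All function names are hypothetical.

```python
import itertools
import math


def energy(state, weights, biases):
    """E = -sum_i s_i b_i - sum_{i<j} s_i s_j w_ij (upper triangle of weights)."""
    n = len(state)
    e = -sum(state[i] * biases[i] for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            e -= state[i] * state[j] * weights[i][j]
    return e


def expected_products(weights, biases, clamp=None):
    """<s_i s_j> at equilibrium; clamp maps a unit index to its fixed state."""
    n = len(biases)
    clamp = clamp or {}
    total = 0.0
    sums = [[0.0] * n for _ in range(n)]
    for state in itertools.product([0, 1], repeat=n):
        if any(state[k] != value for k, value in clamp.items()):
            continue  # skip configurations that disagree with the clamped units
        p = math.exp(-energy(state, weights, biases))
        total += p
        for i in range(n):
            for j in range(n):
                sums[i][j] += p * state[i] * state[j]
    return [[x / total for x in row] for row in sums]


def log_likelihood_gradient(v, weights, biases):
    """Equation (7): <s_i s_j>_v - <s_i s_j>_model for every pair (i, j)."""
    data = expected_products(weights, biases, dict(enumerate(v)))
    model = expected_products(weights, biases)
    n = len(biases)
    return [[data[i][j] - model[i][j] for j in range(n)] for i in range(n)]


def log_prob_visible(v, weights, biases):
    """log p(v), summing over hidden configurations; used to check the gradient."""
    numerator = 0.0
    z = 0.0
    for state in itertools.product([0, 1], repeat=len(biases)):
        p = math.exp(-energy(state, weights, biases))
        z += p
        if state[: len(v)] == tuple(v):
            numerator += p
    return math.log(numerator / z)
```

Comparing an entry of `log_likelihood_gradient` with a finite-difference estimate of `log_prob_visible` is a simple sanity check that the difference of correlations really is the gradient.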
Why is it so?
We know that the probability of a global configuration at equilibrium is
$$p(v, h) \propto e^{-E(v,h)}$$
So the logarithm of the probability is a linear function of the energy.
And the energy, in turn, is a linear function of the weights and states:
$$E = -\sum_i s_i b_i - \sum_{i<j} s_i s_j w_{ij}$$
Hence,
$$\frac{\partial E}{\partial w_{ij}} = -s_i s_j \qquad (9)$$
Differentiating equation 1, we get
$$\frac{\partial E}{\partial w_{ij}} = -s_i s_j$$
1. The process of settling to the equilibrium state propagates information about the weights.
2. There is no need for back-propagation.
3. The following two stages are required:
- The machine needs to settle to equilibrium with the data.
- The machine needs to settle to equilibrium without the data.
4. However, in both cases the process is similar, only with different boundary conditions.
Why we need negative phase
The combined use of the positive and negative phases stabilizes the distribution of synaptic weights.
Both phases are equally important because of the presence of the partition function Z.
The direction of steepest descent in energy space is not the same as the direction of steepest ascent in probability space.
Shortcomings
There are a few problems with the Boltzmann learning algorithm:
The number of iterations needed is not known in advance.
Because of the negative phase, it takes more computation time.
The algorithm computes averages in the two phases and takes their difference. When these two correlations are similar, sampling noise makes the difference even noisier.
It runs very slowly and takes a very long time to learn.
Weight explosion: If the weights get too big too early, the network gets stuck in one goodness optimum.
- These shortcomings can be overcome by using a sigmoid belief network.
Applications
Stock market trend prediction
Character recognition
Face recognition
Internet applications
Cancer detection
Loan applications
Decision making
Restricted Boltzmann Machine
A Restricted Boltzmann Machine (RBM) is a stochastic neural network, i.e. its units behave randomly when activated. It consists of one layer of visible units (neurons) and one layer of hidden units. Units within a layer have no connections between them, and each unit is connected to all units in the other layer. Connections between neurons are bidirectional and symmetric: information flows in both directions during training and during use of the network, and the weights are the same in both directions.
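The bipartite, symmetric structure can be illustrated with a minimal sketch. The names `W`, `b`, `c`, `hidden_probs` and `visible_probs` are assumptions chosen for this example, and the logistic activation is the usual choice for binary units rather than something stated on the slide. One weight matrix `W` serves both directions, and there are no weights inside a layer.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def hidden_probs(v, W, c):
    """P(h_j = 1 | v) for every hidden unit j; W[i][j] links visible i to hidden j."""
    return [sigmoid(c[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(c))]


def visible_probs(h, W, b):
    """P(v_i = 1 | h) for every visible unit i, reusing the same W (symmetric weights)."""
    return [sigmoid(b[i] + sum(W[i][j] * h[j] for j in range(len(h))))
            for i in range(len(b))]


def sample(probs, rng):
    """Stochastic binary units: unit k turns on with probability probs[k]."""
    return [1 if rng.random() < p else 0 for p in probs]


rng = random.Random(0)
n_visible, n_hidden = 4, 2

# Only visible-hidden weights exist; there are no visible-visible or hidden-hidden weights.
W = [[rng.gauss(0.0, 0.1) for _ in range(n_hidden)] for _ in range(n_visible)]
b = [0.0] * n_visible  # visible biases
c = [0.0] * n_hidden   # hidden biases

v = [1, 0, 1, 0]
h = sample(hidden_probs(v, W, c), rng)        # visible -> hidden
v_recon = sample(visible_probs(h, W, b), rng)  # hidden -> visible, same weights
```

Because no unit depends on another unit in its own layer, all hidden units can be sampled independently given the visible layer, and vice versa.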
How RBM Works
First, the network is trained on a data set by setting the neurons of the visible layer to match the data points in this data set.
After the network is trained, we can use it on new, unknown data, for example to classify that data. Training uses no class labels, which is why it is known as unsupervised learning.
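The two steps above can be sketched as follows. The slide does not name a training rule, so this example assumes one-step contrastive divergence (CD-1), a common choice for RBMs. The function names, learning rate, epoch count and toy data are all hypothetical.

```python
import math
import random


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def hidden_probs(v, W, c):
    return [sigmoid(c[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(c))]


def visible_probs(h, W, b):
    return [sigmoid(b[i] + sum(W[i][j] * h[j] for j in range(len(h))))
            for i in range(len(b))]


def train_cd1(data, n_hidden, epochs=100, lr=0.1, seed=0):
    """Train an RBM with CD-1 (an assumed rule, not taken from the slides)."""
    rng = random.Random(seed)
    n_visible = len(data[0])
    W = [[rng.gauss(0.0, 0.01) for _ in range(n_hidden)] for _ in range(n_visible)]
    b = [0.0] * n_visible
    c = [0.0] * n_hidden
    for _ in range(epochs):
        for v0 in data:
            # Step 1: set the visible layer to a data point, then sample the hidden layer.
            ph0 = hidden_probs(v0, W, c)
            h0 = [1 if rng.random() < p else 0 for p in ph0]
            # One reconstruction step (probabilities are used instead of samples).
            v1 = visible_probs(h0, W, b)
            ph1 = hidden_probs(v1, W, c)
            # Move towards the data statistics and away from the reconstruction statistics.
            for i in range(n_visible):
                for j in range(n_hidden):
                    W[i][j] += lr * (v0[i] * ph0[j] - v1[i] * ph1[j])
                b[i] += lr * (v0[i] - v1[i])
            for j in range(n_hidden):
                c[j] += lr * (ph0[j] - ph1[j])
    return W, b, c


# Toy, unlabeled training set with two repeated binary patterns.
data = [[1, 1, 0, 0], [0, 0, 1, 1]] * 5
W, b, c = train_cd1(data, n_hidden=2)

# Step 2: apply the trained network to a new data point.
new_point = [1, 1, 0, 0]
features = hidden_probs(new_point, W, c)           # hidden activations as features
reconstruction = visible_probs(features, W, b)     # the network's reconstruction
```

No labels are passed to `train_cd1`. If classification is the goal, the hidden activations in `features` would typically be fed to a separate classifier, or a visible unit per label would be added; both options go beyond what the slide specifies.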
Continuous Restricted Boltzmann Machine
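A Continuous Restricted Boltzmann Machine (CRBM) is implemented very similarly to the original RBM. The original RBM uses binary neurons, with 0 and 1 as the possible activation values, whereas the units of a CRBM take continuous values.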
Figure: Training data and reconstructed data (2500 training data points were used)
Reference
Geoffrey Hinton, Neural Networks for Machine Learning, www.coursera.org
Geoffrey Hinton, http://www.scholarpedia.org/article/Boltzmann_machine
Simon Haykin, 2nd ed.
Geoffrey Hinton, David Ackley, A Learning Algorithm for Boltzmann Machines, Cognitive Science 9, 147-169 (1985)
THANK YOU