Neural networks, connectionism and Bayesian learning
Pantelis P. Analytis
Neural nets
Connectionism in Cognitive Science
Bayesian inference
Bayesian learning models
Assignment 2: modeling choice
Neural networks, connectionism and Bayesian learning
Pantelis P. Analytis
March 7, 2018
1 Neural nets
2 Connectionism in Cognitive Science
3 Bayesian inference
4 Bayesian learning models
5 Assignment 2: modeling choice
The first neural nets
Information travels down the axons and is conveyed to other neurons via the synapses.
If the information received by a neuron exceeds a threshold, the neuron fires.
The first neural nets
The first formal abstraction of the neural network was proposed by McCulloch and Pitts (1943).
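A minimal sketch of a threshold unit in the spirit of the McCulloch and Pitts abstraction: binary inputs are summed with fixed weights, and the unit fires once the sum reaches a threshold. The function names, the unit weights and the thresholds are illustrative assumptions, not taken from the 1943 paper.

```python
def threshold_unit(inputs, weights, threshold):
    """Return 1 (fire) if the weighted input sum reaches the threshold, else 0."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0


# Assumed parameters: with unit weights, a threshold of 2 behaves like
# logical AND and a threshold of 1 behaves like logical OR.
def and_gate(a, b):
    return threshold_unit([a, b], [1, 1], threshold=2)


def or_gate(a, b):
    return threshold_unit([a, b], [1, 1], threshold=1)
```

Changing only the threshold turns the same unit from AND into OR, which is why such units can serve as building blocks for logical circuits.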
The first neural nets
The first learning neural net algorithm, the perceptron, appeared in Psychological Review (Rosenblatt, 1958). Rosenblatt was director of the Cognitive Systems research programs at Cornell through 1971.
The perceptron algorithm
The NYT reported on the perceptron as "the embryo of an electronic computer that [the Navy] expects will be able to walk, talk, see, write, reproduce itself and be conscious of its existence".
The perceptron algorithm
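The algorithm on this slide was shown as a figure that did not survive extraction. As a stand-in, here is a minimal sketch of the classic perceptron learning rule. It assumes 0/1 labels, a learning rate of 1 and a toy OR dataset, all chosen for illustration rather than taken from the slide.

```python
def predict(weights, bias, x):
    """Fire (1) when the weighted sum plus bias is positive, else 0."""
    activation = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if activation > 0 else 0


def train_perceptron(samples, labels, epochs=20, lr=1.0):
    """Nudge the weights towards each misclassified example until none remain."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in zip(samples, labels):
            error = y - predict(weights, bias, x)  # -1, 0 or +1
            if error != 0:
                mistakes += 1
                weights = [w + lr * error * xi for w, xi in zip(weights, x)]
                bias += lr * error
        if mistakes == 0:  # a full pass without errors: converged
            break
    return weights, bias


# Toy data (assumed): logical OR is linearly separable, so training converges.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y_or = [0, 1, 1, 1]
w, b = train_perceptron(X, y_or)
```

The same routine cannot fit XOR labels (`[0, 1, 1, 0]`), because no single line separates the two classes.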
The perceptron: limitations
In 1969 Minsky and Papert published a book that stressed the limitations of perceptrons and led to the first AI winter. It lasted until the early 80s, when, with new impetus from physics, neural nets came back into fashion.
Multi-layer perceptrons
Multi-layer feedforward networks are universal function approximators (Hornik, Stinchcombe, White, 1989).
Backpropagation
The algorithm was conceived in the context of control theory. Werbos (1975) suggested using it to train neural nets in his PhD thesis.
Rumelhart, Hinton and Williams (1986) showed that it can generate valuable internal representations of data in hidden layers.
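To make the idea concrete, here is a minimal sketch of backpropagation for a tiny network with two inputs, two hidden units and one output. The architecture, the sigmoid units, the squared-error loss and the learning rate are all assumptions chosen for brevity. The code applies the chain rule layer by layer, sending the output error back to the hidden units.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(params, x):
    """Return hidden activations and the output of a 2-2-1 sigmoid network."""
    W1, b1, W2, b2 = params
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(W1, b1)]
    out = sigmoid(sum(w * hi for w, hi in zip(W2, h)) + b2)
    return h, out


def loss(params, x, y):
    return 0.5 * (forward(params, x)[1] - y) ** 2


def backprop(params, x, y):
    """Gradients of the squared error with respect to every parameter."""
    W1, b1, W2, b2 = params
    h, out = forward(params, x)
    delta_out = (out - y) * out * (1.0 - out)  # error at the output unit
    grad_W2 = [delta_out * hi for hi in h]
    # Error propagated back through the output weights to each hidden unit.
    delta_h = [delta_out * w * hi * (1.0 - hi) for w, hi in zip(W2, h)]
    grad_W1 = [[d * xi for xi in x] for d in delta_h]
    return grad_W1, delta_h, grad_W2, delta_out


def sgd_step(params, x, y, lr=0.5):
    """One gradient-descent update on a single training example."""
    W1, b1, W2, b2 = params
    gW1, gb1, gW2, gb2 = backprop(params, x, y)
    W1 = [[w - lr * g for w, g in zip(row, grow)] for row, grow in zip(W1, gW1)]
    b1 = [b - lr * g for b, g in zip(b1, gb1)]
    W2 = [w - lr * g for w, g in zip(W2, gW2)]
    return W1, b1, W2, b2 - lr * gb2


# Small random starting weights (seeded so the example is reproducible).
rng = random.Random(0)
params = ([[rng.uniform(-1, 1) for _ in range(2)] for _ in range(2)],
          [0.0, 0.0],
          [rng.uniform(-1, 1) for _ in range(2)],
          0.0)
```

Repeated calls to `sgd_step` over a training set would drive the loss down. The hidden activations `h` returned by `forward` are the internal representations the slide refers to.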
50 years later
Around 2006, Geoff Hinton and colleagues started to achieve surprisingly good results in speech recognition.
By 2009 they were able to outperform all other approaches in speech recognition.
The eight major aspects of parallel distributed processing (Rumelhart, Hinton, McClelland, 1987)
1 A set of processing units.
2 A state of activation.
3 An output function for each unit.
4 A pattern of connectivity among units.
5 A propagation rule for propagating patterns of activities through the connectivities.
6 An activation rule for combining the inputs impinging on a unit with the current state of that unit to produce a new level of activation for the unit.
7 A learning rule whereby patterns of connectivity are modified by experience.
8 An environment within which a system must operate.
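The eight aspects can be read as a recipe for a program. The sketch below maps each one onto a piece of a toy network. The class name `PDPNetwork`, the identity output function, the tanh-based activation rule, the Hebbian learning rule and the input patterns are all illustrative assumptions, not the authors' specification.

```python
import math


class PDPNetwork:
    def __init__(self, weights, learning_rate=0.1):
        self.weights = weights                   # (4) connectivity: weights[i][j] is j -> i
        self.activation = [0.0] * len(weights)   # (1) units and (2) their state of activation
        self.learning_rate = learning_rate

    def output(self, a):                         # (3) output function (identity here)
        return a

    def propagate(self, external_input):         # (5) propagation rule: weighted sum of outputs
        outputs = [self.output(a) for a in self.activation]
        return [sum(w * o for w, o in zip(row, outputs)) + ext
                for row, ext in zip(self.weights, external_input)]

    def activate(self, net_input, decay=0.5):    # (6) activation rule: blend old state and input
        self.activation = [decay * a + (1 - decay) * math.tanh(n)
                           for a, n in zip(self.activation, net_input)]

    def learn(self):                             # (7) learning rule: plain Hebbian update
        outputs = [self.output(a) for a in self.activation]
        for i, row in enumerate(self.weights):
            for j in range(len(row)):
                if i != j:                       # no self-connections
                    row[j] += self.learning_rate * self.activation[i] * outputs[j]


# (8) environment: a stream of external input patterns (made up for illustration).
environment = [[1.0, 1.0, 0.0], [1.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
net = PDPNetwork([[0.0] * 3 for _ in range(3)])
for pattern in environment:
    net.activate(net.propagate(pattern))
    net.learn()
```

After this short run, the connection between units 0 and 1, which were repeatedly active together, is stronger than the connection between units 0 and 2.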
Different network architectures
Different types of networks can do different jobs well.
Marr’s three levels of analysis
Computational: What is the goal of the computation? What problems does a system solve or overcome?
Algorithmic: How does the system do it? What representations does it use and what processes does it employ to manipulate the representations?
Implementational: How can such a system be built in hardware?
Connectionist versions of cognitive models
Many cognitive models can be represented as parallel processing systems.
Above is the exemplar model for categorization.
Connectionist versions of cognitive models
And here is the prototype model for categorization.
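The network diagrams for the two models are not reproduced here, so the following sketch contrasts them in code. It assumes an exponentially decaying similarity over city-block distance and a ratio choice rule, which are common choices in the categorization literature. The function names and toy stimuli are hypothetical.

```python
import math


def similarity(x, y, c=1.0):
    """Exponentially decaying similarity in city-block distance (assumed form)."""
    return math.exp(-c * sum(abs(a - b) for a, b in zip(x, y)))


def exemplar_probability(item, categories):
    """Sum similarity to every stored exemplar, then normalise across categories."""
    evidence = {name: sum(similarity(item, ex) for ex in exemplars)
                for name, exemplars in categories.items()}
    total = sum(evidence.values())
    return {name: value / total for name, value in evidence.items()}


def prototype_probability(item, categories):
    """Compare the item only with each category's mean (its prototype)."""
    evidence = {}
    for name, exemplars in categories.items():
        prototype = [sum(dim) / len(exemplars) for dim in zip(*exemplars)]
        evidence[name] = similarity(item, prototype)
    total = sum(evidence.values())
    return {name: value / total for name, value in evidence.items()}


# Toy stimuli (made up): category A has one atypical member close to category B.
categories = {
    "A": [(0.0, 0.0), (0.2, 0.1), (2.0, 2.0)],
    "B": [(2.0, 1.8), (1.8, 2.2), (2.2, 2.0)],
}
probe = (2.0, 2.0)
by_exemplars = exemplar_probability(probe, categories)
by_prototypes = prototype_probability(probe, categories)
```

For this probe the exemplar model assigns a noticeably higher probability to category A than the prototype model does, because it remembers the atypical member instead of averaging it away.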
Neural nets going wrong
Nets are trained on large image datasets, and they are very much tuned to the statistics of their training set.
Inverse inference
Bayes's illustrious "An Essay towards solving a Problem in the Doctrine of Chances" was published posthumously in 1763.
Laplace put the work on firm mathematical foundations and popularized it.
Rules of inverse inference
Bayes theorem: P(A|B) = P(B|A)P(A)/P(B)
P(A) is the probability of event A and P(B) is the probability of event B.
P(A|B) is the probability of observing A when B is true, and P(B|A) is the probability of observing B if A is true.
Medical scenario
A patient goes to see a doctor. The doctor performs a test with 99 percent reliability, that is, 99 percent of people who are sick test positive and 99 percent of the healthy people test negative. The doctor knows that only 1 percent of the people in the country are sick. Now the question is: if the patient tests positive, what are the chances the patient is sick?
Example
In a population of 10,000 there will be 100 diseased people and 9,900 non-diseased people.
Among the 100 diseased people, 99 will test positive.
We also know that the specificity is 99%, that is, there is a 1% error rate in non-diseased people.
Therefore, among the 9,900 non-diseased people, 99 will have a positive test.
Table 1: Table representation of the problem
                 Diseased   Not diseased    Total
Test positive          99             99      198
Test negative           1           9801     9802
Total                 100           9900    10000
Of the 198 people who test positive, only 99 are diseased, so the chance that a patient who tests positive is sick is 99/198 = 50%.
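The same number follows from Bayes' theorem applied directly to the probabilities in the medical scenario. The short sketch below does this. The function name `posterior` and its parameter names are illustrative choices.

```python
def posterior(prior, sensitivity, specificity):
    """P(sick | positive test) computed with Bayes' theorem."""
    true_positive = sensitivity * prior                    # P(positive | sick) P(sick)
    false_positive = (1.0 - specificity) * (1.0 - prior)   # P(positive | healthy) P(healthy)
    return true_positive / (true_positive + false_positive)


# Numbers from the scenario: 1% base rate, 99% sensitivity, 99% specificity.
p_sick_given_positive = posterior(prior=0.01, sensitivity=0.99, specificity=0.99)

# The natural-frequency table gives the same answer: 99 of the 198 positives are sick.
from_counts = 99 / (99 + 99)
```

Both routes give 0.5. Raising the base rate to 10% with the same test would lift the posterior to about 0.92, which shows how strongly the answer depends on the prior.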
Base rate fallacy and frequencies
Most people get these problems wrong when they are presented as vignettes.
But people's judgments are much more accurate when the problems are presented as natural frequencies (Gigerenzer and Hoffrage, 1995).
Difference between Bayesian and frequentist statistics
Bayesian optimization
[Figure: Prior, squared exponential kernel, l=1. Horizontal axis: input, x (0 to 1). Vertical axis: output, f(x) (−2 to 2).]
The framework was initially developed in the context of oil extraction.
Gaussian processes have as a prior a distribution over possible functional forms.
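A minimal sketch of what "a prior over functions" means in practice: draw function values on a grid from a zero-mean Gaussian process with a squared exponential kernel. Reading the figure's `l=1` as the kernel length scale is an assumption, as are the grid, the jitter term and the function names. The code uses only the standard library.

```python
import math
import random


def squared_exponential(x1, x2, length_scale=1.0):
    """Covariance between f(x1) and f(x2): nearby inputs get similar outputs."""
    return math.exp(-0.5 * ((x1 - x2) / length_scale) ** 2)


def covariance_matrix(xs, length_scale=1.0, jitter=1e-6):
    """Kernel matrix with a small diagonal jitter for numerical stability."""
    n = len(xs)
    return [[squared_exponential(xs[i], xs[j], length_scale) + (jitter if i == j else 0.0)
             for j in range(n)] for i in range(n)]


def cholesky(A):
    """Lower-triangular L with L L^T = A, for a positive definite matrix A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def sample_prior(xs, length_scale=1.0, rng=random):
    """One random function from the GP prior, evaluated at the points xs."""
    L = cholesky(covariance_matrix(xs, length_scale))
    z = [rng.gauss(0.0, 1.0) for _ in xs]
    return [sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(len(xs))]


xs = [i / 10 for i in range(11)]  # grid on [0, 1], as in the figure
f = sample_prior(xs, length_scale=1.0, rng=random.Random(1))
```

Each call to `sample_prior` returns a different smooth curve. A shorter length scale would produce wigglier functions.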
Bayesian optimization
[Figure: GP-Thompson, trial 2. Horizontal axis: input, x (0 to 1). Vertical axis: output, f(x) (−2 to 2).]
As observations come in, some functions appear much more likely.
Bayesian optimization
[Figure: GP-Thompson, trial 3. Horizontal axis: input, x (0 to 1). Vertical axis: output, f(x) (−2 to 2).]
As observations come in, some functions appear much more likely than others.
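The figure titles suggest Thompson sampling on top of a Gaussian process. The sketch below shows one way that could look, under several assumptions: a zero-mean GP, a squared exponential kernel with length scale 1, a small observation noise, and a fixed grid of candidate inputs. The function names and the two observations are made up for illustration.

```python
import math
import random


def kernel(a, b, length_scale=1.0):
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def cholesky(A):
    """Lower-triangular L with L L^T = A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def solve_cholesky(L, b):
    """Solve (L L^T) x = b by forward and then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def posterior(x_obs, y_obs, grid, noise=1e-4, length_scale=1.0):
    """Posterior mean and covariance of a zero-mean GP on the grid."""
    K = [[kernel(a, b, length_scale) + (noise if i == j else 0.0)
          for j, b in enumerate(x_obs)] for i, a in enumerate(x_obs)]
    L = cholesky(K)
    alpha = solve_cholesky(L, y_obs)
    k_star = [[kernel(g, a, length_scale) for a in x_obs] for g in grid]
    mean = [sum(k * al for k, al in zip(row, alpha)) for row in k_star]
    v = [solve_cholesky(L, row) for row in k_star]  # K^{-1} k_* for each grid point
    cov = [[kernel(gi, gj, length_scale) - sum(a * b for a, b in zip(k_star[i], v[j]))
            for j, gj in enumerate(grid)] for i, gi in enumerate(grid)]
    return mean, cov


def thompson_choice(mean, cov, grid, rng, jitter=1e-6):
    """Draw one function from the posterior and return the input where it peaks."""
    n = len(grid)
    L = cholesky([[cov[i][j] + (jitter if i == j else 0.0) for j in range(n)]
                  for i in range(n)])
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    draw = [mean[i] + sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(n)]
    return grid[max(range(n), key=draw.__getitem__)]


grid = [i / 10 for i in range(11)]
x_obs, y_obs = [0.2, 0.8], [1.0, -1.0]  # two made-up observations
mean, cov = posterior(x_obs, y_obs, grid)
next_x = thompson_choice(mean, cov, grid, random.Random(0))
```

After conditioning, the posterior variance collapses near the observed inputs and stays larger in between. This is the sense in which some functions become much more likely than others.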
Bayesian causal networks
Judea Pearl developed the formalism in computer science, while Glymour, Spirtes, Gopnik and others introduced it to cognitive science.
Bayesian causal networks
Even with only three nodes there are many different possible causal structures (Bramley, 2016).
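To get a feel for the size of this space, the sketch below enumerates every directed acyclic graph over three labelled variables. It assumes that causal structures are represented as DAGs with at most one arrow per pair of nodes. The node names and helper functions are hypothetical.

```python
from itertools import product

NODES = ("A", "B", "C")
PAIRS = [("A", "B"), ("A", "C"), ("B", "C")]


def has_cycle(edges):
    """True if some node can reach itself by following directed edges."""
    children = {n: [b for a, b in edges if a == n] for n in NODES}

    def reachable(start, target, seen=()):
        return any(c == target or (c not in seen and reachable(c, target, seen + (c,)))
                   for c in children[start])

    return any(reachable(n, n) for n in NODES)


def all_dags():
    graphs = []
    # Each pair of nodes is unconnected, linked forwards, or linked backwards.
    for choice in product((None, "forward", "backward"), repeat=len(PAIRS)):
        edges = []
        for (a, b), direction in zip(PAIRS, choice):
            if direction == "forward":
                edges.append((a, b))
            elif direction == "backward":
                edges.append((b, a))
        if not has_cycle(edges):
            graphs.append(edges)
    return graphs


dags = all_dags()
n_structures = len(dags)  # 25: 3**3 = 27 edge assignments minus the two directed 3-cycles
```

The count grows very quickly with the number of variables, which is part of what makes causal structure learning hard.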
Bayesian causal networks
People and machines can learn either from observing the world or from actively changing it (Bramley, 2016).
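A small worked example of why acting on the world can reveal more than watching it. With the assumed probabilities below, the structures A→B and B→A produce the same joint distribution, so passive observation cannot tell them apart. Forcing A to be on does separate them. All numbers and function names are illustrative.

```python
def joint_a_causes_b(p_a=0.5, p_b_given_a=0.9, p_b_given_not_a=0.1):
    """Joint distribution over (A, B) when A causes B."""
    joint = {}
    for a in (0, 1):
        pa = p_a if a else 1 - p_a
        pb1 = p_b_given_a if a else p_b_given_not_a
        joint[(a, 1)] = pa * pb1
        joint[(a, 0)] = pa * (1 - pb1)
    return joint


def joint_b_causes_a(p_b=0.5, p_a_given_b=0.9, p_a_given_not_b=0.1):
    """Joint distribution over (A, B) when B causes A."""
    joint = {}
    for b in (0, 1):
        pb = p_b if b else 1 - p_b
        pa1 = p_a_given_b if b else p_a_given_not_b
        joint[(1, b)] = pb * pa1
        joint[(0, b)] = pb * (1 - pa1)
    return joint


def p_b_after_setting_a(structure, p_b_given_a=0.9, p_b=0.5):
    """P(B = 1) after an intervention that forces A = 1."""
    # Intervening on A cuts the arrows into A, so B responds only if A causes B.
    return p_b_given_a if structure == "A->B" else p_b


observed_1 = joint_a_causes_b()
observed_2 = joint_b_causes_a()
after_1 = p_b_after_setting_a("A->B")   # 0.9: B follows A
after_2 = p_b_after_setting_a("B->A")   # 0.5: B is unaffected
```

The observational distributions match, while the outcomes of the intervention differ, so a single well-chosen intervention identifies the structure here.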
Children and adults actively intervene in their environment
Alison Gopnik and her students have extensively studied how children discover things.
In robotics people have tried to imitate this form of learning.
Naive Bayes — how your e-mail folders stay clean
Assumes that different pieces of information are conditionally independent given the class.
Although this assumption is wrong in most cases, it is a surprisingly robust model in some contexts (e.g. spam detection).
George Box: all models are wrong, but some of them are useful.
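A minimal sketch of a naive Bayes spam filter, assuming a bag-of-words representation and add-one smoothing. The tiny training set and the function names are made up for illustration. Real filters use far more data and more careful preprocessing.

```python
import math
from collections import Counter


def train(messages):
    """messages: list of (list_of_words, label). Returns the counts needed for prediction."""
    label_counts = Counter(label for _, label in messages)
    word_counts = {label: Counter() for label in label_counts}
    for words, label in messages:
        word_counts[label].update(words)
    vocabulary = {w for words, _ in messages for w in words}
    return label_counts, word_counts, vocabulary


def log_posterior(words, model):
    """Unnormalised log P(label | words) under conditional independence."""
    label_counts, word_counts, vocabulary = model
    total = sum(label_counts.values())
    scores = {}
    for label, count in label_counts.items():
        score = math.log(count / total)  # log prior of the class
        denom = sum(word_counts[label].values()) + len(vocabulary)
        for w in words:
            # Add-one smoothing so unseen words do not zero out the product.
            score += math.log((word_counts[label][w] + 1) / denom)
        scores[label] = score
    return scores


def classify(words, model):
    scores = log_posterior(words, model)
    return max(scores, key=scores.get)


training = [
    ("win money now".split(), "spam"),
    ("cheap money offer".split(), "spam"),
    ("meeting agenda for monday".split(), "ham"),
    ("lunch on monday".split(), "ham"),
]
model = train(training)
```

Each word contributes its own log-likelihood term independently of the others. This is the conditional independence assumption at work.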
Assignment 2: Building models of choice
Items were selected directly from IMDb and Amazon.
People made 200 choices: 90% were conflictual, while 10% were dominated.
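One way to read the two trial types, sketched under an assumption: an option dominates another when it is at least as good on both the average review score and the number of reviews, and a choice is conflictual when each option wins on one attribute. The slides do not spell out this definition, and the function name and example values are hypothetical.

```python
def classify_pair(option_a, option_b):
    """Each option is a tuple (average_rating, number_of_reviews)."""
    rating_diff = option_a[0] - option_b[0]
    reviews_diff = option_a[1] - option_b[1]
    if rating_diff * reviews_diff < 0:
        return "conflictual"  # one option wins on rating, the other on review count
    return "dominated"        # one option is at least as good on both attributes


# Made-up examples: a 4.5-star item with 120 reviews against a 4.0-star item with 900.
trade_off = classify_pair((4.5, 120), (4.0, 900))
clear_win = classify_pair((4.5, 900), (4.0, 120))
```

Only the conflictual pairs force a trade-off between the two attributes. In this simple version, two identical options would also be labelled "dominated".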
Assignment 2: Building models of choice
The average review score and the number of reviews interacted and jointly determined what people preferred.
Assignment 2: Building models of choice
At the aggregate level the prediction rate was relatively low. It was easier to predict choices for films than for books.
Assignment 2: Building models of choice
At the individual level we achieved higher accuracy predicting books. The best model predicted about 85% of choices.