susan-ellis
Instrumental Conditioning II
Delay of Reinforcement
[Figure: Grice (1948) apparatus. Labels: Start, Delay, Choice (Correct / Incorrect), Goal (Reward or No Reward)]
Grice (1948) Results
[Figure: Percent correct (20 to 100) across trials (25 to 700), one curve per delay of reinforcement: 0 s, 0.5 s, 1.2 s, 2 s, 5 s, 10 s]
Overcoming the effects of delay
• Secondary reinforcers
• “Marking” procedure
Lieberman, McIntosh & Thomas (1979)
Effect on rate of behavior:  Reinforcement | Punishment
Positive contingency:  Chocolate bar | Electric shock
Negative contingency:  Excused from chores | No TV privileges
Professor Drew
Anticipatory Contrast - Crespi (1942)
[Figure: Running speed (ft/sec, 0 to 4.5) across trials, for three groups: 256-16 pellets, 16-16 pellets, 1-16 pellets]
Rats run down maze to find food pellets in goal arm.
What is a reinforcer?
Operational Definition (behaviorists): That which increases the probability of the response that preceded it.
Thorndike: A stimulus that produces a “satisfying state of affairs”
Drive Reduction Theory
[Diagram: the amount of H2O in the body is compared with a set point; the comparison drives seeking water or not seeking water]
Drive Reduction Considered: Are reinforcers necessary for survival?
– Eating to excess
– Drugs of Abuse
– “Pleasure centers” of the brain
Behavioral Regulation View: The Premack Principle
• Behaviors are reinforcing, not stimuli
• To predict what will be reinforcing, observe the baseline frequency of different behaviors
• Highly probable behaviors will reinforce less probable behaviors
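The prediction rule in the bullets above can be sketched in a few lines of Python. This is only an illustration: the function name `premack_predictions` and the baseline numbers are hypothetical, not data from Premack's studies.

```python
def premack_predictions(baseline):
    """Return (reinforcer, reinforced) pairs predicted by the Premack Principle.

    baseline maps each behavior to its observed baseline frequency
    (e.g. minutes per hour). A more probable behavior is predicted to
    reinforce any less probable one.
    """
    # Rank behaviors from most to least probable at baseline.
    ranked = sorted(baseline, key=baseline.get, reverse=True)
    pairs = []
    for i, high in enumerate(ranked):
        for low in ranked[i + 1:]:
            # Ties give no prediction either way.
            if baseline[high] > baseline[low]:
                pairs.append((high, low))
    return pairs


# Hypothetical baseline observations (minutes per hour).
baseline = {"running": 5, "drinking": 20, "grooming": 10}
predictions = premack_predictions(baseline)
for reinforcer, reinforced in predictions:
    print(f"{reinforcer} should reinforce {reinforced}")
```

With these made-up numbers, drinking is predicted to reinforce both grooming and running, and grooming to reinforce running.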
Premack Revised: The Response Deprivation Hypothesis
• Low frequency behaviors can reinforce high frequency behaviors (and vice versa)
• All behaviors have a preferred frequency = the behavioral bliss point
• Deprivation below that frequency is aversive, and organisms will work to remedy this
Timberlake & Allison (1974)
Response deprivation hypothesis
[Figure: The ice cream scale (in pints), from 0.25 to 2.5. Bliss point: 1.0 pints/night. Below the bliss point: will work to obtain ice cream. Above the bliss point: will work to avoid ice cream.]
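The ice cream example can be written as a small decision rule. This is a minimal sketch that assumes the bliss point of 1.0 pints/night from the slide; the function name and the "at bliss point" label are illustrative additions.

```python
def bliss_point_prediction(current_pints, bliss_point=1.0):
    """Predict the direction of motivation relative to the bliss point.

    Below the preferred level, the organism should work to obtain more;
    above it, the organism should work to avoid more.
    """
    if current_pints < bliss_point:
        return "work to obtain"
    if current_pints > bliss_point:
        return "work to avoid"
    return "at bliss point"  # illustrative label, not from the slides


for pints in (0.25, 1.0, 2.5):
    print(pints, "->", bliss_point_prediction(pints))
```

The same rule covers both halves of the hypothesis: a behavior held below its preferred frequency becomes something to work for, whatever its baseline rank.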
Contiguity versus Contingency in operant conditioning
Degraded Contingency Effect
[Figure: timelines of bar presses and food deliveries. Perfect contingency: strong responding. Degraded contingency: weak responding.]
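One common way to put a number on contingency, which the slides do not name, is the difference between the probability of food given a press and the probability of food given no press. The sketch below assumes the session is divided into time bins, each recorded as (pressed, food); the function name `delta_p` and the example sessions are hypothetical.

```python
def delta_p(bins):
    """Contingency as P(food | press) - P(food | no press).

    bins is a list of (pressed, food) booleans, one per time bin.
    """
    with_press = [food for pressed, food in bins if pressed]
    without_press = [food for pressed, food in bins if not pressed]
    if not with_press or not without_press:
        raise ValueError("need bins both with and without a press")
    # sum() counts True values, so each ratio is a proportion of bins with food.
    return sum(with_press) / len(with_press) - sum(without_press) / len(without_press)


# Perfect contingency: food follows every press and never comes for free.
perfect = [(True, True)] * 5 + [(False, False)] * 5
# Degraded contingency: same press-food pairings, plus free food in 3 of 5 no-press bins.
degraded = [(True, True)] * 5 + [(False, True)] * 3 + [(False, False)] * 2

print(delta_p(perfect), delta_p(degraded))
```

In both hypothetical sessions every press is followed by food, so contiguity is unchanged; only the free food lowers the contingency.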
G.V. Thomas (1983)
Contiguity pitted against contingency
“Free” reinforcers given every 20s
Lever press advances delivery of pellet, but cancels pellet for next 20-s interval
So if you press at second 2, you get a pellet immediately, but you get no pellet during seconds 3-20 and 21-40.
[Figure: timeline marked at 20 s, 40 s, and 60 s, with the labels "Lever press here" and "Lose this pellet"]
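The cost of pressing on this schedule can be checked with a small simulation. It is a sketch under stated assumptions: a press in an interval delivers that interval's pellet immediately and cancels the pellet for the next interval, and a press during a cancelled interval has no effect. The function name and the press times are hypothetical.

```python
def pellets_earned(press_times, session_s=60, interval_s=20):
    """Count pellets delivered in a session on the schedule described above."""
    n_intervals = session_s // interval_s
    # Intervals (0-based) in which at least one press occurred.
    pressed = {int(t // interval_s) for t in press_times if 0 <= t < session_s}

    pellets = 0
    cancelled = False
    for k in range(n_intervals):
        if cancelled:
            # This interval's pellet was lost to a press in the previous one.
            cancelled = False
            continue
        pellets += 1  # delivered either free at the end or early on a press
        if k in pressed:
            cancelled = True
    return pellets


print(pellets_earned([]))    # never pressing
print(pellets_earned([2]))   # pressing once, at second 2
```

Under these assumptions a press produces immediate food (good contiguity) but lowers the total food earned (negative contingency), which is what pits the two against each other.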
“Superstitious Behavior”
• Suggested that temporal contiguity is more important than contingency
• Fixed-time (FT) 15-s schedule, no response requirement
• “adventitious reinforcement”
“In 6 out of 8 cases the resulting responses were so clearly defined that two observers could agree perfectly in counting instances. One bird was conditioned to turn counter-clockwise about the cage, making 2 or 3 turns between reinforcements. Another repeatedly thrust its head into one of the upper corners of the cage….”
[Figure labels: Orienting toward feeder, Pecking near feeder, Moving along wall, ¼ turn]
“Misbehavior” and the limits of operant conditioning
Limits of Operant Conditioning
• Some behaviors can’t be conditioned
– Yawning
– Scratching
• Belongingness
– Presentation of a female won’t reinforce biting
• “Misbehavior”
Marian Breland Bailey – How to train a chicken
The famous dancing chicken
What is learned in operant conditioning?
S R
What is learned?
Edwin Guthrie: mere contiguity of a stimulus and a behavior stamps in that S-R; reinforcement is not necessary
S R
What is learned?
Thorndike: Reinforcement “stamps in” this connection
S R
O
What is learned?
[Diagram: S, R, and O, with a “?” on the unknown association]
2-Process Theory
operant
Pavlovian
S R
CR
Evidence for 2-process theory: Pavlovian-Instrumental Transfer
Phase 1: Lever → Food
Phase 2: Light → Food
Test: Light: # presses? No light: # presses?
[Figure: # presses in the Light and No CS conditions]
The presence of the CS intensifies operant responding
What is learned?
[Diagram: S, R, and O, with “?” marks on the associations]
Does the Pavlovian S-O association activate a vague emotional state or a specific mental representation of the outcome?
Specific Outcome Representations: Trapold
Phase 1 (operant) | Phase 2 (classical) | Test
R Lever → Pellet | Tone → Pellet | Tone: Left? Right?
L Lever → Sucrose | Light → Sucrose | Light: Left? Right?
[Figure: # presses on the Left and Right levers during Light and Noise]