Operant Conditioning Overview

http://www.youtube.com/watch?v=drnnulHw5CM

Edward Thorndike (1874-1949)

• Introduced the “Law of Effect”

– Behaviors with favorable consequences will occur more frequently.

– Behaviors with unfavorable consequences will occur less frequently.

• Developed into Operant Conditioning

• Created puzzle boxes for research on cats

Thorndike’s Puzzle Box

Operant Conditioning

• A type of learning in which the frequency of a behavior depends on the consequence that follows that behavior

• The frequency will increase if the consequence is reinforcing to the subject.

• The frequency will decrease if the consequence is not reinforcing to the subject.

B.F. Skinner (1904-1990)

• Developed the fundamental principles and techniques of operant conditioning.

• Devised ways to apply these principles in the real world.

• Designed the Skinner Box.

Reinforcement v. Punishment

• Reinforcement: Anything that increases the likelihood that a behavior will be repeated

• Punishment: Anything that decreases the likelihood that a behavior will be repeated

Positive Reinforcement

• Anything that increases the likelihood of a behavior by following it with a desirable event or state

• The subject receives something they want

• Will strengthen the behavior

Negative Reinforcement

• Anything that increases the likelihood of a behavior by following it with the removal of an undesirable event or state

• Something the subject doesn’t like is removed

• Will strengthen the behavior (by the definition of reinforcement)

Positive Punishment

• Anything that decreases the likelihood of a behavior by following it with an undesirable event or state

• Will weaken behavior

Negative Punishment

• Anything that decreases the likelihood of a behavior by following it with the removal of a desirable event or state

• Will weaken behavior

Go to bed with no dinner!

Two types of Reinforcement and Punishment:

REINFORCEMENT (STRENGTHENS), POSITIVE (ADDED)

• Clean the house and earn $5

• A coach pats you on the back after a good play

• A paycheck for working

• $10 for getting an “A” on your report card

• Senior privilege for maintaining good grades

REINFORCEMENT (STRENGTHENS), NEGATIVE (SUBTRACTED)

• You buy your child ice cream so they stop nagging

• You leave early for school to avoid traffic

• You take Tylenol to remove back pain

PUNISHMENT (WEAKENS), POSITIVE (ADDED)

• You get your mouth washed out with soap when you curse

• Touch a hot stove and get burned

• Getting a ticket for speeding

PUNISHMENT (WEAKENS), NEGATIVE (SUBTRACTED)

• You lose your driving privileges for breaking curfew

• Time out, or the loss of freedom to combat bad behavior

• You pay money for a speeding ticket
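The four combinations above come down to two yes/no questions: was something added or removed, and did the behavior become more or less likely? A minimal Python sketch of that rule follows. The function name classify_consequence and its parameters are hypothetical and chosen only for illustration.

```python
def classify_consequence(stimulus_added: bool, behavior_increases: bool) -> str:
    """Name the type of consequence (hypothetical helper, illustration only).

    stimulus_added: True if something is added after the behavior,
        False if something is removed.
    behavior_increases: True if the behavior becomes more likely afterwards,
        False if it becomes less likely.
    """
    sign = "positive" if stimulus_added else "negative"
    effect = "reinforcement" if behavior_increases else "punishment"
    return f"{sign} {effect}"


# Examples taken from the lists above.
print(classify_consequence(True, True))    # earn $5 for cleaning the house
print(classify_consequence(False, True))   # Tylenol removes back pain
print(classify_consequence(True, False))   # burned by a hot stove
print(classify_consequence(False, False))  # lose driving privileges
```

Note that "positive" and "negative" describe only whether something is added or taken away, not whether the consequence is pleasant.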

Schedules of Reinforcement

• By response:

– Fixed Ratio: Rewarded after a certain number of responses (same every time)

– Variable Ratio: Rewarded after a random number of responses (changes between rewards)

• By time:

– Fixed Interval: Rewarded after a certain amount of time (same every time)

– Variable Interval: Rewarded after a random amount of time (changes between rewards)
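The four schedules can be read as simple rules for deciding whether a given response earns a reward. The Python sketch below is one possible way to write those rules down. The class names, the respond method, and the uniform random draws are assumptions made for illustration, not part of the original material.

```python
import random


class FixedRatio:
    """Reward every n-th response (same count every time)."""

    def __init__(self, n):
        self.n, self.count = n, 0

    def respond(self):
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True
        return False


class VariableRatio:
    """Reward after a random number of responses, redrawn after each reward."""

    def __init__(self, mean_n, rng):
        self.mean_n, self.rng, self.count = mean_n, rng, 0
        self.target = self._draw()

    def _draw(self):
        # Assumption: uniform over 1..(2*mean_n - 1), so the average is mean_n.
        return self.rng.randint(1, 2 * self.mean_n - 1)

    def respond(self):
        self.count += 1
        if self.count >= self.target:
            self.count, self.target = 0, self._draw()
            return True
        return False


class FixedInterval:
    """Reward the first response once `interval` time units have passed."""

    def __init__(self, interval):
        self.interval, self.last_reward = interval, 0.0

    def respond(self, now):
        if now - self.last_reward >= self.interval:
            self.last_reward = now
            return True
        return False


class VariableInterval:
    """Reward the first response after a random wait, redrawn after each reward."""

    def __init__(self, mean_interval, rng):
        self.mean_interval, self.rng, self.last_reward = mean_interval, rng, 0.0
        self.wait = self._draw()

    def _draw(self):
        # Assumption: uniform over [0, 2*mean_interval], averaging mean_interval.
        return self.rng.uniform(0, 2 * self.mean_interval)

    def respond(self, now):
        if now - self.last_reward >= self.wait:
            self.last_reward, self.wait = now, self._draw()
            return True
        return False


# Usage: with a fixed ratio of 3, every third response is rewarded.
fixed_ratio = FixedRatio(3)
print([fixed_ratio.respond() for _ in range(6)])
```

In this sketch the ratio schedules count responses, so responding faster earns rewards sooner, while the interval schedules only check the clock, so extra responses before the time is up earn nothing.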

Immediate/Delayed Reinforcement

• Immediate reinforcement is more effective than delayed reinforcement

• Ability to delay gratification predicts higher achievement

Ways of Reinforcement

Schedules of Reinforcement: Continuous Reinforcement

• A schedule of reinforcement in which a reward follows every correct response

• Most useful way to establish a behavior

• The behavior will extinguish quickly once the reinforcement stops.

Think of training your dog… like this woman did.

Schedules of Reinforcement: Partial Reinforcement

• A schedule of reinforcement in which a reward follows only some correct responses

• Includes the following types:

– Fixed-interval and variable-interval

– Fixed-ratio and variable-ratio

Fixed-Interval Schedule

• A partial reinforcement schedule that rewards only the first correct response after some defined period of time

• e.g. a weekly quiz in a class

A fixed-interval schedule is when the reinforcement is received after a fixed amount of time has passed. Ex. You get an allowance every other Friday.

Variable-Interval Schedule

• A partial reinforcement schedule that rewards the first correct response after an unpredictable amount of time

• e.g. a “pop” quiz in a class

A variable-interval schedule is when the reinforcement occurs after varying amounts of time. Ex. Fishing and catching a fish after varying amounts of time.

Fixed-Ratio Schedule

• A partial reinforcement schedule that rewards a response only after some defined number of correct responses

• The faster the subject responds, the more reinforcements they will receive.

In a fixed-ratio schedule, a specific number of correct responses is required before reinforcement can be obtained. Ex. Buy 10 haircuts, get 1 free.

Variable-Ratio Schedule

• A partial reinforcement schedule that rewards a response after an unpredictable number of correct responses

• This schedule is very resistant to extinction.

• Sometimes called the “gambler’s schedule”; similar to a slot machine

A variable-ratio schedule is when an unpredictable number of responses is required before reinforcement can be obtained. Ex. Slot machines.

Schedules of Reinforcement

Kindergarten Study

• Children who showed high interest in drawing were selected, then split into 3 groups:

1. One group was told in advance that they would get a “good player” badge if they did a good job drawing

2. One group was given the badge but wasn’t expecting the reward

3. One group was given no reward after drawing

• Which group drew the most the next day?

– Answer: Group 1 drew the least; Groups 2 and 3 drew more

• Overjustification Effect: rewarding an already enjoyable behavior may replace natural enjoyment with expectation of reward

Should we pay students when they get better grades?

Bandura’s Experiment

• In Albert Bandura’s Bobo Doll experiment, children observed others modeling violent behavior towards a blow-up doll.

1. Another adult rewards the adult model with praise and candy. One group of children saw this ending.

2. Another adult calls the model a “bad person” and spanks the model. A second group of children saw this ending.

3. The model receives neither a reward nor a punishment. A final group saw this neutral ending.

• What would you expect the results of this experiment to be?

Results of Bandura Experiment

• Children who saw the model receiving positive reinforcement were the most violent

• Those who saw the model being punished were the least violent

Modeling – learning by imitating/copying

Bobo-Doll Experiment

Bandura demonstrated that children learn aggressive behaviors by watching an adult’s aggressive behaviors.

Albert Bandura found that we learn by watching others if the following four conditions are met:

• Attention – We must be aware of behaviors of those around us

• Retention – We must remember the behavior we have witnessed

• Ability to Reproduce Behavior – We must possess the skills to do the tricks

• Motivation – We are more likely to feel motivated to learn if the model we’ve observed has been rewarded and we like the model

• Behaviors produced can either be prosocial or antisocial behaviors

– Prosocial: Beneficial (helping people, obeying rules, etc.)

– Antisocial: Damaging (vandalism, violence, etc.)

Role Models

• Can people choose to be role models?

– Whether we want it or not, people watch us and learn from us.
