B. F. Skinner & Operant Conditioning

Operant Conditioning Theory

Operant conditioning is a form of learning in which a behaviour becomes more likely when it is reinforced by reward and less likely when it is discouraged through punishment.

For example, if a mother wants her daughter to clean her room then she may give her some sweets every time she cleans it.

Given enough time, the girl will start to clean her room more often because she knows she will get some sweets in return.

As a result, the girl’s behaviour (cleaning her room) has been modified (conditioned) because she learnt to associate that behaviour with a reward.

Although this may sound similar in principle to classical conditioning, it is in fact different because operant conditioning requires action on the part of the learner.

As a result, the girl will not get any sweets until after she cleans her room. In classical conditioning, by contrast, the stimuli are presented regardless of what the learner does.

Note : Operant behaviour is defined as actions which have consequences.

The Skinner Box

B. F. Skinner is the psychologist best known for operant conditioning, and for the device he invented to research it, called the operant conditioning apparatus (also known as the Skinner box).

The Skinner box involved placing an animal (such as a rat or pigeon) into a sealed box with a lever that would release food when pressed.

If food was released every time the rat pressed the lever, it would press it more and more because it learnt that doing so gives it food.

Lever pressing is described as an operant behaviour, because it is an action that results in a consequence. In other words, it operates on the environment and changes it in some way.

The food that is released as a result of pressing the lever is known as a reinforcer, because it causes the operant behaviour (lever pressing) to increase.
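The way reinforcement builds up lever pressing can be sketched with a small simulation. This is only a toy model under stated assumptions, not Skinner's own formulation: the "press tendency" number, the starting value and the learning rate are all hypothetical, and the rule simply moves the tendency a fixed fraction of the way towards 1.0 each time a press is followed by food.

```python
def simulate_lever_training(trials=20, start=0.1, rate=0.2):
    """Return the press tendency after each reinforced press.

    Toy rule (an assumption, not Skinner's own model): every reinforced
    press moves the tendency a fixed fraction of the way towards 1.0.
    """
    tendency = start
    history = []
    for _ in range(trials):
        tendency += rate * (1.0 - tendency)  # food follows the press
        history.append(tendency)
    return history


history = simulate_lever_training()
print(f"after 1 press: {history[0]:.2f}, after 20 presses: {history[-1]:.2f}")
```

The tendency rises quickly at first and then levels off, which mirrors the rat pressing the lever more and more as food keeps following each press.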

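Food can also be described as a reinforcing stimulus, because it is the consequence that strengthens the behaviour. (In classical conditioning terms, food is an unconditioned stimulus rather than a conditioned one, because its effect does not have to be learnt.)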

Note : There is an important difference between a reward and a reinforcer in operant conditioning.

A reward is something which has value to the person giving the reward, but may not necessarily be of value to the person receiving the reward.

A reinforcer is something which benefits the person receiving it, and so results in an increase of a certain type of behaviour.

Types Of Reinforcers

Below are several of the different ways to categorise a reinforcer.

Positive Reinforcer

A positive reinforcer has some sort of value for whoever is receiving it. For example, food when you are hungry or water when you are thirsty.

A positive reinforcer increases an operant behaviour when it is given after that behaviour.

Negative Reinforcer

A negative reinforcer is unpleasant for whoever receives it. It may injure, harm or cause discomfort in some way. For example, a very hot room, an electric shock or a dangerous situation.

A negative reinforcer causes the recipient to try and escape from it or avoid it.

For example, if a room is very hot then you may switch on the air conditioning or a fan to try and escape from the heat. If this is successful, you are likely to repeat this behaviour the next time you are in a very hot room.

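Negative reinforcers therefore also serve to increase operant behaviours, namely the behaviours that remove or avoid them.

Note : Negative reinforcers are not a form of punishment, because they are already present before the operant behaviour, and the behaviour is strengthened by escaping them.

Punishment occurs after a behaviour has already occurred and is meant to make that behaviour less likely, such as smacking a child after they have done something bad.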

Primary Reinforcer & Secondary Reinforcer

Another way to classify reinforcers is as a primary or secondary reinforcer.

Primary Reinforcer

A primary reinforcer has some value to whoever is receiving it, and this value has not been learnt. For example, food when you are hungry or water when you are thirsty.

Secondary Reinforcer

A secondary reinforcer has an acquired value for whoever receives it, which means you are taught its value/worth over a period of time before you see it as being valuable to you.

For example, money is a secondary reinforcer because you have to learn the value of money and what it does before it has any meaning to you.

If you are short of cash, then receiving money can also be categorised as a positive reinforcer because it has value to you.

Extinction

Just as in classical conditioning, where presenting a conditioned stimulus a number of times without the unconditioned stimulus results in extinction, a similar process occurs in operant conditioning: when reinforcement stops, the operant behaviour begins to decline.

For example, if a rat receives no food when it presses a lever (reinforcement is withheld) then it will gradually press that lever less and less until it stops doing so.

In effect, the rat gives up on pressing the lever (stops an operant behaviour) because it no longer results in it receiving food (reinforcer). The operant behaviour has therefore become extinct.
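Extinction can be sketched with the same kind of toy model. The numbers here are hypothetical (a starting press tendency of 0.9, a learning rate of 0.2 and a cut-off of 0.05 below which the rat is counted as having given up), and the update rule is an illustrative assumption rather than an established law: a reinforced press pulls the tendency towards 1.0, an unreinforced press pulls it towards 0.0.

```python
def update(tendency, reinforced, rate=0.2):
    """Move the press tendency towards 1.0 if reinforced, else towards 0.0."""
    target = 1.0 if reinforced else 0.0
    return tendency + rate * (target - tendency)


def presses_until_extinct(start=0.9, threshold=0.05):
    """Count unreinforced presses until the tendency falls below threshold.

    Assumption: the rat is treated as having "given up" once its
    tendency drops under the (hypothetical) threshold value.
    """
    tendency = start
    presses = 0
    while tendency >= threshold:
        tendency = update(tendency, reinforced=False)  # no food this time
        presses += 1
    return presses


print(presses_until_extinct())
```

The decline is gradual rather than sudden, which matches the description of the rat pressing the lever less and less until it stops.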

Shaping Bad Habits

This knowledge of extinction can be applied to behaviour shaping, such as when trying to stop a bad habit.

So rather than punishing a certain behaviour, it is often more effective to take away the reinforcer(s) it provides.

By doing so, the habit will no longer be seen as having any benefit, and so the behaviour will gradually start to fade away (extinction).

Punishment may temporarily reduce a certain behaviour, but in the long run the behaviour is likely to continue because it is still seen as bringing some sort of benefit.

In addition to this, punishment can also make the person being punished resent you and then do things behind your back.

Partial Reinforcement Effect

Behaviour which is acquired under partial reinforcement is much more resistant to extinction than behaviour which has been acquired under continuous reinforcement.

For example, if a rat receives a reinforcer every time it presses the lever, this is continuous reinforcement.

However, if the rat receives a reinforcer at random, or every second or third time it presses the lever, this is partial reinforcement because it does not get the reinforcer every time.

If you were to stop giving the reinforcer completely, the rat receiving partial reinforcement would display a greater resistance to extinction (i.e. it would keep pressing the lever for longer after the reinforcer had been stopped).

A good example of partial reinforcement in everyday life can be seen in casinos, where wins arrive only occasionally and unpredictably. This is why you will often find that, despite winning a large sum of money, many gamblers are unable to stop and end up losing what they won.
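One simple way to reproduce this effect for the rat example is sketched below. It rests on a loudly stated assumption that is not taken from the article: the toy rat keeps pressing during extinction until its current run of unrewarded presses is clearly longer than any such run it met during training. The function names and the `patience` value are hypothetical.

```python
def longest_dry_run(schedule):
    """Longest streak of unreinforced presses in a training schedule."""
    longest = current = 0
    for reinforced in schedule:
        current = 0 if reinforced else current + 1
        longest = max(longest, current)
    return longest


def presses_before_quitting(schedule, patience=3):
    """Presses made in extinction before the toy rat gives up.

    Assumption: the rat quits once the current dry run is `patience`
    presses longer than the longest dry run it met during training.
    """
    return longest_dry_run(schedule) + patience


continuous = [True] * 30             # food on every press
partial = [False, False, True] * 10  # food on every third press

print(presses_before_quitting(continuous), presses_before_quitting(partial))
```

Under this assumption the partially reinforced rat keeps pressing for longer, because going without food for a few presses is something it already experienced while the lever was still paying out.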

Discriminative Stimulus

In a slight variation of the original Skinner box (there are several variants), a light bulb was placed above the lever.

Whenever the light was on, pressing the lever would result in the rat receiving the reinforcer. But when the light was off, pressing the lever would result in no reinforcer.

Given enough time, the rat eventually learnt to press the lever only when the light was on, and ignored the lever when the light was off.

Skinner called the light a discriminative stimulus, which he defined as a stimulus which allows the animal to tell the difference between “a situation which is reinforcing and one that is not”.

In other words, the light allows you to determine whether or not a behaviour will be followed by a reinforcer.

Some real-life examples of discriminative stimuli include hearing a bell before lunch, or seeing a traffic light when you are driving.

In both cases a signal (bell/light) tells you what sort of consequence your behaviour will have in that situation.
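The light experiment can be sketched as a toy simulation in which the rat holds a separate press tendency for each state of the light. Everything numeric here is a hypothetical assumption (equal starting tendencies of 0.5, a learning rate of 0.2, the light alternating on and off, and the rat pressing on every trial so that both tendencies get updated); it is an illustration, not a description of Skinner's procedure.

```python
def train_with_light(trials=40, rate=0.2):
    """Return the press tendency for each light state after training.

    Assumptions: the light alternates on/off from trial to trial, the
    rat presses on every trial, and food follows only when the light
    is on.
    """
    tendency = {"on": 0.5, "off": 0.5}
    for trial in range(trials):
        light = "on" if trial % 2 == 0 else "off"
        reinforced = light == "on"           # food only with the light on
        target = 1.0 if reinforced else 0.0
        tendency[light] += rate * (target - tendency[light])
    return tendency


result = train_with_light()
print(f"light on: {result['on']:.2f}, light off: {result['off']:.2f}")
```

The two tendencies start out equal and drift apart, so the light ends up signalling when pressing is worthwhile, which is the role of a discriminative stimulus.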

Operant Conditioning Is…

Putting this all together, you can now see that operant conditioning is a modification (conditioning) of an action (operant behaviour) which has consequences (e.g. lever pressing releases food). The behaviour is strengthened through positive reinforcement (gaining something of value) or negative reinforcement (escaping something unpleasant), and discouraged through punishment.

Summary

• Operant conditioning involves behaviour which has been modified by positive or negative reinforcement.

• B. F. Skinner is well known for his work on operant conditioning with the Skinner Box.

• A reinforcer is of benefit to the recipient, and causes the frequency of an operant behaviour to increase.

• Negative reinforcers are unpleasant things which you act to escape or avoid; they are not a form of punishment.

• Secondary reinforcers are things which you need to learn the value of beforehand.

• Behaviour acquired under partial reinforcement is more resistant to extinction than behaviour acquired under continuous reinforcement.

• Taking away the benefits a habit provides is a good way to stop it.

• A discriminative stimulus lets you know whether a behaviour will be reinforced or not.
