What is the main idea of operant conditioning?

Operant conditioning, sometimes referred to as instrumental conditioning, is a method of learning that uses rewards and punishment to modify behavior. Through operant conditioning, behavior that is rewarded is likely to be repeated, and behavior that is punished is less likely to occur again.

For example, when you receive a performance bonus for exceptional work, you will be inclined to keep performing at a high level in hopes of receiving another bonus in the future. Because the behavior was followed by a positive outcome, it will likely be repeated.

Operant Behavior

In operant behavior, stimuli can be appetitive or aversive. Appetitive stimuli are the ones that you voluntarily approach, while aversive stimuli are those you try to avoid or escape. Behavior can be strengthened through either positive or negative reinforcement. In this case, positive and negative do not mean good or bad. Instead, positive reinforcement means introducing an appetitive stimulus to increase the probability that a behavior will recur, while negative reinforcement means removing an aversive stimulus to achieve the same result.

The Operant Conditioning Theory

Operant conditioning was first described by behaviorist B.F. Skinner. His theory was based on two assumptions. First, the cause of human behavior is something in a person’s environment. Second, the consequences of a behavior determine the likelihood of it being repeated. Behavior that is followed by a pleasant consequence is likely to be repeated, and behavior followed by an unpleasant consequence is less likely to be repeated.

Although Skinner was the pioneer of the operant conditioning theory, his ideas were based on Thorndike’s law of effect. Skinner believed that we do have a mind, but that it was more productive to study observable behavior than internal mental events.

Skinner was also an exemplary inventor. Among his devices was the Skinner box, a chamber in which the behavior of animal subjects such as rats and pigeons can be recorded in a compressed time frame.

Through his experiments, Skinner identified three types of responses that followed behavior:

Neutral responses. They are responses from the environment that do little more than draw attention. They neither increase nor decrease the probability of a behavior being repeated.

Reinforcers. They are responses from the environment that increase the likelihood of a behavior being repeated. They can either be positive or negative.

Punishers. They are responses from the environment that decrease the likelihood of a behavior being repeated. Punishment weakens behavior.

Positive Reinforcement

Positive reinforcement involves the presentation of an appetitive stimulus to increase the likelihood of a behavior occurring in the future. For example, if your child does chores without being asked, you can reward them by taking them to the park or handing them a treat.

Skinner used a hungry rat in a Skinner box to show how positive reinforcement works. The box contained a lever on the side, and as the rat moved about the box, it would accidentally knock the lever. Immediately after it did so, a food pellet would drop into a container next to the lever. Because food arrived every time the rat hit the lever, the animal repeated the action again and again.

Positive reinforcement does not have to involve tangible items. Instead, you can positively reinforce your child through:

  • Clapping
  • Cheering
  • Giving a hug or pat on the back
  • Giving a thumbs-up
  • Offering a special activity, like playing a game or reading a book together
  • Telling another adult how proud you are of your child’s behavior while your child is listening
  • Praising them
  • Giving a high five

Negative Reinforcement

In negative reinforcement, something unpleasant is removed in response to a behavior. Over time, the behavior increases with the expectation that the aversive stimulus will be taken away. If, for example, a child refuses to eat vegetables at dinner time and a parent responds by taking the vegetables away, the removal of the vegetables negatively reinforces the refusal, making the child more likely to refuse again.

Schedules of Reinforcement

A reinforcement schedule is a component of operant conditioning that states which instances of behavior will be reinforced. It is a set of rules, based on elapsed time or the number of responses, that determines when a reinforcer is presented or removed.

Different patterns of reinforcement have distinctive effects on the speed of learning. Schedules of reinforcement include:

Fixed ratio reinforcement. A reward is given after a set number of responses. For instance, a child is applauded after spelling 10 words correctly.

Fixed interval reinforcement. Rewards are provided after a consistent amount of time has passed. An example is a weekly paycheck. Another example is a child being rewarded once a week if the dishes are done.

Variable ratio reinforcement. Rewards come after an unpredictable number of responses, which yields a high rate of responding. For example, gambling may pay out after an unpredictable number of attempts.

Variable interval reinforcement. Responses are rewarded after an unpredictable amount of time has passed. An example is unpredictable check-ins by a health inspector.

Continuous reinforcement. This is the reinforcement of a behavior every time it happens. An example is rewarding a toddler each time they use the potty.
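
The five schedules above can be sketched as small rules that decide, for each response, whether a reward follows. This is only an illustrative sketch, not a model taken from Skinner's work. The function names, the use of one response per time unit, and the random distributions are all assumptions made for the example. In particular, the variable ratio rule here rewards each response with a fixed probability, and the variable interval rule draws its waiting times from an exponential distribution.

```python
import random


def fixed_ratio(n):
    """Reward every n-th response (n=1 gives continuous reinforcement)."""
    count = 0

    def schedule(now):
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False

    return schedule


def variable_ratio(mean_n, rng):
    """Reward each response with probability 1/mean_n, so the number of
    responses per reward is unpredictable but averages mean_n."""

    def schedule(now):
        return rng.random() < 1.0 / mean_n

    return schedule


def fixed_interval(interval):
    """Reward the first response made once `interval` time units have
    passed since the last reward."""
    last_reward = 0.0

    def schedule(now):
        nonlocal last_reward
        if now - last_reward >= interval:
            last_reward = now
            return True
        return False

    return schedule


def variable_interval(mean_interval, rng):
    """Like fixed_interval, but the required wait is redrawn at random
    after every reward (exponential distribution is an assumption)."""
    last_reward = 0.0
    wait = rng.expovariate(1.0 / mean_interval)

    def schedule(now):
        nonlocal last_reward, wait
        if now - last_reward >= wait:
            last_reward = now
            wait = rng.expovariate(1.0 / mean_interval)
            return True
        return False

    return schedule


def run(schedule, n_responses=20):
    """Feed one response per time unit (t = 1, 2, ...) and record rewards."""
    return [schedule(float(t)) for t in range(1, n_responses + 1)]


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only so repeated runs match
    schedules = {
        "continuous": fixed_ratio(1),
        "fixed ratio (10)": fixed_ratio(10),
        "fixed interval (7)": fixed_interval(7),
        "variable ratio (mean 5)": variable_ratio(5, rng),
        "variable interval (mean 5)": variable_interval(5, rng),
    }
    for name, schedule in schedules.items():
        print(name, "rewards in 20 responses:", sum(run(schedule)))
```

With 20 responses, the fixed ratio rule with n=10 rewards the 10th and 20th responses, which matches the spelling example above. The two variable rules give different counts depending on the random seed, which is the unpredictability the schedule descriptions refer to.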

Punishment

In operant conditioning, punishment is defined as any change to the surrounding environment that reduces the probability of a behavior happening again. Punishment can work either by directly applying an unpleasant stimulus, like scolding, or by removing a potentially rewarding stimulus, like reducing someone’s allowance to punish undesirable behavior.

While punishment is effective in decreasing undesirable behavior, it is associated with many problems, such as:

  • Punishment can increase aggression
  • Punished behavior is suppressed rather than forgotten
  • Punishment can create fear
  • Punishment does not necessarily guide toward good behavior
  • Punishment can easily become abuse
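
Reinforcement and punishment can be summarized as a two-by-two lookup: the kind of stimulus (appetitive or aversive) and whether it is added or removed. The sketch below is a hypothetical helper written for this summary, and its function names are assumptions. The labels "positive punishment" and "negative punishment" are the standard terms for applying an unpleasant stimulus and removing a rewarding one.

```python
def classify_consequence(stimulus, change):
    """Name the operant procedure for a consequence.

    stimulus: "appetitive" or "aversive"
    change:   "added" or "removed"
    """
    table = {
        ("appetitive", "added"): "positive reinforcement",
        ("aversive", "removed"): "negative reinforcement",
        ("aversive", "added"): "positive punishment",
        ("appetitive", "removed"): "negative punishment",
    }
    key = (stimulus, change)
    if key not in table:
        raise ValueError(f"unknown combination: {key!r}")
    return table[key]


def effect_on_behavior(procedure):
    """Reinforcement makes behavior more likely; punishment less likely."""
    return "increases" if procedure.endswith("reinforcement") else "decreases"


if __name__ == "__main__":
    examples = {
        "treat after chores": ("appetitive", "added"),
        "vegetables taken away": ("aversive", "removed"),
        "scolding": ("aversive", "added"),
        "allowance reduced": ("appetitive", "removed"),
    }
    for label, (stimulus, change) in examples.items():
        procedure = classify_consequence(stimulus, change)
        print(f"{label}: {procedure} ({effect_on_behavior(procedure)} behavior)")
```

Here "positive" and "negative" describe only whether a stimulus is added or removed, not whether the outcome is good or bad.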

Token Economy

The token economy is a system used in behavior modification programs where desirable behavior is reinforced using tokens, such as fake money, stickers, poker chips, or buttons, that are later exchanged for rewards. In a hospital setting, for example, token money may be exchanged for food, access to television, and other bonuses.

A token economy has proven effective not only in managing psychiatric patients but also in schools. This system can be used in classrooms to reduce disruptive behavior and increase academic engagement.
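
The bookkeeping behind a token economy can be sketched as a simple ledger: target behaviors earn tokens, and tokens are later spent on rewards. This is a minimal illustration only. The class name, the behaviors, and the token prices are all hypothetical and are not drawn from any specific program.

```python
class TokenEconomy:
    """Track tokens earned for target behaviors and spent on rewards."""

    def __init__(self, earn_rules, prices):
        self.earn_rules = dict(earn_rules)  # behavior -> tokens earned
        self.prices = dict(prices)          # reward -> token cost
        self.balances = {}                  # person -> current tokens

    def record_behavior(self, person, behavior):
        """Credit tokens for a behavior; unlisted behaviors earn nothing."""
        tokens = self.earn_rules.get(behavior, 0)
        self.balances[person] = self.balances.get(person, 0) + tokens
        return tokens

    def redeem(self, person, reward):
        """Exchange tokens for a reward if the balance covers the price."""
        price = self.prices[reward]
        if self.balances.get(person, 0) < price:
            return False
        self.balances[person] -= price
        return True


if __name__ == "__main__":
    # Hypothetical classroom rules and prices, chosen only for the example.
    economy = TokenEconomy(
        earn_rules={"finished homework": 2, "raised hand": 1},
        prices={"extra recess": 3},
    )
    economy.record_behavior("Ana", "finished homework")
    economy.record_behavior("Ana", "raised hand")
    print(economy.redeem("Ana", "extra recess"))
    print(economy.balances["Ana"])
```

In this example, Ana earns three tokens from two behaviors and spends all three on extra recess.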
