Operant Conditioning Learning & Memory Arlo Clark-Foos
Instrumental or Operant • Law of Effect • Operant: behavior “operates” on the environment to cause an outcome • Instrumental: behavior is “instrumental” in causing the outcome • Priscilla the Fastidious Pig • Thorndike & Skinner https://www.youtube.com/watch?v=LSv992Ts6as
Classical vs. Instrumental • Differences – Classical • Reflexive, automatic behavior • Reinforcement follows CS, regardless of response – Instrumental • Voluntary behavior • Reinforcement only follows the response • Similarities • Negative acceleration, blocking, conditioned inhibition, spontaneous recovery, generalization and discrimination…
History of Instrumental Cond. • Edward Thorndike’s (1898) puzzle boxes – Initially random acts – Decrease in time to escape – SD → R – Law of Effect (S-R Association) • “Annoying” vs. “Satisfying” events • Believed reinforcer is not part of association!
Superstitious Behavior B.F. Skinner (1948) showed that nearly any behavior a pigeon happens to be performing when reinforcement is delivered will increase in frequency.
Belongingness • Breland & Breland (1961) – What makes Sammy dance? • Shettleworth (1975) “Reinforcing with food only reinforces feeding behaviors”
Learned Helplessness • Seligman & Maier (1967) – Dogs and yoked shocks – Later extended to college students and anagrams – Also extended to depression
Losing Streaks • Detroit Lions, 2008 • Detroit Lions, 2015?
Studying/Observing Instrumental Learning METHODOLOGY
Willard Small • 1901: Introduced mazes to animal research • Hampton Court maze, London
Mazes in Research
Mazes in Research • T-Maze – Alternation learning – Better at win-shift than win-stay • Radial Arm Maze – Random without repetition – Memory Load: 16+
Mazes in Research • Morris Water Maze – Cued (Response) Learning • Rats can see the platform: S-R Association – Place Learning • Platform is below surface: Explicit, cognitive memory
Conditioning Takes Time • Skinner’s Free Operant Protocol (vs. Discrete Trials) – Skinner box (automating data collection) • Cumulative recorder (akin to an odometer) – Secondary Reinforcer
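The odometer analogy for the cumulative recorder can be sketched in a few lines of Python. This is only an illustration: the function name `cumulative_record`, the one-second bin width, and the response times are all hypothetical, not values from the lecture.

```python
from itertools import accumulate


def cumulative_record(response_times, session_length, bin_width=1.0):
    """Return the running response count at the end of each time bin.

    Like an odometer, the record never decreases: a steep slope means a
    high response rate and a flat stretch means the animal has paused.
    """
    n_bins = int(session_length // bin_width)
    counts = [0] * n_bins
    for t in response_times:
        idx = int(t // bin_width)
        if 0 <= idx < n_bins:          # ignore responses outside the session
            counts[idx] += 1
    return list(accumulate(counts))


# Hypothetical lever-press times (seconds) in a 5-second session.
record = cumulative_record([0.5, 1.2, 1.8, 4.1], session_length=5)
print(record)
```

Plotting `record` against time would give the familiar staircase curve; the flat run in the middle of this example corresponds to a pause in responding.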
What is Learned? • Discriminative Stimuli (SD) – SD (light on) → R (press lever) → O (get food) – SD (light off) → R (press lever) → O (no food) – Habit Slips (Slips of Action; Reason, 1975) • Responses (R) – Lashley’s rats swimming mazes (different motor responses) • Outcomes (O) – Reinforcers and Punishers
Shaping Behavior • Twiggy https://www.youtube.com/watch?v=dVfXF8O-lHw • Shaping – Requires skilled trainer • Physical rehabilitation and language in autism • Bomb/drug detecting dogs • Chaining – Backward chaining
Human Skills and Habits • Walking – feedback from vision/muscles? 1. Lashley (1951): RTs > 100ms • Pianists: 16+ movements per second 2. Damage to sensory feedback 3. Sequencing errors 4. Time to initiate depends on length
Human Skills and Habits • Motor Programs – Initiated complete – General outline, malleable (Schmidt, 1988) • Skill Acquisition (Anderson, 1982) 1. Cognitive Stage 2. Associative Stage 3. Autonomous Stage
Reinforcers • Primary – Food, water, sleep, sex, shelter (temp control) • Secondary – Predict arrival of primary – Token Economies (Conestogas) • Drive Reduction Theory (Hull, 1943) – Primary not always reinforcing • Negative contrast – Nipple sucking for sugar water – Lame treats on Halloween
Punishers • Determinants of effectiveness 1. Punishment leads to more variable behavior • Hot stove 2. An SD for punishment can encourage cheating • Speeding or my dog and Krispy Kreme 3. Concurrent reinforcement can undermine punishment • Class clowns 4. Intensity matters • Child rearing or criminal justice
Differential Reinforcement of Alternative Behaviors (DRA) • Cinemark (2011)
Building SD → R → O • Timing – Immediate is best • Criminal Justice, Punishment • Self Control – Immediate vs. Delayed Reward – Diets, Studying, etc. – Precommitment (SI)
Positive vs Negative Reinforcement
Positive vs Negative Punishment
Reinforcement Schedules • Continuous vs. Partial • Fixed-ratio (FR) – Postreinforcement pause • Variable-ratio (VR) – Slot machine (keep playing) • Fixed-interval (FI) – TBPM • Variable-interval (VI) – Waiting is the hardest part
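The four partial schedules differ only in what must happen before the next response pays off: a number of responses (ratio) or a stretch of time (interval), either fixed or variable. A minimal Python sketch under stated assumptions follows; the class and function names, the uniform distributions used for the variable schedules, and the response times are illustrative choices, not part of the lecture.

```python
import random


class RatioSchedule:
    """Reinforce a response once enough responses have been made."""

    def __init__(self, draw_requirement):
        self.draw = draw_requirement        # callable: responses needed next
        self.required = self.draw()
        self.count = 0

    def respond(self, t):
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self.draw()
            return True
        return False


class IntervalSchedule:
    """Reinforce the first response made after enough time has passed."""

    def __init__(self, draw_interval):
        self.draw = draw_interval           # callable: seconds to wait next
        self.wait = self.draw()
        self.last = 0.0                     # time of the last reinforcer

    def respond(self, t):
        if t - self.last >= self.wait:
            self.last = t
            self.wait = self.draw()
            return True
        return False


def fixed_ratio(n):
    return RatioSchedule(lambda: n)


def variable_ratio(n, rng):
    # Assumption: requirement drawn uniformly from 1..2n-1, so the mean is n.
    return RatioSchedule(lambda: rng.randint(1, 2 * n - 1))


def fixed_interval(seconds):
    return IntervalSchedule(lambda: seconds)


def variable_interval(seconds, rng):
    # Assumption: wait drawn uniformly from 0..2*seconds, so the mean is `seconds`.
    return IntervalSchedule(lambda: rng.uniform(0, 2 * seconds))


def run(schedule, response_times):
    """Return one True/False per response: was it reinforced?"""
    return [schedule.respond(t) for t in response_times]


times = [float(t) for t in range(1, 11)]    # one response per second for 10 s
fr_result = run(fixed_ratio(5), times)      # every 5th response pays off
fi_result = run(fixed_interval(4.0), times) # first response after each 4 s pays off
print(fr_result)
print(fi_result)
```

On the ratio schedules, responding faster earns reinforcers sooner; on the interval schedules, extra responses before the interval elapses earn nothing. The sketch makes that contrast concrete, though it does not model how an animal's response rate adapts.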
Choosing Between Behaviors • Concurrent reinforcement schedules – Football on Saturdays • Matching Law – Behavioral Economics (Thaler wins Nobel Prize, 2017) – Bliss point and Sunfish (observation of behavior)
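In its strict form, the matching law says that the share of responses given to each option matches that option's share of the reinforcers: B1 / (B1 + B2) = R1 / (R1 + R2). The Python sketch below computes that predicted allocation; the function name and the reinforcement rates are hypothetical, and real data often deviate from strict matching.

```python
def matching_law_allocation(reinforcement_rates):
    """Predicted share of responding on each option under strict matching."""
    total = sum(reinforcement_rates)
    if total == 0:
        raise ValueError("at least one option must deliver reinforcement")
    return [rate / total for rate in reinforcement_rates]


# Hypothetical concurrent schedules: 40 vs. 20 reinforcers per hour.
shares = matching_law_allocation([40, 20])
print(shares)   # about two thirds of responses go to the richer option
```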
Why do I watch football? • Behaviors with no primary reinforcers • Premack Principle (1959) – Rats with water/wheel, Children with candy/pinball • For me: Grading/Cleaning – Response Deprivation Hypothesis • Illegal Drugs?
BRAIN SUBSTRATES
SD → R • Basal ganglia – Dorsal Striatum (caudate nucleus, putamen) • Receives highly processed sensory info • Projects to M1 • Lesioned rats fail to learn behaviors in response to stimuli: SD (light) → R (lever press) → O (food) • Habitual and Automatic Behaviors – Bike riding, playing instruments, running past food in a maze
R → O • Prefrontal Cortex – Orbitofrontal cortex (OPFC) • Receives sensory input (senses and visceral) • Projects to dorsal striatum • Grape juice neurons (Tremblay & Schultz, 1999)
“I want you to want me” by Cheap Trick • James Olds (1954) – Electrical current in lateral hypothalamus • 700 times an hour, physical exhaustion, starvation • Ventral Tegmental Area (VTA) – Pleasure center? – Excitement/anticipation? – Motivational value – Projects to SNc
Wanting in the VTA/SNc • VTA → SNc – Dopaminergic System – Incentive Salience Hypothesis – Working for pleasure (want/drive) • What if there is no drive (no dopamine)? • Addiction, cues, and precommitment
Endogenous Opioids • Exogenous Opiates: Opium, Morphine, Heroin – May mediate Hedonic value • Increases liking of other stimuli • Decreases perception of pain – Endogenous released in response to primary reinforcers • Which and how many activated may determine preference – Nipple Suckers – Play Halo or Watch Cartoons
Punishment Signaling • Somatosensory Cortex (S1) – Nociceptors • Social Rejection – Insular Cortex (Insula) • Dorsal posterior insula • Degree of activation correlates with magnitude of punisher – Dorsal Anterior Cingulate Cortex • Motivational value of punishment
Drug Addiction • Pathological – Known harmful consequences – Concurrent reinforcement • “Yay drugs” & “Boo withdrawals” • Dopaminergic System – Stroke damage to insula can wipe out addiction
“Might as well face it, you’re addicted to love” • Behavioral Addiction – Gambling, VR Schedules (Skinner), and Gambler’s Fallacy – Parkinson’s patients and dopamine agonists – Cognitive and Behavioral Therapies based on Conditioning
Not All Conditioning is Equal • Partial Reinforcement Effect – Partial Reinforcement Extinction Effect (PREE) • Frustration (Amsel) vs. Sequential (Capaldi) Theories • Fixed vs. Variable & Ratio vs. Interval – Child rearing, pet training, gambling, superstition
What explains the PREE? Frustration Theory (Amsel) • Evidence for Frustration: – Behavior of pigeons – Children’s tantrums • CRF → Extinction (R+ → R-): Frustration punishes response • CRF: R+ R+ R+ R+ R+ R+ – Develop (R-O) expectancy • PRF: R+ R+ R- R+ R- R- – Develop (R-O) and (R-no O) expectancy • S → R → O (frustration)
What explains the PREE? Sequential Theory (Capaldi) • Outcome of previous trial serves as a cue for subsequent behavior • PRF trials: R+ R+ R- R+ R- R- • Memory cues: Fm Fm NFm Fm NFm NFm • NFm → R (S-R) strengthened by next R+ • What happens with long ITI? Decay – Frustration? – Memory? • Stronger PREE with long ITI
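The key event in the sequential account is a nonrewarded trial (R-) followed immediately by a rewarded one (R+): only then is responding in the presence of the NFm cue strengthened. A small Python sketch can count those transitions in a trial sequence. The function name `nr_transitions` is hypothetical, and treating each transition as one equally weighted strengthening event is a simplifying assumption, not a claim about Capaldi's full model.

```python
def nr_transitions(trials):
    """Count nonrewarded trials (R-) immediately followed by a rewarded trial (R+)."""
    return sum(
        1
        for previous, current in zip(trials, trials[1:])
        if previous == "R-" and current == "R+"
    )


crf = ["R+"] * 6                               # continuous reinforcement
prf = ["R+", "R+", "R-", "R+", "R-", "R-"]     # the slide's partial sequence

print(nr_transitions(crf))   # no R- trials, so NFm never becomes a cue to respond
print(nr_transitions(prf))   # one R- → R+ transition in this sequence
```

Under CRF the count is zero, so nothing has taught the animal to keep responding when the memory of nonreward is present; that is the sequential explanation for faster extinction after CRF than after PRF.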
Complex Behavior • Response Chaining – Backward Chaining – Breaks in the “chain” – Animal intelligence
Striatum and Skill/Habit • Caudate, putamen, nucleus accumbens • Organizes somatosensory representations and motor responses for planning and executing goal-oriented behavior.
Double Dissociation • Broca vs. Wernicke
Packard et al. (1989) • Radial Arm Maze (8 arms) • Win-Stay vs. Win-Shift
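The two radial-maze tasks reward opposite strategies, which shows up in how errors would be scored. The Python sketch below uses simplified scoring rules that are assumptions for illustration, not the exact protocol of Packard et al.: in win-shift, re-entering an arm already visited is an error; in win-stay, entering an arm that is not cued is an error. Function names and arm numbers are hypothetical.

```python
def win_shift_errors(visits):
    """Win-shift: each arm pays off once, so any re-entry counts as an error."""
    seen = set()
    errors = 0
    for arm in visits:
        if arm in seen:
            errors += 1
        seen.add(arm)
    return errors


def win_stay_errors(visits, cued_arms):
    """Win-stay: only cued arms pay off, so entering an uncued arm is an error."""
    return sum(1 for arm in visits if arm not in cued_arms)


# Hypothetical arm entries on an 8-arm maze (arms numbered 1-8).
print(win_shift_errors([1, 2, 3, 2, 4, 1]))             # arms 2 and 1 are revisited
print(win_stay_errors([1, 3, 3, 5, 2], {1, 3, 5, 7}))   # arm 2 is not cued
```

Note that returning to arm 3 is correct under win-stay but would be an error under win-shift, which is why the same maze can tax different memory systems depending on the rule.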
Response vs. Place Learning
Habit Learning in Humans • Parkinson’s Disease – Impaired dopaminergic system in striatum • Huntington’s Disease – Loss of some striatal function (Gabrieli, 1995)
Weather Prediction Game • Knowlton et al. (1996)
Weather Prediction Game • Poldrack et al. (1999)
Neurophysiological Data • Mink (1996) – Neurons in striatum fire in anticipation of movement • Schultz (2006) – DA Neurons from brain stem into striatum – Fire with expectation and reception of rewards • Blocking and expectation