
Operant Conditioning Learning & Memory Arlo Clark-Foos - PowerPoint PPT Presentation



  1. Operant Conditioning Learning & Memory Arlo Clark-Foos

  2. Instrumental or Operant • Law of Effect “operates” on environment to cause an outcome behavior is “instrumental” in causing outcome • Priscilla the Fastidious Pig • Thorndike & Skinner https://www.youtube.com/watch?v=LSv992Ts6as

  3. Classical vs. Instrumental • Differences – Classical • Reflexive, automatic behavior • Reinforcement follows CS, regardless of response – Instrumental • Voluntary behavior • Reinforcement only follows the response • Similarities • Negative acceleration, blocking, conditioned inhibition, spontaneous recovery, generalization and discrimination…

  4. History of Instrumental Cond. • Edward Thorndike’s (1898) puzzle boxes – Initially random acts – Decrease in time to escape SD → R – Law of Effect (S-R Association) • “Annoying” vs. “Satisfying” events • Believed reinforcer is not part of association!

  5. Superstitious Behavior B.F. Skinner (1948) showed that nearly any behavior a pigeon happens to be performing when reinforcement arrives will increase in frequency.

  6. Belongingness • Breland & Breland (1961) – What makes Sammy dance? • Shettleworth (1975) “Reinforcing with food only reinforces feeding Behaviors”

  7. Learned Helplessness • Seligman & Maier (1967) – Dogs and yoked shocks – Later extended to college students and anagrams – Also extended to depression

  8. Losing Streaks • Detroit Lions, 2008 • Detroit Lions, 2015?

  9. Studying/Observing Instrumental Learning METHODOLOGY

  10. Willard Small • 1901: Introduced mazes to animal research Hampton Court, London

  11. Mazes in Research

  12. Mazes in Research • T-Maze – Alternation learning – Better at win-shift than win-stay • Radial Arm Maze – Random without repetition – Memory Load: 16+

  13. Mazes in Research • Morris Water Maze – Cued (Response) Learning • Rats can see the platform: S-R Association – Place Learning • Platform is below surface: Explicit, cognitive memory

  14. Conditioning Takes Time • Skinner’s Free Operant Protocol (vs. Discrete Trials) – Skinner box (automatizing data collection) • Cumulative recorder (akin to Odometer) – Secondary Reinforcer
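
The cumulative recorder on this slide works like an odometer: the count only goes up, and the steepness of the line is the response rate. A minimal Python sketch of that idea follows. The function names and the sample lever-press times are hypothetical, chosen only for illustration, and are not part of the original slides.

```python
from bisect import bisect_right


def cumulative_record(response_times, sample_times):
    """Return the running total of responses made by each sample time."""
    ordered = sorted(response_times)
    # bisect_right counts how many responses occurred at or before time t.
    return [bisect_right(ordered, t) for t in sample_times]


def response_rate(response_times, start, end):
    """Average responses per unit time in (start, end]: the slope of the record."""
    ordered = sorted(response_times)
    made = bisect_right(ordered, end) - bisect_right(ordered, start)
    return made / (end - start)


# Hypothetical lever presses at these times (in seconds).
presses = [1, 2, 2.5, 7]
record = cumulative_record(presses, [0, 2, 5, 10])
early_rate = response_rate(presses, 0, 5)
late_rate = response_rate(presses, 5, 10)
```

A flat stretch in `record` means no responding, and a steeper stretch means faster responding, which is how pauses and bursts show up on a cumulative record.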

  15. What is Learned? • Discriminative Stimuli (SD) – SD (light on) → R (press lever) → O (get food) – SD (light off) → R (press lever) → O (no food) – Habit Slips (Slips of Action; Reason, 1975) • Responses (R) – Lashley’s rats swimming mazes (different motor responses) • Outcomes (O) – Reinforcers and Punishments

  16. Shaping Behavior • Twiggy https://www.youtube.com/watch?v=dVfXF8O-lHw • Shaping – Requires skilled trainer • Physical rehabilitation and language in autism • Bomb/drug detecting dogs • Chaining – Backward chaining

  17. Human Skills and Habits • Walking – feedback from vision/muscles? 1. Lashley (1951): RTs > 100ms • Pianists: 16+ movements per second 2. Damage to sensory feedback 3. Sequencing errors 4. Time to initiate depends on length

  18. Human Skills and Habits • Motor Programs – Initiated complete – General outline, malleable (Schmidt, 1988) • Skill Acquisition (Anderson, 1982) 1. Cognitive Stage 2. Associative Stage 3. Autonomous Stage

  19. Reinforcers • Primary – Food, water, sleep, sex, shelter (temp control) • Secondary – Predict arrival of primary – Token Economies (Conestogas) • Drive Reduction Theory (Hull, 1943) – Primary not always reinforcing • Negative contrast – Nipple sucking for sugar water – Lame treats on Halloween

  20. Punishers • Determinants of effectiveness 1. Punishment → variable behavior • Hot stove 2. SD can encourage cheating • Speeding or my dog and Krispy Kreme 3. Concurrent reinforcement • Class clowns 4. Intensity matters • Child rearing or criminal justice

  21. Differential Reinforcement of Alternative Behaviors (DRA) • Cinemark (2011)

  22. Building SD → R → O • Timing – Immediate is best • Criminal Justice, Punishment • Self Control – Immediate vs. Delayed Reward – Diets, Studying, etc. – Precommitment (SI)

  23. Positive vs Negative Reinforcement

  24. Positive vs Negative Punishment

  25. Reinforcement Schedules • Continuous vs. Partial • Fixed-ratio (FR) – Postreinforcement pause • Variable-ratio (VR) – Slot machine (keep playing) • Fixed-interval (FI) – TBPM • Variable-interval (VI) – Waiting is the hardest part
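
The four partial schedules on this slide differ only in the rule that decides whether a given response earns a reinforcer. The Python sketch below is one way to write those rules down, under stated assumptions: the class names, the `respond` method, and `count_reinforcers` are hypothetical; VR is simplified to "each response pays off with probability 1/mean_n"; and VI draws each waiting time from an exponential distribution. Neither simplification comes from the slides.

```python
import random


class FixedRatio:
    """FR-n: reinforce every n-th response."""

    def __init__(self, n):
        self.n = n
        self.count = 0

    def respond(self, t):
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True
        return False


class VariableRatio:
    """VR: each response is reinforced with probability 1/mean_n (assumed model)."""

    def __init__(self, mean_n, rng=None):
        self.p = 1.0 / mean_n
        self.rng = rng or random.Random()

    def respond(self, t):
        return self.rng.random() < self.p


class FixedInterval:
    """FI: reinforce the first response at least `interval` after the last reinforcer."""

    def __init__(self, interval):
        self.interval = interval
        self.available_at = interval

    def respond(self, t):
        if t >= self.available_at:
            self.available_at = t + self.interval
            return True
        return False


class VariableInterval:
    """VI: like FI, but each wait is drawn at random around mean_interval (assumed model)."""

    def __init__(self, mean_interval, rng=None):
        self.mean = mean_interval
        self.rng = rng or random.Random()
        self.available_at = self.rng.expovariate(1.0 / self.mean)

    def respond(self, t):
        if t >= self.available_at:
            self.available_at = t + self.rng.expovariate(1.0 / self.mean)
            return True
        return False


def count_reinforcers(schedule, response_times):
    """Feed response times (in increasing order) to a schedule; count reinforcers."""
    return sum(schedule.respond(t) for t in response_times)
```

For example, `count_reinforcers(FixedRatio(5), range(20))` counts the reinforcers earned by 20 responses on FR-5. On the ratio schedules, responding faster earns more; on the interval schedules, extra responses before the interval elapses earn nothing.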

  26. Choosing Between Behaviors • Concurrent reinforcement schedules – Football on Saturdays • Matching Law – Behavioral Economics (Thaler wins Nobel Prize, 2017) – Bliss point and Sunfish (observation of behavior)
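
In its standard proportional form, the Matching Law named on this slide says that the share of responses given to an option matches the share of reinforcers earned from it. The symbols are labels introduced here for illustration: B_1 and B_2 are the response rates on two concurrent options, and R_1 and R_2 are the reinforcement rates those options yield. The numbers in the worked line are hypothetical.

```latex
\[
\frac{B_1}{B_1 + B_2} = \frac{R_1}{R_1 + R_2}
\]
% Hypothetical example: option 1 pays 40 reinforcers per hour, option 2 pays 20.
\[
\frac{B_1}{B_1 + B_2} = \frac{40}{40 + 20} = \frac{2}{3}
\]
```

Under these assumed rates, about two thirds of responses would be predicted to go to option 1.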

  27. Why do I watch football? • Behaviors with no primary reinforcers • Premack Principle (1959) – Rats with water/wheel, Children with candy/pinball • For me: Grading/Cleaning – Response Deprivation Hypothesis • Illegal Drugs?

  28. BRAIN SUBSTRATES

  29. SD → R • Basal ganglia – Dorsal Striatum (caudate nucleus, putamen) • Receives highly processed sensory info • Projects to M1 • Lesioned rats fail to learn behaviors in response to stimuli: SD (light) → R (lever press) → O (food) • Habitual and Automatic Behaviors – Bike riding, playing instruments, running past food in a maze

  30. R → O • Prefrontal Cortex – Orbitofrontal cortex (OPFC) • Receives sensory input (senses and visceral) • Projects to dorsal striatum • Grape juice neurons (Tremblay & Schultz, 1999)

  31. “I want you to want me” by Cheap Trick • James Olds (1954) – Electrical current in lateral hypothalamus • 700 times an hour, physical exhaustion, starvation • Ventral Tegmental Area (VTA) – Pleasure center? – Excitement/anticipation? – Motivational value – Projects to SNc

  32. Wanting in the VTA/SNc • VTA → SNc – Dopaminergic System – Incentive Salience Hypothesis – Working for pleasure (want/drive) • What if there is no drive (no dopamine)? • Addiction, cues, and precommitment

  33. Endogenous Opioids • Exogenous Opiates: Opium, Morphine, Heroin – May mediate Hedonic value • Increases liking of other stimuli • Decreases perception of pain – Endogenous released in response to primary reinforcers • Which and how many activated may determine preference – Nipple Suckers – Play Halo or Watch Cartoons

  34. Punishment Signaling • Somatosensory Cortex (S1) – Nociceptors • Social Rejection – Insular Cortex (Insula) • Dorsal posterior insula • Degree of activation correlates with magnitude of punisher – Dorsal Anterior Cingulate Cortex • Motivational value of punishment

  35. Drug Addiction • Pathological – Known harmful consequences – Concurrent reinforcement • “Yay drugs” & “Boo withdrawals” • Dopaminergic System – Stroke damage to insula can wipe out addiction

  36. “Might as well face it, you’re addicted to love” • Behavioral Addiction – Gambling, VR Schedules (Skinner), and Gambler’s Fallacy – Parkinson’s patients and dopamine agonists – Cognitive and Behavioral Therapies based on Conditioning

  37. Not All Conditioning is Equal • Partial Reinforcement Effect – Partial Reinforcement Extinction Effect (PREE) • Frustration (Amsel) vs. Sequential (Capaldi) Theories • Fixed vs. Variable & Ratio vs. Interval – Child rearing, pet training, gambling, superstition

  38. What explains the PREE? Frustration Theory (Amsel) • CRF: R+ R+ R+ R+ R+ R+ – Develop (R-O) expectancy • PRF: R+ R+ R- R+ R- R- – Develop (R-O) and (R-no O) expectancy • CRF extinction: frustration punishes the response – S → R → O (frustration) • Evidence for frustration – Behavior of pigeons – Children’s tantrums

  39. What explains the PREE? Sequential Theory (Capaldi) • Outcome of previous trial serves as a cue for subsequent behavior • PRF: R+ R+ R- R+ R- R- (Fm Fm NFm Fm NFm NFm) • NFm – R (S-R) strengthened by next R+ • What happens with long ITI? Decay • Frustration? • Memory? • Stronger PREE with long ITI

  40. Complex Behavior • Response Chaining – Backward Chaining – Breaks in the “chain” – Animal intelligence

  41. Striatum and Skill/Habit • Caudate, putamen, nucleus accumbens • Organizes somatosensory representations and motor responses for planning and executing goal-oriented behavior.

  42. Double Dissociation • Broca vs. Wernicke

  43. Packard et al. (1989) • Radial Arm Maze (8 arms) • Win-Stay vs. Win-Shift

  44. Response vs. Place Learning

  45. Habit Learning in Humans • Parkinson’s Disease – Impaired dopaminergic system in striatum • Huntington’s Disease – Loss of some striatal function (Gabrieli, 1995)

  46. Weather Prediction Game • Knowlton et al. (1996)

  47. Weather Prediction Game • Knowlton et al. (1996)

  48. Weather Prediction Game • Poldrack et al. (1999)

  49. Neurophysiological Data • Mink (1996) – Neurons in striatum fire in anticipation of movement • Schultz (2006) – DA Neurons from brain stem into striatum – Fire with expectation and reception of rewards • Blocking and expectation
