Learning How to Soar. Terrence Sejnowski, Salk Institute, UCSD
Bird Migration. Migration Ecology of Birds, Ian Newton
Thermal Soaring
Rayleigh-Bénard Convection
Atmospheric Turbulence
Tracking a Falcon with GPS
Humans Soar Too
Glider Aerodynamics. Control over bank angle and angle of attack. [Figure labels: bank angle, angle of attack; species: 1 - male condor, 2 - female condor, 3 - black vulture, 4 - caracara.] Shepard & Lambertucci, 2013
How do Birds Find and Navigate Thermals?
What quantities do birds sense?
• Vertical velocities, temperature, gradients, etc.?
• How should the bird respond to these cues?
• Experiments are hard to control, and strategies are difficult to infer from limited data.
Physics simulations are complex and there are many variables. What should an optimal agent sense?
Time is Honey
Karl von Frisch
Temporal Difference Learning
TD error: $\delta_t = r_{t+1} + \gamma V(s_{t+1}) - V(s_t)$
Actions are determined by preferences: $\pi_t(s, a) = \Pr\{a_t = a \mid s_t = s\} = \frac{e^{p(s, a)}}{\sum_b e^{p(s, b)}}$
Update the preferences: $p(s_t, a_t) \leftarrow p(s_t, a_t) + \beta \delta_t$
The value function update: $V(s_t) \leftarrow V(s_t) + \alpha \delta_t$
Sutton and Barto, 1988
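The TD error, the softmax over action preferences, and the two updates on this slide can be sketched as a small tabular actor-critic. This is a minimal illustration, not the code behind the talk: the function names, the step sizes `alpha` and `beta`, the discount `gamma`, and the two-state toy task in `run_demo` are all assumptions made for the example.

```python
import math
import random

def softmax_policy(prefs):
    """Action probabilities e^{p(s,a)} / sum_b e^{p(s,b)} for one state."""
    m = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]

def actor_critic_step(V, p, s, a, r, s_next, alpha=0.1, beta=0.1, gamma=0.9):
    """Apply one TD update in place and return the TD error.

    alpha, beta and gamma are assumed values chosen for illustration.
    """
    delta = r + gamma * V[s_next] - V[s]  # TD error
    V[s] += alpha * delta                 # critic: value function update
    p[s][a] += beta * delta               # actor: preference update
    return delta

def run_demo(episodes=500, seed=0):
    """Hypothetical toy task: in state 0, action 1 pays 1 and action 0 pays 0.

    State 1 is terminal, so its value stays at zero.
    """
    rng = random.Random(seed)
    V = [0.0, 0.0]
    p = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(episodes):
        probs = softmax_policy(p[0])
        a = rng.choices([0, 1], weights=probs)[0]
        r = 1.0 if a == 1 else 0.0
        actor_critic_step(V, p, 0, a, r, 1)
    return V, p
```

In the toy task the rewarded action produces a non-negative TD error and the unrewarded one a non-positive TD error, so the preference for the rewarded action can only grow relative to the other.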
Hammer and Menzel, 1997 VUMmx1 - Octopamine
Temporal Difference Learning. Montague, Dayan and Sejnowski, 1994
Actor Critic Model: Dopamine Neurons. [Diagram labels: Environment, Cerebral Cortex, Basal Ganglia, Reward, Prediction Error, Dopamine.] Montague, Dayan and Sejnowski, 1996
Temporal Difference Learning: Go Defeat, 2017. Ke Jie, DeepMind. [Diagram labels: Environment, Cerebral Cortex, Basal Ganglia, Reward, Prediction Error, Dopamine.]
What Do Thermals Look Like? Rayleigh-Bénard convection: vertical velocity field, temperature field. Reddy, Vergassola, Sejnowski, 2017
Sink or Soar? Pre-training, post-training
Learned Policy. [Figure labels: +5°, 0°, -5°; 1-2 meters; v_z gradient (vertical velocity gradient); a_z (vertical acceleration).]
Conclusions
• a_z and v_z gradients across wings are useful
• Control over angle of attack is not useful
[Figure labels: angle of attack, vertical acceleration, v_z gradients, climb rate, temperature.]
Field Experiments
GoPro Glider
Field Experiments. Gautam Reddy
Field Experiments. [Figure: observed and desired bank angle (°), axis from -30 to 30, against time (s), axis from 0 to 100.]
Measuring the Vertical Wind Velocity. GPS and barometer measurements give the vertical ground velocity. We need to estimate the wind velocity: GPS/baro ground vel. = wind vel. + glider's air vel. [Figure labels: pitch (°), axis from -8 to 8; modeling; phugoid, 20 s.]
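The relation on this slide rearranges to wind vel. = ground vel. - glider's air vel. The sketch below shows that arithmetic on sampled data. It is only an illustration under stated assumptions: the function names are hypothetical, the glider's vertical air velocity is taken as a given input rather than derived from the pitch model the slide alludes to, and the trailing average used to smooth the estimate is an added assumption, not something the slide specifies.

```python
def vertical_ground_velocity(altitudes, dt):
    """Finite-difference climb rate (m/s) from altitude samples (m) spaced dt seconds apart."""
    return [(b - a) / dt for a, b in zip(altitudes, altitudes[1:])]

def estimate_vertical_wind(ground_vz, air_vz):
    """Sample-by-sample wind estimate: ground velocity minus the glider's air velocity."""
    return [g - a for g, a in zip(ground_vz, air_vz)]

def moving_average(values, window):
    """Trailing mean over up to `window` samples (assumed smoothing step)."""
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]  # drop the sample leaving the window
        out.append(running / min(i + 1, window))
    return out

# Hypothetical data: altitude rising 0.5 m per second while the glider
# sinks through the air at an assumed 1 m/s.
altitudes = [100.0, 100.5, 101.0, 101.5]
ground_vz = vertical_ground_velocity(altitudes, dt=1.0)
wind_vz = estimate_vertical_wind(ground_vz, [-1.0] * len(ground_vz))
```

With these made-up numbers the glider climbs relative to the ground even though it sinks relative to the air, which is what an updraft looks like in the data.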
Training a Glider in the Field. Reddy, Vergassola, Sejnowski, 2018
Training a Glider in the Field
Field Experiments
Thank You: Peter Dayan, Gautam Reddy, Read Montague, Massimo Vergassola, John Doyle