Bringing Gaming, VR, and AR to Life With Deep Learning
Danny Lange, Vice President of AI & ML, Unity Technologies
What Was Before Machine Learning? Clockwork Universe
Diagram labels: “ALL-KNOWING PROGRAMMER”, DATA, PROGRAM, RESULTS, FEEDBACK
What is Machine Learning? Indeterminism
Diagram labels: HISTORIC DATA, LEARNER, MODEL, DATA, PREDICTIONS
OODA Loop (John Boyd)
Heisenberg's Uncertainty Principle
• There is a limit on our ability to observe reality with precision.
Gödel's Incompleteness Theorem
• Any model of reality is incomplete and must be continuously refined in the face of new observations.
Second Law of Thermodynamics (Entropy) - Ludwig Boltzmann
• Any given system is continuously changing even as we try to maintain order.
Multi-armed Bandit & Reinforcement Learning
Multi-armed bandit (payout rates X%, Y%, Z%)
Objective: Maximize winnings
Exploration vs Exploitation
• Exploration: gaining knowledge
• Exploitation: max payout with current knowledge
Reinforcement Learning
• Actions
• Rewards
Diagram labels: AGENT, STATE & ACTION, REWARD, ENVIRONMENT
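The exploration vs exploitation trade-off on this slide can be illustrated with a minimal epsilon-greedy bandit. This is a sketch under assumptions, not code from the talk: the function name `run_epsilon_greedy`, the Bernoulli payout model, and the example payout rates are all hypothetical (the slide gives only X%, Y%, Z%).

```python
import random


def run_epsilon_greedy(payout_probs, steps=10_000, epsilon=0.1, seed=0):
    """Play a Bernoulli multi-armed bandit with an epsilon-greedy policy.

    payout_probs: assumed win probability of each arm (hypothetical values).
    Returns (estimates, pulls, total_reward).
    """
    rng = random.Random(seed)
    n_arms = len(payout_probs)
    pulls = [0] * n_arms         # how often each arm was played
    estimates = [0.0] * n_arms   # running mean reward observed per arm
    total_reward = 0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Exploration: gain knowledge by trying a random arm.
            arm = rng.randrange(n_arms)
        else:
            # Exploitation: play the arm with the best estimate so far.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        reward = 1 if rng.random() < payout_probs[arm] else 0
        pulls[arm] += 1
        # Incremental update of the mean reward for this arm.
        estimates[arm] += (reward - estimates[arm]) / pulls[arm]
        total_reward += reward

    return estimates, pulls, total_reward
```

Calling `run_epsilon_greedy([0.2, 0.5, 0.8])` returns the per-arm estimates, pull counts, and total winnings. A larger `epsilon` spends more plays gaining knowledge; a smaller one leans on current knowledge sooner.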
DeepMind Playing Atari
Current AI Research Landscape
• DeepMind Lab
• OpenAI Gym/Universe
About Unity
Leading global game industry platform
• 16 Billion Downloads in 2016, up 31%
• 2.6 Billion Unique Devices
• 700 Million Gamers
• 38% of top 1000 free mobile games
The Unity Ecosystem
Bringing Reinforcement Learning to Unity
How did the Chicken Cross the Road?
Actions
Rewards
• Fatal penalty (being hit by a car)
• Positive reward (collecting gift packet)
Exploration
Exploitation
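The reward structure above (a fatal penalty for a car, a positive reward for a gift packet) can be sketched with tabular Q-learning on a toy grid. This is an illustration under stated assumptions, not the Unity demo: the grid layout, the static cars, the three actions, and all names (`ROAD`, `step`, `train`, `greedy_rollout`) are hypothetical, and a lookup table stands in for the deep network used in the talk.

```python
import random

# Toy "road" (hypothetical layout): S = start, C = car (fatal),
# G = gift packet, . = free cell. Cars are static here for simplicity.
ROAD = ["S..",
        "C.C",
        "..C",
        "G.."]
ACTIONS = {"forward": (1, 0), "left": (0, -1), "right": (0, 1)}
MAX_STEPS = 50  # cap on episode length


def find(symbol):
    """Return the (row, col) of the first cell containing `symbol`."""
    for r, row in enumerate(ROAD):
        c = row.find(symbol)
        if c != -1:
            return (r, c)
    raise ValueError(symbol)


def step(state, action):
    """Apply an action; return (next_state, reward, done)."""
    dr, dc = ACTIONS[action]
    # Moves off the grid are clamped, so the chicken stays in place.
    r = min(max(state[0] + dr, 0), len(ROAD) - 1)
    c = min(max(state[1] + dc, 0), len(ROAD[0]) - 1)
    cell = ROAD[r][c]
    if cell == "C":
        return (r, c), -1.0, True   # fatal penalty: hit by a car
    if cell == "G":
        return (r, c), 1.0, True    # positive reward: gift collected
    return (r, c), 0.0, False


def train(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {}  # (state, action) -> estimated discounted return
    names = list(ACTIONS)
    start = find("S")
    for _ in range(episodes):
        state = start
        for _ in range(MAX_STEPS):
            if rng.random() < epsilon:
                action = rng.choice(names)  # exploration
            else:
                # Exploitation, breaking ties at random.
                best = max(q.get((state, a), 0.0) for a in names)
                action = rng.choice(
                    [a for a in names if q.get((state, a), 0.0) == best])
            nxt, reward, done = step(state, action)
            future = 0.0 if done else max(q.get((nxt, a), 0.0) for a in names)
            old = q.get((state, action), 0.0)
            q[(state, action)] = old + alpha * (reward + gamma * future - old)
            state = nxt
            if done:
                break
    return q


def greedy_rollout(q):
    """Follow the learned greedy policy from the start cell."""
    state = find("S")
    path, total = [state], 0.0
    for _ in range(MAX_STEPS):
        action = max(ACTIONS, key=lambda a: q.get((state, a), 0.0))
        state, reward, done = step(state, action)
        path.append(state)
        total += reward
        if done:
            break
    return path, total
```

Running `greedy_rollout(train())` returns the cells visited and the total reward. Early episodes end under a car fairly often; as the table fills in, the greedy policy steers around the car cells toward the gift.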
Navigating by reading maps
Network Architecture
Deep Recurrent Q-Network
• Dual-stream input
• Train network with additional auxiliary losses
Diagram labels: Inputs, Actions, Auxiliary signals
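One common way to read "additional auxiliary losses" is as a weighted sum added to the Q-learning loss. The formulation below is a generic sketch, not the network from the slide: the weights \lambda_k, the auxiliary tasks k, the target parameters \theta^{-}, and the recurrent state h_t are assumed notation, and the slide does not say which auxiliary signals or weights were used.

```latex
\mathcal{L}_{\text{total}}(\theta)
  = \mathcal{L}_{Q}(\theta) + \sum_{k} \lambda_k \, \mathcal{L}^{(k)}_{\text{aux}}(\theta),
\qquad
\mathcal{L}_{Q}(\theta)
  = \mathbb{E}\Big[\big(r_t + \gamma \max_{a'} Q(h_{t+1}, a'; \theta^{-}) - Q(h_t, a_t; \theta)\big)^{2}\Big]
```

Here h_t stands for the recurrent state built from the inputs up to time t, and each auxiliary term shares the network's parameters, so its gradient also shapes the features used for the Q-values.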
Synthetic Data
Bringing it to developers and researchers
Unity Reinforcement Learning API (Coming Soon)
Diagram labels: State & Reward, Action
Bringing it together
Danny Lange Vice President, AI & Machine Learning +1 425.463.5801 dlange@unity3d.com @danny_lange dannylange