Opportunities for infusing physics in AI / ML algorithms, by Animashree Anandkumar (PowerPoint presentation)


  1. Opportunities for infusing physics in AI / ML algorithms Animashree Anandkumar Director of ML Research, NVIDIA Bren Professor, Caltech

  2. 1. Neural Programming

  3. Combining Symbolic Expressions & Black-box Function Evaluations in Neural Programs FOROUGH ARABSHAHI, SAMEER SINGH, ANIMA ANANDKUMAR

  4. Symbolic + Numerical Input ◎ Goal: Learn a domain of functions (sin, cos, log…) ○ Training on numerical input-output alone does not generalize. ◎ Data augmentation with symbolic expressions ○ Efficiently encode relationships between functions. ◎ Solution: ○ Design networks to use both symbolic + numeric data ○ Leverage the observed structure of the data: hierarchical expressions

  5. Neural Programming ◎ Data-driven mathematical and symbolic reasoning ◎ Leverage the observed structure of the data ○ Hierarchical expressions

  6. Applications ◎ Mathematical equation verification ○ sin^2(θ) + cos^2(θ) = 1 ??? ◎ Mathematical question answering ○ sin^2(θ) + ___ = 1 ◎ Solving differential equations
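As a point of comparison for the identity on this slide, a symbolic engine such as Sympy (the baseline system in the accuracy tables later in the deck) can verify it directly; a minimal sketch:

```python
import sympy as sp

# Sympy baseline for equation verification: simplify LHS - RHS and check
# that the result is identically zero.
theta = sp.symbols("theta")
lhs = sp.sin(theta)**2 + sp.cos(theta)**2
verified = sp.simplify(lhs - 1) == 0
print(verified)  # True
```

The neural-programmer approach replaces this hand-built simplifier with a learned verifier, which is what the later slides evaluate.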

  7. Examples ◎ Number encoding: 2.5 = 2×10^0 + 5×10^(-1) ◎ Three data-point types: symbolic data point, function-evaluation data point, number-encoding data point

  8. Representing Mathematical Equations ◎ Grammar rules

  9. Domain

  10. ICLR’18 Tree-LSTM for capturing hierarchies
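A minimal sketch of the kind of hierarchy-aware encoder this slide refers to: a child-sum Tree-LSTM cell applied bottom-up over an expression tree. Dimensions, initialization, and embeddings are illustrative, not the paper's configuration:

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8  # embedding and hidden size (hypothetical)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One shared weight set per gate: input (i), output (o), forget (f), update (u).
W = {g: rng.standard_normal((D, D)) * 0.1 for g in "iofu"}
U = {g: rng.standard_normal((D, D)) * 0.1 for g in "iofu"}

def tree_lstm(node_embedding, children):
    """children: list of (h, c) pairs from already-encoded subtrees."""
    h_sum = sum((h for h, _ in children), np.zeros(D))
    i = sigmoid(W["i"] @ node_embedding + U["i"] @ h_sum)
    o = sigmoid(W["o"] @ node_embedding + U["o"] @ h_sum)
    u = np.tanh(W["u"] @ node_embedding + U["u"] @ h_sum)
    c = i * u
    # One forget gate per child, computed from that child's hidden state.
    for h_k, c_k in children:
        f_k = sigmoid(W["f"] @ node_embedding + U["f"] @ h_k)
        c = c + f_k * c_k
    h = o * np.tanh(c)
    return h, c

# Encode the expression tree for sin(x) + cos(x): leaves first, root last.
emb = {tok: rng.standard_normal(D) for tok in ["+", "sin", "cos", "x"]}
x_leaf = tree_lstm(emb["x"], [])
sin_node = tree_lstm(emb["sin"], [x_leaf])
cos_node = tree_lstm(emb["cos"], [x_leaf])
root_h, root_c = tree_lstm(emb["+"], [sin_node, cos_node])
print(root_h.shape)  # (8,)
```

The root hidden state serves as the fixed-size representation of the whole equation, which a classifier head can then score for validity.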

  11. Dataset Generation ◎ Random local changes to the expression tree: Replace Node, Shrink Node, Expand Node (illustrated on the tree for sin^2(θ) + cos^2(θ) = 1)
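The "random local changes" idea can be sketched with nested tuples as the expression tree; the tree format, leaf vocabulary, and `mutate` helper are hypothetical stand-ins for the paper's generator:

```python
import random

LEAVES = ["theta", "1", "2", "0"]  # illustrative leaf vocabulary

def nodes(expr, path=()):
    """Yield (path, subtree) for every node in a nested-tuple tree."""
    yield path, expr
    if isinstance(expr, tuple):
        for i, child in enumerate(expr[1:], start=1):
            yield from nodes(child, path + (i,))

def replace(expr, path, new):
    """Return a copy of expr with the node at `path` replaced by `new`."""
    if not path:
        return new
    i = path[0]
    return expr[:i] + (replace(expr[i], path[1:], new),) + expr[i + 1:]

def mutate(expr, rng):
    """Replace-node change: swap one random leaf for a random symbol."""
    leaf_paths = [p for p, e in nodes(expr) if not isinstance(e, tuple)]
    return replace(expr, rng.choice(leaf_paths), rng.choice(LEAVES))

rng = random.Random(0)
# sin(theta)^2 + cos(theta)^2, the left side of the identity on the slide
lhs = ("+", ("^", ("sin", "theta"), "2"), ("^", ("cos", "theta"), "2"))
print(mutate(lhs, rng))
```

Mutated equations are usually invalid, which is how the method obtains negative examples to balance the dataset.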

  12. Summary of contributions ◎ Combine symbolic expressions and function evaluation ◎ New tasks ○ Equation verification ○ Equation completion ○ Solving differential equations ◎ Balanced dataset generation method ◎ Generalizable representation of numbers

  13. Dataset Generation ◎ Sub-tree matching: choose a node, look up a dictionary key-value pair, replace the node with the value’s pattern (illustrated on the tree for sin^2(θ) + cos^2(θ) = 1)

  14. Examples of Generated Equations

  15. Equation Verification (bar chart, y-axis accuracy 0–100%; groups: Generalization and Extrapolation; methods: Majority Class, Sympy, LSTM:sym, TreeLSTM:sym, TreeLSTM:sym+num)

  16. Equation Verification (accuracy)
                     Majority Class   Sympy    LSTM:sym   TreeLSTM:sym   TreeLSTM:sym+num
  Generalization     50.24%           81.74%   81.71%     95.18%         97.20%
  Extrapolation      44.78%           71.93%   76.40%     93.27%         96.17%

  17. Equation Completion

  18. Equation Completion

  19. Take-aways ◎ Vastly improved numerical evaluation: 90% over the function-fitting baseline. ◎ Generalization to verifying symbolic equations of higher depth (extrapolation accuracy): LSTM:symbolic 76.40%, TreeLSTM:symbolic 93.27%, TreeLSTM:symbolic+numeric 96.17% ◎ Combining symbolic + numerical data yields better generalization on both tasks: symbolic and numerical evaluation.

  20. Solving Differential Equations ◎ Traditional methods: ○ Gather numerical data from a differential equation ○ Design and train a neural network on that data ◎ Drawback: ○ The trained model can be used only for that differential equation ○ A new model must be trained for each new equation ○ Not generalizable

  21. Solving Differential Equations ◎ Steps: ○ Find a set of candidate solutions ○ Accept the correct candidate using the neural programmer ◎ Advantages: ○ Jointly train for many functions ○ Generalizable ○ Can be used for solving any differential equation
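The propose-and-verify loop above can be sketched as follows; here the verifier is a simple numerical residual check rather than the neural programmer, and the candidate list and the ODE y'(x) = y(x) are illustrative:

```python
import math

def residual(f, x, h=1e-5):
    """|f'(x) - f(x)| for the ODE y' = y, using a central-difference derivative."""
    deriv = (f(x + h) - f(x - h)) / (2 * h)
    return abs(deriv - f(x))

# Hypothetical candidate solutions proposed for the ODE.
candidates = {"sin(x)": math.sin, "x^2": lambda x: x * x, "exp(x)": math.exp}

def accept(f, points=(0.1, 0.5, 1.0), tol=1e-4):
    """Accept a candidate if its residual is small at a few test points."""
    return all(residual(f, x) < tol for x in points)

solutions = [name for name, f in candidates.items() if accept(f)]
print(solutions)  # ['exp(x)']
```

The advantage claimed on the slide is that one verifier is trained jointly over many functions, so the same accept step generalizes across equations instead of retraining per ODE.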

  22. Ordinary Differential Equations ◎ n-th order ODE ◎ Find g(y) that satisfies it

  23. Extension to solving differential equations

  24. Verifying ODE Solutions (bar chart comparing Majority Class, Sympy, and TreeLSTM on symbolic accuracy and ODE accuracy; reported values: 50.16%, 56.45%, 59.78%, 80.07%, 94.09%, 98.45%)

  25. 2. Tensorized deep learning

  26. Tensors for multi-dimensional data and higher order moments ◎ Videos: 4 dimensions ◎ Images: 3 dimensions ◎ Pairwise correlations ◎ Triplet correlations

  27. Operations on Tensors: Tensor Contraction ◎ Tensor contraction extends the notion of matrix product ◎ Matrix product: Mv = Σ_j v_j M[:, j] ◎ Tensor contraction: T(u, v, ·) = Σ_{i,j} u_i v_j T[i, j, :]
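The two operations on this slide map directly onto `numpy.einsum`; shapes here are arbitrary examples:

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 5))
v = rng.standard_normal(5)
u = rng.standard_normal(4)
T = rng.standard_normal((4, 5, 3))

# Matrix product: Mv = sum_j v_j M[:, j]
mv = np.einsum("ij,j->i", M, v)
assert np.allclose(mv, M @ v)

# Tensor contraction along the first two modes: T(u, v, .) = sum_{i,j} u_i v_j T[i, j, :]
contraction = np.einsum("i,j,ijk->k", u, v, T)
print(contraction.shape)  # (3,)
```

Contracting a 3rd-order tensor with two vectors leaves one free mode, a vector, just as contracting a matrix with one vector does.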

  28. Deep Neural Nets: Transforming Tensors

  29. Deep Tensorized Networks

  30. Space Saving in Deep Tensorized Networks
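A back-of-the-envelope version of the space saving: reshape a dense 1024×1024 weight matrix into a 4th-order tensor and store it in tensor-train (TT) format. The mode sizes and TT-ranks below are illustrative choices, not figures from the slides:

```python
# Dense fully-connected layer: 1024 x 1024 weights.
dense_params = 1024 * 1024

# Reshape 1024 -> 32*32 on each side, giving a weight tensor of shape
# (32, 32, 32, 32); each TT core has shape (r_{k-1}, n_k, r_k).
modes = [32, 32, 32, 32]
ranks = [1, 8, 8, 8, 1]  # hypothetical TT-ranks
tt_params = sum(ranks[k] * modes[k] * ranks[k + 1] for k in range(len(modes)))

print(dense_params, tt_params, dense_params / tt_params)
```

With these ranks the TT format stores 4,608 numbers instead of 1,048,576, roughly a 227× compression; the achievable rank (and hence accuracy) is what the tensorized-network papers study.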

  31. Tensors for long-term forecasting Tensor Train RNN and LSTMs Challenges in forecasting: • Long-term dependencies • High-order correlations • Error propagation

  32. Tensor LSTM for Long-term Forecasting Traffic dataset Climate dataset

  33. TensorLy: High-level API for Tensor Algebra • Python programming • User-friendly API • Multiple backends: flexible + scalable • Example notebooks in repository

  34. Unsupervised learning of Topic Models through tensor methods Topics Justice Education Sports

  35. Learning LDA Model through tensors ◎ Topic-word matrix: P[word = i | topic = j] ◎ Topic proportions: P[topic = j | document] ◎ Moment tensor: co-occurrence of word triplets, decomposed as a sum of rank-1 terms, one per topic (e.g. Education, Sports, Crime)
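The third-order moment on this slide is the empirical co-occurrence tensor of word triplets; a sketch with a toy vocabulary and toy documents (the real pipeline would then decompose this tensor to recover the topic-word matrix):

```python
import numpy as np
from itertools import permutations

# Toy stand-ins for the corpus; the vocabulary and documents are illustrative.
vocab = {"court": 0, "school": 1, "game": 2}
docs = [["court", "school", "game"], ["game", "school", "court", "game"]]

V = len(vocab)
M3 = np.zeros((V, V, V))
n_triples = 0
for doc in docs:
    ids = [vocab[w] for w in doc]
    # All ordered triplets of distinct word positions in the document.
    for i, j, k in permutations(range(len(ids)), 3):
        M3[ids[i], ids[j], ids[k]] += 1
        n_triples += 1
M3 /= n_triples  # empirical triplet probabilities

print(M3.shape)  # (3, 3, 3)
```

Under the LDA model this tensor (after standard whitening corrections) is a sum of one rank-1 term per topic, which is what makes tensor decomposition recover the topics.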

  36. Tensor-based LDA training is faster • Mallet is an open-source framework for topic modeling • Benchmarks on the AWS SageMaker platform • Built into the AWS Comprehend NLP service.

  37. 3. Learning to land a drone Guanya Shi, Xichen Shi, Michael O’Connell, Rose Yu, Kamyar Azizzadenesheli, Anima Anandkumar, Yisong Yue, and Soon-Jo Chung

  38. Center for Autonomous Systems and Technologies A New Vision for Autonomy

  39. Physical Model for a Quadrotor drone • Dynamics: • Control: • Unknown forces & moments:
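The formulas on this slide did not survive extraction; as a hedged sketch, the standard rigid-body model used in this line of work (e.g. Neural-Lander) takes the form below, with p position, v velocity, R attitude, ω body rates, f_u / τ_u the control force and moment, and f_a / τ_a the unknown aerodynamic force and moment:

```latex
% Hedged reconstruction: standard quadrotor rigid-body dynamics with
% unknown aerodynamic disturbance terms f_a and \tau_a.
\begin{aligned}
\dot{p} &= v, & m\dot{v} &= m g + R f_u + f_a, \\
\dot{R} &= R\, S(\omega), & J\dot{\omega} &= J\omega \times \omega + \tau_u + \tau_a,
\end{aligned}
```

Here S(ω) is the skew-symmetric matrix of ω; the learning problem on the next slides is estimating f_a (and, to a limited extent, τ_a).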

  40. Challenges in landing a Quadrotor drone • Unknown aerodynamic forces & moments. • Example 1: when the drone is close to the ground. • Example 2: air drag that grows as velocity goes up. • Example 3: external wind conditions. Wind generation in the Caltech CAST wind tunnel

  41. Challenges in using DNNs to Learn Unknown Dynamics • Our idea: use DNNs to learn the unknown aerodynamic forces, then design a nonlinear controller to cancel them (unknown moments play a very limited role in landing) • Challenge 1: DNNs are data-hungry • Challenge 2: DNNs can be unstable and generate unpredictable outputs • Challenge 3: DNNs are difficult to analyze, which makes it hard to design a provably stable controller on top of them • Our approach: use Spectral Normalization to control the Lipschitz property of the DNN, then design a stable nonlinear controller (Neural-Lander)
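A minimal sketch of the spectral-normalization step named above: estimate the largest singular value of a weight matrix by power iteration and rescale the weights so the layer's Lipschitz constant is bounded. The matrix size and iteration count are illustrative, and a real implementation would apply this per layer during training:

```python
import numpy as np

def spectral_normalize(W, n_iters=100, target=1.0):
    """Rescale W so its spectral norm (top singular value) equals `target`."""
    rng = np.random.default_rng(0)
    u = rng.standard_normal(W.shape[0])
    for _ in range(n_iters):
        v = W.T @ u
        v /= np.linalg.norm(v)
        u = W @ v
        u /= np.linalg.norm(u)
    sigma = u @ W @ v  # power-iteration estimate of the top singular value
    return W * (target / sigma), sigma

rng = np.random.default_rng(1)
W = rng.standard_normal((16, 16))  # hypothetical layer weights
W_sn, sigma = spectral_normalize(W)
print(np.linalg.norm(W_sn, 2))  # ~1.0
```

Bounding each layer's spectral norm bounds the network's overall Lipschitz constant, which is the property the stability analysis of the controller relies on.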

  42. Neural Lander Demo 1

  43. Neural Lander Demo 2

  44. 4. Lots of efforts at NVIDIA..

  45. Exascale Deep Learning for Climate Analytics • 3 Exaops for AI • ~27K Volta V100 GPUs

  46. Some research leaders at NVIDIA ◎ Chief Scientist: Bill Dally ◎ Research areas: Computer vision, Graphics, Robotics, Core ML, Learning & Perception, Applied research, Networks, VLSI, Programming, Architecture, Circuits ◎ Leaders: Dave Luebke, Alex Keller, Aaron Lefohn, Jan Kautz, Sanja Fidler, Me!, Dieter Fox, Bryan Catanzaro, Larry Dennison, Brucek Khailany, Michael Garland, Steve Keckler, Tom Gray, Dave Nellans, Mike O’Connor

  47. Conclusions ◎ Rich opportunities to infuse physical domain knowledge in AI algorithms. ◎ Jointly using symbolic and numerical data greatly helps neural programming generalization. ◎ Tensors expand learning into any dimension. Tensorized neural networks capture dependencies better. ◎ Learning unknown aerodynamics using spectrally normalized DNNs. ◎ Many efforts at NVIDIA to scale AI/ML for physics applications.

  48. Thanks! Any questions?
