implementing autograd

Implementing autograd: slides by Matthew Johnson



  1. Implementing autograd Slides by Matthew Johnson

  2. Autograd’s implementation
     github.com/hips/autograd
     Dougal Maclaurin, David Duvenaud, Matt Johnson
     • differentiates native Python code
     • handles most of NumPy + SciPy
     • loops, branching, recursion, closures
     • arrays, tuples, lists, dicts...
     • derivatives of derivatives
     • a one-function API!

  3. autodiff implementation options A. direct specification of computation graph B. source code inspection C. monitoring function execution

  4. ingredients: 1. tracing composition of primitive functions 2. vector-Jacobian product for each primitive 3. composing VJPs backward

  5. ingredients: 1. tracing composition of primitive functions 2. vector-Jacobian product for each primitive 3. composing VJPs backward

  6. numpy.sum

  7. primitive: autograd.numpy.sum wraps numpy.sum

  8. primitive autograd.numpy.sum receives Node ã (value: a, function: F, parents: [x]) and wraps numpy.sum

  9. primitive autograd.numpy.sum receives Node ã (value: a, function: F, parents: [x]), unboxes it to a, and calls numpy.sum on a

  10. primitive autograd.numpy.sum receives Node ã (value: a, function: F, parents: [x]), unboxes it to a, calls numpy.sum to get b, and boxes b as Node b̃ (value: b, function: anp.sum, parents: [ã])
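
The unbox, call, box sequence on the slides above can be sketched in plain Python. This is a minimal illustration, not autograd's actual source: the `Node` fields mirror the slide (value, function, parents), the built-in `sum` stands in for `numpy.sum`, and `traced_sum` is a hypothetical name standing in for `autograd.numpy.sum`.

```python
class Node:
    """Boxed value that remembers how it was computed."""
    def __init__(self, value, function, parents):
        self.value = value        # raw (unboxed) result
        self.function = function  # primitive that produced it (None for a start node)
        self.parents = parents    # Nodes the primitive consumed


def primitive(raw_fun):
    """Wrap raw_fun so that calls on Nodes are recorded in the graph."""
    def wrapped(*args):
        parents = [a for a in args if isinstance(a, Node)]
        raw_args = [a.value if isinstance(a, Node) else a for a in args]  # unbox
        result = raw_fun(*raw_args)
        if not parents:
            return result                      # nothing to trace: behave like raw_fun
        return Node(result, wrapped, parents)  # box
    wrapped.__name__ = raw_fun.__name__
    return wrapped


# Assumption: built-in sum plays the role of numpy.sum here.
traced_sum = primitive(sum)

x_box = Node([1.0, 2.0, 3.0], None, [])  # start node holding the input
b_box = traced_sum(x_box)                # new Node whose parent is x_box
```

Calling `traced_sum` on a plain list skips the boxing entirely, which is why untraced code pays no bookkeeping cost in this sketch.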

  11. start_node $x$

  12. start_node $x \to a = A(x)$

  13. start_node $x \to a = A(x) \to b = B(a)$

  14. start_node $x \to a = A(x) \to b = B(a) \to c = C(b)$

  15. start_node $x \to a = A(x) \to b = B(a) \to c = C(b) \to y = D(c)$ end_node

  16. start_node $\to \cdots \to$ end_node: No control flow!
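
To illustrate the "no control flow" point: when tracing happens by monitoring execution, Python evaluates loops and branches itself, and only the primitives that actually ran are recorded. The sketch below uses a hypothetical global `tape` and toy primitives `mul` and `add`; it logs names only and is not how autograd stores its graph.

```python
tape = []  # hypothetical log of the primitives executed during one call


def primitive(raw_fun):
    """Record the primitive's name each time it runs."""
    def wrapped(*args):
        tape.append(raw_fun.__name__)
        return raw_fun(*args)
    return wrapped


@primitive
def mul(a, b):
    return a * b


@primitive
def add(a, b):
    return a + b


def f(x, n):
    """Ordinary Python with a loop and a data-dependent branch."""
    y = x
    for _ in range(n):
        if y < 100:
            y = mul(y, x)
        else:
            y = add(y, 1.0)
    return y


def trace(fun, *args):
    """Run fun and return the straight-line list of primitives it executed."""
    tape.clear()
    fun(*args)
    return list(tape)


small = trace(f, 2.0, 3)   # the branch always takes the mul side
large = trace(f, 20.0, 3)  # the branch switches to add after the first step
```

Each trace is a flat sequence with no `for` or `if` in it, so a different input may produce a different trace of the same function.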

  17. ingredients: 1. tracing composition of primitive functions 2. vector-Jacobian product for each primitive 3. composing VJPs backward

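  18. $x \to a = A(x)$

  19. $x \to a = A(x)$, with $\frac{\partial y}{\partial a}$ at $a$

  20. $x \to a = A(x)$, with $\frac{\partial y}{\partial a}$ at $a$: $\frac{\partial y}{\partial x} = {?}$

  21. $\frac{\partial y}{\partial x} = \frac{\partial y}{\partial a} \cdot \frac{\partial a}{\partial x}$ for $x \to a = A(x)$

  22. vector-Jacobian product: $\frac{\partial y}{\partial x} = \frac{\partial y}{\partial a} \cdot A'(x)$ for $x \to a = A(x)$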
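
A small worked instance of the vector-Jacobian product may help. The primitive `A` below is a made-up two-input, two-output function chosen only for illustration; its hand-written VJP is compared with an explicit row-vector-times-Jacobian product. The point of a VJP rule is that it computes the same row vector without ever forming the Jacobian.

```python
def A(x):
    """Toy primitive (an assumption for this example): R^2 -> R^2."""
    x0, x1 = x
    return [x0 * x1, x0 + x1]


def jacobian_A(x):
    """Explicit Jacobian A'(x); row i holds the partials of output i."""
    x0, x1 = x
    return [[x1, x0],
            [1.0, 1.0]]


def vjp_A(g, x):
    """Hand-written VJP: maps g = dy/da to dy/dx without building A'(x)."""
    x0, x1 = x
    g0, g1 = g
    return [g0 * x1 + g1, g0 * x0 + g1]


def row_times_matrix(g, J):
    """Reference computation of the row vector g times the matrix J."""
    return [sum(g[i] * J[i][j] for i in range(len(g)))
            for j in range(len(J[0]))]


x = [2.0, 3.0]
g = [1.0, 10.0]  # stands in for dy/da arriving from later in the graph
via_rule = vjp_A(g, x)
via_matrix = row_times_matrix(g, jacobian_A(x))
```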

  23. ingredients: 1. tracing composition of primitive functions 2. vector-Jacobian product for each primitive 3. composing VJPs backward

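  24. start_node $x \to a = A(x) \to b = B(a) \to c = C(b) \to y = D(c)$ end_node

  25. start_node $x \to a = A(x) \to b = B(a) \to c = C(b) \to y = D(c)$ end_node; backward: $\frac{\partial y}{\partial y} = 1$

  26. start_node $x \to a = A(x) \to b = B(a) \to c = C(b) \to y = D(c)$ end_node; backward: $\frac{\partial y}{\partial y} = 1 \to \frac{\partial y}{\partial c}$

  27. start_node $x \to a = A(x) \to b = B(a) \to c = C(b) \to y = D(c)$ end_node; backward: $\frac{\partial y}{\partial y} = 1 \to \frac{\partial y}{\partial c} \to \frac{\partial y}{\partial b}$

  28. start_node $x \to a = A(x) \to b = B(a) \to c = C(b) \to y = D(c)$ end_node; backward: $\frac{\partial y}{\partial y} = 1 \to \frac{\partial y}{\partial c} \to \frac{\partial y}{\partial b} \to \frac{\partial y}{\partial a}$

  29. start_node $x \to a = A(x) \to b = B(a) \to c = C(b) \to y = D(c)$ end_node; backward: $\frac{\partial y}{\partial y} = 1 \to \frac{\partial y}{\partial c} \to \frac{\partial y}{\partial b} \to \frac{\partial y}{\partial a} \to \frac{\partial y}{\partial x}$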
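
The backward sweep over the chain can be sketched with closures. The concrete choices for `A`, `B`, `C`, and `D` below are assumptions made only so the example runs; each returns its output together with a VJP closure, and the VJPs are applied in reverse order starting from $\frac{\partial y}{\partial y} = 1$.

```python
import math


# Each toy primitive returns (output, vjp), where vjp maps dy/d(output) to dy/d(input).
def A(x):
    return x * x, lambda g: g * 2.0 * x


def B(a):
    return math.sin(a), lambda g: g * math.cos(a)


def C(b):
    c = math.exp(b)
    return c, lambda g: g * c


def D(c):
    return 3.0 * c, lambda g: g * 3.0


def value_and_grad_chain(x, primitives=(A, B, C, D)):
    """Forward pass records one VJP per primitive; backward pass composes them."""
    vjps = []
    val = x
    for prim in primitives:
        val, vjp = prim(val)
        vjps.append(vjp)
    g = 1.0                    # dy/dy
    for vjp in reversed(vjps):
        g = vjp(g)             # dy/dc, then dy/db, dy/da, dy/dx
    return val, g


y, dy_dx = value_and_grad_chain(0.7)
```

With these choices $y = 3\exp(\sin(x^2))$, so the result can be checked against $\frac{\partial y}{\partial x} = 3\exp(\sin(x^2))\cos(x^2)\,2x$.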

  30. higher-order autodiff just works: the backward pass can itself be traced
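
One way to see why a traceable backward pass gives higher derivatives: if every VJP is itself written with traced operations, the gradient comes back as a graph node that can be differentiated again. The sketch below is a simplified scalar design with a hypothetical `Var` class and only `+` and `*`; it is not autograd's mechanism for nested traces, just an illustration of the idea.

```python
class Var:
    """Scalar graph node; each parent entry pairs a parent Var with its VJP."""
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents

    def __add__(self, other):
        other = lift(other)
        return Var(self.value + other.value,
                   ((self, lambda g: g), (other, lambda g: g)))

    def __mul__(self, other):
        other = lift(other)
        # The VJPs multiply Vars, so running them adds nodes to the graph.
        return Var(self.value * other.value,
                   ((self, lambda g: g * other), (other, lambda g: g * self)))

    __radd__ = __add__
    __rmul__ = __mul__


def lift(x):
    return x if isinstance(x, Var) else Var(x)


def toposort(end):
    """Yield nodes so that every node comes before its parents."""
    seen, order = set(), []
    def visit(node):
        if id(node) in seen:
            return
        seen.add(id(node))
        for parent, _ in node.parents:
            visit(parent)
        order.append(node)
    visit(end)
    return reversed(order)


def backward(end, start):
    """Accumulate d(end)/d(node) as Vars, so the result is itself traced."""
    grads = {id(end): Var(1.0)}
    for node in toposort(end):
        g = grads[id(node)]
        for parent, vjp in node.parents:
            contrib = vjp(g)
            key = id(parent)
            grads[key] = grads[key] + contrib if key in grads else contrib
    return grads.get(id(start), Var(0.0))


def grad(f):
    def df(x):
        start = lift(x)
        end = lift(f(start))
        g = backward(end, start)
        # Stay boxed when called from inside an outer grad.
        return g if isinstance(x, Var) else g.value
    return df


cube = lambda x: x * x * x
first = grad(cube)(3.0)               # 3 * x**2
second = grad(grad(cube))(3.0)        # 6 * x
third = grad(grad(grad(cube)))(3.0)   # 6
```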

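  31. $\frac{\partial y}{\partial y} = 1$; start_node $x \to a = A(x) \to b = B(a) \to c = C(b) \to y = D(c)$ end_node

  32. $\frac{\partial y}{\partial c} \leftarrow \frac{\partial y}{\partial y} = 1$; start_node $x \to a = A(x) \to b = B(a) \to c = C(b) \to y = D(c)$ end_node

  33. $\frac{\partial y}{\partial b} \leftarrow \frac{\partial y}{\partial c} \leftarrow \frac{\partial y}{\partial y} = 1$; start_node $x \to a = A(x) \to b = B(a) \to c = C(b) \to y = D(c)$ end_node

  34. $\frac{\partial y}{\partial a} \leftarrow \frac{\partial y}{\partial b} \leftarrow \frac{\partial y}{\partial c} \leftarrow \frac{\partial y}{\partial y} = 1$; start_node $x \to a = A(x) \to b = B(a) \to c = C(b) \to y = D(c)$ end_node

  35. $\frac{\partial y}{\partial x} \leftarrow \frac{\partial y}{\partial a} \leftarrow \frac{\partial y}{\partial b} \leftarrow \frac{\partial y}{\partial c} \leftarrow \frac{\partial y}{\partial y} = 1$; start_node $x \to a = A(x) \to b = B(a) \to c = C(b) \to y = D(c)$ end_node

  36. $\frac{\partial y}{\partial y} = 1$; start_node $x \to a = A(x) \to b = B(a) \to c = C(b) \to y = D(c)$ end_node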

  37. ingredients: 1. tracing composition of primitive functions (Node, primitive, forward_pass) 2. vector-Jacobian product for each primitive (defvjp) 3. composing VJPs backward (backward_pass, make_vjp, grad)
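
The three ingredients can be put together in a compact scalar sketch. The function names follow the slide, but the bodies, signatures, and the `vjp_makers` registry are assumptions written for illustration rather than autograd's source. The VJP rules use plain `math` calls on raw values, so this version does not support derivatives of derivatives.

```python
import math


class Node:
    """Records one primitive call made during the forward pass."""
    def __init__(self, value, fun, args, parents):
        self.value = value      # raw result of the call
        self.fun = fun          # wrapped primitive (None for the start node)
        self.args = args        # raw argument values
        self.parents = parents  # list of (argnum, parent Node)


vjp_makers = {}  # hypothetical registry: primitive -> one VJP maker per argument


def primitive(raw_fun):
    def wrapped(*args):
        parents = [(i, a) for i, a in enumerate(args) if isinstance(a, Node)]
        raw_args = [a.value if isinstance(a, Node) else a for a in args]
        ans = raw_fun(*raw_args)
        return Node(ans, wrapped, raw_args, parents) if parents else ans
    return wrapped


def defvjp(fun, *makers):
    """makers[i](ans, *args) returns the VJP with respect to argument i."""
    vjp_makers[fun] = makers


def forward_pass(fun, x):
    start_node = Node(x, None, [], [])
    return start_node, fun(start_node)


def toposort(end_node):
    """Yield nodes so that every node comes before its parents."""
    seen, order = set(), []
    def visit(node):
        if id(node) in seen:
            return
        seen.add(id(node))
        for _, parent in node.parents:
            visit(parent)
        order.append(node)
    visit(end_node)
    return reversed(order)


def backward_pass(g, start_node, end_node):
    outgrads = {id(end_node): g}
    for node in toposort(end_node):
        outgrad = outgrads[id(node)]
        for argnum, parent in node.parents:
            vjp = vjp_makers[node.fun][argnum](node.value, *node.args)
            outgrads[id(parent)] = outgrads.get(id(parent), 0.0) + vjp(outgrad)
    return outgrads.get(id(start_node), 0.0)


def make_vjp(fun, x):
    start_node, end = forward_pass(fun, x)
    if not isinstance(end, Node):  # output does not depend on x
        return (lambda g: 0.0), end
    return (lambda g: backward_pass(g, start_node, end)), end.value


def grad(fun):
    return lambda x: make_vjp(fun, x)[0](1.0)


sin = primitive(math.sin)
mul = primitive(lambda a, b: a * b)
defvjp(sin, lambda ans, x: lambda g: g * math.cos(x))
defvjp(mul, lambda ans, a, b: lambda g: g * b,
            lambda ans, a, b: lambda g: g * a)

f = lambda x: mul(x, sin(x))
df = grad(f)  # should agree with sin(x) + x*cos(x)
```

Adding a new differentiable operation in this sketch takes one `primitive` call and one `defvjp` call, with no change to the tracing or backward-pass code.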

  38. what’s the point? easy to extend! - develop autograd! - forward mode - log joint densities from sampler programs
