A Functional Reboot for Deep Learning
Conal Elliott, Target, August 2019


  1. A Functional Reboot for Deep Learning (Conal Elliott, Target, August 2019)

  2. Goal: Extract the essence of DL. Shed accidental complexity and artificial limitations, i.e., simplify and generalize.

  3. Essence: Optimization: the best element of a set (by an objective function), usually via differentiation and gradient following. For machine learning, the sets are sets of functions, and the objective function is defined via a set of input/output pairs.

  4. Accidental complexity in deep learning

  5. Accidental complexity in DL (overview): imperative programming; weak typing; graphs (“neural networks”); layers; tensors/arrays; backpropagation; linearity bias; hyper-parameters; manual differentiation.

  6. Imperative programming: Thwarts correctness and dependability (usually “not even wrong”). Thwarts efficiency (parallelism). Unnecessary for expressiveness. A poor fit: DL is math, so express it in a mathematical language.

  7. Weak typing: Requires people to manage detail and consistency, and leads to run-time errors.

  8. Graphs (“neural networks”): Clutters the API, distracting from the purpose. The purpose is a representation of functions, and we already have a better one: a programming language. Can we differentiate it? That is an issue of implementation, not of language or library definition; fix it accordingly.

  9. Layers: Strong bias toward sequential composition, neglecting the equally important parallel and conditional forms. Awkward patches: “skip connections”, ResNet, HighwayNet. Don’t patch the problem; eliminate it, replacing layers with binary sequential, parallel, and conditional composition (a sketch follows below).
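
     As a concrete reading of that replacement, here is a minimal Haskell
     sketch of the three composition forms, built from the standard Arrow
     combinators; the names seqC, parC, condC, and skip are illustrative,
     not from the talk.

         import Control.Arrow ((***), (|||))

         -- Sequential composition: g after f (ordinary function composition).
         seqC :: (a -> b) -> (b -> c) -> (a -> c)
         seqC f g = g . f

         -- Parallel composition: apply f and g to the two halves of a pair.
         parC :: (a -> c) -> (b -> d) -> (a, b) -> (c, d)
         parC = (***)

         -- Conditional composition: choose f or g according to an Either tag.
         condC :: (a -> c) -> (b -> c) -> Either a b -> c
         condC = (|||)

         -- With these, a "skip connection" needs no special mechanism:
         skip :: (a -> b) -> a -> (a, b)
         skip f a = (a, f a)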

  10. “Tensors”: Really, multi-dimensional arrays. Awkward: imagine you could program only with arrays (Fortran). Unsafe without dependent types. Multiple intents, weakly typed: even read as linear maps, what is the meaning of an m × n array? Limited: missing almost all differentiable types, including more natural and compositional data types, e.g., trees.
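
     To make the last point concrete: any Functor/Foldable/Traversable
     structure can serve as a parameter container. A small illustrative
     sketch (my example, not the talk’s), using derivable instances:

         {-# LANGUAGE DeriveFunctor, DeriveFoldable, DeriveTraversable #-}

         -- A tree-shaped parameter container: statically typed, compositional,
         -- and needing no index arithmetic or "reshaping".
         data Tree a = Leaf a | Branch (Tree a) (Tree a)
           deriving (Show, Functor, Foldable, Traversable)

         -- Generic operations fall out of the standard classes:
         scale :: Num a => a -> Tree a -> Tree a
         scale s = fmap (s *)

         total :: Num a => Tree a -> a
         total = sum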

  11. Backpropagation: A specialization and rediscovery of reverse-mode automatic differentiation, described in terms of graphs and highly complex due to that graph formulation. Stateful, which hinders parallelism and efficiency, and memory-hungry, which limits problem size.

  12. Linearity bias: “Dense” and “fully connected” mean an arbitrary linear transformation, with “activation functions” sprinkled in as exceptions to linearity. Misses simpler and more efficient architectures.

  13. Hyper-parameters: Same essential purpose as parameters, but with different mechanisms for expression and search: inefficient and ad hoc.

  14. A functional reboot

  15. Values: Precision: meaning, reasoning, correctness. Simplicity: practical rigor/dependability. Generality: room to grow; design guidance.

  16. Essence (recap): Optimization: the best element of a set (by an objective function), usually via differentiation and gradient following. For machine learning, the sets are sets of functions, and the objective function is defined via a set of input/output pairs.

  17. Optimization: Describe a set of values as the range of a function f :: p → c, with objective function q :: c → ℝ. Find argMin (q ∘ f) :: p. When q ∘ f is differentiable, gradient descent can help; otherwise, other methods. Consider also global optimization, e.g., with interval methods.
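
     A minimal sketch of the gradient-following case, assuming the gradient
     of q ∘ f is supplied by hand (in the talk’s setting it would come from
     automatic differentiation); descend and its concrete parameter type
     are illustrative:

         -- Gradient descent, specialized to p = [Double] for concreteness.
         -- grad is the gradient of the composite objective q . f; rate is
         -- the step size, n the number of iterations.
         descend :: Double -> Int -> ([Double] -> [Double]) -> [Double] -> [Double]
         descend rate n grad = go n
           where
             go 0 p = p
             go k p = go (k - 1) (zipWith (\x g -> x - rate * g) p (grad p))

         -- Example: minimize (x - 3)^2, whose gradient is 2 * (x - 3):
         --   descend 0.1 100 (\[x] -> [2 * (x - 3)]) [0]   -- approaches [3.0]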

  18. Learning functions: A special case of optimization in which c = a → b, i.e., f :: p → (a → b) and q :: (a → b) → ℝ. The objective function is often based on a sample set S ⊆ a × b, measuring mis-predictions (loss). Additivity enables a parallel, log-time learning step (sketch below).
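
     A hedged sketch of such an objective for the special case b = ℝ,
     showing the additivity that permits a parallel, log-depth reduction;
     the names loss and lossM are mine:

         import Data.Monoid (Sum (..))

         -- Squared-error loss over a sample set, with b = Double.
         loss :: [(a, Double)] -> (a -> Double) -> Double
         loss samples f = sum [ (f a - b) ^ 2 | (a, b) <- samples ]

         -- The same loss written monoidally: per-sample contributions
         -- combine associatively, so the reduction can run as a balanced
         -- (log-depth, parallel) tree rather than a sequential fold.
         lossM :: [(a, Double)] -> (a -> Double) -> Double
         lossM samples f = getSum (foldMap (\(a, b) -> Sum ((f a - b) ^ 2)) samples)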

  19. Differentiable functional programming: Works directly on Haskell (etc.) programs: not a library/DSEL, and no graphs, networks, or layers. Differentiated at compile time; simple, principled, and general (see “The simple essence of automatic differentiation”); generates efficient run-time code amenable to massively parallel execution (GPU, etc.).
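
     The construction from “The simple essence of automatic differentiation”
     can be previewed in a few lines. This is a drastically simplified
     sketch: the paper keeps derivatives as linear maps in an arbitrary
     category, while here they are plain functions.

         -- A differentiable function: at each input, a result paired with
         -- the derivative there (a linear map, represented here as a bare
         -- function for brevity).
         newtype D a b = D (a -> (b, a -> b))

         -- Sequential composition of D values is exactly the chain rule.
         compD :: D b c -> D a b -> D a c
         compD (D g) (D f) = D $ \a ->
           let (b, f') = f a
               (c, g') = g b
           in  (c, g' . f')

         -- An example primitive: squaring, with derivative dx |-> 2 * x * dx.
         sqr :: D Double Double
         sqr = D (\x -> (x * x, \dx -> 2 * x * dx))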

  20. Beyond “tensors”: Most differentiable types are not vectors (uniform n-tuples), and most derivatives (linear maps) are not matrices. A more general alternative: the free vector space over s: i → s ≅ f s (“i indexes f”); special case: Fin n → s ≅ Vec n s. An algebra of representable functors: f × g, 1, g ∘ f, Id; get your own (representable) functor via deriving Generic. A linear map (f s ⊸ g s) ≅ g (f s) ≅ (g ∘ f) s: a generalized matrix. Other representations give efficient reverse-mode AD (without tears). Use with Functor, Foldable, Traversable, Scannable, etc.; no need for special/limited array “reshaping” operations. Compositional and naturally parallel-friendly (see “Generic parallel functional programming”).
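
     A small sketch of the “generalized matrix” reading, assuming a “zippy”
     Applicative (as representable functors provide, where liftA2 combines
     pointwise); the helper names LMap, dot, and applyL are illustrative:

         import Control.Applicative (liftA2)

         -- A generalized matrix: a g-structure of f-structures of scalars,
         -- representing a linear map from f s to g s.
         type LMap f g s = g (f s)

         -- Dot product via Foldable and a zippy Applicative.
         dot :: (Foldable f, Applicative f, Num s) => f s -> f s -> s
         dot u v = sum (liftA2 (*) u v)

         -- Matrix-times-vector with no arrays in sight: dot each "row" with v.
         applyL :: (Foldable f, Applicative f, Functor g, Num s)
                => LMap f g s -> f s -> g s
         applyL m v = fmap (`dot` v) m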

  21. Modularity: How do we build function families from pieces, as in DL? A category of indexed sets of functions; extract the monolithic function after composing. Other uses, including satisfiability. Prototyped, but with a problem in the GHC type-checker. (One plausible formulation is sketched below.)
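
     One plausible formulation of such a category, offered as my own
     illustration rather than the talk’s prototype: parameterized functions
     whose composition pairs the parameter types, from which a single
     monolithic function can be extracted afterward.

         -- A function family indexed by its parameter type p.
         newtype PF p a b = PF { runPF :: p -> a -> b }

         -- Composing two families pairs their parameter types, so a
         -- composed network's parameters are tracked in its type.
         compPF :: PF q b c -> PF p a b -> PF (p, q) a c
         compPF (PF g) (PF f) = PF (\(p, q) -> g q . f p)

         -- After composing, extract the monolithic function of one
         -- combined parameter, ready to hand to an optimizer.
         extract :: PF p a b -> p -> a -> b
         extract = runPF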

  22. Progress: Simple and efficient reverse-mode AD; some simple regressions, simple DL, and a CNN; some implementation challenges with robustness. Looking for collaborators, including in GHC internals (the compiling-to-categories plugin) and with backgrounds in machine learning and statistics.

  23. Summary: Generalize and simplify DL (more for less). The essence of DL is pure FP with argMin. Generalize from “tensors” (for composition and safety). Collaboration welcome!
