Dioptics: a common generalization of gradient-based learners and open games

David A. Dalrymple (@davidad)
Protocol Labs

SYCO 5, Birmingham, UK
2019-09-05

About This Talk

• Clarifying connections between (a lot of) prior work
• Besides abstractions, main novelty: generalizing backpropagation and gradient descent to Lie groups and framed Riemannian manifolds
• Work in progress; dubious provenance

Haven’t I seen this talk already?

There is a lot of overlap with Jules’ talk earlier. A couple of differences:
• I only deal with trivializable bundles, TX ≅ X × X′
• I’m aiming to cover more than just backpropagation

Notations

• Composition: (f ⨟ g)(x) ≡ g(f(x)) ≡ x ▹ f ▹ g
• homs: C(A, B) means hom_C(A, B). [A, B] denotes the internal hom from A to B. A ⊸ B denotes the space of (literally) linear maps from A to B.
• Definitions:

    eval_{X,Y} : (X ⊸ Y) ⊗ X → Y := ⟨f, x⟩ ↦ f(x)

  (reading left to right: name, type, variable bindings, expression) means the same as

    eval_{X,Y} : (X ⊸ Y) ⊗ X → Y
    eval_{X,Y} ⟨f, x⟩ = f(x)
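
A minimal Haskell sketch of these two notations, with assumed operator names: `|>` for diagrammatic composition (the talk's ⨟), and ordinary functions and pairs standing in for linear maps A ⊸ B and the tensor product.

```haskell
infixl 1 |>

-- Diagrammatic (left-to-right) composition: (f |> g) x = g (f x).
(|>) :: (a -> b) -> (b -> c) -> (a -> c)
f |> g = g . f

-- eval_{X,Y} : (X ⊸ Y) ⊗ X → Y, with (->) standing in for ⊸ and (,) for ⊗.
eval :: (x -> y, x) -> y
eval (f, x) = f x

main :: IO ()
main = print (eval ((+ 1) |> (* 2), 5 :: Int))  -- (5 + 1) * 2 = 12
```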

Section 1

Gradient-Based Learners

Machine Learning in 60 seconds

• A (supervised) machine learning problem is a function approximation problem.
• A pretty practical class of functions to approximate things with is neural nets.
• Deep learning is, in part, about composing layers. The deepness is (sequential) composition depth (a toy rendering follows this slide).
• Modern deep learning (e.g. TensorFlow, PyTorch) uses computational graphs.

[Figure: computational graph of a single neuron, with nodes ∗, +, σ]

⇒ How much of modern deep learning can be understood from this perspective?
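
A toy rendering of "a deep net is a sequential composition of layers" (assumed, not from the slides): each dense layer computes σ(Wx + b), and depth is how many such layers we compose.

```haskell
type Layer = [Double] -> [Double]

-- A dense layer: multiply by a weight matrix, add a bias, apply a nonlinearity.
dense :: [[Double]] -> [Double] -> (Double -> Double) -> Layer
dense w b sigma x =
  map sigma (zipWith (+) [sum (zipWith (*) row x) | row <- w] b)

relu :: Double -> Double
relu = max 0

-- A "deep" net: two layers in sequential composition.
net :: Layer
net = dense [[1, -1], [0.5, 2]] [0, 0.1] relu
    . dense [[2, 0], [0, 2]] [1, 1] relu
```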

Backpropagation

• Forward pass computes x ↦ y
• Backward pass computes d̄/dx ← d̄/dy
• Technically, the name "backpropagation" implies codomain ℝ. Otherwise, it's reverse-mode automatic differentiation.

[Figure: the forward graph (∗, +, σ, ∗) alongside its reversed derivative graph (∗′, +′, σ′, ∗′)]
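
A minimal reverse-mode sketch of the two passes (assumed for illustration, not the talk's formalism; a plain function stands in for the linear backward map): each morphism pairs a forward pass x ↦ y with a backward pass sending d̄/dy back to d̄/dx.

```haskell
newtype Rev a b = Rev (a -> (b, b -> a))

-- Sequential composition: forward passes run left-to-right, backward passes
-- right-to-left — the chain rule, run in reverse.
compRev :: Rev a b -> Rev b c -> Rev a c
compRev (Rev f) (Rev g) = Rev $ \x ->
  let (y, backF) = f x
      (z, backG) = g y
  in (z, backF . backG)

-- Example primitive: squaring, whose backward pass scales by 2x.
square :: Rev Double Double
square = Rev (\x -> (x * x, \dy -> 2 * x * dy))
```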

Two ideas about how "backpropagation is a functor":

• "Simple Essence of Automatic Differentiation"
  Conal Elliott
  arXiv:1804.00746 [cs.PL]

• "Backprop as Functor" (presented at SYCO 1!)
  Brendan Fong, David Spivak, Rémy Tuyéras
  arXiv:1711.10455 [math.CT]

How do these relate?

What’s a Derivative?

• Elliott constructs a "derivative" functor D⁺.

For X, Y : Euc, f : X → Y, let

    Df : (X → (X ⊸ Y)) := x ↦ the unique linear g (written f′(x) := g_x) s.t.

        lim_{ε→0} ‖f(x + ε) − (f(x) + f′(x)(ε))‖ / ‖ε‖ = 0

Chain rule:

    D(f ⨟ g)(x) = Df(x) ⨟ Dg(f(x))

Problem – not functorial: depends on the un-D’d f.

Let D⁺f : X → Y × (X ⊸ Y) := x ↦ ⟨f(x), Df(x)⟩.

Proposition (Elliott). D⁺ is a symmetric monoidal functor from Euc into a category with objects of Euc and morphisms of type X →_Euc Y × (X ⊸ Y).
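
A compressed Haskell sketch of D⁺ (names assumed; "Simple Essence" develops this properly, with a genuine category of linear maps in place of plain `(->)`).

```haskell
-- A morphism of the target category: x ↦ ⟨f(x), Df(x)⟩.
newtype Dplus a b = Dplus (a -> (b, a -> b))

idD :: Dplus a a
idD = Dplus (\x -> (x, id))

-- Composition is exactly the chain rule: D(f ⨟ g)(x) = Df(x) ⨟ Dg(f(x)).
compD :: Dplus a b -> Dplus b c -> Dplus a c
compD (Dplus f) (Dplus g) = Dplus $ \x ->
  let (y, f') = f x
      (z, g') = g y
  in (z, g' . f')
```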

What do we really need to assume?

We can work in any category E which... (a rough type-level sketch follows this slide)
• is cartesian closed and locally cartesian closed
• has a product-preserving endofunctor T (given a space X : E, TX is interpreted as its tangent bundle)
• has a "base point" natural transformation p : ∀X. TX → X (that is, p : T ⇒ id_E)
• has a semiadditive subcategory E_Vect of "vector-like spaces" enriched in E
• has a subcategory E_Triv of "trivializable spaces" s.t. for all X : E_Triv, there is some X′ : E_Vect satisfying the isomorphism (of bundles over X) TX ≅ X × X′
  • Observation: TX ≅ X × X′ looks like a constant-complement lens TX ⇆ X
• satisfies one last hard-to-state assumption about "linearity of derivatives"
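
A loose Haskell-flavoured sketch of part of this structure, with all names assumed: a trivializable space comes with a tangent bundle, a vector-like fibre, the base-point projection p, and the trivialization TX ≅ X × X′. (The closure, enrichment, and semiadditivity assumptions are not modelled here.)

```haskell
{-# LANGUAGE TypeFamilies #-}

class Trivializable x where
  type Tangent x                            -- T X, the tangent bundle
  type Fibre x                              -- X′, the vector-like fibre
  basePoint  :: Tangent x -> x              -- p : TX → X
  trivialize :: Tangent x -> (x, Fibre x)   -- one half of TX ≅ X × X′
  bundle     :: (x, Fibre x) -> Tangent x   -- ...and the other half

-- Euclidean example: the tangent bundle of ℝ is ℝ × ℝ.
instance Trivializable Double where
  type Tangent Double = (Double, Double)
  type Fibre Double = Double
  basePoint  = fst
  trivialize = id
  bundle     = id
```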

What’s a Derivative, Again?

• If X, Y : C_Triv, then T(f : X → Y) : TX → TY ≅ X × X′ → Y × Y′
• By naturality of the base-point projection p : T– ⇒ –, we have Tf⟨x, ·⟩ = ⟨f(x), ·⟩.
• Therefore Tf⟨x, x′⟩ = ⟨f(x), π₂ Tf⟨x, x′⟩⟩.
• So we can define T⁺f : X → Y × (X′ → Y′) := x ↦ ⟨f(x), λx′. π₂ Tf⟨x, x′⟩⟩.
• Our last assumption is that T⁺(f)(x) is, in fact, a linear map X′ ⊸ Y′.
• Then T⁺f : X → Y × (X′ ⊸ Y′), just like Elliott’s D⁺.
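
Building on the hypothetical `Trivializable` class sketched after the previous slide, the T⁺ construction transcribes directly: pair f(x) with the action of Tf on the fibre at x, i.e. x ↦ ⟨f(x), λx′. π₂ Tf⟨x, x′⟩⟩.

```haskell
tPlus :: (Trivializable x, Trivializable y)
      => (x -> y)                         -- f
      -> (Tangent x -> Tangent y)         -- T f
      -> (x -> (y, Fibre x -> Fibre y))   -- T⁺ f
tPlus f tf x = (f x, \x' -> snd (trivialize (tf (bundle (x, x')))))
-- By naturality of the base point, fst (trivialize (tf (bundle (x, x'))))
-- agrees with f x, which is why only the second projection is kept.
```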