Neural Networks with Cheap Differential Operators


  1. Neural Networks with Cheap Differential Operators Ricky T. Q. Chen, David Duvenaud

  2. Differential Operators • Want to compute operators such as the divergence: $\nabla \cdot f = \sum_{i=1}^{d} \frac{\partial f_i(x)}{\partial x_i}$, where $f : \mathbb{R}^d \to \mathbb{R}^d$ is a neural net. • Solving PDEs • Fitting SDEs • Finding fixed points • Continuous normalizing flows

  3. Automatic Differentiation (AD) Reverse-mode AD gives cheap vector-Jacobian products: $v^{\top} \left[ \frac{d}{dx} f(x) \right] = \sum_{i=1}^{d} v_i \frac{\partial f_i(x)}{\partial x}$ • For the full Jacobian, we need $d$ separate passes. • In general, the Jacobian diagonal has the same cost as the full Jacobian! • We restrict the architecture to allow one-pass diagonal computations.
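
As a minimal sketch of the cost argument above (the network and sizes here are illustrative, not from the paper): reverse-mode AD gives one vector-Jacobian product per backward pass, so extracting just the Jacobian diagonal of a generic network still takes d passes, one per basis vector.

    import torch

    d = 5
    # Any generic R^d -> R^d network; nothing hollow about it.
    net = torch.nn.Sequential(torch.nn.Linear(d, d), torch.nn.Tanh(), torch.nn.Linear(d, d))

    x = torch.randn(d, requires_grad=True)
    y = net(x)

    # One vector-Jacobian product v^T (df/dx): a single backward pass.
    v = torch.randn(d)
    vjp, = torch.autograd.grad(y, x, grad_outputs=v, retain_graph=True)

    # Jacobian diagonal of a generic net: one basis-vector VJP per dimension,
    # i.e. d backward passes -- the cost HollowNets are designed to avoid.
    diag = torch.stack([
        torch.autograd.grad(y, x, grad_outputs=torch.eye(d)[i], retain_graph=True)[0][i]
        for i in range(d)
    ])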

  4. HollowNets Allow efficient computation of dimension-wise derivatives of order k, $D^k_{\text{dim}} f(x) = \left[ \frac{\partial^k f_1(x)}{\partial x_1^k}, \ldots, \frac{\partial^k f_d(x)}{\partial x_d^k} \right]^{\top}$, with only k backward passes, regardless of dimension. Example: for k = 1, $D_{\text{dim}} f(x)$ is the Jacobian diagonal. [Figure: full Jacobian vs. Jacobian diagonal]

  5. HollowNet Architecture HollowNets are composed of two sub-networks: • Hidden units which don't depend on their respective input: $h_i = c_i(x_{-i})$ • Output units which depend only on their respective input and hidden units: $f_i(x) = \tau_i([x_i, h_i])$
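
A minimal PyTorch sketch of this two-part structure (class and layer sizes are illustrative, not the paper's reference implementation). The per-dimension loop makes the dependency pattern explicit; the paper computes all h_i without such a loop (e.g. via masking).

    import torch
    import torch.nn as nn

    class HollowNet(nn.Module):
        """Toy HollowNet: h_i sees only x_{-i}; f_i sees only (x_i, h_i)."""

        def __init__(self, d, hidden=32):
            super().__init__()
            self.d = d
            # Conditioner c, applied to x with coordinate i masked out, so h_i = c_i(x_{-i}).
            self.conditioner = nn.Sequential(
                nn.Linear(d, hidden), nn.Tanh(), nn.Linear(hidden, hidden))
            # Transformer tau, mapping the pair [x_i, h_i] to the scalar output f_i.
            self.transformer = nn.Sequential(
                nn.Linear(1 + hidden, hidden), nn.Tanh(), nn.Linear(hidden, 1))

        def forward(self, x):                                # x: (batch, d)
            outs = []
            for i in range(self.d):
                mask = torch.ones(self.d)
                mask[i] = 0.0
                h_i = self.conditioner(x * mask)             # h_i = c_i(x_{-i})
                f_i = self.transformer(torch.cat([x[:, i:i + 1], h_i], dim=-1))
                outs.append(f_i)
            return torch.cat(outs, dim=-1)                   # f(x): (batch, d)

    f = HollowNet(d=4)
    print(f(torch.randn(8, 4)).shape)                        # torch.Size([8, 4])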

  6. HollowNet Jacobians Can get exact dimension-wise derivatives by disconnecting some dependencies in the backward pass, i.e. detach in PyTorch or stop_gradient in TensorFlow.
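
A self-contained sketch of the detach trick (toy weights, not the paper's code): because each h_i never depends on its own input x_i, detaching h before the transformer and back-propagating a vector of ones returns the exact Jacobian diagonal in a single backward pass.

    import torch

    torch.manual_seed(0)
    d = 4
    W = torch.randn(d, d).fill_diagonal_(0.0)       # hollow weights: h_i ignores x_i

    def f(x, detach_h=False):
        h = torch.tanh(x @ W.T)                     # h_i = c_i(x_{-i})
        if detach_h:
            h = h.detach()                          # cut the x -> h -> f path in the backward pass
        return torch.tanh(x + h)                    # f_i = tau_i(x_i, h_i)

    x = torch.randn(d, requires_grad=True)

    # One backward pass with v = 1s through the detached graph gives the Jacobian diagonal.
    y = f(x, detach_h=True)
    diag_fast, = torch.autograd.grad(y.sum(), x)

    # Brute-force check against the diagonal of the full Jacobian (d passes internally).
    diag_full = torch.autograd.functional.jacobian(f, x).diagonal()
    print(torch.allclose(diag_fast, diag_full))     # True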

  7. HollowNet Jacobians Can factor the Jacobian into: • A diagonal matrix (dimension-wise dependencies). • A hollow matrix (all interactions). $\frac{d}{dx} f = \underbrace{\frac{\partial}{\partial x} \tau(x, h)}_{\text{diagonal}} + \underbrace{\frac{\partial}{\partial h} \tau(x, h) \, \frac{\partial}{\partial x} h(x)}_{\text{hollow}}$

  8. Application I: Finding Fixed Points Root-finding problems ($f(x) = 0$) can be solved using Jacobi-Newton: $x_{t+1} = x_t - [D_{\text{dim}} f(x_t)]^{-1} \odot f(x_t)$ instead of the plain fixed-point update $x_{t+1} = x_t - f(x_t)$. • Same solution with faster convergence. • We applied this to implicit ODE solvers for solving stiff equations.
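
A toy sketch of the Jacobi-Newton update (the function, coupling matrix, and target below are made up for illustration; D_dim of this toy is known in closed form, standing in for the one-pass HollowNet diagonal):

    import torch

    torch.manual_seed(0)
    d = 4
    W = 0.1 * torch.randn(d, d).fill_diagonal_(0.0)   # hollow coupling: no x_i -> f_i self-term
    b = torch.rand(d) + 1.0

    def f(x):
        # f_i(x) = exp(x_i) + sum_{j != i} W_ij x_j - b_i ; we want f(x) = 0
        return torch.exp(x) + x @ W.T - b

    def ddim_f(x):
        # Dimension-wise derivative D_dim f. Here it is exp(x) in closed form because the
        # coupling is hollow; a HollowNet provides it in one backward pass in general.
        return torch.exp(x)

    x = torch.zeros(d)
    for t in range(20):
        x = x - f(x) / ddim_f(x)                      # x_{t+1} = x_t - [D_dim f(x_t)]^{-1} ⊙ f(x_t)
        print(t, f(x).norm().item())                  # residual shrinks toward 0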

  9. Application II: Continuous Normalizing Flows • Transforms distributions through an ODE: $\frac{dx}{dt} = f(x, t)$ • Change in density is given by the divergence: $\frac{d \log p(x, t)}{dt} = -\mathrm{tr}\left( \frac{d}{dx} f(x) \right) = -\sum_{i=1}^{d} [D_{\text{dim}} f(x)]_i$
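
A toy sketch of this density update (illustrative dynamics; the divergence here is computed by brute force, which is exactly the sum over dimension-wise derivatives that a HollowNet returns in one backward pass):

    import torch

    torch.manual_seed(0)
    d = 2
    W = 0.5 * torch.randn(d, d)

    def f(x):
        # Toy dynamics dx/dt = f(x); any R^d -> R^d network could go here.
        return torch.tanh(x @ W.T)

    def divergence(x):
        # Divergence = trace of the Jacobian = sum_i [D_dim f(x)]_i.
        # Brute force (d backward passes) for illustration only.
        return torch.autograd.functional.jacobian(f, x).diagonal().sum()

    # Euler integration of the augmented state (x, delta_logp): the sample follows f,
    # and the log-density changes by minus the divergence along the way.
    x = torch.randn(d)
    delta_logp = torch.tensor(0.0)
    dt = 0.01
    for _ in range(100):
        delta_logp = delta_logp - divergence(x) * dt
        x = x + f(x) * dt
    print(x, delta_logp)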

  10. Learning Stochastic Diff Eqs • The Fokker-Planck equation describes the density change using $D_{\text{dim}}$ and $D^2_{\text{dim}}$: $\frac{\partial p(t, x)}{\partial t} = \sum_{i=1}^{d} \Big[ -(D_{\text{dim}} f) \odot p - (\nabla p) \odot f + \tfrac{1}{2} \big( (D^2_{\text{dim}} \operatorname{diag}(g)^2) \odot p + 2 (D_{\text{dim}} \operatorname{diag}(g)^2) \odot (\nabla p) + \operatorname{diag}(g)^2 \odot (D_{\text{dim}} \nabla p) \big) \Big]_i$
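
A one-dimensional sketch of this equation (toy f, g, and p chosen only for illustration): in 1-D every D_dim term is just d/dx, so the expanded right-hand side from the slide can be checked directly against the compact form -d(fp)/dx + ½ d²(g²p)/dx²; in higher dimensions, a HollowNet makes the D_dim and D²_dim terms cheap.

    import torch

    def f(x):       # toy drift
        return torch.sin(x)

    def g2(x):      # toy squared diffusion g(x)^2
        return 0.1 + 0.5 * torch.cos(x) ** 2

    def p(x):       # toy (unnormalized) density
        return torch.exp(-0.5 * x ** 2)

    def ddx(value, x):
        # d(value)/dx, keeping the graph so the result can be differentiated again
        return torch.autograd.grad(value, x, create_graph=True)[0]

    x = torch.tensor(0.7, requires_grad=True)
    px, fx, g2x = p(x), f(x), g2(x)
    dp, dfx, dg2 = ddx(px, x), ddx(fx, x), ddx(g2x, x)
    d2p, d2g2 = ddx(dp, x), ddx(dg2, x)

    # Expanded Fokker-Planck right-hand side, as on the slide (with D_dim = d/dx):
    rhs_expanded = -dfx * px - dp * fx + 0.5 * (d2g2 * px + 2 * dg2 * dp + g2x * d2p)

    # Compact form  -d(f p)/dx + 1/2 d^2(g^2 p)/dx^2  as a consistency check:
    rhs_compact = -ddx(fx * px, x) + 0.5 * ddx(ddx(g2x * px, x), x)
    print(torch.allclose(rhs_expanded, rhs_compact))   # True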

  11. Takeaways • Dimension-wise derivatives are costly for general functions. • Restricting the architecture (HollowNets) gives cheap dimension-wise derivatives. • Useful for PDEs, SDEs, normalizing flows, and optimization.
