Second Order Derivatives with ADTAGEO
Algorithmic Differentiation Through Automatic Graph Elimination Ordering
(ADjoints and TAngents by Graph Elimination Ordering)

Andreas Griewank, Jan Riehme
Institute for Applied Mathematics, Humboldt-Universität zu Berlin
{griewank,riehme}@math.hu-berlin.de
15th April 2005, Automatic Differentiation Workshop, Nice, France

Outline
  1 ADTAGEO Gradient-Mode
  2 ADTAGEO at a glance
  3 Implementation
  4 Hessian Elimination
  5 Hessian implementation
  6 Outlook
  7 Conclusions

ADTAGEO Gradient-Mode – Example
  Computational graph of the statement

      y = x1 + x2 + x3;

  with $v_0 = x_1$, $v_{-1} = x_2$, $v_{-2} = x_3$ and

      $v_1 = v_0 + v_{-1}$
      $v_2 = v_1 + v_{-2}$

  Edge labels are the local partials $c_{ij} = \partial v_i / \partial v_j$, $j \prec i$.

  [Figure: computational graph with edges $v_0 \to v_1$ ($c_{1,0}$), $v_{-1} \to v_1$ ($c_{1,-1}$), $v_1 \to v_2$ ($c_{2,1}$), $v_{-2} \to v_2$ ($c_{2,-2}$) and $v_2 \to y$ ($c_{y,2} = 1$)]
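As a worked instance of the edge labels above: every elementary operation in this statement is an addition, so all local partials equal 1. The gradient entries then follow as products of edge labels along the paths from each input to y. This is a sketch using only the symbols defined on the slide:

```latex
\[
c_{1,0} = c_{1,-1} = c_{2,1} = c_{2,-2} = c_{y,2} = 1
\]
\[
\frac{\partial y}{\partial x_1} = c_{y,2}\, c_{2,1}\, c_{1,0} = 1, \qquad
\frac{\partial y}{\partial x_2} = c_{y,2}\, c_{2,1}\, c_{1,-1} = 1, \qquad
\frac{\partial y}{\partial x_3} = c_{y,2}\, c_{2,-2} = 1
\]
```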
ADTAGEO Gradient-Mode – Elimination
  After execution of the assignment

      y = x1 + x2 + x3;

  Elimination of intermediates:

      $c_{lj} \mathrel{+}= c_{li} \cdot c_{ij}, \quad j \prec i \prec l$

      $c_{y,0} = c_{y,1} \cdot c_{1,0}$
      $c_{y,-1} = c_{y,1} \cdot c_{1,-1}$
      $c_{y,-2} = c_{y,2} \cdot c_{2,-2}$

  [Figure: $v_0$, $v_{-1}$, $v_{-2}$ connected directly to y]

  Tools that work this way:
  - ADIFOR: statement-level reverse
  - AD-enabled NAGWare Fortran 95 compiler

ADTAGEO Gradient-Mode – Elimination
  Program (y is a local variable, inside the scope of y):

      { double y = x1 + x2 + x3; z = x3 + x4 + y; }

  [Figure: $v_0$, $v_{-1}$, $v_{-2}$ connected to y with edges $c_{y,0}$, $c_{y,-1}$, $c_{y,-2}$; $v_{-2}$, $v_{-3}$ and y connected to z with edges $c_{z,-2}$, $c_{z,-3}$, $c_{z,y}$]

ADTAGEO Gradient-Mode – Elimination
  Program (y is a local variable, leaving the scope of y):

      { double y = x1 + x2 + x3; z = x3 + x4 + y; }

  The vertex y is eliminated:

      $c_{lj} \mathrel{+}= c_{li} \cdot c_{ij}, \quad j \prec i \prec l$

  [Figure: $v_0$, $v_{-1}$, $v_{-2}$, $v_{-3}$ connected directly to z with edges $c_{z,0}$, $c_{z,-1}$, $c_{z,-2}$, $c_{z,-3}$]

ADTAGEO at a glance – The idea behind
  This talk is more about an IDEA than about another AD tool:
  a new way of doing Algorithmic Differentiation.

  Do not build the computational graph of complete (sub)programs.
  Instead:
  - Maintain a live DAG.
  - Eliminate as many vertices as possible, as soon as possible.
  - Eliminate on the fly (online elimination).
  - The DAG represents the active variables alive at any one time.

  → Small graph, huge memory savings (gradients: factor 100)
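The elimination rule from the Gradient-Mode slides can be tried out on the two-statement program with a few lines of C++. The following is a minimal sketch, not the authors' implementation: the `Graph` struct, the integer vertex ids and the function `gradient_of_z` are hypothetical names chosen for illustration. It builds the graph of y = x1 + x2 + x3, eliminates the intermediates, adds z = x3 + x4 + y, and finally eliminates y as if it had left its scope.

```cpp
#include <cassert>
#include <map>

// Hypothetical minimal live DAG keyed by integer vertex ids.
struct Graph {
    std::map<int, std::map<int, double>> args;  // args[i][j] = c_ij, incoming edges of i
    std::map<int, std::map<int, double>> uses;  // uses[j][i] = c_ij, outgoing edges of j

    void add_edge(int from, int to, double c) {
        args[to][from] += c;
        uses[from][to] += c;
    }

    // Eliminate vertex i: c_lj += c_li * c_ij for every predecessor j and
    // successor l of i, then remove i and all edges touching it.
    void eliminate(int i) {
        assert(args.count(i) || uses.count(i));     // vertex must be in the graph
        const std::map<int, double> in = args[i];   // copies keep the iteration simple
        const std::map<int, double> out = uses[i];
        for (const auto& s : out)
            for (const auto& p : in)
                add_edge(p.first, s.first, s.second * p.second);
        for (const auto& p : in) uses[p.first].erase(i);
        for (const auto& s : out) args[s.first].erase(i);
        args.erase(i);
        uses.erase(i);
    }
};

// Vertex ids follow the slides for the inputs: v0 = x1, v-1 = x2, v-2 = x3, v-3 = x4.
// The ids for v1, v2, y and z are arbitrary choices for this sketch.
std::map<int, double> gradient_of_z() {
    const int X1 = 0, X2 = -1, X3 = -2, X4 = -3, V1 = 1, V2 = 2, Y = 3, Z = 4;
    Graph g;
    // y = x1 + x2 + x3, all local partials are 1
    g.add_edge(X1, V1, 1.0);
    g.add_edge(X2, V1, 1.0);
    g.add_edge(V1, V2, 1.0);
    g.add_edge(X3, V2, 1.0);
    g.add_edge(V2, Y, 1.0);
    g.eliminate(V2);  // intermediates vanish right after the assignment
    g.eliminate(V1);
    // z = x3 + x4 + y, entered at statement level for brevity
    g.add_edge(X3, Z, 1.0);
    g.add_edge(X4, Z, 1.0);
    g.add_edge(Y, Z, 1.0);
    g.eliminate(Y);   // y leaves its scope
    return g.args[Z]; // remaining edges into z are the gradient entries
}
```

After the last elimination only the four inputs and z remain, and the edge from x3 to z carries the sum of the direct contribution and the one routed through y.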
ADTAGEO at a glance – Requirements
  ADTAGEO performs vertex elimination whenever
  (i)  an active variable is deallocated/destroyed
  (ii) an active variable is overwritten

  This fits an OOP setting perfectly:
  (i)  is covered by the destructor (assuming the language has one)
  (ii) is covered by the assignment operator

Implementation
  Proof of concept:
  - optimized for understanding, not for speed
  - implemented in C++
  - heavy use of the class map from the Standard Template Library to store partials locally at every node (the edges of the graph)

  Rapid prototyping (first order):
  - 140 lines of code for +-*/ and sin, cos, exp
  - one week (with basic testing)
  - any new operator / intrinsic requires 4 lines (2 of them for the opening and closing curly braces)

  Rapid prototyping (Hessian):
  - 100 additional lines of code for Hessian elimination
  - one additional day (plus two nights)

ADTAGEO – And Source Transformation
  Requirements of ADTAGEO?
  (i)  Recognise variables leaving their scope (deallocation)
  (ii) Recognise assignments (overwrites)

  Produce source code for the graph manipulations. Therefore:
  - one has access to the storage associated with pointers at runtime
  - there is no pointer aliasing problem

  DEALLOCATE becomes your best friend:
  - eliminate all array elements at once
  - this opens the possibility to optimise the elimination order

  Elements of arrays are handled as single entities, so partial overwrites are not an issue.

Implementation – DAGLAD

    class daglad {
    private:
      double val;                    // function value
      map<daglad*, double> args;     // arguments = incoming edges
      map<daglad*, double> uses;     // used by   = outgoing edges
    public:
      daglad() { ... } ;                              // constructor
      void eliminate() { ... } ;                      // eliminate current vertex
      ~daglad() { eliminate(); ... } ;                // destructor
      void operator = (...) { eliminate(); ... } ;    // assignment
      // arithmetic operators
      friend dagdoub operator + (...);
      . . .
      friend double operator % (...);                 // retrieval op
    } ; /* class daglad */
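To make the two triggers concrete, here is a minimal, self-contained sketch of a class in the spirit of the daglad skeleton above. It is not the authors' code: the class name `Var`, the helpers `link` and `value`, the copy constructor semantics and the function `dz_dx3_after_scope_exit` are assumptions made for this illustration, and only + and * are provided. The destructor covers trigger (i), the assignment operator covers trigger (ii), and the % operator returns the direct edge weight between two live variables.

```cpp
#include <cassert>
#include <map>

// Hypothetical stand-in for the slides' daglad class; not the authors' code.
class Var {
    double val_;
    // Edge maps are mutable: graph bookkeeping is separate from the value.
    mutable std::map<const Var*, double> args_;  // incoming edges: predecessor -> partial
    mutable std::map<const Var*, double> uses_;  // outgoing edges: successor -> partial

    // Add (or accumulate) the edge pred -> *this with local partial c.
    void link(const Var& pred, double c) const {
        args_[&pred] += c;
        pred.uses_[this] += c;
    }

    // Vertex elimination: c_lj += c_li * c_ij for every predecessor j and
    // successor l of this vertex i, then detach i from the graph.
    void eliminate() const {
        for (const auto& u : uses_) {
            for (const auto& a : args_) u.first->link(*a.first, u.second * a.second);
            u.first->args_.erase(this);
        }
        for (const auto& a : args_) a.first->uses_.erase(this);
        args_.clear();
        uses_.clear();
    }

public:
    explicit Var(double v = 0.0) : val_(v) {}
    // A copy is a new vertex that depends on its source with partial 1.
    Var(const Var& o) : val_(o.val_) { link(o, 1.0); }
    // (ii) Overwrite: eliminate the old vertex, then depend on the right-hand side.
    Var& operator=(const Var& o) {
        if (this != &o) { eliminate(); val_ = o.val_; link(o, 1.0); }
        return *this;
    }
    // (i) Destruction: the vertex leaves the live DAG.
    ~Var() { eliminate(); }

    double value() const { return val_; }

    friend Var operator+(const Var& a, const Var& b) {
        Var r(a.val_ + b.val_);
        r.link(a, 1.0);
        r.link(b, 1.0);
        return r;
    }
    friend Var operator*(const Var& a, const Var& b) {
        Var r(a.val_ * b.val_);
        r.link(a, b.val_);
        r.link(b, a.val_);
        return r;
    }
    // Retrieval: weight of the direct edge from x to y (0 if there is none).
    friend double operator%(const Var& y, const Var& x) {
        auto it = y.args_.find(&x);
        return it == y.args_.end() ? 0.0 : it->second;
    }
};

// The two-statement program from the slides; y is local to the inner block.
double dz_dx3_after_scope_exit() {
    Var x1(1.0), x2(2.0), x3(3.0), x4(4.0), z;
    {
        Var y;
        y = x1 + x2 + x3;  // temporaries die at the end of the statement
        z = x3 + x4 + y;
    }                      // y is destroyed here and eliminated from the graph
    return z % x3;         // direct edge plus the contribution routed through y
}
```

Because temporaries returned by the operators are destroyed at the end of each full expression, they are eliminated on the fly without any extra code. The copy constructor creates a dependent vertex rather than duplicating edges, so the sketch behaves the same whether or not the compiler elides copies.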
Implementation – DAGLAD

    #include <cmath>
    #include <iostream>
    #include "daglad.hpp"
    using namespace std;

    int main() {
      daglad x1(0.5), x2(1.3), y;
      double xx1, xx2, yy, dy, dyy;

      y = exp(x1)*sin(x1+x2);          // compute f(x)

      // first element of gradient
      dyy = y%x1;
      xx1 = x1.val(); xx2 = x2.val();  // shortcuts
      dy = exp(xx1)*(sin(xx1+xx2)+cos(xx1+xx2));
      cout << " dF1 = " << dyy << " diff " << (dyy-dy) << endl;

      // second element of gradient
      dyy = y%x2;
      dy = exp(xx1)*cos(xx1+xx2);
      cout << " dF2 = " << dyy << " diff " << (dyy-dy) << endl;

      cout << " x1 = " << x1 << endl << " x2 = " << x2 << endl;
      cout << " y = " << y << endl;
    }

Implementation – Example
  Program:

      y = x1 + x2 + x3; z = x3 + x4 + y;

  [Figure: graph with inputs x1, x2, x3, x4. The edges $\partial y/\partial x_1$, $\partial y/\partial x_2$, $\partial y/\partial x_3$ into y are stored in y.args; the edge $\partial z/\partial y$ out of y is stored in y.uses; z also has the incoming edges $\partial z/\partial x_3$ and $\partial z/\partial x_4$]

Implementation – Usage (prototype)
  Easy mode:
  - Redeclare the (required) variables to be of type daglad.
  - Retrieve first order derivatives anywhere in the code using the % operator:

      y[j] % x[i] $\equiv \partial y_j / \partial x_i$

  Advanced mode:
  - Check/prepare/write code for better performance.
  - Choose the right mixture of forward and reverse mode [see below].

Implementation – Example Output (reformatted)

    dF1 = 1.23101 diff 2.22045e-16
    dF2 = -0.374593 diff 0
    x1 = |1,l:0,0.5,3, args= {} , uses= { [3,4,0,1.23101] } |
    x2 = |2,l:0,1.3,2, args= {} , uses= { [3,4,0,-0.374593] } |
    y = |3,l:4,1.6056,0, args= { [2,0,2,-0.374593][1,0,3,1.23101] } uses= {} |
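The reference values in the example output can be reproduced without daglad.hpp. The sketch below uses plain doubles only: it evaluates the analytic partial derivatives of f(x1, x2) = exp(x1) sin(x1 + x2) used for the "diff" columns, and adds a central-difference helper as an independent cross-check. The function names are hypothetical and chosen for this illustration.

```cpp
#include <cassert>
#include <cmath>

// f(x1, x2) = exp(x1) * sin(x1 + x2), the function from the example program.
double f(double x1, double x2) { return std::exp(x1) * std::sin(x1 + x2); }

// Analytic partial derivatives, matching the dy expressions in the example.
double df_dx1(double x1, double x2) {
    return std::exp(x1) * (std::sin(x1 + x2) + std::cos(x1 + x2));
}
double df_dx2(double x1, double x2) {
    return std::exp(x1) * std::cos(x1 + x2);
}

// Central differences (hypothetical helpers) as an independent cross-check.
double central_diff_x1(double x1, double x2, double h) {
    assert(h > 0.0);
    return (f(x1 + h, x2) - f(x1 - h, x2)) / (2.0 * h);
}
double central_diff_x2(double x1, double x2, double h) {
    assert(h > 0.0);
    return (f(x1, x2 + h) - f(x1, x2 - h)) / (2.0 * h);
}
```

At x1 = 0.5, x2 = 1.3 the analytic values agree with the printed dF1 and dF2 to the six digits shown, and the central differences agree with them up to truncation and rounding error.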