Automatic Differentiation Tools for FreeFem++ Workshop FreeFem++ Sylvain Auliac ( s-auliac@netcourrier.com ) - LJLL September 16, 2009
An Overview of FAD (Forward Automatic Differentiation) FAD theoretical principles Informatic point of view FAD in FreeFem++ FADed FreeFem++ in action Perspectives of development
FAD principle What is Automatic Differentiation (AD)? AD denominate a set of technics that allows to calculate automatically and “exactly” the derivatives of the outputs of a programm with respect to some of its inputs. There are several kinds of AD, each of them shows up efficiency in a well defined field of aplication. ♥ FAD = Forward Automatic Differentiation
FAD principle Let P be a programm and call : ◮ f : ( x i ) 1 ≤ i ≤ n �− → ( y i ) 1 ≤ i ≤ m the function implemented by P ◮ P ′ the differentiated version of P ◮ f ′ the function implemented by P ′ One derivative forward mode ◮ f ′ : ( x 1 , dx 1 , x 2 , dx 2 , . . . , x n , dx n ) �− → ( y 1 , dy 1 , . . . , y m , dy m ) � n ∂ y j ◮ ∀ j ∈ { 1 , . . . , m } , dy j = dx i ∂ x i i =1 ◮ with k fixed, if ∀ i , dx i = δ k , i then ∀ j , dy j = ∂ y j ∂ x k
FAD principle Multi derivatives forward mode Let p be an integer (1 ≤ p ≤ n ). ◮ f ′ : ( x 1 , � ∇ x 1 , x 2 , � ∇ x 2 , . . . , x n , � ∇ x n ) �→ ( y 1 , � ∇ y 1 , . . . , y m , � ∇ y m ) where � ∇ x k ∈ R p � n ∂ y j ◮ ∀ j ∈ { 1 , . . . , m } , � � ∇ y j = ∇ x i ∂ x i i =1 ◮ Special case when p = n and ∀ i , � ∇ x i = ( δ i , k ) 1 ≤ k ≤ n then : ∂ y j ∂ x 1 ∂ y j � � ∀ j , � ∂ x 2 � ∇ y j = ∇ y j = such that ∇ y j 1 ≤ j ≤ m = J f . . . ∂ y j ∂ x n (1)
FAD and programming How to modifie a programm to support FAD: ◮ Add and associate a new variable to each variable of the numeric type in the programm to store the derivative ( p new variables are needed for each pre-existing variable for multi-derivative AD). ◮ write the update(s) for the associated derivative(s) just before each change of each variable induced by a calculus, using the following derivations formulas ( f , g : R n − → R ) : ∇ ( f ± g ) = ∇ f ±∇ g ; ∇ ( f ∗ g ) = g ∇ f + f ∇ g ; ∇ ( f / g ) = g ∇ f − f ∇ g g 2 If u : R → R , v : R n → R , ∇ ( u ◦ v ) = ( u ′ ◦ v ) ∇ v
FAD and programming - exemple with 1 derivative This is a C piece of code : and its AD associated one : // u’s computed earlier // so should be du double x=u-1./u; double dx=du + du/(u*u); double y=x+log(u); double dy=dx + du/u; double J=x+y; double dJ=dx+dy; Whole new code : dx=du + du/(u*u); x=u-1./u; dy=dx + du/u; y=x+log(u); dJ=dx+dy; J=x+y;
Pros and cons of the forward mode Advantages: ◮ Easy to handle. Limitations
Pros and cons of the forward mode Advantages: ◮ Easy to handle. ◮ Allows fast development in (low-level) pre-existing codes. Limitations
Pros and cons of the forward mode Advantages: ◮ Easy to handle. ◮ Allows fast development in (low-level) pre-existing codes. ◮ Modifications strongly reduced in programming languages with overloading features. Limitations
Pros and cons of the forward mode Advantages: ◮ Easy to handle. ◮ Allows fast development in (low-level) pre-existing codes. ◮ Modifications strongly reduced in programming languages with overloading features. Limitations ◮ Decrinsing efficiency with relatively modest number of derivation parameters.
Pros and cons of the forward mode Advantages: ◮ Easy to handle. ◮ Allows fast development in (low-level) pre-existing codes. ◮ Modifications strongly reduced in programming languages with overloading features. Limitations ◮ Decrinsing efficiency with relatively modest number of derivation parameters. ◮ Dramatic augmentation of needed memory.
Optimal Control Problem : Let Ω be a domain of R 2 partitioned in n subdomains Ω i . � | u a − u d | 2 , − ∆ u a = f a and u a | ∂ Ω = 0 min a ∈ R n Ω n � f a = a i I Ω i i =1 � Let’s call J : R n − | u a − u d | 2 → R the functionnal a �− → � � Ω Note that : J ( a ) = 1 u a u d + 1 | u a | 2 − 2 � u d � 2 L 2 (Ω) 2 Ω Ω
Optimal Control Conjugued Gradient Algorithm: Initialization: ◮ R n ∋ d 0 = ∇ J ( a 0 ) ◮ ρ 0 = ( ∇ J ( a 0 ) , d 0 ) � u a 0 � 2 L 2(Ω) ◮ a 1 = a 0 − ρ 0 d 0 Iterations, if a k is known : �∇ J ( a k ) � 2 ◮ d k = ∇ J ( a k ) + �∇ J ( a k − 1 ) � 2 d k − 1 , we need u a k : − ∆ u a k = f a k ◮ ρ k = ( ∇ J ( a k ) , d k ) , we need u d k : − ∆ u d k = f d k � u dk � 2 L 2(Ω) ◮ a k +1 = a k − ρ k d k
Optimal Control - FreeFem Script FreeFem Script for n=5 1: real [ int ] A(5),D(5),DD(5); 2: real h = 1./N; 3: for ( int i=1;i<N;i++) { 4: A[i] = 0; // initializations 5: D[i] = 0; AAA[i] = 0; 6: SetDiff (A[i],i); 7:} 8:A[0] = 1.;SetDiff(A[0],0); 9: // Definition of second member functions 10: func real R2( real xx, real yy) 11: { return xx*xx + yy*yy;} 12: func real FA( real xx, real yy) 13:{ 14: int n = floor (R2(xx,yy)/h); 15: n = n>4 ? 4 : n; // can’t use ‘‘region’’ 16: return A[n]; // ’cause of memmory leak :-( 17:} 18: func real FD ... // the same with D array
19: func fA = FA(x,y); 20: func fD = FD(x,y); 21: func g = 0; // Dirichlet boundary condition 22: 23: border C0(t=0,2* pi ) {x=0.2*cos(t); y=0.2*sin(t); label =0;} 24: 25: border C1(t=0,2*pi) 26: {x=0.4*cos(t); y=0.4*sin(t); label =1;} 27: border C2(t=0,2*pi) 28: {x=0.6*cos(t); y=0.6*sin(t); label =2;} 29: border C3(t=0,2*pi) 30: {x=0.8*cos(t); y=0.8*sin(t); label =3;} 31: border C4(t=0,2*pi) 32: {x=cos(t); y=sin(t); label =4;} 33: mesh Th = 34: buildmesh (C0(10)+C1(20)+C2(30)+C3(40)+C4(50)); 35: plot (Th, wait =1, ps ="partitioned_disc.eps"); 36: 37: fespace Vh(Th,P1); 38:Vh ud = 1. - (x*x + y*y); // Exact solution for A=[4,4,4,4,4]
40:Vh uhA,vh,uhD,fff; 41: 42: real Jm1 = 0; // to save J in CG iterations 43: real gradJ2 = 0, gradJ2m1 = 0; // to store and save square norm of grad(J) 44: 45: ofstream file("poisson5/donnees.dat"); 46: 47: 48: for ( int iter=0;iter<60;iter++) 49:{ solve Poisson(uhA,vh) = 50: int2d (Th)( dx (uhA)* dx (vh) + dy (uhA)* dy (vh)) 51: 52: - int2d (Th)(fA*vh) 53: + on (4,uhA=g); 54: real J0 = int2d (Th)(uhA*uhA); 55: real J1 = int2d (Th)(uhA*ud); 56: real J = 0.5*J0 - J1 + 0.5* int2d (Th)(ud*ud); 57: gradJ2m1 = gradJ2; // Saving the square norm (to avoid recalculation)
gradJ2 = Grad2 (J); 58: real rho; 59: // Conjugued Gradient algorithm if (iter==0){ 60: // First CG iteration for ( int i=0;i<N;i++) {DD[i] = J_i;} 61: rho = Grad2 (J); 62: 63: rho /= (J0>0 ? J0 : 1.); 64: for ( int i=0;i<N;i++) A[i] -= rho*DD[i]; 65: for (int i=0;i<N;i++){ SetDiff (A[i],i);}} 66: else { // real CG iterations 67: for ( int i=0;i<N;i++){ // new descent direction 68: D[i] = J_i + DD[i]*gradJ2/gradJ2m1;} 69: solve PoissonD(uhD,vh) = 70: int2d (Th)( dx (uhD)* dx (vh)+ dy (uhD)* dy (vh)) 71: - int2d (Th)(fD*vh) + on (4,uhD=0); 72: real J0D = int2d (Th)(uhD*uhD); 73: for ( int i=0;i<N;i++) {rho += (J_i * D[i]);} 74: rho /= J0D; 75: for ( int i=0;i<N;i++){ 76: A[i] -= rho*D[i]; 77: SetDiff (A[i],i);} 78: } // etc.. etc.. 79:}
Finding Dirichlet to fit Neumann The problem: Let Ω ⊂ R 2 with ∂ Ω = Γ 1 ∪ Γ2, f ∈ L 2 (Ω) and g n ∈ L 2 (Γ 2 ) Consider the J : L 2 (Γ 2 ) − → R such that : � � � − ∆ u g = f dans Ω 2 � � J ( g ) = 1 ∂ u g � � ∂ n − g n , u g : u g = 0 sur Γ 1 � � 2 Γ 2 u g = g sur Γ 2 Resolution algorithm: ◮ Same algorithm as the preceding problem with a new functionnal. ◮ Control parameter living in an infinite dimensionnal space -¿ discretization needed.
Scripting details Here Ω is a square with 21 segments on each of its side. The control parameter: With P 1 finite elements, the differentiation variables are the values taken by g on each vertices of Γ 2 . For FreeFem : real [ int ] A(20); // Initialize to zero and SetDiff... func real g( real a, real b) { int k = floor(a/h); // Uniform discretization if (b>0) return 0.; else { real Akp1 = k<20 ? A[k] : 0; real Ak = k>0 ? (k<21 ? A[k-1]:0) : 0; return ((Akp1-Ak)*(a - k*h)/h + Ak); // P1 by pieces } }
The functionnal J : Decomposition with bilinear and linear forms : � � � � 2 � � J ( g ) = 1 ∂ u g ∂ u g ∂ n g d + 1 � � 2 � g d � 2 − � � L 2 (Γ 2 ) 2 ∂ n Γ 2 Γ 2 Freefem script: real J0 = int1d (Th,1)(dy(uhg)*dy(uhg)); real J1 = int1d (Th,1)(dy(uhg)*gd); real J2 = int1d (Th,1)(gd*gd); J = 0.5*J0 - J1 + 0.5*J2;
Perspectives of development Immediate works: ◮ Some bugs to fix... ◮ Possible use with BFGS. ◮ Setting the number of derivatives in the script. ◮ Optimisation and improvment of the AD-related syntax. More difficult works: ◮ Merging AD version of FreeFem++ with the real one. ◮ Compatibility with complex numbers. ◮ Differentiation with respect to the geometry. ◮ “Mathematization” of the syntax. ◮ Adding new AD styles (inverse and andjoint models).
Recommend
More recommend