AM 205: lecture 20 ◮ Today: PDE optimization, constrained optimization example ◮ New topic: eigenvalue problems
PDE Constrained Optimization We will now consider optimization based on a function that depends on the solution of a PDE. Let us denote a parameter-dependent PDE as

PDE(u(p); p) = 0

◮ p ∈ R^n is a parameter vector; could encode, for example, the flow speed and direction in a convection–diffusion problem
◮ u(p) is the PDE solution for a given p
PDE Constrained Optimization We then consider an output functional g,¹ which maps an arbitrary function v to R. And we introduce a parameter-dependent output, G(p) ∈ R, where G(p) ≡ g(u(p)) ∈ R, which we seek to minimize. At the end of the day, this gives a standard optimization problem:

min_{p ∈ R^n} G(p)

¹ A functional is just a map from a vector space to R
PDE Constrained Optimization One could equivalently write this PDE-based optimization problem as

min_{p, u} g(u)   subject to   PDE(u; p) = 0

For this reason, this type of optimization problem is typically referred to as PDE constrained optimization
◮ objective function g depends on u
◮ u and p are related by the PDE constraint
Based on this formulation, we could introduce Lagrange multipliers and proceed in the usual way for constrained optimization...
PDE Constrained Optimization Here we will focus on the form we introduced first:

min_{p ∈ R^n} G(p)

Optimization methods usually need some derivative information, such as using finite differences to approximate ∇G(p)
PDE Constrained Optimization But using finite differences can be expensive, especially if we have many parameters:

∂G(p)/∂p_i ≈ (G(p + h e_i) − G(p)) / h,

hence we need n + 1 evaluations of G to approximate ∇G(p)! We saw from the Himmelblau example that supplying the gradient ∇G(p) cuts down on the number of function evaluations required. The extra function calls due to F.D. aren't a big deal for Himmelblau's function, since each evaluation is very cheap. But in PDE constrained optimization, each p → G(p) requires a full PDE solve!
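For concreteness, here is a minimal sketch of this forward-difference gradient; G is a hypothetical black-box function that performs a full PDE solve and returns the output value, so the loop below costs n + 1 solves.

```python
import numpy as np

def fd_gradient(G, p, h=1e-6):
    """Forward-difference approximation of grad G at p.

    G is assumed to require a full PDE solve per call, so this
    costs n + 1 solves for p in R^n.
    """
    n = len(p)
    G0 = G(p)                                # baseline solve
    grad = np.zeros(n)
    for i in range(n):
        e = np.zeros(n)
        e[i] = 1.0
        grad[i] = (G(p + h * e) - G0) / h    # one extra solve per parameter
    return grad
```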
PDE Constrained Optimization Hence for PDE constrained optimization with many parameters, it is important to be able to compute the gradient more efficiently There are two main approaches: ◮ the direct method ◮ the adjoint method The direct method is simpler, but the adjoint method is much more efficient if we have many parameters
PDE Output Derivatives Consider the ODE BVP

−u''(x; p) + r(p) u(x; p) = f(x),   u(a) = u(b) = 0,

which we will refer to as the primal equation. Here p ∈ R^n is the parameter vector, and r : R^n → R. We define an output functional based on an integral,

g(v) ≡ ∫_a^b σ(x) v(x) dx,

for some function σ; then G(p) ≡ g(u(p)) ∈ R
The Direct Method We observe that

∂G(p)/∂p_i = ∫_a^b σ(x) ∂u/∂p_i dx,

hence if we can compute ∂u/∂p_i, i = 1, 2, ..., n, then we can obtain the gradient. Assuming sufficient smoothness, we can "differentiate the ODE BVP" wrt p_i to obtain

−(∂u/∂p_i)''(x; p) + r(p) ∂u/∂p_i(x; p) = −(∂r/∂p_i) u(x; p)

for i = 1, 2, ..., n
The Direct Method Once we compute each ∂u/∂p_i we can then evaluate ∇G(p) by evaluating a sequence of n integrals. However, this is not much better than using finite differences: we still need to solve n separate ODE BVPs (though only the right-hand side changes, so we could LU factorize the system matrix once and back/forward substitute for each i)
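A minimal sketch of the direct method for this ODE BVP is given below, assuming a centered finite-difference discretization and hypothetical callables r, dr (the gradient of r), f, and sigma; the key point is that the system matrix is factorized once and reused for every sensitivity solve.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def direct_gradient(p, r, dr, f, sigma, a=0.0, b=1.0, N=200):
    """Direct method for -u'' + r(p) u = f, u(a) = u(b) = 0.

    r(p) is a scalar, dr(p) its gradient in R^n; f and sigma are
    callables on [a, b].  (Hypothetical interfaces for illustration.)
    """
    h = (b - a) / (N + 1)
    x = a + h * np.arange(1, N + 1)          # interior grid points

    # Second-difference operator plus r(p) on the diagonal
    A = (np.diag(2.0 * np.ones(N)) - np.diag(np.ones(N - 1), 1)
         - np.diag(np.ones(N - 1), -1)) / h**2 + r(p) * np.eye(N)
    lu, piv = lu_factor(A)                   # factorize once

    u = lu_solve((lu, piv), f(x))            # primal solve

    grad = np.zeros(len(p))
    for i in range(len(p)):
        du_dpi = lu_solve((lu, piv), -dr(p)[i] * u)   # sensitivity solve i
        grad[i] = h * np.sum(sigma(x) * du_dpi)       # quadrature for the output integral
    return grad
```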
Adjoint-Based Method However, a more efficient approach when n is large is the adjoint method. We introduce the adjoint equation:

−z''(x; p) + r(p) z(x; p) = σ(x),   z(a) = z(b) = 0
Adjoint-Based Method Now,

∂G(p)/∂p_i = ∫_a^b σ(x) ∂u/∂p_i dx
           = ∫_a^b [−z''(x; p) + r(p) z(x; p)] ∂u/∂p_i dx
           = ∫_a^b z(x; p) [−(∂u/∂p_i)''(x; p) + r(p) ∂u/∂p_i(x; p)] dx,

where the last line follows by integrating by parts twice (boundary terms vanish because ∂u/∂p_i and z are zero at a and b). (The adjoint equation is defined based on this "integration by parts" relationship to the primal equation)
Adjoint-Based Method Also, recalling the derivative of the primal problem with respect to p_i,

−(∂u/∂p_i)''(x; p) + r(p) ∂u/∂p_i(x; p) = −(∂r/∂p_i) u(x; p),

we get

∂G(p)/∂p_i = −∫_a^b (∂r/∂p_i) z(x; p) u(x; p) dx

Therefore, we only need to solve two differential equations (primal and adjoint) to obtain ∇G(p)! For more complicated PDEs the adjoint formulation is more complicated, but the basic ideas stay the same
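A sketch of the corresponding adjoint computation is below, using the same hypothetical interfaces as the direct-method sketch. Since this differential operator is self-adjoint, the same discretized matrix serves for both the primal and adjoint solves, and the whole gradient follows from a single integral.

```python
import numpy as np

def adjoint_gradient(p, r, dr, f, sigma, a=0.0, b=1.0, N=200):
    """Adjoint method for the same BVP: two solves give the full gradient.

    dr(p) is assumed to return the gradient of r as a length-n array.
    """
    h = (b - a) / (N + 1)
    x = a + h * np.arange(1, N + 1)

    A = (np.diag(2.0 * np.ones(N)) - np.diag(np.ones(N - 1), 1)
         - np.diag(np.ones(N - 1), -1)) / h**2 + r(p) * np.eye(N)

    u = np.linalg.solve(A, f(x))        # primal solve
    z = np.linalg.solve(A, sigma(x))    # adjoint solve (same operator here)

    # dG/dp_i = -(dr/dp_i) * integral of z*u over [a, b]
    return -dr(p) * (h * np.sum(z * u))
```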
Motivation: Eigenvalue Problems A matrix A ∈ C^{n×n} has eigenpairs (λ_1, v_1), ..., (λ_n, v_n) ∈ C × C^n such that

Av_i = λ_i v_i,   i = 1, 2, ..., n

(We will order the eigenvalues from smallest to largest, so that |λ_1| ≤ |λ_2| ≤ ··· ≤ |λ_n|) It is remarkable how important eigenvalues and eigenvectors are in science and engineering!
Motivation: Eigenvalue Problems For example, eigenvalue problems are closely related to resonance ◮ Pendulums ◮ Natural vibration modes of structures ◮ Musical instruments ◮ Lasers ◮ Nuclear Magnetic Resonance (NMR) ◮ ...
Motivation: Resonance Consider a spring connected to a mass. Suppose that:
◮ the spring satisfies Hooke's Law,² F(t) = ky(t)
◮ a periodic forcing, r(t), is applied to the mass
² Here y(t) denotes the position of the mass at time t
Motivation: Resonance Then Newton's Second Law gives the ODE

y''(t) + (k/m) y(t) = r(t),

where r(t) = F_0 cos(ωt). Recall that the solution of this non-homogeneous ODE is the sum of a homogeneous solution, y_h(t), and a particular solution, y_p(t). Let ω_0 ≡ √(k/m); then for ω ≠ ω_0 we get:³

y(t) = y_h(t) + y_p(t) = C cos(ω_0 t − δ) + F_0 / (m(ω_0² − ω²)) cos(ωt)

³ C and δ are determined by the ODE initial condition
Motivation: Resonance The amplitude of y_p(t), F_0 / (m(ω_0² − ω²)), as a function of ω is shown below

[Figure: amplitude of y_p versus forcing frequency ω, diverging at ω = ω_0]

Clearly we observe singular behavior when the system is forced at its natural frequency, i.e. when ω = ω_0
Motivation: Forced Oscillations Solving the ODE for ω = ω_0 gives

y_p(t) = (F_0 / (2mω_0)) t sin(ω_0 t)

[Figure: y_p(t) versus t, oscillation with linearly growing amplitude]

This is resonance!
Motivation: Resonance In general, ω_0 is the frequency at which the unforced system has a non-zero oscillatory solution. For the single spring-mass system we substitute⁴ the oscillatory solution y(t) ≡ x e^{iω_0 t} into y''(t) + (k/m) y(t) = 0. This gives a scalar equation for ω_0:

kx = ω_0² m x  ⟹  ω_0 = √(k/m)

⁴ Here x is the amplitude
Motivation: Resonance Suppose now we have a spring-mass system with three masses and three springs
Motivation: Resonance In the unforced case, this system is governed by the ODE system

My''(t) + Ky(t) = 0,

where M is the mass matrix and K is the stiffness matrix:

M ≡ ⎡ m_1   0     0   ⎤        K ≡ ⎡ k_1 + k_2   −k_2        0    ⎤
    ⎢ 0     m_2   0   ⎥,           ⎢ −k_2        k_2 + k_3   −k_3 ⎥
    ⎣ 0     0     m_3 ⎦            ⎣ 0           −k_3        k_3  ⎦

We again seek a nonzero oscillatory solution to this ODE, i.e. set y(t) = x e^{iωt}, where now y(t) ∈ R^3. This gives the algebraic equation Kx = ω²Mx
Motivation: Eigenvalue Problems Setting A ≡ M⁻¹K and λ = ω², this gives the eigenvalue problem

Ax = λx

Here A ∈ R^{3×3}, hence we obtain natural frequencies from the three eigenvalues λ_1, λ_2, λ_3
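As an illustration, the natural frequencies can be computed numerically by solving the generalized eigenvalue problem Kx = ω²Mx directly (equivalent to forming A = M⁻¹K). The masses and spring constants below are hypothetical values chosen only for the sketch.

```python
import numpy as np
from scipy.linalg import eig

# Hypothetical masses and spring constants for illustration
m1, m2, m3 = 1.0, 1.0, 1.0
k1, k2, k3 = 1.0, 2.0, 3.0

M = np.diag([m1, m2, m3])
K = np.array([[k1 + k2, -k2,      0.0],
              [-k2,      k2 + k3, -k3],
              [0.0,     -k3,       k3]])

# Generalized eigenvalue problem K x = lambda M x, with lambda = omega^2
lam, X = eig(K, M)
omega = np.sqrt(lam.real)        # natural frequencies
print(np.sort(omega))
```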
Motivation: Eigenvalue Problems The spring-mass systems we have examined so far contain discrete components. But the same ideas also apply to continuum models. For example, the wave equation models vibration of a string (1D) or a drum (2D):

∂²u(x, t)/∂t² − c Δu(x, t) = 0

As before, we write u(x, t) = ũ(x) e^{iωt}, to obtain

−Δũ(x) = (ω²/c) ũ(x),

which is a PDE eigenvalue problem
Motivation: Eigenvalue Problems We can discretize the Laplacian operator with finite differences to obtain an algebraic eigenvalue problem Av = λv, where
◮ the eigenvalue λ = ω²/c gives a natural vibration frequency of the system
◮ the eigenvector (or eigenmode) v gives the corresponding vibration mode
Motivation: Eigenvalue Problems We will use the Python (and Matlab) functions eig and eigs to solve eigenvalue problems ◮ eig : find all eigenvalues/eigenvectors of a dense matrix ◮ eigs : find a few eigenvalues/eigenvectors of a sparse matrix We will examine the algorithms used by eig and eigs in this Unit
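As a sketch of the sparse case, the snippet below discretizes −d²/dx² on (0, 1) with homogeneous Dirichlet conditions and asks for a few of the smallest eigenvalues. Here eigsh, the symmetric variant of eigs, is used because the discretized operator is symmetric; the grid size and number of modes are arbitrary choices for illustration.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

# 1D model problem: -u'' = lambda u on (0, 1), u(0) = u(1) = 0
N = 1000                                  # number of interior grid points (arbitrary)
h = 1.0 / (N + 1)
main = 2.0 * np.ones(N) / h**2
off = -np.ones(N - 1) / h**2
A = sp.diags([off, main, off], [-1, 0, 1], format='csc')

# A few smallest eigenvalues/eigenmodes via shift-invert about zero
lam, V = eigsh(A, k=4, sigma=0)
print(np.sqrt(lam))                       # should approximate j*pi for j = 1, 2, 3, 4
```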