The Kuratowski–Ryll-Nardzewski Theorem and semismooth Newton methods for Hamilton–Jacobi–Bellman equations. Iain Smears, INRIA Paris. Linz, November 2016. Joint work with Endre Süli, University of Oxford.

  2. Overview Talk outline 1. Introduction: Howard’s algorithm / policy iteration for Hamilton–Jacobi–Bellman equations. 2. Semismoothness of HJB operators in function spaces. 3. Applications to discontinuous Galerkin FEM approximations of HJB equations with Cordes coefficients.

  4. 1. Hamilton–Jacobi–Bellman Equation
$$F[u] := \sup_{\alpha \in \Lambda} \left[ L^\alpha u - f^\alpha \right] = 0 \quad \text{in } \Omega, \qquad u = 0 \quad \text{on } \partial\Omega, \tag{HJB}$$
where $L^\alpha u := a^\alpha(x) : D^2 u + b^\alpha(x) \cdot \nabla u - c^\alpha(x)\, u$.
Notation: $a^\alpha(x) : D^2 u = \sum_{i,j=1}^{d} a^\alpha_{ij}(x)\, u_{x_i x_j}$ and $b^\alpha(x) \cdot \nabla u = \sum_{i=1}^{d} b^\alpha_i(x)\, u_{x_i}$.
Assumptions:
• bounded domain $\Omega$,
• control set $\Lambda$ is a compact metric space,
• $a$, $b$, $c$ and $f$ are continuous functions of $x \in \Omega$ and $\alpha \in \Lambda$.
Remark: Further assumptions are required for well-posedness of the problem, but not for the semismoothness discussed here.

  5. 1. Motivation: Howard's algorithm / policy iteration
Formal structure:
1. Choose an initial guess $u_0$.
2. For each $k \ge 0$, choose $\alpha_k : \Omega \to \Lambda$ such that $\alpha_k(x) \in \operatorname{argmax}_{\alpha \in \Lambda} (L^\alpha u_k - f^\alpha)(x)$ for all $x \in \Omega$.
3. Then find $u_{k+1}$ as a solution of the PDE $L^{\alpha_k} u_{k+1} = f^{\alpha_k}$ in $\Omega$, with $u_{k+1} = 0$ on $\partial\Omega$, where $L^{\alpha_k} v := a^{\alpha_k(x)}(x) : D^2 v + b^{\alpha_k(x)}(x) \cdot \nabla v - c^{\alpha_k(x)}(x)\, v$.
In practice: used in a discrete context, after discretization by a numerical method.
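
The three steps above can be sketched in a discrete setting. The following is only an illustration under stated assumptions, not the discretization used in the talk: a one-dimensional problem on (0, 1) with a finite control set, constant coefficients `a_vals` and `c_vals`, no first-order term, and a three-point finite difference stencil. The function name `howard_1d` and all data are hypothetical.

```python
import numpy as np

def howard_1d(a_vals, c_vals, f, max_iter=100):
    """Policy iteration for sup_k [a_k u'' - c_k u - f_k] = 0 on (0,1), u(0)=u(1)=0.

    a_vals, c_vals: length-m sequences with a_k > 0, c_k >= 0 (assumed data).
    f: array of shape (m, n), values of f_k at the n interior grid nodes.
    """
    m, n = f.shape
    h = 1.0 / (n + 1)
    # Three-point second-difference matrix with homogeneous Dirichlet conditions.
    D2 = (np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
          + np.diag(np.ones(n - 1), -1)) / h**2
    # One discrete operator per control value, stacked to shape (m, n, n).
    Ls = np.stack([a * D2 - c * np.eye(n) for a, c in zip(a_vals, c_vals)])
    rows = np.arange(n)
    u = np.zeros(n)                      # step 1: initial guess
    alpha_prev = None
    for _ in range(max_iter):
        res = Ls @ u - f                 # (L^k u - f_k) at every node, shape (m, n)
        alpha = res.argmax(axis=0)       # step 2: node-wise maximising control
        if alpha_prev is not None and np.array_equal(alpha, alpha_prev):
            break                        # policy unchanged: u already solves the problem
        # Step 3: row i of the linear system comes from the operator alpha[i].
        u = np.linalg.solve(Ls[alpha, rows, :], f[alpha, rows])
        alpha_prev = alpha
    resid = np.abs((Ls @ u - f).max(axis=0)).max()
    return u, alpha, resid

n = 50
x = np.linspace(0.0, 1.0, n + 2)[1:-1]
f = np.stack([np.ones(n), 2.0 + np.sin(4.0 * np.pi * x)])
u, alpha, resid = howard_1d([1.0, 2.0], [0.0, 1.0], f)
print(resid)  # discrete HJB residual max_k (L^k u - f_k), expected near zero
```

On ties, `argmax` returns the first maximiser, which is one concrete way of selecting an element of the argmax set at each node.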

  7. 1. Background
Classical works: [Bellman, Dynamic Programming, 1957], [Howard, Dynamic Programming and Markov Processes, 1960].
Historical summary from [Puterman & Brumelle, 1979]: "Policy iteration is usually attributed to Bellman [...] and Howard [...] Bellman developed the technique, which he called iteration in policy space, to solve several dynamic programming problems. Howard [16] later developed a version of this procedure for Markovian decision problems which he called the policy-iteration method."
[Puterman & Brumelle, 1979]: interpretation as a Newton–Kantorovich method, with convergence rates under the assumption that there is $\delta \in (0, 1]$ such that, for all functions $u$ and $v$,
$$\| L^{\alpha_v} - L^{\alpha_u} \|_{L(X,Y)} \lesssim \| v - u \|_X^{\delta},$$
where $\alpha_v$ and $\alpha_u$ are arg-maximisers for $v$ and $u$.
NB: this cannot hold when the arg-max operation is non-unique or not continuous.

  8. 1. Background
On solver algorithms for HJB:
[Santos & Rust, 2004]: Analysis of policy iteration for finite dimensional MDP problems.
[Bokanowski, Maroso, Zidani, 2009]: Superlinear convergence and semismoothness of finite dimensional HJB operators of the form
$$\min_{\alpha \in \mathcal{A}^N} \left[ B^\alpha x - c^\alpha \right] = 0,$$
with matrices $B^\alpha \in \mathbb{R}^{N \times N}$ and vectors $x, c^\alpha \in \mathbb{R}^N$ (see also the discussion of Bellman–Isaacs).
Variant algorithms and applications: penalty methods [Reisinger & Witte, 2011, 2012], coupled value-policy iteration [Alla, Falcone, Kalise, 2015].
Semismooth Newton methods: [Ulbrich, 2002], [Hintermüller, Ito, Kunisch, 2002] (primal-dual active set method as a semismooth Newton method).

  9. 1. Semismooth Newton methods
Notation: Let $X$ and $Y$ be sets. We write $G : X \rightrightarrows Y$ if $G$ is a set-valued map that maps $X$ into the subsets of $Y$.
Definition of semismoothness [Ulbrich, 2002]: Let $X$ and $Y$ be Banach spaces, let $U \subset X$, let $F : X \to Y$, and let $DF : X \rightrightarrows L(X, Y)$ have non-empty images. We say that $F$ is $DF$-semismooth on $U$ if, for all $u \in U$,
$$\lim_{\|e\|_X \to 0} \frac{1}{\|e\|_X} \sup_{L \in DF[u+e]} \| F[u+e] - F[u] - L e \|_Y = 0.$$
$DF$ is then called a generalised differential of $F$ on $U$.
Semismoothness together with uniform stability of the linearizations,
$$\sup_{L \in DF[v],\ v \in X} \| L^{-1} \|_{L(Y,X)} < \infty,$$
implies local superlinear convergence of the semismooth Newton method.
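
A minimal sketch of the resulting iteration may help fix ideas. It assumes a finite-dimensional toy problem that does not appear in the talk: $F(x) = x + \max(x, 0) - b$ in $\mathbb{R}^3$, with the selection $L = \operatorname{diag}(1 + \mathbf{1}_{x > 0})$ from its generalised differential. The names `semismooth_newton` and `select_L` are hypothetical.

```python
import numpy as np

def semismooth_newton(F, select_L, x0, tol=1e-12, max_iter=50):
    """Iterate x_{k+1} = x_k - L_k^{-1} F(x_k), with L_k chosen from DF[x_k]."""
    x = np.asarray(x0, dtype=float)
    for k in range(max_iter):
        Fx = F(x)
        if np.linalg.norm(Fx) <= tol:
            return x, k
        x = x - np.linalg.solve(select_L(x), Fx)
    return x, max_iter

b = np.array([2.0, -1.0, 0.5])                      # assumed right-hand side
F = lambda x: x + np.maximum(x, 0.0) - b            # piecewise linear, nonsmooth at 0
select_L = lambda x: np.diag(1.0 + (x > 0))         # picks the value 0 for max'(0)

x, iters = semismooth_newton(F, select_L, np.zeros(3))
print(x, iters)
```

Every selected $L$ here is a diagonal matrix with entries in $\{1, 2\}$, so the inverses are uniformly bounded, which is the stability condition stated above in this toy setting.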

  10. 1. Semismoothness of max(v, 0) and norm gap
Important example from [Ulbrich, 2002], [Hintermüller, Ito, Kunisch, 2002]:
Let $1 \le q < r \le \infty$. Let $G : L^r(\Omega) \to L^q(\Omega)$ be defined by $G : u \mapsto \max(u, 0)$. Then $G$ is semismooth from $L^r(\Omega)$ to $L^q(\Omega)$, with differential $DG[v]$ the set of all $L \in L^\infty(\Omega)$ of the form
$$L(x) = \begin{cases} 1 & \text{if } v(x) > 0, \\ 0 & \text{if } v(x) < 0, \\ \text{an arbitrary fixed value} & \text{if } v(x) = 0. \end{cases}$$
Norm gap: the restriction $q < r$ cannot be removed (counter-examples).
How to generalise this to HJB operators?
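
The norm gap can be observed numerically. The sketch below is an illustration under assumptions of our own choosing, not the counter-example from the cited references: on $\Omega = (0, 1)$ take $u(x) = -x$ and the perturbation $e_\varepsilon(x) = 2x$ for $x < \varepsilon$, zero otherwise. The semismoothness quotient then scales like $\varepsilon^{1/q - 1/r}$, so it tends to zero for $q < r$ but stays at $1/2$ for $q = r$. Integrals are approximated by the midpoint rule, and the function name `semismooth_ratio` is hypothetical.

```python
import numpy as np

def semismooth_ratio(eps, q, r, n=200_000):
    """||G(u+e) - G(u) - L e||_{L^q} / ||e||_{L^r} for G = max(., 0), finite q and r."""
    x = (np.arange(n) + 0.5) / n                 # midpoints of a uniform grid on (0, 1)
    u = -x
    e = np.where(x < eps, 2.0 * x, 0.0)          # small in L^r, but flips the sign of u
    L = (u + e > 0).astype(float)                # an element of DG[u + e]
    resid = np.maximum(u + e, 0.0) - np.maximum(u, 0.0) - L * e
    norm = lambda w, p: np.mean(np.abs(w) ** p) ** (1.0 / p)
    return norm(resid, q) / norm(e, r)

for eps in [1e-1, 1e-2, 1e-3]:
    print(eps, semismooth_ratio(eps, 1, 2), semismooth_ratio(eps, 2, 2))
```

The first column of quotients ($q = 1$, $r = 2$) decays as $\varepsilon \to 0$, while the second ($q = r = 2$) does not.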

  13. 2. Semismoothness of HJB operators
For FEM applications: let $\mathcal{T}_h$ be a mesh on $\Omega$. Space $X = W^{2,r}(\Omega; \mathcal{T}_h)$, $1 \le r \le \infty$, with norm
$$\| u \|_{W^{2,r}(\Omega; \mathcal{T}_h)} = \Big( \sum_{K \in \mathcal{T}_h} \| u \|_{W^{2,r}(K)}^{r} \Big)^{1/r}.$$
Functions $u \in W^{2,r}(\Omega; \mathcal{T}_h)$ have an element-wise gradient $\nabla_h u$ and Hessian $D^2_h u$.
For $\Lambda$ compact and continuous coefficients, the operator
$$F[u] := \sup_{\alpha \in \Lambda} \left[ L^\alpha u - f^\alpha \right], \qquad F : W^{2,r}(\Omega; \mathcal{T}_h) \to L^r(\Omega),$$
is well defined and Lipschitz continuous.

  14. 2. Semismoothness: argmax set-valued map
For each $u \in X$, we define $\mathbf{u} = (u, \nabla_h u, D^2_h u) \in L^r(\Omega; \mathbb{R}^m)$ for suitable $m$.
We then view the differential operator $F[u]$ as a composition of $x \mapsto \mathbf{u}(x)$ with the scalar function $F : \Omega \times \mathbb{R}^m \to \mathbb{R}$ defined by
$$F(x, v) = \sup_{\alpha \in \Lambda} \left[ a^\alpha(x) : M + b^\alpha(x) \cdot p - c^\alpha(x)\, z - f^\alpha(x) \right], \qquad v = (z, p, M).$$
Define the set-valued map $\Omega \times \mathbb{R}^m \ni (x, v) \mapsto \Lambda(x, v) \subset \Lambda$ by
$$\Lambda(x, v) := \operatorname{argmax}_{\alpha \in \Lambda} \left[ a^\alpha(x) : M + b^\alpha(x) \cdot p - c^\alpha(x)\, z - f^\alpha(x) \right].$$
Straightforward: $\Lambda(x, v)$ is non-empty and closed in $\Lambda$.

  15. 2. Semismoothness: argmax set-valued map
Important lemma: The mapping $\Lambda(\cdot, \cdot) : \Omega \times \mathbb{R}^m \rightrightarrows \Lambda$ is upper semicontinuous: for every $(x, v) \in \Omega \times \mathbb{R}^m$ and any open neighbourhood $U$ of $\Lambda(x, v)$, there is an open neighbourhood $V$ of $(x, v)$ such that $\Lambda(y, w) \subset U$ for all $(y, w) \in V$.
[Figure: the sets $\Lambda(x_n, v_n)$ and $\Lambda(x, v)$ inside $\Lambda$, as $(x_n, v_n) \to (x, v)$.]
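
A toy case, which is not from the talk, shows what upper semicontinuity allows. Assume a finite control set of five points in $[-1, 1]$ and the scalar objective $\alpha z$ in place of the full HJB expression. The argmax set is a single endpoint for $z \ne 0$ and jumps to the whole control set at $z = 0$: nearby argmax sets are always contained in the limit set, but the limit set can be much larger. The names `controls` and `argmax_set` are hypothetical.

```python
import numpy as np

controls = np.linspace(-1.0, 1.0, 5)     # assumed finite control set

def argmax_set(z):
    """Set of all controls alpha maximising alpha * z."""
    vals = controls * z
    return set(controls[vals >= vals.max()].tolist())

limit_set = argmax_set(0.0)              # every control is a maximiser at z = 0
for z in [0.1, 0.01, -0.01, -0.1]:
    # Upper semicontinuity at z = 0: nearby argmax sets lie inside the limit set.
    print(z, argmax_set(z), argmax_set(z) <= limit_set)
```

The jump from a singleton to the full set at $z = 0$ is also why a continuous choice of maximiser is not available in general, and why only a measurable selection is sought.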

  16. 2. Kuratowski–Ryll-Nardzewski Theorem
Theorem (Kuratowski–Ryll-Nardzewski): Let $\Omega \subset \mathbb{R}^d$ be a bounded open set, let $\Lambda$ be a compact metric space, and let $\Lambda(\cdot, \cdot) : \Omega \times \mathbb{R}^m \rightrightarrows \Lambda$ be an upper semicontinuous set-valued function such that $\Lambda(x, v)$ is non-empty and closed for every $(x, v) \in \Omega \times \mathbb{R}^m$. Then, for any Lebesgue measurable function $\mathbf{u} : \Omega \to \mathbb{R}^m$, there exists a Lebesgue measurable selection $\alpha : \Omega \to \Lambda$ such that
$$\alpha(x) \in \Lambda\big(x, \mathbf{u}(x)\big) \quad \text{for a.e. } x \in \Omega.$$
(Presented here in the form needed for our purposes; the original result is rather more general.)
Kuratowski & Ryll-Nardzewski, Bull. Acad. Polon. Sci., 1965: A general theorem on selectors. A (specialised) proof can be found in Aubin & Cellina, Differential Inclusions, 1984.

  17. 2. The generalized differential of HJB operators
Recall $\mathbf{u}(x) = (u(x), \nabla_h u(x), D^2_h u(x))$ for $u \in W^{2,r}(\Omega; \mathcal{T}_h)$.
Define the set of measurable selections $\Lambda[u]$:
$$\Lambda[u] = \{ \alpha : \Omega \to \Lambda;\ \alpha \text{ Lebesgue measurable},\ \alpha(x) \in \Lambda(x, \mathbf{u}(x)) \text{ a.e. in } \Omega \}.$$
Kuratowski–Ryll-Nardzewski Theorem $\Longrightarrow$ $\Lambda[u]$ is non-empty for all $u \in W^{2,r}(\Omega; \mathcal{T}_h)$.
Define the differential
$$DF[u] := \{ L^\alpha = a^\alpha : D^2_h + b^\alpha \cdot \nabla_h - c^\alpha,\ \alpha \in \Lambda[u] \}.$$
The measurability of $\alpha \in \Lambda[u]$ implies that $L^\alpha$ is well defined in $L(W^{2,r}(\Omega; \mathcal{T}_h), L^r(\Omega))$.


