Between Piecewise Smoothness and Linear Complementarity

Andreas Griewank, with thanks to A. Walther, T. Bosse, N. Strogies, S. Fiege, F. Kerkoff, J.U. Bernt, M. Radons, T. Streubel, R. Hasenfelder, P. Boeck, B. Lenser, ...

Institute for Applied Mathematics, Humboldt Universität zu Berlin, Germany
griewank@math.hu-berlin.de

6th International Conference on Complementarity Problems, August 4-8, 2014, Berlin.

August 8, 2014

Andreas Griewank et al (HUB) Nonsmooth numerics via PL August 8, 2014 1 / 43
1 Observations/Opinions of a Johnny Come Lately
2 Piecewise Linearization/Differentiation
3 Representation of PL functions in abs-normal form
4 Computation of conical Jacobians and gradients
5 Solving PL systems of equations and LCPs
6 (Un)constrained optimization by successive PL
7 Integration of Lipschitzian dynamics
Observations/Opinions of a Johnny Come Lately

Why not smoothen/regularize/parametrize?
• If scale too small it makes no algorithmic difference.
• If scale too large the problem is significantly changed.
• If scale variable we have one more loose parameter.
• Resulting solution paths may have sharp turns (LOP).
• There may be spurious solutions (Bernardo et al).

Lesson/Moral: Let's face the combinatorial music! (Reflected in the piecewise linearization)
My issues with generalized differentiation
• For composite nonsmoothness, differentiation rules are only inclusions, so analysis cannot be turned into algebra in an AD fashion.
• The inclusions point the right way for propagating semi-smoothness but the wrong way for calculating generalized derivative elements.
• Semi-smooth Newton extremely local, bundle methods too heuristic.
• Generalized derivatives are fickle since outer semi-continuity of multi-functions does not mean stability in any numerical sense.
• Rademacher says $F \in C^{0,1}(\mathbb{R}^n) \equiv W^{1,\infty}(\mathbb{R}^n)$, hence generalized derivatives are almost everywhere normal derivatives.
Lurking in the background: Prof. Moriarty
Piecewise Linearization/Differentiation

Basic idea of tangent linearization:

[Figure: $F = \max(F_1, F_2)$ near the base point $\mathring{x}$, showing the tangent lines $\mathring{F}_1 + F_1'(\mathring{x})\,\Delta x$ and $\mathring{F}_2 + F_2'(\mathring{x})\,\Delta x$ of the two branches and the resulting piecewise linearization $\mathring{F} + \Delta F(\mathring{x}; \Delta x)$.]
abs covers min, max and table look-ups

Provided $u$ and $w$ are both finite, one has

\[ \max(u, w) = \tfrac{1}{2}\,[\,u + w + \operatorname{abs}(u - w)\,], \qquad \min(u, w) = \tfrac{1}{2}\,[\,u + w - \operatorname{abs}(u - w)\,]. \]

Data $(x_i, y_i)$ for $0 \le i \le n$ with $x_0 < x_1 < \dots < x_n$ are piecewise linearly interpolated by the formula

\[ y = \tfrac{1}{2}\Big[\, y_0 + s_1 \operatorname{abs}(x - x_0) + y_n - s_n \operatorname{abs}(x - x_n) + \sum_{i=1}^{n-1} (s_{i+1} - s_i)\, \operatorname{abs}(x - x_i) \Big], \]

where $s_i = (y_i - y_{i-1}) / (x_i - x_{i-1})$ for $1 \le i \le n$ represent the slopes. Outside $[x_0, x_n]$ the formula continues with the constant values $y_0$ and $y_n$.

• Every continuous PL function can be expressed as a composition of affine functions and several abs(). That representation is not unique.
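The abs-based identities can be checked numerically. The following is a minimal sketch, not taken from the slides: the function names `abs_max`, `abs_min` and `table_lookup` are hypothetical, the nodes are assumed strictly increasing, the slopes are assumed to be indexed by the right endpoint of each interval, $s_i = (y_i - y_{i-1})/(x_i - x_{i-1})$, and the last boundary term is assumed to enter with a negative sign, which is the convention under which the formula reproduces the data on $[x_0, x_n]$.

```python
def abs_max(u, w):
    """max(u, w) written with a single abs call."""
    return 0.5 * (u + w + abs(u - w))


def abs_min(u, w):
    """min(u, w) written with a single abs call."""
    return 0.5 * (u + w - abs(u - w))


def table_lookup(x, xs, ys):
    """Piecewise linear interpolant of (xs[i], ys[i]) using only abs.

    Assumes xs is strictly increasing with at least two nodes.
    Outside [xs[0], xs[-1]] the value stays constant.
    """
    n = len(xs) - 1
    # s[i] is the slope on [xs[i-1], xs[i]] for i = 1..n; s[0] is unused.
    s = [0.0] + [(ys[i] - ys[i - 1]) / (xs[i] - xs[i - 1])
                 for i in range(1, n + 1)]
    total = ys[0] + s[1] * abs(x - xs[0]) + ys[n] - s[n] * abs(x - xs[n])
    # Each interior node contributes a kink proportional to the slope jump.
    total += sum((s[i + 1] - s[i]) * abs(x - xs[i]) for i in range(1, n))
    return 0.5 * total
```

With nodes `xs = [0, 1, 3, 4]` and values `ys = [1, 3, 2, 5]`, evaluating `table_lookup` at the nodes returns the tabulated values, and each interior kink corresponds to exactly one abs call.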
Piecewise Linearization

We wish to determine for base point $x$ and increment $\Delta x$

\[ \Delta y \equiv \Delta F(x; \Delta x) = F(x + \Delta x) - F(x) + O(\|\Delta x\|^2). \]

This can be done by propagating increments according to the following rules.

Smooth elementals

\[ \begin{array}{ll}
\Delta v_i = \Delta v_j \pm \Delta v_k & \text{for } v_i = v_j \pm v_k \\
\Delta v_i = v_j * \Delta v_k + \Delta v_j * v_k & \text{for } v_i = v_j * v_k \\
\Delta v_i = c_{ij}\, \Delta v_j \ \text{ with } c_{ij} \equiv \varphi_i'(v_j) & \text{for } v_i = \varphi_i(v_j) \not\equiv \operatorname{abs}()
\end{array} \]

Lipschitz elementals

\[ \Delta v_i = \operatorname{abs}(v_j + \Delta v_j) - \operatorname{abs}(v_j) \quad \text{when } v_i = \operatorname{abs}(v_j), \]

and correspondingly for max() and min().
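The propagation rules lend themselves to operator overloading. What follows is a minimal sketch under stated assumptions, not the implementation behind the slides: the class `Inc`, the helpers `smooth` and `pl_abs`, and the test function `f` are all hypothetical names chosen for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Inc:
    """Pair (v, dv): value at the base point and its propagated increment."""
    v: float
    dv: float

    def __add__(self, other):
        return Inc(self.v + other.v, self.dv + other.dv)

    def __sub__(self, other):
        return Inc(self.v - other.v, self.dv - other.dv)

    def __mul__(self, other):
        # Product rule: v_j * dv_k + dv_j * v_k
        return Inc(self.v * other.v, self.v * other.dv + self.dv * other.v)


def smooth(phi, dphi, a):
    """Smooth unary elemental: the increment is scaled by phi'(v_j)."""
    return Inc(phi(a.v), dphi(a.v) * a.dv)


def pl_abs(a):
    """Lipschitz elemental: abs is applied exactly to the shifted argument."""
    return Inc(abs(a.v), abs(a.v + a.dv) - abs(a.v))


def f(x1, x2):
    """Hypothetical test function f(x1, x2) = abs(x1 * x1 - x2)."""
    return pl_abs(x1 * x1 - x2)


# Base point (1, 1) with increment (0.1, 0.3).
result = f(Inc(1.0, 0.1), Inc(1.0, 0.3))
sine = smooth(math.sin, math.cos, Inc(0.0, 0.5))
```

Here `result.v + result.dv` approximates `f` at the shifted point `(1.1, 1.3)`, and the kink of abs is retained in `result.dv` rather than replaced by a one-sided slope.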
Continuous Piecewise Differentiation Rules

Linearity and Product Rule

\[ F, G : D \subset \mathbb{R}^n \to \mathbb{R}^m, \quad \alpha, \beta \in \mathbb{R} \quad \Longrightarrow \]
\[ \Delta[\alpha F + \beta G](x; \Delta x) = \alpha\, \Delta F(x; \Delta x) + \beta\, \Delta G(x; \Delta x) \]
\[ \Delta[F^\top G](x; \Delta x) = G(x)^\top \Delta F(x; \Delta x) + F(x)^\top \Delta G(x; \Delta x) \]

Chain Rule

\[ F : D \subset \mathbb{R}^n \to \mathbb{R}^m \ \text{ and } \ G : E \subset \mathbb{R}^m \to \mathbb{R}^p \ \text{ with } F(D) \subset E \quad \Longrightarrow \]
\[ \Delta[G \circ F](x; \Delta x) = \Delta G(F(x); \Delta F(x; \Delta x)) \]
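As a worked illustration of the chain rule, consider the scalar example below. The choice $F(x) = x^2 - 1$ and $G(u) = \operatorname{abs}(u)$ is an assumption made here for illustration and does not appear on the slides.

```latex
\begin{align*}
\Delta F(x; \Delta x) &= 2x\,\Delta x
  && \text{(smooth elemental)} \\
\Delta G(u; \Delta u) &= \operatorname{abs}(u + \Delta u) - \operatorname{abs}(u)
  && \text{(Lipschitz elemental)} \\
\Delta[G \circ F](x; \Delta x) &= \Delta G\big(F(x); \Delta F(x; \Delta x)\big) \\
  &= \operatorname{abs}(x^2 - 1 + 2x\,\Delta x) - \operatorname{abs}(x^2 - 1)
\end{align*}
```

At the kink $x = 1$ this gives $\Delta[G \circ F](1; \Delta x) = 2\,\operatorname{abs}(\Delta x)$, whereas the exact increment is $\operatorname{abs}(2\,\Delta x + \Delta x^2)$, so the two differ by at most $\Delta x^2$.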
Second order error and Lipschitz continuity

Proposition
Suppose $F$ is composite Lipschitz on some open neighborhood $D$ of a closed convex domain $K \subset \mathbb{R}^n$. Then there exists a constant $\gamma$ such that for all pairs $x, x + \Delta x \in K$

\[ \| F(x + \Delta x) - F(x) - \Delta F(x; \Delta x) \| \le \gamma\, \| \Delta x \|^2 . \]
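The second order bound can be observed numerically on a small example. The sketch below assumes the hypothetical scalar function $f(x) = \operatorname{abs}(x^2 - 1)$, for which the reverse triangle inequality shows that $\gamma = 1$ suffices; the names `f`, `delta_f` and `pl_error` are illustrative only.

```python
def f(x):
    """Hypothetical test function with kinks at x = -1 and x = 1."""
    return abs(x * x - 1.0)


def delta_f(x, dx):
    """Piecewise linear increment of f at base point x.

    The smooth inner part u = x*x - 1 is linearized (du = 2*x*dx),
    while abs is evaluated exactly on the linearized argument.
    """
    u = x * x - 1.0
    du = 2.0 * x * dx
    return abs(u + du) - abs(u)


def pl_error(x, dx):
    """Discrepancy |f(x + dx) - f(x) - delta_f(x, dx)|."""
    return abs(f(x + dx) - f(x) - delta_f(x, dx))


# Collect (dx, error) pairs at the kink x = 1 for shrinking increments.
errors_at_kink = [(dx, pl_error(1.0, dx)) for dx in (1.0, 0.1, 0.01, 0.001)]
```

For this example the error never exceeds `dx ** 2`, and the bound holds uniformly, including at base points on or near the kinks, where a plain tangent linearization would lose its second order accuracy.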