Introduction to Christoffel-Darboux kernels for polynomial optimization

Edouard Pauwels, POEMA Online Workshop, July 2020


  1. Introduction to Christoffel-Darboux kernels for polynomial optimization. Edouard Pauwels, POEMA Online Workshop, July 2020.

  2. Exponential separation of the support
     $\mu$: Lebesgue measure restricted to $S \subset \mathbb{R}^p$, compact with non-empty interior.
     [Figure: growth of the CD kernel with the degree $d$; labels $d^p$ and $\exp(\alpha d)$.]

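  3. Exponential separation of the support
     $\mu$: Lebesgue measure restricted to $S \subset \mathbb{R}^p$, compact with non-empty interior.
     [Figure: growth of the CD kernel with the degree $d$; labels $d^p$, $d^{p+1}$, $d^{p+2}$, $\exp(\alpha\sqrt{d})$ and $\exp(\alpha d)$.]
     Thresholding scheme: for $C > 0$ and $q > p$,
     $\{x : v_d(x)^T M_{\mu,d}^{-1} v_d(x) \le C d^q\} \to \mathrm{cl}(\mathrm{int}(S))$ in the limit “$d \to \infty$”.
     Extends to positive densities on $S$.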
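
The thresholding scheme can be tried numerically in the simplest setting. The sketch below is an illustration only, assuming $p = 1$, $S = [-1, 1]$, the monomial basis, and arbitrary constants `C = 1`, `q = 2` (none of these choices come from the slides). It evaluates $v_d(x)^T M_{\mu,d}^{-1} v_d(x)$ at one point inside $S$ and one point outside, and compares both values with $C d^q$.

```python
import numpy as np

def lebesgue_moment(k):
    """Exact moment of Lebesgue measure on [-1, 1]: integral of x**k."""
    return 2.0 / (k + 1) if k % 2 == 0 else 0.0

def cd_kernel_diag(x, d):
    """K_d(x, x) = v_d(x)^T M^{-1} v_d(x) in the monomial basis."""
    M = np.array([[lebesgue_moment(i + j) for j in range(d + 1)]
                  for i in range(d + 1)])
    v = np.array([x ** k for k in range(d + 1)], dtype=float)
    return float(v @ np.linalg.solve(M, v))

d, p = 8, 1            # degree and ambient dimension
C, q = 1.0, 2          # hypothetical threshold constants, with q > p
threshold = C * d ** q

inside = cd_kernel_diag(0.0, d)    # point in the interior of S
outside = cd_kernel_diag(2.0, d)   # point outside S

# The interior point passes the threshold test, the exterior point does not.
print(inside <= threshold, outside <= threshold)
```

The monomial moment matrix becomes badly conditioned as $d$ grows, so a sketch like this is only reasonable for small degrees; an orthogonal basis would be a more careful choice.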

  4. Outline
     1. CD kernel, Christoffel function, orthogonal polynomials, moments
     2. CD kernel captures measure theoretic properties: univariate case
     3. Quantitative asymptotics
     4. The singular case
     5. Using approximate moments
     6. An application to polynomial optimal control

  5. The singular case
     $\mu$: Borel probability measure in $\mathbb{R}^p$ with compact support $S$; absolute continuity is no longer assumed.
     $\mathbb{R}_d[X]$: $p$-variate polynomials of degree at most $d$ (of dimension $s(d) = \binom{p+d}{d}$).
     $(P, Q) \mapsto \langle\langle P, Q \rangle\rangle_\mu := \int P Q \, d\mu$ no longer defines a valid scalar product on $\mathbb{R}_d[X]$, only a positive semidefinite bilinear form on $\mathbb{R}_d[X]$.

  6. Specificity of the singular case
     $\mu$: Borel probability measure in $\mathbb{R}^p$ with compact support $S$; absolute continuity is no longer assumed.
     $\mathbb{R}_d[X]$: $p$-variate polynomials of degree at most $d$ (of dimension $s(d) = \binom{p+d}{d}$).
     Moment based computation: let $\{P_i\}_{i=1}^{s(d)}$ be any basis of $\mathbb{R}_d[X]$, $v_d : x \mapsto (P_1(x), \ldots, P_{s(d)}(x))^T$, and
     $M_{\mu,d} = \int v_d v_d^T \, d\mu \in \mathbb{R}^{s(d) \times s(d)}$.
     When $M_{\mu,d}$ is invertible, for all $x, y \in \mathbb{R}^p$, $K^\mu_d(x, y) = v_d(x)^T M_{\mu,d}^{-1} v_d(y)$.
     Let $P(x) = \sum_{i=1}^{s(d)} p_i P_i(x)$, $P \in \mathbb{R}_d[X]$. We have $\int P^2 \, d\mu = p^T M_{\mu,d} \, p$.
     $P$ vanishes on $S$ if and only if $p \in \ker(M_{\mu,d})$.
     The moment matrix is singular; morally, the CD kernel should be $+\infty$.
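
A small numerical illustration of the kernel of the moment matrix, as a sketch under assumptions that are not in the slides: $\mu$ is taken to be the uniform probability measure on the unit circle in $\mathbb{R}^2$, $d = 2$, and the basis is the monomial basis. The polynomial $x^2 + y^2 - 1$ vanishes on the support, so its coefficient vector should lie in $\ker(M_{\mu,d})$.

```python
import numpy as np

# Monomial basis of R_2[X] in two variables: 1, x, y, x^2, xy, y^2.
EXPONENTS = [(0, 0), (1, 0), (0, 1), (2, 0), (1, 1), (0, 2)]

def v2(x, y):
    """Basis vector v_d evaluated at the point (x, y), for d = 2."""
    return np.array([x ** a * y ** b for a, b in EXPONENTS], dtype=float)

# Uniform probability measure on the unit circle (a singular measure).
# An equispaced rule with 64 nodes integrates these low-degree monomials
# exactly, up to rounding.
theta = 2 * np.pi * np.arange(64) / 64
V = np.array([v2(np.cos(t), np.sin(t)) for t in theta])   # shape (64, 6)
M = V.T @ V / len(theta)                                   # moment matrix

# Coefficients of P(x, y) = x^2 + y^2 - 1, which vanishes on the circle.
p = np.array([-1.0, 0.0, 0.0, 1.0, 0.0, 1.0])

rank = np.linalg.matrix_rank(M, tol=1e-10)
in_kernel = np.allclose(M @ p, 0.0)
print(rank, in_kernel)
```

Here the moment matrix is $6 \times 6$ but its rank is lower, which is the situation where the inverse formula for the CD kernel is unavailable.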

  7. Christoffel function to the rescue
     $\mu$: Borel probability measure in $\mathbb{R}^p$ with compact support $S$; absolute continuity is no longer assumed.
     $\mathbb{R}_d[X]$: $p$-variate polynomials of degree at most $d$ (of dimension $s(d) = \binom{p+d}{d}$).
     Variational formulation in the non-singular case: for all $z \in \mathbb{R}^p$,
     $\frac{1}{K^\mu_d(z, z)} = \Lambda^\mu_d(z) = \min_{P \in \mathbb{R}_d[X]} \left\{ \int P^2 \, d\mu : P(z) = 1 \right\}$.
     In the singular case, the Christoffel function remains well defined:
     $\Lambda^\mu_d(z) = \min_{P \in \mathbb{R}_d[X]} \left\{ \int P^2 \, d\mu : P(z) = 1 \right\}$.
     Given $z \in \mathbb{R}^p$ such that there exists $P \in \mathbb{R}_d[X]$ with $P(z) \neq 0$ and $P$ vanishing on $S$, we have $\Lambda^\mu_d(z) = 0$.
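
The variational formulation can be checked in coordinates. The following derivation is a standard argument written out as a sketch; it uses the basis $v_d$ and moment matrix $M_{\mu,d}$ of the moment based computation, and the symbols $p^\star$ and $p_0$ are introduced here for illustration only.

```latex
\begin{aligned}
&P = p^T v_d,\qquad \int P^2\,d\mu = p^T M_{\mu,d}\,p,\qquad P(z) = p^T v_d(z) = 1.\\
&\text{If } M_{\mu,d} \succ 0:\quad
1 = \big(p^T v_d(z)\big)^2 \le \big(p^T M_{\mu,d}\,p\big)\,\big(v_d(z)^T M_{\mu,d}^{-1} v_d(z)\big)
\quad\text{(Cauchy-Schwarz)},\\
&\text{with equality at } p^\star = \frac{M_{\mu,d}^{-1} v_d(z)}{v_d(z)^T M_{\mu,d}^{-1} v_d(z)},
\text{ so } \Lambda^\mu_d(z) = \frac{1}{v_d(z)^T M_{\mu,d}^{-1} v_d(z)} = \frac{1}{K^\mu_d(z,z)}.\\
&\text{If } p_0 \in \ker(M_{\mu,d}) \text{ and } p_0^T v_d(z) \neq 0:\quad
p = \frac{p_0}{p_0^T v_d(z)} \text{ is feasible and } p^T M_{\mu,d}\,p = 0,
\text{ so } \Lambda^\mu_d(z) = 0.
\end{aligned}
```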

  8. Getting the CD kernel back (and computation from moments)
     $\mu$: Borel probability measure in $\mathbb{R}^p$, compact support $S$.
     $\mathbb{R}_d[X]$: $p$-variate polynomials of degree at most $d$ (of dimension $s(d) = \binom{p+d}{d}$).
     $V$ denotes the Zariski closure of $S$ (smallest algebraic set containing $S$). For $d$ large enough, $V = \{z \in \mathbb{R}^p : \Lambda^\mu_d(z) > 0\}$.
     Polynomials on $V$: $L^2_{\mu,d} = \mathbb{R}_d[X] / \{P \in \mathbb{R}_d[X] : P \text{ vanishes on } V\}$.
     RKHS: $(L^2_{\mu,d}, \langle\langle \cdot, \cdot \rangle\rangle_\mu)$ is a Hilbert space of functions on $V$, and $K^\mu_d$ is its reproducing kernel (defined on $V$). For any $x \in V$ and $P \in L^2_{\mu,d}$, $P(x) = \int P(y) K^\mu_d(x, y) \, d\mu(y)$.
     Relation with the Christoffel function: $\Lambda^\mu_d(z) K^\mu_d(z, z) = 1$ for $z \in V$.
     Pseudo-inverse computation: let $v_d$ be any basis of $\mathbb{R}_d[X]$ and $M_{\mu,d}$ the moment matrix; then $K^\mu_d(x, y) = v_d(x)^T M_{\mu,d}^{\dagger} v_d(y)$ for all $x, y \in V$.
     Average value and Hilbert function: $\int K^\mu_d(x, x) \, d\mu(x) = \dim(L^2_{\mu,d}) \le s(d)$.
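
The pseudo-inverse formula and the average value identity can be observed on a toy example. This is a sketch under assumptions that are not in the slides: $\mu$ is the uniform probability measure on the unit circle, $d = 2$, the basis is monomial, and the cutoff `rcond=1e-10` for the pseudo-inverse is an arbitrary numerical choice.

```python
import numpy as np

# Monomial basis of R_2[X] in two variables: 1, x, y, x^2, xy, y^2.
EXPONENTS = [(0, 0), (1, 0), (0, 1), (2, 0), (1, 1), (0, 2)]

def v2(x, y):
    """Basis vector v_d evaluated at the point (x, y), for d = 2."""
    return np.array([x ** a * y ** b for a, b in EXPONENTS], dtype=float)

# Moment matrix of the uniform probability measure on the unit circle,
# computed with an equispaced rule (exact here up to rounding).
theta = 2 * np.pi * np.arange(64) / 64
nodes = np.column_stack([np.cos(theta), np.sin(theta)])
V = np.array([v2(x, y) for x, y in nodes])
M = V.T @ V / len(nodes)

# Pseudo-inverse; singular values below the cutoff are treated as zero.
M_pinv = np.linalg.pinv(M, rcond=1e-10)

def cd_kernel(z1, z2):
    """K_d(z1, z2) = v_d(z1)^T M^+ v_d(z2), meaningful for z1, z2 on V."""
    return float(v2(*z1) @ M_pinv @ v2(*z2))

# Average of K_d(x, x) against mu, compared with dim(L^2_{mu,d}) = rank(M).
diag = np.array([cd_kernel(z, z) for z in nodes])
average = diag.mean()
dim_L2 = np.linalg.matrix_rank(M, tol=1e-10)

# Reproducing property for P(x, y) = x at the point (1, 0).
x0 = (1.0, 0.0)
reproduced = np.mean([z[0] * cd_kernel(x0, z) for z in nodes])
print(average, dim_L2, reproduced)
```

In this example the Zariski closure $V$ is the circle itself, so the kernel values are only meaningful for points on the circle.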

  9. Outline (same as slide 4).

  10. Motivation for approximate moments
     “I am a Lasserre hierarchist, I work with pseudo-moments.”
     “I am a statistician, I work with empirical moments.”
     “I am a numerician, among others, I care about sensitivity to errors.”

  11. A stability result
     Choose a basis $v_d$ of $\mathbb{R}_d[X]$.
     Approximation of the Christoffel function: let $Q(x, y) = v_d(x)^T M^{-1} v_d(y)$ where $M \in \mathbb{R}^{s(d) \times s(d)}$ is positive definite. Then, for all $x \in \mathbb{R}^p$,
     $|Q(x, x) \Lambda^\mu_d(x) - 1| \le \| I - M_{\mu,d}^{1/2} M^{-1} M_{\mu,d}^{1/2} \|_{op}$.
     If $M \simeq M_{\mu,d}$, then $\Lambda^\mu_d(x) \simeq \frac{1}{Q(x, x)}$.
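
The stability bound can be checked numerically on a small case. This is a sketch under assumptions that are not in the slides: $\mu$ is Lebesgue measure on $[-1, 1]$, $d = 4$, the basis is monomial, and the approximate matrix `M` is a hypothetical perturbation of the exact moment matrix by a small positive semidefinite term.

```python
import numpy as np

d = 4
idx = np.arange(d + 1)

# Exact moment matrix of Lebesgue measure on [-1, 1] in the monomial basis.
S = idx[:, None] + idx[None, :]
M_mu = np.where(S % 2 == 0, 2.0 / (S + 1), 0.0)

# Hypothetical approximate moment matrix: M_mu plus a small PSD perturbation,
# so that M stays positive definite.
rng = np.random.default_rng(0)
B = rng.standard_normal((d + 1, d + 1))
M = M_mu + 1e-4 * (B @ B.T)

# Symmetric square root of M_mu from its eigendecomposition.
w, U = np.linalg.eigh(M_mu)
root = U @ np.diag(np.sqrt(w)) @ U.T

# Right-hand side: operator norm of I - M_mu^{1/2} M^{-1} M_mu^{1/2}.
bound = np.linalg.norm(np.eye(d + 1) - root @ np.linalg.solve(M, root), 2)

def relative_error(x):
    """Left-hand side |Q(x, x) * Lambda(x) - 1| at the point x."""
    v = x ** idx.astype(float)
    Q = v @ np.linalg.solve(M, v)
    christoffel = 1.0 / (v @ np.linalg.solve(M_mu, v))
    return abs(Q * christoffel - 1.0)

errors = [relative_error(x) for x in np.linspace(-2.0, 2.0, 41)]
print(max(errors) <= bound)
```

The bound is uniform in $x$, which is why the check also includes points outside $[-1, 1]$.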

  12. Regularization
     “Using pseudo inverse is like saying 0 = +∞”.
     Regularization: let $\mu_0$ be a simple absolutely continuous measure (moments are easy to compute). Replace $\mu$ by $\mu + \beta\mu_0$, $\beta > 0$.
     $M_{\mu + \beta\mu_0, d} = M_{\mu,d} + \beta M_{\mu_0,d} \succ 0$: the moment matrix is positive definite.
     $\Lambda^{\mu + \beta\mu_0}_d \ge \Lambda^\mu_d + \beta \Lambda^{\mu_0}_d$: if $\Lambda^{\mu + \beta\mu_0}_d$ is small, then $\Lambda^\mu_d$ is also small.
     $\int (\Lambda^{\mu + \beta\mu_0}_d)^{-1} \, d\mu \le \int (\Lambda^{\mu + \beta\mu_0}_d)^{-1} \, d(\mu + \beta\mu_0) = s(d) = O(d^p)$.
     $\Lambda^{\mu + \beta\mu_0}_d$ stays reasonably big on the support of $\mu$.
     $\Lambda^{\mu + \beta\mu_0}_d$ stays reasonably small outside the support of $\mu$ (if $\beta$ is small).
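
The effect of regularization can be illustrated on a toy singular measure. This is a sketch under assumptions that are not in the slides: $\mu$ is the uniform probability measure on the unit circle, $\mu_0$ is the uniform probability measure on $[-1, 1]^2$, $d = 2$, and $\beta = 10^{-3}$ is an arbitrary value.

```python
import numpy as np

# Monomial basis of R_2[X] in two variables: 1, x, y, x^2, xy, y^2.
EXPONENTS = [(0, 0), (1, 0), (0, 1), (2, 0), (1, 1), (0, 2)]

def v2(x, y):
    """Basis vector v_d evaluated at the point (x, y), for d = 2."""
    return np.array([x ** a * y ** b for a, b in EXPONENTS], dtype=float)

# mu: uniform probability measure on the unit circle (singular moment matrix).
theta = 2 * np.pi * np.arange(64) / 64
V = np.array([v2(np.cos(t), np.sin(t)) for t in theta])
M_mu = V.T @ V / len(theta)

# mu_0: uniform probability measure on [-1, 1]^2, with closed-form moments.
def m1(k):
    """Moment of order k of the uniform probability measure on [-1, 1]."""
    return 1.0 / (k + 1) if k % 2 == 0 else 0.0

M_0 = np.array([[m1(a1 + a2) * m1(b1 + b2) for a2, b2 in EXPONENTS]
                for a1, b1 in EXPONENTS])

beta = 1e-3                      # hypothetical regularization weight
M_reg = M_mu + beta * M_0        # moment matrix of mu + beta * mu_0

def christoffel_reg(x, y):
    """Christoffel function of mu + beta * mu_0 at the point (x, y)."""
    v = v2(x, y)
    return 1.0 / float(v @ np.linalg.solve(M_reg, v))

min_eig = np.linalg.eigvalsh(M_reg).min()   # positive: M_reg is invertible
on_support = christoffel_reg(1.0, 0.0)      # point on the circle
off_support = christoffel_reg(0.0, 0.0)     # point off the circle
print(min_eig > 0, on_support, off_support)
```

In this example the regularized Christoffel function is of order 1 on the circle and of order $\beta$ at the origin, which matches the qualitative statements of the slide.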

  13. Outline (same as slide 4).

  14. Acknowledgement
     The content of this section is taken from Marx, S., Pauwels, E., Weisser, T., Henrion, D., & Lasserre, J. (2019). Tractable semi-algebraic approximation using Christoffel-Darboux kernel. arXiv preprint arXiv:1904.01833.

  15. From the tutorial of Didier
     Controlled ODE: $\dot{x}(t) = f(x(t), u(t))$, $x(t) \in X$, $u(t) \in U$, $t \in [0, 1]$, $x(0) = 0$.
     Occupation measure, given a classical trajectory: $d\mu(x, u, t) = d\delta_{x(t)}(x) \, d\delta_{u(t)}(u) \, dt$.
     Relaxation: replace classical trajectories satisfying an ODE by measures satisfying a linear transport PDE.
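
To make the occupation measure concrete, its moments and the linear equation it satisfies can be written out. The following is a sketch for scalar $x$ and $u$ (exponents become multi-indices otherwise); the exponents $a, b, c$ and the smooth test function $\varphi$ are notation introduced here, not taken from the slides.

```latex
\begin{aligned}
\int x^a u^b t^c \, d\mu(x,u,t) &= \int_0^1 x(t)^a\, u(t)^b\, t^c \, dt,\\
\int \Big(\partial_t \varphi(x,t) + \partial_x \varphi(x,t)\, f(x,u)\Big)\, d\mu(x,u,t)
  &= \int_0^1 \frac{d}{dt}\,\varphi(x(t),t)\, dt
   = \varphi(x(1),1) - \varphi(x(0),0).
\end{aligned}
```

The second identity is linear in $\mu$, which is what allows the ODE constraint to be relaxed to a linear constraint on measures.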

  16. A heuristic argument
     Hierarchy: $f$ polynomial, $X$, $U$ basic semi-algebraic: level $d$ provides pseudo-moments up to degree $2d$ in the variables $t, u, x$, denoted $PM_d$.
     Heuristic: as $d$ grows, $PM_d$ should get close to $M_{\mu,d}$, where $\mu$ is an occupation measure supported on optimal trajectories.
     Use the Christoffel-Darboux kernel: “$(x, u, t)^T PM_d^{-1} (x, u, t)$”.
     The measure is singular, and we only have pseudo-moments...
     Morally, it is small on the support of $\mu$ and large outside the support.
     Morally, it is small on the optimal trajectory and large outside.

  17. A semi-algebraic estimator
     Hierarchy: $f$ polynomial, $X$, $U$ basic semi-algebraic: level $d$ provides pseudo-moments up to degree $2d$ in the variables $t, u, x$, denoted $PM_d$.
     Christoffel-Darboux kernel: “$(x, u, t)^T PM_d^{-1} (x, u, t)$” $= Q_d(x, u, t)$.
     Morally, it is small on the optimal trajectory and large outside.
     A semi-algebraic estimator: for all $t \in [0, 1]$, $(\hat{u}(t), \hat{x}(t)) \in \operatorname{argmin}_{(x, u)} Q_d(x, u, t)$.
     An example with $x(t) = \mathrm{sign}(t)/2$ and exact moments.
     [Figure: two panels, “Legendre projection” and “Christoffel-Darboux approximation”, showing $\hat{f}_d(x)$ against $x \in [-1, 1]$ for $d = 10, 20, 30$.]
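
The estimator can be sketched on the example $x(t) = \mathrm{sign}(t)/2$ with exact moments. The code below is an illustration under assumptions that are not in the slides: there is no control variable, the degree is $d = 4$, the basis is monomial in $(t, x)$, the singular moment matrix is regularized by a hypothetical term $\beta I$ with $\beta = 10^{-8}$, and the argmin is approximated on a grid.

```python
import numpy as np

d = 4
# Exponents (a, b) of the monomials t^a x^b of total degree at most d.
EXPONENTS = [(a, b) for a in range(d + 1) for b in range(d + 1 - a)]

def moment(a, b):
    """Exact integral of t^a x(t)^b over [-1, 1] for x(t) = sign(t) / 2."""
    return ((-1.0) ** a * (-0.5) ** b + 0.5 ** b) / (a + 1)

# Moment matrix of the measure carried by the graph of x(t).
M = np.array([[moment(a1 + a2, b1 + b2) for a2, b2 in EXPONENTS]
              for a1, b1 in EXPONENTS])

# M is singular because x^2 = 1/4 on the graph; add a small multiple of the
# identity (a hypothetical regularization chosen for this sketch).
beta = 1e-8
M_reg = M + beta * np.eye(len(EXPONENTS))

def Q(t, x):
    """Regularized kernel value v(t, x)^T M_reg^{-1} v(t, x)."""
    v = np.array([t ** a * x ** b for a, b in EXPONENTS], dtype=float)
    return float(v @ np.linalg.solve(M_reg, v))

X_GRID = np.linspace(-1.0, 1.0, 201)

def estimate_x(t):
    """Grid approximation of argmin_x Q(t, x)."""
    values = [Q(t, x) for x in X_GRID]
    return float(X_GRID[int(np.argmin(values))])

print(estimate_x(-0.5), estimate_x(0.5))
```

The regularization by $\beta I$ depends on the choice of basis; adding the moment matrix of a reference measure, as in the regularization slide, would be a basis-independent alternative.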

  18. Convergence guarantees
     A semi-algebraic estimator: $Q_d(x, u, t) =$ “$(x, u, t)^T PM_d^{-1} (x, u, t)$”,
     $(\hat{u}, \hat{x}) : t \mapsto (\hat{u}(t), \hat{x}(t)) \in \operatorname{argmin}_{(x, u)} Q_d(x, u, t)$.
     Assumption: $x$, $u$ in $L^1$, bounded, continuous almost everywhere, exact moments. Strong convergence in $L^1$.
     Assumption: $x$, $u$ Lipschitz, exact moments. Rate of order $O(1/\sqrt{d})$.
     Assumption: $x$, $u$ have bounded total variation, exact moments. Conjecture: rate of order $O(1/d^{1/4})$.

  19. Illustration on the double integrator with constraints
     Minimal time to reach the origin. $u \in [-1, 1]$, $x_1 \ge -1$.
     $\dot{x}_2(t) = x_1(t)$, $\dot{x}_1(t) = u(t)$.
     [Figure: three panels showing the second state, the first state and the control against time.]
     With true moments: [Figure: control against time.]

  20. Illustration in chemo-immunotherapy modeling
     Moussa, K., Fiacchini, M., & Alamir, M. (2019). Robust Optimal Control-based Design of Combined Chemo-and Immunotherapy Delivery Profiles. IFAC-PapersOnLine, 52(26), 76-81.

  21. Conclusion
     [Figure: growth of the CD kernel with the degree $d$; labels $d^p$, $d^{p+1}$, $d^{p+2}$, $\exp(\alpha\sqrt{d})$ and $\exp(\alpha d)$.]
     The CD kernel is computed from the moments of a measure $\mu$.
     It captures the support of $\mu$.
     A century-old mathematical history, and still active.
     A proper setup and proven guarantees require some subtleties.
     Can be combined with Lasserre's hierarchy: example in polynomial optimal control.
