  1. Shape-constrained regression and sum of squares polynomials
     Georgina Hall, INSEAD, Decision Sciences
     Joint work with Mihaela Curmei (Berkeley, EECS)

  2. Shape-constrained regression (1/2)
     Data: (X_i, Y_i), i = 1, …, m, where X_i ∈ B βŠ‚ ℝⁿ (B is a box) and Y_i ∈ ℝ.
     Goal: fit a polynomial f̂_{m,d} of degree d to the data that minimizes the least-squares error Σ_{i=1,…,m} (Y_i βˆ’ f̂_{m,d}(X_i))Β² and that has certain constraints on its shape.
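As a point of reference, the unconstrained version of this least-squares objective is ordinary polynomial regression; a minimal one-dimensional sketch on toy data (no shape constraint yet):

```python
import numpy as np

# Unconstrained version of the objective on the slide, in one dimension:
# fit a degree-d polynomial minimizing sum_i (Y_i - f(X_i))^2.
rng = np.random.default_rng(0)
m, d = 40, 3
X = rng.uniform(-1, 1, m)
Y = np.exp(X) + 0.1 * rng.standard_normal(m)   # toy data around a convex function

coeffs = np.polyfit(X, Y, d)                   # ordinary least squares in the coefficients
f_hat = np.poly1d(coeffs)
residual = np.sum((Y - f_hat(X)) ** 2)
print(residual)
```

Nothing here forces f̂ to be convex; the rest of the talk is about adding that constraint without losing tractability.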

  3. Shape-constrained regression (2/2)
     β€’ Convexity over B: for a full-dimensional box B and a twice continuously differentiable function f, f is convex over B ⇔ βˆ‡Β²f(x) ≽ 0 for all x ∈ B. Example: demand as a function of price.
     β€’ Monotonicity over B: for a continuously differentiable function f, f is increasing (resp. decreasing) in component x_j ⇔ βˆ‚f(x)/βˆ‚x_j β‰₯ 0 (resp. ≀ 0) for all x ∈ B. Example: price of a car as a function of age.
     β€’ Lipschitz with constant L: for any function f and a fixed scalar L > 0, f is Lipschitz with constant L ⇔ |f(x) βˆ’ f(y)| ≀ L β€–x βˆ’ yβ€– for all x, y ∈ B. Used as a regularizer: stops f from growing too steeply.
     Focus on convex regression here.
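Before certifying these constraints exactly, one can refute them cheaply by sampling; a sketch of such sample-based checks in one dimension (my own toy helper, sampling can disprove a shape constraint but never certify it, which is what motivates the sos machinery on the later slides):

```python
import numpy as np

# Sample-based (necessary, not sufficient) checks of the three shape
# constraints for a univariate function on a box B = [l, u].
def check_shapes(f, df, d2f, l, u, n=1000):
    x = np.linspace(l, u, n)
    convex = bool(np.all(d2f(x) >= 0))       # f'' >= 0 on the sample grid
    increasing = bool(np.all(df(x) >= 0))    # f' >= 0 on the sample grid
    L = float(np.max(np.abs(df(x))))         # empirical Lipschitz constant
    return convex, increasing, L

# exp is convex and increasing on [0, 1], with |f'| maximized at x = 1.
result = check_shapes(np.exp, np.exp, np.exp, 0.0, 1.0)
print(result)
```

A grid check like this can fail to detect a violation between grid points, which is exactly the gap a sum-of-squares certificate closes.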

  4. Convex regression – possible candidate
     A candidate for our regressor:
       f̂_{m,d} ≔ arg min_g Σ_{i=1,…,m} (Y_i βˆ’ g(X_i))²
                  s.t. g is a polynomial of degree d,
                       βˆ‡Β²g(x) ≽ 0 for all x ∈ B.
     But… Theorem [Ahmadi, H.]: it is (strongly) NP-hard to test whether a polynomial p of degree β‰₯ 3 is convex over a box B. (Reduction from the problem of testing whether a matrix whose entries are affine polynomials in x is positive semidefinite for all x in B.)

  5. A detour via sum of squares (1/5)
     f convex over B ⇔ βˆ‡Β²f(x) ≽ 0 for all x ∈ B ⇔ yα΅€ βˆ‡Β²f(x) y β‰₯ 0 for all x ∈ B and all y ∈ ℝⁿ.
     The last expression is a polynomial in x and y.
     β€’ If we can find a way of imposing that a polynomial be nonnegative, then we are in business!
     β€’ Unfortunately, it is hard to test whether a polynomial p is nonnegative once the degree of p is β‰₯ 4.
     β€’ What to do?

  6. A detour via sum of squares (2/5)
     Idea: find a property that β‘  implies nonnegativity but β‘‘ is easy to test. Candidate: sum of squares (sos).
     Definition: a polynomial p is sos if it can be written as p(x) = Σ_i q_i(x)² for some polynomials q_i.
     β‘ ? Yes: every sos polynomial is nonnegative. The two sets are even equal sometimes: n = 1, d = 2, or (n, d) = (2, 4) [Hilbert].
     What about β‘‘? Also yes! Let’s see why.

  7. A detour via sum of squares (3/5)
     A polynomial p(x) of degree 2d is sos if and only if there exists Q ≽ 0 such that p(x) = z(x)α΅€ Q z(x), where z(x) = (1, x₁, …, xβ‚™, x₁xβ‚‚, …, xβ‚™α΅ˆ)α΅€ is the vector of monomials of degree up to d.
     For instance, the standard quartic p(x) = 2x₁⁴ + 5x₂⁴ βˆ’ x₁²xβ‚‚Β² + 2x₁³xβ‚‚ satisfies p(x) = z(x)α΅€ Q z(x) with z(x) = (x₁², xβ‚‚Β², x₁xβ‚‚)α΅€ (only degree-2 monomials are needed since p is homogeneous) and the positive semidefinite matrix
       Q = [ 2  βˆ’3   1 ]
           [βˆ’3   5   0 ]
           [ 1   0   5 ],
     which factors to give the sos decomposition p = Β½(2x₁² βˆ’ 3xβ‚‚Β² + x₁xβ‚‚)Β² + Β½(xβ‚‚Β² + 3x₁xβ‚‚)Β².
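The Gram-matrix characterization can be sanity-checked numerically; the quartic, basis, and matrix below are the standard textbook instance named above, not output of any solver:

```python
import numpy as np

# Check that p(x1,x2) = 2*x1^4 + 5*x2^4 - x1^2*x2^2 + 2*x1^3*x2 equals
# z(x)^T Q z(x) for z = (x1^2, x2^2, x1*x2), and that Q is PSD.
Q = np.array([[ 2.0, -3.0, 1.0],
              [-3.0,  5.0, 0.0],
              [ 1.0,  0.0, 5.0]])

def p(x1, x2):
    return 2*x1**4 + 5*x2**4 - x1**2*x2**2 + 2*x1**3*x2

def gram_form(x1, x2):
    z = np.array([x1**2, x2**2, x1*x2])
    return z @ Q @ z

rng = np.random.default_rng(0)
pts = rng.standard_normal((100, 2))
match = all(np.isclose(p(a, b), gram_form(a, b)) for a, b in pts)
psd = bool(np.all(np.linalg.eigvalsh(Q) >= -1e-9))
print(match, psd)   # both True: p is sos
```

Q here has a zero eigenvalue, which is why the decomposition above uses only two squares.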

  8. A detour via sum of squares (4/5)
     β€’ Testing whether a polynomial is sos is a semidefinite program (SDP):
         min_Q 0
         s.t. p(x) = z(x)α΅€ Q z(x) for all x   (linear equations involving the coefficients of p and the entries of Q)
              Q ≽ 0.
     β€’ In fact, even optimizing over the set of sos polynomials (of fixed degree) is an SDP:
         min_{c₁,cβ‚‚} c₁ + cβ‚‚                          min_{c₁,cβ‚‚,Q} c₁ + cβ‚‚
         s.t. c₁ βˆ’ 3cβ‚‚ = 4                    ⇔      s.t. c₁ βˆ’ 3cβ‚‚ = 4
              c₁x₁² βˆ’ 2cβ‚‚x₁xβ‚‚ + 5x₂⁴ sos                  c₁x₁² βˆ’ 2cβ‚‚x₁xβ‚‚ + 5x₂⁴ = z(x)α΅€ Q z(x)
                                                           Q ≽ 0.
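The "linear equations" come from matching each coefficient of p with a sum of entries of Q; a sketch of this bookkeeping for the basis z = (x₁², xβ‚‚Β², x₁xβ‚‚), where the particular Q is an illustrative choice:

```python
from itertools import product

# The constraint "p(x) = z(x)^T Q z(x) for all x" is a set of linear
# equations relating the coefficients of p to sums of entries of Q.
basis = [(2, 0), (0, 2), (1, 1)]   # exponent vectors of the monomials in z

def coeff_map(Q):
    """Coefficients of z(x)^T Q z(x), keyed by monomial exponent vector."""
    out = {}
    for (i, ei), (j, ej) in product(enumerate(basis), repeat=2):
        mono = (ei[0] + ej[0], ei[1] + ej[1])   # product of the two monomials
        out[mono] = out.get(mono, 0.0) + Q[i][j]
    return out

Q = [[2, -3, 1], [-3, 5, 0], [1, 0, 5]]
cm = coeff_map(Q)
# e.g. coefficient of x1^2*x2^2 is Q[0][1] + Q[1][0] + Q[2][2] = -3 - 3 + 5 = -1
print(cm)
```

An SDP solver searches over Q subject to exactly these equalities plus Q ≽ 0.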

  9. A detour via sum of squares (5/5)
     β€’ Slight subtlety here: f convex over B ⇔ yα΅€ βˆ‡Β²f(x) y β‰₯ 0 for all x ∈ B and all y ∈ ℝⁿ — nonnegativity in x is only needed over the box B.
     β€’ How to impose nonnegativity over a set?
     Theorem [Putinar ’93]: for a box B = {(x₁, …, xβ‚™) | l₁ ≀ x₁ ≀ u₁, …, lβ‚™ ≀ xβ‚™ ≀ uβ‚™}, we write instead
       yα΅€ βˆ‡Β²f(x) y = Οƒβ‚€(x, y) + σ₁(x, y)(u₁ βˆ’ x₁)(x₁ βˆ’ l₁) + β‹― + Οƒβ‚™(x, y)(uβ‚™ βˆ’ xβ‚™)(xβ‚™ βˆ’ lβ‚™),
     where Οƒβ‚€(x, y), σ₁(x, y), …, Οƒβ‚™(x, y) are sos polynomials in x and y.
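A minimal Putinar-style certificate, checked numerically; the instance and multipliers are my own toy example (p(x) = x on B = [0, 1]), not from the slides:

```python
import numpy as np

# On B = [0, 1], p(x) = x is nonnegative, certified by
#   x = sigma0(x) + sigma1(x) * (u - x) * (x - l)
# with l = 0, u = 1, sigma0(x) = x^2 and sigma1(x) = 1, both trivially sos:
# every term on the right is nonnegative whenever x is in the box.
x = np.linspace(0.0, 1.0, 201)
lhs = x
rhs = x**2 + 1.0 * (1.0 - x) * (x - 0.0)   # sigma0 + sigma1*(u - x)*(x - l)
ok = bool(np.allclose(lhs, rhs))
print(ok)   # the polynomial identity x = x^2 + (1 - x)*x holds
```

The identity holds for all x, but it certifies nonnegativity only on B, where both the sos terms and the box factors are nonnegative.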

  10. Convex regression – a new candidate
     A new candidate for the regressor:
       f̄_{m,d,r} ≔ arg min_{g, Οƒβ‚€, …, Οƒβ‚™} Ξ£_{i=1,…,m} (Y_i βˆ’ g(X_i))Β²
                    s.t. g is a polynomial of degree d,
                         yα΅€ βˆ‡Β²g(x) y = Οƒβ‚€(x, y) + σ₁(x, y)(u₁ βˆ’ x₁)(x₁ βˆ’ l₁) + β‹― + Οƒβ‚™(x, y)(uβ‚™ βˆ’ xβ‚™)(xβ‚™ βˆ’ lβ‚™),
                         Οƒβ‚€(x, y), σ₁(x, y), …, Οƒβ‚™(x, y) are sos of degree r in x (and 2 in y).
     β€’ When r is fixed, this is a semidefinite program.
     β€’ As r β†’ ∞, we recover f̂_{m,d}.
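Solving this formulation needs an SDP solver, but a degenerate special case is instructive: in one dimension with d = 2, convexity of axΒ² + bx + c on any box reduces to a β‰₯ 0, so the shape-constrained fit becomes bound-constrained linear least squares. A sketch of that toy case (not the sos program above):

```python
import numpy as np
from scipy.optimize import lsq_linear

# Toy special case of convex regression: 1-D, degree 2, convexity <=> a >= 0.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, 50)
Y = np.abs(X) + 0.05 * rng.standard_normal(50)   # convex ground truth |x| + noise

A = np.column_stack([X**2, X, np.ones_like(X)])  # design matrix for (a, b, c)
res = lsq_linear(A, Y, bounds=([0, -np.inf, -np.inf], np.inf))
a, b, c = res.x
print(a >= 0)   # the fitted parabola is convex by construction
```

For higher degrees or dimensions this trick no longer works, which is exactly where the sos/SDP formulation earns its keep.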

  11. Comparison with existing methods
     Our method:
     β€’ Semidefinite program to obtain the estimator
     β€’ Number of datapoints does not impact the size of the semidefinite program; its size scales polynomially in the number of features
     β€’ Obtaining a prediction: evaluation of our polynomial estimator
     β€’ Smooth estimator
     β€’ Can be combined with monotonicity constraints and Lipschitz constraints
     Existing method [Lim & Glynn; Seijo & Sen]:
     β€’ Quadratic program to obtain the estimator
     β€’ Number of variables (resp. constraints) scales linearly (resp. quadratically) with the number of datapoints
     β€’ Obtaining a prediction: requires solving a linear program
     β€’ Piecewise-affine estimator (can be smoothed, see [Mazumder et al.])
     β€’ Can be combined with monotonicity constraints (see [Lim & Glynn]) and Lipschitz constraints (see [Mazumder et al.])

  12. Consistency of f̄_{m,d,r} (1/4)
     β€’ The estimator of [Lim & Glynn; Seijo & Sen] is shown to be consistent. What about ours?
     Assumptions on the data: Y_i = f(X_i) + Ξ½_i for i = 1, …, m, where
     β€’ the X_i are iid with support B;
     β€’ E[Ξ½_i | X_i] = 0 a.s. and E[Ξ½_iΒ²] < ∞;
     β€’ f is twice continuously differentiable and convex over B.
     Theorem [Curmei, H.]: the regressor f̄_{m,d,r} is a consistent estimator of f over any compact C βŠ‚ B, i.e.,
       sup_{x ∈ C} |f̄_{m,d,r}(x) βˆ’ f(x)| β†’ 0 a.s. as d, m, r β†’ ∞.

  13. Consistency of f̂_{m,d} (2/4)
     Proof ideas, inspired by [Lim & Glynn, OR ’12]:
     1. Write |f(x) βˆ’ f̄_{m,d,r}(x)| ≀ |f(x) βˆ’ f̂_{m,d}(x)| + |f̂_{m,d}(x) βˆ’ f̄_{m,d,r}(x)|.
        One can show sup_{x ∈ C} |f̂_{m,d}(x) βˆ’ f̄_{m,d,r}(x)| β†’ 0 as r β†’ ∞.
     2. Introduce a polynomial approximation of f: for any Ρ > 0, there exist d and a convex polynomial g_d of degree d such that sup_{x ∈ C} |f(x) βˆ’ g_d(x)| < Ρ.

  14. Consistency of f̂_{m,d} (3/4)
     3. For x ∈ C and X_i "close to x":
        |f(x) βˆ’ f̂_{m,d}(x)| ≀ |f(x) βˆ’ g_d(x)| + |g_d(x) βˆ’ g_d(X_i)| + |g_d(X_i) βˆ’ f̂_{m,d}(X_i)| + |f̂_{m,d}(X_i) βˆ’ f̂_{m,d}(x)|.
        β€’ First term: upper bound by Ξ΅.
        β€’ Second term: show that g_d is Lipschitz (use convexity of g_d over B).
        β€’ Fourth term: show that f̂_{m,d} is Lipschitz, uniformly in m (bound |f̂_{m,d}| over C uniformly in m and use convexity).
        β€’ Third term: upper bound it (algebra) by (1/m) Ξ£_{i=1,…,m} (g_d(X_i) βˆ’ f̂_{m,d}(X_i))Β².
        It remains to show that (1/m) Ξ£_{i=1,…,m} (g_d(X_i) βˆ’ f̂_{m,d}(X_i))Β² β†’ 0 a.s. as m β†’ ∞.

  15. Consistency of f̂_{m,d} (4/4)
     3. Show that (1/m) Ξ£_{i=1,…,m} (g_d(X_i) βˆ’ f̂_{m,d}(X_i))Β² β†’ 0 a.s. as m β†’ ∞.
        β€’ Use the fact that f̂_{m,d} is a minimizer of Ξ£_i (Y_i βˆ’ g(X_i))Β² to obtain
          (1/m) Ξ£_{i=1,…,m} (g_d(X_i) βˆ’ f̂_{m,d}(X_i))Β² ≀ (2/m) Ξ£_i (Y_i βˆ’ g_d(X_i)) Β· (f̂_{m,d}(X_i) βˆ’ g_d(X_i)).
        β€’ We can’t apply the SLLN directly because f̂_{m,d} is a polynomial that depends on the X_i and Y_i.
        β€’ Idea: approximate f̂_{m,d} by a deterministic function which is bounded over C.
          β€’ Show that f̂_{m,d} belongs (for large enough m) to a compact set whose elements are bounded over C.
          β€’ Construct an Ξ΅-net of this set.
          β€’ Replace f̂_{m,d} by an element of this set which is Ξ΅-close and bounded over C.
          β€’ Now use the SLLN, with Y_i βˆ’ g_d(X_i) β‰ˆ Ξ½_i.
