Shape-constrained regression and sum of squares polynomials
Georgina Hall, INSEAD, Decision Sciences
Joint work with Mihaela Curmei (Berkeley, EECS)
Shape-constrained regression (1/2)
Data: $(X_i, Y_i)_{i=1,\dots,m}$, where $X_i \in B \subset \mathbb{R}^n$ ($B$ is a box) and $Y_i \in \mathbb{R}$.
Goal: Fit a polynomial $\hat g_{m,d}$ of degree $d$ to the data that minimizes
$$\sum_{i=1}^{m} \left(Y_i - g(X_i)\right)^2$$
and that has certain constraints on its shape.
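The objective above, without any shape constraint, is ordinary polynomial least squares. The following is a minimal sketch for the univariate case only; the function names (`fit_polynomial`, `evaluate`, `squared_error`) and the toy data are illustrative assumptions, not part of the talk, and the shape constraints discussed later are deliberately left out.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares coefficients c[0..degree] of sum_k c[k] * x**k.

    Unconstrained fit: no convexity, monotonicity or Lipschitz constraint.
    """
    n = degree + 1
    # Normal equations A c = b, A[j][k] = sum_i x_i^(j+k), b[j] = sum_i y_i x_i^j.
    A = [[sum(x ** (j + k) for x in xs) for k in range(n)] for j in range(n)]
    b = [sum(y * x ** j for x, y in zip(xs, ys)) for j in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for k in range(col, n):
                A[r][k] -= factor * A[col][k]
            b[r] -= factor * b[col]
    # Back substitution.
    c = [0.0] * n
    for j in reversed(range(n)):
        tail = sum(A[j][k] * c[k] for k in range(j + 1, n))
        c[j] = (b[j] - tail) / A[j][j]
    return c


def evaluate(c, x):
    """Value of the polynomial with coefficients c at the point x."""
    return sum(ck * x ** k for k, ck in enumerate(c))


def squared_error(c, xs, ys):
    """The objective sum_i (Y_i - g(X_i))^2 for the polynomial g given by c."""
    return sum((y - evaluate(c, x)) ** 2 for x, y in zip(xs, ys))


# Toy data (an assumption for illustration): noiseless samples of 1 + 2 x^2 on [-1, 1].
xs = [i / 10 for i in range(-10, 11)]
ys = [1.0 + 2.0 * x * x for x in xs]
coeffs = fit_polynomial(xs, ys, degree=2)
```

On this noiseless data the fit recovers the generating coefficients up to rounding. With noisy data nothing forces the fitted polynomial to be convex, which is what the shape constraints are for.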
Shape-constrained regression (2/2)
Convexity over B
For a full-dimensional box $B$ and a twice-continuously differentiable function $f$:
$$f \text{ is convex over } B \iff \nabla^2 f(x) \succeq 0, \ \forall x \in B$$
Example:
• Demand as a function of price
Monotonicity over B
For a continuously differentiable function $f$:
$$f \text{ is increasing (resp. decreasing) in component } x_i \iff \frac{\partial f(x)}{\partial x_i} \ge 0 \ (\text{resp. } \le 0), \ \forall x \in B$$
Example:
• Price of a car as a function of age
Lipschitz with constant K
For any function $f$ and a fixed scalar $K > 0$:
$$f \text{ is Lipschitz with constant } K \iff |f(x) - f(y)| \le K \|x - y\|, \ \forall x, y \in B$$
Use as a regularizer: stops $f$ from growing too steeply.
Focus on convex regression here.
Convex regression: possible candidate
A candidate for our regressor:
$$\hat g_{m,d}(x) \in \arg\min_{g} \sum_{i=1}^{m} \left(Y_i - g(X_i)\right)^2$$
$$\text{s.t. } g \text{ is a polynomial of degree } d, \quad \nabla^2 g(x) \succeq 0, \ \forall x \in B$$
But…
Theorem [Ahmadi, H.]: It is (strongly) NP-hard to test whether a polynomial $p$ of degree $\ge 3$ is convex over a box $B$.
(Reduction from the problem of testing whether a matrix whose entries are affine polynomials in $x$ is positive semidefinite for all $x$ in $B$.)
A detour via sum of squares (1/5)
$$g(x) \text{ convex over } B \iff \nabla^2 g(x) \succeq 0, \ \forall x \in B \iff y^T \nabla^2 g(x)\, y \ge 0, \ \forall x \in B, \ \forall y \in \mathbb{R}^n$$
The last expression, $y^T \nabla^2 g(x)\, y$, is a polynomial in $x$ and $y$.
• If we can find a way of imposing that a polynomial be nonnegative, then we are in business!
• Unfortunately, it is hard to test whether a polynomial $p$ is nonnegative when the degree of $p$ is $\ge 4$.
• What to do?
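As a small worked illustration of how the Hessian condition turns into a polynomial in $x$ and $y$, take the toy polynomial $g(x) = x_1^4 + x_2^2$ (our choice for illustration, not an example from the talk):

```latex
\nabla^2 g(x) =
\begin{pmatrix} 12 x_1^2 & 0 \\ 0 & 2 \end{pmatrix},
\qquad
y^T \nabla^2 g(x)\, y = 12\, x_1^2 y_1^2 + 2\, y_2^2
= \left(\sqrt{12}\, x_1 y_1\right)^2 + \left(\sqrt{2}\, y_2\right)^2 .
```

Here the polynomial in $(x, y)$ is visibly a sum of two squares, so it is nonnegative everywhere and $g$ is convex on all of $\mathbb{R}^2$, in particular on any box.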
A detour via sum of squares (2/5)
Idea: Find a property that implies nonnegativity (①) but that is easy to test (②).
Property = sum of squares (sos)
Definition: A polynomial $p$ is sos if it can be written as $p(x) = \sum_i q_i(x)^2$.
①: Yes! Sos polynomials $\subseteq$ nonnegative polynomials. The two sets are even equal sometimes: $n = 1$, $d = 2$, $(n, d) = (3, 4)$ [Hilbert].
What about ②? Also yes! Let's see why.
A detour via sum of squares (3/5)
A polynomial $p(x)$ of degree $2d$ is sos if and only if $\exists Q \succeq 0$ such that
$$p(x) = z(x)^T Q\, z(x),$$
where $z(x) = (1, x_1, \dots, x_n, x_1 x_2, \dots, x_n^d)^T$ is the vector of monomials of degree up to $d$.
Ex:
$$p(x) = x_1^4 - 6x_1^3 x_2 + 2x_1^3 x_3 + 6x_1^2 x_3^2 + 9x_1^2 x_2^2 - 6x_1^2 x_2 x_3 - 14 x_1 x_2 x_3^2 + 4 x_1 x_3^3 + 5 x_3^4 - 7 x_2^2 x_3^2 + 16 x_2^4$$
$$= (x_1^2 - 3x_1 x_2 + x_1 x_3 + 2x_3^2)^2 + (x_1 x_3 - x_2 x_3)^2 + (4x_2^2 - x_3^2)^2$$
$$= \begin{pmatrix} x_1^2 \\ x_1 x_2 \\ x_2^2 \\ x_1 x_3 \\ x_2 x_3 \\ x_3^2 \end{pmatrix}^T
\begin{pmatrix}
1 & -3 & 0 & 1 & 0 & 2 \\
-3 & 9 & 0 & -3 & 0 & -6 \\
0 & 0 & 16 & 0 & 0 & -4 \\
1 & -3 & 0 & 2 & -1 & 2 \\
0 & 0 & 0 & -1 & 1 & 0 \\
2 & -6 & -4 & 2 & 0 & 5
\end{pmatrix}
\begin{pmatrix} x_1^2 \\ x_1 x_2 \\ x_2^2 \\ x_1 x_3 \\ x_2 x_3 \\ x_3^2 \end{pmatrix}$$
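The link between an sos decomposition and a positive semidefinite Gram matrix can be checked numerically on the quartic in three variables from this slide, $p(x) = (x_1^2 - 3x_1x_2 + x_1x_3 + 2x_3^2)^2 + (x_1x_3 - x_2x_3)^2 + (4x_2^2 - x_3^2)^2$. The sketch below builds $Q$ as a sum of outer products of the coefficient vectors of the three squares. The monomial ordering and all names are our assumptions for illustration.

```python
import random

# Coefficient vectors of the three squares in the monomial basis
# z(x) = (x1^2, x1*x2, x2^2, x1*x3, x2*x3, x3^2); this ordering is our choice.
squares = [
    [1, -3, 0, 1, 0, 2],   # x1^2 - 3 x1 x2 + x1 x3 + 2 x3^2
    [0, 0, 0, 1, -1, 0],   # x1 x3 - x2 x3
    [0, 0, 4, 0, 0, -1],   # 4 x2^2 - x3^2
]

# Gram matrix Q = sum of outer products s s^T, positive semidefinite by construction.
Q = [[sum(s[j] * s[k] for s in squares) for k in range(6)] for j in range(6)]


def z(x1, x2, x3):
    """Vector of degree-2 monomials in the ordering used above."""
    return [x1 * x1, x1 * x2, x2 * x2, x1 * x3, x2 * x3, x3 * x3]


def quad_form(v):
    """The quadratic form v^T Q v."""
    return sum(v[j] * Q[j][k] * v[k] for j in range(6) for k in range(6))


def p(x1, x2, x3):
    """The polynomial written out in expanded form."""
    return (x1**4 - 6 * x1**3 * x2 + 2 * x1**3 * x3 + 6 * x1**2 * x3**2
            + 9 * x1**2 * x2**2 - 6 * x1**2 * x2 * x3 - 14 * x1 * x2 * x3**2
            + 4 * x1 * x3**3 + 5 * x3**4 - 7 * x2**2 * x3**2 + 16 * x2**4)


# Compare p(x) with z(x)^T Q z(x) at random points.
random.seed(0)
points = [[random.uniform(-2, 2) for _ in range(3)] for _ in range(100)]
max_gap = max(abs(p(*x) - quad_form(z(*x))) for x in points)
```

Going the other way (from a polynomial to a suitable $Q$) is the part that needs a semidefinite programming solver. This sketch only verifies a given certificate.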
A detour via sum of squares (4/5)
• Testing if a polynomial is sos is a semidefinite program (SDP).
$$\min_{Q} \ 0 \quad \text{s.t. } p(x) = z(x)^T Q\, z(x) \ \ \forall x, \quad Q \succeq 0$$
The constraint $p(x) = z(x)^T Q\, z(x)$ amounts to linear equations involving the coefficients of $p$ and the entries of $Q$.
• In fact, even optimizing over the set of sos polynomials (of fixed degree) is an SDP.
$$\min_{c_1, c_2} \ c_1 + c_2 \quad \text{s.t. } c_1 - 3c_2 = 4, \quad c_1 x_1^2 - 2c_2 x_1 x_2 + 5x_2^4 \ \text{sos}$$
is equivalent to
$$\min_{c_1, c_2, Q} \ c_1 + c_2 \quad \text{s.t. } c_1 - 3c_2 = 4, \quad c_1 x_1^2 - 2c_2 x_1 x_2 + 5x_2^4 = z(x)^T Q\, z(x), \quad Q \succeq 0$$
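The small problem on this slide, minimizing $c_1 + c_2$ subject to $c_1 - 3c_2 = 4$ and $c_1 x_1^2 - 2c_2 x_1 x_2 + 5x_2^4$ being sos, can be worked by hand, which shows what "linear equations in the coefficients and the entries of $Q$" means in practice. The sketch below uses the reduced basis $z(x) = (x_1, x_2, x_2^2)^T$; this reduction is our simplification, justified because the diagonal entries of $Q$ for the monomials $1$, $x_1^2$ and $x_1 x_2$ are forced to zero by the missing constant, $x_1^4$ and $x_1^2 x_2^2$ terms.

```latex
\begin{aligned}
z(x)^T Q\, z(x) &= Q_{11} x_1^2 + 2 Q_{12} x_1 x_2 + Q_{22} x_2^2
                 + 2 Q_{13} x_1 x_2^2 + 2 Q_{23} x_2^3 + Q_{33} x_2^4, \\
\text{matching coefficients:}\quad
& Q_{11} = c_1,\quad Q_{12} = -c_2,\quad Q_{22} = 0,\quad
  Q_{13} = 0,\quad Q_{23} = 0,\quad Q_{33} = 5, \\
Q \succeq 0 \text{ and } Q_{22} = 0
 &\;\Longrightarrow\; Q_{12} = 0 \;\Longrightarrow\; c_2 = 0, \\
c_1 - 3 c_2 = 4 &\;\Longrightarrow\; c_1 = 4, \qquad
Q = \begin{pmatrix} 4 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 5 \end{pmatrix} \succeq 0 .
\end{aligned}
```

Under these steps the feasible set is the single point $(c_1, c_2) = (4, 0)$, so the optimal value is $4$, attained by $4x_1^2 + 5x_2^4 = (2x_1)^2 + (\sqrt{5}\, x_2^2)^2$. In larger instances the same coefficient matching is handed to an SDP solver rather than done by hand.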
A detour via sum of squares (5/5)
• Slight subtlety here: nonnegativity is required for $x \in B$, not for all $x$.
$$g(x) \text{ convex over } B \iff \nabla^2 g(x) \succeq 0, \ \forall x \in B \iff y^T \nabla^2 g(x)\, y \ge 0, \ \forall x \in B, \ \forall y \in \mathbb{R}^n$$
• How to impose nonnegativity over a set?
Theorem [Putinar '93]: For a box $B = \{(x_1, \dots, x_n) \mid l_1 \le x_1 \le u_1, \dots, l_n \le x_n \le u_n\}$, we write instead:
$$y^T \nabla^2 g(x)\, y = s_0(x, y) + s_1(x, y)(u_1 - x_1)(x_1 - l_1) + \cdots + s_n(x, y)(u_n - x_n)(x_n - l_n),$$
where $s_0(x, y), s_1(x, y), \dots, s_n(x, y)$ are sos polynomials in $x$ and $y$.
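To see how a certificate of this form proves nonnegativity on a box without proving it everywhere, consider a one-dimensional toy case that is our own illustration rather than an example from the talk: $p(x) = x$ on $B = [0, 1]$. It is negative for $x < 0$, yet $x = x^2 + 1 \cdot (1 - x)(x - 0)$ with $s_0(x) = x^2$ and $s_1(x) = 1$ both sos. The names below are hypothetical.

```python
def p(x):
    """Toy polynomial: nonnegative on [0, 1], negative for x < 0."""
    return x


def s0(x):
    """Sos multiplier: a single square."""
    return x ** 2


def s1(x):
    """Sos multiplier: the constant 1 = 1^2."""
    return 1.0


l, u = 0.0, 1.0  # the box B = [l, u]


def certificate(x):
    """Right-hand side s0(x) + s1(x) * (u - x) * (x - l)."""
    return s0(x) + s1(x) * (u - x) * (x - l)


# The identity p(x) = certificate(x) holds for every x, inside or outside the box.
grid = [-2 + 4 * k / 200 for k in range(201)]
max_gap = max(abs(p(x) - certificate(x)) for x in grid)

# Inside the box, (u - x)(x - l) >= 0, so every term of the certificate is >= 0.
inside_nonnegative = all(certificate(x) >= 0 for x in grid if l <= x <= u)
```

The identity is a polynomial identity, so it holds on the whole grid. Nonnegativity only follows inside the box, where the factor $(u - x)(x - l)$ is nonnegative.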
Convex regression: a new candidate
A new candidate for the regressor:
$$\tilde g_{m,d,r}(x) \in \arg\min_{g,\, s_0, \dots, s_n} \sum_{i=1}^{m} \left(Y_i - g(X_i)\right)^2$$
$$\text{s.t. } g \text{ is a polynomial of degree } d,$$
$$y^T \nabla^2 g(x)\, y = s_0(x, y) + \cdots + s_n(x, y)(u_n - x_n)(x_n - l_n),$$
$$s_0(x, y), s_1(x, y), \dots, s_n(x, y) \text{ are sos of degree } r \text{ in } x \text{ (and 2 in } y\text{)}$$
• When $r$ is fixed, this is a semidefinite program to solve.
• As $r \to \infty$, we recover $\hat g_{m,d}$.
Comparison with existing methods
Our method
• Semidefinite program to obtain estimator
• Number of datapoints does not impact size of semidefinite program
• Size of the semidefinite program scales polynomially in number of features
• Obtaining a prediction: evaluation of our polynomial estimator
• Smooth estimator
• Can be combined with monotonicity constraints and Lipschitz constraints
Existing method [Lim & Glynn, Seijo & Sen]
• Quadratic program to obtain estimator
• Number of variables (resp. constraints) scales linearly (resp. quadratically) with number of datapoints
• Obtaining a prediction: requires solving a linear program
• Piecewise affine estimator (can be smoothed, see [Mazumder et al.])
• Can be combined with monotonicity constraints (see [Lim & Glynn]) and Lipschitz constraints (see [Mazumder et al.])
Consistency of $\tilde g_{m,d,r}$ (1/4)
• Estimator of [Lim & Glynn, Seijo & Sen] is shown to be consistent. What about ours?
Assumptions on the data: $Y_i = f(X_i) + \nu_i$ for $i = 1, \dots, m$
For $X_i$: the $X_i$ are iid, with support $B$, and $E[\|X_i\|^2] < \infty$
For $\nu_i$: $E[\nu_i \mid X_i] = 0$ a.s. and $E[\nu_i^2] < \infty$
For $f$: $f$ is twice continuously differentiable and convex over $B$
Theorem [Curmei, H.]: The regressor $\tilde g_{m,d,r}$ is a consistent estimator of $f$ over any compact $C \subset B$, i.e.,
$$\sup_{x \in C} \left|\tilde g_{m,d,r}(x) - f(x)\right| \to 0 \ \text{a.s., when } m, d, r \to \infty$$
Consistency of $\hat g_{m,d}$ (2/4)
Proof ideas: inspired by [Lim and Glynn, OR'12]
1. Write
$$\left|f(x) - \tilde g_{m,d,r}(x)\right| \le \left|f(x) - \hat g_{m,d}(x)\right| + \left|\hat g_{m,d}(x) - \tilde g_{m,d,r}(x)\right|$$
Can show $\sup_{x \in C} \left|\hat g_{m,d}(x) - \tilde g_{m,d,r}(x)\right| \to 0$ when $r \to \infty$.
2. Introduce a polynomial approximation of $f$: for any $\epsilon > 0$, $\exists d$ and a convex polynomial $p_d$ of degree $d$ such that
$$\sup_{x \in C} \left|f(x) - p_d(x)\right| < \epsilon$$
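Consistency of $\hat g_{m,d}$ (3/4)
3. For $x \in C$ and "$X_i$ close to $x$":
$$\left|f(x) - \hat g_{m,d}(x)\right| \le \left|f(x) - p_d(x)\right| + \left|p_d(x) - p_d(X_i)\right| + \left|p_d(X_i) - \hat g_{m,d}(X_i)\right| + \left|\hat g_{m,d}(X_i) - \hat g_{m,d}(x)\right|$$
• First term: upper bound with $\epsilon$.
• Second term: show that $p_d$ is Lipschitz (use convexity of $p_d$ over $B$).
• Third term: upper bound this (algebra) by $\frac{1}{m} \sum_{i=1}^{m} \left(p_d(X_i) - \hat g_{m,d}(X_i)\right)^2$.
• Fourth term: show that $\hat g_{m,d}$ is Lipschitz (uniformly in $m$): bound $|\hat g_{m,d}|$ over $C$ uniformly in $m$ and use convexity.
Remains to show that
$$\frac{1}{m} \sum_{i=1}^{m} \left(p_d(X_i) - \hat g_{m,d}(X_i)\right)^2 \to 0 \ \text{a.s. when } m \to \infty$$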
Consistency of $\hat g_{m,d}$ (4/4)
4. Show that
$$\frac{1}{m} \sum_{i=1}^{m} \left(p_d(X_i) - \hat g_{m,d}(X_i)\right)^2 \to 0 \ \text{a.s. when } m \to \infty$$
• Use the fact that $\hat g_{m,d}$ is a minimizer of $\sum_i \left(Y_i - g(X_i)\right)^2$ to obtain
$$\frac{1}{m} \sum_{i=1}^{m} \left(p_d(X_i) - \hat g_{m,d}(X_i)\right)^2 \le \frac{2}{m} \sum_{i=1}^{m} \left(Y_i - p_d(X_i)\right) \cdot \left(\hat g_{m,d}(X_i) - p_d(X_i)\right)$$
• Can't use the SLLN because $\hat g_{m,d}$ is a polynomial that depends on the $X_i$ and $Y_i$.
• Idea: approximate $\hat g_{m,d}$ by a deterministic function which is bounded over $C$.
• Show that $\hat g_{m,d}$ belongs (for large enough $m$) to a compact set whose elements are bounded over $C$.
• Construct an $\epsilon$-net of this set.
• Replace $\hat g_{m,d}$ by an element of this set which is $\epsilon$-close and bounded over $C$.
• Use the SLLN now, with $\hat g_{m,d}$ replaced by this deterministic element in the right-hand side of the inequality.
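The inequality obtained from minimality on this slide follows from a short expansion. The sketch below assumes that $p_d$ is feasible for the problem defining $\hat g_{m,d}$ (it is a convex polynomial of degree $d$) and abbreviates $\hat g = \hat g_{m,d}$, $p = p_d$.

```latex
\begin{aligned}
\sum_{i=1}^{m} \bigl(Y_i - \hat g(X_i)\bigr)^2
  &\le \sum_{i=1}^{m} \bigl(Y_i - p(X_i)\bigr)^2
  && \text{(minimality of } \hat g\text{)} \\
\sum_{i=1}^{m} \bigl(Y_i - \hat g(X_i)\bigr)^2
  &= \sum_{i=1}^{m} \bigl(Y_i - p(X_i)\bigr)^2
   - 2 \sum_{i=1}^{m} \bigl(Y_i - p(X_i)\bigr)\bigl(\hat g(X_i) - p(X_i)\bigr)
   + \sum_{i=1}^{m} \bigl(\hat g(X_i) - p(X_i)\bigr)^2
  && \text{(expand } Y_i - \hat g = (Y_i - p) - (\hat g - p)\text{)} \\
\Longrightarrow\quad
\frac{1}{m} \sum_{i=1}^{m} \bigl(p(X_i) - \hat g(X_i)\bigr)^2
  &\le \frac{2}{m} \sum_{i=1}^{m} \bigl(Y_i - p(X_i)\bigr)\bigl(\hat g(X_i) - p(X_i)\bigr).
\end{aligned}
```

The right-hand side is an average of products involving the random polynomial $\hat g$, which is why a direct application of the strong law of large numbers is not available and the $\epsilon$-net argument is needed.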