
Constrained optimization
DS-GA 1013 / MATH-GA 2824 Optimization-based Data Analysis
http://www.cims.nyu.edu/~cfgranda/pages/OBDA_fall17/index.html
Carlos Fernandez-Granda
Topics: Compressed sensing · Convex constrained problems · Analyzing


  1.–7. Proof (uniqueness of the projection onto a convex set)
  Assume there are two distinct projections $\hat y_1 \neq \hat y_2$ of $\vec x$ onto $S$.
  Consider $y' := \frac{\hat y_1 + \hat y_2}{2}$; $y'$ belongs to $S$ because $S$ is convex.
  Then
  $$\langle \vec x - y', \, \hat y_1 - y' \rangle
    = \Big\langle \frac{(\vec x - \hat y_1) + (\vec x - \hat y_2)}{2}, \,
                  \frac{(\vec x - \hat y_2) - (\vec x - \hat y_1)}{2} \Big\rangle
    = \frac{1}{4}\big( \|\vec x - \hat y_2\|_2^2 - \|\vec x - \hat y_1\|_2^2 \big) = 0,$$
  because both projections are at the same distance from $\vec x$.
  By Pythagoras' theorem,
  $$\|\vec x - \hat y_1\|_2^2
    = \|\vec x - y'\|_2^2 + \|\hat y_1 - y'\|_2^2
    = \|\vec x - y'\|_2^2 + \Big\| \frac{\hat y_1 - \hat y_2}{2} \Big\|_2^2
    > \|\vec x - y'\|_2^2,$$
  since $\hat y_1 \neq \hat y_2$. So $y' \in S$ is strictly closer to $\vec x$ than $\hat y_1$, which contradicts $\hat y_1$ being a projection.
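As a minimal numerical illustration of this uniqueness (assuming NumPy; the set and the point are arbitrary choices), projection onto the $\ell_\infty$ unit ball, a convex set, amounts to entrywise clipping, and no other feasible point gets as close:

```python
import numpy as np

rng = np.random.default_rng(0)
# Projecting onto the l_inf unit ball (a convex set) is entrywise clipping.
x = np.array([2.0, -0.5, 3.0])
proj = np.clip(x, -1.0, 1.0)
# Sample many feasible points: none is closer to x than the projection
pts = rng.uniform(-1.0, 1.0, size=(10000, 3))
dists = np.linalg.norm(pts - x, axis=1)
assert np.all(dists >= np.linalg.norm(proj - x) - 1e-12)
```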

  8. Convex combination
  Given $n$ vectors $\vec x_1, \vec x_2, \ldots, \vec x_n \in \mathbb{R}^n$,
  $$\vec x := \sum_{i=1}^n \theta_i \vec x_i$$
  is a convex combination of $\vec x_1, \vec x_2, \ldots, \vec x_n$ if
  $$\theta_i \geq 0, \quad 1 \leq i \leq n, \qquad \sum_{i=1}^n \theta_i = 1$$
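The definition translates directly into code; a small sketch (assuming NumPy, with arbitrary random vectors and weights):

```python
import numpy as np

rng = np.random.default_rng(0)
xs = rng.standard_normal((4, 3))   # four vectors in R^3, one per row
theta = rng.random(4)
theta /= theta.sum()               # nonnegative weights summing to 1
x = theta @ xs                     # the convex combination sum_i theta_i x_i
assert np.all(theta >= 0) and np.isclose(theta.sum(), 1.0)
```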

  9. Convex hull
  The convex hull of $S$ is the set of convex combinations of points in $S$.
  The $\ell_1$-norm ball is the convex hull of the intersection between the $\ell_0$ "norm" ball and the $\ell_\infty$-norm ball.

  10. $\ell_1$-norm ball (figure)

  11.–12. $B_{\ell_1} \subseteq \mathcal{C}(B_{\ell_0} \cap B_{\ell_\infty})$
  Let $\vec x \in B_{\ell_1}$. Set $\theta_i := |\vec x[i]|$ for $1 \leq i \leq n$ and $\theta_0 := 1 - \sum_{i=1}^n \theta_i$.
  By construction the $n+1$ weights satisfy $\sum_{i=0}^{n} \theta_i = 1$, $\theta_i \geq 0$, and
  $$\theta_0 = 1 - \sum_{i=1}^n \theta_i = 1 - \|\vec x\|_1 \geq 0 \quad \text{because } \vec x \in B_{\ell_1}.$$
  Then $\vec x \in \mathcal{C}(B_{\ell_0} \cap B_{\ell_\infty})$ because
  $$\vec x = \sum_{i=1}^n \theta_i \,\mathrm{sign}(\vec x[i])\, \vec e_i + \theta_0 \,\vec 0,$$
  and each $\mathrm{sign}(\vec x[i])\,\vec e_i$, as well as $\vec 0$, belongs to $B_{\ell_0} \cap B_{\ell_\infty}$.
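The decomposition in this proof can be verified numerically; a sketch assuming NumPy, for an arbitrary example vector with $\|\vec x\|_1 \leq 1$:

```python
import numpy as np

x = np.array([0.3, -0.2, 0.0, 0.4])      # ||x||_1 = 0.9 <= 1
n = len(x)
theta = np.abs(x)                        # theta_i := |x[i]|
theta0 = 1.0 - theta.sum()               # weight on the zero vector
atoms = np.sign(x)[:, None] * np.eye(n)  # row i is sign(x[i]) * e_i
recon = theta @ atoms + theta0 * np.zeros(n)
assert theta0 >= 0
assert np.allclose(recon, x)
```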

  13.–18. $\mathcal{C}(B_{\ell_0} \cap B_{\ell_\infty}) \subseteq B_{\ell_1}$
  Let $\vec x \in \mathcal{C}(B_{\ell_0} \cap B_{\ell_\infty})$, then
  $$\vec x = \sum_{i=1}^m \theta_i \vec y_i$$
  and
  $$\|\vec x\|_1 \leq \sum_{i=1}^m \theta_i \|\vec y_i\|_1 \quad \text{by the triangle inequality}$$
  $$\leq \sum_{i=1}^m \theta_i \|\vec y_i\|_\infty \quad \text{because each } \vec y_i \text{ only has one nonzero entry}$$
  $$\leq \sum_{i=1}^m \theta_i \quad \text{because } \vec y_i \in B_{\ell_\infty}$$
  $$\leq 1.$$
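The reverse inclusion can likewise be spot-checked: any convex combination of one-sparse vectors with entries in $[-1, 1]$ (elements of $B_{\ell_0} \cap B_{\ell_\infty}$) lands in the $\ell_1$ ball. A sketch with randomly generated atoms, assuming NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 5, 8
# One-sparse atoms with entries in [-1, 1]: elements of B_l0 ∩ B_linf
atoms = np.zeros((m, n))
atoms[np.arange(m), rng.integers(0, n, size=m)] = rng.uniform(-1.0, 1.0, size=m)
theta = rng.random(m)
theta /= theta.sum()                 # convex-combination weights
x = theta @ atoms
assert np.abs(x).sum() <= 1.0 + 1e-12
```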

  19. Convex optimization problem
  Given $f_0, f_1, \ldots, f_m, h_1, \ldots, h_p : \mathbb{R}^n \to \mathbb{R}$,
  $$\text{minimize} \quad f_0(\vec x)$$
  $$\text{subject to} \quad f_i(\vec x) \leq 0, \quad 1 \leq i \leq m,$$
  $$\qquad\qquad\quad h_i(\vec x) = 0, \quad 1 \leq i \leq p$$

  20. Definitions
  ◮ A feasible vector is a vector that satisfies all the constraints
  ◮ A solution is any vector $\vec x^*$ such that $f_0(\vec x) \geq f_0(\vec x^*)$ for all feasible vectors $\vec x$
  ◮ If a solution exists, $f_0(\vec x^*)$ is the optimal value or optimum of the problem

  21. Convex optimization problem
  The optimization problem is convex if
  ◮ $f_0$ is convex
  ◮ $f_1, \ldots, f_m$ are convex
  ◮ $h_1, \ldots, h_p$ are affine, i.e. $h_i(\vec x) = \vec a_i^T \vec x + b_i$ for some $\vec a_i \in \mathbb{R}^n$ and $b_i \in \mathbb{R}$

  22. Linear program
  $$\text{minimize} \quad \vec a^T \vec x$$
  $$\text{subject to} \quad \vec c_i^T \vec x \leq d_i, \quad 1 \leq i \leq m,$$
  $$\qquad\qquad\quad A \vec x = \vec b$$

  23. $\ell_1$-norm minimization as an LP
  The optimization problem
  $$\text{minimize} \quad \|\vec x\|_1 \quad \text{subject to} \quad A\vec x = \vec b$$
  can be recast as the LP
  $$\text{minimize} \quad \sum_{i=1}^m \vec t[i]$$
  $$\text{subject to} \quad \vec t[i] \geq \vec e_i^T \vec x, \quad \vec t[i] \geq -\vec e_i^T \vec x, \quad A\vec x = \vec b$$
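This reformulation can be handed to any LP solver. A sketch using `scipy.optimize.linprog` on randomly generated problem data (the explicit `bounds` are needed because `linprog` defaults to nonnegative variables): the variable vector is $z = (\vec x, \vec t)$, the objective weight is $1$ on each $\vec t[i]$, and $\vec t[i] \geq \pm \vec x[i]$ becomes two inequality rows.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
n, m = 6, 3
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)

# Variables z = [x, t] in R^{2n}; objective is sum of the t entries
c = np.concatenate([np.zeros(n), np.ones(n)])
I = np.eye(n)
# t[i] >=  x[i]  ->   x[i] - t[i] <= 0
# t[i] >= -x[i]  ->  -x[i] - t[i] <= 0
A_ub = np.block([[I, -I], [-I, -I]])
b_ub = np.zeros(2 * n)
A_eq = np.hstack([A, np.zeros((m, n))])
res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b,
              bounds=[(None, None)] * (2 * n))
x_hat = res.x[:n]
assert np.allclose(A @ x_hat, b, atol=1e-6)
# At the optimum t[i] = |x[i]|, so the LP objective equals ||x||_1
assert np.isclose(res.fun, np.abs(x_hat).sum(), atol=1e-6)
```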

  24.–27. Proof
  Let $\vec x_{\ell_1}$ be a solution to the $\ell_1$-norm minimization problem and $(\vec x_{\mathrm{lp}}, \vec t_{\mathrm{lp}})$ a solution to the linear program.
  Set $\vec t_{\ell_1}[i] := |\vec x_{\ell_1}[i]|$. Then $(\vec x_{\ell_1}, \vec t_{\ell_1})$ is feasible for the linear program, so
  $$\|\vec x_{\ell_1}\|_1 = \sum_{i=1}^m \vec t_{\ell_1}[i]$$
  $$\geq \sum_{i=1}^m \vec t_{\mathrm{lp}}[i] \quad \text{by optimality of } (\vec x_{\mathrm{lp}}, \vec t_{\mathrm{lp}})$$
  $$\geq \|\vec x_{\mathrm{lp}}\|_1 \quad \text{because } \vec t_{\mathrm{lp}}[i] \geq |\vec x_{\mathrm{lp}}[i]|,$$
  so $\vec x_{\mathrm{lp}}$ is a solution to the $\ell_1$-norm minimization problem.

  28.–31. Proof
  Conversely, with $\vec t_{\ell_1}[i] := |\vec x_{\ell_1}[i]|$,
  $$\sum_{i=1}^m \vec t_{\ell_1}[i] = \|\vec x_{\ell_1}\|_1$$
  $$\leq \|\vec x_{\mathrm{lp}}\|_1 \quad \text{by optimality of } \vec x_{\ell_1}$$
  $$\leq \sum_{i=1}^m \vec t_{\mathrm{lp}}[i],$$
  so $(\vec x_{\ell_1}, \vec t_{\ell_1})$ is a solution to the linear program.

  32. Quadratic program
  For a positive semidefinite matrix $Q \in \mathbb{R}^{n \times n}$,
  $$\text{minimize} \quad \vec x^T Q \vec x + \vec a^T \vec x$$
  $$\text{subject to} \quad \vec c_i^T \vec x \leq d_i, \quad 1 \leq i \leq m,$$
  $$\qquad\qquad\quad A \vec x = \vec b$$

  33. $\ell_1$-norm regularized least squares as a QP
  The optimization problem
  $$\text{minimize} \quad \|A\vec x - \vec y\|_2^2 + \alpha \|\vec x\|_1$$
  can be recast as the QP
  $$\text{minimize} \quad \vec x^T A^T A \vec x - 2\, \vec y^T A \vec x + \alpha \sum_{i=1}^n \vec t[i]$$
  $$\text{subject to} \quad \vec t[i] \geq \vec e_i^T \vec x, \quad \vec t[i] \geq -\vec e_i^T \vec x$$
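Expanding the square shows the two objectives agree up to the constant $\|\vec y\|_2^2$, which is why dropping it does not change the minimizer. A quick check with arbitrary random data (assuming NumPy), taking $\vec t[i] = |\vec x[i]|$ as it is at the optimum:

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, alpha = 5, 4, 0.1
A = rng.standard_normal((m, n))
y = rng.standard_normal(m)
x = rng.standard_normal(n)

# Original objective: ||Ax - y||_2^2 + alpha * ||x||_1
orig = np.sum((A @ x - y) ** 2) + alpha * np.abs(x).sum()
# QP objective with t[i] = |x[i]|; it drops the constant ||y||_2^2
qp = x @ (A.T @ A) @ x - 2 * (y @ A) @ x + alpha * np.abs(x).sum()
assert np.isclose(orig, qp + y @ y)
```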

  34. Lagrangian
  The Lagrangian of a canonical optimization problem is
  $$L(\vec x, \vec\alpha, \vec\nu) := f_0(\vec x) + \sum_{i=1}^m \vec\alpha[i]\, f_i(\vec x) + \sum_{j=1}^p \vec\nu[j]\, h_j(\vec x),$$
  where $\vec\alpha \in \mathbb{R}^m$ and $\vec\nu \in \mathbb{R}^p$ are called Lagrange multipliers or dual variables.
  If $\vec x$ is feasible and $\vec\alpha[i] \geq 0$ for $1 \leq i \leq m$,
  $$L(\vec x, \vec\alpha, \vec\nu) \leq f_0(\vec x)$$

  35. Lagrange dual function
  The Lagrange dual function of the problem is
  $$l(\vec\alpha, \vec\nu) := \inf_{\vec x \in \mathbb{R}^n} \; f_0(\vec x) + \sum_{i=1}^m \vec\alpha[i]\, f_i(\vec x) + \sum_{j=1}^p \vec\nu[j]\, h_j(\vec x)$$
  Let $p^*$ be an optimum of the optimization problem. Then
  $$l(\vec\alpha, \vec\nu) \leq p^*$$
  as long as $\vec\alpha[i] \geq 0$ for $1 \leq i \leq m$
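A toy instance (a hypothetical example chosen so the dual has a closed form): minimize $x^2$ subject to $1 - x \leq 0$, with $p^* = 1$ at $x = 1$. The dual function is $l(\alpha) = \inf_x \, x^2 + \alpha(1 - x) = \alpha - \alpha^2/4$, and the bound $l(\alpha) \leq p^*$ can be checked on a grid:

```python
import numpy as np

# Toy primal: minimize x^2 subject to 1 - x <= 0, so p* = 1 at x = 1.
# Dual function (closed form): the inner inf is attained at x = alpha/2,
# giving l(alpha) = alpha - alpha^2 / 4.
alphas = np.linspace(0.0, 5.0, 501)
l_vals = alphas - alphas ** 2 / 4
p_star = 1.0
assert np.all(l_vals <= p_star + 1e-12)  # weak duality: l(alpha) <= p*
assert np.isclose(l_vals.max(), p_star)  # best bound attained at alpha = 2
```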

  36. Dual problem
  The dual problem of the (primal) optimization problem is
  $$\text{maximize} \quad l(\vec\alpha, \vec\nu)$$
  $$\text{subject to} \quad \vec\alpha[i] \geq 0, \quad 1 \leq i \leq m$$
  The dual problem is always convex, even if the primal isn't!

  37. Maximum/supremum of convex functions
  The pointwise maximum of $m$ convex functions $f_1, \ldots, f_m$,
  $$f_{\max}(x) := \max_{1 \leq i \leq m} f_i(x),$$
  is convex. The pointwise supremum of a family of convex functions indexed by a set $\mathcal{I}$,
  $$f_{\sup}(x) := \sup_{i \in \mathcal{I}} f_i(x),$$
  is convex.

  38. Proof
  For any $0 \leq \theta \leq 1$ and any $\vec x, \vec y \in \mathbb{R}^n$,
  $$f_{\sup}(\theta \vec x + (1-\theta)\vec y) = \sup_{i \in \mathcal{I}} f_i(\theta \vec x + (1-\theta)\vec y)
    \leq \sup_{i \in \mathcal{I}} \big( \theta f_i(\vec x) + (1-\theta) f_i(\vec y) \big)
    \leq \theta f_{\sup}(\vec x) + (1-\theta) f_{\sup}(\vec y)$$
  by convexity of each $f_i$.
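The convexity inequality can be spot-checked for a concrete pair of convex functions (an arbitrary example, assuming NumPy): $f(t) = \max(t^2, |t - 1|)$ on a grid of points and mixing weights.

```python
import numpy as np

# f is the pointwise max of two convex functions: t^2 and |t - 1|
f = lambda t: np.maximum(t ** 2, np.abs(t - 1.0))
grid = np.linspace(-2.0, 2.0, 9)
for theta in (0.25, 0.5, 0.75):
    for a in grid:
        for b in grid:
            lhs = f(theta * a + (1 - theta) * b)
            rhs = theta * f(a) + (1 - theta) * f(b)
            assert lhs <= rhs + 1e-12   # Jensen-type inequality holds
```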
