CS/ECE/ISyE 524 Introduction to Optimization, Spring 2017–18

9. Equality constraints and tradeoffs

• More least squares
• Example: moving average model
• Minimum-norm least squares
• Equality-constrained least squares
• Optimal tradeoffs
• Example: hovercraft

Laurent Lessard (www.laurentlessard.com)
More least squares

Solving the least squares optimization problem

$$\underset{x}{\text{minimize}} \quad \|Ax - b\|^2$$

is equivalent to solving the normal equations

$$A^T A \hat{x} = A^T b$$

• If $A^T A$ is invertible ($A$ has linearly independent columns), then $\hat{x} = (A^T A)^{-1} A^T b$.
• $A^\dagger := (A^T A)^{-1} A^T$ is called the pseudoinverse of $A$.
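As a quick illustration (not from the lecture notebooks), here is a minimal Julia sketch with synthetic data comparing the normal-equations solve to Julia's built-in backslash:

```julia
# Minimal sketch with synthetic data: two equivalent ways to solve
# least squares when A has linearly independent columns.
using LinearAlgebra

A = randn(10, 3)            # hypothetical tall data matrix
b = randn(10)

x1 = (A' * A) \ (A' * b)    # solve the normal equations AᵀA x̂ = Aᵀb
x2 = A \ b                  # backslash (QR-based); preferred in practice,
                            # since forming AᵀA squares the condition number
@assert x1 ≈ x2
```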
Example: moving average model

• We are given a time series of input data $u_1, u_2, \ldots, u_T$ and output data $y_1, y_2, \ldots, y_T$.

[Figure: example plot of the input and output time series.]

• A "moving average" model with window size $k$ assumes each output is a weighted combination of the $k$ most recent inputs:

$$y_t \approx w_1 u_t + w_2 u_{t-1} + \cdots + w_k u_{t-k+1} \quad \text{for all } t$$

• Goal: find weights $w_1, \ldots, w_k$ that best agree with the data.
Example: moving average model

• Moving average model: $y_t \approx w_1 u_t + w_2 u_{t-1} + w_3 u_{t-2}$ for all $t$.
• Writing all the equations (e.g. $k = 3$):

$$\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ \vdots \\ y_T \end{bmatrix} \approx \begin{bmatrix} u_1 & 0 & 0 \\ u_2 & u_1 & 0 \\ u_3 & u_2 & u_1 \\ \vdots & \vdots & \vdots \\ u_T & u_{T-1} & u_{T-2} \end{bmatrix} \begin{bmatrix} w_1 \\ w_2 \\ w_3 \end{bmatrix}$$

• Solve the least squares problem! Julia notebook: Moving Average.ipynb (a small stand-alone sketch follows below).
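The course notebook is not reproduced here; the following is a minimal stand-alone sketch of the same idea, with synthetic data standing in for the course data:

```julia
# Minimal sketch with synthetic data: fit a window-3 moving average
# model by least squares (stand-in for Moving Average.ipynb).
T, k = 200, 3
u = randn(T)                      # hypothetical input series
w_true = [0.5, 0.3, 0.2]          # hypothetical "true" weights

# Build the banded data matrix: row t contains u_t, u_{t-1}, ..., u_{t-k+1}
U = zeros(T, k)
for t in 1:T, j in 1:k
    if t >= j
        U[t, j] = u[t - j + 1]
    end
end

y = U * w_true + 0.1 * randn(T)   # hypothetical noisy outputs
w = U \ y                         # least squares fit of the weights
```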
Minimum-norm least squares

Underdetermined case: $A \in \mathbb{R}^{m \times n}$ is a wide matrix ($m \le n$), so $Ax = b$ generally has infinitely many solutions.

• The set of solutions of $Ax = b$ forms an affine subspace. Recall: if $Ay = b$ and $Az = b$ then $A(\alpha y + (1 - \alpha) z) = b$.
• One possible choice: pick the $x$ with smallest norm.

[Figure: the solution set of $Ax = b$ drawn as a line in $\mathbb{R}^n$, with $\hat{x}$ the point on it closest to the origin.]

• Insight: the optimal $\hat{x}$ must satisfy $A\hat{x} = b$ and $\hat{x}^T(\hat{x} - w) = 0$ for all $w$ satisfying $Aw = b$.
Minimum-norm least squares

• We want: $\hat{x}^T(\hat{x} - w) = 0$ for all $w$ such that $Aw = b$.
• We also know that $A\hat{x} = b$. Therefore: $A(\hat{x} - w) = 0$.

In other words: $\hat{x} \perp (\hat{x} - w)$ and $(\hat{x} - w) \perp$ (all rows of $A$). Therefore, $\hat{x}$ is a linear combination of the rows of $A$. Stated another way, $\hat{x} = A^T z$ for some $z$.

• Therefore, we must find $z$ and $\hat{x}$ such that:

$$A\hat{x} = b \quad \text{and} \quad A^T z = \hat{x}$$

(this also follows from $\mathcal{R}(A)^\perp = \mathcal{N}(A^T)$)
Minimum-norm least squares

Theorem: If there exist $\hat{x}$ and $z$ that satisfy $A\hat{x} = b$ and $A^T z = \hat{x}$, then $\hat{x}$ is a solution to the minimum-norm problem

$$\underset{x}{\text{minimize}} \quad \|x\|^2 \qquad \text{subject to: } Ax = b$$

Proof: Suppose $A\hat{x} = b$ and $A^T z = \hat{x}$. For any $x$ that satisfies $Ax = b$, we have:

$$\|x\|^2 = \|x - \hat{x} + \hat{x}\|^2 = \|x - \hat{x}\|^2 + \|\hat{x}\|^2 + 2\hat{x}^T(x - \hat{x})$$
$$= \|x - \hat{x}\|^2 + \|\hat{x}\|^2 + 2z^T A(x - \hat{x}) = \|x - \hat{x}\|^2 + \|\hat{x}\|^2 \ge \|\hat{x}\|^2$$

where the cross term vanishes because $A(x - \hat{x}) = b - b = 0$.
Minimum-norm least squares

Solving the minimum-norm least squares problem

$$\underset{x}{\text{minimize}} \quad \|x\|^2 \qquad \text{subject to: } Ax = b$$

is equivalent to solving the linear equations

$$A\hat{x} = b \text{ and } A^T z = \hat{x} \quad \Longleftrightarrow \quad AA^T z = b, \;\; \hat{x} = A^T z$$

• If $AA^T$ is invertible ($A$ has linearly independent rows), then $\hat{x} = A^T (AA^T)^{-1} b$.
• $A^\dagger := A^T (AA^T)^{-1}$ is also called the pseudoinverse of $A$.
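A minimal Julia sketch of this solve, with made-up dimensions:

```julia
# Minimal sketch with synthetic data: minimum-norm solution of a wide,
# full-row-rank system Ax = b via AAᵀz = b, then x̂ = Aᵀz.
using LinearAlgebra

m, n = 3, 8
A = randn(m, n)            # hypothetical wide matrix (independent rows)
b = randn(m)

z = (A * A') \ b           # solve AAᵀ z = b
xhat = A' * z              # minimum-norm solution x̂ = Aᵀz

@assert A * xhat ≈ b       # x̂ solves the system
@assert xhat ≈ pinv(A) * b # and agrees with the pseudoinverse solution
```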
Equality-constrained least squares

A more general optimization problem:

$$\underset{x}{\text{minimize}} \quad \|Ax - b\|^2 \qquad \text{subject to: } Cx = d$$

(equality-constrained least squares)

• If $C = 0$ and $d = 0$, we recover ordinary least squares.
• If $A = I$ and $b = 0$, we recover minimum-norm least squares.
Equality-constrained least squares

Solving the equality-constrained least squares problem

$$\underset{x}{\text{minimize}} \quad \|Ax - b\|^2 \qquad \text{subject to: } Cx = d$$

is equivalent to solving the linear equations

$$A^T A \hat{x} + C^T z = A^T b \quad \text{and} \quad C\hat{x} = d$$
Equality-constrained least squares

Proof: Suppose $\hat{x}$ and $z$ satisfy $A^T A \hat{x} + C^T z = A^T b$ and $C\hat{x} = d$. Let $x$ be any other point satisfying $Cx = d$. Then,

$$\|Ax - b\|^2 = \|A(x - \hat{x}) + (A\hat{x} - b)\|^2$$
$$= \|A(x - \hat{x})\|^2 + \|A\hat{x} - b\|^2 + 2(x - \hat{x})^T A^T (A\hat{x} - b)$$
$$= \|A(x - \hat{x})\|^2 + \|A\hat{x} - b\|^2 - 2(x - \hat{x})^T C^T z$$
$$= \|A(x - \hat{x})\|^2 + \|A\hat{x} - b\|^2 - 2(Cx - C\hat{x})^T z$$
$$= \|A(x - \hat{x})\|^2 + \|A\hat{x} - b\|^2 \ge \|A\hat{x} - b\|^2$$

where the last cross term vanishes since $Cx = C\hat{x} = d$. Therefore $\hat{x}$ is an optimal choice.
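The two optimality conditions can be stacked into one symmetric block linear system. A minimal Julia sketch with made-up dimensions:

```julia
# Minimal sketch with synthetic data: equality-constrained least squares
# via the block system  [AᵀA Cᵀ; C 0] [x̂; z] = [Aᵀb; d].
using LinearAlgebra

m, n, p = 10, 5, 2
A, b = randn(m, n), randn(m)       # hypothetical objective data
C, d = randn(p, n), randn(p)       # hypothetical constraint data

K   = [A'*A  C'; C  zeros(p, p)]   # block ("KKT") matrix
sol = K \ [A'*b; d]
xhat, z = sol[1:n], sol[n+1:end]

@assert C * xhat ≈ d               # constraint is satisfied
```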
Recap so far

Several different variants of least squares problems are easy to solve, in the sense that they are equivalent to solving systems of linear equations.

• Least squares: $\min_x \|Ax - b\|^2$
• Minimum-norm: $\min_x \|x\|^2$ subject to $Ax = b$
• Equality constrained: $\min_x \|Ax - b\|^2$ subject to $Cx = d$
Optimal tradeoffs

We often want to optimize several different objectives simultaneously, but these objectives are conflicting:

• risk vs expected return (finance)
• power vs fuel economy (automobiles)
• quality vs memory (audio compression)
• space vs time (computer programs)
• mittens vs gloves (winter)
Optimal tradeoffs

• Suppose $J_1 = \|Ax - b\|^2$ and $J_2 = \|Cx - d\|^2$.
• We would like to make both $J_1$ and $J_2$ small.
• A sensible approach: solve the optimization problem

$$\underset{x}{\text{minimize}} \quad J_1 + \lambda J_2$$

where $\lambda > 0$ is a (fixed) tradeoff parameter.
• Then tune $\lambda$ to explore possible results.
  - When $\lambda \to 0$, we place more weight on $J_1$.
  - When $\lambda \to \infty$, we place more weight on $J_2$.
Optimal tradeoffs

This problem is also equivalent to solving linear equations!

$$J_1 + \lambda J_2 = \|Ax - b\|^2 + \lambda \|Cx - d\|^2 = \left\| \begin{bmatrix} Ax - b \\ \sqrt{\lambda}\,(Cx - d) \end{bmatrix} \right\|^2 = \left\| \begin{bmatrix} A \\ \sqrt{\lambda}\, C \end{bmatrix} x - \begin{bmatrix} b \\ \sqrt{\lambda}\, d \end{bmatrix} \right\|^2$$

• An ordinary least squares problem!
• Equivalent to solving $(A^T A + \lambda C^T C)\hat{x} = A^T b + \lambda C^T d$.
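In Julia, the stacked formulation is a one-line solve (reusing the hypothetical A, b, C, d from the equality-constrained sketch above):

```julia
# Minimal sketch: solve the tradeoff problem for one fixed λ by stacking
# it as a single ordinary least squares problem.
λ = 1.0
xλ = [A; sqrt(λ) * C] \ [b; sqrt(λ) * d]
# equivalently: xλ = (A'*A + λ*C'*C) \ (A'*b + λ*C'*d)
```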
Tradeoff analysis

1. Choose values for $\lambda$ (usually log-spaced). A useful command: lambda = logspace(p,q,n) produces n points logarithmically spaced between $10^p$ and $10^q$.
2. For each $\lambda$ value, find the $\hat{x}_\lambda$ that minimizes $J_1 + \lambda J_2$.
3. For each $\hat{x}_\lambda$, also compute the corresponding $J_1^\lambda$ and $J_2^\lambda$.
4. Plot $(J_1^\lambda, J_2^\lambda)$ for each $\lambda$ and connect the dots.

[Figure: the resulting tradeoff curve in the $(J_1, J_2)$ plane, running from $\lambda \to 0$ at one end to $\lambda \to \infty$ at the other.]
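A minimal sketch of the sweep, again reusing the hypothetical A, b, C, d above. (Note: logspace was removed in later Julia versions; exp10.(range(...)) is the modern equivalent.)

```julia
# Minimal sketch: sweep λ over a log-spaced grid and record (J1, J2).
using LinearAlgebra

λs = exp10.(range(-3, 3, length=50))   # modern replacement for logspace(-3, 3, 50)
J1, J2 = Float64[], Float64[]
for λ in λs
    xλ = [A; sqrt(λ) * C] \ [b; sqrt(λ) * d]
    push!(J1, norm(A * xλ - b)^2)
    push!(J2, norm(C * xλ - d)^2)
end
# plotting J1 vs J2 (e.g. with Plots.jl) traces out the tradeoff curve
```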
Pareto curve

[Figure: a candidate point on the tradeoff curve in the $(J_1, J_2)$ plane; directions up/right from it are worse in $J_1$ and/or $J_2$, directions down/left are better in $J_1$ and/or $J_2$.]

[Figure: the tradeoff curve from $\lambda \to 0$ to $\lambda \to \infty$ separates the $(J_1, J_2)$ plane: points above the curve are feasible but strictly suboptimal, points below it are infeasible, and points on the curve are the Pareto-optimal points.]
Example: hovercraft

We are in command of a hovercraft. We are given a set of $k$ waypoint locations and times. The objective is to hit the waypoints at the prescribed times while minimizing fuel use. The goal is to choose appropriate thruster inputs at each instant.
Example: hovercraft

We are in command of a hovercraft. We are given a set of $k$ waypoint locations and times. The objective is to hit the waypoints at the prescribed times while minimizing fuel use.

• Discretize time: $t = 0, 1, 2, \ldots, T$.
• Important variables: position $x_t$, velocity $v_t$, thrust $u_t$.
• Simplified model of the dynamics:

$$x_{t+1} = x_t + v_t \quad \text{and} \quad v_{t+1} = v_t + u_t \qquad \text{for } t = 0, 1, \ldots, T-1$$

• We must choose $u_0, u_1, \ldots, u_T$.
• Initial position and velocity: $x_0 = 0$ and $v_0 = 0$.
• Waypoint constraints: $x_{t_i} = w_i$ for $i = 1, \ldots, k$.
• Minimize fuel use: $\|u_0\|^2 + \|u_1\|^2 + \cdots + \|u_T\|^2$
Example: hovercraft

First model: hit the waypoints exactly.

$$\underset{x_t,\, v_t,\, u_t}{\text{minimize}} \quad \sum_{t=0}^{T} \|u_t\|^2$$
$$\text{subject to: } x_{t+1} = x_t + v_t \quad \text{for } t = 0, 1, \ldots, T-1$$
$$\qquad v_{t+1} = v_t + u_t \quad \text{for } t = 0, 1, \ldots, T-1$$
$$\qquad x_0 = v_0 = 0$$
$$\qquad x_{t_i} = w_i \quad \text{for } i = 1, \ldots, k$$

Julia model: Hovercraft.ipynb (a stand-alone sketch follows below).
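The course notebook is not reproduced here; the following is a minimal JuMP sketch of the first model with hypothetical 2-D waypoints, written in current JuMP syntax (which differs from the 2018-era notebook) and assuming a QP-capable solver such as Ipopt is installed:

```julia
# Minimal JuMP sketch of the first model (stand-in for Hovercraft.ipynb):
# 2-D hovercraft, waypoints hit exactly. Times t = 0..T map to indices 1..T+1.
using JuMP, Ipopt

T  = 60
ti = [20, 40, 60]                        # hypothetical waypoint times
w  = [10.0 25.0 40.0;                    # hypothetical waypoints (one per column)
       0.0 15.0  5.0]

model = Model(Ipopt.Optimizer)
@variable(model, x[1:2, 1:T+1])          # position
@variable(model, v[1:2, 1:T+1])          # velocity
@variable(model, u[1:2, 1:T+1])          # thrust
@constraint(model, [t = 1:T], x[:, t+1] .== x[:, t] + v[:, t])
@constraint(model, [t = 1:T], v[:, t+1] .== v[:, t] + u[:, t])
@constraint(model, x[:, 1] .== 0)        # start at rest at the origin
@constraint(model, v[:, 1] .== 0)
@constraint(model, [i = 1:3], x[:, ti[i]+1] .== w[:, i])
@objective(model, Min, sum(u .^ 2))      # total fuel use
optimize!(model)
```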
Example: hovercraft

Second model: allow waypoint misses.

$$\underset{x_t,\, v_t,\, u_t}{\text{minimize}} \quad \sum_{t=0}^{T} \|u_t\|^2 + \lambda \sum_{i=1}^{k} \|x_{t_i} - w_i\|^2$$
$$\text{subject to: } x_{t+1} = x_t + v_t \quad \text{for } t = 0, 1, \ldots, T-1$$
$$\qquad v_{t+1} = v_t + u_t \quad \text{for } t = 0, 1, \ldots, T-1$$
$$\qquad x_0 = v_0 = 0$$

• $\lambda$ controls the tradeoff between making $u$ small and hitting all the waypoints.
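A minimal JuMP sketch of the second model, using the same hypothetical data as above; the only changes are dropping the waypoint constraint and penalizing the misses in the objective:

```julia
# Minimal JuMP sketch of the second model: waypoint misses are allowed
# but penalized with weight λ instead of being enforced.
using JuMP, Ipopt

T, λ = 60, 100.0                         # λ is a tradeoff parameter to tune
ti = [20, 40, 60]                        # hypothetical waypoint times
w  = [10.0 25.0 40.0; 0.0 15.0 5.0]      # hypothetical waypoints

model = Model(Ipopt.Optimizer)
@variable(model, x[1:2, 1:T+1])
@variable(model, v[1:2, 1:T+1])
@variable(model, u[1:2, 1:T+1])
@constraint(model, [t = 1:T], x[:, t+1] .== x[:, t] + v[:, t])
@constraint(model, [t = 1:T], v[:, t+1] .== v[:, t] + u[:, t])
@constraint(model, x[:, 1] .== 0)
@constraint(model, v[:, 1] .== 0)
@objective(model, Min, sum(u .^ 2) +
    λ * sum((x[j, ti[i]+1] - w[j, i])^2 for j in 1:2, i in 1:3))
optimize!(model)
```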