(Log) Barrier methods November 9, 2018 339 / 429
Barrier Methods for Constrained Optimization

Consider a more general constrained optimization problem

    min_{x ∈ ℝⁿ} f(x)   s.t.  g_i(x) ≤ 0, i = 1 … m,  and  Ax = b

Possible reformulations of this problem include:

1. min_x f(x) + λ B(x), where B is a barrier function such as B(x) = ½‖Ax − b‖₂² with λ = ρ (as in the Augmented Lagrangian, for a specific type of strong convexity w.r.t. ‖·‖₂)
2. B(x) = ∑_i I(g_i(x)) (Projected Gradient Descent is built on this together with a linear approximation to f(x))
3. B(x) = ∑_i ϕ_{g_i}(x) = −(1/t) ∑_i log(−g_i(x)); here 1/t is used instead of λ

Let's discuss this last one in more detail ▶
Barrier Method: Example

As a very simple example, consider the following inequality-constrained optimization problem:

    minimize x²   subject to   x ≥ 1

The logarithmic barrier formulation of this problem is

    minimize x² − µ ln(x − 1)

The unconstrained minimizer of this convex logarithmic barrier function is x̂(µ) = ½ + ½√(1 + 2µ). As µ → 0, the optimal point of the logarithmic barrier problem approaches the actual point of optimality x* = 1 (which, as we can see, lies on the boundary of the feasible region). The generalized idea, that f(x̂) → p* as µ → 0 (where p* is the optimal value of the primal), will be proved next.
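This closed form can be checked numerically (a minimal sketch; the schedule of µ values below is arbitrary): setting the derivative 2x − µ/(x − 1) to zero and taking the root greater than 1 gives x̂(µ), which drifts toward the boundary point x* = 1 as µ shrinks.

```python
import math

def barrier_minimizer(mu):
    # Closed-form minimizer of x^2 - mu*ln(x - 1):
    # 2x - mu/(x - 1) = 0  =>  x = 1/2 + (1/2)*sqrt(1 + 2*mu)  (root with x > 1)
    return 0.5 + 0.5 * math.sqrt(1.0 + 2.0 * mu)

for mu in [1.0, 0.1, 0.01, 0.001]:
    x_hat = barrier_minimizer(mu)
    print(f"mu={mu:7.3f}  x_hat={x_hat:.6f}  f(x_hat)={x_hat**2:.6f}")
```

Each decrease in µ moves x̂(µ) closer to the constraint boundary x = 1.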
Barrier Method and Linear Program

Recap:

| Problem type   | Objective | Constraints | L*(λ)  | Dual constraints | Strong duality  |
|----------------|-----------|-------------|--------|------------------|-----------------|
| Linear Program | cᵀx       | Ax ≤ b      | −bᵀλ   | Aᵀλ + c = 0      | Feasible primal |

What are the necessary conditions at primal-dual optimality?
Log Barrier (Interior Point) Method

The log barrier function is defined as

    ϕ(x) = ∑_i ϕ_{g_i}(x) = −(1/t) ∑_i log(−g_i(x))

It approximates ∑_i I(g_i(x)) (a better approximation as t → ∞).

f(x) + ϕ(x) is convex if f and the g_i are convex. Why? ϕ_{g_i}(x) is the negative of a monotonically increasing concave function (log) applied to the concave function −g_i(x).

Let λ_i be the Lagrange multiplier associated with the inequality constraint g_i(x) ≤ 0. We've taken care of the inequality constraints; let's also consider an equality constraint Ax = b with corresponding Lagrange multiplier (vector) ν.
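To see the "better approximation as t → ∞" claim concretely, here is a small sketch (the sample point with g(x) = −0.5 is arbitrary): at any strictly feasible point the barrier term decays to 0, matching the indicator's value there, while it still diverges as g_i(x) → 0⁻.

```python
import math

def barrier_term(g_val, t):
    # -(1/t) * log(-g(x)); only defined at strictly feasible points, g(x) < 0
    return -(1.0 / t) * math.log(-g_val)

# At a strictly feasible point the penalty vanishes as t grows ...
for t in [1, 10, 100, 1000]:
    print(f"t={t:5d}  barrier at g=-0.5: {barrier_term(-0.5, t):.5f}")

# ... while near the boundary it is still large for small t
print(barrier_term(-1e-9, 1))  # about 20.7, growing without bound as g -> 0-
```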
Log Barrier Method (contd.) (KKT-based interpretation)

Our objective becomes

    min_x f(x) + (1/t) ∑_i −log(−g_i(x))   s.t.  Ax = b

At different values of t, we get different x*(t). Let

    λ*_i(t) = −1 / ( t g_i(x*(t)) )

First-order necessary conditions for optimality (and strong duality) of the original problem at (x*(t), λ*_i(t)):

1. g_i(x*(t)) ≤ 0
2. Ax*(t) = b
3. ∇f(x*(t)) + ∑_{i=1}^m λ*_i(t) ∇g_i(x*(t)) + Aᵀν*(t) = 0
4. λ*_i(t) ≥ 0 ⋆

⋆ Since g_i(x*(t)) < 0 and t > 0

All the above conditions hold at the optimal solution (x*(t), ν*(t)) of the barrier problem ⇒ (λ*_i(t), ν*(t)) are dual feasible (only complementary slackness is violated).
Log Barrier Method & Duality Gap (KKT-based interpretation)

If the necessary conditions are satisfied, and if f and the g_i's are convex and the g_i's strictly feasible, the conditions are also sufficient. Thus, (x*(t), λ*_i(t), ν*(t)) form a critical point of the Lagrangian

    L(x, λ, ν) = f(x) + ∑_{i=1}^m λ_i g_i(x) + νᵀ(Ax − b)

Lagrange dual function: L*(λ, ν) = min_x L(x, λ, ν)

    L*(λ*(t), ν*(t)) = f(x*(t)) + ∑_{i=1}^m λ*_i(t) g_i(x*(t)) + ν*(t)ᵀ(Ax*(t) − b)
                     = f(x*(t)) − m/t

(each term λ*_i(t) g_i(x*(t)) = −1/t, and Ax*(t) = b)

▶ m/t here is called the duality gap
▶ As t → ∞, the duality gap → 0, but computing the optimal solution x*(t) of the barrier problem becomes that much harder
Log Barrier Method & Duality Gap (KKT-based interpretation)

At optimality, primal optimal = dual optimal, i.e. p* = d*. From weak duality,

    f(x*(t)) − m/t ≤ p*   ⇒   f(x*(t)) − p* ≤ m/t

▶ The duality gap is always ≤ m/t
▶ The more we increase t, the smaller the duality gap

Log Barrier method: start with a small t (conservative about the feasibility set), iteratively solve the barrier formulation (starting from the solution of the previous iteration), and increase the value of t.
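For the earlier toy problem (min x² s.t. x ≥ 1, so m = 1 and p* = 1) the bound f(x*(t)) − p* ≤ m/t can be verified numerically; a sketch, using the closed-form barrier minimizer with µ = 1/t:

```python
import math

def x_star(t):
    # Minimizer of x^2 - (1/t)*ln(x - 1), i.e. the earlier x_hat(mu) with mu = 1/t
    mu = 1.0 / t
    return 0.5 + 0.5 * math.sqrt(1.0 + 2.0 * mu)

p_star = 1.0  # optimal value of: min x^2  s.t.  x >= 1
m = 1         # one inequality constraint
for t in [1.0, 10.0, 100.0, 1000.0]:
    gap = x_star(t) ** 2 - p_star
    assert gap <= m / t + 1e-12   # duality-gap bound f(x*(t)) - p* <= m/t
    print(f"t={t:8.1f}  f(x*(t)) - p* = {gap:.6f}  bound m/t = {m / t:.6f}")
```

The actual suboptimality hugs the m/t bound from below, and both shrink as t grows.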
The Log Barrier Method

Also known as the sequential unconstrained minimization technique (SUMT), barrier method, and path-following method.

1. Start with t = t⁽⁰⁾, µ > 1, and a tolerance ϵ
2. Repeat
   1. INNER ITERATION: solve
          x*(t) = argmin_x f(x) + (1/t) ∑_{i=1}^m −log(−g_i(x))   s.t.  Ax = b
      (Newton's algorithm is especially good for this; the equality constraint can be handled using Dual Ascent or the Augmented Lagrangian; to solve for x*(t), initialize using the previous x*(t))
   2. If m/t < ϵ, quit; else set t = µt (scale up the value of t multiplicatively in every outer iteration)
The Log Barrier Method (contd.)

Note: Computing x*(t) exactly is not necessary, since the central path has no significance other than that it leads to a solution of the original problem as t → ∞. Also:

▶ Small µ ⇒ faster inner iterations, since the previous x*(t) will not be far from the next
▶ Large µ ⇒ faster outer iterations, since the upper bound on the duality gap will shrink quickly
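The outer/inner loop can be sketched end-to-end on the earlier toy problem (min x² s.t. x ≥ 1, i.e. g(x) = 1 − x). This is a minimal illustration, not the general method: there is no equality constraint here, so the Dual Ascent / Augmented Lagrangian machinery is omitted, and the inner Newton solve uses a crude feasibility-preserving backtracking step rather than a full line search.

```python
import math

def solve_inner(t, x0, iters=50):
    """Newton's method on f_t(x) = x^2 - (1/t)*log(x - 1), warm-started at x0 > 1."""
    x = x0
    for _ in range(iters):
        grad = 2.0 * x - 1.0 / (t * (x - 1.0))
        if abs(grad) < 1e-12:
            break
        hess = 2.0 + 1.0 / (t * (x - 1.0) ** 2)
        step = grad / hess
        while x - step <= 1.0:   # backtrack to keep the iterate strictly feasible
            step *= 0.5
        x -= step
    return x

def log_barrier_method(x0=2.0, t0=1.0, mu=10.0, eps=1e-6):
    """Outer loop: m = 1 inequality constraint; stop once the gap bound m/t < eps."""
    m, t, x = 1, t0, x0
    while m / t >= eps:
        x = solve_inner(t, x)    # warm start from the previous x*(t)
        t *= mu                  # scale t multiplicatively each outer iteration
    return x

x = log_barrier_method()
print(x)  # close to the true optimum x* = 1
```

Each outer iteration reuses the previous minimizer as the starting point, which is why moderate µ keeps the inner Newton solves cheap.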
[Figure: Central path for an LP with n = 2 and m = 6. The dashed curves show three contour lines of the logarithmic barrier function ϕ. The central path converges to the optimal point x* as t → ∞. Also shown is the point on the central path with t = 10. Figure source: Boyd & Vandenberghe]
In the process, we can also obtain λ*(t) and ν*(t).

Convergence of outer iterations: we get ϵ accuracy after

    log( m / (ϵ t⁽⁰⁾) ) / log(µ)

updates of t.
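A quick sketch of this count (the constants m, ϵ, t⁽⁰⁾, µ below are arbitrary): after k multiplicative updates we have t = µᵏ t⁽⁰⁾, so the smallest k with m/t < ϵ is the ceiling of log(m/(ϵ t⁽⁰⁾))/log(µ).

```python
import math

def outer_iterations(m, eps, t0, mu):
    # Smallest k such that m / (t0 * mu**k) < eps
    return math.ceil(math.log(m / (eps * t0)) / math.log(mu))

m, eps, t0, mu = 10, 1e-6, 1.0, 20.0
k = outer_iterations(m, eps, t0, mu)
assert m / (t0 * mu ** k) < eps          # after k updates the gap bound is below eps
assert m / (t0 * mu ** (k - 1)) >= eps   # k - 1 updates are not enough
print(k)
```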
Log Barrier Method & Strictly Feasible Starting Point

The inner optimization in the iterative barrier-method algorithm,

    x*(t) = argmin_x f(x) + (1/t) ∑_i −log(−g_i(x))   s.t.  Ax = b

can be solved using (sub)gradient descent starting from the value of x obtained in the previous iteration.

We must start with a strictly feasible x; otherwise −log(−g_i(x)) → ∞.
How to find a strictly feasible x⁽⁰⁾?

Basic Phase I method:

    x⁽⁰⁾ = argmin_{x, Γ} Γ   s.t.  g_i(x) ≤ Γ

We solve this using the barrier method, and thus will also need a strictly feasible starting point x̂⁽⁰⁾. Here,

    Γ = max_{i = 1 … m} g_i(x̂⁽⁰⁾) + δ,   where δ > 0

▶ i.e., the initial Γ is slightly larger than the largest g_i(x̂⁽⁰⁾)
On solving this optimization to find x⁽⁰⁾:
▶ If Γ* < 0, x⁽⁰⁾ is strictly feasible
▶ If Γ* = 0, x⁽⁰⁾ is feasible (but not strictly)
▶ If Γ* > 0, x⁽⁰⁾ is not feasible (the constraints cannot all be satisfied)

A slightly 'richer' problem can use a different Γ_i for each g_i, to improve numerical precision:

    x⁽⁰⁾ = argmin_{x, Γ₁ … Γ_m} ∑_i Γ_i   s.t.  g_i(x) ≤ Γ_i
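As an illustration of the Phase I idea (a sketch, not the barrier-on-Phase-I formulation above): minimizing F(x) = max_i g_i(x) by subgradient descent and checking whether the value drops below 0 yields a strictly feasible point. The constraints g₁(x) = 1 − x and g₂(x) = x − 2 (feasible set [1, 2]) are hypothetical examples.

```python
def phase_one(gs, grads, x0, step=0.1, iters=500):
    """Subgradient descent on F(x) = max_i g_i(x); F(x) < 0 => x strictly feasible."""
    x = x0
    for _ in range(iters):
        vals = [g(x) for g in gs]
        i = max(range(len(gs)), key=lambda j: vals[j])  # index of an active constraint
        x -= step * grads[i](x)                         # step along its (sub)gradient
    return x

# Hypothetical constraints: feasible set is the interval [1, 2]
gs    = [lambda x: 1.0 - x, lambda x: x - 2.0]
grads = [lambda x: -1.0,    lambda x: 1.0]
x0 = phase_one(gs, grads, 10.0)
print(x0, max(g(x0) for g in gs))  # max_i g_i(x0) < 0 => strictly feasible start
```

With a fixed step size the iterate oscillates near the minimizer of max_i g_i, which is enough here since any point with max_i g_i(x) < 0 serves as a strictly feasible start.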
The choice of a good x̂⁽⁰⁾ or x⁽⁰⁾ depends on the nature/class of the problem; use domain knowledge to decide it.
Log Barrier Method & Strictly Feasible Starting Point

We need not obtain x*(t) exactly in each outer iteration.

▶ If we do not solve for x*(t) exactly, we will get ϵ accuracy only after more than log(m/(ϵ t⁽⁰⁾))/log(µ) updates of t
▶ However, solving the inner iteration exactly may take too much time

TRADEOFFS
▶ Fewer inner-loop iterations correspond to more outer-loop iterations