DM204 – Autumn 2013
Scheduling, Timetabling and Routing

Lecture 5: Advanced Methods for MILP

Marco Chiarandini
Department of Mathematics & Computer Science, University of Southern Denmark

[Partly based on slides by David Pisinger, DIKU (now DTU)]
Outline

1. Advanced Methods for MILP
   - Lagrangian Relaxation
   - Dantzig-Wolfe Decomposition
   - Delayed Column Generation
Relaxation

In branch and bound we find upper bounds by relaxing the problem.

Relaxation:
  max { g(s) : s ∈ P }  ≥  max { g(s) : s ∈ S }  ≥  max { f(s) : s ∈ S }
where P is the set of candidate solutions, S ⊆ P the feasible solutions, and g(x) ≥ f(x) for all feasible x.

Which constraints should be relaxed?
- Quality of bound (tightness of relaxation)
- Remaining problem can be solved efficiently
- Proper multipliers can be found efficiently
- Constraints difficult to formulate mathematically
- Constraints which are too expensive to write up
Different relaxations

[Diagram comparing the relative tightness of relaxations: deleting constraints, LP relaxation, tighter LP relaxation, surrogate relaxation, best surrogate relaxation, Lagrangian relaxation, best Lagrangian relaxation, semidefinite relaxation.]

Relaxations are often used in combination.
Tightness of relaxation

  max cx
  s.t. Ax ≤ b
       Dx ≤ d
       x ∈ Z^n_+

Solving the problem exactly corresponds to the LP
  max { cx : x ∈ conv(Ax ≤ b, Dx ≤ d, x ∈ Z^n_+) }

Lagrangian relaxation of Dx ≤ d:
  z_LR(λ) = max cx − λ(Dx − d)
            s.t. Ax ≤ b
                 x ∈ Z^n_+

The best Lagrangian bound corresponds to the LP
  max { cx : Dx ≤ d, x ∈ conv(Ax ≤ b, x ∈ Z^n_+) }
Surrogate relaxation

Relax the complicating constraints Dx ≤ d using multipliers λ ≥ 0, i.e. add the constraints together using weights λ:

  z_SR(λ) = max cx
            s.t. Ax ≤ b
                 λDx ≤ λd
                 x ∈ Z^n_+

Surrogate dual problem:
  z_SD = min_{λ≥0} z_SR(λ)

Corresponding LP: max { cx : x ∈ conv(Ax ≤ b, λDx ≤ λd, x ∈ Z^n_+) }

The best surrogate relaxation (i.e. the best multipliers λ) is tighter than the best Lagrangian relaxation.
Relaxation strategies

Which constraints should be relaxed? "The complicating ones":
- the remaining problem is polynomially solvable (e.g. min spanning tree, assignment problem, linear programming)
- the remaining problem is totally unimodular (e.g. network problems)
- the remaining problem is NP-hard but good techniques exist (e.g. knapsack)
- constraints which cannot be expressed in MIP terms (e.g. cutting)
- constraints which are too extensive to express (e.g. subtour elimination in TSP)
Subgradient optimization: Lagrange multipliers

  max z = cx
  s.t. Ax ≤ b
       Dx ≤ d
       x ∈ Z^n_+

Lagrangian relaxation with multipliers λ ≥ 0:
  z_LR(λ) = max cx − λ(Dx − d)
            s.t. Ax ≤ b
                 x ∈ Z^n_+

Lagrangian dual problem:
  z_LD = min_{λ≥0} z_LR(λ)

- We do not need the best multipliers in a B&B algorithm
- Subgradient optimization is a fast method
- Works well due to convexity
Subgradient optimization, motivation

Newton-like method to minimize a Lagrangian function: z_LR(λ) is a piecewise linear and convex function of λ (illustrated in one variable).
Digression: Gradient methods

Gradient methods are iterative approaches:
- find a descent direction with respect to the objective function f
- move x in that direction by a step size

The descent direction can be computed by various methods, such as gradient descent, the Newton-Raphson method and others. The step size can be computed either exactly or loosely by solving a line search problem.

Example: gradient descent
1. Set iteration counter t = 0, and make an initial guess x_0 for the minimum
2. Repeat:
3.   Compute a descent direction ∆_t = ∇f(x_t)
4.   Choose α_t to minimize f(x_t − α∆_t) over α ∈ R_+
5.   Update x_{t+1} = x_t − α_t ∆_t, and t = t + 1
6. Until ‖∇f(x_t)‖ < tolerance

Step 4 can be solved 'loosely' by taking a fixed, small enough value α > 0 (see the sketch below).
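A minimal sketch of gradient descent with the 'loose' fixed step size mentioned above; the quadratic test function, step size and tolerance are illustrative choices, not part of the lecture.

```python
import numpy as np

def gradient_descent(grad_f, x0, alpha=0.1, tol=1e-6, max_iter=1000):
    """Minimize f by repeatedly stepping against its gradient with a fixed step size."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        d = grad_f(x)                    # descent direction: gradient of f at x_t
        if np.linalg.norm(d) < tol:      # stop once the gradient is (almost) zero
            break
        x = x - alpha * d                # fixed alpha instead of an exact line search
    return x

# Example: f(x, y) = (x - 1)^2 + 2(y + 3)^2, minimized at (1, -3)
grad = lambda v: np.array([2 * (v[0] - 1), 4 * (v[1] + 3)])
print(gradient_descent(grad, x0=[0.0, 0.0]))
```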
Newton-Raphson method [from Wikipedia]

Find zeros of a real-valued differentiable function f, i.e. solve f(x) = 0.

- Start with a guess x_0
- Repeat: move to a better approximation
    x_{n+1} = x_n − f(x_n) / f′(x_n)
  until a sufficiently accurate value is reached.

Geometrically, (x_{n+1}, 0) is the intersection with the x-axis of the line tangent to f at (x_n, f(x_n)):
  f′(x_n) = ∆y/∆x = (f(x_n) − 0) / (x_n − x_{n+1})
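A minimal sketch of the recursion above; the test function (finding √2 as a root of x² − 2) is an illustrative choice.

```python
def newton_raphson(f, df, x0, tol=1e-10, max_iter=50):
    """Find a root of f starting from x0 via the tangent-line update."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:       # sufficiently accurate value reached
            break
        x = x - fx / df(x)      # x_{n+1} = x_n - f(x_n) / f'(x_n)
    return x

# Example: root of f(x) = x^2 - 2, i.e. sqrt(2) ≈ 1.4142135623730951
print(newton_raphson(lambda x: x * x - 2, lambda x: 2 * x, x0=1.0))
```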
Subgradient

Generalization of gradients to non-differentiable functions.

Definition. An m-vector γ is a subgradient of f(λ) at λ̄ if
  f(λ) ≥ f(λ̄) + γ(λ − λ̄)   for all λ

The inequality says that the hyperplane
  y = f(λ̄) + γ(λ − λ̄)
is tangent to y = f(λ) at λ = λ̄ and supports f(λ) from below.
Proposition. Given a choice of nonnegative multipliers λ̄, if x′ is an optimal solution to z_LR(λ̄), then γ = d − Dx′ is a subgradient of z_LR(λ) at λ = λ̄.

Proof. From the subgradient definition we wish to prove that
  max { cx − λ(Dx − d) : Ax ≤ b, x ∈ Z^n_+ } ≥ (cx′ − λ̄(Dx′ − d)) + γ(λ − λ̄)

Let x′ be an optimal solution to z_LR(λ̄), the value appearing on the right-hand side. Inserting γ = d − Dx′ we get
  (cx′ − λ̄(Dx′ − d)) + (d − Dx′)(λ − λ̄) = cx′ − λ(Dx′ − d)
and since x′ is feasible for the maximization on the left-hand side,
  max { cx − λ(Dx − d) : Ax ≤ b, x ∈ Z^n_+ } ≥ cx′ − λ(Dx′ − d).
Intuition

Lagrangian dual: min_{λ≥0} z_LR(λ), where
  z_LR(λ) = max cx − λ(Dx − d)
            s.t. Ax ≤ b
                 x ∈ Z^n_+

The subgradient at λ, given an optimal x′, is γ = d − Dx′.

Subgradient iteration (recursion):
  λ^{k+1} = max { λ^k − θγ^k, 0 }
where θ > 0 is the step size.

- If γ > 0 and θ is sufficiently small, z_LR(λ) will decrease.
- Small θ: slow convergence
- Large θ: unstable

A minimal sketch of the iteration is given below.
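A minimal sketch of the projected subgradient iteration on a tiny made-up instance: c, D, d and the box {0, 1, 2}² (playing the role of Ax ≤ b) are illustrative, and z_LR(λ) is evaluated by brute-force enumeration of the "easy" feasible set.

```python
import itertools
import numpy as np

# Illustrative data: max 5*x1 + 4*x2 s.t. 2*x1 + 3*x2 <= 6 (complicating, relaxed),
# with the "easy" constraints being the box x1, x2 in {0, 1, 2}.
c = np.array([5.0, 4.0])
D = np.array([[2.0, 3.0]])
d = np.array([6.0])
easy_points = [np.array(p, dtype=float) for p in itertools.product(range(3), repeat=2)]

def solve_lagrangian(lam):
    """z_LR(lam) = max over the easy set of cx - lam(Dx - d), by enumeration."""
    best = max(easy_points, key=lambda x: c @ x - lam @ (D @ x - d))
    return c @ best - lam @ (D @ best - d), best

lam = np.zeros(1)        # start with zero multipliers
theta = 0.1              # fixed step size: small -> slow convergence, large -> unstable
for k in range(100):
    z_lr, x_opt = solve_lagrangian(lam)
    gamma = d - D @ x_opt                        # subgradient of z_LR at lam
    lam = np.maximum(lam - theta * gamma, 0.0)   # projected subgradient step
print("upper bound z_LR:", solve_lagrangian(lam)[0], "multipliers:", lam)
```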
Lagrange relaxation and LP

For an LP problem where we Lagrange relax all constraints:
- the dual variables are the best choice of Lagrange multipliers
- Lagrangian relaxation and LP "relaxation" give the same bound

This gives a clue to solving LP problems without the Simplex method:
- iterative algorithms
- polynomial algorithms
Dantzig-Wolfe Decomposition

Motivation: split large, difficult IP models up into smaller pieces.

Applications:
- Cutting Stock problems
- Multicommodity Flow problems
- Facility Location problems
- Capacitated Multi-item Lot-sizing problems
- Air-crew and Manpower Scheduling
- Vehicle Routing Problems
- Scheduling (current research)

Leads to methods also known as:
- Branch-and-price (column generation + branch and bound)
- Branch-and-cut-and-price (column generation + branch and bound + cutting planes)
Dantzig-Wolfe Decomposition

The problem is split into a master problem and a subproblem.
+ Tighter bounds
+ Better control of the subproblem
− Model may become (very) large

Delayed column generation: write up the decomposed model gradually, as needed.
- Generate a few solutions to the subproblem
- Solve the master problem to LP optimality
- Use the dual information to find the most promising solutions to the subproblem
- Extend the master problem with the new subproblem solutions, and repeat
Delayed Column Generation

Delayed column generation, linear master:
- The master problem can (and will) contain many columns
- To find a bound, solve the LP relaxation of the master
- Delayed column generation gradually writes up the master

A small worked sketch on a cutting stock instance follows.
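A minimal sketch of delayed column generation on a made-up cutting stock instance (one of the applications listed earlier). The restricted master LP is solved through its dual with scipy.optimize.linprog to obtain the prices π, and the pricing subproblem is an unbounded knapsack solved by dynamic programming (integer widths assumed). It illustrates the loop, not a production implementation.

```python
import numpy as np
from scipy.optimize import linprog

# Illustrative cutting stock instance: rolls of width W, item widths and demands.
W = 10
widths = [3, 4, 5]
demand = np.array([9.0, 7.0, 5.0])

# Initial patterns: one pattern per item type, packed with copies of that item only.
patterns = [np.eye(len(widths))[i] * (W // widths[i]) for i in range(len(widths))]

def master_duals(patterns):
    """Solve the dual of the restricted master LP to get the prices pi.

    Restricted master: min sum(x_p) s.t. sum_p a_p x_p >= demand, x >= 0.
    Its dual:          max demand.pi s.t. a_p.pi <= 1 for each pattern p, pi >= 0.
    """
    res = linprog(c=-demand, A_ub=np.array(patterns), b_ub=np.ones(len(patterns)),
                  bounds=[(0, None)] * len(widths), method="highs")
    return res.x, -res.fun      # (pi, restricted-master LP value by strong duality)

def price(pi):
    """Pricing subproblem: unbounded knapsack max pi.a s.t. widths.a <= W (DP)."""
    best = [0.0] * (W + 1)
    choice = [-1] * (W + 1)
    for cap in range(1, W + 1):
        for i, wi in enumerate(widths):
            if wi <= cap and best[cap - wi] + pi[i] > best[cap]:
                best[cap], choice[cap] = best[cap - wi] + pi[i], i
    pattern, cap = np.zeros(len(widths)), W
    while choice[cap] != -1:         # backtrack to recover the best pattern
        pattern[choice[cap]] += 1
        cap -= widths[choice[cap]]
    return best[W], pattern

while True:
    pi, lp_value = master_duals(patterns)
    value, pattern = price(pi)
    if value <= 1 + 1e-9:            # reduced cost 1 - pi.a >= 0: no improving column
        break
    patterns.append(pattern)         # extend the master with the new column

print("master LP bound:", lp_value, "with", len(patterns), "patterns")
```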
Revised Simplex Method

  max { cx : Ax ≤ b, x ≥ 0 }

After adding slack variables:
- B: index set of the m basic variables
- N: index set of the non-basic variables (these will be set to their lower bound 0)
- A_B = [A_{B(1)} ... A_{B(m)}] and A_N: the corresponding column submatrices; c_B and c_N the corresponding cost vectors

Standard form tableau (with a column for −z):
  [ A_N   A_B   0 | b ]
  [ c_N   c_B   1 | 0 ]
Basic feasible solution:
  Ax = A_N x_N + A_B x_B = b
  x_N = 0,  so A_B x_B = b   (the columns of A_B are linearly independent)
  x_B = A_B^{-1} b − A_B^{-1} A_N x_N,  with x_B ≥ 0

  z = cx = c_B (A_B^{-1} b − A_B^{-1} A_N x_N) + c_N x_N
         = c_B A_B^{-1} b + (c_N − c_B A_B^{-1} A_N) x_N

Canonical form tableau:
  [ A_B^{-1} A_N              I   0 | A_B^{-1} b       ]
  [ c_N − c_B A_B^{-1} A_N    0   1 | −c_B A_B^{-1} b  ]
The objective function is obtained by multiplying the constraints by the multipliers π (the dual variables) and subtracting them from the original objective:

  z = Σ_{j=1}^{p} ( c_j − Σ_{i=1}^{p} π_i a_{ij} ) x_j + Σ_{j=p+1}^{p+q} ( c_j − Σ_{i=1}^{p} π_i a_{ij} ) x_j + Σ_{i=1}^{p} π_i b_i

Each basic variable has cost zero in this objective function:
  c_j − Σ_{i=1}^{p} π_i a_{ij} = 0  for basic j   ⟹   π = c_B A_B^{-1}

Reduced costs of the non-basic variables:
  c_j − Σ_{i=1}^{p} π_i a_{ij}

A small numeric sketch of this pricing computation follows.
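A small numeric sketch of these formulas on a made-up 2-constraint problem: compute the simplex multipliers π = c_B A_B^{-1} and the reduced costs of the non-basic columns (the quantities a column generation scheme prices on). The data and the choice of basis are illustrative.

```python
import numpy as np

# Made-up data: max cx, Ax <= b with two constraints; slack columns appended.
A = np.array([[2.0, 1.0, 1.0, 0.0],
              [1.0, 3.0, 0.0, 1.0]])
b = np.array([8.0, 9.0])
c = np.array([3.0, 2.0, 0.0, 0.0])

B = [0, 1]               # indices of the basic variables (assumed basis)
N = [2, 3]               # indices of the non-basic variables

A_B, A_N = A[:, B], A[:, N]
x_B = np.linalg.solve(A_B, b)          # basic solution: A_B x_B = b (with x_N = 0)
pi = c[B] @ np.linalg.inv(A_B)         # simplex multipliers: pi = c_B A_B^{-1}
reduced = c[N] - pi @ A_N              # reduced costs of the non-basic columns

print("x_B =", x_B, "pi =", pi, "reduced costs =", reduced)
```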
Dantzig-Wolfe Decomposition with Column Generation

[Illustration by Simon Spoorendonk, DIKU]
Questions

Will the process terminate?
- Yes: the objective value is always improving, and there is only a finite number of basis solutions.

Can we repeat the same pattern?
- No, since the objective function is improved. We know the best solution among the existing columns, so if we generate an already existing column, it will not improve the objective.
Tailing-off effect

- Column generation may converge slowly in the end
- We do not need an exact solution, just a lower bound
- Solving the master problem for a subset of columns does not give a valid lower bound (why?)
- Instead we may use a Lagrangian relaxation of the joint constraint: "guess" Lagrangian multipliers equal to the dual variables from the master problem