Computational Optimization
Duality Theory (MW 12.9)
Prof. K. Bennett
Bennek@rpi.edu
http://www.rpi.edu/~bennek/compopt/
DRUG TRIVIA
• In 1999 the USA spent $25B/yr on pharmaceutical R&D (33% on clinical trials)
• Worth their weight in gold
• 10-15 years from conception to market for a drug
• Development cost: $0.5B per drug in 1999, now over $1.5B
• First-year sales > $1B/drug
• 1 drug approved per 5000 compounds tested
• 1 out of 100 drugs succeeds to market
• 19 Alzheimer's drugs in development
• 20,000,000 Americans with Alzheimer's by 2050
HELP WORLD HIV PROBLEM
HIV Reverse-Transcriptase Inhibition modeling: we have a few molecules that have been tested.
[Figure: chemical structures of the tested molecules]
Can we predict whether a new molecule will inhibit HIV?
What do we know?
The bioactivities of a small set of molecules.
Many descriptors for each molecule:
• Molecular Weight
• Electrostatic Potential
• Ionization Potential
Can we predict a molecule's bioactivity?
Drug Discovery Application
GOAL: Predict bioactivities of molecules in order to decrease the need for expensive lab experiments.
Given: molecule descriptors $x_i \in R^n$ and known bioactivity $y_i = 1$ or $-1$.
Find predictive model:
$$f(x) = \mathrm{sign}(w'x - b) = \mathrm{sign}\Big(\sum_{j=1}^{n} w_j x_j - b\Big) \approx y$$
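The predictive model on this slide, f(x) = sign(w'x - b), can be sketched in a few lines of Python. The weights, threshold, and two-descriptor "molecules" below are made-up illustrative values, and treating a score of exactly zero as class +1 is an assumption the slide does not specify.

```python
def predict(w, b, x):
    """Return +1 or -1 from the linear model f(x) = sign(w'x - b)."""
    score = sum(wj * xj for wj, xj in zip(w, x)) - b
    # Assumption: a score of exactly 0 is assigned to class +1.
    return 1 if score >= 0 else -1

# Hypothetical weights and threshold for two descriptors.
w = [1.0, -2.0]
b = 0.5

active = predict(w, b, [3.0, 1.0])     # score = 3 - 2 - 0.5 = 0.5
inactive = predict(w, b, [1.0, 1.0])   # score = 1 - 2 - 0.5 = -1.5
print(active, inactive)
```

The rest of the deck is about how to choose w and b from the labelled molecules.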
Best Linear Separator?
Find Closest Points in Convex Hulls
[Figure: convex hulls of the two classes, with the closest points labelled c and d]
Plane Bisects Closest Points
$$w = c - d, \qquad x \cdot w = b$$
[Figure: the plane through the midpoint of the segment joining c and d]
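Find using quadratic program
$$\min_{\alpha}\ \tfrac{1}{2}\|c - d\|^2, \qquad c = \sum_{i \in \text{Class } 1} \alpha_i x_i, \quad d = \sum_{i \in \text{Class } -1} \alpha_i x_i$$
$$\text{s.t.}\quad \sum_{i \in \text{Class } 1} \alpha_i = 1, \quad \sum_{i \in \text{Class } -1} \alpha_i = 1, \quad \alpha_i \ge 0,\ i = 1, \dots, \ell$$
Here $\ell$ is the number of training points. Quadratic objective with linear constraints.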
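To make the closest-points program concrete, here is a small pure-Python sketch that solves it on a toy data set. It uses the Frank-Wolfe method, which is a choice made here for brevity and is not the solver the slides have in mind. The data points, the function names, and the orientation w = c - d (class 1 on the positive side) are all illustrative assumptions.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def combo(alpha, pts):
    """Convex combination sum_i alpha_i * x_i."""
    return [sum(a * p[k] for a, p in zip(alpha, pts)) for k in range(len(pts[0]))]

def closest_hull_points(pos, neg, iters=200):
    """Frank-Wolfe sketch for min 0.5*||c - d||^2 over the two convex hulls."""
    a_pos = [1.0 / len(pos)] * len(pos)   # start at the class centroids
    a_neg = [1.0 / len(neg)] * len(neg)
    for _ in range(iters):
        c, d = combo(a_pos, pos), combo(a_neg, neg)
        w = [ci - di for ci, di in zip(c, d)]
        # Linear minimization step: the gradient w.r.t. alpha_i is x_i.w for
        # class 1 and -x_i.w for class -1, so pick the extreme vertices along w.
        i = min(range(len(pos)), key=lambda k: dot(pos[k], w))
        j = max(range(len(neg)), key=lambda k: dot(neg[k], w))
        # Change in (c - d) when both points move toward those vertices.
        u = [(pos[i][k] - c[k]) - (neg[j][k] - d[k]) for k in range(len(w))]
        uu = dot(u, u)
        if uu == 0.0:
            break
        # Exact line search for the quadratic objective, clipped to [0, 1].
        t = max(0.0, min(1.0, -dot(w, u) / uu))
        if t == 0.0:
            break
        a_pos = [(1.0 - t) * a for a in a_pos]
        a_pos[i] += t
        a_neg = [(1.0 - t) * a for a in a_neg]
        a_neg[j] += t
    return combo(a_pos, pos), combo(a_neg, neg), a_pos, a_neg

# Toy data (hypothetical): two points per class in the plane.
pos = [(2.0, 0.0), (3.0, 1.0)]
neg = [(0.0, 0.0), (-1.0, 1.0)]
c, d, a_pos, a_neg = closest_hull_points(pos, neg)

w = [ci - di for ci, di in zip(c, d)]            # normal of the bisecting plane
mid = [(ci + di) / 2.0 for ci, di in zip(c, d)]  # midpoint of the segment c-d
b = dot(w, mid)                                  # plane: x.w = b
print(c, d, w, b)
```

On this toy set the closest points are vertices of the hulls, so the iteration stops after two passes. In general Frank-Wolfe only converges approximately, and a proper QP solver would be the usual choice.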
Best Linear Separator: Supporting Plane Method
Maximize the distance between two parallel supporting planes:
$$x \cdot w = \delta, \qquad x \cdot w = \beta$$
$$\text{Distance} = \text{“Margin”} = \frac{\delta - \beta}{\|w\|}$$
Maximize margin using quadratic program
$$\min_{w, \delta, \beta}\ \tfrac{1}{2}\|w\|^2 - \delta + \beta$$
$$\text{s.t.}\quad x_i \cdot w \ge \delta,\ i \in \text{Class } 1, \qquad x_i \cdot w \le \beta,\ i \in \text{Class } -1$$
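Dual of Closest Points Method is Support Plane Method
$$\min_{\alpha}\ \tfrac{1}{2}\Big\|\sum_{i=1}^{\ell} y_i \alpha_i x_i\Big\|^2 \quad \text{s.t.}\quad \sum_{i \in \text{Class } 1} \alpha_i = 1,\ \sum_{i \in \text{Class } -1} \alpha_i = 1,\ \alpha_i \ge 0$$
$$\Updownarrow$$
$$\min_{w, \delta, \beta}\ \tfrac{1}{2}\|w\|^2 - \delta + \beta \quad \text{s.t.}\quad x_i \cdot w - \delta \ge 0\ (i \in \text{Class } 1),\ -x_i \cdot w + \beta \ge 0\ (i \in \text{Class } -1)$$
Solution only depends on support vectors, the points with $\alpha_i > 0$:
$$w = \sum_{i=1}^{\ell} y_i \alpha_i x_i, \qquad y_i = \begin{cases} 1 & i \in \text{Class } 1 \\ -1 & i \in \text{Class } -1 \end{cases}$$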
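As a rough illustration of how the two formulations connect, the sketch below takes multipliers α from the closest-points problem and recovers the supporting-plane quantities w, δ, β and the margin. The toy data and the optimal α are assumed values for a four-point example, and the helper name `planes_from_alpha` is hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def planes_from_alpha(X, y, alpha, tol=1e-9):
    """Recover w = sum_i y_i alpha_i x_i and the two supporting planes."""
    dim = len(X[0])
    w = [sum(y[i] * alpha[i] * X[i][k] for i in range(len(X))) for k in range(dim)]
    # Supporting planes: the tightest values with x_i.w >= delta on class 1
    # and x_i.w <= beta on class -1.
    delta = min(dot(X[i], w) for i in range(len(X)) if y[i] == 1)
    beta = max(dot(X[i], w) for i in range(len(X)) if y[i] == -1)
    margin = (delta - beta) / math.sqrt(dot(w, w))
    support = [i for i, a in enumerate(alpha) if a > tol]  # alpha_i > 0
    return w, delta, beta, margin, support

# Toy data (hypothetical) with its assumed optimal multipliers.
X = [(2.0, 0.0), (3.0, 1.0), (0.0, 0.0), (-1.0, 1.0)]
y = [1, 1, -1, -1]
alpha = [1.0, 0.0, 1.0, 0.0]

w, delta, beta, margin, support = planes_from_alpha(X, y, alpha)
primal_value = 0.5 * dot(w, w) - delta + beta   # supporting-plane objective
dual_value = -0.5 * dot(w, w)                   # closest-points objective as a max
print(w, delta, beta, margin, support, primal_value, dual_value)
```

For this example the margin equals the distance between the closest points, and the two objective values agree, which is what the equivalence on the slide asserts.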
Two views – one problem
The closest-point formulation yields the same solution as the parallel-planes formulation. This is not a coincidence.
Duality theory is a systematic way to formulate and investigate these problems.
Many different kinds: Lagrangian, Wolfe, Conjugate.
Recall Lagrangian Function
$$\min_x f(x) \quad \text{s.t.}\quad g_i(x) \ge 0,\ i = 1, \dots, m$$
with $f: R^n \to R$ and $g_i: R^n \to R$ differentiable.
$$L(x, \lambda) = f(x) - \sum_{j=1}^{m} \lambda_j g_j(x)$$
$$\nabla_x L(x, \lambda) = \nabla f(x) - \sum_{i=1}^{m} \lambda_i \nabla g_i(x)$$
KKT Conditions
Primal Feasibility:
$$g_i(x) \ge 0, \quad i = 1, \dots, m$$
Dual Feasibility:
$$\nabla_x L(x, \lambda) = \nabla f(x) - \sum_{i=1}^{m} \lambda_i \nabla g_i(x) = 0, \qquad \lambda \ge 0$$
Complementarity:
$$\lambda_i g_i(x) = 0, \quad i = 1, \dots, m$$
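A small worked example may help make the three KKT conditions (primal feasibility, dual feasibility, complementarity) concrete. The one-variable problem below is an illustration chosen here and does not come from the slides.

```latex
% Illustrative problem (an assumption, not from the slides):
%   minimize f(x) = (x - 2)^2   subject to   g(x) = 1 - x >= 0
\begin{aligned}
L(x,\lambda) &= (x-2)^2 - \lambda(1 - x) \\
\text{Dual feasibility:}\quad & \nabla_x L(x,\lambda) = 2(x-2) + \lambda = 0, \quad \lambda \ge 0 \\
\text{Complementarity:}\quad & \lambda(1 - x) = 0 \\
\text{Case } \lambda = 0:\quad & x = 2, \text{ but } g(2) = -1 < 0, \text{ so primal feasibility fails} \\
\text{Case } x = 1:\quad & \lambda = 2 \ge 0 \text{ and } g(1) = 0, \text{ so all conditions hold} \\
\Rightarrow\quad & (x, \lambda) = (1, 2), \qquad f(1) = 1
\end{aligned}
```

The constraint is active at the solution, which is why its multiplier is strictly positive.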
Lagrangian Duality
Base Problem:
$$\min_x f(x) \quad \text{s.t.}\quad g_j(x) \ge 0, \quad x \in X$$
Lagrangian function:
$$L(x, \lambda) = f(x) - \sum_{j} \lambda_j g_j(x)$$
PRIMAL Problem
Primal objective:
$$L^*(x) = \max_{\lambda \ge 0} L(x, \lambda)$$
Primal problem (min max):
$$\min_{x \in X} L^*(x) = \min_{x \in X} \max_{\lambda \ge 0} L(x, \lambda)$$
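Same as original
Primal objective:
$$L^*(x) = \max_{\lambda \ge 0} L(x, \lambda) = \max_{\lambda \ge 0}\ f(x) - \lambda' g(x) = \begin{cases} f(x) & \text{if } g(x) \ge 0 \\ \infty & \text{if } g_i(x) < 0 \text{ for some } i \end{cases}$$
Primal problem:
$$\min_{x \in X} L^*(x) = \min_{x \in X} \max_{\lambda \ge 0} L(x, \lambda) = \min_{x \in X} f(x) \quad \text{s.t.}\quad g(x) \ge 0$$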
Dual Problem
Dual objective:
$$L_*(\lambda) = \min_{x \in X} L(x, \lambda)$$
Dual problem (max min):
$$\max_{\lambda \ge 0} L_*(\lambda) = \max_{\lambda \ge 0} \min_{x \in X} L(x, \lambda)$$
$L_*$ is always concave!
Explicit Form of Dual
Some problems have an explicit form of the dual objective. Exploit differentiability and convexity.
$$\max_x \log(x) \quad \text{s.t.}\quad x \le 1$$
Equivalently, minimize $-\log(x)$ subject to $1 - x \ge 0$:
$$L_*(\lambda) = \min_x\ -\log(x) - \lambda(-x + 1)$$
$$\nabla_x L(x, \lambda) = -\frac{1}{x} + \lambda = 0 \ \Rightarrow\ x = 1/\lambda$$
$$L_*(\lambda) = -\log(1/\lambda) - \lambda(-1/\lambda + 1) = \log(\lambda) - \lambda + 1$$
[Figure: plot of log(x) - x for x from 0 to 10]
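The explicit dual for this example can be checked numerically. The sketch below writes the problem as minimizing -log(x) subject to 1 - x ≥ 0, evaluates the dual objective log(λ) - λ + 1 obtained by substituting x = 1/λ, and compares its maximum with the primal optimum. The coarse grid search is only an illustrative stand-in for a real solver.

```python
import math

def dual_objective(lam):
    """L_*(lam) = min_x [-log(x) - lam*(1 - x)], attained at x = 1/lam."""
    return math.log(lam) - lam + 1.0

# Primal: min -log(x) s.t. 1 - x >= 0 is solved by x = 1.
primal_value = -math.log(1.0)

# Coarse grid over lam in (0, 5]; an illustrative choice, not a real solver.
grid = [k / 100.0 for k in range(1, 501)]
best_lam = max(grid, key=dual_objective)
dual_value = dual_objective(best_lam)
print(best_lam, dual_value, primal_value)
```

Every grid value of the dual objective stays at or below the primal optimum, and the two meet at λ = 1, so this example has no duality gap.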
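Weak Duality
For any primal feasible point $\bar{x}$, with $g(\bar{x}) \ge 0$, and any dual feasible pair $(x, \lambda)$, with $\lambda \ge 0$ and $x \in \arg\min_{x \in X} f(x) - \lambda' g(x)$, the following holds:
$$f(\bar{x}) \ge L(x, \lambda) = f(x) - \lambda' g(x)$$
This follows because $f(\bar{x}) \ge f(\bar{x}) - \lambda' g(\bar{x}) \ge \min_{x \in X} L(x, \lambda)$.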
Wolfe Duality
For a differentiable convex program, Lagrangian duality can be simplified:
$$\max_{x, \lambda} L(x, \lambda) \quad \text{s.t.}\quad \nabla_x L(x, \lambda) = 0, \quad \lambda \ge 0$$
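Lagrangian Duality
Dual Function:
$$L_*(\lambda) = \min_x L(x, \lambda)$$
If the problem is convex, the minimum over $x$ is characterized by stationarity:
$$L_*(\lambda) = \min_x L(x, \lambda) \ \Rightarrow\ \nabla_x L(x, \lambda) = \nabla f(x) - \sum_{i=1}^{m} \lambda_i \nabla g_i(x) = 0$$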
Dual Problem
$$\max_{\lambda \ge 0} L_*(\lambda) = \max_{\lambda \ge 0} \min_x L(x, \lambda)$$
If the problem is convex, this simplifies to:
$$\max_{x, \lambda} L(x, \lambda) \quad \text{s.t.}\quad \nabla_x L(x, \lambda) = 0, \quad \lambda \ge 0$$
Weak Duality
For any primal feasible $x^*$, with $g(x^*) \ge 0$, and any dual feasible $(x, \lambda)$, with $\nabla_x L(x, \lambda) = 0$ and $\lambda \ge 0$, the following holds:
$$f(x^*) \ge L(x, \lambda) = f(x) - \lambda' g(x)$$
If the two values are equal, both points must be optimal!
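The weak duality bound for the Wolfe dual can be checked on a small convex problem. The example below, which minimizes (x - 2)² subject to 1 - x ≥ 0, is chosen here purely for illustration and is not taken from the slides. For each trial x the multiplier is obtained by solving the stationarity condition, and only pairs with λ ≥ 0 are kept as dual feasible.

```python
def f(x):
    return (x - 2.0) ** 2

def g(x):
    return 1.0 - x

def lagrangian(x, lam):
    return f(x) - lam * g(x)

def wolfe_multiplier(x):
    """Solve grad_x L = 2(x - 2) + lam = 0 for lam."""
    return 2.0 * (2.0 - x)

x_star = 1.0   # primal feasible, since g(1) = 0 >= 0

gaps = []
for k in range(-20, 21):
    x = k / 10.0                   # trial dual point, x in [-2, 2]
    lam = wolfe_multiplier(x)      # lam >= 0 exactly when x <= 2
    if lam >= 0.0:
        gaps.append(f(x_star) - lagrangian(x, lam))

smallest_gap = min(gaps)
print(smallest_gap)
```

Substituting the multiplier gives L(x, λ) = -x² + 2x, so each gap equals (x - 1)². It is never negative and vanishes only at x = 1, where the primal and dual points are both optimal.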
General Idea
• Primal: minimize the primal function subject to the primal constraints.
• Dual: maximize the dual function with respect to the dual variables $\lambda \ge 0$.
• At optimality the primal and dual functions are equal (requires assumptions: strong duality).
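Primal QP
$$\min_{w, \delta, \beta}\ \tfrac{1}{2}\|w\|^2 - \delta + \beta$$
$$\text{s.t.}\quad x_i \cdot w \ge \delta,\ i \in \text{Class } 1, \qquad x_i \cdot w \le \beta,\ i \in \text{Class } -1$$
Lagrangian, with multipliers $\alpha_i \ge 0$:
$$L(w, \delta, \beta, \alpha) = \tfrac{1}{2}\|w\|^2 - \delta + \beta - \sum_{i \in \text{Class } 1} \alpha_i (x_i \cdot w - \delta) - \sum_{i \in \text{Class } -1} \alpha_i (\beta - x_i \cdot w)$$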
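KKT
Primal Feasibility:
$$x_i \cdot w \ge \delta,\ i \in \text{Class } 1, \qquad x_i \cdot w \le \beta,\ i \in \text{Class } -1$$
Dual Feasibility:
$$\alpha_i \ge 0$$
$$\nabla_w L = w - \sum_{i \in \text{Class } 1} \alpha_i x_i + \sum_{i \in \text{Class } -1} \alpha_i x_i = 0$$
$$\nabla_\delta L = \sum_{i \in \text{Class } 1} \alpha_i - 1 = 0$$
$$\nabla_\beta L = -\sum_{i \in \text{Class } -1} \alpha_i + 1 = 0$$
plus complementarity.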
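Dual Problem
$$\max_{w, \delta, \beta, \alpha}\ L(w, \delta, \beta, \alpha)$$
$$\text{s.t.}\quad w = \sum_{i \in \text{Class } 1} \alpha_i x_i - \sum_{i \in \text{Class } -1} \alpha_i x_i, \quad \sum_{i \in \text{Class } 1} \alpha_i = 1, \quad \sum_{i \in \text{Class } -1} \alpha_i = 1, \quad \alpha_i \ge 0$$
Remove $w$ by substitution and simplify. Convert to a min problem.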
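The "substitute and simplify" step can be written out as follows. This is a sketch that uses only the stationarity conditions from the KKT slide, with $y_i = \pm 1$ denoting the class label so that the constraint on $w$ reads $w = \sum_i y_i \alpha_i x_i$.

```latex
\begin{aligned}
L &= \tfrac{1}{2}\|w\|^2 - \delta + \beta
     - \sum_{i \in 1} \alpha_i (x_i \cdot w - \delta)
     - \sum_{i \in -1} \alpha_i (\beta - x_i \cdot w) \\
  &= \tfrac{1}{2}\|w\|^2
     - w \cdot \Big( \sum_{i \in 1} \alpha_i x_i - \sum_{i \in -1} \alpha_i x_i \Big)
     + \delta \Big( \sum_{i \in 1} \alpha_i - 1 \Big)
     + \beta \Big( 1 - \sum_{i \in -1} \alpha_i \Big) \\
  &= \tfrac{1}{2}\|w\|^2 - w \cdot w + 0 + 0
   \;=\; -\tfrac{1}{2} \Big\| \sum_{i} y_i \alpha_i x_i \Big\|^2
\end{aligned}
% Maximizing -1/2 ||sum_i y_i alpha_i x_i||^2 over the remaining constraints
% is the same as minimizing 1/2 ||sum_i y_i alpha_i x_i||^2.
```

The terms in $\delta$ and $\beta$ vanish because each class's multipliers sum to one, which leaves exactly the closest-points objective.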
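Recall quadratic program
$$\min_{\alpha}\ \tfrac{1}{2}\Big\|\sum_{i} y_i \alpha_i x_i\Big\|^2 = \tfrac{1}{2}\|c - d\|^2, \qquad c = \sum_{i \in \text{Class } 1} \alpha_i x_i, \quad d = \sum_{i \in \text{Class } -1} \alpha_i x_i$$
$$\text{s.t.}\quad \sum_{i \in \text{Class } 1} \alpha_i = 1, \quad \sum_{i \in \text{Class } -1} \alpha_i = 1, \quad \alpha_i \ge 0,\ i = 1, \dots, \ell$$
Quadratic objective with linear constraints.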
Linear Programming Duality = Special Case
Primal:
$$\min_x\ b'x \quad \text{s.t.}\quad Ax \ge c$$
$$L(x, y) = b'x - y'(Ax - c)$$
Dual:
$$\max_{x, y}\ b'x - y'(Ax - c) \quad \text{s.t.}\quad \nabla_x L(x, y) = b - A'y = 0, \quad y \ge 0$$
Simplify by using $x' \nabla_x L(x, y) = x'(b - A'y) = 0$:
$$\max_y\ c'y \quad \text{s.t.}\quad A'y = b, \quad y \ge 0$$
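A tiny numerical check of the LP primal-dual pair, min b'x s.t. Ax ≥ c against max c'y s.t. A'y = b, y ≥ 0, might look like the sketch below. The matrix, vectors, and candidate solutions are made-up values chosen so the answer can be verified by hand.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Hypothetical LP: min x1 + 2*x2  s.t.  x1 + x2 >= 2, x1 >= 0, x2 >= 0
A = [[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
b = [1.0, 2.0]
c = [2.0, 0.0, 0.0]

def primal_feasible(x):
    """Check Ax >= c row by row."""
    return all(dot(row, x) >= ci for row, ci in zip(A, c))

def dual_feasible(y, tol=1e-12):
    """Check A'y = b (up to tol) and y >= 0."""
    cols = list(zip(*A))  # columns of A, so dot(col, y) is one entry of A'y
    return all(yi >= 0 for yi in y) and all(
        abs(dot(col, y) - bj) <= tol for col, bj in zip(cols, b))

# Weak duality: any feasible pair satisfies b'x >= c'y.
x_any, y_any = [1.0, 1.0], [0.5, 0.5, 1.5]
weak_gap = dot(b, x_any) - dot(c, y_any)

# Candidate optimal pair: equal objective values certify optimality of both.
x_opt, y_opt = [2.0, 0.0], [1.0, 0.0, 1.0]
primal_value, dual_value = dot(b, x_opt), dot(c, y_opt)
print(weak_gap, primal_value, dual_value)
```

Because the feasible pair `x_opt`, `y_opt` gives equal primal and dual objective values, weak duality implies that both are optimal, with no solver needed to confirm it.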
Works for equality/inequality
Primal:
$$\min_x f(x) \quad \text{s.t.}\quad g(x) \ge 0, \quad h(x) = 0$$
Dual:
$$\max_{x, u, v}\ f(x) - u' g(x) - v' h(x)$$
$$\text{s.t.}\quad \nabla_x f(x) - \sum_i u_i \nabla_x g_i(x) - \sum_j v_j \nabla_x h_j(x) = 0, \quad u \ge 0, \quad v \text{ unconstrained}$$
Why Dual Problem?
• May have nicer structure, such as easier constraints or an easier objective function.
• The dual problem is always the maximization of a concave function. The catch: there may be a “duality gap”.
• The dual provides a lower bound on the primal function, which can be used to check optimality and to generate cuts/constraints.
• Exploited in algorithms, e.g. augmented Lagrangian and primal-dual interior point algorithms.