CSE203B Convex Optimization
Lecture 3: Convex Function
CK Cheng
Dept. of Computer Science and Engineering
University of California, San Diego
Outlines
1. Definitions: Convexity, Examples & Views
2. Conditions of Optimality
   1. First Order Condition
   2. Second Order Condition
3. Operations that Preserve the Convexity
   1. Pointwise Maximum
   2. Partial Minimization
4. Conjugate Function
5. Log-Concave, Log-Convex Functions
Outlines
1. Definitions
   1. Convex Function vs Convex Set
   2. Examples
      1. Norm
      2. Entropy
      3. Affine
      4. Determinant
      5. Maximum
   3. Views of Functions and Related Hyperplanes
1. Definitions: Convex Function vs Convex Set
Theorem: Given $S = \{x \mid f(x) \le b\}$. If the function $f(x)$ is convex, then $S$ is a convex set.
Proof: We prove this from the definition of a convex set. For every $u, v \in S$, i.e. $f(u) \le b$ and $f(v) \le b$, we want to show that $\alpha u + \beta v \in S$ for all $\alpha + \beta = 1$, $\alpha, \beta \ge 0$. We have
$f(\alpha u + \beta v) \le \alpha f(u) + \beta f(v)$   ($f$ is convex)
$\le \alpha b + \beta b$   ($\alpha, \beta \ge 0$)
$= (\alpha + \beta)\, b = b$   ($\alpha + \beta = 1$)
Thus $\alpha u + \beta v \in S$.
Remark: Convex function => Convex set
$f(x) \le b$ => Convex set
$f(x) \ge b$ => ?
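The theorem and the open question in the remark can be probed numerically. The sketch below is only an illustration under assumed choices that are not in the slides: the function $f(x) = x^2$, the level $b = 1$, and the two test points $u = -1$, $v = 1$. It checks that convex combinations stay in the sublevel set, and that the midpoint leaves the superlevel set $\{x \mid f(x) \ge b\}$, so for this particular $f$ the superlevel set is not convex.

```python
def f(x):
    """Convex example function f(x) = x^2 (an assumption for illustration)."""
    return x * x

b = 1.0  # assumed level

def in_sublevel(x):
    """Membership test for S = {x | f(x) <= b}."""
    return f(x) <= b

def in_superlevel(x):
    """Membership test for {x | f(x) >= b}."""
    return f(x) >= b

u, v = -1.0, 1.0                        # both satisfy f = b, so they lie in both sets
alphas = [i / 10 for i in range(11)]    # alpha in [0, 1], beta = 1 - alpha

# Every convex combination of u and v stays in the sublevel set.
sublevel_ok = all(in_sublevel(a * u + (1 - a) * v) for a in alphas)

# The midpoint 0 has f(0) = 0 < b, so it leaves the superlevel set.
midpoint = 0.5 * u + 0.5 * v
superlevel_midpoint_ok = in_superlevel(midpoint)
```

A single counterexample like this only shows that convexity of $f$ does not guarantee a convex superlevel set; it says nothing about other functions.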
1. Convex Function Definitions: Examples
$f: R^n \to R$ is convex if $\mathrm{dom}\, f$ is a convex set and
$f(\theta x + (1-\theta) y) \le \theta f(x) + (1-\theta) f(y)$, $\forall x, y \in \mathrm{dom}\, f$, $0 \le \theta \le 1$
Example on R:
Convex Functions
Affine: $ax + b$ on $R$ for any $a, b \in R$
Exponential: $e^{ax}$ for any $a \in R$
Power: $x^{\alpha}$ on $R_{++}$ for $\alpha \ge 1$ or $\alpha \le 0$; $|x|^p$ on $R$ for $p \ge 1$
Concave Functions
Affine: $ax + b$ on $R$ for any $a, b \in R$
Power: $x^{\alpha}$ on $R_{++}$ for $0 \le \alpha \le 1$
Logarithm: $\log x$ on $R_{++}$
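The defining inequality can be spot-checked on a grid for the scalar examples listed above. This is a minimal sketch, not a proof: the helper name `violates_convexity`, the grid on $(0, 5]$, the tolerance, and the specific exponents are all assumptions made for illustration. A grid check can only find violations; passing it does not establish convexity.

```python
import math

def violates_convexity(f, points, thetas, tol=1e-9):
    """Return True if f(th*x + (1-th)*y) > th*f(x) + (1-th)*f(y) for some sample."""
    for x in points:
        for y in points:
            for th in thetas:
                lhs = f(th * x + (1 - th) * y)
                rhs = th * f(x) + (1 - th) * f(y)
                if lhs > rhs + tol * (1 + abs(rhs)):  # relative slack for rounding
                    return True
    return False

points = [0.1 * k for k in range(1, 51)]   # grid inside R_++
thetas = [k / 10 for k in range(11)]       # theta in [0, 1]

exp_convex = not violates_convexity(lambda x: math.exp(2 * x), points, thetas)
cube_convex = not violates_convexity(lambda x: x ** 3, points, thetas)       # alpha >= 1
inverse_convex = not violates_convexity(lambda x: x ** -1, points, thetas)   # alpha <= 0
sqrt_not_convex = violates_convexity(math.sqrt, points, thetas)              # 0 <= alpha <= 1
neg_log_convex = not violates_convexity(lambda x: -math.log(x), points, thetas)
```

Concavity of $\sqrt{x}$ and $\log x$ shows up here as convexity of their negatives, which is why `-math.log` passes the same check.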
1. Convex Function Definitions: Examples
Example on $R^n$:
Affine: $f(x) = a^T x + b$
Norms: $\|x\|_p = \left( \sum_{i=1}^{n} |x_i|^p \right)^{1/p}$ for $p \ge 1$; $\|x\|_\infty = \max_k |x_k|$
Example on $R^{m \times n}$:
Affine: $f(X) = \mathrm{tr}(A^T X) = \sum_{i=1}^{m} \sum_{j=1}^{n} A_{ij} X_{ij}$
Spectral (max singular value): $f(X) = \|X\|_2 = \sigma_{max}(X) = \left( \lambda_{max}(X^T X) \right)^{1/2}$
1. Convex Function Definitions: Examples
Concave Functions:
Log Determinant: $f(X) = \log \det X$, $\mathrm{dom}\, f = S^n_{++}$
Proof: Let $g(t) = f(Z + tV)$
$g(t) = \log \det (Z + tV)$
$= \log \det Z + \log \det (I + t Z^{-1/2} V Z^{-1/2})$
$= \log \det Z + \sum_{i=1}^{n} \log(1 + t \lambda_i)$
$\lambda_i$: eigenvalues of $Z^{-1/2} V Z^{-1/2}$
$g$ is concave in $t$ ⇒ $f$ is concave
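The restriction-to-a-line argument can be tried on a small case. The sketch below assumes a specific $2 \times 2$ positive definite $Z$ and symmetric direction $V$ (neither comes from the slides) and checks midpoint concavity of $g(t) = \log \det(Z + tV)$ on a grid where $Z + tV$ stays positive definite. For these matrices $\det(Z + tV) = 3 - t^2$, so the grid is kept inside $|t| < \sqrt{3}$.

```python
import math

Z = [[2.0, 1.0], [1.0, 2.0]]    # assumed positive definite matrix
V = [[1.0, 0.0], [0.0, -1.0]]   # assumed symmetric direction

def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def g(t):
    """g(t) = log det(Z + t V), valid while Z + t V is positive definite."""
    m = [[Z[i][j] + t * V[i][j] for j in range(2)] for i in range(2)]
    return math.log(det2(m))

ts = [-1.5 + 0.1 * k for k in range(31)]   # grid in [-1.5, 1.5]

# Midpoint concavity: g((s + t) / 2) >= (g(s) + g(t)) / 2 on the grid.
concave_ok = all(
    g(0.5 * (s + t)) >= 0.5 * (g(s) + g(t)) - 1e-12
    for s in ts for t in ts
)
```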
Convex function examples: norm, max, expectation
Norm: If $f: R^n \to R$ is a norm and $0 \le \theta \le 1$,
$f(\theta x + (1-\theta) y) \le f(\theta x) + f((1-\theta) y)$   (triangle inequality)
$= \theta f(x) + (1-\theta) f(y)$   (scalability)
Max function: $f(x) = \max_i x_i$, $x = (x_1, x_2, \dots, x_n)^T$
$f(\theta x + (1-\theta) y) = \max_i \left( \theta x_i + (1-\theta) y_i \right)$
$\le \theta \max_i x_i + (1-\theta) \max_i y_i$
$= \theta f(x) + (1-\theta) f(y)$ for $0 \le \theta \le 1$
Probability (Expectation): If $f(x)$ is convex with $p(x)$ a probability at $x$, i.e. $p(x) \ge 0$, $\forall x$, and $\int p(x)\, dx = 1$,
then $f(E x) \le E f(x)$, where $E x = \int x\, p(x)\, dx$ and $E f(x) = \int f(x)\, p(x)\, dx$.
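The expectation inequality $f(Ex) \le E f(x)$ is easy to see on a small example. The sketch below swaps the density in the slide for a discrete distribution, which is an assumption made so the expectation is a finite sum; the support points, the probabilities, and the choice $f(x) = x^2$ are also illustrative.

```python
# Assumed discrete distribution: support points and their probabilities.
xs = [-1.0, 0.0, 2.0, 5.0]
ps = [0.1, 0.4, 0.3, 0.2]

def f(x):
    """Convex test function (an assumption for illustration)."""
    return x * x

def expectation(g):
    """E g(x) = sum_k p_k * g(x_k) for the discrete distribution above."""
    return sum(p * g(x) for x, p in zip(xs, ps))

mean = expectation(lambda x: x)     # E x
f_of_mean = f(mean)                 # f(E x)
mean_of_f = expectation(f)          # E f(x)
jensen_holds = f_of_mean <= mean_of_f
```

For this distribution $E x = 1.5$, so $f(Ex) = 2.25$, while $E f(x) = 6.3$.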
1.3 Views of Functions and Related Hyperplanes
Given $f(x)$, $x \in R^n$, we plot the function in $R^n$ and $R^{n+1}$ spaces.
1. Draw function in $R^n$ space
Equipotential surface: tangent plane $\nabla f(\tilde{x})^T (x - \tilde{x}) = 0$ at $\tilde{x}$
2. Draw function in $R^{n+1}$ space
2.1 Graph of function: $\{(x, h) \mid x \in \mathrm{dom}\, f,\ h = f(x)\}$
Hyperplane ($h = \nabla f(\tilde{x})^T (x - \tilde{x}) + f(\tilde{x})$):
$\begin{bmatrix} \nabla f(\tilde{x}) \\ -1 \end{bmatrix}^T \left( \begin{bmatrix} x \\ h \end{bmatrix} - \begin{bmatrix} \tilde{x} \\ f(\tilde{x}) \end{bmatrix} \right) = 0$
Example: $f(x) = x^2$. We show the hyperplane with $\nabla f(x)$.
2.2 Epigraph: $\mathrm{epi}\, f = \{(x, t) \mid x \in \mathrm{dom}\, f,\ f(x) \le t\}$
A function is convex iff its epigraph is a convex set.
Example: $f(x) = \max\{ f_i(x) \mid i = 1, \dots, r \}$, where the $f_i(x)$ are convex. Since $\mathrm{epi}\, f$ is the intersection of the $\mathrm{epi}\, f_i$, $\mathrm{epi}\, f$ is convex. Thus, the function $f$ is convex.
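The epigraph argument for the pointwise maximum can be made concrete. In the sketch below the three affine pieces are hypothetical examples, not taken from the slides. It checks on a grid that a point $(x, t)$ lies in the epigraph of the maximum exactly when it lies in the epigraph of every piece, which is the set identity the example relies on.

```python
# Hypothetical convex pieces f_i (affine functions are convex).
pieces = [
    lambda x: -x,
    lambda x: 0.5 * x + 1.0,
    lambda x: 2.0 * x - 3.0,
]

def f(x):
    """Pointwise maximum of the pieces."""
    return max(fi(x) for fi in pieces)

def in_epigraph(g, x, t):
    """(x, t) is in epi g when g(x) <= t."""
    return g(x) <= t

xs = [k / 2 for k in range(-8, 9)]    # x grid in [-4, 4]
ts = [k / 2 for k in range(-8, 17)]   # t grid in [-4, 8]

# epi f equals the intersection of the epi f_i at every grid point.
epi_matches_intersection = all(
    in_epigraph(f, x, t) == all(in_epigraph(fi, x, t) for fi in pieces)
    for x in xs for t in ts
)
```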
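2. Conditions of Optimality: First Order Condition
Definition: $f$ is differentiable if $\mathrm{dom}\, f$ is open and
$\nabla f(x) \equiv \left( \frac{\partial f(x)}{\partial x_1}, \frac{\partial f(x)}{\partial x_2}, \dots, \frac{\partial f(x)}{\partial x_n} \right)$ exists at each $x \in \mathrm{dom}\, f$
Theorem: Differentiable $f$ with convex domain is convex iff
$f(y) \ge f(x) + \nabla f(x)^T (y - x)$, $\forall x, y \in \mathrm{dom}\, f$
Proof
(=>) If $f$ is convex then
$(1-t) f(x) + t f(y) \ge f((1-t) x + t y)$, $\forall\, 0 \le t \le 1$
$t \left( f(y) - f(x) \right) \ge f(x + t(y - x)) - f(x)$
$f(y) - f(x) \ge \frac{1}{t} \left( f(x + t(y - x)) - f(x) \right) \to \nabla f(x)^T (y - x)$ when $t \to 0$
(<=) Given $f(y) \ge f(x) + \nabla f(x)^T (y - x)$, $\forall x, y \in \mathrm{dom}\, f$, let $z = (1-t) x + t y$, where
$f(x) \ge f(z) + \nabla f(z)^T (x - z)$
$f(y) \ge f(z) + \nabla f(z)^T (y - z)$
Multiplying the first inequality by $(1-t)$ and the second by $t$ and adding them, the gradient terms cancel because $(1-t)(x - z) + t(y - z) = 0$.
Thus $(1-t) f(x) + t f(y) \ge f(z)$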
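The first order condition says the tangent plane at any point is a global underestimator. The sketch below checks this on a grid for an assumed convex function $f(x) = e^{x_1} + x_2^2$ with a hand-derived gradient; the function, the grid, and the helper names are illustrative and not from the slides.

```python
import math

def f(x):
    """Assumed convex test function on R^2."""
    return math.exp(x[0]) + x[1] ** 2

def grad_f(x):
    """Gradient of the test function, derived by hand."""
    return (math.exp(x[0]), 2.0 * x[1])

def tangent_gap(x, y):
    """f(y) - [f(x) + grad f(x)^T (y - x)]; nonnegative when f is convex."""
    g = grad_f(x)
    linear = f(x) + g[0] * (y[0] - x[0]) + g[1] * (y[1] - x[1])
    return f(y) - linear

grid = [(a / 2, b / 2) for a in range(-4, 5) for b in range(-4, 5)]

# Smallest gap over all pairs; it is 0 when x == y and positive otherwise.
min_gap = min(tangent_gap(x, y) for x in grid for y in grid)
```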
2. Conditions: Second Order Condition
Definition: $f$ is twice differentiable if $\mathrm{dom}\, f$ is open and the Hessian $\nabla^2 f(x) \in S^n$,
$\nabla^2 f(x)_{ij} \equiv \frac{\partial^2 f(x)}{\partial x_i \partial x_j}$, $i, j = 1, \dots, n$, exists at each $x \in \mathrm{dom}\, f$
Theorem: Twice differentiable $f$ with convex domain is convex iff
$\nabla^2 f(x) \succeq 0$, $\forall x \in \mathrm{dom}\, f$
Proof: Using the Lagrange remainder, we can find a $z$ such that
$f(x + t(y - x)) = f(x) + \nabla f(x)^T t (y - x) + \frac{1}{2} t^2 (y - x)^T \nabla^2 f(z) (y - x)$,
$\forall\, 0 \le t \le 1$, where $z$ is between $x$ and $x + t(y - x)$.
Since the last term is always nonnegative by assumption, the first order condition is satisfied.
2. Conditions: Second Order Condition
Example: Negative Entropy: $f(x) = x \log x$, $x \in R_{++}$
$f'(x) = \frac{x}{x} + \log x = 1 + \log x$, $f''(x) = \frac{1}{x}$
Since $x \in R_{++}$, $f''(x) > 0$ ⇒ $f(x)$ is convex
Show the plot of $x \log x$
Remark:
• 1st order condition can be used to design and prove the property of opt. algorithms.
• 2nd order condition implies the 1st order condition
• 2nd order condition can be used to prove the convexity of the functions.
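The second derivative of the negative entropy can be cross-checked numerically. The sketch below compares a central finite difference against the closed form $f''(x) = 1/x$ at a few sample points; the step size `h`, the sample points, and the helper name are assumptions chosen for illustration.

```python
import math

def f(x):
    """Negative entropy f(x) = x log x on x > 0."""
    return x * math.log(x)

def second_derivative_fd(x, h=1e-4):
    """Central finite-difference estimate of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

samples = [0.5, 1.0, 2.0, 5.0]   # assumed sample points in R_++

# Largest deviation from the analytic second derivative 1/x.
max_error = max(abs(second_derivative_fd(x) - 1.0 / x) for x in samples)
all_positive = all(second_derivative_fd(x) > 0 for x in samples)
```

The estimate is only approximate, so it is compared against $1/x$ with a loose tolerance rather than for equality.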
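2. Conditions: Examples
• Quadratic Function: $f(x) = \frac{1}{2} x^T P x + q^T x + r$, $P \in S^n$
$\nabla f(x) = P x + q$, $\nabla^2 f(x) = P$
• Least Square: $f(x) = \|A x - b\|_2^2$
$\nabla f(x) = 2 A^T (A x - b)$, $\nabla^2 f(x) = 2 A^T A$
• Quadratic over linear: $f(x, y) = \frac{x^2}{y}$, $y > 0$
$\nabla f(x, y) = \left( \frac{2x}{y},\ -\frac{x^2}{y^2} \right)^T$
$\nabla^2 f(x, y) = \begin{bmatrix} \frac{2}{y} & -\frac{2x}{y^2} \\ -\frac{2x}{y^2} & \frac{2x^2}{y^3} \end{bmatrix} = \frac{2}{y^3} \begin{bmatrix} y \\ -x \end{bmatrix} \begin{bmatrix} y \\ -x \end{bmatrix}^T \succeq 0$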
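For the quadratic-over-linear example, the Hessian of $x^2 / y$ has entries $2/y$, $-2x/y^2$, $2x^2/y^3$, and it factors as $\frac{2}{y^3} (y, -x)(y, -x)^T$, which makes $v^T \nabla^2 f\, v = \frac{2}{y^3}(y v_1 - x v_2)^2 \ge 0$ for $y > 0$. The sketch below compares the entrywise quadratic form against this factored form on assumed sample points; the helper names and the grids are illustrative.

```python
def hessian(x, y):
    """Hessian of f(x, y) = x^2 / y for y > 0, written entry by entry."""
    return (
        (2.0 / y, -2.0 * x / y ** 2),
        (-2.0 * x / y ** 2, 2.0 * x ** 2 / y ** 3),
    )

def quad_form(H, v):
    """v^T H v for a symmetric 2x2 matrix H."""
    return H[0][0] * v[0] ** 2 + 2.0 * H[0][1] * v[0] * v[1] + H[1][1] * v[1] ** 2

def factored_form(x, y, v):
    """(2 / y^3) * (y*v1 - x*v2)^2, the rank-one form of the same quantity."""
    return 2.0 / y ** 3 * (y * v[0] - x * v[1]) ** 2

xs = [-2.0, -0.5, 0.0, 1.0, 2.0]
ys = [0.5, 1.0, 3.0]                      # y > 0
vs = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (-2.0, 1.5)]

worst_mismatch = 0.0
min_quad = float("inf")
for x in xs:
    for y in ys:
        H = hessian(x, y)
        for v in vs:
            q = quad_form(H, v)
            fac = factored_form(x, y, v)
            worst_mismatch = max(worst_mismatch, abs(q - fac) / (1.0 + abs(fac)))
            min_quad = min(min_quad, q)
```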
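2. Conditions: Examples
• Log-sum-exp: $f(x) = \log \sum_{k=1}^{n} \exp x_k$ (smooth max, or softmax, function)
$\nabla^2 f(x) = \frac{1}{1^T z} \mathrm{diag}(z) - \frac{1}{(1^T z)^2} z z^T$, $z_k = \exp x_k$
$v^T \nabla^2 f(x)\, v = \frac{1}{(1^T z)^2} \left[ \left( \sum_{k=1}^{n} z_k \right) \left( \sum_{k=1}^{n} z_k v_k^2 \right) - \left( \sum_{k=1}^{n} z_k v_k \right)^2 \right] \ge 0$, for all $v \in R^n$ (Cauchy-Schwarz inequality)
Thus, $f(x)$ is a convex function
Cauchy-Schwarz inequality: $(a^T a)(b^T b) \ge (a^T b)^2$, with $a_k = \sqrt{z_k}$, $b_k = v_k \sqrt{z_k}$
Proof 1: Let $z = a - \frac{a^T b}{b^T b} b$, or $a = z + \frac{a^T b}{b^T b} b$ (in this proof $z$ denotes this residual vector, which satisfies $z^T b = 0$, not the vector of exponentials above)
We have $a^T a = z^T z + \frac{(a^T b)^2}{(b^T b)^2} b^T b \ge \frac{(a^T b)^2}{(b^T b)^2} b^T b = \frac{(a^T b)^2}{b^T b}$
Proof 2: By induction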
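The quadratic form $v^T \nabla^2 f(x)\, v$ for log-sum-exp can be evaluated directly from the sums in the derivation. The sketch below does so, and also computes log-sum-exp itself with the usual max-shift for numerical stability; the shift is an implementation choice that is not in the slides, and it leaves the Hessian unchanged because the Hessian is invariant to rescaling $z$. The function names and sample vectors are assumptions.

```python
import math

def logsumexp(x):
    """log(sum_k exp(x_k)), shifted by max(x) to avoid overflow."""
    m = max(x)
    return m + math.log(sum(math.exp(xk - m) for xk in x))

def hessian_quad_form(x, v):
    """v^T H v = [(sum z)(sum z v^2) - (sum z v)^2] / (sum z)^2, z_k = exp(x_k)."""
    m = max(x)
    z = [math.exp(xk - m) for xk in x]   # rescaled z; the ratio below is unchanged
    s = sum(z)
    a = sum(zk * vk * vk for zk, vk in zip(z, v))
    b = sum(zk * vk for zk, vk in zip(z, v))
    return (s * a - b * b) / (s * s)

x = [1.0, 2.0, 3.0]
directions = [(1.0, 0.0, 0.0), (1.0, -1.0, 0.5), (-2.0, 3.0, 1.0), (1.0, 1.0, 1.0)]
quad_values = [hessian_quad_form(x, v) for v in directions]

# Sandwich bound: max(x) <= logsumexp(x) <= max(x) + log(n).
lse = logsumexp(x)
```

The all-ones direction gives a zero quadratic form, which matches the equality case of Cauchy-Schwarz ($a$ and $b$ parallel): the Hessian is positive semidefinite but not positive definite.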
3. Operations that preserve convexity
• Nonnegative multiple: $\alpha f$, where $\alpha \ge 0$, $f$ is convex
• Sum: $f_1 + f_2$, where $f_1$ and $f_2$ are convex
• Composition with affine function: $f(Ax + b)$, where $f$ is convex
Proof: $\nabla_x^2 f(Ax + b) = A^T \left. \nabla_y^2 f(y) \right|_{y = Ax + b} A$ (if $f$ is twice differentiable)
E.g. $f(x) = -\sum_{i=1}^{m} \log(b_i - a_i^T x)$, $\mathrm{dom}\, f = \{x \mid a_i^T x < b_i,\ i = 1, \dots, m\}$
$f(x) = \|Ax + b\|$
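The log-barrier example combines two of the rules above: each term $-\log(b_i - a_i^T x)$ is a convex function composed with an affine map, and the terms are summed. The sketch below checks midpoint convexity on a grid for an assumed instance, the box $-1 < x_1, x_2 < 1$ written as four linear inequalities; the data `A`, `b` and the grid are illustrative and not from the slides.

```python
import math

# Assumed data: the open box -1 < x1 < 1, -1 < x2 < 1 as a_i^T x < b_i.
A = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
b = [1.0, 1.0, 1.0, 1.0]

def log_barrier(x):
    """f(x) = -sum_i log(b_i - a_i^T x); returns +inf outside dom f."""
    total = 0.0
    for (a1, a2), bi in zip(A, b):
        slack = bi - (a1 * x[0] + a2 * x[1])
        if slack <= 0:
            return math.inf
        total -= math.log(slack)
    return total

# Grid of strictly feasible points.
grid = [(i / 10, j / 10) for i in range(-9, 10, 3) for j in range(-9, 10, 3)]

def midpoint(x, y):
    return (0.5 * (x[0] + y[0]), 0.5 * (x[1] + y[1]))

# Midpoint convexity: f((x + y) / 2) <= (f(x) + f(y)) / 2 on the grid.
midpoint_ok = all(
    log_barrier(midpoint(x, y)) <= 0.5 * (log_barrier(x) + log_barrier(y)) + 1e-12
    for x in grid for y in grid
)
```

Returning `math.inf` outside the domain follows the common extended-value convention; it is a design choice here, not something the slides specify.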