More Subgradient Calculus: Function Convexity first - PowerPoint PPT Presentation


SLIDE 1

More Subgradient Calculus: Function Convexity first

The following functions are again convex but, again, may not be differentiable everywhere. How does one compute their subgradients at points of non-differentiability?

▶ Nonnegative weighted sum: f = ∑_{i=1}^{n} α_i f_i is convex if each f_i, 1 ≤ i ≤ n, is convex and α_i ≥ 0, 1 ≤ i ≤ n.

▶ Composition with affine function: f(Ax + b) is convex if f is convex. For example:

▶ The log barrier for linear inequalities, f(x) = −∑_{i=1}^{m} log(b_i − a_i^T x), is convex since −log(x) is convex.

▶ Any norm of an affine function, f(x) = ||Ax + b||, is convex.

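To make the log barrier example above concrete, here is a minimal NumPy sketch (the matrix A, the vector b, and the test point are illustrative choices, not from the slides) that evaluates f(x) = −∑_{i=1}^{m} log(b_i − a_i^T x) and its gradient A^T (1/(b − Ax)) inside the domain {x : Ax < b}:

```python
import numpy as np

# Log barrier for linear inequalities: f(x) = -sum_i log(b_i - a_i^T x),
# defined on the open polyhedron {x : Ax < b}. Its gradient is
# grad f(x) = sum_i a_i / (b_i - a_i^T x) = A^T (1 / (b - Ax)).
def log_barrier(A, b, x):
    slack = b - A @ x
    if np.any(slack <= 0):
        return np.inf, None          # outside the domain
    value = -np.sum(np.log(slack))
    grad = A.T @ (1.0 / slack)
    return value, grad

# Illustrative data: a random polyhedron that contains the origin.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))
b = np.ones(5)                        # x = 0 satisfies A @ x < b
val, grad = log_barrier(A, b, np.zeros(3))
print(val, grad)
```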

SLIDE 2

More of Basic Subgradient Calculus

▶ Scaling: ∂(af) = a · ∂f, provided a > 0. The condition a > 0 makes the scaled function af remain convex.

▶ Addition: ∂(f_1 + f_2) = ∂(f_1) + ∂(f_2)

▶ Affine composition: if g(x) = f(Ax + b), then ∂g(x) = A^T ∂f(Ax + b)

▶ Norms: important special case, f(x) = ||x||_p

The derivations done in class can be used to show that if any other subgradient existed for g outside the stated set above, it could be used to construct a subgradient for f outside its stated set as well!

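As a sketch of the affine composition rule ∂g(x) = A^T ∂f(Ax + b), the following NumPy snippet (the data values are illustrative) builds one subgradient of g(x) = ||Ax + b||_1 by pushing a subgradient of the ℓ1 norm through A^T:

```python
import numpy as np

# Affine composition: for g(x) = f(Ax + b) with f = ||.||_1, a subgradient
# of g at x is A^T s, where s is a subgradient of ||.||_1 at z = Ax + b
# (s_i = sign(z_i); any s_i in [-1, 1] is valid where z_i = 0 -- here we
# pick 0, which is what np.sign returns at 0).
def subgrad_l1_affine(A, b, x):
    z = A @ x + b
    s = np.sign(z)
    return A.T @ s

A = np.array([[1.0, 2.0], [0.0, -1.0]])
b = np.array([0.5, 0.0])
x = np.array([1.0, -0.25])
print(subgrad_l1_affine(A, b, x))   # [1. 1.]
```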

SLIDE 3

More of Basic Subgradient Calculus

▶ Scaling: ∂(af) = a · ∂f, provided a > 0. The condition a > 0 makes the scaled function af remain convex.

▶ Addition: ∂(f_1 + f_2) = ∂(f_1) + ∂(f_2)

▶ Affine composition: if g(x) = f(Ax + b), then ∂g(x) = A^T ∂f(Ax + b)

▶ Norms: important special case, f(x) = ||x||_p = max_{||z||_q ≤ 1} z^T x, where q is such that 1/p + 1/q = 1. (On the board we have used y instead of z.)


SLIDE 4

More of Basic Subgradient Calculus

▶ Scaling: ∂(af) = a · ∂f, provided a > 0. The condition a > 0 makes the scaled function af remain convex.

▶ Addition: ∂(f_1 + f_2) = ∂(f_1) + ∂(f_2)

▶ Affine composition: if g(x) = f(Ax + b), then ∂g(x) = A^T ∂f(Ax + b)

▶ Norms: important special case, f(x) = ||x||_p = max_{||z||_q ≤ 1} z^T x, where q is such that 1/p + 1/q = 1. Then

∂f(x) = { y : ||y||_q ≤ 1 and y^T x = max_{||z||_q ≤ 1} z^T x }

Here y corresponds to a z at which the max is attained. This part is largely connected to the previous discussion on the max of convex functions.


SLIDE 5

More of Basic Subgradient Calculus

▶ Scaling: ∂(af) = a · ∂f, provided a > 0. The condition a > 0 makes the scaled function af remain convex.

▶ Addition: ∂(f_1 + f_2) = ∂(f_1) + ∂(f_2)

▶ Affine composition: if g(x) = f(Ax + b), then ∂g(x) = A^T ∂f(Ax + b)

▶ Norms: important special case, f(x) = ||x||_p = max_{||z||_q ≤ 1} z^T x, where q is such that 1/p + 1/q = 1. (This is derived in class.) Then

∂f(x) = { y : ||y||_q ≤ 1 and y^T x = max_{||z||_q ≤ 1} z^T x } = { y : ||y||_q ≤ 1 and y^T x = ||x||_p }

The constraint ||y||_q ≤ 1 is a consequence of Hölder's inequality.

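A minimal numerical check of this characterization, assuming NumPy (the test vector is illustrative): for p = 1 (q = ∞) the candidate y_i = sign(x_i), and for p = 2 (q = 2) the candidate y = x/||x||_2, both satisfy ||y||_q ≤ 1 and y^T x = ||x||_p:

```python
import numpy as np

# Subgradient of f(x) = ||x||_p via the dual-norm characterization:
# y in df(x) iff ||y||_q <= 1 and y^T x = ||x||_p, with 1/p + 1/q = 1.
x = np.array([3.0, 0.0, -4.0])

y1 = np.sign(x)                       # candidate for p = 1 (q = inf)
assert np.max(np.abs(y1)) <= 1.0                 # ||y||_inf <= 1
assert np.isclose(y1 @ x, np.sum(np.abs(x)))     # y^T x = ||x||_1

y2 = x / np.linalg.norm(x)            # candidate for p = 2 (q = 2), x != 0
assert np.isclose(np.linalg.norm(y2), 1.0)       # ||y||_2 <= 1 (equality)
assert np.isclose(y2 @ x, np.linalg.norm(x))     # y^T x = ||x||_2
print("both candidates satisfy the subgradient characterization")
```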

SLIDE 6

Subgradients for the ‘Lasso’ Problem in Machine Learning

We use the Lasso (min_x f(x)) as an example to illustrate subgradients of an affine composition:

f(x) = (1/2) ||y − x||_2^2 + λ ||x||_1

The subgradients of f(x) are x − y + λs, where s ∈ [−1, 1]^n is such that ||x||_1 = s^T x.


SLIDE 7

Subgradients for the ‘Lasso’ Problem in Machine Learning

We use the Lasso (min_x f(x)) as an example to illustrate subgradients of an affine composition:

f(x) = (1/2) ||y − x||_2^2 + λ ||x||_1

The subgradients of f(x) are h = x − y + λs, where s_i = sign(x_i) if x_i ≠ 0 and s_i ∈ [−1, 1] if x_i = 0.

The second case is a result of the convex hull of the two one-sided slopes {−1, +1} of |x_i| at x_i = 0.

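The subgradient formula can be used to certify optimality for this separable Lasso. Here is a minimal NumPy sketch (the data values are illustrative; the soft-thresholding closed form is standard for this form of the objective, not derived on these slides):

```python
import numpy as np

# Subgradients of the Lasso objective f(x) = (1/2)||y - x||_2^2 + lam*||x||_1
# are h = x - y + lam*s, with s_i = sign(x_i) if x_i != 0 and s_i in [-1, 1]
# otherwise. For this separable form the minimizer is soft-thresholding:
# x*_i = sign(y_i) * max(|y_i| - lam, 0).
y = np.array([2.0, 0.3, -1.5])
lam = 0.5
x_star = np.sign(y) * np.maximum(np.abs(y) - lam, 0.0)

# Certify optimality: 0 must be a subgradient at x*. Where x*_i = 0 we are
# free to pick s_i = y_i / lam, which lies in [-1, 1] exactly when
# soft-thresholding zeroed that coordinate.
s = np.where(x_star != 0, np.sign(x_star), y / lam)
h = x_star - y + lam * s
print(h)   # ~[0. 0. 0.]
```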

SLIDE 8

More Subgradient Calculus: Composition

The following functions, though convex, may not be differentiable everywhere. How does one compute their subgradients? (What holds for the subgradient also holds for the gradient.)

Composition with functions: Let p: ℜ^k → ℜ with p(x) = ∞ for all x ∉ dom p, and q: ℜ^n → ℜ^k. Define f(x) = p(q(x)). f is convex if

▶ q_i is convex, p is convex and nondecreasing in each argument,

▶ or q_i is concave, p is convex and nonincreasing in each argument.

We will consider only the first case.


SLIDE 9

More Subgradient Calculus: Composition

The following functions, though convex, may not be differentiable everywhere. How does one compute their subgradients? (What holds for the subgradient also holds for the gradient.)

Composition with functions: Let p: ℜ^k → ℜ with p(x) = ∞ for all x ∉ dom p, and q: ℜ^n → ℜ^k. Define f(x) = p(q(x)). f is convex if

▶ q_i is convex, p is convex and nondecreasing in each argument,

▶ or q_i is concave, p is convex and nonincreasing in each argument.

(In both conditions, the composition will instead be concave if p is concave.)

Some examples illustrating this property are:

▶ exp q(x) is convex if q is convex (exp is a monotonic and convex p)

▶ ∑_{i=1}^{m} log q_i(x) is concave if the q_i are concave and positive (here p is concave, and hence the composition is concave)

▶ log ∑_{i=1}^{m} exp q_i(x) is convex if the q_i are convex

▶ 1/q(x) is convex if q is concave and positive

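A quick numerical spot-check of the first composition rule, assuming NumPy (the affine q_i and the sampled segment are illustrative), using f(x) = log ∑_i exp q_i(x):

```python
import numpy as np

# Spot-check the composition rule on f(x) = log(sum_i exp(q_i(x))) with
# convex (here affine) q_i: log-sum-exp is convex and nondecreasing in each
# argument, so f should satisfy f(t*a + (1-t)*b) <= t*f(a) + (1-t)*f(b).
rng = np.random.default_rng(1)
A = rng.standard_normal((4, 3))      # q(x) = Ax, each q_i affine hence convex

def f(x):
    return np.log(np.sum(np.exp(A @ x)))

a, b = rng.standard_normal(3), rng.standard_normal(3)
for t in np.linspace(0.0, 1.0, 11):
    lhs = f(t * a + (1 - t) * b)
    rhs = t * f(a) + (1 - t) * f(b)
    assert lhs <= rhs + 1e-9
print("convexity inequality holds along the sampled segment")
```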

SLIDE 10

More Subgradient Calculus: Composition (contd)

Composition with functions: Let p: ℜ^k → ℜ with p(x) = ∞ for all x ∉ dom p, and q: ℜ^n → ℜ^k. Define f(x) = p(q(x)). f is convex if

▶ q_i is convex, p is convex and nondecreasing in each argument,

▶ or q_i is concave, p is convex and nonincreasing in each argument.

Subgradients for the first case (second one is homework):


SLIDE 11

More Subgradient Calculus: Composition (contd)

Composition with functions: Let p: ℜ^k → ℜ with p(x) = ∞ for all x ∉ dom p, and q: ℜ^n → ℜ^k. Define f(x) = p(q(x)). f is convex if

▶ q_i is convex, p is convex and nondecreasing in each argument,

▶ or q_i is concave, p is convex and nonincreasing in each argument.

Subgradients for the first case (the second one is homework):

f(y) = p(q_1(y), …, q_k(y)) ≥ p(q_1(x) + h_{q_1}^T (y − x), …, q_k(x) + h_{q_k}^T (y − x))

where h_{q_i} ∈ ∂q_i(x) for i = 1..k, and since p(·) is nondecreasing in each argument,

≥ p(q_1(x), …, q_k(x)) + h_p^T (h_{q_1}^T (y − x), …, h_{q_k}^T (y − x))

where h_p ∈ ∂p(q_1(x), …, q_k(x)).

All we need to do next is club together h_p and the h_{q_i} and leave only (y − x) in the second term.


SLIDE 12

More Subgradient Calculus: Composition (contd)

Composition with functions: Let p: ℜ^k → ℜ with p(x) = ∞ for all x ∉ dom p, and q: ℜ^n → ℜ^k. Define f(x) = p(q(x)). f is convex if

▶ q_i is convex, p is convex and nondecreasing in each argument,

▶ or q_i is concave, p is convex and nonincreasing in each argument.

Subgradients for the first case (the second one is homework):

f(y) = p(q_1(y), …, q_k(y)) ≥ p(q_1(x) + h_{q_1}^T (y − x), …, q_k(x) + h_{q_k}^T (y − x))

where h_{q_i} ∈ ∂q_i(x) for i = 1..k, and since p(·) is nondecreasing in each argument,

p(q_1(x) + h_{q_1}^T (y − x), …, q_k(x) + h_{q_k}^T (y − x)) ≥ p(q_1(x), …, q_k(x)) + h_p^T (h_{q_1}^T (y − x), …, h_{q_k}^T (y − x))

where h_p ∈ ∂p(q_1(x), …, q_k(x)). Finally,

p(q_1(x), …, q_k(x)) + h_p^T (h_{q_1}^T (y − x), …, h_{q_k}^T (y − x)) = f(x) + ∑_{i=1}^{k} (h_p)_i h_{q_i}^T (y − x)

That is, ∑_{i=1}^{k} (h_p)_i h_{q_i} is a subgradient of the composite function at x.

H/W: Derive the subdifferentials of the example functions on the previous slide.

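Here is a minimal NumPy sketch of the resulting formula ∑_{i=1}^{k} (h_p)_i h_{q_i} (the particular p, q_i, and data are illustrative; both functions are smooth here, so the subgradient is the gradient and can be checked against finite differences):

```python
import numpy as np

# Composition subgradient: for f(x) = p(q_1(x), ..., q_k(x)) with each q_i
# convex and p convex and nondecreasing per argument, a subgradient at x is
# sum_i (h_p)_i * h_{q_i}, where h_p in dp(q(x)) and h_{q_i} in dq_i(x).
# Sketch with p = log-sum-exp (h_p = softmax gradient) and
# q_i(x) = 0.5 x^T x + c_i^T x (smooth convex, h_{q_i} = x + c_i).
rng = np.random.default_rng(2)
C = rng.standard_normal((4, 3))                 # rows are the c_i

def q(x):  return 0.5 * (x @ x) + C @ x         # vector of q_i(x)
def f(x):  return np.log(np.sum(np.exp(q(x))))

x = rng.standard_normal(3)
h_p = np.exp(q(x)) / np.sum(np.exp(q(x)))       # softmax = grad of log-sum-exp
H_q = x + C                                     # row i is h_{q_i} = x + c_i
g = H_q.T @ h_p                                 # sum_i (h_p)_i h_{q_i}

# Finite-difference check (valid since everything is differentiable here).
eps = 1e-6
fd = np.array([(f(x + eps * e) - f(x - eps * e)) / (2 * eps)
               for e in np.eye(3)])
print(np.max(np.abs(g - fd)))                   # ~1e-9
```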

SLIDE 13

More Subgradient Calculus: Proximal Operator

The following functions are again convex but, again, may not be differentiable everywhere. How does one compute their subgradients at points of non-differentiability?

Infimum: If c(x, y) is convex in (x, y) and C is a convex set, then d(x) = inf_{y ∈ C} c(x, y) is convex. For example:

Let d(x, C) be the function that returns the distance of a point x to a convex set C. That is, d(x, C) = inf_{y ∈ C} ||x − y|| = ||x − P_C(x)||, where P_C(x) = argmin_{y ∈ C} ||x − y||. Then d(x, C) is a convex function and

∇d(x, C) = (x − P_C(x)) / ||x − P_C(x)||

H/W: Prove that d is convex if c is a convex function and C is a convex set.

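A minimal sketch of d(x, C) and its gradient, assuming NumPy and taking C to be the Euclidean unit ball (an illustrative choice, since its projection has a simple closed form):

```python
import numpy as np

# Distance of x to a convex set C, d(x, C) = ||x - P_C(x)||, and its gradient
# grad d(x, C) = (x - P_C(x)) / ||x - P_C(x)|| for x outside C. Illustrated
# for C = Euclidean unit ball, where P_C(x) = x / max(1, ||x||_2).
def project_unit_ball(x):
    return x / max(1.0, np.linalg.norm(x))

def dist_and_grad(x):
    r = x - project_unit_ball(x)
    d = np.linalg.norm(r)
    return (d, r / d) if d > 0 else (0.0, None)  # formula needs x outside C

x = np.array([3.0, 4.0])         # ||x||_2 = 5, so d(x, C) = 4
print(dist_and_grad(x))          # (4.0, array([0.6, 0.8]))
```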

SLIDE 14

More Subgradient Calculus: Proximal Operator

The following functions are again convex but, again, may not be differentiable everywhere. How does one compute their subgradients at points of non-differentiability?

Infimum: If c(x, y) is convex in (x, y) and C is a convex set, then d(x) = inf_{y ∈ C} c(x, y) is convex. For example:

Let d(x, C) be the function that returns the distance of a point x to a convex set C. That is, d(x, C) = inf_{y ∈ C} ||x − y|| = ||x − P_C(x)||, where P_C(x) = argmin_{y ∈ C} ||x − y||. Then d(x, C) is a convex function and

∇d(x, C) = (x − P_C(x)) / ||x − P_C(x)||

....The point of intersection of convex sets C_1, C_2, …, C_m by minimizing... (Subgradients and Alternating Projections)

P_C(x) = argmin_{y ∈ C} ||x − y|| is a special case of the proximity operator prox_c(x) = argmin_y PROX_c(x, y) of a convex function c(x), where PROX_c(x, y) = c(y) + (1/2) ||x − y||^2. The special case is when c(y) is the indicator function over C.


SLIDE 15

More Subgradient Calculus: Proximal Operator

⋆ We will invoke this when we discuss the proximal gradient descent algorithm.

The following functions are again convex but, again, may not be differentiable everywhere. How does one compute their subgradients at points of non-differentiability?

Infimum: If c(x, y) is convex in (x, y) and C is a convex set, then d(x) = inf_{y ∈ C} c(x, y) is convex. For example:

Let d(x, C) be the function that returns the distance of a point x to a convex set C. That is, d(x, C) = inf_{y ∈ C} ||x − y|| = ||x − P_C(x)||, where P_C(x) = argmin_{y ∈ C} ||x − y||. Then d(x, C) is a convex function and

∇d(x, C) = (x − P_C(x)) / ||x − P_C(x)||

....The point of intersection of convex sets C_1, C_2, …, C_m by minimizing... (Subgradients and Alternating Projections)

P_C(x) = argmin_{y ∈ C} ||x − y|| is a special case of the proximity operator prox_c(x) = argmin_y PROX_c(x, y) of a convex function c(x), where PROX_c(x, y) = c(y) + (1/2) ||x − y||^2. The special case is when c(y) is the indicator function I_C(y) introduced earlier to eliminate the constraints of an optimization problem. (Proximal methods will be done in detail later.)

Recall that ∂I_C(y) = N_C(y) = { h ∈ ℜ^n : h^T y ≥ h^T z for any z ∈ C }

The subdifferential ∂_y PROX_c(x, y) = ∂c(y) + y − x, which can now be obtained for the special case c(y) = I_C(y).
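To make the proximity operator concrete, here is a minimal NumPy sketch of two standard closed-form instances (both are classical results, not derived on these slides): the prox of the indicator of a box is the projection onto the box, and the prox of λ||·||_1 is soft-thresholding:

```python
import numpy as np

# Proximity operator: prox_c(x) = argmin_y c(y) + (1/2) ||x - y||^2.
# Two standard closed forms:
#   c = indicator of a convex set C  ->  prox_c is the projection P_C
#   c = lam * ||.||_1                ->  prox_c is soft-thresholding
def prox_indicator_box(x, lo, hi):
    return np.clip(x, lo, hi)        # projection onto the box [lo, hi]^n

def prox_l1(x, lam):
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

x = np.array([2.0, -0.3, 0.7])
print(prox_indicator_box(x, -1.0, 1.0))   # [ 1.  -0.3  0.7]
print(prox_l1(x, 0.5))                    # [ 1.5 -0.   0.2]
```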