Living on the Edge ❦ Phase Transitions in Convex Programs with Random Data
Joel A. Tropp, Michael B. McCoy
Computing + Mathematical Sciences, California Institute of Technology
Joint with Dennis Amelunxen and Martin Lotz (Manchester)
Research supported in part by ONR, AFOSR, DARPA, and the Sloan Foundation
Convex Programs with Random Data
Examples...
❧ Stat and ML. Random data models; fit model via optimization
❧ Sensing. Collect random measurements; reconstruct via optimization
❧ Coding. Random channel models; decode via optimization
Motivations...
❧ Average-case analysis. Randomness describes “typical” behavior
❧ Fundamental bounds. Opportunities and limits for convex methods
Living on the Edge, Modern Time–Frequency Analysis, Strobl, 3 June 2014
Research Challenge...
Understand and predict precise behavior of random convex programs
References: Donoho–Maleki–Montanari 2009, Donoho–Johnstone–Montanari 2011, Donoho–Gavish–Montanari 2013
A Theory Emerges...
❧ Vershik & Sporyshev, “An asymptotic estimate for the average number of steps...” 1986
❧ Donoho, “High-dimensional centrally symmetric polytopes...” 2/2005
❧ Rudelson & Vershynin, “On sparse reconstruction...” 2/2006
❧ Donoho & Tanner, “Counting faces of randomly projected polytopes...” 5/2006
❧ Xu & Hassibi, “Compressed sensing over the Grassmann manifold...” 9/2008
❧ Stojnic, “Various thresholds for ℓ1 optimization...” 7/2009
❧ Bayati & Montanari, “The LASSO risk for gaussian matrices” 8/2010
❧ Oymak & Hassibi, “New null space results and recovery thresholds...” 11/2010
❧ Chandrasekaran, Recht, et al., “The convex geometry of linear inverse problems” 12/2010
❧ McCoy & Tropp, “Sharp recovery bounds for convex demixing...” 5/2012
❧ Bayati, Lelarge, & Montanari, “Universality in polytope phase transitions...” 7/2012
❧ Chandrasekaran & Jordan, “Computational & statistical tradeoffs...” 10/2012
❧ Amelunxen, Lotz, McCoy, & Tropp, “Living on the edge...” 3/2013
❧ Stojnic, various works 3/2013
❧ Foygel & Mackey, “Corrupted sensing: Novel guarantees...” 5/2013
❧ Oymak & Hassibi, “Asymptotically exact denoising...” 5/2013
❧ McCoy & Tropp, “From Steiner formulas for cones...” 8/2013
❧ McCoy & Tropp, “The achievable performance of convex demixing...” 9/2013
❧ Oymak, Thrampoulidis, & Hassibi, “The squared-error of generalized LASSO...” 11/2013
The Core Question
How big is a cone?
Regularized Denoising
Denoising a Piecewise Smooth Signal
[Figure: two panels plotting Value against Time on [0, 1]. Left panel: “Piecewise smooth function + additive white noise” (curves: Original, Original + noise). Right panel: “Denoised piecewise smooth function” (curves: Original, By wavelet shrinkage).]
❧ Observation: $z = x^\natural + \sigma g$ where $g \sim \mathrm{normal}(0, I)$
❧ Denoise via wavelet shrinkage = convex optimization:
minimize $\tfrac{1}{2} \| x - z \|_2^2 + \lambda \| W x \|_1$
Reference: Donoho & Johnstone, early 1990s
Setup for Regularized Denoising
❧ Let $x^\natural \in \mathbb{R}^d$ be “structured” but unknown
❧ Let $f : \mathbb{R}^d \to \mathbb{R}$ be a convex function that reflects “structure”
❧ Observe $z = x^\natural + \sigma g$ where $g \sim \mathrm{normal}(0, I)$
❧ Remove noise by solving the convex program*
minimize $\tfrac{1}{2} \| x - z \|_2^2$ subject to $f(x) \le f(x^\natural)$
❧ Hope: The minimizer $\hat{x}$ approximates $x^\natural$
*We assume the side information $f(x^\natural)$ is available. This is equivalent** to knowing the optimal choice of Lagrange multiplier for the constraint.
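As a concrete illustration of this setup, the sketch below takes $f = \| \cdot \|_1$, which is an assumption made here for the example and not a choice fixed by the slide. In that case the constrained program is the Euclidean projection of $z$ onto the ℓ1 ball of radius $\| x^\natural \|_1$, which a standard sort-and-threshold routine can compute. The function names, the sparse test signal, and the parameter values are all hypothetical.

```python
import math
import random


def project_l1_ball(z, radius):
    """Euclidean projection of the list z onto {x : ||x||_1 <= radius}."""
    if radius <= 0:
        return [0.0] * len(z)
    if sum(abs(v) for v in z) <= radius:
        return list(z)  # already feasible
    # Sort magnitudes and find the soft-threshold level theta such that
    # sum(max(|z_i| - theta, 0)) == radius.
    u = sorted((abs(v) for v in z), reverse=True)
    cumsum, theta = 0.0, 0.0
    for k, uk in enumerate(u, start=1):
        cumsum += uk
        t = (cumsum - radius) / k
        if uk > t:
            theta = t  # the last k that passes this test gives the threshold
    return [math.copysign(max(abs(v) - theta, 0.0), v) for v in z]


def denoise_demo(d=200, s=10, sigma=0.5, seed=0):
    """Return squared errors (raw observation, denoised estimate)."""
    rng = random.Random(seed)
    # Hypothetical structured signal: s nonzero entries of magnitude 3.
    x_nat = [0.0] * d
    for i in rng.sample(range(d), s):
        x_nat[i] = rng.choice([-1.0, 1.0]) * 3.0
    z = [xi + sigma * rng.gauss(0.0, 1.0) for xi in x_nat]
    radius = sum(abs(v) for v in x_nat)  # side information f(x_nat)
    x_hat = project_l1_ball(z, radius)
    err_z = sum((a - b) ** 2 for a, b in zip(z, x_nat))
    err_hat = sum((a - b) ** 2 for a, b in zip(x_hat, x_nat))
    return err_z, err_hat


if __name__ == "__main__":
    print(denoise_demo())
```

Because $x^\natural$ lies in the constraint set and projection onto a convex set is nonexpansive, the denoised error never exceeds the error of the raw observation.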
Geometry of Denoising I
[Figure: the point $x^\natural$, the noise $\sigma g$, the observation $z$, the sublevel set $\{ x : f(x) \le f(x^\natural) \}$, and the estimate $\hat{x}$.]
References: Chandrasekaran & Jordan 2012, Oymak & Hassibi 2013
Descent Cones
Definition. The descent cone of a function $f$ at a point $x$ is
$D(f, x) := \{ h : f(x + \varepsilon h) \le f(x) \text{ for some } \varepsilon > 0 \}$
[Figure: left, the point $x$ and the sublevel set $\{ y : f(y) \le f(x) \}$ inside the shifted cone $x + D(f, x)$; right, the origin $0$ and the set $\{ h : f(x + h) \le f(x) \}$ inside the cone $D(f, x)$.]
References: Rockafellar 1970, Hiriart-Urruty & Lemaréchal 1996
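To make the definition concrete, here is a minimal sketch for the ℓ1 norm, which is chosen here as an example and is not fixed by the slide. Because the ℓ1 norm is piecewise linear, a direction $h$ lies in the descent cone at $x$ exactly when the directional derivative $\sum_{i : x_i \ne 0} \mathrm{sign}(x_i) h_i + \sum_{i : x_i = 0} |h_i|$ is nonpositive. The function names are hypothetical.

```python
def l1_norm(x):
    return sum(abs(v) for v in x)


def l1_directional_derivative(x, h):
    """One-sided directional derivative of the l1 norm at x in direction h."""
    total = 0.0
    for xi, hi in zip(x, h):
        if xi > 0:
            total += hi
        elif xi < 0:
            total -= hi
        else:
            total += abs(hi)  # off the support, any movement increases the norm
    return total


def in_l1_descent_cone(x, h, tol=1e-12):
    """Membership test for D(||.||_1, x) via the directional derivative."""
    return l1_directional_derivative(x, h) <= tol


def decreases_for_small_step(x, h, eps=1e-6):
    """Check the definition directly for one small step size eps."""
    stepped = [xi + eps * hi for xi, hi in zip(x, h)]
    return l1_norm(stepped) <= l1_norm(x)


if __name__ == "__main__":
    x = [1.0, 0.0]
    print(in_l1_descent_cone(x, [-1.0, 0.5]), decreases_for_small_step(x, [-1.0, 0.5]))
    print(in_l1_descent_cone(x, [-1.0, 2.0]), decreases_for_small_step(x, [-1.0, 2.0]))
```

The direct check with a single step size is only a numerical sanity test. The derivative test is the reliable one, and it relies on the piecewise-linear structure of the ℓ1 norm.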
Geometry of Denoising II
[Figure: the point $x^\natural$, the noise $\sigma g$, the observation $z$, the sublevel set $\{ x : f(x) \le f(x^\natural) \}$, the estimate $\hat{x}$, the shifted cone $x^\natural + K$, and the projection $\Pi_K(\sigma g)$, where $K = D(f, x^\natural)$.]
References: Chandrasekaran & Jordan 2012, Oymak & Hassibi 2013
The Risk of Regularized Denoising
Theorem 1. [Oymak & Hassibi 2013] Assume
❧ We observe $z = x^\natural + \sigma g$ where $g$ is standard normal
❧ The vector $\hat{x}$ solves
minimize $\tfrac{1}{2} \| z - x \|_2^2$ subject to $f(x) \le f(x^\natural)$
Then
$\sup_{\sigma > 0} \dfrac{\mathbb{E} \, \| \hat{x} - x^\natural \|_2^2}{\sigma^2} = \mathbb{E} \, \| \Pi_K(g) \|_2^2$
where $K = D(f, x^\natural)$ and $\Pi_K$ is the Euclidean metric projector onto $K$.
Related: Bhaskar–Tang–Recht 2012, Donoho–Johnstone–Montanari 2012, Chandrasekaran & Jordan 2012
Statistical Dimension
Statistical Dimension: The Motion Picture
[Figure: two panels, “small cone” and “big cone”, each showing the origin $0$, a cone $K$, a Gaussian vector $g$, and its projection $\Pi_K(g)$.]
The Statistical Dimension of a Cone
Definition. The statistical dimension $\delta(K)$ of a closed, convex cone $K$ is the quantity
$\delta(K) := \mathbb{E} \left[ \| \Pi_K(g) \|_2^2 \right]$
where
❧ $\Pi_K$ is the Euclidean metric projector onto $K$
❧ $g \sim \mathrm{normal}(0, I)$ is a standard normal vector
References: Rudelson & Vershynin 2006, Stojnic 2009, Chandrasekaran et al. 2010, Chandrasekaran & Jordan 2012, Amelunxen et al. 2013
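The definition suggests a direct Monte Carlo estimate whenever the projector onto the cone is easy to compute. The sketch below is illustrative only: it averages $\| \Pi_K(g) \|_2^2$ over Gaussian draws for two cones with explicit projectors, a coordinate subspace and the nonnegative orthant. The function names, trial count, and seed are assumptions of this example.

```python
import random


def estimate_statistical_dimension(project, d, trials=20000, seed=0):
    """Monte Carlo estimate of E ||Pi_K(g)||_2^2 for g ~ normal(0, I_d)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        g = [rng.gauss(0.0, 1.0) for _ in range(d)]
        total += sum(v * v for v in project(g))
    return total / trials


def project_orthant(g):
    """Projection onto the nonnegative orthant: clip negative entries to zero."""
    return [max(v, 0.0) for v in g]


def make_subspace_projector(j):
    """Projection onto the span of the first j coordinate vectors."""
    def project(g):
        return [v if i < j else 0.0 for i, v in enumerate(g)]
    return project


if __name__ == "__main__":
    d = 20
    print("orthant:", estimate_statistical_dimension(project_orthant, d))
    print("5-dim subspace:", estimate_statistical_dimension(make_subspace_projector(5), d))
```

For a $j$-dimensional subspace the estimate should be close to $j$, and for the orthant in $\mathbb{R}^d$ it should be close to $d/2$, so the statistical dimension extends the usual linear dimension to cones.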
Basic Statistical Dimension Calculations
Cone                   Notation             Statistical Dimension
j-dim subspace         $L_j$                $j$
Nonnegative orthant    $\mathbb{R}_+^d$     $\tfrac{1}{2} d$
Second-order cone      $\mathbb{L}^{d+1}$   $\tfrac{1}{2} (d + 1)$
Real psd cone          $\mathbb{S}_+^d$     $\tfrac{1}{4} d (d + 1)$
Complex psd cone       $\mathbb{H}_+^d$     $\tfrac{1}{2} d^2$
References: Chandrasekaran et al. 2010, Amelunxen et al. 2013
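The orthant and second-order cone entries can be checked with a short argument. The derivation below is a sketch that relies on the Moreau decomposition for a closed convex cone $K \subset \mathbb{R}^D$ with polar cone $K^\circ$, a standard fact that the slide does not state. The symbol $D$ for the ambient dimension is introduced here for illustration.

```latex
\begin{align*}
g &= \Pi_K(g) + \Pi_{K^\circ}(g),
  \qquad \langle \Pi_K(g), \Pi_{K^\circ}(g) \rangle = 0
  && \text{(Moreau decomposition)} \\
\| g \|_2^2 &= \| \Pi_K(g) \|_2^2 + \| \Pi_{K^\circ}(g) \|_2^2 \\
D = \mathbb{E} \, \| g \|_2^2 &= \delta(K) + \delta(K^\circ)
  && \text{(take expectations)} \\
\delta(K) &= \tfrac{1}{2} D
  && \text{if } K^\circ = -K, \text{ since } g \text{ and } -g
     \text{ have the same distribution}
\end{align*}
```

The nonnegative orthant in $\mathbb{R}^d$ and the second-order cone in $\mathbb{R}^{d+1}$ are both self-dual in this sense, which gives $\tfrac{1}{2} d$ and $\tfrac{1}{2}(d+1)$ respectively.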
Circular Cones
[Figure: plot with axis ticks running from 0 to 1 in steps of 1/4; axis labels not recoverable.]
References: Amelunxen et al. 2013, Mu et al. 2013, McCoy & Tropp 2013
Descent Cone of ℓ1 Norm at Sparse Vector
[Figure: plot with both axes running from 0 to 1 in steps of 1/4; axis labels not recoverable.]
References: Stojnic 2009, Donoho & Tanner 2010, Chandrasekaran et al. 2010, Amelunxen et al. 2013, Mackey & Foygel 2013
Descent Cone of S1 Norm at Low-Rank Matrix
[Figure: plot with both axes running from 0 to 1 in steps of 1/4; axis labels not recoverable.]
References: Oymak & Hassibi 2010, Chandrasekaran et al. 2010, Amelunxen et al. 2013, Foygel & Mackey 2013
Regularized Linear Inverse Problems
Example: The Random Demodulator
[Figure: two time–frequency plots, Frequency (MHz) against Time (μs), together with a block diagram labeled: Input to sensor; Linear data acquisition system; Seed; Pseudorandom Number Generator; Output from sensor; Reconstruct (e.g., convex optimization).]
Reference: Tropp et al. 2010, ...
Setup for Linear Inverse Problems
❧ Let $x^\natural \in \mathbb{R}^d$ be a structured, unknown vector
❧ Let $f : \mathbb{R}^d \to \mathbb{R}$ be a convex function that reflects structure
❧ Let $A \in \mathbb{R}^{m \times d}$ be a measurement operator
❧ Observe $z = A x^\natural$
❧ Find estimate $\hat{x}$ by solving convex program
minimize $f(x)$ subject to $A x = z$
❧ Hope: $\hat{x} = x^\natural$
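One common instance of this setup takes $f = \| \cdot \|_1$ and a Gaussian measurement matrix. Both are assumptions of the sketch below, as are the problem sizes and function names. With that choice the program becomes a linear program after splitting $x = u - v$ with $u, v \ge 0$, which the example passes to `scipy.optimize.linprog`. NumPy and SciPy are assumed to be installed.

```python
import numpy as np
from scipy.optimize import linprog


def l1_recover(A, z):
    """Solve: minimize ||x||_1 subject to A x = z, as a linear program."""
    m, d = A.shape
    # Split x = u - v with u, v >= 0; at an optimum ||x||_1 = sum(u) + sum(v).
    c = np.ones(2 * d)
    A_eq = np.hstack([A, -A])
    res = linprog(c, A_eq=A_eq, b_eq=z, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:d] - res.x[d:]


def demo_recovery(d=60, s=3, m=45, seed=0):
    """Return the largest entrywise error of the l1 estimate for a sparse x_nat."""
    rng = np.random.default_rng(seed)
    # Hypothetical structured vector: s nonzero Gaussian entries.
    x_nat = np.zeros(d)
    support = rng.choice(d, size=s, replace=False)
    x_nat[support] = rng.standard_normal(s)
    A = rng.standard_normal((m, d))  # random measurement operator
    z = A @ x_nat
    x_hat = l1_recover(A, z)
    return float(np.max(np.abs(x_hat - x_nat)))


if __name__ == "__main__":
    print("max abs error:", demo_recovery())
```

With this many measurements relative to the sparsity, exact recovery up to solver tolerance is the typical outcome. Reducing `m` far enough makes the program return a different minimizer instead.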
Geometry of Linear Inverse Problems
[Figure: two panels, “Success!” and “Failure!”, each showing the point $x^\natural$, the affine space $x^\natural + \mathrm{null}(A)$, the sublevel set $\{ x : f(x) \le f(x^\natural) \}$, and the shifted descent cone $x^\natural + D(f, x^\natural)$.]
References: Candès–Romberg–Tao 2005, Rudelson–Vershynin 2006, Chandrasekaran et al. 2010, Amelunxen et al. 2013