Nonlinear Eigenproblems in Data Analysis and Graph Partitioning

Matthias Hein
Department of Mathematics and Computer Science
Saarland University, Saarbrücken, Germany

Minisymposium: Modern matrix methods for large scale data and networks
SIAM Conference on Applied Linear Algebra, Valencia, 19.06.2012
Linear Eigenproblems in Machine Learning

Motivation: eigenvalue problems are abundant in data analysis.

- Principal Component Analysis: largest eigenvectors of the covariance matrix of the data.
  Usage: denoising by projection onto the largest eigenvectors.
- Spectral Clustering: second smallest eigenvector of the graph Laplacian.
  Usage: graph partitioning using the thresholded eigenvector.
- Latent Semantic Analysis: singular value decomposition of the term-document matrix.
  Usage: recovering the underlying latent semantic structure.
- Many more!
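To make the PCA entry concrete, here is a minimal NumPy sketch (not from the talk; the data and the choice k = 2 are made up for illustration) of computing the principal components as eigenvectors of the covariance matrix and denoising by projection onto the largest ones:

```python
import numpy as np

# Toy data: 200 points in R^5 with anisotropic variance (made up).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) * np.array([3.0, 2.0, 1.0, 0.5, 0.1])

# PCA: eigenvectors of the covariance matrix of the data.
C = np.cov(X, rowvar=False)
evals, evecs = np.linalg.eigh(C)            # eigenvalues in ascending order
order = np.argsort(evals)[::-1]             # largest eigenvalues first
evals, evecs = evals[order], evecs[:, order]

# Denoising by projection onto the k largest eigenvectors.
k = 2
Wk = evecs[:, :k]
mu = X.mean(axis=0)
X_denoised = (X - mu) @ Wk @ Wk.T + mu
```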
The Symmetric Linear Eigenproblem

Generalized Symmetric Linear Eigenproblem: Let $A, B \in \mathbb{R}^{n \times n}$ be symmetric and $B$ positive definite. Then

$$ Ax = \frac{\langle x, Ax \rangle}{\langle x, Bx \rangle} \, Bx \iff x \text{ is a critical point of } \frac{\langle x, Ax \rangle}{\langle x, Bx \rangle}. $$

Variational principle: the Courant-Fischer min-max theorem yields $n$ eigenvalues:

$$ \lambda_m = \min_{U_m \in \mathcal{U}_m} \max_{x \in U_m} \frac{\langle x, Ax \rangle}{\langle x, Bx \rangle}, \quad m = 1, \dots, n, $$

where $\mathcal{U}_m$ is the class of all $m$-dimensional subspaces of $\mathbb{R}^n$.

Critical point theory for ratios of quadratic functions.
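A short numerical sketch of this setup, using SciPy's generalized symmetric eigensolver on arbitrary stand-in matrices, checking that each eigenvector is a critical point of the Rayleigh quotient with value equal to its eigenvalue:

```python
import numpy as np
from scipy.linalg import eigh

# Arbitrary symmetric A and positive definite B (stand-ins).
rng = np.random.default_rng(1)
M = rng.normal(size=(5, 5))
A = (M + M.T) / 2
B = M @ M.T + 5 * np.eye(5)

# Generalized symmetric eigenproblem A x = lambda B x.
lams, V = eigh(A, b=B)                      # eigenvalues sorted ascending

# The Rayleigh quotient <x, Ax> / <x, Bx> at an eigenvector equals the
# eigenvalue; Courant-Fischer characterizes lambda_m as a min-max of
# this quotient over m-dimensional subspaces.
x = V[:, 0]
assert np.isclose((x @ A @ x) / (x @ B @ x), lams[0])
```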
Robust PCA

Principal Component Analysis (PCA)

[Figure: 2D scatter plot of the data showing the first PCA component for the original data and for the perturbed data.]

Sources of outliers: noisy data, adversarial manipulation.

                PCA (linear)                                                                       Robust PCA (nonlinear)
Ratio           $\frac{\sum_{i=1}^n \langle w, X_i - \frac{1}{n} \sum_{j=1}^n X_j \rangle^2}{\|w\|_2^2}$   $\Rightarrow$   $\frac{V(\langle w, X_1 \rangle, \dots, \langle w, X_n \rangle)}{\|w\|_2}$
Robustness      no                                                                                 yes
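A hedged sketch of the robustness column: the table only requires $V$ to be a robust scale measure of the projections, so the median absolute deviation used below is one illustrative choice, not necessarily the one intended in the talk:

```python
import numpy as np

# Toy 2D data plus one adversarial outlier (made up).
rng = np.random.default_rng(2)
X = rng.normal(size=(100, 2)) * np.array([3.0, 0.3])
X_out = np.vstack([X, [[0.0, 50.0]]])

def pca_ratio(w, X):
    # Linear PCA objective: centered projected variance / ||w||_2^2.
    p = X @ w
    return np.sum((p - p.mean()) ** 2) / (w @ w)

def robust_ratio(w, X):
    # Nonlinear variant V(<w,X_1>,...,<w,X_n>) / ||w||_2 with V chosen
    # here as the median absolute deviation (one possible robust scale).
    p = X @ w
    return np.median(np.abs(p - np.median(p))) / np.linalg.norm(w)

e2 = np.array([0.0, 1.0])
# A single outlier inflates the quadratic objective along e2
# dramatically, while the MAD-based objective barely moves.
print(pca_ratio(e2, X), pca_ratio(e2, X_out))
print(robust_ratio(e2, X), robust_ratio(e2, X_out))
```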
The Symmetric Linear Eigenproblem

Pros:
- Fast solvers available.

Cons: restriction to ratios of quadratic functionals ⟹ limited modeling abilities:
- quadratic functionals are non-robust against outliers (PCA),
- quadratic functionals cannot induce eigenvectors which are sparse.

Idea: replace the quadratic functionals by convex p-homogeneous functions!
The Nonlinear Eigenproblem

(Homogeneous) Nonlinear Eigenproblem: Let $R, S : \mathbb{R}^n \to \mathbb{R}$ be convex, even and $p$-homogeneous ($R(\gamma x) = |\gamma|^p R(x)$) with $S(x) = 0 \Leftrightarrow x = 0$. Then

$$ 0 \in \partial R(x) - \frac{R(x)}{S(x)} \, \partial S(x) \impliedby x \text{ is a critical point of } \frac{R(x)}{S(x)}. $$

Variational principle: the Lusternik-Schnirelmann min-max theorem yields $n$ nonlinear eigenvalues:

$$ \lambda_m = \min_{K \in \mathcal{K}_m} \max_{x \in K} \frac{R(x)}{S(x)}, \quad m = 1, \dots, n, $$

where $\mathcal{K}_m$ is the class of all compact symmetric subsets of $\{x \in \mathbb{R}^n \mid S(x) > 0\}$ with Krasnoselskii genus greater than or equal to $m$.

New: in general, more than $n$ eigenvectors exist.
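For intuition, a minimal numerical check of these definitions on a concrete 1-homogeneous pair, the graph total variation and the 1-norm (the same pair reappears later in the 1-spectral clustering part); the small weight matrix is invented for illustration:

```python
import numpy as np

# Concrete 1-homogeneous pair on a small weighted graph:
#   R(f) = (1/2) sum_ij w_ij |f_i - f_j|   (graph total variation)
#   S(f) = ||f||_1
W = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 2.0],
              [0.0, 2.0, 0.0]])

def R(f):
    return 0.5 * np.sum(W * np.abs(f[:, None] - f[None, :]))

def S(f):
    return np.sum(np.abs(f))

# Both are convex, even and p-homogeneous with p = 1:
f, gamma = np.array([1.0, -2.0, 0.5]), -3.0
assert np.isclose(R(gamma * f), abs(gamma) * R(f))
assert np.isclose(S(gamma * f), abs(gamma) * S(f))
assert S(np.zeros(3)) == 0.0                # S(x) = 0 iff x = 0
```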
The Nonlinear Eigenproblem II

Pros:
- Stronger modeling power using non-quadratic functions R and S.
- Specific properties of the eigenvectors, such as robustness against outliers or sparsity, can be induced by nonsmooth choices of S and R, respectively.

Challenges:
- The optimization problems for nonlinear eigenproblems are typically nonconvex and nonsmooth.
- Need for new efficient algorithms!
(Inverse) Power Method for Nonlinear Eigenproblems

Inverse Power Method for Linear Eigenproblems:

$$ f^{k+1} = \arg\min_{u \in \mathbb{R}^n} \tfrac{1}{2} \langle u, Au \rangle - \langle u, B f^k \rangle \iff A f^{k+1} = B f^k. $$

The sequence $f^k$ converges to the smallest eigenvector of the generalized eigenproblem.

Inverse Power Method for Nonlinear Eigenproblems (Hein, Bühler (2010)):

Case $p > 1$:
$$ g^{k+1} = \arg\min_{u \in \mathbb{R}^n} \{ R(u) - \langle u, s(f^k) \rangle \}, \qquad f^{k+1} = g^{k+1} / S(g^{k+1})^{1/p}, $$

Case $p = 1$:
$$ f^{k+1} = \arg\min_{\|u\|_2 \le 1} \{ R(u) - \lambda^k \langle u, s(f^k) \rangle \}, $$

where in both cases $s(f^{k+1}) \in \partial S(f^{k+1})$ and $\lambda^{k+1} = R(f^{k+1}) / S(f^{k+1})$.
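A compact sketch of the linear method in the form stated above, additionally assuming $A$ is positive definite so that the inner minimization has the closed-form solution $A f^{k+1} = B f^k$; the nonlinear method replaces this inner quadratic problem by a convex, possibly nonsmooth one:

```python
import numpy as np

def inverse_power_method(A, B, f0, iters=100):
    """Linear inverse power method: each step minimizes
    (1/2)<u, Au> - <u, B f^k>, whose minimizer (for positive definite A)
    solves A f^{k+1} = B f^k. For generic f0 the iterates converge to
    the eigenvector of the smallest generalized eigenvalue."""
    f = f0 / np.linalg.norm(f0)
    for _ in range(iters):
        f = np.linalg.solve(A, B @ f)       # A f^{k+1} = B f^k
        f /= np.linalg.norm(f)              # rescale to avoid blow-up
    lam = (f @ A @ f) / (f @ B @ f)         # Rayleigh quotient
    return lam, f
```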
Properties of the Nonlinear Inverse Power Method

Theorem (Hein, Bühler (2010)): Either $\lambda^{k+1} < \lambda^k$ holds, or $\lambda^{k+1} = \lambda^k$ and the sequence terminates. Moreover, every cluster point $f^*$ of the sequence $f^k$ satisfies

$$ 0 \in \partial R(f^*) - \lambda^* \, \partial S(f^*), \quad \text{where } \lambda^* = \frac{R(f^*)}{S(f^*)}. $$

Guarantees:
- monotonic descent method,
- convergence to some nonlinear eigenvector is guaranteed, but not necessarily to the one associated with the smallest eigenvalue.
Benefits of Nonlinear Eigenproblems

                                        Linear EP    Nonlinear EP
Modeling power                          low          high
Relaxation of combinatorial problems    loose        tight
The Cheeger Cut Problem

Cheeger cut: for a partition $(C, \overline{C})$ of the weighted, undirected graph,

$$ \phi(C) = \frac{\mathrm{cut}(C, \overline{C})}{\min\{|C|, |\overline{C}|\}}, \quad \text{where } \mathrm{cut}(A, B) = \sum_{i \in A, \, j \in B} w_{ij}. $$

Computing the optimal Cheeger cut, $\phi^* = \min_C \phi(C)$, is NP-hard.
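A small sketch evaluating the Cheeger cut of a given partition; the example graph (two triangles joined by one weak edge) is invented for illustration:

```python
import numpy as np

def cheeger_cut(W, C):
    """phi(C) = cut(C, C^c) / min(|C|, |C^c|) for a symmetric weight
    matrix W and a boolean indicator vector C of the partition."""
    C = np.asarray(C, dtype=bool)
    cut = W[np.ix_(C, ~C)].sum()
    return cut / min(C.sum(), (~C).sum())

# Two triangles joined by a single weak edge.
W = np.zeros((6, 6))
for i, j in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]:
    W[i, j] = W[j, i] = 1.0
W[2, 3] = W[3, 2] = 0.1

print(cheeger_cut(W, [1, 1, 1, 0, 0, 0]))   # 0.1 / 3
```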
Balanced Graph Cuts - Applications

- Clustering / community detection
- Image segmentation
- Parallel computing (matrix reordering)
Relaxation of the Cheeger Cut Problem

- Relaxation into a semidefinite program with $|V|^3$ constraints.
  Best known (worst case) approximation guarantee: $O(\sqrt{\log |V|})$.

- Spectral relaxation based on the graph Laplacian $L = D - W$.
  Isoperimetric inequality (Alon, Milman (1984)):
  $$ \frac{(\phi^*)^2}{2 \max_i d_i} \le \lambda_2(L) \le 2 \phi^*. $$
  - There are graphs known which realize the lower bound.
  - The bipartition is obtained by optimal thresholding of the second eigenvector (see the sketch below).
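As referenced in the last bullet, a sketch of the spectral relaxation pipeline: build $L = D - W$, take the second smallest eigenvector, and threshold it optimally with respect to the Cheeger ratio (dense NumPy for clarity; a real implementation would use sparse eigensolvers):

```python
import numpy as np

def spectral_cheeger_bipartition(W):
    """Second eigenvector of L = D - W, followed by optimal thresholding
    with respect to the Cheeger ratio."""
    d = W.sum(axis=1)
    L = np.diag(d) - W
    _, V = np.linalg.eigh(L)
    f = V[:, 1]                             # second smallest eigenvector
    best_phi, best_C = np.inf, None
    for t in np.sort(f)[:-1]:               # all n - 1 candidate thresholds
        C = f <= t
        if C.all():                         # skip degenerate split
            continue
        phi = W[np.ix_(C, ~C)].sum() / min(C.sum(), (~C).sum())
        if phi < best_phi:
            best_phi, best_C = phi, C
    return best_phi, best_C
```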
1-Spectral Clustering

1-graph Laplacian: the nonlinear graph 1-Laplacian $\Delta_1$ induces the functional $F_1(f)$,

$$ F_1(f) := \frac{\langle f, \Delta_1 f \rangle}{\|f\|_1} = \frac{\frac{1}{2} \sum_{i,j=1}^n w_{ij} |f_i - f_j|}{\|f\|_1}. $$

Theorem (Hein, Bühler (2010)): Let $G$ be connected. Then

$$ \min_C \frac{\mathrm{cut}(C, \overline{C})}{\min\{|C|, |\overline{C}|\}} = \min_{\substack{f \text{ nonconstant} \\ \mathrm{median}(f) = 0}} F_1(f) = \lambda_2(\Delta_1), $$

where $\lambda_2(\Delta_1)$ is the second smallest eigenvalue of $\Delta_1$, and the second eigenvector of $\Delta_1$ is the indicator vector of the optimal partition.

Tight relaxation of the optimal Cheeger cut!
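A quick numerical illustration of tightness on the toy graph from the earlier sketch: evaluating $F_1$ on the median-shifted indicator vector of the optimal partition recovers the optimal Cheeger cut value exactly:

```python
import numpy as np

def F1(W, f):
    """F_1(f) = ((1/2) sum_ij w_ij |f_i - f_j|) / ||f||_1."""
    tv = 0.5 * np.sum(W * np.abs(f[:, None] - f[None, :]))
    return tv / np.sum(np.abs(f))

# Two triangles joined by a weak edge, as before.
W = np.zeros((6, 6))
for i, j in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]:
    W[i, j] = W[j, i] = 1.0
W[2, 3] = W[3, 2] = 0.1

# Median-shifted indicator vector of the optimal partition: F_1 equals
# the optimal Cheeger cut value 0.1 / 3 -- the relaxation is tight.
f = np.array([1.0, 1.0, 1.0, 0.0, 0.0, 0.0])
f -= np.median(f)
print(F1(W, f))                             # 0.0333... = 0.1 / 3
```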
Quality Guarantee

Tight relaxation of the Cheeger cut: minimization of the continuous relaxation is as hard as the original Cheeger cut problem ⟹ non-convex and non-smooth. There is no guarantee that NIPM attains the optimal solution!

but

Quality guarantee:

Theorem: Let $(A, \overline{A})$ be a given partition of $V$. If NIPM is initialized with $f^0 = \mathbf{1}_A$, then either NIPM terminates after one step or it yields an $f^1$ which after optimal thresholding gives a partition $(B, \overline{B})$ satisfying

$$ \frac{\mathrm{cut}(B, \overline{B})}{\min\{|B|, |\overline{B}|\}} < \frac{\mathrm{cut}(A, \overline{A})}{\min\{|A|, |\overline{A}|\}}. $$

Next goal: global approximation guarantees.
Cheeger Cut: 1-Laplacian (NLEP) vs. 2-Laplacian (LEP)
                          Linear                                               Nonlinear
Ratio                     $\frac{\sum_{i,j=1}^n w_{ij} (x_i - x_j)^2}{\|x\|_2^2}$    $\frac{\sum_{i,j=1}^n w_{ij} |x_i - x_j|}{\|x\|_1}$
Approximation guarantee   loose                                                tight! (Hein, Bühler (2010))
Convergence               globally optimal                                     locally optimal
Scalability               ✓                                                    ✓
Quality                   +                                                    +++

1-Spectral clustering beats state-of-the-art methods on the graph partitioning benchmark.