1. ℓ0-Sparse Subspace Clustering
Yingzhen Yang¹, Jiashi Feng², Nebojsa Jojic³, Jianchao Yang⁴, Thomas S. Huang¹
¹Beckman Institute, University of Illinois at Urbana-Champaign, USA; ²Department of ECE, National University of Singapore, Singapore; ³Microsoft Research, USA; ⁴Snapchat, USA

2. Introduction
Sparse Subspace Clustering (SSC) aims to partition the data according to their underlying subspaces.
Figure 1: Black dots and red dots indicate data that lie in subspaces S_1 and S_2, respectively.

3. Sparse Subspace Clustering
Sparse Subspace Clustering (SSC) aims to partition the data according to their underlying subspaces. SSC and its robust version solve the following sparse representation problems:
    min_α ∥α∥_1   s.t.  X = Xα,  diag(α) = 0
    min_α ∥X − Xα∥_F² + λ_{ℓ1} ∥α∥_1   s.t.  diag(α) = 0
Under certain assumptions on the underlying subspaces and the data, each column α_i of α satisfies the Subspace Detection Property (SDP): its nonzero elements correspond to data that lie in the same subspace as the point x_i.
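For concreteness, here is a minimal sketch of the robust ℓ1 formulation above, solved one column at a time with scikit-learn's Lasso. The function name ssc_coefficients and the value of lam are illustrative, not from the slides:

```python
import numpy as np
from sklearn.linear_model import Lasso

def ssc_coefficients(X, lam=0.1):
    """Robust SSC, solved one column at a time:
        min_a ||x_i - X a||_2^2 + lam * ||a||_1   s.t.  a_i = 0.
    X is (d, n) with one data point per column; lam is illustrative."""
    d, n = X.shape
    alpha = np.zeros((n, n))
    for i in range(n):
        idx = np.delete(np.arange(n), i)   # drop x_i: enforces diag(alpha) = 0
        # sklearn's Lasso minimizes (1/(2*d))*||y - A w||^2 + a*||w||_1,
        # so a = lam/(2*d) matches the objective above up to a constant factor.
        model = Lasso(alpha=lam / (2 * d), fit_intercept=False, max_iter=10000)
        model.fit(X[:, idx], X[:, i])
        alpha[idx, i] = model.coef_
    return alpha
```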

4. ℓ0-induced Sparse Subspace Clustering
The Subspace Detection Property (SDP) is crucial to the success of SSC: data belonging to different subspaces are disconnected in the sparse graph.
Figure 2: Block-diagonal similarity matrix due to SDP.
We propose ℓ0-induced Sparse Subspace Clustering (ℓ0-SSC), which solves the ℓ0 problem:
    min_α ∥α∥_0   s.t.  X = Xα,  diag(α) = 0
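Downstream of either formulation, the standard SSC-style pipeline turns the coefficient matrix α into a similarity graph and applies spectral clustering; under the SDP the graph is block-diagonal, as in Figure 2, so the clusters recover the subspaces. A minimal sketch of that standard step (ours, not the authors' code):

```python
import numpy as np
from sklearn.cluster import SpectralClustering

def cluster_from_coefficients(alpha, n_clusters):
    """Build a symmetric similarity graph from the coefficients and
    spectral-cluster it; under the SDP the graph is block-diagonal,
    so the clusters recover the subspace membership."""
    W = np.abs(alpha) + np.abs(alpha).T    # symmetrize the sparse graph
    sc = SpectralClustering(n_clusters=n_clusters, affinity="precomputed")
    return sc.fit_predict(W)
```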

5. Models for Analyzing the Subspace Detection Property
- Deterministic Model: the subspaces and the data in each subspace are fixed.
- Randomized models:
  - Semi-Random Model: the subspaces are fixed, but the data are distributed at random within each subspace.
  - Full-Random Model: both the subspaces and the data in each subspace are random.
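To make the randomized models concrete, here is a sketch that instantiates the semi-random model: subspaces drawn once and then held fixed, data i.i.d. uniform on the unit sphere of each subspace. All names are illustrative:

```python
import numpy as np

def semi_random_data(d, dims, n_per_subspace, seed=0):
    """Semi-random model: fixed subspaces (drawn once here), data i.i.d.
    uniform on the unit sphere of each subspace."""
    rng = np.random.default_rng(seed)
    blocks, labels = [], []
    for k, dk in enumerate(dims):
        # Orthonormal basis of a random dk-dimensional subspace of R^d.
        U, _ = np.linalg.qr(rng.standard_normal((d, dk)))
        # Normalized Gaussian coordinates are uniform on the unit sphere.
        C = rng.standard_normal((dk, n_per_subspace))
        C /= np.linalg.norm(C, axis=0)
        blocks.append(U @ C)
        labels += [k] * n_per_subspace
    return np.hstack(blocks), np.array(labels)
```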

6. ℓ0-induced Sparse Subspace Clustering
The sparse subspace clustering literature has not answered the fundamental question: what is the relationship between sparse representation and the SDP?
We show that ℓ0-sparsity and the SDP are almost surely equivalent, under the mildest assumptions to the best of our knowledge.
Theorem 1 (ℓ0-sparsity ⇒ SDP). Under the semi-random or full-random model, suppose the data in each subspace are generated i.i.d. according to any continuous distribution. Then, with probability 1 over the data (for the semi-random model), or over both the data and the subspaces (for the full-random model), the optimal solution to the ℓ0 sparse representation problem satisfies the subspace detection property.
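Theorem 1 can be sanity-checked on toy data by solving the ℓ0 problem exhaustively and verifying the SDP. The sketch below is exponential-time and for illustration only; it reuses semi_random_data from the sketch above:

```python
import numpy as np
from itertools import combinations

def l0_representation(X, i, tol=1e-8):
    """Brute-force  min ||a||_0  s.t.  x_i = X a, a_i = 0,
    by testing supports of increasing size (toy scale only)."""
    d, n = X.shape
    x_i, others = X[:, i], [j for j in range(n) if j != i]
    for size in range(1, n):
        for S in combinations(others, size):
            cols = X[:, list(S)]
            coef, *_ = np.linalg.lstsq(cols, x_i, rcond=None)
            if np.linalg.norm(cols @ coef - x_i) <= tol:
                a = np.zeros(n)
                a[list(S)] = coef
                return a
    return np.zeros(n)

X, labels = semi_random_data(d=6, dims=[2, 2], n_per_subspace=5)
a = l0_representation(X, 0)
# SDP holds if every selected point shares x_0's subspace label.
print(np.all(labels[np.flatnonzero(a)] == labels[0]))
```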

7. ℓ0-induced Sparse Subspace Clustering
Inter-subspace hyperplane: a hyperplane spanned by data from different subspaces. This is the source of confusion between subspaces.
Key element of the proof: under a continuous data distribution, the intersection of an inter-subspace hyperplane with any associated subspace has probability 0, so almost surely no data point lies on it.
Figure 3: Illustration of an inter-subspace hyperplane spanned by x_i and x_j.

8. ℓ0-induced Sparse Subspace Clustering
Compared to previous subspace clustering methods, ℓ0-SSC achieves the SDP under far less restrictive assumptions on both the underlying subspaces and the random data generation.
Assumptions on the subspaces:
- S1 (Independent Subspaces): Dim[S_1 ⊕ S_2 ⊕ … ⊕ S_K] = Σ_k Dim[S_k]
- S2 (Disjoint Subspaces): S_k ∩ S_k′ = {0} for k ≠ k′
- S3 (Overlapping Subspaces): 1 ≤ Dim[S_k ∩ S_k′] < min{Dim[S_k], Dim[S_k′]} for k ≠ k′
- S4 (Distinct Subspaces, ℓ0-SSC): S_k ≠ S_k′ for k ≠ k′
Assumptions on the random data generation:
- D1 (Semi-Random or Full-Random Model): data i.i.d. uniform on the unit sphere.
- D2 (IID, ℓ0-SSC): data i.i.d. from an arbitrary continuous distribution.
No further geometric conditions, such as inradius or subspace incoherence, are required.
Figure 4: Independent (left) and disjoint (right) subspaces.

9. ℓ0-induced Sparse Subspace Clustering
No free lunch! The price we pay for the SDP under such mild assumptions is solving the NP-hard ℓ0 problem.
No better deal! The converse of Theorem 1:
Theorem 2 (No free lunch: SDP ⇒ ℓ0-sparsity). Under the semi-random or full-random model and the assumptions of Theorem 1, suppose there is an algorithm which, for any data point x_i ∈ S_k, 1 ≤ i ≤ n, 1 ≤ k ≤ K, can find data from the same subspace as x_i that linearly represent x_i, i.e.
    x_i = Xβ,  β_i = 0,  (1)
where the nonzero elements of β correspond to data lying in the subspace S_k. Then, with probability 1, the solution to the ℓ0 problem (for x_i) can be obtained from β in O(n̂³) time, where n̂ is the number of nonzero elements of β.
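The slides do not spell out the O(n̂³) procedure; one plausible reading is a Gaussian-elimination-style support reduction restricted to the n̂ columns selected by β. The sketch below is our hedged reconstruction, not the authors' algorithm; with data in general position the surviving minimal support attains the ℓ0 optimum almost surely:

```python
import numpy as np

def reduce_support(X, beta, i, tol=1e-10):
    """Shrink supp(beta) to a minimal subset that still represents x_i
    exactly. Cost is polynomial in n_hat = |supp(beta)| (roughly cubic)."""
    x_i = X[:, i]
    support = list(np.flatnonzero(np.abs(beta) > tol))
    for j in list(support):                    # try dropping each column
        trial = [s for s in support if s != j]
        if not trial:
            continue
        coef, *_ = np.linalg.lstsq(X[:, trial], x_i, rcond=None)
        if np.linalg.norm(X[:, trial] @ coef - x_i) <= tol:
            support = trial                    # x_i still in the span: drop j
    gamma = np.zeros_like(beta)
    coef, *_ = np.linalg.lstsq(X[:, support], x_i, rcond=None)
    gamma[support] = coef
    return gamma
```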

10. Approximate ℓ0-SSC (Aℓ0-SSC)
Allowing some tolerance to noise, the optimization problem of ℓ0-SSC becomes
    min_{α ∈ ℝ^{n×n}, diag(α) = 0}  L(α) = ∥X − Xα∥_F² + λ∥α∥_0
It is optimized by proximal gradient descent, using the SSC solution as initialization:
    α_i^(t) = h_{√(2λτ_s)} ( α_i^(t−1) − 2τ_s (XᵀX α_i^(t−1) − Xᵀx_i) )
where h_θ is the element-wise hard thresholding operator (it zeroes out entries with magnitude below θ) and τ_s is the step size.
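A minimal sketch of this update for a single column. The step size and iteration count are our choices (any valid τ_s works), and the zero initialization stands in for the SSC solution the slides use:

```python
import numpy as np

def approx_l0_ssc_column(X, i, lam, alpha0=None, n_iter=300):
    """Proximal gradient descent for
        min_a ||x_i - X a||_2^2 + lam * ||a||_0,   a_i = 0.
    Each step: gradient descent on the smooth term, then hard
    thresholding at sqrt(2*lam*tau), the prox of lam*||.||_0."""
    d, n = X.shape
    x_i = X[:, i]
    tau = 1.0 / (2 * np.linalg.norm(X, 2) ** 2)  # 1/L, L = Lipschitz const of the gradient
    thr = np.sqrt(2 * lam * tau)
    a = np.zeros(n) if alpha0 is None else alpha0.copy()
    for _ in range(n_iter):
        a = a - 2 * tau * (X.T @ (X @ a) - X.T @ x_i)  # gradient step
        a[np.abs(a) < thr] = 0.0                       # hard thresholding h
        a[i] = 0.0                                     # keep diag(alpha) = 0
    return a
```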

11. Approximate ℓ0-SSC
The sequence of objective values {L(α_i^(t))}_t is non-increasing, and consequently it converges.
But does {α_i^(t)}_t converge? If {α_i^(t)}_t converges, how far is the resultant sub-optimal solution from the globally optimal solution?

12. Approximate ℓ0-SSC
Definition of the sparse eigenvalues:
    κ₋(m) := min_{∥u∥_0 ≤ m, ∥u∥_2 = 1} ∥Xu∥_2²,   κ₊(m) := max_{∥u∥_0 ≤ m, ∥u∥_2 = 1} ∥Xu∥_2²
Proposition 1. If κ₋(|supp(α_i^(0))|) > 0, then {α_i^(t)}_t is a bounded sequence that converges to a critical point of L, denoted by α̂_i.
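For intuition, κ₋(m) and κ₊(m) can be computed by brute force at toy scale: for a fixed support S, the extreme values of ∥Xu∥_2² over unit vectors u supported on S are the extreme eigenvalues of X_Sᵀ X_S. A sketch (exponential in n, verification only):

```python
import numpy as np
from itertools import combinations

def sparse_eigenvalues(X, m):
    """Brute-force kappa_-(m), kappa_+(m): extreme eigenvalues of
    X_S^T X_S over all supports |S| = m (eigenvalue interlacing makes
    |S| = m the binding case for both extremes)."""
    n = X.shape[1]
    k_min, k_max = np.inf, 0.0
    for S in combinations(range(n), m):
        evals = np.linalg.eigvalsh(X[:, list(S)].T @ X[:, list(S)])
        k_min = min(k_min, evals[0])
        k_max = max(k_max, evals[-1])
    return k_min, k_max
```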

13. Approximate ℓ0-SSC
Now, how far is α̂_i from α_i* (the globally optimal solution)?
Roadmap: prove that both are local solutions to a capped-ℓ1 problem, and then the following bound can be obtained:
Theorem 3 (Bounded distance between the sub-optimal solution and the globally optimal solution). Under certain assumptions on the sparse eigenvalues of the data matrix, the sequence {α_i^(t)}_t converges to a critical point α̂_i of L(α_i), and
    ∥α̂_i − α_i*∥_2² ≤ (2 / (κ₋(|Ŝ_i ∪ S_i*|) − κ)²) · ( Σ_{j ∈ Ŝ_i} (max{0, λ/b − |α̂_ij|})² + |S_i* \ Ŝ_i| · (max{0, λ/b − κb})² )
where Ŝ_i = supp(α̂_i), S_i* = supp(α_i*), and b and κ are constants specified by the assumptions.
