Spread and Sparse: Learning Interpretable Transforms for Bandlimited Signals on Digraphs

Rasoul Shafipour
Dept. of Electrical and Computer Engineering, University of Rochester
rshafipo@ece.rochester.edu
http://www.ece.rochester.edu/~rshafipo/

Co-author: Gonzalo Mateos
Acknowledgment: NSF Awards CCF-1750428 and ECCS-1809356

Pacific Grove, CA, October 30, 2018
Network Science analytics

[Figure: online social media, Internet, clean energy and grid analytics]

◮ Network as graph G = (V, E): encode pairwise relationships
◮ Desiderata: process, analyze, and learn from network data [Kolaczyk'09]
◮ Interest here not in G itself, but in data associated with nodes in V
⇒ The object of study is a graph signal
⇒ Ex: opinion profiles, buffer levels, neural activity, epidemics
Graph signal processing and Fourier transform

[Figure: example digraph with four nodes]

◮ Directed graph (digraph) G with adjacency matrix A
⇒ $A_{ij}$ = edge weight from node i to node j
◮ Define a signal $x \in \mathbb{R}^N$ on top of the graph
⇒ $x_i$ = signal value at node i
◮ Associated with G is the underlying undirected graph $G_u$
⇒ Laplacian matrix $L = D - A_u$, eigenvectors $V = [v_1, \cdots, v_N]$
◮ Graph Signal Processing (GSP): exploit structure in A or L to process x
◮ Graph Fourier Transform (GFT): $\tilde{x} = V^T x$ for undirected graphs
⇒ Decompose x into different modes of variation
⇒ Inverse (i)GFT $x = V\tilde{x}$, eigenvectors as frequency atoms
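As a concrete illustration of the GFT pipeline above, here is a minimal NumPy sketch (not from the talk): it builds the Laplacian of a small undirected graph, takes its eigenvectors as frequency atoms, and applies the forward and inverse transforms. The graph, the signal values, and all variable names are illustrative.

```python
import numpy as np

# Symmetric adjacency of a small undirected graph (illustrative)
A_u = np.array([[0., 1., 1., 0.],
                [1., 0., 1., 0.],
                [1., 1., 0., 1.],
                [0., 0., 1., 0.]])
L = np.diag(A_u.sum(axis=1)) - A_u      # combinatorial Laplacian L = D - A_u

# Eigenvectors of L are the frequency atoms; eigenvalues are the frequencies
lam, V = np.linalg.eigh(L)              # columns of V are v_1, ..., v_N (ascending lam)

x = np.array([1.0, 0.5, -0.2, 0.8])     # graph signal, x_i lives on node i
x_tilde = V.T @ x                       # GFT: project onto the Laplacian eigenbasis
x_rec = V @ x_tilde                     # inverse GFT recovers x

assert np.allclose(x, x_rec)
```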
Our work in context

◮ Spectral analysis and filter design [Tremblay et al'17], [Isufi et al'16]
⇒ GFT as a promising tool in neuroscience [Huang et al'16]
◮ Noteworthy GFT approaches
  ◮ Jordan decomposition of A [Sandryhaila-Moura'14], [Deri-Moura'17]
  ◮ Lovász extension of the graph cut size [Sardellitti et al'17]
  ◮ Basis selection for spread modes [Shafipour et al'18]
  ◮ Generalized variation operators and inner products [Girault et al'18]
◮ Dictionary learning (DL) for GSP
  ◮ Parametric dictionaries for graph signals [Thanou et al'14]
  ◮ Dual graph-regularized DL [Yankelevsky-Elad'17]
  ◮ Joint topology- and data-driven prediction [Forero et al'14]
◮ Our contribution: digraph (D)GFT (dictionary) design
  ◮ Orthonormal basis signals (atoms) offer notions of frequency
  ◮ Frequencies are distributed as evenly as possible in $[0, f_{\max}]$
  ◮ Sparsely represents bandlimited graph signals
Signal variation on digraphs

◮ Total variation of signal x with respect to L
$$\mathrm{TV}(x) = x^T L x = \sum_{i,j=1,\; j>i}^{N} A_{u,ij}\,(x_i - x_j)^2$$
⇒ Smoothness measure on the graph $G_u$
◮ For Laplacian eigenvectors $V = [v_1, \cdots, v_N]$ ⇒ $\mathrm{TV}(v_k) = \lambda_k$
⇒ $0 = \lambda_1 < \cdots \leq \lambda_N$ can be viewed as frequencies
◮ Directed variation for signals over digraphs ($[x]_+ = \max(0, x)$)
$$\mathrm{DV}(x) := \sum_{i,j=1}^{N} A_{ij}\,[x_i - x_j]_+^2$$
⇒ Captures signal variation (flow) along directed edges
⇒ Consistent, since $\mathrm{DV}(x) \equiv \mathrm{TV}(x)$ for undirected graphs
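The variation measures above translate directly into code. Here is a small illustrative sketch (not from the talk) computing TV and DV with NumPy; for a symmetric adjacency the two coincide, as the consistency remark states.

```python
import numpy as np

def total_variation(x, A_u):
    # TV(x) = x^T L x for the undirected Laplacian L = D - A_u
    L = np.diag(A_u.sum(axis=1)) - A_u
    return float(x @ L @ x)

def directed_variation(x, A):
    # DV(x) = sum_ij A_ij * max(0, x_i - x_j)^2, variation along directed edges
    diff = np.maximum(x[:, None] - x[None, :], 0.0)   # [x_i - x_j]_+
    return float(np.sum(A * diff**2))

# For an undirected graph (symmetric A), DV(x) == TV(x)
A_u = np.array([[0., 1., 0.],
                [1., 0., 2.],
                [0., 2., 0.]])
x = np.array([1.0, -1.0, 0.5])
print(total_variation(x, A_u), directed_variation(x, A_u))   # both 8.5
```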
DGFT with spread frequency components

◮ Find N orthonormal basis vectors capturing low, medium, and high frequencies
◮ Collect the desired basis vectors in a matrix $U = [u_1, \cdots, u_N] \in \mathbb{R}^{N \times N}$
$$\text{DGFT: } \tilde{x} = U^T x$$
⇒ $u_k$ represents the k-th frequency mode with $f_k := \mathrm{DV}(u_k)$
◮ Similar to the DFT, seek N evenly distributed graph frequencies in $[0, f_{\max}]$
⇒ $f_{\max}$ is the maximum DV of a unit-norm graph signal on G
Spread frequencies in two steps

◮ First: find $f_{\max}$ by solving
$$u_{\max} = \underset{\|u\|=1}{\mathrm{argmax}}\; \mathrm{DV}(u) \quad \text{and} \quad f_{\max} := \mathrm{DV}(u_{\max})$$
◮ Let $v_N$ be the dominant eigenvector of L
⇒ Can 1/2-approximate $f_{\max}$ with $\tilde{u}_{\max} = \underset{v \in \{v_N, -v_N\}}{\mathrm{argmax}}\; \mathrm{DV}(v)$
◮ Second: set $u_1 = u_{\min} := \frac{1}{\sqrt{N}}\mathbf{1}_N$ and $u_N = u_{\max}$, and minimize
$$\delta(U) := \sum_{i=1}^{N-1} \left[\mathrm{DV}(u_{i+1}) - \mathrm{DV}(u_i)\right]^2$$
⇒ $\delta(U)$ is the spectral dispersion function
⇒ Minimized when the free DV values form an arithmetic sequence (see the sketch below)
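A brief sketch of the spectral dispersion $\delta(U)$ and of the fact that, with the endpoints fixed, evenly spaced frequency values minimize the sum of squared gaps. The helper names and numerical values are illustrative, not from the talk.

```python
import numpy as np

def directed_variation(u, A):
    # DV(u) = sum_ij A_ij [u_i - u_j]_+^2
    return float(np.sum(A * np.maximum(u[:, None] - u[None, :], 0.0)**2))

def spectral_dispersion(U, A):
    # delta(U) = sum_i [DV(u_{i+1}) - DV(u_i)]^2, columns ordered by frequency
    f = np.array([directed_variation(U[:, i], A) for i in range(U.shape[1])])
    return float(np.sum(np.diff(f)**2))

# With f_1 = 0 and f_N = f_max fixed, evenly spaced frequencies (an arithmetic
# sequence) give the smallest sum of squared gaps, e.g. for f_max = 6 and N = 4:
gap_cost = lambda f: float(np.sum(np.diff(f)**2))
print(gap_cost(np.linspace(0.0, 6.0, 4)))         # 12.0  (arithmetic sequence)
print(gap_cost(np.array([0.0, 1.0, 5.0, 6.0])))   # 18.0  (uneven spacing)
```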
Spectral dispersion and sparsity minimization

◮ Sparsify a set of bandlimited signals $X \in \mathbb{R}^{N \times P}$ → minimize $\|U^T X\|_1$
◮ Problem: given G and X, find a sparsifying DGFT with spread frequencies
$$\min_{U}\; \Psi(U) := \sum_{i=1}^{N-1} \left[\mathrm{DV}(u_{i+1}) - \mathrm{DV}(u_i)\right]^2 + \mu \|U^T X\|_1$$
$$\text{subject to } U^T U = I,\quad u_1 = u_{\min},\quad u_N = u_{\max}$$
◮ Non-convex, orthogonality-constrained minimization
◮ Non-differentiable $\Psi(U)$
◮ Feasible since $u_{\max} \perp u_{\min}$
◮ Variable splitting and a feasible method on the Stiefel manifold:
(i) Obtain $f_{\max}$ (and $u_{\max}$) by minimizing $-\mathrm{DV}(u)$ over $\{u \mid u^T u = 1\}$
(ii) Replace $U^T X$ with an auxiliary variable $Y \in \mathbb{R}^{N \times P}$, enforce $Y = U^T X$
(iii) Adopt an alternating minimization scheme
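For reference, the full sparsifying objective $\Psi(U)$ (dispersion plus weighted sparsity, with the constraints handled separately) can be evaluated as in the short sketch below; the function names are illustrative.

```python
import numpy as np

def directed_variation(u, A):
    # DV(u) = sum_ij A_ij [u_i - u_j]_+^2
    return float(np.sum(A * np.maximum(u[:, None] - u[None, :], 0.0)**2))

def objective(U, X, A, mu):
    # Psi(U) = spectral dispersion + mu * ||U^T X||_1
    f = np.array([directed_variation(U[:, i], A) for i in range(U.shape[1])])
    return float(np.sum(np.diff(f)**2) + mu * np.abs(U.T @ X).sum())
```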
Update U: feasible method on the Stiefel manifold

◮ For fixed $Y = Y^k$, rewrite the problem of finding $U^{k+1}$ as
$$\underset{U}{\text{minimize}}\;\; \phi(U) := \delta(U) + \frac{\gamma}{2}\left\|Y^k - U^T X\right\|_F^2 + \lambda\left(\|u_1 - u_{\min}\|^2 + \|u_N - u_{\max}\|^2\right)$$
$$\text{subject to } U^T U = I_N$$
◮ Recall $\delta(U) := \sum_{i=1}^{N-1}[\mathrm{DV}(u_{i+1}) - \mathrm{DV}(u_i)]^2$
◮ Choose large enough $\lambda > 0$ to ensure $u_1 = u_{\min}$ and $u_N = u_{\max}$
◮ Let $U^k$ be a feasible point at iteration k with gradient $G^k = \nabla\phi(U^k)$
⇒ Skew-symmetric matrix $B^k := G^k U^{k\,T} - U^k G^{k\,T}$
◮ Update rule $U^{k+1}(\tau) = \left(I + \frac{\tau}{2} B^k\right)^{-1}\left(I - \frac{\tau}{2} B^k\right) U^k$ (see the sketch below)
⇒ Cayley transform preserves orthogonality (i.e., $U^{k+1\,T} U^{k+1} = I$)

Theorem (Wen-Yin'13): the iterates converge to a stationary point of the smooth $\phi(U)$, while generating feasible points at every iteration
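A minimal sketch of one Cayley-transform step (not the authors' code): given an orthonormal U and the gradient G of the smooth objective $\phi$, it forms the skew-symmetric B and applies the update, which keeps the iterate on the Stiefel manifold. Here the gradient is a random stand-in and the step size $\tau$ is fixed rather than chosen by an Armijo-Wolfe line search.

```python
import numpy as np

def cayley_update(U, G, tau):
    # B = G U^T - U G^T is skew-symmetric, so the Cayley transform
    # (I + tau/2 B)^{-1} (I - tau/2 B) is orthogonal and preserves U^T U = I
    N = U.shape[0]
    B = G @ U.T - U @ G.T
    W = np.linalg.solve(np.eye(N) + 0.5 * tau * B,
                        np.eye(N) - 0.5 * tau * B)
    return W @ U

# Quick feasibility check with random data (illustrative)
rng = np.random.default_rng(0)
N = 6
U, _ = np.linalg.qr(rng.standard_normal((N, N)))   # orthonormal starting point
G = rng.standard_normal((N, N))                    # stand-in for the gradient of phi
U_next = cayley_update(U, G, tau=0.1)
print(np.allclose(U_next.T @ U_next, np.eye(N)))   # True: orthogonality preserved
```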
Update Y: soft thresholding

◮ For fixed $U = U^{k+1}$, rewrite the problem of finding $Y^{k+1}$ as
$$\underset{Y}{\text{minimize}}\;\; \mu\|Y\|_1 + \frac{\gamma}{2}\left\|Y - U^{k+1\,T} X\right\|_F^2$$
⇒ Proximal operator that is component-wise separable
◮ Update $Y^{k+1}$ in closed form via soft-thresholding operations
$$Y^{k+1} = \mathrm{sign}\left(U^{k+1\,T} X\right) \circ \left(\left|U^{k+1\,T} X\right| - \mu/\gamma\right)_+$$
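The Y-update is just entry-wise soft thresholding of $U^T X$ at level $\mu/\gamma$; a one-function sketch with illustrative names:

```python
import numpy as np

def y_update(U, X, mu, gamma):
    # Closed-form proximal step: soft-threshold each entry of U^T X at mu/gamma
    Z = U.T @ X
    return np.sign(Z) * np.maximum(np.abs(Z) - mu / gamma, 0.0)
```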
Algorithm 1

1: Input: adjacency matrix A, signals $X \in \mathbb{R}^{N \times P}$, and $\lambda, \mu, \gamma, \epsilon_1, \epsilon_2 > 0$
2: Find $u_{\max}$ by a similar feasible method and set $u_{\min} = \frac{1}{\sqrt{N}}\mathbf{1}_N$
3: Initialize k = 0 and $Y^0 \in \mathbb{R}^{N \times P}$ at random
4: repeat
5:   U-update: initialize t = 0 and orthonormal $\hat{U}^0 \in \mathbb{R}^{N \times N}$ at random
6:   repeat
7:     Compute gradient $G^t := \nabla\phi(\hat{U}^t) \in \mathbb{R}^{N \times N}$
8:     Form $B^t = G^t \hat{U}^{t\,T} - \hat{U}^t G^{t\,T}$
9:     Select $\tau_t$ satisfying the Armijo-Wolfe conditions
10:    Update $\hat{U}^{t+1}(\tau_t) = \left(I_N + \frac{\tau_t}{2} B^t\right)^{-1}\left(I_N - \frac{\tau_t}{2} B^t\right)\hat{U}^t$
11:   until $\|\hat{U}^t - \hat{U}^{t-1}\|_F / \|\hat{U}^{t-1}\|_F \leq \epsilon_1$
12:   Return $U^k = \hat{U}^t$
13:   Y-update: $Y^{k+1} = \mathrm{sign}(U^{k\,T} X) \circ (|U^{k\,T} X| - \mu/\gamma)_+$
14:   $k \leftarrow k + 1$
15: until $\|U^{k\,T} X - U^{k-1\,T} X\|_1 / \|U^{k-1\,T} X\|_1 \leq \epsilon_2$
16: Return $\hat{U} = U^k$

◮ Overall run-time is $O(N^3)$ per iteration
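Putting the pieces together, here is a compact end-to-end sketch of the alternating scheme in Algorithm 1 under simplifying assumptions (not the authors' implementation): a fixed step size $\tau$ instead of the Armijo-Wolfe line search, fixed iteration counts instead of the $\epsilon_1/\epsilon_2$ stopping rules, a finite-difference stand-in for the analytical gradient of $\phi$, and $u_{\max}$ assumed precomputed (e.g., via the 1/2-approximation from the dominant Laplacian eigenvector). All parameter values and names are illustrative.

```python
import numpy as np

def dv(u, A):
    # Directed variation DV(u) = sum_ij A_ij [u_i - u_j]_+^2
    return float(np.sum(A * np.maximum(u[:, None] - u[None, :], 0.0)**2))

def phi(U, X, Y, A, u_min, u_max, lam, gamma):
    # Smooth objective of the U-update: dispersion + data fit + endpoint penalties
    f = np.array([dv(U[:, i], A) for i in range(U.shape[1])])
    return (np.sum(np.diff(f)**2)
            + 0.5 * gamma * np.linalg.norm(Y - U.T @ X, 'fro')**2
            + lam * (np.linalg.norm(U[:, 0] - u_min)**2
                     + np.linalg.norm(U[:, -1] - u_max)**2))

def num_grad(fun, U, eps=1e-5):
    # Finite-difference stand-in for the analytical gradient of phi
    G = np.zeros_like(U)
    for idx in np.ndindex(*U.shape):
        E = np.zeros_like(U)
        E[idx] = eps
        G[idx] = (fun(U + E) - fun(U - E)) / (2 * eps)
    return G

def learn_dgft(A, X, u_max, mu=1.0, lam=10.0, gamma=1.0, tau=1e-3,
               inner=30, outer=10, seed=0):
    rng = np.random.default_rng(seed)
    N, P = X.shape
    u_min = np.ones(N) / np.sqrt(N)
    Y = rng.standard_normal((N, P))
    U, _ = np.linalg.qr(rng.standard_normal((N, N)))   # random orthonormal start
    for _ in range(outer):
        # U-update: Cayley-transform steps on the Stiefel manifold
        for _ in range(inner):
            G = num_grad(lambda V: phi(V, X, Y, A, u_min, u_max, lam, gamma), U)
            B = G @ U.T - U @ G.T                      # skew-symmetric
            U = np.linalg.solve(np.eye(N) + 0.5 * tau * B,
                                np.eye(N) - 0.5 * tau * B) @ U
        # Y-update: entry-wise soft thresholding of U^T X at mu/gamma
        Z = U.T @ X
        Y = np.sign(Z) * np.maximum(np.abs(Z) - mu / gamma, 0.0)
    return U
```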
Numerical test: US average temperatures

◮ Graph of the N = 48 contiguous United States
⇒ Connect two states if they share a border
⇒ Set arc directions from lower to higher latitudes

[Figure: US graph, nodes colored by average annual temperature (colorbar 45-70)]

◮ Test graph signal x → average annual temperature of each state
Numerical test: Convergence behavior

◮ Average monthly temperature over ∼60 years for each state
⇒ Training signals $X \in \mathbb{R}^{48 \times 12}$
◮ First, use a Monte Carlo method to study the convergence properties
◮ Plot $\Psi(U) = \delta(U) + \mu\|U^T X\|_1$ versus k for 10 different initializations

[Figure: objective $\Psi(U)$ (roughly 560-700) versus iteration index k for 10 random initializations]

◮ Convergence is apparent, with limited variability in the solution
Numerical test: Spread and sparse

◮ Heat maps of the trained $\tilde{X}$ and spectral representation of the test signal x

[Figure: heat maps of $|\tilde{X}_{2:N,\cdot}|$ with $\mu = 0$ and via the proposed algorithm ($\mu \neq 0$); $|\tilde{x}|$ per DGFT frequency index with $\mu = 0$ and via the proposed algorithm]

◮ Distribution of all the frequencies

[Figure: learned graph frequencies, spread over approximately [0, 8]]

◮ Tradeoff: spectral dispersion for a sparser representation
◮ Still attain well-dispersed frequencies