
Low-Pass Filtering Strikes Back! Signal Processing on Graphs
SpaRTaN-MacSeNet Spring School
Pierre Vandergheynst, Swiss Federal Institute of Technology


  1. (Title slide)

  2. Low-Pass Filtering Strikes Back! Signal Processing on Graphs. SpaRTaN-MacSeNet Spring School. Pierre Vandergheynst, Swiss Federal Institute of Technology. April is Autism Awareness Month: https://www.autismspeaks.org/wordpress-tags/autism-awareness-month

  3. Signal Processing on Graphs: irregular data domains. Social networks, energy networks, transportation networks, biological networks.

  4. (Figure slide, no text)

  5. Some Typical Processing Problems: compression / visualization, denoising, semi-supervised learning, analysis / information extraction. Many interesting new contributions with an SP perspective [Coifman, Maggioni, Kolaczyk, Ortega, Ramchandran, Moura, Lu, Borgnat] or an IP perspective [Elmoataz, Lezoray]; see the review in the 2013 IEEE SP Mag. Earth data source: Frederik Simons.

  6. Outline
     - Introduction: graphs and elements of spectral graph theory, with emphasis on functional calculus
     - Kernel convolution: localization, filtering, smoothing and applications
     - An application to spectral clustering that unifies some of the themes you've heard of during the workshop: machine learning, compressive sensing, optimisation algorithms, graphs

  7. Elements of Spectral Graph Theory. Reference: F. Chung, Spectral Graph Theory.

  8. Definitions. A graph G = (V, E) is given by a set of vertices and «relationships» between them encoded in edges: a set V of vertices of cardinality |V| = N, and a set E of edges, e ∈ E, e = (u, v) with u, v ∈ V. Directed edge: e = (u, v), e′ = (v, u) and e ≠ e′. Undirected edge: e = (u, v), e′ = (v, u) and e = e′. A graph is undirected if it contains only undirected edges. A weighted graph has an associated non-negative weight function w : V × V → R₊ with (u, v) ∉ E ⇒ w(u, v) = 0.

  9. Matrix Formulation. Connectivity is captured via the (weighted) adjacency matrix W(u, v) = w(u, v), with the obvious restriction for unweighted graphs; W(u, u) = 0, no loops. Let d(u) be the degree of u and D = diag(d) the degree matrix. Graph Laplacians: L = D − W and L_norm = D^{−1/2} L D^{−1/2}. A graph signal is a map f : V → R, and the Laplacian acts as an operator on the space of graph signals: (L f)(u) = ∑_{v∼u} w(u, v) (f(u) − f(v)).
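The matrix definitions above are easy to try numerically. The following is a minimal sketch (the 4-cycle graph and the signal values are illustrative choices, not from the slides):

```python
import numpy as np

# Toy weighted adjacency matrix W of an undirected 4-cycle
# (symmetric, zero diagonal: no self-loops).
W = np.array([[0., 1., 0., 1.],
              [1., 0., 1., 0.],
              [0., 1., 0., 1.],
              [1., 0., 1., 0.]])

d = W.sum(axis=1)                     # degrees d(u)
D = np.diag(d)                        # degree matrix D = diag(d)
L = D - W                             # combinatorial Laplacian L = D - W
L_norm = np.diag(d**-0.5) @ L @ np.diag(d**-0.5)   # normalized Laplacian

f = np.array([1., 2., 3., 4.])        # a graph signal f : V -> R
Lf = L @ f                            # (L f)(u) = sum_{v~u} w(u,v)(f(u) - f(v))
```

Note that L annihilates constant signals, consistent with the zero eigenvalue λ₀ = 0 discussed below.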

  10. Some differential operators. The Laplacian can be factorized as L = S S*. Explicit form of the incidence matrix (unweighted in this example): for the column of S indexed by e = (u, v), S(u, e) = −1 and S(v, e) = +1. S* is a gradient: (S* f)(u, v) = f(v) − f(u). S is a negative divergence: (S g)(u) = ∑_{(v′,u)∈E} g(v′, u) − ∑_{(u,v)∈E} g(u, v).
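The factorization L = S S* can be checked on a toy graph; the edge list below is an assumption for illustration:

```python
import numpy as np

# Incidence matrix S of an unweighted 4-cycle: one column per edge e=(u,v),
# with S(u,e) = -1 and S(v,e) = +1, as on the slide.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
N = 4
S = np.zeros((N, len(edges)))
for e, (u, v) in enumerate(edges):
    S[u, e] = -1.0
    S[v, e] = 1.0

L = S @ S.T                            # factorization L = S S*

f = np.array([1., 2., 3., 4.])
grad_f = S.T @ f                       # gradient: (S* f)(u,v) = f(v) - f(u)
```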

  11. Properties of the Laplacian. The Laplacian is symmetric and has real eigenvalues. Dirichlet form: ⟨f, L f⟩ = ∑_{u∼v} w(u, v) (f(u) − f(v))² ≥ 0, so L is positive semi-definite with non-negative eigenvalues. Spectrum: 0 = λ₀ ≤ λ₁ ≤ … ≤ λ_max. G connected: λ₁ > 0; λ_i = 0 and λ_{i+1} > 0: G has i+1 connected components. Notation: ⟨f, L g⟩ = fᵗ L g.

  12. Measuring Smoothness. ⟨f, L f⟩ = ∑_{u∼v} w(u, v) (f(u) − f(v))² ≥ 0 is a measure of «how smooth» f is on G. Using our definition of the gradient, ∇_u f = {S* f(u, v), ∀ v ∼ u}: local variation ‖∇_u f‖₂ = (∑_{v∼u} |S* f(u, v)|²)^{1/2}; total variation |f|_TV = ∑_{u∈V} ‖∇_u f‖₂ = ∑_{u∈V} (∑_{v∼u} |S* f(u, v)|²)^{1/2}.
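The smoothness measure ⟨f, L f⟩ can be verified against the edge-wise sum directly; here is a sketch on an assumed 3-vertex weighted path:

```python
import numpy as np

# Weighted path graph 0-1-2 with weights w(0,1) = 1, w(1,2) = 2.
W = np.array([[0., 1., 0.],
              [1., 0., 2.],
              [0., 2., 0.]])
L = np.diag(W.sum(axis=1)) - W

f = np.array([0., 1., 3.])

quad_form = f @ L @ f                            # <f, L f>
edge_sum = sum(W[u, v] * (f[u] - f[v]) ** 2      # sum over edges u ~ v
               for u in range(3) for v in range(u + 1, 3))
```

Both quantities agree, as the Dirichlet-form identity predicts.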

  13. Notions of Global Regularity for Graphs (Discrete Calculus, Grady and Polimeni, 2010). Edge derivative: ∂f/∂e|_m := √w(m, n) [f(n) − f(m)] for e = (m, n). Graph gradient: ∇_m f := {∂f/∂e|_m}_{e∈E s.t. e=(m,n)}. Local variation: ‖∇_m f‖₂ = [∑_{n∈N_m} w(m, n) (f(n) − f(m))²]^{1/2}. Quadratic form: ½ ∑_{m∈V} ‖∇_m f‖₂² = ½ ∑_{m∈V} ∑_{n∈N_m} w(m, n) (f(n) − f(m))² = fᵗ L f.

  14. Smoothness of Graph Signals. (Figure: the same signal f on three graphs G₁, G₂, G₃, with its graph Fourier spectrum f̂(λ) under each; the quadratic form grows with roughness: fᵗ L₁ f = 0.14, fᵗ L₂ f = 1.31, fᵗ L₃ f = 1.81.)

  15. Remark on Discrete Calculus. Discrete operators on graphs form the basis of an interesting field aiming at bringing a PDE-like framework to computational analysis on graphs:
     - Leo Grady: Discrete Calculus
     - Olivier Lezoray, Abderrahim Elmoataz and co-workers: PDEs on graphs. Many methods from PDEs in image processing can be transposed to arbitrary graphs; applications in vision (point clouds) but also machine learning (inference with graph total variation).

  16. Laplacian Eigenvectors. Spectral theorem: L = D − W is PSD with eigendecomposition L = U Λ Uᵗ, {(λ_ℓ, u_ℓ)}_{ℓ=0,1,…,N−1}. That particular basis will play the role of the Fourier basis. Graph Fourier transform: f̂(λ_ℓ) := ⟨f, u_ℓ⟩ = ∑_{i=1}^{N} f(i) u_ℓ*(i). Graph coherence: μ := max_{ℓ,i} |⟨u_ℓ, δ_i⟩| ∈ [1/√N, 1].
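As a sketch, the graph Fourier transform is just analysis against the Laplacian eigenbasis; the triangle graph below is an illustrative choice:

```python
import numpy as np

# Laplacian of the (unweighted) triangle graph.
W = np.ones((3, 3)) - np.eye(3)
L = np.diag(W.sum(axis=1)) - W

lam, U = np.linalg.eigh(L)             # L = U diag(lam) U^T, lam ascending

f = np.array([1., 2., 3.])
f_hat = U.T @ f                        # forward GFT: f_hat(l) = <f, u_l>
f_rec = U @ f_hat                      # inverse GFT (U is orthonormal)

mu = np.max(np.abs(U))                 # coherence mu = max_{l,i} |<u_l, delta_i>|
```

The transform is invertible because U is orthonormal, and μ necessarily lies in [1/√N, 1].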

  17. Important remark on eigenvectors. μ := max_{ℓ,i} |⟨u_ℓ, δ_i⟩| ∈ [1/√N, 1]; μ = 1/√N is the optimal, Fourier case. What does that mean? (Figure: eigenvectors of a modified path graph.)

  18. Examples: Cut and Clustering. Cut weight: C(A, B) := ∑_{i∈A, j∈B} W[i, j]. RatioCut(A, Ā) := C(A, Ā)/(2|A|) + C(A, Ā)/(2|Ā|), and we want min_{A⊂V} RatioCut(A, Ā). Define f[i] = √(|Ā|/|A|) if i ∈ A, and f[i] = −√(|A|/|Ā|) if i ∈ Ā. Then ‖f‖ = √|V|, ⟨f, 1⟩ = 0, and fᵗ L f = |V| · RatioCut(A, Ā). Relaxed problem: argmin_{f∈R^{|V|}} fᵗ L f subject to ‖f‖ = √|V| and ⟨f, 1⟩ = 0: looking for a smooth partition function.

  19. (Figure slide, no text)

  20. Examples: Cut and Clustering. Spectral clustering: argmin_{f∈R^{|V|}} fᵗ L f subject to ‖f‖ = √|V| and ⟨f, 1⟩ = 0. By Rayleigh-Ritz, the solution is the second eigenvector u₁. Remarks: natural extension to more than 2 sets; the solution is real-valued and needs to be quantized (in general, k-means is used); the first k eigenvectors of sparse Laplacians are computed via Lanczos, with complexity driven by the eigengap |λ_k − λ_{k+1}|. Spectral clustering := embedding + k-means: ∀ i ∈ V: i ↦ (u₀(i), …, u_{k−1}(i)).
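The recipe above can be sketched on a toy two-cluster graph (an assumption for illustration). For k = 2 and a clearly separated graph, quantizing the sign of the second eigenvector stands in for k-means:

```python
import numpy as np

# Two triangles joined by one weak edge: two obvious clusters {0,1,2}, {3,4,5}.
W = np.zeros((6, 6))
for u, v in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]:
    W[u, v] = W[v, u] = 1.0
W[2, 3] = W[3, 2] = 0.1                # weak bridge between the clusters

L = np.diag(W.sum(axis=1)) - W
lam, U = np.linalg.eigh(L)

fiedler = U[:, 1]                      # second eigenvector u_1 (relaxed RatioCut)
labels = (fiedler > 0).astype(int)     # quantize the real-valued solution
```

The sign pattern of u₁ separates the two triangles, which is exactly the relaxed RatioCut partition.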

  21. Graph Embedding / Laplacian Eigenmaps. Goal: embed the vertices in a low-dimensional space, discovering geometry: (x₁, …, x_N) ↦ (y₁, …, y_N), with x_i ∈ R^d, y_i ∈ R^k, k < d. Good embedding: nearby points are mapped nearby, so a smooth map y_i = Φ(x_i).

  22. Graph Embedding / Laplacian Eigenmaps. Goal: embed the vertices in a low-dimensional space, discovering geometry. Minimize variations / maximize smoothness of the embedding: ∑_{i,j} W[i, j] (y_i − y_j)². Laplacian eigenmaps: argmin_y yᵗ L y subject to yᵗ D y = 1 (fix scale) and yᵗ D 1 = 0, i.e. the generalized eigenproblem L y = λ D y.
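The generalized eigenproblem L y = λ D y is exactly what `scipy.linalg.eigh` solves for a symmetric-definite pair; a sketch on an assumed 5-vertex path graph:

```python
import numpy as np
from scipy.linalg import eigh

# Unweighted path graph on 5 vertices.
W = np.zeros((5, 5))
for u, v in [(0, 1), (1, 2), (2, 3), (3, 4)]:
    W[u, v] = W[v, u] = 1.0
D = np.diag(W.sum(axis=1))
L = D - W

# Generalized eigenproblem L y = lambda D y; eigh returns D-orthonormal vectors,
# so the constraint y^T D y = 1 holds automatically.
lam, Y = eigh(L, D)
y = Y[:, 1]                            # first nontrivial embedding coordinate
```

The first eigenvector (λ = 0) is constant, so D-orthogonality to it enforces the second constraint yᵗ D 1 = 0.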

  23. Laplacian Eigenmaps [Belkin, Niyogi, 2003]. (Figure slide)

  24. Remark on Smoothness: the linear / Sobolev case. Smoothness, loosely defined, has been used to motivate various methods and algorithms. But in the discrete, finite-dimensional case, asymptotic decay does not mean much. ‖∇f‖₂² ≤ M, i.e. fᵗ L f ≤ M ⇔ ∑_ℓ λ_ℓ |f̂(ℓ)|² ≤ M, which gives |f̂(ℓ)| ≤ √M / √λ_ℓ. For the truncation error E_K(f) = ‖f − P_K(f)‖₂ this yields E_K(f) ≤ ‖∇f‖₂ / √λ_{K+1}.

  25. Smoothness of Graph Signals Revisited. (Figure: same as slide 14; graphs G₁, G₂, G₃ with spectra f̂(λ) and fᵗ L₁ f = 0.14, fᵗ L₂ f = 1.31, fᵗ L₃ f = 1.81.)

  26. Functional Calculus. It will be useful to manipulate functions of the Laplacian: f(L), f : R → R. For polynomials, L^k u_ℓ = λ_ℓ^k u_ℓ. Symmetric matrices admit a (Borel) functional calculus: f(L) = ∑_{λ_ℓ ∈ S(L)} f(λ_ℓ) u_ℓ u_ℓᵗ. Use the spectral theorem on powers to get polynomials, go from polynomials to continuous functions by Stone-Weierstrass, then Riesz-Markov (non-trivial!).
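A minimal sketch of functional calculus in practice: apply a scalar function to the spectrum and check it against direct matrix computation for a polynomial (the star graph and helper name are assumptions for illustration):

```python
import numpy as np

# Star graph on 3 vertices (center 0).
W = np.array([[0., 1., 1.],
              [1., 0., 0.],
              [1., 0., 0.]])
L = np.diag(W.sum(axis=1)) - W

lam, U = np.linalg.eigh(L)

def func_of_L(f):
    """f(L) = sum_l f(lambda_l) u_l u_l^T via the spectral theorem."""
    return U @ np.diag(f(lam)) @ U.T

L_squared = func_of_L(lambda x: x ** 2)
```

For the polynomial f(x) = x², the spectral construction reproduces L @ L exactly, as the calculus requires.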

  27. Example: Diffusion on Graphs. Consider the following «heat» diffusion model: ∂f/∂t = −L f, i.e. ∂f̂(ℓ, t)/∂t = −λ_ℓ f̂(ℓ, t), with f̂(ℓ, 0) := f̂₀(ℓ). Hence f̂(ℓ, t) = e^{−t λ_ℓ} f̂₀(ℓ), and by functional calculus f = e^{−tL} f₀, with e^{−tL} = ∑_ℓ e^{−t λ_ℓ} u_ℓ u_ℓᵗ and kernel e^{−tL}[i, j] = ∑_ℓ e^{−t λ_ℓ} u_ℓ(i) u_ℓ(j). Explicitly: f(i) = ∑_ℓ ∑_{j∈V} e^{−t λ_ℓ} u_ℓ(i) u_ℓ(j) f₀(j) = ∑_ℓ e^{−t λ_ℓ} f̂₀(ℓ) u_ℓ(i).
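The heat semigroup e^{−tL} can be built directly from the eigendecomposition; a sketch on an assumed 4-vertex path, diffusing a delta:

```python
import numpy as np

# Path graph on 4 vertices.
W = np.array([[0., 1., 0., 0.],
              [1., 0., 1., 0.],
              [0., 1., 0., 1.],
              [0., 0., 1., 0.]])
L = np.diag(W.sum(axis=1)) - W

lam, U = np.linalg.eigh(L)
t = 0.5
heat_kernel = U @ np.diag(np.exp(-t * lam)) @ U.T   # e^{-tL}

f0 = np.array([1., 0., 0., 0.])        # delta at vertex 0
f = heat_kernel @ f0                    # diffused signal at time t
```

Diffusion conserves total mass (e^{−tL} 1 = 1 since L 1 = 0) and spreads heat to every vertex of the connected graph while the source stays hottest.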

  28. Example: Diffusion on Graphs. Examples of the heat kernel on a graph: with f₀(j) = δ_k(j), f(i) = ∑_ℓ e^{−t λ_ℓ} f̂₀(ℓ) u_ℓ(i) = ∑_ℓ e^{−t λ_ℓ} u_ℓ(k) u_ℓ(i).

  29. Simple De-Noising Example. Suppose a smooth signal f on a graph: ‖∇f‖₂² ≤ M, fᵗ L f ≤ M, so |f̂(ℓ)| ≤ √M / √λ_ℓ. But you observe only a noisy version y(i) = f(i) + n(i). (Figure: the original and noisy signals on the graph.)

  30. Simple De-Noising Example. De-noising by regularization: argmin_f ‖f − y‖₂² s.t. fᵗ L f ≤ M, or in penalized form argmin_f fᵗ L^r f + (τ/2) ‖f − y‖₂². Optimality: L^r f* + (τ/2)(f* − y) = 0. In the graph Fourier domain: λ_ℓ^r f̂*(ℓ) + (τ/2)(f̂*(ℓ) − ŷ(ℓ)) = 0, ∀ ℓ ∈ {0, 1, …, N−1}, hence f̂*(ℓ) = τ/(τ + 2 λ_ℓ^r) ŷ(ℓ): «low-pass» filtering! This is convolution with a kernel: f̂(ℓ) ĝ(λ_ℓ; τ, r) ⇒ g(L; τ, r).
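The closed-form spectral filter derived above is a few lines of code. The ring graph, the test signal, and the `denoise` helper name are assumptions for illustration:

```python
import numpy as np

def denoise(L, y, tau, r=1):
    """Spectral low-pass filter: f*_hat(l) = tau / (tau + 2 lam_l^r) y_hat(l)."""
    lam, U = np.linalg.eigh(L)
    g = tau / (tau + 2.0 * lam ** r)    # filter g(lambda; tau, r) <= 1, g(0) = 1
    return U @ (g * (U.T @ y))          # filter in the graph Fourier domain

# Ring graph on 6 vertices, smooth signal plus noise.
N = 6
W = np.zeros((N, N))
for i in range(N):
    W[i, (i + 1) % N] = W[(i + 1) % N, i] = 1.0
L = np.diag(W.sum(axis=1)) - W

rng = np.random.default_rng(0)
f_true = np.cos(2 * np.pi * np.arange(N) / N)   # smooth ground truth
y = f_true + 0.3 * rng.standard_normal(N)       # noisy observation
f_star = denoise(L, y, tau=2.0)
```

Because g(0) = 1, the mean of the signal is preserved, while every non-zero frequency is shrunk, so the Dirichlet energy of the output can only decrease.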
