ICML 2019, Long Beach. Connectivity-Optimized Representation Learning via Persistent Homology. Christoph D. Hofer (University of Salzburg), Roland Kwitt (University of Salzburg), Mandar Dixit (Microsoft), Marc Niethammer (UNC Chapel Hill)
Unsupervised representation learning. Q: What makes a good representation?
◮ Ability to reconstruct (→ prevalence of autoencoders)
◮ Robustness to perturbations of the input
◮ Usefulness for downstream tasks (e.g., clustering or classification)
◮ etc.
Common idea: Control (or enforce) properties of (or on) the latent representations in Z: an encoder f_θ : X → Z maps an input x into the latent space Z, a decoder g_φ : Z → X maps it back to a reconstruction x̂, and training minimizes a reconstruction loss Rec[x, x̂], often together with a regularizer (+ Reg) on the latent codes. Examples (by far not exhaustive): Contractive AEs [Rifai et al., ICML '11], Denoising AEs [Vincent et al., JMLR '10] (perturb, or zero-out, parts of the input), Sparse AEs [Makhzani & Frey, ICLR '14], and Adversarial AEs [Makhzani et al., ICLR '16], which enforce distributional properties through adversarial training.
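As a concrete illustration of this common recipe (not taken from the paper; the architecture, dimensions, and the sparsity penalty below are placeholder assumptions), here is a minimal PyTorch sketch of an autoencoder trained with Rec[x, x̂] plus a regularizer on the latent codes:

```python
import torch
import torch.nn as nn

class AE(nn.Module):
    """Toy autoencoder: f_theta encodes X -> Z, g_phi decodes Z -> X."""
    def __init__(self, in_dim=784, latent_dim=2):
        super().__init__()
        self.f_theta = nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU(),
                                     nn.Linear(128, latent_dim))
        self.g_phi = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(),
                                   nn.Linear(128, in_dim))

    def forward(self, x):
        z = self.f_theta(x)
        return self.g_phi(z), z

model = AE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.rand(100, 784)                    # toy batch standing in for real data
opt.zero_grad()
x_hat, z = model(x)
rec = nn.functional.mse_loss(x_hat, x)      # reconstruction loss Rec[x, x_hat]
reg = z.abs().mean()                        # placeholder "+ Reg" (here: an L1 sparsity penalty)
(rec + 0.1 * reg).backward()
opt.step()
```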
Motivating (toy) example. We aim to control properties of the latent space, but from a topological point of view! Assume we want to do Kernel Density Estimation (KDE) in the latent space Z: fit a Gaussian KDE to the latent codes (z_i), with the bandwidth selected via Scott's rule [Scott, 1992]. Bandwidth selection can be challenging, as the scaling of the latent codes greatly differs from one latent configuration to another.
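A minimal sketch of this setting (the two point clouds are hypothetical stand-ins for latent codes, only meant to illustrate how Scott's rule adapts the kernel bandwidth to the data scale):

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
z_small = rng.normal(scale=0.05, size=(200, 2))   # tightly clustered latent codes
z_large = rng.normal(scale=5.0, size=(200, 2))    # same shape, very different scaling

for name, z in [("small scale", z_small), ("large scale", z_large)]:
    kde = gaussian_kde(z.T, bw_method="scott")    # Scott's rule [Scott, 1992]
    print(name, "kernel covariance diag:", np.diag(kde.covariance))
```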
Controlling connectivity. Q: How do we capture topological properties and what do we want to control? We use Vietoris-Rips persistent homology (PH): grow balls of radius r around the points in the latent space Z (r = r_1, r_2, r_3, ...) and track how connected components appear and merge.
◮ PH tracks topological changes as the ball radius r increases
◮ Connectivity information is captured by 0-dim. persistent homology
What if the encoder x ↦ f_θ(x) produced a homogeneous arrangement of the latent codes (at scale η/2)? Such an arrangement would be beneficial for KDE.
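For a point cloud, 0-dim. Vietoris-Rips persistence can be computed from a Euclidean minimum spanning tree: every point is born at r = 0 and components merge exactly along MST edges, so the finite death times are (up to the radius-vs-diameter convention) the MST edge lengths. A short sketch, not the authors' implementation:

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.sparse.csgraph import minimum_spanning_tree

def zero_dim_persistence(points):
    """Finite death times of the 0-dim. Vietoris-Rips persistence diagram."""
    dists = squareform(pdist(points))      # pairwise distance matrix
    mst = minimum_spanning_tree(dists)     # sparse matrix holding the MST edges
    return np.sort(mst.data)               # merge scales; all births are 0

z = np.random.default_rng(1).uniform(size=(50, 2))   # toy batch of latent codes
print(zero_dim_persistence(z))             # 49 finite death times (+ one essential component)
```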
Connectivity loss. Q: How can we control topological properties (connectivity properties in particular)? Consider training batches (x_1, ..., x_B): the encoder f_θ : X → R^n maps the batch to latent codes, the decoder g_φ : R^n → X reconstructs, and we minimize the reconstruction loss Rec[·, ·] plus a connectivity loss. The connectivity loss L_η is computed from the persistent homology (PH) of the encoded batch and penalizes deviations from a homogeneous arrangement (with scale η); its gradient signal has to flow back through PH into the encoder. Until now, we could not backpropagate through PH.
From a theoretical perspective, we show ...
(1) ... that under mild conditions, the connectivity loss is differentiable
(2) ... metric-entropy based guidelines for choosing the training batch size B
(3) ... that "densification" effects occur for sample sizes N much larger than the training batch size (x_1, ..., x_N with N ≫ B)
Intuitively, during training ...
... the reconstruction loss controls what is worth capturing
... the connectivity loss controls how to topologically organize the latent space (a code sketch of such a loss follows below)
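This sketch is an assumption-laden reading of such a connectivity loss (penalizing the absolute deviation of the batch's 0-dim. death times from η), not the authors' implementation: the combinatorial part (which pairs of points merge) is computed without gradients, and the loss is then expressed through the corresponding pairwise distances, which keeps it differentiable with respect to the latent codes almost everywhere.

```python
import numpy as np
import torch
from scipy.sparse.csgraph import minimum_spanning_tree

def connectivity_loss(z, eta):
    """Penalize deviation of the batch's 0-dim. death times from the scale eta."""
    d = torch.cdist(z, z)                                   # pairwise distances, differentiable in z
    mst = minimum_spanning_tree(d.detach().cpu().numpy())   # which pairs merge: no gradient needed here
    i, j = mst.nonzero()                                    # index pairs of the merging edges
    i = torch.from_numpy(i.astype(np.int64))
    j = torch.from_numpy(j.astype(np.int64))
    deaths = d[i, j]                                        # re-indexing keeps the gradient w.r.t. z
    return (deaths - eta).abs().sum()

# Toy usage; in a real training loop z = f_theta(x_batch) and the total loss is
# rec_loss + lam * connectivity_loss(z, eta).
z = torch.randn(100, 2, requires_grad=True)
connectivity_loss(z, eta=0.5).backward()                    # gradient signal reaches the latent codes
```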
Experiments – Task: One-class learning. An auxiliary autoencoder (f_θ, g_φ) is trained only once on unlabeled data (e.g., on CIFAR-10 without labels), minimizing Rec[·, ·] plus the connectivity loss (with fixed scale η) computed via PH. KDE-inspired one-class "learning": encode the one-class samples with f_θ and anchor balls of radius r = η/2 at them. Computation of a one-class score: encode test samples with f_θ and count the number of samples falling into the balls of radius η anchored at the one-class instances; in-class samples yield high counts, out-of-class samples low counts.
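A sketch of such a counting score (a hypothetical helper following the slide's description, with the ball radius taken as η; it is not the authors' code):

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def one_class_scores(z_one_class, z_test, eta):
    """Score = number of eta-balls, anchored at encoded one-class instances, containing the test code."""
    nn = NearestNeighbors(radius=eta).fit(z_one_class)
    neighborhoods = nn.radius_neighbors(z_test, return_distance=False)
    return np.array([len(ids) for ids in neighborhoods])   # high count: in-class, low count: out-of-class

# Usage with encoded data, e.g. z_one_class = f_theta(one_class_images), z_test = f_theta(test_images).
```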
Results – Task: One-class learning on CIFAR-10 (AE trained on CIFAR-100), reported as mean AUROC (∅ AUROC); training batch size B = 100. Compared methods: DAGMM [Zong et al., ICLR '18], DSEBM [Zhai et al., ICML '16], OC-SVM (CAE), Deep-SVDD [Ruff et al., ICML '18], ADT [Golan & El-Yaniv, NIPS '18], and Ours-120. The low-sample-size comparison against ADT (ADT-1,000, ADT-500, ADT-120 vs. Ours-120) shows a gain of +7 points in mean AUROC for Ours-120.