

  1. Convolutional dictionary learning based auto-encoders for natural exponential-family distributions
     Bahareh Tolooshams*¹, Andrew H. Song*², Simona Temereanca³, and Demba Ba¹
     ¹Harvard University   ²Massachusetts Institute of Technology   ³Brown University
     *Equal contributions
     CRISP Group: https://crisp.seas.harvard.edu | ICML 2020

  2. Outline: 1. Motivation, 2. Introduction, 3. Deep Convolutional Exponential Auto-encoder (DCEA), 4. Experiments, 5. Conclusion

  3. Motivation
     Signal Processing (SP) with generative models, e.g., the sparse coding model y = Hx + ε with x sparse:
     • Slow and not scalable ✗
     • Interpretable ✓
     • Memory efficient ✓
     Deep Learning:
     • Fast and scalable ✓
     • Not interpretable ✗
     • Memory and computationally expensive ✗
     Goals:
     • Benefit from the scalability of deep learning for traditional SP tasks.
     • Use generative models as a guide for designing interpretable and memory-efficient networks.

  4. Introduction

  5. Convolutional Dictionary Learning (CDL)
     Generative model for each data point j:
         y^j = Hx^j + ε^j = Σ_{c=1}^{C} h_c ∗ x_c^j + ε^j,   ε^j ∼ N(0, σ²I),
     where each x_c^j is sparse.
     Goal: learn the dictionary H that maps the sparse representation x^j to the data y^j:
         min_{{h_c}_{c=1}^C, {x^j}_{j=1}^J}  Σ_{j=1}^{J}  ½‖y^j − Hx^j‖₂² + λ‖x^j‖₁
     • Minimizing w.r.t. x^j → Convolutional Sparse Coding (CSC).
     • Minimizing w.r.t. H and x^j → Convolutional Dictionary Learning (CDL).
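
To make the objective concrete, here is a minimal NumPy sketch of the per-signal CDL cost; the function name and the use of "same"-mode 1-D convolution are illustrative assumptions, not details from the slides.

```python
import numpy as np

def cdl_objective(y, filters, codes, lam):
    """Per-signal CDL cost: 0.5*||y - sum_c h_c * x_c||_2^2 + lam * sum_c ||x_c||_1.
    filters/codes are lists of 1-D arrays; 'same'-mode convolution is assumed."""
    recon = sum(np.convolve(x_c, h_c, mode="same") for h_c, x_c in zip(filters, codes))
    return 0.5 * np.sum((y - recon) ** 2) + lam * sum(np.abs(x_c).sum() for x_c in codes)
```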

  6. Unfolding Networks
     Solve CSC and CDL with an iterative proximal gradient algorithm, unrolled into a network.
     [Diagram: unfolded architectures producing the code x^T after T repeated steps: ISTA [1] with dictionary H and step size α; LISTA [2] with a learned encoder W_e and matrix S; CSCNet [3] with a learned encoder W_e and decoder W_d.]
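
As a reference point for the unrolling, here is a minimal NumPy sketch of T ISTA iterations for the dense (non-convolutional) case; the convolutional variant replaces the matrix products with convolutions. The variable names and the zero initialization are illustrative choices.

```python
import numpy as np

def soft_threshold(z, thresh):
    # Proximal operator of the l1 norm
    return np.sign(z) * np.maximum(np.abs(z) - thresh, 0.0)

def ista(y, H, lam, alpha, T):
    """T unrolled ISTA steps for min_x 0.5*||y - Hx||_2^2 + lam*||x||_1."""
    x = np.zeros(H.shape[1])
    for _ in range(T):
        x = soft_threshold(x + alpha * H.T @ (y - H @ x), alpha * lam)
    return x
```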


  9. What if the observations are no longer Gaussian?
     Count-valued data, e.g., fingerprint images and photon-based imaging.
     Classical CDL approach: alternating minimization with a Poisson generative model [4, 5].
     • Unsupervised ✓
     • Follows a generative model ⇒ interpretable ✓
     • Not scalable (can take minutes to hours to denoise a single image) ✗

  10. Our Contributions
     [Diagram: the proposed auto-encoder; the encoder repeats T proximal-gradient steps using H^T and the inverse link f⁻¹(·), followed by a linear decoder H.]
     • An auto-encoder inspired by CDL, termed the Deep Convolutional Exponential Auto-encoder (DCEA), for non-real-valued (e.g., count-valued) data.
     • A demonstration of the flexibility of DCEA for both an unsupervised task (CDL) and a supervised task (Poisson image denoising).
     • An analysis of the gradient dynamics of the shallow exponential auto-encoder (SEA), proving that SEA recovers the parameters of the generative model.

  11. Deep Convolutional Exponential Auto-encoder (DCEA)

  12. Deep Convolutional Exponential Auto-encoder
     Problem description
     Natural exponential family with a convolutional generative model:
         log p(y | µ) = f(µ)^T y + g(y) − B(µ),   where f(µ) = Hx and x is sparse.
     Inverse links:
         Distribution   y          f⁻¹(·)        B(z)
         Gaussian       ℝ          I(·)          z^T z
         Binomial       [0..M]     sigmoid(·)    −1^T log(1 − z)
         Poisson        [0..∞)     exp(·)        1^T z
     Exponential Convolutional Dictionary Learning (ECDL):
         min_{H, x}  −log p(y | µ) + λ‖x‖₁,
     i.e., the negative log-likelihood plus a code sparsity constraint.
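
For concreteness, a small NumPy sketch of the inverse links above and the corresponding negative log-likelihoods; these are standard forms written up to additive constants in y, and the function and dictionary names are illustrative.

```python
import numpy as np

# Inverse links f^{-1}(.), applied elementwise to z = Hx
inv_link = {
    "gaussian": lambda z: z,                         # identity
    "binomial": lambda z: 1.0 / (1.0 + np.exp(-z)),  # sigmoid
    "poisson":  lambda z: np.exp(z),
}

def neg_log_likelihood(dist, y, z, M=1):
    """-log p(y | mu) with mu = f^{-1}(z), up to constants in y.
    M is the binomial trial count."""
    mu = inv_link[dist](z)
    if dist == "gaussian":
        return 0.5 * np.sum((y - mu) ** 2)
    if dist == "binomial":
        return -np.sum(y * np.log(mu) + (M - y) * np.log(1.0 - mu))
    if dist == "poisson":
        return np.sum(mu - y * np.log(mu))
    raise ValueError(dist)
```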

  13. Deep Convolutional Exponential Auto-encoder
     Network architecture
     [Diagram: encoder repeating T steps of the update below, with ỹ^t = f⁻¹(Hx^{t−1}); decoder producing f(µ̂) = Hx^T.]
     Components for different distributions:
         Distribution   y          f⁻¹(·)        Encoder unfolding (x^t)                      Decoder (f(µ̂))
         Gaussian       ℝ          I(·)          S_b(x^{t−1} + αH^T(y − ỹ^t))                 Hx^T
         Binomial       [0..M]     sigmoid(·)    S_b(x^{t−1} + αH^T((1/M)y − ỹ^t))            Hx^T
         Poisson        [0..∞)     exp(·)        S_b(x^{t−1} + αH^T(y − ELU(ỹ^t)))            Hx^T
     (Note the ELU in the Poisson encoder, presumably in place of exp for numerical stability.)
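
Below is a minimal PyTorch sketch of this architecture for the Poisson case on 1-D signals. It is an illustration under stated assumptions, not the authors' released code: the hyperparameters, the zero code initialization, and the use of exp as the inverse link (where the slides substitute an ELU-based surrogate) are all illustrative choices.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DCEA(nn.Module):
    """Sketch of a Poisson DCEA for 1-D signals (illustrative hyperparameters;
    kernel_size should be odd so 'same' padding preserves length)."""
    def __init__(self, num_filters=8, kernel_size=11, T=15, alpha=0.1):
        super().__init__()
        # Convolutional dictionary H, shared by encoder (H^T) and decoder (H)
        self.H = nn.Parameter(0.1 * torch.randn(num_filters, 1, kernel_size))
        self.b = nn.Parameter(torch.full((1, num_filters, 1), 0.01))  # threshold
        self.T, self.alpha = T, alpha

    def shrink(self, z):
        # Soft-thresholding: proximal operator of the l1 norm
        return torch.sign(z) * F.relu(torch.abs(z) - self.b)

    def forward(self, y):                       # y: (batch, 1, length)
        pad = self.H.shape[-1] // 2
        x = torch.zeros(y.shape[0], self.H.shape[0], y.shape[-1], device=y.device)
        for _ in range(self.T):                 # encoder: T unfolded proximal steps
            y_tilde = torch.exp(F.conv_transpose1d(x, self.H, padding=pad))
            x = self.shrink(x + self.alpha * F.conv1d(y - y_tilde, self.H, padding=pad))
        log_rate = F.conv_transpose1d(x, self.H, padding=pad)  # decoder: Hx^T
        return log_rate, x
```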

  14. Deep Convolutional Exponential Auto-encoder
     Training & inference
     [Diagram: forward pass through the encoder and decoder to the loss L; backward pass propagating gradients to H.]
     Training:
     • Forward pass: estimate the code x^T and compute the loss function.
     • Backward pass (back-propagation): update the dictionary H.
     • Together, these steps are equivalent to alternating minimization in CDL.
     Inference: once trained, inference (a forward pass) is fast.
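
A minimal training-loop sketch, assuming the hypothetical DCEA module sketched above and a data loader of count-valued signals; the optimizer, learning rate, and `loader` are illustrative assumptions.

```python
model = DCEA()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
nll = nn.PoissonNLLLoss(log_input=True)   # input interpreted as log-rate Hx

for y in loader:               # y: (batch, 1, length) count-valued signals
    log_rate, _ = model(y)     # forward pass: estimate the code, reconstruct
    loss = nll(log_rate, y)    # negative Poisson log-likelihood
    opt.zero_grad()
    loss.backward()            # backward pass: gradient step on H (and b)
    opt.step()
```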

  15. Unsupervised → Supervised
     Repurpose DCEA for supervised tasks with two modifications:
     1. Loss function: any supervised loss, e.g., a reconstruction MSE loss or a perceptual loss.
     2. Architecture: relax the constraints, i.e., untie the weights of the encoder and decoder and learn the bias b:
         Original:  x^t = S_b(x^{t−1} + α H^T (y − f⁻¹(H x^{t−1}))),       decoder: Hx^T
         Relaxed:   x^t = S_b(x^{t−1} + α W_e^T (y − f⁻¹(W_d x^{t−1}))),   decoder: Hx^T
     • Further relaxations are possible, e.g., a deep, nonlinear decoder. (See the untied-weights sketch below.)
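
As an illustration of the relaxed architecture, here is a sketch that unties the encoder filters from the dictionary, extending the hypothetical DCEA module above; initializing W_e and W_d from H is an assumption, not a detail from the slides.

```python
class DCEAUntied(DCEA):
    """Relaxed DCEA: encoder filters W_e and W_d untied from the decoder H."""
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.W_e = nn.Parameter(self.H.detach().clone())  # replaces H^T in the update
        self.W_d = nn.Parameter(self.H.detach().clone())  # replaces H inside f^{-1}(.)

    def forward(self, y):
        pad = self.H.shape[-1] // 2
        x = torch.zeros(y.shape[0], self.H.shape[0], y.shape[-1], device=y.device)
        for _ in range(self.T):
            y_tilde = torch.exp(F.conv_transpose1d(x, self.W_d, padding=pad))
            x = self.shrink(x + self.alpha * F.conv1d(y - y_tilde, self.W_e, padding=pad))
        return F.conv_transpose1d(x, self.H, padding=pad), x
```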

  16. Experiments

  17. Experiments: Poisson image denoising
     Baseline frameworks:
         Method           Supervised?   Description
         SPDA [5]         ✗             ECDL + patch-based
         CA [6]           ✓             denoising NN
         DCEA-C (ours)    ✓             constrained DCEA (tied weights)
         DCEA-UC (ours)   ✓             unconstrained DCEA (untied weights)
     [Table: PSNR performance on the test dataset.]
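
PSNR is the evaluation metric reported here; for reference, a standard NumPy definition (the `peak` argument, matching the photon-peak setting of these experiments, is an illustrative naming choice):

```python
import numpy as np

def psnr(clean, denoised, peak):
    """Peak signal-to-noise ratio in dB; 'peak' is the maximum clean intensity."""
    mse = np.mean((clean - denoised) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)
```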

  18. Experiments: Poisson image denoising
     [Figure: qualitative results on two images, each showing the original, the noisy observation (peak = 4 and peak = 2, respectively), and the DCEA-C and DCEA-UC reconstructions.]

  19. Experiments: Poisson image denoising
     • Classical ECDL (SPDA vs. DCEA-C): better denoising and much greater efficiency, i.e., a classical inference task leveraging the scalability of neural networks.
     • Denoising NN (CA vs. DCEA-UC): competitive denoising with far fewer parameters, i.e., a neural-network architecture leveraging the generative-model paradigm.

  20. Experiments: CDL for simulated binomial data
     [Figure: example of simulated neural spikes and the true rate.]
     [Figure: randomly initialized (blue), true (orange), and learned (green) templates.]
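
To make the simulation setup concrete, here is a small sketch of how binomial observations can be drawn from the convolutional generative model; the sizes, the single template, and the event count are illustrative assumptions, not the paper's exact protocol.

```python
import numpy as np

rng = np.random.default_rng(0)

# Binomial observations from the convolutional model:
# rate = sigmoid(h * x), y ~ Binomial(M, rate)
N, K, M = 1000, 50, 25                            # signal length, kernel, trials
h = rng.standard_normal(K)                        # one template, for simplicity
x = np.zeros(N)
x[rng.choice(N, size=10, replace=False)] = 1.0    # sparse event train
rate = 1.0 / (1.0 + np.exp(-np.convolve(x, h, mode="same")))
y = rng.binomial(M, rate)
```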

  21. Experiments: CDL for simulated binomial data
     • If we untie the weights, i.e., relax the generative-model constraints:
       [Figure: true vs. learned templates for (a) c = 1 and (b) c = 2, over time (ms).]
     • If we treat the binomial data as Gaussian observations, i.e., model mismatch.

  22. Conclusion
