Deep Learning Basics Lecture 8: Autoencoder & DBM Princeton University COS 495 Instructor: Yingyu Liang
Autoencoder
Autoencoder • A neural network trained to attempt to copy its input to its output • Contains two parts: • Encoder: maps the input to a hidden representation • Decoder: maps the hidden representation back to the output
Autoencoder • Diagram: input x → hidden representation h (the code) → reconstruction r
Autoencoder • Encoder f(⋅): h = f(x) • Decoder g(⋅): r = g(h) = g(f(x))
Why copy the input to the output? • We do not really care about the copying itself • Interesting case: the autoencoder is NOT able to copy exactly but strives to do so • It is then forced to select which aspects of the input to preserve, and thus can hopefully learn useful properties of the data • Historical note: goes back to LeCun (1987), Bourlard and Kamp (1988), and Hinton and Zemel (1994)
Undercomplete autoencoder • Constrain the code h to have smaller dimension than the input x • Training: minimize a loss function L(x, r) = L(x, g(f(x)))
Undercomplete autoencoder • Constrain the code h to have smaller dimension than the input x • Training: minimize a loss function L(x, r) = L(x, g(f(x))) • Special case: f, g linear, L the mean squared error • Reduces to Principal Component Analysis
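A minimal sketch (not from the lecture) of an undercomplete linear autoencoder trained with mean squared error; with linear encoder/decoder this recovers the same subspace as PCA. All names (enc, dec, code_dim) and sizes are illustrative assumptions.

```python
import torch

n, input_dim, code_dim = 1000, 20, 5           # toy sizes, chosen arbitrarily
x = torch.randn(n, input_dim)                  # stand-in data

enc = torch.nn.Linear(input_dim, code_dim, bias=False)   # f: x -> h
dec = torch.nn.Linear(code_dim, input_dim, bias=False)   # g: h -> r
opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-2)

for step in range(2000):
    h = enc(x)                      # h = f(x)
    r = dec(h)                      # r = g(f(x))
    loss = ((x - r) ** 2).mean()    # L(x, r) = mean squared error
    opt.zero_grad()
    loss.backward()
    opt.step()
```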
Undercomplete autoencoder • What about a nonlinear encoder and decoder? • Capacity should not be too large • Suppose we are given data x^(1), x^(2), …, x^(n) • Encoder maps x^(i) to i • Decoder maps i back to x^(i) • A one-dimensional code h then suffices for perfect reconstruction, yet nothing useful about the data is learned
Regularization • Typically NOT by • keeping the encoder/decoder shallow, or • using a small code size • Regularized autoencoders: add a regularization term that encourages the model to have other properties • Sparsity of the representation (sparse autoencoder) • Robustness to noise or to missing inputs (denoising autoencoder) • Smallness of the derivative of the representation
Sparse autoencoder • Constrain the code to be sparse • Training: minimize a loss function L_R = L(x, g(f(x))) + R(h)
Probabilistic view of regularizing h • Suppose we have a probabilistic model p(h, x) • MLE on x: log p(x) = log Σ_{h′} p(h′, x) • Hard to sum over h′
Probabilistic view of regularizing h • Suppose we have a probabilistic model p(h, x) • MLE on x: max log p(x) = max log Σ_{h′} p(h′, x) • Approximation: suppose h = f(x) gives the most likely hidden representation, and Σ_{h′} p(h′, x) can be approximated by p(h, x)
Probabilistic view of regularizing h • Suppose we have a probabilistic model p(h, x) • Approximate MLE on x, with h = f(x): max log p(h, x) = max [ log p(x|h) + log p(h) ] • log p(x|h) gives the loss term; log p(h) gives the regularization term
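To spell out how a concrete prior turns into a concrete regularizer (a standard calculation added here, not taken verbatim from the slides): with a factorized Laplacian prior over the code, the negative log-prior is exactly an L1 penalty, which is the case instantiated on the next slide.

```latex
% Factorized Laplacian prior on the code h = (h_1, ..., h_d):
p(h) = \prod_{k=1}^{d} \frac{\lambda}{2}\, e^{-\lambda |h_k|}
\quad\Longrightarrow\quad
-\log p(h) = \lambda \sum_{k=1}^{d} |h_k| - d \log \frac{\lambda}{2}
           = \lambda \|h\|_1 + \text{const}.
% Using -log p(h) as R(h) therefore yields the sparsity penalty \lambda \|h\|_1.
```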
Sparse autoencoder • Constrain the code to be sparse • Laplacian prior: p(h) = Π_k (λ/2) exp(−λ|h_k|), i.e. p(h) ∝ exp(−λ‖h‖₁) • Training: minimize a loss function L_R = L(x, g(f(x))) + λ‖h‖₁
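A minimal sketch (our own, not the lecture's code) of the sparse autoencoder objective: reconstruction error plus the L1 penalty λ‖h‖₁ on the code. The architecture, λ value, and names (f, g, lam) are illustrative assumptions.

```python
import torch

input_dim, code_dim, lam = 20, 50, 1e-3        # overcomplete code; lam is an assumed value
f = torch.nn.Sequential(torch.nn.Linear(input_dim, code_dim), torch.nn.ReLU())  # encoder
g = torch.nn.Linear(code_dim, input_dim)                                        # decoder
opt = torch.optim.Adam(list(f.parameters()) + list(g.parameters()), lr=1e-3)

x = torch.randn(128, input_dim)                # stand-in batch
for step in range(1000):
    h = f(x)
    r = g(h)
    # L(x, g(f(x))) + lam * ||h||_1  (L1 averaged over the batch)
    loss = ((x - r) ** 2).mean() + lam * h.abs().sum(dim=1).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```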
Denoising autoencoder • Traditional autoencoder: encourages g(f(⋅)) to learn the identity map • Denoising autoencoder: minimize a loss function L(x, r) = L(x, g(f(x̃))), where x̃ is x + noise
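A minimal sketch (assumptions: Gaussian corruption with standard deviation 0.3, MSE loss, a small MLP) of the denoising objective: reconstruct the clean x from a corrupted input x̃.

```python
import torch

input_dim, code_dim = 20, 10
f = torch.nn.Sequential(torch.nn.Linear(input_dim, code_dim), torch.nn.ReLU())  # encoder
g = torch.nn.Linear(code_dim, input_dim)                                        # decoder
opt = torch.optim.Adam(list(f.parameters()) + list(g.parameters()), lr=1e-3)

x = torch.randn(128, input_dim)                    # stand-in clean data
for step in range(1000):
    x_tilde = x + 0.3 * torch.randn_like(x)        # x_tilde = x + noise
    r = g(f(x_tilde))                              # reconstruct from the corrupted input
    loss = ((x - r) ** 2).mean()                   # L(x, g(f(x_tilde))): the target is the clean x
    opt.zero_grad(); loss.backward(); opt.step()
```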
Boltzmann machine
Boltzmann machine • Introduced by Ackley et al. (1985) • A general “connectionist” approach to learning arbitrary probability distributions over binary vectors • Special case of energy model: p(x) = exp(−E(x)) / Z
Boltzmann machine • Energy model: p(x) = exp(−E(x)) / Z • Boltzmann machine: special case of energy model with E(x) = −xᵀUx − bᵀx, where U is the weight matrix and b is the bias parameter
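A small numerical illustration (our own, with arbitrary toy sizes) of the Boltzmann machine energy E(x) = −xᵀUx − bᵀx and the corresponding unnormalized probability exp(−E(x)).

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4                                            # number of binary units (toy size)
U = rng.normal(size=(d, d)); U = (U + U.T) / 2   # symmetric weight matrix
np.fill_diagonal(U, 0.0)                         # no self-connections
b = rng.normal(size=d)                           # bias vector

def energy(x):
    return -x @ U @ x - b @ x                    # E(x) = -x^T U x - b^T x

x = np.array([1, 0, 1, 1], dtype=float)
print(energy(x), np.exp(-energy(x)))             # energy and unnormalized probability
```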
Boltzmann machine with latent variables • Some variables are not observed: x = (x_v, x_h), with x_v visible and x_h hidden • E(x) = −x_vᵀ R x_v − x_vᵀ W x_h − x_hᵀ S x_h − bᵀ x_v − cᵀ x_h • Universal approximator of probability mass functions
Maximum likelihood • Suppose we are given data X = { x_v^(1), x_v^(2), …, x_v^(n) } • Maximum likelihood is to maximize log p(X) = Σ_i log p(x_v^(i)), where p(x_v) = Σ_{x_h} p(x_v, x_h) = (1/Z) Σ_{x_h} exp(−E(x_v, x_h)) • Z = Σ_{x_v, x_h} exp(−E(x_v, x_h)): the partition function, difficult to compute
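For intuition only, a brute-force sketch (our own, with tiny toy sizes and symmetry constraints on R, S omitted) that computes log p(x_v) exactly by enumerating every binary configuration. Both Z and the sum over x_h range over exponentially many terms, which is exactly why they become intractable for realistic model sizes.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
nv, nh = 3, 2                                    # toy sizes: 2**(nv+nh) configurations total
R = rng.normal(size=(nv, nv)); S = rng.normal(size=(nh, nh))
W = rng.normal(size=(nv, nh))
b = rng.normal(size=nv); c = rng.normal(size=nh)

def energy(xv, xh):
    return -xv @ R @ xv - xv @ W @ xh - xh @ S @ xh - b @ xv - c @ xh

def all_binary(n):
    return [np.array(bits, dtype=float) for bits in itertools.product([0, 1], repeat=n)]

# Partition function: sum over all (x_v, x_h) pairs.
Z = sum(np.exp(-energy(xv, xh)) for xv in all_binary(nv) for xh in all_binary(nh))

# Marginal likelihood of one visible vector: sum over all hidden configurations.
xv = np.array([1, 0, 1], dtype=float)
log_p = np.log(sum(np.exp(-energy(xv, xh)) for xh in all_binary(nh)) / Z)
print(log_p)                                     # log p(x_v)
```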
Restricted Boltzmann machine • Invented under the name harmonium (Smolensky, 1986) • Popularized by Hinton and collaborators under the name restricted Boltzmann machine
Restricted Boltzmann machine • Special case of Boltzmann machine with latent variables: p(v, h) = exp(−E(v, h)) / Z, where the energy function is E(v, h) = −vᵀWh − bᵀv − cᵀh, with weight matrix W and biases b, c • Partition function: Z = Σ_v Σ_h exp(−E(v, h))
Restricted Boltzmann machine • Figure from Deep Learning, Goodfellow, Bengio and Courville
Restricted Boltzmann machine • The conditional distribution is factorial: p(h|v) = p(v, h) / p(v) = Π_k p(h_k | v), with p(h_k = 1 | v) = σ(c_k + vᵀ W_{:,k}), where σ is the logistic function
Restricted Boltzmann machine • Similarly, p(v|h) = p(v, h) / p(h) = Π_i p(v_i | h), with p(v_i = 1 | h) = σ(b_i + W_{i,:} h), where σ is the logistic function
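A sketch (our own, with arbitrary toy sizes) of these two factorial conditionals and one step of block Gibbs sampling: sample all of h given v, then all of v given h, using p(h_k = 1 | v) = σ(c_k + vᵀ W_{:,k}) and p(v_i = 1 | h) = σ(b_i + W_{i,:} h).

```python
import numpy as np

rng = np.random.default_rng(0)
nv, nh = 6, 4                                    # toy sizes
W = 0.1 * rng.normal(size=(nv, nh))
b = np.zeros(nv); c = np.zeros(nh)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sample_h_given_v(v):
    p = sigmoid(c + v @ W)                       # factorial conditional over all h_k at once
    return (rng.random(nh) < p).astype(float), p

def sample_v_given_h(h):
    p = sigmoid(b + W @ h)                       # factorial conditional over all v_i at once
    return (rng.random(nv) < p).astype(float), p

v = (rng.random(nv) < 0.5).astype(float)         # random initial visible vector
h, _ = sample_h_given_v(v)                       # one block Gibbs step: v -> h -> v'
v_new, _ = sample_v_given_h(h)
```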
Deep Boltzmann machine • Special case of energy model. Take 3 hidden layers and ignore biases: p(v, h^1, h^2, h^3) = exp(−E(v, h^1, h^2, h^3)) / Z • Energy function: E(v, h^1, h^2, h^3) = −vᵀ W^1 h^1 − (h^1)ᵀ W^2 h^2 − (h^2)ᵀ W^3 h^3, with weight matrices W^1, W^2, W^3 • Partition function: Z = Σ_{v, h^1, h^2, h^3} exp(−E(v, h^1, h^2, h^3))
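A sketch (our own, with illustrative layer widths) of this 3-hidden-layer DBM energy with biases ignored; only adjacent layers interact, through W^1, W^2, W^3.

```python
import numpy as np

rng = np.random.default_rng(0)
sizes = [5, 4, 3, 2]                                   # toy layer widths: v, h1, h2, h3
W1, W2, W3 = (rng.normal(size=(sizes[i], sizes[i + 1])) for i in range(3))

def energy(v, h1, h2, h3):
    # E = -v^T W1 h1 - h1^T W2 h2 - h2^T W3 h3 (biases omitted, as on the slide)
    return -(v @ W1 @ h1) - (h1 @ W2 @ h2) - (h2 @ W3 @ h3)

v, h1, h2, h3 = (rng.integers(0, 2, size=s).astype(float) for s in sizes)
print(energy(v, h1, h2, h3), np.exp(-energy(v, h1, h2, h3)))   # unnormalized probability (up to 1/Z)
```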
Deep Boltzmann machine • Figure from Deep Learning, Goodfellow, Bengio and Courville