Representation Learning
Lecture slides for Chapter 15 of Deep Learning
www.deeplearningbook.org
Ian Goodfellow
2017-10-03
Unsupervised Pretraining Usually Hurts but Sometimes Helps
[Chart: harm done by pretraining across many different chemistry datasets (Ma et al., 2015), with annotations marking the average advantage of not pretraining and the break-even point]
(Goodfellow 2017)
Pretraining Changes Learning Trajectory
[Figure 15.1: two scatter panels of learning trajectories, "Without pretraining" and "With pretraining"; axis tick values omitted]
(Goodfellow 2017)
Representation Sharing for Multi-Task or Transfer Learning
[Figure 15.2: one representation h (shared) used for many input formats x(1), x(2), x(3) or many tasks, with a selection switch routing through task-specific layers h(1), h(2), h(3) to the output y]
(Goodfellow 2017)
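A minimal sketch (not from the slides; all sizes and weights are illustrative placeholders) of the shared-representation idea in Figure 15.2: one shared encoder produces h, and a per-task output head acts as the selection switch.

import numpy as np

rng = np.random.default_rng(0)

n_in, n_hidden, n_tasks, n_out = 10, 32, 3, 2
W_shared = rng.normal(size=(n_in, n_hidden))                    # shared encoder weights
W_heads = [rng.normal(size=(n_hidden, n_out)) for _ in range(n_tasks)]

def forward(x, task):
    h = np.tanh(x @ W_shared)   # h (shared): one representation for every task
    return h @ W_heads[task]    # task-specific head plays the "selection switch"

x = rng.normal(size=n_in)
print(forward(x, task=0))       # the same h would feed task 1 or task 2 as well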
Zero Shot Learning
[Figure 15.3: encoder functions f_x and f_y map x-space and y-space into representation spaces, with h_x = f_x(x) and h_y = f_y(y); (x, y) pairs in the training set, relationships between embedded points within one domain, and maps between the representation spaces allow x_test to be matched to y_test]
(Goodfellow 2017)
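A minimal sketch (my own, with random matrices standing in for learned encoders) of the zero-shot setup in Figure 15.3: f_x and f_y embed both modalities into one shared space, and a test input x is matched to the nearest embedded label description y, including labels never paired with any x during training.

import numpy as np

rng = np.random.default_rng(0)
d = 8                           # shared embedding dimension (illustrative)

Wx = rng.normal(size=(16, d))   # stands in for a trained encoder f_x
Wy = rng.normal(size=(4, d))    # stands in for a trained encoder f_y

def f_x(x): return x @ Wx
def f_y(y): return y @ Wy

x_test = rng.normal(size=16)
y_candidates = np.eye(4)        # label descriptions, e.g. attribute vectors

# Predict by nearest neighbor in the shared representation space.
dists = np.linalg.norm(f_y(y_candidates) - f_x(x_test), axis=1)
print("predicted label:", dists.argmin())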
Mixture Modeling Discovers Separate Classes
[Figure 15.4: a density p(x) over x with three modes, one per class y = 1, 2, 3]
(Goodfellow 2017)
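A minimal sketch of the idea in Figure 15.4: fitting a mixture model to unlabeled x can recover cluster structure that lines up with the unseen classes. The data here is synthetic, and scikit-learn is assumed to be available.

import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Three well-separated 1-D modes standing in for three classes y = 1, 2, 3.
x = np.concatenate([rng.normal(-4, 0.5, 200),
                    rng.normal(0, 0.5, 200),
                    rng.normal(4, 0.5, 200)]).reshape(-1, 1)

gmm = GaussianMixture(n_components=3, random_state=0).fit(x)
print(np.sort(gmm.means_.ravel()))   # close to the true modes -4, 0, 4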
Mean Squared Error Can Ignore Small but Task-Relevant Features
[Figure 15.5: input image and its reconstruction] The ping pong ball vanishes because it is not large enough to significantly affect the mean squared error.
(Goodfellow 2017)
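A tiny numeric illustration (my own, using a synthetic image) of why mean squared error can ignore a small object like the ping pong ball in Figure 15.5: a salient object covering only a few pixels contributes almost nothing to the pixel-averaged error.

import numpy as np

img = np.zeros((100, 100))
img[50:53, 50:53] = 1.0         # a small, salient 3x3 "ball"

recon = np.zeros((100, 100))    # a reconstruction that simply drops the ball
mse = np.mean((img - recon) ** 2)
print(mse)                      # 9 / 10000 = 0.0009: a negligible penalty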
Adversarial Losses Preserve Any Features with Highly Structured Patterns
[Figure 15.6: ground truth, MSE reconstruction, and adversarial reconstruction] Mean squared error loses the ear because omitting it changes only a few pixels by a small amount. An adversarial loss preserves the ear because its absence is easy to notice.
(Goodfellow 2017)
Binary Distributed Representations Divide Space Into Many Uniquely Identifiable Regions
[Figure 15.7: three hyperplanes h_1, h_2, h_3 partition the plane into regions labeled by the codes h = [1, 0, 0]^T, [1, 1, 0]^T, [1, 0, 1]^T, [1, 1, 1]^T, [0, 1, 0]^T, [0, 1, 1]^T, [0, 0, 1]^T]
(Goodfellow 2017)
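A minimal sketch of Figure 15.7's idea: each binary feature h_i is the sign of a linear function of x, so d features carve the input space into regions uniquely identified by the code vector h. The particular hyperplanes below are illustrative.

import numpy as np

W = np.array([[1.0, 0.0],      # h_1: which side of a vertical line
              [0.0, 1.0],      # h_2: which side of a horizontal line
              [1.0, 1.0]])     # h_3: which side of a diagonal line
b = np.array([0.0, 0.0, -1.0])

def code(x):
    return (W @ x + b > 0).astype(int)   # one bit per hyperplane

print(code(np.array([2.0, 0.5])))        # [1 1 1]: a region no other code shares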
Nearest Neighbor Divides Space into One Region Per Centroid
Figure 15.8 (Goodfellow 2017)
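A minimal sketch contrasting Figure 15.8 with the previous slide: a purely local, non-distributed representation assigns each x to its nearest centroid, so n centroids yield only n distinguishable regions rather than exponentially many. The centroids are illustrative.

import numpy as np

centroids = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 3.0]])

def region(x):
    return np.linalg.norm(centroids - x, axis=1).argmin()

print(region(np.array([2.5, 0.4])))   # index of the single centroid that owns x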
GANs learn vector spaces that support semantic arithmetic
[Figure 15.9: image arithmetic on generated faces, of the form result = a − b + c]
(Goodfellow 2017)
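A minimal sketch (with a hypothetical decoder, not a real trained GAN) of the vector arithmetic in Figure 15.9: given a generator G mapping latent codes to images, arithmetic on the latent codes can compose semantic attributes.

import numpy as np

rng = np.random.default_rng(0)
W_g = rng.normal(size=(100, 64))  # placeholder for a trained generator's weights

def G(z):                          # stands in for a trained generator network
    return np.tanh(z @ W_g)

z_a, z_b, z_c = rng.normal(size=(3, 100))   # latent codes for three concepts
z_new = z_a - z_b + z_c                     # compose attributes by arithmetic
image = G(z_new)                            # decode the composed concept
print(image.shape)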