Autoencoders
Prof. Leal-Taixé and Prof. Niessner
Machine learning
Supervised learning:
• Labels or target classes
• Goal: learn a mapping from input to label
• Classification, regression
Machine learning
[Figure: cat/dog images with their labels, illustrating supervised learning]
Machine learning
Unsupervised learning:
• No label or target class
• Find out properties of the structure of the data
• Clustering (k-means), dimensionality reduction (PCA)
Unsupervised learning with autoencoders
Autoencoders
• Unsupervised approach for learning a lower-dimensional feature representation from unlabeled training data
Autoencoders
• From an input image x to a feature representation z (bottleneck layer)
• Encoder: a CNN in our case
[Figure: input image passed through convolutional layers down to the latent code z]
Autoencoders
• Why do we need this dimensionality reduction?
• To capture the patterns, i.e., the most meaningful factors of variation in our data
• Other dimensionality reduction methods?
Autoencoder: training
• Reconstruction loss (e.g., L1, L2)
[Figure: input image → conv encoder → transposed-conv decoder → output image]
Autoencoder: training
• Input x → latent space z → reconstruction x', with dim(z) < dim(x)
[Figure: input images and their reconstructed counterparts]
Autoencoder: training
• No labels required
• We can use unlabeled data to first get its structure
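A minimal sketch of such a convolutional autoencoder and one training step in PyTorch. The layer sizes, the 28x28 single-channel input, and the choice of an L2 reconstruction loss are illustrative assumptions, not the exact architecture from the slides:

```python
import torch
import torch.nn as nn

class ConvAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        # Encoder: convolutions with stride 2 compress the image into a bottleneck z
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1),   # 28x28 -> 14x14
            nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1),  # 14x14 -> 7x7
            nn.ReLU(),
        )
        # Decoder: transposed convolutions upsample z back to the input resolution
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 3, stride=2, padding=1, output_padding=1),  # 7x7 -> 14x14
            nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 3, stride=2, padding=1, output_padding=1),   # 14x14 -> 28x28
            nn.Sigmoid(),
        )

    def forward(self, x):
        z = self.encoder(x)       # bottleneck, dim(z) < dim(x)
        x_hat = self.decoder(z)   # reconstruction x'
        return x_hat

model = ConvAutoencoder()
criterion = nn.MSELoss()          # L2 reconstruction loss; L1 would also work
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

x = torch.rand(8, 1, 28, 28)      # a batch of unlabeled images
x_hat = model(x)
loss = criterion(x_hat, x)        # no labels needed: the input is its own target
loss.backward()
optimizer.step()
```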
Autoencoder: use cases
[Figure: embedding of MNIST digits in the learned latent space]
Autoencoder for pre-training
• Test case: medical applications based on CT images
  – Large set of unlabeled data
  – Small set of labeled data
• What we cannot do: take a network pre-trained on ImageNet. Why?
• The image features are different: CT vs. natural images
Autoencoder for pre-training
• Test case: medical applications based on CT images
  – Large set of unlabeled data
  – Small set of labeled data
• What we can do: pre-train our network with an autoencoder to “learn” the type of features present in CT images
Autoencoder for pre-training
• Step 1: Unsupervised training with autoencoders
[Figure: input → encoder → decoder → reconstruction]
Autoencoder for pre-training
• Step 2: Supervised training with the labeled data
• Throw away the decoder
Autoencoder for pre-training
• Step 2: Supervised training with the labeled data
• Input x → encoder → z → prediction y; loss against the ground-truth labels y*; backprop as always
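A sketch of this two-step recipe, assuming the ConvAutoencoder from the earlier sketch has already been trained on the unlabeled data; the classification head, its dimensions, and the number of classes are illustrative assumptions:

```python
import torch
import torch.nn as nn

# Assume `model` is the ConvAutoencoder trained above (illustrative assumption)
pretrained_encoder = model.encoder            # keep the encoder, throw away the decoder

class Classifier(nn.Module):
    def __init__(self, encoder, num_classes):
        super().__init__()
        self.encoder = encoder                # pre-trained without labels
        self.head = nn.Sequential(            # new, randomly initialized head
            nn.Flatten(),
            nn.Linear(32 * 7 * 7, num_classes),
        )

    def forward(self, x):
        z = self.encoder(x)                   # features learned from unlabeled data
        return self.head(z)                   # predicted class scores y

clf = Classifier(pretrained_encoder, num_classes=10)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(clf.parameters(), lr=1e-4)

x = torch.rand(8, 1, 28, 28)                  # small labeled set
y_star = torch.randint(0, 10, (8,))           # ground-truth labels y*
loss = criterion(clf(x), y_star)              # supervised loss
loss.backward()                               # backprop as always (fine-tunes encoder + head)
optimizer.step()
```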
Why use autoencoders?
• Use 1: pre-training, as mentioned before
  – Image → same image reconstructed
  – Use the encoder as “feature extractor”
• Use 2: pixel-wise predictions
  – Image → semantic segmentation
  – Low-resolution image → high-resolution image
  – Image → depth map
Autoencoders for pixel-wise predictions
Semantic Segmentation (FCN)
• Recall Fully Convolutional Networks. Can we do better?
[Long et al. 2015] Fully Convolutional Networks for Semantic Segmentation (FCN)
SegNet
Badrinarayanan et al. “SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation”. TPAMI 2016
SegNet
• Encoder: normal convolutional filters + pooling
• Decoder: upsampling + convolutional filters
Badrinarayanan et al. “SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation”. TPAMI 2016
SegNet
• Encoder: normal convolutional filters + pooling
• Decoder: upsampling + convolutional filters
• The convolutional filters in the decoder are learned via backprop; their goal is to refine the upsampling
Badrinarayanan et al. “SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation”. TPAMI 2016
Recall transposed convolution
• Transposed convolution (input 3x3 → output 5x5)
  – Unpooling
  – Convolution filter (learned)
  – Also called up-convolution (never deconvolution)
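A small sketch of a learned up-convolution in PyTorch; the kernel size, stride, and padding below are one assumed parameterization that maps a 3x3 input to a 5x5 output:

```python
import torch
import torch.nn as nn

# output = (input - 1) * stride - 2 * padding + kernel_size = (3 - 1) * 2 - 2 + 3 = 5
up_conv = nn.ConvTranspose2d(in_channels=1, out_channels=1,
                             kernel_size=3, stride=2, padding=1)

x = torch.rand(1, 1, 3, 3)   # 3x3 input feature map
y = up_conv(x)               # learned upsampling
print(y.shape)               # torch.Size([1, 1, 5, 5])
```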
SegNet
• Encoder: normal convolutional filters + pooling
• Decoder: upsampling + convolutional filters
• Softmax layer: the output of the soft-max classifier is a K-channel image of probabilities, where K is the number of classes
Badrinarayanan et al. “SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation”. TPAMI 2016
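A sketch of the per-pixel softmax output; the number of classes and the image resolution are assumed values, not taken from the paper:

```python
import torch
import torch.nn as nn

K = 12                                  # assumed number of classes
logits = torch.randn(1, K, 360, 480)    # decoder output: one score map per class
probs = torch.softmax(logits, dim=1)    # K-channel image of per-pixel probabilities
pred = probs.argmax(dim=1)              # per-pixel class label, shape (1, 360, 480)

# Training uses a per-pixel cross-entropy loss against the ground-truth label map
labels = torch.randint(0, K, (1, 360, 480))
loss = nn.CrossEntropyLoss()(logits, labels)
```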
Upsampling
Types of upsampling
• 1. Interpolation
Types of upsampling
• 1. Interpolation
[Figure: original image upsampled with nearest-neighbor, bilinear, and bicubic interpolation. Image: Michael Guerzhoy]
Types of upsampling
• 1. Interpolation: few artifacts
[Image: Michael Guerzhoy]
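A sketch of fixed, non-learned interpolation-based upsampling with `torch.nn.functional.interpolate`; shapes and the scale factor are illustrative:

```python
import torch
import torch.nn.functional as F

x = torch.rand(1, 64, 16, 16)          # low-resolution feature map
# Fixed interpolation; mode can be 'nearest', 'bilinear', or 'bicubic'
y = F.interpolate(x, scale_factor=2, mode='bilinear', align_corners=False)
print(y.shape)                         # torch.Size([1, 64, 32, 32])
```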
Types of upsampling
• 2. Fixed unpooling + convolutions: efficient
A. Dosovitskiy et al. “Learning to Generate Chairs, Tables and Cars with Convolutional Networks”. TPAMI 2017
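A sketch of this pattern: fixed nearest-neighbor unpooling followed by learned convolutions that refine the result; channel counts are assumptions:

```python
import torch
import torch.nn as nn

upsample_block = nn.Sequential(
    nn.Upsample(scale_factor=2, mode='nearest'),   # fixed unpooling: copy each value into a 2x2 block
    nn.Conv2d(64, 64, kernel_size=3, padding=1),   # learned filters refine the upsampled map
    nn.ReLU(),
)

x = torch.rand(1, 64, 16, 16)
y = upsample_block(x)
print(y.shape)   # torch.Size([1, 64, 32, 32])
```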
Types of upsampling
• 3. Unpooling “à la DeconvNet”: keep the locations where the max came from
Types of upsampling
• 3. Unpooling “à la DeconvNet”
• Now the convolutional filters are LEARNED
• In DeconvNet: we convolve with the transpose of the learned filter
Types of upsampling
• 3. Unpooling “à la DeconvNet”: keeps the details of the structures
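A sketch of max-unpooling with remembered pooling locations, using PyTorch's `MaxPool2d(return_indices=True)` and `MaxUnpool2d`; shapes are illustrative:

```python
import torch
import torch.nn as nn

# Remember where each max came from during pooling, then place values
# back at exactly those locations when upsampling
pool = nn.MaxPool2d(kernel_size=2, stride=2, return_indices=True)
unpool = nn.MaxUnpool2d(kernel_size=2, stride=2)

x = torch.rand(1, 64, 32, 32)
pooled, indices = pool(x)            # encoder side: 32x32 -> 16x16, keep argmax locations
restored = unpool(pooled, indices)   # decoder side: 16x16 -> 32x32, sparse map
print(restored.shape)                # torch.Size([1, 64, 32, 32])
```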
U-Net, or skip connections in autoencoders
Skip connections
• U-Net: skip connections pass the low-level information from the encoder directly to the decoder, in addition to the high-level information coming through the bottleneck (recall ResNet)
O. Ronneberger et al. “U-Net: Convolutional Networks for Biomedical Image Segmentation”. MICCAI 2015
Skip connections
• U-Net, zoomed in: encoder feature maps are appended (concatenated) to the corresponding decoder feature maps
O. Ronneberger et al. “U-Net: Convolutional Networks for Biomedical Image Segmentation”. MICCAI 2015
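A sketch of one U-Net-style decoder step that concatenates encoder features with upsampled decoder features; the block structure and channel counts are assumptions, not the exact U-Net configuration:

```python
import torch
import torch.nn as nn

class UpBlock(nn.Module):
    def __init__(self, in_ch, skip_ch, out_ch):
        super().__init__()
        self.up = nn.ConvTranspose2d(in_ch, out_ch, kernel_size=2, stride=2)
        self.conv = nn.Conv2d(out_ch + skip_ch, out_ch, kernel_size=3, padding=1)

    def forward(self, x, skip):
        x = self.up(x)                   # upsample the high-level decoder features
        x = torch.cat([x, skip], dim=1)  # append the low-level encoder features
        return torch.relu(self.conv(x))  # fuse them with a learned convolution

block = UpBlock(in_ch=128, skip_ch=64, out_ch=64)
decoder_feat = torch.rand(1, 128, 16, 16)   # from the bottleneck / previous decoder stage
encoder_feat = torch.rand(1, 64, 32, 32)    # saved from the matching encoder stage
out = block(decoder_feat, encoder_feat)     # shape (1, 64, 32, 32)
```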
Skip connections
• Concatenation connections
C. Hazirbas et al. “Deep depth from focus”. ACCV 2018
Skip connections
• Widely used in autoencoders
• At which levels skip connections are needed depends on your problem
Autoencoders in Vision
SegNet
Badrinarayanan et al. “SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation”. TPAMI 2016
SegNet
[Figure: qualitative results showing input, ground truth, and SegNet prediction]
Badrinarayanan et al. “SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation”. TPAMI 2016
Monocular depth
• Unsupervised monocular depth estimation
R. Garg et al. “Unsupervised CNN for Single View Depth Estimation: Geometry to the Rescue”. ECCV 2016