S3VAE: Self-Supervised Sequential VAE for Representation Disentanglement and Data Generation


  1. S3VAE: Self-Supervised Sequential VAE for Representation Disentanglement and Data Generation

  2. Disentangled Representation Learning: Framework
     ● Encoder
     ● Decoder
     ● LSTM in the latent space
     VAE Objectives: [equations not preserved in the extracted text]

  3. Self-Supervised Signal (1): Static Consistency Constraint
     ● To encourage the appearance representation to exclude any dynamic information.
     ● Triplet Loss
     Figure labels: Anchor, Shuffle temporal order, Positive, Negative
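
The static consistency constraint can be sketched as a standard triplet margin loss. This is a minimal sketch under assumptions the slide does not spell out: the anchor is taken to be the appearance representation of the original clip, the positive the appearance representation of the same clip with its frames shuffled in time, and the negative the appearance representation of a different clip. The Euclidean distance, the margin of 1.0, and all function and variable names are hypothetical choices for illustration.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss: max(0, d(a, p) - d(a, n) + margin)."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

# Hypothetical appearance vectors (illustrative values, not from the slides).
z_anchor = [0.0, 0.0]    # original clip
z_positive = [0.0, 0.5]  # same clip, frames shuffled in time (assumed)
z_negative = [3.0, 4.0]  # a different clip (assumed)

# The negative is far from the anchor, so the loss is zero.
loss_good = triplet_loss(z_anchor, z_positive, z_negative)
# With the roles swapped the anchor sits closer to the "negative", which is penalized.
loss_bad = triplet_loss(z_anchor, z_negative, z_positive)
```

If shuffling the frames leaves the appearance vector nearly unchanged, the anchor-to-positive distance stays small and the loss vanishes, which is the behaviour the constraint is meant to encourage.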

  4. Self-Supervised Signal (2): Dynamic Factor Prediction
     ● To encourage the motion representation to carry adequate and correct time-dependent information at each timestep
     ● Optical flow provides the location of motion
       ○ Grid the optical flow map with indices
       Figure: The input frame and optical flow
     ● Landmarks provide the subtle motion of facial expressions
       ○ Distances between the upper and lower eyelids and distances between the lips
       Figure: The three distances on faces
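
The two kinds of pseudo-label described on this slide can be sketched as follows. This is a rough illustration, not the presenters' implementation: it assumes a 3 x 3 grid, assumes the label is the row-major index of the cell with the largest mean flow magnitude, and assumes the three facial distances are one eyelid gap per eye plus one lip gap. The function names and the toy inputs are hypothetical.

```python
import math

def flow_grid_label(flow_mag, rows=3, cols=3):
    """Row-major index of the grid cell with the largest mean flow magnitude.

    flow_mag is a 2D list (H x W) of non-negative magnitudes; H and W are
    assumed to be divisible by rows and cols.
    """
    h, w = len(flow_mag), len(flow_mag[0])
    ch, cw = h // rows, w // cols
    best_idx, best_mean = 0, -1.0
    for r in range(rows):
        for c in range(cols):
            cell = [flow_mag[y][x]
                    for y in range(r * ch, (r + 1) * ch)
                    for x in range(c * cw, (c + 1) * cw)]
            mean = sum(cell) / len(cell)
            if mean > best_mean:
                best_idx, best_mean = r * cols + c, mean
    return best_idx

def landmark_distances(left_eye, right_eye, lips):
    """Three distances; each argument is an (upper, lower) pair of (x, y) points."""
    return tuple(math.dist(upper, lower) for upper, lower in (left_eye, right_eye, lips))

# Toy 6 x 6 magnitude map with motion concentrated in the bottom-right cell.
flow = [[0.0] * 6 for _ in range(6)]
flow[4][4] = 2.0
flow[5][5] = 1.0
label = flow_grid_label(flow)

# Toy landmark coordinates (illustrative values only).
dists = landmark_distances(((10, 20), (10, 23)),
                           ((30, 20), (30, 24)),
                           ((20, 40), (20, 45)))
```

In a setup like this, the motion representation at each timestep would be trained to predict `label` (a classification target) or `dists` (a regression target), so no manual annotation is needed.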

  5. Self-Supervised Signal (3): Mutual Information
     ● To encourage the information in the appearance representation and the motion representation to be mutually exclusive.
     ● To minimize the mutual information between the two representations
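
To make the quantity being minimized concrete, the sketch below computes mutual information for two discrete variables from a joint probability table. It only illustrates the definition; the slide does not say how mutual information is estimated for continuous latent variables, and the function name and example tables are hypothetical.

```python
import math

def mutual_information(joint):
    """I(X; Y) in nats for a joint probability table joint[x][y]."""
    px = [sum(row) for row in joint]          # marginal of X
    py = [sum(col) for col in zip(*joint)]    # marginal of Y
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0:  # zero-probability cells contribute nothing
                mi += p * math.log(p / (px[i] * py[j]))
    return mi

# Independent variables: knowing one says nothing about the other.
independent = [[0.25, 0.25], [0.25, 0.25]]
# Fully dependent variables: one determines the other.
dependent = [[0.5, 0.0], [0.0, 0.5]]

mi_independent = mutual_information(independent)
mi_dependent = mutual_information(dependent)
```

The independent table gives zero mutual information and the fully dependent one gives log 2 nats, so driving this quantity toward zero pushes the two representations toward carrying non-overlapping information.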

  6. Experiments: Representation Swapping
     ● Swap the appearance and motion representations of two given videos
     Figure panels: Video A, Video B
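
The swapping experiment can be sketched at the level of latent codes. This is a minimal sketch that assumes each video is encoded as one time-invariant appearance vector plus one motion vector per timestep; the `Encoded` container, the function name, and the toy values are hypothetical, and the encoder and decoder themselves are omitted.

```python
from typing import List, NamedTuple

class Encoded(NamedTuple):
    appearance: List[float]    # one time-invariant vector per video (assumed)
    motion: List[List[float]]  # one vector per timestep (assumed)

def swap_representations(a, b):
    """Return A's appearance paired with B's motion, and B's appearance with A's motion."""
    return Encoded(a.appearance, b.motion), Encoded(b.appearance, a.motion)

# Toy latent codes for two videos (illustrative values only).
video_a = Encoded(appearance=[1.0], motion=[[0.1], [0.2]])
video_b = Encoded(appearance=[2.0], motion=[[0.9], [0.8]])

a_with_b_motion, b_with_a_motion = swap_representations(video_a, video_b)
```

Decoding `a_with_b_motion` would then be expected to show the subject of Video A performing the motion of Video B, provided the two representations are well disentangled.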

  7. Experiments: Representation Swapping
     Figure panels: Real Video, Synthesized Video

  8. Experiments: Manipulating video generation (Dsprite)
     Figure panels: Fix appearance representation; Fix motion representation

  9. Experiments: Manipulating video generation (MUG)
     Figure panels: Fix appearance representation; Fix motion representation

  10. Experiments: Quantitative performance comparison
     ● Baseline: our sequential VAE without self-supervision
     ● Baseline-sv: our sequential VAE with supervision from ground-truth labels
     ● Full model: our sequential VAE with self-supervision
