Gate-Shift Networks for Video Action Recognition


Swathikiran Sudhakaran (1), Sergio Escalera (2,3), Oswald Lanz (1)
(1) Fondazione Bruno Kessler, Italy; (2) Computer Vision Center, Spain; (3) Universitat de Barcelona, Spain


  1. Gate-Shift Networks for Video Action Recognition. Swathikiran Sudhakaran (1), Sergio Escalera (2,3), Oswald Lanz (1). (1) Fondazione Bruno Kessler, Italy; (2) Computer Vision Center, Spain; (3) Universitat de Barcelona, Spain.

  2. Motivation: Video action recognition requires spatio-temporal reasoning. Examples: "Putting something similar to other things that are already on the table" vs. "Taking one of many similar things on the table".

  3. Contribution: The large number of parameters in 3D CNNs requires large-scale annotated data for training. Existing approaches address this problem with a hard-wired decomposition of the 3D kernels, which is suboptimal. GSM leverages spatial gating for adaptive feature propagation. [Figure: kernel structure of S3D / C3D, GSM, TSM, and R(2+1)D, drawn over the T, H×W, and C axes.]

  4. [Figure: kernel structure of S3D / C3D, GSM, TSM, and R(2+1)D, drawn over the T, H×W, and C axes.]

  5. GSM develops a flexible, data-dependent decomposition of 3D kernels with fewer parameters and less computational overhead. [Figure: S3D / C3D, GSM, TSM, and R(2+1)D.]

  6. Effectiveness of GSM (ablation study on Sth-V1): TSN, 10.45M parameters and 16.37G FLOPs; Gate-Shift TSN, 10.50M parameters and 16.46G FLOPs; +29% accuracy. [Figure: TSN vs. Gate-Shift TSN on the classes "Putting sth similar to other things that are already on the table" and "Unfolding sth".]

  7. State-of-the-art recognition accuracy of 55% on Something Something-V1

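The gating-and-shifting idea from slides 3 and 5 can be illustrated with a toy sketch. This is not the authors' implementation: the slides do not specify how the gate is computed, so the gate logits are passed in directly as a hypothetical input (in a real network they would be produced by a learned layer). The `tanh` gate, the split of channels into two halves, and the choice of shift direction per half are likewise assumptions made for illustration.

```python
import math


def gate_shift(x, gate_logits):
    """Toy gated temporal shift (illustrative sketch only).

    x[t][c][p]           : features for T frames, C channels, P spatial positions.
    gate_logits[t][g][p] : hypothetical gating logit per frame, channel group
                           (g = 0 or 1), and spatial position.
    Returns a new nested list with the same shape as x.
    """
    T, C, P = len(x), len(x[0]), len(x[0][0])
    half = C // 2
    out = [[[0.0] * P for _ in range(C)] for _ in range(T)]
    for t in range(T):
        for c in range(C):
            group = 0 if c < half else 1
            # Assumption: group 0 is shifted forward in time, group 1 backward.
            step = 1 if group == 0 else -1
            for p in range(P):
                gate = math.tanh(gate_logits[t][group][p])
                gated = gate * x[t][c][p]
                # The ungated remainder stays in its own frame.
                out[t][c][p] += x[t][c][p] - gated
                # The gated part moves to the neighbouring frame;
                # anything shifted past the clip boundary is dropped.
                dst = t + step
                if 0 <= dst < T:
                    out[dst][c][p] += gated
    return out


# Three frames, two channels, one spatial position.
x = [[[1.0], [10.0]], [[2.0], [20.0]], [[3.0], [30.0]]]
zero_logits = [[[0.0], [0.0]] for _ in range(3)]
large_logits = [[[20.0], [20.0]] for _ in range(3)]

closed = gate_shift(x, zero_logits)    # gate closed everywhere
opened = gate_shift(x, large_logits)   # gate saturated everywhere
```

With all logits at zero the gate is closed and the features pass through unchanged, which corresponds to a network with no temporal mixing. With large positive logits the gate saturates and the result approaches a plain temporal shift of each channel group. Because the logits vary per spatial position, intermediate values let each location decide how much of its features to propagate in time.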
