Dense Optical Flow Prediction from a Static Image Jacob Walker, Abhinav Gupta, and Martial Hebert Aysun Koçak
Introduction Static images contain action and motion information. There are several approaches • The common one is agent-centric: Activity forecasting, ECCV, 2012; Patch to the future: Unsupervised visual prediction, CVPR, 2014
Introduction Two disadvantages • motion is modeled as a trajectory • shown to work only in restricted domains This paper proposes a generalized framework • single or multiple agents • indoor or outdoor environments
Related Work Non-parametric methods • data-driven • do not make any assumptions about the underlying scene A data-driven approach for event prediction, ECCV, 2010
Related Work Parametric methods • domain-specific approaches • assumptions about which elements are active A hierarchical representation for future action prediction, ECCV 2014
Related Work Hybrid methods Patch to the Future: Unsupervised Visual Prediction, CVPR 2014
The Proposed Method • Predicts the motion of each and every pixel in terms of optical flow • CNN model for motion prediction • Agent-free • Makes almost no assumptions about the underlying scene • Also makes long-range predictions
The Proposed Method Learn a mapping between the input RGB image and the output space
Training Framework 1. Extract Optical Flow from Video Frames • UCF-101 is an action recognition data set consisting of 13320 videos from 101 action categories • HMDB-51 has 6849 videos from 51 action categories • Model trained on over 350,000 frames from UCF-101 and over 150,000 frames from HMDB-51 • Labelled with DeepFlow • Data augmentation
Training Framework 2. Assign Optical Flow Vectors to Clusters Regression as Classification • motion estimation can be posed as a regression problem • but it has a drawback • so reformulate as classification: quantize optical flow vectors into 40 clusters by k-means
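The quantization step can be illustrated with a minimal sketch in plain Python. It runs Lloyd's k-means on a handful of 2D flow vectors and then assigns each vector the index of its nearest cluster center, which becomes the class label. The function names, the toy data, the farthest-first initialization, and k = 3 are assumptions made for illustration; the slides only state that 40 clusters are found by k-means.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two 2D flow vectors."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def nearest(centers, vec):
    """Index of the cluster center closest to vec (its class label)."""
    return min(range(len(centers)), key=lambda j: sq_dist(centers[j], vec))


def kmeans_flow(vectors, k, iters=20):
    """Lloyd's k-means on 2D vectors.

    Initialization is farthest-first (an assumption of this sketch,
    chosen so the toy example is deterministic).
    """
    centers = [vectors[0]]
    while len(centers) < k:
        centers.append(
            max(vectors, key=lambda v: min(sq_dist(v, c) for c in centers))
        )
    for _ in range(iters):
        # Accumulate [sum_u, sum_v, count] for each cluster.
        sums = [[0.0, 0.0, 0] for _ in range(k)]
        for vec in vectors:
            j = nearest(centers, vec)
            sums[j][0] += vec[0]
            sums[j][1] += vec[1]
            sums[j][2] += 1
        # Move each center to the mean of its members; keep empty ones.
        centers = [
            (su / n, sv / n) if n else centers[j]
            for j, (su, sv, n) in enumerate(sums)
        ]
    return centers


# Toy flow vectors (u, v): rightward, upward, and leftward motion.
flows = [
    (1.0, 0.0), (1.1, 0.1), (0.9, -0.1),
    (0.0, 1.0), (0.1, 1.1), (-0.1, 0.9),
    (-1.0, 0.0), (-1.1, 0.1), (-0.9, -0.1),
]
centers = kmeans_flow(flows, k=3)
labels = [nearest(centers, f) for f in flows]  # one class per vector
```

At prediction time a class index can be mapped back to a flow vector by looking up `centers[label]`, so the network output stays a dense flow field even though training is posed as classification.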
Training Framework 3. Train Convolutional Neural Network for a Pixel Classification Problem Loss function
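The loss formula from this slide is not preserved in the text. For a pixel classification problem, a standard choice is the softmax cross-entropy averaged over all pixels, and the sketch below assumes that form; it is not confirmed to be the exact loss used in the paper. The names `score_map` and `label_map` and the toy inputs are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over one pixel's cluster scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pixel_cross_entropy(score_map, label_map):
    """Mean negative log-likelihood of the true cluster over all pixels.

    score_map[i][j] is a list of per-cluster scores for pixel (i, j);
    label_map[i][j] is the index of that pixel's ground-truth cluster.
    """
    total, count = 0.0, 0
    for row_scores, row_labels in zip(score_map, label_map):
        for scores, label in zip(row_scores, row_labels):
            total -= math.log(softmax(scores)[label])
            count += 1
    return total / count


# A 1x2 "image" with 4 clusters (the slides use 40).
label_map = [[1, 3]]
uniform = [[[0.0] * 4, [0.0] * 4]]           # no preference
confident = [[[0.0, 10.0, 0.0, 0.0],         # strongly favors cluster 1
              [0.0, 0.0, 0.0, 10.0]]]        # strongly favors cluster 3
loss_uniform = pixel_cross_entropy(uniform, label_map)
loss_confident = pixel_cross_entropy(confident, label_map)
```

With uniform scores the loss equals the log of the number of clusters, and it falls toward zero as the scores concentrate on the correct cluster.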
Experiments Test on • UCF-101 • HMDB-51 • KTH, which contains 600 videos from 6 actions • 3-fold cross-validation
Experiments Metrics • Direction similarity • Orientation similarity • End-Point-Error
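The three metrics can be sketched as follows. End-Point-Error is the usual mean Euclidean distance between predicted and ground-truth flow vectors. The definitions of the two similarity scores are one plausible reading, assumed here rather than taken from the slides: direction similarity as the mean cosine between vectors, and orientation similarity as the mean absolute cosine, so that parallel vectors pointing in opposite directions still count as aligned.

```python
import math


def end_point_error(pred, truth):
    """Mean Euclidean distance between predicted and true flow vectors."""
    dists = [math.hypot(p[0] - t[0], p[1] - t[1]) for p, t in zip(pred, truth)]
    return sum(dists) / len(dists)


def cosine(p, t):
    """Cosine of the angle between two 2D vectors (0.0 if either is zero)."""
    norm = math.hypot(p[0], p[1]) * math.hypot(t[0], t[1])
    return (p[0] * t[0] + p[1] * t[1]) / norm if norm else 0.0


def direction_similarity(pred, truth):
    """Mean cosine: opposite directions are penalized (assumed definition)."""
    return sum(cosine(p, t) for p, t in zip(pred, truth)) / len(pred)


def orientation_similarity(pred, truth):
    """Mean absolute cosine: opposite directions count as aligned (assumed)."""
    return sum(abs(cosine(p, t)) for p, t in zip(pred, truth)) / len(pred)


# Two pixels: the first prediction points the wrong way along the right
# axis, the second has the right direction but twice the magnitude.
pred = [(1.0, 0.0), (0.0, 2.0)]
truth = [(-1.0, 0.0), (0.0, 1.0)]
epe = end_point_error(pred, truth)
direction = direction_similarity(pred, truth)
orientation = orientation_similarity(pred, truth)
```

In this toy case the orientation score is perfect while the direction score is zero, which shows why reporting both, together with End-Point-Error for magnitude, gives a fuller picture than any single number.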
Experiments Metrics • Top 5 • Top 10
Experiments UCF Dataset
Experiments HMDB Dataset
Experiments
Multi-Frame Prediction
Multi-Frame Prediction Long-term Recurrent Convolutional Networks for Visual Recognition and Description
Thank You