Learning Optical Flow with Limited Data
Jia Xu, Tencent AI Lab
2019-03-14
Introduction
[Figure panels: Input, Output]
Dense correspondence for each pixel between two frames
Why Optical Flow?
- Optical flow has a wide range of applications:
  - Autonomous Driving
  - 3D Shape Reconstruction
  - Object Tracking
  - Video Action Recognition
History of Optical Flow Estimation
[Timeline figure; method names and years not recoverable from the extraction]
DC Flow Xu, Ranftl, Koltun. Accurate Optical Flow via Direct Cost Volume Processing. CVPR 2017
CNNs for Optical Flow
- Advantage: high performance while running in real time.
- Disadvantage: needs a large amount of labeled data, which is difficult to obtain.
[Figure panels: FlowNet, PWC-Net]
Fischer et al. 2015, "FlowNet: Learning Optical Flow with Convolutional Networks"
Sun et al. 2018, "PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume"
CNNs for Optical Flow
- Advantage: high performance while running in real time.
- Disadvantage: needs a large amount of labeled data, which is difficult to obtain.
  - Pre-training on a synthetic dataset: domain gap.
  - Unsupervised learning: performance gap, cannot predict the flow of occluded pixels.
Meister et al. 2018, "UnFlow: Unsupervised Learning of Optical Flow with a Bidirectional Census Loss"
Unsupervised Learning for Optical Flow
How to learn optical flow of occluded pixels in a totally unsupervised way?
Key Observation
- Unsupervised learning: detect occlusion and exclude occluded pixels.
  - The optical flow of non-occluded pixels can be accurately estimated.
  - How do we fully utilize those reliable non-occluded predictions?
  - Data Distillation!
[Figure panels: Non-Occluded, Occluded]
Liu, King, Lyu, Xu. DDFlow: Learning Optical Flow with Unlabeled Data Distillation. AAAI 2019
Framework
- The teacher model is trained with photometric loss $L_p$ for non-occluded pixels.
[Figure: teacher pipeline. Images 1 and 2 -> Teacher Model -> forward and backward flow -> forward-backward consistency check -> forward and backward occlusion maps; image warp -> warped images; photometric loss. The student model is shown but not yet expanded.]
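The masked photometric loss on this slide can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the function names `warp_bilinear` and `masked_photometric_loss`, the single-channel images, the coordinate clipping at the border, and the robust penalty `(|x| + eps) ** q` with `eps=0.01`, `q=0.4` are all assumptions made for the example.

```python
import numpy as np

def warp_bilinear(img, flow):
    """Sample a 2-D image at (x + u, y + v) with bilinear interpolation.

    img:  (H, W) array, the second frame.
    flow: (H, W, 2) array, flow[..., 0] = u (x), flow[..., 1] = v (y).
    Coordinates are clipped to the image border (an assumption of this sketch).
    """
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    x = np.clip(xs + flow[..., 0], 0, w - 1)
    y = np.clip(ys + flow[..., 1], 0, h - 1)
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    wx = x - x0
    wy = y - y0
    top = (1 - wx) * img[y0, x0] + wx * img[y0, x1]
    bottom = (1 - wx) * img[y1, x0] + wx * img[y1, x1]
    return (1 - wy) * top + wy * bottom

def masked_photometric_loss(img1, img2, flow_fw, occ_fw, eps=0.01, q=0.4):
    """Robust difference between img1 and warped img2, averaged over
    non-occluded pixels only (occ_fw == 1 marks occluded pixels)."""
    warped = warp_bilinear(img2, flow_fw)
    non_occluded = 1.0 - occ_fw
    penalty = (np.abs(img1 - warped) + eps) ** q
    return (penalty * non_occluded).sum() / max(non_occluded.sum(), 1.0)
```

Because occluded pixels are multiplied by zero before averaging, a large intensity mismatch at an occluded pixel does not affect the loss, which is the behaviour the slide describes for the teacher.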
Framework
- The student model has the same network structure as the teacher model.
[Figure: teacher and student pipelines. Each takes an image pair, estimates forward and backward flow, runs the forward-backward consistency check to obtain occlusion maps, warps the images, and applies the photometric loss.]
Framework
- The student model is trained with both $L_p$ for non-occluded pixels and $L_o$ for occluded pixels. Only the student model is needed during testing.
- $L_o$ only functions on pixels that are non-occluded in the original images but occluded in the cropped patches.
[Figure: teacher and student pipelines. The teacher's flow and occlusion maps are cropped; together with the student's occlusion maps they yield forward and backward valid masks, which gate the loss for occluded pixels between teacher and student flow.]
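The valid mask and the loss for occluded pixels can be sketched as below. This is an illustrative NumPy version under stated assumptions: the names `crop` and `distillation_loss` are hypothetical, the crop is a square window, the mask is taken as `clip(student_occ - cropped_teacher_occ, 0, 1)`, and the robust penalty with `eps=0.01`, `q=0.4` is an assumed choice.

```python
import numpy as np

def crop(x, top, left, size):
    """Cut a size x size window out of an (H, W, ...) array."""
    return x[top:top + size, left:left + size]

def distillation_loss(teacher_flow, teacher_occ, student_flow, student_occ,
                      top, left, size, eps=0.01, q=0.4):
    """Penalize student flow against cropped teacher flow, but only where the
    pixel is occluded for the student (cropped patch) and non-occluded for
    the teacher (original image).

    teacher_flow: (H, W, 2), teacher_occ: (H, W) with 1 = occluded.
    student_flow: (size, size, 2), student_occ: (size, size).
    Returns (loss, valid_mask).
    """
    t_flow = crop(teacher_flow, top, left, size)
    t_occ = crop(teacher_occ, top, left, size)
    valid = np.clip(student_occ - t_occ, 0.0, 1.0)
    diff = np.abs(t_flow - student_flow).sum(axis=-1)
    penalty = (diff + eps) ** q
    loss = (penalty * valid).sum() / max(valid.sum(), 1.0)
    return loss, valid
```

Pixels that the teacher itself marks as occluded are excluded, since the teacher's prediction there is not considered reliable.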
Loss Functions
- Occlusion estimation: based on the forward-backward consistency prior.
- Photometric loss $L_p$.
- Loss for occluded pixels $L_o$.
- Teacher model: $L = L_p$.
- Student model: $L = L_p + L_o$.
No hyperparameter!
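A forward-backward consistency check for occlusion estimation can be sketched as follows. The slide does not give the exact criterion, so this example assumes the commonly used form in which a pixel is marked occluded when the forward flow and the backward flow sampled at the target location fail to cancel; the thresholds `alpha1=0.01` and `alpha2=0.5`, the nearest-neighbour sampling, and the function names are assumptions of this sketch.

```python
import numpy as np

def sample_nearest(field, flow):
    """Read field at the pixel each source pixel lands on under flow.

    Uses nearest-neighbour lookup with border clipping for brevity
    (an assumption; bilinear sampling is a common alternative).
    """
    h, w = flow.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    x = np.clip(np.rint(xs + flow[..., 0]).astype(int), 0, w - 1)
    y = np.clip(np.rint(ys + flow[..., 1]).astype(int), 0, h - 1)
    return field[y, x]

def occlusion_mask(flow_fw, flow_bw, alpha1=0.01, alpha2=0.5):
    """Return an (H, W) map with 1 where forward and backward flow disagree.

    For a visible pixel, flow_fw(p) + flow_bw(p + flow_fw(p)) is close to 0.
    """
    bw_at_target = sample_nearest(flow_bw, flow_fw)
    mismatch = ((flow_fw + bw_at_target) ** 2).sum(axis=-1)
    magnitude = (flow_fw ** 2).sum(axis=-1) + (bw_at_target ** 2).sum(axis=-1)
    return (mismatch > alpha1 * magnitude + alpha2).astype(np.float64)
```

Swapping the two arguments gives the backward occlusion map, so the same function covers both directions.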
Evaluation Metrics
- Optical flow
  - EPE: average endpoint error between the predicted flow and the ground-truth flow over all pixels.
  - Fl: percentage of erroneous pixels. A pixel is considered correctly estimated if its flow endpoint error is < 3 pixels or < 5% of the ground-truth flow magnitude.
- Occlusion estimation
  - F-score: the harmonic mean of precision and recall.
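The three metrics can be written out directly from these definitions. The sketch below is a plain NumPy illustration with hypothetical function names; it assumes dense ground truth (benchmarks with sparse ground truth would additionally restrict the averages to valid pixels).

```python
import numpy as np

def epe(pred, gt):
    """Average endpoint error over all pixels; pred and gt are (H, W, 2)."""
    return np.linalg.norm(pred - gt, axis=-1).mean()

def fl(pred, gt):
    """Percentage of erroneous pixels: a pixel counts as correct if its
    endpoint error is < 3 px or < 5% of the ground-truth magnitude."""
    err = np.linalg.norm(pred - gt, axis=-1)
    mag = np.linalg.norm(gt, axis=-1)
    correct = (err < 3.0) | (err < 0.05 * mag)
    return 100.0 * (1.0 - correct.mean())

def f_score(pred_occ, gt_occ):
    """Harmonic mean of precision and recall for binary occlusion maps."""
    pred = np.asarray(pred_occ).astype(bool)
    gt = np.asarray(gt_occ).astype(bool)
    tp = (pred & gt).sum()
    precision = tp / max(pred.sum(), 1)
    recall = tp / max(gt.sum(), 1)
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)
```

Note that a 4-pixel error on a 100-pixel motion counts as correct under Fl (it is below 5%), while the same 4-pixel error on a static pixel counts as erroneous.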
Quantitative Comparisons
- DDFlow outperforms all existing unsupervised flow learning methods on all datasets.
Quantitative Comparisons
- Our model pre-trained on Flying Chairs even outperforms the fine-tuned state-of-the-art unsupervised models on the Sintel dataset.
Quantitative Comparisons
- 28.6% relative improvement on KITTI 2012 and 37.7% relative improvement on KITTI 2015.
- On KITTI 2012, DDFlow outperforms FlowNet 2.0 on the ranking metric Fl-noc.
Quantitative Comparisons
- DDFlow achieves the best occlusion estimation performance on the Sintel Clean and Sintel Final datasets.
- On the KITTI dataset, the ground-truth occlusion masks only contain pixels moving out of the image boundary. Under this setting, our method achieves comparable performance.
Qualitative Comparisons
- Sample results on the Sintel datasets. The first three rows are from Sintel Clean, while the last three are from Sintel Final.
Qualitative Comparisons
- Example results on the KITTI datasets. The first three rows are from KITTI 2012, and the last three are from KITTI 2015.
- Note that on the KITTI datasets, the occlusion masks are sparse and only contain pixels moving out of the image boundary.
Quantitative: Ablation Study
- Comparing rows 1, 2 with rows 3, 4: occlusion handling improves flow estimation performance on all datasets.
- Comparing rows 1, 3 with rows 2, 4: the census transform consistently improves performance.
- Comparing rows 4 and 5: data distillation greatly improves performance, especially for occluded pixels, with EPE-OCC decreasing by 18.5% on Sintel Clean, 16.1% on Sintel Final, 58.2% on KITTI 2012, and 42.1% on KITTI 2015.
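For readers unfamiliar with the census transform mentioned in the ablation, a minimal sketch follows. It is a generic NumPy illustration, not the exact variant used in DDFlow: the 3x3 window (`radius=1`), edge padding, hard comparisons, and the function names are assumptions. Each pixel is described by whether each neighbour is brighter than it, so the descriptor is unchanged by brightness shifts and positive gain changes, which a raw intensity difference is not.

```python
import numpy as np

def census_transform(img, radius=1):
    """Return an (H, W, K) boolean descriptor for a 2-D image, where K is the
    number of neighbours in the (2*radius+1)^2 window, centre excluded.
    Each bit records whether the neighbour is brighter than the centre."""
    h, w = img.shape
    padded = np.pad(img, radius, mode="edge")
    bits = []
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if dy == 0 and dx == 0:
                continue
            neighbour = padded[radius + dy:radius + dy + h,
                               radius + dx:radius + dx + w]
            bits.append(neighbour > img)
    return np.stack(bits, axis=-1)

def hamming_distance(census_a, census_b):
    """Per-pixel count of differing descriptor bits, shape (H, W)."""
    return (census_a != census_b).sum(axis=-1)
```

A photometric loss built on `hamming_distance` between the census descriptors of the first image and the warped second image is less sensitive to illumination changes than one built on raw intensities.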
Video Flow Estimation on Sintel Dataset
- The top part is the input frame and the bottom part is the corresponding optical flow estimated by DDFlow.
DDFlow code
- Code and models are available at https://github.com/ppliuboy/DDFlow.
What is Next?
Motivation
- Can we completely get rid of synthetic data?
- Can we win Sintel back?
Liu, King, Lyu, Xu. SelFlow: Self-Supervised Learning of Optical Flow. CVPR 2019
Initially, $p_1$ and $p_2$ are non-occluded from $I_t$ to $I_{t+1}$, and $p_1'$ and $p_2'$ are their corresponding pixels. The NOC-Model can accurately estimate the flow of $p_1$ and $p_2$ using photometric loss.
[Figure: $I_t$ and $I_{t+1}$ are fed to the NOC-Model, which outputs flow; $p_1$, $p_2$ in $I_t$ correspond to $p_1'$, $p_2'$ in $I_{t+1}$.]
We inject random noise into $I_{t+1}$ and let the noise cover $p_1'$ and $p_2'$; then $p_1$ and $p_2$ become occluded from $I_t$ to $\tilde{I}_{t+1}$. The OCC-Model cannot accurately estimate the flow of $p_1$ and $p_2$ using photometric loss.
[Figure: NOC-Model on ($I_t$, $I_{t+1}$) and OCC-Model on ($I_t$, $\tilde{I}_{t+1}$), each outputting flow.]
We distill reliable flow estimations of $p_1$ and $p_2$ from the NOC-Model to guide flow learning for the OCC-Model. The guidance is applied only to pixels that are occluded from $I_t$ to $\tilde{I}_{t+1}$ but non-occluded from $I_t$ to $I_{t+1}$, such as $p_1$ and $p_2$.
[Figure: NOC-Model flow on ($I_t$, $I_{t+1}$) guides OCC-Model flow on ($I_t$, $\tilde{I}_{t+1}$) through a self-supervision mask.]
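The noise injection and the resulting self-supervision mask can be sketched as below. This is a simplified NumPy illustration, not the SelFlow implementation: it assumes a single rectangular noise patch on a grayscale image, and it derives the newly occluded pixels geometrically (a pixel of the first frame is treated as occluded if its NOC-Model correspondence lands under the noise) instead of running a second forward-backward check. The function names and the nearest-neighbour lookup are hypothetical.

```python
import numpy as np

def inject_noise(img, top, left, height, width, rng):
    """Overwrite a rectangle of a 2-D image with uniform random noise.

    Returns the noisy image and a boolean mask of the overwritten pixels.
    """
    noisy = img.copy()
    noisy[top:top + height, left:left + width] = rng.random((height, width))
    noise_mask = np.zeros(img.shape, dtype=bool)
    noise_mask[top:top + height, left:left + width] = True
    return noisy, noise_mask

def self_supervision_mask(flow_noc, occ_noc, noise_mask):
    """Pixels of the first frame that were visible for the NOC-Model but whose
    correspondence in the second frame is now covered by noise.

    flow_noc:   (H, W, 2) flow from the NOC-Model.
    occ_noc:    (H, W) occlusion map from the NOC-Model, 1 = occluded.
    noise_mask: (H, W) boolean mask in second-frame coordinates.
    """
    h, w = noise_mask.shape
    ys, xs = np.mgrid[0:h, 0:w]
    x = np.clip(np.rint(xs + flow_noc[..., 0]).astype(int), 0, w - 1)
    y = np.clip(np.rint(ys + flow_noc[..., 1]).astype(int), 0, h - 1)
    covered = noise_mask[y, x]
    return covered & ~occ_noc.astype(bool)
```

The OCC-Model's flow would then be penalized against the NOC-Model's flow only where this mask is true, mirroring the guidance described above.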
Quantitative Results
Our unsupervised results outperform all existing unsupervised results on all datasets by a large margin.
Our unsupervised results even outperform several famous fully-supervised methods.
Our fine-tuned models achieve state-of-the-art results without using any external labeled data.
Qualitative Results
Effect of Self-supervision
[Figure panels: Reference Image; Flow Estimation without Self-supervision; Flow Estimation with Self-supervision]