Simulation for Real Boqing Gong Tencent AI Lab Department of - PowerPoint PPT Presentation


  1. Domain Adaptation & Transfer: All You Need to Use Simulation “for Real”. Boqing Gong, Tencent AI Lab, Department of Computer Science

  2. An intelligent robot

  3. Semantic segmentation of urban scenes: assign each pixel a semantic label. An appealing application: self-driving. Image credit: https://www.cityscapes-dataset.com/
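As a minimal illustration of what "assign each pixel a semantic label" means, the sketch below picks, for every pixel, the class with the highest score. The class list, the `label_pixels` helper, and the toy scores are all hypothetical and are not taken from the talk.

```python
# Hypothetical class list; the talk does not specify one.
CLASSES = ["road", "sky", "pedestrian"]

def label_pixels(scores):
    """scores[r][c] holds one score per class for pixel (r, c).

    Returns a grid of class names: the highest-scoring class per pixel.
    """
    return [
        [CLASSES[max(range(len(px)), key=px.__getitem__)] for px in row]
        for row in scores
    ]

# Toy 2x2 "image" with made-up per-class scores.
toy_scores = [
    [[0.1, 0.8, 0.1], [0.2, 0.7, 0.1]],
    [[0.9, 0.05, 0.05], [0.3, 0.1, 0.6]],
]
label_map = label_pixels(toy_scores)
```

In a real system the scores would come from a segmentation network; here they are hard-coded only to show the per-pixel decision.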

  4. Triumphal approach: convolutional neural networks (CNNs). Long, J., Shelhamer, E., & Darrell, T. (2015). Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

  5. To teach/train CNNs to segment images and videos: about 1.5 hrs to label one such image! Cityscapes: 30k images captured from 50 cities; only 5k are well labeled thus far. Image credit: https://www.cityscapes-dataset.com/
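To put the labeling cost in perspective, here is a back-of-the-envelope calculation using only the figures on the slide. Applying the same 1.5-hour rate to all 30k images is an assumption made for illustration, not a claim from the talk.

```python
HOURS_PER_IMAGE = 1.5    # "about 1.5 hrs" per image, from the slide
WELL_LABELED = 5_000     # images well labeled so far
TOTAL_IMAGES = 30_000    # images captured in total

# Effort implied for the images already well labeled.
hours_spent = HOURS_PER_IMAGE * WELL_LABELED

# Effort if every captured image were labeled at the same rate (assumption).
hours_for_all = HOURS_PER_IMAGE * TOTAL_IMAGES

print(f"{hours_spent:,.0f} hours so far, {hours_for_all:,.0f} hours for all images")
```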

  6. Labeling-free training data by simulation. Image credit: http://synthia-dataset.net/

  7. Simulation to real world: catastrophic performance drop. [Bar chart: Simulation → Simulation: 60; Simulation → Cityscapes: 22]

  8. The perils of mismatched domains. Cause: the standard assumption in machine learning of the same underlying distribution for training and testing. Consequence: poor cross-domain generalization; brittle systems in dynamic and changing environments.

  9. The perils of mismatched domains: Synthetic imagery → Real photos [Zhang et al., ICCV’17]

  10. The perils of mismatched domains: Adapting face detector to a user’s album [Jamal et al., CVPR’18]

  11. The perils of mismatched domains: Attribute detection [Gan et al., CVPR’17]. Middle-level concepts describing objects, faces, etc., shared by different categories.

  12. The perils of mismatched domains: Personalization of video summarizers [Sharghi et al., ECCV’16, CVPR’17, ECCV’18]. [Figure: (a) Input: Video & Query; (b) Algorithm: Sequential & Hierarchical Determinantal Point Process (SH-DPP), with stages labeled "Query-relevant, important, & diverse shots" and "Important & diverse shots"; (c) Output: Summary]

  13. The perils of mismatched domains: Webly supervised learning [Ding et al., WACV’18] [Gan et al., ECCV’16, CVPR’18]

  14. Abstract form: unsupervised domain adaptation (DA). Setup: source domain (with labeled data); target domain (no labels for training); different distributions. Objective: learn models to work well on target.

  15. Existing methods: correcting sampling bias; inferring domain-invariant features; adjusting mismatched models. Works cited on the slide: [Sethy et al., ’09], [Sugiyama et al., ’08], [Muandet et al., ’13], [Pan et al., ’09], [Huang et al., Bickel et al., ’07], [Gong et al., ’12], [Argyriou et al., ’08], [Sethy et al., ’06], [Chen et al., ’12], [Daumé III, ’07], [Shimodaira, ’00], [Gopalan et al., ’11], [Blitzer et al., ’06], [Evgeniou and Pontil, ’05], [Duan et al., ’09], [Duan et al., Daumé III et al., Saenko et al., ’10], [Kulis et al., Chen et al., ’11]
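One common way to "correct sampling bias" is importance weighting: each source example is reweighted by the ratio of target to source probability. The sketch below is a toy instance of that general idea, not the method of any specific paper cited on the slide. The two-valued input space, the probabilities, and the losses are all made up for illustration.

```python
def importance_weighted_risk(samples, loss, p_source, p_target):
    """Estimate target-domain risk from source samples.

    Each sample x is weighted by p_target[x] / p_source[x].
    """
    total = sum(p_target[x] / p_source[x] * loss[x] for x in samples)
    return total / len(samples)

# Toy setup (all values hypothetical): inputs are either "A" or "B".
p_source = {"A": 0.8, "B": 0.2}
p_target = {"A": 0.2, "B": 0.8}
loss = {"A": 0.1, "B": 0.9}          # the model is worse on "B"
source_samples = ["A"] * 8 + ["B"] * 2

# Plain average over source samples, ignoring the domain shift.
unweighted = sum(loss[x] for x in source_samples) / len(source_samples)

# Reweighted average that accounts for the shift toward "B".
weighted = importance_weighted_risk(source_samples, loss, p_source, p_target)

# Exact target-domain risk for comparison.
true_target_risk = sum(p_target[x] * loss[x] for x in p_target)
```

In this toy case the unweighted source estimate understates the target risk, while the reweighted estimate matches it. In practice the density ratio is unknown and has to be estimated.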

  16. [Figure panels: Image, Baseline, Ours, Groundtruth]

  17. Let teacher model hint segmentation net (student). Input: an urban scene image. Algorithm: logistic regression. Output: label distributions. [Bar chart: label distribution over Road, Tree, Sky, Pedestrian, Traffic Sign]
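One way to read "label distribution" is as the fraction of an image's pixels that belong to each class. The sketch below computes that quantity from a labeled image, as one might do to build training targets for the teacher. This reading, the `label_distribution` helper, and the toy label map are assumptions for illustration; the slide does not define the term.

```python
from collections import Counter

def label_distribution(label_map, classes):
    """Return the fraction of pixels carrying each class label."""
    counts = Counter(lbl for row in label_map for lbl in row)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("label_map contains no pixels")
    return {c: counts.get(c, 0) / total for c in classes}

# Class names taken from the slide's chart; the 2x4 label map is made up.
classes = ["Road", "Tree", "Sky", "Pedestrian", "Traffic Sign"]
toy_label_map = [
    ["Sky", "Sky", "Tree", "Pedestrian"],
    ["Road", "Road", "Road", "Road"],
]
dist = label_distribution(toy_label_map, classes)
```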

  18. Let 2nd teacher model hint segmentation net (student). Input: an urban scene image. Algorithm: super-pixel + logistic regression. Output: labels of some super-pixels (e.g., Road, Sidewalk).

  19. Curriculum domain adaptation for training CNNs [ICCV’17]: $\min_{\Theta} \; L(Y_s, \widehat{Y}_s) + d\big(p_t, p_t(\widehat{Y}_t)\big)$, where $s$: Source, $t$: Target. [Bar chart: label distribution over Sky, Road, Pedestrian, Traffic Sign, Tree]
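The objective on this slide combines a supervised segmentation loss on source images with a term that pulls the label distribution implied by the target predictions toward the distribution hinted by the teacher. The following is a minimal sketch of such a two-term loss. It rests on several assumptions: cross-entropy is used for the distance `d`, pixels are given as flat lists of class probabilities, and `weight` is a hypothetical trade-off parameter that does not appear on the slide.

```python
import math

EPS = 1e-12  # avoids log(0)

def cross_entropy(p, q):
    """Cross-entropy between distributions p (reference) and q (predicted)."""
    return -sum(pi * math.log(qi + EPS) for pi, qi in zip(p, q))

def mean_distribution(pixel_probs):
    """Average per-pixel class probabilities into one image-level distribution."""
    n, k = len(pixel_probs), len(pixel_probs[0])
    return [sum(px[c] for px in pixel_probs) / n for c in range(k)]

def curriculum_loss(src_probs, src_labels, tgt_probs, tgt_label_dist, weight=1.0):
    # Source term: mean pixel-wise negative log-likelihood of the true labels.
    seg = -sum(math.log(px[y] + EPS) for px, y in zip(src_probs, src_labels))
    seg /= len(src_labels)
    # Target term: distance between the hinted label distribution and the
    # distribution implied by the student's predictions on the target image.
    reg = cross_entropy(tgt_label_dist, mean_distribution(tgt_probs))
    return seg + weight * reg

# Toy two-class example with made-up numbers.
src_probs, src_labels = [[0.9, 0.1], [0.2, 0.8]], [0, 1]
tgt_probs = [[0.7, 0.3], [0.5, 0.5]]          # implied distribution: [0.6, 0.4]
loss_agree = curriculum_loss(src_probs, src_labels, tgt_probs, [0.6, 0.4])
loss_disagree = curriculum_loss(src_probs, src_labels, tgt_probs, [0.1, 0.9])
```

The loss is lower when the target predictions agree with the hinted distribution, which is the signal that guides the student on unlabeled target images.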

  20. Curriculum domain adaptation. [Figure with parts A, B, C: super-pixel labels (Road, Sidewalk); label distribution over Sky, Road, Traffic Sign, Tree, Pedestrian]

  21. Cityscapes: Train/val/test: 2993/503/1531

  22. GTA: 24,996 images from the video game

  23. SYNTHIA: 9,400 images

  24. Simulation to real world: catastrophic performance drop [Zhang et al., ICCV’17]. [Bar chart: Simulation → Sim: 60; Adaptation: 31; Sim → Cityscapes: 22]

  25. Recent progress. [Bar chart, axis 0 to 60. Bar labels: Ours; Ours, 2018; FCAN; Semi-DA; Real2Real. Values shown: 58, 53, 47, 41, 31]

  26. Domain adaptation: key to use simulation “for real”. Domain-invariant features; importance sampling of data; adapt background models, etc.; curriculum domain adaptation; style transfer, etc. Simulation to reality for segmentation, detection, dynamics planning & control, etc.

  28. Domain adaptation → domain generalization! Training data sampled from $C$ related domains $P_1(x, a), P_2(x, a), \ldots, P_C(x, a)$: $(x, a)_{m_1}, (x, a)_{m_2}, \ldots, (x, a)_{m_C}$, with $m_1 = 1, 2, \ldots$, $m_2 = 1, 2, \ldots$, $m_C = 1, 2, \ldots$ Test data from both seen & unseen domains ($P_{C+1}, P_{C+2}, \ldots$): $(x, ?)_n$, with $n = 1, 2, \ldots$

  29. Simulation for domain generalization. Synthesize Policy for Transfer and Adaptation across Environments and Tasks [NIPS’18, Spotlight]. [Figure: grid of N tasks by M scenes with seen and unseen combinations; Setting 3]

  30. What to simulate? Rare events

  31. What to simulate? Active simulation: more data, better model. Reality ↔ Simulator: actively tune simulator. [Proof-of-concept paper submitted]

  32. Thank you!
