Video Object Segmentation
CV3DST | Prof. Leal-Taixé

Video Object Segmentation
• Object Detection
• Object Tracking
• Object Segmentation
• Video Object Segmentation (this lecture)

Video Object Segmentation
• Goal: Generate accurate and temporally consistent pixel masks for objects in a video sequence.

VOS: some challenges
• Strong viewpoint/appearance changes
• Occlusions
• Scale changes
• Illumination
• Shape
• …
Hard to make assumptions about the object's appearance.
Hard to make assumptions about the object's motion.

VOS: tasks
• Semi-supervised (one-shot) video object segmentation (this lecture): we get the first-frame ground-truth mask, so we know what object to segment.
• Unsupervised (zero-shot) video object segmentation: we have to find the objects as well as their masks. Related: motion segmentation, salient object detection.

Supervised Video Object Segmentation
Given: first-frame ground truth. Goal: complete video segmentation.
• Task formulation
– Given: segmentation mask of the target object(s) in the first frame
– Goal: pixel-accurate segmentation of the entire video
– Currently a major testing ground for segmentation-based tracking

VOS Datasets
• Remember that large-scale datasets are needed for learning-based methods
• DAVIS 2016 (30/20, single objects, first frames): https://davischallenge.org
• DAVIS 2017 (60/90, multiple objects, first frames): https://davischallenge.org
• YouTube-VOS 2018 (3471/982, multiple objects, first frame where object appears): https://youtube-vos.org

Before we get started…
• Pixel-wise output
• If we talk about pixel-wise outputs and motion, there is a concept in computer vision that we need to know first

Optical flow

Optical flow
• Input: 2 consecutive images (e.g. from a video)
• Output: displacement of every pixel from image A to image B
• Results in the “perceived” 2D motion, not the real motion of the object
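To make the input/output definition above concrete, here is a minimal sketch that stores a flow field as an (H, W, 2) array and uses it to look up, for each pixel of image A, where it lands in image B. The function name `warp_backward`, the (x, y) channel order, and the nearest-neighbour sampling are assumptions chosen for illustration; they do not come from the lecture.

```python
import numpy as np

def warp_backward(image_b, flow_ab):
    """Rebuild image A by sampling image B at p + flow(p), nearest neighbour."""
    h, w = image_b.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    # Assumed convention: channel 0 is the x displacement, channel 1 the y displacement.
    xs_b = np.clip(np.rint(xs + flow_ab[..., 0]).astype(int), 0, w - 1)
    ys_b = np.clip(np.rint(ys + flow_ab[..., 1]).astype(int), 0, h - 1)
    return image_b[ys_b, xs_b]

# Toy pair: a bright 2x2 square moves 3 pixels right and 1 pixel down.
image_a = np.zeros((8, 8))
image_a[2:4, 1:3] = 1.0
image_b = np.zeros((8, 8))
image_b[3:5, 4:6] = 1.0

# Flow from A to B: only the pixels of the square are displaced.
flow_ab = np.zeros((8, 8, 2))
flow_ab[2:4, 1:3] = (3.0, 1.0)

reconstructed_a = warp_backward(image_b, flow_ab)
```

Inside the square, `reconstructed_a` matches `image_a`. Background pixels that the square covers in image B cannot be recovered this way, which is one reason occlusions are hard for flow-based methods.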

Optical flow with CNNs
• End-to-end supervised learning of optical flow
P. Fischer et al. “FlowNet: Learning Optical Flow With Convolutional Networks”. ICCV 2015

FlowNet: architecture 1
• Stack both images → input is now 2 x RGB = 6 channels
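The stacking step can be sketched in a few lines. The channels-first layout (3, H, W), the helper name `stack_frames`, and the frame size are assumptions made for this example only.

```python
import numpy as np

def stack_frames(frame_a, frame_b):
    """Concatenate two (3, H, W) RGB frames along the channel axis -> (6, H, W)."""
    if frame_a.shape != frame_b.shape or frame_a.shape[0] != 3:
        raise ValueError("expected two RGB frames of identical shape (3, H, W)")
    return np.concatenate([frame_a, frame_b], axis=0)

rng = np.random.default_rng(0)
frame_a = rng.random((3, 64, 96))
frame_b = rng.random((3, 64, 96))

network_input = stack_frames(frame_a, frame_b)
print(network_input.shape)  # (6, 64, 96)
```

With this design the first convolution sees both frames at once, so the network has to learn by itself how to relate them.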

FlowNet: architecture 2
• Siamese architecture

FlowNet: architecture 2
• Two key design choices
– How to combine the information from both images?

Correlation layer
• Multiplies a feature vector with another feature vector
• Fixed operation. No learnable weights!

Correlation layer
• The matching score represents how correlated these two feature vectors are
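A minimal sketch of such a correlation layer, assuming single-pixel patches and a small search window: for every position in feature map A, it takes the dot product with the feature vectors of map B at each displacement within `max_disp`. The function name, the output layout, and the toy data are assumptions for illustration, not the exact layer from the FlowNet paper.

```python
import numpy as np

def correlation(feat_a, feat_b, max_disp):
    """feat_a, feat_b: (C, H, W). Returns (D*D, H, W) with D = 2*max_disp + 1.

    Output channel i*D + j holds <feat_a[:, y, x], feat_b[:, y+dy, x+dx]>
    for dy = i - max_disp and dx = j - max_disp. There are no learnable weights.
    """
    _, h, w = feat_a.shape
    d = 2 * max_disp + 1
    pad = ((0, 0), (max_disp, max_disp), (max_disp, max_disp))
    padded_b = np.pad(feat_b, pad)  # zero padding outside the image
    out = np.empty((d * d, h, w), dtype=feat_a.dtype)
    for i in range(d):
        for j in range(d):
            shifted_b = padded_b[:, i:i + h, j:j + w]
            out[i * d + j] = (feat_a * shifted_b).sum(axis=0)
    return out

# Toy features: unit-length random vectors, and a copy shifted by (dy=1, dx=2).
rng = np.random.default_rng(0)
feat_a = rng.standard_normal((16, 10, 10))
feat_a /= np.linalg.norm(feat_a, axis=0, keepdims=True)
feat_b = np.roll(feat_a, shift=(1, 2), axis=(1, 2))

scores = correlation(feat_a, feat_b, max_disp=2)
best = scores.argmax(axis=0)         # index of the best-matching displacement
dy, dx = divmod(int(best[4, 4]), 5)  # D = 5 for max_disp = 2
print(dy - 2, dx - 2)                # recovered displacement at pixel (4, 4)
```

Because the features are normalised, the score at the true displacement is 1 and every other score is smaller, so the argmax recovers the shift at interior pixels.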

Correlation layer
• Hint for anyone interested in 3D reconstruction: useful for finding image correspondences
• Find a transformation from image A to image B
I. Rocco et al. “Convolutional neural network architecture for geometric matching”. CVPR 2017

FlowNet: architecture 2
• Two key design choices
– How to obtain high-quality results?
– How to combine the information from both images?

Can we do VOS with OF?
• Indeed!
• Better if we focus on the flow of the object
• We can improve segmentation and OF iteratively (no DL yet)
Y.H. Tsai et al. “Video Segmentation via Object Flow”. CVPR 2016
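The basic idea of using flow for VOS can be sketched as carrying a mask from one frame to the next along the flow of the object's own pixels. This is only an illustration of mask propagation under assumed conventions (boolean mask, (x, y) flow channels, nearest-pixel rounding); it is not the joint segmentation and flow optimisation of Tsai et al.

```python
import numpy as np

def propagate_mask(mask_a, flow_ab):
    """Move every foreground pixel of mask_a along its flow vector."""
    h, w = mask_a.shape
    ys, xs = np.nonzero(mask_a)  # only the object's pixels are used
    new_xs = np.rint(xs + flow_ab[ys, xs, 0]).astype(int)
    new_ys = np.rint(ys + flow_ab[ys, xs, 1]).astype(int)
    # Drop pixels that leave the image.
    inside = (new_xs >= 0) & (new_xs < w) & (new_ys >= 0) & (new_ys < h)
    mask_b = np.zeros_like(mask_a)
    mask_b[new_ys[inside], new_xs[inside]] = True
    return mask_b

# Toy example: the object moves 3 pixels right and 1 pixel down.
mask_a = np.zeros((8, 8), dtype=bool)
mask_a[2:4, 1:3] = True
flow_ab = np.zeros((8, 8, 2))
flow_ab[2:4, 1:3] = (3.0, 1.0)

mask_b = propagate_mask(mask_a, flow_ab)
```

Errors in the flow are copied straight into the propagated mask, which is why the slide suggests refining segmentation and flow together.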

OSVOS

First-frame fine-tuning
• Goal: Learn the appearance of the object to track
• Main contribution: separate training steps
– Pre-training for ‘objectness’.
– First-frame adaptation to specific object-of-interest using fine-tuning.

One-shot VOS
• 1: Base Network (pre-trained). Pre-trained on ImageNet. Edges and basic image features.
• 2: Parent Network (training). Trained on DAVIS training set. Learns how to do video object segmentation.
• 3: Test Network (fine-tuning). Fine-tuned on frame 1 of test sequence. Learns which object to segment.
• Results on frame N of test sequence
S. Caelles et al. “One-shot video object segmentation”. CVPR 2017

One-shot VOS
• One-shot: we see the first-frame ground truth
• Fine-tuning step: technically, this overfits the network to the first frame of the test sequence. Overfitting is therefore used to learn the appearance of the foreground object (and the background!)
• Test time: each frame is processed independently → no temporal information
S. Caelles et al. “One-shot video object segmentation”. CVPR 2017
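The fine-tune-then-segment recipe can be illustrated with a deliberately tiny stand-in. Instead of a CNN, the sketch below uses a per-pixel logistic classifier on RGB values, overfits it to the first annotated frame, and then segments later frames one by one without any temporal information. The classifier, the synthetic frames, and all names (`fine_tune`, `segment`, `make_frame`) are assumptions for illustration and are not the OSVOS implementation.

```python
import numpy as np

FG_COLOUR = np.array([0.9, 0.1, 0.1])  # toy object appearance (reddish)
BG_COLOUR = np.array([0.1, 0.1, 0.9])  # toy background appearance (bluish)

def make_frame(top, left, size=8, obj=4):
    """Synthetic frame with an obj x obj square at (top, left), plus its mask."""
    mask = np.zeros((size, size), dtype=bool)
    mask[top:top + obj, left:left + obj] = True
    frame = np.where(mask[..., None], FG_COLOUR, BG_COLOUR)
    return frame, mask

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fine_tune(weights, bias, frame, mask, steps=500, lr=1.0):
    """Overfit the per-pixel classifier to a single annotated frame."""
    x = frame.reshape(-1, 3)            # one RGB feature vector per pixel
    y = mask.reshape(-1).astype(float)  # 1 = foreground, 0 = background
    w, b = weights.astype(float), float(bias)
    for _ in range(steps):
        error = sigmoid(x @ w + b) - y  # cross-entropy gradient w.r.t. the logits
        w -= lr * (x.T @ error) / y.size
        b -= lr * error.mean()
    return w, b

def segment(weights, bias, frame):
    """Segment one frame on its own; no information from other frames is used."""
    return sigmoid(frame @ weights + bias) > 0.5

# Stand-in for the parent network: untrained weights.
parent_w, parent_b = np.zeros(3), 0.0

# One-shot step: fine-tune on the first frame and its ground-truth mask.
first_frame, first_mask = make_frame(0, 0)
test_w, test_b = fine_tune(parent_w, parent_b, first_frame, first_mask)

# Test time: every later frame is processed independently.
later_positions = [(1, 2), (3, 3), (4, 4)]
predictions = [segment(test_w, test_b, make_frame(t, l)[0]) for t, l in later_positions]
```

The toy model finds the object wherever it moves because it has only memorised colours. That also previews its weakness: if the foreground or background appearance changes, nothing else in the model can compensate.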

Frame-based segmentation
• PRO: it recovers well from occlusions (unlike mask propagation or optical flow-based methods)
• CON: it is temporally inconsistent

Experiments: highly dynamic scenes

Experiments: accuracy vs annotations
• Two camels!
• Another annotation where the 2nd camel is background
• Another annotation: mask is refined

Fine-tuning time (DAVIS dataset)
Plot annotations: Object flow; 11.8 pp.; 102 ms – one forward pass (parent network)

Observations
• OSVOS does not have a notion of object shape.
• It is a purely appearance-based method: if the foreground (or the background) appearance changes too much, the method fails

Introducing Semantics
First frame

Introducing Semantics
He was occluded in the first frame; therefore, the network never learned that he was background.

But wait…
• We have already seen models that have an idea of object shape.
• Instance segmentation methods!

OSVOS-S: Semantic propagation
• Semantic prior branch that gives us proposals to select from
• Prior: semantics stay coherent throughout the sequence
Diagram labels: Input Image; Appearance Model; First-Round Foreground Estimation; Semantic Prior; Semantic Instance Segmentation; Instance Proposals; Semantic Selection & Propagation; Top Matching Instances; Conditional CNN; Foreground Estimation Result
K.-K. Maninis et al. “Video object segmentation without temporal information”. TPAMI 2018

OSVOS-S: Semantic propagation
• Semantic selection (Frame 0): Ground Truth, Instance Segmentation Proposals → Selected Instances: Person and Motorbike
• Semantic propagation (Frames 18, 24, 30, 36): First-Round Foreground Estimation, Instance Segmentation Proposals → Top Person and Motorbike
K.-K. Maninis et al. “Video object segmentation without temporal information”. TPAMI 2018
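The "top matching instance" step can be sketched as follows: among the instance proposals whose category was selected in the first frame, keep the one that overlaps most with the first-round foreground estimate. Using IoU as the matching score, the proposal dictionaries, and the function names are assumptions for illustration; the paper's actual selection procedure may differ.

```python
import numpy as np

def iou(mask_a, mask_b):
    """Intersection over union of two boolean masks."""
    union = np.logical_or(mask_a, mask_b).sum()
    if union == 0:
        return 0.0
    return float(np.logical_and(mask_a, mask_b).sum() / union)

def top_matching_instance(proposals, reference_mask, allowed_categories):
    """Best proposal (by IoU with reference_mask) among the allowed categories."""
    candidates = [p for p in proposals if p["category"] in allowed_categories]
    if not candidates:
        return None, 0.0
    scores = [iou(p["mask"], reference_mask) for p in candidates]
    best = int(np.argmax(scores))
    return candidates[best], scores[best]

def box_mask(rows, cols, size=6):
    """Hypothetical helper: boolean mask with a filled rectangle."""
    mask = np.zeros((size, size), dtype=bool)
    mask[rows[0]:rows[1], cols[0]:cols[1]] = True
    return mask

# First-round foreground estimate for the current frame (toy data).
foreground_estimate = box_mask((1, 4), (1, 4))

# Hypothetical instance segmentation proposals for the same frame.
proposals = [
    {"name": "person_0", "category": "person", "mask": box_mask((1, 4), (1, 3))},
    {"name": "person_1", "category": "person", "mask": box_mask((3, 6), (3, 6))},
    {"name": "car_0", "category": "car", "mask": box_mask((1, 4), (1, 4))},
]

# Categories selected in the first frame (assumed here to be "person" only).
best, score = top_matching_instance(proposals, foreground_estimate, {"person"})
```

In this toy case `car_0` overlaps the estimate perfectly but is ignored because its category was not selected in the first frame, which is how the semantic prior keeps the result coherent through the sequence.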

Drifting problem
• If the object greatly changes its appearance (e.g., through pose or camera changes), then the model is no longer powerful enough
• But this change was gradual…
