DVDNet: Deep Blind Video Decaptioning with 3D-2D Gated Convolutions

2018 ChaLearn Looking at People Challenge - Track 2: Video Decaptioning
DVDNet: Deep Blind Video Decaptioning with 3D-2D Gated Convolutions
Dahun Kim*, Sanghyun Woo*, Joonyoung Lee, In So Kweon

Our Problem
Remove text overlays in video. Two important points to consider:
1. Video: the input is a sequence of frames.
2. Blind: no inpainting mask is given.

Model Overview
[Figure: 3D gated CNN encoder and 2D gated CNN decoder joined by skip connections; labels: Input, Prediction, Output]
Two important points:
• Video: sequence of frames → 3D-2D U-Net
• Blind: no inpainting mask → residual learning + gated convolution

Vanilla 2D U-Net*
Frame-by-frame operation
• Spatial context
[Figure: 2D CNN encoder and 2D CNN decoder joined by skip connections; labels: Input, Prediction]
Two important points:
• Video: sequence of frames
• Scene dynamics
• Blind: no inpainting mask
* Ronneberger, O. et al. "U-net: Convolutional networks for biomedical image segmentation." MICCAI 2015.

Input: Multiple frames
Scene dynamics
• Aggregate hints from spatio-temporal neighborhoods
  - Object movements
  - Subtitle changes

Vanilla 3D U-Net*
Multiple frame prediction
[Figure: 3D CNN encoder and 3D CNN decoder joined by skip connections; labels: Input, Prediction]
• Hard problem
• Heavy
• Non-uniform prediction
* Çiçek, Ö. et al. "3D U-Net: learning dense volumetric segmentation from sparse annotation." MICCAI 2016.

Output: Single frame
Focus on a single frame
• Aggregate hints from lagging and leading frames.
[Figure: lagging frames, center frame, and leading frames enter the 3D-2D U-Net; label: Output]
3D-2D U-Net
• Easy problem
• Lightweight
• Temporal view range
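The idea of predicting one center frame from its lagging and leading neighbors can be sketched as an index-selection step. This is only an illustration: the function name `temporal_window`, the `radius` parameter, and the choice to clamp indices at the clip boundaries are assumptions, not details given in the slides.

```python
def temporal_window(num_frames, center, radius):
    """Return indices of lagging, center, and leading frames around `center`.

    Indices that fall outside the clip are clamped to the nearest valid
    frame (an assumed boundary policy; the slides do not specify one).
    """
    if not 0 <= center < num_frames:
        raise IndexError("center frame is outside the clip")
    return [min(max(center + offset, 0), num_frames - 1)
            for offset in range(-radius, radius + 1)]


# Two lagging frames, the center frame, two leading frames.
print(temporal_window(num_frames=10, center=4, radius=2))  # [2, 3, 4, 5, 6]
# At the start of the clip the missing lagging frames repeat frame 0.
print(temporal_window(num_frames=10, center=0, radius=2))  # [0, 0, 0, 1, 2]
```

The network would receive the frames at these indices as input and predict only the frame at `center`.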

3D-2D U-Net architecture
Focus on a single frame
[Figure: 3D gated CNN encoder and 2D gated CNN decoder joined by skip connections; labels: Input, Prediction]
• 3D convolutions flatten the encoder features into one frame
  - so that they match the shape of the decoder features and can be concatenated.
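One way to read the flattening step: a 3D convolution whose temporal kernel spans all T frames, with no temporal padding, produces a temporal length of 1, so the skip feature has the same shape as a 2D decoder feature. The sketch below uses plain Python lists and a (T, 1, 1) kernel for brevity; the helper names and the kernel shape are assumptions, since the slides do not give the actual layer configuration.

```python
def conv_output_length(size, kernel, stride=1, padding=0):
    """Standard convolution output-size formula along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


def collapse_time(features, temporal_weights):
    """Weighted sum over the time axis: (T, H, W) -> (H, W).

    Equivalent to a 3D convolution with a (T, 1, 1) kernel and no padding.
    """
    if len(features) != len(temporal_weights):
        raise ValueError("need one weight per frame")
    height, width = len(features[0]), len(features[0][0])
    return [[sum(w * frame[y][x] for w, frame in zip(temporal_weights, features))
             for x in range(width)]
            for y in range(height)]


# Encoder feature with T=3 frames of size 2x2 (toy values).
encoder_feature = [
    [[0.0, 0.0], [0.0, 0.0]],
    [[4.0, 4.0], [4.0, 4.0]],
    [[8.0, 8.0], [8.0, 8.0]],
]
flattened = collapse_time(encoder_feature, [0.25, 0.5, 0.25])

# A 2D decoder feature with 2 channels of size 2x2 (toy values).
decoder_feature = [[[1.0, 1.0], [1.0, 1.0]], [[2.0, 2.0], [2.0, 2.0]]]
# Concatenate along the channel axis, as a skip connection would.
concatenated = decoder_feature + [flattened]

print(conv_output_length(size=3, kernel=3))  # temporal length after collapse: 1
print(len(concatenated))                     # channels after concatenation: 3
```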

Residual Learning
[Figure: 3D gated CNN encoder and 2D gated CNN decoder joined by skip connections; labels: Input, Prediction, Output]
- Implicitly knows the inpainting mask
Two important points:
• Video: sequence of frames
• Blind: no inpainting mask
• Residual learning
  - Not touching good pixels
  - Focus on the corrupted regions
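A minimal sketch of why residual learning implies a mask: if the network predicts a correction that is added to the input frame, the correction is near zero on good pixels and large only where a caption was. The function names, the clamping to [0, 1], and the threshold value are assumptions for illustration, not the authors' implementation.

```python
def apply_residual(frame, residual):
    """Output = input frame + predicted residual, clamped to [0, 1]."""
    return [[min(max(p + r, 0.0), 1.0) for p, r in zip(frame_row, res_row)]
            for frame_row, res_row in zip(frame, residual)]


def implied_mask(residual, threshold=0.05):
    """Pixels the residual actually changes, i.e. the implicit inpainting mask."""
    return [[1 if abs(r) > threshold else 0 for r in row] for row in residual]


# Toy grayscale frame; the two 1.0 pixels stand in for white caption text.
frame = [[0.2, 1.0, 0.2],
         [0.5, 0.5, 1.0]]
# A residual that leaves good pixels untouched and darkens the caption pixels.
residual = [[0.0, -0.75, 0.0],
            [0.0, 0.0, -0.5]]

print(apply_residual(frame, residual))  # [[0.2, 0.25, 0.2], [0.5, 0.5, 0.5]]
print(implied_mask(residual))           # [[0, 1, 0], [0, 0, 1]]
```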

Gated Convolution*
+ Attention
• 0-1 value (gating)
• Attention
[Figure labels: Input feature, Conv, Conv, Sigmoid]
* Yu, J. et al. "Free-form image inpainting with gated convolution." arXiv preprint arXiv:1806.03589.
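The gating idea can be sketched with two convolutions over the same input: one produces features, the other produces a gate that a sigmoid squashes into the 0-1 range, and the two are multiplied element-wise. This single-channel, pure-Python version is an assumption-laden illustration; it omits the feature activation and multi-channel weights a real layer would have, and all function names are hypothetical.

```python
import math


def conv2d_valid(image, kernel, bias=0.0):
    """Single-channel 2D cross-correlation without padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[bias + sum(kernel[i][j] * image[y + i][x + j]
                        for i in range(kh) for j in range(kw))
             for x in range(out_w)]
            for y in range(out_h)]


def sigmoid(v):
    """Numerically safe logistic function."""
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)


def gated_conv2d(image, feature_kernel, gate_kernel, gate_bias=0.0):
    """Feature response multiplied by a 0-1 gate computed from the same input."""
    features = conv2d_valid(image, feature_kernel)
    gates = conv2d_valid(image, gate_kernel, gate_bias)
    return [[f * sigmoid(g) for f, g in zip(f_row, g_row)]
            for f_row, g_row in zip(features, gates)]


image = [[1.0] * 3 for _ in range(3)]
ones = [[1.0, 1.0], [1.0, 1.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]

# A gate of sigmoid(0) = 0.5 lets half of each feature value through.
print(gated_conv2d(image, ones, zeros))  # [[2.0, 2.0], [2.0, 2.0]]
# A strongly negative gate bias closes the gate almost completely.
closed = gated_conv2d(image, ones, zeros, gate_bias=-50.0)
```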

Loss Function
L1 loss + gradient L1 loss + SSIM loss
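A rough sketch of how the three terms could be combined on a single grayscale image. Several choices here are assumptions, because the slide gives only the names of the terms: equal weights, forward differences for the gradients, SSIM computed over the whole image as one window (practical implementations usually use local windows), and an SSIM loss defined as 1 - SSIM.

```python
def flatten(image):
    return [p for row in image for p in row]


def l1(a, b):
    """Mean absolute difference between two equally sized images."""
    fa, fb = flatten(a), flatten(b)
    return sum(abs(x - y) for x, y in zip(fa, fb)) / len(fa)


def gradients(image):
    """Horizontal and vertical forward differences."""
    dx = [[row[x + 1] - row[x] for x in range(len(row) - 1)] for row in image]
    dy = [[image[y + 1][x] - image[y][x] for x in range(len(image[0]))]
          for y in range(len(image) - 1)]
    return dx, dy


def gradient_l1(a, b):
    (ax, ay), (bx, by) = gradients(a), gradients(b)
    return l1(ax, bx) + l1(ay, by)


def global_ssim(a, b, c1=0.01 ** 2, c2=0.03 ** 2):
    """SSIM over the whole image as one window, for values in [0, 1]."""
    fa, fb = flatten(a), flatten(b)
    n = len(fa)
    ma, mb = sum(fa) / n, sum(fb) / n
    va = sum((x - ma) ** 2 for x in fa) / n
    vb = sum((y - mb) ** 2 for y in fb) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(fa, fb)) / n
    return ((2 * ma * mb + c1) * (2 * cov + c2)) / (
        (ma * ma + mb * mb + c1) * (va + vb + c2))


def decaptioning_loss(pred, target, w_l1=1.0, w_grad=1.0, w_ssim=1.0):
    """L1 + gradient L1 + (1 - SSIM); the weights are assumed, not sourced."""
    return (w_l1 * l1(pred, target)
            + w_grad * gradient_l1(pred, target)
            + w_ssim * (1.0 - global_ssim(pred, target)))


target = [[0.0, 0.5], [0.5, 1.0]]
pred = [[0.0, 0.5], [0.5, 0.0]]
print(decaptioning_loss(pred, target))
```

Images are assumed to be at least 2x2 so that both gradient maps are non-empty.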

Quantitative Results

Qualitative Results


