Event-driven Video Frame Synthesis
Zihao Wang¹, Weixin Jiang¹, Kuan He¹, Boxin Shi², Aggelos Katsaggelos¹, Oliver Cossairt¹
¹Northwestern University, ²Peking University
2nd Int'l Workshop on Physics Based Vision meets Deep Learning (PBDL), in conjunction with ICCV 2019
Physics-based vision meets deep learning
Physics-based vision: starting from an initial point (often a noisy input), physics-based optimization with designed priors following physics laws moves toward the objective/target. It is subject to noise and incomplete modeling.
10/29/2019, PBDL2019, ICCV Workshop
Learning-based vision: end-to-end non-linear fitting usually has superior performance, as long as you have sufficient data and GPUs.
• Learning from simulation! But there is a gap between simulation and real data.
• Train with data augmentation.
• Requires retraining even for similar tasks.
E.g., video frame interpolation: separate networks (NN1, NN2, NN3) must be trained for 1-frame, 9-frame, and 10-frame interpolation; retraining is required even for similar tasks.
Physics + learning based vision
Framework:
• Physics: first use a unifying physics-based approach to obtain a rough estimate.
• Learning: then use a DNN to learn the residual.
Suitable application: multi-modal video synthesis.
Background: rethinking frame-based imaging
Pipeline: shutter on → photon integration → shutter off → A/D conversion → readout → other processing.
• Exposure time: long → blurry, over-exposed; short → noisy, under-exposed.
• Post-exposure time {ADC + synchronized read-out} → discrepancy between frames (frame interpolation).
Motivation
Frame-based camera pipeline: shutter on → photon integration → shutter off → A/D conversion → readout → other processing.
• We need "smart" cameras that:
  • can respond to high-speed motions (eliminate blur);
  • do not always operate at high speed (less data redundancy).
• Potential solution: event cameras.
What's an event camera? Another high-speed camera?
Each pixel:
• compares brightness variations (blue: increase; red: decrease);
• has small latency (microsecond level), up to 10⁶ FPS at max;
• works independently (asynchronous).
Scenario: moving poster with shapes (data from the DAVIS dataset). Capture: 22 FPS; display: 1.1 FPS.
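As a concrete illustration of the per-pixel behavior above, the sketch below converts a short intensity sequence at one pixel into polarity events by thresholding log-intensity changes. This is the standard event-camera contrast model; the threshold value `c` and the function name are illustrative, not taken from the talk.

```python
import math

def events_from_intensity(samples, c=0.2, eps=1e-6):
    """Emit (time index, polarity) events when the log-intensity at one
    pixel changes by more than the contrast threshold c since the last
    event. Polarity +1 = brightness increase, -1 = decrease."""
    events = []
    ref = math.log(samples[0] + eps)  # reference level at the last event
    for t, x in enumerate(samples[1:], start=1):
        diff = math.log(x + eps) - ref
        while abs(diff) >= c:
            pol = 1 if diff > 0 else -1
            events.append((t, pol))
            ref += pol * c            # reset reference by one threshold step
            diff = math.log(x + eps) - ref
    return events
```

A brightness jump larger than the threshold produces several events at the same timestamp, which is why event rates can reach the million-per-second regime quoted on the slide.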
But…
• Events ≠ temporal gradient:
  • infinitely many solutions when inferring intensity from events;
  • weak variations cannot be captured.
• Events are very noisy:
  • the noise model is not well understood (Gaussian on the threshold);
  • event denoisers are not advanced: they can cancel isolated events (via correlation) but cannot handle complex scenarios, e.g. illumination changes.
Example images overlaid with neighboring events; data from the DAVIS dataset and Pan et al. CVPR'19.
We propose: intensity frame + events for high frame-rate video synthesis.
Our approach: fusion of intensity frame + events.
DMR: Differentiable Model-based Reconstruction.
Differentiable model (event sensing)
• Per-pixel sensing model: each pixel of the t-th event frame records the thresholded brightness change at the corresponding pixel of the t-th latent frame.
• Differentiable model (approx.): a smooth approximation of the per-pixel sensing model, so it can be optimized by gradient descent.
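The hard event threshold is not differentiable. A common smooth surrogate, and my assumption of what the slide's approximation looks like, replaces the step function with a steep sigmoid on the log-intensity difference:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def soft_event_frame(frame_prev, frame_next, c=0.2, sharpness=50.0, eps=1e-6):
    """Differentiable surrogate for per-pixel event firing: the step
    'log-intensity change exceeds threshold c' is relaxed to a steep
    sigmoid, so gradients can flow back to the latent frames.
    (The threshold c and sharpness are hypothetical values.)"""
    diff = np.log(frame_next + eps) - np.log(frame_prev + eps)
    pos = sigmoid(sharpness * (diff - c))   # ≈1 where an ON event fires
    neg = sigmoid(sharpness * (-diff - c))  # ≈1 where an OFF event fires
    return pos - neg                        # signed soft event frame in [-1, 1]
```

As `sharpness` grows, the surrogate approaches the hard per-pixel sensing model while remaining differentiable everywhere.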
Differentiable model (frame sensing)
We consider three temporal settings: interpolation, prediction, and motion deblur.
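The three temporal settings differ in what the low-speed camera observes from the latent high-speed frames. The sketch below is my reading of the setup described on the later result slides (start & end frames for interpolation, start frame only for prediction, temporal average over the exposure for deblur, consistent with EDI [CVPR'19]):

```python
import numpy as np

def frame_sensing(latent, mode):
    """What the intensity camera observes from the latent high-speed
    frames (shape (T, H, W)) in each temporal setting."""
    if mode == "interpolation":   # start & end frames are captured
        return latent[[0, -1]]
    if mode == "prediction":      # only the start frame is captured
        return latent[[0]]
    if mode == "deblur":          # blurry frame = temporal mean over exposure
        return np.mean(latent, axis=0, keepdims=True)
    raise ValueError(mode)
```

In all three cases the observation is a differentiable function of the latent frames, so the same reconstruction machinery applies.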
Reconstruction loss and optimization
Objective: pixel loss (frame pixel error + event pixel error) + sparsity loss.
We use stochastic gradient descent (SGD) to minimize the loss; as the loss decreases, the results get closer to the ground truth.
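The objective above can be sketched as follows. The slide names the three terms; the concrete forms here (squared frame/event errors, an L1 sparsity term on spatial gradients, and the weights `lam_e`, `lam_s`) are my assumptions for illustration:

```python
import numpy as np

def dmr_loss(frames, obs_frames, obs_events, lam_e=1.0, lam_s=0.01):
    """Sketch of the DMR objective over latent frames of shape (T, H, W).
    obs_frames: the two captured low-speed frames (interpolation case);
    obs_events: the (T-1, H, W) event frames between them."""
    # frame pixel error: latent start/end frames should match the captures
    frame_err = np.sum((frames[[0, -1]] - obs_frames) ** 2)
    # event pixel error: temporal differences should match the event frames
    pred_events = np.diff(frames, axis=0)
    event_err = np.sum((pred_events - obs_events) ** 2)
    # sparsity loss: encourage piecewise-smooth latent frames
    sparsity = (np.sum(np.abs(np.diff(frames, axis=1)))
                + np.sum(np.abs(np.diff(frames, axis=2))))
    return frame_err + lam_e * event_err + lam_s * sparsity
```

SGD (or any gradient-based optimizer) then updates the latent frames directly to drive this loss down.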
Results (DMR)
• Interpolation case: given start & end frames plus the events in between, recover the intermediate frames.
• Inputs: low-speed intensity frames (2 frames) + event frames (20 frames); output: high-speed video (21 frames).
• The middle frame is withheld for evaluation; shown: Frame #10 and its error map.
Results (DMR)
• Prediction case: given the start frame and future events, recover future frames.
• Comparison: CF [ACCV'18] vs. ours.
Results (DMR)
• Motion deblur case: given a blurry image plus the events during exposure, recover the intermediate sharp frames (video recovery).
• Comparison: EDI [CVPR'19] vs. ours.
Overview of our approach
Residual "denoiser"
Use a CNN to learn the residual of the DMR output w.r.t. the ground truth:
• designed to enhance DMR results;
• easy to train;
• models DMR artifacts as residual "noise" (actually beyond Gaussian denoising);
• single-frame based, so it interfaces well with DMR.
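The residual-learning idea above can be sketched in a few lines: the network predicts the residual (ground truth minus DMR output) rather than the clean image, and the final output adds that prediction back to its input. As a minimal stand-in, a single linear layer replaces the CNN; the real denoiser is a convolutional network:

```python
import numpy as np

def train_residual_denoiser(dmr_out, gt, lr=0.1, steps=200):
    """Learn W so that dmr_out + dmr_out @ W ≈ gt, i.e. the model fits
    the residual (gt - dmr_out). dmr_out, gt: arrays of shape (N, n)."""
    n = dmr_out.shape[1]
    W = np.zeros((n, n))
    for _ in range(steps):
        residual_pred = dmr_out @ W
        err = residual_pred - (gt - dmr_out)     # residual target, not gt
        W -= lr * dmr_out.T @ err / len(dmr_out) # gradient step on L2 loss
    return W

def denoise(dmr_out, W):
    return dmr_out + dmr_out @ W  # output = input + predicted residual
```

Learning the residual keeps the target small and zero-centered, which is part of why the slide calls the denoiser "easy to train".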
Results (residual denoiser)
Comparison: Ours (DMR), DnCNN [TIP'17], FFDNet [TIP'18], Ours (RD), ground truth.
Additional example: Ours (DMR), DnCNN [TIP'17], FFDNet [TIP'18], Ours (RD), ground truth.
Results
• Comparison with a non-event-based frame interpolation approach, SepConv [CVPR'17].
• Events provide additional information that is useful for challenging motions.
Shown: SepConv [CVPR'17], Ours (DMR + RD), ground truth.
Summary: image + events → DMR (interpolation, prediction, motion deblur) → RD.
Thank you!
Zihao (Winston) Wang
zwinswang@gmail.com