
Event-driven Video Frame Synthesis. Zihao Wang¹, Weixin Jiang¹, Kuan He¹, Boxin Shi², Aggelos Katsaggelos¹, Oliver Cossairt¹. ¹Northwestern University, ²Peking University. 2nd Int'l Workshop on Physics Based Vision meets Deep Learning.


  1. Event-driven Video Frame Synthesis. Zihao Wang¹, Weixin Jiang¹, Kuan He¹, Boxin Shi², Aggelos Katsaggelos¹, Oliver Cossairt¹. ¹Northwestern University, ²Peking University. 2nd Int'l Workshop on Physics Based Vision meets Deep Learning (PBDL), in conjunction with ICCV 2019.

  2. Physics-based vision meets deep learning: physics-based vision. Starting from an initial point, optimization uses designed priors following physics laws to move toward the objective (target). 10/29/2019, PBDL2019 ICCV Workshop.

  3. Physics-based vision meets deep learning: physics-based optimization is subject to noise or incomplete modeling.

  4. Physics-based vision meets deep learning: with noisy input, physics-based optimization is subject to noise or incomplete modeling.

  5. Physics-based vision meets deep learning: learning-based vision. End-to-end non-linear fitting usually has superior performance, as long as you have sufficient data and GPUs.

  6. Physics-based vision meets deep learning: learning-based vision. End-to-end non-linear fitting; learning from simulation!

  7. Physics-based vision meets deep learning: learning-based vision. End-to-end non-linear fitting can learn from simulation, but there is a gap between simulated and real data.

  8. Physics-based vision meets deep learning: learning-based vision. End-to-end non-linear fitting; train with data augmentation.

  9. Physics-based vision meets deep learning: learning-based vision. End-to-end non-linear fitting requires retraining even for similar tasks.

  10. Physics-based vision meets deep learning: learning-based vision. E.g. video frame interpolation: a separate network is needed for each setting (NN1 for 1-frame, NN2 for 9-frame, NN3 for 10-frame interpolation), so end-to-end fitting requires retraining even for similar tasks.

  11. Physics-based vision meets deep learning: physics + learning based vision. Framework: first use a unifying physics-based approach to obtain a rough estimate, then use a DNN to learn the residual. Suitable application: multi-modal video synthesis.

  12. Background: rethinking frame-based imaging. Pipeline: shutter on → photon integration → shutter off → A/D conversion → readout → other processing (exposure time, then post-exposure time for ADC and synchronized read-out). Exposure trade-off: long exposure is blurry and over-exposed; short exposure is noisy and under-exposed, and there is a discrepancy between frames (hence frame interpolation).

  13. Motivation. Frame-based camera pipeline: shutter on → photon integration → shutter off → A/D conversion → readout → other processing. We need "smart" cameras that can respond to high-speed motions (eliminate blur) yet do not always operate at high speed (less data redundancy). Potential solution: event cameras.

  14. What's an event camera? Another high-speed camera? Scenario: a moving poster with shapes (data from the DAVIS dataset). Each pixel compares brightness variations (blue: increase; red: decrease), has small latency (microsecond level), reaches up to 10^6 FPS at max, and works independently (asynchronous). Capture: 22 FPS; display: 1.1 FPS.
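The per-pixel behavior described above can be sketched as a toy simulation: each pixel fires an event when the log-intensity change since its last event crosses a contrast threshold. The threshold value, the epsilon, and the per-pixel reset rule below are illustrative assumptions, not the DAVIS sensor's exact parameters.

```python
import numpy as np

def generate_events(frames, C=0.2, eps=1e-3):
    """frames: (T, H, W) array of linear intensities in [0, 1].
    Returns a per-pixel map of signed event counts (+1/-1 summed)."""
    log_ref = np.log(frames[0] + eps)            # reference log intensity
    events = np.zeros(frames.shape[1:], dtype=np.int32)
    for t in range(1, frames.shape[0]):
        log_t = np.log(frames[t] + eps)
        diff = log_t - log_ref
        pos = diff >= C                          # brightness increased
        neg = diff <= -C                         # brightness decreased
        events += pos.astype(np.int32) - neg.astype(np.int32)
        # reset the reference only where an event fired (asynchronous per pixel)
        log_ref = np.where(pos | neg, log_t, log_ref)
    return events
```

A pixel that brightens sharply between two frames produces a positive event; static pixels produce none, which is why event data is sparse compared to full frames.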

  15. But… Events ≠ temporal gradient: there are infinitely many solutions for inferring intensity from events, and events cannot capture weak variations.

  16. But… Events ≠ temporal gradient: infinitely many solutions for inferring intensity from events; weak variations cannot be captured. Events are also very noisy: the noise model is not well understood (modeled as Gaussian on the threshold), and event denoisers are not advanced; they can cancel isolated events (via correlation) but cannot handle complex scenarios, e.g. illumination change. Example images overlaid with neighboring events; data from the DAVIS dataset and Pan et al. CVPR'19.

  17. We propose intensity frames + events for high frame-rate video synthesis.

  18. Our approach: fusion of intensity frame + events. DMR: Differentiable Model-based Reconstruction.

  19. Differentiable model (event sensing). Per-pixel sensing model and its differentiable approximation; the slide's notation denotes one pixel of the t-th intensity frame and of the t-th event frame.
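A minimal sketch of why a differentiable approximation is needed: the hard threshold that triggers an event is a step function with zero gradient almost everywhere, so model-based optimization can replace it with a smooth surrogate, for example a pair of sigmoids. The surrogate form and the temperature `tau` below are illustrative assumptions; the paper's exact approximation may differ.

```python
import numpy as np

def hard_event(delta_logI, C=0.2):
    """Hard event model: ±1 when |Δ log I| crosses the contrast threshold C."""
    return (delta_logI >= C).astype(float) - (delta_logI <= -C).astype(float)

def soft_event(delta_logI, C=0.2, tau=0.05):
    """Smooth, differentiable relaxation of the hard threshold above."""
    sig = lambda x: 1.0 / (1.0 + np.exp(-x / tau))
    return sig(delta_logI - C) - sig(-delta_logI - C)
```

As `tau` shrinks, `soft_event` approaches `hard_event`, while keeping usable gradients for SGD.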

  20. Differentiable model (frame sensing). We consider three temporal settings: interpolation, prediction, and motion deblur.
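For the motion-deblur setting, the standard frame-sensing model treats a blurry exposure as the temporal average of the latent sharp frames inside the exposure window. A sketch of that model (the paper's exact discretization may differ):

```python
import numpy as np

def blur_from_latents(latent_frames):
    """Model a blurry exposure as the mean of the (T, H, W) latent
    sharp frames captured during the exposure window."""
    return latent_frames.mean(axis=0)
```

Because averaging is linear and differentiable, this constraint slots directly into gradient-based reconstruction alongside the event term.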

  21. Reconstruction loss and optimization. Objective = pixel loss (frame pixel error) + event pixel error + sparsity loss. Use stochastic gradient descent (SGD) to minimize the loss; as the loss decreases, the results get closer to the ground truth.
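On a 1-D toy signal, the objective above can be sketched as a frame-fidelity term on the observed key frames, an event-fidelity term on the temporal differences (here fed with idealized, noise-free event measurements), and an L1 sparsity term, minimized by plain gradient descent standing in for SGD. The weights `lam` and `mu` are assumed values, not the paper's.

```python
import numpy as np

def dmr_loss(x, frame0, frameT, events, lam=1.0, mu=0.01):
    """Frame fidelity on the endpoints + event fidelity on temporal
    differences + L1 sparsity on the temporal gradient."""
    frame_err = (x[0] - frame0) ** 2 + (x[-1] - frameT) ** 2
    dx = np.diff(x)                       # temporal gradient of the estimate
    event_err = np.sum((dx - events) ** 2)
    sparsity = np.sum(np.abs(dx))
    return frame_err + lam * event_err + mu * sparsity

def dmr_grad(x, frame0, frameT, events, lam=1.0, mu=0.01):
    """Analytic (sub)gradient of dmr_loss with respect to x."""
    g = np.zeros_like(x)
    g[0] += 2 * (x[0] - frame0)
    g[-1] += 2 * (x[-1] - frameT)
    dx = np.diff(x)
    r = 2 * lam * (dx - events) + mu * np.sign(dx)
    g[1:] += r                            # d dx[j] / d x[j+1] = +1
    g[:-1] -= r                           # d dx[j] / d x[j]   = -1
    return g

def reconstruct(frame0, frameT, events, steps=2000, lr=0.05):
    x = np.linspace(frame0, frameT, len(events) + 1)  # init by interpolation
    for _ in range(steps):
        x -= lr * dmr_grad(x, frame0, frameT, events)
    return x
```

With clean events the recovered signal approaches the ground truth, mirroring the slide's claim that decreasing loss correlates with reconstruction quality.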

  22. Results (DMR), interpolation case: given the start and end frames plus the events in between, recover the intermediate frames. Inputs: low-speed intensity frames (2 frames) and event frames (20 frames); output: high-speed video (21 frames). The middle frame is withheld for evaluation; shown: frame #10 and the error map of frame #10.

  23. Results (DMR), prediction case: given a start frame and future events, recover the future frames. Comparison: CF [ACCV'18] vs. ours.

  24. Results (DMR), motion-deblur case: given a blurry image plus the events during exposure, recover intermediate sharp frames. Shown: blurry images, events during exposure, EDI [CVPR'19], ours, and the recovered video.

  25. Overview of our approach.

  26. Residual "denoiser": use a CNN to learn the residual of the DMR output w.r.t. the ground truth. Designed to enhance DMR results; easy to train; models DMR artifacts as residual "noise" (actually beyond Gaussian denoising); single-frame based; interfaces well with DMR.
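The residual idea can be illustrated with a deliberately tiny stand-in for the CNN: a per-pixel linear model is trained to predict only the residual, and the final output is the DMR input plus that predicted residual. The linear layer, the synthetic artifact in the usage below, and the hyper-parameters are all illustrative assumptions; the paper uses a CNN.

```python
import numpy as np

def train_residual(x, y, steps=2000, lr=0.3):
    """Fit a residual model r(x) = w*x + b so that x + r(x) ≈ y,
    via full-batch gradient descent on the squared error."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        err = (x + w * x + b) - y      # residual connection: output = x + r(x)
        w -= lr * np.mean(2 * err * x)
        b -= lr * np.mean(2 * err)
    return w, b
```

Learning the (small, structured) residual is an easier regression target than learning the full mapping from DMR output to ground truth, which is the slide's "easy to train" point.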

  27. Results (residual denoiser): ours (DMR), DnCNN [TIP'17], FFDNet [TIP'18], ours (RD), ground truth.

  28. Further comparison: DnCNN [TIP'17], FFDNet [TIP'18], ours (RD), ground truth, ours (DMR).

  29. Results: comparison with a non-event-based frame-interpolation approach. Events can provide additional information that is useful for challenging motions. SepConv [CVPR'17], ground truth, ours (DMR + RD).

  30. Summary: image + events feed DMR (followed by RD) for interpolation, prediction, and motion deblur. Thank you! Zihao (Winston) Wang, zwinswang@gmail.com
