Motion Denoising with Application to Time-lapse Photography
Michael Rubinstein (MIT CSAIL), Ce Liu (Microsoft Research NE), Peter Sand, Fredo Durand (MIT), Bill Freeman (MIT)
Time-lapse Videos: construction, natural phenomena, medical, biological/botanical
For Personal Use Too! Examples spanning 9 months, 16 years, and 7 years (http://www.danhanna.com/aging_project/p.html; source: YouTube)
“Stylized Jerkiness”
Motion Denoising [Figure: space-time diagram; labels: World, Time-lapse, Motion denoising; axes: Space, Time]
Motion Denoising Motion denoising
Time-lapse in Vision/Graphics Research
• Video summarization (video time-lapse) [Bennett and McMillan 2007] [Pritch et al. 2008]
• Time-lapse editing [Sunkavalli et al. 2007]
Motion Denoising is Challenging!
• Naïve low-pass (temporal) filtering
  – Pixels of different objects are averaged
• Smoothing motion trajectories
  – Motion estimation in time-lapse videos is hard!
    * Motion discontinuities
    * Color inconsistencies
[Figure: KLT tracks]
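To make the first bullet concrete, here is a minimal sketch of naïve temporal low-pass filtering on a toy video. The function name `temporal_box_filter`, the box-shaped window, and the toy data are assumptions made for illustration; they are not taken from the slides.

```python
import numpy as np

def temporal_box_filter(video, radius):
    """Average each pixel over a temporal window of up to 2*radius+1 frames.

    video: array of shape (T, H, W). Frames near the ends use a shorter window.
    """
    T = video.shape[0]
    out = np.empty(video.shape, dtype=float)
    for t in range(T):
        lo, hi = max(0, t - radius), min(T, t + radius + 1)
        out[t] = video[lo:hi].mean(axis=0)
    return out

# Toy one-row video: a bright object (1.0) on a dark background (0.0)
# that jitters between column 2 and column 3 from frame to frame.
T, W = 6, 6
video = np.zeros((T, 1, W))
for t in range(T):
    video[t, 0, 2 + (t % 2)] = 1.0

filtered = temporal_box_filter(video, radius=1)
print(filtered[2, 0])  # the object is smeared over columns 2 and 3
```

In frame 2 the object sits at column 2, but the filtered frame holds 1/3 there and 2/3 at column 3: object and background pixels have been blended instead of the object being moved to a stable position.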
Formulation
• Key idea: long-term events in videos can be statistically explained within some local spatiotemporal support, while short-term events are more distinctive
  – Assumption: world is smooth
  – Short-term variation = noise, long-term variation = signal
• Our algorithm reshuffles the pixels in both space and time to maintain long-term events in the video, while removing short-term noisy motions
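Formulation

$$
\begin{aligned}
F(x) ={}& \sum_{q} \left| J(q + x(q)) - J(q) \right| && \text{Fidelity (to input)} \\
&+ \beta \sum_{q,\, s \in O_u(q)} \left\| J(q + x(q)) - J(s + x(s)) \right\|^2 && \text{Temporal coherence (of the result)} \\
&+ \delta \sum_{q,\, r \in O(q)} \mu_{qr} \left| x(q) - x(r) \right| && \text{Regularization (of the warp)}
\end{aligned}
$$

• $q = (y, z, u)$
• $J$: input video; $J(q + x(q))$: output video
• $O_u(q)$: temporal neighbors of $q$; $O(q)$: spatiotemporal neighbors of $q$
• $x(q) \in \{(\varepsilon_y, \varepsilon_z, \varepsilon_u) : |\varepsilon_y| \le \Delta_t,\ |\varepsilon_z| \le \Delta_t,\ |\varepsilon_u| \le \Delta_u\}$: displacement field
• $\mu_{qr} = \exp\left(-\gamma \left\| J(q) - J(r) \right\|^2\right)$, with $\gamma = \left(2 \left\langle \left\| J(q) - J(r) \right\|^2 \right\rangle\right)^{-1}$, where $\langle \cdot \rangle$ denotes an average over neighboring pixel pairs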
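The objective can be evaluated for a candidate displacement field with a few lines of NumPy. This is only a sketch under stated assumptions, not the authors' implementation: the names `warp` and `energy`, the `[u, z, y]` array layout, clamping at the video borders, the L1 norm for $|x(q) - x(r)|$, counting each neighboring pair once, and computing $\gamma$ from the mean over all 6-connected pairs are all choices made here for illustration.

```python
import numpy as np

def warp(J, x):
    """Output video J(q + x(q)) for an integer displacement field x.

    J: (U, Z, Y) grayscale video indexed [u, z, y] (time, row, column).
    x: (U, Z, Y, 3) integer displacements ordered (eps_u, eps_z, eps_y).
    Out-of-range lookups are clamped to the borders (an assumption).
    """
    U, Z, Y = J.shape
    u, z, y = np.meshgrid(np.arange(U), np.arange(Z), np.arange(Y), indexing="ij")
    uu = np.clip(u + x[..., 0], 0, U - 1)
    zz = np.clip(z + x[..., 1], 0, Z - 1)
    yy = np.clip(y + x[..., 2], 0, Y - 1)
    return J[uu, zz, yy]

def energy(J, x, beta, delta):
    """Evaluate F(x) = fidelity + beta * temporal coherence + delta * regularization."""
    J = np.asarray(J, dtype=float)
    out = warp(J, x)

    # Fidelity: the output should stay close to the input.
    fidelity = np.abs(out - J).sum()

    # Temporal coherence: squared difference between temporally adjacent output pixels.
    temporal = ((out[1:] - out[:-1]) ** 2).sum()

    # Contrast-sensitive weights mu_qr over 6-connected neighbor pairs.
    sq = [np.diff(J, axis=a) ** 2 for a in range(3)]
    mean_sq = np.concatenate([s.ravel() for s in sq]).mean()
    gamma = 1.0 / (2.0 * mean_sq) if mean_sq > 0 else 0.0

    # Regularization: weighted L1 difference between neighboring displacements.
    reg = 0.0
    for a in range(3):
        mu = np.exp(-gamma * sq[a])
        dx = np.abs(np.diff(x, axis=a)).sum(axis=-1)
        reg += (mu * dx).sum()

    return fidelity + beta * temporal + delta * reg

# With zero displacement the output equals the input, so only the
# temporal-coherence term of the input video itself remains.
J = np.arange(24.0).reshape(4, 3, 2)
x0 = np.zeros(J.shape + (3,), dtype=int)
print(energy(J, x0, beta=0.5, delta=1.0))
```

A zero displacement field has no fidelity or regularization cost, which is why a noisy input only pays through the temporal-coherence term; the optimization trades that cost against moving pixels away from their original positions.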
Optimization
• Optimized discretely on a 3D MRF
  – Nodes represent pixels
  – State space of each pixel = volume of possible spatiotemporal shifts
• Complicated (huge!) inference problem
  – E.g. 500³ nodes, 10³ states per node
  – Optimize using Loopy Belief Propagation
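A back-of-the-envelope calculation shows why this inference problem is described as huge. The node and state counts come from the slide; the 6-connected grid and 32-bit message values are assumptions made here for the estimate.

```python
nodes = 500 ** 3         # pixels in the slide's example
states = 10 ** 3         # candidate spatiotemporal shifts per pixel
neighbors = 6            # assumption: 6-connected 3D grid
bytes_per_value = 4      # assumption: 32-bit floats

# Belief propagation keeps one message (a vector over states) per directed edge.
message_values = nodes * neighbors * states
message_bytes = message_values * bytes_per_value
print(f"{message_bytes / 1e12:.1f} TB of messages")  # prints "3.0 TB of messages"
```

Under these assumptions the messages alone take about 3 TB, far more than fits in memory.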
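Optimization
• Potential functions and message passing
  – Message structure stored on disk; message chunks are read and written as needed

Data term (linear in state space; pre-compute):
$$\omega_q(x(q)) = \left| J(q + x(q)) - J(q) \right|$$

Temporal neighbors $s \in O_u(q)$ (quadratic in state space, non-convex):
$$\omega^u_{qs}(x(q), x(s)) = \beta \left\| J(q + x(q)) - J(s + x(s)) \right\|^2 + \delta \mu_{qs} \left| x(q) - x(s) \right|$$

Other neighbors $r \in O(q)$ (quadratic in state space, but can be computed in linear time using distance transforms):
$$\omega_{qr}(x(q), x(r)) = \delta \mu_{qr} \left| x(q) - x(r) \right|$$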
Multi-scale Processing
• Spatiotemporal video pyramid
  – Smooth spatially
  – Sample temporally
• Displacements in the coarse level used as centers for the search volume in the finer level
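One possible reading of the two steps above is sketched below. The [1, 2, 1] / 4 blur kernel, the temporal sampling factor of 2, the function names, and the (eps_u, eps_z, eps_y) displacement layout are all assumptions for illustration; the slides do not specify them.

```python
import numpy as np

def smooth_spatially(video):
    """Blur each frame with a separable [1, 2, 1] / 4 kernel (edges replicated)."""
    v = np.asarray(video, dtype=float)
    for axis in (1, 2):  # rows, then columns; axis 0 is time
        pad = [(0, 0)] * v.ndim
        pad[axis] = (1, 1)
        p = np.pad(v, pad, mode="edge")
        n = v.shape[axis]
        lo = np.take(p, np.arange(0, n), axis=axis)
        mid = np.take(p, np.arange(1, n + 1), axis=axis)
        hi = np.take(p, np.arange(2, n + 2), axis=axis)
        v = (lo + 2.0 * mid + hi) / 4.0
    return v

def build_pyramid(video, levels):
    """Return [finest, ..., coarsest]; each level is blurred and keeps every 2nd frame."""
    pyramid = [np.asarray(video, dtype=float)]
    for _ in range(levels - 1):
        pyramid.append(smooth_spatially(pyramid[-1])[::2])
    return pyramid

def search_centers_from_coarse(x_coarse, fine_frames):
    """Turn a coarse displacement field into search-volume centers for the finer level.

    x_coarse: (U/2, Z, Y, 3) integer displacements ordered (eps_u, eps_z, eps_y).
    Each coarse frame covers two fine frames, and temporal shifts double in length.
    """
    x = np.repeat(x_coarse, 2, axis=0)[:fine_frames].copy()
    x[..., 0] *= 2
    return x

video = np.random.default_rng(0).random((8, 4, 4))
pyramid = build_pyramid(video, levels=3)
print([level.shape for level in pyramid])  # [(8, 4, 4), (4, 4, 4), (2, 4, 4)]
```

Searching a small volume around the centers inherited from the coarser level keeps the per-pixel state space small even when the total temporal displacement is large.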
Results [Figure: axes x, y, and time (past to future)]
Comparing with Other Optimization Techniques
Results [Figure: axes x, y, and time (past to future)]
Results
Comparison with Naïve Temporal Filtering [Figure: axes x and t]
Support Size
Motion-scale Decomposition
Other Scenarios
Future Work
• User-controlled motion scales
  – Not necessarily a binary decomposition into long-term and short-term
• Modify the time-lapse capturing process to help post-processing
  – E.g. use short videos instead of still images and find the best “path” through the video
• Explore motion denoising with time-lapse from other domains
  – Embryo research, satellite imagery
Thank you! http://csail.mit.edu/mrub/timelapse