Selectively De-Animating Video Jiamin Bai, Aseem Agarwala, Maneesh Agrawala, Ravi Ramamoorthi SIGGRAPH 2012 CS 448V: Computational Video Manipulation
Inspiration http://cinemagraphs.com/
Cinemagraphs
De-Animating Video
Example Walkthrough
Cinemagraphs
System Diagram
Warping: Tracking
• K(s, t): set of tracks as a table of 2D coordinates, where s = track index and t = time (frame number)
• K_G(s, t): subset of tracks that lie on the user-indicated region; K_G = K_A ∪ K_F (anchor tracks K_A and floating tracks K_F)
• K'_G(s, t): locations of tracks after warping
• t_a: reference frame
• De-animation constraint: K'_G(s, t) = K_G(s, t_a)
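As a concrete illustration, here is a minimal numpy sketch of this track bookkeeping (not the authors' code; the array layout and names such as `deanimation_targets` and `in_region` are assumptions made for the example):

```python
import numpy as np

def deanimation_targets(tracks, in_region, t_a):
    """Build the de-animation targets K'_G(s, t) = K_G(s, t_a).

    tracks    : (S, T, 2) array of 2D track coordinates K(s, t)
    in_region : (S,) boolean mask, True for tracks inside the
                user-indicated region (these form K_G)
    t_a       : index of the reference frame
    """
    K_G = tracks[in_region]                 # (S_G, T, 2)
    ref = K_G[:, t_a:t_a + 1, :]            # K_G(s, t_a)
    return np.broadcast_to(ref, K_G.shape)  # same location in every frame

# Toy usage: 5 tracks over 10 frames, 3 of them in the user region.
tracks = np.random.rand(5, 10, 2)
in_region = np.array([True, True, True, False, False])
print(deanimation_targets(tracks, in_region, t_a=0).shape)   # (3, 10, 2)
```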
System Diagram
Warping: Initial Warp
E = E_a + ω E_s
• E_a: main constraint, pulls the warped anchor-track locations K'_A(s, t) toward their reference-frame positions
• E_s: shape-preserving term
• ω: weighting function
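Below is a rough numpy sketch of evaluating such an energy for one frame. It is a stand-in, not the paper's formulation (the paper warps a mesh with a similarity-transform shape term and bilinear anchor coordinates; the grid layout, nearest-vertex anchors, and edge-length shape term here are simplifying assumptions):

```python
import numpy as np

def initial_warp_energy(V, V0, anchor_ij, anchor_targets, omega=1.0):
    """Simplified E = E_a + omega * E_s for one frame.

    V              : (H, W, 2) warped mesh-vertex positions (the unknowns)
    V0             : (H, W, 2) undeformed mesh-vertex positions
    anchor_ij      : (N, 2) integer (row, col) of the vertex nearest each anchor
    anchor_targets : (N, 2) desired anchor positions K_A(s, t_a)
    """
    # E_a: anchor tracks should land on their reference-frame positions.
    warped = V[anchor_ij[:, 0], anchor_ij[:, 1]]
    E_a = np.sum((warped - anchor_targets) ** 2)

    # E_s: crude shape term -- penalize changes in mesh-edge vectors
    # (the paper uses a similarity-transform preserving term instead).
    def edge_change(axis):
        return np.sum((np.diff(V, axis=axis) - np.diff(V0, axis=axis)) ** 2)

    E_s = edge_change(0) + edge_change(1)
    return E_a + omega * E_s

# Toy usage on a 6x8 grid with two anchors.
H, W = 6, 8
ys, xs = np.mgrid[0:H, 0:W]
V0 = np.stack([xs, ys], axis=-1).astype(float)
print(initial_warp_energy(V0.copy(), V0,
                          anchor_ij=np.array([[3, 2], [1, 5]]),
                          anchor_targets=np.array([[2.2, 3.1], [4.9, 1.0]]),
                          omega=0.5))
```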
System Diagram
Warping: Refined Warp
E = E_a + E_f + ω E_s
• E_f: term on the floating tracks; they cannot simply be pinned to the reference frame (K'_F(s, t_a) may not exist, since floating tracks need not span the whole video), so consecutive warped positions K'_F(s, t) and K'_F(s, t+1) are constrained to agree instead
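A small sketch of the extra floating-track term, under the same caveats as above (the NaN handling for tracks that are invisible in some frames is a guess made for the example):

```python
import numpy as np

def floating_term(warped_floating):
    """E_f sketch: warped floating tracks K'_F should not move between
    consecutive frames, so penalize K'_F(s, t+1) - K'_F(s, t).

    warped_floating : (S_F, T, 2) warped floating-track positions,
                      NaN where a track is not visible.
    """
    diff = warped_floating[:, 1:, :] - warped_floating[:, :-1, :]
    return np.nansum(diff ** 2)   # NaN entries (missing frames) contribute 0

# Toy usage: 4 floating tracks over 10 frames, one appears late.
F = np.random.rand(4, 10, 2)
F[0, :3] = np.nan
E_f = floating_term(F)
E = 0.0 + E_f + 1.0 * 0.0        # E = E_a + E_f + omega * E_s (other terms as before)
print(E)
```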
Warping: Result
System Diagram
Candidate Video Volumes
Labels L = W ∪ S
• Dynamic: copies of the warped video W(x, y, t); W = {W}, or {W_i, W_j} if the output should loop seamlessly
• Static: still frames from the input video repeated to fill the duration of the output; S = {I_b, I_2b, ..., I_5b} or a "clean plate", where b is a time interval that evenly samples the input five times and I_b is the video in which the b-th frame of the input is repeated for the duration of the output
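A hypothetical numpy sketch of assembling these candidates (the choice of b = T // (n_static + 1), the loop offset, and the function name are assumptions for the example; the clean-plate option is omitted):

```python
import numpy as np

def candidate_volumes(warped, out_len, n_static=5, loop=False):
    """Sketch of the label set L = W ∪ S.

    warped  : (T, H, W, 3) de-animated (warped) input video
    out_len : number of frames in the output
    """
    T = warped.shape[0]

    # Dynamic candidates: copies of the warped video (two time-shifted
    # copies W_i, W_j when the result should loop seamlessly).
    dynamic = [warped[:out_len]]
    if loop:
        dynamic.append(np.roll(warped, -(T // 2), axis=0)[:out_len])

    # Static candidates: still frames I_b, I_2b, ..., repeated in time,
    # with b chosen so the input is sampled n_static times.
    b = T // (n_static + 1)
    static = [np.repeat(warped[k * b][None], out_len, axis=0)
              for k in range(1, n_static + 1)]
    return dynamic, static

# Toy usage on a random "video".
video = np.random.rand(30, 4, 4, 3)
dyn, stat = candidate_volumes(video, out_len=20, loop=True)
print(len(dyn), len(stat))   # 2 dynamic and 5 static candidates
```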
Compositing: Graph-cut (seam cut between the two warped-video copies W_i and W_j; frames t = t_j - 10 and t = t_j - 11 shown)
Compositing: Labeling Constraints
From user-drawn compositing strokes, v(x, y) ∈ {red, blue, NULL}:
• If v(x, y) = blue, then λ(x, y, t) ∈ W (dynamic)
• If v(x, y) = red, then λ(x, y, t) ∈ S (static)
For seamless looping (first and last output frames, t = 0 and t = 20 here):
• λ(x, y, 0) ≠ W_i
• λ(x, y, 20) ≠ W_j
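These hard constraints can be expressed as a per-pixel mask of allowed labels before running the graph cut; the sketch below is an illustration with assumed names and label ordering, not the paper's implementation:

```python
import numpy as np

def allowed_labels(strokes, out_len, n_dynamic, n_static):
    """Hard labeling constraints for the compositing MRF.

    strokes : (H, W) int array of compositing strokes:
              0 = none, 1 = blue (keep dynamic), 2 = red (keep static)
    Labels are ordered [W_0, ..., W_{n_dynamic-1}, S_0, ..., S_{n_static-1}].
    Returns a boolean array allowed[t, y, x, label].
    """
    H, W = strokes.shape
    n_labels = n_dynamic + n_static
    allowed = np.ones((out_len, H, W, n_labels), dtype=bool)

    is_dynamic = np.arange(n_labels) < n_dynamic
    allowed[:, strokes == 1, :] = is_dynamic     # blue: label must be in W
    allowed[:, strokes == 2, :] = ~is_dynamic    # red:  label must be in S

    # Seamless looping with two offset copies W_i (label 0) and W_j (label 1):
    # lambda(x, y, 0) != W_i and lambda(x, y, last) != W_j.
    if n_dynamic == 2:
        allowed[0, :, :, 0] = False
        allowed[-1, :, :, 1] = False
    return allowed

# Toy usage: keep the top row animated, freeze the bottom row.
strokes = np.zeros((4, 4), dtype=int)
strokes[0, :] = 1
strokes[3, :] = 2
print(allowed_labels(strokes, out_len=21, n_dynamic=2, n_static=5).shape)
```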
Compositing: Energy Function
Minimize (via graph-cut) a seam cost over neighboring pixels p_1, p_2:
• RGB differences between the overlapping candidates, e.g. the color of pixel p_2 in candidate video volume λ(p_1) versus its color in λ(p_2)
• Normalized by edge strengths, so seams prefer to run along strong edges
• For seams between dynamic and static regions, only the dynamic candidates are considered
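For intuition, here is a toy sketch of one pairwise seam cost in this spirit (RGB mismatch normalized by an edge-strength estimate); it simplifies away the special dynamic-vs-static handling and is not the paper's exact cost:

```python
import numpy as np

def seam_cost(candidates, lam, p1, p2, eps=1e-6):
    """Pairwise seam cost between neighboring pixels p1, p2 = (t, y, x).

    candidates : dict label -> (T, H, W, 3) candidate video volume
    lam        : dict pixel -> label (the labeling being evaluated)
    """
    A = candidates[lam[p1]]          # volume chosen at p1
    B = candidates[lam[p2]]          # volume chosen at p2

    # RGB differences: how much the two candidates disagree at the seam.
    mismatch = (np.sum((A[p1] - B[p1]) ** 2) +
                np.sum((A[p2] - B[p2]) ** 2))

    # Edge strength between the two pixels, averaged over both candidates;
    # dividing by it makes seams cheaper along strong image edges.
    edge = 0.5 * (np.linalg.norm(A[p1] - A[p2]) +
                  np.linalg.norm(B[p1] - B[p2]))
    return mismatch / (edge + eps)

# Toy usage with one dynamic and one static candidate.
cands = {'W': np.random.rand(10, 4, 4, 3), 'S': np.random.rand(10, 4, 4, 3)}
lam = {(0, 1, 1): 'W', (0, 1, 2): 'S'}
print(seam_cost(cands, lam, (0, 1, 1), (0, 1, 2)))
```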
System Diagram
Results: Beer
Results: Model K
Results: Glass
Results: Video Editing
Results: Roulette
Results: Video Editing
Assumptions
• Input captured with a tripod (or previously stabilized)
• Large-scale motions can be de-animated with 2D warps
• Objects to de-animate are shot in front of a defocused, uniform, or uniformly-textured background
Limitations: 3D Motion
Limitations: Background
Limitations • What happens if the input video is not stabilized?
Follow-up • This system requires some manual annotation; how would you automate the user input? • Specifically, what would you do for faces?
Follow-up: Cinemagraph Portraits “Automatic Cinemagraph Portraits” Bai et al. EGSR 2013
Selectively De-Animating Video Jiamin Bai, Aseem Agarwala, Maneesh Agrawala, Ravi Ramamoorthi SIGGRAPH 2012 CS 448V: Computational Video Manipulation
Warping: Tracking
Warping: Initial vs Refined
Results: Existing Techniques
Adapted Cost Function: Graph-cut
User Input: De-animated Static (de-animate strokes, compositing strokes)
User Input: De-animated Dynamic (de-animate strokes, compositing strokes)
System Diagram