Lecture 18: Depth estimation
Announcements
• PS9 out tonight: panorama stitching
• New grading policies from UMich (details TBA)
• Final presentation will take place over video chat.
  - We'll send a sign-up sheet next week
Today
• Stereo matching
• Probabilistic graphical models
• Belief propagation
• Learning-based depth estimation
Basic stereo algorithm
For each epipolar line:
  For each pixel in the left image:
  • compare with every pixel on the same epipolar line in the right image
  • pick the pixel with minimum match cost
Improvement: match windows
Source: N. Snavely
Stereo matching based on SSD
[Plot: SSD match cost as a function of disparity d; the best matching disparity is the one with minimum SSD]
Source: N. Snavely
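A minimal NumPy sketch of this window-based SSD matching, shown for a single pixel on one scanline. The function names, window size, and disparity range are illustrative choices, not the course's reference implementation.

```python
import numpy as np

def ssd(window_left, window_right):
    # Sum of squared differences between two equally sized windows.
    diff = window_left.astype(np.float64) - window_right.astype(np.float64)
    return np.sum(diff ** 2)

def best_disparity(left, right, x, y, max_disp=64, half_w=3):
    """Pick the disparity with minimum SSD cost for pixel (x, y).

    left, right: rectified grayscale images, so corresponding pixels lie on
    the same row. Assumes the window stays inside both images, i.e.
    x - max_disp - half_w >= 0.
    """
    win_l = left[y - half_w:y + half_w + 1, x - half_w:x + half_w + 1]
    costs = []
    for d in range(max_disp + 1):
        win_r = right[y - half_w:y + half_w + 1,
                      x - d - half_w:x - d + half_w + 1]
        costs.append(ssd(win_l, win_r))
    return int(np.argmin(costs))  # disparity with the lowest match cost
```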
Window size
[Disparity results for window sizes W = 3 and W = 20]
Source: N. Snavely
Stereo as energy minimization
What defines a good stereo correspondence?
1. Match quality: want each pixel to find a good match in the other image
2. Smoothness: if two pixels are adjacent, they should (usually) move about the same amount
Source: N. Snavely
Stereo as energy minimization
• Find the disparity map d that minimizes an energy function
• Simple pixel / window matching: the match cost is the squared distance between windows I(x, y) and J(x + d(x, y), y)
Source: N. Snavely
Stereo as energy minimization
[Figure: left image I(x, y), right image J(x, y), scanline y = 141; the match cost C(x, y, d) plotted over x and d is the disparity space image (DSI)]
Source: N. Snavely
Stereo as energy minimization
[Figure: DSI slice for the scanline y = 141, with axes x and d]
Simple pixel / window matching: choose the minimum of each column in the DSI independently, $d(x, y) = \arg\min_{d'} C(x, y, d')$
Source: N. Snavely
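A sketch of how the DSI slice for one scanline could be built and then greedily minimized per column, assuming squared pixel differences as the cost; the function name and the choice of a per-pixel (rather than windowed) cost are illustrative.

```python
import numpy as np

def disparity_space_image(left_row, right_row, max_disp=64):
    """Build one DSI slice C(x, d) for a single scanline.

    left_row, right_row: 1-D float arrays of intensities on the same
    epipolar line of two rectified images. Costs are squared differences;
    disparities that fall outside the right image get an infinite cost.
    """
    n = len(left_row)
    dsi = np.full((max_disp + 1, n), np.inf)
    for d in range(max_disp + 1):
        if d == 0:
            dsi[d, :] = (left_row - right_row) ** 2
        else:
            dsi[d, d:] = (left_row[d:] - right_row[:-d]) ** 2
    return dsi

# Greedy per-pixel matching: take the minimum of each column independently.
# dsi = disparity_space_image(left_row, right_row)
# disparities = np.argmin(dsi, axis=0)
```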
Greedy selection of best match
Source: N. Snavely
Stereo as energy minimization
• Better objective function: $E(d) = E_d(d) + \lambda E_s(d)$
  - match cost: want each pixel to find a good match in the other image
  - smoothness cost: adjacent pixels should (usually) move about the same amount
Source: N. Snavely
Stereo as energy minimization
match cost: $E_d(d) = \sum_{(x,y)} C(x, y, d(x, y))$
smoothness cost: $E_s(d) = \sum_{(p,q) \in \mathcal{E}} V(d_p, d_q)$
$\mathcal{E}$: set of neighboring pixels, e.g. a 4-connected or 8-connected neighborhood
Source: N. Snavely
Smoothness cost
How do we choose V?
• $L_1$ distance: $V(d_p, d_q) = |d_p - d_q|$
• "Potts model": $V(d_p, d_q) = 0$ if $d_p = d_q$, and $1$ otherwise
Source: N. Snavely
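A small sketch of evaluating the full energy $E(d) = E_d(d) + \lambda E_s(d)$ for a candidate disparity map on a 4-connected grid. The Potts and $L_1$ penalties follow the definitions above, while the function names and the default $\lambda$ are arbitrary choices.

```python
import numpy as np

def potts(a, b):
    # Potts model: 0 if the labels agree, 1 otherwise.
    return float(a != b)

def l1(a, b):
    # L1 distance between neighboring disparities.
    return abs(float(a) - float(b))

def energy(d, cost_volume, lam=1.0, V=potts):
    """E(d) = E_d(d) + lambda * E_s(d) for a disparity map d.

    d:            (H, W) integer disparity map
    cost_volume:  (D, H, W) match costs C(x, y, d), e.g. SSD per disparity
    V:            pairwise smoothness penalty over a 4-connected neighborhood
    """
    H, W = d.shape
    ys, xs = np.mgrid[0:H, 0:W]
    match_cost = cost_volume[d, ys, xs].sum()     # E_d: sum of the chosen match costs

    smooth_cost = 0.0
    for y in range(H):                            # E_s: horizontal and vertical neighbors
        for x in range(W):
            if x + 1 < W:
                smooth_cost += V(d[y, x], d[y, x + 1])
            if y + 1 < H:
                smooth_cost += V(d[y, x], d[y + 1, x])
    return match_cost + lam * smooth_cost
```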
Probabilistic interpretation
Exponentiate: $\exp(-E(d)) = \exp(-(E_d(d) + \lambda E_s(d)))$
Normalize (make it sum to 1): $P(d \mid I) = \frac{1}{Z} \exp(-(E_d(d) + \lambda E_s(d)))$, where $Z = \sum_{d'} \exp(-E(d'))$
Rewrite: express $P(d \mid I)$ as a product of per-pixel and per-edge terms (next slide)
Example adapted from Freeman, Torralba, Isola
Probabilistic interpretation
$P(d \mid I)$ factors into:
• "Local evidence": how good are the matches?
• "Pairwise compatibility": is the depth smooth?
Example adapted from Freeman, Torralba, Isola
Probabilistic interpretation
Local evidence: the per-pixel terms of $P(d \mid I)$
Pairwise compatibility: the per-edge terms of $P(d \mid I)$
Example adapted from Freeman, Torralba, Isola
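Written out in the standard MRF form (the symbols $\phi$ and $\psi$ are mine; the slide's notation may differ), the posterior from the previous slide factorizes as:

```latex
P(d \mid I) \;=\; \frac{1}{Z}
  \underbrace{\prod_i \phi_i(d_i)}_{\text{local evidence}}
  \underbrace{\prod_{(i,j) \in \mathcal{E}} \psi_{ij}(d_i, d_j)}_{\text{pairwise compatibility}},
\qquad
\phi_i(d_i) = e^{-C(x_i, y_i, d_i)}, \quad
\psi_{ij}(d_i, d_j) = e^{-\lambda V(d_i, d_j)}
```

Low match costs and smooth neighboring disparities both make $P(d \mid I)$ large.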
Probabilistic graphical models
Graph structure:
• Open circles for latent variables x_i (d_i in our problem)
• Filled circles for observations y_i (pixels in our problem)
• Edges between interacting variables
  - In general, graph cliques for interactions among 3+ variables
[Figure: graphical model for P(d | I)]
Example adapted from Freeman, Torralba, Isola
Probabilistic graphical models
Why formulate it this way?
• Exploit the sparse graph structure for fast inference, usually using dynamic programming
• Can use probabilistic inference methods
• Provides a framework for learning parameters
Probabilistic graphical models
• Undirected graphical model: also known as a Markov Random Field (MRF)
• Directed graphical model: also known as a Bayesian network (not covered in this course)
Marginalization
What is the marginal distribution of x_1? I.e., what is the probability of x_1 being in a particular state?
$p(x_1 \mid \mathbf{y}) = \sum_{x_2} \sum_{x_3} p(x_1, x_2, x_3 \mid \mathbf{y})$
• But this is expensive: O(|L|^N) for N variables with |L| states each
• Exploit the graph structure!
Marginalization
[Derivation: the sums over x_2 and x_3 are pushed inside the product, exploiting the chain structure of the graph]
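For the three-node chain $x_1 - x_2 - x_3$ used in these slides, that rearrangement has the following standard form (with local terms $\phi$ and pairwise terms $\psi$ as before; my notation, not necessarily the slide's):

```latex
p(x_1 \mid \mathbf{y})
  \;\propto\; \sum_{x_2} \sum_{x_3}
      \phi_1(x_1)\,\psi_{12}(x_1, x_2)\,\phi_2(x_2)\,\psi_{23}(x_2, x_3)\,\phi_3(x_3)
  \;=\; \phi_1(x_1) \sum_{x_2} \psi_{12}(x_1, x_2)\,\phi_2(x_2)
        \underbrace{\sum_{x_3} \psi_{23}(x_2, x_3)\,\phi_3(x_3)}_{m_{32}(x_2)}
```

The naive double sum costs $O(|L|^N)$ in general, while the rearranged version only ever sums over one variable at a time, giving $O(N|L|^2)$ on a chain.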
Message passing
Can think of this rearrangement as "local evidence" plus message passing:
• m_32: the message that node x_3 sends to node x_2
• m_21: the message that x_2 sends to x_1
Message passing
• Message m_ij sums over the states of all the nodes in the subtree that reaches node j through node i
• It summarizes what that part of the graph "believes"
• E.g., given label x_2, what is the probability of my subgraph?
• Shared computation! E.g., we could reuse m_32 to help estimate p(x_2 | y)
Belief propagation
• Estimate all marginals p(x_i | y) at once! [Pearl 1982]
• Given a tree-structured graph, send messages in topological order
Sending a message from j to i:
1. Multiply all incoming messages at j (except for the one from i)
2. Multiply by the pairwise compatibility
3. Marginalize over x_j
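In equations (the standard sum-product form; symbols $\phi$, $\psi$, $m$ as in the sketches above), steps 1-3 for sending a message from node $j$ to node $i$, and the resulting belief, are: the product over $k$ is step 1, $\psi_{ij}$ is step 2, and the sum over $x_j$ is step 3.

```latex
m_{j \to i}(x_i) \;=\; \sum_{x_j} \psi_{ij}(x_i, x_j)\,\phi_j(x_j)
    \prod_{k \in \mathcal{N}(j) \setminus \{i\}} m_{k \to j}(x_j),
\qquad
b_i(x_i) \;\propto\; \phi_i(x_i) \prod_{k \in \mathcal{N}(i)} m_{k \to i}(x_i)
```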
General graphs
• Vision problems are often defined on grid graphs
• Pretend the graph is tree-structured and do belief propagation iteratively!
• Can also enforce consistency among N > 2 variables
  - But the complexity is exponential in N!
Loopy belief propagation:
1. Initialize all messages to 1
2. Walk through the edges in an arbitrary order (e.g. random)
3. Apply the message updates
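A rough NumPy sketch of loopy sum-product BP on a 4-connected grid for stereo, using the Potts smoothness term from earlier. The synchronous update schedule, variable names, and the np.roll boundary handling are simplifications of my own, not the lecture's reference implementation.

```python
import numpy as np

def loopy_bp_stereo(cost_volume, lam=1.0, n_iters=10):
    """Loopy sum-product BP for stereo on a 4-connected grid (a rough sketch).

    cost_volume: (D, H, W) match costs C(x, y, d), kept on a modest scale so
    that exp(-cost) does not underflow (a real implementation would work in
    log space). Potts smoothness with weight lam.
    """
    D, H, W = cost_volume.shape
    phi = np.exp(-cost_volume)                    # local evidence phi_i(d_i), shape (D, H, W)
    psi = np.exp(-lam * (1.0 - np.eye(D)))        # Potts pairwise compatibility psi(d_i, d_j)
    dirs = [(0, 1), (0, -1), (1, 0), (-1, 0)]     # neighbor offsets: right, left, down, up
    # msgs[(dy, dx)][:, y, x] = message pixel (y, x) receives from its neighbor at (y+dy, x+dx)
    msgs = {d: np.ones((D, H, W)) for d in dirs}

    for _ in range(n_iters):
        for (dy, dx) in dirs:
            # 1. At every sender, multiply local evidence by all incoming messages,
            #    except the one coming back from the pixel we are sending to.
            prod = phi.copy()
            for (oy, ox) in dirs:
                if (oy, ox) != (-dy, -dx):
                    prod *= msgs[(oy, ox)]
            # 2./3. Multiply by the pairwise compatibility and marginalize over the
            #    sender's label; np.roll aligns each sender with its receiver
            #    (it wraps at the border, which a real implementation would handle).
            sender = np.roll(prod, shift=(-dy, -dx), axis=(1, 2))
            new_msg = np.einsum('ab,bhw->ahw', psi, sender)
            msgs[(dy, dx)] = new_msg / (new_msg.sum(axis=0, keepdims=True) + 1e-12)

    belief = phi.copy()                           # belief = local evidence * all incoming messages
    for d in dirs:
        belief *= msgs[d]
    return np.argmax(belief, axis=0)              # most probable disparity per pixel
```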
Finding best labels
Often we want the labels that jointly maximize the probability:
$\arg\max_{x_1, x_2, x_3} p(x_1, x_2, x_3 \mid \mathbf{y})$
This is called maximum a posteriori estimation (MAP estimation).
Marginal: $b(x_1 \mid \mathbf{y}) = \sum_{x_2, x_3} p(x_1, x_2, x_3 \mid \mathbf{y})$
"Max marginal" instead: $b(x_1 \mid \mathbf{y}) = \max_{x_2, x_3} p(x_1, x_2, x_3 \mid \mathbf{y})$, i.e. replace each sum with $\max_{x_j}$
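Concretely, the max-product message update is the sum-product update with the marginalization step replaced by a max (standard form; symbols as in the earlier sketch):

```latex
m^{\max}_{j \to i}(x_i) \;=\; \max_{x_j}\; \psi_{ij}(x_i, x_j)\,\phi_j(x_j)
    \prod_{k \in \mathcal{N}(j) \setminus \{i\}} m^{\max}_{k \to j}(x_j)
```

On a tree, taking the argmax of the resulting max-marginals (and backtracking the maximizing states) gives the MAP labeling.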
Application to stereo
[Felzenszwalb & Huttenlocher, "Efficient Belief Propagation for Early Vision", 2006]
Deep learning + MRF refinement
[Figure: query patch with positive and negative training examples from the left and right images]
CNN-based matching + MRF refinement [Zbontar & LeCun, 2015]
Learning to estimate depth without ground truth
Learn a "volume": color + occupancy
[Figure: viewpoints around a 3D scene]
[Mildenhall*, Srinivasan*, Tancik*, et al., Neural Radiance Fields, 2020]
Learning to estimate depth without ground truth
A good volume should reconstruct the input views
[Mildenhall*, Srinivasan*, Tancik*, et al. 2020]
Learning to estimate depth without ground truth
[Mildenhall*, Srinivasan*, Tancik*, et al. 2020]
Learning to estimate depth without ground truth
• View synthesis
• Inserting virtual objects
[Mildenhall*, Srinivasan*, Tancik*, et al. 2020]
Next class: motion