Administrivia
This is the last lecture! The next two will be project presentations by you.
‣ Upload your presentations on Moodle by 11 AM, Thursday, Dec. 8
‣ 6 min presentation + 2 min of questions
‣ The order of presentations will be chosen randomly
Remaining grading
‣ Homework 3 will be posted later today
‣ Homework 4 (soon)
Questions?

Modeling images
Subhransu Maji
CMPSCI 670: Computer Vision
December 6, 2016

Modeling images
Learn a probability distribution over natural images.
(Figure: a natural photograph. Image credit: Flickr @Kenny (zoompict) Teo.)
Many applications:
‣ image synthesis: sample x from P(x)
‣ image denoising: find the most-likely clean image given a noisy image
‣ image deblurring: find the most-likely crisp image given a blurry image

Modeling images: challenges
How many 64x64 binary images are there? 2^(64×64) = 2^4096 ≈ 10^1233, vastly more than the ~10^80 atoms in the known universe.
(Figure: 10 random 64x64 binary images. A good model should assign a natural image a relatively high probability, P(x) ∼ 1, and such noise images a probability P(x) ∼ 0.)

Assumption
‣ Each pixel is generated independently:
  P(x_{1,1}, x_{1,2}, ..., x_{64,64}) = P(x_{1,1}) P(x_{1,2}) ... P(x_{64,64})
‣ Is this a good assumption? (A sketch of sampling from this model follows below.)
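Not from the lecture, but a minimal sketch of what the independence assumption buys us: sample 64x64 binary images with every pixel drawn independently. The per-pixel probability of 0.5 is an arbitrary choice for illustration; the samples look like the random-noise images above rather than natural images.

```python
import numpy as np

rng = np.random.default_rng(0)

# Independent-pixel model: every pixel is an independent Bernoulli(0.5) draw.
# (0.5 is an assumption for illustration; any per-pixel marginal gives the
# same qualitative result.)
def sample_independent_binary_image(size=64, p=0.5):
    return (rng.random((size, size)) < p).astype(np.uint8)

samples = [sample_independent_binary_image() for _ in range(10)]
# Every sample is salt-and-pepper noise: ignoring dependencies between
# neighboring pixels throws away exactly the structure that makes an
# image look natural.
print(samples[0][:4, :4])
```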
Texture synthesis
Goal: create new samples of a given texture.
Many applications: virtual environments, hole-filling, texturing surfaces.
Alexei A. Efros and Thomas K. Leung, "Texture Synthesis by Non-parametric Sampling," Proc. International Conference on Computer Vision (ICCV), 1999.

The challenge
Need to model the whole spectrum: from repeated to stochastic texture.
(Figure: example textures that are repeated, stochastic, and both.)

Markov chains
A Markov chain is a sequence of random variables x_1, x_2, ..., where x_t is the state of the model at time t.
• Markov assumption: each state is dependent only on the previous one
  - the dependency is given by a conditional probability: P(x_t | x_{t-1})
• The above is actually a first-order Markov chain
• An N'th-order Markov chain: P(x_t | x_{t-1}, ..., x_{t-N})
Source: S. Seitz

Markov chain example: Text
"A dog is a man's best friend. It's a dog eat dog world out there."
(Figure: word-to-word transition matrix over the states a, dog, is, man's, best, friend, it's, eat, world, out, there, and "."; for example, "a" is followed by "dog" with probability 2/3 and by "man's" with probability 1/3, while most other words have a single successor with probability 1. A minimal sketch of building this table appears below.)
Source: S. Seitz
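Not from the lecture: a minimal sketch of computing the transition table in the example above. Treating "." as a word and lowercasing everything are simplifying assumptions of the sketch.

```python
from collections import Counter, defaultdict

text = "a dog is a man's best friend . it's a dog eat dog world out there ."
words = text.split()

# Count word -> next-word transitions (first-order Markov chain).
transitions = defaultdict(Counter)
for prev, nxt in zip(words[:-1], words[1:]):
    transitions[prev][nxt] += 1

# Normalize counts into conditional probabilities P(next | prev).
probs = {
    prev: {nxt: c / sum(counts.values()) for nxt, c in counts.items()}
    for prev, counts in transitions.items()
}

print(probs["a"])    # {'dog': 2/3, "man's": 1/3}, matching the lecture's table
print(probs["dog"])  # {'is': 1/3, 'eat': 1/3, 'world': 1/3}
```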
Text synthesis
Create plausible looking poetry, love letters, term papers, etc.
Most basic algorithm:
1. Build a probability histogram
➡ find all blocks of N consecutive words/letters in the training documents
➡ compute their probability of occurrence P(x_1, ..., x_N)
2. Given the previous words, compute the next word x_n by sampling from P(x_n | x_{n-1}, ..., x_{n-N+1})
(A minimal sketch of this sampler appears after the examples below.)
Dewdney, "A potpourri of programmed prose and prosody," Scientific American, 1989.
Source: S. Seitz

Text synthesis
"As I've commented before, really relating to someone involves standing next to impossible."
"One morning I shot an elephant in my arms and kissed him."
"I spent an interesting evening recently with a grain of salt."
WE NEED TO EAT CAKE
Slide from Alyosha Efros, ICCV 1999

Synthesizing computer vision text
What do we get if we extract the probabilities from a chapter on Linear Filters, and then synthesize new statements?

Synthesized text
"This means we cannot obtain a separate copy of the best studied regions in the sum."
"All this activity will result in the primate visual system."
"The response is also Gaussian, and hence isn't bandlimited."
"Instead, we need to know only its response to any data vector, we need to apply a low pass filter that strongly reduces the content of the Fourier transform of a very large standard deviation."
"It is clear how this integral exist (it is sufficient for all pixels within a 2k+1 × 2k+1 × 2k+1 × 2k+1 — required for the images separately."
Check out Yisong Yue's website implementing text generation: build your own text Markov Chain for a given text corpus. http://www.yisongyue.com/shaney/index.php
Kristen Grauman
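A minimal sketch (not from the lecture) of the basic N-gram algorithm described above: build a histogram of blocks of N consecutive words, then repeatedly sample the next word given the previous N-1 words. The file name `corpus.txt` and the function names are hypothetical.

```python
import random
from collections import Counter, defaultdict

def build_ngram_model(words, N=3):
    """Histogram of blocks of N consecutive words: maps the previous N-1
    words to a Counter over possible next words."""
    model = defaultdict(Counter)
    for i in range(len(words) - N + 1):
        context = tuple(words[i:i + N - 1])
        model[context][words[i + N - 1]] += 1
    return model

def synthesize(model, seed, length=30):
    """Given the previous N-1 words, sample the next word from
    P(x_n | x_{n-1}, ..., x_{n-N+1}), then slide the context forward."""
    out = list(seed)
    for _ in range(length):
        counts = model.get(tuple(out[-len(seed):]))
        if not counts:  # context never seen in the corpus
            break
        words, weights = zip(*counts.items())
        out.append(random.choices(words, weights=weights)[0])
    return " ".join(out)

corpus = open("corpus.txt").read().split()   # hypothetical training document
model = build_ngram_model(corpus, N=3)
print(synthesize(model, seed=corpus[:2]))
```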
Markov random field
A Markov random field (MRF) is a generalization of Markov chains to two or more dimensions.
First-order MRF:
• the probability that pixel X takes a certain value is conditioned on the values of its neighbors A, B, C, and D: P(X | A, B, C, D)
(Figure: pixel X surrounded by its four neighbors A, B, C, and D.)
Source: S. Seitz

Texture synthesis
Can apply a 2D version of text synthesis: synthesize an output image from a texture corpus (sample).
Efros & Leung, ICCV 99

Texture synthesis: intuition
Before, we inserted the next word based on existing nearby words. Now we want to insert pixel intensities based on existing nearby pixel values.
(Figure: a sample of the texture (the "corpus") and the place where we want to insert the next pixel.)
The distribution of a pixel's value is conditioned on its neighbors alone.
Slide from Alyosha Efros, ICCV 1999

Synthesizing one pixel
(Figure: input image and partially synthesized image, with pixel p to be synthesized.)
‣ What is P(p | neighborhood of p)?
‣ Find all the windows in the input image that match the neighborhood
‣ To synthesize the value x of p:
➡ pick one matching window at random
➡ assign x to be the center pixel of that window
‣ An exact neighbourhood match might not be present, so find the best matches using SSD error and randomly choose between them, preferring better matches with higher probability
(A minimal sketch of this procedure follows below.)
Slide from Alyosha Efros, ICCV 1999
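A minimal grayscale sketch of synthesizing one pixel in the style described above. It is not the authors' implementation: the rule of keeping all windows within a (1 + eps) factor of the best SSD error and sampling one uniformly is a simplification of "prefer better matches with higher probability", and the parameter names are assumptions.

```python
import numpy as np

def synthesize_pixel(sample, neighborhood, mask, eps=0.1, rng=None):
    """Synthesize one pixel from a texture corpus (Efros-Leung style sketch).

    sample:        H x W texture corpus (grayscale)
    neighborhood:  k x k window centered on the pixel to synthesize
    mask:          k x k binary array, 1 where the neighborhood is already filled
    """
    rng = rng or np.random.default_rng()
    sample = sample.astype(float)
    neighborhood = neighborhood.astype(float)
    k = neighborhood.shape[0]
    H, W = sample.shape
    candidates, errors = [], []
    # Slide a k x k window over the corpus and score it by masked SSD error.
    for i in range(H - k + 1):
        for j in range(W - k + 1):
            window = sample[i:i + k, j:j + k]
            ssd = np.sum(mask * (window - neighborhood) ** 2)
            candidates.append(window[k // 2, k // 2])  # center pixel value
            errors.append(ssd)
    errors = np.array(errors)
    # Keep windows whose error is close to the best match, then pick one at random.
    good = errors <= errors.min() * (1 + eps) + 1e-8
    return rng.choice(np.array(candidates)[good])
```

Growing a full texture amounts to repeatedly calling such a routine on the next unfilled pixel along the boundary of the synthesized region.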
Neighborhood window
(Figure: the neighborhood window around the pixel being synthesized in the input.)
Slide from Alyosha Efros, ICCV 1999

Varying window size
(Figure: synthesis results for increasing window size.)
Slide from Alyosha Efros, ICCV 1999

Synthesis results
(Figure: results on "french canvas" and "rafia weave" textures.)
Slide from Alyosha Efros, ICCV 1999

Growing texture
• Starting from the initial image, "grow" the texture one pixel at a time.
Slide from Alyosha Efros, ICCV 1999
Synthesis results
(Figure: results on "white bread" and "brick wall" textures.)
Slide from Alyosha Efros, ICCV 1999

Failure cases
(Figure: "growing garbage" and verbatim copying of the input.)
Slide from Alyosha Efros, ICCV 1999

Extrapolation
(Figure: extrapolation example.)
Slide from Alyosha Efros, ICCV 1999
(Manual) texture synthesis in the media
http://www.dailykos.com/story/2004/10/27/22442/878

Image denoising
Given a noisy image, the goal is to infer the clean image.
(Figure: noisy image and clean image.)
Can you describe a technique to do this?
‣ Hint: we discussed this in an earlier class.

Bayesian image denoising
Given a noisy image y, we want to estimate the most-likely clean image x:
  arg max_x P(x | y) = arg max_x P(x) P(y | x) = arg max_x [log P(x) + log P(y | x)]
where log P(x) is the prior and log P(y | x) measures how well x explains the observations y.
‣ Observation term: P(y | x)
➡ Assume the noise is i.i.d. Gaussian: y_i = x_i + ε_i, ε_i ~ N(0, σ²)
➡ P(y | x) ∝ exp(−||y − x||² / 2σ²)
Thus, x* = arg max_x [log P(x) − λ||y − x||²]

Images as a collection of patches
Expected Patch Log-Likelihood (EPLL) [Zoran and Weiss, 2011]:
  log P(x) ≈ E_{p ∈ patch(x)} [log P(p)]
‣ EPLL: the expected log-likelihood of a randomly drawn patch p from the image x
‣ Intuitively, if all patches in an image have high log-likelihood, then the entire image also has high log-likelihood
‣ Advantage: modeling the patch likelihood P(p) is easier
EPLL objective for image denoising:
  x* = arg max_x [E_{p ∈ patch(x)} log P(p) − λ||y − x||²]
(A minimal numerical sketch of this objective appears below.)
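A minimal sketch (not from the paper) of evaluating the EPLL denoising objective for a candidate image x. It only scores a candidate; the actual method optimizes over x. The function name `patch_log_likelihood`, the toy isotropic-Gaussian prior, and the value of λ are assumptions of the sketch, standing in for a learned patch prior such as the GMM discussed next.

```python
import numpy as np

def extract_patches(x, k=8):
    """All overlapping k x k patches of image x, flattened to rows."""
    H, W = x.shape
    return np.array([x[i:i + k, j:j + k].ravel()
                     for i in range(H - k + 1)
                     for j in range(W - k + 1)])

def epll_objective(x, y, patch_log_likelihood, lam=0.5, k=8):
    """EPLL denoising objective:  E_{p in patch(x)}[log P(p)] - lam * ||y - x||^2.
    `patch_log_likelihood` maps an (n_patches, k*k) array to per-patch
    log-likelihoods under some patch prior."""
    patches = extract_patches(x, k)
    prior_term = patch_log_likelihood(patches).mean()   # expectation over patches
    data_term = lam * np.sum((y - x) ** 2)              # Gaussian observation model
    return prior_term - data_term

# Toy prior for illustration only: an isotropic Gaussian on DC-removed patches.
def toy_patch_log_likelihood(patches, sigma=0.2):
    centered = patches - patches.mean(axis=1, keepdims=True)
    return -np.sum(centered ** 2, axis=1) / (2 * sigma ** 2)
```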
Example [Zoran and Weiss, 2011]
Use Gaussian mixture models (GMMs) to model patch likelihoods.
Extract 8x8 patches from many images and learn a GMM.
(A minimal sketch of learning such a patch GMM appears below.)

Example [Zoran and Weiss, 2011]
Optimization requires reasoning about which "token" is present at each patch and how well that token explains the noisy image.
Gets tricky as patches overlap.

Zoran & Weiss, 11
(Figure: results from Zoran & Weiss, 2011.)

Image deblurring
Given a blurred image, the goal is to infer the crisp image.
(Figure: blurred image and crisp image.)
Can you describe a technique to do this?
‣ Hint: we discussed this in an earlier class.
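Returning to the GMM patch prior above: a minimal sketch of learning it with scikit-learn, not the authors' implementation. Removing the patch mean (DC) and the choice of 50 mixture components are assumptions of the sketch, and the training images are assumed to be loaded elsewhere.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def sample_patches(images, k=8, n_per_image=1000, rng=None):
    """Randomly crop k x k patches from grayscale images, flatten them,
    and remove each patch's mean (an assumption of this sketch)."""
    rng = rng or np.random.default_rng(0)
    patches = []
    for img in images:
        H, W = img.shape
        for _ in range(n_per_image):
            i = rng.integers(0, H - k + 1)
            j = rng.integers(0, W - k + 1)
            p = img[i:i + k, j:j + k].astype(float).ravel()
            patches.append(p - p.mean())
    return np.array(patches)

# images = [...]  # list of training images, loaded elsewhere
# X = sample_patches(images)
# gmm = GaussianMixture(n_components=50, covariance_type="full").fit(X)
# patch_log_likelihood = gmm.score_samples  # per-patch log P(p), usable in the EPLL sketch
```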
Bayesian image deblurring
Given a blurred image y, we want to estimate the most-likely crisp image x:
  arg max_x P(x | y) = arg max_x P(x) P(y | x) = arg max_x [log P(x) + log P(y | x)]
where log P(x) is the prior and log P(y | x) measures how well x explains the observations y.
‣ Observation term: P(y | x)
➡ Assume the noise is i.i.d. Gaussian and the blur kernel K is known: y = K * x + ε, ε_i ~ N(0, σ²)
➡ P(y | x) ∝ exp(−||y − K * x||² / 2σ²), so the blur imposes linear constraints on x
Thus, x* = arg max_x [log P(x) − λ||y − K * x||²]
(A minimal sketch of this objective appears after the summary below.)

Zoran & Weiss, 11
(Figure: results from Zoran & Weiss, 2011.)

Summary
Modeling large images is hard, but modeling small images (8x8 patches) is easier.
‣ This can take us quite far with many low-level vision tasks such as texture synthesis, denoising, deblurring, etc.
‣ But it fails to capture long-range interactions.
Variational Framework for Non-Local Inpainting, Vadim Fedorov, Gabriele Facciolo, Pablo Arias.
Modeling images is an open area of research. Some directions:
‣ Multi-scale representations
‣ Generative image modeling using CNNs (variational auto-encoders, generative adversarial networks, etc.)
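Returning to the deblurring objective above: a minimal sketch (not from the lecture) of one gradient descent step on λ||y − K*x||² − log P(x), assuming the blur kernel K is known. A toy smoothness prior on image gradients stands in for the learned patch prior, and the step size and λ are illustrative.

```python
import numpy as np
from scipy.signal import convolve2d

LAPLACIAN = np.array([[0, 1, 0],
                      [1, -4, 1],
                      [0, 1, 0]], dtype=float)

def map_gradient_step(x, y, K, lam=1.0, step=1e-3):
    """One gradient descent step on  lam*||y - K*x||^2 - log P(x),
    the negative of the MAP objective on the slide, with a toy smoothness
    prior -log P(x) = ||grad x||^2 in place of a learned prior."""
    residual = convolve2d(x, K, mode="same", boundary="symm") - y
    # Gradient of the data term: correlate the residual with K (the adjoint of blurring).
    grad_data = 2 * lam * convolve2d(residual, K[::-1, ::-1], mode="same", boundary="symm")
    # Gradient of ||grad x||^2 is -2 * Laplacian(x).
    grad_prior = -2 * convolve2d(x, LAPLACIAN, mode="same", boundary="symm")
    return x - step * (grad_data + grad_prior)
```

Repeating such steps decreases the objective; the patch-based prior from the preceding slides would replace the toy smoothness term in a full implementation.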