Image Pyramids
COMPSCI 527 — Computer Vision

Outline
1. Pyramids and Scale
2. (Spatial Frequency) Aliasing
3. Downsampling and Upsampling
4. Bilinear Interpolation
5. Gaussian (and Laplacian) Pyramid

Pyramids and Scale
[Figure: the smallest denticle we look for]
• Scale:
  • Start with the smallest template
  • Look for larger and larger occurrences
  • Larger template ≈ smaller image!

Scale Budgets
• n × n image, k × k template, scaling s > 1
• Processing a large image with progressively larger templates:
  $$n^2 (k^2 + k^2 s^2 + k^2 s^4 + \ldots) = n^2 k^2 (1 + s^2 + s^4 + \ldots)$$
• Series diverges
• Processing progressively smaller images with a small template:
  $$k^2 (n^2 + n^2/s^2 + n^2/s^4 + \ldots) = k^2 n^2 (1 + 1/s^2 + 1/s^4 + \ldots)$$
• Series converges to $k^2 n^2 \, s^2 / (s^2 - 1)$
• For s = 2, the series converges to $\tfrac{4}{3} k^2 n^2$
• About 33% additional cost relative to processing the original image alone
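
The two budgets can be checked numerically. The sketch below is only an illustration: the function names and the values n = 512, k = 8 are hypothetical, and each cost is counted as (image pixels) × (template pixels), as in the sums above.

```python
def growing_template_cost(n, k, s, levels):
    """Cost of scanning one n x n image with templates of side k, k*s, k*s^2, ..."""
    return sum(n**2 * (k * s**i) ** 2 for i in range(levels))


def shrinking_image_cost(n, k, s, levels):
    """Cost of scanning images of side n, n/s, n/s^2, ... with one k x k template."""
    return sum(k**2 * (n / s**i) ** 2 for i in range(levels))


def pyramid_cost_limit(n, k, s):
    """Closed form of the infinite sum: k^2 n^2 s^2 / (s^2 - 1)."""
    return k**2 * n**2 * s**2 / (s**2 - 1)


n, k, s = 512, 8, 2          # hypothetical sizes, for illustration only
base = k**2 * n**2           # cost of processing the original image alone
for levels in (1, 4, 8):
    print(levels,
          growing_template_cost(n, k, s, levels) / base,
          shrinking_image_cost(n, k, s, levels) / base)
print("limit:", pyramid_cost_limit(n, k, s) / base)
```

With s = 2 the first ratio grows by a factor of about 4 per added level, while the second stays below 4/3 no matter how many levels are added.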

Finer Scales
• Scaling down by s = 2 every time may be overly aggressive
• Let φ = 1/s be the downsampling factor
• For 0 < φ < 1, the image shrinks. For φ > 1, the image grows larger
• How to downsample (0 < φ < 1)?
• Two issues: aliasing and non-integer s

Aliasing
• Even when s is an integer, pure sampling is a bad idea: (spatial frequency) aliasing
• Colors are sampled at locations on the pixel grid, which have nothing to do with the scene
[Figure panels: Original | Sampled by s = 30, then magnified by 30]
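
A one-dimensional toy case shows the effect. This is a minimal sketch, assuming a single row of pixels with a hypothetical stripe pattern; `sample_every` is an illustrative name, not something from the slides.

```python
def sample_every(signal, s):
    """Pure sampling: keep one value out of every s, with no smoothing."""
    return signal[::s]


# Hypothetical test signal: stripes that alternate 0, 1, 0, 1, ... (period 2 pixels).
stripes = [i % 2 for i in range(24)]

print(sample_every(stripes, 2))  # every sample lands on a 0: the stripes vanish
print(sample_every(stripes, 3))  # samples alternate 0, 1, ... at a coarser pitch
```

In the first case the result depends only on where the sampling grid happens to fall. In the second, the output still alternates, so after magnification by 3 it would look like stripes three times wider than the real ones.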

Downsampling = Smoothing + Sampling
• Smooth with a Gaussian blur kernel first, then sample
[Figure panels: Original | Smoothed with σ = 48, then sampled by s = 30, then magnified by 30]
• We lose detail (blur), but that's the whole point
• True scale: every pixel in the low-resolution image is a weighted average of pixel values in the original image
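
The recipe can be sketched in one dimension with plain Python lists. Everything here is an assumption made for illustration: the function names, the truncation of the kernel at 3σ, the clamped borders, and the small values s = 2, σ = 3.2 (the slide's example uses s = 30, σ = 48).

```python
import math


def gaussian_kernel(sigma):
    """Normalized 1D Gaussian weights, truncated at 3 sigma (an assumed cutoff)."""
    radius = max(1, math.ceil(3 * sigma))
    weights = [math.exp(-0.5 * (i / sigma) ** 2) for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(signal, sigma):
    """Each output is a weighted average of nearby inputs (borders are clamped)."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    last = len(signal) - 1
    return [
        sum(w * signal[min(max(i + j - radius, 0), last)] for j, w in enumerate(kernel))
        for i in range(len(signal))
    ]


def downsample(signal, s, sigma):
    """Smooth first, then keep one sample out of every s."""
    return smooth(signal, sigma)[::s]


stripes = [i % 2 for i in range(64)]  # hypothetical test signal, period 2 pixels
print(stripes[::2][:6])               # pure sampling: all 0
print([round(v, 3) for v in downsample(stripes, 2, sigma=3.2)[8:14]])
```

Away from the borders the smoothed samples sit close to 0.5, the local average of the stripes, instead of a value that depends on where the sampling grid lands.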

Key Questions
• How much to smooth before resampling?
• That is, where does σ = 48 come from for φ = 1/30?
• Lots of theory for the optimal multiplier
• Depends on various factors (spectral properties of image and noise)
• We use what works most of the time, empirically
• Answer: σ ≈ 1.6 s = 1.6/φ
• How to "take one out of every s pixels" when s = 1/φ is not an integer?
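
As a quick check of the empirical rule, the following sketch computes σ from φ. The function name and the `multiplier` parameter are hypothetical; only the 1.6 factor comes from the slide.

```python
def smoothing_sigma(phi, multiplier=1.6):
    """Empirical rule from the slide: sigma is about 1.6 * s = 1.6 / phi."""
    if not 0 < phi < 1:
        raise ValueError("smoothing is only applied when the image shrinks (0 < phi < 1)")
    return multiplier / phi


print(smoothing_sigma(1 / 30))  # 1.6 * 30, about 48
print(smoothing_sigma(1 / 2))   # 3.2
```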

Bilinear Interpolation
• What does it mean to "take one out of every s pixels" when s = 1/φ is not an integer?
$$\xi = \lfloor x \rfloor, \quad \eta = \lfloor y \rfloor, \quad \Delta x = x - \xi, \quad \Delta y = y - \eta$$
$$\begin{aligned}
I(x, y) ={} & I(\xi, \eta)\,(1 - \Delta x)(1 - \Delta y) \\
 & + I(\xi + 1, \eta)\,\Delta x\,(1 - \Delta y) \\
 & + I(\xi, \eta + 1)\,(1 - \Delta x)\,\Delta y \\
 & + I(\xi + 1, \eta + 1)\,\Delta x\,\Delta y
\end{aligned}$$
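
The formula translates almost term by term into code. This is a minimal sketch under two assumptions that the slide does not state: the image is stored as a list of rows, so I(x, y) is `image[y][x]`, and the "+1" neighbors are clamped at the last row and column.

```python
import math


def bilinear(image, x, y):
    """Interpolate image at real-valued (x, y); image[row][col] with col = x, row = y."""
    xi, eta = math.floor(x), math.floor(y)
    dx, dy = x - xi, y - eta
    # Clamp the "+1" neighbors so points on the last row/column stay in bounds
    # (a border-handling assumption, not part of the slide's formula).
    xi1 = min(xi + 1, len(image[0]) - 1)
    eta1 = min(eta + 1, len(image) - 1)
    return (image[eta][xi] * (1 - dx) * (1 - dy)
            + image[eta][xi1] * dx * (1 - dy)
            + image[eta1][xi] * (1 - dx) * dy
            + image[eta1][xi1] * dx * dy)


image = [[0.0, 10.0],
         [20.0, 30.0]]
print(bilinear(image, 0.5, 0.5))   # 15.0, the mean of the four corners
print(bilinear(image, 1.0, 0.0))   # 10.0, exactly the pixel at x=1, y=0
```

The four weights are nonnegative and sum to 1, so the result is always a weighted average of the four surrounding pixels.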

Abstracting Pyramid Operations
J = resize(I, φ):
• If 0 < φ < 1, image shrinks: filter with σ = 1.6/φ, then sample every s = 1/φ > 1 pixels
• If φ ≥ 1, image grows: no filter. Just sample every s = 1/φ ≤ 1 pixels
• Pyramid operators: pick a single value of φ ∈ (0, 1), then define
  down(X) = resize(X, φ)
  up(X) = resize(X, 1/φ)
• up is not the inverse of down: cannot restore lost information
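
A one-dimensional sketch of these operators follows. It is not the course's implementation: the helper names, the 3σ kernel cutoff, the clamped borders, the output-length rule `round(len * phi)`, and the choice PHI = 0.5 are all assumptions, and linear interpolation stands in for bilinear interpolation on a single row.

```python
import math


def smooth(signal, sigma):
    """Gaussian blur with clamped borders, kernel truncated at 3 sigma."""
    radius = max(1, math.ceil(3 * sigma))
    weights = [math.exp(-0.5 * (j / sigma) ** 2) for j in range(-radius, radius + 1)]
    total = sum(weights)
    last = len(signal) - 1
    return [
        sum(w * signal[min(max(i + j - radius, 0), last)] for j, w in enumerate(weights)) / total
        for i in range(len(signal))
    ]


def interpolate(signal, x):
    """Linear interpolation, the 1D counterpart of bilinear interpolation."""
    x = min(max(x, 0.0), len(signal) - 1.0)
    xi = math.floor(x)
    dx = x - xi
    right = min(xi + 1, len(signal) - 1)
    return signal[xi] * (1 - dx) + signal[right] * dx


def resize(signal, phi):
    """Shrink (0 < phi < 1) or grow (phi >= 1) a 1D signal by the factor phi."""
    if phi <= 0:
        raise ValueError("phi must be positive")
    if phi < 1:
        signal = smooth(signal, 1.6 / phi)  # filter only when shrinking
    s = 1 / phi                             # sampling step, in input pixels
    length = max(1, round(len(signal) * phi))
    return [interpolate(signal, j * s) for j in range(length)]


PHI = 0.5  # hypothetical choice of the single pyramid factor


def down(x):
    return resize(x, PHI)


def up(x):
    return resize(x, 1 / PHI)


step = [0.0] * 16 + [1.0] * 16
restored = up(down(step))
print(len(step), len(down(step)), len(restored))  # 32 16 32
print(max(abs(a - b) for a, b in zip(step, restored)) > 0.1)  # True: the edge stays blurred
```

The restored signal has the original length, but the sharp step does not come back, which is the sense in which up is not the inverse of down.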

A Gaussian Pyramid (φ = 1/2)
• A lowpass pyramid: each level contains a subset of the lower spatial frequencies that are in the next-higher resolution level (blurring attenuates high frequencies)

A Laplacian Pyramid (φ = 1/2)
• A bandpass pyramid, because each level contains a (more or less) separate band of spatial frequencies
• The Laplacian pyramid is invertible
• Optional topic, see notes
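
The slides defer the construction to the notes, so the sketch below assumes the standard one: each Laplacian level is a Gaussian level minus the upsampled next-coarser level, and the coarsest Gaussian level is stored as the last entry. It works on a 1D signal with φ = 1/2, and all function names, the test signal, and the border handling are illustrative choices.

```python
import math

SIGMA = 3.2  # 1.6 / phi for phi = 1/2, following the slides' empirical rule


def smooth(x, sigma=SIGMA):
    """Gaussian blur with clamped borders, kernel truncated at 3 sigma."""
    r = math.ceil(3 * sigma)
    w = [math.exp(-0.5 * (j / sigma) ** 2) for j in range(-r, r + 1)]
    total, last = sum(w), len(x) - 1
    return [sum(wj * x[min(max(i + j - r, 0), last)] for j, wj in enumerate(w)) / total
            for i in range(len(x))]


def down(x):
    """Halve the resolution: smooth, then keep every second sample."""
    return smooth(x)[::2]


def up(x, length):
    """Double the resolution by linear interpolation, to exactly `length` samples."""
    last = len(x) - 1
    out = []
    for j in range(length):
        pos = min(j / 2, last)
        i = math.floor(pos)
        d = pos - i
        out.append(x[i] * (1 - d) + x[min(i + 1, last)] * d)
    return out


def gaussian_pyramid(x, levels):
    """Lowpass pyramid: level 0 is the input, each next level is down() of the previous."""
    pyramid = [x]
    for _ in range(levels - 1):
        pyramid.append(down(pyramid[-1]))
    return pyramid


def laplacian_pyramid(x, levels):
    g = gaussian_pyramid(x, levels)
    # Each band is what `up` fails to restore from the next-coarser level.
    bands = [[a - b for a, b in zip(g[i], up(g[i + 1], len(g[i])))]
             for i in range(levels - 1)]
    return bands + [g[-1]]  # keep the coarsest lowpass level as the last entry


def reconstruct(laplacian):
    """Invert the pyramid: upsample the running estimate and add back each band."""
    x = laplacian[-1]
    for band in reversed(laplacian[:-1]):
        x = [a + b for a, b in zip(band, up(x, len(band)))]
    return x


signal = [float((i // 4) % 2) for i in range(32)]  # hypothetical test signal
lap = laplacian_pyramid(signal, 3)
print([len(level) for level in lap])  # [32, 16, 8]
print(max(abs(a - b) for a, b in zip(signal, reconstruct(lap))))  # rounding error only
```

Reconstruction works here because each band stores exactly the difference that up cannot recover, so adding the bands back level by level undoes the subtraction regardless of how much down blurred.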