Image filtering and image features

  1. Image filtering and image features September 26, 2019

  2. Outline: Image filtering and image features
  • Images as signals
  • Color spaces and color features
  • 2D convolution
  • Matched filters
  • Gradient filters
  • Separable convolution
  • Accuracy spectrum of a 1-feature classifier

  3. Images as signals
  • x[n1, n2, c] = intensity in row n1, column n2, color plane c.
  • Most image formats (e.g., JPG, PNG, GIF, PPM) distribute images with three color planes: Red, Green, and Blue (RGB).
  • In this example (Arnold Schwarzenegger’s face), the grayscale image was created as
    $\bar{x}[n_1, n_2] = \frac{1}{3} \sum_{c \in \{R, G, B\}} x[n_1, n_2, c]$
  (Figure: the RGB image and its grayscale version, with row axis n1 and column axis n2.)
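
  The following sketch (not part of the slides) loads an RGB image into a numpy array shaped exactly like x[n1, n2, c] and forms the plain three-channel average described above. The filename "face.png" is a hypothetical placeholder, and the use of numpy and Pillow is an assumption.

    import numpy as np
    from PIL import Image  # assumption: Pillow is available for image I/O

    # Load an RGB image as an N1 x N2 x 3 array x[n1, n2, c]; "face.png" is a placeholder.
    x = np.asarray(Image.open("face.png").convert("RGB"), dtype=float)

    # Plain grayscale: average over the three color planes, as on this slide.
    x_bar = x.mean(axis=2)    # shape (N1, N2)
    print(x.shape, x_bar.shape)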

  4. Color spaces: RGB
  • Every natural object reflects a continuous spectrum of colors.
  • However, the human eye only has three color sensors:
    • Red cones are sensitive to lower frequencies
    • Green cones are sensitive to intermediate frequencies
    • Blue cones are sensitive to higher frequencies
  • By activating LED or other display hardware at just three discrete colors (R, G, and B), it is possible to fool the human eye into thinking that it sees a continuum of colors.
  • Therefore, most image file formats only code three discrete colors (RGB).
  (Illustration from Anatomy & Physiology, Connexions Web site, http://cnx.org/content/col11496/1.6/, Jun 19, 2013.)

  5. Color features: Luminance
  • The “grayscale” image is often computed as the average of R, G, and B intensities, i.e.,
    $\bar{x}[n_1, n_2] = \frac{1}{3} \sum_{c \in \{R, G, B\}} x[n_1, n_2, c]$
  • The human eye, on the other hand, is more sensitive to green light than to either red or blue.
  • The intensity of light, as viewed by the human eye, is well approximated by the ITU-R BT.601 standard:
    $x[n_1, n_2, Y] = 0.299\, x[n_1, n_2, R] + 0.587\, x[n_1, n_2, G] + 0.114\, x[n_1, n_2, B]$
  • This signal, $x[n_1, n_2, Y]$, is called the luminance of light at pixel $(n_1, n_2)$.
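
  A minimal sketch of the BT.601 weighting (not from the slides; the random array simply stands in for a real RGB image):

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.random((240, 320, 3))     # stand-in for an N1 x N2 x 3 RGB image in [0, 1]

    # ITU-R BT.601 luminance: a weighted sum that favors the green plane.
    luma = 0.299 * x[:, :, 0] + 0.587 * x[:, :, 1] + 0.114 * x[:, :, 2]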

  6. Color features: Chrominance
  • Chrominance = color-shift of the image.
  • We measure $P_R$ = red-shift and $P_B$ = blue-shift, relative to luminance (luminance is sort of green-based, remember?).
  • We want $P_R[n_1, n_2]$ and $P_B[n_1, n_2]$ to describe only the color-shift of the pixel, not its average luminance.
  • We do that using
    $\begin{bmatrix} Y \\ P_B \\ P_R \end{bmatrix} = \begin{bmatrix} \vec{w}_Y^T \\ \vec{w}_B^T \\ \vec{w}_R^T \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix}$,
    where $\mathrm{sum}(\vec{w}_R) = \mathrm{sum}(\vec{w}_B) = 0$.
  (Figure: the Cr and Cb plane at Y = 0.5. Simon A. Eugster, own work.)

  7. Color features: Chrominance
  $\begin{bmatrix} Y \\ P_B \\ P_R \end{bmatrix} = \begin{bmatrix} 0.299 & 0.587 & 0.114 \\ -0.168736 & -0.331264 & 0.5 \\ 0.5 & -0.418688 & -0.081312 \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix}$
  gives $\mathrm{sum}(\vec{w}_B) = \mathrm{sum}(\vec{w}_R) = 0$.
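
  A sketch applying this RGB-to-YPbPr matrix with numpy (an assumption about tooling, not part of the slides); it also checks that the two chroma rows sum to zero:

    import numpy as np

    # RGB -> (Y, Pb, Pr) matrix from the slide, one row per output channel.
    A = np.array([
        [ 0.299,     0.587,     0.114   ],   # w_Y
        [-0.168736, -0.331264,  0.5     ],   # w_B (Pb)
        [ 0.5,      -0.418688, -0.081312],   # w_R (Pr)
    ])
    print(A[1].sum(), A[2].sum())   # both 0 (up to rounding): chroma ignores overall luminance

    rng = np.random.default_rng(0)
    x = rng.random((240, 320, 3))               # stand-in for a real RGB image
    ypbpr = np.einsum('kc,ijc->ijk', A, x)      # apply the matrix at every pixel
    Y, Pb, Pr = ypbpr[..., 0], ypbpr[..., 1], ypbpr[..., 2]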

  8. Color features: Chrominance
  • Some images are obviously red! (e.g., fire, or wood)
  • Some images are obviously blue! (e.g., water, or sky)
  • Average(Pb) - Average(Pr) should be a good feature for distinguishing between, for example, “fire” versus “water”.

  9. Color features: norms
  • The average Pb value is
    $\bar{P}_B = \frac{1}{N_1 N_2} \sum_{n_1=0}^{N_1-1} \sum_{n_2=0}^{N_2-1} P_B[n_1, n_2]$
  • The problem with this feature is that it gives too much weight to small values of $P_B[n_1, n_2]$; some pixels might not be all that bluish, so some “water” images have a low average-pooled Pb.
  • The max Pb value is
    $\max(P_B) = \max_{n_1} \max_{n_2} P_B[n_1, n_2]$
  • The problem with this feature is that it gives too much weight to LARGE values of $P_B[n_1, n_2]$; in a “fire” image, there might be one or two blue pixels even though all of the others are red, so some “fire” images might have an unreasonably high max-pooled Pb.
  • The Frobenius norm is
    $\|P_B\|_F = \left( \sum_{n_1=0}^{N_1-1} \sum_{n_2=0}^{N_2-1} P_B^2[n_1, n_2] \right)^{1/2}$
  • The Frobenius norm emphasizes large values, but it doesn't depend only on the LARGEST value; it tends to resemble an average of the largest values.
  • In MP3, the Frobenius norm seems to work better than max-pooling or average-pooling. For other image processing problems, you might want to use average-pooling or max-pooling instead.
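
  The three pooling options compared above are one-liners in numpy; this sketch (not from the slides) just spells them out on a stand-in Pb plane:

    import numpy as np

    rng = np.random.default_rng(0)
    Pb = rng.standard_normal((240, 320))   # stand-in for a Pb chroma plane

    avg_pool  = Pb.mean()                  # dragged down by the many small values
    max_pool  = Pb.max()                   # dominated by a single outlier pixel
    frobenius = np.sqrt(np.sum(Pb ** 2))   # same as np.linalg.norm(Pb)
    print(avg_pool, max_pool, frobenius)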

  10. Outline: Image filtering and image features
  • Images as signals
  • Color spaces and color features
  • 2D convolution
  • Matched filters
  • Gradient filters
  • Separable convolution
  • Accuracy spectrum of a 1-feature classifier

  11. 2D convolution
  The 2D convolution is just like a 1D convolution, but in two dimensions:
  $x[n_1, n_2, c] \ast\ast\, h[n_1, n_2, c] = \sum_{m_1=0}^{N_1-1} \sum_{m_2=0}^{N_2-1} x[m_1, m_2, c]\, h[n_1 - m_1, n_2 - m_2, c]$
  Note that we don't convolve over the color plane, just over the rows and columns.
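
  A direct, loop-based implementation of this sum (restricted to the “valid” output region defined on the next slide) might look like the sketch below; it mirrors the formula rather than aiming for efficiency:

    import numpy as np

    def conv2d_valid(x, h):
        """2D convolution of one color plane, keeping only the 'valid' outputs.

        Computes z[n1, n2] = sum over (m1, m2) of x[m1, m2] * h[n1 - m1, n2 - m2]
        at the positions where the flipped filter fits entirely inside x.
        """
        N1, N2 = x.shape
        M1, M2 = h.shape
        h_flipped = h[::-1, ::-1]                 # convolution flips the filter
        z = np.zeros((N1 - M1 + 1, N2 - M2 + 1))
        for n1 in range(z.shape[0]):
            for n2 in range(z.shape[1]):
                z[n1, n2] = np.sum(x[n1:n1 + M1, n2:n2 + M2] * h_flipped)
        return z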

  12. Full, Valid, and Same-size convolution outputs
  $z[n_1, n_2, c] = \sum_{m_1=0}^{N_1-1} \sum_{m_2=0}^{N_2-1} x[m_1, m_2, c]\, h[n_1 - m_1, n_2 - m_2, c]$
  Suppose that x is an N1 x N2 image, while h is a filter of size M1 x M2. Then there are three possible ways to define the size of the output:
  • “Full” output: Both $x[n_1, n_2]$ and $h[n_1, n_2]$ are zero-padded prior to convolution, and then $z[n_1, n_2]$ is defined wherever the result can be nonzero. This gives $z[n_1, n_2]$ the size (N1+M1-1) x (N2+M2-1).
  • “Same” output: The output $z[n_1, n_2]$ has the size N1 x N2. This means that there is some zero-padding.
  • “Valid” output: The summation is only performed for values of (n1, n2, m1, m2) at which both x and h are well defined. This gives $z[n_1, n_2, c]$ the size (N1-M1+1) x (N2-M2+1).
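
  If SciPy is available (an assumption; the slides don't prescribe a library), scipy.signal.convolve2d exposes exactly these three conventions through its mode argument:

    import numpy as np
    from scipy.signal import convolve2d   # assumption: SciPy is installed

    x = np.ones((5, 7))    # an N1 x N2 "image" (N1=5, N2=7)
    h = np.ones((3, 3))    # an M1 x M2 filter  (M1=M2=3)

    print(convolve2d(x, h, mode='full').shape)    # (7, 9) = (N1+M1-1, N2+M2-1)
    print(convolve2d(x, h, mode='same').shape)    # (5, 7) = (N1, N2)
    print(convolve2d(x, h, mode='valid').shape)   # (3, 5) = (N1-M1+1, N2-M2+1)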

  13. Example: differencing
  Suppose we want to calculate the difference between each pixel and its second neighbor:
  $z[n_1, n_2] = x[n_1, n_2] - x[n_1, n_2 - 2]$
  We can do that as
  $z[n_1, n_2] = \sum_{m_1=0}^{N_1-1} \sum_{m_2=0}^{N_2-1} x[m_1, m_2]\, h[n_1 - m_1, n_2 - m_2]$
  where
  $h[n_1, n_2] = \begin{cases} 1 & n_1 = 0, n_2 = 0 \\ -1 & n_1 = 0, n_2 = 2 \\ 0 & \text{otherwise} \end{cases}$
  ...we often will write this as h = [1, 0, -1].

  14. Example: averaging
  Suppose we want to calculate a weighted average of each pixel and its two neighbors (left unnormalized here; the weights sum to 4):
  $z[n_1, n_2] = x[n_1, n_2] + 2\, x[n_1, n_2 - 1] + x[n_1, n_2 - 2]$
  We can do that as
  $z[n_1, n_2] = \sum_{m_1=0}^{N_1-1} \sum_{m_2=0}^{N_2-1} x[m_1, m_2]\, h[n_1 - m_1, n_2 - m_2]$
  where
  $h[n_1, n_2] = \begin{cases} 1 & n_1 = 0, n_2 \in \{0, 2\} \\ 2 & n_1 = 0, n_2 = 1 \\ 0 & \text{otherwise} \end{cases}$
  ...we often will write this as h = [1, 2, 1].
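
  A quick check of both example filters (this slide and the previous one) on a single image row, using numpy's 1D convolution; the specific numbers are arbitrary test data, not from the slides:

    import numpy as np

    row = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0])   # one row of an image

    h_diff = np.array([1.0, 0.0, -1.0])   # differencing: z[n] = x[n] - x[n-2]
    h_avg  = np.array([1.0, 2.0, 1.0])    # averaging:    z[n] = x[n] + 2 x[n-1] + x[n-2]

    # In the interior of the 'full' output, the entries match the formulas above.
    print(np.convolve(row, h_diff, mode='full'))
    print(np.convolve(row, h_avg,  mode='full'))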

  15. The two ways we'll use convolution in MP3
  1. Matched filtering: The filter is designed to pick out a particular type of object (e.g., a bicycle, or a Volkswagen Beetle). The output of the filter has a large value when the object is found, and a small random value otherwise.
  2. Gradient: Two filters are designed, one to estimate the horizontal image gradient
     $G_x[n_1, n_2, c] = \frac{\partial}{\partial n_2} x[n_1, n_2, c]$,
     and one to estimate the vertical image gradient
     $G_y[n_1, n_2, c] = \frac{\partial}{\partial n_1} x[n_1, n_2, c]$.
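
  One common choice of gradient filters (a sketch, not necessarily the kernels MP3 specifies) is a centered difference along each axis, applied to one color plane at a time:

    import numpy as np
    from scipy.signal import convolve2d   # assumption: SciPy is installed

    h_x = np.array([[1.0, 0.0, -1.0]])    # horizontal difference (along n2)
    h_y = h_x.T                           # vertical difference (along n1)

    rng = np.random.default_rng(0)
    plane = rng.random((64, 64))          # stand-in for one color plane x[:, :, c]

    # With 'same' output, each result is approximately x[n+1] - x[n-1] along its axis,
    # i.e., twice the centered-difference estimate of the partial derivative.
    G_x = convolve2d(plane, h_x, mode='same')
    G_y = convolve2d(plane, h_y, mode='same')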

  16. Outline: Image filtering and image features
  • Images as signals
  • Color spaces and color features
  • 2D convolution
  • Matched filters
  • Gradient filters
  • Separable convolution
  • Accuracy spectrum of a 1-feature classifier

  17. Matched filter is the solution to the “signal detection” problem
  Suppose we have a noisy signal, x[n]. We have two hypotheses:
  • H0: x[n] is just noise, i.e., x[n] = v[n], where v[n] is a zero-mean, unit-variance Gaussian white noise signal.
  • H1: x[n] = s[n] + v[n], where v[n] is the same random noise signal, but s[n] is a deterministic (non-random) signal that we know in advance.
  We want to create a hypothesis test as follows:
  1. Compute y[n] = h[n] * x[n].
  2. If y[0] > threshold, then conclude that H1 is true (signal present). If y[0] < threshold, then conclude that H0 is true (signal absent).
  Can we design h[n] in order to maximize the probability that this classifier will give the right answer?

  18. The “signal detection” problem
  $y[n] = x[n] * h[n] = s[n] * h[n] + v[n] * h[n]$
  • Call the noise term w[n]: $w[n] = v[n] * h[n] = \sum_m v[m]\, h[n - m]$ is a Gaussian random variable with zero mean.
  • A weighted sum of Gaussians is also Gaussian.
  • $E[w[n]] = 0$ because $E[v[m]] = 0$.
  • The variance is $\sigma_w^2 = \sum_m \sigma_v^2 h^2[n - m] = \sum_m h^2[n - m]$ (because we assumed that $\sigma_v^2 = 1$).
  • Suppose we constrain h[n] so that $\sum_m h^2[n - m] = 1$. Then we have $\sigma_w^2 = 1$.
  • So under H0 (signal absent), y[n] is a zero-mean, unit-variance Gaussian random signal.
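
  A quick Monte Carlo check of this claim (a sketch; the particular filter is arbitrary apart from the unit-energy constraint):

    import numpy as np

    rng = np.random.default_rng(0)

    h = rng.standard_normal(16)
    h = h / np.sqrt(np.sum(h ** 2))        # enforce sum_m h^2[m] = 1

    v = rng.standard_normal(1_000_000)     # zero-mean, unit-variance white noise
    w = np.convolve(v, h, mode='valid')    # w[n] = sum_m v[m] h[n - m]
    print(np.var(w))                       # close to 1, as derived on this slide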

  19. The “signal detection” problem
  $y[n] = x[n] * h[n] = s[n] * h[n] + w[n]$
  So w[0] is a zero-mean, unit-variance Gaussian random variable. We have two hypotheses:
  • H0: $y[0] = w[0]$
  • H1: $y[0] = w[0] + \sum_m s[m]\, h[0 - m]$
  Goal: we know s[m]. We want to design h[m] so that $\sum_m s[m]\, h[-m]$ is as large as possible, subject to the constraint that $\sum_m h^2[m] = 1$.
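
  The slides stop at posing this maximization. By the Cauchy-Schwarz inequality, the maximizing choice is the matched filter, h[-m] proportional to s[m] with unit energy; the sketch below (not from the slides) represents the filter directly through h_rev[m] = h[-m] and compares it against an arbitrary unit-energy filter:

    import numpy as np

    rng = np.random.default_rng(0)
    s = rng.standard_normal(32)                     # the known deterministic signal s[m]

    # Matched filter: h[-m] proportional to s[m], scaled so that sum h^2 = 1.
    h_rev_matched = s / np.sqrt(np.sum(s ** 2))

    # Some other unit-energy filter, for comparison.
    h_rev_other = rng.standard_normal(32)
    h_rev_other = h_rev_other / np.sqrt(np.sum(h_rev_other ** 2))

    def h1_mean(h_rev, s):
        """Noise-free H1 value of y[0] = sum_m s[m] * h[-m]."""
        return np.sum(s * h_rev)

    print(h1_mean(h_rev_matched, s))   # equals sqrt(sum s^2): the Cauchy-Schwarz maximum
    print(h1_mean(h_rev_other, s))     # smaller (almost surely)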
