Discriminant Hypothesis for Visual Saliency and its Applications in Computer Vision


  1. Discriminant Hypothesis for Visual Saliency and its Applications in Computer Vision. Dashan Gao, joint work with Nuno Vasconcelos. Statistical Visual Computing Laboratory, Department of Electrical and Computer Engineering, University of California, San Diego.

  2. What is visual saliency? • certain image features that attract visual attention (Yarbus, 1967; Treisman & Gormican, 1988)

  3. What is known about saliency? • Bottom-up (BU) saliency – stimulus-driven mechanism, fast – goal independent – e.g. traffic signs • Top-down (TD) saliency – goal-driven mechanism, slower – provides locations that are informative for the specific task. Yarbus (1967): 1. free viewing; 2. estimate the economic level of the people; 3. judge their age.

  4. In computer vision • saliency is widely used in visual recognition systems as a pre-processing stage – gives a sparse image representation – reduces computation – example: weakly supervised learning of object categories (Sivic et al. 2003; Fergus et al. 2003)

  5. However, in computer vision • most saliency definitions are universal, divorced from the recognition problem – repeatability (stability) [Forstner (94), Harris-Stephens (88), Shi-Tomasi (94), Mikolajczyk (01, 04)] – continuity (curvature) [Sha'ashua-Ullman (88), Asada-Brady (86)] – complexity (information content) [Kadir-Brady (01), Sebe-Lew (03)] – rarity (low probability) [Walker et al. (98), Oliva et al. (03), Bruce-Tsotsos (05)] • as a result, detected salient locations may not be very informative for recognition. [Figure: example of Harris detection.]

  6. Discriminant saliency • Hypothesis: saliency is a discriminant process • requires a stimulus of interest and a null hypothesis of stimuli that are not salient – context dependent: salient attributes depend on the object of interest and on the context (null hypothesis) • Definition: salient features are those that best distinguish the given visual concept from the null hypothesis (NIPS 2004)

  7. Infomax feature selection • solution: select the features that maximize the mutual information between the features (X) and the class label (Y) • under a constraint of computational parsimony • we use marginal mutual information, with X = {X_1, ..., X_n} (more on this later)
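
The selection criterion described on this slide can be written compactly as follows; the notation (candidate pool F, n retained features) is ours, filled in from the slide's description:

```latex
X^{*} = \arg\max_{X \subseteq \mathcal{F}} I(X; Y),
\qquad
I(X; Y) \;\approx\; \sum_{i=1}^{n} I(X_i; Y),
\qquad
I(X_i; Y) = \sum_{y} \int p_{X_i,Y}(x, y)
            \log \frac{p_{X_i,Y}(x, y)}{p_{X_i}(x)\, p_Y(y)} \, dx .
```

The marginal approximation is what keeps the criterion computationally parsimonious; its justification comes from the feature statistics discussed on slides 19-20.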

  8. Top-down discriminant saliency model. [Diagram: training selects discriminant salient features from the original feature set using examples of the target class (faces) and of the background; testing applies the salient features with scale selection (weights W_j) to produce a saliency map, followed by winner-take-all (WTA) selection, built on the Malik-Perona pre-attentive perception model [M-P 90].]

  9. Qualitative evaluation • discriminant saliency correlates better with the target objects. [Figure: original images; saliency maps and salient locations from discriminant saliency; salient locations from the scale saliency detector [K-B 01] and the Harris saliency detector [H-S 88].]

  10. Visual classification • how informative are the salient points? – evaluated by measuring actual recognition rates – classifier: a histogram of saliency values, fed to an SVM for a presence/absence test – compared with two standard saliency detectors and two benchmarks. [Bar chart: classification accuracy (60-100% scale) of DiscSal-DCT, Scale Saliency, Harris Saliency, Image Pixel, and Constellation on Faces, Motorbikes, and Airplanes.]
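
A minimal sketch of that classification protocol, assuming one precomputed saliency map per image; the bin count, kernel, and helper names are illustrative choices, not taken from the slides:

```python
import numpy as np
from sklearn.svm import SVC

def saliency_histogram(saliency_map, n_bins=32):
    """Summarize a saliency map by a normalized histogram of its values."""
    hist, _ = np.histogram(saliency_map, bins=n_bins, range=(0.0, 1.0), density=True)
    return hist

def train_presence_classifier(saliency_maps, labels):
    """Train an SVM on saliency-value histograms for a presence/absence test.
    saliency_maps: list of 2-D arrays scaled to [0, 1]; labels: 1 = present, 0 = absent."""
    features = np.stack([saliency_histogram(m) for m in saliency_maps])
    return SVC(kernel="rbf").fit(features, labels)
```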

  11. Repeatability test • robustness to various transformations (scale + rotation, blur, JPEG compression, viewpoint angle). SSD: Kadir-Brady 2001; HarrLap/HesLap: Mikolajczyk 2004; MSER: Matas et al. 2004. [Plots: repeatability (%) of DSD, SSD, HarrLap, HesLap, and MSER under increasing scale change, blur, JPEG compression, and viewpoint angle.]

  12. Other research based on top-down discriminant saliency • a hierarchical model that learns complex features and detectors for classification (CVPR 2005). [Diagram: from an initial complex feature set, discriminant feature selection on faces vs. background yields salient (complex) features; object detection produces a saliency map and salient locations, which drive feature-complexity control and the generation of a new complex feature set.]

  13. Bayesian integration • Advantage – balances the high selectivity of TD saliency against the high accuracy of BU saliency • Probabilistic formulation of salient locations – saliency output: a probability distribution of salient locations over the image plane • Saliency as Bayesian inference – BU saliency -> prior, TD saliency -> likelihood – inference from the two observations yields both accuracy and selectivity -> a posterior
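
A minimal per-pixel sketch of this integration, treating the normalized BU map as a prior over locations and the TD map as a likelihood; the normalization details are an assumption on our part:

```python
import numpy as np

def integrate_saliency(bu_map, td_map, eps=1e-8):
    """Fuse bottom-up (prior) and top-down (likelihood) saliency maps
    into a posterior distribution over image locations."""
    prior = bu_map / (bu_map.sum() + eps)      # BU saliency as P(location)
    likelihood = td_map + eps                  # TD saliency as P(evidence | location)
    posterior = likelihood * prior             # Bayes rule, unnormalized
    return posterior / posterior.sum()         # posterior saliency map
```

Locations can then be ranked by the posterior, keeping the selectivity of the TD map while letting the BU prior suppress its inaccurate responses.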

  14. Results • better locations • better selectivity (classification accuracy). [Figure: salient locations from the Bayesian integration compared with BU-only and TD-only saliency.]

  15. Applications • Automated gathering of training examples

  16. Applications • Region-of-interest (ROI) based image compression. [Figure: normal JPEG vs. ROI compression at a low bit rate.] Joint work with Sunhyoung Han.

  17. Discriminant saliency hypothesis • these results are encouraging, but how do we evaluate the hypothesis as a whole? • two fundamental questions – can it explain biological saliency? – can it drive both bottom-up and top-down saliency? • bottom-up saliency is particularly interesting – the bottom-up visual pathway is much better understood than its top-down counterpart • this motivated us to study bottom-up discriminant saliency

  18. Bottom-up discriminant saliency • Recall: bottom-up saliency – stimulus-driven mechanism – saliency detection on a single image • Center-surround mechanism – at each location l, the features X observed in a center window W_l^1 are compared with those in a surround window W_l^0, with class label Y in {center, surround} and conditional distributions P(x | center) and P(x | surround)
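
One way to make the center-surround formulation concrete is to histogram a feature's responses inside a center window and its surround at each location, and score the location by the mutual information between the response and the center/surround label. The window radii and bin count below are illustrative, not the model's actual parameters:

```python
import numpy as np

def mutual_information(center_vals, surround_vals, n_bins=16, eps=1e-12):
    """I(X;Y) for feature responses X and label Y in {center, surround},
    estimated from histograms of the two windows (equal class priors)."""
    lo = min(center_vals.min(), surround_vals.min())
    hi = max(center_vals.max(), surround_vals.max())
    p_c, edges = np.histogram(center_vals, bins=n_bins, range=(lo, hi))
    p_s, _ = np.histogram(surround_vals, bins=edges)
    p_c = p_c / max(p_c.sum(), 1)
    p_s = p_s / max(p_s.sum(), 1)
    p_x = 0.5 * (p_c + p_s)                     # marginal P(X)
    kl_c = np.sum(p_c * np.log((p_c + eps) / (p_x + eps)))
    kl_s = np.sum(p_s * np.log((p_s + eps) / (p_x + eps)))
    return 0.5 * kl_c + 0.5 * kl_s              # I(X;Y) = E_Y[ KL(P(X|Y) || P(X)) ]

def center_surround_saliency(feature_map, location, c_rad=8, s_rad=24):
    """Discriminant saliency of one feature channel at location = (row, col)."""
    r, c = location
    center = feature_map[r - c_rad:r + c_rad, c - c_rad:c + c_rad].astype(float).ravel()
    surround = feature_map[r - s_rad:r + s_rad, c - s_rad:c + s_rad].astype(float)
    surround[s_rad - c_rad:s_rad + c_rad, s_rad - c_rad:s_rad + c_rad] = np.nan  # exclude center
    return mutual_information(center, surround[~np.isnan(surround)])
```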

  19. Joint feature distribution • band-pass features exhibit regular patterns of response to natural images – bow-tie shaped conditional distributions (Buccigrossi & Simoncelli, 1999; Huang & Mumford, 1999) – although the fine details of feature dependencies may vary from scene to scene, their coarse structure follows a universal law for all classes – feature dependencies are therefore not informative about image classes. [Figure: top, three images; bottom, conditional histograms P(x_i | x_j) of the same coefficient, conditioned on the value of its parent.]

  20. Joint feature distribution • this enables approximating the mutual information by the sum of marginal mutual informations (Vasconcelos & Vasconcelos, 2004): I(X;Y) = Σ_i I(X_i;Y) + a second term carrying the discriminant information of the feature dependencies – all complexity is encoded in that second term, which the statistics above suggest can be dropped • from a computational standpoint, the remaining sum is extremely simple to compute (computational parsimony)

  21. Generalized Gaussian density (GGD) • the marginal distributions of natural image features follow a generalized Gaussian density, P(x) = β / (2 α Γ(1/β)) exp(-(|x|/α)^β), which is a Laplace distribution for β = 1 and a Gaussian for β = 2. [Figure: examples of GGD fits (log scale) to the response histograms of two Gabor filters.]
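
For concreteness, a small sketch of the GGD named on this slide; the Laplace (β = 1) and Gaussian (β = 2) special cases follow from the standard form of the density, and fitting α and β to actual filter responses is not shown:

```python
import numpy as np
from scipy.special import gamma

def ggd_pdf(x, alpha, beta):
    """Generalized Gaussian density with scale alpha and shape beta:
    p(x) = beta / (2 * alpha * Gamma(1/beta)) * exp(-(|x| / alpha) ** beta)."""
    norm = beta / (2.0 * alpha * gamma(1.0 / beta))
    return norm * np.exp(-(np.abs(x) / alpha) ** beta)

# Sanity check: beta = 2 with alpha = sqrt(2) * sigma is the N(0, sigma^2) density.
sigma, x = 1.0, np.linspace(-3.0, 3.0, 7)
gaussian = np.exp(-x**2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)
assert np.allclose(ggd_pdf(x, np.sqrt(2.0) * sigma, 2.0), gaussian)
```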

  22. In summary • combining the principles of – infomax organization – computational parsimony – neural tuning to stimulus statistics • leads to a very simple saliency operator • which is approximately optimal in the minimum-error-probability sense. [Diagram: feature decomposition into intensity, color (R/G, B/Y), and orientation channels produces feature maps; a discriminant measure yields per-feature saliency maps, which are summed (Σ) into the final saliency map.]
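
The operator summarized on this slide can be sketched end to end as below, reusing center_surround_saliency from the sketch after slide 18; the choice of feature channels and the sampling step are placeholders rather than the model's actual settings:

```python
import numpy as np

def discriminant_saliency_map(feature_maps, c_rad=8, s_rad=24, step=4):
    """Sum of per-feature center-surround discriminant saliency maps.
    feature_maps: list of 2-D float arrays, one per feature channel
    (e.g. intensity, R/G, B/Y, and oriented band-pass responses)."""
    h, w = feature_maps[0].shape
    saliency = np.zeros((h, w))
    for fmap in feature_maps:                          # one saliency map per feature
        for r in range(s_rad, h - s_rad, step):
            for c in range(s_rad, w - s_rad, step):
                saliency[r, c] += center_surround_saliency(fmap, (r, c), c_rad, s_rad)
    return saliency                                    # summed across feature channels
```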

  23. Biological plausibility • discriminant saliency can be implemented with a three-layer neural network. [Network diagram: layer 1 (simple cells) applies the |·|^β nonlinearity to responses from the center (W_l^1) and surround (W_l^0) windows; layer 2 (complex cortical cell columns) pools the responses (Σ) through the nonlinearities φ and ψ; layer 3 combines I_l(X_k, Y) and H(Y) into the saliency S(l).]

  24. Single vs. conjunctive feature search • task: find a bar different from all others. [Figure: single-feature and conjunctive search displays with the corresponding discriminant saliency predictions.]

  25. Asymmetries in visual search • Time(find a "Q" among "O"s) < Time(find an "O" among "Q"s) – presence of a feature vs. absence of a feature. [Plots: search time vs. number of distractors, and the corresponding saliency predictions.]

  26. Distractor heterogeneity (relevant dimension) • saliency perception is nonlinear and is affected by heterogeneous distractors when the heterogeneity lies in the same dimension. [Plot: relative saliency vs. orientation contrast (0-90 deg) for background heterogeneity bg = 0°, 10°, 20°; discriminant saliency predictions compared with Nothdurft (1993).]

  27. Prediction of human eye fixations on natural images (qualitative)

  28. Applications in motion saliency • discriminant saliency is combined with the motion field – optical flow – removes the camera motion. Joint work with Vijay Mahadevan (NIPS 2007)
