
Visual Recognition: Prospects for Image & Video Analytics - PowerPoint PPT Presentation



  1. Visual Recognition: Prospects for Image & Video Analytics Jitendra Malik University of California at Berkeley

  2. Classification & Segmentation (figure: tiger image with scene/object tags — water, outdoor, grass, wildlife, tiger, sand — and part labels — back, head, eye, legs, tail, mouth, shadow) UC Berkeley Computer Vision Group

  3. PASCAL Visual Object Challenge

  4. We want to locate the object (figure: original images and their segmentations)

  5. Fifty years of computer vision, 1963-2013
  • 1960s: Beginnings in artificial intelligence, image processing and pattern recognition
  • 1970s: Foundational work on image formation: Horn, Koenderink, Longuet-Higgins …
  • 1980s: Vision as applied mathematics: geometry, multi-scale analysis, probabilistic modeling, control theory, optimization
  • 1990s: Geometric analysis largely completed, vision meets graphics, statistical learning approaches resurface
  • 2000s: Significant advances in visual recognition, range of practical applications
  UC Berkeley Computer Vision Group

  6. Handwritten digit recognition (MNIST, USPS)
  • LeCun's Convolutional Neural Network variants (0.8%, 0.6% and 0.4% on MNIST)
  • Tangent Distance (Simard, LeCun & Denker: 2.5% on USPS)
  • Randomized Decision Trees (Amit, Geman & Wilder: 0.8%)
  • K-NN based shape context/TPS matching (Belongie, Malik & Puzicha: 0.6% on MNIST)
  UC Berkeley Computer Vision Group
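A minimal k-NN baseline, for illustration only: the sketch below uses a plain Euclidean pixel distance on scikit-learn's small digits dataset, whereas the Belongie, Malik & Puzicha result on the slide replaces this distance with a shape-context/thin-plate-spline matching cost. Dataset and hyperparameters here are assumptions, not the original setup.

```python
# Minimal k-NN digit classifier sketch (not the shape-context method
# from the slide): nearest neighbours under Euclidean pixel distance.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

digits = load_digits()                     # 8x8 grayscale digits (a small MNIST stand-in)
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=0)

knn = KNeighborsClassifier(n_neighbors=3)  # k-NN with Euclidean distance
knn.fit(X_train, y_train)
print("test error: %.1f%%" % (100 * (1 - knn.score(X_test, y_test))))
```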

  7. EZ-Gimpy results (Mori & Malik, 2003)
  • 171 of 192 images correctly identified: 92% (example CAPTCHA words: horse, spade, smile, join, canvas, here)
  UC Berkeley Computer Vision Group

  8. Face Detection (Carnegie Mellon University). Results on various images submitted to the CMU on-line face detector: http://www.vasc.ri.cmu.edu/cgi-bin/demos/findface.cgi

  9. Multiscale sliding window. Ask this question repeatedly, varying position, scale, category… Paradigm introduced by Rowley, Baluja & Kanade '96 for face detection; Viola & Jones '01, Dalal & Triggs '05, Felzenszwalb, McAllester & Ramanan '08.
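A sketch of the multiscale sliding-window paradigm the slide describes: score a fixed-size window at every position of an image pyramid. The classifier itself is left abstract; `score_window` is a hypothetical placeholder (e.g. it could be a HOG + linear SVM as in Dalal & Triggs), and the scale factor and stride are illustrative choices.

```python
# Multiscale sliding-window detection sketch: slide a fixed-size window
# over every level of an image pyramid and keep windows the classifier accepts.
import numpy as np

def pyramid(image, scale=1.25, min_size=64):
    """Yield successively downscaled copies of the image."""
    while min(image.shape[:2]) >= min_size:
        yield image
        h, w = image.shape[:2]
        new_h, new_w = int(h / scale), int(w / scale)
        # nearest-neighbour resize keeps the sketch dependency-free
        rows = np.arange(new_h) * h // new_h
        cols = np.arange(new_w) * w // new_w
        image = image[rows][:, cols]

def detect(image, score_window, window=64, stride=8, threshold=0.5):
    """Return (row, col, scale, score) for accepted windows; coordinates
    are mapped back to the original image resolution."""
    detections, scale = [], 1.0
    for level in pyramid(image):
        for r in range(0, level.shape[0] - window + 1, stride):
            for c in range(0, level.shape[1] - window + 1, stride):
                s = score_window(level[r:r + window, c:c + window])
                if s > threshold:
                    detections.append((int(r * scale), int(c * scale), scale, s))
        scale *= 1.25  # matches the pyramid's default downscaling factor
    return detections
```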

  10. Caltech-101 [Fei-Fei et al. 04] • 102 classes, 31-300 images/class UC Berkeley Computer Vision Group

  11. Caltech-101 classification results (even better results are obtained by combining cues)

  12. PASCAL Visual Object Challenge

  13. Trying to find stick figures is hard (and unnecessary!) Generalized Cylinders (Binford, Marr & Nishihara) Geons (Biederman)

  14. Person detection is challenging

  15. Can we build upon the success of faces and pedestrians? (Rowley, Baluja & Kanade, CVPR '96; Viola & Jones, IJCV '01; Dalal & Triggs, CVPR '05; …)
  • Pattern matching
  • Capture patterns that are common and visually characteristic
  • Are these the only two common and characteristic patterns?

  16. Poselets: we will train classifiers for these different visual patterns

  17. Segmenting people Best person segmentation on PASCAL 2010 dataset [Bourdev, Maji, Brox and Malik, ECCV10]

  18. Describing people. Example captions: “A man with short hair, glasses and long pants” (??), “A man with short hair, glasses, long sleeves and shorts”, “A person with short hair and long pants”, “A woman with long hair, long sleeves”

  19. Male or female?

  20. Gender classifier per poselet is much easier to train
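A hedged sketch of the per-poselet idea on this slide: train a separate linear attribute classifier for each poselet type, then combine the poselet-level scores for one person. The feature representation, the poselet detector, and the simple score averaging used here are all assumptions for illustration (the actual system may combine scores differently); `features_by_poselet` and `person_detections` are hypothetical inputs.

```python
# Per-poselet attribute (gender) classification sketch.
import numpy as np
from sklearn.svm import LinearSVC

def train_per_poselet(features_by_poselet):
    """features_by_poselet: hypothetical dict {poselet_id: (X, y)} of
    HOG-like features and gender labels for detections of that poselet."""
    classifiers = {}
    for pid, (X, y) in features_by_poselet.items():
        clf = LinearSVC(C=1.0)   # one linear SVM per poselet type:
        clf.fit(X, y)            # each sees a consistent pose/viewpoint, so it is easier to train
        classifiers[pid] = clf
    return classifiers

def predict_person(classifiers, person_detections):
    """person_detections: hypothetical list of (poselet_id, feature_vector)
    for the poselets fired on one person; scores are averaged here."""
    scores = [classifiers[pid].decision_function(f.reshape(1, -1))[0]
              for pid, f in person_detections if pid in classifiers]
    return float(np.mean(scores)) > 0  # True = "is male" under this label convention
```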

  21. Is male

  22. Has long hair

  23. Wears long pants

  24. Wears a hat

  25. Wears long sleeves

  26. Wears glasses

  27. Actions in still images … have characteristic:
  • pose and appearance
  • interaction with objects and agents

  28. Some discriminative poselets

  29. Problem: human activity recognition. Approach: learn pose and appearance specific to an action. Mean performance: 59.7% correct. (SMARTS Annual Review, 12/20/2011)

  30. Results: Top Confusions

  31. Low-Cost Automated Tuberculosis Diagnostics Using Mobile Microscopy. Jeannette Chang¹, Pablo Arbelaez¹, Neil Switz², Clay Reber², Asa Tapley²·³, Lucian Davis³, Adithya Cattamanchi³, Daniel Fletcher², and Jitendra Malik¹. ¹Department of Electrical Engineering and Computer Science, UC Berkeley; ²Department of Bioengineering, UC Berkeley; ³Medical School and San Francisco General Hospital, UC San Francisco

  32. Why Tuberculosis?
  • Mortality and treatment¹: TB is the second leading cause of death from infectious disease worldwide (after HIV/AIDS); highly effective antibiotic treatment exists
  • Current diagnostics: technicians screen microscopic images of sputum smears manually; other methods include culture and PCR
  • Tremendous potential benefit from automated processing or classification
  1. http://www.who.int/tb/publications/global_report/2011/gtbr11_full.pdf
  2. http://www.thehindu.com/health/rx/article21138.ece
  (Figure: examples of sputum smears with TB bacteria; brightfield (top) and fluorescent (bottom) microscopy.²)

  33. Pipeline: an input image from the CellScope device goes through candidate TB object (blob) identification, feature extraction, and linear SVM classification, producing an SVM confidence score per candidate. Each candidate TB object is characterized by a feature vector containing 8 Hu moment invariants and 14 geometric/photometric descriptors. (Figure: a sample subset of candidate TB objects with confidence scores 0.918, 0.885, 0.389, 0.374, 0.008, 0.002, 0.001, 0.000, and a bar plot of SVM confidence scores for candidates sorted in decreasing order.)
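A sketch of this classification stage under assumed tooling (scikit-image + scikit-learn): each candidate blob is summarized by Hu moment invariants plus geometric/photometric region properties, and a linear SVM scores it. The exact 8 + 14 descriptors of the CellScope system are not reproduced here; the property list below is an illustrative subset.

```python
# Candidate-object feature extraction and linear SVM scoring sketch.
import numpy as np
from skimage import measure
from sklearn.svm import LinearSVC

def candidate_features(binary_mask, intensity_image):
    """Feature vector for one candidate object (a single labelled blob)."""
    props = measure.regionprops(binary_mask.astype(int), intensity_image)[0]
    hu = props.moments_hu                        # 7 Hu moment invariants
    geom = [props.area, props.perimeter, props.eccentricity, props.solidity,
            props.extent, props.equivalent_diameter,
            props.major_axis_length, props.minor_axis_length]
    photo = [props.mean_intensity, props.max_intensity, props.min_intensity]
    return np.concatenate([hu, geom, photo])

# X: one feature vector per candidate object; y: 1 = TB bacillus, 0 = not TB
# clf = LinearSVC(C=1.0).fit(X, y)
# confidence = clf.decision_function(X_new)     # sort candidates by this score
```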

  34. Sample Candidate Objects Sample positive objects Sample negative objects

  35. Patches in Descending Order of Confidence

  36. Object-Level Performance (Uganda Data). Sensitivity/specificity (SS) and recall/precision (RP) curves for training and test data; average specificity 0.967, average precision 0.954 (cost exponent 7). Features in descending order of normalized SVM weights: MeanIntensity, Eccentricity, MinorAxisLength, φ2, EquivDiameter, MajorAxisLength, Solidity, ConvexArea, φ3, Extent, EulerNumber, MaxIntensity, φ11, φ4, φ6, φ7, φ5, Area, FilledArea, Perimeter, φ1, MinIntensity.
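A brief sketch, assuming scikit-learn, of how the SS and RP curves on this slide can be computed from the SVM confidence scores of the candidate objects (y_true = ground-truth labels, y_score = decision-function outputs).

```python
# Compute sensitivity/specificity (SS) and recall/precision (RP) curves
# from per-candidate SVM confidence scores.
from sklearn.metrics import precision_recall_curve, roc_curve

def ss_and_rp_curves(y_true, y_score):
    # recall/precision curve
    precision, recall, _ = precision_recall_curve(y_true, y_score)
    # sensitivity/specificity curve: sensitivity = TPR, specificity = 1 - FPR
    fpr, tpr, _ = roc_curve(y_true, y_score)
    sensitivity, specificity = tpr, 1.0 - fpr
    return (recall, precision), (sensitivity, specificity)
```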

  37. Slide-Level Performance (Uganda Data)
