Visual Recognition: Prospects for Image & Video Analytics
Jitendra Malik, University of California at Berkeley
Classification & Segmentation (example: an outdoor wildlife scene with regions labeled water, grass, sand, and a tiger with parts head, eye, legs, tail, mouth, shadow) UC Berkeley Computer Vision Group
PASCAL Visual Object Challenge
We want to locate the object (original images shown alongside their segmentations)
Fifty years of computer vision, 1963-2013
• 1960s: Beginnings in artificial intelligence, image processing and pattern recognition
• 1970s: Foundational work on image formation: Horn, Koenderink, Longuet-Higgins …
• 1980s: Vision as applied mathematics: geometry, multi-scale analysis, probabilistic modeling, control theory, optimization
• 1990s: Geometric analysis largely completed, vision meets graphics, statistical learning approaches resurface
• 2000s: Significant advances in visual recognition, range of practical applications
Handwritten digit recognition (MNIST, USPS)
• LeCun's Convolutional Neural Network variants (0.8%, 0.6% and 0.4% on MNIST)
• Tangent Distance (Simard, LeCun & Denker: 2.5% on USPS)
• Randomized Decision Trees (Amit, Geman & Wilder: 0.8%)
• K-NN based Shape Context/TPS matching (Belongie, Malik & Puzicha: 0.6% on MNIST)
EZ-Gimpy Results (Mori & Malik, 2003)
• 171 of 192 images correctly identified: 92%
(example words: horse, spade, smile, join, canvas, here)
Face Detection (Carnegie Mellon University): results on various images submitted to the CMU on-line face detector, http://www.vasc.ri.cmu.edu/cgi-bin/demos/findface.cgi
Multiscale sliding window: ask this question repeatedly, varying position, scale, and category. Paradigm introduced by Rowley, Baluja & Kanade (1996) for face detection; Viola & Jones (2001); Dalal & Triggs (2005); Felzenszwalb, McAllester & Ramanan (2008)
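The sliding-window paradigm above can be sketched in a few lines: slide a fixed-size window over the image, score each patch with a classifier, then downsample and repeat so that larger objects fall inside the window at coarser scales. The window size, stride, octave-by-octave pyramid, and threshold below are illustrative assumptions, not the settings of the detectors cited on this slide.

```python
import numpy as np

def sliding_window_detect(image, classifier, window=24, stride=4, threshold=0.5):
    """Scan `image` with a fixed square window at multiple scales.

    `classifier` is any function mapping a window-sized patch to a
    confidence score. Returns (x, y, size, score) boxes in
    original-image coordinates.
    """
    detections = []
    scale = 1
    img = image
    while img.shape[0] >= window and img.shape[1] >= window:
        h, w = img.shape[:2]
        for y in range(0, h - window + 1, stride):
            for x in range(0, w - window + 1, stride):
                score = classifier(img[y:y + window, x:x + window])
                if score > threshold:
                    # Map the window back to original-image coordinates.
                    detections.append((x * scale, y * scale, window * scale, score))
        img = img[::2, ::2]  # next octave of a crude image pyramid
        scale *= 2
    return detections
```

A real detector would add non-maximum suppression over the returned boxes and finer scale steps than plain octaves.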
Caltech-101 [Fei-Fei et al. 04] • 102 classes, 31-300 images/class
Caltech-101 classification results (even better by combining cues…)
PASCAL Visual Object Challenge
Trying to find stick figures is hard (and unnecessary!) Generalized Cylinders (Binford, Marr & Nishihara) Geons (Biederman)
Person detection is challenging
Can we build upon the success of faces and pedestrians?
Rowley, Baluja & Kanade, CVPR 96; Viola & Jones, IJCV 01; Dalal & Triggs, CVPR 05; …
Pattern matching: capture patterns that are common and visually characteristic.
Are these the only two common and characteristic patterns?
Poselets: we will train classifiers for these different visual patterns
Segmenting people Best person segmentation on PASCAL 2010 dataset [Bourdev, Maji, Brox and Malik, ECCV10]
Describing people
• "A man with short hair, long sleeves"
• "A man with short hair, glasses, long pants"
• "A person with short hair and glasses and long pants" (??)
• "A woman with long sleeves and shorts"
Male or female?
Gender classifier per poselet is much easier to train
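The idea on this slide, that each poselet gets its own easy-to-train attribute classifier, implies a second step: combining per-poselet scores into one person-level prediction. The detection-confidence-weighted average below is an illustrative assumption for that aggregation, not the exact method of the poselet attribute work.

```python
import numpy as np

def aggregate_attribute_score(poselet_scores, poselet_confidences):
    """Combine per-poselet attribute scores into one person-level score.

    `poselet_scores`: attribute classifier outputs, one per fired poselet.
    `poselet_confidences`: detection confidences used as weights, so
    reliable poselets (e.g. a clear frontal face) dominate the vote.
    Hypothetical weighted-average aggregation.
    """
    scores = np.asarray(poselet_scores, dtype=float)
    weights = np.asarray(poselet_confidences, dtype=float)
    return float(np.dot(weights, scores) / weights.sum())
```

For example, a frontal-face poselet may be highly informative for "is male" while a legs poselet is more informative for "wears long pants"; weighting by confidence lets each attribute lean on the poselets that actually saw the evidence.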
Is male
Has long hair
Wears long pants
Wears a hat
Wears long sleeves
Wears glasses
Actions in still images … have characteristic:
• pose and appearance
• interaction with objects and agents
Some discriminative poselets
Problem: Human Activity Recognition
Approach: learn pose and appearance specific to an action
Mean performance: 59.7% correct
SMARTS Annual Review 2011, 12/20/2011
Results: Top Confusions
Low-Cost Automated Tuberculosis Diagnostics Using Mobile Microscopy
Jeannette Chang¹, Pablo Arbelaez¹, Neil Switz², Clay Reber², Asa Tapley²·³, Lucian Davis³, Adithya Cattamanchi³, Daniel Fletcher², and Jitendra Malik¹
¹ Department of Electrical Engineering and Computer Science, UC Berkeley
² Department of Bioengineering, UC Berkeley
³ Medical School and San Francisco General Hospital, UC San Francisco
Why Tuberculosis?
Mortality and Treatment¹
• TB is the second leading cause of death from infectious disease worldwide (after HIV/AIDS)
• Highly effective antibiotic treatment exists
Current Diagnostics
• Technicians manually screen microscopic images of sputum smears
• Other methods include culture and PCR
• Tremendous potential benefit from automated processing or classification
(Figure: examples of sputum smears with TB bacteria; brightfield (top) and fluorescent (bottom) microscopy.²)
1. http://www.who.int/tb/publications/global_report/2011/gtbr11_full.pdf
2. http://www.thehindu.com/health/rx/article21138.ece
Pipeline: input image from CellScope device → Candidate TB Blob Identification (array of candidate TB objects) → Feature Extraction → Linear SVM Classification → SVM output confidence score.
Each candidate TB object is characterized by a feature vector containing 8 Hu moment invariants and 14 geometric/photometric descriptors.
(Figure: candidate TB objects sorted by their SVM output confidence scores in decreasing order, row-wise from top to bottom, with a bar plot of the corresponding scores; sample scores for a subset of candidates range from 0.918 down to 0.000.)
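The classification stage above can be sketched as scoring each candidate's feature vector with a trained linear SVM and ranking candidates by confidence. Here the feature dimension is 22 (8 Hu moments + 14 other descriptors, as stated above); the logistic squashing of raw SVM margins into (0, 1) confidence-like scores is an illustrative assumption, not necessarily the calibration used in the actual system.

```python
import numpy as np

def score_and_rank(features, w, b):
    """Score candidate TB objects with a linear SVM and rank them.

    `features`: (n_candidates, d) array of per-candidate feature vectors
    (d = 22 in the pipeline above). `w`, `b`: trained SVM weight vector
    and bias. Returns candidate indices sorted by decreasing confidence
    and the sorted confidence scores.
    """
    margins = features @ w + b               # signed distance to the hyperplane
    conf = 1.0 / (1.0 + np.exp(-margins))    # map margins to (0, 1)
    order = np.argsort(-conf)                # most confident candidates first
    return order, conf[order]
```

In the slide's figure, this ranking is exactly what produces the grid of candidate patches sorted from most to least TB-like.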
Sample Candidate Objects Sample positive objects Sample negative objects
Patches in Descending Order of Confidence
Object-Level Performance (Uganda Data)
(Figure: sensitivity/specificity (SS) and recall/precision (RP) curves for train and test sets; average specificity 0.967, average precision 0.954.)
Features in descending order of normalized SVM weights: MeanIntensity, Eccentricity, MinorAxisLength, φ2, EquivDiameter, MajorAxisLength, Solidity, ConvexArea, φ3, Extent, EulerNumber, MaxIntensity, φ11, φ4, φ6, φ7, φ5, Area, FilledArea, Perimeter, φ1, MinIntensity.
Slide-Level Performance (Uganda Data)