Human Pose Search using Deep Poselets
Nataraj Jammalamadaka*, Andrew Zisserman§, C. V. Jawahar*
* CVIT, IIIT Hyderabad, India
§ Visual Geometry Group, Department of Engineering Science, University of Oxford
Human Pose: Gesture and Action
Examples: cover drive, walking, gesturing.
Human pose is a very important precursor to gesture and action.
Pose Search: Motivation
Example queries: retrieve cover drive shots; retrieve Bharatanatyam poses.
Pose Search: System
Pipeline: take a query image, build a feature for it, search through the video DB (frame descriptors x_1, ..., x_n), and return the retrieved results.
Overview
Deep Poselets:
• Poselet discovery: cluster the pose space.
• Training: train poselets using convolutional neural networks.
• Detection: detect poselets.
Pose retrieval: given a query image, build a bag of deep poselets and return the retrieved results.
Datasets
• Buffy Stickmen (Season 1, 5 episodes)
• ETHZ Pascal dataset (Flickr images)
• H3D (Flickr images)
Datasets
• FLIC dataset (30 Hollywood movies)
• Movie dataset (ours; 22 Hollywood movies, no overlap with FLIC)
Datasets

Dataset                       Train   Validation   Test    Total
H3D                             238            0      0      238
ETHZ Pascal                       0            0    548      548
Buffy                           747            0      0      747
Buffy-2                         396            0      0      396
Movie                          1098          491   2172     3756
FLIC                           2724         2279      0     5003
Total stickmen annotations     5198         2764   2720    10682
+ Flipped version             10396         5528   5440    21364
Overview
Deep Poselets:
• Poselet discovery: cluster the pose space.
• Training: train poselets using convolutional neural networks.
• Detection: detect poselets.
Pose retrieval: given a query image, build a bag of deep poselets and return the retrieved results.
Poselets
Poselets model body parts in a particular spatial configuration.
[Example average images shown for Poselet 1, Poselet 2, and Poselet 3.]
Poselets: Discovery
• Start from training data with ground-truth stickmen annotations.
• Reorganize the annotations into part sets: left arm (LA), LA + head, LA + head + torso; right arm (RA), RA + head, RA + head + torso; all parts except head.
• For each set, get pose descriptors: for each body part, note its angle.
• K-means clustering: cluster on the angles; each cluster becomes a poselet (poselet average images shown).
A clustering sketch follows.
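A minimal Python sketch of this clustering step, assuming a hypothetical (N, P) array of body-part angles for one part set; the (cos, sin) embedding is my assumption to make Euclidean k-means respect the wrap-around of angles, since the slide only states that the angles are clustered:

```python
import numpy as np
from sklearn.cluster import KMeans

def discover_poselets(part_angles, n_clusters=10, seed=0):
    """Cluster pose descriptors built from body-part angles into poselets.

    part_angles: (N, P) array of part angles in radians for one part set
    (e.g. left arm + head), taken from the stickmen annotations.
    Each resulting cluster defines one poselet.
    """
    # Embed each angle as (cos, sin) so angles near the 0 / 2*pi boundary
    # are treated as close by the Euclidean metric used by k-means.
    desc = np.concatenate([np.cos(part_angles), np.sin(part_angles)], axis=1)
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(desc)
    return km.labels_, km.cluster_centers_

# Hypothetical usage: 500 training stickmen, 3 parts in this part set.
angles = np.random.uniform(0, 2 * np.pi, size=(500, 3))
labels, centers = discover_poselets(angles)
```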
Deep Poselets: CNNs
Architecture: input, then convolutional layers (convolution followed by max pooling, Layers 1-5), then fully connected layers (Layers 6-7), then a softmax layer (Layer 8) producing the deep poselet labels.
ReLU non-linearity: f(x) = max(0, x)
Softmax layer: σ(x_j) = exp(x_j) / Σ_k exp(x_k)
[Diagram: convolution and max-pooling stages with their filter and feature-map sizes.]
A sketch of both non-linearities follows.
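The two non-linearities written out as a small NumPy sketch; the max subtraction inside the softmax is a standard numerical-stability trick, not something the slide specifies:

```python
import numpy as np

def relu(x):
    # ReLU non-linearity: f(x) = max(0, x), applied elementwise.
    return np.maximum(0.0, x)

def softmax(x):
    # Softmax over the last axis: sigma(x_j) = exp(x_j) / sum_k exp(x_k).
    z = x - np.max(x, axis=-1, keepdims=True)  # stability: shift by the max
    e = np.exp(z)
    return e / np.sum(e, axis=-1, keepdims=True)

scores = np.array([1.0, 2.0, -0.5])
print(softmax(relu(scores)))  # probabilities over deep poselet labels
```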
Deep Poselets: Training
(Same architecture: input, convolutional layers, fully connected layers, softmax, deep poselet labels.)
Input image: x; model parameters: w; ground truth: t; output: y = f(x, w).
Loss function (cross-entropy): L = -Σ_i t_i log(y_i)
Training: stochastic gradient descent, w ← w - η ∂L/∂w.
Architecture from Krizhevsky et al., NIPS 2012. A sketch of the loss and update follows.
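A minimal NumPy sketch of the loss and update rule exactly as written above; the one-hot encoding of t and the small epsilon guard are my additions:

```python
import numpy as np

def cross_entropy(y, t, eps=1e-12):
    # L = -sum_i t_i * log(y_i): t is the one-hot ground-truth vector,
    # y is the softmax output of the network.
    return -np.sum(t * np.log(y + eps))

def sgd_step(w, grad_w, lr=0.01):
    # One stochastic gradient descent update: w <- w - eta * dL/dw.
    return w - lr * grad_w

y = np.array([0.1, 0.7, 0.2])   # network output for one training crop
t = np.array([0.0, 1.0, 0.0])   # one-hot deep poselet label
print(cross_entropy(y, t))
```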
Deep Poselets: Fine-tuning
Challenge:
• The network has 40 million parameters.
• Required training data: ~1-2 million images.
• Available training data: ~50K.
Solution:
• Train the network on a task with enough data.
• Fine-tune the network to the current task.
Fine-tuning procedure:
• Train an image classification task using ImageNet data of size 1.2 million.
• Replace the softmax layer with a randomly initialized one.
• Run the gradient descent.
A sketch of this procedure follows.
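A hedged sketch of this procedure using modern PyTorch/torchvision as a stand-in (the original work used a Krizhevsky-style implementation, not PyTorch); the 122 output classes are inferred from the 122-D descriptor on the indexing slide:

```python
import torch
import torchvision

# Start from a network pre-trained on ImageNet (1.2M images), replace
# the final classification layer with a randomly initialized one sized
# for the deep poselet labels, then run gradient descent on our data.
model = torchvision.models.alexnet(weights="IMAGENET1K_V1")
num_poselets = 122  # assumption: one class per deep poselet
model.classifier[6] = torch.nn.Linear(4096, num_poselets)  # random init

optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
criterion = torch.nn.CrossEntropyLoss()
# ...standard training loop over the ~50K poselet crops goes here...
```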
Deep Poselets: Detection
Given a test image, run all the deep poselets:
• Each poselet occurs in a localized region within an upper-body detection.
• Run the classifiers only at the expected center points of the poselets.
• This improves both speed and accuracy.
A sketch of this windowed evaluation follows.
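A rough sketch of this windowed evaluation; `crop_patch` and the per-poselet offsets are hypothetical names, since the slide does not spell out these details:

```python
import numpy as np

def detect_poselets(image, classifiers, expected_offsets, ub_box):
    """Score every deep poselet only at its expected location.

    expected_offsets: per-poselet (dx, dy), the expected poselet center
    as a fraction of the upper-body box (learned from training data).
    """
    x, y, w, h = ub_box
    scores = np.zeros(len(classifiers))
    for i, (clf, (dx, dy)) in enumerate(zip(classifiers, expected_offsets)):
        cx, cy = x + dx * w, y + dy * h    # expected poselet center
        patch = crop_patch(image, cx, cy)  # hypothetical crop helper
        scores[i] = clf(patch)             # CNN score for this poselet
    return scores
```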
Deep Poselets: Spatial reasoning
Problem: three detections fired in the same area, with scores 0.3 (detection 1), 0.7 (detection 2), and 0.2 (detection 3).
Deep Poselets: Spatial reasoning
Problem: the three detections fired in the same area (scores 0.3, 0.7, 0.2).
Objective: rescore detection 2 (score 0.7) to 1, and detections 1 and 3 (scores 0.3, 0.2) to 0.
Solution: for each poselet, learn a regression function whose
• input is the scores of the other poselet detections, and
• output is the new score.
A sketch of the rescoring step follows.
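A sketch of the rescoring step; the slide only says "regression function", so the ridge regressor and the 0/1 correctness targets here are my assumptions:

```python
import numpy as np
from sklearn.linear_model import Ridge

def train_rescorers(det_scores, labels, alpha=1.0):
    """One regressor per poselet, mapping the other poselets' scores
    to a new score for this poselet's detection.

    det_scores: (N, P) raw scores for N upper bodies, P poselets.
    labels:     (N, P) 0/1 ground-truth correctness per detection.
    """
    P = det_scores.shape[1]
    rescorers = []
    for p in range(P):
        others = np.delete(det_scores, p, axis=1)  # drop poselet p's own score
        rescorers.append(Ridge(alpha=alpha).fit(others, labels[:, p]))
    return rescorers
```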
Deep Poselets: Results
• Evaluation measure: mean average precision (MAP) on the test set; a sketch of the AP computation follows.
• Comparison: baseline poselets trained using HOG features.

Method                    MAP-test
HOG                       32.6
CNN before fine-tuning    48.6
CNN after fine-tuning     56.0
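For reference, a small NumPy sketch of the average-precision computation behind the MAP numbers above (the standard definition, not code from the paper):

```python
import numpy as np

def average_precision(scores, relevant):
    # AP: mean of the precision values at the rank of each relevant item.
    order = np.argsort(-scores)          # rank results by descending score
    rel = np.asarray(relevant)[order]
    if not rel.any():
        return 0.0
    ranks = np.flatnonzero(rel) + 1      # 1-based ranks of the relevant items
    hits = np.arange(1, len(ranks) + 1)  # cumulative hit count at those ranks
    return float(np.mean(hits / ranks))

# MAP is then the mean of AP over all poselets (or, later, all queries).
```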
Deep Poselets: Results
[Top-ranked detections (ranks 1-36) for two example poselets: one with AP 40.4 and 1863 positives in the train set, one with AP 78.1 and 698 positives.]
Overview
Deep Poselets:
• Poselet discovery: cluster the pose space.
• Training: train poselets using convolutional neural networks.
• Detection: detect poselets.
Pose retrieval: given a query image, build a bag of deep poselets and return the retrieved results.
Pose Search: Indexing
For each frame in the video DB collection:
• Detect the upper body.
• Run all the poselets.
• Perform spatial reasoning.
• Descriptor: max-pool the deep poselet detections into a 122-D vector.
• Index the descriptor in a database.
A sketch of the descriptor follows.
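A minimal sketch of the per-frame descriptor, assuming detections arrive as (poselet_id, score) pairs after spatial reasoning:

```python
import numpy as np

def frame_descriptor(detections, n_poselets=122):
    """Max-pool the deep poselet detections into one vector per frame:
    the descriptor keeps the best score seen for each poselet."""
    d = np.zeros(n_poselets)
    for pid, score in detections:
        d[pid] = max(d[pid], score)
    return d  # the 122-D bag-of-deep-poselets descriptor

print(frame_descriptor([(3, 0.7), (3, 0.4), (17, 0.9)])[[3, 17]])  # [0.7 0.9]
```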
Pose Search: Retrieval
• Given a query image, build its bag-of-deep-poselets descriptor.
• Search through the database using cosine distance.
• Return the retrieved results.
A sketch of the search step follows.
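A minimal NumPy sketch of the search step; ranking by cosine distance is equivalent to ranking by dot product on L2-normalized descriptors, which is what this does:

```python
import numpy as np

def retrieve(query_desc, db_descs, top_k=25):
    # Cosine similarity between the query descriptor and every frame
    # descriptor in the database; return the top-ranked frame indices.
    q = query_desc / (np.linalg.norm(query_desc) + 1e-12)
    db = db_descs / (np.linalg.norm(db_descs, axis=1, keepdims=True) + 1e-12)
    return np.argsort(-(db @ q))[:top_k]
```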
Pose Search: Results
Experimental setup:
• Database: the test data of size 5440 is used as the database.
• Queries: all samples in the test data are used as queries.
• Evaluation metric: mean average precision (MAP).
Methods compared against:
• Bag of visual words (BOVW): detect SIFT → k-means (K = 1000) → vector quantization.
• Berkeley Poselets (BPL): run poselets → bag of parts.
• Human pose estimation (HPE) [1]: run a human pose estimation algorithm, then concatenate (sin(x), cos(x)) of all the body-part angles.

Method     MAP
BOVW       14.2
BPL        15.3
HPE [1]    17.5
Ours       34.6

[1] Y. Yang and D. Ramanan. "Articulated pose estimation with flexible mixtures-of-parts." In CVPR, 2011.
Pose Search: Results
Comparison with the state of the art [histogram: percentage of queries vs. average precision]:
• HPE [1] (MAP 17.5): 75% of queries have < 20% AP; 5% of queries have > 50% AP.
• Ours (MAP 34.6): 45% of queries have < 20% AP; 25% of queries have > 50% AP.
Pose Search: Analysis
HPE [figure: ground truth vs. detection]:
• Pose estimation algorithms often commit to a wrong pose.
• Pose search systems based on them perform poorly.
Ours [figure: poselet detections with scores 0.3, 0.2, 0.7]:
• The bag-of-poselets descriptor encodes multiple proposals weighted by their likelihood.
• Hence it can recover when some of the detections are wrong.
Pose Search: Results
[Three example queries with AP 59.4, 44.5, and 40.3; each shown with its precision-recall curve and the retrieved results at ranks 1, 5, 10, 15, 20, and 25.]
Summary
• We propose a novel deep-poselet-based human pose search system.
• Our deep poselet method outperforms HOG-based poselets by 25% MAP.
• Our pose retrieval method improves on the current state-of-the-art system by 17% MAP.
Thank you. Questions?