Fisher vector image representation
Jakob Verbeek
January 13, 2012
Course website: http://lear.inrialpes.fr/~verbeek/MLCR.11.12.php
Fisher vector representation
• Alternative to the bag-of-words image representation, introduced in "Fisher kernels on visual vocabularies for image categorization", F. Perronnin and C. Dance, CVPR 2007.
• FV in comparison to the BoW representation
  – Both FV and BoW are based on a visual vocabulary, with assignment of patches to visual words
  – FV is based on mixture-of-Gaussians clustering of patches, BoW on k-means clustering
  – FV extracts a larger image signature than the BoW representation for a given number of visual words
  – FV leads to good classification results using linear classifiers, where BoW representations require non-linear classifiers.
Fisher vector representation: Motivation 1
• Suppose we use a bag-of-words image representation
  – Visual vocabulary trained offline
• Feature vector quantization is computationally expensive in practice
• To extract the visual word histogram for a new image
  – Compute the distance of each local descriptor to each k-means center
  – Run-time O(NKD): linear in
    • N: nr. of feature vectors, ~10^4 per image
    • K: nr. of clusters, ~10^3 for recognition
    • D: nr. of dimensions, ~10^2 (SIFT)
  – So in total on the order of 10^9 multiplications per image to obtain a histogram of size 1000
• Can this be done more efficiently?!
  – Yes: extract more than just a visual word histogram! (A sketch of the baseline cost follows below.)
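To make the O(NKD) step concrete, here is a minimal sketch of hard-assignment histogram extraction. It is not from the slides: the function name, the random data, and the array shapes (matching the orders of magnitude above) are illustrative assumptions.

```python
import numpy as np

def bow_histogram(descriptors, centers):
    """descriptors: (N, D) local features; centers: (K, D) k-means centers.
    Returns the K-dimensional visual word count histogram."""
    # Squared Euclidean distances of all N descriptors to all K centers;
    # the O(N*K*D) cost sits in the matrix product below.
    d2 = ((descriptors ** 2).sum(1)[:, None]
          + (centers ** 2).sum(1)[None, :]
          - 2.0 * descriptors @ centers.T)          # (N, K)
    assignments = d2.argmin(axis=1)                 # hard assignment, (N,)
    return np.bincount(assignments, minlength=len(centers))

# Illustration with random data: N=10^4 SIFT-like descriptors, K=10^3 words.
hist = bow_histogram(np.random.randn(10000, 128), np.random.randn(1000, 128))
```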
Fisher vector representation: Motivation 2
• Suppose we want to refine a given visual vocabulary
• The bag-of-words histogram stores the number of patches assigned to each word
  – We need more words to refine the representation
  – But this directly increases the computational cost
  – And leads to many empty bins, i.e. redundancy
Fisher vector representation: Motivation 2
• Instead, the Fisher vector also records the mean and variance of the points per dimension in each cell
  – More information for the same number of visual words
  – Does not increase computation time significantly
  – Leads to high-dimensional feature vectors
• Even when the counts are the same, the position and variance of the points in the cell can vary
Image representation using Fisher kernels
• General idea of the Fisher vector representation
  – Fit a probabilistic model p(X; Θ) to the data
  – Represent the data with the derivative of the data log-likelihood: "How does the data want the model to change?"
      G(X, Θ) = ∂ log p(X; Θ) / ∂Θ
  – Jaakkola & Haussler, "Exploiting generative models in discriminative classifiers", in Advances in Neural Information Processing Systems 11, 1999.
• We use a mixture of Gaussians to model the local (SIFT) descriptors X = {x_n}_{n=1..N}
      L(X, Θ) = Σ_n log p(x_n),   p(x_n) = Σ_k π_k N(x_n; m_k, C_k)
  – Define the mixing weights using the soft-max function, which ensures positivity and the sum-to-one constraint:
      π_k = exp(α_k) / Σ_{k'} exp(α_{k'})
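As a minimal sketch of this model (my assumptions, not code from the course: hypothetical function names, diagonal covariances stored as a (K, D) array of variances), the soft-max mixing weights and the data log-likelihood L can be computed as:

```python
import numpy as np

def mixing_weights(alpha):
    """pi_k = exp(alpha_k) / sum_k' exp(alpha_k'): positive, sums to one."""
    e = np.exp(alpha - alpha.max())     # subtract max for numerical stability
    return e / e.sum()

def log_likelihood(X, alpha, means, variances):
    """L = sum_n log p(x_n), with p(x_n) = sum_k pi_k N(x_n; m_k, C_k).
    X: (N, D) descriptors; means, variances: (K, D), diagonal C_k."""
    pi = mixing_weights(alpha)
    # log N(x_n; m_k, C_k) for all n, k with diagonal covariances: (N, K)
    log_gauss = -0.5 * (
        ((X[:, None, :] - means[None, :, :]) ** 2 / variances[None, :, :]).sum(2)
        + np.log(2 * np.pi * variances).sum(1)[None, :]
    )
    # log-sum-exp over the K components, then sum over the N descriptors
    return np.logaddexp.reduce(np.log(pi)[None, :] + log_gauss, axis=1).sum()
```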
Image representation using Fisher kernels
• Mixture of Gaussians to model the local (SIFT) descriptors
      L(Θ) = Σ_n log p(x_n),   p(x_n) = Σ_k π_k N(x_n; m_k, C_k)
  – The parameters of the model are Θ = {α_k, m_k, C_k}_{k=1..K}
  – where we use diagonal covariance matrices
• Concatenate the derivatives to obtain the data representation
      G(X, Θ) = ( ∂L/∂α_1, ..., ∂L/∂α_K, ∂L/∂m_1, ..., ∂L/∂m_K, ∂L/∂C_1⁻¹, ..., ∂L/∂C_K⁻¹ )^T
Image representation using Fisher kernels
• Data representation
      G(X, Θ) = ( ∂L/∂α_1, ..., ∂L/∂α_K, ∂L/∂m_1, ..., ∂L/∂m_K, ∂L/∂C_1⁻¹, ..., ∂L/∂C_K⁻¹ )^T
• In total a K(1+2D)-dimensional representation, since for each visual word / Gaussian we have
  – Count (1 dim): ∂L/∂α_k = Σ_n (q_nk − π_k)
    • More/fewer patches assigned to the visual word than usual?
  – Mean (D dims): ∂L/∂m_k = C_k⁻¹ Σ_n q_nk (x_n − m_k)
    • Center of the assigned data, relative to the cluster center
  – Variance (D dims): ∂L/∂C_k⁻¹ = ½ Σ_n q_nk (C_k − (x_n − m_k)²)
    • Variance of the assigned data, relative to the cluster variance
• With the soft-assignments: q_nk = p(k | x_n) = π_k p(x_n | k) / p(x_n)
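The following sketch puts the three gradients together into the K(1+2D)-dimensional vector. It mirrors the formulas above, but the function name and parameter layout (mixing weights pi, means and diagonal variances as (K, D) arrays, fitted offline) are my assumptions, not code from the paper.

```python
import numpy as np

def fisher_vector(X, pi, means, variances):
    """Gradients of L w.r.t. alpha_k, m_k and C_k^{-1} (diagonal C_k).
    X: (N, D) descriptors; pi: (K,); means, variances: (K, D)."""
    N, D = X.shape
    # Soft-assignments q_nk = pi_k N(x_n; m_k, C_k) / p(x_n), shape (N, K)
    log_gauss = -0.5 * (
        ((X[:, None, :] - means[None, :, :]) ** 2 / variances[None, :, :]).sum(2)
        + np.log(2 * np.pi * variances).sum(1)[None, :]
    )
    log_q = np.log(pi)[None, :] + log_gauss
    q = np.exp(log_q - np.logaddexp.reduce(log_q, axis=1, keepdims=True))

    diff = X[:, None, :] - means[None, :, :]                     # (N, K, D)
    g_alpha = q.sum(0) - N * pi                                  # count, (K,)
    g_mean = (q[:, :, None] * diff / variances[None]).sum(0)     # mean, (K, D)
    g_var = 0.5 * (q.sum(0)[:, None] * variances
                   - (q[:, :, None] * diff ** 2).sum(0))         # variance, (K, D)
    return np.concatenate([g_alpha, g_mean.ravel(), g_var.ravel()])

# Illustration with random data and a random (not fitted) 8-component GMM:
X = np.random.randn(500, 64)
fv = fisher_vector(X, np.ones(8) / 8, np.random.randn(8, 64), np.ones((8, 64)))
assert fv.shape == (8 * (1 + 2 * 64),)     # K(1+2D) dimensions
```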
Bag-of-words vs. Fisher vector image representation
• Bag-of-words image representation
  – Off-line: fit k-means clustering to local descriptors
  – Represent the image with a histogram of visual word counts: K dimensions
• Fisher vector image representation
  – Off-line: fit a MoG model to local descriptors
  – Represent the image with the derivative of the log-likelihood: K(2D+1) dimensions
• Computational cost is similar:
  – Both compare N descriptors to K visual words (centers / Gaussians)
• Memory usage: higher for Fisher vectors
  – The Fisher vector is a factor (2D+1) larger, e.g. a factor 257 for SIFT!
    • I.e. for 1000 visual words this is roughly 257 × 1000 × 4 bytes ≈ 1 MB
  – However, because we store more information per visual word, we can generally obtain the same or better performance with far fewer visual words
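A quick check of the size arithmetic above, assuming 4-byte floats:

```python
K, D = 1000, 128                  # visual words, SIFT dimensions
bow_dim = K                       # BoW histogram: 1,000 dimensions
fv_dim = K * (1 + 2 * D)          # Fisher vector: 257,000 dimensions
print(fv_dim // bow_dim)          # factor 2D+1 = 257
print(fv_dim * 4 / 1e6)           # ~1 MB per image
```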
Images from the PASCAL VOC categorization task
• Yearly evaluation since 2005 for image classification (also object localization, segmentation, and body-part localization)
Fisher vectors: classification performance
• Results taken from: "Fisher Kernels on Visual Vocabularies for Image Categorization", F. Perronnin and C. Dance, in CVPR '07
• BoW and Fisher vector yield similar performance
  – The Fisher vector uses 32× fewer Gaussians
  – The BoW representation is 2,000-dimensional, while the FV length is 64 × (1 + 2 × 128) = 16,448
Additional reading material
• Fisher vector image representation
  – "Fisher Kernels on Visual Vocabularies for Image Categorization", F. Perronnin and C. Dance, in CVPR '07
• Pattern Recognition and Machine Learning, Chris Bishop, 2006, Springer, Section 6.2
Exam
• Friday January 27th
  – From 9 am to 12 noon
  – Room H105, Ensimag building @ campus
• Prepare from
  – Lecture slides
  – Presented papers
  – Bishop's book
• During the exam you can bring
  – the lecture slides
  – the presented papers