Person re-identification by Local Maximal Occurrence representation and metric learning Liao Shengcai, Hu Yang, Zhu Xiangyu, Li Stan Z. Experiment Presenter: Zhenpei Yang
Person Re-identification: given an image of a person from one camera, identify the same person in images taken by different cameras.
Slides credit: liangzheng
Person re-identification is a challenging problem because:
● Large intra-class variance due to changes in pose, viewpoint, and illumination.
● A proper metric is needed to compute cross-class distance.
Contribution
● Extract good features: Local Maximal Occurrence (LOMO)
● Use a good distance metric: Cross-view Quadratic Discriminant Analysis (XQDA)
About the distance metric
Which image is more likely to correspond to the query image Q: A or B?
Model the distributions of the intra-class distance and the extra-class distance!
[Figure: query image Q and examined images A and B]
Discriminative model
Intuition: model the covariance of the intra-class distance and of the extra-class distance separately, each with a Gaussian.
[Figure panels: Linear Discriminant Analysis (LDA), Quadratic Discriminant Analysis (QDA), Cross-view Quadratic Discriminant Analysis (XQDA)]
Cross-view Quadratic Discriminant Analysis (XQDA)
Intuition: the original feature space has too many dimensions, so it may help to consider the problem in a subspace.
Distances are hard to measure precisely in a high-dimensional space. Measure them in a subspace instead!
Which subspace? What about PCA?
Cross-view Quadratic Discriminant Analysis (XQDA)
The two distributions, of the intra-class and extra-class feature differences, both have zero mean.
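The zero-mean Gaussian assumption can be made concrete with a minimal sketch. Assuming the intra-class and extra-class differences x - z follow N(0, Sigma_I) and N(0, Sigma_E), the log-likelihood ratio reduces, up to constants, to the quadratic form (x - z)^T (Sigma_I^-1 - Sigma_E^-1) (x - z). The function name and the toy covariances below are illustrative assumptions, not values from the slides.

```python
import numpy as np

def gaussian_ratio_distance(x, z, sigma_i, sigma_e):
    """d(x, z) = (x - z)^T (Sigma_I^-1 - Sigma_E^-1) (x - z).

    Smaller values mean the pair looks more like a same-person pair.
    """
    diff = np.asarray(x, dtype=float) - np.asarray(z, dtype=float)
    m = np.linalg.inv(sigma_i) - np.linalg.inv(sigma_e)
    return float(diff @ m @ diff)

# Toy covariances (assumed): same-person differences are small,
# different-person differences are large.
sigma_i = np.diag([0.5, 0.5])
sigma_e = np.diag([4.0, 4.0])

q = np.array([1.0, 1.0])    # query
a = np.array([1.2, 0.9])    # candidate close to the query
b = np.array([3.0, -1.0])   # candidate far from the query

d_qa = gaussian_ratio_distance(q, a, sigma_i, sigma_e)
d_qb = gaussian_ratio_distance(q, b, sigma_i, sigma_e)
print(d_qa < d_qb)  # candidate A is the more likely match under this toy model
```

Because both Gaussians are centred at zero, only the covariances matter, which is why the metric is a pure quadratic form with no linear term.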
XQDA chooses the subspace that maximizes the ratio of the two classes' variances.
[Figure panels: PCA, XQDA]
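A rough sketch of how such a subspace could be computed: maximizing (w^T Sigma_E w) / (w^T Sigma_I w) is a generalized eigenvalue problem, solved here by whitening with Sigma_I. This is a simplified illustration, not the authors' implementation: it pools all pairs from a single labelled set instead of cross-camera pairs, and the function names, the regularizer `reg`, and the synthetic data are assumptions.

```python
import numpy as np

def pairwise_difference_covariances(x, labels):
    """Covariances of x_i - x_j over same-label and different-label pairs."""
    x = np.asarray(x, dtype=float)
    labels = np.asarray(labels)
    diffs = x[:, None, :] - x[None, :, :]            # all ordered pairs, (n, n, d)
    same = labels[:, None] == labels[None, :]
    off_diag = ~np.eye(len(x), dtype=bool)
    intra = diffs[same & off_diag]                   # same identity, i != j
    extra = diffs[~same]                             # different identities
    # Differences are zero-mean by symmetry, so use the mean outer product.
    sigma_i = intra.T @ intra / len(intra)
    sigma_e = extra.T @ extra / len(extra)
    return sigma_i, sigma_e

def variance_ratio_subspace(sigma_i, sigma_e, r, reg=1e-6):
    """Top-r directions w maximizing (w^T Sigma_E w) / (w^T Sigma_I w)."""
    d = sigma_i.shape[0]
    chol = np.linalg.cholesky(sigma_i + reg * np.eye(d))
    chol_inv = np.linalg.inv(chol)
    # After whitening by Sigma_I this is an ordinary symmetric eigenproblem.
    vals, vecs = np.linalg.eigh(chol_inv @ sigma_e @ chol_inv.T)
    order = np.argsort(vals)[::-1][:r]
    return chol_inv.T @ vecs[:, order], vals[order]

# Synthetic data: dimension 0 is a high-variance nuisance shared by everyone,
# dimension 1 carries the identity with little noise.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(5), 6)
x = np.column_stack([rng.normal(0, 5, 30), labels + rng.normal(0, 0.1, 30)])

sigma_i, sigma_e = pairwise_difference_covariances(x, labels)
W, ratios = variance_ratio_subspace(sigma_i, sigma_e, r=1)

# PCA for contrast: leading eigenvector of the total covariance.
pca_vals, pca_vecs = np.linalg.eigh(np.cov(x.T))
pca_dir = pca_vecs[:, -1]
```

On this toy data PCA follows the high-variance nuisance dimension, while the variance-ratio direction follows the dimension that actually separates identities.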
Viewpoint Invariance Analysis
● Video taken with a hand-held camera
● Total: 23 seconds / 705 frames (48×128)
● 0-360 degree view
Slides credit: my roommate
Viewpoint Invariance Analysis
Learn a distance metric d(·, ·) using XQDA.
Pipeline: extract features on each frame → choose a feature → measure the distance d(f_t, f_1)
Investigated Features
● Local Maximal Occurrence (LOMO)
● LOMO without the max operator
● Convolutional Neural Network feature (CNN)
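The difference between the first two features can be illustrated with a toy sketch. It computes a histogram in each sliding window and, for the max variant, keeps the per-bin maximum over all windows in the same horizontal strip. The plain intensity histogram, the window size, the step, and the function names are simplifying assumptions; the real LOMO descriptor uses richer colour and texture histograms.

```python
import numpy as np

def strip_histograms(image, win=4, step=2, bins=4):
    """Histogram of every sliding window; result has shape (rows, cols, bins)."""
    h, w = image.shape
    rows = range(0, h - win + 1, step)
    cols = range(0, w - win + 1, step)
    out = np.zeros((len(rows), len(cols), bins))
    for i, r in enumerate(rows):
        for j, c in enumerate(cols):
            patch = image[r:r + win, c:c + win]
            hist, _ = np.histogram(patch, bins=bins, range=(0, 256))
            out[i, j] = hist
    return out

def lomo_like(image, use_max=True):
    """Toy descriptor: with use_max, keep the per-bin maximum over each strip."""
    hists = strip_histograms(image)
    if use_max:
        return hists.max(axis=1).ravel()   # horizontal position is discarded
    return hists.ravel()                   # every window position is kept

# The same bright patch at two horizontal positions, mimicking a viewpoint shift.
a = np.zeros((8, 12))
a[2:6, 0:4] = 200
b = np.zeros((8, 12))
b[2:6, 6:10] = 200

same_with_max = np.array_equal(lomo_like(a), lomo_like(b))
same_without_max = np.array_equal(lomo_like(a, use_max=False),
                                  lomo_like(b, use_max=False))
```

In this example the max variant gives identical descriptors for the two images, while the variant without the max operator does not, because it still encodes where along the strip the patch appeared.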
Distance Metric
● Cross-view Quadratic Discriminant Analysis (XQDA)
● Cosine Similarity
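As a baseline reference, the cosine measure between each frame's feature f_t and the first frame's feature f_1 could be computed roughly as follows. The function names and the toy feature vectors are assumptions for illustration only.

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def similarity_to_first_frame(features):
    """features: (T, d) array of per-frame descriptors; returns sim(f_t, f_1)."""
    first = features[0]
    return [cosine_similarity(f, first) for f in features]

# Toy descriptors that rotate away from the first frame.
features = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
curve = similarity_to_first_frame(features)
```

A viewpoint-invariant feature would keep this curve close to 1 over the whole 0-360 degree sweep.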
● The max operation in LOMO makes it more robust to viewpoint change
● XQDA can learn a metric that is more robust to viewpoint variation
[Figure panels: cosine similarity measure, XQDA similarity measure]
Which region contributes most?
● Train on four different body parts
● Compute the matching performance using each body part
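The slides do not say how the four body parts are defined. Purely as an illustration, the sketch below assumes four equal-height horizontal bands of a 128×48 frame; the function name and the band layout are hypothetical.

```python
import numpy as np

def split_into_bands(image, n_parts=4):
    """Split an image along its height into n_parts horizontal bands, top to bottom."""
    return np.array_split(image, n_parts, axis=0)

# Placeholder frame with 128 rows, 48 columns, 3 colour channels.
frame = np.zeros((128, 48, 3), dtype=np.uint8)
parts = split_into_bands(frame)
shapes = [p.shape for p in parts]
```

Each band would then go through the same feature extraction and metric learning steps on its own, so that matching performance can be compared per part.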
The upper body is the most distinguishable part
Sensitivity to Occlusion Parameter: the size of occlusion area
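One plausible way to simulate occlusion of a given size is to blank out a square block of the frame. The block shape, the fill value, the random placement, and the function name are assumptions; the slides only state that the size of the occluded area is the parameter.

```python
import numpy as np

def occlude(image, size, rng, fill=0):
    """Return a copy with a size x size block set to `fill` at a random position."""
    h, w = image.shape[:2]
    size = min(size, h, w)                      # keep the block inside the image
    top = int(rng.integers(0, h - size + 1))
    left = int(rng.integers(0, w - size + 1))
    out = image.copy()
    out[top:top + size, left:left + size] = fill
    return out

rng = np.random.default_rng(0)
frame = np.full((128, 48), 255, dtype=np.uint8)   # placeholder 48x128 frame
occluded = occlude(frame, size=16, rng=rng)
n_hidden = int((occluded == 0).sum())
```

Sweeping `size` from small to large and re-running the matching gives one point per occlusion level.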
Performance degrades monotonically as occlusion becomes more severe
Rank-1 accuracy degrades monotonically as occlusion becomes more severe
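For reference, rank-1 accuracy can be computed from a probe-by-gallery distance matrix as sketched below. The function name and the toy distances are illustrative assumptions, not results from the experiment.

```python
import numpy as np

def rank_k_accuracy(dist, probe_ids, gallery_ids, k=1):
    """Fraction of probes whose true identity is among the k nearest gallery items."""
    dist = np.asarray(dist, dtype=float)
    probe_ids = np.asarray(probe_ids)
    gallery_ids = np.asarray(gallery_ids)
    nearest = np.argsort(dist, axis=1)[:, :k]          # k closest gallery indices
    hits = (gallery_ids[nearest] == probe_ids[:, None]).any(axis=1)
    return float(hits.mean())

# Toy distances: rows are probes, columns are gallery images.
dist = [[0.1, 0.9, 0.8],
        [0.7, 0.2, 0.6],
        [0.3, 0.5, 0.4]]
ids = [0, 1, 2]

rank1 = rank_k_accuracy(dist, ids, ids, k=1)
rank2 = rank_k_accuracy(dist, ids, ids, k=2)
```

Here the third probe is matched to the wrong identity at rank 1 but is recovered at rank 2.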
Conclusion
● XQDA finds the subspace that maximizes the ratio of extra-class to intra-class variance.
● The method is not robust to occlusion.
● The LOMO feature has some viewpoint invariance thanks to the max operation.
● XQDA can learn a metric that is more robust to viewpoint variation.
● The upper body is the most distinctive part for person re-identification.
Recommendations
More recommendations