Multiple-View Object Recognition in Band-Limited Distributed Camera Networks
(PowerPoint presentation transcript)


  1. Title
     Outline: Introduction, Random Projection, Compressive Sensing, Distributed Recognition, Experiment, Conclusion
     Multiple-View Object Recognition in Band-Limited Distributed Camera Networks
     Allen Y. Yang, Subhransu Maji, Mario Christoudias, Kirak Hong, Posu Yan, Trevor Darrell, Jitendra Malik, and Shankar Sastry
     Fusion 2009
     http://www.eecs.berkeley.edu/~yang

  2–4. Classical Object Recognition
     Affine-invariant features: SIFT.
     SIFT feature matching [Lowe 1999, van Gool 2004]: (a) Autostitch; (b) recognition.
     Bag of words [Nister 2006].

  5–6. SIFT Feature Coding in Sensor Networks
     In band-limited camera networks:
     1. Compress the scalable SIFT tree [Girod et al. 2009].
        Observation 1: The tree histogram can be fully reconstructed from its leaf nodes.
        Observation 2: The leaf-node histogram is largely sparse (up to 10^6-D).
        R: the sequence of consecutive zero bins; S: the sequence of nonzero bin values.
     2. Multiple-view SIFT feature selection [Darrell et al. 2008].
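Observation 2 suggests a simple zero-run/value coding of the sparse leaf histogram. The sketch below (function names are hypothetical; the actual codec in [Girod et al. 2009] may differ) stores R, the zero-run lengths, and S, the nonzero bin values:

```python
import numpy as np

def encode_sparse_histogram(x):
    """Encode a sparse histogram as (R, S): R holds the length of the
    zero run preceding each nonzero bin, S holds the nonzero values."""
    R, S = [], []
    run = 0
    for v in x:
        if v == 0:
            run += 1
        else:
            R.append(run)
            S.append(v)
            run = 0
    return R, S, len(x)  # total length D is kept to restore trailing zeros

def decode_sparse_histogram(R, S, D):
    """Rebuild the full D-dimensional histogram from (R, S)."""
    x = np.zeros(D)
    pos = 0
    for run, v in zip(R, S):
        pos += run      # skip the zero run
        x[pos] = v      # place the nonzero bin
        pos += 1
    return x

x = np.array([0, 0, 3, 0, 0, 0, 7, 1, 0, 0])
R, S, D = encode_sparse_histogram(x)
assert np.array_equal(decode_sparse_histogram(R, S, D), x)
```

For a mostly zero million-bin histogram, transmitting (R, S) is far cheaper than the raw bins, which is the point of the observation.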

  7–10. Problem Statement
     1. L camera sensors observe a single object in 3-D.
     2. The mutual information between cameras is unknown, and cross-sensor communication is prohibited.
     3. On each camera, construct an encoding function for a nonnegative, sparse histogram x_i:
        f : x_i ∈ R^D ↦ y_i ∈ R^d.
     4. At the base station, upon receiving y_1, y_2, ..., y_L, simultaneously recover x_1, x_2, ..., x_L and classify the object class.

  11–13. Key Observations
     [Figure: (a) Histogram 1; (b) Histogram 2]
     All histograms are nonnegative and sparse.
     Multiple-view histograms share joint sparse patterns.
     Classification is based on a pairwise similarity measure in the ℓ2-norm (linear kernel) or the ℓ1-norm (intersection kernel).
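Both similarity measures from the slide can be computed directly on a pair of histograms. A minimal sketch (function names are illustrative, not from the paper):

```python
import numpy as np

def linear_kernel(h1, h2):
    # Inner product: the similarity paired with the l2 distance.
    return float(np.dot(h1, h2))

def intersection_kernel(h1, h2):
    # Histogram intersection: the similarity paired with the l1 distance.
    return float(np.minimum(h1, h2).sum())

h1 = np.array([2.0, 0.0, 5.0, 3.0])
h2 = np.array([1.0, 4.0, 5.0, 0.0])
print(linear_kernel(h1, h2))        # 27.0
print(intersection_kernel(h1, h2))  # 6.0
```

The distinction matters later: Gaussian random projection preserves the ℓ2 geometry (linear kernel) but not the ℓ1 geometry (intersection kernel).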

  14–15. Random Projection as Encoding Function
     y = A x, where the coefficients of A ∈ R^{d×D} are drawn from a zero-mean Gaussian distribution.
     Johnson–Lindenstrauss Lemma: For a cloud of n points in R^D and a distortion threshold ε, for any d > O(ε^{-2} log n), a Gaussian random projection f(x) = A x ∈ R^d preserves pairwise ℓ2-distances:
        (1 − ε) ‖x_i − x_j‖₂² ≤ ‖f(x_i) − f(x_j)‖₂² ≤ (1 + ε) ‖x_i − x_j‖₂².
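The lemma is easy to check empirically: project a random point cloud and compare pairwise squared ℓ2-distances before and after. A sketch with illustrative sizes (n, D, d chosen for the demo, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
n, D, d = 50, 2000, 1000
X = rng.random((n, D))                 # point cloud in R^D

# Gaussian random projection; the 1/sqrt(d) scaling makes squared
# distances distance-preserving in expectation.
A = rng.normal(0.0, 1.0 / np.sqrt(d), size=(d, D))
Y = X @ A.T                            # projected cloud in R^d

# Ratio of projected to original squared distance for every pair.
ratios = []
for i in range(n):
    for j in range(i + 1, n):
        orig = np.sum((X[i] - X[j]) ** 2)
        proj = np.sum((Y[i] - Y[j]) ** 2)
        ratios.append(proj / orig)
print(min(ratios), max(ratios))        # concentrated around 1
```

With d = 1000 the observed ratios stay within a few percent of 1, matching the (1 ± ε) guarantee.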

  16–18. Classification in Random Projection Space
     The projection is applied only to the leaf-node histogram x^(4).
     [Figure: (a) Levels 1–3; (b) Level 4 (leaf nodes)]
     x^T = [x^(1) ∈ R, x^(2) ∈ R^10, x^(3) ∈ R^100, x^(4) ∈ R^1000].
     Direct classification (NN or SVM) can be applied to the projected leaf histogram y = A x^(4).
     Advantages of random projection:
     1. Easy to generate and update.
     2. Requires no training prior (universal dimensionality reduction).
     3. Faster recognition speed.
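A minimal sketch of nearest-neighbor classification in the projected space. All data and dimensions here are toy stand-ins, not the paper's experimental setup:

```python
import numpy as np

rng = np.random.default_rng(1)
D, d = 1000, 100                       # leaf-histogram dim, projected dim

def random_histogram(rng, D, k=20):
    """Toy sparse, nonnegative, normalized leaf histogram (hypothetical data)."""
    x = np.zeros(D)
    idx = rng.choice(D, size=k, replace=False)
    x[idx] = rng.random(k)
    return x / x.sum()

train_X = np.array([random_histogram(rng, D) for _ in range(30)])
train_y = np.repeat(np.arange(3), 10)  # 3 classes, 10 samples each

A = rng.normal(size=(d, D)) / np.sqrt(d)  # shared Gaussian projection
train_Y = train_X @ A.T                   # project once, store d-dim features

def classify_nn(x_leaf):
    """Project the query leaf histogram and return the nearest-neighbor label."""
    y = A @ x_leaf
    dists = np.linalg.norm(train_Y - y, axis=1)
    return int(train_y[np.argmin(dists)])

query = train_X[12] + 0.001 * np.abs(rng.normal(size=D))  # noisy class-1 sample
print(classify_nn(query))
```

Because the ℓ2 geometry is preserved, NN (and linear-kernel SVM) decisions in R^d approximate those in R^D at a fraction of the bandwidth.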

  19–21. Experiment I: COIL-100 Object Database
     Database: 100 objects; each provides 72 images captured at 5-degree increments.
     SIFT features: dense sampling over overlapping 8 × 8 grids; standard SIFT descriptor.
     4-level hierarchical k-means (k = 10): the leaf-node histogram is 1000-D.
     Setup: for each object class, randomly select 10 images for training; classify via linear SVM.
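With a 4-level tree (root plus three branchings of k = 10), building the 1000-bin leaf histogram amounts to three successive nearest-centroid descents per descriptor. The sketch below uses random stand-ins for a trained tree's centroids; a real vocabulary tree [Nister 2006] is trained by hierarchical k-means on actual SIFT data:

```python
import numpy as np

rng = np.random.default_rng(2)
k, levels, dim = 10, 3, 128            # 3 branchings -> 10**3 = 1000 leaves

# Hypothetical trained tree: centers[l][node] holds the k child centroids
# of internal node `node` at branching level l.
centers = [rng.random((k ** l, k, dim)) for l in range(levels)]

def leaf_bin(desc):
    """Quantize one SIFT descriptor down the tree to its leaf-bin index."""
    node = 0
    for l in range(levels):
        child = int(np.argmin(np.linalg.norm(centers[l][node] - desc, axis=1)))
        node = node * k + child        # flat index of the chosen child
    return node                        # in [0, 1000)

def leaf_histogram(descriptors):
    h = np.zeros(k ** levels)
    for desc in descriptors:
        h[leaf_bin(desc)] += 1
    return h

descs = rng.random((200, dim))         # stand-in for dense SIFT descriptors
h = leaf_histogram(descs)
print(int(h.sum()))                    # 200: one count per descriptor
```

Descending the tree costs k comparisons per level instead of 1000 flat comparisons, which is why vocabulary trees scale to very large codebooks.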

  22–23. From the J-L Lemma to Compressive Sensing
     [Figure: (a) J-L lemma; (b) Compressive sensing]
     Problem I: The J-L lemma does not provide a means to reconstruct the full hierarchy tree.
     Problem II: Gaussian projection does not preserve the ℓ1-distance (needed for intersection kernels).
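Compressive sensing addresses Problem I: because the leaf histogram is sparse and nonnegative, it can be reconstructed from y = A x. A sketch using generic nonnegative basis pursuit via linear programming (min Σx subject to Ax = y, x ≥ 0); this is a standard off-the-shelf solver, not necessarily the decoder used in the paper:

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(3)
D, d, k = 200, 60, 5                    # ambient dim, measurements, sparsity

# Sparse nonnegative ground-truth histogram (toy data).
x_true = np.zeros(D)
x_true[rng.choice(D, k, replace=False)] = rng.random(k) + 0.5

A = rng.normal(size=(d, D))             # Gaussian sensing matrix
y = A @ x_true                          # compressed measurement

# min sum(x)  s.t.  A x = y,  x >= 0   (l1 minimization for x >= 0)
res = linprog(c=np.ones(D), A_eq=A, b_eq=y, bounds=(0, None), method="highs")
x_hat = res.x
print(np.max(np.abs(x_hat - x_true)))   # near zero: exact recovery
```

With d well above the sparsity-dependent threshold, the ℓ1 solution coincides with the true sparse histogram, so the base station can rebuild the full tree from the leaf nodes (Observation 1) after decoding.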
