Semantic Localization of Indoor Places Lukas Kuster
Motivation: GPS for localization [7]
Motivation: Indoor navigation [8]
Motivation: Crowd sensing [9]
Motivation: Targeted advertisement [10]
Motivation: Tourist guidance [12]
Semantic Localization
Sources: GPS, WiFi, Images, Sound, Mobility
Properties: works for unseen places; outdoor and indoor; rich in information; user's point of view; no special hardware
Overview
Motivation
Image Indoor Scene Recognition
  Recognizing Indoor Scenes (2009)
  Unsupervised Discovery of Mid-Level Discriminative Patches (2012)
  Blocks that Shout (2013)
Semantic Localization in Full Systems
Conclusions
Scene classification in computer vision. Goal: assign a scene category to an input image. [Figure: a scene classifier labels input images, e.g. "Library" and "Classroom"]
Challenges in scene recognition
Outdoor scenes: global properties; geometric
Indoor scenes: local properties; semantically meaningful objects; arrangement of objects
Scene Classification
1. 2009: Recognizing Indoor Scenes (Quattoni et al.)
2. 2012: Unsupervised Discovery of Mid-Level Discriminative Patches (Singh et al.)
3. 2013: Blocks that Shout: Distinctive Parts for Scene Classification (Juneja et al.)
Recognizing Indoor Scenes - Quattoni et al. (2009)
Two different image feature descriptors
Global information: GIST descriptors
Local information: SIFT descriptors
MIT Scene 67 dataset
Recognizing Indoor Scenes - Quattoni et al. (2009)
Random prototypes
Segmentation: manual and automatic segmentation into regions of interest (ROI)
ROI descriptors: 2x2 histogram of visual words
Learning: optimize parameters on test set
$$h(x) = \sum_{k=1}^{p} \beta_k \exp\Big( -\sum_{j=1}^{m_k} \lambda_{kj} f_{kj}(x) - \lambda_{kG}\, g_k(x) \Big)$$
where $f_{kj}(x)$ are the local features, $g_k(x)$ is the global feature, and $\beta_k$ is the prototype weight.
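As a rough illustration of the learning slide, the sketch below evaluates a prototype-based score of the form h(x) = sum_k beta_k * exp(-sum_j lambda_kj * f_kj(x) - lambda_kG * g_k(x)). The signs in the exponent, the function name `prototype_score`, and all argument names are assumptions made for this example. The feature values are taken as given; how they are computed from ROIs and GIST is not shown.

```python
import math

def prototype_score(local_feats, global_feats, lambdas, lambda_g, betas):
    """Sum over prototypes k of beta_k * exp(-sum_j lambda_kj * f_kj - lambda_kG * g_k).

    local_feats[k][j] : local (ROI) feature f_kj(x) for prototype k
    global_feats[k]   : global feature g_k(x) for prototype k
    lambdas[k][j]     : weight of local feature j of prototype k
    lambda_g[k]       : weight of the global feature of prototype k
    betas[k]          : prototype weight
    The exponent signs are an assumption of this sketch.
    """
    score = 0.0
    for f_k, g_k, lam_k, lam_g_k, beta_k in zip(
        local_feats, global_feats, lambdas, lambda_g, betas
    ):
        exponent = -sum(l * f for l, f in zip(lam_k, f_k)) - lam_g_k * g_k
        score += beta_k * math.exp(exponent)
    return score

# Toy example: two prototypes whose features are all zero,
# so every exponential equals 1 and the score is the sum of the weights.
zero_score = prototype_score(
    local_feats=[[0.0, 0.0], [0.0]],
    global_feats=[0.0, 0.0],
    lambdas=[[1.0, 2.0], [3.0]],
    lambda_g=[1.0, 1.0],
    betas=[1.0, 2.0],
)
```

With all features at zero the score reduces to the sum of the prototype weights, which is a quick sanity check for an implementation.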
MIT Scene 67 dataset: 15,620 labeled images; 67 indoor scene categories
Test Setup – Quattoni et al. (2009)
67 * 80 images for training
67 * 20 images for testing
Performance metric: standard average multiclass prediction accuracy

Actual \ Predicted   Category 1   Category 2   Category 3   Category 4   Category 5
Category 1               90.12%        0.00%        9.88%        0.00%        0.00%
Category 2                0.00%      100.00%        0.00%        0.00%        0.00%
Category 3                0.00%        0.00%       92.66%        0.00%        7.34%
Category 4               37.20%        0.00%       10.34%       52.46%        0.00%
Category 5                0.00%        0.00%       12.69%        0.00%       87.31%
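The performance metric can be read as the mean of the per-class accuracies, i.e. the average of the diagonal of the row-normalized confusion matrix. The sketch below applies that reading to the five-category example from the test setup slide. The function name `average_multiclass_accuracy` is made up for this example, and the interpretation of the metric is an assumption.

```python
def average_multiclass_accuracy(confusion):
    """Mean of per-class accuracies.

    confusion[i][j] is the count (or percentage) of class-i samples
    predicted as class j. Each row is normalized by its own total.
    """
    per_class = []
    for i, row in enumerate(confusion):
        total = sum(row)
        per_class.append(row[i] / total if total else 0.0)
    return sum(per_class) / len(per_class)

# Rows are actual categories, columns are predicted categories (in percent).
confusion_percent = [
    [90.12, 0.00, 9.88, 0.00, 0.00],
    [0.00, 100.00, 0.00, 0.00, 0.00],
    [0.00, 0.00, 92.66, 0.00, 7.34],
    [37.20, 0.00, 10.34, 52.46, 0.00],
    [0.00, 0.00, 12.69, 0.00, 87.31],
]
accuracy = average_multiclass_accuracy(confusion_percent)
print(f"{accuracy:.4f}")  # 0.8451
```

Averaging per class rather than per image keeps categories with many test images from dominating the score.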
Results – Quattoni et al. (2009)
Evaluation – Quattoni et al. (2009)
Segmentation methods: automatic segmentation; manual annotation
Features: ROI only; ROI + GIST
Conclusion – Quattoni et al. (2009): indoor scene classification; local and global features; low accuracy (26%); manual annotation
Unsupervised Discovery of Mid-Level Discriminative Patches – Singh et al. (2012)
Mid-level patches
Representative: occur frequently in the world
Discriminative: different enough from the rest of the world
Singh et al. (2012)
Random discovery set
Random patches
K-means clustering: cluster patches in HOG space
SVM training: train a detector for each cluster
Detect new patches: use the detector on the validation set; take the top 5 matches as the new cluster; kill clusters that have fewer than 2 matches
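The last step of the pipeline (top 5 matches form the new cluster, clusters with fewer than 2 matches are killed) can be sketched as a simple filter over detector output. This is a minimal illustration, not the authors' code. The function name, the data layout, and the `score_threshold` that decides what counts as a match are all assumptions.

```python
def refine_clusters(detections_per_cluster, top_k=5, min_matches=2,
                    score_threshold=0.0):
    """Rebuild each cluster from its detector's firings on the validation set.

    detections_per_cluster maps a cluster id to a list of
    (patch_id, detection_score) pairs. A firing counts as a match when its
    score exceeds score_threshold (a hypothetical cut-off for this sketch).
    """
    refined = {}
    for cluster_id, detections in detections_per_cluster.items():
        matches = [d for d in detections if d[1] > score_threshold]
        if len(matches) < min_matches:
            continue  # kill clusters with too few matches
        matches.sort(key=lambda d: d[1], reverse=True)
        refined[cluster_id] = [patch for patch, _ in matches[:top_k]]
    return refined

detections = {
    "c1": [("p1", 0.9), ("p2", 0.2), ("p3", -0.5)],
    "c2": [("p4", 0.1)],
    "c3": [("p%d" % i, 0.1 * i) for i in range(5, 11)],
}
new_clusters = refine_clusters(detections)
```

In the toy data, `c2` is dropped because only one patch matches, and `c3` keeps only its five highest-scoring patches.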
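Ranking Detectors – Singh et al. (2012)
Purity: detections show the same visual concept; measured as the sum of the top r detection scores
Discriminativeness: detected rarely in the natural world
$$\text{discriminativeness} = \frac{\#\,\text{detections in training set}}{\#\,\text{detections in (training set} \cup \text{natural world)}}$$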
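A small sketch of the two ranking criteria, assuming purity is the sum of the top r detection scores and discriminativeness is the share of detections that fall in the training set rather than the natural world. The slide does not say how the two are combined. The linear combination with `weight`, and all names and numbers below, are assumptions for illustration.

```python
def purity(scores, r):
    """Sum of the top r detection scores of one detector."""
    return sum(sorted(scores, reverse=True)[:r])

def discriminativeness(n_training, n_natural):
    """Detections in the training set divided by detections in both sets."""
    total = n_training + n_natural
    return n_training / total if total else 0.0

def rank_detectors(detectors, r=3, weight=1.0):
    """Order detector names by purity + weight * discriminativeness.

    The combination rule and `weight` are hypothetical choices.
    """
    def combined(name):
        d = detectors[name]
        return purity(d["scores"], r) + weight * discriminativeness(
            d["n_training"], d["n_natural"]
        )
    return sorted(detectors, key=combined, reverse=True)

# Made-up toy detectors.
detectors = {
    "a": {"scores": [0.9, 0.8, 0.7], "n_training": 8, "n_natural": 2},
    "b": {"scores": [0.5, 0.4, 0.1], "n_training": 5, "n_natural": 5},
    "c": {"scores": [0.9, 0.9, 0.9], "n_training": 1, "n_natural": 9},
}
ranking = rank_detectors(detectors)
```

Detector `c` has the highest purity but fires mostly in the natural world, so it ranks below `a` in this toy setup.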
Image descriptor – Singh et al. (2012)
Object Bank image representation – Li, L.-J. et al. (2010)
Detect patches at different scales and different spatial pyramid levels
Train classifier with SVM
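One way to picture such a descriptor: each patch detector yields a 2-D map of detection scores over the image, and the maximum response is taken in every cell of a spatial pyramid. The sketch below does this for a single scale with a 1x1 and a 2x2 grid. Max pooling, the pyramid levels, and the function names are assumptions of this example; handling of multiple scales is left out.

```python
def max_pool_region(response, r0, r1, c0, c1):
    """Maximum detector response inside rows r0..r1-1 and columns c0..c1-1."""
    return max(response[r][c] for r in range(r0, r1) for c in range(c0, c1))

def pyramid_descriptor(response_maps, levels=(1, 2)):
    """Concatenate max-pooled responses of every detector over a spatial pyramid.

    response_maps holds one 2-D list of scores per detector. Each map must
    have at least max(levels) rows and columns.
    """
    descriptor = []
    for response in response_maps:
        rows, cols = len(response), len(response[0])
        for n in levels:  # n x n grid at this pyramid level
            for i in range(n):
                for j in range(n):
                    r0, r1 = i * rows // n, (i + 1) * rows // n
                    c0, c1 = j * cols // n, (j + 1) * cols // n
                    descriptor.append(max_pool_region(response, r0, r1, c0, c1))
    return descriptor

# One detector with a 2x2 response map gives 1 + 4 = 5 values.
example = pyramid_descriptor([[[0.1, 0.9], [0.3, 0.2]]])
```

The resulting vector has (1 + 4) entries per detector and would then be passed to the SVM classifier named on the slide.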
Top-ranked patches – Singh et al. (2012): MIT 67 benchmark
Evaluation – Singh et al. (2012)
Accuracy (%):
Spatial Pyramid HOG            29.8
Spatial Pyramid SIFT (SP)      34.4
ROI-GIST (Quattoni et al.)     26.5
Object Bank                    37.6
Patches                        38.1
Combination approaches:
GIST+SP+DPM                    43.1
Patches+GIST+SP+DPM            49.4
Conclusion
Quattoni et al. (2009): indoor scene classification; local and global features; low accuracy (26%); manual annotation
Singh et al. (2012): low supervision; better accuracy; low accuracy (49%); inefficient
Blocks that Shout: Distinctive Parts for Scene Classification – Juneja et al. (2013): more efficient; distinctive patches
Blocks that Shout – Juneja et al. (2013): Seeding
Initial training set
Superpixels: automatic segmentation into superpixels
Seed blocks: intermediate-sized superpixels; image variation
Blocks that Shout – Juneja et al. (2013): Expansion
Seed block
HOG descriptor: 8x8 HOG cells of 8x8 pixels
Exemplar SVM: detect similar blocks
5 iterations (seed, round 1 to round 5) for the final part detector
Blocks that Shout – Juneja et al. (2013): Selection
Select the most distinctive part detectors
Entropy:
$$H(Y, r) = -\sum_{y=1}^{N} p(y, r) \log_2 p(y, r)$$
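The selection step can be sketched as follows, reading p(y, r) as the fraction of a detector's top r detections that come from class y (an assumption of this example). A detector whose top detections all come from one class has entropy 0 and is considered most distinctive. The function names and toy labels are made up for illustration.

```python
import math
from collections import Counter

def detection_entropy(top_labels):
    """Entropy (in bits) of the class labels of a detector's top detections.

    Written as sum p * log2(1 / p), which equals -sum p * log2(p).
    """
    total = len(top_labels)
    counts = Counter(top_labels)
    return sum((c / total) * math.log2(total / c) for c in counts.values())

def select_detectors(top_labels_per_detector, n_select):
    """Keep the n_select detectors with the lowest entropy."""
    ranked = sorted(
        top_labels_per_detector,
        key=lambda name: detection_entropy(top_labels_per_detector[name]),
    )
    return ranked[:n_select]

# Class labels of the top 4 detections of three hypothetical detectors.
top_labels = {
    "d1": ["library", "library", "library", "library"],
    "d2": ["library", "classroom", "library", "classroom"],
    "d3": ["library", "classroom", "kitchen", "bakery"],
}
selected = select_detectors(top_labels, n_select=2)
```

Here `d1` has entropy 0 bits, `d2` has 1 bit, and `d3` has 2 bits, so `d1` and `d2` are kept.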
Image descriptor – Blocks that Shout (2013)
Object Bank image representation – Li, L.-J. et al. (2010)
Detect patches at different scales and different spatial pyramid levels
Train classifier with SVM
Blocks that Shout – Juneja et al. (2013): Results
Blocks that Shout – Juneja et al. (2013): Evaluation
Accuracy (%):
ROI-GIST (Quattoni et al.)              26.5
Object Bank                             37.6
Patches (Singh et al.)                  38.1
BoP                                     46.1
Combination approaches:
Patches+GIST+SP+DPM (Singh et al.)      49.4
IFV + BoP                               63.1
Conclusion
Quattoni et al. (2009): indoor scene classification; local and global features; low accuracy (26%); manual annotation
Singh et al. (2012): low supervision; better accuracy; low accuracy (49%); inefficient
Juneja et al. (2013): low supervision; more efficient; distinctive parts; even better accuracy; low accuracy (63%)