
Semantic Localization of Indoor Places, Lukas Kuster (presentation slides)


  1. Semantic Localization of Indoor Places, Lukas Kuster

  2. Motivation: GPS for localization [7]

  3. Motivation: Indoor navigation [8]

  4. Motivation: Crowd sensing [9]

  5. Motivation: Targeted advertisement [10]

  6. Motivation: Tourist guidance [12]

  7. Semantic Localization
     - GPS
     - WiFi
     - Images
     - Sound
     - Mobility

  8. Semantic Localization
     - GPS, WiFi, Images, Sound, Mobility
     - Works for unseen places
     - Outdoor and indoor
     - Rich in information
     - User's point of view
     - No special hardware

  9. Overview
     - Motivation
     - Image Indoor Scene Recognition
       - Recognizing Indoor Scenes (2009)
       - Unsupervised Discovery of Mid-Level Discriminative Patches (2012)
       - Blocks that Shout (2013)
     - Semantic Localization in full Systems
     - Conclusions

  10. Scene classification in computer vision
      - Goal: assign a scene category to an input image
      - [Figure: scene classifier; example category labels: Library, Classroom]

  11. Challenges in scene recognition
      - Outdoor scenes: global properties, geometric
      - Indoor scenes: local properties, semantically meaningful objects, arrangement of objects

  12. Scene Classification (timeline)
      1. 2009: Recognizing Indoor Scenes, Quattoni et al.
      2. 2012: Unsupervised Discovery of Mid-Level Discriminative Patches, Singh et al.
      3. 2013: Blocks that Shout: Distinctive Parts for Scene Classification, Juneja et al.
      ...

  13. Recognizing Indoor Scenes - Quattoni et al. (2009)
      - Two different image feature descriptors
        - Global information: GIST descriptors
        - Local information: SIFT descriptors
      - MIT Scene 67 dataset

  14. Recognizing Indoor Scenes - Quattoni et al. (2009)
      - Pipeline: Random prototypes

  15. Recognizing Indoor Scenes - Quattoni et al. (2009)
      - Pipeline: Random prototypes -> Segmentation
      - Manual and automatic segmentation into regions of interest (ROI)

  16. Recognizing Indoor Scenes - Quattoni et al. (2009)
      - Pipeline: Random prototypes -> Segmentation -> ROI descriptors
      - Manual and automatic segmentation into ROI
      - 2x2 histogram of visual words
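The "2x2 histogram of visual words" ROI descriptor can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes every local descriptor inside a ROI has already been quantized to a visual-word index, and the name `grid_histogram` and its parameters are hypothetical.

```python
def grid_histogram(points, width, height, vocab_size, grid=2):
    """Concatenate one visual-word histogram per cell of a grid x grid layout.

    points: list of (x, y, word_id) with 0 <= x < width, 0 <= y < height
            (assumed input format, coordinates relative to the ROI).
    Returns a flat list of length grid * grid * vocab_size.
    """
    hist = [0] * (grid * grid * vocab_size)
    for x, y, word in points:
        # Map the position to a grid cell, clamping points on the far edge.
        col = min(int(x * grid / width), grid - 1)
        row = min(int(y * grid / height), grid - 1)
        cell = row * grid + col
        hist[cell * vocab_size + word] += 1
    return hist


# Toy ROI of 10 x 10 pixels with a 3-word vocabulary.
example = grid_histogram(
    [(1, 1, 0), (9, 1, 1), (1, 9, 2), (9, 9, 0), (9, 9, 0)],
    width=10, height=10, vocab_size=3,
)
print(example)
```

Keeping one histogram per cell preserves a coarse spatial layout that a single bag-of-words histogram over the whole ROI would lose.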

  17. Recognizing Indoor Scenes - Quattoni et al. (2009)
      - Pipeline: Random prototypes -> Segmentation -> ROI descriptors -> Learning
      - Manual and automatic segmentation into ROI
      - 2x2 histogram of visual words
      - Optimize parameters on test set
      - $h(x) = \sum_{k=1}^{p} \beta_k \exp\left( \sum_{j=1}^{m_k} \lambda_{kj} f_{kj}(x) - \lambda_{kG}\, g_k(x) \right)$

  18. Recognizing Indoor Scenes - Quattoni et al. (2009)
      - Pipeline: Random prototypes -> Segmentation -> ROI descriptors -> Learning
      - Manual and automatic segmentation into ROI
      - 2x2 histogram of visual words
      - Optimize parameters on test set
      - $h(x) = \sum_{k=1}^{p} \beta_k \exp\left( \sum_{j=1}^{m_k} \lambda_{kj} f_{kj}(x) - \lambda_{kG}\, g_k(x) \right)$
      - Local features: $f_{kj}(x)$; global feature: $g_k(x)$; prototype weight: $\beta_k$
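A minimal sketch of how the prototype model on this slide could be evaluated. It assumes the model has the form h(x) = sum over k of beta_k * exp(sum over j of lambda_kj * f_kj(x) - lambda_kG * g_k(x)), with p prototypes and m_k ROIs per prototype. The function name, argument layout, and toy values are hypothetical and only illustrate the arithmetic.

```python
import math


def prototype_score(local_feats, global_feats, lam_local, lam_global, beta):
    """Evaluate h(x) under the assumed prototype model.

    local_feats[k][j]: local (ROI) feature f_kj(x) for prototype k
    global_feats[k]:   global feature g_k(x) for prototype k
    lam_local[k][j], lam_global[k], beta[k]: learned parameters
    """
    total = 0.0
    for k in range(len(beta)):
        local_term = sum(l * f for l, f in zip(lam_local[k], local_feats[k]))
        total += beta[k] * math.exp(local_term - lam_global[k] * global_feats[k])
    return total


# Toy example: with all features at zero, each exponential equals 1,
# so the score reduces to the sum of the prototype weights.
score = prototype_score(
    local_feats=[[0.0, 0.0], [0.0]],
    global_feats=[0.0, 0.0],
    lam_local=[[1.0, 2.0], [0.5]],
    lam_global=[1.0, 1.0],
    beta=[1.0, 0.5],
)
print(score)
```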

  19. MIT Scene 67 dataset
      - 15620 labeled images
      - 67 indoor scene categories

  20. Test Setup - Quattoni et al. (2009)
      - 67 * 80 images for training
      - 67 * 20 images for testing
      - Performance metric: standard average multiclass prediction accuracy
      - Example confusion matrix (rows: actual category, columns: predicted category):

        Actual \ Predicted   Cat. 1    Cat. 2    Cat. 3    Cat. 4    Cat. 5
        Category 1           90.12%    0.00%     9.88%     0.00%     0.00%
        Category 2            0.00%  100.00%     0.00%     0.00%     0.00%
        Category 3            0.00%    0.00%    92.66%     0.00%     7.34%
        Category 4           37.20%    0.00%    10.34%    52.46%     0.00%
        Category 5            0.00%    0.00%    12.69%     0.00%    87.31%
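The performance metric can be illustrated with the example confusion matrix from this slide. The sketch reads "average multiclass prediction accuracy" as the mean of the per-class accuracies, that is, the mean of the diagonal of the row-normalized confusion matrix. That reading and the function name `mean_class_accuracy` are assumptions.

```python
def mean_class_accuracy(confusion):
    """Mean of per-class accuracies.

    confusion[i][j]: count (or percentage) of class-i samples predicted as j.
    """
    per_class = []
    for i, row in enumerate(confusion):
        total = sum(row)
        per_class.append(row[i] / total if total else 0.0)
    return sum(per_class) / len(per_class)


# Example matrix from the slide, in percent (rows: actual, columns: predicted).
slide_matrix = [
    [90.12, 0.00, 9.88, 0.00, 0.00],
    [0.00, 100.00, 0.00, 0.00, 0.00],
    [0.00, 0.00, 92.66, 0.00, 7.34],
    [37.20, 0.00, 10.34, 52.46, 0.00],
    [0.00, 0.00, 12.69, 0.00, 87.31],
]
accuracy = mean_class_accuracy(slide_matrix)
print(round(accuracy * 100, 2))  # mean of the five diagonal entries
```

Averaging per class rather than per image keeps categories with few test images from being dominated by larger ones; with 20 test images per category, as in this setup, both averages coincide.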

  21. Results - Quattoni et al. (2009)

  22. Evaluation - Quattoni et al. (2009)
      - Segmentation methods
        - Segmentation: automatic
        - Annotation: manual
      - Features
        - Only ROI
        - ROI + Gist

  23. Conclusion - Quattoni et al. (2009)
      - Indoor scene classification
      - Local and global features
      - Low accuracy (26%)
      - Manual annotation

  24. Scene Classification (timeline)
      1. 2009: Recognizing Indoor Scenes, Quattoni et al.
      2. 2012: Unsupervised Discovery of Mid-Level Discriminative Patches, Singh et al.
      3. 2013: Blocks that Shout: Distinctive Parts for Scene Classification, Juneja et al.
      ...

  25. Unsupervised Discovery of Mid-Level Discriminative Patches - Singh et al. (2012)
      - Mid-level patches
        - Representative: occur frequently in the world
        - Discriminative: different enough from the rest of the world

  26. Singh et al. (2012)
      - Pipeline: Random discovery set

  27. Singh et al. (2012)
      - Pipeline: Random discovery set -> Random patches

  28. Singh et al. (2012)
      - Pipeline: Random discovery set -> Random patches -> K-means clustering
      - Cluster patches in HOG space

  29. Singh et al. (2012)
      - Pipeline: Random discovery set -> Random patches -> K-means clustering -> SVM train
      - Cluster patches in HOG space
      - Train detector for each cluster

  30. Singh et al. (2012)
      - Pipeline: Random discovery set -> Random patches -> K-means clustering -> SVM train -> Detect new patches
      - Cluster patches in HOG space
      - Train detector for each cluster
      - Use detector on validation set
      - Get top 5 matches for new cluster
      - Kill clusters that have fewer than 2 matches
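The last three bullets (run each detector on the validation set, keep its top 5 matches as the new cluster, and kill clusters with too few matches) can be sketched as below. This is only an illustration of the bookkeeping: the detectors are assumed to have been run already, a detection is assumed to count as a match when its SVM score exceeds `threshold`, and all names are hypothetical.

```python
def refresh_clusters(detections_by_cluster, top_k=5, min_matches=2, threshold=0.0):
    """Rebuild clusters from detector firings on the validation set.

    detections_by_cluster: {cluster_id: [(patch_id, svm_score), ...]}
    Returns {cluster_id: [patch_id, ...]} for the surviving clusters.
    """
    kept = {}
    for cluster_id, detections in detections_by_cluster.items():
        # Assumption: a "match" is a detection scoring above the threshold.
        firing = [(p, s) for p, s in detections if s > threshold]
        firing.sort(key=lambda d: d[1], reverse=True)
        top = firing[:top_k]
        # Kill clusters that have fewer than min_matches matches.
        if len(top) >= min_matches:
            kept[cluster_id] = [p for p, _ in top]
    return kept


validation_detections = {
    "a": [("p1", 0.9), ("p2", 0.1), ("p3", -0.5)],
    "b": [("q1", 0.3), ("q2", -1.0)],
    "c": [("r%d" % i, float(i)) for i in range(1, 8)],
}
new_clusters = refresh_clusters(validation_detections)
print(new_clusters)
```

In the toy input, cluster "b" fires only once above the threshold and is dropped, while cluster "c" is truncated to its five highest-scoring patches.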

  31. Ranking Detectors - Singh et al. (2012)
      - Purity
        - Same visual concept
        - Sum of top r detection scores
      - Discriminativeness
        - Detected rarely in natural world
        - $\frac{\#\,\text{detections in training set}}{\#\,\text{detections in (training set} \,\cup\, \text{natural world)}}$
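The two ranking criteria can be written down directly. The sketch assumes purity is the sum of the top r detection scores and discriminativeness is the ratio of detections in the training set to detections in the training set and the natural world combined. How the two values are merged into one ranking is not stated on the slide, so the sketch leaves it out. Function names are hypothetical.

```python
def purity(scores, r):
    """Sum of the r highest detection scores of one detector."""
    return sum(sorted(scores, reverse=True)[:r])


def discriminativeness(n_train, n_natural):
    """Fraction of all detections that fall in the training set.

    n_train:   number of detections in the training set
    n_natural: number of detections in the natural-world set
    """
    total = n_train + n_natural
    return n_train / total if total else 0.0


detector_scores = [0.2, 0.9, 0.5, 0.1]
print(purity(detector_scores, r=2))
print(discriminativeness(n_train=8, n_natural=2))
```

A detector that fires almost only on the training set gets a value close to 1, while one that fires just as often in the natural world is pushed toward lower values.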

  32. Image descriptor - Singh et al. (2012)
      - Object Bank image representation - Li, L.-J. et al. (2010)
      - Detect patches at different scales and different spatial pyramid levels
      - Train classifier with SVM

  33. Image descriptor - Singh et al. (2012)
      - Object Bank image representation - Li, L.-J. et al. (2010)
      - Detect patches at different scales and different spatial pyramid levels
      - Train classifier with SVM
      - [Figure label: SVM]

  34. Top Ranked patches - Singh et al. (2012)
      - MIT 67 Benchmark

  35. Evaluation - Singh et al. (2012)

        Method                        Accuracy (%)
        Spatial Pyramid HOG           29.8
        Spatial Pyramid SIFT (SP)     34.4
        ROI-GIST (Quattoni et al.)    26.5
        Object Bank                   37.6
        Patches                       38.1

  36. Evaluation - Singh et al. (2012)

        Method                        Accuracy (%)
        Spatial Pyramid HOG           29.8
        Spatial Pyramid SIFT (SP)     34.4
        ROI-GIST (Quattoni et al.)    26.5
        Object Bank                   37.6
        Patches                       38.1

        Combination approaches        Accuracy (%)
        GIST+SP+DPM                   43.1
        Patches+GIST+SP+DPM           49.4

  37. Conclusion
      - Quattoni et al. (2009)
        - Indoor scene classification
        - Local and global features
        - Low accuracy (26%)
        - Manual annotation
      - Singh et al. (2012)
        - Low supervision
        - Better accuracy
        - Low accuracy (49%)
        - Inefficient

  38. Scene Classification (timeline)
      1. 2009: Recognizing Indoor Scenes, Quattoni et al.
      2. 2012: Unsupervised Discovery of Mid-Level Discriminative Patches, Singh et al.
      3. 2013: Blocks that Shout: Distinctive Parts for Scene Classification, Juneja et al.
      ...

  39. Blocks that Shout: Distinctive Parts for Scene Classification - Juneja et al. (2013)
      - More efficient
      - Distinctive patches

  40. Blocks that Shout - Juneja et al. (2013)
      - Stage: Seeding
      - Pipeline: Initial training set

  41. Blocks that Shout - Juneja et al. (2013)
      - Stage: Seeding
      - Pipeline: Initial training set -> Superpixels
      - Automatic segmentation into superpixels

  42. Blocks that Shout - Juneja et al. (2013)
      - Stage: Seeding
      - Pipeline: Initial training set -> Superpixels -> Seed blocks
      - Automatic segmentation into superpixels
      - Seed blocks:
        - Intermediate-sized superpixels
        - Image variation

  43. Blocks that Shout - Juneja et al. (2013)
      - Stages: Seeding -> Expansion
      - Pipeline: Seed block -> HOG descriptor
      - 8x8 HOG cells of 8x8 pixels

  44. Blocks that Shout - Juneja et al. (2013)
      - Stages: Seeding -> Expansion
      - Pipeline: Seed block -> HOG descriptor -> Exemplar SVM
      - 8x8 HOG cells of 8x8 pixels
      - Detect similar blocks

  45. Blocks that Shout - Juneja et al. (2013)
      - Stages: Seeding -> Expansion
      - Pipeline: Seed block -> HOG descriptor -> Exemplar SVM
      - [Figure: seed -> round 1 -> round 2 -> round 3 -> round 4 -> round 5]
      - 8x8 HOG cells of 8x8 pixels
      - Detect similar blocks
      - 5 iterations for final part detector

  46. Blocks that Shout - Juneja et al. (2013)
      - Stages: Seeding -> Expansion -> Selection
      - Select most distinctive part detectors
      - Entropy: $H(Y, r) = -\sum_{y=1}^{N} p(y, r) \log_2 p(y, r)$
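The entropy-based selection step can be sketched as follows, assuming the entropy is the usual H = -sum over y of p(y, r) * log2 p(y, r), where p(y, r) is the fraction of a detector's top r detections that come from images of class y. A detector whose detections concentrate on few classes has low entropy and is treated as more distinctive. The function names and the toy labels are hypothetical.

```python
import math
from collections import Counter


def detection_entropy(labels):
    """Entropy (in bits) of the class labels of a detector's top-r detections."""
    n = len(labels)
    entropy = 0.0
    for count in Counter(labels).values():
        p = count / n
        entropy -= p * math.log2(p)
    return entropy


def rank_detectors(top_labels_by_detector):
    """Order detector ids from most to least distinctive (lowest entropy first)."""
    return sorted(top_labels_by_detector,
                  key=lambda d: detection_entropy(top_labels_by_detector[d]))


top_labels = {
    "mixed": ["library", "kitchen", "bar", "office"],
    "pure": ["library", "library", "library", "library"],
    "half": ["library", "library", "kitchen", "kitchen"],
}
ranking = rank_detectors(top_labels)
print(ranking)
```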

  47. Image descriptor - Blocks that Shout (2013)
      - Object Bank image representation - Li, L.-J. et al. (2010)
      - Detect patches at different scales and different spatial pyramid levels
      - Train classifier with SVM
      - [Figure label: SVM]

  48. Blocks that Shout - Juneja et al. (2013)
      - Results

  49. Blocks that Shout - Juneja et al. (2013): Evaluation

        Method                        Accuracy (%)
        ROI-GIST (Quattoni et al.)    26.5
        Object Bank                   37.6
        Patches (Singh et al.)        38.1
        BoP                           46.1

  50. Blocks that Shout - Juneja et al. (2013): Evaluation

        Method                                  Accuracy (%)
        ROI-GIST (Quattoni et al.)              26.5
        Object Bank                             37.6
        Patches (Singh et al.)                  38.1
        BoP                                     46.1

        Combination approaches                  Accuracy (%)
        Patches+GIST+SP+DPM (Singh et al.)      49.4
        IFV + BoP                               63.1

  51. Conclusion
      - Quattoni et al. (2009)
        - Indoor scene classification
        - Local and global features
        - Low accuracy (26%)
        - Manual annotation
      - Singh et al. (2012)
        - Low supervision
        - Better accuracy
        - Low accuracy (49%)
        - Inefficient
      - Juneja et al. (2013)
        - Low supervision
        - More efficient
        - Distinctive parts
        - Even better accuracy
        - Low accuracy (63%)
