

  1. Daily Activity Recognition Combining Gaze Motion and Visual Features Yuki Shiga, Takumi Toyama, Yuzuko Utsumi, Andreas Dengel, Koichi Kise

  2. Outline • Introduction • Proposed Method • Experiment • Conclusion

  3. Outline • Introduction • Proposed Method • Experiment • Conclusion

  4. • Activity recognition draws public attention • Focus on vision-based and gaze motion-based methods • These methods deal with activities that involve eye movements

  5. Eye Tracker • An eye tracker is useful for recognizing activities that involve eye movements • It records a scene video as well as the gaze position data (scene image + gaze position, i.e., where the user fixates)

  6. Related Works • Gaze motion-based activity recognition: Bulling et al., "Eye movement analysis for activity recognition using electrooculography" [1] • Vision-based activity recognition: Hipiny et al., "Recognising Egocentric Activities from Gaze Regions with Multiple-Voting Bag of Words" [2] • Each of these works used only a single modality (motion or vision) [1] Bulling, A., Ward, J., Gellersen, H., and Tröster, G. Eye movement analysis for activity recognition using electrooculography. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(4) (2011), 741-753. [2] Hipiny, I. M., Mayol-Cuevas, W. Recognising Egocentric Activities from Gaze Regions with Multiple-Voting Bag of Words. Technical Report CSTR-12-003, 2012.

  7. Purpose • Activity can be expressed by "how eyes move" • Activity can also be expressed by "what eyes see" • We use both the vision-based and the gaze motion-based modality for activity recognition

  8. Purpose • Propose a method combining the gaze motion-based method and the vision-based method • Verify the hypothesis: the combination of vision and gaze motion can improve recognition of activities that involve eye movements


  9. Outline • Introduction • Proposed Method • Experiment • Conclusion

  10. Overview: Eye Tracker (records gaze points and scene images) → Gaze Motion Feature / Visual Feature → Classifier / Classifier → Fusion → Result

  11. Overview: Eye Tracker (records gaze points and scene images) → Gaze Motion Feature / Visual Feature → Classifier / Classifier → Fusion → Result

  12. Gaze Motion Feature • The method proposed by Bulling et al. [1] • Convert fixations and saccades into symbols representing the size and direction of each saccade (e.g., "R R r r r L R r r r R") • Compute n-gram and statistical features over the symbol sequence [1] Bulling, A., Ward, J., Gellersen, H., and Tröster, G. Eye movement analysis for activity recognition using electrooculography. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(4) (2011), 741-753.
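The symbol encoding above can be sketched as follows. This is an illustrative reconstruction of the idea rather than Bulling et al.'s exact implementation: the horizontal-only projection, the amplitude threshold, and the symbol set are all assumptions.

```python
from collections import Counter

def encode_saccades(gaze_points, large_thresh=50):
    """Map consecutive gaze points to saccade symbols.

    'r'/'l' = small rightward/leftward saccade; uppercase 'R'/'L'
    marks an amplitude above large_thresh pixels. (Illustrative:
    only horizontal motion is encoded here.)
    """
    symbols = []
    for (x0, y0), (x1, y1) in zip(gaze_points, gaze_points[1:]):
        dx = x1 - x0
        sym = 'r' if dx >= 0 else 'l'
        if abs(dx) > large_thresh:
            sym = sym.upper()
        symbols.append(sym)
    return symbols

def ngram_feature(symbols, n=2):
    """Count n-grams over the saccade string (the statistical feature)."""
    grams = [''.join(symbols[i:i + n]) for i in range(len(symbols) - n + 1)]
    return Counter(grams)

syms = encode_saccades([(0, 0), (10, 0), (100, 0), (90, 0)])
# -> ['r', 'R', 'l']: small right, large right, small left
feat = ngram_feature(syms, n=2)
# -> Counter({'rR': 1, 'Rl': 1})
```

The n-gram counts form a fixed-length vector once a vocabulary of grams is chosen, which is what the classifier consumes.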

  13. Overview: Eye Tracker (records gaze points and scene images) → Gaze Motion Feature / Visual Feature → Classifier / Classifier → Fusion → Result

  14. Visual Feature • Crop a region around the gaze point to remove irrelevant regions

  15. Visual Feature • Crop a region around the gaze point to remove irrelevant regions
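A minimal sketch of the cropping step, assuming a fixed 300 × 300 window (the crop size stated later in the experimental conditions) and clamping at the image border, which the slides do not specify:

```python
import numpy as np

def crop_around_gaze(frame, gx, gy, size=300):
    """Return a size x size crop centered (as far as possible) on the
    gaze point (gx, gy), shifted inward when it would leave the frame."""
    h, w = frame.shape[:2]
    half = size // 2
    x0 = min(max(gx - half, 0), max(w - size, 0))
    y0 = min(max(gy - half, 0), max(h - size, 0))
    return frame[y0:y0 + size, x0:x0 + size]

frame = np.zeros((960, 1280, 3), dtype=np.uint8)  # scene camera resolution
crop = crop_around_gaze(frame, gx=1270, gy=10)    # gaze near a corner
# crop.shape -> (300, 300, 3) even at the border
```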

  16. Local Feature Extraction • Detect interest points by dense sampling • Extract local features (PCA-SIFT) from each point
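Dense sampling can be sketched as a regular grid of interest points; the grid spacing below is an assumption (the slides do not give it), and the PCA-SIFT descriptor computation itself — a SIFT gradient patch projected onto a learned PCA basis — is omitted:

```python
def dense_sample_points(width, height, step=20):
    """Interest points on a regular grid over the (cropped) image.

    step is an illustrative spacing; a descriptor such as PCA-SIFT
    would then be computed at each returned (x, y) location.
    """
    half = step // 2
    return [(x, y)
            for y in range(half, height, step)
            for x in range(half, width, step)]

points = dense_sample_points(300, 300, step=20)
# 15 x 15 = 225 grid points over the 300 x 300 crop
```

Dense sampling (rather than a keypoint detector) guarantees uniform coverage of the cropped region, so textureless areas still contribute descriptors.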

  17. Convert to Global Feature • Learning: cluster the local features of training images with k-means to obtain k centroids (visual words) • Test: assign each local feature of a test image to its nearest visual word by nearest-neighbor search • The histogram of visual words is the global feature
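The test-time step — nearest-neighbor assignment of local descriptors to visual words, then histogramming — might look like this; the toy 2-D centroids stand in for k-means centroids learned from the training images:

```python
import numpy as np

def bovw_histogram(descriptors, centroids):
    """Bag-of-visual-words global feature: normalized histogram of
    nearest-centroid assignments over all local descriptors."""
    # Squared distance from every descriptor to every centroid.
    d2 = ((descriptors[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)  # nearest visual word per descriptor
    hist = np.bincount(words, minlength=len(centroids)).astype(float)
    return hist / hist.sum()

centroids = np.array([[0.0, 0.0], [10.0, 10.0]])           # k = 2 visual words
desc = np.array([[0.5, 0.2], [9.0, 9.5], [10.2, 10.1], [0.1, 0.0]])
feat = bovw_histogram(desc, centroids)                     # -> [0.5, 0.5]
```

Real descriptors would be PCA-SIFT vectors rather than 2-D points, but the assignment logic is identical.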

  18. Overview: Eye Tracker (records gaze points and scene images) → Gaze Motion Feature / Visual Feature → Classifier / Classifier → Fusion → Result

  19. Classifier • SVM with probability estimation • Two classifiers are built, one for visual features and one for gaze motion features • Learning: feature vectors labeled with the activity (e.g., Read, Write, Type)

  20. Classifier • Test: a feature vector is fed to each trained classifier (classes: Read, Write, Type, …)

  21. Classifier • Each classifier outputs a probability for every activity class (e.g., Read, Write, Type)
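SVM probability estimation is typically done with Platt scaling: a sigmoid is fitted to the raw SVM decision values on held-out data. A sketch with illustrative sigmoid parameters A and B (a real implementation fits them by cross-validation, and extends to multi-class via pairwise coupling):

```python
import math

def platt_probability(decision_value, A=-1.0, B=0.0):
    """Map an SVM decision value f(x) to P(y = 1 | x) via a fitted
    sigmoid: 1 / (1 + exp(A * f(x) + B)). A and B are illustrative."""
    return 1.0 / (1.0 + math.exp(A * decision_value + B))

p_margin = platt_probability(0.0)   # a point on the decision boundary -> 0.5
p_deep   = platt_probability(5.0)   # far on the positive side -> close to 1
```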

  22. Overview: Eye Tracker (records gaze points and scene images) → Gaze Motion Feature / Visual Feature → Classifier / Classifier → Fusion → Result

  23. Fusion • One probability per class (Read, Write, Type, …) from the gaze motion classifier and one per class from the vision classifier

  24. Fusion • The per-class probabilities from gaze motion and from vision are averaged to obtain the combined probability
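The fusion rule reduces to a per-class average followed by an arg-max; a sketch with illustrative probabilities (the class set here is a subset of the six activities):

```python
# Illustrative per-class probabilities from the two classifiers.
classes = ["read", "write", "type"]
p_gaze   = {"read": 0.6, "write": 0.3, "type": 0.1}
p_vision = {"read": 0.3, "write": 0.5, "type": 0.2}

# Fusion: average the two probability estimates class by class,
# then output the activity with the highest combined probability.
p_combined = {c: (p_gaze[c] + p_vision[c]) / 2 for c in classes}
prediction = max(p_combined, key=p_combined.get)   # -> "read" (0.45)
```

Averaging lets a confident modality compensate for an uncertain one; here vision alone would have said "write", but the combined estimate follows the stronger gaze evidence.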

  25. Outline • Introduction • Proposed Method • Experiment • Conclusion

  26. Experiments
      Experiment  | User      | Target Objects / Environments
      Baseline    | Same      | Same
      Cross-scene | Same      | Different
      Cross-user  | Different | Same
  • Baseline: whether the combined method performs better than the individual vision-based and gaze motion-based methods
  • Cross-scene: whether the combined method performs when target objects are different between training and test data
  • Cross-user: whether the combined method performs when the test data contains a person different from the training data


  27. Condition of All Experiments • Sampling rate of the eye tracker: 30 Hz • Resolution of the scene camera: 1280 × 960 pixels • Visual features are extracted from 300 × 300 pixels around gaze points • Gaze motion features are extracted from 700 gaze samples


  28. Activity List • Watch a video • Write text • Read text • Type text • Have a chat • Walk

  29. Baseline Experiment • 1 person • Contains 4 different scenes (Scene 1–4) • Activities: Watch a video, Write text, Read text, Type text, Have a chat, Walk • The dataset was divided into 2 parts

  30. Baseline Experiment [Chart: accuracy (%) per activity (Watch, Write, Read, Type, Chat, Walk, Avg.) for the Proposed, Visual, and Gaze motion methods] • The accuracy of the proposed method was the best

  31. Cross-scene Experiment • 3 people • Scenes 1–4 • Activities: Watch a video, Write text, Read text, Type text, Have a chat, Walk

  32. Cross-scene Experiment • 3 people • Leave-one-out cross-validation: one scene is left out as test data

  33. Cross-scene Experiment [Chart: accuracy (%) per activity for Proposed (Cross-scene) vs. Proposed (Baseline)] • The recognition rate of Cross-scene is lower than Baseline

  34. Cross-scene Experiment [Charts: accuracy (%) per activity for Visual (Cross-scene) vs. Visual (Baseline), and for Gaze motion (Cross-scene) vs. Gaze motion (Baseline)] • Both recognition rates dropped • Gaze motion also depends on targets or environments

  35. Cross-user Experiment • 7 people, 2 scenes • 1 person: test; the remaining 6 people: training • Activities: Watch a video, Write text, Read text, Type text, Have a chat, Walk

  36. Cross-user Experiment [Chart: accuracy (%) per activity for Proposed (Cross-user) vs. Proposed (Baseline)] • The recognition rate of Cross-user is lower than Baseline

  37. Cross-user Experiment [Chart: accuracy (%) per activity for Gaze motion (Cross-user) vs. Gaze motion (Baseline)] • Gaze motions are different between people • Gaze motions of the "Read" activity are similar between different people

  38. Outline • Introduction • Proposed Method • Experiment • Conclusion

  39. Conclusion • Combined gaze motion features and visual features to recognize daily activities that involve eye movements • The experimental results show that recognition accuracy is higher when the vision-based method and the gaze motion-based method are combined

  40. Daily Activity Recognition Combining Gaze Motion and Visual Features Yuki Shiga, Takumi Toyama, Yuzuko Utsumi, Andreas Dengel, Koichi Kise

  41. Cross-user Experiment [Chart: accuracy (%) per activity for Visual (Cross-user) vs. Visual (Baseline)]
