
6.835 Multimodal Interfaces Final Presentation Zack Anderson



  1. 6.835 Multimodal Interfaces Final Presentation Zack Anderson

  2. Contents: (1) motivation; (2) example; (3) system architecture; (4) gesture recognition engine; (5) performance; (6) contributions + future

  3. Motivation
     [Devices shown: clock/radio, weather station, personal computer, calendar/planner, news channel]
     - KEY OBSERVATION: Disconnect between two classes of devices. Single-purpose home devices are easy and efficient. PCs offer extensible interfaces to data.
     - CHALLENGE: Design an easy and efficient interface to access time-sensitive data.

  4. Example: live demo

  5. System Architecture
     [Diagram. Components: User Interface; State-Machine & Contextual Booster; Speech Recognizer; Gesture Recognizer. Labeled flows: RSS feeds, etc.; mode changes / UI updates; gesture set; phrase set; mode; time; command (from each recognizer)]

  6. Gesture Recognition Engine
     - Nearest-neighbors classification

  7. Gesture Recognition Engine
     - Nearest-neighbors classification
     - Weighted Euclidean distance measures [diagram labels: weights a, b, c, d; differences Δx, Δy, Δẋ, Δẏ]

  8. Gesture Recognition Engine
     - Nearest-neighbors classification
     - Weighted Euclidean distance measures
     - Dynamically restricted gesture set for better performance

  9. Gesture Recognition Engine
     - Nearest-neighbors classification
     - Weighted Euclidean distance measures
     - Dynamically restricted gesture set for better performance

  10. Gesture Recognition Engine
      - Nearest-neighbors classification
      - Weighted Euclidean distance measures
      - Dynamically restricted gesture set for better performance
      - Transforming-normalization algorithm to make temporally similar gestures look the same

  11. Performance: Gesture Engine
      [Chart: Recognition Accuracy Per Gesture Set Size; axes: accuracy rate vs. restricted gesture set size; values shown: 100%, 99.2%; labels as extracted: 10, 5, 10 Gesture Set]
      *Tests conducted on a total sample size of 300 gestures of 10 types input by 6 different people. Left chart used 1 training example per gesture.

  12. Performance: Gesture Engine
      [Left chart: Recognition Accuracy Per Gesture Set Size; axes: accuracy rate vs. restricted gesture set size; labels as extracted: 10, 5, 10 Gesture Set]
      [Right chart: Recognition Accuracy Per Training Set Size; axes: accuracy rate vs. # of training examples (1, 2, 3, 4)]
      [Values shown across both charts, as extracted: 100%, 100%, 100%, 99.2%, 98.3%, 99.2%]
      *Tests conducted on a total sample size of 300 gestures of 10 types input by 6 different people. Left chart used 1 training example per gesture.

  13. Performance: Speech Engine
      [Chart: Recognition Accuracy Per Command Set Size; axes: accuracy rate vs. restricted grammar size (# of commands: 2, 4, 8, 16, 32); values shown, as extracted: 97.9%, 97.9%, 96.9%, 96.9%, 93.8%]
      *Tests conducted using a custom Python wrapper of the Microsoft Speech SDK. Grammars are dynamically restricted. The Microsoft Speech engine was trained before testing. Where possible, restricted grammars were kept within a domain. Non-recognitions are considered false recognitions.

  14. Performance: Usability
      - “Gestures seem to flow with the UI, making the system very intuitive.”

  15. Performance: Usability
      - “Gestures seem to flow with the UI, making the system very intuitive.”
      - “Response time needs to be faster to make the system seem seamless.”

  16. Performance: Usability
      - “Gestures seem to flow with the UI, making the system very intuitive.”
      - “Response time needs to be faster to make the system seem seamless.”
      - “Recognition accuracy is surprisingly good, making the wallcomputer efficient, simple to learn, and pleasing to use.”

  17. Performance: Usability
      - “Gestures seem to flow with the UI, making the system very intuitive.”
      - “Response time needs to be faster to make the system seem seamless.”
      - “Recognition accuracy is surprisingly good, making the wallcomputer efficient, simple to learn, and pleasing to use.”
      - “System inputs are immersive and natural. It would be nice if the UI were more tactile.”

  18. Contributions / Future
      - Designed an accurate (>99%) gesture recognition system based on optimizations of a nearest-neighbors algorithm
      - Demonstrated that multimodal, contextually restricted UIs provide superior performance
      - Presented a new paradigm of computer interaction that sits between ambient and full-PC capability
      - Built a functional “wallcomputer”

  19. Contributions / Future
      Contributions:
      - Designed an accurate (>99%) gesture recognition system based on optimizations of a nearest-neighbors algorithm
      - Demonstrated that multimodal, contextually restricted UIs provide superior performance
      - Presented a new paradigm of computer interaction that sits between ambient and full-PC capability
      - Built a functional “wallcomputer”
      Future:
      - Add more modes (e.g. schedule, automation system control, future stock quotes, etc.), integrate 3rd-party APIs (e.g. gcalendar)
      - Add more control modalities for greater user efficiency
      - Incorporate tactile/auditory feedback
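
The Gesture Recognition Engine slides (6 to 10) name four ingredients: nearest-neighbors classification, a weighted Euclidean distance, a dynamically restricted gesture set, and a normalization step. The slides do not give the implementation, so the following is only a minimal Python sketch of how those pieces could fit together. Everything specific in it is an assumption for illustration: the function names, the weight values in WEIGHTS, the choice of 16 resampled points, and the arc-length resampling plus centering and scaling used as a stand-in for the "transforming-normalization algorithm".

```python
import math

# Hypothetical weights for (x, y, x_dot, y_dot); the slides label them
# a, b, c, d but give no values.
WEIGHTS = (1.0, 1.0, 0.5, 0.5)


def resample(path, n=16):
    """Resample a list of (x, y) points to n points evenly spaced along the path."""
    dists = [0.0]
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        dists.append(dists[-1] + math.hypot(x1 - x0, y1 - y0))
    total = dists[-1]
    if total == 0.0:  # single point or no movement
        return [path[0]] * n
    out, j = [], 0
    for i in range(n):
        target = total * i / (n - 1)
        while j < len(path) - 2 and dists[j + 1] < target:
            j += 1
        seg = dists[j + 1] - dists[j]
        t = 0.0 if seg == 0.0 else (target - dists[j]) / seg
        out.append((path[j][0] + t * (path[j + 1][0] - path[j][0]),
                    path[j][1] + t * (path[j + 1][1] - path[j][1])))
    return out


def normalize(path, n=16):
    """Assumed normalization: resample, center, scale, then add velocities."""
    pts = resample(path, n)
    cx = sum(x for x, _ in pts) / n
    cy = sum(y for _, y in pts) / n
    scale = max(max(abs(x - cx), abs(y - cy)) for x, y in pts) or 1.0
    pts = [((x - cx) / scale, (y - cy) / scale) for x, y in pts]
    feats = []
    for i, (x, y) in enumerate(pts):
        px, py = pts[i - 1] if i > 0 else pts[0]
        feats.append((x, y, x - px, y - py))  # (x, y, x_dot, y_dot)
    return feats


def weighted_distance(f, g, weights=WEIGHTS):
    """Weighted Euclidean distance between two equal-length feature sequences."""
    total = 0.0
    for p, q in zip(f, g):
        total += sum(w * (a - b) ** 2 for w, a, b in zip(weights, p, q))
    return math.sqrt(total)


def classify(path, templates, allowed=None):
    """1-nearest-neighbor over stored examples.

    templates: dict mapping gesture name -> list of example paths.
    allowed: optional set of names valid in the current UI mode
             (the dynamically restricted gesture set).
    """
    query = normalize(path)
    best_name, best_dist = None, float("inf")
    for name, examples in templates.items():
        if allowed is not None and name not in allowed:
            continue
        for example in examples:
            d = weighted_distance(query, normalize(example))
            if d < best_dist:
                best_name, best_dist = name, d
    return best_name, best_dist


# Made-up templates, one training example per gesture.
TEMPLATES = {
    "swipe_right": [[(0, 0), (5, 0), (10, 0)]],
    "swipe_left": [[(10, 0), (5, 0), (0, 0)]],
    "swipe_up": [[(0, 0), (0, 5), (0, 10)]],
}

if __name__ == "__main__":
    stroke = [(1, 1), (4, 1.2), (9, 0.9)]  # a roughly horizontal rightward stroke
    print(classify(stroke, TEMPLATES))
    print(classify(stroke, TEMPLATES, allowed={"swipe_left", "swipe_up"}))
```

In this sketch the first call selects swipe_right, while the second call can only return a gesture from the allowed set. Restricting the candidates this way is one plausible reading of why accuracy on the performance slides is higher for smaller restricted sets: fewer candidates leave fewer chances for a near miss.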
