
6.808 Mobile and Sensor Computing aka IoT Systems, Lecture 14: Split Computing / Continuous Object Recognition



  1. 6.808 Mobile and Sensor Computing aka IoT Systems Lecture 14 Split Computing / Continuous Object Recognition

  2. Logistics & Norm Setting • What to do now? • Turn on your video (if your connection allows it) • Mute your mic (unless you are the active speaker) • Open the “Participant” List • Make sure your full name is shown • If you have a question: • Use the chat feature to either write the question or to indicate your interest in asking the question • James will be monitoring the chat • unmute -> ask question -> mute again • Same procedure for answering questions • We will post this online

  3. Glimpse: Continuous, Real-Time Object Recognition on Mobile Devices • Tiffany Chen, Lenin Ravindranath, Shuo Deng, Victor Bahl, Hari Balakrishnan

  4. Continuous, Real-Time Recognition Apps - Apps that continuously locate and label objects in a video stream.

  5. Continuous, Real-Time Recognition Apps • Driver Assistance (figure label: "SLOW") • Augmented Reality Shopping • Face Recognition • Augmented Reality Tourist App

  6. Earlier Designs: Picture-Based Object Recognition

  8. Earlier Designs: Picture-Based Object Recognition (figure overlay: "calories 180")

  9. Video-Based Object Recognition

  10. Video-Based Object Recognition (figure overlays: "Top seller", "Buy 1 get 1 free")

  11. Glimpse • Continuous, real-time object recognition on mobile devices in a video stream

  12. Glimpse • Continuous, real-time object recognition on mobile devices in a video stream • Continuously identify and locate objects in each frame (figure: faces labeled "Alice" and "Bob" across successive frames)

  13. Object Recognition Pipeline

  14. Object Recognition Pipeline: Detection -> Feature Extraction -> Classification

  18. Object Recognition Pipeline: Detection -> Feature Extraction -> Classification -> "Stop Sign"

  19. Before Convolutional Neural Network: Detection -> Feature Extraction -> Classification

  20. Before Convolutional Neural Network • Feature Extraction: feature engineering

  22. Before Convolutional Neural Network • Feature Extraction: feature engineering (figure: an image reduced to a feature vector, e.g. 12, -2, …)

  23. Convolutional Neural Network • Feature Extraction: feature learning • Berkeley Caffe: http://caffe.berkeleyvision.org/

  24. Object Recognition Pipeline: Detection -> Feature Extraction -> Classification -> "Stop Sign"

  25. Object Recognition Pipeline: Detection -> Feature Extraction -> Classification -> "Stop Sign" • Computationally expensive and memory-intensive • Server is 700x faster than Google Glass • Scalability • We need to offload the recognition pipeline to servers

  26. Client-Server Architecture • Client: Camera, Display • Client -> Network -> Server: Frame • Server: Detection -> Feature Extraction -> Classification • Server -> Network -> Client: Labels, bounding boxes

  27. End-to-End Latency Lowers Accuracy (figure: expected result vs. in reality…)
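To make the effect of latency concrete, here is a minimal sketch, assuming an object that moves at a constant velocity while the server's answer is in flight. The box format `(x, y, w, h)`, the helper names `iou` and `stale_box_iou`, and the example velocity are illustrative assumptions, not taken from the lecture.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x, y, w, h)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # Overlap along each axis, clamped at zero when the boxes are disjoint
    ix = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def stale_box_iou(box, velocity_px_per_frame, delay_frames):
    """IoU between the box the server labeled and where the object is now.

    Assumes constant velocity (an illustrative simplification).
    """
    x, y, w, h = box
    vx, vy = velocity_px_per_frame
    current = (x + vx * delay_frames, y + vy * delay_frames, w, h)
    return iou(box, current)
```

Under these assumptions, a 100x100 box moving 5 px per frame overlaps its stale position with an IoU of 0.25 after a 12-frame delay, and not at all after a 30-frame delay.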

  28. Client-Server Architecture • Client: Camera, Display • Client -> Network -> Server: Frame • Server: Detection -> Feature Extraction -> Classification • Server -> Network -> Client: Labels, bounding boxes • Challenges: 1. End-to-end latency lowers object recognition accuracy

  29. Client-Server Architecture • Client: Camera, Display • Client -> Network -> Server: Frame • Server: Detection -> Feature Extraction -> Classification • Server -> Network -> Client: Labels, bounding boxes • Challenges: 1. End-to-end latency lowers object recognition accuracy 2. Bandwidth and battery efficiency

  30. Glimpse Architecture • Client: Camera, Active Cache, Display • Client -> Network -> Server: Frame • Server: Detection -> Feature Extraction -> Classification • Server -> Network -> Client: Labels, bounding boxes • 1. Active Cache combats e2e latency and regains accuracy

  31. Glimpse Architecture • Client: Camera, Trigger Frame, Active Cache, Display • Client -> Network -> Server: Frame • Server: Detection -> Feature Extraction -> Classification • Server -> Network -> Client: Labels, bounding boxes • 1. Active Cache combats e2e latency and regains accuracy • 2. Trigger Frame reduces bandwidth usage

  32. Glimpse Architecture • Client: Camera, Trigger Frame, Active Cache, Display • Client -> Network -> Server: Frame • Server: Detection -> Feature Extraction -> Classification • Server -> Network -> Client: Labels, bounding boxes • 1. Active Cache combats e2e latency and regains accuracy

  33. End-to-End Latency Lowers Accuracy Is it possible to combat latency and regain accuracy?

  34. Relocate Moving Object with Tracking • Object tracking on the client to re-locate the object (figure: Frame 0 vs. Frame 12, delay = 360 ms)

  35. Relocate Moving Object with Tracking • Object tracking on the client to re-locate the object • Fast (figure: Frame 0 vs. Frame 12, delay = 360 ms)

  36. Relocate Moving Object with Tracking • Object tracking on the client to re-locate the object • Fails to work when object displacement is large

  37. Relocate Moving Object with Tracking • Object tracking on the client to re-locate the object • Fails to work when object displacement is large (figure: Frame 0 vs. Frame 30, delay = 1 sec)

  38. Regain Accuracy with Active Cache • Cache and run tracking through the cached frames

  39. Regain Accuracy with Active Cache • Cache and run tracking through the cached frames (figure: Frame 0, Active Cache, Network, Server; delay = 1 sec)

  40. Regain Accuracy with Active Cache • Cache and run tracking through the cached frames (figure: Frame 0, Active Cache, Network, Server, label "Alice"; delay = 1 sec)

  41. Regain Accuracy with Active Cache • Cache and run tracking through the cached frames • Run tracking from Frame 0 to Frame 30 (figure: Active Cache holding Frame 0 through Frame 30, label "Alice"; delay = 1 sec)

  42. Regain Accuracy with Active Cache • Cache and run tracking through the cached frames • Tracking through all cached frames takes too long!
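The idea of running tracking through the cached frames can be sketched as follows. This is a toy model under stated assumptions, not the Glimpse implementation: frames are 1-D lists, `toy_track` is a hypothetical local tracker that only searches a few pixels around the last known position, and the `ActiveCache` method names are invented for illustration.

```python
from collections import deque


def make_frame(pos, width=200, size=5):
    """Toy 1-D 'frame': zeros with a run of ones where the object is."""
    return [1 if pos <= i < pos + size else 0 for i in range(width)]


def toy_track(frame, x, size=5, radius=4):
    """Hypothetical local tracker: search within +/- radius of x.

    Returns x unchanged if the object is not found nearby, which mimics
    tracking failure when the displacement is large.
    """
    best_x, best_score = x, 0
    lo = max(0, x - radius)
    hi = min(len(frame) - size, x + radius)
    for cand in range(lo, hi + 1):
        score = sum(frame[cand:cand + size])
        if score > best_score:
            best_x, best_score = cand, score
    return best_x


class ActiveCache:
    """Buffers frames captured while a request is in flight (sketch)."""

    def __init__(self):
        self.frames = deque()

    def on_frame_sent(self, frame):
        # Start a fresh cache at the frame that goes to the server
        self.frames.clear()
        self.frames.append(frame)

    def on_new_frame(self, frame):
        self.frames.append(frame)

    def on_server_reply(self, x):
        # x is the object's position in the frame that was sent;
        # step the tracker through every frame cached since then
        for frame in list(self.frames)[1:]:
            x = toy_track(frame, x)
        return x


def run_demo():
    """Object moves 3 px per frame for 30 frames; server answers for frame 0."""
    frames = [make_frame(10 + 3 * k) for k in range(31)]
    cache = ActiveCache()
    cache.on_frame_sent(frames[0])
    for f in frames[1:]:
        cache.on_new_frame(f)
    through_cache = cache.on_server_reply(10)
    direct = toy_track(frames[30], 10)  # jump straight from frame 0 to 30
    return through_cache, direct
```

In this toy setup, stepping through the cache follows the object to its current position, while tracking directly from Frame 0 to Frame 30 loses it because the displacement exceeds the tracker's search radius.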

  44. Adaptive Frame Selection • Given n_cached frames, select s_selected frames so that we can catch up without sacrificing tracking performance

  45. Adaptive Frame Selection • Given n_cached frames, select s_selected frames so that we can catch up without sacrificing tracking performance • 1. How many frames to select? • 2. Which frames to select?

  46. Adaptive Frame Selection • Given n_cached frames, select s_selected frames so that we can catch up without sacrificing tracking performance • 1. How many frames to select? • s_selected: active cache processing time vs. tracking accuracy

  47. Adaptive Frame Selection • Given n_cached frames, select s_selected frames so that we can catch up without sacrificing tracking performance • 1. How many frames to select? • s_selected: active cache processing time vs. tracking accuracy • What is the maximum number of frames that can be tracked? • e = execution time for processing any frame in the active cache • N frames per second => have 1/N seconds before the next frame => can process s_selected = (1/N)/e frames

  48. Adaptive Frame Selection • Given n_cached frames, select s_selected frames so that we can catch up without sacrificing tracking performance • 1. How many frames to select? • s_selected: active cache processing time vs. tracking accuracy • What is the maximum number of frames that can be tracked? • What if I’m okay with increasing the latency a bit? • e = execution time for processing any frame in the active cache • N frames per second => have 1/N seconds before the next frame • If I’m fine with a lag of t frames => can process s_selected = (t/N)/e frames
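The frame budget from the slide, s_selected = (t/N)/e, can be written as a small helper. This is a minimal sketch: the function name `frames_budget`, the rounding down to whole frames, and the example numbers are assumptions added for illustration.

```python
import math


def frames_budget(fps, exec_time_s, lag_frames=1):
    """Number of cached frames that can be processed: (t / N) / e.

    fps         -- N, camera frames per second
    exec_time_s -- e, seconds to process one cached frame
    lag_frames  -- t, tolerated lag in frames (t = 1 gives (1/N)/e)
    Rounding down to a whole frame is an assumption of this sketch.
    """
    if fps <= 0 or exec_time_s <= 0 or lag_frames < 1:
        raise ValueError("fps and exec_time_s must be positive, lag_frames >= 1")
    return math.floor((lag_frames / fps) / exec_time_s)
```

For example, assuming N = 30 frames per second and e = 4 ms per frame, the budget is 8 frames with no extra lag and 16 frames if a lag of t = 2 frames is acceptable.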

  49. Adaptive Frame Selection • Given n_cached frames, select s_selected frames so that we can catch up without sacrificing tracking performance • 2. Given s_selected, which frames to select? • Temporal redundancy between frames

  50. Adaptive Frame Selection • Given n_cached frames, select s_selected frames so that we can catch up without sacrificing tracking performance • 2. Given s_selected, which frames to select? • Temporal redundancy between frames • Use frame differencing to quantify movement and select frames to capture as much movement as possible
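One way to turn "use frame differencing to capture as much movement as possible" into code is to spread the selected frames evenly over the cumulative movement instead of evenly over time. The sketch below is one possible interpretation under stated assumptions, not the selection algorithm used by Glimpse: frames are flat lists of grayscale values, and `frame_diff` and `select_frames` are hypothetical names.

```python
def frame_diff(a, b):
    """Mean absolute pixel difference between two equally sized frames."""
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)


def select_frames(frames, s):
    """Pick up to s indices from frames[1:], spaced by cumulative movement.

    frames[0] is the frame the server already labeled. The newest frame
    is always selected so that tracking ends at the present.
    """
    n = len(frames)
    if s <= 0 or n < 2:
        return []
    if s >= n - 1:
        return list(range(1, n))  # budget covers every cached frame

    # cum[i] = total movement observed between frame 0 and frame i
    cum = [0.0]
    for i in range(1, n):
        cum.append(cum[-1] + frame_diff(frames[i - 1], frames[i]))
    total = cum[-1]
    if total == 0:
        return [n - 1]  # static scene: the newest frame is enough

    selected = []
    idx = 1
    for k in range(1, s):
        target = total * k / s
        # Advance to the first frame that has accumulated the target movement
        while idx < n - 1 and cum[idx] < target:
            idx += 1
        if idx < n - 1 and (not selected or selected[-1] != idx):
            selected.append(idx)
    selected.append(n - 1)
    return selected
```

With this scheme, stretches of the cache where nothing moves contribute no selected frames, so the budget is spent where the scene actually changes.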

  51. Active Cache • Short question: Does Glimpse reduce the end-to-end latency of object recognition?

  52. Active Cache • Short question: Does active cache reduce the end-to-end latency of object recognition? • No. It’s a trick that makes users think the recognition is happening in real time.

  53. Glimpse Architecture • Client: Camera, Trigger Frame, Active Cache, Display • Client -> Network -> Server: Frame • Server: Detection -> Feature Extraction -> Classification • Server -> Network -> Client: Labels, bounding boxes • 1. Active Cache combats e2e latency and regains accuracy • 2. Trigger Frame reduces bandwidth usage

  54. Reduce Bandwidth Usage with Trigger Frames • Strategically send certain trigger frames to the server

  55. Reduce Bandwidth Usage with Trigger Frames • Strategically send certain trigger frames to the server • 1. Measuring scene changes from the previously processed frame
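A threshold on the scene change since the last frame sent is one simple way to decide which frames become trigger frames. The sketch below assumes that interpretation: the `TriggerFramePolicy` class, the mean-absolute-difference metric, and the threshold value are illustrative assumptions rather than details given in the lecture.

```python
def frame_diff(a, b):
    """Mean absolute pixel difference between two equally sized frames."""
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)


class TriggerFramePolicy:
    """Decides whether a frame should be sent to the server (sketch)."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.last_sent = None  # previously processed (sent) frame

    def should_send(self, frame):
        # Always send the first frame; afterwards send only when the scene
        # has changed enough relative to the last frame that was sent
        if self.last_sent is None or frame_diff(self.last_sent, frame) > self.threshold:
            self.last_sent = frame
            return True
        return False
```

Comparing against the last frame that was sent, rather than the immediately preceding frame, means slow drift still accumulates and eventually triggers a new request.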
