MIRIS: Fast Object Track Queries in Video


  1. MIRIS: Fast Object Track Queries in Video. Favyen Bastani, Songtao He, Arjun Balasingam, Karthik Gopalakrishnan, Mohammad Alizadeh, Hari Balakrishnan, Michael Cafarella, Tim Kraska, Sam Madden. MIT CSAIL.

  2. Traffic Cameras, Dashcams, Miscellaneous

  3. Video Analytics: Debugging Autonomous Vehicle Software; Traffic Planning; Finding Interesting Events; Real-Time Mapping

  4. Prior Work [1, 2, 3]. Query: Select video frames with three buses.
     [1] NoScope: Optimizing Neural Network Queries over Video at Scale. Daniel Kang et al. VLDB 2017.
     [2] Accelerating Machine Learning Inference with Probabilistic Predicates. Yao Lu et al. SIGMOD 2018.
     [3] BlazeIt: Optimizing Declarative Aggregation and Limit Queries for Neural Network-Based Video Analytics. Daniel Kang et al. VLDB 2020.

  5. Prior Work [1, 2, 3]. Diagram: an Object Detector applied to each of three frames.

  6. Prior Work [1, 2, 3]. Diagram: an Object Detector applied to each of three frames.

  7. Prior Work [1, 2, 3]. Diagram: an Object Detector applied to each of three frames, with outputs 0, 3, and 1.

  8. Prior Work [1, 2, 3]. Approximate Classifier (fast, inaccurate) applied to each of three frames, with scores 0.03 ❌, 0.96, and 0.23.

  9. Prior Work [1, 2, 3]. Approximate Classifier scores: 0.03 (X), 0.96, 0.23. Object Detector run after the 0.96 and 0.23 classifiers; result shown: 3 buses ✅.

  10. Prior Work [1, 2, 3]. Query: Select video frames with three buses. Approximate Classifier scores: 0.03 (X), 0.96, 0.23. Object Detector results: 3 buses ✅; only 1 bus ❌.

  11. Object Track Queries: <(t_1, x_1, y_1, w_1, h_1), …, (t_n, x_n, y_n, w_n, h_n)>

  12. Object Track Queries

  13. Object Track Queries

  14. Object Track Queries

  15. Object Track Queries

  16. Find cars that rapidly decelerate

  17. Find cars that rapidly decelerate. Given track A: select A if there is a 1 sec interval I such that, if v_1 is A's velocity in the first half of I, and v_2 is its velocity in the second half, then v_1 - v_2 exceeds a threshold.
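
The deceleration predicate above can be sketched in Python. This is only an illustration under stated assumptions, not the MIRIS implementation: the `Detection` class, the helper names, the use of box centres for position, and the choice to try a window starting at every detection are all hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Detection:
    t: float  # timestamp in seconds
    x: float  # box centre x (assumed convention)
    y: float  # box centre y (assumed convention)
    w: float
    h: float


def mean_speed(track: List[Detection], t0: float, t1: float) -> Optional[float]:
    """Average speed (distance per second) over detections with t0 <= t <= t1.

    Assumes the track is sorted by timestamp. Returns None when fewer than
    two detections fall inside the interval.
    """
    pts = [d for d in track if t0 <= d.t <= t1]
    if len(pts) < 2 or pts[-1].t == pts[0].t:
        return None
    dist = sum(math.hypot(b.x - a.x, b.y - a.y) for a, b in zip(pts, pts[1:]))
    return dist / (pts[-1].t - pts[0].t)


def rapidly_decelerates(track: List[Detection], threshold: float,
                        window: float = 1.0) -> bool:
    """True if some `window`-second interval has v_1 - v_2 > threshold,
    where v_1 and v_2 are the speeds in its first and second halves."""
    half = window / 2
    for d in track:  # try an interval starting at each detection
        v1 = mean_speed(track, d.t, d.t + half)
        v2 = mean_speed(track, d.t + half, d.t + window)
        if v1 is not None and v2 is not None and v1 - v2 > threshold:
            return True
    return False
```

Because the predicate compares speeds inside half-second windows, it needs several detections per second, which is why tracking at a very low framerate can make it unreliable.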

  18. Find bears catching salmon

  19. Find bears catching salmon. Given bear A and salmon B: select (A, B) if A and B intersect for at least two seconds.
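
A minimal sketch of this two-track predicate follows. It assumes both tracks are sampled at the same timestamps and that boxes are stored as `(x, y, w, h)` with `(x, y)` the top-left corner; the function names and the dictionary layout are hypothetical, not taken from the talk.

```python
from typing import Dict, Tuple

Box = Tuple[float, float, float, float]  # (x, y, w, h), top-left corner (assumed)


def boxes_intersect(a: Box, b: Box) -> bool:
    """Axis-aligned rectangle overlap test."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def intersect_for(track_a: Dict[float, Box], track_b: Dict[float, Box],
                  min_seconds: float = 2.0) -> bool:
    """True if the boxes overlap continuously for at least `min_seconds`.

    Tracks map timestamp (seconds) to box. Only timestamps present in both
    tracks are compared, so gaps in the shared timestamps are not detected.
    """
    run_start = None
    for t in sorted(set(track_a) & set(track_b)):
        if boxes_intersect(track_a[t], track_b[t]):
            if run_start is None:
                run_start = t  # a new run of overlapping frames begins
            if t - run_start >= min_seconds:
                return True
        else:
            run_start = None  # overlap broken, reset the run
    return False
```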

  20. Find cars that run a red light

  21. Find cars that run a red light. Given car A and red light B: select (A, B) if A starts in the bottom-right and ends in the top-left, and the interval of A is contained in the interval of B.
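
The red-light predicate combines a spatial condition with a temporal one. The sketch below assumes image coordinates with the origin at the top-left (y grows downward), treats "bottom-right" and "top-left" as quadrants of the frame, and represents the red phase as a `(start, end)` pair. These conventions and the function name are assumptions for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]  # (t, x, y): timestamp and box centre


def runs_red_light(car: List[Point], red_interval: Tuple[float, float],
                   frame_w: float, frame_h: float) -> bool:
    """True if the car starts in the bottom-right quadrant, ends in the
    top-left quadrant, and its whole track lies within the red phase.

    `car` is assumed to be non-empty and sorted by timestamp.
    """
    t0, x0, y0 = car[0]
    t1, x1, y1 = car[-1]
    starts_bottom_right = x0 > frame_w / 2 and y0 > frame_h / 2
    ends_top_left = x1 < frame_w / 2 and y1 < frame_h / 2
    # Interval containment: [t0, t1] must sit inside the red phase.
    contained = red_interval[0] <= t0 and t1 <= red_interval[1]
    return starts_bottom_right and ends_top_left and contained
```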

  22. Object Detector Object Detector Object Detector Object Detector Object Detector Object Detector

  23. Object Detector Object Detector Object Detector Object Detector Object Detector Object Detector

  24. Object Detector Object Detector Object Detector Object Detector Object Detector Object Detector

  25. Object Detector:
      ● Costly!
      ● On a $10,000 GPU, object detection runs at ~30 fps
      ● On AWS, $1 per video hour => $72K to execute a query over one month of video captured from 100 cameras
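
The $72K figure follows directly from the numbers on the slide if one month is taken as 30 days of continuous recording (an assumption; the slide does not state the month length). A short check, with a hypothetical helper name:

```python
def query_cost(cameras: int, days: int, dollars_per_video_hour: float) -> float:
    """Cost of running the object detector over every hour of video."""
    video_hours = cameras * days * 24  # each camera records 24 hours per day
    return video_hours * dollars_per_video_hour


# 100 cameras x 30 days x 24 hours = 72,000 video hours at $1 each.
print(query_cost(cameras=100, days=30, dollars_per_video_hour=1.0))  # 72000.0
```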

  26. Object Detector Object Detector

  27. Low-Framerate Tracking: Matching Errors

  28. Low-Framerate Tracking: Matching Errors. Scores shown: 0.38, 0.42, 0.02

  29. Low-Framerate Tracking: Predicate Errors

  30. Predicate Errors

  31. Predicate Errors

  32. Predicate Errors

  33. Predicate Errors

  34. MIRIS: Fast Object Track Queries over Video. Key ideas:
      ● Track at low framerate; but may need to re-visit some intermediate frames
      ● Query Planning + Object Tracking
        ○ Parameterizable query-driven object tracking method
        ○ Query planner to select the parameters using AQP techniques

  35. 10 sec 12 sec 14 sec 16 sec 18 sec Object Detections

  36. 10 sec 12 sec 14 sec 16 sec 18 sec Object Detections

  37. 10 sec 12 sec 14 sec 16 sec 18 sec

  38. 10 sec 12 sec 14 sec 16 sec 18 sec Object Track

  39. 10 sec 12 sec 14 sec 16 sec 18 sec

  40. Low-Framerate Tracking: Matching Errors. Scores shown: 0.38, 0.42, 0.02

  41. Low-Framerate Tracking: Matching Errors. Scores shown: 0.38, 0.42, 0.02. Close: keep both.
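
One way to read "Close: keep both" is that, when the top candidate matches have nearly equal scores, the tracker retains every candidate within some closeness threshold of the best instead of committing to one. The sketch below illustrates that reading only; the slides do not define how MIRIS measures closeness, so the difference-based rule, the function name, and the candidate labels are assumptions.

```python
from typing import Dict, List


def ambiguous_matches(scores: Dict[str, float], closeness: float) -> List[str]:
    """Return the candidates whose score is within `closeness` of the best.

    A single result means the match is unambiguous; several results mean
    the candidates are too close to call and all of them are kept.
    """
    if not scores:
        return []
    best = max(scores.values())
    return sorted(k for k, s in scores.items() if best - s <= closeness)
```

With the three scores from the slide and a hypothetical threshold of 0.1, `ambiguous_matches({"a": 0.38, "b": 0.42, "c": 0.02}, 0.1)` keeps `a` and `b` and drops `c`.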

  42. 10 sec 12 sec 14 sec 16 sec 18 sec

  43. Filtering
      ● Remove groups of paths that we are sure do not satisfy the predicate
      ● Several filtering methods for planner to choose from: nearest-neighbor, RNN

  44. Refinement: Address Predicate Errors

  45. MIRIS: Fast Object Track Queries over Video. Key ideas:
      ● Track at low framerate; but may need to re-visit some intermediate frames
      ● Query Planning + Object Tracking
        ○ Parameterizable query-driven object tracking method
        ○ Query planner to select the parameters using AQP techniques

  46. Query Planning: Select tracks satisfying P, with 99% accuracy. Video Dataset.

  47. Query Planning: Select tracks satisfying P, with 99% accuracy. Video Dataset.

  48. Query Planning: Select tracks satisfying P, with 99% accuracy. Video Dataset. Sampled Video Segments.

  49. Query Planning: Select tracks satisfying P, with 99% accuracy. Video Dataset. Sampled Video Segments.

  50. Query Planning: Select tracks satisfying P, with 99% accuracy. Video Dataset. Sampled Video Segments.

  51. Query Planning: Select tracks satisfying P, with 99% accuracy. Video Dataset. Sampled Video Segments. Stages: Initial Tracking, Uncertainty Resolution, Filtering, Refinement.

  52. Query Planning: Select tracks satisfying P, with 99% accuracy. Video Dataset. Sampled Video Segments. Stages: Initial Tracking, Uncertainty Resolution, Filtering, Refinement.
      ● Initial Tracking, Parameters: Sampling Framerate
      ● Uncertainty Resolution, Parameters: "Closeness" Threshold

  53. Query Planning: Select tracks satisfying P, with 99% accuracy. Video Dataset. Sampled Video Segments. Stages: Initial Tracking, Uncertainty Resolution, Filtering, Refinement.
      ● Initial Tracking, Parameters: Sampling Framerate
      ● Uncertainty Resolution, Parameters: "Closeness" Threshold
      ● Filtering, Methods: NND, RNN
      ● Refinement, Methods: Prefix-Suffix, Accel, RNN
      ● Per-method threshold parameters (T), one for each of the five methods
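
The planning idea on these slides (pick parameters using sampled video segments, subject to an accuracy target) can be sketched as a search over candidate plans. This is a generic illustration, not the MIRIS planner: `choose_plan`, the plan dictionaries, and the `evaluate` callback, which is assumed to run a plan on the sampled segments and return an `(accuracy, cost)` pair, are all hypothetical.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple

Plan = Dict[str, object]  # e.g. {"framerate": 5, "closeness": 0.1} (hypothetical keys)


def choose_plan(plans: Iterable[Plan],
                evaluate: Callable[[Plan], Tuple[float, float]],
                target_accuracy: float = 0.99) -> Optional[Plan]:
    """Return the cheapest plan whose measured accuracy meets the target.

    `evaluate(plan)` is assumed to execute the plan on the sampled video
    segments and return (accuracy, cost). Returns None if no plan qualifies.
    """
    best_plan, best_cost = None, float("inf")
    for plan in plans:
        accuracy, cost = evaluate(plan)
        if accuracy >= target_accuracy and cost < best_cost:
            best_plan, best_cost = plan, cost
    return best_plan
```

An exhaustive loop like this is only practical for a small set of candidate plans; the slides do not say how the actual planner explores the parameter space.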

  54. Evaluation: 9 Queries over 5 Video Sources. Diverse range of video sources:
      ● UAV: video captured by UAV over traffic junction
      ● Tokyo, Warsaw: video captured by fixed traffic camera
      ● Resort: video of a pedestrian walkway
      ● BDD: dashcam video

  55. Four baselines:
      ● Overlap-based tracking [1]
      ● Kernel correlation filters (KCF) [2]
      ● FlowNet [3]
      ● Probabilistic predicates [4, 5, 6]
      Plot axis labels: Higher Speed, Higher Accuracy.
      [1] Simple Online and Realtime Tracking. Alex Bewley et al. ICIP 2016.
      [2] High-Speed Tracking with Kernelized Correlation Filters. Joao Henriques et al. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2014.
      [3] FlowNet: Learning Optical Flow with Convolutional Networks. Alexey Dosovitskiy et al. ICCV 2015.
      [4] NoScope: Optimizing Neural Network Queries over Video at Scale. Daniel Kang et al. VLDB 2017.
      [5] Accelerating Machine Learning Inference with Probabilistic Predicates. Yao Lu et al. SIGMOD 2018.
      [6] BlazeIt: Optimizing Declarative Aggregation and Limit Queries for Neural Network-Based Video Analytics. Daniel Kang et al. VLDB 2020.
