DATA ANALYTICS USING DEEP LEARNING GT 8803 // FALL 2018 // JENNIFER


  1. DATA ANALYTICS USING DEEP LEARNING
     GT 8803 // FALL 2018 // JENNIFER MA
     LECTURE #14: LIVE VIDEO ANALYTICS AT SCALE WITH APPROXIMATION AND DELAY-TOLERANCE

  2. TODAY’S PAPER
     • Live Video Analytics at Scale with Approximation and Delay-Tolerance

  3. TODAY’S AGENDA
     • Problem Overview
     • Key Ideas
     • Technical Details
     • Experiments
     • Discussion

  4. PROBLEM OVERVIEW
     • Querying camera recordings
     • Traffic intersections, retail stores, offices, etc.
     • Slow and costly

  6. PROBLEM OVERVIEW
     • Use cases?
       ◦ Catching criminals
         ▪ Shoplifting
         ▪ Trafficking
       ◦ Sending ambulances
         ▪ Car accidents
         ▪ Free routes
       ◦ Traffic control
       ◦ Amber alerts

  8. PROBLEM OVERVIEW
     • 2 main problems with querying videos
       ◦ Slow
       ◦ Costly

  9. PROBLEM OVERVIEW
     • Querying a month-long video requires 280 GPU hours and costs $250
     • Running that query in 1 minute requires tens of thousands of GPUs
     • Traffic jurisdictions and retailers may have only tens or hundreds
     • VOT Challenge 2015: 1 fps
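
The GPU count on this slide follows from simple arithmetic. A minimal sketch, assuming the work splits evenly across GPUs with no scheduling overhead (an idealization that the slide does not state); the function name `gpus_needed` is hypothetical:

```python
import math

GPU_HOURS_PER_MONTH_OF_VIDEO = 280  # figure quoted on the slide


def gpus_needed(gpu_hours: float, deadline_minutes: float) -> int:
    """GPUs required to finish `gpu_hours` of work within the deadline,
    assuming perfectly parallel work (an idealization)."""
    gpu_minutes = gpu_hours * 60
    return math.ceil(gpu_minutes / deadline_minutes)


print(gpus_needed(GPU_HOURS_PER_MONTH_OF_VIDEO, 1))  # 16800
```

Under that assumption, 280 GPU hours is 16,800 GPU minutes, which is consistent with the "tens of thousands of GPUs" order of magnitude on the slide.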

  10. PROBLEM OVERVIEW
      • Goal: Optimize thousands of queries operating in clusters

  11. KEY IDEAS
      • 2 key characteristics of video analytics
        ◦ Resource-quality trade-off with multi-dimensional configurations
        ◦ Variety in quality and lag goals

  13. KEY IDEAS
      • Resource-quality trade-off with multi-dimensional configurations
        ◦ Resource: estimated amount of resources needed
        ◦ Quality: accuracy of output
        ◦ Configuration: a combination of parameters for an algorithm
        ◦ Multi-dimensional: each configuration has multiple parameters

  14. KEY IDEAS
      • Example parameters:
        ◦ Video resolution
        ◦ Frame rate
        ◦ Size of the sliding window
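
To make "multi-dimensional configuration" concrete, the sketch below enumerates every combination of the three example knobs. The knob values are hypothetical; the slides name the knobs but not their ranges.

```python
from itertools import product

# Hypothetical knob values, chosen only for illustration.
KNOBS = {
    "resolution": ["480p", "720p", "1080p"],
    "frame_rate": [1, 5, 30],
    "window_size": [8, 16],
}


def enumerate_configurations(knobs):
    """Return every combination of knob values as a dict."""
    names = list(knobs)
    return [dict(zip(names, values)) for values in product(*knobs.values())]


configs = enumerate_configurations(KNOBS)
print(len(configs))  # 18
print(configs[0])    # {'resolution': '480p', 'frame_rate': 1, 'window_size': 8}
```

Even three small knobs give 18 configurations per query, which hints at why picking configurations by hand does not scale.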

  23. KEY IDEAS
      • Variety in quality and lag goals
        ◦ Some outputs don’t need to be 100% accurate, such as counts of cars
        ◦ Some outputs can wait
          ▪ Traffic tickets where the billing can be delayed
        ◦ Queries that need a fast result?
          ▪ Amber alerts
        ◦ Outputs that need to have high accuracy?
          ▪ Amber alerts
        ◦ Low accuracy?
          ▪ Counting cars

  26. KEY IDEAS
      • How do systems for stream processing allocate resources?
        ◦ Resource fairness
      • VideoStorm, the system proposed in the paper, takes into account resource demand, the quality needed, and lag tolerance
      • Lag is the amount of time that a frame has been waiting to be processed

  28. KEY IDEAS
      • Challenges?
        ◦ Hard to analyze the resources and output quality needed for a query
        ◦ Hard to pick configurations because there are many knobs
        ◦ Trading off between lag and quality goals is tricky
        ◦ Resource allocation across all queries, each having many configurations, is computationally intractable

  29. KEY IDEAS
      • Solution
        ◦ Offline phase:
          ▪ Analyze the resource demand and quality of each query under different configurations
          ▪ Pick the configurations on the Pareto boundary
        ◦ Online phase:
          ▪ Scheduler reallocates resources, reselects configurations, and considers migrating queries to different machines
          ▪ Decisions are based on resource-quality profiles and changes in resource capacity

  30. TECHNICAL DETAILS
      Video queries specification:
      • Queries are submitted to VideoStorm as sequences of transforms
      • A transform (task) can have multiple inputs and outputs
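
A query as a sequence of transforms can be sketched as function composition. This is an illustration only, not VideoStorm's actual API: the helper names (`sample_every`, `keep_label`, `run_query`) and the `label` field are hypothetical, and the sketch covers only a linear chain, not transforms with multiple inputs and outputs.

```python
from functools import reduce
from typing import Callable, Iterable, List

Transform = Callable[[List[dict]], List[dict]]


def sample_every(n: int) -> Transform:
    """Keep every n-th frame (a stand-in for a frame-sampling transform)."""
    return lambda frames: frames[::n]


def keep_label(label: str) -> Transform:
    """Keep frames whose hypothetical 'label' field matches."""
    return lambda frames: [f for f in frames if f["label"] == label]


def run_query(transforms: Iterable[Transform], frames: List[dict]) -> List[dict]:
    """Apply each transform in order to the previous transform's output."""
    return reduce(lambda data, t: t(data), transforms, frames)


frames = [{"id": i, "label": "car" if i % 2 == 0 else "person"} for i in range(10)]
result = run_query([sample_every(2), keep_label("car")], frames)
print([f["id"] for f in result])  # [0, 2, 4, 6, 8]
```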

  31. Resource Allocation
      • Have a selection of configurations
      • Pick a configuration for each query to improve overall quality
      • Let lag-tolerant queries fall behind when queries with low lag tolerance need resources

  33. Real-world video queries - Examples
      • License plate reader
      • Car counter
      • Deep neural network classifier for object detection and classification
      • Object tracker

  35. TECHNICAL DETAILS
      • Parameters that affect CPU demand and quality for most video queries
        ◦ Image resolution
        ◦ Frame sampling rate

  37. TECHNICAL DETAILS
      • How do these affect license plate reader queries?
        ◦ Lower resolution and lower sampling rate dramatically reduce resource demand
        ◦ But they lead to missed or incorrectly read plates

  39. TECHNICAL DETAILS
      • How do they affect a car counter?
        ◦ Quality remains good at lower resolution and sampling rate

  40. TECHNICAL DETAILS
      • Profile estimation
        ◦ Profile: estimated resources needed and desired accuracy of output
        ◦ For a configuration of parameters, for one query

  41. Profile Estimation
      • Overview
        ◦ Pareto boundary
        ◦ Compute a value for each profile
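
The Pareto boundary keeps only profiles that no other profile beats on both demand and quality. A minimal sketch, assuming each profile is a dict with hypothetical `demand` and `quality` fields and made-up numbers:

```python
def pareto_boundary(profiles):
    """Return profiles that are not dominated by any other profile.

    A profile is dominated if another one has demand <= and quality >=,
    with at least one of the two strictly better.
    """
    boundary = []
    for p in profiles:
        dominated = any(
            q["demand"] <= p["demand"]
            and q["quality"] >= p["quality"]
            and (q["demand"] < p["demand"] or q["quality"] > p["quality"])
            for q in profiles
        )
        if not dominated:
            boundary.append(p)
    return boundary


# Made-up profiles for illustration.
profiles = [
    {"config": "A", "demand": 1.0, "quality": 0.60},
    {"config": "B", "demand": 2.0, "quality": 0.55},  # dominated by A
    {"config": "C", "demand": 3.0, "quality": 0.90},
    {"config": "D", "demand": 3.5, "quality": 0.90},  # dominated by C
]
print([p["config"] for p in pareto_boundary(profiles)])  # ['A', 'C']
```

Only boundary configurations need to be passed to the online scheduler, since a dominated configuration costs more for the same or lower quality.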

  42. Profile Estimation
      • Choosing configurations by greedy exploration
        ◦ High quality and low demand
        ◦ Hill climbing
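
Hill climbing over configuration knobs can be sketched as follows. The scoring function (quality minus a weighted demand) and the synthetic quality and demand formulas are assumptions made for this example; the slides only say that the search favors high quality and low demand.

```python
def hill_climb(knobs, score, start):
    """Greedy local search: move to the best neighbouring configuration
    (one knob changed by one step) until no neighbour scores higher."""
    current = dict(start)
    while True:
        best, best_score = None, score(current)
        for name, values in knobs.items():
            i = values.index(current[name])
            for j in (i - 1, i + 1):
                if 0 <= j < len(values):
                    candidate = {**current, name: values[j]}
                    s = score(candidate)
                    if s > best_score:
                        best, best_score = candidate, s
        if best is None:
            return current
        current = best


# Hypothetical knob values.
KNOBS = {"resolution": [240, 480, 720, 1080], "frame_rate": [1, 5, 10, 30]}


def toy_score(cfg, beta=0.5):
    """Synthetic stand-in for a measured profile: quality saturates at
    720 lines and 10 fps, while demand keeps growing with both knobs."""
    quality = min(1.0, cfg["resolution"] / 720) * min(1.0, cfg["frame_rate"] / 10)
    demand = (cfg["resolution"] / 1080) * (cfg["frame_rate"] / 30)
    return quality - beta * demand


best = hill_climb(KNOBS, toy_score, {"resolution": 240, "frame_rate": 1})
print(best)  # {'resolution': 720, 'frame_rate': 10}
```

With this toy score the search stops at the point where quality saturates, because raising either knob further only adds demand. Hill climbing can stop at a local optimum on less regular profiles.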

  43. TECHNICAL DETAILS
      • Resource management:
        ◦ Allocation: of resources for each query
        ◦ Placement: of new and old queries

  44. TECHNICAL DETAILS
      • Utility function for a configuration
        ◦ Based on predicted quality and lag
        ◦ Utility is used to help select a configuration for a query

  45. TECHNICAL DETAILS
      Utility function: Baseline + bonus - penalty
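
One way to read "baseline + bonus - penalty" is as a hinge-shaped function: a bonus for quality above a minimum goal and a penalty for lag beyond a tolerated maximum. The sketch below assumes that form; all parameter names and values are hypothetical and are not taken from the slides.

```python
def utility(quality, lag, *, baseline, min_quality, max_lag,
            quality_weight, lag_weight):
    """Baseline, plus a bonus for quality above the goal, minus a penalty
    for lag beyond the tolerated maximum (assumed hinge form)."""
    bonus = quality_weight * max(0.0, quality - min_quality)
    penalty = lag_weight * max(0.0, lag - max_lag)
    return baseline + bonus - penalty


# Illustrative numbers only.
u = utility(0.75, 24, baseline=1.0, min_quality=0.25, max_lag=20,
            quality_weight=2.0, lag_weight=0.125)
print(u)  # 1.5
```

Here the bonus is 2.0 × (0.75 − 0.25) = 1.0 and the penalty is 0.125 × (24 − 20) = 0.5, so the utility is 1.0 + 1.0 − 0.5 = 1.5.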

  46. TECHNICAL DETAILS
      • Optimization objectives
        ◦ Public cloud: maximize revenue -> maximize sum of utilities
        ◦ Shared private cluster: want fairness -> maximize min utility
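
The two objectives can prefer different allocations. A small sketch with made-up per-query utilities; the allocation names and the helper `best_allocation` are hypothetical:

```python
def best_allocation(candidates, objective):
    """Return the name of the candidate whose per-query utilities
    score highest under the given objective (e.g. sum or min)."""
    return max(candidates, key=lambda name: objective(candidates[name]))


# Per-query utilities under two hypothetical allocations.
candidates = {
    "favor_one": [5.0, 0.5, 0.5],  # sum 6.0, min 0.5
    "balanced": [1.5, 1.5, 1.5],   # sum 4.5, min 1.5
}

print(best_allocation(candidates, sum))  # favor_one
print(best_allocation(candidates, min))  # balanced
```

Maximizing the sum rewards concentrating resources where they pay off most, while maximizing the minimum protects the worst-off query.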

  47. TECHNICAL DETAILS
      • Resource allocation
        ◦ Optimize for near future
        ◦ Greedy approach
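
A greedy allocator can be sketched as handing out one resource unit at a time to the query with the largest marginal utility gain. This is a simplified illustration of the greedy idea, not the scheduler described in the paper; the utility curves are made up, and optimizing over the near future is not modeled.

```python
def greedy_allocate(utility_curves, capacity):
    """utility_curves[q][n] is the utility of query q given n units.
    Repeatedly give one unit to the query with the largest gain."""
    allocation = {q: 0 for q in utility_curves}
    for _ in range(capacity):
        best_q, best_gain = None, 0.0
        for q, curve in utility_curves.items():
            n = allocation[q]
            if n + 1 < len(curve):
                gain = curve[n + 1] - curve[n]
                if gain > best_gain:
                    best_q, best_gain = q, gain
        if best_q is None:  # no query benefits from another unit
            break
        allocation[best_q] += 1
    return allocation


# Made-up utility curves: the car counter saturates quickly.
curves = {
    "plate_reader": [0.0, 0.5, 0.75, 0.875],
    "car_counter": [0.0, 0.75, 0.875, 0.875],
}
print(greedy_allocate(curves, 3))  # {'plate_reader': 2, 'car_counter': 1}
```

The car counter gets the first unit because its initial gain is largest, then the plate reader receives the next two, since its curve keeps rising while the counter's flattens.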

  48. TECHNICAL DETAILS
      Query placement
      • Place new queries based on 3 goals
        ◦ Maximizing utility in the cluster
        ◦ Load balancing
        ◦ Lag spreading

  49. Evaluation
      • Profiles are ‘nearly’ correct
      • Setup
        ◦ 4 types of queries
      • Baseline
        ◦ Fair scheduler
      • Metrics
        ◦ Quality
        ◦ % frames exceeding lag goal
        ◦ Utility
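
The lag metric can be computed directly from per-frame timestamps. A minimal sketch, assuming lag is measured as processing time minus arrival time in seconds; the function name and the sample timestamps are hypothetical:

```python
def percent_frames_exceeding_lag(arrival_times, processed_times, lag_goal):
    """Percentage of frames whose wait (processed - arrived) exceeds lag_goal."""
    lags = [done - arrived for arrived, done in zip(arrival_times, processed_times)]
    late = sum(1 for lag in lags if lag > lag_goal)
    return 100.0 * late / len(lags)


# Made-up timestamps in seconds.
arrivals = [0, 1, 2, 3]
processed = [5, 30, 10, 40]
print(percent_frames_exceeding_lag(arrivals, processed, lag_goal=20))  # 50.0
```

The four frames wait 5, 29, 8, and 37 seconds, so two of them exceed a 20-second lag goal.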

  50. Evaluation
      • Performance
        ◦ 300 queries of 4 types
        ◦ Lag of 20s or 300s
        ◦ Quality goal of 0.25
        ◦ 300 ‘distinct’ video datasets
