Live Video Analytics at Scale with Approximation and Delay-Tolerance


  1. Live Video Analytics at Scale with Approximation and Delay-Tolerance Haoyu Zhang, Microsoft and Princeton University; Ganesh Ananthanarayanan, Peter Bodik, Matthai Philipose, and Paramvir Bahl, Microsoft; Michael J. Freedman, Princeton University (thanks for the slides)

  2. Computer vision background Fast GPUs have made matrix multiplication extremely cheap, which has enabled deep learning, whose core computation is matrix multiplication. Computer vision, powered by deep learning, now matches or beats humans on a variety of vision tasks. However, classification takes vast resources...

  3. Real-world analytics The paper considers real-time video analytics, motivated by smart cities. Queries include car counting, license plate recognition for tolling, and identification of cars carrying kidnapped children, each with different lag tolerances and quality needs. In a real-world setting we are often overwhelmed by data and cannot run the biggest neural network on everything. We need to judiciously allocate resources to correctly chosen machine learning tasks.

  4. VideoStorm: contributions The authors implement VideoStorm, a system that runs queries over live video. The first major contribution is a method for profiling the resource-usage versus quality trade-off of machine learning models and their pipelines. The second contribution is a scheduler that allocates resources to and configures the machine learning pipelines serving real-time video queries.

  5. VideoStorm at a glance Offline, we profile different knob settings to understand the resource/quality trade-offs of each query. Online, we periodically consider all queries and assign them resources, configurations, and so on. Each query has a utility function that encodes its quality and lag requirements; the scheduler maximizes either total utility or the minimum utility across queries (a sketch of a plausible utility shape follows below).
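A note on the utility shape: the slides do not give an exact formula, so the following is only a plausible form consistent with the description (reward quality above the lowest acceptable level, penalize lag beyond a tolerance); the per-query parameters \alpha_Q, \alpha_L, Q_{\min}, L_{\max} are assumptions, not names from the paper:

    U(Q, L) = \alpha_Q \max(0,\ Q - Q_{\min}) - \alpha_L \max(0,\ L - L_{\max})

The scheduler then maximizes either \sum_i U_i or \min_i U_i subject to the cluster's total resources.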

  6. Related work: scheduling There is a long line of prior work on cluster scheduling. In the video analytics setting, however, the resource requirement of a job is not fixed: at times of high load we can move along the resource-quality curve, which makes scheduling trickier. The authors additionally consider a setting where all queries come from the same agent, which makes fairness irrelevant.

  7. Related work: approximate query processing Compared to most prior work, the authors argue that they consider the quality of query answers and the lag requirements of queries jointly. They also argue that they provide automatic knob tuning, which incorporates transformations of the video such as changing the frame rate.

  8. Related work: hyper-parameter tuning There has been a lot of research on tuning machine learning algorithms; a typical approach is Bayesian optimization. This line of work is not mentioned in the paper at all...

  9. Technical contribution: profiling Machine learning pipelines have a large number of knobs, and the search space is combinatorial once real-valued knobs are discretized. The authors propose a local search method for finding, for every query type, configurations with a good resource-quality trade-off.

  10. Profiling: details The local search is a simple hill-climbing algorithm (see the sketch below). We select a number of “random” configurations and evaluate each using a linear combination of its quality and its resource consumption. From the best configuration we move to a “similar” configuration by perturbing a random knob, and repeat until the score stops improving. Finally we discard every configuration that is dominated in both quality and resource usage, which leaves a much smaller set of settings on the Pareto boundary.
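A minimal sketch of this hill-climbing profiler, assuming a hypothetical evaluate(config) that returns (quality, resource_demand) by running the query pipeline on a labeled clip; the knob names, trade-off weight, and stopping rule are illustrative, not the paper's.

```python
import random

def profile(knob_values, evaluate, n_starts=10, alpha=1.0):
    """Hill-climb from several random configurations and keep the Pareto boundary.

    knob_values: dict knob -> list of discrete settings, e.g. {"frame_rate": [1, 5, 30]}
    evaluate:    callable(config) -> (quality, resource_demand); assumed to run the
                 query pipeline on a labeled clip (hypothetical helper, not in the paper).
    alpha:       weight trading quality against resource demand in the search score.
    """
    score = lambda q, r: q - alpha * r          # linear combination used during the search
    explored = {}                               # config (as sorted tuple) -> (quality, resource)

    def measure(cfg):
        key = tuple(sorted(cfg.items()))
        if key not in explored:
            explored[key] = evaluate(cfg)
        return explored[key]

    for _ in range(n_starts):
        cfg = {k: random.choice(v) for k, v in knob_values.items()}
        best = score(*measure(cfg))
        while True:
            # Perturb one random knob; stop at the first non-improving neighbor
            # (a crude stopping rule, sufficient for this sketch).
            knob = random.choice(list(knob_values))
            neighbor = dict(cfg)
            neighbor[knob] = random.choice(knob_values[knob])
            s = score(*measure(neighbor))
            if s <= best:
                break
            cfg, best = neighbor, s

    # Keep only configurations not dominated in both quality and resource usage.
    points = list(explored.items())
    pareto = [(dict(k), (q, r)) for k, (q, r) in points
              if not any(q2 >= q and r2 <= r and (q2, r2) != (q, r)
                         for _, (q2, r2) in points)]
    return pareto
```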

  11. Technical contribution: resource management The authors propose a scheduler that allocates resources to the different queries and places them on machines. The system periodically performs resource allocation and query placement.

  12. Resource management: details Every query has an associated utility function that measures its sensitivity to extra quality above some lowest acceptable standard and its sensitivity to lag. The overall optimization is formulated as a knapsack-style problem: maximize total utility subject to the resource constraints. The authors use a greedy heuristic, sketched below: repeatedly add Δ resources to the query whose utility increases the most, until resources run out.
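A minimal sketch of that greedy loop, assuming each query exposes a hypothetical utility_at(resources) callable built from its profiled resource-quality curve and current lag; this illustrates the heuristic as described, not the paper's code.

```python
def greedy_allocate(queries, capacity, delta=0.1):
    """Greedily hand out resources in increments of `delta` until capacity is exhausted.

    queries:  dict query_id -> utility_at(resources), a callable returning the query's
              utility at that allocation (hypothetical helper, assumed to wrap the
              query's profiled resource-quality curve).
    capacity: total resources available (e.g. CPU cores).
    """
    alloc = {q: 0.0 for q in queries}
    remaining = capacity
    while remaining >= delta:
        # Marginal utility of giving `delta` more resources to each query.
        gains = {q: u(alloc[q] + delta) - u(alloc[q]) for q, u in queries.items()}
        best = max(gains, key=gains.get)
        if gains[best] <= 0:        # nobody benefits from more resources
            break
        alloc[best] += delta
        remaining -= delta
    return alloc
```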

  13. Resource management: placement With query configurations and resource allocations decided, the authors consider the problem of placing jobs on machines. The match between a job and a machine is the mean of three scores: 1) a utilization score, measured as the dot product of the job's resource requirements and the machine's available resources; 2) a load-balancing score (given by a formula on the slide); 3) a lag score based on the average tolerable lag. The system places each job on the machine with the highest score, and migrates a job when another machine would give a sizable improvement in its score. A sketch of this scoring appears below.
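A minimal sketch of the placement scoring. The load-balancing term is an assumed stand-in (the original formula only appears on the slide), the lag normalizer max_lag is invented, and resources are assumed to be expressed as fractions of machine capacity; none of these names come from the paper.

```python
import numpy as np

def placement_score(job_demand, machine_free, tolerable_lag, max_lag=100.0):
    """Score how well a job fits a machine; higher is better.

    job_demand, machine_free: per-resource vectors (e.g. [cpu, memory]) expressed as
    fractions of the machine's capacity, so every component lies in [0, 1].
    tolerable_lag: lag (seconds) the job can absorb; max_lag is an assumed normalizer.
    """
    job_demand = np.asarray(job_demand, dtype=float)
    machine_free = np.asarray(machine_free, dtype=float)
    if np.any(job_demand > machine_free):
        return -np.inf                          # the job does not fit on this machine

    # 1) Utilization score: alignment of the job's demand with the machine's spare capacity.
    utilization = float(np.dot(job_demand, machine_free)) / len(job_demand)

    # 2) Load-balancing score (assumed form): prefer placements that leave the
    #    machine's residual resources evenly spread across resource types.
    residual = machine_free - job_demand
    load_balance = 1.0 - float(np.std(residual))

    # 3) Lag score: jobs that can tolerate more lag on this machine score higher.
    lag = min(tolerable_lag / max_lag, 1.0)

    return (utilization + load_balance + lag) / 3.0

def place(job_demand, machines, tolerable_lag):
    """Pick the best machine; `machines` maps machine name -> free-resource vector."""
    return max(machines, key=lambda m: placement_score(job_demand, machines[m], tolerable_lag))
```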

  14. Results They compare against a fair scheduler in a scenario that starts with a baseline set of jobs and then receives a burst of additional jobs.

  15. Shortcomings: machine learning The method for selecting machine learning parameters is very primitive, and a lot of related work exists, for example Bayesian optimization tools such as Auto-WEKA. It is also not clear what the “parameters” of a neural network design are; the search space is clearly infinite.

  16. Shortcomings The scheduling part also seems quite “hacky” to me: the heuristics come without any (stated) approximation guarantees, and the query-to-machine matching score isn’t well motivated either.

  17. Future directions Profiling for machine learning could definitely be improved: how do we parametrize the design space of neural networks for efficient exploration? In these settings the difference between false positives and false negatives can matter a lot, yet vanilla ML setups treat them the same. Can this be rectified?

  18. Future directions Machine learning under resource constraints is also an interesting problem. Can we consider a setting where a model is allowed to answer “I don’t know”, in which case we escalate to a better but more expensive model (toy sketch below)? For the packing/allocation problems it would be interesting to find approximation algorithms; the problem is bipartite-matching-esque. Otherwise, use MIPs?
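A toy sketch of that escalation idea, using a confidence threshold as a stand-in for “I don’t know”; the threshold, model objects, and predict interface are all assumptions for illustration.

```python
def cascade(frame, cheap_model, expensive_model, threshold=0.8):
    """Run the cheap model first; escalate only when it is unsure.

    Both models are assumed to expose predict(frame) -> (label, confidence).
    """
    label, confidence = cheap_model.predict(frame)
    if confidence >= threshold:
        return label                                # cheap answer is confident enough
    return expensive_model.predict(frame)[0]        # escalate: better but more expensive
```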
