TVM for Ads Ranking @ Facebook
Hao Lu, Ansha Yu, Yinghai Lu, Andrew Tulloch
Ads Ranking at Facebook
[Diagram: ads (ad 1 … ad n) are grouped into batches and evaluated by a set of models (model 1 … model k) to produce predictions]
Ads Ranking at Facebook: Production Requirements
• Parallel execution across model evaluations
• Each model runs on a single thread
• For each model, multiple batches can execute at the same time. In this case, weights are global and shared between threads, but activations are thread-local (see the sketch after this list)
• Model weights are refreshed every few hours. Therefore, activations need to be released at the end of each inference to avoid running out of memory
• Batch size is dynamic
• C++ only
• Multiple CPU architectures: AVX-512, AVX2
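As a minimal sketch of the pattern these requirements imply, the snippet below uses TVM's Python API (production serving is C++, and the one-layer model is a hypothetical stand-in): one compiled module is reused by every thread that evaluates the model, while each thread creates and drops its own graph runtime instance, so activations are thread-local and released after every inference.

```python
import concurrent.futures
import numpy as np
import tvm
from tvm import relay
from tvm.contrib import graph_executor

# Hypothetical stand-in model: a single FC layer (batch 16, 64 -> 8).
x = relay.var("x", shape=(16, 64), dtype="float32")
w = relay.var("w", shape=(8, 64), dtype="float32")
mod = tvm.IRModule.from_expr(relay.Function([x, w], relay.nn.dense(x, w)))
params = {"w": tvm.nd.array(np.random.rand(8, 64).astype("float32"))}

# Compile once; the same compiled lib is reused by every thread evaluating
# this model (in production, weights are global and shared between threads).
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target="llvm", params=params)

def infer(batch):
    # Each thread owns its graph runtime instance, so activations are
    # thread-local; they are released when the instance is dropped at the
    # end of the inference, which keeps memory bounded across weight refreshes.
    runtime = graph_executor.GraphModule(lib["default"](tvm.cpu(0)))
    runtime.set_input("x", batch)
    runtime.run()
    return runtime.get_output(0).numpy()

batches = [np.random.rand(16, 64).astype("float32") for _ in range(4)]
with concurrent.futures.ThreadPoolExecutor(max_workers=4) as pool:
    predictions = list(pool.map(infer, batches))
```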
Model Architecture
[Diagram: DLRM-style architecture with embedding lookups (EMB) feeding MLPs; the TVM-compiled portion is marked]
MLP: multilayer perceptron (a sequence of FC layers + activation functions)
https://ai.facebook.com/blog/dlrm-an-advanced-open-source-deep-learning-recommendation-model/
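To make the EMB + MLP split concrete, here is a hedged Relay sketch of the MLP tower that TVM compiles (dense + ReLU layers over the concatenated dense features and embedding outputs); the layer sizes, names, and random weights are illustrative, not the production model.

```python
import numpy as np
import tvm
from tvm import relay

def build_mlp(input_dim, hidden_dims, batch_size=16):
    """Build a DLRM-style MLP in Relay: a stack of dense (FC) + ReLU layers.

    The input stands in for concatenated dense features and embedding
    lookup results; the embedding tables themselves stay outside TVM here.
    """
    x = relay.var("x", shape=(batch_size, input_dim), dtype="float32")
    params = {}
    weights = []
    out = x
    in_dim = input_dim
    for i, units in enumerate(hidden_dims):
        w_name = f"fc{i}_weight"
        w = relay.var(w_name, shape=(units, in_dim), dtype="float32")
        weights.append(w)
        params[w_name] = tvm.nd.array(
            np.random.uniform(-0.1, 0.1, (units, in_dim)).astype("float32"))
        out = relay.nn.relu(relay.nn.dense(out, w))
        in_dim = units
    func = relay.Function([x] + weights, out)
    return tvm.IRModule.from_expr(func), params

mod, params = build_mlp(input_dim=512, hidden_dims=[256, 128, 1])
print(mod)  # inspect the generated Relay IR
```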
Ads Ranking Models Implementation
[Diagram: dense features + embeddings from Caffe2 (batch_size x) feed multiple graph runtime instances (batch_size 1 … batch_size n), which produce the predictions]
• JIT (not AOT), because models are updated periodically
• Graph runtime does not manage memory
  • weights are shared between threads for the same model
  • activations are shared across graph runtime instances
  • activations are released after each iteration to avoid OOM
Performance
• Use MKL for FC for simplicity (see the build sketch below)
• 5-10% speedup from fusion
• Runtime overhead eats into the speedup
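A hedged Python sketch of this flow (production uses the C++ runtime, but the calls map closely), reusing build_mlp from the previous sketch: the model is JIT-compiled at load time with cblas offload for FC (which dispatches to MKL when TVM is built against it), and a graph runtime instance is created per batch size, since the graph runtime itself needs static shapes. API names follow recent TVM releases (relay.build returning a factory module, tvm.contrib.graph_executor); older releases used graph_runtime.create(graph, lib, ctx).

```python
import numpy as np
import tvm
from tvm import relay
from tvm.contrib import graph_executor

# "-libs=cblas" routes nn.dense to the extern BLAS kernel, i.e. MKL when
# TVM is linked against it ("use MKL for FC for simplicity"). The -mcpu
# flag targets the AVX-512 fleet; a second build would target AVX2.
target = "llvm -mcpu=skylake-avx512 -libs=cblas"
dev = tvm.cpu(0)

# For illustration, compile and create one graph runtime per batch size
# (the graph runtime cannot handle dynamic shapes); the Relay VM section
# that follows removes this restriction.
runtimes = {}
for batch_size in (8, 16, 32):
    mod, params = build_mlp(input_dim=512, hidden_dims=[256, 128, 1],
                            batch_size=batch_size)
    with tvm.transform.PassContext(opt_level=3):  # JIT at model-load time
        lib = relay.build(mod, target=target, params=params)
    runtimes[batch_size] = graph_executor.GraphModule(lib["default"](dev))

# Score one batch of 16 candidates.
batch = np.random.rand(16, 512).astype("float32")
rt = runtimes[16]
rt.set_input("x", batch)
rt.run()
prediction = rt.get_output(0).numpy()
```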
What's Next
Relay VM
• Handles dynamic shapes (see the sketch below)
• JIT compilation
• Dynamic memory allocation
Performance
• Autotuning at scale
• FBGEMM for fp16 and int8
• Embedding lookup
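For the dynamic-shape and dynamic-memory points, a hedged sketch of the Relay VM path: the batch dimension is left symbolic with relay.Any(), the module is compiled with relay.vm.compile, and a single executable then serves any batch size, with the VM allocating activation memory per invocation. The one-layer model and exact API spellings are illustrative and vary somewhat across TVM versions.

```python
import numpy as np
import tvm
from tvm import relay
from tvm.runtime.vm import VirtualMachine

# One FC layer with a symbolic batch dimension instead of a fixed one.
batch = relay.Any()
x = relay.var("x", shape=(batch, 512), dtype="float32")
w = relay.var("w", shape=(256, 512), dtype="float32")
out = relay.nn.relu(relay.nn.dense(x, w))
mod = tvm.IRModule.from_expr(relay.Function([x, w], out))

# Compile once for any batch size; the VM allocates memory dynamically
# at run time instead of relying on a static memory plan.
exe = relay.vm.compile(mod, target="llvm")
vm = VirtualMachine(exe, tvm.cpu(0))

w_np = np.random.rand(256, 512).astype("float32")
for batch_size in (1, 7, 64):
    data = np.random.rand(batch_size, 512).astype("float32")
    result = vm.run(data, w_np)  # same executable, different batch sizes
    print(batch_size, result.shape)
```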