
Quick Growth through ML Model A/B Testing: Introducing the eBay Experimentation Platform for Paid Search Ads - Sleven Liu, Martin Zhang, Yi Liu

Agenda
• Why growth hacking and A/B testing?
• Search ads: the most important marketing channel
• Challenges and solutions for A/B testing
• Machine learning model integration
Hadoop Summit

Quick Growth in eBay Paid Marketing through A/B Testing & ML Models
• 60+ experiments/year
• 5+ years
• 50+ models/year

Growth Hacking
“Growth hackers are a hybrid of marketer and coder, one who … answers with A/B tests, landing pages, viral factor, email deliverability, and Open Graph. On top of this, they layer the discipline of direct marketing, with its emphasis on quantitative A/B test measurement, scenario modeling via spreadsheets, and a lot of database queries.”
– Andrew Chen, 《Growth Hacker is the new VP Marketing》
[Diagram labels: Marketing, Data]

A/B Testing
• Key elements
  – Statistical hypothesis
  – Sampling
• Benefits
  – Customer response vs. expert opinion
  – Early launch and adoption in marketing
  – Continuous delivery and integration
  – Decisions based on data and statistics
• Limitations
  – Statistical power
  – Imbalance
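To make the "statistical hypothesis" element concrete, here is a minimal sketch of a two-proportion z-test, one common way to compare conversion rates between a control group (A) and a treatment group (B). The slides do not say which test the platform uses, so the choice of test, the function name, and the example counts are all assumptions for illustration.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for H0: A and B share one conversion rate."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both groups behave the same.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts: 200 conversions from 10,000 clicks in A, 250 from 10,000 in B.
z, p_value = two_proportion_z_test(200, 10_000, 250, 10_000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value suggests that the difference is unlikely to be noise, but only if the two groups were sampled comparably, which is the imbalance limitation listed above.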

Growth Hacking Channels
• “Poor distribution, not product is the number one cause of failure” – Peter Thiel, 《Zero to One》
• Channels: Viral Marketing, Affiliate, Email, Net Ads, UGC / SEO

Google Text Ads
• Google Ads, CPC
• Content
  – Headline
  – Display URL
  – Description
• SRP + Search Network
• Exact vs. broad match
• Campaign structure

Google Product Listing Ads / Shopping Campaign
• More info (price/picture), more qualified traffic
• Catch more eyeballs
• Product/brand match
• Higher barrier, less competition
• Backend structure

Challenges of A/B Testing in Paid Search Ads
• Sampling
  – No control over users/visits
  – Accurate user targeting
  – Skewed data & low coverage
• Test setup
  – “Black box” on the third-party partner / ads platform
  – Limitation of testing objects
• Tracking
  – External data loop

A/B Testing Solution Example in Text Ads
• Sampling
  – Based on keywords
  – Stratified sampling to resolve skewed data
• Test setup
  – Campaign structure management
  – Test object: bidding models
• Tracking
  – Inside + outside tracking
  – Data loop for the model
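Keyword-based stratified sampling can be sketched as follows. This is an illustration under stated assumptions, not the platform's actual method: the three strata, their click thresholds, and the function names are hypothetical. The idea is to group keywords by historical click volume and split each group separately, so the few hot keywords are shared evenly between control (A) and treatment (B) instead of landing on one side by chance.

```python
import random
from collections import defaultdict

def stratum_of(clicks):
    """Bucket a keyword by historical clicks (thresholds are hypothetical)."""
    if clicks >= 1000:
        return "hot"
    if clicks >= 10:
        return "warm"
    return "cold"

def stratified_split(keyword_clicks, treatment_share=0.5, seed=42):
    """Assign each keyword to 'A' or 'B', splitting every stratum by the same share."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for keyword, clicks in keyword_clicks.items():
        strata[stratum_of(clicks)].append(keyword)

    assignment = {}
    for members in strata.values():
        members.sort()        # fixed order so the seeded shuffle is reproducible
        rng.shuffle(members)
        cut = round(len(members) * treatment_share)
        for i, keyword in enumerate(members):
            assignment[keyword] = "B" if i < cut else "A"
    return assignment

# Hypothetical skewed data: a few hot keywords and a long tail of cold ones.
keyword_clicks = {f"hot_{i}": 5000 for i in range(4)}
keyword_clicks.update({f"warm_{i}": 50 for i in range(10)})
keyword_clicks.update({f"cold_{i}": 1 for i in range(100)})
groups = stratified_split(keyword_clicks)
```

With plain random sampling over all 114 keywords, the four hot keywords could easily all fall into one group and dominate the result; the per-stratum split rules that out.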

Why Is Sampling Important for A/B Testing?
• Choose the right sample size
  – Does a large sample always speed up the A/B test, or does it put the business at real risk?
• Choose the right method
  – Why not simply use random sampling?
• An unrepresentative sample might hurt the business after rollout
  – Does the model work for all ads, or only for the sampled ads?
• A trustworthy sample makes the A/B result trustworthy
  – Is the difference in the A/B test result really from the model, or from a difference between the samples?
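The sample size question can be made concrete with the standard normal-approximation formula for comparing two proportions. The sketch below is not taken from the slides; the function name, default significance level, default power, and example rates are assumptions.

```python
import math
from statistics import NormalDist

def sample_size_per_group(p_base, p_target, alpha=0.05, power=0.8):
    """Approximate observations needed per group to detect p_base -> p_target."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided significance level
    z_beta = NormalDist().inv_cdf(power)            # desired statistical power
    variance = p_base * (1 - p_base) + p_target * (1 - p_target)
    n = (z_alpha + z_beta) ** 2 * variance / (p_target - p_base) ** 2
    return math.ceil(n)

# Hypothetical example: detecting a lift in conversion rate from 2.0% to 2.5%.
needed = sample_size_per_group(0.020, 0.025)
print(needed)
```

The required size grows quickly as the expected lift shrinks, which is the trade-off behind the first question above: a larger test group detects smaller effects sooner but exposes more of the business to an unproven model.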

Sampling Challenge – Huge Volume of Data
• Billions of ads
• New ads sourcing – is the process scalable as more ads are added to marketing?
• Ads history tracking – how does the process deal with historical data?

Sampling Challenge – Skewed Data & Low Coverage
[Figure: Click Distribution (hot -> cold); axes: Ad Count (0 to 25,000,000) vs. percentage (10% to 100%); labels: ADS, IMPRESSION, CLICK, VALUED CLICK]
• Low conversion rate – impression -> click -> transaction
• Top click queries
• Long tail queries
• Deal with ads with no impression on partner
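One way to quantify the skew in a hot-to-cold click distribution is to count how many of the hottest ads are needed to cover a given share of all clicks. The sketch below uses made-up numbers; the function name and the 80% threshold are assumptions, not figures from the slides.

```python
def ads_needed_for_click_share(clicks_per_ad, share=0.8):
    """Return how many of the hottest ads account for `share` of all clicks."""
    total = sum(clicks_per_ad)
    if total == 0:
        return 0
    running = 0
    # Walk from the hottest ad to the coldest, accumulating clicks.
    for count, clicks in enumerate(sorted(clicks_per_ad, reverse=True), start=1):
        running += clicks
        if running >= share * total:
            return count
    return len(clicks_per_ad)

# Hypothetical data: 4 popular ads, 50 ads with one click, 946 ads with none.
clicks_per_ad = [1000, 500, 100, 50] + [1] * 50 + [0] * 946
hot_ads = ads_needed_for_click_share(clicks_per_ad)
no_click_share = clicks_per_ad.count(0) / len(clicks_per_ad)
```

In this toy data set a handful of ads carries most of the clicks while most ads have none, which is the situation where a purely random sample says little about the hot ads.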

Sampling Solution - Method

Sampling Solution - Tech
• HBase + HDFS
  – Active ads stored in HBase
  – Ads history stored in HDFS
• Spark
  – Pre-aggregation of huge data
  – Optimization of huge data joins with ads history, user behavior…
  – Data stored as Parquet to improve Spark job efficiency

Machine Learning Model Integration
• Where is the data?
• What is a model?
• How to manage the model lifecycle?

Challenge for Data
• Data extraction
• Data processing
• Data gathering
• Original solution
  – Regular ETL data pipeline to build factors for each model
  – Move gathered factors to the model running environment based on the scenario
• Bottleneck
  – Some effort is duplicated among different models
  – Factors are not reusable because each is built to meet a specific model’s requirements
  – More effort to maintain factors, as they can come from different sources and are built for a specific model

New Solution - Factor System
• Factor: the model input
• Heterogeneous data sources
• Syntax + semantic layer
• Calculated on Hadoop
• Factor life-cycle

What the Factor System Provides
• Register service
  – Factor code integration and deployment
  – External factor registration
• Download service
  – Online model input
  – Offline data exploration and model development
• Scheduling service
  – Schedules factor code in the factor system according to the latency of each source's data
• Dashboard
  – Factor status monitor: helps understand the running status of factor code
  – Factor meta definition: helps data scientists better understand the factors when building models
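The register and download services can be pictured with a small in-memory sketch. Everything here is hypothetical: the `Factor` and `FactorRegistry` names, their fields, and the example factor are invented for illustration and say nothing about how the eBay factor system is actually implemented. The point is only that a factor is registered once, with its metadata, and then reused by any model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Factor:
    name: str
    description: str                   # meta definition shown to data scientists
    source: str                        # upstream data set the factor is built from
    compute: Callable[[dict], float]   # factor code

class FactorRegistry:
    def __init__(self) -> None:
        self._factors: Dict[str, Factor] = {}

    def register(self, factor: Factor) -> None:
        """Register service: add a factor once so every model can reuse it."""
        if factor.name in self._factors:
            raise ValueError(f"factor already registered: {factor.name}")
        self._factors[factor.name] = factor

    def describe(self, name: str) -> str:
        """Dashboard-style lookup of a factor's meta definition."""
        factor = self._factors[name]
        return f"{factor.name} ({factor.source}): {factor.description}"

    def download(self, names: List[str], record: dict) -> Dict[str, float]:
        """Download service: compute the requested factors for one record."""
        return {name: self._factors[name].compute(record) for name in names}

registry = FactorRegistry()
registry.register(Factor(
    name="ctr",
    description="clicks divided by impressions",
    source="ads_history",              # hypothetical source name
    compute=lambda r: r["clicks"] / r["impressions"] if r["impressions"] else 0.0,
))
model_input = registry.download(["ctr"], {"clicks": 5, "impressions": 200})
```

Two models that both need `ctr` would call `download` with the same name rather than each building its own ETL step, which addresses the duplicated-effort bottleneck of a per-model pipeline.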

Capacity of the Factor System
• PB-level source data volume
• 10+ TB daily increment
• 1000+ permanent factors, historical data backed up on HDFS
• Use cases
  – Batch models: serve all the machine learning models for Paid IM marketing
  – Ad hoc: support offline data exploration for data scientists and data developers
  – NRT/real-time (future): build a factor cache for NRT or real-time model use cases

What a Model Requires
• The model can access the data it needs based on the logical design
• The model can be executed in the expected environment, using the right tech to meet different use cases
• The model result can be delivered for real business needs
[Diagram labels: Data Stream 1, Data Stream 2, Model Logic, Model result]

What Is a Model – Model Engine
• Onboard data from the factor system to the model engine
• Execute models using different tech solutions to meet real scenarios
• Land results in different systems to integrate with the ads publisher

How the Model Engine Further Helps Data Scientists
• Sampled data for model training
  – Data scientists can get pre-sampled, representative ads to train/test models
• Access to real production factors
  – Avoids duplicated effort when data scientists develop new models with existing factors
• Self service
  – Integration: a staging environment similar to production for model execution, to avoid integration issues after model deployment
  – Model deployment
  – Online debugging: all model results/logs are kept in the system so data scientists can debug during A/B testing
• Dashboard
  – Model status monitor

Model Lifecycle (Batch)

Model Lifecycle (NRT)

Anything Else for the Model?
• Is the model result reliable?
  – “SafeNet”
    • Collect the historical behavior of the model
    • Detect any significant difference
    • Block the result from being sent to the publisher
• How to track?
  – Ads Monitor & Alert
    • Expose online model results to scientists/analysts
    • Dashboard
    • Hourly & daily reports
    • Alerts delivered to the model owner & business owner
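The three “SafeNet” steps (collect history, detect a significant difference, block the result) can be sketched as a simple z-score check. The slides do not describe the real detection rule, so the metric, the threshold of three standard deviations, the function name, and the decision to block when history is too short are all assumptions.

```python
from statistics import mean, stdev

def safenet_allows(history, new_value, max_z=3.0):
    """Return True if new_value is in line with history; False means block it."""
    if len(history) < 2:
        # Assumption: with too little history to judge, hold the result back.
        return False
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return new_value == mu
    # Block when the new value is more than max_z standard deviations away.
    return abs(new_value - mu) / sigma <= max_z

# Hypothetical metric: the average bid produced by a model on previous days.
average_bid_history = [1.00, 1.10, 0.90, 1.05, 0.95]
send_normal = safenet_allows(average_bid_history, 1.02)   # within the usual range
send_spike = safenet_allows(average_bid_history, 5.00)    # far outside it
```

A blocked result would then feed the alerting path listed above, so the model owner can inspect it before anything reaches the publisher.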

Summary
• A/B testing
  – HBase, HDFS, MySQL, Oracle, Mongo
  – Java, Scala, SQL
• Machine learning model
  – HDFS, Kafka, Cassandra
  – Hive, Spark, Spark Streaming
  – Java, Scala, R, Python
• Dashboard
  – InfluxDB
  – Grafana

