Learning Multi-touch Conversion Attribution with Dual-attention Mechanisms for Online Advertising - PowerPoint PPT Presentation


SLIDE 1

Learning Multi-touch Conversion Attribution with Dual-attention Mechanisms for Online Advertising

Kan Ren, Yuchen Fang, Weinan Zhang, Shuhao Liu, Jiajun Li, Ya Zhang, Yong Yu, Jun Wang∗

Apex Data & Knowledge Management Lab Shanghai Jiao Tong University

∗University College London

CIKM, 2018

Kan Ren, Yuchen Fang, Weinan Zhang, Shuhao Liu, Jiajun Li, Ya Zhang, Yong Yu, Jun Wang∗ (Shanghai Jiao Tong University) Learning Multi-touch Conversion Attribution with Dual-attention Mechanisms for Online Adverti CIKM, 2018 1 / 27

SLIDE 2

Outline

1. Problem Background
2. Our Solution
3. Experiments
4. Visualization & Insights

SLIDE 3

Problem Background

Problem Background

Figure: John Wanamaker

John Wanamaker: “Half the money I spend on advertising is wasted; the trouble is I don’t know which half.”

SLIDE 4

Problem Background

Problem Background

Figure: Example user touch-point sequences across channels (Search, Social, Website), mixing impressions and clicks; User 2 converts while Users 1 and 3 do not.

Two views of the problem

Sequence View: each touch point contributes positively or negatively to the conversion.
Channel View: which channel appeals to the user the most?

SLIDE 5

Problem Background

Problem Background

Figure: Example user touch-point sequences across channels (Search, Social, Website), mixing impressions and clicks; User 2 converts while Users 1 and 3 do not.

Two views of the problem

Sequence View: each touch point contributes positively or negatively to the conversion.
Channel View: which channel appeals to the user the most?

Problem

To analyze the effect of touch points from different channels on the final user conversion.

SLIDE 6

Problem Background

Related Works

Rule-based Methods

Too simple and heuristic; they cannot inform the subsequent advertising strategy.

Google Ads: https://support.google.com/google-ads/answer/7002714?hl=en
SLIDE 7

Problem Background

Data Insights

Figure: Left: sequence length distribution (log number of sequences and of converted sequences against sequence length); Right: conversion rate (CVR) against sequence length.

Longer behavior sequences tend to have higher conversion rates. Not all ad touch points have an additive positive influence; some may have counteractive effects.

SLIDE 8

Problem Background

Related Works

Data-driven Methods

Logistic regression with learned coefficients for the attribution [Shao et al., KDD'11].
Additive point process to model the conversion rate over time and derive the attribution for each point [Zhang et al., ICDM'16; Ji et al., CIKM'16, AAAI'17].

SLIDE 9

Problem Background

Problem Challenge: Multi-touch Conversion Attribution

Cons of the traditional methods

Rule-based methods are heuristic and ill-suited to the subsequent usage of the attributed results.
Simple probabilistic methods predict upon a single touch point and ignore sequential influence.
Existing methods consider only one type of user behavior.

SLIDE 10

Our Solution

Our Solution

SLIDE 11

Our Solution

Our Solution

Attention-based Conversion Prediction

Use a recurrent neural network to model the sequential user activities.
Learn to assign "attention" to the touch points to model the conversion attributions.
Simultaneously model impression-level and click-level patterns for conversion estimation.
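The pipeline above can be sketched numerically. The toy code below assumes a precomputed matrix `H` of RNN hidden states (one row per touch point) and made-up scoring vectors `w_i2v`, `w_c2v`, `w_out`; it only illustrates how two softmax attentions over the same sequence can be blended into per-touch-point attribution credits and a conversion probability, not the paper's actual DARNN architecture.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(z - z.max())
    return e / e.sum()

def predict_conversion(H, w_i2v, w_c2v, w_out, lam=0.5):
    """Blend impression-level and click-level attention into attribution
    credits and a conversion probability (toy sketch).

    H      : (m, d) RNN hidden states, one row per touch point (precomputed)
    w_i2v  : (d,) toy scoring vector for impression-level attention
    w_c2v  : (d,) toy scoring vector for click-level attention
    w_out  : (d,) weights of a logistic conversion head
    lam    : mixing weight between the two attention views
    """
    a_i2v = softmax(H @ w_i2v)               # impression-level weights
    a_c2v = softmax(H @ w_c2v)               # click-level weights
    attr = (1 - lam) * a_i2v + lam * a_c2v   # per-touch attribution credits
    context = attr @ H                       # attribution-weighted summary
    p_conv = 1.0 / (1.0 + np.exp(-(context @ w_out)))
    return p_conv, attr
```

Because both attention vectors are softmax-normalized, the blended credits `attr` always sum to one over the sequence, which is what lets them double as attribution fractions.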

SLIDE 12

Our Solution

Dual-attention Mechanism for Conversion Attribution

Attention as Attribution Credits (cont.)

SLIDE 13

Our Solution

Attention Implementation

Attention maps a set of { (Query, Key, Value) } triples to a weighted summary. For touch points (x_1, ..., x_j, ..., x_m) with hidden states h_j, the energies, weights, and context are

e_j = f(h_{j−1}, x_j),   a_j = softmax(e_j) = exp(e_j) / Σ_{j′=1}^{m} exp(e_{j′}),   c = Σ_{j=1}^{m} a_j · h_j .
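A minimal numerical sketch of the additive (Bahdanau-style) attention above, with toy matrices `W`, `U` and scoring vector `v` standing in for the learned alignment function f; these weights are illustrative placeholders, not parameters from the paper.

```python
import numpy as np

def additive_attention(query, keys, W, U, v):
    """Additive (Bahdanau-style) attention, a minimal sketch.

    query : (d,)   previous hidden state h_{j-1}
    keys  : (m, d) states h_1..h_m for the m touch points (keys = values here)
    W, U  : (d, d) toy projection matrices standing in for the learned f
    v     : (d,)   toy scoring vector
    Returns the attention weights a_1..a_m and the context c = sum_j a_j h_j.
    """
    e = np.tanh(query @ W.T + keys @ U.T) @ v   # energies e_j
    a = np.exp(e - e.max())
    a = a / a.sum()                             # softmax -> weights a_j
    c = a @ keys                                # context vector
    return a, c
```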

Dzmitry Bahdanau et al. Neural Machine Translation by Jointly Learning to Align and Translate. ICLR 2015.
SLIDE 14

Our Solution

Dual-attention Mechanism for Conversion Attribution

Attention as Attribution Credits (cont.)

SLIDE 15

Our Solution

The Usage of Attribution

Attribution of the j-th Touch Point: Attr_j = (1 − λ) · a^{i2v}_j + λ · a^{c2v}_j . (1)

Now that we have obtained the attributed credits, what else can we do with them?

None of the related works consider the subsequent usage of the obtained attribution values.

SLIDE 16

Our Solution

The Usage of Attribution

Attribution of the j-th Touch Point: Attr_j = (1 − λ) · a^{i2v}_j + λ · a^{c2v}_j . (1)

Now that we have obtained the attributed credits, what else can we do with them?

None of the related works consider the subsequent usage of the obtained attribution values.

Example: to guide the subsequent budget allocation over the channels for the advertiser.

SLIDE 17

Our Solution

Back Evaluation for Attribution Guided Budget Allocation

Attribution Calculation for the k-th Channel (y_i: converted):

Attr(c_k | y_i) = Σ_{j=1}^{m_i} Attr_j · 1(c_j = c_k) . (2)

SLIDE 18

Our Solution

Back Evaluation for Attribution Guided Budget Allocation

Attribution Calculation for the k-th Channel (y_i: converted):

Attr(c_k | y_i) = Σ_{j=1}^{m_i} Attr_j · 1(c_j = c_k) . (2)

Inferred ROI of Channel (Sahin Cem Geyik et al., ADKDD'14):

ROI_{c_k} = [ Σ_{∀i: y_i = 1} Attr(c_k | y_i) · V ] / (money spent on channel c_k) . (3)

SLIDE 19

Our Solution

Back Evaluation for Attribution Guided Budget Allocation

Attribution Calculation for the k-th Channel (y_i: converted):

Attr(c_k | y_i) = Σ_{j=1}^{m_i} Attr_j · 1(c_j = c_k) . (2)

Inferred ROI of Channel (Sahin Cem Geyik et al., ADKDD'14):

ROI_{c_k} = [ Σ_{∀i: y_i = 1} Attr(c_k | y_i) · V ] / (money spent on channel c_k) . (3)

Attribution Guided Budget Allocation: for channel c_k, the allocated budget is

b_k = ROI_{c_k} / ( Σ_{v=1}^{K} ROI_{c_v} ) × B . (4)
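Equations (2)-(4) chain together directly. The sketch below assumes a hypothetical log format (a per-sequence conversion label plus per-touch channel and attribution credit); the function and variable names are illustrative, not from the paper's code.

```python
def allocate_budget(sequences, spend, V, B):
    """Chain Eqs. (2)-(4): channel attribution -> inferred ROI -> budget.

    sequences : list of (y_i, [(channel, Attr_j), ...]), with y_i in {0, 1}
    spend     : dict channel -> money spent on that channel
    V         : value of a single conversion
    B         : total budget to allocate
    """
    # Eq. (2): Attr(c_k | y_i), summed over converted sequences only
    attr = {c: 0.0 for c in spend}
    for y, touches in sequences:
        if y == 1:
            for channel, a in touches:
                attr[channel] += a
    # Eq. (3): ROI_{c_k} = sum_i Attr(c_k | y_i) * V / spend(c_k)
    roi = {c: attr[c] * V / spend[c] for c in spend}
    # Eq. (4): b_k proportional to ROI_{c_k}, normalized over all K channels
    total = sum(roi.values())
    return {c: roi[c] / total * B for c in spend}
```

For example, two channels with inferred ROIs of 1.0 and 0.5 would split a budget B in a 2:1 ratio.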

SLIDE 20

Our Solution

Back Evaluation Flow

Allocate the budget for each channel w.r.t. the attributions calculated by the model.

Replay the historical user behavior sequences according to the timestamp of each touch point, and judge:

If the remaining budget of a channel has run out, the sequence is "blocked" and put into the blacklist.
If the replay reaches the tail of a sequence, its (non-)conversion outcome is counted toward the model performance.
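A rough sketch of this replay protocol, under simplifying assumptions (a flat per-touch cost per channel, and the sequence's conversion label carried on each of its events); names are hypothetical, not from the paper's code.

```python
def replay_evaluate(events, budgets, costs):
    """Replay touch points in time order under per-channel budgets.

    events  : list of (timestamp, seq_id, channel, converted) tuples, where
              `converted` is the label of the whole sequence
    budgets : dict channel -> allocated budget
    costs   : dict channel -> assumed flat cost per touch point
    Returns (conversion count, number of fully replayed sequences).
    """
    remaining = dict(budgets)
    blacklist = set()
    outcomes = {}
    for ts, seq, channel, converted in sorted(events):
        if seq in blacklist:
            continue
        if remaining[channel] < costs[channel]:
            blacklist.add(seq)          # channel budget ran out: block sequence
            outcomes.pop(seq, None)
            continue
        remaining[channel] -= costs[channel]
        outcomes[seq] = converted       # provisional outcome at current tail
    return sum(outcomes.values()), len(outcomes)
```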

SLIDE 21

Experiments

Experiments

SLIDE 22

Experiments

Experimental Setup

Conversion Estimation: given the user behavior sequence, compare the models on AUC and Log-loss for conversion rate estimation.

Attribution Guided Budget Allocation: after the back evaluation, compare the models on
Conversion Number (CN)
Conversion Rate (CVR)
Profit = V_0 · CN − cost
Cost-per-action (CPA)
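The allocation metrics can be sketched as follows, with the conversion and shown-sequence counts coming from the replay and V_0 the assumed value per conversion (the function name is illustrative):

```python
def allocation_metrics(conversions, shown, total_cost, V0):
    """Conversion Number, CVR, Profit = V0*CN - cost, and CPA = cost/CN."""
    cn = conversions
    cvr = cn / shown if shown else 0.0
    profit = V0 * cn - total_cost
    cpa = total_cost / cn if cn else float("inf")  # undefined without conversions
    return {"CN": cn, "CVR": cvr, "Profit": profit, "CPA": cpa}
```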

SLIDE 23

Experiments

Datasets

Miaozhen Dataset

Zhang et al. ICDM’16; Ji et al. CIKM’16, AAAI’17.

Criteo

http://ailab.criteo.com/criteo-attribution-modeling-bidding-dataset/

SLIDE 24

Experiments

Compared Settings

LR is the Logistic Regression model [24].
SP is a Simple Probabilistic model [7].
AH is the AdditiveHazard model [37] using an additive point process.
AMTA is the Additional Multi-touch Attribution model [12], which was the state of the art.
ARNN is the plain Recurrent Neural Network method (i.e., only the encoder part).
DARNN is our proposed model with the dual-attention mechanism.

SLIDE 25

Experiments

Conversion Estimation

Table: Conversion estimation results on two datasets. AUC: the higher the better; Log-loss: the lower the better.

Models   Miaozhen AUC   Miaozhen Log-loss   Criteo AUC   Criteo Log-loss
LR       0.8418         0.3496              0.9286       0.3981
SP       0.7739         0.5617              0.6718       0.5535
AH       0.8693         0.6791              0.6791       0.5067
AMTA     0.8357         0.1636              0.8465       0.3897
ARNN     0.8914         0.1610              0.9793       0.1850
DARNN    0.9123         0.1095              0.9799       0.1591

SLIDE 26

Experiments

Back Evaluation of Budget Allocation

Figure: CPA: the lower, the better.

Budget Settings: we set the budget constraint to 1/n of the total cost in the training dataset.

SLIDE 27

Visualization & Insights

Visualization & Insights

SLIDE 28

Visualization & Insights

Visualization of the Attribution

Sequence Level

Figure: Touch-point-level attribution statistics on Miaozhen (models: AH, AMTA, ARNN, DARNN). Panels: attribution on sequences of length 10 and of length 5, plotted against the touch point index.

SLIDE 29

Visualization & Insights

Visualization of the Attribution

Channel Level

Figure: Attribution of different channels on Miaozhen.

SLIDE 30

Visualization & Insights

Visualization of the Attribution Preferences

Click-level vs. impression-level: Attr_j = (1 − λ) · a^{i2v}_j + λ · a^{c2v}_j .

Figure: The distribution of λ over the Criteo dataset.

SLIDE 31

Summary

Summary

First work using an attentional recurrent model for conversion attribution.

SLIDE 32

Summary

Summary

First work using an attentional recurrent model for conversion attribution.
First work proposing a replay protocol for offline evaluation over the obtained attribution.

SLIDE 33

Summary

Summary

First work using an attentional recurrent model for conversion attribution.
First work proposing a replay protocol for offline evaluation over the obtained attribution.

Perhaps the subsequent budget allocation should be guided by the data-driven attributions.

SLIDE 34

Summary

Summary

First work using an attentional recurrent model for conversion attribution.
First work proposing a replay protocol for offline evaluation over the obtained attribution.

Perhaps the subsequent budget allocation should be guided by the data-driven attributions. Reproducible code: https://github.com/rk2900/deep-conv-attr.

SLIDE 35

Summary

Summary

First work using an attentional recurrent model for conversion attribution.
First work proposing a replay protocol for offline evaluation over the obtained attribution.

Perhaps the subsequent budget allocation should be guided by the data-driven attributions. Reproducible code: https://github.com/rk2900/deep-conv-attr.

Outlook: to attribute with consideration of the cost.
