
Diffusion of User Tracking Data in the Online Advertising Ecosystem

Muhammad Ahmad Bashir and Christo Wilson


  1. Our Simulation Setup
     We simulate browsing traces for 200 users using the method from [1].
     1. The user generates an impression on N selected publishers.
     2. Impressions are forwarded to A&A domains via:
        A. Direct propagation: observable (goes through the browser)
           • The A&A domain is present on the publisher or won the RTB auction.
        B. Indirect propagation: non-observable
           • A&A domains learn of impressions through RTB participation.
     3. The RTB winner is decided based on probability (a function of edge weights).
     [1] Burken et al. User centric walk: An integrated approach for modeling the browsing behavior of users on the web. ASS 2005
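The auction step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slide says the winner is chosen by a probability that is a function of edge weights, and the sketch assumes that function is simple proportionality. The names `pick_rtb_winner` and `propagate` and the node labels are hypothetical.

```python
import random

def pick_rtb_winner(bidders, rng):
    """Pick one bidder with probability proportional to its edge weight.

    `bidders` maps a (hypothetical) A&A domain name to the weight of the
    edge from the exchange to that domain. Proportional weighting is an
    assumption; the slides only say "function of edge weights".
    """
    names = list(bidders)
    weights = [bidders[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]

def propagate(bidders, rng):
    """Return (direct, indirect) sets of A&A domains for one impression.

    Every bidder is solicited for a bid, so it learns of the impression.
    Only the winner is reached directly (through the browser); the rest
    learn of it indirectly.
    """
    winner = pick_rtb_winner(bidders, rng)
    direct = {winner}
    indirect = set(bidders) - direct
    return direct, indirect

rng = random.Random(0)  # fixed seed so repeated runs match
direct, indirect = propagate({"a1": 10, "a2": 90}, rng)
print("direct:", direct, "indirect:", indirect)
```

Whichever domain wins, both end up holding the impression; the only difference is whether a browser-side observer could see the transfer.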

  4. Simulation Example (RTB Constrained)
     [Figure: example graph with publishers p1 and p2, exchanges e1 and e2, and DSPs/advertisers a1 to a5, connected by edges with weights between 5 and 100. Legend: node type (publisher, exchange, DSP/advertiser) and activation (direct, indirect). Successive builds show per-node impression counts as impressions from p1 and then p2 propagate through the graph.]
     We manually select 37 exchanges that are allowed to forward indirect impressions to solicit bids during RTB.
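A minimal sketch of the RTB-Constrained rule, assuming a toy graph: only exchanges in a selected set forward impressions indirectly. The edge lists below are hypothetical and are not the edges drawn in the slide's figure, `RTB_EXCHANGES` stands in for the 37 manually selected exchanges, and the auction-winner step is left out to keep the constraint itself visible.

```python
from collections import Counter

# Hypothetical toy graph; node names follow the slide's labels, but the
# edges themselves are assumptions made for illustration.
EDGES = {
    "p1": ["e1"],
    "p2": ["e2", "a3"],
    "e1": ["a1", "a2", "a4"],
    "e2": ["a2", "a5"],
}
RTB_EXCHANGES = {"e1"}  # stand-in for the manually selected exchanges

def impressions_seen(publishers, edges, rtb_exchanges):
    """Count how many impressions each node observes.

    One impression is generated per visited publisher. Nodes embedded on
    the publisher are reached directly. A node in `rtb_exchanges` also
    forwards the impression to its neighbours to solicit bids (indirect).
    """
    counts = Counter()
    for pub in publishers:
        reached = set()  # a node observes a given impression at most once
        for node in edges.get(pub, []):
            reached.add(node)
            if node in rtb_exchanges:
                reached.update(edges.get(node, []))
        counts.update(reached)
    return counts

print(impressions_seen(["p1", "p2"], EDGES, RTB_EXCHANGES))
```

Passing `{"e1", "e2"}` as the third argument lets `e2` forward as well, which moves the toy model toward the RTB-Relaxed behaviour.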

  15. Impressions Observed
      We have 3 simulation models:
      1. RTB-Relaxed: upper bound
      2. Cookie-Matching: lower bound
      3. RTB-Constrained: realistic scenario
      [Figure: CDF of the fraction of A&A domains (y-axis) against the fraction of impressions observed (x-axis), with curves for Cookie Matching-Only, RTB Constrained, and RTB Relaxed.]
      Take Away
      1. RTB-Constrained is very close to RTB-Relaxed.
      2. 10% of A&A domains see more than 90% of impressions in RTB-Constrained.
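The CDF and the "10% see more than 90%" style of statistic can be computed from per-domain impression counts as sketched below. The function names and the sample counts are invented for illustration; they are not values from the talk.

```python
from bisect import bisect_right

def fraction_observed(counts, total_impressions):
    """Map each domain to the fraction of all impressions it observed."""
    return {domain: n / total_impressions for domain, n in counts.items()}

def share_above(fractions, threshold):
    """Fraction of domains that observed more than `threshold` of impressions."""
    if not fractions:
        return 0.0
    above = sum(1 for f in fractions.values() if f > threshold)
    return above / len(fractions)

def empirical_cdf(values):
    """Return (x, y) points where y is the share of domains with value <= x."""
    xs = sorted(values)
    return [(x, bisect_right(xs, x) / len(xs)) for x in sorted(set(xs))]

# Made-up counts out of 100 simulated impressions.
fractions = fraction_observed({"a1": 95, "a2": 50, "a3": 10, "a4": 10}, 100)
print(share_above(fractions, 0.9))
print(empirical_cdf(fractions.values()))
```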

  19. Effect of Blocking Extensions
      [Figure: example involving DoubleClick, OpenX, and PubMatic.]

  27. Impressions With Blocking
      [Figure: CDF of the fraction of A&A domains (y-axis) against the fraction of impressions observed (x-axis), with curves for Disconnect, Ghostery, AdBlock Plus, and No Removal.]
      Take Away
      • The Disconnect list is the most effective.
      • ABP is not effective at all due to the Acceptable Ads program.
      • Due to RTB, impressions are leaked to A&A domains even with blocking extensions.
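One way to picture the blocking experiment is as node removal from the graph, which the "No Removal" legend entry suggests but the slides do not spell out; treat that as an assumption. In the sketch below, `remove_blocked`, `reachable`, the `allowlist` parameter (standing in for an Acceptable Ads style exception) and the toy edges are all hypothetical.

```python
def remove_blocked(edges, blocklist, allowlist=frozenset()):
    """Return a copy of the graph without blocked nodes.

    Domains on `allowlist` stay in the graph even if they are also on
    `blocklist`, mimicking an exception list.
    """
    blocked = set(blocklist) - set(allowlist)
    return {
        src: [dst for dst in dsts if dst not in blocked]
        for src, dsts in edges.items()
        if src not in blocked
    }

def reachable(publisher, edges):
    """Return every node that can learn of an impression on `publisher`."""
    seen, stack = set(), [publisher]
    while stack:
        node = stack.pop()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    seen.discard(publisher)
    return seen

# Hypothetical toy graph: one exchange with two bidders, plus one tracker.
EDGES = {"p1": ["e1", "a3"], "e1": ["a1", "a2"]}

print(reachable("p1", remove_blocked(EDGES, {"e1", "a3"})))
print(reachable("p1", remove_blocked(EDGES, {"e1", "a3"}, allowlist={"e1"})))
```

In the second call the exchange is allowlisted, so `a1` and `a2` still learn of the impression even though the user has a blocker installed. This is the kind of leak the take-away describes.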

  30. Top 10 Domains
      [Figure: bar chart of impressions observed (%), from 0 to 100, by the top 10 domains under EasyList, Disconnect, Ghostery, AdBlock Plus, and No Blocking.]
      Top 10 domains with most observed impressions under AdBlock Plus:
        Domain               Impression %
        google-analytics     97.0
        youtube              91.7
        quantserve           91.6
        scorecardresearch    91.6
        skimresources        91.3
        twitter              91.1
        pinterest            91.0
        addthis              90.0
        criteo               90.0
        bluekai              90.8
      Top 10 companies can view the majority of user impressions even with (most) blocking extensions installed.

  34. Simulation Limitations
      • Our simulation models provide approximations.
      • Different users might have different browsing behaviors.
      • We only simulate with respect to popular publishers.
      • The ecosystem could have changed since the dataset was collected (December 2015).
      • Not representative of the mobile advertising ecosystem.

  35. Summary
      • We are the first to provide a model to study the impact of Real Time Bidding (RTB) on user privacy.
      • Ad exchanges share user impressions to facilitate RTB.
      • More than 10% of A&A domains view up to 90% of user impressions under realistic conditions.
      • Due to RTB, impressions can leak to A&A domains even with blocking extensions.
      • AdBlock Plus is not effective at all due to the Acceptable Ads program.
      • Disconnect performed the best in terms of protecting privacy.
      Questions? ahmad@ccs.neu.edu
      http://personalization.ccs.neu.edu/Projects/AdGraphs/

  37. Backup Slides

  38. Number of Exchanges for RTB-C
      [Figure: CDF (y-axis) against fraction of impressions (x-axis), with one curve per number of exchanges: 5, 10, 20, 30, 50, and 100.]

  39. Model Validation: Per Publisher
      [Figure: two panels, "# Nodes Activated" and "Tree Depth", each comparing Original, RTB-R, RTB-C, and CM.]

  40. Inclusion Chain from DOM
      DOM tree for http://p.com/index.html:
          <html>
            <body>
              <script src="a1.com/cookie-match.js"></script>
              <!-- Tracking pixel inserted dynamically by cookie-match.js -->
              <img src="a2.com/pixel.jpg" />
              <iframe src="a3.com/banner.html">
                <script src="a4.com/ad.js"></script>
              </iframe>
            </body>
          </html>
      Inclusion of resources:
      [Figure: inclusion tree rooted at the publisher p.]
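The difference between the DOM tree and the inclusion chain can be sketched with a parent map that records which resource caused each inclusion. The map below is read off the slide's example: the pixel from a2.com is attributed to the a1.com script that inserted it, not to its DOM parent. The names `INCLUDED_BY` and `inclusion_chain` are hypothetical, and how such a map would be collected from a real browser is outside this sketch.

```python
# Which resource caused each inclusion, per the slide's example page.
INCLUDED_BY = {
    "a1.com/cookie-match.js": "p.com/index.html",
    "a2.com/pixel.jpg": "a1.com/cookie-match.js",  # inserted by the script
    "a3.com/banner.html": "p.com/index.html",
    "a4.com/ad.js": "a3.com/banner.html",
}

def inclusion_chain(resource, included_by):
    """Return the chain of domains from the page down to `resource`."""
    chain = [resource]
    while chain[-1] in included_by:
        chain.append(included_by[chain[-1]])
    chain.reverse()
    # Keep only the host part of each URL-like string.
    return [url.split("/", 1)[0] for url in chain]

print(inclusion_chain("a2.com/pixel.jpg", INCLUDED_BY))
print(inclusion_chain("a4.com/ad.js", INCLUDED_BY))
```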
