Tracing Information Flows Between Ad Exchanges Using Retargeted Ads
Muhammad Ahmad Bashir, Sajjad Arshad, William Robertson, Christo Wilson
Northeastern University


  1. Tracing Information Flows Between Ad Exchanges Using Retargeted Ads. Muhammad Ahmad Bashir, Sajjad Arshad, William Robertson, Christo Wilson. Northeastern University

  2. Your Privacy Footprint

  8. Real Time Bidding
     • RTB brings more flexibility to the ad ecosystem.
     • Each ad request is managed by an ad exchange, which holds an auction.
     • Advertisers bid on each ad impression. Cookie matching is a prerequisite.
     • RTB spending to cross $20B by 2017 [1].
     • 49% annual growth.
     • Will account for 80% of US display ad spending by 2022.
     [1] http://www.prnewswire.com/news-releases/new-idc-study-shows-real-time-bidding-rtb-display-ad-spend-to-grow-worldwide-to-208-billion-by-2017-228061051.html

  9. Real Time Bidding (RTB)
     Diagram: User, Publisher, Ad Exchange, Advertisers
     • User → Publisher: GET, CNN’s Cookie
     • User → Ad Exchange: GET, DoubleClick’s Cookie
     • Ad Exchange → Advertisers: Solicit bids, DoubleClick’s Cookie
     • Advertisers → Ad Exchange: Bid
     • User → Advertiser: GET, RightMedia’s Cookie
     • Advertiser → User: Advertisement
     Advertisers cannot read their cookie!

  12. Cookie Matching
      Key problem: advertisers cannot read their cookies in the RTB auction
      • How can they submit reasonable bids if they cannot identify the user?
      Solution: cookie matching
      • Also known as cookie syncing
      • Process of linking the identifiers used by two ad exchanges
      Example exchange:
      • GET, Cookie=12345
      • 301 Redirect, Location=http://criteo.com/?dblclk_id=12345
      • GET ?dblclk_id=12345, Cookie=ABCDE
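
The redirect in the cookie matching example can be sketched in a few lines of Python. This is a minimal illustration, not the exchanges' actual implementation: the function name `record_cookie_match`, the `match_table` dictionary, and the choice to key the table by the receiver's cookie are all assumptions made for this example. Only the URL, the `dblclk_id` parameter, and the two cookie values come from the slide.

```python
from urllib.parse import urlparse, parse_qs


def record_cookie_match(redirect_location, receiver_cookie, match_table,
                        param="dblclk_id"):
    """Link the ID carried in the redirect URL to the receiver's own cookie.

    Hypothetical helper: returns the sender's ID, or None if the URL does
    not carry the expected query parameter.
    """
    query = urlparse(redirect_location).query
    sender_ids = parse_qs(query).get(param, [])
    if not sender_ids:
        return None
    sender_id = sender_ids[0]
    # The receiver now knows which of its cookies corresponds to the
    # identifier the sending exchange uses for the same browser.
    match_table[receiver_cookie] = sender_id
    return sender_id


# Values taken from the slide: the first exchange knows the user as 12345,
# the receiving party knows the same browser as ABCDE.
match_table = {}
matched_id = record_cookie_match(
    "http://criteo.com/?dblclk_id=12345", "ABCDE", match_table
)
```

After the call, `match_table` maps `ABCDE` to `12345`, which is the link that lets the receiver recognize the user when the exchange later solicits bids with its own identifier.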

  14. Prior Work
      • Several studies have examined cookie matching
      • Acar et al. found hundreds of domains passing identifiers to each other
      • Olejnik et al. found 125 exchanges matching cookies
      • Falahrastegar et al. analyzed clusters of exchanges that share the exact same cookies
      • These studies rely on studying HTTP requests/responses.

  15. Challenge 1: Server Side Matching
      1) Criteo observes the user. (IP: 207.91.160.7)
      2) RightMedia observes the user. (IP: 207.91.160.7)
      Behind the scenes, RightMedia and Criteo sync up. (IP: 207.91.160.7)

  16. Challenge 2: Obfuscation
      Diagram labels: amazon.com, dbclk.js
      • GET %^$ck#&93#&, Cookie=XYZYX

  19. Goal
      Develop a method to identify information flows (cookie matching) between ad exchanges
      • Mechanism agnostic: resilient to obfuscation
      • Platform agnostic: detect sharing on both the client side and the server side

  20. Key Insight: Use Retargeted Ads
      Retargeted ads are the most highly targeted form of online ads
      Key insight: because retargets are so specific, they can be used to conduct controlled experiments
      • Information must be shared between ad exchanges to serve retargeted ads

  21. Contributions
      1. Novel methodology for identifying information flows between ad exchanges
      2. Demonstrate the impact of ad network obfuscation in practice
         • 31% of cookie matching partners cannot be identified using heuristics
      3. Develop a method to categorize information sharing relationships
      4. Use graph analysis to infer the roles of actors in the ad ecosystem

  23. Data Collection • Classifying Ad Network Flows • Results

  24. Using Retargets as an Experimental Tool
      Key observation: retargets are only served under very specific circumstances
      1) Advertiser observes the user at a shop
      2) Advertiser and the exchange must have matched cookies
      This implies a causal flow of information from Exchange → Advertiser

  25. Data Collection Overview
      90 personas; for each single persona:
      • Visit persona: 10 websites/persona, 10 products/website
      • Visit publishers: 150 publishers, 15 pages/publisher
      • Store images, inclusion chains, HTTP requests/responses: 571,636 images
      Processing pipeline:
      • Ad detection
      • Filter images which appeared in > 1 persona: 31,850 potential targeted ads
      • Crowd sourcing: isolated retargeted ads
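
The persona filter can be illustrated with a short Python sketch. It assumes each crawl observation is a `(persona, image_id)` pair, where `image_id` stands in for whatever image fingerprint the crawler stores. The function name and the sample data are hypothetical and only show the idea of discarding images that appeared in more than one persona.

```python
from collections import defaultdict


def potential_targeted_ads(observations):
    """Return the image ids that were seen by exactly one persona.

    observations: iterable of (persona, image_id) pairs (assumed format).
    """
    personas_per_image = defaultdict(set)
    for persona, image_id in observations:
        personas_per_image[image_id].add(persona)
    # An image shown to several personas is unlikely to be targeted at
    # any one of them, so it is filtered out.
    return {
        image_id
        for image_id, personas in personas_per_image.items()
        if len(personas) == 1
    }


# Hypothetical observations for two personas.
observations = [
    ("persona_01", "img_shoe"),
    ("persona_01", "img_shoe"),     # repeat within one persona is fine
    ("persona_01", "img_generic"),
    ("persona_02", "img_generic"),  # seen by a second persona: dropped
    ("persona_02", "img_lamp"),
]
candidates = potential_targeted_ads(observations)
```

Here `candidates` contains `img_shoe` and `img_lamp`. Images that survive this filter are only potential targeted ads, which is why the pipeline follows it with a crowd sourcing step.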

  28. Crowd Sourcing
      We used Amazon Mechanical Turk (AMT) to label 31,850 ads.
      • 1,142 tasks in total
      • 30 ads per task: 27 unlabeled, 3 labeled by us
      • 2 workers per ad
      • $415 spent

  32. Final Dataset
      5,102 unique retargeted ads
      • From 281 distinct online retailers
      35,448 publisher-side chains that served the retargets
      • We observed some retargets multiple times

  33. Data Collection • Classifying Ad Network Flows • Results

  34. A Look at Publisher Chains
      Example: shopper-side chain and publisher-side chain
      • How does Criteo know to serve an ad on BBC?
      • In this case it is pretty trivial.
      • Criteo observed us on the shopper side.
      • Can we classify all such publisher-side chains?

  35. What is a Chain?
      Figure labels: e, a
      Pattern: ^pub .* e a$

  38. Four Classifications
      Four possible ways for a retargeted ad to be served
      1. Direct (Trivial) Matching
      2. Cookie Matching
      3. Indirect Matching
      4. Latent (Server-side) Matching

  40. 1) Direct (Trivial) Matching
      Example rule:
      • Shopper-side: ^shop .* a .*$
      • Publisher-side: ^pub a$
      Notes:
      • a is the advertiser that serves the retarget
      • a must appear on the shopper-side…
      • … but other trackers may also appear

  42. 2) Cookie Matching
      Example rule:
      • Shopper-side: ^shop .* a .*$
      • Publisher-side: ^pub .* e a$
      • Anywhere: ^* .* e a .*$
      Notes:
      • a must appear on the shopper-side
      • e precedes a, which implies an RTB auction
      • Transition e → a is where cookie match occurs

  45. 3) Latent (Server-side) Matching
      Example rule:
      • Shopper-side: ^shop [^ea]$
      • Publisher-side: ^pub .* e a$
      Notes:
      • Neither e nor a appears on the shopper-side
      • a must receive information from some shopper-side tracker
      We find latent matches in practice!
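
The three rules shown on these slides (direct, cookie matching, latent) can be approximated in Python as a rough sketch. Several things here are assumptions for illustration: chains are modeled as lists of labels whose first element is the shop or publisher, the "Anywhere" rule is checked as an adjacent e → a pair in any other observed chain, and the function names are hypothetical. Indirect matching is left out because the slides do not give its rule, so such cases fall through to "unclassified".

```python
def has_transition(chains, src, dst):
    """True if some chain contains src immediately followed by dst."""
    return any(
        chain[i] == src and chain[i + 1] == dst
        for chain in chains
        for i in range(len(chain) - 1)
    )


def classify_retarget(shopper_chains, publisher_chain, observed_chains):
    """Classify how a retarget was served, following the slide rules.

    shopper_chains:  chains recorded on the shop, e.g. ["shop", "t", "a"]
    publisher_chain: chain that served the retarget; its last element is a
    observed_chains: other chains from the crawl (assumed input) used for
                     the "Anywhere" rule
    """
    a = publisher_chain[-1]
    on_shopper = {label for chain in shopper_chains for label in chain[1:]}

    if len(publisher_chain) == 2:
        # ^pub a$ : a serves the ad directly and was seen at the shop.
        return "direct" if a in on_shopper else "unclassified"

    e = publisher_chain[-2]  # ^pub .* e a$ : e immediately precedes a
    if a in on_shopper and has_transition(observed_chains, e, a):
        return "cookie matching"
    if a not in on_shopper and e not in on_shopper:
        # Neither e nor a was at the shop, so a must have learned about
        # the user from some shopper-side tracker on the server side.
        return "latent"
    return "unclassified"


# Hypothetical labels standing in for real domains.
other = [["other_pub", "exchange_e", "advertiser_a"]]
direct = classify_retarget(
    [["shop", "advertiser_a", "tracker_t"]], ["pub", "advertiser_a"], other)
cookie = classify_retarget(
    [["shop", "tracker_t", "advertiser_a"]],
    ["pub", "exchange_e", "advertiser_a"], other)
latent = classify_retarget(
    [["shop", "tracker_t"]], ["pub", "exchange_e", "advertiser_a"], other)
```

With these inputs, `direct`, `cookie`, and `latent` take the values "direct", "cookie matching", and "latent" respectively. The rules are checked in order from most to least direct evidence, which mirrors the order in which the slides present them.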

  48. Data Collection • Classifying Ad Network Flows • Results
