Tracing Cross Border Web Tracking Costas Iordanou Georgios - PowerPoint PPT Presentation


  1. Tracing Cross Border Web Tracking. Costas Iordanou, Georgios Smaragdakis, Ingmar Poese, Nikolaos Laoutaris

  2. Web advertising fuels the web

  3. The rise of targeted ads
     Why targeted ads?
     • Users get relevant ads
     • Increase user engagement
     • More efficient ad campaigns
     • Higher ROI for the advertisers
     • Better use of resources
     • Etc.
     How it works?
     • Tracking and profiling users
     • Real time auctions of ads (RTB)
     • Cookie synchronization
     • Etc.
     Example: User typed in “used cars for sale”

  4. The reaction of users and regulators. Diagram labels: Regulators, Users, Browser extensions, Browsers

  5. Users and regulators reaction. Diagram labels: Regulators, Users, Browser extensions, Browsers

  6. General Data Protection Regulation - Details
     One of the biggest changes with respect to privacy and regulation on the web in the last few years (Enforcement date: 25th May, 2018)
     In general the new legislation:
     1. tries to regulate how users’ data are collected, processed and stored and
     2. if they include any sensitive information about the user

  7. General Data Protection Regulation - Details
     One of the biggest changes with respect to privacy and regulation on the web in the last few years (Enforcement date: 25th May, 2018)
     In general the new legislation:
     1. tries to regulate how users’ data are collected, processed and stored and
     2. if they include any sensitive information about the user
     Implementation: Per member state Data Protection Authority (DPA)
     DPA: Responsible for complaints, investigations & enforcement
     Investigation starting point: Ad & Tracking flows entry point servers location
     RQ: How can we identify the physical locations of such servers?

  8. Challenges
     1. How to effectively detect ad and tracking related domains in the wild?
     2. How to ensure correct geolocation of infrastructure servers?

  9. Challenges
     1. How to effectively detect ad and tracking related domains in the wild?
     2. How to ensure correct geolocation of infrastructure servers?
     3. How to ensure that all possible ad and tracking servers are observed?
     4. How to maintain a balance between accuracy and scalability?

  10. Why real users instead of just Web crawling? Real users: user interaction

  11. Why real users instead of just Web crawling? Real users: user interaction, geo load balancing

  12. Mapping 3rd party domains to IPs
      Chrome Browser Extension, Chrome API event listeners:
      chrome.webRequest.onBeforeSendHeaders
      chrome.webRequest.onCompleted
      Example page: http://www.example.com, which loads analytics.com and tracker.com
      Mapping Table - example.com
      Domain          IP
      analytics.com   …
      tracker.com     213.121.66.99
      …               …
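
The mapping table on this slide can be sketched offline. The example below is a minimal Python sketch, not the extension's code: it assumes the extension exports each completed request as a `(page_url, request_url, server_ip)` tuple (a hypothetical format), and the third-party test is deliberately naive. The paths `/pixel.gif` and `/style.css` are made up for illustration.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def host_of(url: str) -> str:
    """Return the lower-cased host of a URL, without a leading 'www.'."""
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def build_mapping_tables(events):
    """Group third-party hosts and their server IPs by first-party site.

    `events` is an iterable of (page_url, request_url, server_ip) tuples,
    a hypothetical export format for what the extension observes.
    """
    tables = defaultdict(lambda: defaultdict(set))
    for page_url, request_url, server_ip in events:
        first_party = host_of(page_url)
        requested = host_of(request_url)
        # Naive third-party test: skip the first party and its subdomains.
        if requested == first_party or requested.endswith("." + first_party):
            continue
        tables[first_party][requested].add(server_ip)
    return tables


events = [
    ("http://www.example.com", "http://tracker.com/pixel.gif", "213.121.66.99"),
    ("http://www.example.com", "http://www.example.com/style.css", "145.100.210.5"),
]
tables = build_mapping_tables(events)
```

Keeping a set of IPs per domain matters because the same third-party domain may resolve to different servers for different users.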

  13. Identify Ad and Tracking related domains. Components: easylist, easyprivacy, ABP Parser, Custom keywords, Correction Script

  14. Identify Ad and Tracking related domains
      Flowchart labels (three numbered steps on the slide):
      • Input: url 1 + meta data, url 2 + meta data, url 3 + meta data, …
      • ABP Parser with easylist and easyprivacy; decision "Should block?" (YES / NO)
      • Custom keywords; decision "Ad + Tracking related?" (YES / NO)
      • Correction Script
      • Output: Ad + Tracking Domains
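
One possible reading of this flowchart is a filter-list check followed by a keyword fallback. The sketch below assumes that ordering, handles only the simple `||domain^` rule form found in Adblock Plus style lists (rules with options or other syntax are skipped), and uses invented sample rules and keywords; the slide does not list the actual keywords or describe the correction script.

```python
from urllib.parse import urlsplit

# Hypothetical rules in the Adblock Plus "||domain^" form.
SAMPLE_RULES = ["||tracker.com^", "! a comment line", "||analytics.com^"]
# Hypothetical keywords for the fallback check; the slide does not list them.
SAMPLE_KEYWORDS = ("adserver", "pixel", "track")


def blocked_domains(rules):
    """Extract domains from '||domain^' rules, skipping everything else."""
    domains = set()
    for rule in rules:
        rule = rule.strip()
        if rule.startswith("||") and rule.endswith("^"):
            domains.add(rule[2:-1].lower())
    return domains


def classify(url, domains, keywords=SAMPLE_KEYWORDS):
    """Return 'list match', 'keyword match' or 'clean' for a request URL."""
    host = (urlsplit(url).hostname or "").lower()
    labels = host.split(".")
    # A rule for "tracker.com" also covers "cdn.tracker.com".
    suffixes = {".".join(labels[i:]) for i in range(len(labels))}
    if suffixes & domains:
        return "list match"
    if any(keyword in url.lower() for keyword in keywords):
        return "keyword match"
    return "clean"


domains = blocked_domains(SAMPLE_RULES)
```

A keyword match is weaker evidence than a list match, which is presumably why the slide routes such candidates through a separate correction step.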

  15. Challenges
      1. How to effectively detect ad and tracking related domains in the wild?
      2. How to ensure correct geolocation of infrastructure servers?
      3. How to ensure that all possible ad and tracking servers are observed?
      4. How to maintain a balance between accuracy and scalability?

  16. Accurate geo-location of server IPs
      RIPE IPmap validation process - infrastructure servers IPs
      prefix            region          service
      46.51.128.0/18    eu-west-1       AMAZON
      46.51.216.0/21    ap-southeast-1  AMAZON
      13.73.232.0/21    japaneast       AZURE
      20.19.14.128/25   koreacentral    AZURE
      …                 …               …
      Regions maps:
      eu-west-1: Ireland, Ireland
      ap-southeast-1: Singapore, Singapore
      RIPE IPmap: 99.6% match with the reported country
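
The validation idea on this slide can be sketched as a prefix lookup. The example below is an illustration under assumptions, not the authors' tooling: it uses only the four prefixes and two region entries visible on the slide, and it assumes the match rate is the share of checked IPs whose geolocated country equals the country of the published region.

```python
import ipaddress

# Rows copied from the slide: (prefix, region, service).
PREFIXES = [
    ("46.51.128.0/18", "eu-west-1", "AMAZON"),
    ("46.51.216.0/21", "ap-southeast-1", "AMAZON"),
    ("13.73.232.0/21", "japaneast", "AZURE"),
    ("20.19.14.128/25", "koreacentral", "AZURE"),
]
# Region map entries shown on the slide; other regions stay unmapped here.
REGION_COUNTRY = {"eu-west-1": "Ireland", "ap-southeast-1": "Singapore"}

NETWORKS = [(ipaddress.ip_network(p), region) for p, region, _svc in PREFIXES]


def published_country(ip):
    """Country of the published region whose prefix contains `ip`, else None."""
    addr = ipaddress.ip_address(ip)
    matches = [(net, region) for net, region in NETWORKS if addr in net]
    if not matches:
        return None
    # Prefer the most specific (longest) prefix.
    _net, region = max(matches, key=lambda m: m[0].prefixlen)
    return REGION_COUNTRY.get(region)


def match_rate(geolocated):
    """Share of checkable IPs where the geolocated country matches.

    `geolocated` maps an IP string to the country a geolocation service
    reported for it (hypothetical input).
    """
    checked = [(ip, c) for ip, c in geolocated.items() if published_country(ip)]
    if not checked:
        return None
    hits = sum(1 for ip, c in checked if published_country(ip) == c)
    return hits / len(checked)
```

IPs outside the published prefixes, or in regions missing from the region map, are excluded from the rate rather than counted as mismatches.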

  17. Challenges
      1. How to effectively detect ad and tracking related domains in the wild?
      2. How to ensure correct geolocation of infrastructure servers?
      3. How to ensure that all possible ad and tracking servers are observed?
      4. How to maintain a balance between accuracy and scalability?

  18. Avoiding pitfalls …
      - Identify all domains behind each IP (Reverse DNS query)
        Query: https://freeapi.robtex.com/pdns/reverse/93.184.216.34
        Response:
        rrname:example.org, rrdata:93.184.216.34, rrtype:A, time_first:1440526884, time_last:1535919774, count:18
        rrname:www.example.org, rrdata:93.184.216.34, rrtype:A, time_first:1440723354, time_last:1527899734, count:18
        rrname:www.example.com, rrdata:93.184.216.34, rrtype:A, time_first:1441108386, time_last:1535371292, count:18
        rrname:www.example.net, rrdata:93.184.216.34, rrtype:A, time_first:1436692690, time_last:1527900018, count:18
        rrname:imrek.org, rrdata:93.184.216.34, rrtype:A, time_first:1440827324, time_last:1508103356, count:18
        rrname:example.net, rrdata:93.184.216.34, rrtype:A, time_first:1440526998, time_last:1533895598, count:18
        …

  19. Avoiding pitfalls …
      - Identify all domains behind each IP (Reverse DNS query)
        Query: https://freeapi.robtex.com/pdns/reverse/93.184.216.34
        Response:
        rrname:example.org, rrdata:93.184.216.34, rrtype:A, time_first:1440526884, time_last:1535919774, count:18
        rrname:www.example.org, rrdata:93.184.216.34, rrtype:A, time_first:1440723354, time_last:1527899734, count:18
        rrname:www.example.com, rrdata:93.184.216.34, rrtype:A, time_first:1441108386, time_last:1535371292, count:18
        rrname:www.example.net, rrdata:93.184.216.34, rrtype:A, time_first:1436692690, time_last:1527900018, count:18
        rrname:imrek.org, rrdata:93.184.216.34, rrtype:A, time_first:1440827324, time_last:1508103356, count:18
        rrname:example.net, rrdata:93.184.216.34, rrtype:A, time_first:1440526998, time_last:1533895598, count:18
        …
      - Identify all IPs for each domain (Forward DNS query)
        Query: https://freeapi.robtex.com/pdns/forward/example.com
        Response:
        rrname:example.com, rrdata:2606:280::::::1946, rrtype:AAAA, time_first:1441278890, time_last:1535952170, count:18
        rrname:example.com, rrdata:93.184.216.34, rrtype:A, time_first:1441278890, time_last:1535952170, count:18
        rrname:example.com, rrdata:208.77.188.166, rrtype:A, time_first:1246678898, time_last:1246678898, count:1

  20. Avoiding pitfalls …
      - Identify all domains behind each IP (Reverse DNS query)
        Query: https://freeapi.robtex.com/pdns/reverse/93.184.216.34
        Response:
        rrname:example.org, rrdata:93.184.216.34, rrtype:A, time_first:1440526884, time_last:1535919774, count:18
        rrname:www.example.org, rrdata:93.184.216.34, rrtype:A, time_first:1440723354, time_last:1527899734, count:18
        rrname:www.example.com, rrdata:93.184.216.34, rrtype:A, time_first:1441108386, time_last:1535371292, count:18
        rrname:www.example.net, rrdata:93.184.216.34, rrtype:A, time_first:1436692690, time_last:1527900018, count:18
        rrname:imrek.org, rrdata:93.184.216.34, rrtype:A, time_first:1440827324, time_last:1508103356, count:18
        rrname:example.net, rrdata:93.184.216.34, rrtype:A, time_first:1440526998, time_last:1533895598, count:18
        …
      - Identify all IPs for each domain (Forward DNS query)
        Query: https://freeapi.robtex.com/pdns/forward/example.com
        Response:
        rrname:example.com, rrdata:a.iana-servers.net, rrtype:NS, time_first:1246678898, time_last:1535952170, count:2
        rrname:example.com, rrdata:b.iana-servers.net, rrtype:NS, time_first:1246678898, time_last:1535952170, count:2
        rrname:example.com, rrdata:2606:280::::::1946, rrtype:AAAA, time_first:1441278890, time_last:1535952170, count:18
        rrname:example.com, rrdata:93.184.216.34, rrtype:A, time_first:1441278890, time_last:1535952170, count:18
        rrname:example.com, rrdata:208.77.188.166, rrtype:A, time_first:1246678898, time_last:1246678898, count:1
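
The forward response above mixes NS records and a stale A record with the current addresses, so some filtering is needed before the IPs can be used. The sketch below is one way to do that, assuming the records have already been parsed into dictionaries with the field names shown on the slide; filtering by a measurement window is an assumption of this example, not something the slide states, and the window timestamps are hypothetical.

```python
# Records shaped like the forward response on the slide (a subset, as shown).
FORWARD = [
    {"rrname": "example.com", "rrdata": "a.iana-servers.net", "rrtype": "NS",
     "time_first": 1246678898, "time_last": 1535952170, "count": 2},
    {"rrname": "example.com", "rrdata": "93.184.216.34", "rrtype": "A",
     "time_first": 1441278890, "time_last": 1535952170, "count": 18},
    {"rrname": "example.com", "rrdata": "208.77.188.166", "rrtype": "A",
     "time_first": 1246678898, "time_last": 1246678898, "count": 1},
]


def active_ips(records, start, end):
    """IPs from A/AAAA records seen at some point within [start, end].

    `start` and `end` are Unix timestamps bounding a measurement window.
    """
    return {
        r["rrdata"]
        for r in records
        if r["rrtype"] in ("A", "AAAA")
        and r["time_first"] <= end
        and r["time_last"] >= start
    }


# Hypothetical measurement window, expressed as Unix timestamps.
WINDOW_START, WINDOW_END = 1514764800, 1535760000
current = active_ips(FORWARD, WINDOW_START, WINDOW_END)
```

With this window the name-server records and the address last seen at 1246678898 are dropped, leaving only 93.184.216.34.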

  21. Joining everything together
      Inputs: browser extension with real users (Mapping Table - example.com), ABP Parser & Correction Script, RIPE IPmap (https://ipmap.ripe.net/)
      Mapping Table - example.com
      Domain          IP
      tracker.com     213.121.66.99
      analytics.com   130.12.88.110
      …               …
      Joined table
      Source country   3rd party flow       Mapping IP(s)   Filtering       Destination country
      Spain            http://tracker.com   213.121.66.99   Ad + Tracking   Germany
      France           http://example.com   145.100.210.5   Clean           USA
      …                …                    …               …               …
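
The join on this slide can be illustrated with the two example rows it shows. This is a minimal sketch under assumptions: "confinement" is taken here to mean the share of a source country's ad and tracking flows that terminate in the same country or inside the EU28, and the `flows` and `ip_country` structures are hypothetical stand-ins for the extension data and the geolocation results.

```python
EU28 = {
    "Austria", "Belgium", "Bulgaria", "Croatia", "Cyprus", "Czech Republic",
    "Denmark", "Estonia", "Finland", "France", "Germany", "Greece", "Hungary",
    "Ireland", "Italy", "Latvia", "Lithuania", "Luxembourg", "Malta",
    "Netherlands", "Poland", "Portugal", "Romania", "Slovakia", "Slovenia",
    "Spain", "Sweden", "United Kingdom",
}

# Rows from the slide: (source country, 3rd party flow, mapping IP, filtering).
flows = [
    ("Spain", "http://tracker.com", "213.121.66.99", "Ad + Tracking"),
    ("France", "http://example.com", "145.100.210.5", "Clean"),
]
# Destination countries as shown on the slide.
ip_country = {"213.121.66.99": "Germany", "145.100.210.5": "USA"}


def confinement(flows, ip_country):
    """Per source country, the share of ad and tracking flows that end in
    the same country ('national') and inside the EU28 ('eu28')."""
    totals = {}
    for source, _url, ip, label in flows:
        if label != "Ad + Tracking":
            continue  # clean flows are filtered out
        dest = ip_country.get(ip)
        if dest is None:
            continue  # no geolocation available for this IP
        t = totals.setdefault(source, {"flows": 0, "national": 0, "eu28": 0})
        t["flows"] += 1
        t["national"] += dest == source
        t["eu28"] += dest in EU28
    return {
        src: {"national": t["national"] / t["flows"], "eu28": t["eu28"] / t["flows"]}
        for src, t in totals.items()
    }


levels = confinement(flows, ip_country)
```

For the slide's rows, the single Spanish tracking flow ends in Germany, so it is confined at EU28 level but not at national level; the French flow is labelled clean and does not count.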

  22. Results - EU 28 member states confinement level. Figure: MaxMind geo-location

  23. Results - EU 28 member states confinement level. Figures: MaxMind geo-location, RIPE IPmap geo-location

  24. What about sensitive websites?
      Sensitive categories as defined by GDPR:
      • Genetic & biometric data
      • Race & Ethnicity
      • Political beliefs
      • Religion
      • Health
      • Sexual Orientation

  25. Results - Sensitive websites based on EU 28 users. Figure labels: Sensitive Category, Destination Continent

  26. Challenges
      1. How to effectively detect ad and tracking related domains in the wild?
      2. How to ensure correct geolocation of infrastructure servers?
      3. How to ensure that all possible ad and tracking servers are observed?
      4. How to maintain a balance between accuracy and scalability?

  27. Scaling up: From real users to ISP flows

  28. Scaling up: From real users to ISP flows. Datasets: ISPs datasets + list of Ad + Tracking IPs (< 28k IPs)
