Website Fingerprinting Attacks and Defenses in the Tor Onion Space

  1. Website Fingerprinting Attacks and Defenses in the Tor Onion Space. Marc Juarez, imec-COSIC KU Leuven. COSIC Seminar, 23rd October 2017, Leuven

  2. Introduction • Contents of this presentation: - PETS’17: “Website Fingerprinting Defenses at the Application Layer” - CCS’17: “How Unique is Your Onion?”

  3. What is Website Fingerprinting (WF)? (Diagram of the threat model: the user, the entry, middle, and exit relays of the Tor network, the WWW, and the adversary observing the user's traffic.)

  4. Website Fingerprinting: deployment (Diagram: the attack's training and testing phases.)

  5. Why Do We Care? • Tor is the most popular anonymity network and aims to protect against such adversaries. • A series of successful attacks report accuracies greater than 90%. • … but how concerned should we be in practice? – Critical review of WF attacks (Juarez et al., 2014)

  6. Closed vs. Open World (Diagram contrasting the closed-world and the open-world evaluation settings.)

  7. Tor Hidden Services (HS) (Diagram: a user connecting to xyz.onion.) • HS: the user visits xyz.onion without resolving it to an IP address • Examples: WikiLeaks, GlobaLeaks, Facebook, ...

  8. Website Fingerprinting on Hidden Services (HSes) • A WF adversary can distinguish HSes from regular sites • Website fingerprinting on HSes is more threatening: - the smaller number of sites makes HSes more identifiable (closer to a closed world) - HS users are particularly vulnerable because the content is sensitive

  9. The SecureDrop case • Freedom of the Press Foundation • Whistleblowing platform • Vulnerable to website fingerprinting (?)

  10. Website Fingerprinting Defenses at the Application Layer. Giovanni Cherubin (Royal Holloway, University of London), Jamie Hayes (University College London), Marc Juarez (imec-COSIC KU Leuven). Presented at PETS 2017, Minneapolis, MN, USA

  11. Website Fingerprinting defenses • Existing WF defenses: BuFLO, CS-BuFLO, Tamaraw, WTF-PAD, … (Diagram: the user's connection through the entry and middle relays of the Tor network, carrying both real and dummy TCP packets or Tor messages.)

  12. Application-layer Defenses • Existing defenses are designed at the network layer • Key observation: the identifying information originates at the application layer! (Diagram: the web content carries ‘latent’ features F1, …, Fn at the HTTP(S) layer; a transformation T(·) through Tor, TLS (the last layer of encryption), and TCP yields the features O1, …, On observed by the adversary.)

  13. Pros and Cons of App-layer Defenses • The main advantage is that they are easier to implement: they do not depend on Tor • Cons: padding runs end-to-end and may require server collaboration… but HSes have incentives to collaborate!

  14. LLaMA vs. ALPaCA (two different solutions, not a client/server pair) • LLaMA: client-side (FF add-on), applied to website requests, its cost is mainly latency overhead • ALPaCA: server-side (the first server-side defense), applied to hosted content, its cost is mainly bandwidth overhead

  15. ALPaCA (Diagram: original page, target page, morphed page.) • Abstracts a web page as its number of objects and their sizes: pad them to match a target page • Does not impact the user experience: padding can be hidden in, e.g., HTML/JS comments, image metadata, or “display: none” styles
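A minimal sketch of the idea on this slide: pad an HTML document with an invisible comment so that its byte size reaches a larger target, leaving the rendered page unchanged. The function name and the comment-based padding are illustrative assumptions, not ALPaCA's actual implementation.

    # Pad an HTML document with a trailing comment so its size matches a
    # larger "target" size; the rendered page is unaffected.
    def pad_html(html: str, target_size: int) -> str:
        current = len(html.encode("utf-8"))
        overhead = len("<!--" + "-->")
        if target_size <= current + overhead:
            return html  # already at or above the target; padding cannot shrink a page
        filler = "A" * (target_size - current - overhead)
        return html + "<!--" + filler + "-->"

    if __name__ == "__main__":
        page = "<html><body><h1>SecureDrop</h1></body></html>"
        morphed = pad_html(page, target_size=4096)
        print(len(morphed.encode("utf-8")))  # 4096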

  16. ALPaCA strategies (1) Example: protect a SecureDrop page - Strategy 1: the target page is Facebook (Diagram: the SecureDrop page's objects, index.html, securedrop.png, and a fake.css, are padded to match the Facebook page's index.html, facebook.png, and style.css.)

  17. ALPaCA strategies (2) - Strategy 2: pad towards an “anonymity set” target page (Diagram: both the SecureDrop page (index.html, securedrop.png, fake.css) and the Facebook page (index.html, facebook.png, style.css) are padded towards a common target.) The target's number of objects and object sizes are defined either: • deterministically: rounded up to the next multiple of λ, δ • probabilistically: sampled from an empirical distribution
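The two target-selection strategies named above could be sketched as follows. Which of λ and δ applies to object sizes versus the object count is an assumption, as are the function names and the list-based representation of the empirical distributions; this is a sketch, not ALPaCA's code.

    import math
    import random

    def deterministic_target(sizes, lam=512, delta=5):
        # Round each object size up to the next multiple of lam and the number
        # of objects up to the next multiple of delta (adding fake objects of
        # size lam). Which parameter governs sizes vs. counts is assumed here.
        padded = [math.ceil(s / lam) * lam for s in sizes]
        n_target = math.ceil(len(padded) / delta) * delta
        padded += [lam] * (n_target - len(padded))
        return padded

    def probabilistic_target(sizes, size_dist, count_dist):
        # Sample a target object count and object sizes from empirical
        # distributions (lists of observed values), constrained to be at least
        # the originals, since padding can only grow a page.
        n_target = max(len(sizes), random.choice(count_dist))
        targets = []
        for i in range(n_target):
            original = sizes[i] if i < len(sizes) else 0
            candidates = [s for s in size_dist if s >= original] or [original]
            targets.append(random.choice(candidates))
        return targets

    if __name__ == "__main__":
        page_objects = [1300, 4200, 700]  # object sizes in bytes
        print(deterministic_target(page_objects))
        print(probabilistic_target(page_objects, size_dist=[512, 2048, 8192], count_dist=[4, 6, 8]))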

  18. Evaluation: methodology • Collect traffic for 100 HSes (cached), with and without the defense ○ Security: accuracy of the kNN, k-Fingerprinting (kFP), and CUMUL attacks ○ Performance: overheads - latency (extra delay) - bandwidth (extra padding/time)
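For concreteness, a hedged sketch of how the two overheads can be computed as relative increases over the undefended page load. The slide's definitions are terse, so this is one common convention rather than the paper's exact formula.

    def overheads(undefended_bytes, defended_bytes, undefended_time, defended_time):
        # Relative increase of the defended load over the undefended baseline.
        bandwidth = (defended_bytes - undefended_bytes) / undefended_bytes
        latency = (defended_time - undefended_time) / undefended_time
        return bandwidth, latency

    # Example: a page that grows from 200 KB to 370 KB and from 4 s to 6 s of load time
    print(overheads(200_000, 370_000, 4.0, 6.0))  # (0.85, 0.5)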

  19. ALPaCA: results • A 40% to 60% decrease in attack accuracy • 50% latency and 85% bandwidth overheads

  20. (Figure-only slide.)

  21. How Unique is Your Onion? An Analysis of the Fingerprintability of Tor Onion Services. Rebekah Overdorf (Drexel University), Marc Juarez (imec-COSIC KU Leuven), Gunes Acar (imec-COSIC KU Leuven), Rachel Greenstadt (Drexel University), Claudia Diaz (imec-COSIC KU Leuven). To be presented at CCS 2017, Dallas, TX, USA

  22. Disparate impact • WF attacks normally report average success • But… – Are certain websites more susceptible to website fingerprinting attacks than others? – What makes some sites more vulnerable to the attack than others? Credit: Claudia Diaz

  23. State-of-the-Art Attacks - k-NN (Wang et al., 2015) - CUMUL (Panchenko et al., 2016) - k-Fingerprinting (Hayes and Danezis, 2016)

  24. k-NN (Wang et al. 2015) • Features – number of outgoing packets in spans of 30 packets – the lengths of the first 20 packets – traffic bursts (sequences of packets in the same direction) • Classification – k-NN; the weights of the distance metric are tuned to minimize the distance among instances that belong to the same site • Results – 90% to 95% accuracy on a closed world of 100 non-hidden-service websites. Credit: Bekah Overdorf
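A rough sketch of how features of this kind can be extracted from a traffic trace. The signed-length trace representation (positive = outgoing, negative = incoming) and the exact burst statistics are assumptions, not Wang et al.'s implementation.

    def extract_knn_style_features(trace):
        # trace: list of signed packet lengths (+ outgoing, - incoming).
        features = []

        # Number of outgoing packets in consecutive spans of 30 packets.
        for start in range(0, len(trace), 30):
            span = trace[start:start + 30]
            features.append(sum(1 for p in span if p > 0))

        # Lengths of the first 20 packets (zero-padded if the trace is shorter).
        first20 = [abs(p) for p in trace[:20]]
        features.extend(first20 + [0] * (20 - len(first20)))

        # Burst features: maximal runs of packets in the same direction.
        bursts, run = [], 1
        for prev, cur in zip(trace, trace[1:]):
            if (prev > 0) == (cur > 0):
                run += 1
            else:
                bursts.append(run)
                run = 1
        if trace:
            bursts.append(run)
        features.extend([max(bursts, default=0),
                         sum(bursts) / len(bursts) if bursts else 0,
                         len(bursts)])
        return features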

  25. CUMUL (Panchenko et al. 2016) • Features – 100 interpolation points of the cumulative sum of packet lengths (with direction) • Classification – SVM with a radial basis function (RBF) kernel • Results – 90% to 93% accuracy for 100 non-HS sites. Credit: Bekah Overdorf
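The CUMUL feature idea lends itself to a short sketch: take the cumulative sum of signed packet lengths and resample it at 100 evenly spaced points. Details such as prepended aggregate counts are omitted; this is not Panchenko et al.'s code.

    import numpy as np

    def cumul_features(trace, n_points=100):
        # trace: signed packet lengths (+ outgoing, - incoming).
        cumulative = np.cumsum(trace)
        x_old = np.arange(1, len(cumulative) + 1)        # positions of the original curve
        x_new = np.linspace(1, len(cumulative), n_points)  # interpolation grid
        return np.interp(x_new, x_old, cumulative)

    # Example with a short toy trace
    print(cumul_features([+512, -1448, -1448, +512, -1448], n_points=10))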

  26. k-Fingerprinting (Hayes and Danezis 2016) • Features – timing and size features from the literature • Classification – Random Forest (RF) + k-NN • Results – 90% accuracy on 30 hidden services. Credit: Bekah Overdorf
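One common reading of “RF + k-NN”, loosely following Hayes and Danezis: use the leaves each tree assigns to an instance as its fingerprint, then classify fingerprints with k-NN under Hamming distance. The toy data and hyperparameters below are placeholders, not the paper's setup.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.neighbors import KNeighborsClassifier

    # Toy stand-in data: rows are traces described by timing/size features, labels are sites.
    rng = np.random.default_rng(0)
    X_train, y_train = rng.normal(size=(300, 40)), rng.integers(0, 30, size=300)
    X_test = rng.normal(size=(10, 40))

    # 1) Train a random forest on the hand-crafted features.
    forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

    # 2) Use the leaf each tree assigns to an instance as its "fingerprint".
    train_leaves = forest.apply(X_train)  # shape: (n_samples, n_trees)
    test_leaves = forest.apply(X_test)

    # 3) Classify fingerprints with k-NN under Hamming distance.
    knn = KNeighborsClassifier(n_neighbors=3, metric="hamming").fit(train_leaves, y_train)
    print(knn.predict(test_leaves))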

  27. Data • Crawled 790 sites over Tor (homepages) • Removed – Offline sites – Failed visits – Duplicates • 482 sites fit our criteria with 70 visits each Credit: Bekah Overdorf

  28. (Figure-only slide.) Credit: Bekah Overdorf

  29. SecureDrop sites • There was a SecureDrop site in our dataset: – Project On Gov’t Oversight (POGO) • CUMUL achieved 99% accuracy on it! – compared to 80% on average

  30. Misclassifications of Hidden Services Credit: Bekah Overdorf

  31. Misclassifications of Hidden Services Credit: Bekah Overdorf

  32. Median of total incoming packet size for misclassified instances Credit: Bekah Overdorf

  33. Low-level Feature Analysis • Intra-class variance: variability among instances of the same site – lower intra-class variance improves identification • Inter-class variance: variability among instances of different sites – higher inter-class variance improves identification • Top features: 1. Total size of outgoing packets 2. Total size of incoming packets 3. Number of incoming packets 4. Number of outgoing packets
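A small sketch of the four top-ranked features and of simple intra-/inter-class variance proxies. The signed-length trace representation and the exact variance definitions are assumptions made for illustration.

    import numpy as np

    def top_features(trace):
        # trace: signed packet lengths (+ outgoing, - incoming).
        out_sizes = [p for p in trace if p > 0]
        in_sizes = [-p for p in trace if p < 0]
        return [sum(out_sizes), sum(in_sizes), len(in_sizes), len(out_sizes)]

    def intra_inter_variance(values_by_site):
        # values_by_site: dict mapping site -> list of one feature's values across
        # that site's instances. Intra-class: mean within-site variance;
        # inter-class: variance of the per-site means.
        intra = float(np.mean([np.var(v) for v in values_by_site.values()]))
        inter = float(np.var([np.mean(v) for v in values_by_site.values()]))
        return intra, inter

    # Example: total outgoing size for two sites, three visits each
    print(top_features([+512, -1448, -1448, +512, -1448]))
    print(intra_inter_variance({"siteA": [5100, 5200, 5050], "siteB": [910, 930, 905]}))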

  34. Site-level Feature Analysis • Can we determine what characteristics of a website affect its fingerprintability? • Site-level features: – Number of embedded resources – Number of fonts – Screenshot size – Use of a CMS? – …

  35. Can we predict if a site will be fingerprintable? • “Meta-classifier”: a random forest regressor
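A hedged sketch of such a meta-classifier: a random forest regressor fitted on site-level features to predict a per-site fingerprintability score (e.g., the attacks' per-site F1). The feature set and numbers below are toy placeholders, not the paper's data.

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    # One row per site: [embedded resources, fonts, screenshot size in KB, uses a CMS]
    site_features = np.array([
        [12, 2, 340, 1],
        [85, 6, 2100, 1],
        [3, 1, 90, 0],
        [40, 4, 800, 0],
    ])
    fingerprintability = np.array([0.95, 0.55, 0.99, 0.70])  # toy per-site scores

    regressor = RandomForestRegressor(n_estimators=200, random_state=0)
    regressor.fit(site_features, fingerprintability)

    # Feature importances indicate which site-level characteristics drive
    # how identifiable a site is (cf. the results on slide 36).
    print(regressor.feature_importances_)
    print(regressor.predict([[5, 1, 120, 0]]))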

  36. Results: importance of site-level features

  37. Takeaways • WF threatens Tor, especially its hidden services. • Disparate impact: some pages are more fingerprintable than others (there is a bias in reporting average results). • WF defenses that alter the website design (app layer) are easier to implement and as effective as network-layer defenses. • Changes to the page that protect against WF: – small (e.g., fewer resources) and dynamic.

  38. Takeaways (continued) • Future work: re-design ALPaCA to follow these guidelines.

  39. Software and Data • HSes have incentives to support server-side defenses: SecureDrop has implemented a prototype of ALPaCA • ALPaCA is running on an HS: 3tmaadslguc72xc2.onion • Source code of the defenses: github.com/camelids • Source code and data for the fingerprintability analysis: cosic.esat.kuleuven.be/fingerprintability

  40. The HS world • Exploratory crawl: 5,000 HSes (from Ahmia.fi) • Stats for the HS world (from intercepted HTTP headers) - distribution of types, sizes, and number of resources • Most HSes are small compared to an average website • Few HSes have any JS or 3rd-party content - JS: less than 13% (assumption in the analysis: no JS) - 3rd-party content: less than 20% (assumption: no 3rd parties)

  41. Limitations and Future Work • ALPaCA can only make sites bigger, not smaller • What is the optimal padding at the app layer? There is a lack of a thorough feature analysis. • How do the distributions change over time? How do we update our defenses accordingly? - How does the strategy need to be adapted as HSes adopt our defense(s)?

  42. LLaMA (Diagram: a client-server exchange of requests C1 and C2, with a random delay δ and a repeated request C1'.) • Inspired by Randomized Pipelining; goal: randomize HTTP requests • The same goal, achieved from a FF add-on: - random delays (δ) - repeating previous requests (e.g., C1)
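An illustrative sketch (not the Firefox add-on) of the two LLaMA strategies named above: a random delay before each request, and an occasional re-issue of a previously sent request. The send callable and the parameter values are assumptions.

    import random
    import time

    def llama_send(urls, send, max_delay=0.5, repeat_prob=0.2):
        # send: any callable that performs the request (an assumption for this sketch).
        sent = []
        for url in urls:
            time.sleep(random.uniform(0, max_delay))  # random delay (delta)
            send(url)
            sent.append(url)
            if random.random() < repeat_prob:
                send(random.choice(sent))             # repeat a previous request
        return sent

    # Example with a dummy sender that just logs the request
    llama_send(["index.html", "style.css", "logo.png"], send=print, max_delay=0.1)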

  43. LLaMA: results • Accuracy drops by between 20% and 30% • Less than 10% latency and bandwidth overheads
