Bayesian Inference and Traffic Analysis
Carmela Troncoso, George Danezis
September-November 2008
Microsoft Research Cambridge / KU Leuven (COSIC)
Anonymous Communications
- “Tell me who your friends are...” => anonymous communications hide the communication partners.
- High-latency systems (e.g. anonymous remailers) use mixes [Chaum 81] to hide the input/output relationship.
[Figure: a message routed through a chain of mixes.]
Anonymous Communications
Attacks on mix networks:
- Restricted routes [Dan03]
- Bridging and fingerprinting [DanSyv08]
- Social information: Disclosure Attack [Kes03], Statistical Disclosure Attack [Dan03], Perfect Matching Disclosure Attack [Tron08]
- Heuristics and specific models
Mix networks and traffic analysis
Determine the probability distributions linking inputs to outputs.
[Figure: senders A, B, C route through MIX1, MIX2, MIX3 to receivers Q, R, S. A message leaving MIX1 is from A or B (1/2, 1/2); a message leaving MIX2 is from A, B or C (1/4, 1/4, 1/2). The resulting distributions over (A, B, C) are Q: (3/8, 3/8, 1/4), R: (3/8, 3/8, 1/4), S: (1/4, 1/4, 1/2).]
Mix networks and traffic analysis
Constraints change the picture, e.g. path length = 2.
[Figure: the same network; with the length-2 constraint the message MIX2 forwards to MIX3 can only be C, so the distributions over (A, B, C) become Q: (1/4, 1/4, 1/2), R: (1/4, 1/4, 1/2), S: (1/2, 1/2, 0).]
Non-trivial given the observation!
“The real thing”
[Figure: many senders, a network of mixes (threshold = 3), many receivers.]
How to compute the probabilities systematically?
Mix networks and traffic analysis
Find the “hidden state” of the mixes: Pr(HS | O, C) = ?
[Figure: senders A, B, C; mixes M1, M2, M3; receivers Q, R, S.]
Using the prior information:
Pr(HS | O, C) = Pr(O | HS, C) Pr(HS | C) / Z,  with Z = Σ_HS Pr(HS, O | C)
The sum over HS is too large to enumerate!
Mix networks and traffic analysis
“Hidden state” + observation = paths
[Figure: combining O and HS yields the full paths, e.g. P1: A → M1 → M2 → M3 → R, P2: B → M1 → M3 → Q, P3: C → M2 → S.]
Hence Pr(HS | O, C) = Pr(O | HS, C) Pr(HS | C) / Z = Pr(Paths | C) / Z
Bayesian Inference
Actually... we want marginal probabilities, e.g. Pr(A → Q | O, C).
[Figure: the example network with its output distributions over (A, B, C): Q: (3/8, 3/8, 1/4), R: (3/8, 3/8, 1/4), S: (1/4, 1/4, 1/2).]
Pr(A → Q | O, C) = Σ_j I_{A→Q}(HS_j) · Pr(HS_j | O, C)
But... we cannot obtain these marginals directly.
Bayesian Inference - sampling
If we obtain samples HS_1, HS_2, HS_3, HS_4, ..., HS_j ~ Pr(HS | O, C), the indicator (A → Q)? evaluates to e.g. 0, 1, 0, 1, ..., 1 on them, and
Pr(A → Q | O, C) ≈ (1/j) Σ_i I_{A→Q}(HS_i)
Markov Chain Monte Carlo methods (the Metropolis-Hastings algorithm) produce such samples from Pr(HS | O, C) = Pr(Paths | C) / Z.
What does Pr(Paths | C) look like?
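A minimal sketch of this estimator, assuming hidden-state samples are represented as sender → receiver mappings (a hypothetical format, not the talk's actual data structure):

```python
# Sketch: estimate the marginal Pr(A -> Q | O, C) from hidden-state samples.
# Each sample is assumed to be a dict mapping each sender to its receiver.

def estimate_marginal(samples, sender, receiver):
    """Average of the indicator I_{sender -> receiver} over the samples."""
    hits = sum(1 for hs in samples if hs.get(sender) == receiver)
    return hits / len(samples)

# Example with three samples; A's message reaches Q in two of them.
samples = [{"A": "Q", "B": "R"}, {"A": "R", "B": "Q"}, {"A": "Q", "B": "R"}]
print(estimate_marginal(samples, "A", "Q"))  # 0.666...
```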
Probabilistic model – Basic Constraints
- Users decide independently: Pr(Paths | C) = Π_x Pr(P_x | C)
- Length restrictions, with any distribution Pr(L = l | C); e.g. uniform on (L_min, L_max): Pr(L = l | C) = 1 / (L_max − L_min + 1)
- Node choice restrictions: choose l out of the N_mix nodes available, Pr(M_x | L = l, C) = 1 / P(N_mix, l); choose a set: I_set(M_x)
Putting the pieces together (a sketch follows below):
Pr(P_x | C) = Pr(L = l | C) · Pr(M_x | L = l, C) · I_set(M_x)
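A minimal sketch of this prior, assuming a uniform length distribution and an ordered choice of l distinct mixes; the function name and the trivial set indicator are illustrative only:

```python
from math import perm  # perm(n, l): ordered choices of l distinct items out of n

def path_prior(path_mixes, n_mix, l_min, l_max, valid_set=lambda mixes: True):
    """Pr(P_x | C) under the basic constraints:
    uniform length, ordered choice of l distinct mixes, set indicator I_set."""
    l = len(path_mixes)
    if not (l_min <= l <= l_max) or not valid_set(path_mixes):
        return 0.0
    pr_length = 1.0 / (l_max - l_min + 1)  # Pr(L = l | C), uniform on {l_min..l_max}
    pr_mixes = 1.0 / perm(n_mix, l)        # Pr(M_x | L = l, C)
    return pr_length * pr_mixes

# Example: a length-2 path through mixes 1 and 3 in a network of 3 mixes.
print(path_prior([1, 3], n_mix=3, l_min=1, l_max=3))  # (1/3) * (1/6)
```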
Probabilistic model – Basic Constraints
Unknown destinations: when a message is still inside the network only a prefix of its path is observed (l_obs hops, with L_max = 3 in the figure), so we sum over the possible full lengths:
Pr(P_x | C) = Σ_{l ≥ l_obs} Pr(L = l | C) · Pr(M_x | L = l, C) · I_set(M_x)
Probabilistic model – More Constraints
- Bridging: known nodes give an extra indicator I_bridging(M_x)
- Non-compliant clients (with probability p_cp):
  - do not respect the length restrictions, using their own bounds (L_min,cp, L_max,cp)
  - choose l out of the N_mix nodes available, allowing repetitions: Pr(M_x | L = l, C, I_cp(Path_x)) = 1 / P_r(N_mix, l)
Still Pr(Paths | C) = Π_x Pr(P_x | C), but each path prior becomes a mixture (a sketch follows below):
Pr(P_i | C) = p_cp · Pr(P_i | C, I_cp(P_i)) + (1 − p_cp) · Pr(P_i | C, ¬I_cp(P_i))
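A minimal sketch of this mixture, reusing path_prior from the previous sketch as the compliant term; p_cp and the non-compliant length bounds are free parameters here, not values from the talk:

```python
def path_prior_mixture(path_mixes, n_mix, compliant_prior, p_cp, l_min_cp, l_max_cp):
    """Pr(P_i | C) = p_cp * Pr(P_i | C, cp) + (1 - p_cp) * Pr(P_i | C, not cp).
    Non-compliant clients use their own length bounds and may repeat mixes."""
    l = len(path_mixes)
    if l_min_cp <= l <= l_max_cp:
        # Ordered choice of l mixes with repetition: N_mix^l possibilities.
        pr_cp = (1.0 / (l_max_cp - l_min_cp + 1)) * (1.0 / n_mix ** l)
    else:
        pr_cp = 0.0
    return p_cp * pr_cp + (1 - p_cp) * compliant_prior(path_mixes)

# Example, with the compliant prior from the previous sketch:
# path_prior_mixture([1, 1], n_mix=3,
#                    compliant_prior=lambda m: path_prior(m, 3, 1, 3),
#                    p_cp=0.1, l_min_cp=1, l_max_cp=5)
```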
Probabilistic model – More constraints
- Social network information: assuming we know the sending profiles Pr(Sen_x → Rec_x),
  Pr(P_x | C) = Pr(L = l | C) · Pr(M_x | L = l, C) · I_set(M_x) · Pr(Sen_x → Rec_x)
- Other constraints: unknown origin, dummies, other mixing strategies, ...
Markov Chain Monte Carlo
Sample from a distribution that is difficult to sample from directly:
Pr(HS | O, C) = Pr(O | HS, C) Pr(HS | C) / Z = Pr(Paths | C) / Z,  with Z = Σ_HS Pr(HS, O | C)
Key advantages:
- Only requires the generative model (which we know how to compute!)
- Good estimation of errors
- No false positives or negatives
- Systematic
Metropolis Hastings Algorithm
Constructs a Markov chain with stationary distribution Pr(HS | O, C), using a proposal distribution Q:
1. From the current state HS_current, draw a candidate HS_candidate ~ Q(· | HS_current).
2. Compute the acceptance ratio
   α = [Pr(HS_candidate) · Q(HS_current | HS_candidate)] / [Pr(HS_current) · Q(HS_candidate | HS_current)]
3. If α ≥ 1, set HS_current ← HS_candidate; else draw u ~ U(0, 1) and set HS_current ← HS_candidate if u ≤ α, otherwise keep HS_current.
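A generic sketch of the algorithm, working in log space for numerical stability; the target and proposal are passed in as callables, so this is an illustration rather than the talk's implementation:

```python
import math
import random

def metropolis_hastings(log_target, propose, initial_state, n_iter):
    """Generic Metropolis-Hastings sampler.
    propose(state) returns (candidate, log q(candidate|state), log q(state|candidate));
    log_target(state) is log Pr(state) up to the unknown constant Z, which cancels."""
    state = initial_state
    samples = []
    for _ in range(n_iter):
        candidate, log_q_fwd, log_q_back = propose(state)
        log_alpha = (log_target(candidate) + log_q_back) - (log_target(state) + log_q_fwd)
        if log_alpha >= 0 or random.random() < math.exp(log_alpha):
            state = candidate  # accept the candidate
        # otherwise keep the current state; it is recorded again as a sample
        samples.append(state)
    return samples

# Toy usage: sample integers 0..3 with target weights 1, 2, 3, 4 (symmetric proposal).
weights = [1.0, 2.0, 3.0, 4.0]
prop = lambda s: (random.randrange(4), math.log(0.25), math.log(0.25))
out = metropolis_hastings(lambda s: math.log(weights[s]), prop, 0, 20000)
print([out.count(i) / len(out) for i in range(4)])  # roughly 0.1, 0.2, 0.3, 0.4
```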
Our sampler: Q transition
The acceptance ratio becomes
[Pr(Paths_candidate | C) · Q(Paths_current | Paths_candidate)] / [Pr(Paths_current | C) · Q(Paths_candidate | Paths_current)]
since Pr(HS | O, C) = Pr(Paths | C) / Z and the unknown constant Z cancels.
Transition Q: a swap operation on the links inside a mix (a simplified sketch follows below).
[Figure: paths A → M1 → M3 → R and B → M1 → M3 → Q exchange their correspondences at M3.]
More complicated transitions are needed for non-compliant clients.
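A simplified sketch of one instance of such a swap: here only the final receivers of two paths that share their last mix are exchanged, with paths encoded as lists [sender, mix, ..., mix, receiver] (a hypothetical encoding; the actual sampler swaps links at any mix):

```python
import copy
import random

def propose_receiver_swap(paths):
    """Swap sketch: pick two paths whose last mix is the same and exchange
    their receivers.  The hidden links stay consistent with the observation,
    and the move is symmetric: Q(candidate|current) = Q(current|candidate)."""
    candidate = copy.deepcopy(paths)
    by_last_mix = {}                                  # last mix -> path indices
    for i, path in enumerate(candidate):
        by_last_mix.setdefault(path[-2], []).append(i)
    shared = [idxs for idxs in by_last_mix.values() if len(idxs) >= 2]
    if not shared:
        return candidate                              # no swap possible
    i, j = random.sample(random.choice(shared), 2)    # two paths through one mix
    candidate[i][-1], candidate[j][-1] = candidate[j][-1], candidate[i][-1]
    return candidate

# Example: the two messages leaving M3 may exchange receivers R and Q.
print(propose_receiver_swap([["A", "M1", "M3", "R"], ["B", "M1", "M3", "Q"]]))
```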
Iterations
Consecutive samples are dependent: Pr(Paths_j | Paths_i) ≠ Pr(Paths_j) when Paths_j follows Paths_i closely in the chain.
Use sufficiently separated samples, for which Pr(Paths_j | Paths_i) ≈ Pr(Paths_j) (a thinning sketch follows below).
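A trivial thinning sketch: keep only every step-th state of the chain to reduce the dependence between retained samples (the choice of step is left to the analyst):

```python
def thin(samples, step):
    """Keep every step-th Metropolis-Hastings state, discarding the rest."""
    return samples[::step]

# Example: keep every 10th of 100 states.
print(len(thin(list(range(100)), 10)))  # 10
```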
Error estimation
For each retained sample Paths_1, Paths_2, Paths_3, Paths_4, ... record the indicator I_{A→Q}(Paths_i) ∈ {0, 1}, e.g. 1, 0, 1, 0, ...
These indicators are Bernoulli trials with parameter Pr(A → Q):
Pr[Paths_1, Paths_2, Paths_3, ... | Pr(A → Q)]
With the prior Beta(1, 1) ~ uniform, the posterior is
Pr(A → Q) | Paths_1, Paths_2, Paths_3, ... ~ Beta( Σ_i I_{A→Q}(Paths_i) + 1 , Σ_i (1 − I_{A→Q}(Paths_i)) + 1 )
This also yields confidence intervals (a sketch follows below).
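A minimal sketch of this posterior and its credible interval, using SciPy's Beta distribution; the function name and the 95% mass are illustrative choices:

```python
from scipy import stats

def beta_posterior_interval(indicators, mass=0.95):
    """Posterior over Pr(A -> Q) given 0/1 indicator samples and a Beta(1, 1)
    (uniform) prior: Beta(hits + 1, misses + 1), plus a central credible interval."""
    hits = sum(indicators)
    misses = len(indicators) - hits
    posterior = stats.beta(hits + 1, misses + 1)
    return posterior.mean(), posterior.interval(mass)

# Example: A -> Q in two out of four collected samples.
print(beta_posterior_interval([1, 0, 1, 0]))  # mean 0.5, wide interval
```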
Evaluation
1. Create an instance of a network
2. Run the sampler
3. Choose a target sender and a receiver
4. Estimate the probability Pr(Sen → Rec) ≈ (1/j) Σ_j I_{Sen→Rec}(Paths_j)
5. Check whether Sen actually chose Rec as receiver: I_{Sen→Rec}(network)
6. Choose a new network and go to 2
Events should happen with the estimated probability (a toy sketch follows below):
E[ I_{Sen→Rec}(network) ] = (1/j) Σ_j I_{Sen→Rec}(Paths_j)
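A toy, self-contained stand-in for this check (not the talk's simulator): each "network" draws the target sender's receiver from a known profile and the "estimate" is the profile probability itself; across many networks the average estimate and the empirical frequency of the event should agree:

```python
import random

def calibration_check(n_networks, profile):
    """Compare the mean estimated Pr(Sen -> Rec) with how often Sen -> Rec
    actually happens, over many freshly generated toy networks."""
    receivers = list(profile)
    estimates, hits = [], []
    for _ in range(n_networks):
        true_rec = random.choices(receivers, weights=[profile[r] for r in receivers])[0]
        target = random.choice(receivers)  # the receiver whose probability we report
        estimates.append(profile[target])  # stand-in for the sampler's estimate
        hits.append(target == true_rec)    # did the event actually happen?
    return sum(estimates) / n_networks, sum(hits) / n_networks

# Both numbers should converge to the same value (here ~1/3).
print(calibration_check(100_000, {"Q": 0.5, "R": 0.3, "S": 0.2}))
```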
Results – compliant clients
[Figure: estimated probability (1/j) Σ_j I_{Sen→Rec}(Paths_j) plotted against the empirical frequency E[I_{Sen→Rec}(network)].]
Results – 50 messages
Results – 10 messages
Results – big networks
Performance – RAM usage

Nmix  t   Nmsg    Samples  RAM (MB)
3     3   10      500      16
3     3   50      500      18
5     10  100     500      19
10    20  1 000   1 000    24
10    20  10 000  1 000    125

- Memory grows with the size of the network and population: results are kept in memory during the simulation
- Memory grows as the number of samples collected increases
Performance – Running time

Nmix  t   Nmsg   Iterations  Full analysis (min)  One sample (ms)
3     3   10     6011        2.33                 267.68
3     3   50     6011        2.55                 306.00
5     10  100    4011        1.58                 190.35
10    20  1 000  7011        3.16                 379.76

- The per-sample operations should be O(1)
- Times include writing the results to a file
- Runs use different numbers of iterations
Conclusions
- Traffic analysis is non-trivial when there are constraints
- The probabilistic model incorporates most attacks, including non-compliant clients
- Markov Chain Monte Carlo methods extract the marginal probabilities
- Future work: an SDA based on Bayesian inference. Added value?
Thanks for your attention
Carmela.Troncoso@esat.kuleuven.be
Microsoft technical report coming soon…
Bayes theorem
Pr(O, HS | C) = Pr(HS | O, C) · Pr(O | C)
Pr(O, HS | C) = Pr(O | HS, C) · Pr(HS | C)
⇒ Pr(HS | O, C) = Pr(O | HS, C) · Pr(HS | C) / Pr(O | C),  with Pr(O | C) = Σ_HS Pr(HS, O | C)
Joint probability: Pr(X, Y) = Pr(X | Y) Pr(Y) = Pr(Y | X) Pr(X)