
Inverting Sampled Traffic. Nicolas Hohn, Darryl Veitch



  1. Inverting Sampled Traffic
     Nicolas Hohn, Darryl Veitch
     Australian Research Council Special Research Center for Ultra-Broadband Information Networks, The University of Melbourne

  2. Inverting Sampled Traffic
     - Motivation
     - Sampling Techniques
       - Packet Sampling
       - Flow Sampling
     - Comparison of sampling techniques
       - Distribution of the number of packets per flow
       - Spectral density of packet arrival process
     - Application to traffic modelling


  4. Introduction
     Motivation
     - Traffic statistics collected by routers don't scale well with link speed: exact traffic logging is impossible for backbone links.
     - Need to sample the traffic and export partial statistics.
     - Aim: infer statistics of original traffic from partial measurements.
     Short history
     - 1993: Claffy et al. advocate sampling techniques at the packet level to reduce the load on the measuring infrastructure.
     - 2002-2003: Duffield et al. give estimates of first-order quantities from packet-level sampled traffic: average rate, mean number of packets per flow.



  9. Packet Sampling
     [Figure: original traffic on a time axis, i.i.d. sampling with probability $q$, and the resulting sampled traffic.]
     Simple example: recover original packet rate
     - Sample packets with probability $q$,
     - Measure rate of sampled traffic $\lambda^{(q)}$,
     - Infer rate of original traffic $\lambda^{(q)}/q$.
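The rate-recovery example can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the Poisson arrival model, the rate of 50 packets per second, the 1000 s duration and the function names are all assumptions made for the sketch.

```python
import random

def thin_packets(timestamps, q, rng):
    """Keep each packet independently with probability q (i.i.d. sampling)."""
    return [t for t in timestamps if rng.random() < q]

def estimate_original_rate(sampled_timestamps, duration, q):
    """Measure the sampled rate lambda^(q) and divide by q."""
    sampled_rate = len(sampled_timestamps) / duration
    return sampled_rate / q

rng = random.Random(1)
duration = 1000.0   # seconds (hypothetical trace length)
true_rate = 50.0    # packets per second (hypothetical)

# Assumption: Poisson arrivals, used only to generate a test trace.
timestamps = []
t = rng.expovariate(true_rate)
while t < duration:
    timestamps.append(t)
    t += rng.expovariate(true_rate)

q = 0.1
sampled = thin_packets(timestamps, q, rng)
estimate = estimate_original_rate(sampled, duration, q)
print(f"true rate {true_rate}, estimate from sampled traffic {estimate:.1f}")
```

The estimate is unbiased for any arrival process because each packet is kept with the same probability; the Poisson model only serves to produce a trace.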

  10. Terminology
      IP flow: set of packets with the same 5-tuple (IP protocol, source address, destination address, source port, destination port).
      [Figure: the same traffic shown at flow level and at packet level on a time axis.]


  12. Original Traffic
      [Figure: original traffic on a time axis.]
      Recovering original flow sizes not straightforward

  13. Flow Sampling
      [Figure: flow-sampled traffic on a time axis.]
      No 'inversion' problems


  15. Packet Sampling
      [Figure: packet-sampled traffic on a time axis.]
      Recovering original flow sizes not straightforward
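The contrast between the two schemes can be illustrated with a small simulation. This is a sketch under stated assumptions: each packet is reduced to a hypothetical flow key standing in for the 5-tuple, flow sizes are drawn uniformly between 1 and 20, and the function names are invented for the example.

```python
import random
from collections import Counter

def packet_sample(packets, q, rng):
    """Keep each packet independently with probability q."""
    return [flow_id for flow_id in packets if rng.random() < q]

def flow_sample(packets, q, rng):
    """Keep or drop whole flows: the decision is made once per flow key."""
    keep = {}
    kept_packets = []
    for flow_id in packets:
        if flow_id not in keep:
            keep[flow_id] = rng.random() < q
        if keep[flow_id]:
            kept_packets.append(flow_id)
    return kept_packets

rng = random.Random(7)
# Hypothetical trace: one entry per packet, holding only its flow key.
true_sizes = {f"flow{i}": rng.randint(1, 20) for i in range(2000)}
packets = [fid for fid, n in true_sizes.items() for _ in range(n)]
rng.shuffle(packets)

q = 0.5
sizes_packet = Counter(packet_sample(packets, q, rng))
sizes_flow = Counter(flow_sample(packets, q, rng))

# Flow sampling leaves the size of every retained flow untouched.
flow_sizes_intact = all(true_sizes[f] == n for f, n in sizes_flow.items())
# Packet sampling shrinks flows, and some flows vanish entirely.
missing_flows = len(true_sizes) - len(sizes_packet)
print(flow_sizes_intact, missing_flows)
```

Under flow sampling the observed sizes are a plain subsample of the original sizes, whereas under packet sampling every observed size is a thinned version of the original, which is what makes recovering the original flow sizes harder.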


  17. Distribution of number of packets per flow
      [Figure: original traffic, packet-sampled traffic and flow-sampled traffic on time axes.]
      - Packet sampling: potential inversion problems
      - Flow sampling: no 'inversion' problems

  18. Distribution of number of packets per flow: packet sampling
      - $p_j$: probability that a flow had $j$ packets before sampling.
      - $p_k^{(q)}$: probability that a flow has $k$ packets after sampling,
      $$p_k^{(q)} = \sum_{j=k}^{\infty} \Pr\{k \text{ packets after thinning} \mid j \text{ packets before thinning}\}\, p_j = \sum_{j=k}^{\infty} \binom{j}{k} q^k (1-q)^{j-k}\, p_j \qquad (1)$$
      Aim: express $p_j$ as a function of $p_k^{(q)}$ by inverting (1).
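Equation (1) is a binomial mixture and can be evaluated directly for a distribution with finite support. The sketch below assumes a small hypothetical distribution `p` over flow sizes 0 to 3; the function name is illustrative.

```python
from math import comb

def thin_distribution(p, q):
    """Apply equation (1): p[j] is Pr{j packets before sampling};
    the result holds Pr{k packets after sampling} for k = 0..len(p)-1."""
    size = len(p)
    return [
        sum(comb(j, k) * q**k * (1 - q)**(j - k) * p[j] for j in range(k, size))
        for k in range(size)
    ]

# Hypothetical original distribution: flows of 1, 2 or 3 packets.
p = [0.0, 0.5, 0.3, 0.2]
p_q = thin_distribution(p, 0.5)
print(p_q)
```

Even though no original flow is empty in this example, the thinned distribution puts positive mass on $k = 0$: flows that lose all their packets are never observed.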

  19. Inverting (1) with generating functions
      Definition: $G_P(z) = \sum_{j=0}^{\infty} p_j z^j, \quad z \in D(0,1)$.
      - $D(z, r)$: open disc centered at $z$ with radius $r$.
      - Singularity at $z = 1$ if heavy tailed distribution.
      From (1):
      $$G_P^{(q)}(z) = \sum_{k} p_k^{(q)} z^k = G_P(1 - q + qz), \quad z \in D(0,1)$$
      $$G_P(z) = G_P^{(q)}\!\left(\frac{z - (1-q)}{q}\right), \quad z \in D(1-q, q)$$
      Aim: find power series expansion of $G_P$ at $z = 0$.
      Methods:
      - Analytic Continuation
      - Cauchy Integral

  20. Scheme 1: Analytic Continuation, $q = 0.6$
      [Figure: expansion points $z_0$, $z_1$ in the complex plane, both axes from $-1$ to $1$.]
      $$p_j = \sum_{n=j}^{\infty} (-1)^{n-j} \binom{n}{j} \frac{(1-q)^{n-j}}{q^n}\, p_n^{(q)} \qquad (2)$$
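For a distribution with finite support, equation (2) can be checked against equation (1) with exact rational arithmetic. This is a minimal sketch, assuming a hypothetical four-point distribution and $q = 3/5$; `thin` and `invert_thinning` are illustrative names, and the truncation of the infinite sum at the support size is exact only because the support is finite.

```python
from fractions import Fraction
from math import comb

def thin(p, q):
    """Equation (1): distribution of flow sizes after packet sampling."""
    size = len(p)
    return [
        sum(comb(j, k) * q**k * (1 - q)**(j - k) * p[j] for j in range(k, size))
        for k in range(size)
    ]

def invert_thinning(p_q, q):
    """Equation (2): recover the original distribution from the thinned one."""
    size = len(p_q)
    return [
        sum(
            (-1)**(n - j) * comb(n, j) * (1 - q)**(n - j) / q**n * p_q[n]
            for n in range(j, size)
        )
        for j in range(size)
    ]

q = Fraction(3, 5)
p = [Fraction(0), Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]
recovered = invert_thinning(thin(p, q), q)
print(recovered == p)
```

Each term of (2) carries the weight $\left(\frac{1-q}{q}\right)^{n-j} q^{-j}$ with alternating sign, so for $q < 0.5$ the weights grow geometrically in $n$ and any estimation error in $p_n^{(q)}$ is amplified accordingly.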

  21. Scheme 1: Analytic Continuation, $q = 0.1$
      [Figure: chain of expansion points $z_0, z_1, \dots, z_5$ in the complex plane, both axes from $-1$ to $1$.]
      $p_j = \dots$

  22. Scheme 2: Cauchy Integral
      $$p_j = \frac{1}{2\pi i} \oint_S \frac{G_P(z)}{z^{j+1}}\, dz \qquad (3)$$
      - $S$: any closed contour containing the origin, for instance $D(0,1)$.
      - Inversion methods work well when $G_P$ can be directly evaluated on $S$.
      - Values of $G_P$ on $D(0,1)$ are unknown: obtained with Padé approximants.
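When $G_P$ can be evaluated on the contour, the Cauchy integral reduces to a discrete Fourier sum over points on a circle. The sketch below assumes a geometric distribution whose generating function is known in closed form, which is not the situation of the talk (there the values of $G_P$ must first be approximated); the radius, number of points and function names are choices made for the example.

```python
import cmath

def coefficient_by_contour(G, j, radius=0.5, n_points=256):
    """Approximate p_j = (1 / (2*pi*i)) * contour integral of G(z) / z**(j+1)
    over a circle of the given radius, using the trapezoidal rule."""
    total = 0j
    for m in range(n_points):
        theta = 2 * cmath.pi * m / n_points
        z = radius * cmath.exp(1j * theta)
        total += G(z) * cmath.exp(-1j * j * theta)
    return (total / (n_points * radius**j)).real

# Assumption: geometric distribution p_j = (1 - a) * a**j, so that
# G_P(z) = (1 - a) / (1 - a * z) is available in closed form.
a = 0.8

def G_geometric(z):
    return (1 - a) / (1 - a * z)

p3 = coefficient_by_contour(G_geometric, 3)
print(p3, (1 - a) * a**3)
```

The trapezoidal rule converges very quickly here because the integrand is analytic and periodic on the circle; the hard part in practice is obtaining trustworthy values of $G_P$ on the contour.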

  23. Distribution of number of packets per flow, $q = 0.6$
      [Figure: log-log plot of $\Pr(P = j)$ versus $j$ (number of packets per flow); curves: theoretical original density, flow thinning, packet thinning with scheme 1, packet thinning with scheme 2.]

  24. Distribution of number of packets per flow
      Packet sampling
      - Easy to implement,
      - Need for consistent flow definition for sampled traffic (new timeout $T_0$),
      - Problems to estimate $p_0^{(q)}$ from sampled data,
      - Severe numerical issues to recover the packet distribution ("impossible" for $q < 0.5$!).
      Flow sampling
      - Need on-line processing to create flows,
      - No need to change flow definition,
      - No inversion to recover packet distribution,
      - $q$ plays no theoretical role. Only the remaining number of flows matters for the estimation.

  25. Spectral density of packet arrival process
      [Figure: original traffic, packet-sampled traffic and flow-sampled traffic on time axes.]
      - Packet sampling: potential inversion problems
      - Flow sampling: potential inversion problems

  26. Spectral density of packet arrival process
      - $\Gamma_X(\omega)$: spectral density of original traffic
      - $\Gamma_X^{(q)}(\omega)$: spectral density of sampled traffic
      Packet sampling
      Results from the theory of thinned point processes give direct inversion:
      $$\Gamma_X(\omega) = \frac{1}{q^2}\left(\Gamma_X^{(q)}(\omega) - (1-q)\,\lambda^{(q)}\right)$$
      Flow sampling
      Assumptions needed: flow arrivals follow a Poisson process, flows are uncorrelated.
      $$\Gamma_X(\omega) = \frac{1}{q}\,\Gamma_X^{(q)}(\omega)$$
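Both inversion formulas act pointwise on the estimated spectrum and are straightforward to apply. The sketch below assumes the spectrum is available as a list of values on some frequency grid and uses invented function names. The sanity check relies on an assumption about normalisation: that a Poisson process of rate $\lambda$ has the flat spectral density $\lambda$, which is the convention under which the packet-sampling formula maps a thinned Poisson process back to the original one.

```python
def invert_packet_sampled_spectrum(gamma_q, rate_q, q):
    """Packet sampling: Gamma_X = (Gamma_X^(q) - (1 - q) * lambda^(q)) / q**2."""
    return [(g - (1 - q) * rate_q) / q**2 for g in gamma_q]

def invert_flow_sampled_spectrum(gamma_q, q):
    """Flow sampling (Poisson flow arrivals, uncorrelated flows): Gamma_X = Gamma_X^(q) / q."""
    return [g / q for g in gamma_q]

# Sanity check on a Poisson process of rate lam (hypothetical value).
# Packet thinning yields a Poisson process of rate q * lam, whose
# spectrum is flat at q * lam under the assumed normalisation.
lam, q = 100.0, 0.1
sampled_rate = q * lam
sampled_spectrum = [sampled_rate] * 4
inferred = invert_packet_sampled_spectrum(sampled_spectrum, sampled_rate, q)
print(inferred)
```

The division by $q^2$ in the packet-sampling case means that estimation noise in $\Gamma_X^{(q)}$ is magnified by $1/q^2$, compared with $1/q$ for flow sampling.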

  27. Study Second Order Structure
      Analysis tools: Discrete Wavelet Transform
      Definition: comparison of a signal $X(t)$ with a family of functions $\psi_{j,k}$ by means of inner products
      $$d_X(j,k) = \langle X, \psi_{j,k} \rangle,$$
      where $\psi_{j,k}(t) = 2^{-j/2}\,\psi(2^{-j}t - k)$ and $\psi$ is the mother wavelet, localised both in time and frequency.
      Properties:
      - $\{d_X(j,k),\ k \in \mathbb{Z}\}$ is stationary and short range dependent for $j$ fixed, with variance$(j) = E|d_X(j,k)|^2$,
      - For scaling processes: $E|d_X(j,k)|^2 = 2^{j\alpha}\, E|d_X(0,k)|^2$,
      - For LRD processes: $E|d_X(j,k)|^2 \sim 2^{j\alpha}\, E|d_X(0,k)|^2$ for large $j$.
      Wavelet spectrum estimate: $\log_2\!\left(\frac{1}{n_j}\sum_k |d_X(j,k)|^2\right)$ vs $j$.
      Link with power spectral density:
      $$E|d_X(j,k)|^2 = \int \Gamma_X(\nu)\, 2^j\, |\Psi(2^j \nu)|^2\, d\nu$$
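The wavelet spectrum estimate can be sketched with the Haar wavelet in plain Python. The choice of Haar is an assumption made for simplicity (the slides do not name the mother wavelet), as are the function names, the discretised input series and the `min_coeffs` cut-off that drops octaves with too few coefficients.

```python
import math
import random

def haar_details(x):
    """Orthonormal Haar transform: return the detail coefficients
    d_X(j, k) for octaves j = 1, 2, ... as a list of lists."""
    details = []
    approx = list(x)
    while len(approx) >= 2:
        pairs = len(approx) // 2
        detail = [(approx[2 * i] - approx[2 * i + 1]) / math.sqrt(2) for i in range(pairs)]
        approx = [(approx[2 * i] + approx[2 * i + 1]) / math.sqrt(2) for i in range(pairs)]
        details.append(detail)
    return details

def wavelet_spectrum(x, min_coeffs=8):
    """Return (j, log2 of the mean squared detail coefficient) per octave.
    Note: an all-zero octave would make log2 fail; not handled in this sketch."""
    spectrum = []
    for j, detail in enumerate(haar_details(x), start=1):
        if len(detail) < min_coeffs:
            break
        energy = sum(c * c for c in detail) / len(detail)
        spectrum.append((j, math.log2(energy)))
    return spectrum

# Hypothetical input: unit-variance white noise, whose spectrum should be flat near 0.
rng = random.Random(3)
series = [rng.gauss(0.0, 1.0) for _ in range(2**14)]
spectrum = wavelet_spectrum(series)
for j, value in spectrum:
    print(j, round(value, 2))
```

For traffic data the input would typically be a series of packet counts per time bin; a scaling process then shows up as a straight line of slope $\alpha$ in the plot of the estimate against $j$.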

  28. Spectral density: $q = 0.1$
      [Figure: wavelet spectrum, $\log_2 \mathrm{Var}(d_j)$ versus octave $j = \log_2(a)$; curves: Original, Packet Thinned, Inferred from Packet Thinned, Flow Thinned, Inferred from Flow Thinned.]

  29. Spectral density: $q = 0.001$
      [Figure: wavelet spectrum, $\log_2 \mathrm{Var}(d_j)$ versus octave $j = \log_2(a)$; curves: Original, Packet Thinned, Inferred from Packet Thinned, Flow Thinned, Inferred from Flow Thinned.]

  30. Conclusions
      Packet Sampling
      - Easy to implement,
      - Need for consistent flow definition for sampled traffic (new timeout $T_0$),
      - Problems to estimate $p_0^{(q)}$ from sampled data,
      - Severe numerical issues to recover the packet distribution ("impossible" for $q < 0.5$!),
      - Inaccurate estimation of the spectrum from sampled traffic for small $q$.
      Flow Sampling
      - Need on-line processing to create flows,
      - No need to change flow definition,
      - No inversion to recover packet distribution,
      - $q$ plays no theoretical role. Only the remaining number of flows matters for the estimation,
      - Accurate spectrum estimation.


  32. Application to traffic modelling
      Aim
      - Fit model to sampled traffic,
      - Infer model parameters for unsampled traffic.
      Theory
      - Closure properties of the Bartlett-Lewis Point Process under both packet and flow sampling.
      Practice
      - Only flow thinning is applicable.

  33. Sampling the Bartlett-Lewis Point Process
      [Figure: wavelet spectrum, $\log_2 \mathrm{Var}(d_j)$ versus octave $j = \log_2(a)$; curves: Original, BLPP matched to Original, Flow Thinned, BLPP matched to Flow Thinned, BLPP reconstructed from Thinned.]
