Statistical and Applied Mathematical Sciences Institute
Wavelet & SiZer Analyses of Internet Traffic Data
Cheolwoo Park (Joint work with various researchers)
SAMSI, May 27, 2004
Outline: Wavelet & SiZer Analyses of Internet Traffic Data
- Data description & Goal
- Wavelet Spectrum
- Motivation
- Example 1
- Example 2
- Example 3
Data & Goal: Data Description
- UNC main link in 2002 & 2003
- Packet / byte counts at 10 ms intervals
- Different days and times: 28 traces
- 2 hour time blocks
- http://www-dirt.cs.unc.edu/ts/
Data & Goal: What are we interested in?
- Explore traffic behavior
  - Long Range Dependence (LRD) property
  - Nonstationary behavior
- Invent and develop statistical tools for data analysis
- Goodness of fit test of network models
  - How do we know this works like real traffic?
Wavelet Spectrum: Wavelets
- Vanishing moments: $\int x^l \psi_N(x)\,dx = 0, \quad l = 0, 1, \ldots, N-1$
- $d_{j,k} = \int Y(t)\, 2^{-j/2}\, \psi_N(2^{-j} t - k)\,dt, \quad k \in \mathbb{Z}$: wavelet coefficients
  - j (scale), k (location)
  - fast computation
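The "fast computation" of the coefficients $d_{j,k}$ can be sketched with the pyramid algorithm. This is a minimal illustration that assumes the Haar wavelet (a single vanishing moment, N = 1), which the slides do not specify; the function name `haar_details` is hypothetical.

```python
import math

def haar_details(y, levels):
    """Return {j: [d_{j,k} for k = 0, 1, ...]} for scales j = 1..levels.

    Assumption: Haar wavelet (N = 1 vanishing moment), orthonormal scaling.
    Each pass halves the series, so the total cost is linear in len(y).
    """
    approx = list(y)
    details = {}
    root2 = math.sqrt(2.0)
    for j in range(1, levels + 1):
        pairs = len(approx) // 2
        if pairs == 0:
            break
        # Detail coefficient: scaled difference of neighbouring approximations.
        details[j] = [(approx[2 * k] - approx[2 * k + 1]) / root2
                      for k in range(pairs)]
        # Coarser approximation: scaled sum, fed into the next scale.
        approx = [(approx[2 * k] + approx[2 * k + 1]) / root2
                  for k in range(pairs)]
    return details
```

Because the Haar wavelet integrates to zero, a constant series yields zero detail coefficients at every scale, which is the l = 0 case of the vanishing moment condition.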
Wavelet Spectrum: Wavelet Spectrum (Abry & Veitch (1998))
- Properties for LRD process
  - $|E\, d_{j,k}\, d_{j,k'}| \sim C(j)\, |k - k'|^{2H - 2 - 2N}$: weakly correlated
  - $E\, d_{j,k}^2 \sim C\, 2^{j(2H - 1)}$
- Wavelet spectrum: plot $(j, Y_j)$, where $Y_j = \log_2\left( \frac{1}{n_j} \sum_{k=1}^{n_j} |d_{j,k}|^2 \right) - g(j)$
- Estimation of H: WLS for $j \in [j_1, j_2]$
- Robust to trend
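The spectrum and the estimate of H can be sketched as follows. Since $E\, d_{j,k}^2 \sim C\, 2^{j(2H-1)}$, the slope of $Y_j$ against j is about 2H - 1, so H is about (slope + 1) / 2. This sketch makes several assumptions that are not in the slides: Haar wavelets, the correction g(j) set to zero, and regression weights equal to n_j. All function names are hypothetical.

```python
import math

def haar_details(y, levels):
    """Haar detail coefficients per scale j (assumed wavelet, N = 1)."""
    approx, details, root2 = list(y), {}, math.sqrt(2.0)
    for j in range(1, levels + 1):
        pairs = len(approx) // 2
        if pairs == 0:
            break
        details[j] = [(approx[2 * k] - approx[2 * k + 1]) / root2
                      for k in range(pairs)]
        approx = [(approx[2 * k] + approx[2 * k + 1]) / root2
                  for k in range(pairs)]
    return details

def wavelet_spectrum(y, levels):
    """Return {j: (Y_j, n_j)} with Y_j = log2(mean_k d_{j,k}^2).

    Assumption: the correction term g(j) is omitted (treated as 0).
    """
    spectrum = {}
    for j, dj in haar_details(y, levels).items():
        energy = sum(d * d for d in dj) / len(dj)
        spectrum[j] = (math.log2(energy), len(dj))
    return spectrum

def estimate_hurst(spectrum, j1, j2):
    """Weighted least squares slope of Y_j on j over [j1, j2]; H = (slope + 1) / 2.

    Assumption: weights n_j, since Y_j is less variable where more
    coefficients are averaged.
    """
    js = list(range(j1, j2 + 1))
    ys = [spectrum[j][0] for j in js]
    ws = [spectrum[j][1] for j in js]
    total = sum(ws)
    j_bar = sum(w * j for w, j in zip(ws, js)) / total
    y_bar = sum(w * v for w, v in zip(ws, ys)) / total
    num = sum(w * (j - j_bar) * (v - y_bar) for w, j, v in zip(ws, js, ys))
    den = sum(w * (j - j_bar) ** 2 for w, j in zip(ws, js))
    return (num / den + 1.0) / 2.0
```

For independent Gaussian noise the spectrum is roughly flat, so the estimate should land near H = 0.5; a spectrum that is not close to linear over $[j_1, j_2]$ is a sign that a single H does not summarize the trace.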
Wavelet Spectrum: [Figure] Wavelet Spectrum (FGN, H = 0.9); annotation: linear
Motivation: Hurst parameter estimation
- Hernandez-Campos et al. (2004)
  - http://www-dirt.cs.unc.edu/net_lrd/
- H: Characterizing Internet Traffic behavior
- Three different methods: AV, Local Whittle, Wavelet
  - Similar, but sometimes different
  - Need careful analysis of the data
- Wavelet Spectrum can say a lot more
Motivation: [Figure] Wavelet Spectrum, 2002 Apr 13 Sat 19:30 – 21:30; annotation: piecewise linear
Example 1: [Figure] Wavelet Spectrum, 2002 Apr 13 Sat 1 pm – 3 pm
- Q: What causes the bump? (2-3 sec. periodic behavior)
Example 1: SiZer
- SIgnificance of ZERo crossings of the derivative of the smooths in scale space: Chaudhuri and Marron (1999)
- Exploratory smoothing method
  - Are bumps really there?
  - Consider all smoothing levels
  - Study (simultaneous) C.I.s for slope (derivative) of smooth
- Combine statistical inference with visualization
  - Blue: slope significantly upwards
  - Red: slope significantly downwards
  - Purple: insignificant slope
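The colour coding above can be sketched for a single location and a single smoothing level. This is a simplified illustration, not the procedure of Chaudhuri and Marron: it assumes a Gaussian-kernel local linear fit, independent noise with constant variance estimated from successive differences, and a pointwise normal quantile (z = 1.96) in place of the simultaneous confidence limits. The names `sizer_color` and `sizer_map` are hypothetical.

```python
import math

def sizer_color(x, y, x0, h, z=1.96):
    """Classify the slope of a local linear smooth at x0 with bandwidth h."""
    n = len(x)
    # Assumption: difference-based variance estimate for independent noise.
    sigma2 = sum((y[i + 1] - y[i]) ** 2 for i in range(n - 1)) / (2.0 * (n - 1))
    # Gaussian kernel weights centred at x0.
    w = [math.exp(-0.5 * ((xi - x0) / h) ** 2) for xi in x]
    s0 = sum(w)
    s1 = sum(wi * (xi - x0) for wi, xi in zip(w, x))
    s2 = sum(wi * (xi - x0) ** 2 for wi, xi in zip(w, x))
    denom = s0 * s2 - s1 * s1
    # The local linear slope is a linear combination sum_i l_i * y_i.
    l = [wi * (s0 * (xi - x0) - s1) / denom for wi, xi in zip(w, x)]
    slope = sum(li * yi for li, yi in zip(l, y))
    sd = math.sqrt(sigma2 * sum(li * li for li in l))
    if slope - z * sd > 0:
        return "blue"    # slope significantly upwards
    if slope + z * sd < 0:
        return "red"     # slope significantly downwards
    return "purple"      # slope not significantly different from zero

def sizer_map(x, y, bandwidths, locations):
    """One row of colours per bandwidth: a coarse, text-only SiZer map."""
    return {h: [sizer_color(x, y, x0, h) for x0 in locations]
            for h in bandwidths}
```

Scanning a grid of bandwidths is what "consider all smoothing levels" amounts to: a feature that is blue then red across a range of bandwidths is a bump that persists across scales rather than an artifact of one choice of smoothing.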
Example 1: Dependent SiZer
- Park, Marron, and Rondonotti (2004)
- SiZer compares data with white noise
  - Inappropriate to dependent data
- Dependent SiZer compares data with an assumed model
  - Goodness of fit test
Example 1: [Figure] Dependent SiZer, 2002 Apr 13 Sat 1–3 pm
Example 1: [Figure] Dependent SiZer, 2002 Apr 13 Sat 1–3 pm (zoom)
Example 1: Summary, 2002 Apr 13 Sat 1 pm – 3 pm
- H = 1.56, Bump in the wavelet spectrum at j = 8
- Big spike in the middle for 6 minutes
- Different behavior from FGN with H = 0.9
- Zooming dependent SiZer: 3 seconds periodic behavior
  - $2^8 = 256 \times (10\,\text{ms}) \approx 3$ (sec)
- Possible explanation: IP port scan
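The conversion from a wavelet scale j to a time span is simple arithmetic on the 10 ms bin width from the data description. A minimal sketch, with a hypothetical helper name:

```python
def scale_to_seconds(j, bin_ms=10):
    """Time span covered by one coefficient at scale j: 2**j bins of bin_ms each."""
    return (2 ** j) * bin_ms / 1000.0
```

For j = 8 this gives 256 bins of 10 ms, that is 2.56 seconds, which the slide rounds to about 3 seconds.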
Example 2: [Figure] Wavelet Spectrum, 2002 Apr 11 Thu 1 – 3 pm
- Q: What causes the bump?
Example 2: [Figure] Dependent SiZer, 2002 Apr 11 Thu 1 – 3 pm
Example 2: Wavelet SiZer
- Park, Godtliebsen, Taqqu, Stoev, Marron (2004)
- Plug squared wavelet coefficients ($d_{j,k}^2$) into SiZer for each scale j instead of original time series
- Use original SiZer: the $d_{j,k}$'s are weakly correlated
- Find local hidden nonstationarities
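The input to Wavelet SiZer can be sketched as one series of squared coefficients per scale, each value tagged with an approximate time location. This sketch assumes Haar wavelets and the 10 ms bin width of the traces; the function name and the choice of the window midpoint as the time stamp are hypothetical. Each per-scale series would then be passed to SiZer in place of the original time series.

```python
import math

def squared_haar_coefficients(y, levels, bin_seconds=0.01):
    """Return {j: [(time_in_seconds, d_{j,k}**2), ...]} for j = 1..levels.

    Assumptions: Haar wavelet; the time of d_{j,k} is taken as the midpoint
    of the 2**j bins it covers.
    """
    approx = list(y)
    root2 = math.sqrt(2.0)
    out = {}
    for j in range(1, levels + 1):
        pairs = len(approx) // 2
        if pairs == 0:
            break
        d = [(approx[2 * k] - approx[2 * k + 1]) / root2 for k in range(pairs)]
        approx = [(approx[2 * k] + approx[2 * k + 1]) / root2
                  for k in range(pairs)]
        width = (2 ** j) * bin_seconds  # seconds covered by one coefficient
        out[j] = [((k + 0.5) * width, dk * dk) for k, dk in enumerate(d)]
    return out
```

A burst confined to part of the trace shows up as a run of large $d_{j,k}^2$ values at the matching locations k, which is the kind of local change in level that SiZer can then flag at scale j.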
Example 2: [Figure] Wavelet SiZer, FGN (H = 0.9)
Example 2: [Figure] Wavelet SiZer, 2002 Apr 11 Thu 1 – 3 pm; marked features: 1, 2, 3&4
Example 2: [Figure] Further analysis, 2002 Apr 11 Thu 1 – 3 pm; marked features: 1, 2, 3, 4; annotation: H = 0.9
Example 2: Summary, 2002 Apr 11 Thu 1 – 3 pm
- H = 0.73, Bump at j = 11
- Wavelet SiZer: find local hidden nonstationarities
  - 8 seconds dropout and several bursts
- H is not enough: need to explore traffic for modeling
- Wavelet + SiNos
  - Park, Godtliebsen, Taqqu, Stoev, Marron (2004)
Example 3: [Figure] Wavelet Spectrum, 2003 Sat 9:30 – 11:30 pm; annotation: shoulder
- Q: What causes the shoulder? (1-2 sec. periodic behavior)
Example 3: Wavelet Spectrum, protocol-dependent analysis
- Park, Hernandez-Campos, Marron, Rolls, Smith (2004)
- 2003 packet counts data have
  - Lower H estimates
  - Equal amount of features in dependent SiZer
  - Shoulder in the wavelet spectrum: 1 second scaling behavior
- Protocol-dependent analysis
  - Transmission Control Protocol (TCP): HTTP, FTP, Mail
  - User Datagram Protocol (UDP): Streaming
Example 3: [Figure] Wavelet Spectrum, 2003 Sat 9:30 pm, Blubster
Example 3: [Figure] SiZer, 2003 Sat 9:30 – 11:30 pm, Blubster (10sec)
Example 3: [Figure] Wavelet Spectrum, packet vs. byte
Example 3: Summary, protocol-dependent analysis
                    2002               2003
  Packet  H         0.9                0.7-0.8
          WS        Piecewise linear   Shoulder
          Protocol  TCP                UDP (Blubster)
  Byte    H         0.9                0.9
          WS        Piecewise linear   Piecewise linear
          Protocol  TCP                TCP
  (WS: wavelet spectrum)
Conclusion
- Wavelet spectrum and SiZer are useful for
  - Finding nonstationarities
  - Modeling and validating
- Strengths and limitations of wavelet spectrum
  - Stoev, Taqqu, Park, and Marron (2004)
  - Real data analysis + Simulation study