Wavelet & SiZer Analyses of Internet Traffic Data (Cheolwoo Park)


SLIDE 1

Statistical and Applied Mathematical Sciences Institute

Wavelet & SiZer Analyses of Internet Traffic Data

Cheolwoo Park

(Joint work with various researchers)

SAMSI, May 27, 2004

SLIDE 2

Wavelet & SiZer Analyses of Internet Traffic Data

  • Data description & Goal
  • Wavelet Spectrum
  • Motivation
  • Example 1
  • Example 2
  • Example 3
SLIDE 3

Data & Goal

Data Description

  • UNC main link in 2002 & 2003
  • Packet / byte counts at 10ms intervals
  • Different days and times: 28 traces
  • 2 hour time blocks
  • http://www-dirt.cs.unc.edu/ts/
SLIDE 4

Data & Goal

What are we interested in?

  • Explore traffic behavior
  • Long Range Dependence (LRD) property
  • Nonstationary behavior
  • Invent and develop statistical tools for data analysis
  • Goodness of fit test of network models
  • How do we know this works like real traffic?
SLIDE 5

Wavelet & SiZer Analyses of Internet Traffic Data

  • Data description & Goal
  • Wavelet Spectrum
  • Motivation
  • Example 1
  • Example 2
  • Example 3
SLIDE 6

Wavelet Spectrum

Wavelets

$$\int x^{l}\,\psi_N(x)\,dx = 0, \qquad l = 0, 1, \ldots, N-1$$

$$d_{j,k} = \int Y(t)\,2^{-j/2}\,\psi_N(2^{-j}t - k)\,dt, \qquad k \in \mathbb{Z}$$

$d_{j,k}$: wavelet coefficients

  • j (scale), k (location)
  • fast computation
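As a rough illustration of the "fast computation" point, the sketch below computes detail coefficients with a pyramid (filter-bank) pass in Python. It assumes the Haar wavelet, which has a single vanishing moment (N = 1); the slides do not say which wavelet was used, and the function name `haar_details` is hypothetical.

```python
import numpy as np

def haar_details(y):
    """Return {j: detail coefficients d_{j,k}} from a Haar pyramid transform.

    Scale j = 1 is the finest; each level halves the number of coefficients,
    so the total work is linear in the length of the series.
    """
    approx = np.asarray(y, dtype=float)
    details = {}
    j = 1
    while approx.size >= 2:
        n = approx.size - (approx.size % 2)          # drop a trailing odd sample
        even, odd = approx[0:n:2], approx[1:n:2]
        details[j] = (even - odd) / np.sqrt(2.0)     # wavelet (detail) coefficients
        approx = (even + odd) / np.sqrt(2.0)         # approximation for the next scale
        j += 1
    return details

# Toy series of 8 counts (made-up numbers, for illustration only)
d = haar_details([4.0, 2.0, 5.0, 5.0, 1.0, 3.0, 0.0, 2.0])
print({j: len(c) for j, c in d.items()})   # {1: 4, 2: 2, 3: 1}
```

Each scale j yields roughly half as many coefficients as the scale below it, which is where the index k (location) thins out as j (scale) grows.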
SLIDE 7

Wavelet Spectrum

Wavelet Spectrum (Abry & Veitch (1998))

  • Properties for LRD process
      $E\,d_{j,k}^{2} \sim C\,2^{j(2H-1)}$
      $|E\,d_{j,k}\,d_{j,k'}| \sim C(j)\,|k-k'|^{2H-2-2N}$: weakly correlated
  • Wavelet spectrum: plot $(j, Y_j)$, where
      $Y_j = \log_2\left(\frac{1}{n_j}\sum_{k=1}^{n_j} |d_{j,k}|^{2}\right) - g(j)$
  • Estimation of H: WLS for $j \in [j_1, j_2]$
  • Robust to trend
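The estimation recipe on this slide can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the Abry & Veitch implementation: it uses the Haar wavelet, omits the bias correction g(j), and weights each scale by its number of coefficients n_j. The function names `wavelet_spectrum` and `estimate_hurst` are hypothetical. Since $E\,d_{j,k}^2 \sim C\,2^{j(2H-1)}$, the slope of $Y_j$ against j is about 2H − 1, so H is recovered as (slope + 1) / 2.

```python
import numpy as np

def wavelet_spectrum(y):
    """Return scales j, Y_j = log2(mean_k d_{j,k}^2), and counts n_j (Haar pyramid)."""
    approx = np.asarray(y, dtype=float)
    scales, spectrum, counts = [], [], []
    j = 1
    while approx.size >= 2:
        n = approx.size - (approx.size % 2)
        even, odd = approx[0:n:2], approx[1:n:2]
        detail = (even - odd) / np.sqrt(2.0)
        approx = (even + odd) / np.sqrt(2.0)
        scales.append(j)
        spectrum.append(np.log2(np.mean(detail ** 2)))   # g(j) correction omitted
        counts.append(detail.size)
        j += 1
    return np.array(scales), np.array(spectrum), np.array(counts)

def estimate_hurst(scales, spectrum, counts, j1, j2):
    """Weighted least squares fit of Y_j on j over [j1, j2]; H = (slope + 1) / 2."""
    keep = (scales >= j1) & (scales <= j2)
    # np.polyfit applies w to the unsquared residuals, so sqrt(n_j) gives
    # squared-error weights proportional to n_j (an assumed weighting).
    slope, _ = np.polyfit(scales[keep], spectrum[keep], 1, w=np.sqrt(counts[keep]))
    return (slope + 1.0) / 2.0

# Sanity check on simulated white noise, for which H = 0.5
rng = np.random.default_rng(0)
white = rng.standard_normal(2 ** 14)
j, Y, n_j = wavelet_spectrum(white)
H_hat = estimate_hurst(j, Y, n_j, j1=1, j2=8)
print(round(H_hat, 2))
```

For white noise the spectrum is roughly flat, so the estimate should land near 0.5; an LRD series would instead show an upward linear trend over the fitted scales.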

SLIDE 8

Wavelet Spectrum

Wavelet Spectrum (FGN, H = 0.9): linear

SLIDE 9

Wavelet & SiZer Analyses of Internet Traffic Data

  • Data description & Goal
  • Wavelet Spectrum
  • Motivation
  • Example 1
  • Example 2
  • Example 3
SLIDE 10

Motivation

Hurst parameter estimation

  • Hernandez-Campos et al. (2004)
  • http://www-dirt.cs.unc.edu/net_lrd/
  • H: Characterizing Internet Traffic behavior
  • Three different methods: Aggregated Variance (AV), Local Whittle, Wavelet
  • Estimates are similar, but sometimes differ
  • Need careful analysis of the data
  • Wavelet Spectrum can say a lot more
SLIDE 11

Motivation

Wavelet Spectrum: 2002 Apr 13 Sat 19:30 – 21:30 (piecewise linear)

SLIDE 12

Wavelet & SiZer Analyses of Internet Traffic Data

  • Data description & Goal
  • Wavelet Spectrum
  • Motivation
  • Example 1
  • Example 2
  • Example 3
SLIDE 13

Example 1

Wavelet Spectrum: 2002 Apr 13 Sat 1 pm – 3 pm

Q: What causes the bump? (2-3 sec. periodic behavior)

SLIDE 14

Example 1

SiZer

  • SIgnificance of ZERo crossings of the derivative of the smooths in scale space: Chaudhuri and Marron (1999)

  • Exploratory smoothing method
  • Are bumps really there?
  • Consider all smoothing levels
  • Study (simultaneous) confidence intervals for the slope (derivative) of the smooth
  • Combine statistical inference with visualization
  • Blue: slope significantly upwards
  • Red: slope significantly downwards
  • Purple: insignificant slope
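The idea in the bullets above can be sketched in Python as a grid of local linear fits, one per (smoothing level, location) pair, each flagged by whether the confidence interval for the slope excludes zero. This is a simplified sketch and not the Chaudhuri and Marron procedure: it assumes independent errors and uses a pointwise normal quantile (`q = 1.96`) instead of SiZer's simultaneous quantile. The function name `sizer_map` and the flag coding are hypothetical: +1 corresponds to blue, -1 to red, and 0 to purple.

```python
import numpy as np

def sizer_map(x, y, grid, bandwidths, q=1.96):
    """Classify the local linear slope at each (bandwidth, location).

    Returns +1 (significantly increasing), -1 (significantly decreasing),
    or 0 (not significant). q is a pointwise normal quantile (assumption).
    """
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    out = np.zeros((len(bandwidths), len(grid)), dtype=int)
    for a, h in enumerate(bandwidths):
        for b, x0 in enumerate(grid):
            u = x - x0
            w = np.exp(-0.5 * (u / h) ** 2)              # Gaussian kernel weights
            X = np.column_stack([np.ones_like(u), u])    # local linear design
            XtW = X.T * w
            A = XtW @ X
            beta = np.linalg.solve(A, XtW @ y)           # beta[1] is the slope
            resid = y - X @ beta
            sigma2 = np.sum(w * resid ** 2) / np.sum(w)  # local noise variance
            Ainv = np.linalg.inv(A)
            cov = sigma2 * Ainv @ (XtW * w) @ X @ Ainv   # weighted LS covariance
            se = np.sqrt(cov[1, 1])
            if beta[1] - q * se > 0:
                out[a, b] = 1
            elif beta[1] + q * se < 0:
                out[a, b] = -1
    return out

# Simulated data with a clear upward trend plus noise
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 200)
y = 3.0 * x + 0.1 * rng.standard_normal(200)
flags = sizer_map(x, y, grid=np.linspace(0.1, 0.9, 9), bandwidths=[0.1, 0.3])
print(flags)
```

Each row of `flags` is one smoothing level and each column one location, which mirrors the layout of a SiZer map.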
SLIDE 15

Example 1

Dependent SiZer

  • Park, Marron, and Rondonotti (2004)
  • SiZer compares data with white noise
  • Inappropriate for dependent data
  • Dependent SiZer compares data with an assumed model
  • Goodness of fit test
SLIDE 16

Example 1

Dependent SiZer: 2002 Apr 13 Sat 1–3 pm

SLIDE 17

Example 1

Dependent SiZer: 2002 Apr 13 Sat 1–3 pm (zoom)

SLIDE 18

Example 1

Summary: 2002 Apr 13 Sat 1 pm – 3 pm

  • H = 1.56; bump in the wavelet spectrum at j = 8
  • Big spike in the middle lasting 6 minutes
  • Different behavior from FGN with H = 0.9
  • Zooming dependent SiZer: 3-second periodic behavior
  • Possible explanation: IP port scan

$2^{8} \times (10\ \text{ms}) = 256 \times (10\ \text{ms}) \approx 3\ \text{sec}$
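The conversion from a wavelet scale j to a time span follows from the 10 ms bin width given in the data description. A small Python helper makes the arithmetic explicit; the name `scale_to_seconds` is hypothetical.

```python
BIN_MS = 10  # bin width from the data description (counts at 10 ms intervals)

def scale_to_seconds(j, bin_ms=BIN_MS):
    """Time span covered by one wavelet coefficient at scale j, in seconds."""
    return (2 ** j) * bin_ms / 1000.0

print(scale_to_seconds(8))    # 2.56
print(scale_to_seconds(11))   # 20.48
```

Scale j = 8 therefore corresponds to about 2.56 seconds, which the slide rounds to 3 seconds.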

SLIDE 19

Wavelet & SiZer Analyses of Internet Traffic Data

  • Data description & Goal
  • Wavelet Spectrum
  • Motivation
  • Example 1
  • Example 2
  • Example 3
SLIDE 20

Example 2

Wavelet Spectrum: 2002 Apr 11 Thu 1 – 3 pm

Q: What causes the bump?

SLIDE 21

Example 2

Dependent SiZer: 2002 Apr 11 Thu 1 – 3 pm

SLIDE 22

Example 2

Wavelet SiZer

  • Park, Godtliebsen, Taqqu, Stoev, Marron (2004)
  • Plug squared wavelet coefficients ($d_{j,k}^{2}$) into SiZer for each scale j instead of the original time series
  • Use original SiZer: the $d_{j,k}$'s are weakly correlated
  • Find local hidden nonstationarities
SLIDE 23

Example 2

Wavelet SiZer: FGN (H= 0.9)

SLIDE 24

Example 2

Wavelet SiZer: 2002 Apr 11 Thu 1 – 3 pm (marked features: 1, 2, 3 & 4)

SLIDE 25

Example 2

Further analysis: 2002 Apr 11 Thu 1 – 3 pm (figure labels: 1, 2, 3, 4, H = 0.9)

SLIDE 26

Example 2

Summary: 2002 Apr 11 Thu 1 – 3 pm

  • H = 0.73, Bump at j = 11
  • Wavelet SiZer: find local hidden nonstationarities
  • 8-second dropout and several bursts
  • H is not enough: need to explore traffic for modeling
  • Wavelet + SiNos
  • Park, Godtliebsen, Taqqu, Stoev, Marron (2004)
SLIDE 27

Wavelet & SiZer Analyses of Internet Traffic Data

  • Data description & Goal
  • Wavelet Spectrum
  • Motivation
  • Example 1
  • Example 2
  • Example 3
SLIDE 28

Example 3

Wavelet Spectrum: 2003 Sat 9:30 – 11:30 pm (shoulder)

Q: What causes the shoulder? (1-2 sec. periodic behavior)

SLIDE 29

Example 3

Wavelet Spectrum: protocol-dependent analysis

  • Park, Hernandez-Campos, Marron, Rolls, Smith (2004)
  • 2003 packet count data have
      • Lower H estimates
      • Equal number of features in dependent SiZer
      • Shoulder in the wavelet spectrum: 1-second scaling behavior
  • Protocol-dependent analysis
      • Transmission Control Protocol (TCP): HTTP, FTP, Mail
      • User Datagram Protocol (UDP): Streaming
SLIDE 30

Example 3

Wavelet Spectrum: 2003 Sat 9:30 pm, Blubster

SLIDE 31

Example 3

SiZer: 2003 Sat 9:30 – 11:30 pm, Blubster (10sec)

SLIDE 32

Example 3

Wavelet Spectrum: packet vs. byte

SLIDE 33

Example 3

Summary: protocol-dependent analysis

                     2002               2003
Packet   H           0.9                0.7-0.8
         WS          Piecewise linear   Shoulder
         Protocol    TCP                UDP (Blubster)
Byte     H           0.9                0.9
         WS          Piecewise linear   Piecewise linear
         Protocol    TCP                TCP

SLIDE 34

Conclusion

Wavelet spectrum and SiZer are useful for

  • Finding nonstationarities
  • Modeling and validating

Strengths and limitations of wavelet spectrum

  • Stoev, Taqqu, Park, and Marron (2004)
  • Real data analysis + Simulation study