

Time Series Models and its Relevance to Modeling TCP SYN based DoS attacks
Cyriac James, Hema A. Murthy
Department of Computer Science & Engineering, Indian Institute of Technology Madras, India
June 23, 2011


  1. Background: TCP SYN Attack
     - A common DoS attack: the TCP SYN attack
     - Limited backlog queue (sysctl -a | grep ipv4.tcp_max_syn_backlog)
     - Figure: SYN Attack
     - Time out: 3, 6, 12, 24 and 48 seconds [1]

     [1] V. Paxson and M. Allman, "RFC 2988 - Computing TCP's Retransmission Timer," http://www.ietf.org/rfc/rfc2988.txt, Nov. 2000
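
The timeout values listed above double at each step. The sketch below reproduces that schedule and sums it, assuming (this is not stated on the slide) that the values are successive timeouts and that a half-open entry stays in the backlog queue through all of them. The function name and its parameters are illustrative.

```python
def syn_backoff_schedule(initial_timeout=3.0, retries=5):
    """Return successive timeout values, doubling after each expiry."""
    return [initial_timeout * (2 ** i) for i in range(retries)]


schedule = syn_backoff_schedule()
# Assumption: the entry is held through every timeout in the schedule.
total_hold_time = sum(schedule)

print(schedule)         # [3.0, 6.0, 12.0, 24.0, 48.0]
print(total_hold_time)  # 93.0
```

Under that assumption a single unanswered SYN can hold a backlog slot for about a minute and a half, which is why a limited queue fills quickly.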

  2. Outline
     - Related Work
     - Motivation for the Work
     - Network Trace
     - Representing Network Traffic as a Discrete Time Signal
     - Time Series Models
     - Analysis and Results
     - Conclusion

     Related Work
     - Popular statistical work based on the CUSUM algorithm by H. Wang et al., designed for edge routers:
       - SYN - FIN [1]
       - SYN - SYN/ACK [2]
     - Drawbacks:
       - The series is assumed to be i.i.d.
       - Traffic burstiness and non-stationarity are ignored, although both are reported for Local-Area Networks [3] and Wide-Area Networks [4]

     [1] H. Wang, D. Zhang, and K. G. Shin, "Detecting SYN flooding attacks," Proceedings of the IEEE INFOCOM, 2002
     [2] H. Wang, D. Zhang, and K. Shin, "SYN-dog: Sniffing SYN flooding sources," ICDCS, 2002
     [3] W. E. Leland, M. S. Taqqu, W. Willinger, and D. V. Wilson, "On the self-similar nature of Ethernet traffic," in IEEE/ACM Transactions on Networking, 1994
     [4] V. Paxson and S. Floyd, "Wide-Area Traffic: The Failure of Poisson Modeling," in IEEE/ACM Transactions on Networking, 1995
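
For readers unfamiliar with CUSUM, the following is a minimal sketch of a generic one-sided CUSUM change detector. It is a textbook form, not the specific formulation used by Wang et al.; the `drift` and `threshold` values and the sample data are made up for illustration.

```python
def cusum_alarm_index(samples, drift, threshold):
    """Return the index of the first sample at which the one-sided
    CUSUM statistic exceeds the threshold, or None if it never does."""
    s = 0.0
    for i, x in enumerate(samples):
        # Accumulate only the excess above the allowed drift; clamp at zero.
        s = max(0.0, s + x - drift)
        if s > threshold:
            return i
    return None


normal = [1, 0, 2, 1, 0, 1]   # hypothetical per-interval differences
attack = [6, 7, 8]            # hypothetical surge

print(cusum_alarm_index(normal + attack, drift=2.0, threshold=8.0))  # 7
print(cusum_alarm_index(normal, drift=2.0, threshold=8.0))           # None
```

The statistic treats each sample as an independent draw around a fixed mean, which is the i.i.d. assumption listed as a drawback above.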

  3. Related Work
     - Several later works are based on Box-Jenkins time series models, with the solution deployed at the victim server:
       - Modeling the outstanding TCP requests [1]
       - Modeling the service rate [2]
       - Modeling flow-level features [3]
       - Modeling the web traffic [4]

     [1] D. M. Divakaran, H. A. Murthy, and T. A. Gonsalves, "Detection of SYN flooding attacks using linear prediction analysis," ICON, 2006
     [2] G. Zhang, S. Jiang, G. Wei, and Q. Guan, "A prediction-based detection algorithm against distributed denial-of-service attacks," in Proceedings of IWCMC, 2009
     [3] J. Cheng, J. Yin, C. Wu, B. Zhang, and Y. Liu, "DDoS attack detection method based on linear prediction model," in ICIC, 2009
     [4] W. U. Qing-tao and S. Zhi-qing, "Detecting DDoS attacks against web server using time series analysis," Wuhan University Journal of Natural Sciences, vol. 11, no. 1, pp. 175-180, 2006

     Motivation for the Work
     - Some of the major drawbacks of these works are:
       - They assume time invariance and stability of the process
       - They use window sizes of the order of seconds or less
     - Can we have a model that is valid for a longer period?
     - They lack a description of:
       - Model identification
       - Model validation
     - How relevant are linear time series models?

  4. Network Trace
     - Figure: Tenet Network Architecture
     - Traces collected at the edge router using tcpdump
     - Feature: SYN - SYN/ACK, also called the half-open count
     - Sampling interval: 10 seconds
     - Data Set-1: 26th July 2010 to 30th July 2010
     - Data Set-2: 23rd August 2010 to 27th August 2010
     - Data Set-3: 20th September 2010 to 24th September 2010

     Representing Network Traffic as a Discrete Time Signal
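
As an illustration of the feature described above, the sketch below bins packet events into 10-second intervals and computes SYN minus SYN/ACK per interval. The input format (a list of `(timestamp, kind)` pairs already parsed from a capture) and the function name are assumptions; the slides do not describe how the tcpdump output was processed.

```python
from collections import defaultdict


def half_open_series(events, interval=10.0):
    """Return the per-interval count (#SYN - #SYN/ACK).

    events: list of (timestamp_seconds, kind), kind in {"SYN", "SYN/ACK"}.
    This input layout is hypothetical, chosen only for the example.
    """
    if not events:
        return []
    start = min(ts for ts, _ in events)
    end = max(ts for ts, _ in events)
    counts = defaultdict(int)
    for ts, kind in events:
        bin_index = int((ts - start) // interval)
        if kind == "SYN":
            counts[bin_index] += 1
        elif kind == "SYN/ACK":
            counts[bin_index] -= 1
    n_bins = int((end - start) // interval) + 1
    return [counts[i] for i in range(n_bins)]


events = [
    (0.0, "SYN"), (1.5, "SYN"), (2.0, "SYN/ACK"),
    (12.0, "SYN"), (13.0, "SYN"), (14.0, "SYN"),
    (25.0, "SYN"), (26.0, "SYN/ACK"),
]
print(half_open_series(events))  # [1, 3, 0]
```

Each element of the returned list is one sample of the discrete time signal analysed in the rest of the talk.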

  5. Representing Network Traffic as a Discrete Time Signal: Discrete Time Signal
     - Figure: Network Signal (discrete signal, amplitude versus time)
     - No access to input signals
     - Consider the series as a sequence of impulse responses $\mu_t$, for time $t \ge 0$

     Representing Network Traffic as a Discrete Time Signal: Stability of the System
     - For a linear system, Total Response = Zero-State Response + Zero-Input Response [1]
     - External stability: Zero-State Response
     - Internal stability: Zero-Input Response
     - Internal stability $\Rightarrow$ external stability [1]
     - For internal stability, the impulse response must die off:
       $$ \sum_{j=0}^{\infty} |\mu_j| < \infty \tag{1} $$
       where $\mu_j$ is the impulse response at the $j$-th time lag

     [1] B. P. Lathi, "Principles of Linear Systems and Signals", Oxford University Press, 2009
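
Condition (1) can be illustrated numerically. The sketch below uses a geometric impulse response $\mu_j = r^j$, which is an assumption chosen only because its sum is known in closed form; the helper names are illustrative.

```python
def partial_abs_sum(impulse_response):
    """Partial sum of |mu_j| over the supplied terms."""
    return sum(abs(mu) for mu in impulse_response)


def geometric_response(r, n_terms):
    """Hypothetical impulse response mu_j = r**j for j = 0..n_terms-1."""
    return [r ** j for j in range(n_terms)]


# |r| < 1: the partial sums settle near 1 / (1 - |r|) = 2.
decaying = partial_abs_sum(geometric_response(0.5, 50))
# |r| > 1: the partial sums keep growing with the number of terms.
growing = partial_abs_sum(geometric_response(1.1, 50))

print(decaying)
print(growing)
```

A finite partial sum cannot prove convergence, so this is only a diagnostic: a sum that keeps increasing as more lags are added suggests that condition (1) fails.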

  6. Time Series Models
     Box-Jenkins Time Series Models

     Time Series Models
     - Idea from the observations of Yule [1]:
       $$ x_t = a_t + \alpha_1 a_{t-1} + \alpha_2 a_{t-2} + \dots \tag{2} $$
       - $x_t$: output signal at time $t$
       - $a_t, a_{t-1}, \dots$: random shocks or white noise process
       - $\alpha_1, \alpha_2, \dots$: model coefficients
     - Also called the Linear Filter Model
     - Stationarity: first and second order moments are finite and independent of time [2]
     - LTI and stability $\Leftrightarrow$ stationarity
     - Can be used to build models for prediction

     [1] G. U. Yule, "On a method of investigating periodicities in distributed series, with special reference to Wolfer's sunspot numbers", Philos. Trans. Roy. Soc. A226, 267-298, 1927
     [2] G. E. P. Box, G. M. Jenkins, and G. C. Reinsel, "Time Series Analysis: Forecasting and Control", Pearson Education, 1994
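
Equation (2) is a convolution of the shock sequence with the filter weights. A minimal sketch, assuming a finite (truncated) weight list and made-up coefficient values, shows that feeding a single unit shock returns the weights themselves, that is, the impulse response:

```python
def linear_filter(shocks, weights):
    """Compute x_t = a_t + w_1 a_{t-1} + w_2 a_{t-2} + ... for each t.

    shocks:  [a_0, a_1, ...]
    weights: [w_1, w_2, ...] (truncated to a finite list for this sketch)
    """
    full = [1.0] + list(weights)  # weight on a_t itself is 1
    output = []
    for t in range(len(shocks)):
        last = min(t, len(full) - 1)
        output.append(sum(full[j] * shocks[t - j] for j in range(last + 1)))
    return output


unit_shock = [1.0, 0.0, 0.0, 0.0, 0.0]
print(linear_filter(unit_shock, [0.5, 0.25]))  # [1.0, 0.5, 0.25, 0.0, 0.0]
```

Replacing `unit_shock` with white noise samples would produce one realisation of the process.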

  7. Time Series Models: Auto-Regressive (AR) Model
     - An AR model can be written as
       $$ x_t = \alpha_1 x_{t-1} + \alpha_2 x_{t-2} + \dots + \alpha_p x_{t-p} + a_t \tag{3} $$
       - $x_t, x_{t-1}, \dots$: output values
       - $\alpha_1, \alpha_2, \dots, \alpha_p$: model coefficients, where $p$ is the model order
       - $a_t$: random shock at time $t$
     - It can be written as an infinite series of random shocks. Consider an AR(2) model:
       $$ x_t = \alpha_1 x_{t-1} + \alpha_2 x_{t-2} + a_t \tag{4} $$
       $$ x_{t-1} = \alpha_1 x_{t-2} + \alpha_2 x_{t-3} + a_{t-1} \tag{5} $$
       $$ x_{t-2} = \alpha_1 x_{t-3} + \alpha_2 x_{t-4} + a_{t-2} \tag{6} $$
       $$ \vdots $$
       $$ x_t = a_t + \psi_1 a_{t-1} + \psi_2 a_{t-2} + \dots \tag{7} $$

     Time Series Models: Auto-Regressive (AR) Model, Computing the ACF
     - Multiply (7) by $x_{t-k}$ and take expectations:
       $$ E(x_t x_{t-k}) = E(a_t x_{t-k} + \psi_1 a_{t-1} x_{t-k} + \psi_2 a_{t-2} x_{t-k} + \dots) \tag{8} $$
       $$ \gamma_k = E(x_{t-k} a_t) + \psi_1 E(x_{t-k} a_{t-1}) + \psi_2 E(x_{t-k} a_{t-2}) + \dots \tag{9} $$
       where $\gamma_k$ is the autocovariance at lag $k$.
     - The above equation can be generalised to (with $\psi_0 = 1$):
       $$ \gamma_k = \sigma_a^2 \sum_{j=0}^{\infty} \psi_j \psi_{j+k} \tag{10} $$
       where $\sigma_a^2$ is the variance of $a_t$, which has zero mean.
     - Normalising by $\gamma_0$ gives the autocorrelation function (ACF) in terms of the impulse response:
       $$ \rho_k = \frac{\sigma_a^2 \sum_{j=0}^{\infty} \psi_j \psi_{j+k}}{\gamma_0} \tag{11} $$
     - Hence, an AR process is an infinite impulse response system
     - For stability $\Rightarrow \sum_{j=0}^{\infty} |\psi_j| < \infty$
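
The step from (4)-(6) to (7), and the ACF in (10)-(11), can be checked numerically. The sketch below builds the $\psi$ weights with the recursion $\psi_j = \alpha_1 \psi_{j-1} + \alpha_2 \psi_{j-2}$ (with $\psi_0 = 1$), which follows from substituting (5) and (6) into (4), and then evaluates a truncated version of (11). The coefficient values and the truncation length are assumptions made for the example.

```python
def ar_psi_weights(alphas, n_terms):
    """First n_terms impulse-response weights psi_0..psi_{n-1} of an AR(p) model."""
    p = len(alphas)
    psi = [1.0]  # psi_0 = 1
    for j in range(1, n_terms):
        psi.append(sum(alphas[i] * psi[j - 1 - i] for i in range(min(p, j))))
    return psi


def acf_from_psi(psi, max_lag):
    """Truncated form of equation (11); sigma_a^2 cancels in the ratio."""
    gamma = [
        sum(psi[j] * psi[j + k] for j in range(len(psi) - k))
        for k in range(max_lag + 1)
    ]
    return [g / gamma[0] for g in gamma]


alphas = [0.5, 0.3]                 # illustrative AR(2) coefficients
psi = ar_psi_weights(alphas, 200)   # truncation length is arbitrary
rho = acf_from_psi(psi, 3)

print(psi[:4])
print(rho)
```

For these coefficients the weights decay geometrically, so the truncated sums agree closely with the values given by the recursion $\rho_k = \alpha_1 \rho_{k-1} + \alpha_2 \rho_{k-2}$, for example $\rho_1 = \alpha_1 / (1 - \alpha_2)$.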

  8. Time Series Models: Auto-Regressive (AR) Model
     - Multiply Equation (4) by $x_{t-k}$ and take expectations on both sides (for $k > 0$, the shock term drops out because $a_t$ is uncorrelated with past values):
       $$ \gamma_k = \alpha_1 \gamma_{k-1} + \alpha_2 \gamma_{k-2} \tag{12} $$
     - Dividing by $\gamma_0$, (12) becomes:
       $$ \rho_k = \alpha_1 \rho_{k-1} + \alpha_2 \rho_{k-2} \tag{13} $$
       $$ \rho_k - \alpha_1 \rho_{k-1} - \alpha_2 \rho_{k-2} = 0 \tag{14} $$
     - Characteristic equation:
       $$ \lambda^2 - \alpha_1 \lambda - \alpha_2 = 0 \tag{15} $$
     - The general solution is of the form:
       $$ \rho_k = C_1 \lambda_1^k + C_2 \lambda_2^k \tag{16} $$
       - $\lambda_1, \lambda_2$: roots (if distinct)
       - $C_1$ and $C_2$: arbitrary constants
     - For stability $\Rightarrow |\lambda_1| < 1$ and $|\lambda_2| < 1$
     - Yule-Walker equations: used to estimate the model coefficients

     Time Series Models: Moving Average (MA) Model
     - An MA model can be written as
       $$ x_t = a_t - \psi_1 a_{t-1} - \psi_2 a_{t-2} - \dots - \psi_q a_{t-q} \tag{17} $$
       - $x_t$: output signal at time $t$
       - $a_t, a_{t-1}, \dots$: random shocks or white noise process
       - $\psi_1, \psi_2, \dots, \psi_q$: model coefficients, where $q$ is the model order
     - Finite linear filter model
     - Can be written as an infinite series of past values
     - Consider an MA model of order 1:
       $$ x_t = a_t - \psi_1 a_{t-1} \tag{18} $$
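
For the AR(2) case, both the stability check on the roots of (15) and the Yule-Walker estimate have closed forms. Writing (13) for $k = 1$ and $k = 2$ (using $\rho_0 = 1$ and $\rho_{-1} = \rho_1$) gives two linear equations in $\alpha_1, \alpha_2$, which the sketch below solves directly. The function names and the numeric values are illustrative.

```python
import cmath


def yule_walker_ar2(rho1, rho2):
    """Solve rho1 = a1 + a2*rho1 and rho2 = a1*rho1 + a2 for (a1, a2)."""
    denom = 1.0 - rho1 ** 2
    alpha1 = rho1 * (1.0 - rho2) / denom
    alpha2 = (rho2 - rho1 ** 2) / denom
    return alpha1, alpha2


def ar2_roots(alpha1, alpha2):
    """Roots of the characteristic equation lambda^2 - a1*lambda - a2 = 0."""
    disc = cmath.sqrt(alpha1 ** 2 + 4.0 * alpha2)
    return (alpha1 + disc) / 2.0, (alpha1 - disc) / 2.0


def is_stable(alpha1, alpha2):
    """True when both characteristic roots lie inside the unit circle."""
    return all(abs(lam) < 1.0 for lam in ar2_roots(alpha1, alpha2))


# Autocorrelations implied by alpha = (0.5, 0.3), used as example input.
rho1 = 0.5 / (1.0 - 0.3)
rho2 = 0.5 * rho1 + 0.3
estimated = yule_walker_ar2(rho1, rho2)

print(estimated)
print(is_stable(*estimated))   # True
print(is_stable(0.5, 0.6))     # False
```

In practice `rho1` and `rho2` would be sample autocorrelations estimated from the trace, so the recovered coefficients would only approximate the true ones.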

  9. Time Series Models: Moving Average (MA) Model
     - Multiply equation (18) by $x_{t-k}$ and take expectations on both sides:
       $$ \gamma_k = E(a_t x_{t-k} - \psi_1 a_{t-1} x_{t-k}) $$
       $$ \gamma_0 = E(a_t x_t - \psi_1 a_{t-1} x_t) \tag{19} $$
       $$ \gamma_0 = E\big(a_t (a_t - \psi_1 a_{t-1}) - \psi_1 a_{t-1} (a_t - \psi_1 a_{t-1})\big) \tag{20} $$
       $$ \gamma_0 = \sigma_a^2 + \psi_1^2 \sigma_a^2 \tag{21} $$
       $$ \gamma_1 = -\psi_1 \sigma_a^2 \tag{22} $$
       $$ \gamma_2 = 0 \tag{23} $$
     - $\gamma_k = 0$ for all $k > 1$
     - Hence, an MA process is a finite impulse response system
     - A time-invariant MA process is always stationary

     Time Series Models: Duality Property (for Model Identification)
     - For an AR(p) process, the ACF decays gradually, but the PACF cuts off after lag p
     - For an MA(q) process, the PACF decays gradually, but the ACF cuts off after lag q
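
The duality property can be seen on theoretical autocorrelations. The sketch below computes the PACF from an ACF with the Durbin-Levinson recursion, a standard method that the slides do not name, and applies it to an AR(2) and an MA(1) process. The coefficient values are illustrative, and the MA(1) sign convention follows equation (18).

```python
def pacf_from_acf(rho):
    """Durbin-Levinson recursion. rho[0] must be 1.
    Returns [phi_11, phi_22, ...] for lags 1..len(rho)-1."""
    pacf = []
    prev = []  # prev[j] holds phi_{k-1, j+1}
    for k in range(1, len(rho)):
        num = rho[k] - sum(prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
    return pacf


def ar2_acf(alpha1, alpha2, max_lag):
    """ACF of an AR(2) process from the recursion in equation (13)."""
    rho = [1.0, alpha1 / (1.0 - alpha2)]
    for k in range(2, max_lag + 1):
        rho.append(alpha1 * rho[k - 1] + alpha2 * rho[k - 2])
    return rho


def ma1_acf(psi1, max_lag):
    """ACF of x_t = a_t - psi1 * a_{t-1}, from equations (21)-(23); max_lag >= 1."""
    rho1 = -psi1 / (1.0 + psi1 ** 2)
    return [1.0, rho1] + [0.0] * (max_lag - 1)


ar_acf = ar2_acf(0.5, 0.3, 4)
ar_pacf = pacf_from_acf(ar_acf)   # lags 3 and 4 are zero up to rounding
ma_acf = ma1_acf(0.5, 4)          # zero beyond lag 1
ma_pacf = pacf_from_acf(ma_acf)   # nonzero at every lag, shrinking in size

print(ar_acf, ar_pacf)
print(ma_acf, ma_pacf)
```

With sample estimates instead of theoretical values, the cut-off appears as values falling inside a confidence band rather than as exact zeros.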
