Detection of HTTP-GET Attack with Clustering and Information - PowerPoint PPT Presentation

Detection of HTTP-GET Attack with Clustering and Information Theoretic Measurements. Pawel Chwalinski, Roman Belavkin, Xiaochun Cheng. Middlesex University.


  1. Detection of HTTP-GET Attack with Clustering and Information Theoretic Measurements
     Pawel Chwalinski, Roman Belavkin, Xiaochun Cheng
     Middlesex University, School of Science and Technology, London, United Kingdom
     Foundations & Practice of Security, 2012

  2. Outline
     1. Problem definition
     2. Clustering: Web Dataset and Sessions; Web clusters as joint distributions; Entropy-based clustering; Attacking Strategies; Cluster assignment; Results and comparison against K-means
     3. Detection Measurements: Mutual Information; Mahalanobis Distance; Likelihood of the same request segments
     4. Result analysis
     5. Future work

  4. Problem definition
     A surge of connections can be either legitimate (a flash crowd) or illegitimate (an attack aiming at resource exhaustion).
     An increased number of attacking sessions is not the same as an increased arrival rate.
     No bandwidth flooding.
     The only difference is in intent, rather than in behaviour.

  6. Web Dataset and Sessions
     The dataset $D$ is divided into $D_T$ for training and $D_V$ for validation.
     Example: $s_i = (\text{homepage}, \text{homepage}, \text{homepage}, \text{news}, \text{news}, \text{news}, \text{homepage}, \text{homepage}, \text{weather}, \text{tv})$, encoded as $s_i = (1, 1, 1, 2, 2, 2, 1, 1, 5, 7)$.
     $\Omega = \{1, 2, \dots, n_C\}$, where each $\omega \in \Omega$ is the numerical label of an actual category, and $n_C = 17$.
     Pairs of consecutive requests are analysed, such that $(x, y) \in \Omega \times \Omega$.
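The pair extraction described on this slide can be sketched in a few lines of Python. This is only an illustration: the function names `consecutive_pairs` and `joint_distribution` are hypothetical, and estimating $P(x, y)$ by relative pair frequency is an assumption about how the joint distributions are obtained.

```python
from collections import Counter


def consecutive_pairs(session):
    """Return the (x, y) pairs of consecutive requests in one session."""
    return list(zip(session, session[1:]))


def joint_distribution(sessions):
    """Estimate P(x, y) as the relative frequency of each observed pair."""
    counts = Counter()
    for session in sessions:
        counts.update(consecutive_pairs(session))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: n / total for pair, n in counts.items()}


# The example session from the slide, already encoded as category labels.
session = [1, 1, 1, 2, 2, 2, 1, 1, 5, 7]
pairs = consecutive_pairs(session)
p = joint_distribution([session])
```

A session of ten requests yields nine pairs, and the estimated probabilities sum to one.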

  7. Figure: Joint distribution of the pairs of requests observed in a cluster (i.e. one pattern of “behaviour”). Horizontal axes: category label (1 to 17); vertical axis: P(x,y), ranging from 0 to 0.4.

  8. Figure: Joint distribution of the pairs of requests observed in a cluster (i.e. one pattern of “behaviour”). Horizontal axes: category label (1 to 17); vertical axis: P(x,y), ranging from 0 to 0.25.

  9. For $n$ random variables $\{X_1, \dots, X_n\}$ with a discrete joint distribution $p(x_1, \dots, x_n)$, the entropy is calculated as:
     $$h(x_1, \dots, x_n) = -\sum_{x_1} \cdots \sum_{x_n} p(x_1, \dots, x_n) \log p(x_1, \dots, x_n) \qquad (1)$$
     For a joint distribution $p(x, y)$, the entropy $h(x, y)$ is calculated as:
     $$h(x, y) = -\sum_{x} \sum_{y} p(x, y) \log p(x, y) \qquad (2)$$

  10. Table: Legitimate Coins
             H      T
        H    0.25   0.25
        T    0.25   0.25
      Table: Biased Coins
             H      T
        H    0.02   0.02
        T    0.02   0.94
      Entropy of the legitimate distribution: $h(x, y) = 2$ bits, $x, y \in \{H, T\}$.
      Entropy of the biased distribution: $h(x, y) \approx 0.4225$ bits, $x, y \in \{H, T\}$.
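The coin example can be checked numerically with a short Python sketch of the joint entropy (2). The helper name `joint_entropy` is hypothetical, and base-2 logarithms are assumed because the slide reports entropy in bits. The uniform table gives exactly 2 bits, while the skewed table gives well under 1 bit.

```python
import math


def joint_entropy(table, base=2.0):
    """Entropy of a joint distribution given as nested lists of probabilities.

    Cells with zero probability are skipped (0 * log 0 is taken as 0).
    """
    return -sum(p * math.log(p, base) for row in table for p in row if p > 0)


# The two tables from the slide: rows and columns are indexed by H, T.
legitimate = [[0.25, 0.25], [0.25, 0.25]]
biased = [[0.02, 0.02], [0.02, 0.94]]

h_legitimate = joint_entropy(legitimate)
h_biased = joint_entropy(biased)
```

The lower the entropy, the more concentrated the distribution, which is why entropy is a natural measure of how "pure" a cluster of request pairs is.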

  11. Clustering metric
      Given a set $C = \{C_1, C_2, \dots, C_k\}$ of $k$ clusters, the goal is to minimise the average entropy of the clusters, computed as:
      $$E\{H\} = \sum_{j=1}^{k} \frac{|C_j|}{|D_T|} h_j \qquad (3)$$
      where $h_j$ is the entropy of the joint distribution of cluster $C_j$.
      Having grouped $b$, $1 \le b \le n_T$, sequences from $D_T$ according to (3), every time a sequence $s_i$, $b < i \le n_T$, is to be added to $C$, the cluster $C_j$, $1 \le j \le k$, is picked for which adding $s_i$ decreases (3) the most.
      Problems: the order of the sequences has an unwanted effect (addressed by re-clustering and merging); minimisation of (3) is computationally expensive (addressed by partitioning).
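A minimal Python sketch of the greedy assignment step is given below. It rests on several assumptions: clusters are represented as dictionaries holding a sequence count (`size`) and pair counts (`counts`), the helper names are hypothetical, and "decreases (3) the most" is implemented as "yields the lowest value of (3) after the addition", which selects the same cluster. It is not the authors' implementation.

```python
import math
from collections import Counter


def pair_counts(seq):
    """Count the pairs of consecutive requests in a sequence."""
    return Counter(zip(seq, seq[1:]))


def entropy(counts):
    """Entropy in bits of the joint distribution implied by pair counts."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum(n / total * math.log2(n / total) for n in counts.values())


def expected_entropy(clusters, n_total):
    """Equation (3): cluster entropies weighted by |C_j| / |D_T|."""
    return sum(c["size"] / n_total * entropy(c["counts"]) for c in clusters)


def assign(clusters, seq, n_total):
    """Add seq to the cluster giving the lowest value of (3); return its index."""
    new_pairs = pair_counts(seq)
    best_j, best_value = None, math.inf
    for j, c in enumerate(clusters):
        # Evaluate (3) as if seq had been added to cluster j only.
        trial = list(clusters)
        trial[j] = {"size": c["size"] + 1, "counts": c["counts"] + new_pairs}
        value = expected_entropy(trial, n_total)
        if value < best_value:
            best_j, best_value = j, value
    clusters[best_j]["size"] += 1
    clusters[best_j]["counts"].update(new_pairs)
    return best_j


# Two seed clusters (an assumed initialisation) and one new sequence.
clusters = [
    {"size": 1, "counts": pair_counts([1, 1, 1, 1, 1])},
    {"size": 1, "counts": pair_counts([2, 3, 2, 3, 2])},
]
chosen = assign(clusters, [1, 1, 1], n_total=3)
```

Here the new sequence joins the first cluster, because its pairs are already the only ones observed there and the cluster entropy stays at zero.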

  12. Re-clustering
      Suppose that $s_i$ has been added to $C_j$, $1 \le j \le k$. After another $b$ sequences have been grouped, there may exist another cluster $C_{j'}$, $1 \le j' \le k$, $j' \neq j$, such that placing the previously added $s_i$ inside $C_{j'}$ minimises (3) further.
      Therefore, after processing a batch of $b$ sequences, the algorithm is stopped, and the whole set $C$ is re-clustered.

  13. Merging
      Suppose there are two joint distributions $p_i(x, y)$, $1 \le i \le k$, and $p_j(x, y)$, $1 \le j \le k$. Their dissimilarity is measured with the Kullback-Leibler (KL) divergence:
      $$D_{KL}(p_i \| p_j) = \sum_{x, y \in \Omega} p_i(x, y) \log \frac{p_i(x, y)}{p_j(x, y)} \qquad (4)$$
      Two clusters $C_i$ and $C_j$ are considered similar, and are merged, when $D_{KL}(p_i \| p_j) \le \eta_{KL}$ and $D_{KL}(p_j \| p_i) \le \eta_{KL}$.
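The symmetric merge test can be sketched in Python as follows. The function names, the smoothing constant `eps` (used to avoid division by zero when a cell of the second distribution is empty), the base-2 logarithm and the threshold value 0.1 are all assumptions made for illustration; the slides do not specify them.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """D_KL(p || q) in bits for two joint distributions on the same grid.

    eps is an assumed floor for empty cells of q, not part of the slides.
    """
    return sum(
        pi * math.log2(pi / max(qi, eps))
        for row_p, row_q in zip(p, q)
        for pi, qi in zip(row_p, row_q)
        if pi > 0
    )


def should_merge(p_i, p_j, eta_kl):
    """Merge only when the divergence is small in both directions."""
    return (kl_divergence(p_i, p_j) <= eta_kl
            and kl_divergence(p_j, p_i) <= eta_kl)


# Toy 2x2 distributions; eta_kl = 0.1 is a hypothetical threshold.
uniform = [[0.25, 0.25], [0.25, 0.25]]
skewed = [[0.02, 0.02], [0.02, 0.94]]

merge_same = should_merge(uniform, uniform, eta_kl=0.1)
merge_different = should_merge(uniform, skewed, eta_kl=0.1)
```

Testing both directions matters because the KL divergence is not symmetric: a cluster can look close to another in one direction and far in the other.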

  14. Partitioning
      $D_T$ is divided into $p$ same-length blocks $\{B_1, B_2, \dots, B_p\}$.
      For each block $B_i$, $1 \le i \le p$: minimisation of (3), merging, re-clustering.
      Having combined $\{B_1, B_2, \dots, B_p\}$ into $C$: merging, re-clustering.

  15. Decision-making problem for attackers
      Having requested a link from category $c(\omega_t)$, $\omega_t \in \Omega$, a programmed zombie decides whether to remain in the same category or to move to another one:
      $$c(\omega_{t+1}) = \begin{cases} \text{remain} & \text{with probability } p_R \\ \text{move} & \text{with probability } p_M = 1 - p_R \end{cases}$$
      Uniformly-changing zombies, where $p_R = \frac{1}{|\Omega|}$ and $p_M = 1 - p_R = \frac{|\Omega| - 1}{|\Omega|}$, are too easy to detect.
      Frequently-changing hosts change categories less often than the uniformly-changing zombies, such that $p_M = p_R = 0.5$.
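The two attacking strategies can be simulated with a short Python sketch. The function names are hypothetical, and two details are assumptions not stated on the slide: the first category is drawn uniformly, and a "move" picks uniformly among the other categories.

```python
import random


def simulate_zombie(n_requests, n_categories, p_remain, seed=0):
    """Generate one attacking session as a list of category labels."""
    rng = random.Random(seed)
    current = rng.randrange(1, n_categories + 1)  # assumed uniform start
    session = [current]
    for _ in range(n_requests - 1):
        if rng.random() >= p_remain:
            # Move: choose uniformly among the other categories (assumption).
            others = [c for c in range(1, n_categories + 1) if c != current]
            current = rng.choice(others)
        session.append(current)
    return session


def remain_rate(session):
    """Fraction of consecutive request pairs that stay in the same category."""
    same = sum(a == b for a, b in zip(session, session[1:]))
    return same / (len(session) - 1)


# |Omega| = 17 categories, as in the dataset described earlier.
uniformly_changing = simulate_zombie(1000, 17, p_remain=1 / 17)
frequently_changing = simulate_zombie(1000, 17, p_remain=0.5)
```

Comparing `remain_rate` for the two sessions shows the difference in behaviour: the uniformly-changing zombie rarely repeats a category, whereas the second strategy repeats it in roughly half of the transitions.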
