
On Entropy in Network Traffic Anomaly Detection

Jayro Santiago-Paz, Deni Torres-Roman. Cinvestav, Campus Guadalajara, Mexico. November 2015.


  1. On Entropy in Network Traffic Anomaly Detection. Jayro Santiago-Paz, Deni Torres-Roman. Cinvestav, Campus Guadalajara, Mexico. November 2015.

  2. Outline: (1) Introduction: Databases. (2) Feature Extraction: Windowing in Network Traffic. (3) Entropy Calculation: Kullback-Leibler divergence, Mutual information, Entropy calculation. (4) Anomaly detection. (5) Classification: The Classifier, Metrics. (6) Open Issues.

  8. Chandola et al. (2009) state that anomaly-based intrusion detection in networks refers to the problem of finding exceptional patterns in network traffic that do not conform to the expected normal behavior. Given network traffic described by a set of selected traffic features $X = \{X_1, X_2, \dots, X_p\}$ and $N$ time instances of $X$, the normal and abnormal behaviors of the instances can be studied. The space of all instances of $X$ forms the feature space, which can be mapped to another space by employing a function such as entropy. In the literature, Shannon entropy and the generalized Rényi and Tsallis entropy estimators are used, as well as probability estimators (Balanced, Balanced II).

  9. An anomaly-based network intrusion detection system (A-NIDS) usually consists of two stages: training and testing. In the training stage, a "normal" profile is built from a database of "normal" (anomaly-free) network traffic using the feature extraction, windowing, and entropy calculation modules. In the testing stage, the same feature extraction, windowing, and entropy calculation modules are applied to the current network traffic, and anomalies are detected and classified. Figure 1: General architecture of entropy-based A-NIDS.
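The two stages can be illustrated with a minimal Python sketch. The slide does not say how the "normal" profile is represented or which decision rule is applied, so everything here is an assumption made for illustration: the profile is the mean and standard deviation of per-window entropy values, the test is a simple k-sigma threshold, and the names `train_profile` and `is_anomalous` are hypothetical.

```python
import statistics


def train_profile(normal_entropies):
    """Training stage (assumed form): summarize entropy values computed
    on anomaly-free traffic windows as a (mean, std) profile."""
    mean = statistics.mean(normal_entropies)
    std = statistics.pstdev(normal_entropies)
    return mean, std


def is_anomalous(entropy, profile, k=3.0):
    """Testing stage (assumed rule): flag a window whose entropy lies
    more than k standard deviations away from the training mean."""
    mean, std = profile
    return abs(entropy - mean) > k * std


# Entropy values of windows taken from "normal" traffic (made-up numbers).
profile = train_profile([0.80, 0.82, 0.78, 0.81, 0.79])
flag_normal = is_anomalous(0.81, profile)   # close to the profile
flag_attack = is_anomalous(0.30, profile)   # far from the profile
```

A threshold on a single entropy value is only the simplest possible decision rule; it stands in for whatever detection and classification modules a real system would use.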

  10. Synthetic databases are generated artificially, e.g., the MIT-DARPA 1998, 1999, and 2000 databases (http://www.ll.mit.edu/ideval/index.html), whose major attack categories include Denial of Service (DoS), User to Root (U2R), Remote to User (R2U), and probes. Real public databases include CAIDA (https://www.caida.org/data/passive/passive_2012_dataset.xml), which contains anonymized passive traffic traces from high-speed Internet backbone links, and the traffic data repository maintained by the MAWI Working Group of the WIDE Project (http://mawi.wide.ad.jp/mawi/). Other researchers have created their own databases at different universities, e.g., Carnegie Mellon University, Xi’an Jiaotong University, and Clemson University (GENI), or have used traffic collected from the SWITCH, Abilene, and Géant backbones.

  11. Motoda H. and Liu H. (2002): Feature selection is a process that chooses a subset of $M$ features from the original set of $N$ features ($M \le N$) so that the feature space is optimally reduced according to a certain criterion. Feature extraction is a process that extracts a set of new features from the original features through some functional mapping. Assuming that there are $N$ original features $Z_1, Z_2, \dots, Z_N$, feature extraction yields another set of new features $X_1, X_2, \dots, X_M$ ($M < N$) via the mapping functions $F_i$, i.e., $X_i = F_i(Z_1, Z_2, \dots, Z_N)$.
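The difference between the two definitions can be sketched in a few lines of Python. The feature names (bytes, packets, duration) and the two mapping functions are assumptions chosen only for illustration; they are not taken from the slides.

```python
def select_features(rows, keep):
    """Feature selection: keep a subset of M <= N original columns."""
    return [[row[j] for j in keep] for row in rows]


def extract_features(rows, mappings):
    """Feature extraction: each new feature is X_i = F_i(Z_1, ..., Z_N)."""
    return [[f(row) for f in mappings] for row in rows]


# Each row holds N = 3 original features: (bytes, packets, duration in s).
Z = [[1500, 3, 2.0],
     [400, 4, 0.5]]

# Selection keeps the original columns 0 and 2 unchanged (M = 2).
selected = select_features(Z, keep=[0, 2])

# Extraction builds M = 2 new features from all N original ones.
mappings = [
    lambda z: z[0] / z[1],   # F_1: average packet size
    lambda z: z[0] / z[2],   # F_2: byte rate
]
extracted = extract_features(Z, mappings)
```

Selection leaves the surviving features interpretable as they were, while extraction produces values that did not exist in the original set.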

  12. Among the algorithms used to reduce the number of features in network traffic anomaly detection are PCA, mutual information and linear correlation, decision trees, and maximum entropy. In network traffic, the most commonly employed features are source and destination IP addresses and source and destination port numbers. Other features extracted from headers are the protocol field, number of bytes, service, flag, and country code. Zhang et al. (2009) divided packet sizes into seven types, and Gu et al. (2005) defined 587 packet classes based on the port number. At the flow level, the features selected were flow duration, flow size distribution (FSD), and average packet size per flow. (An IP flow is the port-to-port traffic exchanged between two IP addresses during a period of time T.) For KDD Cup 99, the 41 features or a subset of them were employed. Tellenbach et al. (2011), on the other hand, used source port, country code, and other features to construct the Traffic Entropy Spectrum (TES) as input data.
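As a rough illustration of the flow-level features named above, the following Python sketch groups packets into flows and computes flow duration, flow size, and average packet size. The packet record layout `(timestamp, src_ip, src_port, dst_ip, dst_port, size)` and the function name `flow_features` are hypothetical, and all packets are assumed to fall within a single period T.

```python
from collections import defaultdict


def flow_features(packets):
    """Group packets into bidirectional port-to-port flows and compute
    per-flow duration, total bytes, and average packet size."""
    flows = defaultdict(list)
    for ts, src_ip, src_port, dst_ip, dst_port, size in packets:
        # Sort the two endpoints so both directions share one flow key.
        key = tuple(sorted([(src_ip, src_port), (dst_ip, dst_port)]))
        flows[key].append((ts, size))

    features = {}
    for key, pkts in flows.items():
        times = [t for t, _ in pkts]
        total = sum(s for _, s in pkts)
        features[key] = {
            "duration": max(times) - min(times),
            "bytes": total,
            "avg_packet_size": total / len(pkts),
        }
    return features


packets = [
    (0.0, "10.0.0.1", 1234, "10.0.0.2", 80, 100),
    (1.5, "10.0.0.2", 80, "10.0.0.1", 1234, 300),
    (0.2, "10.0.0.3", 5555, "10.0.0.2", 53, 60),
]
feats = flow_features(packets)
web_flow = feats[(("10.0.0.1", 1234), ("10.0.0.2", 80))]
```

Treating both directions as one flow is a design choice made here to match "traffic exchanged between two IP addresses"; a unidirectional definition would simply skip the sorting step.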

  13. Window-based methods group consecutive packets or flows using a sliding window. The $i$th window of size $L$ packets is represented as $W_i(L, \tau) = \{\mathrm{pack}_k, \mathrm{pack}_{k+1}, \dots, \mathrm{pack}_{k+L-1}\}$, with $k = iL - i\tau$, where $\tau$ is the overlap and $\tau \in \{0, 1, \dots, L-1\}$. When the window size is given by time, $L$ can be different in each window. Windowing is performed in two ways: overlapping ($\tau \neq 0$) and non-overlapping ($\tau = 0$) windows. The most commonly used window sizes are 30 min, 5 min, 1 min, 100 s, 5 s, and 0.5 s. Some researchers use windows with a fixed length of $L = 4096$, $1000$, or $32$ packets.
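A packet-count window with overlap can be sketched in Python as follows. The sketch assumes that each window holds exactly L packets, that window i starts at index k = i(L − τ), and that a trailing partial window is dropped; the last point is a choice made here, not something the slide specifies. The function name `sliding_windows` is hypothetical.

```python
def sliding_windows(packets, L, tau=0):
    """Split a packet sequence into windows of L packets, where
    consecutive windows share tau packets (tau = 0: no overlap)."""
    if L <= 0 or not 0 <= tau < L:
        raise ValueError("require L > 0 and 0 <= tau <= L - 1")
    step = L - tau              # start of window i is k = i * (L - tau)
    windows = []
    k = 0
    while k + L <= len(packets):
        windows.append(packets[k:k + L])
        k += step
    return windows


packets = list(range(10))       # stand-in for 10 captured packets
overlapping = sliding_windows(packets, L=4, tau=2)
disjoint = sliding_windows(packets, L=4, tau=0)
```

With τ = 2 every packet except those at the edges appears in two windows, which smooths the entropy series at the cost of more computation per packet.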

  14. Let $X$ be a random variable that takes values in the set $\{x_1, x_2, \dots, x_M\}$, let $p_i := P(X = x_i)$ be the probability of occurrence of $x_i$, and let $M$ be the cardinality of the finite set. The Shannon entropy is then

     $$H_S(X) = -\sum_{i=1}^{M} p_i \log(p_i). \tag{1}$$

     The Rényi entropy is defined as

     $$H_R(X, q) = \frac{1}{1-q} \log\left(\sum_{i=1}^{M} p_i^{q}\right) \tag{2}$$

     and the Tsallis entropy is

     $$H_T(X, q) = \frac{1}{q-1}\left(1 - \sum_{i=1}^{M} p_i^{q}\right). \tag{3}$$

     As $q \to 1$, the generalized entropies reduce to the Shannon entropy. In order to compare the changes of entropy at different times, the entropy is normalized, i.e.,

     $$\bar{H}(X) = \frac{H(X)}{H_{\max}(X)}. \tag{4}$$
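Equations (1) to (4) translate fairly directly into Python. This is a sketch under stated assumptions: probabilities are estimated as relative frequencies of the values observed in a window, natural logarithms are used, and for the normalization the maximum Shannon entropy is taken to be log M with M the number of distinct observed values (the entropy of a uniform distribution). The slide does not define H_max, and all function names are hypothetical.

```python
import math
from collections import Counter


def probabilities(values):
    """Relative-frequency estimate of p_i from observed values."""
    counts = Counter(values)
    n = len(values)
    return [c / n for c in counts.values()]


def shannon(p):
    """Eq. (1): H_S = -sum p_i log p_i."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def renyi(p, q):
    """Eq. (2): H_R = log(sum p_i^q) / (1 - q), for q != 1."""
    if q == 1:
        return shannon(p)       # limiting case q -> 1
    return math.log(sum(pi ** q for pi in p if pi > 0)) / (1 - q)


def tsallis(p, q):
    """Eq. (3): H_T = (1 - sum p_i^q) / (q - 1), for q != 1."""
    if q == 1:
        return shannon(p)       # limiting case q -> 1
    return (1 - sum(pi ** q for pi in p if pi > 0)) / (q - 1)


def normalized_shannon(p):
    """Eq. (4) with the assumed H_max = log M."""
    m = len(p)
    return shannon(p) / math.log(m) if m > 1 else 0.0


# Destination ports seen in one window (made-up values).
ports = [80, 80, 443, 53]
p = probabilities(ports)        # relative frequencies 0.5, 0.25, 0.25
h_s = shannon(p)
h_r = renyi(p, 2)
h_t = tsallis(p, 2)
h_norm = normalized_shannon(p)
```

Evaluating `renyi` or `tsallis` at a value of q very close to 1 gives a quick numerical check that both approach the Shannon value.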
