In the name of God, the Most Gracious, the Most Merciful
Combining Machine and Automata Learning for Network Traffic Classification
Zeynab Sabahi, Fatemeh Ghassemi, and Zahra Alimadadi
TTCS 2020, Tehran, Iran
Network Traffic Classification: What and Why?
Given an interleaved packet trace, we want to detect which applications are running.
This supports network management tasks:
- Anomaly detection
- Balancing bandwidth usage
- Firewalling, gateways, ...
Network Traffic Classification: How?
- Port-based classification: inefficient, because applications use random or non-standard ports.
- Payload inspection: useless on encrypted traffic.
- Statistical methods: use flow/packet statistical features; fast but less accurate, and they ignore the temporal relation among flows.
- Behavioral classification: specific to the category of application.
Our Solution
Intuition: a network application is a program that calls different well-known protocols such as HTTP, TCP, SSL, and TLS.
[Figure: User, TCP, TLS, HTTP]
Each application has its own network communication language, expressed in how it calls these well-known protocols.
Research Goals
- Automatically learn the network language (a k-TSS language) of each application whose source code is not available.
- Classify an interleaved packet trace of applications according to the learned languages.
Outline
- Introduction
- Preliminary: k-TSS Language
- NeTLang Framework
- Evaluation
- Conclusion
Formal Foundation: k-TSS Language
A k-Testable language in the Strict Sense (k-TSS) is a regular language defined by a window of size k.
Its learning is decidable.
Words are determined by three sets of allowed strings: prefixes, suffixes, and segments.
Example: sliding a window of size 3 over the word abaaabababb
- Window aba (positions 1-3): Segments = {aba}, Prefixes = {}, Suffixes = {}
- Window baa (positions 2-4): Segments = {aba, baa}, Prefixes = {}, Suffixes = {}
- Window aaa (positions 3-5): Segments = {aba, baa, aaa}, Prefixes = {}, Suffixes = {}
- After the full pass, with the first two symbols recorded as the prefix: Segments = {aba, baa, aaa, aab, bab, abb}, Prefixes = {ab}, Suffixes = {}
- With the last two symbols recorded as the suffix: Segments = {aba, baa, aaa, aab, bab, abb}, Prefixes = {ab}, Suffixes = {bb}
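The window walk above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, symbols are assumed to be single characters, and only words of length at least k are handled.

```python
def sliding_windows(word, k):
    """Yield every length-k window of `word`, from left to right."""
    for start in range(len(word) - k + 1):
        yield word[start:start + k]


def learn_from_word(word, k):
    """Return (prefixes, suffixes, segments) observed in a single word.

    Assumption for this sketch: len(word) >= k, so short strings are not handled.
    """
    if len(word) < k:
        raise ValueError("this sketch only handles words of length >= k")
    prefixes = {word[:k - 1]}               # first k-1 symbols
    suffixes = {word[len(word) - (k - 1):]}  # last k-1 symbols
    segments = set(sliding_windows(word, k))
    return prefixes, suffixes, segments


prefixes, suffixes, segments = learn_from_word("abaaabababb", 3)
print("Segments:", sorted(segments))
print("Prefixes:", sorted(prefixes))
print("Suffixes:", sorted(suffixes))
```

Because the segments are collected in a set, repeated windows such as aba and bab are stored only once, which is why the slide ends with six segments for nine window positions.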
Formal Definition of k-TSS Language
Definition 1 (k-test vector). Let $k > 0$. A k-test vector is a 5-tuple $Z = \langle \Sigma, I, F, T, C \rangle$ where:
- $I \subseteq \Sigma^{k-1}$ is a set of allowed prefixes
- $F \subseteq \Sigma^{k-1}$ is a set of allowed suffixes
- $T \subseteq \Sigma^{k}$ is a set of allowed segments
- $C \subseteq \Sigma^{<k}$ is a set of allowed short strings
Definition 2 (k-TSS language). Let $Z = \langle \Sigma, I, F, T, C \rangle$ be a k-test vector, for some $k > 0$. Then
$L(Z) = [(I\Sigma^{*} \cap \Sigma^{*}F) - \Sigma^{*}(\Sigma^{k} - T)\Sigma^{*}] \cup C$
Open question: what is the alphabet $\Sigma$, and how should it be defined for the network domain?
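Definition 2 can be read as a membership test: a word belongs to the language if it is an allowed short string, or if it starts with an allowed prefix, ends with an allowed suffix, and contains no length-k window outside the allowed segments. The sketch below is one possible reading of that definition; the function name and the use of single-character symbols are assumptions made for illustration.

```python
def in_ktss_language(word, k, prefixes, suffixes, segments, short_strings):
    """Check membership of `word` in the language of a k-test vector."""
    if word in short_strings:
        return True
    if len(word) < k - 1:
        # Too short to carry an allowed prefix of length k-1.
        return False
    starts_ok = word[:k - 1] in prefixes
    ends_ok = word[len(word) - (k - 1):] in suffixes
    # No forbidden segment: every length-k window must be an allowed segment.
    windows_ok = all(
        word[i:i + k] in segments for i in range(len(word) - k + 1)
    )
    return starts_ok and ends_ok and windows_ok


# Vector learned from the single word abaaabababb with k = 3.
I = {"ab"}
F = {"bb"}
T = {"aba", "baa", "aaa", "aab", "bab", "abb"}
C = set()

print(in_ktss_language("abaaabababb", 3, I, F, T, C))
print(in_ktss_language("ababb", 3, I, F, T, C))
print(in_ktss_language("abbb", 3, I, F, T, C))
```

The second call shows the generalization built into the model: ababb was never observed, but it is accepted because all of its windows, its prefix, and its suffix are allowed. The third call is rejected because the window bbb is not an allowed segment.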
Translating Network Concepts to Automata Learning
Intuition: some packets always appear together, due to the control phase of protocols or to a specific functionality of an application.
- A sequence of related packets: a symbol of the alphabet
- A packet trace of an application: a word of the language
Given the set of all packet traces of an application, its k-TSS language can be learned.
NeTLang Framework
NeTLang: Network Traffic Language Learner
Architectural view: [Figure: NeTLang architecture with three numbered modules: 1) Trace Generator, 2) Language Learner, 3) Classifier]
1) Trace Generator
- Colors in the figure indicate the protocol.
- The clustering algorithm is k-means++.
- Stats are statistical features based on the length, number, and inter-arrival time (IAT) of packets.
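As a rough illustration of the statistical features mentioned above, the sketch below computes the packet count, mean packet length, and mean inter-arrival time of one flow. The exact feature set used by NeTLang is not given on the slide, so the three features, the function names, and the input format are assumptions. The symbol naming (protocol tag plus cluster number, as in T-1 or SL-2) is inferred from the running example, and the clustering step itself is omitted.

```python
from statistics import mean


def flow_features(packets):
    """Compute simple statistics for one flow.

    `packets` is a non-empty list of (timestamp_seconds, length_bytes) pairs
    in arrival order. The chosen features are an assumption for illustration.
    """
    times = [t for t, _ in packets]
    lengths = [n for _, n in packets]
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return {
        "packet_count": len(packets),
        "mean_length": mean(lengths),
        "mean_iat": mean(gaps) if gaps else 0.0,
    }


def symbol_name(protocol_tag, cluster_id):
    """Build a symbol such as 'T-1' from a protocol tag and a cluster id."""
    return f"{protocol_tag}-{cluster_id}"


features = flow_features([(0.0, 100), (0.5, 300), (1.5, 200)])
print(features)
print(symbol_name("T", 1))
```

In a full pipeline, feature vectors like this one would be clustered per protocol, and the resulting cluster id would supply the number in each symbol.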
2) Language Learner
The k-TSS vector is learned by sliding a window of size k over the traces.
For the running example (k = 3):
Σ = {H-2, SL-2, SL-3, SL-4, SL-5, T-1, T-10, TL-2, U-0, U-1}
T = {SL-2 T-1 U-0, SL-4 SL-2 T-1, T-1 SL-5 H-2, T-1 U-0 U-1, T-10 TL-2 T-1, TL-2 T-1 SL-5, U-0 U-1 SL-3}
I = {SL-4 SL-2, T-10 TL-2}
F = {SL-5 H-2, U-1 SL-3}
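The learner can be sketched as a single pass over a set of symbolic traces. The two traces below are not taken from the slides: they are hypothetical inputs chosen because they reproduce the vector of the running example. The function name and the handling of traces shorter than k are likewise assumptions.

```python
def learn_ktss(traces, k):
    """Learn a k-test vector (alphabet, prefixes, suffixes, segments, short)
    from an iterable of traces, each a sequence of symbols."""
    alphabet, prefixes, suffixes, segments, short = set(), set(), set(), set(), set()
    for trace in traces:
        trace = tuple(trace)
        alphabet.update(trace)
        if len(trace) < k:
            # Assumption: traces shorter than k are kept as short strings.
            short.add(trace)
            continue
        prefixes.add(trace[:k - 1])
        suffixes.add(trace[len(trace) - (k - 1):])
        for start in range(len(trace) - k + 1):
            segments.add(trace[start:start + k])
    return alphabet, prefixes, suffixes, segments, short


# Hypothetical traces that yield the running-example vector for k = 3.
traces = [
    ["SL-4", "SL-2", "T-1", "U-0", "U-1", "SL-3"],
    ["T-10", "TL-2", "T-1", "SL-5", "H-2"],
]
alphabet, prefixes, suffixes, segments, short = learn_ktss(traces, 3)
print(sorted(alphabet))
print(sorted(" ".join(p) for p in prefixes))
print(sorted(" ".join(s) for s in suffixes))
print(sorted(" ".join(t) for t in segments))
```

Symbols are kept as tuple elements rather than concatenated strings, so multi-character symbols such as T-10 stay unambiguous.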
3) Classifier
Inputs: the automata of the applications (App1, App2, ...) and the interleaved packet trace.
- The trace generator module is used to divide the interleaved trace into symbolic sub-traces by timing features.
- Automata word inclusion is not a suitable approach, because sub-traces are incomplete and the network is noisy.
- Instead, a window-based similarity compares the k-test vector of a sub-trace s1, $Z(s_1) = \langle \Sigma, I, F, T \rangle$, with that of each application, e.g. $Z(App_1) = \langle \Sigma_1, I_1, F_1, T_1 \rangle$.
Percentage change metric:
$\Delta T = \frac{|T - T_1|}{|T|}$, $\Delta T_1 = \frac{|T_1 - T|}{|T_1|}$, $\Delta\Sigma = \frac{|\Sigma - \Sigma_1|}{|\Sigma|}$, $\Delta I = \frac{|I - I_1|}{|I|}$, $\Delta F = \frac{|F - F_1|}{|F|}$
distance(s1, App1) is computed from the vector $(\Delta T, \Delta T_1, \Delta\Sigma, \Delta I, \Delta F)$.
In general, for a trace w and an application $App_i$, $D(Z(w), Z(App_i))$ is computed from $(\Delta T, \Delta T_i, \Delta\Sigma, \Delta I, \Delta F)$.
The sub-trace is assigned to the application at minimum distance:
$Class(s_1) = App_j$ where $distance(s_1, App_j) = \min(distance(s_1, App_1), distance(s_1, App_2), \ldots)$
$Class(w) = j$ if $D(L(w), L(App_j)) = \min_{App_i \in A} D(L(w), L(App_i))$
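The minimum-distance rule can be sketched as follows. Two points are assumptions and not taken from the slides: the five percentage-change components are combined with a Euclidean norm, and an empty set contributes a change of 0. The dictionary layout, function names, and toy vectors (k = 2, single-character symbols) are also hypothetical.

```python
import math


def pct_change(a, b):
    """Share of the elements of `a` that are missing from `b`.

    Assumption: an empty `a` contributes 0.0.
    """
    return len(a - b) / len(a) if a else 0.0


def distance(z, z_app):
    """Distance between two vectors given as dicts with keys
    'sigma', 'I', 'F', 'T' (each mapping to a set)."""
    components = [
        pct_change(z["T"], z_app["T"]),
        pct_change(z_app["T"], z["T"]),
        pct_change(z["sigma"], z_app["sigma"]),
        pct_change(z["I"], z_app["I"]),
        pct_change(z["F"], z_app["F"]),
    ]
    # ASSUMPTION: Euclidean norm; the slides do not state how the
    # components are combined.
    return math.sqrt(sum(c * c for c in components))


def classify(z, app_vectors):
    """Return the name of the application whose vector is closest to `z`."""
    return min(app_vectors, key=lambda name: distance(z, app_vectors[name]))


# Toy vectors for k = 2 (hypothetical data).
app_vectors = {
    "App1": {"sigma": {"x", "y", "z"}, "I": {"x"}, "F": {"z"}, "T": {"xy", "yz"}},
    "App2": {"sigma": {"p", "q", "r"}, "I": {"p"}, "F": {"r"}, "T": {"pq", "qr"}},
}
sub_trace = {"sigma": {"x", "y"}, "I": {"x"}, "F": {"y"}, "T": {"xy"}}
print(classify(sub_trace, app_vectors))
```

The toy sub-trace is an incomplete prefix of an App1 trace: plain word inclusion would reject it because its suffix is not allowed, while the distance still ranks App1 well ahead of App2.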
Classifier Result for the Running Example
Z(w = SL-4 SL-2 T-10 TL-2 T-1 U-2):
Σ = {SL-2, TL-2, T-1, U-2, SL-4, T-10}
T = {SL-4 SL-2 T-10, SL-2 T-10 TL-2, T-10 TL-2 T-1, TL-2 T-1 U-2}
I = {SL-4 SL-2}
F = {T-1 U-2}
Z(App):
Σ' = {H-2, SL-2, SL-3, SL-4, SL-5, T-1, T-10, TL-2, U-0, U-1}
T' = {SL-2 T-1 U-0, SL-4 SL-2 T-1, T-1 SL-5 H-2, T-1 U-0 U-1, T-10 TL-2 T-1, TL-2 T-1 SL-5, U-0 U-1 SL-3}
I' = {SL-4 SL-2, T-10 TL-2}
F' = {SL-5 H-2, U-1 SL-3}
Result: $\Delta T = 0.75$, $\Delta T_i = 0.85$, $\Delta\Sigma = 0.16$, $\Delta I = 0$, $\Delta F = 1$
D(L(w), L(App)) = 7484160099
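The five component values of this example can be re-derived by treating each component as the share of one set's elements that are missing from the other set. The sketch below uses the sets from the slide; the helper name and the dictionary keys are hypothetical, and the final combined distance is not computed because the slide does not state how the components are combined.

```python
def pct_change(a, b):
    """Share of the elements of `a` that are absent from `b`."""
    return len(a - b) / len(a)


# Vector of the sub-trace w (sets copied from the slide).
sigma_w = {"SL-2", "TL-2", "T-1", "U-2", "SL-4", "T-10"}
segments_w = {"SL-4 SL-2 T-10", "SL-2 T-10 TL-2", "T-10 TL-2 T-1", "TL-2 T-1 U-2"}
prefixes_w = {"SL-4 SL-2"}
suffixes_w = {"T-1 U-2"}

# Vector of the application (sets copied from the slide).
sigma_app = {"H-2", "SL-2", "SL-3", "SL-4", "SL-5", "T-1", "T-10", "TL-2", "U-0", "U-1"}
segments_app = {
    "SL-2 T-1 U-0", "SL-4 SL-2 T-1", "T-1 SL-5 H-2", "T-1 U-0 U-1",
    "T-10 TL-2 T-1", "TL-2 T-1 SL-5", "U-0 U-1 SL-3",
}
prefixes_app = {"SL-4 SL-2", "T-10 TL-2"}
suffixes_app = {"SL-5 H-2", "U-1 SL-3"}

deltas = {
    "delta_T": pct_change(segments_w, segments_app),      # 3 of 4 segments of w are new
    "delta_T_app": pct_change(segments_app, segments_w),  # 6 of 7 app segments are unseen in w
    "delta_sigma": pct_change(sigma_w, sigma_app),        # only U-2 is outside the app alphabet
    "delta_I": pct_change(prefixes_w, prefixes_app),      # the prefix of w is allowed
    "delta_F": pct_change(suffixes_w, suffixes_app),      # the suffix of w is not allowed
}
for name, value in deltas.items():
    print(f"{name} = {value:.3f}")
```

The exact fractions are 3/4, 6/7, 1/6, 0, and 1, which agree with the slide's 0.75, 0.85, 0.16, 0, and 1 once the values are truncated to two decimals.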
Dataset Description
We divided the pcaps into three sets:
- Train: 65%
- Validation: 15%
- Test: 20%
Metrics:
- Precision (P)
- Recall (R)
- F1-Measure $= \frac{2 \cdot P \cdot R}{P + R}$
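For reference, the three metrics can be computed from per-class counts of true positives, false positives, and false negatives. The sketch below uses the standard definitions of precision and recall; the function name, the zero-division convention, and the example counts are assumptions and not results from the evaluation.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from raw counts.

    Assumption: a metric with a zero denominator is reported as 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts for one application class.
p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=2)
print(f"P = {p:.2f}, R = {r:.2f}, F1 = {f1:.2f}")
```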
Evaluation Results
The best configurations on the validation set:
Parameter           Application Identification   Traffic Characterization
Session Threshold   15                           15
Inactive Timeout    5                            15
Flow Duration       10                           10
k                   3                            3
Comparison with Statistical Classifiers: Application Identification
[Figure: Precision, Recall, and F1-Measure]