Controlling False Alarm/Discovery Rates in Online Internet Traffic Flow Classification
Daniel Nechay, Yvan Pointurier and Mark Coates
McGill University, Department of Electrical and Computer Engineering
Montreal, Quebec, Canada
April 22, 2009

Outline Introduction Methodology Data & Processing Simulations Conclusion
Outline
1. Introduction
2. Methodology
   Background
   Traffic Classification
3. Data & Processing
4. Simulation Experiments

Introduction
What is Internet traffic classification?
- Associate a user-defined class with a traffic flow
- A class can be broad (P2P) or application-specific (BitTorrent, Kazaa, etc.)
Why do we need Internet traffic classification?
Internet traffic classification is needed in a variety of applications:
- To help provide QoS guarantees or enforce Service Level Agreements (SLAs)
- To prioritize, limit, or block traffic
- Network provisioning
- Network security

Current Traffic Classification Methods
Port-Based
- Simplest method
- Not reliable
Deep-Packet Inspection
- Examines the payload of the packets to look for application-specific signatures
- Raises privacy and legal concerns
Shallow-Packet Inspection
- Derives statistics from the packet headers and uses them to classify the flow
- Non-invasive and still works on encrypted packets

Our Contribution
Contributions
1. Provide a performance guarantee on the false alarm or false discovery rates
2. Novel methodology: convert a binary classifier into a multi-class classifier
3. Online classification

Problem Formulation
Definitions
- X: the d-dimensional random variable corresponding to the flow features
- Each flow is associated with an output Y
- Z = Y ∈ {1, ..., c + 1}: the class of the flow

Problem Statement 1
Goal of Neyman-Pearson classification: to minimize the overall misclassification rate while adhering to given false alarm rate (FAR) constraints.
False Alarm Rate for class i: the expected fraction of the flows that do not belong to traffic class i that are incorrectly classified as belonging to class i.
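
The FAR definition above can be estimated empirically from labelled flows. The following is a minimal sketch, not the authors' code; the function name `false_alarm_rates` and the toy labels are assumptions made for illustration only.

```python
from collections import Counter


def false_alarm_rates(true_labels, predicted_labels):
    """Empirical per-class FAR.

    For class i: among flows whose true class is NOT i, the fraction
    that the classifier labelled as i.
    """
    total = len(true_labels)
    true_counts = Counter(true_labels)
    # Count misclassified flows, keyed by the (wrong) class assigned to them.
    false_alarms = Counter(
        pred for true, pred in zip(true_labels, predicted_labels) if true != pred
    )
    rates = {}
    for cls in set(true_labels) | set(predicted_labels):
        negatives = total - true_counts[cls]  # flows not belonging to cls
        rates[cls] = false_alarms[cls] / negatives if negatives else 0.0
    return rates


# Toy data (hypothetical, not taken from the trace in the slides).
true_labels = ["HTTP", "HTTP", "HTTPS", "MSN", "HTTP", "HTTPS"]
predicted_labels = ["HTTP", "HTTPS", "HTTPS", "HTTP", "HTTP", "HTTP"]
far = false_alarm_rates(true_labels, predicted_labels)
print(far)
```

In this toy example three flows are not HTTP and two of them are labelled HTTP, so the empirical FAR for HTTP is 2/3.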

Problem Statement 2
Goal of the Learning to Satisfy (LSAT) framework: to provide false discovery rate (FDR) guarantees while minimizing the overall misclassification rate.
False Discovery Rate for class i: the expected fraction of incorrectly classified flows among all traffic flows classified as class i.
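
The FDR differs from the FAR only in its denominator: it is normalised by the flows assigned to class i instead of the flows that do not belong to class i. The following is a minimal sketch of the empirical estimate; the function name `false_discovery_rates` and the toy labels are assumptions for illustration.

```python
from collections import Counter


def false_discovery_rates(true_labels, predicted_labels):
    """Empirical per-class FDR.

    For class i: among flows the classifier labelled as i, the fraction
    whose true class is not i. Classes that are never predicted are
    omitted because the ratio is undefined for them.
    """
    predicted_counts = Counter(predicted_labels)
    wrong = Counter(
        pred for true, pred in zip(true_labels, predicted_labels) if true != pred
    )
    return {cls: wrong[cls] / count for cls, count in predicted_counts.items()}


# Toy data (hypothetical, not taken from the trace in the slides).
true_labels = ["HTTP", "HTTP", "HTTPS", "MSN", "HTTP", "HTTPS"]
predicted_labels = ["HTTP", "HTTPS", "HTTPS", "HTTP", "HTTP", "HTTP"]
fdr = false_discovery_rates(true_labels, predicted_labels)
print(fdr)
```

Here four flows are labelled HTTP and two of those are wrong, so the empirical FDR for HTTP is 0.5, while the FAR for the same predictions would be computed over the three non-HTTP flows.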

Background: Support Vector Machines (SVM)
SVMs consist of two steps:
1. Transform the input features x_i via a mapping Φ : R^d → H, where H is a high-dimensional Hilbert space
2. Construct a hyperplane (the decision boundary) in H according to the max-margin principle
Cost-Sensitive Classification
- A regular SVM treats all misclassifications equally
- Cost-sensitive classification (in our case, the 2ν-SVM) treats the misclassification of each class differently
- Two parameters, ν− and ν+, control the misclassification for the different classes
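
To make class-dependent misclassification costs concrete, the sketch below computes a hinge loss in which margin violations are weighted by a per-class cost. This is the generic cost-sensitive data-fit term, not the 2ν-SVM parameterization named in the slides; `cost_pos`, `cost_neg`, and the toy scores are hypothetical.

```python
def weighted_hinge_loss(scores, labels, cost_pos, cost_neg):
    """Sum of hinge losses, weighted by a cost that depends on the true class.

    scores: real-valued classifier outputs f(x)
    labels: +1 or -1 for each example
    cost_pos / cost_neg: cost applied to violations on positive / negative examples
    """
    total = 0.0
    for score, label in zip(scores, labels):
        violation = max(0.0, 1.0 - label * score)  # zero when the margin is met
        total += (cost_pos if label == 1 else cost_neg) * violation
    return total


# Toy scores: one positive and one negative example fall inside the margin.
scores = [2.0, 0.5, -0.5, -2.0]
labels = [1, 1, -1, -1]
equal_costs = weighted_hinge_loss(scores, labels, cost_pos=1.0, cost_neg=1.0)
asymmetric = weighted_hinge_loss(scores, labels, cost_pos=1.0, cost_neg=4.0)
print(equal_costs, asymmetric)
```

With equal costs both violations count the same; raising `cost_neg` penalises errors on the negative class more heavily, which pushes a trained boundary toward fewer false alarms at the price of more misses.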

Background: What is LSAT?
Goal: to learn a set in the input (feature) space that simultaneously satisfies multiple output constraints.
The LSAT framework is distinguished by two properties:
1. Multiple performance criteria must be satisfied
2. Output behaviour is assessed only on the solution set

Background: LSAT example
[Figure: Comparison of LSAT to WSVM. Panels: LSAT; Weighted SVM (WSVM).]
Reference: F. Thouin, M. J. Coates, B. Eriksson, R. Nowak, and C. Scott, "Learning to Satisfy," in Proc. Int. Conf. Acoustics, Speech, and Signal Proc. (ICASSP), Las Vegas, NV, USA, Apr. 2008.

Traffic Classification
How to classify c classes?
- Use a chain of c binary classifiers
- Each binary classifier is responsible for a particular class
- Ordering is important
- A flow is classified as unknown if no classifier maps it to a class
How to determine the best classifier?
- Find the best parameters ν+, ν− and σ for the 2ν-SVM
- Introduce cost functions to rank the classifiers
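
The chaining idea can be sketched as follows: each binary classifier is tried in order, the first one that accepts the flow determines its class, and a flow accepted by none is labelled unknown. This is an illustrative sketch only; the threshold rules stand in for trained 2ν-SVMs, and the names `classify_with_chain`, `UNKNOWN`, and the `bytes` feature are hypothetical.

```python
UNKNOWN = "unknown"


def classify_with_chain(features, chain):
    """Return the class of the first binary classifier in the chain that accepts.

    chain: ordered list of (class_name, accepts) pairs, where
    accepts(features) returns True if the flow is assigned to class_name.
    """
    for class_name, accepts in chain:
        if accepts(features):
            return class_name
    return UNKNOWN  # no classifier in the chain claimed the flow


# Toy stand-ins for trained binary classifiers (hypothetical thresholds).
chain = [
    ("HTTP", lambda f: f["bytes"] > 1000),
    ("HTTPS", lambda f: f["bytes"] > 100),
]

print(classify_with_chain({"bytes": 5000}, chain))
print(classify_with_chain({"bytes": 500}, chain))
print(classify_with_chain({"bytes": 10}, chain))

# Reversing the chain changes the result for the first flow,
# which is why the ordering matters.
print(classify_with_chain({"bytes": 5000}, list(reversed(chain))))
```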

Traffic Classification: Cost Functions
Traffic classification with FAR constraints
For every classifier, the following risk function is used:
R(f) = \sum_{s(i)} \frac{1}{\alpha_{s(i)}} \max\big(P_F(s(i)) - \alpha_{s(i)}, 0\big) + P_M(s(i))
- s(i): class i
- \alpha_{s(i)}: FAR constraint for class i
- P_F(s(i)): FAR for class i
- P_M(s(i)): misclassification rate for class i
Traffic classification with FDR constraints
Ensure that the classifier satisfies the constraint set, then choose the classifier that minimizes the misclassification rate
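
The two ranking rules can be sketched in a few lines. The sketch assumes that each constrained class contributes one term to the risk, in which the excess of the FAR over its constraint is scaled by 1/α and added to the misclassification rate. The candidate classifiers and their rates below are invented for illustration, and the function names are hypothetical.

```python
def far_risk(false_alarm_rate, miss_rate, alpha):
    """Risk term for one constrained class: scaled FAR excess plus misclassification rate."""
    return max(false_alarm_rate - alpha, 0.0) / alpha + miss_rate


def pick_far_constrained(candidates, alpha):
    """FAR case: choose the candidate with the lowest risk."""
    return min(candidates, key=lambda c: far_risk(c["far"], c["miss"], alpha))


def pick_fdr_constrained(candidates, beta):
    """FDR case: keep candidates that meet the constraint, then minimise the misclassification rate."""
    feasible = [c for c in candidates if c["fdr"] <= beta]
    if not feasible:
        return None  # no candidate satisfies the FDR constraint
    return min(feasible, key=lambda c: c["miss"])


# Hypothetical candidates, e.g. different (nu+, nu-, sigma) settings.
candidates = [
    {"name": "A", "far": 0.002, "fdr": 0.07, "miss": 0.10},
    {"name": "B", "far": 0.008, "fdr": 0.04, "miss": 0.05},
    {"name": "C", "far": 0.003, "fdr": 0.03, "miss": 0.08},
]

best_far = pick_far_constrained(candidates, alpha=0.004)
best_fdr = pick_fdr_constrained(candidates, beta=0.05)
print(best_far["name"], best_fdr["name"])
```

Candidate B has the lowest misclassification rate but violates the FAR constraint, so the 1/α scaling makes its risk large and the FAR rule prefers C. Under the FDR rule B is feasible and wins on misclassification rate.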

Input Data
- Collected a 24-hour trace using tcpdump in April and split the trace by hour
- Only TCP flows were considered as inputs
- tcptrace collected 142 statistics for every flow
- Feature selection reduced the feature space to 5 features
- Flows are classified after their first six packets
- Bro was used to provide the ground truth

Application Breakdown after 6 packets of a flow
Table: Application breakdown for flows > 6 packets
Application   Flows (number)   Flows (%)   Size (GB)   Size (%)
HTTP          315375           78.3%       4.1         74.6%
HTTPS         20736            5.2%        0.29        5.4%
MSN           3364             0.8%        0.04        0.7%
POP3          1311             0.3%        0.01        0.2%
OTHER         61870            15.4%       1.05        19.1%

Simulation environment: Statistics Used
- Total number of bytes sent (C → S)
- Number of packets with the FIN field set (C → S)
- The window scaling factor used (C → S)
- Total number of bytes truncated in the packet capture (C → S)
- Total number of packets truncated in the packet capture (S → C)

FAR-constrained classifier: Classifiers
Three classifiers compared:
- Baseline classifier: multi-class SVM
- FAR-constrained classifier with α{HTTP} = 0.4%
- FAR-constrained classifier with α{HTTPS,HTTP} = 0.05%
Hour 1 Results
- Trained on 1000 randomly chosen points in hour 1 and validated on the rest of the hour
- Baseline classifier has α{HTTP} = 3.7% and α{HTTPS,HTTP} = 0.07%
- Classwise FAR-constrained classifier has α{HTTP} = 0.3%, while the pairwise FAR-constrained classifier has α{HTTPS,HTTP} = 0.02%

FAR-constrained classifier: Overall Accuracy for Hours 2 - 24
[Figure: accuracy (%) versus hour. Curves: Baseline Classifier; FAR(HTTP) = .4%; FAR(HTTPS,HTTP) = .02%.]

FAR-constrained classifier: FAR(HTTP) for Hours 2 - 24
[Figure: FAR(HTTP) (%) versus hour. Curves: Baseline Classifier; FAR(HTTP) = .4%.]

FAR-constrained classifier: FAR(HTTPS,HTTP) for Hours 2 - 24
[Figure: FAR(HTTPS,HTTP) (%) versus hour. Curves: Baseline Classifier; FAR(HTTPS,HTTP) = .02%.]

FDR-constrained classifier: Classifiers
Three classifiers compared:
- Baseline classifier: multiclass SVM
- Unconstrained binary-chained classifier
- FDR-constrained classifier with β{HTTPS} = 5%
Hour 1 Results
- Trained on 1000 randomly chosen points in hour 1
- Unconstrained binary-chained classifier has β{HTTPS} = 7.0%, while the FDR-constrained classifier has β{HTTPS} = 4.2%

FDR-constrained classifier: Overall Accuracy for Hours 2 - 24
[Figure: accuracy (%) versus hour. Curves: Multiclass SVM Baseline; Unconstrained Binary Chain; FDR(HTTPS) = 5%.]

FDR-constrained classifier: FDR(HTTPS) for Hours 2 - 24
[Figure: FDR(HTTPS) (%) versus hour. Curves: Multiclass SVM Baseline; Unconstrained Binary Chain; FDR(HTTPS) = 5%.]

Conclusion
Summary
- Proposed two novel algorithms for Internet traffic classification
- Both are able to provide performance guarantees
- Validated our approach with data provided by an ISP
On-going Research
- Experiments on a more diverse data set
- Creating a hybrid classifier
