Exploiting Sequence of Events for Potential Attack Detection in Network Security using Machine Learning
Ashrith Barthur, PhD
Security Research
@cyberbaggage
H2O.ai Machine Intelligence

Sequence of Events (SoE)
• What is a Sequence of Events?
o A set of events, usually including sub-events, that together help you achieve a goal.

SoE - In Depth
o An individual event is usually a set of sub-events that we, or machines, perform to reach a state.
• E.g. entering a username and password and hitting Enter is a login event.
o An event by itself does not say much.
• E.g. did you log in to Google? Facebook?
o So an event needs a context.
• E.g. entering www.google.com is a page load event.
• Entering a username and password is then a login event.

SoE - Importance
o If you are predicting loan default or fraud, the sequence of events is not that important.
o But when you are classifying a potential attack or malicious behaviour, the sequence of events is important.

SoE - Importance
o Is this not just about building related features?
o Not so.
o This is actually chaining data from different sources into a sequence, either by actual data joins or algorithmically.
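A minimal sketch of chaining data from different sources into a sequence. The two logs, the host names and the `chain_events` helper are hypothetical and only illustrate the idea of a join on host and time; they are not taken from the talk.

```python
import heapq
from collections import defaultdict

# Hypothetical records from two separate sources: (timestamp, host, event).
# Each source is assumed to be sorted by timestamp already.
web_log = [
    (100, "host-a", "page_load:www.google.com"),
    (160, "host-b", "page_load:intranet"),
]
auth_log = [
    (105, "host-a", "login"),
    (300, "host-a", "logout"),
]


def chain_events(*sources):
    """Merge time-sorted sources and build one ordered event sequence per host."""
    sequences = defaultdict(list)
    # heapq.merge lazily interleaves the sorted inputs by timestamp.
    for timestamp, host, event in heapq.merge(*sources):
        sequences[host].append(event)
    return dict(sequences)


sequences = chain_events(web_log, auth_log)
print(sequences["host-a"])
```

The login event on host-a now carries its context: it follows a page load for a specific site.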

Why Do We Need a Sequence of Events While Identifying a Potential Attack?
- The answer lies in how attacks occur: their anatomy.

Classification of Attacks
• Short Term Goals
o DDoS - for different layers
o Physical Attacks
• Long Term Goals
o Network/Service Reconnaissance
o Enterprise Service attacks - attack on infrastructure
o Phishing, Spear Phishing (more focussed)
o Social Engineering - Out-of-loop

Anatomy of An Attack - Short Term
• Identify Target
• Identify Service of Attack
• Overwhelm the service
• Post-Attack Analysis
o The attack mechanism is simple.
o Variations occur in the source of the attack and in protocol levels.
o Relatively short lived.
o Damage is quantifiable.

Anatomy of An Attack - Long Term
• Identify Target
• Reconnaissance
• Identify an infrastructure vulnerability, or a means of phishing
• Network Foothold
• Lateral movement and service compromises
• Data exfiltration, network squatting, or passive sniffing

Anatomy of An Attack - Long Term (cont)
• Post-Attack Analysis (Usually an Illusion)
o The attack might still continue.
o Variations can occur based on services, new vulnerabilities, new software, unused access, network segments without VLANs, unclosed, outdated wall sockets, etc.
o Usually very long term.
o Damage assessment is usually not accurate.

How are these two attack variants used?

Usage
• Used together, if needed.
• Short Term Attacks are used as:
o A means of reconnaissance.
o A method of shielding another attack, or of breaking down some basic protection before an attack is launched.
o A way to shield data exfiltration from detection.

Usage
• As you can see, a potential attack is a set of connected events.
• Identifying only one event might not yield much information.
o E.g. an access to the database is, in itself, hardly a potential attack identifier.
o Accessing the database outside work hours is hardly an identifier either, as people all around the world might be working on the same database.

Current Day Solutions
1. Solutions that correlate events do exist.
2. But they are limited.
3. They are purely rule-based, and mostly stateless.
4. They are hardly capable of smartly identifying events related across time, which is a must for identifying long term attacks.

CSec Solution Evolution
Rule-based Model, Feature-based Model, Pure Data Driven Model

CSec Solution Evolution - Feature-based Model

CSec Solution Evolution - Feature-based Model
● Using a feature-based model, we look for anomalies / potential attacks by:
○ First marking the kind of traffic it is.
○ Then estimating the likelihood of it being malicious.
● These anomalies are further verified by having a human analyse the outcome of the model.

Features - (Used in Feature-based Model)
1. Features are metadata (extracted from the data).
2. They help algorithms capture information from the data.
3. Feature engineering is a form of language translation between raw data and the algorithm.
4. Build much better features for your supervised models.

Source of Data
1. Past Attack
2. Past Traffic
3. Current Traffic
4. Application Logs
5. System logs
6. PCAP files - raw network capture files.
7. ASA, IDS, etc.

Features - Example
1. Average length of connection (too small, too large)
2. Average number of DNS requests (within network/outside network)
3. Average number of new domains
4. Change in MTU ratio vs. Windows/Mac/*Nix machine churn.
5. Packet Utilization - segmentation
6. Window Size
7. Arrival Jitter Variance
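Two of the features listed above can be sketched as simple aggregations. The record layout, the sample values and the `host_features` helper are assumptions made for illustration, not the talk's actual feature pipeline.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical connection records: (host, protocol, duration_seconds, dns_requests).
connections = [
    ("host-a", "tcp", 2.0, 1),
    ("host-a", "tcp", 4.0, 3),
    ("host-a", "udp", 0.5, 0),
    ("host-b", "tcp", 120.0, 40),
]


def host_features(records):
    """Aggregate raw connection records into per-host features."""
    durations = defaultdict(list)  # keyed by (host, protocol)
    dns_counts = defaultdict(list)  # keyed by host
    for host, protocol, duration, dns_requests in records:
        durations[(host, protocol)].append(duration)
        dns_counts[host].append(dns_requests)
    return {
        "avg_connection_length": {key: mean(values) for key, values in durations.items()},
        "avg_dns_requests": {key: mean(values) for key, values in dns_counts.items()},
    }


features = host_features(connections)
print(features["avg_connection_length"][("host-a", "tcp")])
```

A host whose averages drift far from its own history, or from its peers, is what the "too small, too large" remark in the list points at.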

Features - Example
[Figure: average TCP connect length by protocol, 7 days]

Features: Advantages
1. Designed features highlight transactional behaviour.
2. Features continuously track the network's transactional behaviour.
3. Rule variables can only identify threshold changes.

Feature-based Model: Advantages
1. Uses AI (artificial intelligence).
2. AI with features uses a consistent and objective approach.
3. Quick classification.
4. Multiclass - quickly identifies the type of traffic (event).
5. Low false positive rate - tweaked based on risk appetite.

Limitation of the Model
1. A single traffic classification.
2. A single likelihood for the specific type of traffic.
3. It still needs to be verified by a security analyst.
a. An analyst needs to go through large amounts of data for identification.

Identification and Labeling
Two different methods:
1. Completely Manual
2. Assisted by Clustering

Manual Labeling
Logs Information
Analytical Inputs:
1. Behavioural Input
2. Univariate Alert score
3. Threat score
Labels: Suspicious / Not Suspicious

Assisted Labeling
● Manual labeling is slow.
● Therefore, we use an assisted labeling approach.

Assisted Labeling
[Diagram labels: Logs/Pcap; Features; H2O Unsupervised Algorithm; Algo tuning; Clustering Output; Clustering Output Sampling; Clustering output labeling; SoC Analyst; Classification]
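The assisted labeling idea can be sketched as: cluster the feature vectors, show the analyst a small sample from each cluster, and spread the analyst's verdict over the whole cluster. The plain k-means below is a stand-in for the unsupervised algorithm, and the data points, the analyst verdicts and all function names are hypothetical.

```python
import random


def dist2(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Plain k-means; returns one cluster index per point."""
    # Deterministic farthest-point initialisation, so the sketch is repeatable.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(dist2(p, c) for c in centroids)))
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda j: dist2(p, centroids[j])) for p in points]
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels


def sample_per_cluster(labels, per_cluster=1, seed=0):
    """Pick a few row indices from every cluster for the analyst to review."""
    rng = random.Random(seed)
    by_cluster = {}
    for index, label in enumerate(labels):
        by_cluster.setdefault(label, []).append(index)
    return {
        label: rng.sample(members, min(per_cluster, len(members)))
        for label, members in sorted(by_cluster.items())
    }


# Hypothetical two-dimensional feature vectors forming two obvious groups.
points = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (10.0, 10.0), (10.5, 9.5), (9.8, 10.2)]
cluster_ids = kmeans(points, k=2)
to_review = sample_per_cluster(cluster_ids)

# Hypothetical analyst verdicts, one per cluster, propagated to every member.
analyst_verdict = {cluster_ids[0]: "not suspicious", cluster_ids[3]: "suspicious"}
row_labels = [analyst_verdict[c] for c in cluster_ids]
print(to_review, row_labels)
```

The analyst reviews two rows instead of six; on real traffic the saving grows with cluster size.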

Model Deployment
[Diagram labels: Data with Features; H2O Machine Learning Algorithm; Suspicious; Not Suspicious]
1. Traffic logs
2. Pcap Info
3. Alert systems

Limitation of This Approach
1. Slow
2. Loss of Classification information

Loss of Classification of Information
Output Class | Class 1 | Class 2 | Class 3 | Class 4 | Class 5 | Class 6
Class 1      | 0.7     | 0.2     | 0.05    | 0.04    | 0.0     | 0.0
Class 1      | 0.7     | 0.2     | 0.05    | 0.04    | 0.0     | 0.0
...          | ...     | ...     | ...     | ...     | ...     | ...
Class 1      | 0.55    | 0.0     | 0.0     | 0.0     | 0.0     | 0.45
...          | ...     | ...     | ...     | ...     | ...     | ...

Loss of Classification of Information
● In a multiclass ML problem we get probability scores for all candidate classes.
● But we disregard all scores except the highest.
● Both benign events and potential attacks get class probabilities in a multi-classification.
● Events that are benign in a given class, e.g. Class 1, tend to have similar scores.
● Events that are potential attacks in a given class, e.g. Class 1, tend to have scores that differ from those of benign events.
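A small sketch of what is lost, using the three score rows from the table slide. Treating the first two rows as the benign profile and using an absolute-difference distance are assumptions made here for illustration.

```python
from statistics import mean

# Class-probability rows taken from the table slide (Class 1 .. Class 6).
rows = [
    [0.70, 0.20, 0.05, 0.04, 0.0, 0.0],
    [0.70, 0.20, 0.05, 0.04, 0.0, 0.0],
    [0.55, 0.0, 0.0, 0.0, 0.0, 0.45],
]


def predicted_class(scores):
    """Index of the highest score; this is all a plain classifier output keeps."""
    return max(range(len(scores)), key=scores.__getitem__)


def distance(scores, profile):
    """Sum of absolute differences between a row and a reference profile."""
    return sum(abs(a - b) for a, b in zip(scores, profile))


predicted = [predicted_class(row) for row in rows]

# Assumption: the first two rows are benign and define the typical profile.
benign_profile = [mean(column) for column in zip(*rows[:2])]
distances = [distance(row, benign_profile) for row in rows]

print(predicted)
print(distances)
```

All three rows are labeled Class 1, yet the third row sits far from the benign profile. That difference is only visible if the full score vector is kept.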

Model Improvement
● We exploited this information from the multi-classification.
● The classes in the multi-classification are the sequence of events.
● We passed the probability scores through an autoencoder.
● From the multi-classification probability values we calculated reconstruction errors.
● Using the reconstruction errors we were able to classify traffic as anomalous (a potential attack) or benign.
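The reconstruction-error step can be sketched with a tiny linear autoencoder written in plain Python. This is not the H2O model from the talk: the network size, the synthetic benign scores, the learning rate and the threshold rule (the largest error seen on benign training rows) are all assumptions chosen to keep the example self-contained.

```python
import random


def make_benign(count, seed=1):
    """Synthetic benign score rows: a fixed Class 1 profile plus small noise."""
    rng = random.Random(seed)
    base = [0.70, 0.20, 0.05, 0.04, 0.005, 0.005]
    rows = []
    for _ in range(count):
        noisy = [max(0.0, v + rng.gauss(0, 0.02)) for v in base]
        total = sum(noisy)
        rows.append([v / total for v in noisy])  # renormalise to sum to 1
    return rows


def reconstruct(params, x):
    """Encode x into the hidden layer, then decode it back."""
    w_enc, w_dec, b_dec = params
    hidden = [sum(w * v for w, v in zip(row, x)) for row in w_enc]
    output = [sum(w * h for w, h in zip(w_dec[i], hidden)) + b_dec[i] for i in range(len(x))]
    return hidden, output


def train_autoencoder(rows, hidden_size=2, epochs=50, lr=0.05, seed=0):
    """Train a linear autoencoder with per-sample gradient descent on squared error."""
    rng = random.Random(seed)
    n = len(rows[0])
    w_enc = [[rng.uniform(-0.1, 0.1) for _ in range(n)] for _ in range(hidden_size)]
    w_dec = [[rng.uniform(-0.1, 0.1) for _ in range(hidden_size)] for _ in range(n)]
    b_dec = [0.0] * n
    params = (w_enc, w_dec, b_dec)
    for _ in range(epochs):
        for x in rows:
            hidden, output = reconstruct(params, x)
            err = [o - v for o, v in zip(output, x)]
            # Gradient w.r.t. the hidden units, computed before the decoder changes.
            grad_hidden = [
                2 * sum(err[i] * w_dec[i][j] for i in range(n)) for j in range(hidden_size)
            ]
            for i in range(n):
                for j in range(hidden_size):
                    w_dec[i][j] -= lr * 2 * err[i] * hidden[j]
                b_dec[i] -= lr * 2 * err[i]
            for j in range(hidden_size):
                for k in range(n):
                    w_enc[j][k] -= lr * grad_hidden[j] * x[k]
    return params


def reconstruction_error(params, x):
    """Sum of squared differences between a row and its reconstruction."""
    _, output = reconstruct(params, x)
    return sum((o - v) ** 2 for o, v in zip(output, x))


benign = make_benign(200)
params = train_autoencoder(benign)

# Assumed threshold: the worst reconstruction error among benign training rows.
threshold = max(reconstruction_error(params, row) for row in benign)

suspect = [0.55, 0.0, 0.0, 0.0, 0.0, 0.45]  # third row of the table slide
suspect_error = reconstruction_error(params, suspect)
print(suspect_error > threshold)
```

Because the autoencoder has only seen benign score vectors, it reconstructs them well and reconstructs the unusual Class 1 / Class 6 split poorly, which is what makes the error usable as an anomaly signal.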

Model Improvement - Advantages
● FAST!
● Results are reinforced with a bit more information.
● Reinforced events are the sequence of events.
● The analyst looks at a smaller set of data and can quickly identify potential attacks.

Thank You
Questions?
