ACCELERATING CYBER THREAT DETECTION WITH GPU


  1. ACCELERATING CYBER THREAT DETECTION WITH GPU Joshua Patterson | Director of Applied Solutions Engineering | GTC Israel 2017 @datametrician

  2. RULES & PEOPLE DON’T SCALE Current methods are too slow. Right now, financial services reports that it takes an average of 98 days to detect an Advanced Threat, but retailers say it can take about seven months. Once the security community moves beyond the mantras “encrypt everything” and “secure the perimeter,” it can begin developing intelligent prioritization and response plans for various kinds of breaches, with a strong focus on integrity. The challenge lies in efficiently scaling these technologies for practical deployment and making them reliable for large networks. This is where the security community should focus its efforts. http://www.wired.com/2015/12/the-cia-secret-to-cybersecurity-that-no-one-seems-to-get/

  3. ATTACKS ARE MORE SOPHISTICATED How Hackers Hijacked a Bank’s Entire Online Operation https://www.wired.com/2017/04/hackers-hijacked-banks-entire-online-operation/

  4. FIRST PRINCIPLES OF CYBER SECURITY Where the industry must go 1. Indication of compromise needs to improve as attacks become more sophisticated, subtle, and hidden in the massive volume and velocity of data. Combining machine learning, graph analysis, and applied statistics, and integrating these methods with deep learning, is essential to reduce false positives, detect threats faster, and empower analysts to be more efficient. 2. Event management is an accelerated analytics problem: the volume and velocity of data from devices require a new approach that combines all data sources to allow more intelligent, advanced threat hunting and exploration at scale across machine data. 3. Visualization will be a key part of daily operations, allowing analysts to label and train deep learning models faster and to validate machine learning predictions.

  5. FIRST PRINCIPLES OF CYBER SECURITY Where the industry must go 1. Indication of compromise needs to improve as attacks become more sophisticated, subtle, and hidden in the massive volume and velocity of data. Combining machine learning, graph analysis, and applied statistics, and integrating these methods with deep learning, is essential to reduce false positives, detect threats faster, and empower analysts to be more efficient.

  6. (image-only slide; no extractable text)

  7. DATA PLATFORM-AS-A-SERVICE
     HIGH AVAILABILITY • Offers HA with no data loss • Always-on architecture • Automatic data replication
     SCALE • Handles 1M events/second • Auto-scales the cluster
     SELF SERVICE • Log-to-analytics • Kibana, JDBC access • Accessing data using BI tools • Dashboard access using NVIDIA LDAP
     SECURITY • Data platform security has been implemented with VPCs in AWS

  8. ARCHITECTURE V1

  9. DATA PLATFORM STATS

  10. ANOMALY DETECTION

  11. ANOMALY DETECTION USING DEEP LEARNING Components: NGC/NGN GPU Cluster, GPU Cloud, Data Platform, and the AD (Anomaly Detection) AI Framework (Keras + TensorFlow). Top Features: • Automated Alerts & Dashboards • Early Detection • Self Service • Better accuracy & less noise

  12. ANOMALY DETECTION FRAMEWORK Processing flow, from raw data to alerts:
      • Raw Dataset (Time, X1, X2)
      • Feature Learning (algorithms: Recurrent Neural Network (RNN), Autoencoders (AE)) produces learned features X’ and X’’ (Time, X1, X2, X’, X’’)
      • Anomaly Detection (Supervised Learning: Logistic Regression; Unsupervised Learning: Multivariate Gaussian) produces the label Y, 1 for anomaly and 0 for normal (Time, X1, X2, X’, X’’, Y)
      • Post-processing: Univariate Analysis and Anomaly Description
      • Output: email alerts and dashboards, with feedback from the user fed back into the models
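
The deck shows this flow only as a diagram. As a minimal sketch of its middle stages, assuming a dense autoencoder, synthetic data shapes, and a simple percentile threshold (none of which are specified in the talk), the feature-learning and detection steps could look like this in Keras + TensorFlow, the framework named on the previous slide:

```python
import numpy as np
from tensorflow import keras

n_features = 2          # the slide's X1, X2
window = 32             # hypothetical number of time steps per training sample
dim = window * n_features

# Stand-in for windowed "normal" telemetry; replace with real (Time, X1, X2) data.
normal = np.random.normal(size=(10000, dim)).astype("float32")

# Autoencoder: the bottleneck plays the role of the learned features X', X''.
autoencoder = keras.Sequential([
    keras.layers.Dense(32, activation="relu", input_shape=(dim,)),
    keras.layers.Dense(8, activation="relu"),     # learned features
    keras.layers.Dense(32, activation="relu"),
    keras.layers.Dense(dim),
])
autoencoder.compile(optimizer="adam", loss="mse")
autoencoder.fit(normal, normal, epochs=10, batch_size=256, verbose=0)

def score(batch):
    """Per-sample reconstruction error; high error suggests an anomaly (Y = 1)."""
    recon = autoencoder.predict(batch, verbose=0)
    return np.mean((batch - recon) ** 2, axis=1)

# Crude unsupervised threshold; the slide instead fits a multivariate Gaussian
# (unsupervised) or logistic regression (supervised) on the learned features.
threshold = np.percentile(score(normal), 99.5)
```

The same reconstruction-error scores could feed the slide's post-processing step (univariate analysis and anomaly description) before alerting.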

  13. ANOMALY DETECTION BENEFITS WITH DEEP LEARNING Top Features: • Automated Alerts & Dashboards • Early Detection • Self Service • Better accuracy & less noise

  14. ANOMALY DETECTION TRAINING Evolution:
      • V0: Manual Feature Creation (Theano)
      • V1: Automatic Feature Creation using DL (Keras + TensorFlow)
      • V2: Multi-GPU support + TensorFlow Serving
      • Learnings (CPU vs GPU): manual feature extraction does not scale; dataset preparation is the long pole; training on CPU takes longer than the data collection rate
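
The slide names multi-GPU support but shows no code. One hedged illustration of data-parallel multi-GPU training is tf.distribute.MirroredStrategy, shown below; this API postdates the 2017 talk, so it is a stand-in for whatever mechanism the team actually used:

```python
import numpy as np
import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()      # replicates the model across local GPUs
print("replicas in sync:", strategy.num_replicas_in_sync)

with strategy.scope():                           # variables built here are mirrored
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu", input_shape=(64,)),
        tf.keras.layers.Dense(64),
    ])
    model.compile(optimizer="adam", loss="mse")

x = np.random.normal(size=(100_000, 64)).astype("float32")
model.fit(x, x, epochs=3, batch_size=4096)       # each batch is split across replicas
```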

  15. INFERENCING V1 Use Case: detecting anomalies in user activity
      • Inferencing flow from 10k feet: Live Streaming → ETL (streaming aggregations) → AD Platform (data for inferencing)
      • Started with Python scripts for windowed aggregation
      • Python script performance (chart): 10 mins: 73; 30 mins: 103; 60 mins: 154
      • Learnings: hard to scale for near real time; the AD platform runs inferencing every 3 mins because we are limited by the speed of data processing
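
The V1 "Python scripts for windowed aggregation" are not shown in the deck; a rough pandas sketch of the idea follows. The 10/30/60-minute windows come from the slide's chart, while the input file, columns, and aggregates are invented for illustration:

```python
import pandas as pd

# Hypothetical input: one JSON event per line with time/user/event_id/bytes fields.
events = pd.read_json("events.json", lines=True)
events["time"] = pd.to_datetime(events["time"])
events = events.set_index("time").sort_index()

# The 10/30/60-minute windows come from the slide's performance chart.
for window in ("10min", "30min", "60min"):
    agg = (events.groupby("user")
                 .resample(window)
                 .agg({"event_id": "count", "bytes": "sum"}))
    print(window, len(agg))   # hand `agg` to the AD platform for inferencing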

  16. INFERENCING V2 Improved Performance
      • V2: to improve performance, we started using Presto with data on S3 in JSON format
      • Live data is streamed from Kafka to S3; we use Presto for our data warehousing needs
      • Presto is an open-source distributed SQL query engine optimized for low-latency, ad-hoc analysis of data
      • Performance (chart): Presto on JSON: 20 (10 mins), 25 (30 mins), 30 (60 mins); Presto on Parquet: 4 (10 mins), 6 (30 mins), 8 (60 mins)
      • Learnings: Presto with Parquet has the best performance, but we need to batch data at 30-second intervals, so it’s not completely real time
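
As a hedged sketch of querying the S3-resident data through Presto from Python, here is PyHive's Presto client; the coordinator host, catalog, schema, table, and column names are all placeholders, not the team's actual setup:

```python
from pyhive import presto   # PyHive's Presto DB-API client

# Hypothetical coordinator, catalog, schema, and table names.
conn = presto.connect(host="presto-coordinator", port=8080,
                      catalog="hive", schema="security")
cur = conn.cursor()
cur.execute("""
    SELECT user_id, count(*) AS events
    FROM netflow_parquet                         -- Parquet files batched to S3
    WHERE ts >= now() - interval '30' second     -- the 30-second micro-batches
    GROUP BY user_id
""")
rows = cur.fetchall()
```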

  17. FIRST PRINCIPLES OF CYBER SECURITY Where the industry must go 1. Indication of compromise needs to improve as attacks become more sophisticated, subtle, and hidden in the massive volume and velocity of data. Combining machine learning, graph analysis, and applied statistics, and integrating these methods with deep learning, is essential to reduce false positives, detect threats faster, and empower analysts to be more efficient. 2. Event management is an accelerated analytics problem: the volume and velocity of data from devices require a new approach that combines all data sources to allow more intelligent, advanced threat hunting and exploration at scale across machine data.

  18. GPU ACCELERATION Accelerate the Pipeline, Not Just Deep Learning
      • GPUs for deep learning = proven
      • Where else and how else can we use GPU acceleration?
      • Accelerating the data pipeline: Data Ingestion → Data Processing / Stream Processing → Model Training → Inferencing → Visualization & Dashboards
      • Building better models faster
      • First: GPU databases

  19. MOVING TO BIG DATA IS A START Spark outperforms traditional SIEM
      • SIEM vs Big Data solution: 10-node cluster, ~$60k in hardware
      • Data: production SIEM of a Fortune 500 enterprise, 450+ columns, ~250 million events per day
      • Spark vs SIEM benchmarks from Accenture Labs (Strata NY, BSides LV)

  20. MOVING TO BIG DATA IS A START Spark outperforms traditional SIEM

      Scenario                                           Time Period   SIEM           Big Data   Speed Up
      1. Show all network communication from one host    1 Day         3h 20m 13s     1m 44s     114x faster
         (IP) to multiple hosts (IPs)                    1 Week        Not Feasible*  4m 05s
      2. Retrieve failed logon attempts in               1 Day         18m 26s        1m 37s     10x faster
         Active Directory                                1 Week        2h 13m 45s     3m 10s     41x faster
      3. Search for malware (exe) in Symantec logs       1 Day         3h 24m 36s     1m 37s     125x faster
                                                         1 Week        Not Feasible*  3m 22s
      4. View all proxy logs for a specific domain       1 Day         4h 30m 13s     2m 54s     92x faster
                                                         1 Week        Not Feasible*  1m 09s**

      Spark vs SIEM benchmarks from Accenture Labs (Strata NY, BSides LV)
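
Accenture's actual queries are not reproduced in the deck; the PySpark sketch below approximates scenario 2 from the table (failed Active Directory logon attempts over one day). The path and schema are hypothetical; 4625 is the standard Windows event ID for a failed logon:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("siem-style-queries").getOrCreate()

# Hypothetical lake path and column names.
logs = spark.read.parquet("s3://security-lake/windows_events/")
failed = (logs
          .where(F.col("event_id") == 4625)                                     # failed logons
          .where(F.col("event_time") >= F.date_sub(F.current_timestamp(), 1))  # last day
          .groupBy("account_name", "workstation")
          .count()
          .orderBy(F.desc("count")))
failed.show(20)
```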

  21. GPU DATABASES ARE EVEN FASTER 1.1 Billion Taxi Ride Benchmarks (bar chart: time in milliseconds for Queries 1-4 on MapD DGX-1, MapD 4 x P100, Redshift 6-node, and Spark 11-node; the MapD GPU bars sit in the 21-696 ms range, while the CPU-cluster bars run from roughly 1.3 s up to an off-scale 85.9 s) Source: MapD benchmarks on DGX from internal NVIDIA testing, following the guidelines of Mark Litwintschik’s (@marklit82) blogs: Redshift, 6-node ds2.8xlarge cluster; Spark 2.1, 11 x m3.xlarge cluster with HDFS
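
For a sense of how such a GPU database is driven from Python, here is a hedged sketch using pymapd, MapD's Python client (the project has since been renamed OmniSci and then HEAVY.AI). Connection details and the trips table are placeholders; the query mirrors Query 1 of the taxi benchmark, which counts rides per cab type:

```python
import pymapd   # Python client for MapD (later OmniSci / HEAVY.AI)

# Placeholder connection details and table name.
con = pymapd.connect(user="mapd", password="...", host="localhost", dbname="mapd")

# Query 1 of the 1.1-billion-row taxi benchmark: rides per cab type.
cur = con.execute("SELECT cab_type, count(*) FROM trips GROUP BY cab_type")
rows = cur.fetchall()
```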

  22. MAPD MapD Core + MapD Immerse
      LLVM: LLVM creates one custom function that runs at speeds approaching hand-written functions. LLVM enables generic targeting of different architectures and running simultaneously on CPU/GPU.
      Backend Rendering: Data goes from the compute (CUDA) to the graphics (OpenGL) pipeline without a copy and comes back as a compressed PNG (~100 KB) rather than raw data (> 1 GB).
      Streaming: Speed eliminates the need to pre-index or aggregate data. Compute resides on GPUs, freeing CPUs to parse and ingest. Finally, the newest data can be combined with billions of rows of “near historical” data.

  23. MAPD ARCHITECTURE Open Source + Commercial
      LLVM: MapD Core SQL queries are compiled with a just-in-time (JIT) LLVM-based compiler and run as NVIDIA GPU machine code.
      Distributed Scale-out: MapD Core has native distributed scale-out capabilities. MapD Core users can query and visualize larger datasets with much smaller cluster sizes than traditional solutions.
      High Availability: MapD Core has high availability functionality that provides durability and redundancy. Ingest and queries are load balanced across servers for additional throughput.
      Visualization Libraries: JavaScript libraries that allow users to build custom web-based visualization apps powered by a MapD Core database, based on DC.js.
