Masquerading Malicious DNS Traffic: Bayesian Inference, Rainier, Spark. David Rodriguez, March 28, 2019
The Outline: Masquerading DNS Traffic, Time Series Modeling, Rainier + Spark, Anomaly Detection
Cisco Umbrella DNS Resolution
Part 1 DNS Resolution: Cisco Umbrella, 180 Billion Per Day. DNS Records: IP Address, Web Server, Mail Server, Many More
Part 1 Protection 101: Ransomware, Malvertising, Worms, Phishing, Virus, Compromised Account
Part 1 Definition Masquerading Traffic = Masquerading Users + Compromised Websites
Part 1 Masquerading Users: Email, Browsing, PDF Viewer, Internet, Text Editor, SSH Keys
Part 1 Compromised Websites: Typical Visitors; Compromised Phished Server; Malicious Webpage; Browser Redirect; Backdoor Vulnerability
Part 1 Masquerading DNS Traffic: Typical Visitor, Atypical Visitor, DNS Traffic
Part 1 Emotet Campaign: (1) Phishing Email; (2) User Clicks Links or Opens Attachments to Email; (3) Links or Macros Make DNS Requests (Masquerading Traffic); (4) Malware Downloaded; (5) Emotet Runs Code in Process and Registers Computer with C2 Server
Part 1 Emotet Campaign
Part 2 Time-Series Analysis: Extreme Outliers, Expected Non-Zero Volume, Expected Zero Volume
Part 2 Time-Series Analysis: Probability of Demand, Expected Demand When Non-Zero
Part 2 Croston’s Method: Spark Data Pipeline. Trended Volume Spark Table and Historical Spark Table, Join, Store. Note:
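A minimal sketch of Croston's method on a single series, separate from the Spark pipeline on the slide: non-zero demand sizes and the intervals between them are smoothed independently, and the per-period forecast is their ratio. The function name `croston`, the smoothing constant `alpha`, and the sample series are illustrative assumptions, not taken from the talk.

```python
def croston(series, alpha=0.1):
    """Return (smoothed_size, smoothed_interval, forecast) after the last point.

    series: per-period volumes, mostly zeros with occasional non-zero demand.
    alpha:  smoothing constant in (0, 1]; an assumed default, not from the talk.
    """
    size = None       # smoothed non-zero demand size
    interval = None   # smoothed number of periods between non-zero demands
    gap = 1           # periods elapsed since the last non-zero demand

    for volume in series:
        if volume > 0:
            if size is None:
                # Initialise both estimates from the first non-zero observation.
                size, interval = float(volume), float(gap)
            else:
                size += alpha * (volume - size)
                interval += alpha * (gap - interval)
            gap = 1
        else:
            gap += 1

    if size is None:
        return None, None, 0.0  # no demand was ever observed
    return size, interval, size / interval


# Hypothetical hourly query counts for one domain.
hourly_queries = [0, 0, 4, 0, 0, 4, 0, 0, 10]
size, interval, forecast = croston(hourly_queries, alpha=0.5)
```

In this sketch `1 / interval` plays the role of the probability of demand and `size` the expected demand when non-zero.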
Part 2 Bayesian Approach: X, Y; Probability Distribution; Probability of Demand; Expected Demand When Non-Zero
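One way to read the two quantities on this slide is as a two-part model: a probability that any demand occurs, and an expected volume given that it does. The sketch below puts a Beta prior on the probability of demand (a conjugate choice assumed here, not stated in the talk) and uses a plain sample mean for the non-zero part. The function name, the prior parameters, and the data are hypothetical.

```python
def two_part_summary(volumes, prior_a=1.0, prior_b=1.0):
    """Summarise an intermittent series as P(demand) and E[demand | demand > 0].

    prior_a, prior_b: parameters of an assumed Beta prior on P(demand).
    """
    nonzero = [v for v in volumes if v > 0]
    n_nonzero = len(nonzero)
    n_zero = len(volumes) - n_nonzero

    # Beta prior + Bernoulli observations gives a Beta posterior.
    post_a = prior_a + n_nonzero
    post_b = prior_b + n_zero
    p_demand = post_a / (post_a + post_b)  # posterior mean of P(demand)

    # Plug-in estimate (not Bayesian) of the expected demand when non-zero.
    mean_nonzero = sum(nonzero) / n_nonzero if nonzero else 0.0

    return {
        "p_demand": p_demand,
        "mean_nonzero": mean_nonzero,
        "expected_volume": p_demand * mean_nonzero,
    }


summary = two_part_summary([0, 0, 3, 0, 5, 0, 0, 4])
```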
Part 2 Bayesian Approach [Figure: Zero Distribution and Non-Zero Distribution over values 0 to 9, with Outliers marked]
Part 2 Mixture Models
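The slide does not name a specific mixture. As an illustration only, a zero-inflated Poisson is a common mixture for count data with many zeros: with probability `pi_zero` the count is a structural zero, otherwise it is drawn from a Poisson with mean `rate`. The function name and parameter values below are assumptions.

```python
import math


def zip_pmf(k, pi_zero, rate):
    """Probability of count k under a zero-inflated Poisson mixture."""
    poisson = math.exp(-rate) * rate ** k / math.factorial(k)
    if k == 0:
        # Zeros come from both components of the mixture.
        return pi_zero + (1.0 - pi_zero) * poisson
    return (1.0 - pi_zero) * poisson


# Hypothetical parameters: 70% structural zeros, Poisson mean of 3 otherwise.
probabilities = [zip_pmf(k, pi_zero=0.7, rate=3.0) for k in range(60)]
total_mass = sum(probabilities)
mean_count = sum(k * p for k, p in enumerate(probabilities))
```

The mean of this mixture is `(1 - pi_zero) * rate`, so most of the probability mass sits at zero while the non-zero component still describes the occasional bursts.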
Part 2 Discrete Models
Part 2 Continuous Models
Part 3 MCMC Methods: Observations, Proposed Distribution, Sampling From Distribution, Rejection of Samples
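The propose-then-accept-or-reject loop can be sketched with a random-walk Metropolis sampler. This is a generic illustration, not Rainier's sampler: the target is the posterior of a Poisson rate under a flat prior, and the data, step size, and function names are assumptions.

```python
import math
import random


def log_posterior(rate, counts):
    """Unnormalised log posterior of a Poisson rate, flat prior on rate > 0."""
    if rate <= 0:
        return -math.inf
    return sum(counts) * math.log(rate) - len(counts) * rate


def metropolis(counts, n_samples=20000, step=0.5, start=1.0, seed=0):
    """Random-walk Metropolis: propose a nearby rate, accept or reject it."""
    rng = random.Random(seed)
    current = start
    current_lp = log_posterior(current, counts)
    samples = []
    accepted = 0
    for _ in range(n_samples):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_posterior(proposal, counts)
        # Accept with probability min(1, posterior ratio); else keep current.
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
            accepted += 1
        samples.append(current)
    return samples, accepted / n_samples


# Hypothetical non-zero query counts for one domain.
counts = [3, 5, 4, 6, 2, 4, 5, 3]
samples, acceptance_rate = metropolis(counts)
kept = samples[2000:]  # discard burn-in
posterior_mean = sum(kept) / len(kept)
```

With a flat prior the exact posterior here is Gamma(33, rate 8), whose mean is 4.125, which gives a simple check on the sampler's output.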
Part 3 MCMC Methods
Part 3 Rainier: Depending on your background, you might think of Rainier as aspiring to be either: “Stan, but on the JVM” or “Tensorflow, but for small data”. ~ README
Part 3 Rainier Methods
Part 3 PyMC Methods
Part 3 Rainier + Spark: Rainier and Spark on the JVM
Part 3 Rainier + Spark: 150 Million Paid-Level Domains; Hourly Aggregations (Spark Job), Daily Aggregations (Spark Job), Filtering Heuristics, Rainier Simulations (Spark Job)
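The aggregation and filtering stages can be illustrated on in-memory data. The sketch below is plain Python standing in for the Spark jobs, and the filtering rule (keep domains active on at least `min_active_days` days) is a hypothetical heuristic, since the talk does not list the actual ones. The row layout and domain names are also assumptions.

```python
from collections import defaultdict


def daily_from_hourly(hourly_rows):
    """Roll (domain, day, hour, query_count) rows up to per-day totals."""
    daily = defaultdict(int)
    for domain, day, _hour, count in hourly_rows:
        daily[(domain, day)] += count
    return dict(daily)


def candidate_domains(daily, min_active_days=2):
    """Hypothetical filter: keep domains with traffic on enough distinct days."""
    active_days = defaultdict(int)
    for (domain, _day), count in daily.items():
        if count > 0:
            active_days[domain] += 1
    return sorted(d for d, n in active_days.items() if n >= min_active_days)


rows = [
    ("example.com", "2019-03-01", 0, 3),
    ("example.com", "2019-03-01", 1, 4),
    ("example.com", "2019-03-02", 5, 2),
    ("rare.test", "2019-03-01", 9, 1),
]
daily = daily_from_hourly(rows)
candidates = candidate_domains(daily)
```

Only the domains that survive the filter would be passed on to the simulation stage, which keeps the expensive sampling step off the bulk of the traffic.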
Part 4 Window Based Difference: Rainier Simulated Distribution of Parameter Values for Window 1 and Window 2
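Comparing two windows through their simulated parameter distributions can be sketched as follows. Instead of Rainier simulations, this example draws directly from a Gamma posterior for a Poisson rate under a flat prior (an assumed stand-in), then looks at the distribution of the difference between windows. The counts and function names are hypothetical.

```python
import random


def rate_posterior_samples(counts, n_draws, rng):
    """Draw from the Gamma(sum + 1, rate=n) posterior of a Poisson rate."""
    shape = sum(counts) + 1
    scale = 1.0 / len(counts)  # random.gammavariate takes a scale, not a rate
    return [rng.gammavariate(shape, scale) for _ in range(n_draws)]


def window_difference(window_1, window_2, n_draws=5000, seed=0):
    """Return P(rate_2 > rate_1) and a 95% interval for rate_2 - rate_1."""
    rng = random.Random(seed)
    samples_1 = rate_posterior_samples(window_1, n_draws, rng)
    samples_2 = rate_posterior_samples(window_2, n_draws, rng)
    diffs = sorted(b - a for a, b in zip(samples_1, samples_2))
    prob_increase = sum(d > 0 for d in diffs) / n_draws
    lower = diffs[int(0.025 * n_draws)]
    upper = diffs[int(0.975 * n_draws)]
    return prob_increase, (lower, upper)


# Hypothetical daily query counts for two consecutive weeks.
week_1 = [2, 3, 1, 2, 3, 2, 2]
week_2 = [9, 12, 10, 11, 8, 13, 10]
prob_increase, interval_95 = window_difference(week_1, week_2)
```

A window whose difference interval excludes zero is a candidate outlier window in this sketch; the threshold to use is a judgment call that the slides do not specify.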
Part 4 Window Simulations: Week 1, Week 2, Week 3, Week 4
Part 4 Outlier Window
Part 4 Local Outlier to Global
Closing Recap: Masquerading DNS Traffic, Time Series Modeling, Rainier + Spark, Anomaly Detection
Closing Glossed Over Details: Outliers, Goodness of Fit
Closing References
A Review of Croston's method for intermittent demand forecasting: https://www.researchgate.net/publication/254044245_A_Review_of_Croston's_method_for_intermittent_demand_forecasting
Rainier: https://github.com/stripe/rainier
PyMC3: https://docs.pymc.io/
Emotet: https://www.us-cert.gov/ncas/alerts/TA18-201A
Bokeh Plots: https://bokeh.pydata.org/en/latest/
Twitter Chill: https://github.com/twitter/chill
Closing Contact: Website davidrdgz.github.io; GitHub @davidrdgz; Twitter @davidrdgz; Email davrodr3 at cisco.com