Nonparametric Sequential Change Detection for High-Dimensional Problems
Yasin Yılmaz
Electrical Engineering, University of South Florida
Allerton 2017
Outline
1 Introduction
2 Background
3 ODIT: Online Discrepancy Test
4 Numerical Results
5 Conclusion
Introduction
Anomaly Detection
Objective: identify patterns that deviate from a nominal behavior
Applications: cybersecurity, quality control, fraud detection, fault detection, health care, ...
In the literature, statistical outlier detection is typically equated with anomaly detection. However, an outlier could be a nominal tail event or a real anomalous event (e.g., a mean shift).
[Figure: nominal density f_0(x) vs. anomalous density f_1(x) with a shifted mean]
Problem Formulation
Instead of anomaly = outlier, also consider the temporal dimension.
Proposed model: anomaly = persistent outliers
Objective: timely and accurate detection of anomalies in high-dimensional datasets
Approach: sequential & nonparametric anomaly detection
[Figure: a nominal signal x(t) with a single outlier vs. a signal with persistent outliers after t=10 (each anomalous sample occurring with prob. 0.2)]
Motivating Facts: IoT Security, Smart Grid, ...
IoT devices: 8.4B in 2017, expected to hit 20B by 2020 [1]
IoT systems: highly vulnerable; scalable security solutions are needed [2]
Mirai IoT botnet: largest recorded DDoS attack, with at least 1.1 Tbps bandwidth (Oct. 2016) [2]
Persirai IoT botnet targets at least 120,000 IP cameras (May 2017) [3]
A plausible cyberattack against the US grid could leave 100M people without power, with up to $1 trillion of monetary loss [4]
[1] R. Minerva, A. Biru, and D. Rotondi, "Towards a definition of the Internet of Things (IoT)," IEEE Internet Initiative, no. 1, 2015.
[2] E. Bertino and N. Islam, "Botnets and Internet of Things Security," Computer, vol. 50, no. 2, pp. 76-79, Feb. 2017.
[3] Trend Micro, "Persirai: New Internet of Things (IoT) Botnet Targets IP Cameras," May 9, 2017, available online.
[4] Trevor Maynard and Nick Beecroft, "Business Blackout," Lloyd's Emerging Risk Report, p. 60, May 2015.
Motivating Facts: IoT Security, Smart Grid, ... (cont.)
Challenges:
Unknown anomalous distribution: parametric methods, as well as signature-based methods (e.g., antivirus), are not feasible
High-dimensional problems: even the nominal distribution is difficult to know
Hence, nonparametric methods are needed, and timely and accurate detection is critical.
Background
Sequential Change Detection - CUSUM
Lorden's minimax formulation:
\[ \inf_T \sup_\tau \operatorname{ess\,sup}_{\{x_1,\dots,x_\tau\}} \mathbb{E}_\tau[T - \tau \mid T \ge \tau] \quad \text{s.t.} \quad \mathbb{E}_\infty[T] \ge \beta \]
CUSUM recursion and stopping rule:
\[ W_t = \max\left\{ W_{t-1} + \log \frac{f_1(x_t)}{f_0(x_t)},\ 0 \right\}, \qquad T = \min\{t : W_t \ge h\} \]
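The CUSUM recursion above can be sketched in a few lines of Python. This is a minimal illustration, not the talk's implementation: the Gaussian pre- and post-change densities, the threshold h, and the change point are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm

def cusum(xs, f0, f1, h):
    """CUSUM: W_t = max(W_{t-1} + log(f1(x_t)/f0(x_t)), 0);
    return the first time t (1-indexed) with W_t >= h, or None."""
    w = 0.0
    for t, x in enumerate(xs, start=1):
        w = max(w + np.log(f1(x) / f0(x)), 0.0)
        if w >= h:
            return t
    return None

# Illustrative example: mean shift from N(0,1) to N(2,1) at tau = 50.
rng = np.random.default_rng(0)
xs = np.concatenate([rng.normal(0, 1, 50), rng.normal(2, 1, 50)])
t_alarm = cusum(xs, norm(0, 1).pdf, norm(2, 1).pdf, h=5.0)
```

Before the change the log-likelihood ratio has negative drift, so W_t hovers near zero; after the change the drift turns positive and the statistic climbs to the threshold, yielding an alarm shortly after the change point.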
Statistical Outlier Detection
Needs a statistical description f_0 of the nominal (e.g., no-attack) behavior (baseline)
Determines instances that significantly deviate from the baseline
With f_0 completely known, x is an outlier if \( \int_x^\infty f_0(y)\,dy < \alpha \) (p-value)
Equivalently, x is an outlier if \( x \notin \Omega_\alpha \), the most compact set of data points under f_0 (minimum volume set):
\[ \Omega_\alpha = \arg\min_A \int_A dy \quad \text{subject to} \quad \int_A f_0(y)\,dy \ge 1 - \alpha \]
Uniformly most powerful test when the anomalous distribution is a linear mixture of f_0 and the uniform distribution
Coincides with the minimum entropy set, which minimizes the Rényi entropy while satisfying the same false alarm constraint
[Figure: minimum volume set of a unimodal nominal density f_0(x)]
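The p-value rule above is easy to state in code when f_0 is completely known. A minimal sketch, assuming a standard normal nominal density (an illustrative choice, not from the talk):

```python
from scipy.stats import norm

def is_outlier(x, alpha=0.05):
    """Right-tail p-value test under a fully known nominal f0 = N(0,1):
    declare x an outlier if the tail mass beyond x is below alpha."""
    p_value = norm.sf(x)  # survival function: integral of f0 from x to infinity
    return p_value < alpha

print(is_outlier(3.0))  # True: deep in the tail
print(is_outlier(0.5))  # False: well inside the bulk of f0
```

The slide's point is precisely that this per-sample test cannot distinguish a nominal tail event from a real anomaly; the temporal aggregation introduced later addresses that.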
Geometric Entropy Minimization (GEM)
High-dimensional datasets: even if f_0 is known, it is very computationally expensive (if not impossible) to determine \Omega_\alpha
Various methods exist for learning \Omega_\alpha
GEM is very effective with high-dimensional datasets while asymptotically achieving \Omega_\alpha for \( \lim_{K,N \to \infty} K/N \to 1 - \alpha \)
Training: randomly partition the training set into two parts \( X^{N_1} \) and \( X^{N_2} \), and form the K-kNN graph [5]:
\[ \bar{X}^{N_1}_K = \arg\min_{X^{N_1}_K} L_k(X^{N_1}_K, X^{N_2}), \qquad L_k(X^{N_1}_K, X^{N_2}) = \sum_{i=1}^{K} \sum_{l=k^*}^{k} |e_i(l)|^\gamma \]
Test: a new point \( x_t \in \mathbb{R}^d \) is an outlier if \( x_t \notin \bar{X}^{N_1+1}_{K} \), equivalently if \( L_t = \sum_{l=k^*}^{k} |e_t(l)|^\gamma > L_{(K)} \)
[Figure: K-kNN graph between two training partitions, with a test point]
[5] A. O. Hero III, "Geometric entropy minimization (GEM) for anomaly detection and localization," NIPS, pp. 585-592, 2006.
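The GEM training and test steps above can be sketched as follows. This is a simplified illustration under assumptions not in the slides: k* = 1 (sum over all k nearest-neighbor edges), gamma = 1, Euclidean distances, and a brute-force distance computation rather than an efficient graph construction.

```python
import numpy as np

def gem_train(X1, X2, k=4, alpha=0.05, gamma=1.0):
    """Simplified GEM training: score each point in partition X1 by the sum
    of gamma-powered distances to its k nearest neighbors in partition X2,
    keep the (1 - alpha) fraction with the smallest scores, and return the
    baseline threshold L_(K) = largest retained score."""
    d = np.linalg.norm(X1[:, None, :] - X2[None, :, :], axis=2)
    knn = np.sort(d, axis=1)[:, :k]        # k smallest distances per point
    scores = (knn ** gamma).sum(axis=1)    # L_i for each training point
    K = int(np.ceil((1 - alpha) * len(X1)))
    return np.sort(scores)[K - 1]          # L_(K)

def gem_test(x, X2, L_K, k=4, gamma=1.0):
    """Declare x an outlier if its kNN statistic L_t exceeds L_(K)."""
    d = np.sort(np.linalg.norm(X2 - x, axis=1))[:k]
    return (d ** gamma).sum() > L_K

# Illustrative data: two nominal partitions drawn from N(0, I) in 2D.
rng = np.random.default_rng(1)
X1 = rng.normal(size=(200, 2))
X2 = rng.normal(size=(200, 2))
L_K = gem_train(X1, X2)
```

A far-away test point such as (8, 8) has large kNN distances and is flagged, while a point near the origin falls inside the learned minimum-volume set.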
ODIT: Online Discrepancy Test
Online Discrepancy Test (ODIT)
GEM lacks the temporal aspect
In GEM, x_t is an outlier if \( L_t = \sum_{l=k^*}^{k} |e_t(l)|^\gamma > L_{(K)} \)
In ODIT, \( D_t = L_t - L_{(K)} \) is treated as positive/negative evidence for anomaly
D_t approximates the log-likelihood ratio \( \ell_t = \log \frac{p(r(x_t) \mid H_1)}{p(r(x_t) \mid H_0)} \) between H_1, claiming x_t is anomalous, and H_0, claiming x_t is nominal
Assuming independence, \( \sum_{t=1}^T D_t \) gives the aggregate anomaly evidence up to time T (as does \( \sum_{t=1}^T \ell_t \), the sufficient statistic for optimum detection)
Similar to CUSUM (the optimum minimax sequential change detector), ODIT decides using
\[ T_d = \min\{t : s_t \ge h\}, \qquad s_t = \max\{s_{t-1} + D_t, 0\} \]
[Figure: ODIT statistic s_t crossing the detection threshold h]
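The ODIT decision rule above mirrors the CUSUM recursion, with the discrepancy D_t = L_t - L_(K) replacing the log-likelihood ratio. A minimal sketch, with an illustrative evidence sequence (negative drift before the change, positive after) standing in for real kNN statistics:

```python
def odit(D, h):
    """ODIT recursion on the evidence sequence D_t = L_t - L_(K):
    s_t = max(s_{t-1} + D_t, 0); alarm at T_d = min{t : s_t >= h}."""
    s = 0.0
    for t, d in enumerate(D, start=1):
        s = max(s + d, 0.0)
        if s >= h:
            return t
    return None

# Negative evidence for 20 samples, then persistent positive evidence.
D = [-0.1] * 20 + [0.3] * 20
print(odit(D, h=2.0))  # 27
```

Before the change, the max-with-zero clamp keeps s_t at zero despite the negative evidence; after the change, s_t accumulates the positive evidence 0.3 per step and crosses h = 2.0 seven samples later, at t = 27.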
Theoretical Justification - Asymptotics
Asymptotic Optimality - Scalarized Problem
As the training set grows (\( N_2 \to \infty \)), ODIT is asymptotically optimum for
\[ H_0: r(x_t) \sim f^k_0, \ \forall t \qquad H_1: r(x_t) \sim f^k_0, \ t < \tau, \quad \text{and} \quad r(x_t) \sim f^k_{\text{uni}}, \ t \ge \tau \]
Assumptions: {x_t} independent; r(x_t) is the kNN distance; f_0(x_t) > 0 is Lebesgue continuous
Here \( f^k_0 \) and \( f^k_{\text{uni}} \) are the distributions of the kNN distance under f_0 and under the uniform distribution on a d-dimensional grid with spacing \( r_\alpha \), where \( \int_{r_\alpha}^\infty f^k_0(r)\,dr = \alpha \)
Sketch of the Proof
For independent {x_t}, a continuous f_0 > 0 defines a non-homogeneous Poisson point process with continuous rate \( \lambda(x) > 0 \).
Obtain a homogeneous Poisson point process with rate k by defining a d-dimensional non-homogeneous grid with cell volume \( k/\lambda(x) \) [6].
For this homogeneous Poisson point process, the nearest-neighbor distance distribution is given by
\[ D_x(r^d) = k\,d\,v_d(x, r)\,e^{-k v_d(x, r)}\,dr^d \]
Under H_0, \( r(x_t) = r_t \) comes from \( f^k_0 \), which can be computed using the training set as L_t. Under H_1, \( r(x_t) = r_\alpha \) comes from \( f^k_{\text{uni}} \), which has a single atom at \( r_\alpha \), computed as \( L_{(K)} \).
As the training set grows, \( L_t \to r_t \) and \( L_{(K)} \to r_\alpha \).
The optimum CUSUM test computes \( \log \frac{D_x(r_\alpha)}{D_x(r_t)} = kc(r^d_t - r^d_\alpha) \)
[6] Robert Gallager, 6.262 Discrete Stochastic Processes, Chapter 2, Spring 2011, Massachusetts Institute of Technology: MIT OpenCourseWare, https://ocw.mit.edu. License: Creative Commons BY-NC-SA.