Anomaly Detection and Prototype Selection Using Polyhedron Curvature
Benyamin Ghojogh, Fakhri Karray, Mark Crowley
Canadian AI conference, 2020
Anomaly Detection
• Finding outliers or anomalies, which differ significantly from the normal data points
• Applications: fraud detection, intrusion detection, medical diagnosis, and damage detection
• Some methods:
  • Local Outlier Factor (LOF)
  • One-class SVM
  • Elliptic Envelope (EE)
  • Isolation forest
Prototype Selection
• Also referred to as instance ranking and numerosity reduction
• Two versions:
  • Ranking based
  • Retaining based
• Some methods:
  • Edited Nearest Neighbor (ENN)
  • Decremental Reduction Optimization Procedure 3 (DROP3)
  • Stratified Ordered Selection (SOS)
  • Shell Extraction (SE)
  • Principal Sample Analysis (PSA)
  • Instance Ranking by Matrix Decomposition (IRMD)
Polyhedron Curvature
• Polytope: a geometrical object in $\mathbb{R}^d$ whose faces are planar
• Special cases:
  • Polygon: polytope in $\mathbb{R}^2$
  • Polyhedron: polytope in $\mathbb{R}^3$
• Consider a polygon where $\tau_j$ and $\mu_j$ are the interior and exterior angles at the $j$-th vertex
  • We have $\tau_j + \mu_j = \pi$
Polyhedron Curvature
• Thomas Harriot's theorem, proposed in 1603:
  • If a geodesic polygon on the unit sphere is a triangle, its area is $\mu_1 + \mu_2 + \mu_3 - \pi = 2\pi - (\tau_1 + \tau_2 + \tau_3)$
  • The generalization of this theorem from a geodesic triangle (3-gon) to a $k$-gon is $\mu_1 + \cdots + \mu_k - k\pi + 2\pi = 2\pi - \sum_{a=1}^{k} \tau_a$
• Descartes's angular defect: $2\pi - \sum_{a=1}^{k} \tau_a$
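The step from the triangle to the k-gon follows from the polygon identity (interior plus exterior angle equals π at every vertex). A short derivation, written as a sketch with the summation index a running over the k vertices:

```latex
\begin{align*}
\sum_{a=1}^{k} \mu_a - k\pi + 2\pi
  &= \sum_{a=1}^{k} (\pi - \tau_a) - k\pi + 2\pi \\
  &= k\pi - \sum_{a=1}^{k} \tau_a - k\pi + 2\pi \\
  &= 2\pi - \sum_{a=1}^{k} \tau_a .
\end{align*}
```

For k = 3 this reduces to the triangle case: the sum of the three exterior angles minus π equals 2π minus the sum of the three interior angles.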
Polyhedron Curvature
• Descartes's angular defect at a vertex $x$: $D(x) = 2\pi - \sum_{a=1}^{k} \tau_a$
• The total defect of a polyhedron with $v$ vertices, $e$ edges, and $f$ faces is $D := \sum_{i=1}^{v} D(x_i) = 2\pi(v - e + f)$
• The term $v - e + f$ is the Euler-Poincaré characteristic of the polyhedron
• Smaller angles $\tau_a$ result in a sharper corner of the polyhedron
• So, we can consider the angular defect as the curvature of the vertex
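As a quick sanity check of the total-defect formula, the sketch below works through a cube, where three right angles meet at each of the 8 vertices. The function name `angular_defect` is only illustrative and does not come from the talk.

```python
import math

def angular_defect(face_angles):
    """Descartes's angular defect at one vertex: 2*pi minus the sum of the face angles."""
    return 2 * math.pi - sum(face_angles)

# Cube: 8 vertices, 12 edges, 6 faces; three right angles meet at every vertex.
v, e, f = 8, 12, 6
vertex_defect = angular_defect([math.pi / 2] * 3)   # defect at a single cube corner
total_defect = v * vertex_defect                     # every vertex has the same defect
euler_characteristic = v - e + f

print(vertex_defect, math.pi / 2)
print(total_defect, 2 * math.pi * euler_characteristic)
```

Each corner has defect π/2, so the total is 4π, which matches 2π(v − e + f) with v − e + f = 2.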
Curvature Anomaly Detection (CAD)
• Every data point is considered to be the vertex of a hypothetical polyhedron
• For every point, we find its k-Nearest Neighbors (k-NN)
• The k neighbors of the point (vertex) form the k faces of a polyhedron meeting at that vertex
• The more curvature that point (vertex) has, the more anomalous it is, i.e., the farther away (more different) it is from its neighbors
• So, the anomaly score $s_A$ is proportional to the curvature
Curvature Anomaly Detection (CAD)
• Descartes's angular defect: $2\pi - \sum_{a=1}^{k} \tau_a$. Hence, curvature is proportional to minus the summation of the angles
• $s_A(x_i) \propto 1/\tau_a \propto \cos(\tau_a)$
• $s_A(x_i) := \sum_{a=1}^{k} \cos(\tau_a) = \sum_{a=1}^{k} \frac{{x'_a}^{\top} x'_{a+1}}{\|x'_a\|_2 \, \|x'_{a+1}\|_2}$
• $x'_a := x_a - x_i$
• Relaxation: the relaxation is valid
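The score above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation, and it rests on assumptions the slide does not state: neighbors are ordered from nearest to farthest, the index a+1 wraps around so that the last neighbor is paired with the first, distances are computed by brute force, and a small epsilon guards against duplicate points. The name `cad_scores` is hypothetical.

```python
import numpy as np

def cad_scores(X, k):
    """Sum of cosines between consecutive centered neighbors, for every row of X."""
    n = X.shape[0]
    # Brute-force pairwise Euclidean distances (fine for small n).
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)                 # a point is not its own neighbor
    scores = np.empty(n)
    for i in range(n):
        nbrs = np.argsort(dist[i])[:k]             # assumed order: nearest first
        centered = X[nbrs] - X[i]                  # x'_a = x_a - x_i
        nxt = np.roll(centered, -1, axis=0)        # x'_{a+1}; assumed wrap-around
        num = np.sum(centered * nxt, axis=1)
        den = np.linalg.norm(centered, axis=1) * np.linalg.norm(nxt, axis=1)
        scores[i] = np.sum(num / np.maximum(den, 1e-12))
    return scores

# Toy data: a 5x5 grid of normal points plus one far-away point.
g = np.linspace(0.0, 1.0, 5)
grid = np.array([[a, b] for a in g for b in g])
X = np.vstack([grid, [[10.0, 10.0]]])
scores = cad_scores(X, k=4)
print(int(np.argmax(scores)))                      # index of the highest-scoring point
```

For the far-away point, all neighbors lie in nearly the same direction, so every cosine is close to 1 and the score approaches k; points inside the grid are surrounded by their neighbors and score much lower.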
Curvature Anomaly Detection (CAD)
• Finding anomalies (training data):
  • Scree plot
  • K-means with two clusters: the cluster with the larger mean is the anomaly cluster
• Finding anomalies (out-of-sample):
  • k-NN for the out-of-sample point, where the neighbors are taken from the training points
  • Calculate the anomaly score
  • Compare with the means of the clusters
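A minimal sketch of the two-cluster step, assuming the anomaly scores have already been computed. It uses a plain one-dimensional Lloyd iteration written in NumPy rather than any particular library's K-means; the names `two_means_1d` and `is_anomaly` and the toy scores are illustrative only.

```python
import numpy as np

def two_means_1d(scores, n_iter=100):
    """Two-cluster K-means (Lloyd's algorithm) on 1-D scores; returns (labels, centers)."""
    centers = np.array([scores.min(), scores.max()], dtype=float)
    for _ in range(n_iter):
        # Assign every score to its nearest center.
        labels = np.abs(scores[:, None] - centers[None, :]).argmin(axis=1)
        # Recompute each center as the mean of its members (keep it if empty).
        new_centers = np.array([
            scores[labels == c].mean() if np.any(labels == c) else centers[c]
            for c in range(2)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

def is_anomaly(score, centers):
    """Out-of-sample rule: anomalous if the score is closer to the larger cluster mean."""
    return abs(score - centers.max()) < abs(score - centers.min())

train_scores = np.array([0.1, 0.2, 0.0, 0.3, 3.9, 4.0])   # made-up scores
labels, centers = two_means_1d(train_scores)
anomalous = labels == centers.argmax()     # cluster with the larger mean
print(anomalous, centers)
print(is_anomaly(3.5, centers), is_anomaly(0.5, centers))
```

For an out-of-sample point, the score passed to `is_anomaly` would first be computed from its k nearest training points, as the slide describes.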
Kernel Curvature Anomaly Detection (K-CAD)
• The pattern of normal and anomalous data might not be linear
• Two steps are done in the feature space: (1) finding k-NN, (2) calculating the anomaly score
• Kernel: $k(x_1, x_2) := \phi(x_1)^\top \phi(x_2)$
• Euclidean distance in the feature space:
• Normalize the kernel:
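The formulas for the last two bullets did not survive in this transcript. The sketch below uses the standard kernel identities, which are assumed (not confirmed by the slide) to be what was shown: the squared feature-space distance is k(x_i, x_i) − 2k(x_i, x_j) + k(x_j, x_j), and the normalized kernel is k(x_i, x_j) / sqrt(k(x_i, x_i) k(x_j, x_j)). Function names are illustrative.

```python
import numpy as np

def feature_space_distances(K):
    """Pairwise Euclidean distances in feature space, from a kernel (Gram) matrix K."""
    diag = np.diag(K)
    d2 = diag[:, None] - 2 * K + diag[None, :]
    return np.sqrt(np.maximum(d2, 0.0))     # clip tiny negatives caused by round-off

def normalize_kernel(K):
    """Scale K so that every point has unit norm in the feature space."""
    diag = np.diag(K)
    return K / np.sqrt(np.outer(diag, diag))

# Check with a linear kernel, where the feature space is the input space itself.
X = np.array([[1.0, 0.0], [0.0, 2.0], [3.0, 4.0]])
K = X @ X.T
D_feature = feature_space_distances(K)
D_input = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
K_normalized = normalize_kernel(K)
print(np.allclose(D_feature, D_input))
print(np.diag(K_normalized))
```

With these distances, the k-NN search of step (1) can run entirely on the kernel matrix, without computing the feature map explicitly.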
Kernel Curvature Anomaly Detection (K-CAD)
• Score:
• The anomaly score in K-CAD is ranked inversely for some kernels, such as Radial Basis Function (RBF), Laplacian, and polynomial (different degrees)
  • Reason: future work
  • Multiply the scores by $-1$, or take the K-means cluster with the smaller mean as the anomaly cluster
Anomaly Landscape
• Anomaly landscape: the landscape in the input space whose value at every point is the anomaly score computed by CAD or K-CAD
• Two types of anomaly landscape:
  • All the training data points are used for k-NN
  • Only the non-anomalous training points are used for k-NN
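One way to picture the landscape is to evaluate an out-of-sample score on a grid of query points. The sketch below does this for the CAD-style cosine score, under the same assumptions as any simple reading of the score: neighbors ordered nearest first, wrap-around pairing of the last neighbor with the first, and a small epsilon against zero-length vectors. The names `oos_score` and `anomaly_landscape` are hypothetical.

```python
import numpy as np

def oos_score(q, X_train, k):
    """Cosine-sum score of query point q, using its k nearest training points."""
    d = np.linalg.norm(X_train - q, axis=1)
    nbrs = np.argsort(d)[:k]                     # assumed order: nearest first
    c = X_train[nbrs] - q                        # neighbors centered on the query
    nxt = np.roll(c, -1, axis=0)                 # next neighbor; assumed wrap-around
    den = np.linalg.norm(c, axis=1) * np.linalg.norm(nxt, axis=1)
    return float(np.sum(np.sum(c * nxt, axis=1) / np.maximum(den, 1e-12)))

def anomaly_landscape(X_train, k, xs, ys):
    """Score at every (x, y) grid location; rows follow ys, columns follow xs."""
    return np.array([[oos_score(np.array([x, y]), X_train, k) for x in xs] for y in ys])

# Training data: a 5x5 grid in the unit square.
g = np.linspace(0.0, 1.0, 5)
X_train = np.array([[a, b] for a in g for b in g])

xs = np.array([0.4, 10.0])
ys = np.array([0.6, 10.0])
Z = anomaly_landscape(X_train, k=4, xs=xs, ys=ys)
print(Z)
```

Passing all training points as `X_train` gives the first type of landscape; passing only the points not flagged as anomalies gives the second.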
Anomaly Paths
• Anomaly path: the path that an anomalous point has traversed from its not-yet-known normal version to become anomalous. Conversely, it is the path that an anomalous point should traverse to become normal
• An anomaly path can be used to make a normal sample anomalous or vice versa
Inverse Curvature Anomaly Detection (iCAD)
• Score:
• Two versions:
  • Ranking based: rank the points by the ranking score
  • Retaining based: apply K-means clustering, with two clusters, to the ranking scores and retain the points of the cluster with the larger mean
Kernel Inverse Curvature Anomaly Detection (K-iCAD)
• Scores:
• iCAD and K-iCAD are task agnostic:
  • Classification: the method is applied to every class separately
  • Regression and clustering: the method is applied to the entire data
Experiments: anomaly landscape
Experiments: anomaly paths
An application in image denoising
Experiments: anomaly detection
• In most cases, K-CAD performs better than CAD
• In many cases, CAD and K-CAD outperform the baseline methods
• Both methods are also very fast
Experiments: Effect of k
• The results are almost robust to changes in k
Experiments: prototype selection on synthetic data
Prototype selection
• iCAD and K-iCAD outperform many of the baseline methods:
  • In both accuracy and time
  • In both ranking-based and retaining-based approaches
Future Direction
• Apply the idea of curvature to manifold embedding, in order to propose a curvature-preserving embedding method