Sever: A Robust Meta-Algorithm for Stochastic Optimization
Ilias Diakonikolas (USC), Gautam Kamath (Waterloo), Daniel M. Kane (UCSD), Jerry Li (MSR AI), Jacob Steinhardt (Berkeley), Alistair Stewart (USC) (alphabetical order)
DEFENDING AGAINST DATA POISONING

Main question: can you learn a good classifier from poisoned training data? Given a labeled training set where an (unknown) ε-fraction of the examples are adversarially corrupted, can we learn a model that achieves good accuracy on a clean test set?

Example: training an SVM with 3% poisoned data. Against known defenses, the test error can go up to 30%! [Koh-Steinhardt-Liang '18]

Lots of work on related problems: [Barreno-Nelson-Joseph-Tygar'10, Nasrabadi-Tran-Nguyen'11, Biggio-Nelson-Laskov'12, Nguyen-Tran'13, Newell-Potharaju-Xiang-Nita-Rotaru'14, Bhatia-Jain-Kar'15, Diakonikolas-Kamath-Kane-Li-Moitra-Stewart'16, Bhatia-Jain-Kamalaruban-Kar'17, Balakrishnan-Du-Li-Singh'17, Charikar-Steinhardt-Valiant'17, Steinhardt-Koh-Liang'17, Koh-Liang'17, Prasad-Suggala-Balakrishnan-Ravikumar'18, Diakonikolas-Kong-Stewart'18, Klivans-Kothari-Meka'18, Koh-Steinhardt-Liang'18, ...]
OUR RESULTS

We present a framework for robust stochastic optimization:
• Strong theoretical guarantees against strong adversarial models
• Outperforms benchmark defenses on state-of-the-art data poisoning attacks
• Works well in high dimensions
• Works with black-box access to any learner for any stochastic optimization task
SEVER

Idea: until termination,
1. run a black-box learner to find an approximate minimizer of the empirical risk on the (corrupted) training set, then
2. run an outlier-detection method (the filter) on the gradients of the per-point losses at that approximate minimizer, and remove the suspected outliers.
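To make the loop concrete, here is a minimal sketch of this outer loop in Python. The helper names (`train_learner`, `per_point_gradients`, `filter_outliers`) are hypothetical stand-ins for the black-box learner, the per-point gradient computation, and the filtering step sketched later; they are not the paper's code.

```python
import numpy as np

def sever(X, y, train_learner, per_point_gradients, filter_outliers, rounds=4):
    """Sketch of the Sever outer loop; the helper functions are user-supplied stand-ins."""
    active = np.arange(len(y))                       # indices of points still trusted
    theta = None
    for _ in range(rounds):
        # Step 1: black-box learner finds an approximate ERM on the current set.
        theta = train_learner(X[active], y[active])
        # Step 2: filter the per-point gradients at that point, drop suspected outliers.
        grads = per_point_gradients(theta, X[active], y[active])
        keep = filter_outliers(grads)
        if len(keep) == len(active):                 # filter certified: nothing to remove
            break
        active = active[keep]
    return theta
```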
FILTERING AND ROBUST MEAN ESTIMATION

How should we detect outliers from the gradients? We exploit a novel connection to robust mean estimation.

Filtering [DKKLMS16, DKKLMS17]: given points X_1, ..., X_n drawn from a "nice" distribution, where an ε-fraction are corrupted, there is a linear-time algorithm which either:
1. certifies that the true mean is close to the empirical mean of the corrupted dataset, or
2. removes more bad points than good points.

Applied to the gradients of the losses at the empirical risk minimizer, the filter either:
1. certifies that the true gradient of the loss function is close to 0 (i.e., the current parameters are an approximate critical point of the true risk), or
2. removes more bad points than good points.
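As a rough illustration of one filtering step on gradients, the sketch below scores each point by its squared projection onto the direction of largest gradient variance and removes a small fraction of the highest-scoring points; the "certify" test and the removal rule here are simplified stand-ins for the paper's actual thresholds, and the parameter values are made up.

```python
import numpy as np

def filter_outliers(grads, remove_frac=0.01, certify_ratio=4.0):
    """One spectral filtering step on per-point gradients (simplified sketch).

    grads: (n, d) array with the gradient of each point's loss at the current parameters.
    Returns the indices of points to keep.
    """
    n = len(grads)
    centered = grads - grads.mean(axis=0)
    # Top right-singular vector = direction along which the gradients vary the most.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    scores = (centered @ vt[0]) ** 2                 # squared projections onto that direction
    top_var = scores.mean()                          # variance along the top direction
    avg_var = centered.var(axis=0).mean()            # average per-coordinate variance
    if top_var <= certify_ratio * avg_var:           # illustrative "certify" condition:
        return np.arange(n)                          # gradients look clean, keep everything
    n_remove = max(1, int(remove_frac * n))          # otherwise drop the most suspicious points
    return np.sort(np.argsort(scores)[: n - n_remove])
```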
GUARANTEES

Theorem (informal): Suppose f_1(θ), ..., f_n(θ) are drawn from a distribution over convex functions f with Cov[∇f(θ)] ⪯ σ²I, where an ε-fraction of them are adversarial. Under mild assumptions, given enough samples, SEVER outputs a θ̂ so that w.h.p.

    f̄(θ̂) − min_θ f̄(θ) ≤ O(σ√ε).

• Can also give results for non-convex objectives
• Sample complexity / runtime are polynomial but not super tight
• For GLMs (e.g. SVM, regression), we obtain tight(er) bounds
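For a concrete GLM-style instance, here is how the two sketches above might be wired together for least-squares regression; the dataset, the crude label poisoning, and the learner below are all made up for illustration and are not the paper's experimental setup.

```python
import numpy as np

def train_learner(X, y):
    # Ordinary least squares as the black-box ERM step.
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return theta

def per_point_gradients(theta, X, y):
    # Gradient of the per-point squared loss (x^T theta - y)^2 / 2.
    return (X @ theta - y)[:, None] * X

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))
y = X @ rng.normal(size=20) + 0.1 * rng.normal(size=1000)
y[:30] += 50.0                                   # crude stand-in for a 3% label-poisoning attack
theta_hat = sever(X, y, train_learner, per_point_gradients, filter_outliers)
```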
EMPIRICAL EVALUATION: REGRESSION
EMPIRICAL EVALUATION: SVM
CONCLUSIONS

Main question: can you learn a good classifier from poisoned data?
Sever is a meta-algorithm for robust stochastic optimization, based on connections to robust mean estimation.
Interested? See poster #143 this evening!