
Minimax Statistical Learning with Wasserstein distances

Jaeho Lee & Maxim Raginsky, NeurIPS 2018, Poster #86


  1. Minimax Statistical Learning with Wasserstein distances. Jaeho Lee & Maxim Raginsky. NeurIPS 2018, Poster #86.

  5. “Minimax” learning
     Goal: find the hypothesis minimizing the worst-case risk … where Γ(P, ϱ) is an ambiguity set representing uncertainty, e.g.
     - domain drift (mismatch of training & test distribution)
     - adversarial attack (enhancing robustness of hypothesis)
     Approach: find the hypothesis minimizing the empirical risk …
     Question: what is the speed of convergence …?
     [Figure: ambiguity ball, labeled P and ϱ]
     Focus on the 1-Wasserstein ambiguity ball! (We have results for p-Wasserstein balls, too! See Poster #86.)
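The slide formulas did not survive in this transcript. As a hedged sketch only, one standard way to write the worst-case risk over a 1-Wasserstein ball and the corresponding empirical minimizer is shown below. The symbols Z, f, F, W_1, P_n, and R_ϱ are assumed notation and may differ from what the slides use.

```latex
% Assumed notation (not taken from the transcript):
%   Z   : a data point drawn from a distribution Q
%   f   : a hypothesis, identified with its loss function, in a class \mathcal{F}
%   W_1 : the 1-Wasserstein distance
%   P_n : the empirical distribution of n training samples
\[
  R_{\varrho}(P, f) \;=\; \sup_{Q \in \Gamma(P,\varrho)} \mathbb{E}_{Q}\bigl[f(Z)\bigr],
  \qquad
  \Gamma(P,\varrho) \;=\; \bigl\{\, Q : W_1(P, Q) \le \varrho \,\bigr\}
\]
\[
  \hat{f} \;\in\; \operatorname*{arg\,min}_{f \in \mathcal{F}} \; R_{\varrho}(P_n, f)
\]
```

Under this reading, the "speed of convergence" question asks how quickly the worst-case risk of the empirical minimizer approaches the best achievable worst-case risk in the class as n grows.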

  9. Taming the supremum
     Main challenge is to handle the supremum.
     Trick:
     (1) write down the dual form
     (2) empirical risk minimization is now joint minimization
     (3) gauge the complexity of the “set of all possible …”
     With high probability, …
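To make step (1) concrete, here is a minimal Python sketch of the standard dual form for a 1-Wasserstein ball, `min over λ ≥ 0 of λ·ϱ + mean_i max_j (f(z'_j) − λ·d(z_i, z'_j))`. It is not the authors' code. It assumes a finite set of candidate points z'_j and a grid search over λ, both introduced here purely for illustration, and the function name `dual_worst_case_risk` is hypothetical.

```python
def dual_worst_case_risk(losses, cost, rho, lambdas):
    """Evaluate the dual form of the worst-case risk on a finite support.

    losses:  list of f(z'_j), the loss at each candidate point z'_j
    cost:    cost[i][j] = d(z_i, z'_j), distance from sample i to candidate j
    rho:     radius of the 1-Wasserstein ambiguity ball
    lambdas: iterable of candidate dual variables (each >= 0); a grid is an
             assumption of this sketch, not part of the method itself
    Returns (best dual value found on the grid, the lambda attaining it).
    """
    n = len(cost)
    best_value, best_lambda = float("inf"), None
    for lam in lambdas:
        # phi_i = max_j ( f(z'_j) - lam * d(z_i, z'_j) ), averaged over samples
        phi_mean = sum(
            max(f_j - lam * d_ij for f_j, d_ij in zip(losses, row))
            for row in cost
        ) / n
        value = lam * rho + phi_mean
        if value < best_value:
            best_value, best_lambda = value, lam
    return best_value, best_lambda


# Toy example: three sample points on the real line, which also serve as
# the candidate points z'_j; the loss values are made up for illustration.
points = [0.0, 1.0, 2.0]
losses = [1.0, 0.0, 3.0]
cost = [[abs(a - b) for b in points] for a in points]
lambdas = [k / 10 for k in range(0, 101)]

for rho in (0.0, 0.5, 1.0):
    value, lam = dual_worst_case_risk(losses, cost, rho, lambdas)
    print(f"rho={rho}: worst-case risk ~ {value:.4f} at lambda={lam}")
```

Step (2) then corresponds to minimizing this same objective jointly over the hypothesis and λ, rather than over λ alone for a fixed hypothesis.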

  10. Result
      Theorem) Under mild assumptions, with high probability, …
      - vanishes to 0 as the sample size grows
      - does not require Lipschitz-type assumptions on f
      - similar procedure could be applied for any ambiguity set with suitable dual form
      Come to poster #86 for…
      - applications to domain adaptation
      - complementary generalization bound recovering classic bound as …
      - results on p-Wasserstein balls
