Learning with Marginalized Corrupted Features

L. van der Maaten, M. Chen, S. Tyree, K. Weinberger. ICML 2013.
Tea talk by Jan Gasthaus, April 11, 2013.


  1. Title slide: Learning with Marginalized Corrupted Features. L. van der Maaten, M. Chen, S. Tyree, K. Weinberger, ICML 2013. Jan Gasthaus, tea talk, April 11, 2013.

  2.–9. [Incremental builds of slide 2; no text content was extracted]

  10. Data Augmentation. “Secret 4: lots of jittering, mirroring, and color perturbation of the original images generated on the fly to increase the size of the training set.” Yann LeCun on Google+ about Alex Krizhevsky’s ImageNet results.

  11.–13. Main Idea (incremental builds). Old idea: create artificial additional training data by corrupting it with “noise”. This is one easy way to incorporate domain knowledge (e.g. possible transformations). But: additional training data ⇒ additional computation. Idea: corrupt with a known exponential-family noise distribution and integrate the noise out.
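The “create artificial training data by corruption” step can be sketched in a few lines (my illustration, not from the slides; blankout/“dropout” corruption is used, and `blankout_augment` and all parameter names are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def blankout_augment(X, q, M):
    """Explicitly augment a dataset: produce M corrupted copies of X,
    independently zeroing out each feature with probability q."""
    # Shape (M, N, D): M corrupted versions of the (N, D) data matrix.
    keep = rng.random((M,) + X.shape) > q
    return keep * X

X = np.array([[1.0, 2.0], [3.0, 4.0]])
X_aug = blankout_augment(X, q=0.5, M=3)
print(X_aug.shape)  # (3, 2, 2): three corrupted copies of X
```

Training on `X_aug` instead of `X` multiplies the cost of each pass by `M`, which is exactly the overhead the marginalization trick avoids.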

  14. Explicit vs. Implicit Corruption. Explicit corruption: take the training set $D = \{(x_n, y_n)\}_{n=1}^{N}$ and corrupt each point $M$ times:
$$\mathcal{L}(\tilde{D}, \Theta) = \frac{1}{M} \sum_{n=1}^{N} \sum_{m=1}^{M} L(\tilde{x}_{nm}, y_n, \Theta), \qquad \tilde{x}_{nm} \sim p(\tilde{x}_{nm} \mid x_n).$$

  15. Explicit vs. Implicit Corruption. Implicit corruption: minimize the expected value of the loss under $p(\tilde{x}_n \mid x_n)$:
$$\mathcal{L}(D, \Theta) = \sum_{n=1}^{N} \mathbb{E}_{p(\tilde{x}_n \mid x_n)}\big[L(\tilde{x}_n, y_n, \Theta)\big],$$
i.e. replace the empirical average over corrupted copies with the expectation.
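To make the explicit-versus-implicit contrast concrete, here is a sketch (mine, not from the slides) for the quadratic loss $(w^\top \tilde{x} - y)^2$ under blankout noise, where the expectation is available in closed form; the function and variable names are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def expected_quadratic_loss(w, x, y, q):
    """Implicit corruption: closed-form E[(w.x_tilde - y)^2] when each
    feature of x is independently zeroed with probability q (blankout)."""
    mean = (1.0 - q) * x            # E[x_tilde]
    var = q * (1.0 - q) * x ** 2    # Var[x_tilde_d], features independent
    return (w @ mean - y) ** 2 + (w ** 2) @ var

def explicit_quadratic_loss(w, x, y, q, M=200_000):
    """Explicit corruption: average the loss over M corrupted copies."""
    keep = rng.random((M, x.size)) > q
    return np.mean(((keep * x) @ w - y) ** 2)

w = np.array([0.5, -1.0, 2.0])
x = np.array([1.0, 2.0, -1.0])
y, q = 1.0, 0.3

exact = expected_quadratic_loss(w, x, y, q)    # 13.635 exactly
approx = explicit_quadratic_loss(w, x, y, q)   # agrees up to sampling error
```

The implicit version costs the same as one pass over the clean data, while the explicit version scales linearly with `M`.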

  16.–18. Wait a second... (incremental builds). This is so obvious that it must have been done before...
  - Vicinal Risk Minimization, Chapelle, Weston, Bottou, & Vapnik, NIPS 2000: explicitly considers only the case of Gaussian noise distributions.

  19.–20. Quadratic Loss [derivation slides; equations not extracted]
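The equations on the quadratic-loss slides were not extracted; the standard bias-variance decomposition that makes this loss tractable under corruption (a reconstruction, assuming a linear model $w$ and independent per-feature noise) is:

```latex
\mathbb{E}_{p(\tilde{x}_n \mid x_n)}\!\left[(w^\top \tilde{x}_n - y_n)^2\right]
  = \left(w^\top \mathbb{E}[\tilde{x}_n] - y_n\right)^2
  + w^\top V[\tilde{x}_n]\, w
```

where $V[\tilde{x}_n]$ is the covariance of the corruption distribution (diagonal when features are corrupted independently), so only the first two moments of the noise are needed.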

  21. Exponential Loss [derivation slide; equations not extracted]

  22. Logistic Loss [derivation slide; equations not extracted]
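The logistic-loss slide content was not extracted; a natural route here (a hedged reconstruction, not verbatim from the slides) is that the expectation has no closed form, but Jensen's inequality on the concave map $u \mapsto \log(1+u)$ gives a tractable upper bound:

```latex
\mathbb{E}\!\left[\log\!\left(1 + e^{-y_n w^\top \tilde{x}_n}\right)\right]
  \le \log\!\left(1 + \mathbb{E}\!\left[e^{-y_n w^\top \tilde{x}_n}\right]\right)
  = \log\!\Bigl(1 + \prod_{d} \mathbb{E}\!\left[e^{-y_n w_d \tilde{x}_{nd}}\right]\Bigr)
```

which reduces the bound to the same per-feature moment-generating-function terms as the exponential loss.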

  23. MGFs (moment-generating functions of the corruption distributions) [table slide; content not extracted]
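As a sketch of how the MGFs are used (my illustration; function names are assumptions): under independent blankout noise, $\mathbb{E}[e^{t\tilde{x}_d}] = q + (1-q)e^{t x_d}$, so the expected exponential loss factorizes over features:

```python
import numpy as np

rng = np.random.default_rng(1)

def expected_exp_loss(w, x, y, q):
    """E[exp(-y * w.x_tilde)] under independent blankout noise, computed
    exactly as a product of per-feature MGFs evaluated at t_d = -y * w_d."""
    t = -y * w
    return float(np.prod(q + (1.0 - q) * np.exp(t * x)))

def mc_exp_loss(w, x, y, q, M=200_000):
    """Monte Carlo estimate from explicitly corrupted copies, for comparison."""
    keep = rng.random((M, x.size)) > q
    return np.mean(np.exp(-y * ((keep * x) @ w)))

w = np.array([0.2, -0.5, 0.1])
x = np.array([1.0, 2.0, -3.0])
y, q = 1.0, 0.3

exact = expected_exp_loss(w, x, y, q)   # ~2.394
approx = mc_exp_loss(w, x, y, q)        # close, up to sampling error
```

Because the noise is independent across features, the D-dimensional expectation collapses into a product of D scalar MGF evaluations.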

  24.–28. Results [figure slides; content not extracted]
