
A Kernel Theory of Modern Data Augmentation
Tri Dao, Albert Gu, Alex Ratner, Virginia Smith, Chris De Sa, Chris Ré

ICML Oral, 06/11/2019 | Poster #227, Tue Jun 11, 6:30-9:00 PM


  1. A Kernel Theory of Modern Data Augmentation. Tri Dao, Albert Gu, Alex Ratner, Virginia Smith, Chris De Sa, Chris Ré

  2. Data augmentation is important to accuracy…

  3. Data augmentation is important to accuracy… • 3.7 pt. average gain across top ten CIFAR-10 models

  4. Data augmentation is important to accuracy… • 3.7 pt. average gain across top ten CIFAR-10 models • 13.9 pt. average gain for CIFAR-100

  5. Data augmentation is important to accuracy… • 3.7 pt. average gain across top ten CIFAR-10 models • 13.9 pt. average gain for CIFAR-100 • A form of weak supervision: expresses domain knowledge (invariance)

  6. … but is not well understood

  7. … but is not well understood. How does data augmentation affect the model? • Learning process • Parameters and decision surface

  8. Augmentation as sequence modeling • TANDA [Ratner et al., 2017] • AutoAugment [Cubuk et al., 2018]

  9. Augmentation as sequence modeling • TANDA [Ratner et al., 2017] • AutoAugment [Cubuk et al., 2018]. Model augmentation as a Markov chain
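The Markov-chain view on slide 9 can be illustrated with a toy sampler: each step applies a randomly chosen transformation to the current point, so the next state depends only on the current one. This is a minimal sketch, not the authors' construction; the transforms `translate` and `flip`, the function `augmentation_chain`, and the 2-D toy point are all hypothetical.

```python
import random

def translate(point, rng):
    """Hypothetical transform: shift every coordinate by one small random offset."""
    offset = rng.uniform(-0.1, 0.1)
    return tuple(c + offset for c in point)

def flip(point, rng):
    """Hypothetical transform: negate the first coordinate (a stand-in for a flip)."""
    return (-point[0],) + tuple(point[1:])

def augmentation_chain(start, transforms, steps, seed=0):
    """Sample a chain of augmented points.

    Each step picks one transform uniformly at random and applies it to the
    current state, so the sequence of states forms a Markov chain.
    """
    rng = random.Random(seed)
    state = tuple(start)
    path = [state]
    for _ in range(steps):
        transform = rng.choice(transforms)
        state = transform(state, rng)
        path.append(state)
    return path

path = augmentation_chain((1.0, 2.0), [translate, flip], steps=5)
for state in path:
    print(state)
```

Fixing the seed makes the sampled chain reproducible, which is convenient when comparing augmentation policies on the same starting point.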

  10. Augmentation as kernels. Base classifier (k-nearest neighbors) + data augmentation = asymptotic kernel classifier

  11. Effects of data augmentation on kernel classifiers

  12. Effects of data augmentation on kernel classifiers • Invariance [Figure: scatter plot of two classes of points, o and x]

  13. Effects of data augmentation on kernel classifiers • Invariance • Regularization [Figures: scatter plots of two classes of points, o and x]

  14. Effects of data augmentation on kernel classifiers • Invariance • Regularization • Practical utility [Figures: scatter plots of two classes of points, o and x]

  15. Effects of data augmentation on kernel classifiers • Invariance • Regularization • Practical utility: speeding up training [Figures: scatter plots of two classes of points, o and x]

  16. Effects of data augmentation on kernel classifiers • Invariance • Regularization • Practical utility: speeding up training; as a diagnostic [Figures: scatter plots of two classes of points, o and x]

  17. Model of data augmentation: kernel classifier. Non-augmented: $\min_w \frac{1}{n} \sum_{i=1}^{n} \ell(w^\top \phi(x_i))$, where $\ell$ is the loss function and $\phi$ is the feature map.

  18. Model of data augmentation: kernel classifier. Non-augmented: $\min_w \frac{1}{n} \sum_{i=1}^{n} \ell(w^\top \phi(x_i))$, where $\ell$ is the loss function and $\phi$ is the feature map. Augmented: $\min_w \frac{1}{n} \sum_{i=1}^{n} \mathbb{E}_{z_i \sim T(x_i)}\, \ell(w^\top \phi(z_i))$, where the $z_i$ are transformed versions of the data point $x_i$.
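The two objectives on slides 17 and 18 can be compared numerically by replacing the expectation over $T(x_i)$ with a Monte Carlo average. The sketch below is only an illustration under stated assumptions: the feature map `feature_map`, the Gaussian-noise transform `sample_transform`, and the logistic loss are hypothetical stand-ins, and, following the slide notation, labels are folded into the loss.

```python
import numpy as np

def feature_map(x):
    """Hypothetical feature map phi: raw coordinates plus their squares."""
    return np.concatenate([x, x ** 2])

def loss(margin):
    """Logistic loss log(1 + exp(-margin)), chosen only for illustration."""
    return np.logaddexp(0.0, -margin)

def sample_transform(x, rng, noise):
    """Hypothetical T(x): perturb x with Gaussian noise of scale `noise`."""
    return x + noise * rng.standard_normal(x.shape)

def plain_objective(w, X):
    """(1/n) * sum_i loss(w^T phi(x_i))."""
    return float(np.mean([loss(w @ feature_map(x)) for x in X]))

def augmented_objective(w, X, rng, noise=0.1, n_samples=200):
    """(1/n) * sum_i E_{z_i ~ T(x_i)} loss(w^T phi(z_i)), estimated by sampling."""
    per_point = []
    for x in X:
        draws = [loss(w @ feature_map(sample_transform(x, rng, noise)))
                 for _ in range(n_samples)]
        per_point.append(np.mean(draws))
    return float(np.mean(per_point))

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 2))   # toy data, not from the talk
w = rng.standard_normal(4)        # arbitrary weight vector
print(plain_objective(w, X), augmented_objective(w, X, rng))
```

With `noise=0.0` every sampled $z_i$ equals $x_i$, so the augmented objective reduces to the non-augmented one; this is a quick sanity check on the estimator.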

  19. Data augmentation effects. $\frac{1}{n} \sum_{i=1}^{n} \mathbb{E}_{z_i \sim T(x_i)}\, \ell(w^\top \phi(z_i)) \approx \frac{1}{n} \sum_{i=1}^{n} \ell(w^\top \mathbb{E}_{z_i \sim T(x_i)}\, \phi(z_i))$, where $\mathbb{E}_{z_i \sim T(x_i)}\, \phi(z_i)$ is the average of augmented features (i.e. kernel mean embedding).

  20. Data augmentation effects. $\frac{1}{n} \sum_{i=1}^{n} \mathbb{E}_{z_i \sim T(x_i)}\, \ell(w^\top \phi(z_i)) \approx \frac{1}{n} \sum_{i=1}^{n} \ell(w^\top \mathbb{E}_{z_i \sim T(x_i)}\, \phi(z_i))$, where $\mathbb{E}_{z_i \sim T(x_i)}\, \phi(z_i)$ is the average of augmented features (i.e. kernel mean embedding). • 1st order effect: induces invariance by feature averaging

  21. Data augmentation effects. $\frac{1}{n} \sum_{i=1}^{n} \mathbb{E}_{z_i \sim T(x_i)}\, \ell(w^\top \phi(z_i)) \approx \frac{1}{n} \sum_{i=1}^{n} \ell(w^\top \mathbb{E}_{z_i \sim T(x_i)}\, \phi(z_i))$, where $\mathbb{E}_{z_i \sim T(x_i)}\, \phi(z_i)$ is the average of augmented features (i.e. kernel mean embedding). • 1st order effect: induces invariance by feature averaging • 2nd order effect: reduces model complexity via a data-dependent regularization
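One way to see where the first- and second-order effects come from is a Taylor expansion of the loss around the averaged features. The derivation below is a sketch that assumes $\ell$ is twice differentiable; the symbols $\psi(x_i)$ for the averaged features and $\Sigma(x_i)$ for the feature covariance are introduced here for illustration.

```latex
% Averaged features and feature covariance under the augmentation T(x_i)
\psi(x_i) = \mathbb{E}_{z_i \sim T(x_i)}\, \phi(z_i), \qquad
\Sigma(x_i) = \mathbb{E}_{z_i \sim T(x_i)}
  \big[ (\phi(z_i) - \psi(x_i)) (\phi(z_i) - \psi(x_i))^\top \big]

% Second-order Taylor expansion of the loss around w^\top \psi(x_i)
\ell(w^\top \phi(z_i)) \approx \ell(w^\top \psi(x_i))
  + \ell'(w^\top \psi(x_i))\, w^\top (\phi(z_i) - \psi(x_i))
  + \tfrac{1}{2}\, \ell''(w^\top \psi(x_i))\, \big( w^\top (\phi(z_i) - \psi(x_i)) \big)^2

% Taking the expectation: the linear term vanishes because
% \mathbb{E}[\phi(z_i) - \psi(x_i)] = 0
\mathbb{E}_{z_i \sim T(x_i)}\, \ell(w^\top \phi(z_i)) \approx
  \underbrace{\ell(w^\top \psi(x_i))}_{\text{feature averaging}}
  + \underbrace{\tfrac{1}{2}\, \ell''(w^\top \psi(x_i))\, w^\top \Sigma(x_i)\, w}_{\text{data-dependent regularization}}
```

The first term is the loss evaluated at the averaged features, and the second term penalizes weight directions along which the augmented features vary, which is why it acts as a regularizer that depends on the data.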

  22. A diagnostic: kernel alignment metric. Averaged features: $\psi(x) = \mathbb{E}_{z \sim T(x)}\, \phi(z)$. Kernel target alignment [Cristianini et al., 2002]: how well separated are features from different classes
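A rough sketch of how this diagnostic could be computed is given below. It uses the standard kernel target alignment for binary labels $y \in \{-1, +1\}^n$, $\langle K, yy^\top \rangle_F / (\lVert K \rVert_F \, \lVert yy^\top \rVert_F)$, with $K$ built from Monte Carlo estimates of the averaged features. The identity feature map, the Gaussian-noise transform, and the toy data are assumptions made for illustration, not the setup used in the talk.

```python
import numpy as np

def averaged_features(X, feature_map, sample_transform, rng, n_samples=50):
    """Estimate psi(x) = E_{z ~ T(x)} phi(z) for each row of X by sampling."""
    return np.stack([
        np.mean([feature_map(sample_transform(x, rng)) for _ in range(n_samples)],
                axis=0)
        for x in X
    ])

def kernel_target_alignment(K, y):
    """Alignment between kernel matrix K and the ideal kernel y y^T (labels +/-1)."""
    Y = np.outer(y, y)
    # np.linalg.norm on a 2-D array defaults to the Frobenius norm.
    return float(np.sum(K * Y) / (np.linalg.norm(K) * np.linalg.norm(Y)))

rng = np.random.default_rng(0)
X = np.array([[2.0, 0.0], [2.1, 0.0], [-2.0, 0.0], [-2.1, 0.0]])  # toy data
y = np.array([1, 1, -1, -1])

identity_map = lambda x: x                                        # hypothetical phi
jitter = lambda x, rng: x + 0.05 * rng.standard_normal(x.shape)   # hypothetical T(x)

Psi = averaged_features(X, identity_map, jitter, rng)
K = Psi @ Psi.T               # kernel induced by the averaged features
alignment = kernel_target_alignment(K, y)
print(alignment)
```

An alignment close to 1 means the kernel matrix is nearly proportional to $yy^\top$, i.e. same-class points have similar features and different-class points have dissimilar ones.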

  23. A diagnostic: kernel alignment metric. [Figure: kernel alignment plots, MNIST]

  24. A diagnostic: kernel alignment metric. [Figure: kernel alignment plots, MNIST] Kernel alignment correlates with accuracy.

  25. Summary • Data augmentation + k-NN = asymptotic kernel classifier. • Data augmentation induces invariance and regularizes. • Application in speeding up training and diagnostics. Tri Dao, trid@stanford.edu. Poster #227 on Tuesday, Jun 11th at 6:30pm
