
Breaking Inter-Layer Co-Adaptation by Classifier Anonymization

ICML 2019. Ikuro Sato (1), Kohta Ishikawa (1), Guoqing Liu (1), Masayuki Tanaka (2). (1) Denso IT Laboratory, Inc., Japan; (2) National Institute of Advanced Industrial Science and Technology, Japan.


  1. ICML 2019. Breaking Inter-Layer Co-Adaptation by Classifier Anonymization. Ikuro Sato (1), Kohta Ishikawa (1), Guoqing Liu (1), Masayuki Tanaka (2). (1) Denso IT Laboratory, Inc., Japan; (2) National Institute of Advanced Industrial Science and Technology, Japan. I. Sato, et al., Breaking Inter-Layer Co-Adaptation by Classifier Anonymization, ICML 2019.

  2. Summary first. About what? Breaking co-adaptation between the feature extractor and the classifier. How? By the classifier anonymization technique. Theory? Proved: features form a simple point-like distribution. In reality? The point-like property is largely confirmed on real datasets.

  3. The end-to-end (E2E) optimization scheme flourishes. Is it always good? E2E optimization: $\phi^\star, \theta^\star = \arg\min_{\phi,\theta} \frac{1}{|\mathcal{D}|} \sum_{(x,t)\in\mathcal{D}} L(C_\theta(F_\phi(x)), t)$. Pipeline: the input $x$ passes through the DNN feature extractor to give $F_\phi(x)$, then through the classifier to give $C_\theta(F_\phi(x))$, and the loss with target $t$ is $L(C_\theta(F_\phi(x)), t)$. The feature extractor $F_{\phi^\star}$ adapts to a particular classifier $C_\theta$. Toy example (2-class regression): features may form an excessively complex distribution (disjointed, split). [Figure: feature space, feature dim-1 vs. feature dim-2; color shows the $C_\theta$ value, from -1 to +1.]

  4. FOCA: Feature-extractor Optimization through Classifier Anonymization. FOCA: $\phi^\star = \arg\min_{\phi} \frac{1}{|\mathcal{D}|} \sum_{(x,t)\in\mathcal{D}} \mathbb{E}_{\theta\sim\Theta_\phi} L(C_\theta(F_\phi(x)), t)$. Random weak classifier: $\theta \sim \Theta_\phi$. Want to know more about $\Theta_\phi$? Please come to the poster! The feature extractor $F_{\phi^\star}$ adapts to a set of weak classifiers $C_\theta$. Features form a simple point-like distribution per class under some conditions. [Figure: feature space, feature dim-1 vs. feature dim-2.]

  5. Proposition about the point-like property. In words: if the feature extractor has enough representation ability, all input data of the same class are projected to a single point in the feature space in a class-separable way, under certain conditions. Please see the paper for the proof.

  6. Toy problem demonstration. [Figure: x-axis feature dim. #1, y-axis feature dim. #2; legend: data used to generate classifier, decision boundary; labels: start, end.] A small-batch classifier works as a weak classifier for the entire dataset. Small perturbations lead to a point-like distribution.

  7. Experiment #1: partial-dataset training. What we wish to confirm: do the full-dataset classifier and the partial-dataset classifier perform similarly for a given $F_{\phi^\star}$?

  8. Experiment #1: partial-dataset training. CIFAR-10 test error rates. [Figure: test error for a classifier trained with a large dataset vs. a classifier trained with a small dataset.] The performance gap is large for other methods and much smaller for FOCA, which is one indication of the point-like property. (The same, fixed feature extractor is used within each method.)

  9. More experiments, including: approximate geodesic distance measurements between large- and small-dataset solutions; low-dimensional analyses to further study the point-like property.

  10. Poster #28 tonight. What? Breaking co-adaptation between the feature extractor and the classifier. How? By classifier anonymization. Theory? Proved: features form a simple point-like distribution. Reality? The point-like property is largely confirmed on real datasets.
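
As a rough illustration of the idea on slides 4 and 6, the sketch below estimates the expected loss over random weak classifiers for a fixed set of features, where each weak classifier is fit on a small batch. It is not the authors' implementation: the slides do not specify how the weak classifiers are drawn, so the ridge-regression classifier, the squared loss, the class-balanced batches, the Gaussian toy features, and all function names here are assumptions made for illustration. It only evaluates the objective for given features and does not optimize the feature extractor.

```python
import numpy as np

def add_bias(feats):
    """Append a constant 1 column so the linear classifier has a bias term."""
    return np.hstack([feats, np.ones((len(feats), 1))])

def fit_weak_classifier(feats, targets, ridge=1e-3):
    """Ridge-regression linear classifier fit on one small batch (assumed form)."""
    X = add_bias(feats)
    A = X.T @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ targets)

def foca_loss_estimate(feats, targets, batch_size=8, n_classifiers=50, seed=0):
    """Monte Carlo estimate of the expected squared loss on the whole dataset,
    averaged over weak classifiers fit on random small batches."""
    rng = np.random.default_rng(seed)
    X = add_bias(feats)
    pos = np.flatnonzero(targets > 0)
    neg = np.flatnonzero(targets < 0)
    half = batch_size // 2
    losses = []
    for _ in range(n_classifiers):
        # Class-balanced small batch (an assumption, so each batch sees both classes).
        idx = np.concatenate([rng.choice(pos, half, replace=False),
                              rng.choice(neg, half, replace=False)])
        theta = fit_weak_classifier(feats[idx], targets[idx])
        losses.append(np.mean((X @ theta - targets) ** 2))
    return float(np.mean(losses))

def toy_features(noise, n_per_class=100, seed=1):
    """Hypothetical 2-D features: class means at (+1, 0) and (-1, 0) plus Gaussian noise."""
    rng = np.random.default_rng(seed)
    targets = np.concatenate([np.ones(n_per_class), -np.ones(n_per_class)])
    means = np.stack([targets, np.zeros_like(targets)], axis=1)
    return means + noise * rng.standard_normal(means.shape), targets

point_feats, t = toy_features(noise=0.01)   # nearly point-like per class
spread_feats, _ = toy_features(noise=1.0)   # widely spread per class
point_loss = foca_loss_estimate(point_feats, t)
spread_loss = foca_loss_estimate(spread_feats, t)
print(f"point-like: {point_loss:.4f}, spread: {spread_loss:.4f}")
```

In this toy setting the nearly point-like features give a much lower estimated loss than the spread-out features, because any classifier fit on a handful of samples already generalizes to the rest of the class. This matches the intuition on the slides that a feature extractor trained against many weak classifiers is pushed toward point-like per-class features.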
