
PareCO: Pareto-aware Channel Optimization for Slimmable Neural Networks



  1. PareCO: Pareto-aware Channel Optimization for Slimmable Neural Networks. Ting-Wu (Rudy) Chin, Ari S. Morcos, Diana Marculescu

  2. Slimmable Neural Networks. [Figure: error vs. #FLOPs trade-off curve.] One set of weights, multiple networks on the trade-off front!

  3. Why Slimmable Neural Networks?
     • Reduce model maintenance cost
     • Runtime optimization

  4. The Gap

  5. How can we optimize slimmable neural networks with flexible widths? [Figure: two error vs. #FLOPs plots showing the trade-off induced by a slimmable network; plot labels: α, θ and α*.]

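  6. The objective of our problem:
     $$\min_{\theta} \; \mathbb{E}_{x,y} \, \mathbb{E}_{\lambda} \, L_{CE}(\theta; x, y, \alpha^*) \quad \text{s.t.} \quad \alpha^* = \arg\min_{\alpha} T_{\lambda}(\alpha; \theta, x, y)$$
     where $T_{\lambda}$ is the augmented Tchebyshev scalarization.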

  7. ImageNet: comparison with conventional slimmable neural networks. [Figure panels: MobileNetV2, MobileNetV3.]

  8. Takeaways
     • Optimizing the layer-wise channel counts of the sub-networks in a slimmable neural network allows for a better trade-off between prediction error and FLOPs.
     • This work provides a principled formulation and a practical algorithm for optimizing the layer-wise channel counts of slimmable neural networks.
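
Slide 6 names the augmented Tchebyshev scalarization as the function used to pick the widths α* for a given preference λ. The sketch below illustrates the general idea with one common textbook form of that scalarization: a weighted maximum over the objectives plus a small weighted-sum term. It is not taken from the slides. The function name, the `rho` value, the zero ideal point, and the three candidate configurations with their (error, FLOPs) values are all hypothetical, and the exact variant used in PareCO may differ.

```python
def augmented_tchebyshev(objectives, weights, ideal=None, rho=0.05):
    """Collapse several objectives (all to be minimized) into one scalar.

    Assumed textbook form (not confirmed by the slides):
        max_i w_i * |f_i - z_i|  +  rho * sum_i w_i * |f_i - z_i|
    where z is an ideal (reference) point and rho is a small constant.
    """
    if ideal is None:
        ideal = [0.0] * len(objectives)  # assumption: ideal point at the origin
    terms = [w * abs(f - z) for f, w, z in zip(objectives, weights, ideal)]
    return max(terms) + rho * sum(terms)


# Hypothetical width configurations with made-up, normalized
# (prediction error, FLOPs) values; both objectives are minimized.
candidates = {
    "narrow": (0.40, 0.20),
    "medium": (0.30, 0.50),
    "wide": (0.25, 1.00),
}


def best_config(weights):
    """Return the candidate that minimizes the scalarized objective."""
    return min(
        candidates,
        key=lambda name: augmented_tchebyshev(candidates[name], weights),
    )


if __name__ == "__main__":
    # Different preference vectors (lambda) select different points
    # on the error vs. FLOPs trade-off.
    for lam in [(0.9, 0.1), (0.6, 0.4), (0.1, 0.9)]:
        print(lam, "->", best_config(lam))
```

Sweeping the preference vector selects different configurations along the trade-off, which is the role λ plays in the objective on slide 6. In this toy setting the candidates are a fixed list. In the slides, α* is instead found by minimizing T_λ over the layer-wise channel counts of the slimmable network.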
