Energy-Aware Neural Architecture Optimization With Splitting Steepest Descent
Splitting yields adaptive net structure optimization
Questions
• Why splitting?
• What neurons should be split first?
• How to split a neuron optimally?
Intuition: escaping local minima
‣ Splitting θ into m copies
[Figure labels: SGD, saddle point?, local minima]
‣ Smooth loss change:
‣ A simple network:
Splitting Steepest Descent
‣ How to choose m and {θ_i, w_i} optimally?
‣ Splitting index: the minimum eigenvalue of the splitting matrix
‣ Optimal splitting strategy
[Figure: θ is either left unsplit ("no splitting") or split into two copies θ_1 and θ_2]
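The splitting rule on this slide can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `split_neuron`, the step size `eps`, the equal weights of 1/2, and the toy splitting matrix are all assumptions made for the example. The slide itself only states that the splitting index is the minimum eigenvalue of the splitting matrix, and that a neuron is either left unsplit or split into two copies.

```python
import numpy as np

def split_neuron(theta, splitting_matrix, eps=0.01):
    """Return (splitting_index, [(weight, copy), ...]) for one neuron.

    Assumption: a negative minimum eigenvalue means splitting can lower
    the loss, and the two copies move in opposite directions along the
    corresponding eigenvector with equal weights.
    """
    # Symmetrize to guard against small numerical asymmetry.
    S = 0.5 * (splitting_matrix + splitting_matrix.T)
    eigvals, eigvecs = np.linalg.eigh(S)  # eigenvalues in ascending order
    splitting_index = float(eigvals[0])
    if splitting_index >= 0:
        # No descent direction from splitting: keep the neuron as it is.
        return splitting_index, [(1.0, theta.copy())]
    v = eigvecs[:, 0]  # eigenvector of the minimum eigenvalue
    return splitting_index, [(0.5, theta + eps * v), (0.5, theta - eps * v)]

# Toy example with a hypothetical 2x2 splitting matrix.
theta = np.array([1.0, 1.0])
S = np.diag([1.0, -2.0])
index, copies = split_neuron(theta, S, eps=0.1)
```

Because the two copies are placed symmetrically around `theta` with weights summing to one, their weighted average equals the original parameter, so the split starts as a small perturbation of the unsplit network.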
Our Algorithm
[Figure: candidate neurons annotated with (gain, flops) values (-0.2, 12), (-0.9, 3), (-0.1, 1), (-0.2, 1), (-1.2, 4), selected under a flops budget]
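One way to read the gain, flops, and budget labels on this slide is as a knapsack problem: pick the neurons whose splitting gives the largest total loss decrease while the added flops stay within the budget. The sketch below is written under that assumption. The function name `select_neurons_to_split`, the pairing of the numbers into (gain, flops) tuples, and the budget of 8 are hypothetical, since the slide gives no budget value, and the exact dynamic program is chosen for clarity rather than taken from the source.

```python
def select_neurons_to_split(gains, flops, budget):
    """0/1 knapsack over integer flops costs.

    gains[i] is the (negative) loss change from splitting neuron i,
    flops[i] is the integer cost of that split.
    Returns (total loss decrease, indices of neurons to split).
    """
    # best[b] = (total decrease, chosen indices) using at most b flops
    best = [(0.0, ())] * (budget + 1)
    for i, (gain, cost) in enumerate(zip(gains, flops)):
        decrease = -gain
        if decrease <= 0:
            continue  # splitting this neuron would not reduce the loss
        # Iterate budgets downward so each neuron is used at most once.
        for b in range(budget, cost - 1, -1):
            candidate = best[b - cost][0] + decrease
            if candidate > best[b][0]:
                best[b] = (candidate, best[b - cost][1] + (i,))
    return best[budget]

# Values read off the slide; the budget is a hypothetical choice.
gains = [-0.2, -0.9, -0.1, -0.2, -1.2]
flops = [12, 3, 1, 1, 4]
total_decrease, chosen = select_neurons_to_split(gains, flops, budget=8)
```

With these inputs the first neuron is excluded because its cost alone exceeds the budget, and the selection trades the cheap low-gain neuron against the others to fill the remaining budget.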
Image Classification Results using MobileNetV1
[Figure: comparison of Pruning (Bn) and Splitting (ours)]