Efficient On-Device Models using Neural Projections
Sujith Ravi (@ravisujith, http://www.sravi.org)
ICML 2019
Motivation: tiny Neural Networks running on device vs. big Neural Networks running on cloud
Why on-device?
● User Privacy
● Limited Connectivity
● Efficient Computing
● Consistent Experience
On-Device ML in Practice
● Image Recognition on your mobile phone
● Smart Reply on your Android watch
Blogs: "Custom On-Device ML Models with Learn2Compress", Sujith Ravi; "On-Device Conversation Modeling with TensorFlow Lite", Sujith Ravi; "On-Device Machine Intelligence", Sujith Ravi
Challenges for Running ML on Tiny Devices
➡ Hardware constraints — computation, memory, energy-efficiency
➡ Robust quality — difficult to achieve with small models
➡ Complex model architectures for inference
➡ Inference is challenging — structured prediction, high dimensionality, large output spaces
● Previous work: model compression
➡ techniques like dictionary encoding, feature hashing, quantization, …
➡ performance degrades with dimensionality, vocabulary size & task complexity
Can We Do Better?
● Build on-device neural networks that
➡ are small in size
➡ are very efficient
➡ can reach (near) state-of-the-art performance
Learn Efficient Neural Nets for On-Device ML
● Learning (on cloud): Data (x, y) + projection model architecture (efficient, customizable nets) → Projection Neural Network
● Inference (on device): optimized NN model, ready-to-use on device
● Our work: Efficient, Generalizable Deep Networks using Neural Projections
➡ Small size → compact nets, multi-sized
➡ Fast → low latency
➡ Fully supported inference → TF / TFLite / custom
Projection Neural Networks (architecture, bottom to top):
● Intermediate feature layer (sparse or dense vector)
● Dynamically generated projection layer
● Fully connected layer
Efficient Representations via Projections
● Transform inputs x_i using T projection functions P_1, ..., P_T:
  x_i^p = [ P_1(x_i), ..., P_T(x_i) ]
● Projection transformations (matrix) pre-computed using parameterized functions
➡ Compute projections efficiently using a modified version of Locality Sensitive Hashing (LSH)
Locality Sensitive ProjectionNets
● Use randomized projections (repeated binary hashing) as projection operations
➡ Similar inputs or intermediate network layers are grouped together and projected to nearby projection vectors
➡ Projections generate compact bit (0/1) vector representations
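The repeated-binary-hashing idea can be illustrated with random-hyperplane LSH: each of the T projection functions maps the input to d sign bits, and similar inputs agree on most bits. This is a minimal numpy sketch under assumed shapes and function names (the paper uses a modified, seed-derived LSH scheme, not exactly this):

```python
import numpy as np

def make_projections(input_dim, T, d, seed=0):
    # T random hyperplane matrices, each of shape (d, input_dim).
    # In practice these can be regenerated on the fly from a seed,
    # so they need not be stored in the model (tiny footprint).
    rng = np.random.default_rng(seed)
    return [rng.standard_normal((d, input_dim)) for _ in range(T)]

def project(x, planes):
    # Repeated binary hashing: each projection yields d sign bits,
    # so similar inputs land on nearby bit vectors (LSH property).
    bits = [(P @ x >= 0).astype(np.uint8) for P in planes]
    return np.concatenate(bits)  # compact bit vector of length T*d

# Similar inputs share most projection bits:
planes = make_projections(input_dim=100, T=8, d=10)
x = np.random.default_rng(1).standard_normal(100)
x_noisy = x + 0.01 * np.random.default_rng(2).standard_normal(100)
b1, b2 = project(x, planes), project(x_noisy, planes)
agreement = (b1 == b2).mean()  # fraction of matching bits, close to 1.0
```

The bit vector replaces the raw (possibly high-dimensional) input as the representation fed to subsequent layers.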
Generalizable, Projection Neural Networks
● Stack projections, combine with other operations & non-linearities to create a family of efficient, projection deep networks
➡ Projection + Dense
➡ Projection + Convolution
➡ Projection + Recurrent
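To make the stacking concrete, here is a minimal numpy sketch of a "Projection + Dense" variant: fixed random-hyperplane projections produce T*d bits, followed by a small trainable dense stack. Shapes, names, and initialization are illustrative assumptions, not the paper's exact configuration:

```python
import numpy as np

rng = np.random.default_rng(0)
INPUT_DIM, T, D, HIDDEN, CLASSES = 100, 60, 10, 128, 10

# Projection layer: fixed, seed-derived hyperplanes (no stored parameters).
planes = rng.standard_normal((T * D, INPUT_DIM))

# Trainable dense layers on top of the projection bits.
W1 = rng.standard_normal((HIDDEN, T * D)) * 0.01
b1 = np.zeros(HIDDEN)
W2 = rng.standard_normal((CLASSES, HIDDEN)) * 0.01
b2 = np.zeros(CLASSES)

def forward(x):
    bits = (planes @ x >= 0).astype(np.float32)  # dynamically generated projection layer
    h = np.maximum(0.0, W1 @ bits + b1)          # dense + ReLU
    return W2 @ h + b2                           # class logits

x = rng.standard_normal(INPUT_DIM)
logits = forward(x)
```

Swapping the dense stack for convolutional or recurrent layers yields the Projection + Convolution and Projection + Recurrent variants.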
Family of Efficient Projection Neural Networks
● ProjectionNet (Ravi, 2017) arxiv/abs/1708.00630
● ProjectionCNN (Ravi, ICML 2019)
● SGNN: Self-Governing Neural Networks (Ravi & Kozareva, EMNLP 2018)
● SGNN++: Hierarchical, Partitioned Projections (Ravi & Kozareva, ACL 2019)
● Transferable Projection Networks (Sankar, Ravi & Kozareva, NAACL 2019)
● + … upcoming
ProjectionNets, ProjectionCNNs for Vision Tasks

Table 1. Classification results (precision@1) for vision tasks using Neural Projection Nets and baselines.

| Model | Compression Ratio (wrt baseline) | MNIST | Fashion-MNIST | CIFAR-10 |
|---|---|---|---|---|
| NN (3-layer) (Baseline: feed-forward) | 1 | 98.9 | 89.3 | - |
| CNN (5-layer) (Baseline: convolutional) (Figure 2, Left) | 0.52* | 99.6 | 93.1 | 83.7 |
| Random Edge Removal (Ciresan et al., 2011) | 8 | 97.8 | - | - |
| Low Rank Decomposition (Denil et al., 2013) | 8 | 98.1 | - | - |
| Compressed NN (3-layer) (Chen et al., 2015) | 8 | 98.3 | - | - |
| Compressed NN (5-layer) (Chen et al., 2015) | 8 | 98.7 | - | - |
| Dark Knowledge (Hinton et al., 2015; Ba & Caruana, 2014) | - | 98.3 | - | - |
| HashNet (best) (Chen et al., 2015) | 8 | 98.6 | - | - |
| NASNet-A (7 cells, 400k steps) (Zoph et al., 2018) | - | - | - | 90.5 |
| ProjectionNet, Joint (trainer = NN) [T=8, d=10] (our approach) | 3453 | 70.6 | - | - |
| ProjectionNet [T=10, d=12] | 2312 | 76.9 | - | - |
| ProjectionNet [T=60, d=10] | 466 | 91.1 | - | - |
| ProjectionNet [T=60, d=12] | 388 | 92.3 | - | - |
| ProjectionNet [T=60, d=10] + FC [128] | 36 | 96.3 | - | - |
| ProjectionNet [T=60, d=12] + FC [256] | 15 | 96.9 | - | - |
| ProjectionNet [T=70, d=12] + FC [256] | 13 | 97.1 | 86.6 | - |
| ProjectionCNN (4-layer), Joint (trainer = CNN) (our approach) (Figure 2, Right) | 8 | 99.4 | 92.7 | 78.4 |
| ProjectionCNN (6-layer) (Conv3-64, Conv3-128, Conv3-256, P [T=60, d=7], FC [128 x 256]), Self (trainer = None) | 4 | - | - | 82.3 |
| ProjectionCNN (6-layer), Joint (trainer = NASNet) | 4 | - | - | 84.7 |

● Efficient wrt compute/memory while maintaining high quality
ProjectionNets for Language Tasks

Text classification results (precision@1):

| Model | Compression (wrt RNN) | Smart Reply | ATIS Intent |
|---|---|---|---|
| Random (Kannan et al., 2016) | - | 5.2 | - |
| Frequency (Kannan et al., 2016) | - | 9.2 | 72.2 |
| LSTM (Kannan et al., 2016) | 1 | 96.8 | - |
| Attention RNN (Liu & Lane, 2016) | 1 | - | 91.1 |
| ProjectionNet [T=70, d=14] → FC [256 x 128] (our approach) | >10 | 97.7 | 91.3 |

● Efficient wrt compute/memory while maintaining high quality
➡ On ATIS, ProjectionNet (quantized) achieves 91.0% with a tiny footprint (285KB)
● Achieves SoTA for NLP tasks
Learn2Compress: Build your own On-Device Models
● Data + (optional)
Blog: "Custom On-Device ML Models with Learn2Compress"
Thank You!
http://www.sravi.org  @ravisujith
Paper: Efficient On-Device Models using Neural Projections, http://proceedings.mlr.press/v97/ravi19a.html
Check out our workshop: Joint Workshop on On-Device Machine Learning & Compact Deep Neural Network Representations (ODML-CDNNR), Fri, Jun 14 (Room 203)