Deep-Learning Oriented Smart Sensing for the Next Generation of Embedded Applications
Manuele Rusci, Francesco Conti, Alessandro Capotondi, Luca Benini
Energy-Efficient Embedded Systems Laboratory, Dipartimento di Ingegneria dell’Energia Elettrica e dell’Informazione “Guglielmo Marconi”
IWES18 – Siena, 14 September 2018

From data collectors…
[Figure: node average power budget of a wireless sensing node. Block labels: power, sensor MCU, sensing unit (sensing element, analog chain, A/D conversion), TX/RX unit, external memory.]
[Alioto, Massimo. "IoT: Bird’s Eye View, Megatrends and Perspectives." Enabling the Internet of Things. Springer International Publishing, 2017. 1-45.]
M. Rusci, F. Conti, A. Capotondi, L. Benini

...to always-ON smart sensors
Challenge: bringing intelligence into the node at mW cost
[Figure: smart sensing node. Block labels: power system, processing unit (core region, peripheral subsystem, memory subsystem), sensing unit (sensing element, analog chain, A/D conversion), TX/RX unit, external memory.]
1. low-power “feature” / event extraction on the sensor
2. event-based near-sensor processing
3. “slim” and infrequent transmission of high-level features
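The three steps can be read as an event-driven control flow. The following is a minimal sketch under stated assumptions, not the node firmware from the talk: the `SmartNode` class, the `wake_threshold` value and the `classify` callback are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

Event = Tuple[int, int]  # (x, y) address of an asserted pixel


@dataclass
class SmartNode:
    """Toy model of the three steps; every name here is hypothetical."""
    classify: Callable[[List[Event]], str]  # step 2: near-sensor processing
    wake_threshold: int = 4                 # assumed minimum event count to wake the core
    last_label: Optional[str] = None
    sent: List[str] = field(default_factory=list)  # stands in for the TX/RX unit

    def on_sensor_output(self, events: List[Event]) -> None:
        # Step 1 happens on the sensor: it delivers events, not raw frames.
        if len(events) < self.wake_threshold:
            return  # too little activity, the processing unit stays asleep
        # Step 2: process only when the sensor reports enough activity.
        label = self.classify(events)
        # Step 3: transmit a high-level feature only when it changes.
        if label != self.last_label:
            self.sent.append(label)
            self.last_label = label
```

In this sketch the radio is exercised only when the label changes, which is one possible reading of "slim" and infrequent transmission.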

Ultra-Low Power Imaging (GrainCam)
Focal-plane processing: moving an early computation stage into the sensor die to reduce the power cost of the imaging task.
- Per-pixel circuit for filtering and binarization
- Gradient extraction
- Imager performing spatial filtering and binarization on the sensor die through mixed-signal sensing!
- Adapting exposure
- ‘Moving’ pixel window
[Figure: per-pixel spatial-contrast block schematic (pixels PO, PN, PE; comparators comp1, comp2; signals V res, V EDGE, V Q, V th); comparison of a traditional camera with GrainCam with motion detection.]
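A software analogue may help picture what the per-pixel circuit produces. This is a rough sketch and not the GrainCam circuit: it assumes that PO denotes the pixel itself and that PN and PE denote its north and east neighbours, and the function name and threshold are hypothetical.

```python
from typing import List


def binarize_contrast(frame: List[List[int]], threshold: int) -> List[List[int]]:
    """Return a binary map: 1 where a pixel differs strongly from its neighbours."""
    rows, cols = len(frame), len(frame[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            po = frame[r][c]
            # At the image border a missing neighbour falls back to the pixel itself.
            pn = frame[r - 1][c] if r > 0 else po
            pe = frame[r][c + 1] if c + 1 < cols else po
            # Assumed contrast measure: largest absolute difference to a neighbour.
            contrast = max(abs(po - pn), abs(po - pe))
            out[r][c] = 1 if contrast > threshold else 0
    return out
```

On a uniform scene the map is all zeros, so only edges and textured regions generate output data.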

Event-Based Paradigm
Ultra-low power consumption: <100uW
Event-based sensing: output frame data bandwidth depends on the external context activity
<10x wrt SoA imagers
[Figure: frame-based vs. event-based readout; events are pixel addresses {x0,y0}, {x1,y1}, {x2,y2}, {x3,y3}, ..., {xn-1,yn-1}]
Event-based data processing (timeline: idle, data transfer, data processing):
- Detection of relevant information by the sensor -> ~10mW power
- Absence of significant information from the sensor -> ~100uW
Readout modes:
- IDLE: read out the counter of asserted pixels
- ACTIVE: send the addresses of asserted pixels (Address-Coded Representation, AER)
M. Rusci et al. "A sub-mW IoT-endnode for always-on visual monitoring and smart triggering," in IEEE Internet of Things Journal, 2017
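The two readout modes can be illustrated on a binary frame. This is a minimal sketch: the function names and the bit-count estimate (one bit per pixel for a full frame, one x/y address per event) are assumptions for illustration, not the sensor's actual interface.

```python
from typing import List, Tuple


def idle_readout(binary_frame: List[List[int]]) -> int:
    """IDLE mode: return only the count of asserted pixels."""
    return sum(sum(row) for row in binary_frame)


def active_readout(binary_frame: List[List[int]]) -> List[Tuple[int, int]]:
    """ACTIVE mode: return the (x, y) address of every asserted pixel."""
    return [(x, y)
            for y, row in enumerate(binary_frame)
            for x, value in enumerate(row) if value]


def frame_based_bits(width: int, height: int) -> int:
    """Bits needed to send a full binary frame, assuming 1 bit per pixel."""
    return width * height


def event_based_bits(n_events: int, width: int, height: int) -> int:
    """Bits needed to send n_events addresses, assuming minimal x and y fields."""
    address_bits = (width - 1).bit_length() + (height - 1).bit_length()
    return n_events * address_bits
```

With few asserted pixels the event-based payload is smaller than the full frame, and it grows with scene activity, which is the bandwidth behaviour described on the slide.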

Deep Learning at the Edge
Convolutional Neural Networks are state of the art for visual recognition, detection and classification tasks.
[Figure: imager -> multi-dimensional data -> inference engine -> output class label (e.g. "bike")]
How to exploit CNNs on always-on devices with a power envelope of a few mW or sub-mW?
Issues:
- Large memory footprint to store weights (the ‘program’) and intermediate results (up to hundreds of MB), greater than the memory available on ultra-low-power engines (hundreds of kB)
- High-complexity CNN implementation, demanding floating-point precision
- Imager power costs of tens to hundreds of mW

Deep Learning at the Edge
“Extreme” example: ResNet-34
- classifies 224x224 images into 1000 classes
- ~ trained human-level performance
- ~21M parameters
- ~3.6G MAC
Performance for 1 fps: ~3.6 GMAC/s
Energy efficiency for 1 fps @ 20 mW: ~180 GMAC/s/W = ~5.6 pJ/MAC
Quantization (VGG-16 @ CIFAR-10):
Precision               Accuracy loss
full precision / 8bit   0
6bit                    -1.3%
4bit                    -3.3%
Specialized HW: parallelism and HW acceleration are key paradigms to achieve low energy.
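The ResNet-34 figures can be checked with a few lines of arithmetic. The sketch below uses the slide's rounded numbers (3.6 GMAC per inference, 21M parameters, 20 mW). The weight-footprint helper is an added illustration, and its 32-bit size for full precision is an assumption.

```python
MACS_PER_INFERENCE = 3.6e9   # ~3.6 GMAC per 224x224 image (from the slide)
PARAMETERS = 21e6            # ~21M weights (from the slide)


def required_throughput(fps: float) -> float:
    """MAC/s needed to sustain the given frame rate."""
    return MACS_PER_INFERENCE * fps


def efficiency_gmac_per_s_per_w(fps: float, power_w: float) -> float:
    """Required energy efficiency in GMAC/s/W at the given power budget."""
    return required_throughput(fps) / power_w / 1e9


def energy_per_mac_pj(fps: float, power_w: float) -> float:
    """Energy available per MAC, in picojoules."""
    return power_w / required_throughput(fps) * 1e12


def weight_footprint_mb(bits_per_weight: int) -> float:
    """Weight storage in MB (10^6 bytes) at the given precision."""
    return PARAMETERS * bits_per_weight / 8 / 1e6


print(efficiency_gmac_per_s_per_w(1, 0.020))  # about 180 GMAC/s/W
print(energy_per_mac_pj(1, 0.020))            # about 5.6 pJ/MAC
print(weight_footprint_mb(32), weight_footprint_mb(8), weight_footprint_mb(4))
```

Under the 32-bit assumption the weights alone take about 84 MB, against about 21 MB at 8 bit and 10.5 MB at 4 bit, all still far above the hundreds of kB available on ultra-low-power engines.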

Quantization: no free lunch
Running INT-Q convolution on an ARM Cortex-M7 core -> huge opportunity for HW/SW codesign
- Lower power consumption when fitting into L1 thanks to compression
- Lower bandwidth from L2-SRAM impacts on low-bitwidth precision
- Overhead for casting INT-4/2 to INT-16 for 2x16bit vectorized MAC instructions
- INT-1 kernel exploits bitwise operations and does not pay casting overhead because XNOR convolutions are supported by the ISA
Open Source: https://github.com/EEESlab/CMSIS_NN-INTQ
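The contrast between the INT-1 and INT-4 paths can be sketched in plain Python. This is an illustration of the general technique, not code from the CMSIS_NN-INTQ repository: the function names, the bit packing order and the nibble layout are all assumptions.

```python
from typing import List, Tuple


def pack_signs(values: List[int]) -> int:
    """Pack +1/-1 values into an integer: bit i is 1 when values[i] is +1."""
    word = 0
    for i, v in enumerate(values):
        if v > 0:
            word |= 1 << i
    return word


def xnor_dot(a_bits: int, b_bits: int, n: int) -> int:
    """Dot product of two n-element +1/-1 vectors using XNOR and popcount."""
    mask = (1 << n) - 1
    matches = bin(~(a_bits ^ b_bits) & mask).count("1")  # positions with equal sign
    return 2 * matches - n  # each match adds +1, each mismatch adds -1


def unpack_int4(byte: int) -> Tuple[int, int]:
    """Split one byte into two signed 4-bit values (low nibble, high nibble).

    This is the kind of casting step that low-bitwidth kernels pay before
    they can feed wider vectorized MAC instructions.
    """
    def sign_extend(v: int) -> int:
        return v - 16 if v >= 8 else v
    return sign_extend(byte & 0xF), sign_extend((byte >> 4) & 0xF)
```

The binary path works directly on packed words, while the INT-4 path needs an extra unpack and sign-extension per operand, which is the casting overhead listed above.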

Quantization + Acceleration = ❤
More efficient than any ULP MCU…
[Figure: bubble chart; bubble size = pJ/op (smaller is better)]
F. Conti et al., https://arxiv.org/abs/1612.05974

Quantization + Acceleration = ❤
… and even more if compared to a commercial high-perf MCU
[Figure: bubble chart; bubble size = pJ/op (smaller is better); labelled points: 865 pJ/op, 143 pJ/op, 50 pJ/op, 23 pJ/op, 11 pJ/op, 6 pJ/op]
F. Conti et al., https://arxiv.org/abs/1612.05974

Flying a Drone with DL (in <10mW)
DroNet: a ResNet-based CNN to drive a drone in the environment
• original implementation: 20fps on external CPU, requires a big drone (e.g. DJI, Parrot)
Platform                   FPS
GAP8 – 8 Cores (200MHz)    32 fps
GAP8 – HWCE (200MHz)       51 fps
DroNet on GAP8/PULP:
- Fixed-Point 16bit (Q3.13)
- Removed Batch Normalization
- Max Pooling layer 2x2
- Striding support in HW
- Support for HWCE
- Comparable accuracy w.r.t. baseline
Example nano-drone from D. Palossi et al., https://arxiv.org/abs/1805.01831
F. Conti, M. Rusci, A. Capotondi, D. Rossi, L. Benini
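The Q3.13 format mentioned for DroNet can be illustrated with a small conversion and multiply routine. This is a sketch under assumptions: it reads Q3.13 as a signed 16-bit value with 13 fractional bits and the sign bit counted among the 3 integer bits, so the range is [-4, 4). The helper names are hypothetical and are not taken from the GAP8 code.

```python
FRAC_BITS = 13                 # fractional bits in Q3.13
SCALE = 1 << FRAC_BITS         # 8192 steps per unit
INT16_MIN, INT16_MAX = -32768, 32767


def saturate(q: int) -> int:
    """Clamp to the signed 16-bit range instead of wrapping around."""
    return max(INT16_MIN, min(INT16_MAX, q))


def to_q3_13(x: float) -> int:
    """Quantize a real value to Q3.13 with rounding and saturation."""
    return saturate(int(round(x * SCALE)))


def from_q3_13(q: int) -> float:
    """Convert a Q3.13 integer back to a real value."""
    return q / SCALE


def q_mul(a: int, b: int) -> int:
    """Multiply two Q3.13 values: the wide product is shifted back by 13 bits."""
    return saturate((a * b) >> FRAC_BITS)
```

For example, `from_q3_13(q_mul(to_q3_13(1.5), to_q3_13(2.0)))` gives 3.0, while any value at or above 4 saturates to the largest representable code.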

Thanks for your attention. Questions?
Special acks to: Davide Rossi (UNIBO), Daniele Palossi (ETHZ), Eric Flamand (GreenWaves Technologies), all the PULP team
https://github.com/pulp-platform
Twitter @pulp_platform
