Running Deep Learning in less than 100KB on Microcontrollers
Pete Warden Engineer, TensorFlow petewarden@google.com @petewarden
Why am I here?
150 Billion Devices! Growing faster than internet users or smartphones
Why ML?
Energy! Many devices rely on battery power or energy harvesting. Transmitting data takes a lot of power, and radios can't improve fast enough. Capturing and processing data locally is very cheap.
Energy! Most captured data is currently being wasted. ML lets us turn it into something actionable.
Demo
How is this done?
What are the challenges? Less than 100KB of RAM and storage. Fewer than 10 million arithmetic ops per second. Can't rely on floating-point hardware. No operating system.
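Since floating-point hardware can't be relied on, the usual approach is to quantize model values into 8-bit integers with a scale and zero point, so inference math stays in integer registers. A minimal sketch of that mapping (illustrative helper names, not the actual TensorFlow Lite API):

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Map a float into uint8 using a scale and zero point, clamping to [0, 255].
uint8_t Quantize(float value, float scale, int32_t zero_point) {
  int32_t q = static_cast<int32_t>(std::round(value / scale)) + zero_point;
  return static_cast<uint8_t>(std::min(255, std::max(0, q)));
}

// Recover the (approximate) float a quantized value represents.
float Dequantize(uint8_t q, float scale, int32_t zero_point) {
  return scale * (static_cast<int32_t>(q) - zero_point);
}
```

With scale 0.1 and zero point 128, the representable range is roughly -12.8 to +12.7, which is plenty for normalized activations.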
Model Design We needed a 20KB model. Happily, that's common in the speech world. Learned a lot about quantization. It's actually just a tiny image CNN run on spectrograms.
Model Design Uses fewer than 400,000 arithmetic operations. https://www.tensorflow.org/tutorials/sequences/audio_recognition
Software Design TensorFlow Lite is still > 100KB in binary size. It depends on POSIX and the standard C/C++ libraries. It uses dynamic memory allocation.
Software Design But there's a lot we don't want to lose from TensorFlow Lite: - Existing op implementations - Well-documented APIs and file format - Conversion tooling
Software Design Modularized the existing code. Separated API definitions from implementations. Used the reference code. Added a minimal new runtime layer for MCUs.
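One concrete consequence of "no dynamic memory allocation": a minimal MCU runtime typically hands out all tensor storage from one caller-supplied arena, bump-allocator style, with nothing ever freed individually. A sketch of the idea under those assumptions (the class and method names here are illustrative, not the TensorFlow Lite Micro API):

```cpp
#include <cstddef>
#include <cstdint>

// All working memory comes from a fixed buffer the caller owns; allocation
// is a pointer bump, so there's no malloc, no free, and no fragmentation.
class SimpleArena {
 public:
  SimpleArena(uint8_t* buffer, size_t size) : buf_(buffer), size_(size) {}

  // Returns `bytes` of storage aligned to `alignment` (a power of two),
  // or nullptr if the arena is exhausted.
  uint8_t* Allocate(size_t bytes, size_t alignment = 4) {
    size_t aligned = (used_ + alignment - 1) & ~(alignment - 1);
    if (aligned + bytes > size_) return nullptr;
    used_ = aligned + bytes;
    return buf_ + aligned;
  }

  size_t used() const { return used_; }

 private:
  uint8_t* buf_;
  size_t size_;
  size_t used_ = 0;
};
```

Sizing the arena becomes a compile-time decision, which is exactly what you want when total RAM is under 100KB.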
Software Design Focused on getting one end-to-end example working, rather than going broad and porting the whole framework at once.
What does this mean in practice?
int32 acc = 0;
for (int filter_y = 0; filter_y < filter_height; ++filter_y) {
  for (int filter_x = 0; filter_x < filter_width; ++filter_x) {
    const int in_x = in_x_origin + dilation_width_factor * filter_x;
    const int in_y = in_y_origin + dilation_height_factor * filter_y;
    // If the location is outside the bounds of the input image,
    // use zero as a default value.
    if ((in_x >= 0) && (in_x < input_width) && (in_y >= 0) &&
        (in_y < input_height)) {
      int32 input_val = input_data[Offset(input_shape, b, in_y, in_x, ic)];
      int32 filter_val =
          filter_data[Offset(filter_shape, 0, filter_y, filter_x, oc)];
      acc += (filter_val + filter_offset) * (input_val + input_offset);
    }
  }
}
Reference code is important. Most ML operations can be implemented simply. Most frameworks only ship with optimized versions. That's understandable, but it makes them very hard to extend or optimize for other platforms.
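To see how simple a reference op can be: here is a fully connected layer written as plain loops, in the same style as the convolution snippet above. This is an illustrative sketch, not code from TensorFlow Lite itself; no vectorization, no platform intrinsics, just the math, which is what makes it easy to port to a new MCU.

```cpp
// Reference fully connected layer: output[o] = bias[o] + weights[o] . input.
// `weights` is laid out row-major as [output_size][input_size].
void FullyConnected(const float* input, int input_size,
                    const float* weights, const float* bias,
                    float* output, int output_size) {
  for (int out = 0; out < output_size; ++out) {
    float acc = bias[out];
    for (int in = 0; in < input_size; ++in) {
      acc += weights[out * input_size + in] * input[in];
    }
    output[out] = acc;
  }
}
```

An optimized kernel for a specific chip can then be validated against this loop nest, which is exactly the role reference implementations play.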
What’s the takeaway?
There is no killer app. Voice interfaces are the closest. Vision, accelerometer, and audio sensors offer a lot. We need to connect them with the right problems.
Think about your domain. What could you do if your model ran on a 50-cent chip that could be peeled and stuck anywhere, and run forever?
Get it. Try it. Code : github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/experimental/micro Docs : tensorflow.org/lite/guide/microcontroller Example : g.co/codelabs/sparkfunTF