RoboX: An End-to-End Solution to Accelerate Autonomous Control in Robotics
Jacob Sacks, Divya Mahajan, Richard C. Lawson, Hadi Esmaeilzadeh†
Alternative Computing Technologies (ACT) Lab, Georgia Institute of Technology
† University of California, San Diego
ISCA ’18, Los Angeles, California

Challenges in Autonomous Robotics: many diverse applications; compute-intensive; battery constraints; limited power budget

Challenges in Autonomous Robotics [Figure: mobile CPU processor; flight time versus power]

Accelerating Planning and Control: Model Predictive Control

RoboX Workflow
Domain-Specific Language (concise mathematical description) -> Program Translator -> Macro Dataflow Graph (automatically synthesize DFG) -> Controller Compiler -> Statically-Scheduled Instructions: computation schedule, communication schedule, memory schedule (statically schedule on accelerator)
Example DSL program:
System Quadrotor( ) {
  state position[3], angle[3];
  input torque[4];
  ...
  Task takeOff() {
    penalty target_height;
    constraint max_height;
    ...
  }
}

Background: System Models [Figure: vehicle with four thrusts f_1 to f_4 and rotation axes roll (ψ), pitch (θ), yaw (ɸ)]

Background: Dynamics and Constraints
General nonlinear dynamics: $\dot{x} = f(x, u)$, where $\dot{x}$ is the time derivative of the states $x$ and $u$ are the inputs
State and input constraints: $\underline{x} \le x \le \overline{x}$, $\underline{u} \le u \le \overline{u}$

Background: Objective Function
$J = \phi_{\mathrm{term}} + \int \phi_{\mathrm{run}} \, dt$, where $\phi_{\mathrm{term}}$ is the terminal cost and $\phi_{\mathrm{run}}$ is the running cost
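As a rough illustration of the terminal-plus-running structure, the sketch below evaluates a discretized objective over a short trajectory. The quadratic cost functions, the time step `dt`, the rectangle-rule approximation of the integral, and all function names are assumptions made for this example; the slides do not specify them.

```python
def running_cost(x, u):
    # Hypothetical quadratic running cost on the state and input.
    return sum(xi * xi for xi in x) + 0.1 * sum(ui * ui for ui in u)


def terminal_cost(x):
    # Hypothetical quadratic terminal cost on the final state.
    return 10.0 * sum(xi * xi for xi in x)


def objective(states, inputs, dt):
    """Terminal cost plus a rectangle-rule approximation of the running-cost integral.

    states[k] is the state before inputs[k] is applied, so states has one
    more entry than inputs.
    """
    if len(states) != len(inputs) + 1:
        raise ValueError("expected len(states) == len(inputs) + 1")
    running = sum(running_cost(x, u) * dt for x, u in zip(states[:-1], inputs))
    return terminal_cost(states[-1]) + running


# Two-step trajectory that ends at the origin, so the terminal cost is zero.
states = [[1.0, 0.0], [0.5, 0.0], [0.0, 0.0]]
inputs = [[1.0], [1.0]]
J = objective(states, inputs, dt=0.5)
```

An MPC solver would minimize a quantity of this shape over the input sequence, subject to the dynamics and constraints.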

Components of MPC (Model Predictive Control): dynamics, objective function, state constraints, input constraints

Domain-Specific Language
Aims of RoboX DSL:
Distill MPC into modular components (System, Task)
Remain close to mathematical expressions (symbolic expressions)
Independent of implementation (group operations)

DSL: System Component
System MobileRobot( ) {
  state pos[2];
  state angle;
  input vel;
  input ang_vel;
  pos[0].dt = vel * cos(angle);
  pos[1].dt = vel * sin(angle);
  angle.dt = ang_vel;
  …
}
[Figure: mobile robot at (pos[0], pos[1]) in the x-y plane with heading angle (θ), velocity vel (v), and angular velocity ang_vel (ω)]
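To make the `.dt` equations concrete, here is a minimal Python sketch of the same MobileRobot dynamics, rolled forward in time. The forward Euler integrator, the step size, and the function names are assumptions for illustration; the slides do not say how RoboX discretizes the dynamics.

```python
import math


def mobile_robot_dynamics(state, inputs):
    """Time derivative of (pos[0], pos[1], angle) for inputs (vel, ang_vel)."""
    _, _, angle = state
    vel, ang_vel = inputs
    return (vel * math.cos(angle), vel * math.sin(angle), ang_vel)


def euler_step(state, inputs, dt):
    # Forward Euler is an assumed integrator, chosen only for brevity.
    deriv = mobile_robot_dynamics(state, inputs)
    return tuple(s + dt * d for s, d in zip(state, deriv))


# Drive straight along x at unit speed for ten steps of 0.1 time units.
state = (0.0, 0.0, 0.0)
for _ in range(10):
    state = euler_step(state, (1.0, 0.0), dt=0.1)
```

With zero angular velocity the heading stays at zero, so only `pos[0]` changes.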

DSL: Task Component
System MobileRobot(...) {
  Task moveTo(…) {
    penalty target_x, target_y;
    target_x.running = pos[0] - desired_x;
    target_y.running = pos[1] - desired_y;
    constraint pos_bound;
    pos_bound.running = sqrt(pos[0]^2 + pos[1]^2);
    pos_bound.upper_bound <= radius;
  }
}
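The penalty and constraint expressions in the moveTo task can be evaluated directly, as in the following Python sketch. The function names and sample values are hypothetical, and the slides do not state how a solver weights or squares the penalties, so the sketch returns only the raw expressions.

```python
import math


def move_to_penalties(pos, desired_x, desired_y):
    """Raw running penalties target_x and target_y from the moveTo task."""
    return (pos[0] - desired_x, pos[1] - desired_y)


def pos_bound_value(pos):
    # Distance from the origin, matching sqrt(pos[0]^2 + pos[1]^2).
    return math.sqrt(pos[0] ** 2 + pos[1] ** 2)


def pos_bound_satisfied(pos, radius):
    # The constraint holds while the distance stays at or below radius.
    return pos_bound_value(pos) <= radius


pos = (3.0, 4.0)
penalties = move_to_penalties(pos, desired_x=5.0, desired_y=4.0)
inside = pos_bound_satisfied(pos, radius=6.0)
```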

RoboX Accelerator Architecture
Flexible dataflow architecture organized as a two-level hierarchy to handle a large number of data dependencies
[Figure: compute clusters 1 to N, each with multiple compute units (CUs); global µCode buffer; global LD/ST buffer; programmable memory access engine; shifter; memory µCode; bus µCode]

RoboX Accelerator Architecture: compute-enabled interconnect to perform simple operations on in-transit data

RoboX Accelerator Architecture: each compute cluster executes separate compute and communication microprograms and can operate in a SIMD mode

RoboX Accelerator Architecture: compute units do not initiate communication requests but consume data from single-hop connections and a shared bus

RoboX Accelerator Architecture: the compute unit is a three-stage pipeline and divides its memory into separate buffers to simplify communication scheduling
[Figure labels: Neighbor (Right), Neighbor (Left), Nonlinear State Buffer, Nonlinear Input Buffer, Gradient Buffer, Hessian Buffer, Interm Buffer]

RoboX Accelerator Architecture: programmable memory access engine prefetches instructions and data according to its own statically-scheduled microprogram
[Figure labels: Global µCode Buffer, Global LD/ST Buffer, Programmable Memory Access Engine, Shifter, Memory µCode, Bus µCode]

Instruction Set Architecture
Compute Instructions: Scalar, SIMD
Communication Instructions: Data Transfer, In-Network
Memory Instructions: Load, Store

Program Translator
Inputs: Domain-Specific Language (states and inputs, dynamics function, objective function) and a parameterized solver template
Automatic differentiation for necessary gradients
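One way to picture automatic differentiation of the dynamics is forward mode with dual numbers, sketched below in plain Python. This is an assumed technique chosen for brevity; the slides do not say which differentiation scheme the Program Translator uses, and the `Dual` class and helper names are hypothetical.

```python
import math


class Dual:
    """Number of the form value + deriv * eps, where eps * eps == 0."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    @staticmethod
    def lift(x):
        # Treat plain numbers as constants with zero derivative.
        return x if isinstance(x, Dual) else Dual(x)

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = Dual.lift(other)
        # Product rule: (a * b)' = a' * b + a * b'.
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__


def cos(x):
    x = Dual.lift(x)
    return Dual(math.cos(x.value), -math.sin(x.value) * x.deriv)


def sin(x):
    x = Dual.lift(x)
    return Dual(math.sin(x.value), math.cos(x.value) * x.deriv)


def pos_dot(vel, angle):
    # pos[0].dt and pos[1].dt from the MobileRobot example.
    return vel * cos(angle), vel * sin(angle)


# Seed the angle with derivative 1 to obtain partial derivatives with respect to angle.
dx, dy = pos_dot(2.0, Dual(0.0, 1.0))
```

Here `dx.deriv` and `dy.deriv` hold the partial derivatives of the two position rates with respect to `angle`, evaluated at `angle = 0` and `vel = 2`.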

Controller Compiler
Mapping and scheduling produce a computation instruction schedule, a communication instruction schedule, and a memory instruction schedule
[Figure: accelerator block diagram with compute clusters 1 to N, global µCode and LD/ST buffers, programmable memory access engine, shifter, memory µCode, bus µCode, decode]
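The idea of statically mapping a dataflow graph onto compute units can be sketched with a greedy list scheduler. This is only a toy under strong assumptions (unit-latency operations, interchangeable units, no communication or memory scheduling) and is not the RoboX compiler's algorithm; the function name and graph encoding are hypothetical.

```python
def schedule(dfg, num_units):
    """Greedy list scheduling of a dataflow graph with unit-latency operations.

    dfg maps each node name to the list of node names it depends on.
    Returns a dict mapping each node to its (cycle, unit) slot.
    """
    if num_units < 1:
        raise ValueError("num_units must be at least 1")
    placed = {}
    remaining = dict(dfg)
    cycle = 0
    while remaining:
        # A node is ready once every dependency finished in an earlier cycle.
        ready = sorted(
            node for node, deps in remaining.items()
            if all(d in placed and placed[d][0] < cycle for d in deps)
        )
        if not ready:
            raise ValueError("dataflow graph has a cycle or a missing node")
        # Fill at most one operation per unit in this cycle.
        for unit, node in enumerate(ready[:num_units]):
            placed[node] = (cycle, unit)
            del remaining[node]
        cycle += 1
    return placed


# Graph for pos[0].dt = vel * cos(angle) and pos[1].dt = vel * sin(angle).
dfg = {"cos": [], "sin": [], "mul_x": ["cos"], "mul_y": ["sin"]}
two_units = schedule(dfg, num_units=2)
one_unit = schedule(dfg, num_units=1)
```

With two units the independent `cos` and `sin` nodes share a cycle, so the graph finishes in two cycles instead of four.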

Benchmarks
Name          System                    Task
MobileRobot   Two-Wheel Mobile Robot    Trajectory Tracking
Manipulator   Two-Link Manipulator      Reaching
AutoVehicle   Four-Wheel Vehicle        High-Speed Racing
MicroSat      Miniature Satellite       Orbit Control
Quadrotor     Four-Rotor Micro UAV      Motion Planning
Hexacopter    Six-Rotor Micro UAV       Attitude Control

Platforms
CPU: ARM Cortex A57 (low power), Intel Xeon E3 (high performance)
GPU: Tegra X2 (low power), GTX 650 Ti (desktop class), Tesla K40 (high performance)

Evaluation [Figure: RoboX speedup over ARM and Xeon for MobileRobot, AutoVehicle, MicroSat, Quadrotor, Manipulator, Hexacopter, and Geomean; axis from 0.0X to 40.0X with off-scale labels 79X and 65X]
On average, RoboX achieves a 29.4X and 7.3X speedup over the ARM A57 and Xeon E3, respectively

Evaluation [Figure: RoboX speedup over GTX 650 Ti, Tegra X2, and Tesla K40 for MobileRobot, AutoVehicle, MicroSat, Quadrotor, Manipulator, Hexacopter, and Geomean; axis from 0.0X to 4.0X]
On average, RoboX achieves a 2.0X and 3.5X speedup over the GTX and Tegra, respectively, and is 1.3X slower than the Tesla

Evaluation [Figure: RoboX performance-per-watt relative to GTX 650 Ti, Tegra X2, and Tesla K40 for MobileRobot, AutoVehicle, MicroSat, Quadrotor, Manipulator, Hexacopter, and Geomean; axis from 0.1X to 100.0X]
On average, RoboX achieves a 65.5X, 7.9X, and 71.8X performance-per-watt improvement over the GTX, Tegra, and Tesla, respectively

Conclusion
Domain-general acceleration solution by leveraging algorithmic understanding of robotics
Deliver significant performance and energy gains while abstracting away details of controls, optimization, and hardware
First step towards enabling full-stack solutions for robotics from high-level mathematical specifications
