ASIC Clouds: Specializing the Datacenter Ikuo Magaki, Moein - PowerPoint PPT Presentation

ASIC Clouds: Specializing the Datacenter Ikuo Magaki, Moein Khazraee, Luis Vega Gutierrez, and Michael Bedford Taylor UC San Diego and Toshiba Presented By: Vandit Agarwal

Motivation • GPU- and FPGA-based clouds are already successful • Even ASIC clouds have been used successfully • Take this idea further to form ASIC-based clouds for other applications • Purpose-built datacenter • Large arrays of ASIC accelerators • Optimize Total Cost of Ownership (TCO) • For increasingly common high-volume chronic computations • Downside: • High Non-Recurring Engineering (NRE) cost • Inflexibility

Introduction • Two visible trends: • Heavy work is done in the cloud; interactive work has moved to the client • Rise of dark silicon - specialization and near-threshold computation • The conjunction of these two trends proved viable • At the single-machine level, ASICs can offer at least an order-of-magnitude improvement - explore and propose the ASIC cloud • Identify key issues by studying the Bitcoin ASIC Cloud

Objective In a Nutshell • Two key metrics drive the development: • H/w cost per performance = $ per op/s • Energy per operation = W per op/s • Working with joint knowledge of, and control over, datacenter and h/w design • Select a single TCO-optimal point among many Pareto-optimal points
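The selection step above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' tooling: the `DesignPoint` class, the sample numbers, and the linear TCO model (hardware cost plus a per-watt lifetime cost, `dollars_per_watt`) are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DesignPoint:
    name: str
    cost_per_ops: float   # hardware cost, $ per op/s
    watts_per_ops: float  # power, W per op/s


def pareto_front(points):
    """Keep the points that no other point beats on both metrics."""
    front = []
    for p in points:
        dominated = any(
            q.cost_per_ops <= p.cost_per_ops
            and q.watts_per_ops <= p.watts_per_ops
            and (q.cost_per_ops < p.cost_per_ops
                 or q.watts_per_ops < p.watts_per_ops)
            for q in points
        )
        if not dominated:
            front.append(p)
    return front


def tco_per_ops(point, dollars_per_watt):
    # Assumed linear model: hardware cost plus the lifetime cost of each watt.
    return point.cost_per_ops + dollars_per_watt * point.watts_per_ops


def tco_optimal(points, dollars_per_watt):
    """Pick the Pareto-optimal point with the lowest modelled TCO."""
    return min(pareto_front(points),
               key=lambda p: tco_per_ops(p, dollars_per_watt))


# Hypothetical design points, for illustration only.
points = [
    DesignPoint("A", 1.0, 4.0),  # cheap hardware, power hungry
    DesignPoint("B", 2.0, 2.0),
    DesignPoint("C", 4.0, 1.0),  # expensive hardware, energy efficient
    DesignPoint("D", 3.0, 3.0),  # dominated by B on both metrics
]
```

With these made-up numbers, a low per-watt cost favors the cheap design A, while a high per-watt cost favors the efficient design C, which is why a single TCO-optimal point depends on datacenter-level costs.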

Specialization Hierarchy (Figure labels: Off-PCB Interface, On-PCB Network, On-ASIC Interconnection Network) • ASIC Design: achieves reduction in silicon area and energy consumption • ASIC Server: organization of ASICs, heat sinks, selective components, custom voltages • ASIC Datacenter: optimize rack- and datacenter-level thermal distribution, and costs such as provisioning cost, availability, taxes, etc. • Note: To meet the requirements at the datacenter level, modifications trickle down the hierarchy

ASIC Cloud Architecture (Figure labels: Off-PCB Interface, On-PCB Network, On-ASIC Interconnection Network) • Trying to create a generic skeleton for an ASIC Cloud • Heart of the ASIC cloud - Replicated Compute Accelerator (RCA) - multiplied recursively • Customization: e.g., if the RCA requires DRAM, then the ASIC contains shared DRAM controllers connected to ASIC-local DRAMs

ASIC Server Overview • Focussed on 1U 19-inch rackmount servers • Forced-air cooling system • Air intake from front, removal from back • Air at 30 °C

ASIC Server Evaluation Flow • Given an implementation and architecture for target RCA: • VLSI tools used to map it to target process • Analysis tools provide info on: • Area • Performance • Power density • Tune the following to find lowest TCO: • No. of RCAs/Chip • No. of chips/PCB • Organization of chips on PCB • Power delivery mechanism • Cooling mechanism • Choice of voltage
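The tuning loop above amounts to a search over a small design space. A minimal sketch follows, assuming an exhaustive grid search; the `sweep` helper, the parameter names, and the `toy_tco` cost function are hypothetical placeholders, and the real flow relies on VLSI and thermal analysis tools rather than a closed-form formula.

```python
from itertools import product


def sweep(grid, tco_fn):
    """Evaluate every combination in `grid` and return (best_tco, config).

    `tco_fn` returns a TCO figure for a configuration, or None when the
    configuration is infeasible (for example, it cannot be cooled).
    """
    names = list(grid)
    best = None
    for values in product(*(grid[name] for name in names)):
        config = dict(zip(names, values))
        tco = tco_fn(config)
        if tco is None:
            continue
        if best is None or tco < best[0]:
            best = (tco, config)
    return best


def toy_tco(cfg):
    # Placeholder model with invented constants, for illustration only.
    rcas = cfg["rcas_per_chip"] * cfg["chips_per_pcb"]
    if rcas > 400:  # pretend thermal limit
        return None
    throughput = rcas * cfg["voltage"]
    cost = 100 + 5 * cfg["chips_per_pcb"] + 0.5 * rcas + 50 * cfg["voltage"] ** 2
    return cost / throughput


grid = {
    "rcas_per_chip": [10, 20, 40],
    "chips_per_pcb": [4, 8, 16],
    "voltage": [0.5, 0.7, 0.9],
}
best = sweep(grid, toy_tco)
```

Exhaustive search is only practical because each knob has a handful of sensible values; a larger space would call for pruning infeasible regions early.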

Thermally-Aware ASIC Server Design • ASICs and DC/DC converters - major sources of heat • Heat Sinks: • Heat spreader glued to the heat source (die) using Thermal Interface Material (TIM) • Spreader has fins - air is blown through them • Increasing spreader size improves cooling • Increasing the die size improves cooling - overcomes TIM resistance • Developed a model: • Input: fan curve, ASIC count/row • Output: Optimal heat sink parameters

Arranging ASICs on PCB

More Chips vs Fewer Chips • How large (in mm²) should each chip be? • Determines how many RCAs will be on each chip • Many small ASICs are easier to cool than a few large ASICs • Increasing silicon area -> heat dissipation capacity increases (TIM) • Large total die area in a row is effective • Increasing the no. of chips increases the packaging cost, but not by much

Power Density and Server Cost • Given the same RCA, increasing power (watts) increases performance • Moving right (high power density): very little total silicon per lane (due to temperature constraints), and it must be divided into many smaller chips • Cooling and packaging cost • Moving left (low power density): more silicon per lane and fewer chips • Silicon area cost

Bitcoin • Semi-anonymously and securely transfer money • Blockchain - globally replicated public ledger of transactions • A distributed consensus algorithm called Byzantine Fault Tolerance determines whose transactions are added to the blockchain • Mining: • Machines request work from a pool server • Hash - brute force attempt at partial inversion of cryptographically hard hash function • Hashrate - rate of hash - typically Giga hashes per second (GH/s) • On success, other machines verify. Accept and append the block

What Led to Bitcoin ASIC Cloud? • People are incentivized to mine: • More machines = more secure system • Blockchain reward (25 BTC = ~USD 11k in 2016) • 144 blocks daily x 25 BTC per block = ~USD 1.5M daily • Rising TCO justifies the increased investment in NRE and other development costs • Leads to more specialization
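The daily reward figure above can be reproduced with a short calculation. The only input not taken from the slide is the block interval: the sketch assumes Bitcoin's usual target of roughly one block every 10 minutes, which is where 144 blocks per day comes from.

```python
MINUTES_PER_DAY = 24 * 60
BLOCK_INTERVAL_MINUTES = 10   # assumed target interval, not stated on the slide
REWARD_BTC = 25               # block reward quoted on the slide
USD_PER_REWARD = 11_000       # ~USD 11k per 25 BTC in 2016, from the slide

blocks_per_day = MINUTES_PER_DAY // BLOCK_INTERVAL_MINUTES
btc_per_day = blocks_per_day * REWARD_BTC
usd_per_day = blocks_per_day * USD_PER_REWARD
implied_usd_per_btc = USD_PER_REWARD / REWARD_BTC

print(blocks_per_day, btc_per_day, usd_per_day, implied_usd_per_btc)
```

The product comes to about USD 1.58M per day, which the slide rounds to ~USD 1.5M.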

Bitcoin ASIC Trend Difficulty

Implementation • 0.66 mm² of silicon in UMC 28-nm process • Power density: 2 W/mm² • Extremely high power density

Results • More silicon -> optimal voltage decreases -> server efficiency increases • Initially, costs reduce (right to left), but then silicon costs start building up

Voltage Stacking • DC/DC power is significant • Chips are serially chained so that their supplies sum to 12 V • Leads to significant savings in the TCO-optimal case
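As a rough illustration of the stacking arithmetic, the sketch below computes how many chips must be chained in series for their supplies to sum to the 12 V rail. The function name and the per-chip voltages are assumptions for the example; the slide does not state the operating voltage.

```python
def stack_depth(supply_volts: float, chip_volts: float) -> int:
    """Number of chips in series whose supplies sum to `supply_volts`.

    Raises ValueError when the supply is not a whole multiple of the
    per-chip voltage, since the stack would then not add up exactly.
    """
    depth = round(supply_volts / chip_volts)
    if depth < 1 or abs(depth * chip_volts - supply_volts) > 1e-6:
        raise ValueError("supply is not a whole multiple of the chip voltage")
    return depth


# Hypothetical per-chip voltages, for illustration only.
print(stack_depth(12.0, 0.5))   # chips per stack at 0.5 V each
print(stack_depth(12.0, 0.75))  # chips per stack at 0.75 V each
```

Because the chips sit in series, the same current flows through every chip in the stack, so this arrangement suits workloads where all chips draw similar power.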

Litecoin ASIC Cloud

Video Transcoding ASIC Cloud • Note: Pareto points are glitchy because of variations in constants and polynomial order for server components as they vary with voltage

CNN ASIC Cloud

When is ASIC Cloud Feasible

Discussion • This is one of the earlier attempts to create a general framework/skeleton for an ASIC cloud. How feasible do you think this technology is, and how widely and how soon could it be adopted for a large variety of applications? • The authors suggest that open-sourcing various tools by cloud providers and silicon foundries could lead to lower TCO. Is this a good solution? Why or why not? • Which do you think is the better choice: investing heavily (high NRE) in more advanced nodes (e.g., 16 nm), or using/modifying older nodes (e.g., 65 nm) in an ASIC?

Bitcoin ASIC Cloud Design • Repeatedly execute a Bitcoin hash operation • Input: 512 bit block • Mutate the block and perform SHA256 on it • Fed into another round of SHA256 • Leading zero count performed and matched with the target • 64 rounds in each SHA
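The hash loop described above can be sketched in software with Python's standard `hashlib`. This is a simplified illustration rather than the exact Bitcoin wire format: the 60-byte prefix plus 4-byte nonce layout, the function names, and the big-endian leading-zero check are assumptions chosen to mirror the slide's description of a 512-bit input and a leading-zero comparison.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    """Apply SHA-256, then feed the digest into a second SHA-256."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the front of the digest."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()


def mine(prefix: bytes, required_zero_bits: int, max_nonce: int = 1 << 20):
    """Mutate the block by trying nonces until the hash meets the target.

    Returns the first nonce that works, or None if none is found.
    """
    for nonce in range(max_nonce):
        block = prefix + nonce.to_bytes(4, "little")
        if leading_zero_bits(double_sha256(block)) >= required_zero_bits:
            return nonce
    return None


# 60 bytes of prefix + 4 bytes of nonce = 64 bytes = one 512-bit block.
prefix = b"x" * 60
nonce = mine(prefix, required_zero_bits=8)
```

Each extra required zero bit doubles the expected number of attempts, which is why the brute-force search maps so well onto large arrays of identical hash pipelines.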
