FAWNdamentally Power-efficient Clusters
David Andersen, for: Vijay Vasudevan, Jason Franklin, Amar Phanishayee, Lawrence Tan, Michael Kaminsky*, Iulian Moraru
Carnegie Mellon University, *Intel Research Pittsburgh
Monthly energy statement considered harmful
• Power is a limiting factor in computing
• 3-year TCO soon to be dominated by power cost [EPA 2007]
• Influences location, technology choices
Approaches to saving power
• Infrastructure efficiency: power generation, power distribution, cooling
• Dynamic power scaling: sleeping when idle, rate adaptation, VM consolidation
• Computational efficiency: FAWN
Goal of computational efficiency: reduce the amount of energy to do useful work
FAWN: Fast Array of Wimpy Nodes
Improve the computational efficiency of data-intensive computing using an array of well-balanced, low-power systems.
Node hardware: AMD Geode, 256MB DRAM, 4GB CompactFlash
Target: Data-intensive computing
• Large amounts of data
• Highly parallelizable
• Fine-grained, independent tasks
Workloads amenable to a “scale-out” approach
Outline
• What is FAWN?
• Why FAWN?
• When FAWN?
• Challenges (How FAWN?)
1. Fixed power costs dominate
70% of peak power at 0% utilization!
[Figure: power (W) vs. utilization, showing fixed power costs above the ideal energy-proportional line; adapted from Tolia et al., HotPower ’08]
2. Balancing to save energy
• How do we balance?
• Big CPUs clocked down?
• Embedded CPUs?
• Why not use more disks with big CPUs?
[Figure: CPU-to-disk seek speed ratio vs. year]
3. Targeting the sweet spot in efficiency
• Fast processors mask the memory wall at the cost of efficiency
• Fixed power costs can dominate efficiency for slow processors
• FAWN targets the sweet spot in processor efficiency when including fixed costs
[Figure: speed vs. efficiency]
4. Reducing peak power consumption
Provisioning for peak power requires:
1. Worst-case cooling
2. UPS systems to ride out power failures
3. Investment in power generation and substations
What is FAWN good for?
• Random-access workloads (key-value lookup)
• Scan-bound workloads (Hadoop, data analytics)
• CPU-bound workloads (compression, encryption)
Important metrics
• Performance = work / time
• Efficiency = performance / Watt
• Density = performance / volume
• Cost = performance / $
Random access workloads
Systems compared:
• FAWN + CompactFlash (4W)
• Traditional + hard disk (87W)
• Traditional + SSD (83W)
Random access workloads
FAWN is 6-200x more efficient than traditional systems.
Queries per Joule: Traditional + HD: 2.03; Traditional + SSD: 69.9; FAWN + CF: 424.25 (roughly 6x the SSD system and over 200x the disk system).
CPU-bound encryption
AES encryption/decryption of a 512MB file with a 256-bit key.
Encryption efficiency (MB/J): FAWN: 0.73; Traditional: 0.365.
FAWN is 2x more efficient for CPU-bound operations!
When to use FAWN for random access workloads?
• Total cost of ownership = capital cost + 3 years of power at $0.10/kWh
• What is the cheapest architecture for serving random access workloads?
  • Traditional + {disks, SSD, DRAM}?
  • FAWN + {disks, SSD, DRAM}?
Architecture with lowest TCO for random access workloads
• The ratio of query rate to dataset size informs the choice of storage technology
• FAWN-based systems can provide lower cost per {GB, query rate}
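To make the comparison concrete, here is a back-of-the-envelope sketch of the slide's TCO formula: capital cost plus 3 years of power at $0.10/kWh. The capital costs below are hypothetical placeholders; the wattages come from the earlier random-access comparison (87W traditional + disk, 4W FAWN + CompactFlash).

    KWH_PRICE = 0.10             # $/kWh, from the previous slide
    HOURS_3Y = 3 * 365 * 24      # hours in three years

    def tco_3yr(capital_usd, watts):
        energy_kwh = watts / 1000.0 * HOURS_3Y
        return capital_usd + energy_kwh * KWH_PRICE

    print(tco_3yr(1000, 87))     # traditional + disk: ~$228.64 of power on top
    print(tco_3yr(300, 4))       # FAWN + CompactFlash: ~$10.51 of power on top

Which side wins overall depends on how many wimpy nodes are needed to match the traditional system's query rate and capacity, which is why the ratio of query rate to dataset size drives the choice.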
Challenges
“Each decimal order of magnitude increase in parallelism requires a major redesign and rewrite of parallel code” - Kathy Yelick
• Algorithms and architectures at 10x scale
• Dealing with Amdahl’s law
• High performance using low-performance nodes
• Today’s software may not run out of the box
• Manageability, failures, network design, power cost vs. engineering cost
But it’s possible...
• Example: FAWN-KV, a high-performance key-value store
FAWN-KV
[Architecture diagram: requests flow into the FAWN-KV cluster and responses flow back]
Using BerkeleyDB on ext3
• Initially implemented using BDB on ext3
• BDB uses a B-tree indexing structure
• Files stored on CompactFlash
• Benchmarked:
  • Inserts (BDB file creation)
  • Splits
  • Merges
DB management on flash: split/merge operations
Creating a 1.8GB BDB file:
  Number of files    Insertion time
  1                  12 hours 50 min
  8                  3 hours 18 min
  32                 2 hours 26 min
B-Tree does many small, random writes. Flash does not like.
Flash...
• Setting a bit to 0 is free
• Setting a bit back to 1 requires clearing a 128-256KB erase block
• Practical consequence:
  • Sequential reads fast; sequential writes pretty fast
  • Random reads decent; random writes awful
• Almost everything on flash becomes log-structured
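As a rough illustration of why random writes are awful (erase-block size from this slide; the 4KB page is an assumed typical B-tree page size):

    ERASE_BLOCK = 256 * 1024     # bytes; the slide cites 128-256KB blocks
    PAGE = 4 * 1024              # hypothetical B-tree page being updated

    # Rewriting one page in place can force a full erase-block cycle:
    print(ERASE_BLOCK // PAGE)   # 64x write amplification, worst case

A log-structured layout sidesteps this by turning every update into a sequential append.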
FAWN-DS
• Key-value storage
• Inserts written sequentially to the end of the log
• Deletions/appends require periodic compaction
• In-memory hash index: key fragment (15 bits) -> log offset (32 bits)
• Log-like behavior is free: the DB already tracks each key-value location at byte granularity; a filesystem or device can only do so at block granularity, with higher overhead
• Wimpies have little DRAM, so the in-memory index limits the size of the DB
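A minimal sketch of this design, with details the slide leaves out filled in by assumption: a Python dict stands in for the hash index, a bytearray for the on-flash log, and a collision on the 15-bit key fragment is detected by verifying the full key on read (the real store handles collisions more carefully).

    import hashlib

    class LogStore:
        def __init__(self):
            self.log = bytearray()   # stands in for the on-flash, append-only log
            self.index = {}          # 15-bit key fragment -> 32-bit log offset

        @staticmethod
        def _frag(key):
            h = int.from_bytes(hashlib.sha1(key).digest()[:4], "big")
            return h & 0x7FFF        # keep only 15 bits

        def put(self, key, value):
            offset = len(self.log)   # inserts go sequentially to the end
            self.log += len(key).to_bytes(2, "big")
            self.log += len(value).to_bytes(4, "big")
            self.log += key + value
            self.index[self._frag(key)] = offset  # old record is now garbage

        def get(self, key):
            off = self.index.get(self._frag(key))
            if off is None:
                return None
            klen = int.from_bytes(self.log[off:off + 2], "big")
            vlen = int.from_bytes(self.log[off + 2:off + 6], "big")
            if bytes(self.log[off + 6:off + 6 + klen]) != key:
                return None          # fragment collision; real FAWN-DS probes on
            start = off + 6 + klen
            return bytes(self.log[start:start + vlen])

A delete would append a tombstone record and rely on the periodic compaction the slide mentions to reclaim the space.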
FAWN-DS performance
Creating a 1.8GB BDB file:
  Number of files    Insertion time
  1                  12 hours 50 min
  8                  3 hours 18 min
  32                 2 hours 26 min
Creating a 1.8GB FAWN-DS file:
  Number of files    Insertion time
  1                  9.63 min
  8                  9.83 min
  32                 9.93 min
Tackling other challenges
• Limited DRAM
• Good progress on developing a new “massive multi-grep” (given 1M strings, find whether any of them occur in a massive dataset) with low memory requirements
• and more! :)
Conclusion
• FAWN improves the computational efficiency of datacenters
• Informed by fundamental system power trends
• Challenges: programming for 10x scale, running today’s software on yesterday’s machine, dealing with flash, ...
http://www.cs.cmu.edu/~fawnproj/
Thanks to: Google, Intel, NetApp
21-node FAWN Cluster
• 2 Gb/s of small key-value queries from an 80GB dataset (on 4-year-old hardware)
• 891µs median access time
• 90W
Cleaning & Merging: Not bad!
• LFS's traditional weak link: cleaning...
[Figure: cleaning overhead under max load and low load]
More Design
• Reliability: chain replication
  • FAWN-DS is log-structured
  • Stream the log tail to the next replica
• Load balancing
  • Front-ends have a small cache
  • Working on: read from any replica, load balancing across workers
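A toy sketch of the chain-replication point (class and method names are assumptions, not FAWN-KV's actual code): because the store is log-structured, replicating a put is just appending the same record at each node down the chain.

    class ChainNode:
        def __init__(self, store, next_node=None):
            self.store = store       # e.g., a log-structured store like LogStore
            self.next = next_node    # successor in the replication chain

        def put(self, key, value):
            self.store.put(key, value)      # append to the local log tail
            if self.next is not None:
                self.next.put(key, value)   # stream the tail to the next replica
            # else: this node is the chain tail; in chain replication the
            # tail acknowledges writes and serves consistent reads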
Database distribution
Requirements:
1) Spread data and queries uniformly
2) Handle node joins and departures without affecting many nodes
[Diagram: ring of nodes A-H]
Consistent hashing
[Diagram: keys hash onto a ring of nodes; each node stores the values in the range ending at itself, e.g., node B owns (H,B]]
Node Addition
[Diagram: node A joins the ring between H and B]
Splits and joins
1. At B: split (H,B] into (H,A] and (A,B]
2. Transfer (H,A] from B to A
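A minimal consistent-hashing sketch (the sorted-ring-plus-binary-search implementation is an assumption; the slides only show the ring diagrams). Each node owns the arc between its predecessor and itself, so a join like A's above only splits one range.

    import bisect
    import hashlib

    def ring_hash(b):
        return int.from_bytes(hashlib.sha1(b).digest()[:8], "big")

    class Ring:
        def __init__(self, nodes):
            # place each node at its hash position on the ring
            self.points = sorted((ring_hash(n.encode()), n) for n in nodes)

        def owner(self, key):
            # owner = first node clockwise from the key's hash position
            hashes = [h for h, _ in self.points]
            i = bisect.bisect_right(hashes, ring_hash(key))
            return self.points[i % len(self.points)][1]   # wrap past the top

    ring = Ring(["A", "B", "C", "D", "E", "F", "G", "H"])
    print(ring.owner(b"some key"))   # the node whose arc covers this key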