CARD: A Congestion-Aware Request Dispatching Scheme for Replicated Metadata Server Cluster
Shangming Cai, Dongsheng Wang, Zhanye Wang and Haixia Wang
Tsinghua University
Background: Massive-scale ML in production environments
• Datasets updated hourly or daily
  • data collected and stored in an HDFS-like distributed filesystem
  • periodic offline training for online inference
• Challenges of the data-reader pipeline during training
  • extremely heavy read workloads: millions to billions of files per epoch
  • random access pattern: up-level shuffling for convergence speed
Background: Massive-scale ML in production environments
• Training workers interact with a DFS
  • metadata requests -> metadata server (MDS)
  • file I/O -> object storage devices (OSDs)
(Figure: training workers send requests to a single MDS and to OSDs in the distributed filesystem)
When the number of training workers grows…
• Extremely stressed workloads
• The metadata access step bottlenecks the data-reader pipeline
• Potential single point of failure on the MDS
(Figure: many training workers funnel requests through one metadata server to the OSDs)
Typical industrial response: Scaling out likewise
• Concerns to be addressed:
  • cost-effectiveness
  • scalability
  • run-time stability
(Figure: the single MDS is replaced by a replicated metadata server cluster in front of the OSDs)
To achieve load balance…
• A middle-layer load balancer
  • Pros:
    • good global load balancing
    • more features are optional
  • Cons:
    • the load balancer itself is stressed
    • reintroduces a potential single point of failure
    • not cost-effective
(Figure: a load balancer sits between the training workers and the MDS cluster)
Try client-side solutions
• Easy to implement
• Cost-effective
(Figure: each training worker dispatches requests directly to the MDS cluster, with no middle layer)
Client-side solution: Round-Robin
• Round-Robin
  • Pros: simple yet effective in homogeneous environments
  • Cons: inflexible and inefficient in shifting or heterogeneous environments
(Figure: clients (training workers) cycle requests across MDS 0–3)
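A minimal sketch of a client-side Round-Robin selector as described on this slide; the class and endpoint names are illustrative, not taken from the paper.

```python
import itertools

class RoundRobinSelector:
    """Cycle metadata requests over a fixed list of MDS endpoints."""

    def __init__(self, mds_endpoints):
        self._cycle = itertools.cycle(mds_endpoints)

    def pick(self):
        # Every request simply takes the next server in turn,
        # regardless of how loaded that server currently is.
        return next(self._cycle)

# Example: four metadata servers, as in the slide's figure.
selector = RoundRobinSelector(["mds0", "mds1", "mds2", "mds3"])
print([selector.pick() for _ in range(6)])  # mds0, mds1, mds2, mds3, mds0, mds1
```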
Client-side solution: Heuristic selection
• Heuristic selection
  • e.g., prefer the server with the lowest MART (moving average of response time)
  • Pros: effective under light-weight workloads
  • Cons: causes herd behavior and load oscillations
(Figure: MDS 0–3 report MARTs of 40 ms, 20 ms, 15 ms and 25 ms; every client picks MDS 2)
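A sketch of the heuristic strategy, assuming MART is kept as an exponentially weighted moving average of observed response times; the smoothing factor and all names are assumptions made for illustration.

```python
class MartSelector:
    """Prefer the MDS with the lowest moving average of response time (MART)."""

    def __init__(self, mds_endpoints, alpha=0.2):
        self.alpha = alpha                       # EWMA smoothing factor (assumed)
        self.mart = {mds: 0.0 for mds in mds_endpoints}

    def record(self, mds, response_time_ms):
        # Update the moving average after each completed request.
        old = self.mart[mds]
        self.mart[mds] = (1 - self.alpha) * old + self.alpha * response_time_ms

    def pick(self):
        # Every client independently chooses the currently "fastest" server,
        # which is exactly what produces herd behavior under heavy load.
        return min(self.mart, key=self.mart.get)

selector = MartSelector(["mds0", "mds1", "mds2", "mds3"])
for mds, rt in [("mds0", 40), ("mds1", 20), ("mds2", 15), ("mds3", 25)]:
    selector.record(mds, rt)
print(selector.pick())  # mds2, the 15 ms server from the slide
```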
Client-side solution: Round-Robin with throttling
• Round-Robin with throttling
  • e.g., LADS: preset a MART threshold to mark servers as congested
• Under light-weight workloads: equivalent to Round-Robin
(Figure: MARTs of 25 ms, 30 ms, 5 ms and 20 ms, all below the 50 ms threshold, so requests are spread round-robin)
Client-side solution: Round-Robin with throttling
• Under heavy workloads: degenerates into heuristic selection
  • herd behavior and load oscillations remain
(Figure: MARTs of 55 ms, 60 ms, 40 ms and 65 ms against the 50 ms threshold; three servers are marked congested, so all requests pile onto MDS 2)
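A sketch of the threshold-throttling idea attributed to LADS on these slides: round-robin over servers whose MART is below the preset threshold, falling back to the lowest-MART server when every candidate is marked congested. The details beyond what the slides state are assumptions.

```python
import itertools

class ThrottledRoundRobin:
    """Round-Robin that skips servers whose MART exceeds a fixed threshold."""

    def __init__(self, mds_endpoints, threshold_ms=50.0):
        self.endpoints = list(mds_endpoints)
        self.threshold_ms = threshold_ms
        self.mart = {mds: 0.0 for mds in self.endpoints}
        self._cycle = itertools.cycle(self.endpoints)

    def record(self, mds, mart_ms):
        # MART values would be maintained per server from observed responses.
        self.mart[mds] = mart_ms

    def pick(self):
        # One round-robin pass over servers not marked congested.
        for _ in range(len(self.endpoints)):
            mds = next(self._cycle)
            if self.mart[mds] <= self.threshold_ms:
                return mds
        # All servers congested: fall back to the least-loaded one (assumed),
        # which degenerates into heuristic selection and its herd behavior.
        return min(self.mart, key=self.mart.get)
```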
CARD: Congestion-Aware Request Dispatching scheme
• Core idea: Round-Robin with adaptive rate-control
  • inspired by CUBIC congestion control for TCP
  • counting-based implementation: no extra info required from servers
• Light-weight workloads: equivalent to Round-Robin
• Heavy workloads:
  • redirect requests from overloaded MDSs to underloaded MDSs
  • suppress upcoming requests if and only if all servers are overloaded
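The dispatching rule can be pictured roughly as below: round-robin over per-MDS rate limiters, skipping any limiter whose per-window budget is exhausted, and holding the request back only when every limiter is exhausted. This is a simplified sketch, not the paper's code; the budget_left()/send() interface is hypothetical.

```python
import itertools
from collections import deque

class CardDispatcher:
    """Round-Robin with adaptive rate-control, as described on this slide.

    Each rate limiter is assumed to expose budget_left() and send(request);
    that interface is hypothetical and used only for this sketch.
    """

    def __init__(self, rate_limiters):
        self.rate_limiters = list(rate_limiters)
        self._cycle = itertools.cycle(self.rate_limiters)
        self.pending = deque()          # requests suppressed for later

    def dispatch(self, request):
        # Walk at most one full round-robin pass over the per-MDS limiters.
        for _ in range(len(self.rate_limiters)):
            rl = next(self._cycle)
            if rl.budget_left() > 0:
                # Requests naturally flow to servers that still have budget,
                # i.e. away from overloaded MDSs toward underloaded ones.
                rl.send(request)
                return
        # Suppress the request if and only if every server is overloaded.
        self.pending.append(request)
```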
Congestion-aware rate-control mechanism
• Process unit at each client:
  • Queue: holds pending requests
  • Selector: Round-Robin dispatching
  • Rate-limiter (RL): per-server rate-control module
  • Feedback: processes feedback and forwards replies
(Figure: requests flow from the Queue through the Selector into per-MDS rate limiters; replies return through the Feedback module)
Congestion-aware rate-control mechanism
• Restrict the number of requests routed to each MDS per ε-long time window
• Gradually increase that per-window limit according to a cubic growth function
• The Feedback module computes receiving rates after each time window and forwards them to the RLs
Congestion-aware rate-control mechanism
• How is a congestion event identified?
  • sending rate > receiving rate
  • elapsed time since the last sending-rate increase event > μ (a hysteresis period)
• What happens then?
  • the current sending rate is recorded as the saturated sending rate
  • the current sending rate is reduced
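A per-server rate-limiter sketch combining the last two slides: a per-window sending budget, a congestion test comparing sending and receiving rates under a hysteresis period μ, and a multiplicative reduction on congestion. The class layout and exact bookkeeping are assumptions; the default μ and γ values are taken from the evaluation setup slide.

```python
import time

class RateLimiter:
    """Per-MDS rate limiter: counting-based, no extra info required from servers."""

    def __init__(self, initial_rate, mu=0.010, gamma=0.20):
        self.rate = initial_rate            # allowed requests per epsilon window
        self.saturated_rate = initial_rate  # R_sat in the next slide
        self.mu = mu                        # hysteresis period, in seconds
        self.gamma = gamma                  # multiplicative decrease factor
        self.last_increase = time.monotonic()

    def on_window_end(self, sending_rate, receiving_rate, new_rate):
        """Called by the Feedback module after each epsilon-long window.

        new_rate is the next value proposed by the cubic growth function
        (see the following slide); it is applied only when no congestion
        event is detected.
        """
        now = time.monotonic()
        congested = (
            sending_rate > receiving_rate
            and now - self.last_increase > self.mu   # hysteresis check
        )
        if congested:
            # Record the saturated sending rate, then back off multiplicatively.
            self.saturated_rate = sending_rate
            self.rate = (1 - self.gamma) * self.saturated_rate
        elif new_rate > self.rate:
            # Sending-rate increase event: raise the limit and restart the clock.
            self.rate = new_rate
            self.last_increase = now
```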
The cubic growth function for the rate-control
• Δt: elapsed time since the last congestion event
• R_sat: the saturated sending rate
  • updated to the current sending rate whenever a congestion event happens
• The current sending rate is then reduced to (1 − γ)·R_sat, and the cubic growth starts over from there
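The formula itself did not survive extraction from the slide; given the stated CUBIC inspiration and the (1 − γ)·R_sat reduction, the growth function presumably has the standard CUBIC shape. The reconstruction below is an assumption for illustration, not the paper's exact equation:

```latex
% Assumed CUBIC-style growth of the sending-rate limit, by analogy with CUBIC TCP:
% R(\Delta t) is the limit \Delta t after the last congestion event,
% R_{sat} the saturated sending rate, \gamma the decrease factor, C a scaling constant.
R(\Delta t) = C\,(\Delta t - K)^{3} + R_{\mathrm{sat}},
\qquad
K = \sqrt[3]{\frac{\gamma \, R_{\mathrm{sat}}}{C}}
```

With this form, R(0) = (1 − γ)·R_sat immediately after a congestion event, and the limit grows back toward R_sat before probing beyond it, matching the behavior described above.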
Evaluation setup
• We implemented a prototype RMSC (replicated metadata server cluster) for simulation purposes
  • up to 8 servers to measure system scalability
  • a crafted descending-capacity setup for the heterogeneous experiments
• 10 clients run on separate machines, launching requests with Poisson arrivals
• ε = 5 ms, μ = 10 ms, γ = 0.20
• To compare against CARD, we also implemented the aforementioned Round-Robin, MART, and LADS strategies
• Refer to the paper for more setup details
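For reference, generating Poisson request arrivals reduces to drawing exponentially distributed inter-arrival times; a minimal sketch, where the arrival rate lam is a placeholder and not a value reported in the paper:

```python
import random

def poisson_arrival_times(lam, duration_s, seed=None):
    """Yield request arrival timestamps for a Poisson process of rate lam (req/s)."""
    rng = random.Random(seed)
    t = 0.0
    while True:
        t += rng.expovariate(lam)     # exponential inter-arrival times
        if t > duration_s:
            return
        yield t

# Example: roughly 1000 requests/s over one second of simulated time.
arrivals = list(poisson_arrival_times(lam=1000, duration_s=1.0, seed=42))
print(len(arrivals))
```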
Evaluation highlights
• Does CARD's rate-control mechanism work as expected?
  • Yes, the rate-control process is effective and adaptive
  • loads among servers are balanced under heavy workloads
• Can CARD achieve better scalability?
  • In homogeneous clusters: CARD ≈ Round-Robin > other strategies
  • In heterogeneous clusters: yes, CARD > other strategies
Examples of the rate-control procedure
• The sending rate from each client to each server is adjusted adaptively according to the receiving rate
(Figure: per-client, per-server sending rates over time)
Overall arriving rates in the homogeneous cluster
(Figures: arriving rates over time under MART and under CARD)
1) Heuristic selection causes severe herd behavior and load oscillations
2) The data-loading job completes earlier when using CARD
Overall arriving rates in the heterogeneous cluster
(Figures: arriving rates over time under LADS and under CARD)
1) A basic threshold-throttling strategy is not sufficient
2) Arriving rates stabilize around each server's capacity when using CARD
Overall throughput in the homogeneous cluster
• Heuristic selection is a poor choice under heavy workloads
• In ideal homogeneous environments, Round-Robin and CARD both achieve great scalability
Overall throughput in the heterogeneous cluster
• Round-Robin falls short when facing heterogeneous setups
• CARD outperforms the other strategies and achieves excellent scalability
Summary: CARD
• An adaptive client-side throttling method: easy and efficient
  • redirects requests from overloaded servers to underloaded servers adaptively under heavy workloads
  • degrades to pure Round-Robin under light-weight workloads
• Boosts throughput significantly over competing strategies in heterogeneous environments