

  1. Avoiding Scheduler Subversion using Scheduler-Cooperative Locks
  Yuvraj Patel, Leon Yang*, Leo Arulraj+, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau, Michael M. Swift
  University of Wisconsin-Madison
  * Now at Facebook, + Now at Cohesity

  2. Competitive environment
  • Every container/VM/user expects its desired share of resources
  • Schedulers play an important role in fulfilling these expectations
  • CPU schedulers are important for CPU allocation
  • The majority of these systems are concurrent systems protected by locks
  [Figure: example use-cases of modern data centers: a container stack (apps, bins/libs, container engine, operating system, physical infrastructure) serving clients C1 and C2, and a VM stack (apps, bins/libs, guest OSes VM1 and VM2, hypervisor, physical infrastructure)]

  3. The problem – Scheduler Subversion
  • Accessing locks can lead to a new problem: "scheduler subversion"
  • Locks determine CPU allocation instead of the scheduler
  • Setup: 2 processes, P0 and P1
    • Default priority
    • P0 holds the lock twice as long as P1
    • Ticket lock (acquisition fairness)
    • Linux CFS scheduler
  [Figure: expected CPU allocation, equal between P0 and P1]

  4. The problem – Scheduler Subversion (continued)
  • Observed: CPU allocation aligns with lock usage, not with the scheduler's equal-share goal
  [Figure: expected vs. observed CPU allocation for P0 and P1]
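The subversion shown on these slides can be sketched with a toy simulation (numbers and function names are illustrative, not from the paper): two equal-priority processes contend on an acquisition-fair (ticket) lock, but because P0's critical sections are twice as long, CPU time ends up split 2:1 by the lock rather than 1:1 by the scheduler.

```python
# Toy model (assumptions mine): a ticket lock hands out the lock in FIFO
# order, so each process acquires it once per round; a process accumulates
# CPU time only while holding the lock.

def simulate_ticket_lock(hold_times, rounds=1000):
    """FIFO lock: each round, every process acquires the lock exactly once."""
    cpu_time = [0.0] * len(hold_times)
    for _ in range(rounds):
        for pid, hold in enumerate(hold_times):
            cpu_time[pid] += hold  # runs only while holding the lock
    return cpu_time

cpu = simulate_ticket_lock([2.0, 1.0])  # P0 holds 2 ms per acquisition, P1 1 ms
print(cpu)              # [2000.0, 1000.0]
print(cpu[0] / cpu[1])  # 2.0: allocation follows lock usage, not priority
```

Even though the lock is perfectly fair in acquisition count, the CPU split tracks lock hold time, which is the subversion the deck describes.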

  5. The solution – Scheduler-Cooperative Locks
  • Scheduler-Cooperative Locks (SCLs) guarantee lock usage fairness by aligning with scheduling goals
  • Three important design components to build SCLs:
    • Track lock usage
    • Penalize dominant users
    • Provide a dedicated window of opportunity to every user
  • Implementation: two user-space locks and one kernel lock
  • Evaluation:
    • Correctness: allocate lock usage according to scheduling goals, even in extreme cases
    • Performance: efficient and scalable
    • Usefulness: apply SCLs to real-world systems (UpScaleDB, KyotoCabinet, Linux kernel)

  6. • Introduction
  • The Problem – Scheduler Subversion
  • The Solution – Scheduler-Cooperative Locks
  • Evaluation
  • Conclusion

  7. Lock & CPU dominance
  • UpScaleDB – embedded key-value database
    • Global mutex lock
  • Workload:
    • 8 threads pinned on 4 CPUs
    • 4 threads perform insert ops, 4 threads perform find ops
    • Default thread priority (equal CPU allocation expected)
    • Run for 120 seconds

  8–9. Lock & CPU dominance
  [Chart: per-thread CPU time (seconds) over the 120-second run, split into lock hold time and wait + other, for find threads F1–F4 and insert threads I1–I4]

  10. Lock & CPU dominance
  • Nearly six times more CPU allocated to insert threads than to find threads

  11. Lock & CPU dominance
  • Insert threads dominate lock usage

  12. Causes of scheduler subversion
  • Two reasons

  13. Reason #1 - Different critical section lengths
  • Threads spend varying amounts of time in the critical section
  • A thread dwelling longer in the critical section becomes the dominant user of the CPU
  [Chart: ratio of median critical section times for various systems, e.g. Put/Get in LevelDB and Insert/Find in UpScaleDB]

  14. Reason #2 - Majority locked run time
  • Time spent in critical sections is high, leading to contention
  • The lock algorithm determines which threads are scheduled
  • Common case in many applications and the OS [1,2,3,4]

  References:
  1. Lock – Unlock: Is That All? A Pragmatic Analysis of Locking in Software Systems. ACM Trans. Comput. Syst., 36(1), March 2019
  2. Remote Core Locking: Migrating Critical-Section Execution to Improve the Performance of Multithreaded Applications. USENIX ATC 2012
  3. Understanding Manycore Scalability of File Systems. USENIX ATC 2016
  4. Non-scalable locks are dangerous. Linux Symposium, 2012

  15. • Introduction
  • The Problem – Scheduler Subversion
  • The Solution – Scheduler-Cooperative Locks
  • Evaluation
  • Conclusion

  16. Scheduler-Cooperative Locks (SCLs)
  • Lock opportunity
    • Amount of time a thread holds the lock, or could acquire the lock when it is free
    • Important metric for measuring lock usage fairness
  • Philosophy
    • Prevent dominant users from acquiring the lock
    • Ensure equal "lock opportunity" for every user
    • Design locks that align with scheduling goals
  • Three important design components
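The lock opportunity metric can be sketched as follows (a minimal illustration with my own names and numbers; the fairness summary uses Jain's fairness index, which is my choice here, not necessarily the paper's exact formula):

```python
# Each thread's lock opportunity = time it held the lock + time the lock
# sat free while the thread was runnable (any runnable thread could have
# acquired it then, so all of them get credit for free time).

def lock_opportunity(hold_time, free_time):
    return {t: hold + free_time for t, hold in hold_time.items()}

def jains_index(values):
    """Jain's fairness index: 1.0 means perfectly equal shares."""
    vals = list(values)
    return sum(vals) ** 2 / (len(vals) * sum(v * v for v in vals))

# Example: over a 10 ms window the lock was free for 1 ms in total;
# thread A held it for 6 ms, thread B for 3 ms.
opp = lock_opportunity({"A": 6.0, "B": 3.0}, free_time=1.0)
print(opp)                        # {'A': 7.0, 'B': 4.0}
print(round(jains_index(opp.values()), 3))  # well below 1.0: unfair
```

The index dropping below 1.0 flags thread A as the dominant user, which is exactly what the later design components act on.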

  17. #1 - Track lock usage
  • Track time spent in critical section

  18. #1 - Track lock usage
  • Track time spent in the critical section

      scl_lock() {
          .....
          lock.start_cs = now()
      }

      scl_unlock() {
          .....
          end_cs = now()
          cs_time = end_cs - lock.start_cs
          .....
      }

  19. #1 - Track lock usage
  • Tracking helps to identify dominant users

  20. #1 - Track lock usage
  • Tracking is flexible
    • Any schedulable entity, such as threads, processes, or containers
    • Type of work – readers or writers
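The scl_lock/scl_unlock tracking above can be sketched as a runnable lock wrapper (class and variable names are mine; per-thread bookkeeping stands in for whatever schedulable entity an SCL tracks):

```python
# Sketch of design component #1: wrap a mutex and record how long each
# thread spends inside the critical section, mirroring the slide's
# scl_lock/scl_unlock pseudocode.
import threading
import time
from collections import defaultdict

class TrackingLock:
    def __init__(self):
        self._lock = threading.Lock()
        self._start_cs = None
        self.cs_time = defaultdict(float)  # accumulated hold time per thread

    def acquire(self):
        self._lock.acquire()
        self._start_cs = time.monotonic()      # lock.start_cs = now()

    def release(self):
        cs = time.monotonic() - self._start_cs  # cs_time = end_cs - start_cs
        self.cs_time[threading.get_ident()] += cs
        self._lock.release()

lock = TrackingLock()

def worker(hold):
    for _ in range(5):
        lock.acquire()
        time.sleep(hold)  # simulated critical-section work
        lock.release()

threads = [threading.Thread(target=worker, args=(h,)) for h in (0.002, 0.001)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The thread with the longer critical sections accumulates the larger total,
# identifying it as the dominant user.
print(sorted(lock.cs_time.values()))
```

Keying the dictionary by thread ID is just one option; the slide notes the same bookkeeping can be kept per process, per container, or per work type (readers vs. writers).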

  21. #2 – Penalize users
  • Penalize dominant users

  22. #2 – Penalize users
  • Penalty calculated while releasing the lock
  • Penalty applied while acquiring the lock
  • Prevents the user from acquiring the lock

      scl_lock() {
          if (penalty) {
              sleep-until-penalty-time
          }
          .....
          lock.start_cs = now()
      }

      scl_unlock() {
          .....
          end_cs = now()
          cs_time = end_cs - lock.start_cs
          calculate penalty, penalty-time
          .....
      }

  23. #2 – Penalize users
  • Penalty based on scheduling goals
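One way the "calculate penalty" step might look (the formula here is my own stand-in, not the paper's exact computation): on unlock, compare the thread's accumulated hold time with its fair share under the scheduling goal, and charge the excess as a delay the thread must sleep off before its next acquisition.

```python
# Sketch of design component #2 (assumption: penalty = usage beyond fair
# share). scl_unlock would compute this; scl_lock would sleep it off.

def penalty_on_unlock(cs_time, fair_share):
    """Return how long the releasing thread must wait before reacquiring."""
    excess = cs_time - fair_share
    return max(0.0, excess)  # only dominant users pay a penalty

# A thread used 6 ms of lock time against a 4 ms fair share: 2 ms penalty.
print(penalty_on_unlock(6.0, 4.0))  # 2.0
# A thread under its fair share pays nothing.
print(penalty_on_unlock(3.0, 4.0))  # 0.0
```

Because fair_share comes from the scheduling goal (equal shares, priorities, and so on), the penalty is what makes the lock "scheduler-cooperative" rather than merely acquisition-fair.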

  24. #3 – Dedicated window of opportunity
  • Lock slice – a dedicated window of opportunity for every user

  25–27. #3 – Dedicated window of opportunity
  [Timeline: P0 and P1 alternate 2 ms lock slices; the slice owner is the lock owner]
  • Owner can acquire the lock multiple times within a slice without penalty
  • Lock acquisition is fast-pathed, improving throughput

  28–29. #3 – Dedicated window of opportunity
  [Timeline: slice ownership transferred to P1 after P0's 2 ms slice]
  • The size of individual critical sections can vary within a slice

  30. #3 – Dedicated window of opportunity
  • Slice ownership alternates between users
  • Wait times depend on the lock slice size

  31. #3 – Dedicated window of opportunity
  • Lock slice:
    • Fixed-size virtual critical section
    • Transferred to the next owner based on the scheduling policy
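The lock-slice timelines above can be sketched as a small simulation (the 2 ms slice and round-robin hand-off are illustrative parameters): each process in turn owns the lock for a fixed slice and completes as many critical sections as fit, so lock opportunity splits evenly even though P0's critical sections are twice as long as P1's.

```python
# Toy model of lock slices (assumptions mine): ownership alternates
# round-robin, and within a slice the owner reacquires without penalty.

SLICE = 2.0  # ms, the fixed-size virtual critical section

def run_slices(cs_lengths, num_slices=10):
    lock_time = [0.0] * len(cs_lengths)  # time each process held the lock
    done = [0] * len(cs_lengths)         # critical sections completed
    for i in range(num_slices):
        owner = i % len(cs_lengths)      # slice ownership alternates
        n = int(SLICE // cs_lengths[owner])  # acquisitions that fit in a slice
        done[owner] += n
        lock_time[owner] += n * cs_lengths[owner]
    return lock_time, done

lock_time, done = run_slices([2.0, 1.0])  # P0: 2 ms CS, P1: 1 ms CS
print(lock_time)  # [10.0, 10.0]: equal lock opportunity for both
print(done)       # [5, 10]: P1 completes twice as many critical sections
```

Contrast this with the ticket-lock behavior from the problem slides: there, equal acquisition counts produced a 2:1 CPU split; here, equal slices restore the scheduler's intended split while still letting the slice owner fast-path repeated acquisitions.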
