Scalability in the Clouds! A Myth or Reality?
Sanidhya Kashyap, Changwoo Min, Taesoo Kim
Programmer's Paradise?
● A programmer's day-to-day task: program compilation, such as building the Linux kernel.
● Relies on Buildbot to complete the job ASAP!
● Expects the job to complete sooner with increasing core count.
  – With respect to vertical scalability, a parallel job with no sequential bottleneck should scale with increasing core count.
How about using cloud providers, for our fun and their profit?
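The scaling expectation above can be made concrete with Amdahl's law. This is a minimal sketch, not something taken from the slides: the function names and the 5% serial fraction in the demo are illustrative assumptions.

```c
#include <stdio.h>

/* Amdahl's law: ideal speedup on `cores` cores when a fraction `serial`
 * (0.0 to 1.0) of the work cannot be parallelized. */
double amdahl_speedup(double serial, int cores)
{
    return 1.0 / (serial + (1.0 - serial) / cores);
}

/* Prints the ideal speedup for the vCPU counts on the slides' x-axis.
 * The serial fraction is a hypothetical input, not a measured value. */
void print_speedups(double serial)
{
    for (int cores = 4; cores <= 32; cores += 4)
        printf("%2d vCPUs: %.2fx\n", cores, amdahl_speedup(serial, cores));
}
```

With `serial = 0.0` the speedup equals the core count, which is the behavior the programmer expects from a parallel build. Calling `print_speedups(0.05)` shows how quickly even a small sequential part flattens the curve.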
Clouds Trend
● The trend is changing: larger instances (40 vCPUs) are available.
● Will Buildbot really scale?
[Figure: bar chart of instance size in vCPUs (y-axis, 0 to 50) by year (x-axis, 2006 to 2015); bar labels rise from 1 to 40.]
Scalability Behavior in the Clouds
[Figure: builds / hour (y-axis, 0 to 140) vs. #vCPUs (x-axis, 4 to 32) for EC2, GCE, Azure, and a 16-core E5. The final build of the slide adds a "?" to the plot.]
Scalability Behavior in VMs with Higher-core count
[Figure: builds / hour (y-axis, 0 to 250) vs. #vCPUs (x-axis, 20 to 160) for Host and Guest. Annotation: 6.7x.]
Why?
● Performance degradation occurs due to a drastic increase in VMEXITs (halt exits).
● Spinlock is sleeping!
[Figure, top: builds / hour (0 to 200) vs. #vCPUs (0 to 160) for the Guest. Bottom: #halt exits x 1000 (0 to 1400) vs. #vCPUs (0 to 160) for the Guest.]
Spinlock Evolution in the Linux Kernel
● Test-and-Test-And-Set spinlock
● 2.6.25 (April 2008): Ticket spinlock, introduced for fairness
● 3.11 (2013): Paravirtual Ticket spinlock
● 3.15 (July 2014): qspinlock, a variant of the MCS lock, motivated by shared cacheline contention (yet to be merged)
● 4.0 (May, 2015): OTicket spinlock
Ticket Spinlock
#define SPIN_THRESHOLD (1 << 15)

int head = 0;                      /* ticket currently being served */
int tail = 0;                      /* next ticket to hand out */
int threshold = SPIN_THRESHOLD;

/* Pseudocode for an atomic fetch-and-increment. */
F&I(int *addr) {
    old = *addr;
    (*addr)++;
    return old;
}

void lock() {
    my_ticket = F&I(&tail);        /* take the next ticket */
    for ( ; ; ) {
        int count = threshold;
        do {
            if (my_ticket == head) /* our turn */
                goto out;
        } while (--count);
    }
out: ;
}

void unlock() { head++; }          /* serve the next ticket */

● Guaranteed FIFO ordering.
● Mitigates starvation with increasing core count.
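The slide's pseudocode can be turned into compilable user-space C. The following is a minimal sketch, assuming C11 atomics and POSIX threads. The type and function names are hypothetical, and `sched_yield()` is only a user-space stand-in for what a guest kernel would do once the spin threshold is exhausted. It is not the kernel's implementation.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

#define SPIN_THRESHOLD (1 << 15)

typedef struct {
    atomic_uint head;   /* ticket currently being served */
    atomic_uint tail;   /* next ticket to hand out */
} ticket_lock;

void ticket_lock_init(ticket_lock *l)
{
    atomic_init(&l->head, 0);
    atomic_init(&l->tail, 0);
}

void ticket_lock_acquire(ticket_lock *l)
{
    /* Atomic fetch-and-increment: the F&I of the slide. */
    unsigned my_ticket = atomic_fetch_add(&l->tail, 1);
    for (;;) {
        int count = SPIN_THRESHOLD;
        do {
            if (atomic_load_explicit(&l->head, memory_order_acquire) == my_ticket)
                return;
        } while (--count);
        sched_yield();  /* assumption: give up the CPU after the threshold */
    }
}

void ticket_lock_release(ticket_lock *l)
{
    /* Only the lock holder writes head, so a release increment suffices. */
    atomic_fetch_add_explicit(&l->head, 1, memory_order_release);
}

/* Demo harness (hypothetical): threads increment a shared counter. */
static ticket_lock demo_lock;
static long demo_counter;

static void *demo_worker(void *arg)
{
    long iters = *(long *)arg;
    for (long i = 0; i < iters; i++) {
        ticket_lock_acquire(&demo_lock);
        demo_counter++;             /* protected by the ticket lock */
        ticket_lock_release(&demo_lock);
    }
    return NULL;
}

long run_counter_demo(int nthreads, long iters)
{
    pthread_t tids[64];
    if (nthreads > 64)
        nthreads = 64;
    ticket_lock_init(&demo_lock);
    demo_counter = 0;
    for (int i = 0; i < nthreads; i++)
        pthread_create(&tids[i], NULL, demo_worker, &iters);
    for (int i = 0; i < nthreads; i++)
        pthread_join(tids[i], NULL);
    return demo_counter;
}
```

If the lock provides mutual exclusion, `run_counter_demo(4, 500)` returns exactly 4 × 500 increments. Running it with more threads than cores is slow for the same reason the slides go on to discuss: waiters burn their time slice while the thread that must go next is not running.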
Complexity of Ticket Spinlock in Virtualized Environment
● vCPUs are scheduled by the host scheduler.
● Semantic gap between the hypervisor and guest OS.
[Figure: a guest OS with vCPU 1 and vCPU 2 running on a hypervisor over a single physical CPU 1.]
Complexity of Ticket Spinlock in Virtualized Environment
● Lock Holder Preemption: the vCPU holding the lock gets preempted.
[Animation: vCPUs are marked Scheduled or Preempted. Tickets are taken one at a time: ticket 0 (head = 0, tail = 1), then ticket 1 (head = 0, tail = 2), then ticket 2 (head = 0). In the last frames head = 1, with tickets 1 and 2 outstanding.]
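The cost of lock holder preemption can be illustrated with a toy, single-threaded model. This is a sketch under stated assumptions, not a measurement and not the authors' method: time advances in discrete steps, every vCPU takes its ticket at step 0, the holder of ticket 0 is descheduled for the first `preempt_steps` steps, and every waiter stays scheduled and spins. The function name and parameters are hypothetical.

```c
/* Returns the total number of steps that waiting vCPUs spend spinning.
 *   nvcpus        - vCPUs contending for the lock (tickets 0..nvcpus-1)
 *   cs_steps      - length of each critical section, in steps
 *   preempt_steps - steps for which the first lock holder is preempted */
long simulate_lhp(int nvcpus, long cs_steps, long preempt_steps)
{
    int head = 0;                /* ticket currently being served */
    long remaining = cs_steps;   /* work left in the current critical section */
    long wasted = 0;
    long t = 0;

    while (head < nvcpus) {
        /* Only the first holder is preempted in this model. */
        int holder_running = !(head == 0 && t < preempt_steps);

        /* Every vCPU with a ticket above head spins for this step. */
        wasted += nvcpus - head - 1;

        if (holder_running && --remaining == 0) {
            head++;              /* unlock(): serve the next ticket */
            remaining = cs_steps;
        }
        t++;
    }
    return wasted;
}
```

For three vCPUs and a two-step critical section, `simulate_lhp(3, 2, 0)` counts 6 wasted steps, and `simulate_lhp(3, 2, 10)` counts 26. Each step of preemption is paid once by every waiter, because FIFO ordering leaves no other vCPU able to take the lock in the meantime.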