Scalability in the Clouds! A Myth or Reality? Sanidhya Kashyap, Changwoo Min, Taesoo Kim



  1. Scalability in the Clouds! A Myth or Reality? Sanidhya Kashyap, Changwoo Min, Taesoo Kim

  2. Programmer's Paradise?
     ● A programmer's day-to-day task: program compilation, such as compiling the Linux kernel.
     ● Relies on Buildbot to complete the job ASAP!
     ● Expects the job to complete sooner with increasing core count.
       – With respect to vertical scalability, a parallel job with no sequential bottleneck should scale with increasing core count.

  3. Programmer's Paradise?
     ● How about using cloud providers for our fun and their profit?
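
The expectation on slide 2, that a job with no sequential bottleneck scales with core count, is the limiting case of Amdahl's law. The sketch below is an illustration and not part of the talk; `amdahl_speedup` and its parameters are hypothetical names, with `p` the parallel fraction of the work and `n` the number of cores.

```c
#include <assert.h>

/*
 * Upper bound on speedup from Amdahl's law.
 * p: fraction of the work that can run in parallel (0.0 to 1.0)
 * n: number of cores
 * Hypothetical helper for illustration; not taken from the slides.
 */
double amdahl_speedup(double p, int n)
{
    assert(p >= 0.0 && p <= 1.0 && n >= 1);
    /* Serial part (1 - p) is unchanged; parallel part p is divided by n. */
    return 1.0 / ((1.0 - p) + p / (double)n);
}
```

With `p = 1.0` the function returns `n`, which is the linear scaling the slide expects from a build job. With `p = 0.95` the bound for 32 cores is already below 13.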

  4. Clouds Trend
     ● Trend is changing: larger instances (40 vCPUs) are available.
     ● Will Buildbot really scale?
     [Figure: bar chart of vCPUs (y-axis, 0 to 50) by year (x-axis, 2006 to 2015).]

  5-12. Scalability Behavior in the Clouds
     [Figure: builds / hour (y-axis, 0 to 140) versus #vCPUs (x-axis, 4 to 32). Series are added one per slide: EC2, GCE, Azure, and 16-core E5. The last slide adds a "?" annotation.]

  13-15. Scalability Behavior in VMs with Higher-core count
     [Figure: builds / hour (y-axis, 0 to 250) versus #vCPUs (x-axis, 20 to 160) for Host and Guest. The last slide adds a "6.7x" annotation.]

  16-17. Why?
     ● Performance degradation occurs due to a drastic increase in VMEXITs (halt exits).
     ● Spinlock is sleeping!
     [Figures: builds / hour versus #vCPUs, and #halt exits (x 1000) versus #vCPUs, both for the Guest with up to 160 vCPUs.]

  18-23. Spinlock Evolution in the Linux Kernel
     ● Test-and-Test-And-Set spinlock
     ● 2.6.25 (April 2008): Ticket spinlock, motivated by fairness
     ● 3.11 (2013): Paravirtual Ticket spinlock
     ● 3.15 (July 2014): qspinlock, a variant of the MCS lock (yet to be merged), motivated by shared cacheline contention
     ● 4.0 (May, 2015): OTicket

  24-31. Ticket Spinlock

         #define SPIN_THRESHOLD (1 << 15)

         int head = 0;                      /* ticket currently being served */
         int tail = 0;                      /* next ticket to hand out */
         int threshold = SPIN_THRESHOLD;

         /* Fetch-and-increment; assumed to execute atomically. */
         F&I(*addr) {
             old = *addr;
             (*addr)++;
             return old;
         }

         void lock() {
             my_ticket = F&I(tail);
             for ( ; ; ) {
                 int count = threshold;
                 do {
                     if (my_ticket == head)
                         goto out;
                 } while (--count);
             }
         out: ;
         }

         void unlock() {
             head++;
         }

     ● Guaranteed FIFO ordering.
     ● Mitigates starvation with increasing core count.
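
The slide's pseudocode leaves `F&I` abstract. As a minimal sketch, assuming C11 atomics are available, the same lock can be written in compilable C. The names `struct ticket_lock`, `ticket_lock_acquire`, and `ticket_lock_release` are hypothetical and not from the talk or the kernel.

```c
#include <assert.h>
#include <stdatomic.h>

#define SPIN_THRESHOLD (1 << 15)

/* head: ticket now being served; tail: next ticket to hand out. */
struct ticket_lock {
    atomic_uint head;
    atomic_uint tail;
};

void ticket_lock_acquire(struct ticket_lock *l)
{
    /* Atomic fetch-and-increment, standing in for the slide's F&I(tail). */
    unsigned int my_ticket = atomic_fetch_add(&l->tail, 1u);

    for (;;) {
        int count = SPIN_THRESHOLD;
        do {
            if (atomic_load(&l->head) == my_ticket)
                return;             /* our turn: lock acquired */
        } while (--count);
        /* Threshold expired. As on the slide, spin for another round. */
    }
}

void ticket_lock_release(struct ticket_lock *l)
{
    /* The lock must be held: head and tail differ while it is. */
    assert(atomic_load(&l->head) != atomic_load(&l->tail));
    /* Serve the next ticket in FIFO order. */
    atomic_fetch_add(&l->head, 1u);
}
```

A caller would zero-initialize a `struct ticket_lock` and bracket its critical section with the two calls. The default `atomic_*` operations are sequentially consistent, which is stronger than a lock strictly needs but keeps the sketch simple.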

  32. Complexity of Ticket Spinlock in Virtualized Environment
     ● vCPUs are scheduled by the host scheduler.
     ● Semantic gap between the hypervisor and guest OS.
     [Diagram: a Guest OS with vCPU 1 and vCPU 2 on top of a Hypervisor with CPU 1.]

  33-39. Complexity of Ticket Spinlock in Virtualized Environment
     ● Lock Holder Preemption: vCPU holding the lock gets preempted.
     [Animation: frames show ticket 0, then tickets 0 and 1, then tickets 0, 1, and 2 queued with head = 0 (tail = 1, then 2), followed by tickets 1 and 2 with head = 1. Legend: Scheduled, Preempted.]
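
To make the cost of lock holder preemption concrete, here is a toy, deterministic model and not a measurement from the talk. It assumes one physical CPU that runs the vCPUs round-robin for `slice` ticks each, and that every vCPU takes the ticket lock once and needs `hold` ticks inside the critical section. The function `simulate_lhp`, the struct `lhp_result`, and all parameters are hypothetical.

```c
#include <assert.h>

#define MAX_VCPUS 64

struct lhp_result {
    long total_ticks;  /* physical CPU ticks consumed overall */
    long spin_ticks;   /* ticks burned spinning on the lock */
};

/* Toy model: nvcpus share one CPU; each needs the lock for `hold` ticks. */
struct lhp_result simulate_lhp(int nvcpus, int slice, int hold)
{
    int ticket[MAX_VCPUS];
    int left[MAX_VCPUS];           /* critical-section ticks still needed */
    int head = 0, tail = 0, finished = 0, cur = 0;
    struct lhp_result r = {0, 0};

    assert(nvcpus >= 1 && nvcpus <= MAX_VCPUS && slice >= 1 && hold >= 1);
    for (int i = 0; i < nvcpus; i++) {
        ticket[i] = -1;
        left[i] = hold;
    }

    while (finished < nvcpus) {
        /* vCPU `cur` is scheduled; every other vCPU is preempted. */
        for (int t = 0; t < slice && left[cur] > 0; t++) {
            if (ticket[cur] < 0)
                ticket[cur] = tail++;      /* take a ticket */
            r.total_ticks++;
            if (ticket[cur] == head) {     /* holder makes progress */
                if (--left[cur] == 0) {
                    head++;                /* unlock */
                    finished++;
                }
            } else {
                r.spin_ticks++;            /* holder is preempted: wasted */
            }
        }
        cur = (cur + 1) % nvcpus;          /* round-robin to the next vCPU */
    }
    return r;
}
```

In this model, `simulate_lhp(3, 4, 6)` reports 12 spin ticks out of 30, because each holder is descheduled partway through its critical section while the waiters burn their whole slice. With `simulate_lhp(3, 6, 6)`, where the critical section fits in one slice, no ticks are spent spinning.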
