CSE 3320 Operating Systems
Multiprocessor Scheduling
Jia Rao
Department of Computer Science and Engineering
http://ranger.uta.edu/~jrao
Recap of the Last Class
• Basic scheduling policies on uniprocessors (time-sharing: which thread should run next?)
  o First Come First Serve
  o Shortest Job First
  o Round Robin
  o Priority scheduling
  o Multilevel feedback queue
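The time-sharing question ("which thread should run next?") is easiest to see in Round Robin. Below is a minimal, self-contained Python sketch (the function name and the quantum value are illustrative, not from the slides):

```python
from collections import deque

def round_robin(bursts, quantum):
    """Simulate Round Robin over CPU bursts (in ms); return the finish order."""
    queue = deque(enumerate(bursts))       # (thread id, remaining time)
    order = []
    while queue:
        tid, remaining = queue.popleft()   # pick the next thread
        if remaining > quantum:
            queue.append((tid, remaining - quantum))  # preempt and requeue
        else:
            order.append(tid)              # thread finishes within this quantum
    return order

# Thread 1 (3 ms) finishes in its first quantum; thread 0 (10 ms) needs three.
print(round_robin([10, 3, 6], quantum=4))  # [1, 2, 0]
```

Short jobs finish early even though they arrived after long ones, which is the fairness property Round Robin buys at the cost of extra context switches.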
Multiprocessor Scheduling
• Two-dimensional scheduling: which thread to run, and where?
  o Time-sharing on each processor
  o Load balancing among multiple processors
• Several issues
  o Why load balancing? To take advantage of parallelism
  o Is simple time-sharing enough? No, a group of related threads may need to be scheduled together
  o Are all processors/cores equal? No; cache affinity, memory locality, and cache hotness make them different
Multiprocessor Hardware
• Uniform memory access (UMA)
[Figure: a schematic view of Intel Core 2 — cores with caches share a front-side bus (FSB) to the DRAM controller and memory]
Multiprocessor Hardware (cont'd)
• Non-uniform memory access (NUMA)
  1. Local vs. remote memory
  2. Cache sharing
     o Constructive
     o Destructive
[Figure: a schematic view of Intel Nehalem — two processors (nodes 0 and 1), each with local RAM behind an integrated memory controller (IMC) and four cores sharing an L3 cache, connected by the QPI interconnect]
Ready Queue Implementation
• A single system-wide ready queue; every processor calls pick_next_task() on it
Pros:
1. Easy to implement
2. Perfect load balancing
Cons:
1. Scalability issues due to centralized synchronization: high overhead and low efficiency
2. Hard to maintain cache hotness
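A minimal sketch of the single-queue design in Python (names like `enqueue_task` are illustrative; `pick_next_task` echoes the slide). The one lock that every CPU must take is exactly the centralized synchronization point the cons refer to:

```python
import threading
from collections import deque

ready_queue = deque()            # the single system-wide ready queue
queue_lock = threading.Lock()    # every CPU contends on this one lock

def enqueue_task(task):
    with queue_lock:
        ready_queue.append(task)

def pick_next_task():
    """Called by any processor; returns None when nothing is runnable."""
    with queue_lock:             # centralized synchronization point
        return ready_queue.popleft() if ready_queue else None

enqueue_task("A"); enqueue_task("B")
assert pick_next_task() == "A"   # any CPU gets the global head: perfect balance
```

Because every idle CPU takes whatever is at the head, load balance is perfect by construction, but a task can land on a CPU whose cache holds none of its data.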
Ready Queue Implementation (cont'd)
• Per-CPU ready queues; each processor calls pick_next_task() on its own queue
• Load balancing: keep queue sizes balanced
Pros:
1. Scalable to many CPUs
2. Easy to maintain cache hotness
Cons:
1. More complex to implement (push model vs. pull model)
2. Load balancing is imperfect → queues are not always balanced
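The per-CPU design removes the global lock: each queue is touched only by its own CPU in the common case. A sketch under assumed names (`runqueues`, `imbalance`, and the CPU count are illustrative):

```python
from collections import deque

NCPUS = 4
runqueues = [deque() for _ in range(NCPUS)]   # one ready queue per CPU

def enqueue_task(cpu, task):
    runqueues[cpu].append(task)               # no global lock on the fast path

def pick_next_task(cpu):
    rq = runqueues[cpu]
    return rq.popleft() if rq else None       # task stays on its CPU: cache-hot

def imbalance():
    """The quantity a load balancer would monitor: max - min queue length."""
    sizes = [len(rq) for rq in runqueues]
    return max(sizes) - min(sizes)
```

When `imbalance()` grows, some CPUs sit idle while others have backlogs, which is why the next slide's push and pull mechanisms exist.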
Push Model vs. Pull Model
• Push model
  o Every once in a while, a kernel thread checks for load imbalance and moves (kicks) threads from overloaded queues to underloaded ones
• Pull model
  o Whenever a queue becomes empty, its processor steals a thread from a non-empty queue
• Both models are widely used
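The two models can be sketched as two functions over the per-CPU queues (a simplification under assumed names; a real balancer like the kernel's `load_balance` also weighs cache hotness and scheduling domains):

```python
from collections import deque

def push_balance(runqueues):
    """Push model: a periodic kernel thread moves tasks from the busiest
    queue to the idlest one until they are roughly even."""
    busiest = max(runqueues, key=len)
    idlest = min(runqueues, key=len)
    while len(busiest) - len(idlest) > 1:
        idlest.append(busiest.pop())      # migrate one task

def pull_task(runqueues, my_rq):
    """Pull model: a CPU whose queue is empty steals from a non-empty one."""
    if my_rq:
        return my_rq.popleft()
    donors = [rq for rq in runqueues if rq]
    if donors:
        victim = max(donors, key=len)     # steal from the busiest queue
        return victim.pop()
    return None
```

Push pays a constant background cost but bounds imbalance; pull costs nothing until a CPU actually runs dry, at which point the steal keeps it busy.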
Scheduling Parallel Programs
• A parallel job
  o A collection of processes/threads that cooperate to solve the same problem
  o Scheduling matters to overall job completion time
• Why scheduling matters
  o Synchronization on shared data (mutex)
  o Causality between threads (producer-consumer)
  o Synchronization on execution phases (barrier)
→ The slowest thread delays the entire job
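The barrier point can be made concrete with a small model: when every phase ends in a barrier, the phase lasts as long as its slowest thread, so phase times add up as max(), not mean() (function name and the sample numbers are illustrative):

```python
def phase_time(thread_times):
    """With a barrier at the end of a phase, the whole phase takes as
    long as its slowest thread."""
    return max(thread_times)

# Three threads per phase; one straggler dominates each phase.
phases = [[2, 2, 9], [3, 3, 3], [1, 8, 1]]
total = sum(phase_time(p) for p in phases)
print(total)  # 9 + 3 + 8 = 20
```

If the scheduler descheduled one thread per phase (the stragglers above), the job takes 20 time units even though the average per-thread work is far lower — this is why a parallel job's threads should run together.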
Space Sharing
• Divide processors into groups
  o Dedicate each group to a parallel job
  o No preemption before job completion
Pros:
1. Highly efficient scheduling: low overhead, no context switches within a job
2. Strong affinity
Cons:
1. Can be highly inefficient overall: cycles are wasted when dedicated processors idle
2. Inflexible
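A sketch of the space-sharing policy as a simple partitioner (the function name and greedy placement are assumptions for illustration; real systems also handle job departures):

```python
def space_share(ncpus, jobs):
    """Dedicate a contiguous group of CPUs to each job, no preemption.
    jobs: list of (name, cpus_needed). Returns {name: [cpu ids]}; a job
    that does not fit is left unplaced -- the inflexibility cost."""
    placement, next_cpu = {}, 0
    for name, need in jobs:
        if next_cpu + need <= ncpus:
            placement[name] = list(range(next_cpu, next_cpu + need))
            next_cpu += need
        # else: the job must wait, even if some CPUs sit idle
    return placement

print(space_share(8, [("A", 4), ("B", 3), ("C", 2)]))
# {'A': [0, 1, 2, 3], 'B': [4, 5, 6]} -- C waits while CPU 7 idles: cycle waste
```

The example shows both sides of the slide's trade-off: jobs A and B get dedicated CPUs with zero scheduling overhead, while C cannot start and CPU 7 is wasted.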
Time Sharing: Gang or Co-Scheduling
• Each processor runs threads from multiple jobs
  o Groups of related threads are scheduled as a unit, a gang
  o All CPUs perform context switches together
• Gang scheduling is the stricter form of co-scheduling
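Strict gang scheduling can be sketched as a timetable: each time slot is given wholly to one gang, and all of that gang's threads run in lockstep (function name, round-robin slot order, and sample gangs are illustrative assumptions):

```python
def gang_schedule(gangs, ncpus, slots):
    """Strict gang scheduling: each time slot runs all threads of exactly
    one gang together; a gang is never split across slots."""
    timeline = []
    for t in range(slots):
        name, nthreads = gangs[t % len(gangs)]   # round-robin among gangs
        assert nthreads <= ncpus, "a gang must fit: all threads run together"
        timeline.append((name,) * nthreads)      # all CPUs switch in lockstep
    return timeline

print(gang_schedule([("A", 3), ("B", 2)], ncpus=4, slots=4))
# [('A', 'A', 'A'), ('B', 'B'), ('A', 'A', 'A'), ('B', 'B')]
```

Note the cost of strictness: when gang A (3 threads) holds a slot on 4 CPUs, one CPU idles rather than running a thread of B — co-scheduling relaxes exactly this, allowing leftover CPUs to run unrelated threads.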
Summary
• Multiprocessor hardware
• Two implementations of the ready queue
  o A single queue vs. multiple queues
• Load balancing
  o Push model vs. pull model
• Parallel program scheduling
  o Space sharing vs. time sharing
• Additional practice
  o See the load balancer part in http://www.scribd.com/doc/24111564/Project-Linux-Scheduler-2-6-32
  o See LINUX_SRC/kernel/sched.c, functions load_balance and pull_task