
Introduction to Multiprocessor Real-Time Systems - PowerPoint PPT Presentation



Real-time Scheduling and Synchronization Seminar
Introduction to Multiprocessor Real-Time Systems
WS 2012/2013
Björn Brandenburg, Real-Time Systems Group, MPI-SWS

Three Kinds of Multiprocessors
                          Proc. 1        Proc. 2              Proc. 3
  Identical:              2 GHz (FPU)    2 GHz (FPU)          2 GHz (FPU)
  Uniform heterogeneous:  2 GHz (FPU)    1 GHz (FPU)          500 MHz (FPU)
  Unrelated heterogeneous: 1 GHz (FPU)   3 GHz (large cache)  500 MHz (I/O coproc.)

identical:
➡ all processors have equal speed and capabilities
uniform heterogeneous (or homogeneous):
➡ all processors have equal capabilities
➡ but different speeds
unrelated heterogeneous:
➡ no regular relation assumed
➡ tasks may not be able to execute on all processors
We consider only identical multiprocessors in this class.

What Makes Multiprocessor Scheduling Hard?
“Few of the results obtained for a single processor generalize directly to the multiple processor case; bringing in additional processors adds a new dimension to the scheduling problem. The simple fact that a task can use only one processor even when several processors are free at the same time adds a surprising amount of difficulty to the scheduling of multiple processors.” [emphasis added]
Liu, C. L. (1969). Scheduling algorithms for multiprocessors in a hard real-time environment. In JPL Space Programs Summary, vol. 37-60. JPL, Pasadena, CA, 28–31.

Scheduling Approaches
(Figure: jobs J1–J4 either in per-core ready queues Q1–Q4 or in a single shared queue; Cores 1–4, each with a private L2 cache, above shared main memory.)
Partitioned Scheduling
➡ tasks statically assigned to cores
➡ one ready queue per core
➡ uniprocessor scheduler on each core
Global Scheduling
➡ jobs migrate freely
➡ all cores serve a shared ready queue
➡ requires new schedulability analysis
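The difference between the two approaches can be sketched in a few lines. This is a minimal illustration of my own (function and job names are made up, not from the slides): partitioned scheduling keeps one EDF-ordered ready queue per core, while global scheduling serves one shared queue with all cores.

```python
import heapq

def partitioned_dispatch(jobs, assignment, num_cores):
    """Partitioned: each job may only run on its statically assigned core."""
    queues = [[] for _ in range(num_cores)]
    for deadline, name in jobs:
        heapq.heappush(queues[assignment[name]], (deadline, name))
    # each core picks the earliest-deadline job from its own queue
    return [heapq.heappop(q)[1] if q else None for q in queues]

def global_dispatch(jobs, num_cores):
    """Global: all cores serve one shared ready queue; jobs may run anywhere."""
    queue = sorted(jobs)  # EDF order: earliest absolute deadline first
    return [name for _, name in queue[:num_cores]]

jobs = [(10, "J1"), (4, "J2"), (7, "J3"), (2, "J4")]  # (deadline, name)
print(partitioned_dispatch(jobs, {"J1": 0, "J2": 0, "J3": 1, "J4": 1}, 2))
print(global_dispatch(jobs, 2))
```

Note how the partitioned variant may leave an urgent job waiting behind its core's queue even while another core idles, which is exactly the trade-off the slides develop.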

Global Scheduling — Dhall Effect
Uniprocessor utilization bounds:
➡ EDF = 1
➡ Rate-Monotonic (RM) = ln 2
Question: What are the utilization bounds on a multiprocessor?
➡ Notation: m is the number of processors
➡ Intuition: we would like to fully utilize all processors!
Guesses?
➡ Global EDF = ?
➡ Global RM = ?

Dhall Effect — Illustration
A difficult task set (m + 1 tasks):
➡ First m tasks T_i (1 ≤ i ≤ m):
‣ Period = 1
‣ WCET = 2ε
➡ Last task T_{m+1}:
‣ Period = 1 + ε
‣ WCET = 1
Total utilization?
(Figure: T_1 scheduled on processor 1, T_2 on processor 2, …; timeline marks release, deadline, and completion at times 0, 2ε, 1, and 1 + ε.)
Dhall, S. and Liu, C. (1978). On a real-time scheduling problem. Operations Research, 26(1):127–140.

Dhall Effect — Implications
Utilization bounds:
➡ For ε → 0, the utilization bound approaches 1.
➡ Adding processors makes no difference!
Global vs. partitioned scheduling:
➡ The Dhall effect shows a limitation of global EDF and RM scheduling.
➡ Researchers lost interest in global scheduling for ~25 years.
Since the late 1990s…
➡ It is a limitation of EDF and RM, not of global scheduling in general.
➡ Much recent work on global scheduling.

Partitioned Scheduling
Reduction to m uniprocessor problems:
➡ Assign each task statically to one processor.
➡ Use a uniprocessor scheduler on each core.
‣ Either fixed-priority (P-FP) scheduling or EDF (P-EDF)
Find a task mapping such that:
➡ no processor is over-utilized
➡ each partition is schedulable
‣ trivial for implicit deadlines & EDF
➡ Partitioned scheduling is easier to implement.
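The Dhall-effect numbers can be checked directly. This sketch (helper names are mine) builds the task set from the slide and verifies both claims: total utilization barely exceeds 1 even on m cores, and under global EDF the long task misses its deadline for any ε > 0, because the m short jobs have the earlier deadline (1 < 1 + ε) and occupy all cores during [0, 2ε].

```python
def dhall_task_set(m, eps):
    """m short tasks (period 1, WCET 2*eps) plus one long task
    (period 1 + eps, WCET 1), as (period, WCET) pairs."""
    short = [(1.0, 2 * eps)] * m
    long_task = (1.0 + eps, 1.0)
    return short + [long_task]

def total_utilization(tasks):
    return sum(wcet / period for period, wcet in tasks)

m, eps = 4, 1e-6
tasks = dhall_task_set(m, eps)
print(total_utilization(tasks))  # barely above 1, despite 4 cores

# Global EDF: the long job can only start once a core frees up at 2*eps,
# so it finishes at 2*eps + 1, which exceeds its deadline 1 + eps.
finish = 2 * eps + 1.0
deadline = 1.0 + eps
print(finish > deadline)  # True: deadline miss for any eps > 0
```

As ε → 0 the total utilization tends to 1 from above, which is why the global-EDF utilization bound collapses to 1 regardless of m.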

Connection to Bin Packing
Bin packing decision problem: Given a number of bins B, a bin capacity V, and a set of n items x_1, …, x_n with sizes a_1, …, a_n, does there exist a packing of x_1, …, x_n that fits into B bins?
Bin packing optimization problem: Given a bin capacity V and a set of n items x_1, …, x_n with sizes a_1, …, a_n, assign each item to a bin such that the number of bins is minimized.

Bin-Packing Reduction
1) Normalize sizes a_1, …, a_n and capacity V
➡ assume unit-speed processors
2) Create an implicit-deadline sporadic task T_i for each item x_i
➡ with utilization u_i = a_i / V
➡ pick the period arbitrarily, scale the WCET appropriately
3) Is the resulting task set feasible under P-EDF on B processors?
➡ Hence, finding a valid partitioning is NP-hard.

Upper Utilization Bound
Theorem: there exist task sets with utilizations arbitrarily close to (m + 1)/2 that cannot be partitioned.
Andersson, B., Baruah, S., and Jonsson, J. (2001). Static-priority scheduling on multiprocessors. In Proceedings of the 22nd IEEE Real-Time Systems Symposium, pages 193–202.
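The three reduction steps can be sketched as follows (the function name and the default period of 100 are my own choices; the slides only require that the period be picked arbitrarily and the WCET scaled to match u_i = a_i / V):

```python
def reduction(sizes, capacity, period=100.0):
    """Turn a bin-packing instance into an implicit-deadline sporadic
    task set: one task per item, utilization u_i = a_i / V."""
    tasks = []
    for a in sizes:
        u = a / capacity                    # step 1: normalize by capacity
        tasks.append((period, u * period))  # step 2: (period, WCET) with WCET/period = u
    return tasks

# Example instance (mine): items of sizes 6, 4, 5, 5 with bin capacity 10.
tasks = reduction([6, 4, 5, 5], 10)
print(tasks)
# Step 3: the items fit into B = 2 bins ({6, 4} and {5, 5}), so this task
# set is feasible under P-EDF on 2 unit-speed processors; deciding this
# in general is exactly bin packing, hence NP-hard.
```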
A difficult-to-partition task set (m + 1 tasks):
➡ For each T_i, 1 ≤ i ≤ m + 1:
‣ Period = 2
‣ WCET = 1 + ε
‣ Utilization = (1 + ε)/2
Partitioning not possible:
➡ Any two tasks together over-utilize a single processor by ε!
➡ Total utilization = (m + 1) · (1 + ε)/2

Partitioning in Practice (I)
Bottom line: heuristics work well most of the time (for independent tasks).
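One such heuristic is first-fit decreasing, a standard bin-packing strategy; this sketch is my own illustration (names and example utilizations are made up). It partitions a typical task set easily, but necessarily fails on the hard instance above, where any two tasks of utilization (1 + ε)/2 overload one core by ε.

```python
def first_fit_decreasing(utilizations, m):
    """Assign tasks to m processors, heaviest first, each on the first
    processor with room (uniprocessor EDF bound: per-core load <= 1).
    Returns the per-processor assignment, or None if the tasks don't fit."""
    cores = [[] for _ in range(m)]
    loads = [0.0] * m
    for u in sorted(utilizations, reverse=True):
        for i in range(m):
            if loads[i] + u <= 1.0:
                cores[i].append(u)
                loads[i] += u
                break
        else:
            return None  # no core can hold this task
    return cores

# A typical independent task set partitions easily on 2 cores:
print(first_fit_decreasing([0.5, 0.3, 0.6, 0.4], 2))

# The hard instance: m + 1 tasks of utilization (1 + eps)/2 each.
m, eps = 4, 0.01
hard = [(1 + eps) / 2] * (m + 1)
print(first_fit_decreasing(hard, m))      # None: m cores never suffice
print(first_fit_decreasing(hard, m + 1))  # succeeds, one task per core
```

This matches both slides: the (m + 1)/2 bound is a genuine worst case, yet simple heuristics handle everyday task sets well.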
