Large-Scale Distributed Systems and Networks TDDE35: Lectures on Embedded Systems

Petru Eles, Institutionen för Datavetenskap (IDA), Linköpings Universitet. Email: petru.eles@liu.se, http://www.ida.liu.se/~petel71/, phone: 28 1396, B building,


  1. Finite State Machines
     Elevator controller. Each transition in the state diagram is labelled "input event / output".
     Input events: {r1, r2, r3}
     - ri: request from floor i.
     Outputs: {d2, d1, n, u1, u2}
     - di: go down i floors
     - ui: go up i floors
     - n: stay idle
     States: {S1, S2, S3}
     - Si: elevator is at floor i.
     Transitions:
     - from S1: r1/n (stay in S1), r2/u1 (to S2), r3/u2 (to S3)
     - from S2: r1/d1 (to S1), r2/n (stay in S2), r3/u1 (to S3)
     - from S3: r1/d2 (to S1), r2/d1 (to S2), r3/n (stay in S3)
     [Figure: state diagram with states S1, S2, S3; one state is marked as the initial state.]
     TDDE35/ Embedded Systems 28 of 128
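The elevator controller above can be sketched as a small Mealy machine. This is only an illustration: the function names `step` and `run` are made up for the sketch, and the choice of S1 as the starting state in the usage line is an assumption, since the text does not say which state is the initial one.

```python
# Mealy machine for the three-floor elevator controller.
# A state S_i is represented by the floor number i, an input event r_j by j.

FLOORS = (1, 2, 3)

def step(floor, request):
    """Return (next_floor, output) for request r_<request> in state S_<floor>."""
    if floor not in FLOORS or request not in FLOORS:
        raise ValueError("floors and requests must be 1, 2 or 3")
    delta = request - floor
    if delta == 0:
        output = "n"            # already there: stay idle
    elif delta > 0:
        output = f"u{delta}"    # go up delta floors
    else:
        output = f"d{-delta}"   # go down -delta floors
    return request, output

def run(start_floor, requests):
    """Feed a sequence of requests to the machine; return (final_floor, outputs)."""
    floor, outputs = start_floor, []
    for request in requests:
        floor, out = step(floor, request)
        outputs.append(out)
    return floor, outputs

# Usage (starting in S1 is an assumption for this sketch).
print(run(1, [2, 3, 1, 1]))
```

Because the output depends on both the current state and the input event, the same request (for example r1) produces d1, d2 or n depending on where the elevator is.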

  2. A Design Example
     The system to be implemented is modelled as a task graph:
     - a node represents a task (a unit of functionality activated in response to a certain input and which generates a certain output).
     - an edge represents a precedence constraint and data dependency between two tasks.
     Period: 42 time units
     - The task graph is activated every 42 time units, so an activation has to terminate in less than 42 time units.
     Cost limit: 8
     - The total cost of the implemented system has to be less than 8.
     [Figure: task graph with nodes T1 to T8.]

  3. Traditional Design Flow
     [Figure: flow chart.] Informal Specification, Constraints -> Modeling -> System Model (checked by Functional Simulation) -> Select Architecture -> Hardware and Software Implementation -> Prototype -> Testing -> (OK) Fabrication. If testing is not OK, the flow loops back.

  9. Traditional Design Flow
     [Figure: flow chart.] Informal Specification, Constraints -> Modeling -> System Model (Functional Simulation) -> Select Architecture -> Hardware and Software Implementation -> Prototype -> Testing -> (OK) Fabrication; not OK: loop back.
     1. Start from some informal specification of functionality and a set of constraints.
     2. Generate a more formal model of the functionality, based on some modeling concept. In our example, this model is the task graph.
     3. Simulate the model in order to check the functionality. If needed, make adjustments.
     4. Choose an architecture (microprocessor, buses, etc.) such that cost limits are satisfied and, you hope, time and power constraints are fulfilled.
     5. Build a prototype and implement the system.
     6. Verify the system: neither time nor power constraints are satisfied.

  10. Traditional Design Flow
      Now you are in great trouble: you have spent a lot of time and money and nothing works!
      - Go back to step 4, choose a new architecture and start a new implementation.
      - Or negotiate with the customer on the constraints.

  11. The Traditional Design Flow
      The consequences:
      - Delays in the design process
        - Increased design cost
        - Delays in time to market -> missed market window
      - High cost of failed prototypes
      - Bad design decisions taken under time pressure
        - Low quality, high cost products

  12. [Figure: the traditional design flow chart (Informal Specification, Constraints -> Modeling -> System Model -> Select Architecture -> Hardware and Software Implementation -> Prototype -> Testing -> Fabrication), with the annotation "More work should be done here!" next to the Select Architecture step.]

  15. Example
      We have the system model (task graph), which has been validated by simulation.
      We decide on a certain microprocessor µp1, with cost 6.
      For each task the worst case execution time (WCET) when run on µp1 is estimated.
      [Figure: an Estimator takes the task and the microprocessor architecture model as inputs and produces the WCET.]

      Task  WCET
      T1    4
      T2    6
      T3    4
      T4    7
      T5    8
      T6    12
      T7    7
      T8    10

  17. Example
      We generate a schedule:
      [Figure: Gantt chart on a time axis from 0 to 64; the tasks run one after another on µp1 in the order T1, T2, T4, T3, T5, T6, T7, T8.]
      Using the architecture with microprocessor µp1 we got a solution with:
      - Execution time: 58 > 42
      - Cost: 6 < 8
      We have to try with another architecture!

  20. Example
      We look for a microprocessor which is fast enough: µp2.
      For each task the WCET, when run on µp2, is estimated.

      Task  WCET
      T1    2
      T2    3
      T3    2
      T4    3
      T5    4
      T6    6
      T7    3
      T8    5

      Using the architecture with microprocessor µp2 we got a solution with:
      - Execution time: 28 < 42
      - Cost: 15 > 8
      We have to try with another architecture!
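The two single-processor attempts can be checked with a few lines of arithmetic. The sketch below assumes that on one processor the tasks simply run back to back, so the schedule length is the sum of the WCETs; the names `WCET`, `COST` and `evaluate` are illustrative only.

```python
PERIOD = 42      # an activation has to finish in less than this
COST_LIMIT = 8   # the total cost has to be less than this

# WCET tables and processor costs as given in the example.
WCET = {
    "p1": {"T1": 4, "T2": 6, "T3": 4, "T4": 7, "T5": 8, "T6": 12, "T7": 7, "T8": 10},
    "p2": {"T1": 2, "T2": 3, "T3": 2, "T4": 3, "T5": 4, "T6": 6, "T7": 3, "T8": 5},
}
COST = {"p1": 6, "p2": 15}

def evaluate(proc):
    """Return (schedule length, cost, time constraint met, cost constraint met)."""
    length = sum(WCET[proc].values())   # sequential execution on one processor
    cost = COST[proc]
    return length, cost, length < PERIOD, cost < COST_LIMIT

for proc in ("p1", "p2"):
    length, cost, time_ok, cost_ok = evaluate(proc)
    print(f"{proc}: time {length} (ok={time_ok}), cost {cost} (ok={cost_ok})")
```

The slow processor meets the cost limit but not the period, and the fast one does the opposite, which is what motivates the multiprocessor attempt.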

  22. Example
      We have to look for a multiprocessor solution.
      - In order to meet cost constraints, try 2 cheap (and slow) microprocessors:
        - µp3: cost 3
        - µp4: cost 2
        - interconnection bus: cost 1
      [Figure: µp3 and µp4 connected by a bus.]
      For each task the WCET, when run on µp3 and µp4, is estimated.

      Task  WCET on µp3  WCET on µp4
      T1    5            6
      T2    7            9
      T3    5            6
      T4    8            10
      T5    10           11
      T6    17           21
      T7    10           14
      T8    15           19

  23. Example
      Now we have to map the tasks to processors:
      - µp3: T1, T3, T5, T6, T7, T8.
      - µp4: T2, T4.
      If communicating tasks are mapped to different processors, they have to communicate over the bus.
      Communication time has to be estimated; it depends on the number of bits transferred between the tasks and on the speed of the bus.
      Estimated communication times:
      - C1-2: 1
      - C4-8: 1

  25. Example
      Mapping: µp3: T1, T3, T5, T6, T7, T8. µp4: T2, T4.
      Estimated communication times: C1-2: 1, C4-8: 1.
      We generate a schedule:
      [Figure: Gantt chart on a time axis from 0 to 64.]
      - µp3: T1, T3, T5, T6, T7, T8
      - µp4: T2, T4
      - bus: C1-2, C4-8
      We have exceeded the allowed execution time (42)!

  26. Example
      Try a new mapping: T5 to µp4, in order to increase parallelism.
      Two new communications are introduced, with estimated times:
      - C3-5: 2
      - C5-7: 1
      We generate a schedule:
      [Figure: Gantt chart on a time axis from 0 to 64.]
      - µp3: T1, T3, T6, T7, T8
      - µp4: T2, T4, T5
      - bus: C1-2, C3-5, C4-8, C5-7
      The execution time is still 62, as before!

  28. Example
      Try a new mapping: T5 to µp4, in order to increase parallelism.
      Two new communications are introduced, with estimated times:
      - C3-5: 2
      - C5-7: 1
      There exists a better schedule!
      [Figure: Gantt chart on a time axis from 0 to 64.]
      - µp3: T1, T3, T6, T7, T8
      - µp4: T2, T5, T4
      - bus: C1-2, C3-5, C5-7, C4-8
      Result:
      - Execution time: 52 > 42
      - Cost: 6 < 8
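The three two-processor schedules can be replayed with a small simulator. This is a sketch under stated assumptions, not the tool used in the lecture: the text only names the edges that carry bus communication (C1-2, C3-5, C5-7, C4-8, and C7-8 later in the example), so the remaining edges in `EDGES` are guesses marked ASSUMED, and bus contention is ignored. The function name `schedule_length` is hypothetical.

```python
# WCETs from the table in the text: task -> {processor: time units}.
WCET = {
    "T1": {"p3": 5, "p4": 6},   "T2": {"p3": 7, "p4": 9},
    "T3": {"p3": 5, "p4": 6},   "T4": {"p3": 8, "p4": 10},
    "T5": {"p3": 10, "p4": 11}, "T6": {"p3": 17, "p4": 21},
    "T7": {"p3": 10, "p4": 14}, "T8": {"p3": 15, "p4": 19},
}

# (producer, consumer) -> bus time, charged only if the two tasks are mapped
# to different processors. ASSUMED edges are not stated in the text; they
# never cross processors in the mappings tried here.
EDGES = {
    ("T1", "T2"): 1,  # C1-2
    ("T3", "T5"): 2,  # C3-5
    ("T5", "T7"): 1,  # C5-7
    ("T4", "T8"): 1,  # C4-8
    ("T7", "T8"): 1,  # C7-8 (given later in the example)
    ("T1", "T3"): 0,  # ASSUMED
    ("T2", "T4"): 0,  # ASSUMED
    ("T3", "T6"): 0,  # ASSUMED
    ("T6", "T7"): 0,  # ASSUMED
}

def schedule_length(order):
    """order: {processor: [tasks in execution order]} -> (length, finish times)."""
    where = {t: p for p, tasks in order.items() for t in tasks}
    preds = {t: [] for t in where}
    for (src, dst), comm in EDGES.items():
        preds[dst].append((src, comm))
    finish = {}
    free_at = {p: 0 for p in order}
    pending = {p: list(tasks) for p, tasks in order.items()}
    while any(pending.values()):
        progressed = False
        for p, queue in pending.items():
            if not queue:
                continue
            task = queue[0]
            if any(src not in finish for src, _ in preds[task]):
                continue  # a predecessor has not been scheduled yet
            ready = max(
                [finish[s] + (c if where[s] != p else 0) for s, c in preds[task]],
                default=0,
            )
            start = max(ready, free_at[p])
            finish[task] = start + WCET[task][p]
            free_at[p] = finish[task]
            queue.pop(0)
            progressed = True
        if not progressed:
            raise ValueError("execution order conflicts with the precedence edges")
    return max(finish.values()), finish

first = {"p3": ["T1", "T3", "T5", "T6", "T7", "T8"], "p4": ["T2", "T4"]}
moved = {"p3": ["T1", "T3", "T6", "T7", "T8"], "p4": ["T2", "T4", "T5"]}
better = {"p3": ["T1", "T3", "T6", "T7", "T8"], "p4": ["T2", "T5", "T4"]}
for name, order in (("first", first), ("moved", moved), ("better", better)):
    print(name, schedule_length(order)[0])
```

Under these assumptions the simulator gives lengths 62, 62 and 52, matching the figures in the text. Moving T5 to µp4 helps only once T5 runs before T4, because T7 on µp3 has to wait for T5.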

  31. Example
      [Figure: µp3, µp4 and an ASIC connected by a bus.]
      Possible solutions:
      - Replace microprocessor µp3 with a faster one: cost limits exceeded.
      - Implement part of the functionality in hardware as an ASIC.
      New architecture:
      - Cost of ASIC: 1
      Mapping:
      - µp3: T1, T3, T6, T7.
      - µp4: T2, T4, T5.
      - ASIC: T8, with estimated WCET = 3.
      New communication, with estimated time:
      - C7-8: 1

  33. Example
      [Figure: µp3, µp4 and an ASIC connected by a bus; Gantt chart on a time axis from 0 to 64.]
      - µp3: T1, T3, T6, T7
      - µp4: T2, T5, T4
      - ASIC: T8
      - bus: C1-2, C3-5, C5-7, C4-8, C7-8
      Using this architecture we got a solution with:
      - Execution time: 41 < 42
      - Cost: 7 < 8
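To summarise the exploration, the four candidate architectures can be compared against the two constraints. The execution times and costs below are the ones reported in the example; the variable and function names are illustrative only.

```python
PERIOD = 42      # execution time has to be less than this
COST_LIMIT = 8   # total cost has to be less than this

# Component costs of the final architecture, as given in the example.
COMPONENT_COST = {"p3": 3, "p4": 2, "bus": 1, "ASIC": 1}

# (architecture, execution time, cost) as reported in the example.
CANDIDATES = [
    ("p1", 58, 6),
    ("p2", 28, 15),
    ("p3 + p4 + bus", 52, 6),
    ("p3 + p4 + bus + ASIC", 41, sum(COMPONENT_COST.values())),
]

def feasible(time, cost):
    """Both constraints are strict: time < period and cost < limit."""
    return time < PERIOD and cost < COST_LIMIT

accepted = [name for name, time, cost in CANDIDATES if feasible(time, cost)]
print(accepted)
```

Only the last candidate satisfies both constraints. Moving T8 into hardware shortens the critical path at an extra cost of 1, which still fits under the limit.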

  36. Example
      What did we achieve?
      - We have selected an architecture.
      - We have mapped tasks to the processors and ASIC.
      - We have elaborated a schedule.
      Extremely important!!! Nothing has been built yet. All decisions are based on simulation and estimation.
      Now we can go and do the software and hardware implementation, with a high degree of confidence that we get a correct prototype.

  38. What is the essential difference compared to the “traditional” design flow?
      [Figure: flow chart.] Informal Specification, Constraints -> Modeling -> System model (Functional Simulation) -> inner loop: Arch. Selection -> System architecture -> Mapping -> Scheduling -> Mapped and scheduled model, checked by Estimation (not OK: repeat the inner loop) -> (OK) Hardware and Software Implementation -> Prototype -> Testing (not OK: outer loop) -> (OK) Fabrication.
      - The inner loop, which is performed before the hardware/software implementation. This loop is performed several times as part of the design space exploration. Different architectures, mappings and schedules are explored before the actual implementation and prototyping.
      - We get highly optimized, good quality solutions in a short time. We have a good chance that the outer loop, including prototyping, is not repeated.

  39. The Design Flow
      Formal verification
      - It is impossible to do an exhaustive verification by simulation!
      - Especially for safety critical systems, formal verification is needed.
      Hardware/Software codesign
      - During the mapping/scheduling step we also decide what is going to be executed on a programmable processor (software) and what is going into hardware (ASIC, FPGA).
      - During the implementation phase, hardware and software components have to be developed in a coordinated way, taking care of their consistency (hardware/software cosimulation).

  40. [Figure: the complete design flow.]
      System Level: Informal Specification, Constraints -> Modeling -> System model (Functional Simulation, Formal Verification) -> Arch. Selection -> System architecture -> Mapping -> Scheduling -> Mapped and scheduled model (Estimation, Simulation, Formal Verification; not OK: repeat).
      Lower Levels: (OK) Softw. model and Hardw. model (Simulation) -> Softw. Generation and Hardw. Synthesis -> Softw. blocks and Hardw. blocks (Simulation) -> Prototype -> Testing -> (OK) Fabrication; not OK: repeat.

  41. The “Lower Levels”
      Software generation:
      - Encoding in an implementation language (C, C++, assembler).
      - Compiling (this can include particular optimizations for application specific processors, DSPs, etc.).
      - Generation of a real-time kernel or adapting to an existing operating system.
      - Testing and debugging (in the development environment).
      Several courses teach this part: programming related courses, Algorithms and Data Structures, Compilers, Operating Systems, Real-Time Systems, ...

  42. The “Lower Levels”
      Hardware synthesis:
      - Encoding in a hardware description language (VHDL, Verilog).
      - Successive synthesis steps: high-level, register-transfer level, logic-level synthesis.
      - Testing and debugging (by simulation).
      Several courses teach this part: Digital Design, Electronics and VLSI related courses, Computer Architectures, ...

  43. The System Level
      - TDTS07: System Design and Methodology (Modeling and Design of Embedded Systems)

  44. Bring Power Consumption into the Picture
      Why is power consumption an issue?
      - Portable systems: battery life time!
      - Systems with limited power budget: Mars Pathfinder, autonomous helicopter, ...
      - Desktops and servers: high power consumption
        - raises temperature and deteriorates performance & reliability
        - increases the need for expensive cooling mechanisms
      - One main difficulty with developing high performance chips is heat extraction.
      - High power consumption has economical and ecological consequences.

  48. Sources of Power Dissipation in CMOS Devices
      $$P = \underbrace{\frac{1}{2} C V_{DD}^2 f N_{SW} + Q_{SC} V_{DD} f N_{SW}}_{\text{dynamic}} + \underbrace{I_{leak} V_{DD}}_{\text{static}}$$
      - Switching power (first term): power required to charge/discharge circuit nodes.
      - Short-circuit power (second term): dissipation due to short-circuit current.
      - Leakage power (third term): dissipation due to leakage current.
      Symbols:
      - C = node capacitances
      - V_DD = supply voltage
      - N_SW = switching activities (number of gate transitions per clock cycle)
      - Q_SC = charge carried by short circuit current per transition
      - f = frequency of operation
      - I_leak = leakage current

  49. Sources of Power Dissipation in CMOS Devices
      - Earlier: leakage power has been considered negligible compared to dynamic power.
      - Today: total dissipation from leakage is approaching the total from dynamic power.
      - As transistor sizes shrink, leakage power becomes significant.

  52. Sources of Power Dissipation in CMOS Devices
      - Leakage power is consumed even if the circuit is idle (standby). The only way to avoid it is to decouple the circuit from the power supply.
      - Short circuit power can be around 10% of total.
      - Switching power is still the main source of power consumption.
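The three terms of the power equation, P = 1/2 C V_DD^2 f N_SW + Q_SC V_DD f N_SW + I_leak V_DD, can be evaluated separately to see how each reacts to frequency and voltage. The sketch below is only an illustration: the function name `cmos_power` is made up, and all numeric values in `EXAMPLE` are hypothetical, chosen just to exercise the formula.

```python
def cmos_power(c, v_dd, f, n_sw, q_sc, i_leak):
    """Return (switching, short_circuit, leakage) power in watts."""
    switching = 0.5 * c * v_dd ** 2 * f * n_sw   # charging/discharging nodes
    short_circuit = q_sc * v_dd * f * n_sw       # short-circuit current
    leakage = i_leak * v_dd                      # static, independent of f
    return switching, short_circuit, leakage

# Hypothetical values, for illustration only.
EXAMPLE = dict(c=1e-9, v_dd=1.2, f=500e6, n_sw=0.1, q_sc=6e-11, i_leak=5e-3)

switching, short_circuit, leakage = cmos_power(**EXAMPLE)
print("dynamic:", switching + short_circuit, "W; static:", leakage, "W")

# With the clock stopped (f = 0) only the leakage term remains.
print("clock stopped:", cmos_power(**{**EXAMPLE, "f": 0.0}))
```

Stopping the clock removes both dynamic terms but leaves the leakage term untouched, which is why only cutting the supply removes leakage. The switching term grows with the square of the supply voltage, the other two only linearly.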

  54. Power and Energy Consumption
      $$P = \frac{1}{2} C V_{DD}^2 f N_{SW}$$
      $$E = P \cdot t = \frac{1}{2} C V_{DD}^2 N_{CY} N_{SW}$$
      N_CY = number of cycles needed for the particular task (so that t = N_CY / f).
      - In certain situations we are concerned about power consumption: heat dissipation and cooling, physical deterioration due to temperature.
      - Sometimes we want to reduce total energy consumed: battery life.

  58. Power and Energy Consumption
      $$P = \frac{1}{2} C V_{DD}^2 f N_{SW}$$
      $$E = P \cdot t = \frac{1}{2} C V_{DD}^2 N_{CY} N_{SW}$$
      Reducing power/energy consumption:
      - Reduce supply voltage
      - Reduce switching activity
      - Reduce capacitance
      - Reduce number of cycles
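The difference between reducing power and reducing energy can be made concrete with the two formulas above. In this sketch the function names and all numeric values are hypothetical. It checks that E = P * t when t = N_CY / f, and shows what happens when only the frequency or only the supply voltage is lowered.

```python
def dynamic_power(c, v_dd, f, n_sw):
    """Switching power: P = 1/2 * C * V_DD^2 * f * N_SW."""
    return 0.5 * c * v_dd ** 2 * f * n_sw

def task_energy(c, v_dd, n_cy, n_sw):
    """Energy of a task of n_cy cycles: E = 1/2 * C * V_DD^2 * N_CY * N_SW."""
    return 0.5 * c * v_dd ** 2 * n_cy * n_sw

# Hypothetical values, for illustration only.
C, V_DD, F, N_SW, N_CY = 1e-9, 1.2, 500e6, 0.1, 1_000_000

t = N_CY / F                                   # execution time of the task
e_from_power = dynamic_power(C, V_DD, F, N_SW) * t
e_direct = task_energy(C, V_DD, N_CY, N_SW)
print(e_from_power, e_direct)                  # the two agree up to rounding

# Halving only the frequency halves the power, but the task takes twice as long.
print(dynamic_power(C, V_DD, F / 2, N_SW) / dynamic_power(C, V_DD, F, N_SW))

# Halving the supply voltage cuts the energy of the task to a quarter.
print(task_energy(C, V_DD / 2, N_CY, N_SW) / task_energy(C, V_DD, N_CY, N_SW))
```

Since the frequency cancels out of the energy formula, slowing the clock alone helps with heat but not with battery life. Lowering the voltage helps with both.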

  59. System Level Power/Energy Optimization
      - Dynamic techniques: applied at run time. These techniques reduce power consumption by exploiting idle or low-workload periods.
      - Static techniques: applied at design time.
        - Compilation for low power: instruction selection considering their power profile, data placement in memory, register allocation.
        - Algorithm design: find the algorithm which is the most power-efficient.
        - Task mapping and scheduling.

  60. System Level Power/Energy Optimization
      Three techniques will be discussed:
      1. Dynamic power management: a dynamic technique.
      2. Task mapping: a static technique.
      3. Task scheduling with dynamic power scaling: static & dynamic.

  63. Dynamic Power Management (DPM)
      [Figure: layered stack: application, power aware OS, hardware.]
      Decisions:
      - Switching among multiple power states: idle, sleep, run.
      - Switching among multiple frequencies and voltage levels.
      Goal:
      - Energy optimization
      - QoS constraints satisfied

  65. Dynamic Power Management (DPM)
      Intel XScale Processor
      - RUN: operational. Operating points:
        - 150 MHz, 0.75 V, 60 mW
        - 600 MHz, 1.3 V, 450 mW
        - 800 MHz, 1.6 V, 900 mW
      - IDLE (40 mW): clocks to the CPU are disabled; recovery is through interrupt.
      - SLEEP (160 µW): mainly powered off; recovery through wake-up event.
      - Other intermediate states: DEEP IDLE, STANDBY, DEEP SLEEP.
      [Figure: state diagram of RUN, IDLE and SLEEP. Transition delays annotated between these states: 10 µs, 10 µs, 90 µs, 1.5 ms, 140 ms. A delay of 160 µs is annotated next to the RUN operating points.]
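The three RUN operating points can be compared in terms of energy per clock cycle, which is simply power divided by frequency. The sketch uses only the figures quoted for the RUN state above; the names `OPERATING_POINTS` and `energy_per_cycle` are illustrative.

```python
# (frequency in Hz, supply voltage in V, power in W) for the RUN state.
OPERATING_POINTS = [
    (150e6, 0.75, 60e-3),
    (600e6, 1.3, 450e-3),
    (800e6, 1.6, 900e-3),
]

def energy_per_cycle(frequency_hz, power_w):
    """Energy spent in one clock cycle, in joules."""
    return power_w / frequency_hz

for f, v, p in OPERATING_POINTS:
    e_nj = energy_per_cycle(f, p) * 1e9
    print(f"{f / 1e6:.0f} MHz at {v} V: {e_nj:.3f} nJ per cycle")
```

This gives 0.4 nJ, 0.75 nJ and 1.125 nJ per cycle. A task with a fixed number of cycles therefore costs almost three times as much energy at 800 MHz as at 150 MHz, so running slowly at low voltage pays off whenever the deadline allows it.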

  70. The Basic Concept of DPM
      - When there are requests for a device, the device is busy; otherwise it is idle.
      - When the device is idle, it can be shut down to enter a low-power sleeping state.
      [Figure: timeline from T1 to T4. Workload: requests, no requests, requests. Device state: busy, idle, busy. Power state: working, shutdown (T_sd), sleeping, wake-up (T_wu), working.]
      - Changing the power state takes time and extra energy.
        - T_sd: shutdown delay
        - T_wu: wake-up delay
      Send the device to sleep only if the saved energy justifies the overhead!

  71. The Basic Concept of DPM
      The main problems:
      - Don't shut down if this causes delays to occur too frequently.
      - Don't shut down if the savings due to sleeping are smaller than the energy overhead of the state changes.

  72. Power Management Policies
      Power management policies are concerned with predictions of idle periods:
      - For shut-down: try to predict how long the idle period will be, in order to decide if a shut-down should be performed.
      - For wake-up: try to predict when the idle period ends, in order to avoid user delays due to T_wu. Very difficult!
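The shut-down decision can be illustrated with a break-even calculation. The energy model here is an assumption of this sketch, not taken from the slides: staying on during an idle period of length T costs p_on * T, while sleeping costs a fixed transition energy e_transition (covering both shutdown and wake-up) plus p_sleep for the rest of the period. All function names and numeric values are hypothetical.

```python
def break_even_time(p_on, p_sleep, t_sd, t_wu, e_transition):
    """Shortest idle period for which sleeping does not cost more energy."""
    if p_on <= p_sleep:
        raise ValueError("sleeping must draw less power than staying on")
    t_transition = t_sd + t_wu
    t_be = (e_transition - p_sleep * t_transition) / (p_on - p_sleep)
    # The idle period must at least cover the two transitions themselves.
    return max(t_be, t_transition)

def energy_staying_on(p_on, t_idle):
    return p_on * t_idle

def energy_sleeping(p_sleep, t_sd, t_wu, e_transition, t_idle):
    return e_transition + p_sleep * (t_idle - (t_sd + t_wu))

def should_sleep(t_idle_predicted, p_on, p_sleep, t_sd, t_wu, e_transition):
    """Shut down only if the predicted idle period exceeds the break-even time."""
    return t_idle_predicted > break_even_time(p_on, p_sleep, t_sd, t_wu, e_transition)

# Hypothetical device: watts, seconds and joules chosen for illustration only.
PARAMS = dict(p_on=0.040, p_sleep=160e-6, t_sd=0.001, t_wu=0.100, e_transition=0.050)

print(break_even_time(**PARAMS))
print(should_sleep(2.0, **PARAMS), should_sleep(0.5, **PARAMS))
```

The policy is only as good as the prediction of the idle period: a predicted 2 s idle period justifies the shut-down for this hypothetical device, a predicted 0.5 s period does not.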

  73. Dynamic Power Management (DPM)
      For many embedded systems, DPM techniques like those presented before are not appropriate:
      - They have time constraints: we have to keep deadlines (usually we cannot afford shut-down and wake-up times).
      - The OS is simple and fast: no sophisticated run-time techniques.
      - The application is known at design time: we know a lot about the application and can optimize already at design time.
