Finite State Machines

Elevator controller

Input events: {r1, r2, r3}
  ri: request from floor i.
Outputs: {d2, d1, n, u1, u2}
  di: go down i floors
  ui: go up i floors
  n: stay idle
States: {S1, S2, S3}
  Si: elevator is at floor i.

[Figure: state diagram with the initial state marked; each transition is labelled "input event / output"]

Transitions (output, next state):
  State   on r1     on r2     on r3
  S1      n,  S1    u1, S2    u2, S3
  S2      d1, S1    n,  S2    u1, S3
  S3      d2, S1    d1, S2    n,  S3

TDDE35/ Embedded Systems 28 of 128

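The elevator controller can be written down directly as a transition table. The following is a minimal sketch, not part of the slides: the function name `run` is hypothetical, and S1 is assumed to be the initial state because the extracted figure does not show which state carries the marker.

```python
# Transition table of the elevator controller:
# (current state, input event) -> (output, next state).
TRANSITIONS = {
    ("S1", "r1"): ("n", "S1"),
    ("S1", "r2"): ("u1", "S2"),
    ("S1", "r3"): ("u2", "S3"),
    ("S2", "r1"): ("d1", "S1"),
    ("S2", "r2"): ("n", "S2"),
    ("S2", "r3"): ("u1", "S3"),
    ("S3", "r1"): ("d2", "S1"),
    ("S3", "r2"): ("d1", "S2"),
    ("S3", "r3"): ("n", "S3"),
}


def run(events, state="S1"):
    """Feed a sequence of input events to the FSM.

    ASSUMPTION: S1 is used as the default initial state.
    Returns the list of outputs and the final state.
    """
    outputs = []
    for event in events:
        output, state = TRANSITIONS[(state, event)]
        outputs.append(output)
    return outputs, state


if __name__ == "__main__":
    print(run(["r3", "r2", "r2", "r1"]))
```

Each output depends on both the current state and the input event, which is why the table is keyed on the pair.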
A Design Example

The system to be implemented is modelled as a task graph:
- A node represents a task (a unit of functionality activated in response to a certain input and which generates a certain output).
- An edge represents a precedence constraint and data dependency between two tasks.

[Figure: task graph with the tasks T1 to T8]

Period: 42 time units
- The task graph is activated every 42 time units, so an activation has to terminate in less than 42 time units.

Cost limit: 8
- The total cost of the implemented system has to be less than 8.

Traditional Design Flow

[Figure: flow chart. Informal Specification, Constraints -> Modeling -> System Model (with Functional Simulation) -> Select Architecture -> Hardware and Software Implementation -> Prototype -> Testing; "OK" leads to Fabrication, "not OK" leads back]

1. Start from some informal specification of functionality and a set of constraints.
2. Generate a more formal model of the functionality, based on some modeling concept. In our example this model is the task graph.
3. Simulate the model in order to check the functionality. If needed, make adjustments.
4. Choose an architecture (processor, buses, etc.) such that cost limits are satisfied and, you hope, time and power constraints are fulfilled.
5. Build a prototype and implement the system.
6. Verify the system: neither time nor power constraints are satisfied.

Traditional Design Flow

Now you are in great trouble: you have spent a lot of time and money and nothing works!
- Go back to step 4, choose a new architecture and start a new implementation.
- Or negotiate with the customer on the constraints.

The Traditional Design Flow

The consequences:
- Delays in the design process
  - Increased design cost
  - Delays in time to market: missed market window
- High cost of failed prototypes
- Bad design decisions taken under time pressure
  - Low quality, high cost products

[Figure: the traditional design flow chart, with the Select Architecture step annotated: "More work should be done here!"]

Example

- We have the system model (task graph), which has been validated by simulation.
- We decide on a certain processor p1, with cost 6.
- For each task, the worst-case execution time (WCET) when run on p1 is estimated.

[Figure: an Estimator takes the task and the processor architecture model as inputs and produces the WCET]

  Task   WCET on p1
  T1      4
  T2      6
  T3      4
  T4      7
  T5      8
  T6     12
  T7      7
  T8     10

Example

We generate a schedule:

[Figure: Gantt chart on a time axis from 0 to 64. On p1 the tasks run back to back in the order T1, T2, T4, T3, T5, T6, T7, T8]

Using the architecture with processor p1 we got a solution with:
- Execution time: 58 > 42
- Cost: 6 < 8

We have to try another architecture!

Example

We look for a processor which is fast enough: p2.
For each task, the WCET when run on p2 is estimated.

  Task   WCET on p2
  T1     2
  T2     3
  T3     2
  T4     3
  T5     4
  T6     6
  T7     3
  T8     5

Using the architecture with processor p2 we got a solution with:
- Execution time: 28 < 42
- Cost: 15 > 8

We have to try another architecture!

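On a single processor the tasks run one after another, so the execution time of one activation is simply the sum of the WCETs. The sketch below checks both single-processor candidates against the period and the cost limit; the WCET and cost figures are taken from the slides, while the function name `evaluate` is only illustrative.

```python
PERIOD = 42      # an activation must finish in less than 42 time units
COST_LIMIT = 8   # total cost must be less than 8

# WCET of T1..T8 on each candidate processor, and the processor costs.
WCET = {
    "p1": [4, 6, 4, 7, 8, 12, 7, 10],
    "p2": [2, 3, 2, 3, 4, 6, 3, 5],
}
COST = {"p1": 6, "p2": 15}


def evaluate(processor):
    """Return (execution time, cost, feasible) for a one-processor design."""
    execution_time = sum(WCET[processor])  # tasks execute sequentially
    cost = COST[processor]
    feasible = execution_time < PERIOD and cost < COST_LIMIT
    return execution_time, cost, feasible


if __name__ == "__main__":
    for name in WCET:
        print(name, evaluate(name))
```

Neither candidate is feasible: p1 is cheap enough but too slow, p2 is fast enough but too expensive.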
Example

We have to look for a multiprocessor solution.
In order to meet the cost constraints, try two cheap (and slow) processors:
- p3: cost 3
- p4: cost 2
- interconnection bus: cost 1

[Figure: p3 and p4 connected by a bus]

For each task, the WCET when run on p3 and p4 is estimated.

  Task   WCET on p3   WCET on p4
  T1      5            6
  T2      7            9
  T3      5            6
  T4      8           10
  T5     10           11
  T6     17           21
  T7     10           14
  T8     15           19

Example

Now we have to map the tasks to processors:
- p3: T1, T3, T5, T6, T7, T8.
- p4: T2, T4.

If communicating tasks are mapped to different processors, they have to communicate over the bus.
Communication time has to be estimated; it depends on the number of bits transferred between the tasks and on the speed of the bus.

Estimated communication times:
- C1-2: 1
- C4-8: 1

Example

Mapping:
- p3: T1, T3, T5, T6, T7, T8.
- p4: T2, T4.

Estimated communication times:
- C1-2: 1
- C4-8: 1

We generate a schedule:

[Figure: Gantt chart on a time axis from 0 to 64]
  p3:   T1, T3, T5, T6, T7, T8
  p4:   T2, T4
  bus:  C1-2, C4-8

We have exceeded the allowed execution time (42)!

Example

Try a new mapping: move T5 to p4, in order to increase parallelism.
Two new communications are introduced, with estimated times:
- C3-5: 2
- C5-7: 1

We generate a schedule:

[Figure: Gantt chart on a time axis from 0 to 64]
  p3:   T1, T3, T6, T7, T8
  p4:   T2, T4, T5
  bus:  C1-2, C3-5, C4-8, C5-7

The execution time is still 62, as before!

Example

With T5 mapped to p4, there exists a better schedule!

[Figure: Gantt chart on a time axis from 0 to 64]
  p3:   T1, T3, T6, T7, T8
  p4:   T2, T5, T4
  bus:  C1-2, C3-5, C5-7, C4-8

- Execution time: 52 > 42
- Cost: 6 < 8

Example

Possible solutions:
- Replace processor p3 with a faster one: cost limits exceeded.
- Implement part of the functionality in hardware as an ASIC.

New architecture:

[Figure: p3, p4 and an ASIC connected by a bus]

- Cost of ASIC: 1

Mapping:
- p3: T1, T3, T6, T7.
- p4: T2, T4, T5.
- ASIC: T8, with estimated WCET = 3.

New communication, with estimated time:
- C7-8: 1

Example

[Figure: Gantt chart on a time axis from 0 to 64]
  p3:    T1, T3, T6, T7
  p4:    T2, T5, T4
  ASIC:  T8
  bus:   C1-2, C3-5, C5-7, C4-8, C7-8

Using this architecture we got a solution with:
- Execution time: 41 < 42
- Cost: 7 < 8

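The execution times quoted in the example (62, 62, 52 and 41) can be reproduced with a small schedule evaluator. This is a sketch under stated assumptions, not the tool used in the course. The edge list is inferred from the communications named on the slides (C1-2, C3-5, C4-8, C5-7, C7-8) plus the edges T1-T3, T2-T4, T3-T6 and T6-T7, which are assumed because the extracted figure does not show them. Bus contention is ignored, and the names `makespan` and `architecture_cost` are hypothetical.

```python
# WCET estimates from the slides; the ASIC implements only T8.
WCET = {
    "p3": {"T1": 5, "T2": 7, "T3": 5, "T4": 8,
           "T5": 10, "T6": 17, "T7": 10, "T8": 15},
    "p4": {"T1": 6, "T2": 9, "T3": 6, "T4": 10,
           "T5": 11, "T6": 21, "T7": 14, "T8": 19},
    "ASIC": {"T8": 3},
}
# ASSUMPTION: precedence edges inferred from the slides (see lead-in).
EDGES = [("T1", "T2"), ("T1", "T3"), ("T2", "T4"), ("T3", "T5"),
         ("T3", "T6"), ("T5", "T7"), ("T6", "T7"), ("T4", "T8"),
         ("T7", "T8")]
# Estimated bus communication times from the slides.
COMM = {("T1", "T2"): 1, ("T4", "T8"): 1, ("T3", "T5"): 2,
        ("T5", "T7"): 1, ("T7", "T8"): 1}
COST = {"p3": 3, "p4": 2, "bus": 1, "ASIC": 1}


def makespan(order):
    """Finish time of one activation.

    order maps each processing element to its tasks in execution order.
    ASSUMPTION: transfers never have to wait for the bus.
    """
    pe_of = {t: pe for pe, tasks in order.items() for t in tasks}
    preds = {t: [] for t in pe_of}
    for src, dst in EDGES:
        preds[dst].append(src)
    finish = {}
    free_at = {pe: 0 for pe in order}
    next_idx = {pe: 0 for pe in order}
    while len(finish) < len(pe_of):
        progressed = False
        for pe, tasks in order.items():
            if next_idx[pe] == len(tasks):
                continue
            task = tasks[next_idx[pe]]
            if any(p not in finish for p in preds[task]):
                continue  # a predecessor has not been scheduled yet
            # Data is ready once every predecessor has finished, plus the
            # bus transfer if the predecessor runs on another element.
            ready = max((finish[p] + (COMM[(p, task)] if pe_of[p] != pe else 0)
                         for p in preds[task]), default=0)
            start = max(free_at[pe], ready)
            finish[task] = start + WCET[pe][task]
            free_at[pe] = finish[task]
            next_idx[pe] += 1
            progressed = True
        if not progressed:
            raise ValueError("task order conflicts with the precedence edges")
    return max(finish.values())


def architecture_cost(components):
    """Sum of the component costs given on the slides."""
    return sum(COST[c] for c in components)


if __name__ == "__main__":
    final = {"p3": ["T1", "T3", "T6", "T7"],
             "p4": ["T2", "T5", "T4"],
             "ASIC": ["T8"]}
    print(makespan(final), architecture_cost(["p3", "p4", "bus", "ASIC"]))
```

Under these assumptions the evaluator agrees with the slides for all four schedules, which makes it a convenient way to try further mappings and task orders before anything is built.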
Example

What did we achieve?
- We have selected an architecture.
- We have mapped tasks to the processors and ASIC.
- We have elaborated a schedule.

Extremely important!!!
Nothing has been built yet. All decisions are based on simulation and estimation.

Now we can go and do the software and hardware implementation, with a high degree of confidence that we get a correct prototype.

What is the essential difference compared to the "traditional" design flow?

[Figure: flow chart. Informal Specification, Constraints -> Modeling -> System model (with Functional Simulation) -> Arch. Selection -> System architecture -> Mapping -> Scheduling (with Estimation) -> Mapped and scheduled model; "not OK" loops back, "OK" leads to Hardware and Software Implementation -> Prototype -> Testing; "not OK" leads back, "OK" leads to Fabrication]

- The inner loop, which is performed before the hardware/software implementation. This loop is performed several times as part of the design space exploration. Different architectures, mappings and schedules are explored before the actual implementation and prototyping.
- We get highly optimized, good quality solutions in a short time.
- We have a good chance that the outer loop, including prototyping, is not repeated.

The Design Flow

Formal verification
- It is impossible to do an exhaustive verification by simulation! Especially for safety-critical systems, formal verification is needed.

Hardware/software codesign
- During the mapping/scheduling step we also decide what is going to be executed on a programmable processor (software) and what is going into hardware (ASIC, FPGA).
- During the implementation phase, hardware and software components have to be developed in a coordinated way, taking care of their consistency (hardware/software cosimulation).

[Figure: the complete design flow.
 System level: Informal Specification, Constraints -> Modeling -> System model (Functional Simulation, Formal Verification) -> Arch. Selection -> System architecture -> Mapping -> Scheduling (with Estimation; "not OK" loops back) -> Mapped and scheduled model (Simulation, Formal Verification) -> OK.
 Lower levels: Softw. model and Hardw. model (Simulation) -> Softw. Generation and Hardw. Synthesis -> Softw. blocks and Hardw. blocks (Simulation) -> Prototype -> Testing; "OK" leads to Fabrication, "not OK" leads back]

The "Lower Levels"

Software generation:
- Encoding in an implementation language (C, C++, assembler).
- Compiling (this can include particular optimizations for application-specific processors, DSPs, etc.).
- Generation of a real-time kernel or adapting to an existing operating system.
- Testing and debugging (in the development environment).

Several courses teach this part: programming-related courses, Algorithms and Data Structures, Compilers, Operating Systems, Real-Time Systems, ...

The "Lower Levels"

Hardware synthesis:
- Encoding in a hardware description language (VHDL, Verilog).
- Successive synthesis steps: high-level, register-transfer level, logic-level synthesis.
- Testing and debugging (by simulation).

Several courses teach this part: Digital Design, Electronics and VLSI-related courses, Computer Architectures, ...

The System Level

TDTS07: System Design and Methodology (Modeling and Design of Embedded Systems)

Bring Power Consumption into the Picture

Why is power consumption an issue?
- Portable systems: battery lifetime!
- Systems with a limited power budget: Mars Pathfinder, autonomous helicopter, ...
- Desktops and servers: high power consumption
  - raises temperature and deteriorates performance and reliability
  - increases the need for expensive cooling mechanisms
- One main difficulty with developing high-performance chips is heat extraction.
- High power consumption has economic and ecological consequences.

Sources of Power Dissipation in CMOS Devices

$$P = \underbrace{\tfrac{1}{2} C V_{DD}^{2} f N_{SW} + Q_{SC} V_{DD} f N_{SW}}_{\text{dynamic}} + \underbrace{I_{leak} V_{DD}}_{\text{static}}$$

- Switching power, $\tfrac{1}{2} C V_{DD}^{2} f N_{SW}$: power required to charge/discharge circuit nodes.
- Short-circuit power, $Q_{SC} V_{DD} f N_{SW}$: dissipation due to short-circuit current.
- Leakage power, $I_{leak} V_{DD}$: dissipation due to leakage current.

Symbols:
- $C$ = node capacitances
- $V_{DD}$ = supply voltage
- $N_{SW}$ = switching activities (number of gate transitions per clock cycle)
- $Q_{SC}$ = charge carried by short-circuit current per transition
- $f$ = frequency of operation
- $I_{leak}$ = leakage current

Sources of Power Dissipation in CMOS Devices

Leakage power:
- Earlier: leakage power was considered negligible compared to dynamic power.
- Today: total dissipation from leakage is approaching the total from dynamic power.
- As transistor sizes shrink, leakage power becomes significant.

Sources of Power Dissipation in CMOS Devices

- Leakage power is consumed even if the circuit is idle (standby). The only way to avoid it is to disconnect the circuit from the power supply.
- Short-circuit power can be around 10% of the total.
- Switching power is still the main source of power consumption.

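To get a feeling for how the three terms of the power equation compare, they can be evaluated separately. The sketch below follows the equation term by term; all parameter values in `EXAMPLE` are hypothetical and chosen only for illustration, and the function name is not from the slides.

```python
def cmos_power(c, v_dd, f, n_sw, q_sc, i_leak):
    """Return (switching, short_circuit, leakage) power in watts.

    c      node capacitance [F]
    v_dd   supply voltage [V]
    f      frequency of operation [Hz]
    n_sw   gate transitions per clock cycle
    q_sc   short-circuit charge per transition [C]
    i_leak leakage current [A]
    """
    switching = 0.5 * c * v_dd ** 2 * f * n_sw
    short_circuit = q_sc * v_dd * f * n_sw
    leakage = i_leak * v_dd
    return switching, short_circuit, leakage


# HYPOTHETICAL operating point, for illustration only.
EXAMPLE = dict(c=1e-9, v_dd=1.2, f=500e6, n_sw=0.2, q_sc=1e-11, i_leak=0.01)

if __name__ == "__main__":
    switching, short_circuit, leakage = cmos_power(**EXAMPLE)
    total = switching + short_circuit + leakage
    print(f"switching {switching:.4f} W, short-circuit {short_circuit:.4f} W, "
          f"leakage {leakage:.4f} W, total {total:.4f} W")
```

Note that only the switching term depends quadratically on the supply voltage; the other two terms are linear in it.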
Power and Energy Consumption

$$P = \tfrac{1}{2} C V_{DD}^{2} f N_{SW}$$

$$E = P \cdot t = \tfrac{1}{2} C V_{DD}^{2} N_{CY} N_{SW}$$

$N_{CY}$ = number of cycles needed for the particular task, so that $t = N_{CY} / f$.

- In certain situations we are concerned about power consumption: heat dissipation, cooling, physical deterioration due to temperature.
- Sometimes we want to reduce the total energy consumed: battery life.

Power and Energy Consumption

Reducing power/energy consumption:
- Reduce supply voltage
- Reduce switching activity
- Reduce capacitance
- Reduce number of cycles

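The effect of each knob can be read off the energy equation: energy is linear in capacitance, switching activity and cycle count, but quadratic in the supply voltage. A minimal sketch follows, with hypothetical parameter values and illustrative function names.

```python
def switching_energy(c, v_dd, n_cy, n_sw):
    """E = 1/2 * C * V_DD^2 * N_CY * N_SW, in joules."""
    return 0.5 * c * v_dd ** 2 * n_cy * n_sw


def execution_time(n_cy, f):
    """t = N_CY / f, in seconds."""
    return n_cy / f


if __name__ == "__main__":
    # HYPOTHETICAL values, for illustration only.
    base = dict(c=1e-9, v_dd=1.2, n_cy=1_000_000, n_sw=0.2)
    e_base = switching_energy(**base)
    e_half_v = switching_energy(**{**base, "v_dd": 0.6})
    e_half_cycles = switching_energy(**{**base, "n_cy": 500_000})
    print(e_half_v / e_base)       # effect of halving the supply voltage
    print(e_half_cycles / e_base)  # effect of halving the cycle count
```

Frequency does not appear in the energy expression: running the same cycles more slowly at the same voltage lowers power but not the energy of the task.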
System Level Power/Energy Optimization

Dynamic techniques: applied at run time.
- These techniques reduce power consumption at run time by exploiting idle or low-workload periods.

Static techniques: applied at design time.
- Compilation for low power: instruction selection considering the power profile of the instructions, data placement in memory, register allocation.
- Algorithm design: find the algorithm which is the most power-efficient.
- Task mapping and scheduling.

System Level Power/Energy Optimization

Three techniques will be discussed:
1. Dynamic power management: a dynamic technique.
2. Task mapping: a static technique.
3. Task scheduling with dynamic power scaling: static and dynamic.

Dynamic Power Management (DPM)

[Figure: layered system with the application on top, a power-aware OS in the middle and the hardware at the bottom]

Decisions:
- Switching among multiple power states (run, idle, sleep).
- Switching among multiple frequencies and voltage levels.

Goal:
- Energy optimization
- QoS constraints satisfied

Dynamic Power Management (DPM)

Intel XScale Processor

Power states:
- RUN: operational. Several RUN modes are available:
    0.75 V, 150 MHz,  60 mW
    1.3 V,  600 MHz, 450 mW
    1.6 V,  800 MHz, 900 mW
- IDLE (40 mW): clocks to the CPU are disabled; recovery is through interrupt.
- SLEEP (160 µW): mainly powered off; recovery through wake-up event.
- Other intermediate states: DEEP IDLE, STANDBY, DEEP SLEEP.

[Figure: state diagram over the RUN modes, IDLE and SLEEP; the arcs are annotated with transition times of 160 µs, 10 µs, 1.5 ms, 10 µs, 140 ms and 90 µs]

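The three RUN modes trade speed for energy. Dividing power by frequency gives the energy per clock cycle, which shows that the slowest mode is also the most energy-efficient one. The sketch below uses the voltage, frequency and power figures listed for the RUN modes; treating the power figure as the average power while executing is an assumption, and the workload size is hypothetical.

```python
# RUN modes: (supply voltage [V], frequency [Hz], power [W]).
RUN_MODES = [
    (0.75, 150e6, 0.060),
    (1.3, 600e6, 0.450),
    (1.6, 800e6, 0.900),
]


def time_and_energy(n_cycles, freq_hz, power_w):
    """Execution time [s] and energy [J] for a task of n_cycles cycles.

    ASSUMPTION: power_w is the average power drawn while running.
    """
    t = n_cycles / freq_hz
    return t, power_w * t


if __name__ == "__main__":
    n_cycles = 150e6  # HYPOTHETICAL workload
    for v, f, p in RUN_MODES:
        t, e = time_and_energy(n_cycles, f, p)
        print(f"{v:.2f} V, {f / 1e6:.0f} MHz: {t:.4f} s, {e:.5f} J, "
              f"{p / f * 1e9:.3f} nJ/cycle")
```

Whether the slow mode can actually be used depends on the deadline: the task takes more than five times longer at 150 MHz than at 800 MHz.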
The Basic Concept of DPM

- When there are requests for a device, the device is busy; otherwise it is idle.
- When the device is idle, it can be shut down to enter a low-power sleeping state.

[Figure: timeline from T1 to T4]
  Workload:      requests, no requests, requests
  Device state:  busy, idle, busy
  Power state:   working, shutdown (T_sd), sleeping, wake-up (T_wu), working

Changing the power state takes time and extra energy.
- T_sd: shutdown delay
- T_wu: wake-up delay

Send the device to sleep only if the saved energy justifies the overhead!

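The rule "sleep only if the saved energy justifies the overhead" can be made concrete by comparing the energy of staying idle with the energy of a shutdown, sleep and wake-up sequence. The sketch below is one simple energy model, not taken from the slides: the transition energies `e_sd` and `e_wu`, the class and method names, and all numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Device:
    p_idle: float   # power while idle but still powered [W]
    p_sleep: float  # power in the sleeping state [W]
    t_sd: float     # shutdown delay T_sd [s]
    t_wu: float     # wake-up delay T_wu [s]
    e_sd: float     # energy spent on the shutdown transition [J]
    e_wu: float     # energy spent on the wake-up transition [J]

    def energy_if_idle(self, t_idle):
        return self.p_idle * t_idle

    def energy_if_sleeping(self, t_idle):
        # Time actually spent asleep once both transitions are paid for.
        asleep = t_idle - self.t_sd - self.t_wu
        return self.e_sd + self.e_wu + self.p_sleep * asleep

    def worth_sleeping(self, t_idle):
        """True if sleeping fits into the idle period and saves energy."""
        if t_idle < self.t_sd + self.t_wu:
            return False
        return self.energy_if_sleeping(t_idle) < self.energy_if_idle(t_idle)

    def break_even_time(self):
        """Shortest idle period for which sleeping does not cost more."""
        t_tr = self.t_sd + self.t_wu
        e_tr = self.e_sd + self.e_wu
        t_be = (e_tr - self.p_sleep * t_tr) / (self.p_idle - self.p_sleep)
        return max(t_be, t_tr)


# HYPOTHETICAL device parameters, for illustration only.
DEVICE = Device(p_idle=0.04, p_sleep=0.00016, t_sd=0.001, t_wu=0.1,
                e_sd=0.0005, e_wu=0.02)

if __name__ == "__main__":
    print(DEVICE.break_even_time())
    print(DEVICE.worth_sleeping(0.2), DEVICE.worth_sleeping(2.0))
```

The break-even time is the quantity a power manager needs: idle periods shorter than it should be spent idle, longer ones asleep.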
The Basic Concept of DPM

The main problems:
- Don't shut down so often that delays (waiting for the device to wake up) occur too frequently.
- Don't shut down when the savings due to sleeping are smaller than the energy overhead of the state changes.

Power Management Policies

Power management policies are concerned with predictions of idle periods:
- For shut-down: try to predict how long the idle period will be, in order to decide if a shut-down should be performed.
- For wake-up: try to predict when the idle period ends, in order to avoid user delays due to T_wu. This is very difficult!

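One common way to predict the length of the next idle period is an exponential average of the idle periods observed so far. The slides do not prescribe a particular predictor, so the sketch below is just one possible heuristic; the class name, the smoothing factor `alpha` and the threshold passed to `should_shut_down` are all hypothetical.

```python
class IdlePredictor:
    """Exponential-average predictor for the next idle period length."""

    def __init__(self, alpha=0.5, initial=0.0):
        if not 0.0 <= alpha <= 1.0:
            raise ValueError("alpha must be in [0, 1]")
        self.alpha = alpha        # weight of the most recent observation
        self.predicted = initial  # current prediction [s]

    def observe(self, idle_length):
        """Update the prediction with the idle period that just ended."""
        self.predicted = (self.alpha * idle_length
                          + (1.0 - self.alpha) * self.predicted)
        return self.predicted

    def should_shut_down(self, break_even):
        """Shut down only if the predicted idle period exceeds break_even."""
        return self.predicted > break_even


if __name__ == "__main__":
    predictor = IdlePredictor(alpha=0.5)
    for idle in [4.0, 4.0, 0.5, 0.5]:
        print(predictor.observe(idle), predictor.should_shut_down(2.5))
```

A predictor like this adapts to changes in the workload, but it mispredicts whenever the idle pattern changes abruptly, which is why the prediction problem is hard.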
Dynamic Power Management (DPM)

For many embedded systems, DPM techniques like those presented before are not appropriate:
- They have time constraints: we have to keep deadlines (usually we cannot afford shut-down and wake-up times).
- The OS is simple and fast: no sophisticated run-time techniques.
- The application is known at design time: we know a lot about the application and can optimize already at design time.
