Hardware-Software Codesign
9. Worst Case Execution Time Analysis
Lothar Thiele
Swiss Federal Institute of Technology, Computer Engineering and Networks Laboratory
System Design
[Figure: design flow diagram. Labels: Specification, System Synthesis, Estimation, SW-Compilation, Instruction Set, HW-Synthesis, Intellectual Prop. Code, Intellectual Prop. Block, Machine Code, Net lists.]
Performance Estimation Methods – Illustration
[Figure: estimates of a quantity (e.g. delay) between best-case and worst-case. Labels: real system, measurement, simulation, probabilistic estimation, worst case (formal) analysis, chapter 6, chapter 9-10.]
Contents
• Introduction: problem statement, tool architecture
• Program Path Analysis
• Value Analysis
• Caches: must, may analysis
• Pipelines: abstract pipeline models
• Integrated analyses
The slides are based on lectures of Reinhard Wilhelm.
Industrial Needs
Hard real-time systems abound, often in safety-critical applications:
• Aeronautics, automotive, train industries, manufacturing control
• Side airbag in a car: reaction in < 10 ms
• Wing vibration of an airplane: sensing every 5 ms
Hard Real-Time Systems
• Embedded controllers are expected to finish their tasks reliably within time bounds.
• Task scheduling must be performed.
• Essential: an upper bound on the execution time of every task must be statically known.
• This upper bound is commonly called the Worst-Case Execution Time (WCET).
• Analogously, the lower bound is called the Best-Case Execution Time (BCET).
Measurement – Industry's “best practice”
[Figure: distribution of execution times along the execution time axis. Labels: Best Case Execution Time, Worst Case Execution Time, Upper bound, Execution Time Measurement (marked "Unsafe").]
Does this really work?
Works if either
• worst-case input can be determined, or
• exhaustive measurements are performed
Otherwise: determine an upper bound from the execution times of instructions.
(Most of) Industry’s Best Practice
Measurements: determine execution times directly by observing the execution, or a simulation, on a set of inputs.
• In general, this does not guarantee an upper bound for all executions.
• Exhaustive execution is in general not possible: the space (input domain) × (set of initial execution states) is too large.
Compute upper bounds along the structure of the program:
• Programs are hierarchically structured; statements are nested inside statements.
• So, try to compute the upper bound for a statement from the upper bounds of its constituents. Does this work?
Sequence of Statements
A ≡ A1; A2
Constituents of A: A1 and A2
The upper bound for A is the sum of the upper bounds for A1 and A2:
ub(A) = ub(A1) + ub(A2)
Conditional Statement
A ≡ if B then A1 else A2
Constituents of A:
1. condition B
2. statements A1 and A2
ub(A) = ub(B) + max(ub(A1), ub(A2))
[Figure: flowchart; test B branches (yes / no) to A1 and A2.]
Loops
A ≡ for i ← 1 to 100 do A1
ub(A) = ub(i ← 1) + 100 × ( ub(i ≤ 100) + ub(A1) ) + ub(i ≤ 100)
[Figure: flowchart; i ← 1, then test i ≤ 100; "yes" leads to A1 and back to the test, "no" leaves the loop.]
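The three composition rules (sequence, conditional, loop) can be applied recursively over the statement tree. The following is a minimal sketch, not the method of any particular tool; the class names `Basic`, `Seq`, `If`, `For` and all cycle counts are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Basic:
    cycles: int  # assumed constant upper bound for an atomic statement


@dataclass
class Seq:
    first: "Stmt"
    second: "Stmt"


@dataclass
class If:
    cond: "Stmt"
    then_branch: "Stmt"
    else_branch: "Stmt"


@dataclass
class For:
    init: "Stmt"
    test: "Stmt"
    body: "Stmt"
    max_iterations: int  # loop bound, must be known statically


Stmt = Union[Basic, Seq, If, For]


def ub(stmt):
    """Upper bound of a statement, computed from the bounds of its constituents."""
    if isinstance(stmt, Basic):
        return stmt.cycles
    if isinstance(stmt, Seq):
        return ub(stmt.first) + ub(stmt.second)
    if isinstance(stmt, If):
        return ub(stmt.cond) + max(ub(stmt.then_branch), ub(stmt.else_branch))
    if isinstance(stmt, For):
        # n passing tests plus n bodies, then one final failing test
        return (ub(stmt.init)
                + stmt.max_iterations * (ub(stmt.test) + ub(stmt.body))
                + ub(stmt.test))
    raise TypeError(f"unknown statement: {stmt!r}")


# Hypothetical program: a 100-iteration loop whose body is a conditional.
loop = For(init=Basic(1), test=Basic(2),
           body=If(Basic(2), Basic(5), Basic(3)),
           max_iterations=100)
print(ub(loop))  # 1 + 100 * (2 + (2 + 5)) + 2 = 903
```

The recursion is only as good as the `Basic` leaves: it presumes that each atomic statement has a fixed cycle count regardless of context.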
Where to start?
Assignment x ← a + b, compiled to:
  load a
  load b
  add
  store x
Assumes constant execution times for instructions:
  instruction   cycles
  add           4
  load m        12
  store m       14
  move          1
ub(x ← a + b) = cycles(load a) + cycles(load b) + cycles(add) + cycles(store x)
Not applicable to modern processors!
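Under the constant-time assumption, the bound for the assignment is a plain table lookup and sum. The sketch below assumes the slide's cycle table reads add 4, load 12, store 14, move 1; the names `CYCLES` and `ub_straight_line` are hypothetical.

```python
# Cycle table as read from the slide (assumed: add 4, load 12, store 14, move 1).
CYCLES = {"add": 4, "load": 12, "store": 14, "move": 1}


def ub_straight_line(instructions):
    """Sum fixed per-instruction costs; ignores caches, pipelines, etc."""
    return sum(CYCLES[opcode] for opcode, _operand in instructions)


# x <- a + b
code = [("load", "a"), ("load", "b"), ("add", None), ("store", "x")]
total = ub_straight_line(code)
print(total)  # 12 + 12 + 4 + 14 = 42
```

The function never looks at the operand or at what ran before, which is exactly why this model breaks down once instruction timing depends on the execution history.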
Modern Hardware Features
Modern processors increase performance by using caches, pipelines, branch prediction, and speculation.
These features make WCET computation difficult: execution times of instructions vary widely.
• Best case, everything goes smoothly: no cache miss, operands ready, needed resources free, branch correctly predicted.
• Worst case, everything goes wrong: all loads miss the cache, needed resources are occupied, operands are not ready.
• The span may be several hundred cycles.
Access Times
x = a + b;
  LOAD r2, _a
  LOAD r1, _b
  ADD r3,r2,r1
[Figure: bar chart for the PPC 755, "Execution Time (Clock Cycles)", comparing Best Case and Worst Case; axis from 0 to 350 clock cycles.]
Timing Accidents and Penalties
• Timing accident: a cause for an increase of the execution time of an instruction
• Timing penalty: the associated increase
Types of timing accidents:
• Cache misses
• Pipeline stalls
• Branch mispredictions
• Bus collisions
• Memory refresh of DRAM
• TLB miss
Overall Approach: Modularization
Micro-architecture Analysis:
• Uses abstract interpretation
• Excludes as many timing accidents as possible
• Determines the WCET for basic blocks (in contexts)
Worst-case Path Determination:
• Maps the control flow graph to an integer linear program
• Determines the upper bound and the associated path
Overall Structure
[Figure: tool architecture, split into Micro-architecture Analysis and Worst-case Path Determination. Labels: Executable program, CFG Builder, Control-Flow-Graph, Loop Unfolding (to improve WCET bounds for loops), Static Analyses, Value Analyzer, Cache/Pipeline Analyzer, Micro-Architecture Timing Information, Loop-Bounds, Path Analysis, ILP-Generator, LP-Solver, Evaluation, WCET-Visualization.]
Control Flow Graph (CFG)
     what_is_this {
 1     read (a,b);
 2     done = FALSE;
 3     repeat {
 4       if (a>b)
 5         a = a-b;
 6       elseif (b>a)
 7         b = b-a;
 8       else done = TRUE;
 9     } until done;
10     write (a); }
[Figure: CFG whose nodes are the numbered statements 1, 2, 4, 5, 6, 7, 8, 9, 10. Edge labels: 4→5 (a>b), 4→6 (a<=b), 6→7 (a<b), 6→8 (a=b), 9→4 (!done), 9→10 (done).]
Program Path Analysis
• Which sequence of instructions is executed in the worst case (longest runtime)?
• Problem: the number of possible program paths grows exponentially with the program length.
Model:
• We know the upper bounds (number of cycles) for each basic block from static analysis.
• The number of loop iterations must be bounded.
Concept:
• Transform the structure of the CFG into a set of (integer) linear equations.
• The solution of the Integer Linear Program (ILP) yields a bound on the WCET.
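To make the ILP view concrete, the sketch below sets up the usual formulation for a tiny hypothetical CFG: one integer variable per edge, flow conservation at every basic block, one loop-bound constraint, and the objective "sum of block cost × block count". No ILP solver is used; the feasible integer points are simply enumerated, which is only workable for toy graphs. Block names, costs, and the loop bound of 3 are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical CFG: A (entry) -> H (loop header) -> C (condition) -> T or E -> H,
# and H -> X (exit). Costs are assumed per-block upper bounds in cycles.
COST = {"A": 2, "H": 3, "C": 2, "T": 10, "E": 4, "X": 1}
EDGES = [("A", "H"), ("H", "C"), ("H", "X"),
         ("C", "T"), ("C", "E"), ("T", "H"), ("E", "H")]
ENTRY, EXIT = "A", "X"
LOOP_BOUND = 3  # assumed: the loop body runs at most 3 times per loop entry


def node_counts(edge_counts):
    """Per-block execution counts, or None if flow conservation is violated."""
    counts = {}
    for node in COST:
        inflow = sum(c for (_s, d), c in zip(EDGES, edge_counts) if d == node)
        outflow = sum(c for (s, _d), c in zip(EDGES, edge_counts) if s == node)
        if node == ENTRY:
            inflow = 1   # the program is entered exactly once
        if node == EXIT:
            outflow = 1  # and left exactly once
        if inflow != outflow:
            return None
        counts[node] = inflow
    return counts


def wcet_bound():
    """Maximise total cost over all integer edge counts satisfying the constraints."""
    best_total, best_counts = -1, None
    # In this CFG no edge can run more than LOOP_BOUND times, so 0..LOOP_BOUND suffices.
    for edge_counts in product(range(LOOP_BOUND + 1), repeat=len(EDGES)):
        x = dict(zip(EDGES, edge_counts))
        if x[("H", "C")] > LOOP_BOUND * x[("A", "H")]:  # loop-bound constraint
            continue
        counts = node_counts(edge_counts)
        if counts is None:
            continue
        total = sum(COST[n] * counts[n] for n in COST)
        if total > best_total:
            best_total, best_counts = total, counts
    return best_total, best_counts


bound, counts = wcet_bound()
print(bound, counts)
```

For these assumed costs the maximum takes the expensive branch T in all three iterations, giving 2 + 4·3 + 3·2 + 3·10 + 1 = 51 cycles. A real tool would hand the same constraints to an ILP solver instead of enumerating.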
Basic Block
Definition: A basic block is a sequence of instructions where the control flow enters at the beginning and exits at the end, without stopping in between or branching (except at the end).
Example:
  t1 := c - d
  t2 := e * t1
  t3 := b * t1
  t4 := t2 + t3
  if t4 < 10 goto L
Basic Blocks
Determine the basic blocks of a program:
1. Determine the first instructions of blocks:
   • the first instruction
   • targets of un/conditional jumps
   • instructions that follow un/conditional jumps
2. Determine the basic blocks:
   • there is a basic block for each block beginning
   • the basic block consists of the block beginning and runs until the next block beginning (exclusive) or until the program ends
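The two steps can be sketched directly. The instruction encoding below, a tuple (label, text, jump target), is a hypothetical representation chosen for this example; the sample program is the small summation loop used elsewhere in these slides.

```python
def find_basic_blocks(program):
    """Partition a list of (label, text, jump_target) tuples into basic blocks."""
    label_index = {label: i for i, (label, _text, _target) in enumerate(program)
                   if label is not None}

    # Step 1: block beginnings ("leaders").
    leaders = {0} if program else set()          # the first instruction
    for i, (_label, _text, target) in enumerate(program):
        if target is not None:
            leaders.add(label_index[target])     # target of a jump
            if i + 1 < len(program):
                leaders.add(i + 1)               # instruction following a jump

    # Step 2: each block runs from its leader up to the next leader (exclusive).
    starts = sorted(leaders)
    ends = starts[1:] + [len(program)]
    return [[text for _l, text, _t in program[s:e]] for s, e in zip(starts, ends)]


PROGRAM = [
    (None, "i := 0", None),
    (None, "t2 := 0", None),
    ("L", "t2 := t2 + i", None),
    (None, "i := i + 1", None),
    (None, "if i < 10 goto L", "L"),
    (None, "x := t2", None),
]

blocks = find_basic_blocks(PROGRAM)
for block in blocks:
    print(block)
```

For this program the leaders are instruction 0, the jump target L, and the instruction after the conditional jump, so three blocks result.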
Control Flow Graph with Basic Blocks
"Degenerated" control flow graph (CFG): the nodes are the basic blocks.
     i := 0
     t2 := 0
  L: t2 := t2 + i
     i := i + 1
     if i < 10 goto L
     x := t2
[Figure: CFG with three basic blocks; the loop block has a back edge to itself labeled i < 10 and an edge to the final block labeled i >= 10.]
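Once the basic blocks are known, the CFG edges follow from each block's last instruction: a jump adds an edge to its target, and a block that ends without a jump, or with a conditional one, also falls through to the next block. The block encoding (name, jump target, conditional flag) and the names `B0` and `B2` are hypothetical, introduced only for this sketch.

```python
def build_cfg(blocks):
    """Return CFG edges for blocks given as (name, jump_target, is_conditional)."""
    edges = []
    for i, (name, target, conditional) in enumerate(blocks):
        if target is not None:
            edges.append((name, target))              # taken jump
        falls_through = target is None or conditional
        if falls_through and i + 1 < len(blocks):
            edges.append((name, blocks[i + 1][0]))    # fall-through to next block
    return edges


# The three basic blocks of the summation loop.
BLOCKS = [
    ("B0", None, False),   # i := 0; t2 := 0
    ("L", "L", True),      # t2 := t2 + i; i := i + 1; if i < 10 goto L
    ("B2", None, False),   # x := t2
]

cfg_edges = build_cfg(BLOCKS)
print(cfg_edges)  # [('B0', 'L'), ('L', 'L'), ('L', 'B2')]
```

The self-edge on `L` is the back edge taken while i < 10; the edge to `B2` is the fall-through taken when i >= 10.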
Example
Source program:
  /* k >= 0 */
  s = k;
  WHILE (k < 10) {
    IF (ok) j++;
    ELSE { j = 0; ok = true; }
    k++;
  }
  r = j;
Basic blocks:
  B1: s = k;
  B2: WHILE (k<10)
  B3: if (ok)
  B4: j++;
  B5: j = 0; ok = true;
  B6: k++;
  B7: r = j;