Review: FP Pipeline Model
[Figure: pipeline with stages IF, ID/REG, EX, MEM, WB and three FP units: a 4-stage fully pipelined adder (A1 A2 A3 A4), a non-pipelined 4-cycle multiplier (MUL), and a non-pipelined 6-cycle divider (DIV)]
Review: Summary
• If instructions A and B are in flight (assume A issued before B):
  – They will write to WB on different cycles (structural hazard solution)
  – Destination registers of A and B are distinct (WAW solution)
  – Source registers of B differ from the destination register of A (RAW)
  – Can source registers of A match the destination register of B?
• Stalled instructions are held in the ID stage for RAW and WAW
  – Easy implementation
• In-order issue: instructions leave the ID stage in (dynamic) program order
  – Instructions leave ID with their operands from either REG or a forwarding path
• Out-of-order completion: A and B may complete out of program order if they write to different registers
• Problem if precise exceptions are needed
  – What if A raises an exception (e.g. arithmetic overflow) after B has completed?
  – What if B is an instruction like ADD.D F0, F0, F2?
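The pairwise conditions above can be sketched as a small check on register names. This is only an illustration, not the pipeline's actual logic; the `Instr` record and the `hazards` function are hypothetical names introduced here, and a store is modelled as an instruction with no destination register.

```python
from typing import NamedTuple, Optional, Set, Tuple


class Instr(NamedTuple):
    """Hypothetical record: opcode, destination register, source registers."""
    op: str
    dest: Optional[str]          # None for instructions that write no register
    srcs: Tuple[str, ...]


def hazards(a: Instr, b: Instr) -> Set[str]:
    """Data hazards between in-flight A and B, assuming A issued before B."""
    found: Set[str] = set()
    if a.dest is not None and a.dest in b.srcs:
        found.add("RAW")         # B reads what A writes
    if a.dest is not None and a.dest == b.dest:
        found.add("WAW")         # both write the same register
    if b.dest is not None and b.dest in a.srcs:
        found.add("WAR")         # B overwrites a register A still has to read
    return found


div = Instr("DIV.D", "F0", ("F2", "F4"))
add = Instr("ADD.D", "F0", ("F0", "F2"))     # the ADD.D F0, F0, F2 case above
mul = Instr("MUL.D", "F6", ("F0", "F8"))
add2 = Instr("ADD.D", "F8", ("F10", "F12"))

div_add = hazards(div, add)      # RAW and WAW on F0
mul_add2 = hazards(mul, add2)    # WAR on F8
```

The last bullet of the first group ("Can source registers of A match the destination register of B?") corresponds to the WAR branch: it is harmless while operands are read in order at ID, and becomes a hazard once later instructions can write before earlier ones have read.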
Summary
• What is the performance goal of the pipeline?
  – Try to achieve a CPI close to 1
  – Stalls for
    • Structural hazard (contention for WB) (expected to be rare)
    • WAW hazard (expected to be rare)
    • RAW hazards
  – Reduce number of stalls by forwarding
  – ??
    • Hint: Reduce penalty due to stalls
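As a reminder of how stalls show up in the CPI goal, the standard relation is CPI = ideal CPI + stall cycles per instruction. A minimal sketch, with a hypothetical helper name and made-up counts used only for illustration:

```python
def cpi(instructions: int, stall_cycles: int, ideal_cpi: float = 1.0) -> float:
    """Effective CPI = ideal CPI + average stall cycles per instruction."""
    return ideal_cpi + stall_cycles / instructions


# Illustrative numbers only: 100 instructions that suffer 25 stall cycles in total.
example_cpi = cpi(100, 25)
```

Either fewer stalls (forwarding) or a smaller penalty per stall lowers the second term.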
Instruction Level Parallelism
Head-of-Line Blocking
[Figure: cars queued on a main road behind Green Car A]
• No space in the Green Lane for Green Car A; it waits for space (structural hazard)
• Green Car A has a flat tire; it waits to be fixed (RAW hazard)
• All cars on the main road are stalled until A can progress
Example: Head-of-Line Blocking
DIV F0, F2, F4
SD F0, 0(R0)
MUL F6, F8, F10
Head-of-Line Blocking
[Figure: FP pipeline (4-stage fully pipelined adder A1 A2 A3 A4, non-pipelined 4-cycle MUL, non-pipelined 6-cycle DIV) with the DIV, SD and MUL instructions in flight]
SD will hold up the following MUL even though it is independent
Instruction Level Parallelism
[Figure: road with a separate staging area for Car A]
• Provide a separate staging area where cars can wait
• We did that for structural hazards with the ID/EX pipeline register
• Can we do it for RAW stalls?
Instruction Level Parallelism
[Figure: road with a separate staging area for Car A]
• Provide a separate staging area where cars can wait:
  – Lots more concurrency
  – What is the cost?
Instruction Level Parallelism
Motivating Example: RAW dependency between A and B
A: DIVD F0, F2, F4
B: ADDD F10, F0, F8
C: MULTD F12, F8, F14
Current multi-cycle FP unit design:
• B is stalled in the ID stage until A produces its result
• All instructions after B are also stalled until B's stall clears
Scoreboard: T = 1 to T = 8 (current design)
A: DIVD F0, F2, F4
B: ADDD F10, F0, F8
C: MULTD F12, F8, F14
[Figure: eight pipeline snapshots, one per cycle, showing IR, ID/R, the ADD, DIV and MUL units, EX, MEM and WB. B is held in ID/R from T = 2 to T = 5 while A occupies DIV; at T = 6 B moves to ADD and C reaches ID/R; at T = 7 C moves to MUL.]
C completes at cycle 12
Instruction Level Parallelism
Motivating Example: RAW dependency between A and B
A: DIVD F0, F2, F4
B: ADDD F10, F0, F8
C: MULTD F12, F8, F14
Current multi-cycle FP unit design:
• B is stalled in the ID stage until A produces its result
• All instructions after B are also stalled until B's stall clears
• C has no resource or data conflicts
• Why not allow C to execute while B waits for data?
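The effect of letting C pass the stalled B can be illustrated with a toy timing model. This is a sketch under stated assumptions, not the pipeline from the slides: it counts only execution start and finish, ignores the IF, ID, MEM and WB stages and any functional-unit contention, and issues one instruction per cycle. Its absolute cycle numbers are therefore not expected to match the cycle counts quoted in the slides. The function name `finish_times` and the tuple layout are hypothetical.

```python
from typing import Dict, List, Tuple

# (name, execution latency in cycles, names of producers it depends on)
Program = List[Tuple[str, int, Tuple[str, ...]]]


def finish_times(program: Program, staged: bool) -> Dict[str, int]:
    """Cycle at which each instruction finishes executing in the toy model.

    staged=False: an instruction cannot start before its predecessor has
    started (head-of-line blocking).
    staged=True: a waiting instruction sits in a staging area, so later
    independent instructions start as soon as they are issued.
    """
    finish: Dict[str, int] = {}
    prev_start = -1
    for issue_cycle, (name, latency, deps) in enumerate(program):
        ready = max((finish[d] for d in deps), default=0)   # RAW: wait for producers
        start = max(issue_cycle, ready)
        if not staged:
            start = max(start, prev_start + 1)              # blocked behind predecessor
        finish[name] = start + latency
        prev_start = start
    return finish


# Latencies follow the FP units in the review slide: DIV 6, ADD 4, MUL 4.
program: Program = [("A", 6, ()), ("B", 4, ("A",)), ("C", 4, ())]
blocking = finish_times(program, staged=False)
with_staging = finish_times(program, staged=True)
```

In this model B finishes at the same time either way, since it must wait for A; only the independent C benefits from the staging area.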
Scoreboard with Staging Area: T = 1 to T = 9
A: DIVD F0, F2, F4
B: ADDD F10, F0, F8
C: MULTD F12, F8, F14
[Figure: nine pipeline snapshots, one per cycle, showing IR, the Issue stage, an Issue Register (R) in front of each of the ADD, DIV and MUL units (the staging area), EX, MEM and WB. From T = 2, B waits in the ADD Issue Register while A occupies DIV; C passes B and proceeds to the MUL unit from T = 3; B enters ADD at T = 7.]
All complete in 10 cycles
Scoreboard Operation: RAW
A: DIV.D F0, F2, F4
B: MUL.D F6, F0, F8
C: ADD.D F10, F12, F14

A  IF I  R  /  /  /  /  /  W
B     IF I  R  R  R  R  R  R  *  *  *  *  W
C        IF I  R  +  +  W

(I = Issue, R = waiting in Issue Register / dispatch, "/" "*" "+" = executing in DIV, MUL, ADD, W = write)
B waits in the Issue Register till A writes to F0. Meanwhile C enters, completes the ADD, writes F10 and exits.
Scoreboard Operation: WAW
A: DIV.D F0, F2, F4
B: MUL.D F6, F0, F8
C: ADD.D F0, F10, F12

A  IF I  R  /  /  /  /  /  W                  (writes F0)
B     IF I  R  R  R  R  R  R  *  *  *  *  W
C        IF I  +  +  W                        (writes F0)

WAW hazard since C's write will be lost when A completes

Get rid of WAW hazard
• Do not issue an instruction with the same destination register as an in-flight instruction
• Here: do not issue C while an earlier in-flight instruction (A) has the same destination register

A  IF I  R  /  /  /  /  /  W
B     IF I  R  R  R  R  R  R  *  *  *  *  W
C        IF I  I  I  I  I  I  I  R  +  +  W
Scoreboard Operation: WAR
A: DIV.D F0, F2, F4
B: MUL.D F6, F0, F8
C: ADD.D F8, F10, F12

A  IF I  R  /  /  /  /  /  /  W
B     IF I  R  R  R  R  R  R  R  *  *  *  *  W      (reads F0 and F8)
C        IF I  R  +  +  W                           (writes F8)

WAR hazard since C's write to F8 occurs before B reads F8

Get rid of WAR hazard
• Need to be careful not to confuse a WAR with a RAW
Scoreboard Operation: WAR Hazards
Problem of distinguishing a RAW dependency from a WAR dependency

RAW:
A: DIV.D F0, F2, F4
B: MUL.D F6, F0, F8
DIV should be allowed to write F0 before MUL reads

WAR:
C: MUL.D F6, F0, F8
D: DIV.D F0, F2, F4
DIV should not be allowed to write F0 till MUL reads

Both at once:
P: MUL.D F6, F0, F8
Q: DIV.D F0, F2, F4
R: DIV.D F10, F0, F4
• Q needs to distinguish between P and R, both of which may be stalled waiting to read F0
• Q must wait for P to read F0 before overwriting it
• Q must write to F0 before R reads it
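One way to tell P from R is to record program order, so that a writer only waits for readers issued before it. The sketch below assumes a hypothetical `seq` field (program-order position) and a `has_read` flag per in-flight instruction; the slides describe this bookkeeping as a Data Flow Graph, and this is only one possible way to model it.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class InFlight:
    """Hypothetical scoreboard entry for one in-flight instruction."""
    name: str
    seq: int                     # program-order position (assumed bookkeeping)
    dest: Optional[str]
    srcs: Tuple[str, ...]
    has_read: bool = False       # True once the source registers have been read


def safe_to_write(writer: InFlight, in_flight: List[InFlight]) -> bool:
    """True if no earlier instruction still has to read writer.dest (no WAR)."""
    return not any(
        other.seq < writer.seq
        and not other.has_read
        and writer.dest in other.srcs
        for other in in_flight
    )


p = InFlight("P", 0, "F6", ("F0", "F8"))    # MUL.D F6, F0, F8
q = InFlight("Q", 1, "F0", ("F2", "F4"))    # DIV.D F0, F2, F4
r = InFlight("R", 2, "F10", ("F0", "F4"))   # DIV.D F10, F0, F4
window = [p, q, r]

before_p_reads = safe_to_write(q, window)   # P (earlier) has not read F0 yet
p.has_read = True
after_p_reads = safe_to_write(q, window)    # R is later than Q, so it does not block Q
```

R also reads F0 and has not read it yet, but because it follows Q in program order it is a RAW consumer of Q's result and never delays Q's write.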
Operation of Issue (I) Stage
Issue Stage: Every cycle
• Check whether the instruction in the Instruction Register (IR) should be issued or stalled
  – A stalled instruction waits in IR and holds up all succeeding instructions
  – An issued instruction moves to the Issue Register of the functional unit it needs
• The instruction in the Instruction Register (IR) is stalled if either:
  – Structural hazard for an ID/EX register, or
  – WAW dependency with some earlier issued instruction
    • Only 1 instruction with the same destination register is issued at any time
    • No WAW hazards
• Instruction to be issued:
  – Update Data Flow Graph
    • Maintains dependency information between instructions
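The issue decision above can be sketched as a single function. The names `issue_decision`, `issue_reg_busy` and `pending_dests` are hypothetical; the Data Flow Graph update is reduced here to recording the destination register, which is a simplification of whatever the real structure holds.

```python
from typing import Dict, Optional, Set


def issue_decision(fu: str, dest: Optional[str],
                   issue_reg_busy: Dict[str, bool],
                   pending_dests: Set[str]) -> str:
    """Decide what happens to the instruction currently in IR this cycle."""
    if issue_reg_busy.get(fu, False):
        return "stall: structural"       # Issue Register of the needed FU is occupied
    if dest is not None and dest in pending_dests:
        return "stall: WAW"              # an in-flight instruction writes the same register
    # Issue: occupy the Issue Register and record the pending write.
    issue_reg_busy[fu] = True
    if dest is not None:
        pending_dests.add(dest)
    return "issue"


busy = {"ADD": False, "MUL": False, "DIV": False}
pending: Set[str] = set()

first = issue_decision("DIV", "F0", busy, pending)    # DIV.D F0, F2, F4
second = issue_decision("MUL", "F6", busy, pending)   # MUL.D F6, F0, F8
third = issue_decision("ADD", "F0", busy, pending)    # ADD.D F0, F10, F12
fourth = issue_decision("MUL", "F12", busy, pending)  # another multiply behind the MUL.D
```

Note that a RAW dependency (MUL.D reading F0) does not stall issue; it is handled later, at dispatch.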
Operation of Dispatch (R) and Write (W) Stages
Dispatch Stage: Every cycle
• Check whether the instruction in an Issue Register can be dispatched or must stall
• The instruction in an Issue Register is stalled if it has
  – A structural hazard for the FU, or
  – A RAW dependency with an in-flight instruction
  – It waits until the FU is available and all its source operands are ready (RAW dependencies satisfied)
• The instruction in an Issue Register is dispatched when all its operands are available
  – Read the source registers from the Register File
  – Dispatch the instruction to the FU stage
  – An operand is available when there is no in-flight instruction with a matching destination register
Write Stage: Every cycle
• Check which instructions in the EX/WB pipeline register are SAFE-TO-WRITE (WAR hazards)
  – Select an instruction I that is SAFE-TO-WRITE
  – Write the result of I to its destination register
  – Update the Data Flow Graph to indicate the write by I
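The dispatch test can be sketched as follows. One assumption is made explicit here: "in-flight instruction with a matching destination register" is taken to mean an instruction earlier in program order, so that a later writer (the WAR case) does not block an earlier reader. The `pending_writers` map and the sequence numbers are hypothetical bookkeeping; since issue prevents WAW, at most one in-flight writer per register is assumed.

```python
from typing import Dict, Tuple


def can_dispatch(seq: int, srcs: Tuple[str, ...], fu_free: bool,
                 pending_writers: Dict[str, int]) -> bool:
    """True if the instruction in the Issue Register may move to its FU.

    pending_writers maps a register to the program-order position of the
    in-flight instruction that will write it.
    """
    if not fu_free:
        return False                         # structural hazard on the FU
    for reg in srcs:
        writer = pending_writers.get(reg)
        if writer is not None and writer < seq:
            return False                     # RAW: an earlier producer has not written yet
    return True


# A: DIV.D F0, F2, F4 (seq 0), B: MUL.D F6, F0, F8 (seq 1), C: ADD.D F10, F12, F14 (seq 2)
writers = {"F0": 0, "F6": 1, "F10": 2}
b_waits = can_dispatch(1, ("F0", "F8"), True, writers)
c_goes = can_dispatch(2, ("F12", "F14"), True, writers)

del writers["F0"]                            # write stage: A writes F0
b_goes = can_dispatch(1, ("F0", "F8"), True, writers)
```

Removing the entry for F0 plays the role of the Write stage's Data Flow Graph update: once A's write is recorded, B's operand becomes available on the next check.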