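Speculative Execution
• The outcome of the decision point (?) at the end of Block 1 is unknown.
• Predict the future execution path.
• Begin executing instructions from the predicted path.
[Figure: Block 1 ends in a decision point (?) with successor blocks A and B, followed by Block 2. The block on the predicted path is speculatively executed, and Block 2 is also speculatively executed.]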
Speculative Execution: Branch Resolved
• Roll back execution to the decision point if the path was incorrectly predicted.
[Figure: Block 1, decision point (?), blocks A and B, and Block 2; the speculatively executed blocks are marked.]
Speculative Execution: Rollback
• Roll back to the decision point.
• Begin executing instructions from the correct path.
• The rollback mechanism must undo the consequences of the speculative tasks.
[Figure: Block 1, decision point (?), blocks A and B, and Block 2; the speculatively executed blocks are marked.]
Handling Mis-Speculation
• Method 1: Checkpointing and rollback
    • Save enough system state at the decision point to restart execution from that point.
• Method 2: Prevent speculative instructions from updating system state
    • Writes by a speculative instruction are stalled until its speculative status is resolved.
    • Speculative instructions can still execute and make progress.
    • Use renaming mechanisms to transfer information between speculative instructions: rename source registers (a la Tomasulo).
    • On resolution:
        • Mis-speculation: squash the speculative instructions.
        • Correct execution: commit the writes of the speculative instructions.
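Method 2 can be illustrated with a minimal sketch. The class and method names below (`SpeculativeRegisters`, `write`, `read`, `resolve`) are hypothetical and not from the slides; the sketch assumes a single level of speculation and models only registers, not memory.

```python
class SpeculativeRegisters:
    """Committed register values plus a buffer of stalled speculative writes."""

    def __init__(self, initial):
        self.committed = dict(initial)  # architectural state
        self.pending = {}               # speculative writes, not yet applied

    def write(self, reg, value):
        # The write is held back until the speculation is resolved.
        self.pending[reg] = value

    def read(self, reg):
        # Speculative instructions see earlier speculative results first.
        return self.pending.get(reg, self.committed[reg])

    def resolve(self, predicted_correctly):
        if predicted_correctly:
            self.committed.update(self.pending)  # commit the writes
        self.pending.clear()                     # squash, or clean up after commit


regs = SpeculativeRegisters({"F0": 1.0, "F4": 0.0})
regs.write("F4", 2.5)                     # speculative write to F4
value_seen = regs.read("F4")              # forwarded from the pending buffer
regs.resolve(predicted_correctly=False)   # mis-speculation: squash
```

After the squash, `F4` still holds its committed value, which is the property that makes a rollback of register state unnecessary under Method 2.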
Reorder Buffer for Speculation
• Reorder Buffer (RoB): mechanism to support in-order writes of instructions
    • Buffer the results of completing instructions.
    • Update destination registers or memory locations in order.
    • Later instructions that complete before earlier ones wait in the RoB until the earlier ones update their destinations.
• Example: instruction A followed by instruction B
    • A commits: updates its destination.
    • B (normal instruction): completes execution, then remains in the RoB until the preceding instruction(s) complete their writes.
    • B (speculative instruction): squashed if mis-speculated, committed otherwise.
Extending Tomasulo Pipeline with Reorder Buffer
Pipeline stages: ISSUE, DISPATCH, EX, WRITE, COMMIT
• RoB:
    • Storage to buffer writes until they are ready to commit.
    • Circular queue, written and released in FIFO (instruction) order.
    • Each entry allocates space for one instruction to store its result plus identifying information.
• The Issue Unit adds newly issued instructions at the tail of the queue.
• The Commit Unit removes ready-to-commit instructions from the head of the queue.
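The circular-queue behaviour can be sketched as follows. This is an illustrative model only: the class `ReorderBuffer`, its method names, and the dictionary layout of an entry are assumptions made for the example, not a description of any particular hardware.

```python
class ReorderBuffer:
    """Fixed-size circular queue: allocate at the tail, release at the head."""

    def __init__(self, size):
        self.slots = [None] * size
        self.head = 0    # oldest instruction still in the buffer
        self.tail = 0    # next free slot
        self.count = 0   # number of occupied slots

    def allocate(self, dest):
        """Issue: add an entry at the tail, or return None if the buffer is full."""
        if self.count == len(self.slots):
            return None
        index = self.tail
        self.slots[index] = {"dest": dest, "done": False, "value": None}
        self.tail = (self.tail + 1) % len(self.slots)
        self.count += 1
        return index

    def complete(self, index, value):
        """Write: record the result; the destination is not updated yet."""
        self.slots[index]["done"] = True
        self.slots[index]["value"] = value

    def commit(self):
        """Commit: release the head entry only if it has completed its write."""
        if self.count == 0 or not self.slots[self.head]["done"]:
            return None
        entry = self.slots[self.head]
        self.slots[self.head] = None
        self.head = (self.head + 1) % len(self.slots)
        self.count -= 1
        return entry["dest"], entry["value"]


rob = ReorderBuffer(size=2)
a = rob.allocate("F4")
b = rob.allocate("F8")
full = rob.allocate("F10")   # buffer is full, so issue would stall
rob.complete(b, 7.0)         # the later instruction finishes first
early = rob.commit()         # nothing commits: the head (A) is not done
rob.complete(a, 3.0)
first = rob.commit()         # A commits
second = rob.commit()        # then B commits, in program order
```

Even though B finishes before A, the results leave the buffer in issue order, which is the in-order write property the RoB provides.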
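Tomasulo’s Pipeline with RoB-based Commit
[Figure: block diagram of the pipeline. Labeled components: IR, Issue, Dispatch, RS, LSQ, EX, WB, COMMIT, RoB, REG FILE, and the Common Data Bus (CDB). Annotation in the figure: "Accesses RoB and REG FILE".]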
Extending Tomasulo Pipeline with Reorder Buffer
Pipeline stages: ISSUE, DISPATCH, EX, WRITE, COMMIT
Destination registers need to distinguish between 3 possible states:
1. Available (A): No pending write to the register. The register has its final value.
2. In Flight (I): The writer instruction is in flight; the last instruction with that destination register has not yet completed its write.
3. Ready (R): The writer has completed its write but has not yet committed. The value from the reorder buffer will be written to the register when it commits.
Note: A and I are the same two register states of regular Tomasulo.
The state of a register is used by an issuing instruction to find out where to get its source operand.
Key Features (Tomasulo with RoB)
Issue instruction X (ALU instruction):
1. Allocate an RS entry: let RS_X denote the RS index allocated to X.
2. Allocate a RoB entry: instruction X will be identified by its Reorder Buffer index RB_X.
3. Update the source operand fields in the RS.
4. Update the state of the destination register.
• Stall issue until a free Reservation Station and a free Reorder Buffer slot are available.
• The fields of Reservation Station RS_X are exactly the same as in regular Tomasulo.
• RB_X is made up of the following fields: Tag, Destination, State, Value
    • Tag: the identifier for instruction X (usually implicit: the index of the entry in the RoB).
    • Destination: the destination register of instruction X.
    • State: Yes/No, whether the RoB entry holds a valid result. Yes: X has completed its write. No: X is in flight.
    • Value: the result of X (broadcast on the CDB during the write by X).
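A rough sketch of the allocation and stall condition at issue is given below. The names `RobEntry` and `try_issue` are hypothetical, and for brevity the RoB is modeled as a growing list whose index serves as the implicit tag, rather than as a circular queue.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RobEntry:
    dest: str                       # Destination register of the instruction
    done: bool = False              # State: False = in flight, True = write completed
    value: Optional[float] = None   # Value: result broadcast on the CDB


def try_issue(dest, free_rs, rob, rob_capacity):
    """Allocate an RS and a RoB entry, or return None to signal a stall."""
    if not free_rs or len(rob) >= rob_capacity:
        return None
    rs_index = free_rs.pop(0)
    rob.append(RobEntry(dest))
    return rs_index, len(rob) - 1   # (RS index, implicit RoB tag)


free_rs = [0, 1]   # indices of free reservation stations
rob = []
issued_a = try_issue("F4", free_rs, rob, rob_capacity=2)
issued_b = try_issue("F8", free_rs, rob, rob_capacity=2)
issued_c = try_issue("F10", free_rs, rob, rob_capacity=2)   # stalls
```

The third issue attempt returns `None` because both the reservation stations and the RoB are exhausted; either condition alone would be enough to stall.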
Example
A: MUL F4, F0, F2
Issue A:
• Tomasulo’s without RoB
    • Reservation Station RS_A: v0, v2, MUL, destination F4
    • Register F4: tagged with RS_A
• Tomasulo’s with RoB
    • Reservation Station: v0, v2, MUL, result tag RB_A
    • RoB entry RB_A: Destination F4, State No, Value -----
    • Register F4: STATE I, writer RB_A
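Example
A: MUL F4, F0, F2
B: ADD F8, F4, F6
Issue B:
• Tomasulo’s without RoB
    • Reservation Station RS_B: first source tagged RS_A (waiting on F4), v6, ADD, destination F8
• Tomasulo’s with RoB
    • RoB entry RB_A: Destination F4, State No, Value -----
    • Reservation Station for B: first source tagged RB_A (waiting on F4), v6, ADD, result tag RB_B
    • RoB entry RB_B: Destination F8, State No, Value -----
    • Register F8: STATE I, writer RB_B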
Key Features: Instruction Issue (contd.)
For each source operand register S, the action depends on the state of the source register (A, I, R):
• A: copy the value from S immediately to RS_X.
• I (pending write by instruction J): tag the source field of RS_X with RB_J.
• R (pending update from RB_J): read the value from RoB entry RoB[RB_J] and copy it to RS_X.
Example: ADDD F2, F0, F4, with source register F0 in each of the three states:
• STATE A: RS_X receives the value of F0 from the register file.
• STATE I, writer RB_J: RS_X receives the tag RB_J.
• STATE R, writer RB_J: RS_X receives the value stored in RoB[RB_J].
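The three cases can be written as a small lookup function. This is a sketch under assumed data layouts: `reg_state` maps a register name to a `(state, writer_tag)` pair and `rob` maps a tag to an entry dictionary; both layouts, and the function name `read_source`, are made up for illustration.

```python
def read_source(reg, reg_state, reg_file, rob):
    """Return ("value", v) if the operand is available, else ("tag", rob_tag)."""
    state, writer = reg_state[reg]
    if state == "A":
        # No pending write: the register file holds the final value.
        return ("value", reg_file[reg])
    if state == "I":
        # Writer still in flight: wait for its tag on the CDB.
        return ("tag", writer)
    if state == "R":
        # Writer finished but has not committed: the value lives in the RoB.
        return ("value", rob[writer]["value"])
    raise ValueError(f"unknown register state {state!r}")


reg_file = {"F0": 1.5}
rob = {3: {"dest": "F0", "done": True, "value": 9.0}}

from_file = read_source("F0", {"F0": ("A", None)}, reg_file, rob)
from_tag = read_source("F0", {"F0": ("I", 3)}, reg_file, rob)
from_rob = read_source("F0", {"F0": ("R", 3)}, reg_file, rob)
```

Note that in state R the register file value (1.5 here) is stale, so reading it instead of the RoB entry would give the issuing instruction the wrong operand.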
Key Features: Instruction Issue (contd.)
For destination register D:
• Make X the writer of D.
• Set the state of D to I (this implicitly cancels the previous pending write to D, if any).
Instruction Issue Example (contd.)
A: MUL F4, F0, F2
B: ADD F8, F4, F6
After issue of A and B:
• Register F4: STATE I, writer RB_A
• Register F8: STATE I, writer RB_B
• RS_A: v0, v2, MULD, result tag RB_A
• RS_B: first source tagged RB_A (F4), v6, ADDD, result tag RB_B
Reorder Buffer:
RB index   DEST REG   STATUS   VALUE
RB_A       F4         NO       -       <- head
RB_B       F8         NO       -       <- tail
Dispatch and Write Units
• The Execute unit executes instructions from the RS whose operands are available.
• When execution is complete, the Write Unit is notified.
• The Write Unit broadcasts the result of completing instruction X on the CDB.
• All units that are waiting on the result of X monitor the broadcast value:
    • Reservation Stations copy the value into the source field of the RS.
    • The Reorder Buffer entry for X copies the value into the Value field of RB_X. (*)
    • Destination register D changes state from I to R if X is the current writer; however, D is not updated with the broadcast value. (*)
• To identify the broadcast destination:
    • The Write Unit broadcasts the tag (RB_X) identifying the completing instruction.
    • Units whose tag matches the broadcast tag are updated as above.
(*) The major change from regular Tomasulo. In regular Tomasulo the value is copied into the destination register and the state of the register changes from I to A.
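The effect of one CDB broadcast on the three kinds of listeners can be sketched as below. The function name `broadcast` and the data layouts (reservation stations as dictionaries with a `src` list, `reg_state` as `(state, writer_tag)` pairs) are assumptions for this example only.

```python
def broadcast(tag, value, stations, rob, reg_state):
    """Apply a CDB broadcast of (tag, value) to every unit that matches the tag."""
    # Reservation stations waiting on this tag capture the value.
    for rs in stations:
        for i, src in enumerate(rs["src"]):
            if src == ("tag", tag):
                rs["src"][i] = ("value", value)
    # The RoB entry of the completing instruction stores the result.
    rob[tag]["value"] = value
    rob[tag]["done"] = True
    # The destination register moves from I to R, but only if this
    # instruction is still its current writer. The register itself
    # is NOT updated here; that happens at commit.
    for reg, (state, writer) in list(reg_state.items()):
        if state == "I" and writer == tag:
            reg_state[reg] = ("R", tag)


# A: MUL F4, F0, F2 has tag 0; B: ADD F8, F4, F6 has tag 1 and waits on A.
stations = [{"op": "ADD", "src": [("tag", 0), ("value", 6.0)]}]
rob = [
    {"dest": "F4", "done": False, "value": None},
    {"dest": "F8", "done": False, "value": None},
]
reg_file = {"F4": 1.0, "F8": 0.0}
reg_state = {"F4": ("I", 0), "F8": ("I", 1)}

broadcast(0, 8.0, stations, rob, reg_state)
```

After the broadcast, B's reservation station has both operands, RoB entry 0 holds the result, and `reg_file["F4"]` is deliberately left at its old value.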
Example
A: MUL F4, F0, F2
B: ADD F8, F4, F6
C: ADD F10, F4, F6 /* Must get value produced by A */
A writes RESULT using the CDB:
• Tomasulo’s without Reorder Buffer
    • RESULT is copied into F4 (F4 = RESULT).
    • RS_B: RESULT, v6, ADD
• Tomasulo’s with Reorder Buffer
    • RESULT is copied into Reorder Buffer entry RB_A: Destination F4, State Yes, Value RESULT.
    • Reservation Station for B: RESULT, v6, ADD
    • F4 changes state from I to R (writer RB_A).
Instruction Issue Example (contd.)
A: MUL F4, F0, F2
B: ADD F8, F4, F6
After write of A:
• Register F4: STATE R, writer RB_A
• Register F8: STATE I, writer RB_B
• RS_B: RESULT_A (F4), v6, ADDD, result tag RB_B
Reorder Buffer:
RB index   DEST REG   STATUS   VALUE
RB_A       F4         YES      RESULT_A   <- head
RB_B       F8         NO       -          <- tail
Commit Unit
• Wait until the instruction at the head of the Reorder Buffer completes its Write stage (state = YES).
• Let the instruction at the head of the Reorder Buffer be X (assume it is not speculative), its destination register be D, and its value field be v.
    • Update D with value v.
    • If the current writer of register D is instruction X, set the state of D to A; otherwise leave the state of D unchanged.
• If X is mis-speculated: flush all speculated instructions.
Event: A commits
• Register F4: value RESULT_A, STATE A
Reorder Buffer:
TAG    DEST REG   STATUS   VALUE
RB_A   F4         YES      RESULT_A
RB_B   F8         NO       -          <- head, tail
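The commit rule for a non-speculative head instruction can be sketched as follows. The function name `commit_head` and the data layout are hypothetical, the RoB is modeled with a `deque` rather than a circular array, and flushing on mis-speculation is not modeled.

```python
from collections import deque


def commit_head(rob, reg_file, reg_state):
    """Commit the head entry if it has completed its write; return True on commit."""
    if not rob or not rob[0]["done"]:
        return False                  # head not ready: nothing may commit
    entry = rob.popleft()
    dest = entry["dest"]
    reg_file[dest] = entry["value"]   # update D with value v
    state, writer = reg_state[dest]
    if writer == entry["tag"]:
        reg_state[dest] = ("A", None) # X was the current writer of D
    # Otherwise a later instruction has claimed D: leave its state unchanged.
    return True


rob = deque([
    {"tag": "RB_A", "dest": "F4", "done": True, "value": 8.0},
    {"tag": "RB_B", "dest": "F8", "done": False, "value": None},
])
reg_file = {"F4": 0.0, "F8": 0.0}
reg_state = {"F4": ("R", "RB_A"), "F8": ("I", "RB_B")}

committed_a = commit_head(rob, reg_file, reg_state)   # A commits
committed_b = commit_head(rob, reg_file, reg_state)   # B has not written yet
```

The writer check matters when a later instruction also targets the same register: the older value is still written at commit, but the register must stay in state I or R until the younger writer commits.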
Load/Store Instructions
Load and Store Buffers: Reservation Stations for Load and Store instructions
C: LD F2, 0(R2)
• Register F2: STATE I, writer RB_C
• Load buffer entry RS_C: result tag RB_C, ADDR ea, OP LD
Reorder Buffer:
TAG    DEST REG   STATUS   VALUE
RB_C   F2         NO       -
• The memory address (ea) of the Load/Store is calculated during Issue and stored in the RS.
• The value read from memory is broadcast and stored in the Reorder Buffer until it is committed, as for ALU instructions.
Load/Store Instructions
D: SD 0(R2), F4
• Store buffer entry RS_D: result tag RB_D, ADDR ea, DATA v4, OP SD
Reorder Buffer:
TAG    DEST REG   STATUS   VALUE
RB_D   [SD]*      YES      v4
* The destination in this case is a memory address.
• The difference from regular Tomasulo is that the store must not begin writing to memory until it is ready to commit.
• This is signaled by the commit unit when the SD becomes the head instruction.
Note: Later LOADs to the same address would have to wait until the SD commits and writes to memory. To avoid these stalls, LOAD FORWARDING can be used to bypass memory and pass the result directly from the waiting SD to the later LD.
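Load forwarding can be sketched as a search of the uncommitted stores that precede the load. The function `load_with_forwarding` and its arguments are hypothetical; the sketch assumes every older store already has its address and data available, which real hardware cannot always assume.

```python
def load_with_forwarding(addr, older_stores, memory):
    """Return the value a load should see.

    older_stores: (address, value) pairs for uncommitted stores that precede
    the load, listed in program order (oldest first).
    """
    # The youngest matching store wins, so search from the most recent backwards.
    for store_addr, store_value in reversed(older_stores):
        if store_addr == addr:
            return store_value        # bypass memory
    return memory[addr]               # no pending store to this address


memory = {100: 1.0, 200: 2.0}
older_stores = [(100, 4.0), (100, 9.0)]   # two waiting stores to address 100

forwarded = load_with_forwarding(100, older_stores, memory)
from_memory = load_with_forwarding(200, older_stores, memory)
```

Memory itself is never modified here, which matches the rule that a store writes to memory only when it commits.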