Pipelining 3: Hazards/Forwarding/Prediction
pipeline stages
    fetch: instruction memory, most PC computation
    decode: reading register file
    execute: computation, condition code read/write
    memory: memory read/write
    writeback: writing register file, writing Stat register
notes:
    common case: fetch next instruction in next cycle (can't for conditional jump, return)
    read/write in same stage avoids reading wrong value: get value updated for prior instruction (not earlier/later)
    don't want to halt until everything else is done
Changelog
Changes made in this version not seen in first lecture:
    13 March 2018: correct PC update rearranging HCL example to check if condition codes NOT taken for correcting misprediction.
last time
    adding pipelining: divide into stages
        values that cross stages go into pipeline registers
        each stage: read from previous, write to next
    pipeline execution: instruction 1 in writeback, instruction 2 in memory, …, instruction 5 in fetch
    hazards: pipeline can't work "naturally"
        data: wrong value
        control: wrong instruction to fetch
    generic solution: stalling
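The "pipeline execution" picture can be sketched in Python, assuming an ideal five-stage pipeline with no hazards or stalls. The names `stage_of` and `occupancy` are hypothetical, introduced only for illustration, and cycles are counted from 0.

```python
# Sketch of ideal pipeline occupancy: one instruction enters fetch per cycle
# and every instruction advances one stage per cycle (no stalls assumed).
STAGES = ["fetch", "decode", "execute", "memory", "writeback"]


def stage_of(instr_index, cycle):
    """Stage occupied by instruction `instr_index` (0-based) during `cycle`.

    Returns None if the instruction has not been fetched yet or has
    already left the pipeline.
    """
    position = cycle - instr_index
    if 0 <= position < len(STAGES):
        return STAGES[position]
    return None


def occupancy(cycle, num_instrs):
    """Map 1-based instruction numbers to their stage during `cycle`."""
    result = {}
    for i in range(num_instrs):
        stage = stage_of(i, cycle)
        if stage is not None:
            result[i + 1] = stage
    return result


# In cycle 4, instruction 1 is in writeback and instruction 5 is in fetch.
snapshot = occupancy(cycle=4, num_instrs=5)
for number, stage in snapshot.items():
    print(f"instruction {number} in {stage}")
```

Hazards are exactly the cases where this simple "everything advances every cycle" model gives the wrong answer.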
stalling costs
    with only stalling:
        extra 3 cycles (total 4) for every ret
        extra 2 cycles (total 3) for conditional jmp
        up to 3 extra cycles for data dependencies
    can we do better?
        can't easily read memory early: might be written in previous instruction
        trick: guess and check
        trick: use values waiting to get to register file
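The stalling costs above can be turned into a rough cycle count. This is a minimal sketch: it charges one cycle per instruction plus the extra cycles listed, and ignores the cycles needed to fill the pipeline at the start. The function name and the instruction mix in the example are hypothetical, not measurements.

```python
# Extra cycles per instruction kind when hazards are handled by stalling only,
# taken from the list above.
EXTRA_CYCLES_RET = 3
EXTRA_CYCLES_COND_JUMP = 2


def cycles_with_stalling(num_instrs, num_ret, num_cond_jump, data_stall_cycles=0):
    """Estimate total cycles: one per instruction plus stall cycles.

    `data_stall_cycles` is the total number of cycles lost to data
    dependencies (up to 3 per dependent instruction).
    """
    return (num_instrs
            + EXTRA_CYCLES_RET * num_ret
            + EXTRA_CYCLES_COND_JUMP * num_cond_jump
            + data_stall_cycles)


# Hypothetical mix: 100 instructions, 5 rets, 20 conditional jumps,
# and 10 cycles lost to data dependencies.
total = cycles_with_stalling(100, num_ret=5, num_cond_jump=20, data_stall_cycles=10)
print(total, "cycles,", total / 100, "cycles per instruction")
```

With that made-up mix, 65 of the 165 cycles are stalls, which is the cost the two tricks try to reduce.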
revisiting data hazards
    stalling worked
    but very unsatisfying: wait 2 extra cycles to use anything?!
    observation: value ready before it would be needed
        (just not stored in a way that lets us get it)
motivation
[figure: simplified addq pipeline datapath (PC, instruction memory, register file, ADD) with fetch/decode, decode/execute and execute/writeback pipeline registers]

    // initially %r8 = 800,
    // %r9 = 900, etc.
    addq %r8, %r9
    addq %r9, %r8
    addq ...
    addq ...

    cycle  fetch  fetch/decode  decode/execute          execute/writeback
           PC     rA   rB       R[srcA] R[srcB] dstE    next R[dstE]  dstE
    0      0x0
    1      0x2    8    9
    2             9    8        800     900     9
    3                           900     800     8       1700          9
    4                                                   1700          8

    in cycle 3, R[srcA] is read as 900 but should be 1700
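The stale read in the table can be reproduced with a small Python simulation. It is a sketch under stated assumptions: the pipeline only executes `addq rA, rB`, the register file is written at the end of the cycle in which an instruction sits in the execute/writeback register, and there is no stalling or forwarding. The function names and the tuple layout of the pipeline registers are hypothetical.

```python
def run_sequential(program, regs):
    """Reference semantics: each addq rA, rB does R[rB] += R[rA] in order."""
    regs = dict(regs)
    for rA, rB in program:
        regs[rB] = regs[rA] + regs[rB]
    return regs


def run_pipelined_no_forwarding(program, regs):
    """Simplified addq pipeline with no stalling and no forwarding."""
    regs = dict(regs)
    fetch_decode = None       # (rA, rB)
    decode_execute = None     # (R[srcA], R[srcB], dstE)
    execute_writeback = None  # (next R[dstE], dstE)
    pc = 0
    while (pc < len(program) or fetch_decode is not None
           or decode_execute is not None or execute_writeback is not None):
        # execute: add the two values read during decode
        next_ew = None
        if decode_execute is not None:
            valA, valB, dstE = decode_execute
            next_ew = (valA + valB, dstE)
        # decode: read the register file as it is *before* this cycle's write
        next_de = None
        if fetch_decode is not None:
            rA, rB = fetch_decode
            next_de = (regs[rA], regs[rB], rB)
        # fetch: next instruction, if any
        next_fd = None
        if pc < len(program):
            next_fd = program[pc]
            pc += 1
        # end of cycle: writeback updates the register file
        if execute_writeback is not None:
            valE, dstE = execute_writeback
            regs[dstE] = valE
        fetch_decode, decode_execute, execute_writeback = next_fd, next_de, next_ew
    return regs


program = [(8, 9), (9, 8)]       # addq %r8, %r9 ; addq %r9, %r8
initial = {8: 800, 9: 900}
expected = run_sequential(program, initial)
actual = run_pipelined_no_forwarding(program, initial)
print("expected:", expected)
print("pipelined without forwarding:", actual)
```

The second addq reads %r9 while the first is still in execute, so it adds the old 900 instead of 1700.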
forwarding
[figure: addq pipeline datapath (PC, instruction memory, register file, ADD) with two MUXes added for forwarding]

    addq %r8, %r9     // (1)
    addq %r9, %r8     // (2)
    addq %r10, %r9    // (2b)

    labels in the figure:
        (1): R8=800; R9=900; new R9=1700
        (2): reg #s 9, 8; old R9=900, R8=800; R9=1700 (forwarded)
        (2b): R10=1000, R9=1700 (forwarded)
forwarding: MUX conditions
[figure: addq pipeline datapath with forwarding MUXes]

    addq %r8, %r9    // (1)
    addq %r9, %r8    // (2)

    d_valA = [
        condition : e_valE;
        1 : reg_outputA;
    ];

What could condition be?
    a. W_rA == reg_srcA
    b. W_dstE == reg_srcA
    c. e_dstE == reg_srcA
    d. d_rB == reg_srcA
    e. something else
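As a way to experiment with the question, here is a minimal Python sketch of the simplified addq pipeline with a forwarding MUX on both decode outputs. It assumes the condition compares the destination of the instruction currently in execute with the register being read (option c); the signal names follow the HCL above, while the function names and tuple layout are hypothetical. It only forwards from execute, so it does not cover a dependency on an instruction that is two ahead.

```python
def select_forwarded(reg_src, reg_output, e_dstE, e_valE):
    """MUX: use the value execute is computing if it targets reg_src."""
    if e_dstE is not None and e_dstE == reg_src:
        return e_valE
    return reg_output


def run_with_forwarding(program, regs):
    """Simplified addq pipeline; decode forwards from execute, no stalling."""
    regs = dict(regs)
    fetch_decode = None       # (rA, rB)
    decode_execute = None     # (valA, valB, dstE)
    execute_writeback = None  # (valE, dstE)
    pc = 0
    while (pc < len(program) or fetch_decode is not None
           or decode_execute is not None or execute_writeback is not None):
        # execute: compute e_valE for the instruction ahead of decode
        e_valE = e_dstE = None
        next_ew = None
        if decode_execute is not None:
            valA, valB, e_dstE = decode_execute
            e_valE = valA + valB
            next_ew = (e_valE, e_dstE)
        # decode: register file outputs pass through the forwarding MUXes
        next_de = None
        if fetch_decode is not None:
            rA, rB = fetch_decode
            d_valA = select_forwarded(rA, regs[rA], e_dstE, e_valE)
            d_valB = select_forwarded(rB, regs[rB], e_dstE, e_valE)
            next_de = (d_valA, d_valB, rB)
        # fetch: next instruction, if any
        next_fd = None
        if pc < len(program):
            next_fd = program[pc]
            pc += 1
        # end of cycle: writeback updates the register file
        if execute_writeback is not None:
            valE, dstE = execute_writeback
            regs[dstE] = valE
        fetch_decode, decode_execute, execute_writeback = next_fd, next_de, next_ew
    return regs


# (1) then (2): addq %r8, %r9 ; addq %r9, %r8
result_2 = run_with_forwarding([(8, 9), (9, 8)], {8: 800, 9: 900})
# (1) then (2b): addq %r8, %r9 ; addq %r10, %r9
result_2b = run_with_forwarding([(8, 9), (10, 9)], {8: 800, 9: 900, 10: 1000})
print(result_2, result_2b)
```

With the MUX in place, (2) adds the forwarded 1700 rather than the stale 900, and (2b) shows the same forwarding on the second operand.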