
Precise Exceptions and Out-of-Order Execution

3/16/17 Precise Exceptions and Out-of-Order Execution. Multi-Cycle Execution: Not all instructions take the same amount of time for execution. Idea: Have multiple different functional units that take different numbers of cycles.


  1. 3/16/17
     Precise Exceptions and Out-of-Order Execution
     Samira Khan

     Multi-Cycle Execution
     • Not all instructions take the same amount of time for “execution”
     • Idea: Have multiple different functional units that take different numbers of cycles
       • Can be pipelined or not pipelined
       • Can let independent instructions start execution on a different functional unit before a previous long-latency instruction finishes execution

     The Von Neumann Model/Architecture
     • Also called stored program computer (instructions in memory). Two key properties:
     • Stored program
       • Instructions stored in a linear memory array
       • Memory is unified between instructions and data
       • The interpretation of a stored value depends on the control signals
     • Sequential instruction processing
       • One instruction processed (fetched, executed, and completed) at a time
       • Program counter (instruction pointer) identifies the current instr.
       • Program counter is advanced sequentially except for control transfer instructions

     ISSUES IN PIPELINING: MULTI-CYCLE EXECUTE
     • Instructions can take different numbers of cycles in the EXECUTE stage
       • Integer ADD versus FP Multiply

       FMUL R4 ← R1, R2   F D E E E E E E E E W
       ADD  R3 ← R1, R2     F D E W
                              F D E W
                                F D E W
       FMUL R2 ← R5, R6           F D E E E E E E E E W
       ADD  R4 ← R5, R6             F D E W
                                      F D E W

     • What is wrong with this picture?
       • What if FMUL incurs an exception?
       • Sequential semantics of the ISA NOT preserved!
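The out-of-order writeback in the multi-cycle diagram can be reproduced with a few lines of arithmetic. The sketch below is a minimal illustration, not part of the lecture: it assumes one instruction is fetched per cycle with no stalls, that FMUL spends 8 cycles in EXECUTE and ADD spends 1 (as drawn on the slide), and that the unlabeled rows behave like ADD. The names `EXEC_CYCLES` and `writeback_cycle` are made up for this example.

```python
# Assumed latencies, read off the slide's diagram: FMUL has 8 E cycles, ADD has 1.
EXEC_CYCLES = {"FMUL": 8, "ADD": 1}

def writeback_cycle(index, opcode):
    """Cycle in which instruction `index` reaches W.

    Assumes instruction i is fetched in cycle i (counting from 0), then
    spends one cycle in D, EXEC_CYCLES[opcode] cycles in E, and one in W.
    """
    decode = index + 1
    last_execute = decode + EXEC_CYCLES[opcode]
    return last_execute + 1

# Program order from the slide; unlabeled rows are assumed to be ADD-like.
program = ["FMUL", "ADD", "ADD", "ADD", "FMUL", "ADD", "ADD"]
wb = [writeback_cycle(i, op) for i, op in enumerate(program)]

# Instruction indices sorted by the cycle in which they update state.
completion_order = sorted(range(len(program)), key=lambda i: wb[i])

print(wb)                # writeback cycle per instruction, in program order
print(completion_order)  # program-order indices, in writeback order
```

Under these assumptions the first FMUL is only the sixth instruction to write back, so if it raised an exception, five younger instructions would already have updated the register file.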

  2. HANDLING EXCEPTIONS IN PIPELINING
     • Exceptions versus interrupts
     • Cause
       • Exceptions: internal to the running thread
       • Interrupts: external to the running thread
     • When to Handle
       • Exceptions: when detected (and known to be non-speculative)
       • Interrupts: when convenient
         • Except for very high priority ones
           • Power failure
           • Machine check
     • Priority: process (exception), depends (interrupt)
     • Handling Context: process (exception), system (interrupt)

     PRECISE EXCEPTIONS/INTERRUPTS
     • The architectural state should be consistent when the exception/interrupt is ready to be handled
       1. All previous instructions should be completely retired.
       2. No later instruction should be retired.
     Retire = commit = finish execution and update arch. state

     WHY DO WE WANT PRECISE EXCEPTIONS?
     • Aid software debugging
     • Enable (easy) recovery from exceptions, e.g. page faults
     • Enable (easily) restartable processes

     ENSURING PRECISE EXCEPTIONS IN PIPELINING
     • Idea: Make each operation take the same amount of time

       FMUL R3 ← R1, R2   F D E E E E E E E E W
       ADD  R4 ← R1, R2     F D E E E E E E E E W
                              F D E E E E E E E E W
                                F D E E E E E E E E W
                                  F D E E E E E E E E W
                                    F D E E E E E E E E W
                                      F D E E E E E E E E W

     • Downside
       • What about memory operations?
       • Each functional unit takes 500 cycles?
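The two retirement conditions for a precise exception can be written as a simple check over which instructions have updated architectural state. This is a hedged sketch: the `retired` list and the function name `is_precise` are hypothetical, and the state of the faulting instruction itself is deliberately ignored because the slide's two conditions only speak about older and younger instructions.

```python
def is_precise(retired, faulting_index):
    """Check the two precise-exception conditions.

    `retired[i]` is True if instruction i (in program order) has updated
    architectural state at the moment the exception is handled.
    """
    # Condition 1: all previous instructions are completely retired.
    older_done = all(retired[:faulting_index])
    # Condition 2: no later instruction is retired.
    younger_untouched = not any(retired[faulting_index + 1:])
    return older_done and younger_untouched

# FMUL (index 0) faults after the younger ADD (index 1) already wrote back:
imprecise_case = is_precise([False, True, False], faulting_index=0)

# Instruction 2 faults; 0 and 1 retired, 3 and 4 have not touched state:
precise_case = is_precise([True, True, False, False, False], faulting_index=2)

print(imprecise_case, precise_case)
```

The first case corresponds to the multi-cycle pipeline without any reordering support; the second is the state a precise-exception mechanism has to guarantee.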

  3. SOLUTION: REORDER BUFFER (ROB)
     • Idea: Complete instructions out-of-order, but reorder them before making results visible to architectural state
     • When instruction is decoded it reserves an entry in the ROB
     • When instruction completes, it writes result into ROB entry
     • When instruction is oldest in ROB and it has completed, its result is moved to reg. file or memory
     [Figure: Instruction Cache, Register File, three Func Units, Reorder Buffer. ROB entry fields: V, DEST REG, DEST VAL, COMPLETE; entries ordered from Oldest to Youngest.]

     REORDER BUFFER: INDEPENDENT OPERATIONS (CYCLE 5)

                V   DEST REG   DEST VAL   COMPLETE
       FMUL     1   R4         --         0          (Oldest)
       ADD      1   R3         1000       1
                1                         0
                1                         0
       FMUL     1   R2         --         0          (Youngest)
       ADD      1   R4         --         0          (row filled in only in the second Cycle 5 slide)

     [Pipeline diagram, cycles 0 to 11: FMUL rows are F D E E E E E E E E R W, all other rows are F D E R W. Labeled rows: FMUL R2 ← R5, R6 and ADD R4 ← R5, R6.]
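The three ROB rules (reserve an entry at decode, write the result into the entry at completion, move the result to the register file once the entry is oldest and complete) can be sketched as a small queue. This is a minimal illustration under simplifying assumptions, not the design from the slides: only register destinations are modeled, the buffer is unbounded, and the class and method names are hypothetical.

```python
from collections import deque

class ReorderBuffer:
    """Toy reorder buffer: oldest entry sits at the left end of the deque."""

    def __init__(self):
        self.entries = deque()
        self.regfile = {}          # architectural register file

    def allocate(self, dest_reg):
        """Called at decode: reserve an entry in program order."""
        entry = {"dest": dest_reg, "value": None, "complete": False}
        self.entries.append(entry)
        return entry

    def complete(self, entry, value):
        """Called when a functional unit finishes: result goes into the ROB."""
        entry["value"] = value
        entry["complete"] = True

    def retire(self):
        """Move results to the register file, oldest first, while possible."""
        retired = []
        while self.entries and self.entries[0]["complete"]:
            entry = self.entries.popleft()
            self.regfile[entry["dest"]] = entry["value"]
            retired.append(entry["dest"])
        return retired

rob = ReorderBuffer()
fmul = rob.allocate("R4")          # older, long-latency instruction
add = rob.allocate("R3")           # younger, short-latency instruction

rob.complete(add, 1000)            # ADD finishes first ...
early = rob.retire()               # ... but cannot retire past the FMUL

rob.complete(fmul, 101)
late = rob.retire()                # now both retire, in program order
print(early, late, rob.regfile)
```

The values 1000 and 101 mirror the DEST VAL column in the slide's ROB snapshots; everything else is illustrative.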

  4. REORDER BUFFER: INDEPENDENT OPERATIONS (CYCLE 11, CYCLE 12, RETIRE OLDEST)
     State of the oldest entry (FMUL) across the four snapshots:

       Snapshot                      V   DEST REG   DEST VAL   COMPLETE
       Cycle 11                      1   R4         101        0
       Cycle 12                      1   R4         101        1
       Cycle 12, retire oldest       0   R4         101        1
       Cycle 12, after retirement    0

     The remaining entries are the same in all four snapshots:

                V   DEST REG   DEST VAL   COMPLETE
       ADD      1   R3         1000       1
                1                         0
                1                         0
       FMUL     1   R2         --         0          (Youngest)
       ADD      1   R4         --         0

     [Pipeline diagram, cycles 0 to 11: FMUL rows are F D E E E E E E E E R W, all other rows are F D E R W. Labeled rows: FMUL R2 ← R5, R6 and ADD R4 ← R5, R6.]

     • What if a later operation needs a value in the reorder buffer?
     • Read reorder buffer in parallel with the register file. How?
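To see how in-order retirement changes timing, the sketch below computes for each instruction the cycle in which it completes (R) and the cycle in which it retires (W). It is an illustration under stated assumptions, not the slides' exact timing: one fetch per cycle, no structural stalls, FMUL with 8 execute cycles and ADD with 1, at most one retirement per cycle, and cycles counted from 0. The instruction mix and all names are hypothetical.

```python
EXEC_CYCLES = {"FMUL": 8, "ADD": 1}   # assumed E-stage latencies

def schedule(program):
    """Return (complete_cycles, retire_cycles) for a list of opcodes."""
    complete, retire = [], []
    for i, op in enumerate(program):
        # F in cycle i, D in cycle i + 1, then the E cycles, then R.
        r = i + 1 + EXEC_CYCLES[op] + 1
        # W needs the result in the ROB and every older instruction retired.
        earliest_w = r + 1
        w = earliest_w if not retire else max(earliest_w, retire[-1] + 1)
        complete.append(r)
        retire.append(w)
    return complete, retire

program = ["FMUL", "ADD", "ADD", "ADD", "FMUL", "ADD", "ADD"]
complete, retire = schedule(program)

for op, r, w in zip(program, complete, retire):
    print(f"{op:5s} completes in cycle {r:2d}, retires in cycle {w:2d}")
```

Completion cycles are out of program order, while retirement cycles increase monotonically: the short ADDs wait in the ROB until the older FMUL has retired.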

  5. REORDER BUFFER: HOW TO ACCESS?
     • A register value can be in the register file, reorder buffer, (or bypass paths)
     [Figure: Instruction Cache, Register File, three Func Units, Reorder Buffer implemented as a Content Addressable Memory (searched with register ID), bypass path.]

     Search for Register Value

       Register file             Reorder buffer
       REG   VAL   V                      V   DEST REG   DEST VAL   COMPLETE
       R1    1     1                      0                                    (Oldest)
       R2          0             ADD      1   R3         1000       1
       R3          0                      1                         0
       R4          0                      1                         0
       R5    5     1                      1   R2         --         0          (Youngest)
       R6    6     1             ADD      1   R4         --         0
       R7    8     1
       R8    8     1
       R9    9     1
       R10   10    1
       R11   11    0

     SIMPLIFYING REORDER BUFFER ACCESS
     • Idea: Use indirection
     • Access register file first
       • If register not valid, register file stores the ID of the reorder buffer entry that contains (or will contain) the value of the register
       • Mapping of the register to a ROB entry
     • Access reorder buffer next
     • What is in a reorder buffer entry?
       V | DestRegID | DestRegVal | StoreAddr | StoreData | BranchTarget | PC/IP | Control/valid bits
       • Can it be simplified further?

     Search for Register Value

       Register file             Reorder buffer
       REG   VAL/TAG   V                  V   DEST REG   DEST VAL   COMPLETE
       R1    1         1                  0                                    (Oldest)
       R2    5         0         ADD      1   R3         1000       1
       R3    2         0                  1                         0
       R4    6         0                  1                         0
       R5    5         1                  1   R2         --         0          (Youngest)
       R6    6         1         ADD      1   R4         --         0
       R7    8         1
       R8    8         1
       R9    9         1
       R10   10        1
       R11   11        1
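The two access schemes can be contrasted in a short sketch: a CAM-style search compares the register ID against every ROB entry, while indirection reads the register file first and follows a tag only when the valid bit is clear. This is a hedged illustration, not the lecture's implementation. The entry IDs 2, 5 and 6 follow the TAG column as read from the slide (assuming ROB entries are numbered from 1), larger IDs are assumed to be younger with no wrap-around, and all function and variable names are hypothetical.

```python
# ROB entries keyed by entry ID (assumed: larger ID = younger, no wrap-around).
rob = {
    2: {"dest": "R3", "value": 1000, "complete": True},
    5: {"dest": "R2", "value": None, "complete": False},
    6: {"dest": "R4", "value": None, "complete": False},
}

def read_with_cam(reg, regfile, rob):
    """CAM-style lookup: match `reg` against every entry's destination.

    The youngest matching entry wins; with no match the register file is
    used. Returns None if the matching entry has not completed yet.
    """
    for entry_id in sorted(rob, reverse=True):
        entry = rob[entry_id]
        if entry["dest"] == reg:
            return entry["value"] if entry["complete"] else None
    return regfile[reg]

def read_with_tag(reg, tagged_regfile, rob):
    """Indirection: each register slot is (valid, value_or_rob_entry_id)."""
    valid, payload = tagged_regfile[reg]
    if valid:
        return payload                 # payload is the register value
    entry = rob[payload]               # payload is a ROB entry ID
    return entry["value"] if entry["complete"] else None

regfile = {"R1": 1, "R5": 5, "R6": 6}
tagged_regfile = {
    "R1": (True, 1), "R5": (True, 5), "R6": (True, 6),
    "R2": (False, 5), "R3": (False, 2), "R4": (False, 6),
}

print(read_with_cam("R3", regfile, rob), read_with_tag("R3", tagged_regfile, rob))
print(read_with_cam("R5", regfile, rob), read_with_tag("R5", tagged_regfile, rob))
print(read_with_cam("R2", regfile, rob), read_with_tag("R2", tagged_regfile, rob))
```

Both functions return the same answers here; the difference is that the tag version touches exactly one ROB entry, whereas the CAM version has to compare against all of them.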

  6. Reorder Buffer in Intel Pentium III
     [Figure from Boggs et al., “The Microarchitecture of the Pentium 4 Processor,” Intel Technology Journal, 2001.]

     REORDER BUFFER PROS AND CONS
     • Pro
       • Conceptually simple for supporting precise exceptions
     • Con
       • Reorder buffer needs to be accessed to get the results that are yet to be written to the register file
       • CAM or indirection → increased latency and complexity

     In-Order Pipeline with Reorder Buffer
     • Decode (D): Access regfile/ROB, allocate entry in ROB, check if instruction can execute, if so dispatch instruction
     • Execute (E): Instructions can complete out-of-order
     • Completion (R): Write result to reorder buffer
     • Retirement/Commit (W): Check for exceptions; if none, write result to architectural register file or memory; else, flush pipeline and start from exception handler
     • In-order dispatch/execution, out-of-order completion, in-order retirement
     [Pipeline figure: F, D, then parallel execute units: Integer add (E), Integer mul (E E E E), FP mul (E E E E E E E E), Load/store (E . . .), then R, W.]

     Out-of-Order Execution (Dynamic Instruction Scheduling)
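The Retirement/Commit step described above (check for exceptions, then either update architectural state or flush) can be sketched as follows. This is a simplified illustration under assumptions: only register destinations are modeled, the flush simply empties the ROB, and the field names, PC values, and the function `retire` are hypothetical.

```python
def retire(rob, regfile):
    """Retire from the oldest end of `rob` (a list, oldest entry first).

    Returns the PC of a faulting instruction if one reaches the head of the
    ROB, otherwise None. Mutates `rob` and `regfile` in place.
    """
    while rob and rob[0]["complete"]:
        head = rob[0]
        if head["exception"]:
            faulting_pc = head["pc"]
            rob.clear()        # flush: younger results never become visible
            return faulting_pc
        regfile[head["dest"]] = head["value"]
        rob.pop(0)
    return None

regfile = {"R3": 0, "R4": 0}
rob = [
    # Older FMUL finished with an exception (PC values are made up).
    {"pc": 0x100, "dest": "R4", "value": None, "complete": True, "exception": True},
    # Younger ADD finished earlier and holds its result in the ROB.
    {"pc": 0x104, "dest": "R3", "value": 1000, "complete": True, "exception": False},
]

faulting_pc = retire(rob, regfile)
print(hex(faulting_pc), regfile, rob)
```

Because the ADD's result lived only in the ROB, the flush discards it and the register file still reflects the state before the FMUL, which is what makes the exception precise.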
