
Slides for Lecture 16, ENCM 501: Principles of Computer Architecture



  1. ENCM 501 W14 Slides for Lecture 16

     Slides for Lecture 16
     ENCM 501: Principles of Computer Architecture, Winter 2014 Term
     Steve Norman, PhD, PEng
     Electrical & Computer Engineering, Schulich School of Engineering, University of Calgary
     11 March, 2014

     slide 2/26: Previous Lecture
     ◮ context switches and effects on memory latency
     ◮ memory system summary
     ◮ introduction to ILP (instruction-level parallelism)
     ◮ review of simple pipelining

     slide 3/26: Today’s Lecture
     ◮ pipeline hazards
     ◮ solutions to pipeline hazards
     Related reading in Hennessy & Patterson: Sections C.2–C.3

     slide 4/26: A rough sketch of the 5-stage pipeline
     This sketch was presented at the end of the previous lecture:
     [Figure: pipeline sketch with stages IF, ID, EX, MEM, WB; blocks PC, add, I-mem, instr. decode, GPRs, ALU, D-mem; clocked pipeline registers IF/ID, ID/EX, EX/MEM, MEM/WB.]

     slide 5/26: Pipeline Hazards
     If a certain sequence of instructions prevents the usual throughput of one instruction per clock cycle in a simple pipeline, the situation is called a pipeline hazard.
     Hazards can be categorized into three main types: structural hazards, data hazards, and control hazards.

     slide 6/26: Structural hazards
     These occur when two instructions “want” to use the same physical resource at the same time, in incompatible ways.
     For example, if the simple 5-stage pipeline had a single memory unit, instead of split instruction and data memories, MEM of an LW or SW instruction would interfere with IF of a later instruction.
     Why is access to three GPRs by two different instructions, one in WB and a later one in ID, not a structural hazard?
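The single-memory example from the structural hazards slide can be made concrete with a small sketch. It is an illustration only: the function name `memory_port_conflicts`, the sample program, and the assumption of an ideal pipeline with no stalls (instruction i is in IF during cycle i and in MEM during cycle i + 3) are not from the slides.

```python
def memory_port_conflicts(program):
    """Find (mem_index, fetch_index) pairs that would contend for one memory.

    Assumes a 5-stage pipeline (IF ID EX MEM WB) with no stalls, so the
    instruction at index i is in MEM during the same clock cycle in which
    the instruction at index i + 3 is in IF.
    """
    conflicts = []
    for i, instr in enumerate(program):
        opcode = instr.split()[0]
        # Only loads and stores use the memory unit in their MEM stage.
        if opcode in ("LW", "SW") and i + 3 < len(program):
            conflicts.append((i, i + 3))
    return conflicts


# Hypothetical instruction sequence, chosen only to exercise the function.
program = [
    "LW  R8, 0(R4)",
    "ADD R9, R10, R11",
    "SUB R12, R13, R14",
    "OR  R15, R16, R17",
    "SW  R9, 4(R4)",
]
conflicts = memory_port_conflicts(program)
print(conflicts)
```

With split instruction and data memories, each listed pair would use a different memory and no stall would be needed.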

  2. slide 7/26: Structural hazards: solutions
     The best solution is to design hardware to avoid structural hazards wherever possible. For example:
     ◮ in the simple, 5-stage pipeline, use separate instruction and data memories;
     ◮ in real pipelines, have separate I-TLBs and D-TLBs, and separate L1 I-caches and D-caches.
     For complex pipelines, it may be practically impossible to avoid all structural hazards, so stalls may be required: if two instructions are contending for a resource, one or the other will be delayed one or more clock cycles.

     slide 8/26: Data hazards
     (We’ll use MIPS32 instructions as examples, because instructions like ADD and SUB are easier to deal with than DADD and DSUB.)
     The most common kind of data hazard is called a RAW hazard: RAW stands for Read-After-Write.
         ADD R8, R9, R10
         SUB R11, R12, R8
     For correct processing, SUB must work as if R8 is read by SUB after R8 is written by ADD. (This is where the term RAW comes from.)
     Let’s draw a “pipeline diagram” to get a precise understanding of the problem.

     slide 9/26: More examples of RAW hazards
     For the simple 5-stage pipeline, let’s find all the RAW hazards in this sequence . . .
         LW R8, 0(R4)
         AND R9, R8, R5
         OR R10, R6, R8
         SLT R11, R8, R7
     Remark: The deeper a pipeline is (the more stages it has), the greater will be the number and complexity of potential RAW hazards.

     slide 10/26: Forwarding
     Forwarding is the name given to a technique that can often solve RAW data hazards without loss of clock cycles to stalls. (Another name for forwarding is bypassing.)
     The essential idea is that if Instruction B depends on the result of Instruction A, Instruction B should not wait for Instruction A to write that result to its destination, but instead grab that result as soon as it is available.
     Let’s look at how forwarding helps with this sequence . . .
         LW R8, 0(R4)
         AND R9, R8, R5
         OR R10, R6, R8
         SLT R11, R8, R7

     slide 11/26: Sketch of forwarding hardware for 5-stage MIPS32
     Here is an incomplete schematic for the EX stage . . .
     [Figure: EX-stage schematic. A “forward control” block drives 2-bit outputs FwdA and FwdB, which select among inputs 00, 01, 10 of the multiplexers feeding ALU inputs A and B. Sources shown: GPR values from the ID/EX pipeline register, the ALU result from the EX/MEM register, and the LW or ALU result from the MEM/WB register. A further 0/1 multiplexer selects the LW/SW offset; a separate path carries data for SW.]

     slide 12/26
     Q1: What should the values of the “forward control” outputs be in the case where no forwarding is needed?
     Consider this sequence:
         LW R8, 0(R4)
         AND R9, R10, R11
         SUB R12, R8, R9
     Q2: What should the values of the “forward control” outputs be when SUB is in the EX stage?
     Q3: What are the inputs to “forward control” and how does the forwarding logic work? (We’ll give an example or two, not completely specify the logic!)
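One possible way to think about the “forward control” decision for the LW / AND / SUB sequence is sketched below. This is a simplified model, not the logic from the lecture: the names `InFlight` and `forward_select` are hypothetical, the selections are given as symbolic source names rather than the 2-bit FwdA/FwdB codes (whose encoding is not specified here), and the load-use case is assumed to be handled by separate stall logic.

```python
from typing import NamedTuple, Optional


class InFlight(NamedTuple):
    """What a pipeline register records about the instruction it holds."""
    op: str              # opcode, e.g. "LW" or "AND"
    dest: Optional[int]  # destination GPR number, or None (e.g. for SW)


def forward_select(src_reg, ex_mem, mem_wb):
    """Choose the source for one ALU input of the instruction in EX.

    Returns "GPR" (value read in ID), "EX/MEM", or "MEM/WB".
    The EX/MEM register is checked first because it holds the result of
    the more recent instruction.
    """
    if src_reg == 0:
        return "GPR"  # R0 is hardwired to zero in MIPS, never forwarded
    if ex_mem is not None and ex_mem.dest == src_reg:
        if ex_mem.op == "LW":
            # The loaded data is not available yet; a stall is needed.
            raise ValueError("load-use hazard: not handled in this sketch")
        return "EX/MEM"
    if mem_wb is not None and mem_wb.dest == src_reg:
        return "MEM/WB"
    return "GPR"


# SUB R12, R8, R9 is in EX; AND (dest R9) is in MEM; LW (dest R8) is in WB.
and_in_mem = InFlight("AND", 9)
lw_in_wb = InFlight("LW", 8)
fwd_a = forward_select(8, and_in_mem, lw_in_wb)
fwd_b = forward_select(9, and_in_mem, lw_in_wb)
print(fwd_a, fwd_b)
```

Under these assumptions, input A of SUB would come from the MEM/WB register (the loaded R8) and input B from the EX/MEM register (the AND result for R9).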

  3. slide 13/26: Can forwarding solve all RAW hazards?
     Consider this sequence:
         LW R15, 0(R14)
         ADD R16, R17, R15
     Is it possible to solve the hazard by forwarding? If not, what is the most time-efficient way to solve the hazard?
     Let’s make some general remarks about optimal solutions of RAW data hazards.

     slide 14/26: Control hazards: Introduction
     In a simple pipeline, a control hazard is a difficulty in determining the address to use for the next Instruction Fetch.
     Look at this example, and assume a version of MIPS32 in which the delay slot instruction is not supposed to be completed if the branch is taken:
         L1: LW R9, 0(R5)
             instructions in loop body
             BEQ R8, R0, L1
             OR R16, R10, R0
     In the clock cycle after IF for the BEQ instruction, why is doing IF difficult? (There is more than one reason.)

     slide 15/26: Control hazards: Not just for conditional branches!
     In a conditional branch, there is an obvious motivation to wait for the decision about whether or not to take the branch. But consider the following unconditional updates to the PC:
     ◮ jump within a procedure;
     ◮ procedure call;
     ◮ procedure return.
     Why do these kinds of instructions generate control hazards? How many cycles might be lost due to such a hazard in a 5-stage pipeline like the one we’ve been looking at?

     slide 16/26: “Old school” solutions to control hazards (1)
     Stall as long as necessary to ensure that instruction results are correct. This obviously makes CPI worse (higher) if programs have lots of conditional branches and unconditional jumps.

     slide 17/26: “Old school” solutions to control hazards (2)
     Delayed jumps and branches. Because it is very difficult to do IF properly in the cycle immediately following a jump or a taken branch, many ISA designs decreed that the successor to a jump or branch would always be completed before the jump or branch target instruction . . .
              BEQ R12, R0, L99
              ADD R13, R14, R15   # successor
              more instructions
         L99: SUB R8, R9, R10     # branch target
              OR R16, R8, R0
     Real MIPS ISAs (as opposed to some hypothetical MIPS-like ISAs in textbooks and lecture slides) have delayed branches and jumps.

     slide 18/26: Dynamic branch prediction
     Dynamic branch prediction is the most important current technology for management of control hazards.
     A branch prediction circuit is a memory array comparable in size to an L1 I-cache, and somewhat more complex.
     A branch prediction circuit records the locations of thousands of recently-encountered branches and jumps, along with the addresses of their targets.
     For each conditional branch, a branch prediction circuit maintains a few bits of information that can be used to predict whether the branch will be taken or untaken.
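To illustrate how “a few bits of information” per branch might drive a prediction, here is a sketch of a 2-bit saturating counter, a common textbook scheme. The slides do not say which scheme is meant, so this choice is an assumption, as are the class name `TwoBitPredictor`, the table size, the indexing by word-aligned PC, and the example branch address. Only the taken/untaken bits are modelled; recording of target addresses is left out.

```python
class TwoBitPredictor:
    """Table of 2-bit saturating counters, one per table entry.

    Counter values 0 and 1 predict untaken; 2 and 3 predict taken.
    """

    def __init__(self, entries=1024):
        self.entries = entries
        self.counters = [1] * entries  # start at "weakly untaken"

    def _index(self, pc):
        # MIPS32 instructions are word-aligned, so drop the low 2 bits.
        return (pc >> 2) % self.entries

    def predict(self, pc):
        return self.counters[self._index(pc)] >= 2

    def update(self, pc, taken):
        i = self._index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


# Hypothetical loop-closing branch: taken 9 times, then falls through once.
predictor = TwoBitPredictor()
branch_pc = 0x00400100
outcomes = [True] * 9 + [False]
mispredictions = 0
for taken in outcomes:
    if predictor.predict(branch_pc) != taken:
        mispredictions += 1
    predictor.update(branch_pc, taken)
print(mispredictions)
```

In this run the predictor is wrong on the first iteration, while the counter warms up, and on the final loop exit; the other eight outcomes are predicted correctly.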
