CS252 Graduate Computer Architecture, Lecture 15: Instruction Level Parallelism and Dynamic Execution

  1. CS252 Graduate Computer Architecture
     Lecture 15: Instruction Level Parallelism and Dynamic Execution
     March 11, 2002
     Prof. David E. Culler
     Computer Science 252, Spring 2002
     CS252/Culler, 3/12/02, Lec 15.1

     Recall from Pipelining Review
     • Pipeline CPI = Ideal pipeline CPI + Structural Stalls + Data Hazard Stalls + Control Stalls
       - Ideal pipeline CPI: measure of the maximum performance attainable by the implementation
       - Structural hazards: HW cannot support this combination of instructions
       - Data hazards: instruction depends on result of prior instruction still in the pipeline
       - Control hazards: caused by delay between the fetching of instructions and decisions about changes in control flow (branches and jumps)

     Recall Data Hazard Resolution: In-order issue, in-order completion
     [Figure: pipeline timing diagram, time in clock cycles across and instructions in program order down; stages Ifetch, Reg, ALU, DMem, Reg, with bubbles shown behind the load.]
         lw  r1, 0(r2)
         sub r4,r1,r6
         and r6,r2,r7
         or  r8,r2,r9
     Extend to multiple instruction issue? What if load had longer delay? Can "and" issue?

     In-Order Issue, Out-of-order Completion
     • Which hazards are present? RAW? WAR? WAW?
         load r3 <- r1, r2
         add  r1 <- r5, r2
         sub  r3 <- r3, r1   or   r3 <- r2, r1
     • Register Reservations
       - when issue, mark destination register busy till complete
       - check all register reservations before issue

     Ideas to Reduce Stalls
     Technique: Reduces
     Chapter 3
     • Dynamic scheduling: Data hazard stalls
     • Dynamic branch prediction: Control stalls
     • Issuing multiple instructions per cycle: Ideal CPI
     • Speculation: Data and control stalls
     • Dynamic memory disambiguation: Data hazard stalls involving memory
     Chapter 4
     • Loop unrolling: Control hazard stalls
     • Basic compiler pipeline scheduling: Data hazard stalls
     • Compiler dependence analysis: Ideal CPI and data hazard stalls
     • Software pipelining and trace scheduling: Ideal CPI and data hazard stalls
     • Compiler speculation: Ideal CPI, data and control stalls

     Instruction-Level Parallelism (ILP)
     • Basic Block (BB) ILP is quite small
       - BB: a straight-line code sequence with no branches in except to the entry and no branches out except at the exit
       - average dynamic branch frequency 15% to 25% => 4 to 7 instructions execute between a pair of branches
       - Plus instructions in BB likely to depend on each other
     • To obtain substantial performance enhancements, we must exploit ILP across multiple basic blocks
     • Simplest: loop-level parallelism to exploit parallelism among iterations of a loop
       - Vector is one way
       - If not vector, then either dynamic via branch prediction or static via loop unrolling by compiler
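The CPI decomposition and the basic-block estimate from this page can be checked with a few lines of arithmetic. The following is a minimal Python sketch, not part of the lecture: the function names are hypothetical, the stall values in the demo are made-up illustration numbers, and `avg_block_length` assumes branches are spread evenly so that the run between branches is about 1 / branch frequency.

```python
def pipeline_cpi(ideal_cpi, structural_stalls, data_hazard_stalls, control_stalls):
    """Pipeline CPI = ideal CPI plus the three stall terms (all in cycles per instruction)."""
    return ideal_cpi + structural_stalls + data_hazard_stalls + control_stalls


def avg_block_length(branch_frequency):
    """Rough number of instructions between branches, assuming evenly spread branches."""
    return 1.0 / branch_frequency


if __name__ == "__main__":
    # Illustrative stall contributions only; these are assumptions, not measured data.
    print("CPI:", pipeline_cpi(1.0, 0.05, 0.30, 0.15))

    # Branch frequencies quoted on the slide: 15% to 25%.
    for freq in (0.25, 0.15):
        print(f"branch frequency {freq:.0%}: about {avg_block_length(freq):.1f} instructions per run")
```

With the slide's 15% and 25% branch frequencies this reproduces the "4 to 7 instructions" range, which is why a single basic block offers little parallelism on its own.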

  2. Data Dependence and Hazards
     • InstrJ is data dependent on InstrI: InstrJ tries to read operand before InstrI writes it
         I: add r1,r2,r3
         J: sub r4,r1,r3
     • or InstrJ is data dependent on InstrK, which is dependent on InstrI
     • Caused by a "True Dependence" (compiler term)
     • If true dependence caused a hazard in the pipeline, called a Read After Write (RAW) hazard

     Data Dependence and Hazards
     • Dependences are a property of programs
     • Presence of dependence indicates potential for a hazard, but actual hazard and length of any stall is a property of the pipeline
     • Importance of the data dependencies
       1) indicates the possibility of a hazard
       2) determines order in which results must be calculated
       3) sets an upper bound on how much parallelism can possibly be exploited
     • Today looking at HW schemes to avoid hazard

     Name Dependence #1: Anti-dependence
     • Name dependence: when 2 instructions use same register or memory location, called a name, but no flow of data between the instructions associated with that name; 2 versions of name dependence
     • InstrJ writes operand before InstrI reads it
         I: sub r4,r1,r3
         J: add r1,r2,r3
         K: mul r6,r1,r7
       Called an "anti-dependence" in compiler work. This results from reuse of the name "r1"
     • If anti-dependence caused a hazard in the pipeline, called a Write After Read (WAR) hazard

     Name Dependence #2: Output dependence
     • InstrJ writes operand before InstrI writes it
         I: sub r1,r4,r3
         J: add r1,r2,r3
         K: mul r6,r1,r7
     • Called an "output dependence" by compiler writers. This also results from the reuse of name "r1"
     • If output dependence caused a hazard in the pipeline, called a Write After Write (WAW) hazard

     ILP and Data Hazards
     • program order: order instructions would execute in if executed sequentially 1 at a time as determined by original source program
     • HW/SW goal: exploit parallelism by preserving appearance of program order
       - modify order in a manner that cannot be observed by program
       - must not affect the outcome of the program
     • Ex: Instructions involved in a name dependence can execute simultaneously if name used in instructions is changed so instructions do not conflict
       - Register renaming resolves name dependence for regs
       - Either by compiler or by HW
           add r1, r2, r3
           sub r2, r4, r5
           and r3, r2, 1

     Control Dependencies
     • Every instruction is control dependent on some set of branches, and, in general, these control dependencies must be preserved to preserve program order
         if p1 {
           S1;
         };
         if p2 {
           S2;
         }
     • S1 is control dependent on p1, and S2 is control dependent on p2 but not on p1.
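The three dependence kinds on this page (true, anti, output) can be told apart by comparing the destination and source registers of two instructions. Below is a small Python sketch of that check using the slide examples. The `Instr` tuple and the `dependences` function are hypothetical helpers introduced only for illustration, and the renamed register `p2` is an assumed fresh name, not something from the lecture.

```python
from typing import NamedTuple, Set, Tuple


class Instr(NamedTuple):
    op: str
    dest: str
    srcs: Tuple[str, ...]


def dependences(first: Instr, second: Instr) -> Set[str]:
    """Classify the hazards `second` may have with `first` (first is earlier in program order)."""
    found = set()
    if first.dest in second.srcs:
        found.add("RAW")  # true dependence: second reads what first writes
    if second.dest in first.srcs:
        found.add("WAR")  # anti-dependence: second overwrites what first reads
    if second.dest == first.dest:
        found.add("WAW")  # output dependence: both write the same name
    return found


if __name__ == "__main__":
    # True dependence example from the slides.
    print(dependences(Instr("add", "r1", ("r2", "r3")), Instr("sub", "r4", ("r1", "r3"))))
    # Anti-dependence example.
    print(dependences(Instr("sub", "r4", ("r1", "r3")), Instr("add", "r1", ("r2", "r3"))))
    # Output dependence example.
    print(dependences(Instr("sub", "r1", ("r4", "r3")), Instr("add", "r1", ("r2", "r3"))))

    # Renaming: "sub r2,r4,r5" conflicts with "add r1,r2,r3" only through the name r2.
    add = Instr("add", "r1", ("r2", "r3"))
    print(dependences(add, Instr("sub", "r2", ("r4", "r5"))))  # name dependence on r2
    print(dependences(add, Instr("sub", "p2", ("r4", "r5"))))  # p2 is an assumed fresh register
```

Writing the `sub` result to a fresh register removes the conflict without changing any data flow, which is the point of register renaming: only the RAW case represents a real value passing between instructions.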

  3. Control Dependence Ignored
     • Control dependence need not always be preserved
       - willing to execute instructions that should not have been executed, thereby violating the control dependences, if can do so without affecting correctness of the program
     • Instead, 2 properties critical to program correctness are exception behavior and data flow

     Exception Behavior
     • Preserving exception behavior => any changes in instruction execution order must not change how exceptions are raised in program (=> no new exceptions)
     • Example:
             DADDU R2,R3,R4
             BEQZ  R2,L1
             LW    R1,0(R2)
         L1:
     • Problem with moving LW before BEQZ?

     CS 252 Administrivia
     • Final Project Proposals due 3/17
       - send URL to page containing
         » title & participants
         » problem statement
         » annotated bibliography
       - we'll monitor progress through the pages
     • Assignment 3 out, due in 3/19
     • Quiz 3/21

     Data Flow
     • Data flow: actual flow of data values among instructions that produce results and those that consume them
       - branches make flow dynamic, determine which instruction is supplier of data
     • Example:
             DADDU R1,R2,R3
             BEQZ  R4,L
             DSUBU R1,R5,R6
         L:  …
             OR    R7,R1,R8
     • OR depends on DADDU or DSUBU? Must preserve data flow on execution

     Advantages of Dynamic Scheduling
     • Handles cases when dependences unknown at compile time
       - (e.g., because they may involve a memory reference)
     • It simplifies the compiler
     • Allows code that was compiled for one pipeline to run efficiently on a different pipeline
     • Hardware speculation, a technique with significant performance advantages, builds on dynamic scheduling

     HW Schemes: Instruction Parallelism
     • Key idea: Allow instructions behind stall to proceed
         DIVD F0,F2,F4
         ADDD F10,F0,F8
         SUBD F12,F8,F14
     • Enables out-of-order execution and allows out-of-order completion
     • Will distinguish when an instruction begins execution and when it completes execution; between the 2 times, the instruction is in execution
     • In a dynamically scheduled pipeline, all instructions pass through issue stage in order (in-order issue)
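The DIVD/ADDD/SUBD example above can be traced with a short Python sketch that asks which later instructions truly need the result of a long-running instruction. This is an illustration under stated assumptions, not the hardware mechanism from the lecture: `blocked_by` is a hypothetical helper, it follows only RAW dependences through registers, and it ignores WAR/WAW hazards and functional-unit limits (as if name dependences had been removed by renaming).

```python
def blocked_by(instrs, stalled_index=0):
    """Return indices of later instructions that must wait, directly or transitively,
    for the result of instrs[stalled_index]. Each instruction is (op, dest, srcs)."""
    pending = {instrs[stalled_index][1]}  # registers whose values are not ready yet
    waiting = []
    for i in range(stalled_index + 1, len(instrs)):
        _op, dest, srcs = instrs[i]
        if any(src in pending for src in srcs):
            waiting.append(i)   # reads a value that is not ready: must wait
            pending.add(dest)   # so its own result is not ready either
        else:
            # A ready instruction overwrites dest, so later readers take their value
            # from it (assumes name dependences are handled, e.g. by renaming).
            pending.discard(dest)
    return waiting


if __name__ == "__main__":
    program = [
        ("DIVD", "F0", ("F2", "F4")),    # long-latency divide
        ("ADDD", "F10", ("F0", "F8")),   # needs F0 from DIVD
        ("SUBD", "F12", ("F8", "F14")),  # independent of DIVD and ADDD
    ]
    for i in blocked_by(program):
        print(program[i][0], "must wait for", program[0][0])
```

In this trace only ADDD waits on the divide. SUBD has no data dependence on either earlier instruction, so a dynamically scheduled pipeline can let it execute and complete while ADDD is still stalled.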


