Computer Architecture: Lecture “6”
Multicycle MIPS Implementation
• “Severe 100% midterm advisory”
• Thursday!!
Single-Cycle CPU Summary
• Fairly straightforward
• Which instruction takes the longest? By how much? Why is that a problem?
• Execution time = instructions * CPI * cycle time
• Most machines are not single-cycle.
Evaluating Critical Path

Component delays in ns ("-" = not used by that instruction):

          I mem   Decode,   ALU   PC       D mem   R-Write   Total
                  R-Read          update
R-type    1       1         .9    -        -       .8        3.7
Load      1       1         .9    -        1       .8        4.7
Store     1       1         .9    -        1       -         3.9
beq       1       1         .9    .1       -       -         3.0

[Figure: single-cycle MIPS datapath with control signals PCSrc, RegWrite, MemWrite, MemRead, MemtoReg, ALUSrc, RegDst, and ALUOp]

Insight: Different instructions have different critical paths!
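The totals in the table can be checked by summing each row, and the single-cycle clock period is then the largest total. The following is a minimal sketch of that arithmetic; the names `DELAYS_NS` and `path_delay` are illustrative and not part of the lecture, and unused components are entered as 0.0.

```python
# Component delays in ns, in the column order of the table:
# I mem, Decode/R-Read, ALU, PC update, D mem, R-Write (0.0 = not used).
DELAYS_NS = {
    "R-type": [1.0, 1.0, 0.9, 0.0, 0.0, 0.8],
    "Load":   [1.0, 1.0, 0.9, 0.0, 1.0, 0.8],
    "Store":  [1.0, 1.0, 0.9, 0.0, 1.0, 0.0],
    "beq":    [1.0, 1.0, 0.9, 0.1, 0.0, 0.0],
}

def path_delay(name):
    """Critical-path delay of one instruction class, rounded to 0.1 ns."""
    return round(sum(DELAYS_NS[name]), 1)

totals = {name: path_delay(name) for name in DELAYS_NS}

# In a single-cycle design every instruction gets the same clock period,
# so the period must cover the slowest instruction class.
slowest = max(totals, key=totals.get)
single_cycle_clock_ns = totals[slowest]

print(totals)
print(slowest, single_cycle_clock_ns)
```

With these numbers the load sets the clock period, so a beq that needs only 3.0 ns of logic still occupies a full 4.7 ns cycle.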
Multicycle design
• Problem: In single-cycle design, cycle time must be long enough for the longest instruction
• Solution: break execution into smaller tasks
- each task takes a cycle;
- different instructions require different numbers of cycles
• Another advantage: we can multiplex area-intensive datapath components (memories, ALUs, etc.) and use them multiple times for a given instruction (as long as each use is on a different cycle). This used to be one of the key desirable features of multicycle; not so much anymore.
Idea: quantize instruction execution into smaller steps
• Five execution steps (some instructions use fewer)
- IF: Instruction Fetch
- ID: Instruction Decode (& register fetch & add PC+immed)
- EX: Execute
- Mem: Memory access
- WB: Write-Back into registers

Component delays in ns ("-" = not used by that instruction):

          I cache   Decode,   ALU   PC       D cache   R-Write   Total
                    R-Read          update
R-type    1         1         .9    -        -         .8        3.7
Load      1         1         .9    -        1         .8        4.7
Store     1         1         .9    -        1         -         3.9
beq       1         1         .9    .1       -         -         3.0
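One way to see where the multicycle clock period comes from is to give each step the delay of its dominant component and take the maximum. The sketch below is only an illustration under two assumptions the slide does not spell out: the step-to-component mapping in `STEP_DELAY_NS` (the 0.1 ns PC update is left out), and the per-class step lists in `STEPS_USED`. Both names are hypothetical.

```python
# Assumed mapping from execution step to its dominant component delay (ns).
STEP_DELAY_NS = {
    "IF": 1.0,   # instruction cache
    "ID": 1.0,   # decode and register read
    "EX": 0.9,   # ALU
    "Mem": 1.0,  # data cache
    "WB": 0.8,   # register write
}

# Assumed steps used by each instruction class (some use fewer than five).
STEPS_USED = {
    "R-type": ["IF", "ID", "EX", "WB"],
    "Load":   ["IF", "ID", "EX", "Mem", "WB"],
    "Store":  ["IF", "ID", "EX", "Mem"],
    "beq":    ["IF", "ID", "EX"],
}

# Every step takes one cycle, so the cycle must fit the slowest step.
clock_ns = max(STEP_DELAY_NS.values())

cycles = {cls: len(steps) for cls, steps in STEPS_USED.items()}
latency_ns = {cls: n * clock_ns for cls, n in cycles.items()}

print(clock_ns)
print(cycles)
print(latency_ns)
```

Under these assumptions a load takes 5 cycles of 1 ns, which is longer than its 4.7 ns of raw logic delay, because every step is rounded up to a full clock period.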
Will multicycle be faster?
• Depends on program + stage timings

          Cycles needed   Instruction frequency
R-type    4               60%
Load      5               20%
Store     4               10%
beq       3               10%

What is CPI assuming this instruction mix?

Single cycle design:
Clock cycle time = 4.7 ns
time/inst = 1 cycle/inst * 4.7 ns/cycle = 4.7 ns/inst

Multicycle design:
Clock cycle time = 1 ns
time/inst = CPI * 1 ns/cycle = ?
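The CPI question can be worked out as a frequency-weighted average of the cycle counts. The sketch below does that arithmetic with the numbers from this slide; the variable names are illustrative only.

```python
# (cycles needed, fraction of executed instructions) per instruction class.
MIX = {
    "R-type": (4, 0.60),
    "Load":   (5, 0.20),
    "Store":  (4, 0.10),
    "beq":    (3, 0.10),
}

# Average CPI = sum over classes of cycles * frequency.
cpi = sum(cycles * freq for cycles, freq in MIX.values())

single_cycle_ns_per_inst = 1 * 4.7   # 1 cycle/inst * 4.7 ns/cycle
multicycle_ns_per_inst = cpi * 1.0   # CPI * 1 ns/cycle

speedup = single_cycle_ns_per_inst / multicycle_ns_per_inst

print(f"CPI = {cpi:.1f}")
print(f"multicycle time/inst = {multicycle_ns_per_inst:.1f} ns")
print(f"speedup over single cycle = {speedup:.2f}x")
```

For this mix the CPI comes to 4.1, so the multicycle design averages about 4.1 ns per instruction against 4.7 ns for single cycle. A mix with more loads, or a slower clock, would shrink or erase that advantage.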
Adding State Elements
Since execution takes place over multiple cycles, and we reuse some of the hardware, we need to capture intermediate results.
Need extra registers when:
- signal is computed in one clock cycle and used in another, AND
- the inputs to the combinational circuit can change before the signal is written into a state element.
• We only require them to be “held” for 1 cycle
Where to add registers (more or less)
[Figure: single-cycle datapath divided into the IF, ID, Ex, Mem, and WB steps, marking the step boundaries where registers are added]
Merging Logic from Single Cycle to MultiCycle
[Figure: multicycle datapath and control unit (control outputs PCWriteCond, PCWrite, PCSource, IorD, MemRead, MemWrite, MemtoReg, IRWrite, ALUOp, ALUSrcA, ALUSrcB, RegWrite, RegDst; added registers: instruction register, memory data register, A, B, ALUOut), shown together with the single-cycle datapath it is derived from]
Multicycle Datapath – let’s figure out basic execution model
[Figure: multicycle datapath and control unit, with a single shared memory, a single ALU, and the instruction register, memory data register, A, B, and ALUOut registers]
Summary of execution steps

The first two steps are the same for all instruction classes.

Step                    R-type                     Memory                          Branch
Instruction Fetch       IR = Mem[PC]
                        PC = PC + 4
Instruction Decode/     A = Reg[IR[25-21]]
register fetch          B = Reg[IR[20-16]]
                        ALUOut = PC + (sign-extend(IR[15-0]) << 2)
Execution, address      ALUOut = A op B            ALUOut = A +                    if (A == B) then
computation, branch                                sign-extend(IR[15-0])           PC = ALUOut
completion
Memory access or        Reg[IR[15-11]] = ALUOut    memory-data = Mem[ALUOut]
R-type completion                                  or Mem[ALUOut] = B
Write-back                                         Reg[IR[20-16]] = memory-data

This is Register Transfer Language (RTL)
“High level” description of changes to state elements
We’ll go through these in exacting detail
And translate them to “low level” control signal settings
Modern design tools do this automatically
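The RTL steps can be traced in software to see how many cycles each instruction class uses. The following is a rough behavioral sketch, not the lecture's implementation: it models memory as a Python dict keyed by byte address, supports only add, sub, lw, sw, and beq, and the helper names (`r_type`, `i_type`, `run_instruction`) are made up for illustration.

```python
LW, SW, BEQ, RTYPE = 0x23, 0x2B, 0x04, 0x00           # MIPS opcodes
ALU_OPS = {0x20: lambda a, b: a + b,                   # funct 0x20 = add
           0x22: lambda a, b: a - b}                   # funct 0x22 = sub

def r_type(rs, rt, rd, funct):
    """Encode an R-type instruction word (opcode 0)."""
    return (rs << 21) | (rt << 16) | (rd << 11) | funct

def i_type(op, rs, rt, imm):
    """Encode an I-type instruction word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)

def sign_extend16(x):
    return x - 0x10000 if x & 0x8000 else x

def run_instruction(s):
    """Run one instruction step by step; return the number of cycles used."""
    mem, reg = s["mem"], s["reg"]
    # Step 1, instruction fetch: IR = Mem[PC]; PC = PC + 4
    ir = mem[s["pc"]]
    s["pc"] += 4
    # Step 2, decode / register fetch: A, B, and the branch target in ALUOut
    op = ir >> 26
    rs, rt, rd = (ir >> 21) & 31, (ir >> 16) & 31, (ir >> 11) & 31
    imm = sign_extend16(ir & 0xFFFF)
    a, b = reg[rs], reg[rt]
    alu_out = s["pc"] + (imm << 2)
    # Step 3, execution / address computation / branch completion
    if op == BEQ:
        if a == b:
            s["pc"] = alu_out
        return 3
    if op == RTYPE:
        alu_out = ALU_OPS[ir & 0x3F](a, b) & 0xFFFFFFFF
        reg[rd] = alu_out            # Step 4, R-type completion
        return 4
    alu_out = a + imm                # lw / sw effective address
    if op == SW:
        mem[alu_out] = b             # Step 4, memory write
        return 4
    if op == LW:
        memory_data = mem[alu_out]   # Step 4, memory read
        reg[rt] = memory_data        # Step 5, write-back
        return 5
    raise ValueError(f"unsupported opcode {op:#x}")

state = {"pc": 0, "reg": [0] * 32, "mem": {
    0: i_type(LW, 0, 1, 100),        # lw  $1, 100($0)
    4: r_type(1, 1, 3, 0x20),        # add $3, $1, $1
    8: i_type(SW, 0, 3, 104),        # sw  $3, 104($0)
    12: i_type(BEQ, 1, 1, 1),        # beq $1, $1, +1 (taken)
    100: 7,                          # data word read by the lw
}}
cycles = [run_instruction(state) for _ in range(4)]
print(cycles, sum(cycles), state["reg"][3], state["mem"][104], state["pc"])
```

The four instructions use 5, 4, 4, and 3 cycles, matching the number of table rows each class occupies. The sketch ignores details of real hardware such as register $0 being hard-wired to zero.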
Multicycle Datapath – let’s figure out control logic
[Figure: multicycle datapath and control unit, with control outputs PCWriteCond, PCWrite, PCSource, IorD, MemRead, MemWrite, MemtoReg, IRWrite, ALUOp, ALUSrcA, ALUSrcB, RegWrite, and RegDst]
[Figure: multicycle datapath and control extended for exceptions, adding the EPC and Cause registers, the control signals CauseWrite, IntCause, and EPCWrite, and a constant address C0 00 00 00 as an additional PC source]