EE182 Computer Organization and Design
Winter 1998
Chapter 5 Lectures
Processor Datapath and Control
Part I: Single-Cycle Implementation

Lecture Handout 5-1: Single-Cycle Implementation Slide 1 EE 182 -- Winter 1989

Single-Cycle Implementation Outline
The Big Picture
MIPS ISA Subset
Clocking Methodology
Datapath Components
Single-Cycle Design
  - Assembling the Datapath
  - Controlling the machine
  - Advantages and Disadvantages

Computer System Organization
[Diagram: Computer = Processor (Control, Datapath) + Memory + Devices (Input, Output)]
Cover control and datapath design
Emphasize control structure
Use previous ALU design in datapath

Performance Impact
Performance of a machine is determined by
  - Instruction count
  - Clock cycle time
  - Clock cycles per instruction (CPI)
Processor design (datapath and control) determines
  - Clock cycle time
  - CPI (for fixed instruction mix)
In this part: Single-cycle processor
  - Advantage
    • Only one clock cycle per instruction
  - Disadvantages
    • Long cycle time
    • Inefficient utilization of memory and functional units
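
The three factors on the Performance Impact slide multiply together: execution time = instruction count x CPI x clock cycle time. The sketch below is only an illustration of that product. The function name and every number in it are hypothetical, chosen to show that a CPI of 1 does not guarantee the shortest run time when the cycle is long.

```python
def execution_time_ns(instruction_count, cpi, cycle_time_ns):
    """Execution time = instruction count * CPI * clock cycle time."""
    return instruction_count * cpi * cycle_time_ns

# Hypothetical machines running the same 1,000,000-instruction program.
single_cycle = execution_time_ns(1_000_000, 1, 10)  # CPI = 1, long cycle
multi_cycle = execution_time_ns(1_000_000, 4, 2)    # higher CPI, short cycle

print(single_cycle, multi_cycle)
```

With these made-up values the single-cycle machine needs more time despite its lower CPI, which is the "long cycle time" disadvantage listed above.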

MIPS Instruction Formats (Review)
Three instruction formats
  - R-type
      bits:   31-26   25-21   20-16   15-11   10-6    5-0
      field:  op      rs      rt      rd      shamt   funct
      width:  6 bits  5 bits  5 bits  5 bits  5 bits  6 bits
  - I-type
      bits:   31-26   25-21   20-16   15-0
      field:  op      rs      rt      immediate
      width:  6 bits  5 bits  5 bits  16 bits
  - J-type
      bits:   31-26   25-0
      field:  op      target address
      width:  6 bits  26 bits
The different fields are:
  - op: operation of the instruction
  - rs, rt, rd: source/destination register specifiers
  - shamt: shift amount
  - funct: selects variant of operation in “op” field
  - address/immediate: address offset or imm. value
  - target address: target address of jump instruction

The MIPS Subset We Implement
(Subset differs somewhat from textbook for variety)
Add, subtract (R-type: op, rs, rt, rd, shamt, funct)
  - add rd, rs, rt
  - sub rd, rs, rt
OR Immediate (I-type: op, rs, rt, immediate)
  - ori rt, rs, imm16
Load, Store (I-type)
  - lw rt, rs, imm16
  - sw rt, rs, imm16
Branch (I-type)
  - beq rs, rt, imm16
Jump (J-type: op, target address)
  - j target
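
The field boundaries above translate directly into shifts and masks. The following is a minimal sketch of a decoder that extracts every field from a 32-bit instruction word. The function name `decode` and the dictionary layout are assumptions made for illustration. Which fields are meaningful depends on the format, so a real decoder would first inspect `op`.

```python
def decode(word):
    """Split a 32-bit MIPS instruction word into all possible fields."""
    return {
        "op": (word >> 26) & 0x3F,      # bits 31-26
        "rs": (word >> 21) & 0x1F,      # bits 25-21
        "rt": (word >> 16) & 0x1F,      # bits 20-16
        "rd": (word >> 11) & 0x1F,      # bits 15-11 (R-type only)
        "shamt": (word >> 6) & 0x1F,    # bits 10-6  (R-type only)
        "funct": word & 0x3F,           # bits 5-0   (R-type only)
        "imm16": word & 0xFFFF,         # bits 15-0  (I-type only)
        "target": word & 0x03FFFFFF,    # bits 25-0  (J-type only)
    }

add_fields = decode(0x00221820)  # add $3, $1, $2
ori_fields = decode(0x342200FF)  # ori $2, $1, 0x00FF
print(add_fields["rd"], hex(ori_fields["imm16"]))
```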

Implementation Overview
Data “flows” through memory and functional units
[Figure 5.1 from text: PC, instruction memory, registers, ALU, data memory]

Clocking Methodology
All storage elements clocked by same clock edge
  - edge-triggered clocking
  - “instantaneous” state change (simplification!)
  - design always works if the clock is “slow enough”
Cycle Time = Prop. Time* + Longest Delay Path + Setup + Clock Skew
[Timing diagram: Clk with setup and hold windows around each clock edge; signal values are “don’t care” outside those windows]
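
The cycle-time formula on the Clocking Methodology slide is a plain sum, which a short calculation makes concrete. The sketch below uses hypothetical delays in picoseconds. None of these values come from the handout, and the parameter name `clk_to_q_ps` is an assumed reading of "Prop. Time".

```python
def min_cycle_time_ps(clk_to_q_ps, longest_path_ps, setup_ps, skew_ps):
    """Cycle Time = Prop. Time + Longest Delay Path + Setup + Clock Skew."""
    return clk_to_q_ps + longest_path_ps + setup_ps + skew_ps

# Hypothetical delays, in picoseconds, for illustration only.
cycle_ps = min_cycle_time_ps(
    clk_to_q_ps=200,        # register output settles after the clock edge
    longest_path_ps=8000,   # slowest combinational path in the datapath
    setup_ps=100,           # input must be stable this long before the edge
    skew_ps=50,             # worst-case clock arrival difference
)
print(cycle_ps)
```

Any clock period at or above this sum satisfies the "slow enough" condition for these assumed delays.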

The Steps of Designing a Processor
Instruction Set Architecture used for high-level specification or Register-Transfer Level (RTL) model
  - Includes major organizational decisions
    • Examples: no. and type of functional units, no. of register file ports
Datapath-RTL refined to specify functional unit behavior and interfaces
  - Datapath components
  - Datapath interconnect
  - Associated datapath “control points”
Control structure defined and Control-RTL behavioral representation created
RTL datapath and control design are refined to track physical design and functional validation
  - Changes made for timing and errata (aka “bug”) fixes
  - Amount of work varies with capabilities of CAD tools and degree of optimization for cost and performance

Example RTL for Add/Load Instructions
add rd,rs,rt
  mem[PC];                           Fetch instruction from memory
  R[rd] <- R[rs] + R[rt];            ADD operation
  PC <- PC + 4;                      Calculate next address
lw rt,rs,imm16
  mem[PC];                           Fetch instruction from memory
  Addr <- R[rs] + SignExt(imm16);    Compute memory Addr
  R[rt] <- Mem[Addr];                Load data into register
  PC <- PC + 4;                      Calculate next address
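
The RTL for add and lw can be executed almost line for line. The sketch below is one possible behavioral model, not the course's own. The `state` dictionary, the helper names, and the choice to model memory as a dictionary keyed by byte address are all assumptions. The fetch step is omitted because the operands are passed in directly.

```python
MASK32 = 0xFFFFFFFF

def sign_ext16(imm16):
    """SignExt(imm16): interpret a 16-bit field as a signed integer."""
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16

def rtl_add(state, rd, rs, rt):
    regs = state["R"]
    regs[rd] = (regs[rs] + regs[rt]) & MASK32      # R[rd] <- R[rs] + R[rt]
    state["PC"] = (state["PC"] + 4) & MASK32       # PC <- PC + 4

def rtl_lw(state, rt, rs, imm16):
    regs = state["R"]
    addr = (regs[rs] + sign_ext16(imm16)) & MASK32  # Addr <- R[rs] + SignExt(imm16)
    regs[rt] = state["Mem"][addr]                   # R[rt] <- Mem[Addr]
    state["PC"] = (state["PC"] + 4) & MASK32        # PC <- PC + 4

# Hypothetical initial state: one data word at byte address 0x100.
state = {"PC": 0, "R": [0] * 32, "Mem": {0x100: 7}}
state["R"][1] = 0x104
rtl_lw(state, rt=2, rs=1, imm16=0xFFFC)  # offset -4, so Addr = 0x100
rtl_add(state, rd=3, rs=2, rt=2)
print(state["R"][2], state["R"][3], state["PC"])
```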

Datapath Combinational Logic Elements
  - Adder: inputs A, B (32 bits each) and CarryIn; outputs Sum (32 bits) and Carry
  - MUX: inputs A, B (32 bits each) and Select; output Y (32 bits)
  - ALU: inputs A, B (32 bits each) and OP; outputs Result (32 bits) and Zero

Storage Element: Register
Register
  - Similar to the D Flip Flop except
    • N-bit input and output
    • Write Enable input
  - Write Enable:
    • 0: Data Out will not change
    • 1: Data Out will become Data In
  - Note: data changes only on falling clock edge!
[Diagram: register with N-bit Data In, N-bit Data Out, Write Enable, and Clk]
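
The elements above are simple enough to model as functions, with the register as the only stateful piece. The sketch below rests on several assumptions that the slides do not state: Select = 0 picks input A, the ALU operation is named by a string, and the `clock_edge` method stands in for the active (falling) clock edge.

```python
MASK32 = 0xFFFFFFFF

def adder(a, b, carry_in=0):
    """Return (Sum, Carry) for 32-bit inputs."""
    total = a + b + carry_in
    return total & MASK32, total >> 32

def mux(select, a, b):
    """Two-input MUX; assumes Select = 0 picks A and Select = 1 picks B."""
    return b if select else a

def alu(op, a, b):
    """Return (Result, Zero); the op names are illustrative, not from the slides."""
    if op == "add":
        result = (a + b) & MASK32
    elif op == "sub":
        result = (a - b) & MASK32
    elif op == "or":
        result = a | b
    else:
        raise ValueError("unsupported ALU operation: " + op)
    return result, int(result == 0)

class Register:
    """N-bit register with Write Enable; state changes only in clock_edge()."""
    def __init__(self):
        self.data_out = 0

    def clock_edge(self, data_in, write_enable):
        if write_enable:
            self.data_out = data_in

reg = Register()
reg.clock_edge(0x1234, write_enable=0)   # ignored: Write Enable is 0
held = reg.data_out
reg.clock_edge(0x1234, write_enable=1)   # captured on this edge
print(held, hex(reg.data_out))
```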

Storage Element: Register File
Register File consists of 32 registers:
  - Two 32-bit output busses:
    • busA and busB
  - One 32-bit input bus: busW
  - Register 0 hard-wired to value 0
Register is selected by:
  - RA selects the register to put on busA
  - RB selects the register to put on busB
  - RW selects the register to be written via busW when Write Enable is 1
Clock input (CLK)
  - The CLK input is a factor only for write operation
  - During read, behaves as a combinational logic block:
    • RA or RB stable => busA or busB valid after “access time.”
    • minor simplification of reality
[Diagram: 32 x 32-bit register file with 5-bit RW, RA, RB selects, Write Enable, Clk, 32-bit busW input, 32-bit busA and busB outputs]

Storage Element: Idealized Memory
Memory (idealized)
  - One input bus: Data In
  - One output bus: Data Out
Memory word is selected by:
  - Address selects the word to put on Data Out
  - Write Enable = 1: address selects the memory word to be written via the Data In bus
Clock input (CLK)
  - The CLK input is a factor only for write operation
  - During read, behaves as a combinational logic block:
    • Address valid => Data Out valid after “access time.”
    • minor simplification of reality
[Diagram: memory with Address, 32-bit Data In, 32-bit Data Out, Write Enable, and Clk]
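
Both storage elements share the same split: reads are combinational, writes happen only on a clock edge. The sketch below models that split under a few assumptions. The class and method names are invented for illustration, memory is a dictionary keyed by address, and unwritten memory words read as 0.

```python
MASK32 = 0xFFFFFFFF

class RegisterFile:
    """32 x 32-bit register file; register 0 is hard-wired to 0."""
    def __init__(self):
        self.regs = [0] * 32

    def read(self, ra, rb):
        # Combinational read: returns (busA, busB) without needing a clock.
        return self.regs[ra], self.regs[rb]

    def clock_edge(self, rw, bus_w, write_enable):
        # The clock matters only for writes; writes to register 0 are dropped.
        if write_enable and rw != 0:
            self.regs[rw] = bus_w & MASK32

class IdealMemory:
    """Idealized memory; assumes unwritten words read as 0."""
    def __init__(self):
        self.words = {}

    def read(self, address):
        return self.words.get(address, 0)

    def clock_edge(self, address, data_in, write_enable):
        if write_enable:
            self.words[address] = data_in & MASK32

rf = RegisterFile()
rf.clock_edge(rw=5, bus_w=42, write_enable=1)
rf.clock_edge(rw=0, bus_w=99, write_enable=1)   # no effect on register 0
bus_a, bus_b = rf.read(5, 0)

mem = IdealMemory()
mem.clock_edge(0x100, 7, write_enable=1)
mem.clock_edge(0x104, 9, write_enable=0)        # no write: enable is 0
print(bus_a, bus_b, mem.read(0x100), mem.read(0x104))
```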

Instruction Fetch Unit
Common RTL operations
  - Fetch the Instruction: mem[PC]
  - Update the program counter:
    • Sequential Code: PC <- PC + 4
    • Branch and Jump: PC <- “something else”
[Diagram: PC (clocked by Clk) supplies the Address to Instruction Memory, which outputs the 32-bit Instruction Word; Next Address Logic feeds the PC]

ADD Instruction
  bits:   31-26   25-21   20-16   15-11   10-6    5-0
  field:  op      rs      rt      rd      shamt   funct
  width:  6 bits  5 bits  5 bits  5 bits  5 bits  6 bits
add rd,rs,rt
RTL Description
  mem[PC];                   Fetch instruction from memory
  R[rd] <- R[rs] + R[rt];    ADD operation
  PC <- PC + 4;              Calculate next address
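
The fetch unit's two jobs, reading mem[PC] and updating the PC, can be sketched as a small class. This is an assumed model: the class name, the dictionary used as instruction memory, and the optional `next_address` argument (a stand-in for the "something else" that branches and jumps supply) are not from the handout.

```python
class FetchUnit:
    """PC register plus instruction memory lookup."""
    def __init__(self, instruction_memory, start_pc=0):
        self.imem = instruction_memory
        self.pc = start_pc

    def fetch(self):
        # Combinational read of mem[PC].
        return self.imem[self.pc]

    def clock_edge(self, next_address=None):
        # Sequential code: PC <- PC + 4. A branch or jump would pass
        # its own next_address instead (left unspecified here).
        if next_address is None:
            self.pc = (self.pc + 4) & 0xFFFFFFFF
        else:
            self.pc = next_address & 0xFFFFFFFF

# Hypothetical two-instruction program: add $3,$1,$2 then ori $2,$1,0x00FF.
ifu = FetchUnit({0: 0x00221820, 4: 0x342200FF})
first = ifu.fetch()
ifu.clock_edge()
second = ifu.fetch()
ifu.clock_edge()
print(hex(first), hex(second), ifu.pc)
```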

Subtract Instruction
  bits:   31-26   25-21   20-16   15-11   10-6    5-0
  field:  op      rs      rt      rd      shamt   funct
  width:  6 bits  5 bits  5 bits  5 bits  5 bits  6 bits
sub rd,rs,rt
RTL Description
  mem[PC];                   Fetch instruction from memory
  R[rd] <- R[rs] - R[rt];    SUB operation
  PC <- PC + 4;              Calculate next address

Datapath: Register-Register Ops
R[rd] <- R[rs] op R[rt]
  - Example: add rd, rs, rt
  - Ra, Rb, Rw: from instruction’s rs, rt, and rd fields
  - ALUctr, RegWr: from control after decoding
  bits:   31-26   25-21   20-16   15-11   10-6    5-0
  field:  op      rs      rt      rd      shamt   funct
  width:  6 bits  5 bits  5 bits  5 bits  5 bits  6 bits
[Datapath diagram: Rd, Rs, Rt (5 bits each) drive Rw, Ra, Rb of the 32 x 32-bit register file; busA and busB (32 bits) feed the ALU; the ALU Result (32 bits) returns on busW; ALUctr, RegWr, and Clk are control inputs]
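
The register-register datapath can be traced in a few lines: the instruction's rs, rt, and rd fields pick the registers, ALUctr picks the operation, and RegWr gates the write-back. The sketch below is a behavioral approximation. The function name and the string values for `alu_ctr` are assumptions, and the control signals are passed in by hand because control is not covered yet.

```python
MASK32 = 0xFFFFFFFF

def rtype_step(regs, word, alu_ctr, reg_wr):
    """One pass through the register-register datapath: R[rd] <- R[rs] op R[rt]."""
    rs = (word >> 21) & 0x1F   # drives Ra
    rt = (word >> 16) & 0x1F   # drives Rb
    rd = (word >> 11) & 0x1F   # drives Rw
    bus_a, bus_b = regs[rs], regs[rt]
    if alu_ctr == "add":
        result = (bus_a + bus_b) & MASK32
    elif alu_ctr == "sub":
        result = (bus_a - bus_b) & MASK32
    else:
        raise ValueError("unsupported ALUctr: " + alu_ctr)
    if reg_wr and rd != 0:     # register 0 stays hard-wired to 0
        regs[rd] = result
    return result

regs = [0] * 32
regs[1], regs[2] = 5, 7
add_result = rtype_step(regs, 0x00221820, "add", reg_wr=1)  # add $3, $1, $2
sub_result = rtype_step(regs, 0x00221822, "sub", reg_wr=0)  # sub $3, $1, $2, write suppressed
print(add_result, hex(sub_result), regs[3])
```

Because RegWr is 0 on the second call, register 3 keeps the sum from the add even though the ALU computes the difference.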

OR Immediate Instruction
  bits:   31-26   25-21   20-16   15-0
  field:  op      rs      rt      immediate
  width:  6 bits  5 bits  5 bits  16 bits
ori rt, rs, imm16
RTL Description
  mem[PC];                              Fetch instruction from memory
  R[rt] <- R[rs] OR ZeroExt(imm16);     OR operation
  PC <- PC + 4;                         Calculate next address

Datapath: Logical Ops and Immediate
R[rt] <- R[rs] op ZeroExt[imm16]
  - Example: ori rt, rs, imm16
  bits:   31-26   25-21   20-16   15-0
  field:  op      rs      rt      immediate
  width:  6 bits  5 bits  5 bits  16 bits
[Datapath diagram: the RegDst mux selects Rt or Rd as Rw; Rs drives Ra; Rb is Rt (don’t care); imm16 passes through ZeroExt (16 to 32 bits); the ALUSrc mux selects busB or the extended immediate as the second ALU input; ALUctr and RegWr come from control; the ALU Result (32 bits) returns on busW]
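
The two new muxes are what let the same datapath serve both R-type and immediate instructions. The sketch below extends a behavioral model with RegDst and ALUSrc. The mux polarities are assumptions, since the slide does not state them: RegDst = 1 selects Rd (0 selects Rt), and ALUSrc = 1 selects the zero-extended immediate (0 selects busB). The function name and `alu_ctr` strings are likewise illustrative.

```python
MASK32 = 0xFFFFFFFF

def datapath_step(regs, word, reg_dst, alu_src, alu_ctr, reg_wr):
    """One pass through the datapath with RegDst and ALUSrc muxes."""
    rs = (word >> 21) & 0x1F
    rt = (word >> 16) & 0x1F
    rd = (word >> 11) & 0x1F
    imm16 = word & 0xFFFF             # masking to 16 bits is the zero extension
    rw = rd if reg_dst else rt        # RegDst mux (assumed: 1 selects Rd)
    bus_a, bus_b = regs[rs], regs[rt]
    operand_b = imm16 if alu_src else bus_b   # ALUSrc mux (assumed: 1 selects immediate)
    if alu_ctr == "or":
        result = bus_a | operand_b
    elif alu_ctr == "add":
        result = (bus_a + operand_b) & MASK32
    else:
        raise ValueError("unsupported ALUctr: " + alu_ctr)
    if reg_wr and rw != 0:            # register 0 stays hard-wired to 0
        regs[rw] = result
    return result

regs = [0] * 32
regs[1] = 0x0F00
# ori $2, $1, 0x00FF: destination is rt, second operand is the immediate.
ori_result = datapath_step(regs, 0x342200FF, reg_dst=0, alu_src=1,
                           alu_ctr="or", reg_wr=1)
print(hex(ori_result), hex(regs[2]))
```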