ECE 550D Fundamentals of Computer Systems and Engineering
Fall 2016
Datapaths
Tyler Bletsch, Duke University
Slides are derived from work by Andrew Hilton (Duke) and Amir Roth (Penn)
Last time
• What did we do last time?
• MIPS Assembly
  • Practice translating C to assembly together
  • Using functions
  • Calling conventions
  • jal (call)
  • jr (return)
Now: confluence of MIPS + digital logic
• Start of semester: Digital Logic
  • Building blocks of digital design
• Most recently: MIPS assembly, ISA
  • Lowest level software
• Now: where they meet…
  • Datapaths: hardware implementation of processors
• By the way: homework 4 = build a datapath
  • With some components from the TAs…
Necessary ingredient: the ALU
• ALU: Arithmetic/Logic Unit
• Performs any supported math or logic operation on two inputs
• A third input (op) selects which operation is performed
[Figure: ALU block with inputs A, B, and op, and output out]
Add/Subtract With Overflow Detection
[Figure: chain of n full adders. Inputs: a_0 … a_{n-1}, b_0 … b_{n-1}, and Add/Sub. Outputs: S_0 … S_{n-1} and Overflow]
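The adder/subtractor in the figure can be modeled bit by bit. This is a minimal sketch, not the exact circuit from the slide: it assumes subtraction is done by inverting each b bit and feeding Add/Sub in as the first carry, and that Overflow is the XOR of the carry into and out of the top bit (a standard two's complement technique). The names `full_adder` and `add_sub` are made up for illustration.

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def add_sub(a, b, sub, n=8):
    """n-bit add (sub=0) or subtract (sub=1); returns (result, overflow)."""
    carry = sub              # assumption: a - b is computed as a + ~b + 1
    carry_into_msb = 0
    result = 0
    for i in range(n):
        ai = (a >> i) & 1
        bi = ((b >> i) & 1) ^ sub   # invert b bits when subtracting
        if i == n - 1:
            carry_into_msb = carry
        s, carry = full_adder(ai, bi, carry)
        result |= s << i
    overflow = carry_into_msb ^ carry   # signed overflow check
    return result, overflow
```

With n=8, `add_sub(100, 100, 0)` produces the bit pattern for 200 with overflow set, because 200 does not fit in a signed 8-bit value.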
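ALU Slice
[Figure: one-bit ALU slice with inputs a, b, C_in, Add/sub, and 2-bit F; a 4-input mux (inputs 0 to 3) selects output Q; C_out goes to the next slice]

Add/sub  F  Q
0        0  a + b
1        0  a - b
-        1  NOT b
-        2  a OR b
-        3  a AND b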
The ALU
[Figure: n-bit ALU built from n ALU slices. Slice i takes a_i and b_i and produces Q_i; ALU control, derived from op, drives every slice. Outputs: out (Q_{n-1} … Q_0), Overflow, and "Is non-zero?"]
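A behavioral sketch of the whole ALU, working on integers rather than individual slices. It assumes the operation encoding read from the ALU Slice slide (F=0 add/subtract chosen by Add/sub, F=1 NOT b, F=2 OR, F=3 AND); the function name `alu` and the 32-bit default width are illustrative choices.

```python
def alu(a, b, add_sub, f, n=32):
    """Returns (out, overflow, is_nonzero) for n-bit inputs a and b."""
    mask = (1 << n) - 1
    a &= mask
    b &= mask
    overflow = 0
    if f == 0:
        operand = (~b & mask) if add_sub else b   # a - b == a + ~b + 1
        out = (a + operand + add_sub) & mask
        sign = 1 << (n - 1)
        # Signed overflow: both inputs share a sign that the result lacks.
        overflow = int(((a ^ out) & (operand ^ out) & sign) != 0)
    elif f == 1:
        out = ~b & mask
    elif f == 2:
        out = a | b
    elif f == 3:
        out = a & b
    else:
        raise ValueError("F must be in 0..3")
    return out, overflow, int(out != 0)
```

The "Is non-zero?" output is what a branch such as beq can test after subtracting its two register operands.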
Datapath for MIPS ISA
• Consider only the following instructions
    add $1,$2,$3
    addi $1,$3,2
    lw $1,4($3)
    sw $1,4($3)
    beq $1,$2,PC_relative_target
    j absolute_target
• Why only these?
  • Most other instructions are the same from datapath viewpoint
  • The ones that aren’t are left for you to figure out
Remember The von Neumann Model?
• Instruction Fetch: Read instruction bits from memory
• Decode: Figure out what those bits mean
• Operand Fetch: Read registers (+ mem to get sources)
• Execute: Do the actual operation (e.g., add the #s)
• Result Store: Write result to register or memory
• Next Instruction: Figure out mem addr of next insn, repeat
[Figure: loop of Instruction Fetch, Instruction Decode, Operand Fetch, Execute, Result Store, Next Instruction]
Start With Fetch
[Figure: PC feeding Insn Mem, with a +4 adder feeding back into PC]
• Same for all instructions (don’t know insn yet)
• PC and instruction memory
• A +4 incrementer computes default next instruction PC
• Details of Insn Mem: later…
  • For now: just assume a bunch of DFFs
First Instruction: add
[Figure: fetch datapath extended with a Register File (ports s1, s2, d) and an ALU]
R-type: Op(6) Rs(5) Rt(5) Rd(5) Sh(5) Func(6)
• Decoding: very easy in MIPS
• Add register file and ALU
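Because the R-type fields sit at fixed bit positions, decoding amounts to shifts and masks. A small sketch of that idea; the helper name `decode_r_type` is hypothetical, and the field widths come from the R-type layout on the slide.

```python
def decode_r_type(insn):
    """Split a 32-bit R-type instruction word into its six fields."""
    return {
        "op":   (insn >> 26) & 0x3F,   # Op(6)
        "rs":   (insn >> 21) & 0x1F,   # Rs(5)
        "rt":   (insn >> 16) & 0x1F,   # Rt(5)
        "rd":   (insn >> 11) & 0x1F,   # Rd(5)
        "sh":   (insn >> 6) & 0x1F,    # Sh(5)
        "func": insn & 0x3F,           # Func(6)
    }


# add $1,$2,$3 encodes as 0x00430820 (op=0, rs=2, rt=3, rd=1, func=0x20)
fields = decode_r_type(0x00430820)
```

In hardware the same "decode" is just wiring: bits 25-21 and 20-16 go straight to the register file read ports.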
Second Instruction: addi
[Figure: datapath extended with a sign extension unit (SX) and a mux on the second ALU input]
I-type: Op(6) Rs(5) Rt(5) Immed(16)
• Destination register can now be either Rd or Rt
• Add sign extension unit and mux into second ALU input
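A sketch of what the sign extension unit and the new ALU input mux compute. The function names and the `alu_in_b` selector argument are illustrative (the selector stands in for the mux control); the I-type field positions follow the layout on the slide.

```python
def sign_extend16(imm):
    """Interpret the low 16 bits of imm as a signed two's complement value."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def decode_i_type(insn):
    """Split a 32-bit I-type instruction word into its four fields."""
    return {
        "op":  (insn >> 26) & 0x3F,
        "rs":  (insn >> 21) & 0x1F,
        "rt":  (insn >> 16) & 0x1F,
        "imm": insn & 0xFFFF,
    }


def second_alu_input(rt_value, imm16, alu_in_b):
    """Mux on ALU input B: register value (0) or sign-extended immediate (1)."""
    return sign_extend16(imm16) if alu_in_b else rt_value
```

For example, the word 0x20610002 decodes as op=8 (addi), rs=3, rt=1, imm=2, so the destination is Rt rather than Rd.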
Third Instruction: lw
[Figure: datapath extended with Data Mem (address port a, data port d) and a register write data mux]
I-type: Op(6) Rs(5) Rt(5) Immed(16)
• Add data memory, address is ALU output
• Add register write data mux to select memory output or ALU output
Fourth Instruction: sw
[Figure: datapath with a new path from the register file's second read port to the Data Mem data input]
I-type: Op(6) Rs(5) Rt(5) Immed(16)
• Add path from second input register to data memory data input
Fifth Instruction: beq
[Figure: datapath extended with a << 2 shifter, a branch target adder, a PC input mux, and the ALU zero output (z)]
I-type: Op(6) Rs(5) Rt(5) Immed(16)
• Add left shift unit and adder to compute PC-relative branch target
• Add PC input mux to select PC+4 or branch target
• Note: a shift by a fixed amount is very simple in hardware
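The branch path can be sketched as follows. This assumes the standard MIPS convention that the target is relative to PC+4 and that the branch is taken when the two registers compare equal; the function name `next_pc_beq` and its `equal` argument are illustrative.

```python
def next_pc_beq(pc, imm16, equal):
    """Next PC for beq: branch target if registers are equal, else PC+4."""
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF
    imm16 &= 0xFFFF
    offset = imm16 - 0x10000 if imm16 & 0x8000 else imm16   # sign extend
    target = (pc_plus_4 + (offset << 2)) & 0xFFFFFFFF       # << 2, then add
    return target if equal else pc_plus_4                   # PC input mux
```

An immediate of 0xFFFF (that is, -1) with the branch taken lands back on the branch itself, since PC+4 minus 4 is PC.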
Sixth Instruction: j
[Figure: datapath extended with a second << 2 shifter for the jump immediate and an additional PC input mux]
J-type: Op(6) Immed(26)
• Add shifter to compute left shift of 26-bit immediate
• Add additional PC input mux for jump target
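A sketch of the jump target computation. The slide only mentions the left shift of the 26-bit immediate; taking the top four bits from PC+4 is the standard MIPS rule and is stated here as an assumption about the datapath. The function name `jump_target` is illustrative.

```python
def jump_target(pc, imm26):
    """Jump target: top 4 bits of PC+4 joined with (imm26 << 2)."""
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF
    shifted = (imm26 & 0x03FFFFFF) << 2        # 26 bits become 28 bits
    return (pc_plus_4 & 0xF0000000) | shifted  # assumption: MIPS j rule
```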
More Instructions…
[Figure: the six-instruction datapath]
• Figure out datapath modifications for
  • jal (J-type)
  • jr (R-type)
jal
[Figure: the six-instruction datapath with a path from PC+4 to the register file write mux]
• For jal, need to get PC+4 to RF write mux
jr
[Figure: the six-instruction datapath with a path from a register file read port to the next PC mux]
• For jr, need to get RF read value to next PC mux
Good practice: Try other insns
[Figure: the six-instruction datapath]
• Pick other MIPS instructions, contemplate how to add them
“Continuous Read” Datapath Timing
[Figure: datapath above a one-cycle timeline: reads (Read IMem, Read Registers, Read DMEM) followed by writes (Write DMEM, Write Registers, Write PC)]
• Works because writes (PC, RegFile, DMem) are independent
• And because no read logically follows any write
What Is Control?
[Figure: datapath annotated with control signals BR, JP, ALUinB, ALUop, DMwe, Rwe, Rdst, Rwd]
• 8 signals control flow of data through this datapath
  • MUX selectors, or register/memory write enable signals
• A real datapath has 300-500 control signals
Example: Control for add
[Figure: datapath annotated with BR=0, JP=0, ALUinB=0, ALUop=0, DMwe=0, Rwe=1, Rdst=1, Rwd=0]
• Control for an instruction:
  • Values of all control signals to correctly execute it
Example: Control for sw
[Figure: datapath annotated with BR=0, JP=0, ALUinB=1, ALUop=0, DMwe=1, Rwe=0, Rdst=X, Rwd=X]
• Difference between sw and add is 5 signals
  • 3 if you don’t count the X (don’t care) signals
Example: Control for beq
[Figure: datapath annotated with BR=1, JP=0, ALUinB=0, ALUop=1, DMwe=0, Rwe=0, Rdst=X, Rwd=X]
• Difference between sw and beq is only 4 signals
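The signal counts quoted on the sw and beq slides can be checked mechanically. A small sketch using the signal values given in the add, sw, and beq control examples; the `CONTROL` table layout and the helper `differing` are made up for illustration.

```python
SIGNALS = ("BR", "JP", "ALUinB", "ALUop", "DMwe", "Rwe", "Rdst", "Rwd")

# Values copied from the add, sw, and beq control examples ("X" = don't care).
CONTROL = {
    "add": dict(BR=0, JP=0, ALUinB=0, ALUop=0, DMwe=0, Rwe=1, Rdst=1, Rwd=0),
    "sw":  dict(BR=0, JP=0, ALUinB=1, ALUop=0, DMwe=1, Rwe=0, Rdst="X", Rwd="X"),
    "beq": dict(BR=1, JP=0, ALUinB=0, ALUop=1, DMwe=0, Rwe=0, Rdst="X", Rwd="X"),
}


def differing(a, b, count_dont_care=True):
    """List the control signals whose values differ between insns a and b."""
    diffs = []
    for name in SIGNALS:
        va, vb = CONTROL[a][name], CONTROL[b][name]
        if va == vb:
            continue
        if not count_dont_care and "X" in (va, vb):
            continue
        diffs.append(name)
    return diffs
```

`differing("add", "sw")` lists 5 signals, 3 of them once don't-cares are excluded, and `differing("sw", "beq")` lists 4.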
You all figure out lw
[Figure: datapath annotated with control signals BR, JP, ALUinB, ALUop, DMwe, Rwe, Rdst, Rwd, values left blank]
• How would these control signals be set for lw?
Example: Control for lw
[Figure: datapath annotated with BR=0, JP=0, ALUinB=1, ALUop=0, DMwe=0, Rwe=1, Rdst=0, Rwd=1]
How Is Control Implemented?
[Figure: datapath with control signals BR, JP, ALUinB, ALUop, DMwe, Rwe, Rdst, Rwd all driven by a block labeled "Control?"]
Implementing Control
• Each insn has a unique set of control signals
  • Most are function of opcode
  • Some may be encoded in the instruction itself
    • E.g., the ALUop signal is some portion of the MIPS Func field
    + Simplifies controller implementation
    – Requires careful ISA design
Control Implementation: ROM
• ROM (read only memory): think rows of bits
  • Bits in data words are control signals
  • Lines indexed by opcode
• Example: ROM control for 6-insn MIPS datapath
  • X is “don’t care”
  • Rdst = 1 selects Rd, Rdst = 0 selects Rt (as in the add and lw examples)

opcode  BR  JP  ALUinB  ALUop  DMwe  Rwe  Rdst  Rwd
add     0   0   0       0      0     1    1     0
addi    0   0   1       0      0     1    0     0
lw      0   0   1       0      0     1    0     1
sw      0   0   1       0      1     0    X     X
beq     1   0   0       1      0     0    X     X
j       0   1   0       0      0     0    X     X
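A ROM-based controller can be sketched as a 64-entry array indexed by the 6-bit opcode, one 8-bit control word per row. Several things here are assumptions for illustration: the real MIPS opcode numbers are used as indices (with opcode 0 treated as add), don't-care bits are stored as 0, Rdst=1 means "write Rd" as in the add and lw control examples, and the names `pack`, `ROM`, and `control` are made up.

```python
FIELDS = ("BR", "JP", "ALUinB", "ALUop", "DMwe", "Rwe", "Rdst", "Rwd")


def pack(bits):
    """Pack a tuple of 0/1 values into one control word, first field in the MSB."""
    word = 0
    for bit in bits:
        word = (word << 1) | bit
    return word


ROM = [0] * 64                                  # one row per 6-bit opcode
ROM[0x00] = pack((0, 0, 0, 0, 0, 1, 1, 0))      # add (R-type, opcode 0)
ROM[0x08] = pack((0, 0, 1, 0, 0, 1, 0, 0))      # addi
ROM[0x23] = pack((0, 0, 1, 0, 0, 1, 0, 1))      # lw
ROM[0x2B] = pack((0, 0, 1, 0, 1, 0, 0, 0))      # sw  (X stored as 0)
ROM[0x04] = pack((1, 0, 0, 1, 0, 0, 0, 0))      # beq (X stored as 0)
ROM[0x02] = pack((0, 1, 0, 0, 0, 0, 0, 0))      # j   (X stored as 0)


def control(opcode):
    """Read the ROM row for an opcode and unpack it into named signals."""
    word = ROM[opcode & 0x3F]
    last = len(FIELDS) - 1
    return {name: (word >> (last - i)) & 1 for i, name in enumerate(FIELDS)}
```

Most of the 64 rows stay empty in this toy version, which hints at why a ROM is simple but not compact.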
Control Implementation: Random Logic
• Real machines have 100+ insns and 300+ control signals
  • 30,000+ control bits (~4KB)
  – Not huge, but hard to make faster than datapath (important!)
• Alternative: random logic (random = ‘non-repeating’)
  • Yes, “random logic” is a very dumb and misleading name for this concept. Sorry.
  • Exploits the observation: many signals have few 1s or few 0s
• Example: random logic control for 6-insn MIPS datapath
[Figure: opcode decoded into lines add, addi, lw, sw, beq, j, which are combined by gates into BR, JP, DMwe, Rwd, Rwe, Rdst, ALUop, ALUinB]
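In random logic, each control signal becomes a small gate network over the decoded opcode lines. A sketch for the 6-insn datapath, where each signal is an OR of the instructions that set it to 1. It assumes don't-cares resolve to 0 and Rdst=1 means "write Rd" as in the add control example; the function name is illustrative, and the gates in the slide's figure may be arranged differently.

```python
INSNS = ("add", "addi", "lw", "sw", "beq", "j")


def random_logic_control(insn):
    """Compute the 8 control signals from one-hot decoded opcode lines."""
    is_ = {name: int(insn == name) for name in INSNS}   # opcode decoder
    return {
        "BR":     is_["beq"],
        "JP":     is_["j"],
        "DMwe":   is_["sw"],
        "Rwd":    is_["lw"],
        "Rwe":    is_["add"] | is_["addi"] | is_["lw"],
        "Rdst":   is_["add"],
        "ALUop":  is_["beq"],
        "ALUinB": is_["addi"] | is_["lw"] | is_["sw"],
    }
```

Five of the eight signals need no gate at all beyond the decoder, which is the "few 1s" observation in action.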
Datapath and Control Timing
[Figure: datapath with a Control ROM/random logic block, above a one-cycle timeline: Read IMem, (Read Control ROM), Read Registers, Read DMEM, then Write DMEM, Write Registers, Write PC]
Single-Cycle Datapath Performance
[Figure: datapath with Control ROM/random logic block]
• Goes against make common case fast (MCCF) principle
  + Low Cycles Per Instruction (CPI): 1
  – Long clock period: to accommodate slowest insn
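To see why the slowest instruction sets the clock, consider a toy model. All latencies below are hypothetical numbers chosen for illustration, not measurements; the per-instruction paths follow the datapath built up earlier (lw is the only instruction that uses every unit).

```python
# Hypothetical unit latencies in nanoseconds (illustrative only).
LATENCY_NS = {"imem": 2, "regread": 1, "alu": 2, "dmem": 2, "regwrite": 1}

# Units each instruction passes through in the single-cycle datapath.
PATHS = {
    "add": ("imem", "regread", "alu", "regwrite"),
    "lw":  ("imem", "regread", "alu", "dmem", "regwrite"),
    "sw":  ("imem", "regread", "alu", "dmem"),
    "beq": ("imem", "regread", "alu"),
}


def insn_delay_ns(insn):
    """Total delay along one instruction's path."""
    return sum(LATENCY_NS[unit] for unit in PATHS[insn])


def clock_period_ns():
    """A single-cycle clock must fit the slowest instruction."""
    return max(insn_delay_ns(insn) for insn in PATHS)
```

Under these made-up numbers add needs 6 ns but still takes the full 8 ns cycle dictated by lw.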
Interlude: Performance
• Previous slide alludes to something new: Performance
  • Don’t just want it to work…
  • But want it to go fast!
• Three components to performance:
    Number of instructions
    x Cycles per instruction (CPI)
    x Clock Period (1 / Clock frequency)

$$\frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Cycle}} = \frac{\text{Seconds}}{\text{Program}}$$
Interlude: Performance
• Three components to performance:
    Number of instructions  <- Compiler’s Job
    x Cycles per instruction (CPI)
    x Clock Period (1 / Clock frequency)

$$\frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Cycle}} = \frac{\text{Seconds}}{\text{Program}}$$

• Insns/Program: determined by compiler + ISA
  • Generally assume fixed program when doing micro-architecture
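The performance equation is a straight product, which a few lines make concrete. The function name and the sample numbers (one million instructions, CPI of 1, an 8 ns clock) are hypothetical values chosen for illustration.

```python
def execution_time_s(insn_count, cpi, clock_period_s):
    """Seconds/program = insns/program x cycles/insn x seconds/cycle."""
    return insn_count * cpi * clock_period_s


# Hypothetical single-cycle machine: CPI = 1, 8 ns clock period.
single_cycle = execution_time_s(1_000_000, 1, 8e-9)

# Same fixed program on a hypothetical design with CPI = 1 and a 2 ns clock.
faster_clock = execution_time_s(1_000_000, 1, 2e-9)
```

With the program held fixed, only CPI and clock period are left for the micro-architecture to improve.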