ECE 550D Fundamentals of Computer Systems and Engineering Fall 2016 - PowerPoint PPT Presentation

ECE 550D Fundamentals of Computer Systems and Engineering
Fall 2016
Datapaths
Tyler Bletsch, Duke University
Slides are derived from work by Andrew Hilton (Duke) and Amir Roth (Penn)

Last time
• What did we do last time?
• MIPS Assembly
• Practice translating C to assembly together
• Using functions
• Calling conventions
• jal (call)
• jr (return)

Now: confluence of MIPS + digital logic
• Start of semester: Digital Logic
• Building blocks of digital design
• Most recently: MIPS assembly, ISA
• Lowest level software
• Now: where they meet…
• Datapaths: hardware implementation of processors
• By the way: homework 4 = build a datapath
• With some components from the TAs…

Necessary ingredient: the ALU
• ALU: Arithmetic/Logic Unit
• Performs any supported math or logic operation on two inputs
• The operation to perform is chosen by a third input
[Figure: ALU symbol with data inputs A and B, control input op, and output out]

Add/Subtract With Overflow Detection
[Figure: n-bit ripple-carry adder/subtractor built from a chain of full adders, with inputs a_0..a_n-1 and b_0..b_n-1, an Add/Sub control, sum outputs S_0..S_n-1, and an Overflow output]
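The adder/subtractor in the figure can be mimicked in software. The sketch below is a minimal model, assuming the usual construction: each b bit is XORed with the Add/Sub control, Add/Sub also feeds the first carry-in, and overflow is the XOR of the carry into and out of the top bit. The function names and the default width of 8 bits are illustrative choices, not taken from the slides.

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (sum_bit, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (a & cin) | (b & cin)
    return s, cout


def add_sub(a, b, sub, n=8):
    """n-bit ripple-carry add (sub=0) or subtract (sub=1).

    Returns (result, overflow): result is the n-bit pattern, overflow is 1
    when the signed two's complement result does not fit in n bits.
    """
    carry = sub                      # Add/Sub doubles as carry-in: a + ~b + 1
    carry_into_msb = 0
    result = 0
    for i in range(n):
        ai = (a >> i) & 1
        bi = ((b >> i) & 1) ^ sub    # invert b when subtracting
        if i == n - 1:
            carry_into_msb = carry   # remember the carry entering the top bit
        s, carry = full_adder(ai, bi, carry)
        result |= s << i
    overflow = carry_into_msb ^ carry
    return result, overflow
```

For example, `add_sub(127, 1, 0)` produces the bit pattern 128 with the overflow flag set, because 127 + 1 does not fit in a signed 8-bit value.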

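ALU Slice
[Figure: one-bit ALU slice with inputs a, b, and C_in, control inputs Add/sub and F, and outputs Q and C_out; a 4-input mux (inputs 0-3) driven by F selects Q]

Add/sub  F  Q
0        0  a + b
1        0  a - b
-        1  NOT b
-        2  a OR b
-        3  a AND b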

The ALU
[Figure: n-bit ALU built from n ALU slices, taking a_0..a_n-1 and b_0..b_n-1 and producing Q_0..Q_n-1 under a shared ALU control derived from op; the ALU symbol has inputs A, B, op and outputs out, Overflow, and "Is non-zero?"]
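A behavioral sketch of the whole ALU may help when checking a gate-level design against expected results. It assumes the Add/sub and F encoding read off the ALU slice function table (F=0 add/subtract, 1 NOT b, 2 OR, 3 AND); the function name `alu`, the return order, and reporting overflow as 0 for logic operations are assumptions made for illustration.

```python
def alu(a, b, add_sub, f, n=32):
    """Behavioral n-bit ALU. Returns (out, overflow, is_nonzero)."""
    mask = (1 << n) - 1
    sign = 1 << (n - 1)
    a &= mask
    b &= mask
    overflow = 0
    if f == 0:
        # Subtraction is a + ~b + 1, exactly as in the adder/subtractor.
        b_eff = (~b & mask) if add_sub else b
        out = (a + b_eff + add_sub) & mask
        # Signed overflow: both operands share a sign that the result lacks.
        overflow = 1 if (a ^ out) & (b_eff ^ out) & sign else 0
    elif f == 1:
        out = ~b & mask
    elif f == 2:
        out = a | b
    elif f == 3:
        out = a & b
    else:
        raise ValueError("F must be 0..3")
    return out, overflow, 1 if out != 0 else 0
```

The third return value models the "Is non-zero?" output; a beq comparison can subtract the two registers and test that flag.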

Datapath for MIPS ISA
• Consider only the following instructions
  add $1,$2,$3
  addi $1,$3,2
  lw $1,4($3)
  sw $1,4($3)
  beq $1,$2,PC_relative_target
  j absolute_target
• Why only these?
• Most other instructions are the same from the datapath viewpoint
• The ones that aren’t are left for you to figure out

Remember The von Neumann Model?
• Instruction Fetch: Read instruction bits from memory
• Decode: Figure out what those bits mean
• Operand Fetch: Read registers (+ mem to get sources)
• Execute: Do the actual operation (e.g., add the #s)
• Result Store: Write result to register or memory
• Next Instruction: Figure out mem addr of next insn, repeat
[Figure: loop of stages Instruction Fetch, Instruction Decode, Operand Fetch, Execute, Result Store, Next Instruction]

Start With Fetch
• Same for all instructions (don’t know insn yet)
• PC and instruction memory
• A +4 incrementer computes default next instruction PC
• Details of Insn Mem: later…
• For now: just assume a bunch of DFFs
[Figure: PC feeding Insn Mem, with a +4 incrementer feeding back into PC]

First Instruction: add
• Decoding: Very easy in MIPS
• R-type: Op(6) Rs(5) Rt(5) Rd(5) Sh(5) Func(6)
• Add register file and ALU
[Figure: fetch datapath extended with a Register File (ports s1, s2, d) and an ALU]

Second Instruction: addi
• I-type: Op(6) Rs(5) Rt(5) Immed(16)
• Destination register can now be either Rd or Rt
• Add sign extension unit and mux into second ALU input
[Figure: datapath extended with a sign extension unit (SX) and a mux on the second ALU input]
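The field layouts above make decoding a matter of slicing fixed bit ranges out of the 32-bit instruction word. The following sketch shows this in software; the function names and the dictionary keys are illustrative, and in hardware the same slicing is just wiring.

```python
def sign_extend16(imm):
    """Interpret a 16-bit field as a signed two's complement value."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def decode(insn):
    """Slice a 32-bit MIPS instruction word into its R-type and I-type fields.

    Every field is extracted unconditionally; which ones are meaningful
    depends on the instruction type.
    """
    return {
        "op":   (insn >> 26) & 0x3F,   # Op(6)
        "rs":   (insn >> 21) & 0x1F,   # Rs(5)
        "rt":   (insn >> 16) & 0x1F,   # Rt(5)
        "rd":   (insn >> 11) & 0x1F,   # Rd(5), R-type only
        "sh":   (insn >> 6) & 0x1F,    # Sh(5), R-type only
        "func": insn & 0x3F,           # Func(6), R-type only
        "imm":  sign_extend16(insn),   # Immed(16), I-type only
    }
```

Extracting all fields regardless of type mirrors the hardware, where Rs and Rt can be sent to the register file before the opcode has been interpreted.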

Third Instruction: lw
• I-type: Op(6) Rs(5) Rt(5) Immed(16)
• Add data memory, address is ALU output
• Add register write data mux to select memory output or ALU output
[Figure: datapath extended with a Data Mem (address port a, data port d) and a register write data mux]

Fourth Instruction: sw
• I-type: Op(6) Rs(5) Rt(5) Immed(16)
• Add path from second input register to data memory data input
[Figure: datapath extended with a path from the second register read port to the Data Mem data input]

Fifth Instruction: beq
• I-type: Op(6) Rs(5) Rt(5) Immed(16)
• Add left shift unit and adder to compute PC-relative branch target
• Add PC input mux to select PC+4 or branch target
• Note: a shift by a fixed amount is very simple
[Figure: datapath extended with a <<2 shifter, a branch target adder, a PC input mux, and the ALU zero output z]
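The branch hardware above computes one address and then chooses between it and PC+4. A minimal sketch of that computation follows, assuming the standard MIPS convention that the offset is relative to PC+4 and that the mux takes the branch target when the registers compare equal; the function name and the `taken` parameter are illustrative.

```python
def next_pc_beq(pc, imm16, taken):
    """Next PC for beq: PC+4, or PC+4 + (sign-extended immediate << 2)."""
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF
    # Sign-extend the 16-bit immediate, then scale words to bytes (<< 2).
    imm16 &= 0xFFFF
    offset = (imm16 - 0x10000 if imm16 & 0x8000 else imm16) << 2
    target = (pc_plus_4 + offset) & 0xFFFFFFFF
    return target if taken else pc_plus_4
```

With `pc = 0x1000` and an immediate of 3, a taken branch lands at 0x1010, which is three instructions past the one following the branch.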

Sixth Instruction: j
• J-type: Op(6) Immed(26)
• Add shifter to compute left shift of 26-bit immediate
• Add additional PC input mux for jump target
[Figure: datapath extended with a second <<2 shifter on the 26-bit immediate and an additional PC input mux]
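The slide only mentions the shifter, but a 26-bit immediate shifted left by 2 gives 28 bits, so four more are needed to form a 32-bit address. The sketch below assumes the standard MIPS rule that the top 4 bits come from PC+4; the function name is illustrative.

```python
def jump_target(pc, imm26):
    """Absolute target for j: top 4 bits of PC+4, then the immediate << 2."""
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF
    upper = pc_plus_4 & 0xF0000000          # bits 31..28 carried over
    lower = (imm26 & 0x03FFFFFF) << 2       # 26-bit field scaled to bytes
    return upper | lower
```

Because the upper bits are copied rather than computed, no adder is needed for j, only the shifter and wiring.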

More Instructions…
• Figure out datapath modifications for
• jal (J-type)
• jr (R-type)
[Figure: the six-instruction datapath from the previous slide]

Jal
• For jal, need to get PC+4 to RF write mux
[Figure: six-instruction datapath with an added path from PC+4 to the register file write data mux]

JR
• For JR, need to get RF read value to next PC mux
[Figure: six-instruction datapath with an added path from a register file read port to the next PC mux]
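The two extra paths can be summarized behaviorally. This is a sketch only: it follows the slides in saving PC+4 (no branch delay slot is modeled), assumes the link register is $31 as in standard MIPS, and uses a plain Python list named `regs` as a stand-in for the register file.

```python
RA = 31  # $ra, the link register written by jal


def jal(regs, pc, imm26):
    """Write PC+4 into $ra and return the jump target as the next PC."""
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF
    regs[RA] = pc_plus_4                    # PC+4 routed to the RF write mux
    return (pc_plus_4 & 0xF0000000) | ((imm26 & 0x03FFFFFF) << 2)


def jr(regs, rs):
    """Return the value read from register rs as the next PC."""
    return regs[rs] & 0xFFFFFFFF            # RF read value routed to the PC mux
```

A call followed by a return then looks like `pc = jal(regs, pc, target_field)` and later `pc = jr(regs, RA)`, which resumes at the instruction after the call.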

Good practice: Try other insns
• Pick other MIPS instructions, contemplate how to add them
[Figure: the six-instruction datapath]

“Continuous Read” Datapath Timing
• Works because writes (PC, RegFile, DMem) are independent
• And because no read logically follows any write
[Figure: datapath with a timing diagram; reads: Read IMem, Read Registers, Read DMEM; writes: Write DMEM, Write Registers, Write PC]

What Is Control?
• 8 signals control flow of data through this datapath
• MUX selectors, or register/memory write enable signals
• A real datapath has 300-500 control signals
[Figure: datapath annotated with control signals BR, JP, ALUinB, ALUop, DMwe, Rwe, Rdst, Rwd]

Example: Control for add
• Control for an instruction:
• Values of all control signals to correctly execute it
• Signals for add: BR=0, JP=0, ALUinB=0, ALUop=0, DMwe=0, Rwe=1, Rdst=1, Rwd=0
[Figure: datapath annotated with the control signal values for add]

Example: Control for sw
• Signals for sw: BR=0, JP=0, ALUinB=1, ALUop=0, DMwe=1, Rwe=0, Rdst=X, Rwd=X
• Difference between sw and add is 5 signals
• 3 if you don’t count the X (don’t care) signals
[Figure: datapath annotated with the control signal values for sw]

Example: Control for beq
• Signals for beq: BR=1, JP=0, ALUinB=0, ALUop=1, DMwe=0, Rwe=0, Rdst=X, Rwd=X
• Difference between sw and beq is only 4 signals
[Figure: datapath annotated with the control signal values for beq]
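The signal counts quoted on the sw and beq slides can be checked mechanically. The sketch below copies the control values from the add, sw, and beq example slides and counts the differing signals, with and without don't-care entries; the helper name `differing` and the use of the string "X" for don't care are illustrative choices.

```python
ADD = dict(BR=0, JP=0, ALUinB=0, ALUop=0, DMwe=0, Rwe=1, Rdst=1, Rwd=0)
SW = dict(BR=0, JP=0, ALUinB=1, ALUop=0, DMwe=1, Rwe=0, Rdst="X", Rwd="X")
BEQ = dict(BR=1, JP=0, ALUinB=0, ALUop=1, DMwe=0, Rwe=0, Rdst="X", Rwd="X")


def differing(a, b, count_dont_care=True):
    """Names of control signals whose values differ between two instructions."""
    names = []
    for name in a:
        if a[name] == b[name]:
            continue
        if not count_dont_care and "X" in (a[name], b[name]):
            continue                 # an X can be set to match the other value
        names.append(name)
    return names
```

Here `len(differing(ADD, SW))` is 5, dropping to 3 with `count_dont_care=False`, and `len(differing(SW, BEQ))` is 4, matching the slides.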

You all figure out LW
• How would these control signals be set for LW?
[Figure: datapath annotated with unassigned control signals BR, JP, ALUinB, ALUop, DMwe, Rwe, Rdst, Rwd]

Example: Control for LW
• Signals for LW: BR=0, JP=0, ALUinB=1, ALUop=0, DMwe=0, Rwe=1, Rdst=0, Rwd=1
[Figure: datapath annotated with the control signal values for LW]

How Is Control Implemented?
[Figure: datapath with control signals BR, JP, ALUinB, ALUop, DMwe, Rwe, Rdst, Rwd all driven by a block labeled "Control?"]

Implementing Control
• Each insn has a unique set of control signals
• Most are function of opcode
• Some may be encoded in the instruction itself
• E.g., the ALUop signal is some portion of the MIPS Func field
+ Simplifies controller implementation
• Requires careful ISA design

Control Implementation: ROM
• ROM (read only memory): think rows of bits
• Bits in data words are control signals
• Lines indexed by opcode
• Example: ROM control for 6-insn MIPS datapath
• X is “don’t care”

opcode  BR  JP  ALUinB  ALUop  DMwe  Rwe  Rdst  Rwd
add     0   0   0       0      0     1    1     0
addi    0   0   1       0      0     1    0     0
lw      0   0   1       0      0     1    0     1
sw      0   0   1       0      1     0    X     X
beq     1   0   0       1      0     0    X     X
j       0   1   0       0      0     0    X     X
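A ROM-based controller is, in effect, a lookup table keyed by the opcode field. The sketch below models it with a Python dictionary. The opcode numbers are the standard MIPS encodings (R-type add uses opcode 0), the Rdst polarity follows the add and LW example slides (1 selects Rd, 0 selects Rt), and representing don't care as `None` is an illustrative choice.

```python
SIGNALS = ("BR", "JP", "ALUinB", "ALUop", "DMwe", "Rwe", "Rdst", "Rwd")
X = None  # don't care

# One ROM word per opcode; each word holds the eight control bits.
CONTROL_ROM = {
    0x00: (0, 0, 0, 0, 0, 1, 1, 0),  # add (R-type)
    0x08: (0, 0, 1, 0, 0, 1, 0, 0),  # addi
    0x23: (0, 0, 1, 0, 0, 1, 0, 1),  # lw
    0x2B: (0, 0, 1, 0, 1, 0, X, X),  # sw
    0x04: (1, 0, 0, 1, 0, 0, X, X),  # beq
    0x02: (0, 1, 0, 0, 0, 0, X, X),  # j
}


def control(insn):
    """Look up the control word for a 32-bit instruction by its opcode."""
    opcode = (insn >> 26) & 0x3F
    return dict(zip(SIGNALS, CONTROL_ROM[opcode]))
```

A real ROM would have a row for every possible opcode; the dictionary simply raises `KeyError` for the ones this six-instruction datapath does not implement.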

Control Implementation: Random Logic
• Real machines have 100+ insns, 300+ control signals
• 30,000+ control bits (~4KB)
• Not huge, but hard to make faster than datapath (important!)
• Alternative: random logic (random = ‘non-repeating’)
• Yes, “random logic” is a very dumb and misleading name for this concept. Sorry.
• Exploits the observation: many signals have few 1s or few 0s
• Example: random logic control for 6-insn MIPS datapath
[Figure: opcode decoded into one line per instruction (add, addi, lw, sw, beq, j), feeding logic that produces BR, JP, DMwe, Rwd, Rwe, Rdst, ALUop, ALUinB]
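Since most signals are 1 for only a few instructions, each one reduces to an OR of a few decoded opcode lines. The sketch below writes out one such set of equations for the six-instruction datapath. It assumes don't-care entries are resolved to 0 and that Rdst=1 selects Rd as in the add example; both are choices made for illustration, and the function takes a mnemonic string in place of a decoded opcode.

```python
def random_logic_control(mnemonic):
    """Compute control signals as small Boolean functions of one-hot opcode lines."""
    names = ("add", "addi", "lw", "sw", "beq", "j")
    if mnemonic not in names:
        raise ValueError("unsupported instruction: " + mnemonic)
    # One-hot decode: exactly one of these lines is 1.
    is_add, is_addi, is_lw, is_sw, is_beq, is_j = (
        int(mnemonic == n) for n in names
    )
    return {
        "BR": is_beq,
        "JP": is_j,
        "ALUinB": is_addi | is_lw | is_sw,   # immediate as second ALU input
        "ALUop": is_beq,
        "DMwe": is_sw,
        "Rwe": is_add | is_addi | is_lw,     # instructions that write a register
        "Rdst": is_add,                      # don't cares resolved to 0
        "Rwd": is_lw,                        # don't cares resolved to 0
    }
```

Five of the eight signals depend on a single opcode line, so they need no gates beyond the decoder itself.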

Datapath and Control Timing
[Figure: datapath with a Control ROM/random logic block and a timing diagram; reads: Read IMem, (Read Control ROM), Read Registers, Read DMEM; writes: Write DMEM, Write Registers, Write PC]

Single-Cycle Datapath Performance
+ Low Cycles Per Instruction (CPI): 1
– Long clock period: to accommodate slowest insn
• Goes against make common case fast (MCCF) principle
[Figure: datapath with Control ROM/random logic block]
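The clock period penalty can be made concrete with a small calculation. Every number below is hypothetical and chosen only to illustrate the idea, as is the assumption about which components each instruction passes through; real latencies depend on the implementation.

```python
# Hypothetical component latencies in nanoseconds (illustrative only).
LATENCY_NS = {"imem": 2, "reg_read": 1, "alu": 2, "dmem": 2, "reg_write": 1}

# Assumed components on each instruction's path through the datapath.
PATHS = {
    "add":  ("imem", "reg_read", "alu", "reg_write"),
    "addi": ("imem", "reg_read", "alu", "reg_write"),
    "lw":   ("imem", "reg_read", "alu", "dmem", "reg_write"),
    "sw":   ("imem", "reg_read", "alu", "dmem"),
    "beq":  ("imem", "reg_read", "alu"),
    "j":    ("imem",),
}


def insn_latency(name):
    """Total latency of one instruction's path."""
    return sum(LATENCY_NS[stage] for stage in PATHS[name])


def clock_period():
    """A single-cycle clock must be long enough for the slowest instruction."""
    return max(insn_latency(name) for name in PATHS)
```

Under these made-up numbers lw needs 8 ns, so every instruction, including a 6 ns add, is charged the full 8 ns cycle.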

Interlude: Performance
• Previous slide alludes to something new: Performance
• Don’t just want it to work…
• But want it to go fast!
• Three components to performance:
  Number of instructions
  x Cycles per instruction (CPI)
  x Clock Period (1 / Clock frequency)

  (Instructions / Program) x (Cycles / Instruction) x (Seconds / Cycle) = Seconds / Program

Interlude: Performance
• Three components to performance:
  Number of instructions <- Compiler’s Job
  x Cycles per instruction (CPI)
  x Clock Period (1 / Clock frequency)

  (Instructions / Program) x (Cycles / Instruction) x (Seconds / Cycle) = Seconds / Program
• Insns/Program: determined by compiler + ISA
• Generally assume fixed program when doing micro-architecture
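The performance equation translates directly into a one-line calculation. The function name and the sample numbers below are hypothetical and serve only to show how the three factors combine.

```python
def execution_time(insn_count, cpi, clock_hz):
    """Seconds per program = instructions x cycles/instruction x seconds/cycle."""
    clock_period = 1.0 / clock_hz          # seconds per cycle
    return insn_count * cpi * clock_period
```

For instance, a program of one billion instructions at CPI = 1 on a 500 MHz clock takes about 2 seconds; halving the clock period or the instruction count would each halve that time.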
