RISC Processors Chapter 14 S. Dandamudi
Outline
• Introduction
• Evolution of CISC processors
• RISC design principles
• PowerPC processor
  ∗ Architecture
  ∗ Addressing modes
  ∗ Instruction set
• Itanium processor
  ∗ Architecture
  ∗ Addressing modes
  ∗ Instruction set
  ∗ Instruction-level parallelism
  ∗ Branch handling
  ∗ Speculative execution
2003 S. Dandamudi Chapter 14: Page 2 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.
Introduction
• CISC
  ∗ Complex instruction set
    » Pentium is the most popular example
• RISC
  ∗ Simple instructions
    » Reduced complexity
  ∗ Modern processors use this design philosophy
    » PowerPC, MIPS, SPARC, Intel Itanium
      – Borrow some features from CISC
  ∗ No precise definition
    » We can identify some common characteristics
Evolution of CISC Designs
• Motivation: to use expensive resources efficiently
  ∗ Processor
  ∗ Memory
• High-density code
  ∗ Complex instructions
    » Hardware complexity is handled by microprogramming
    » Microprogramming also helps to
      – Reduce the impact of memory access latency
      – Offer flexibility
        · Low-cost members of the same family
  ∗ Tailored to high-level language constructs
Evolution of CISC Designs (cont’d)
                      CISC                      RISC
                      VAX 11/780   Intel 486    MIPS R4000
# instructions        303          235          94
Addr. modes           22           11           1
Inst. size (bytes)    2-57         1-12         4
GP registers          16           8            32
Evolution of CISC Designs (cont’d)
Example
  ∗ Autoincrement addressing mode of VAX
    » Performs the following actions:
        (R2) = (R2) + R3; R2 = R2 + 1
  ∗ RISC equivalent
        R4 = (R2)
        R4 = R4 + R3
        (R2) = R4
        R2 = R2 + 1
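The equivalence of the two sequences can be checked with a small simulation. The following is a minimal sketch in Python, assuming a toy machine whose registers and memory are plain dictionaries. The function names `vax_autoincrement_add`, `risc_sequence`, and `run` are hypothetical and model only the register-transfer steps listed on the slide.

```python
def vax_autoincrement_add(regs, mem):
    """One CISC instruction: (R2) = (R2) + R3; R2 = R2 + 1."""
    mem[regs["R2"]] = mem[regs["R2"]] + regs["R3"]
    regs["R2"] += 1


def risc_sequence(regs, mem):
    """Four simple instructions with the same net effect on (R2) and R2."""
    regs["R4"] = mem[regs["R2"]]           # R4 = (R2)      load
    regs["R4"] = regs["R4"] + regs["R3"]   # R4 = R4 + R3   add
    mem[regs["R2"]] = regs["R4"]           # (R2) = R4      store
    regs["R2"] = regs["R2"] + 1            # R2 = R2 + 1    increment


def run(step):
    """Run one variant on a fresh toy state; return (R2, memory word)."""
    regs = {"R2": 100, "R3": 7, "R4": 0}   # illustrative initial values
    mem = {100: 35}
    step(regs, mem)
    return regs["R2"], mem[100]
```

Both variants leave the same values in memory and in R2. The RISC version additionally overwrites the temporary register R4, which is one reason a large register set matters for load/store designs.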
Why RISC?
• Simple instructions are preferred
  ∗ Complex instructions are mostly ignored by compilers
    » Due to the semantic gap
• Simple data structures
  ∗ Complex data structures are used relatively infrequently
  ∗ Better to support a few simple data types efficiently
    » Synthesize complex ones
• Simple addressing modes
  ∗ Complex addressing modes lead to variable-length instructions
    » Lead to inefficient instruction decoding and scheduling
Why RISC? (cont’d)
• Large register set
  ∗ Efficient support for procedure calls and returns
    » Patterson and Sequin’s study
      – Procedure call/return: 12–15% of HLL statements
        · Constitute 31–33% of machine language instructions
        · Generate nearly half (45%) of memory references
  ∗ Small activation record
    » Tanenbaum’s study
      – Only 1.25% of the calls have more than 6 arguments
      – More than 93% have fewer than 6 local scalar variables
      – Large register set can avoid memory references
RISC Design Principles
• Simple operations
  ∗ Simple instructions that can execute in one cycle
• Register-to-register operations
  ∗ Only load and store operations access memory
  ∗ All other operations work on a register-to-register basis
• Simple addressing modes
  ∗ A few addressing modes (1 or 2)
• Large number of registers
  ∗ Needed to support register-to-register operations
  ∗ Minimize the procedure call and return overhead
RISC Design Principles (cont’d)
[Figure: Register windows storing activation records]
RISC Design Principles (cont’d)
• Fixed-length instructions
  ∗ Facilitate efficient instruction execution
• Simple instruction format
  ∗ Fixed boundaries for various fields
    » opcode, source operands, …
• Other features
  ∗ Tend to use Harvard architecture
  ∗ Pipelining is visible at the architecture level
PowerPC
• Registers
  ∗ 32 general-purpose registers (GPR0 – GPR31)
  ∗ 32 floating-point registers (FPR0 – FPR31)
  ∗ Condition register (CR)
    » Similar to Pentium’s flags register
    » Divided into 8 CR fields (4 bits each)
      – “less than” (LT), “greater than” (GT), “equal to” (EQ), summary overflow (SO)
      – CR1 is for floating-point exceptions
      – Other CR fields can be used for integer or FP exceptions
      – Branch instructions can test a specific CR field bit
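The division of CR into eight 4-bit fields can be illustrated with a short decoder. This is a sketch in Python, assuming the PowerPC convention that bit 0 is the most significant bit, so CR field 0 occupies the top four bits and each field holds LT, GT, EQ, SO in that order. The function name `cr_field` is hypothetical.

```python
def cr_field(cr, n):
    """Return the four flags of CR field n (0..7) from a 32-bit CR value.

    Assumes big-endian bit numbering: field n occupies bits 4n..4n+3,
    counted from the most significant bit.
    """
    if not 0 <= n <= 7:
        raise ValueError("CR field index must be 0..7")
    nibble = (cr >> (28 - 4 * n)) & 0xF
    return {
        "LT": bool(nibble & 0x8),
        "GT": bool(nibble & 0x4),
        "EQ": bool(nibble & 0x2),
        "SO": bool(nibble & 0x1),
    }
```

A branch that tests "EQ in CR field 0" corresponds here to checking `cr_field(cr, 0)["EQ"]`.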
PowerPC (cont’d)
  ∗ XER register serves two distinct purposes
    » Bits 0, 1, and 2 are used to capture
      – Summary overflow (SO), overflow (OV), carry (CA)
      – OV and CA are similar to Pentium’s overflow and carry
      – Once SO is set, only a special instruction can clear it
    » Bits 25 to 31 (7 bits)
      – Specify the number of bytes to be transferred between memory and registers
      – Two instructions
        · Load string word indexed (lswx)
        · Store string word indexed (stswx)
        · Can load/store all 32 registers (GPR0-GPR31)
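The two roles of XER can be made concrete by decoding a 32-bit value. This is a minimal sketch in Python, assuming bit 0 is the most significant bit, so that bits 0 to 2 are the top three bits and bits 25 to 31 are the low seven bits. The function name `decode_xer` is hypothetical.

```python
def decode_xer(xer):
    """Split a 32-bit XER value into its status flags and byte count."""
    return {
        "SO": bool(xer & 0x80000000),   # bit 0: summary overflow
        "OV": bool(xer & 0x40000000),   # bit 1: overflow
        "CA": bool(xer & 0x20000000),   # bit 2: carry
        "byte_count": xer & 0x7F,       # bits 25-31: bytes for lswx/stswx
    }
```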
PowerPC (cont’d)
  ∗ Link register (LR)
    » Used to store the procedure return address
      – Stores the effective address of the instruction following the procedure call instruction
      – Procedure calls use the branch instructions
        · Example: b = branch, bl = procedure call
  ∗ Count register (CTR)
    » Maintains loop count value
      – Similar to Pentium's ECX register
      – Branch instructions can test the value
• 32-bit PowerPC implementations use segmentation like the Pentium
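The roles of LR and CTR can be sketched as follows. This is an illustrative Python model, not a description of any particular implementation: it assumes 4-byte instructions, treats the branch target as already resolved, and models a count-controlled loop as "decrement CTR, continue while nonzero". The names `branch` and `count_loop` are hypothetical.

```python
INSTRUCTION_BYTES = 4  # assumption: every instruction is 32 bits wide


def branch(pc, target, link, lr):
    """Return (new_pc, new_lr) for an unconditional branch located at pc.

    With link set (as in bl), LR receives the address of the instruction
    that follows the branch; otherwise (as in b) LR is left unchanged.
    """
    if link:
        lr = pc + INSTRUCTION_BYTES
    return target, lr


def count_loop(ctr):
    """Count loop-body executions for a loop closed by a CTR-testing branch.

    Assumes ctr >= 1 on entry.
    """
    iterations = 0
    while True:
        iterations += 1   # loop body executes once
        ctr -= 1          # closing branch decrements CTR ...
        if ctr == 0:      # ... and falls through when it reaches zero
            break
    return iterations
```

A return from the procedure then amounts to branching to the address held in LR.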
PowerPC (cont’d)
• Addressing modes
  ∗ Load/store instructions support three addressing modes
    » Can use GPRs
  ∗ Register Indirect
    » Effective address = (contents of rA or 0)
    » Specifying 0 generates address 0
  ∗ Register Indirect with Immediate Index
    » Effective address = (contents of rA or 0) + imm16
  ∗ Register Indirect with Index
    » Effective address = (contents of rA or 0) + contents of rB
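The three effective-address computations can be written out directly. This is a sketch in Python, assuming 32-bit addresses that wrap around, a register file modeled as a list of 32 integers, and a 16-bit immediate that is sign-extended before the addition. All function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF  # assumption: 32-bit effective addresses


def sign_extend16(imm16):
    """Interpret the low 16 bits of imm16 as a signed value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def base(gpr, ra):
    """Contents of rA, or the literal 0 when the rA field is 0."""
    return 0 if ra == 0 else gpr[ra]


def ea_register_indirect(gpr, ra):
    return base(gpr, ra) & MASK32


def ea_immediate_index(gpr, ra, imm16):
    return (base(gpr, ra) + sign_extend16(imm16)) & MASK32


def ea_index(gpr, ra, rb):
    return (base(gpr, ra) + gpr[rb]) & MASK32
```

Note that a value of 0 in the rA field selects the constant 0, not the contents of GPR0, which is how an absolute address can be formed from the immediate or from rB alone.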
PowerPC (cont’d)
[Figure: Instruction format]
PowerPC (cont’d)
• Bits 0-5
  ∗ Specify the primary opcode
  ∗ Other fields specify suboperations
    » Depend on the instruction type
• AA bit
  ∗ 1 (use absolute address)
  ∗ 0 (use relative address)
• LK bit
  ∗ 0 (no link: plain branch)
  ∗ 1 (link: turns the branch into a procedure call)
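Extracting these fields from a 32-bit instruction word is a matter of shifts and masks. This is a sketch in Python, assuming bit 0 is the most significant bit and that AA and LK are the last two bits of the word (bits 30 and 31), as in the unconditional branch format. The bit positions of AA and LK are not stated in the text above, and the function name `decode_branch_bits` is hypothetical.

```python
def decode_branch_bits(word):
    """Extract primary opcode, AA, and LK from a 32-bit instruction word."""
    return {
        "primary_opcode": (word >> 26) & 0x3F,  # bits 0-5
        "AA": (word >> 1) & 1,                  # bit 30 (assumed position)
        "LK": word & 1,                         # bit 31 (assumed position)
    }
```

For example, the word `0x48000001` decodes to primary opcode 18 with AA = 0 and LK = 1, that is, a relative branch that also sets the link register.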
PowerPC Instruction Set
• Data transfer instructions
• Byte loads
    lbz   rD,disp(rA)   ; Load byte and zero
    lbzu  rD,disp(rA)   ; Load byte and zero with update
  » Effective address = contents of rA + disp
    lbzx  rD,rA,rB      ; Load byte and zero indexed
    lbzux rD,rA,rB      ; Load byte and zero with update indexed
  » Effective address = contents of rA + contents of rB
  » Upper three bytes of rD are zeroed
  » Update versions: rA ← effective address
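The behavior of the four byte loads can be modeled in a few lines. This is a simplified sketch in Python, assuming a byte-addressed memory held in a dictionary and a register file held in a list. It ignores the special case in which an rA field of 0 denotes the constant 0. The Python functions `lbz` and `lbzx` are hypothetical helpers named after the mnemonics, with the update forms selected by a flag.

```python
MASK32 = 0xFFFFFFFF


def lbz(gpr, mem, rd, disp, ra, update=False):
    """Model lbz (update=False) and lbzu (update=True)."""
    ea = (gpr[ra] + disp) & MASK32
    gpr[rd] = mem[ea] & 0xFF      # upper three bytes of rD become zero
    if update:
        gpr[ra] = ea              # update form: rA <- effective address


def lbzx(gpr, mem, rd, ra, rb, update=False):
    """Model lbzx (update=False) and lbzux (update=True)."""
    ea = (gpr[ra] + gpr[rb]) & MASK32
    gpr[rd] = mem[ea] & 0xFF
    if update:
        gpr[ra] = ea
```

The update forms are convenient for walking through an array, because the base register advances as a side effect of the load.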
PowerPC Instruction Set (cont’d)
• Similar instructions for halfword and word loads
    lhz, lhzu, lhzx, lhzux
    lwz, lwzu, lwzx, lwzux
• For halfword loads, sign extension is possible
    lha, lhau, lhax, lhaux
• Multiword load
    lmw rD,disp(rA)
  » Loads n consecutive words at EA to registers rD, …, r31, where n = 32 − D
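The multiword load can be sketched as a loop over the target registers. This is a minimal Python model, assuming memory is a dictionary that maps a word-aligned byte address to a 32-bit word and that the register file is a list of 32 integers. The function `lmw` is a hypothetical helper named after the mnemonic, and it ignores the case in which an rA field of 0 denotes the constant 0.

```python
def lmw(gpr, mem, rd, disp, ra):
    """Load words at EA, EA+4, ... into registers rD through r31.

    Returns the number of words transferred, n = 32 - rd.
    """
    ea = (gpr[ra] + disp) & 0xFFFFFFFF   # computed once, before any load
    n = 32 - rd
    for i in range(n):
        gpr[rd + i] = mem[ea + 4 * i]
    return n
```

A multiword store follows the same pattern with the direction of the transfer reversed.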
PowerPC Instruction Set (cont’d)
• Similar instructions for store
    stb, stbu, stbx, stbux
    sth, sthu, sthx, sthux
    stw, stwu, stwx, stwux
• Multiword store
    stmw rS,disp(rA)
  » Stores n consecutive words from registers rS, …, r31 to memory at EA, where n = 32 − S