Lecture 4: Instruction Set Architecture
Prof. V. Catania
Calcolatori elettron. II 2003

Towards Evaluation of ISA and Organization

[Figure: diagram labeled software, instruction set, hardware]
Evolution of Instruction Sets

• Major advances in computer architecture are typically associated with landmark instruction set designs
  – Ex: Stack vs GPR (System 360)
• Design decisions must take into account:
  – technology
  – machine organization
  – programming languages
  – compiler technology
  – operating systems
  And they in turn influence these.

Design Space of ISA

Five Primary Dimensions
• Number of explicit operands: 0, 1, 2, 3
• Operand storage: where besides memory?
• Effective address: how is the memory location specified?
• Type & size of operands: byte, int, float, vector, ... How is it specified?
• Operations: add, sub, mul, ... How is it specified?

Other Aspects
• Successor: how is it specified?
• Conditions: how are they determined?
• Encodings: fixed or variable? Wide?
• Parallelism
ISA Metrics

Aesthetics:
• Orthogonality
  – No special registers, few special cases, all operand modes available with any data type or instruction type
• Completeness
  – Support for a wide range of operations and target applications
• Regularity
  – No overloading for the meanings of instruction fields
• Streamlined
  – Resource needs easily determined

Ease of compilation (programming?)
Ease of implementation
Scalability

Classifying ISA

Type of storage internal to the CPU:
• STACK ARCHITECTURE: operands are implicit
• ACCUMULATOR ARCHITECTURE: one operand is implicitly the accumulator
• GENERAL PURPOSE REGISTER: only explicit operands (registers or memory)

Code sequence for C = A + B in each class:

Stack      Accumulator   Register              Register
                         (register-memory)     (load-store)
Push A     Load A        Load R1, A            Load R1, A
Push B     Add B         Add R1, B             Load R2, B
Add        Store C       Store C, R1           Add R3, R1, R2
Pop C                                          Store C, R3
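The four code sequences for C = A + B can be traced with a few lines of Python. This is only an illustrative sketch: the tuple encoding of the instructions, the function names (`run_stack`, `run_accumulator`, `run_gpr`, `compute_c`) and the rule that a name such as R1 denotes a register are assumptions made here, not part of the lecture.

```python
# Hypothetical mini-interpreters for the four code sequences computing C = A + B.
# Instruction names follow the slide; the tuple encoding is an assumption.

def run_stack(program, mem):
    """Stack machine: Add takes both operands implicitly from the stack."""
    stack = []
    for op, *args in program:
        if op == "Push":
            stack.append(mem[args[0]])
        elif op == "Add":
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        elif op == "Pop":
            mem[args[0]] = stack.pop()
    return mem

def run_accumulator(program, mem):
    """Accumulator machine: one operand is implicitly the accumulator."""
    acc = 0
    for op, addr in program:
        if op == "Load":
            acc = mem[addr]
        elif op == "Add":
            acc += mem[addr]
        elif op == "Store":
            mem[addr] = acc
    return mem

def is_reg(name):
    # Assumption: registers are named R<number>; anything else is a memory location.
    return name.startswith("R") and name[1:].isdigit()

def run_gpr(program, mem):
    """General purpose register machine: all operands are explicit."""
    regs = {}

    def read(operand):
        return regs[operand] if is_reg(operand) else mem[operand]

    for op, *args in program:
        if op == "Load":
            dst, src = args
            regs[dst] = mem[src]
        elif op == "Store":
            dst, src = args
            mem[dst] = regs[src]
        elif op == "Add" and len(args) == 2:   # dst = dst + src (src may be memory)
            dst, src = args
            regs[dst] = regs[dst] + read(src)
        elif op == "Add":                      # dst = src1 + src2
            dst, s1, s2 = args
            regs[dst] = read(s1) + read(s2)
    return mem

STACK = [("Push", "A"), ("Push", "B"), ("Add",), ("Pop", "C")]
ACCUMULATOR = [("Load", "A"), ("Add", "B"), ("Store", "C")]
REG_MEM = [("Load", "R1", "A"), ("Add", "R1", "B"), ("Store", "C", "R1")]
LOAD_STORE = [("Load", "R1", "A"), ("Load", "R2", "B"),
              ("Add", "R3", "R1", "R2"), ("Store", "C", "R3")]

def compute_c(a, b):
    """Run all four sequences on fresh memory and collect the value stored in C."""
    cases = [("stack", run_stack, STACK),
             ("accumulator", run_accumulator, ACCUMULATOR),
             ("register-memory", run_gpr, REG_MEM),
             ("load-store", run_gpr, LOAD_STORE)]
    return {name: run(prog, {"A": a, "B": b, "C": None})["C"]
            for name, run, prog in cases}
```

Calling `compute_c(2, 3)` should give the same value of C for all four classes; what differs is the instruction count (4, 3, 3, 4) and how many operands each instruction names explicitly.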
Classifying ISA

• load-store architecture: no memory reference in ALU instructions
• register-memory architecture: one memory operand per ALU instruction
• memory-memory architecture: multiple memory operands per ALU instruction

Number of memory   Maximum number of
addresses          operands allowed    Examples
0                  3                   SPARC, MIPS, Precision Architecture, PowerPC, ALPHA
1                  2                   Intel 80x86, Motorola 68000
2                  2                   VAX (also has 3-operand formats)
3                  3                   VAX (also has 2-operand formats)

FIGURE 2.2 Possible combinations of memory operands and total operands per typical ALU instruction with examples of machines.

Classifying ISA

Register-register (0,3)
  Advantages: Simple, fixed-length instruction encoding. Simple code-generation model. Instructions take similar numbers of clocks to execute.
  Disadvantages: Higher instruction count than architectures with memory references in instructions. Some instructions are short and bit encoding may be wasteful.

Register-memory (1,2)
  Advantages: Data can be accessed without loading first. Instruction formats tend to be easy to encode and yield good density.
  Disadvantages: Operands are not equivalent, since a source operand in a binary operation is destroyed. Encoding a register number and a memory address in each instruction may restrict the number of registers. Clocks per instruction vary by operand location.

Memory-memory (3,3)
  Advantages: Most compact. Doesn't waste registers for temporaries.
  Disadvantages: Large variation in instruction size, especially for 3-operand instructions. Also, large variation in work per instruction. Memory accesses create a memory bottleneck.

• Fewer alternatives make the compiler's task simpler (fewer decisions for the compiler)
• A wide variety of flexible instruction formats means fewer bits to encode the program (higher instruction density)
Addressing modes

Register
  Example: Add R4,R3
  Meaning: Regs[R4] ← Regs[R4] + Regs[R3]
  When used: When a value is in a register.

Immediate
  Example: Add R4,#3
  Meaning: Regs[R4] ← Regs[R4] + 3
  When used: For constants.

Displacement
  Example: Add R4,100(R1)
  Meaning: Regs[R4] ← Regs[R4] + Mem[100+Regs[R1]]
  When used: Accessing local variables.

Register deferred or indirect
  Example: Add R4,(R1)
  Meaning: Regs[R4] ← Regs[R4] + Mem[Regs[R1]]
  When used: Accessing using a pointer or a computed address.

Indexed
  Example: Add R3,(R1 + R2)
  Meaning: Regs[R3] ← Regs[R3] + Mem[Regs[R1]+Regs[R2]]
  When used: Sometimes useful in array addressing: R1 = base of array; R2 = index amount.

Direct or absolute
  Example: Add R1,(1001)
  Meaning: Regs[R1] ← Regs[R1] + Mem[1001]
  When used: Sometimes useful for accessing static data; address constant may need to be large.

Memory indirect or memory deferred
  Example: Add R1,@(R3)
  Meaning: Regs[R1] ← Regs[R1] + Mem[Mem[Regs[R3]]]
  When used: If R3 is the address of a pointer p, then the mode yields *p.

Autoincrement
  Example: Add R1,(R2)+
  Meaning: Regs[R1] ← Regs[R1] + Mem[Regs[R2]]; Regs[R2] ← Regs[R2] + d
  When used: Useful for stepping through arrays within a loop. R2 points to the start of the array; each reference increments R2 by the size of an element, d.

Autodecrement
  Example: Add R1,-(R2)
  Meaning: Regs[R2] ← Regs[R2] - d; Regs[R1] ← Regs[R1] + Mem[Regs[R2]]
  When used: Same use as autoincrement. Autodecrement/increment can also act as push/pop to implement a stack.

Scaled
  Example: Add R1,100(R2)[R3]
  Meaning: Regs[R1] ← Regs[R1] + Mem[100+Regs[R2]+Regs[R3]*d]
  When used: Used to index arrays. May be applied to any indexed addressing mode in some machines.

FIGURE 2.5 Selection of addressing modes with examples, meaning, and usage.

Addressing modes

A wide choice of addressing modes:
• significantly reduces instruction counts
• increases the complexity of building a machine
• may raise CPI

Knowledge of how the various addressing modes are used helps the architect choose what to include. Hence: instruction set measurement!
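The "Meaning" entries of Figure 2.5 translate almost directly into code. The sketch below is one possible rendering, assuming registers and memory are plain Python dictionaries; the function name `fetch_operand`, its keyword parameters and the sample register and memory contents are all hypothetical.

```python
def fetch_operand(mode, regs, mem, *, reg=None, base=None, index=None, const=0, d=1):
    """Return the source operand selected by one addressing mode of Figure 2.5.

    regs maps register names to values, mem maps addresses to values.
    d is the element size used by autoincrement, autodecrement and scaled.
    """
    if mode == "register":
        return regs[reg]
    if mode == "immediate":
        return const
    if mode == "displacement":
        return mem[const + regs[base]]
    if mode == "register_deferred":
        return mem[regs[base]]
    if mode == "indexed":
        return mem[regs[base] + regs[index]]
    if mode == "direct":
        return mem[const]
    if mode == "memory_indirect":
        return mem[mem[regs[base]]]
    if mode == "autoincrement":
        value = mem[regs[base]]      # use the address first ...
        regs[base] += d              # ... then step the pointer forward
        return value
    if mode == "autodecrement":
        regs[base] -= d              # step the pointer back first ...
        return mem[regs[base]]       # ... then use the new address
    if mode == "scaled":
        return mem[const + regs[base] + regs[index] * d]
    raise ValueError(f"unknown addressing mode: {mode}")

# Made-up machine state for illustration only.
regs = {"R1": 100, "R2": 8, "R3": 2}
mem = {7: 42, 100: 7, 108: 9, 116: 13, 200: 11, 1001: 5}

displacement_value = fetch_operand("displacement", regs, mem, base="R1", const=100)
indirect_value = fetch_operand("memory_indirect", regs, mem, base="R1")
scaled_value = fetch_operand("scaled", regs, mem, base="R2", index="R3", const=100, d=4)
```

With this sample state, displacement reads Mem[200], memory indirect reads Mem[Mem[100]] = Mem[7], and scaled reads Mem[100 + 8 + 2*4] = Mem[116]. Note that autoincrement and autodecrement are the only modes that modify a register as a side effect.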
Measurement on addressing modes

Immediate and displacement dominate addressing mode usage!

Frequency of the addressing mode:

Addressing mode      TeX    spice   gcc
Memory indirect       1%     6%      1%
Scaled                0%    16%      6%
Register deferred    24%     3%     11%
Immediate            43%    17%     39%
Displacement         32%    55%     40%

FIGURE 2.6 Summary of use of memory addressing modes (including immediates), using 3 programs of SPEC89 on a VAX.

Range of displacement

• 12 bits of displacement capture 75% of the full 32-bit displacements
• 16 bits of displacement capture 99%!

[Figure: percentage of displacements (0% to 30%) plotted against value (0 to 15), with one curve for the integer average and one for the floating-point average]

FIGURE 2.7 Displacement values are widely distributed.
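The "captures 75%" and "captures 99%" statements are coverage figures: the share of observed displacements that fit in a field of a given width. A minimal sketch of that calculation follows, assuming a signed two's-complement displacement field; the helper names and the `trace` list are made up for illustration and are not measured data.

```python
def signed_range(bits):
    """Inclusive (low, high) range of a two's-complement field of the given width."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1

def coverage(displacements, bits):
    """Fraction of the displacements that fit in a signed field of `bits` bits."""
    low, high = signed_range(bits)
    hits = sum(1 for value in displacements if low <= value <= high)
    return hits / len(displacements)

# Made-up displacement trace, NOT the SPEC measurements behind Figure 2.7.
trace = [4, -8, 16, 100, -2048, 2047, 3000, -30000, 40000, 12]

coverage_12 = coverage(trace, 12)   # share of the trace that fits in 12 bits
coverage_16 = coverage(trace, 16)   # share of the trace that fits in 16 bits
```

Running the same function over a real address trace for several field widths would reproduce the kind of trade-off the slide summarizes: each extra bit in the displacement field costs encoding space and buys a diminishing amount of coverage.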
Where are immediates used?

• Integer programs use immediates in about 1/3 of the instructions
• Floating-point programs use immediates in about 1/10 of the instructions!

Percentage of operations that use immediates:

Operation class      Integer average   Floating-point average
Loads                10%               45%
Compares             87%               77%
ALU operations       58%               78%
All instructions     35%               10%

FIGURE 2.8 We see that for integer ALU operations about one-half to three-quarters of the operations have an immediate operand, while for integer compares 75% to 85% of the occurrences use an immediate operand.

Values for Immediate

75% to 80% of the immediates fit within 16 bits.

[Figure: percentage of immediates (0% to 60%) plotted against the number of bits needed for an immediate value (0 to 32), with curves for gcc, TeX and spice]

FIGURE 2.9 The distribution of immediate values is shown. Machine used: VAX.
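The horizontal axis of Figure 2.9 is the number of bits needed for an immediate value. One way to compute that quantity and bin a list of immediates by it is sketched below, assuming signed two's-complement immediates; `bits_needed`, `histogram`, `share_within` and the `sample` list are hypothetical names and made-up values.

```python
from collections import Counter

def bits_needed(value):
    """Smallest two's-complement width (sign bit included) that can hold value."""
    if value >= 0:
        return value.bit_length() + 1
    return (~value).bit_length() + 1

def histogram(immediates):
    """Count how many immediates need each field width."""
    return Counter(bits_needed(value) for value in immediates)

def share_within(immediates, bits):
    """Fraction of the immediates that fit in a field of at most `bits` bits."""
    counts = histogram(immediates)
    fitting = sum(n for width, n in counts.items() if width <= bits)
    return fitting / len(immediates)

# Made-up immediates for illustration, NOT the VAX measurements of Figure 2.9.
sample = [0, 1, -1, 255, 1000, 32767, 65536, -40000]

share_8 = share_within(sample, 8)
share_16 = share_within(sample, 16)
```

The sign bit is counted here, so 255 needs 9 bits and 32767 needs 16; a measurement that treats immediates as unsigned would shift every bin down by one.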
Summary: Memory Addressing

1. A new architecture should support at least the following addressing modes: DISPLACEMENT, IMMEDIATE, REGISTER DEFERRED. Together they account for 75% to 99% of the addressing modes used in our measurements.
2. A displacement field of 12 to 16 bits will capture 75% to 99% of displacements.
3. An immediate field of 8 to 16 bits will capture 50% to 80% of the immediates.

Typical operations in Instruction Set

Operator type            Examples
Arithmetic and logical   Integer arithmetic and logical operations: add, and, subtract, or
Data transfer            Loads-stores (move instructions on machines with memory addressing)
Control                  Branch, jump, procedure call and return, traps
System                   Operating system call, virtual memory management instructions
Floating point           Floating-point operations: add, multiply
Decimal                  Decimal add, decimal multiply, decimal-to-character conversions
String                   String move, string compare, string search
Graphics                 Pixel operations, compression/decompression operations

FIGURE 2.10 Categories of instruction operators and examples of each.
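To see what a 16-bit displacement or immediate field from the memory addressing summary means for encoding, here is a sketch of packing and unpacking one instruction word. The format is purely hypothetical (6-bit opcode, two 5-bit register fields, 16-bit signed immediate, 32 bits in total) and is not taken from the lecture or from any specific machine; the function names are likewise assumptions.

```python
# Hypothetical 32-bit format: | opcode (6) | rd (5) | rs (5) | imm (16, signed) |
OPCODE_BITS = 6
REG_BITS = 5
IMM_BITS = 16

def _check_unsigned(name, value, bits):
    if not 0 <= value < (1 << bits):
        raise ValueError(f"{name}={value} does not fit in {bits} bits")

def encode(opcode, rd, rs, imm):
    """Pack the four fields into one integer word, rejecting out-of-range values."""
    _check_unsigned("opcode", opcode, OPCODE_BITS)
    _check_unsigned("rd", rd, REG_BITS)
    _check_unsigned("rs", rs, REG_BITS)
    low, high = -(1 << (IMM_BITS - 1)), (1 << (IMM_BITS - 1)) - 1
    if not low <= imm <= high:
        raise ValueError(f"imm={imm} does not fit in {IMM_BITS} signed bits")
    word = opcode
    word = (word << REG_BITS) | rd
    word = (word << REG_BITS) | rs
    word = (word << IMM_BITS) | (imm & ((1 << IMM_BITS) - 1))
    return word

def decode(word):
    """Unpack a word into (opcode, rd, rs, imm), sign-extending the immediate."""
    imm = word & ((1 << IMM_BITS) - 1)
    if imm >> (IMM_BITS - 1):            # sign bit set: value is negative
        imm -= 1 << IMM_BITS
    word >>= IMM_BITS
    rs = word & ((1 << REG_BITS) - 1)
    word >>= REG_BITS
    rd = word & ((1 << REG_BITS) - 1)
    word >>= REG_BITS
    opcode = word & ((1 << OPCODE_BITS) - 1)
    return opcode, rd, rs, imm
```

For example, `decode(encode(35, 4, 1, -100))` round-trips to `(35, 4, 1, -100)`, while `encode(35, 4, 1, 40000)` raises `ValueError`: that constant is one of the cases a 16-bit field does not capture and would need extra instructions to build.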
Frequency of operations in the 80x86 Instruction Set

Rank    80x86 instruction          Integer average (% total executed)
1       load                       22%
2       conditional branch         20%
3       compare                    16%
4       store                      12%
5       add                         8%
6       and                         6%
7       sub                         5%
8       move register-register      4%
9       call                        1%
10      return                      1%
        Total                      96%

FIGURE 2.11 The top 10 instructions for the 80x86. Average values for 5 programs in SPECint92.

Measurements on Instructions for Control Flow

Frequency of branch classes:

Class                 Integer average   Floating-point average
Call/return           13%               11%
Jump                   6%                4%
Conditional branch    81%               87%

FIGURE 2.12 Breakdown of control flow instructions into three classes: calls or returns, jumps, and conditional branches.
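A running total over the rows of Figure 2.11 shows how quickly a few simple instructions come to dominate execution. The percentages below are copied from the table; the list layout and the function name `cumulative` are assumptions for this sketch.

```python
# Rows of Figure 2.11: (instruction, % of total executed, integer average).
TOP10 = [
    ("load", 22),
    ("conditional branch", 20),
    ("compare", 16),
    ("store", 12),
    ("add", 8),
    ("and", 6),
    ("sub", 5),
    ("move register-register", 4),
    ("call", 1),
    ("return", 1),
]

def cumulative(table):
    """Return (instruction, running total in %) for each row, in rank order."""
    running = 0
    result = []
    for name, percent in table:
        running += percent
        result.append((name, running))
    return result

totals = cumulative(TOP10)
```

The first four rows (load, conditional branch, compare, store) already reach 70%. The listed entries add up to 95 rather than the 96% shown in the Total row, presumably because each entry is rounded individually.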