Reduced Instruction Set Computers
Raul Queiroz Feitosa
Parts of these slides are from the support material provided by W. Stallings

Objective
To provide an overview of the innovations in computer organization and architecture related to Reduced Instruction Set Computers.

Outline
• Historical Overview
• Instruction Execution Characteristics
• Use of Large Register File
• Reduced Instruction Set Architecture
• RISC Pipelining

Driving forces for CISC
• Software costs far exceed hardware costs
• Increasingly complex high level languages
• Semantic gap
• Leads to:
  - Inefficient code
  - Excessive machine program size
  - Compiler complexity
• Small register sets

Driving forces for CISC
• Access to control memory faster than to external memory
• Leads to:
  - Move complexity to microcode
  - Larger and more powerful instruction sets
  - More addressing modes
  - Hardware implementations of HLL statements, e.g. CASE (switch) on VAX

Changes toward RISC
• Semiconductor technology and cache memories → reduced memory access time
• Compiler technology evolved → more intelligence built into compilers
• Pipelining → see later
• The dynamic behavior of programs began to be investigated

Frequency of HLL Operations
Procedure call/return is the most time-consuming operation in typical HLL programs.

Operands
Furthermore, 80% of the scalars are local to procedures → optimisation should concentrate on accessing local variables.

Procedure Calls
Registers are saved on each call and restored on each return → very time consuming.
Programs are mostly confined to a narrow window of procedure invocation depth.

Procedure Calls
Procedures typically use few passed parameters and few local variables.

Implications
• Best support is given by optimising the most used and most time-consuming features
• Large number of registers
  - Operand referencing
• Careful design of pipelines
  - Branch prediction etc.
• Simplified (reduced) instruction set
• Move complexity to compiler

Large Register File
• Software solution
  - Require compiler to allocate registers
  - Allocate based on the most used variables in a given time
  - Requires sophisticated program analysis
• Hardware solution
  - Have more registers
  - Thus more variables will be in registers

SW Based Register Optimization
• Assume a small number of registers (16-32)
• Optimizing their use is up to the compiler
• HLL programs have no explicit references to registers
  - usually (but consider register int in C)
• Assign a symbolic or virtual register to each candidate variable
• Map the (unlimited) symbolic registers to real registers
• Symbolic registers whose use does not overlap can share a real register
• If the real registers run out, some variables are kept in memory

Graph Coloring
[Figure: time sequence of active use of symbolic registers A to F, the resulting register interference graph, and actual registers R1 to R3]
• Symbolic registers that are simultaneously in use are connected by an edge and are assigned different colors
• The aim is to minimize the number of different colors
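
The mapping from symbolic to real registers can be sketched as a greedy coloring of the interference graph. This is a minimal illustration, not the allocator of any particular compiler: the edge list below is hypothetical (it is not read off the slide's figure), and the function names and the highest-degree-first ordering are assumptions made for the example.

```python
def build_graph(edges):
    """Build an undirected interference graph from (u, v) pairs."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def color_registers(graph, num_real):
    """Greedy coloring: returns (assignment, spilled).

    assignment maps a symbolic register to a real register name;
    spilled lists symbolic registers that must live in memory.
    """
    real = [f"R{i + 1}" for i in range(num_real)]
    assignment, spilled = {}, []
    # Heuristic (an assumption of this sketch): handle the most
    # constrained symbolic registers first, ties broken by name.
    for node in sorted(graph, key=lambda v: (-len(graph[v]), v)):
        taken = {assignment[nb] for nb in graph[node] if nb in assignment}
        free = [r for r in real if r not in taken]
        if free:
            assignment[node] = free[0]
        else:
            spilled.append(node)  # no color left: keep it in memory
    return assignment, spilled


# Hypothetical interference edges between symbolic registers A..F.
EDGES = [("A", "B"), ("B", "C"), ("B", "D"), ("C", "D"),
         ("D", "E"), ("D", "F"), ("E", "F")]
GRAPH = build_graph(EDGES)

if __name__ == "__main__":
    print(color_registers(GRAPH, 3))
    print(color_registers(GRAPH, 2))
```

With three real registers this example graph is colored without spills; with two, some symbolic registers fall back to memory, which is the "run out of real registers" case above.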

HW Solution - Register Window
The register set is split into windows; just one window is visible at a time.
A window has three fields:
• Parameter registers: input parameters and returned results
• Local registers: local variables
• Temporary registers: input parameters and returned results of the procedure called by the current procedure
The temporary registers of the window at level J overlap the parameter registers of the window at level J+1.
[Figure: overlapping register windows at levels J and J+1]

Circular Buffer
• Only one register window is visible: the one pointed to by the current window pointer (CWP)
• Register references are offset by CWP
• If procedure E calls F, the arguments for F are placed in E.t, and CWP advances one window
• The saved window pointer (SWP) identifies the window most recently saved in memory
• If procedure F calls another one, CWP=SWP, an interrupt occurs, and the A window is saved
[Figure: circular buffer of overlapping windows W A to W F with fields A.p, A.l, A.t through F.p, F.l, F.t; the current window pointer (CWP) and saved window pointer (SWP); call/save and return/restore directions]
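
The CWP/SWP bookkeeping of the circular buffer can be sketched as a small simulation. This is a simplified model under stated assumptions, not a description of any real processor: the class and method names are invented for illustration, each window is reduced to the name of the procedure that owns it, and SWP is taken to mark the slot just behind the oldest resident window, so a buffer of N windows keeps at most N-1 activations resident. The exact call at which the trap fires therefore may differ from the slide's figure.

```python
class RegisterWindowFile:
    """Toy model of a circular buffer of register windows."""

    def __init__(self, num_windows=6):
        self.n = num_windows
        self.slots = [None] * num_windows  # owner of each window
        self.cwp = 0                       # current window pointer
        self.swp = num_windows - 1         # assumed: slot behind oldest window
        self.memory = []                   # windows saved to memory (a stack)
        self.traps = 0                     # overflow + underflow interrupts

    def start(self, name):
        self.slots[self.cwp] = name

    def call(self, name):
        nxt = (self.cwp + 1) % self.n
        if nxt == self.swp:
            # Overflow: save the oldest resident window to memory.
            oldest = (self.swp + 1) % self.n
            self.memory.append(self.slots[oldest])
            self.slots[oldest] = None
            self.swp = oldest
            self.traps += 1
        self.cwp = nxt
        self.slots[nxt] = name

    def ret(self):
        self.slots[self.cwp] = None
        prev = (self.cwp - 1) % self.n
        if prev == self.swp:
            # Underflow: the caller's window was saved; restore it.
            self.slots[prev] = self.memory.pop()
            self.swp = (self.swp - 1) % self.n
            self.traps += 1
        self.cwp = prev

    def active(self):
        return self.slots[self.cwp]


def demo():
    """Call chain A -> B -> C -> D -> E -> F, then return to A."""
    rw = RegisterWindowFile(6)
    rw.start("A")
    for name in "BCDEF":
        rw.call(name)
    saved_at_depth = list(rw.memory)
    for _ in range(5):
        rw.ret()
    return saved_at_depth, rw.active(), rw.traps
```

Because programs tend to stay within a narrow band of invocation depth, a call pattern that oscillates around a fixed depth triggers few such traps in this model.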

Global Variables
• Allocated by the compiler to memory
  - Inefficient for frequently accessed variables
• Have a set of registers for global variables

Registers × Cache

RISC Characteristics
1. One instruction per cycle
2. Register to register operations
   Ex.: addu r1,r2,r4      /* add unsigned r2 to r4 and put the result in r1 */
        addu r1,#imm(r4)   /* FORBIDDEN: add unsigned r1 to memory at address r4 offset #imm */
3. Memory access only through Load/Store
4. Few, simple addressing modes
   Ex.: lw r2,128(r3)      /* load the word at address r3 + 128 into r2 */

RISC Characteristics
5. Few, simple, fixed instruction formats
   Ex.: MIPS R4000 (contrast: Intel x86, with many variable-length formats)

   Format             Fields (width in bits)
   I-type (immediate) Operation (6) | rs (5) | rt (5) | Immediate (16)
   J-type (jump)      Operation (6) | Target (26)
   R-type (register)  Operation (6) | rs (5) | rt (5) | rd (5) | Shift (5) | Function (6)

   Operation  Operation code
   rs         Source register specifier
   rt         Source/destination register specifier
   Immediate  Immediate, branch, or address displacement
   Target     Jump target address
   rd         Destination register specifier
   Shift      Shift amount
   Function   ALU/shift function specifier

RISC Characteristics
6. Hardwired design (no microcode)
7. More compile time/effort
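
The fixed 32-bit MIPS formats make encoding and decoding a matter of a few shifts and masks. The sketch below packs and unpacks the R-, I-, and J-type fields; the function names are invented for illustration, and the decoder makes the simplifying assumption that opcode 0 means R-type and opcodes 2 and 3 (j, jal) mean J-type, with everything else treated as I-type.

```python
def encode_r(op, rs, rt, rd, shift, funct):
    """R-type: op(6) rs(5) rt(5) rd(5) shift(5) funct(6)."""
    return ((op << 26) | (rs << 21) | (rt << 16)
            | (rd << 11) | (shift << 6) | funct)


def encode_i(op, rs, rt, imm):
    """I-type: op(6) rs(5) rt(5) immediate(16)."""
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


def encode_j(op, target):
    """J-type: op(6) target(26)."""
    return (op << 26) | (target & 0x3FFFFFF)


def decode(word):
    """Split a 32-bit word into named fields (simplified format choice)."""
    op = (word >> 26) & 0x3F
    if op == 0:  # assumed: opcode 0 selects the R-type format
        return {"format": "R", "op": op,
                "rs": (word >> 21) & 0x1F, "rt": (word >> 16) & 0x1F,
                "rd": (word >> 11) & 0x1F, "shift": (word >> 6) & 0x1F,
                "funct": word & 0x3F}
    if op in (2, 3):  # j and jal
        return {"format": "J", "op": op, "target": word & 0x3FFFFFF}
    return {"format": "I", "op": op,
            "rs": (word >> 21) & 0x1F, "rt": (word >> 16) & 0x1F,
            "imm": word & 0xFFFF}


# addu r1,r2,r4 : opcode 0, function code 0x21
ADDU = encode_r(0, 2, 4, 1, 0, 0x21)
# lw r2,128(r3) : opcode 0x23, base register r3, target register r2
LW = encode_i(0x23, 3, 2, 128)

if __name__ == "__main__":
    print(hex(ADDU), decode(ADDU))
    print(hex(LW), decode(LW))
```

Every field sits at a fixed bit position, so the decoder needs no length calculation before it can extract the register specifiers, which is part of what makes hardwired decoding practical.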

RISC Pipelining
• Delayed branch
• Delayed load
  - Register to be the target is locked by the processor
  - Continue execution of the instruction stream until the register is required
  - Idle until the load is complete
  - Re-arranging instructions can allow useful work whilst loading
• Loop unrolling

Ex.: load completes after 2 instruction cycles
   Load  rA ← M1
   Load  rB ← M2
   Load  rC ← M3
   Load  rD ← M4
   Add   rE ← rA+rB
   NOOP
   Add   rF ← rC+rD
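
The benefit of re-arranging instructions around a delayed load can be checked with a small cycle counter. This is a sketch under explicit assumptions: one instruction issues per cycle, a loaded register can first be read three cycles after the load issues (the issue cycle plus the two-cycle load delay of the example), an ALU result is readable in the next cycle, and the processor idles until a locked register is ready. The function name and the tuple format of the program are invented for illustration.

```python
LOAD_DELAY = 2  # assumed extra cycles before a loaded register is usable


def run(program, load_delay=LOAD_DELAY):
    """Return (cycle of last issue, idle cycles) for a list of
    (op, destination, sources) tuples executed in order."""
    ready = {}   # register -> first cycle in which it may be read
    cycle = 0
    stalls = 0
    for op, dest, sources in program:
        issue = cycle + 1
        needed = max((ready.get(reg, 0) for reg in sources), default=0)
        if needed > issue:
            # Target register still locked: idle until the load completes.
            stalls += needed - issue
            issue = needed
        ready[dest] = issue + 1 + (load_delay if op == "load" else 0)
        cycle = issue
    return cycle, stalls


# Each add placed directly after the loads it depends on.
NAIVE = [
    ("load", "rA", ()), ("load", "rB", ()),
    ("add", "rE", ("rA", "rB")),
    ("load", "rC", ()), ("load", "rD", ()),
    ("add", "rF", ("rC", "rD")),
]

# All loads hoisted to the front, as in the example above.
SCHEDULED = [
    ("load", "rA", ()), ("load", "rB", ()),
    ("load", "rC", ()), ("load", "rD", ()),
    ("add", "rE", ("rA", "rB")),
    ("add", "rF", ("rC", "rD")),
]

if __name__ == "__main__":
    print("naive:", run(NAIVE))
    print("scheduled:", run(SCHEDULED))
```

Under these assumptions the hoisted order leaves a single idle slot before the second add, which corresponds to the NOOP in the example, while the naive order idles before both adds.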

Loop Unrolling
• Replicate the body of the loop a number of times
• Iterate the loop fewer times
• In consequence
  - Reduces loop overhead
  - Increases instruction parallelism
  - Improves register, data cache or TLB locality

Loop Unrolling (2×) Example
The code
   do i=2, n-1
      a[i] = a[i] + a[i-1] * a[i+1]
   end do
becomes
   do i=2, n-2, 2
      a[i]   = a[i]   + a[i-1] * a[i+1]
      a[i+1] = a[i+1] + a[i]   * a[i+2]
   end do
   if (mod(n-2,2) = 1) then
      a[n-1] = a[n-1] + a[n-2] * a[n]
   end if
Benefits:
1. Loop overhead is halved
2. The second assignment can proceed while the result of the first is stored and the loop variable is updated → increased parallelism
3. Variables are used twice in the loop body → improved locality
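
That the 2× unrolled loop computes the same result as the original, including the clean-up step for an odd iteration count, can be checked directly. The sketch below is a Python transcription for verification only (Python itself gains nothing from unrolling); the function names are invented, index 0 of the list is left unused to keep the 1-based indexing of the example, and the guard n >= 3 is an addition needed because Python's modulo of a negative number is non-negative.

```python
def rolled(values):
    """Original loop: do i=2, n-1."""
    a = list(values)
    n = len(a) - 1                 # elements live in a[1..n]; a[0] unused
    for i in range(2, n):          # i = 2 .. n-1
        a[i] = a[i] + a[i - 1] * a[i + 1]
    return a


def unrolled(values):
    """2x unrolled loop: do i=2, n-2, 2 plus a clean-up iteration."""
    a = list(values)
    n = len(a) - 1
    for i in range(2, n - 1, 2):   # i = 2, 4, ... <= n-2
        a[i] = a[i] + a[i - 1] * a[i + 1]
        a[i + 1] = a[i + 1] + a[i] * a[i + 2]
    # Odd number of original iterations: one is left over at i = n-1.
    if n >= 3 and (n - 2) % 2 == 1:
        a[n - 1] = a[n - 1] + a[n - 2] * a[n]
    return a


def make_input(n):
    """Hypothetical test data: a[1..n] = 1..n, with a[0] as padding."""
    return [0] + list(range(1, n + 1))


if __name__ == "__main__":
    for n in range(1, 10):
        print(n, rolled(make_input(n)) == unrolled(make_input(n)))
```

The second assignment reads the a[i] just written by the first, so the unrolled body preserves the loop-carried dependence of the original.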