Superscalar Design: An Introduction
Virendra Singh
Associate Professor
Computer Architecture and Dependable Systems Lab (CADSL)
Department of Electrical Engineering
Indian Institute of Technology Bombay
http://www.ee.iitb.ac.in/~viren/
E-mail: viren@ee.iitb.ac.in
EE-739: Processor Design
Lecture 24 (12 March 2013)

Superscalar Pipeline Stages
Fetch → Instruction Buffer → Decode → Dispatch Buffer → Dispatch → Issuing Buffer → Execute → Completion Buffer → Complete → Store Buffer → Retire
• In program order: Fetch, Decode, Dispatch
• Out of order: Execute
• In program order: Complete, Retire
14 Mar 2013 EE-739@IITB 2 CADSL

Superscalar Architecture
• Wide pipelines to exploit ILP
• ILP is not necessarily exploited by widening the pipelines and adding more resources
• Processor policies for fetching, decoding, and executing instructions have a significant effect on the processor’s ability to discover instructions that can be executed concurrently
• The instruction issue policy limits or enhances performance because it determines the processor’s look-ahead capability

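The effect of issue policy on look-ahead can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not a description of any real processor: every instruction has a one-cycle latency, a result becomes usable the cycle after its producer issues, and the function name `cycles_to_issue` and its parameters are hypothetical.

```python
def cycles_to_issue(deps, width, lookahead, in_order):
    """Count cycles needed to issue all instructions in a toy model.

    deps[i] is the set of earlier instruction indices that instruction i
    depends on. Assumption: a result is usable one cycle after its
    producer issues. `lookahead` is how many of the oldest unissued
    instructions the issue logic examines each cycle.
    """
    n = len(deps)
    issued_at = [None] * n
    cycle = 0
    while None in issued_at:
        window = [i for i in range(n) if issued_at[i] is None][:lookahead]
        slots = width
        for i in window:
            if slots == 0:
                break
            ready = all(
                issued_at[d] is not None and issued_at[d] < cycle
                for d in deps[i]
            )
            if ready:
                issued_at[i] = cycle
                slots -= 1
            elif in_order:
                # In-order issue stalls at the first instruction that is not ready.
                break
        cycle += 1
    return cycle


# A dependence chain 0 -> 1 -> 2 followed by three independent instructions.
program = [set(), {0}, {1}, set(), set(), set()]
in_order_cycles = cycles_to_issue(program, width=2, lookahead=4, in_order=True)
out_of_order_cycles = cycles_to_issue(program, width=2, lookahead=4, in_order=False)
print(in_order_cycles, out_of_order_cycles)  # 4 3
```

With the same issue width, the out-of-order policy finishes sooner here only because it is allowed to look past the stalled chain and find independent work.
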
Issues in Decoding
• Primary tasks
  – Identify individual instructions (!)
  – Determine instruction types
  – Determine dependences between instructions
• Two important factors
  – Instruction set architecture
  – Pipeline width

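The dependence-detection task can be sketched as a pairwise register comparison across the instructions decoded together. The `Instr` tuple and the `dependences` function are hypothetical names used only for illustration. The check is deliberately simple and conservative: it compares every pair and ignores intervening writes.

```python
from collections import namedtuple

# Hypothetical decoded-instruction record: one destination, a tuple of sources.
Instr = namedtuple("Instr", "op dest srcs")


def dependences(group):
    """Return (earlier, later, kind) for each register dependence in a decode group."""
    found = []
    for j, later in enumerate(group):
        for i in range(j):
            earlier = group[i]
            if earlier.dest is not None and earlier.dest in later.srcs:
                found.append((i, j, "RAW"))  # later reads what earlier writes
            if later.dest is not None and later.dest in earlier.srcs:
                found.append((i, j, "WAR"))  # later overwrites what earlier reads
            if later.dest is not None and later.dest == earlier.dest:
                found.append((i, j, "WAW"))  # both write the same register
    return found


group = [
    Instr("add", "r1", ("r2", "r3")),
    Instr("sub", "r4", ("r1", "r5")),
    Instr("mul", "r2", ("r6", "r7")),
    Instr("or", "r1", ("r8", "r9")),
]
result = dependences(group)
print(result)
# [(0, 1, 'RAW'), (0, 2, 'WAR'), (0, 3, 'WAW'), (1, 3, 'WAR')]
```

The nested loop compares every pair of instructions, so the number of comparisons grows roughly with the square of the group size. That is one way in which pipeline width affects decode complexity.
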
Pentium Pro Fetch/Decode
[Figure]

Predecoding in the AMD K5
[Figure]

Instruction Dispatching
• Diversified pipeline
  – Different types of instructions are executed by different functional units (FUs) in different pipelines
• Distributed control
• Operands are fetched from the register file (RF)
  – Operands may not be available yet, so the instruction waits in a reservation station

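How a reservation station holds an instruction until its operands arrive can be sketched as follows. This is a simplified illustration under assumptions, not the design from the lecture: the class and method names are hypothetical, `pending` maps a register name to the tag of the in-flight instruction that will produce it, and results are delivered by broadcasting a (tag, value) pair.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Operand:
    value: Optional[int] = None  # meaningful only once the operand is ready
    tag: Optional[str] = None    # producer tag while the operand is outstanding

    @property
    def ready(self):
        return self.tag is None


@dataclass
class Entry:
    op: str
    operands: List[Operand]


class ReservationStation:
    def __init__(self):
        self.entries = []

    def dispatch(self, op, sources, regfile, pending):
        """Read available operands from the RF; record a tag for the rest."""
        operands = []
        for reg in sources:
            if reg in pending:
                operands.append(Operand(tag=pending[reg]))
            else:
                operands.append(Operand(value=regfile[reg]))
        self.entries.append(Entry(op, operands))

    def broadcast(self, tag, value):
        """A finishing FU broadcasts its tag; waiting operands capture the value."""
        for entry in self.entries:
            for operand in entry.operands:
                if operand.tag == tag:
                    operand.value, operand.tag = value, None

    def issue_ready(self):
        """Remove and return every entry whose operands are all ready."""
        ready, waiting = [], []
        for entry in self.entries:
            target = ready if all(o.ready for o in entry.operands) else waiting
            target.append(entry)
        self.entries = waiting
        return [(e.op, [o.value for o in e.operands]) for e in ready]


rs = ReservationStation()
regfile = {"r2": 5, "r3": 7}
pending = {"r1": "T1"}  # r1 is still being computed by the instruction tagged T1
rs.dispatch("add", ["r1", "r2"], regfile, pending)
rs.dispatch("sub", ["r2", "r3"], regfile, pending)
first = rs.issue_ready()   # only "sub" has all of its operands
rs.broadcast("T1", 9)
second = rs.issue_ready()  # "add" becomes ready after the broadcast
print(first, second)
```

In this sketch the younger `sub` issues before the older `add`. That reordering is what buffering between dispatch and execution makes possible.
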
Instruction Dispatch and Issue
• Parallel pipeline
  – Centralized instruction fetch
  – Centralized instruction decode
• Diversified pipeline
  – Distributed instruction execution

Necessity of Instruction Dispatch
[Figure]

Centralized Reservation Station
[Figure]

Distributed Reservation Station
[Figure]

Issues in Instruction Execution
• Current trends
  – More parallelism (bypassing is very challenging)
  – Deeper pipelines
  – More diversity
• Functional unit types
  – Integer
  – Floating point
  – Load/store (most difficult to make parallel)
  – Branch
  – Specialized units (media)
• Very wide datapaths (256 bits/register or more)

Bypass Networks
[Figure: block diagram with I-Cache, PC, Fetch Q, BR Scan, BR Predict, Decode, Reorder Buffer, four Issue Qs (FP, FX/LD 1, FX/LD 2, BR/CR), FP1/FP2, FX1/FX2, LD1/LD2, CR and BR units, StQ, D-Cache]
• O(n²) interconnect from/to FU inputs and outputs
• Associative tag-match to find operands
• Solutions (hurt IPC, help cycle time)
  – Use RF only (IBM Power4) with no bypass network
  – Decompose into clusters (Alpha 21264)

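The quadratic growth of the bypass network, and why clustering helps, can be seen with a simple count. The model is an assumption made for illustration: each functional unit has one result output and a fixed number of operand inputs, a full bypass connects every output to every input, and clusters forward only internally. The function names are hypothetical.

```python
def full_bypass_paths(n_units, inputs_per_unit=2):
    """Paths needed to forward every FU output to every FU operand input."""
    return n_units * n_units * inputs_per_unit


def clustered_bypass_paths(n_units, n_clusters, inputs_per_unit=2):
    """Paths needed when full bypassing exists only inside each cluster."""
    assert n_units % n_clusters == 0, "sketch assumes equal-sized clusters"
    per_cluster = n_units // n_clusters
    return n_clusters * full_bypass_paths(per_cluster, inputs_per_unit)


print(full_bypass_paths(4))          # 32
print(full_bypass_paths(8))          # 128: doubling the units quadruples the paths
print(clustered_bypass_paths(8, 2))  # 64: two clusters of four units each
```

The count ignores any paths between clusters. In a clustered design, a value that crosses clusters typically arrives later than one forwarded locally, which matches the trade-off on the slide: fewer and shorter paths help cycle time at some cost in IPC.
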