superscalar design
play

Superscalar Design: An Introduction Virendra Singh Associate - PowerPoint PPT Presentation

Superscalar Design: An Introduction Virendra Singh Associate Professor C omputer A rchitecture and D ependable S ystems L ab Department of Electrical Engineering Indian Institute of Technology Bombay http://www.ee.iitb.ac.in/~viren/ E-mail:


  1. Superscalar Design: An Introduction Virendra Singh Associate Professor C omputer A rchitecture and D ependable S ystems L ab Department of Electrical Engineering Indian Institute of Technology Bombay http://www.ee.iitb.ac.in/~viren/ E-mail: viren@ee.iitb.ac.in EE-739: Processor Design Lecture 23 (11 March 2013) CADSL

  2. Superscalar Pipeline Stages Superscalar Pipeline Stages Fetch Instruction Buffer Decode In Program Order Dispatch Buffer Dispatch Issuing Buffer Out Execute of Order Completion Buffer Complete In Program Store Buffer Order Retire 11 Mar 2013 EE-739@IITB 2 CADSL

  3. Superscalar Architecture  Wide pipelines for enhanced throughput  ILP is not necessarily exploited by widening the pipelines and adding more resources  Processor policies towards fetching decoding, and executing instruction have significant effect on its ability to discover instructions which can be executed concurrently  Instruction issue policy limits or enhances performance because it determines the processor’s look ahead capability 07 Mar 2013 EE-739@IITB 3 CADSL

  4. Highway 11 Mar 2013 EE-739@IITB 4 CADSL

  5. Bad Traffic 11 Mar 2013 EE-739@IITB 5 CADSL

  6. Instruction Flow Instruction Flow  Objective: Fetch multiple instructions per cycle • Challenges : PC  Branches: control dependences Instruction Memory  Branch target misalignment 3 instructions fetched  Instruction cache misses 11 Mar 2013 EE-739@IITB 6 CADSL

  7. Instruction Fetch  Fetch s instructions from I-cache  I-Cache must be wide enough that each row of the I-Cache array can store s instructions and that an entire row can be accessed  Fetch width = Row width  Assume access latency is 1 cycle 11 Mar 2013 EE-739@IITB 7 CADSL

  8. I-Cache Organization I-Cache Organization Tag Tag D Tag E C Tag 1 cache line = 1 physical row 11 Mar 2013 EE-739@IITB 8 CADSL

  9. I-Cache Organization I-Cache Organization Tag D E Tag C Tag 1 cache line = 2 physical rows 11 Mar 2013 EE-739@IITB 9 CADSL

  10. Instruction Flow Instruction Flow  Objective: Fetch multiple instructions per cycle • Challenges: – Branches: control dependences – Branch target misalignment – Instruction cache misses • Solutions – Code alignment (static vs. dynamic) – Prediction/speculation 11 Mar 2013 EE-739@IITB 10 CADSL

  11. Fetch Alignment Fetch Alignment 11 Mar 2013 EE-739@IITB 11 CADSL

  12. Instruction Fetch  2 – way set associative I-Cache with a line size of 16 instructions (64 bytes)  Each row of the I-Cache stores 4 associative sets 9two per set) of instructions  Each line of I-cache spans four physical rows  Physical I-cache array is actually composed of 4 independent sub-arrays  One instruction can be accessed form one array 11 Mar 2013 EE-739@IITB 12 CADSL

  13. RIOS-I Fetch Hardware RIOS-I Fetch Hardware 11 Mar 2013 EE-739@IITB 13 CADSL

  14. Issues in Decoding Issues in Decoding • Primary Tasks  Identify individual instructions (!)  Determine instruction types  Determine dependences between instructions • Two important factors  Instruction set architecture  Pipeline width 11 Mar 2013 EE-739@IITB 14 CADSL

  15. Pentium Pro Fetch/Decode Pentium Pro Fetch/Decode 11 Mar 2013 EE-739@IITB 15 CADSL

  16. Thank You 11 Mar 2013 EE-739@IITB 16 CADSL

Recommend


More recommend