
Complexity-Effective Issue Queue Design Under Load-Hit Speculation



  1. Complexity-Effective Issue Queue Design Under Load-Hit Speculation
     Tali Moreshet and R. Iris Bahar
     Brown University, Division of Engineering

  2. Motivation
     - Pipelines are getting deeper
     - Higher clock frequencies
     - Increased architectural complexity
     - Speculatively issued instructions are particularly sensitive to pipeline depth
     - Branch prediction
     - Load hit prediction
     WCED 2002 Brown University

  3. Pipeline Load Resolution Loop
     [Block diagram: Instruction Cache, Register Rename Unit, Issue Queue, Register File, Functional Units, Data Cache, with a forwarding path; pipeline stages labeled Fetch, Decode, Issue, Execute.]

  4. Load Hit Prediction
     - Issue instructions dependent on a load as soon as possible: assume the load hits in the DL1
     - BUT: load hit status is known only after dependent instructions may have issued

  5. Example
     [Timing diagram over cycles 1 to 8: LOAD (Issue, Exec, Exec, Exec), ADD (Issue, Exec), SUB (Issue, Exec), MULT (Issue, Exec); the speculative window is marked.]

  6. Example
     [Timing diagram over cycles 1 to 9: LOAD (Issue, Exec, Exec, Exec), ADD (Issue, Exec), SUB (Issue, Exec), MULT (Issue, Exec); the speculative window is marked.]

  7. Example
     [Timing diagram over cycles 1 to 10: LOAD (Issue, Exec, Exec, Exec), ADD (Issue, Exec), SUB (Issue, Exec), MULT (Exec, Issue, in the order extracted); the speculative window is marked.]

  8. What Happens On a Load Miss?
     - Re-issue instructions in the speculative window after a load miss
     - Keep post-issue instructions in the issue queue long enough to ensure re-issuing will not be necessary
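
The retention rule on this slide can be sketched in a few lines. This is a minimal illustration and not the authors' simulator: the function names and the `resolve_delay` parameter (cycles from load issue until its hit/miss status is known) are assumptions introduced here, and only a single outstanding load is modeled.

```python
def in_speculative_window(issue_cycle, load_issue_cycle, resolve_delay):
    """True if an instruction issued after the load but before the load's
    hit/miss status is known (resolve_delay is an assumed parameter)."""
    resolve_cycle = load_issue_cycle + resolve_delay
    return load_issue_cycle < issue_cycle < resolve_cycle


def earliest_release_cycle(issue_cycle, load_issue_cycle, resolve_delay):
    """Earliest cycle at which the entry may leave the issue queue.

    An instruction issued inside the speculative window stays in the queue
    as a post-issue entry until the load resolves, so that it can be
    re-issued on a miss. In this simplified model, any other instruction
    can leave as soon as it issues.
    """
    if in_speculative_window(issue_cycle, load_issue_cycle, resolve_delay):
        return load_issue_cycle + resolve_delay
    return issue_cycle
```

With a load issued in cycle 1 and an assumed `resolve_delay` of 4, an instruction issued in cycle 3 occupies its entry until cycle 5. A longer resolve delay lengthens that stay, which is how deeper pipelines fill the queue with post-issue entries.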

  9. Complexity-Effective Load Hit Speculation
     As pipeline depth increases:
     - Retain performance benefit
     - Consider complexity of re-issue and prediction policies
     - Consider impact on issue queue design

  10. Re-Issue Policies
     - 4 different load hit speculation policies:
       1) No load hit speculation
       2) Perfect load hit speculation
       3) Replay only instructions dependent on the load that missed
       4) Replay all instructions in the speculative window
     - Load hit/miss predictor to limit re-issuing
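
The difference between policies 3 and 4 can be illustrated with a small sketch. The function name, the policy labels `"dependent"` and `"sequential"`, and the register dependences in the example are all hypothetical; the slides do not state which operands ADD, SUB, and MULT actually read.

```python
def replay_set(window, missed_load, policy):
    """Return the names of instructions to re-issue after `missed_load` misses.

    window: list of (name, sources) tuples in issue order, covering the
            instructions issued inside the speculative window.
    policy: "dependent" replays only instructions that depend, directly or
            transitively, on the missed load; "sequential" replays
            everything in the window.
    """
    if policy == "sequential":
        return [name for name, _ in window]
    if policy != "dependent":
        raise ValueError(f"unknown policy: {policy}")
    tainted = {missed_load}          # results derived from the missed load
    replay = []
    for name, sources in window:     # issue order, so producers come first
        if tainted & set(sources):
            tainted.add(name)
            replay.append(name)
    return replay


# Hypothetical dependences: ADD reads the load, SUB reads ADD, MULT is independent.
window = [("ADD", ["LOAD"]), ("SUB", ["ADD"]), ("MULT", ["R7"])]
print(replay_set(window, "LOAD", "dependent"))   # ['ADD', 'SUB']
print(replay_set(window, "LOAD", "sequential"))  # ['ADD', 'SUB', 'MULT']
```

The sequential policy needs no dependence tracking but re-issues independent work such as MULT; the dependent policy replays less at the cost of tracking the dependence chain.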

  11. Performance Impact
     [Chart: performance increase from no load speculation (y-axis, -5% to 45%) for Exe1, Exe3, Exe5, Exe7 (x-axis). Series: Perfect_Int, Dep_Int, Dep_Pred_Int, Seq_Int, Seq_Pred_Int, Perfect_FP, Dep_FP, Dep_Pred_FP, Seq_FP, Seq_Pred_FP.]

  12. Impact on Issue Queue Occupancy
     [Chart: average number of instructions in the issue queue (y-axis, 0 to 40), split into pre-issue and post-issue. Categories: no load speculation, integer benchmarks; no load speculation, floating point benchmarks; dependent load speculation, integer benchmarks; dependent load speculation, floating point benchmarks.]

  13. Impact on Issue Queue Occupancy
     [Chart: percentage of post-issue instructions in the issue queue (y-axis, 0% to 70%) for Exe1, Exe3, Exe5, Exe7 (x-axis). Series: compress, ijpeg, bzip, Int_avg, apsi, swim, art, wupwise, FP_avg.]

  14. Impact on Issue Queue Occupancy
     As pipeline depth increases:
     - Issue queue gets cluttered with post-issue instructions (average 55%)
     - Limits the available ILP
     - Inefficient use of complexity in instruction bid/grant arbitration logic

  15. The Bid / Grant Loop
     [Diagram: an issue queue of M entries, each with a req line and a grant line. Entries bid for an issue slot; an N-wide Prioritize & Select block broadcasts the grants back to the queue.]
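
One round of the bid/grant loop might be modeled as below. This is a sketch under stated assumptions: the entry fields (`age`, `ready`, `issued`) and the oldest-first priority are choices made for illustration, since the slide only says "Prioritize & Select".

```python
def select(entries, issue_width):
    """One bid/grant round over the issue queue.

    entries: list of dicts with keys "age" (smaller is older), "ready",
             and "issued". Ready, not-yet-issued entries bid for a slot.
    Returns the indices of the entries granted an issue slot.
    Oldest-first priority is an assumption made for this sketch.
    """
    bidders = [i for i, e in enumerate(entries) if e["ready"] and not e["issued"]]
    bidders.sort(key=lambda i: entries[i]["age"])
    granted = bidders[:issue_width]
    for i in granted:
        entries[i]["issued"] = True  # now post-issue, but still occupying its entry
    return granted


iq = [
    {"age": 3, "ready": True, "issued": False},
    {"age": 0, "ready": True, "issued": True},    # post-issue entry: never bids
    {"age": 1, "ready": False, "issued": False},  # operands not ready yet
    {"age": 2, "ready": True, "issued": False},
]
print(select(iq, 1))  # [3]
```

The scan covers all M entries even though post-issue entries never bid, so the arbitration logic is sized for entries that cannot contribute an instruction to issue.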

  16. Issue Queue Utilization Problem
     - Complexity of bid/grant arbitration logic increases with size of the IQ
     - IQ consists largely of post-issue instructions
     - Limiting the available ILP that a large IQ is supposed to provide
     - Not a complexity-effective design
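
A back-of-the-envelope calculation shows the scale of the problem. Pairing the 55% average post-issue share from the occupancy slide with the 64-entry queue used as the baseline in the later comparisons is an illustrative assumption made here; the slides do not state that the two figures come from the same configuration.

```python
def pre_issue_capacity(iq_size, post_issue_fraction):
    """Entries left for instructions that are still waiting to issue."""
    return iq_size * (1.0 - post_issue_fraction)


print(round(pre_issue_capacity(64, 0.55), 1))  # 28.8
```

Under that assumption, fewer than half of the entries the arbitration logic must cover would hold instructions that can still bid.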

  17. IQ Design Options
     - Increase the IQ size
       ☺ Improve performance – increase available ILP
       ☹ Increase complexity
     - Simplify arbitration logic – use slower circuitry
       ☺ Reduce complexity
       ☹ Hurt performance
     - Reduce IQ size
       ☺ Reduce complexity
       ☹ Hurt performance

  18. Double Latency of Issue Queue
     [Chart: performance increase from a 64-entry issue queue, dependent load speculation (y-axis, 0% to -70%) for Exe1, Exe3, Exe5, Exe7 (x-axis). Series: compress, ijpeg, bzip, Int_avg, apsi, swim, art, wupwise, FP_avg.]

  19. Smaller IQ (48 Entry)
     [Chart: performance increase from a 64-entry issue queue, dependent load speculation (y-axis, 5% to -25%) for Exe1, Exe3, Exe5, Exe7 (x-axis). Series: compress, ijpeg, bzip, Int_avg, apsi, swim, art, wupwise, FP_avg.]

  20. Complexity-Effective Issue Queue
     - Goal
       - Reduce complexity
       - Do not degrade performance
     - Solution: The Dual Issue Queue
       - Move post-issue instructions from main queue to separate replay queue
       - Increase available ILP
       - Reduce size of main IQ

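  21. Dual Issue Queue
     [Block diagram: instructions from the Fetch unit pass through the Register Rename Unit into the Main Issue Queue (MIQ), then to the Register File, Functional Units, and Data Cache. A separate Replay Issue Queue (RIQ) holds post-issue instructions and is driven by a Replay_req signal.]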
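
The split between the two queues could be modeled roughly as follows. This toy class is a sketch, not the design evaluated in the talk: the class and method names and the queue sizes are hypothetical, operand readiness is ignored, a single outstanding load is assumed, and on a miss every parked instruction is replayed.

```python
from collections import deque


class DualIssueQueue:
    """Toy model: a main queue (MIQ) for pre-issue instructions and a replay
    queue (RIQ) for issued instructions whose load is still unresolved."""

    def __init__(self, miq_size, riq_size):
        self.miq_size = miq_size
        self.riq_size = riq_size
        self.miq = deque()
        self.riq = deque()

    def dispatch(self, instr):
        """Insert a renamed instruction into the MIQ; False if the MIQ is full."""
        if len(self.miq) >= self.miq_size:
            return False
        self.miq.append(instr)
        return True

    def issue(self):
        """Issue the oldest MIQ entry and park it in the RIQ.
        Returns None if the MIQ is empty or the RIQ has no free entry."""
        if not self.miq or len(self.riq) >= self.riq_size:
            return None
        instr = self.miq.popleft()   # frees a main-queue entry immediately
        self.riq.append(instr)
        return instr

    def resolve(self, load_hit):
        """On a hit, release the parked instructions. On a miss, return them
        for re-issue; they stay parked until a later hit releases them."""
        if load_hit:
            self.riq.clear()
            return []
        return list(self.riq)


q = DualIssueQueue(miq_size=2, riq_size=4)
q.dispatch("ADD")
q.dispatch("SUB")
print(q.dispatch("MULT"))  # False: the MIQ is full
print(q.issue())           # ADD moves to the RIQ
print(q.dispatch("MULT"))  # True: issuing freed a main-queue entry
print(q.resolve(False))    # ['ADD'] must be replayed
```

Because an issued instruction leaves the MIQ at once, the main queue only holds instructions that still bid, which is what allows it to be smaller without giving up pre-issue capacity.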

  22. Dual Issue Queue Performance
     [Chart: performance increase from the standard issue queue, dependent load speculation (y-axis, -8% to 10%) for Exe1, Exe3, Exe5, Exe7 (x-axis). Series: compress, ijpeg, bzip, Int_avg, apsi, swim, art, wupwise, FP_avg.]

  23. Conclusion
     - Load hit speculation is critical for high performance in deeper pipelines
     - Larger percentage of post-issue instructions in issue queue
     - Complexity-effective issue queue scheme addresses utilization problem
     - For deepest pipelines, overall performance improves while reducing complexity of IQ
