Dual Path Instruction Processing

International Conference on Supercomputing (ICS 2002)
New York City, USA, June 2002

Dual Path Instruction Processing

Juan L. Aragón (1), José González (1, *), Antonio González (2, *) and James E. Smith (3)

(1) Dept. Ing. y Tecnología de Computadores, Universidad de Murcia
(2) Dept. d’Arquitectura de Computadors, Universitat Politècnica de Catalunya
(3) Dept. Electrical and Computing Eng., University of Wisconsin-Madison
(*) Currently at Intel Barcelona Research Center

e-mail: jlaragon@ditec.um.es
GACOP

Motivation

- Two ways of reducing performance degradation due to branch mispredictions
  - Improving prediction accuracy
  - Reducing the branch misprediction penalty
- Branch misprediction penalty
  - Deeper pipelines cause higher misprediction penalties
    - Pentium 4 (20 stages); Power 4 (14 stages)
    - Example: IPC slowdown of 22% for go when comparing a 20-stage pipeline with a 10-stage one, using a 32 KB gshare predictor

Motivation

- Causes of performance degradation after a branch misprediction
  - Pipeline must be squashed
  - Many cycles pass until new instructions can be issued
    - Front-end length
  - Instruction window is not full for many cycles
    - ILP cannot be fully exploited
  - Correct instructions cannot be scheduled ahead of a mispredicted branch

Outline

- Misprediction Penalty Analysis
- Proposal
  - Dual Path Instruction Processing (DPIP)
- Experimental Results
- Sensitivity Analysis
- Conclusions

Misprediction Penalty Analysis

- Three components
  - Pipeline-fill penalty
    - Delay between the misprediction and the moment the first correct instruction enters the window
    - Depends on pipeline length and recovery actions
  - Window-fill penalty
    - Window is empty for many cycles after a misprediction
    - ILP cannot be fully exploited
  - Serialization penalty
    - Correct instructions cannot be scheduled ahead of the mispredicted branch

Misprediction Penalty Analysis

- Analysis of each component

[Figure: IPC (axis 0 to 4), average of 10 selected benchmarks. Bars: Perfect Branch Pred.; Complete IW fill; Instant. F/D 1st group; Real pred., pipe 6; Real pred., pipe 10; Real pred., pipe 14]

Pipeline      Overall loss   Pipeline-fill penalty   Window-fill penalty   Serialization penalty
pipeline 6    25%            25%                     10%                   65%
pipeline 10   33%            44%                     7%                    49%
pipeline 14   39%            54%                     6%                    40%
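The percentages on this slide can be combined into each component's absolute contribution to the IPC loss. The sketch below assumes one particular reading of the table: the first figure in each row is the overall loss, and the other three are shares of that loss (they sum to 100% in every row). The names `ROWS`, `COMPONENTS` and `absolute_contribution` are illustrative and do not come from the slides.

```python
# Values from the slide under the assumed reading: overall IPC loss (%),
# followed by each component's share (%) of that loss.
ROWS = {
    6:  {"overall": 25, "pipeline_fill": 25, "window_fill": 10, "serialization": 65},
    10: {"overall": 33, "pipeline_fill": 44, "window_fill": 7,  "serialization": 49},
    14: {"overall": 39, "pipeline_fill": 54, "window_fill": 6,  "serialization": 40},
}
COMPONENTS = ("pipeline_fill", "window_fill", "serialization")


def absolute_contribution(stages):
    """Return each component's contribution in percentage points of IPC loss."""
    row = ROWS[stages]
    return {c: row["overall"] * row[c] / 100 for c in COMPONENTS}


if __name__ == "__main__":
    for stages in ROWS:
        parts = absolute_contribution(stages)
        print(stages, {c: round(v, 2) for c, v in parts.items()})
```

Under this reading, the pipeline-fill share grows with pipeline depth while the serialization share shrinks, which is in line with the proposal's focus on the pipeline-fill and window-fill penalties.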

Proposal

- Reduce pipeline-fill and window-fill penalties
- Dual Path Instruction Processing (DPIP)
  - Fetches, decodes and renames both paths
    - Reduces the pipeline-fill penalty
    - Hides the front-end stages
  - Alternative path instructions are pre-scheduled in an estimated execution order
    - Reduces the window-fill penalty
    - Similar effect to filling the window completely
  - Confidence estimation
    - Used to select the branches that must be forked
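The fork decision can be illustrated with a small sketch. The slides do not say which confidence estimator DPIP uses, so this example assumes a table of resetting saturating counters (in the style of the JRS estimator): a counter is incremented on a correct prediction and cleared on a misprediction, and a branch counts as low-confidence while its counter is below a threshold. The class name, table size, threshold and the `should_fork` helper are all hypothetical.

```python
class ResettingCounterEstimator:
    """Table of saturating counters; a counter resets to 0 on a misprediction.

    Assumed design, not taken from the slides.
    """

    def __init__(self, entries=1024, max_count=15, threshold=15):
        self.entries = entries
        self.max_count = max_count
        self.threshold = threshold
        self.table = [0] * entries

    def _index(self, pc):
        # Drop the low bits of a word-aligned PC and wrap into the table.
        return (pc >> 2) % self.entries

    def is_low_confidence(self, pc):
        return self.table[self._index(pc)] < self.threshold

    def update(self, pc, prediction_correct):
        i = self._index(pc)
        if prediction_correct:
            self.table[i] = min(self.table[i] + 1, self.max_count)
        else:
            self.table[i] = 0


def should_fork(estimator, pc, alternative_path_active):
    """Fork only low-confidence branches, and only if no second path is in flight."""
    return (not alternative_path_active) and estimator.is_low_confidence(pc)
```

The `alternative_path_active` check reflects the constraint stated on the block-diagram slide that DPIP manages only two paths at the same time.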

Related work

- Multiple path execution (MPE)
  - Fetch, decode and execute instructions from multiple paths
    - Selective Dual Path Execution (Heil & Smith, Tech. Report ’97)
    - PolyPath (Klauser et al., ISCA ’98)
    - Threaded Multiple Path Execution (Wallace et al., ISCA ’98)
  - Too expensive (drawbacks)
    - Aggressive fetch engines (allowing up to 8 different paths)
    - Bigger register files, instruction windows and ROBs
    - Complexity of selective flush
    - Resource contention: more functional units, memory ports, ...
    - Energy consumption: resources used by useless instructions
- DPIP does not execute instructions: a balance between complexity, cost, and performance

Dual Path Instruction Processing

- DPIP block diagram
- DPIP can only manage two paths at the same time

[Figure: DPIP block diagram. Components: I-cache; Fetch Unit; Decode Unit; RMT 1 and RMT 2; Free list 1 and Free list 2; Instruction Window (program order); LSQ 1 and LSQ 2; ROB 1 and ROB 2; Alternative Path Buffer (alternative path instructions); issue logic; Funct. Units]

DPIP

- Pre-scheduling alternative path instructions (Canal & Gonzalez, ICS 2000; Michaud & Seznec, HPCA 2001)

[Figure: pre-scheduling datapath. Components: Decode Unit; RMT 1 and RMT 2; Free list 1 and Free list 2; predicted path; alternative path; pre-scheduling; Instruction Window; Pre-schedule Buffer (data-flow order); issue logic]

DPIP

- Pre-scheduling example

schedule_line = max( reg_availability(input reg1), reg_availability(input reg2) )
reg_availability(output register) = schedule_line + execution_latency

Alternative path instructions:
A   r2 ← r1 + r0
B   store r3, 0(r2)
C   load r2, 0(r6)
D   r4 ← r2 + r0
E   r4 ← r3 + r3

Register Availability Table (one entry per logical register; successive values as shown on the slide):
r6   0
r5   0
r4   0, 2, 1
r3   0
r2   0, 1
r1   0
r0   0

Pre-schedule Buffer (lines 0 to 6 shown; lines 2 to 6 are empty):
line 1   B D
line 0   A C E

[Figure labels: line width; pre-sched. buffer size; logical register; active line]
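The two rules on this slide can be sketched in a few lines of Python and applied to instructions A to E. This is a minimal sketch under stated assumptions: every instruction, including the load, has an execution latency of 1; the line width limit is ignored; a store has no output register; and memory dependences between the store and the load are not modelled. The function name `pre_schedule` and the tuple encoding of instructions are illustrative only.

```python
from collections import defaultdict


def pre_schedule(instructions, latency=1):
    """Assign each instruction to a pre-schedule buffer line.

    instructions: list of (name, input_registers, output_register_or_None).
    Returns (lines, availability): buffer line -> instruction names, and the
    final register availability table.
    """
    availability = defaultdict(int)  # every register starts available at line 0
    lines = defaultdict(list)
    for name, inputs, output in instructions:
        # schedule_line = max over the availability of the input registers
        line = max(availability[r] for r in inputs)
        lines[line].append(name)
        if output is not None:
            # reg_availability(output) = schedule_line + execution_latency
            availability[output] = line + latency
    return dict(lines), dict(availability)


# Instructions A to E from the slide, in program order.
ALTERNATIVE_PATH = [
    ("A", ("r1", "r0"), "r2"),
    ("B", ("r3", "r2"), None),  # store: reads r3 and r2, writes no register
    ("C", ("r6",), "r2"),
    ("D", ("r2", "r0"), "r4"),
    ("E", ("r3", "r3"), "r4"),
]

if __name__ == "__main__":
    lines, availability = pre_schedule(ALTERNATIVE_PATH)
    for line in sorted(lines):
        print(line, lines[line])
```

With unit latency this places A, C and E on line 0 and B and D on line 1, which matches the buffer contents shown on the slide; the availability of r4 goes to 2 after D and then to 1 after E.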

Results

- OoO superscalar simulator (sim-outorder)
- Configuration
  - Fetch/decode/issue/commit up to 8 inst/cycle
  - L1 cache: 64 KB I-cache, 64 KB D-cache (2-way)
  - L2 cache: 512 KB, 4-way
  - 8 Int ALUs, 2 Int Mult
  - 8 FP ALUs, 2 FP Mult
  - 64-entry Instruction Window
  - 128-entry Reorder Buffer
  - 14-stage pipeline (IBM Power 4-like)
- Evaluated programs
  - SpecInt95 and SpecInt2000

Results

- DPIP performance

[Figure: IPC per benchmark (compress, gcc, go, ijpeg, bzip2, crafty, gzip, mcf, twolf, vpr, Average). Series: gshare (single-path) 16KB; gs+DPIP (8+8)KB; gs+DPIP+preSched (8+8); gs+BPRU+DPIP (8+8)KB; gs+DPIP(oracle) 8KB; perfect branch prediction]

- 8% improvement for DPIP (with pre-scheduling)
- 10% improvement for DPIP + branch prediction reversal
- 17% for oracle estimation (still work to be done)

Results

- How much does pre-scheduling influence DPIP performance?

[Figure: Speedup breakdown (%) per benchmark (compress, gcc, go, ijpeg, bzip2, crafty, gzip, mcf, twolf, vpr, Average). Series: pre-fetching+decoding+renaming; pre-scheduling. Average: 84% and 16%]

- 16% of the improvement is provided by pre-scheduling (31% for go)
- Pre-scheduling provides additional benefits.

Sensitivity Study

- Alpha 21264 branch predictor

[Figure: IPC per benchmark (compress, gcc, go, ijpeg, bzip2, crafty, gzip, mcf, twolf, vpr, Average). Series: 21264 (single-path) 16KB; 21264+DPIP (8+8)KB; 21264+DPIP(oracle) 8KB; perfect branch prediction]

- 5% average speedup (up to 8% for bzip2)
- 15% for oracle estimation
