PMPM: Prediction by Combining Multiple Partial Matches


  1. PMPM: Prediction by Combining Multiple Partial Matches
     Hongliang Gao, Huiyang Zhou
     Computer Science Department, University of Central Florida

  2. Prediction by Partial Matching (PPM)
     • Originally proposed for data compression by Cleary and Witten. Introduced to branch prediction by Chen et al.
     • For branch prediction:
       – Each static branch has a set of Markov predictors from order 0 to order m.
       – The “longest match” policy: use the m immediately preceding history bits to search for a pattern in the highest-order Markov predictor.
     • Assumptions of the PPM algorithm:
       – Longer history provides a more accurate context (true).
       – A prediction counter associated with a more accurate context will provide higher prediction accuracy (false).
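
The "longest match" policy on this slide can be illustrated with a minimal sketch. It is not the authors' implementation: the class name, the signed saturating counters, the update-all-orders rule, and the default prediction on a miss are all assumptions made for illustration.

```python
class PPMBranchPredictor:
    """Toy PPM predictor with Markov orders 0..max_order (illustrative only)."""

    def __init__(self, max_order, ctr_max=3):
        self.max_order = max_order
        self.ctr_max = ctr_max  # assumed saturation bound of each counter
        # tables[k] maps (pc, last k outcomes) -> signed saturating counter
        self.tables = [{} for _ in range(max_order + 1)]

    @staticmethod
    def _key(pc, history, order):
        # Context = branch address plus its `order` most recent outcomes.
        return (pc, tuple(history[len(history) - order:]))

    def predict(self, pc, history):
        """Return (predicted_taken, matching_order) using the longest match."""
        top = min(self.max_order, len(history))
        for order in range(top, -1, -1):
            ctr = self.tables[order].get(self._key(pc, history, order))
            if ctr is not None:
                return ctr >= 0, order
        return True, None  # no match at any order: assumed default is "taken"

    def update(self, pc, history, taken):
        """Train every order with the resolved outcome (assumed update rule)."""
        step = 1 if taken else -1
        top = min(self.max_order, len(history))
        for order in range(top + 1):
            key = self._key(pc, history, order)
            ctr = self.tables[order].get(key, 0) + step
            self.tables[order][key] = max(-self.ctr_max, min(self.ctr_max, ctr))
```

Here `history` is a list of booleans (most recent outcome last), and the caller is expected to append each resolved outcome after calling `update`.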

  3. The “longest match” policy is not optimal
     • The confidence-based PPM
       – Use the longest confident (ctr <> 0) match.
       – Misprediction rate (MPKI) reductions vs. PPM.
       – Max H = 40: [Bar chart: MPKI reduction per benchmark and on average; y-axis from -1% to 4%.]
       – Max H = 0 to 40: [Line chart: MPKI reduction vs. maximum history length (0 to 40); y-axis from -0.5% to 1.5%.]
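
The difference between the two policies can be sketched as follows. The function names, the signed-counter encoding (positive means taken), and the default on a miss are assumptions for illustration. The input lists one counter per Markov order, with `None` where that order has no match.

```python
def longest_match(ctrs_by_order):
    """Plain PPM: use the highest-order counter that matched at all."""
    for ctr in reversed(ctrs_by_order):
        if ctr is not None:
            return ctr >= 0
    return True  # assumed default when nothing matches


def longest_confident_match(ctrs_by_order):
    """Confidence-based PPM: skip matches whose counter is zero."""
    for ctr in reversed(ctrs_by_order):
        if ctr is not None and ctr != 0:
            return ctr > 0
    return True  # assumed default when no confident match exists


# Orders 0..3: order 2 misses, order 3 matches but its counter is 0.
example = [-2, -3, None, 0]
plain = longest_match(example)                 # decided by the order-3 counter
confident = longest_confident_match(example)   # decided by the order-1 counter
```

In this example the two policies disagree: the plain policy follows the unconfident order-3 counter, while the confidence-based policy falls back to the order-1 counter.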

  4. Introduction
     • Key observation on PPM
       – The “longest match” policy is not optimal for branch prediction.
     • Our contributions
       – A novel algorithm: Prediction by combining Multiple Partial Matches (PMPM).
       – A PMPM-based idealistic branch predictor.
       – A PMPM-based realistic branch predictor.

  5. Prediction by combining Multiple Partial Matches
     • Different branches favor different history lengths.
       – Using a longer history than necessary:
         • Uncorrelated history information -> noise -> useful information is distributed across more prediction counters. For a specific 10-bit history, one counter is enough; with 15-bit histories, the 5 extra bits may require up to 32 counters.
         • A long history repeats less frequently -> only the most recent behavior is captured.
       – Especially harmful for “not-correlated / random-like” branches.
     • Solution: combining multiple matches
       – Why: combining multiple counters includes more history behaviors.
       – How: summation -> integrates both direction AND confidence.
       – Which: several longest confident matches with non-zero prediction counters.

  6. Prediction accuracy of PMPM
     • Configuration
       – Combine the L longest confident matches.
       – Maximum global history length: 40.
       – $\text{Prediction} = \left( \sum_{i=1}^{L} \mathit{Ctr}_i \ge 0 \right), \quad \mathit{Ctr}_i \ne 0$
     • Prediction accuracy
       [Line chart: average MPKI (about 3.6 to 3.95) vs. L (1 to 41) for PMPM-L, with PPM as the reference; annotations mark the minimum MPKI and the "combine all" point.]
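
A minimal sketch of the summation rule, assuming signed counters where a positive value means taken and `None` marks an order without a match. The function name and the input layout are hypothetical.

```python
def pmpm_predict(ctrs_by_order, L):
    """Sum the L longest confident (non-zero) matches; predict taken if sum >= 0."""
    confident = [c for c in reversed(ctrs_by_order) if c is not None and c != 0]
    return sum(confident[:L]) >= 0


# Orders 0..3: the longest match leans weakly "not taken",
# while the shorter matches lean strongly "taken".
ctrs = [3, 2, None, -1]
with_one = pmpm_predict(ctrs, L=1)    # uses only the order-3 counter
with_three = pmpm_predict(ctrs, L=3)  # sums the order-3, order-1 and order-0 counters
```

With `L=1` the rule reduces to the confidence-based longest match. With a larger `L`, strong shorter matches can outvote a weak long match, which is how the sum integrates both direction and confidence.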

  7. The idealistic PMPM predictor
     [Block diagram. Inputs: GHR, path history, branch tag, LHR, plus a table of meta ctr and bias ctr. Tables: Short-GHR table (Len 1-32), Long-GHR table (geometrical lengths), LHR table (Len 1-32); each entry holds tag, LRU, ctr, ubit, bf. Selection logic picks the M (=7) longest matched useful counters and the N (=16) longest matched useful counters. The selected counters and the bias ctr feed two summations (∑) that give the global prediction and the local prediction; together with the meta ctr these give the final prediction.]

  8. Prediction accuracy of the idealistic PMPM predictor
     [Bar chart: MPKI per benchmark and on average for PPM and iPMPM; y-axis from 0 to 14; annotated reductions of -20.9% and -15.2%.]
     • PPM: same predictor structure, but using the “longest match” prediction policy.
     • Average MPKI
       – PPM: 3.330
       – PMPM: 2.824
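
As a quick arithmetic check, the relative reduction implied by the two averages on this slide can be computed directly. The helper name is hypothetical.

```python
def relative_reduction(baseline_mpki, new_mpki):
    """Fraction by which new_mpki improves on baseline_mpki."""
    return (baseline_mpki - new_mpki) / baseline_mpki


avg_reduction = relative_reduction(3.330, 2.824)
print(f"{avg_reduction:.1%}")  # prints "15.2%"
```

The result matches the -15.2% annotation on the chart, which suggests that annotation refers to the average.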

  9. The TAGE predictor
     [Block diagram. A bimodal table indexed by pc, and tagged tables gtable6 (shortest history) through gtable0 (longest history) indexed by pc, ghr and path, with tags formed from pc and ghr. Each tagged entry holds tag, ctr, ubit. One matched counter is selected (the longest match OR the 2nd longest match) to give the final prediction.]

  10. The realistic PMPM predictor
     [Block diagram. A bimodal table indexed by pc; an ltable indexed by pc and lhr with tags formed from pc; tagged tables gtable6 (shortest history) through gtable0 (longest history) indexed by pc, ghr and path with tags formed from pc and ghr. Each tagged entry holds tag, ctr, ubit. Counter selection logic (select one matched counter: the longest match OR the 2nd longest match) feeds a summation (∑) that gives the final prediction.]
     • gtable ctrs, 4 groups: (gtable6, gtable5), (gtable4, gtable3), (gtable2, gtable1), (gtable0).
     • Total (max): a 2-bit bimodal ctr, a 5-bit ltable ctr, four 5-bit gtable ctrs.
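
The grouping and summation can be sketched as below. The slide does not spell out the counter selection logic, so the per-group rule used here (take the matching table with the longest history) is an assumption, as are the function names, the signed-counter encoding, and the treatment of misses as contributing zero.

```python
# gtable indices grouped as on the slide; gtable0 has the longest history.
GROUPS = [(6, 5), (4, 3), (2, 1), (0,)]


def select_group_counter(gtable_hits, group):
    """Assumed policy: within a group, use the hit with the longest history.

    gtable_hits maps a gtable index to its matched signed counter.
    A smaller index means a longer history, so the smaller index wins.
    """
    for idx in sorted(group):
        ctr = gtable_hits.get(idx)
        if ctr is not None:
            return ctr
    return 0  # no hit in this group contributes nothing


def realistic_pmpm_predict(bimodal_ctr, ltable_ctr, gtable_hits):
    """Sum at most one bimodal, one ltable and four gtable counters."""
    total = bimodal_ctr
    if ltable_ctr is not None:  # None models an ltable tag miss
        total += ltable_ctr
    for group in GROUPS:
        total += select_group_counter(gtable_hits, group)
    return total >= 0
```

For example, `realistic_pmpm_predict(-1, None, {6: 4, 5: -2, 0: -3})` sums -1, -2 and -3 and predicts not taken, because gtable5 is chosen over gtable6 within their group.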

  11. Ahead pipelining
     [Timing diagram: blocks A, B, C, D over Cycle1 to Cycle4. A 3-block ahead prediction is initiated at A; the prediction of D is available at the end.]
     Steps shown under the cycles, in order:
     – Indexes; read 4 adjacent entries.
     – Tags.
     – Calculate 4 potential predictions; use information of B and C to select one prediction.
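
The "read 4 adjacent entries, then select one" idea can be sketched as follows. The slide does not say what information of B and C is used, so this sketch assumes one late bit from each block, combined into a 2-bit selector. The function name, the table layout, and the signed counters are also hypothetical.

```python
def ahead_predict(table, index_at_a, late_bits):
    """Pick one of 4 candidate predictions using late information.

    table:      list of signed counters, length a multiple of 4
    index_at_a: index computed from information available at block A
    late_bits:  2-bit value assumed to come from blocks B and C
    """
    base = (index_at_a & ~0b11) % len(table)  # align to a group of 4 entries
    # Early cycles: read 4 adjacent entries and form 4 potential predictions.
    candidates = [table[base + i] >= 0 for i in range(4)]
    # Last cycle: late information from B and C selects one of them.
    return candidates[late_bits & 0b11]


counters = [1, -1, 2, -2, -3, 3, -1, 1]
pred_a = ahead_predict(counters, index_at_a=6, late_bits=0b01)
pred_b = ahead_predict(counters, index_at_a=6, late_bits=0b10)
```

The design point is that the slow table read starts before B and C are known, and only a cheap 4-to-1 selection waits for them.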

  12. Compared to the TAGE predictors
     • Configuration
       – 32 kB, same global history series (5 - 131), similar structures.
       – Compared to the TAGE predictor:
         • PMPM-G (GH only): 2-bit larger ctrs, 2-bit smaller tags.
       – Compared to the PMPM-G predictor:
         • PMPM-GL (GH and LH): one ltable, smaller bimodal table, smaller tags for 3 gtables.
     • Average MPKI:
       – TAGE: 3.666
       – PMPM-G: 3.597 (higher aliasing, gcc +7.3%)
       – PMPM-GL: 3.441

  13. The realistic PMPM predictor for CBP2
     • Submitted configuration
       – Save some storage for miscellaneous registers, counters, etc.
       – Empirically tuned inputs, tag widths, etc.
     • Several optimizations
       – Shared hysteresis bits in the bimodal table (proposed in the EV8 predictor).
       – Detect traces with high branch footprints and reset ubits periodically (borrowed from the TAGE predictor).
       – Limited ubit updates if all predictions from the gtables are the same.

  14. The realistic PMPM predictor for CBP2 - accuracy
     • Observations:
       – High accuracy: 3.416 MPKI.
       – The local history is still important for some benchmarks (e.g., raytrace, mtrt and vortex), although we already use a very long (203) global history.

     MPKI per trace:

     Trace      CBP2-GL   CBP2-G      Trace       CBP2-GL   CBP2-G
     gzip         9.712   10.346      vpr           8.945    9.063
     gcc          3.690    3.637      mcf          10.092   10.033
     crafty       2.581    2.565      parser        5.215    5.244
     compress     5.537    5.819      jess          0.393    0.433
     raytrace     0.542    0.963      db            2.319    2.380
     javac        1.107    1.159      mpegaudio     1.102    1.159
     mtrt         0.657    1.009      jack          0.688    0.763
     eon          0.276    0.359      perlbmk       0.314    0.484
     gap          1.431    1.745      vortex        0.137    0.331
     bzip2        0.037    0.042      twolf        13.551   13.616

     Average: PMPM-CBP2-GL 3.416, PMPM-CBP2-G 3.557
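
The reported averages can be cross-checked from the per-trace numbers with a short script. The variable names are hypothetical; the data is copied from the slide.

```python
# MPKI per trace: (CBP2-GL, CBP2-G), copied from the slide.
MPKI = {
    "gzip": (9.712, 10.346), "vpr": (8.945, 9.063),
    "gcc": (3.690, 3.637), "mcf": (10.092, 10.033),
    "crafty": (2.581, 2.565), "parser": (5.215, 5.244),
    "compress": (5.537, 5.819), "jess": (0.393, 0.433),
    "raytrace": (0.542, 0.963), "db": (2.319, 2.380),
    "javac": (1.107, 1.159), "mpegaudio": (1.102, 1.159),
    "mtrt": (0.657, 1.009), "jack": (0.688, 0.763),
    "eon": (0.276, 0.359), "perlbmk": (0.314, 0.484),
    "gap": (1.431, 1.745), "vortex": (0.137, 0.331),
    "bzip2": (0.037, 0.042), "twolf": (13.551, 13.616),
}

avg_gl = sum(gl for gl, _ in MPKI.values()) / len(MPKI)
avg_g = sum(g for _, g in MPKI.values()) / len(MPKI)

# Traces where adding local history does not help.
gl_worse = sorted(name for name, (gl, g) in MPKI.items() if gl > g)
```

The arithmetic means agree with the averages on the slide to within rounding, and the local history slightly hurts only on crafty, gcc and mcf.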

  15. The realistic PMPM predictor for CBP2 – ahead pipelining
     [Line chart: average MPKI (about 3.2 to 3.8) vs. ahead-pipelining depth (1-block, 2-block, 3-block, 4-block) for PMPM-CBP2-GL and PMPM-CBP2-G; annotated differences of 0.11 MPKI (3.0%) and 0.15 MPKI (4.4%).]

  16. Summary
     • Key observation on PPM
       – The “longest match” policy is not optimal for branch prediction.
     • Solution
       – Prediction by combining Multiple Partial Matches (PMPM).
     • PMPM-based predictor designs
       – Idealistic predictor: 2.824 MPKI.
       – Realistic predictor: 3.416 MPKI.

  17. Thank you and Questions?
