Sangam: A Multi-component Core Cache Prefetcher
Mainak Chaudhuri, Nayan Deshmukh
Introduction
• The word ‘Sangam’ refers to a confluence of three rivers, which corresponds to the three core components of our prefetcher
• We achieve a 40.3% speedup over no prefetching for 46 single-core workloads
• For 4 cores, we achieve a 19.5% speedup over no prefetching for 100 multiprogrammed workloads (45 homogeneous, 55 heterogeneous)
Sangam
[Figure: block diagram. Components: IP-Delta-based sequence predictor, IP-based stride prefetcher, adaptive-degree next-line prefetcher, recent access filter. Flow labels: "Inject prefetch", "Last PQ entry?", "Encode residual prefetches as metadata", "Sequence complete?", "Stop".]
Sangam
All the components have a common base degree d
Where?
• Where to place the prefetcher
• L1 allows for better learning, whereas L2 and L3 allow for more hardware resources
[Figure: "Speedup at different levels of cache". Y-axis: speedup (1.2 to 1.34); series: L1 prefetcher, L2 prefetcher; x-axis groups: IP-stride, IP-delta.]
IP-Delta-based Sequence predictor
• Uses both control-flow and data-flow information to predict a sequence of accesses
[Figure: prediction lookup. IP Table (fields as labeled: IP, last offset, last d+1 deltas); IP-Delta Table, indexed by h(IP, Delta) (fields as labeled: next d deltas, each with a confidence). The animation highlights, in order: the IP and offset of the access, a subtraction (-), the hash h, and the Delta and Confidence fields of the selected entry.]
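The lookup path suggested by the figure labels can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the table size, the hash `h`, the confidence threshold, the rule of stopping at the first low-confidence delta, and all names (`IPEntry`, `SeqEntry`, `predict`) are hypothetical.

```python
from dataclasses import dataclass, field

D = 4                # assumed base degree d
CONF_THRESHOLD = 2   # assumed confidence threshold
TABLE_SIZE = 1024    # assumed number of IP-Delta Table entries


def h(ip, delta):
    """Hypothetical hash of (IP, delta) into the IP-Delta Table."""
    return (ip ^ (delta * 0x9E3779B1)) % TABLE_SIZE


@dataclass
class IPEntry:
    last_offset: int                            # offset of the previous access by this IP
    deltas: list = field(default_factory=list)  # up to d+1 recent deltas, newest last


@dataclass
class SeqEntry:
    next_deltas: list   # d deltas expected to follow the indexing delta
    confidence: list    # one counter per predicted delta


def predict(ip_table, ip_delta_table, ip, offset):
    """Return predicted offsets for an access by `ip` at `offset`."""
    entry = ip_table.get(ip)
    if entry is None:
        return []                               # IP not seen before
    delta = offset - entry.last_offset          # the "-" step in the figure
    seq = ip_delta_table.get(h(ip, delta))      # the "h" step in the figure
    if seq is None:
        return []                               # no learned sequence
    targets, addr = [], offset
    for dlt, conf in zip(seq.next_deltas, seq.confidence):
        if conf < CONF_THRESHOLD:               # assumed: stop at first weak delta
            break
        addr += dlt
        targets.append(addr)
    return targets
```

An empty result is the case in which a fallback component would take over.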
IP-Delta-based Sequence predictor
• Learning
[Figure: learning step on the same two tables (IP Table; IP-Delta Table indexed by h(IP, Delta)). The animation highlights, in order: the IP and offset of the access, a subtraction (-), the k-th and (d+1)-th deltas of the IP Table history, the hash h, and the (d+1-k)-th Delta and Confidence fields of the selected IP-Delta Table entry.]
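One plausible reading of the learning labels is that the (d+1)-th most recent delta, hashed with the IP, selects an IP-Delta Table entry, and the k-th most recent delta then trains position d+1-k of that entry. The sketch below follows that reading. It is an assumption about the figure, not a confirmed description: the hash `h`, the saturating-counter update policy, the table size, and the dictionary layout are all hypothetical.

```python
D = 4              # assumed base degree d
CONF_MAX = 3       # assumed saturation value of the confidence counters
TABLE_SIZE = 1024  # assumed number of IP-Delta Table entries


def h(ip, delta):
    """Hypothetical hash of (IP, delta) into the IP-Delta Table."""
    return (ip ^ (delta * 0x9E3779B1)) % TABLE_SIZE


def train(ip_table, ip_delta_table, ip, offset):
    """Update both tables for an access by `ip` at `offset`."""
    entry = ip_table.get(ip)
    if entry is None:
        ip_table[ip] = {"last_offset": offset, "deltas": []}
        return
    # Record the newest delta and keep only the last d+1 of them.
    entry["deltas"].append(offset - entry["last_offset"])
    entry["deltas"] = entry["deltas"][-(D + 1):]
    entry["last_offset"] = offset
    hist = entry["deltas"]
    if len(hist) < D + 1:
        return  # not enough history to train a full sequence yet
    # hist[0] is the (d+1)-th most recent delta; it selects the entry.
    seq = ip_delta_table.setdefault(
        h(ip, hist[0]), {"next": [0] * D, "conf": [0] * D})
    for k in range(1, D + 1):
        pos = D + 1 - k       # 1-based position trained by the k-th delta
        observed = hist[-k]   # k-th most recent delta
        i = pos - 1
        if seq["next"][i] == observed:
            seq["conf"][i] = min(CONF_MAX, seq["conf"][i] + 1)
        elif seq["conf"][i] > 0:
            seq["conf"][i] -= 1           # assumed: decay before replacing
        else:
            seq["next"][i] = observed     # assumed: replace at zero confidence
```

With an alternating delta pattern 1, 2, 1, 2, ... the entry selected by delta 1 converges to the sequence 2, 1, 2, 1, and its counters rise on each repeat.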
IP-based stride prefetcher
• We use the IP-based stride predictor when the IP-delta predictor can no longer offer predictions
• This covers both cases: the entry is missing from the IP-delta table, or the sequence is below the confidence threshold
• We use the IP-stride predictor only when the last two deltas seen for the IP are equal
[Figure: IP Table (fields as labeled: IP, last offset, last d+1 deltas), with the entry for the accessing IP highlighted.]
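The stride rule stated on this slide can be sketched as a small function over the per-IP delta history. The function name, the use of the base degree d as the stride degree, and the zero-stride check are assumptions made for illustration.

```python
D = 4  # assumed base degree d


def stride_prefetches(last_deltas, offset, degree=D):
    """Return stride prefetch targets, or [] when the rule does not fire.

    `last_deltas` is the per-IP delta history, newest last.
    """
    if len(last_deltas) < 2:
        return []
    stride = last_deltas[-1]
    # Rule from the slide: the last two deltas seen for the IP must be equal.
    # Assumed extra check: a zero stride would only re-request the same line.
    if stride == 0 or stride != last_deltas[-2]:
        return []
    return [offset + i * stride for i in range(1, degree + 1)]
```

For example, a history ending in two deltas of 3 at offset 100 yields targets 103, 106, 109, 112, while a history ending in 3, 5 yields none and leaves the access to the next fallback.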
Next-line prefetcher
• Maintaining coverage at the cost of accuracy leads to better overall performance
• Used when neither the IP-delta predictor nor the IP-stride prefetcher can offer a prediction
• Feedback-directed degree selection
[Figure: next-line buffer receiving lines X+1, X+2, ..., X+d, tagged with degrees 1, 2, ..., d. Per-degree "Hits" and "Insertions" counters are shown for degrees 1, 2, ..., d; in the animation, the counters of a degree change as line X+1 (degree 1) is processed.]
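Feedback-directed degree selection with per-degree hit and insertion counters might look like the sketch below. The slide only shows the counters and the buffer, so the policy here is assumed: every candidate X+1..X+d is recorded in the buffer, a demand access that finds its line in the buffer credits the tagged degree, and the issued degree is the largest one whose hit ratio stays above a threshold. The class name, the threshold, the buffer size, and the FIFO eviction are hypothetical.

```python
from collections import OrderedDict


class NextLinePrefetcher:
    def __init__(self, max_degree=4, buffer_size=64, min_accuracy=0.25):
        self.max_degree = max_degree      # base degree d
        self.buffer_size = buffer_size    # assumed next-line buffer capacity
        self.min_accuracy = min_accuracy  # assumed hit-ratio threshold
        self.hits = [0] * (max_degree + 1)        # index 0 unused
        self.insertions = [0] * (max_degree + 1)  # index 0 unused
        self.buffer = OrderedDict()       # line address -> degree tag

    def on_access(self, line):
        """Credit the degree that predicted this demand access, if any."""
        tag = self.buffer.pop(line, None)
        if tag is not None:
            self.hits[tag] += 1

    def current_degree(self):
        """Largest degree whose observed accuracy is still acceptable."""
        degree = 1  # assumed: degree 1 is always issued to keep coverage
        for i in range(1, self.max_degree + 1):
            ins = self.insertions[i]
            if ins and self.hits[i] / ins < self.min_accuracy:
                break
            degree = i
        return degree

    def on_trigger(self, line):
        """Record all d candidates; return the lines actually prefetched."""
        degree = self.current_degree()
        for i in range(1, self.max_degree + 1):
            cand = line + i
            if cand not in self.buffer:
                if len(self.buffer) >= self.buffer_size:
                    self.buffer.popitem(last=False)  # evict the oldest entry
                self.buffer[cand] = i
                self.insertions[i] += 1
        return [line + i for i in range(1, degree + 1)]
```

Recording candidates beyond the issued degree lets a degree that was cut back earn hits again; a hardware version would presumably also age or halve the counters, which this sketch omits.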