Self-adaptive Address Mapping Mechanism for Access Pattern Awareness on DRAM
Chundian Li*, Mingzhe Zhang*, Zhiwei Xu*, Xianhe Sun†
* ICT, CAS, China
† Illinois Tech, USA
Institute of Computing Technology
12/17/2019

Outline
● Introduction & Background
● Motivation
● Design
● Experiments
● Conclusion
● Future work

Introduction
● Memory wall.
● DRAM serves data accesses efficiently in two ways.
  ● Locality: row buffer.
  ● Memory-level parallelism (MLP): channel/bank parallelism.
● Worst case.
  ● Neither locality nor concurrency.
  ● When and why?
● Mismatch between data layout and access pattern.
  ● Data layout: row-major, column-major, bank-major, etc.
  ● Access pattern: stream, stride, random, pointer, etc.
  ● (This study considers regular access patterns.)

Background
● Layout <- address mappings.
  ● RI: spatial row-buffer locality.
  ● XOR: increases MLP potential.
  ● CI: bank parallelism.
● What do these mappings have in common?
  ● Row bits are in the high-order part of the address.
  ● Designed for accesses separated by a short distance.
● Problems?
  ● What if the distance is long?
    ● The worst case appears.
    ● Take matrix multiplication as an example.
  ● Can XOR really match all access patterns?
    ● No.
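A minimal sketch of how an address mapping splits a physical address into row, bank, and column fields. The slides do not give bit layouts, so the geometry (64-byte lines, 128 lines per row, 8 banks) and the three layouts assigned to the labels RI, CI, and XOR are assumptions chosen only to reproduce the properties listed above.

```python
# Hypothetical geometry (not from the slides).
OFFSET_BITS = 6   # 64-byte cache line
COL_BITS = 7      # 128 lines per row
BANK_BITS = 3     # 8 banks
ROW_SHIFT = OFFSET_BITS + COL_BITS + BANK_BITS


def _bits(value, low, width):
    """Extract `width` bits of `value` starting at bit `low`."""
    return (value >> low) & ((1 << width) - 1)


def map_ri(addr):
    """Assumed layout | row | bank | column | offset |: neighbours share a row."""
    col = _bits(addr, OFFSET_BITS, COL_BITS)
    bank = _bits(addr, OFFSET_BITS + COL_BITS, BANK_BITS)
    return addr >> ROW_SHIFT, bank, col


def map_ci(addr):
    """Assumed layout | row | column | bank | offset |: neighbours spread over banks."""
    bank = _bits(addr, OFFSET_BITS, BANK_BITS)
    col = _bits(addr, OFFSET_BITS + BANK_BITS, COL_BITS)
    return addr >> ROW_SHIFT, bank, col


def map_xor(addr):
    """Assumed XOR scheme: the RI bank index XORed with the low row bits."""
    row, bank, col = map_ri(addr)
    return row, bank ^ (row & ((1 << BANK_BITS) - 1)), col


def summarize(mapping, addrs):
    """Return (distinct rows, distinct banks) touched by an address stream."""
    rows = {mapping(a)[0] for a in addrs}
    banks = {mapping(a)[1] for a in addrs}
    return len(rows), len(banks)


if __name__ == "__main__":
    streams = {
        "sequential lines": [i * 64 for i in range(4)],
        "stride 2^16": [i << 16 for i in range(4)],
        "stride 2^19": [i << 19 for i in range(4)],
    }
    for name, addrs in streams.items():
        for mapping in (map_ri, map_ci, map_xor):
            print(name, mapping.__name__, summarize(mapping, addrs))
```

Under these assumptions, the stride of 2^19 bytes sends every access to a different row of the same bank for all three mappings, which is the "neither locality nor concurrency" case; with a stride of 2^16 only the XOR variant spreads the accesses over banks.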

Motivation
● Take three versions of GEMM, at several scales, as cases.
  ● Naïve.
  ● Cache-friendly: tiling.
  ● Highly optimized: Intel MKL.
● Metrics.
  ● IPC for whole execution.
  ● DRAM performance: APC.
  ● Locality: row-buffer miss rate.
  ● Concurrency: MLP.
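For reference, the naïve and tiled GEMM variants differ only in loop order, which is what changes the address stream seen by DRAM. The sketch below is illustrative and is not the benchmark code from the study; the function names and the tile size are hypothetical.

```python
def gemm_naive(a, b, n):
    """Textbook triple loop: the inner loop walks a column of b."""
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                c[i][j] += a[i][k] * b[k][j]
    return c


def gemm_tiled(a, b, n, tile=2):
    """Same arithmetic, computed block by block so each block is reused."""
    c = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, tile):
        for jj in range(0, n, tile):
            for kk in range(0, n, tile):
                for i in range(ii, min(ii + tile, n)):
                    for j in range(jj, min(jj + tile, n)):
                        acc = c[i][j]
                        for k in range(kk, min(kk + tile, n)):
                            acc += a[i][k] * b[k][j]
                        c[i][j] = acc
    return c


if __name__ == "__main__":
    n = 5
    a = [[float(i + j) for j in range(n)] for i in range(n)]
    b = [[float(i - j) for j in range(n)] for i in range(n)]
    print(gemm_naive(a, b, n) == gemm_tiled(a, b, n))
```

In the naïve version, consecutive reads of a row-major `b` are a full matrix row apart, so the distance between adjacent accesses grows with the matrix size.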

Motivation
● Observation 1.
  ● RI, XOR, and CI may each fail to deliver their advantages when they happen to mismatch the access pattern on DRAM.

Motivation
● Observation 2.
  ● XOR outperforms CI on some patterns, and CI outperforms XOR on others.

Motivation
● Bit flip: address distance.
● Observation 3.
  ● RI, XOR, and CI may all degrade DRAM performance when bit flips are outstanding.
  ● Consecutive accesses span a long distance, which defeats both locality and MLP.

Design
● Two tags.
  ● Distinguish two procedures.
  ● The memory controller (MC) decides when to sample.
● Software level: Ctrl Loader.
  ● Interacts with the MC.
● Hardware level: MC modifications.
  ● Flip sampling.
  ● Pattern-aware prediction.

Design
● Flip sampling.
  ● Cares about adjacent accesses.
  ● Lightweight.
  ● Little cost.
● Access pattern.
  ● Check bit flips for all 64 bits.
  ● Decide which bit is outstanding.
  ● Reduce side effects of access thrashing.
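Flip sampling as described above can be sketched as one counter per address bit, updated from the XOR of each pair of adjacent addresses. This is a software illustration under assumptions, not the hardware design: the class name `FlipSampler`, its methods, and the use of a flip-frequency threshold to pick the outstanding bits are all hypothetical.

```python
ADDRESS_BITS = 64  # the slides check flips for all 64 address bits


class FlipSampler:
    """Counts, per bit position, how often adjacent addresses differ."""

    def __init__(self):
        self.counts = [0] * ADDRESS_BITS
        self.samples = 0   # number of adjacent pairs observed
        self.prev = None   # previously observed address

    def observe(self, addr):
        """Record one access and update the per-bit flip counters."""
        if self.prev is not None:
            diff = self.prev ^ addr  # set bits are the bits that flipped
            for bit in range(ADDRESS_BITS):
                if (diff >> bit) & 1:
                    self.counts[bit] += 1
            self.samples += 1
        self.prev = addr

    def prominent_bits(self, threshold):
        """Bits whose flip frequency is at least `threshold` (assumed rule)."""
        if self.samples == 0:
            return []
        return [bit for bit, n in enumerate(self.counts)
                if n / self.samples >= threshold]


if __name__ == "__main__":
    sampler = FlipSampler()
    for i in range(9):
        sampler.observe(i << 16)  # hypothetical stride of 2^16 bytes
    print(sampler.prominent_bits(0.9))
```

For a constant stride, the lowest bit of the stride flips on every adjacent pair, so it stands out against the higher bits, which flip only on carries.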

Design
● Pattern-aware prediction.
  ● Basic idea: reshape the layout to match the access pattern.
  ● Based on prominent flipping.
● Two strategies (aggressiveness control).
  ● Locality-based strategy.
  ● MLP-based strategy.
● Profit model for this mechanism.
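One way to picture "reshaping the layout" is to swap the prominently flipping address bit with a bit that feeds the column index (locality-based) or the bank index (MLP-based). The slides do not specify the actual transformation, so this is only a sketch under assumptions: the field positions, the function names, and the swap rule are hypothetical.

```python
# Hypothetical layout | row | bank | column | offset | (not from the slides).
COL_LOW_BIT = 6     # lowest column bit
BANK_LOW_BIT = 13   # lowest bank bit
ROW_LOW_BIT = 16    # lowest row bit


def decode(addr):
    """Return (row, bank, column) under the assumed layout."""
    col = (addr >> COL_LOW_BIT) & 0x7F
    bank = (addr >> BANK_LOW_BIT) & 0x7
    return addr >> ROW_LOW_BIT, bank, col


def swap_bits(addr, i, j):
    """Exchange bits i and j of addr."""
    if ((addr >> i) & 1) != ((addr >> j) & 1):
        addr ^= (1 << i) | (1 << j)
    return addr


def reshape(addr, prominent_bit, strategy):
    """Move the prominent bit into the column or bank field (assumed rule)."""
    if strategy == "locality":
        target = COL_LOW_BIT
    elif strategy == "mlp":
        target = BANK_LOW_BIT
    else:
        raise ValueError("strategy must be 'locality' or 'mlp'")
    return swap_bits(addr, prominent_bit, target)


if __name__ == "__main__":
    a0, a1 = 0, 1 << 19  # adjacent accesses whose bit 19 flips
    print("original:", decode(a0), decode(a1))
    for strategy in ("locality", "mlp"):
        print(strategy, decode(reshape(a0, 19, strategy)),
              decode(reshape(a1, 19, strategy)))
```

With these assumptions, the two accesses originally conflict in one bank; the locality variant places them in the same row, while the MLP variant places them in different banks.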

Experiments
● Testbed.
  ● Ramulator + ChampSim.
  ● Representative benchmarks: diverse scales of GEMM.
  ● Baseline: XOR.

Experiments
● DRAM performance.
  ● MLP-based strategy.
    ● Naïve: 2.1x.
    ● Tiling: 1.4x.
  ● Locality-based strategy.
    ● Naïve: 1.9x.
    ● Tiling: 1.7x.
    ● Intel MKL: 1.6x.

Experiments
● IPC for whole execution.
  ● Execution time decreases by 24%, 8%, and 7% on average.

Experiments
● Sensitivity study.
  ● [1] λ: how frequently a bit must flip to count as prominent for the access pattern.
  ● [2] σ: speed of reaction.

Conclusion
● Key observation.
  ● Inefficiency comes from the mismatch between access patterns and data layout.
  ● Worst case: both locality and parallelism are harmed.
● An adaptive address mapping mechanism that is aware of access patterns.
  ● Bridges the mismatch between access patterns and data layout on DRAM.
  ● Adapts to different access patterns by adopting suitable mappings to gain either locality or bank parallelism.

Future work
● Show potential on other benchmarks.
  ● Extract more benefit from other applications with regular patterns.
● Fast reshaping.
  ● Exploit efficient data movement in 3D-stacked DRAM to support fast reshaping at runtime after predicting a suitable mapping.

Thank you. Q & A.
