Locality: CS 105 Tour of the Black Holes of Computing

Cache Memories
CS 105 Tour of the Black Holes of Computing

Topics
  Generic cache-memory organization
  Direct-mapped caches
  Set-associative caches
  Impact of caches on performance

Locality
  Principle of Locality: Programs tend to use data and instructions with addresses equal or near to those they have used recently.
  Temporal locality: Recently referenced items are likely to be referenced again in the near future.
  Spatial locality: Items with nearby addresses tend to be referenced close together in time.

Locality Example

    sum = 0;
    for (i = 0; i < n; i++)
        sum += a[i];
    return sum;

  Data references
    Reference array elements in succession (stride-1 reference pattern).
    Reference variable sum each iteration.
  Instruction references
    Reference instructions in sequence.
    Cycle through loop repeatedly.

Layout of C Arrays in Memory (review)
  C arrays are allocated in row-major order: each row occupies contiguous memory locations.

  Stepping through columns in one row:

    for (i = 0; i < N; i++)
        sum += a[0][i];

    Accesses successive elements.
    If the block size (B) is larger than the size of one element, this exploits spatial locality.
    Miss rate = (size of one element) / B

  Stepping through rows in one column:

    for (i = 0; i < n; i++)
        sum += a[i][0];

    Accesses distant elements.
    No spatial locality!
    Miss rate = 1 (i.e. 100%)
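The two miss rates above can be checked with a small counting model. The sketch below is not from the slides: it assumes a cold cache and treats an access as a miss exactly when it touches a different block than the previous access did, which is enough to reproduce both cases. The function names, the 4-byte element size, and the 64-byte block size used in the note afterwards are illustrative assumptions.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Count misses for `count` accesses that start at byte offset 0 and
 * advance by `stride_bytes` each time. Simplified model (an assumption):
 * the cache starts cold, and an access misses exactly when it falls in
 * a different block than the previous access.
 */
size_t count_misses(size_t count, size_t stride_bytes, size_t block_bytes)
{
    assert(block_bytes > 0);
    size_t misses = 0;
    size_t prev_block = 0;
    for (size_t k = 0; k < count; k++) {
        size_t block = (k * stride_bytes) / block_bytes;
        if (k == 0 || block != prev_block)
            misses++;
        prev_block = block;
    }
    return misses;
}

/* Miss rate as a fraction between 0 and 1. */
double miss_rate(size_t count, size_t stride_bytes, size_t block_bytes)
{
    assert(count > 0);
    return (double)count_misses(count, stride_bytes, block_bytes)
           / (double)count;
}
```

Under these assumptions, miss_rate(1024, 4, 64) models the a[0][i] loop with 4-byte elements and gives 4/64, while miss_rate(1024, 4096, 64) models the a[i][0] loop over 1024-element rows and gives 1.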

Qualitative Estimates of Locality
  Claim: Being able to look at code and get a qualitative sense of its locality is a key skill for a professional programmer.

  Question: Does this function have good locality with respect to array a?

    int sum_array_rows(int a[M][N])
    {
        int i, j, sum = 0;

        for (i = 0; i < M; i++)
            for (j = 0; j < N; j++)
                sum += a[i][j];
        return sum;
    }

Locality Example
  Question: Does this function have good locality with respect to array a?

    int sum_array_cols(int a[M][N])
    {
        int i, j, sum = 0;

        for (j = 0; j < N; j++)
            for (i = 0; i < M; i++)
                sum += a[i][j];
        return sum;
    }

Cache Memories
  Cache memories are small, fast SRAM-based memories managed automatically in hardware.
    Hold frequently accessed blocks of main memory.
  CPU looks first for data in cache, then in main memory.
  Typical system structure (figure labels): CPU chip containing register file, ALU, cache memory, and bus interface; system bus; I/O bridge; memory bus; main memory.

Typical Speeds
  Registers: 1 clock (= 400 ps on 2.5 GHz processor) to get 8 bytes
  Level-1 (L1) cache: 3–5 clocks for 32–64 bytes
  L2 cache: 10–20 clocks, 32–64 bytes
  L3 cache: 20–100 clocks (multiple cores make things slower), 32–64 bytes
  DRAM: 100–300 clocks, 32–64 bytes
  SSD: 75,000 clocks and up (high variance), 4096 bytes
  Hard drive: 5,000,000–25,000,000 clocks, 4096 bytes
  Ouch!
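One way to explore the two questions above is to time both traversals on the same data. The following is a minimal sketch, not part of the slides: the dimensions M and N, the helper fill_array, and the timing wrapper time_sum are hypothetical names introduced for illustration, and clock() measures CPU time only coarsely.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

#define M 512   /* hypothetical dimensions, chosen only for illustration */
#define N 512

/* Row-wise traversal: the inner loop walks along one row (stride 1). */
int sum_array_rows(int a[M][N])
{
    int i, j, sum = 0;
    for (i = 0; i < M; i++)
        for (j = 0; j < N; j++)
            sum += a[i][j];
    return sum;
}

/* Column-wise traversal: the inner loop jumps a whole row (stride N). */
int sum_array_cols(int a[M][N])
{
    int i, j, sum = 0;
    for (j = 0; j < N; j++)
        for (i = 0; i < M; i++)
            sum += a[i][j];
    return sum;
}

/* Fill the array with small values so the sum cannot overflow an int. */
void fill_array(int a[M][N])
{
    for (int i = 0; i < M; i++)
        for (int j = 0; j < N; j++)
            a[i][j] = (i + j) % 7;
}

/* Run f once, store its result, and return the CPU time used in seconds. */
double time_sum(int (*f)(int [M][N]), int a[M][N], int *result)
{
    assert(f != NULL && result != NULL);
    clock_t start = clock();
    *result = f(a);
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Both functions must return the same sum; only the order of the memory accesses differs. The measured times depend on the hardware, array size, and compiler flags (an optimizing compiler may even interchange the loops), so no particular ratio should be expected from this sketch.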

General Cache Concepts
  [Figure not recovered.]

General Cache Concepts: Hit
  [Figure not recovered.]

General Cache Concepts: Miss
  [Figure not recovered.]

General Caching Concepts: Types of Cache Misses
  Cold (compulsory) miss
    Cold misses occur because the cache is empty.
  Conflict miss
    Most caches limit blocks at level k+1 to a small subset (sometimes a singleton) of the block positions at level k.
      E.g. Block i at level k+1 must go in block (i mod 4) at level k.
    Conflict misses occur when the level k cache is large enough, but multiple data objects all map to the same level k block.
      E.g. Referencing blocks 0, 8, 0, 8, 0, 8, ... would miss every time.
  Capacity miss
    Occurs when the set of active cache blocks (working set) is larger than the cache.
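The conflict-miss example can be replayed in a few lines of code. This sketch assumes the placement rule quoted above (block i may only go in position i mod 4) and an initially empty cache; the function name and the NUM_SLOTS constant are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_SLOTS 4  /* the level-k cache holds 4 blocks, as in the example */

/*
 * Count misses for a sequence of block numbers when block i of level k+1
 * may only be placed in slot (i mod 4) of level k. The cache starts empty,
 * so the first reference to any block is a cold miss.
 */
size_t count_mod4_misses(const int *blocks, size_t n)
{
    assert(blocks != NULL || n == 0);
    int slot[NUM_SLOTS] = {0};   /* block number currently held in each slot */
    int valid[NUM_SLOTS] = {0};  /* 1 once a slot holds a block */
    size_t misses = 0;
    for (size_t k = 0; k < n; k++) {
        assert(blocks[k] >= 0);
        int s = blocks[k] % NUM_SLOTS;
        if (!valid[s] || slot[s] != blocks[k]) {
            misses++;             /* cold or conflict miss: load the block */
            slot[s] = blocks[k];
            valid[s] = 1;
        }
    }
    return misses;
}
```

Blocks 0 and 8 both map to slot 0, so the sequence 0, 8, 0, 8, 0, 8 evicts on every reference and misses six times even though three slots stay empty. The sequence 0, 1, 0, 1, 0, 1 uses two different slots and misses only twice, both cold misses.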

General Cache Organization (S, E, B)
  [Figure not recovered.]

Cache Read
  [Figure not recovered.] Surviving annotations: the set number plays the role of a hash code; the tag plays the role of a hash key.

Example: Direct Mapped Cache (E = 1)
  [Figures not recovered.]
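Because the figures for these slides did not survive, the sketch below illustrates the hash-code and hash-key analogy under the conventional layout, which is an assumption here: with S = 2^s sets and B = 2^b bytes per block, the low b bits of an address are the block offset, the next s bits select the set, and the remaining high bits form the tag. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of an address for a cache with S = 2^s sets and B = 2^b bytes per block. */
struct cache_addr {
    uint64_t tag;     /* compared against the stored tag, like a hash key */
    uint64_t set;     /* selects the set, like a hash code */
    uint64_t offset;  /* byte position inside the block */
};

/* Split addr into tag, set index, and block offset (assumed layout: tag | set | offset). */
struct cache_addr split_address(uint64_t addr, unsigned s, unsigned b)
{
    assert(s + b < 64);
    struct cache_addr f;
    f.offset = addr & ((UINT64_C(1) << b) - 1);
    f.set    = (addr >> b) & ((UINT64_C(1) << s) - 1);
    f.tag    = addr >> (s + b);
    return f;
}
```

For instance, with s = 4 and b = 6 (16 sets of 64-byte blocks), address 0x1234 splits into tag 4, set 8, and offset 52, since 4 * 1024 + 8 * 64 + 52 = 4660 = 0x1234. In a direct-mapped cache (E = 1) the set holds a single line, so a read hits only when that line is valid and its stored tag equals the tag field.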
