
Fast Instruction Cache Analysis via Static Cache Simulation
Frank Mueller (Humboldt-Universität zu Berlin, Fachbereich Informatik), David Whalley (Florida State University, Department of Computer Science)


  1. Fast Instruction Cache Analysis via Static Cache Simulation
     Frank Mueller, Humboldt-Universität zu Berlin, Fachbereich Informatik, Unter den Linden 6, 10099 Berlin (Germany)
     David Whalley, Florida State University, Department of Computer Science, Tallahassee, FL 32304-4019, U.S.A., e-mail: whalley@cs.fsu.edu
     FSU Department of Computer Science. Fast Instruction Cache Analysis via Static Cache Simulation, SS'95

  2. Overview
     • caches bridge the bottleneck between CPU and main memory (MM) speed
     • traditional (trace-driven) methods are slow (about 100x overhead)
     • new, efficient method for instruction cache simulation:
       - provides faster instruction cache performance evaluation
       - determines the number of hits and misses of a program execution
       - used to evaluate new cache designs
       - used to analyze new optimization techniques

  3. Methods in Contrast
     • Goal: faster instruction cache performance evaluation
     • traditional approach: inline tracing
       - instrument program on complement of min. spanning tree
       - generate trace addresses
       - simulate caches based on trace
     • our approach: on-the-fly analysis
       - analyze program statically (static cache simulation)
       - instrument program on "unique paths"
       - do NOT generate trace addresses
       - simulate remaining cache behavior within program execution

  4. Static Cache Simulation
     • addresses of instructions known statically
     • predicts large portion of instruction cache references
     • uses iterative analysis of call graph and control flow
     • categorizes each instruction
     • assumes:
       - direct-mapped caches
       - currently no recursion allowed
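Because instruction addresses are known statically, the cache line each instruction maps to in a direct-mapped cache can be computed before the program runs. The following is a minimal sketch of that mapping, not the authors' tool. The constants are taken from other slides in this deck (4-byte instructions, 16-byte lines, 4 cache lines in the worked example); the function names are hypothetical.

```python
# Assumed parameters, borrowed from the example and measurement slides.
INSTR_BYTES = 4    # uniform instruction size
LINE_BYTES = 16    # 4 instructions per cache line
CACHE_LINES = 4    # size of the tiny cache used in the worked example


def program_line(addr: int) -> int:
    """Index of the line-sized chunk of program memory holding `addr`."""
    return addr // LINE_BYTES


def cache_line(addr: int) -> int:
    """Direct-mapped placement: each program line has exactly one cache slot."""
    return program_line(addr) % CACHE_LINES


def conflicts(addr_a: int, addr_b: int) -> bool:
    """True if the two instructions sit in different program lines
    that compete for the same cache line."""
    return (program_line(addr_a) != program_line(addr_b)
            and cache_line(addr_a) == cache_line(addr_b))
```

With these parameters, program lines 1 and 5 share cache line 1, so instructions at 0x10 and 0x50 conflict, while 0x10 and 0x14 share a program line and do not.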

  5. Overview of Static Cache Simulation
     [Figure: tool flow. Source files → compiler → assembly files → assembler → object files → linker → executable program. Further labels in the diagram: cache configuration, control flow info, static cache simulator, instrumentation, library routines, macros, and two labels that read approximately "cache state table" and "cache analysis".]

  6. Instruction Categorization
     • transforms call graph into function-instance graph (FIG)
     • performs analysis on FIG and control-flow graph
     • uses data-flow analysis algorithms for prediction
     • abstract cache state: potentially cached program lines
     • reaching state: reachable program lines
     • categories based on these states:
       - always hit
       - always miss
       - first miss: miss on first reference, hit on consecutive ones
       - conflict: either hit or miss (dynamic)
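The three statically predictable categories let hits and misses be derived from an execution count alone. The sketch below illustrates that bookkeeping under a simplifying assumption: each instruction is treated on its own, ignoring grouping of first misses across instructions or function instances. The function name and category strings are hypothetical.

```python
from typing import Tuple


def static_counts(category: str, freq: int) -> Tuple[int, int]:
    """Return (hits, misses) for one instruction executed `freq` times.

    Only the statically predictable categories are handled; a "conflict"
    instruction can hit or miss and has to be resolved at run time.
    """
    if freq == 0:
        return (0, 0)
    if category == "always hit":
        return (freq, 0)
    if category == "always miss":
        return (0, freq)
    if category == "first miss":
        # Miss on the first reference, hit on every later one.
        return (freq - 1, 1)
    raise ValueError("category %r needs run-time simulation" % category)
```

For example, an instruction categorized as first miss and executed 10 times contributes 9 hits and 1 miss.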

  7. Algorithm to Calculate Cache States
       input_state(main) := all invalid lines;
       WHILE any change DO
         FOR each instance of a UP in the program DO
           input_state(UP) := ∅;
           FOR each immediate predecessor P of UP DO
             input_state(UP) := input_state(UP) ∪ output_state(P);
           output_state(UP) := [input_state(UP) ∪ prog_lines(UP)] \ conf_lines(UP);
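The iteration above can be sketched in Python as a standard data-flow fixed point. This is an illustration under stated assumptions, not the authors' implementation: invalid lines are modelled as `("I", c)` tuples, `conf_lines(UP)` is taken to be every line that maps to the same cache line as one of UP's own lines, and the entry path is seeded with all invalid lines on every pass. The graph in `EXAMPLE_LINES` and `EXAMPLE_PREDS` is inferred from the abstract cache state table of the deck's example (blocks 1 to 7 plus instances 8a and 8b); the slides do not list its edges explicitly.

```python
CACHE_LINES = 4  # as in the deck's example


def cache_line(line):
    """Cache line of a program line (int) or of an invalid line ("I", c)."""
    return line[1] if isinstance(line, tuple) else line % CACHE_LINES


def abstract_cache_states(prog_lines, preds, entry):
    """Iterate input/output states of each unique path (UP) to a fixed point."""
    invalid = frozenset(("I", c) for c in range(CACHE_LINES))
    in_state = {up: frozenset() for up in prog_lines}
    out_state = {up: frozenset() for up in prog_lines}
    changed = True
    while changed:                       # WHILE any change DO
        changed = False
        for up in prog_lines:            # visiting order = dict order
            new_in = invalid if up == entry else frozenset()
            for p in preds.get(up, ()):  # union over immediate predecessors
                new_in |= out_state[p]
            own = frozenset(prog_lines[up])
            used = {cache_line(l) for l in own}
            # (input ∪ prog_lines) minus lines conflicting with UP's own lines
            new_out = frozenset(l for l in new_in | own
                                if l in own or cache_line(l) not in used)
            if new_in != in_state[up] or new_out != out_state[up]:
                in_state[up], out_state[up] = new_in, new_out
                changed = True
    return in_state, out_state


# Inferred from the example's state table (an assumption, see lead-in).
EXAMPLE_LINES = {"1": [0], "8a": [4, 5], "2": [1], "3": [1, 2], "4": [2],
                 "5": [3], "8b": [4, 5], "6": [3], "7": [3, 4]}
EXAMPLE_PREDS = {"8a": ["1"], "2": ["8a"], "3": ["2", "6"], "4": ["3"],
                 "5": ["4"], "8b": ["5"], "6": ["8b", "4"], "7": ["6"]}

in_s, out_s = abstract_cache_states(EXAMPLE_LINES, EXAMPLE_PREDS, "1")
```

On this inferred graph the second pass changes only the states around block 3, whose input picks up the lines flowing around the loop, and a third pass confirms that nothing changes further.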

  8. [Figure: example program. Control flow of main() (blocks 1 to 7, with two calls to foo() and a return) and foo() (block 8, instances (a) and (b), with a return), laid out over program lines 0 to 5. Each instruction is annotated with its category: a-hit (always hit), a-miss (always miss), f-miss (first miss), or conflict.]

  9. Notes on the example (8-1)
     • 4 cache lines
     • 16 bytes per line (4 instructions)
     • instances of foo: (a) block 8a and (b) block 8b
     • 7(1): always hit, spatial locality
     • 8b(1): always hit, temporal locality
     • 3(3): first miss
     • 5(1) and 6(1): group first miss
     • 3(1): conflict with 8b(2), which is conditionally executed

  10. Abstract Cache States for Example ("I" = invalid)
      cache line:    0 1 2 3 0 1 2 3 0 1
      program line:  I I I I 0 1 2 3 4 5

      PASS 1
        in(1)  = [I I I I]          out(1)  = [I I I 0]
        in(8a) = [I I I 0]          out(8a) = [I I 4 5]
        in(2)  = [I I 4 5]          out(2)  = [I I 1 4]
        in(3)  = [I I 1 4]          out(3)  = [I 1 2 4]
        in(4)  = [I 1 2 4]          out(4)  = [I 1 2 4]
        in(5)  = [I 1 2 4]          out(5)  = [1 2 3 4]
        in(8b) = [1 2 3 4]          out(8b) = [2 3 4 5]
        in(6)  = [I 1 2 3 4 5]      out(6)  = [1 2 3 4 5]
        in(7)  = [1 2 3 4 5]        out(7)  = [1 2 3 4 5]

      PASS 2
        in(1)  = [I I I I]          out(1)  = [I I I 0]
        in(8a) = [I I I 0]          out(8a) = [I I 4 5]
        in(2)  = [I I 4 5]          out(2)  = [I I 1 4]
        in(3)  = [I I 1 2 3 4 5]    out(3)  = [I 1 2 3 4]
        in(4)  = [I 1 2 3 4]        out(4)  = [I 1 2 3 4]
        in(5)  = [I 1 2 3 4]        out(5)  = [1 2 3 4]
        in(8b) = [1 2 3 4]          out(8b) = [2 3 4 5]
        in(6)  = [I 1 2 3 4 5]      out(6)  = [1 2 3 4 5]
        in(7)  = [1 2 3 4 5]        out(7)  = [1 2 3 4 5]

  11. Code Instrumentation
      • merging states: local path state, shared path state (SPS)
      • states provide DFA to simulate conflicts locally
      • frequency counters
      • macros for calls
      • macros for paths
      • first miss table
      • calculate hits and misses from frequencies and states

  12. [Figure: shared path state (SPS) example. Blocks 1 to 7 form paths 1 to 4; paths 1 and 2 contain program lines a and b, paths 3 and 4 contain program lines x and y, and the I-cache panel shows cache lines c and d. SPS encoding for paths 1 and 2: 0 0 = hit a, hit b; 0 1 = hit a, miss b; 1 0 = miss a, hit b; 1 1 = miss a, miss b. Instrumentation shown on the paths: freq[sps]++, sps|=0x3, sps|=0x2, sps&=~0x3, sps&=~0x1.]
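The SPS encoding in this figure (one bit per program line, bit set meaning the next reference misses) can be mimicked with a small state machine plus a frequency counter per state. The sketch below is a guess at the mechanism, not a transcription of the figure. The figure does not make clear which path carries which update, so the method names and the assignment of `|=` and `&=~` updates to paths are assumptions.

```python
MISS_A = 0x2  # high bit set: next reference to program line a misses
MISS_B = 0x1  # low bit set:  next reference to program line b misses


class SharedPathState:
    """Two-bit shared path state with one frequency counter per state."""

    def __init__(self):
        self.sps = MISS_A | MISS_B   # nothing cached yet: both would miss
        self.freq = [0, 0, 0, 0]     # freq[sps] counts executions per state

    def run_path_using_a_and_b(self):
        self.freq[self.sps] += 1         # freq[sps]++
        self.sps &= ~(MISS_A | MISS_B)   # a and b are cached afterwards

    def run_path_evicting_a(self):
        self.sps |= MISS_A               # a conflicting line replaced a

    def run_path_evicting_both(self):
        self.sps |= MISS_A | MISS_B      # conflicting lines replaced a and b

    def totals(self):
        """Derive (hits, misses) per program line from the counters."""
        runs = sum(self.freq)
        miss_a = self.freq[0b10] + self.freq[0b11]
        miss_b = self.freq[0b01] + self.freq[0b11]
        return {"a": (runs - miss_a, miss_a), "b": (runs - miss_b, miss_b)}


s = SharedPathState()
s.run_path_using_a_and_b()   # cold cache: both miss
s.run_path_using_a_and_b()   # both hit
s.run_path_evicting_a()
s.run_path_using_a_and_b()   # a misses, b hits
s.run_path_evicting_both()
s.run_path_using_a_and_b()   # both miss
result = s.totals()          # {"a": (1, 3), "b": (2, 2)}
```

The point of the design is that no addresses are traced: the path only bumps a counter and flips bits, and the hit and miss totals are computed after the run from the counters.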

  13. Measurements
      • modified back-end of opt. compiler VPO
      • performed static cache simulation
      • instrumented programs for instruction cache simulation
      • direct-mapped cache simulated
      • uniform instruction size of 4 bytes simulated
      • cache line size was 4 words (16 bytes)
      • results verified by comparison against trace-driven simulation

  14. Performance Evaluation
      • UPPAs and function instances vs. basic block partitioning
        - static savings: 24% fewer measurement points
        - dynamic savings: 31% fewer measurement points
      • predictability of instructions
        - static: 16% conflicts, other 84% predictable
        - dynamic: 26% conflicts, other 74% predictable
      • efficient in-line code instrumentation accounts for remaining savings
      • trace-driven overhead 18x, our method only 2x

  15. Static Measurements for 1kB Direct-Mapped Cache

      Name        Hit      Miss     Firstmiss  Conflict  Measure Pts.
      cachesim    70.83%    6.99%    0.70%     21.48%    73.38%
      cb          79.03%    2.35%    0.00%     18.63%    89.62%
      compact     70.12%    4.96%    0.12%     24.80%    68.89%
      copt        70.89%    7.41%    7.03%     14.67%    84.19%
      dhrystone   70.03%   10.71%    7.30%     11.96%    81.61%
      fft         74.07%    4.85%   16.42%      4.66%    78.43%
      genreport   70.61%    9.95%    5.61%     13.84%    71.58%
      mincost     72.79%    9.96%    1.14%     16.11%    83.19%
      sched       67.65%    5.06%    0.09%     27.19%    73.16%
      sdiff       68.94%   12.06%    0.89%     18.11%    72.13%
      tsp         72.61%   13.50%    3.88%     10.01%    64.08%
      whetstone   75.70%   12.84%    0.24%     11.22%    70.49%
      average     71.94%    8.39%    3.62%     16.06%    75.90%
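As a quick consistency check on the table, the published "average" row can be recomputed as the plain arithmetic mean of the twelve benchmark rows, and each row's four categories should sum to roughly 100%. The sketch below does both. The values are copied from the table; the variable and function names are hypothetical.

```python
# (hit, miss, firstmiss, conflict, measure points), all in percent.
ROWS = {
    "cachesim":  (70.83,  6.99,  0.70, 21.48, 73.38),
    "cb":        (79.03,  2.35,  0.00, 18.63, 89.62),
    "compact":   (70.12,  4.96,  0.12, 24.80, 68.89),
    "copt":      (70.89,  7.41,  7.03, 14.67, 84.19),
    "dhrystone": (70.03, 10.71,  7.30, 11.96, 81.61),
    "fft":       (74.07,  4.85, 16.42,  4.66, 78.43),
    "genreport": (70.61,  9.95,  5.61, 13.84, 71.58),
    "mincost":   (72.79,  9.96,  1.14, 16.11, 83.19),
    "sched":     (67.65,  5.06,  0.09, 27.19, 73.16),
    "sdiff":     (68.94, 12.06,  0.89, 18.11, 72.13),
    "tsp":       (72.61, 13.50,  3.88, 10.01, 64.08),
    "whetstone": (75.70, 12.84,  0.24, 11.22, 70.49),
}
PUBLISHED_AVERAGE = (71.94, 8.39, 3.62, 16.06, 75.90)


def column_means(rows):
    """Arithmetic mean of each column, rounded to two decimals."""
    n = len(rows)
    return tuple(round(sum(r[i] for r in rows.values()) / n, 2)
                 for i in range(5))


def categories_sum_to_100(rows, tol=0.05):
    """The four instruction categories should cover all instructions."""
    return all(abs(sum(r[:4]) - 100.0) <= tol for r in rows.values())
```

The average conflict share of 16.06% is the "16% conflicts" quoted for static predictability, and the 75.90% measurement points correspond to the 24% static savings.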
