Link-Time Static Analysis for Efficient Separate Compilation of Object-Oriented Languages

Jean Privat, Roland Ducournau
LIRMM, CNRS/Université Montpellier II, France

Program Analysis for Software Tools and Engineering (PASTE’05), Lisbon, 2005

Outline

1 Motivation
2 Global Techniques
    Type Analysis
    Coloring
    Binary Tree Dispatch
3 Separate Compilation
4 Benchmarks
    Description
    Results

Software Engineering Ideal

Production of Modular Software
  - Extensible software
  - Reusable software components
  ⇒ Object-Oriented Programming (inheritance + late binding)

Production of Software in a Modular Way
  - A small code modification → a small recompilation
  - Shared software components are compiled only once
  - Software components can be distributed in compiled form
  ⇒ Separate Compilation (compile components + link)

Compilation of OO Programs

Global Techniques
  Knowledge of the whole program → more efficient implementation of:
  - Method invocation
  - Attribute access
  - Subtyping tests

The Problem
  - Previous work applies global techniques only with global compilation
  - Global compilation is incompatible with modular production

Our Proposition

A Compromise
  A separate compilation framework that includes 3 global compilation techniques

How To?
  ⇒ Perform the global techniques at link-time

Type Analysis

Problems
  - Most method invocations are actually monomorphic
    → Implement them with a static direct call (no late binding)
  - Many methods are dead
    → Remove them

How To?
  - Approximate 3 sets:
    ◮ Live classes and methods
    ◮ Concrete type of each expression
    ◮ Called methods of each call site
  - Many type analyses exist

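The three sets above can be approximated with a simple fixed-point iteration. The sketch below is only an illustration in the spirit of Rapid Type Analysis, not necessarily the analysis used by the authors. It assumes single inheritance for brevity, and the program model (`Class`, `Method`, `news`, `calls`) is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Method:
    news: list = field(default_factory=list)   # classes instantiated in the body
    calls: list = field(default_factory=list)  # (static receiver class, selector)


@dataclass
class Class:
    parent: Optional[str] = None                 # single inheritance (assumption)
    methods: dict = field(default_factory=dict)  # selector -> Method


def is_subclass(classes, sub, sup):
    """True if `sub` is `sup` or one of its descendants."""
    while sub is not None:
        if sub == sup:
            return True
        sub = classes[sub].parent
    return False


def lookup(classes, cname, selector):
    """Return (defining class, selector) of the method run by a `cname` instance."""
    while cname is not None:
        if selector in classes[cname].methods:
            return (cname, selector)
        cname = classes[cname].parent
    return None


def analyze(classes, entry):
    """Approximate live classes, live methods and called methods per call site."""
    live_classes, live_methods = set(), {entry}
    changed = True
    while changed:
        changed = False
        targets = {}  # rebuilt on every pass; the last pass sees the final sets
        for cname, sel in list(live_methods):
            method = classes[cname].methods[sel]
            for c in method.news:
                if c not in live_classes:
                    live_classes.add(c)
                    changed = True
            for i, (static, callee) in enumerate(method.calls):
                found = set()
                for c in live_classes:
                    if is_subclass(classes, c, static):
                        target = lookup(classes, c, callee)
                        if target is not None:
                            found.add(target)
                targets[(cname, sel, i)] = found
                if not found <= live_methods:
                    live_methods |= found
                    changed = True
    return live_classes, live_methods, targets


# Hypothetical program: only B is instantiated, so the call on an A is monomorphic.
classes = {
    "A": Class(None, {"foo": Method()}),
    "B": Class("A", {"foo": Method()}),
    "C": Class("A", {"foo": Method()}),
    "Main": Class(None, {"main": Method(news=["B"], calls=[("A", "foo")])}),
}
live_classes, live_methods, targets = analyze(classes, ("Main", "main"))
```

In this example the single call site has exactly one possible target, so it can be compiled as a static direct call, and the `foo` methods of A and C are dead.
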
Coloring

Problem
  Overhead of standard virtual function tables (VFT) in multiple inheritance:
  - Subobjects
  - Many VFTs (quadratic number, cubic size)

Solution
  → Single-inheritance implementation even in multiple inheritance

How To?
  - Assign an identifier to each class
  - Assign a color (index) to each class, method and attribute
  - Minimize the size of the tables
  - An NP-hard problem

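Because the minimization is NP-hard, practical colorings rely on heuristics. The following sketch shows one possible greedy heuristic for class coloring, used here for subtype tests. It is an assumption for illustration and not the authors' algorithm; the function names and the choice of 0 as the gap marker are hypothetical. Two classes must receive different colors whenever they share a subclass, since they then share a table.

```python
def ancestors(parents, c):
    """All superclasses of c, including c itself."""
    seen, stack = set(), [c]
    while stack:
        x = stack.pop()
        if x not in seen:
            seen.add(x)
            stack.extend(parents[x])
    return seen


def color_classes(parents):
    """Greedy coloring; the iteration order only affects which colors are chosen."""
    supers = {c: ancestors(parents, c) for c in parents}
    color = {}
    for c in parents:
        # Colors already taken by classes that appear in a table together with c.
        forbidden = {
            color[d]
            for s in supers.values() if c in s
            for d in s if d in color
        }
        k = 0
        while k in forbidden:
            k += 1
        color[c] = k
    return color, supers


def build_tables(parents):
    """One table per class: the identifier of each superclass at its color."""
    color, supers = color_classes(parents)
    ident = {c: i + 1 for i, c in enumerate(parents)}  # 0 marks a gap
    size = max(color.values()) + 1
    tables = {c: [0] * size for c in parents}
    for c, s in supers.items():
        for d in s:
            tables[c][color[d]] = ident[d]
    return color, ident, tables


def is_subtype(tables, color, ident, c, d):
    """Constant-time test: is class c a subtype of class d?"""
    return tables[c][color[d]] == ident[d]


# Diamond A <- B, C <- D, plus an unrelated subclass E of A.
parents = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"], "E": ["A"]}
color, ident, tables = build_tables(parents)
```

Here B and E can share a color because no class inherits from both, while the table of C contains a gap at the color of B. A real implementation would also trim the tables instead of giving them all the same size.
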
Coloring (example)

[Figure: classes A, B, C and D with their tables (A table, B table, C table, D table); labels: "Methods introduced in A" and "Gap".]

Binary Tree Dispatch

Problem
  The conditional branch prediction of modern processors does not work with VFT

Solution
  → Use static jumps instead of VFT

How To?
  - Perform a type analysis
  - Assign an identifier to each live class
  - For each live call site, enumerate the concrete types in a select tree

Binary Tree Dispatch (Example)

Compiling call site x.foo
  - id is the class identifier of the receiver x
  - Concrete type of x is { A, B, C }

  Class               A      B      C
  Identifier          19     12     15
  foo implementation  A foo  B foo  C foo

Generated Code
  if id <= 15 then
    if id <= 12 then
      call B foo
    else
      call C foo
  else
    call A foo

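The select tree of this example can be generated mechanically from the sorted class identifiers. The sketch below is a minimal illustration, not the compiler's actual generator; the names `select_tree`, `render`, `dispatch` and the spelling `A_foo` are hypothetical.

```python
def select_tree(targets):
    """Build a balanced select tree from (class identifier, implementation) pairs.

    A leaf is an implementation name; an inner node is (pivot, left, right),
    meaning: go left if id <= pivot, otherwise go right.
    """
    ts = sorted(targets)

    def build(lo, hi):
        if lo == hi:
            return ts[lo][1]
        mid = (lo + hi) // 2
        return (ts[mid][0], build(lo, mid), build(mid + 1, hi))

    return build(0, len(ts) - 1)


def render(tree, indent=0):
    """Pretty-print the tree in the pseudocode style of the slide."""
    pad = "  " * indent
    if isinstance(tree, str):
        return [f"{pad}call {tree}"]
    pivot, left, right = tree
    return (
        [f"{pad}if id <= {pivot} then"]
        + render(left, indent + 1)
        + [f"{pad}else"]
        + render(right, indent + 1)
    )


def dispatch(tree, ident):
    """Simulate the generated code for a receiver whose class identifier is `ident`."""
    while not isinstance(tree, str):
        pivot, left, right = tree
        tree = left if ident <= pivot else right
    return tree


# The call site x.foo of the example: A=19, B=12, C=15.
tree = select_tree([(19, "A_foo"), (12, "B_foo"), (15, "C_foo")])
code = "\n".join(render(tree))
```

For these three identifiers, `render` reproduces the nested tests shown in the generated code. A production generator would additionally merge neighboring identifiers that share the same implementation.
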
Separate Compilation

[Figure: two source files (void foo() calling bar, void bar() calling foo) each go through a local phase and become compiled components with metadata and unresolved calls (call bar?, call foo?); the global phase combines them into the final executable, where the calls carry addresses (call 0x05217, call 0x05175).]

Two Phases
  - The local phase compiles each component independently of its future use
  - The global phase links the compiled components

Local Phase

[Figure: source code and metamodels enter the local phase, which produces a compiled component with its metamodel and internal model.]

Input
  - Source code of a class
  - Metamodel of required classes

Outputs
  - Compiled version of the class (with unresolved symbols)
  - Metadata: metamodel, internal model

Compiled Component

Method Call Site
  - Assign a unique symbol to each call site
  - Compile into a direct call

Attribute Access and Subtype Test
  - Assign a unique symbol to each color and identifier
  - Compile into a direct access:
    ◮ in the instance for attribute access
    ◮ in the subtyping table for subtype tests

Global Phase

[Figure: pipeline from the metadata and compiled components through type analysis (live global model), coloring and symbol substitution to the final executable.]

3 Stages
  - Type analysis: based on the metadata
  - Coloring: computes colors
  - Symbol substitution: generates the final executable

Method Call Site Symbols
  Substitute the address of:
  - monomorphic → the invoked method
  - polymorphic w/ BTD → a generated select tree
  - polymorphic w/ VFT → a generated table access

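The substitution rule for method call site symbols can be sketched as a small decision function. This is only an illustration of the three cases listed above: the data layout, the names (`substitute`, `site_1`, `A_foo`) and the handling of a call site without any target are assumptions, since the slides do not describe them.

```python
def substitute(call_sites, method_color, use_btd):
    """Decide what each call-site symbol is replaced with at link-time.

    call_sites:   symbol -> (selector, set of possible implementations),
                  as computed by the type analysis
    method_color: selector -> index in the method tables, from the coloring
    use_btd:      True for binary tree dispatch, False for table-based dispatch
    """
    resolved = {}
    for symbol, (selector, targets) in call_sites.items():
        if not targets:
            # Assumption: a call site with no target is dead code.
            resolved[symbol] = ("dead", None)
        elif len(targets) == 1:
            # Monomorphic: bind the symbol to the invoked method itself.
            resolved[symbol] = ("direct_call", next(iter(targets)))
        elif use_btd:
            # Polymorphic with BTD: bind to a generated select tree.
            resolved[symbol] = ("select_tree", sorted(targets))
        else:
            # Polymorphic with VFT: bind to a generated table access.
            resolved[symbol] = ("table_access", method_color[selector])
    return resolved


# Hypothetical link-time inputs.
call_sites = {
    "site_1": ("foo", {"B_foo"}),
    "site_2": ("foo", {"A_foo", "B_foo", "C_foo"}),
}
method_color = {"foo": 0}
with_btd = substitute(call_sites, method_color, use_btd=True)
with_vft = substitute(call_sites, method_color, use_btd=False)
```

The compiled components stay unchanged in both configurations; only the code bound to each symbol differs.
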
Benchmarks Description

Languages and Compilers
  - g++: Separate + VFT w/ subobjects
  - SmartEiffel: Global + Binary Tree Dispatch
  - prmc w/ VFT: Separate + Coloring + VFT
  - prmc w/ BTD: Separate + Coloring + BTD

Programs
  - The same programs for all languages
  - 1 program per OO mechanism
  ⇒ Small programs are generated by a script

Size of Executables

[Figure: size of the executable (kB) vs. number of classes, for g++, SmartEiffel, prmc w/ BTD and prmc w/ VFT.]

  - Subobjects: many VFTs → a significant overhead
  - prmc: BTD ≃ VFT
  - SmartEiffel: better dead code removal

Late Binding

[Figure: time (s) vs. size of the concrete type of the receiver, for g++, prmc w/ BTD, SmartEiffel and prmc w/ VFT.]

  - Subobjects: constant overhead + cache misses
  - Coloring: better on megamorphic calls
  - BTD: better on oligomorphic calls

Attribute Access

[Figure: time (s) vs. size of the concrete type of the receiver, for g++, SmartEiffel and prmc.]

  - Subobjects: constant overhead
  - Coloring: constant attribute access
  - SmartEiffel: can degenerate

Type Downcast

[Figure: time (s) vs. size of the concrete type of the cast expression, for g++, SmartEiffel and prmc.]

  - g++: poor performance
  - Coloring and BTD: equivalent and mostly constant

Summary

Summary
  - A separate compilation framework with global techniques for statically typed class-based languages
  - Better modularity than global compilers
  - Better performance than other separate compilers

Outlook
  - Shared libraries linked at load-time or dynamically loaded
  - Time overhead of the global phase (link)