Shape Analysis Alon Milchgrub
Overview
- Lisp review
- The concrete semantics
- The abstraction function
- The abstract semantics
- Discussion
Lisp review
In Lisp, everything is a list. The command cons joins two objects by creating a new object (a cons cell) with pointers to both of the original ones. The commands car and cdr access the first and second components of a cell, respectively; for a list these are its first element and the rest of the list. For example:
(cons 'pine '(fir oak maple)) returns (pine fir oak maple)
(car '(pine fir oak maple)) returns pine
(cdr '(pine fir oak maple)) returns (fir oak maple)
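The pointer structure behind cons, car and cdr can be sketched in Python. This is only an illustrative model, not how any Lisp implementation works: the class `Cons` and the helpers `from_items` and `to_items` are names invented for this sketch, and `None` stands in for nil.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Cons:
    """A cons cell: one pointer to the first element, one to the rest."""
    car: Any
    cdr: Any


def cons(first, rest):
    # Allocates a new cell; neither argument is copied.
    return Cons(first, rest)


def car(cell):
    return cell.car


def cdr(cell):
    return cell.cdr


def from_items(items):
    """Build a nil-terminated chain of cons cells (None stands for nil)."""
    result = None
    for item in reversed(items):
        result = cons(item, result)
    return result


def to_items(cell):
    """Walk the cdr chain and collect the car of every cell."""
    items = []
    while cell is not None:
        items.append(car(cell))
        cell = cdr(cell)
    return items


trees = from_items(["fir", "oak", "maple"])
longer = cons("pine", trees)
```

Note that `cdr(longer)` is the very same object as `trees`: cons shares the original list instead of copying it, which is the kind of aliasing that shape analysis has to track.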
Preliminaries
Let PVar be the set of pointer variables in a program. A shape graph is a directed graph with two types of edges: variable-edges $E_v$ and selector-edges $E_s$.
- $E_v$ is a set of pairs of the form $\langle x, n \rangle$, where $x \in PVar$ and $n$ is a shape-node.
- $E_s$ is a set of triples of the form $\langle s, sel, t \rangle$, where $sel \in \{car, cdr\}$ and $s$ and $t$ are shape-nodes.
A shape graph is deterministic if at most one edge exits every variable in PVar and at most one edge of each of $car$, $cdr$ exits every shape-node.
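A minimal sketch of these definitions, assuming a shape graph is stored as two Python sets: `e_v` holds (variable, node) pairs and `e_s` holds (source, selector, target) triples. The function name `is_deterministic` and the location names `l1`, `l2` are made up for the example.

```python
from collections import Counter


def is_deterministic(e_v, e_s):
    """Check the determinism condition on a shape graph.

    e_v: set of (variable, node) pairs
    e_s: set of (source, selector, target) triples
    """
    # Number of edges leaving each variable.
    var_out = Counter(var for var, _ in e_v)
    # Number of edges leaving each node per selector (car or cdr).
    sel_out = Counter((src, sel) for src, sel, _ in e_s)
    return (all(n <= 1 for n in var_out.values())
            and all(n <= 1 for n in sel_out.values()))


# x points to l1, y points to l2, and the cdr of l2 points to l1.
e_v = {("x", "l1"), ("y", "l2")}
e_s = {("l2", "cdr", "l1")}
```

Adding a second edge for `x`, or a second cdr edge out of `l2`, makes the check fail; such graphs can only arise as abstractions, never as a single concrete heap.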
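The Concrete Semantics
Example statements shown on the slide:
x := new
y := new
y.cdr := x
z := new
x.car := z
y := nil
y := x.car
z := nil
z := x
[Figure: the shape graph SG over the locations $l_1$, $l_2$, $l_3$ and the variables $x$, $y$, $z$ as the statements are applied.]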
The Concrete Semantics
The transformations applied to the shape graph are defined by the concrete semantics $[\![st]\!]_{\mathcal{SG}} : \mathcal{SG} \to \mathcal{SG}$.
Let $v$ be a control-flow graph vertex and $pathsTo(v)$ the set of paths in the control-flow graph from start to the predecessors of $v$. Then the collecting semantics is defined as follows:
$cs(v) = \{\, [\![st(v_k)]\!]_{\mathcal{SG}}(\ldots([\![st(v_1)]\!]_{\mathcal{SG}}(\langle \emptyset, \emptyset \rangle))\ldots) \mid [v_1, \ldots, v_k] \in pathsTo(v) \,\}$
This is the set of possible shape graphs at $v$.
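The definition can be read as "run every path from the empty graph and collect the results". The sketch below does that for explicitly listed paths. The tuple encoding of statements (`"new"`, `"nil"`, `"copy"`, `"store"`, `"load"`), the location names `l1`, `l2`, ... and the functions `step`, `run_path` and `collecting` are assumptions of this sketch; it also assumes that no nil pointer is dereferenced and performs no garbage collection.

```python
from itertools import count


def step(stmt, graph, fresh):
    """Apply one statement to a deterministic shape graph (e_v, e_s)."""
    e_v, e_s = graph
    env = dict(e_v)                              # variable -> location
    heap = {(s, sel): t for s, sel, t in e_s}    # (location, selector) -> location
    op = stmt[0]
    if op == "new":                              # ("new", x): x := new
        env[stmt[1]] = f"l{next(fresh)}"
    elif op == "nil":                            # ("nil", x): x := nil
        env.pop(stmt[1], None)
    elif op == "copy":                           # ("copy", x, y): x := y
        _, x, y = stmt
        if y in env:
            env[x] = env[y]
        else:
            env.pop(x, None)
    elif op == "store":                          # ("store", x, sel, y): x.sel := y
        _, x, sel, y = stmt
        heap.pop((env[x], sel), None)            # assumes x is not nil
        if y in env:
            heap[(env[x], sel)] = env[y]
    elif op == "load":                           # ("load", x, y, sel): x := y.sel
        _, x, y, sel = stmt
        target = heap.get((env[y], sel))         # assumes y is not nil
        if target is None:
            env.pop(x, None)
        else:
            env[x] = target
    new_e_s = frozenset((s, sel, t) for (s, sel), t in heap.items())
    return frozenset(env.items()), new_e_s


def run_path(path):
    """Fold the statements of one path over the empty shape graph."""
    graph = (frozenset(), frozenset())
    fresh = count(1)
    for stmt in path:
        graph = step(stmt, graph, fresh)
    return graph


def collecting(paths):
    """Set of shape graphs reached along the given paths."""
    return {run_path(path) for path in paths}


prefix = [("new", "x"), ("new", "y"), ("store", "y", "cdr", "x")]
e_v, e_s = run_path(prefix)
```

A real program with a loop has infinitely many paths, so this set cannot be enumerated in general; that is what motivates the abstraction.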
The Abstract Semantics
A static shape graph (SSG) is a pair $\langle SG, is\_shared \rangle$, where SG is a shape graph whose shape-nodes are a subset of $\{ n_X \mid X \subseteq PVar \}$, and $is\_shared$ is a function from the shape-nodes of SG to $\{true, false\}$.
Semantically, $is\_shared(n) = true$ indicates that $n$ is pointed to by more than one pointer on the heap.
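The Abstract Semantics
Given a DSG, the mapping $\alpha$ generates an SSG by replacing each concrete location with the set of pointer variables pointing to that location (after garbage collection).
[Figure: a DSG with locations $l_1, \ldots, l_5$ and variables $x$, $y$, $t$, $t_1$, and its image under $\alpha$ with shape-nodes $n_{\{y\}}$, $n_{\{t\}}$, $n_{\{x,t_1\}}$ and $n_\emptyset$.]
For the image $\alpha(DSG)$: $is\_shared(n_Z) = true \Leftrightarrow n_Z$ represents a concrete location that is pointed to by more than one pointer on the heap.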
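As a rough sketch of $\alpha$ on a single DSG: each location is renamed to the frozenset of variables pointing to it, so every location without a variable collapses into the node named by the empty set. The function name `alpha`, the representation (sets of tuples, a dict for is_shared) and the example heap are assumptions of this sketch, and the input is assumed to be garbage-collected already.

```python
def alpha(e_v, e_s):
    """Abstract one deterministic shape graph.

    Returns (abs_e_v, abs_e_s, is_shared), where abstract nodes are
    frozensets of variable names.
    """
    # Variables pointing to each location.
    names = {}
    for var, loc in e_v:
        names.setdefault(loc, set()).add(var)
    locations = ({loc for _, loc in e_v}
                 | {s for s, _, _ in e_s}
                 | {t for _, _, t in e_s})
    node = {loc: frozenset(names.get(loc, ())) for loc in locations}

    abs_e_v = {(var, node[loc]) for var, loc in e_v}
    abs_e_s = {(node[s], sel, node[t]) for s, sel, t in e_s}

    # Count incoming selector edges per concrete location.
    incoming = {}
    for _, _, t in e_s:
        incoming[t] = incoming.get(t, 0) + 1
    # A node is shared if some location it represents has two or more.
    is_shared = {n: False for n in node.values()}
    for loc, n in node.items():
        if incoming.get(loc, 0) >= 2:
            is_shared[n] = True
    return abs_e_v, abs_e_s, is_shared


# A four-cell list: x and t1 point to the head, y to the second cell.
e_v = {("x", "l1"), ("t1", "l1"), ("y", "l2")}
e_s = {("l1", "cdr", "l2"), ("l2", "cdr", "l3"), ("l3", "cdr", "l4")}
abs_e_v, abs_e_s, is_shared = alpha(e_v, e_s)
```

In the result, `l3` and `l4` are both represented by `frozenset()`, and the edge between them becomes a self-loop on that summary node.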
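The Abstract Semantics
For a set of shape graphs $S$ the abstraction function $\alpha$ is defined as follows:
$\alpha(S) = \bigsqcup_{DSG \in S} \alpha(DSG)$
where for two SSGs $SG = \langle \langle E_v, E_s \rangle, is\_shared \rangle$ and $SG' = \langle \langle E_v', E_s' \rangle, is\_shared' \rangle$:
$SG \sqcup SG' = \langle \langle E_v \cup E_v', E_s \cup E_s' \rangle, is\_shared \lor is\_shared' \rangle$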
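The join is a componentwise union, which a short sketch makes concrete. Here an SSG is assumed to be a triple `(e_v, e_s, is_shared)` with abstract nodes as frozensets of variable names and `is_shared` as a dict; nodes missing from a dict are treated as not shared. The names `join` and `join_all` are invented for the example.

```python
def join(a, b):
    """Join two SSGs: union the edge sets, OR the sharing flags."""
    e_v1, e_s1, shared1 = a
    e_v2, e_s2, shared2 = b
    nodes = set(shared1) | set(shared2)
    is_shared = {n: shared1.get(n, False) or shared2.get(n, False)
                 for n in nodes}
    return e_v1 | e_v2, e_s1 | e_s2, is_shared


def join_all(ssgs):
    """Join any number of SSGs, starting from the empty SSG."""
    result = (frozenset(), frozenset(), {})
    for ssg in ssgs:
        result = join(result, ssg)
    return result


X = frozenset({"x"})
E = frozenset()
# x points to a list of length one ...
one = ({("x", X)}, set(), {X: False})
# ... or to a longer list whose tail cells are summarized by the empty-set node.
longer = ({("x", X)}, {(X, "cdr", E), (E, "cdr", E)}, {X: False, E: False})
merged = join_all([one, longer])
```

The merged graph represents lists of every length at once, which is why a set of concrete graphs of unbounded size can be summarized by one SSG.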
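The Abstract Semantics
For a single DSG, the shape-nodes of $\alpha(DSG)$ represent disjoint sets of pointers.
Let $S$ be a set of DSGs and $\alpha(S) = \langle \langle E_v, E_s \rangle, is\_shared \rangle$. Then it follows that for all $\langle n_X, sel, n_Y \rangle \in E_s$, either $X = Y$ or $X \cap Y = \emptyset$.
[Figure: an SSG with shape-nodes $n_{\{y\}}$, $n_{\{t\}}$, $n_{\{x,t_1\}}$ and $n_\emptyset$ and variables $x$, $y$, $t$, $t_1$.]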
The Abstract Semantics
For the abstraction to be useful, one should be able to compute it directly by transforming the static shape graph, rather than by abstracting the concrete shape graphs.
For this purpose the SSG meaning function $[\![st]\!]_{\mathcal{SSG}} : \mathcal{SSG} \to \mathcal{SSG}$ is defined.
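The Abstract Semantics
Statement: x := new
[Figure: concrete and abstract effect of the statement. Concrete: $x$ points to a fresh location $l_{new}$. Abstract: $x$ points to a new shape-node $n_{\{x\}}$.]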
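The Abstract Semantics
Statement: x := y
[Figure: concrete and abstract effect of the statement. Concrete: $x$ is made to point to the location ($l_i$ or $l_j$) that $y$ points to. Abstract: the shape-nodes $n_{\{y,t_1\}}$ and $n_{\{y,t_2\}}$ become $n_{\{y,t_1,x\}}$ and $n_{\{y,t_2,x\}}$.]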
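The abstract effect of x := y amounts to renaming: every shape-node whose name contains y gets x added to its name. The sketch below assumes that x is nil before the assignment (so x appears in no node name), that an SSG is a triple `(e_v, e_s, is_shared)` with frozensets as node names, and uses the made-up function name `assign_var`.

```python
def assign_var(ssg, x, y):
    """Abstract transformer sketch for x := y (x assumed nil beforehand)."""
    e_v, e_s, is_shared = ssg

    def rename(node):
        # Nodes pointed to by y are now also pointed to by x.
        return node | {x} if y in node else node

    new_e_v = {(var, rename(node)) for var, node in e_v}
    new_e_v |= {(x, rename(node)) for var, node in e_v if var == y}
    new_e_s = {(rename(s), sel, rename(t)) for s, sel, t in e_s}
    new_shared = {rename(node): flag for node, flag in is_shared.items()}
    return new_e_v, new_e_s, new_shared


# Two possible targets for y, as in a graph joined from two program paths.
A = frozenset({"y", "t1"})
B = frozenset({"y", "t2"})
before = ({("y", A), ("t1", A), ("y", B), ("t2", B)},
          {(A, "cdr", B)},
          {A: False, B: True})
after = assign_var(before, "x", "y")
```

Because the SSG may hold several alternatives for y, the transformer renames all of them; sharing flags carry over unchanged since no heap edge is created.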
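The Abstract Semantics
Statement: x.cdr := y
[Figure: concrete and abstract effect of the statement, shown for the shape-nodes $n_{\{y,x\}}$, $n_{\{y\}}$ and $n_{\{x\}}$ and the concrete locations $l_i$, $l_j$ and $l_k$.]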
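A possible reading of the update x.cdr := y on an SSG is sketched below: the old cdr edges leaving the node of x are dropped, then edges to every node that y may point to are added. The function name `store` and the SSG representation are assumptions of this sketch. The sharing update is a deliberate simplification: a target is marked shared once it has two or more incoming selector edges, and flags are never cleared.

```python
def store(ssg, x, sel, y):
    """Abstract transformer sketch for x.sel := y."""
    e_v, e_s, is_shared = ssg
    x_nodes = {node for var, node in e_v if var == x}
    y_nodes = {node for var, node in e_v if var == y}

    # Remove the sel-edges that currently leave the node(s) of x.
    kept = {(s, f, t) for s, f, t in e_s
            if not (s in x_nodes and f == sel)}
    # Add the new edges from x's node(s) to y's node(s).
    new_e_s = kept | {(s, sel, t) for s in x_nodes for t in y_nodes}

    # Simplified sharing update (assumption of this sketch).
    new_shared = dict(is_shared)
    for target in y_nodes:
        incoming = sum(1 for _, _, dst in new_e_s if dst == target)
        new_shared[target] = is_shared.get(target, False) or incoming >= 2
    return set(e_v), new_e_s, new_shared


X = frozenset({"x"})
Y = frozenset({"y"})
E = frozenset()
before = ({("x", X), ("y", Y)},
          {(X, "cdr", E)},
          {X: False, Y: False, E: False})
after = store(before, "x", "cdr", "y")
```

Removing the old edge is safe here because a node whose name contains x stands for exactly one location, the one x points to, so the overwritten field is known precisely.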
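The Abstract Semantics
Statement: x := y.cdr
[Figure: abstract graph before the statement, with shape-nodes $n_{\{y\}}$, $n_{\{t_1\}}$ and $n_{\{t_2\}}$ and variables $y$, $t_1$, $t_2$.]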
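The Abstract Semantics
Statement: x := y.cdr
[Figure: abstract graph after the statement, with shape-nodes $n_{\{y\}}$, $n_{\{t_1\}}$, $n_{\{t_2\}}$ and the new node $n_{\{t_1,x\}}$ pointed to by $x$.]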
The Abstract Semantics
The abstract semantics associates an SSG, $SG_v$, with every control-flow vertex $v$, defined by:
$SG_v = \langle \langle \emptyset, \emptyset \rangle, \lambda n. false \rangle$ if $v = start$
$SG_v = \bigsqcup_{u \in pred(v)} [\![st(u)]\!]_{\mathcal{SSG}}(SG_u)$ otherwise
Theorem (Correctness): For every control-flow graph vertex $v$: $\alpha(cs(v)) \subseteq SG_v$
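These equations can be solved by plain iteration until nothing changes. The sketch below is one simple way to do so, not necessarily the algorithm used in the original work: `analyze`, `new_x`, `skip` and the three-vertex control-flow graph are all invented for the example, and the statement functions are assumed to be monotone so that the iteration terminates.

```python
def join(a, b):
    """Componentwise join of two SSGs (e_v, e_s, is_shared)."""
    nodes = set(a[2]) | set(b[2])
    shared = {n: a[2].get(n, False) or b[2].get(n, False) for n in nodes}
    return a[0] | b[0], a[1] | b[1], shared


def analyze(vertices, pred, st, start):
    """Iterate SG_v = join over u in pred(v) of st[u](SG_u) to a fixed point."""
    empty = (frozenset(), frozenset(), {})
    sg = {v: empty for v in vertices}
    changed = True
    while changed:
        changed = False
        for v in vertices:
            if v == start:
                continue            # SG_start stays the empty SSG
            new = sg[v]             # joining with the old value keeps the chain ascending
            for u in pred[v]:
                new = join(new, st[u](sg[u]))
            if new != sg[v]:
                sg[v] = new
                changed = True
    return sg


def new_x(ssg):
    """Abstract effect of x := new: x points to a fresh unshared node."""
    e_v, e_s, is_shared = ssg
    node = frozenset({"x"})
    return e_v | {("x", node)}, e_s, {**is_shared, node: False}


def skip(ssg):
    return ssg


# start: x := new; then a loop between vertices a and b that changes nothing.
vertices = ["start", "a", "b"]
pred = {"start": [], "a": ["start", "b"], "b": ["a"]}
st = {"start": new_x, "a": skip, "b": skip}
sg = analyze(vertices, pred, st, "start")
```

Since the set of variables is finite, there are finitely many SSGs, so the ascending iteration must stop even when the control-flow graph contains loops.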
Properties and Achievements
- "Strong Nullification": when processing a statement of the type $x.sel_0 := y$, the $sel_0$ edges currently emanating from the node pointed to by $x$ are always removed.
- Materialization: when processing a statement of the type $x := y.sel_0$, the algorithm creates a copy of the node $y.sel_0$ and is thus able to un-summarize shape-nodes.
- The shape analysis algorithm presented is able to verify shape-preservation properties of data structures such as lists, lists containing a cycle, and trees.
Discussion
- What are possible uses of this kind of analysis?
- What are possible extensions of this method?
- What are possible flaws of this method? Is it scalable?