Shape Analysis via Symbolic Memory Graphs and Conversion from Pointers to Containers
Kamil Dudka, Lukáš Holík, Petr Peringer, Marek Trtík, Tomáš Vojnar
Brno University of Technology, Czech Republic
Dagstuhl, 2/11/2015
Shape Analysis via Symbolic Memory Graphs [Dudka, Peringer, V., SAS’13]
Dudka, Holík, Peringer, Trtík, Vojnar: SMGs for going from Pointers to Containers
Symbolic Memory Graphs (SMGs)
An example of a kernel-style linked list used in Linux:
[Figure: a doubly-linked list of custom_record nodes connected through embedded list_head structures via next and prev pointers; hfo, nfo, and pfo mark field offsets inside each record.]
An SMG describing the data structure above:
[Figure: the corresponding SMG with a 2+ DLS object; edge labels include 0,ptr; nfo,ptr; pfo,ptr; 0,reg; hfo,fst; hfo,lst.]
SMGs are directed graphs consisting of:
◮ objects (allocated space) and values (addresses, integers),
◮ has-value and points-to edges.
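The kernel-style list in the figure can be sketched in C as follows. This is a minimal illustration, not code from the slides: the `payload` field, the helper names, and the reading of hfo, nfo, and pfo as offsets measured from the start of `custom_record` are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Kernel-style list node: it carries only the links, no payload. */
struct list_head {
    struct list_head *next;
    struct list_head *prev;
};

/* Record embedding the links; the payload field is illustrative. */
struct custom_record {
    int payload;
    struct list_head head;   /* embedded at offset hfo */
};

/* The three offsets assumed to label the DLS in the SMG. */
enum {
    HFO = offsetof(struct custom_record, head),
    NFO = HFO + offsetof(struct list_head, next),
    PFO = HFO + offsetof(struct list_head, prev)
};

/* Recover the enclosing record from a pointer to its embedded links. */
struct custom_record *record_of(struct list_head *h)
{
    return (struct custom_record *)((char *)h - HFO);
}

/* Make an empty cyclic list: the head points to itself. */
void list_init(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

/* Link node n just before the head h, i.e. at the tail. */
void list_add_tail(struct list_head *n, struct list_head *h)
{
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}

/* Walk the list and sum the payloads of all records. */
int sum_payloads(struct list_head *h)
{
    int sum = 0;
    for (struct list_head *it = h->next; it != h; it = it->next)
        sum += record_of(it)->payload;
    return sum;
}

/* Build a two-record list on the stack and sum its payloads. */
int demo_kernel_list(void)
{
    struct list_head list;
    struct custom_record a = { .payload = 1 }, b = { .payload = 2 };
    list_init(&list);
    list_add_tail(&a.head, &list);
    list_add_tail(&b.head, &list);
    return sum_payloads(&list);
}
```

Pointers between nodes target the embedded `list_head`, not the start of the record, which is why an analysis of such code has to track target offsets such as hfo.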
SMGs: Labelling of Objects (1/2)
[Figure: the SMG with a 2+ DLS object from the previous slide.]
Objects are further divided into:
◮ regions, i.e., individual blocks of memory,
◮ optional regions, i.e., a region or NULL,
◮ doubly-linked list segments (DLSs),
◮ singly-linked list segments (SLSs).
Each object has some (constant) size in bytes and a validity flag.
SMGs: Labelling of Objects (2/2)
Each DLS is given by a head, next, and prev field offset (hfo, nfo, pfo).
[Figure: the kernel-style list of custom_record nodes with the hfo, nfo, and pfo offsets marked.]
DLSs can be of length:
◮ N+ for any N ≥ 0 (at least N nodes), or
◮ 0 or 1.
Nodes of DLSs can point to objects that are:
◮ shared: each node points to the same object, or
◮ nested: each node points to a separate copy of the object.
  • Implemented by tagging objects by their nesting level.
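The difference between shared and nested objects can be illustrated on concrete C data. In this sketch, all type and function names (`config`, `buffer`, `append_node`, and so on) are hypothetical: every node points to one common `config` (shared) and owns its own `buffer` (nested).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct config { int verbosity; };   /* shared: one instance for the whole list */
struct buffer { char data[16]; };   /* nested: one instance per node */

struct node {
    struct node *next, *prev;
    struct config *cfg;   /* every node points to the same object */
    struct buffer *buf;   /* every node points to its own copy */
};

/* Allocate a node with a private buffer and link it after tail. */
struct node *append_node(struct node *tail, struct config *cfg)
{
    struct node *n = calloc(1, sizeof *n);
    if (!n)
        return NULL;
    n->buf = calloc(1, sizeof *n->buf);
    if (!n->buf) {
        free(n);
        return NULL;
    }
    n->cfg = cfg;
    n->prev = tail;
    if (tail)
        tail->next = n;
    return n;
}

/* Free all nodes together with their nested buffers. */
void free_list(struct node *head)
{
    while (head) {
        struct node *next = head->next;
        free(head->buf);
        free(head);
        head = next;
    }
}

/* True if two nodes share the config but own distinct buffers. */
bool demo_shared_vs_nested(void)
{
    struct config cfg = { 1 };
    struct node *a = append_node(NULL, &cfg);
    struct node *b = a ? append_node(a, &cfg) : NULL;
    bool ok = a && b && a->cfg == b->cfg && a->buf != b->buf;
    free_list(a);
    return ok;
}
```

When such a list is summarised by a single DLS, the `config` object would stay outside the segment, while the `buffer` objects would be represented once, at a higher nesting level.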
SMGs: Has-Value and Points-To Edges
[Figure: two memory regions (region 1 and region 2, of sizes size1 and size2) and their SMG: a has-value edge labelled (offset1, ptr) leads from region 1 to the address a1, and a points-to edge labelled (offset2, reg) leads from a1 to region 2.]
Has-value edges lead from objects to values and are labelled by:
◮ field offset,
◮ type of the value stored in the field.
Points-to edges lead from values (addresses) to objects and are labelled by:
◮ target offset,
◮ target specifier: first/last/each node of a DLS,
  • specifier each node: used for back-links from nested objects.
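One possible in-memory encoding of objects, values, and the two edge kinds is sketched below. It is an assumption made for illustration, not Predator's actual data structure: all type names, the fixed-size arrays, and the `smg_deref` helper are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum obj_kind    { OBJ_REGION, OBJ_OPT_REGION, OBJ_DLS, OBJ_SLS };
enum val_type    { VAL_PTR, VAL_INT };
enum target_spec { TGT_REG, TGT_FIRST, TGT_LAST, TGT_EACH };

struct smg_object { enum obj_kind kind; size_t size; int valid; int nesting_level; };

/* Has-value edge: object --(field offset, type)--> value. */
struct has_value { int obj; size_t offset; enum val_type type; int value; };

/* Points-to edge: address value --(target offset, specifier)--> object. */
struct points_to { int value; size_t target_offset; enum target_spec spec; int obj; };

struct smg {
    struct smg_object objs[4]; int n_objs;
    struct has_value  hv[4];   int n_hv;
    struct points_to  pt[4];   int n_pt;
};

/* Follow the pointer stored at `offset` in object `obj`.
 * Returns the index of the target object, or -1 if there is none. */
int smg_deref(const struct smg *g, int obj, size_t offset)
{
    for (int i = 0; i < g->n_hv; i++) {
        const struct has_value *e = &g->hv[i];
        if (e->obj != obj || e->offset != offset || e->type != VAL_PTR)
            continue;
        for (int j = 0; j < g->n_pt; j++)
            if (g->pt[j].value == e->value)
                return g->pt[j].obj;
    }
    return -1;
}

/* Region 0 stores, at offset 8, an address that points into region 1. */
int demo_smg_deref(void)
{
    struct smg g = {
        .objs = { { OBJ_REGION, 16, 1, 0 }, { OBJ_REGION, 16, 1, 0 } },
        .n_objs = 2,
        .hv = { { 0, 8, VAL_PTR, 0 } }, .n_hv = 1,
        .pt = { { 0, 4, TGT_REG, 1 } }, .n_pt = 1,
    };
    return smg_deref(&g, 0, 8);
}
```

Keeping values as separate nodes lets several fields hold the same address, so pointer equality is visible directly in the graph.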
SMGs: Join Operator
The join traverses two SMGs and tries to join simultaneously encountered objects.
Objects being joined must be locally compatible (same size, nesting level, DLS linking offsets, ...).
A DLS can be joined with a region or with another DLS.
If the above fails, the join tries to insert a DLS of length 0+ into one of the SMGs.
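The per-object step of the join can be sketched as follows. The sketch covers only the local compatibility check and the join of regions with DLSs; the traversal of the two graphs and the fallback that inserts a 0+ DLS are omitted. Treating a region as a segment of length exactly 1 and taking the minimum of the two lengths are assumptions of this sketch, and all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum obj_kind { OBJ_REGION, OBJ_DLS };

struct smg_object {
    enum obj_kind kind;
    size_t size;
    int nesting_level;
    size_t hfo, nfo, pfo;   /* linking offsets, meaningful for DLSs only */
    int min_length;         /* N in "N+", meaningful for DLSs only */
};

/* Local compatibility: same size, nesting level, and (for two DLSs) offsets. */
bool locally_compatible(const struct smg_object *a, const struct smg_object *b)
{
    if (a->size != b->size || a->nesting_level != b->nesting_level)
        return false;
    if (a->kind == OBJ_DLS && b->kind == OBJ_DLS)
        return a->hfo == b->hfo && a->nfo == b->nfo && a->pfo == b->pfo;
    return true;
}

/* Assumption: a region counts as a list segment of length exactly 1. */
int length_of(const struct smg_object *o)
{
    return o->kind == OBJ_REGION ? 1 : o->min_length;
}

/* Join two objects; returns false if they are not locally compatible. */
bool join_objects(const struct smg_object *a, const struct smg_object *b,
                  struct smg_object *out)
{
    if (!locally_compatible(a, b))
        return false;
    if (a->kind == OBJ_REGION && b->kind == OBJ_REGION) {
        *out = *a;
        return true;
    }
    /* At least one side is a DLS: the result is a DLS covering both. */
    *out = (a->kind == OBJ_DLS) ? *a : *b;
    int la = length_of(a), lb = length_of(b);
    out->min_length = la < lb ? la : lb;
    return true;
}

/* Joining a region with a 2+ DLS yields a DLS; returns its minimum length. */
int demo_join_min_length(void)
{
    struct smg_object region = { OBJ_REGION, 24, 0, 0, 0, 0, 0 };
    struct smg_object dls    = { OBJ_DLS, 24, 0, 8, 8, 16, 2 };
    struct smg_object joined;
    if (!join_objects(&region, &dls, &joined) || joined.kind != OBJ_DLS)
        return -1;
    return joined.min_length;
}
```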
SMGs: Entailment Checking
The join of SMGs is re-used: G1 ⊑ G2 is tested by computing G1 ⊔ G2 while checking that G1 consists of objects that are no more general than their counterparts in G2.
[Figure: example with list segments of minimum lengths 2+, 0+, 1+, and 0+.]
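One way to re-use a join for entailment is to accumulate a status while pairs of objects are joined. The sketch below assumes that the objects of the two graphs have already been paired by the traversal and compares only the minimum lengths of list segments; the status names and helper functions are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Relation between the two inputs observed so far during the join. */
enum join_status { JS_EQUAL, JS_LEFT_LESS, JS_RIGHT_LESS, JS_INCOMPARABLE };

/* Merge the status of one joined pair into the accumulated status. */
enum join_status combine(enum join_status acc, enum join_status step)
{
    if (acc == JS_EQUAL)
        return step;
    if (step == JS_EQUAL || step == acc)
        return acc;
    return JS_INCOMPARABLE;
}

/* An "n1+" segment is less general than an "n2+" segment iff n1 > n2,
 * because it describes fewer concrete lists. */
enum join_status compare_min_lengths(int n1, int n2)
{
    if (n1 == n2)
        return JS_EQUAL;
    return n1 > n2 ? JS_LEFT_LESS : JS_RIGHT_LESS;
}

/* G1 entails G2 if no pair makes G1 strictly more general than G2. */
bool entails(const int *len1, const int *len2, int count)
{
    enum join_status st = JS_EQUAL;
    for (int i = 0; i < count; i++)
        st = combine(st, compare_min_lengths(len1[i], len2[i]));
    return st == JS_EQUAL || st == JS_LEFT_LESS;
}
```

For instance, a graph with segments 2+ and 1+ entails one with 0+ and 0+, while segments 2+, 0+ and 0+, 1+ are incomparable in both directions.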
SMGs: Abstraction
Abstraction collapses uninterrupted sequences of compatible objects (same size, nesting level, field offsets, ...) into DLSs.
It uses the join of the sub-SMGs under the nodes to be collapsed to see whether they are compatible too.
[Figure: a sequence of list segments labelled 0+, 0+, 0+, 1+, and 0+.]
It distinguishes the cases of shared and private sub-SMGs.
Heuristic control of the choice of sequences to collapse:
◮ the ratio between the loss of precision and the number of collapsed objects.
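A heavily simplified sketch of the collapsing step follows. It looks only at a linear sequence of objects and their local compatibility; the join of sub-SMGs and the precision heuristic are left out. Counting a region as length 1 and summing the minimum lengths of the collapsed objects are assumptions of this sketch, and all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct seq_object {
    size_t size;
    int nesting_level;
    int min_length;   /* 1 for a region, N for an "N+" segment */
};

bool compatible(const struct seq_object *a, const struct seq_object *b)
{
    return a->size == b->size && a->nesting_level == b->nesting_level;
}

/* Find the longest run of pairwise compatible neighbours in objs[0..n).
 * Stores the start index in *start and returns the run length. */
size_t longest_run(const struct seq_object *objs, size_t n, size_t *start)
{
    size_t best = 0, best_start = 0, run_start = 0;
    for (size_t i = 0; i < n; i++) {
        if (i > 0 && !compatible(&objs[i - 1], &objs[i]))
            run_start = i;
        if (i - run_start + 1 > best) {
            best = i - run_start + 1;
            best_start = run_start;
        }
    }
    *start = best_start;
    return best;
}

/* Collapse objs[start..start+len) into one segment.
 * Returns false if the run is too short to be worth collapsing. */
bool collapse_run(const struct seq_object *objs, size_t start, size_t len,
                  struct seq_object *out)
{
    if (len < 2)
        return false;
    *out = objs[start];
    out->min_length = 0;
    for (size_t i = start; i < start + len; i++)
        out->min_length += objs[i].min_length;
    return true;
}
```

With two regions followed by a compatible 0+ segment, the run has three objects and collapses into a single 2+ segment.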
Predator: An Analyser Based on SMGs
Predator: An Overview
An analyser based on SMGs.
Beyond the basic features of SMGs, it offers (partial) support for:
◮ pointer arithmetic, address alignment,
◮ interval-based pointers, interval-sized objects,
◮ block operations, re-interpretation of nullified blocks.
Verification of low-level system code (such as the Linux kernel) that manipulates dynamic data structures.
Proving the absence of memory safety errors (invalid dereferences, buffer overruns, memory leaks, ...).
Implemented as an open source GCC plug-in: http://www.fit.vutbr.cz/research/groups/verifit/tools/predator
From Pointers to (List) Containers [Dudka, Holík, Peringer, Trtík, V., VMCAI’16]
From Pointers to Containers: The Goal

  typedef struct SNode {
    int x;
    struct SNode *f;
    struct SNode *b;
  } Node;
  #define NEW(T) (T*)malloc(sizeof(T))

Low-level pointer code               Container-based equivalent
  Node *h=0, *t=0;                     list<Node> L;
  while (nondet()) {                   while (nondet()) {
    Node *p=NEW(Node);                   Node *p=L.push_back();
    if (h==NULL)
      h=p;
    else
      t->f=p;
    p->f=NULL;
    p->x=0;                              p->x=0;
    p->b=t;
    t=p;
  }                                    }
  ...                                  ...
  while (t) {                          while (!L.empty())
    Node *p=t->b;                        L.pop_back();
    if (p) p->f=NULL;
    else h=NULL;
    free(t);
    t=p;
  }
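The correspondence between the two columns can be made concrete by wrapping the pointer manipulations of the left column into container operations. The sketch below is not the output of the presented analysis: the `List` type and the functions `list_push_back`, `list_empty`, and `list_pop_back` are hypothetical names chosen to mirror the right column.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct SNode {
    int x;
    struct SNode *f;   /* forward link */
    struct SNode *b;   /* backward link */
} Node;

/* Hypothetical container wrapper holding the head and tail pointers. */
typedef struct {
    Node *h;
    Node *t;
} List;

/* Pointer-level push_back: allocate a node and link it behind the tail. */
Node *list_push_back(List *l)
{
    Node *p = malloc(sizeof *p);
    if (!p)
        return NULL;
    p->x = 0;
    p->f = NULL;
    p->b = l->t;
    if (l->h == NULL)
        l->h = p;
    else
        l->t->f = p;
    l->t = p;
    return p;
}

bool list_empty(const List *l)
{
    return l->t == NULL;
}

/* Pointer-level pop_back: unlink and free the tail node, if any. */
void list_pop_back(List *l)
{
    Node *t = l->t;
    if (!t)
        return;
    Node *p = t->b;
    if (p)
        p->f = NULL;
    else
        l->h = NULL;
    free(t);
    l->t = p;
}

/* Push n nodes, pop until empty; returns pushes minus pops. */
int demo_push_pop(int n)
{
    List l = { NULL, NULL };
    int balance = 0;
    for (int i = 0; i < n; i++)
        if (list_push_back(&l))
            balance++;
    while (!list_empty(&l)) {
        list_pop_back(&l);
        balance--;
    }
    return balance;
}
```

The goal stated on the slide is the reverse direction: starting from code where these operations are inlined into loops, recognise the container and its operations automatically.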
From Pointers to Containers: Motivation
Recognition of containers and container operations has many applications:
◮ automatic parallelization,
◮ profiling and optimization,
◮ fault tolerance,
◮ program signatures for detection of plagiarism or malware,
◮ program understanding,
◮ debugging and automatic bug finding,
◮ simplification of program verification,
  • separation of pointer and non-pointer analyses.
So far, this has been done via unsound dynamic approaches, possibly with human help.