Reducing Code Size Using Outlining
Jessica Paquette, Apple
Outline • Code size • Outlining • Results • Future work
Motivating Example
Before outlining:
  callq _printf
  movl $2, %edx
  movl -8(%rbp), %esi
  addl $1, %esi
  movl %esi, -8(%rbp)
  …
  movl -20(%rbp), %ecx
  movl $2, %edx
  movl -8(%rbp), %esi
  addl $1, %esi
  movl %esi, -8(%rbp)
  …
  movl $2, %edx
  movl -8(%rbp), %esi
  addl $1, %esi
  movl %esi, -8(%rbp)
After outlining:
  callq _printf
  callq NEW_FUNC
  …
  movl -20(%rbp), %ecx
  callq NEW_FUNC
  …
  callq NEW_FUNC
NEW_FUNC:
  movl $2, %edx
  movl -8(%rbp), %esi
  addl $1, %esi
  movl %esi, -8(%rbp)
  retq
Outlining Replacing repeated sequences of instructions with calls to equivalent functions
Outliner A pass that finds repeated instruction sequences and outlines them.
  G  callq _printf
  G  movl $2, %edx
  C  movl -8(%rbp), %esi
  A  addl $1, %esi
  T  movl %esi, -8(%rbp)
  …
  C  movl -20(%rbp), %ecx
  G  movl $2, %edx
  C  movl -8(%rbp), %esi
  A  addl $1, %esi
  T  movl %esi, -8(%rbp)
  …
  G  movl $2, %edx
  C  movl -8(%rbp), %esi
  A  addl $1, %esi
  T  movl %esi, -8(%rbp)
  A  callq _printf
  B  movl $2, %edx
  C  movl -8(%rbp), %esi
  D  addl $1, %esi
  E  movl %esi, -8(%rbp)
  …
  F  movl -20(%rbp), %ecx
  B  movl $2, %edx
  C  movl -8(%rbp), %esi
  D  addl $1, %esi
  E  movl %esi, -8(%rbp)
  …
  B  movl $2, %edx
  C  movl -8(%rbp), %esi
  D  addl $1, %esi
  E  movl %esi, -8(%rbp)
Programs are like strings
Find repeated substrings
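One way to treat a program as a string is to give every distinct instruction its own "character". The sketch below is an illustration only: the helper name encodeInstructions and the choice to compare instructions by their text are assumptions made here, not details from the talk.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: maps each distinct instruction text to a small
// integer ID, so that identical instructions get identical "characters".
// Ids persists across calls so several functions can share one alphabet.
std::vector<unsigned> encodeInstructions(
    const std::vector<std::string> &Instrs,
    std::unordered_map<std::string, unsigned> &Ids) {
  std::vector<unsigned> Encoded;
  Encoded.reserve(Instrs.size());
  for (const std::string &I : Instrs) {
    // emplace leaves the map unchanged if the instruction was seen before;
    // otherwise the new instruction gets the next unused ID.
    auto It = Ids.emplace(I, static_cast<unsigned>(Ids.size())).first;
    Encoded.push_back(It->second);
  }
  return Encoded;
}
```

With this encoding, a repeated run of instructions shows up as a repeated substring of integers, which is what the search in the following slides looks for.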
Suffix Tree A data structure for string searching (Not a suffix trie!)
[Figure: suffix tree for the string A B C A B $0. Below the root are the edges AB, B, CAB$0, and $0; the AB and B edges each branch into CAB$0 and $0.]
Advantages Given a string of length L… • O(L) construction • O(L) longest repeated substring • O(L) most frequent substring
Simplified Suffix Tree Construction
[Figure sequence: the suffix tree for A B C A B $0 is built one character at a time. After each character is read, the tree holds every suffix of the text read so far: A; then B, AB; then C, BC, ABC; and so on up to ABCAB$0. Shared prefixes are merged into common edges, so the final tree has the edges AB, B, CAB$0, and $0 below the root, with AB and B each branching into CAB$0 and $0.]
*Note: Real construction is more complex. Need to store links between internal nodes.
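The simplified construction can be sketched by inserting each suffix in turn and splitting an edge wherever a new suffix diverges from an existing one. This is a quadratic-time illustration, not the linear-time construction with links between internal nodes that the note refers to. The names NaiveNode, insertSuffix, buildNaiveSuffixTree, and countLeaves are hypothetical, and the sketch assumes the string ends in a character that appears nowhere else (a single '$' stands in for $0).

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>

// Hypothetical node: the edge into this node is labelled Str[StartIdx, EndIdx).
struct NaiveNode {
  std::map<char, std::unique_ptr<NaiveNode>> Children;
  std::size_t StartIdx = 0;
  std::size_t EndIdx = 0;
};

// Inserts the suffix Str[SuffixStart..] by walking from the root.
void insertSuffix(NaiveNode &Root, const std::string &Str,
                  std::size_t SuffixStart) {
  NaiveNode *Cur = &Root;
  std::size_t Pos = SuffixStart;
  while (Pos < Str.size()) {
    auto It = Cur->Children.find(Str[Pos]);
    if (It == Cur->Children.end()) {
      // No edge starts with this character: hang the rest off as a leaf.
      auto Leaf = std::make_unique<NaiveNode>();
      Leaf->StartIdx = Pos;
      Leaf->EndIdx = Str.size();
      Cur->Children[Str[Pos]] = std::move(Leaf);
      return;
    }
    NaiveNode *Child = It->second.get();
    std::size_t EdgePos = Child->StartIdx;
    // Follow the edge label for as long as it matches the suffix.
    while (EdgePos < Child->EndIdx && Pos < Str.size() &&
           Str[EdgePos] == Str[Pos]) {
      ++EdgePos;
      ++Pos;
    }
    if (EdgePos == Child->EndIdx) {
      Cur = Child; // Whole edge matched: continue below the child.
      continue;
    }
    // Mismatch in the middle of the edge: split it at EdgePos.
    auto Mid = std::make_unique<NaiveNode>();
    Mid->StartIdx = Child->StartIdx;
    Mid->EndIdx = EdgePos;
    Child->StartIdx = EdgePos;
    Mid->Children[Str[EdgePos]] = std::move(It->second);
    It->second = std::move(Mid);
    Cur = It->second.get(); // The next iteration adds the new leaf here.
  }
}

std::unique_ptr<NaiveNode> buildNaiveSuffixTree(const std::string &Str) {
  auto Root = std::make_unique<NaiveNode>();
  for (std::size_t I = 0; I < Str.size(); ++I)
    insertSuffix(*Root, Str, I);
  return Root;
}

// Counts leaves below N; in a finished tree there is one per suffix.
std::size_t countLeaves(const NaiveNode &N) {
  if (N.Children.empty())
    return 1;
  std::size_t Count = 0;
  for (const auto &KV : N.Children)
    Count += countLeaves(*KV.second);
  return Count;
}
```

Building the tree for "ABCAB$" this way gives four edges below the root and six leaves, one per suffix, which matches the shape of the tree on the slides.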
Representation
Suffix Tree Struct
struct SuffixTree {
  Node *Root;
  size_t LeafEnd;
  ActiveState Active;
  StringType longestRepeatedSubstring();
  void findOccurrences(std::vector<int> &Occurrences, const StringType &QueryStr);
  void prune(const StringType &Str);
  …
};
Node Struct
struct Node {
  Node *Parent;
  std::map<CharacterType, Node *> Children;
  size_t StartIdx;
  size_t EndIdx;
  size_t SuffixIndex;
  bool Valid;
  …
};
Outlining Example
FOO()
  A  R1 = 0xDEADBEEF
  B  R3 = R2 + R1
  C  R1 = *R5
  D  R7 = 0xFEEDFACE
  E  R1 = R1 - 1
BAR()
  F  R7 = R3 + R2
  A  R1 = 0xDEADBEEF
  B  R3 = R2 + R1
  C  R1 = *R5
  G  R7 = 0xFACEFEED
  A  R1 = 0xDEADBEEF
  B  R3 = R2 + R1
  C  R1 = *R5
  E  R1 = R1 - 1
String Encoding (each function ends with a unique terminator)
  FOO: A B C D E $0
  BAR: F A B C G A B C E $1
String Encoding (concatenated): A B C D E $0 F A B C G A B C E $1
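The concatenation with unique terminators can be sketched as follows. Because each terminator occurs exactly once, no repeated substring can contain one, so no candidate spans a function boundary. The function name concatWithTerminators and the scheme of counting terminators down from the largest unsigned value are assumptions made for this illustration; it also assumes instruction IDs are small enough never to reach the terminator range.

```cpp
#include <limits>
#include <vector>

// Hypothetical helper: joins the encoded functions into one string,
// appending a distinct terminator after each function.
std::vector<unsigned> concatWithTerminators(
    const std::vector<std::vector<unsigned>> &Functions) {
  std::vector<unsigned> Result;
  // Terminators count down from the maximum value (assumed unused by
  // instruction IDs), so each one appears exactly once.
  unsigned Terminator = std::numeric_limits<unsigned>::max();
  for (const std::vector<unsigned> &F : Functions) {
    Result.insert(Result.end(), F.begin(), F.end());
    Result.push_back(Terminator--);
  }
  return Result;
}
```

With A through G encoded as 0 through 6, passing FOO as {0, 1, 2, 3, 4} and BAR as {5, 0, 1, 2, 6, 0, 1, 2, 4} yields a 16-element string whose sixth and last elements are the two distinct terminators.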
Find Candidates
[Figure sequence: suffix tree for A B C D E $0 F A B C G A B C E $1, traversed to find the longest repeated substring. The table below shows the best candidate as the traversal proceeds.]
Occurrences   Length   Longest repeated substring
0             0        None
3             1        C
3             3        ABC
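The result in the table can be checked with a much simpler, slower method than a suffix tree. The sketch below sorts all suffixes and compares neighbours, which is a stand-in technique chosen for brevity, not the tree traversal shown on the slides. The names Candidate and longestRepeated are hypothetical, and the sketch does not filter out overlapping occurrences.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical result type: the length of the longest repeated substring
// and every position at which it starts.
struct Candidate {
  std::size_t Length = 0;
  std::vector<std::size_t> Starts;
};

Candidate longestRepeated(const std::vector<unsigned> &Str) {
  const std::size_t N = Str.size();
  std::vector<std::size_t> Suffixes(N);
  for (std::size_t I = 0; I < N; ++I)
    Suffixes[I] = I;
  // Sort suffix start positions by the suffixes they denote.
  std::sort(Suffixes.begin(), Suffixes.end(),
            [&](std::size_t A, std::size_t B) {
              return std::lexicographical_compare(
                  Str.begin() + A, Str.end(), Str.begin() + B, Str.end());
            });
  auto CommonPrefix = [&](std::size_t A, std::size_t B) {
    std::size_t Len = 0;
    while (A + Len < N && B + Len < N && Str[A + Len] == Str[B + Len])
      ++Len;
    return Len;
  };
  // The longest repeat is the longest prefix shared by two neighbouring
  // suffixes in sorted order.
  Candidate Best;
  std::size_t PatternStart = 0;
  for (std::size_t I = 1; I < N; ++I) {
    std::size_t Len = CommonPrefix(Suffixes[I - 1], Suffixes[I]);
    if (Len > Best.Length) {
      Best.Length = Len;
      PatternStart = Suffixes[I];
    }
  }
  if (Best.Length == 0)
    return Best; // Nothing repeats.
  // Collect every start position of that substring, in ascending order.
  for (std::size_t P = 0; P + Best.Length <= N; ++P)
    if (std::equal(Str.begin() + P, Str.begin() + P + Best.Length,
                   Str.begin() + PatternStart))
      Best.Starts.push_back(P);
  return Best;
}
```

Encoding A through G as 0 through 6 and the terminators $0 and $1 as 100 and 101, the example string gives a length of 3 with occurrences starting at positions 0, 7, and 11, which agrees with the final row of the table (3 occurrences of ABC).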