Modular Interpretive Decompilation of Low-Level Code by Partial Evaluation

Elvira Albert (1), joint work with Miguel Gómez-Zamalloa (1) and Germán Puebla (2)
(1) Complutense University of Madrid (Spain)
(2) Technical University of Madrid (Spain)

Beijing, September 2008

Introduction: Motivation

Low-level code ⇒ Intermediate representations
Mobile environments: only low-level code is available.
Analysis tools are unavoidably more complicated:
◮ unstructured control flow,
◮ use of operand stack,
◮ use of heap, etc.
Decompiling to intermediate representations:
◮ abstracts away particular language features;
◮ simplifies development of analyzers, model checkers, etc.;
◮ variants: clause-based, BoogiePL, Soot, etc.

High-level (declarative) languages
Convenient intermediate representation:
◮ iterative constructs (loops) ⇒ recursion;
◮ all variables in the local scope of methods are represented uniformly.
Advanced tools for declarative languages can be re-used.

Introduction: Interpretive Decompilation

Most approaches develop hand-written decompilers.
Appealing alternative: interpretive decompilation.
Partial evaluation (PE) allows specializing a program w.r.t. some part of its input.

Definition (1st Futamura Projection)
A program P written in L_S can be compiled into another language L_O by specializing an interpreter for L_S written in L_O w.r.t. P.

First Futamura Projection: Partial Evaluation and the Interpretive Approach

p(in1, in2) = output

[Figure: the partial evaluator (mix) takes the program p and the static data (in1) and produces the specialized program p_in1; running p_in1 on the dynamic data (in2) yields the output. A legend distinguishes data from programs.]

[[p]] [in1, in2] = [[ [[mix]] [p, in1] ]] [in2]
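The equation on this slide can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the program `power`, the choice of `n` as static input (in1) and `x` as dynamic input (in2), and `specialize_power`, which is a hand-written specializer for this one program and not a general mix.

```python
def power(x, n):
    """Original program p(in1, in2): n plays in1 (static), x plays in2 (dynamic)."""
    result = 1
    for _ in range(n):
        result *= x
    return result


def specialize_power(n):
    """Hypothetical specializer for `power` only: unfold the loop for a known n.

    Returns the source text of a residual program power_<n>(x).
    """
    body = " * ".join(["x"] * n) if n > 0 else "1"
    return f"def power_{n}(x):\n    return {body}\n"


def load(source, name):
    """Turn residual source text into a callable."""
    namespace = {}
    exec(source, namespace)
    return namespace[name]


residual_source = specialize_power(3)
power_3 = load(residual_source, "power_3")

# [[p]] [in1, in2] == [[ [[mix]] [p, in1] ]] [in2] for a few dynamic inputs
for x in range(-3, 4):
    assert power(x, 3) == power_3(x)
```

The residual program no longer contains the loop over `n`: all computation that depends only on the static input was carried out at specialization time.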

First Futamura Projection: Partial Evaluation and the Interpretive Approach

p(in1, in2) = output

[Figure: the partial evaluator (mix) takes the bytecode interpreter (written in LP) and the bytecode program (in1) and produces the decompiled program (in LP); running the decompiled program on the input args (in2) yields the output. A legend distinguishes data from programs.]

[[bc_interp]] [in1, in2] = [[ [[mix]] [bc_interp, in1] ]] [in2]
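To make the instantiated equation concrete, here is a small Python sketch under strong simplifying assumptions. The stack language (`load`, `push`, `add`, `mul`, `return`) is hypothetical and has no jumps, and `specialize` is a hand-written unfolding of `interp` for a known program, not the partial evaluator used in the talk. The point it shows is that the operand stack only exists at specialization time and is absent from the residual program.

```python
def interp(program, args):
    """Interpreter for a hypothetical straight-line stack language."""
    stack = []
    for op, arg in program:
        if op == "load":
            stack.append(args[arg])
        elif op == "push":
            stack.append(arg)
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == "return":
            return stack.pop()
    raise ValueError("program ended without return")


def specialize(program):
    """Unfold `interp` over a known program; the stack holds expression text."""
    stack = []
    for op, arg in program:
        if op == "load":
            stack.append(f"args[{arg}]")
        elif op == "push":
            stack.append(repr(arg))
        elif op in ("add", "mul"):
            b, a = stack.pop(), stack.pop()
            sign = "+" if op == "add" else "*"
            stack.append(f"({a} {sign} {b})")
        elif op == "return":
            return f"def residual(args):\n    return {stack.pop()}\n"
    raise ValueError("program ended without return")


# Static input (in1): a program computing (args[0] + args[1]) * 2
PROGRAM = [("load", 0), ("load", 1), ("add", None),
           ("push", 2), ("mul", None), ("return", None)]

namespace = {}
exec(specialize(PROGRAM), namespace)
residual = namespace["residual"]

# [[interp]] [in1, in2] == [[ [[mix]] [interp, in1] ]] [in2]
assert interp(PROGRAM, [3, 4]) == residual([3, 4]) == 14
```

Handling jumps and loops, as in real bytecode, requires a specializer that can generate recursive residual definitions, which this sketch does not attempt.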

An Example of Interpretive Decompilation

Example 1

Source code

    int gcd(int x, int y) {
        int res;
        while (y != 0) {
            res = x % y;
            x = y;
            y = res;
        }
        return x;
    }

Bytecode

    0:load(1)
    1:if0eq(11)
    2:load(0)
    3:load(1)
    4:rem
    5:store(2)
    6:load(1)
    7:store(0)
    8:load(2)
    9:store(1)
    10:goto(0)
    11:load(0)
    12:return

Bytecode interpreter

    main(Method,InArgs,Top) :-
        build_s0(InArgs,S0),
        execute(S0,Sf),
        Sf = st(_,[Top|_],_).

    execute(S1,Sf) :-
        S1 = st(PC,_,_),
        bytecode(PC,Inst,_),
        step(Inst,S1,S2),
        execute(S2,Sf).

    step(push(X),S1,S2) :-
        S1 = st(PC,S,L),
        next(PC,PC2),
        S2 = st(PC2,[X|S],L).

    step(store(X),S1,S2) :-
        S1 = st(PC,[I|S],LV),
        next(PC,PC2),
        localVar_update(LV,X,I,LV2),
        S2 = st(PC2,S,LV2).
    ......

Decompiled code

    main(gcd,[X,0],X).
    main(gcd,[X,Y],Z) :- Y \= 0,
        R is X rem Y,
        exec_1(Y,R,Z).

    exec_1(Y,0,Y).
    exec_1(Y,R,Z) :- R \= 0,
        R1 is Y rem R,
        exec_1(R,R1,Z).
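As a sanity check on this example, the following Python sketch runs the gcd bytecode through a small stack-machine interpreter and compares the result with a transliteration of the decompiled clauses. The instruction semantics are assumptions read off the instruction names (`load(i)`/`store(i)` move values between local variable `i` and the operand stack, `if0eq(t)` pops a value and jumps to `t` if it is zero, `rem` is a truncating remainder); the slide itself elides most of the interpreter.

```python
def rem(a, b):
    """Truncating remainder (sign follows the dividend), like Java's % on ints."""
    r = abs(a) % abs(b)
    return -r if a < 0 else r


# The gcd bytecode from the slide; the list index is the program counter.
GCD_BYTECODE = [
    ("load", 1), ("if0eq", 11), ("load", 0), ("load", 1), ("rem", None),
    ("store", 2), ("load", 1), ("store", 0), ("load", 2), ("store", 1),
    ("goto", 0), ("load", 0), ("return", None),
]


def interpret(code, args, num_locals=3):
    """Run the bytecode on a state (pc, operand stack, local variables)."""
    local_vars = list(args) + [0] * (num_locals - len(args))
    stack, pc = [], 0
    while True:
        op, arg = code[pc]
        pc += 1  # default: fall through to the next instruction
        if op == "load":
            stack.append(local_vars[arg])
        elif op == "store":
            local_vars[arg] = stack.pop()
        elif op == "rem":
            b, a = stack.pop(), stack.pop()
            stack.append(rem(a, b))
        elif op == "if0eq":
            if stack.pop() == 0:
                pc = arg
        elif op == "goto":
            pc = arg
        elif op == "return":
            return stack.pop()
        else:
            raise ValueError(f"unknown instruction: {op}")


def main_gcd(x, y):
    """Transliteration of the decompiled main/3 clauses."""
    if y == 0:
        return x
    return exec_1(y, rem(x, y))


def exec_1(y, r):
    """Transliteration of the decompiled exec_1/3 clauses."""
    if r == 0:
        return y
    return exec_1(r, rem(y, r))


for x, y in [(12, 18), (18, 12), (7, 0), (0, 7), (-12, 18), (17, 5)]:
    assert interpret(GCD_BYTECODE, [x, y]) == main_gcd(x, y)
```

The decompiled form has neither a program counter nor an operand stack: the loop in the bytecode has become the recursive `exec_1`, and the local variables have become its arguments.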

Contributions in Interpretive Decompilation

Advantages w.r.t. dedicated (de-)compilers:
◮ more flexible: the interpreter is easier to modify;
◮ more reliable: it is easier to trust that the semantics is preserved;
◮ easier to maintain: new changes are easily reflected in the interpreter;
◮ easier to implement: provided a partial evaluator is available.
