Decompiling Boolean Expressions from Java™ Bytecode
Mangala Gowri Nanda (IBM-IRL) and S. Arun-Kumar (IIT Delhi)

Introduction: The Problem

Motivation
- Generating executable code from program slices.
- Java bytecode does not preserve program structure, especially for complex boolean expressions.
- Try to reconstruct a boolean expression (equivalent to the original) in terms of &&, || and the ternary if-then-else (?:).
- goto is a four-letter word (so is break).

Equivalent CFGs for a simple OR clause

[Figure: two CFGs over nodes 4201 to 4206. In (a), node 4202 tests (n % 9==0) and node 4203 tests (m % 9==1); in (b), node 4202 tests (n % 9!=0) and node 4203 tests (m % 9!=1). In both, node 4204 is hs.add(new Integer(3)); and node 4205 is hs.add(new Integer(m)); return;]

(a) The Classic OR

    if(n%9==0) goto 4204
    else goto 4203
    4203: if(m%9==1) goto 4204
    else goto 4205
    4204: hs.add(new Int(3));
    goto 4205
    4205: hs.add(new Int(m));
    return;

(b) The AND equivalent

    if(n%9!=0 && m%9!=1) {}
    else { hs.add(new Int(3)); }
    hs.add(new Int(m));
    return;

Equivalent CFGs for a simple OR clause (continued)

[Figure: two more CFGs over nodes 4201 to 4206. In (c), node 4202 tests (n % 9==0) and node 4203 tests (m % 9!=1); in (d), node 4202 tests (n % 9!=0) and node 4203 tests (m % 9==1). In both, node 4204 is hs.add(new Integer(3)); and node 4205 is hs.add(new Integer(m)); return;]

(c) An alternative

    if(n%9==0) goto 4204
    else goto 4203
    4203: if(m%9!=1) goto 4205
    else goto 4204
    4204: hs.add(new Int(3));
    goto 4205
    4205: hs.add(new Int(m));
    return;

(d) Yet Another Alternative

    if(n%9!=0) goto 4203
    else goto 4204
    4203: if(m%9==1) goto 4204
    else goto 4205
    4204: hs.add(new Int(3));
    goto 4205
    4205: hs.add(new Int(m));
    return;
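
The (a) and (b) variants can also be compared as ordinary structured Java. The following is a minimal sketch and is not taken from the slides: the class and method names (OrEquivalence, classicOr, andEquivalent, agreeOn) are hypothetical, and a List stands in for the slides' hs so that the add of 3 stays observable even when m is 3.

```java
import java.util.ArrayList;
import java.util.List;

public class OrEquivalence {
    // (a) The classic OR, written with || instead of gotos.
    static List<Integer> classicOr(int n, int m) {
        List<Integer> out = new ArrayList<>();
        if (n % 9 == 0 || m % 9 == 1) {
            out.add(3);
        }
        out.add(m);
        return out;
    }

    // (b) The AND equivalent: negated tests, empty then-branch.
    static List<Integer> andEquivalent(int n, int m) {
        List<Integer> out = new ArrayList<>();
        if (n % 9 != 0 && m % 9 != 1) {
            // intentionally empty, as in the slide
        } else {
            out.add(3);
        }
        out.add(m);
        return out;
    }

    // Brute-force check that both forms agree for all n, m in [lo, hi].
    static boolean agreeOn(int lo, int hi) {
        for (int n = lo; n <= hi; n++) {
            for (int m = lo; m <= hi; m++) {
                if (!classicOr(n, m).equals(andEquivalent(n, m))) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(agreeOn(-20, 20)); // prints true
    }
}
```

A decompiler that sees only the bytecode cannot tell which of these the programmer wrote, since both behave identically on every input.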

Outline

1 Introduction
    The Problem
2 Generating Code
    The Monochromatic Theorem
    An example with only ANDs and ORs
    Handling ternary expressions
3 Untwistable DAGs
    Managing untwistable DAGs
4 Results

Generating Code: The Monochromatic Theorem

The Monochromatic Theorem: A Lemma

The language of non-negative conditional expressions is defined by the BNF

    c ::= a
        | c1 && c2
        | c1 || c2
        | c0 ? c1 : c2
        | (c0 ? i1 : i2) == val
        | (c0 ? o1 : o2).boolfunc()

Lemma
For this language of non-negative conditional expressions, every CFG generated for the program segment "if c then S_true else S_false" may be transformed into one that preserves the property that all incoming edges to any node in the CFG are of the same color.

Pushing Negation Inwards

Lemma (The Boolean Identities)

    !!c′ = c′
    !(c0 ? c1 : c2) = !c0 ? !c2 : !c1
    !(c1 && c2) = !c1 || !c2
    !(c1 || c2) = !c1 && !c2

- Negations of all comparison operators are also available in non-negative form.
- Negation "twists" the subgraphs.
- Use negation (recursively) to ensure that incoming arcs are of the same color.
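
The identities above can be applied recursively over an expression tree. The following is a minimal sketch under stated assumptions, not the authors' implementation: the AST classes (Cmp, And, Or, Cond) and the class name PushNegation are hypothetical, atoms are restricted to integer comparisons so that each operator has an exact complement, and the ternary rule follows the slide's form !c0 ? !c2 : !c1.

```java
public class PushNegation {
    // Hypothetical mini-AST; every node knows how to return its own negation
    // in non-negative form (no '!' ever appears in the result).
    interface Expr { Expr negate(); }

    // Atomic integer comparison such as n%9==0.
    static final class Cmp implements Expr {
        final String lhs, op, rhs;
        Cmp(String lhs, String op, String rhs) { this.lhs = lhs; this.op = op; this.rhs = rhs; }
        public Expr negate() { return new Cmp(lhs, flip(op), rhs); }
        static String flip(String op) {
            switch (op) {
                case "==": return "!=";
                case "!=": return "==";
                case "<":  return ">=";
                case ">=": return "<";
                case ">":  return "<=";
                case "<=": return ">";
                default: throw new IllegalArgumentException("unknown operator: " + op);
            }
        }
        @Override public String toString() { return lhs + op + rhs; }
    }

    static final class And implements Expr {
        final Expr l, r;
        And(Expr l, Expr r) { this.l = l; this.r = r; }
        // !(c1 && c2) = !c1 || !c2
        public Expr negate() { return new Or(l.negate(), r.negate()); }
        @Override public String toString() { return "(" + l + " && " + r + ")"; }
    }

    static final class Or implements Expr {
        final Expr l, r;
        Or(Expr l, Expr r) { this.l = l; this.r = r; }
        // !(c1 || c2) = !c1 && !c2
        public Expr negate() { return new And(l.negate(), r.negate()); }
        @Override public String toString() { return "(" + l + " || " + r + ")"; }
    }

    static final class Cond implements Expr {
        final Expr c0, c1, c2;
        Cond(Expr c0, Expr c1, Expr c2) { this.c0 = c0; this.c1 = c1; this.c2 = c2; }
        // !(c0 ? c1 : c2) = !c0 ? !c2 : !c1
        public Expr negate() { return new Cond(c0.negate(), c2.negate(), c1.negate()); }
        @Override public String toString() { return "(" + c0 + " ? " + c1 + " : " + c2 + ")"; }
    }

    public static void main(String[] args) {
        Expr or = new Or(new Cmp("n%9", "==", "0"), new Cmp("m%9", "==", "1"));
        System.out.println(or.negate()); // prints (n%9!=0 && m%9!=1)
    }
}
```

Because negate() is defined on every node, the identity !!c′ = c′ holds by construction: negating twice rebuilds the original tree. Floating-point comparisons would need separate treatment, since NaN breaks the simple operator flip used here.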

The Monochromatic Theorem

Theorem
For each basic block that participates in a boolean expression, all the incoming edges must be the same color. That is, all incoming edges are either true edges or false edges or, in the case of certain ternary clauses, unconditional edges (black edges).

If they are not, simply twist them using the boolean identities.
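
The same-color condition can be tested mechanically on an edge list. The sketch below is illustrative only: the Edge encoding, the Color enum, and the names MonochromeCheck, variant, incomingColors and isMonochromatic are assumptions, and the edge colors are read off the goto listings for variants (a) and (c) of the OR example by treating the "if" target as a true edge and the "else" target as a false edge. Only node 4204, the block guarded by the boolean expression, is examined.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

public class MonochromeCheck {
    enum Color { TRUE, FALSE, BLACK } // BLACK = unconditional edge

    static final class Edge {
        final int from, to;
        final Color color;
        Edge(int from, int to, Color color) { this.from = from; this.to = to; this.color = color; }
    }

    // Collect the distinct colors of all edges entering the given node.
    static Set<Color> incomingColors(List<Edge> cfg, int node) {
        Set<Color> colors = EnumSet.noneOf(Color.class);
        for (Edge e : cfg) {
            if (e.to == node) {
                colors.add(e.color);
            }
        }
        return colors;
    }

    static boolean isMonochromatic(List<Edge> cfg, int node) {
        return incomingColors(cfg, node).size() <= 1;
    }

    // Edge list for variant (a) when secondTestNegated is false, (c) when true.
    static List<Edge> variant(boolean secondTestNegated) {
        List<Edge> cfg = new ArrayList<>();
        cfg.add(new Edge(4202, 4204, Color.TRUE));   // if(n%9==0) goto 4204
        cfg.add(new Edge(4202, 4203, Color.FALSE));  // else goto 4203
        if (secondTestNegated) {
            cfg.add(new Edge(4203, 4205, Color.TRUE));   // if(m%9!=1) goto 4205
            cfg.add(new Edge(4203, 4204, Color.FALSE));  // else goto 4204
        } else {
            cfg.add(new Edge(4203, 4204, Color.TRUE));   // if(m%9==1) goto 4204
            cfg.add(new Edge(4203, 4205, Color.FALSE));  // else goto 4205
        }
        cfg.add(new Edge(4204, 4205, Color.BLACK));  // goto 4205
        return cfg;
    }

    public static void main(String[] args) {
        System.out.println(isMonochromatic(variant(false), 4204)); // prints true
        System.out.println(isMonochromatic(variant(true), 4204));  // prints false
    }
}
```

In variant (c), node 4204 receives one true edge and one false edge; negating the test at 4203 swaps its two outgoing colors and yields variant (a), which is the "twist" the theorem refers to.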

Constructing the boolean expressions

[Figure: fourteen CFG panels, each ending in sTrue(); and sFalse(); blocks, with temporaries such as t8 = i1; t8 = i2; (t8==val) for the ternary-valued cases. Panel captions:]

(a) c0 && c1
(b) c0 || c1
(c) c0 ? c1 : c2
(d) (c0 ? i1 : i2) == val
(e) (c00 || c01) && (c10 || c11)
(f) (c00 && c01) || (c10 && c11)
(g) (c00 ? c01 : c02) ? c1 : c2
(h) (c00 ? c01 : c02) ? (c10 ? c11 : c12) : (c20 ? c21 : c22)
(i) (c0 ? c1 : c2) && (c3 ? c4 : c5)
(j) (c0 ? c1 : c2) || (c3 ? c4 : c5)
(k) ((c00 ? c01 : c02) ? i1 : i2) == val
(l) ((c00 ? c01 : c02) ? (c1 ? i1 : i2) : (c2 ? i3 : i4)) == val
(m) (c0 ? i1 : i2) == val1 && (c1 ? i3 : i4) == val2
(n) (c0 ? i1 : i2) == val1 || (c1 ? i3 : i4) == val2