Obfuscation vs. Deobfuscation ISSISP 2018 Christian Collberg University of Arizona
1. Obfuscating Arithmetic 2. Opaque Expressions 3. Control Flow Flattening 4. Constructing Opaque Expressions 5. Generic Deobfuscation 6. Turning up the heat… 7. Anti Disassembly (if time) 8. Anti Tamper (if time)
Obfuscating Arithmetic
Encoding Integer Arithmetic
• x+y = x − ¬y − 1
• x+y = (x ⊕ y) + 2·(x ∧ y)
• x+y = (x ∨ y) + (x ∧ y)
• x+y = 2·(x ∨ y) − (x ⊕ y)
Example One possible encoding of z=x+y+w is z = (((x ^ y) + ((x & y) << 1)) | w) + (((x ^ y) + ((x & y) << 1)) & w); Many others are possible, which is good for diversity.
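The encodings above can be spot-checked mechanically. The following is a minimal sketch, not Tigress output: the function names (add_v1 to add_v4, add3_encoded, check_encodings) are made up for illustration, and uint32_t is assumed so that overflow wraps around instead of being undefined.

```c
#include <assert.h>
#include <stdint.h>

/* The four encodings of x + y, one function per identity. */
static uint32_t add_v1(uint32_t x, uint32_t y) { return x - ~y - 1u; }
static uint32_t add_v2(uint32_t x, uint32_t y) { return (x ^ y) + 2u * (x & y); }
static uint32_t add_v3(uint32_t x, uint32_t y) { return (x | y) + (x & y); }
static uint32_t add_v4(uint32_t x, uint32_t y) { return 2u * (x | y) - (x ^ y); }

/* z = x + y + w, written exactly as in the example:
   the inner x + y uses the xor/and identity, the outer + uses or/and. */
static uint32_t add3_encoded(uint32_t x, uint32_t y, uint32_t w) {
    return (((x ^ y) + ((x & y) << 1)) | w) + (((x ^ y) + ((x & y) << 1)) & w);
}

/* Compare every encoding with plain + on a grid of sample values,
   including values that force wrap-around. */
static void check_encodings(void) {
    static const uint32_t s[] = {0u, 1u, 2u, 7u, 255u,
                                 0x7fffffffu, 0x80000000u, 0xffffffffu};
    const int n = (int)(sizeof s / sizeof s[0]);
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            uint32_t x = s[i], y = s[j];
            uint32_t sum = (uint32_t)(x + y);
            assert(add_v1(x, y) == sum);
            assert(add_v2(x, y) == sum);
            assert(add_v3(x, y) == sum);
            assert(add_v4(x, y) == sum);
            for (int k = 0; k < n; k++)
                assert(add3_encoded(x, y, s[k]) == (uint32_t)(sum + s[k]));
        }
    }
}
```

Calling check_encodings() from a test driver exercises all sample pairs; a sampled check like this is evidence, not a proof, that the identities hold.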
Exercise!
• The virtualizer’s add instruction handler could still be identified by the fact that it uses a + operator!
• Try adding an arithmetic transformer:
  tigress --Environment=x86_64:Linux:Gcc:4.6 \
    --Transform=Virtualize \
    --Functions=fib \
    --VirtualizeDispatch=switch \
    --Transform=EncodeArithmetic \
    --Functions=fib \
    --out=fib5.c fib.c
• What differences do you notice between before and after arithmetic encoding?
Before arithmetic encoding:
  int fib(int n ) {
    ...
    while (1) {
      switch (*(_1_fib_$pc[0])) {
        case PlusA: {
          (_1_fib_$sp[0] + -1)->_int = (_1_fib_$sp[0] + -1)->_int + (_1_fib_$sp[0] + 0)->_int;
          break;
        }
      }
    }
After arithmetic encoding, using x+y = (x ⊕ y)+2·(x ∧ y):
  int fib(int n ) {
    ...
    while (1) {
      switch (*(_1_fib_$pc[0])) {
        case PlusA: {
          (_1_fib_$sp[0] + -1)->_int = ((_1_fib_$sp[0] + -1)->_int ^ (_1_fib_$sp[0] + 0)->_int) + (((_1_fib_$sp[0] + -1)->_int & (_1_fib_$sp[0] + 0)->_int) << 1);
          break;
        }
      }
    }
Opaque Expressions
Opaque Expressions An expression whose value is known to you as the defender (at obfuscation time) but which is difficult for an attacker to figure out
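• P^T / P^F: opaquely true / false predicate (of the branches TRUE and FALSE, only one is ever taken)
• P^?: opaquely indeterminate predicate (either branch, TRUE or FALSE, may be taken)
• E^{=v}: an expression of value v, for example E^{=42}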
Examples
• Opaquely true predicate: (2 | (x^2 + x))^T
• Opaquely indeterminate predicate: (x mod 2 = 0)^?
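A minimal sketch of the two example predicates in C. The function names are hypothetical, and uint32_t is an assumption made so that x*x wraps around instead of overflowing; parity survives the wrap because 2 divides 2^32.

```c
#include <assert.h>
#include <stdint.h>

/* Opaquely true: x^2 + x = x(x+1) is a product of two consecutive
   integers, so it is always even. */
static int opaquely_true(uint32_t x) {
    return (x * x + x) % 2u == 0u;
}

/* Opaquely indeterminate: true for even x, false for odd x. */
static int opaquely_indeterminate(uint32_t x) {
    return x % 2u == 0u;
}

/* Sampled check: the first predicate never fails, the second goes both ways. */
static void check_predicates(void) {
    for (uint32_t x = 0u; x < 100000u; x++)
        assert(opaquely_true(x));
    assert(opaquely_true(0xffffffffu));
    assert(opaquely_indeterminate(4u));
    assert(!opaquely_indeterminate(5u));
}
```

A predicate this simple is easy for an attacker to recognize; it only illustrates the definition.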
Inserting Bogus Control Flow
Original:
  if (x[k] == 1)
    R = (s*y) % n;
  else
    R = s;
  s = R*R % n;
  L = R;
With the constant replaced by an opaque expression E^{=1}:
  if (x[k] == E^{=1})
    R = (s*y) % n;
  else
    R = s;
  s = R*R % n;
  L = R;
Inserting Bogus Control Flow
Original:
  if (x[k] == 1)
    R = (s*y) % n;
  else
    R = s;
  s = R*R % n;
  L = R;
With an opaquely true predicate expr^T (the else branch is bogus and never executes):
  if (x[k] == 1)
    R = (s*y) % n;
  else
    R = s;
  if (expr^T)
    s = R*R % n;
  else
    s = R*R * n;
  L = R;
Inserting Bogus Control Flow
Original:
  if (x[k] == 1)
    R = (s*y) % n;
  else
    R = s;
  s = R*R % n;
  L = R;
With an opaquely indeterminate predicate expr^? (both branches compute the same value):
  if (x[k] == 1)
    R = (s*y) % n;
  else
    R = s;
  if (expr^?)
    s = R*R % n;
  else
    s = (R%n)*(R%n)%n;
  L = R;
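The two bogus-branch variants can be sketched as ordinary C functions. This is an illustration under stated assumptions, not what any particular obfuscator emits: the function names and the extra parameter v (an arbitrary value fed to the opaque predicate) are hypothetical, and the squaring is widened to 64 bits so the products cannot overflow.

```c
#include <assert.h>
#include <stdint.h>

/* The unobfuscated statement: s = R*R % n. */
static uint64_t square_mod(uint32_t R, uint32_t n) {
    assert(n != 0u);
    return ((uint64_t)R * R) % n;
}

/* Opaquely true guard: v*v + v is always even, so the else branch
   (deliberately wrong code) is never executed. */
static uint64_t square_mod_true_guard(uint32_t R, uint32_t n, uint32_t v) {
    assert(n != 0u);
    if ((v * v + v) % 2u == 0u)
        return ((uint64_t)R * R) % n;
    else
        return ((uint64_t)R * R) * n;   /* bogus, dead code */
}

/* Opaquely indeterminate guard: either branch may run, so both must
   compute the same value. */
static uint64_t square_mod_either_guard(uint32_t R, uint32_t n, uint32_t v) {
    assert(n != 0u);
    if (v % 2u == 0u)
        return ((uint64_t)R * R) % n;
    else
        return (((uint64_t)R % n) * ((uint64_t)R % n)) % n;
}

/* All three versions agree on every sampled input. */
static void check_bogus_flow(void) {
    for (uint32_t R = 0u; R < 200u; R++)
        for (uint32_t n = 1u; n < 50u; n++)
            for (uint32_t v = 0u; v < 4u; v++) {
                assert(square_mod_true_guard(R, n, v) == square_mod(R, n));
                assert(square_mod_either_guard(R, n, v) == square_mod(R, n));
            }
}
```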
Arithmetic Encoding
z=x+y+w
z = (((x ^ y) + ((x & y) << 1)) | w) + (((x ^ y) + ((x & y) << 1)) & w);
With the shift amount replaced by an opaque expression E^{=1}:
z = (((x ^ y) + ((x & y) << E^{=1})) | w) + (((x ^ y) + ((x & y) << E^{=1})) & w);
Exercise!
1) Opaque Predicates
  tigress --Environment=x86_64:Linux:Gcc:4.6 --Seed=0 \
    --Transform=InitEntropy \
    --InitEntropyKinds=vars \
    --Transform=InitOpaque \
    --Functions=main \
    --InitOpaqueCount=2 \
    --InitOpaqueStructs=list,array \
    --Transform=AddOpaque \
    --Functions=fib \
    --AddOpaqueKinds=question \
    --AddOpaqueCount=10 \
    --out=fib1.c fib.c
Control Flow Analysis
Control-Flow Graph (CFG)
• A way to represent the possible flow of control inside a function.
• Nodes: called basic blocks. Each block consists of straight-line code ending (possibly) in a branch.
• Edges: an edge A → B means that control could flow from A to B.
• There is one unique entry node and one unique exit node.
Code:
  int foo() {
    printf("Boo!");
  }
CFG: ENTRY → [ printf("Boo!"); ] → EXIT
Code:
  int foo() {
    x=1;
    y=2;
    printf(x+y);
  }
CFG: ENTRY → [ x=1; y=2; printf(x+y); ] → EXIT
Code:
  int foo() {
    read(x);
    if (x>0) printf(x);
  }
CFG blocks, in order:
  ENTRY
  x=1; if (x>0) goto B2
  printf(x)
  EXIT
Code:
  int foo() {
    read(x);
    if (x>0) printf(x);
    else printf(x+1);
  }
CFG blocks, in order:
  ENTRY
  x=1; if (x>0) goto B2
  printf(x)
  printf(x+1)
  EXIT
Code:
  int foo() {
    x=10;
    while (x>0){
      printf(x);
      x=x-1;
    }
  }
CFG blocks, in order:
  ENTRY
  x=10;
  if (x<=0) goto B3
  printf(x); x=x-1; goto B1;
  EXIT
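To make the correspondence between the loop and its basic blocks concrete, here is a sketch that writes the last example twice: once structured, once with one label per block. It rests on two assumptions that are not on the slide: printf(x) is replaced by adding x to a running sum so the result can be checked, and the function names are invented. The labels B1 and B3 follow the goto targets in the CFG.

```c
#include <assert.h>

/* Structured form of the loop. */
static int loop_structured(void) {
    int x = 10, sum = 0;
    while (x > 0) {
        sum += x;       /* stands in for printf(x) */
        x = x - 1;
    }
    return sum;
}

/* Same computation, one basic block after another with explicit gotos. */
static int loop_blocks(void) {
    int x, sum = 0;
    x = 10;                     /* entry block */
B1: if (x <= 0) goto B3;        /* loop test */
    sum += x;                   /* loop body */
    x = x - 1;
    goto B1;                    /* back edge */
B3: return sum;                 /* exit block */
}

static void check_loop_forms(void) {
    assert(loop_structured() == loop_blocks());
}
```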
Control Flow Flattening
Flatten
Code:
  int modexp(int y, int x[], int w, int n) {
    int R, L;
    int k=0; int s=1;
    while (k < w) {
      if (x[k] == 1)
        R = (s*y) % n;
      else
        R = s;
      s = R*R % n;
      L = R;
      k++;
    }
    return L;
  }
CFG:
  B0: k=0; s=1
  B1: if (k<w) (true: B2, false: B6)
  B2: if (x[k]==1) (true: B3, false: B4)
  B3: R=(s*y) mod n
  B4: R=s
  B5: s=R*R mod n; L=R; k++; goto B1
  B6: return L
int modexp(int y, int x[], int w, int n) {
  int R, L, k, s;
  int next=0;
  for(;;)
    switch(next) {
      case 0: k=0; s=1; next=1; break;
      case 1: if (k<w) next=2; else next=6; break;
      case 2: if (x[k]==1) next=3; else next=4; break;
      case 3: R=(s*y)%n; next=5; break;
      case 4: R=s; next=5; break;
      case 5: s=R*R%n; L=R; k++; next=1; break;
      case 6: return L;
    }
}
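Flattening should change the shape of the code but not its result. The sketch below places the structured loop and the flattened switch side by side and compares them on a few inputs. The names modexp_plain, modexp_flat, and check_flattening are made up for this example, s starts at 1 as in block B0, and R and L are initialised to 0 (an addition) so that the w = 0 case is well defined.

```c
#include <assert.h>

/* Structured version. */
static int modexp_plain(int y, const int x[], int w, int n) {
    int R = 0, L = 0;
    int k = 0, s = 1;
    while (k < w) {
        if (x[k] == 1) R = (s * y) % n;
        else           R = s;
        s = R * R % n;
        L = R;
        k++;
    }
    return L;
}

/* Flattened version: every basic block is a case, and next selects
   which block runs on the following trip through the dispatch loop. */
static int modexp_flat(int y, const int x[], int w, int n) {
    int R = 0, L = 0, k = 0, s = 1;
    int next = 0;
    for (;;) switch (next) {
        case 0: k = 0; s = 1; next = 1; break;
        case 1: if (k < w) next = 2; else next = 6; break;
        case 2: if (x[k] == 1) next = 3; else next = 4; break;
        case 3: R = (s * y) % n; next = 5; break;
        case 4: R = s; next = 5; break;
        case 5: s = R * R % n; L = R; k++; next = 1; break;
        case 6: return L;
    }
}

/* Both versions agree on all sampled bases and exponent bit strings. */
static void check_flattening(void) {
    static const int bits[4][3] = {{1, 0, 1}, {0, 0, 0}, {1, 1, 1}, {0, 1, 0}};
    for (int i = 0; i < 4; i++)
        for (int y = 0; y < 10; y++)
            assert(modexp_plain(y, bits[i], 3, 7) == modexp_flat(y, bits[i], 3, 7));
    /* bits 1,0,1 encode exponent 5, and 3^5 mod 7 = 5 */
    assert(modexp_flat(3, bits[0], 3, 7) == 5);
}
```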
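[Figure: flattened CFG. The entry sets next=0 and enters switch(next), which dispatches to one of the blocks B0 to B6; every block except B6 sets next and returns to the switch.]
  B0: k=0; s=1; next=1
  B1: if (k<w) next=2 else next=6
  B2: if (x[k]==1) next=3 else next=4
  B3: R=(s*y)%n; next=5
  B4: R=s; next=5
  B5: s=R*R%n; L=R; k++; next=1
  B6: return L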
Flattening Algorithm
1. Construct the CFG
2. Add a new variable int next=0;
3. Create a switch inside an infinite loop, where every basic block is a case:
   switch
     case 0: block_0
     case n: block_n
4. Add code to update the next variable:
   case n: {
     if (expression)
       next = …
     else
       next = …
   }
Flatten this CFG! Work with your friends!
  B1: ENTER; X := 20;
  B2: if x >= 10 goto B4
  B3: X := X − 1;
  B4: A[X] := 10; Y := X + 5; if X <> 4 goto B6
  B5: X := X − 2; EXIT
  B6: goto B2
Constructing Opaque Expressions
int modexp(int y, int x[], int w, int n) {
  int R, L, k, s;
  int next=E^{=0};
  for(;;)
    switch(next) {
      case 0: k=0; s=1; next=E^{=1}; break;
      case 1: if (k<w) next=E^{=2}; else next=E^{=6}; break;
      case 2: if (x[k]==1) next=E^{=3}; else next=E^{=4}; break;
      case 3: R=(s*y)%n; next=E^{=5}; break;
      case 4: R=s; next=E^{=5}; break;
      case 5: s=R*R%n; L=R; k++; next=E^{=1}; break;
      case 6: return L;
    }
}
Opaque Values
Opaque values from array aliasing

  index:  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19
  value: 36 58  1 46 23  5 16 65  2 41  2  7  1 37  0 11 16  2 21 16

Invariants:
• every third cell, starting with cell 0 (cells 0, 3, 6, …, 18), is ≡ 1 mod 5;
• cells 2 and 5 hold the values 1 and 5, respectively;
• every third cell, starting with cell 1 (cells 1, 4, 7, …, 19), is ≡ 2 mod 7;
• cells 8 and 11 hold the values 2 and 7, respectively.
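The invariants can be turned into opaque values by indexing the array with a value the attacker cannot easily predict. The sketch below uses the array contents from the table; the names cells, invariants_hold, opaque_one, and opaque_two, and the choice of i % 7 to pick a cell, are assumptions made for illustration only.

```c
#include <assert.h>

static const int cells[20] = {36, 58, 1, 46, 23, 5, 16, 65, 2, 41,
                              2, 7, 1, 37, 0, 11, 16, 2, 21, 16};

/* Returns 1 if the array satisfies all four invariants. */
static int invariants_hold(const int *a) {
    for (int i = 0; i < 20; i += 3)
        if (a[i] % 5 != 1) return 0;
    for (int i = 1; i < 20; i += 3)
        if (a[i] % 7 != 2) return 0;
    return a[2] == 1 && a[5] == 5 && a[8] == 2 && a[11] == 7;
}

/* Opaque value 1: any cell 0, 3, ..., 18 reduced modulo cell 5 (which holds 5). */
static int opaque_one(const int *a, unsigned i) {
    return a[3u * (i % 7u)] % a[5];
}

/* Opaque value 2: any cell 1, 4, ..., 19 reduced modulo cell 11 (which holds 7). */
static int opaque_two(const int *a, unsigned i) {
    return a[3u * (i % 7u) + 1u] % a[11];
}

/* Whatever i is, the two expressions always yield 1 and 2. */
static void check_array_opaques(void) {
    assert(invariants_hold(cells));
    for (unsigned i = 0u; i < 100u; i++) {
        assert(opaque_one(cells, i) == 1);
        assert(opaque_two(cells, i) == 2);
    }
}
```

A comparison such as opaque_one(cells, i) == 1 is then an opaquely true predicate, provided every write to the array preserves the invariants.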
int modexp(int y, int x[], int w, int n) {
  int R, L, k, s;
  int next=0;
  int g[] = {10,9,2,5,3};
  for(;;)
    switch(next) {
      case 0: k=0; s=1; next=g[0]%g[1]; /* =1 */ break;
      case 1: if (k<w) next=g[g[2]]; /* =2 */
              else next=g[0]-2*g[2]; /* =6 */
              break;
      case 2: if (x[k]==1) next=g[3]-g[2]; /* =3 */
              else next=2*g[2]; /* =4 */
              break;
      case 3: R=(s*y)%n; next=g[4]+g[2]; /* =5 */ break;
      case 4: R=s; next=g[0]-g[3]; /* =5 */ break;
      case 5: s=R*R%n; L=R; k++; next=g[g[4]]%g[2]; /* =1 */ break;
      case 6: return L;
    }
}
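As a sanity check, the version with array-based opaque values can be compiled and run on inputs whose answers are known by hand. In this sketch the name modexp_opaque, the file-scope placement of g, the default case, and the initialisation of R and L are additions for illustration; the expressions assigned to next are the ones from the slide.

```c
#include <assert.h>

/* In a real obfuscator the contents of g would be hard to determine
   statically; here it is a plain array so the example stays short. */
static int g[] = {10, 9, 2, 5, 3};

static int modexp_opaque(int y, const int x[], int w, int n) {
    int R = 0, L = 0, k = 0, s = 1;
    int next = 0;
    for (;;) switch (next) {
        case 0: k = 0; s = 1; next = g[0] % g[1]; break;        /* =1 */
        case 1: if (k < w) next = g[g[2]];                      /* =2 */
                else next = g[0] - 2 * g[2];                    /* =6 */
                break;
        case 2: if (x[k] == 1) next = g[3] - g[2];              /* =3 */
                else next = 2 * g[2];                           /* =4 */
                break;
        case 3: R = (s * y) % n; next = g[4] + g[2]; break;     /* =5 */
        case 4: R = s; next = g[0] - g[3]; break;               /* =5 */
        case 5: s = R * R % n; L = R; k++;
                next = g[g[4]] % g[2]; break;                   /* =1 */
        case 6: return L;
        default: return -1;  /* added guard: only reached if g is corrupted */
    }
}

static void check_opaque_dispatch(void) {
    static const int e5[] = {1, 0, 1};   /* exponent bits for 5 */
    static const int e3[] = {1, 1};      /* exponent bits for 3 */
    assert(modexp_opaque(3, e5, 3, 7) == 5);   /* 3^5 mod 7 = 5 */
    assert(modexp_opaque(2, e3, 2, 5) == 3);   /* 2^3 mod 5 = 3 */
}
```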
Generic Deobfuscation
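Dynamic Analysis
[Figure: the program P(){ } runs on INPUT and produces OUTPUT; the instruction TRACE recorded during the run (ADD, SUB, BRA, DIV, PUSH(), SHL, XOR, POP(), CALL, ...) is transformed into TRACE’.]
Yadegari, et al., A Generic Approach to Deobfuscation. IEEE S&P’15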
[Figure: an instruction trace (ADD, SUB, BRA, SHL, DIV, XOR, CALL, ...) passes through three stages, with check marks on selected instructions at each stage:
  • Forward Taint Analysis (depend on input)
  • Backward Taint Analysis (contribute to output)
  • Compiler Optimizations
Code fragments shown alongside: main (a) { ... }, main () { ... }, b ← a; c ← b; printf(b);]
[Figure: the interpreter P(){ } with its VPC, SP, and STACK, handlers labelled sub, add, call, and print, and a trace (ADD, SUB, BRA, SHL, DIV, CALL, PRINT) with check marks on some instructions. Annotation: "Not input dependent!"]
int modexp(int y, int x[], int w, int n) {
  int R, L, k, s;
  int next=E^{=0};
  for(;;)
    switch(next) {
      case 0: k=0; s=1; next=E^{=1}; break;
      case 1: if (k<w) next=E^{=2}; else next=E^{=6}; break;
      case 2: if (x[k]==1) next=E^{=3}; else next=E^{=4}; break;
      case 3: R=(s*y)%n; next=E^{=5}; break;
      case 4: R=s; next=E^{=5}; break;
      case 5: s=R*R%n; L=R; k++; next=E^{=1}; break;
      case 6: return L;
    }
}
Annotation on the slide: "Not input dependent!"
[Figure: a sequence of calls DEC( ), DEC( ), DEC( ), ENC( ), DEC( ), annotated "Not input dependent!"]
Anti-Dynamic Analysis
• Artificially inflate trace size
• Force the collection of multiple traces
• Prevent traces from being collected
  main(){
    if (traced)
      abort();
  }
Pawlowski, et al., Probfuscation: An Obfuscation Approach …
[Figure: diagram with the labels Overhead, Protection, Resources, and Precision]