Statistical De-obfuscation for Android - PowerPoint PPT Presentation

www.srl.inf.ethz.ch. Statistical De-obfuscation for Android. Petar Tsankov, ETH Zurich. DeGuard Team: Benjamin Bichsel, Veselin Raychev, Petar Tsankov, Martin Vechev. Why De-obfuscate Android Applications?


  1. www.srl.inf.ethz.ch. Statistical De-obfuscation for Android. Petar Tsankov, ETH Zurich. DeGuard Team: Benjamin Bichsel, Veselin Raychev, Petar Tsankov, Martin Vechev.

  2. Why De-obfuscate Android Applications? Google Play: Android binaries (APKs), no code available. F-Droid: open-source, code available.

  3. Why De-obfuscate Android Applications? Google Play: 2.6 M APKs. F-Droid: 5 K APKs. Which APKs are malicious? Which ones use vulnerable libraries?

  4. Layout Obfuscation in Android. Obfuscation replaces descriptive application-specific names with non-descriptive names. Names of API classes/methods remain.

     Original code (descriptive application-specific names):

         package com.example.dbhelper;
         class DBHelper extends SQLiteHelper {
           SQLiteDatabase db;
           public DBHelper(Context ctx) {
             db = getWritableDatabase();
           }
           Cursor execSQL(String str) {
             return db.rawQuery(str);
           }
         }

     Obfuscated code (non-descriptive names, API names remain):

         package a.b.c;
         class a extends SQLiteHelper {
           SQLiteDatabase b;
           public a(Context ctx) {
             b = getWritableDatabase();
           }
           Cursor c(String str) {
             return b.rawQuery(str);
           }
         }

  5. Layout Obfuscation in Android (same code example as slide 4). Security challenges: code inspection, third-party library detection, … many others.

  6. Layout Obfuscation in Android (same code example as slide 4). Can we reverse layout obfuscation?

  7. Layout Obfuscation in Android (same code example as slide 4). Yes, with roughly 80% accuracy! www.apk-deguard.com
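
The renaming shown on the layout-obfuscation slides can be sketched in a few lines of Java. This is only an illustration, not how any particular obfuscator is implemented: the class `LayoutObfuscationSketch`, the `API_NAMES` set, and the single global naming counter are assumptions made for the example. It assigns short names in order of first appearance and leaves API names untouched.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class LayoutObfuscationSketch {
    // Hypothetical set of API names that must stay unchanged,
    // because they are defined outside the application.
    static final Set<String> API_NAMES = Set.of(
            "SQLiteHelper", "SQLiteDatabase", "Context", "Cursor",
            "String", "getWritableDatabase", "rawQuery");

    // Maps each application-specific name to a short name (a, b, c, ...)
    // in order of first appearance. API names are never added to the map.
    public static Map<String, String> buildRenaming(List<String> identifiers) {
        Map<String, String> renaming = new LinkedHashMap<>();
        for (String id : identifiers) {
            if (API_NAMES.contains(id) || renaming.containsKey(id)) {
                continue;
            }
            renaming.put(id, shortName(renaming.size()));
        }
        return renaming;
    }

    // 0 -> "a", 1 -> "b", ..., 25 -> "z", 26 -> "aa", 27 -> "ab", ...
    public static String shortName(int index) {
        StringBuilder sb = new StringBuilder();
        for (int i = index; i >= 0; i = i / 26 - 1) {
            sb.insert(0, (char) ('a' + i % 26));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> ids = List.of("DBHelper", "SQLiteHelper", "db",
                "getWritableDatabase", "execSQL", "rawQuery");
        // Prints {DBHelper=a, db=b, execSQL=c}
        System.out.println(buildRenaming(ids));
    }
}
```

With the identifiers from the slide, the sketch produces the same renaming as the example: `DBHelper` becomes `a`, `db` becomes `b`, and `execSQL` becomes `c`, while `SQLiteHelper`, `getWritableDatabase`, and `rawQuery` keep their names.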

  8. Demo

  9. www.apk-deguard.com. Released in October 2016. So far: > 100 GB of distinct APKs de-obfuscated, Reddit posts/comments, Tweets, . . .

  10. How Does DeGuard Work?

  11. DeGuard: System Overview.
      Learning phase: open-source, unobfuscated APKs → static analysis → semantic representation → training → probabilistic model $P$.
      Prediction phase: obfuscated code → static analysis → MAP inference → transform → de-obfuscated code.

      Obfuscated code:

          class a extends SQLiteHelper {
            SQLiteDatabase b;
            public a(Context ctx) {
              b = getWritableDB();
            }
          }

      De-obfuscated code:

          class DBHelper extends SQLiteHelper {
            SQLiteDatabase db;
            public DBHelper(Context ctx) {
              db = getWritableDB();
            }
          }

  12. Probabilistic Graphical Models

  13. Probabilistic Graphical Models. Graph + features define a probabilistic graphical model.

      Code:

          class a extends SQLiteHelper {
            SQLiteDatabase b;
            public a(Context ctx) {
              b = getWritableDB();
            }
          }

      Graph edges: (SQLiteHelper, a) extends; (a, b) field-in; (getWritableDB, b) gets.
      Known variables: SQLiteHelper, getWritableDB. Unknown variables: a, b.

      Feature functions $f_1, f_2, \ldots, f_7$:

          feature  relation  name1          name2     weight
          f1       extends   SQLiteHelper   DBUtils   0.3
          f2       extends   SQLiteHelper   DBHelper  0.2
          f3       gets      getWritableDB  db        0.7
          f4       gets      getWritableDB  instance  0.4
          f5       field-in  DBUtils        instance  0.5
          f6       field-in  DBHelper       db        0.4
          f7       …         …              …         …

      $$P(a, b \mid \text{SQLiteHelper}, \text{getWritableDB}) = \frac{1}{Z} \exp\bigl(0.3 \cdot f_1(\text{SQLiteHelper}, a) + 0.2 \cdot f_2(\text{SQLiteHelper}, a) + \cdots\bigr)$$

      For details see report on www.apk-deguard.com

  14. Probabilistic Graphical Models (same content as slide 13). Next: How are the features and their weights learned?
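
As a worked instance of the formula on slide 13, the sketch below evaluates one candidate assignment, a = DBHelper and b = db. It assumes (this is not stated on the slide) that each feature function is an indicator that equals 1 when its (name1, name2) pair appears on the corresponding edge of the graph and 0 otherwise, so only the features matching the assignment contribute their weights.

```latex
\begin{align*}
P(a = \text{DBHelper},\, b = \text{db} \mid \text{SQLiteHelper}, \text{getWritableDB})
  &= \frac{1}{Z} \exp\bigl(
       \underbrace{0.2}_{f_2:\ (\text{SQLiteHelper},\ \text{DBHelper})}
     + \underbrace{0.7}_{f_3:\ (\text{getWritableDB},\ \text{db})}
     + \underbrace{0.4}_{f_6:\ (\text{DBHelper},\ \text{db})}
     \bigr) \\
  &= \frac{1}{Z} \exp(1.3)
\end{align*}
```

The normalization constant $Z$ sums the same exponential over all candidate assignments, so it is identical for every candidate and does not affect which assignment scores highest.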

  15. Learning

  16. Learning. Unobfuscated APKs → static analysis → dependency graphs. Dependency graphs + feature templates → features → train model → weights.
      Actual graphs have > 1,000 nodes; the slide also gives the figure > 2,000.
      28 feature templates; > 100,000 features (with candidate names).

      Features before training have only (name1, name2); training assigns each a weight:

          feature  name1          name2     weight
          f1       SQLiteHelper   DBUtils   0.3
          f2       SQLiteHelper   DBHelper  0.2
          f3       getWritableDB  db        0.7
          f4       getWritableDB  instance  0.4
          f5       DBUtils        instance  0.5
          f6       DBHelper       db        0.4
          f7       …              …         …

      Compute weights that maximize $P(O = o_i \mid K = k_i)$ for all training samples $(o_i, k_i)$.
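
One standard way to write the training objective from the learning slide is as a conditional log-likelihood over the training samples. The block below is a sketch under assumptions: the weight vector $w$, the per-sample normalizer $Z(k)$, and the log-linear form are taken from the formula on slide 13, and the slide does not say whether regularization or an approximation of the likelihood is used.

```latex
\begin{align*}
P(O = o \mid K = k;\, w) &= \frac{1}{Z(k)} \exp\Bigl(\sum_{j} w_j\, f_j(o, k)\Bigr), \\
w^{*} &= \operatorname*{arg\,max}_{w} \sum_{i} \log P(O = o_i \mid K = k_i;\, w).
\end{align*}
```

Here $o_i$ are the original (unobfuscated) names in training sample $i$ and $k_i$ are the known names, such as API classes and methods.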

  17. DeGuard: System Overview (same diagram as slide 11).

  18. Prediction Phase. Obfuscated code → static analysis → dependency graph with candidate features.

      Obfuscated code:

          class a extends SQLiteHelper {
            SQLiteDatabase b;
            public a(Context ctx) {
              b = getWritableDB();
            }
          }

      Graph edges: (SQLiteHelper, a) extends; (a, b) field-in; (getWritableDB, b) gets.

      Feature weights:

          relation  name1          name2     weight
          extends   SQLiteHelper   DBUtils   0.3
          extends   SQLiteHelper   DBHelper  0.2
          field-in  DBUtils        instance  0.5
          field-in  DBHelper       db        0.4
          field-in  DBUtils        db        0.2
          field-in  DBHelper       instance  0.2
          gets      getWritableDB  db        0.7
          gets      getWritableDB  instance  0.4

  19. Prediction Phase: MAP Inference (graph and feature weights as on slide 18).

      $$\vec{o} = \operatorname*{argmax}_{\vec{o}' \in \Omega} P(O = \vec{o}' \mid K = k)$$

      Candidate assignments $\vec{o}$ with their scores $P(\vec{o} \mid k)$*:

          a         b         score*
          DBUtils   instance  1.2
          DBHelper  db        1.3
          DBUtils   db        1.2
          DBHelper  instance  0.8

      *Non-normalized. Each score is the sum of the weights of the three features that match the assignment.

  20. Prediction Phase: MAP Inference (same content as slide 19).
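
The MAP inference step can be illustrated with a brute-force search over the toy example. This is a sketch, not the inference algorithm used by the tool: the class `MapInferenceSketch` and its methods are hypothetical, and exhaustive enumeration is only feasible here because there are two unknowns with two candidates each. The score is the sum of matching feature weights, i.e. the exponent of the non-normalized probability. Because the exponential is monotonic and the normalizer is constant, maximizing this sum gives the same assignment as maximizing the probability.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MapInferenceSketch {
    // Feature weights from the prediction-phase slides, keyed by "name1|name2".
    static final Map<String, Double> WEIGHTS = new HashMap<>();
    static {
        WEIGHTS.put("SQLiteHelper|DBUtils", 0.3);
        WEIGHTS.put("SQLiteHelper|DBHelper", 0.2);
        WEIGHTS.put("DBUtils|instance", 0.5);
        WEIGHTS.put("DBHelper|db", 0.4);
        WEIGHTS.put("DBUtils|db", 0.2);
        WEIGHTS.put("DBHelper|instance", 0.2);
        WEIGHTS.put("getWritableDB|db", 0.7);
        WEIGHTS.put("getWritableDB|instance", 0.4);
    }

    // Weight of a feature, or 0 if the pair has no learned feature.
    public static double weight(String name1, String name2) {
        return WEIGHTS.getOrDefault(name1 + "|" + name2, 0.0);
    }

    // Non-normalized score of one assignment: one term per graph edge.
    public static double score(String a, String b) {
        return weight("SQLiteHelper", a)      // a extends SQLiteHelper
             + weight(a, b)                   // b is a field in a
             + weight("getWritableDB", b);    // b gets getWritableDB()
    }

    // Tries every combination of candidates and returns the best {a, b},
    // or null if either candidate list is empty.
    public static String[] mapAssignment(List<String> aCandidates,
                                         List<String> bCandidates) {
        String[] best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String a : aCandidates) {
            for (String b : bCandidates) {
                double s = score(a, b);
                if (s > bestScore) {
                    bestScore = s;
                    best = new String[] {a, b};
                }
            }
        }
        return best;
    }

    public static void main(String[] args) {
        String[] best = mapAssignment(List.of("DBUtils", "DBHelper"),
                                      List.of("instance", "db"));
        // Prints: a = DBHelper, b = db
        System.out.println("a = " + best[0] + ", b = " + best[1]);
    }
}
```

The winning assignment, a = DBHelper and b = db, is the one the slides use for the de-obfuscated code. On real dependency graphs the number of joint assignments grows exponentially with the number of unknowns, so an exhaustive search like this one would not scale.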

  21. Prediction Phase (feature weights as on slide 18). After MAP inference, the unknown nodes a and b in the graph are labeled DBHelper and db. Transform: obfuscated code → deobfuscated code. Semantically the same?

      Obfuscated code:

          class a extends SQLiteHelper {
            SQLiteDatabase b;
            public a(Context ctx) {
              b = getWritableDB();
            }
          }

      Deobfuscated code:

          class DBHelper extends SQLiteHelper {
            SQLiteDatabase db;
            public DBHelper(Context ctx) {
              db = getWritableDB();
            }
          }
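
The transform step on the last prediction slide amounts to applying the predicted names consistently. The sketch below does this on source text, token by token, purely for illustration. The class `RenameTransformSketch` is hypothetical, and operating on source text and ignoring scopes are simplifying assumptions (a real APK contains bytecode, not source). Replacing every occurrence of a name with the same new name is what keeps the program's behavior unchanged.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RenameTransformSketch {
    // Matches whole identifiers, so renaming "a" does not touch "class".
    private static final Pattern IDENTIFIER =
            Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    // Replaces every identifier that has a predicted name; all other
    // text (keywords, API names, punctuation) is copied unchanged.
    public static String rename(String code, Map<String, String> predicted) {
        Matcher m = IDENTIFIER.matcher(code);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String replacement = predicted.getOrDefault(m.group(), m.group());
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String obfuscated = "class a extends SQLiteHelper { SQLiteDatabase b; }";
        // Prints: class DBHelper extends SQLiteHelper { SQLiteDatabase db; }
        System.out.println(rename(obfuscated, Map.of("a", "DBHelper", "b", "db")));
    }
}
```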
