Experiences of Landing Machine Learning onto Market-Scale Mobile Malware Detection


  1. Experiences of Landing Machine Learning onto Market-Scale Mobile Malware Detection
     Liangyi Gong, Zhenhua Li, Feng Qian, Zifan Zhang, Qi Alfred Chen, Zhiyun Qian, Hao Lin, Yunhao Liu

  2. Mobile Malware Detection
     ⚫ Android App Markets
        [Diagram: Mobile App Markets, “lend credibility”, Mobile Users]
     ⚫ Current Mobile App Review
        ✓ Fingerprint-based Antivirus Checking
        ✓ Expert-informed API Inspection
        ✓ User-report-driven Manual Examination
        ✓ API-based Dynamic Analysis

  3. Mobile Malware Detection
     ⚫ Android App Markets
        [Diagram: Mobile App Markets, “lend credibility”, Mobile Users]
     ⚫ ML-based Mobile App Review Techniques
        ▪ Fingerprint-based Antivirus Checking
        ▪ Static Code Inspection
        ▪ Dynamic Behavior Analysis

  4. ML-based Detection at Market Scales
     ⚫ ML-based Solutions: widely explored in the past decade
     ⚫ ML-based Malware Detection at Market Scales: no existing report of the effectiveness
     ⚫ Real-world challenges?

  5. Large-scale Dataset: API-centric, Dynamic
     • 500K apps submitted to Tencent Market (https://sj.qq.com/)
     • From March to December 2017
     • Containing apps’ malice labels
     [Diagram: APKs from Tencent Market are run in app emulation on commodity servers; a Monkey UI event stream triggers APIs to output logs; the logs are encoded as a one-hot feature vector]
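The last step of this pipeline, encoding an emulation log as a one-hot feature vector, can be sketched roughly as follows. The log format (one API name per line), the `TRACKED_APIS` list, and the function name are assumptions made for illustration; the slides do not specify them.

```python
# Hypothetical list of tracked APIs (illustration only; the real
# system tracks hundreds of APIs).
TRACKED_APIS = [
    "SmsManager_sendTextMessage",
    "TelephonyManager_getLine1Number",
    "WifiInfo_getMacAddress",
]


def one_hot_from_log(log_lines, tracked_apis=TRACKED_APIS):
    """Return a 0/1 vector with a 1 for every tracked API seen in the log."""
    seen = {line.strip() for line in log_lines if line.strip()}
    return [1 if api in seen else 0 for api in tracked_apis]


# Assumed log format: one invoked API name per line.
log = [
    "WifiInfo_getMacAddress",
    "SmsManager_sendTextMessage",
    "WifiInfo_getMacAddress",  # repeated calls still yield a single 1
]
vector = one_hot_from_log(log)
print(vector)
```

Each position in the vector corresponds to one tracked API, so the vector length stays fixed no matter how long the log is.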

  6. API Selection: Correlation
     ⚫ APIs’ correlations with the malice of apps
        ▪ Using SRC (Spearman’s rank correlation coefficient) to evaluate APIs’ correlation with apps’ malice
        ▪ 260 APIs pose non-trivial correlation (|SRC| ≥ 0.2)
     ⚫ Time consumption of tracking APIs
        ▪ Tracking highly correlated APIs
        ▪ Fitting a tri-modal distribution
     [Figure: |SRC| (0 to 0.6) versus ranking of API (0 to 1000)]

  7. API Selection: Correlation
     ⚫ APIs’ correlations with the malice of apps
        ▪ Using SRC (Spearman’s rank correlation coefficient) to evaluate APIs’ correlation with apps’ malice
        ▪ 260 APIs pose non-trivial correlation (|SRC| ≥ 0.2)
     ⚫ Time consumption of tracking different API sets
        ▪ Fitting a tri-modal distribution
        ▪ Indicating a complex relationship
     [Figure: |SRC| (0 to 0.6) versus ranking of API (0 to 1000)]
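As a rough illustration of the SRC screening step, the sketch below computes Spearman's rank correlation between a binary "API was invoked" column and the binary malice label, then applies the |SRC| ≥ 0.2 threshold from the slide. The toy data and helper names are made up; this is a plain-Python sketch, not the authors' implementation.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # 1-based mean rank of the tie group
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman(xs, ys):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant column: correlation undefined, treated as none
    return cov / sqrt(var_x * var_y)


# Toy data (made up): 1 = app invoked the API / app is malicious.
api_used = [1, 1, 0, 0, 1, 0]
malicious = [1, 1, 0, 0, 0, 1]
src = spearman(api_used, malicious)
keep = abs(src) >= 0.2  # threshold taken from the slide
print(round(src, 3), keep)
```

Running the same check over every candidate API column and keeping those that pass the threshold yields a correlation-based candidate set.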

  8. API Selection: Model & Accuracy
     ⚫ Machine Learning Model & Detection Accuracy

        Model           Precision   Recall   Training Time
        Naive Bayes     60.4%       59.6%    3.6 min
        LR              81.2%       70.3%    10.4 min
        SVM             87.9%       71.6%    ∼27K min
        GBDT            88.4%       74.3%    364 min
        kNN             86.5%       83.7%    ∼1.8K min
        CART            87.6%       84.3%    11.6 min
        ANN             90.8%       89.9%    ∼1.2K min
        DNN             91.5%       90.9%    ∼1.9K min
        Random Forest   91.6%       90.2%    29.1 min

     Tracking top-490 correlated APIs achieves the highest precision/recall.
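For reference, the precision and recall columns follow the usual definitions: precision is TP / (TP + FP) and recall is TP / (TP + FN), with malware as the positive class. A minimal sketch on made-up labels (the function name and data are illustrative assumptions):

```python
def precision_recall(y_true, y_pred):
    """Compute precision and recall, treating label 1 (malware) as positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy labels (made up): 1 = malicious, 0 = benign.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 1, 0, 1, 0]
precision, recall = precision_recall(y_true, y_pred)
print(precision, recall)
```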

  10. Key API Selection Strategy
      ⚫ Step 1. Selecting APIs with the highest correlation with malware (Set-C).
      ⚫ Step 2. Selecting APIs that relate to restrictive permissions (Set-P).
      ⚫ Step 3. Selecting APIs that perform sensitive operations (Set-S).
      ⚫ Step 4. Combining the above.
      [Venn diagram of Set-C, Set-P, and Set-S; region sizes shown: 244, 12, 4, 100, 66]
      Performance:
      ⚫ Analysis time: 4.3 minutes
      ⚫ Precision/Recall: 96.8% / 93.7%
      ⚫ Training time: 14.4 seconds
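Step 4 amounts to a set union. In the sketch below, the membership of each toy set is invented for illustration. The final check assumes that the five numbers in the Venn diagram are the sizes of its disjoint regions; under that assumption they sum to 426, which matches the "426 key APIs" figure quoted on slide 18.

```python
def combine_key_apis(set_c, set_p, set_s):
    """Union of the three candidate sets; overlapping APIs are counted once."""
    return set_c | set_p | set_s


# Toy sets (membership is made up for illustration).
set_c = {"SmsManager_sendTextMessage", "HttpURLConnection_connect"}
set_p = {"SmsManager_sendTextMessage", "TelephonyManager_getLine1Number"}
set_s = {"WifiInfo_getMacAddress"}

key_apis = combine_key_apis(set_c, set_p, set_s)
print(len(key_apis))

# Region sizes as printed in the Venn diagram, assumed to be disjoint.
venn_regions = [244, 12, 4, 100, 66]
total_key_apis = sum(venn_regions)
print(total_key_apis)
```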

  12. Further Enriching the Feature Space
      ⚫ Hidden features: API invocation hidden by certain techniques
         ▪ Hidden and internal APIs triggered by special techniques like Java reflection → Checking Permissions
         ▪ IPC through intents, leveraging other apps/services to perform sensitive actions → Checking Used Intents
      ⚫ Key APIs alone: Precision 96.8%, Recall 93.7%
      ⚫ API + Permission + Intents: Precision 98.6%, Recall 96.7%
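One simple way to combine the three feature families is to concatenate one 0/1 segment per family into a single vector. The sketch below assumes that layout, and the tiny vocabularies and the `observed` dictionary shape are hypothetical; the slides do not say how the combined feature vector is actually laid out.

```python
def build_feature_vector(observed, api_vocab, permission_vocab, intent_vocab):
    """Concatenate one 0/1 segment per feature family (API, permission, intent)."""
    vector = []
    families = (
        ("api", api_vocab),
        ("permission", permission_vocab),
        ("intent", intent_vocab),
    )
    for kind, vocab in families:
        present = observed.get(kind, set())
        vector.extend(1 if name in present else 0 for name in vocab)
    return vector


# Hypothetical vocabularies, kept tiny for illustration.
api_vocab = ["SmsManager_sendTextMessage", "WifiInfo_getMacAddress"]
permission_vocab = ["SEND_SMS", "READ_SMS"]
intent_vocab = ["SMS_RECEIVED"]

# Hypothetical observation for one app.
observed = {
    "api": {"WifiInfo_getMacAddress"},
    "permission": {"SEND_SMS"},
    "intent": {"SMS_RECEIVED"},
}
features = build_feature_vector(observed, api_vocab, permission_vocab, intent_vocab)
print(features)
```

With this layout an app that hides an SMS call behind reflection can still be flagged through its requested permission or its used intents, since those segments are filled independently of the API segment.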

  14. System: Emulation Optimization
      ⚫ Default Google Android Emulator: full-system emulation
      ⚫ Result: 30% of apps require ≥ 5-minute analysis time
      ⚫ Solution: lightweight emulation on a powerful x86 server
      ⚫ Architecture: native x86 Android + Dynamic Binary Translation

  15. System: Emulation Optimization
      ⚫ Configuration: 5x4-core x86 server with CPU pinning
      ⚫ Compatibility: ≤ 1% incompatible apps
      ⚫ Fall back to the Google Emulator for incompatible apps
      ⚫ Performance: saving around 70% of the detection time; able to analyze an app in around 1.3 minutes

  16. System: Real-world Deployment
      ⚫ Integration to Tencent Market
         ▪ Running since March 2018
         ▪ Checking ~10K apps per day using a single commodity server
         ▪ Over 98%/96% online precision/recall
      ⚫ System Evolution
         ▪ Monthly updating the key APIs with apps and SDK APIs
         ▪ Dataset contains the original dataset and new apps submitted
         ▪ Number of key APIs fluctuating between 425 and 432

  17. System: Real-world Deployment
      ⚫ Integration to Tencent Market
         ▪ Running since March 2018
         ▪ Checking ~10K apps per day using a single commodity server
         ▪ Over 98%/96% online precision/recall
      ⚫ System Evolution
         ▪ Monthly updating the key APIs with the original dataset and newly submitted apps
         ▪ Number of key APIs fluctuating between 425 and 432

  18. System: Addressing FPs & FNs
      ⚫ False Positives
         ▪ 2% FP apps, as complained about by developers
         ▪ All using a few top-ranking APIs
         ▪ Most are quickly vetted based on previous versions
         ▪ Manual inspection: acceptable workload
         ▪ Active & complete avoidance of FPs
      ⚫ False Negatives
         ▪ 4% false negative (FN) apps, reported by end users
         ▪ Most (87%) of the FN apps barely use the 426 key APIs
         ▪ These apps have fairly simple functionalities and pose no great security threat to end users
         ▪ A small number of false negative apps in fact has little effect on the regular operation of T-Market
         ▪ Passive mitigation of FNs

  19. System: Addressing FPs & FNs
      ⚫ False Positives
         ▪ 2% FP apps, as complained about by developers
         ▪ All using a few top-ranking APIs
         ▪ Most are quickly vetted based on previous versions
         ▪ Manual inspection: acceptable workload
         ▪ Active & complete avoidance of FPs
      ⚫ False Negatives
         ▪ 4% FN apps, reported by end users
         ▪ Hard to avoid
         ▪ Most (87%) barely use key APIs
         ▪ They have fairly simple functionalities, posing little threat
         ▪ Report-driven: mild impact on users
         ▪ Passive mitigation of FNs

  20. Revealed Important Features
      ⚫ Attempting to acquire privacy-sensitive information of user devices
      ⚫ Tracking or intercepting system-level events
      ⚫ Enabling certain types of attacks such as overlay-based attacks
      [Bar chart: Gini importance (axis 0 to 0.1) of the following features, in the order listed on the slide]
         ▪ API: SmsManager_sendTextMessage
         ▪ Permission: SEND_SMS
         ▪ Intent: SMS_RECEIVED
         ▪ Intent: wifi.STATE_CHANGE
         ▪ Permission: RECEIVE_SMS
         ▪ Intent: DEVICE_ADMIN_ENABLED
         ▪ Intent: bluetooth.STATE_CHANGED
         ▪ Permission: RECEIVE_MMS
         ▪ Intent: ACTION_BATTERY_OKAY
         ▪ API: TelephonyManager_getLine1Number
         ▪ Permission: RECEIVE_WAP_PUSH
         ▪ API: WifiInfo_getMacAddress
         ▪ Permission: READ_SMS
         ▪ API: View_setBackgroundColor
         ▪ Permission: ACCESS_NETWORK_STATE
         ▪ Permission: SYSTEM_ALERT_WINDOW
         ▪ API: SQLiteDatabase_insertWithOnConflict
         ▪ Permission: RECEIVE_BOOT_COMPLETED
         ▪ API: HttpURLConnection_connect
         ▪ API: ActivityManager_getRunningTasks
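Gini importance in tree-based models is built up from the impurity decrease each feature achieves when a tree splits on it. The sketch below shows that quantity for one split on one binary feature, using made-up data. It is an illustration of the general idea only and does not reproduce the chart's values or the authors' model.

```python
def gini(labels):
    """Gini impurity of a list of binary labels: 2 * p * (1 - p)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2 * p * (1 - p)


def impurity_decrease(feature, labels):
    """Gini impurity removed by splitting the samples on a binary feature."""
    left = [y for x, y in zip(feature, labels) if x == 0]
    right = [y for x, y in zip(feature, labels) if x == 1]
    n = len(labels)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(labels) - weighted


# Toy data (made up): feature = app requests a given permission, label = malicious.
feature = [1, 1, 1, 0, 0, 0]
labels = [1, 1, 0, 0, 0, 0]
decrease = impurity_decrease(feature, labels)
print(round(decrease, 4))
```

A forest-level importance score would accumulate such decreases, weighted by the share of samples reaching each split, over all trees and then normalize across features.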

  21. Experiences of APIChecker
      ⚫ Feature Engineering: adversary’s perspective
      ⚫ Feature Selection: principled, data-driven
      ⚫ Analysis Speed: efficient app emulation on powerful x86 servers
      ⚫ Model Evolution: monthly update with novel apps & SDK APIs
      ⚫ Developer Engagement: active & complete avoidance of FPs vs. passive mitigation of FNs

  22. Conclusion & Dataset
      ⚫ We conduct a large-scale study to understand and overcome real-world challenges of developing ML-based malware detection solutions at market scales.
      ⚫ We showcase several key design decisions we make towards implementing, deploying, and operating a production market-scale mobile malware detection system, APIChecker.
      ⚫ Our system has been operational at Tencent Market since March 2018, vetting around 10K apps per day on a single commodity server.
      Dataset & tool release: https://apichecker.github.io/

