Neural-Augmented Static Analysis of Android Communication

Jinman Zhao, Aws Albarghouthi, Vaibhav Rastogi, Somesh Jha, Damien Octeau
University of Wisconsin-Madison, Google

Key Idea: Use machine learning to refine results from static analysis.

Static Analysis: False Positives
[Diagram: Program & Property → Static Analyzer → results labeled Must True, Unsure..., or Must False. The Unsure results include false positives; triaging them is a ranking problem.]

Machine Learning to Augment
[Diagram: Program & Property → Static Analyzer → results labeled Must True, Unsure..., or Must False. A model is trained (Train) and then predicts (Predict) a likelihood ∈ [0, 1] for the Unsure results.]
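The ranking view of this slide can be sketched in a few lines of Python. This is only an illustration: the verdict labels, the `rank_unsure` helper, and the toy scores are hypothetical stand-ins for a real analyzer and a trained model.

```python
from typing import Callable, List, Tuple

# Each analyzer result is (item, verdict); the verdict strings
# "must_true", "unsure", "must_false" are hypothetical labels.
Result = Tuple[str, str]


def rank_unsure(results: List[Result],
                predict: Callable[[str], float]) -> List[Tuple[str, float]]:
    """Score every unsure result and sort by likelihood, highest first."""
    scored = [(item, predict(item)) for item, verdict in results
              if verdict == "unsure"]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy predictor standing in for a trained model (scores are made up).
toy_scores = {"r2": 0.15, "r3": 0.92, "r5": 0.40}
results = [("r1", "must_true"), ("r2", "unsure"), ("r3", "unsure"),
           ("r4", "must_false"), ("r5", "unsure")]
ranked = rank_unsure(results, lambda item: toy_scores[item])
print(ranked)
```

Only the unsure results are scored; the must-true and must-false results keep the analyzer's verdict.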

Link Inference for Android Communication
[Diagram: Program & Property → Static Analyzer → inter-component communication links labeled Must True, May, or Must False. A model is trained (Train) on the must links and predicts (Predict) a likelihood ∈ [0, 1] for the may links.]

Task: Link Inference in Android Communication

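Android ICC: A User’s Experience
[Illustration. Elements shown: a restaurant entry with phone number (xxx) xxx-xxxx and address 1234 Alice St. Orlando, FL; a "Send a message" action with the text "I’d like to make a reservation ..."; a malicious app. Labels: Inter-Component Communication, Intent, Component w/ Filter.]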

Android ICC: An Example
[Code view: (part of) the resolution logic. Intent + Filter: ICC link? Yes!]

(Bigger part of) the resolution logic (Octeau et al., POPL’16)

Previous Work: PRIMO
● PRIMO (Octeau et al., POPL’16) uses a hand-crafted probabilistic model that assigns probabilities to ICC links inferred by static analysis.
  ○ Laborious, error-prone and requiring expert domain knowledge.
  ○ Difficulty catching up with constantly evolving Android system.

Questions

#1 How can we triage may links with minimal expert domain knowledge? Neural networks.

#2 How can we process inputs of complex data types in a systematic way? Type-directed encoder.

#3 How do our models perform? Very good!

#4 Are the models learning the right things? It seems so.

We are not trying to…
● Propose new NN module
● Eliminate use of domain knowledge
● Rule out manual effort

We are trying to…
● Propose systematic way to construct NN
● Provide decent performance without expert knowledge
● Use less labour with more automation

Approach, Part 1: How can we triage may links with minimal expert domain knowledge?

Link-Inference Neural Network
LINN: An end-to-end encoder-and-classifier architecture.
[Diagram: an Intent encoder and a Filter encoder feed a Classifier that outputs a likelihood in [0,1]. The model is trained on must links and predicts on may links.]

Approach, Part 2: How can we process inputs of complex data types in a systematic way?

[Diagram: Model. Intent → Encoder and Filter → Encoder; both encodings → Classifier → [0,1].]
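The encoder-and-classifier wiring can be sketched as a single forward pass. This is a toy under stated assumptions, not the LINN implementation: `linn_forward`, the two hand-written encoders, and the weights are hypothetical, and a plain logistic classifier stands in for the learned one. In the real architecture the encoders are neural networks trained jointly with the classifier.

```python
import math
from typing import Callable, List, Optional, Set, Tuple

Vector = List[float]
Intent = Tuple[Optional[str], Set[str]]   # (action, categories)
Filter = Tuple[Set[str], Set[str]]        # (actions, categories)


def toy_intent_encoder(intent: Intent) -> Vector:
    """Hypothetical stand-in for a learned intent encoder."""
    action, categories = intent
    return [1.0 if action is not None else 0.0, float(len(categories))]


def toy_filter_encoder(flt: Filter) -> Vector:
    """Hypothetical stand-in for a learned filter encoder."""
    actions, categories = flt
    return [float(len(actions)), float(len(categories))]


def linn_forward(intent: Intent, flt: Filter,
                 enc_intent: Callable[[Intent], Vector],
                 enc_filter: Callable[[Filter], Vector],
                 weights: Vector, bias: float) -> float:
    """Encode both sides, concatenate, and classify into (0, 1)."""
    features = enc_intent(intent) + enc_filter(flt)
    if len(features) != len(weights):
        raise ValueError("weights must match the concatenated encoding size")
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))  # logistic (sigmoid) output


intent = ("android.intent.action.SEND", {"android.intent.category.DEFAULT"})
flt = ({"android.intent.action.SEND"}, {"android.intent.category.DEFAULT"})
p = linn_forward(intent, flt, toy_intent_encoder, toy_filter_encoder,
                 weights=[0.5, -0.25, 0.5, 0.25], bias=0.0)
print(p)
```

Passing the encoders as arguments mirrors the slide's separation between the two encoders and the classifier.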

Type-Directed Encoder
TDE: mapping type signature to neural network architecture.
[Diagram: Input Type (type signature) → TDE Rules → TDE Template (neural network template) → Instantiation → Neural network.]

An example: Encoding Product Types
Type: T := tuple(A, B)
Instance: t := (a, b), with a : A, b : B and t : T
encA maps a to a-en ∈ R^n; encB maps b to b-en ∈ R^m
comb : R^n ⨉ R^m ➝ R^l
encT maps t to t-en ∈ R^l, with t-en = comb(encA(a), encB(b))
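The product-type rule can be sketched with plain Python lists. The slide only fixes the signature of comb (R^n ⨉ R^m ➝ R^l); choosing an affine map over the concatenation followed by tanh is an assumption made for this sketch, as are the names `comb` and `enc_tuple` and the example weights.

```python
import math
from typing import Callable, List, Tuple, TypeVar

A = TypeVar("A")
B = TypeVar("B")
Vector = List[float]


def comb(a_en: Vector, b_en: Vector,
         weight: List[Vector], bias: Vector) -> Vector:
    """Combine encodings from R^n and R^m into R^l.

    Assumed form: tanh(W [a_en; b_en] + bias), with W of shape l x (n + m).
    """
    x = a_en + b_en  # concatenation, length n + m
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weight, bias)]


def enc_tuple(t: Tuple[A, B],
              enc_a: Callable[[A], Vector],
              enc_b: Callable[[B], Vector],
              weight: List[Vector], bias: Vector) -> Vector:
    """encT: encode each component, then combine the two encodings."""
    a, b = t
    return comb(enc_a(a), enc_b(b), weight, bias)


# n = 2, m = 1, l = 2; the component encoders below are toy stand-ins.
weight = [[1.0, 0.0, 0.0],
          [0.0, 0.0, 0.5]]
bias = [0.0, 0.0]
t_en = enc_tuple((True, 2),
                 enc_a=lambda flag: [1.0, 0.0] if flag else [0.0, 1.0],
                 enc_b=lambda n: [float(n)],
                 weight=weight, bias=bias)
print(t_en)
```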

Rules for type-directed encoding

Android ICC: Our Abstraction
Type signatures
Intent: intent := tuple(act, cats)
  Action: act := optional(string)
  Categories: cats := set(string)
Filter: filter := tuple(acts, cats)
  Actions: acts := set(string)
  Categories: cats := set(string)
[Diagram: the intent type drawn as a tree. tuple → (act: optional → string, cats: set → string); each string is a list of char.]
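For fully known values, the abstraction above is enough to sketch the action and category tests of Android's intent resolution. This is a simplified illustration, not the resolution logic shown in the talk: it omits the data/type test and other details, and the names `action_test`, `category_test`, and `resolves` are hypothetical.

```python
from typing import Optional, Set, Tuple

Intent = Tuple[Optional[str], Set[str]]   # intent := tuple(act, cats)
Filter = Tuple[Set[str], Set[str]]        # filter := tuple(acts, cats)


def action_test(act: Optional[str], acts: Set[str]) -> bool:
    """A filter with no actions matches nothing; an intent without an
    action passes any filter that lists at least one action."""
    if not acts:
        return False
    return act is None or act in acts


def category_test(intent_cats: Set[str], filter_cats: Set[str]) -> bool:
    """Every category of the intent must be declared by the filter."""
    return intent_cats <= filter_cats


def resolves(intent: Intent, flt: Filter) -> bool:
    """Simplified check: both tests must pass (data test omitted)."""
    act, cats = intent
    acts, filter_cats = flt
    return action_test(act, acts) and category_test(cats, filter_cats)


send = ("android.intent.action.SEND", {"android.intent.category.DEFAULT"})
share_filter = ({"android.intent.action.SEND"},
                {"android.intent.category.DEFAULT",
                 "android.intent.category.BROWSABLE"})
view_filter = ({"android.intent.action.VIEW"},
               {"android.intent.category.DEFAULT"})
print(resolves(send, share_filter), resolves(send, view_filter))
```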

Type-Directed Encoder
[Diagram: the Rules map the intent type signature to a neural network template, node by node: tuple (intent) → comb (intent-en); optional (act) → union (act-en); set (cats) → aggr (cats-en); list (string) → flat (str-en); char → enum (char-en).]

Type-Directed Encoder: Instantiation ( typed-tree )
[Diagram: the neural network template is instantiated into a neural network: comb → TreeLSTM; union → switch; aggr → TreeLSTM; flat → CNN; enum → lookup.]

Type-Directed Encoder: Instantiation ( str-rnn )
[Diagram: the neural network template is instantiated into a neural network: comb → concat; union → switch; aggr → max; flat → RNN; enum → lookup.]
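The two steps on these slides (type signature to template, then template to concrete network) can be sketched symbolically. The rule table and the two instantiation tables below are read off the slides; the tuple-based signature encoding and the `render` helper are hypothetical, and the output is just a string describing the architecture rather than a trainable network.

```python
from typing import Dict, Optional, Tuple

# Type constructor -> template module, as paired on the slides.
RULES: Dict[str, str] = {"tuple": "comb", "optional": "union",
                         "set": "aggr", "list": "flat", "char": "enum"}

# Template module -> concrete choice for two of the instantiations.
TYPED_TREE = {"comb": "TreeLSTM", "union": "switch", "aggr": "TreeLSTM",
              "flat": "CNN", "enum": "lookup"}
STR_RNN = {"comb": "concat", "union": "switch", "aggr": "max",
           "flat": "RNN", "enum": "lookup"}

# Signatures as nested tuples: (constructor, child, ...). Hypothetical format.
CHAR = ("char",)
STRING = ("list", CHAR)
INTENT = ("tuple", ("optional", STRING), ("set", STRING))


def render(sig: Tuple, choices: Optional[Dict[str, str]] = None) -> str:
    """Recursively map a type signature to a template (choices=None)
    or to an instantiated architecture description."""
    kind, *children = sig
    module = RULES[kind]
    if choices is not None:
        module = choices[module]
    if not children:
        return module
    return f"{module}({', '.join(render(c, choices) for c in children)})"


print(render(INTENT))              # template
print(render(INTENT, STR_RNN))     # str-rnn instantiation
print(render(INTENT, TYPED_TREE))  # typed-tree instantiation
```

Because the recursion follows the type signature, a new input type only needs a new signature; the rules and the instantiation tables stay unchanged.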

A systematic way to build and explore structured NN.

Experiments: Are our models correctly predicting links?

Setup
● Dataset of 10,500 Android APPs from Google Play.
● IC3 (Octeau et al., ICSE’15) for static analysis.
● PRIMO’s abstract matching for may/must partition.
● Simulated ground truth for may links.
● 4 instantiations of the TDE architecture.

               # pairs   # positive   # negative
training set   105,108       63,168       41,940
testing set     43,680       29,260       14,420

All instantiated models perform as well as PRIMO.

Correlation
Our best model (typed-tree) fills the correlation gap by 72% compared to PRIMO despite the harder setting.

More Results for Our Best Model
ROC (left) and the distribution of predicted likelihood (right) from the typed-tree model.
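As a reminder of what a ROC curve summarizes, the area under it equals the probability that a randomly chosen positive link receives a higher predicted likelihood than a randomly chosen negative one. The sketch below computes that quantity directly; `roc_auc` is a hypothetical helper and the labels and scores are made-up toy values, not results from the talk.

```python
from typing import List


def roc_auc(labels: List[int], scores: List[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Counts positive/negative pairs where the positive scores higher;
    ties count as half. Quadratic in the input size, fine for a sketch.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data: 1 = true link, 0 = false link.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
auc = roc_auc(labels, scores)
print(auc)
```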

Interpretability: How do we know the model is learning the right thing?

Sensitivity to Masking
● Picking distinctive values
● Ignoring less useful parts

Learned Encodings
Semantically closer values receive more similar encodings. Visualized by t-SNE.
[Plot labels include: default, (.*), None.]

Conclusion
● Neural-augmented static analysis
● Type-directed encoder
● Increased accuracy with less domain knowledge
● Interpretability study

Future Work
● Apply to other analysis tasks
● Push machine learning into static analysis procedure

Thanks for listening! Q & A
