Scalability-First Pointer Analysis with Self-Tuning Context - PowerPoint PPT Presentation

Scalability-First Pointer Analysis with Self-Tuning Context Sensitivity
Yue Li, Tian Tan, Anders Møller and Yannis Smaragdakis

Pointer Analysis
• Concept: Which objects may a variable point to?
• Importance
  - Fundamental for virtually all static analyses, e.g., call graphs, alias analysis, etc.
  - Useful for many software engineering tasks, e.g., bug detection, security analysis, program understanding, etc.

Problem: Unpredictable Scalability
• Precise pointer analysis is hard to scale
  - Context Sensitivity (CS): precise but slow
  - Context Insensitivity (CI): imprecise but fast
• Variants of Context Sensitivity
  - Object Sensitivity (obj)
  - Type Sensitivity (type)
• Less precise but faster, in order: 2obj → 2type → 1type → CI

Problem: Unpredictable Scalability
[Bar chart: analysis time in seconds of 2obj, 2type, and CI for each program; a timeout is >10800 seconds.]

Problem: Unpredictable Scalability
• Scenario: pointer analysis as a part of a large-scale security analysis
  - Precise 2obj? Unscalable for many programs
  - Scalable CI? Imprecise for all programs
• Iterate until reaching the most precise variant that scales: 2obj → 2type → 1type → CI
• Sleepless nights and still not great precision!

Scaler: Good Scalability & High Precision, regardless of the program being analyzed
• Scalability: as good as CI
• Precision: comparable to or better than the best scalable CS
[Bar chart: analysis time in seconds of 2obj, 2type, and Scaler for each program; a timeout is >10800 seconds.]

Scaler: Idea

Key Concept
Number of worst-case CS points-to facts for method m: $\#ctx_m^c \cdot \#pts_m$
• $\#ctx_m^c$: number of contexts for method m under CS variant c
• $\#pts_m$: number of points-to facts for method m

Insight
• Too many CS points-to facts are generated for certain methods
• m is a scalability-critical method under CS variant c if $\#ctx_m^c \cdot \#pts_m > ST$ (Scalability Threshold), i.e., c is expensive
• Identify each scalability-critical method m and choose a cheap c' such that $\#ctx_m^{c'} \cdot \#pts_m \le ST$
• How to identify scalability-critical methods? That is, how to estimate $\#ctx_m^c \cdot \#pts_m$?
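The scalability-critical test above can be sketched in a few lines of Python. This is only an illustration of the inequality: the function names and the sample numbers are hypothetical and do not come from the slides.

```python
def worst_case_cs_facts(num_contexts: int, num_pts: int) -> int:
    """Worst-case number of CS points-to facts for one method: #ctx * #pts."""
    return num_contexts * num_pts


def is_scalability_critical(num_contexts: int, num_pts: int, st: int) -> bool:
    """A method is scalability-critical under a CS variant if its
    worst-case number of CS points-to facts exceeds the threshold ST."""
    return worst_case_cs_facts(num_contexts, num_pts) > st


# Hypothetical numbers: 1000 contexts, 500 points-to facts, ST = 100000.
critical = is_scalability_critical(1000, 500, 100_000)
```

With these made-up inputs the product is 500000, which exceeds the threshold, so the method would be treated as scalability-critical under that variant.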

How to estimate $\#ctx_m^c \cdot \#pts_m$?
Pre-analysis: points-to results of CI
• $\#pts_m$: obtained directly
• $\#ctx_m^c$: obtained by leveraging the Object Allocation Graph* (based on CI)
Context estimation problem → Graph traversal problem
*Making k-object-sensitive pointer analysis more precise with still k-limiting. Tan et al., SAS 2016
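To make the "graph traversal" idea concrete, here is a simplified Python sketch for the 2obj case only. It assumes, as a simplification not stated on the slide, that a 2obj context of a method is a pair (allocator object, receiver object), that the Object Allocation Graph is given as a map from each object to its allocator objects, and that the receiver objects of the method come from the CI pre-analysis. All names are hypothetical.

```python
from typing import Dict, Set, Tuple


def estimate_2obj_contexts(
    receivers: Set[str],
    oag_preds: Dict[str, Set[str]],
) -> int:
    """Count distinct 2obj contexts of a method by walking one step
    backwards in the Object Allocation Graph from each receiver object."""
    contexts: Set[Tuple[str, ...]] = set()
    for obj in receivers:
        allocators = oag_preds.get(obj, set())
        if not allocators:
            # Assumption: an object with no allocator in the graph
            # contributes a single one-element context.
            contexts.add((obj,))
        for allocator in allocators:
            contexts.add((allocator, obj))
    return len(contexts)


# Hypothetical graph: o3 is allocated under o1 and o2, o4 under o1.
oag = {"o3": {"o1", "o2"}, "o4": {"o1"}}
num_ctx = estimate_2obj_contexts({"o3", "o4", "o5"}, oag)
```

The point of the sketch is that no context-sensitive analysis is run: the estimate is read off a graph built from the cheap CI results.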

Scaler: Example

[Figure: $\#ctx_m^c \cdot \#pts_m$ (y-axis, up to 1 000 000) for each of 10 000 methods (x-axis, method 1 to method 10 000), with one curve per CS variant c = 2obj, 2type, 1type and a horizontal line at the Scalability Threshold $ST_p$. The x-axis is split at methods 2000 and 4000 into regions labeled 1type, 2type, and 2obj.]
• For any scalability-critical method, use the most precise CS variant that can turn it into a non-scalability-critical method
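The selection rule above can be sketched as follows. This is a minimal illustration, assuming the per-variant context estimates are already available; falling back to CI when even 1type exceeds the threshold is an assumption based on the 2obj → 2type → 1type → CI ordering used earlier, and all names and numbers are hypothetical.

```python
from typing import Dict

# CS variants ordered from most precise to least precise.
PRECISION_ORDER = ["2obj", "2type", "1type"]


def select_variant(ctx_counts: Dict[str, int], num_pts: int, st: int) -> str:
    """Return the most precise variant whose estimated number of
    CS points-to facts (#ctx * #pts) does not exceed the threshold st."""
    for variant in PRECISION_ORDER:
        if ctx_counts[variant] * num_pts <= st:
            return variant
    # Assumed fallback: analyze the method context-insensitively.
    return "CI"


# Hypothetical method: 100 points-to facts, threshold 10000.
estimates = {"2obj": 1000, "2type": 50, "1type": 5}
chosen = select_variant(estimates, 100, 10_000)
```

Here 2obj would produce an estimated 100000 facts, which is above the threshold, while 2type produces 5000, so 2type is chosen for this method.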

Total Scalability Threshold (TST)
To automatically choose an appropriate $ST_p$ for each program p
• TST is related to memory size
• TST indicates analysis capacity: how many points-to facts can the memory hold?
[Figure: for Program A and Program B, the total $\sum_m \#ctx_m^c \cdot \#pts_m$ is compared against the memory capacity TST.]

[Figure: the example chart for Program P, with curves for c = 2obj, 2type, 1type, the threshold line $ST_p$, and three areas A1, A2, A3 over the method ranges split at methods 2000 and 4000.]
• $E(ST_p) = \sum_m \#ctx_m^c \cdot \#pts_m = A1 + A2 + A3 \le TST$
• $ST_p$ is automatically computed based on the above inequality
• $ST_p$ is the max value satisfying this inequality

Scalability-First Pointer Analysis with Self-Tuning Context Sensitivity (Scaler)
• The CS variants (2obj, 2type, 1type) are self-tuned by $ST_p$
• $ST_p$ depends on TST
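One way to picture how $ST_p$ follows from TST is the brute-force Python sketch below. It is not taken from the slides: it assumes per-method context estimates are given, assumes a CI fallback with one context per method, and simply tries every candidate threshold, keeping the largest one whose estimated total stays within TST. All names and numbers are hypothetical.

```python
from typing import Dict, List, Tuple

VARIANTS = ["2obj", "2type", "1type"]  # most precise first
Method = Tuple[Dict[str, int], int]    # (contexts per variant, #pts)


def cost_under(st: int, ctx: Dict[str, int], pts: int) -> int:
    """Estimated facts for one method once threshold st picks its variant."""
    for variant in VARIANTS:
        if ctx[variant] * pts <= st:
            return ctx[variant] * pts
    return pts  # assumed CI fallback: a single context


def choose_st(methods: List[Method], tst: int) -> int:
    """Largest candidate threshold whose total estimate E(st) is <= tst.
    Returns 0 if no candidate fits (every method then falls back to CI)."""
    candidates = sorted({ctx[v] * pts for ctx, pts in methods for v in VARIANTS})
    best = 0
    for st in candidates:
        total = sum(cost_under(st, ctx, pts) for ctx, pts in methods)
        if total <= tst:
            best = st
    return best


# Two hypothetical methods and a hypothetical TST of 600.
program = [
    ({"2obj": 100, "2type": 10, "1type": 2}, 50),
    ({"2obj": 4, "2type": 2, "1type": 1}, 10),
]
st_p = choose_st(program, 600)
```

Only the distinct per-method costs need to be tried as candidates, because the chosen variants (and therefore the total) can change only when the threshold crosses one of those values.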

Scaler: Results

10 Popular Java Programs, including Chart and Luindex

Settings
• TST = 30M (48G memory)
  - 20M, 40M, 60M, etc. are all ok
  - Larger TST means better precision but worse efficiency
• Time budget = 3 hours (per program)

Results
• Scalability: as good as CI
• Precision: comparable to or better than the best scalable CS
[Bar chart: analysis time in seconds of 2obj, 2type, and Scaler for each program; a timeout is >10800 seconds.]
• Programs highlighted on the slide: a complex program, a medium-complexity program, and a simple program (the surviving labels include Luindex)

Complex program (time budget: 3h = 10800s)

Analysis               Time (seconds)     #may-fail casts  #poly calls  #reachable methods  #call graph edges
CI                     112                2234             2778         12718               114856
2obj → 2type → 1type   >3h + >3h + 1997   2117             2577         12430               111834
Scaler                 452                1852             2500         12167               107410

The last four columns are precision metrics. In all cases, lower is better.
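As a quick check of what ">3h + >3h + 1997" means for the iterative approach, the arithmetic can be spelled out in Python. Treating each timed-out run as costing exactly the 3-hour budget is an assumption (the slide only says ">3h"), so the results are lower bounds.

```python
TIME_BUDGET_S = 3 * 60 * 60  # 3h = 10800 s

# 2obj and 2type time out, then 1type finishes in 1997 s.
iterative_lower_bound_s = TIME_BUDGET_S + TIME_BUDGET_S + 1997
scaler_s = 452

# Lower bound on how much longer the iterative approach takes than Scaler.
speedup_lower_bound = iterative_lower_bound_s / scaler_s
print(iterative_lower_bound_s, round(speedup_lower_bound, 1))
```

Under that assumption the iterative approach needs at least 23597 seconds on the complex program, roughly 52 times the 452 seconds reported for Scaler.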

Medium-complexity program (time budget: 3h = 10800s)

Analysis               Time (seconds)  #may-fail casts  #poly calls  #reachable methods  #call graph edges
CI                     49              2508             2925         13036               77370
2obj → 2type → 1type   2458            1409             2182         12657               65836
Scaler                 272             1452             2195         12676               66177

The last four columns are precision metrics. In all cases, lower is better.
