INTROPERF: TRANSPARENT CONTEXT-SENSITIVE MULTI-LAYER PERFORMANCE INFERENCE USING SYSTEM STACK TRACES
Chung Hwan Kim*, Junghwan Rhee, Hui Zhang, Nipun Arora, Guofei Jiang, Xiangyu Zhang*, Dongyan Xu*
NEC Laboratories America
*Purdue University and CERIAS
Performance Bugs
• Performance bugs: software defects where relatively simple source-code changes can significantly speed up software while preserving functionality [Jin et al., PLDI '12].
• They are common in most software projects, and compilers can rarely optimize them away because they stem from software logic.
• Many performance bugs escape the development stage and cause cost and inconvenience to software users.
Diagnosis of Performance Bugs is Hard
• Diverse root causes
  • Input/workload
  • Configuration
  • Resource
  • Bugs
  • Others
• Performance overhead propagates. => Need performance analysis in a global scope!
• “Performance problems require understanding all system layers” -Hauswirth et al., OOPSLA ‘04
[Figure: user-space call chain main() → do(input) → fwrite(input), which calls write(input) in kernel space; latency arises in the loop inside do() and inside write().]
Diagnosis of Performance Bugs
• Development stage
  • Source code is available.
  • Developers have knowledge of the programs.
  • Testing workload
  • Heavy-weight tools such as profilers and dynamic binary instrumentation are often tolerable.
• Post-development stage
  • Many users do not have source code.
  • Third-party code and external modules come as binaries.
  • Realistic workload at deployment
  • Low overhead is required for diagnosis tools.
• Q: How can we analyze performance bugs and find their root causes in the post-development stage with low overhead?
OS Tracers and System Stack Trace
• Many modern OSes provide tracing tools as Swiss-army-knife tools.
  • These tools provide tracing of OS events.
  • Examples: SystemTap, DTrace, Microsoft ETW
• Advanced OS tracers provide stack traces.
  • We call OS events + stack traces system stack traces.
  • Examples: Microsoft ETW, DTrace
• Challenges
  • Stack traces are captured only when OS events occur.
  • Missing application function latency: How do we know which program functions are slow?
[Figure: Apps 1 and 2 running on an OS kernel with a tracer; a system stack trace at time stamps t1-t4 showing user code info (call stacks built from functions A, B, C, D) and the OS events S1, S2, S3, S1.]
IntroPerf
• IntroPerf: A diagnosis tool for Performance Introspection based on system stack traces
• Key Ideas
  • Function latency inference based on the continuity of a calling context
  • Context-sensitive performance analysis
[Figure: IntroPerf architecture. Input: System Stack Traces. Transparent Inference of Application Performance: Function Latency Inference, Dynamic Calling Context Indexing, Top-down Latency Breakdown. Context-sensitive Performance Analysis: Performance-annotated Calling Context Ranking. Output: A Report of Performance Bugs.]
Inference of Function Latencies
• Inference based on the continuity of a function in the context
• Algorithm captures a period of a function execution in the call stack without a disruption of its context
[Figure: call tree A → {B → D, C → D} with call/return edges, and the stack trace events at t1-t4 (t1: A B D, t2: A B D, t3: A C D, t4: A C); legend: function lifetime, stack trace event, conservative estimation.]
• Walkthrough (IsNew / ThisStack / Register (Time) / Captured Function Instances)
  • t1: A, B, D are new. Register: A (T1-T1), B (T1-T1), D (T1-T1).
  • t2: A, B, D are not new. Register: A (T1-T2), B (T1-T2), D (T1-T2).
  • t3: A is not new; C and D are new. Register: A (T1-T3), C (T3-T3), D (T3-T3). Captured: B (T1-T2), D (T1-T2).
  • t4: A and C are not new. Register: A (T1-T4), C (T3-T4). Captured: B (T1-T2), D (T1-T2), D (T3-T3).
  • End of trace: captured function instances are A (T1-T4), B (T1-T2), C (T3-T4), D (T1-T2), D (T3-T3).
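The continuity-based inference on this slide can be sketched in a few lines of Python. This is a minimal illustration, not IntroPerf's implementation: the function name infer_latencies, the (timestamp, stack) input format, and the name-based prefix comparison are assumptions made for the example. A frame is treated as "continuing" while the stack from the root down to it is unchanged between consecutive events, so the reported interval is a conservative estimate of its lifetime.

```python
from typing import List, Tuple

# Hypothetical input format: (timestamp, call stack from root to leaf).
Trace = Tuple[int, List[str]]
# Captured instance: (calling context, first seen, last seen).
Instance = Tuple[Tuple[str, ...], int, int]


def infer_latencies(traces: List[Trace]) -> List[Instance]:
    register: List[list] = []      # active frames: [name, first_seen, last_seen]
    captured: List[Instance] = []

    def capture_from(depth: int) -> None:
        # Frames at or below `depth` lost their context: emit them, leaf first.
        for d in range(len(register) - 1, depth - 1, -1):
            context = tuple(frame[0] for frame in register[: d + 1])
            captured.append((context, register[d][1], register[d][2]))
        del register[depth:]

    for ts, stack in traces:
        # Length of the prefix shared with the previous stack (continuing context).
        keep = 0
        while (keep < len(register) and keep < len(stack)
               and register[keep][0] == stack[keep]):
            keep += 1
        capture_from(keep)
        for frame in register:      # continuing frames: extend their lifetime
            frame[2] = ts
        for name in stack[keep:]:   # new frames: register with (ts, ts)
            register.append([name, ts, ts])

    capture_from(0)                 # end of trace: flush everything still active
    return captured


if __name__ == "__main__":
    slide_traces = [(1, ["A", "B", "D"]), (2, ["A", "B", "D"]),
                    (3, ["A", "C", "D"]), (4, ["A", "C"])]
    for context, first, last in infer_latencies(slide_traces):
        print("/".join(context), f"(T{first}-T{last})", "latency >=", last - first)
```

Run on the four stack traces from the slide, the sketch yields the same five captured instances: A (T1-T4), B (T1-T2), C (T3-T4), and D twice, once under B (T1-T2) and once under C (T3-T3).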
Dynamic Calling Context Tree
• A calling context is a distinct order of a function call sequence starting from the “main” function (i.e., a call path).
• We use a calling context tree as the model of application performance to organize inferred latency in a structured way.
• A unique and concise index of a dynamic context is necessary for analysis.
  • Adopted a variant of the calling context tree data structure [Ammons97].
  • Assign a unique number to the pointer to the end of each path.
[Figure: the stack traces at t1-t4 (t1: A B D, t2: A B D, t3: A C D, t4: A C) and the resulting dynamic calling context tree root → A → {B → D, C → D}, with an index table mapping indexes 1 and 2 to path ends.]
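A small sketch of such an index, assuming a trie-like tree in which every node marks the end of one call path and receives a number when it is first created. The class and method names (CCTNode, CallingContextTree, index_of, path_of) are hypothetical, and numbering every node rather than only the observed path ends is a simplification made for the example.

```python
from typing import Dict, List, Optional


class CCTNode:
    def __init__(self, name: str, index: int, parent: Optional["CCTNode"] = None):
        self.name = name
        self.index = index                      # unique number for the path ending here
        self.parent = parent
        self.children: Dict[str, "CCTNode"] = {}


class CallingContextTree:
    def __init__(self) -> None:
        self.root = CCTNode("root", 0)
        self.nodes: List[CCTNode] = [self.root]  # index -> node (the "pointer")

    def index_of(self, stack: List[str]) -> int:
        """Insert a root-to-leaf call path if needed and return its index."""
        node = self.root
        for name in stack:
            child = node.children.get(name)
            if child is None:
                child = CCTNode(name, len(self.nodes), node)
                node.children[name] = child
                self.nodes.append(child)
            node = child
        return node.index

    def path_of(self, index: int) -> List[str]:
        """Recover the call path by walking parent pointers from the path end."""
        node = self.nodes[index]
        path: List[str] = []
        while node.parent is not None:
            path.append(node.name)
            node = node.parent
        return path[::-1]
```

With this layout a whole calling context is represented by one integer, and the same path always maps to the same index, which keeps per-context bookkeeping compact.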
Performance-annotated Calling Context Tree
• Top-down Latency Normalization
  • Latency is inferred in all layers of the stack, so the latencies of multiple layers overlap.
  • Latency is normalized by recursively subtracting child functions’ latencies in the calling context tree.
• Performance-annotated Calling Context Tree
  • The calling context tree is extended by annotating each node with its normalized inferred latency.
[Figure: call tree A → {B → D, C → D} with call/return edges, and the dynamic calling context tree root → A → {B → D, C → D}.]
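The normalization step can be illustrated with a short sketch. It assumes inferred (inclusive) latencies are keyed by calling context, written as a tuple of function names from the root; the function name normalize and this data layout are assumptions for the example, not the tool's actual interface.

```python
from typing import Dict, List, Tuple

Context = Tuple[str, ...]


def normalize(inclusive: Dict[Context, int]) -> Dict[Context, int]:
    """Top-down pass: a node's own latency is its inferred latency minus
    the inferred latencies of its direct children in the tree."""
    children: Dict[Context, List[Context]] = {ctx: [] for ctx in inclusive}
    for ctx in inclusive:
        parent = ctx[:-1]
        if parent in children:
            children[parent].append(ctx)

    own: Dict[Context, int] = {}

    def visit(ctx: Context) -> None:
        kids = children[ctx]
        own[ctx] = inclusive[ctx] - sum(inclusive[k] for k in kids)
        for k in kids:              # recurse into the next layer down
            visit(k)

    for ctx in inclusive:
        if ctx[:-1] not in inclusive:   # start from each top-level context
            visit(ctx)
    return own


if __name__ == "__main__":
    # Inclusive latencies (last seen - first seen) from the running example.
    inferred = {("A",): 3, ("A", "B"): 1, ("A", "B", "D"): 1,
                ("A", "C"): 1, ("A", "C", "D"): 0}
    for ctx, latency in normalize(inferred).items():
        print("/".join(ctx), latency)
```

For the running example, A's inferred latency of 3 is reduced to 1 after subtracting B and C, and the normalized values across all nodes again add up to 3, so no latency is counted in more than one layer.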
Context-sensitive Performance Analysis
• Context-aware performance analysis involves diverse states of programs because of context-sensitive function call behavior.
• Manual analysis would consume significant time and effort from users.
• Ranking function call paths by latency allows us to focus on the sources of performance bug symptoms.
Ranking Calling Contexts and Functions
• We calculate the cost of each calling context (i.e., call path from the root) by storing the inferred function latencies.
• The top N calling contexts ranked by latency (i.e., hot calling contexts) are listed for evaluation.
• Furthermore, for each hot calling context, function nodes are ranked by their latencies, and the hot functions inside the path are determined.
[Figure: two views of ranked calling contexts, from the top rank context to lower rank contexts; each context spans from a high-level application function (e.g., main) to the low-level system layer (e.g., syscall).]
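A rough sketch of this two-level ranking, under the assumption that the cost of a calling context is the sum of the normalized latencies of the functions along its path. The function name rank_contexts, the top_n parameter, and the tuple-keyed input are hypothetical and only serve the illustration.

```python
from typing import Dict, List, Tuple

Context = Tuple[str, ...]


def rank_contexts(own: Dict[Context, int], top_n: int = 3):
    """Return the top_n hottest call paths as (context, cost, hot_functions),
    where hot_functions lists (function, latency) from slowest to fastest.
    Assumes every prefix of a context is also a key of `own`."""
    # Full call paths are the contexts that no other context extends.
    parents = {ctx[:-1] for ctx in own}
    paths: List[Context] = [ctx for ctx in own if ctx not in parents]

    def nodes(ctx: Context) -> List[Tuple[str, int]]:
        return [(ctx[d - 1], own[ctx[:d]]) for d in range(1, len(ctx) + 1)]

    def cost(ctx: Context) -> int:
        return sum(latency for _, latency in nodes(ctx))

    hottest = sorted(paths, key=cost, reverse=True)[:top_n]
    return [(ctx, cost(ctx),
             sorted(nodes(ctx), key=lambda item: item[1], reverse=True))
            for ctx in hottest]


if __name__ == "__main__":
    # Made-up normalized latencies, for illustration only.
    own = {("main",): 1, ("main", "do"): 2, ("main", "do", "fwrite"): 1,
           ("main", "do", "fwrite", "write"): 40, ("main", "log"): 3}
    for ctx, total, hot in rank_contexts(own, top_n=2):
        print("/".join(ctx), total, hot[0])
```

Ranking paths first and functions second narrows the search twice: the first list says which call path to look at, and the second says which function on that path contributes most of the latency.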