DebConf16
LTTng: Kernel and userspace tracing in Debian
mjeanson@efficios.com
whoami
Michael Jeanson, Software developer @ EfficiOS
● Debian Maintainer
● Ubuntu Member
● Fedora Packager
● Official and unofficial packager for other distros
Content
● What is tracing?
● A description of the different tools involved
● LTTng compared to other tracing tools like strace
● The state of LTTng in Debian
● Basic use cases and workflows
● Analysis of kernel traces
What is tracing?
● Like a black box / flight recorder for your system
● Record runtime information
  – Syscalls
  – Function entry/exit
● Enable/disable event(s) at runtime
● Low overhead
Why use tracing?
● Problems that are not easily diagnosed with traditional tools or debugging
  – Narrow down bug causes
  – Identify performance hogs
● Very low performance impact
  – Can be used on production systems
Tools
● Tracers
● Control utilities
● Viewers
● Post-processing / analysis
Tracers
● lttng-modules: out-of-tree (OOT) kernel tracer modules
  – compatible with kernels 2.6.38 to latest rc
  – you don't need to recompile your kernel
  – packaged as lttng-modules-dkms in Debian
● lttng-ust: user-space tracer, in-process library
  – Java JUL and log4j agents
  – Python logging agent
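Each tracer above corresponds to a tracing domain that the `lttng` command selects with a flag. The sketch below maps a domain name to its `lttng enable-event` flag and prints the resulting command instead of running it. The helper names `domain_flag` and `enable_all_cmd` are hypothetical, and the `--python` domain is assumed to need a newer LTTng than the 2.5 stack.

```shell
# Hypothetical helper: map a tracing domain name to its lttng enable-event flag.
domain_flag() {
    case "$1" in
        kernel)    echo "--kernel" ;;     # lttng-modules
        userspace) echo "--userspace" ;;  # lttng-ust tracepoints
        jul)       echo "--jul" ;;        # Java java.util.logging agent
        log4j)     echo "--log4j" ;;      # Java log4j agent
        python)    echo "--python" ;;     # Python logging agent
        *)         echo "unknown domain: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: print (not run) the command enabling every event
# of the given domain in the current session.
enable_all_cmd() {
    flag=$(domain_flag "$1") || return 1
    echo "lttng enable-event $flag --all"
}
```

For example, `enable_all_cmd kernel` prints a command equivalent to the `lttng enable-event -k -a` used in the demo.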
Control utilities
● lttng-tools: CLI utilities and daemons for trace control
  – lttng: main CLI command
  – lttng-ctl: tracing control library
  – lttng-sessiond: tracing registry daemon
  – lttng-consumerd: extract trace data
  – lttng-relayd: network streaming daemon
Viewers
● babeltrace: CLI text viewer and trace converter
● tracecompass:
  – GUI front-end for LTTng
  – Collect, visualize and analyze traces
  – Eclipse plugin or standalone version
● lttngtop: ncurses top-like viewer
Post-processing / Analysis
● lttng-analyses
  – Record your system's activity
  – Do whatever it takes for your problem to occur
  – Diagnose your problem's cause offline (when tracing is stopped)
Compare
● strace: syscall and signal tracer
● ftrace: in-kernel function and event tracer
● perf: in-kernel profiler and tracer
LTTng in Debian
● All the tools are packaged
● 2 maintainers
● Testing/unstable: latest 2.8 stack
● Stable: 2.5 stack, unsupported :(
● Stable-backports: 2.8 stack coming soon
● Oldstable: ancient stuff, don't use it
LTTng in Ubuntu
● Xenial: 2.7 stack, supported
● Trusty: 2.4 stack, unsupported :(
● PPAs
  – Daily builds
  – Stable branch builds
  – Release builds for latest LTS
Use cases
● Debugging complex and hard to reproduce problems
● Embedded development with remote tracing
● Use snapshot mode for difficult to reproduce bugs
● Low-level metric collection with network streaming of traces
● Low-overhead top-like monitoring with lttngtop
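The snapshot and network streaming use cases above can be sketched with the `lttng` CLI as follows. This is a minimal sketch, not the speaker's demo: the function names and the `run` dry-run helper are hypothetical, the relay host is a placeholder that must already run `lttng-relayd`, and kernel tracing is assumed to need root or membership in the `tracing` group.

```shell
# Hypothetical helper: print each command, and execute it unless DRY_RUN=1.
run() {
    echo "+ $*"
    [ "${DRY_RUN:-0}" = "1" ] || "$@"
}

# Snapshot mode: events stay in memory buffers until a snapshot is requested.
# $1 = session name (placeholder)
start_snapshot_session() {
    run lttng create "$1" --snapshot &&
    run lttng enable-event -k -a &&
    run lttng start
}

# Call this once the hard-to-reproduce bug has just happened:
# it writes the current buffer contents to the trace output.
take_snapshot() {
    run lttng snapshot record
}

# Network streaming: trace data is sent to an lttng-relayd on another host.
# $1 = session name, $2 = relay host (both placeholders)
start_streaming_session() {
    run lttng create "$1" --set-url="net://$2" &&
    run lttng enable-event -k -a &&
    run lttng start
}
```

Running `DRY_RUN=1 start_snapshot_session demo` only prints the three commands, which is a convenient way to review them before tracing for real.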
Workflows
● Given a reproducible problem
● Gather trace (high level at first)
● Analyze (narrow down the problem source)
● Add instrumentation if needed
● Rinse, repeat
LTTng analyses
● Demo!
  – lttng create
  – lttng enable-event -k -a
  – lttng start
  – ...wait for the problem to appear...
  – lttng stop
  – lttng destroy
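The demo steps above can be wrapped in a small script. The sketch below uses the same `lttng` commands as the slide. The `record_kernel_trace` function, its arguments, the fixed `sleep` standing in for "wait for the problem to appear", and the `run` dry-run helper are all assumptions added for illustration. Kernel tracing is assumed to need root or membership in the `tracing` group.

```shell
# Hypothetical helper: print each command, and execute it unless DRY_RUN=1.
run() {
    echo "+ $*"
    [ "${DRY_RUN:-0}" = "1" ] || "$@"
}

# Record all kernel events for a fixed duration.
# $1 = session name (placeholder), $2 = seconds to keep tracing
record_kernel_trace() {
    session="$1"
    seconds="$2"
    run lttng create "$session" &&   # new tracing session
    run lttng enable-event -k -a &&  # enable all kernel events
    run lttng start &&               # begin recording
    run sleep "$seconds" &&          # stand-in for reproducing the problem
    run lttng stop &&                # stop recording
    run lttng destroy                # tear down the session, trace files remain
}
```

`DRY_RUN=1 record_kernel_trace demo 10` prints the six commands without touching the tracer. Enabling every kernel event is a reasonable high-level first pass, and the event list can be narrowed once the analyses point at a subsystem.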
Demo: IO usage
$ lttng-iousagetop demo-trace/
Demo: IO latency
$ lttng-iolatencystats demo-trace/ --minsize 2
Demo: IO latency
$ lttng-iolog demo-trace --timerange [12:18:50.162776739,12:18:51.157522361]
● Wouldn't be a demo if everything worked
● From the previous step, we know when the latency happened, look at the log
● Find the root cause
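When scripting this step, the `--timerange` value is worth building and quoting explicitly, since an unquoted `[...]` is a glob pattern to the shell. A minimal sketch, with `timerange_arg` as a hypothetical helper and the timestamps taken from the demo:

```shell
# Hypothetical helper: build a --timerange argument from begin and end timestamps.
timerange_arg() {
    printf '[%s,%s]\n' "$1" "$2"
}

range=$(timerange_arg 12:18:50.162776739 12:18:51.157522361)

# Print the command; quoting "$range" keeps the brackets from being
# interpreted as a filename pattern when the command is actually run.
echo lttng-iolog demo-trace/ --timerange "$range"
```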
Questions?
LTTng Project
https://{git | www}.lttng.org
lttng-dev@lists.lttng.org
@lttng_project
#lttng on irc.oftc.net