Trumpet: Timely and Precise Triggers in Data Centers
Masoud Moshref, Minlan Yu, Ramesh Govindan, Amin Vahdat
The Problem
Long failure repair times in large networks; human-in-the-loop failure assessment and repair (Evolve or Die, SIGCOMM 2016)
Humans in the Loop
A cycle of Detect → Locate → Inspect → Fix
Programs in the Loop
The same Detect → Locate → Inspect → Fix cycle, with programs in the loop
Our Focus: Detect
A framework for programmed detection of events in large datacenters
Events
❖ Availability: link failure, switch failure, loop, blackhole, middlebox failure, lost packet
❖ Performance: packet burst, burst loss, traffic surge, packet delay, congestion, incast, load imbalance
❖ Security: DDoS, traffic hijack
Our Focus: Detect
Aggregated, often sampled measures of network health
Fine Timescale Events: Detecting Transient Congestion
❖ A 40 ms burst
❖ Timeouts lasting several hundred milliseconds
Fine Timescale Events: Detecting Attack Onset
Did this tenant see a sudden increase in traffic over the last few milliseconds?
Inspect Every Packet
Some event definitions may require inspecting every packet
Eventing Framework Requirements
▸ Expressivity: the set of possible events is not known a priori
▸ Fine timescale eventing: capture transient and onset events
▸ Per-packet processing: precise event determination
Because data centers will require high availability and high utilization
A Key Architectural Question
Where do we place eventing functionality: switches, NICs, or hosts?
Hosts:
❖ Are programmable
❖ Have processing power for fine-timescale eventing
❖ Already inspect every packet
We explore the design of a host-based eventing framework
Research Questions
▸ What eventing architecture permits programmability and visibility?
▸ How can we achieve precise eventing at fine timescales?
▸ What is the performance envelope of such an eventing framework?
Research Questions: What eventing architecture permits programmability and visibility?
▸ Trumpet has a logically centralized event manager that aggregates local events from per-host packet monitors
Event Definition
For each packet matching <filter>, group by <flow granularity>, and report, every <time interval>, each group that satisfies <predicate>
Predicates can express flow volumes, loss rate, loss pattern (bursts), and delay
Event Example: Is there any flow sourced by a service that sees a burst of losses in a small interval?
For each packet matching <service IP prefix>, group by <5-tuple>, and report every 10 ms any flow whose sum(is_lost & is_burst) > 10%
Event Example: Is there a job in a cluster that sees abnormal traffic volumes in a small interval?
For each packet matching <cluster IP prefix and port>, group by <job IP prefix>, and report every 10 ms any job whose sum(volume) > 100 MB
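To make the template concrete, here is a minimal sketch of how such an event definition and the two examples might be encoded. All names (trumpet_event, flow_stats, the field layout) are assumptions for exposition, not Trumpet's actual API.

```c
/* Illustrative encoding of the two example events. Types and names
 * are invented for exposition, not Trumpet's real interface. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

enum flow_gran { GRAN_5TUPLE, GRAN_JOB_PREFIX };

struct flow_stats {             /* statistics gathered per group */
    uint64_t pkts, bytes;
    uint64_t burst_lost;        /* packets lost within a loss burst */
};

struct trumpet_event {
    const char    *filter;      /* packets matching this are examined */
    enum flow_gran granularity; /* how matched packets are grouped */
    uint32_t       interval_ms; /* reporting period */
    bool (*predicate)(const struct flow_stats *);
};

/* Example 1: flow with a burst of losses exceeding 10% of packets. */
static bool burst_loss(const struct flow_stats *s) {
    return s->pkts > 0 && s->burst_lost * 10 > s->pkts;
}

/* Example 2: job moving more than 100 MB in one interval. */
static bool heavy_job(const struct flow_stats *s) {
    return s->bytes > 100ULL * 1000 * 1000;
}

int main(void) {
    struct trumpet_event events[] = {
        { "src in <service IP prefix>",   GRAN_5TUPLE,     10, burst_loss },
        { "dst in <cluster prefix+port>", GRAN_JOB_PREFIX, 10, heavy_job },
    };
    /* Evaluate both predicates against sample per-interval stats. */
    struct flow_stats s = { .pkts = 1000, .bytes = 150000000, .burst_lost = 150 };
    for (unsigned i = 0; i < 2; i++)
        printf("event %u fires: %d\n", i, events[i].predicate(&s));
    return 0;
}
```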
Trumpet Design
❖ A controller submits event descriptions to the Trumpet Event Manager; the manager installs triggers and collects trigger reports
❖ Each server runs the Trumpet Packet Monitor in the hypervisor, alongside the software switch that connects its VMs
Trumpet Event Manager
❖ Translates a network-wide question (e.g., "Congestion?") into triggers installed at per-host packet monitors
❖ A trigger contains the event attributes and detects local events
❖ The manager aggregates trigger reports from the monitors into a network-wide answer
Drill-Down
Trumpet can be used by programs to drill down to potential root causes: when a congestion event fires, a program can install a "Large flow?" trigger to test whether a large flow explains it
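A hypothetical sketch of such a drill-down program, assuming an invented manager interface (tem_install, tem_wait); the talk does not show Trumpet's real control API, so the stubs below stand in for it.

```c
/* Hypothetical drill-down loop. The tem_* calls and event specs are
 * invented stand-ins for Trumpet's control interface. */
#include <stdio.h>
#include <string.h>

/* Stub manager API so the sketch is self-contained. */
static int tem_install(const char *event_spec) {
    printf("install: %s\n", event_spec);
    return 0;
}
static int tem_wait(char *name_out, size_t len) {
    strncpy(name_out, "congestion", len);  /* pretend congestion fired */
    return 0;
}

int main(void) {
    char ev[64];
    tem_install("congestion: sum(queued) > threshold every 10ms");
    if (tem_wait(ev, sizeof ev) == 0 && strcmp(ev, "congestion") == 0) {
        /* Drill down: did a large flow cause the congestion? */
        tem_install("large-flow: sum(volume) > 10MB every 10ms");
    }
    return 0;
}
```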
Research Questions: How can we achieve precise eventing at fine timescales?
▸ The monitor optimizes packet processing to inspect every packet and evaluate predicates at fine timescales
The Packet Monitor
Each server runs the Trumpet Packet Monitor in the hypervisor, next to the software switch that connects the VMs
A Key Assumption
The monitor piggybacks on the CPU core used by the software switch:
❖ Conserves server CPU resources
❖ Avoids inter-core synchronization
Can a single core monitor thousands of triggers at full packet rate (14.8 Mpps) on a 10G NIC?
Two Obvious Tricks
▸ Use kernel bypass: avoid kernel stack overhead
▸ Use polling for tighter scheduling: trigger time intervals of 10 ms
Necessary, but far from sufficient…
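A minimal sketch of the busy-poll pattern these two tricks imply. The driver functions (nic_rx_burst, monitor_packet) are stand-ins for a user-level NIC driver, not any real kernel-bypass API.

```c
/* Kernel-bypass + polling pattern: no interrupts, no syscalls. */
#include <stdint.h>

struct pkt { const uint8_t *data; uint16_t len; };

/* Stand-in for a user-space driver's burst receive: fills up to n
 * descriptors, returns the count, never blocks. Stubbed here so the
 * sketch is self-contained. */
static uint16_t nic_rx_burst(struct pkt *pkts, uint16_t n) {
    (void)pkts; (void)n;
    return 0;                       /* stub: no traffic in this sketch */
}
static void monitor_packet(const struct pkt *p) { (void)p; }
static void do_offpath_chunk(void) { /* predicate checks, sweeps */ }

void poll_loop(void) {
    struct pkt burst[32];
    for (;;) {                      /* busy-poll the receive queue */
        uint16_t n = nic_rx_burst(burst, 32);
        for (uint16_t i = 0; i < n; i++)
            monitor_packet(&burst[i]);   /* on-path, every packet */
        if (n == 0)
            do_offpath_chunk();     /* spend idle cycles off-path */
    }
}
```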
Monitor Design
With 1000s of triggers, each packet requires three operations: match filters, update statistics at flow granularity, and check predicates at each trigger's time interval.

Filter                    Flow granularity    Time interval   Predicate
Source IP = 10.1.1.0/24   5-tuple             10 ms           Sum(loss) > 10%
Source IP = 20.2.2.0/24   Service IP prefix   100 ms          Sum(size) < 10 MB
Design Challenges
Which of these operations (matching filters, updating statistics, checking predicates) should be performed:
❖ On-path
❖ Off-path
Design Challenges
Which operations to do on-path?
❖ Only ~70 ns to forward and inspect each packet
Design Challenges
How to schedule off-path operations?
❖ Off-path work on the same core can delay packets
❖ Delay must be bounded to a few µs
Strawman Design
On-path, only record a packet history; off-path, match filters, update statistics at flow granularity, and check predicates at each time interval.
▸ Doesn't scale to large numbers of triggers
Strawman Design
On-path, match filters and update statistics at flow granularity; off-path, check predicates at each time interval.
▸ Still cannot reach the goal: the memory subsystem becomes a bottleneck
Trumpet Monitor Design
On-path, match filters and update statistics at 5-tuple granularity; off-path, gather statistics at trigger flow granularity and check predicates at each time interval.
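A simplified sketch of this on-path/off-path split. The direct-mapped flow table and all names are assumptions for exposition; the real system's data layouts are far more heavily optimized.

```c
/* Two-phase monitoring: cheap per-packet work on-path, aggregation
 * and predicate checks off-path. Simplified assumptions throughout. */
#include <stdint.h>

#define FLOWS    4096
#define TRIGGERS 64

struct five_tuple { uint32_t sip, dip; uint16_t sp, dp; uint8_t proto; };

struct flow_entry {
    struct five_tuple key;
    uint32_t trigger_id;     /* filter match cached on first packet */
    uint64_t bytes;          /* on-path keeps 5-tuple counters only */
    int in_use;
};

static struct flow_entry flow_table[FLOWS];
static uint64_t trigger_bytes[TRIGGERS];   /* off-path aggregates */

static uint32_t hash_tuple(const struct five_tuple *t) {
    return (t->sip ^ t->dip ^ t->sp ^ ((uint32_t)t->dp << 16) ^ t->proto) % FLOWS;
}
static int same_tuple(const struct five_tuple *a, const struct five_tuple *b) {
    return a->sip == b->sip && a->dip == b->dip &&
           a->sp == b->sp && a->dp == b->dp && a->proto == b->proto;
}
/* Stand-in for the slower filter-matching step; runs only when a
 * flow is first seen, after which the result is cached. */
static uint32_t match_filters(const struct five_tuple *t) {
    return t->sip % TRIGGERS;
}

/* ON-PATH: runs for every packet; only cheap, cache-friendly work.
 * Direct-mapped for brevity: a colliding flow evicts and re-matches. */
void on_path(const struct five_tuple *t, uint16_t len) {
    struct flow_entry *e = &flow_table[hash_tuple(t)];
    if (!e->in_use || !same_tuple(&e->key, t)) {
        e->key = *t;
        e->trigger_id = match_filters(t);  /* match once, cache result */
        e->bytes = 0;
        e->in_use = 1;
    }
    e->bytes += len;      /* update statistics at 5-tuple granularity */
}

/* OFF-PATH: per trigger interval, fold 5-tuple counters up to the
 * trigger's flow granularity, then evaluate its predicate. */
void off_path_sweep(void) {
    for (int i = 0; i < FLOWS; i++) {
        if (flow_table[i].in_use) {
            trigger_bytes[flow_table[i].trigger_id] += flow_table[i].bytes;
            flow_table[i].bytes = 0;       /* reset for next interval */
        }
    }
    /* ...then check each trigger's predicate over trigger_bytes[]. */
}
```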
Optimizations (on-path: match filters, update statistics at 5-tuple granularity)
❖ Use tuple-space search for matching
❖ Match on first packet, cache the match
❖ Lay out tables to enable cache prefetch
❖ Use TLB huge pages for tables
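For the first optimization, a toy tuple-space search: filters are grouped by their mask (the "tuple"), with one hash table per distinct mask, so matching costs one lookup per mask rather than one comparison per filter. The structure below, simplified to source-IP filters, is an assumption, not Trumpet's table layout.

```c
/* Toy tuple-space search over source-IP prefix filters. */
#include <stdint.h>

#define TABLES  4        /* distinct prefix masks in use */
#define BUCKETS 1024

struct filter { uint32_t masked_sip; uint32_t trigger_id; int in_use; };

struct tuple_table {
    uint32_t mask;                   /* e.g., 0xFFFFFF00 for a /24 */
    struct filter slot[BUCKETS];     /* open-addressed hash table */
};

static struct tuple_table tables[TABLES];

static uint32_t h(uint32_t key) { return (key * 2654435761u) % BUCKETS; }

/* Returns the matching trigger id, or 0 for no match. Insertion
 * (omitted) must keep the linear-probe invariant: no gaps within
 * a probe chain. */
uint32_t match_filters(uint32_t sip) {
    for (int t = 0; t < TABLES; t++) {          /* one lookup per mask */
        uint32_t key = sip & tables[t].mask;
        for (uint32_t i = h(key), n = 0; n < BUCKETS;
             i = (i + 1) % BUCKETS, n++) {
            struct filter *f = &tables[t].slot[i];
            if (!f->in_use) break;              /* empty slot: miss */
            if (f->masked_sip == key) return f->trigger_id;
        }
    }
    return 0;
}
```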
Optimizations (off-path: gather statistics at flow granularity, check predicates at each time interval)
❖ Lazy cleanup of statistics across intervals
❖ Lay out tables to enable cache prefetch
❖ Bounded-delay cooperative scheduling
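A small sketch of the lazy-cleanup idea, assuming an epoch-stamp scheme: counters are reset only when next touched, so interval rollover is O(1) instead of a cache-thrashing sweep over all flows. The exact mechanism here is an assumption, not Trumpet's implementation.

```c
/* Lazy cleanup via epoch stamps: stale counters read as zero and
 * are reset on their next write, never in a bulk sweep. */
#include <stdint.h>

struct lazy_counter {
    uint64_t value;
    uint32_t epoch;     /* interval number of the last update */
};

static uint32_t current_epoch;   /* advanced once per time interval */

void lazy_add(struct lazy_counter *c, uint64_t delta) {
    if (c->epoch != current_epoch) {   /* first touch this interval: */
        c->value = 0;                  /* reset lazily, no sweep */
        c->epoch = current_epoch;
    }
    c->value += delta;
}

uint64_t lazy_read(const struct lazy_counter *c) {
    return c->epoch == current_epoch ? c->value : 0;
}

void end_of_interval(void) { current_epoch++; }  /* O(1), not O(flows) */
```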
Bounded-Delay Cooperative Scheduling
Off-path work is interleaved with on-path packet processing so that packet delay stays bounded to a few µs.
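A sketch of this scheduling discipline under assumed primitives: off-path work runs in short, cycle-budgeted bursts and yields to packet processing between bursts, so a queued packet waits at most roughly one burst. The TSC budget value is an illustrative choice.

```c
/* Bounded-delay cooperative scheduling of the off-path sweep. */
#include <stdint.h>
#include <x86intrin.h>   /* __rdtsc() */

#define FLOWS      65536
#define BUDGET_TSC 10000ULL   /* roughly a few microseconds of cycles */

static void sweep_entry(int i) { (void)i; }  /* fold stats, check predicate */
static void drain_nic_queue(void) { }        /* on-path packet processing */

void sweep_with_bounded_delay(void) {
    int next = 0;
    while (next < FLOWS) {
        uint64_t start = __rdtsc();
        /* Off-path work only until the cycle budget is spent... */
        while (next < FLOWS && __rdtsc() - start < BUDGET_TSC)
            sweep_entry(next++);
        /* ...then yield, bounding packet delay to ~BUDGET_TSC cycles. */
        drain_nic_queue();
    }
}
```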
Research Questions: What is the performance envelope of such an eventing framework?
▸ Trumpet can monitor thousands of triggers at full packet rate on a 10G NIC
Evaluation
❖ Trumpet is expressive: transient congestion, burst loss, attack onset
❖ Trumpet scales to thousands of triggers
❖ Trumpet is DoS-resilient
Detecting Transient Congestion
Trumpet can detect millisecond-scale congestion events. [Plot: a congestion event and reactive large-flow detection within a 40 ms window.]
Scalability
Trumpet can sustain full packet rate❉: 64-byte packets at 10G (14.8 Mpps) and 650-byte packets at 4x10G, while evaluating 16K triggers at 10 ms granularity
❉ Xeon E5-2650, 10-core 2.3 GHz, Intel 82599 10G NIC
Performance Envelope
The envelope depends on how many triggers each flow matches and how often each predicate is checked; above this rate, Trumpet would miss events
Performance Envelope
The number of <trigger, flow> pairs increases statistics-gathering overhead; at moderate packet rates, Trumpet can detect events at 1 ms granularity
Performance Envelope
At time intervals of 10 ms and above, the CPU can sustain full packet rate; deployments should profile and provision Trumpet accordingly
Conclusion
▸ Future datacenters will need fast and precise eventing; Trumpet is an expressive system for host-based eventing
▸ Trumpet can process 16K triggers at full packet rate, without delaying packets by more than 10 µs
▸ Future work: scale to 40G NICs, perhaps with NIC or switch support
https://github.com/USC-NSL/Trumpet
A Big Discrepancy
The outage budget for five-9s availability (99.999% uptime) is about 24 seconds per month, yet failure durations are long because of the time needed to root-cause failures
Every optimization is necessary❉
❉ Details in the paper