Time-Muxed Parsing in Marking-based Network Telemetry
Alon Riesenberg*, Yonnie Kirzon*, Michael Bunin*, Elad Galili*, Gidi Navon•, Tal Mizrahi⋄
ACM SYSTOR, Haifa, May 2019
Background
What is network telemetry?
• Performance measurement (packet loss, queue status, delay, …) + exporting to a remote location.
Why do we need telemetry?
• Congestion / failures detection
• Bottlenecks
• ‘Elephant’ flows
• …
6/16/2019
Operations, Administration, Maintenance (OAM)
• Network measurement / monitoring: control messages
• Network telemetry
Ping / Traceroute
Old-School Passive Monitoring
• Counters: per port, per flow
• Queue state: per queue
• Latency
• …
Carrier Network OAM
OAM protocols by layer:
• Higher layers
• Layer 3 (IP OAM): IETF ICMPv4, IETF ICMPv6, IETF IPPM
• MPLS / PWE3 (MPLS OAM, MPLS-TP OAM, VCCV OAM): ITU-T Y.1711, IETF MPLS-TP, IETF LSP-Ping, IETF PWE3, ITU-T G.8113.1
• BFD
• Layer 2 (Ethernet OAM): ITU-T Y.1731, IEEE 802.1ag, IEEE 802.3ah
• Layer 1
Active measurement / monitoring: control messages
Fate Sharing
http://www.speedtest.net
Piggybacked Measurement
Measurement information is piggybacked onto data packets:
• AM-PM
• IOAM / INT
Piggybacked Metadata – IOAM / INT
• IOAM: In situ OAM. INT: In-band Network Telemetry.
• Switches in the IOAM / INT domain push local metadata into the header: delay, queue state, …
• Telemetry info is sent to an analytics server.
• Per-packet metadata, hence per-packet overhead.
AM-PM: Alternate Marking – Performance Measurement
• RFC 8321: Fioccola, G., Capello, A., Cociglio, M., Castaldelli, L., Chen, M., Zheng, L., Mirsky, G., and T. Mizrahi, “Alternate Marking method for passive and hybrid performance monitoring”, RFC 8321, 2018.
• draft-mizrahi-ippm-multiplexed-alternate-marking (internet draft): T. Mizrahi, C. Arad, G. Fioccola, M. Cociglio, M. Chen, L. Zheng, and G. Mirsky, “Compact Alternate Marking Methods for Passive Performance Monitoring”, draft-mizrahi-ippm-compact-alternate-marking, work in progress, IETF, 2018.
AM-PM: What Can We Do with ONE Bit Per Packet?
Measurement using the marking bit over time:
• Pulse: 00000001000000000
• Step: 000 11111 00000 111
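The two marking patterns can be sketched in a few lines of Python. This is only an illustration: the helper names are hypothetical, and indexing by packet number rather than by time is a simplifying assumption that the slide does not make.

```python
def pulse_marking(num_packets, pulse_index):
    """Hypothetical helper: set the marking bit on exactly one packet (a pulse)."""
    return [1 if i == pulse_index else 0 for i in range(num_packets)]


def step_marking(num_packets, block_size):
    """Hypothetical helper: flip the marking bit every block_size packets (a step pattern)."""
    return [(i // block_size) % 2 for i in range(num_packets)]


print("".join(map(str, pulse_marking(8, 3))))  # 00010000
print("".join(map(str, step_marking(12, 3))))  # 000111000111
```

A pulse singles out one packet whose timestamp can be compared at two points, while a step splits the traffic into alternating blocks that can be counted separately.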
AM-PM: Pulse Marking – Delay Measurement
• One measurement point checks when the packet is sent; the other checks when the packet is received.
• Time sent: March 8th, 16:02, 123400789 nsec (UTC)
• Time received: March 8th, 16:02, 123500789 nsec (UTC)
• Network delay (computed by the analytics server): 100 μsec
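The delay on this slide is a plain subtraction of the two timestamps taken for the same pulse-marked packet. A minimal sketch, assuming both measurement points share a synchronized clock and that only the nanosecond fields differ (the function name is hypothetical):

```python
NS_PER_US = 1_000


def delay_us(sent_ns, received_ns):
    """One-way delay in microseconds from two timestamps of the same marked packet."""
    return (received_ns - sent_ns) / NS_PER_US


# Nanosecond fields taken from the slide.
print(delay_us(123_400_789, 123_500_789))  # 100.0
```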
AM-PM: Pulse Marking – Loss Measurement
• Each measurement point records its counter value.
• Counter at the first point: 2100. Counter at the second point: 2000.
• Packets lost (computed by the analytics server): 100
• Out of order?
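One possible reading of the "Out of order?" note is that a counter sampled at the pulse becomes inaccurate when packets are reordered around the marked packet. The sketch below illustrates that reading with made-up streams; the helper name and the data are assumptions, not taken from the talk.

```python
def counter_at_pulse(stream):
    """Return the packet count at the moment the pulse-marked packet (bit 1) is seen."""
    for count, bit in enumerate(stream, start=1):
        if bit == 1:
            return count
    return None  # no pulse observed


sent = [0, 0, 0, 1, 0, 0]      # the pulse is the 4th packet sent
received = [0, 0, 0, 0, 1, 0]  # the same 6 packets, but one overtook the pulse

apparent_loss = counter_at_pulse(sent) - counter_at_pulse(received)
print(apparent_loss)  # -1, although no packet was lost
```

With the counters from the slide, the same subtraction gives 2100 - 2000 = 100 lost packets.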
AM-PM: Alternate Marking – Loss Measurement
• One measurement point counts the number of packets sent; the other counts the number of packets received (per-color counting).
• Packets sent: 10,000. Packets received: 9,500. Packets lost (computed by the analytics server): 500.
Consistent counting:
• Export the counter of each color when it is not in use.
• Resilient to reordering.
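Per-color counting can be sketched as follows. The streams are invented for illustration and the function name is hypothetical. The sketch assumes each color's counter is read only after marking has moved on to the other color, so the count is final.

```python
from collections import Counter


def loss_per_color(sent, received):
    """Compare per-color packet counts taken at two measurement points."""
    sent_counts, recv_counts = Counter(sent), Counter(received)
    return {color: sent_counts[color] - recv_counts[color] for color in sorted(sent_counts)}


sent = [0] * 5 + [1] * 5
# One packet of each color is lost, and packets are reordered at the color boundary.
received = [0, 0, 0, 1, 0, 1, 1, 1]

print(loss_per_color(sent, received))  # {0: 1, 1: 1}
```

Because each packet carries its own color, the reordering at the boundary does not change either count, which is the resilience the slide refers to.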
AM-PM: Double Marking
• Pulse bit: delay
• Step bit: loss
• TWO bits per packet
AM-PM: Multiplexed Marking
• Pulse: delay
• Step: loss
• ONE bit per packet
Accurate loss and delay measurement!
Design and Implementation of AM-PM
• Match-action lookup: TCAM / exact match / P4
• Time-as-a-match: TimeFlip
• State: detect first packet (pulse/step)
Time-as-a-match: TimeFlip [MRM]
• The switch TCAM matches on the time together with header / metadata fields (Time, field2, field3, field4, …) and selects an action.
• Periodic range (1 second): * … * 1 * … * over the Time.Sec and Time.Frac bits.
[MRM] Mizrahi, Rottenstreich, Moses, INFOCOM 2015.
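The periodic-range idea can be illustrated with a ternary (value/mask) match on the time field. This is a sketch under assumptions: the function names are hypothetical, time is reduced to an integer number of seconds, and the bit chosen as the non-wildcard bit is a free parameter rather than something the slide fixes.

```python
def ternary_match(key, value, mask):
    """TCAM-style match: only bits set in mask are compared; all other bits are wildcards."""
    return (key & mask) == (value & mask)


def marking_bit_for_time(time_sec, bit_index=0):
    """Hypothetical periodic rule: one entry that cares about a single bit of the seconds field."""
    mask = 1 << bit_index
    return 1 if ternary_match(time_sec, mask, mask) else 0


print([marking_bit_for_time(t) for t in range(6)])               # [0, 1, 0, 1, 0, 1]
print([marking_bit_for_time(t, bit_index=1) for t in range(6)])  # [0, 0, 1, 1, 0, 0]
```

Moving the non-wildcard bit one position up doubles the period, so a single entry can express a periodic time range without per-flow state.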
Design and Implementation of AM-PM: Step/Pulse
• Match-action lookup: TCAM / exact match / P4
• Time-as-a-match: TimeFlip
• State: detect first packet (pulse/step)
Multiplexed Marking: a Naïve Implementation
Figure: marking bit (0 / 1) over time.
• Track the value of the marking bit.
• Detect pulse: when the value changes for one packet.
• Detect step: when the value changes for more than one packet.
Non-trivial to implement using a match-action abstraction.
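A software sketch of the naïve rule helps show why it is awkward for a match-action pipeline: the decision for one packet depends on remembered state and on the packet that follows it. The function name and the sample stream are assumptions made for illustration.

```python
def naive_classify(bits):
    """Classify bit changes as ('pulse', index) or ('step', index) using state and lookahead."""
    events = []
    current = bits[0]  # remembered value of the marking bit (per-flow state)
    i = 1
    while i < len(bits):
        if bits[i] != current:
            # The change cannot be classified until the NEXT packet is seen.
            if i + 1 < len(bits) and bits[i + 1] == current:
                events.append(("pulse", i))
                i += 2
                continue
            # A change on the very last packet is ambiguous; it is treated as a step here.
            events.append(("step", i))
            current = bits[i]
        i += 1
    return events


print(naive_classify([0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1]))
# [('pulse', 2), ('step', 5), ('pulse', 8)]
```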
Our Approach: Time-multiplexed Parsing
Header field(s) have a different interpretation in each time slot!
Figure: marking bit (0 / 1) over time slots 000, 001, 010, 011, 100, 101, 110, 111, with regions labeled “Detect step”, “Detect ‘1’ pulse”, and “Detect ‘0’ pulse”.
• TimeFlip is used to divide time into time slots.
• The marking bit has a different interpretation in each time slot.
• Requires rough time synchronization, e.g., ~1 second.
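The slot-dependent interpretation can be sketched as a stateless lookup on (time slot, marking bit). The slot assignment below is an assumption made for illustration: the step value is taken to be the most significant slot bit, and pulses are assumed to occur only in the two middle slots of each half-period. The slides do not spell out the exact assignment, and all names are hypothetical.

```python
NUM_SLOTS = 8
# Assumption: pulses are only expected in the middle slots of each half-period.
PULSE_SLOTS = {0b001, 0b010, 0b101, 0b110}


def slot_of(time_sec, slot_len_sec=1):
    """Map a coarse timestamp to one of the 8 time slots."""
    return (time_sec // slot_len_sec) % NUM_SLOTS


def parse_marking_bit(slot, bit):
    """Return (color, is_pulse) for a packet, based only on the slot and the bit."""
    step_value = slot >> 2  # assumed: 0 in slots 000-011, 1 in slots 100-111
    if slot in PULSE_SLOTS:
        # Far from a step boundary: the color is known from time alone,
        # so an inverted bit is read as a pulse.
        return step_value, bit != step_value
    # Near a step boundary: the bit itself carries the color.
    return bit, False


print(parse_marking_bit(0b001, 1))  # (0, True)
print(parse_marking_bit(0b100, 0))  # (0, False)
print(parse_marking_bit(0b110, 0))  # (1, True)
```

Each result depends only on the current slot and the bit, so in this sketch no per-flow history or lookahead is needed, which is what makes the approach fit a match-action table.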
AM-PM Evaluation using Marvell Prestera Switches
Figure: testbed with a traffic generator, Switch 1, Switch 2, and management; a monitored data flow and background traffic traverse the switches. Annotations: loss and delay; congestion is detected.
Software Implementation using P4
• Implemented in P4.
• Time-of-day match field.
• AM-PM in P4.
• Tested in Mininet.
• Open source code.
Figure: Mininet topology with switches S1, S2, S3, hosts H1, H2, and a server.
AM-PM: Where is it going?
• Network telemetry, low overhead: AM-PM ...
• Ongoing AM-PM work in the IETF: QUIC, MPLS, NSH, BIER, Geneve
• AM-PM is under discussion in 6 working groups in the IETF …
Large Scale Deployment in Telecom Italia
• Mobile backhaul network, ~1000 eNodeBs.
• AM-PM one bit (step-based) loss measurement.
• Uses unused bit in DSCP.
• Off-the-shelf network equipment.
Summary
Design and implementation of AM-PM:
• Hardware-based implementation using a Marvell switch.
• Software-based implementation in P4 – open source.
• Experimental results.
• Novel time-multiplexed parsing.

    table look_for_flag {
        reads {
            intrinsic_metadata.time_of_day : ternary;
            ipv4.flag_a : exact;
        }
        actions {
            _look_for_flag;
            _drop;
        }
        size: 256;
    }
Thanks!
References
[1] Fioccola, G., Capello, A., Cociglio, M., Castaldelli, L., Chen, M., Zheng, L., Mirsky, G., and T. Mizrahi, “Alternate Marking method for passive and hybrid performance monitoring”, RFC 8321, 2018.
[2] Mizrahi, T., Arad, C., Fioccola, G., Cociglio, M., Chen, M., Zheng, L., and G. Mirsky, “Compact Alternate Marking Methods for Passive and Hybrid Performance Monitoring”, draft-mizrahi-ippm-compact-alternate-marking, work in progress, IETF, 2019.
[3] Brockners, F., Bhandari, S., Pignataro, C., Gredler, H., Leddy, J., Youell, S., Mizrahi, T., Mozes, D., Lapukhov, P., Chang, R., Bernier, D., and J. Lemon, “Data Fields for In-situ OAM”, draft-ietf-ippm-ioam-data, work in progress, 2019.
[4] C. Kim et al., “In-band network telemetry (INT)”, P4 consortium, 2015.
[5] Mizrahi, T., Vovnoboy, V., Nisim, M., Navon, G., and A. Soffer, “Network Telemetry Solutions for Data Center and Enterprise Networks”, Marvell white paper, 2018.
[6] Mizrahi, T., Rottenstreich, O., and Y. Moses, “TimeFlip: Scheduling Network Updates with Timestamp-based TCAM Ranges”, IEEE INFOCOM, 2015.
[7] Mizrahi, T., Navon, G., Fioccola, G., Cociglio, M., Chen, M., and G. Mirsky, “AM-PM: Efficient Network Telemetry using Alternate Marking”, IEEE Network, 2019.
[8] Riesenberg, A., Kirzon, Y., Bunin, M., Galili, E., Navon, G., and T. Mizrahi, “Time-Multiplexed Parsing in Marking-based Network Telemetry”, ACM SYSTOR, 2019.
[9] P4 AM-PM, https://github.com/AlternateMarkingP4/FlaseClase, 2018.