  1. SONATA: Query-Driven Network Telemetry
     Arpit Gupta (Princeton University), Rob Harrison, Ankita Pawar, Rüdiger Birkner, Marco Canini, Nick Feamster, Jennifer Rexford, Walter Willinger

  2. Conventional Network Telemetry
     Figure: the pipeline Collection (NetFlow, pcap, sFlow, SNMP, etc.) → Store → Compute → Analysis, which answers Queries.

  3. Conventional Network Telemetry
     Same pipeline figure, annotated: Collection is not driven by Analysis.

  4. Problems with Status Quo
     • Expressibility
       – Collection and analysis stages are configured separately
       – Data collection is static (and often coarse)
       – Analysis setups are brittle, i.e. specific to the collection tools

  5. Problems with Status Quo
     • Expressibility
       – Collection and analysis stages are configured separately
       – Data collection is static (and often coarse)
       – Analysis setups are brittle, i.e. specific to the collection tools
     • Scalability: as traffic volume or the number of monitoring queries increases
       – Hard to answer queries in real time
       – Hard to transport data from the monitoring sensors
     Takeaway: it is hard to express and scale queries for network telemetry tasks!

  6. SONATA: Query-Driven Telemetry
     • Uniform programming abstraction: express queries as dataflow operations over packet tuples
     • Query-driven data reduction: execute a subset of the dataflow operations in the data plane
     • Coordinated data collection & analysis: select query plans that make the best use of available resources

  7. Uniform Programming Abstraction
     • Extensible packet-tuple abstraction: queries operate over all packet tuples, at every location in the network
     • Expressive dataflow operators
       – Most telemetry applications require
         • collecting aggregate statistics over a subset of the traffic
         • joining the results of one analysis with another
       – Both are easy to express as declarative queries composed of dataflow operators

  8. Example Query: Detecting a DNS Reflection Attack
     Detect hosts for which the number of unique source IPs sending DNS response messages exceeds a threshold (Th):

       victimIPs = pktStream(W)
         .filter(p => p.srcPort == 53)
         .map(p => (p.dstIP, p.srcIP))
         .distinct()
         .map((dstIP, srcIP) => (dstIP, 1))
         .reduceByKey(sum)
         .filter((dstIP, count) => count > Th)
         .map((dstIP, count) => dstIP)

     Express queries without worrying about where and how they get executed.
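     For intuition only, here is a minimal Python sketch of what evaluating the query above over one window of packet tuples means; the dictionary field names, the sample window, and the threshold value are assumptions for illustration, not SONATA's implementation.

       from collections import defaultdict

       # Illustrative evaluation of the reflection-attack query over one window.
       def victim_ips(packets, Th):
           # filter(srcPort == 53) -> map((dstIP, srcIP)) -> distinct()
           pairs = {(p["dstIP"], p["srcIP"]) for p in packets if p["srcPort"] == 53}
           # map((dstIP, srcIP) => (dstIP, 1)) -> reduceByKey(sum)
           counts = defaultdict(int)
           for dst_ip, _src_ip in pairs:
               counts[dst_ip] += 1
           # filter(count > Th) -> map(dstIP)
           return [dst_ip for dst_ip, count in counts.items() if count > Th]

       window = [
           {"srcIP": "10.0.0.1", "dstIP": "192.0.2.9", "srcPort": 53},
           {"srcIP": "10.0.0.2", "dstIP": "192.0.2.9", "srcPort": 53},
           {"srcIP": "10.0.0.3", "dstIP": "192.0.2.9", "srcPort": 80},
       ]
       print(victim_ips(window, Th=1))  # ['192.0.2.9']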

  9. Changing Status Quo
     • Expressibility
       – Express dataflow queries over packet tuples
       – Not tied to low-level (third-party or platform-specific) APIs
       – Trivial to add new queries and to change collection tools

  10. Query Execution
      Process all (or a subset of) captured packet tuples using a state-of-the-art stream processor.
      Figure: the runtime passes the queries to the stream processor; packet capture feeds packet tuples into it.
      Expressible, but not scalable!

  11. PISA Targets for Data Reduction
      • Programmable parsing: allows new, query-specific header fields to be parsed
      • State in packets & registers: supports simple stateful computations
      • Customizable hash functions: supports hash functions over a flexible set of fields
      • Flexible match/action table pipelines: supports match/action tables with programmable actions

  12. Compiling Dataflow Operators
      • Map, Filter & Sample: apply a sequence of match-action tables
      • Distinct & Reduce
        – Compute an index and read the value from a hash table
        – Apply the function (e.g., bit_or for distinct) and then update the hash table
        – Use sketches, e.g. reduce(sum) → Count-Min sketches
      • Limitations: complex transformations, e.g. log, regex, etc.
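      As a rough illustration of the Distinct & Reduce compilation described above, the Python sketch below mimics two register arrays on a PISA target: a hashed bit array for distinct() and a Count-Min sketch for reduce(sum). The array dimensions and hash construction are assumptions, not SONATA's generated data-plane code.

        import hashlib

        WIDTH, ROWS = 4096, 3  # assumed register-array dimensions

        def _hash(row, key):
            # Stand-in for a customizable data-plane hash over chosen fields.
            digest = hashlib.sha256(f"{row}:{key}".encode()).hexdigest()
            return int(digest, 16) % WIDTH

        distinct_bits = [0] * WIDTH                      # register array for distinct()
        cm_sketch = [[0] * WIDTH for _ in range(ROWS)]   # Count-Min sketch for reduce(sum)

        def seen_before(key):
            """distinct(): bit_or into a hash-indexed register; collisions may over-report."""
            idx = _hash(0, key)
            was_set = distinct_bits[idx]
            distinct_bits[idx] |= 1
            return bool(was_set)

        def cm_update_and_read(key, value=1):
            """reduce(sum): update every row, read the minimum as the (over)estimate."""
            estimates = []
            for row in range(ROWS):
                idx = _hash(row, key)
                cm_sketch[row][idx] += value
                estimates.append(cm_sketch[row][idx])
            return min(estimates)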

  13. Compiling Dataflow Queries
      • Compiling a single query
        – Generate & update query-specific metadata fields
        – Apply the operators' match-action tables in sequence
        – Clone the packet if the report bit is set
      • Compiling multiple queries
        – Generate & update metadata fields for all queries
        – Apply the operators for all queries in sequence
        – Clone the packet if the report bit is set for at least one query
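      To make the cloning rule concrete, here is a toy Python sketch of two query pipelines sharing one packet; the pipelines, field names, and metadata layout are invented for the example and are not SONATA's generated code.

        # Each query's pipeline sets its own report bit in per-packet metadata;
        # the packet is cloned to the collector if at least one bit is set.
        def q1_dns_responses(pkt, meta):
            meta["report_q1"] = int(pkt["srcPort"] == 53)

        def q2_tiny_packets(pkt, meta):
            meta["report_q2"] = int(pkt["length"] < 64)

        def process(pkt):
            meta = {}
            for pipeline in (q1_dns_responses, q2_tiny_packets):
                pipeline(pkt, meta)   # apply every query's operators in sequence
            if any(v for k, v in meta.items() if k.startswith("report_")):
                return ("clone_to_collector", meta)
            return ("forward_only", meta)

        print(process({"srcPort": 53, "length": 512}))
        # ('clone_to_collector', {'report_q1': 1, 'report_q2': 0})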

  14. Coordinated Data Collection & Analysis
      • Query Partitioning
        – Execute a subset of the dataflow operators in the data plane
        – Reduces packet tuples at the cost of additional state in the data plane
      • Iterative Refinement
        – Iteratively zoom in on the traffic of interest
        – Reduces state at the cost of additional detection delay
      How do we select the best query plan?
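      For intuition about iterative refinement, the following two-window Python sketch first aggregates unique DNS sources per /8 destination prefix and only zooms in to /32 for prefixes that exceed the threshold. The prefix levels match the slides; the threshold, helper names, and sample traffic are assumptions.

        from collections import defaultdict

        def prefix8(ip):
            # Coarse refinement level: collapse a dstIP to its /8 prefix.
            return ip.split(".")[0] + ".0.0.0/8"

        def unique_sources(packets, key_fn):
            # Count unique srcIPs sending DNS responses, keyed by key_fn(dstIP).
            seen = defaultdict(set)
            for p in packets:
                if p["srcPort"] == 53:
                    seen[key_fn(p["dstIP"])].add(p["srcIP"])
            return {k: len(v) for k, v in seen.items()}

        def refine(window1, window2, Th):
            coarse = unique_sources(window1, prefix8)            # level /8, little state
            heavy = {k for k, n in coarse.items() if n > Th}     # zoom in only here
            fine = unique_sources(
                [p for p in window2 if prefix8(p["dstIP"]) in heavy],
                lambda ip: ip,                                   # level /32
            )
            return [ip for ip, n in fine.items() if n > Th]

        w = [{"srcIP": f"10.0.0.{i}", "dstIP": "203.0.113.7", "srcPort": 53} for i in range(5)]
        print(refine(w, w, Th=3))  # ['203.0.113.7'], detected one window later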

  15. Query Planning
      • Reflection Attack Query

          pktStream(W)
            .filter(p => p.srcPort == 53)
            .map(p => (p.dstIP, p.srcIP))
            .distinct()
            .map((dstIP, srcIP) => (dstIP, 1))
            .reduceByKey(sum)
            .filter((dstIP, count) => count > Th)
            .map((dstIP, count) => dstIP)

      • Partitioning Plans
        – Plan 1: data plane only
        – Plan 2: stream processor only
      • Refinement Plans
        – Refinement key: dstIP
        – Refinement levels: {/8, /32}

  16. Query Planning
      • Reflection Attack Query
      • Partitioning Plans
        – Plan 1: data plane only
        – Plan 2: stream processor only
      • Refinement Plans
        – Refinement key: dstIP
        – Refinement levels: {/8, /32}
      Query plan graph (figure): nodes Src, dIP/8,1, dIP/8,2, dIP/32,1, dIP/32,2, and Tgt; edges carry weights such as w 0,8,1 and w 8,32,2 for each refinement/partitioning transition, and edges into Tgt have weight 0.

  17. Query Planning
      Query plan graph (figure), with the chosen path highlighted: Src → dIP/8,1 → dIP/32,2 → Tgt.
      SONATA selects the plan with the smallest weighted cost.
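      Purely for illustration, here is a small Python sketch that picks the cheapest path through such a plan graph with Dijkstra's algorithm. The node names follow the slides, but the edge weights are invented numbers; the real weights reflect data-plane state and stream-processor load.

        import heapq

        # Toy plan graph with hypothetical edge costs.
        graph = {
            "Src":      {"dIP/8,1": 2, "dIP/8,2": 5, "dIP/32,1": 9, "dIP/32,2": 12},
            "dIP/8,1":  {"dIP/32,1": 6, "dIP/32,2": 3},
            "dIP/8,2":  {"dIP/32,1": 4, "dIP/32,2": 7},
            "dIP/32,1": {"Tgt": 0},
            "dIP/32,2": {"Tgt": 0},
            "Tgt":      {},
        }

        def cheapest_plan(graph, src="Src", tgt="Tgt"):
            # Standard Dijkstra: returns (total cost, path) of the min-cost plan.
            queue, settled = [(0, src, [src])], {}
            while queue:
                cost, node, path = heapq.heappop(queue)
                if node in settled:
                    continue
                settled[node] = (cost, path)
                for nxt, w in graph[node].items():
                    heapq.heappush(queue, (cost + w, nxt, path + [nxt]))
            return settled[tgt]

        print(cheapest_plan(graph))  # (5, ['Src', 'dIP/8,1', 'dIP/32,2', 'Tgt'])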

  18. Implementation
      Figure: queries Q1 … QN enter through SONATA's API; the runtime performs iterative refinement and query partitioning, then drives a streaming driver (stream processor) and a data plane driver (data plane target). Packet tuples tagged with (qid, …) flow from the data plane, where packets enter and leave, up to the stream processor.
      Collection is now driven by Analysis!
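      As a rough sketch of what the runtime's query partitioning amounts to, the snippet below splits one query's operator list between the two drivers; the operator strings mirror the reflection-attack query, while the split point and driver configuration format are assumptions.

        # Operators before the split point run in the data plane; the rest run
        # on the stream processor. k and the config dicts are illustrative.
        reflection_query = [
            "filter(srcPort == 53)", "map((dstIP, srcIP))", "distinct()",
            "map((dstIP, 1))", "reduceByKey(sum)", "filter(count > Th)", "map(dstIP)",
        ]

        def partition(query_ops, k):
            return (
                {"driver": "data_plane", "ops": query_ops[:k]},
                {"driver": "stream_processor", "ops": query_ops[k:]},
            )

        dp_part, sp_part = partition(reflection_query, k=3)
        print(dp_part["ops"])  # stateful reduction pushed into the data plane
        print(sp_part["ops"])  # remaining operators stay on the stream processor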

  19. Evaluation
      • Workload: a large-IXP network; a 2-hour IPFIX trace, 3 Tbps peak traffic, packet sampling rate of 1/10K
      • Queries: DDoS-UDP, SSpreader, PortScan, Reflection Attack
      • Comparisons: Stream-Only, Part-OF, Part-PISA, Fixed-Refinement

  20. Benefits of Query Planning
      Figure: for Part-OF, Part-PISA, Fixed-Refinement, and SONATA, the query plan selected as a function of Bmax and Nmax, where each color represents a unique query plan.
      • Bmax: maximum state the data plane can support (KB)
      • Nmax: maximum packet-tuple rate the stream processor can process (Kpps)
      SONATA selects eight different query plans for different system configurations, making the best use of the available resources.

  21. Scaling Query Executions
      Figure: number of packet tuples (x1000) processed by the stream processor under Part-OF vs. SONATA, for DDoS-UDP, SSpreader, PortScan, and all queries combined.
      Executing stateful operations in the data plane reduces the workload on the stream processor.

  22. Scaling Query Executions
      Figure: state (KB) required by the data plane targets under Part-PISA, Fixed-Refinement, and SONATA, for DDoS-UDP, SSpreader, PortScan, and all queries combined.
      Iterative refinement reduces the state required by the data plane targets.

  23. Changing Status Quo
      • Expressibility
        – Express dataflow queries over packet tuples
        – No need to worry about how and where a query is executed
        – Adding new queries and collection tools is trivial
      • Scalability
        – Answers hundreds of queries in real time for traffic volumes as high as a few Tb/s
        – Strikes a balance between the available resources, i.e.
          • tuples processed by the stream processor
          • state in the data plane
      Expressible & Scalable!

  24. Summary
      • SONATA makes it easier to express and scale network monitoring queries using
        – a programmable data plane
        – a scalable stream processor
      • Running code
        – GitHub: github.com/Sonata-Princeton/SONATA-DEV
        – Run the test queries or express new ones
      • SONATA on arXiv: arxiv.org/abs/1705.01049
