  1. The NIDS Cluster: Scalable, Stateful Network Intrusion Detection on Commodity Hardware
  Matthias Vallentin (1), Robin Sommer (2,3), Jason Lee (2), Craig Leres (2), Vern Paxson (3,2), and Brian Tierney (2)
  (1) TU München, (2) Lawrence Berkeley National Laboratory, (3) International Computer Science Institute

  2. Motivation
  • NIDSs have reached their limits on commodity hardware
  • We keep needing to do more analysis on more data at higher speeds
  • However, single-CPU performance is no longer growing the way it used to
  • A single NIDS instance (e.g., Snort, Bro) cannot cope with Gbps links
  • To overcome this, we must either
    • Restrict the amount of analysis, or
    • Turn to expensive, custom hardware, or
    • Employ some form of load-balancing to split the analysis across multiple machines
  • We do load-balancing with the "NIDS Cluster"
    • Use many boxes instead of one
    • Every box works on a slice of the traffic
    • Correlate analysis to create the impression of a single system
  Recent Advances in Intrusion Detection 2007

  3. Correlation is Tricky ...
  • Most NIDSs provide support for multi-system setups
  • However, instances tend to work independently
    • A central manager collects alerts of independent NIDS instances
    • Aggregates results instead of correlating the analysis
  • The NIDS cluster works transparently like a single NIDS
    • Gives the same results a single NIDS would if it could analyze all the traffic
    • Does not sacrifice detection accuracy
    • Is scalable to a large number of nodes
    • Still provides a single system as the user interface (logging, configuration updates)

  4.-8. Architecture (incremental build-up figure): taps on the Gbps Internet uplink and the local Gbps links feed the NIDS Cluster. Frontends receive the tapped traffic and distribute it across a set of backends; proxies relay state between the backends, and a manager collects and correlates their results.

  9. Environments
  • Initial target environment: Lawrence Berkeley National Laboratory
    • LBNL monitors its 10 Gbps upstream link with the Bro NIDS
    • The setup evolved into many boxes running Bro independently for sub-tasks
    • A cluster prototype is now running at LBNL with 1 frontend & 10 backends
  • Further prototypes
    • University of California, Berkeley: 2 × 1 Gbps uplinks, 2 frontends / 6 backends for 50% of the traffic
    • Ohio State University: 450 Mbps uplink, 1 frontend / 2 backends (10 planned)
    • IEEE Supercomputing Conference 2007: conference's 1 Gbps backbone / 10 Gbps "High Speed Bandwidth Challenge" network
  • Goal: Replace operational security monitoring

  10. Challenges
  Main challenges when building the NIDS Cluster:
  1. Distributing the traffic evenly while minimizing the need for communication
  2. Adapting the NIDS operation on the backends to correlate analysis with peers
  3. Validating that the cluster produces sound results

  11. Distributing Load

  12. Distribution Schemes
  • Frontends need to pick a backend as the destination
  • Option 1: Route packets individually
    • Simple example: round-robin
    • Too expensive due to communication overhead (NIDSs keep per-flow state)
  • Option 2: Flow-based schemes
    • Send all packets belonging to the same flow to the same backend
    • Needs communication only for inter-flow analysis
  • Simple approach: hashing flow identifiers
    • E.g., md5(src-addr + src-port + dst-addr + dst-port) mod n
    • Hashing is stateless, which reduces complexity and increases robustness
  • But how well does hashing distribute the load?
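The flow-hash scheme on this slide can be sketched in a few lines of Python. This is a toy illustration only, not the cluster's actual frontend code; the function name and example addresses are made up:

```python
import hashlib

def backend_for_flow(src_addr: str, src_port: int,
                     dst_addr: str, dst_port: int, n: int) -> int:
    """Map a flow to one of n backends by hashing its flow identifier."""
    key = f"{src_addr}:{src_port}-{dst_addr}:{dst_port}".encode()
    digest = hashlib.md5(key).digest()
    # Interpret the 128-bit digest as an integer and reduce modulo n.
    return int.from_bytes(digest, "big") % n

# All packets of one flow land on the same backend:
a = backend_for_flow("10.0.0.1", 1234, "192.0.2.7", 80, 10)
b = backend_for_flow("10.0.0.1", 1234, "192.0.2.7", 80, 10)
assert a == b and 0 <= a < 10
```

Note one subtlety this sketch glosses over: concatenating the tuple as written is not symmetric in direction, so a real frontend must treat both directions of a connection identically, e.g. by putting the endpoints in a canonical order before hashing.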

  13. Simulation of Hashing Schemes
  [Figure: mean differences and evenness of distribution (%) over time for the md5 flow-hash scheme]
  1 day of UC Berkeley campus TCP traffic (231M connections), n = 10

  14. Simulation of Hashing Schemes
  [Figure: as on the previous slide, adding the address-based md5-addr scheme for comparison]
  1 day of UC Berkeley campus TCP traffic (231M connections), n = 10

  15. Cluster Frontends
  • We chose the address-based hash
    • Ports are not always available (e.g., ICMP, fragments) & more complex to extract
    • Even with a perfect distribution, load is hard to predict
  • Frontends rewrite MAC addresses according to the hash
  • Two alternative frontend implementations
    • In software with Click (SHA1)
    • In hardware with a prototype of Force-10's P10 appliance (XOR)
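An address-based hash as chosen here can be sketched as follows. This is a hedged illustration with made-up names: the actual frontends use SHA1 (Click) or an XOR of the addresses (Force-10), while this sketch reuses md5 for consistency with the earlier slide:

```python
import hashlib

def backend_for_hostpair(addr1: str, addr2: str, n: int) -> int:
    """Address-pair hash: ignores ports, so ICMP and fragments are covered.
    Sorting the endpoints makes the mapping direction-independent."""
    lo, hi = sorted([addr1, addr2])
    digest = hashlib.md5(f"{lo}|{hi}".encode()).digest()
    return int.from_bytes(digest, "big") % n

# Both directions of a connection map to the same backend:
assert backend_for_hostpair("10.0.0.1", "192.0.2.7", 10) == \
       backend_for_hostpair("192.0.2.7", "10.0.0.1", 10)
```

Sorting the endpoint pair is one simple way to get the direction-independence the frontend needs; a hardware XOR of the two addresses achieves the same property, since XOR is commutative.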

  16. Adapting the NIDS

  17. Cluster Backends
  • On the backends, we run the Bro NIDS
    • Bro is the NIDS used in our primary target environment, LBNL
    • Bro already provides extensive, low-level communication facilities
  • Bro consists of two layers
    • Core: low-level, high-performance protocol analysis
    • Event engine: executes scripts which implement the detection analysis
  • Observation: the core keeps only per-flow state
    • No need for correlation across backends
    • The event engine does all inter-flow analysis
    • The scripts need to be adapted to the cluster setting

  18. Adapting the Scripts ...
  • The script language provides primitives to share state
    • Almost all state is kept in tables, which can easily be synchronized across peers
    • The main task was identifying state related to inter-flow analysis
    • A bit cumbersome with 20K+ lines of script code ...
  • Actually it was a bit more tricky ...
    • Some programming idioms do not work well in the cluster setting and needed to be fixed
    • Some trade-offs between gain & overhead exist and are hard to assess
    • Bro's "loose synchronization" introduces inconsistencies (which can be mitigated)
  • Many changes to the scripts and few to the core
    • Will be part of the next Bro release
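The table synchronization described above can be illustrated with a toy Python model. This is purely illustrative and not Bro's actual mechanism, which propagates table operations over its communication layer; the class and variable names are invented:

```python
from collections import defaultdict

class SyncedTable:
    """Toy model of a synchronized table: each node applies updates
    locally first and then propagates the delta to its peers.
    (Real loose synchronization delivers deltas asynchronously, so
    peers may briefly disagree - the inconsistency the slide mentions.)"""
    def __init__(self, peers):
        self.data = defaultdict(int)
        self.peers = peers          # other SyncedTable instances

    def increment(self, key):
        self.data[key] += 1         # apply locally first
        for p in self.peers:        # then propagate to peers
            p.data[key] += 1

# Two backends counting probe attempts from the same source address:
a = SyncedTable(peers=[])
b = SyncedTable(peers=[])
a.peers, b.peers = [b], [a]
a.increment("10.0.0.1")   # probe seen on backend a
b.increment("10.0.0.1")   # probe seen on backend b
assert a.data["10.0.0.1"] == b.data["10.0.0.1"] == 2
```

The point of the sketch: because each backend sees only its slice of traffic, inter-flow state such as per-source counters must be merged across nodes for detections like scan thresholds to fire correctly.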

  19. Validating the Cluster

  20. Accuracy
  • Goal: the cluster produces the same results as a single system
  • Compared the results of the cluster vs. a stand-alone setup
    • Captured a 2-hour trace at LBNL's uplink (~97 GB, 134M pkts, 1.5M host pairs)
    • Split the trace into slices and copied them to the cluster nodes
    • Set up the cluster to examine the slices just as if it were processing live traffic
    • Compared the output of the manager with the output of a single Bro instance on the trace
  • Found an excellent match for the alarms & logs
    • The cluster reported all 2661 alarms of the single instance as well
    • Slight differences in timing & context due to latency and synchronization semantics
    • Some artifacts of the off-line measurement setup
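The trace-splitting step of this methodology can be sketched as follows. A hedged toy version: host pairs stand in for packets, and the slicing function mimics the address-based hash rather than reproducing the real frontend:

```python
import hashlib

def slice_id(src: str, dst: str, n: int) -> int:
    """Assign a packet to a trace slice via a direction-independent
    address hash, mimicking the frontend's distribution."""
    lo, hi = sorted([src, dst])
    return int.from_bytes(hashlib.md5(f"{lo}|{hi}".encode()).digest(),
                          "big") % n

# Toy "trace": (src, dst) host pairs standing in for packets.
trace = [("10.0.0.1", "192.0.2.7"), ("192.0.2.7", "10.0.0.1"),
         ("10.0.0.2", "192.0.2.8")]
slices = {i: [] for i in range(10)}
for src, dst in trace:
    slices[slice_id(src, dst, 10)].append((src, dst))

# Every packet lands in exactly one slice, and both directions of a
# connection end up in the same slice:
assert sum(len(s) for s in slices.values()) == len(trace)
assert slice_id("10.0.0.1", "192.0.2.7", 10) == slice_id("192.0.2.7", "10.0.0.1", 10)
```

Splitting the trace with the same function the frontend would apply lets the backends replay their slices offline exactly as they would see them live, which is what makes the cluster-vs-single-instance comparison meaningful.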

  21. CPU Load per Node
  [Figure: probability density of CPU utilization (0.0-0.5) for each of the 10 backend nodes, node0-node9]
  10 backends, ext. LBNL config, 2 hr full trace (~97 GB, 134M pkts)

  22. Scaling of CPU
  [Figure: probability density of CPU utilization (0.0-0.5) for cluster sizes of 3, 5, and 10 nodes]
  ext. LBNL config, 2 hr full trace (~97 GB, 134M pkts)

  23. Load on Berkeley Campus
  [Figure: CPU load (%) from Tue 12:00 through Thu 6:00 for Backends 0-5, Proxies 0-1, and the Manager]
  With 1 frontend = 50% of the total traffic

  24. Conclusion & Outlook
  • The cluster monitors Gbps networks on commodity hardware
    • Provides high-performance, stateful network intrusion detection
    • Correlates analysis across its nodes rather than just aggregating results
  • When building the cluster we
    • Examined different load-distribution schemes
    • Adapted an open-source NIDS to the cluster setting
    • Evaluated correctness & performance in a real-world setting
  • The challenge was to build something that works
    • Less so to lead into fundamentally new research directions
  • Now in the process of making it production quality
  • We will soon release the Cluster Shell
    • An interactive shell running on the manager
