Architectures for Bit-Split String Scanning in Intrusion Detection

String matching is a critical element of modern intrusion detection systems because it lets a system make decisions based not just on headers, but actual content flowing through the network. Through careful codesign and optimization of an architecture with a new string matching algorithm, the authors show it is possible to build a system that is almost 12 times more efficient than the currently best known approaches.

Lin Tan
University of Illinois, Urbana-Champaign

Timothy Sherwood
University of California, Santa Barbara

Published by the IEEE Computer Society. 0272-1732/06/$20.00 © 2006 IEEE. January–February 2006.

Whether tethered to an Ethernet cable or connected through wireless technology, computer systems now operate in an environment of near ubiquitous connectivity. The availability of always-on communication has created countless opportunities for Web-based businesses, information sharing, and coordination, but it has also created new opportunities for those who seek to illegally disrupt, subvert, or attack these activities. Every day, additional critical data becomes accessible over the network, and any publicly accessible system on the Internet is subject to more than one break-in attempt per day. Because we are all increasingly at risk, interest in combating these attacks at every level is widespread, from end hosts and network taps to edge and core routers. Intrusion detection and prevention has proven highly effective at finding and blocking known attacks in the network before the end host even encounters them, but making such protection scalable entails significant computational challenges. Intrusion detection systems must scan every byte of every packet to find the signatures of known attacks, and this requires very-high-throughput methods for string matching.

To address these concerns, we take an approach that relies on a simple yet powerful special-purpose architecture working in conjunction with novel string-matching algorithms specially optimized for this architecture. The key to achieving both high performance and high efficiency is to build many tiny state machines, each of which searches for a portion of the rules and a portion of each rule's bits. Our new algorithms are specifically tailored toward implementation in an architecture built up as an array of small memory tiles, and we developed the software and the architecture together. This article summarizes the key findings from a longer article. [1] Our efforts result in a device that maintains tight worst-case bounds on performance, is updatable with new rules without interrupting operation, has configurations generated in seconds instead of hours, and is 10 times more efficient than existing best-known solutions. In particular we describe

• a novel configurable string-matching architecture that can store the entire Snort rule set (about 1,000 strings that are 12 bytes apiece, on average) in only 0.4 Mbytes and can operate at upward of 10 Gbps per instance;
• string-matching algorithms that operate through the conjunction of many small state machines working in unison, reducing the number of required out edges from 256 to as few as two; and
• a rule compiler that takes only seconds to partition and bit split a finite-state machine representation of the strings into a set of small implementable state transition tables used to program our architecture.

Detecting intrusions

Given the importance of protecting information and services, the security community has put much effort into detecting and thwarting attacks in the network. [2,3] Intrusion detection systems and intrusion prevention systems have emerged as two of the most promising ways to protect the network, and predictions show the market for such systems growing to $918.9 million by the end of 2007. [4]

Network-based intrusion detection systems either attempt to find examples of misuse or anomalies. Both approaches require sensors that perform real-time monitoring of network packets, either by comparing network traffic against a signature database or by finding out-of-the-ordinary behavior and triggering intrusion alarms. A higher-level interface provides management software to configure, log, and display alarms generated by lower-level processing. These two parts, working in concert, alert administrators to suspicious activities, keep logs to aid in forensics, and assist in the detection of new worms and denial-of-service attacks. The lowest level, where data is actually inspected, is where the computational challenge lies.

To define suspicious activities, most modern network intrusion detection and prevention systems rely on a set of rules applied to matching packets. At minimum, a rule consists of a type of packet to search, a string of content to match, a location at which to search for that string, and an associated action to take if the search meets all of the rule's conditions. An example rule might match packets that look like a known buffer overflow exploit in a Web server. The corresponding action might be to log the packet information and alert the administrator. Rules take many forms, but frequently their heart consists of strings to be matched anywhere in a packet's payload. The problem is that for accurate detection, we must be able to search every byte of every packet for a potential match from a large set of strings. For example, the Snort rule set has on the order of 1,000 strings with an average length of about 12 bytes (http://www.windowsitpro.com/WindowsSecurity/Article/ArticleID/39360/39360.html). In addition to raw processing speed, a string-matching engine must have bounded performance in the worst case to withstand a performance-based attack. [5] Because rule sets are constantly growing and changing as new threats emerge, a successful design must be quickly and automatically updatable, all while the system maintains continuous operation.

String matching with state machines

Familiar and efficient algorithms for string matching, such as Boyer-Moore, [6] are designed to find a single string in a long input. Our problem is slightly different: We're searching for one of a set of strings from the input stream. Although simply performing multiple passes of a standard one-string matching algorithm would be functionally correct, it doesn't scale to handle the thousands of strings that modern intrusion detection systems look for. Instead, it is possible to fold the set of strings we're looking for together into a single large state machine. This method, the Aho-Corasick algorithm, [7] functions in the fgrep utility as well as in some of the latest versions of the Snort network intrusion detection system. [2] One of Aho-Corasick's biggest advantages is that it performs well even in the worst case, making it impossible for an adversary to construct a stream of packets that is difficult or impossible to scan. At a high level, our algorithm works by separating the set of strings into groups and building a small state machine for each group. Each state machine's
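As a rough illustration of the ideas in this section, the Python sketch below folds a set of strings into an Aho-Corasick state machine with 256 out edges per state, scans input with exactly one table lookup per byte (the property behind the worst-case bound), and then splits the strings into groups with one small machine per group. It is not the authors' hardware design or rule compiler. The function names (`build_dfa`, `scan`, `scan_in_groups`) and the round-robin grouping are assumptions made for illustration, patterns are assumed to be non-empty, and bit splitting itself is not shown.

```python
from collections import deque


def build_dfa(patterns):
    """Compile byte-string patterns into an Aho-Corasick state machine.

    Returns (delta, out): delta[state][byte] is the next state (256 out
    edges per state) and out[state] is the set of patterns ending there.
    """
    goto, fail, out = [{}], [0], [set()]
    for pat in patterns:                       # 1. build the keyword trie
        state = 0
        for byte in pat:
            if byte not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append(set())
                goto[state][byte] = len(goto) - 1
            state = goto[state][byte]
        out[state].add(pat)

    delta = [[0] * 256 for _ in goto]
    for byte, nxt in goto[0].items():
        delta[0][byte] = nxt

    queue = deque(goto[0].values())
    while queue:                               # 2. breadth-first completion
        state = queue.popleft()
        out[state] |= out[fail[state]]         # inherit matches of the suffix
        for byte in range(256):
            if byte in goto[state]:
                nxt = goto[state][byte]
                fail[nxt] = delta[fail[state]][byte]
                delta[state][byte] = nxt
                queue.append(nxt)
            else:                              # fall back to the suffix state
                delta[state][byte] = delta[fail[state]][byte]
    return delta, out


def scan(dfa, data):
    """Yield (end_offset, pattern) for every match; one lookup per byte."""
    delta, out = dfa
    state = 0
    for pos, byte in enumerate(data):
        state = delta[state][byte]
        for pat in out[state]:
            yield pos + 1, pat


def scan_in_groups(patterns, n_groups, data):
    """Split patterns round-robin into groups (an assumed policy), build one
    small machine per group, and merge the matches from all machines."""
    groups = [patterns[i::n_groups] for i in range(n_groups)]
    hits = set()
    for group in groups:
        if group:
            hits.update(scan(build_dfa(group), data))
    return sorted(hits)


rules = [b"he", b"she", b"his", b"hers"]
print(scan_in_groups(rules, 2, b"ushers"))
# prints [(4, b'he'), (4, b'she'), (6, b'hers')]
```

In software the group machines run one after another. The appeal of many small machines in a hardware setting is that they can all consume the same input byte in the same cycle, so grouping need not cost throughput.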