
A Fast Regular Expression Matching Engine for NIDS Applying Prediction Scheme

Lei Jiang
Institute of Computing Technology, Chinese Academy of Sciences, Beijing, P.R. China
University of Chinese Academy of Sciences, Beijing, P.R. China
Email: jianglei@ict.ac.cn

Qiong Dai, Qiu Tang, Jianlong Tan
Institute of Information Engineering, Chinese Academy of Sciences, Beijing, P.R. China
Email: daiqiong@iie.ac.cn, tangqiu@iie.ac.cn, tanjianlong@iie.ac.cn

Binxing Fang
Institute of Computing Technology, Chinese Academy of Sciences, Beijing, P.R. China
Email: bxfang@ict.ac.cn

Abstract: Regular expression matching is considered important as it lies at the heart of many networking applications that use deep packet inspection (DPI) techniques. For example, modern network intrusion detection systems (NIDSs) typically perform regular expression matching with deterministic finite automata (DFAs). However, DFAs suffer from high memory consumption because of the state blowup problem. Many algorithms have been proposed to compress DFA storage, but they usually pay for it with lower matching speed and higher memory bandwidth. In this paper, we first propose an effective DFA compression algorithm that exploits the similarity between DFA states. Then, we apply a next-state prediction strategy and present a fast DFA matching engine. By carefully designing the DFA matching circuit, we keep the prediction success rate above 99.5% and thus achieve a matching speed comparable to that of the original DFA algorithm. In terms of memory consumption, experimental results show that, with typical NIDS rule sets, our algorithm compresses the original DFA by more than 99%. Mapped onto a Xilinx Virtex-7 FPGA chip, the design achieves a throughput of more than 200 Gbps.

I. INTRODUCTION

Regular expression matching lies at the heart of deep packet inspection (DPI) [1] applications, especially network intrusion detection systems (NIDSs). Modern NIDSs, such as Snort [2] and L7-filter [3], use regular expression rules to detect network attacks. Compared with simple string rules, regular expression rules have higher expressive power and can describe a wider variety of payload signatures [4]. State-of-the-art NIDSs use DFA algorithms for regular expression matching because they match at line rate. But as rule sets become larger and more complex, DFAs suffer from the state blowup problem, especially for patterns with constrained and unconstrained repetitions of wildcards and large character sets [5]. According to [6], the L7-filter rule set, which contains 109 regular expression rules, consumes more than 16 GB of memory when compiled into a composite DFA.

Compression is an effective way to reduce the memory consumption of a DFA. Many compression algorithms have been proposed, such as D²FA [7], δFA [8][9] and A-DFA [10]. These algorithms exploit the redundancy of the DFA transition table to generate a new, compressed DFA structure. However, compressing the DFA implies that multiple states may be traversed when processing a single input character, so these algorithms usually pay the price of higher memory bandwidth and lower matching speed.

In this paper, we continue to focus on DFA compression and develop a new DFA compression algorithm called J-DFA. We apply a clustering algorithm to classify all DFA states into groups. In each group, we extract a common state, and the transitions in the group that differ from the common state are stored in a sparse matrix. Then, we encode the common state with run-length encoding. By combining these methods, the compression ratio of J-DFA reaches 99%.

The key issue in mapping a DFA compression algorithm onto an FPGA is how to access the compressed DFA structure. After compression, the DFA transition table becomes irregular because many zero elements are eliminated. Previous works focus on the compression techniques and place little emphasis on how to access the irregular compressed transition table efficiently. Only [11] mentions a bitmap for storing the compressed DFA structure. However, the bitmap method takes at least 3 clock cycles to complete one lookup, which greatly decreases the matching speed. We therefore present a novel architecture to resolve the conflict between memory usage and matching speed.

We design a state prediction method to accelerate regular expression matching based on the J-DFA algorithm. We observe that, in the real matching process of J-DFA, there is a high probability that the "next state" lies in the same "clustering group" as the "current state". So we can predict the "next state" according to the "clustering center" of the "current state". Inspired by the locality principle of program behavior in memory and CPU caches [12][13], we design a next-state prediction unit [14][15] and add it to our regular expression matching engine on a Xilinx Virtex-7 FPGA chip. Experimental results show that the prediction success rate is more than 99.5%, thus achieving a matching speed comparable to that of the original DFA algorithm.

In summary, the main contributions of this paper are:

(i) We develop a new DFA compression algorithm called J-DFA, based on a clustering algorithm and an encoding scheme. Moreover, we measure the compression ratio of J-DFA. Measurement results show that the compression ratio reaches about 99%.
(ii) We develop a state prediction method for J-DFA and measured it using real-life NIDS regular expression
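The compression and prediction ideas outlined in the introduction can be illustrated with a toy Python sketch. It is not the authors' J-DFA implementation, and the excerpt gives no algorithmic details, so every name here (`greedy_groups`, `compress`, `rle_get`, `next_state`, `max_diff`) is hypothetical. The sketch assumes a simple greedy grouping as a stand-in for the unspecified clustering algorithm, a column-wise majority vote as the "common state", a dictionary as the sparse matrix, and one possible reading of prediction: guess the common state's transition and count a hit when no sparse entry overrides it.

```python
from collections import Counter
from itertools import groupby


def greedy_groups(table, max_diff):
    """Stand-in for clustering: put each state in the first group whose
    seed row differs from it in at most max_diff transitions."""
    seeds, group_of = [], []
    for row in table:
        for g, seed in enumerate(seeds):
            if sum(a != b for a, b in zip(row, seed)) <= max_diff:
                group_of.append(g)
                break
        else:  # no existing group is close enough: start a new one
            seeds.append(row)
            group_of.append(len(seeds) - 1)
    return group_of


def compress(table, group_of):
    """Return (run-length-encoded common rows, sparse differences)."""
    common = []
    for g in range(max(group_of) + 1):
        rows = [row for s, row in enumerate(table) if group_of[s] == g]
        # Common state: most frequent next state in each symbol column.
        common.append([Counter(col).most_common(1)[0][0] for col in zip(*rows)])
    # Sparse matrix: only the transitions that differ from the common state.
    sparse = {
        (s, c): nxt
        for s, row in enumerate(table)
        for c, nxt in enumerate(row)
        if nxt != common[group_of[s]][c]
    }
    # Run-length encode each common row as (value, run_length) pairs.
    rle = [[(v, len(list(run))) for v, run in groupby(row)] for row in common]
    return rle, sparse


def rle_get(runs, index):
    """Read position `index` from a run-length-encoded row."""
    for value, length in runs:
        if index < length:
            return value
        index -= length
    raise IndexError("symbol index out of range")


def next_state(rle, sparse, group_of, state, symbol):
    """Return (actual next state, whether the prediction was a hit)."""
    predicted = rle_get(rle[group_of[state]], symbol)
    actual = sparse.get((state, symbol), predicted)
    return actual, actual == predicted


# Toy DFA over a 4-symbol alphabet: table[state][symbol] -> next state.
table = [[1, 0, 0, 0], [1, 2, 0, 0], [1, 0, 3, 0], [1, 0, 0, 0], [4, 4, 4, 4]]
group_of = greedy_groups(table, max_diff=1)
rle, sparse = compress(table, group_of)

state, hits, symbols = 0, 0, [0, 1, 2, 3, 0]
for symbol in symbols:
    state, hit = next_state(rle, sparse, group_of, state, symbol)
    hits += hit
print(f"groups={group_of} sparse entries={len(sparse)} hits={hits}/{len(symbols)}")
```

In this sketch the lookup stays exact regardless of how states are grouped, because every transition that deviates from the common row is kept in `sparse`; the grouping only affects how many entries that dictionary holds and how often the prediction hits.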

