Efficient Signature Matching with Multiple Alphabet Compression Tables
Cited by 6 (3 self)
Signature matching is a performance-critical operation in intrusion prevention systems. Modern systems express signatures as regular expressions and use Deterministic Finite Automata (DFAs) to efficiently match them against the input. In principle, DFAs can be combined so that all signatures can be examined in a single pass over the input. In practice, however, combining DFAs corresponding to intrusion prevention signatures results in memory requirements that far exceed feasible sizes. We observe for such signatures that distinct input symbols often have identical behavior in the DFA. In these cases, an Alphabet Compression Table (ACT) can be used to map such groups of symbols to a single symbol to reduce the memory requirements. In this paper, we explore the use of multiple alphabet compression tables as a lightweight method for reducing the memory requirements of DFAs. We evaluate this method on signature sets used in Cisco IPS and Snort. Compared to uncompressed DFAs, multiple ACTs achieve memory savings between a factor of 4 and a factor of 70 at the cost of an increase in run time that is typically between 35% and 85%. Compared to another recent compression technique, D²FAs, ACTs are between 2 and 3.5 times faster in software, and in some cases use less than one tenth of the memory used by D²FAs. Overall, for all signature sets and compression methods evaluated, multiple ACTs offer the best memory versus run-time trade-offs.
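The core idea of an alphabet compression table can be illustrated with a minimal sketch. It is not the paper's implementation: it builds a single ACT (the paper studies multiple tables), and the names `build_act`, `run_compressed`, and `TOY_TABLE` are hypothetical. Symbols whose columns in the transition table are identical are mapped to one compressed symbol, so the table needs one column per group instead of one per input symbol.

```python
def build_act(table, alphabet_size):
    """Build one alphabet compression table for a dense DFA.

    table[state][symbol] is the next state. Returns (act, compressed), where
    act[symbol] is the compressed symbol and compressed[state][act[symbol]]
    gives the same next state as table[state][symbol].
    """
    column_to_class = {}
    act = []
    for sym in range(alphabet_size):
        # Two symbols behave identically if their columns are equal.
        column = tuple(row[sym] for row in table)
        cls = column_to_class.setdefault(column, len(column_to_class))
        act.append(cls)

    # Pick the first symbol of each class as its representative.
    representatives = [None] * len(column_to_class)
    for sym, cls in enumerate(act):
        if representatives[cls] is None:
            representatives[cls] = sym

    compressed = [[row[rep] for rep in representatives] for row in table]
    return act, compressed


def run_compressed(act, compressed, start, data):
    """Run the DFA with one extra lookup (the ACT) per input symbol."""
    state = start
    for sym in data:
        state = compressed[state][act[sym]]
    return state


# Toy 3-state DFA over a 4-symbol alphabet (an assumption for illustration):
# symbols 0 and 1 behave alike, and so do symbols 2 and 3.
TOY_TABLE = [
    [1, 1, 0, 0],
    [2, 2, 0, 0],
    [2, 2, 2, 2],
]
TOY_ACT, TOY_COMPRESSED = build_act(TOY_TABLE, 4)
```

In this toy case the table shrinks from four columns to two, at the cost of one additional table lookup per input symbol, which is the kind of memory versus run-time trade-off the abstract describes.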
Evaluating GPUs for Network Packet Signature Matching
Cited by 6 (0 self)
Modern network devices employ deep packet inspection to enable sophisticated services such as intrusion detection, traffic shaping, and load balancing. At the heart of such services is a signature matching engine that must match packet payloads to multiple signatures at line rates. However, the recent transition to complex regular-expression-based signatures, coupled with ever-increasing network speeds, has rapidly increased the performance requirements of signature matching. Solutions to meet these requirements range from hardware-centric ASIC/FPGA implementations to software implementations using high-performance microprocessors. In this paper, we propose a programmable signature matching system prototyped on an Nvidia G80 GPU. We first present a detailed architectural and microarchitectural analysis, showing that signature matching is well suited for SIMD processing because of regular control flow and parallelism available at the packet level. Next, we examine two approaches for matching signatures: standard deterministic finite automata (DFAs) and extended finite automata (XFAs), which use far less memory than DFAs but require specialized auxiliary memory and small amounts of computation in most states. We implement a fully functional prototype on the SIMD-based G80 GPU. This system outperforms a Pentium 4 by up to 9X and a Niagara-based 32-threaded system by up to 2.3X and shows that GPUs are a promising candidate for signature matching.
Fast Regular Expression Matching using Small TCAMs for Network Intrusion Detection and Prevention Systems
Cited by 4 (3 self)
Regular expression (RE) matching is a core component of deep packet inspection in modern networking and security devices. In this paper, we propose the first hardware-based RE matching approach that uses Ternary Content Addressable Memories (TCAMs), which are off-the-shelf chips and have been widely deployed in modern networking devices for packet classification. We propose three novel techniques to reduce TCAM space and improve RE matching speed: transition sharing, table consolidation, and variable striding. We tested our techniques on 8 real-world RE sets, and our results show that small TCAMs can be used to store large DFAs and achieve potentially high RE matching throughput. For space, we were able to store each of the corresponding 8 DFAs with as many as 25,000 states in a 0.59Mb TCAM chip where the number of TCAM bits required per DFA
Software Toolchain for Large-Scale RE-NFA Construction on FPGA
Cited by 2 (2 self)
We present a software toolchain for constructing large-scale regular expression matching (REM) on FPGA. The software automates the conversion of regular expressions into compact and high-performance non-deterministic finite automata (RE-NFA) [17]. Assuming a fixed number of fan-out transitions per state, an n-state, m-bytes-per-cycle regular expression matching engine (REME) can be constructed in O(n × m) time and O(n × m) memory by our software. The resulting circuit occupies no more than O(n × m) slices on FPGA. Based on the proposed algorithms, we develop prototype software that converts arbitrary regular expressions into RTL code in VHDL, utilizing both logic slices and block memory (BRAM) available on FPGA devices. A large number of RE-NFAs are placed onto a two-dimensional staged pipeline, allowing scalability to thousands of RE-NFAs with linear area increase and little clock rate penalty due to scaling. On a PC with a 2 GHz Athlon64 processor and 2 GB memory, our prototype software converts hundreds of regular expressions from Snort [2] into VHDL in less than 10 seconds. We also designed a benchmark generator which can produce regular expressions with configurable pattern complexity parameters, including state count, state fan-in, loop-back and feed-forward distances. Several regular expressions with various complexities are used to test the performance of our RE-NFA construction software. Index Terms: Regular expression, FPGA, BRAM, finite state machine, NFA
Multi-Byte Regular Expression Matching with Speculation
, 2009
Cited by 2 (1 self)
Intrusion prevention systems determine whether incoming traffic matches a database of signatures, where each signature in the database represents an attack or a vulnerability. IPSs need to keep up with ever-increasing line speeds, which leads to the use of custom hardware. A major bottleneck that IPSs face is that they scan incoming packets one byte at a time, which limits their throughput and latency. In this paper, we present a method for scanning multiple bytes in parallel using speculation. We break the packet into several chunks, opportunistically scan them in parallel, and, if the speculation is wrong, correct it later. We present algorithms that apply speculation in single-threaded software running on commodity processors as well as algorithms for parallel hardware. Experimental results show that speculation leads to improvements in latency and throughput in both cases.
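The chunk-and-correct idea can be sketched as follows. This is a simplified illustration, not the paper's algorithm: chunks are scanned one after another here, although phase 1 is the part that could run in parallel. The function names, the `guess` parameter, and the toy DFA are assumptions. Each later chunk is scanned from a guessed state. If the true entering state differs, the chunk is rescanned only until the true and speculative state sequences converge.

```python
def scan(table, state, data):
    """Sequential DFA scan; returns the list of states visited after each symbol."""
    trace = []
    for sym in data:
        state = table[state][sym]
        trace.append(state)
    return trace


def speculative_match(table, start, guess, data, n_chunks=2):
    """Return the final DFA state, scanning later chunks from a guessed state."""
    size = max(1, -(-len(data) // n_chunks))  # ceiling division
    chunks = [data[i:i + size] for i in range(0, len(data), size)]

    # Phase 1: chunks are independent of each other, so this step is the
    # part that could be executed in parallel.
    traces = [scan(table, start if i == 0 else guess, chunk)
              for i, chunk in enumerate(chunks)]

    # Phase 2: validate each guess and repair traces where it was wrong.
    state = start
    for i, chunk in enumerate(chunks):
        trace = traces[i]
        if i > 0 and state != guess:
            for pos, sym in enumerate(chunk):
                state = table[state][sym]
                if state == trace[pos]:
                    break  # converged: the rest of the speculative trace is valid
                trace[pos] = state
        state = trace[-1]
    return state


# Toy DFA over symbols {0, 1} that reaches absorbing state 2 after seeing "0 1".
TOY_DFA = [
    [1, 0],
    [1, 2],
    [2, 2],
]
```

The sketch pays off only when rescans converge quickly, which depends on how good the guessed state is for the signature set and traffic in question.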
Carousel: Scalable Logging for Intrusion Prevention Systems
Cited by 2 (0 self)
We address the problem of collecting unique items in a large stream of information in the context of Intrusion Prevention Systems (IPSs). IPSs detect attacks at gigabit speeds and must log infected source IP addresses for remediation or forensics. An attack with millions of infected sources can result in hundreds of millions of log records when counting duplicates. If logging speeds are much slower than packet arrival rates and memory in the IPS is limited, scalable logging is a technical challenge. After showing that naïve approaches will not suffice, we solve the problem with a new algorithm we call Carousel. Carousel randomly partitions the set of sources into groups that can be logged without duplicates, and then cycles through the set of possible groups. We prove that Carousel collects almost all infected sources with high probability in close to optimal time as long as infected sources keep transmitting. We describe details of a Snort implementation and a hardware design. Simulations with worm propagation models show up to a factor of 10 improvement in collection times for practical scenarios. Our technique applies to any logging problem with noncooperative sources as long as the information to be logged appears repeatedly.
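The partition-and-cycle step can be illustrated with a rough sketch. It rests on several assumptions that are not taken from the paper: the number of groups is fixed at 2^k, a CRC32 hash stands in for the random partition, a per-phase Python set stands in for whatever bounded duplicate-suppression memory the real system uses, and the function names are hypothetical.

```python
import zlib


def group_of(source, k):
    """Assign a source to one of 2**k groups using a hash (assumed: CRC32)."""
    return zlib.crc32(source.encode()) & ((1 << k) - 1)


def carousel_log(phases, k):
    """Log sources phase by phase, admitting only one group per phase.

    phases is a list of lists; phases[p] holds the (possibly duplicated)
    sources observed during phase p. Returns the sources that were logged.
    """
    n_groups = 1 << k
    logged = []
    for phase_index, sources in enumerate(phases):
        active_group = phase_index % n_groups  # cycle through the groups
        seen = set()  # small memory, cleared at the start of each phase
        for source in sources:
            if group_of(source, k) != active_group:
                continue  # not this group's turn; it will come around again
            if source in seen:
                continue  # duplicate within the current phase
            seen.add(source)
            logged.append(source)
    return logged


# Sources keep retransmitting, so every phase sees the same noisy stream.
TRAFFIC = ["10.0.0.1", "10.0.0.2", "10.0.0.1", "10.0.0.3",
           "10.0.0.4", "10.0.0.2", "10.0.0.5", "10.0.0.1"]
LOGGED = carousel_log([TRAFFIC] * 4, k=2)
```

Because each source falls into exactly one group, one full cycle over the four groups logs every repeating source once, while the duplicate-suppression memory only ever holds the sources of a single group.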
Multi-Byte Regular Expression Matching with Speculation
Intrusion prevention systems determine whether incoming traffic matches a database of signatures, where each signature in the database represents an attack or a vulnerability. IPSs need to keep up with ever-increasing line speeds, which leads to the use of custom hardware. A major bottleneck that IPSs face is that they scan incoming packets one byte at a time, which limits their throughput and latency. In this paper, we present a method for scanning multiple bytes in parallel using speculation. We break the packet into several chunks, opportunistically scan them in parallel, and, if the speculation is wrong, correct it later. We present algorithms that apply speculation in single-threaded software running on commodity processors as well as algorithms for parallel hardware. Experimental results show that speculation leads to improvements in latency and throughput in both cases. Key words: low latency, parallel pattern matching, regular expressions, speculative pattern matching, multi-byte, multi-byte matching
Network DVR: A Programmable Framework for Application-Aware Trace Collection
Network traces are essential for a wide range of network applications, including traffic analysis, network measurement, performance monitoring, and security analysis. Existing capture tools do not have sufficient built-in intelligence to understand these application requirements. Consequently, they are forced to collect all packet traces that might be useful at the finest granularity to meet a certain level of accuracy requirement. It is up to the network applications to process the per-flow traffic statistics and extract meaningful information. But for a number of applications, it is much more efficient to record packet sequences for flows that match some application-specific signatures, specified using, for example, regular expressions. A basic approach is to begin memory-copy (recording) when the first character of a regular expression is matched. However, a match often fails eventually, so memory resources are consumed unnecessarily in the interim. In this paper, we present a programmable application-aware triggered trace collection system called Network DVR that performs precisely the function of packet content recording based on user-specified trigger signatures. This in turn significantly reduces the number of memory copies that the system has to perform for valid trace collection, which has been shown previously to be a key indicator of system performance [8]. We evaluated our Network DVR implementation on a practical application using 10 real datasets that were gathered from a large enterprise Internet gateway. In comparison to the basic approach in which the memory-copy starts immediately upon the first character match without triggered recording, Network DVR was able to reduce the number of memory copies by a factor of over 500 on average across the 10 datasets and over 800 in the best case.
Improving NFA-based Signature Matching . . .
, 2010
Network intrusion detection systems (NIDS) make extensive use of regular expressions as attack signatures. Internally, NIDS represent and operate these signatures using finite automata. Existing representations of finite automata present a well-known time-space tradeoff: Deterministic automata (DFAs) provide fast matching but are memory intensive, while non-deterministic automata (NFAs) are space-efficient but are several orders of magnitude slower than DFAs. This time/space tradeoff has motivated much recent research, primarily with a focus on improving the space-efficiency of DFAs, often at the cost of reducing their performance. This paper presents NFA-OBDDs, a symbolic representation of NFAs that retains their space-efficiency while improving their time-efficiency. Experiments using Snort HTTP and FTP signature sets show that an NFA-OBDD-based representation of regular expressions can outperform traditional NFAs by up to three orders of magnitude and is competitive with a variant of DFAs, while still remaining as compact as NFAs.
Regular expressions are used in Modern Deep Packet Inspection to define the various patt...
Deep Packet Inspection is an advanced method of packet filtering that functions at the Application layer of the OSI reference model. Deep Packet Inspection is a form of computer network packet filtering that examines the data part of a packet as it passes an inspection point, searching for protocol, viruses, spam, intrusions, or predefined criteria to decide whether the packet can pass or needs to be routed to a different destination, or for the purpose of collecting statistical information. Deterministic Finite Automata (DFAs) built from large sets of rules need an amount of memory that turns out to be too large for practical implementation. We have presented a new compressed representation for deterministic finite automata, called Delta Finite Automata. The algorithm considerably reduces the number of states and transitions, and it is based on the observation that most adjacent states share several common transitions, so it

