I am in interested in large scale networking systems. I have focused this interest in the realm of making the Internet more secure, in particular, by exploiting new hardware capabilities and developing new algorithms for better classification of network traffic. This has many applications, but a key one is detecting suspect traffic in order to filter out worms and viruses as well as other forms of bad traffic. In my dissertation, I have designed efficient packet classification and scanning algorithms that have significant improvements over current solutions.
PhD Dissertation Summary
In today’s edge networks, many new services are emerging such as intrusion detection, high speed firewalls, NAT, HTTP load balancing, XML processing, TCP offloading, encryption/decryption, etc. Traditional packet handling techniques, such as next hop forwarding can’t support these new services. Instead, we need deep packet inspection techniques that can function on the entire packet contents rather than just the header. For example, Network Intrusion Detection Systems (NIDS) inspect the packet payload at line rates to detect and filter packets containing worm signatures. These signature sets are large and complex, which make software-only implementations unable to meet typical performance goals. Therefore, we need more efficient algorithms and hardware solutions to achieve desirable performance.
I propose deep packet inspection schemes using emerging hardware technologies. First, I propose a multi-match classification solution which intelligently process packet header information using a special memory -- Ternary Content Addressable Memories (TCAMs) due to their ability to perform fast parallel matching. Second, I develop scheme to identify within packet payload also using TCAM. Lastly, I implement an efficient regular expression rewriting scheme and a fast searching engine for multi-core processor environments. With all these techniques, we can perform multi-match packet classification based on multiple fields in the packet header and perform gigabit rate packet payload scanning against thousands of complex patterns. The proposed schemes can efficiently handle long patterns, correlated patterns, patterns with negation, and also general regular expressions [thesis].
Multi-match packet classification
I adopt Ternary Content Addressable Memories (TCAMs) to solve the multi-match classification problem due to their ability to perform fast parallel matching. However, TCAM is expensive and consumes large amounts of power. None of the previously published multi-match classification schemes is both memory and power efficient. Hence, I developed a novel scheme that meets both requirements by using a new Set Splitting Algorithm (SSA). The main idea of SSA is that it splits filters into multiple groups and performs separate TCAM lookups into these groups. It guarantees the removal of at least half the intersections when a filter set is split into two sets, thus resulting in low TCAM memory usage. SSA also accesses filters in the TCAM only once per packet, leading to low power consumption. Compared to previously published schemes, SSA yields a 75% to 95% reduction in power consumption. [paper1] [slides], [paper2] [slides]
Deep packet pattern matching with TCAM
Because of their intrinsic parallel search capability, TCAMs can also be used effectively for the pattern matching functions needed in intrusion detection systems. However, TCAMs impose limitations on the pattern length that can be directly matched. Also, there is no direct method to handle correlated patterns such as the example patterns shown in Figure 1. In this paper, we develop algorithms that use TCAMs to achieve high speeds while not being restricted to these limitations. Our work is applicable to other layer 7 pattern matching problems. For example, applications like HTTP load balancing, email SPAM filtering, etc., require packet payload scanning. In this paper, we develop and test algorithms for intrusion detection and anti-virus signature sets because they are more complex than the signatures of the applications listed above. This work won a best paper award at the International Conference on Network Protocols (ICNP). [paper] [slides]
Fast Regular Expression Matching
Regular expressions are replacing explicit string patterns as the pattern matching language of choice in the above networking applications. For example, in the Linux Application Protocol Classifier (L7-filter), all protocol identifiers are expressed as regular expressions. Memory requirements using traditional methods for fast regular expression scanning are prohibitively high for many patterns used in networking applications. Hence, I propose regular expression rewrite techniques that reduce memory usage. Further, I develop a scheme based on compiling regular expressions into several engines, which dramatically increases the regular expression matching speed without significantly increasing memory usage. We implement the DFA-based packet scanners. Our experiment results using real-world traffic and patterns have shown that our implementation achieves one to three orders of magnitude of speedup compared to the best NFA-based implementation. [paper] [slides], [related paper]
Research on Optical Networks
In mesh restoration in optical network, the reasons for blocking of restoration path can either be different paths competing for the same channel, which we call glare, or there is insufficient bandwidth available. We proposed two mechanisms to reduce these two sources of blocking. First, channel selection methods that considerably reduce glare, while at the same time maintaining large groups of contiguous channels (for high speed connections) across multiple, parallel links. Then a hybrid solution that utilizes a centralized restoration “path server” to optimize the restoration path selection, yet utilizes distributed control to compute service paths and set up service/restoration paths. The objective is to achieve very fast provisioning, fast restoration upon network failure, and efficient use of capacity. To support this, we also present an efficient restoration path selection algorithm.
Various Previous Research