Exact multi-pattern string matching on the Cell/B.E. processor
Abstract
String searching is the computationally intensive kernel of many security and network applications like search engines, intrusion detection systems, virus scanners and spam filters. The growing size of on-line content and the increasing wire speeds push the need for fast -and often real-time, string searching solutions. Multi-core processors are gaming increasing popularity, thanks to their unprecedented computing power, but they are also bringing new programming challenges. This paper describes a class of high-performance exact string searching solutions that we have optimized for the IBM Cell/B.E. processor using the well known Aho-Corasick algorithm. This class provides several trade-offs between performance and dictionary size. When dictionaries are small enough to fit in the local memories of the processing cores, the throughput reaches 40 Gbps per processor. With larger dictionaries (as many as hundreds of thousands patterns), the typical throughput is between 1.6 and 2.2 Gbps per processor. Copyright 2008 ACM.