Data-Parallel Finite-State Machines

Todd Mytkowicz, Madanlal Musuvathi, and Wolfram Schulte

Abstract

A finite-state machine (FSM) is an important abstraction for solving several problems, including regular-expression matching, tokenizing text, and Huffman decoding. FSM computations typically involve data-dependent iterations with unpredictable memory-access patterns making them difficult to parallelize. This paper describes a parallel algorithm for FSMs that breaks dependences across iterations by efficiently enumerating transitions from all possible states on each input symbol. This allows the algorithm to utilize various sources of data parallelism available on modern hardware, including vector instructions and multiple processors/cores. For instance, on benchmarks from three FSM applications: regular expressions, Huffman decoding, and HTML tokenization, the parallel algorithm achieves up to a 3× speedup over optimized sequential baselines on a single core, and linear speedups up to 21× on 8 cores.

Details

Publication typeInproceedings
PublisherArchitectural Support for Programming Languages and Operating Systems (ASPLOS)

Previous versions

Todd Mytkowicz and Wolfram Schulte. Maine: A Library for Data Parallel Finite Automata, July 2012.

> Publications > Data-Parallel Finite-State Machines