Todd Mytkowicz and Wolfram Schulte
Programmers use Finite Automata (FA)---the computational engines of
Regular Languages---to solve a large and important class of problems
(e.g. lexing, edge detection, DNA sequence matching, variable length
decoding, etc.). Fundamentally, algorithms for evaluating FA have
not changed in 40 years, despite significant advances in hardware;
modern hardware is parallel while current algorithms for evaluating
FA are sequential.
In this paper we introduce Maine: a library for data parallel FA.
Unlike prior approaches to parallelizing FA, Maine does not rely
on speculation. Further, because Maine targets data parallelism,
it is amenable to many forms of parallelism found in modern hardware
(e.g. vector instructions, multicore, and GPU).
Maine formalizes the evaluation of a FA as matrix multiplication,
a well studied parallel algorithm. Maine is not the first to
view FA as operations on matrices, but it is the first to evaluate a
data parallel formulation of FA on real world programs. To that
end, we use Maine to implement parallel HTML lexing and parallel
Huffman decoding. We show how these seemingly sequential algorithms
can effectively take advantage of vector instructions and multicore
and, as a consequence, provide multi-factor performance improvements
over sequential implementations.