Maine: A Library for Data Parallel Finite Automata

Todd Mytkowicz and Wolfram Schulte

Abstract

Programmers use Finite Automata (FA)---the computational engines of Regular Languages---to solve a large and important class of problems (e.g. lexing, edge detection, DNA sequence matching, variable length decoding, etc.). Fundamentally, algorithms for evaluating FA have not changed in 40 years, despite significant advances in hardware; modern hardware is parallel while current algorithms for evaluating FA are sequential.

In this paper we introduce Maine: a library for data parallel FA. Unlike prior approaches to parallelizing FA, Maine does not rely on speculation. Further, because Maine targets data parallelism, it is amenable to many forms of parallelism found in modern hardware (e.g. vector instructions, multicore, and GPU).

Maine formalizes the evaluation of a FA as matrix multiplication, a well studied parallel algorithm. Maine is not the first to view FA as operations on matrices, but it is the first to evaluate a data parallel formulation of FA on real world programs. To that end, we use Maine to implement parallel HTML lexing and parallel Huffman decoding. We show how these seemingly sequential algorithms can effectively take advantage of vector instructions and multicore and, as a consequence, provide multi-factor performance improvements over sequential implementations.

Details

Publication typeTechReport
NumberMSR-TR-2012-62
> Publications > Maine: A Library for Data Parallel Finite Automata