GPU-efficient recursive filtering and summed-area tables
ACM Trans. Graphics (SIGGRAPH Asia), 30(6), 2011.
Efficient overlapped computation of successive recursive filters on 2D images.
Image processing operations like blurring, inverse convolution, and summed-area tables are often computed
efficiently as a sequence of 1D recursive filters. While much research has explored parallel recursive
filtering, prior techniques do not optimize across the entire filter sequence. Typically, a separate
filter (or often a causal-anticausal filter pair) is required in each dimension. Computing these filter
passes independently results in significant traffic to global memory, creating a bottleneck in GPU systems.
We present a new algorithmic framework for parallel evaluation. It partitions the image into 2D blocks,
with a small band of additional data buffered along each block perimeter. We show that these perimeter
bands are sufficient to accumulate the effects of the successive filters. A remarkable result is that the
image data is read only twice and written just once, independent of image size, and thus total memory
bandwidth is reduced even compared to the traditional serial algorithm. We demonstrate significant
speedups in GPU computation.
For completeness we could have mentioned
a term used in computer vision [Viola and Jones 2001]
to also refer to
ACM Copyright Notice
Copyright by the Association for Computing Machinery, Inc. Permission to make digital or
hard copies of part or all of this work for personal or classroom use is granted without fee provided that
copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the
full citation on the first page. Copyrights for components of this work owned by others than ACM must be
honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to
redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications
Dept, ACM Inc., fax +1 (212) 869-0481, or email@example.com. The definitive version of this paper can be
found at ACM's Digital Library http://www.acm.org/dl/.