One of the oldest robust methods used in image analysis and computer vision
is the *Hough transform*. The idea is to map data into the parameter
space, which is appropriately *quantized*, and then seek for the most
likely parameter values to interpret data through *clustering*.
A classical example is the detection of straight lines given
a set of edge points. In the following, we take the example of estimating
plane rigid motion from two sets of points.

Given *p* 2D points in the first set, noted , and *q* 2D
points in the second set, noted , we must find a rigid
transformation between the two sets. The pairing between and
is assumed not known. A rigid transformation can be
uniquely decomposed into a rotation around the origin and a translation, in
that order. The corresponding parameter space is three-dimensional: one
parameter for the rotation angle and two for the translation vector
. More precisely, if is paired to
, then

with

It is clear that at least two pairings are necessary for a unique estimate
of rigid transformation. The three-dimensional parameter space is
quantized as many levels as necessary according to the required
precision. The rotation angle ranges from *0* to .
We can fix the quantization interval for the rotation angle, say, at ,
and we have units. The translation is not bounded,
but it is in practice. We can assume for example that the translation
between two images cannot exceed 200 pixels in each direction (
). If we choose an interval of 20 pixels,
then we have units in each direction. The
quantized space can then considered as a three-dimensional
*accumulator*, of cells, that
is initialized to zero.

Since one pairing of points does not entirely constrain the motion, it
is difficult to update the accumulator because the constraint on
motion is not simple. Instead, we can match *n*-tuples of points in
the first set and in the second set, where *n* is
the smallest value such that matching *n* points in the first set with
*n* points in the second set completely determines the motion (in our
case, *n=2*). Let be one of such matches,
where and are both vectors of dimension *2n*,
each composed of *n* points in the first and second set, respectively.
Then the number of matches to be considered is of
order of (of course, we do not need to consider
all matches in our particular problem, because the distance
invariance between a pair of points under rigid transformation can be
used to discard the infeasible matches). For each such match, we
compute the motion parameters, and the corresponding accumulator cell
is increased by 1. After all matches have been considered, peaks in
the accumulator indicate the best candidates for the motion
parameters.

In general, if the number of data is not larger enough than the number of unknowns, then the maximum peak is not much higher than other peaks, and it may be not the correct one because of data noise and because of parameter space quantization. The Hough transform is thus highly suitable for problems having enough data to support the expected solution.

Because of noise in the measurements, the right peak in the accumulator may be very blurred so that it is not easily detected. The accuracy in the localization with the above simple implementation may be poor. There are several ways to improve it.

- Instead of select just the maximum peak, we can fit a quadratic hyper-surface. The position of its maximum gives a better localization in the parameter space, and the curvature can be used as an indication of the uncertainty of the estimation.
- Statistical clustering techniques can be used to discriminate different candidates of the solution.
- Instead of using an integer accumulator, the uncertainty of data can be taken into account and propagated to the parameter estimation, which would considerably increase the performance.

The Hough transform technique actually follows the principle of
*maximum likelihood estimation*. Let be the parameter
vector ( in the above example). Let
be one datum (
in the above example). Under the assumption that the data
represent the complete sample of the probability
density function of , , we have

by using the law of total probability. The maximum of is considered as the estimation of . The Hough transform described above can thus be considered as one approximation.

Because of its nature of global search, the Hough transform technique is rather robust, even when there is a high percentage of gross errors in the data. For better accuracy in the localization of solution, we can increase the number of samples in each dimension of the quantized parameter space. The size of the accumulator increases rapidly with the required accuracy and the number of unknowns. This technique is rarely applied to solve problems having more than three unknowns, and is not suitable for conic fitting.

Thu Feb 8 11:42:20 MET 1996