next up previous contents
Next: Regression Diagnostics Up: Robust Estimation Previous: Introduction

Clustering or Hough Transform


One of the oldest robust methods used in image analysis and computer vision is the Hough transform. The idea is to map data into the parameter space, which is appropriately quantized, and then seek for the most likely parameter values to interpret data through clustering. A classical example is the detection of straight lines given a set of edge points. In the following, we take the example of estimating plane rigid motion from two sets of points.

Given p 2D points in the first set, noted , and q 2D points in the second set, noted tex2html_wrap_inline3458 , we must find a rigid transformation between the two sets. The pairing between tex2html_wrap_inline3454 and tex2html_wrap_inline3458 is assumed not known. A rigid transformation can be uniquely decomposed into a rotation around the origin and a translation, in that order. The corresponding parameter space is three-dimensional: one parameter for the rotation angle tex2html_wrap_inline3464 and two for the translation vector tex2html_wrap_inline3466 . More precisely, if tex2html_wrap_inline3468 is paired to tex2html_wrap_inline3470 , then



It is clear that at least two pairings are necessary for a unique estimate of rigid transformation. The three-dimensional parameter space is quantized as many levels as necessary according to the required precision. The rotation angle tex2html_wrap_inline3464 ranges from 0 to tex2html_wrap_inline3476 . We can fix the quantization interval for the rotation angle, say, at tex2html_wrap_inline3478 , and we have tex2html_wrap_inline3480 units. The translation is not bounded, but it is in practice. We can assume for example that the translation between two images cannot exceed 200 pixels in each direction ( tex2html_wrap_inline3482 ). If we choose an interval of 20 pixels, then we have tex2html_wrap_inline3484 units in each direction. The quantized space can then considered as a three-dimensional accumulator, of tex2html_wrap_inline3486 cells, that is initialized to zero.

Since one pairing of points does not entirely constrain the motion, it is difficult to update the accumulator because the constraint on motion is not simple. Instead, we can match n-tuples of points in the first set and in the second set, where n is the smallest value such that matching n points in the first set with n points in the second set completely determines the motion (in our case, n=2). Let tex2html_wrap_inline3498 be one of such matches, where tex2html_wrap_inline3500 and tex2html_wrap_inline3502 are both vectors of dimension 2n, each composed of n points in the first and second set, respectively. Then the number of matches to be considered is of order of tex2html_wrap_inline3508 (of course, we do not need to consider all matches in our particular problem, because the distance invariance between a pair of points under rigid transformation can be used to discard the infeasible matches). For each such match, we compute the motion parameters, and the corresponding accumulator cell is increased by 1. After all matches have been considered, peaks in the accumulator indicate the best candidates for the motion parameters.

In general, if the number of data is not larger enough than the number of unknowns, then the maximum peak is not much higher than other peaks, and it may be not the correct one because of data noise and because of parameter space quantization. The Hough transform is thus highly suitable for problems having enough data to support the expected solution.

Because of noise in the measurements, the right peak in the accumulator may be very blurred so that it is not easily detected. The accuracy in the localization with the above simple implementation may be poor. There are several ways to improve it.

The Hough transform technique actually follows the principle of maximum likelihood estimation. Let tex2html_wrap_inline2527 be the parameter vector ( tex2html_wrap_inline3512 in the above example). Let tex2html_wrap_inline3514 be one datum ( tex2html_wrap_inline3516 in the above example). Under the assumption that the data tex2html_wrap_inline3518 represent the complete sample of the probability density function of tex2html_wrap_inline2527 , tex2html_wrap_inline3522 , we have


by using the law of total probability. The maximum of tex2html_wrap_inline3522 is considered as the estimation of tex2html_wrap_inline2527 . The Hough transform described above can thus be considered as one approximation.

Because of its nature of global search, the Hough transform technique is rather robust, even when there is a high percentage of gross errors in the data. For better accuracy in the localization of solution, we can increase the number of samples in each dimension of the quantized parameter space. The size of the accumulator increases rapidly with the required accuracy and the number of unknowns. This technique is rarely applied to solve problems having more than three unknowns, and is not suitable for conic fitting.

next up previous contents
Next: Regression Diagnostics Up: Robust Estimation Previous: Introduction

Zhengyou Zhang
Thu Feb 8 11:42:20 MET 1996