
Simplified construction of erasure codes
for three or fewer erasures
Mark S. Manasse
Microsoft Research, Silicon Valley
June 2002

When digital data is transmitted or stored, it is common to be concerned about both errors and erasures. An error occurs in a data stream when some data element is corrupted; an erasure occurs when a data element is missing or known to be faulty. Shannon's basic theorems on coding tell us that to correct e errors, a data stream needs to guarantee a minimum distance between correct streams of at least 2e+1; to detect e errors, or to correct e erasures, the streams need minimum distance at least e+1.

In more practical settings, we are concerned with the transmission or storage of data which is grouped into bytes; we expect erasures or errors to take place in multiples of bytes. To this end, Reed and Solomon devised a method for constructing linear codes over finite fields, where the individual data items can be represented as elements of the finite field. For digital data, it is most convenient to consider the fields GF(2^n), for n=8, 16, or 32. If we want to correct e erasures from m data words (where m+e <= 1+2^n), we can compute m+e result words by multiplying the length-m vector of data words by an m by m+e coding matrix. The key observation was that any m columns of an m by m+e Vandermonde matrix form an invertible m by m submatrix, which allows the data elements to be reconstructed from any m of the m+e result words.
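As a concrete illustration, here is a minimal sketch of this style of encoding over GF(2^8) using log/antilog tables; the choice of the primitive polynomial 0x11d, the generator g = 2, and the helper names (gf_mul, vandermonde_matrix, encode) are assumptions made for the example, not taken from this note.

    # Minimal GF(2^8) arithmetic via log/antilog tables (assumed setup:
    # primitive polynomial 0x11d, generator g = 2).
    EXP = [0] * 512     # antilog table, doubled so sums of two logs need no reduction
    LOG = [0] * 256
    x = 1
    for i in range(255):
        EXP[i] = x
        LOG[x] = i
        x <<= 1                  # multiply by g = 2 ...
        if x & 0x100:
            x ^= 0x11d           # ... reducing by the primitive polynomial
    for i in range(255, 512):
        EXP[i] = EXP[i - 255]

    def gf_mul(a, b):
        """Multiply two elements of GF(2^8)."""
        return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]

    def vandermonde_matrix(m, elements):
        """m-by-len(elements) matrix; the column for (non-zero) element a
        holds the consecutive powers a^0, a^1, ..., a^(m-1)."""
        return [[EXP[(LOG[a] * i) % 255] for a in elements] for i in range(m)]

    def encode(data, matrix):
        """Multiply the length-m data vector by an m-by-(m+e) coding matrix;
        addition in GF(2^n) is xor."""
        out = [0] * len(matrix[0])
        for i, d in enumerate(data):
            for j in range(len(out)):
                out[j] ^= gf_mul(d, matrix[i][j])
        return out

For instance, with m data words and e checks, encode(data, vandermonde_matrix(m, [EXP[k] for k in range(m + e)])) computes m+e result words of the standard (non-systematic) construction, one column for each of the distinct elements g^0, ..., g^(m+e-1) (this particular choice of elements requires m+e <= 2^n - 1 rather than the full bound above).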

If erasures are uncommon, we can more quickly recover the data by diagonalizing the coding matrix, so that the initial m by m submatrix is the identity matrix, that is, the matrix with the field element 1 on the main diagonal and the element 0 everywhere else. In this case, the first m result words are identical to the data words, simplifying both encoding and decoding in the common case.

In the standard construction, the largest Vandermonde matrix which can be constructed over GF(2^n) has 2^n columns; recent authors have observed that one extra (non-Vandermonde) column can be added while preserving the invertibility property. This matrix, when diagonalized, gives rise to the bounds described above; in this matrix, the e columns corresponding to the correction coding have no particularly nice properties. As such, the coding coefficients must be stored in an explicit coding matrix, and consulted on every coding step.

This note proposes a simpler construction of the coding matrix in the restricted case when e <= 3. The construction is to append up to three columns of a Vandermonde matrix to an identity matrix of size at most (2^n-1) square, leading to a coding matrix supporting m+e <= 2^n+2. Due to properties of a Vandermonde matrix, the encoding is regular, and inverse matrices are also regular.
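As a sketch of the proposed matrix (rebuilding the same GF(2^8) tables as in the earlier sketch, with generator g = 2; the function name and bounds are illustrative):

    # GF(2^8) log/antilog tables as in the earlier sketch
    # (generator g = 2, primitive polynomial 0x11d).
    EXP, LOG, x = [0] * 512, [0] * 256, 1
    for i in range(255):
        EXP[i], LOG[x] = x, i
        x = (x << 1) ^ (0x11d if x & 0x80 else 0)
    for i in range(255, 512):
        EXP[i] = EXP[i - 255]

    def simplified_coding_matrix(m, e):
        """Identity of size m with e <= 3 appended check columns; the appended
        entries in row i are g^(0*i), g^(1*i), g^(2*i), truncated to e of them."""
        assert 1 <= m <= 255 and 0 <= e <= 3
        return [[1 if j == i else 0 for j in range(m)] +
                [EXP[(j * i) % 255] for j in range(e)]
                for i in range(m)]

With this matrix the first m result words are the data words themselves, and the check coefficients g^(j*i) can be generated on the fly rather than stored.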

A Vandermonde matrix is a matrix whose columns consist of consecutive powers of distinct elements of the ground field over which we are working. In a standard Vandermonde matrix, the first row contains the zeroth powers, the second row the first powers, and so on. Invertibility follows from computation of the determinant, which (up to sign) is equal to the product of the differences of the elements used to generate the matrix. If a column begins with a higher power of an element, the determinant is multiplied by that value (and is still non-zero, if the element is non-zero); we call such a matrix an extended Vandermonde matrix.
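In symbols, this is the standard Vandermonde determinant formula (stated here generically, not specifically to this note): for generating elements x_1, ..., x_m,

    \[
    \det \begin{pmatrix}
      1         & 1         & \cdots & 1         \\
      x_1       & x_2       & \cdots & x_m       \\
      \vdots    & \vdots    &        & \vdots    \\
      x_1^{m-1} & x_2^{m-1} & \cdots & x_m^{m-1}
    \end{pmatrix}
    = \prod_{1 \le i < j \le m} (x_j - x_i),
    \]

which is non-zero exactly when the x_i are pairwise distinct; a column that instead starts at the first power of its element contributes an extra factor of that element to the product.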

For our use, let g be a generator of the finite field. The elements from which we generate our Vandermonde matrix are g^0 (i.e. 1), g^1, and g^2. When we take an arbitrary m by m submatrix of the resulting coding matrix, it's not hard to see that the columns drawn from the identity portion of the matrix can be ignored when computing the determinant: expanding along each such unit column simply deletes the row containing its 1, leaving an arbitrary 0 by 0, 1 by 1, 2 by 2, or 3 by 3 minor of the Vandermonde section.

In all cases, the minor is a (possibly extended, in the case of 2 by 2) transposed Vandermonde matrix, and hence has non-zero determinant, because g^i and g^j, and g^2i and g^2j, are distinct and non-zero for all distinct i, j < 2^n-1. The only difficult case is showing that g^2i and g^2j are distinct; this is true because g is a generator of the multiplicative group, and thus g^2(i-j) is equal to 1 only when 2(i-j) is a multiple of the order of the multiplicative group. For GF(2^n), that order is 2^n-1; since this is odd, we would need i congruent to j modulo 2^n-1 to get unity.
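Concretely, writing a = g^i, b = g^j, c = g^k for distinct row indices i, j, k, the possible 2 by 2 and 3 by 3 minors and their determinants are:

    \[
    \det\begin{pmatrix} 1 & a \\ 1 & b \end{pmatrix} = b - a, \qquad
    \det\begin{pmatrix} 1 & a^2 \\ 1 & b^2 \end{pmatrix} = b^2 - a^2, \qquad
    \det\begin{pmatrix} a & a^2 \\ b & b^2 \end{pmatrix} = ab\,(b - a),
    \]
    \[
    \det\begin{pmatrix} 1 & a & a^2 \\ 1 & b & b^2 \\ 1 & c & c^2 \end{pmatrix}
      = (b - a)(c - a)(c - b),
    \]

each of which is non-zero precisely because a, b, and c are distinct and non-zero and, by the argument above, their squares are distinct as well.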

If we were to extend this to finite fields of odd characteristic, we would need to be more careful: 2 is then a factor of the order of the multiplicative group, so we would need to restrict m to be less than half the order of that group.

By performing multiplication using logarithms, the coding matrix becomes quite simple: the logarithms of the entries in the three encoding columns are, respectively, all 0's, the consecutive integers starting from 0, and the even integers starting from 0; when updating data element i by exclusive-or with data d, the update to check value j is to xor in antilog(log(d)+j*i). Compared to conventional techniques, this saves us the cost of storing and looking up the matrix entry; the lookup is trivial, but in a small microcontroller, the storage may not be.
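As a sketch of the update rule (rebuilding the same GF(2^8) tables as above; the function name update_checks is illustrative):

    # GF(2^8) log/antilog tables as in the earlier sketches
    # (generator g = 2, primitive polynomial 0x11d).
    EXP, LOG, x = [0] * 512, [0] * 256, 1
    for i in range(255):
        EXP[i], LOG[x] = x, i
        x = (x << 1) ^ (0x11d if x & 0x80 else 0)
    for i in range(255, 512):
        EXP[i] = EXP[i - 255]

    def update_checks(checks, i, d):
        """Fold an xor of d into data element i: check j changes by
        antilog(log(d) + j*i), with j = 0 being plain parity."""
        if d == 0:
            return checks                    # xor with 0 changes nothing
        return [c ^ EXP[(LOG[d] + j * i) % 255] for j, c in enumerate(checks)]

No table of matrix entries is consulted; only the log and antilog tables (or their hardware equivalents) are needed.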

For reconstruction, note that the first check function is parity, so correcting one erasure is easy. Correcting two or more erasures requires inverting the corresponding minor; this tells us what to multiply the check elements by. The multipliers for the other data elements are then linear combinations of the corresponding check-element multipliers, weighted by the elements of the coding matrix, and these are easy to compute.
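A sketch of the recovery step for up to three erased data words, under the same assumptions as the earlier sketches (GF(2^8) with generator g = 2, check word j equal to the xor over i of d_i times g^(j*i)); the helper names gf_inv, mat_inv, and recover are illustrative:

    # GF(2^8) log/antilog tables as in the earlier sketches
    # (generator g = 2, primitive polynomial 0x11d).
    EXP, LOG, x = [0] * 512, [0] * 256, 1
    for i in range(255):
        EXP[i], LOG[x] = x, i
        x = (x << 1) ^ (0x11d if x & 0x80 else 0)
    for i in range(255, 512):
        EXP[i] = EXP[i - 255]

    def gf_mul(a, b):
        return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]

    def gf_inv(a):
        return EXP[255 - LOG[a]]             # multiplicative inverse, a != 0

    def mat_inv(M):
        """Invert a small square matrix over GF(2^8) by Gauss-Jordan elimination."""
        n = len(M)
        A = [list(row) + [1 if r == c else 0 for c in range(n)] for r, row in enumerate(M)]
        for col in range(n):
            piv = next(r for r in range(col, n) if A[r][col])   # non-zero pivot exists: the minor is invertible
            A[col], A[piv] = A[piv], A[col]
            scale = gf_inv(A[col][col])
            A[col] = [gf_mul(scale, v) for v in A[col]]
            for r in range(n):
                if r != col and A[r][col]:
                    f = A[r][col]
                    A[r] = [v ^ gf_mul(f, w) for v, w in zip(A[r], A[col])]
        return [row[n:] for row in A]

    def recover(data, checks, erased, live_checks):
        """Fill in the erased data positions (at most three) from surviving checks.
        data: list of m words with None at erased positions; checks: dict j -> check word;
        erased: the erased data indices; live_checks: indices of len(erased) intact checks."""
        k = len(erased)
        # Syndromes: what the erased words must contribute to each surviving check.
        syn = []
        for j in live_checks:
            s = checks[j]
            for i, d in enumerate(data):
                if d is not None:
                    s ^= gf_mul(d, EXP[(j * i) % 255])
            syn.append(s)
        # The k-by-k minor on the erased rows and surviving check columns; solve M * x = syn.
        Minv = mat_inv([[EXP[(j * i) % 255] for i in erased] for j in live_checks])
        for r, i in enumerate(erased):
            data[i] = 0
            for c in range(k):
                data[i] ^= gf_mul(Minv[r][c], syn[c])
        return data

For a single erasure with the parity check (j = 0) surviving, this degenerates to xoring the surviving words together, as noted above.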