Reed-Solomon Codes

August 07, 2017

Intro

Reed-Solomon codes are a form of error-correcting code created by Irving Reed and Gustave Solomon at MIT Lincoln Laboratory.

Reed-Solomon codes belong to the family of BCH codes, which use finite fields and polynomial structures to process message data and detect and correct errors.


Overview

The basic structure of these codes involves taking a message and splitting it up into Code Words.

Code Word

A code word includes the original message you want to send and the parity bits added on at the end of it.

The data in the code word is broken up further into what are called symbols or characters. These are $s$ bits long, usually 8 bits.

The entire code word is $n$ symbols long, while the original data is $k$ symbols and the parity is $2t$ symbols.

The Reed-Solomon algorithm can correct up to $2t$ erasures in the data, or up to $t$ errors. Erasures are like errors, but where the location is known. Think of a QR code where part of the code is covered by something: you know the data there isn't correct before you try to decode it. Errors are just mistakes in the data of some unknown magnitude and at an unknown location.

Galois Fields

Galois field arithmetic is very important to the Reed-Solomon algorithms. All of the operations are done in a Galois field.

The reason for this is that we can do addition, subtraction, multiplication, and division on binary numbers of length $s$ and always get back binary numbers of length $s$. In other words, the numbers won't ever grow larger than $s$ bits.

To create a Galois field with integers, we can do normal addition and multiplication operations and then mod by some prime number to wrap the results around.

Say $x$ and $y$ are integers and $p$ is some prime number...

  • $x + y \pmod p$ is an integer
  • $x - y \pmod p$ is an integer
  • $x * y \pmod p$ is an integer
  • $x / y \pmod p$ is an integer

The division point is the interesting one. By using a prime number as the modulus, we ensure that for every pair of integers $x$ and $y$ (with $y \not\equiv 0 \pmod p$), there is some integer $z$ such that $y * z \equiv x \pmod p$, so $x / y = z$ will always have an answer.
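To make this concrete, here is a minimal sketch of $GF(p)$ arithmetic in Python (the function names and the choice $p = 7$ are mine, not from any particular library):

```python
P = 7  # a small prime modulus

def add(x, y): return (x + y) % P
def sub(x, y): return (x - y) % P
def mul(x, y): return (x * y) % P

def div(x, y):
    # Division is multiplication by the inverse of y. Because P is
    # prime, Fermat's little theorem gives y^(P-2) as that inverse.
    return (x * pow(y, P - 2, P)) % P

# div(3, 5) returns the z such that mul(5, z) == 3
assert mul(5, div(3, 5)) == 3
```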

So what we want to do is take this Galois field idea and apply it to binary numbers. Let's say our goal is to have 8-bit binary numbers, and to create a Galois field so that any operation will give us back 8-bit binary numbers. This type of Galois field is represented as $GF(2^8)$, where 2 is the characteristic of the field and 8 is the exponent.

For all the operations it may be beneficial to represent the binary number as a polynomial. This is done by treating each bit as a coefficient in a polynomial.

$$\begin{aligned} 10101010 &= 1 * x^7 + 0 * x^6 + 1 * x^5 + 0 * x^4 + 1 * x^3 + 0 * x^2 + 1 * x + 0\\ &= x^7 + x^5 + x^3 + x \end{aligned}$$

Addition

Say we want to add $5 + 6$, or $0101 + 0110$:

$$0101 = x^2 + 1\\ 0110 = x^2 + x$$

$$\begin{aligned} (x^2 + 1) + (x^2 + x) &= 2 * x^2 + x + 1\\ &= 0 * x^2 + x + 1\\ &= x + 1\\ &= 0011 \end{aligned}$$

$$0101 + 0110 = 0011\\ 5 + 6 = 3$$

The reason the 2 coefficient changed to 0 is that we are dealing with binary numbers: the coefficients of the polynomials are always taken modulo 2, so the 2 becomes a 0.

An efficient way to do this in binary is to just XOR the two numbers.

$$\begin{aligned} &0101\\ \text{XOR} \enspace &0110\\ \hline &0011 \end{aligned}$$
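In code this is a one-liner. A minimal sketch (and, as the Subtraction section below notes, subtraction turns out to be the exact same operation):

```python
def gf_add(x, y):
    # addition in GF(2^m) is a bitwise XOR: coefficients are added mod 2
    return x ^ y

def gf_sub(x, y):
    # subtraction is identical, since -1 = 1 (mod 2)
    return x ^ y

assert gf_add(0b0101, 0b0110) == 0b0011   # 5 + 6 = 3, as above
```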

Subtraction

Because we mod the coefficients by 2, $-1$ becomes $1$; otherwise subtractions are the same as additions.

$$\begin{aligned} 0101 - 0110 &= (x^2 + 1) - (x^2 + x)\\ &= x^2 + 1 - x^2 - x\\ &= 0 * x^2 - x + 1\\ &= x + 1\\ &= 0011 \end{aligned}$$

Multiplication

For multiplication, we can continue converting to polynomials, do the operation, then convert back.

$$\begin{aligned} 10001001 * 00101010 &= (x^7 + x^3 + 1) * (x^5 + x^3 + x)\\ &= x^7(x^5 + x^3 + x) + x^3(x^5 + x^3 + x) + (x^5 + x^3 + x)\\ &= x^{12} + x^{10} + x^8 + x^8 + x^6 + x^4 + x^5 + x^3 + x\\ &= x^{12} + x^{10} + 2 * x^8 + x^6 + x^5 + x^4 + x^3 + x\\ &= x^{12} + x^{10} + 0 * x^8 + x^6 + x^5 + x^4 + x^3 + x\\ &= 1010001111010 \end{aligned}$$

Or we can use binary multiplication, replacing the additions with XORs:

$$\begin{aligned} &10001001\\ \times \enspace &00101010\\ \hline &\phantom{0000}100010010\\ &\phantom{00}10001001000\\ \text{XOR} \enspace &1000100100000\\ \hline &1010001111010 \end{aligned}$$

$$1010001111010 = x^{12} + x^{10} + x^6 + x^5 + x^4 + x^3 + x$$

Now we have a problem: this binary number is longer than 8 bits.

With the integers we modded the numbers by some prime, so we will have to find the equivalent for polynomials. We need an irreducible polynomial (specifically, a primitive polynomial) to serve as our modulus.

We will use 100011101 ($x^8 + x^4 + x^3 + x^2 + 1$) as this modulus. So we need to divide the polynomial by this number and find the remainder.

An easy way to do this division is to line up the primitive polynomial with the number being reduced so the most significant bits (MSBs) line up, and XOR. Then repeat until the result is less than 9 bits long.

$$\begin{aligned} 1010&001111010\\ \text{XOR} \enspace 1000&11101\\ \hline 0010&110101010\\ \text{XOR} \enspace 10&0011101\\ \hline 00&111011110\\ \text{XOR} \enspace &100011101\\ \hline &011000011 \end{aligned}$$

$$1010001111010 \pmod{100011101} = 011000011$$
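Here is a sketch of this multiply-then-reduce procedure in Python, interleaving the reduction with the shift-and-XOR multiplication so intermediate values stay small (the names are mine; 0x11D is the binary 100011101 used above):

```python
PRIM = 0x11D  # the primitive polynomial 100011101

def gf_mul_slow(x, y):
    # shift-and-XOR multiplication in GF(2^8), reducing by the
    # primitive polynomial whenever x grows past 8 bits
    result = 0
    while y:
        if y & 1:
            result ^= x       # add this shifted copy of x into the result
        y >>= 1
        x <<= 1               # shift x left (multiply by 2)
        if x & 0x100:
            x ^= PRIM         # reduce back into 8 bits
    return result

assert gf_mul_slow(0b10001001, 0b00101010) == 0b011000011
```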

There is actually a more efficient way to do multiplication in Galois fields.

The simplest operation is multiplying by 2: since the numbers are in binary, you just shift the bits left by 1. For any power of 2, you shift the bits left by that power.

We can pick a generator number ($\alpha = 00000010$) that is equal to 2 in the Galois field, and find all the powers of it that are still in the field.

$$\alpha^0 = 00000001\\ \alpha^1 = 00000010\\ \mathellipsis\\ \alpha^7 = 10000000\\ \alpha^8 = 00011101\\ \alpha^9 = 00111010\\ \mathellipsis\\ \alpha^{11} = 11101000\\ \mathellipsis$$

If we find all the powers of this generator number from 0 to 254 (or whatever the maximum power for the field is), we will get every nonzero number in the field.

In other words, every nonzero number in the Galois field can be represented as some power of the generator number.

Then, if we convert the numbers before multiplying, it becomes much simpler.

$$\begin{aligned} 10001001 * 00101010 = \alpha^{74} * \alpha^{142} &= \alpha^{74 + 142}\\ &= \alpha^{216}\\ &= 11000011 \end{aligned}$$

So if we just keep a table of these $\alpha$ powers, we can do a quick lookup, add the base-$\alpha$ logarithms of the numbers, and multiplication becomes easy.
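A sketch of those tables in Python, built with the slow multiply from the sketch above (the table names are my own):

```python
GF_EXP = [0] * 512   # alpha^i; doubled in size so lookups never overflow
GF_LOG = [0] * 256   # log base alpha

x = 1
for i in range(255):
    GF_EXP[i] = x
    GF_LOG[x] = i
    x = gf_mul_slow(x, 2)          # next power of the generator alpha = 2
for i in range(255, 512):
    GF_EXP[i] = GF_EXP[i - 255]    # alpha^255 = 1, so the powers wrap

def gf_mul(x, y):
    # multiply by adding logarithms: x * y = alpha^(log x + log y)
    if x == 0 or y == 0:
        return 0
    return GF_EXP[GF_LOG[x] + GF_LOG[y]]

assert gf_mul(0b10001001, 0b00101010) == 0b11000011
```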

Division

Using the $\alpha$ trick explained in multiplication, division is as simple as subtracting the base-$\alpha$ logarithms instead of adding them.

$${x \over y} = \alpha^{\log_\alpha(x) - \log_\alpha(y)}$$
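The same tables give division. A sketch (the `+ 255` keeps the exponent non-negative before the wrap-around):

```python
def gf_div(x, y):
    # divide by subtracting logarithms: x / y = alpha^(log x - log y)
    if y == 0:
        raise ZeroDivisionError("division by zero in GF(2^8)")
    if x == 0:
        return 0
    return GF_EXP[(GF_LOG[x] - GF_LOG[y] + 255) % 255]

assert gf_mul(gf_div(7, 3), 3) == 7   # (7 / 3) * 3 = 7
```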

Encoding

We take some message $M$ and break it into symbols; if we use the Galois field $GF(2^8)$, the symbols are 8 bits long. So $M$ becomes $M_1, M_2, M_3, \mathellipsis M_k$, where each $M_i$ is a symbol. Then we can create a polynomial from these symbols.

$$M = (M_1, M_2, M_3, \mathellipsis M_k)\\ M(x) = M_1 + M_2x + M_3x^2 + \mathellipsis + M_kx^{k-1}$$

It is important to realize that this polynomial is not the same as the previous ones. Each coefficient $M_i$ of this polynomial is an element of the Galois field. From this point on we will be working with these polynomials, whose coefficients are elements of the Galois field, so don't confuse them with the ones used inside Galois field arithmetic.

Now we need a Generator Polynomial. Again, this is different from the generator number mentioned previously, but it is built from that number. The generator polynomial is a degree-$2t$ polynomial whose roots are consecutive powers of $\alpha$.

$$g(x) = (x - \alpha^1)(x - \alpha^2) \mathellipsis (x - \alpha^{2t})$$

$2t$ is the number of parity symbols you want.

Now we take our message polynomial, multiply it by $x^{2t}$ to shift it and make room for the parity, and mod the result by the generator polynomial; the remainder is our parity symbols. We simply tack them on at the end of the message.

You may want to pad the message with buffer symbols so that it is a certain length. You can just add 0's, and either leave them in when you send the message, or remove them and have the receiver tack them back on before working with the message.
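Putting the encoding steps together, here is a sketch of a systematic encoder under the conventions above. It reuses `gf_mul` and `GF_EXP` from the earlier sketches; everything else is my own naming, not a reference implementation:

```python
def rs_generator_poly(nsym):
    # g(x) = (x + alpha^1)(x + alpha^2)...(x + alpha^{2t});
    # minus is plus in GF(2^8). Coefficients are highest power first.
    g = [1]
    for i in range(1, nsym + 1):
        a = GF_EXP[i]
        shifted = g + [0]                           # g(x) * x
        scaled = [0] + [gf_mul(c, a) for c in g]    # g(x) * alpha^i
        g = [s ^ t for s, t in zip(shifted, scaled)]
    return g

def rs_encode(msg, nsym):
    # parity = remainder of M(x) * x^{2t} divided by g(x), computed by
    # polynomial long division; g is monic, so no coefficient division
    g = rs_generator_poly(nsym)
    rem = list(msg) + [0] * nsym                    # M(x) * x^{2t}
    for i in range(len(msg)):
        coef = rem[i]
        if coef != 0:
            for j in range(1, len(g)):
                rem[i + j] ^= gf_mul(g[j], coef)
    return list(msg) + rem[-nsym:]                  # message + parity

codeword = rs_encode([0x12, 0x34, 0x56], 4)   # 3 data symbols, 4 parity
```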

Decoding

There are some terms I need to define before going into the math of decoding the message.

  • $R(x)$: the received message polynomial (including parity bits)
  • $T(x)$: the uncorrupted sent message polynomial (including parity bits)
  • $E(x)$: the errors in the received message polynomial

These polynomials are just like the ones described in the encoding section, where the data is split up and each symbol is a coefficient of the polynomial.

These polynomials relate such that...

$$\begin{aligned} R(x) &= T(x) + E(x)\\ T(x) &= M(x) \cdot x^{2t} + (M(x) \cdot x^{2t} \bmod g(x))\\ E(x) &= E_{n-1}x^{n-1} + \mathellipsis + E_1x + E_0 \end{aligned}$$

$E_i$ is an $s$-bit error value, and the power of the $x$ it multiplies determines the position the error happened at.

Syndromes

We know that $T(x)$ is divisible by the generator polynomial $g(x)$: we added the remainder of $M(x) \cdot x^{2t} / g(x)$ to it, and in a field of characteristic 2, adding the remainder is the same as subtracting it.

This makes checking for errors very simple: just evaluate $R(x)$ at each zero of $g(x)$ and see if the result is zero. Because $T(x)$ is divisible by $g(x)$, the zeros of $g(x)$ must be zeros of $T(x)$, so any nonzero result signals an error.

The zeros of $g(x)$ are also very easy to find, since it is constructed so that the zeros are $\alpha^i$ for $1 \le i \le 2t$.

When we evaluate $R(x)$ at each of these powers of $\alpha$, we get what is called a Syndrome.

$$S_i = R(\alpha^i) = R_{n-1}\alpha^{i(n-1)} + R_{n-2}\alpha^{i(n-2)} + \mathellipsis + R_1\alpha^i + R_0$$

We also know that $T(x)$ evaluates to 0 at every zero of $g(x)$, so we can simplify the equation...

$$\begin{aligned} R(\alpha^i) &= T(\alpha^i) + E(\alpha^i)\\ T(\alpha^i) &= 0\\ R(\alpha^i) &= E(\alpha^i) = S_i \end{aligned}$$

So the syndrome values depend only on the error polynomial $E(x)$.
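In code, computing the syndromes is a Horner evaluation of the received word at $\alpha^1 \mathellipsis \alpha^{2t}$. A sketch, reusing the helpers above (the received list is highest-order symbol first, matching the encoder sketch):

```python
def rs_syndromes(received, nsym):
    synds = []
    for i in range(1, nsym + 1):
        a = GF_EXP[i]                    # evaluate R(x) at alpha^i
        s = 0
        for coef in received:
            s = gf_mul(s, a) ^ coef      # Horner: s = s * alpha^i + coef
        synds.append(s)
    return synds

# an uncorrupted codeword gives all-zero syndromes
assert rs_syndromes(rs_encode([0x12, 0x34, 0x56], 4), 4) == [0, 0, 0, 0]
```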

Let's say that errors occur only at locations $(e_1, e_2, \mathellipsis e_v)$, where $e_i$ corresponds to the power of $x$ where the error is. Then we can rewrite $E(x)$ to only include these locations.

$$E(x) = Y_1x^{e_1} + Y_2x^{e_2} + \mathellipsis + Y_vx^{e_v}$$

Where $Y_1, Y_2, \mathellipsis Y_v$ represent the error values at these locations.

If we evaluate this new $E(x)$ at $\alpha^i$ we get

$$E(\alpha^i) = Y_1\alpha^{ie_1} + Y_2\alpha^{ie_2} + \mathellipsis + Y_v\alpha^{ie_v}$$

And to simplify we can define

$$X_1 = \alpha^{e_1}, \mathellipsis, X_v = \alpha^{e_v}\\ E(\alpha^i) = Y_1X_1^i + Y_2X_2^i + \mathellipsis + Y_vX_v^i$$

We can take these equations and create a matrix defining each syndrome in terms of the error locations and error values.

$$\begin{bmatrix} S_1\\ S_2\\ \vdots\\ S_{2t} \end{bmatrix} = \begin{bmatrix} X_1^1 & X_2^1 & \cdots & X_v^1\\ X_1^2 & X_2^2 & \cdots & X_v^2\\ \vdots & \vdots & \ddots & \vdots\\ X_1^{2t} & X_2^{2t} & \cdots & X_v^{2t} \end{bmatrix} \times \begin{bmatrix} Y_1\\ Y_2\\ \vdots\\ Y_v \end{bmatrix}$$

So we already know the values of the syndromes $(S_1, S_2, \mathellipsis S_{2t})$; if we find the values of the error locations $(X_1, X_2, \mathellipsis X_v)$, we can solve for the error values $(Y_1, Y_2, \mathellipsis Y_v)$.

Error Locator Polynomial

The Error Locator Polynomial is a polynomial whose roots are the inverses of the error locations in $R(x)$.

$$\begin{aligned} \Lambda(x) &= (1 + X_1x)(1 + X_2x) \mathellipsis (1 + X_vx)\\ &= 1 + \Lambda_1x + \mathellipsis + \Lambda_{v-1}x^{v-1} + \Lambda_vx^v \end{aligned}$$

The goal then is to find the coefficients $\Lambda_1, \Lambda_2, \mathellipsis \Lambda_v$.

We know that each $X_j^{-1}$ is a zero of $\Lambda(x)$. So we plug in $X_j^{-1}$ and get this equation:

$$1 + \Lambda_1X_j^{-1} + \mathellipsis + \Lambda_{v-1}X_j^{-v+1} + \Lambda_vX_j^{-v} = 0$$

Multiply both sides by $Y_jX_j^{i+v}$:

$$Y_jX_j^{i+v} + \Lambda_1Y_jX_j^{i+v-1} + \mathellipsis + \Lambda_{v-1}Y_jX_j^{i+1} + \Lambda_vY_jX_j^i = 0$$

We can also sum this equation over all $j$, and the result will still be zero.

The point of doing this is that we can now substitute in the syndromes, which we already know.

$$\sum_{j=1}^v Y_j X_j^{i+v} + \Lambda_1 \sum_{j=1}^v Y_j X_j^{i+v-1} + \mathellipsis + \Lambda_{v-1} \sum_{j=1}^v Y_j X_j^{i+1} + \Lambda_v \sum_{j=1}^v Y_j X_j^i = 0$$

$$S_k = \sum_{j=1}^v Y_j X_j^k$$

$$S_{i+v} + \Lambda_1 S_{i+v-1} + \mathellipsis + \Lambda_{v-1} S_{i+1} + \Lambda_v S_i = 0$$

Now we can convert this whole thing into matrix representation.

$$\begin{bmatrix} S_{v+1}\\ S_{v+2}\\ \vdots\\ S_{2v} \end{bmatrix} = \begin{bmatrix} S_v & S_{v-1} & \cdots & S_1\\ S_{v+1} & S_v & \cdots & S_2\\ \vdots & \vdots & \ddots & \vdots\\ S_{2v-1} & S_{2v-2} & \cdots & S_v \end{bmatrix} \times \begin{bmatrix} \Lambda_1\\ \Lambda_2\\ \vdots\\ \Lambda_v \end{bmatrix}$$

There is one problem with this: we may not know the number of errors that are in the message. In other words, we don't know $v$.

We can find $v$ with Berlekamp's Algorithm.

This algorithm starts with $\Lambda(x) = 1$, and at each stage an error value is calculated by substituting the approximate coefficients for that value of $v$.

To start we have two polynomials, the syndromes, and two parameters.

  • $\Lambda(x)$ is the error locator polynomial
  • $C(x)$ is the correction polynomial
  • $S_1, S_2, \mathellipsis S_{2t}$ are the syndromes
  • $K$ is the step parameter
  • $L$ is the parameter used to track the order of the equations

We initialize with $K = 1,\ L = 0,\ \Lambda(x) = 1,\ C(x) = x$.

Then we calculate the error value $e$:

$$e = S_K + \displaystyle\sum_{i=1}^L \Lambda_i S_{K-i}$$

If $e \ne 0$, compute a new candidate:

$$\Lambda^*(x) = \Lambda(x) + e \times C(x)$$

If $e \ne 0$ and $2L < K$, then also set $L = K - L$ and $C(x) = \Lambda(x) / e$ (using the old $\Lambda(x)$).

Then set $C(x) = x \times C(x)$, $\Lambda(x) = \Lambda^*(x)$, and $K = K + 1$.

Repeat from the error value calculation until $K > 2t$.

What this algorithm does is maintain a candidate $\Lambda(x)$ along with a correction polynomial $C(x)$, with $L$ tracking the order of the current solution, and over time it increases $L$ so that

$$S_{i+v} + \Lambda_1 S_{i+v-1} + \mathellipsis + \Lambda_{v-1} S_{i+1} + \Lambda_v S_i = 0$$

And it tries to find the smallest $L$ possible that satisfies this.
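Here is a sketch of that iteration in the same $K$, $L$, $\Lambda(x)$, $C(x)$ terms. Polynomials are lists with the constant term first; this is my own transcription of the steps above, so treat it as illustrative:

```python
def berlekamp_massey(synds, nsym):
    lam = [1]      # Lambda(x) = 1
    c = [0, 1]     # C(x) = x
    L = 0
    for K in range(1, nsym + 1):
        # e = S_K + sum_{i=1}^{L} Lambda_i * S_{K-i}
        e = synds[K - 1]
        for i in range(1, L + 1):
            e ^= gf_mul(lam[i], synds[K - 1 - i])
        if e != 0:
            # Lambda*(x) = Lambda(x) + e * C(x)
            new_lam = [0] * max(len(lam), len(c))
            for i, co in enumerate(lam):
                new_lam[i] ^= co
            for i, co in enumerate(c):
                new_lam[i] ^= gf_mul(e, co)
            if 2 * L < K:
                L = K - L
                c = [gf_div(co, e) for co in lam]   # C(x) = Lambda(x) / e
            lam = new_lam
        c = [0] + c                                 # C(x) = x * C(x)
    return lam, L    # L is the estimated number of errors v
```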

So if we have errors instead of erasures (meaning we don't know how many there are or where they are), we can use Berlekamp's Algorithm to find $v$, and then construct the matrix equation shown above in order to solve for $\Lambda_1, \Lambda_2, \mathellipsis \Lambda_v$.

Then once we have the error locator polynomial coefficients, we can solve for its roots, $X_1^{-1}, X_2^{-1}, \mathellipsis X_v^{-1}$ (in practice, by trying every element of the field, an approach known as a Chien search).
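A sketch of that root search, assuming the `GF_EXP` table and `gf_mul` from earlier (position $i$ here means the coefficient of $x^i$ in $R(x)$):

```python
def find_error_positions(lam, n):
    # brute-force Chien search: position i is an error location
    # exactly when Lambda(alpha^{-i}) = 0
    positions = []
    for i in range(n):
        x_inv = GF_EXP[(255 - i) % 255]       # alpha^{-i}
        val = 0
        for coef in reversed(lam):            # Horner, highest power first
            val = gf_mul(val, x_inv) ^ coef
        if val == 0:
            positions.append(i)
    return positions
```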

Then we go all the way back to the first matrix equation:

$$\begin{bmatrix} S_1\\ S_2\\ \vdots\\ S_{2t} \end{bmatrix} = \begin{bmatrix} X_1^1 & X_2^1 & \cdots & X_v^1\\ X_1^2 & X_2^2 & \cdots & X_v^2\\ \vdots & \vdots & \ddots & \vdots\\ X_1^{2t} & X_2^{2t} & \cdots & X_v^{2t} \end{bmatrix} \times \begin{bmatrix} Y_1\\ Y_2\\ \vdots\\ Y_v \end{bmatrix}$$

We now have everything but $Y_1, Y_2, \mathellipsis, Y_v$, and we can simply use a matrix inverse to solve for them.

Plugging everything together, we can solve for $E(x)$:

$$E(x) = Y_1 x^{e_1} + Y_2 x^{e_2} + \mathellipsis + Y_v x^{e_v}$$

Then we simply subtract $E(x)$ from $R(x)$ to get $T(x)$, which contains the original message.

Forney's Algorithm

The Forney algorithm is a more efficient way to solve for $Y_1, Y_2, \mathellipsis Y_v$. It does not require calculating a matrix inverse.

To use it we need an additional polynomial called the Error Magnitude Polynomial, $\Omega(x)$.

$$\Omega(x) = \Omega_{v-1} x^{v-1} + \mathellipsis + \Omega_1 x + \Omega_0$$

$$\Omega(x) = [S(x) \Lambda(x)] \pmod{x^{2t}}$$

where

$$S(x) = S_{2t} x^{2t-1} + \mathellipsis + S_2 x + S_1$$

$$\begin{aligned} \Omega_0 &= S_1\\ \Omega_1 &= S_2 + S_1 \Lambda_1\\ &\vdots\\ \Omega_{v-1} &= S_v + S_{v-1} \Lambda_1 + \mathellipsis + S_1 \Lambda_{v-1} \end{aligned}$$

We mod by $x^{2t}$ in order to truncate the product to terms of degree less than $2t$. $S(x)$ is the Syndrome Polynomial, whose coefficients are the syndromes of $R(x)$.

The derivative of the Error Locator Polynomial evaluated at $X_j^{-1}$ is different from the normal derivative, because we are working with Galois field polynomials.

$$\Lambda'(X_j^{-1}) = \Lambda_1 + \Lambda_3 X_j^{-2} + \Lambda_5 X_j^{-4} + \mathellipsis$$

We can simply set the even-powered terms to 0 and divide the remaining terms by $x$, since the even multiples in the derivative vanish modulo 2.

With this derivative and the error magnitude polynomial, we can solve for the error values $Y_1, Y_2, \mathellipsis Y_v$ with this equation:

$$Y_j = \Omega(X_j^{-1}) / \Lambda'(X_j^{-1})$$

This equation won't work for positions that don't have errors, so we still need to find those positions first, but it is faster than the matrix inverse we would otherwise have to do.
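To close, here is a sketch of the whole Forney step under the conventions above (syndromes $S_1 \mathellipsis S_{2t}$ as a list, $\Lambda(x)$ constant term first, and the error powers $e_j$ from the search above; the naming is mine throughout):

```python
def forney(synds, lam, positions, nsym):
    # Omega(x) = S(x) * Lambda(x) mod x^{2t}, constant term first
    omega = [0] * nsym
    for i in range(nsym):            # S(x) coefficient of x^i is S_{i+1}
        for j in range(len(lam)):
            if i + j < nsym:
                omega[i + j] ^= gf_mul(synds[i], lam[j])

    magnitudes = []
    for e in positions:
        x_inv = GF_EXP[(255 - e) % 255]         # X_j^{-1} = alpha^{-e_j}
        num = 0
        for coef in reversed(omega):            # evaluate Omega(X_j^{-1})
            num = gf_mul(num, x_inv) ^ coef
        den = 0                                 # Lambda'(X_j^{-1}), using
        for k in range(1, len(lam), 2):         # only the odd terms
            den ^= gf_mul(lam[k], GF_EXP[(255 - e) * (k - 1) % 255])
        magnitudes.append(gf_div(num, den))     # Y_j = Omega / Lambda'
    return magnitudes
```

Subtracting (XORing) each $Y_j$ into the received symbol at position $e_j$ then recovers $T(x)$.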