
The Ubiquitous Reed-Solomon Codes

Translation of the article "The Ubiquitous Reed-Solomon Codes" by Barry A. Cipra, published in SIAM News, Volume 26, Number 1, January 1993.
*****

In our so-called information age, no one needs to be reminded that speed is not the only thing that matters when storing, retrieving and transmitting data: accuracy matters too. The principle "what you put in is what you get out" does not always hold here. Machines make mistakes, and their slip-ups, arising through no fault of any person, can turn a flawlessly written program into useless or even dangerous junk. Just as architects design buildings that stay standing even in an earthquake, their computer counterparts have devised sophisticated techniques to counter the workings of Murphy's law in the digital world.

Yet many people who use modern technology may not realize the importance of a five-page article that appeared in 1960 in the journal of the Society for Industrial and Applied Mathematics. In that article Irving Reed and Gustave Solomon, then staff members of MIT's Lincoln Laboratory, presented the ideas that underlie the modern error-correction methods built into a wide range of equipment, from computer hard drives to CD players. Reed-Solomon codes (together, of course, with a great deal of engineering) made it possible to send photographs of the outer planets of the solar system from the Voyager 2 spacecraft back to Earth. They make it possible to enjoy music from a scratched CD. And in the near future they will let insatiable cable-TV providers squeeze more than 500 channels into their systems, a field that is still wide open.
Quick reference

Reed-Solomon codes are non-binary cyclic codes that make it possible to correct errors in blocks of data. The elements of the code vector are not individual bits but groups of bits (blocks); Reed-Solomon codes that operate on bytes (octets) are especially common.

The Reed-Solomon code is a special case of the BCH code.

Today these codes are widely used for recovering data from CDs, for creating archives that carry recovery information in case of damage, and in noise-resistant coding in general.

Details: Reed-Solomon code


"When you talk about CD players and digital audio tape, as well as digital television and many other new digital imaging systems, all of them require Reed-Solomon codes as an integral part of the system," says Robert McEliece, a coding theorist in the electrical engineering department at Caltech.

Why? Because digital information consists, virtually by definition, of strings of "bits" - zeros and ones - and any physical device, no matter how well made, will occasionally confuse the two. Voyager 2, for example, transmitted with incredibly low power, whispering its data across billions of kilometers. Disk drives pack data so tightly that one can almost sympathize with a read/write head that cannot tell where one bit ends and the next one begins. Careful design reduces the error rate to what might seem a negligible level - the industry standard for hard drives is one error in 10 billion - but given the volume of information processed these days, even that "negligible" rate would lead to daily mishaps. Error-correcting codes are a kind of safety net: mathematical insurance against the vicissitudes of an imperfect material world.

The key to error correction is redundancy. Indeed, the simplest error-correcting code is simply to repeat everything a few times. If, for example, you expect no more than one error during transmission, then repeating each bit three times and taking a "majority vote" at the receiving end ensures that the message is heard correctly (111 000 011 111, for example, is correctly heard as 1011). In general, n errors can be compensated for by repeating each bit 2n + 1 times.
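Here is a minimal sketch of that repetition scheme in Python (the function names are illustrative, not from the article):

```python
def repeat_encode(bits, copies=3):
    """Repeat each bit `copies` times (here 2n + 1 = 3, i.e. n = 1 error per group)."""
    return [b for b in bits for _ in range(copies)]

def majority_decode(received, copies=3):
    """Recover each bit by majority vote within its group of copies."""
    decoded = []
    for i in range(0, len(received), copies):
        group = received[i:i + copies]
        decoded.append(1 if sum(group) > copies // 2 else 0)
    return decoded

# The example from the text: 1011 is sent as 111 000 111 111; a single
# flipped bit (111 000 011 111) is still decoded correctly as 1011.
assert majority_decode([1,1,1, 0,0,0, 0,1,1, 1,1,1]) == [1, 0, 1, 1]
```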

But such brute-force error correction would defeat the goal of high-speed, high-density information processing. What one would prefer is an approach that adds only a few extra bits to the original message. Of course, "you can't always get what you want," as Mick Jagger reminds us, "but if you try sometimes, you just might find you get what you need." The success of Reed-Solomon codes bears this out.

By 1960, the theory of error-correcting codes was only about a decade old. The basic theory of reliable digital communication had been laid out by Claude Shannon in the late 1940s. At about the same time, Richard Hamming introduced an elegant approach to single-error correction and double-error detection. Through the 1950s a number of researchers experimented with a variety of error-correcting codes. But, according to McEliece, it was Reed and Solomon who "hit the jackpot" with their paper in the SIAM journal.

The payoff was a coding system based on groups of bits, such as bytes, rather than on individual zeros and ones. That feature makes Reed-Solomon codes particularly good at dealing with bursts of errors: six consecutive bit errors, for example, can affect at most two bytes. Thus even a Reed-Solomon code that corrects just two errors per block can provide an adequate margin of safety. (The current implementation of Reed-Solomon coding in CD technology can cope with error bursts as long as 4,000 consecutive bits.)
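A quick way to check the burst-error claim (a small illustrative script, not from the article): a run of six consecutive flipped bits can land on at most two 8-bit bytes, wherever the run starts.

```python
# Count how many distinct bytes a 6-bit burst touches, for every start position.
affected = {len({(start + k) // 8 for k in range(6)}) for start in range(64)}
print(affected)  # {1, 2} - a six-bit burst never spans more than two bytes
```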

Mathematically, Reed-Solomon codes are based on the arithmetic of finite fields. Indeed, the 1960 paper begins by defining a code as "a mapping from a vector space of dimension m over a finite field K into a vector space of higher dimension over the same field." Starting from a "message" (a_0, a_1, ..., a_{m-1}), where each a_k is an element of the field K, a Reed-Solomon code produces (P(0), P(g), P(g^2), ..., P(g^{N-1})), where N is the number of elements in K, g is a generator of the cyclic group of nonzero elements of K, and P(x) is the polynomial a_0 + a_1*x + ... + a_{m-1}*x^{m-1}. If N is greater than m, then the values of P overdetermine the polynomial, and the properties of finite fields guarantee that the coefficients, that is, the original message, can be recovered from any m of the values.
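As a concrete illustration of the encoding just described, here is a minimal sketch over the small prime field K = GF(7), where field arithmetic is simply arithmetic mod 7. (The byte-oriented codes discussed in the article use GF(2^8) instead; the names rs_encode and poly_eval and the choice g = 3 are illustrative assumptions, not taken from the original paper.)

```python
P_MOD = 7  # size of the field K (a prime here, so arithmetic is just mod 7)
G = 3      # 3 generates the multiplicative group of GF(7): 3, 2, 6, 4, 5, 1

def poly_eval(coeffs, x, p=P_MOD):
    """Evaluate P(x) = a_0 + a_1*x + ... + a_{m-1}*x^{m-1} over GF(p)."""
    result = 0
    for a in reversed(coeffs):        # Horner's rule
        result = (result * x + a) % p
    return result

def rs_encode(message, p=P_MOD, g=G):
    """Map the message (a_0, ..., a_{m-1}) to (P(0), P(g), P(g^2), ..., P(g^{N-1}))."""
    points = [0] + [pow(g, k, p) for k in range(1, p)]  # all N = 7 elements of K
    return [poly_eval(message, x, p) for x in points]

# A 3-symbol message becomes a 7-symbol codeword: the four extra values are
# redundancy, since any 3 correct values already determine the polynomial.
print(rs_encode([2, 5, 1]))  # [2, 5, 2, 5, 3, 3, 1]
```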

Conceptually, a Reed-Solomon code specifies a polynomial by "plotting" a large number of points. And just as the eye can recognize and correct a couple of "bad" points in what is otherwise a smooth parabola, a Reed-Solomon code can identify incorrect values of P and still recover the original message. A dollop of combinatorial reasoning (and a little linear algebra) establishes that this approach can cope with up to s errors, as long as m, the length of the message, is strictly less than N - 2s.
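To make the "overdetermined polynomial" idea concrete, the sketch below continues the GF(7) example above and recovers the message coefficients from any m error-free values by Lagrange interpolation. (Practical decoders must first locate the erroneous values, for instance with the Berlekamp algorithm mentioned later in the article; this sketch assumes we already know which values to trust.)

```python
P_MOD = 7

def poly_mul_linear(poly, r, p=P_MOD):
    """Multiply a polynomial (ascending coefficient list) by (x - r) over GF(p)."""
    out = [0] * (len(poly) + 1)
    for k, c in enumerate(poly):
        out[k] = (out[k] - r * c) % p      # contributes -r*c to the x^k term
        out[k + 1] = (out[k + 1] + c) % p  # contributes  c   to the x^(k+1) term
    return out

def lagrange_interpolate(points, p=P_MOD):
    """Recover the coefficients (a_0, ..., a_{m-1}) of the unique polynomial of
    degree < m passing through the m given (x, y) points over GF(p)."""
    m = len(points)
    coeffs = [0] * m
    for i, (xi, yi) in enumerate(points):
        basis, denom = [1], 1
        for j, (xj, _) in enumerate(points):
            if j != i:
                basis = poly_mul_linear(basis, xj, p)  # numerator:   prod (x - xj)
                denom = (denom * (xi - xj)) % p        # denominator: prod (xi - xj)
        scale = (yi * pow(denom, -1, p)) % p           # yi / denom in GF(p)
        coeffs = [(c + scale * b) % p for c, b in zip(coeffs, basis)]
    return coeffs

# Any 3 of the 7 codeword values from the encoding sketch recover the message:
assert lagrange_interpolate([(0, 2), (3, 5), (4, 3)]) == [2, 5, 1]
```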

In today's byte-sized world, for example, it makes sense to take K to be the field of degree 8 over Z_2, so that each element of K corresponds to a single byte (in computer jargon, there are four bits to a nibble and two nibbles to a byte). In that case N = 2^8 = 256, and so messages up to 251 bytes long can be recovered even if two errors occur in transmitting the values P(0), P(g), ..., P(g^{255}). That is far better than the 1,255 bytes required by the say-everything-five-times approach.
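The arithmetic behind these figures, spelled out as a small check (nothing here beyond the numbers already stated in the text):

```python
N, s = 2**8, 2         # 256 field elements, two correctable errors
m = N - 2*s - 1        # largest message length strictly below N - 2s
print(m)               # 251 bytes of message ...
print(N)               # ... carried by 256 transmitted values of P
print(m * (2*s + 1))   # 1255 bytes if every byte were instead sent 5 times
```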

Despite their advantages, Reed-Solomon codes did not come into use immediately - they had to wait for hardware technology to catch up. "In 1960, there was no such thing as fast digital electronics" - at least not by today's standards, says McEliece. The Reed-Solomon paper "suggested some nice ways to process data, but nobody knew whether it was practical or not, and in 1960 it probably wasn't practical."

But the technology improved, and many researchers set to work on implementing the codes. One of the key figures was Elwyn Berlekamp, a professor of electrical engineering at the University of California at Berkeley, who invented an efficient algorithm for decoding Reed-Solomon codes. Berlekamp's algorithm was used by Voyager 2 and is the basis for decoding in CD players. Numerous "bells and whistles" have been added since (some of fundamental theoretical significance). CDs, for example, use a version of Reed-Solomon coding with cross-interleaving, known as CIRC.

Reed, now a professor of electrical engineering at the University of Southern California, is still working on problems in coding theory. Solomon, who recently left the Hughes Aircraft Company, consults for the Jet Propulsion Laboratory. Reed was among the first to recognize that abstract algebra could serve as the basis for error-correcting codes.

"In hindsight it seems obvious," he told SIAM News. He added, however: "coding theory was not a subject when we published that paper." Both authors knew they had a nice result; they had no idea what impact the paper would have. Three decades later, that impact is clear. The enormous number of applications, both in use and planned, has settled the question of the practicality and significance of Reed-Solomon codes. "They're obviously practical, because everyone is using them now," says Berlekamp. Billions of dollars' worth of modern technology depends on ideas that stem from Reed and Solomon's original paper. In short, "it was a very important paper," says McEliece.

- Translation: © Wio, intnzy, aidsfrag.

Source: https://habr.com/ru/post/84041/

