Glossary

Full glossary

Run-length encoding

What is run-length encoding?

Run-length encoding (RLE) is a lossless compression method where sequences that display redundant data are stored as a single data value. This value represents the repeated block, and shows how many times it appears in the image. During decompression, the image can be reconstructed exactly from this information.

This type of compression works best with simple images and animations that have a lot of redundant pixels. It is useful for black and white images in particular. For complex images and animations that do not have many redundant sections, RLE can make the file size bigger rather than smaller. Therefore it is important to understand the content and whether this algorithm will help or hinder compression.

A graphic showing how run-length encoding works by compressing a string of the same letter down to 1 of the letter and a number representing the number of times the letter appears in a row.

History of run-length encoding

This technique was first patented by Hitachi in 1983, and was regularly used when transmitting analog television signals, even as early as 1967. It is less popular today because there are other more advanced options available. You can still find it in use for color fax machines, icons, line drawings, and simple animations. The method is also used in TIFF and PDF files.

Run-length encoding and LZ77

LZ77 is an algorithm that compresses data in a similar way to RLE. It replaces repeated occurences of data with a reference to a single copy of that data that exists earlier in the uncompressed stream. If there is a match to a chunk of data, then LZ77 encodes a match that represents the distance between the two chunks of data and the length of each chunk. To find matches, LZ77 has to track a chunk of recent data, which can be of varying sizes. The bigger the chunk of recent data, the bigger a chunk it can search through for a match. LZ77 is also called sliding-window compression because it is viewing a chunk of data that can be of varying sizes to find matches. Pairs can repeat multiple times, which makes LZ77 similar to RLE, just a bit more complex.

The concept used to create RLE also appears in LZ78, and other popular compression methods we still use today, like GIF and DEFLATE. The DEFLATE algorithm appears in PNG and ZIP.

Drawbacks of run-length encoding

The main drawback of RLE is that it is best suited for simple and repetitive data only. Here are a few more aspects to keep in mind when working with RLE:

  • The original data is ot instantly accessible, you must decode everything before you can access anything.
  • You cannot tell how large the decoded data will be. This can cause problems when you have limited space to decompress the file in. In this case, you can either store the size of the original data as metadata, or use this compression method with a dynamic buffer and enough space to decompress the content in.

Learn more about the history of video compression and the different algorithms that have appeared over time in our blog article about The History of Video Compression Standards, From 1929 Until Now.