LZ77 and LZ78

From MobileRead
Jump to: navigation, search

LZ77 and LZ78 were early compression formats that revolutionized the saving of digital data.

[edit] LZ77

Digital data often has data that remains the same for several bytes or repeats itself over and over. This was recognized early on and in 1977 an algorithm was devised by Lempel and Ziv to reduce the size of the files. The LZ77 algorithm achieve compression by replacing portions of the data with references to matching data that has already passed through both encoder and decoder. A match is encoded by a pair of numbers called a length-distance pair, which is equivalent to the statement "each of the next length characters is equal to the character exactly distance characters behind it in the uncompressed stream." (The "distance" is sometimes called the "offset" instead.)

It was most successful on graphics images what had backgrounds that covered a large area of the picture. LZ77 is an encoding scheme that replaces repeating data with a pointer to previous data that is identical to the current data. A pointer to a large amount of the same data can take much less space than the data it replaces, hence the size reduction of the file. See PalmDOC for a format that uses LZ77 with a detailed explanation.

[edit] LZ78

LZ78 was an improvement to LZ77 and was released a year later. While LZ77 uses the actual previous data in the stream to point to the LZ78 format uses a dictionary of reusable data that can be pointed to. These can be shown to be basically the same thing. The only difference is that the location of the data is easier to find if it is arranged in a dictionary which can improve performance. A minor variation is called LZH. This and the previous LZ77 have been replaced with LZW.

[edit] For more information

Personal tools
Namespaces

Variants
Actions
Navigation
MobileRead Networks
Toolbox