Incremental encoding , also known as front compression or rear compression , is a type of delta encoding where common prefixes or suffixes and their lengths are written in such a way as to avoid duplication of data. This algorithm is well suited for compressing sorted data , such as a list of words in a dictionary .
For example:
| Input data | General prefix | Compressed output |
|---|---|---|
myxa myxophyta myxopod nab nabbed nabbing nabit nabk nabob nacarat nacelle | start of data 'myx' 'myxop' no common prefix 'nab' 'nabb' 'nab' 'nab' 'nab' 'na' 'nac' | 0 myxa 3 ophyta 5 od 0 nab 3 bed 4 ing 3 it 3 k 3 ob 2 carat 3 elle |
| 64 bytes | 46 bytes |
This method was used as the base for the GNU locate utility in the index of file and directory names. Also, delta encoding is used for common prefix lengths. This means an additional step in which instead of the length of the common prefix, a change in the length of the common prefix is used.
Despite its simplicity, incremental encoding can save a lot of memory, especially when used in front of other archivers such as gzip or bzip2 .