Data Compression: The Complete Reference, Fourth Edition

The methods discussed so far have one feature in common: they assign fixed-size codes to the symbols (characters or pixels) they operate on. In contrast, statistical methods use variable-size codes, with the shorter codes assigned to symbols or groups of symbols that appear more often in the data (have a higher probability of occurrence). Designers and implementors of variable-size codes have to deal with the two problems of (1) assigning codes that can be decoded unambiguously and (2) assigning codes with the minimum average size.
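The two problems can be illustrated with a short sketch. The four-symbol alphabet, its probabilities, and the three code tables below are hypothetical, chosen only for illustration. The check used here is the prefix property (no code is the prefix of another), which is one sufficient condition for unambiguous decoding but not the only one.

```python
from itertools import combinations

def is_prefix_free(codes):
    """Return True if no codeword is a prefix of (or equal to) another."""
    words = list(codes.values())
    return not any(a.startswith(b) or b.startswith(a)
                   for a, b in combinations(words, 2))

def average_size(codes, probs):
    """Average code size in bits per symbol, weighted by symbol probability."""
    return sum(probs[s] * len(codes[s]) for s in codes)

# Hypothetical alphabet and probabilities (assumed, not taken from the text).
probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}

fixed = {"a": "00", "b": "01", "c": "10", "d": "11"}     # fixed-size codes
variable = {"a": "0", "b": "10", "c": "110", "d": "111"}  # shorter codes for likelier symbols
ambiguous = {"a": "0", "b": "1", "c": "01", "d": "10"}    # "01" decodes as "ab" or as "c"

for name, codes in [("fixed", fixed), ("variable", variable), ("ambiguous", ambiguous)]:
    print(name, is_prefix_free(codes), average_size(codes, probs))
```

Under these assumed probabilities, the variable-size table averages 1.75 bits per symbol against 2 bits for the fixed-size table, while the third table is shorter still but cannot be decoded unambiguously.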
Samuel Morse used variable-size codes when he designed his well-known telegraph code (Table 2.1). It is interesting to note that the first version of his code, developed by Morse during a transatlantic voyage in 1832, was more complex than the version he settled on in 1843. The first version sent short and long dashes that were received and drawn on a strip of paper, where sequences of those dashes represented numbers. Each word (not each letter) was assigned a code number, and Morse produced a code book (or dictionary) of those codes in 1837. This first version was therefore a primitive form of compression. Morse later abandoned this version in favor of his famous dots and dashes, developed together with Alfred Vail.
| A | .- | N | -. | 1 | .---- | Period | .-.-.- |
| B | -... | O | --- | 2 | ..--- | Comma | --..-- |
| C | -.-. | P | .--. | 3 | ...-- | Colon | ---... |
| Ch | ---- | Q | --.- | 4 | ....- | Question... |