Applied Speech and Audio Processing: With MATLAB Examples

Coding techniques that follow, or try to predict, a waveform shape tend to be relatively simple and consequently achieve limited results. These techniques typically assume very little about the waveform being coded except perhaps maximum extents and slew rate. There is a trade-off between coding quality and bitrate, and very little room to manoeuvre toward the ideal of a high-fidelity coding scheme with very low bitrate.
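The quality-versus-bitrate trade-off can be illustrated with the simplest waveform coder of all, a uniform quantiser that knows nothing about the signal except its maximum extent. The following is a minimal sketch in Python, not a coder taken from the text: the 8 kHz sampling rate, the 437 Hz test tone and the function names are all assumptions made for illustration.

```python
import math

def quantise_uniform(sample, bits, max_extent=1.0):
    """Mid-rise uniform quantiser over [-max_extent, max_extent]."""
    levels = 2 ** bits
    step = 2.0 * max_extent / levels
    # Find the quantiser cell, clip to the valid range, rebuild at the cell centre.
    index = math.floor(sample / step)
    index = max(-(levels // 2), min(levels // 2 - 1, index))
    return (index + 0.5) * step

def snr_db(original, decoded):
    """Signal-to-noise ratio of the decoded waveform, in dB."""
    signal_power = sum(x * x for x in original)
    noise_power = sum((x - y) ** 2 for x, y in zip(original, decoded))
    return 10.0 * math.log10(signal_power / noise_power)

fs = 8000  # assumed sampling rate in Hz
# Hypothetical test signal: one second of a 437 Hz tone at 0.9 of full scale.
tone = [0.9 * math.sin(2 * math.pi * 437.0 * n / fs) for n in range(fs)]

results = {}
for bits in (4, 8, 12):
    decoded = [quantise_uniform(x, bits) for x in tone]
    bitrate = bits * fs  # bits per second
    results[bits] = (bitrate, snr_db(tone, decoded))
    print(f"{bits:2d} bits/sample  {bitrate:6d} bit/s  SNR {results[bits][1]:5.1f} dB")
```

Each extra bit per sample buys roughly 6 dB of signal-to-noise ratio but costs another 8000 bit/s at this sampling rate, so quality and bitrate rise together and neither can be improved without paying in the other.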
Instead of coding the physical waveform directly, researchers hit upon the idea of parameterising the sound in some way: several values are chosen to represent important aspects of the speech signal. Whatever parameters are chosen to represent the waveform are then transmitted from coder to decoder, where they are used to recreate a similar (but not identical) waveform.
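As an illustration of the idea, the sketch below parameterises one frame of a toy signal as a handful of values (all-pole filter coefficients, a pitch period and a gain) and rebuilds a similar waveform from them at the decoder. It is a simplified linear-prediction example under stated assumptions, not a scheme described in the text: the frame length, the filter order of 2, the lag search range and the synthetic test signal are all hypothetical choices, and practical speech coders typically use a filter order of around 10.

```python
import math

FS = 8000         # assumed sampling rate in Hz
FRAME_LEN = 240   # assumed frame length (30 ms)
ORDER = 2         # toy filter order, chosen to match the toy signal

def all_pole_filter(excitation, coeffs):
    """y[n] = e[n] + sum over k of coeffs[k-1] * y[n-k]."""
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(coeffs, start=1):
            if n - k >= 0:
                acc += a * out[n - k]
        out.append(acc)
    return out

def impulse_train(length, period):
    return [1.0 if n % period == 0 else 0.0 for n in range(length)]

def rms(x):
    return math.sqrt(sum(v * v for v in x) / len(x))

def autocorrelation(frame, max_lag):
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Predictor coefficients a[1..order] for x[n] ~ sum a[k] x[n-k]."""
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:]

def encode(frame):
    """Reduce a frame to filter coefficients, pitch period and gain."""
    r = autocorrelation(frame, 160)
    pitch = max(range(20, 161), key=lambda lag: r[lag])  # strongest lag
    return {"coeffs": levinson_durbin(r, ORDER), "pitch": pitch, "gain": rms(frame)}

def decode(params, length):
    """Rebuild a similar (not identical) waveform from the parameters."""
    raw = all_pole_filter(impulse_train(length, params["pitch"]), params["coeffs"])
    scale = params["gain"] / rms(raw)
    return [scale * v for v in raw]

# Toy "speech": a 500 Hz resonance excited by pulses every 80 samples (100 Hz).
true_coeffs = [2 * 0.95 * math.cos(2 * math.pi * 500 / FS), -(0.95 ** 2)]
signal = all_pole_filter(impulse_train(400 + FRAME_LEN, 80), true_coeffs)
frame = signal[400:]  # skip the start-up transient

params = encode(frame)
decoded = decode(params, FRAME_LEN)
n_values = len(params["coeffs"]) + 2  # coefficients + pitch + gain
print(f"{FRAME_LEN} samples represented by {n_values} parameters: {params}")
```

Only the parameter values cross from coder to decoder, so the frame of 240 samples is represented by four numbers here. The decoder's output matches the original in gain and pitch but is regenerated from the model rather than copied, which is why it is similar to, not identical with, the input.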
Apart from the likelihood that the transmitted parameters require fewer bits to represent than a directly coded waveform, parameterisation can hold two other benefits. Firstly, if the parameters are chosen to be particularly relevant to the underlying sound (i.e. a better match to the speech signal), then the difference between the original and the coded-decoded signal can be reduced, leading to better fidelity. Secondly, the method of quantising the parameters themselves, or rather the number of bits assigned to each parameter, can be carefully chosen to improve quality. In simpler terms, when given a pool of bits that are allowed to represent the parameters being transmitted from encoder to decoder, it is possible to spend more bits on...