Voice coding (cont.)

The source coding discussion so far has focused on a particular type of linear waveform encoding which, while very flexible and widely used, leads to a high data rate for the encoded signal (64 kbps per telephone voice channel). In many applications transmission capacity is limited, often because of restricted channel bandwidth, and it is desirable to achieve a much lower data rate for the encoded signal, ideally without incurring any significant degradation in the perceived (subjective) quality of the signal.
If we are not concerned with preserving the waveform shape exactly, but rather wish to maintain the subjective quality of the received signal (its visual or audio fidelity), then we can move away from waveform coding and use more sophisticated frequency-domain or source-modelling (parametric) methods.
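
The 64 kbps figure can be checked with a short calculation. The sketch below assumes the usual telephony parameters of 8000 samples per second and 8 bits per sample, which are not stated in the text above; the function names are illustrative only. It also shows the reduction factor implied by a lower-rate coder, using the figure of about 7000 bps quoted later in this section.

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample):
    """Bit rate of a waveform codec that sends one fixed-size word per sample."""
    return sample_rate_hz * bits_per_sample


def compression_ratio(reference_bps, coded_bps):
    """How many times lower the coded rate is than the reference rate."""
    return reference_bps / coded_bps


# Assumed telephony parameters: 8 kHz sampling, 8-bit words.
waveform_bps = pcm_bit_rate(8000, 8)
print(waveform_bps)  # 64000

# Reduction factor for a coder running at roughly 7000 bps.
print(round(compression_ratio(waveform_bps, 7000), 1))
```

Because the rate of a waveform codec is fixed by the sampling rate and word length, a reduction of this size cannot be obtained simply by sending the same kind of samples more coarsely without a clear loss of quality, which is the motivation for the parametric methods mentioned above.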

The subject of voice coding is vast, and falls outside the scope of this book. A good reference is Papamichalis (1987). A technique that is proving to give good voice quality and good compression performance is based on linear predictive coding, which in part attempts to model the actual human voice synthesis process. Data rates of the order of 7000 bps are now considered to give acceptable quality for voice telephone communications (compared with 64 kbps for the standard waveform codec), and these techniques are used extensively in modern digital cellular systems.
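
To give a flavour of the analysis step in linear predictive coding, the following is a minimal sketch, not the method of any particular codec or of the reference cited above. It assumes the common autocorrelation approach solved with the Levinson-Durbin recursion: each sample is predicted as a weighted sum of the previous few samples, and the weights are chosen to minimise the prediction error over a frame. The function names, the predictor order and the synthetic test signal are all illustrative assumptions.

```python
import random


def autocorrelation(frame, max_lag):
    """Return the autocorrelation values r[0..max_lag] of one frame."""
    n = len(frame)
    return [
        sum(frame[i] * frame[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Find predictor coefficients a[1..order] such that s[n] is
    approximated by sum(a[k] * s[n - k]) for k = 1..order.

    Returns (coefficients, residual prediction error energy).
    Assumes r[0] > 0, i.e. the frame is not silent.
    """
    a = [0.0] * (order + 1)  # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err  # reflection coefficient for this stage
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        err *= 1.0 - k * k
    return a[1:], err


# Synthetic test frame (an assumption for illustration): noise passed
# through a known second-order all-pole filter, standing in for speech.
random.seed(0)
true_coeffs = [1.3, -0.6]
frame = [0.0, 0.0]
for _ in range(4000):
    prediction = true_coeffs[0] * frame[-1] + true_coeffs[1] * frame[-2]
    frame.append(prediction + random.gauss(0.0, 1.0))

r = autocorrelation(frame, 2)
estimated, residual_energy = levinson_durbin(r, 2)
print(estimated)  # should lie close to true_coeffs
```

The point of the sketch is that a frame of several thousand samples is summarised by a handful of coefficients. A coder built on this idea would send such coefficients, together with a compact description of the excitation, in place of the samples themselves, which is where the saving in data rate comes from.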