Author: Werner Ogiers - TNT Belgium
This is probably the shortest article of my career as an audio scribbler. I also would never have thought that, it being almost 2010, there would still be a need for this article.
But there is. Looking around on various audio fora there still appear to be many people, far too many people, who mistrust dither, feeling that it works only for periodic test signals, and not for real music.
That's a bit amazing, the more so as the benefits of dither are easy enough to demonstrate. In the 1990s British academic Christopher Hicks had an excellent personal webpage doing just that. But that site is long gone, although one of Hicks' papers can still be downloaded at the Benchmark site... Nowadays we've got Wikipedia, but while it sports the relevant information, it mixes up audio and images all of the time (which is educational in its own right), and there's no music. Hence my effort here.
In a sampled (=time-discrete) and quantised (=level-discrete) system, such as digital audio is, the digitised signal can only occupy a finite set of amplitude levels, whereas in its analogue, non-quantised form, the signal can take on any value between zero and peak level (bar the always-present uncertainty due to noise: analogue really has no 'infinite resolution', but that's another story).
The quantised signal then can be regarded as a perfect clone of the original signal, plus an error term. That error term is formally called "quantisation noise", although a much better name would be "quantisation distortion": contrary to noise, this error signal is highly correlated with the actual signal, it sings in tune with it. And, contrary to mere noise, this makes the quantisation distortion highly offensive.
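To make the "clone plus error term" idea concrete, here is a minimal sketch in Python. It is not the tool used for the clips in this article: the full scale of -1.0 to +1.0, the rounding rule and the name `quantise` are assumptions made for illustration. It quantises two periods of a 300 Hz sine to 4 bits and checks that the error never exceeds half a step and that it repeats exactly with the signal's period, which is what makes it tonal rather than noise-like.

```python
import math

FS = 44100              # sample rate in Hz, as used for the clips in this article
BITS = 4                # word length of the quantiser
STEP = 2.0 / 2**BITS    # step size for an assumed full scale of -1.0 .. +1.0

def quantise(x):
    """Round x to the nearest of the 16 levels (mid-tread, clipped)."""
    code = math.floor(x / STEP + 0.5)
    code = max(-2**(BITS - 1), min(2**(BITS - 1) - 1, code))
    return code * STEP

# A 300 Hz sine at half of full scale; one period is exactly 147 samples.
period = FS // 300
signal = [0.5 * math.sin(2 * math.pi * 300 * n / FS) for n in range(2 * period)]
quantised = [quantise(x) for x in signal]

# The quantised signal is the original plus an error term.
error = [q - x for q, x in zip(quantised, signal)]

levels_used = sorted(set(quantised))
max_error = max(abs(e) for e in error)
# Compare the error in the second period with the error in the first one.
repeat_mismatch = max(abs(error[n] - error[n + period]) for n in range(period))

print(len(levels_used), "levels used, largest error:", max_error)
print("difference between the error in period 1 and period 2:", repeat_mismatch)
```

Because the error is locked to the signal's period, all of its energy lands on discrete tones related to the signal instead of being spread out as hiss.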
Dither is called to the rescue. Adding a little bit of wide-band noise to the signal before the quantisation, before the reduction to discrete levels, cures this. While seemingly too fantastic to be true, this can easily be explained: the quantiser now has to quantise the signal plus the noise, all the time. And it will make the same errors as before, only this time the errors will be correlated with the noise rather than the signal. And something that correlates with ('looks like') noise, well, is just noise itself. Applying dither linearises the quantiser's transfer function, turning massive non-linear signal distortion into wide-band and relatively innocent noise. The only price to pay is that 'silence' in this new digital audio system no longer is perfectly-black silence, but a bit hissy. Like a very very good analogue tape.
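The linearising effect is easiest to see with a tone that is smaller than half a quantisation step. The sketch below is again only an illustration under assumptions (full scale of -1.0 to +1.0, TPDF dither built from two uniform random draws, the helper names `tpdf` and `level_at_300hz`, and a fixed random seed). Without dither the 4 bit quantiser turns such a tone into pure digital silence; with dither the tone is still present in the output, buried in hiss but at its original level.

```python
import math
import random

FS = 44100
STEP = 2.0 / 2**4          # 4-bit step size for an assumed -1.0 .. +1.0 full scale

def quantise(x):
    """Round to the nearest 4-bit level (mid-tread, clipped to 16 codes)."""
    code = max(-8, min(7, math.floor(x / STEP + 0.5)))
    return code * STEP

def tpdf(rng):
    """Triangular-PDF dither spanning +/- 1 step: the sum of two uniform draws."""
    return (rng.random() - 0.5 + rng.random() - 0.5) * STEP

rng = random.Random(2009)   # fixed seed so that the run is repeatable
amplitude = 0.4 * STEP      # a tone smaller than half a quantisation step
tone = [amplitude * math.sin(2 * math.pi * 300 * n / FS) for n in range(FS)]

plain = [quantise(x) for x in tone]                  # no dither
dithered = [quantise(x + tpdf(rng)) for x in tone]   # dither added before quantising

def level_at_300hz(samples):
    """Amplitude of the 300 Hz component, found by correlating with a 300 Hz sine."""
    acc = sum(s * math.sin(2 * math.pi * 300 * n / FS) for n, s in enumerate(samples))
    return 2 * acc / len(samples)

print("undithered 300 Hz level, in steps:", level_at_300hz(plain) / STEP)
print("dithered   300 Hz level, in steps:", level_at_300hz(dithered) / STEP)
```

The undithered output contains nothing but zeros, while the dithered output carries the tone at very nearly its original 0.4 step amplitude.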
But there's even better. Noise-shaping is a dither-related process that not just adds noise, but for each and every sample looks at the introduced quantisation error, and puts that error in a frequency-shaped (filtered) feedback loop. The result is less noise in the critical mid-band, where the ear is most sensitive, and more noise (but less audible so) in the treble region.
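The feedback loop can be sketched in a few lines. Note the assumptions: this is the simplest possible first-order error feedback, not the "very aggressive" shaping filter used for the clips further on, and the function names and the crude 32-sample averaging used as a stand-in for "the low and mid band" are mine. The sketch compares the low-frequency noise power of plain TPDF dither with that of the noise-shaped version.

```python
import math
import random

FS = 44100
STEP = 2.0 / 2**4   # 4-bit step size for an assumed -1.0 .. +1.0 full scale

def quantise(x):
    """Round to the nearest 4-bit level (mid-tread, clipped to 16 codes)."""
    code = max(-8, min(7, math.floor(x / STEP + 0.5)))
    return code * STEP

def tpdf(rng):
    """Triangular-PDF dither spanning +/- 1 step."""
    return (rng.random() - 0.5 + rng.random() - 0.5) * STEP

def dither_only(signal, rng):
    return [quantise(x + tpdf(rng)) for x in signal]

def noise_shaped(signal, rng):
    """First-order error feedback: subtract the previous sample's error."""
    out, prev_error = [], 0.0
    for x in signal:
        wanted = x - prev_error           # input corrected by the fed-back error
        y = quantise(wanted + tpdf(rng))
        prev_error = y - wanted           # total error made on this sample
        out.append(y)
    return out

def low_band_noise(signal, output, span=32):
    """Mean power of the error after a crude low-pass (a running average)."""
    err = [y - x for x, y in zip(signal, output)]
    total, acc = 0.0, 0.0
    for n, e in enumerate(err):
        acc += e
        if n >= span:
            acc -= err[n - span]          # drop the sample that left the window
        if n >= span - 1:
            total += (acc / span) ** 2
    return total / (len(err) - span + 1)

signal = [0.5 * math.sin(2 * math.pi * 300 * n / FS) for n in range(FS)]
flat = low_band_noise(signal, dither_only(signal, random.Random(1)))
shaped = low_band_noise(signal, noise_shaped(signal, random.Random(2)))
print("low-band noise, shaped relative to flat dither:",
      10 * math.log10(shaped / flat), "dB")
```

The output error of this loop is the difference between successive quantiser errors, a high-pass operation, so the noise is pushed towards the treble. A real noise shaper uses a higher-order filter matched to the ear's sensitivity, but the principle is the same.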
If that's all gibberish to you then don't worry. I don't want to educate you on dither: if you truly are interested then read Hicks' paper. All I want to do is let you listen to some sound samples, and let you make up your own mind as to dither being effective or just another delusion of engineering types.
All of the following sound clips were made at 4 bit effective resolution, and 44.1kHz sample rate. You read that correctly: you are going to listen to 4 bit audio, with a paltry 16 quantisation steps per sample. That's some 4000 times less resolution than what CD offers with its 65536 steps.
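For those who like to see the arithmetic, a small sketch. The level counts follow directly from the word lengths; the decibel figure uses the textbook formula for a full-scale sine through an undithered quantiser, which is my addition and not a measurement of the clips.

```python
import math

def levels(bits):
    """Number of quantisation steps available at a given word length."""
    return 2 ** bits

def sine_snr_db(bits):
    """Textbook signal-to-noise ratio of a full-scale sine, undithered."""
    return 20 * math.log10(2 ** bits) + 10 * math.log10(1.5)

for bits in (4, 16):
    print(bits, "bit:", levels(bits), "levels, about",
          round(sine_snr_db(bits)), "dB")
print("ratio of level counts:", levels(16) // levels(4))
```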
The first clip is a 300 Hertz sine wave at medium level, followed by a fade to silence, as raw unprocessed 4 bit in all its innate glory:
You have to admit it sounds exactly like it looks (ignore the small wiggles in the graph, these are an artefact of the display routine and not part of the actual audio samples).
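A clip of this kind can be recreated along the following lines. The timing (one second of tone, then a two second linear fade), the level of half full scale and the helper names are assumptions, as the exact recipe of the original clip is not given here. The sketch also shows what happens during the fade: once the tone drops below half a quantisation step the undithered output falls abruptly to digital silence, well before the input itself has faded out.

```python
import math
import struct
import wave

FS = 44100
STEP = 2.0 / 2**4   # 4-bit step size for an assumed -1.0 .. +1.0 full scale

def quantise(x):
    """Round to the nearest 4-bit level (mid-tread, clipped to 16 codes)."""
    code = max(-8, min(7, math.floor(x / STEP + 0.5)))
    return code * STEP

def envelope(n, hold=FS, fade=2 * FS):
    """Assumed timing: 1 s at half scale, then a 2 s linear fade to zero."""
    if n < hold:
        return 0.5
    return 0.5 * max(0.0, 1.0 - (n - hold) / fade)

total = 3 * FS
source = [envelope(n) * math.sin(2 * math.pi * 300 * n / FS) for n in range(total)]
clip = [quantise(x) for x in source]   # raw 4 bit, no dither

# Index of the last sample that is not digital silence.
last_sound = max(n for n, q in enumerate(clip) if q != 0.0)
print("output goes silent after", last_sound / FS, "s;",
      "the input envelope there is still", envelope(last_sound) / STEP, "steps")

def write_wav(path, samples):
    """Store samples (-1.0 .. +1.0) as 16-bit mono PCM; only 16 levels occur."""
    frames = b"".join(struct.pack("<h", int(s * 32767)) for s in samples)
    with wave.open(path, "wb") as out:
        out.setnchannels(1)
        out.setsampwidth(2)
        out.setframerate(FS)
        out.writeframes(frames)
```

Calling `write_wav("undithered.wav", clip)` (the file name is just an example) gives a file that any player can open: the container is 16 bit, but only the 16 levels of the 4 bit quantiser are ever used.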
The next two clips are the same signal again, at 4 bit resolution, but first with flat-spectrum Triangular Probability Distribution dither, and then again with very aggressive noise-shaping:
The above graph is a spectral analysis of our three test signals. The green plot is the 4-bit undithered 300Hz sine: there is the fundamental at 300Hz, plus a spray of harmonics caused by the quantisation, and then above 3kHz additional non-harmonic distortion components, these caused by the harmonics aliasing off the 22.05kHz Nyquist frequency.
The red plot is the dithered version of the same: there is the fundamental, a marked lack of distortion components, and obviously a high wide-band background noise. Noise shaping cures the latter, as the purple plot testifies: in the critical mid-band the noise level is on average 20dB below that of the dithered tone, while it is much higher around 19kHz, where noise is much less audible, if at all.
All fine and dandy then, but this is still with test tones and of little practical value, not even for convincing the naysayers. Therefore, the music ...
Well ... since you did all the work I'll leave the conclusion to you too. Happy listening.
© Copyright 2009 Werner Ogiers - www.tnt-audio.com