Jitter explained - Part 1.3

DIGITabilis: crash course on digital audio interfaces

Part 1.3 - The Digital Enemy

What is jitter?

We often use the term jitter. But what really is jitter, and what are its effects? In this article we try to give the final word on this much-discussed topic. And, well, yes, uh, final... I had better say honestly from the beginning that I am not sure we fully succeed....

In digital electronics, the term clock refers to a very simple signal: a sequence of zeroes and ones, often with the ones lasting exactly as long as the zeroes. Such a simple signal carries only one piece of information: timing.
In practice it represents a basic measure of time, like a metronome, and like a metronome it is simply used by more or less complex circuits (synchronous circuits) to decide when to change their state. A typical example of such a circuit is a quartz watch, and a digital audio system is another.

Any real-world clock is affected by a precision and accuracy problem. In fact, the frequency of the clock is never exactly the one required: absolute precision and accuracy do not exist in practice. The difference between the two becomes clearer if you look at the clock from different timing perspectives, short term or long term.

Continuing with the quartz watch example, in the long term you sometimes see that your watch is always slow or fast, that is, its clock frequency is not exactly the nominal one: this is a matter of accuracy. Clearly, any watch needs to have reasonable accuracy to be of any use.
In the short term, you can be sure that each one-second interval measured by the watch is very slightly different from the next and the previous one, even though this is normally impossible for us to detect. This means that the clock is not perfectly precise either.

The same applies to any clock: the short-term timing error that affects each clock edge and cycle is digital clock jitter.
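The two kinds of error can be made visible with a small simulation. What follows is only a sketch: the function names, the 50 ppm frequency error and the 100 ps edge jitter are illustrative assumptions, not measurements of any real clock. It generates the edge times of an imperfect clock and then separates the long-term frequency error from the short-term, edge-by-edge error.

```python
import random
import statistics

def clock_edges(nominal_period, n_edges, freq_error_ppm=0.0, jitter_rms=0.0, seed=0):
    """Return the edge times (seconds) of a simulated, imperfect clock.

    freq_error_ppm models the long-term error (positive = clock runs fast);
    jitter_rms models an independent random timing error on every edge.
    Both are hypothetical parameters chosen for illustration.
    """
    rng = random.Random(seed)
    # A clock that runs fast has a slightly shorter real period.
    real_period = nominal_period / (1.0 + freq_error_ppm * 1e-6)
    return [k * real_period + rng.gauss(0.0, jitter_rms) for k in range(n_edges)]

def describe(edges, nominal_period):
    """Split the clock error into a long-term and a short-term figure."""
    periods = [b - a for a, b in zip(edges, edges[1:])]
    mean_period = statistics.fmean(periods)
    # Long term: how far the average frequency is from the nominal one.
    long_term_ppm = (nominal_period / mean_period - 1.0) * 1e6
    # Short term: how much individual periods scatter around their average.
    period_jitter_rms = statistics.pstdev(periods)
    return long_term_ppm, period_jitter_rms

if __name__ == "__main__":
    nominal = 1 / 44100  # one sample period at 44.1 kHz
    edges = clock_edges(nominal, 10000, freq_error_ppm=50.0, jitter_rms=100e-12)
    ppm, jitter = describe(edges, nominal)
    print(f"long-term frequency error: {ppm:.2f} ppm")
    print(f"period jitter (rms): {jitter * 1e12:.0f} ps")
```

Note that averaging over many cycles recovers the frequency error almost exactly, while the edge-by-edge scatter cannot be averaged away: it is there on every single cycle.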

As we said, any digital audio unit is also essentially a synchronous system and, as such, it is controlled by a clock which, like any other real-world clock, is affected by jitter.

As you all probably know, in the digital audio recording process the analog musical signal is sampled at a given fixed frequency Fc (which according to current audio standards can be 44.1, 48, 96 or 192 kHz, but could in principle be any other) and each sample level is converted into a number, which is stored on a digital medium.

At reproduction time, the samples are read from the digital medium, converted back into an analog signal at the same rate used at recording time, and filtered with a very steep low-pass filter.

Sampling theory proves that in this way, provided that the original signal contains no frequencies at or above Fc/2, it is possible to reconstruct the original signal.
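The Fc/2 condition can be checked numerically. The sketch below (function names and the 30 kHz test tone are my own illustrative choices) shows that a tone above Fc/2 produces exactly the same samples, apart from the sign, as a lower tone inside the audio band, so after sampling the two can no longer be told apart.

```python
import math

def sample_tone(freq, sample_rate, n_samples):
    """Samples of a unit sine wave of frequency `freq` taken at `sample_rate`."""
    return [math.sin(2 * math.pi * freq * n / sample_rate) for n in range(n_samples)]

def alias_frequency(freq, sample_rate):
    """Frequency in the 0..sample_rate/2 band that yields the same samples."""
    return abs(freq - sample_rate * round(freq / sample_rate))

if __name__ == "__main__":
    fs = 44100.0
    too_high = 30000.0                     # above fs/2 = 22050 Hz
    alias = alias_frequency(too_high, fs)  # 44100 - 30000 = 14100 Hz
    a = sample_tone(too_high, fs, 100)
    b = sample_tone(alias, fs, 100)
    # The two sample sequences differ only in sign.
    worst = max(abs(x + y) for x, y in zip(a, b))
    print(f"alias of {too_high:.0f} Hz is {alias:.0f} Hz, worst mismatch {worst:.2e}")
```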

Obviously, technology has taken many steps forward since then, and the process outlined above is no longer exactly the one in use, but in the end things have not changed that much.

In particular, there is a subtle point that still requires a lot of attention, although it was completely missed by the original digital audio designers: what happens because of the inevitable differences between the sampling clock at recording time and the sampling clock at reproduction time?

It is perfectly clear that the two clocks cannot be exactly the same. There will always be a (very) small difference in frequency, and in addition each clock edge will not take place exactly at the expected time, that is, it will be affected by a timing error: a very small one, again, but one that is difficult and expensive to control.

This is what happens to a sinusoid when it is sampled and reproduced with a jittered clock. In practice, as you see, it comes out distorted in a very strange and (apparently) unpredictable way.
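The size of this distortion can be estimated with a few lines of code. This is a minimal sketch under stated assumptions: the jitter is modelled as independent Gaussian noise on every sampling instant, and the 10 kHz tone and 1 ns rms jitter are illustrative values, not figures from a real player. The simulated error is compared with the simple rule that a timing error dt on a signal with slope s produces an amplitude error of about s times dt.

```python
import math
import random

def jitter_error_rms(signal_freq, sample_rate, jitter_rms, n_samples=20000, seed=1):
    """RMS amplitude error when a unit sine is sampled at jittered instants."""
    rng = random.Random(seed)
    w = 2 * math.pi * signal_freq
    total = 0.0
    for n in range(n_samples):
        t_ideal = n / sample_rate
        t_real = t_ideal + rng.gauss(0.0, jitter_rms)  # assumed Gaussian jitter
        err = math.sin(w * t_real) - math.sin(w * t_ideal)
        total += err * err
    return math.sqrt(total / n_samples)

def predicted_error_rms(signal_freq, jitter_rms, amplitude=1.0):
    """Slope-times-timing-error estimate: rms slope of A*sin(2*pi*f*t) is 2*pi*f*A/sqrt(2)."""
    return 2 * math.pi * signal_freq * amplitude * jitter_rms / math.sqrt(2)

if __name__ == "__main__":
    f, fs, tj = 10000.0, 44100.0, 1e-9
    sim = jitter_error_rms(f, fs, tj)
    pred = predicted_error_rms(f, tj)
    # Error level relative to the rms value of the sine itself (1/sqrt(2)).
    rel_db = 20 * math.log10(sim * math.sqrt(2))
    print(f"simulated {sim:.3e}, predicted {pred:.3e}, relative level {rel_db:.1f} dB")
```

Two things follow from the estimate: the error grows with the signal frequency, and it grows with the signal level, so it is not a fixed noise floor but something that depends on the music being played. With the assumed numbers, the error sits roughly 84 dB below the signal.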

It is clear that the problem is twofold: part of the error is generated at recording time, while another part is added at reproduction time. For the first part, we cannot do better than assume that the recording process takes the issue into account (as in fact it normally does), and consider the recording completely error free: in fact, the recording is only a sequence of numbers, with no information at all about the original sampling clock, which is assumed to be ideal.

All we can do is try to minimize the amount of error added at reproduction time. This amount is in general by far the larger, given the different amounts of money that it is (ehm... should be) reasonable to invest in a recording studio clock and in a home player.

All these timing errors, taking place both at recording and reproduction (A/D and D/A conversion) time, are called sampling jitter.

Sampling clock jitter is the only kind of jitter that really directly affects digital audio or, better, digital audio listening. Many other types of jitter can be recognized in audio systems, and we will discuss them below, but only because they can ultimately induce sampling clock jitter.

Classification and components of digital audio jitter

In the literature there are several different classifications of jitter, depending essentially on the point of view of the writer.

A rather technical classification of jitter in digital audio ([1]) first of all draws a distinction between interface jitter and sampling jitter.

This is easier to understand if we refer to a separate DAC. In this case the DAC structure includes a digital audio receiver that separately recovers data and clock from the input SPDIF (or AES or Toslink) flow and passes them on to the D/A converter.

Interface (or data link) jitter in this case is the timing error affecting the digital flow from the source at the input of the DAC digital audio receiver.

Typical components of interface jitter are transmitter jitter, line induced jitter and interfering-noise induced jitter.

Transmitter jitter is injected into the data link by the transmitter, either because of internal clock phase noise (intrinsic jitter) or because of poor jitter rejection at its input (transferred jitter: jitter that reaches the transmitting system from its own source through its synchronization input, and that the transmitting system passes on to the data link under examination).

The limits imposed by the recommendation on transmitter jitter are quite broad (±20 ns): the problem addressed at the time was only to avoid transmission errors, and the possibility that jitter could be detected because of its audio effects was not considered at all.

Line induced jitter is instead caused by the transmission medium. Any connection link has a limited bandwidth, which makes transitions slower than ideal; this causes waveform deformations that can be easily studied with an oscilloscope (eye pattern).

Now we need some more detail about the SPDIF code to clarify what happens. In the biphase mark code used by the SPDIF interface, each bit is transmitted as two half-bits of equal duration, which take the same value if the bit is 0 and opposite values if it is 1. A transition is always present at each bit boundary, and from these transitions it is possible to reconstruct the transmission bit clock. Moreover, the SPDIF flow transmits data packed in frames, each containing a stereo sample; each of the two halves of a frame (one per channel) opens with a preamble that is identified by a violation of the bit boundary transition rule (that is, there is no transition between one bit and the following one, so that there are three consecutive half-bits with the same value). This makes it possible to recognize the frame starting point, and also to extract a sampling rate clock.
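The coding rule just described fits in a few lines. The following is a sketch of the rule only, not of a complete SPDIF transmitter: function names are my own, levels are written as 0 and 1, and preambles, parity and channel status are left out.

```python
def bmc_encode(bits, start_level=0):
    """Encode data bits into half-bit line levels using biphase mark code."""
    level = start_level
    cells = []
    for bit in bits:
        level ^= 1          # a transition at every bit boundary
        cells.append(level)
        if bit:
            level ^= 1      # a 1 adds a second transition in mid-bit
        cells.append(level)
    return cells

def bmc_decode(cells):
    """Recover the bits: equal half-bits mean 0, different half-bits mean 1."""
    return [1 if a != b else 0 for a, b in zip(cells[0::2], cells[1::2])]

def longest_run(cells):
    """Length of the longest stretch of identical consecutive half-bits."""
    best = run = 0
    prev = None
    for c in cells:
        run = run + 1 if c == prev else 1
        best = max(best, run)
        prev = c
    return best

if __name__ == "__main__":
    data = [0, 1, 1, 0]
    line = bmc_encode(data)
    print(line, bmc_decode(line) == data, longest_run(line))
```

Because of the transition at every bit boundary, regular data can never hold the line at the same level for more than two half-bits. This is why a run of three identical half-bits can be used to mark a preamble: a receiver that checks the run length can find it without any other information.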

The waveform deformation is such that, in practice, the highest (absolute) voltage reached depends on how long the line signal stays at the same logic level and, given the limited speed, the time it takes for the voltage to fall back to the 0 volt line at the following transition also depends on the recent history of the signal. To make it very simple, given the structure of the SPDIF flow, this means that all transitions (0 volt crossings) after a 0 will be delayed or advanced compared to transitions after a 1. This effect is called intersymbol interference.

Clearly these timing deviations are a form of jitter and, since they depend on the data, this is data dependent jitter.
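A toy model shows where the data dependence comes from. In the sketch below the band-limited link is reduced to a first-order low-pass filter with time constant tau, and the line swings between -1 and +1: both are simplifying assumptions of mine, as are the function name, the example half-bit sequence and the numeric values. For every transition the code computes how long the voltage takes to get back to 0 volts, and groups the results by how long the line had stayed at the previous level.

```python
import math

# Half-bit levels of a short biphase mark sequence (bits 0,1,1,0,0,1), used as example data.
EXAMPLE_CELLS = [1, 1, 0, 1, 0, 1, 0, 0, 1, 1, 0, 1]

def crossing_delays(cells, cell_time, tau):
    """Return {run_length: [zero-crossing delays in seconds]}.

    Assumes tau is small enough (well below cell_time / ln 2) that the
    voltage always crosses 0 volts within one half-bit.
    """
    decay = math.exp(-cell_time / tau)
    prev = cells[0]
    v = 1.0 if prev else -1.0   # the line is assumed settled before we start
    run = 0
    delays = {}
    for c in cells:
        target = 1.0 if c else -1.0
        if c != prev:
            # Solve target + (v - target) * exp(-t / tau) = 0 for t.
            delay = tau * math.log(1.0 + abs(v))
            delays.setdefault(run, []).append(delay)
            run = 0
        v = target + (v - target) * decay   # voltage at the end of this half-bit
        run += 1
        prev = c
    return delays

if __name__ == "__main__":
    cell = 177e-9  # roughly one half-bit at 44.1 kHz
    for run, values in sorted(crossing_delays(EXAMPLE_CELLS, cell, 0.3 * cell).items()):
        print(run, [f"{d * 1e9:.2f} ns" for d in values])
```

In this model the crossings that follow two equal half-bits are always later than those that follow a single half-bit, because the voltage had more time to climb and has further to fall. The difference between the two groups is the data dependent jitter.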

Interfering-noise induced jitter is another consequence of the limited slope of the transitions. If we inject a small (compared to the signal) amount of noise into the SPDIF channel, it adds to the signal and modifies its instantaneous value. If the noise is low, and as such is not able to turn a 0 into a 1, there are no transmission errors, but the signal level is altered during the transitions too, so that the 0 crossing time is affected. Note that this would not happen if the transitions were perfectly instantaneous, so this component also depends on the characteristics of the transmission medium.
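The mechanism can be put into numbers with a very rough estimate. The sketch assumes a linear edge, so that the slope is simply the voltage swing divided by the rise time; the function name and the example values (0.5 V swing, 25 ns rise time, 10 mV of noise) are illustrative assumptions only.

```python
def noise_induced_jitter(noise_volts, swing_volts, rise_time):
    """Shift of the 0 crossing caused by additive noise on a linear edge.

    The edge slope is swing_volts / rise_time, and a voltage offset of
    noise_volts moves the crossing by noise_volts / slope.
    """
    return noise_volts * rise_time / swing_volts

if __name__ == "__main__":
    shift = noise_induced_jitter(noise_volts=0.010, swing_volts=0.5, rise_time=25e-9)
    print(f"timing shift: {shift * 1e9:.2f} ns")
    # With an instantaneous edge the same noise produces no timing shift at all.
    print(noise_induced_jitter(0.010, 0.5, 0.0))
```

The slower the edge, the more timing error the same amount of noise produces, which is why this component is tied to the bandwidth of the link.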

Sampling jitter is instead the timing inaccuracy with which D/A conversion takes place.

Let's now have a look at what happens inside a separate DAC.

The receiver chip essentially has the hard job of extracting clocks and data from the SPDIF data flow. As we have seen, the SPDIF flow also conveys two kinds of timing information: a bit clock and a word clock whose frequency is equal to the sample rate.
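To get a feeling for the time scales involved, here is a small calculation. It assumes the usual SPDIF framing of 64 bits per stereo frame (two 32-bit halves), each bit being sent as two half-bits; the function and key names are my own.

```python
BITS_PER_FRAME = 64  # assumed: two 32-bit halves per stereo frame

def spdif_timing(sample_rate):
    """Clock rates and half-bit duration of an SPDIF flow at a given sample rate."""
    bit_clock = BITS_PER_FRAME * sample_rate
    return {
        "word_clock_hz": sample_rate,
        "bit_clock_hz": bit_clock,
        "half_bit_seconds": 1.0 / (2 * bit_clock),
    }

if __name__ == "__main__":
    for fs in (44100, 48000, 96000, 192000):
        t = spdif_timing(fs)
        print(fs, t["bit_clock_hz"], f"{t['half_bit_seconds'] * 1e9:.1f} ns")
```

At 44.1 kHz a half-bit lasts about 177 ns, so the ±20 ns transmitter limit quoted above is roughly 11% of it: small enough to keep the data readable, but enormous on the scale of the timing errors that matter at the conversion clock.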

However, the situation is still far from ideal. The incoming SPDIF data link is affected by interface jitter in all its components: the timing of its transitions is perturbed by transmitter, line induced and interfering-noise induced jitter alike. The transitions that should be used to reconstruct the data link bit clock are not precisely timed, and have been irreparably modified in an unknown way during transmission.

While there is scarcely any way to reduce the effects of transmitter jitter, and some protection from interfering-noise induced jitter can be obtained by using optical fiber or heavily shielded coaxial cables, it is possible to reduce the effects of line induced jitter, which is intrinsically data dependent (and depends essentially on the last few half-bits). To make it simple, the preambles, which appear in the flow at a fixed rate that is a multiple of the sampling frequency and have a fixed structure, are points where intersymbol interference is minimal, and as such they are ideal (or the best available) references for clock extraction. The jitter affecting these transitions is called preamble jitter.

Here below is a digital oscilloscope screenshot of an SPDIF data flow, obtained by repeating the acquisition many times while triggering on the preambles. As you can see, the structure of the preambles (the two wider gaps at the left and center of the screenshot) is perfectly stable throughout all acquisitions, while the data show a very dynamic behaviour and higher jitter (note that the first data following each preamble are stable too, as this is a 16 bit SPDIF flow, and these are bits 16-23, which are zero padded).

Most integrated digital audio receivers use similar strategies to limit their sensitivity to line induced jitter. For example, in the recent CS8416 Digital Audio Interface Receiver "the PLL has been designed to only use the preambles of the biphase encoded stream to provide lock update information to the PLL", according to the Crystal/Cirrus data sheet: "This results in the PLL being immune to data dependent jitter effects because the preambles do not vary with data". Similar concepts were also expressed in the data sheets of the CS8412, 8414 etc. receivers. Texas Instruments' DIR1703 makes (made?) use of a "Sampling period adaptive controlled tracking system (SpAct)", which "is a newly developed clock recover architecture, giving very low jitter clock from S/PDIF data input".

The clock recovered from the preambles is only used as the reference for a PLL (phase locked loop). A PLL is a closed loop (feedback) circuit in which a voltage controlled oscillator is driven by the (adequately filtered) phase error between the oscillator output (typically divided by a fixed factor) and an external reference clock. As soon as the oscillator output starts to lag behind the external clock, the phase error, averaged by the low-pass filter, causes the oscillator to increase its frequency and catch up with the reference; if it runs ahead, the opposite happens.

The two clocks therefore maintain exactly the same average frequency (they are "locked"), but the dynamics with which the voltage controlled oscillator changes its frequency can to a large extent be controlled by careful design of the filter. In particular, the oscillator clock can be much more stable than the reference, and is therefore an ideal source for the conversion clock.
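The smoothing action of the loop can be illustrated with a deliberately crude model. This sketch works directly on timing errors rather than on real waveforms, and treats the loop as a first-order low-pass with a single gain parameter; real receivers use more elaborate loops, and the names, the gain of 0.01 and the 1 ns reference jitter are all assumptions made for the example.

```python
import math
import random

def pll_track(ref_errors, loop_gain):
    """Toy phase-domain PLL.

    ref_errors: timing error of each reference edge (seconds).
    Returns the timing error of the local oscillator after each edge.
    """
    out = []
    y = 0.0
    for x in ref_errors:
        y += loop_gain * (x - y)   # a fraction of the phase error steers the oscillator
        out.append(y)
    return out

def rms(values):
    return math.sqrt(sum(v * v for v in values) / len(values))

if __name__ == "__main__":
    rng = random.Random(3)
    reference = [rng.gauss(0.0, 1e-9) for _ in range(50000)]  # 1 ns rms, random
    local = pll_track(reference, loop_gain=0.01)
    print(f"reference jitter {rms(reference) * 1e12:.0f} ps rms")
    print(f"local clock jitter {rms(local) * 1e12:.0f} ps rms")
    # A constant timing offset, instead, is followed completely: the loop stays locked.
    print(pll_track([1.0] * 2000, 0.01)[-1])
```

A small loop gain makes the local clock follow the reference slowly, so fast random errors are largely averaged out while the average timing is still tracked.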

On the other hand, any oscillator has its own intrinsic jitter, so the local oscillator also makes its own contribution to conversion clock jitter. By the way, the higher the clock stability, that is, the higher the rejection of reference jitter, the higher the oscillator intrinsic jitter.

The story is not complete yet. In fact, sampling jitter still differs from conversion clock jitter, because sampling jitter should be observed by measuring the timing with which conversion actually takes place, and therefore includes any degradation due to the connection path and, in principle, to what happens inside the conversion chip. And these effects can be really significant.

Hence, the relation between interface jitter and sampling jitter in a DAC (but this is true in most cases) is far from direct, because in practice:

- the reference for the sampling clock depends only on low jitter preambles, while interface jitter refers to any data bit;
- the sampling clock is derived from a local oscillator which
  - is phase locked to the reference, and as such only loosely follows the reference, filtering part of the reference jitter
  - is affected by its own intrinsic jitter, which adds to the other components;
- the exact timing with which conversion takes place can differ from the conversion clock because of any kind of external interference or clock waveform distortion, due for example to connection line mismatching (another example of interface jitter, at another level).

As a consequence, it is clearly possible to design systems in which data link jitter has scarcely any effect on sampling jitter, and it would definitely not be correct to assume any simple correlation between the two. As a matter of fact, data jitter can be as much as 100 or 1000 times higher than sampling jitter [9].

It is however true that the typical digital receiver ICs available today provide good rejection only for high frequency (>10 kHz) jitter, while low and medium frequency jitter (which seems to be the worst from the audio point of view) is not attenuated at all. So, low frequency jitter affecting the preamble timing in the SPDIF flow can pass without attenuation into the local clock.
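What a corner around 10 kHz means in practice can be seen with a simple formula. The sketch models the jitter transfer of the receiver as a first-order low-pass filter, which is an assumption made for illustration: real loops are usually of higher order and can even amplify jitter slightly near the corner. The function name and the test frequencies are my own.

```python
import math

def jitter_gain_db(jitter_freq, corner_freq):
    """Gain (dB) of a first-order low-pass jitter transfer function."""
    return -10.0 * math.log10(1.0 + (jitter_freq / corner_freq) ** 2)

if __name__ == "__main__":
    corner = 10e3  # assumed corner frequency, in Hz
    for f in (100.0, 1e3, 10e3, 100e3, 1e6):
        print(f"{f:>9.0f} Hz: {jitter_gain_db(f, corner):7.2f} dB")
```

Under this model, jitter components well below the corner pass through practically untouched, and the attenuation only grows by about 20 dB for every tenfold increase in frequency above it.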

In an integrated CD player the situation is different, but the final result is quite similar. The connection between transport and DAC circuits is not SPDIF, and the master clock generator is normally placed near the DAC chip, but the data flow coming from the transport is still affected by data link jitter, and again this might have (at least potentially) very little in common with data conversion jitter.

The interface here is multi-wire and synchronous, that is, the data and the data clock travel on separate wires, which explains why data jitter is of much reduced importance. Moreover, as you can see from the schematic above, both the transport and the DAC are slaves of the same master clock, which is a fixed crystal oscillator without any input other than its power supply. The master clock therefore has a very big impact on sound, normally bigger than that of the transport, even though here too the transport has some effect on sound.

Given the architecture, the main reasons for this effect are presumably that transport, clock and conversion circuits very often share (part of) the same power supply, and that the DAC circuits are sensitive to the electromagnetic interference radiated by the transport. In fact, the transport servos are electric motors, which emit a variable magnetic field and can sometimes draw very high current spikes, especially whenever a correction is required. These corrections in turn depend on the mechanical precision of the transport itself (and of the CDs), which explains why better transports allow lower jitter.

Note that this too can be considered a sort of interface jitter.
