TNT-Convertus: a minimalist DAC

2. The Two-Boxes CD Player: Implementation issues

2.1 - The pick up and control circuits

The CD reader uses a sophisticated laser pick-up to read the disk. This stage normally relies on standard chip sets, built only by the major electronics companies (Philips, Sony...). A few esoteric audio companies develop and use proprietary algorithms to extract all possible information from the disk, so that it is not possible to detect reading errors even when they take place.

This is another potentially critical issue. In fact the correction algorithm works automatically, recreating missing samples where necessary, without any control by the user. This is why a CD player does not click like an LP: when a sample is missing it can be reconstructed from the adjacent samples and from the extra information that has been included specifically for this purpose.

This is really a good thing... for a low quality standard! Audiophiles do not appreciate this behaviour so much, as it could hide great flaws in recorded CDs.
One fact that is not so easy to explain, for example, is how two different CD releases of the same digital recording can sound so different, provided they have not been re-mastered.
The suspicion that an intrinsic media quality factor might alter the sound quality is rather widespread...
But what's worse, this undermines one of the foundations of digital audio, namely that the digitally recorded signal can be retrieved without any error.

By the way, in the past high quality players often implemented only limited correction, so that a high level player might fail to read a CD that a low cost player apparently read without trouble. More recently a few highly reputed audio companies have improved their correction algorithms, in order to read faulty CDs without problems.
This means, however, that they accept a higher level of faults and try to correct it, but no one (apart perhaps from the designer...) can be sure that the reconstructed samples are definitely correct, that is, identical to the original ones.

What can be imagined, alas, is that when not enough information is available to reconstruct the original signal, the correction algorithms guess what the original signal was. This seems definitely not correct from a purist point of view, and in fact it is not, but it probably avoids the digital equivalent of those annoying LP clicks (by the way, a digital click often sounds exactly like an analogue click...).

During the last Top Audio show in Milan, I talked with Marco Antonio Lincetto, a well known Italian sound engineer, and he told me that nowadays any transport should be able to read a CD flawlessly: that is, if you extract the digital signal from a CD several times and compare the different versions, you cannot find any difference.
At the time I accepted the information, but now I understand that this test just shows that a CD player can read a CD in a consistent way; it is not yet proof that the extracted stream is correct: another CD player with different correction algorithms could consistently extract a different stream!

Note that here too the digital to analog converter does not recognise the problem and is completely unaware of it. Even when muting takes place, because the correction algorithm cannot produce any reasonably reconstructed sample, the DAC just goes into automuting after several null samples, but does not perform any correction task.

As a matter of fact, the only information flow from the reader to the DAC, that is the stream we were talking about, is encoded in the SPDIF protocol: we are going to talk about it in the next section.

2.2 - The IEC 958 Consumer mode 1 or SPDIF interface

The SPDIF interface was designed by Sony and Philips (the SP in the name) in order to have a low cost (the magic words...) interface which could economically carry digital data from a digital source for consumer purposes (by the way, for professional purposes the AES-EBU balanced interface is supposed to be used: it is certainly better, since it is far more immune to EMI on the long runs required in professional usage, but it has exactly the same kind of problem we are going to discuss...).
Note that a previous interface existed, which used two wires to carry signal and clock separately: at the time either the effect of improper timing in AD-DA conversion was not completely understood, or cost reduction was the overriding concern, so the single wire interconnect was proposed and accepted as a standard.
BUT!!! a flaw was there, even though not relevant... apart from the usual thin slice of the market.

The signal coming out of the CD transport (or the CD player) must be taken to the DAC. Its bit rate depends on the sample rate (44.1ksps for CDs: instead of kHz you often find ksps, which means thousand samples per second) and on the number of transmitted bits (24 for each sample, of which only 16 are used in CDs, plus other service bits, for a total of 32 bits for each sample of each channel), which takes the flow up to 2.822Mbps (44,100 x 32 x 2).
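As a quick check of the arithmetic, the bit rate can be recomputed from the three figures given in the text. This is only a sketch: the constant and function names are made up for the illustration.

```python
# Hypothetical names; the three figures themselves come from the text.
SAMPLE_RATE_SPS = 44_100   # CD sample rate, samples per second per channel
BITS_PER_SAMPLE_SLOT = 32  # 24 audio bits (16 used by CDs) plus service bits
CHANNELS = 2               # stereo


def spdif_bit_rate(sample_rate=SAMPLE_RATE_SPS,
                   bits=BITS_PER_SAMPLE_SLOT,
                   channels=CHANNELS):
    """Return the raw bit rate on the interface, in bits per second."""
    return sample_rate * bits * channels


if __name__ == "__main__":
    rate = spdif_bit_rate()
    print(f"{rate} bit/s = {rate / 1e6:.3f} Mbps")
```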

The bit flow is encoded with a biphase-mark code, which makes the signal independent of absolute polarity and, above all, prevents the signal from staying constantly high or low; this is necessary for the receiver to be able to recover the bit clock automatically, which would be impossible with plain binary data, where long sequences of 1s or 0s could take place.

Biphase mark encoding simply means that there is a transition at every bit boundary, and a second transition at half bit time when the bit is a mark (digital 1).
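The rule can be sketched in a few lines of Python. This is a minimal illustration, not taken from any receiver or transmitter chip: the function names are invented, and the line is modelled as a list of half-bit levels (0 or 1).

```python
def biphase_mark_encode(bits, start_level=0):
    """Encode data bits as a list of half-bit-cell line levels (0 or 1).

    The level always toggles at a bit boundary, and toggles again in the
    middle of the cell when the bit is a mark (1).
    """
    level = start_level
    half_cells = []
    for bit in bits:
        level ^= 1              # transition at every bit boundary
        half_cells.append(level)
        if bit:
            level ^= 1          # extra mid-cell transition marks a 1
        half_cells.append(level)
    return half_cells


def biphase_mark_decode(half_cells):
    """Recover the bits: a cell whose two halves differ carries a 1."""
    return [int(a != b) for a, b in zip(half_cells[0::2], half_cells[1::2])]


if __name__ == "__main__":
    data = [0, 1, 1, 0]
    line = biphase_mark_encode(data)
    print(line)                       # [1, 1, 0, 1, 0, 1, 0, 0]
    print(biphase_mark_decode(line))  # [0, 1, 1, 0]
```

Starting from the opposite line level gives the inverted waveform, which decodes to the same bits: this is the polarity independence mentioned above. Note also that the level never stays the same for more than two half cells.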

Fig. 7

This, as a matter of fact, makes the required bandwidth double the bit rate, about 5.6MHz.

It is certainly not a very high frequency, but it is far higher than the frequencies normally handled by audio DIY people. It is high enough to obtain coupling even when not wanted... For example I noted that the prototype output started working before the digital interconnect was really connected: it was enough to bring its end very near to the binding post!

This means that special care must be taken in order to prevent serious problems.
Apart from the fact that careful layout is necessary, there are other problems.

If you look at the SPDIF signal with an oscilloscope, you see a well known pattern named "eye". This is a good way to understand which kind of problems the receiver has to face.

In the case of biphase code, the pattern shows that the waveform is rather heavily distorted depending on the previous symbol: it is a well known problem named intersymbol interference.

The flaw comes from the fact that the DAC must derive the bit clock from the only signal it receives. But as we saw before, the SPDIF signal is altered by the sequence of symbols, so that the boundary between one symbol and the next does not correspond to the same threshold level; as the boundary is normally recognised by detecting the transitions at a fixed threshold level, the waveform alteration is translated into a time delay of some transitions compared with others.
That is, the signal from which we are supposed to retrieve the clock does not contain a stable clock indicator (the bit boundary transition), as it is affected by a variable and unpredictable delay.

By the way, the code is not even a pure biphase mark one: data is organised in frames, and the start of each frame must be signalled somehow. In other interfaces it is possible (even though not so easy) to detect the frame alignment by analysing the decoded data (typically one out of N bits is used only to transmit this alignment information, and this Nth bit carries a fixed pattern, for example 0,1,1).
In the SPDIF flow, instead, the alignment information is transmitted by inserting violations of the general rule of the biphase mark code: in the example above (data is almost all zeroes) you can see two violations (three half-periods with the same value) between 4 and 6 and one more at 16us. This makes the code an even (slightly) less stable clock reference.
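Since a regular biphase mark stream never keeps the same level for more than two half-periods, a violation can be spotted simply by looking for longer runs. The sketch below does only that: the function name is invented, the line is modelled as a list of half-bit levels, and the actual SPDIF alignment patterns are not reproduced here.

```python
def find_code_violations(half_cells):
    """Return the start index of every run of three or more equal half cells."""
    violations = []
    run_start = 0
    for i in range(1, len(half_cells) + 1):
        # A run ends at the end of the list or when the level changes.
        if i == len(half_cells) or half_cells[i] != half_cells[run_start]:
            if i - run_start >= 3:
                violations.append(run_start)
            run_start = i
    return violations


if __name__ == "__main__":
    regular = [1, 1, 0, 1, 0, 1, 0, 0]        # valid biphase mark data
    with_violation = [0, 0, 1, 1, 1, 0, 1, 0]  # three equal half cells at index 2
    print(find_code_violations(regular))         # []
    print(find_code_violations(with_violation))  # [2]
```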

Why is a stable clock reference so important? Well, consider a 1 kHz sine wave sampled at 44.1ksps.

The precision of the original sampling clock is not an issue (that is, it is an independent factor which neither the user nor the reproduction device can control), but if you now imagine that the reproduction of a single sample is delayed with respect to the correct reproduction time, you find that the reconstructed value has an error depending on the waveform slope at that time.

The timing error in the conversion of a sample with respect to the ideal time is named jitter.
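To get a feeling for the numbers, the error can be estimated for the 1 kHz sine wave of the example. This is only a sketch under stated assumptions: a full scale 16 bit sine, a single sample reproduced 200 ps late (an arbitrary value chosen for the illustration), and invented function names.

```python
import math


def jitter_error_lsb(signal_hz, jitter_s, bits=16, phase_rad=0.0):
    """Error, in LSB, of one sample of a full-scale sine reproduced jitter_s late.

    The sample carries the value the sine had at the nominal instant, but it
    is output when the sine should already have moved on.
    """
    amplitude = 2 ** (bits - 1)  # full-scale peak value, in LSB
    nominal = amplitude * math.sin(phase_rad)
    due = amplitude * math.sin(phase_rad + 2 * math.pi * signal_hz * jitter_s)
    return nominal - due


def slope_estimate_lsb(signal_hz, jitter_s, bits=16, phase_rad=0.0):
    """First-order estimate of the same error: waveform slope times delay."""
    amplitude = 2 ** (bits - 1)
    slope = 2 * math.pi * signal_hz * amplitude * math.cos(phase_rad)
    return -slope * jitter_s


if __name__ == "__main__":
    for name, phase in (("zero crossing", 0.0), ("peak", math.pi / 2)):
        err = jitter_error_lsb(1_000, 200e-12, phase_rad=phase)
        print(f"{name}: {abs(err):.6f} LSB")
```

The error is largest at the zero crossing, where the slope is steepest, and practically disappears at the peak of the sine, where the slope is zero.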

In the following image the dark line has been obtained by applying a great amount of jitter before the D/A conversion phase: obviously the reconstructed signal will be very different from the original one.

Fig. 8

Just to explain what is happening, the second sample (the value of the red line where it crosses the second vertical line) is reproduced with the correct value, but at the wrong time: in fact the blue line reaches the value of the red line slightly ahead of time.
Note that in the figures the timing error is clearly exaggerated, in fact it is about 20% of the sampling period: normally we are talking about hundreds of ps (picoseconds, 10^-12 s), that is about 0.001% of the 44.1kHz sampling period.

Fig. 8a

Jitter is a typical time distortion phenomenon, but can be studied even in the frequency domain (in this case it is properly named phase noise): in practice jitter makes side bands appear beside the fundamental. Analysis of these side bands can be used in quantifying the jitter amount.
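The side bands can be made visible with a small numerical experiment. The sketch below is an assumption-laden illustration, not a measurement method: jitter is modelled as a sinusoidal displacement of the sampling instants, its frequency (100 Hz) and peak value (10 ns, heavily exaggerated) are arbitrary, and all names are invented. For such small jitter the side bands at the tone frequency plus and minus the jitter frequency should sit at about pi * f * J relative to the fundamental, where f is the tone frequency and J the peak jitter.

```python
import cmath
import math

SAMPLE_RATE = 44_100    # samples per second
N_SAMPLES = 4_410       # 0.1 s, so every multiple of 10 Hz falls on a DFT bin
TONE_HZ = 1_000         # test tone
JITTER_HZ = 100         # assumed frequency of the sinusoidal jitter
JITTER_PEAK_S = 10e-9   # assumed peak jitter, exaggerated for clarity


def jittered_sine():
    """Unit sine whose sampling instants are displaced by sinusoidal jitter."""
    out = []
    for k in range(N_SAMPLES):
        t = k / SAMPLE_RATE
        t_actual = t + JITTER_PEAK_S * math.sin(2 * math.pi * JITTER_HZ * t)
        out.append(math.sin(2 * math.pi * TONE_HZ * t_actual))
    return out


def tone_amplitude(samples, freq_hz):
    """Amplitude of one frequency component, via a single-bin DFT."""
    acc = sum(x * cmath.exp(-2j * math.pi * freq_hz * k / SAMPLE_RATE)
              for k, x in enumerate(samples))
    return 2 * abs(acc) / len(samples)


def sideband_ratio(offset_hz=JITTER_HZ):
    """Side band level relative to the fundamental (linear ratio)."""
    samples = jittered_sine()
    return (tone_amplitude(samples, TONE_HZ + offset_hz)
            / tone_amplitude(samples, TONE_HZ))


if __name__ == "__main__":
    ratio = sideband_ratio()
    print(f"side band: {20 * math.log10(ratio):.1f} dB below the fundamental")
```

With these values the side bands come out roughly 90 dB below the fundamental; reading that level back from a spectrum is the way the jitter amount can be estimated.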

For jitter not to cause an error greater than half a bit in an n-bit system with maximum signal frequency f, the formula should be

Max jitter = 2^(-n) / (pi * f)

(Note: I found the formula in a collection of old e-mails about jitter from newsgroup rec.audio.high-end. It was said it came from an AD manual, but trying to find out how it was computed I found a value four times lower...)

This formula gives 242ps as maximum jitter for a 20kHz / 16bit system. Obviously for a higher number of bits the allowed jitter decreases rapidly, reaching values critical for current technology already at 20kHz / 20 bits. Imagine what happens at 96kHz / 24 bits.
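The formula quoted above is easy to evaluate for the cases mentioned. In this sketch the function name is invented, and for the last case I assume that 96kHz is the sample rate, so that the maximum signal frequency is taken as 48kHz.

```python
import math


def max_jitter_s(bits, max_signal_hz):
    """Max jitter = 2^(-bits) / (pi * f), the formula quoted in the text."""
    return 2.0 ** (-bits) / (math.pi * max_signal_hz)


if __name__ == "__main__":
    cases = [
        (16, 20_000),  # CD
        (20, 20_000),
        (24, 48_000),  # assumption: 96kHz sample rate, 48kHz max signal
    ]
    for bits, freq in cases:
        print(f"{bits} bit, {freq} Hz: {max_jitter_s(bits, freq) * 1e12:.2f} ps")
```

Each extra bit halves the allowed jitter, so going from 16 to 20 bits divides it by 16.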

What is the effect of jitter? I can only quote others, as I have not yet made any experiment in this sense, but it is reported as a lack of focus, a disturbance of the soundstage. What's strange, though, is that a few authors reported that they appreciated a warm, round sound, but later discovered they were inadvertently injecting jitter into the clock. So even in terms of its effect on sound, jitter is extremely elusive.

2.4 - A purist's point of view

The point of view of a true purist could well simply be that any analogue recording surely sounds better than the same piece in any digital recording.

I do not want to get into this war. Alas, purists have lost it, and indeed long before the sound of a digital recording could even slightly resemble that of an analogue recording. But they have definitely lost the war, and they have lost it against the global recorded music market, of which purists are just a small part of the already thin slice represented by audiophiles.

Indeed they have lost the war and they believe that all the damage has come from this. I am not so sure. You could imagine two scenarios in the struggle for quality that audiophiles are fighting.

In the first scenario, the majors decide that the audiophile market is rich and fat, and decide to address it specifically with special quality tools, which require a special medium, let's name it YperCD, different from and not compatible with the standard consumer medium.
The cost of a single YperCD record must in this case rise a lot, as it is a special quality recording produced in low numbers: let's say three times the current one.
From the point of view of distribution, no normal consumer would accept the extra cost, and either the availability of YperCD software (if the price remains limited) or its price (if a wide range of titles is made available) would soon become the limiting factor in the market (what's happening with vinyl nowadays?).
As for YperCD player production, the numbers involved are so low that the price of the unit is very high too, and it does not tend to decrease, as the low income does not allow any further research in the area (remember the DAT?).
As a consequence, its ability to remain a leading technological solution is progressively eroded by the progress of low cost consumer technology, where any little technical improvement can result in a great sales advantage (remember DAT vs CDs). Note that should the production cost of the high quality medium at any time become lower than that of the consumer one, the majors would immediately kill the more expensive one, forcing consumers onto the other...

In the other scenario, the majors decide that no separate medium can be sustained and produce a single medium of reasonable quality. The sales price can and must be reduced as the manufacturing cost gets lower and lower, and even though much of the advantage goes into producers' profit, part of it is shared with the customer because of market laws.
The availability of new recordings is unrestricted. The single player type is widespread, and its price decreases too, even though high quality players would maintain a high cost, at least to keep up their own image. Technological research for the sake of good sound would be led by high end companies, which could anyway be sure that the results of their studies can be sold to consumer companies, while the cost reduction would be guaranteed by the ever better mass-produced ICs designed by the latter.

No doubt I prefer the second scenario: the first leads to an elite market with a very high access threshold, sustained only by ever increasing prices with no (big) technical progress. The example is the current vinyl market: I cannot find anything really interesting from the musical point of view, apart from a few reissues of old performances.

In the end, the absence of real competitors might not be so bad in this market. Actually our consumer electronics global village seems to have become too small and too crazy for competition to produce a really good selection. Remember Beta vs VHS? The winner (which was the worse of the two, do not forget) controlled the mass market for decades, before something definitely better appeared and killed it.
The same goes for the Compact Cassette: it was able to survive Elcaset, DAT and Minidisc and was already dying of old age before CD-Rs took its crown (at least here in Italy, where Minidiscs are not yet so widespread).

But even though no real competitor can arise from the few followers of the Pure Sound, deep, bitter and harsh criticism does.

One of the most advanced positions is expressed by Ryohei Kusunoki in a three-part article published by MJ, an important Japanese audio electronic design magazine, and also available on the Web.

Kusunoki says that oversampling is far worse than no oversampling. His point of view is as follows. Oversampling was introduced in order to relax the constraints to which the DAC and the following analogue filter were subject, as we explained in the first part of this article.

But no one has considered jitter, he says. From this point of view, taking into account that the average error (which can also be seen as the noise floor which limits the available dynamic range) is half the LSB, least significant bit, you could decide that this is also the maximum acceptable level for the jitter-induced error.

In this case the maximum acceptable jitter is 173ps at the basic sample rate and precision (44.1ksps and 16 bit), computed as 1 / 44100 / 2^16 / 2. But if you take for example 8X oversampling with a 20 bit word, then to achieve the same precision jitter should be as low as 1.35ps, which is absolutely impossible to achieve. From this Kusunoki concludes that an oversampling unit cannot offer the same precision as a non oversampling unit.
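Both figures follow from the same expression, as this short sketch shows. The function name is invented for the illustration; the expression itself is the one given above, with the sample rate multiplied by the oversampling factor.

```python
def kusunoki_max_jitter_s(sample_rate_sps, bits, oversampling=1):
    """Half of one LSB's share of the sample period: 1 / fs / 2^bits / 2."""
    return 1.0 / (sample_rate_sps * oversampling) / 2 ** bits / 2


if __name__ == "__main__":
    plain = kusunoki_max_jitter_s(44_100, 16)
    oversampled = kusunoki_max_jitter_s(44_100, 20, oversampling=8)
    print(f"no oversampling, 16 bit: {plain * 1e12:.0f} ps")
    print(f"8X oversampling, 20 bit: {oversampled * 1e12:.2f} ps")
```

The two results differ by a factor of 128: 8 from the shorter sample period and 16 from the four extra bits.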

With regard to the digital anti-aliasing filters which normally come after the DAC in case of oversampling, Kusunoki just thinks that they are of no use at all, as the human auditory sense is a powerful low pass filter and filters out any component over 20kHz.
Even the effect of these digital filters on the following equipment is very limited, as they filter out all components up to the oversampling frequency, but not the ones above it.

By the way, Kusunoki's criticism hits any digital filter. He says that any digital filter is just a sequence of delay blocks with some processing of the delayed samples.
The basic delay block at 44.1kHz is 0.022msec (one sample period, about 22.7usec), but the delays accumulate, so that the total delay often exceeds 2msec, which is the delay detection threshold for human beings. So, he says, the delay can be detected and causes big damage to the sound. The best known consequence of this fact, he says, is the typical response of a CD player to a pulse, which is indeed far from the ideal one.
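To put the two delays in proportion: one sample period at 44.1kHz is 1 / 44100 s, about 22.7 microseconds, so it takes a chain of some ninety delay blocks to get past 2 msec. The sketch below only does this arithmetic; the names are invented and the 2 msec threshold is the figure reported from Kusunoki.

```python
import math

SAMPLE_RATE_SPS = 44_100
DETECTION_THRESHOLD_S = 2e-3  # threshold reported from Kusunoki


def sample_period_s(sample_rate=SAMPLE_RATE_SPS):
    """Delay of one basic delay block, in seconds."""
    return 1.0 / sample_rate


def delay_blocks_to_reach(total_delay_s, sample_rate=SAMPLE_RATE_SPS):
    """Smallest number of one-sample delay blocks giving at least total_delay_s."""
    return math.ceil(total_delay_s * sample_rate)


if __name__ == "__main__":
    print(f"one block: {sample_period_s() * 1e6:.1f} us")
    print(f"blocks to reach 2 ms: {delay_blocks_to_reach(DETECTION_THRESHOLD_S)}")
```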

Well, I must admit I have only read a translated excerpt of his articles, but I really cannot agree with several of the points expressed above. Anyway, as Kusunoki seems to be a well known and respected purist, I think it is important to know even this, so different, point of view.
Note that even though some of his points do not seem completely correct from the technical point of view, this shows that no established knowledge should be accepted without a good, healthy deal of scepticism.

Actually, there is at least one thing Kusunoki says that cannot be denied by (almost) anyone: the pulse response of a normal CD player is really poor; any other HiFi component with the same pulse response, apart perhaps from speakers, would probably be considered unacceptable.
What's worse, the pulse response of a digital filter, and hence of a normal CD player, has a deeply unnatural behaviour: it has a ripple that begins before the real pulse, which is really hard to find in the behaviour of any real world musical instrument.

Fig. 9

A typical DAC pulse output (the output is inverted)
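The ripple before the pulse is easy to reproduce with a textbook filter. The sketch below is not the filter of any specific CD player: it is a generic symmetric (linear phase) low pass, a sinc shaped by a Hann window, and the 8X oversampling, the 22.05kHz cutoff and the 353 taps are all assumptions chosen for the illustration. The pulse response of such a filter is simply its list of coefficients.

```python
import math

OVERSAMPLED_RATE = 8 * 44_100  # assumed 8X oversampling
CUTOFF_HZ = 22_050             # assumed cutoff, half the CD sample rate
NUM_TAPS = 353                 # assumed length, odd so there is a centre tap


def windowed_sinc_lowpass(num_taps=NUM_TAPS, cutoff_hz=CUTOFF_HZ,
                          sample_rate=OVERSAMPLED_RATE):
    """Symmetric low-pass FIR coefficients: ideal sinc times a Hann window."""
    centre = (num_taps - 1) / 2
    fc = cutoff_hz / sample_rate  # normalised cutoff, cycles per sample
    taps = []
    for n in range(num_taps):
        x = n - centre
        if x == 0:
            ideal = 2 * fc
        else:
            ideal = math.sin(2 * math.pi * fc * x) / (math.pi * x)
        window = 0.5 - 0.5 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    return taps


def pre_ringing_s(num_taps=NUM_TAPS, sample_rate=OVERSAMPLED_RATE):
    """Time between the first coefficient and the main peak, in seconds."""
    return (num_taps - 1) / 2 / sample_rate


if __name__ == "__main__":
    taps = windowed_sinc_lowpass()
    peak = max(range(len(taps)), key=taps.__getitem__)
    print(f"main peak at tap {peak} of {len(taps)}")
    print(f"ripple starts {pre_ringing_s() * 1e3:.2f} ms before the peak")
```

Because the coefficients are symmetric around the main peak, whatever ripple follows the pulse also precedes it. A filter with fewer taps would ring for a shorter time, at the price of a less steep cutoff.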

I read recently in an Italian magazine that the ringing is absolutely correct, as it matches the theoretical pulse response of a digital filter. Well, I must say that this illuminating statement does not make me any less dubious about the whole thing. By the way, I seem to remember from my university days that the ringing is not the theoretical response of the ideal filter, but the theoretical response of the practical filter, as the ideal one requires infinite delay and therefore seems slightly impractical to implement :-)

Kusunoki just says that in his opinion this ringing makes the sound less natural and musical. I would like you to note that in the case of a sudden transient note (a piano or a drum, for example) the ringing would precede the note by 0.5ms; even though this time might be too short and the ringing frequency too high (about 22kHz) to hear, this "leading tail" could hypothetically reduce the perceived dynamic impact of the sound and alter its quality.

Just to show that the topic is not so marginal, a few recent Japanese hi-end players present a pulse response without ringing, obtained via very sophisticated oversampling filters.
I do not know whether Kusunoki's theory is now supported by serious studies or whether the Kusunoki effect has simply hit the market even though unsupported by any evidence, but apparently the big Japanese companies intend to address this issue.

[Back to Part 1]
[Go to Part 3 (schematics)]
