What sample rate to choose? While psycho-acoustics suggests that 44.1kHz is almost enough, and while there seldom is real signal content above 20kHz on LPs, it still is wise to record at a higher rate.
How high? Quadruple rates (i.e. 4 x 44.1 = 176.4kHz and 4 x 48 = 192kHz) may appeal to the numbers brigade, but are often quite pointless. One reason is the lack of content in the source signal; the other main reason is that many quad-capable ADC chips in reality aren't. Audio ADCs these days are almost invariably of the sigma-delta type, i.e. massively oversampled low-bit convertors running at several MHz, followed by digital anti-alias filtering and decimation, with large amounts of noise shaping to move the gross quantisation error of the low-bit convertor out of the audio passband.
For a base 44.1kHz ADC, the noise shaper must clean up the sub-22kHz range. For a 96kHz convertor this is the sub-48kHz range, and for a 192kHz device this would be the sub-96kHz range. But what we see is that many ADCs, operated at a 192kHz output sample rate, exhibit a lot of shaped noise above 48kHz, still within the bandwidth of interest (which extends to 192/2 = 96kHz). At the same time their datasheets reveal that their anti-aliasing filters offer poor stop-band suppression and poor in-band flatness. The underlying reason is that 4x ADC chips are often 2x designs that were stretched to accommodate another octave. But this octave isn't captured cleanly, and is thus quite useless.
The spectrograms above illustrate this for the PCM1804 chip, once a champion among analogue-to-digital convertors. Shown are the noise floor and harmonic distortion for an input signal of 1kHz at -60dBFS. Sample rate, from left to right: 48kHz, 96kHz, and 192kHz. Observe the mountain of noise above 48kHz in the 192kHz sampling case. (Note that the rising noise floor below 48kHz is mostly an artefact of the FFT algorithm, as the bin size increases here with rising sample rate.)
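The bin-size effect mentioned in the note can be estimated with a small sketch. It assumes a fixed FFT length (the value 65536 is an assumption; the plots do not state theirs) and a flat noise density. Under those assumptions each doubling of the sample rate doubles the bin width and lifts the displayed floor by about 3dB.

```python
import math

FFT_LENGTH = 65536  # assumed FFT size; not taken from the original plots


def bin_width_hz(sample_rate_hz, fft_length=FFT_LENGTH):
    """Width of one FFT bin in Hz for a given sample rate."""
    return sample_rate_hz / fft_length


def floor_shift_db(sample_rate_hz, reference_rate_hz=48000.0):
    """Rise of the displayed per-bin noise floor relative to the reference
    rate, assuming the same FFT length and a flat (white) noise density."""
    return 10.0 * math.log10(sample_rate_hz / reference_rate_hz)


if __name__ == "__main__":
    for rate in (48000.0, 96000.0, 192000.0):
        print(f"{rate / 1000:.0f}kHz: bin {bin_width_hz(rate):.2f}Hz, "
              f"floor {floor_shift_db(rate):+.1f}dB")
```

Any rise well beyond this few-dB shift, such as the hump above 48kHz, cannot be explained by the analysis alone.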
So back to 88.2kHz and 96kHz for recording LPs. One benefit of using 2x rate, even when targeting a CD, is that this way the ADC's anti-alias filter cuts at 44.1kHz or 48kHz, and not at 22 or 24kHz. ADC filters are part of the silicon chip, and thus are often made for economy rather than performance. Having them out of harm's way enables one to use a superior software-based resampler when later converting the data to 44.1kHz for release on disc.
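A software resampler for the 2:1 case can be sketched in a few lines. The following is a minimal illustration, assuming a Blackman-windowed sinc filter of 255 taps with its cutoff at 21kHz. These parameters and all function names are assumptions made for the example; a production resampler would be considerably more refined.

```python
import math


def lowpass_taps(num_taps, cutoff):
    """Windowed-sinc low-pass FIR. `cutoff` is a fraction of the input
    sample rate (0..0.5); `num_taps` should be odd. The Blackman window
    is an illustrative choice only."""
    mid = (num_taps - 1) / 2.0
    taps = []
    for n in range(num_taps):
        x = n - mid
        if x == 0:
            ideal = 2.0 * cutoff
        else:
            ideal = math.sin(2.0 * math.pi * cutoff * x) / (math.pi * x)
        phase = 2.0 * math.pi * n / (num_taps - 1)
        window = 0.42 - 0.5 * math.cos(phase) + 0.08 * math.cos(2.0 * phase)
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]  # normalise to unity gain at DC


def decimate_by_two(samples, taps):
    """Low-pass filter, then keep every second output sample.
    The taps are symmetric, so no reversal is needed."""
    out = []
    for start in range(0, len(samples) - len(taps) + 1, 2):
        acc = 0.0
        for k, tap in enumerate(taps):
            acc += tap * samples[start + k]
        out.append(acc)
    return out


def tone(freq_hz, rate_hz, count):
    """Unit-amplitude sine test signal."""
    return [math.sin(2.0 * math.pi * freq_hz * n / rate_hz)
            for n in range(count)]


if __name__ == "__main__":
    rate = 88200.0
    taps = lowpass_taps(255, 21000.0 / rate)  # assumed cutoff of 21kHz
    for freq in (1000.0, 30000.0):
        peak = max(abs(v) for v in decimate_by_two(tone(freq, rate, 2000), taps))
        print(f"{freq / 1000:.0f}kHz tone: output peak {peak:.5f}")
```

The point of the design is that the filter length and window are free choices in software, whereas the on-chip filter is fixed by the ADC vendor.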
Another objective reason for choosing 2x is that most digital post-processing just works better in a wider frequency space. Minimum-phase digital equalizer software, as used in Audition, operates as a transform of its analogue counterpart. This yields something similar to the equivalent analogue filter, but with a subtle difference: in the discrete sampled domain half the sample frequency takes on the same properties that an infinitely high frequency has in the continuous domain. As a result the digital filter response deviates progressively with frequency from its analogue target function. This deviation can be compensated for in software, a method called filter warping, but this is not always done. Hence it is safe to assume that filters may work better at an elevated sample rate.
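The size of this deviation can be estimated under one assumption: that the transform in question is the bilinear transform, which is the common choice but is not named here. With that transform an analogue frequency f lands at (fs/pi) x arctan(pi x f / fs) in the digital filter, so that an infinitely high analogue frequency maps onto fs/2. The function names below are made up for this sketch.

```python
import math


def warped_frequency_hz(analogue_hz, sample_rate_hz):
    """Where a feature of the analogue prototype ends up after an
    uncompensated bilinear transform (assumed transform)."""
    return (sample_rate_hz / math.pi) * math.atan(
        math.pi * analogue_hz / sample_rate_hz)


def prewarped_frequency_hz(target_hz, sample_rate_hz):
    """Analogue design frequency that lands exactly on `target_hz`
    after the bilinear transform; this is the usual compensation."""
    return (sample_rate_hz / math.pi) * math.tan(
        math.pi * target_hz / sample_rate_hz)


if __name__ == "__main__":
    for rate in (44100.0, 88200.0):
        landed = warped_frequency_hz(15000.0, rate)
        print(f"fs = {rate / 1000:.1f}kHz: a 15kHz corner lands at "
              f"{landed:.0f}Hz")
```

The compensation only pins one frequency per filter to its target; the shape of the response around it still differs from the analogue curve, and less so at the higher rate.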
Yet another argument for using higher sampling rates is that any non-linear operation executed in the digital domain insidiously violates the sampling theorem: a non-linearity generates harmonics and intermodulation products, and some of these products exceed half the sampling rate and thus alias irrevocably back into the audible band. Avoiding this requires alias-aware processing software, which is rare. And so the second line of defense is, again, working at a higher sample rate. This is even advised when doing only seemingly linear operations, such as gain change and equalization. Indeed, in the quantised domain no operation is truly linear: rounding and truncation due to limited numerical accuracy are non-linear in nature and the errors so injected will be multiplied by aliasing. This phenomenon is often not known or understood by designers and users of signal processing software alike, but in my opinion it may well be one of the main reasons why modern digital productions are often thought to sound bad, and also why high-res recordings are subjectively preferred over standard resolution!
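The folding of harmonics can be made concrete with a little arithmetic. The sketch below assumes an ideal sampler, a 15kHz fundamental (an example value, not a measurement) and a 20kHz limit for the audible band, and lists which harmonics exceed half the sample rate and land back below that limit.

```python
AUDIBLE_LIMIT_HZ = 20000.0  # assumed upper edge of the audible band


def folded_frequency_hz(freq_hz, sample_rate_hz):
    """Frequency at which a component appears after ideal sampling."""
    f = freq_hz % sample_rate_hz
    if f > sample_rate_hz / 2.0:
        return sample_rate_hz - f
    return f


def audible_aliases(fundamental_hz, sample_rate_hz, max_order=5):
    """Harmonics (order, alias frequency) that lie above half the sample
    rate and fold back to below the audible limit."""
    found = []
    for order in range(2, max_order + 1):
        harmonic = order * fundamental_hz
        if harmonic <= sample_rate_hz / 2.0:
            continue  # still represented correctly, no aliasing
        alias = folded_frequency_hz(harmonic, sample_rate_hz)
        if alias < AUDIBLE_LIMIT_HZ:
            found.append((order, alias))
    return found


if __name__ == "__main__":
    for rate in (44100.0, 88200.0):
        print(f"fs = {rate / 1000:.1f}kHz:", audible_aliases(15000.0, rate))
```

At 44.1kHz all of the 2nd to 5th harmonics of this tone fold into the audible band; at 88.2kHz only the 5th does. The higher rate reduces the problem but does not remove it.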
This still leaves us one choice to make: 88.2k or 96k for recording? If the result is going to be a CD, a DVD-A, or a file (whether high-res or MP3) then stick to 88.2kHz as this allows for a near-transparent reduction to 44.1kHz, even when the sample rate convertor is not top-notch. This said, I would advise against DVD-A itself, as the format is essentially dead and the availability of future replay transports is not guaranteed. Further, once written it is near-impossible to extract the audio data again from a DVD-A, something that would come in handy in case the original files get lost.
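The arithmetic behind the preference for 88.2kHz when the target is 44.1kHz is easily checked. The sketch below reduces each conversion to its lowest integer ratio; the function name is made up for this example.

```python
from fractions import Fraction


def conversion_ratio(source_hz, target_hz):
    """Resampling ratio target/source in lowest terms.
    Both rates must be given as integers in Hz."""
    return Fraction(target_hz, source_hz)


if __name__ == "__main__":
    for source in (88200, 96000):
        ratio = conversion_ratio(source, 44100)
        print(f"{source}Hz -> 44100Hz: up by {ratio.numerator}, "
              f"down by {ratio.denominator}")
```

From 88.2kHz the conversion is a plain decimation by two. From 96kHz it is the ratio 147/320, which asks far more of the sample rate convertor.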
Am I questioning the analogue descent of the LP here?
Oh, yes. Nothing is what it seems.
You see, while many consider the record album the quintessential analogue medium, it all too often is not. Since the late 80s many an LP was recorded, mixed, or mastered via the digital domain, a feat often proudly proclaimed in the sleeve notes. We all know that.
But there is another, much more invisible source of digital LPs. LPs are necessarily cut with variable pitch grooving. Quiet parts are thus packed closer, and louder parts are spaced more widely to allow more stylus excursion. The cutting lathe controls this groove distance or pitch dynamically, with information from a preview copy of the sound that leads some hundreds of milliseconds before the cutter head sees the same signal. In the old days this preview was implemented on a dedicated replay tape machine equipped with two appropriately spaced sets of heads. Each head then fed an identical stack of processing gear (limiters, last-ditch equalizers, ...), the outputs of which were routed respectively to the groove pitch controller and to the cutter amplifiers. An excellent system, but also an expensive system, with its dedicated tape deck, associated alignment and maintenance chores, and its double complement of processing components.
Once digital audio delays became viable in the late 70s, the twin-head decks were thrown out. Now a standard replay tape machine delivered a direct signal to the pitch controller, and a delayed signal, via ADC, digital delay, and DAC, to the cutter head. So many LPs were pushed in and out of the digital domain even before they were cut into the lacquer, and this often without the artist or producer knowing (and if they knew: analogue sentimentality still had to be invented).
And if this is not enough then there is a third way for an LP to become digital. For large, world-wide releases it was customary to generate a number of analogue copies from the single master tape. These copies were then distributed to cutting houses and pressing plants all over the globe. Obviously, the deeper the generation of the distribution master copy, the lower the quality. So again, once cassette-based digital recorders became available, it became popular to make digital distribution masters, often using U-Matic based Sony PCM1610 and 1630 machines.
The early ADCs and DACs used in these processes were rather crude, with word lengths as low as 14 bit, and a sampling rate between 40 and 50kHz. As LPs cut through such ADC/DACs cannot contain any real information above 20-25kHz, it is quite pointless to archive them in a hi-res format.
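To put numbers on these limits: the sketch below uses the textbook formula for the signal-to-noise ratio of an ideal quantiser with a full-scale sine, and the half-sample-rate bandwidth limit. Real convertors of the period would have done worse than these ideal figures.

```python
import math


def ideal_snr_db(bits):
    """SNR of an ideal N-bit quantiser for a full-scale sine,
    i.e. about 6.02 * N + 1.76 dB."""
    return 20.0 * math.log10(2.0 ** bits) + 10.0 * math.log10(1.5)


def bandwidth_limit_hz(sample_rate_hz):
    """Highest frequency a sampled system can represent."""
    return sample_rate_hz / 2.0


if __name__ == "__main__":
    for bits in (14, 16):
        print(f"{bits} bits: ideal SNR {ideal_snr_db(bits):.1f}dB")
    for rate in (40000.0, 50000.0):
        print(f"fs = {rate / 1000:.0f}kHz: nothing above "
              f"{bandwidth_limit_hz(rate) / 1000:.0f}kHz")
```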
The provenance of an album is easy to check, given software with a good spectrum analyser, like Audition's. Record the LP at a sample rate of 88.2kHz or higher. Then load the file into the spectrum analysis software, select a few seconds' worth of busy-sounding music, and plot the averaged spectrum over that selection. A recording that has been standard-resolution digital at any point in its production flow, i.e. during multi-tracking, mixing, mastering, or cutting, will then show a distinctive ridge at about 20kHz, caused by the digital process's anti-alias filtering.
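For those who prefer a number to eyeballing the plot, the check can be sketched as follows. The code assumes the averaged spectrum has been exported as a list of (frequency in Hz, level in dB) pairs. The band edges, the 1kHz guard gap and the 20dB threshold are all assumptions chosen for illustration, not calibrated values.

```python
def band_mean_db(spectrum, low_hz, high_hz):
    """Mean level in dB of all spectrum points with low_hz <= f < high_hz."""
    levels = [db for freq, db in spectrum if low_hz <= freq < high_hz]
    if not levels:
        raise ValueError("no spectrum points in the requested band")
    return sum(levels) / len(levels)


def drop_across_edge_db(spectrum, edge_hz=20000.0, span_hz=2000.0,
                        guard_hz=1000.0):
    """Level just below the edge minus level just above it. A guard gap
    skips the transition band of a possible anti-alias filter."""
    below = band_mean_db(spectrum, edge_hz - span_hz, edge_hz)
    above = band_mean_db(spectrum, edge_hz + guard_hz,
                         edge_hz + guard_hz + span_hz)
    return below - above


def looks_digital(spectrum, threshold_db=20.0):
    """True when the drop across 20kHz exceeds the assumed threshold."""
    return drop_across_edge_db(spectrum) > threshold_db
```

A gentle treble roll-off gives a drop of a few dB over this span, whereas a brick-wall filter gives tens of dB. The threshold should be tuned on a few albums of known provenance before being trusted.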
A fully analogue recording isn't likely to show such a ridge, although this does not necessarily imply that the LP contains useful information above 20kHz: we have to remember humbly that the innate distortion levels of the LP format and of cartridges easily reach 10% in the treble, and thus any content in the 10-20kHz band gives rise to significant harmonic distortion products in the 20-40kHz band. These products are often mistaken for musical content and considered proof of vinyl's perceived superiority. But more often than not they aren't.
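The figures in this argument are quickly verified. The sketch below converts a distortion percentage to a level relative to the fundamental and shows where the second harmonics of a given band fall; the function names are made up for this example.

```python
import math


def distortion_level_db(percent):
    """Level of the distortion products relative to the fundamental."""
    return 20.0 * math.log10(percent / 100.0)


def harmonic_band_hz(low_hz, high_hz, order=2):
    """Band occupied by the harmonics of the given order."""
    return (order * low_hz, order * high_hz)


if __name__ == "__main__":
    print(f"10% distortion: {distortion_level_db(10.0):.0f}dB")
    print("2nd harmonics of 10-20kHz:", harmonic_band_hz(10000.0, 20000.0))
```

Products only 20dB below the treble content itself are plainly visible on a spectrum plot, which is why they are so easily taken for music.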
I refer readers wanting to learn more about the performance limits of LP cutting systems to the repair site www.etec.dk, which deals with modern Ortofon cutters. These were designed for 4-channel cutting at half-speed, and at standard speed have the best treble extension of any commercial cutter ever made, being almost flat to 25kHz. The more often used Neumann SX68 and SX74 systems fared less well.
© Copyright 2010 Werner Ogiers - werner@tnt-audio.com