In this installment we'll take a look at the toys required for finishing the job, taking the raw recording of an album as input.
Once you have the sound data as files on a computer, you are going to use a Digital Audio Workstation setup for processing them. A DAW is essentially a powerful computer linked to good-quality analogue to digital and digital to analogue convertors and a sound monitoring system, running audio editing software. Just as the studios are doing nowadays, only in our particular case with only 2 tracks (not 134), without external hardware acceleration boxes, and without MIDI instrument control.
Critical declicking and denoising call for a more detailed and upfront sound, as can be delivered by quality headphones. Finally, making tonal alterations, a very critical task, demands that the listening station closely mimics the actual system(s) the recordings are going to be played on. Ideally, this means doing the equalisation job in your actual listening room, although for most people this will seem to be too ambitious or disruptive.
I do all my work on a 2004 Pentium 4 PC with Windows XP, 1 GB RAM, and two primary 200GB hard drives (one for the operating system and installed software, the other purely for data). Regular maintenance and the judicious disabling of OS features have kept this machine in trim shape, and now six years after purchase it shows no lack of performance for all my duties (i.e. audio editing and CAE work).
In a cupboard, close to ground level sits another 500GB hard drive, aptly named The Vault. This is used for backups. The PC itself stands on top of my desk, and this brings its greatest weakness to the fore: it is a noisy old bugger, just too noisy for precision audio work, actually. For the time being I'll have to make do with it. As we are renovating a house I have the prospect of a custom-made office and silencing computers will be one of the items on the agenda there!
The audio interfacing on my computer is done with a '96kHz / 24 bit' Terratec Phase 26 USB box. I have written before about this thing. It shows a decidedly mediocre measured performance (far far away from 24 bit) and sounds just acceptable. I also have a somewhat better M-Audio FireWire Audiophile, but I still prefer the Terratec as 1) it is bus powered, the M-Audio requires a wart, 2) USB always boots faultlessly on my PC, FireWire always is a mess of troubles, and 3) it has exactly the right IO connectors in exactly the right spots. In the past I sidestepped the Phase 26's lacking sonics by sending out audio over Toslink to an Apogee MiniDAC, used as DAC and headphone preamp. But I liked the Apogee that much that I moved it over to the main system. So nowadays I listen to the bare Terratec once more. Its line output is sent to a DIY line/headphone amp (*), using a circuit that resembles a simplified Michell Orca. From this unit it goes to my AKG K-400 headphones and to a pair of Fostex PM0.4 active loudspeakers.
The AKGs are another source of frustration in the setup. While detailed they never sound truly musical, and even after more than 10 years in service I'm not familiar enough with them to accurately gauge tonality. It doesn't help that they are very chameleon-like with respect to their driving electronics ...
The Fostex speakers are cute and small enough not to clutter a desk top (oh, and they come in black, red, or white!). I used to drive them from a 10k passive attenuator, in which case they were cuddly and even dull. My line preamp fares much much better, making them quite musical, quite dynamic, quite wide-band, but still not nearly transparent or neutral enough for serious equalisation work. Yet, as computer speakers go these are very nice indeed for only little money and they handsomely outperform just about anything sold in computer and media stores.
Our new house will have a dedicated music room measuring 4 by 7 meters. This room will only contain audio gear, records, books, and sofas. It might be that I’ll add a near-ideal mastering setup to it, with a laptop with Toslink output piped into the Apogee DAC (10 meters of cable is no problem for optical). Then I’ll be able to work just like the big boys ...
(* Two years ago I felt I could improve the setup by buying a proper headphone amplifier. I splashed out for a Behringer AMP800, expecting not too much of it, but hoping to modify it for increased current drive and additional line outputs. In the end the Behringer sounded just awful and proved too hard to modify, so I dropped it and spent some time designing something decent...)
You can patch your workflow together with a zillion standalone software tools, but expect to gravitate to a single program that concentrates most or all of the processing functions. In my case that is Adobe Audition 1.5, originally born as the freeware CoolEdit program of the late 90s. I grew up with CoolEdit on a decrepit laptop, and graduated to the full product when I found that it really did all I expected from such a tool (which is a lot), and with good-to-excellent quality. Notable competitors are Sony Soundforge and Steinberg Cubase or Wavelab, to name three of them. Present-day Audition 3 seems to me to be bloatware, and I don’t even know for sure if it would run smoothly on my anno-2004 XP SP3 machine.
And then there is the widespread freeware Audacity.
Now I don’t like Audacity. Please don’t get me wrong. It is a wonderful example of quality freeware, it is still being developed after many years, hinting at a truly dedicated team behind it, and even numerically it gets a clean bill of health: I tried and couldn’t really fault its sample rate conversion and gain implementations, two telltale tests. But I find the user interface awkward, the spectrum analyser too coarse for precision measurements, and its whole concept of ’projects’ hard to grasp when everything should center around the sound file you just loaded. But again, it is totally free, and if you are not used to something else, like Audition, it will likely do the job and do it reasonably well, so please see my opinion as something very personal.
This said, before committing to any software you should characterise it: verify that its processing algorithms don’t add errors or distortion, and run a null test, proving that passing a sound file through it with no processing active (use zero settings) doesn’t alter the file. Or better still: perform two cancelling operations (like positive gain followed by negative gain) and compare the result file to the input file. They should be identical down to at least 24 bit.
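As a rough illustration of such a null test, the sketch below compares two WAV files sample by sample and reports the largest difference in LSBs. It is only a sketch under assumptions: the function names `read_pcm` and `null_test` are made up for this example, and it relies on Python's standard `wave` module, which reads integer PCM files, so the round-trip result would have to be saved as, say, 24 bit PCM rather than 32 bit float.

```python
import wave

def read_pcm(path):
    """Return ((channels, sample width, rate), integer samples) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        width = wav.getsampwidth()
        info = (wav.getnchannels(), width, wav.getframerate())
        raw = wav.readframes(wav.getnframes())
    # 8 bit WAV data is unsigned; wider PCM samples are signed little-endian.
    samples = [int.from_bytes(raw[i:i + width], "little", signed=(width > 1))
               for i in range(0, len(raw), width)]
    return info, samples

def null_test(path_a, path_b):
    """Largest absolute sample difference in LSBs; 0 means the files null perfectly."""
    info_a, a = read_pcm(path_a)
    info_b, b = read_pcm(path_b)
    if info_a != info_b or len(a) != len(b):
        raise ValueError("files differ in format or length")
    return max((abs(x - y) for x, y in zip(a, b)), default=0)

# Hypothetical usage, with file names chosen purely for illustration:
# residual = null_test("input_24bit.wav", "after_gain_up_and_down_24bit.wav")
```

A residual of 0 means the two files are bit-identical at the stored word length; anything larger deserves a closer look with a spectrum analyser.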
Digital audio signal processing consists of calculations on the samples. Machine calculations have a limited accuracy, meaning that alongside the desired result, they also introduce an error. The total of these errors then constitutes a component of distortion in the output signal, a loss of resolution, and this must be minimised. This appears to be an aspect of signal processing often overlooked by today's audio application software designers. But even when the software is up to snuff, there remains the responsibility of the end-user to apply it correctly and optimally.
I’ll give you a couple of examples. Adobe Audition 1.5 has two possible modes of internal operation, regardless of the actual format of the input files: 16 bit and 32 bit (single-precision floating point).
If you open a file recorded at more than 16 bits, then 32 bit mode is automatically entered. Good. But if you open a 16 bit file, say sourced from a CD or a download, then Audition selects 16 bit mode. Any subsequent processing is then done in this (limited) 16 bit signal space and errors will quickly accumulate.
This graph shows what happens when a pure 7kHz tone in 16 bits (dithered) is subjected to a 4dB gain increase executed in the 32 bit domain, followed by re-dithering to 16 bit. The result is entirely as expected, clean as a whistle without any added spuriae or distortion.
And this graph shows what happens when the same 16 bit signal is gain-processed in a 16 bit data space, i.e. with insufficient numerical accuracy: The truncation of sample values, in fact just a form of quantisation distortion, generates spurious harmonics and non-harmonic aliases all over the frequency band:
Now I admit that the above example is pathological and represents a worst-case scenario. With real music, or even with different test signals, the frequency components generated by non-linearity induced aliasing will be spread more evenly over the band, and this in a time-variant manner, so that there is less long-term correlation between the actual signal and this insidious form of distortion. But this should not be seen as a justification for allowing this kind of error, the more so as the solution to this problem is readily available, and cheap.
To avoid this, any processing should be done in the largest-possible signal space. In the case of Audition this is 32 bit mode. There are two ways of assuring this: 1) set the tool up so that it by default opens all files in 32 bit mode (Options > Settings > Data > Auto-convert all data to 32 bit upon opening), or 2) convert any loaded file manually to 32 bit (Edit > Convert sample type).
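The accumulation of errors in a 16 bit signal space can be made visible with a few lines of code. The following is a simplified sketch, not a model of Audition's internals: it assumes that "16 bit mode" means truncating every intermediate result back to an integer, it uses Python's double-precision floats as a stand-in for the 32 bit float mode, and the test tone is left undithered for brevity. All function names are invented for the example.

```python
import math

def tone_16bit(freq=7000.0, rate=44100, n=4410, amplitude=0.5):
    """A sine tone rounded to 16 bit integer sample values (undithered, for brevity)."""
    return [round(amplitude * 32767 * math.sin(2 * math.pi * freq * k / rate))
            for k in range(n)]

def gain_in_16bit(samples, gain_db):
    """Gain change whose result is truncated straight back to integers."""
    g = 10 ** (gain_db / 20)
    return [int(s * g) for s in samples]  # int() truncates toward zero

def gain_in_float(samples, gain_db):
    """Gain change that keeps the full floating-point result."""
    g = 10 ** (gain_db / 20)
    return [s * g for s in samples]

source = tone_16bit()
steps = [4.0, -4.0] * 3  # three up/down pairs that should cancel exactly

int_path = list(source)
float_path = [float(s) for s in source]
for db in steps:
    int_path = gain_in_16bit(int_path, db)
    float_path = gain_in_float(float_path, db)

# Only at the very end is the float path brought back to 16 bit integers.
float_result = [round(s) for s in float_path]

errors_int = sum(1 for a, b in zip(source, int_path) if a != b)
errors_float = sum(1 for a, b in zip(source, float_result) if a != b)
worst_int = max(abs(a - b) for a, b in zip(source, int_path))

print("samples changed, 16 bit path:", errors_int, "worst error (LSB):", worst_int)
print("samples changed, float path: ", errors_float)
```

In the integer path nearly every sample comes back changed, and the error grows with each extra pair of operations, whereas the float path returns the original sample values.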
Intermediate files should always be written in the 32 bit / float format. Only when finally targeting the delivery format (CD, DVD, FLAC, ...) is it allowed and required to dither to 16 bit and perhaps to 24 bit. In the latter case the music is self-dithering and there is no hard need to add to this: simple data truncation is a viable option.
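For readers who want to see what dithering to 16 bit does, here is a minimal sketch. It assumes triangular (TPDF) dither of plus or minus one LSB peak, which is one common choice; the article does not state which dither Audition applies, and the function name `dither_to_16bit` is invented for the example. The test signal is a constant at a quarter of an LSB, which plain rounding wipes out entirely.

```python
import math
import random

def dither_to_16bit(samples, seed=0):
    """Quantise floats in [-1.0, 1.0) to 16 bit integers with TPDF dither.

    TPDF dither is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    out = []
    for s in samples:
        scaled = s * 32768.0
        dither = rng.random() - rng.random()  # triangular PDF, +/- 1 LSB peak
        q = math.floor(scaled + dither + 0.5)
        out.append(max(-32768, min(32767, q)))  # clip to the 16 bit range
    return out

# A constant level of 0.25 LSB: far below what 16 bit can store directly.
quiet = [0.25 / 32768.0] * 20000

plain = [round(s * 32768.0) for s in quiet]  # rounding without dither
dithered = dither_to_16bit(quiet)

print("undithered values:", set(plain))
print("dithered average (LSB):", sum(dithered) / len(dithered))
```

Without dither the signal simply vanishes; with dither the output toggles between neighbouring codes and its average stays close to the original quarter LSB, at the price of a slightly higher noise floor.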
For the proper validation of processing a good spectrum analyser should be used. Again, I found Audition's to be the best for this sort of work. It is accurate, covers a wide dynamic range and frequency span, offers most kinds of window functions and generous FFT lengths, there is averaging over arbitrary music segment lengths, and it allows easy zooming in on any possible detail. Compared to this e.g. Audacity's is positively poor. In fact I have tested commercial spectrum analysers costing a multiple of Audition's price, but for this sort of application the latter still wins hands-down.
Validation involves generating suitable test signals. These are typically digital silence, single tones, and multiple tones, followed by submitting them to the relevant processing. After this the result files are scrutinised by spectral analysis for undesired spuriae, i.e. harmonics (from non-linearities), non-harmonic components (from intermodulation and aliasing), and increased noise floor. Any processing step worth its salt should be clean as a whistle on such tests. Only then do you know that the process is not mutilating the music signal.
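The principle of such a validation can be sketched in code, although this is no substitute for a proper spectrum analyser. The example below assumes a single 1kHz test tone and a deliberately non-linear "process" (a small cubic term, invented for the illustration), and measures the level at individual frequencies by correlating with a sine and cosine. It uses no window function, so the tone must complete a whole number of cycles in the analysed block. The names `level_dbfs`, `clean` and `distorted` are hypothetical.

```python
import math

RATE = 44100
N = 4410  # 0.1 s, so every multiple of 10 Hz fits a whole number of cycles

def sine(freq, amplitude=0.5, rate=RATE, n=N):
    """Generate a test tone as floats, full scale = 1.0."""
    return [amplitude * math.sin(2 * math.pi * freq * k / rate) for k in range(n)]

def level_dbfs(samples, freq, rate=RATE):
    """Level of one frequency component in dB relative to full scale."""
    n = len(samples)
    re = sum(s * math.cos(2 * math.pi * freq * k / rate) for k, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * freq * k / rate) for k, s in enumerate(samples))
    amplitude = 2 * math.hypot(re, im) / n
    return 20 * math.log10(max(amplitude, 1e-20))  # floor avoids log of zero

clean = sine(1000.0)
# Stand-in for a faulty processing step: a slight cubic non-linearity.
distorted = [s + 0.01 * s ** 3 for s in clean]

for name, signal in (("clean", clean), ("distorted", distorted)):
    print(name,
          "fundamental: %.1f dBFS" % level_dbfs(signal, 1000.0),
          "3rd harmonic: %.1f dBFS" % level_dbfs(signal, 3000.0))
```

The clean tone shows nothing measurable at 3kHz, while the non-linear version gives itself away with a third harmonic at roughly -70dBFS. A real check would scan the whole band and also look at the noise floor.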
Particular care must be taken with sample rate conversion (SRC). While the mathematical background of such operations is well established and a flawless implementation is perfectly feasible, many a commercial and freeware tool, even today, performs miserably in this respect. This is especially troublesome with downsampling, as this operation potentially destroys information (not so upsampling, which done wrong merely replicates information in the wrong frequency bands). Luckily there is a web-based initiative by a Canadian mastering house, see http://src.infinitewave.ca, to characterise available SRC software (and even some hardware) in a uniform and objective way.
Please visit this site and have a look around. It shows for a number of tools how they perform in the often-needed conversion of 96kHz to 44.1kHz. The aims of such a conversion are:
All of these performance aspects can easily be assessed using the SRC webpage. Let's have a look at a few interesting cases:
Adobe Audition 1.5, as configured for what is suggested to be the best possible performance: quality = '999' and pre-filter enabled. The frequency sweep graph shows a lack of aliases, which is excellent. But closer inspection of this plot (you may want to turn down the lights to have a good look) reveals a background that is not black (=silent), but dark blue. In this mode Audition's SRC injects a broad-band spurious signal, i.e. noise, into the output signal. If you were to zoom in on the filter's impulse response (using Audition itself) you'd see that the ringing response that is so typical of a Sinc(x) FIR filter is not strictly monotonic anymore at low amplitudes, visibly erring from ideal. Likewise, the 1kHz tone graph shows a rather high noise floor (the 'grass' extended along the frequency axis).
Now if we disable Audition's pre-filter, then the sweep graph shows a limited amount of aliasing, but, more importantly, a blacker and cleaner background. The aliasing will only impact on the highest treble (where the human ear is totally insensitive to pitch variation, incidentally), so is not too bad, while the cleaner background will be beneficial to the full music signal. Thus, this is the way Audition's SRC should be used, and not the 'better' full-option configuration.
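Where an insufficiently filtered component ends up after downsampling follows from simple folding arithmetic. The small sketch below computes the frequency at which a tone reappears when it survives into a lower sample rate; the function name `alias_frequency` is made up for the example, and it models only the folding itself, not any particular converter's filter.

```python
def alias_frequency(freq_hz, new_rate_hz):
    """Frequency at which a tone appears after sampling at new_rate_hz
    when no anti-alias filter has removed it."""
    folded = freq_hz % new_rate_hz
    if folded > new_rate_hz / 2:
        folded = new_rate_hz - folded  # mirror around the new Nyquist frequency
    return folded

# Content of a 96kHz file (up to 48kHz) converted to 44.1kHz:
for f in (10000.0, 23000.0, 30000.0, 47000.0):
    print("%5.0f Hz -> %5.0f Hz" % (f, alias_frequency(f, 44100.0)))
```

Components just above the new Nyquist frequency of 22.05kHz fold back to just below it, which is why a slightly leaky filter mainly affects the extreme treble. Components far above it land much lower in the band, so they need to be well suppressed.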
Now we'll look at freeware or cheapware audio editors. From left to right Audacity (best quality), Audacity (medium quality), and Goldwave 5.18.
Audacity in best mode is quite acceptable, similar to Audition with pre-filtering on. Medium quality shouldn't be used, and Goldwave's is just too bad for words ...
It gets even more interesting when we scrutinise a number of commercial tools that are or were widely used in the music production industry. From left to right Sony Soundforge 9, Pyramix 6.2.3, Pro Tools HD7.2. The latter is reasonable, but not excellent, and the former two are outright horrible. How much confidence does this give you in the other algorithms of these costly packages?
And when a CD sounds bad to your ears, are you still going to blame digital in principle, or might below-par mastering software be the culprit?
Now for something completely different. From left to right r8brain (freeware), SoX (freeware), iZotope (embedded in some new commercial tools, including Sony Soundforge 10).
All three attain near-perfection. iZotope is outright scary, especially when you look at the distortion residue. Oh, and both SoX and iZotope allow for trading transition band rolloff versus impulse response length and ringing and have linear phase, minimum phase, and intermediate phase filters. Just like the latest 'advanced' disc players from e.g. Meridian and Ayre. Enough said.
Another type of software I found indispensable is education. You may find it fun to visit some of the mastering fora on the internet, but the shortest route to knowledge, apart from doing stuff, is reading Bob Katz' book Mastering Audio - the Art and the Science. This is almost mandatory for everyone even remotely interested in the modern-day production and processing of music.
© Copyright 2010 Werner Ogiers - werner@tnt-audio.com - www.tnt-audio.com