Hacker News

I work in this industry, and I have produced what is almost certainly the highest-performance dither to date, which works through noise-shaped Benford Realness calculations: http://www.airwindows.com/not-just-another-dither/ I mention this to say that I can absolutely make 16 bit 'acceptable' or 'listenable', even for audiophiles. I do that for a living. And yet…
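
For anyone who hasn't met dither at all, here is a minimal sketch of plain TPDF (triangular-PDF) dither, the textbook baseline. This is emphatically NOT the noise-shaped dither described above; the function name and scaling are just my illustration:

```python
import numpy as np

def tpdf_dither_to_16bit(x, seed=0):
    """Quantize float samples in [-1, 1] to 16-bit ints with TPDF dither.

    Plain triangular-PDF dither: the sum of two uniform noises, each
    half an LSB wide, decorrelates the quantization error from the
    signal. The textbook baseline, not a noise-shaped dither.
    """
    rng = np.random.default_rng(seed)
    lsb = 1.0 / 32767.0
    noise = (rng.uniform(-0.5, 0.5, len(x)) +
             rng.uniform(-0.5, 0.5, len(x))) * lsb
    q = np.round((x + noise) * 32767.0)
    return np.clip(q, -32768, 32767).astype(np.int16)
```

The total error stays bounded by about 1.5 LSB per sample, but in exchange the error becomes benign noise rather than signal-correlated distortion.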

Monty is wrong. To cover the range of human listeners, the required specs are more like 19-21 bit resolution at 55-75K sampling, even going by very insensitive double blind testing (which is geared to indicate the PRESENCE of a difference between examples when one is there, and does NOT indicate or prove the ABSENCE of a difference with a comparable degree of confidence: treating a null result as proof of no difference is a horrible logical fallacy with real-world consequences).

Beyond this, there's pretty much no problem (unless you are doing further processing: I've established that quantization exists even in floating point, which a surprising number of audio DSP people seem not to understand. There's a tradeoff between the resolution used in typical audio sample values, and the ability of the exponent to cover values way outside what's required)
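
The floating-point point is easy to verify for yourself; a tiny numpy check of the quantization step (the ulp) around typical sample values:

```python
import numpy as np

# A float32 mantissa carries ~24 significant bits RELATIVE to the
# current exponent, so the quantization step depends on signal level:
step_near_full_scale = np.spacing(np.float32(1.0))    # 2**-23 ~ 1.19e-07
step_near_zero = np.spacing(np.float32(0.001))        # far finer

# An increment smaller than the local step simply vanishes:
x = np.float32(1.0) + np.float32(1e-8)
assert x == np.float32(1.0)   # quantization exists in floating point too
```

That is the tradeoff in one line: the exponent buys enormous range, but resolution at any given level is still finite and quantized.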

That said, it is absurd and annoying to strive so tirelessly to limit the format of audio data to EXACTLY the limits of human hearing and not an inch beyond. What the hell? I would happily double it just for comfort and assurance that nobody would ever possibly have an issue, no matter who they were. Suddenly audio data is so expensive that we can't allow formats to use bytes freely? That's the absurdity I speak of.

Our computers process things in 32-bit chunks (or indeed 64!). If you take great pains to snip away spare bits to where your audio data words are exactly 19 bits or something, the data will only be padded so it can be processed using general purpose computing. It is ludicrous to struggle heroically to limit audio workers and listeners to some word length below 32 bit for their own good, or to save space in a world where video is becoming capable of 1080p uncompressed raw capture. Moore's law left audio behind years ago, never to be troubled by audio's bandwidth requirements again.
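
The padding point is concrete: 24-bit PCM arrives packed three bytes per sample, and the first thing general-purpose code does is widen it to 32-bit ints anyway. A sketch, assuming little-endian packing as in WAV files:

```python
import numpy as np

# Three little-endian bytes per sample, as in 24-bit WAV data.
# Example values: +1, +8388607 (max positive), -8388608 (max negative).
raw = bytes([0x01, 0x00, 0x00,
             0xFF, 0xFF, 0x7F,
             0x00, 0x00, 0x80])

b = np.frombuffer(raw, dtype=np.uint8).reshape(-1, 3)
# Pad a zero LOW byte so each sample occupies the top 24 bits of an
# int32, then arithmetic-shift right to sign-extend:
padded = np.concatenate([np.zeros((len(b), 1), np.uint8), b], axis=1)
samples = padded.view('<i4').ravel() >> 8
```

The "compact" 24-bit words live as 32-bit ints from the moment a CPU touches them; the packing only ever saved storage, never processing.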

Sample rate is another issue: only very nearby or artificial sounds (or some percussion instruments, notably cymbals) contain large amounts of supersonic energy in the first place. However, sharp cutoffs are for synthesizers, not audio. Brickwall filters are garbage, technically awful, and raising the sample rate allows for completely different filter designs. Neil Young's ill-fated Pono took this route. I've got one and it sounds fantastic (and it's also a fine tool for getting digital audio into the analog domain in the studio: drive anything with a Pono and it's pretty much like using a live feed). I've driven powerful amplifiers running horn-loaded speakers, capable of astonishing dynamic range. Total lack of grain or any digital sonic signature, at any playback level.
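
The filter point can be made concrete with the standard Kaiser-window length estimate for a lowpass FIR; the 100 dB stopband and the exact band edges here are just my illustrative choices:

```python
import numpy as np

def kaiser_fir_length(atten_db, f_pass, f_stop, fs):
    """Standard Kaiser-window estimate of lowpass FIR length:
    N ~ (A - 7.95) / (2.285 * transition width in rad/sample)."""
    delta_w = 2.0 * np.pi * (f_stop - f_pass) / fs
    return int(np.ceil((atten_db - 7.95) / (2.285 * delta_w)))

# Brickwall at 44.1K: pass 20 kHz, stopped by Nyquist at 22.05 kHz.
taps_441 = kaiser_fir_length(100, 20_000, 22_050, 44_100)
# Gentle filter at 96K: same 20 kHz passband, over an octave to roll off.
taps_96 = kaiser_fir_length(100, 20_000, 48_000, 96_000)
```

The 44.1K brickwall needs several times the taps of the 96K filter for the same attenuation, and that long, sharp filter is exactly the kind with the ugly ringing behavior. The higher rate buys you the gentler design.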

My choice for sample rate at the extreme would be 96K, not 192K. Why? Because it's substantially beyond my own needs and it's established. I'm not dissing 192K, but I wouldn't go to war for it: as an output format, I would rather leave the super high sample rate stuff to DSD (which is qualitatively different from PCM audio in that the error in DSD is frequency-sensitive: more noise in the highs, progressively less as frequency drops).
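
That frequency-sensitive error is the signature of delta-sigma modulation. A toy first-order modulator in error-feedback form shows the principle (real DSD uses much higher-order modulators at 2.8 MHz and up; this is only a sketch):

```python
import numpy as np

def dsm_first_order(x):
    """1-bit first-order delta-sigma modulation, error-feedback form.

    The quantization error is highpass-shaped: pushed up toward high
    frequencies and away from the audio band."""
    err = 0.0
    y = np.empty(len(x))
    for i, s in enumerate(x):
        u = s + err          # add back the previous quantization error
        y[i] = 1.0 if u >= 0.0 else -1.0
        err = u - y[i]       # remember this step's error
    return y

# A constant 0.5 input becomes a 1-bit stream whose average converges
# to 0.5: the in-band content is right, the error lives up high.
bits = dsm_first_order(np.full(10_000, 0.5))
```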

Even with DSD, which is known to produce excessive supersonic noise even while sounding great, the scaremongering about IM distortion is foolish and wrong. If you have a playback system which is suffering from supersonic noise modulating the audio and harming it, I have three words you should be studying before trying to legislate against other people's use of high sample rates.

"Capacitor" and "Ferrite Choke".

Or, you could simply use an interconnect cable which has enough internal capacitance to tame your signal. If you have a playback system that's capable of being ruined just by 192K digital audio, your playback system is broken and it's wrong to blame that on the format. That would be very silly indeed.

I hope this has been expressed civilly: I am very angry with this attitude as expressed by Monty.



I will add that the concerns about transient timing are actually a fallacy: given correct reconstruction, sampling is more than capable of producing a high-frequency transient that crosses a given point at a given moment in time that is NOT simply defined by discrete sample instants. Reconstruction is key here, and no special technique is required: sampling and reconstruction alone will produce this 'analog' positioning of the transient anywhere along a stretch of time.
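
A quick way to convince yourself: sample a band-limited impulse whose true peak sits 0.3 of a sample period BETWEEN two sample instants, then reconstruct with the sinc kernel. A numpy sketch, with a truncated kernel and an arbitrary 0.3 offset of my choosing:

```python
import numpy as np

n = np.arange(-64, 65)    # sample instants, in sample periods
d = 0.3                   # true peak position: BETWEEN two samples
x = np.sinc(n - d)        # what the sampler captures

# Ideal (sinc) reconstruction on a grid 100x finer than the sample period:
t = np.arange(-2.0, 2.0, 0.01)
y = np.array([np.dot(x, np.sinc(tk - n)) for tk in t])

peak = t[np.argmax(y)]    # lands at ~0.3, not at any sample instant
```

No sample sits at t = 0.3, yet the reconstructed waveform peaks there: the transient's position is carried by the pattern of sample values, not by the sample grid.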

The accuracy is limited by the combination of sample rate AND word length: any alteration of the sample's value will also shift the position of the transient in time.

But since the 'timing' issue is a function of reconstruction, you can improve the 'timing' of transients at 44.1K by moving from 16 to 24 bit. The sample values will be a tiny bit more accurate, and that means the location of the reconstructed wave will be that much more time-accurate, since it's calculated using the known sample values as signposts.
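
To watch word length push timing around: take a band-limited transient (a sinc pulse peaking at t = 0.3 sample periods), quantize its samples to 16 and then 24 bits, and see how far a reconstructed zero crossing moves. A sketch with all particulars my own illustration (bisection locates the crossing):

```python
import numpy as np

n = np.arange(-64, 65)
x = np.sinc(n - 0.3)          # band-limited transient, peak at t = 0.3

def recon(samples, tk):
    """Sinc reconstruction at time tk (in sample periods)."""
    return float(np.dot(samples, np.sinc(tk - n)))

def crossing(samples, lo=1.2, hi=1.4):
    """Bisect for the zero crossing near t = 1.3 (falling edge)."""
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        if recon(samples, mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

x16 = np.round(x * 32767) / 32767        # 16-bit word length
x24 = np.round(x * 8388607) / 8388607    # 24-bit word length

c0 = crossing(x)
shift16 = abs(crossing(x16) - c0)   # crossing moved by quantization
shift24 = abs(crossing(x24) - c0)   # far smaller move at 24 bit
```

Both shifts are minuscule, but the 24-bit one is smaller by orders of magnitude: finer words place the reconstructed waveform, and therefore its timing, more precisely.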

Positioning of high frequency transients does not occur only at sample boundaries, so that alone isn't an argument for high sample rates. You can already place a transient anywhere between the sample boundaries, in any PCM digital audio system. The argument for higher sample rates is the use of less annoying filters, and to some extent the better handling of borderline-supersonic frequencies. For me, the gentler filters are by far the more important benefit, and I can take or leave the 'bug killing' super-highs. I don't find 'em that musical as a rule.



