
> In order to decimate a signal to 44.1 or 48 kHz, and preserve high-frequency content, high frequencies need to be phase-shifted.

Your understanding of the sampling theorem is incorrect. Sampling alone (not quantization, of course) is completely lossless below the critical frequency.

We demonstrated this very clearly near the end of the second primer video, at about 21 minutes in: http://www.xiph.org/video/vid2.shtml where we show a square wave being phase-shifted by tiny fractions of the inter-sample interval.




When you say

> In order to decimate a signal to 44.1 or 48 kHz, and preserve high-frequency content, high frequencies need to be phase-shifted.

What do you mean by high frequency? If you mean frequencies below but near the Nyquist frequency then no, there is no phase shift. If you mean at or above...

I'm struggling to avoid a blatant appeal to authority here, but your position is that the author of the Ogg Vorbis codec doesn't understand digital sampling, which is hard to believe.


> All signals with content entirely below the Nyquist frequency (half the sampling rate) are captured perfectly and completely by sampling.

Not even close to the truth. It captures the presence of the signal, but only through blind luck does it capture the correct amplitude of the signal. Why? The phase angle of the signal relative to the phase angle of the sampler actually matters as you approach the Nyquist frequency.

Why is this? You have a sine wave. You sample it once at the top, and once at the bottom, which is just in line with Nyquist. Perfect. This is because your sampling phase angle and your signal phase angle are aligned so that you get accurate amplitude. But what if you then shift the relative phase angles by 90 degrees? Suddenly your signal is gone completely! Because you happen to be sampling when the sine wave is crossing zero. This is the inaccurate amplitude I talked about earlier. The range of amplitudes is between 0 and the real amplitude.

But if you 2x oversample instead of sampling right at the Nyquist rate, what's the worst-case scenario? You sample at 45, 135, 225 and 315 degrees and don't quite capture the full amplitude of the signal. But at this point you're off by at most 30% or so. And of course it might be spot on, but the max error is about 30%.

Now what happens when you 4x oversample? The max possible error goes down again.
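
A quick numpy sketch of that worst case (numbers are illustrative only, and note it measures raw sample peaks, not what a proper sinc reconstruction would recover):

    import numpy as np

    def worst_case_sampled_peak(oversample):
        """Peak |sample| of a unit sine at f = fs / (2 * oversample),
        minimised over the relative phase of signal and sampler."""
        n = np.arange(1000)                        # sample indices
        phases = np.linspace(0, np.pi, 361)        # relative phases to try
        return min(np.max(np.abs(np.sin(np.pi * n / oversample + p)))
                   for p in phases)

    for k in (1, 2, 4):
        print(f"{k}x: worst-case peak = {worst_case_sampled_peak(k):.3f}")
    # 1x (exactly Nyquist): 0.000 -- the samples can miss the signal entirely
    # 2x: ~0.707 (about 30% low), 4x: ~0.924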

People who suggest that 192kHz audio is worse than 44.1kHz or 48kHz need further education in actual signal processing. Also, using equipment designed to properly low-pass the 192kHz DAC output (you don't really need to do this with 44.1!) would eliminate the theoretical problems outlined in the article.


> No idea of the technicals, other than sampling at double the audible frequency does - to this layman - seem logical it would give you a wave more diverged from reality as frequency increases - and at the absurd extreme a 22khz (inaudible to most) square wave if sampling at 44, 4 points to represent 11k etc. I don't see how it can be otherwise, just as interpolating and anti-aliasing is often right but cannot always correctly insert missing pixels - see the attempts to use AI to make hi-res versions of Mario and other 8-bit assets; some work, many don't. It was only an idle conjecture after the main point, which was meant to be about mastering, so I'll expand that a little.

This is not correct. If the input signal is entirely below Nyquist, the sampled values (ignoring quantisation of the levels, which is a different thing) exactly capture the signal, provided you appropriately reconstruct it. Now, a perfect reconstruction (or anti-alias) filter does not exist, which is one of the reasons why we use 44.1kHz or 48kHz instead of the 'ideal' 40kHz, but this reconstruction is extremely accurate, to the point that modern digital approaches will outright outperform analogue techniques.
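
For the curious, here's a minimal numpy sketch of that reconstruction (not Monty's code, just Whittaker-Shannon interpolation with a truncated sinc sum and arbitrary numbers):

    import numpy as np

    fs = 48_000                                      # sample rate (Hz)
    n = np.arange(-256, 256)                         # sample indices
    x = np.sin(2 * np.pi * 19_000 * n / fs + 0.3)    # a 19 kHz tone, arbitrary phase

    # Whittaker-Shannon: x(t) = sum_n x[n] * sinc(fs*t - n)
    t = np.linspace(-1e-3, 1e-3, 1001)               # dense "analogue" time grid (s)
    recon = np.array([np.sum(x * np.sinc(fs * tt - n)) for tt in t])
    truth = np.sin(2 * np.pi * 19_000 * t + 0.3)

    print(np.max(np.abs(recon - truth)))             # small; limited only by truncating the sum

Both amplitude and phase come back: sampling itself introduces no phase shift.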

The same is true for images, incidentally, but most media does not exceed the limits of human vision yet (nor do most display mediums make for very good reconstruction filters).

I highly recommend watching the video, it demonstrates this very well.


> sampling at 44.1 kHz prevents you from accurately preserving e.g. a sine tone at above ~6 kHz

This is mathematically false. A 6kHz or 8kHz or 10kHz or 20kHz signal absolutely can be perfectly preserved with a 44.1kHz sample rate. Not just kind of preserved, but perfectly preserved.
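
One way to convince yourself, sketched in numpy (the numbers here are mine): sample a 20 kHz tone at 44.1 kHz over a whole number of cycles and look at its DFT. Amplitude and phase come back exactly.

    import numpy as np

    fs, f, N = 44_100, 20_000, 441       # 441 samples hold exactly 200 cycles of 20 kHz
    n = np.arange(N)
    x = 0.8 * np.sin(2 * np.pi * f * n / fs + 0.5)

    X = np.fft.rfft(x) / (N / 2)         # scale so the bin magnitude equals amplitude
    k = round(f * N / fs)                # bin 200 <-> 20 kHz
    print(abs(X[k]))                     # 0.8: amplitude preserved
    print(np.angle(X[k]) + np.pi / 2)    # 0.5: phase preserved (sine = shifted cosine)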


If your argument is along the lines of what I think it is, then this is a fallacy. See "Sampling fallacies and misconceptions" in https://xiph.org/~xiphmont/demo/neil-young.html

"All signals with content entirely below the Nyquist frequency (half the sampling rate) are captured perfectly and completely by sampling; an infinite sampling rate is not required. Sampling doesn't affect frequency response or phase. The analog signal can be reconstructed losslessly, smoothly, and with the exact timing of the original analog signal."


> Anything above 44100hz is just headroom for mixing.

No, it's not; you're confusing sample rate with bit depth (how many bits are used per sample).

Once your sample rate is above twice the highest frequency in the signal (the Nyquist rate), you can represent the signal perfectly. Raising the sample rate further does absolutely nothing.

But the number of bits of resolution, the bit depth, is what actually determines how closely the digital waveform matches the original.
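
A toy numpy sketch of that (my numbers): quantize the same signal at different bit depths and watch the error floor move, per the well-known ~6 dB-per-bit rule.

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.uniform(-1, 1, 100_000)           # stand-in for a full-scale signal

    for bits in (8, 16, 24):
        step = 2.0 ** (1 - bits)              # quantizer step for a [-1, 1] range
        q = np.round(x / step) * step         # uniform quantization
        snr_db = 10 * np.log10(np.mean(x**2) / np.mean((x - q)**2))
        print(f"{bits}-bit: SNR ~ {snr_db:.1f} dB")   # ~6.02 dB per bit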


Please do watch & internally digest the explanatory videos at https://xiph.org/video/

They explain why you’re wrong in easily digestible terms, and how a 44.1kHz sample rate will accurately encode signals right up to the Nyquist limit. The second video is an end-to-end demo showing the process in action.


Yes, and also remember that as you approach the Nyquist frequency, the ability to encode phase is lost. People always talk about sampling capturing different frequencies, but they forget to think about phase.

>For example, suppose you have a pure sine wave, sample it with enough density to make it mathematically reproducible, then play back those quantized samples on a piezo making square waves - it sounds pretty good (and can be indistinguishable to most ears), but it is not the same waveform.

I'm pro analog, but the above is a common misconception, usually caused by bad "popular science" articles on digital reproduction and sampling, which show quantization as little pixelated waveforms and so on.

The waveform produced from a sampled digital signal recreates the original perfectly (for content below the Nyquist frequency); you can verify that on an oscilloscope.


> The high frequency information is gone

It's diminished in power, not gone.

It's only gone if it drops below the quantization threshold, which depends on the filter.
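
If it helps, here's a rough scipy sketch of what a (hypothetical, made-up) anti-alias low-pass does near its cutoff; content close to the band edge is attenuated, not deleted:

    import numpy as np
    from scipy import signal

    # a made-up 8th-order Butterworth low-pass at 20 kHz in a 44.1 kHz system
    sos = signal.butter(8, 20_000, fs=44_100, output='sos')
    w, h = signal.sosfreqz(sos, worN=4096, fs=44_100)

    for f in (10_000, 18_000, 20_000, 21_000):
        k = np.argmin(np.abs(w - f))
        print(f"{f} Hz: {20 * np.log10(abs(h[k])):6.1f} dB")
    # the rolloff is gradual; a steeper filter narrows the transition band,
    # at the cost of a more complex design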


> Exactly. I wrote this explanation, but you beat me to it, so I'll just post it here:

> The real reason we use I/Q sampling is because we want to frequency-shift a signal.

> Why do we want to frequency-shift a signal? In radio frequency applications the signal of interest almost always has a much lower bandwidth than its highest frequency. In other words, the signal has a small bandwidth (say 40 MHz) centered around a high center-frequency (say 2.4 GHz). If we want to digitize the signal, then one way would be to use a very high sample-rate ADC (e.g. a 2.4 GHz ADC). But these are very expensive, and a much better way of digitizing the signal is to use a mixer (a frequency shifter) to shift the signal to be centered around 0 Hz and then use a relatively low sample-rate ADC (e.g. a 40 MHz ADC).

> The way frequency shifting is done is by multiplying the signal by a sine signal, which can be done in hardware. But this introduces a distortion to the signal, because multiplying by a sine is not actually a frequency shift. It just so happens that this distortion is cancelled out by adding another copy of the signal multiplied with another sine delayed by 90°. But this addition needs to be complex (due to the relationship between sine functions and true frequency shifts), so what we do is sample the two distorted signals and do this complex addition with the digital signals.

I'm not sure I understand you correctly, but I would not say you distort the signal when you multiply it with a sine wave. Essentially you create two frequency components, the sum and difference frequencies (f1+f2, f1-f2). Now, if f1 is your modulated signal (so some f1+fmod, where fmod is a band and can be positive or negative) and you want to convert to baseband, you select f2 to sit at the carrier (f1=f2); then you generate a baseband signal at zero carrier frequency, plus a signal at 2xf1 that is usually outside your detector bandwidth and so goes undetected. However, this process only gives you half of the frequencies of your fmod; to get the other half you need to multiply with cos(f2), which essentially gives you the component that was at 2xf1, now at baseband. To handle this more elegantly in the math, you add the two components up as real and imaginary parts, which lets you drop the cos/sin(f1) terms from your equations.
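
A scaled-down numpy/scipy sketch of that sum-and-difference picture (all frequencies made up): a tone just above the carrier mixes down to +fmod, and it's the quadrature (sin) arm that tells +fmod apart from -fmod.

    import numpy as np
    from scipy import signal

    fs, fc, fm = 1_000_000, 200_000, 10_000
    t = np.arange(200_000) / fs
    rf = np.cos(2 * np.pi * (fc + fm) * t)      # a tone 10 kHz ABOVE the carrier

    i = rf * np.cos(2 * np.pi * fc * t)         # in-phase mix
    q = rf * -np.sin(2 * np.pi * fc * t)        # quadrature mix, 90 degrees later
    sos = signal.butter(6, 50_000, fs=fs, output='sos')   # keep only f1-f2 terms
    z = signal.sosfiltfilt(sos, i) + 1j * signal.sosfiltfilt(sos, q)

    # z ~ 0.5 * exp(+j*2*pi*fm*t): the direction of rotation survives
    est = np.angle(z[10_001] / z[10_000]) * fs / (2 * np.pi)
    print(est)    # ~ +10000; a tone 10 kHz BELOW the carrier would give ~ -10000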


My favorite video that drives this point home: http://xiph.org/video/vid2.shtml

Using actual signal generators and oscilloscopes, this guy proves by experiment everything you said. Goes into differences between sampling rate and quantization depth as well. Well worth the watch.


> Also, contrary to popular belief, and simplified "layman" posts, digital doesn't produce "square-ized" versions of waveforms due to quantization.

I've known musicians who think this way! So it's not just laymen.

Anyone who thinks this should look up the Nyquist–Shannon Theorem.


> people have been trying to duplicate that in digital for decades, and failed.

Because decades ago we simply didn't have the processing power to do that kind of DSP.

Today, however, you carry that kind of processing power in your pocket.

It used to be really hard, because of (amongst other things) the oversampling required to avoid digital aliasing. This is not really a big deal anymore. It also happens to be the kind of calculation that parallelises very nicely.

I can really recommend this video to dispel the myths about what we supposedly can't do today in the digital domain: http://xiph.org/video/vid2.shtml (the guy also presents the matter really well; I found it very enjoyable to watch, even if I already knew over half the things he explains). To really make his case, he feeds signals to both a digital and an analogue frequency analyser.


> What happens when you start combining waves together to create more complex signals

Aliasing is a linear effect. If L is an operator that aliases at some frequency, L(Aa(t) + Bb(t)) = AL(a(t)) + BL(b(t)). Aliasing is in fact multiplicative! Any time you wish to understand aliasing, you really want to write the signal multiplied by a Dirac comb (a train of equally spaced, infinitely narrow pulses). If you wish to understand finite (non-zero) sampling widths, it's effectively convolution with the shape of the sampling pulse... which looks like low-pass filtering. Ya dig? This is why the bandwidth on your scope can be stupidly above the sampling rate. Why does my scope have a bazillion GHz bandwidth when it only samples at 4 samples per week? It's because it's real good at getting you that signal, aliased down into the sampled signal.
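
The superposition claim is easy to check numerically; here plain decimation stands in for the aliasing operator L (a sketch, not a proof):

    import numpy as np

    rng = np.random.default_rng(1)
    a = np.sin(0.37 * np.arange(10_000))    # arbitrary content
    b = rng.standard_normal(10_000)

    def L(x, m=7):                          # crude aliasing operator: keep every m-th sample
        return x[::m]

    A, B = 2.5, -1.3
    print(np.allclose(L(A * a + B * b), A * L(a) + B * L(b)))   # True: aliasing is linear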

> what effect does aliasing really have on an audio signal

PRECISELY the same effect it does on video, except in one dimension, and you haven't understood aliasing until you've understood it in one dimension.

> Is it possible to show that such information either isn't lost

It's lost. Completely... if done properly. First you need to understand Whittaker-Shannon interpolation. The result of which is that a CONTINUOUS BAND-LIMITED signal is PERFECTLY represented by a train of discrete samples. This is WEIRD. Really weird. Prove it to yourself. It's VERY important in sampling. Say you then perfectly sample a signal with harmonic content ALL over the place. EVERYTHING gets mirrored down into your view, if the Dirac comb has infinitely narrow pulses. So at the least you lose frequency information! You only know the original frequencies modulo the Nyquist frequency! If you sample CORRECTLY, you low-pass filter your incoming signal so there's NOTHING above the Nyquist frequency; then you have a perfect digital representation of your original signal.
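
You can watch that "modulo Nyquist" folding directly; a small numpy sketch (made-up numbers):

    import numpy as np

    fs, N = 1000, 1000
    n = np.arange(N)
    for f in (130, 870, 1130, 1870):        # all congruent to +/-130 mod 1000
        x = np.cos(2 * np.pi * f * n / fs)
        peak = np.argmax(np.abs(np.fft.rfft(x)))
        print(f"a {f} Hz tone sampled at {fs} Hz looks like {peak} Hz")   # 130 every time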

If you wanna understand signal processing, you really can't use your intuition. Practice understanding things in the frequency domain and time domain, and generally there's only one proper basis to understand a phenomenon. Filtering is best understood in the frequency domain, for instance. If you try to understand filtering in the time domain, you might begin to think that it bounds the speed at which the signal can vary but that's not a useful intuition. Aliasing is best understood in the frequency domain, since it moves stuff down in frequency mod the nyquist frequency. If you try to build an intuition in the time domain, you'll think that there's some signal lost when you multiply by zero between pulses... and you'll guess at what it was and think "distortion" or whatever but what's lost is just WHERE each frequency is... so be very careful about what basis you choose!


You misunderstood. I was relaying that I, too, have a decent grasp of sampling theory, and that if I understand the article's main premise, then saying they have disproven Nyquist's Theorem with this example is akin to saying that being able to draw both 30 degree and 90 degree angles with a compass disproves the idea that angle trisection is generally impossible.

They have simply shifted in time the sampling of the noise. Even if they argue they synthesize the noise in realtime, they had to have sampled it at some point to know what to synthesize in the first place. It was at this point that they were required to sample at at least 2MHz to accurately quantify their noise at 1MHz.

The mechanism used in pilots' noise-cancelling headphones is much the same. They don't sample the engine noise as it's being made; they synthesize it from known frequencies.


> according to the Nyquist-Shannon Sampling Theorem, the sampling rate must be at least twice the duration of the shortest burst in order to accurately reflect the intensity of a burst.

This statement is a bit sloppy. I think what the author is trying to say is the sampling rate must be such that it samples the shortest burst at least twice. Equivalently you could say that the sampling period should be no longer than half the length of the shortest burst.

This would accurately reflect the fundamental frequency of the burst, but if the burst is not sinusoidal you'll get aliasing. The effect of which will be to falsely indicate lower-frequency bursts that aren't actually present.
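
A short numpy illustration of that false-low-frequency effect (mine, with arbitrary numbers): sample a square wave with no band-limiting and its folded harmonics show up at frequencies the original never contained.

    import numpy as np

    fs, N, f0 = 1000, 1000, 60
    n = np.arange(N)
    square = np.sign(np.sin(2 * np.pi * f0 * n / fs + 0.1))   # sampled, NOT band-limited

    spec = np.abs(np.fft.rfft(square)) / (N / 2)
    print(np.flatnonzero(spec > 0.05))
    # only 60, 180, 300 and 420 Hz are genuine (odd harmonics below Nyquist);
    # the other strong bins are higher harmonics folded down to new, false frequencies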


> Well, in that video he is reproducing inputs with only modest frequency complexity.

Monty is making them with a signal generator, so that you can easily see what's going on and (if you have suitable gear) you can reproduce the results. The exact same phenomenon would happen for some hypothetical complicated analogue signal. Notice that Monty also demonstrates a square wave, which is in fact the maximum possible "frequency complexity" because of how audio is defined, even though chances are you're still thinking about it as very simple.

> Won't the compute burden depend on the frequency domain complexity of the input signal?

No, it's just resampling, the same way you would to go from say 44.1kHz (from a CD) to 48kHz (typical modern fixed rate DAC in a cheap PC), except you're maybe going from 40 samples to 1000 pixels if you've zoomed in that far. It'd use a windowed sinc function.
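
If anyone wants to see what that looks like, here's a toy windowed-sinc interpolator in numpy (a sketch for increasing the rate only; a real resampler would also low-pass when reducing it):

    import numpy as np

    def resample_windowed_sinc(x, ratio, half_width=32):
        """Evaluate x (at rate 1) at times m/ratio using a Hann-windowed sinc."""
        y = np.empty(round(len(x) * ratio))
        for m in range(len(y)):
            t = m / ratio                           # position in input samples
            n0 = int(np.floor(t))
            n = np.arange(n0 - half_width + 1, n0 + half_width + 1)
            n = n[(n >= 0) & (n < len(x))]          # crude edge handling
            w = 0.5 * (1 + np.cos(np.pi * (t - n) / half_width))   # Hann window
            y[m] = np.sum(x[n] * np.sinc(t - n) * w)
        return y

    x = np.sin(2 * np.pi * 0.13 * np.arange(441))   # a tone at 0.13x the input rate
    y = resample_windowed_sinc(x, 480 / 441)        # the 44.1 kHz -> 48 kHz ratio
    print(len(x), len(y))                           # 441 -> 480 samples

The compute cost per output sample is fixed by the window width, not by how complicated the signal is, which is the point above.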


This article really misses the facts of the Nyquist-Shannon theorem.

In order to decimate a signal to 44.1 or 48 kHz, and preserve high-frequency content, high frequencies need to be phase-shifted.

This phase-shift is similar to how lossy codecs work.

For what it's worth: I'm a big fan of music in surround, and most of it comes in high sampling rates. When I investigated ripping my DVD-As and Blu-rays, I found that they never have music over 20 kHz. It's all filtered out. However, downsampling to 44.1 or 48 kHz isn't "lossless" because of the phase shift needed due to the Nyquist-Shannon theorem.

I still rip my DVD-As at 48 kHz, though. There isn't a good lossless codec that can preserve phase at high frequencies, yet approach the bitrate of 12/48 FLAC.

