This post continues from the previous one, which introduced sampling and sample rate and is linked here if you would like to refresh your memory. This one discusses the bit depth of digital audio, which is the number of bits used to represent each sample. The more bits, the larger the set of values that can be used to describe the amplitude of a sample, and the more accurately a signal can be sampled. This affects the dynamic range and the signal-to-noise ratio (SNR). The dynamic range is the difference between the largest and smallest possible amplitudes the signal can represent. The SNR is a little different and will be explained later in the post. Increasing the bit depth increases both of these measurements.
The amplitude of an analogue signal exists on a continuum, so there can be an infinite number of possible amplitudes. Digital signals have a discrete number of possible amplitudes, determined by the number of bits. In the digitisation process, the analogue signal amplitude will almost never exactly match a possible digital amplitude, so it must be quantized. This means that the amplitude is rounded to the nearest discrete digital value.
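The rounding step can be sketched in a few lines of JavaScript. This is only an illustration, assuming samples are floats in the range -1 to 1 (as they are in the Web Audio API) and that the discrete values are laid out like signed integer PCM. The function name `quantize` is made up for this example.

```javascript
// Round a sample in [-1, 1] to the nearest of 2^bits discrete values.
// Assumption: levels are spaced like signed integers, from -2^(bits-1)
// up to 2^(bits-1) - 1, so the top level sits one step below +1.
function quantize(sample, bits) {
  const scale = 2 ** (bits - 1);             // number of steps between 0 and 1
  const level = Math.round(sample * scale);  // nearest integer level
  const clamped = Math.max(-scale, Math.min(scale - 1, level));
  return clamped / scale;                    // back to the -1..1 range
}

// With 3 bits there are only 8 possible values, 0.25 apart.
console.log(quantize(0.3, 3));  // 0.25
console.log(quantize(0.3, 16)); // very close to 0.3
```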
Bit Reduction Effect
This demonstration is an audio effect for reducing bit depth. As with the sample rate, it is not possible to actually adjust the bit depth in the Web Audio API so this effect only simulates bit reduction.
Below is the code for the effect processor. You can find the rest of the code at the bottom of this page or on GitHub here.
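As a rough illustration of the idea (a sketch under stated assumptions, not necessarily the code used in the demonstration), a bit reduction step over a block of samples might look like this. It assumes the samples arrive as a `Float32Array` in the range -1 to 1, which is how the Web Audio API hands audio to a processing callback. The function name `reduceBitDepth` is hypothetical.

```javascript
// Simulate a lower bit depth on a block of Web Audio style samples.
// The samples stay 32-bit floats; only their values are snapped to a
// coarser grid, which is why this is a simulation rather than a true
// change of bit depth.
function reduceBitDepth(input, bits) {
  const scale = 2 ** (bits - 1);
  const output = new Float32Array(input.length);
  for (let i = 0; i < input.length; i++) {
    const level = Math.round(input[i] * scale);
    output[i] = Math.max(-scale, Math.min(scale - 1, level)) / scale;
  }
  return output;
}

// One cycle of a sine wave, 16 samples long, reduced to 3 bits.
const cycle = Float32Array.from({ length: 16 }, (_, n) =>
  Math.sin((2 * Math.PI * n) / 16)
);
console.log(Array.from(reduceBitDepth(cycle, 3)));
```

In a browser, a function like this would be called on each block of samples inside the audio processing callback before the block is written to the output.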
Bit Reduction Demonstration
This visualiser has the bit depth reduction processor attached. As usual, the red wave is the input and the blue is the processed output. This time a yellow wave has been added and the reason for this will be revealed shortly.
Clicking or touching at different heights on the visualiser changes the output bit depth. At the top the output is 16 bit, decreasing to 3 bit at the bottom.
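A control like this could be wired up along the following lines. This is a guess at the mapping, not the demo's actual code: it assumes the bit depths are spread evenly over the height of the visualiser, and `bitDepthFromY` is a hypothetical helper.

```javascript
// Map a vertical position inside the visualiser to a bit depth.
// y = 0 is the top edge (16 bit), y = height is the bottom edge (3 bit).
// Assumption: the steps are spread evenly over the height.
function bitDepthFromY(y, height, maxBits = 16, minBits = 3) {
  const fraction = Math.min(1, Math.max(0, y / height)); // 0 at top, 1 at bottom
  return Math.round(maxBits - fraction * (maxBits - minBits));
}

console.log(bitDepthFromY(0, 300));   // 16
console.log(bitDepthFromY(300, 300)); // 3
```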
There are also a couple of extra controls. The check box enables the sample rate reduction effect so you can use the bit reduction and sample rate reduction effects together. The sample rate reduction effect is controlled using the y-axis of the visualiser, as described in the previous post.
The output bit depth is displayed next to the new controls.
Bit Reduction Outcome
Reducing the bit depth introduces artefacts to the output which sound like noise. Noise will be discussed in a later post but, for now, let’s just say that it sounds like a radio that is not tuned in to a station properly (if you can remember what that sounded like) or like rain hitting the ground. Noise is generated with random sample amplitudes.
Referring back to the sum of sines post, it makes sense that the additional artefacts should sound like noise. We can think of the output signal as a composite of the original signal plus an additional (somewhat random) signal made of the difference between the original signal and the quantized signal. The yellow wave on the visualiser shows this additional signal, which is known as the quantization error. It is very much like noise because the amount by which each sample needs to be rounded is seemingly random. This quantization error is the ‘noise’ in the SNR. As with the dynamic range, the ‘signal’ part of the SNR is the maximum possible amplitude.
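To make the relationship concrete, the sketch below computes the quantization error for a sine wave and estimates the SNR at a few bit depths. It rests on several assumptions that are not from the demo: a 440Hz full-scale sine at a 48kHz sample rate, samples in the range -1 to 1, and SNR measured as the RMS of the signal over the RMS of the error, expressed in decibels. All function names are made up for the example.

```javascript
// Round a sample in [-1, 1] to the nearest of 2^bits discrete values.
function quantize(sample, bits) {
  const scale = 2 ** (bits - 1);
  const level = Math.round(sample * scale);
  return Math.max(-scale, Math.min(scale - 1, level)) / scale;
}

// Quantization error: the quantized signal minus the original.
// Adding it back to the original gives the quantized output.
function quantizationError(samples, bits) {
  return samples.map((s) => quantize(s, bits) - s);
}

// Root mean square, a measure of the average level of a signal.
function rms(values) {
  return Math.sqrt(values.reduce((sum, v) => sum + v * v, 0) / values.length);
}

// Estimated SNR in decibels for one second of a full-scale sine wave.
function measuredSnrDb(bits, sampleRate = 48000, frequency = 440) {
  const signal = Array.from({ length: sampleRate }, (_, n) =>
    Math.sin((2 * Math.PI * frequency * n) / sampleRate)
  );
  return 20 * Math.log10(rms(signal) / rms(quantizationError(signal, bits)));
}

for (const bits of [4, 8, 16]) {
  console.log(`${bits} bit: ${measuredSnrDb(bits).toFixed(1)} dB`);
}
```

Each extra bit halves the size of the rounding step, so the printed figure should rise by roughly 6 dB per bit.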
The quantization error is not always random, though. Listen to the 300Hz sine wave at 4 bits and you will notice that there is no noise introduced, just other frequencies. When the quantized signal is a repeated pattern, the quantization error also becomes a repeated pattern, and repeated patterns are heard as frequencies rather than noise. You can see the repeated pattern in the shape of the yellow wave. In general, the lower bit depths start to introduce more repeated patterns, and several samples in a row might be quantized to the same discrete value.
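The repetition can be checked numerically. The sketch below assumes a 48kHz sample rate (not stated in the post), at which a 300Hz sine repeats exactly every 160 samples, and compares the quantization error in one cycle with the error in the next. The helper names are hypothetical.

```javascript
// Round a sample in [-1, 1] to the nearest of 2^bits discrete values.
function quantize(sample, bits) {
  const scale = 2 ** (bits - 1);
  const level = Math.round(sample * scale);
  return Math.max(-scale, Math.min(scale - 1, level)) / scale;
}

// Quantization error for sample n of a sine wave.
function errorAt(n, bits, frequency = 300, sampleRate = 48000) {
  const s = Math.sin((2 * Math.PI * frequency * n) / sampleRate);
  return quantize(s, bits) - s;
}

// A 300Hz sine at 48kHz repeats every 160 samples, so the error does too.
const period = 48000 / 300;
let largestDifference = 0;
for (let n = 0; n < period; n++) {
  const diff = Math.abs(errorAt(n, 4) - errorAt(n + period, 4));
  largestDifference = Math.max(largestDifference, diff);
}
console.log(largestDifference); // effectively zero apart from floating point rounding
```

Because the error signal is itself periodic, its energy lands on specific frequencies related to the 300Hz tone instead of being spread out like noise.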
Listen to each of the audio files from the drop-down menu to hear how the bit depth reduction and dithering affect them.
Dithering is the process of adding or subtracting a small amount of random noise to the sample amplitude before quantization. The new (unlabeled) range input on the visualiser controls the dither. It ranges from zero noise up to the interval between two adjacent discrete values.
The dithering is most effective at around the midpoint of the range. This would equate to +/- half an interval of noise.
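A minimal sketch of dithering before quantization is shown below. It is not the demo's code: it assumes the noise is uniformly distributed, that samples are in the range -1 to 1, and that the dither control runs from 0 to 1. The `random` parameter is an addition of this sketch so the noise source can be swapped out, and the function name is hypothetical.

```javascript
// Quantize with optional dither.
// `dither` runs from 0 (no noise) to 1 (noise of up to +/- one whole
// interval), so 0.5 gives the +/- half an interval described above.
// `random` must return values in [0, 1), like Math.random.
function quantizeWithDither(sample, bits, dither = 0.5, random = Math.random) {
  const scale = 2 ** (bits - 1);
  const noise = (random() * 2 - 1) * dither; // in units of one interval
  const level = Math.round(sample * scale + noise);
  return Math.max(-scale, Math.min(scale - 1, level)) / scale;
}

// 0.3 sits between the 3-bit values 0.25 and 0.5. Without dither it always
// rounds to 0.25. With dither it sometimes rounds up instead, so the
// average of many outputs moves towards 0.3.
let total = 0;
const runs = 10000;
for (let i = 0; i < runs; i++) {
  total += quantizeWithDither(0.3, 3);
}
console.log(total / runs);
```

Each individual output is noisier than before, but on average the outputs track the original amplitude, which fits with the dithered audio staying recognisable.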
You should hear that the level of noise present in the output increases with the dithering, but the original audio is much more recognisable and there are few or no artefacts introduced other than the noise. So in this case, randomly distorting the signal actually improves the sample representation.
At 2 bits the audio is almost impossible to make out; this is most apparent on the speech clips. When the signal is dithered it is still possible to hear and understand. The trade-off introduced by dithering is an increase in noise in exchange for a more comprehensible signal.
Bit Depth Comparison
When comparing bit depths it will be helpful to use examples of well-known computers with different bit depths.
The Commodore 64 had 4-bit audio. 4 bits can represent 16 values. Consoles like the NES and the SEGA Master System had 8-bit audio, which can represent 256 values.
Below is a wave that is 256 pixels tall from peak to peak. You can see it is quite an accurate representation of a sinusoidal wave as it is not really pixellated. If we say that an average monitor is 30cm tall and has 1080 px in height (1080p) then the 8-bit wave is about 7.1cm tall.
These are just visual representations of the waves but they should give you an idea about the difference in fidelity between some common bit depths.
Next are the 16-bit systems. The Sega Mega Drive and Super NES fall into this category. The maximum peak to peak wave using the same scale as above (1080 pixels/30cm tall) would be the size below.
Next, we have 24-bit, which is what most PCs support today. Below is the size of the wave on the same scale.
Some high-end sound cards support 32-bit. Again, the maximum size wave on the same scale is below.
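The sizes on this scale are easy to work out. The sketch below uses the figures from the text (one pixel per discrete value, on a 1080 px monitor that is 30cm tall); the function name `waveHeightCm` is made up for the example.

```javascript
// On-screen height of a full-range wave drawn at one pixel per discrete
// value, using the scale from the text: 1080 px over 30 cm.
function waveHeightCm(bits, screenPx = 1080, screenCm = 30) {
  const levels = 2 ** bits; // number of discrete values at this bit depth
  return (levels * screenCm) / screenPx;
}

for (const bits of [4, 8, 16, 24, 32]) {
  const cm = waveHeightCm(bits);
  console.log(
    `${bits}-bit: ${2 ** bits} values, ${cm.toFixed(1)} cm (${(cm / 100).toFixed(1)} m)`
  );
}
```

On this scale a 16-bit wave is roughly 18 m tall, a 24-bit wave is about 4.7 km and a 32-bit wave is nearly 1,200 km.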