- How does bit depth affect audio quality and other recording parameters?
- What bit depth is best for recording, editing, and mastering?
- Which bit depth sounds best (You’ll be surprised – bigger is not always better)?
- Also, check out our guides to sample rate vs bit depth and 48 vs 96kHz sample rates
You can think of bit depth almost as a measuring tool for digital storage. Simply put, the bit depth refers to the amount of digital information in an audio sample. As the bit depth increases, there is also an increase in the accuracy of the digital representation of an analog sound wave.
Additionally, the dynamic range of a recording increases as the bit depth increases, resulting in a lower noise floor.
Although a higher bit depth is more accurate in its computerized form, the human ear cannot perceive most of the differences between increased bit depths; the gains are chiefly in the precision of these digital measurements.
Origins of Bit Depth: Digital Audio Theory
To fully understand the concept of bit depth, you must have a firm grasp of digital audio. Simply put, an acoustic sound wave is a continuous movement of energy passed through a medium. The key word here is continuous. Digital audio takes the continuous information and converts it into fragmented measurements represented as binary data (1s and 0s) and called samples.
An A/D converter takes these measurements at a speed called the sample rate. The sample rate defines the number of samples per second, and each sample holds a number representing a waveform’s amplitude at a specific fraction of time. The possible values used to represent the amplitude are determined by the number of binary digits or bits each sample consists of.
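As a rough illustration of the sampling step described above, here is a minimal Python sketch (the 48kHz sample rate and 440Hz tone are arbitrary choices for the example):

```python
import math

SAMPLE_RATE = 48_000  # samples per second (a common rate, chosen for the example)
FREQ = 440.0          # frequency of the example sine wave, in Hz
DURATION = 0.001      # capture one millisecond of audio

# Each sample is the waveform's amplitude at one instant in time.
num_samples = int(SAMPLE_RATE * DURATION)
samples = [math.sin(2 * math.pi * FREQ * n / SAMPLE_RATE)
           for n in range(num_samples)]

print(num_samples)  # 48 samples in one millisecond at 48 kHz
```

Even one millisecond of audio produces 48 discrete measurements; each of those numbers still has to be stored with some finite precision, which is where bit depth comes in.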
As the bit depth increases, the accuracy of the digital reproduction of the analog sound wave increases.
Bits are processed by computers in strings of bytes or groups of eight bits, and one or more bytes create a digital word. Digital words can vary in length, usually in multiples of eight; for example, 16 bits, 24 bits, 32 bits, etcetera.
The number of bits in a digital word, or the bit depth, determines the precision of the measurements of the amplitudes of the samples.
For example, if we have a bit depth of 1, there are only two distinct possible ways to define the amplitude. But, if we have a bit depth of 16 bits, there are 65,536 potential values for a single audio sample.
With a bit depth of 24 bits, there are 16,777,216 potential values, exactly 256 times as many ways to describe an audio sample.
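The jump in possible values is easy to check directly, since each bit doubles the count:

```python
# Number of distinct amplitude values a sample can take at each bit depth.
for bits in (1, 16, 24):
    print(bits, 2 ** bits)   # 1 -> 2, 16 -> 65536, 24 -> 16777216

# 24-bit offers exactly 2**8 = 256 times as many values as 16-bit.
print(2 ** 24 // 2 ** 16)    # 256
```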
Are there benefits to having a higher bit depth?
Yes. A higher bit depth enables a system to accurately record and reproduce the subtle fluctuations and nuances in the waveform due to increased captured data.
If the bit depth is too low, the signal will be converted inaccurately because the amplitude will be quantized in steps too large to capture fine detail.
Additionally, bit depth has a direct relationship with dynamic range: as the bit depth increases, the dynamic range also increases.
So, how is this possible?
Remember that the bit depth tells us how many binary digits of information are collected per sample.
This, in turn, means that the bit depth determines the signal-to-noise ratio and, by extension, the overall dynamic range of the recording.
Therefore, as the bit depth increases, there is a wider dynamic range and a lower noise floor.
Is there a quantifiable amount of dynamic range per bit depth?
Yes! Each bit adds roughly 6dB (more precisely, 6.02dB) of dynamic range. For example, a 16-bit recording will have about 96dB of dynamic range, and a 24-bit recording will have about 144dB of dynamic range.
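That rule of thumb is easy to verify: the exact figure is 20·log10(2) ≈ 6.02dB per bit, which rounds to the familiar 6dB:

```python
import math

def dynamic_range_db(bits: int) -> float:
    """Theoretical dynamic range of a fixed-point recording.

    Each added bit doubles the number of amplitude values, adding
    20 * log10(2) ~= 6.02 dB (commonly rounded to 6 dB per bit).
    """
    return 20 * math.log10(2 ** bits)

print(round(dynamic_range_db(16)))  # 96 dB
print(round(dynamic_range_db(24)))  # 144 dB
```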
What is a 32-bit float? Do I need this?
32-bit float is a bit depth encoded in an entirely different computing notation. The floating-point computing notation is more flexible in its representation of the signal, despite being more taxing on the CPU.
32-bit float is, essentially, a 24-bit recording that has a reserve of 8 bits to use for expanded dynamic range.
Unlike the other bit depths, there is no specified maximum sound level and thus no clipping of very loud signals.
So, what’s the con?
Storage. 32-bit float recordings are about a third larger than 24-bit files. But this is all well and good, considering that SSDs continue to grow in storage capacity while shrinking in physical size.
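The one-third figure follows directly from the raw PCM math; a quick sketch (the sample rate, duration, and channel count here are arbitrary example values):

```python
def raw_size_bytes(bit_depth: int, sample_rate: int, seconds: int, channels: int = 2) -> int:
    # Uncompressed PCM size: every sample stores bit_depth / 8 bytes.
    return bit_depth // 8 * sample_rate * seconds * channels

# One minute of stereo audio at 48 kHz.
size_24 = raw_size_bytes(24, 48_000, 60)
size_32 = raw_size_bytes(32, 48_000, 60)
print(size_32 / size_24)  # 1.333... -- one third larger
```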
A bigger problem, though, is that few hardware recorders support 32-bit float recording, although many software applications do, including Pro Tools, Logic Pro, and Audacity.
Now, do you need to record with this bit depth? It’s not necessary, but it is nice if your gear can do it, you have the space for it, and you’d like some additional headroom for your recordings. Industry-standard, however, is 24 bits.
Most DAWs allow you to select a bit depth, among other parameters, before creating a session file. Notice that this software, Pro Tools, offers support for 32-bit float!
What’s the computer science behind 32-bit float?
16-bit and 24-bit audio are encoded in fixed-point notation or format. In computing, fixed-point notation is a way of representing non-integer values in binary such that there is a constant number of bits to the left and right of the binary (radix) point.
32-bit float, however, is represented in single-precision floating-point notation. We can think of this as a number written in scientific notation, i.e., a fraction scaled by an exponent. Every sample still occupies 32 bits, but the exponent lets the same bits cover very different ranges of values.
What does this mean? According to the IEEE-754 Standard for computing, 1 bit is used to represent the sign (positive or negative), 8 bits the exponent, and 23 bits the fraction. The exponent shifts, or floats, the binary point to suit the fraction it scales, which is where the format gets its name.
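You can inspect these three fields yourself with Python's standard `struct` module (Python's own floats are 64-bit, so the sketch explicitly packs the value into the 32-bit format first):

```python
import struct

def float32_fields(x: float) -> tuple[int, int, int]:
    """Split a number's 32-bit float encoding into its IEEE-754 fields."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31               # 1 bit: 0 positive, 1 negative
    exponent = (bits >> 23) & 0xFF  # 8 bits, stored with a bias of 127
    fraction = bits & 0x7FFFFF      # 23 bits of fraction (significand)
    return sign, exponent, fraction

# 1.0 encodes as sign 0, biased exponent 127, fraction 0.
print(float32_fields(1.0))   # (0, 127, 0)
print(float32_fields(-2.0))  # (1, 128, 0)
```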
The con: Not every decimal point number can be expressed exactly as a floating point number.
With fixed-point bit depths, the numerical distance between adjacent values is evenly spaced: every step is exactly one least significant bit.
With floating-point notation, the distance between adjacent values grows with the magnitude of the value being stored, resulting in rounding errors. As a principle, larger float values will have larger absolute rounding errors than smaller float values.
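Python can show this spacing directly via `math.ulp`, which returns the gap between a value and the next representable float above it (Python floats are 64-bit, but 32-bit floats scale the same way):

```python
import math

# math.ulp(x) is the gap to the next representable float above x.
# The gap, and therefore the worst-case rounding error, grows with
# the magnitude of the value.
print(math.ulp(1.0))     # ~2.2e-16
print(math.ulp(1000.0))  # ~1.1e-13 -- a much coarser grid
```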
FAQ
Can we perceive differences between recordings at different bit depths?
Short answer: not exactly, although some claim it is possible (confirmation bias at its finest).
There is a stark difference between the audio quality of 8-bit and 16-bit, but after this, there is not much audible change.
Below are two videos: one of an 8-bit recording, and another of the same audio in 16-bit. Can you hear the difference? (hint: all that NOISE!)
Most people believe that 24-bit audio is better quality than 16-bit, and in terms of computing and measurable accuracy, it is. But conflating quality with a higher number doesn’t hold perceptually.
While there is a greater dynamic range and less noise, the human ear cannot perceive much difference between the two; the noise floor of a 16-bit recording is already inaudible at normal listening levels! So what’s the point?
The advantage is principally in studio monitoring and editing. At a minimum, 24-bit audio is better for monitoring and editing because it allows for listening at higher levels before distortion occurs.
For this reason, too, 32-bit has an advantage over 24-bit audio.
If I can’t hear the difference and my listeners won’t hear the difference, what bit depth is the minimum for recording and mastering?
Despite the world having moved on from CDs, 16-bit audio is still relatively standard – at least for distribution. Industry-standard for recording, however, is generally accepted as 24-bit audio – which brings up the next important point.
Should you choose to record at a bit depth higher than the bit depth specified by the distribution format, you will need to apply dither when reducing the bit depth, to mask the quantization distortion the reduction introduces upon printing.
What is dither, and what does it do?
Dither is noise. That’s it. But why would we add noise when we are reducing our bit depth? When the bit depth is reduced, quantization distortion becomes more apparent. Therefore, we add noise to a signal to make the quantization noise less noticeable. But what is quantization distortion?
Quantization distortion, or quantization noise, is created by rounding errors when an ADC attempts to measure and replicate a continuous, effectively infinite analog source in discrete digital form.
This is a lofty task, and one that creates many rounding errors: the continuous waveform constantly falls between the discrete amplitude steps, and the 1s and 0s can only approximate it.
The rounding errors decrease when recording at higher bit depths because, as discussed before, the number of bits describes how many discrete values you have for storing amplitude levels. But if you need to reduce your bit depth for distribution, you will reintroduce larger rounding errors into your signal.
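A minimal sketch of those rounding errors, quantizing the same sample at several bit depths (the amplitude grid here is a simplified rounding model for illustration, not any particular converter's behavior):

```python
import math

def quantize(x: float, bits: int) -> float:
    """Round x (in the range -1.0..1.0) to the nearest fixed-point step."""
    steps = 2 ** (bits - 1)
    return round(x * steps) / steps

# The same sine sample quantized at three bit depths:
# the rounding error shrinks as the bit depth grows.
x = math.sin(1.0)
for bits in (8, 16, 24):
    err = abs(x - quantize(x, bits))
    print(bits, err)
```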
How does dither make quantization noise less noticeable?
Let us all remember that noise is random – a quality that, in this specific case, can be used to your advantage. If you mix random noise with a signal being quantized, each sample’s rounding direction becomes random rather than signal-dependent, which adds enough variation to preserve the character of the original material.
So, ideally, you’d like to add dither that is entirely unrelated to the signal you’re quantizing, most often referred to as decorrelated. When this is done, and there is an appropriate amount of dither, any sample may be rounded up or down depending on the input. This not only preserves the original signal but also helps to limit distortion in the content.
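A minimal sketch of TPDF dither, a common decorrelated choice (summing two uniform random values yields a triangular distribution; this is an illustration, not a mastering-grade implementation):

```python
import random

def quantize(x: float, bits: int) -> float:
    """Round x (in -1.0..1.0) to the nearest fixed-point step."""
    steps = 2 ** (bits - 1)
    return round(x * steps) / steps

def tpdf_dither(x: float, bits: int) -> float:
    """Add triangular (TPDF) dither of about +/-1 LSB before quantizing.

    The noise is decorrelated from the signal, so each sample's rounding
    direction becomes random rather than signal-dependent.
    """
    lsb = 1.0 / 2 ** (bits - 1)
    noise = (random.random() - 0.5 + random.random() - 0.5) * lsb
    return quantize(x + noise, bits)

print(tpdf_dither(0.5, 16))  # 0.5 plus or minus a couple of 16-bit steps
```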
Some dither plug-ins reference noise-shaping. What is this?
Noise-shaping is essentially an EQ applied to the dither noise, pushing its energy into frequency ranges where the ear is less sensitive so that the added noise is less audible. This is mostly applicable at 8 or 16 bits, where the added noise will be much more obvious.
Most dither plug-ins, such as the default DAW plug-ins, offer a few noise-shaping curves.
At 24-bit, the added dither is so quiet that it is inaudible even without noise-shaping, though using it will still help remove any quantization distortion. More applicable in this instance, however, is a triangular probability density function (TPDF) dither – or flat dither.
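For the curious, the simplest form of noise-shaping is a first-order error-feedback loop; here is a hedged sketch (real plug-ins use carefully designed higher-order filters, and typically combine shaping with dither):

```python
def noise_shaped_quantize(samples: list[float], bits: int) -> list[float]:
    """First-order error-feedback noise shaping (a minimal sketch).

    Each sample's quantization error is subtracted from the next sample
    before rounding, pushing the error energy toward high frequencies
    where the ear is less sensitive.
    """
    steps = 2 ** (bits - 1)
    error = 0.0
    out = []
    for x in samples:
        target = x - error
        q = round(target * steps) / steps  # quantize to the bit-depth grid
        error = q - target                 # remember this sample's error
        out.append(q)
    return out

print(noise_shaped_quantize([0.1, 0.2, 0.3], 8))
```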
Wrapping Up
The bit depth is an important parameter in digital audio, as it determines the accuracy of the digital form of the continuous analog wave. The bit depth describes the amount of storage available for information within an audio sample. And as the bit depth increases, there is an increase in the accuracy of the reproduction of the sound wave in binary.
The bit depth determines the dynamic range and the noise floor of a signal. It also determines the amount of quantization noise – or distortion – that may be added due to rounding errors when converting a continuous wave to binary.
Generally, engineers record, edit, and master in 24-bit or higher, but final products are often distributed in 16-bit form. This requires the mastering engineer to add dither to reduce the quantization noise introduced by a bit depth reduction. Although floating-point bit depths exist, like 32- and 64-bit, they provide only a few advantages, chiefly for monitoring and editing.
Despite the knowledge that increased bit depths create a more accurate reproduction of an analog sound wave, the human ear cannot perceive many of these differences between 1s and 0s. We may hear less noise, but the sound wave is no more clear or accurate in the ear canal – only on the computer screen.
Hopefully, you’ve learned some things in this article and enjoyed it – at least a bit!