Sunday, December 29, 2024

How is Audio Represented in Binary

What is Audio

Audio is a sound wave under the hood, and I suggest you understand what a sound wave is before diving into this topic, so take your time. On the surface, a sound wave is a vibration traveling through a medium, but unfortunately this is not a physics class. So what is audio on a programmatic level? It's still a wave, but one described by properties like sample rate, bit depth, and bit rate. To picture this, think of the sample rate as the number of snapshots of the wave taken every second. The usual sample rates are 44.1 kHz and 48 kHz (the reason is coming up).

Audio refers to sound that can be heard by humans, typically in the form of waves that travel through a medium such as air or water. These audio waves are mechanical disturbances that propagate through the medium, carrying energy and information (physics stuff).

Analog to Digital Conversion

Before going any deeper, let me show you how a physical wave gets translated into zeros and ones in the first place. The process of converting analog audio waves to digital format involves several steps:

1. Transduction: A microphone (transducer) converts sound waves into an electrical signal.
2. Sampling: An Analog-to-Digital Converter (ADC) takes snapshots of the analog signal at regular intervals. The number of snapshots taken per second is called the sample rate.
3. Quantization: The amplitude of each sample is measured and assigned a digital value.
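
To make steps 2 and 3 concrete, here is a minimal sketch in Python. It skips the microphone entirely and pretends the "analog" signal is a perfect sine wave; the function name `sample_and_quantize` and the 1 kHz / 8 kHz numbers are my own choices for illustration, not anything standard.

```python
import math

def sample_and_quantize(freq_hz, sample_rate_hz, bit_depth, num_samples):
    """Sample a unit-amplitude sine wave and quantize each sample to a signed integer."""
    # Largest positive value a signed integer of this size can hold,
    # e.g. 32767 for 16-bit.
    max_level = 2 ** (bit_depth - 1) - 1
    samples = []
    for n in range(num_samples):
        t = n / sample_rate_hz                             # time of the n-th snapshot (sampling)
        amplitude = math.sin(2 * math.pi * freq_hz * t)    # "analog" value in [-1, 1]
        samples.append(round(amplitude * max_level))       # nearest integer level (quantization)
    return samples

# One full cycle of a 1 kHz tone, sampled 8000 times per second at 16 bits.
pcm = sample_and_quantize(freq_hz=1000, sample_rate_hz=8000, bit_depth=16, num_samples=8)
print(pcm)
```

Each number in the printed list is one sample: an integer that fits in 16 bits. A real ADC does this in hardware, but the idea is the same.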

Sample Rate

Now the fun part :). Sample rate is the number of samples taken per second when converting an analog signal to digital. It's measured in Hertz (Hz) or kilohertz (kHz). Common sample rates include:

- 44.1 kHz (CD quality)
- 48 kHz (standard for video and film)
- 96 kHz (high-resolution audio)

The sample rate determines the highest frequency that can be accurately represented in the digital signal. According to the Nyquist theorem, the sample rate must be at least twice the highest frequency in the signal to avoid aliasing.

Getting Deeper into Sample Rate

The common sample rate of 44.1 kHz was chosen primarily for compatibility with early audio technologies and is rooted in both technical and practical reasons:

  • According to the Nyquist theorem, the sample rate must be at least twice the highest frequency in the audio signal to accurately reproduce it without aliasing (Example in the picture).
  • Humans typically hear frequencies up to 20 kHz, so a minimum sample rate of 40 kHz is needed.
  • 44.1 kHz provides a safe margin above 40 kHz to account for filter imperfections during analog-to-digital and digital-to-analog conversion.
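
The arithmetic behind these bullets can be sketched in a few lines of Python. The function names and the `HUMAN_HEARING_LIMIT_HZ` constant are hypothetical helpers I made up for this post; the 20 kHz figure is the typical hearing limit mentioned above.

```python
HUMAN_HEARING_LIMIT_HZ = 20_000  # typical upper limit of human hearing

def nyquist_frequency(sample_rate_hz):
    """Highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2

def minimum_sample_rate(highest_freq_hz):
    """Lowest sample rate that can capture the given frequency (Nyquist theorem)."""
    return 2 * highest_freq_hz

def margin_hz(sample_rate_hz):
    """Room left between the hearing limit and the Nyquist frequency."""
    return nyquist_frequency(sample_rate_hz) - HUMAN_HEARING_LIMIT_HZ

print(minimum_sample_rate(HUMAN_HEARING_LIMIT_HZ))  # 40000

for rate in (44_100, 48_000, 96_000):
    print(f"{rate / 1000:g} kHz: Nyquist {nyquist_frequency(rate) / 1000:g} kHz, "
          f"margin {margin_hz(rate) / 1000:g} kHz")
```

For 44.1 kHz the margin comes out to about 2 kHz, which is the breathing room the filters get.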

What is Frequency, And How Does it Relate to Sample Rate?

When talking about sound, frequency (a property of waves; my brain already hurts) refers to how fast the sound wave vibrates, measured in cycles per second (Hz). Higher frequencies are higher-pitched sounds (like a whistle, a tightly packed wave), and lower frequencies are deeper sounds (like a bass drum, a stretched-out wave). The human hearing range typically spans from 20 Hz (low bass) to 20,000 Hz, or 20 kHz (high pitch).

Now, when we convert sound into digital form, we need to take samples of the sound wave at regular intervals. The more samples we take, the more accurate the digital version will be. The sample rate determines how often we take those samples.

Twice the frequency rule: For the digital version to accurately represent the wave without distortion, the sample rate must be at least twice the highest frequency we want to capture. If you sample less often than that, the digital version can "mishear" the sound and create weird distortions (this is the aliasing mentioned earlier).
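
Here is a small Python sketch of what "mishearing" looks like in numbers. It assumes a pure 30 kHz tone sampled at 44.1 kHz, which breaks the twice-the-frequency rule; the helper names are mine. I use cosine rather than sine so the two sets of samples match exactly (with sine they would match up to a flipped sign).

```python
import math

def sample_tone(freq_hz, sample_rate_hz, num_samples):
    """Take num_samples snapshots of a pure cosine tone."""
    return [math.cos(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(num_samples)]

def alias_frequency(freq_hz, sample_rate_hz):
    """Frequency a pure tone appears to have after sampling (folded into 0..sample_rate/2)."""
    folded = freq_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)

sample_rate = 44_100
too_high = 30_000                                   # above the 22.05 kHz limit
apparent = alias_frequency(too_high, sample_rate)   # 14100

# The samples of the 30 kHz tone are indistinguishable from a 14.1 kHz tone.
high_samples = sample_tone(too_high, sample_rate, 50)
low_samples = sample_tone(apparent, sample_rate, 50)
max_diff = max(abs(a - b) for a, b in zip(high_samples, low_samples))
print(apparent, max_diff < 1e-9)
```

Once the samples are taken, nothing can tell the two tones apart, which is why the too-high frequencies have to be filtered out before sampling.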

So, if you're capturing sounds with a frequency up to 20 kHz (the highest sound humans can hear), your sample rate should be at least 40 kHz to avoid distortions. This is why 44.1 kHz (CD quality) is a common standard.

But Awad, what if the highest frequency of the sound is 80 kHz? Well, first of all, you can't hear it, and second, you won't need it unless you're a scientist working with ultrasound. In that case you need to sample the sound at 160 kHz or even more to get an accurate representation when converted to digital.

Now you have a much deeper understanding of the sample rate. To recap, 44.1 kHz was chosen because it covers the range of human hearing (per the Nyquist theorem), was practical with early video technology, and balanced quality with data storage constraints. It remains a standard due to its widespread adoption and backward compatibility with existing systems.

High frequency vs low frequency waves
If you use a low sample rate on a high-frequency wave (supposing it is audio), you're going to lose a lot of information. Think about it: one moment the wave is at the bottom, the next moment it's at the top. source

An example of how analog waves are sampled into digital waves: the lower the sample rate, the more aliasing. source

Bit Depth 

Bit depth refers to the number of bits used to represent each audio sample. It determines the resolution and dynamic range of the audio. Common bit depths include:

- 16-bit (CD quality)
- 24-bit (professional audio production)
- 32-bit (high-end audio equipment)

A higher bit depth allows for a more precise representation of the audio signal and a greater dynamic range.
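
To put numbers on "more precise" and "greater dynamic range", here is a rough Python sketch. It assumes plain integer PCM, where every extra bit doubles the number of levels and adds about 6 dB of theoretical dynamic range; in practice 32-bit audio is often stored as floating point, which this sketch does not model. The function names are my own.

```python
import math

def quantization_levels(bit_depth):
    """Number of distinct values one sample can take."""
    return 2 ** bit_depth

def dynamic_range_db(bit_depth):
    """Theoretical dynamic range of ideal integer PCM: 20 * log10(2 ** bits)."""
    return 20 * math.log10(2 ** bit_depth)

for bits in (16, 24, 32):
    print(f"{bits}-bit: {quantization_levels(bits):,} levels, "
          f"~{dynamic_range_db(bits):.1f} dB")
```

For 16-bit this gives 65,536 levels and roughly 96 dB.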


A picture demonstrating how bit depth affects the audio


Bit Rate

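Bit rate is the number of bits used to represent one second of audio. For uncompressed audio, it's calculated by multiplying the sample rate by the bit depth and by the number of channels. For example, for a single (mono) channel:

44.1 kHz sample rate * 16-bit depth = 705,600 bits per second (or about 706 kbps)

Stereo CD audio has two channels, so it needs double that: 1,411,200 bits per second (about 1,411 kbps).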
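
The same calculation as a small Python sketch, for uncompressed audio only. The `channels` parameter and the file size helper are my additions; with `channels=1` it reproduces the 705,600 figure above.

```python
def bit_rate_bps(sample_rate_hz, bit_depth, channels=1):
    """Bits needed for one second of uncompressed audio."""
    return sample_rate_hz * bit_depth * channels

def size_bytes(sample_rate_hz, bit_depth, channels, seconds):
    """Raw audio data size in bytes (8 bits per byte, no file header)."""
    return bit_rate_bps(sample_rate_hz, bit_depth, channels) * seconds // 8

print(bit_rate_bps(44_100, 16))               # 705600 (mono)
print(bit_rate_bps(44_100, 16, channels=2))   # 1411200 (stereo)
print(size_bytes(44_100, 16, 2, 180))         # 31752000 bytes for 3 minutes of stereo
```

So three minutes of CD-quality stereo is roughly 32 MB before any compression.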

Relationship Between Sample Rate, Bit Depth, and Bit Rate

Now for the real deal:

- Sample rate determines how often the analog signal is measured.
- Bit depth determines how precisely each measurement is stored.
- Bit rate is the product of sample rate and bit depth (times the number of channels), representing the total amount of data per second.

A higher sample rate captures more detail in the time domain, while a higher bit depth captures more detail in the amplitude domain. Together, they contribute to the overall quality and file size of the digital audio.
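
To tie it all back to the title, here is a sketch of what those samples look like as actual bytes, using only Python's standard `struct` and `wave` modules. The 440 Hz tone, one-second duration, and mono setup are assumptions I picked for the example, and the WAV is written to an in-memory buffer instead of a real file.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 44_100   # samples per second
BIT_DEPTH = 16         # bits per sample
FREQ_HZ = 440          # tone to generate (arbitrary choice)
DURATION_S = 1

# Sample and quantize a sine wave into signed 16-bit integers.
max_level = 2 ** (BIT_DEPTH - 1) - 1
samples = [round(max_level * math.sin(2 * math.pi * FREQ_HZ * n / SAMPLE_RATE))
           for n in range(SAMPLE_RATE * DURATION_S)]

# '<h' means little-endian signed 16-bit integer: each sample becomes 2 bytes.
raw = b"".join(struct.pack("<h", s) for s in samples)

# Wrap the raw bytes in a WAV container, which records the sample rate,
# bit depth, and channel count in its header.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as wav:
    wav.setnchannels(1)
    wav.setsampwidth(BIT_DEPTH // 8)
    wav.setframerate(SAMPLE_RATE)
    wav.writeframes(raw)
wav_bytes = buffer.getvalue()

print(len(raw))                                # 88200 bytes = 44100 samples * 2 bytes
print(format(samples[1] & 0xFFFF, "016b"))     # the second sample as 16 zeros and ones
print(wav_bytes[:4])                           # b'RIFF', the start of the WAV header
```

The one second of audio takes 88,200 bytes, which is the 705,600 bits per second of mono 44.1 kHz / 16-bit audio divided by 8.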

In summary, converting analog audio waves to digital format involves sampling the continuous analog signal at a specific rate (sample rate) and representing each sample with a certain precision (bit depth). The combination of these factors determines the bit rate and overall quality of the digital audio representation.


Note: This post is subject to rework. If you have suggestions, please don't hesitate to comment.
