BLE Audio Deep-Dive Chapter 5 · Part 1 — Codecs, Latency & Why LC3?


What this post covers

Bluetooth LE Audio introduced a brand-new codec called LC3. To understand why it was needed, we first need to understand how audio codecs work, what latency is, and what was wrong with the older Bluetooth codecs (CVSD & SBC). That is exactly what this post covers — in plain English with real analogies.
Keywords you will understand by the end
Codec · PCM · Perceptual Coding · Frame · Latency · Encoding Delay · Transport Delay · Decoding Delay · CVSD · SBC · mSBC · A2DP · HFP · LC3 · RANDZ Licence
📦 1. What is a Codec? (And Why Do We Even Need One?)
Imagine you want to send a voice message over WhatsApp. You record 5 seconds of audio on your phone. Uncompressed, that raw audio would be a big chunk of data. But WhatsApp sends it in less than a second — that’s because it uses a codec. Codec = Coder + Decoder. The coder compresses audio; the decoder expands it back. Without compression you cannot retransmit a packet if it gets lost — there simply isn’t enough time.
📦 Analogy: ZIP files. Think of a codec like WinZip for audio. ZIP compresses a big file so it fits on a small USB stick, and the receiver unzips it to get the original file back. A codec compresses audio so it fits inside a tiny Bluetooth radio packet. (The analogy is loose: ZIP is lossless, whereas perceptual codecs such as SBC and LC3 discard detail the ear won’t miss.)

1.1 From Analogue → Digital: PCM

Before codecs, audio was purely analogue (vinyl records, cassette tapes). If a vinyl record had a scratch, you heard a pop — no recovery possible. The first digital breakthrough was PCM (Pulse Code Modulation), used on CDs. PCM samples audio 44,100 times every second (44.1 kHz) and converts each sample to a number. The problem? No compression. 1 minute of CD-quality stereo audio = ~10 MB. That’s too big to send wirelessly in real time.
PCM Data Rate — Why Compression is Needed
| Format | Sample Rate | Bit Depth | Data Rate | 5 min song | Verdict |
|---|---|---|---|---|---|
| CD (PCM, stereo) | 44.1 kHz | 16 bit | ~1,411 kbps | ~53 MB | ❌ Too large |
| MP3 (128 kbps) | 44.1 kHz | compressed | 128 kbps | ~5 MB | ✅ Manageable |
| LC3 (48 kHz, 80 kbps per channel) | 48 kHz | compressed | 80 kbps | ~3 MB per channel | ✅ Excellent |
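The arithmetic behind these figures is easy to reproduce. Here is a minimal sketch in C; the function names `pcm_bit_rate_bps` and `stream_size_bytes` are made up for this post and are not part of any Bluetooth library.

```c
#include <stdint.h>

/* Raw PCM bit rate in bits per second:
 * samples per second x bits per sample x number of channels. */
uint64_t pcm_bit_rate_bps(uint32_t sample_rate_hz,
                          uint32_t bits_per_sample,
                          uint32_t channels)
{
    return (uint64_t)sample_rate_hz * bits_per_sample * channels;
}

/* Size in bytes of `seconds` of audio streamed at `bit_rate_bps`. */
uint64_t stream_size_bytes(uint64_t bit_rate_bps, uint32_t seconds)
{
    return bit_rate_bps * seconds / 8;
}
```

For CD-quality stereo, `pcm_bit_rate_bps(44100, 16, 2)` gives 1,411,200 bits per second, and five minutes (300 s) at that rate comes to roughly 53 MB. The same five minutes at 80 kbps is 3 MB.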

1.2 Perceptual Coding — Throw Away What You Can’t Hear

MP3 (and LC3) use a clever trick: perceptual coding (also called psychoacoustic modelling). The idea is simple — the human ear cannot hear everything, so why encode it?
🎵 What perceptual coding throws away:
  • Frequencies above ~20 kHz (most adults can’t hear above 16 kHz)
  • Quiet sounds that happen at the same time as loud sounds (masking)
  • Long stretches of repeated sound — encode the difference only
Result: 25%–95% file size reduction with little perceptible quality loss.
⏱️ 2. Latency — The Invisible Delay in Wireless Audio
Latency is the delay between a sound entering a microphone (or a file starting to play) and that sound coming out of the speaker. For wired headphones it’s essentially zero. For Bluetooth, it’s not.
🎬 Real-world problem: You’re watching a YouTube video on your phone with Bluetooth earbuds. You see a person clap their hands on screen, but you hear the clap 150ms later. That delay is latency — and it ruins the lip-sync.

2.1 The Three Parts of Latency

The 3 Stages of Latency in Wireless Audio
  • 🎙️ Stage 1, Encoding Delay: Audio is sampled into a frame, then compressed by the codec. Typical: 10–12.5 ms
  • 📡 Stage 2, Transport Delay: Compressed packet is sent over the air. May be sent multiple times (retransmissions) to survive interference. Typical: 1–30 ms
  • 🔊 Stage 3, Decoding Delay: Receiver decompresses the packet, applies any audio processing, then outputs to speaker. Typical: 2–5 ms
Total Latency = Encoding Delay + Transport Delay + Decoding Delay
Classic Bluetooth A2DP: 100–200 ms  |  BLE Audio with LC3: as low as ~20 ms
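The sum can be written as a tiny latency budget. This is only a sketch: the struct and function names are hypothetical, and the numbers you feed in should come from your own measurements.

```c
/* One latency budget, all values in milliseconds. */
struct latency_budget {
    double encoding_ms;   /* frame collection plus codec processing */
    double transport_ms;  /* airtime, including retransmission slots */
    double decoding_ms;   /* decode plus any post-processing */
};

/* Total latency is simply the sum of the three stages. */
double total_latency_ms(const struct latency_budget *b)
{
    return b->encoding_ms + b->transport_ms + b->decoding_ms;
}

/* Returns 1 if the budget fits within target_ms, 0 otherwise. */
int meets_target(const struct latency_budget *b, double target_ms)
{
    return total_latency_ms(b) <= target_ms;
}
```

Plugging in the low end of each typical range (10 + 1 + 2) gives 13 ms, while the high end (12.5 + 30 + 5) gives 47.5 ms. The transport stage dominates the spread, which is why retransmission settings matter so much.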

2.2 The Frame — The Root Cause of Encoding Delay

Perceptual coding needs to look at multiple consecutive audio samples to find patterns worth compressing. This collection of samples is called a frame. The codec must wait until the frame is complete before it can encode anything.
Frame Size Trade-off — Quality vs Latency
| Frame Size | Audio Quality | Latency | Verdict |
|---|---|---|---|
| Too Short (2–5 ms) | ❌ Poor: not enough samples to apply psychoacoustic tricks | ✅ Very low | Bad quality |
| Sweet Spot (~10 ms) | ✅ Good: enough data for perceptual coding | ✅ Reasonable | 🏆 Best balance |
| Too Long (25+ ms) | ✅ Very good | ❌ Very high | Unusable for real-time |
💡 Key takeaway: The industry settled on 10 ms frame size as the sweet spot. LC3 (the BLE Audio codec) is optimised exactly for this value.
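To see what a frame means in samples, multiply the sample rate by the frame duration. The helper below is a sketch with a made-up name (`samples_per_frame`); durations are in microseconds so that 7.5 ms stays exact in integer maths.

```c
#include <stdint.h>

/* Frame durations in microseconds. */
#define FRAME_7_5_MS_US 7500u
#define FRAME_10_MS_US  10000u

/* Number of PCM samples (per channel) the encoder must collect before
 * it can start compressing one frame.
 * Note: LC3 treats 44.1 kHz as a special case, so this simple formula
 * is not meant for that rate. */
uint32_t samples_per_frame(uint32_t sample_rate_hz, uint32_t frame_us)
{
    return (uint32_t)((uint64_t)sample_rate_hz * frame_us / 1000000u);
}
```

At 48 kHz a 10 ms frame is 480 samples, and at 16 kHz it is 160 samples. The encoder cannot emit anything until that many samples have arrived, which is exactly why frame size sets the floor for encoding delay.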

2.3 Why Does Latency Even Matter?

🎵 Music Streaming: 100–200 ms is fine. You don’t know when the song “should” have started.
📺 Video / TV: >40 ms is noticeable as lip-sync mismatch. TVs compensate by delaying the video.
👂 Hearing Aids: Critical! The user hears both ambient sound AND the Bluetooth audio. Even 30 ms creates an echo effect.
🎮 Gaming: >20 ms breaks game immersion. Gunshot sounds arrive after you see the flash.
📟 3. Classic Bluetooth Codecs — HFP, A2DP, CVSD & SBC
Before BLE Audio, Bluetooth had two audio profiles, each with its own codec:
Classic Bluetooth Audio Profiles at a Glance
| Feature | HFP (Hands-Free Profile) | A2DP (Advanced Audio Distribution Profile) |
|---|---|---|
| Use Case | Phone calls | Music streaming |
| Mandatory Codec | CVSD (later also mSBC) | SBC (optionally AAC, aptX, MP3) |
| Latency | ~20 ms (CVSD) / ~30 ms (mSBC) | 100–200 ms |
| Audio Quality | Low (telephony) | High (music) |
| Retransmissions? | ❌ No (CVSD is real-time) | ✅ Yes (buffered) |

3.1 CVSD — The Old Phone Codec

CVSD (Continuously Variable Slope Delta modulation) is one of the oldest voice codecs. It samples 64,000 times per second but stores only one bit per sample: the direction of change relative to the previous estimate (“did the sound go up or down?”). This makes it:
  • Frameless — no need to collect a full frame, so encoding starts immediately
  • Very low latency — great for phone calls
  • Very little compression: the 64 kbps stream leaves no room for retransmissions, so any packet lost = audio lost
  • Poor quality — only captures slope differences, not the full waveform
Analogy: CVSD is like describing a landscape by only saying “went up / went down” at each step. You get a rough shape but lose all the fine detail.
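To make the “up / down” idea concrete, here is a minimal sketch of an adaptive delta modulator in C. It is a teaching simplification, not the CVSD algorithm from the Bluetooth specification: the names (`dm_state`, `dm_encode_sample`, `dm_decode_bit`), the step limits and the 3-bit run rule are all assumptions chosen for illustration.

```c
#include <stdint.h>

#define DM_STEP_MIN 16    /* assumed smallest slope */
#define DM_STEP_MAX 1024  /* assumed largest slope */

struct dm_state {
    int32_t estimate;  /* running reconstruction of the signal */
    int32_t step;      /* current slope (how far one bit moves us) */
    uint8_t history;   /* last three bits, newest in bit 0 */
};

void dm_init(struct dm_state *s)
{
    s->estimate = 0;
    s->step = DM_STEP_MIN;
    s->history = 0x2;  /* binary 010: no run of equal bits yet */
}

/* Shared by encoder and decoder so both track the same estimate. */
static int32_t dm_update(struct dm_state *s, int bit)
{
    s->history = (uint8_t)(((s->history << 1) | (bit & 1)) & 0x7);
    if (s->history == 0x7 || s->history == 0x0) {
        /* Three equal bits in a row: the signal is outrunning us,
         * so steepen the slope. */
        s->step = (s->step * 2 > DM_STEP_MAX) ? DM_STEP_MAX : s->step * 2;
    } else {
        /* Otherwise relax the slope again. */
        s->step = (s->step / 2 < DM_STEP_MIN) ? DM_STEP_MIN : s->step / 2;
    }
    s->estimate += bit ? s->step : -s->step;
    return s->estimate;
}

/* Encode one PCM sample into a single bit: 1 = go up, 0 = go down. */
int dm_encode_sample(struct dm_state *s, int16_t sample)
{
    int bit = sample >= s->estimate;
    dm_update(s, bit);
    return bit;
}

/* Decode one bit back into a reconstructed sample value. */
int32_t dm_decode_bit(struct dm_state *s, int bit)
{
    return dm_update(s, bit);
}
```

Notice that there is no frame anywhere: every sample becomes one bit immediately, which is where the low latency comes from. The price is that the decoder only ever sees the slope decisions, so fine detail is lost.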

3.2 SBC — The Music Codec (A2DP)

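SBC (the Low Complexity Subband Codec) is a frame-based codec with basic psychoacoustic modelling. It produces good music quality but at a cost:
  • High latency (100–200 ms): each SBC frame is short, but several frames are packed into every packet and the receiver buffers deeply to allow for retransmissions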
  • Inefficient — packets are large, consume too much airtime, drain battery
  • ❌ Especially bad for hearing aids running on tiny zinc-air batteries — too much peak current
Quality vs Latency: HFP & A2DP Positions
  • HFP (CVSD / mSBC): low quality (telephony), low latency (~20 ms)
  • A2DP (SBC / AAC / aptX): high quality (music), high latency (~100 ms)
The problem: There’s a huge gap — no single codec covers everything. HFP is low-latency but poor quality. A2DP is high-quality but high-latency. What if you need both good quality and low latency — like a hearing aid listening to music in a noisy environment?
🚀 4. Enter LC3 — Why a New Codec Was Needed
During the development of Bluetooth LE Audio, it became clear that neither CVSD nor SBC could meet the requirements for modern use cases. The Bluetooth SIG went on a “codec hunt” and the result was LC3 (Low Complexity Communication Codec).

4.1 What Was Wrong With SBC?

❌ Too Much Airtime: SBC packets are large. They occupy the radio channel for a long time, leaving little room for retransmissions or other devices.
❌ Battery Drain: More airtime = more radio active time = more battery used. Critical issue for earbuds and hearing aids.
❌ Limited Quality–Latency Range: SBC cannot cover the full spectrum from voice to music at different latency targets.
LC3 Covers the Entire Quality–Latency Spectrum
  • LC3 🏆 covers the WHOLE range, from low-latency telephony (the HFP zone) to high-quality music (the A2DP zone)

4.2 LC3 Key Benefits

| Benefit | How Much? | Why it Matters |
|---|---|---|
| Half the packet size of SBC | ~50% reduction | Less airtime used → lower battery drain |
| Better quality at same bitrate | Equal quality at ½ the bits | More headroom for engineers to trade off |
| Supports 8 kHz to 48 kHz | 6 sampling rates | Voice calls AND hi-fi music in one codec |
| Mandatory in all BLE Audio devices | 100% interoperability | Every BLE Audio earbud/device can talk to every other |
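Packet size and bit rate are two views of the same number: an LC3 frame of N octets sent once per frame duration gives a fixed bit rate. The sketch below shows the conversion in both directions; the function names are hypothetical and not taken from any LC3 library.

```c
#include <stdint.h>

/* Bit rate of one channel in bits per second, given the compressed
 * frame size in octets and the frame duration in microseconds. */
uint32_t lc3_bit_rate_bps(uint16_t octets_per_frame, uint32_t frame_us)
{
    return (uint32_t)((uint64_t)octets_per_frame * 8u * 1000000u / frame_us);
}

/* Frame size in octets needed for a target bit rate (rounded down). */
uint16_t lc3_octets_for_bit_rate(uint32_t bit_rate_bps, uint32_t frame_us)
{
    return (uint16_t)((uint64_t)bit_rate_bps * frame_us / 8000000u);
}
```

For example, 100 octets every 10 ms works out to 80 kbps per channel. Halving the octets per frame halves the bit rate, and with it the time the radio has to stay on for each packet.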

4.3 LC3 Licence — Anyone Can Use It

LC3 is published under a RANDZ licence (Reasonable and Non-Discriminatory, Zero fee). This means anyone can write their own LC3 implementation and use it in a Bluetooth product — for free — as long as the product passes the Bluetooth Qualification process.
💡 From a developer perspective: You rarely need to implement LC3 from scratch. You normally use a pre-built LC3 library (Bluetooth SIG provides a reference implementation). You just configure it with parameters (sample rate, frame size, bitrate) and pass audio buffers in/out.

4.4 How This Looks in BlueZ Code

In BlueZ (the Linux Bluetooth stack), when setting up a BLE Audio stream, you configure LC3 parameters via the codec configuration. Here’s a simplified C sketch of the codec settings that get negotiated:
```c
/*
 * BlueZ BLE Audio: LC3 codec configuration example (simplified)
 *
 * When an Initiator (e.g. phone) sets up a CIS to an Acceptor
 * (e.g. earbud), it negotiates LC3 parameters.
 *
 * These map directly to the LC3 spec parameters:
 *  - Sampling_Frequency     → 0x03 = 16 kHz, 0x08 = 48 kHz, etc.
 *  - Frame_Duration         → 0x00 = 7.5 ms, 0x01 = 10 ms
 *  - Octets_per_Codec_Frame → payload size per channel
 */

#include <stdint.h>

/* Simplified view of the LC3 codec configuration. On the wire each
 * field travels as an LTV (Length-Type-Value) entry, sent in
 * HCI LE_Setup_ISO_Data_Path or via GATT ASE params. */
struct lc3_codec_config {
    uint8_t  sampling_freq;       /* 0x03=16kHz, 0x05=24kHz,
                                     0x06=32kHz, 0x08=48kHz     */
    uint8_t  frame_duration;      /* 0x00=7.5ms, 0x01=10ms      */
    uint32_t audio_channel_alloc; /* Front Left=0x01, Right=0x02 */
    uint16_t octets_per_frame;    /* 20–400 bytes               */
    uint8_t  frames_per_sdu;      /* typically 1                */
};

/* Example: 48 kHz, 10 ms frame, 100 bytes/frame (80 kbps), left channel */
struct lc3_codec_config left_earbud_config = {
    .sampling_freq        = 0x08,   /* 48 kHz     */
    .frame_duration       = 0x01,   /* 10 ms      */
    .audio_channel_alloc  = 0x01,   /* Front Left */
    .octets_per_frame     = 100,    /* 80 kbps    */
    .frames_per_sdu       = 1,
};

/* Same for right earbud but channel = Front Right (0x02) */
struct lc3_codec_config right_earbud_config = {
    .sampling_freq        = 0x08,   /* 48 kHz      */
    .frame_duration       = 0x01,   /* 10 ms       */
    .audio_channel_alloc  = 0x02,   /* Front Right */
    .octets_per_frame     = 100,    /* 80 kbps     */
    .frames_per_sdu       = 1,
};

/*
 * The LC3 encode/decode itself is done by a library.
 * Pseudocode for encoding one frame (not a real library signature):
 *
 *   lc3_encode(encoder_handle,
 *              pcm_input_buffer,   // 48000 Hz * 10 ms = 480 samples
 *              output_packet,      // compressed to 100 bytes
 *              100);               // target size in bytes
 *
 * The output_packet is then handed to BlueZ / HCI as the SDU
 * (Service Data Unit) for transmission on the ISO channel.
 */
```
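The LTV (Length-Type-Value) packing mentioned above can be sketched as follows. Treat this as an illustration under assumptions: the type codes 0x01 to 0x05 are the values I understand Bluetooth Assigned Numbers to list for the LC3 codec-specific configuration, so check them against the current document, and `struct lc3_cfg`, `put_ltv` and `lc3_cfg_to_ltv` are hypothetical names, not BlueZ APIs.

```c
#include <stddef.h>
#include <stdint.h>

/* LTV type codes (assumed from Bluetooth Assigned Numbers; verify). */
enum {
    LTV_SAMPLING_FREQ    = 0x01,
    LTV_FRAME_DURATION   = 0x02,
    LTV_CHANNEL_ALLOC    = 0x03,
    LTV_OCTETS_PER_FRAME = 0x04,
    LTV_FRAMES_PER_SDU   = 0x05
};

/* Hypothetical mirror of the illustrative config struct in this post. */
struct lc3_cfg {
    uint8_t  sampling_freq;
    uint8_t  frame_duration;
    uint32_t audio_channel_alloc;
    uint16_t octets_per_frame;
    uint8_t  frames_per_sdu;
};

/* Append one LTV entry at offset off; the value is written
 * little-endian. Returns the offset just past the new entry. */
static size_t put_ltv(uint8_t *buf, size_t off, uint8_t type,
                      uint32_t value, uint8_t value_len)
{
    uint8_t i;

    buf[off++] = (uint8_t)(value_len + 1);  /* Length covers Type + Value */
    buf[off++] = type;
    for (i = 0; i < value_len; i++)
        buf[off++] = (uint8_t)(value >> (8 * i));
    return off;
}

/* Serialise the config into buf, which must hold at least 19 bytes.
 * Returns the number of bytes written. */
size_t lc3_cfg_to_ltv(const struct lc3_cfg *c, uint8_t *buf)
{
    size_t off = 0;

    off = put_ltv(buf, off, LTV_SAMPLING_FREQ, c->sampling_freq, 1);
    off = put_ltv(buf, off, LTV_FRAME_DURATION, c->frame_duration, 1);
    off = put_ltv(buf, off, LTV_CHANNEL_ALLOC, c->audio_channel_alloc, 4);
    off = put_ltv(buf, off, LTV_OCTETS_PER_FRAME, c->octets_per_frame, 2);
    off = put_ltv(buf, off, LTV_FRAMES_PER_SDU, c->frames_per_sdu, 1);
    return off;
}
```

With the left-earbud values (0x08, 0x01, 0x01, 100, 1) this produces 19 bytes, starting `02 01 08` for the sampling frequency entry. Each entry is self-describing, so a receiver can skip any type it does not recognise by jumping ahead by the Length byte.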

🧠 Quick Recap — Part 1
Codec: Compresses audio so it fits into small radio packets, leaving airtime to retransmit any that get lost.
Frame: A chunk of audio samples collected before encoding begins. 10 ms is the sweet spot.
Latency: Encoding delay + Transport delay + Decoding delay. BLE Audio with LC3 can achieve ~20 ms.
CVSD / SBC: Old codecs. CVSD: low latency but poor quality. SBC: good quality but high latency + inefficient.
LC3: New mandatory BLE Audio codec. Covers voice-to-music, half the airtime of SBC, freely licensable.

Ready for Part 2?

In Part 2 we go inside LC3 — how its encoder and decoder work, how Packet Loss Concealment saves your audio, and how to actually choose LC3 parameters for your BLE Audio design.

Part 2 → LC3 Internals & QoS
