The GAF — How the Pieces Fit Together
Hi students, welcome to this BLE Audio course post: Inside the Generic Audio Framework. The Generic Audio Framework (GAF) is a set of 23 profiles and services that live in the Host, each handling a specific aspect of audio operation. They are designed to be composable — you can use a subset for a simple product, or combine most of them for a feature-rich one.
This post goes through each functional group in detail: how stream setup works step by step through the ASCS state machine, how volume and microphone control is structured, how media and telephony control is handled, and how coordination across paired devices works. BlueZ-specific code and commands are included throughout.
Four Functional Groups
1. Stream Control — BAPS
BAPS is the collective name for the four foundation specs: BAP, PACS, ASCS, and BASS. They are responsible for everything related to setting up the underlying ISO channels that carry audio data.
Before a phone (Unicast Client) starts streaming to an earbud (Unicast Server), it needs to know what the earbud supports. PACS is a GATT service on the earbud that exposes its capabilities as a list of PAC (Published Audio Capability) records. Each record describes one supported codec configuration — sampling rate, frame duration, supported octets per frame, and audio channel count.
PACS has four characteristics:
| Characteristic | Direction | Meaning |
|---|---|---|
| Sink PAC | Phone → Earbud | Codec configs the earbud can receive |
| Source PAC | Earbud → Phone | Codec configs the earbud can transmit (microphone) |
| Sink Audio Locations | — | Which audio channels the sink renders: Left, Right, or Mono |
| Available Audio Contexts | — | Context types currently available: Media, Conversational, etc. |
# Read PACS records using gatttool or bluetoothctl
# In bluetoothctl, the endpoint menu shows remote PACS endpoints:
[bluetooth]# menu endpoint
[bluetooth]# list
EndPoint /org/bluez/hci0/dev_.../pac_sink_0
UUID: 00002bc9-... (Sink PAC characteristic)
Codec: LC3 (0x06)
Capabilities:
Sampling Frequencies: 16kHz 24kHz 32kHz 48kHz
Frame Durations: 7.5ms 10ms
Supported Octets/Frame: 26..240
Supported Frames/SDU: 1
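The Capabilities shown above are LTV-encoded (Length, Type, Value triplets) in the raw PAC record. A minimal decoding sketch — the sample payload is illustrative, not captured from a real device:

```python
# Decode the LTV-encoded codec capabilities of a PAC record.
# Type values follow the Bluetooth Assigned Numbers for LC3
# codec-specific capabilities.

CAP_TYPES = {
    0x01: "Supported Sampling Frequencies",
    0x02: "Supported Frame Durations",
    0x03: "Supported Audio Channel Counts",
    0x04: "Supported Octets per Codec Frame",
    0x05: "Supported Max Codec Frames per SDU",
}

def parse_ltv(data: bytes) -> dict:
    """Walk [Length, Type, Value...] triplets; Length covers Type + Value."""
    out, i = {}, 0
    while i < len(data):
        length = data[i]
        t = data[i + 1]
        out[t] = data[i + 2 : i + 1 + length]
        i += 1 + length
    return out

# Example capabilities: frequency bitfield, duration bitfield, octet range
pac = bytes([0x03, 0x01, 0xb4, 0x00,               # 16/24/32/48 kHz
             0x02, 0x02, 0x03,                     # 7.5 ms and 10 ms
             0x05, 0x04, 0x1a, 0x00, 0xf0, 0x00])  # 26..240 octets
for t, v in parse_ltv(pac).items():
    print(CAP_TYPES.get(t, hex(t)), v.hex())
```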
ASCS defines the state machine for each individual Audio Stream Endpoint (ASE). An ASE represents one direction of one audio stream — for example, the left earbud’s sink ASE. The phone (Client) drives the state machine by writing control opcodes to the ASE Control Point characteristic in ASCS. The earbud (Server) holds the state and notifies the phone whenever state changes.
The state machine has six states for a unicast sink ASE: Idle, Codec Configured, QoS Configured, Enabling, Streaming, and Releasing. Source ASEs pass through a seventh state, Disabling, while the stream winds down.
The state lives in ASCS on the earbud. The phone writes opcodes to the ASE Control Point (ASCP) characteristic. For a typical stereo earbud, there are two ASEs — one for the left channel, one for the right. Both go through the same state machine independently, but are coordinated via the CIG so they stream in sync.
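The transitions driven by the client can be sketched as a lookup table. State and opcode values below follow the ASCS specification; the table is a simplification that omits error handling and autonomous server transitions:

```python
# Sketch of the ASCS sink-ASE state machine as driven by client opcodes
# written to the ASE Control Point.

STATES = {0x00: "Idle", 0x01: "Codec Configured", 0x02: "QoS Configured",
          0x03: "Enabling", 0x04: "Streaming", 0x06: "Releasing"}

# (current state, control-point opcode) -> next state
TRANSITIONS = {
    (0x00, 0x01): 0x01,  # Idle         --Config Codec-->         Codec Configured
    (0x01, 0x02): 0x02,  # Codec Conf.  --Config QoS-->           QoS Configured
    (0x02, 0x03): 0x03,  # QoS Conf.    --Enable-->               Enabling
    (0x03, 0x04): 0x04,  # Enabling     --Receiver Start Ready--> Streaming
    (0x04, 0x05): 0x02,  # Streaming    --Disable--> (sink ASE skips Disabling)
    (0x04, 0x08): 0x06,  # Streaming    --Release-->              Releasing
}

def apply_opcode(state: int, opcode: int) -> int:
    nxt = TRANSITIONS.get((state, opcode))
    if nxt is None:
        raise ValueError(f"opcode {opcode:#04x} invalid in {STATES[state]}")
    return nxt

s = 0x00
for op in (0x01, 0x02, 0x03, 0x04):   # the happy path up to Streaming
    s = apply_opcode(s, op)
print(STATES[s])   # Streaming
```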
# Monitor ASCS state changes during stream setup with btmon
$ sudo btmon | grep -A5 "Write Request\|Notification"
# Config Codec opcode (0x01) written to ASE Control Point:
> ACL Data TX: handle 0x0040 flags 0x00 dlen 20
ATT: Write Request (0x12) len 15
Handle: 0x002a (ASE Control Point)
Data: 01 01 01 ...
# [opcode][num_ASEs][ase_id][target latency/PHY][codec_id LC3=0x06][config LTVs...]
# Server notifies new ASE state = CODEC CONFIGURED (0x01):
< ACL Data RX: handle 0x0040 flags 0x02 dlen 12
ATT: Handle Value Notification (0x1b)
Handle: 0x0028 (ASE Characteristic)
Data: 01 01 ... # ase_id=1, state=CODEC_CONFIGURED
BAP defines the roles that devices take and the procedures they use. There are five roles:
| BAP Role | What It Does | Example Device |
|---|---|---|
| Unicast Client | Reads PACS, drives ASCS state machine | Phone, laptop, tablet |
| Unicast Server | Hosts PACS and ASCS, receives or sends audio | Earbud, hearing aid, speaker |
| Broadcast Source | Advertises and transmits a BIG | TV, audio transmitter |
| Broadcast Sink | Scans, syncs to PA, receives BIS audio | Hearing aid, earbud |
| Broadcast Assistant | Helps Broadcast Sink find sources (uses PAST) | Phone acting as a remote control for hearing aid |
The Broadcast Assistant uses BASS (Broadcast Audio Scan Service) to write the scan result to the Broadcast Sink over the ACL connection. The Broadcast Sink then uses PAST to synchronise to the broadcast without scanning for it itself — this saves significant battery on the hearing aid side.
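The Assistant hands the source over by writing an Add Source operation to the BASS control point. A sketch of building that payload — the field layout follows the BASS spec, but the address, SID, and Broadcast_ID values here are made up for illustration:

```python
import struct

# Sketch: a Broadcast Assistant building a BASS "Add Source" operation
# (opcode 0x02) to hand a discovered broadcast to the Broadcast Sink.

def bass_add_source(addr: bytes, addr_type: int, sid: int,
                    broadcast_id: int, pa_sync: int, pa_interval: int) -> bytes:
    assert len(addr) == 6
    return (bytes([0x02])              # opcode: Add Source
            + bytes([addr_type])       # 0 = public, 1 = random
            + addr                     # advertiser address
            + bytes([sid])             # Advertising SID
            + broadcast_id.to_bytes(3, "little")
            + bytes([pa_sync])         # 0x01 = synchronize to PA, PAST available
            + struct.pack("<H", pa_interval)
            + bytes([0x00]))           # Num_Subgroups = 0 (none listed here)

payload = bass_add_source(bytes.fromhex("aabbccddeeff"), 1, 0x01,
                          0x123456, 0x01, 0x0960)
print(payload.hex())
```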
2. Rendering and Capture Control
Once a stream is running, the user wants to control volume and microphones. These specs handle that. An important design decision in BLE Audio: the final volume gain is always applied at the audio sink (earbud or speaker), not at the source. The audio stream is transmitted at line level, preserving maximum dynamic range. The gain is adjusted at the sink end.
Volume control in BLE Audio is more complex than a single slider because audio devices can have multiple inputs, multiple outputs, and multiple controllers at the same time. Three services split the work: VCS (Volume Control Service) provides the device-wide master volume and mute, operated by a VCP (Volume Control Profile) client on the phone; VOCS (Volume Offset Control Service) applies a per-output volume offset, which is how left/right balance is implemented; and AICS (Audio Input Control Service) controls the gain and mute of each individual audio input.
MICP (Microphone Control Profile) and MICS (Microphone Control Service) control the overall mute state of the microphone(s) in an earbud or hearing aid. They work alongside AICS — AICS controls the gain of each individual microphone input, while MICS provides a master mute for all captured audio that is destined for a BLE stream.
A typical hearing aid has at least two microphones (front and rear) for directional processing. AICS gives individual gain control over each; MICS mutes them all at once (useful for push-to-talk scenarios or muting yourself on a call).
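How the master mute composes with per-input gain can be shown with a toy model. This is a sketch of the behaviour described above, not BlueZ API code; the class and field names are illustrative:

```python
from dataclasses import dataclass, field

# Toy model: MICS master mute over per-input AICS gain,
# for a two-microphone hearing aid.

@dataclass
class AicsInput:
    name: str
    gain_db: int = 0          # per-input gain, set via AICS
    muted: bool = False       # per-input mute, also AICS

@dataclass
class Device:
    mics_muted: bool = False  # master mute, set via MICS
    inputs: list = field(default_factory=list)

    def effective(self):
        # MICS mute silences everything destined for the BLE stream;
        # None means "no audio captured from this input"
        return [(i.name, None if (self.mics_muted or i.muted) else i.gain_db)
                for i in self.inputs]

dev = Device(inputs=[AicsInput("front", gain_db=6), AicsInput("rear", gain_db=3)])
print(dev.effective())        # both inputs live, individual gains apply
dev.mics_muted = True         # "mute myself" on a call
print(dev.effective())        # all inputs silenced at once
```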
3. Content Control
Content control covers everything related to what is playing and the state of calls — not the audio stream itself, but the application that generates it. These specs replicate what AVRCP does for A2DP and what HFP does for calls, but in a more flexible, decoupled way. Because they are separated from the audio streams, they can manage transitions — for example, automatically pausing music when a call arrives.
MCS (Media Control Service) lives on the audio source — the phone or PC that is playing music. It exposes the state of the media player as a GATT service. MCP (Media Control Profile) on the earbud reads and controls that state.
The MCS state machine covers three primary states: Playing, Paused, and Seeking, plus an Inactive state for when no track is loaded.
Beyond basic play/pause, MCS provides: track navigation (next, previous, fast-forward, rewind), playback order (single, repeat, shuffle), group management, metadata (track title, artist, duration), playback speed control, and content search using the Object Transfer Service (OTS). A suitably capable MCP implementation can fully replicate a music player UI from an earbud. If there are multiple media applications on the phone, each gets its own MCS instance. The single-instance variant — GMCS (Generic Media Control Service) — acts as a unified interface across all players.
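Control happens by writing one-byte opcodes to the Media Control Point characteristic. A sketch of the encoding — the opcode values follow the MCS spec's assigned numbers, and the GATT write itself is out of scope here:

```python
# Sketch: Media Control Point opcodes an MCP client writes to MCS.

MEDIA_CONTROL_OPCODES = {
    "play": 0x01,
    "pause": 0x02,
    "fast_rewind": 0x03,
    "fast_forward": 0x04,
    "stop": 0x05,
    "previous_track": 0x30,
    "next_track": 0x31,
}

def media_command(name: str) -> bytes:
    """Encode a one-byte Media Control Point write."""
    return bytes([MEDIA_CONTROL_OPCODES[name]])

print(media_command("pause").hex())   # 02
```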
TBS (Telephone Bearer Service) lives on the call device (phone, PC, laptop). CCP (Call Control Profile) on the earbud controls the call by writing opcodes to TBS. The key difference from HFP: TBS is designed for the way telephony actually works today.
HFP was designed around a single cellular call. TBS handles:
- Multiple simultaneous calls (cellular, SIP, Zoom, Teams) — each call type is a separate bearer
- Call operations: accept, terminate, hold, retrieve, join, silence incoming ring
- Caller ID, call state (incoming, dialing, active, held, remotely held)
- In-band and out-of-band ringtone selection
- Signal strength (useful for informing the user of call quality)
Like MCS, TBS can be instantiated once per bearer (e.g., one TBS for the cellular app, one for the VoIP app), or as a single GTBS (Generic TBS) that directs all commands to the correct underlying app.
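The per-call states a CCP client reads from the TBS Call State characteristic can be modelled as an enum. Values follow the TBS spec; the two-bearer example is illustrative:

```python
from enum import IntEnum

# Sketch: TBS call states as exposed in the Call State characteristic.

class CallState(IntEnum):
    INCOMING = 0x00
    DIALING = 0x01
    ALERTING = 0x02
    ACTIVE = 0x03
    LOCALLY_HELD = 0x04
    REMOTELY_HELD = 0x05
    LOCALLY_AND_REMOTELY_HELD = 0x06

# One TBS instance per bearer: a cellular call and a VoIP call coexist,
# each tracked by its own call index
calls = {
    1: ("cellular", CallState.ACTIVE),
    2: ("voip", CallState.INCOMING),
}
for idx, (bearer, state) in calls.items():
    print(f"call {idx} on {bearer}: {state.name}")
```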
4. Transition and Coordination Control
Coordination is the glue layer. When you pause music on a left earbud, the right one should pause too. When a new connection arrives, both earbuds should transition together — not one going to the phone and the other staying with the TV. CSIP/CSIS and CAP handle this.
When two earbuds are manufactured as a pair, each one is configured with a CSIS (Coordinated Set Identification Service) instance. CSIS holds a Set Identity Resolving Key (SIRK) that allows the phone to discover that two separate BLE devices belong to the same set.
Unicast Client (CSIP Client)
            │
            ▼
   Coordinated Set
   ├─ CSIS member #1: ASCS + PACS + VCS
   └─ CSIS member #2: ASCS + PACS + VCS
CSIS introduces two key concepts: Lock — before a transition (e.g., switching from TV to phone), CAP acquires a lock on all set members so they transition together; and Rank — determines which set member has priority for operations that can only apply to one member at a time. Devices configured as Coordinated Set members are typically set up this way at manufacturing time.
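The locking procedure can be sketched in a few lines. Acquiring the Lock in ascending Rank order is what prevents two competing clients from deadlocking each other; the member objects here are stand-ins for GATT writes to the Set Member Lock characteristic (Locked = 0x02, Unlocked = 0x01):

```python
# Sketch: a CAP-style client locking every Coordinated Set member
# before a transition, in Rank order.

class SetMember:
    def __init__(self, name, rank):
        self.name, self.rank, self.locked = name, rank, False
    def lock(self):
        self.locked = True   # real code: GATT write Lock = 0x02
    def unlock(self):
        self.locked = False  # real code: GATT write Lock = 0x01

def locked_transition(members, action):
    ordered = sorted(members, key=lambda m: m.rank)  # ascending Rank
    try:
        for m in ordered:
            m.lock()
        action()                       # e.g. switch both buds from TV to phone
    finally:
        for m in reversed(ordered):    # release in reverse order
            m.unlock()

pair = [SetMember("right", 2), SetMember("left", 1)]
locked_transition(pair, lambda: print("both earbuds switched together"))
```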
CAP (Common Audio Profile) is the orchestrator. It introduces the Commander role — a device that can remotely control Bluetooth LE Audio streams across multiple devices. The Commander can be a phone, a tablet, or a smartwatch.
CAP uses CSIP/CSIS to treat a Coordinated Set as a single entity, and introduces two important concepts:
- Context Types — metadata about what kind of audio is being played (Conversational, Media, Game, Ringtone, Alarm, etc.). Devices use this to decide whether to accept or prioritise a connection request. For example, a hearing aid set to “prioritise ringtones” will interrupt media to take an incoming call.
- Content Control IDs (CCID) — links a specific audio stream to the media or telephony service that controls it. This allows the Commander to know which TBS or MCS instance controls which stream, enabling clean transitions (pause stream X when call Y arrives).
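Context Types are carried as a bitfield, so availability checks are simple mask operations. A sketch using the bit assignments from the Bluetooth Assigned Numbers for audio contexts:

```python
# Sketch: checking a sink's Available Audio Contexts bitfield.

CONTEXTS = {
    "unspecified": 0x0001, "conversational": 0x0002, "media": 0x0004,
    "game": 0x0008, "instructional": 0x0010, "voice_assistants": 0x0020,
    "live": 0x0040, "sound_effects": 0x0080, "notifications": 0x0100,
    "ringtone": 0x0200, "alarms": 0x0400, "emergency_alarm": 0x0800,
}

def available(bitmask: int, context: str) -> bool:
    """Does the sink currently accept this context type?"""
    return bool(bitmask & CONTEXTS[context])

# Hearing aid advertising media + conversational + ringtone availability
mask = CONTEXTS["media"] | CONTEXTS["conversational"] | CONTEXTS["ringtone"]
print(available(mask, "ringtone"))   # True
print(available(mask, "game"))       # False
```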
5. Top Level Profiles
Top level profiles sit above the GAF. They add application-specific requirements — mandating optional GAF features and specifying codec configurations required for their use case. They are intentionally lean, building on what GAF already provides.
| Profile | Full Name | What It Adds | Target Device |
|---|---|---|---|
| HAP / HAS | Hearing Access Profile/Service | Mandates specific BAP configs, hearing aid presets (programmed by audiologist) | Hearing aids |
| TMAP | Telephony and Media Audio Profile | Higher quality codec settings (32kHz/48kHz), richer MCP/CCP control | Consumer earbuds, headsets |
| PBP | Public Broadcast Profile | Standardises broadcast stream discovery and metadata for public installations (no GATT — no connection) | Airport PA, cinema, events |
BlueZ GAF Implementation Notes
In BlueZ, GAF profiles are exposed to applications via D-Bus. An application that wants to act as a BAP audio endpoint (Unicast Server or Broadcast Sink) registers itself with bluetoothd using the org.bluez.Media1 interface’s RegisterEndpoint method.
"""
BlueZ BAP Sink endpoint registration (Python + dbus-python)
Registers an LC3 capable sink endpoint with bluetoothd.
bluetoothd will then populate PACS on the device.
"""
import dbus, dbus.service, dbus.mainloop.glib
from gi.repository import GLib
BLUEZ_SERVICE = 'org.bluez'
MEDIA_IFACE = 'org.bluez.Media1'
ENDPOINT_IFACE = 'org.bluez.MediaEndpoint1'
# LC3 codec ID as per Bluetooth Assigned Numbers
LC3_CODEC_ID = 0x06
# Sink PAC characteristic UUID (0x2BC9), used by BlueZ for BAP sink endpoints
BAP_SINK_UUID = '00002bc9-0000-1000-8000-00805f9b34fb'
class BAPSinkEndpoint(dbus.service.Object):
def __init__(self, bus, path):
dbus.service.Object.__init__(self, bus, path)
    @dbus.service.method(ENDPOINT_IFACE,
                         in_signature='oa{sv}', out_signature='')
    def SetConfiguration(self, transport, props):
        # bluetoothd calls this once the stream is configured;
        # props contains Device, UUID, Codec, Configuration (LTV encoded)
        print("Stream configured:", transport, dict(props))

    @dbus.service.method(ENDPOINT_IFACE,
                         in_signature='o', out_signature='')
    def ClearConfiguration(self, transport):
        print("Stream cleared (ASE released)")

    @dbus.service.method(ENDPOINT_IFACE,
                         in_signature='ay', out_signature='ay')
    def SelectConfiguration(self, caps):
        # Return the preferred LC3 config chosen from the offered
        # capabilities, as LTV triplets [Length, Type, Value...]
        return dbus.Array([
            0x02, 0x01, 0x03,        # Sampling frequency: 0x03 = 16kHz
            0x02, 0x02, 0x01,        # Frame duration: 0x01 = 10ms
            0x03, 0x04, 0x1a, 0x00,  # Octets per frame: 26 (uint16 LE)
        ], signature='y')
def main():
dbus.mainloop.glib.DBusGMainLoop(set_as_default=True)
bus = dbus.SystemBus()
adapter_path = '/org/bluez/hci0'
media = dbus.Interface(
bus.get_object(BLUEZ_SERVICE, adapter_path),
MEDIA_IFACE)
endpoint_path = '/org/embeddedpathashala/bap_sink_0'
endpoint = BAPSinkEndpoint(bus, endpoint_path)
    # LC3 capabilities LTV structure [Length, Type, Value...]
    caps = [
        0x03, 0x01, 0xb4, 0x00,              # Sampling freqs bitfield: 16/24/32/48kHz
        0x02, 0x02, 0x03,                    # Frame durations: 7.5ms + 10ms
        0x05, 0x04, 0x1a, 0x00, 0xf0, 0x00,  # Octets per frame: min 26, max 240
    ]
media.RegisterEndpoint(
dbus.ObjectPath(endpoint_path),
{
'UUID': dbus.String(BAP_SINK_UUID),
'Codec': dbus.Byte(LC3_CODEC_ID),
'Capabilities': dbus.Array(caps, signature='y'),
}
)
GLib.MainLoop().run()
if __name__ == '__main__':
main()
Monitor the ASCS state machine transitions during stream setup:
# btmon output — full unicast stream setup sequence
$ sudo btmon
# 1. Phone reads PACS from earbud
> ATT Read Request Handle: PACS Sink PAC
# 2. Phone writes Config Codec opcode to ASCS Control Point
> ATT Write Request Handle: ASE Control Point
Opcode: 0x01 (Config Codec) ASE_ID: 0x01
Codec: LC3 Freq: 48kHz Duration: 10ms Octets: 120
# 3. Earbud notifies CODEC CONFIGURED state
< ATT Handle Value Notification Handle: ASE 0x01
State: 0x01 (Codec Configured)
# 4. Phone writes Config QoS
> ATT Write Request Opcode: 0x02 (Config QoS)
CIG_ID: 0x00 CIS_ID: 0x00 SDU_Interval: 10000us RTN: 2
# 5. Earbud: QoS CONFIGURED
< ATT Notification State: 0x02 (QoS Configured)
# 6. Phone creates the CIS at HCI level
> HCI: LE Create CIS (CIG 0x00, CIS 0x00)
< HCI Event: LE CIS Established
# 7. Phone writes Enable
> ATT Write Request Opcode: 0x03 (Enable)
# 8. Earbud signals Receiver Start Ready — enters STREAMING
> ATT Write Request Opcode: 0x04 (Receiver Start Ready)
< ATT Notification State: 0x04 (Streaming)
# 9. ISO audio data flows
< HCI ISO Data handle: CIS_0 length: 120 bytes [LC3 encoded]
Summary — GAF at a Glance
| Spec | Type | Resides On | Responsibility |
|---|---|---|---|
| PACS | Service | Earbud / Sink | Advertises supported codec configurations |
| ASCS | Service | Earbud / Sink | Holds the ASE state machine for each stream |
| BAP | Profile | Phone / Source | Reads PACS, drives ASCS state machine, sets up CIG/CIS |
| BASS | Service | Broadcast Sink | Accepts broadcast source info from Broadcast Assistant |
| VCS / VOCS / AICS | Services | Earbud / Sink | Volume gain, balance, per-input gain |
| MCS / TBS | Services | Phone / Source | Media player state and call bearer state |
| CSIS | Service | Each earbud in a pair | Identifies left+right as members of a Coordinated Set |
| CAP | Profile | Commander device | Orchestrates all profiles; Context Types; CCID linking |
Next posts in this series will cover the LC3 codec internals in depth, how ISO sockets work at the kernel level in Linux, and practical BlueZ programming for building a BLE Audio source and sink.
