- Service: ASCS
- Profile: BAP
- States: 7 ASE states
- Transport: CIG / CIS
The Big Picture
PACS told us what an Acceptor can do. Now we need to actually set up an audio stream. That involves three layers working together:
- ASCS (Audio Stream Control Service): defines Audio Stream Endpoints (ASEs) and their state machine on the Acceptor
- BAP (Basic Audio Profile): defines the Client (Initiator) procedures to move through the state machine
- CAP (Common Audio Profile): wraps BAP procedures with coordination rules for multi-device setups
An ASE is a named endpoint for one direction of one audio stream, living on the Acceptor. It’s like a socket: the Initiator connects a stream to it.
- Sink ASE: audio flows into the Acceptor (playback, e.g. music to earbuds)
- Source ASE: audio flows out of the Acceptor (capture, e.g. microphone in earbuds)
The Acceptor maintains a separate ASE instance for every connected Initiator. So if two phones are connected to the same earbuds, each phone sees an independent set of ASE states.
[Figure: one Acceptor (earbuds) serving three clients. The first client sees its ASEs, including ASE 2 (Sink), in STREAMING; the second sees them in IDLE; the third sees them in CODEC CONFIGURED.]
Same physical GATT handles, different values per client namespace.
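One way to picture this per-client bookkeeping is a small table indexed by client and ASE ID. The sketch below is illustrative only: `MAX_CLIENTS`, `ASES_PER_CLIENT`, and all function names are assumptions made for this example, not part of ASCS or of any particular stack.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical limits for this sketch; a real stack sizes these itself. */
#define MAX_CLIENTS     3
#define ASES_PER_CLIENT 2

/* A few ASE_State values (0x00 = Idle, 0x01 = Codec Configured, ...). */
enum { ASE_IDLE = 0x00, ASE_CODEC_CONFIGURED = 0x01, ASE_STREAMING = 0x04 };

/* One row of ASE states per connected Initiator. */
struct acceptor_ase_table {
    uint8_t state[MAX_CLIENTS][ASES_PER_CLIENT];
};

/* ASE IDs start at 1 in this sketch. */
static int ase_slot_valid(int client, uint8_t ase_id)
{
    return client >= 0 && client < MAX_CLIENTS &&
           ase_id >= 1 && ase_id <= ASES_PER_CLIENT;
}

void ase_table_init(struct acceptor_ase_table *t)
{
    memset(t->state, ASE_IDLE, sizeof(t->state));
}

/* Returns 0 on success, -1 if the client index or ASE ID is out of range. */
int ase_table_set(struct acceptor_ase_table *t, int client,
                  uint8_t ase_id, uint8_t new_state)
{
    if (!ase_slot_valid(client, ase_id))
        return -1;
    t->state[client][ase_id - 1] = new_state;
    return 0;
}

/* Returns the stored state, or -1 if the slot does not exist. */
int ase_table_get(const struct acceptor_ase_table *t, int client,
                  uint8_t ase_id)
{
    if (!ase_slot_valid(client, ase_id))
        return -1;
    return t->state[client][ase_id - 1];
}
```

Setting ASE 2 to Streaming for client 0 leaves ASE 2 of client 1 in Idle, which is the independence the paragraph above describes.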
Every ASE moves through a defined set of states. The Initiator drives most transitions by writing opcodes to the ASE Control Point characteristic.
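- 0x00 Idle
- 0x01 Codec Configured: codec and PHY selected
- 0x02 QoS Configured: SDU interval, RTN, latency set
- 0x03 Enabling: CIS being established
- 0x04 Streaming: audio data flowing
- 0x05 Disabling: can re-enable without full reconfiguration
- 0x06 Releasing: tearing down CIS + data path, then back to Idle (no caching) or Codec Configured (with caching)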
ASE Control Point Opcodes
| Opcode | Operation | Next State |
|---|---|---|
| 0x01 | Config Codec | Codec Configured |
| 0x02 | Config QoS | QoS Configured |
| 0x03 | Enable | Enabling |
| 0x04 | Receiver Start Ready | Streaming (Source ASE only) |
| 0x05 | Disable | Disabling (Source ASE) / QoS Configured (Sink ASE) |
| 0x06 | Receiver Stop Ready | QoS Configured (Source ASE) |
| 0x07 | Update Metadata | Same state |
| 0x08 | Release | Releasing |
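The table can be read as a transition function. The following is a minimal sketch based on my reading of the ASCS state machine; the enum and the function name `ase_next_state` are invented for this example, and the validity checks should be treated as an approximation rather than a normative statement.

```c
#include <stdint.h>

enum ase_state {
    ASE_STATE_IDLE             = 0x00,
    ASE_STATE_CODEC_CONFIGURED = 0x01,
    ASE_STATE_QOS_CONFIGURED   = 0x02,
    ASE_STATE_ENABLING         = 0x03,
    ASE_STATE_STREAMING        = 0x04,
    ASE_STATE_DISABLING        = 0x05,
    ASE_STATE_RELEASING        = 0x06
};

/*
 * Returns the state an ASE enters after the Initiator writes `opcode`,
 * or -1 if that opcode is not accepted in state `cur`.
 * is_source: nonzero for a Source ASE, zero for a Sink ASE.
 */
int ase_next_state(enum ase_state cur, uint8_t opcode, int is_source)
{
    switch (opcode) {
    case 0x01: /* Config Codec */
        return (cur == ASE_STATE_IDLE ||
                cur == ASE_STATE_CODEC_CONFIGURED ||
                cur == ASE_STATE_QOS_CONFIGURED)
                   ? ASE_STATE_CODEC_CONFIGURED : -1;
    case 0x02: /* Config QoS */
        return (cur == ASE_STATE_CODEC_CONFIGURED ||
                cur == ASE_STATE_QOS_CONFIGURED)
                   ? ASE_STATE_QOS_CONFIGURED : -1;
    case 0x03: /* Enable */
        return cur == ASE_STATE_QOS_CONFIGURED ? ASE_STATE_ENABLING : -1;
    case 0x04: /* Receiver Start Ready: written by the client for Source ASEs */
        return (is_source && cur == ASE_STATE_ENABLING)
                   ? ASE_STATE_STREAMING : -1;
    case 0x05: /* Disable: Sink goes straight back, Source passes Disabling */
        if (cur != ASE_STATE_ENABLING && cur != ASE_STATE_STREAMING)
            return -1;
        return is_source ? ASE_STATE_DISABLING : ASE_STATE_QOS_CONFIGURED;
    case 0x06: /* Receiver Stop Ready */
        return (is_source && cur == ASE_STATE_DISABLING)
                   ? ASE_STATE_QOS_CONFIGURED : -1;
    case 0x07: /* Update Metadata: state is unchanged */
        return (cur == ASE_STATE_ENABLING || cur == ASE_STATE_STREAMING)
                   ? (int)cur : -1;
    case 0x08: /* Release */
        return (cur != ASE_STATE_IDLE && cur != ASE_STATE_RELEASING)
                   ? ASE_STATE_RELEASING : -1;
    default:
        return -1;
    }
}
```

A client could use a function like this to reject an out-of-order write locally before it ever reaches the Acceptor.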
Let’s walk through the entire process of the Initiator (phone) setting up stereo audio to a pair of earbuds. Each step is an ASE Control Point write followed by GATT notifications.
The Initiator selects a codec configuration from what PACS advertised and writes it to the ASE. At this point it also specifies a Target Latency (low latency / balanced / high reliability) and a Target PHY (1M, 2M, Coded). These are recommendations: the Acceptor uses them to choose its preferred QoS parameters and reports back.
After receiving this, the Acceptor moves to Codec Configured state and sends back: preferred PHY, preferred RTN (retransmission number), max transport latency, presentation delay range, and the codec configuration it accepted.
#include <stdint.h>
#include <string.h>
/* ASE Control Point characteristic UUID: 0x2BC6 */
/* Sink ASE characteristic UUID: 0x2BC4 */
/* Source ASE characteristic UUID: 0x2BC5 */
#define ASE_CP_UUID 0x2BC6
#define OPCODE_CFG_CODEC 0x01
/* LC3 Codec_Specific_Configuration LTV Types
* (Different from Capabilities! These are single values, not bitfields)
*/
#define CFG_SAMPLING_FREQ 0x01 /* single byte enum, e.g. 0x03 = 16kHz */
#define CFG_FRAME_DURATION 0x02 /* 0x00=7.5ms, 0x01=10ms */
#define CFG_AUDIO_CHAN_ALLOC 0x03 /* 4-byte location bitfield */
#define CFG_OCTETS_PER_FRAME 0x04 /* 2-byte value */
#define CFG_FRAMES_PER_SDU 0x05 /* 1-byte value */
/* Sampling Frequency enum values for Config (NOT the same as PAC bitfield!) */
#define SAMPLING_8KHZ 0x01
#define SAMPLING_16KHZ 0x03
#define SAMPLING_24KHZ 0x05
#define SAMPLING_32KHZ 0x06
#define SAMPLING_44_1KHZ 0x07
#define SAMPLING_48KHZ 0x08
/* Audio location bits (for CFG_AUDIO_CHAN_ALLOC): bit 0 = Front Left, bit 1 = Front Right */
#define AUDIO_LOC_MONO 0x00000000
#define AUDIO_LOC_FRONT_LEFT 0x00000001
#define AUDIO_LOC_FRONT_RIGHT 0x00000002
/* Target Latency values */
#define TARGET_LATENCY_LOW 0x01
#define TARGET_LATENCY_BALANCED 0x02
#define TARGET_LATENCY_HIGH_REL 0x03
/* Target PHY values */
#define TARGET_PHY_1M 0x01
#define TARGET_PHY_2M 0x02
#define TARGET_PHY_CODED 0x03
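/*
* build_codec_specific_config_16_2 - build Codec_Specific_Configuration LTV
* bytes for the standard 16_2 codec setting (16kHz, 10ms, 40 octets/frame)
*
* Returns the number of bytes written, or -1 if out is too small.
*/
int build_codec_specific_config_16_2(uint32_t audio_location,
uint8_t *out, int out_size)
{
int pos = 0;
/* Five LTVs: 3 + 3 + 6 + 4 + 3 = 19 bytes */
if (out_size < 19)
return -1;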
/* LTV: Sampling_Frequency = 16kHz (value 0x03) */
out[pos++] = 0x02; /* Length: 1 type + 1 value */
out[pos++] = CFG_SAMPLING_FREQ;
out[pos++] = SAMPLING_16KHZ; /* 0x03 */
/* LTV: Frame_Duration = 10ms (value 0x01) */
out[pos++] = 0x02;
out[pos++] = CFG_FRAME_DURATION;
out[pos++] = 0x01; /* 10ms */
/* LTV: Audio_Channel_Allocation */
out[pos++] = 0x05; /* Length: 1 type + 4 value bytes */
out[pos++] = CFG_AUDIO_CHAN_ALLOC;
out[pos++] = (audio_location >> 0) & 0xFF;
out[pos++] = (audio_location >> 8) & 0xFF;
out[pos++] = (audio_location >> 16) & 0xFF;
out[pos++] = (audio_location >> 24) & 0xFF;
/* LTV: Octets_Per_Codec_Frame = 40 (0x0028) */
out[pos++] = 0x03; /* Length: 1 type + 2 value bytes */
out[pos++] = CFG_OCTETS_PER_FRAME;
out[pos++] = 0x28; /* low byte of 40 */
out[pos++] = 0x00; /* high byte */
/* LTV: Codec_Frame_Blocks_Per_SDU = 1 */
out[pos++] = 0x02;
out[pos++] = CFG_FRAMES_PER_SDU;
out[pos++] = 0x01;
return pos; /* total bytes written */
}
/*
* build_config_codec_cmd - build the full ASE Control Point write
* for opcode 0x01 (Config Codec) targeting one ASE
*
* @ase_id: ASE ID to configure
* @audio_location: AUDIO_LOC_FRONT_LEFT or AUDIO_LOC_FRONT_RIGHT
* @out_buf: output buffer for the GATT write
* @out_size: size of out_buf
*
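* Returns total bytes to write to ASE Control Point, or -1 if out_buf
* is too small.
*/
int build_config_codec_cmd(uint8_t ase_id, uint32_t audio_location,
uint8_t *out_buf, int out_size)
{
uint8_t cc_ltv[32];
int cc_len = build_codec_specific_config_16_2(audio_location,
cc_ltv, sizeof(cc_ltv));
int pos = 0;
/* 2 header + 3 ASE fields + 5 Codec_ID + 1 length byte + LTV bytes */
if (cc_len < 0 || out_size < 11 + cc_len)
return -1;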
out_buf[pos++] = OPCODE_CFG_CODEC; /* Opcode */
out_buf[pos++] = 0x01; /* Number of ASEs = 1 */
/* ASE [0] */
out_buf[pos++] = ase_id;
out_buf[pos++] = TARGET_LATENCY_BALANCED; /* Target_Latency */
out_buf[pos++] = TARGET_PHY_2M; /* Target_PHY */
/* Codec_ID: LC3 = {0x06, 0x00, 0x00, 0x00, 0x00} */
out_buf[pos++] = 0x06; /* Coding_Format = LC3 */
out_buf[pos++] = 0x00; /* Company_ID low */
out_buf[pos++] = 0x00; /* Company_ID high */
out_buf[pos++] = 0x00; /* Vendor_Defined_Codec_ID low */
out_buf[pos++] = 0x00; /* Vendor_Defined_Codec_ID high */
/* Codec_Specific_Configuration_Length + data */
out_buf[pos++] = (uint8_t)cc_len;
memcpy(&out_buf[pos], cc_ltv, cc_len);
pos += cc_len;
return pos;
}
⚠️ Capabilities vs Configuration LTVs are different! In PAC records, Supported_Sampling_Frequencies is a bitfield (e.g. 0x0014 = 16+24 kHz). In Config Codec, Sampling_Frequency is a single enum value (e.g. 0x03 = 16 kHz). This is a common mistake for beginners.
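To make the difference concrete, the sketch below checks a Config Codec sampling-frequency value against a PAC bitfield. It assumes that configuration value n corresponds to bit n-1 of Supported_Sampling_Frequencies, which matches the examples in the text (0x03 and bit 2 for 16 kHz, 0x05 and bit 4 for 24 kHz); the function name is invented for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true if the single-value Sampling_Frequency used in Config Codec
 * (0x01 = 8 kHz, 0x03 = 16 kHz, ...) is present in the PAC record's
 * Supported_Sampling_Frequencies bitfield.
 * Assumption: configuration value n maps to bit (n - 1).
 */
bool pac_supports_sampling_freq(uint16_t supported_bitfield, uint8_t cfg_value)
{
    /* Values outside 0x01..0x0D are not defined sampling frequencies. */
    if (cfg_value < 0x01 || cfg_value > 0x0D)
        return false;
    return (supported_bitfield & (uint16_t)(1u << (cfg_value - 1))) != 0;
}
```

With the bitfield 0x0014 from the example, the values 0x03 (16 kHz) and 0x05 (24 kHz) are accepted and 0x08 (48 kHz) is rejected.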
Before writing Config QoS to the Acceptor, the Initiator first instructs its own Controller using the LE_Set_CIG_Parameters HCI command. This tells the Controller the desired scheduling for all CISes in the CIG.
The Controller confirms it can schedule the requested configuration and returns Connection Handles for each CIS. Only then does the Initiator write Config QoS to the Acceptor, using the same parameter values.
The one extra parameter sent only to the Acceptor (not in HCI) is the Presentation Delay โ the time from when an SDU is received to when audio is played out. For earbuds, this is the same for both ears so they stay in sync.
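Since each Acceptor reports a presentation delay range in the Codec Configured state, the Initiator has to pick one value that every earbud supports. The sketch below shows one possible policy, choosing the smallest delay inside the overlap of all ranges; the struct, the function name, and the "smallest value" policy are assumptions for illustration.

```c
#include <stdint.h>

/* Presentation delay range reported by one Acceptor, in microseconds. */
struct pd_range {
    uint32_t min_us;
    uint32_t max_us;
};

/*
 * Picks a single presentation delay that lies inside every range.
 * Policy (an assumption of this sketch): use the lowest common value.
 * Returns 0 and stores the result in *delay_us, or -1 if the ranges
 * do not overlap or count is not positive.
 */
int pick_common_presentation_delay(const struct pd_range *ranges, int count,
                                   uint32_t *delay_us)
{
    uint32_t lo = 0;
    uint32_t hi = UINT32_MAX;
    int i;

    if (count <= 0)
        return -1;
    for (i = 0; i < count; i++) {
        if (ranges[i].min_us > lo)
            lo = ranges[i].min_us;   /* highest minimum so far */
        if (ranges[i].max_us < hi)
            hi = ranges[i].max_us;   /* lowest maximum so far */
    }
    if (lo > hi)
        return -1;                   /* no value suits every Acceptor */
    *delay_us = lo;
    return 0;
}
```

For ranges of 20 to 40 ms on one earbud and 30 to 50 ms on the other, this yields 30,000 µs for both Config QoS writes.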
#include <stdint.h>
#include <string.h>
#define OPCODE_CFG_QOS 0x02
/*
* CIG / CIS QoS parameters for standard 16_2_1 configuration
* (16kHz, 10ms frame, 40 octets, Low Latency)
*
* From BAP Table 5.2, configuration named "16_2_1":
* SDU_Interval = 10,000 µs (10ms)
* Framing = Unframed (0x00)
* PHY = 2M (0x02)
* Max_SDU = 40 octets
* RTN = 2
* Max_Transport_Lat = 10ms
* Presentation_Del = 40,000 µs (40ms)
*/
#define SDU_INTERVAL_US 10000 /* 10ms in microseconds */
#define FRAMING_UNFRAMED 0x00
#define PHY_2M 0x02
#define MAX_SDU_16_2 40
#define RTN_16_2_1 2
#define MAX_LATENCY_MS 10
#define PRES_DELAY_US 40000 /* 40ms presentation delay */
/*
* put_u24_le - write 3-byte little-endian value
*/
static void put_u24_le(uint8_t *p, uint32_t val)
{
p[0] = (val >> 0) & 0xFF;
p[1] = (val >> 8) & 0xFF;
p[2] = (val >> 16) & 0xFF;
}
/*
* put_u16_le - write 2-byte little-endian value
*/
static void put_u16_le(uint8_t *p, uint16_t val)
{
p[0] = val & 0xFF;
p[1] = (val >> 8) & 0xFF;
}
/*
* build_config_qos_cmd - build ASE Control Point write for Config QoS
*
* @ase_id: ASE ID
* @cig_id: CIG ID assigned by Initiator's Host (e.g. 0x01)
* @cis_id: CIS ID within the CIG (e.g. 0x01 for left, 0x02 for right)
* @out_buf: output buffer
* @out_size: buffer size
*/
int build_config_qos_cmd(uint8_t ase_id, uint8_t cig_id, uint8_t cis_id,
uint8_t *out_buf, int out_size)
{
int pos = 0;
/* 2 header bytes + 16 bytes of per-ASE QoS fields */
if (out_size < 18)
return -1;
out_buf[pos++] = OPCODE_CFG_QOS; /* 0x02 */
out_buf[pos++] = 0x01; /* Number of ASEs */
/* ASE [0] */
out_buf[pos++] = ase_id;
out_buf[pos++] = cig_id;
out_buf[pos++] = cis_id;
/* SDU_Interval: 3 bytes, little-endian (units: microseconds) */
put_u24_le(&out_buf[pos], SDU_INTERVAL_US);
pos += 3;
out_buf[pos++] = FRAMING_UNFRAMED; /* Framing */
out_buf[pos++] = PHY_2M; /* PHY */
/* Max_SDU: 2 bytes */
put_u16_le(&out_buf[pos], MAX_SDU_16_2);
pos += 2;
out_buf[pos++] = RTN_16_2_1; /* Retransmission_Number */
/* Max_Transport_Latency: 2 bytes (milliseconds) */
put_u16_le(&out_buf[pos], MAX_LATENCY_MS);
pos += 2;
/* Presentation_Delay: 3 bytes, little-endian (microseconds) */
put_u24_le(&out_buf[pos], PRES_DELAY_US);
pos += 3;
return pos;
}
/*
* HCI LE Set CIG Parameters: sent BEFORE Config QoS to the Acceptor
*
* HCI command opcode: OGF=0x08, OCF=0x0062
* In BlueZ, the kernel handles this via the ISO socket API, or testing
* tools can send it directly via hci_send_cmd().
*
* Key parameters relevant to 16_2_1 (two CISes, Central to Peripheral only):
*/
struct hci_le_set_cig_params {
uint8_t cig_id;
uint8_t sdu_interval_c_to_p[3]; /* Central to Peripheral */
uint8_t sdu_interval_p_to_c[3]; /* Peripheral to Central (0 if sink only) */
uint8_t worst_case_sca; /* get via LE_Request_Peer_SCA */
uint8_t packing; /* 0=sequential, 1=interleaved */
uint8_t framing; /* 0=let controller decide */
uint8_t max_transport_latency_c_to_p[2];
uint8_t max_transport_latency_p_to_c[2];
uint8_t cis_count;
/* Followed by per-CIS parameters for cis_count CISes */
} __attribute__((packed));
void example_fill_cig_params(struct hci_le_set_cig_params *p)
{
p->cig_id = 0x01;
/* SDU_Interval = 10,000 µs = 0x002710, stored little-endian */
p->sdu_interval_c_to_p[0] = 0x10;
p->sdu_interval_c_to_p[1] = 0x27;
p->sdu_interval_c_to_p[2] = 0x00;
/* Sink-only: P to C interval can be 0 */
memset(p->sdu_interval_p_to_c, 0, 3);
p->worst_case_sca = 0x00; /* 251 to 500 ppm (obtained from peer) */
p->packing = 0x00; /* sequential */
p->framing = 0x00; /* let Controller decide per-CIS */
/* Max Transport Latency = 10 ms = 0x000A */
p->max_transport_latency_c_to_p[0] = 0x0A;
p->max_transport_latency_c_to_p[1] = 0x00;
memset(p->max_transport_latency_p_to_c, 0, 2);
p->cis_count = 2; /* left + right earbud */
}
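The struct above stops at `cis_count`; each CIS then adds its own parameter block. The sketch below serializes one such block, assuming the 9-byte per-CIS layout of LE Set CIG Parameters as I understand it from the Core Specification (CIS_ID, Max_SDU in each direction, PHY in each direction, RTN in each direction). The struct and function names are invented for this example.

```c
#include <stdint.h>

#define CIS_PARAMS_LEN 9 /* bytes per CIS in LE Set CIG Parameters */

/* Per-CIS parameters; C_To_P is Central to Peripheral, P_To_C the reverse. */
struct cis_params {
    uint8_t  cis_id;
    uint16_t max_sdu_c_to_p;
    uint16_t max_sdu_p_to_c; /* 0 when the direction is unused */
    uint8_t  phy_c_to_p;     /* bitmask: bit 0 = 1M, bit 1 = 2M, bit 2 = Coded */
    uint8_t  phy_p_to_c;
    uint8_t  rtn_c_to_p;
    uint8_t  rtn_p_to_c;
};

/*
 * Writes one per-CIS block in little-endian wire order.
 * Returns CIS_PARAMS_LEN, or -1 if the buffer is too small.
 */
int put_cis_params(const struct cis_params *c, uint8_t *out, int out_size)
{
    if (out_size < CIS_PARAMS_LEN)
        return -1;
    out[0] = c->cis_id;
    out[1] = (uint8_t)(c->max_sdu_c_to_p & 0xFF);
    out[2] = (uint8_t)(c->max_sdu_c_to_p >> 8);
    out[3] = (uint8_t)(c->max_sdu_p_to_c & 0xFF);
    out[4] = (uint8_t)(c->max_sdu_p_to_c >> 8);
    out[5] = c->phy_c_to_p;
    out[6] = c->phy_p_to_c;
    out[7] = c->rtn_c_to_p;
    out[8] = c->rtn_p_to_c;
    return CIS_PARAMS_LEN;
}
```

For the stereo example, the left earbud's block would carry CIS ID 1, a 40-octet Max_SDU toward the Peripheral, the 2M PHY, and RTN 2; the right earbud's block differs only in the CIS ID.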
The Enable command attaches a Context Type (like Media) to the stream via metadata. After the Acceptor acknowledges Enable, the Initiator creates the CIS using the LE_Create_CIS HCI command. The Link Layers on both sides exchange a CIS request and response (LL_CIS_REQ / LL_CIS_RSP), and the CIS is established.
Both sides then call LE_Setup_ISO_Data_Path to bind the codec to the CIS connection handle. Once the Acceptor’s data path is ready, it autonomously transitions the Sink ASE to Streaming and notifies the Initiator; no extra command is needed.
For a Source ASE (microphone), the Acceptor waits. The Initiator must send a Receiver Start Ready (0x04) command to signal it is ready to receive audio packets. Only then does the Source ASE move to Streaming.
#include <stdint.h>
#include <string.h>
#define OPCODE_ENABLE 0x03
#define OPCODE_RECV_START_READY 0x04
/* Metadata LTV Types */
#define META_STREAMING_AUDIO_CTX 0x02 /* Streaming_Audio_Contexts */
#define META_CCID_LIST 0x05 /* Content Control ID list */
/*
* build_enable_cmd - build the Enable command with Streaming_Audio_Contexts
*
* @ase_id: which ASE to enable
* @audio_ctx: context type bitfield (e.g. CONTEXT_TYPE_MEDIA = 0x0004)
* @out_buf: output buffer
* @out_size: buffer size
*/
int build_enable_cmd(uint8_t ase_id, uint16_t audio_ctx,
uint8_t *out_buf, int out_size)
{
int pos = 0;
/* 3 header bytes + 1 Metadata_Length + 4 bytes of metadata */
if (out_size < 8)
return -1;
out_buf[pos++] = OPCODE_ENABLE;
out_buf[pos++] = 0x01; /* Number of ASEs */
/* ASE [0] */
out_buf[pos++] = ase_id;
/* Metadata: Streaming_Audio_Contexts LTV */
/* Total metadata length = 4 (1 Len + 1 Type + 2 Value) */
out_buf[pos++] = 0x04; /* Metadata_Length */
/* LTV: Streaming_Audio_Contexts */
out_buf[pos++] = 0x03; /* L: 1 type + 2 value = 3 */
out_buf[pos++] = META_STREAMING_AUDIO_CTX; /* T: 0x02 */
out_buf[pos++] = audio_ctx & 0xFF; /* V low byte */
out_buf[pos++] = (audio_ctx >> 8) & 0xFF; /* V high byte */
return pos;
}
/*
* build_receiver_start_ready_cmd - for Source ASEs only
* Tells the Acceptor the Initiator is ready to receive audio data
*/
int build_receiver_start_ready_cmd(uint8_t ase_id,
uint8_t *out_buf, int out_size)
{
if (out_size < 3)
return -1;
out_buf[0] = OPCODE_RECV_START_READY;
out_buf[1] = 0x01; /* Number of ASEs */
out_buf[2] = ase_id;
return 3;
}
/*
* Example: setting up stereo music to a pair of earbuds using LE Audio
*
* Assumptions:
* - Left earbud: ASE ID=1 (Sink), CIS ID=1
* - Right earbud: ASE ID=1 (Sink), CIS ID=2
* - Both in the same CIG ID=1
* - Configuration: 16_2_1 QoS (16kHz, 10ms, 40 octets); the Target_Latency hint written in Config Codec is Balanced
* - Audio context: Media (0x0004)
*/
void setup_stereo_music_stream(void)
{
uint8_t buf[64];
int len;
/* --- Left earbud connection --- */
/* 1. Config Codec on left earbud, ASE ID=1, Front Left */
len = build_config_codec_cmd(1, AUDIO_LOC_FRONT_LEFT, buf, sizeof(buf));
/* Write buf[0..len-1] to left earbud's ASE Control Point (GATT write) */
/* 2. Config QoS on left earbud */
len = build_config_qos_cmd(1, 0x01 /*cig*/, 0x01 /*cis*/, buf, sizeof(buf));
/* Write to left earbud's ASE Control Point */
/* 3. Enable on left earbud with Media context */
len = build_enable_cmd(1, 0x0004 /*MEDIA*/, buf, sizeof(buf));
/* Write to left earbud's ASE Control Point */
/* --- Right earbud connection (same flow, different CIS ID) --- */
len = build_config_codec_cmd(1, AUDIO_LOC_FRONT_RIGHT, buf, sizeof(buf));
/* ... */
len = build_config_qos_cmd(1, 0x01, 0x02, buf, sizeof(buf));
/* ... */
len = build_enable_cmd(1, 0x0004, buf, sizeof(buf));
/* ... */
/* 4. Create CISes via HCI LE Create CIS (both earbuds, CIG=1) */
/* After LE CIS Established events, both sides call LE Setup ISO Data Path */
/* 5. Both Sink ASEs autonomously move to STREAMING */
/* Audio data now flows from phone Controller to earbud Controllers */
(void)len; /* each len above would be passed to the matching GATT write */
}
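Every step in the flow above is confirmed by a notification on the ASE characteristic, so the Initiator needs a way to read the state out of it. The sketch below assumes the ASE characteristic value starts with ASE_ID followed by ASE_State, with state-specific fields after that; the function names are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

#define ASE_STATE_STREAMING 0x04

/*
 * Reads the two leading bytes of an ASE characteristic value.
 * Assumed layout: [0] ASE_ID, [1] ASE_State, [2..] state-specific fields.
 * Returns 0 on success, -1 if the value is too short.
 */
int parse_ase_header(const uint8_t *value, size_t len,
                     uint8_t *ase_id, uint8_t *state)
{
    if (value == NULL || len < 2)
        return -1;
    *ase_id = value[0];
    *state = value[1];
    return 0;
}

/*
 * Returns 1 if the notification reports the expected ASE in Streaming,
 * 0 if it reports something else, -1 if it cannot be parsed.
 */
int ase_is_streaming(const uint8_t *value, size_t len, uint8_t expected_ase_id)
{
    uint8_t ase_id, state;

    if (parse_ase_header(value, len, &ase_id, &state) != 0)
        return -1;
    return (ase_id == expected_ase_id && state == ASE_STATE_STREAMING) ? 1 : 0;
}
```

In the stereo example, the phone would wait for this check to succeed on both earbuds before treating the stream as live.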
Stopping a stream follows a specific path through the state machine. The Initiator sends Disable (0x05), which moves the Sink ASE back to QoS Configured. The CIS is NOT automatically disconnected: you must send HCI_Disconnect if you want to actually remove it.
If you want to completely free the ASE, send Release (0x08). The Acceptor moves to Releasing, then autonomously transitions to either:
- Idle: full reset, next use must go through all steps again
- Codec Configured: caches the codec settings, so next time you can skip Config Codec and go straight to Config QoS (faster reconnect)
When you send Disable to a Source ASE, it moves to Disabling (not directly to QoS Configured). The Acceptor waits for the Initiator to confirm it has stopped receiving by sending Receiver Stop Ready (0x06). Only then does the Source ASE move to QoS Configured. This ensures a clean handoff, so no audio packets get lost in transit.
#define OPCODE_DISABLE 0x05
#define OPCODE_RECV_STOP_READY 0x06
#define OPCODE_RELEASE 0x08
/* Disable a Sink ASE (stops audio, ASE -> QoS Configured) */
int build_disable_cmd(uint8_t ase_id, uint8_t *out_buf)
{
out_buf[0] = OPCODE_DISABLE;
out_buf[1] = 0x01; /* Number of ASEs */
out_buf[2] = ase_id;
return 3;
}
/* Receiver Stop Ready: for Source ASEs in Disabling state */
int build_recv_stop_ready_cmd(uint8_t ase_id, uint8_t *out_buf)
{
out_buf[0] = OPCODE_RECV_STOP_READY;
out_buf[1] = 0x01;
out_buf[2] = ase_id;
return 3;
}
/* Release an ASE completely (ASE -> Releasing -> Idle or Codec Configured) */
int build_release_cmd(uint8_t ase_id, uint8_t *out_buf)
{
out_buf[0] = OPCODE_RELEASE;
out_buf[1] = 0x01;
out_buf[2] = ase_id;
return 3;
}
/*
* Stop a sink stream properly:
* 1. Disable -> ASE goes to QoS Configured
* 2. HCI_Disconnect on the CIS handle (if you want to remove the CIS)
* 3. Release -> ASE goes to Releasing, then the Acceptor moves it to Idle
*    or Codec Configured
*
* If you plan to stream again soon, SKIP Release and stay in QoS Configured.
* Re-enable is just one Enable write instead of the full setup flow.
*/
void stop_sink_stream(uint8_t ase_id, uint16_t cis_handle)
{
uint8_t buf[4];
int len;
/* Step 1: Send Disable */
len = build_disable_cmd(ase_id, buf);
/* gatt_write_cmd(ase_cp_handle, buf, len); */
/* Step 2 (optional): Disconnect the CIS */
/* hci_send_cmd(HCI_DISCONNECT, cis_handle, 0x13); */
(void)cis_handle;
/* Step 3 (optional): Release if done for good */
len = build_release_cmd(ase_id, buf);
/* gatt_write_cmd(ase_cp_handle, buf, len); */
(void)len;
}
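The Sink and Source stop paths differ by one extra write, which can be captured in a small helper. This is a sketch under the rules described above; the function name `next_stop_opcode` and the use of 0x00 as "nothing to write" are choices made for this example.

```c
#include <stdint.h>

/*
 * Returns the next ASE Control Point opcode the Initiator should write to
 * bring an ASE back to QoS Configured, or 0x00 when no write applies.
 * ase_state uses the ASE_State values (0x03 Enabling, 0x04 Streaming,
 * 0x05 Disabling); is_source is nonzero for a Source ASE.
 */
uint8_t next_stop_opcode(uint8_t ase_state, int is_source)
{
    switch (ase_state) {
    case 0x03: /* Enabling */
    case 0x04: /* Streaming */
        return 0x05;                    /* Disable */
    case 0x05: /* Disabling: only Source ASEs wait in this state */
        return is_source ? 0x06 : 0x00; /* Receiver Stop Ready */
    default:
        return 0x00;                    /* already stopped, or not started */
    }
}
```

Calling it after each state notification gives Disable followed by Receiver Stop Ready for a microphone stream, and a single Disable for a playback stream.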
While streaming, you can change the Context Type of an existing stream using Update Metadata (0x07), without tearing down the CIS.
Example: You’re streaming music (Media context). A phone call arrives. Instead of destroying the stream, update the metadata to the Conversational context and enable the Source ASEs that music did not need, so the microphone can be used for the call. The codec config and CIS remain intact; only the use case label changes.
#define OPCODE_UPDATE_META 0x07
/*
* build_update_metadata_cmd - change streaming context without teardown
* Can be sent in Enabling or Streaming state.
*
* @ase_id: ASE to update
* @new_ctx: new Context Type bitfield
* @out_buf: output buffer
*/
int build_update_metadata_cmd(uint8_t ase_id, uint16_t new_ctx,
uint8_t *out_buf)
{
out_buf[0] = OPCODE_UPDATE_META;
out_buf[1] = 0x01; /* Number of ASEs */
out_buf[2] = ase_id;
out_buf[3] = 0x04; /* Metadata length = 4 bytes */
/* LTV: Streaming_Audio_Contexts */
out_buf[4] = 0x03; /* L = 3 */
out_buf[5] = META_STREAMING_AUDIO_CTX; /* T = 0x02 */
out_buf[6] = new_ctx & 0xFF; /* V low */
out_buf[7] = (new_ctx >> 8) & 0xFF; /* V high */
return 8;
}
/* Music -> Phone call transition */
void switch_media_to_call(uint8_t sink_ase_id)
{
uint8_t buf[16];
/* Context Type: Conversational = 0x0002 */
int len = build_update_metadata_cmd(sink_ase_id, 0x0002, buf);
/* gatt_write_cmd(ase_cp_handle, buf, len); */
(void)len;
/* Now enable Source ASEs for microphone: same CIG, no reconfiguration */
}
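Metadata is a sequence of LTV entries, so an update can carry more than the context type. The sketch below adds a generic LTV appender and uses it to build Streaming_Audio_Contexts plus a CCID_List; the helper name `append_ltv` is invented, and the CCID_List type value 0x05 is taken from my reading of the Bluetooth Assigned Numbers.

```c
#include <stdint.h>
#include <string.h>

/*
 * Appends one LTV entry at buf[pos]. The Length byte counts the Type byte
 * plus the value bytes. Returns the new write position, or -1 if the
 * entry does not fit.
 */
int append_ltv(uint8_t *buf, int buf_size, int pos,
               uint8_t type, const uint8_t *value, uint8_t value_len)
{
    if (pos < 0 || value_len > 254 || pos + 2 + value_len > buf_size)
        return -1;
    buf[pos++] = (uint8_t)(value_len + 1); /* L: type + value */
    buf[pos++] = type;                     /* T */
    if (value_len > 0)
        memcpy(&buf[pos], value, value_len); /* V */
    return pos + value_len;
}

/*
 * Builds metadata with Streaming_Audio_Contexts (type 0x02) and a
 * single-entry CCID_List (type 0x05, an assumption of this sketch).
 * Returns the metadata length, or -1 on overflow.
 */
int build_call_metadata(uint16_t audio_ctx, uint8_t ccid,
                        uint8_t *buf, int buf_size)
{
    uint8_t ctx[2] = { (uint8_t)(audio_ctx & 0xFF), (uint8_t)(audio_ctx >> 8) };
    int pos = append_ltv(buf, buf_size, 0, 0x02, ctx, 2);

    if (pos < 0)
        return -1;
    return append_ltv(buf, buf_size, pos, 0x05, &ccid, 1);
}
```

The returned length would go into the Metadata_Length byte of an Enable or Update Metadata write, in place of the fixed value 4 used by the single-LTV builders.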
If the ACL connection drops (earbud goes out of range), all CISes to that Acceptor are immediately disconnected. The Acceptor will try to move its ASEs to QoS Configured state so they’re ready to resume when the ACL link comes back.
On reconnect, the Initiator should read the ASE characteristics to check their state. If the Acceptor timed out and returned to Idle, the Initiator must tear down and re-establish the entire CIG, not just the CIS for the missing earbud.
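The state read back after a reconnect tells the Initiator where to rejoin the setup flow. A minimal sketch of that decision, assuming the ASE_State values listed earlier; the function name and the 0x00 "no setup write needed" convention are inventions of this example.

```c
#include <stdint.h>

/*
 * Maps the ASE state read after reconnecting to the first setup opcode
 * the Initiator still has to write.
 * Returns 0x00 for states where no setup write applies (for example an
 * ASE that is already Enabling or Streaming, or one that is Releasing).
 */
uint8_t resume_opcode_for_state(uint8_t ase_state)
{
    switch (ase_state) {
    case 0x00: return 0x01; /* Idle             -> start with Config Codec */
    case 0x01: return 0x02; /* Codec Configured -> continue at Config QoS */
    case 0x02: return 0x03; /* QoS Configured   -> go straight to Enable  */
    default:   return 0x00;
    }
}
```

An ASE found in QoS Configured needs only Enable, while one that fell back to Idle has to repeat the whole sequence.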
💡 Design tip for missing earbuds: Pre-schedule a CIS slot for the missing device in the CIG. When it comes back online, configure its ASE and enable the pre-scheduled CIS; no CIG teardown is needed. This prevents an audio gap to the other earbud.
Stay in QoS Configured between sessions: you can jump straight to Enable without running Config Codec again. Only Release when you’re done for good.
CAP mandates that all Acceptors in a Coordinated Set (both earbuds) complete each state before moving to the next: Config Codec both, then Config QoS both, then Enable both.
Once you send LE_Create_CIS, you cannot change the CIG parameters. To change codec or QoS, you must Disable, Release, and start the whole flow again.
PACS exposes capabilities. ASCS tracks stream state. BAP drives the procedures. CAP coordinates multi-device setups.
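The CAP ordering rule (each step on every Acceptor before the next step on any) can be sketched as a pair of nested loops. Everything named here is hypothetical: `ase_step_fn` stands in for a real GATT write that also waits for the state notification, and `log_step` is only a recorder used to show the resulting order.

```c
#include <stdint.h>

/* Callback that performs one step on one Acceptor; returns 0 on success.
 * A real implementation would write the opcode and wait for the ASE
 * notification before returning. */
typedef int (*ase_step_fn)(int acceptor_index, uint8_t opcode, void *ctx);

/*
 * Runs Config Codec, Config QoS, and Enable across all Acceptors,
 * finishing each step everywhere before starting the next one.
 * Returns 0 on success, -1 as soon as any step fails.
 */
int cap_run_setup(int acceptor_count, ase_step_fn write_step, void *ctx)
{
    static const uint8_t steps[] = { 0x01, 0x02, 0x03 };
    unsigned s;
    int a;

    for (s = 0; s < sizeof(steps); s++) {
        for (a = 0; a < acceptor_count; a++) {
            if (write_step(a, steps[s], ctx) != 0)
                return -1;
        }
    }
    return 0;
}

/* Recorder used in place of real GATT traffic for this sketch. */
struct step_log {
    uint8_t opcode[8];
    int     acceptor[8];
    int     count;
};

int log_step(int acceptor_index, uint8_t opcode, void *ctx)
{
    struct step_log *log = ctx;

    if (log->count >= 8)
        return -1;
    log->opcode[log->count] = opcode;
    log->acceptor[log->count] = acceptor_index;
    log->count++;
    return 0;
}
```

With two earbuds the recorded order is Config Codec on left and right, then Config QoS on both, then Enable on both, rather than all three steps on one earbud first.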
