fix: encode voice memo attachments as Opus instead of LEI16 PCM #793

Open
cypres0099 wants to merge 1 commit into BlueBubblesApp:development from cypres0099:ben/voice-memo-opus-codec

@cypres0099

The problem

When a client sends an attachment with `isAudioMessage=true`, the
server calls `FileSystem.convertMp3ToCaf` to transcode the input
into a CAF file before handing it off. The CAF arrives at iMessage
and is displayed as a regular file attachment with an `.mp3.mp3` or
`.caf` filename, rather than an iMessage voice bubble with a waveform
and a play button.

Root cause

`FileSystem.convertMp3ToCaf` currently runs:

```sh
/usr/bin/afconvert -f caff -d LEI16@44100 -c 1 input output
```

`LEI16@44100` is 16-bit little-endian PCM at 44.1 kHz. The resulting
CAF is valid and playable as audio, but iMessage's voice-memo
renderer specifically requires the CAF to contain Opus. When it
sees PCM inside CAF, it falls back to rendering the attachment as a
generic file, which is what users see.

This matches the reproducer and analysis in
openclaw/openclaw#1526, which identified the same failure on
macOS 15 -> iOS 26.

The fix

Change the afconvert codec flag from `LEI16@44100` to `opus@24000`.

```diff
-        `/usr/bin/afconvert -f caff -d LEI16@44100 -c 1 "${oldPath}" "${outputPath}"`
+        `/usr/bin/afconvert -f caff -d opus@24000 -c 1 "${oldPath}" "${outputPath}"`
```

Opus at 24 kHz mono is the format iMessage uses natively for voice
memos. `afconvert` ships with every macOS version this project
supports, so there is no new dependency.
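As an illustration of the flag change, here is a minimal sketch of how the same afconvert invocation could be assembled as an argument array instead of an interpolated shell string. `buildAfconvertArgs` is a hypothetical helper, not the actual body of `FileSystem.convertMp3ToCaf`; only the flags (`-f caff -d opus@24000 -c 1`) are taken from this PR.

```typescript
// Hypothetical helper (not part of BlueBubbles): builds the afconvert
// argument list for the Opus-in-CAF conversion described in this PR.
function buildAfconvertArgs(inputPath: string, outputPath: string): string[] {
    return [
        "-f", "caff",        // output container: Core Audio Format
        "-d", "opus@24000",  // data format: Opus at 24 kHz (was LEI16@44100)
        "-c", "1",           // one channel (mono)
        inputPath,
        outputPath,
    ];
}

// Paths containing spaces stay intact because each one is its own array element.
const exampleArgs = buildAfconvertArgs("/tmp/voice memo.mp3", "/tmp/voice memo.caf");
console.log(["/usr/bin/afconvert", ...exampleArgs]);
```

An array like this could be handed to Node's `child_process.execFile`, which avoids the manual quoting of `oldPath` and `outputPath` that the template string needs. That would be a separate change and is outside the scope of this PR.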

Both send paths that call this helper benefit from the fix:

  • `packages/server/src/server/api/apple/actions.ts` (AppleScript)
  • `packages/server/src/server/api/interfaces/messageInterface.ts`
    (Private API)

Verification

Tested on macOS 26.3, Apple Silicon:

```
$ afconvert -f caff -d opus@24000 -c 1 test.mp3 test.caf
$ afinfo test.caf
File: test.caf
File type ID: caff
Data format: 1 ch, 24000 Hz, opus (0x00000000) 0 bits/channel
Channel layout: Mono
```
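The manual `afinfo` check above could also be automated. The sketch below parses the `Data format:` line and tests for mono Opus at 24 kHz. `parseAfinfoDataFormat` and `isVoiceMemoFormat` are hypothetical names, and the regular expression assumes the line layout shown in the output above; other codecs or macOS versions may print the line differently.

```typescript
// Hypothetical helpers (not part of BlueBubbles) for checking afinfo output.
interface DataFormat {
    channels: number;
    sampleRate: number;
    codec: string;
}

// Extracts channel count, sample rate, and codec name from the
// "Data format:" line, or returns null if no such line is found.
function parseAfinfoDataFormat(afinfoOutput: string): DataFormat | null {
    const match = /^Data format:\s+(\d+) ch,\s+(\d+) Hz,\s+(\S+)/m.exec(afinfoOutput);
    if (match === null) {
        return null;
    }
    return {
        channels: Number(match[1]),
        sampleRate: Number(match[2]),
        codec: match[3] ?? "",
    };
}

// True only for the format this PR targets: mono Opus at 24 kHz.
function isVoiceMemoFormat(format: DataFormat | null): boolean {
    return format !== null
        && format.channels === 1
        && format.sampleRate === 24000
        && format.codec === "opus";
}

const sampleOutput = [
    "File: test.caf",
    "File type ID: caff",
    "Data format: 1 ch, 24000 Hz, opus (0x00000000) 0 bits/channel",
    "Channel layout: Mono",
].join("\n");

console.log(isVoiceMemoFormat(parseAfinfoDataFormat(sampleOutput)));
```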

The resulting CAF is rendered as an iMessage voice bubble (waveform
+ play button) on receiving devices when delivered via an
`isAudioMessage=true` send that reaches the voice-memo code path in
iMessage (which on BlueBubbles requires the Private API send path).

Note: this fix is necessary but not by itself sufficient to get
voice bubbles on a BlueBubbles install. The Private API send path
is also required, since the AppleScript path has no way to mark an
attachment as a voice memo. This PR only addresses the codec bug
that silently breaks the Private API path's voice-memo output.

FileSystem.convertMp3ToCaf was producing CAF files containing 16-bit
little-endian PCM. iMessage voice bubbles require Opus audio inside
a CAF container -- PCM-in-CAF is a valid CAF but iMessage cannot
render it as a voice memo, so the attachment arrives as a generic
file instead of a waveform bubble with a play button.

Switching the afconvert codec flag from LEI16@44100 to opus@24000
produces a CAF that iMessage recognizes as a voice memo. No new
dependencies -- opus@24000 is supported natively by afconvert on
every macOS version BlueBubbles-Server runs on.

The function is still called from both the AppleScript send path
(packages/server/src/server/api/apple/actions.ts) and the Private
API send path (packages/server/src/server/api/interfaces/
messageInterface.ts), so both paths benefit when they trigger the
MP3 -> CAF conversion for isAudioMessage=true attachments.

Verified on macOS 26.3 Apple Silicon:
  afinfo confirms Data format: 1 ch, 24000 Hz, opus

Related to openclaw/openclaw#1526.

cypres0099 added a commit to cypres0099/hermes-agent that referenced this pull request on Apr 17, 2026

iMessage's voice-memo renderer is strict about the attachment format:
it specifically requires Opus audio inside a CAF container. Any other
codec (MP3, AAC, PCM -- including PCM wrapped inside CAF) arrives as
a generic file attachment even when the voice-memo flag is set
correctly on the send.

BlueBubbles-server has its own MP3 -> CAF converter that fires when
an .mp3 is sent with isAudioMessage=true, but it currently produces
16-bit PCM in CAF (see BlueBubblesApp/bluebubbles-server#793), which
hits the same iMessage fallback. So even with a patched upstream BB,
callers that ship any other audio format (AAC, WAV, Opus-in-OGG)
still end up with file attachments.

This pre-converts anything non-.caf to Opus-in-CAF via afconvert, the
native macOS tool. afconvert ships on every macOS version the
BlueBubbles server runs on, so there is no new dependency. If
afconvert is missing or the conversion fails, the original path is
sent unchanged -- matches today's behavior, no regression.

This mirrors the Opus-in-OGG transcode already done for Telegram
voice messages in tools/tts_tool.py, just targeting the CAF container
instead of OGG.
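
The pre-convert-with-fallback behavior described in this commit message can be sketched roughly as follows. This is not the hermes-agent code, and it is written in TypeScript only to match the server-side code in this PR. `prepareVoiceMemoPath` and the `Converter` type are hypothetical, and the converter is injected so that the fallback logic can be exercised without afconvert installed.

```typescript
// Hypothetical converter signature: should throw if the tool is missing
// or the conversion fails (for example, a wrapper around afconvert).
type Converter = (inputPath: string, outputPath: string) => void;

// Returns the path that should be sent as the voice memo attachment.
function prepareVoiceMemoPath(inputPath: string, convert: Converter): string {
    // Files that are already .caf are passed through untouched.
    if (inputPath.toLowerCase().endsWith(".caf")) {
        return inputPath;
    }
    // Swap the final extension for .caf, or append .caf if there is none.
    const outputPath = inputPath.replace(/\.[^./]+$/, "") + ".caf";
    try {
        convert(inputPath, outputPath);
        return outputPath;
    } catch {
        // Conversion unavailable or failed: send the original file unchanged.
        return inputPath;
    }
}

// Usage with a stand-in converter that always fails.
const failing: Converter = () => {
    throw new Error("afconvert not found");
};
console.log(prepareVoiceMemoPath("/tmp/reply.ogg", failing));
```

Falling back to the original path means a failed conversion degrades to a plain file attachment instead of a failed send.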