Please read this first
- Have you read the docs? Agents SDK docs
- Have you searched for related issues? Others may have faced similar issues.
Describe the bug
During a longer response (e.g. a legal disclaimer with 10+ seconds of playback), session.interrupt() (which in turn calls transport.interrupt()) does not appear to interrupt Twilio playback.
TwilioRealtimeTransport appears to implement only _interrupt. However, the superclass interrupt returns silently once all the audio has been submitted, so _interrupt is never called.
The relevant variables seem to be reset here, in code that runs on response.output_audio.done.
However, Twilio uses marks to signal when playback has finished. If generation plus submission to Twilio takes 1s but the response is 10s long, there is only a 1s window in which an interrupt actually clears the response stream.
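To make the timing problem concrete, here is a minimal model of the behavior described above. It is a sketch under assumptions, not the SDK's code: BaseTransportModel, TwilioTransportModel, currentItemId, onAudioDelta and onOutputAudioDone are hypothetical names standing in for whatever state the real base class tracks.

```typescript
// Simplified model of the reported behavior. Every name here is a
// hypothetical stand-in, not the SDK's actual internals.
abstract class BaseTransportModel {
  // Set while audio deltas arrive; cleared when generation finishes.
  protected currentItemId: string | undefined;

  onAudioDelta(itemId: string): void {
    this.currentItemId = itemId;
  }

  // Models the reset that runs on response.output_audio.done.
  onOutputAudioDone(): void {
    this.currentItemId = undefined;
  }

  interrupt(): void {
    if (this.currentItemId === undefined) {
      return; // nothing tracked any more, so _interrupt is skipped
    }
    this._interrupt();
    this.currentItemId = undefined;
  }

  protected abstract _interrupt(): void;
}

class TwilioTransportModel extends BaseTransportModel {
  clearsSent = 0;

  protected _interrupt(): void {
    this.clearsSent += 1; // stands in for sending Twilio's `clear` event
  }
}

// Interrupt while audio is still being generated: the clear goes out.
const duringGeneration = new TwilioTransportModel();
duringGeneration.onAudioDelta('item_1');
duringGeneration.interrupt();
console.log(duringGeneration.clearsSent); // 1

// Interrupt after generation finished but while Twilio is still playing:
// the guard returns early and no clear is sent.
const duringPlayback = new TwilioTransportModel();
duringPlayback.onAudioDelta('item_1');
duringPlayback.onOutputAudioDone();
duringPlayback.interrupt();
console.log(duringPlayback.clearsSent); // 0
```

The second case is the one hit in practice, because generation usually finishes long before Twilio has played the buffered audio.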
Debug information
- Agents SDK version: 0.8.3
- Runtime environment: Node.js 22.16.0
Repro steps
Give the agent an instruction like:
Read the following disclaimer:
All the information on this website is published in good faith and for general information purpose only. Website Name does not make any warranties about the completeness, reliability and accuracy of this information. Any action you take upon the information you find on this website (Website.com), is strictly at your own risk. will not be liable for any losses and/or damages in connection with the use of our website.
Create your Twilio sessions, and then register an event listener like so:
// Interrupt the session as soon as the caller starts speaking.
this.session.on('transport_event', async (transportEvent) => {
  if (transportEvent.type === 'input_audio_buffer.speech_started') {
    try {
      this.session.interrupt();
    } catch (interruptErr) {
      console.warn('interrupt() failed (race condition):', interruptErr?.message || interruptErr);
    }
  }
});
Then, after connecting, start with:
this.session.transport.sendEvent({ type: 'response.create' });
so the agent will talk first.
Let the agent talk for a bit, then interrupt it. Notice that the agent keeps talking unless the interruption happens during generation or while audio is still being transferred to Twilio.
Now, add this code after (or before) the interrupt code above:
if (this.twilioWebSocket) {
  this.twilioWebSocket.send(JSON.stringify({ event: 'clear', streamSid: this.payload?.streamSid }));
}
Notice the playback now gets interrupted.
Expected behavior
Sending an interrupt to the Twilio Realtime Session should interrupt audio playback.
A likely fix is to add a top-level interrupt method to TwilioRealtimeTransport that sends the clear, like so:
interrupt(cancelOngoingResponse: boolean = true) {
  // ALWAYS clear the Twilio buffer immediately when interrupted,
  // even if OpenAI has already finished generating the response.
  this.#twilioWebSocket.send(
    JSON.stringify({
      event: 'clear',
      streamSid: this.#streamSid,
    }),
  );
  super.interrupt(cancelOngoingResponse);
}
and removing the clear from _interrupt (though I think there is no harm in clearing twice).
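As a rough check of the proposed override, the sketch below runs it against a stub socket in the state where generation has already finished. It assumes a base class whose interrupt returns early in that state. GuardedBase, PatchedTwilioModel, SocketLike and the stream SID value are hypothetical names used only for illustration; the real transport may track state differently.

```typescript
// Minimal socket shape needed by the sketch (hypothetical interface).
interface SocketLike {
  send(data: string): void;
}

// Hypothetical stand-in for a base transport with an early-return guard.
class GuardedBase {
  generating = false;
  forwarded = 0;

  interrupt(cancelOngoingResponse: boolean = true): void {
    if (!this.generating) {
      return; // models the silent abort once all audio has been submitted
    }
    this._interrupt(cancelOngoingResponse);
  }

  protected _interrupt(_cancelOngoingResponse: boolean): void {
    this.forwarded += 1;
  }
}

class PatchedTwilioModel extends GuardedBase {
  private socket: SocketLike;
  private streamSid: string;

  constructor(socket: SocketLike, streamSid: string) {
    super();
    this.socket = socket;
    this.streamSid = streamSid;
  }

  // Send Twilio's `clear` first, then defer to the guarded base logic.
  interrupt(cancelOngoingResponse: boolean = true): void {
    this.socket.send(
      JSON.stringify({ event: 'clear', streamSid: this.streamSid }),
    );
    super.interrupt(cancelOngoingResponse);
  }
}

// Stub socket that records every outgoing message.
const sent: string[] = [];
const stubSocket: SocketLike = {
  send: (data) => {
    sent.push(data);
  },
};

const patched = new PatchedTwilioModel(stubSocket, 'example-stream-sid');
patched.generating = false; // generation done, Twilio still playing audio
patched.interrupt();

console.log(sent.length); // 1
console.log(JSON.parse(sent[0]).event); // clear
console.log(patched.forwarded); // 0
```

In this model the clear is sent even though the base guard still skips _interrupt, which matches the manual workaround in the repro steps.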