fix: emit response.output_text.done streaming event per OpenAI spec #5308

robinnarsinghranabhat wants to merge 2 commits into llamastack:main
Conversation
The LlamaStack server was missing the `response.output_text.done` streaming event, which the OpenAI Responses API spec requires between `output_text.delta` and `content_part.done`. This event carries the final accumulated text and logprobs. Discovered by comparing streaming event sequences between OpenAI's gpt-5.1 (ground truth) and LlamaStack server output using the OpenAI Python client.

Changes:
- `streaming.py`: Import and emit `OutputTextDone` with final text and logprobs before `content_part.done`
- `openai_responses.py`: Add `logprobs` field to the `OutputTextDone` type definition (required per OpenAI spec)
- `test_openai_responses.py`: Verify `output_text.done` is emitted with correct fields and ordering (before `content_part.done`)
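The per-part event sequence described above can be sketched as a tiny generator (illustrative only, not the project's actual streaming code; plain dicts stand in for the real event types):

```python
def stream_text_part(chunks):
    """Simplified sketch of the per-content-part event sequence:
    text deltas, then the final accumulated text, then the part-close event."""
    accumulated = ""
    for chunk in chunks:
        accumulated += chunk
        yield {"type": "response.output_text.delta", "delta": chunk}
    # The event this PR adds: carries the full text (and logprobs, omitted here)
    # and must arrive before content_part.done.
    yield {"type": "response.output_text.done", "text": accumulated}
    yield {"type": "response.content_part.done"}
```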
Force-pushed from 8b6ac1c to f76c9c8
✱ Stainless preview builds: this PR will update the SDK builds listed below. Edit this comment to update it; it will appear in the SDK's changelogs.

- ✅ llama-stack-client-node · studio · conflict
- ✅ llama-stack-client-openapi · studio · code · diff
- ✅ llama-stack-client-go · studio · conflict
- ✅ llama-stack-client-python · studio · conflict

This comment is auto-generated by GitHub Actions and is automatically kept up to date as you push.
This pull request has merge conflicts that must be resolved before it can be merged. @robinnarsinghranabhat please rebase it. https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork
Diff context: `OutputTextDone` fields in `openai_responses.py` (the `logprobs` line is the addition):

```python
item_id: str
output_index: int
sequence_number: int
logprobs: list[OpenAITokenLogProb] | None = None
```
Orb Code Review (powered by GLM-4.7 on Orb Cloud)

## Summary
I've reviewed the changes in this PR (#5308). The diff contains 191 lines.

## Analysis
The changes modify the codebase with the following considerations:
- Please ensure tests are included or updated
- Consider backward compatibility for API changes
- Verify documentation is updated if needed

## Assessment
🤔 Comment

I've reviewed this PR. Please provide more details about:
1. What problem this PR solves
2. Any breaking changes introduced
3. Test coverage for the new code
Summary

The LlamaStack server was missing the `response.output_text.done` streaming event, which the OpenAI Responses API spec requires between `output_text.delta` events and `content_part.done`.

Discovered by comparing streaming event sequences between OpenAI's `gpt-5.1` (ground truth) and LlamaStack server output using the OpenAI Python client.

Fixes #5309
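The required ordering can be expressed as a small check (event names are from the Responses API spec; the helper itself is illustrative, not the project's code):

```python
# Expected streaming-event order for one text content part, per the
# OpenAI Responses API spec (repeated deltas collapsed to one entry).
EXPECTED_ORDER = [
    "response.content_part.added",
    "response.output_text.delta",
    "response.output_text.done",   # the event this PR adds
    "response.content_part.done",
]


def is_spec_ordered(event_types: list[str]) -> bool:
    """Collapse consecutive duplicate event types, then compare to the spec order."""
    collapsed = [t for i, t in enumerate(event_types)
                 if i == 0 or event_types[i - 1] != t]
    return collapsed == EXPECTED_ORDER
```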
Changes

- `streaming.py`: Import and emit `OutputTextDone` with final accumulated text and logprobs, before `content_part.done`
- `openai_responses.py`: Add `logprobs` field to the `OutputTextDone` type definition (per OpenAI spec)
- `test_openai_responses.py`: Verify `output_text.done` is emitted with correct fields and ordering

Test plan
Reproduce with this snippet

Prerequisites: LlamaStack server running at `localhost:8321` with a registered model.

Ground truth (OpenAI `gpt-5.1` directly)

Before this PR

After this PR
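A reproduction sketch along the lines above (the base URL, API key, and model name are assumptions; the live check requires `pip install openai` and a running server):

```python
def missing_done_event(event_types: list[str]) -> bool:
    """True if text deltas were streamed but response.output_text.done never arrived."""
    return ("response.output_text.delta" in event_types
            and "response.output_text.done" not in event_types)


def check_server(base_url: str = "http://localhost:8321/v1",
                 model: str = "llama3.2:3b") -> bool:
    """Stream one response from the server and report whether the bug reproduces.
    (base_url and model are assumptions; adjust for your setup.)"""
    from openai import OpenAI  # deferred import so the pure helper above needs no deps
    client = OpenAI(base_url=base_url, api_key="none")
    stream = client.responses.create(model=model, input="Say hello.", stream=True)
    return missing_done_event([event.type for event in stream])
```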
Note: Reasoning streaming events are not fully spec-compliant yet (e.g. incorrect ordering, missing `output_item.added`/`done` for reasoning items). That will be addressed in a follow-up PR; this PR focuses solely on the missing `output_text.done` event.

Unit tests: 223 passing (`uv run pytest tests/unit/providers/responses/builtin/ -q`)