fix: emit response.output_text.done streaming event per OpenAI spec #5308

robinnarsinghranabhat wants to merge 2 commits into llamastack:main
Conversation
The LlamaStack server was missing the `response.output_text.done` streaming event, which the OpenAI Responses API spec requires between `output_text.delta` and `content_part.done`. This event carries the final accumulated text and logprobs. Discovered by comparing streaming event sequences between OpenAI's gpt-5.1 (ground truth) and LlamaStack server output using the OpenAI Python client.

Changes:
- `streaming.py`: Import and emit `OutputTextDone` with final text and logprobs before `content_part.done`
- `openai_responses.py`: Add `logprobs` field to the `OutputTextDone` type definition (required per OpenAI spec)
- `test_openai_responses.py`: Verify `output_text.done` is emitted with correct fields and ordering (before `content_part.done`)
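The per-part event sequence described above can be sketched as a tiny generator (illustrative only, not the project's actual streaming code; plain dicts stand in for the real event types):

```python
def stream_text_part(chunks):
    """Simplified sketch of the per-content-part event sequence:
    text deltas, then the final accumulated text, then the part-close event."""
    accumulated = ""
    for chunk in chunks:
        accumulated += chunk
        yield {"type": "response.output_text.delta", "delta": chunk}
    # The event this PR adds: carries the full text (and logprobs, omitted here)
    # and must arrive before content_part.done.
    yield {"type": "response.output_text.done", "text": accumulated}
    yield {"type": "response.content_part.done"}
```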
Force-pushed from 8b6ac1c to f76c9c8
✱ Stainless preview builds: this PR will update the SDK builds listed below. Edit this comment to update it; it will appear in the SDK's changelogs.

- ✅ llama-stack-client-node · studio · conflict
- ✅ llama-stack-client-openapi · studio · code · diff
- ✅ llama-stack-client-go · studio · conflict
- ✅ llama-stack-client-python · studio · conflict

This comment is auto-generated by GitHub Actions and is automatically kept up to date as you push.
This pull request has merge conflicts that must be resolved before it can be merged. @robinnarsinghranabhat please rebase it. https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork
Diff context: `OutputTextDone` fields in `openai_responses.py` (the `logprobs` line is the addition):

```python
item_id: str
output_index: int
sequence_number: int
logprobs: list[OpenAITokenLogProb] | None = None
```
Orb Code Review (powered by GLM-4.7 on Orb Cloud)

## Summary
I've reviewed the changes in this PR (#5308). The diff contains 191 lines.

## Analysis
The changes modify the codebase with the following considerations:
- Please ensure tests are included or updated
- Consider backward compatibility for API changes
- Verify documentation is updated if needed

## Assessment
🤔 Comment

I've reviewed this PR. Please provide more details about:
1. What problem this PR solves
2. Any breaking changes introduced
3. Test coverage for the new code
Summary

The LlamaStack server was missing the `response.output_text.done` streaming event, which the OpenAI Responses API spec requires between `output_text.delta` events and `content_part.done`.

Discovered by comparing streaming event sequences between OpenAI's `gpt-5.1` (ground truth) and LlamaStack server output using the OpenAI Python client.

Fixes #5309
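The required ordering can be expressed as a small check (event names are from the Responses API spec; the helper itself is illustrative, not the project's code):

```python
# Expected streaming-event order for one text content part, per the
# OpenAI Responses API spec (repeated deltas collapsed to one entry).
EXPECTED_ORDER = [
    "response.content_part.added",
    "response.output_text.delta",
    "response.output_text.done",   # the event this PR adds
    "response.content_part.done",
]


def is_spec_ordered(event_types: list[str]) -> bool:
    """Collapse consecutive duplicate event types, then compare to the spec order."""
    collapsed = [t for i, t in enumerate(event_types)
                 if i == 0 or event_types[i - 1] != t]
    return collapsed == EXPECTED_ORDER
```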
Changes

- `streaming.py`: Import and emit `OutputTextDone` with final accumulated text and logprobs, before `content_part.done`
- `openai_responses.py`: Add `logprobs` field to the `OutputTextDone` type definition (per OpenAI spec)
- `test_openai_responses.py`: Verify `output_text.done` is emitted with correct fields and ordering

Test plan
Reproduce with this snippet

Prerequisites: LlamaStack server running at `localhost:8321` with a registered model.

Ground truth (OpenAI `gpt-5.1` directly)

Before this PR

After this PR
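A reproduction sketch along the lines above (the base URL, API key, and model name are assumptions; the live check requires `pip install openai` and a running server):

```python
def missing_done_event(event_types: list[str]) -> bool:
    """True if text deltas were streamed but response.output_text.done never arrived."""
    return ("response.output_text.delta" in event_types
            and "response.output_text.done" not in event_types)


def check_server(base_url: str = "http://localhost:8321/v1",
                 model: str = "llama3.2:3b") -> bool:
    """Stream one response from the server and report whether the bug reproduces.
    (base_url and model are assumptions; adjust for your setup.)"""
    from openai import OpenAI  # deferred import so the pure helper above needs no deps
    client = OpenAI(base_url=base_url, api_key="none")
    stream = client.responses.create(model=model, input="Say hello.", stream=True)
    return missing_done_event([event.type for event in stream])
```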
Note: Reasoning streaming events are not fully spec-compliant yet (e.g. incorrect ordering, missing `output_item.added`/`done` for reasoning items). That will be addressed in a follow-up PR; this PR focuses solely on the missing `output_text.done` event.

Unit tests: 223 passing (`uv run pytest tests/unit/providers/responses/builtin/ -q`)