fix(responses): fall back to o200k_base encoding for non-OpenAI models #5584

Artemon-line wants to merge 1 commit into llamastack:main
Conversation
…s in compaction

The tiktoken encoding lookup in _count_tokens() raises a ValueError for any model name that tiktoken does not recognize (e.g. Vertex AI, Bedrock). This breaks context_management for all non-OpenAI providers, even though the token count is only used as a heuristic for compaction thresholds. Fall back to o200k_base encoding instead of raising, since an approximate count is acceptable here. Users can still override via tokenizer_encoding in compaction_config.

Signed-off-by: Artemy Hladenko <ahladenk@redhat.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Artemy <ahladenk@redhat.com>
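For illustration, the override mentioned above might look like the following sketch. Only the tokenizer_encoding field name comes from the PR description; the surrounding config shape and values are assumptions, not the actual schema.

```python
# Hypothetical sketch: forcing a specific tiktoken encoding via
# compaction_config instead of relying on the o200k_base fallback.
# The dict shape here is illustrative, not the real config schema.
compaction_config = {
    "tokenizer_encoding": "cl100k_base",  # assumed field from the PR description
}
```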
Orb Code Review (powered by GLM-4.7 on Orb Cloud)

## Summary
I've reviewed the changes in this PR (PR #5584). The diff contains 22 lines.

## Analysis
The changes modify the codebase with the following considerations:
- Please ensure tests are included or updated
- Consider backward compatibility for API changes
- Verify documentation is updated if needed

## Assessment
🤔 Comment

I've reviewed this PR. Please provide more details about:
1. What problem this PR solves
2. Any breaking changes introduced
3. Test coverage for the new code
@leseb please take a look when you have a minute:
            "or use an OpenAI model name that tiktoken recognizes."
        ) from e
    except KeyError:
        # Fall back to o200k_base for non-OpenAI models (e.g. Vertex AI, Bedrock).
I think a logger.debug is still warranted here. Any error we catch and replace with a default behavior smells weird to me, so let's throw a log in here.
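The suggested change could look like the sketch below. To keep it self-contained, it uses a plain dict as a stand-in for tiktoken's model-to-encoding registry; the real code calls tiktoken.encoding_for_model(), which likewise raises KeyError for unrecognized model names. The function name resolve_encoding and the registry contents are illustrative, not taken from the PR.

```python
import logging

logger = logging.getLogger(__name__)

# Stand-in for tiktoken's model-to-encoding registry; the real code calls
# tiktoken.encoding_for_model(), which raises KeyError for unknown models.
_MODEL_ENCODINGS = {"gpt-4o": "o200k_base", "gpt-4": "cl100k_base"}


def resolve_encoding(model: str) -> str:
    """Return the encoding name for a model, falling back to o200k_base."""
    try:
        return _MODEL_ENCODINGS[model]
    except KeyError:
        # Per the review suggestion: log the fallback instead of handling
        # it silently, since the count is only a compaction heuristic.
        logger.debug(
            "No tiktoken encoding for model %s; falling back to o200k_base",
            model,
        )
        return "o200k_base"
```

A debug-level log keeps the fallback quiet in normal operation while leaving a trail for anyone wondering why token counts for a non-OpenAI model are approximate.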
Summary
- The encoding_for_model() lookup in _count_tokens() raises ValueError for any model name tiktoken doesn't recognize (e.g. Vertex AI Gemini, Bedrock Claude). This breaks context_management for all non-OpenAI providers.
- Fall back to o200k_base encoding instead of raising, since the token count is only used as a heuristic for compaction thresholds; an approximate count is acceptable here.
- Users can still override via tokenizer_encoding in compaction_config for precise control.

Test plan
- uv run pytest tests/unit/providers/responses/ -x (236 passed)
- Verified with a Vertex AI model name (vertexai/publishers/google/models/gemini-2.0-flash)

vertexai tests are green here:
#5219
🤖 Generated with Claude Code