
fix(responses): fall back to o200k_base encoding for non-OpenAI models #5584

Open

Artemon-line wants to merge 1 commit into llamastack:main from Artemon-line:fix/tiktoken-fallback-non-openai-models

Conversation

Contributor

@Artemon-line commented Apr 17, 2026

Summary

  • The tiktoken encoding_for_model() lookup in _count_tokens() raises ValueError for any model name tiktoken doesn't recognize (e.g. Vertex AI Gemini, Bedrock Claude). This breaks context_management for all non-OpenAI providers.
  • Fall back to o200k_base encoding instead of raising, since the token count is only used as a heuristic for compaction thresholds — an approximate count is acceptable here.
  • Users can still override with tokenizer_encoding in compaction_config for precise control.

Note: Split out from #5219 per review feedback. #5219 (vertexai integration) depends on this fix and is blocked until this merges.

Test plan

  • Unit tests pass: uv run pytest tests/unit/providers/responses/ -x (236 passed)
  • Pre-commit hooks pass on the changed file
  • Integration tests: verify compaction tests pass in replay mode with a non-OpenAI model (e.g. vertexai/publishers/google/models/gemini-2.0-flash)

vertexai tests are green here:
#5219

🤖 Generated with Claude Code

…s in compaction

The tiktoken encoding lookup in _count_tokens() raises a ValueError for any
model name that tiktoken does not recognize (e.g. Vertex AI, Bedrock). This
breaks context_management for all non-OpenAI providers, even though the token
count is only used as a heuristic for compaction thresholds. Fall back to
o200k_base encoding instead of raising, since an approximate count is
acceptable here. Users can still override via tokenizer_encoding in
compaction_config.

Signed-off-by: Artemy Hladenko <ahladenk@redhat.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Artemy <ahladenk@redhat.com>
@nidhishgajjar

Orb Code Review (powered by GLM-4.7 on Orb Cloud)

## Summary
I've reviewed the changes in this PR (PR #5584). The diff contains 22 lines.

## Analysis
The changes modify the codebase with the following considerations:
  • Please ensure tests are included or updated
  • Consider backward compatibility for API changes
  • Verify documentation is updated if needed

## Assessment
🤔 Comment
I've reviewed this PR. Please provide more details about:
1. What problem this PR solves
2. Any breaking changes introduced
3. Test coverage for the new code

1 similar comment

@Artemon-line
Contributor Author

@leseb please take a look when you have a minute:
I have extracted this change here
Fall back to o200k_base

```python
            "or use an OpenAI model name that tiktoken recognizes."
        ) from e
    except KeyError:
        # Fall back to o200k_base for non-OpenAI models (e.g. Vertex AI, Bedrock).
```
Collaborator


I think a logger.debug is still warranted here. Any error we catch and replace with a default behavior smells weird to me, so let's throw a log in here.
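A sketch of what the reviewer is asking for, assuming a module-level logger; `encoding_name_with_log` and the `known` dict (standing in for tiktoken's model registry) are illustrative names, not the actual code:

```python
import logging

logger = logging.getLogger(__name__)  # assumed module-level logger

def encoding_name_with_log(model: str, known: dict[str, str]) -> str:
    """Fallback that makes the silent default visible via logger.debug;
    `known` stands in for tiktoken's model -> encoding table."""
    try:
        return known[model]
    except KeyError:
        # Log before defaulting, per the review comment.
        logger.debug(
            "tiktoken does not recognize model %r; falling back to o200k_base", model
        )
        return "o200k_base"
```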


Labels

CLA Signed This label is managed by the Meta Open Source bot.

3 participants