fix(responses): fall back to o200k_base encoding for non-OpenAI models #5584

Artemon-line wants to merge 1 commit into llamastack:main
Conversation
…s in compaction

The tiktoken encoding lookup in _count_tokens() raises a ValueError for any model name that tiktoken does not recognize (e.g. Vertex AI, Bedrock). This breaks context_management for all non-OpenAI providers, even though the token count is only used as a heuristic for compaction thresholds. Fall back to o200k_base encoding instead of raising, since an approximate count is acceptable here. Users can still override via tokenizer_encoding in compaction_config.

Signed-off-by: Artemy Hladenko <ahladenk@redhat.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Artemy <ahladenk@redhat.com>
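For illustration, the override mentioned above might look like the following sketch. Only the tokenizer_encoding field name comes from the PR description; the surrounding config shape and values are assumptions, not the actual schema.

```python
# Hypothetical sketch: forcing a specific tiktoken encoding via
# compaction_config instead of relying on the o200k_base fallback.
# The dict shape here is illustrative, not the real config schema.
compaction_config = {
    "tokenizer_encoding": "cl100k_base",  # assumed field from the PR description
}
```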
Orb Code Review (powered by GLM-4.7 on Orb Cloud)

## Summary
I've reviewed the changes in this PR (PR #5584). The diff contains 22 lines.

## Analysis
The changes modify the codebase with the following considerations:
- Please ensure tests are included or updated
- Consider backward compatibility for API changes
- Verify documentation is updated if needed

## Assessment
🤔 Comment

I've reviewed this PR. Please provide more details about:
1. What problem this PR solves
2. Any breaking changes introduced
3. Test coverage for the new code
@leseb please take a look when you have a minute:
            "or use an OpenAI model name that tiktoken recognizes."
        ) from e
    except KeyError:
        # Fall back to o200k_base for non-OpenAI models (e.g. Vertex AI, Bedrock).
I think a logger.debug is still warranted here. Any error we catch and replace with a default behavior smells weird to me, so let's throw a log in here.
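The suggested change could look like the sketch below. To keep it self-contained, it uses a plain dict as a stand-in for tiktoken's model-to-encoding registry; the real code calls tiktoken.encoding_for_model(), which likewise raises KeyError for unrecognized model names. The function name resolve_encoding and the registry contents are illustrative, not taken from the PR.

```python
import logging

logger = logging.getLogger(__name__)

# Stand-in for tiktoken's model-to-encoding registry; the real code calls
# tiktoken.encoding_for_model(), which raises KeyError for unknown models.
_MODEL_ENCODINGS = {"gpt-4o": "o200k_base", "gpt-4": "cl100k_base"}


def resolve_encoding(model: str) -> str:
    """Return the encoding name for a model, falling back to o200k_base."""
    try:
        return _MODEL_ENCODINGS[model]
    except KeyError:
        # Per the review suggestion: log the fallback instead of handling
        # it silently, since the count is only a compaction heuristic.
        logger.debug(
            "No tiktoken encoding for model %s; falling back to o200k_base",
            model,
        )
        return "o200k_base"
```

A debug-level log keeps the fallback quiet in normal operation while leaving a trail for anyone wondering why token counts for a non-OpenAI model are approximate.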
Summary
- The encoding_for_model() lookup in _count_tokens() raises ValueError for any model name tiktoken doesn't recognize (e.g. Vertex AI Gemini, Bedrock Claude). This breaks context_management for all non-OpenAI providers.
- Fall back to o200k_base encoding instead of raising, since the token count is only used as a heuristic for compaction thresholds; an approximate count is acceptable here.
- Users can still override via tokenizer_encoding in compaction_config for precise control.

Test plan
- uv run pytest tests/unit/providers/responses/ -x (236 passed)
- Verified with a Vertex AI model name (vertexai/publishers/google/models/gemini-2.0-flash)

vertexai tests are green here:
#5219
🤖 Generated with Claude Code