
Feat: Add Anthropic CUA adaptive thinking #1912

Closed
chromiebot wants to merge 4 commits into browserbase:main from chromiebot:chromie/fix-for-your-fix-cua-adaptive-thin

Conversation

@chromiebot
Contributor

@chromiebot chromiebot commented Mar 30, 2026

why

what changed

test plan


Summary by cubic

Add adaptive thinking for Anthropic Claude 4.6 models in the CUA client. Uses thinkingEffort with output_config.effort, sets temperature=1, and keeps thinkingBudget for older models.

  • New Features

    • Detects 4.6 models (claude-opus-4-6, claude-sonnet-4-6) and uses thinking: { type: "adaptive" } with output_config.effort.
    • Defaults to "medium" effort when not set and forces temperature: 1 for adaptive thinking.
    • Adds ThinkingEffort and thinkingEffort in ClientOptions ("low" | "medium" | "high" | "max"); falls back to thinking: { type: "enabled", budget_tokens } with thinkingBudget on older models.
    • Uses computer_20251124 for 4.6 and claude-opus-4-5-20251101; tests cover effort levels, defaults and temperature, provider-prefixed names, and legacy budget behavior.
  • Migration

    • For 4.6 models, set thinkingEffort; thinkingBudget is deprecated. Adaptive thinking sets temperature: 1 automatically.
    • Keep using thinkingBudget on older Claude models. No changes needed if you don’t use thinking.
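The branching described in the summary can be sketched roughly as follows. This is a minimal illustration and not the PR's actual diff: `buildThinkingParams` and `ThinkingOptionsSketch` are hypothetical names, and the sketch assumes the caller has already decided whether the model is a 4.6 model. Only `ThinkingEffort`, `thinkingEffort`, `thinkingBudget`, and the request field names come from the summary above.

```typescript
// Effort levels listed in the PR summary.
type ThinkingEffort = "low" | "medium" | "high" | "max";

// Hypothetical subset of ClientOptions, limited to the two thinking options.
interface ThinkingOptionsSketch {
  thinkingEffort?: ThinkingEffort;
  thinkingBudget?: number;
}

// Hypothetical shape of the request fields this sketch produces.
interface ThinkingParamsSketch {
  thinking?: { type: "adaptive" } | { type: "enabled"; budget_tokens: number };
  output_config?: { effort: ThinkingEffort };
  temperature?: number;
}

function buildThinkingParams(
  isAdaptiveModel: boolean,
  opts: ThinkingOptionsSketch,
): ThinkingParamsSketch {
  if (isAdaptiveModel) {
    // 4.6 models: adaptive thinking, effort defaults to "medium", temperature forced to 1.
    // Assumption: thinkingBudget is ignored on this path.
    return {
      thinking: { type: "adaptive" },
      output_config: { effort: opts.thinkingEffort ?? "medium" },
      temperature: 1,
    };
  }
  if (opts.thinkingBudget !== undefined) {
    // Older models: keep the budget-based configuration.
    return { thinking: { type: "enabled", budget_tokens: opts.thinkingBudget } };
  }
  // No thinking requested on an older model: add nothing to the request.
  return {};
}
```

Under these assumptions, `buildThinkingParams(true, {})` yields the adaptive configuration with `"medium"` effort, while `buildThinkingParams(false, { thinkingBudget: 1024 })` yields the legacy `budget_tokens` form.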

Written for commit 0ea0332. Summary will update on new commits.

Chromie Bot and others added 2 commits March 30, 2026 06:20
Add comprehensive tests for the new adaptive thinking API used by
Claude 4.6 models (claude-opus-4-6, claude-sonnet-4-6).

Tests verify:
- Adaptive thinking uses thinking.type: 'adaptive' (not 'enabled')
- Effort levels are passed via output_config.effort (not budget_tokens)
- All effort levels: low, medium, high, max
- Older models continue using deprecated budget_tokens API
- Model name detection handles provider-prefixed names

These tests define the expected API contract per:
https://platform.claude.com/docs/en/build-with-claude/adaptive-thinking

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Update AnthropicCUAClient to use the correct API contract for
Claude 4.6 models (claude-opus-4-6, claude-sonnet-4-6).

Claude 4.6 models use adaptive thinking:
- thinking: { type: "adaptive" }
- output_config: { effort: "low" | "medium" | "high" | "max" }

This replaces the deprecated API:
- thinking: { type: "enabled", budget_tokens: N }

Changes:
- Add ThinkingEffort type for effort levels
- Add thinkingEffort option to ClientOptions
- Detect 4.6 models and use adaptive thinking with output_config
- Keep backward compatibility with thinkingBudget for older models
- Add deprecation notice for thinkingBudget on 4.6 models

The implementation follows the API contract documented at:
https://platform.claude.com/docs/en/build-with-claude/adaptive-thinking

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
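
The commit messages above mention detecting 4.6 models, including provider-prefixed names. One way such detection might look is sketched below. The helper names (`stripProviderPrefix`, `supportsAdaptiveThinking`) are hypothetical, and treating everything before the last `/` as the provider prefix is an assumption; the PR's real matching logic may differ.

```typescript
// Model IDs named in the commit messages as using adaptive thinking.
const ADAPTIVE_THINKING_MODELS = ["claude-opus-4-6", "claude-sonnet-4-6"] as const;

// Assumption: a provider prefix is everything up to the last "/",
// e.g. "anthropic/claude-opus-4-6" -> "claude-opus-4-6".
function stripProviderPrefix(modelName: string): string {
  const slash = modelName.lastIndexOf("/");
  return slash === -1 ? modelName : modelName.slice(slash + 1);
}

// True for the listed IDs, with or without a provider prefix.
// A dash-separated suffix after the ID is also accepted (an assumption).
function supportsAdaptiveThinking(modelName: string): boolean {
  const bare = stripProviderPrefix(modelName);
  return ADAPTIVE_THINKING_MODELS.some(
    (id) => bare === id || bare.startsWith(`${id}-`),
  );
}
```

Matching on an exact ID or an ID followed by `-` avoids treating an unrelated name that merely begins with the same characters as a 4.6 model.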
@changeset-bot

changeset-bot bot commented Mar 30, 2026

⚠️ No Changeset found

Latest commit: 0ea0332

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types


@github-actions
Contributor

This PR is from an external contributor and must be approved by a stagehand team member with write access before CI can run.
Approving the latest commit mirrors it into an internal PR owned by the approver.
If new commits are pushed later, the internal PR stays open but is marked stale until someone approves the latest external commit and refreshes it.

@github-actions github-actions bot added the labels external-contributor (Tracks PRs mirrored from external contributor forks.) and external-contributor:awaiting-approval (Waiting for a stagehand team member to approve the latest external commit.) Mar 30, 2026
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment


No issues found across 3 files

Confidence score: 5/5

  • Automated review surfaced no issues in the provided summaries.
  • No files require special attention.
Architecture diagram
sequenceDiagram
    participant App as Application/User
    participant Client as AnthropicCUAClient
    participant SDK as Anthropic SDK (Beta)
    participant API as Anthropic API

    Note over App,API: Request Flow for Claude 4.6+ (Adaptive Thinking)

    App->>Client: getAction(messages, options)
    Note right of Client: Includes NEW: thinkingEffort (low/medium/high/max)

    Client->>Client: Identify Model Version
    
    alt Model is Claude 4.6+ (Opus/Sonnet)
        Client->>Client: NEW: Set thinking.type = "adaptive"
        Client->>Client: NEW: Set output_config.effort = thinkingEffort
        Client->>Client: NEW: Select computer_20251124 tool
    else Model is Legacy (Claude 4.5 or older)
        Client->>Client: CHANGED: Set thinking.type = "enabled"
        Client->>Client: CHANGED: Set thinking.budget_tokens = thinkingBudget
        Client->>Client: Select legacy computer tool version
    end

    Client->>SDK: beta.messages.create(payload)
    Note right of SDK: Payload contains specific<br/>thinking config & tool versions

    SDK->>API: POST /v1/messages (Beta Headers)
    API-->>SDK: Response with thinking block + tool use
    SDK-->>Client: Message object
    Client-->>App: Action/Response
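
The tool-selection step in the diagram can be illustrated with a small sketch. The PR summary states that `computer_20251124` is used for the 4.6 models and for `claude-opus-4-5-20251101`. The function name `selectComputerToolType` is hypothetical, the legacy tool version is not named in this thread and is therefore passed in by the caller, and the sketch assumes any provider prefix has already been removed from the model ID.

```typescript
// Tool version named in the PR summary for the newer models.
const NEW_COMPUTER_TOOL = "computer_20251124";

// Model IDs the PR summary lists as using the newer tool version.
const NEW_TOOL_MODELS: ReadonlySet<string> = new Set([
  "claude-opus-4-6",
  "claude-sonnet-4-6",
  "claude-opus-4-5-20251101",
]);

// Returns the computer tool version for a bare model ID (no provider prefix).
// legacyToolType is supplied by the caller because the thread does not name it.
function selectComputerToolType(modelId: string, legacyToolType: string): string {
  return NEW_TOOL_MODELS.has(modelId) ? NEW_COMPUTER_TOOL : legacyToolType;
}
```

Note that, per the summary, tool selection and thinking mode are decided separately: `claude-opus-4-5-20251101` gets the newer tool version but still uses the budget-based thinking configuration.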

Chromie Bot and others added 2 commits March 30, 2026 06:34
- Add test for default "medium" effort when thinkingEffort not set
- Add tests verifying temperature=1 is set for adaptive thinking
- Add test that older models don't force temperature=1

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Always set temperature=1 when adaptive thinking is enabled (required by API)
- Default to "medium" effort for Claude 4.6 models when thinkingEffort not set
- This ensures adaptive thinking works out of the box for 4.6 models

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment


1 issue found across 2 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="packages/core/tests/unit/anthropic-cua-adaptive-thinking.test.ts">

<violation number="1" location="packages/core/tests/unit/anthropic-cua-adaptive-thinking.test.ts:179">
P3: Duplicate coverage: the new claude-sonnet-4-6 temperature test repeats the existing low-effort adaptive-thinking test with identical setup and assertions, adding no new behavior coverage.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

);
});

it("should set temperature to 1 for claude-sonnet-4-6 with adaptive thinking", async () => {
Contributor

@cubic-dev-ai cubic-dev-ai bot Mar 30, 2026


P3: Duplicate coverage: the new claude-sonnet-4-6 temperature test repeats the existing low-effort adaptive-thinking test with identical setup and assertions, adding no new behavior coverage.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At packages/core/tests/unit/anthropic-cua-adaptive-thinking.test.ts, line 179:

<comment>Duplicate coverage: the new claude-sonnet-4-6 temperature test repeats the existing low-effort adaptive-thinking test with identical setup and assertions, adding no new behavior coverage.</comment>

<file context>
@@ -142,9 +146,57 @@ describe("AnthropicCUAClient adaptive thinking", () => {
+      );
+    });
+
+    it("should set temperature to 1 for claude-sonnet-4-6 with adaptive thinking", async () => {
+      const client = new AnthropicCUAClient(
+        "anthropic",
</file context>

@github-actions github-actions bot added the label external-contributor:mirrored (An internal mirrored PR currently exists for this external contributor PR.) and removed the label external-contributor:awaiting-approval (Waiting for a stagehand team member to approve the latest external commit.) Apr 3, 2026
@github-actions
Contributor

github-actions bot commented Apr 3, 2026

This PR was approved by @miguelg719 and mirrored to #1954. All further discussion should happen on that PR.

@github-actions github-actions bot closed this Apr 3, 2026
