feat: Add agent execution limits (max_turns, max_token_budget) #2124

@srbhsrkr

Problem

The Agent class has no built-in mechanism to prevent runaway execution. The recursive event loop (event_loop_cycle -> recurse_event_loop) will continue indefinitely as long as the model keeps requesting tool calls. A single agent invocation can loop without bound, consuming unbounded tokens and time.

What exists today:

  • MAX_ATTEMPTS = 6 in event_loop.py:53 — but this only limits consecutive retry attempts on ModelThrottledException. It resets to 0 after every successful model call (_retry.py:113-118). It does not limit total event loop cycles.
  • CancellationToken — allows external code to request graceful stopping, but requires the caller to implement the timeout logic themselves.
  • Graph multi-agent has max_node_executions, execution_timeout, and node_timeout (graph.py:424-426) — but single-agent Agent has no equivalent.
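The CancellationToken point above means the caller owns the timer. A minimal sketch of that caller-side boilerplate follows. `CancellationTokenStandIn` and `run_until_cancelled` are hypothetical stand-ins written for this illustration; they are not the SDK's classes, and the real token's method names may differ. Only `threading.Timer` and `threading.Event` are real standard-library APIs.

```python
import threading
import time


class CancellationTokenStandIn:
    """Hypothetical stand-in for a cancellation token (not the SDK class)."""

    def __init__(self) -> None:
        self._event = threading.Event()

    def cancel(self) -> None:
        self._event.set()

    def is_cancelled(self) -> bool:
        return self._event.is_set()


def run_until_cancelled(token: CancellationTokenStandIn, max_steps: int = 1000) -> int:
    """Toy loop that does one 'cycle' per step until the token is cancelled."""
    steps = 0
    while not token.is_cancelled() and steps < max_steps:
        time.sleep(0.01)  # pretend to do one unit of work
        steps += 1
    return steps


# The caller has to wire up the timeout themselves.
token = CancellationTokenStandIn()
timer = threading.Timer(0.05, token.cancel)  # request a stop after 50 ms
timer.start()
steps = run_until_cancelled(token)
timer.cancel()  # no-op if the timer already fired
print(f"stopped after {steps} steps")
```

This bounds wall-clock time only; it says nothing about how many cycles ran or how many tokens were consumed.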

What's missing:

  • No max_turns parameter to cap total event loop cycles per invocation.
  • No max_token_budget to cap total token consumption per invocation.
  • No detection for diminishing returns (agent spinning without progress).

Why this matters:

  • A 20-step agent at 95% per-step reliability has only 36% end-to-end success (compounding math). Without turn limits, a failing agent compounds errors indefinitely.
  • A real-world case showed 250,000 wasted API calls/day from retry loops without circuit breakers.
  • The SDK's tenet "Simple at any scale" implies safe defaults. Unbounded recursion is not a safe default for production.
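The compounding figure above can be checked directly. This sketch assumes each step succeeds independently with the same probability, which is a simplification; `end_to_end_success` is a helper name introduced here for illustration.

```python
def end_to_end_success(per_step_reliability: float, steps: int) -> float:
    """Probability that all `steps` independent steps succeed."""
    return per_step_reliability ** steps


for steps in (5, 10, 20, 50):
    rate = end_to_end_success(0.95, steps)
    print(f"{steps:>2} steps at 95% per step: {rate:.1%} end-to-end")
```

At 20 steps this gives roughly 36%, matching the number quoted above, and the rate keeps falling as the step count grows.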

Proposed Solution

Add optional max_turns and max_token_budget parameters to Agent.__init__().

from strands import Agent

# Proposed API: these two parameters do not exist on Agent yet.
agent = Agent(
    max_turns=30,              # Max event loop cycles per invocation (default: None = unlimited)
    max_token_budget=100_000,  # Max total tokens per invocation (default: None = unlimited)
)

Implementation sketch:

  1. Add max_turns: int | None = None and max_token_budget: int | None = None to Agent.__init__().
  2. In event_loop_cycle() (event_loop.py), track the cycle count via invocation_state. Before each model call:
    • If cycle_count >= max_turns, yield an EventLoopStopEvent with stop_reason="max_turns_reached".
    • If metrics.accumulated_usage.totalTokens >= max_token_budget, yield an EventLoopStopEvent with stop_reason="token_budget_exceeded".
  3. Return a clear AgentResult with the stop reason so callers can distinguish normal completion from limits.
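The control flow in the steps above can be sketched as a toy loop. This is not the SDK's event loop: `StopResult`, `run_with_limits`, and the `model_call` callback are hypothetical names used only to show where the two checks sit relative to the model call.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class StopResult:
    """Hypothetical stand-in for the result returned to the caller."""
    stop_reason: str
    cycles: int
    total_tokens: int


def run_with_limits(
    model_call: Callable[[int], Tuple[bool, int]],
    max_turns: Optional[int] = None,
    max_token_budget: Optional[int] = None,
) -> StopResult:
    """Toy loop. model_call(cycle) returns (wants_tool_call, tokens_used)."""
    cycles = 0
    total_tokens = 0
    while True:
        # Both limits are checked before each model call.
        if max_turns is not None and cycles >= max_turns:
            return StopResult("max_turns_reached", cycles, total_tokens)
        if max_token_budget is not None and total_tokens >= max_token_budget:
            return StopResult("token_budget_exceeded", cycles, total_tokens)

        wants_tool_call, tokens_used = model_call(cycles)
        cycles += 1
        total_tokens += tokens_used
        if not wants_tool_call:
            # Normal completion: the model stopped requesting tools.
            return StopResult("end_turn", cycles, total_tokens)


def runaway(cycle: int) -> Tuple[bool, int]:
    """A model that always asks for another tool call, 1,000 tokens per cycle."""
    return True, 1_000


result = run_with_limits(runaway, max_turns=30, max_token_budget=100_000)
print(result)
```

With `None` for both limits the loop behaves as it does today, which is what keeps the change non-breaking.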

Trade-offs and Edge Cases

  • What if max_turns hits mid-tool-execution? The check occurs before the next model call, not during tool execution. In-flight tools complete normally. The agent stops before the next recursive model invocation.
  • Should there be a default max_turns? Debatable. A default (e.g., 50) prevents accidental runaway but could break legitimate long-running agents. Recommend: no default initially, but document the parameter prominently.
  • Token counting accuracy: EventLoopMetrics already tracks inputTokens and outputTokens from model responses. This is model-reported, not estimated, so it's accurate but only available after each model call, not before. Budget checks are therefore slightly lagging (checked after each cycle, not before).
  • Interaction with SlidingWindowConversationManager: The conversation manager reduces context when overflow occurs. max_token_budget tracks cumulative tokens across the invocation, not context window size. These are independent concerns and don't conflict.
  • Graph already has limits: Graph's max_node_executions and execution_timeout apply to multi-agent orchestration. max_turns would apply to a single agent's event loop within one invocation. They compose naturally: a Graph node could be an Agent with its own turn limit.
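To make the lagging token check concrete: because usage is only known after a model call returns, the budget can be overshot by up to one cycle's worth of tokens. The sketch below uses made-up per-cycle usage numbers, and `cycles_until_budget_stop` is a hypothetical helper, not SDK code.

```python
from typing import Sequence, Tuple


def cycles_until_budget_stop(
    per_cycle_tokens: Sequence[int], max_token_budget: int
) -> Tuple[int, int]:
    """Return (cycles_run, total_tokens) when the budget is checked before each call."""
    total = 0
    for cycles_run, tokens in enumerate(per_cycle_tokens):
        if total >= max_token_budget:
            return cycles_run, total
        total += tokens  # usage is only known once the call has completed
    return len(per_cycle_tokens), total


# Illustrative numbers only: the third call starts at 85,000 tokens,
# under the 100,000 budget, and finishes at 145,000.
usage = [40_000, 45_000, 60_000, 50_000]
cycles_run, total_tokens = cycles_until_budget_stop(usage, max_token_budget=100_000)
print(cycles_run, total_tokens, total_tokens - 100_000)
```

Callers who need a hard ceiling would have to set the budget below their true limit by roughly the largest expected single-cycle usage.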

Alternatives Considered

  • Hook-based implementation: Users can implement turn counting via BeforeModelCallEvent + interrupt. However, this is boilerplate that every production user would need to write. A first-class parameter is more aligned with "the obvious path is the happy path."
  • Only CancellationToken: Requires external timeout logic (e.g., threading.Timer). This is available but doesn't cover token budget or turn-based limits.
  • Make it required: Would break existing code. Optional with None default is non-breaking.
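For comparison, the hook-based alternative amounts to boilerplate along these lines. The sketch is self-contained and deliberately does not import the SDK: `BeforeModelCallStandIn` is a hypothetical stand-in for BeforeModelCallEvent, and raising `TurnLimitExceeded` substitutes for the interrupt mechanism, whose exact API is not shown in this issue.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


class TurnLimitExceeded(Exception):
    """Raised by the stand-in hook; the real hook would interrupt instead."""


@dataclass
class BeforeModelCallStandIn:
    """Hypothetical stand-in for the event fired before each model call."""
    invocation_state: Dict[str, Any] = field(default_factory=dict)


class TurnCounter:
    """Counts model calls in invocation_state and stops at max_turns."""

    def __init__(self, max_turns: int) -> None:
        self.max_turns = max_turns

    def on_before_model_call(self, event: BeforeModelCallStandIn) -> None:
        count = event.invocation_state.get("turn_count", 0)
        if count >= self.max_turns:
            raise TurnLimitExceeded(f"reached max_turns={self.max_turns}")
        event.invocation_state["turn_count"] = count + 1


# Simulate one invocation in which the model never stops asking for tools.
state: Dict[str, Any] = {}
counter = TurnCounter(max_turns=3)
try:
    while True:
        counter.on_before_model_call(BeforeModelCallStandIn(invocation_state=state))
except TurnLimitExceeded as exc:
    print(f"stopped: {exc} after {state['turn_count']} turns")
```

Every production user would need some version of this counter, which is the argument for a first-class parameter.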

References

  • Event loop: src/strands/event_loop/event_loop.py:53-55
  • Graph limits: src/strands/multiagent/graph.py:424-426
  • Retry strategy: src/strands/event_loop/_retry.py:21-158
