feat: Add agent execution limits (max_turns, max_token_budget)

## Problem

The `Agent` class has no built-in mechanism to prevent runaway execution. The recursive event loop (`event_loop_cycle` -> `recurse_event_loop`) will continue indefinitely as long as the model keeps requesting tool calls. A single agent invocation can loop without bound, consuming unbounded tokens and time.

**What exists today:**
- `MAX_ATTEMPTS = 6` in `event_loop.py:53` — but this only limits *consecutive retry attempts* on `ModelThrottledException`. It resets to 0 after every successful model call (`_retry.py:113-118`). It does not limit total event loop cycles.
- `CancellationToken` — allows external code to request graceful stopping, but requires the caller to implement the timeout logic themselves.
- `Graph` multi-agent has `max_node_executions`, `execution_timeout`, and `node_timeout` (`graph.py:424-426`) — but single-agent `Agent` has no equivalent.

**What's missing:**
- No `max_turns` parameter to cap total event loop cycles per invocation.
- No `max_token_budget` to cap total token consumption per invocation.
- No detection for diminishing returns (agent spinning without progress).

**Why this matters:**
- A 20-step agent at 95% per-step reliability has only about 36% end-to-end success (0.95^20 ≈ 0.358). Without turn limits, a failing agent compounds errors indefinitely.
- A real-world case showed 250,000 wasted API calls/day from retry loops without circuit breakers.
- The SDK's tenet "Simple at any scale" implies safe defaults. Unbounded recursion is not a safe default for production.
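
The compounding figure in the first bullet can be reproduced with a few lines of arithmetic. This is a minimal sketch that assumes every step succeeds or fails independently with the same probability; `end_to_end_success` is a hypothetical helper for illustration, not part of the SDK.

```python
def end_to_end_success(per_step_reliability: float, steps: int) -> float:
    """Probability that every step succeeds, assuming independent steps."""
    return per_step_reliability ** steps


# 20 steps at 95% per-step reliability
print(f"{end_to_end_success(0.95, 20):.2f}")  # prints 0.36
```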

## Proposed Solution

Add optional `max_turns` and `max_token_budget` parameters to `Agent.__init__()`.

```python
agent = Agent(
    max_turns=30,              # Max event loop cycles per invocation (default: None = unlimited)
    max_token_budget=100_000,  # Max total tokens per invocation (default: None = unlimited)
)
```

**Implementation sketch:**

1. Add `max_turns: int | None = None` and `max_token_budget: int | None = None` to `Agent.__init__()`.
2. In `event_loop_cycle()` (`event_loop.py`), track the cycle count via `invocation_state`. Before each model call:
   - If `max_turns` is set and `cycle_count >= max_turns`, yield an `EventLoopStopEvent` with `stop_reason="max_turns_reached"`.
   - If `max_token_budget` is set and `metrics.accumulated_usage.totalTokens >= max_token_budget`, yield an `EventLoopStopEvent` with `stop_reason="token_budget_exceeded"`.
3. Return a clear `AgentResult` with the stop reason so callers can distinguish normal completion from limits.
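
The check described in step 2 could look roughly like the following. This is a standalone sketch, not the SDK's event loop: `Limits`, `limit_stop_reason`, and `run_loop` are hypothetical names, the list of per-call token counts stands in for real model calls, and `"end_turn"` is used here only as a placeholder for normal completion.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Limits:
    """Hypothetical container for the two proposed parameters (None = unlimited)."""

    max_turns: int | None = None
    max_token_budget: int | None = None


def limit_stop_reason(cycle_count: int, total_tokens: int, limits: Limits) -> str | None:
    """Return a stop reason if a limit has been reached, else None.

    Meant to run before each model call.
    """
    if limits.max_turns is not None and cycle_count >= limits.max_turns:
        return "max_turns_reached"
    if limits.max_token_budget is not None and total_tokens >= limits.max_token_budget:
        return "token_budget_exceeded"
    return None


def run_loop(tokens_per_call: list[int], limits: Limits) -> tuple[str, int, int]:
    """Stand-in for the event loop; each list entry is one model call's token usage.

    Returns (stop_reason, cycles_completed, total_tokens).
    """
    cycle_count = 0
    total_tokens = 0
    for usage in tokens_per_call:
        reason = limit_stop_reason(cycle_count, total_tokens, limits)
        if reason is not None:
            return reason, cycle_count, total_tokens
        total_tokens += usage  # model-reported usage only arrives after the call
        cycle_count += 1
    return "end_turn", cycle_count, total_tokens  # the model stopped requesting tools


print(run_loop([1_000] * 10, Limits(max_turns=3)))
# ('max_turns_reached', 3, 3000)
print(run_loop([40_000] * 10, Limits(max_token_budget=100_000)))
# ('token_budget_exceeded', 3, 120000)
```

In the second call the loop stops at 120,000 tokens rather than 100,000, because usage from the third call is only known once that call has returned.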

## Trade-offs and Edge Cases

| Concern | Analysis |
|---------|----------|
| **What if max_turns hits mid-tool-execution?** | The check occurs before the next model call, not during tool execution. In-flight tools complete normally. The agent stops before the *next* recursive model invocation. |
| **Should there be a default max_turns?** | Debatable. A default (e.g., 50) prevents accidental runaway but could break legitimate long-running agents. Recommend: no default initially, but document the parameter prominently. |
| **Token counting accuracy** | `EventLoopMetrics` already tracks `inputTokens` and `outputTokens` from model responses. These counts are model-reported, not estimated, so they are accurate but only available *after* each model call. The budget check before a call therefore sees only the usage of completed calls, and the total can overshoot `max_token_budget` by up to one call's tokens. |
| **Interaction with SlidingWindowConversationManager** | The conversation manager reduces context when overflow occurs. `max_token_budget` tracks *cumulative* tokens across the invocation, not context window size. These are independent concerns and don't conflict. |
| **Graph already has limits** | Graph's `max_node_executions` and `execution_timeout` apply to multi-agent orchestration. `max_turns` would apply to a single agent's event loop within one invocation. They compose naturally: a Graph node could be an Agent with its own turn limit. |
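
To illustrate why cumulative usage and context size are separate quantities, here is a toy model. It assumes each call re-sends the whole current context as input and that trimming simply caps the context at a fixed token count. Both are simplifications for illustration, not a description of how `SlidingWindowConversationManager` actually trims. `simulate` is a hypothetical helper.

```python
def simulate(turns: int, tokens_added_per_turn: int, context_cap: int) -> tuple[int, int]:
    """Return (final_context_size, cumulative_input_tokens) after `turns` model calls."""
    context = 0
    cumulative = 0
    for _ in range(turns):
        # Stand-in for context trimming: the context never exceeds the cap.
        context = min(context + tokens_added_per_turn, context_cap)
        # Every call sends the current context, so cumulative usage keeps growing.
        cumulative += context
    return context, cumulative


print(simulate(turns=10, tokens_added_per_turn=2_000, context_cap=8_000))
# (8000, 68000)
```

The context plateaus at the cap while cumulative input tokens keep climbing, which is the quantity `max_token_budget` would bound.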

## Alternatives Considered

- **Hook-based implementation**: Users can implement turn counting via `BeforeModelCallEvent` + interrupt. However, this is boilerplate that every production user would need to write. A first-class parameter is more aligned with "the obvious path is the happy path."
- **Only CancellationToken**: Requires external timeout logic (e.g., `threading.Timer`). This works today but bounds only wall-clock time; it does not cover token-budget or turn-based limits.
- **Make it required**: Would break existing code. Optional with `None` default is non-breaking.
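
For comparison, the external timeout logic that the cancellation-only alternative pushes onto callers might look like this. The sketch uses a plain `threading.Event` as a stand-in for a cancellation token and a sleep as a stand-in for agent work. It does not use the SDK's `CancellationToken` API, and `run_with_timeout` is a hypothetical name.

```python
import threading
import time


def run_with_timeout(
    timeout_s: float, step_s: float = 0.01, max_steps: int = 1_000
) -> tuple[bool, int]:
    """Run a stand-in agent loop until a timer requests cancellation.

    Returns (cancelled, steps_completed).
    """
    cancel_requested = threading.Event()  # stand-in for a cancellation token
    timer = threading.Timer(timeout_s, cancel_requested.set)
    timer.start()
    steps = 0
    try:
        while steps < max_steps:
            if cancel_requested.is_set():
                return True, steps
            time.sleep(step_s)  # stand-in for one model call plus tool execution
            steps += 1
        return False, steps
    finally:
        timer.cancel()  # harmless if the timer has already fired


cancelled, steps = run_with_timeout(timeout_s=0.05)
print(cancelled)  # True
```

Every caller has to own the timer, its cleanup, and the choice of timeout, and the result still says nothing about turns taken or tokens spent.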

## References

- Event loop: `src/strands/event_loop/event_loop.py:53-55`
- Graph limits: `src/strands/multiagent/graph.py:424-426`
- Retry strategy: `src/strands/event_loop/_retry.py:21-158`
