Time: 60 minutes | Difficulty: Advanced
Master context window management with auto-compaction, effort levels, and advanced tool discovery features. These patterns are essential for building long-running agents and managing large tool collections.
- Auto-compaction for managing context window size
- Effort levels for controlling response quality
- Tool search for dynamic tool discovery
- MCP toolset configuration
- Computer use v5 features
Before starting this tutorial, you should:
- Complete Tutorials 0-6 (Foundation series)
- Understand ReAct patterns and tool use
- Have the Claude PHP SDK installed and configured
When building agents that run for extended periods or process large amounts of data, the context window can fill up quickly. Auto-compaction automatically summarizes and compresses the message history when it exceeds a specified token threshold.
$response = $client->messages()->create([
'model' => 'claude-sonnet-4-20250514',
'max_tokens' => 4096,
'messages' => $messages,
'compaction_control' => [
'enabled' => true,
'context_token_threshold' => 100000,
],
]);
When compaction triggers:
- Total tokens (input + output) exceed the threshold
- The system generates a continuation summary
- The message history is replaced with the summary
- The agent continues with the compressed context
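The trigger steps above can be simulated locally to make the sequence concrete. This is a minimal sketch, not the SDK's internal behavior: `shouldCompact()`, `compactHistory()`, and the stand-in summarizer are hypothetical names introduced for illustration, and a real agent would obtain the summary from a model call.

```php
<?php
// Hypothetical helpers (not part of the SDK): they mimic the trigger steps locally.

/** Returns true when input + output tokens exceed the threshold. */
function shouldCompact(int $inputTokens, int $outputTokens, int $threshold): bool
{
    return ($inputTokens + $outputTokens) > $threshold;
}

/**
 * Replaces the whole message history with a single summary message.
 * $summarize is any callable that turns the history into a string.
 */
function compactHistory(array $messages, callable $summarize): array
{
    return [[
        'role' => 'user',
        'content' => "Summary of the conversation so far:\n" . $summarize($messages),
    ]];
}

$messages = [
    ['role' => 'user', 'content' => 'Index the repository.'],
    ['role' => 'assistant', 'content' => 'Done.'],
    ['role' => 'user', 'content' => 'Now list the failing tests.'],
];

// Stand-in summarizer: joins a short prefix of each message.
$summarize = fn(array $history): string => implode(' | ', array_map(
    fn(array $m): string => $m['role'] . ': ' . substr($m['content'], 0, 40),
    $history
));

// Assumed token counts for illustration; real values come from the API response.
if (shouldCompact(98000, 4096, 100000)) {
    $messages = compactHistory($messages, $summarize);
}

echo count($messages), "\n";
```

Because the summary replaces the history wholesale, anything the summarizer omits is gone from the context afterward.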
Control the computational effort Claude puts into generating responses:
| Level | Use Case | Trade-offs |
|---|---|---|
| low | Simple queries, classification | Fast, low tokens |
| medium | General tasks (default) | Balanced |
| high | Complex reasoning, analysis | Thorough, more tokens |
$response = $client->messages()->create([
'model' => 'claude-sonnet-4-20250514',
'max_tokens' => 4096,
'output_config' => [
'effort' => 'high', // For complex reasoning
],
'messages' => $messages,
]);
For large tool collections (100+ tools), loading all tools into the system prompt is inefficient. Tool search enables dynamic discovery:
// Define tools with deferred loading
$tools = [
[
'name' => 'analyze_data',
'description' => 'Analyze data sets',
'defer_loading' => true, // Not loaded initially
'input_schema' => [...],
],
// ... hundreds more tools
];
// Add search tool
$tools[] = [
'type' => 'tool_search_tool_bm25_20251119',
'name' => 'tool_search_tool_bm25',
];
// Claude can now search for tools by name/description
$response = $client->messages()->create([
'model' => 'claude-sonnet-4-5-20250929',
'max_tokens' => 1024,
'tools' => $tools,
'messages' => $messages,
'betas' => ['tool-search-tool-2025-10-19'],
]);
Configure tools from Model Context Protocol (MCP) servers:
$tools = [
[
'type' => 'mcp_toolset',
'mcp_server_name' => 'database-server',
'default_config' => [
'enabled' => true,
'defer_loading' => true,
],
'configs' => [
'safe_query' => ['defer_loading' => false], // Load immediately
'drop_table' => ['enabled' => false], // Disable dangerous
],
],
];
The tutorial code demonstrates:
- Long-running agent with auto-compaction
- Task complexity detection with effort levels
- Dynamic tool discovery with search
- Combining all features for optimal performance
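As a rough illustration of the last item, the sketch below assembles one request array that carries the compaction, effort, and tool settings together. `buildRequest()` is a hypothetical helper, not an SDK function; it only reuses the parameter keys shown earlier in this tutorial, and whether a given model accepts all of them in a single call is an assumption to verify against the SDK.

```php
<?php
// Hypothetical helper (not part of the SDK): assembles request parameters
// using only the keys that appear earlier in this tutorial.
function buildRequest(array $messages, array $tools, string $effort, int $threshold): array
{
    $allowed = ['low', 'medium', 'high'];
    if (!in_array($effort, $allowed, true)) {
        throw new InvalidArgumentException("Unknown effort level: {$effort}");
    }

    return [
        'model' => 'claude-sonnet-4-20250514',
        'max_tokens' => 4096,
        'messages' => $messages,
        'tools' => $tools,
        'output_config' => ['effort' => $effort],
        'compaction_control' => [
            'enabled' => true,
            'context_token_threshold' => $threshold,
        ],
    ];
}

$request = buildRequest(
    [['role' => 'user', 'content' => 'Summarize briefly the open tickets']],
    [],
    'low',
    100000
);

echo json_encode($request['output_config']), "\n";
```

Validating the effort string before the call keeps a typo from turning into an API error halfway through a long-running session.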
cd tutorials/15-context-management
php context_agent.php
class ContextManagedAgent
{
private ClaudePhp $client;
private array $messages = [];
private int $compactionThreshold;
public function __construct(
ClaudePhp $client,
int $compactionThreshold = 100000
) {
$this->client = $client;
$this->compactionThreshold = $compactionThreshold;
}
public function run(string $task): void
{
$this->messages[] = [
'role' => 'user',
'content' => $task,
];
$previousMessageCount = count($this->messages);
while (true) {
$response = $this->client->messages()->create([
'model' => 'claude-sonnet-4-20250514',
'max_tokens' => 4096,
'messages' => $this->messages,
'compaction_control' => [
'enabled' => true,
'context_token_threshold' => $this->compactionThreshold,
],
]);
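// Detect compaction.
// NOTE: create() receives a copy of $this->messages, so this count cannot
// shrink unless the history is replaced elsewhere (for example, by swapping
// in the continuation summary). Treat this check as a placeholder.
$currentMessageCount = count($this->messages);
if ($currentMessageCount < $previousMessageCount) {
echo "🔄 Compaction occurred!\n";
echo "Messages: {$previousMessageCount} → {$currentMessageCount}\n";
}
$previousMessageCount = $currentMessageCount;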
// Add assistant response
$this->messages[] = [
'role' => 'assistant',
'content' => $response->content,
];
if ($response->stop_reason === 'end_turn') {
break;
}
}
}
}
function selectEffortLevel(string $task): string
{
// Simple heuristics for effort selection
$complexIndicators = [
'analyze', 'prove', 'explain why', 'compare and contrast',
'security', 'optimization', 'architecture', 'debug',
];
$simpleIndicators = [
'what is', 'define', 'list', 'translate', 'summarize briefly',
];
$taskLower = strtolower($task);
foreach ($complexIndicators as $indicator) {
if (str_contains($taskLower, $indicator)) {
return 'high';
}
}
foreach ($simpleIndicators as $indicator) {
if (str_contains($taskLower, $indicator)) {
return 'low';
}
}
return 'medium';
}
// Usage
$task = "Analyze the security implications of this architecture";
$effort = selectEffortLevel($task);
$response = $client->messages()->create([
'model' => 'claude-sonnet-4-20250514',
'max_tokens' => 4096,
'output_config' => ['effort' => $effort],
'messages' => [['role' => 'user', 'content' => $task]],
]);
function createToolSearcher(array $tools): callable
{
return function(string $keyword) use ($tools): array {
$results = [];
foreach ($tools as $tool) {
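// Search only the name and description; matching the full JSON would also
// hit structural keys such as "type" or "defer_loading".
$searchText = $tool['name'] . ' ' . ($tool['description'] ?? '');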
if (stripos($searchText, $keyword) !== false) {
$results[] = [
'type' => 'tool_reference',
'tool_name' => $tool['name'],
];
}
}
return $results;
};
}
// Create searchable tools
$tools = [];
// Add many deferred tools
foreach ($allToolDefinitions as $tool) {
$tool['defer_loading'] = true;
$tools[] = $tool;
}
// Add the search tool
$tools[] = [
'name' => 'search_tools',
'description' => 'Search for available tools by keyword',
'input_schema' => [
'type' => 'object',
'properties' => [
'keyword' => [
'type' => 'string',
'description' => 'Search keyword',
],
],
'required' => ['keyword'],
],
];
$toolSearcher = createToolSearcher($tools);
| Feature | Without | With | Improvement |
|---|---|---|---|
| Context overflow | Fails at limit | Continues with summary | ✓ No hard stop at the limit |
| 100 tools in prompt | ~5000 tokens | ~500 tokens | 90% reduction |
| Simple query latency | Medium | Low (with low effort) | 40% faster |
| Complex analysis quality | Medium | High (with high effort) | Better output |
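The 90% figure in the tool row follows directly from the two token estimates in the table. A small sketch of that arithmetic, using a hypothetical `percentReduction()` helper and the table's approximate values rather than measured data:

```php
<?php
// Hypothetical helper: computes the percentage saved when going from
// $before tokens to $after tokens.
function percentReduction(int $before, int $after): float
{
    if ($before <= 0) {
        throw new InvalidArgumentException('The starting token count must be positive.');
    }
    return round(($before - $after) / $before * 100, 1);
}

// Approximate figures taken from the table above.
$allToolsLoaded = 5000;
$withDeferredLoading = 500;

echo percentReduction($allToolsLoaded, $withDeferredLoading), "%\n";
```

The same helper can be fed real `usage` numbers from two comparable requests to check whether deferred loading pays off for a particular tool set.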
- Set appropriate thresholds
  - Default: 100,000 tokens
  - Lower for memory-constrained scenarios
  - Higher when context preservation is critical
- Use custom summary prompts for domain-specific tasks
- Monitor for compaction and adjust thresholds as needed
- Match effort to task complexity
  - Don't use high effort for simple queries
  - Reserve high for critical analysis
- Combine with extended thinking for maximum depth
- Monitor token usage - high effort costs more
- Defer rarely-used tools to reduce prompt size
- Use descriptive tool names for better search results
- Consider BM25 vs Regex based on search needs
Context loss in compaction: Important details may be summarized away
- Store critical information externally
- Use system prompts for persistent context
-
Over-using high effort: Increases latency and costs
- Reserve for genuinely complex tasks
-
Too many loaded tools: Slows initial response
- Use defer_loading aggressively
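One way to act on the first pitfall is to keep critical facts outside the message history and render them into the system prompt on every request, so a compaction summary cannot drop them. The sketch below assumes a hypothetical `PinnedFacts` class; it is not an SDK feature, and how the resulting string is passed as the system prompt depends on the SDK.

```php
<?php
// Hypothetical class (not part of the SDK): holds facts that must survive
// compaction and renders them as a system prompt section.
final class PinnedFacts
{
    /** @var array<string, string> */
    private array $facts = [];

    /** Stores or overwrites a fact under the given key. */
    public function pin(string $key, string $value): void
    {
        $this->facts[$key] = $value;
    }

    /** Appends the pinned facts to a base system prompt. */
    public function toSystemPrompt(string $base): string
    {
        if ($this->facts === []) {
            return $base;
        }
        $lines = [];
        foreach ($this->facts as $key => $value) {
            $lines[] = "- {$key}: {$value}";
        }
        return $base . "\n\nPinned facts (must not be forgotten):\n" . implode("\n", $lines);
    }
}

$facts = new PinnedFacts();
$facts->pin('target_branch', 'release-candidate');
$facts->pin('forbidden_action', 'never drop tables');

// Rebuild the system prompt before each request so it reflects the latest facts.
$system = $facts->toSystemPrompt('You are a careful database assistant.');
echo $system, "\n";
```

Since the facts live in application state and not in the conversation, they stay intact no matter how often the history is summarized.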
After this tutorial:
- Explore examples/auto_compaction.php
- Explore examples/effort_levels.php
- Explore examples/tool_search.php
- Review examples/mcp_toolset.php