Anthropic just revealed the architecture behind Claude's Research feature.
90.2% faster than single-agent.
15x more tokens than chat.
Here is the production blueprint.
They use an orchestrator-worker pattern. Lead agent is Claude Opus 4. Subagents are Claude Sonnet 4.
The lead decomposes the query, spawns 3-5 subagents in parallel, each with its own fresh context window.
The critical insight: subagents do NOT inherit the orchestrator's conversation history.
Each one gets a clean slate.
This costs more tokens but solves three problems at once:
- Token economy on large tasks
- Failure isolation when a subagent goes off the rails
- Parallel execution without shared state conflicts
Their internal evaluation showed 90.2% improvement over single-agent Opus 4 on breadth-first research queries.
Three factors explained 95% of performance variance in BrowseComp:
- Token usage alone explains 80%
- Tool call count
- Model choice
The scaling rules they embedded in prompts are the real engineering win.
Simple fact-finding: 1 agent, 3-10 tool calls.
Direct comparisons: 2-4 subagents, 10-15 calls each.
Complex research: 10+ subagents with clearly divided responsibilities.
Without these rules, agents were spawning 50 subagents for simple queries and burning money.
Production reliability required rainbow deployments.
Agent systems are stateful webs that run continuously. You cannot update every agent at once. They shift traffic gradually from old to new versions while keeping both running simultaneously.
The gap between prototype and production is where most teams die.
Minor issues cascade into entirely different trajectories. One step failing sends the agent down a different path with unpredictable outcomes.
Anthropic solved this with durable execution, checkpointing, and letting agents adapt when tools fail rather than restarting from scratch.
This is the multi-agent architecture standard.
If your agent system does not have fresh-context subagents, scaling rules, and rainbow deployments, you are still in prototype mode.
Agentic AI
Anthropic just published their production multi-agent architecture. 90.2% faster. 15x token burn. This is the blueprint.
3 views
0 Comments