Chapter 20b: Teams and Multi-Process Collaboration
Positioning: This chapter analyzes Claude Code's Swarm team collaboration mechanism -- a flat-structured multi-Agent collaboration model. Prerequisites: Chapter 20. Target audience: readers who want a deep understanding of CC's Swarm team collaboration mechanism -- including TaskList scheduling, DAG dependencies, and Mailbox communication.
Why Discuss Teams Separately
Chapter 20 introduced Claude Code's three Agent spawning modes -- Subagent, Fork, and Coordinator -- which share the common characteristic of a "parent spawns child" hierarchical relationship. Teams (the teammate system) is a different dimension: it creates a flat-structured team where Agents collaborate through message passing rather than hierarchical calls. This difference manifests not only in architecture but also in engineering implementations of communication protocols, permission synchronization, and lifecycle management.
20b.1 Teammate Agents (Agent Swarms)
The teammate system is another dimension of Agent orchestration. Unlike the "parent spawns child" model of subagents, the teammate system creates a flat-structured team where Agents collaborate through message passing.
TeamCreateTool: Team Creation
TeamCreateTool (tools/TeamCreateTool/TeamCreateTool.ts) is used to create new teams:
// tools/TeamCreateTool/TeamCreateTool.ts:37-49
const inputSchema = lazySchema(() =>
z.strictObject({
team_name: z.string().describe('Name for the new team to create.'),
description: z.string().optional(),
agent_type: z.string().optional()
.describe('Type/role of the team lead'),
}),
)
Team information is persisted to a TeamFile containing the team name, member list, Leader info, etc. Team names must be unique -- conflicts trigger automatic generation of a word slug (lines 64-72).
TeammateAgentContext: Teammate Context
Teammates use the TeammateAgentContext type (agentContext.ts lines 60-85), containing rich team coordination information:
// utils/agentContext.ts:60-85
export type TeammateAgentContext = {
agentId: string // Full ID, e.g., "researcher@my-team"
agentName: string // Display name, e.g., "researcher"
teamName: string // Team membership
agentColor?: string // UI color
planModeRequired: boolean // Whether plan approval is needed
parentSessionId: string // Leader's session ID
isTeamLead: boolean // Whether this is the Leader
agentType: 'teammate'
}
Teammate IDs use the format name@team-name, making it easy to identify an Agent's identity and affiliation at a glance in logs and communications.
Flat Structure Constraint
The teammate system has an important architectural constraint: teammates cannot spawn other teammates (lines 272-274):
// tools/AgentTool/AgentTool.tsx:272-274
if (isTeammate() && teamName && name) {
throw new Error('Teammates cannot spawn other teammates — the team roster is flat.');
}
This is a deliberate design -- the team roster is a flat array, and nested teammates would create entries in the roster without source information, confusing the Leader's coordination logic.
Similarly, in-process teammates cannot spawn background Agents (lines 278-280) because their lifecycle is bound to the Leader's process.
20b.2 Inter-Agent Communication
SendMessageTool: Message Routing
SendMessageTool (tools/SendMessageTool/SendMessageTool.ts) is the core of inter-Agent communication. Its to field supports multiple addressing modes:
// tools/SendMessageTool/SendMessageTool.ts:69-76
to: z.string().describe(
feature('UDS_INBOX')
? 'Recipient: teammate name, "*" for broadcast, "uds:<socket-path>" for a local peer, or "bridge:<session-id>" for a Remote Control peer'
: 'Recipient: teammate name, or "*" for broadcast to all teammates',
),
Message types form a discriminated union (lines 47-65), supporting:
- Plain text messages
- Shutdown requests (
shutdown_request) - Shutdown responses (
shutdown_response) - Plan approval responses (
plan_approval_response)
Broadcast Mechanism
When to is "*", a broadcast is triggered (handleBroadcast, lines 191-266): iterating through all members in the team file (excluding the sender), writing to each mailbox. Broadcast results include the recipient list for coordinator tracking.
Mailbox System
Messages are physically written to filesystem mailboxes via the writeToMailbox() function. Each message contains: sender name, text content, summary, timestamp, and sender color. This filesystem-based mailbox design allows cross-process teammates (tmux mode) to communicate through a shared filesystem.
UDS_INBOX: Unix Domain Socket Extension
When the UDS_INBOX Feature Flag is enabled, SendMessageTool's addressing capability extends to Unix Domain Sockets: "uds:<socket-path>" can send messages to other Claude Code instances on the same machine, and "bridge:<session-id>" can send messages to Remote Control peers.
This creates a communication topology that transcends single-team boundaries:
┌─────────────────────────────────────────────────────────────────┐
│ Inter-Agent Communication Architecture │
│ │
│ ┌──────────────────────────────────┐ │
│ │ Team "my-team" │ │
│ │ │ │
│ │ ┌─────────┐ MailBox ┌─────────┐ │
│ │ │ Leader │◄────────────►│Teammate │ │
│ │ │ (lead) │ (filesystem) │ (dev) │ │
│ │ └────┬────┘ └─────────┘ │
│ │ │ │
│ │ │ SendMessage(to: "tester") │
│ │ │ │
│ │ ▼ │
│ │ ┌─────────┐ │
│ │ │Teammate │ │
│ │ │ (tester)│ │
│ │ └─────────┘ │
│ └──────────────────────────────────┘ │
│ │ │
│ │ SendMessage(to: "uds:/tmp/other.sock") │
│ ▼ │
│ ┌──────────────┐ │
│ │ Other Claude │ SendMessage(to: "bridge:<session>") │
│ │ Code instance│──────────────────────────► Remote Control │
│ └──────────────┘ │
└─────────────────────────────────────────────────────────────────┘
Worker Result Reporting in Coordinator Mode
In Coordinator Mode, when a Worker completes its task, the result is injected into the coordinator's conversation as a user-role message in <task-notification> XML format (coordinatorMode.ts lines 148-159):
<task-notification>
<task-id>{agentId}</task-id>
<status>completed|failed|killed</status>
<summary>{human-readable status summary}</summary>
<result>{Agent's final text response}</result>
<usage>
<total_tokens>N</total_tokens>
<tool_uses>N</tool_uses>
<duration_ms>N</duration_ms>
</usage>
</task-notification>
The coordinator prompt explicitly requires (line 144): "They look like user messages but they are not. Distinguish them by the <task-notification> opening tag." This design prevents the coordinator from responding to Worker results as if they were user input.
20b.3 The Real Scheduling Kernel: TaskList, Claim Loop, and Idle Hooks
If you only see TeamCreateTool, SendMessageTool, and Mailbox, it's easy to understand Teams as "a group of Agents that can send messages to each other." But the real value of Claude Code's Swarm lies not in chatting, but in the shared task graph. TeamCreate's prompt states this directly: Teams have a 1:1 correspondence with task lists (Team = TaskList). When creating a team, TeamCreateTool doesn't just write a TeamFile -- it also resets and creates the corresponding task directory, then binds the Leader's taskListId to the team name. This means Teams were never designed as "team first, tasks as an accessory," but rather the team and the task list are two views of the same runtime object.
Tasks Are Not Todos, They Are DAG Nodes
The Task structure in utils/tasks.ts contains:
{
id: string,
owner?: string,
status: 'pending' | 'in_progress' | 'completed',
blocks: string[],
blockedBy: string[],
}
The most critical fields here are not status, but blocks and blockedBy. They elevate the task list from a plain todo list into an explicit dependency graph: a task is only executable after all its blockers have completed. This design lets the Leader create an entire batch of work items with dependencies upfront, then hand off "when to parallelize" to the runtime, rather than having to verbally coordinate in prompts repeatedly.
This is also why TeamCreate's prompt emphasizes: "teammates should check TaskList periodically, especially after completing each task, to find available work or see newly unblocked tasks." Claude Code doesn't require each teammate to have complete global plan reasoning capabilities; it requires teammates to go back to the shared task graph and read state.
Auto-Claim: The Swarm's Minimal Scheduler
What actually drives this task graph is useTaskListWatcher.ts. This watcher triggers a check whenever the task directory changes or the Agent becomes idle again, automatically selecting an available task:
status === 'pending'owneris empty- All tasks in
blockedByare completed
The findAvailableTask() in the source code filters by exactly these conditions. After finding a task, the runtime first claimTask() to seize ownership, then formats the task into a prompt for Agent execution; if submission fails, the claim is released. Two important engineering implications:
- Scheduling and reasoning are separated. The model doesn't need to determine in natural language "which task isn't being done by someone else and has its dependencies resolved"; the runtime narrows candidates to a single explicit task first.
- Parallelism comes from shared state, not message negotiation. Multiple Agents can make progress simultaneously not because they're smart enough to coordinate with each other, but because claim + blocker checks explicitly encode conflicts into the state machine.
From this perspective, Claude Code's Swarm already has a small but complete scheduler: task graph + atomic claim + state transitions. The Mailbox is just a collaboration supplement, not the primary scheduling surface.
Post-Turn Event Surface: TaskCompleted and TeammateIdle
Another key aspect of the Swarm is that when a teammate finishes a turn of execution, it doesn't simply "stop" -- it enters an event-driven wrap-up phase. In query/stopHooks.ts, when the current executor is a teammate, Claude Code runs two types of specialized events after normal Stop hooks:
TaskCompleted: fires completion hooks forin_progresstasks owned by the current teammateTeammateIdle: fires hooks when the teammate enters idle state
This makes Teams neither purely pull-based nor purely push-based, but a combination of both:
- pull: idle teammates return to the TaskList and continue claiming new tasks
- push: task completion and teammate idle trigger events, notifying the Leader or driving subsequent automation
In other words, Claude Code's Swarm is not "a group of agents that send messages," but rather a collaboration kernel composed of shared task graph + durable mailbox + post-turn events.
This Is Not Shared Memory, But Shared State
The wording here must be very precise. Teams may look like "multiple Agents sharing a workspace," but per the source code, the more accurate description is not "shared memory" but three layers of shared state:
- Shared task state:
~/.claude/tasks/{team-name}/ - Shared communication state:
~/.claude/teams/{team}/inboxes/*.json - Shared team configuration:
~/.claude/teams/{team}/config.json
In-Process teammates just happen to run in the same process physically, and preserve their own identity context via AsyncLocalStorage; this doesn't elevate the entire system into a general-purpose blackboard shared-memory runtime. This distinction is important because it determines the truly portable pattern of Claude Code's Swarm: externalize collaboration state first, then let different execution units collaborate around it.
20b.4 Async Agent Lifecycle
When shouldRunAsync is true (triggered by any of run_in_background, background: true, Coordinator Mode, Fork mode, assistant mode, etc., line 567), the Agent enters an async lifecycle:
- Registration:
registerAsyncAgent()creates a background task record, assignsagentId - Execution: Runs
runAgent()wrapped inrunWithAgentContext() - Progress Reporting: Updates status via
updateAsyncAgentProgress()andonProgresscallbacks - Completion/Failure: Calls
completeAsyncAgent()orfailAsyncAgent() - Notification:
enqueueAgentNotification()injects results into the caller's message stream
A key design choice: background Agents are not associated with the parent Agent's abortController (line 694-696 comment) -- when the user presses ESC to cancel the main thread, background Agents continue running. They can only be explicitly terminated via chat:killAgents.
Worktree Isolation
When isolation: 'worktree', the Agent runs in a temporary git worktree (lines 590-593):
const slug = `agent-${earlyAgentId.slice(0, 8)}`;
worktreeInfo = await createAgentWorktree(slug);
After the Agent completes, if the worktree has no changes (compared to the HEAD commit at creation), it's automatically cleaned up (lines 666-679). Worktrees with changes are retained, and their path and branch name are returned to the caller.
20b.5 Teams Implementation Details: Backends, Communication, Permissions, and Memory
This section is the implementation-level deep dive of 20b.1 (teammate overview). Section 20b.1 answers "what are Teams" -- flat-structured teams, TeamCreateTool, TeammateAgentContext types; this section answers "how do Teams actually run" -- process management, communication protocols, permission synchronization, and shared memory engineering implementations.
In the source code, "Swarm" and "Team" are synonyms: the directory is
utils/swarm/, the tool isTeamCreateTool, the Feature Flag isENABLE_AGENT_SWARMS, and the constant isSWARM_SESSION_NAME = 'claude-swarm'.
Three Backends, One Interface
Teams supports three physical backends, unified behind the PaneBackend + TeammateExecutor interface (utils/swarm/backends/types.ts):
| Backend | Process Model | Communication | Use Case |
|---|---|---|---|
| Tmux | Independent CLI processes, tmux split panes | Filesystem Mailbox | Default backend, for Linux/macOS |
| iTerm2 | Independent CLI processes, iTerm2 split panes | Filesystem Mailbox | macOS native terminal users |
| In-Process | Same-process AsyncLocalStorage isolation | AppState memory queue | No tmux/iTerm2 environment |
Backend detection priority chain (backends/registry.ts):
1. Running inside tmux? → Tmux (native)
2. Inside iTerm2 with it2 available? → iTerm2 (native)
3. Inside iTerm2 but no it2? → Prompt to install it2
4. System has tmux? → Tmux (external session)
5. None of the above? → In-Process fallback
The benefit of this strategy pattern: the Leader's TeamCreateTool and SendMessageTool don't need to know which backend teammates run on -- spawnTeammate() automatically selects the best option.
Team Lifecycle
// utils/swarm/teamHelpers.ts — TeamFile structure
{
name: string, // Unique team name
description?: string,
createdAt: number,
leadAgentId: string, // Format: team-lead@{teamName}
members: [{
agentId: string, // Format: {name}@{teamName}
name: string,
agentType?: string,
model?: string,
prompt: string,
color: string, // Auto-assigned terminal color
planModeRequired: boolean,
tmuxPaneId?: string,
sessionId?: string,
backendType: BackendType,
isActive: boolean,
mode: PermissionMode,
}]
}
Storage location: ~/.claude/teams/{teamName}/config.json
Teammate spawning flow (spawnMultiAgent.ts:305-539):
- Detect backend -> generate unique name -> format agent ID (
{name}@{teamName}) - Assign terminal color -> create tmux/iTerm2 split pane
- Build inherited CLI arguments:
--agent-id,--agent-name,--team-name,--agent-color,--parent-session-id,--permission-mode - Build inherited environment variables -> send startup command to split pane
- Update TeamFile -> send initial instructions via Mailbox
- Register out-of-process task tracking
Flat structure constraint: Teammates cannot spawn sub-teammates (AgentTool.tsx:266-300). This isn't a technical limitation -- it's an intentional organizational principle: coordination is centralized at the Leader, avoiding infinitely deep delegation chains.
Mailbox Communication Protocol
Teammates communicate asynchronously through filesystem mailboxes (teammateMailbox.ts):
~/.claude/teams/{teamName}/inboxes/{agentName}.json
Concurrency control: async lockfile + exponential backoff (10 retries, 5-100ms delay window)
Message structure:
type TeammateMessage = {
from: string, // Sender name
text: string, // Message content or JSON control message
timestamp: string,
read: boolean, // Read marker
color?: string, // Sender's terminal color
summary?: string, // 5-10 word summary
}
Control message types (structured JSON nested in the text field):
| Type | Direction | Purpose |
|---|---|---|
idle notification | Teammate -> Leader | Teammate finished work, reporting reason (available/error/shutdown/completed) |
shutdown_request | Leader -> Teammate | Request graceful shutdown |
shutdown_response | Teammate -> Leader | Approve or reject shutdown request |
plan_approval_response | Leader -> Teammate | Approve or reject teammate's submitted plan |
Idle notification structure (teammateMailbox.ts):
type IdleNotificationMessage = {
type: 'idle',
teamName: string,
agentName: string,
agentId: string,
idleReason: 'available' | 'error' | 'shutdown' | 'completed',
summary?: string, // Work summary
peerDmSummary?: string, // Recent DM summary
errorDetails?: string,
}
Permission Synchronization: Leader Proxy Approval
Teammates cannot self-approve dangerous tool calls -- they must go through the Leader proxy (utils/swarm/permissionSync.ts):
~/.claude/teams/{teamName}/permissions/
├── pending/ # Requests awaiting approval
└── resolved/ # Processed requests
Request flow:
Worker encounters permission check
↓
Creates SwarmPermissionRequest (with toolName, input, suggestions)
↓
Writes to pending/{requestId}.json + sends to Leader Mailbox
↓
Leader polls Mailbox → detects permission request → presents to user
↓
User approves/rejects in Leader terminal
↓
Writes to resolved/{requestId}.json
↓
Worker polls resolved/ → gets result → continues execution
This design ensures that even when teammates run in independent processes, all dangerous operations still go through human approval.
Team Memory
Feature gate TENGU_HERRING_CLOCK controls this. Located at:
~/.claude/projects/{project}/memory/team/MEMORY.md
Independent from personal memory (~/.claude/projects/{project}/memory/), shared by all team members. Uses the same two-step write flow as personal memory: first write the .md file, then update the MEMORY.md index.
Path security validation (memdir/teamMemPaths.ts, PSR M22186 security patch):
| Attack Type | Protection |
|---|---|
| Null byte injection | Reject paths containing \0 |
| URL-encoded traversal | Reject %2e%2e%2f and similar patterns |
| Unicode normalization attacks | Reject fullwidth ../ and similar variants |
| Backslash traversal | Reject paths containing \ |
| Symlink loops | Detect ELOOP + dangling links |
| Path escape | Resolve realpath to verify containment of deepest existing ancestor |
In-Process Teammates: Team Collaboration Without tmux
When the environment lacks tmux/iTerm2, teammates run within the same process isolated by AsyncLocalStorage (utils/swarm/spawnInProcess.ts):
// AsyncLocalStorage context isolation
type TeammateContext = {
agentId: string,
agentName: string,
teamName: string,
parentSessionId: string,
isInProcess: true,
abortController: AbortController, // Independent cancellation control
}
runWithTeammateContext<T>(context, fn: () => T): T // Isolated execution
In-Process teammate task state (InProcessTeammateTaskState) contains:
pendingUserMessages: string[]-- message queue (replaces filesystem Mailbox)awaitingPlanApproval: boolean-- waiting for Leader approval in Plan modeisIdle: boolean-- idle statusonIdleCallbacks: Array<() => void>-- callbacks on idle (notify Leader)messages: Message[]-- UI display buffer (capTEAMMATE_MESSAGES_UI_CAP = 50)
Key difference from tmux teammates: communication is through memory queues rather than filesystem Mailbox, but the API is completely consistent.
Pattern Distillation: Filesystem-Based Inter-Process Collaboration
Teams' communication design makes a counterintuitive but pragmatic choice: using the filesystem rather than IPC/RPC for cross-process communication.
| Dimension | Filesystem Mailbox | Traditional IPC/RPC |
|---|---|---|
| Persistence | Messages survive process crashes | Lost on disconnect |
| Debuggability | Direct cat to inspect | Requires dedicated debug tools |
| Concurrency control | lockfile | Built into protocol |
| Latency | Poll interval (millisecond scale) | Instant |
| Cross-machine | Requires shared filesystem | Natively supported |
For Agent Teams scenarios (second-scale interactions, processes may crash, human debugging needed), the filesystem Mailbox trade-off is reasonable -- UDS serves as a supplementary solution covering low-latency scenarios.
What Users Can Do
Leverage the Teams system to improve multi-Agent collaboration efficiency:
-
Note the addressing modes for inter-Agent communication.
SendMessageToolsupports name addressing ("tester"), broadcast ("*"), and UDS addressing ("uds:<path>"). Understanding these addressing modes helps design more efficient multi-Agent workflows. -
Understand Teams' backend selection. If you use tmux or iTerm2, teammates run as independent terminal split panes communicating through filesystem Mailbox; without a terminal multiplexer, it falls back to in-process mode. Knowing this helps debug inter-teammate communication issues.
-
Use idle detection to gauge teammate status. The Leader senses teammate status by polling idle notifications in the Mailbox. If a teammate seems "stuck," checking the mailbox files under
~/.claude/teams/{teamName}/inboxes/can help locate the problem. -
Permission approval is centralized at the Leader. All teammates' dangerous operations require approval through the Leader terminal. Make sure the Leader terminal stays active, otherwise teammates will block waiting for approval.