Queen-Led Hierarchy

One-Line Summary: A queen-led topology has a single high-authority "queen" agent that allocates tasks to a pool of workers, arbitrates conflicts, and decides when work is done. It is ruflo's flagship topology and usually the most token-efficient way to coordinate five or more agents on a complex task.

Prerequisites: Topology as a design decision, sub-agents as primitives

What Is a Queen-Led Hierarchy?

A queen is one designated agent with elevated responsibilities: she sees the task description, decomposes it, assigns subtasks to workers, reads worker outputs, and decides when the goal is met. Workers do not coordinate with each other directly; all communication funnels through the queen. This is the same shape as a manager with direct reports in a human organization, and it shares both the strengths (clear authority, predictable routing) and the weaknesses (the queen is a bottleneck and a single point of failure).

Queen-led is ruflo's default for swarms because it is cheap. Coordination cost scales linearly in the number of workers (the queen talks to each worker), not quadratically (every worker talks to every other worker). For most tasks where the workers are clearly specialized, this is the right shape.
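The linear-versus-quadratic claim can be checked with a few lines of arithmetic. The sketch below counts communication channels only, which is a rough proxy for coordination cost rather than a token measurement; the function names are illustrative and not part of ruflo.

```python
def star_channels(workers: int) -> int:
    """Queen-led: one queen<->worker channel per worker."""
    return workers


def mesh_channels(agents: int) -> int:
    """Full mesh: every unordered pair of agents shares a channel."""
    return agents * (agents - 1) // 2


# Compare the two shapes for the same number of workers.
for n in (3, 5, 10, 20):
    print(f"{n:>2} workers: queen-led={star_channels(n):>2}  mesh={mesh_channels(n):>3}")
```

At three workers the two shapes have the same channel count; the gap opens quickly after that, which is why the advantage is usually quoted for five or more agents.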

How It Works

The queen is a sub-agent like any other, but with a system prompt designed for orchestration: "You are the queen. Decompose the task. Dispatch each subtask to the most appropriate worker. Read their outputs. Decide when done." The queen's tool list includes a `dispatch(worker, subtask)` tool or its equivalent (`Task` in Claude Code), plus aggregation tools.

The workers are domain-specialized sub-agents (coder, tester, reviewer, researcher). Each runs to completion on its assigned subtask and returns one final message. The queen reads all returns, possibly dispatches follow-up work, and eventually emits the final answer.
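The control flow described above can be sketched in plain Python. This is a minimal illustration under loud assumptions: workers are ordinary functions standing in for sub-agents, `decompose` and `is_done` are placeholders for the queen's model calls, and none of the names (`Queen`, `make_worker`, `max_rounds`) come from ruflo or any other framework.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# A worker is modelled as a plain function: subtask text in, one final message out.
Worker = Callable[[str], str]


@dataclass
class Queen:
    workers: Dict[str, Worker]
    max_rounds: int = 3  # hypothetical budget so a thrashing queen cannot loop forever

    def decompose(self, task: str) -> List[Tuple[str, str]]:
        # Placeholder for the queen's LLM call: returns (worker_name, subtask) pairs.
        return [(name, f"{name} part of: {task}") for name in self.workers]

    def is_done(self, results: Dict[str, str]) -> bool:
        # Placeholder completion check: every worker has returned a non-empty message.
        return all(results.get(name) for name in self.workers)

    def run(self, task: str) -> str:
        results: Dict[str, str] = {}
        for _ in range(self.max_rounds):
            for name, subtask in self.decompose(task):
                if name not in results:
                    # Workers receive only what the queen passes; they never see each other.
                    results[name] = self.workers[name](subtask)
            if self.is_done(results):
                break
        # Simplest aggregation: concatenate the returns in worker order.
        return "\n".join(f"[{n}] {results[n]}" for n in self.workers if n in results)


def make_worker(role: str) -> Worker:
    # Stand-in for a domain-specialized sub-agent.
    return lambda subtask: f"{role} finished '{subtask}'"


queen = Queen(workers={r: make_worker(r) for r in ("coder", "tester", "reviewer")})
report = queen.run("add login endpoint")
print(report)
```

Every message passes through `Queen.run`, so routing stays predictable; the same property means that an exception inside `run` loses the whole task.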

Why It Matters

For most multi-agent tasks, queen-led is the highest performance-per-token topology. It avoids worker-to-worker chatter (expensive), avoids the consensus overhead of mesh topologies (expensive), and exploits specialization without burning tokens on coordination prompts. The cost is concentration: the queen's quality determines the system's quality.

Key Technical Details

Queen prompts are themselves a discipline: Bad queens dispatch poorly, redo work, or thrash. Good queens decompose carefully and dispatch decisively. Treat the queen prompt as the system's most important asset.
Queens should be smart, workers can be cheap: A common pattern is queen on the largest available model, workers on a smaller/cheaper model. This concentrates spend where it has the most leverage.
Queens can spawn queens: A queen-of-queens hierarchy works for very large tasks but adds latency. Two levels deep is usually the practical maximum.
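Queen failure is total: If the queen errors, the whole task fails. Hooks and budgets matter because they are the main ways to catch or contain a failing queen before she takes the task down with her.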
Worker independence is a feature: Workers should not assume context the queen did not pass. This forces clean interfaces.
Aggregation is the queen's hardest job: When three workers return, the queen has to reconcile their outputs. Patterns: simple concatenation, voting, weighted merge, queen-rewrite.
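Three of the aggregation patterns named in the last item are mechanical enough to sketch directly. The helpers below are illustrative assumptions, not a framework API; queen-rewrite is omitted because it is a model call rather than a function.

```python
from collections import Counter
from typing import Dict, List


def concatenate(outputs: List[str]) -> str:
    """Simple concatenation: keep every worker's output, in order."""
    return "\n\n".join(outputs)


def majority_vote(answers: List[str]) -> str:
    """Voting: return the most common answer (ties go to the first one seen)."""
    return Counter(answers).most_common(1)[0][0]


def weighted_merge(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted merge: average numeric scores, trusting some workers more than others."""
    total = sum(weights[w] for w in scores)
    return sum(scores[w] * weights[w] for w in scores) / total


verdict = majority_vote(["approve", "reject", "approve"])
confidence = weighted_merge(
    scores={"tester": 0.9, "reviewer": 0.6},
    weights={"tester": 2.0, "reviewer": 1.0},  # hypothetical trust weights
)
print(verdict, round(confidence, 2))
```

Voting and weighted merge only apply when workers return comparable answers, such as verdicts or scores. For heterogeneous outputs like code plus a review, concatenation or a queen-rewrite is the more natural fit.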

How Harnesses & Frameworks Implement This

Harness / Framework	Queen-led support	How
Claude Code	DIY via supervisor sub-agent	A "lead" sub-agent that dispatches via `Task`
Claude Agent SDK	DIY	Same shape, programmatic
ruflo	First-class — flagship topology	`ruflo-swarm` plugin, queen as default coordinator
LangGraph	DIY	Supervisor node routes to worker nodes
AutoGen	`GroupChat` with `speaker_selection_method="auto"`, run by a `GroupChatManager`	Manager picks the next speaker and so acts as queen
CrewAI	`Process.hierarchical` with manager agent	Built-in
OpenAI Agents SDK	DIY via handoffs	Lead agent hands off
Codex CLI	✗	N/A
Cursor	✗	N/A

Connections to Other Concepts

topology-as-design-decision.md — The framing concept.
mesh-topology.md, hive-mind-pattern.md — The alternatives.
supervisor-pattern-deep-dive.md — A near-synonym in framework speak.
sub-agents-as-primitives.md — The runtime substrate.
topology-selection-decision-tree.md — When to pick queen-led.