A* Planner for Agents

One-Line Summary: A* is the classical heuristic search algorithm at the heart of GOAP and most structured agent planners — it finds the lowest-cost action sequence from current state to goal state by expanding nodes in order of cost-so-far + estimated-cost-to-goal, and it is the workhorse of any harness that does plan-shaped (rather than chain-of-thought-shaped) planning.

Prerequisites: Goal-oriented action planning, agent state management

What Is A*?

A* (pronounced "A-star") is a graph search algorithm that finds the cheapest path from a start node to a goal node when each edge has a cost and you have an admissible heuristic — a lower-bound estimate of the remaining cost from any node to the goal. A* expands nodes in priority order: node with the lowest f(n) = g(n) + h(n) goes next, where g(n) is the cost so far and h(n) is the heuristic.

A* is complete (finds a solution if one exists) and optimal (finds the lowest-cost solution) when the heuristic is admissible. It generalizes BFS (when h = 0) and Dijkstra's algorithm (when h = 0 but with edge weights).

How A* Applies to Agents

In an agent-planner, the graph nodes are world states (or partial plans), edges are actions (tools/skills), edge cost is the action's cost (tokens, latency, dollars), and the goal is the target state. A* searches through possible action sequences and returns the lowest-cost one.

Two pieces are non-trivial in the agent setting:

State representation: A "world state" has to capture enough to decide whether an action's preconditions are met. Pure string states don't compose; structured states (typed dicts, knowledge graphs) do.
Heuristic design: A good heuristic dominates A*'s performance. For agent tasks, common heuristics: number of unmet sub-goals, shortest known path through similar past tasks, LLM-estimated cost.

Why It Matters

A* gives an agent system a plan it can verify before executing. That is a profound difference from chain-of-thought planning, where the LLM emits a plan and you have to trust it. With A*-shaped planning, you know the plan's preconditions are met because the search algorithm only returned plans whose preconditions are met.

The cost is structure: every action has to be annotated with preconditions and effects, which is a discipline most agent codebases avoid. The trade is genuine — A* planners are more reliable but require more upfront work.

Key Technical Details

Admissibility: A heuristic is admissible if it never overestimates remaining cost. Admissible heuristics give optimal A*; non-admissible can be faster but lose optimality.
Consistency: A heuristic is consistent if h(n) ≤ cost(n,n') + h(n') for any neighbor n'. Consistent heuristics let A* skip re-expansion.
Memory cost: A* keeps an open set of frontier nodes. For agent action graphs with many parallel branches, this grows fast.
Iterative deepening A (IDA)**: A memory-efficient variant. Trades runtime for memory.
LLM-as-heuristic: A surprisingly effective hybrid: A* drives the search structure; an LLM scores how close a state is to the goal. Expensive per call; few calls needed.
Re-planning: When execution diverges (a tool fails), the planner re-runs from the new state. A* is fast enough on small action graphs that re-planning each turn is feasible.
Action graph size: A* is great for graphs of ~100s of actions. For very large graphs (1000s of tools), A* alone is impractical; combine with a coarse-grained pre-filter.

How Harnesses & Frameworks Implement This

Harness / Framework	A* / GOAP planner
Claude Code	✗ — chain-of-thought planning
Claude Agent SDK	DIY
ruflo	First-class — `ruflo-goals` plugin
LangGraph	DIY — graph as action space; user implements search
AutoGen	✗
CrewAI	✗
OpenAI Agents SDK	✗
Codex CLI	Implicit "plan mode" but chain-of-thought, not A*
Cursor	✗

Connections to Other Concepts

goal-oriented-action-planning.md — A* is the search engine inside GOAP.
adaptive-replanning.md — A* re-runs when execution diverges.
plan-graphs-vs-plan-strings.md — Why structured plans (A*'s output) beat chain-of-thought plans.
multi-step-plan-evaluation.md — How to evaluate plans an A* search produced.