Agentic Patterns
An “agent” is what you get when you put an LLM in a loop with tools. The patterns below — ReAct, Reflection, Planning — are the recurring shapes that production agentic systems take. They’re less exotic than the word suggests.
The five-bullet version
- An agent is an LLM placed in a loop with tools. Each iteration: the model picks a tool, runs it, looks at the result, decides what to do next.
- ReAct — interleave reasoning and tool use. The model writes a “thought” before each action and an “observation” after.
- Reflection — let the model critique its own output and iterate. Improves quality on writing, code, and reasoning.
- Planning — decompose the goal into a plan first, then execute step by step. Better for long horizons; fragile if the world doesn’t match the plan.
- Most production agents are simpler than the literature suggests — usually one of these patterns with strict guardrails, not all three at once.
§ 00 · FROM ONE-SHOT TO A LOOP · What “agentic” actually means
In a normal LLM call, you send a prompt, get a response, and you’re done. One round-trip. The model only ever sees what was in the prompt. It can’t reach the database, check the weather, or send an email.
An agent flips this. The “agency” comes from the model — not from any other mechanism — deciding which tool to use and when. You give the model a set of tools — really just functions it can call — and put the whole thing in a loop. At every iteration, the model can:
- Decide it has enough information and emit a final answer (loop ends).
- Pick a tool, supply arguments, get back a result, and continue the loop.
That’s the whole framework. The patterns in this lesson are different ways of structuring what the model does inside the loop — when it reasons, when it acts, when it stops, when it second-guesses itself.
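The loop described above can be sketched in a few lines. This is a minimal illustration, not any provider’s API: the `model` here is a hypothetical stand-in for an LLM call, and the decision schema (`"final_answer"` / `"tool_call"`) is invented for the example.

```python
# Minimal agent loop — a sketch, not a real runtime. `model` is a hypothetical
# stand-in for an LLM call; real systems would call a provider API instead.

def run_agent(model, tools, goal, max_steps=10):
    """Loop: ask the model, execute tool calls, stop on a final answer."""
    history = [{"role": "user", "content": goal}]
    for _ in range(max_steps):
        decision = model(history)                   # the model picks the next move
        if decision["type"] == "final_answer":      # enough information: loop ends
            return decision["content"]
        tool = tools[decision["tool"]]              # pick a tool...
        result = tool(**decision["args"])           # ...run it...
        history.append({"role": "tool", "content": str(result)})  # ...observe
    return "step cap reached"

# Toy demo: a scripted "model" that looks up a fact, then answers with it.
def scripted_model(history):
    if not any(m["role"] == "tool" for m in history):
        return {"type": "tool_call", "tool": "lookup", "args": {"key": "capital_fr"}}
    return {"type": "final_answer", "content": history[-1]["content"]}

tools = {"lookup": lambda key: {"capital_fr": "Paris"}[key]}
print(run_agent(scripted_model, tools, "What is the capital of France?"))  # Paris
```

Note the step cap on the loop — it matters more than it looks, as § 04 argues.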
§ 01 · REACT — THINK, ACT, OBSERVE · The default pattern
ReAct (Reasoning + Acting), a 2022 prompting pattern that is now standard in modern agent stacks, is the simplest and most common shape. The model is prompted to alternate two kinds of output:
- Thought — free-text reasoning about what to do next. The model talks to itself.
- Action — a structured tool call (function name and arguments).
After each action, the runtime executes the tool and feeds the result back as an observation. The model reads the observation and writes its next thought. Loop until the model decides it has enough to answer.
Three things ReAct gets right:
- Visible reasoning. The thoughts make it possible to audit what the agent was doing. When it goes wrong, the trace tells you why.
- Mixed reasoning and lookup. The model isn’t required to know things — it can look them up. And it isn’t required to think alone — it can call calculators, search, or code execution.
- Adaptive depth. Easy questions terminate in one step. Hard questions take ten. The pipeline isn’t hardcoded.
The pattern is so natural that modern API tool-calling support (OpenAI function calling, Anthropic tool use, Gemini function calling) is essentially ReAct with the “Thought” field made implicit — the model still reasons, but it does so silently between tool calls.
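The Thought → Action → Observation cycle can be made concrete with a scripted stand-in for the model. The field names (`"thought"`, `"action"`, `"answer"`) are illustrative, not any provider’s schema — the point is the alternation and the visible trace.

```python
# ReAct-shaped loop: alternate Thought / Action / Observation until the model
# emits a final answer. `react_model` is a scripted stand-in for an LLM.

def react_loop(react_model, tools, question, max_steps=5):
    trace = []
    for _ in range(max_steps):
        step = react_model(question, trace)
        trace.append(("Thought", step["thought"]))      # visible reasoning
        if "answer" in step:
            return step["answer"], trace                # model decided to stop
        obs = tools[step["action"]](step["input"])      # runtime executes the tool
        trace.append(("Action", f'{step["action"]}({step["input"]})'))
        trace.append(("Observation", obs))              # fed back next iteration
    return None, trace

# Scripted model: one lookup, then answer from the observation.
def react_model(question, trace):
    if not trace:
        return {"thought": "I should look this up.", "action": "search", "input": "step cap"}
    return {"thought": "The observation answers it.", "answer": trace[-1][1]}

tools = {"search": lambda q: "Default step cap: 10 iterations."}
answer, trace = react_loop(react_model, tools, "What is the default step cap?")
```

The `trace` list is exactly the audit trail the first bullet above describes: when the agent goes wrong, you read it back.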
§ 02 · REFLECTION — LET THE MODEL CRITIQUE ITSELF · Two-pass quality
Reflection is the simplest quality booster in the toolkit. After the model produces a first draft (writing, code, an answer, a plan), you ask the same model — in a separate LLM call, usually with a different prompt — to critique that draft. Then ask it to revise based on the critique.
Three steps:
- Draft. Normal generation. “Write a function that does X.”
- Critique. “Here’s the function. List problems with it: edge cases, performance, style, bugs.”
- Revise. “Rewrite the function to address the critique.”
For some tasks, you can run the critique–revise loop multiple times. Diminishing returns kick in fast — one round usually helps a lot, two help a little, three are noise — but the first reflection is often worth it.
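The three steps can be sketched as three calls to the same model with different prompts. `llm` below is a hypothetical single-turn completion function; the toy stand-in just makes the flow visible.

```python
# Draft → critique → revise, each as a separate call to the same model with a
# different prompt. `llm` is a hypothetical single-turn completion function.

def reflect(llm, task, rounds=1):
    draft = llm(f"Do the task: {task}")
    for _ in range(rounds):                  # diminishing returns after ~1 round
        critique = llm(f"List problems with this attempt at '{task}':\n{draft}")
        draft = llm(f"Rewrite to address the critique.\nTask: {task}\n"
                    f"Attempt: {draft}\nCritique: {critique}")
    return draft

# Toy stand-in model: appends a marker each pass so the flow is visible.
def toy_llm(prompt):
    if prompt.startswith("Do the task"):
        return "v1"
    if prompt.startswith("List problems"):
        return "too terse"
    return prompt.split("Attempt: ")[1].split("\n")[0] + "+revised"

print(reflect(toy_llm, "summarize the report"))  # v1+revised
```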
Where reflection helps most:
- Writing tasks — drafts have predictable weaknesses (vagueness, repetition, missing context) that a critique step catches.
- Code generation — first-pass code often misses edge cases or has minor bugs the model can spot in review.
- Multi-step reasoning — when the model is asked to double-check a chain of arithmetic or logic, it catches its own mistakes more often than people expect.
Where reflection doesn’t help: factual recall (the model can’t critique what it doesn’t know), and tasks the model is already saturated on (correct first time, reflection adds noise).
§ 03 · PLANNING — DECOMPOSE FIRST, ACT AFTER · Goals into steps
For long-horizon tasks — multi-day research, building a feature, doing a quarterly review — a single tool-calling loop is too myopic. The model picks one action, gets one result, picks the next action; it doesn’t see the bigger structure.
Planning adds an explicit step: before any actions, the model decomposes the goal into a sequence of subtasks. Then it executes the plan, step by step, revising it between steps when observations demand it.
A planning agent typically has three phases:
- Plan. One LLM call that produces a numbered list of steps to accomplish the goal. The plan can include conditionals (“if X then do Y else Z”) and parallelizable branches.
- Execute. Run each step, often as a ReAct sub-loop. Track which steps are done, which failed, which need to be re-attempted.
- Replan. When a step fails or returns surprising results, revise the plan from where it broke.
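The three phases can be wired together in a small sketch. `make_plan` and `revise_plan` stand in for LLM calls, and `execute_step` could itself be a ReAct sub-loop in a real system; the replan cap guards against the plan-thrashing failure mode discussed below.

```python
# Plan → execute → replan, sketched with stub functions standing in for LLM calls.

def run_with_plan(make_plan, execute_step, revise_plan, goal, max_replans=3):
    plan = make_plan(goal)                   # one LLM call → list of steps
    done = []
    for _ in range(max_replans + 1):
        for step in plan:
            ok, result = execute_step(step)
            if not ok:                       # surprise: revise from where it broke
                plan = revise_plan(goal, done, failed=step)
                break
            done.append((step, result))
        else:
            return done                      # every step succeeded
    raise RuntimeError("plan thrashing: too many replans")

# Toy run: step "b" fails once and gets replaced by "b2" on the replan.
make_plan = lambda goal: ["a", "b", "c"]
revise_plan = lambda goal, done, failed: ["b2", "c"]
attempts = {"a": (True, 1), "b": (False, None), "b2": (True, 2), "c": (True, 3)}
execute_step = lambda step: attempts[step]
print(run_with_plan(make_plan, execute_step, revise_plan, "demo"))
```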
Planning shines when:
- The horizon is long (5+ steps).
- The cost of doing the wrong thing is high (refunding the wrong customer, sending the wrong email).
- Steps can be parallelized (you want to fan out research across 10 sub-questions, then synthesize).
Planning fails when the plan turns out to be wrong but the agent stays committed to it. The model can spend ten steps faithfully executing a plan that doesn’t match reality. The replanning step helps but introduces its own complexity — and risks plan thrashing (replan, fail, replan, fail, replan…).
§ 04 · WHERE THIS GETS YOU, AND WHERE IT BITES · Production realities
Three honest observations from running agent systems at scale:
- Most production agents are simpler than the literature suggests. A single ReAct loop with 3–6 well-chosen tools and strict guardrails will do 90% of the work people gesture at when they say “multi-agent system.” Don’t reach for the complex patterns first.
- Latency multiplies fast. Each loop iteration is at least one LLM call plus a tool call. A 10-step agent that takes 2 seconds per step is 20 seconds end-to-end. Plan for it, or design for streaming intermediate state.
- The hard part isn’t the agent — it’s the tools. A great agent over bad tools (slow, unreliable, poorly-documented APIs) performs poorly. A mediocre agent over excellent tools performs well. Most agent debugging is tool debugging.
Six guardrails worth building in from day one:
- Step cap. Hard limit on loop iterations. 10 is a reasonable default. Agents fall into infinite loops surprisingly often.
- Token budget. Cap on total tokens (in + out) per run. Easy to blow $1+ in tokens per agent run if you don’t.
- Tool allowlist per context. Read-only tools for customer chat; write tools only for authenticated admin tasks. Even if the model wants to call something risky, the runtime refuses.
- Trace logging. Every thought, action, observation, and tool input/output to a structured log. You will need this for debugging. Always.
- Human approval for irreversible actions. Sending email, charging cards, deleting data — always behind a confirmation step until the agent has earned more trust.
- Evals. A small set of (goal, expected outcome) pairs that exercise the agent end-to-end. Run before every change to the prompt or toolset.
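Three of the guardrails above — step cap, token budget, and a per-context tool allowlist — can live directly in the loop. A sketch, with assumptions labeled: token counts are approximated by character length here, where a real system would use the provider’s tokenizer, and the decision schema is invented for the example.

```python
# Guardrailed agent loop: step cap, token budget, per-context tool allowlist.

class GuardrailError(Exception):
    pass

def guarded_run(model, tools, goal, allowlist, max_steps=10, token_budget=50_000):
    spent = 0
    history = [goal]
    for _ in range(max_steps):                   # step cap: hard iteration limit
        decision = model(history)
        spent += len(str(history)) // 4          # crude in+out token estimate
        if spent > token_budget:
            raise GuardrailError("token budget exceeded")
        if decision["type"] == "answer":
            return decision["content"]
        if decision["tool"] not in allowlist:    # runtime refuses risky tools
            raise GuardrailError(f"tool {decision['tool']!r} not allowed here")
        history.append(str(tools[decision["tool"]](**decision["args"])))
    raise GuardrailError("step cap reached")

# A read-only context refuses a write tool even though the model asked for it.
model = lambda h: {"type": "tool_call", "tool": "delete_row", "args": {"id": 7}}
tools = {"delete_row": lambda id: f"deleted {id}"}
try:
    guarded_run(model, tools, "clean up", allowlist={"search"})
except GuardrailError as e:
    print(e)
```

The key design point is that the refusal happens in the runtime, not the prompt — the model can want the risky tool all it likes.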
§ 05 · TAKING THIS FORWARD · Where the field is going
Three threads worth following beyond the basic patterns:
- Long-running agents. Most production agents finish in seconds. The next generation runs for minutes, hours, or days — completing research projects, writing PRs, monitoring systems. The hard parts: state persistence, recovery from interruption, observability.
- Multi-agent coordination. Multiple agents with different specialties cooperating on a task. Architecturally appealing; in practice the coordination overhead often eats the specialization gains. Used well in narrow vertical domains; less successful in general-purpose products so far.
- Tool-use post-training. Models fine-tuned to use specific tool sets reliably. The future of “agents” may be less about prompting tricks and more about models that have been trained against your tools. The line between the base model and the agent runtime is starting to blur.
For a practical first agent, the formula is unromantic: ReAct loop, 3–5 tools you trust, a tight system prompt, strict guardrails, and evals from day one. Most of the magic comes from the model. Your job is to surround it with the right surface.
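The “evals from day one” part of that formula is the easiest to put off and the cheapest to build. A minimal harness over (goal, expected outcome) pairs — `agent` here is any callable taking a goal string, a stand-in for your real runtime:

```python
# Minimal eval harness: run (goal, expected outcome) pairs end-to-end before
# every change to the prompt or toolset. `agent` stands in for the real runtime.

def run_evals(agent, cases):
    results = []
    for goal, expected in cases:
        try:
            got = agent(goal)
            results.append((goal, got == expected, got))
        except Exception as e:               # a crash is a failure, not a halt
            results.append((goal, False, repr(e)))
    passed = sum(ok for _, ok, _ in results)
    return passed, len(cases), results

cases = [
    ("What is 2 + 2?", "4"),
    ("Capital of France?", "Paris"),
]
toy_agent = lambda goal: {"What is 2 + 2?": "4", "Capital of France?": "Paris"}[goal]
passed, total, _ = run_evals(toy_agent, cases)
print(f"{passed}/{total} passed")  # 2/2 passed
```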
§ · GOING DEEPER · ReAct, Reflexion, and how to build an agent that doesn't loop forever
ReAct (Yao et al. 2022) defined the dominant agentic pattern: Thought (free-text reasoning) → Action (tool call) → Observation (result) → loop. Reflexion (Shinn et al. 2023) adds an explicit critique step — after a failed attempt, the model writes down what went wrong and uses that in the next iteration. For tasks with verifiable success criteria (passing a test, hitting an endpoint), Reflexion meaningfully improves multi-shot success rate.
The hard part of production agents isn’t the prompting, it’s the control surface: timeouts, retry policy, max loop iterations, tool-call validation, partial-failure recovery. Anthropic’s “Building Effective Agents” (2024) lays out the patterns that actually ship: prompt-chaining for sequential tasks, routing for classification, parallelization for fan-out, orchestrator-workers for hierarchical task decomposition. Reach for the simplest pattern that fits the task.
§ · FURTHER READING · References & deeper sources
- Yao et al. (2022). ReAct: Synergizing Reasoning and Acting in Language Models · ICLR
- Shinn et al. (2023). Reflexion: Language Agents with Verbal Reinforcement Learning · NeurIPS
- Park et al. (2023). Generative Agents: Interactive Simulacra of Human Behavior · UIST
- Wang et al. (2023). Voyager: An Open-Ended Embodied Agent with LLMs · arXiv
- Anthropic (2024). Building Effective Agents · Anthropic Engineering
Original figures live in the linked sources — open the papers for the canonical visuals in their full context.