One-Line Summary: AgentDB (ruflo's purpose-built vector database) and vector stores generally are the harness's substrate for semantic recall — embeddings of past trajectories, code snippets, documents, and decisions are kept in a queryable index so the agent can retrieve relevant memories on demand.
Prerequisites: Harness-owned memory, retrieval-augmented generation
What Is a Harness-Layer Vector Store?
A vector store is a database that indexes content by embedding similarity rather than exact match. Instead of "find rows where text contains foo" you ask "find rows whose meaning is close to this query." For agents, this is how memory is retrieved when the relevant key isn't known in advance.
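Concretely, "meaning is close to this query" reduces to ranking stored embeddings by a similarity metric, most often cosine. A toy sketch with hand-picked three-dimensional vectors and hypothetical memory payloads (real embeddings have hundreds or thousands of dimensions):

```python
import math

def cosine(a, b):
    # Cosine similarity: dot product normalized by vector magnitudes.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy "store": each entry pairs an embedding with its payload.
store = [
    ([0.9, 0.1, 0.0], "fix: pin numpy<2 to resolve ABI error"),
    ([0.1, 0.9, 0.0], "decision: use Postgres for the job queue"),
    ([0.8, 0.2, 0.1], "error→fix: ABI mismatch after numpy upgrade"),
]

def search(query_vec, k=2):
    # Rank entries by similarity to the query embedding, not exact match.
    ranked = sorted(store, key=lambda e: cosine(e[0], query_vec), reverse=True)
    return [payload for _, payload in ranked[:k]]

print(search([0.85, 0.15, 0.05]))  # the two ABI-related memories rank first
```

A flat scan like this is exact but O(n) per query; approximate indexes such as HNSW exist precisely to avoid comparing against every stored vector.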
AgentDB is ruflo's vector store, optimized for agent-shaped workloads: high write rate (every trajectory writes new memories), low-latency reads (every turn does retrieval), and rich metadata (each entry carries trajectory ID, timestamp, success/failure label). It uses HNSW for indexing (see hnsw-for-agent-recall.md).
Other harnesses use different stores: Claude Code can integrate vector stores via MCP servers; LangGraph has memory store abstractions with pluggable backends; CrewAI has its own short-term and long-term memory layers backed by vector DBs.
How It Works in a Harness
The harness pipeline:
- Write: At the end of a session (or after meaningful events), the harness extracts memorable artifacts (decisions, error→fix pairs, project conventions discovered) and writes them as embeddings + metadata.
- Read on demand: At each turn, the harness embeds the current task and queries the store for the top-K most similar memories.
- Inject into context: Retrieved memories are added to the prompt as additional context.
The hard part is deciding what to retrieve and when. Naively retrieving on every turn is expensive in tokens and noisy; retrieving only when the agent flags uncertainty is cheap but misses opportunities. A middle ground is to expose retrieval as a tool the agent can call explicitly: ruflo and CrewAI both expose memory queries this way.
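The write/read/inject loop above can be sketched end to end. Everything here (`embed`, `MemoryStore`, the metadata keys) is an illustrative stand-in, not ruflo's or AgentDB's actual API; a real harness would call an embedding model instead of the toy hash below.

```python
import time

def embed(text):
    # Toy deterministic embedding: bag-of-characters hashed into 8 buckets.
    # A real harness would call an embedding model here.
    vec = [0.0] * 8
    for ch in text.lower():
        vec[ord(ch) % 8] += 1.0
    norm = sum(x * x for x in vec) ** 0.5 or 1.0
    return [x / norm for x in vec]

class MemoryStore:
    def __init__(self):
        self.entries = []  # (embedding, payload, metadata) triples

    def write(self, text, **metadata):
        # Write path: called at session end or after meaningful events.
        metadata.setdefault("ts", time.time())
        self.entries.append((embed(text), text, metadata))

    def query(self, text, k=3):
        # Read path: embed the current task, rank stored memories by
        # similarity (dot product of unit vectors = cosine).
        q = embed(text)
        dot = lambda a, b: sum(x * y for x, y in zip(a, b))
        ranked = sorted(self.entries, key=lambda e: dot(e[0], q), reverse=True)
        return [payload for _, payload, _ in ranked[:k]]

store = MemoryStore()
store.write("pinned numpy<2 to fix ABI mismatch", trajectory="t-17", success=True)
store.write("convention: unit tests live under tests/unit", trajectory="t-18", success=True)

# Inject path: retrieved memories become extra prompt context.
memories = store.query("numpy ABI error after upgrade", k=2)
context_block = "Relevant memories:\n" + "\n".join(f"- {m}" for m in memories)
```

Exposing `MemoryStore.query` behind a tool schema is what makes retrieval agent-invokable rather than harness-forced.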
Why It Matters
A vector store turns a transient agent into one with continuity. Without it, every session starts from CLAUDE.md and the immediate context. With it, the agent can recall: "this is the third time we've seen this error; here is what fixed it last time." That recall is the difference between a fresh assistant and a co-worker who has been with the team for months.
Key Technical Details
- Embedding model is the foundation: Cheap embeddings (Cohere small, OpenAI text-embedding-3-small) work for most cases. Code-aware embeddings (Voyage AI, Jina) help for code-heavy projects.
- Index type trade-offs: HNSW is the practical default (fast reads, decent writes, large memory). Flat indexes are exact but slow. IVF / PQ trade accuracy for size.
- Metadata filtering matters: Retrieving by similarity alone returns relevant-but-stale entries. Combine with metadata: time window, project, success label.
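A sketch of filter-then-rank, using a hypothetical entry schema that carries project, success, and timestamp metadata alongside the embedding:

```python
import time

WEEK = 7 * 24 * 3600

entries = [
    {"vec": [1.0, 0.0], "text": "fix A", "project": "api", "success": True,
     "ts": time.time() - 2 * WEEK},
    {"vec": [0.9, 0.1], "text": "fix B", "project": "api", "success": True,
     "ts": time.time() - 3600},
    {"vec": [0.95, 0.05], "text": "failed attempt", "project": "api",
     "success": False, "ts": time.time() - 3600},
]

def filtered_search(qvec, project, max_age=4 * WEEK, k=5):
    now = time.time()
    # Pre-filter on metadata, then rank the survivors by similarity.
    pool = [e for e in entries
            if e["project"] == project
            and e["success"]
            and now - e["ts"] <= max_age]
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    return sorted(pool, key=lambda e: dot(e["vec"], qvec), reverse=True)[:k]

hits = filtered_search([1.0, 0.0], project="api", max_age=WEEK)
# Only "fix B" survives: recent, same project, and marked successful.
```

Production stores (Qdrant, Pinecone, AgentDB per its description here) push these filters into the index itself rather than post-filtering, which matters when the filter is selective.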
- Stale memories rot: Without curation, vector stores accumulate outdated entries. Periodic compaction (delete contradicted entries, summarize redundant ones) is a real maintenance task.
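One way to sketch a compaction pass, assuming a hypothetical schema in which newer entries record which older entries they supersede:

```python
entries = {
    "m1": {"text": "use node 16", "supersedes": [], "ts": 100},
    "m2": {"text": "use node 20", "supersedes": ["m1"], "ts": 200},
    "m3": {"text": "lint with ruff", "supersedes": [], "ts": 150},
}

def compact(entries):
    # Delete any entry that a newer entry has explicitly contradicted.
    dead = {old for e in entries.values() for old in e["supersedes"]}
    return {k: v for k, v in entries.items() if k not in dead}

entries = compact(entries)  # m1 is dropped; m2 and m3 remain
```

The "summarize redundant ones" half of compaction typically needs an LLM pass over clusters of near-duplicate entries, so it runs as a periodic batch job rather than inline.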
- Per-project vs. cross-project: Most harnesses scope memory per project to avoid leaking one project's conventions into another; ruflo's cross-project memory is opt-in.
- Storage cost: At ruflo's claimed scale (millions of trajectories), even compact vectors add up. Tiered storage (hot in-memory, warm on disk, cold archived) is standard.
- Query latency: Sub-50ms for top-10 from 1M vectors is achievable with HNSW. Beyond that scale, the harness has to either pre-fetch results or trim the index.
How Harnesses & Frameworks Implement This
| Harness / Framework | Vector store substrate |
|---|---|
| Claude Code | None native — bring your own via MCP server |
| Claude Agent SDK | Memory adapters; bring your own backend |
| ruflo | AgentDB (purpose-built, HNSW-backed) |
| LangGraph | MemoryStore abstraction; backends include in-memory, Postgres, Pinecone |
| AutoGen | Add-on (mem0, memori) |
| CrewAI | Built-in short-term + long-term memory; backends include Chroma, Qdrant |
| OpenAI Agents SDK | DIY |
| Codex CLI | None |
| Cursor | Internal codebase index (proprietary) |
Connections to Other Concepts
- hnsw-for-agent-recall.md — The dominant index data structure.
- harness-owned-memory.md — The category this concept fits in.
- reasoning-bank.md — A specialized vector store for trajectories.
- cross-session-memory-strategies.md — How retrieval is invoked across sessions.
- ../../ai-agent-concepts/06-knowledge-and-retrieval/vector-databases.md — Foundational coverage.
Further Reading
- ruvnet, ruflo agentdb documentation.
- LangGraph, Memory Store documentation.