One-Line Summary: PII gating is the harness-layer scrubbing of personally identifiable information (and secrets, credentials, sensitive metadata) from data flowing across trust boundaries; ruflo's AIDefence plugin is the reference implementation, identifying 14+ classes of sensitive data and either redacting, blocking, or alerting based on configured policy.

Prerequisites: Cross-machine agent federation, permission and tool scoping primitives

What Is PII Gating?

When agent state crosses a trust boundary — sent to a federated peer, written to an external log, posted to a third-party API — there is a question to answer: does this contain anything that should not leave? PII gating is the discipline of asking that question consistently, automatically, and at the harness layer rather than per-tool.

The categories of "sensitive" extend beyond classical PII:

  • Personal: names, emails, phone numbers, addresses, SSNs, DOBs.
  • Credentials: API keys, passwords, OAuth tokens, SSH keys, AWS credentials.
  • Internal: internal hostnames, IP ranges, service names, commit messages with vulnerability details.
  • Confidential business: customer data, unreleased product details, financial figures.
  • Code-level: secrets accidentally committed, embedded credentials in code or config.

A PII gate scans outbound payloads for these categories, redacts what it finds (or blocks, depending on policy), and emits an audit event.

How It Works

A typical PII-gating pipeline:

  1. Hook into outbound traffic: All outbound tool calls, federation messages, and log writes are intercepted.
  2. Detection: Pattern matching (regex for credit cards, SSNs), entropy heuristics (high-entropy strings look like keys), trained classifiers (NER for names), and vocabulary lookups (hostnames, API key prefixes).
  3. Action: Redact (replace with [REDACTED-EMAIL]), block (refuse the operation), or alert (allow but warn).
  4. Audit: Log the detection event with category and action.
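
The four steps above can be sketched in a few dozen lines. This is a minimal illustration, not ruflo's AIDefence implementation: the pattern set, the POLICY table, the 4.0 bits-per-character entropy threshold, and the names gate and GateResult are all assumptions made for the example.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass

# Hypothetical detectors, compiled once at import time and reused per payload.
PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "AWS_KEY": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}
# Long unbroken tokens are candidates for the entropy heuristic.
TOKEN = re.compile(r"[A-Za-z0-9+/_=-]{24,}")

# Hypothetical policy: category -> "redact" | "block" | "alert".
POLICY = {"EMAIL": "redact", "SSN": "block", "AWS_KEY": "block",
          "HIGH_ENTROPY": "redact"}


def shannon_entropy(s: str) -> float:
    """Bits per character; key-like strings score high."""
    counts = Counter(s)
    return -sum(n / len(s) * math.log2(n / len(s)) for n in counts.values())


@dataclass
class GateResult:
    allowed: bool
    payload: str
    audit: list


def gate(payload: str, destination: str) -> GateResult:
    # Step 2: detection (regex patterns plus an entropy heuristic).
    findings = [(cat, m.group())
                for cat, rx in PATTERNS.items() for m in rx.finditer(payload)]
    findings += [("HIGH_ENTROPY", m.group()) for m in TOKEN.finditer(payload)
                 if shannon_entropy(m.group()) > 4.0]  # assumed threshold

    audit, blocked = [], False
    for category, value in findings:
        # Step 3: action. Unknown categories fall back to redaction.
        action = POLICY.get(category, "redact")
        if action == "block":
            blocked = True
        elif action == "redact":
            payload = payload.replace(value, f"[REDACTED-{category}]")
        # "alert" leaves the payload unchanged and only records the event.
        # Step 4: audit. Record category and action, never the matched value.
        audit.append({"category": category, "action": action,
                      "destination": destination})
    return GateResult(allowed=not blocked,
                      payload="" if blocked else payload, audit=audit)
```

Step 1 (the hook) is whatever interception point the harness offers; it would call gate(payload, destination) and send result.payload only when result.allowed is true. Keeping the matched value out of the audit record avoids turning the audit log into a second leak.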

Ruflo's AIDefence plugin layers additional behaviors on top: prompt-injection detection on inbound payloads, CVE pattern detection in code being shared, and "trust-score adjustments" when peers attempt to send data that looks suspicious.

Why It Matters

Outbound data leaks are among the most-cited risks in enterprise agent deployments. A tool call that ships a project's whole .env file to a third-party API is a breach. A federated agent that includes raw customer data in a shared decision is a privacy incident. PII gating is the harness's promise that this category of mistake doesn't slip through silently.

The other reason it matters: it's a regulatory checkbox. GDPR, HIPAA, and SOC 2 controls all require demonstrable handling of sensitive data. A harness with first-class PII gating makes that demonstration straightforward, because detections and actions are already logged in one place; one without it leaves every team to build and evidence those controls per tool.

Key Technical Details

  • Detection precision and recall trade off: Aggressive regex catches more (high recall) but produces false positives (low precision). NER models are more precise on free-text entities such as names but cost compute. Tune to your tolerance.
  • Custom dictionaries: Every organization has its own sensitive terms (project codenames, internal endpoints). PII gates need extension points.
  • Scope by destination: A name being sent to a federated peer is different from the same name being written to a local log. Same data, different policies.
  • Reversible vs. irreversible redaction: Reversible redaction (the value is replaced with a token and the mapping is stored) lets the receiver request unredaction with proper authority. Irreversible redaction is safer, because no stored mapping exists to leak.
  • False negatives are the dangerous failure: A PII gate that misses a leak is worse than one that over-redacts. Default to over-redact and let users adjust.
  • Composability with sandboxing: PII gating + sandboxing (network egress restriction) gives layered defense. Don't rely on one alone.
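  • Performance: Scanning every outbound payload adds latency. Compile patterns once and reuse them aggressively. Run scans asynchronously only where the policy is alert-only; redact and block decisions must complete before the payload leaves.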
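
Two of the details above, destination-scoped policy and reversible redaction, are easier to see in code. The sketch below is illustrative only: POLICY_BY_DESTINATION, the destination names, the "redact-reversible" action, and the in-memory TokenVault are hypothetical, and a real vault would be a protected store with its own access control.

```python
import re
import secrets

# Compiled once at import time and reused for every payload.
EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")

# Hypothetical destination-scoped policy: same category, different action.
POLICY_BY_DESTINATION = {
    "local-log": {"EMAIL": "alert"},
    "federated-peer": {"EMAIL": "redact-reversible"},
    "third-party-api": {"EMAIL": "redact"},
}


class TokenVault:
    """In-memory stand-in for a protected token-to-original mapping."""

    def __init__(self) -> None:
        self._store: dict[str, str] = {}

    def tokenize(self, category: str, value: str) -> str:
        token = f"[{category}:{secrets.token_hex(4)}]"
        self._store[token] = value
        return token

    def unredact(self, token: str, authorized: bool) -> str:
        if not authorized:
            raise PermissionError("caller may not unredact")
        return self._store[token]


def scrub_email(payload: str, destination: str, vault: TokenVault) -> str:
    # Unknown destinations fall through to irreversible redaction,
    # which matches the over-redact default described above.
    action = POLICY_BY_DESTINATION.get(destination, {}).get("EMAIL", "redact")
    if action == "alert":
        return payload  # allowed unchanged; a real gate would also log it
    if action == "redact-reversible":
        return EMAIL.sub(lambda m: vault.tokenize("EMAIL", m.group()), payload)
    return EMAIL.sub("[REDACTED-EMAIL]", payload)
```

With this shape, the same message passes untouched to a local log, carries a recoverable token to a federated peer, and is irreversibly scrubbed for a third-party API. The vault is the new thing to protect: whoever can read it can undo every reversible redaction.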

How Harnesses & Frameworks Implement This

Harness / Framework   | PII gating
Claude Code           | DIY via PreToolUse hooks
Claude Agent SDK      | DIY
ruflo                 | First-class: ruflo-aidefence (14 PII classes + injection + CVE detection)
LangGraph             | DIY
AutoGen               | DIY
CrewAI                | Limited
OpenAI Agents SDK     | Guardrails partially overlap
Codex CLI / Cursor    | Limited
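
As an illustration of the "DIY via PreToolUse hooks" row, a hook script might look like the sketch below. It assumes the Claude Code command-hook contract in which the event arrives as JSON on stdin with tool_name and tool_input fields, and in which exit code 2 blocks the tool call and returns stderr to the model; verify this against the current hooks documentation before relying on it. SECRET_PATTERNS and check_event are hypothetical names, and two patterns are far from complete coverage.

```python
import json
import re
import sys

# Hypothetical patterns; a real gate would cover many more categories.
SECRET_PATTERNS = {
    "AWS_KEY": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "PRIVATE_KEY": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def check_event(event: dict) -> tuple[int, str]:
    """Return (exit_code, message) for one PreToolUse event."""
    # Serialize the whole tool input so every argument gets scanned.
    blob = json.dumps(event.get("tool_input", {}))
    hits = sorted(cat for cat, rx in SECRET_PATTERNS.items() if rx.search(blob))
    if hits:
        tool = event.get("tool_name", "unknown")
        return 2, f"PII gate blocked {tool}: found {', '.join(hits)}"
    return 0, ""


def main() -> int:
    # Assumed contract: event JSON on stdin, exit code 2 means "block".
    code, message = check_event(json.load(sys.stdin))
    if message:
        print(message, file=sys.stderr)
    return code
```

A deployed script would end with sys.exit(main()) and be registered as a PreToolUse command hook in the harness settings. Note that this only blocks; redacting the tool input in place would need a different mechanism.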

Connections to Other Concepts

  • cross-machine-agent-federation.md — The natural setting.
  • permission-and-tool-scoping-primitives.md — Same defense-in-depth philosophy.
  • prompt-injection-defense-in-harnesses.md — The other side of harness-layer protection.
  • behavioral-trust-scoring.md — PII detections feed trust score updates.

Further Reading

  • ruvnet, ruflo-aidefence documentation.
  • Microsoft Presidio — open-source PII detection used by some harnesses as a backend.