One-Line Summary: PII gating is the harness-layer scrubbing of personally identifiable information (and secrets, credentials, sensitive metadata) from data flowing across trust boundaries; ruflo's AIDefence plugin is the reference implementation, identifying 14+ classes of sensitive data and either redacting, blocking, or alerting based on configured policy.
Prerequisites: Cross-machine agent federation, permission and tool scoping primitives
What Is PII Gating?
When agent state crosses a trust boundary — sent to a federated peer, written to an external log, posted to a third-party API — there is a question to answer: does this contain anything that should not leave? PII gating is the discipline of asking that question consistently, automatically, and at the harness layer rather than per-tool.
The categories of "sensitive" extend beyond classical PII:
- Personal: names, emails, phone numbers, addresses, SSNs, DOBs.
- Credentials: API keys, passwords, OAuth tokens, SSH keys, AWS credentials.
- Internal: internal hostnames, IP ranges, service names, commit messages with vulnerability details.
- Confidential business: customer data, unreleased product details, financial figures.
- Code-level: secrets accidentally committed, embedded credentials in code or config.
A PII gate scans outbound payloads for these categories, redacts what it finds (or blocks, depending on policy), and emits an audit event.
How It Works
A typical PII-gating pipeline:
- Hook into outbound traffic: All outbound tool calls, federation messages, log writes are intercepted.
- Detection: Pattern matching (regex for credit cards, SSNs), entropy heuristics (high-entropy strings look like keys), trained classifiers (NER for names), and vocabulary lookups (hostnames, API key prefixes).
- Action: Redact (replace with [REDACTED-EMAIL]), block (refuse the operation), or alert (allow but warn).
- Audit: Log the detection event with category and action.
Ruflo's AIDefence plugin layers additional behaviors on top: prompt-injection detection on inbound payloads, CVE pattern detection in code being shared, and "trust-score adjustments" when peers attempt to send data that looks suspicious.
Why It Matters
Outbound data leaks are among the most-cited risks in enterprise agent deployments. A tool call that ships a project's whole .env file to a third-party API is a breach. A federated agent that includes raw customer data in a shared decision is a privacy incident. PII gating is the harness's promise that this category of mistake doesn't slip through silently.
The other reason it matters: it's a regulatory checkbox. GDPR, HIPAA, and SOC 2 controls all require demonstrable handling of sensitive data. A harness with first-class PII gating makes that demonstration possible; one without it makes it much harder, because every tool and integration has to prove its own handling separately.
Key Technical Details
- Detection precision and recall trade off: Aggressive regex catches more (high recall) but produces false positives (low precision). NER models do better but cost compute. Tune to your tolerance.
- Custom dictionaries: Every organization has its own sensitive terms (project codenames, internal endpoints). PII gates need extension points.
- Scope by destination: A name being sent to a federated peer is different from the same name being written to a local log. Same data, different policies.
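- Reversible vs. irreversible redaction: Reversible redaction (the value is replaced with a token and the mapping is stored) lets the receiver request unredaction with proper authority. Irreversible redaction is safer because there is no stored mapping to compromise.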
- False negatives are the dangerous failure: A PII gate that misses a leak is worse than one that over-redacts. Default to over-redact and let users adjust.
- Composability with sandboxing: PII gating + sandboxing (network egress restriction) gives layered defense. Don't rely on one alone.
- Performance: Scanning every outbound payload adds latency. Compile patterns once, reuse aggressively, run async when possible.
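Several of these details (custom dictionaries, destination-scoped policy, reversible redaction, and compile-once patterns) can be combined in one small sketch. Everything named here is hypothetical and chosen for illustration: the `POLICY` table and its destination names, the `TokenVault` class, and the `build_dictionary_pattern` and `apply_policy` helpers are not part of any harness's API.

```python
import re
import secrets

# Hypothetical destination -> action table: same data, different policies.
POLICY = {
    "local_log": "alert",
    "federated_peer": "redact",
    "third_party_api": "block",
}
DEFAULT_ACTION = "block"  # unknown destinations get the strictest treatment


class TokenVault:
    """Reversible redaction: values are swapped for opaque tokens and the
    mapping stays on the sending side."""

    def __init__(self) -> None:
        self._store: dict[str, str] = {}

    def tokenize(self, category: str, value: str) -> str:
        token = f"[{category}:{secrets.token_hex(4)}]"
        self._store[token] = value
        return token

    def reveal(self, token: str, authorized: bool) -> str:
        if not authorized:
            raise PermissionError("unredaction requires proper authority")
        return self._store[token]


def build_dictionary_pattern(terms: list[str]) -> re.Pattern:
    """Compile an organization-specific term list once, for reuse on every payload.
    Assumes a non-empty list of terms that start and end with word characters."""
    escaped = sorted((re.escape(t) for t in terms), key=len, reverse=True)
    return re.compile(r"\b(?:" + "|".join(escaped) + r")\b", re.IGNORECASE)


def apply_policy(text: str, destination: str, pattern: re.Pattern,
                 vault: TokenVault) -> tuple[str, str]:
    """Return (action_taken, payload) for the given destination."""
    if not pattern.search(text):
        return "allow", text
    action = POLICY.get(destination, DEFAULT_ACTION)
    if action == "block":
        raise PermissionError(f"payload blocked for destination {destination!r}")
    if action == "alert":
        return "alert", text
    redacted = pattern.sub(lambda m: vault.tokenize("INTERNAL", m.group()), text)
    return "redact", redacted
```

With a pattern built from `["Project Falcon", "db01.corp.internal"]`, the same sentence passes to `local_log` unchanged with an alert, is tokenized for `federated_peer`, and raises for `third_party_api`. Falling back to `block` for unlisted destinations follows the over-redact default described above; an irreversible variant would simply skip the vault and substitute a fixed placeholder.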
How Harnesses & Frameworks Implement This
| Harness / Framework | PII gating |
|---|---|
| Claude Code | DIY via PreToolUse hooks |
| Claude Agent SDK | DIY |
| ruflo | First-class — ruflo-aidefence (14 PII classes + injection + CVE detection) |
| LangGraph | DIY |
| AutoGen | DIY |
| CrewAI | Limited |
| OpenAI Agents SDK | Guardrails partially overlap |
| Codex CLI / Cursor | Limited |
Connections to Other Concepts
- cross-machine-agent-federation.md: The natural setting.
- permission-and-tool-scoping-primitives.md: Same defense-in-depth philosophy.
- prompt-injection-defense-in-harnesses.md: The other side of harness-layer protection.
- behavioral-trust-scoring.md: PII detections feed trust score updates.
Further Reading
- ruvnet, ruflo-aidefence documentation.
- Microsoft Presidio — open-source PII detection used by some harnesses as a backend.