Behavioral Trust Scoring

One-Line Summary: Behavioral trust scoring assigns each federated peer a reputation score that updates based on observed behavior (latency, accuracy, protocol compliance, detected malicious actions) and uses that score to gate privileges. Cryptographic identity establishes who a peer is; behavioral trust indicates whether it should be allowed to act.

Prerequisites: mTLS and ed25519 for agent trust, cross-machine agent federation

What Is Behavioral Trust?

Identity tells you that peer A is talking to you. Authorization tells you what permissions A's identity has. Behavioral trust adds a third dimension: what reputation has A built up via past interactions? A peer with verified identity and granted permissions can still be untrustworthy if their behavior is anomalous.

Concretely, a behavioral trust score might fold in:

Per-request latency: are responses suspiciously slow or fast?
Accuracy: did the peer's previous votes/results match consensus?
Protocol compliance: did the peer follow the federation protocol (timely heartbeats, well-formed messages)?
Detected malicious actions: prompt-injection attempts, attempted ACL violations, unauthorized capability claims.
External signals: feeds of known-bad keys, ratings from other federations.

The score updates continuously. Peers with high trust get more privileges (faster paths, more concurrent requests, fewer audit checks); peers with low trust get rate-limited, sandboxed, or evicted.
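As a rough illustration of how the signals listed above might be folded into a single per-interaction outcome, here is a minimal Python sketch. The `Interaction` fields, the weights, the expected latency, and the `tolerance` factor are all hypothetical values chosen for the example, not taken from any particular implementation.

```python
from dataclasses import dataclass

# Hypothetical weights; a real deployment would tune these.
WEIGHTS = {"latency": 0.2, "accuracy": 0.4, "compliance": 0.4}


@dataclass
class Interaction:
    latency_ms: float
    matched_consensus: bool   # accuracy signal
    well_formed: bool         # protocol-compliance signal
    malicious: bool = False   # e.g. a detected prompt-injection attempt


def latency_score(latency_ms: float, expected_ms: float = 200.0,
                  tolerance: float = 4.0) -> float:
    """1.0 at the expected latency, falling linearly to 0.0 once the
    response is `tolerance` times slower OR faster than expected."""
    ratio = max(latency_ms, 1e-9) / expected_ms
    deviation = max(ratio, 1.0 / ratio)  # >= 1.0, symmetric for slow and fast
    return max(0.0, 1.0 - (deviation - 1.0) / (tolerance - 1.0))


def interaction_score(ev: Interaction) -> float:
    """Fold the signals of one interaction into an outcome in [0, 1]."""
    if ev.malicious:
        return 0.0  # a detected malicious action overrides everything else
    parts = {
        "latency": latency_score(ev.latency_ms),
        "accuracy": 1.0 if ev.matched_consensus else 0.0,
        "compliance": 1.0 if ev.well_formed else 0.0,
    }
    return sum(WEIGHTS[name] * value for name, value in parts.items())
```

Treating a detected malicious action as a hard zero, rather than one more weighted term, is a design choice in this sketch: it keeps a peer from offsetting an attack with good latency numbers.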

How It Works

A trust-scoring pipeline:

Per-event signals: Each interaction emits one or more signals (success, error, latency, deviation-from-consensus).
Aggregation: Signals roll up to a peer-level score. Common form: a Bayesian beta distribution that updates with each event; the mean is the trust score.
Decay over time: Old behavior matters less than recent behavior. Exponential decay is a common choice.
Cross-peer pollination: Trust scores can be shared via gossip (with caveats — peers can lie about their assessments of other peers).
Privilege mapping: A peer's privileges are computed from its trust score plus its base authorization. Below a threshold, the peer is gracefully demoted or evicted.
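The aggregation and decay steps above can be sketched as a small Beta-style tracker. This is an illustrative sketch under stated assumptions, not Ruflo's implementation: the class name `BetaTrust`, the pessimistic prior (1 good, 3 bad), and the one-day half-life are hypothetical. Accepting fractional outcomes in [0, 1] is a common heuristic extension of the strict success/failure Beta update.

```python
from dataclasses import dataclass


@dataclass
class BetaTrust:
    prior_good: float = 1.0        # pseudo-count of good outcomes (assumed)
    prior_bad: float = 3.0         # pseudo-count of bad outcomes (assumed)
    half_life_s: float = 86_400.0  # evidence halves every day (assumed)
    good: float = 0.0              # decayed count of observed good evidence
    bad: float = 0.0               # decayed count of observed bad evidence
    last_ts: float = 0.0           # timestamp of the last decay, in seconds

    def _decay(self, now: float) -> None:
        """Exponentially shrink observed evidence; the prior is not decayed,
        so an idle peer drifts back toward the cold-start score."""
        dt = max(0.0, now - self.last_ts)
        factor = 0.5 ** (dt / self.half_life_s)
        self.good *= factor
        self.bad *= factor
        self.last_ts = now

    def observe(self, outcome: float, now: float) -> None:
        """Record one event; outcome is 1.0 for fully good, 0.0 for fully bad."""
        self._decay(now)
        self.good += outcome
        self.bad += 1.0 - outcome

    def score(self, now: float) -> float:
        """Mean of the Beta distribution: alpha / (alpha + beta)."""
        self._decay(now)
        alpha = self.prior_good + self.good
        beta = self.prior_bad + self.bad
        return alpha / (alpha + beta)
```

With these assumed defaults, a new peer starts at 0.25; ten fully good events at time 0 raise the mean to 11/14, and one half-life of silence later it has relaxed to 6/9.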

Ruflo's behavioral-trust system continuously evaluates federation peers and adjusts privileges accordingly without operator intervention.

Why It Matters

Static permissions are brittle. A peer that is trusted today may be compromised tomorrow; a peer that is untrusted today may earn trust over time. Behavioral trust gives the system a way to respond to changes in peer quality without manual intervention.

The other reason it matters: it lets you safely federate with peers you don't fully trust upfront. New peers start with low trust, are restricted to safe operations, and earn capability over time. Without behavioral trust, you have to either grant full trust upfront (risky) or never grant trust (excludes new peers).

Key Technical Details

Cold-start problem: New peers have no history. Default to low scores and let them earn trust.
Sybil resistance: An attacker who creates many low-trust identities can drown out the signal. Pair trust scoring with proof-of-cost (compute, stake, verified identity) to make Sybil attacks expensive.
Score manipulation: Peers can game easily-measured signals (latency, request count) without actually being trustworthy. Combine measurable and harder-to-game signals.
Trust transitivity: "I trust you, you trust them, so I trust them" is dangerous. Limit transitive trust depth.
Threshold-based privilege levels: Discrete tiers (untrusted, observer, contributor, peer, admin) are easier to reason about than continuous scores.
Audit and revocation: A peer's trust history should be auditable. Revoking trust should be a deliberate, logged action.
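Two of the points above, discrete privilege tiers and limited transitive trust, lend themselves to a short sketch. The tier names come from the list above; the threshold values, the `max_hops` default, and the function names are hypothetical. Capping the trust-derived tier at the peer's base authorization is one way to read "trust score plus base authorization": behavior can lower a peer's privileges but never raise them beyond what was granted.

```python
import math
from bisect import bisect_right
from enum import IntEnum


class Tier(IntEnum):
    UNTRUSTED = 0
    OBSERVER = 1
    CONTRIBUTOR = 2
    PEER = 3
    ADMIN = 4


# Hypothetical inclusive lower bounds for OBSERVER, CONTRIBUTOR, PEER, ADMIN.
THRESHOLDS = [0.2, 0.5, 0.75, 0.95]


def tier_for(score: float) -> Tier:
    """Map a continuous trust score in [0, 1] to a discrete tier."""
    return Tier(bisect_right(THRESHOLDS, score))


def effective_tier(score: float, authorized_max: Tier) -> Tier:
    """Behavioral trust may demote a peer, but never exceeds its base grant."""
    return min(tier_for(score), authorized_max)


def transitive_trust(chain: list[float], max_hops: int = 2) -> float:
    """Trust in the peer at the end of a referral chain.

    `chain` holds the trust score of each hop. Trust is multiplied along
    the chain, so it can only shrink, and chains longer than `max_hops`
    are ignored entirely.
    """
    if not chain or len(chain) > max_hops:
        return 0.0
    return math.prod(chain)
```

For example, `effective_tier(0.99, Tier.CONTRIBUTOR)` stays at `Tier.CONTRIBUTOR` even though the score alone would qualify for `Tier.ADMIN`, and a three-hop referral chain contributes no trust under the default `max_hops=2`.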

How Harnesses & Frameworks Implement This

Harness / Framework	Behavioral trust
Claude Code	None
Claude Agent SDK	DIY
ruflo	First-class — continuous trust evaluation in federation
LangGraph	DIY
AutoGen	DIY
CrewAI	DIY
OpenAI Agents SDK	DIY
Codex CLI / Cursor	None

Connections to Other Concepts

mtls-and-ed25519-for-agent-trust.md — Identity, the layer below.
cross-machine-agent-federation.md — The setting.
pii-gating-and-aidefence.md — Application-layer defense, complementary.
permission-and-tool-scoping-primitives.md — Foundational permission model.