arf.io / ARF / Governance & Policy / ARF · Autonomous Request Filter · Agent Router & Filter
ARF · Authority Reference Framework

Rules, not trust.
Policy, not prayer.

ARF enforces declarative TOML governance rules at the proxy layer, before any message reaches the model and before any response reaches the agent. Circuit breakers trip on their own. Health grades track agent behavior over time. Write the rules once; ARF enforces them.

Writing Policy

Declarative TOML.
Human-readable governance.

ARF policy files are plain TOML. Every rule is readable by a person, auditable by a machine, and version-controlled like any other infrastructure config. No opaque binary formats, no vendor-locked policy languages, no click-ops.

Rules express what agents are allowed to do, what they're forbidden from doing, and what requires human approval before proceeding. Governance isn't applied after the fact. It's enforced live, in the request path, before the model generates a response.

The Authority Reference Framework evaluates every inbound prompt and outbound completion against your policy DAG. The evaluation is streaming-aware. ARF reads completion chunks as they arrive and can trip a circuit breaker mid-stream if content violates policy.

# .arf/governance/rules.toml [governance] profile = "standard" [[rules]] name = "block-dangerous-shell" condition = { type = "tool_call", tool = "Bash", pattern = "rm -rf|dd if=" } action = "deny" message = "Destructive shell commands require manual approval" [[rules]] name = "approve-file-writes" condition = { type = "tool_call", tool = "Write" } action = "require_approval" timeout_secs = 30 [[rules]] name = "redact-credentials" condition = { type = "stream_text", pattern = "ghp_[A-Za-z0-9]{36,}|sk-[a-zA-Z0-9]{32,}" } action = "rewrite" replacement = "[REDACTED]" [[rules]] name = "token-budget" condition = { type = "session_tokens", gt = 100000 } action = "deny" message = "Token budget exceeded" [circuit_breaker] error_threshold = 5 window_secs = 60 cooldown_secs = 300
Governance Profiles

Strict. Standard.
Minimal.

Strict

Maximum oversight. Every tool call requires explicit approval. Token budgets are tight. Circuit breakers are hair-trigger. For production systems, regulated environments, or any context where an agent mistake is costly.

  • All tool calls require approval
  • 50k token budget
  • 1-failure circuit break
  • Content filtering: maximum
  • Network calls: blocked by default
Standard

Balanced oversight for everyday development. File reads auto-approve. Writes and network calls require approval. 100k token budget. Recommended for most teams starting with governed agent development.

  • Reads auto-approved
  • ~ Writes require approval
  • 100k token budget
  • 3-failure circuit break
  • ~ Network: prompt on first call
Minimal

Light governance for trusted, local-only workflows. Audit and signing stay active, so you still get a full proof record, but approvals are minimal and budgets are relaxed. For personal workstations and experimentation.

  • Most ops auto-approved
  • 500k token budget
  • 5-failure circuit break
  • Audit still active
  • ~ Network: log but allow
Circuit Breakers

Trips fast.
Cools down. Heals.

ARF's circuit breaker model is borrowed from distributed systems engineering. When an agent fails, violates policy, or goes silent for too long, the breaker trips and further requests are blocked until the condition clears.

Circuit breakers have three states: Closed (normal operation), Open (blocked, cooling down), and Half-Open (probing for recovery). Transitions are configurable per profile.

Circuit Breaker State Machine
        ┌─────────────────────────────────────┐
        │                                     │
        ▼                                     │ success
  ┌──────────┐   failure_count ≥ N    ┌───────────┐
  │  CLOSED  │ ─────────────────────▶│   OPEN    │
  │  normal  │                        │  blocked  │
  └──────────┘                        └───────────┘
        ▲                                    │
        │                                    │ cooldown elapsed
        │                             ┌──────────────┐
        │   success                   │  HALF-OPEN   │
        └─────────────────────────────│   probing    │
                                      └──────────────┘
                                             │
                                             │ failure
                                             ▼
                                      ┌──────────┐
                                      │   OPEN    │
                                      │  reset ↺  │
                                      └──────────┘

  Events that trip a breaker:
   error_rate > threshold
   consecutive_failures ≥ N
   policy_violation (content, cost)
   manual trip via TUI / CLI
Health Grading

Your agents get
a report card.

A
Excellent

Error rate <2%. All policy checks passing. Token usage within budget. No manual interventions required.

B
Good

Error rate 2–8%. Minor policy flags. Token usage slightly elevated. Occasional circuit breaker trips.

C
Acceptable

Error rate 8–15%. Moderate policy violations. Budget warnings active. Review recommended.

D–F
Poor / Fail

Error rate >15% or budget exceeded. Repeated policy violations. Circuit breakers tripped. Session requires human review.

Health grades are computed per-session and per-agent over a rolling window. They feed back into the governance profile: a session that drops below its configured health_grade_floor in the policy will have its circuit breakers armed for earlier trips. Set health_grade_floor = "C" and any session that reaches D will immediately open the breaker.

Request Evaluation Path

Every request goes through
the same path.

Governance runs synchronously in the request path. A request is never forwarded to the model until all applicable rules have been evaluated and any required approvals have been granted.

  Runner (claude / codex / gemini / aider / etc.)
      │  HTTP request
      ▼
  ┌──────────────────────────────────────────────────────┐
  │  ARF Proxy (localhost:4554)                          │
  │                                                      │
  │  1. Parse & translate to Canonical IR                │
  │  2. Evaluate [[rules]] in order                      │
  │       action=deny       → return 403, record event   │
  │       action=require_approval → HOLD, toast to TUI  │
  │       action=warn       → log, continue              │
  │       action=rewrite    → mutate payload, continue   │
  │       action=modify     → patch parameters, continue │
  │  3. All rules pass → forward to backend API          │
  │  4. Stream response chunks back                      │
  │  5. Evaluate stream rules on each chunk              │
  │       stream match → rewrite chunk or trip breaker   │
  │  6. Record complete event to provenance chain        │
  └──────────────────────────────────────────────────────┘
      │  API response (streamed)
      ▼
  Runner receives response

Step 2 is the governance kernel. Rules are evaluated in declaration order. The first rule that matches and has action deny or require_approval stops evaluation and takes that action. Rules with action warn or rewrite continue to the next rule after handling. All events are signed and appended to the provenance chain regardless of outcome.

ARL · Agent Rule Language

The condition syntax.
What can match.

Every [[rules]] entry has a condition field. The condition is a TOML inline table with a type key and additional fields that depend on the type.

# Condition types: # Match a specific tool call condition = { type = "tool_call", tool = "Bash" } condition = { type = "tool_call", tool = "Write", path_pattern = "/etc/.*" } condition = { type = "tool_call", tool = "Bash", pattern = "rm -rf|curl.*\| sh" } # Match content in the prompt (request body) condition = { type = "request_text", pattern = "(?i)(api.?key|password|secret)" } # Match content in the model's streaming response condition = { type = "stream_text", pattern = "ghp_[A-Za-z0-9]{36,}" } # Budget / session conditions condition = { type = "session_tokens", gt = 100000 } condition = { type = "request_tokens", gt = 8000 } # Model parameter condition (for action=modify rules) condition = { type = "request_param", field = "temperature", gt = 1.0 } # Compound: all conditions must match condition = { type = "all", conditions = [ { type = "tool_call", tool = "Bash" }, { type = "request_text", pattern = "production" } ] }

Pattern values are RE2-compatible regular expressions. Patterns are applied case-sensitively unless you prefix with (?i). Run arf governance rules to list all active rules and verify patterns.

Governance in the TUI

Rules visible.
Decisions auditable.

The Rules screen shows every active rule, its last-match timestamp, and the current state of all circuit breakers. No need to read TOML files during an active session — the TUI surfaces everything in real time.

When a rule trips a circuit breaker, the TUI shows a toast notification and highlights the breaker in red. You can reset it directly from the Rules screen by pressing Enter on the breaker row.

Hot-reload your policy without restarting: edit .arf/governance/rules.toml and press Ctrl+R in the TUI to reload immediately.

ARF TUI Rules screen showing governance rules
Standards Compatibility

Works with the governance
tooling you already have.

ARF's TOML policy language is designed for composability. Export and import rules in OPA Rego, YAML-based policy formats, and JSON Schema. Governance engineers who know Rego can write ARF rules from day one.

OPA / Rego

Import existing OPA Rego policies as ARF rule modules. The ARF compiler translates Rego allow/deny rules into ARF's streaming evaluation pipeline. Your existing compliance library works out of the box.

Integration docs →
JSON Schema / OpenAPI

Define message shape constraints using JSON Schema. ARF validates inbound and outbound message payloads against your schema definitions and blocks malformed or unexpected structures.

Schema validation →