ARF · Authority Reference Framework

Rules, not trust.
Policy, not prayer.

ARF enforces declarative TOML governance rules at the proxy layer, before any message reaches the model and before any response reaches the agent. Circuit breakers trip on their own. Health grades track agent behavior over time. Write the rules once; ARF enforces them.

Writing Policy

Declarative TOML.
Human-readable governance.

ARF policy files are plain TOML. Every rule is readable by a person, auditable by a machine, and version-controlled like any other infrastructure config. No opaque binary formats, no vendor-locked policy languages, no click-ops.

Rules express what agents are allowed to do, what they're forbidden from doing, and what requires human approval before proceeding. Governance isn't applied after the fact. It's enforced live, in the request path, before the model generates a response.

The Authority Reference Framework evaluates every inbound prompt and outbound completion against your policy DAG. The evaluation is streaming-aware. ARF reads completion chunks as they arrive and can trip a circuit breaker mid-stream if content violates policy.

# .arf/governance/rules.toml

[governance]
profile = "standard"

[[rules]]
name = "block-dangerous-shell"
condition = { type = "tool_call", tool = "Bash", pattern = "rm -rf|dd if=" }
action = "deny"
message = "Destructive shell commands require manual approval"

[[rules]]
name = "approve-file-writes"
condition = { type = "tool_call", tool = "Write" }
action = "require_approval"
timeout_secs = 30

[[rules]]
name = "redact-credentials"
condition = { type = "stream_text", pattern = "ghp_[A-Za-z0-9]{36,}|sk-[a-zA-Z0-9]{32,}" }
action = "rewrite"
replacement = "[REDACTED]"

[[rules]]
name = "token-budget"
condition = { type = "session_tokens", gt = 100000 }
action = "deny"
message = "Token budget exceeded"

[circuit_breaker]
error_threshold = 5
window_secs = 60
cooldown_secs = 300
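
To make the redact-credentials rule above concrete, here is a minimal Python sketch of a per-chunk rewrite. It is not ARF's implementation: the function name `redact_stream` is hypothetical, and only the pattern and replacement string come from the rule file. Python's `re` module is a backtracking engine rather than RE2, but this pattern uses only syntax both accept.

```python
import re
from typing import Iterable, Iterator

# Pattern and replacement copied from the redact-credentials rule above.
CREDENTIAL = re.compile(r"ghp_[A-Za-z0-9]{36,}|sk-[a-zA-Z0-9]{32,}")
REPLACEMENT = "[REDACTED]"


def redact_stream(chunks: Iterable[str]) -> Iterator[str]:
    """Yield each chunk with credential-shaped substrings replaced.

    Hypothetical helper: it inspects one chunk at a time, so a secret that
    is split across two chunks would not match in this sketch.
    """
    for chunk in chunks:
        yield CREDENTIAL.sub(REPLACEMENT, chunk)


chunks = ["token: ghp_" + "a" * 36, " done"]
print("".join(redact_stream(chunks)))  # prints "token: [REDACTED] done"
```

The sketch deliberately ignores secrets that straddle a chunk boundary. The source does not say how ARF handles that case, so treat it as an open question rather than a property of the product.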
        

Governance Profiles

Strict. Standard.
Minimal.

Strict

Maximum oversight. Every tool call requires explicit approval. Token budgets are tight. Circuit breakers are hair-trigger. For production systems, regulated environments, or any context where an agent mistake is costly.

✓ All tool calls require approval
✓ 50k token budget
✓ 1-failure circuit break
✓ Content filtering: maximum
✓ Network calls: blocked by default

Standard

Balanced oversight for everyday development. File reads auto-approve. Writes and network calls require approval. 100k token budget. Recommended for most teams starting with governed agent development.

✓ Reads auto-approved
~ Writes require approval
✓ 100k token budget
✓ 3-failure circuit break
~ Network: prompt on first call

Minimal

Light governance for trusted, local-only workflows. Audit and signing stay active, so you still get a full proof record, but approvals are minimal and budgets are relaxed. For personal workstations and experimentation.

✓ Most ops auto-approved
✓ 500k token budget
✓ 5-failure circuit break
✓ Audit still active
~ Network: log but allow

Circuit Breakers

Trips fast.
Cools down. Heals.

ARF's circuit breaker model is borrowed from distributed systems engineering. When an agent fails, violates policy, or goes silent for too long, the breaker trips and further requests are blocked until the cooldown elapses and a probe request succeeds.

Circuit breakers have three states: Closed (normal operation), Open (blocked, cooling down), and Half-Open (probing for recovery). Transitions are configurable per profile.

Circuit Breaker State Machine

        ┌─────────────────────────────────────┐
        │                                     │
        ▼                                     │ success
  ┌──────────┐   failure_count ≥ N    ┌───────────┐
  │  CLOSED  │ ─────────────────────▶│   OPEN    │
  │  normal  │                        │  blocked  │
  └──────────┘                        └───────────┘
        ▲                                    │
        │                                    │ cooldown elapsed
        │                             ┌──────────────┐
        │   success                   │  HALF-OPEN   │
        └─────────────────────────────│   probing    │
                                      └──────────────┘
                                             │
                                             │ failure
                                             ▼
                                      ┌───────────┐
                                      │   OPEN    │
                                      │  reset ↺  │
                                      └───────────┘

  Events that trip a breaker:
  ✗ error_rate > threshold
  ✗ consecutive_failures ≥ N
  ✗ policy_violation (content, cost)
  ✗ manual trip via TUI / CLI
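
The state machine above can be sketched in Python as follows. This is an illustration under stated assumptions, not ARF's code. The class and method names are hypothetical, the `clock` parameter is added only for testing, and `error_threshold` is read here as "this many failures inside `window_secs`", which the source does not spell out. The sketch models the Closed, Open, and Half-Open transitions only; manual trips and policy-violation trips are left out.

```python
import time
from collections import deque


class CircuitBreaker:
    """Three-state breaker driven by the [circuit_breaker] keys (hypothetical API)."""

    def __init__(self, error_threshold=5, window_secs=60, cooldown_secs=300,
                 clock=time.monotonic):
        self.error_threshold = error_threshold
        self.window_secs = window_secs
        self.cooldown_secs = cooldown_secs
        self.clock = clock          # injectable clock, an assumption for testability
        self.failures = deque()     # timestamps of failures inside the window
        self.state = "closed"
        self.opened_at = None

    def allow_request(self):
        """Return True if a request may pass; Open becomes Half-Open after cooldown."""
        if self.state == "open" and self.clock() - self.opened_at >= self.cooldown_secs:
            self.state = "half_open"
        return self.state != "open"

    def record_success(self):
        """A successful probe in Half-Open closes the breaker."""
        if self.state == "half_open":
            self.state = "closed"
            self.failures.clear()

    def record_failure(self):
        now = self.clock()
        if self.state == "half_open":
            self._trip(now)         # failed probe: back to Open, cooldown restarts
            return
        self.failures.append(now)
        # Drop failures that have aged out of the rolling window.
        while self.failures and now - self.failures[0] > self.window_secs:
            self.failures.popleft()
        if len(self.failures) >= self.error_threshold:
            self._trip(now)

    def _trip(self, now):
        self.state = "open"
        self.opened_at = now
        self.failures.clear()


now = [0.0]
cb = CircuitBreaker(error_threshold=3, clock=lambda: now[0])
for _ in range(3):
    cb.record_failure()
print(cb.state)  # prints "open"
```

Injecting the clock keeps the sketch deterministic: advancing `now[0]` past `cooldown_secs` and calling `allow_request()` moves the breaker to Half-Open without sleeping.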

Health Grading

Your agents get
a report card.

Excellent

Error rate <2%. All policy checks passing. Token usage within budget. No manual interventions required.

Good

Error rate 2–8%. Minor policy flags. Token usage slightly elevated. Occasional circuit breaker trips.

Acceptable

Error rate 8–15%. Moderate policy violations. Budget warnings active. Review recommended.

D–F · Poor / Fail

Error rate >15% or budget exceeded. Repeated policy violations. Circuit breakers tripped. Session requires human review.

Health grades are computed per-session and per-agent over a rolling window. They feed back into the governance profile: a session that drops below the health_grade_floor configured in its policy trips its circuit breaker. For example, set health_grade_floor = "C" and any session that reaches D opens the breaker immediately.
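
The grading bands and the floor check can be sketched in Python as below. Several details are assumptions. The source names only the D–F band and a floor of "C", so mapping Excellent, Good, and Acceptable to A, B, and C is inferred. The handling of the exact boundary values (2%, 8%, 15%) is a choice made for this sketch. The function names are hypothetical, and policy-violation counts are left out because the source gives no thresholds for them.

```python
# Best to worst. "D" stands in for the whole D–F band (assumption).
GRADE_ORDER = ["A", "B", "C", "D"]


def grade(error_rate: float, budget_exceeded: bool = False) -> str:
    """Map a session's error rate (0.0 to 1.0) to a letter grade."""
    if budget_exceeded or error_rate > 0.15:
        return "D"   # Poor / Fail: error rate >15% or budget exceeded
    if error_rate >= 0.08:
        return "C"   # Acceptable: 8–15% (an exact 8% lands here by assumption)
    if error_rate >= 0.02:
        return "B"   # Good: 2–8%
    return "A"       # Excellent: <2%


def below_floor(current: str, floor: str) -> bool:
    """True when the current grade is worse than the configured floor."""
    return GRADE_ORDER.index(current) > GRADE_ORDER.index(floor)


print(grade(0.20), below_floor(grade(0.20), "C"))  # prints "D True"
```

With health_grade_floor = "C", a session graded C stays at the floor and only a D falls below it, which matches the example in the paragraph above.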

Request Evaluation Path

Every request goes through
the same path.

Governance runs synchronously in the request path. A request is never forwarded to the model until all applicable rules have been evaluated and any required approvals have been granted.

  Runner (claude / codex / gemini / aider / etc.)
      │  HTTP request
      ▼
  ┌──────────────────────────────────────────────────────┐
  │  ARF Proxy (localhost:4554)                          │
  │                                                      │
  │  1. Parse & translate to Canonical IR                │
  │  2. Evaluate [[rules]] in order                      │
  │       action=deny       → return 403, record event   │
  │       action=require_approval → HOLD, toast to TUI   │
  │       action=warn       → log, continue              │
  │       action=rewrite    → mutate payload, continue   │
  │       action=modify     → patch parameters, continue │
  │  3. All rules pass → forward to backend API          │
  │  4. Stream response chunks back                      │
  │  5. Evaluate stream rules on each chunk              │
  │       stream match → rewrite chunk or trip breaker   │
  │  6. Record complete event to provenance chain        │
  └──────────────────────────────────────────────────────┘
      │  API response (streamed)
      ▼
  Runner receives response

Step 2 is the governance kernel. Rules are evaluated in declaration order. The first matching rule whose action is deny or require_approval stops evaluation and takes that action. Rules with action warn, rewrite, or modify handle the request and then continue to the next rule. All events are signed and appended to the provenance chain regardless of outcome.
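
The ordering semantics of step 2 can be sketched in Python as follows. This is a simplified model, not ARF's kernel. The `Rule` dataclass, the `matches` and `apply` callables, and the "forward" outcome label are all hypothetical stand-ins, and the cap-temperature rule is invented for illustration. The other two rule names come from the example rules.toml.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Rule:
    name: str
    action: str                                     # deny | require_approval | warn | rewrite | modify
    matches: Callable[[dict], bool]                 # hypothetical stand-in for `condition`
    apply: Optional[Callable[[dict], dict]] = None  # used by rewrite / modify rules


TERMINAL = {"deny", "require_approval"}


def evaluate(rules: List[Rule], request: dict) -> Tuple[str, dict, List[str]]:
    """Return (outcome, possibly mutated request, names of matched rules)."""
    matched = []
    for rule in rules:                  # declaration order
        if not rule.matches(request):
            continue
        matched.append(rule.name)
        if rule.action in TERMINAL:
            return rule.action, request, matched   # first terminal match wins
        if rule.apply is not None:
            request = rule.apply(request)          # rewrite / modify, then continue
        # warn: nothing to mutate, fall through to the next rule
    return "forward", request, matched


rules = [
    Rule("cap-temperature", "modify",              # hypothetical rule
         lambda r: r.get("temperature", 0) > 1.0,
         lambda r: {**r, "temperature": 1.0}),
    Rule("block-dangerous-shell", "deny",
         lambda r: r.get("tool") == "Bash" and "rm -rf" in r.get("command", "")),
    Rule("approve-file-writes", "require_approval",
         lambda r: r.get("tool") == "Write"),
]

outcome, request, matched = evaluate(
    rules, {"tool": "Bash", "command": "rm -rf /tmp/x", "temperature": 1.4})
print(outcome, matched)
```

Because the modify rule is declared first, it patches the temperature before the deny rule ends evaluation. Reordering the list would change which rules get the chance to run, which is why declaration order matters.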

ARL · Agent Rule Language

The condition syntax.
What can match.

Every [[rules]] entry has a condition field. The condition is a TOML inline table with a type key and additional fields that depend on the type.

# Condition types:

# Match a specific tool call
condition = { type = "tool_call", tool = "Bash" }
condition = { type = "tool_call", tool = "Write", path_pattern = "/etc/.*" }
condition = { type = "tool_call", tool = "Bash", pattern = 'rm -rf|curl.*\| sh' }  # literal string: \| is not a valid escape in a double-quoted TOML string

# Match content in the prompt (request body)
condition = { type = "request_text", pattern = "(?i)(api.?key|password|secret)" }

# Match content in the model's streaming response
condition = { type = "stream_text", pattern = "ghp_[A-Za-z0-9]{36,}" }

# Budget / session conditions
condition = { type = "session_tokens", gt = 100000 }
condition = { type = "request_tokens", gt = 8000 }

# Model parameter condition (for action=modify rules)
condition = { type = "request_param", field = "temperature", gt = 1.0 }

# Compound: all conditions must match
condition = { type = "all", conditions = [
  { type = "tool_call", tool = "Bash" },
  { type = "request_text", pattern = "production" }
] }
    

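Pattern values are RE2-compatible regular expressions. Patterns are applied case-sensitively unless you prefix them with (?i). TOML basic (double-quoted) strings treat the backslash as an escape character, so write patterns that contain backslashes as single-quoted literal strings. Run arf governance rules to list all active rules and verify patterns.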
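
To show how the condition types compose, here is a minimal Python sketch of a matcher over already-parsed condition tables. It is not ARF's evaluator. The `matches` function and the flat `event` dict with keys such as `input`, `path`, and `params` are hypothetical. Python's `re` module also stands in for RE2 here, which is adequate for the patterns shown but is a different engine.

```python
import re


def matches(condition: dict, event: dict) -> bool:
    """Evaluate one condition table against a flat event dict (hypothetical shape)."""
    ctype = condition["type"]
    if ctype == "all":
        # Compound: every nested condition must match.
        return all(matches(c, event) for c in condition["conditions"])
    if ctype == "tool_call":
        if event.get("tool") != condition["tool"]:
            return False
        # Optional regex fields, each checked against an assumed event key.
        for key, field in (("pattern", "input"), ("path_pattern", "path")):
            if key in condition and not re.search(condition[key], event.get(field, "")):
                return False
        return True
    if ctype in ("request_text", "stream_text"):
        return re.search(condition["pattern"], event.get(ctype, "")) is not None
    if ctype in ("session_tokens", "request_tokens"):
        return event.get(ctype, 0) > condition["gt"]
    if ctype == "request_param":
        value = event.get("params", {}).get(condition["field"])
        return value is not None and value > condition["gt"]
    raise ValueError(f"unsupported condition type: {ctype}")


compound = {"type": "all", "conditions": [
    {"type": "tool_call", "tool": "Bash"},
    {"type": "request_text", "pattern": "production"},
]}
print(matches(compound, {"tool": "Bash", "request_text": "deploy to production"}))  # prints "True"
```

Raising on an unknown type, rather than returning False, is a design choice for the sketch: a typo in a rule should surface as an error instead of silently never matching.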

Governance in the TUI

Rules visible.
Decisions auditable.

The Rules screen shows every active rule, its last-match timestamp, and the current state of all circuit breakers. You do not need to read TOML files during an active session; the TUI surfaces everything in real time.

When a rule trips a circuit breaker, the TUI shows a toast notification and highlights the breaker in red. You can reset it directly from the Rules screen by pressing Enter on the breaker row.

Hot-reload your policy without restarting: edit .arf/governance/rules.toml and press Ctrl+R in the TUI to reload immediately.

[Screenshot: ARF TUI Rules screen showing governance rules]

Standards Compatibility

Works with the governance
tooling you already have.

ARF's TOML policy language is designed for composability. Export and import rules in OPA Rego, YAML-based policy formats, and JSON Schema. Governance engineers who know Rego can write ARF rules from day one.

OPA / Rego

Import existing OPA Rego policies as ARF rule modules. The ARF compiler translates Rego allow/deny rules into ARF's streaming evaluation pipeline. Your existing compliance library works out of the box.

JSON Schema / OpenAPI

Define message shape constraints using JSON Schema. ARF validates inbound and outbound message payloads against your schema definitions and blocks malformed or unexpected structures.

