The Autonomous Request Filter intercepts every message, inbound from the runner and outbound from the model, and gives you precise control over what goes through, what gets modified, and what gets blocked. No code changes. No agent awareness. Just the proxy, doing its job.
Inbound steering modifies the prompt before it reaches the model. You can inject additional context, prepend constraints, append project-specific coding standards, or strip out content you don't want the model to act on.
This is particularly powerful for teams: define a system prompt injection in your ARF policy that automatically adds your team's coding conventions to every request without every developer having to remember to include it themselves. The agent sees it; the developer doesn't have to type it.
Inbound steering rules are evaluated in order. Each rule can match on message content, session context, user identity, or time-based conditions. Rules can modify, augment, or reject messages.
Outbound filtering evaluates model completions before they're returned to the runner. ARF reads the completion stream in real time. If a completion violates policy (contains a disallowed code pattern, references a forbidden path, or includes suspicious tool call arguments), the stream is interrupted.
Interrupted completions are logged, the session health grade is decremented, and (depending on policy) a human approval prompt is surfaced. The runner sees a clean error response; it can retry with a modified approach.
This is your last line of defense before the agent takes action. The Autonomous Request Filter sees every tool call the model proposes before it's executed.
── Completion stream from engine ──────────────
chunk[1]: I'll update the config file...
chunk[2]: tool_use: bash
          args:
            cmd: "rm -rf /etc/nginx/conf.d/*"
── Filter evaluation ──────────────────────────
✗ MATCH: outbound.deny_pattern
  rule: block-destructive-ops
  pattern: rm -rf.*/(etc|var|usr|bin)
  action: BLOCK + INTERRUPT
── Response to runner ─────────────────────────
HTTP 451 Unavailable For Policy Reasons
{ "error": "completion_blocked", "rule": "block-destructive-ops" }
● Session grade: B → C (policy violation logged)
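The evaluation shown above, as an added illustration of how an outbound deny-pattern rule is applied to a chunked completion stream. This is not ARF's own code; the chunk dictionaries, the rule's `tool` scoping, and the A-to-F grade ladder are assumptions made for the example.

```python
import re
from dataclasses import dataclass, field

GRADES = "ABCDF"  # assumed session health ladder; one step down per violation


@dataclass
class DenyRule:
    name: str
    pattern: re.Pattern
    tool: str = "bash"  # assumption: a rule is scoped to one tool name


@dataclass
class Session:
    grade: str = "B"
    violations: list = field(default_factory=list)


def decrement(grade: str) -> str:
    """Move one step down the grade ladder, never below the last grade."""
    i = GRADES.index(grade)
    return GRADES[min(i + 1, len(GRADES) - 1)]


def evaluate_stream(chunks, rules, session):
    """Scan chunks in arrival order. On the first tool_use whose command matches a
    deny rule, stop forwarding (interrupt), log the violation, lower the grade and
    return the block response; otherwise return all chunks for forwarding."""
    forwarded = []
    for chunk in chunks:
        if chunk.get("type") == "tool_use":
            cmd = chunk.get("args", {}).get("cmd", "")
            for rule in rules:
                if chunk.get("tool") == rule.tool and rule.pattern.search(cmd):
                    session.violations.append(rule.name)
                    session.grade = decrement(session.grade)
                    return {"status": 451, "error": "completion_blocked",
                            "rule": rule.name, "forwarded": forwarded}
        forwarded.append(chunk)  # only clean chunks reach the runner
    return {"status": 200, "forwarded": forwarded}


RULES = [DenyRule("block-destructive-ops", re.compile(r"rm -rf.*/(etc|var|usr|bin)"))]

# Constructed sample stream mirroring the transcript above (not a real capture).
SAMPLE_CHUNKS = [
    {"type": "text", "text": "I'll update the config file..."},
    {"type": "tool_use", "tool": "bash", "args": {"cmd": "rm -rf /etc/nginx/conf.d/*"}},
    {"type": "text", "text": "Done."},  # never forwarded: stream is cut at chunk 2
]
```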
ARF monitors inbound content (files the agent reads, tool call results, web page contents returned to the agent) for injection signatures. Common patterns: instruction overrides ("Ignore previous instructions"), role jailbreaks, and credential exfiltration attempts hidden in data.
When injection is detected, ARF's options range from logging-only to full session halt. Configure the response per rule: sanitize the injected content before it reaches the model, flag for human review, or block the request and alert immediately.