Policy-Driven Security for AI Agents

Most AI agent frameworks handle security the same way: a list of allowed tools and a hope that nothing goes wrong. PureClaw takes a different approach. Security is defined in declarative YAML policies, validated against a schema at load time, enforced at every operation boundary, and audited continuously.

This post walks through the six security modules that make up PureClaw's security layer, with real policy examples from our production configuration.

The Policy File

Every PureClaw deployment starts with a policy.yaml file. This is not a configuration file. It is a security contract that defines the complete boundary of what an agent can do. The policy is validated against a JSON schema at startup. If the policy is malformed, the agent refuses to start.

version: 1

filesystem:
  read_allow:
    - "/app/**"
    - "/data/**"
    - "/tmp/**"
  read_deny:
    - "/etc/shadow"
    - "**/.env"
    - "**/*secret*"
    - "**/*credential*"
  write_allow:
    - "/data/**"
    - "/output/**"
    - "/tmp/**"
  write_deny:
    - "/etc/**"
    - "/usr/**"
    - "/bin/**"
    - "/proc/**"
    - "/sys/**"

network:
  fetch_allow_domains:
    - "api.openai.com"
    - "api.anthropic.com"
    - "*.googleapis.com"
  fetch_deny_domains: []
  block_private_ranges: true
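
To make the startup check concrete, here is a minimal Python sketch. It is not PureClaw's validator: it swaps real JSON Schema validation for a hand-rolled structural check so the example has no dependencies, and the names validate_policy, PolicyError and EXPECTED_SECTIONS are hypothetical. It assumes the policy has already been parsed into a dict and only knows about the two sections shown above.

```python
class PolicyError(Exception):
    """Raised when a policy fails validation; the caller should refuse to start."""


# Assumed shape, taken from the example policy above (not PureClaw's real schema).
EXPECTED_SECTIONS = {
    "filesystem": {
        "read_allow": list, "read_deny": list,
        "write_allow": list, "write_deny": list,
    },
    "network": {
        "fetch_allow_domains": list,
        "fetch_deny_domains": list,
        "block_private_ranges": bool,
    },
}


def validate_policy(policy: dict) -> None:
    """Collect every problem, then fail once with the full list."""
    errors = []
    if policy.get("version") != 1:
        errors.append("version: expected 1")
    for section, fields in EXPECTED_SECTIONS.items():
        body = policy.get(section, {})
        if not isinstance(body, dict):
            errors.append(f"{section}: expected a mapping")
            continue
        for key, value in body.items():
            if key not in fields:
                errors.append(f"{section}.{key}: unknown key")
            elif not isinstance(value, fields[key]):
                errors.append(f"{section}.{key}: expected {fields[key].__name__}")
    if errors:
        raise PolicyError("; ".join(errors))
```

Rejecting unknown keys is a deliberate choice in this sketch: a typo such as read_alow would otherwise silently drop a rule instead of stopping the agent.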

Policies are hot-reloadable. Change the YAML, and the agent picks up the new policy on the next operation without a restart. This means security teams can tighten policies in response to incidents without taking the agent offline.
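
One plausible way to get this "next operation" behaviour is to compare the policy file's modification time before each policy lookup. The sketch below assumes that approach; the post does not say whether PureClaw polls, watches the file, or does something else, and PolicyStore is a hypothetical name. The parser is passed in so the example runs with the standard library alone; a real deployment would pass something like PyYAML's yaml.safe_load.

```python
import json
import os
from typing import Callable


class PolicyStore:
    """Hypothetical holder that re-reads the policy file when it changes on disk."""

    def __init__(self, path: str, parse: Callable[[str], dict] = json.loads):
        self._path = path
        self._parse = parse          # assumption: any text -> dict parser
        self._mtime_ns = None        # modification time of the loaded version
        self._policy = None

    def current(self) -> dict:
        """Return the policy, reloading first if the file was modified."""
        mtime_ns = os.stat(self._path).st_mtime_ns
        if mtime_ns != self._mtime_ns:
            with open(self._path, encoding="utf-8") as fh:
                self._policy = self._parse(fh.read())
            self._mtime_ns = mtime_ns
        return self._policy
```

Every enforcement point would call store.current() rather than caching the dict, which is what lets a tightened policy take effect without a restart.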

Module 1: Filesystem ACLs

The filesystem module controls what an agent can read and write. Paths are matched using glob patterns. Deny rules take precedence over allow rules.

This is not optional. Every file operation goes through the policy engine. If a tool tries to read /etc/shadow or write to /usr/bin/, the operation is blocked and an audit event is generated. The agent never sees the denied content.

The glob patterns support ** for recursive matching. A rule like **/.env blocks access to .env files at any depth in the filesystem tree. This prevents agents from reading environment files that often contain API keys and database credentials.
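
The deny-before-allow rule and the ** semantics can be sketched in a few lines of Python. This is an illustration under assumptions, not PureClaw's matcher: glob_to_regex and is_path_allowed are hypothetical names, a single * is assumed to stay within one path component, and paths are normalised lexically only (symlinks are not resolved).

```python
import posixpath
import re

READ_ALLOW = ["/app/**", "/data/**", "/tmp/**"]
READ_DENY = ["/etc/shadow", "**/.env", "**/*secret*", "**/*credential*"]


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a glob into a regex: ** crosses directories, * and ? do not."""
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append("(?:.*/)?")   # zero or more leading directories
            i += 3
        elif pattern.startswith("**", i):
            out.append(".*")         # anything, including slashes
            i += 2
        elif pattern[i] == "*":
            out.append("[^/]*")      # anything within one component
            i += 1
        elif pattern[i] == "?":
            out.append("[^/]")
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out))


def is_path_allowed(path: str, allow: list, deny: list) -> bool:
    """Deny rules win; anything matching no allow rule is denied by default."""
    normalized = posixpath.normpath(path)   # collapses "a/../b" before matching
    if any(glob_to_regex(p).fullmatch(normalized) for p in deny):
        return False
    return any(glob_to_regex(p).fullmatch(normalized) for p in allow)
```

Normalising before matching is what stops a request for /app/../etc/shadow from slipping past the /app/** allow rule.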

Module 2: Network Controls

The network module defines which domains an agent can reach and whether private IP ranges (RFC 1918) are blocked. This guards against Server-Side Request Forgery (SSRF) attacks, where an agent is manipulated into making requests to internal services.

With block_private_ranges: true, an agent cannot be tricked into fetching http://169.254.169.254/ (the cloud metadata endpoint) or http://192.168.1.1/ (your router's admin panel). Note that 169.254.169.254 is a link-local address (RFC 3927) rather than an RFC 1918 one, so the setting covers more than the RFC 1918 ranges alone. Combined with domain allowlists, you define exactly which external services the agent can communicate with.
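
A rough Python sketch of such a check follows. The function names are hypothetical and the logic is an assumption about how the three settings combine. It only inspects IP literals in the URL; a real guard would also have to resolve hostnames and check the resolved addresses, since a public-looking name can point at an internal IP.

```python
import ipaddress
from fnmatch import fnmatchcase
from urllib.parse import urlsplit


def is_internal_address(host: str) -> bool:
    """True for IP literals in private, loopback, or link-local ranges."""
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return False  # not an IP literal; DNS resolution is out of scope here
    return ip.is_private or ip.is_loopback or ip.is_link_local


def is_fetch_allowed(url: str, allow_domains: list, deny_domains: list,
                     block_private_ranges: bool = True) -> bool:
    """Internal-address check first, then deny list, then allow list."""
    host = (urlsplit(url).hostname or "").lower()
    if not host:
        return False
    if block_private_ranges and is_internal_address(host):
        return False
    if any(fnmatchcase(host, pattern) for pattern in deny_domains):
        return False
    return any(fnmatchcase(host, pattern) for pattern in allow_domains)
```

Checking the address before the allowlist means the private-range block still applies when a permissive policy allows every domain with "*".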

Module 3: Tool Governance

The tools module controls which tools an agent can invoke. A wildcard ("*") allows all tools. In production, you restrict this to the specific tools each agent role needs:

tools:
  allowed:
    - "web_search"
    - "read_file"
    - "write_file"
  denied:
    - "shell_command"
    - "subagent"

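An agent configured with this policy can search the web and read and write files but cannot execute arbitrary shell commands or spawn sub-agents. Because the allowed list names specific tools rather than using the wildcard, a tool introduced by a future update is blocked until it is explicitly allowed. The denied list takes precedence, so shell_command and subagent stay blocked even if the allowed list is later widened.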
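
The decision logic for a single tool call can be sketched as follows. is_tool_allowed is a hypothetical name, and treating entries as glob patterns is an assumption; the post only confirms that "*" acts as a wildcard.

```python
from fnmatch import fnmatchcase


def is_tool_allowed(name: str, allowed: list, denied: list) -> bool:
    """Denied entries win; a tool matching no allowed entry is blocked."""
    if any(fnmatchcase(name, pattern) for pattern in denied):
        return False
    return any(fnmatchcase(name, pattern) for pattern in allowed)
```

With the policy above, is_tool_allowed("web_search", ...) passes, while shell_command and any tool not on the allowed list are rejected.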

Module 4: Inference Guards

The inference module controls which models the agent can use, enforces token limits per session, and can lock the system prompt to prevent injection attacks:

inference:
  model_allowlist:
    - "nemotron-super"
    - "claude-sonnet-*"
  max_tokens_per_session: 100000
  system_prompt_immutable: true

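With system_prompt_immutable: true, the agent's system prompt cannot be modified at runtime. This blocks prompt injection attacks in which a malicious input attempts to overwrite the agent's instructions; it does not by itself stop injected text elsewhere in the context from influencing the model, which is where the tool and network restrictions come in. The token limit prevents runaway sessions from consuming unbounded resources.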
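
As an illustration of how these three settings might be enforced per session, here is a small Python sketch. InferenceGuard and PolicyViolation are hypothetical names, and the behaviour on a violation (raising an exception) is an assumption rather than something the post specifies.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


class PolicyViolation(Exception):
    """Raised when a request falls outside the inference policy."""


@dataclass
class InferenceGuard:
    model_allowlist: list
    max_tokens_per_session: int
    system_prompt: str
    system_prompt_immutable: bool = True
    tokens_used: int = 0

    def check_model(self, model: str) -> None:
        """Allow only models matching an allowlist entry (globs permitted)."""
        if not any(fnmatchcase(model, p) for p in self.model_allowlist):
            raise PolicyViolation(f"model not allowed: {model}")

    def charge(self, tokens: int) -> None:
        """Record token usage, refusing any request that would exceed the cap."""
        if self.tokens_used + tokens > self.max_tokens_per_session:
            raise PolicyViolation("session token limit exceeded")
        self.tokens_used += tokens

    def set_system_prompt(self, new_prompt: str) -> None:
        """Reject runtime changes when the prompt is locked."""
        if self.system_prompt_immutable:
            raise PolicyViolation("system prompt is immutable")
        self.system_prompt = new_prompt
```

The budget check happens before the counter is updated, so a rejected request does not consume any of the session's allowance.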

Module 5: Credential Redaction

The credential redaction module is always active. It scans every piece of data that flows through the agent using regex patterns that match known credential formats:

credentials:
  redact_patterns:
    - "sk-ant-[a-zA-Z0-9_-]{20,}"   # Anthropic
    - "sk-[a-zA-Z0-9_-]{20,}"        # OpenAI
    - "AKIA[A-Z0-9]{16}"             # AWS
    - "ghp_[a-zA-Z0-9]{36,}"         # GitHub
    - "AIza[a-zA-Z0-9_-]{30,}"       # Google
    - "eyJ[...]\\.eyJ[...]\\.[...]"  # JWT tokens
  redact_env_vars:
    - "ANTHROPIC_API_KEY"
    - "TELEGRAM_BOT_TOKEN"
    - "DISCORD_BOT_TOKEN"

When a credential is detected in any output, it is replaced with [REDACTED] before it reaches the conversation, log file, or audit trail. The redact_env_vars list ensures that even if a tool reads the process environment, sensitive variables are masked.

This is defence in depth. Even if every other security layer fails and an agent gains access to a credential, the redaction engine keeps it out of every output channel, provided the credential matches one of the configured patterns or listed environment variables.
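
A minimal Python sketch of this kind of redaction pass is shown below. The redact function is hypothetical, the pattern list copies only the fully specified patterns from the policy above (the elided JWT pattern is left out), and masking environment variables by replacing their literal values is an assumption about how redact_env_vars works.

```python
import os
import re

# Fully specified patterns from the example policy; order matters because
# the generic "sk-" pattern would also match Anthropic-style keys.
REDACT_PATTERNS = [
    r"sk-ant-[a-zA-Z0-9_-]{20,}",
    r"sk-[a-zA-Z0-9_-]{20,}",
    r"AKIA[A-Z0-9]{16}",
    r"ghp_[a-zA-Z0-9]{36,}",
    r"AIza[a-zA-Z0-9_-]{30,}",
]
PLACEHOLDER = "[REDACTED]"


def redact(text: str, patterns=REDACT_PATTERNS, env_vars=(), environ=None) -> str:
    """Mask listed environment values first, then known credential formats."""
    environ = os.environ if environ is None else environ
    for name in env_vars:
        value = environ.get(name)
        if value:  # skip unset or empty variables
            text = text.replace(value, PLACEHOLDER)
    for pattern in patterns:
        text = re.sub(pattern, PLACEHOLDER, text)
    return text
```

The environment pass catches secrets with no recognisable shape, such as bot tokens, that a format-based regex would miss.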

Module 6: Audit Trail

Every policy decision generates an audit event. The audit module (audit.py) records:

The operation attempted (file read, network fetch, tool invocation)
The policy rule that matched
Whether the operation was allowed or denied
The full context chain (user request, tool call, result)
Timestamp with microsecond precision

The audit trail is append-only. It cannot be modified or deleted by the agent. This is the compliance record that enterprise deployments require for regulatory reporting, incident investigation, and security audits.
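
To show what one such record could look like, here is a hedged Python sketch that writes events as JSON lines. The function name, field names and file format are all assumptions; they are not taken from audit.py. Opening the file in append mode only prevents this code from overwriting earlier entries; making the trail tamper-proof against the agent itself needs controls outside the process, such as file permissions or a remote log sink.

```python
import json
from datetime import datetime, timezone


def record_audit_event(path: str, operation: str, rule: str,
                       allowed: bool, context: dict) -> dict:
    """Append one policy decision to a JSON-lines audit file and return it."""
    event = {
        # UTC timestamp with microsecond precision
        "timestamp": datetime.now(timezone.utc).isoformat(timespec="microseconds"),
        "operation": operation,      # e.g. "file_read", "network_fetch"
        "rule": rule,                # the policy rule that matched
        "decision": "allow" if allowed else "deny",
        "context": context,          # user request, tool call, result
    }
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(event, sort_keys=True) + "\n")
    return event
```

One JSON object per line keeps the log easy to stream into whatever tooling handles incident investigation or compliance reporting.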

Shipping Permissive, Tightening Per Deployment

PureClaw ships with a permissive default policy: all tools allowed, all domains reachable, broad filesystem access. This lets you get started quickly without fighting the security layer.

Production deployments should tighten policies to match their threat model. A customer support agent needs different permissions than an infrastructure management agent. The YAML-based approach makes this straightforward: copy the policy file, remove the wildcards, add the specific rules your deployment needs.
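
A small helper can make the "remove the wildcards" step harder to forget. The Python sketch below walks a parsed policy and reports catch-all entries in allow-type lists. find_wildcards is a hypothetical name, and both the set of values treated as catch-alls and the rule that allow-type keys contain "allow" are assumptions based on the key names used in this post.

```python
CATCH_ALL = ("*", "**", "/**")


def find_wildcards(policy: dict, prefix: str = "") -> list:
    """Return 'section.key: value' for every catch-all entry in an allow list."""
    findings = []
    for key, value in policy.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            findings.extend(find_wildcards(value, path + "."))
        elif isinstance(value, list) and "allow" in key:
            # deny lists are skipped: a wildcard there is restrictive, not permissive
            findings.extend(f"{path}: {item}" for item in value if item in CATCH_ALL)
    return findings
```

Running this against a policy copied from the permissive default gives a checklist of entries to replace before the agent goes to production.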

Security is not something you add to an agent framework. It is the foundation you build on. PureClaw's policy engine ensures that every agent operates within explicit, auditable, enforceable boundaries.

The full security module source is available on GitHub.
