Skip to content

Securing Coding Agents with SafeAI

AI coding agents can write files, run shell commands, make API calls, and install packages — all autonomously. A single hallucinated command or prompt injection could delete your repo, leak secrets from .env, or push malicious code.

SafeAI enforces security policies on every action your coding agent takes. Setup takes 3 commands.


Supported agents

Agent Recommended setup Alternative
Claude Code MCP server safeai setup claude-code (hooks)
Cursor MCP server safeai setup cursor (hooks)
Windsurf MCP server
GitHub Copilot MCP server
Codex CLI MCP server safeai hook (universal hook)
Replit Agent MCP server Sidecar proxy
Antigravity MCP server safeai hook (universal hook)
VS Code + Continue MCP server
Any MCP-compatible agent MCP server

MCP is the universal approach

Most coding agents now support MCP (Model Context Protocol). Adding SafeAI as an MCP server works across all of them with the same config — no agent-specific setup needed.


Quick setup

Step 1 — Initialize SafeAI

uv pip install safeai
cd your-project
safeai init

The interactive setup walks you through choosing an AI backend:

Intelligence Layer Setup

Enable the intelligence layer? [Y/n]: Y

Choose your AI backend:
  1. Ollama (local, free — no API key needed)
  2. OpenAI
  3. Anthropic
  4. Google Gemini
  5. Mistral
  6. Groq
  ...

Select provider [1]: 1

Intelligence layer configured!

Step 2 — Auto-generate policies

safeai intelligence auto-config --path . --apply

SafeAI analyzes your project structure and generates security policies automatically.

Step 3 — Add SafeAI MCP server to your agent

Add this to your agent's MCP config:

{
  "mcpServers": {
    "safeai": {
      "command": "safeai",
      "args": ["mcp"]
    }
  }
}

That's it. Your coding agent now calls SafeAI's scan_input, guard_output, and intercept_tool tools before every action.


Where to put the MCP config

Each agent reads MCP server config from a different file:

.claude/settings.json
{
  "mcpServers": {
    "safeai": {
      "command": "safeai",
      "args": ["mcp"]
    }
  }
}

Or use the shortcut: safeai setup claude-code

.cursor/mcp.json
{
  "mcpServers": {
    "safeai": {
      "command": "safeai",
      "args": ["mcp"]
    }
  }
}
~/.codeium/windsurf/mcp_config.json
{
  "mcpServers": {
    "safeai": {
      "command": "safeai",
      "args": ["mcp"]
    }
  }
}
.vscode/mcp.json
{
  "mcpServers": {
    "safeai": {
      "command": "safeai",
      "args": ["mcp"]
    }
  }
}
~/.codex/mcp.json
{
  "mcpServers": {
    "safeai": {
      "command": "safeai",
      "args": ["mcp"]
    }
  }
}

In your Replit project, add to .replit:

.replit
[mcp]
[mcp.servers.safeai]
command = "safeai"
args = ["mcp"]

Look for your agent's MCP configuration file and add:

{
  "mcpServers": {
    "safeai": {
      "command": "safeai",
      "args": ["mcp"]
    }
  }
}

MCP tools exposed

Once connected, your coding agent can call these SafeAI tools:

Tool What it does
scan_input Scan text for secrets, PII, and policy violations before the model sees it
guard_output Check model responses for leaked secrets or PII before showing to the user
intercept_tool Validate a tool call (bash, file write, API call) against policies before execution
query_audit Search the audit trail for past decisions
check_approval Check if a pending action has been approved

What gets enforced

Action Example SafeAI response
Destructive command rm -rf ~/projects Blocked
Read private keys cat ~/.ssh/id_ed25519 Blocked
Leak secret in code api_key = "sk-abc123" Blocked
PII in output User email: john@example.com Redacted
Install suspicious package pip install totally-not-malware Blocked
Push to remote git push origin main Held for approval
Write to .env echo "DB_PASS=..." > .env Blocked

Explain blocked actions

When SafeAI blocks something and you want to understand why:

# See recent blocks
safeai logs --action block --last 1h

# Ask the AI to explain
safeai intelligence explain evt_abc123
Classification: DESTRUCTIVE_COMMAND
Severity: HIGH

The agent attempted to run "rm -rf ~/projects" via the bash tool.
This matches the destructive command policy. The command would have
recursively deleted your projects directory.

Improve policies over time

After your coding agent has been running for a while:

safeai intelligence recommend --since 7d --apply

The recommender analyzes your audit trail and suggests policy improvements.


Full example

# One-time setup (30 seconds)
uv pip install safeai
safeai init                                       # interactive setup
safeai intelligence auto-config --path . --apply  # generate policies

# Add MCP config to your agent (same for all agents)
# Then use your coding agent normally — SafeAI enforces on every action.

Alternative: hooks (no MCP)

If your agent doesn't support MCP, SafeAI also supports direct hooks:

# Auto-install hooks for supported agents
safeai setup claude-code
safeai setup cursor

# Or use the universal hook with any agent
echo '{"tool": "bash", "input": {"command": "rm -rf /"}}' | safeai hook
# → {"decision": "block", "reason": "Destructive commands are not allowed."}

Alternative: sidecar proxy (any language)

For agents that can make HTTP calls but don't support MCP or hooks:

safeai serve --mode sidecar --port 8484

Then call the REST API:

curl -X POST http://localhost:8484/v1/intercept/tool \
  -H "content-type: application/json" \
  -d '{"tool": "bash", "input": {"command": "rm -rf /"}, "agent_id": "my-agent"}'

Next steps