Runtime Security for AI Agents
Your AI agents are powerful.
Make them safe.¶
Block secrets before they leak. Redact PII before it escapes. Enforce policies before damage happens. SafeAI intercepts every prompt, tool call, and model response — across any provider, framework, and deployment.
$ pip install safeai-sdk && python -c "from safeai import SafeAI; SafeAI.quickstart()"
The Problem
AI Agents Ship to Production Unprotected¶
Every day, AI agents in production are:¶
- Leaking secrets — API keys, tokens, and credentials embedded in prompts flow straight to LLM providers
- Exposing PII — customer emails, phone numbers, and SSNs appear in model responses, logs, and downstream systems
- Calling unauthorized tools — agents execute shell commands, access databases, and modify files without guardrails
- Running dangerous operations —
rm -rf /,DROP TABLE, force pushes — with no human in the loop
Model safety training doesn't prevent any of this. You need enforcement at runtime, at the exact points where data crosses trust boundaries.
SafeAI: Security enforcement where incidents actually happen¶
SafeAI is a runtime security layer that intercepts every prompt, tool call, agent message, and model response. It enforces deterministic policies — not AI judgment — at three boundaries. Every decision is logged. Every action is auditable. No exceptions.
Quick Start
Two Lines. Full Protection.¶
Now wrap any AI call:
# Block secrets before they reach the model
scan = ai.scan_input("Summarize this: API_KEY=sk-ABCDEF1234567890")
# => BLOCKED: secret detected — never leaves your system
# Redact PII from responses before users see them
guard = ai.guard_output("Contact alice@example.com or call 555-123-4567")
print(guard.safe_output)
# => "Contact [REDACTED] or call [REDACTED]"
Why SafeAI
Built for How AI Agents Actually Fail¶
🔑
A developer pastes credentials into a prompt¶
SafeAI's input boundary detects the secret and blocks the request before it ever reaches the LLM. The incident is logged and an alert fires.
📧
A model response contains customer emails¶
SafeAI's output guard redacts all PII — emails, phone numbers, SSNs, credit cards — before the response reaches your application or your users.
⚡
An agent tries to run rm -rf /¶
SafeAI's action boundary intercepts the tool call and blocks the dangerous command. The agent receives a denial, not a destroyed filesystem.
🤝
A high-risk operation needs human sign-off¶
SafeAI's approval workflow pauses the action and queues it for review. The agent waits for a human to approve or deny — with TTL and deduplication.
Platform
Everything You Need to Secure AI Agents¶
3 Runtime Boundaries
80+ Built-in Detectors
5 AI Advisory Agents
8 Compliance Skill Packs
Intelligence Layer¶
5 AI advisory agents that auto-configure policies, recommend rules, explain incidents, map compliance, and generate integration code. Bring your own model.
Secret Detection¶
API keys, tokens, and credentials blocked before they reach any LLM. Pattern-based and contextual detection across all boundaries.
PII Protection¶
Emails, phone numbers, SSNs, and credit cards redacted or blocked automatically. Nested JSON and file payloads included.
Policy Engine¶
Priority-based YAML rules with tag hierarchies, boundary-specific matching, hot reload, and schema validation.
Tool Contracts¶
Declare what each tool accepts and emits. Undeclared tools are denied. Per-field stripping and response filtering built in.
Agent Identity¶
Bind agents to tools and clearance levels. Cross-agent messaging is policy-gated. Zero-trust by default.
Approval Workflows¶
Human-in-the-loop gates for high-risk operations. Persistent queue with TTL, deduplication, and approve/deny flow.
Encrypted Memory¶
Schema-enforced agent memory with field-level encryption and automatic expiry. Agents never access raw protected data.
Capability Tokens¶
Scoped, time-limited tokens for secret access. Session-bound. Env, Vault, and AWS backends supported.
Audit Trail¶
Every decision logged with context hash. Filter by agent, action, boundary, tool, session, and time range.
Dangerous Commands¶
Blocks rm -rf /, DROP TABLE, fork bombs, pipe-to-shell, and force pushes before they execute.
Content Moderation¶
80+ local detectors for toxicity, prompt injection, jailbreaks, and topic restrictions. No external API calls.
Cost Governance¶
Track token usage, enforce budgets with hard blocks and alerts, extract costs from OpenAI, Anthropic, and Google responses.
Multi-Provider Routing¶
4 routing strategies (priority, cost, latency, round-robin) with circuit-breaker failover across providers.
Skills System¶
Install GDPR, HIPAA, PCI-DSS, prompt injection shields, and more from the registry. One command, full compliance.
Alerting & Observability¶
File, webhook, Slack, email, PagerDuty, and Opsgenie channels. Prometheus metrics. Real-time WebSocket streaming.
Architecture
How SafeAI Works¶
Users / Apps / Agents
│
▼
┌──────────────────────┐
│ Input Boundary │ scan_input · scan_structured_input · scan_file_input
│ Detect + classify │ secrets, PII, policy tags, nested payloads, files
└──────────┬───────────┘
│
▼
┌──────────────────────┐
│ Core Policy Plane │ deterministic rules, hot reload, boundary-specific matching
└───────┬───────┬──────┘
│ │
│ └──────────────┐
▼ ▼
┌──────────────────────┐ ┌──────────────────────┐
│ Action Boundary │ │ Output Boundary │
│ intercept tool/API │ │ guard_output │
│ contracts, identity │ │ redact / block / │
│ approvals, secrets │ │ fallback response │
└──────────┬───────────┘ └──────────┬───────────┘
│ │
▼ ▼
Tools / APIs / Agents Users / Apps / Agents
Deployment: SDK · CLI · HTTP proxy · MCP server · dashboard
Design Guarantees¶
- Deterministic enforcement. AI does not make security decisions — deterministic policy rules do.
- Policy-as-data. Versioned YAML, not hidden logic in application code.
- Framework-agnostic. Thin adapters around a single enforcement core.
- Full auditability. Every boundary crossing is logged and queryable.
- Human approval. High-risk operations require human sign-off without exposing protected data to AI.
Full architecture documentation
In Action
See It Work¶
Securing OpenClaw with SafeAI
A complete walkthrough running SafeAI as a sidecar alongside OpenClaw — an open-source personal AI assistant with shell access, file system permissions, and cross-platform messaging.
Covers: secret detection, PII protection, tool contracts, dangerous command blocking, structured payload scanning, audit logging, and proxy deployment — all without modifying OpenClaw's source code.
SDK Quick Start Examples
Progressive examples: input scanning, output guarding, structured payloads, file scanning, memory operations, API tiers, typed results, and content moderation.
Proxy & API Deployment
Deploy SafeAI as an HTTP proxy — health checks, scanning endpoints, approval workflows, dashboard, WebSocket streaming, and OpenAPI documentation.
Works With Everything¶
| AI Providers | Agent Frameworks | Coding Agents | Deployment |
|---|---|---|---|
| OpenAI | LangChain | Claude Code | Python SDK |
| Google Gemini | CrewAI | Cursor | REST API (sidecar) |
| Anthropic Claude | AutoGen | Copilot | Gateway proxy |
| Ollama | Google ADK | Any MCP client | MCP server |
| Any HTTP API | Claude ADK | CLI hooks |
Install SafeAI¶
With extras: