Overview
Safety Engine provides content filtering and policy enforcement for AI agents. It controls what goes into agents (user input, system prompt, context, chat history) and what comes out (agent responses, tool outputs) by applying policies that detect and handle sensitive content. Policies work seamlessly across synchronous, asynchronous, and streaming execution modes.

Why You Should Use Safety Policies
Safety policies are essential for protecting sensitive information and ensuring compliance with your organization’s security and privacy requirements. When you send data to LLM providers, that data may be used for training, stored in logs, or processed in ways that could expose sensitive information. Safety policies act as a critical first line of defense by:

- Preventing Data Leaks: Stop sensitive information like PII, financial data, or confidential business information from being sent to LLM providers
- Ensuring Compliance: Meet regulatory requirements (GDPR, HIPAA, PCI DSS, etc.) by automatically detecting and handling sensitive content
- Enforcing Company Policies: Automatically apply your organization’s content safety rules across all agent interactions
- Maintaining Control: Track and monitor what content is being filtered, giving you visibility into safety policy enforcement
- LLM-Agnostic Protection: Once created, your policies work with any LLM provider, ensuring consistent safety regardless of the underlying model
Key Features
- Policy Points: Apply policies at different stages of the agent pipeline
- User Inputs (description, context, system prompt, chat history)
- Agent Outputs
- Tool Interactions (pre-registration and post-execution)
- Policy Scope: Fine-grained control over which parts of the input get sanitized (apply_to_description, apply_to_context, apply_to_system_prompt, apply_to_chat_history, apply_to_tool_outputs)
- Pre-built Policies: Ready-to-use policies for PII, financial data, medical info, phone numbers, adult content, hate speech, profanity, and more
- Custom Policies: Create your own rules and actions
- Action Types: Block, anonymize (unique random placeholders), replace (fixed placeholders), or raise exceptions
- Streaming Support: Full policy support in both event-based and pure text streaming modes with real-time de-anonymization
- Async Support: All policies work with the print_do, print_do_async, stream, and astream methods
- Multi-language Support: Automatically adapts to the user’s language
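To make the action types above concrete, here is a plain-Python sketch of how each one could treat an email address. The placeholder formats, the blocked-content message, and the function names are illustrative assumptions, not the engine's actual behavior:

```python
import re
import uuid

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def block(text):
    # Block: stop processing and return a refusal instead of the content
    return "Content blocked by safety policy" if EMAIL.search(text) else text

def anonymize(text):
    # Anonymize: each match gets a unique random placeholder (reversible)
    mapping = {}
    def sub(m):
        token = f"<EMAIL_{uuid.uuid4().hex[:8]}>"
        mapping[token] = m.group(0)
        return token
    return EMAIL.sub(sub, text), mapping

def replace(text):
    # Replace: every match becomes the same fixed placeholder (not reversible)
    return EMAIL.sub("[REDACTED]", text)

def raise_exception(text):
    # Raise: surface a policy violation as an exception for the caller to handle
    if EMAIL.search(text):
        raise ValueError("safety policy violation")
    return text
```

The key difference between anonymize and replace is reversibility: only anonymize keeps a mapping, which is what makes the de-anonymization described below possible.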
How Anonymization Works
When you use an Anonymize action (like PIIAnonymizePolicy), the Safety Engine performs a fully reversible transformation:
- Detection: The rule scans your input for sensitive content (emails, phone numbers, etc.)
- Anonymization: Detected values are replaced with unique random placeholders before sending to the LLM
- LLM Processing: The LLM receives and processes only the anonymized content — your real data never leaves your environment
- De-anonymization: The agent’s response is mapped back to the original values before returning to you
Example
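A minimal, self-contained sketch of the four-step round trip described above, using a regex email rule and a stand-in model call. The placeholder format and function names are illustrative, not the engine's API:

```python
import re
import uuid

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def anonymize(text):
    # Steps 1-2: detect sensitive values and swap in unique random placeholders
    mapping = {}
    def sub(match):
        token = f"<PII_{uuid.uuid4().hex[:8]}>"
        mapping[token] = match.group(0)
        return token
    return EMAIL.sub(sub, text), mapping

def deanonymize(text, mapping):
    # Step 4: map placeholders in the response back to the original values
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text

def fake_llm(prompt):
    # Step 3: the model only ever sees the anonymized prompt
    return f"Sure, I will email {prompt.split()[-1]} today."

safe_prompt, mapping = anonymize("Send a reminder to alice@example.com")
reply = deanonymize(fake_llm(safe_prompt), mapping)
# reply names the real address again, yet the model never received it
```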
Streaming
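As an illustration of the idea (not the engine's internals), a streamed response can be de-anonymized on the fly by holding back just enough text to catch a placeholder that spans two chunks. The <PII_...> placeholder format is an assumption carried over from the anonymization example:

```python
import re

# Assumed placeholder shape: "<PII_" + 8 hex chars + ">" (14 chars total)
PLACEHOLDER = re.compile(r"<PII_[0-9a-f]{8}>")

def deanonymize_stream(chunks, mapping):
    # Buffer output so a placeholder split across chunk boundaries is still replaced
    buf = ""
    for chunk in chunks:
        buf += chunk
        buf = PLACEHOLDER.sub(lambda m: mapping.get(m.group(0), m.group(0)), buf)
        # Hold back the longest possible incomplete placeholder prefix (13 chars)
        safe, buf = buf[:-13], buf[-13:]
        if safe:
            yield safe
    if buf:
        yield buf
```

Because only a 13-character tail is ever withheld, the caller still sees output in near real time while placeholders are restored as soon as they complete.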
Policies work seamlessly with streaming: de-anonymization happens token by token in real time.

Async
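The same anonymize-then-restore round trip applies unchanged in async contexts. A sketch with asyncio and a stand-in awaited model call; the names here are illustrative, not the engine's API:

```python
import asyncio
import re
import uuid

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def anonymize(text):
    # Swap detected values for unique random placeholders, keeping a mapping
    mapping = {}
    def sub(m):
        token = f"<PII_{uuid.uuid4().hex[:8]}>"
        mapping[token] = m.group(0)
        return token
    return EMAIL.sub(sub, text), mapping

async def call_model(prompt):
    # Stand-in for an awaited LLM call; it sees only the anonymized text
    await asyncio.sleep(0)
    return f"Replying to {prompt.split()[-1]}"

async def safe_run(text):
    safe, mapping = anonymize(text)
    reply = await call_model(safe)
    for token, original in mapping.items():
        reply = reply.replace(token, original)
    return reply

asyncio.run(safe_run("Ping bob@example.com"))
# → "Replying to bob@example.com"
```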
Navigation
- Policy Points - Learn where and when to apply safety policies in your agent
- Pre-built Policies - Ready-to-use policies for PII, adult content, hate speech, profanity, and more
- Custom Policy - Create your own safety policies with custom rules and actions
- Creating Rules - Define custom detection rules for content filtering
- Creating Actions - Configure actions for policy violations
- Policy Feedback Loop - LLM-driven feedback for policy violations with retry capabilities

