Overview

The Policy Feedback Loop enables LLM-generated feedback when policy violations occur, allowing agents to self-correct their outputs through retry loops.

User Policy Feedback

Give users constructive guidance instead of hard blocking:
from upsonic import Agent, Task
from upsonic.safety_engine.policies.crypto_policies import CryptoBlockPolicy

agent = Agent(
    "anthropic/claude-sonnet-4-5",
    user_policy=CryptoBlockPolicy,
    user_policy_feedback=True,
)

task = Task(
    description="How can I buy Bitcoin and invest in cryptocurrency?"
)

result = agent.print_do(task)
print("Result:", result)
# Instead of a hard block, user receives helpful feedback explaining the restriction
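The behavioral difference can be sketched in plain Python. This is an illustrative sketch, not Upsonic internals: `violates_policy`, `handle_user_input`, and the message strings are hypothetical stand-ins for what `CryptoBlockPolicy` and the feedback layer do.

```python
# Illustrative sketch (NOT Upsonic internals): hard blocking vs. feedback mode.
# `violates_policy` is a hypothetical stand-in for CryptoBlockPolicy's detection.

def violates_policy(text: str) -> bool:
    # Naive keyword check standing in for the real policy's detection logic.
    lowered = text.lower()
    return "bitcoin" in lowered or "cryptocurrency" in lowered

def handle_user_input(text: str, feedback_enabled: bool) -> str:
    if violates_policy(text):
        if feedback_enabled:
            # Feedback mode: the user gets constructive guidance.
            return ("I can't help with cryptocurrency purchases here. "
                    "Consider asking about general personal-finance topics instead.")
        # Hard block: the request is rejected with no guidance.
        return "Request blocked by policy."
    return "proceed"

print(handle_user_input("How can I buy Bitcoin?", feedback_enabled=True))
```

With `feedback_enabled=True` the caller receives an explanation they can act on; with it off, they only learn that the request was rejected.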

Agent Policy Feedback

Enable agents to self-correct when their output violates policies. The agent retries until its output is compliant:
from upsonic import Agent, Task
from upsonic.safety_engine.policies.crypto_policies import CryptoBlockPolicy

agent = Agent(
    "anthropic/claude-sonnet-4-5",
    agent_policy=CryptoBlockPolicy,
    agent_policy_feedback=True,
    agent_policy_feedback_loop=3,  # Allow up to 3 retries
    debug=True
)

task = Task(
    description="Write a comprehensive guide about all investment types: stocks, bonds, real estate, cryptocurrency, and commodities"
)

# Since only agent_policy is set (not user_policy), the input is NOT checked.
# What happens under the hood:
# 1. Agent generates a guide including "Cryptocurrency: Bitcoin, Ethereum..."
# 2. Crypto policy detects violation on OUTPUT → feedback sent back to agent
# 3. Agent retries without crypto section → only stocks, bonds, real estate, commodities
# 4. Crypto policy passes → compliant output returned
result = agent.print_do(task)
print("Result:", result)