Task Response Caching

Enable caching to store and reuse task responses for similar inputs, reducing API costs and improving performance.

Quick Start

from upsonic import Agent, Task

agent = Agent(model="anthropic/claude-sonnet-4-5")
task = Task(
    description="Summarize this article about AI advancements",
    enable_cache=True
)

agent.print_do(task)

Cache Methods

Vector Search (Default)

Uses semantic similarity to find cached responses for similar inputs.

from upsonic import Agent, Task

agent = Agent(model="anthropic/claude-sonnet-4-5")
task = Task(
    description="Analyze this text about climate change",
    enable_cache=True,
    cache_method="vector_search",
    cache_threshold=0.8  # Higher = stricter matching
)

agent.print_do(task)

LLM Call

Uses an LLM to determine if cached responses are applicable.

from upsonic import Agent, Task

agent = Agent(model="anthropic/claude-sonnet-4-5")
task = Task(
    description="Analyze this text about renewable energy",
    enable_cache=True,
    cache_method="llm_call"
)

agent.print_do(task)

Configuration Options

Parameter	Type	Default	Description
`enable_cache`	bool	`False`	Enable/disable caching
`cache_method`	str	`"vector_search"`	`"vector_search"` or `"llm_call"`
`cache_threshold`	float	`0.7`	Similarity threshold (0.0-1.0)
`cache_duration_minutes`	int	`60`	Cache expiration time
`cache_embedding_provider`	Any	Auto-detected	Custom embedding provider

Full Example

from upsonic import Agent, Task

agent = Agent(model="anthropic/claude-sonnet-4-5")
task = Task(
    description="Explain quantum computing",
    enable_cache=True,
    cache_method="vector_search",
    cache_threshold=0.75,
    cache_duration_minutes=120
)

agent.print_do(task)

# Check if response came from cache
cache_stats = task.get_cache_stats()
if cache_stats.get('cache_hit'):
    print("Response retrieved from cache!")
else:
    print("Fresh response generated")

Task Cache Methods

get_cache_stats()

Get cache statistics including hit rate and configuration.

stats = task.get_cache_stats()
print(f"Hit rate: {stats['hit_rate']}")
print(f"Total entries: {stats['total_entries']}")

Returns:

total_entries: Number of cached entries
cache_hits: Number of cache hits
cache_misses: Number of cache misses
hit_rate: Cache hit rate (0.0-1.0)
cache_method: Current cache method
cache_threshold: Current threshold
cache_hit: Whether last request was a cache hit
session_id: Current session ID

clear_cache()

Clear all cache entries.

task.clear_cache()

Best Practices

Threshold Tuning: Start with 0.7, increase for stricter matching
Duration: Set based on how often your data changes
Method Choice: Use vector_search for speed, llm_call for accuracy
Embedding Provider: Auto-detected if not specified

​Quick Start

​Cache Methods

​Vector Search (Default)

​LLM Call

​Configuration Options

​Full Example

​Task Cache Methods

​get_cache_stats()

​clear_cache()

​Best Practices