Policies

OmniCache-AI provides two policy dataclasses that control cache expiration and eviction behavior: TTLPolicy for time-based expiration and EvictionPolicy for capacity-based eviction.


Overview

Caching without expiration leads to stale data; caching without eviction leads to unbounded memory growth. The policy system gives you fine-grained control over both:

  • TTLPolicy lets you set a global default TTL and override it per cache type (embeddings, retrievals, contexts, responses).
  • EvictionPolicy defines how entries are removed when capacity limits are reached (LRU or TTL-only).

Both are plain Python dataclasses -- lightweight, easy to construct, and easy to serialize.
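
Because both policies are dataclasses, the standard library can round-trip them through JSON. The sketch below uses stand-in classes that mirror the documented fields instead of importing omnicache_ai, so it runs on its own. With the real classes, `dataclasses.asdict` should behave the same way, assuming they are ordinary dataclasses as described above.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Literal, Optional


# Stand-ins that mirror the documented fields (not the library's own classes).
@dataclass
class TTLPolicy:
    default_ttl: Optional[int] = 3600
    per_type: dict[str, Optional[int]] = field(default_factory=dict)


@dataclass
class EvictionPolicy:
    strategy: Literal["lru", "ttl_only"] = "lru"
    max_entries: Optional[int] = None
    max_bytes: Optional[int] = None


ttl = TTLPolicy(default_ttl=1800, per_type={"embedding": 86400})
eviction = EvictionPolicy(max_entries=10_000)

# asdict() turns each dataclass into a plain dict that json can encode.
payload = json.dumps({"ttl": asdict(ttl), "eviction": asdict(eviction)})

# Rebuild both policies from the decoded JSON.
decoded = json.loads(payload)
restored_ttl = TTLPolicy(**decoded["ttl"])
restored_eviction = EvictionPolicy(**decoded["eviction"])
```

A `None` TTL survives the round trip as JSON `null`, so a "no expiry" setting is preserved.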


TTLPolicy

What It Does

TTLPolicy determines how long cached entries survive. It supports a global default and per-cache-type overrides, so embeddings (which rarely change) can live for 24 hours while LLM responses (which may be volatile) expire in 10 minutes.

Usage

Basic TTL Configuration

from omnicache_ai.core.policies import TTLPolicy

# Default: 1 hour for everything
policy = TTLPolicy()
print(policy.ttl_for("response")) # 3600

# Custom default
policy = TTLPolicy(default_ttl=1800)
print(policy.ttl_for("anything")) # 1800

# No expiry
policy = TTLPolicy(default_ttl=None)
print(policy.ttl_for("response")) # None

Per-Type Overrides

policy = TTLPolicy(
    default_ttl=3600,
    per_type={
        "embedding": 86400,  # 24 hours
        "retrieval": 3600,   # 1 hour
        "context": 1800,     # 30 minutes
        "response": 600,     # 10 minutes
    },
)

print(policy.ttl_for("embedding")) # 86400
print(policy.ttl_for("response")) # 600
print(policy.ttl_for("custom")) # 3600 (falls back to default)

From Settings

from omnicache_ai.config.settings import OmnicacheSettings
from omnicache_ai.core.policies import TTLPolicy

settings = OmnicacheSettings(
    default_ttl=7200,
    ttl_embedding=172800,
    ttl_response=300,
)
policy = TTLPolicy.from_settings(settings)

print(policy.ttl_for("embedding")) # 172800
print(policy.ttl_for("response")) # 300
print(policy.ttl_for("retrieval")) # 3600 (default value of settings.ttl_retrieval, not default_ttl)

Recommended TTLs by cache type

Cache Type  Recommended TTL  Rationale
embedding   24h (86400s)     Embeddings rarely change for the same input text
retrieval   1h (3600s)       Retrieved documents may be updated periodically
context     30min (1800s)    Agent context is session-scoped and evolves quickly
response    10min (600s)     LLM responses may vary and should stay fresh
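
Raw second counts are easy to mistype. As a small sketch, the recommended values can be written as `datetime.timedelta` durations and converted to the integer seconds that `per_type` expects. The `RECOMMENDED` name is illustrative only and not part of the library.

```python
from datetime import timedelta

# Recommended TTLs from the table above, written as durations.
RECOMMENDED = {
    "embedding": timedelta(hours=24),
    "retrieval": timedelta(hours=1),
    "context": timedelta(minutes=30),
    "response": timedelta(minutes=10),
}

# TTLs are configured in whole seconds, so convert before building per_type.
per_type = {name: int(delta.total_seconds()) for name, delta in RECOMMENDED.items()}
print(per_type)
# {'embedding': 86400, 'retrieval': 3600, 'context': 1800, 'response': 600}
```

The resulting dict can be passed as `per_type=per_type` when constructing a TTLPolicy.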

API Reference

Constructor

@dataclass
class TTLPolicy:
    default_ttl: int | None = 3600
    per_type: dict[str, int | None] = field(default_factory=dict)

Field        Type                   Default  Description
default_ttl  int | None             3600     Fallback TTL in seconds; None means no expiry
per_type     dict[str, int | None]  {}       Cache type to TTL mapping (overrides default_ttl)

Methods

Method         Signature                Returns     Description
ttl_for        ttl_for(cache_type)      int | None  Return the effective TTL for the given cache type
from_settings  from_settings(settings)  TTLPolicy   Classmethod: build from an OmnicacheSettings instance

ttl_for(cache_type)

Returns the TTL for a specific cache type. Checks per_type first; if the type is not found, returns default_ttl.

Parameter   Type  Description
cache_type  str   The cache type to look up (e.g., "response", "embedding")
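
The lookup order described above can be sketched with a stand-in class. `TTLPolicySketch` is hypothetical and only mirrors the documented fields. How the real class handles an explicit `None` override is an assumption here.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TTLPolicySketch:
    """Stand-in for TTLPolicy; illustrates the documented lookup order only."""

    default_ttl: Optional[int] = 3600
    per_type: dict[str, Optional[int]] = field(default_factory=dict)

    def ttl_for(self, cache_type: str) -> Optional[int]:
        # Membership test: an explicit None override is returned as-is rather
        # than falling through to default_ttl, which is what
        # `per_type.get(cache_type) or default_ttl` would do.
        # (Assumed behavior; the real class may differ.)
        if cache_type in self.per_type:
            return self.per_type[cache_type]
        return self.default_ttl


policy = TTLPolicySketch(default_ttl=3600, per_type={"response": 600, "embedding": None})
print(policy.ttl_for("response"))   # 600 (per-type override)
print(policy.ttl_for("embedding"))  # None (override disables expiry)
print(policy.ttl_for("custom"))     # 3600 (falls back to default_ttl)
```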

from_settings(settings)

Builds a TTLPolicy from an OmnicacheSettings instance, mapping settings fields to per-type TTLs:

Settings Field          Cache Type
settings.default_ttl    default_ttl
settings.ttl_embedding  "embedding"
settings.ttl_retrieval  "retrieval"
settings.ttl_context    "context"
settings.ttl_response   "response"
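
A minimal sketch of this mapping, using hypothetical stand-ins for both classes. The default values on `SettingsSketch` are assumptions chosen to match the recommended TTLs, not confirmed library defaults.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SettingsSketch:
    """Hypothetical stand-in for OmnicacheSettings; the defaults are assumed."""

    default_ttl: Optional[int] = 3600
    ttl_embedding: Optional[int] = 86400
    ttl_retrieval: Optional[int] = 3600
    ttl_context: Optional[int] = 1800
    ttl_response: Optional[int] = 600


@dataclass
class TTLPolicySketch:
    """Stand-in for TTLPolicy with only the classmethod under discussion."""

    default_ttl: Optional[int] = 3600
    per_type: dict[str, Optional[int]] = field(default_factory=dict)

    @classmethod
    def from_settings(cls, settings: SettingsSketch) -> "TTLPolicySketch":
        # One per_type entry per settings field, following the table above.
        return cls(
            default_ttl=settings.default_ttl,
            per_type={
                "embedding": settings.ttl_embedding,
                "retrieval": settings.ttl_retrieval,
                "context": settings.ttl_context,
                "response": settings.ttl_response,
            },
        )


policy = TTLPolicySketch.from_settings(SettingsSketch(default_ttl=7200, ttl_response=300))
print(policy.default_ttl)            # 7200
print(policy.per_type["response"])   # 300
print(policy.per_type["retrieval"])  # 3600 (the field's own default)
```

Under this mapping, every known cache type gets an explicit `per_type` entry. Changing `default_ttl` in the settings therefore only affects cache types outside the four listed.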

EvictionPolicy

What It Does

EvictionPolicy configures how entries are removed when capacity limits are reached. It supports LRU (least-recently-used) eviction or TTL-only expiry (no active eviction).

Note: Eviction enforcement depends on backend support, and not all backends implement capacity-based eviction. EvictionPolicy is a configuration object; the backend decides how to honor it.
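
To make "the backend decides how to honor it" concrete, here is a sketch of how a backend could enforce `max_entries` under the LRU strategy. `TinyLRUStore` and `EvictionPolicySketch` are hypothetical and not part of omnicache_ai. The real backends may enforce limits differently or not at all.

```python
from collections import OrderedDict
from dataclasses import dataclass
from typing import Any, Literal, Optional


@dataclass
class EvictionPolicySketch:
    """Stand-in mirroring two of the documented EvictionPolicy fields."""

    strategy: Literal["lru", "ttl_only"] = "lru"
    max_entries: Optional[int] = None


class TinyLRUStore:
    """Hypothetical backend that reads its limits from the policy object."""

    def __init__(self, policy: EvictionPolicySketch) -> None:
        self._policy = policy
        self._data = OrderedDict()  # insertion order doubles as recency order

    def get(self, key: str) -> Any:
        if key not in self._data:
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def set(self, key: str, value: Any) -> None:
        self._data[key] = value
        self._data.move_to_end(key)
        limit = self._policy.max_entries
        # Only the "lru" strategy evicts; "ttl_only" leaves entries in place.
        if self._policy.strategy == "lru" and limit is not None:
            while len(self._data) > limit:
                self._data.popitem(last=False)  # drop least recently used

    def keys(self) -> list:
        return list(self._data)


store = TinyLRUStore(EvictionPolicySketch(strategy="lru", max_entries=2))
store.set("a", 1)
store.set("b", 2)
store.get("a")       # "a" becomes the most recently used entry
store.set("c", 3)    # over capacity, so "b" is evicted
print(store.keys())  # ['a', 'c']
```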

Usage

from omnicache_ai.core.policies import EvictionPolicy

# LRU with a max of 10,000 entries
policy = EvictionPolicy(
    strategy="lru",
    max_entries=10_000,
)

# TTL-only (no active eviction, entries expire naturally)
policy = EvictionPolicy(
    strategy="ttl_only",
)

# LRU with both entry and byte limits
policy = EvictionPolicy(
    strategy="lru",
    max_entries=50_000,
    max_bytes=500 * 1024 * 1024,  # 500 MB
)

Strategies

LRU (strategy="lru"): least-recently-used eviction removes the entries that were accessed longest ago when capacity is exceeded.

policy = EvictionPolicy(
    strategy="lru",
    max_entries=10_000,
)

Best for: general-purpose caching where recent entries are more valuable.

TTL-only (strategy="ttl_only"): no active eviction. Entries are only removed when they expire or are explicitly deleted.

policy = EvictionPolicy(
    strategy="ttl_only",
)

Best for: caches with well-defined TTLs and predictable memory usage.

Warning: Using strategy="ttl_only" without setting TTLs on your entries can cause unbounded memory growth. Always pair it with a TTLPolicy that has a finite default_ttl.
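
A configuration check can catch this combination before a cache is built. The helper below is a sketch: `types_without_expiry` and both stand-in classes are hypothetical, and the library is not known to ship such a check.

```python
from dataclasses import dataclass, field
from typing import Literal, Optional


@dataclass
class TTLPolicySketch:
    default_ttl: Optional[int] = 3600
    per_type: dict[str, Optional[int]] = field(default_factory=dict)


@dataclass
class EvictionPolicySketch:
    strategy: Literal["lru", "ttl_only"] = "lru"
    max_entries: Optional[int] = None


def types_without_expiry(ttl, eviction, cache_types):
    """List cache types that never expire under a ttl_only strategy."""
    if eviction.strategy != "ttl_only":
        return []  # this check only covers the ttl_only case from the warning
    never_expire = []
    for cache_type in cache_types:
        # Effective TTL: the per-type entry if present, else the default.
        effective = ttl.per_type.get(cache_type, ttl.default_ttl)
        if effective is None:
            never_expire.append(cache_type)
    return never_expire


ttl = TTLPolicySketch(default_ttl=None, per_type={"response": 600})
eviction = EvictionPolicySketch(strategy="ttl_only")
print(types_without_expiry(ttl, eviction, ["embedding", "response"]))  # ['embedding']
```

An empty result means every listed cache type has a finite TTL, so entries will eventually expire even without active eviction.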

API Reference

Constructor

@dataclass
class EvictionPolicy:
    strategy: Literal["lru", "ttl_only"] = "lru"
    max_entries: int | None = None
    max_bytes: int | None = None

Field        Type                Default  Description
strategy     "lru" | "ttl_only"  "lru"    Eviction strategy
max_entries  int | None          None     Maximum number of entries before eviction; None means unlimited
max_bytes    int | None          None     Maximum total size in bytes; None means unlimited (backend-dependent)

Combining Policies

TTL and eviction policies work together. TTL controls when individual entries expire; eviction controls what happens when capacity limits are hit.

from omnicache_ai.backends.memory_backend import InMemoryBackend
from omnicache_ai.core.cache_manager import CacheManager
from omnicache_ai.core.key_builder import CacheKeyBuilder
from omnicache_ai.core.policies import TTLPolicy, EvictionPolicy

ttl = TTLPolicy(
    default_ttl=3600,
    per_type={"embedding": 86400, "response": 600},
)

eviction = EvictionPolicy(
    strategy="lru",
    max_entries=50_000,
)

# CacheManager receives the TTL policy directly; the eviction limit is
# applied through the backend's max_size argument.
manager = CacheManager(
    backend=InMemoryBackend(max_size=eviction.max_entries),
    key_builder=CacheKeyBuilder(),
    ttl_policy=ttl,
)

Next Steps