
CacheManager

The CacheManager is the central orchestrator for all cache operations in OmniCache-AI. It wires together a storage backend, key builder, TTL policy, optional vector backend, and invalidation engine into a single, coherent interface.


Overview

Every cache read, write, deletion, and invalidation passes through CacheManager. It is responsible for:

  • Exact-match lookups via the primary CacheBackend.
  • Semantic lookups via an optional VectorBackend (cosine similarity search).
  • TTL resolution using TTLPolicy to determine expiration per cache type.
  • Tag registration so groups of keys can be invalidated together.
  • Lifecycle management (clearing all entries, releasing resources).

You can construct a CacheManager manually by passing in each component, or use the from_settings() factory to build one from an OmnicacheSettings dataclass.


Usage

Manual Construction

from omnicache_ai.backends.memory_backend import InMemoryBackend
from omnicache_ai.core.cache_manager import CacheManager
from omnicache_ai.core.key_builder import CacheKeyBuilder
from omnicache_ai.core.policies import TTLPolicy

manager = CacheManager(
    backend=InMemoryBackend(),
    key_builder=CacheKeyBuilder(namespace="myapp"),
    ttl_policy=TTLPolicy(default_ttl=600),
)

From Settings

from omnicache_ai import CacheManager, OmnicacheSettings

settings = OmnicacheSettings(
    backend="redis",
    redis_url="redis://localhost:6379/0",
    namespace="prod",
)
manager = CacheManager.from_settings(settings)

Basic Operations

# Store a value with explicit TTL
manager.set("user:123:profile", {"name": "Alice"}, ttl=300)

# Retrieve
profile = manager.get("user:123:profile")

# Check existence
if manager.exists("user:123:profile"):
    print("Cache hit")

# Delete a single key
manager.delete("user:123:profile")

# Flush everything
manager.clear()

Tag-Based Invalidation

# Store entries with tags
manager.set("resp:a1b2", "Answer A", tags=["user:123", "model:gpt-4"])
manager.set("resp:c3d4", "Answer B", tags=["user:123", "model:gpt-4"])
manager.set("resp:e5f6", "Answer C", tags=["user:456", "model:gpt-4"])

# Invalidate all entries for user:123
removed = manager.invalidate("user:123")
print(f"Removed {removed} keys") # 2

Semantic Cache Lookup

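import numpy as np
from omnicache_ai.backends.memory_backend import InMemoryBackend
from omnicache_ai.backends.vector_backend import FAISSBackend
from omnicache_ai.core.cache_manager import CacheManager
from omnicache_ai.core.key_builder import CacheKeyBuilder

manager = CacheManager(
    backend=InMemoryBackend(),
    key_builder=CacheKeyBuilder(),
    vector_backend=FAISSBackend(dim=1536),
    semantic_threshold=0.92,
)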

embedding = np.random.rand(1536).astype(np.float32)
manager.set("embed:abc", "cached answer", vector=embedding)

# Semantic lookup -- finds nearest neighbor above threshold
query_vector = embedding + np.random.normal(0, 0.01, 1536).astype(np.float32)
result = manager.get("ignored", semantic=True, vector=query_vector)

When to use semantic lookup

Semantic caching is most useful for LLM response caching, where slightly different prompts should return the same cached answer. Set semantic_threshold between 0.90 and 0.98: higher values require closer matches, lower values accept looser ones.
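
To make the threshold concrete, here is a minimal NumPy sketch of the cosine-similarity comparison that a semantic hit depends on. The helper names cosine_similarity and is_semantic_hit are illustrative and are not part of OmniCache-AI; the real vector backends may compute or normalize scores differently.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two non-zero vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def is_semantic_hit(query: np.ndarray, cached: np.ndarray, threshold: float = 0.95) -> bool:
    """Hypothetical helper: a hit means the similarity is at or above the threshold."""
    return cosine_similarity(query, cached) >= threshold

rng = np.random.default_rng(seed=0)
cached = rng.random(1536).astype(np.float32)

# A lightly perturbed copy stands in for a near-duplicate prompt.
near = cached + rng.normal(0, 0.01, 1536).astype(np.float32)

# A zero-mean random vector stands in for an unrelated prompt.
unrelated = rng.standard_normal(1536).astype(np.float32)

print(is_semantic_hit(near, cached, threshold=0.92))       # True
print(is_semantic_hit(unrelated, cached, threshold=0.92))  # False
```

Raising the threshold toward 0.98 narrows the set of queries that count as "the same prompt"; lowering it toward 0.90 raises the hit rate at the risk of returning an answer for a question that only looks similar.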


Factory: from_settings

The from_settings() classmethod constructs a fully-wired CacheManager from an OmnicacheSettings instance. It handles backend selection, vector backend initialization, key builder configuration, and TTL policy setup automatically.

from omnicache_ai import CacheManager, OmnicacheSettings

# From environment variables
manager = CacheManager.from_settings(OmnicacheSettings.from_env())

# Programmatic
manager = CacheManager.from_settings(OmnicacheSettings(
    backend="disk",
    disk_path="/data/cache",
    vector_backend="faiss",
    embedding_dim=1536,
    semantic_threshold=0.93,
))

The factory selects backends based on the backend and vector_backend fields:

| settings.backend | Backend class |
| --- | --- |
| "memory" | InMemoryBackend(max_size=settings.max_memory_entries) |
| "disk" | DiskBackend(directory=settings.disk_path) |
| "redis" | RedisBackend(url=settings.redis_url) |

| settings.vector_backend | Vector backend class |
| --- | --- |
| "none" | None (disabled) |
| "faiss" | FAISSBackend(dim=settings.embedding_dim) |
| "chroma" | ChromaBackend() |

note

The factory always creates an InvalidationEngine backed by an InMemoryBackend for tag storage. This is separate from the primary cache backend.
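
The backend selection described in this section can be pictured as a pair of lookup tables. The sketch below is not the library's code: SketchSettings is a hypothetical stand-in for OmnicacheSettings with illustrative defaults, plan_backends is an invented helper, and raising ValueError for an unknown name is an assumption about error handling. It returns class names and keyword arguments as plain data so it runs without OmniCache-AI installed.

```python
from dataclasses import dataclass

@dataclass
class SketchSettings:
    """Hypothetical stand-in for OmnicacheSettings; defaults are illustrative only."""
    backend: str = "memory"
    vector_backend: str = "none"
    max_memory_entries: int = 1000
    disk_path: str = "/tmp/cache"
    redis_url: str = "redis://localhost:6379/0"
    embedding_dim: int = 1536

def plan_backends(settings: SketchSettings):
    """Return ((class name, kwargs), vector plan or None) for the given settings."""
    primary = {
        "memory": ("InMemoryBackend", {"max_size": settings.max_memory_entries}),
        "disk": ("DiskBackend", {"directory": settings.disk_path}),
        "redis": ("RedisBackend", {"url": settings.redis_url}),
    }
    vector = {
        "none": None,  # semantic lookup disabled
        "faiss": ("FAISSBackend", {"dim": settings.embedding_dim}),
        "chroma": ("ChromaBackend", {}),
    }
    # Assumption: unknown names are rejected rather than silently defaulted.
    if settings.backend not in primary:
        raise ValueError(f"unknown backend: {settings.backend!r}")
    if settings.vector_backend not in vector:
        raise ValueError(f"unknown vector backend: {settings.vector_backend!r}")
    return primary[settings.backend], vector[settings.vector_backend]

print(plan_backends(SketchSettings(backend="disk", disk_path="/data/cache",
                                   vector_backend="faiss")))
```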


API Reference

Constructor

CacheManager(
    backend: CacheBackend,
    key_builder: CacheKeyBuilder,
    ttl_policy: TTLPolicy | None = None,
    vector_backend: VectorBackend | None = None,
    invalidation_engine: InvalidationEngine | None = None,
    semantic_threshold: float = 0.95,
)

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| backend | CacheBackend | required | Primary key-value storage backend |
| key_builder | CacheKeyBuilder | required | Key construction helper |
| ttl_policy | TTLPolicy \| None | None | TTL rules per cache type; defaults to TTLPolicy() |
| vector_backend | VectorBackend \| None | None | Optional vector similarity backend for semantic search |
| invalidation_engine | InvalidationEngine \| None | None | Optional tag-based invalidation engine |
| semantic_threshold | float | 0.95 | Minimum cosine similarity for a semantic cache hit |

Methods

| Method | Signature | Returns | Description |
| --- | --- | --- | --- |
| get | get(key, semantic=False, vector=None) | Any \| None | Retrieve a cached value (exact or semantic) |
| set | set(key, value, ttl=None, vector=None, tags=None, cache_type="response") | None | Store a value in the cache |
| delete | delete(key) | None | Remove a key from both primary and vector backends |
| exists | exists(key) | bool | Check if a key exists in the cache |
| invalidate | invalidate(tag) | int | Invalidate all keys associated with a tag |
| clear | clear() | None | Flush all cache entries from all backends |
| close | close() | None | Release all resources (connections, file handles) |
| from_settings | from_settings(settings) | CacheManager | Classmethod factory from OmnicacheSettings |

Properties

| Property | Type | Description |
| --- | --- | --- |
| key_builder | CacheKeyBuilder | Access the key builder instance |
| ttl_policy | TTLPolicy | Access the TTL policy instance |

Method Details

get(key, semantic=False, vector=None)

Retrieve a cached value by exact key or semantic similarity.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| key | str | required | Cache key for exact lookup |
| semantic | bool | False | Enable nearest-neighbor lookup via vector backend |
| vector | np.ndarray \| None | None | Query embedding (required when semantic=True) |

When semantic=True, the method searches the vector backend for the nearest neighbor. If the best match has a cosine similarity score at or above semantic_threshold, the corresponding value is returned from the primary backend. Otherwise, None is returned.

warning

When semantic=True, both vector_backend and vector must be provided. If either is None, the method falls back to an exact key lookup.
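
The lookup rules above, including the fallback described in the warning, can be sketched with plain dictionaries. This is a toy model and not CacheManager's implementation: sketch_get, the store dict, and the vectors dict are hypothetical stand-ins for the primary and vector backends, and a linear scan replaces the real nearest-neighbor search.

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def sketch_get(store, vectors, key, semantic=False, vector=None, threshold=0.95):
    """store maps key -> value; vectors maps key -> embedding, or is None
    when no vector backend is configured."""
    if semantic and vectors is not None and vector is not None:
        # Find the nearest neighbor by cosine similarity.
        best_key, best_score = None, float("-inf")
        for candidate, embedding in vectors.items():
            score = cosine(vector, embedding)
            if score > best_score:
                best_key, best_score = candidate, score
        # Only a match at or above the threshold counts as a hit.
        if best_key is not None and best_score >= threshold:
            return store.get(best_key)
        return None
    # Not semantic, or a required piece is missing: exact key lookup.
    return store.get(key)

store = {"embed:abc": "cached answer"}
vectors = {"embed:abc": np.array([1.0, 0.0])}

hit = sketch_get(store, vectors, "ignored", semantic=True, vector=np.array([0.99, 0.05]))
miss = sketch_get(store, vectors, "ignored", semantic=True, vector=np.array([0.0, 1.0]))
fallback = sketch_get(store, vectors, "embed:abc", semantic=True, vector=None)
print(hit, miss, fallback)  # cached answer None cached answer
```

Note that in the semantic branch the key argument plays no role, which is why the earlier example can pass "ignored".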


set(key, value, ttl=None, vector=None, tags=None, cache_type="response")

Store a value in the cache with optional TTL, vector indexing, and tag registration.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| key | str | required | Cache key |
| value | Any | required | Value to store |
| ttl | int \| None | None | Explicit TTL in seconds; falls back to TTLPolicy |
| vector | np.ndarray \| None | None | Embedding vector to index alongside the key |
| tags | list[str] \| None | None | Invalidation tags to associate with the key |
| cache_type | str | "response" | Cache type used to resolve TTL from TTLPolicy |

TTL Resolution Order

  1. If ttl is explicitly passed, that value is used.
  2. Otherwise, TTLPolicy.ttl_for(cache_type) is consulted.
  3. If the policy returns None, the entry never expires.
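
The three steps above amount to a short fallback chain. The sketch below assumes a stand-in policy class: SketchTTLPolicy and its per_type argument are hypothetical, and only the ttl_for(cache_type) method name and the default_ttl argument come from this page.

```python
from typing import Dict, Optional

class SketchTTLPolicy:
    """Hypothetical stand-in for TTLPolicy; only ttl_for() mirrors the documented API."""

    def __init__(self, default_ttl: Optional[int] = None,
                 per_type: Optional[Dict[str, int]] = None):
        self.default_ttl = default_ttl
        self.per_type = per_type or {}

    def ttl_for(self, cache_type: str) -> Optional[int]:
        return self.per_type.get(cache_type, self.default_ttl)

def resolve_ttl(ttl: Optional[int], cache_type: str,
                policy: SketchTTLPolicy) -> Optional[int]:
    # Step 1: an explicit ttl always wins.
    if ttl is not None:
        return ttl
    # Steps 2 and 3: ask the policy; None means the entry never expires.
    return policy.ttl_for(cache_type)

policy = SketchTTLPolicy(default_ttl=600, per_type={"embedding": 86400})
print(resolve_ttl(300, "response", policy))          # 300
print(resolve_ttl(None, "embedding", policy))        # 86400
print(resolve_ttl(None, "response", policy))         # 600
print(resolve_ttl(None, "response", SketchTTLPolicy()))  # None
```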

invalidate(tag)

Remove all cache keys registered under the given tag. Returns 0 if no invalidation engine is configured.

count = manager.invalidate("model:gpt-4")
print(f"Invalidated {count} cached responses")
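
As a rough mental model of tag-based invalidation, a tag index can be kept as a mapping from each tag to the set of keys registered under it. The class below is a toy sketch, not the InvalidationEngine: SketchTagIndex, register, and the plain dict used as the cache are all hypothetical, and counting only keys that were still present is an assumption about how the returned number is computed.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set

class SketchTagIndex:
    """Toy tag index: maps each tag to the keys registered under it."""

    def __init__(self) -> None:
        self._keys_by_tag: Dict[str, Set[str]] = defaultdict(set)

    def register(self, key: str, tags: Iterable[str]) -> None:
        for tag in tags:
            self._keys_by_tag[tag].add(key)

    def invalidate(self, tag: str, store: dict) -> int:
        """Delete every key under the tag and return how many were removed."""
        removed = 0
        for key in self._keys_by_tag.pop(tag, set()):
            if key in store:
                del store[key]
                removed += 1
        return removed

store: dict = {}
index = SketchTagIndex()
for key, value, tags in [
    ("resp:a1b2", "Answer A", ["user:123", "model:gpt-4"]),
    ("resp:c3d4", "Answer B", ["user:123", "model:gpt-4"]),
    ("resp:e5f6", "Answer C", ["user:456", "model:gpt-4"]),
]:
    store[key] = value
    index.register(key, tags)

removed_user = index.invalidate("user:123", store)
removed_model = index.invalidate("model:gpt-4", store)
print(removed_user, removed_model)  # 2 1
```

In this sketch the second call removes only one key, because the two entries tagged user:123 were already gone by the time model:gpt-4 was invalidated.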

Next Steps