media_cache
Disk-backed LRU media cache.
Caches downloaded media (images, audio, video, files) so that a URL already in the cache is not fetched again. An in-memory index provides fast lookups while a configurable disk directory persists data across restarts.
Each cached entry is stored on disk as two files:
{sha256_of_url}.dat – raw media bytes
{sha256_of_url}.json – sidecar metadata (mimetype, filename, url, ts, size)
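The naming scheme above can be sketched as follows. This is an illustration only: the helper name `entry_paths` is hypothetical, and it assumes the key is the hexadecimal SHA-256 digest of the UTF-8 encoded URL, which the description does not spell out.

```python
import hashlib
from pathlib import Path


def entry_paths(cache_dir, url):
    """Return the (.dat, .json) paths for a URL (hypothetical helper)."""
    # Assumption: key = hex SHA-256 of the UTF-8 encoded URL.
    key = hashlib.sha256(url.encode("utf-8")).hexdigest()
    base = Path(cache_dir)
    return base / f"{key}.dat", base / f"{key}.json"


dat_path, meta_path = entry_paths("media_cache", "https://example.com/a.png")
```

Hashing the URL gives a fixed-length, filesystem-safe name regardless of which characters the URL contains.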
On startup the disk directory is scanned to rebuild the in-memory index without loading all bytes into RAM.
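A startup scan of this kind might look like the sketch below. It reads only the small `.json` sidecars, so the media bytes stay on disk. The function name `scan_index` and the choice to skip orphaned or unreadable sidecars are assumptions, not documented behavior.

```python
import json
from pathlib import Path


def scan_index(cache_dir):
    """Rebuild a key -> metadata index from sidecar files (hypothetical helper)."""
    index = {}
    for meta_path in Path(cache_dir).glob("*.json"):
        # Assumption: an entry counts only if its .dat file is present.
        if not meta_path.with_suffix(".dat").exists():
            continue
        try:
            meta = json.loads(meta_path.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            continue  # Assumption: unreadable sidecars are ignored.
        index[meta_path.stem] = meta
    return index
```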
- class media_cache.MediaCache(cache_dir='media_cache', max_size_mb=500, max_memory_items=64)
Bases: object
Two-tier (memory + disk) LRU media cache.
- Parameters:
cache_dir (str | Path) – Directory for persistent storage. Created automatically.
max_size_mb (int) – Approximate cap on total disk usage in megabytes. Oldest entries are evicted when the limit is exceeded.
max_memory_items (int) – Maximum number of entries whose bytes are kept in RAM. Entries beyond this limit are still indexed (metadata only) and will be read back from disk on the next access.
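The effect of max_memory_items can be illustrated with a small LRU built on `collections.OrderedDict`. This is a minimal sketch of the general technique, not the class's actual implementation; `MemoryTier` and its methods are hypothetical names.

```python
from collections import OrderedDict


class MemoryTier:
    """Keeps the bytes of at most max_items entries in RAM (hypothetical sketch)."""

    def __init__(self, max_items):
        self.max_items = max_items
        self._bytes = OrderedDict()  # key -> bytes, least recently used first

    def put(self, key, data):
        self._bytes[key] = data
        self._bytes.move_to_end(key)  # mark as most recently used
        while len(self._bytes) > self.max_items:
            self._bytes.popitem(last=False)  # drop the least recently used bytes

    def get(self, key):
        data = self._bytes.get(key)
        if data is not None:
            self._bytes.move_to_end(key)
        return data  # None means the caller must read the entry from disk


tier = MemoryTier(max_items=2)
tier.put("a", b"1")
tier.put("b", b"2")
tier.get("a")        # "a" becomes most recently used
tier.put("c", b"3")  # evicts "b" from memory only
```

In a two-tier design, a key dropped here would still be present in the metadata index, so a later lookup falls back to the `.dat` file.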
- __init__(cache_dir='media_cache', max_size_mb=500, max_memory_items=64)
Initialize the instance.
- async ensure_loaded()
Load the in-memory index from disk (non-blocking).
Called during async startup so the sync disk scan does not block the event loop. Idempotent — safe to call multiple times.
- Return type:
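One way to get a non-blocking, idempotent load like ensure_loaded() is to run the synchronous scan in a worker thread and guard it with a flag and a lock. The sketch below assumes Python 3.9+ for `asyncio.to_thread`; the `LazyIndex` class and its `scan` argument are hypothetical and stand in for whatever the real class does internally.

```python
import asyncio


class LazyIndex:
    """Loads an index once without blocking the event loop (hypothetical sketch)."""

    def __init__(self, scan):
        self._scan = scan  # blocking callable that returns the index dict
        self._lock = asyncio.Lock()
        self._loaded = False
        self.index = {}

    async def ensure_loaded(self):
        if self._loaded:
            return  # fast path: repeated calls are no-ops
        async with self._lock:
            if self._loaded:
                return  # another task finished the load while we waited
            # Run the blocking disk scan in a worker thread.
            self.index = await asyncio.to_thread(self._scan)
            self._loaded = True
```

The second check inside the lock keeps concurrent callers from scanning the directory more than once.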
- async put(url, data, mimetype, filename)
Store media bytes under url, writing through to disk.
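The write-through step might be sketched as below: the bytes and a sidecar with the documented fields (mimetype, filename, url, ts, size) are written to disk in a worker thread. The names `write_entry` and `put_sketch` are hypothetical, and the key derivation, the use of `time.time()` for ts, and the thread offload are assumptions rather than the method's confirmed behavior.

```python
import asyncio
import hashlib
import json
import time
from pathlib import Path


def write_entry(cache_dir, url, data, mimetype, filename):
    """Write the .dat and .json files for one entry (hypothetical helper)."""
    base = Path(cache_dir)
    base.mkdir(parents=True, exist_ok=True)
    # Assumption: key = hex SHA-256 of the UTF-8 encoded URL.
    key = hashlib.sha256(url.encode("utf-8")).hexdigest()
    meta = {
        "mimetype": mimetype,
        "filename": filename,
        "url": url,
        "ts": time.time(),  # Assumption: ts is a Unix timestamp.
        "size": len(data),
    }
    (base / f"{key}.dat").write_bytes(data)
    (base / f"{key}.json").write_text(json.dumps(meta), encoding="utf-8")
    return key


async def put_sketch(cache_dir, url, data, mimetype, filename):
    # Keep blocking file I/O off the event loop (requires Python 3.9+).
    return await asyncio.to_thread(
        write_entry, cache_dir, url, data, mimetype, filename
    )
```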