platforms.media_common module
Shared media-to-content-part conversion for all platforms.
Converts raw bytes + MIME type into the multimodal content-part format expected by the OpenRouter chat-completions API. Platform-specific download logic lives in each adapter; this module only handles the format conversion.
Office / ODF / EPUB documents whose MIME types are not supported by the downstream LLM are automatically converted to PDF via LibreOffice headless before being embedded as content parts.
GIF and animated WebP images are automatically re-encoded as MP4 (H.264 baseline) so the Gemini API receives a well-supported video format instead of GIF/animated-WebP.
- async platforms.media_common.download_with_retry(downloader, *, attempts=3, base_delay=0.5, max_delay=8.0, label='')[source]
Call downloader with bounded retry on transient CDN failures.
downloader is the same
asynccallable the platform adapters already pass toMediaCache.get_or_download()— it returns(data, mimetype, filename). A single transient Discord/Matrix CDN blip otherwise drops the attachment for that one message (the cache fix only stops the failure from becoming permanent); a few retries close that one-shot loss.Retries on:
any raised exception that is not a permanent HTTP status (see
_is_permanent_download_error()), andan empty-bytes result (
not data) — treated as a failed download, consistent withMediaCache.get_or_downloadwhich refuses to cache it.
Permanent errors (404/403/410/…) re-raise immediately. After the final attempt the last result is returned as-is (possibly empty) so the existing empty-media handling downstream stays in control:
get_or_downloadwill not cache it andmedia_to_content_partsemits a text note instead of a blank image.Backoff is exponential (
base_delay * 2**n) capped at max_delay, with uniform jitter to avoid thundering-herd re-fetches.
- async platforms.media_common.maybe_reencode_gif(data, mimetype, filename)[source]
Re-encode GIF or animated WebP as MP4 for the Gemini API.
For animated WebP: Pillow converts WebP->GIF (no ffmpeg libwebp needed), then the GIF is converted to MP4 via ffmpeg.
For GIF: directly converted to MP4 via ffmpeg.
Returns
(data, mimetype, filename)– either the converted MP4 or the original inputs unchanged if conversion fails, the input is not a GIF, or the WebP is not animated (static WebP passes through as a normal image).
- platforms.media_common.detect_image_mimetype_from_bytes(data)[source]
Best-effort image MIME from raw bytes (magic + Pillow).
Returns a lowercase
image/*type, orNoneif unknown.
- platforms.media_common.shrink_image_under_max_bytes(data, declared_mimetype='', *, max_bytes=4194304)[source]
Re-encode and optionally downscale raster image bytes to fit under max_bytes.
Used for API providers with a hard per-image size limit. Returns
Noneif the image cannot be opened as a raster, is not shrinkable (e.g. SVG), or remains above max_bytes after resize attempts.If data is already at or below max_bytes, returns data unchanged.
- platforms.media_common.reconcile_image_mimetype_sync(data, declared)[source]
Correct a declared
image/*MIME type against the type detected from bytes.Platforms and CDNs frequently mislabel images (e.g. a JPEG served as
image/png), which makes downstream providers reject or mis-handle thedata:URI. This sniffs the real type from the leading bytes and returns the detected value when it disagrees with the declared one, stripping anycharset/parameter suffix in the process. Non-image declarations are passed through untouched, and when detection fails the bare declared type is returned.Delegates the actual sniffing to
detect_image_mimetype_from_bytes()(magic-number checks plus a Pillow fallback) and logs an info line at the module logger when a reconciliation actually changes the type; it performs no I/O of its own. Called by the async wrapperreconcile_image_mimetype()and directly byurl_content_extractorwhen sanitizing fetched images.
- async platforms.media_common.reconcile_image_mimetype(data, declared)[source]
Async wrapper for
reconcile_image_mimetype_sync()off the event loop.MIME sniffing can fall back to a Pillow decode, which is CPU-bound and would otherwise block the asyncio event loop on a large image. This offloads the synchronous reconcile to a worker thread via
asyncio.to_threadso callers canawaitit inline. The detection logic, logging, and return contract are identical to the sync variant.Called by
media_to_content_parts()here and byurl_content_extractorwhen preparing fetched images for the model.- Parameters:
- Returns:
The reconciled MIME type (see
reconcile_image_mimetype_sync()).- Return type:
- async platforms.media_common.media_to_content_parts(data, mimetype, filename, body_text=None)[source]
Build an OpenRouter multimodal content-parts list from raw media.
Office / ODF documents are transparently converted to PDF via LibreOffice so the LLM never sees an unsupported MIME type.
- Parameters:
- Returns:
A list of content-part dicts suitable for the
contentfield of an OpenRouter user message.- Return type: