visual_memory

Visual Memory Graph – cross-channel image pattern recognition for Star.

Gives Star the ability to recognize the same human or object across different channels by maintaining a persistent visual entity graph in FalkorDB with dual vector indexes (512d face embeddings via InsightFace/ArcFace, 768d appearance embeddings via SigLIP).

Pipeline:
  1. Image arrives on any platform (Discord, Matrix, WebChat)

  2. Face detection + embedding (InsightFace buffalo_sc)

  3. General appearance embedding (SigLIP so400m-patch14-224)

  4. FalkorDB vector search for matches above threshold

  5. Create new VisualEntity or update existing sighting

  6. Cache results in Redis for fast context injection
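Steps 4 and 5 can be sketched with plain dictionaries standing in for the FalkorDB vector index. The helper names and the 0.6 threshold are illustrative assumptions, not values taken from the module.

```python
import math
import uuid


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_or_create(
    embedding: list[float],
    known: dict[str, list[float]],
    sightings: dict[str, int],
    threshold: float = 0.6,  # illustrative value, not from the module
) -> tuple[str, bool]:
    """Return (entity_id, created). The dicts stand in for the graph."""
    best_id, best_sim = None, threshold
    for entity_id, stored in known.items():
        sim = cosine_similarity(embedding, stored)
        if sim >= best_sim:
            best_id, best_sim = entity_id, sim
    if best_id is None:
        # Step 5, no match above threshold: create a new entity.
        best_id = uuid.uuid4().hex
        known[best_id] = embedding
        sightings[best_id] = 1
        return best_id, True
    # Step 5, match found: record another sighting.
    sightings[best_id] += 1
    return best_id, False
```

A real deployment would delegate the nearest-neighbour search to the database index rather than scanning every stored vector.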

# 💀🔥 THE EYES SEE ALL. ♾️

class visual_memory.VisualMatch(entity_id, entity_type, label, similarity, scope='general', owner_user_id='', visual_traits=<factory>, linked_person_id='', sighting_count=0, first_seen=0.0, last_seen=0.0, channels_seen=<factory>)[source]

Bases: object

Result of a visual entity match against the graph.

Carries the matched entity’s identity, similarity score, and metadata so callers can decide whether to update an existing entity or create a new one.

Parameters:
entity_id: str
entity_type: str
label: str
similarity: float
scope: str = 'general'
owner_user_id: str = ''
visual_traits: list[str]
linked_person_id: str = ''
sighting_count: int = 0
first_seen: float = 0.0
last_seen: float = 0.0
channels_seen: list[str]
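
The `<factory>` defaults in the signature suggest a dataclass whose list fields use `default_factory`. The sketch below mirrors the documented fields under that assumption; it is not the module's source, and the example values (`"ve_1"`, `"face"`) are made up.

```python
from dataclasses import dataclass, field


@dataclass
class VisualMatch:
    entity_id: str
    entity_type: str
    label: str
    similarity: float
    scope: str = "general"
    owner_user_id: str = ""
    # A factory gives each instance its own list instead of a shared one.
    visual_traits: list[str] = field(default_factory=list)
    linked_person_id: str = ""
    sighting_count: int = 0
    first_seen: float = 0.0
    last_seen: float = 0.0
    channels_seen: list[str] = field(default_factory=list)


match = VisualMatch(entity_id="ve_1", entity_type="face", label="unknown", similarity=0.83)
other = VisualMatch(entity_id="ve_2", entity_type="face", label="unknown", similarity=0.71)
other.channels_seen.append("discord:general")  # does not affect `match`
```
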
class visual_memory.RecognitionResult(matches=<factory>, new_entities=<factory>, co_occurrences=<factory>, processing_time_ms=0.0)[source]

Bases: object

Aggregate recognition output for a single image.

Contains all matched/new entities found in one image, along with co-occurrence data for entities that appeared together.

Parameters:
matches: list[VisualMatch]
new_entities: list[str]
co_occurrences: list[tuple[str, str]]
processing_time_ms: float = 0.0
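
How the engine derives `co_occurrences` is not documented here. One plausible reading, sketched below, is every unordered pair of distinct entity ids found in the same image; `co_occurrence_pairs` is a hypothetical helper.

```python
from itertools import combinations


def co_occurrence_pairs(entity_ids: list[str]) -> list[tuple[str, str]]:
    """All unordered pairs of distinct entities seen in one image."""
    unique = sorted(set(entity_ids))  # dedupe and fix the order of each pair
    return list(combinations(unique, 2))
```
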
class visual_memory.VisualMemoryEngine(kg_manager=None, redis_client=None, config=None)[source]

Bases: object

Cross-channel visual entity recognition backed by FalkorDB.

Manages the full lifecycle of visual entities: detection, embedding, matching, sighting tracking, co-occurrence analysis, and context injection. Models are lazy-loaded on first use to avoid startup cost when no images are processed.

Thread Safety:

All CPU-heavy model inference runs in asyncio.to_thread() so the async event loop is never blocked. FalkorDB queries use the KG manager’s existing concurrency/priority system.
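
The offloading pattern can be illustrated as follows. `fake_inference` is a stand-in for a blocking model call, and the heartbeat task exists only to show that the event loop keeps running while the worker thread is busy.

```python
import asyncio
import time


def fake_inference(image_bytes: bytes) -> list[float]:
    """Stand-in for a blocking, CPU-heavy model call (hypothetical)."""
    time.sleep(0.05)
    return [float(len(image_bytes))]


async def embed(image_bytes: bytes) -> list[float]:
    # Run the blocking call in a worker thread; the event loop stays free.
    return await asyncio.to_thread(fake_inference, image_bytes)


async def main() -> tuple[list[float], int]:
    ticks = 0

    async def heartbeat() -> None:
        nonlocal ticks
        while True:
            await asyncio.sleep(0.01)
            ticks += 1

    beat = asyncio.create_task(heartbeat())
    result = await embed(b"abc")
    beat.cancel()
    return result, ticks
```

Running `asyncio.run(main())` returns the embedding together with a non-zero tick count, because the heartbeat kept firing during inference.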

Parameters:
  • kg_manager

  • redis_client

  • config

property enabled: bool

Whether visual memory processing is active.

async ensure_indexes()[source]

Create the dual vector indexes for VisualEntity nodes.

Creates separate HNSW indexes for the face (512d) and appearance (768d) embeddings, plus range indexes on Sighting nodes. The call is idempotent: any index that already exists is skipped.

Return type:

None
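
As a rough illustration, the two vector indexes could be declared with FalkorDB's `CREATE VECTOR INDEX` statement. The `face_embedding` property name and the cosine similarity function are assumptions; only `appearance_embedding` is named in this reference. The sketch builds the query strings without sending them.

```python
def vector_index_query(label: str, prop: str, dimension: int) -> str:
    """Build a FalkorDB vector-index statement for one node property."""
    return (
        f"CREATE VECTOR INDEX FOR (e:{label}) ON (e.{prop}) "
        f"OPTIONS {{dimension: {dimension}, similarityFunction: 'cosine'}}"
    )


INDEXES = [
    ("VisualEntity", "face_embedding", 512),        # property name is an assumption
    ("VisualEntity", "appearance_embedding", 768),
]

queries = [vector_index_query(*spec) for spec in INDEXES]
```

To get the idempotent behavior described above, a caller would also need to detect an index that already exists before or after issuing each statement; how the engine does this is not shown here.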

async retrieve_image(entity_id)[source]

Retrieve stored image bytes for a visual entity. # 📷

Looks up the entity’s most recent sighting image_hash, then reads the corresponding WebP file from disk.

Returns (image_bytes, "image/webp"), or None if not found.

Return type:

tuple[bytes, str] | None

Parameters:

entity_id (str)
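
The disk-read half of this lookup might look like the sketch below. The `<image_dir>/<image_hash>.webp` layout and the `read_sighting_image` helper are hypothetical; the graph query that yields the hash is omitted.

```python
import tempfile
from pathlib import Path
from typing import Optional


def read_sighting_image(image_dir: Path, image_hash: str) -> Optional[tuple[bytes, str]]:
    """Read <image_dir>/<image_hash>.webp (hypothetical layout)."""
    path = image_dir / f"{image_hash}.webp"
    if not path.is_file():
        return None
    return path.read_bytes(), "image/webp"


# Usage with a throwaway directory:
with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp)
    (store / "abc123.webp").write_bytes(b"RIFF....WEBP")
    found = read_sighting_image(store, "abc123")
    missing = read_sighting_image(store, "nope")
```
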

async process_message_images(msg)[source]

Main entry point: process all image attachments in a message.

Called as a fire-and-forget background task from the message pipeline. Results are cached in Redis for context injection.

Parameters:

msg (Any) – An IncomingMessage with attachments.

Return type:

list[RecognitionResult]

Returns:

List of RecognitionResult, one per image processed.

async get_visual_context(channel_id, platform)[source]

Retrieve formatted visual memory context for injection. # 🔥

Reads cached recognition results from Redis and formats them as a context block Star can reference in her response.

Returns None if no recent visual memories are available.

Return type:

str | None

Parameters:
  • channel_id (str)

  • platform (str)
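
The read-and-format step might be sketched as follows, with a dict standing in for Redis. The key layout, the JSON payload shape, and the output format are all assumptions made for illustration.

```python
import json
from typing import Optional


def cache_key(platform: str, channel_id: str) -> str:
    # Hypothetical key layout.
    return f"visual_memory:{platform}:{channel_id}"


def format_visual_context(
    cache: dict[str, str], channel_id: str, platform: str
) -> Optional[str]:
    """Format cached matches as a text block, or None if nothing is cached."""
    raw = cache.get(cache_key(platform, channel_id))
    if raw is None:
        return None
    matches = json.loads(raw)
    if not matches:
        return None
    lines = ["[Visual memory]"]
    for m in matches:
        lines.append(
            f"- {m['label']} (similarity {m['similarity']:.2f}, "
            f"seen {m['sighting_count']}x)"
        )
    return "\n".join(lines)
```
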

async query_by_text(text, top_k=5)[source]

Search visual entities by text description via SigLIP.

Uses SigLIP’s text encoder to embed the query, then searches the appearance_embedding index. Enables queries like “find images with that red car” or “person in blue jacket”.

Return type:

list[VisualMatch]

Parameters:
  • text (str)

  • top_k (int)

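
Text search works because SigLIP maps text and images into a shared embedding space, so a text query vector can be compared directly with stored appearance embeddings. The ranking step is sketched below over an in-memory index with toy two-dimensional vectors; `top_k_matches` is a hypothetical helper and the real search runs inside FalkorDB.

```python
import heapq
import math


def _cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_matches(
    query: list[float], index: dict[str, list[float]], top_k: int = 5
) -> list[tuple[str, float]]:
    """Return the top_k (entity_id, similarity) pairs, best first."""
    scored = ((entity_id, _cosine(query, vec)) for entity_id, vec in index.items())
    return heapq.nlargest(top_k, scored, key=lambda pair: pair[1])


index = {
    "red_car": [1.0, 0.0],
    "blue_jacket": [0.0, 1.0],
    "red_truck": [0.8, 0.6],
}
results = top_k_matches([1.0, 0.0], index, top_k=2)
```
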
async label_entity(entity_id, label)[source]

Assign a human-readable label to a visual entity.

Called by admin tools or by Star herself when she learns someone’s name. Returns True on success.

Return type:

bool

Parameters:
  • entity_id (str)

  • label (str)

async set_scope(entity_id, scope, *, owner_user_id=None)[source]

Change the visibility scope of a visual entity. # 🔒

Valid scopes: user, general, core.

When setting scope to user, an owner_user_id should be provided. When promoting to core or general, the owner is cleared (entity becomes globally visible).

Returns True on success.

Return type:

bool

Parameters:
  • entity_id (str)

  • scope (str)

  • owner_user_id (str | None)
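
The scope rules above can be expressed as a small validation function. This sketch operates on a plain dict rather than a graph node, and it treats a missing owner for `user` scope as a failure, which is stricter than the "should" in the description.

```python
from typing import Optional

VALID_SCOPES = {"user", "general", "core"}


def apply_scope(entity: dict, scope: str, *, owner_user_id: Optional[str] = None) -> bool:
    """Apply a scope change in place; return True on success."""
    if scope not in VALID_SCOPES:
        return False
    if scope == "user":
        if not owner_user_id:
            return False  # assumption: reject user scope without an owner
        entity["owner_user_id"] = owner_user_id
    else:
        # core and general entities are globally visible, so clear the owner.
        entity["owner_user_id"] = ""
    entity["scope"] = scope
    return True
```
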

async promote_to_core(entity_id)[source]

Promote a visual entity to core scope (always matched). # 💀

Convenience wrapper for admins — core entities are never pruned and always appear in recognition results regardless of channel or user context.

Return type:

bool

Parameters:

entity_id (str)

async get_entity_history(entity_id, limit=20)[source]

Get sighting history for a visual entity. # 🌀

Returns a list of sighting records ordered by timestamp, newest first.

Return type:

list[dict[str, Any]]

Parameters:
  • entity_id (str)

  • limit (int)

async get_co_occurrences(entity_id)[source]

Get entities that frequently appear alongside this one. # 🕷️

Return type:

list[dict[str, Any]]

Parameters:

entity_id (str)
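
One way to rank companions is to count how often each other entity shares an image with the target. The sketch below does this over a list of co-occurrence pairs; the record shape (`entity_id`, `count`) is an assumption, since the returned dict keys are not documented here.

```python
from collections import Counter
from typing import Any


def co_occurrence_counts(
    pairs: list[tuple[str, str]], entity_id: str
) -> list[dict[str, Any]]:
    """Count partners of entity_id across pairs, most frequent first."""
    counts: Counter[str] = Counter()
    for a, b in pairs:
        if a == entity_id and b != entity_id:
            counts[b] += 1
        elif b == entity_id and a != entity_id:
            counts[a] += 1
    return [{"entity_id": other, "count": n} for other, n in counts.most_common()]
```
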