ncm_semantic_triggers

NCM Semantic Trigger Matcher — cosine-similarity-based emotion detection.

Replaces the exact-word scan_text_for_triggers with embedding-based semantic matching. Each emotion in reality_marble_recursion_index.yaml gets:

An embedding of its affect description string.

LLM-generated colloquial phrases a person might say when feeling that emotion, each also embedded.

A mean_vec (the L2-normalized centroid of all the embeddings above) that serves as the emotion’s semantic fingerprint for matching.
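A minimal sketch of how such a centroid could be computed, in plain Python. The helper names l2_normalize and mean_vec_of and the toy three-dimensional vectors are illustrative assumptions, not the module's actual code. It also assumes the embeddings are averaged first and normalized afterwards.

```python
import math
from typing import List, Sequence


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale vec to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def mean_vec_of(embeddings: Sequence[Sequence[float]]) -> List[float]:
    """Average equally sized embeddings, then L2-normalize the centroid."""
    if not embeddings:
        raise ValueError("need at least one embedding")
    dim = len(embeddings[0])
    if any(len(e) != dim for e in embeddings):
        raise ValueError("all embeddings must have the same dimension")
    count = len(embeddings)
    centroid = [sum(e[i] for e in embeddings) / count for i in range(dim)]
    return l2_normalize(centroid)


# Toy data (assumed): one affect-description embedding plus two phrase embeddings.
affect = [1.0, 0.0, 0.0]
phrases = [[0.0, 1.0, 0.0], [1.0, 1.0, 0.0]]
fingerprint = mean_vec_of([affect, *phrases])
```

Because the fingerprint has unit length, a dot product against a unit-length query vector is equal to their cosine similarity.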

At runtime, the incoming text is embedded once and dot-producted against every cached mean_vec. Emotions above MATCH_THRESHOLD are returned as (emotion_name, delta_vector) pairs, capped at TOP_N.
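The matching step can be sketched as follows. This is an illustration under stated assumptions rather than the module's implementation: match_emotions and unit are hypothetical names, the constants are inferred from the find_triggers defaults, the vectors are toy data, and the sketch returns similarity scores instead of delta vectors.

```python
import math
from typing import Dict, List, Sequence, Tuple

MATCH_THRESHOLD = 0.55  # inferred from the find_triggers default
TOP_N = 6               # inferred from the find_triggers default


def unit(vec: Sequence[float]) -> List[float]:
    """Return vec scaled to unit length (assumes a non-zero vector)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def match_emotions(
    query_vec: Sequence[float],
    mean_vecs: Dict[str, Sequence[float]],
    threshold: float = MATCH_THRESHOLD,
    top_n: int = TOP_N,
) -> List[Tuple[str, float]]:
    """Dot the query against every fingerprint and keep the best matches."""
    scored = []
    for name, fingerprint in mean_vecs.items():
        # Both vectors are unit length, so the dot product is the cosine.
        score = sum(q * f for q, f in zip(query_vec, fingerprint))
        if score >= threshold:
            scored.append((name, score))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Toy fingerprints (assumed), already L2-normalized.
mean_vecs = {
    "joy": unit([1.0, 1.0, 0.0]),
    "calm": unit([1.0, 0.0, 0.0]),
    "dread": unit([0.0, 0.0, 1.0]),
}
matches = match_emotions(unit([1.0, 1.0, 0.0]), mean_vecs)
```

With these toy vectors, "joy" and "calm" clear the threshold and "dread" does not.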

Falls back to an empty list (caller should use scan_text_for_triggers from ncm_delta_parser) when no embeddings are ready yet.

Redis key: ncm:trigger_embed:v2:{sha256_hex_of_emotion_name}

Redis value: JSON object (see below)
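The key layout can be reproduced with the standard library. This is a sketch: trigger_embed_key is a hypothetical helper name, and it assumes the emotion name is hashed as UTF-8 bytes with no prior normalization such as lowercasing or stripping.

```python
import hashlib

KEY_PREFIX = "ncm:trigger_embed:v2:"


def trigger_embed_key(emotion_name: str) -> str:
    """Build the cache key from the SHA-256 hex digest of the emotion name."""
    # Assumption: the name is encoded as UTF-8 before hashing.
    digest = hashlib.sha256(emotion_name.encode("utf-8")).hexdigest()
    return f"{KEY_PREFIX}{digest}"


key = trigger_embed_key("joy")
```

Hashing the name keeps the key a fixed length and free of spaces or other awkward characters, whatever the emotion is called.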

class ncm_semantic_triggers.SemanticTriggerMatcher(redis_client=None, api_key=None, openrouter_client=None)[source]

Bases: object

Cosine-similarity emotion trigger detector.

Parameters:

redis_client – A redis.asyncio.Redis instance. May be None.
api_key (Optional[str]) – OpenRouter API key. When None the matcher is a no-op and find_triggers always returns an empty list.
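openrouter_client (OpenRouterClient | None) – Shared OpenRouterClient for connection pooling and batch embedding. Falls back to direct HTTP when None.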

__init__(redis_client=None, api_key=None, openrouter_client=None)[source]

Initialize the instance.

Parameters:

redis_client – A redis.asyncio.Redis instance. May be None.
api_key (Optional[str]) – OpenRouter API key. When None the matcher is a no-op.
openrouter_client (OpenRouterClient | None) – Shared OpenRouterClient for connection pooling and batch embedding. Falls back to direct HTTP when None.

Return type:

None

async find_triggers(text, threshold=0.55, top_n=6, query_embedding=None)[source]

Embed text and return the top-N emotions above threshold.

This is the hot-path semantic replacement for exact-word trigger scanning: it embeds the incoming text once, dot-products that query vector against every cached emotion mean_vec fingerprint, keeps matches at or above threshold, and returns the strongest few as scaled limbic delta vectors so the cascade can nudge the right nodes.

Embedding goes through self._embed_texts (the shared OpenRouterClient.embed_batch when present, else the Gemini key pool), so this issues network I/O but touches neither Redis nor the filesystem. Per-emotion delta templates are pulled lazily via ncm_delta_parser.get_emotion_delta, scaled by the cosine score and SEMANTIC_SCALE, and then clamped to _TRIGGER_NODE_CAP per node. The method is a no-op that returns an empty list when there is no text, no API key, or no warmed embeddings yet, which signals the caller to fall back to exact matching. Called by LimbicCoordinator in limbic_system/coordinator.py as self._trigger_matcher.find_triggers(text).
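The scale-and-clamp step might look like the sketch below. The numeric values of SEMANTIC_SCALE and the node cap are placeholders because the real constants are not documented here. The node names, the helper scale_delta, the purely multiplicative scaling, and the symmetric clamp are likewise assumptions.

```python
from typing import Dict

SEMANTIC_SCALE = 0.5    # placeholder value, not the module's real constant
TRIGGER_NODE_CAP = 0.3  # placeholder standing in for _TRIGGER_NODE_CAP


def scale_delta(
    template: Dict[str, float],
    score: float,
    scale: float = SEMANTIC_SCALE,
    cap: float = TRIGGER_NODE_CAP,
) -> Dict[str, float]:
    """Scale each node adjustment by score * scale, then clamp to [-cap, cap]."""
    scaled = {}
    for node, amount in template.items():
        value = amount * score * scale
        scaled[node] = max(-cap, min(cap, value))
    return scaled


# Hypothetical delta template for one emotion.
template = {"dopamine": 1.0, "cortisol": -0.2}
delta = scale_delta(template, score=0.8)
```

In this toy case the dopamine adjustment (0.4 before clamping) is cut back to the cap, while the cortisol adjustment stays at roughly -0.08.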

Parameters:

text (str) – The user/turn text to scan for emotional triggers.
threshold (float) – Minimum cosine similarity for an emotion to fire. Defaults to MATCH_THRESHOLD.
top_n (int) – Maximum number of emotions to return. Defaults to TOP_N.
query_embedding (list[float] | None) – Precomputed embedding for the query text. If provided, embedding generation is bypassed.

Returns:

(emotion_name, delta_vector) pairs sorted by descending similarity, where each delta maps node names to clamped, scaled adjustments. Empty when nothing matches or no embeddings are ready.

Return type:

List[Tuple[str, Dict[str, float]]]
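A caller-side sketch of the fallback pattern. StubMatcher and exact_scan are stand-ins invented for this example: the real SemanticTriggerMatcher and ncm_delta_parser.scan_text_for_triggers are not imported, and the signature and return shape assumed for the exact scanner are not confirmed by this page.

```python
import asyncio
from typing import Dict, List, Tuple

Trigger = Tuple[str, Dict[str, float]]


class StubMatcher:
    """Stand-in for SemanticTriggerMatcher with no warmed embeddings."""

    async def find_triggers(
        self, text: str, threshold: float = 0.55, top_n: int = 6,
        query_embedding=None,
    ) -> List[Trigger]:
        return []  # mimics the documented no-op case


def exact_scan(text: str) -> List[Trigger]:
    """Hypothetical stand-in for the exact-word scanner."""
    if "happy" in text.lower():
        return [("joy", {"dopamine": 0.1})]
    return []


async def detect(matcher: StubMatcher, text: str) -> List[Trigger]:
    """Try semantic matching first, then fall back to exact matching."""
    triggers = await matcher.find_triggers(text)
    if not triggers:
        triggers = exact_scan(text)
    return triggers


result = asyncio.run(detect(StubMatcher(), "I am so happy today"))
```

An empty list is also returned when embeddings are ready but nothing clears the threshold, so a caller written this way runs the exact scan in that case as well.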

async ensure_all_cached()[source]

Build the full emotion embedding index.

Return type:

None

For each emotion in the recursion index:

Load from Redis if already cached.
Otherwise embed the affect string, generate and embed up to NUM_PHRASES variant trigger phrases, then store the result in Redis.

Designed to run once at startup as a background task. Safe to call multiple times — already-cached emotions are skipped.
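The load-or-build loop and its idempotence can be illustrated with an in-memory stand-in for Redis. Everything here is hypothetical scaffolding: FakeCache, build_entry, and the demo: key prefix are invented for the sketch. Unlike the documented method, which returns None, this version returns a count so the skip behaviour is visible.

```python
import asyncio
from typing import Dict, List, Optional


class FakeCache:
    """In-memory stand-in for the Redis client (assumption for this sketch)."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    async def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    async def set(self, key: str, value: str) -> None:
        self._data[key] = value


async def build_entry(emotion: str) -> str:
    """Placeholder for the embed and phrase-generation step."""
    await asyncio.sleep(0)  # yield control, as real network I/O would
    return f"entry-for-{emotion}"


async def ensure_all_cached(cache: FakeCache, emotions: List[str]) -> int:
    """Build missing entries only; return how many were built."""
    built = 0
    for emotion in emotions:
        key = f"demo:{emotion}"
        if await cache.get(key) is not None:
            continue  # already cached, skip
        await cache.set(key, await build_entry(emotion))
        built += 1
    return built


async def main():
    cache = FakeCache()
    emotions = ["joy", "dread"]
    # At startup this would run as a background task; it is awaited here
    # only so the demo can inspect the result.
    first = await asyncio.create_task(ensure_all_cached(cache, emotions))
    second = await ensure_all_cached(cache, emotions)
    return first, second


first_run, second_run = asyncio.run(main())
```

The second call builds nothing because both entries are already present, which is the property that makes repeated calls safe.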