classifiers.vector_classifier module

Vector-based classifier for tool selection.

Lightweight semantic vector classifier that selects tools by deterministic vector retrieval instead of sending every tool to the LLM. Pre-computed centroid embeddings are stored in Redis, either as legacy monolithic hashes or as per-tool tool_emb:* and per-skill skill_emb:* HASH documents indexed by RediSearch (or both). At query time, RediSearch KNN is used when idx:tool_embeddings / idx:skill_embeddings contain documents; otherwise the embeddings are loaded and scored in-process (cosine_batch).
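The in-process fallback can be pictured as a batched cosine similarity between the query embedding and every stored centroid. The sketch below is a minimal illustration and not the module's cosine_batch: the name cosine_scores, the toy three-dimensional vectors, and the use of plain Python lists rather than array types are all assumptions made for the example.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_scores(
    query: Sequence[float],
    centroids: Dict[str, Sequence[float]],
) -> List[Tuple[str, float]]:
    """Score every centroid against the query, best match first."""
    q_norm = math.sqrt(sum(x * x for x in query))
    scored = []
    for name, vec in centroids.items():
        v_norm = math.sqrt(sum(x * x for x in vec))
        if q_norm == 0.0 or v_norm == 0.0:
            score = 0.0  # treat zero vectors as "no similarity"
        else:
            dot = sum(a * b for a, b in zip(query, vec))
            score = dot / (q_norm * v_norm)
        scored.append((name, score))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical centroids; real embeddings are much higher-dimensional.
centroids = {
    "web_search": [1.0, 0.0, 0.0],
    "send_email": [0.0, 1.0, 0.0],
}
ranked = cosine_scores([0.9, 0.1, 0.0], centroids)
```

Here ranked lists web_search ahead of send_email, because the query vector points mostly along the first axis.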

classifiers.vector_classifier.detect_tool_request_keywords(response_text)[source]

Return True when the bot seems to request missing tools.

This lightweight regex check gates the heavier embedding-based tool expansion to avoid false positives on legitimate no-tool responses.

Return type:

bool

Parameters:

response_text (str)
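A gate of this kind might look like the sketch below. The phrases in the pattern and the name looks_like_tool_request are hypothetical; the patterns that detect_tool_request_keywords() actually uses are not documented on this page.

```python
import re

# Hypothetical phrases; the module's real patterns are not shown in these docs.
_TOOL_REQUEST_RE = re.compile(
    r"\b(?:i (?:don't|do not) have (?:access to )?(?:a |the )?tools?"
    r"|i need (?:a |the )?tools?"
    r"|no tools? (?:is |are )?available)\b",
    re.IGNORECASE,
)


def looks_like_tool_request(response_text: str) -> bool:
    """Cheap gate: True only when the text matches a tool-request phrase."""
    return bool(_TOOL_REQUEST_RE.search(response_text or ""))
```

Running the regex first means the more expensive embedding call is only made for responses that plausibly ask for a tool.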

classifiers.vector_classifier.find_tools_explicitly_named(message, valid_names)[source]

Return tool names that appear verbatim in message as whole tokens.

Detection:

  • Maximal runs of ASCII letters, digits, and underscores (typical snake_case tools), equivalent to word boundaries for those names.

  • Text inside ASCII backticks (inline code): inner text is stripped and must match a registered tool name exactly, so names containing hyphens or other punctuation still match when quoted.

Hits are ordered by first occurrence in the message; each tool appears at most once.
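The two detection rules and the ordering guarantee can be sketched as follows. This is an illustrative reimplementation under the description above, not the module's source; the name find_named_tools and the example tool names are assumptions.

```python
import re
from typing import Iterable, List

_WORD_RUN = re.compile(r"[A-Za-z0-9_]+")   # maximal runs of letters, digits, underscores
_BACKTICKED = re.compile(r"`([^`]+)`")     # text inside ASCII backticks


def find_named_tools(message: str, valid_names: Iterable[str]) -> List[str]:
    """Return registered names found in message, ordered by first occurrence."""
    valid = set(valid_names)
    hits = []  # (position, name) pairs collected from both rules
    for m in _WORD_RUN.finditer(message):
        if m.group(0) in valid:
            hits.append((m.start(), m.group(0)))
    for m in _BACKTICKED.finditer(message):
        inner = m.group(1).strip()
        if inner in valid:
            hits.append((m.start(), inner))
    ordered, seen = [], set()
    for _, name in sorted(hits):
        if name not in seen:  # each tool appears at most once
            seen.add(name)
            ordered.append(name)
    return ordered
```

Because the first rule matches maximal runs, a registered name such as search does not fire inside web_search or web_searcher, and a hyphenated name such as fetch-url is only picked up when it is quoted in backticks.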

Return type:

list[str]

Parameters:
  • message

  • valid_names

class classifiers.vector_classifier.VectorClassifier(redis_client, similarity_threshold=0.3, top_k=15, api_key=None, *, strategy_force_threshold=0.8, strategy_optional_threshold=0.3, group_expansion_threshold=0.55, browser_tool_similarity_threshold=0.6)[source]

Bases: object

Semantic vector-based classifier for tool selection.

Parameters:
  • redis_client (Redis) – An async Redis connection (redis.asyncio.Redis).

  • similarity_threshold (float) – Minimum cosine similarity for a match.

  • top_k (int) – Maximum number of tools to return.

  • api_key (str | None) – OpenRouter API key. Falls back to the OPENROUTER_API_KEY env var.

  • strategy_force_threshold (float)

  • strategy_optional_threshold (float)

  • group_expansion_threshold (float)

  • browser_tool_similarity_threshold (float)

__init__(redis_client, similarity_threshold=0.3, top_k=15, api_key=None, *, strategy_force_threshold=0.8, strategy_optional_threshold=0.3, group_expansion_threshold=0.55, browser_tool_similarity_threshold=0.6)[source]

Store the Redis client and retrieval thresholds for later queries.

Records the async Redis connection and every tuneable threshold but performs no I/O: tool and skill embedding caches (_tool_embeddings_cache / _skill_embeddings_cache), their (N, D) matrices, ordered name lists, and the cached RediSearch document counts all start empty/None and are populated lazily on first use by _load_tool_embeddings(), _load_skill_embeddings(), and the _*_redisearch_has_docs probes. The OpenRouter embedding client is likewise deferred to _get_embedding_client(); only the API key is resolved now, falling back to the OPENROUTER_API_KEY environment variable when api_key is None. Emits a single configuration INFO log line.
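The deferred-initialization pattern described above can be sketched as below. The class LazyClassifierSketch, its load_count counter, and the stand-in cache contents are hypothetical, and the loader is written synchronously for brevity even though the real loaders read from an async Redis connection.

```python
import os
from typing import Dict, List, Optional


class LazyClassifierSketch:
    """Illustrative only: the constructor stores settings and performs no I/O."""

    def __init__(self, api_key: Optional[str] = None) -> None:
        # Only the key is resolved eagerly, with an environment fallback.
        if api_key is None:
            api_key = os.environ.get("OPENROUTER_API_KEY")
        self.api_key = api_key
        # Expensive state starts empty and is filled on first use.
        self._tool_embeddings_cache: Optional[Dict[str, List[float]]] = None
        self.load_count = 0  # hypothetical counter, present only for the demo

    def _load_tool_embeddings(self) -> Dict[str, List[float]]:
        if self._tool_embeddings_cache is None:
            self.load_count += 1
            # Stand-in for the Redis read performed by the real method.
            self._tool_embeddings_cache = {"web_search": [1.0, 0.0]}
        return self._tool_embeddings_cache
```

Constructing the object is therefore cheap, and repeated calls to the loader reuse the cached result rather than reading it again.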

Runs implicitly wherever a VectorClassifier is constructed across the services that perform tool/skill routing; __init__ is not invoked directly by name elsewhere.

Parameters:
  • redis_client (Redis) – Async Redis connection used for all embedding reads and RediSearch KNN queries.

  • similarity_threshold (float) – Minimum cosine similarity for a tool match to be kept.

  • top_k (int) – Maximum number of tools returned from retrieval.

  • api_key (str | None) – OpenRouter API key; falls back to OPENROUTER_API_KEY when None.

  • strategy_force_threshold (float) – Top-score cutoff above which the strategy becomes "force".

  • strategy_optional_threshold (float) – Top-score cutoff above which the strategy becomes "optional".

  • group_expansion_threshold (float) – Minimum score for a tool to trigger prefix/named-group expansion.

  • browser_tool_similarity_threshold (float) – Stronger minimum score required to keep noisy browser_* tool matches.

Return type:

None
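How the thresholds above might interact is sketched below. The helper names, the "none" label for scores under both cutoffs, and the use of inclusive comparisons are assumptions; the parameter descriptions only say that scores above each cutoff select "force" or "optional" and that browser_* tools need a stronger score.

```python
from typing import List, Sequence, Tuple


def pick_strategy(
    top_score: float,
    force_threshold: float = 0.8,
    optional_threshold: float = 0.3,
) -> str:
    """Map the best similarity score to a strategy label."""
    if top_score >= force_threshold:
        return "force"
    if top_score >= optional_threshold:
        return "optional"
    return "none"  # assumed label; not confirmed by the documentation


def filter_matches(
    scored: Sequence[Tuple[str, float]],
    similarity_threshold: float = 0.3,
    browser_threshold: float = 0.6,
    top_k: int = 15,
) -> List[Tuple[str, float]]:
    """Keep the best matches, holding browser_* tools to a higher floor."""
    kept = []
    for name, score in sorted(scored, key=lambda p: p[1], reverse=True):
        floor = browser_threshold if name.startswith("browser_") else similarity_threshold
        if score >= floor:
            kept.append((name, score))
    return kept[:top_k]
```

With the default values, a browser_click match scoring 0.5 is dropped while a send_email match scoring 0.35 is kept.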

async classify(message, query_embedding=None, registry_tool_names=None, *, scan_explicit_tool_mentions=True, observability_extra=None)[source]

Classify message and return the matching tool names and a strategy.

Parameters:
  • query_embedding (ndarray | None) – Pre-computed embedding for message. When provided the internal embedding API call is skipped.

  • registry_tool_names (Optional[Iterable[str]]) – Registered tool names (e.g. registry keys). When provided, any name that appears as a whole token in message is included in the tool set alongside vector matches.

  • scan_explicit_tool_mentions (bool) – When True (default), scan message for explicit registered tool names. Set False for non-user text (e.g. assistant drafts, response postprocessing) so mentions in those strings never inflate the tool set.

  • message (str)

  • observability_extra (Mapping[str, Any] | None)

Returns:

A dict with keys tools, strategy, complexity, and safety.

Return type:

dict[str, Any]
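The effect of scan_explicit_tool_mentions on the final tool set can be illustrated with a small merge helper. The function merge_tool_sets and the ordering it produces (vector matches first, explicit mentions appended) are assumptions; the documentation only states that explicitly named tools are included alongside vector matches.

```python
from typing import Iterable, List


def merge_tool_sets(
    vector_matches: Iterable[str],
    explicit_names: Iterable[str],
    scan_explicit_tool_mentions: bool = True,
) -> List[str]:
    """Combine vector matches with explicitly named tools, without duplicates."""
    merged = list(vector_matches)
    if scan_explicit_tool_mentions:
        for name in explicit_names:
            if name not in merged:
                merged.append(name)
    return merged
```

Passing scan_explicit_tool_mentions=False leaves the vector matches untouched, which is the behaviour wanted for assistant drafts and other non-user text.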

async classify_skills(message, query_embedding=None, *, similarity_threshold=0.12, top_k=12, max_catalog_chars=4000)[source]

Retrieve tier-1 skill metadata relevant to message (progressive disclosure).

Public entry point for skill routing: it returns the slim catalog of candidate skills shown to the model first, so the full skill body is only loaded on demand. Blank messages short-circuit to an empty list. It ensures skill embeddings are available (RediSearch via _skill_redisearch_has_docs(), else _load_skill_embeddings()), embeds the message through _get_query_embedding() unless a precomputed vector is supplied, ranks candidates with _find_matching_skills(), projects each match down to the four catalog fields, and finally bounds the result through _trim_skills_catalog().

Called by message_processor/generate_and_send.py (the per-message generation path) as self._classifier.classify_skills(...).

Parameters:
  • message (str) – The user message to route skills for.

  • query_embedding (ndarray | None) – Precomputed embedding for message; when given, the internal embedding call is skipped.

  • similarity_threshold (float) – Minimum cosine score for a skill to be kept.

  • top_k (int) – Maximum number of skills to retrieve before trimming.

  • max_catalog_chars (int) – Character budget passed to _trim_skills_catalog().

Returns:

Skill dicts with skill_id, name, description, and score; empty when nothing qualifies.

Return type:

list[dict[str, Any]]
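The last two steps, projecting each match to the four catalog fields and bounding the result by a character budget, might look like the sketch below. How _trim_skills_catalog() measures size is not documented; counting the JSON-serialized length of each entry, and the name project_and_trim, are assumptions for illustration.

```python
import json
from typing import Any, Dict, List, Sequence

CATALOG_FIELDS = ("skill_id", "name", "description", "score")


def project_and_trim(
    matches: Sequence[Dict[str, Any]],
    max_catalog_chars: int = 4000,
) -> List[Dict[str, Any]]:
    """Keep only catalog fields and stop once the character budget is spent."""
    catalog: List[Dict[str, Any]] = []
    used = 0
    for match in matches:  # assumed to arrive sorted best-first
        entry = {field: match.get(field) for field in CATALOG_FIELDS}
        size = len(json.dumps(entry))  # assumed size measure
        if used + size > max_catalog_chars:
            break
        catalog.append(entry)
        used += size
    return catalog
```

Dropping everything except the four slim fields is what keeps the full skill body out of the prompt until it is requested.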

async classify_response_for_missing_tools(response_text, current_tools, threshold=0.85, *, observability_extra=None)[source]

Find tools the bot might need but lacks.

Used for dynamic tool expansion when the bot signals it needs tools not included in the original set.

Runs vector similarity only on response_text and does not call find_tools_explicitly_named(), so tool names that appear in assistant output or postprocessed reply text never add tools.
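The selection step can be sketched as a filter over already-scored candidates. The helper select_missing_tools is hypothetical, and both the inclusive comparison against threshold and the exclusion of names already in current_tools are assumptions drawn from the summary "tools the bot might need but lacks".

```python
from typing import Iterable, List, Sequence, Tuple


def select_missing_tools(
    scored: Sequence[Tuple[str, float]],
    current_tools: Iterable[str],
    threshold: float = 0.85,
) -> List[str]:
    """Return high-scoring tool names that the bot does not already have."""
    have = set(current_tools)
    ranked = sorted(scored, key=lambda p: p[1], reverse=True)
    return [name for name, score in ranked if score >= threshold and name not in have]
```

The high default threshold keeps expansion conservative: only tools that match the response text strongly are added.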

Return type:

list[str]

Parameters:
  • response_text

  • current_tools

  • threshold

  • observability_extra

async close()[source]

Close the underlying embedding client.

Releases the lazily-created OpenRouter embedding client and its HTTP session when one exists, then clears the reference so a later call can rebuild it via _get_embedding_client(). Safe to call when no client was ever created. Touches neither Redis nor any other resource; only the embedding client’s network session is released.

This is the classifier’s lifecycle teardown hook; no in-repo caller invokes it by name (owners are expected to call it during their own shutdown alongside other resource cleanup).

Return type:

None
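The teardown contract described above (safe when nothing was created, and rebuildable afterwards) can be sketched with a stand-in client. Both classes below are hypothetical and perform no network I/O; they only mirror the lifecycle.

```python
import asyncio


class _FakeEmbeddingClient:
    """Stand-in for the OpenRouter embedding client."""

    def __init__(self) -> None:
        self.closed = False

    async def close(self) -> None:
        self.closed = True


class ClosableSketch:
    def __init__(self) -> None:
        self._client = None  # created lazily

    def _get_embedding_client(self) -> _FakeEmbeddingClient:
        if self._client is None:
            self._client = _FakeEmbeddingClient()
        return self._client

    async def close(self) -> None:
        if self._client is not None:
            await self._client.close()
            self._client = None  # allow a later rebuild


async def main():
    c = ClosableSketch()
    await c.close()                      # no client yet: a no-op
    first = c._get_embedding_client()
    await c.close()                      # closes and clears the reference
    second = c._get_embedding_client()   # a fresh client is built
    return first.closed, second is not first


if __name__ == "__main__":
    print(asyncio.run(main()))
```

An owner would typically await close() once during its own shutdown, next to its other resource cleanup.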

async classifiers.vector_classifier.initialize_tool_embeddings_from_file(index_file_path, redis_client, api_key=None, force_recompute=False)[source]

Compute centroid embeddings and store in Redis.

Reads tool_index_data.json, embeds every synthetic query per tool, calculates the centroid, and writes the result into Redis hashes.

Return type:

bool

Parameters:
  • index_file_path (str)

  • redis_client (redis.asyncio.Redis)

  • api_key (str | None)

  • force_recompute (bool)
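The centroid step is the element-wise mean of a tool's synthetic-query embeddings. The sketch below shows that arithmetic on plain lists; the function name centroid is an assumption, and whether the module normalizes the result afterwards is not stated here, so no normalization is applied.

```python
from typing import List, Sequence


def centroid(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise mean of equally sized embedding vectors."""
    if not vectors:
        raise ValueError("at least one embedding is required")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all embeddings must have the same dimension")
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(dim)]


# Two toy query embeddings for one hypothetical tool.
tool_centroid = centroid([[1.0, 0.0], [0.0, 1.0]])
```

Storing one centroid per tool means that query-time scoring compares against a single vector per tool rather than against every synthetic query.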

async classifiers.vector_classifier.reload_tool_embeddings(redis_client, api_key=None)[source]

Reload embeddings from tool_index_data.json.

Convenience wrapper that force-recomputes the tool centroid embeddings: it resolves the path to tool_index_data.json next to this module and delegates to initialize_tool_embeddings_from_file() with force_recompute=True, so the existing Redis hashes are dropped and rewritten (and the per-tool RediSearch documents re-stored). This reads the index file from the filesystem, calls OpenRouter to embed every synthetic query, and writes the results back to Redis.

No in-repo caller invokes this by name; it is an initialization/maintenance entry point used when the tool corpus changes.

Parameters:
  • redis_client (Redis) – Async Redis connection the embeddings are written to.

  • api_key (str | None) – OpenRouter API key; falls back to OPENROUTER_API_KEY when None.

Returns:

True when embeddings were recomputed and stored, False on failure.

Return type:

bool