classifiers.vector_classifier module
Vector-based classifier for tool selection.
Lightweight semantic vector classifier that replaces sending all
tools to the LLM with deterministic vector retrieval. Pre-computed
centroid embeddings are stored in Redis (legacy monolithic hashes and/or
per-tool tool_emb:* / per-skill skill_emb:* HASH documents indexed
by RediSearch). At query time, RediSearch KNN is used when
idx:tool_embeddings / idx:skill_embeddings have documents;
otherwise embeddings are loaded and scored in-process (cosine_batch).
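The in-process fallback can be pictured as scoring one query vector against every stored centroid and keeping the best matches. The sketch below is only an illustration under stated assumptions: `cosine_scores_sketch` and `top_matches_sketch` are hypothetical names, not the module's `cosine_batch`, and plain Python lists stand in for the `(N, D)` matrix. The default threshold and `top_k` mirror the constructor defaults documented on this page.

```python
import math
from typing import List, Sequence, Tuple


def cosine_scores_sketch(
    query: Sequence[float], matrix: Sequence[Sequence[float]]
) -> List[float]:
    """Cosine similarity between one query vector and each row of a matrix."""
    q_norm = math.sqrt(sum(x * x for x in query))
    scores: List[float] = []
    for row in matrix:
        r_norm = math.sqrt(sum(x * x for x in row))
        if q_norm == 0.0 or r_norm == 0.0:
            # A zero vector has no direction; treat it as "no similarity".
            scores.append(0.0)
            continue
        dot = sum(a * b for a, b in zip(query, row))
        scores.append(dot / (q_norm * r_norm))
    return scores


def top_matches_sketch(
    query: Sequence[float],
    names: Sequence[str],
    matrix: Sequence[Sequence[float]],
    threshold: float = 0.3,
    top_k: int = 15,
) -> List[Tuple[str, float]]:
    """Keep rows scoring at least `threshold`, best first, at most `top_k`."""
    scored = zip(names, cosine_scores_sketch(query, matrix))
    kept = [(name, score) for name, score in scored if score >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_k]
```

Whether the real implementation treats the threshold as inclusive, or pre-normalizes the centroids, is not stated in this page; the sketch assumes an inclusive minimum and normalizes on the fly.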
- classifiers.vector_classifier.detect_tool_request_keywords(response_text)
Return True when the bot seems to request missing tools.
This lightweight regex check gates the heavier embedding-based tool expansion to avoid false positives on legitimate no-tool responses.
- classifiers.vector_classifier.find_tools_explicitly_named(message, valid_names)
Return tool names that appear verbatim in message as whole tokens.
Detection:
Maximal runs of ASCII letters, digits, and underscores (typical snake_case tools), equivalent to word boundaries for those names.
Text inside ASCII backticks (inline code): inner text is stripped and must match a registered tool name exactly, so names containing hyphens or other punctuation still match when quoted.
Hits are ordered by first occurrence in the message; each tool appears at most once.
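The two detection rules and the ordering guarantee can be sketched as follows. This is a minimal illustration based only on the description above, not the module's source: `find_named_tools_sketch` and both regular expressions are hypothetical.

```python
import re
from typing import Iterable, List, Set, Tuple

# Maximal runs of ASCII letters, digits, and underscores.
_TOKEN_RE = re.compile(r"[A-Za-z0-9_]+")
# Text between a pair of ASCII backticks (inline code).
_BACKTICK_RE = re.compile(r"`([^`]+)`")


def find_named_tools_sketch(message: str, valid_names: Iterable[str]) -> List[str]:
    """Return registered tool names mentioned in `message`, in first-seen order."""
    names: Set[str] = set(valid_names)
    hits: List[Tuple[int, str]] = []  # (position in message, tool name)

    # Rule 1: whole tokens. "run_query_fast" is one maximal run, so it does
    # not count as a mention of "run_query".
    for match in _TOKEN_RE.finditer(message):
        if match.group(0) in names:
            hits.append((match.start(), match.group(0)))

    # Rule 2: backticked text, stripped, must equal a registered name exactly.
    # This is what lets hyphenated names such as "web-search" match.
    for match in _BACKTICK_RE.finditer(message):
        inner = match.group(1).strip()
        if inner in names:
            hits.append((match.start(1), inner))

    # Order by first occurrence and keep each tool at most once.
    hits.sort(key=lambda hit: hit[0])
    ordered: List[str] = []
    seen: Set[str] = set()
    for _, name in hits:
        if name not in seen:
            seen.add(name)
            ordered.append(name)
    return ordered
```

For example, with `run_query`, `web-search`, and `send_email` registered, the message ``Use `web-search` then run_query, not run_query_fast.`` yields `web-search` followed by `run_query`.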
- class classifiers.vector_classifier.VectorClassifier(redis_client, similarity_threshold=0.3, top_k=15, api_key=None, *, strategy_force_threshold=0.8, strategy_optional_threshold=0.3, group_expansion_threshold=0.55, browser_tool_similarity_threshold=0.6)
Bases: object
Semantic vector-based classifier for tool selection.
- Parameters:
redis_client (Redis) – An async Redis connection (redis.asyncio.Redis).
similarity_threshold (float) – Minimum cosine similarity for a match.
top_k (int) – Maximum number of tools to return.
api_key (str | None) – OpenRouter API key. Falls back to the OPENROUTER_API_KEY env var.
strategy_force_threshold (float)
strategy_optional_threshold (float)
group_expansion_threshold (float)
browser_tool_similarity_threshold (float)
- __init__(redis_client, similarity_threshold=0.3, top_k=15, api_key=None, *, strategy_force_threshold=0.8, strategy_optional_threshold=0.3, group_expansion_threshold=0.55, browser_tool_similarity_threshold=0.6)
Store the Redis client and retrieval thresholds for later queries.
Records the async Redis connection and every tuneable threshold but performs no I/O: tool and skill embedding caches (_tool_embeddings_cache / _skill_embeddings_cache), their (N, D) matrices, ordered name lists, and the cached RediSearch document counts all start empty/None and are populated lazily on first use by _load_tool_embeddings(), _load_skill_embeddings(), and the _*_redisearch_has_docs probes. The OpenRouter embedding client is likewise deferred to _get_embedding_client(); only the API key is resolved now, falling back to the OPENROUTER_API_KEY environment variable when api_key is None. Emits a single configuration INFO log line.
Called wherever a VectorClassifier is constructed across the services that perform tool/skill routing; this dunder is not invoked directly by name elsewhere.
- Parameters:
redis_client (Redis) – Async Redis connection used for all embedding reads and RediSearch KNN queries.
similarity_threshold (float) – Minimum cosine similarity for a tool match to be kept.
top_k (int) – Maximum number of tools returned from retrieval.
api_key (str | None) – OpenRouter API key; falls back to OPENROUTER_API_KEY when None.
strategy_force_threshold (float) – Top-score cutoff above which the strategy becomes "force".
strategy_optional_threshold (float) – Top-score cutoff above which the strategy becomes "optional".
group_expansion_threshold (float) – Minimum score for a tool to trigger prefix/named-group expansion.
browser_tool_similarity_threshold (float) – Stronger minimum score required to keep noisy browser_* tool matches.
- Return type:
None
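The two strategy cutoffs can be read as a simple ladder on the best match's score. The sketch below is an assumption-laden illustration: `pick_strategy_sketch` is hypothetical, the comparison is taken as strictly "above" because that is how the parameters are worded, and the `"none"` fallback label is invented here because this page does not name the strategy used below the optional cutoff.

```python
def pick_strategy_sketch(
    top_score: float,
    force_threshold: float = 0.8,
    optional_threshold: float = 0.3,
) -> str:
    """Map the highest similarity score to a strategy label."""
    if top_score > force_threshold:
        return "force"
    if top_score > optional_threshold:
        return "optional"
    # Hypothetical label: the documented source does not say what is
    # returned when the score clears neither cutoff.
    return "none"
```

With the documented defaults, a top score of 0.9 maps to `"force"` and 0.5 maps to `"optional"`.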
- async classify(message, query_embedding=None, registry_tool_names=None, *, scan_explicit_tool_mentions=True, observability_extra=None)
Classify message and return tool names + strategy.
- Parameters:
query_embedding (ndarray | None) – Pre-computed embedding for message. When provided, the internal embedding API call is skipped.
registry_tool_names (Optional[Iterable[str]]) – Registered tool names (e.g. registry keys). When provided, any name that appears as a whole token in message is included in the tool set alongside vector matches.
scan_explicit_tool_mentions (bool) – When True (default), scan message for explicit registered tool names. Set False for non-user text (e.g. assistant drafts, response postprocessing) so mentions in those strings never inflate the tool set.
message (str)
- Returns:
A dict with keys tools, strategy, complexity, and safety.
- Return type:
- async classify_skills(message, query_embedding=None, *, similarity_threshold=0.12, top_k=12, max_catalog_chars=4000)
Retrieve tier-1 skill metadata relevant to message (progressive disclosure).
Public entry point for skill routing: it returns the slim catalog of candidate skills shown to the model first, so the full skill body is only loaded on demand. Blank messages short-circuit to an empty list. It ensures skill embeddings are available (RediSearch via _skill_redisearch_has_docs(), else _load_skill_embeddings()), embeds the message through _get_query_embedding() unless a precomputed vector is supplied, ranks candidates with _find_matching_skills(), projects each match down to the four catalog fields, and finally bounds the result through _trim_skills_catalog().
Called by message_processor/generate_and_send.py (the per-message generation path) as self._classifier.classify_skills(...).
- Parameters:
message (str) – The user message to route skills for.
query_embedding (ndarray | None) – Precomputed embedding for message; when given, the internal embedding call is skipped.
similarity_threshold (float) – Minimum cosine score for a skill to be kept.
top_k (int) – Maximum number of skills to retrieve before trimming.
max_catalog_chars (int) – Character budget passed to _trim_skills_catalog().
- Returns:
Skill dicts with skill_id, name, description, and score; empty when nothing qualifies.
- Return type:
- async classify_response_for_missing_tools(response_text, current_tools, threshold=0.85, *, observability_extra=None)
Find tools the bot might need but lacks.
Used for dynamic tool expansion when the bot signals it needs tools not included in the original set.
Runs vector similarity only on response_text and does not call find_tools_explicitly_named(), so tool names that appear in assistant output or postprocessed reply text never add tools.
- async close()
Close the underlying embedding client.
Releases the lazily-created OpenRouter embedding client and its HTTP session when one exists, then clears the reference so a later call can rebuild it via _get_embedding_client(). Safe to call when no client was ever created. Touches no Redis or other resources, only the embedding client's network session.
This is the classifier's lifecycle teardown hook; no in-repo caller invokes it by name (owners are expected to call it during their own shutdown alongside other resource cleanup).
- Return type:
- async classifiers.vector_classifier.initialize_tool_embeddings_from_file(index_file_path, redis_client, api_key=None, force_recompute=False)
Compute centroid embeddings and store in Redis.
Reads tool_index_data.json, embeds every synthetic query per tool, calculates the centroid, and writes the result into Redis hashes.
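The centroid step can be illustrated in isolation. The sketch below assumes the centroid is the plain element-wise mean of a tool's query embeddings, with no re-normalization; the page does not say whether the real code normalizes. `centroid_sketch` and `centroids_by_tool_sketch` are hypothetical helpers, and the embedding and Redis calls are left out entirely.

```python
from typing import Dict, List, Sequence


def centroid_sketch(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise mean of equally sized embedding vectors."""
    if not vectors:
        raise ValueError("need at least one embedding")
    dim = len(vectors[0])
    if any(len(vec) != dim for vec in vectors):
        raise ValueError("all embeddings must share one dimension")
    count = len(vectors)
    return [sum(vec[i] for vec in vectors) / count for i in range(dim)]


def centroids_by_tool_sketch(
    embedded_queries: Dict[str, Sequence[Sequence[float]]]
) -> Dict[str, List[float]]:
    """Reduce each tool's list of query embeddings to a single centroid."""
    return {
        tool: centroid_sketch(vectors)
        for tool, vectors in embedded_queries.items()
    }
```

Storing one centroid per tool keeps query-time work to a single similarity comparison per tool, regardless of how many synthetic queries were embedded.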
- async classifiers.vector_classifier.reload_tool_embeddings(redis_client, api_key=None)
Reload embeddings from tool_index_data.json.
Convenience wrapper that force-recomputes the tool centroid embeddings: it resolves the path to tool_index_data.json next to this module and delegates to initialize_tool_embeddings_from_file() with force_recompute=True, so the existing Redis hashes are dropped and rewritten (and the per-tool RediSearch documents re-stored). This reads the index file from the filesystem, calls OpenRouter to embed every synthetic query, and writes the results back to Redis.
No in-repo caller invokes this by name; it is an initialization/maintenance entry point used when the tool corpus changes.