rag_system package
RAG System for Stargazer Bot v3.
A file-based Retrieval-Augmented Generation system using:
- Gemini API for embeddings (google/gemini-embedding-001) via shared key pool
- ChromaDB for vector storage
- Full file retrieval (not chunks)
- Per-channel auto-search with context injection
- class rag_system.OpenRouterEmbeddings(api_key=None, model='google/gemini-embedding-001', dimensions=None, timeout=30.0, gemini_api_key=None, gemini_only=True)[source]
Bases: object

Async embeddings client using Gemini API via the shared key pool.
- DEFAULT_MODEL = 'google/gemini-embedding-001'
- MAX_BATCH_SIZE = 50
- MAX_BATCH_CHARS = 50000
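The two batch limits above can be illustrated with a small packing helper (a sketch of the batching idea, not the package's actual implementation):

```python
MAX_BATCH_SIZE = 50      # max texts per request
MAX_BATCH_CHARS = 50000  # max total characters per request

def split_into_batches(texts):
    """Greedily pack texts into batches that respect both limits."""
    batches, current, current_chars = [], [], 0
    for text in texts:
        # Start a new batch when either limit would be exceeded.
        if current and (len(current) >= MAX_BATCH_SIZE
                        or current_chars + len(text) > MAX_BATCH_CHARS):
            batches.append(current)
            current, current_chars = [], 0
        current.append(text)
        current_chars += len(text)
    if current:
        batches.append(current)
    return batches

# Two 30k-char texts cannot share a 50k-char batch.
batches = split_into_batches(["x" * 30000, "y" * 30000, "z"])
```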
- __init__(api_key=None, model='google/gemini-embedding-001', dimensions=None, timeout=30.0, gemini_api_key=None, gemini_only=True)[source]
Initialize the instance.
- async embed_text_for_search(text, task_type='QUESTION_ANSWERING')[source]
Embed a single text using the Gemini API only, with a task type.
Intended for pre-computing a query embedding before passing it to
FileRAGManager.search(query_embedding=...). Retries on transient errors with exponential back-off.
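The retry behaviour described above can be sketched as follows; the delay schedule, retry count, and exception type are illustrative assumptions, not the package's exact values:

```python
import asyncio

class TransientError(Exception):
    """Stand-in for a retryable API failure (illustrative)."""

async def embed_with_retry(embed_once, text, retries=3, base_delay=0.01):
    """Retry a coroutine on transient errors with exponential back-off."""
    for attempt in range(retries + 1):
        try:
            return await embed_once(text)
        except TransientError:
            if attempt == retries:
                raise
            # Wait 1x, 2x, 4x, ... the base delay between attempts.
            await asyncio.sleep(base_delay * (2 ** attempt))

# Demo: fail twice, then succeed on the third call.
calls = {"n": 0}

async def flaky_embed(text):
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientError
    return [0.1, 0.2, 0.3]

vector = asyncio.run(embed_with_retry(flaky_embed, "hello"))
```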
- class rag_system.SyncOpenRouterEmbeddings(api_key=None, model='google/gemini-embedding-001', dimensions=None, timeout=30.0, gemini_api_key=None, gemini_only=True, document_task_type=None, query_task_type=None)[source]
Bases: object

Synchronous wrapper used by ChromaDB’s embedding function interface.

Uses Gemini API via the shared key pool. Batches are dispatched concurrently via a ThreadPoolExecutor when there are multiple batches.
- MAX_BATCH_SIZE = 50
- MAX_BATCH_CHARS = 50000
- MAX_EMBED_WORKERS = 20
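The concurrent dispatch described above can be sketched with a thread pool; the dummy embedder stands in for the real API call:

```python
from concurrent.futures import ThreadPoolExecutor

MAX_EMBED_WORKERS = 20

def embed_batches(batches, embed_batch):
    """Dispatch batches concurrently when there is more than one."""
    if len(batches) <= 1:
        return [embed_batch(b) for b in batches]
    workers = min(MAX_EMBED_WORKERS, len(batches))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so results line up with batches.
        return list(pool.map(embed_batch, batches))

# Dummy embedder: one 2-dimensional vector per text.
def fake_embed(batch):
    return [[len(t), 0.0] for t in batch]

results = embed_batches([["a", "bb"], ["ccc"]], fake_embed)
```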
- __init__(api_key=None, model='google/gemini-embedding-001', dimensions=None, timeout=30.0, gemini_api_key=None, gemini_only=True, document_task_type=None, query_task_type=None)[source]
Initialize the instance.
- Parameters:
  - api_key (Optional[str]) – Unused; kept for backward compatibility.
  - model (str) – The model value.
  - timeout (float) – Maximum wait time in seconds.
  - gemini_api_key (Optional[str]) – Unused; pool is used instead.
  - gemini_only (bool) – Unused; always Gemini API.
  - document_task_type (Optional[str]) – Optional Gemini taskType for corpus (e.g. RETRIEVAL_DOCUMENT); used by embed_documents.
  - query_task_type (Optional[str]) – Optional Gemini taskType for queries (e.g. RETRIEVAL_QUERY); used by embed_query.
- __call__(input)[source]
ChromaDB EmbeddingFunction interface (legacy).
Uses document_task_type when set (same as embed_documents()).
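ChromaDB's embedding-function protocol is essentially a callable that maps a list of documents to a list of vectors. A minimal stand-in (with a toy embedding, purely for illustration) looks like:

```python
class DummyEmbeddingFunction:
    """Minimal stand-in for ChromaDB's EmbeddingFunction protocol:
    a callable mapping a list of documents to a list of vectors."""

    def __call__(self, input):
        # ChromaDB passes the documents as `input` (a list of strings).
        return [self._embed(doc) for doc in input]

    def _embed(self, text):
        # Toy embedding: character count and word count (illustrative only).
        return [float(len(text)), float(len(text.split()))]

ef = DummyEmbeddingFunction()
vectors = ef(["hello world", "hi"])
```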
- class rag_system.FileRAGManager(store_name='default', store_path=None, api_key=None, embedding_model='google/gemini-embedding-001', max_file_size=15728640, gemini_only=True, document_task_type=None, query_task_type=None)[source]
Bases: object

File-based RAG with ChromaDB storage and OpenRouter embeddings.
- __init__(store_name='default', store_path=None, api_key=None, embedding_model='google/gemini-embedding-001', max_file_size=15728640, gemini_only=True, document_task_type=None, query_task_type=None)[source]
Initialize the instance.
- Parameters:
  - store_name (str) – The store name value.
  - embedding_model (str) – The embedding model value.
  - max_file_size (int) – The max file size value.
  - gemini_only (bool) – Use only the Gemini API for embeddings.
  - document_task_type (Optional[str]) – Optional Gemini task type for indexed text (e.g. RETRIEVAL_DOCUMENT).
  - query_task_type (Optional[str]) – Optional Gemini task type for search queries (e.g. RETRIEVAL_QUERY).
- index_file(file_path, tags=None, use_chunking=True, chunk_size=1500, chunk_overlap=200, force=False)[source]
Index a single file into the collection.
When force is True the content-hash dedup check is skipped so the file is always re-embedded (but the store is not cleared).
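The force/dedup interaction can be sketched with a content-hash check; SHA-256 is an assumption here, not necessarily the package's actual fingerprint:

```python
import hashlib

def content_hash(data: bytes) -> str:
    """Fingerprint file content (SHA-256 chosen for the sketch)."""
    return hashlib.sha256(data).hexdigest()

def should_index(data: bytes, seen_hashes: set, force: bool = False) -> bool:
    """Skip content whose hash is already indexed, unless force=True."""
    h = content_hash(data)
    if not force and h in seen_hashes:
        return False
    seen_hashes.add(h)
    return True

seen = set()
first = should_index(b"same bytes", seen)                # new content -> indexed
second = should_index(b"same bytes", seen)               # duplicate -> skipped
forced = should_index(b"same bytes", seen, force=True)   # force -> re-embedded
```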
- async index_url(url, tags=None, use_chunking=True, chunk_size=1500, chunk_overlap=200)[source]
Fetch the content at url and index it into the collection.
- index_directory(directory_path, recursive=True, tags=None, exclude_patterns=None, max_workers=6, force=False, allowed_extensions=None)[source]
Index all supported files in directory_path.
When max_workers > 1, files are indexed concurrently using a thread pool. Each file’s embedding batches are already parallelised inside the embedding function, so even max_workers=1 benefits from concurrent API calls. force bypasses the per-file content-hash dedup check without clearing the store, so already-indexed files get re-embedded.
When allowed_extensions is set, only files whose suffix (after normalizing to a leading dot, lowercase) appears in the collection are queued; None means no extension filter (all supported types under SUPPORTED_EXTENSIONS).
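The extension normalization described above (leading dot, lowercase, None meaning no filter) can be sketched like this:

```python
from pathlib import Path

def normalize_extensions(allowed):
    """Normalize each entry to a lowercase suffix with a leading dot."""
    if allowed is None:
        return None  # no filter: all supported types pass
    return {("." + ext.lstrip(".")).lower() for ext in allowed}

def passes_filter(path, allowed):
    normalized = normalize_extensions(allowed)
    if normalized is None:
        return True
    return Path(path).suffix.lower() in normalized

# "md" and ".TXT" both normalize to {".md", ".txt"}.
ok_md = passes_filter("notes/README.MD", ["md", ".TXT"])
ok_py = passes_filter("src/app.py", ["md", ".TXT"])
ok_any = passes_filter("src/app.py", None)
```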
- search(query, n_results=5, tags=None, return_content=True, query_embedding=None, max_content_size=8000)[source]
Semantic search returning relevant chunks per file.
Instead of returning entire file contents, this collects the matching chunk texts that ChromaDB found and merges them (respecting max_content_size). Small files whose full text fits within one chunk are returned in full automatically.
- Parameters:
  - query (str) – Natural-language search query.
  - n_results (int) – Maximum number of files to return.
  - return_content (bool) – Include chunk text in results.
  - query_embedding (list[float] | None) – Pre-computed query embedding (skips ChromaDB’s internal embedding call).
  - max_content_size (int) – Maximum characters of merged chunk text to return per file (default 8000).
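The merge behaviour can be sketched as a size-capped concatenation; the separator between chunks is an assumption for the example:

```python
def merge_chunks(chunks, max_content_size=8000, separator="\n...\n"):
    """Concatenate matching chunk texts without exceeding max_content_size."""
    merged = ""
    for chunk in chunks:
        piece = chunk if not merged else separator + chunk
        if len(merged) + len(piece) > max_content_size:
            break  # next chunk would blow the budget; stop merging
        merged += piece
    return merged

# 40 + 5 (separator) + 40 = 85 chars fit; the third chunk would not.
text = merge_chunks(["a" * 40, "b" * 40, "c" * 40], max_content_size=90)
```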
- rag_system.get_rag_store(store_name='default', api_key=None, max_file_size=None, gemini_only=True, document_task_type=None, query_task_type=None)[source]
Get or create a RAG store by name (LRU-cached).
At most _STORE_REGISTRY_MAX_SIZE stores are kept open simultaneously. When a new store would exceed the limit, the least recently used entry is closed and evicted.
Cache entries are keyed by store_name plus optional embedding task types so different embedding configurations do not share one client.
- rag_system.get_stargazer_docs_store()[source]
Return the shared RAG store for Sphinx / tool documentation.
Uses RETRIEVAL_DOCUMENT for indexed chunks and RETRIEVAL_QUERY for search queries (Gemini embedding task types).
- rag_system.list_rag_stores_with_stats()[source]
List stores with file counts using only filesystem ops (no ChromaDB).
Counts physical files in each store’s files/ subdirectory as a lightweight proxy for the indexed entry count. This never opens a ChromaDB client and therefore uses zero additional RAM.
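The filesystem-only counting can be sketched with pathlib; the directory layout below is an assumption matching the files/ convention described above:

```python
import tempfile
from pathlib import Path

def store_stats(base_dir):
    """Count physical files under each store's files/ subdirectory."""
    stats = {}
    for store_dir in Path(base_dir).iterdir():
        files_dir = store_dir / "files"
        count = (sum(1 for p in files_dir.iterdir() if p.is_file())
                 if files_dir.is_dir() else 0)
        stats[store_dir.name] = count
    return stats

# Demo layout: two stores, one holding two indexed files.
root = Path(tempfile.mkdtemp())
(root / "docs" / "files").mkdir(parents=True)
(root / "docs" / "files" / "a.md").write_text("a")
(root / "docs" / "files" / "b.md").write_text("b")
(root / "empty" / "files").mkdir(parents=True)

stats = store_stats(root)
```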
- class rag_system.RAGAutoSearchManager(redis_client)[source]
Bases: object

Per-channel auto-search configuration backed by Redis.
- Parameters:
redis_client (aioredis.Redis)
- __init__(redis_client)[source]
Initialize the instance.
- Parameters:
redis_client (Redis) – Redis connection client.
- Return type:
None
- async set_channel_config(channel_key, store_names, enabled=True, n_results=3, min_score=0.5)[source]
Set auto-search configuration for a channel.
- async search_for_message(channel_key, message_content, chunk_size=10000, query_embedding=None, user_id='')[source]
Perform auto-search if the channel is configured.
- Parameters:
  - channel_key (str)
  - message_content (str)
  - chunk_size (int)
  - query_embedding (list[float] | None) – Pre-computed embedding for message_content. When provided it is forwarded to ChromaDB to skip its internal embedding call.
  - user_id (str) – The message author. Used to enforce access control on cloud_usr_stores.
- Returns:
  XML-formatted RAG context string, or None.
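The XML-formatted context return value can be sketched as follows; the tag names and result fields are illustrative assumptions, not the package's actual schema:

```python
from xml.sax.saxutils import escape

def format_rag_context(results):
    """Wrap search hits in an XML block for prompt injection.
    Tag names here are illustrative, not the package's real output."""
    if not results:
        return None  # nothing relevant: inject no context
    parts = ["<rag_context>"]
    for r in results:
        parts.append(
            f'  <source file="{escape(r["file"])}" score="{r["score"]:.2f}">'
            f"{escape(r['content'])}</source>"
        )
    parts.append("</rag_context>")
    return "\n".join(parts)

context = format_rag_context(
    [{"file": "guide.md", "score": 0.87, "content": "a < b"}]
)
empty = format_rag_context([])
```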
Submodules
- rag_system.auto_search module
- rag_system.file_rag_manager module
extract_pdf_text(), compress_pdf(), chunk_text(), fetch_url_content(), FileRAGManager, FileRAGManager.__init__(), FileRAGManager.index_file(), FileRAGManager.index_url(), FileRAGManager.index_directory(), FileRAGManager.search(), FileRAGManager.remove_file(), FileRAGManager.remove_url(), FileRAGManager.list_indexed_files(), FileRAGManager.list_store_files(), FileRAGManager.read_store_file(), FileRAGManager.close(), FileRAGManager.get_stats(), FileRAGManager.clear()
get_rag_store(), get_stargazer_docs_store(), list_rag_stores(), list_rag_stores_with_stats(), delete_rag_store()
- rag_system.openrouter_embeddings module
OpenRouterEmbeddings, OpenRouterEmbeddings.DEFAULT_MODEL, OpenRouterEmbeddings.MAX_BATCH_SIZE, OpenRouterEmbeddings.MAX_BATCH_CHARS, OpenRouterEmbeddings.__init__(), OpenRouterEmbeddings.embed_text(), OpenRouterEmbeddings.embed_texts(), OpenRouterEmbeddings.embed_text_for_search(), OpenRouterEmbeddings.close(), OpenRouterEmbeddings.__aenter__(), OpenRouterEmbeddings.__aexit__()
SyncOpenRouterEmbeddings, SyncOpenRouterEmbeddings.MAX_BATCH_SIZE, SyncOpenRouterEmbeddings.MAX_BATCH_CHARS, SyncOpenRouterEmbeddings.MAX_EMBED_WORKERS, SyncOpenRouterEmbeddings.__init__(), SyncOpenRouterEmbeddings.name(), SyncOpenRouterEmbeddings.dimension(), SyncOpenRouterEmbeddings.__call__(), SyncOpenRouterEmbeddings.embed_documents(), SyncOpenRouterEmbeddings.embed_query()