build_kg
Standalone script to build knowledge graph entries from channel messages.
Fetches the last N messages (default 1000) from a channel via Redis cache first, falling back to the platform API. Sends ALL messages plus the entire existing knowledge graph to gemini-3-flash-preview in a single call, then presents the proposed entities/relationships for human approval before committing to FalkorDB.
- Usage:
python build_kg.py –platform discord –channel 123456789 python build_kg.py –platform discord –channel 123456789 –guild 987
- async build_kg.fetch_messages_redis(cache, platform, channel_id, count)[source]
Pull up to
countmessages from the Redis sorted-set message cache.Wraps
MessageCache.get_recent(which returns newest-first) and reverses the result into chronological order so downstream extraction sees the conversation as it happened. Any failure reading the cache is logged and swallowed, returning an empty list so the caller can fall back to the platform API. Reads Redis via the cache; does no other I/O.Called by
gather_messagesin this module as the Redis-first leg of message collection.- Parameters:
cache (
MessageCache) – TheMessageCachewrapping the Redis client.platform (
str) – Platform name (e.g.discord) used to namespace the cache.channel_id (
str) – Channel whose messages to fetch.count (
int) – Maximum number of messages to retrieve.
- Returns:
Up to
countcached messages in chronological order, or an empty list on error.- Return type:
- async build_kg.fetch_messages_discord(token, channel_id, limit)[source]
Fetch messages directly from the Discord API using discord.py.
Returns dicts with keys: user_id, user_name, text, timestamp (float).
- async build_kg.gather_messages(cache, platform, channel_id, count, cfg)[source]
Collect up to count messages, Redis-first with API fallback.
Returns a chronologically-ordered list of message dicts with keys: user_id, user_name, text, timestamp.
- async build_kg.dump_full_graph(kg)[source]
Serialize the entire knowledge graph into a human-readable text block for the LLM.
Loads up to 10,000 entities and 10,000 relationships from FalkorDB via the
KnowledgeGraphManagerand renders them as a labeled, indented listing (entities with type/category/scope/description, relationships assource -[REL]-> target). This block is prepended to the extraction prompt so the model can reference existing nodes instead of duplicating them. Returns a short placeholder when the graph is empty. Reads from the knowledge graph (FalkorDB) but writes nothing.Called by
runin this module to assemble the graph context before extraction.- Parameters:
kg (
KnowledgeGraphManager) – The knowledge graph manager to read entities and relationships from.- Returns:
A formatted, multi-line text block describing the current graph.
- Return type:
- build_kg.build_extraction_prompt(conversation_text, graph_context)[source]
Assemble the chat-format messages for the knowledge-graph extraction LLM call.
Pairs a system instruction (output only valid JSON, do not duplicate existing graph entities) with a user message that stitches together the existing-graph context, the shared
EXTRACTION_PROMPTfromkg_extraction, and the formatted conversation text. The layout primes the model to reference existing nodes by name rather than re-creating them. Pure string assembly with no I/O.Called by
run_extractionin this module immediately before the OpenRouter chat request.
- async build_kg.run_extraction(openrouter, conversation_text, graph_context)[source]
Call the LLM to extract entities and relationships.
Returns {“entities”: […], “relationships”: […]}.
- build_kg.format_entity(idx, ent)[source]
Render one proposed entity as a numbered, human-readable approval line.
Produces a single
e<N>.line (1-based label from the 0-basedidx) showing the entity’s type, name, category, optional user-id scope, and description, so the operator can review it in the terminal before approving. Pure string formatting with no I/O.Called by
prompt_approvalin this module when listing proposed entities.
- build_kg.format_relationship(idx, rel)[source]
Render one proposed relationship as a numbered, human-readable approval line.
Produces a single
r<N>.line (1-based label from the 0-basedidx) showing thesource -[relation]-> targetshape, the model’s confidence, and the description, so the operator can review it before approving. Pure string formatting with no I/O.Called by
prompt_approvalin this module when listing proposed relationships.
- build_kg.prompt_approval(entities, relationships, num_messages)[source]
Display proposed entries and return approved indices.
Returns (entity_indices, relationship_indices) or None to quit. Entity/relationship indices are 0-based.
- async build_kg.commit_entities(kg, entities, approved_indices, channel_id, entity_uuid_lookup)[source]
Resolve-or-create approved entities. Returns count committed.
Populates entity_uuid_lookup with name->uuid mappings.
- async build_kg.commit_relationships(kg, relationships, approved_indices, entity_uuid_lookup)[source]
Persist the operator-approved relationships into the knowledge graph.
Iterates the approved indices and, for each relationship, resolves its source and target entity UUIDs: first from
entity_uuid_lookup(populated bycommit_entities), then falling back to_guess_uuidfor endpoints that already existed in the graph. Relationships with a missing name or unresolvable endpoint are skipped with a printed notice. Resolved edges are written viaKnowledgeGraphManager.add_relationshipusing the model’s confidence as the edge weight, which creates or reinforces the edge. Reads and writes the knowledge graph (FalkorDB) and prints progress/errors to stdout; individual failures are caught so one bad edge does not abort the rest.Called by
runin this module aftercommit_entities, and mirrored by the equivalent step inmemories_port/import_memories.py.- Parameters:
kg (
KnowledgeGraphManager) – The knowledge graph manager used to resolve and add edges.relationships (
list[dict]) – All proposed relationship dicts from the extraction.approved_indices (
list[int]) – Zero-based indices of the relationships the operator approved.entity_uuid_lookup (
dict[str,str]) – Name (lowercased) to UUID map fromcommit_entities.
- Returns:
The number of relationships successfully committed.
- Return type:
- build_kg.format_conversation(messages)[source]
Flatten the gathered message dicts into a single transcript string for the LLM.
Renders each message as one
[ISO-timestamp] user_name (user_id): textline, with the epoch timestamp converted to a UTC ISO string, and joins them with newlines. The result becomes theconversation_textfed intobuild_extraction_prompt. Pure string formatting with no I/O.Called by
runin this module once messages have been gathered.
- async build_kg.run(args)[source]
Drive the full build-KG pipeline end to end for one channel.
Loads
Config, validates thatredis_urlandapi_keyare present (exiting if not), and wires up anOpenRouterClient(gemini-3-flash-preview), aMessageCache, and aKnowledgeGraphManager(ensuring its FalkorDB indexes). It then gathers up toargs.countmessages (Redis-first, Discord API fallback) viagather_messages, dumps the existing graph withdump_full_graph, formats the transcript, makes a single LLM extraction call throughrun_extraction, presents the proposals for interactive approval viaprompt_approval, and commits the approved entities and relationships withcommit_entitiesandcommit_relationships. Progress and a final summary are printed to stdout; the message cache is closed on every exit path.This is the script’s async entry point: it touches Redis, the platform API, the LLM over HTTP, FalkorDB, and stdin/stdout, and may call
sys.exiton misconfiguration. Invoked once bymainviaasyncio.run(run(args)).
- build_kg.main()[source]
Parse command-line arguments, configure logging, and launch the async pipeline.
Builds the
argparseparser for the script’s flags (--platformand--channelrequired, plus--guild,--count, and--verbose), sets up root logging at DEBUG or WARNING depending on the verbosity flag, and hands control to the asyncruncoroutine viaasyncio.run. This is the synchronous CLI entry point.Called from the
__main__guard at the bottom of this module when the script is run directly (python build_kg.py ...).- Return type:
- Returns:
None.