core.service_registry module
Redis-backed service registry with TTL heartbeats.
Each running instance registers under
sg:registry:service:{name}:{instance_id} with a short TTL
(SERVICE_TTL) and periodically refreshes it via
heartbeat(), so the web dashboard and operators can see which
gateway / inference / agents instances are currently alive. Entries
expire automatically when an instance dies.
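How the dashboard enumerates live instances is not described here. As a rough sketch only, a reader could list the unexpired keys under the documented prefix and decode their metadata. The helper name `list_live_instances`, the use of `scan_iter`, and the in-memory `FakeRedis` stand-in are all assumptions made for illustration, and the key parsing assumes service names contain no colon.

```python
import asyncio
import fnmatch
import json

KEY_PREFIX = "sg:registry:service:"  # prefix taken from the documented key layout


async def list_live_instances(redis):
    """Hypothetical reader: map (service_name, instance_id) -> metadata."""
    live = {}
    async for key in redis.scan_iter(match=KEY_PREFIX + "*"):
        if isinstance(key, bytes):
            key = key.decode()
        raw = await redis.get(key)
        if raw is None:
            # The entry expired between the scan and the read.
            continue
        # Assumes the service name itself contains no ":".
        service_name, instance_id = key[len(KEY_PREFIX):].split(":", 1)
        live[(service_name, instance_id)] = json.loads(raw)
    return live


class FakeRedis:
    """In-memory stand-in (illustrative only) exposing scan_iter and get."""

    def __init__(self, data):
        self.data = data

    async def scan_iter(self, match=None):
        for key in list(self.data):
            if match is None or fnmatch.fnmatchcase(key, match):
                yield key

    async def get(self, key):
        return self.data.get(key)


fake = FakeRedis({
    "sg:registry:service:gateway:g1": '{"status": "starting"}',
    "sg:registry:service:inference:i7": '{"status": "starting"}',
    "unrelated:key": "x",
})
live = asyncio.run(list_live_instances(fake))
```

Because entries expire on their own, a reader of this kind never needs to judge liveness itself: a key that is present is, by construction, one that was registered or refreshed within the last SERVICE_TTL seconds.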
- async core.service_registry.register_service(redis, service_name, instance_id, metadata)[source]
Announce a live service instance in the Redis registry with a TTL.
Writes a JSON-encoded metadata blob under
sg:registry:service:{service_name}:{instance_id} and stamps it with the short SERVICE_TTL so the entry self-expires if the instance dies without deregistering. This is the presence signal the web dashboard and operators read to see which gateway / inference / agents replicas are currently alive, and it must be kept fresh by heartbeat().
Issues a single Redis set with ex=SERVICE_TTL; any failure is logged as service_registration_failed and swallowed (registration is best-effort and never blocks boot). Called by core.service_base.StargazerService.boot() (phase 8) with metadata {"status": "starting"}; also exercised by tests/core/migration/test_service_registry.py.
- Parameters:
redis – An async Redis client supporting set with an ex TTL.
service_name (str) – Logical service/tier name (e.g. "inference").
instance_id (str) – Unique id for this replica, used in the registry key.
metadata (Dict[str, Any]) – JSON-serializable dict describing the instance (e.g. its current status); stored verbatim as the key’s value.
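The behaviour described above (one set with an expiry, failures logged and swallowed) can be sketched roughly as follows. This is not the module's source: `register_service_sketch`, `registry_key`, the `FakeRedis` stand-in, and the value `SERVICE_TTL = 30` are assumptions chosen for illustration.

```python
import asyncio
import json
import logging
from typing import Any, Dict

logger = logging.getLogger("service_registry_sketch")

SERVICE_TTL = 30  # assumed value in seconds; the real constant is not shown here


def registry_key(service_name: str, instance_id: str) -> str:
    # Key layout as documented for the registry.
    return f"sg:registry:service:{service_name}:{instance_id}"


async def register_service_sketch(
    redis, service_name: str, instance_id: str, metadata: Dict[str, Any]
) -> None:
    """Best-effort registration: write JSON metadata with a TTL."""
    try:
        await redis.set(
            registry_key(service_name, instance_id),
            json.dumps(metadata),
            ex=SERVICE_TTL,
        )
    except Exception:
        # Swallowed on purpose so that a Redis outage cannot block boot.
        logger.warning("service_registration_failed", exc_info=True)


class FakeRedis:
    """In-memory stand-in (illustrative only) recording value and TTL."""

    def __init__(self):
        self.store = {}

    async def set(self, key, value, ex=None):
        self.store[key] = (value, ex)
        return True


fake = FakeRedis()
asyncio.run(
    register_service_sketch(fake, "inference", "abc123", {"status": "starting"})
)
```

Storing the metadata and the expiry in one set call means there is no window in which the key exists without a TTL, which matters for a registry whose cleanup relies entirely on expiry.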
- async core.service_registry.heartbeat(redis, service_name, instance_id)[source]
Refresh an existing registration’s TTL to keep the instance marked alive.
Re-arms the expiry on
sg:registry:service:{service_name}:{instance_id} back to SERVICE_TTL without rewriting its metadata, so a healthy instance stays visible in the registry between full re-registrations while a crashed one lets the key lapse and disappear. Meant to be called on a periodic timer shorter than the TTL.
Issues a single Redis expire; failures are logged as service_heartbeat_failed and swallowed. No production caller invokes this yet – within the repo it is exercised only by tests/core/migration/test_service_registry.py – so the periodic refresh loop that would drive it is expected to be wired in by a service supervisor.
- Parameters:
- Returns:
True if the key existed and was successfully refreshed, False otherwise.
- Return type:
bool
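Since the refresh loop is not wired in yet, the following is only one possible shape for it: a heartbeat that re-arms the TTL with expire, driven by a timer shorter than the TTL. The names `heartbeat_sketch` and `heartbeat_loop`, the interval of one third of the TTL, the `FakeRedis` stand-in, and `SERVICE_TTL = 30` are all assumptions.

```python
import asyncio
import logging

logger = logging.getLogger("service_registry_sketch")

SERVICE_TTL = 30  # assumed value in seconds


async def heartbeat_sketch(redis, service_name: str, instance_id: str) -> bool:
    """Re-arm the TTL; True only if the key existed and was refreshed."""
    key = f"sg:registry:service:{service_name}:{instance_id}"
    try:
        return bool(await redis.expire(key, SERVICE_TTL))
    except Exception:
        logger.warning("service_heartbeat_failed", exc_info=True)
        return False


async def heartbeat_loop(redis, service_name, instance_id,
                         stop: asyncio.Event,
                         interval: float = SERVICE_TTL / 3) -> int:
    """Hypothetical supervisor loop; returns the count of successful refreshes."""
    refreshed = 0
    while not stop.is_set():
        if await heartbeat_sketch(redis, service_name, instance_id):
            refreshed += 1
        try:
            # Sleep for one interval, but wake immediately on shutdown.
            await asyncio.wait_for(stop.wait(), timeout=interval)
        except asyncio.TimeoutError:
            pass
    return refreshed


class FakeRedis:
    """In-memory stand-in (illustrative only) tracking a TTL per key."""

    def __init__(self, keys):
        self.ttls = dict.fromkeys(keys, 0)

    async def expire(self, key, seconds):
        if key not in self.ttls:
            return False
        self.ttls[key] = seconds
        return True


async def demo():
    fake = FakeRedis(["sg:registry:service:gateway:g1"])
    alive = await heartbeat_sketch(fake, "gateway", "g1")
    missing = await heartbeat_sketch(fake, "gateway", "unknown")
    stop = asyncio.Event()
    task = asyncio.create_task(
        heartbeat_loop(fake, "gateway", "g1", stop, interval=0.01)
    )
    await asyncio.sleep(0.05)
    stop.set()
    return alive, missing, await task, fake.ttls


alive, missing, count, ttls = asyncio.run(demo())
```

A False return is worth acting on in a real loop: it means the key has already lapsed, so a caller would probably want to re-register rather than keep refreshing a key that no longer exists.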
- async core.service_registry.deregister_service(redis, service_name, instance_id)[source]
Remove a service instance from the registry on graceful shutdown.
Deletes
sg:registry:service:{service_name}:{instance_id} so the instance disappears from operator and dashboard views immediately rather than lingering until its SERVICE_TTL lapses. This is the clean-exit counterpart to register_service(); the TTL is only the safety net for instances that never reach this path (crashes, kills).
Issues a single Redis delete; failures are logged as service_deregistration_failed and swallowed so teardown is never blocked. Called by core.service_base.StargazerService.shutdown() before the subclass on_stop() runs; also exercised by tests/core/migration/test_service_registry.py.
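A minimal sketch of the delete-and-swallow behaviour described above, including the case where Redis is unreachable during teardown. The names `deregister_service_sketch`, `FakeRedis`, and `BrokenRedis` are hypothetical and exist only for this illustration.

```python
import asyncio
import logging

logger = logging.getLogger("service_registry_sketch")


async def deregister_service_sketch(redis, service_name: str, instance_id: str) -> None:
    """Best-effort removal of the registry key on graceful shutdown."""
    key = f"sg:registry:service:{service_name}:{instance_id}"
    try:
        await redis.delete(key)
    except Exception:
        # Swallowed so teardown continues; the TTL removes the key later.
        logger.warning("service_deregistration_failed", exc_info=True)


class FakeRedis:
    """In-memory stand-in (illustrative only) supporting delete."""

    def __init__(self):
        self.store = {"sg:registry:service:agents:a1": '{"status": "starting"}'}

    async def delete(self, *keys):
        return sum(1 for k in keys if self.store.pop(k, None) is not None)


class BrokenRedis:
    """Stand-in that simulates an unreachable server."""

    async def delete(self, *keys):
        raise ConnectionError("redis unavailable")


fake = FakeRedis()
asyncio.run(deregister_service_sketch(fake, "agents", "a1"))

# A failing client must not raise out of the call.
broken_result = asyncio.run(deregister_service_sketch(BrokenRedis(), "agents", "a1"))
```

Swallowing the error is safe here because the failure mode degrades to the crash case: the entry simply lingers until its TTL lapses.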