tools.query_arxiv module

Tool: query_arxiv

Search the arXiv database for scientific papers and return their abstracts, authors, and URLs.

async tools.query_arxiv.run(query, max_results=3)

Search arXiv for papers and return their metadata as a JSON string.

This is the tool entrypoint for query_arxiv. It clamps max_results to a hard ceiling of 10, URL-encodes the query into an all: search expression, and issues an asynchronous GET against the public arXiv export API (http://export.arxiv.org/api/query) using an httpx.AsyncClient with a 10-second timeout and a custom User-Agent. The Atom response body is read and handed to _parse_arxiv_atom() via asyncio.to_thread() so XML parsing runs off the event loop. The structured result is serialized with the shared jsonutil module (imported as json).
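The request-building and parsing steps described above can be sketched roughly as follows. This is a minimal illustration, not the module's actual code: build_search_url and parse_atom are hypothetical helpers (parse_atom stands in for _parse_arxiv_atom()), the use of urllib.parse.quote and the output keys are assumptions, and SAMPLE_FEED is a made-up placeholder document rather than real arXiv output. The HTTP call itself is left out so the sketch runs without network access.

```python
import asyncio
import xml.etree.ElementTree as ET
from urllib.parse import quote

ARXIV_API = "http://export.arxiv.org/api/query"
MAX_RESULTS_CEILING = 10  # hard ceiling described in the docstring
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Made-up placeholder feed, used only to exercise the parser offline.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>http://arxiv.org/abs/0000.00000v1</id>
    <title>Example title</title>
    <summary>Example abstract.</summary>
    <author><name>A. Author</name></author>
  </entry>
</feed>"""


def build_search_url(query: str, max_results: int = 3) -> str:
    """Hypothetical helper: clamp max_results and build an all: search URL."""
    capped = min(max_results, MAX_RESULTS_CEILING)
    # Assumption: the query is percent-encoded with urllib.parse.quote.
    return f"{ARXIV_API}?search_query=all:{quote(query)}&max_results={capped}"


def parse_atom(body: str) -> list[dict]:
    """Hypothetical stand-in for _parse_arxiv_atom(); output keys are assumed."""
    root = ET.fromstring(body)
    papers = []
    for entry in root.findall("atom:entry", ATOM_NS):
        papers.append({
            "title": entry.findtext("atom:title", default="", namespaces=ATOM_NS).strip(),
            "abstract": entry.findtext("atom:summary", default="", namespaces=ATOM_NS).strip(),
            "authors": [
                author.findtext("atom:name", default="", namespaces=ATOM_NS)
                for author in entry.findall("atom:author", ATOM_NS)
            ],
            "url": entry.findtext("atom:id", default="", namespaces=ATOM_NS),
        })
    return papers


async def parse_off_loop(body: str) -> list[dict]:
    # Run the blocking XML parse in a worker thread so the event loop stays free.
    return await asyncio.to_thread(parse_atom, body)


if __name__ == "__main__":
    print(build_search_url("quantum entanglement", max_results=50))
    print(asyncio.run(parse_off_loop(SAMPLE_FEED)))
```

In the real tool the body passed to the parser would come from the httpx.AsyncClient response rather than a canned string.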

Any failure – network error, non-2xx status (via resp.raise_for_status()), or XML parse error – is caught, logged via logger.exception, and surfaced to the model as an error payload rather than propagated. The only side effects are the outbound HTTP request and log output; no Redis, knowledge-graph, or LLM interactions occur.
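The catch-log-and-report pattern described above might look something like the sketch below. It makes several assumptions: run_guarded and the two fetch coroutines are hypothetical names, the standard-library json module stands in for the shared jsonutil module, and the exact wording of the error message is invented for illustration.

```python
import asyncio
import json
import logging

logger = logging.getLogger("query_arxiv_sketch")  # hypothetical logger name


async def run_guarded(fetch) -> str:
    """Hypothetical wrapper: await fetch() and turn any failure into an error payload."""
    try:
        results = await fetch()
    except Exception as exc:
        # Log the full traceback, then report the failure to the caller
        # as data instead of letting the exception propagate.
        logger.exception("query_arxiv failed")
        return json.dumps({"status": "error", "message": f"arXiv query failed: {exc}"})
    return json.dumps({"status": "success", "results": results})


async def failing_fetch():
    # Simulates a network error without touching the network.
    raise ConnectionError("simulated network failure")


async def ok_fetch():
    return [{"title": "Example title"}]


if __name__ == "__main__":
    print(asyncio.run(run_guarded(failing_fetch)))
    print(asyncio.run(run_guarded(ok_fetch)))
```

Returning the failure as a payload keeps the tool call from raising into the dispatch layer, so the model always receives a well-formed JSON string.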

The function is invoked dynamically by the tool dispatch layer when the query_arxiv tool is selected; a search by name found no direct internal callers.

Parameters:

query (str) – The free-text arXiv search query (e.g. "quantum entanglement" or "au:feynman").
max_results (int) – Maximum number of papers to return; values above 10 are silently clamped to 10. Defaults to 3.

Returns:

A JSON-encoded string. On success, the payload has "status": "success" and a "results" list, plus a "message" noting that no papers matched when the search comes back empty. On failure, it has "status": "error" and a "message" describing the problem.

Return type:

str
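A caller might decode the returned string and branch on "status" along these lines. This is only a sketch: summarize_tool_output is a hypothetical helper, the fallback strings are invented, and the sample payloads are hand-written to match the documented keys rather than captured from a real call.

```python
import json


def summarize_tool_output(raw: str) -> str:
    """Hypothetical consumer: reduce the tool's JSON string to a one-line summary."""
    payload = json.loads(raw)
    if payload.get("status") == "error":
        return f"error: {payload.get('message', 'unknown error')}"
    results = payload.get("results", [])
    if not results:
        # A success payload with no results may carry an explanatory message.
        return payload.get("message", "no papers matched")
    return f"{len(results)} paper(s) returned"


if __name__ == "__main__":
    # Hand-written payloads that follow the documented shape.
    print(summarize_tool_output('{"status": "success", "results": [{"title": "Example title"}]}'))
    print(summarize_tool_output('{"status": "success", "results": [], "message": "No papers found."}'))
    print(summarize_tool_output('{"status": "error", "message": "request timed out"}'))
```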