The Tool class
Every tool the agent can call — and every matching CLI subcommand you can run yourself — is defined once as a ToolDefinition. A single definition drives three consumers:
- The Anthropic SDK (via input_schema: JSONSchema) so the model can call it.
- Commander.js via an auto-generated subcommand.
- Tests, which import the tool directly and call execute().
This lives in src/tools/tool.ts.
Shape of a tool
```typescript
import { z } from "zod";
import type { ToolDefinition } from "../tool.ts";

const inputSchema = z.object({
  summary: z.string().describe("Summary of work done"),
});

const outputSchema = z.object({
  message: z.string(),
  is_error: z.boolean(),
});

export const completeTaskTool = {
  name: "complete_task",
  description:
    "Mark the current task as complete with a summary of what was accomplished.",
  group: "task",
  terminal: true,
  inputSchema,
  outputSchema,
  execute: async (input, ctx) => ({
    message: `Task completed: ${input.summary}`,
    is_error: false,
  }),
} satisfies ToolDefinition<typeof inputSchema, typeof outputSchema>;
```

Fields:
| Field | Purpose |
|---|---|
| name | Snake-case identifier; also the CLI subcommand name |
| description | Used for both the LLM tool definition and CLI help text |
| group | Groups tools into CLI namespaces (task, file, dir, …) |
| terminal | If true, the agent loop ends when this tool is called (e.g., complete_task, fail_task, wait_task) |
| inputSchema | Zod schema with .describe() per field — becomes JSON Schema for the model and Commander flags for the CLI |
| outputSchema | Zod schema guaranteeing the shape of the response |
| execute | The actual implementation, receiving validated input and a ToolContext |
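For orientation, the field table above could be typed roughly as follows. This is a dependency-free sketch under stated assumptions, not the contents of src/tools/tool.ts: Schema<T> is a hypothetical stand-in for a Zod schema, ToolContextStub is a placeholder for the real ToolContext, and ToolDefinitionSketch, summarySchema, and resultSchema are illustrative names.

```typescript
// Hypothetical stand-in for a Zod schema: anything that can validate unknown input.
interface Schema<T> {
  parse(input: unknown): T; // throws on invalid input
}

// Placeholder for the real ToolContext; only one field is modelled here.
interface ToolContextStub {
  projectDir: string;
}

// Assumed shape of a tool definition, mirroring the field table above.
interface ToolDefinitionSketch<I, O> {
  name: string; // snake_case; also the CLI subcommand name
  description: string; // shared by the LLM tool definition and CLI help
  group: string; // CLI namespace, e.g. "task"
  terminal?: boolean; // true ends the agent loop after this call
  inputSchema: Schema<I>;
  outputSchema: Schema<O>;
  execute(input: I, ctx: ToolContextStub): Promise<O>;
}

// Hand-written schemas so the sketch runs without Zod installed.
const summarySchema: Schema<{ summary: string }> = {
  parse(input) {
    const summary = (input as { summary?: unknown } | null)?.summary;
    if (typeof summary !== "string") throw new Error("summary must be a string");
    return { summary };
  },
};

const resultSchema: Schema<{ message: string; is_error: boolean }> = {
  parse(input) {
    const value = input as { message?: unknown; is_error?: unknown } | null;
    const message = value?.message;
    const isError = value?.is_error;
    if (typeof message !== "string" || typeof isError !== "boolean") {
      throw new Error("invalid tool result");
    }
    return { message, is_error: isError };
  },
};

const completeTaskSketch: ToolDefinitionSketch<
  { summary: string },
  { message: string; is_error: boolean }
> = {
  name: "complete_task",
  description: "Mark the current task as complete.",
  group: "task",
  terminal: true,
  inputSchema: summarySchema,
  outputSchema: resultSchema,
  execute: async (input) => ({
    message: `Task completed: ${input.summary}`,
    is_error: false,
  }),
};
```

In the real code the two type parameters come from the Zod schemas, which is why the example above can write satisfies ToolDefinition<typeof inputSchema, typeof outputSchema> instead of spelling the types out.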
ToolContext
Every tool receives a ToolContext:
```typescript
interface ToolContext {
  conn: DbConnection; // short-lived connection, scoped to this tool call
  dbPath: string; // for long-running tools that manage their own withDb
  projectDir: string; // absolute path to the project
  config: Required<BotholomewConfig>; // resolved config (API keys, model, …)
  mcpxClient: McpxClient | null; // external MCP tools (may be null)
}
```

This is the only capability surface. A tool that isn't handed an mcpxClient can't reach the network; a tool that doesn't use conn or dbPath can't touch the database.
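Because mcpxClient may be null, a tool that depends on it would plausibly check before use and return a structured error rather than throw. The sketch below uses stand-in types: McpxClientStub and CtxStub model only what the example needs, the return type of listTools() is an assumption, and listExternalTools is a hypothetical tool body.

```typescript
// Stand-in types: only the fields this sketch needs. The real McpxClient and
// ToolContext live in the project; the listTools() return type is assumed.
interface McpxClientStub {
  listTools(): Promise<string[]>;
}

interface CtxStub {
  mcpxClient: McpxClientStub | null;
}

interface ListResult {
  message: string;
  is_error: boolean;
}

// Hypothetical tool body: degrade to a structured error when no client was handed in.
async function listExternalTools(ctx: CtxStub): Promise<ListResult> {
  if (ctx.mcpxClient === null) {
    return { message: "No MCP client configured", is_error: true };
  }
  const names = await ctx.mcpxClient.listTools();
  return { message: names.join(", "), is_error: false };
}
```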
conn vs dbPath
The executor (runAgentLoop / runChatTurn) wraps each tool call in withDb(dbPath, async (conn) => tool.execute(input, { ...ctx, conn })). That means:
- ctx.conn is already open for the duration of one execute() call and will be closed immediately after. Use it for ordinary tools that do one or two quick queries.
- ctx.dbPath is for tools that run long enough that holding the file lock would block the worker or CLI (e.g., context_refresh re-fetching many URLs). Wrap each DB touch in await withDb(ctx.dbPath, async (conn) => { … }) so the lock is released between items.
DuckDB holds the file lock at the instance level. A tool that holds on to ctx.conn through a long network round-trip keeps that lock held for the whole wait. When in doubt, prefer granular ctx.dbPath wrapping.
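The granular pattern can be sketched as follows. Everything here is a stand-in so the example runs on its own: this withDb only logs when a connection would open and close (the real one opens DuckDB), and fetchBody, refreshAll, lockEvents, and the placeholder SQL are hypothetical.

```typescript
// Records when the "lock" would be taken and released.
const lockEvents: string[] = [];

type FakeConn = { run(sql: string): Promise<void> };

// Stand-in for the project's withDb: open, run the callback, always close.
async function withDb<T>(
  _dbPath: string,
  fn: (conn: FakeConn) => Promise<T>,
): Promise<T> {
  lockEvents.push("open");
  try {
    return await fn({
      run: async () => {
        lockEvents.push("query");
      },
    });
  } finally {
    lockEvents.push("close");
  }
}

// Hypothetical slow step, standing in for a network fetch.
async function fetchBody(url: string): Promise<string> {
  lockEvents.push("fetch");
  return `body of ${url}`;
}

// Long-running tool body: fetch with no connection held, then open the
// database only for the brief write.
async function refreshAll(dbPath: string, urls: string[]): Promise<number> {
  let updated = 0;
  for (const url of urls) {
    const body = await fetchBody(url);
    await withDb(dbPath, async (conn) => {
      // Placeholder SQL; the real statement would store `body`.
      await conn.run(`-- store ${body.length} bytes`);
    });
    updated += 1;
  }
  return updated;
}
```

With two URLs, lockEvents ends up as fetch, open, query, close, fetch, open, query, close: every fetch happens while no connection is open.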
Anthropic adapter
toAnthropicTools() walks the registry and converts each Zod input schema to the Anthropic SDK's Tool type using z.toJSONSchema():
```typescript
{
  name: "context_write",
  description:
    "Write content to a context item. By default, fails if the (drive, path) already exists — pass on_conflict='overwrite' to replace.",
  input_schema: {
    type: "object",
    properties: { /* derived from Zod */ },
    required: ["drive", "path", "content"],
  }
}
```

context_write accepts an optional on_conflict: "error" | "overwrite" input (default "error"). A collision returns is_error: true, error_type: "path_conflict", and a next_action_hint that steers the model back to context_read or a retry with on_conflict='overwrite'.
runAgentLoop() feeds this array into client.messages.create({ tools: ... }). When the model emits a tool_use block, the loop looks up the tool by name via getTool(name), validates the input against inputSchema, calls execute(), and returns the result as a tool_result block.
Terminal tools (the ones with terminal: true) tell the loop to stop. For workers, those are complete_task, fail_task, and wait_task — any of which transitions the task out of in_progress.
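The lookup, validate, execute, and reply sequence described above might look like the sketch below. It is not the project's runAgentLoop: the types are simplified stand-ins, parse() substitutes for Zod validation, handleToolUse and the local registry are hypothetical, and stopping on any terminal tool regardless of its result is an assumption.

```typescript
interface ToolUse {
  id: string;
  name: string;
  input: unknown;
}

interface ToolResult {
  type: "tool_result";
  tool_use_id: string;
  content: string;
  is_error: boolean;
}

interface SketchTool {
  terminal?: boolean;
  parse(input: unknown): Record<string, unknown>; // stand-in for inputSchema validation
  execute(input: Record<string, unknown>): Promise<{ message: string; is_error: boolean }>;
}

// Stand-in for the real registry behind getTool(name).
const registry = new Map<string, SketchTool>();

async function handleToolUse(
  block: ToolUse,
): Promise<{ result: ToolResult; stop: boolean }> {
  // Failures become tool_result errors the model can read, not exceptions.
  const fail = (content: string) => ({
    result: { type: "tool_result" as const, tool_use_id: block.id, content, is_error: true },
    stop: false,
  });

  const tool = registry.get(block.name);
  if (!tool) return fail(`Unknown tool: ${block.name}`);

  let input: Record<string, unknown>;
  try {
    input = tool.parse(block.input);
  } catch (err) {
    return fail(`Invalid input: ${err instanceof Error ? err.message : String(err)}`);
  }

  const output = await tool.execute(input);
  return {
    result: {
      type: "tool_result",
      tool_use_id: block.id,
      content: JSON.stringify(output),
      is_error: output.is_error,
    },
    stop: tool.terminal === true, // terminal tools end the loop
  };
}

registry.set("complete_task", {
  terminal: true,
  parse(input) {
    const summary = (input as { summary?: unknown } | null)?.summary;
    if (typeof summary !== "string") throw new Error("summary must be a string");
    return { summary };
  },
  execute: async (input) => ({
    message: `Task completed: ${String(input.summary)}`,
    is_error: false,
  }),
});
```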
CLI adapter
registerToolsAsCLI(program) iterates the registry and generates a Commander subcommand per tool, grouped by group:
```bash
botholomew context read disk:/Users/evan/notes/meeting.md --offset 10 --limit 20
botholomew context tree disk:/Users/evan/notes --max-depth 3
botholomew search semantic "quarterly revenue"
```

Positional args and --options are derived from the Zod schema shape. The same validation that runs for the LLM runs here, so you get the same error messages.
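One way the flag names could be derived is a snake_case to kebab-case mapping, sketched below. The helper toFlag is hypothetical, and the field names (max_depth, offset, limit) are inferred from the flags in the commands above rather than taken from the real schemas; how the real adapter decides which fields become positional arguments is not covered here.

```typescript
// Hypothetical helper: turn a snake_case schema field into a long CLI flag.
function toFlag(fieldName: string): string {
  return `--${fieldName.replace(/_/g, "-")}`;
}

// Assumed field names, inferred from the example commands.
const treeFields = ["max_depth"];
const readFields = ["offset", "limit"];

console.log(treeFields.map(toFlag).join(" ")); // --max-depth
console.log(readFields.map(toFlag).join(" ")); // --offset --limit
```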
Registry
Each tool is registered once in src/tools/registry.ts, so adding a tool takes three steps:
- Create src/tools/<group>/<name>.ts exporting a ToolDefinition.
- Add registerTool(myTool); to src/tools/registry.ts.
- Write a test in test/tools/<group>/<name>.test.ts.
No central dispatch table to edit, no LLM tool list to update, no CLI command to wire. The Zod schema is the source of truth.
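A registry of this kind can be as small as a map keyed by tool name. The sketch below borrows the function names registerTool, getTool, and getAllTools from this document, but the bodies are illustrative: RegisteredTool is a trimmed-down stand-in for ToolDefinition, and rejecting duplicate names is an assumption.

```typescript
// Trimmed-down stand-in for ToolDefinition.
interface RegisteredTool {
  name: string;
  group: string;
}

const tools = new Map<string, RegisteredTool>();

function registerTool(tool: RegisteredTool): void {
  // Assumption: duplicate names are rejected rather than silently replaced.
  if (tools.has(tool.name)) {
    throw new Error(`Duplicate tool name: ${tool.name}`);
  }
  tools.set(tool.name, tool);
}

function getTool(name: string): RegisteredTool | undefined {
  return tools.get(name);
}

function getAllTools(): RegisteredTool[] {
  return Array.from(tools.values());
}

registerTool({ name: "complete_task", group: "task" });
registerTool({ name: "context_read", group: "context" });
```

Both adapters can then iterate getAllTools(), which is what keeps the LLM tool list and the CLI in step with a single registration call.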
capabilities_refresh — the meta-tool
The capabilities-group tool capabilities_refresh exists so the agent can keep its own tool inventory fresh. It walks getAllTools() and mcpxClient.listTools(), then asks Claude (via chunker_model) to produce a thematic summary — one line per theme (e.g. "Gmail — read, send, draft, search, and reply to emails") rather than a line per tool. The result is written to .botholomew/capabilities.md (preserving frontmatter). Because that file is loaded into every system prompt, the next boot picks up the new inventory without another round-trip. Specific tool names are intentionally absent from the rendered file; the agent uses mcp_list_tools / mcp_search / mcp_info to look them up at call-time. See persistent-context.md for when the agent should call it. The matching CLI surface is botholomew capabilities, and the slash command is /capabilities.
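Preserving frontmatter while rewriting the body could be done along these lines. This is a sketch, not the tool's actual code: replaceBody is a hypothetical helper, it assumes YAML frontmatter delimited by --- lines with LF line endings, and the frontmatter key in the sample is made up.

```typescript
// Hypothetical helper: replace a Markdown file's body while keeping a leading
// frontmatter block (delimited by --- lines) untouched.
function replaceBody(existing: string, newBody: string): string {
  const match = existing.match(/^---\n(?:[\s\S]*?\n)?---\n/);
  const frontmatter = match ? match[0] : "";
  return `${frontmatter}${newBody}`;
}

// Sample input; the "loading" key is invented for illustration.
const existingFile = "---\nloading: always\n---\nold summary\n";
const updatedFile = replaceBody(
  existingFile,
  "Gmail: read, send, draft, search, and reply to emails\n",
);
```

A file with no frontmatter simply has its whole content replaced, since the match fails and the prefix is empty.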
Why Zod for the schema?
Zod gives us three things at once:
- Runtime validation. Untrusted inputs (from the model, from the CLI) are validated before execute() runs. A malformed tool call becomes a clear tool_result error the model can recover from, not a crash.
- TypeScript inference. z.infer<typeof inputSchema> gives execute() a statically-typed input parameter.
- JSON Schema export. z.toJSONSchema() produces the schema the Anthropic API needs without a separate definition.
The entire adapter layer is ~80 lines (src/tools/tool.ts) because Zod does the heavy lifting.