The knowledge store

Botholomew's agent has no access to your real filesystem. Its world is the membot knowledge store backing this project — a single DuckDB file at <projectDir>/index.duckdb, addressed by logical_path (an opaque string key, not a filesystem path). Every read, write, search, and delete the agent makes goes through the membot_* tools.

The safety properties this gives you:

No filesystem access. A prompt-injected instruction to "read ~/.ssh/id_rsa" fails because there is no tool that takes a host-filesystem path. The agent can only address entries already in the store.
Versioned. Every membot_write / membot_edit creates a new version_id. Deletes are tombstones, not unlinks. Use membot_versions to inspect history, membot_diff to compare two snapshots, and botholomew membot prune to permanently drop old versions when you want to.
Auditable. The DB is local, plain DuckDB, and your data lives in tables you can query directly with the DuckDB CLI if you ever want to.
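
To make the versioning property concrete, here is a minimal in-memory sketch of an append-only store with tombstone deletes. It is not membot's implementation: `VersionedStore`, `Version`, the integer `versionId` counter, and the id-based prune cutoff are all hypothetical names and simplifications chosen for illustration, and the real store persists to DuckDB.

```typescript
// Illustrative only: an in-memory model of append-only versioning.
// All names and shapes here are assumptions, not membot's schema.
type Version = {
  versionId: number;
  content: string | null; // null marks a tombstone (a delete)
};

class VersionedStore {
  private history = new Map<string, Version[]>();
  private nextId = 1;

  // Every write appends a new version; nothing is overwritten.
  write(logicalPath: string, content: string): number {
    return this.append(logicalPath, content);
  }

  // A delete appends a tombstone instead of dropping history.
  remove(logicalPath: string): number {
    return this.append(logicalPath, null);
  }

  // Current content, or undefined if the entry is missing or tombstoned.
  read(logicalPath: string): string | undefined {
    const versions = this.history.get(logicalPath) ?? [];
    const latest = versions[versions.length - 1];
    return latest?.content ?? undefined;
  }

  // Full history, newest first.
  versions(logicalPath: string): Version[] {
    return [...(this.history.get(logicalPath) ?? [])].reverse();
  }

  // Permanently drop versions older than the cutoff id, always keeping the latest.
  prune(beforeVersionId: number): void {
    this.history.forEach((versions, path) => {
      const latestId = versions[versions.length - 1]?.versionId;
      const kept = versions.filter(
        (v) => v.versionId === latestId || v.versionId >= beforeVersionId,
      );
      this.history.set(path, kept);
    });
  }

  private append(logicalPath: string, content: string | null): number {
    const versionId = this.nextId++;
    const versions = this.history.get(logicalPath) ?? [];
    versions.push({ versionId, content });
    this.history.set(logicalPath, versions);
    return versionId;
  }
}
```

In this model a removed entry reads as missing, yet its earlier versions stay listed until `prune` is called, which is the distinction between a tombstone and an unlink.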

The store itself is owned by membot — including the ingestion pipeline (PDF/DOCX/HTML → markdown, local WASM embeddings, hybrid BM25 + semantic search), URL refresh, and append-only versioning. This page documents the Botholomew-side surface: the agent tools, the line-patch edit shape, and the CLI passthrough.

Agent tools

Each membot_* tool wraps one membot operation. Names mirror upstream membot exactly so reading membot's docs gives you the same vocabulary the agent uses.

| Tool | Purpose |
| --- | --- |
| `membot_add` | Ingest a local file, directory, glob, URL, or `inline:<text>` literal. |
| `membot_list` | List current entries (one row per `logical_path`). |
| `membot_tree` | Render the path tree synthesized from `/` segments in `logical_path`. |
| `membot_read` | Read the current (or a historical) version of an entry. |
| `membot_search` | Hybrid semantic + BM25 search with RRF fusion. |
| `membot_info` | Inspect metadata (source, mime, sha256s, refresh status) for one entry. |
| `membot_stats` | Counts and storage summary for the whole store. |
| `membot_versions` | List every version of an entry (newest first). |
| `membot_diff` | Unified diff between two versions of an entry. |
| `membot_write` | Write inline content as a new version. Whole-file replace. |
| `membot_move` | Rename a `logical_path` (creates a new version, tombstones the old). |
| `membot_remove` | Tombstone one or more entries. Use `membot_prune` to GC. |
| `membot_refresh` | Re-fetch a URL-backed entry (if its source supports refresh). |
| `membot_prune` | Permanently drop history older than a cutoff. |

Botholomew adds six wrappers on top so the agent can use the file-shaped idioms it already knows:

| Wrapper | Behavior |
| --- | --- |
| `membot_edit` | `read` → apply git-hunk line patches → `write`. Same `LinePatchSchema` as `task_edit`, `schedule_edit`, `prompt_edit`. |
| `membot_copy` | `read` → `write` under a new `logical_path`. The source is untouched (use `membot_move` if you want to rename). |
| `membot_exists` | `info` + catch `not_found`. Returns `{ exists: true \| false }` and never throws. |
| `membot_count_lines` | `wc -l` over the markdown surrogate. Useful before a paginated read. |
| `membot_pipe` | Run another tool and write its output as a new membot entry without ever flowing the body through the conversation. |
| `membot_query` | Run a JSONata transform over a JSON entry (group, filter, pluck, dedup, sort, aggregate) without loading the blob into context. |
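
As a rough illustration of how thin two of these wrappers are, the sketch below composes `exists` and `copy` from `info`, `read`, and `write`. Everything in it is an assumption made for the example: the `StoreClient` interface, the `code: "not_found"` error shape, the in-memory `makeFakeClient`, and the synchronous signatures (real tools are presumably asynchronous). How errors other than `not_found` are treated is also a guess; the table only specifies the `not_found` case.

```typescript
// Hypothetical, synchronous stand-in for the store client; not membot's API.
interface StoreClient {
  info(logicalPath: string): { logicalPath: string };
  read(logicalPath: string): string;
  write(logicalPath: string, content: string): void;
}

// Assumed error shape: an Error carrying code === "not_found".
function notFound(logicalPath: string): Error & { code: string } {
  const err = new Error(`no entry at ${logicalPath}`) as Error & { code: string };
  err.code = "not_found";
  return err;
}

function isNotFound(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { code?: unknown }).code === "not_found"
  );
}

// exists = info + catch not_found.
function exists(client: StoreClient, logicalPath: string): { exists: boolean } {
  try {
    client.info(logicalPath);
    return { exists: true };
  } catch (err) {
    if (isNotFound(err)) return { exists: false };
    throw err; // assumption: unrelated failures are not swallowed
  }
}

// copy = read + write under a new logical_path; the source is untouched.
function copy(client: StoreClient, from: string, to: string): void {
  client.write(to, client.read(from));
}

// In-memory fake used only to exercise the wrappers.
function makeFakeClient(seed: Record<string, string>): StoreClient {
  const entries = new Map<string, string>();
  Object.keys(seed).forEach((key) => entries.set(key, seed[key] as string));
  const get = (logicalPath: string): string => {
    const value = entries.get(logicalPath);
    if (value === undefined) throw notFound(logicalPath);
    return value;
  };
  return {
    info: (logicalPath) => {
      get(logicalPath);
      return { logicalPath };
    },
    read: get,
    write: (logicalPath, content) => {
      entries.set(logicalPath, content);
    },
  };
}
```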

Reducing large JSON blobs with `membot_query`

External MCP tools often return big JSON arrays (an inbox, a list of issues, a table dump). Pulling the whole thing into the conversation to "count by day" or "pull these three fields" burns the context window. The pattern instead:

Land it. membot_pipe the MCP call into a logical_path — the bytes go straight to the store, never through the conversation.
Reduce it. membot_query that logical_path with a JSONata expression. Only the (usually small) result returns to the agent.

JSONata expressions run against the parsed JSON root ($). A few examples:

```
count by day:    ${ $substring(ts,0,10): $count($) }
filter:          $[amount > 100]
pluck fields:    $.{ 'id': id, 'subject': subject }
dedup a field:   $distinct(email)
top-10 newest:   $^(>created)[[0..9]]
sum a field:     $sum(amount)
```

Set output_logical_path to write the result back into the store as a new entry instead of returning it inline — handy for chaining pipe → query → query. Pass expression: "?" to get the full syntax reference back from the tool. This is a declarative transform, not code execution: a JSONata expression can only read and reshape the document it's given — it has no filesystem, network, or host access.
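
To make the intent of a few of these expressions concrete, the sketch below performs comparable reductions in plain TypeScript over a small made-up array. It is not how `membot_query` works (the tool evaluates JSONata, and the agent never writes code like this); the `Row` shape and the sample values are invented for illustration.

```typescript
// Hypothetical sample data standing in for a large JSON entry.
type Row = { id: number; ts: string; amount: number; email: string };

const rows: Row[] = [
  { id: 1, ts: "2024-05-01T09:00:00Z", amount: 50, email: "a@example.com" },
  { id: 2, ts: "2024-05-01T17:30:00Z", amount: 150, email: "b@example.com" },
  { id: 3, ts: "2024-05-02T08:15:00Z", amount: 300, email: "a@example.com" },
];

// "count by day": group on the first 10 characters of ts and count each group.
function countByDay(input: Row[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const row of input) {
    const day = row.ts.substring(0, 10);
    counts[day] = (counts[day] ?? 0) + 1;
  }
  return counts;
}

// "filter": keep rows whose amount exceeds 100.
const large = rows.filter((row) => row.amount > 100);

// "dedup a field": unique email values, in first-seen order.
const emails = rows
  .map((row) => row.email)
  .filter((email, index, all) => all.indexOf(email) === index);

// "sum a field": total of amount across all rows.
const total = rows.reduce((sum, row) => sum + row.amount, 0);
```

Each result here is far smaller than the input, which is the point of the pattern: only the reduced value needs to reach the conversation.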

The patch format

membot_edit uses the shared LinePatchSchema from src/fs/patches.ts:

```ts
{
  start_line: number,  // 1-based, inclusive
  end_line: number,    // 1-based, inclusive; 0 = insert without replacing
  content: string      // empty string deletes
}
```

Patches are applied bottom-up so earlier line numbers stay stable across a multi-hunk edit. The same shape powers task_edit, schedule_edit, prompt_edit, and skill_edit — one mental model across every resource the agent can mutate in place.
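
The bottom-up rule can be sketched as follows. This is an illustrative reimplementation, not the code in `src/fs/patches.ts`: the function name `applyLinePatches` is hypothetical, it assumes hunks do not overlap, and it assumes that an insert (`end_line: 0`) places the new content before `start_line`, which the schema comment does not spell out.

```typescript
type LinePatch = {
  start_line: number; // 1-based, inclusive
  end_line: number; // 1-based, inclusive; 0 = insert without replacing
  content: string; // empty string deletes
};

// Hypothetical helper: apply non-overlapping patches to a block of text.
function applyLinePatches(text: string, patches: LinePatch[]): string {
  const lines = text.split("\n");

  // Sort a copy by start_line descending so the lowest hunks in the file are
  // applied first; line numbers of hunks above them are then unaffected.
  const ordered = [...patches].sort((a, b) => b.start_line - a.start_line);

  for (const patch of ordered) {
    const index = patch.start_line - 1;
    // An insert removes nothing; otherwise remove the inclusive range.
    const removeCount =
      patch.end_line === 0 ? 0 : patch.end_line - patch.start_line + 1;
    if (index < 0 || index > lines.length || removeCount < 0) {
      throw new RangeError(
        `invalid patch range ${patch.start_line}-${patch.end_line}`,
      );
    }
    // Empty content contributes no lines, which turns a replace into a delete.
    const replacement = patch.content === "" ? [] : patch.content.split("\n");
    lines.splice(index, removeCount, ...replacement);
  }

  return lines.join("\n");
}
```

Under these assumptions, patching `"a\nb\nc\nd"` with a replace of line 2 by `"B"` and an insert of `"x\ny"` at line 4 yields `"a\nB\nc\nx\ny\nd"`: the insert at line 4 is applied first, so line 2 is still where the caller said it was.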

CLI passthrough

botholomew membot <verb> … spawns membot <verb> … --config <resolvedDir> (resolved from membot_scope — ~/.membot by default, <projectDir> if scope is "project") and forwards stdio. Run botholomew membot --help for the verb list.

```bash
botholomew membot add ./docs/howto.md
botholomew membot add https://docs.google.com/document/d/...
botholomew membot search "how does the worker tick claim tasks?"
botholomew membot ls
botholomew membot tree
botholomew membot read docs/howto.md
botholomew membot versions docs/howto.md
botholomew membot diff docs/howto.md v1 v2
```
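
The passthrough itself amounts to picking a config directory from the scope and appending `--config` to the forwarded arguments. The sketch below shows that shape under stated assumptions: `resolveConfigDir` and `buildMembotArgv` are hypothetical names, the scope is modeled as a plain string where only the value `"project"` is known from this page, and the home directory is passed in rather than read from the environment.

```typescript
// Hypothetical helper: choose the membot config directory from the scope.
// Only "project" is documented; any other value falls back to ~/.membot.
function resolveConfigDir(
  scope: string | undefined,
  projectDir: string,
  homeDir: string,
): string {
  return scope === "project" ? projectDir : `${homeDir}/.membot`;
}

// Hypothetical helper: build the argv forwarded to the membot binary,
// following the `membot <verb> … --config <resolvedDir>` shape.
function buildMembotArgv(
  verb: string,
  rest: string[],
  configDir: string,
): string[] {
  return [verb, ...rest, "--config", configDir];
}
```

In Node, an argv like this could be handed to `child_process.spawn("membot", argv, { stdio: "inherit" })` to forward stdio, though whether Botholomew does exactly that is not stated here.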

The Botholomew-specific helper is:

```bash
botholomew membot import-global
```

It copies ~/.membot/index.duckdb and ~/.membot/config.json into the project so you can seed a new project with whatever you've built up in your personal membot. It refuses to overwrite a non-empty project store unless you pass --force.

Where Botholomew still uses real files

Knowledge is the only thing that moved into membot. These still live as real files under <projectDir>/:

tasks/<id>.md, schedules/<id>.md — markdown + strict frontmatter, with O_EXCL lockfiles for worker claim
threads/<YYYY-MM-DD>/<id>.csv — RFC-4180 conversation logs
workers/<id>.json — pidfile + heartbeat per worker
prompts/*.md — agent's persistent context (goals, beliefs, capabilities, and any you add)
skills/*.md — slash-command skills
logs/<YYYY-MM-DD>/<workerId>.log — worker stdout/stderr
config/config.json, mcpx/servers.json — settings

All of those still route through src/fs/sandbox.ts::resolveInRoot for path safety (NFC normalize, reject .. / NUL / absolute paths, lstat-walk every component) — that helper is general, not specific to knowledge content.
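
The string-level part of those checks can be sketched as below. This is not the real `resolveInRoot`: `checkRelativePath` is a hypothetical name, the lstat-walk over each path component is omitted because it needs a live filesystem, and treating backslashes and Windows drive prefixes as separators and absolute markers is an extra assumption of this example.

```typescript
// Hypothetical helper: validate a caller-supplied relative path.
// Returns the NFC-normalized path, or throws if it is unsafe.
function checkRelativePath(input: string): string {
  // Normalize first so visually identical names compare equal afterwards.
  const normalized = input.normalize("NFC");

  if (normalized.length === 0) {
    throw new Error("empty path");
  }
  if (normalized.indexOf("\0") !== -1) {
    throw new Error("NUL byte in path");
  }
  // Reject POSIX-absolute paths and (assumption) Windows drive-letter paths.
  if (normalized.charAt(0) === "/" || /^[A-Za-z]:[\\/]/.test(normalized)) {
    throw new Error("absolute path");
  }
  // Reject any ".." component, wherever it appears.
  const segments = normalized.split(/[\\/]/);
  if (segments.some((segment) => segment === "..")) {
    throw new Error("'..' segment in path");
  }
  return normalized;
}
```

A full implementation would go on to join the result with the root and inspect every component on disk (for example to refuse symlinks), which is the step the lstat-walk covers.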
