Architecture and Four Tools

The Monorepo

Pi is built as a monorepo (badlogic/pi-mono) with cleanly separated packages:

Package	Purpose
`pi-ai`	Unified LLM API across 4 wire protocols (OpenAI Completions, OpenAI Responses, Anthropic Messages, Google Generative AI)
`pi-agent-core`	Agent loop, tool execution, event streaming
`pi-tui`	Terminal UI framework with differential rendering, flicker-free output
`pi-coding-agent`	The CLI that wires it all together
`pi-mom`	Slack bot / autonomous agent built on pi
`pi-web-ui`	Web-based chat interface components
`pi-pods`	vLLM pod management for self-hosting

Four Tools

The agent has exactly four built-in tools:

read   -- Read file contents (text and images)
write  -- Create or overwrite files
edit   -- Surgical find-and-replace edits
bash   -- Execute shell commands

That's the entire tool surface. Everything else is built on top via extensions.

Why only four? Because these primitives cover about 95% of coding tasks. Every additional tool adds token overhead to the system prompt, gives the model more ways to get confused, and creates more surface that can break between releases.
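To make "surgical find-and-replace" concrete, here is a minimal sketch of one plausible semantics for an edit primitive. It is an illustration, not pi's actual implementation: the function name `apply_edit` and the rule that the old text must match exactly once are assumptions.

```python
def apply_edit(text: str, old: str, new: str) -> str:
    """Replace exactly one occurrence of `old` with `new`.

    Hypothetical semantics, not pi's actual implementation: the edit is
    rejected unless `old` matches exactly once, so an ambiguous or stale
    snippet can never silently change the wrong place.
    """
    if not old:
        raise ValueError("old text must not be empty")
    count = text.count(old)
    if count == 0:
        raise ValueError("old text not found")
    if count > 1:
        raise ValueError(f"old text is ambiguous ({count} matches)")
    return text.replace(old, new, 1)


source = "def area(w, h):\n    return w + h\n"
patched = apply_edit(source, "return w + h", "return w * h")
```

Failing loudly on zero or multiple matches is the design choice worth noting: the model gets an error it can react to instead of a file that was changed in the wrong place.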

Four Execution Modes

Interactive  -- Full TUI experience (default)
Print/JSON   -- pi -p "query" for scripts, --mode json for event streams
RPC          -- JSON protocol over stdin/stdout for non-Node integrations
SDK          -- Embed pi in your own apps (how OpenClaw is built)
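As a rough sketch of scripting the print mode from another language, the helper below shells out to `pi -p` and parses an event stream. Only the `-p` and `--mode json` flags come from the list above. That the two can be combined in one invocation, that events arrive as one JSON object per line, and the event fields in the sample are all assumptions.

```python
import json
import subprocess


def print_mode_command(query: str, json_events: bool = False) -> list[str]:
    """Build the argv for a one-shot run, using the flags named above."""
    argv = ["pi", "-p", query]
    if json_events:
        # Assumption: --mode json can be combined with -p in one invocation.
        argv += ["--mode", "json"]
    return argv


def parse_events(stdout: str) -> list[dict]:
    """Parse an event stream, assuming one JSON object per non-empty line."""
    return [json.loads(line) for line in stdout.splitlines() if line.strip()]


def run_query(query: str) -> str:
    """Run pi in print mode and return its stdout (requires pi on PATH)."""
    result = subprocess.run(
        print_mode_command(query), capture_output=True, text=True, check=True
    )
    return result.stdout


# Hypothetical event shapes, used only to exercise the parser.
sample = '{"type": "example_event", "seq": 1}\n\n{"type": "example_event", "seq": 2}\n'
events = parse_events(sample)
```

Check the CLI help for the exact flag combinations and event schema before relying on this in a script.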

Context Control Stack

Pi provides multiple layers for controlling what enters the model's context:

SYSTEM.md        -- Replace or append to the default system prompt (per-project)
AGENTS.md        -- Project instructions, loaded from ~/.pi/agent/, parent dirs, and cwd
Skills           -- On-demand capability packages (progressive disclosure)
Prompt templates -- Reusable prompts as markdown files (/name to expand)
Extensions       -- Dynamic context injection, RAG, message filtering, compaction
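The AGENTS.md lookup across `~/.pi/agent/`, parent directories, and the working directory can be pictured as a path walk. The sketch below only computes candidate paths. The function name and the ordering (global file first, then outermost parent down to cwd) are assumptions, not documented pi behavior.

```python
from pathlib import PurePosixPath


def agents_md_candidates(cwd: str, home: str) -> list[str]:
    """List AGENTS.md paths to check, from most general to most specific.

    The ordering (global file, then outermost parent down to cwd) is an
    assumption chosen so that nearer files can refine broader ones.
    """
    cwd_path = PurePosixPath(cwd)
    global_file = PurePosixPath(home) / ".pi" / "agent" / "AGENTS.md"
    # parents runs from nearest to root; reverse it to go root -> cwd.
    dirs = list(reversed(cwd_path.parents)) + [cwd_path]
    return [str(global_file)] + [str(d / "AGENTS.md") for d in dirs]


paths = agents_md_candidates("/work/app/src", "/home/dev")
```

A real loader would then keep only the candidates that exist on disk and concatenate their contents in this order.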

The key insight: skills are loaded on demand (the agent reads the README only when relevant), so you pay the token cost only when a skill is actually needed. This is "progressive disclosure" -- the opposite of MCP, which dumps all tool descriptions into context at session start.
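A toy cost model shows why loading on demand matters. Everything here is hypothetical: the `Skill` fields, the skill names, the sizes, and the word-count stand-in for a tokenizer are invented for illustration and do not reflect pi's internals.

```python
from dataclasses import dataclass


@dataclass
class Skill:
    name: str
    summary: str  # short line that is always in context
    body: str     # full instructions, read only on demand


def rough_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: count whitespace-separated words."""
    return len(text.split())


def upfront_cost(skills: list[Skill]) -> int:
    """Cost when every summary and body is loaded at session start."""
    return sum(rough_tokens(s.summary) + rough_tokens(s.body) for s in skills)


def on_demand_cost(skills: list[Skill], used: set[str]) -> int:
    """Cost when summaries are always loaded but bodies only if used."""
    cost = sum(rough_tokens(s.summary) for s in skills)
    cost += sum(rough_tokens(s.body) for s in skills if s.name in used)
    return cost


skills = [
    Skill("pdf", "Extract text from PDFs", "step " * 400),
    Skill("deploy", "Deploy to staging", "step " * 600),
    Skill("charts", "Render charts", "step " * 500),
]
eager = upfront_cost(skills)                      # 1509 in this toy setup
lazy = on_demand_cost(skills, used={"deploy"})    # 609 in this toy setup
```

The gap grows with the number of installed skills, since the on-demand cost scales with what a session uses rather than with what is installed.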

Session Format

Sessions are JSONL files with a tree structure. Each entry has an `id` and a `parentId`, enabling in-place branching without creating new files. The format supports:

User messages, assistant messages, tool results
Bash execution records (command, output, exit code)
Custom messages (extension state, persisted across restarts)
Branch summaries and compaction summaries
Full token usage and cost tracking per message
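The `id`/`parentId` tree can be sketched as follows: index the entries by id, then follow parent links from any leaf back to the root to recover one branch. Only `id` and `parentId` come from the description above. The other fields in the sample, and the convention that the root entry has a null `parentId`, are assumptions.

```python
import json


def load_entries(jsonl: str) -> dict[str, dict]:
    """Index session entries by id, one JSON object per line."""
    entries = {}
    for line in jsonl.splitlines():
        if line.strip():
            entry = json.loads(line)
            entries[entry["id"]] = entry
    return entries


def branch_to(entries: dict[str, dict], leaf_id: str) -> list[dict]:
    """Follow parentId links from a leaf to the root, returned root-first."""
    path = []
    current = leaf_id
    while current is not None:
        entry = entries[current]
        path.append(entry)
        # Assumption: the root entry carries a null (or missing) parentId.
        current = entry.get("parentId")
    path.reverse()
    return path


# Hypothetical entries: only id and parentId come from the format description.
session = "\n".join([
    '{"id": "a", "parentId": null, "type": "user", "text": "fix the bug"}',
    '{"id": "b", "parentId": "a", "type": "assistant", "text": "try patch 1"}',
    '{"id": "c", "parentId": "a", "type": "assistant", "text": "try patch 2"}',
    '{"id": "d", "parentId": "c", "type": "user", "text": "looks good"}',
])
entries = load_entries(session)
branch = [e["id"] for e in branch_to(entries, "d")]  # ["a", "c", "d"]
```

Entries `b` and `c` share the parent `a`, which is what branching in place looks like: both alternatives live in the same file, and a branch is just the chain of parents behind a chosen leaf.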

See 04-sessions.md for details.