Skip to content

Architecture and Four Tools

The Monorepo

Pi is built as a monorepo (badlogic/pi-mono) with cleanly separated packages:

Package Purpose
pi-ai Unified LLM API across 4 wire protocols (OpenAI Completions, OpenAI Responses, Anthropic Messages, Google Generative AI)
pi-agent-core Agent loop, tool execution, event streaming
pi-tui Terminal UI framework with differential rendering, flicker-free output
pi-coding-agent The CLI that wires it all together
pi-mom Slack bot / autonomous agent built on pi
pi-web-ui Web-based chat interface components
pi-pods vLLM pod management for self-hosting

Four Tools

The agent has exactly four built-in tools:

read   -- Read file contents (text and images)
write  -- Create or overwrite files
edit   -- Surgical find-and-replace edits
bash   -- Execute shell commands

That's the entire tool surface. Everything else is built on top via extensions.

Why only four? Because these are the primitives that cover 95% of coding tasks. More tools means more token overhead in the system prompt, more confusion for the model, and more things that can break between releases.

Four Execution Modes

Interactive  -- Full TUI experience (default)
Print/JSON   -- pi -p "query" for scripts, --mode json for event streams
RPC          -- JSON protocol over stdin/stdout for non-Node integrations
SDK          -- Embed pi in your own apps (how OpenClaw is built)

Context Control Stack

Pi provides multiple layers for controlling what enters the model's context:

SYSTEM.md        -- Replace or append to the default system prompt (per-project)
AGENTS.md        -- Project instructions, loaded from ~/.pi/agent/, parent dirs, and cwd
Skills           -- On-demand capability packages (progressive disclosure)
Prompt templates -- Reusable prompts as markdown files (/name to expand)
Extensions       -- Dynamic context injection, RAG, message filtering, compaction

The key insight: skills are loaded on-demand (the agent reads the README only when relevant), which means you pay the token cost only when needed. This is "progressive disclosure" -- the opposite of MCP, which dumps all tool descriptions into context at session start.

Session Format

Sessions are JSONL files with a tree structure. Each entry has an id and parentId, enabling in-place branching without creating new files. The format supports:

  • User messages, assistant messages, tool results
  • Bash execution records (command, output, exit code)
  • Custom messages (extension state, persisted across restarts)
  • Branch summaries and compaction summaries
  • Full token usage and cost tracking per message

See 04-sessions.md for details.