Agentic CLIs
How Claude Code, Copilot CLI, Aider, and others implement the agentic loop in the terminal.
The Landscape (March 2026)
| CLI | Creator | Model | Open Source | Key Innovation |
|---|---|---|---|---|
| Claude Code | Anthropic | Claude | Yes | Subagents, MCP, hooks, extended thinking |
| Copilot CLI | GitHub | Multi-model | No | GitHub integration, agent skills, fleet mode |
| Aider | Paul Gauthier | Multi-model | Yes | Git-native, repo maps (98% token reduction) |
| OpenAI Codex CLI | OpenAI | OpenAI models | Yes | OS-level sandboxing (Seatbelt/Landlock) |
| Gemini CLI | Google | Gemini | Yes | 1M token context window |
| Cursor | Cursor Inc | Multi-model | No | IDE-native, composer agent, automations |
| Windsurf | Codeium | Multi-model | No | Cascade (multi-file agent, parallel panes) |
Claude Code
The most transparent about its architecture - it’s open source.
Architecture
User Input
|
v
Claude Code Process
|-- System prompt (CLAUDE.md, context, tools list)
|-- Tool definitions (Read, Write, Edit, Bash, Grep, Glob, Agent, etc.)
|-- Permission system (auto-allow, ask, deny)
|
v
Claude API (agentic loop)
|-- Model decides action
|-- Execute tool
|-- Feed result back
|-- Repeat until done
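The loop at the bottom of the diagram can be sketched in a few lines. This is a minimal illustration, not Claude Code's implementation: `run_agent_loop`, `ToolCall`, and `scripted_model` are hypothetical names, and the model is a scripted stand-in rather than a real API call.

```python
from dataclasses import dataclass


@dataclass
class ToolCall:
    """A request from the model to run one tool (hypothetical shape)."""
    name: str
    argument: str


def run_agent_loop(model, tools, user_input, max_steps=10):
    """Model decides -> execute tool -> feed result back -> repeat until done."""
    transcript = [("user", user_input)]
    for _ in range(max_steps):
        decision = model(transcript)
        if isinstance(decision, str):
            # Plain text (no tool call) means the model considers the task done.
            transcript.append(("assistant", decision))
            return transcript
        result = tools[decision.name](decision.argument)
        # The tool result goes back into the transcript for the next turn.
        transcript.append(("tool", f"{decision.name}: {result}"))
    transcript.append(("assistant", "stopped: step limit reached"))
    return transcript


def scripted_model(transcript):
    """Stand-in for an LLM: read one file, then answer."""
    if not any(role == "tool" for role, _ in transcript):
        return ToolCall("Read", "README.md")
    return "Done: summarized README.md"


FAKE_FILES = {"README.md": "# Demo project"}
tools = {"Read": lambda path: FAKE_FILES.get(path, "<missing>")}

history = run_agent_loop(scripted_model, tools, "Summarize the README")
for role, text in history:
    print(f"{role}: {text}")
```

The `max_steps` cap is an assumption added for safety in the sketch; the source only says the loop repeats until the model is done.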
Tool System
Claude Code exposes ~15 built-in tools:
| Tool | Purpose | Permission |
|---|---|---|
| Read | Read files | Auto-approved |
| Glob | Find files by pattern | Auto-approved |
| Grep | Search file contents | Auto-approved |
| Edit | Edit files (string replacement) | Needs approval |
| Write | Create/overwrite files | Needs approval |
| Bash | Execute shell commands | Needs approval |
| Agent | Spawn subagents | Auto-approved |
| LSP | Language server queries | Auto-approved |
Permission Model
Three tiers:
- Auto-approved - read-only operations (Read, Glob, Grep)
- Approval required - write operations (Edit, Write, Bash)
- Allowlisted - specific bash commands pre-approved in settings
Users can configure via .claude/settings.json:
{
  "permissions": {
    "allow": ["Bash(npm test)", "Bash(git status)"],
    "deny": ["Bash(rm -rf *)"]
  }
}
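One way to picture how such rules could be evaluated is a small matcher. The sketch below is an assumption-laden illustration, not Claude Code's actual matching logic: the `decide` function is hypothetical, glob matching via `fnmatchcase` is a stand-in, and checking deny rules before allow rules is an assumed precedence.

```python
from fnmatch import fnmatchcase

# Assumption: read-only tools are approved without any rule, as in the tiers above.
READ_ONLY_TOOLS = {"Read", "Glob", "Grep"}

SETTINGS = {
    "permissions": {
        "allow": ["Bash(npm test)", "Bash(git status)"],
        "deny": ["Bash(rm -rf *)"],
    }
}


def decide(tool: str, argument: str, settings: dict = SETTINGS) -> str:
    """Return 'deny', 'allow', or 'ask' for one proposed tool call."""
    call = f"{tool}({argument})"
    rules = settings.get("permissions", {})
    # Assumption: deny rules are checked first so they always win.
    if any(fnmatchcase(call, pattern) for pattern in rules.get("deny", [])):
        return "deny"
    if any(fnmatchcase(call, pattern) for pattern in rules.get("allow", [])):
        return "allow"
    if tool in READ_ONLY_TOOLS:
        return "allow"
    # Anything else falls through to an interactive prompt.
    return "ask"


for tool, arg in [("Read", "src/app.py"), ("Bash", "npm test"),
                  ("Bash", "rm -rf build"), ("Bash", "curl example.com")]:
    print(f"{tool}({arg}) -> {decide(tool, arg)}")
```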
Subagent Architecture
Claude Code spawns subagents for complex tasks:
Main conversation (full tool access)
|
|-- Agent(type="Explore") -> read-only tools, fast
|-- Agent(type="Plan") -> read-only tools, for design
|-- Agent(type="general") -> all tools, for implementation
Each subagent gets a fresh context window. Results are returned as a summary to the main conversation.
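The fresh-context behaviour can be illustrated with a toy model. Everything here is hypothetical (`spawn_subagent`, `stub_worker`, the tool sets per type); the point is only that intermediate steps stay inside the subagent and a single summary line reaches the parent.

```python
READ_ONLY = frozenset({"Read", "Glob", "Grep"})
ALL_TOOLS = READ_ONLY | {"Edit", "Write", "Bash"}

# Tool access per subagent type, following the diagram above.
SUBAGENT_TOOLS = {"Explore": READ_ONLY, "Plan": READ_ONLY, "general": ALL_TOOLS}


def stub_worker(context, allowed_tools):
    """Stand-in for a model run that performs several intermediate steps."""
    task = context[0][1]
    for step in range(5):
        context.append(("tool", f"intermediate result {step}"))
    can_write = "Edit" in allowed_tools
    return f"{task!r} finished after {len(context) - 1} tool calls (write access: {can_write})"


def spawn_subagent(parent_context, agent_type, task, worker=stub_worker):
    # Fresh context: nothing from the parent conversation is copied in.
    fresh_context = [("user", task)]
    summary = worker(fresh_context, SUBAGENT_TOOLS[agent_type])
    # Only the summary is appended to the parent, not the five tool results.
    parent_context.append(("subagent", summary))
    return summary


main_context = [("user", "Refactor the auth module")]
spawn_subagent(main_context, "Explore", "Find where tokens are verified")
print(len(main_context))
print(main_context[-1][1])
```

The parent grows by one entry regardless of how many steps the subagent took, which is why subagents help conserve the main context window.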
Context Management
- 200K token context window (Claude)
- Automatic summarization when approaching limits
- CLAUDE.md files loaded at start for project context
- Git status and recent commits included as context
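Automatic summarization near the limit might look roughly like the following. This is a sketch under stated assumptions: the 4-characters-per-token estimate, the 80% threshold, and the `compact` and `stub_summarize` helpers are all illustrative and not taken from Claude Code.

```python
def estimate_tokens(text: str) -> int:
    # Assumption: roughly 4 characters per token; real tokenizers differ.
    return max(1, len(text) // 4)


def stub_summarize(messages: list[str]) -> str:
    # Stand-in for a model call that would write a real summary.
    return f"[summary of {len(messages)} earlier messages]"


def compact(messages, limit, threshold=0.8, keep_recent=2, summarize=stub_summarize):
    """Replace older messages with one summary once usage nears the limit."""
    used = sum(estimate_tokens(m) for m in messages)
    if used <= limit * threshold or len(messages) <= keep_recent:
        return list(messages)
    older, recent = messages[:-keep_recent], messages[-keep_recent:]
    return [summarize(older)] + recent


history = ["x" * 400 for _ in range(5)]   # about 100 tokens each, 500 in total
print(len(compact(history, limit=1000)))  # well under the threshold: unchanged
print(compact(history, limit=600)[0])     # over the threshold: oldest three summarized
```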
Hooks
Custom shell commands that run on events:
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          { "type": "command", "command": "jq -r '.tool_input.command' >> /tmp/audit.log" }
        ]
      }
    ]
  }
}
GitHub Copilot CLI
GA as of February 2026. Terminal-native coding agent.
Architecture
User Input (gh copilot or standalone copilot)
|
v
Copilot CLI Engine (same as Copilot SDK)
|-- GitHub MCP Server (repo access)
|-- File system tools
|-- Terminal execution
|-- Agent skills (per-repo customization)
|
v
GitHub Models API (multi-model)
Key Features
- Agent skills - per-repo instructions in .github/copilot/skills/
- Session memory - remembers context across invocations
- GitHub-native - deep integration with issues, PRs, Actions
- Multi-model - GPT-4.1, GPT-5 mini, Claude (configurable)
- MCP support - connect external tools via MCP servers
Agent Skills
Per-repo customization files:
<!-- .github/copilot/skills/deploy.md -->
# Deploy Skill
When asked to deploy:
1. Run `npm run build`
2. Run tests with `npm test`
3. Deploy with `./scripts/deploy.sh`
4. Verify at https://staging.example.com
Skills are loaded automatically when relevant to the user’s prompt.
Copilot Coding Agent
The autonomous agent that works in GitHub Actions:
Assign issue to Copilot -> Spins up GitHub Actions environment
-> Reads codebase -> Plans changes -> Implements
-> Runs tests -> Self-corrects -> Creates PR
Triggered from: GitHub Issues, VS Code, CLI, or MCP.
Aider
Open-source, model-agnostic coding CLI. Known for its git-native workflow.
Architecture
User Input
|
v
Aider
|-- Repo map (AST-based file index)
|-- Git integration (auto-commit changes)
|-- Edit format (whole file or diff)
|
v
LLM API (Claude, GPT, Gemini, local models)
Key Innovation: Repo Maps
Aider builds an AST-based map of the entire repo:
repo_map:
  src/auth.py:
    classes: [AuthManager, TokenStore]
    functions: [verify_token, refresh_token]
  src/api.py:
    classes: [APIRouter]
    functions: [handle_request, validate_input]
This lets the LLM understand the codebase structure without reading every file. The model asks for specific files when needed.
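To make the idea concrete, here is a small sketch that builds a map of that shape for Python sources. It uses the standard-library `ast` module as a stand-in and lists only top-level definitions; it is not Aider's implementation, and `map_source` and `build_repo_map` are hypothetical names.

```python
import ast


def map_source(source: str) -> dict:
    """List the top-level classes and functions defined in one Python file."""
    tree = ast.parse(source)
    classes = [node.name for node in tree.body if isinstance(node, ast.ClassDef)]
    functions = [
        node.name
        for node in tree.body
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    ]
    return {"classes": classes, "functions": functions}


def build_repo_map(files: dict) -> dict:
    """Map each path to its symbols; `files` is {path: source text}."""
    return {path: map_source(source) for path, source in sorted(files.items())}


# In-memory example so the sketch runs without touching the filesystem.
FILES = {
    "src/auth.py": (
        "class AuthManager:\n    pass\n\n"
        "class TokenStore:\n    pass\n\n"
        "def verify_token(token):\n    return bool(token)\n"
    ),
    "src/api.py": "class APIRouter:\n    pass\n\ndef handle_request(req):\n    return req\n",
}

for path, symbols in build_repo_map(FILES).items():
    print(path, symbols)
```

The map carries names only, so it stays far smaller than the sources it describes; function bodies are sent to the model only when a file is actually requested.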
Git-Native Workflow
Every change Aider makes is automatically committed:
$ aider
> Fix the bug in auth.py
# Aider edits auth.py, then:
git add auth.py
git commit -m "fix: handle expired tokens in verify_token"
You can review the last change with git diff HEAD~1 and undo it with git revert HEAD or Aider’s /undo command.
Edit Formats
Aider supports multiple ways of applying changes:
| Format | How it works | Best for |
|---|---|---|
| Whole file (`whole`) | LLM outputs the entire file | Small files |
| Unified diff (`udiff`) | LLM outputs a unified diff | Large files |
| Search/replace (`diff`) | LLM outputs search/replace blocks | Precise edits |
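Applying a search/replace edit reduces to a guarded string substitution. The sketch below shows only that core step; it does not parse Aider's block syntax, and the exactly-one-match rule in the hypothetical `apply_search_replace` helper is an assumption chosen to avoid ambiguous edits.

```python
def apply_search_replace(text: str, search: str, replace: str) -> str:
    """Apply one search/replace edit, refusing missing or ambiguous matches."""
    matches = text.count(search)
    if matches != 1:
        raise ValueError(f"search text matched {matches} times, expected exactly 1")
    return text.replace(search, replace, 1)


original = "def verify_token(token):\n    return token is not None\n"
patched = apply_search_replace(
    original,
    search="    return token is not None\n",
    replace="    return token is not None and not token.expired\n",
)
print(patched)
```

Compared with whole-file output, the model only has to reproduce the lines it changes, which keeps responses short but makes exact matching of the search text essential.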
OpenAI Codex CLI
Open-source agentic CLI with OS-level sandboxing. Different philosophy from Claude Code.
Architecture
User Input
|
v
Codex CLI
|-- Sandbox (Seatbelt on macOS, Landlock/seccomp on Linux)
|-- Tool system (file ops, shell, web search)
|-- Multi-model (o3, o4-mini, GPT-4.1)
|
v
OpenAI API (agentic loop)
Key Innovation: OS-Level Sandboxing
Instead of application-level permission rules (like Claude Code’s allowlists), Codex uses kernel-level sandboxing:
- macOS: Apple’s Seatbelt framework restricts filesystem access
- Linux: Landlock LSM + seccomp-bpf for syscall filtering
Three modes:
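- Suggest - reads files freely; every file write and shell command needs approval
- Auto-edit - applies file edits automatically; shell commands still need approval
- Full auto - edits files and runs shell commands inside the sandbox, with network access disabled and writes confined to the working directory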
This is a fundamentally different approach. Claude Code relies on the application to check each tool call against its rules and ask the user; Codex relies on the operating system to enforce boundaries.
Gemini CLI
Google’s entry into agentic CLIs. Open source, leverages Gemini’s 1M token context.
Key Differentiator: Brute Force Context
Where Claude Code manages 200K tokens carefully (summarization, subagents), Gemini CLI just loads everything into a 1M token window. Different trade-off:
| Approach | Used by | Pros | Cons |
|---|---|---|---|
| Smart retrieval (200K) | Claude Code, Aider | Cheaper, faster | May miss context |
| Brute force (1M) | Gemini CLI, Codex | Less likely to miss context | Expensive, slower |
Uses GEMINI.md for project instructions (equivalent to CLAUDE.md).
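The trade-off in the table can be turned into a rough back-of-the-envelope check. This is a sketch only: the 4-characters-per-token ratio, the 25% reserve for the conversation itself, and the `choose_strategy` helper are assumptions for illustration, not behaviour of any of these tools.

```python
def estimate_tokens(num_chars: int, chars_per_token: float = 4.0) -> int:
    # Assumption: about 4 characters per token; real tokenizers vary by language.
    return int(num_chars / chars_per_token)


def choose_strategy(file_sizes: dict, window_tokens: int, reserve: float = 0.25) -> str:
    """Decide whether a repo fits in the window after reserving room for the chat."""
    budget = int(window_tokens * (1 - reserve))
    total = sum(estimate_tokens(size) for size in file_sizes.values())
    return "load everything" if total <= budget else "retrieve selectively"


# Hypothetical repo: 200 files of 10,000 characters, about 500K tokens in total.
repo = {f"src/module_{i}.py": 10_000 for i in range(200)}

print(choose_strategy(repo, window_tokens=200_000))
print(choose_strategy(repo, window_tokens=1_000_000))
```

Under these assumptions the same repo forces selective retrieval at 200K tokens but fits whole at 1M, which is the difference the two approaches are built around.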
Emerging Standards
AGENTS.md
An emerging cross-tool standard for repo-level agent instructions. Supported by Codex, Cursor, Copilot, and Windsurf. Most tools also keep their own proprietary format:
| Tool | Proprietary | Cross-compatible |
|---|---|---|
| Claude Code | CLAUDE.md | - |
| Gemini CLI | GEMINI.md | - |
| Codex CLI | - | AGENTS.md |
| Cursor | .cursor/rules | AGENTS.md |
| Copilot | .github/copilot-instructions.md | AGENTS.md |
| Windsurf | .windsurfrules | AGENTS.md |
Agent Skills
A cross-cutting standard for per-repo capabilities. SKILL.md files with YAML frontmatter:
---
name: deploy
description: Deploy the application to production
---
When asked to deploy:
1. Run `npm run build`
2. Run tests with `npm test`
3. Deploy with `./scripts/deploy.sh`
Works across Copilot (VS Code, CLI, coding agent) and Claude Code. Progressive loading: metadata for discovery, full body on match, supporting files on reference.
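Progressive loading can be sketched as an index that keeps only frontmatter in memory and reads the body on a match. The code below is a simplified illustration: `parse_skill` and `SkillIndex` are hypothetical, the frontmatter parser handles only flat `key: value` lines, and the keyword match stands in for whatever relevance check a real tool performs.

```python
SKILL_MD = """---
name: deploy
description: Deploy the application to production
---
When asked to deploy:
1. Run `npm run build`
2. Run tests with `npm test`
3. Deploy with `./scripts/deploy.sh`
"""


def parse_skill(text: str):
    """Split a SKILL.md file into (metadata dict, body text)."""
    lines = [line.strip() for line in text.splitlines()]
    if not lines or lines[0] != "---":
        raise ValueError("missing frontmatter")
    end = lines.index("---", 1)  # raises ValueError if the block is not closed
    meta = {}
    for line in lines[1:end]:
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    body = "\n".join(text.splitlines()[end + 1:]).strip()
    return meta, body


class SkillIndex:
    def __init__(self, skill_files: dict):
        self._files = skill_files
        # Discovery stage: only the metadata is kept in the index.
        self.metadata = {path: parse_skill(text)[0] for path, text in skill_files.items()}

    def match(self, prompt: str) -> list:
        # Assumption: a naive keyword match on the skill name.
        words = set(prompt.lower().split())
        return [path for path, meta in self.metadata.items() if meta["name"].lower() in words]

    def load_body(self, path: str) -> str:
        # Match stage: the full instructions are read only now.
        return parse_skill(self._files[path])[1]


index = SkillIndex({"skills/deploy/SKILL.md": SKILL_MD})
for path in index.match("Please deploy the app"):
    print(index.load_body(path))
```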
Two Permission Philosophies
The agentic CLI space has split into two camps:
OS-Level Sandboxing (Codex)
Kernel enforces boundaries
|-- Process can't access files outside sandbox
|-- Process can't make network calls (in restricted mode)
|-- No reliance on model behavior
Pros: Stronger guarantees, can’t be prompt-injected around. Cons: Coarse-grained (all-or-nothing per capability), OS-specific.
Application-Level Rules (Claude Code)
Application enforces boundaries
|-- Pattern matching on tool calls (Bash(npm test) = allowed)
|-- 8 lifecycle hooks for custom logic
|-- Wildcard permissions (mcp__github__*)
Pros: Fine-grained, programmable, cross-platform. Cons: Runs in same process, theoretically bypassable.
Neither approach is “right.” Codex is safer for untrusted environments; Claude Code is more flexible for power users.
Architecture Comparison
The Agentic Loop
All CLIs implement the same core loop, but with different emphases:
| Aspect | Claude Code | Copilot CLI | Codex CLI | Aider |
|---|---|---|---|---|
| Loop control | LLM-driven | LLM-driven | LLM-driven | LLM-driven |
| Tool granularity | Fine (Read, Edit, Grep separately) | Coarse (actions) | Medium (file ops, shell) | Medium (edit, run) |
| Self-correction | Yes (sees errors, retries) | Yes (monitors terminal) | Yes (in sandbox) | Yes (sees lint/test errors) |
| Parallelism | Subagents | Fleet mode | Single thread | Single thread |
| Context strategy | Summarization + subagents | Session memory | Brute force (large context) | Repo map + selective loading |
Permission Models
| CLI | Read | Write | Execute | Approach |
|---|---|---|---|---|
| Claude Code | Auto | Ask | Ask | App-level rules + hooks |
| Copilot CLI | Auto | Review | Approve | App-level + skills |
| Codex CLI | Auto | Mode-dependent | Mode-dependent | OS-level sandbox |
| Gemini CLI | Auto | Ask | Ask | App-level rules |
| Aider | Auto | Auto-commit | Ask | Minimal |
| Cursor | Auto | Review (diff) | Ask | IDE settings |
Context Management
| CLI | Strategy | Max Context |
|---|---|---|
| Claude Code | Auto-summarization + subagents | 200K tokens |
| Gemini CLI | Brute force loading | 1M tokens |
| Copilot CLI | Session memory + skills | Model-dependent |
| Aider | Repo map (tree-sitter AST + NetworkX graphs, 98% token reduction) | Model-dependent |
| Cursor | Codebase embedding index + @ references | Model-dependent |
Choosing a CLI
| If you need… | Use |
|---|---|
| Transparent, hackable architecture | Claude Code |
| GitHub ecosystem integration | Copilot CLI |
| Strongest sandboxing guarantees | Codex CLI |
| Largest context window | Gemini CLI (1M tokens) |
| Git-native workflow, model-agnostic | Aider |
| IDE integration with visual diffs | Cursor or Windsurf |
| Full autonomy on issues | Copilot Coding Agent |
| Team/enterprise features | Copilot CLI |