Agentic CLIs

How Claude Code, Copilot CLI, Aider, and others implement the agentic loop in the terminal.

The Landscape (March 2026)

CLI	Creator	Model	Open Source	Key Innovation
Claude Code	Anthropic	Claude	Yes	Subagents, MCP, hooks, extended thinking
Copilot CLI	GitHub	Multi-model	No	GitHub integration, agent skills, fleet mode
Aider	Paul Gauthier	Multi-model	Yes	Git-native, repo maps (98% token reduction)
OpenAI Codex CLI	OpenAI	OpenAI models	Yes	OS-level sandboxing (Seatbelt/Landlock)
Gemini CLI	Google	Gemini	Yes	1M token context window
Cursor	Cursor Inc	Multi-model	No	IDE-native, composer agent, automations
Windsurf	Codeium	Multi-model	No	Cascade (multi-file agent, parallel panes)

Claude Code

The most transparent about its architecture - it’s open source.

Architecture

User Input
    |
    v
Claude Code Process
    |-- System prompt (CLAUDE.md, context, tools list)
    |-- Tool definitions (Read, Write, Edit, Bash, Grep, Glob, Agent, etc.)
    |-- Permission system (auto-allow, ask, deny)
    |
    v
Claude API (agentic loop)
    |-- Model decides action
    |-- Execute tool
    |-- Feed result back
    |-- Repeat until done

Tool System

Claude Code exposes ~15 built-in tools:

Tool	Purpose	Permission
Read	Read files	Auto-approved
Glob	Find files by pattern	Auto-approved
Grep	Search file contents	Auto-approved
Edit	Edit files (string replacement)	Needs approval
Write	Create/overwrite files	Needs approval
Bash	Execute shell commands	Needs approval
Agent	Spawn subagents	Auto-approved
LSP	Language server queries	Auto-approved

Permission Model

Three tiers:

Auto-approved - read-only operations (Read, Glob, Grep)
Approval required - write operations (Edit, Write, Bash)
Allowlisted - specific bash commands pre-approved in settings

Users can configure via .claude/settings.json:

{
  "permissions": {
    "allow": ["Bash(npm test)", "Bash(git status)"],
    "deny": ["Bash(rm -rf *)"]
  }
}

Subagent Architecture

Claude Code spawns subagents for complex tasks:

Main conversation (full tool access)
    |
    |-- Agent(type="Explore")   -> read-only tools, fast
    |-- Agent(type="Plan")      -> read-only tools, for design
    |-- Agent(type="general")   -> all tools, for implementation

Each subagent gets a fresh context window. Results are returned as a summary to the main conversation.

Context Management

200K token context window (Claude)
Automatic summarization when approaching limits
CLAUDE.md files loaded at start for project context
Git status and recent commits included as context

Hooks

Custom shell commands that run on events:

{
  "hooks": {
    "on_tool_call": {
      "Bash": "echo 'Bash command: $TOOL_INPUT' >> /tmp/audit.log"
    }
  }
}

GitHub Copilot CLI

GA as of February 2026. Terminal-native coding agent.

Architecture

User Input (gh copilot or standalone copilot)
    |
    v
Copilot CLI Engine (same as Copilot SDK)
    |-- GitHub MCP Server (repo access)
    |-- File system tools
    |-- Terminal execution
    |-- Agent skills (per-repo customization)
    |
    v
GitHub Models API (multi-model)

Key Features

Agent skills - per-repo instructions in .github/copilot/skills/
Session memory - remembers context across invocations
GitHub-native - deep integration with issues, PRs, Actions
Multi-model - GPT-4.1, GPT-5 mini, Claude (configurable)
MCP support - connect external tools via MCP servers

Agent Skills

Per-repo customization files:

<!-- .github/copilot/skills/deploy.md -->
# Deploy Skill

When asked to deploy:
1. Run `npm run build`
2. Run tests with `npm test`
3. Deploy with `./scripts/deploy.sh`
4. Verify at https://staging.example.com

Skills are loaded automatically when relevant to the user’s prompt.

Copilot Coding Agent

The autonomous agent that works in GitHub Actions:

Assign issue to Copilot -> Spins up GitHub Actions environment
    -> Reads codebase -> Plans changes -> Implements
    -> Runs tests -> Self-corrects -> Creates PR

Triggered from: GitHub Issues, VS Code, CLI, or MCP.

Aider

Open-source, model-agnostic coding CLI. Known for git-native workflow.

Architecture

User Input
    |
    v
Aider
    |-- Repo map (AST-based file index)
    |-- Git integration (auto-commit changes)
    |-- Edit format (whole file or diff)
    |
    v
LLM API (Claude, GPT, Gemini, local models)

Key Innovation: Repo Maps

Aider builds an AST-based map of the entire repo:

repo_map:
  src/auth.py:
    classes: [AuthManager, TokenStore]
    functions: [verify_token, refresh_token]
  src/api.py:
    classes: [APIRouter]
    functions: [handle_request, validate_input]

This lets the LLM understand the codebase structure without reading every file. The model asks for specific files when needed.

Git-Native Workflow

Every change Aider makes is automatically committed:

$ aider
> Fix the bug in auth.py

# Aider edits auth.py, then:
git add auth.py
git commit -m "fix: handle expired tokens in verify_token"

You can easily undo: git diff HEAD~1 or git revert HEAD.

Edit Formats

Aider supports multiple ways of applying changes:

Format	How it works	Best for
Whole file	LLM outputs entire file	Small files
Diff	LLM outputs unified diff	Large files
Search/Replace	LLM outputs search/replace blocks	Precise edits

OpenAI Codex CLI

Open-source agentic CLI with OS-level sandboxing. Different philosophy from Claude Code.

Architecture

User Input
    |
    v
Codex CLI
    |-- Sandbox (Seatbelt on macOS, Landlock/seccomp on Linux)
    |-- Tool system (file ops, shell, web search)
    |-- Multi-model (o3, o4-mini, GPT-4.1)
    |
    v
OpenAI API (agentic loop)

Key Innovation: OS-Level Sandboxing

Instead of application-level permission rules (like Claude Code’s allowlists), Codex uses kernel-level sandboxing:

macOS: Apple’s Seatbelt framework restricts filesystem access
Linux: Landlock LSM + seccomp-bpf for syscall filtering

Three modes:

Suggest - read-only, no execution
Auto-edit - can write files, no network, no shell
Full auto - network access within sandbox, shell execution

This is a fundamentally different approach. Claude Code trusts the model to ask permission; Codex trusts the operating system to enforce boundaries.

Gemini CLI

Google’s entry into agentic CLIs. Open source, leverages Gemini’s 1M token context.

Key Differentiator: Brute Force Context

Where Claude Code manages 200K tokens carefully (summarization, subagents), Gemini CLI just loads everything into a 1M token window. Different trade-off:

Approach	Used by	Pros	Cons
Smart retrieval (200K)	Claude Code, Aider	Cheaper, faster	May miss context
Brute force (1M)	Gemini CLI, Codex	Never misses context	Expensive, slower

Uses GEMINI.md for project instructions (equivalent to CLAUDE.md).

Emerging Standards

AGENTS.md

An emerging cross-tool standard for repo-level agent instructions. Supported by Codex, Cursor, Copilot, and Windsurf. Each tool also has its own proprietary format:

Tool	Proprietary	Cross-compatible
Claude Code	`CLAUDE.md`	-
Gemini CLI	`GEMINI.md`	-
Codex CLI	-	`AGENTS.md`
Cursor	`.cursor/rules`	`AGENTS.md`
Copilot	`.github/copilot-instructions.md`	`AGENTS.md`
Windsurf	`.windsurfrules`	`AGENTS.md`

Agent Skills

A cross-cutting standard for per-repo capabilities. SKILL.md files with YAML frontmatter:

---
name: deploy
description: Deploy the application to production
---

When asked to deploy:
1. Run `npm run build`
2. Run tests with `npm test`
3. Deploy with `./scripts/deploy.sh`

Works across Copilot (VS Code, CLI, coding agent) and Claude Code. Progressive loading: metadata for discovery, full body on match, supporting files on reference.

Two Permission Philosophies

The agentic CLI space has split into two camps:

OS-Level Sandboxing (Codex)

Kernel enforces boundaries
    |-- Process can't access files outside sandbox
    |-- Process can't make network calls (in restricted mode)
    |-- No reliance on model behavior

Pros: Stronger guarantees, can’t be prompt-injected around. Cons: Coarse-grained (all-or-nothing per capability), OS-specific.

Application-Level Rules (Claude Code)

Application enforces boundaries
    |-- Pattern matching on tool calls (Bash(npm test) = allowed)
    |-- 8 lifecycle hooks for custom logic
    |-- Wildcard permissions (mcp__github__*)

Pros: Fine-grained, programmable, cross-platform. Cons: Runs in same process, theoretically bypassable.

Neither approach is “right.” Codex is safer for untrusted environments; Claude Code is more flexible for power users.

Architecture Comparison

The Agentic Loop

All CLIs implement the same core loop, but with different emphases:

Aspect	Claude Code	Copilot CLI	Codex CLI	Aider
Loop control	LLM-driven	LLM-driven	LLM-driven	LLM-driven
Tool granularity	Fine (Read, Edit, Grep separately)	Coarse (actions)	Medium (file ops, shell)	Medium (edit, run)
Self-correction	Yes (sees errors, retries)	Yes (monitors terminal)	Yes (in sandbox)	Yes (sees lint/test errors)
Parallelism	Subagents	Fleet mode	Single thread	Single thread
Context strategy	Summarization + subagents	Session memory	Brute force (large context)	Repo map + selective loading

Permission Models

CLI	Read	Write	Execute	Approach
Claude Code	Auto	Ask	Ask	App-level rules + hooks
Copilot CLI	Auto	Review	Approve	App-level + skills
Codex CLI	Auto	Mode-dependent	Mode-dependent	OS-level sandbox
Gemini CLI	Auto	Ask	Ask	App-level rules
Aider	Auto	Auto-commit	Ask	Minimal
Cursor	Auto	Review (diff)	Ask	IDE settings

Context Management

CLI	Strategy	Max Context
Claude Code	Auto-summarization + subagents	200K tokens
Gemini CLI	Brute force loading	1M tokens
Copilot CLI	Session memory + skills	Model-dependent
Aider	Repo map (tree-sitter AST + NetworkX graphs, 98% token reduction)	Model-dependent
Cursor	Codebase embedding index + @ references	Model-dependent

Choosing a CLI

If you need…	Use
Transparent, hackable architecture	Claude Code
GitHub ecosystem integration	Copilot CLI
Strongest sandboxing guarantees	Codex CLI
Largest context window	Gemini CLI (1M) or Claude Code (200K)
Git-native workflow, model-agnostic	Aider
IDE integration with visual diffs	Cursor or Windsurf
Full autonomy on issues	Copilot Coding Agent
Team/enterprise features	Copilot CLI