A deep-dive reference for power users who want to extract maximum value from Copilot CLI. This guide covers context management, token optimization, agent orchestration, IDE integration, custom skills, MCP configuration, and performance patterns.
- Context Management
- Token Optimization
- Memory & Persistence
- Verification Patterns
- Parallelization
- Advanced Agent Patterns
- IDE Integration Deep-Dive
- Custom Skills Creation
- MCP Deep-Dive
- Multi-AI Orchestration Advanced
- Performance Tips
- Troubleshooting
Copilot CLI operates within a context window — the total amount of text (instructions, code, conversation history, tool output) the model can see at once. Managing this window effectively is the single biggest lever for quality output.
┌─────────────────────────────────────────────────┐
│ Context Window │
├─────────────────────────────────────────────────┤
│ System prompt + instructions ~5-10% │
│ Session history (compacted) ~20-30% │
│ Active files / tool output ~30-40% │
│ Agent reasoning + response ~20-30% │
│ Buffer (safety margin) ~10% │
└─────────────────────────────────────────────────┘
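To make the split concrete, here is the same budget applied to a hypothetical 200K-token window (the midpoint of each range is an assumption; actual window sizes vary by model):

```typescript
// Rough token budget for a hypothetical 200,000-token window,
// using the midpoints of the percentage ranges above
const windowTokens = 200_000;
const shares: Record<string, number> = {
  "system prompt + instructions": 0.075, // ~5-10%
  "session history (compacted)": 0.25,   // ~20-30%
  "active files / tool output": 0.35,    // ~30-40%
  "agent reasoning + response": 0.225,   // ~20-30%
  "buffer": 0.10,                        // ~10%
};

for (const [part, share] of Object.entries(shares)) {
  console.log(part, Math.round(windowTokens * share), "tokens");
}
```

The practical takeaway: only a third or so of the window is available for the files and tool output you actively care about, which is why the techniques below focus on keeping that slice clean.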
Use /clear to reset the conversation context when:
- Context is polluted: Too many irrelevant file reads, failed experiments, or off-topic discussions have filled the window
- Switching tasks: Moving from "fix authentication bug" to "add caching layer" — the old context will only confuse the model
- After large refactors: Once a big change is committed, the intermediate steps are noise — clear and start fresh
- Performance degrades: When responses become slow, repetitive, or lose coherence, the context is likely saturated
Don't clear when you're mid-task and the model has built up useful understanding of your codebase. Instead, let session compaction handle it.
When conversation history grows too long, Copilot CLI automatically compacts older
turns into a summary. This preserves key decisions and context while freeing space for
new work. You can observe compaction in events.jsonl.
Tips for working with compaction:
- State critical requirements early — compacted summaries preserve early context better
- Use the SQL database for data you need to persist exactly (compaction may lose details)
- If the model "forgets" something, re-state it rather than scrolling back
- Be specific in prompts — "Fix the auth bug in src/auth/login.ts" loads one file, while "Fix auth bugs" may trigger a codebase-wide search
- Use explore agents for investigation — they run in separate context windows, keeping your main context clean
- Batch questions — ask the explore agent 5 questions at once, not 5 separate calls
- Suppress verbose output — use `--quiet`, `| head -20`, or `| Select-Object -First 10`
- Chain commands — `npm run build && npm test` produces one output block, not two turns
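A quick sketch of why chaining works (`echo` stands in for the real build and test commands; `&&` short-circuits, so a failed first step never wastes output on the second):

```shell
# Both commands succeed: one combined output block
sh -c 'echo build-ok && echo test-ok'

# First command fails: && skips the test run entirely
sh -c 'echo build-failed; exit 1' && echo test-ok || echo chain-stopped
```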
Copilot CLI offers 20+ models spanning three tiers. Choose based on task complexity and cost:
| Tier | Models | Best For | Cost |
|---|---|---|---|
| Premium | claude-opus-4.6, claude-opus-4.6-fast, claude-opus-4.5 | Architecture decisions, complex refactors, subtle bugs | High |
| Standard | claude-sonnet-4.6, claude-sonnet-4.5, claude-sonnet-4, gpt-5.4, gpt-5.3-codex, gpt-5.2-codex, gpt-5.2, gpt-5.1-codex-max, gpt-5.1-codex, gpt-5.1, gemini-3-pro-preview, gemini-3.1-pro-preview, grok-code-fast-1 | General development, code review, multi-file changes | Medium |
| Fast/Cheap | gpt-5.4-mini, gpt-5.1-codex-mini, gpt-5-mini, gpt-4.1, claude-haiku-4.5, gemini-3-flash | Exploration, simple edits, boilerplate, formatting | Low |
Each agent type has a different cost profile based on its default model and capabilities:
| Agent Type | Default Model | Context Cost | Best For |
|---|---|---|---|
| `explore` | Haiku (cheap) | Separate window | Code search, file discovery, Q&A |
| `task` | Haiku (cheap) | Separate window | Builds, tests, installs — success/fail only |
| `general-purpose` | Sonnet (standard) | Separate window | Complex multi-step implementation |
| `code-review` | Sonnet (standard) | Separate window | Change analysis, bug detection |
Cost optimization strategy:
Exploration (cheap) → Planning (standard) → Implementation (standard) → Review (standard)
explore agent    →  main context     →  general-purpose  →  code-review
(claude-haiku)      (default model)     (claude-sonnet)     (claude-sonnet)
Every conversational turn has overhead (system prompt, history, tool negotiation). Reduce turns by batching:
❌ Slow: 5 separate explore calls, one question each (5 turns × overhead)
✅ Fast: 1 explore call with 5 questions batched (1 turn × overhead)
❌ Slow: Read file → Edit file → Read another → Edit another (4 turns)
✅ Fast: Read both files in parallel → Edit both files in parallel (2 turns)
Every Copilot CLI session includes a SQLite database with pre-built tables. This is your primary tool for structured state that must survive context compaction.
Pre-built tables:
```sql
-- Track work items
SELECT * FROM todos WHERE status = 'pending';

-- Track dependencies between tasks
SELECT t.* FROM todos t
WHERE t.status = 'pending'
  AND NOT EXISTS (
    SELECT 1 FROM todo_deps td
    JOIN todos dep ON td.depends_on = dep.id
    WHERE td.todo_id = t.id AND dep.status != 'done'
  );
```

Custom tables for any workflow:
```sql
-- TDD test case tracking
CREATE TABLE test_cases (
  id TEXT PRIMARY KEY,
  name TEXT NOT NULL,
  file_path TEXT,
  status TEXT DEFAULT 'not_written' -- not_written → written → passing → refactored
);

-- Batch processing tracker
CREATE TABLE files_to_process (
  path TEXT PRIMARY KEY,
  status TEXT DEFAULT 'pending', -- pending → in_progress → done → error
  error_message TEXT
);

-- Key-value session state
CREATE TABLE session_state (key TEXT PRIMARY KEY, value TEXT);
INSERT OR REPLACE INTO session_state (key, value) VALUES ('current_phase', 'testing');
```

Session history is stored in `events.jsonl` — a line-delimited JSON log of all
interactions. This enables:
- Session resume: Pick up where you left off after closing the terminal
- Audit trail: Review what changes were made and why
- Debugging: Trace what the model saw when it made a particular decision
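Because the log is line-delimited JSON, it is easy to inspect programmatically (a sketch; the event shape shown here is a hypothetical simplification, and the sample data is inlined instead of read from disk):

```typescript
// Each line of events.jsonl is one standalone JSON object (hypothetical shape)
const raw =
  '{"type":"user","text":"fix the auth bug"}\n' +
  '{"type":"compaction","summary":"earlier turns summarized"}\n';

const events = raw
  .trim()
  .split("\n")
  .map((line) => JSON.parse(line) as { type: string });

// e.g. spot compaction events when debugging what the model could still see
const compactions = events.filter((e) => e.type === "compaction");
console.log(compactions.length, "compaction event(s)");
```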
The session_store database provides read-only access to historical session data:
```sql
-- Search across previous sessions (FTS5 full-text search)
SELECT * FROM session_store.sessions WHERE content MATCH 'authentication refactor';
```

Cross-session memory is experimental but enables knowledge persistence — patterns learned in one session can inform future sessions.
Files created during a session are stored in the session's files/ directory. Use these
for artifacts that need to persist beyond the conversation:
- Generated reports or analysis documents
- Exported data or intermediate results
- Plan files (`plan.md`) for structured planning
See Cross-Session Memory skill for detailed patterns.
The strongest verification pattern is writing tests before implementation:
1. Write failing test → Confirms you understand the requirement
2. Run test (RED) → Confirms the test actually tests something
3. Write implementation → Focused on making the test pass
4. Run test (GREEN) → Confirms the implementation works
5. Refactor → Clean up with confidence
6. Run test (GREEN) → Confirms refactoring didn't break anything
Use the TDD Guide agent and track test cases in SQL:
```sql
INSERT INTO test_cases (id, name, status) VALUES
  ('auth-login', 'should authenticate valid credentials', 'not_written'),
  ('auth-invalid', 'should reject invalid password', 'not_written'),
  ('auth-expired', 'should handle expired tokens', 'not_written');
```

Always verify changes compile before committing:
```powershell
# Chain build + test for atomic verification
npm run build && npm test

# Or for compiled languages
dotnet build --no-restore && dotnet test --no-build
```

Use the task agent for builds — it returns brief output on success, full output on
failure, keeping your context clean.
Run existing linters, don't add new ones:
```powershell
# Check what lint scripts exist
npm run --list-scripts | Select-String "lint"

# Run them
npm run lint
```

For critical changes, chain multiple review perspectives:
1. Self-review → Re-read your own changes with fresh eyes
2. code-review → Automated review for bugs and logic errors
3. security-review → Check for vulnerabilities (if security-relevant)
4. Build + test → Mechanical verification
5. Manual spot-check → Verify key behaviors in the running application
See Code Review skill and Agent Review Chain.
Fleet mode launches multiple autonomous agents in parallel, each with independent context windows. This is Copilot CLI's most powerful scaling feature.
When Fleet beats Sequential:
| Scenario | Sequential | Fleet | Speedup |
|---|---|---|---|
| Update 10 config files | ~20 min | ~5 min | 4x |
| Add tests for 5 modules | ~30 min | ~8 min | 3.5x |
| Review 8 PRs | ~40 min | ~10 min | 4x |
| Fix 6 lint categories | ~15 min | ~5 min | 3x |
Task decomposition strategy:
1. Identify independent units of work (no shared state)
2. Write clear, self-contained prompts for each unit
3. Include all necessary context in each prompt (agents are stateless)
4. Launch fleet with decomposed tasks
5. Aggregate results and resolve conflicts
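Steps 1-3 can be sketched in a few lines (the file names and the rename task are hypothetical; the point is that each prompt repeats all context, because fleet agents are stateless):

```typescript
// One independent unit of work per config file — no shared state, no ordering
const configFiles = ["api.config.json", "web.config.json", "worker.config.json"];

// Each prompt is fully self-contained: the agent needs nothing from this session
const prompts = configFiles.map(
  (file) =>
    `Update ${file}: rename the key "timeoutMs" to "requestTimeoutMs" ` +
    `and keep all other keys unchanged. Do not touch any other file.`
);

console.log(prompts.length, "fleet tasks prepared");
```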
Anti-patterns to avoid:
- Don't fleet tasks that modify the same files (merge conflicts)
- Don't fleet tasks where order matters (migrations, sequential APIs)
- Don't fleet tasks that need shared state (use SQL + sequential instead)
See Fleet Parallel skill.
Background Delegation frees your terminal immediately. Prefix any prompt with &
to hand off work to a cloud-based Copilot coding agent:
1. Delegate: & "Migrate all service tests to the new test framework"
2. Terminal is immediately free — continue your main work
3. Agent works on GitHub, opens a draft PR when complete
4. Review results on GitHub via the draft PR
5. Optionally bring the session local: /resume [SESSION-ID]
Note: `/resume` brings a cloud agent session into your local CLI for continued conversation. Results from delegation are surfaced via the draft PR on GitHub, not by polling with `/resume`.
Use cases:
- Large-scale refactors spanning many files
- Full test suite additions or migrations
- Documentation generation
- Dependency audit and upgrade
Combine agent types for complex workflows:
┌──────────┐ ┌──────────┐ ┌────────────────┐ ┌─────────────┐
│ explore │ → │ planner │ → │ general-purpose│ → │ code-review │
│ (search) │ │ (plan) │ │ (implement) │ │ (verify) │
└──────────┘ └──────────┘ └────────────────┘ └─────────────┘
Haiku Default Sonnet Sonnet
Cheap Medium Standard Standard
The most effective workflows compose agents in a pipeline, each handling what it does best:
Pattern: Explore → Plan → Implement → Review
Step 1: explore agent (parallel, cheap)
- "What authentication libraries does this project use?"
- "Where are the API route definitions?"
- "What test framework is configured?"
Step 2: Plan (main context)
- Synthesize explore results into a plan
- Create SQL todos with dependencies
- Enter plan mode for user approval
Step 3: general-purpose agent (per todo)
- Execute each todo with full tool access
- Self-contained prompt with all context
Step 4: code-review agent
- Review all changes for bugs, security, logic
- Only surfaces genuinely important issues
After a cloud delegation completes and opens a draft PR, bring the session local with
/resume to continue the conversation with full accumulated context:
1. Delegate: & "Analyze auth system and refactor weak points"
2. Continue locally, agent works on GitHub
3. Draft PR opens: review changes on GitHub
4. Bring local: /resume abc123
5. Follow-up: > Also check the session management — same issues?
6. Agent continues with accumulated context from the original run
This is powerful for progressive refinement — review the initial work on GitHub, then drill into specifics by resuming the session locally.
When background agents are running, Copilot CLI provides four tools to manage them:
| Tool | Purpose |
|---|---|
| `task` | Launch an agent (sync or background mode) — returns agent_id for background runs |
| `read_agent` | Read output from a running or completed background agent |
| `write_agent` | Send a follow-up message to an idle agent (waiting for input) |
| `list_agents` | List all active and completed background agents in the session |
Typical lifecycle:
1. task(..., mode="background") → get agent_id
2. [continue other work] → notified automatically on completion
3. read_agent(agent_id) → retrieve full results
4. write_agent(agent_id, msg) → if agent is idle and needs more input
5. list_agents() → rediscover agent IDs if lost
These tools are the backbone of the Team Planner skill's Phase 5: Monitor — dispatch multiple background agents, then poll and follow up using `read_agent` and `write_agent`, storing summaries in the SQL session database.
When fleet agents complete in parallel, aggregate their results:
```sql
-- Track fleet task results
CREATE TABLE fleet_results (
  task_id TEXT PRIMARY KEY,
  agent_id TEXT,
  status TEXT DEFAULT 'pending',
  summary TEXT,
  files_changed TEXT -- JSON array
);

-- After fleet completes, flag pairs of tasks that touched the same file
SELECT a.task_id, b.task_id, ja.value AS shared_file
FROM fleet_results a
JOIN fleet_results b ON a.task_id < b.task_id
JOIN json_each(a.files_changed) ja
WHERE ja.value IN (SELECT value FROM json_each(b.files_changed));
```

The VS Code Copilot extension and CLI are complementary, not competing:
┌─────────────────────────────┐ ┌─────────────────────────────┐
│ VS Code Extension │ │ Copilot CLI │
├─────────────────────────────┤ ├─────────────────────────────┤
│ ✅ Inline completions │ │ ✅ Multi-file batch changes │
│ ✅ Visual diff review │ │ ✅ Autonomous workflows │
│ ✅ Debugging integration │ │ ✅ Background agents │
│ ✅ UI component preview │ │ ✅ Fleet parallelization │
│ ✅ Interactive refactoring │ │ ✅ Multi-AI orchestration │
│ ✅ Chat with file context │ │ ✅ Session SQL database │
│ ⚠️ Single-file focused │ │ ⚠️ No visual feedback │
│ ⚠️ Manual approval each │ │ ⚠️ No inline completions │
└─────────────────────────────┘ └─────────────────────────────┘
- Debugging: Breakpoints, variable inspection, call stacks — visual debugging wins
- Visual diffs: Reviewing changes side-by-side with syntax highlighting
- UI components: Seeing rendered output (React components, HTML pages)
- Inline completions: Quick single-line or single-function completions
- Interactive refactoring: Rename symbol, extract method with IDE tooling
- Batch operations: Updating 20 files, adding tests across modules
- Autonomous workflows: "Implement this feature end-to-end" with autopilot
- CI/CD integration: Running in pipelines, automated reviews
- Multi-AI orchestration: Coordinating Claude Code + Codex + Gemini
- Long-running tasks: Background agents that run while you do other work
Both the VS Code extension and CLI read from the same configuration sources:
- `.github/copilot-instructions.md` — shared instructions
- `AGENTS.md` — agent definitions
- `.vscode/mcp.json` or `devcontainer.json` — MCP server configs
- Git history — both can see commits, branches, diffs
Workflow: IDE for exploration, CLI for execution:
1. Use VS Code Copilot chat to explore and understand a codebase
2. Identify the changes needed
3. Switch to CLI for autonomous multi-file implementation
4. Return to IDE to review diffs and debug if needed
5. Use CLI to create PR and run final review
See IDE Switching skill.
Skills are Markdown files with YAML frontmatter that define reusable, composable workflows:
````markdown
---
name: my-custom-skill
description: One-line description of what this skill does
category: development # development | security | testing | documentation | copilot-exclusive
triggers:
  - keyword or phrase that activates this skill
  - another trigger phrase
requires_tools:
  - powershell
  - edit
  - view
---

# My Custom Skill

## When to Use

- Bullet points describing when this skill applies
- Be specific about trigger conditions

## Prerequisites

- What must be true before this skill can run
- Required tools, configurations, or project structure

## Workflow

### Step 1: Investigate

Describe what to investigate and how.

### Step 2: Implement

Describe the implementation steps with code examples.

### Step 3: Verify

Describe how to verify the changes work.

## Examples

### Example: Basic Usage

```powershell
# Show realistic commands
npm run build && npm test
```

## Tips

- Practical tips for getting the best results
````

| Field | Required | Type | Description |
|---|---|---|---|
| `name` | ✅ | string | Kebab-case identifier matching filename |
| `description` | ✅ | string | One-line purpose statement |
| `category` | ✅ | string | One of: development, security, testing, documentation, copilot-exclusive |
| `triggers` | | string[] | Phrases that should activate this skill |
| `requires_tools` | | string[] | Tools the skill needs access to |
| `agent_type` | | string | Which agent type best executes this skill |
| `model` | | string | Recommended model override |
- Syntax validation: Run the repo's schema validator against your skill file
- Dry run: Ask the CLI to execute your skill on a test project
- Edge cases: Test with missing prerequisites, empty projects, large codebases
- Cross-reference: Ensure links to other skills and agents resolve correctly
The Model Context Protocol (MCP) is a standard for connecting AI models to external tools and data sources. Copilot CLI uses MCP to integrate with GitHub, other AI tools, and custom servers.
An MCP server exposes tools that Copilot CLI can call. The simplest implementation:
```json
{
  "servers": {
    "my-custom-server": {
      "command": "node",
      "args": ["path/to/my-server.js"],
      "env": {
        "API_KEY": "${env:MY_API_KEY}"
      }
    }
  }
}
```

Your server implements the MCP protocol to expose tools:
```typescript
// my-server.ts — minimal MCP server
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { z } from "zod";

const server = new McpServer({ name: "my-tools", version: "1.0.0" });

server.tool("search_docs", { query: z.string() }, async ({ query }) => {
  const results = await searchDocumentation(query);
  return { content: [{ type: "text", text: JSON.stringify(results) }] };
});
```

MCP bridges enable Copilot CLI to invoke other AI tools as if they were native tools:
┌──────────────┐ MCP ┌──────────────┐
│ Copilot CLI │ ──────────── │ Claude Code │
│ (hub) │ bridge │ (worker) │
└──────────────┘ └──────────────┘
│ MCP ┌──────────────┐
└──────────────────── │ Codex CLI │
bridge │ (worker) │
└──────────────┘
See MCP Bridge pattern and bridge configs in orchestration/configs/.
For projects using dev containers, configure MCP servers in devcontainer.json:
```json
{
  "customizations": {
    "vscode": {
      "settings": {
        "github.copilot.chat.mcpServers": {
          "github": {
            "command": "github-mcp-server",
            "args": ["--tools=all"]
          },
          "custom-tools": {
            "command": "node",
            "args": ["tools/mcp-server.js"]
          }
        }
      }
    }
  }
}
```

See MCP Ecosystem skill.
The Agent Council brings multiple AI perspectives to complex decisions:
┌─────────────────────────────────────────────────────────┐
│ Agent Council │
├──────────┬──────────┬───────────┬───────────────────────┤
│ Copilot │ Claude │ Codex │ Gemini │
│ CLI │ Code │ CLI │ CLI │
├──────────┼──────────┼───────────┼───────────────────────┤
│ GitHub │ Deep │ Fast │ Multimodal │
│ context │ analysis │ generation│ analysis │
└──────────┴──────────┴───────────┴───────────────────────┘
│ │ │ │
└──────────┴──────────┴───────────┘
│
Synthesized Decision
Real-world example — Architecture review:
- Copilot CLI gathers GitHub context (PRs, issues, CI status)
- Claude Code performs deep architectural analysis (200K context)
- Gemini CLI analyzes diagrams and visual documentation
- Copilot CLI synthesizes all perspectives into a recommendation
Route tasks to the cheapest model that can handle them:
┌──────────────────────────────────────────────────────┐
│ Cost-Aware Router │
├──────────────────────────────────────────────────────┤
│ │
│ Simple task? ──→ Haiku / GPT-4.1 ($) │
│ Standard task? ──→ Sonnet / GPT-5.1 ($$) │
│ Complex task? ──→ Opus / GPT-5.4 ($$$) │
│ │
│ Exploration? ──→ explore agent ($) │
│ Build/test? ──→ task agent ($) │
│ Implementation ──→ general-purpose ($$) │
│ Review? ──→ code-review ($$) │
│ │
└──────────────────────────────────────────────────────┘
Practical routing rules:
| Task Type | Recommended Model | Agent Type | Why |
|---|---|---|---|
| Find files / search code | claude-haiku-4.5 | `explore` | Cheap, fast, sufficient |
| Run builds / tests | claude-haiku-4.5 | `task` | Only need pass/fail |
| Simple edits / boilerplate | gpt-5-mini | `general-purpose` | Fast generation |
| Complex refactoring | claude-sonnet-4.6 | `general-purpose` | Needs reasoning |
| Architecture decisions | claude-opus-4.6 | `general-purpose` | Deep analysis |
| Security review | claude-sonnet-4.6 | `code-review` | Specialized focus |
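The routing rules reduce to a small lookup (a sketch; the three-way complexity classification is an assumption about how you'd triage tasks, and the model names come from the table above):

```typescript
type Complexity = "simple" | "standard" | "complex";

// Cheapest sufficient model per complexity tier
const modelByComplexity: Record<Complexity, string> = {
  simple: "claude-haiku-4.5",    // search, builds, boilerplate
  standard: "claude-sonnet-4.6", // refactors, reviews, multi-file changes
  complex: "claude-opus-4.6",    // architecture decisions, subtle bugs
};

function pickModel(complexity: Complexity): string {
  return modelByComplexity[complexity];
}

console.log(pickModel("simple"));
```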
When multiple AI tools produce conflicting recommendations:
- Identify the conflict — Log both recommendations with rationale
- Evaluate evidence — Which recommendation has stronger supporting evidence?
- Consider expertise — Claude excels at reasoning, Codex at patterns, Copilot at GitHub context
- Test both — If possible, prototype both approaches and measure outcomes
- Escalate to human — For architectural decisions, present both options to the developer
Transfer context between AI tools using file-based hand-off:
```powershell
# Copilot CLI generates analysis
copilot-cli "Analyze auth system, write findings to analysis.md"

# Claude Code continues with deep reasoning
claude "Read analysis.md and propose architectural improvements"

# Copilot CLI implements the chosen approach
copilot-cli "Implement changes from analysis.md improvements"
```

See Pipeline pattern for structured hand-off.
- Use explore agents for investigation — They're cheap and keep your main context clean
- Batch related questions into a single explore call — 1 call with 5 questions beats 5 calls
- Launch parallel explore agents for independent questions — safe to parallelize
- Use task agents for builds/tests — Brief output on success, full output on failure
- Chain commands with `&&` — `npm run build && npm test` uses one turn, not two
- Suppress verbose output — `--quiet`, `| head`, `| Select-Object -First N`
- Use the SQL database for structured state — Survives context compaction
- Track todos in SQL, not in chat — `INSERT INTO todos` instead of "remember to do X"
- Use plan mode for complex tasks — Structured approval prevents wasted work
- Switch to autopilot for well-defined tasks — Skip per-step approval
- Use fleet mode for independent tasks — 3-4x speedup on parallelizable work
- Choose the right model — Don't use Opus for file searches (use Haiku)
- Use `/clear` between unrelated tasks — Fresh context = better results
- Be specific in prompts — "Fix bug in src/auth/login.ts:42" beats "fix auth"
- Include file paths in your requests — Reduces search time and context usage
- Use background agents for long tasks — Continue working while they run
- Review with code-review agent — High signal-to-noise, catches real bugs
- Leverage GitHub MCP tools — Native PR, issue, and actions integration
- Create custom skills for repeated workflows — Reusable, consistent patterns
- Compose agent pipelines — explore → plan → implement → review
Cause: Context window is saturated with irrelevant information.
Solution:
1. Use /clear to reset the conversation
2. Re-state your current goal concisely
3. Point to specific files rather than asking for broad searches
Cause: The agent lost context about what it already read (compaction or long conversation).
Solution:
1. Store key findings in the SQL database
2. Reference stored data instead of re-reading files
3. Use more specific prompts to avoid redundant exploration
Cause: Multiple fleet agents modified the same files.
Solution:
1. Decompose tasks so each agent works on different files
2. Use a shared SQL table to coordinate file assignments
3. Run a post-fleet merge step to resolve any conflicts
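The merge step can start by detecting overlapping edits (a sketch; the task records and file lists are hypothetical, mirroring the `files_changed` JSON column from the fleet results table):

```typescript
// Results read back after a fleet run (hypothetical data)
const results = [
  { taskId: "t1", filesChanged: ["src/a.ts", "src/shared.ts"] },
  { taskId: "t2", filesChanged: ["src/b.ts"] },
  { taskId: "t3", filesChanged: ["src/shared.ts"] },
];

// Every pair of tasks that edited the same file is a potential merge conflict
const conflicts: Array<[string, string, string]> = [];
for (const a of results) {
  for (const b of results) {
    if (a.taskId >= b.taskId) continue; // consider each pair once
    for (const file of a.filesChanged) {
      if (b.filesChanged.includes(file)) conflicts.push([a.taskId, b.taskId, file]);
    }
  }
}

console.log(conflicts);
```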
Cause: Server binary not found, wrong path, or missing environment variables.
Solution:
```powershell
# Verify the server binary exists
Get-Command github-mcp-server

# Check environment variables
$env:GITHUB_TOKEN

# Test server manually
node path/to/server.js --help
```

Cause: The task is too broad or the agent is stuck in a loop.
Solution:
1. Use /resume to check current status and partial output
2. If stuck, refine the prompt and re-delegate with &
3. Break large tasks into smaller, well-defined chunks
Cause: Tables were not created or data was inserted in a different session.
Solution:
```sql
-- Check what tables exist
SELECT name FROM sqlite_master WHERE type='table';

-- Verify data exists
SELECT COUNT(*) FROM todos;
```

Cause: The question was too broad or asked without enough context.
Solution:
1. Be specific: "Find all Express route handlers in src/routes/" instead of "Find API endpoints"
2. Batch related questions into one call
3. Provide file path hints when you have them
Cause: The agent made changes that don't compile or pass tests.
Solution:
```powershell
# Check what changed
git --no-pager diff --stat

# Revert specific files if needed
git checkout -- path/to/broken/file.ts

# Re-run with more specific instructions
```

- Copilot Exclusive Features Guide — Features unique to Copilot CLI
- Comparison with Claude Code — Feature-by-feature comparison
- Migration from Claude Code — Step-by-step migration guide
- Orchestration Patterns — Multi-AI coordination patterns
- Skills Library — Reusable workflow capabilities
- Agent Catalog — All available agents