Analysis Date: 2026-03-12
Repository: github/gh-aw
Scope: 167 total workflows, 80 using Copilot engine (48%), 34 Claude, 9 Codex
📊 Executive Summary
This analysis compares the full capabilities of the GitHub Copilot CLI engine (as implemented in pkg/workflow/copilot_engine*.go) against actual usage across the 80 Copilot-engine workflows in .github/workflows/. The repository shows strong adoption of GitHub MCP toolsets and memory tools, but several high-value features remain nearly unused.
Key Findings:
max-continuations (autopilot mode) is almost completely unused — only 1 of 80 workflows uses it, despite many multi-step analytical workflows that would benefit
Custom agent files are underutilized — 9 agent files exist in .github/agents/ but only 3 workflows reference them via engine.agent
AWF sandbox adoption is low — only 13/80 Copilot workflows (16%) use the firewall sandbox despite it being the security best practice
rate-limit is barely used — only 3 workflows define rate limiting despite many event-triggered workflows that could flood on burst events
skip-if-match is underused — only 12 workflows use it, yet many scheduled workflows create issues/PRs that could pile up
Primary Recommendation: Enable max-continuations: 3-5 for complex research/analysis workflows (daily reporters, quality analyzers) to allow Copilot to self-continue through multi-phase work without manual re-triggering.
🔴 Critical Findings
High Priority Issues
1. AWF Sandbox Only 16% Adoption (Security Gap)
Only 13 of 80 Copilot workflows use sandbox: { agent: awf }. Workflows that use bash: tools (shell execution) without AWF sandbox have unrestricted network access during agent execution, creating a potential exfiltration risk.
Affected pattern: workflows with bash: true or bash: ["*"] but no sandbox.agent: awf.
2. max-continuations Nearly Unused (1/80)
The autopilot feature (max-continuations → --autopilot --max-autopilot-continues N) allows Copilot to self-continue complex tasks. Only smoke-copilot.md uses it. Workflows like agent-performance-analyzer.md, daily-architecture-diagram.md, and daily-compiler-quality.md perform deep multi-phase analysis that would benefit from 3-5 continuations.
Medium Priority Opportunities
3. Custom Agent Files Underused (3/80)
9 agent files exist (.github/agents/), but only glossary-maintainer.md, hourly-ci-cleaner.md, and technical-doc-writer.md use engine.agent. Agents like contribution-checker, interactive-agent-designer, and w3c-specification-writer have no corresponding workflows using them.
4. rate-limit Gaps (3/80)
Only auto-triage-issues.md, bot-detection.md, and one other use rate-limiting. Event-triggered workflows (slash commands, issue/PR events) can burst under high activity. Workflows responding to issues: [opened] or issue_comment should have rate-limit: { max: 5, window: 60 }.
1️⃣ Current State Analysis
View Copilot CLI Capabilities Inventory
Available Copilot Engine Configuration Options
| Config Field | Description | Default |
| --- | --- | --- |
| engine.id | copilot | copilot |
| engine.version | Pin CLI version (e.g., "0.0.422") | latest |
| engine.model | Override AI model | claude-sonnet-4 |
| engine.agent | Custom agent file (.github/agents/*.agent.md) | none |
| engine.args | Custom CLI args injected before prompt | none |
| engine.command | Custom executable path | auto-installed |
| engine.env | Custom environment variables | none |
| engine.max-continuations | Enable autopilot with N continuations | 1 (disabled) |
CLI Flags Automatically Managed by the Compiler
| Flag | When Used |
| --- | --- |
| --add-dir /tmp/gh-aw/ | Always |
| --add-dir "${GITHUB_WORKSPACE}" | When AWF sandbox enabled |
| --disable-builtin-mcps | Always |
| --log-level all --log-dir | Always |
| --agent (id) | When engine.agent is set |
| --autopilot --max-autopilot-continues N | When max-continuations > 1 |
| --allow-tool shell(cmd) | Per bash tool entry |
| --allow-tool write | When edit: tool enabled |
| --allow-all-tools | When bash: ["*"] or bash: [":*"] |
| --allow-all-paths | When edit: tool enabled |
| COPILOT_MODEL env var | When engine.model set |
Sandbox/Security Options
| Option | Description |
| --- | --- |
| sandbox.agent: awf | Network firewall (AWF binary wrap) |
| sandbox.agent: srt | Process isolation sandbox |
| sandbox.agent: false | Disable sandbox (not recommended) |
| network.allowed: [...] | Allowlist domains when AWF active |
| strict: true | Enable strict mode validation |
View Usage Statistics
Engine Distribution (167 total workflows)
| Engine | Count | % |
| --- | --- | --- |
| copilot | 80 | 48% |
| claude | 34 | 20% |
| codex | 9 | 5% |
| other/unspecified | 44 | 26% |
Feature Usage Across 80 Copilot Workflows
| Feature | Used | % of Copilot |
| --- | --- | --- |
| github: MCP tool | ~60 | 75% |
| cache-memory: tool | 56 | 70% |
| strict: true | 47 | 59% |
| network: config | 38 | 48% |
| repo-memory: tool | 25 | 31% |
| sandbox.agent: awf | 13 | 16% |
| serena MCP | 11 | 14% |
| web-fetch: tool | 7 | 9% |
| agentic-workflows: MCP | 9 | 11% |
| engine.agent field | 3 | 4% |
| max-continuations | 1 | 1.25% |
| rate-limit: | 3 | 4% |
| playwright: MCP | 4 | 5% |
| Version pinning | 0 | 0% |
| concurrency: | ~7 | 9% |
Most Common GitHub Toolsets
| Toolset Config | Count |
| --- | --- |
| [default] | 15 |
| Inline list | 12 |
| [default, discussions] | 6 |
| [pull_requests, repos] | 4 |
| [default, actions] | 3 |
Model Usage
7 workflows use gpt-5.1-codex-mini (cost-effective detection model)
1 workflow uses gpt-5
Rest use default claude-sonnet-4
2️⃣ Feature Usage Matrix
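| Feature | Available | Used | Not Used | Rate |
| --- | --- | --- | --- | --- |
| max-continuations | ✅ | 1 | 79 | 1% |
| engine.agent (custom) | ✅ | 3 | 77 | 4% |
| AWF sandbox | ✅ | 13 | 67 | 16% |
| rate-limit | ✅ | 3 | 77 | 4% |
| skip-if-match | ✅ | 12 | 155 | 7% (of all 167 workflows) |
| engine.version pinning | ✅ | 0 | 80 | 0% |
| engine.env | ✅ | ~3 | ~77 | ~4% |
| engine.args | ✅ | ~2 | ~78 | ~3% |
| web-fetch: | ✅ | 7 | 73 | 9% |
| playwright: MCP | ✅ | 4 | 76 | 5% |
| copilot-requests feature | ✅ | ~25 | ~55 | ~31% |
| concurrency: | ✅ | 7 | ~73 | 9% |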
3️⃣ Missed Opportunities
View High Priority Opportunities
🔴 Opportunity 1: Enable max-continuations for Multi-Phase Workflows
What: The max-continuations field triggers --autopilot --max-autopilot-continues N, letting Copilot self-continue complex tasks without stopping.
Why It Matters: Many analytical workflows (daily reports, code quality analyzers, research agents) do multi-phase work that often gets truncated in a single pass. Autopilot allows Copilot to complete all phases.
Where: Strong candidates include agent-performance-analyzer.md, daily-compiler-quality.md, daily-mcp-concurrency-analysis.md, daily-architecture-diagram.md, copilot-pr-nlp-analysis.md, and research.md.
Expected Benefits: Richer, more complete analysis outputs; fewer truncated reports; better utilization of available compute time.
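As a minimal frontmatter sketch of this change: it assumes the field sits under engine:, as the engine.max-continuations entry in the capabilities inventory suggests, and uses 3 from the 3-5 range recommended in this report. The rest of the workflow's frontmatter is omitted.

```yaml
# Sketch only: enable autopilot for a multi-phase analysis workflow.
engine:
  id: copilot
  max-continuations: 3   # values > 1 make the compiler emit --autopilot --max-autopilot-continues 3
```

A higher value allows more self-continuation, so it may be worth reviewing the workflow's timeout alongside it.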
🔴 Opportunity 2: Expand AWF Sandbox Adoption
What: Only 13/80 Copilot workflows use sandbox: { agent: awf }. The AWF sandbox provides network firewalling to prevent data exfiltration.
Why It Matters: Workflows with bash: access have unrestricted network egress during execution without AWF. Any prompt injection could exfiltrate data.
Where: All workflows with bash: true or broad bash permissions without a sandbox, especially those processing external user content (slash commands, issue events). artifacts-summary.md is cited as an example that already configures the sandbox correctly.
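A hedged sketch of the sandbox configuration, combining the sandbox: { agent: awf } and network: { allowed: [defaults, github] } settings named in this report. Placing bash under tools: is an assumption made for illustration; the allowlist should be adjusted to the domains a given workflow actually needs.

```yaml
# Sketch only: wrap the agent in the AWF network firewall.
sandbox:
  agent: awf
network:
  allowed:
    - defaults
    - github
tools:
  bash: true   # assumed placement; shell access is what makes the firewall matter here
```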
🟡 Opportunity 4: Rate-Limiting for Event-Triggered Workflows
What: Only 3 workflows use rate-limit, yet many respond to issues:, pull_request:, or slash_command: events that can burst.
Why It Matters: Without rate-limiting, a burst of 20 new issues would trigger 20 simultaneous workflow runs, consuming credits and potentially creating 20 duplicate safe-outputs.
Where: Especially needed for: auto-triage-issues.md (has it ✅), pr-nitpick-reviewer.md, contribution-check.md, sub-issue-closer.md, all slash-command workflows.
How to Implement:
rate-limit:
  max: 5
  window: 60  # 5 runs per 60 minutes
🟡 Opportunity 5: skip-if-match for Idempotent Scheduled Workflows
What: Only 12 workflows use skip-if-match to prevent duplicate issue/PR creation.
Why It Matters: Scheduled workflows that create issues or PRs can pile up if they don't check for existing open items. skip-if-match prevents redundant work.
Where: Any scheduled workflow creating issues or PRs — e.g., dead-code-remover.md, daily-syntax-error-quality.md, daily-testify-uber-super-expert.md, layout-spec-maintainer.md.
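The report does not show the field's syntax, so the following is an illustrative sketch under stated assumptions: that skip-if-match lives under on: and takes a GitHub search query string. The cron schedule and the "[dead-code]" title prefix are hypothetical.

```yaml
# Sketch only: skip the scheduled run when a matching open issue already exists.
on:
  schedule:
    - cron: "0 9 * * *"          # hypothetical daily schedule
  skip-if-match: 'is:issue is:open in:title "[dead-code]"'   # assumed placement and query form
```

The query should match whatever title prefix or label the workflow's own output uses, so that an unresolved item from a previous run suppresses the next one.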
Notable Architecture Observations:
The repo uses a sophisticated shared imports system (36 shared files, 22 MCP configs)
Cache-memory (56 workflows) and repo-memory (25 workflows) adoption is strong
GitHub MCP toolsets are well-configured with specific scoping in most workflows
strict: true is used in 59% of Copilot workflows — good security hygiene
AWF sandbox migration is in progress (comments show "migrated from network.firewall")
6️⃣ Best Practice Guidelines
Based on this research:
Enable max-continuations: 3 for analytical workflows — Any workflow doing multi-phase research, report generation, or code analysis benefits from autopilot continuation.
Use AWF sandbox for all workflows with bash: access — Add sandbox: { agent: awf } + network: { allowed: [defaults, github] } to prevent exfiltration.
Register custom agents for specialized roles — Create .github/agents/ files for recurring personas, then reference via engine.agent for consistent behavior.
Add rate-limit to all event-triggered workflows — Especially issues:, pull_request:, and slash_command: triggers. Standard: max: 5, window: 60.
Use skip-if-match for scheduled PR/issue creators — Prevents accumulation of duplicate issues/PRs from daily/weekly workflows.
Scope GitHub toolsets precisely — Use [repos] instead of [default] when only repo access is needed; reduces MCP attack surface.
Enable web-fetch: when workflows need external data — Copilot CLI supports it natively; explicitly declare it in tools:.
7️⃣ Action Items
Immediate (this week):
Add max-continuations: 3 to agent-performance-analyzer.md, daily-compiler-quality.md, research.md
Wire engine.agent: grumpy-reviewer in grumpy-reviewer.md
Add rate-limit to pr-nitpick-reviewer.md and contribution-check.md
Short-term (this month):
Audit all event-triggered Copilot workflows for missing rate-limit
Add AWF sandbox to workflows with bash: true that handle external user content
Create workflows for unused agent files (contribution-checker, interactive-agent-designer)
Add skip-if-match to all scheduled PR-creating workflows
Long-term (this quarter):
Evaluate version pinning for critical production workflows
Expand third-party MCP usage (arxiv/context7 for research, semgrep for security)
Complete AWF sandbox migration across all Copilot workflows
Track adoption of these recommendations in future deep-research runs
View Medium Priority Opportunities
🟡 Opportunity 3: Custom Agent Files for Specialized Roles
What: 9 .github/agents/*.agent.md files exist defining specialized Copilot behaviors, but only 3 workflows use them.
Unused Agent Files:
contribution-checker.agent.md: no workflow uses it
interactive-agent-designer.agent.md: no workflow
w3c-specification-writer.agent.md: no workflow
create-safe-output-type.agent.md: no workflow
agentic-workflows.agent.md: used as a custom agent, but not via engine.agent
grumpy-reviewer.agent.md: a grumpy-reviewer.md workflow exists but doesn't use engine.agent
How to Implement:
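A minimal sketch for the grumpy-reviewer case, following the engine: { id: copilot, agent: grumpy-reviewer } form given in the workflow recommendations. That the agent id maps to a same-named file in .github/agents/ is inferred from the capabilities inventory, not confirmed here.

```yaml
# Sketch only: reference an existing custom agent file from a workflow.
engine:
  id: copilot
  agent: grumpy-reviewer   # presumably resolves to .github/agents/grumpy-reviewer.agent.md
```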
Expected Benefits: Consistent persona/behavior across workflow executions; centralized agent instruction management.
🟡 Opportunity 6: web-fetch: for Workflows Needing External Data
What: Copilot CLI has built-in web-fetch support (supportsWebFetch: true), but only 7 Copilot workflows enable it.
Where: Workflows doing research, checking external URLs, or fetching documentation, e.g., blog-auditor.md, cli-version-checker.md (uses network but not explicitly the web-fetch tool), daily-news.md.
How to Implement:
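A small sketch of declaring the tool, based on the guideline to declare web-fetch: explicitly in tools:. The empty value is an assumption about the minimal form; any additional options the tool accepts are not shown in this report.

```yaml
# Sketch only: declare the web-fetch tool explicitly.
tools:
  web-fetch:
```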
View Low Priority Opportunities
🟢 Opportunity 7: Engine Version Pinning for Stability
What: Zero production Copilot workflows pin to a specific version. Only smoke test workflows use version pinning.
Why It Matters: Critical workflows (CI, security) could break when a new Copilot CLI version ships with behavior changes.
How to Implement (for critical workflows):
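A minimal sketch using the engine.version field from the capabilities inventory. The version string "0.0.422" is the example value from that inventory, not a recommendation of a particular release.

```yaml
# Sketch only: pin the Copilot CLI version for a critical workflow.
engine:
  id: copilot
  version: "0.0.422"   # example value; choose a release you have validated
```

Pinning trades automatic updates for stability, so pinned workflows need a periodic review to pick up newer CLI releases.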
🟢 Opportunity 8: concurrency: for Long-Running Workflows
What: Only ~7 workflows define concurrency control. Long-running workflows (30-45 min) could queue up.
How to Implement:
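A hedged sketch that assumes the concurrency: field follows standard GitHub Actions syntax. The group name is hypothetical.

```yaml
# Sketch only: allow one run per workflow at a time.
concurrency:
  group: "gh-aw-${{ github.workflow }}"   # hypothetical group name
  cancel-in-progress: true                # cancel the older run instead of queueing behind it
```

Setting cancel-in-progress to false instead keeps the running job and leaves the newer run pending, which may suit workflows whose partial output is not safe to discard.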
🟢 Opportunity 9: engine.env for Per-Workflow Configuration
What: The engine.env field allows passing custom environment variables to the Copilot CLI process. Almost no workflows use this for customization.
Use Cases: Debug flags, feature toggles, custom API endpoints for testing, workflow-specific configuration without modifying the prompt.
How to Implement:
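An illustrative sketch of the engine.env field. The variable name below is hypothetical and only shows the shape; this report does not list which variables the Copilot CLI actually reads.

```yaml
# Sketch only: pass a custom environment variable to the Copilot CLI process.
engine:
  id: copilot
  env:
    DEBUG_MODE: "true"   # hypothetical variable name
```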
🟢 Opportunity 10: Third-Party MCPs Widely Available but Underused
What: 22 shared MCP configs exist in .github/workflows/shared/mcp/ (arxiv, azure, brave, chroma, context7, datadog, deepwiki, drain3, fabric-rti, jupyter, markitdown, microsoft-docs, notion, semgrep, sentry, serena-go, server-memory, skillz, slack, svelte, tavily), many of them unused in Copilot workflows.
Where: Research workflows could use arxiv, context7, deepwiki; quality workflows could use semgrep; analysis workflows could use chroma for vector search.
How to Implement (example):
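The report mentions a shared imports system but does not show its syntax, so this sketch rests on two assumptions: that shared configs are pulled in through an imports: list, and that the arxiv config is a file named arxiv.md under shared/mcp/. Check both against the repository before use.

```yaml
# Sketch only: import a shared MCP config into a research workflow.
imports:
  - shared/mcp/arxiv.md   # assumed key, path, and file name
```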
4️⃣ Specific Workflow Recommendations
View Workflow-Specific Recommendations
| Workflow | Current state | Recommendation |
| --- | --- | --- |
| agent-performance-analyzer.md | […] max-continuations | […] max-continuations: 4, increase timeout-minutes: 60 |
| grumpy-reviewer.md | […] grumpy-reviewer.agent.md in .github/agents/ but workflow doesn't use engine.agent | […] engine: { id: copilot, agent: grumpy-reviewer } |
| pr-nitpick-reviewer.md, contribution-check.md | […] | […] rate-limit: { max: 5, window: 60 } |
| dead-code-remover.md, daily-file-diet.md | dead-code-remover.md has skip-if-match while daily-file-diet.md does […] | […] skip-if-match |
| daily-architecture-diagram.md, daily-compiler-quality.md | […] | […] max-continuations: 3, sandbox: { agent: awf } |
| research.md | […] | […] max-continuations: 5 for thorough research, web-fetch: tool, consider arxiv or context7 MCPs |
5️⃣ Trends & Insights
View Historical Trends
This is the first comprehensive analysis using this research framework. Future runs (via copilot-cli-deep-research.md) will track:
max-continuations, AWF sandbox, and custom agents
copilot-requests rollout progress
Baseline established: 2026-03-12 (run §23024359071)
View Supporting Evidence & Methodology
Research Methodology
Data sources examined:
pkg/workflow/copilot_engine.go: Engine capabilities, supported features
pkg/workflow/copilot_engine_execution.go: CLI flag generation logic
pkg/workflow/copilot_engine_tools.go: Tool argument computation
pkg/workflow/copilot_mcp.go: MCP configuration rendering
docs/src/content/docs/reference/engines.md: Feature documentation
All 167 .github/workflows/*.md files: Actual usage patterns
Analysis approach: grep-based inventory of 80 Copilot workflows cross-referenced against engine implementation to identify gaps between available capabilities and actual usage.
Confidence: High for quantitative counts (direct file analysis). Medium for qualitative recommendations (based on workflow purpose/complexity assessment).
Repo memory saved: /tmp/gh-aw/repo-memory/default/latest.json
References:
pkg/workflow/copilot_engine_execution.go
docs/src/content/docs/reference/engines.md