
feat(agent): add local workflow skill suggestions#987

Closed
Nikhil (shadowfax92) wants to merge 1 commit into dev from polecat/flint/bosmain-nj6@moxnt96t

Conversation

@shadowfax92
Contributor

Summary

  • Track local tool-command sequences for opt-in workflow analysis without recording URLs or page content.
  • Add a workflow usage advisor that turns repeated command patterns into concrete skill suggestions.
  • Wire explicit workflow-analysis prompts into agent and sidepanel chat flows.

Fixes #955

Test plan

  • git diff --check origin/dev...HEAD
  • bun run test from packages/browseros-agent/apps/agent (new workflow-usage tests passed; local run later hit existing zod resolution blocker tracked in bosmain-8o0)
  • bun run lint from packages/browseros-agent/apps/agent
  • bun run typecheck from packages/browseros-agent/apps/agent (blocked locally by existing wxt script resolution issue tracked in bosmain-090)
  • bun run build from packages/browseros-agent/apps/agent (blocked locally by existing graphql-codegen script resolution issue tracked in bosmain-4xe)

@github-actions
Contributor

github-actions Bot commented May 9, 2026

✅ Tests passed — 1237/1241

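| Suite | Passed | Failed | Skipped |
| --- | --- | --- | --- |
| agent | 84/84 | 0 | 0 |
| build | 9/9 | 0 | 0 |
| eval | 93/93 | 0 | 0 |
| server-agent | 266/266 | 0 | 0 |
| server-api | 205/205 | 0 | 0 |
| server-browser | 4/4 | 0 | 0 |
| server-integration | 9/10 | 0 | 1 |
| server-lib | 245/245 | 0 | 0 |
| server-root | 60/63 | 0 | 3 |
| server-skills | 31/31 | 0 | 0 |
| server-tools | 231/231 | 0 | 0 |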


@greptile-apps
Contributor

greptile-apps Bot commented May 9, 2026

Greptile Summary

This PR adds a local workflow-usage tracking system that records tool-name sequences (without URLs or page content) and surfaces a skill-suggestion advisor that users can invoke via special chat phrases. The feature wires into both the sidepanel chat session and the agent-harness conversation flows.

  • lib/workflow-usage/: New types.ts, storage.ts, and advisor.ts modules handle persistence in chrome.storage.local (via @wxt-dev/storage), pattern analysis, and formatted responses; an advisor.test.ts covers core analysis paths.
  • useAgentConversation.ts: Accumulates per-turn tool IDs with deduplication, then persists a WorkflowUsageRecord on turn completion; cancelled turns are correctly skipped.
  • useChatSession.ts / useExecutionHistoryTracker.ts: Intercept matching chat phrases for local handling and hook execution-task completions to persist workflow records respectively.
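As a rough illustration of the pattern analysis described above, the sketch below groups recorded tool-name sequences and counts how often each one repeats. The `toolNames` field, the sample tool names, and the `groupByPattern` helper are assumptions made for this example; the PR's actual types.ts and advisor.ts are not shown in this thread and may differ.

```typescript
// Hypothetical record shape: only tool names are kept, never URLs or page content.
interface UsageRecord {
  id: string
  recordedAt: number
  toolNames: string[] // assumed field name
}

interface PatternGroup {
  pattern: string[]
  runCount: number
  lastUsedAt: number
}

// Group records by their exact tool-name sequence and count repeats.
function groupByPattern(records: UsageRecord[]): Map<string, PatternGroup> {
  const groups = new Map<string, PatternGroup>()
  for (const record of records) {
    if (record.toolNames.length === 0) continue
    // JSON keeps the key unambiguous even if a tool name contains a separator.
    const key = JSON.stringify(record.toolNames)
    const group = groups.get(key)
    if (group) {
      group.runCount += 1
      group.lastUsedAt = Math.max(group.lastUsedAt, record.recordedAt)
    } else {
      groups.set(key, {
        pattern: [...record.toolNames],
        runCount: 1,
        lastUsedAt: record.recordedAt,
      })
    }
  }
  return groups
}

const sample: UsageRecord[] = [
  { id: 'a', recordedAt: 1, toolNames: ['navigate', 'click', 'extract'] },
  { id: 'b', recordedAt: 2, toolNames: ['navigate', 'click', 'extract'] },
  { id: 'c', recordedAt: 3, toolNames: ['search'] },
]

// Keep only patterns seen at least twice, mirroring a minimum-runs threshold.
const repeated = Array.from(groupByPattern(sample).values()).filter(
  (group) => group.runCount >= 2,
)
console.log(repeated)
```

Only the sequence of tool names feeds the grouping, which is consistent with the stated goal of not recording URLs or page content.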

Confidence Score: 3/5

The core data path works correctly but two defects need attention before merging: concurrent writes can silently lose records, and an overly-broad chat trigger can intercept normal user questions and prevent them from reaching the LLM.

The storage module's read-modify-write pattern has no concurrency guard, meaning records from useExecutionHistoryTracker and useAgentConversation firing close together will race and the last writer wins — the earlier record is dropped with no error. The "what patterns do you see" trigger fires without any workflow qualifier, so a user asking "what patterns do you see in this log?" silently gets a local workflow response instead of the LLM answer. The analytics mis-fire is a third issue where MESSAGE_SENT_EVENT tracks locally-handled commands as if they were real LLM interactions.

lib/workflow-usage/storage.ts (race condition) and lib/workflow-usage/advisor.ts (broad trigger phrase) need the most attention; useChatSession.ts needs a minor ordering fix for the analytics call.

Important Files Changed

| Filename | Overview |
| --- | --- |
| packages/browseros-agent/apps/agent/lib/workflow-usage/storage.ts | New storage module for workflow usage patterns; contains a read-modify-write race condition in recordWorkflowUsage that can silently drop records under concurrent calls. |
| packages/browseros-agent/apps/agent/lib/workflow-usage/advisor.ts | New advisor module for analysing tool-use patterns; detectWorkflowAdvisorCommand has an overly broad trigger ("what patterns do you see") that can hijack unrelated user messages, and suggestion IDs are assigned before sorting. |
| packages/browseros-agent/apps/agent/entrypoints/sidepanel/index/useChatSession.ts | Wires workflow advisor commands into the chat session; MESSAGE_SENT_EVENT analytics fires before the early-return check, polluting analytics for locally-handled commands. |
| packages/browseros-agent/apps/agent/entrypoints/app/agent-command/useAgentConversation.ts | Captures per-turn tool-name sequences and persists them to workflow usage storage; deduplication by tool ID is correct, and the cancelled-turn guard prevents spurious writes in the resume path. |
| packages/browseros-agent/apps/agent/entrypoints/sidepanel/index/useExecutionHistoryTracker.ts | Hooks workflow usage recording into execution task completion; straightforward fire-and-forget with proper Sentry catch. |
| packages/browseros-agent/apps/agent/lib/workflow-usage/types.ts | New type definitions for workflow usage; clean and minimal. |
| packages/browseros-agent/apps/agent/lib/workflow-usage/advisor.test.ts | Unit tests cover the main analysis and formatting paths; the "what patterns do you see?" test explicitly confirms the broad trigger that can intercept unrelated user messages. |

Sequence Diagram

sequenceDiagram
    participant User
    participant useChatSession
    participant useExecutionHistoryTracker
    participant useAgentConversation
    participant storage as workflow-usage/storage
    participant advisor as workflow-usage/advisor

    User->>useChatSession: sendMessage(text)
    useChatSession->>advisor: detectWorkflowAdvisorCommand(text)
    alt workflow command detected
        advisor-->>useChatSession: "'analyze' | 'view' | 'clear'"
        useChatSession->>storage: getWorkflowUsageRecords() / clearWorkflowUsageRecords()
        storage-->>useChatSession: records[]
        useChatSession->>advisor: analyzeWorkflowUsage(records)
        advisor-->>useChatSession: WorkflowUsageAnalysis
        useChatSession->>useChatSession: appendLocalWorkflowAdvisorExchange()
    else normal message
        useChatSession->>useExecutionHistoryTracker: startExecutionTask()
        useExecutionHistoryTracker->>useAgentConversation: stream tool calls
        useAgentConversation->>useAgentConversation: upsertAgentHarnessTool() - workflowToolNamesRef
        useAgentConversation->>storage: recordWorkflowUsage(record) [on turn end]
        useExecutionHistoryTracker->>storage: recordWorkflowUsage(record) [on task complete]
    end

Reviews (1): Last reviewed commit: "feat: add local workflow skill suggestio..."

Comment on lines +48 to +67
```ts
export async function recordWorkflowUsage(
  record: WorkflowUsageRecord | null,
): Promise<void> {
  if (!record) return

  const current = (await workflowUsageStorage.getValue()) ?? {
    version: 1,
    records: [],
  }
  const recordsById = new Map(
    current.records.map((existing) => [existing.id, existing]),
  )
  recordsById.set(record.id, record)

  const records = Array.from(recordsById.values())
    .sort((left, right) => left.recordedAt - right.recordedAt)
    .slice(-MAX_WORKFLOW_USAGE_RECORDS)

  await workflowUsageStorage.setValue({ version: 1, records })
}
```

P1 Read-modify-write race condition loses records

recordWorkflowUsage reads, mutates, and writes storage in separate async steps with no guard. Two concurrent calls — which are realistic given that useExecutionHistoryTracker and useAgentConversation can both fire on turn completion — will both read the same snapshot, each produce a new record set that includes only their own addition, and the last writer will overwrite the first, silently dropping the earlier record.

Consider serialising writes with a queued promise chain (e.g., a module-level let pendingWrite = Promise.resolve() that chains each new write onto the tail), or use a storage API that supports atomic updates.
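The queued promise chain suggested above might look roughly like the following sketch. It is not the PR's code: `fakeStorage`, `backing`, `MAX_RECORDS`, and `recordWorkflowUsageQueued` are hypothetical stand-ins so the example runs on its own, and the in-memory store only imitates the `getValue`/`setValue` calls seen in the quoted snippet.

```typescript
// Shapes mirror the fields visible in the quoted snippet (id, recordedAt, version, records).
interface WorkflowUsageRecord {
  id: string
  recordedAt: number
}

interface WorkflowUsageState {
  version: 1
  records: WorkflowUsageRecord[]
}

// Hypothetical in-memory stand-in for the real storage item.
const backing: { value: WorkflowUsageState | null } = { value: null }
const fakeStorage = {
  async getValue(): Promise<WorkflowUsageState | null> {
    await Promise.resolve() // yield, as a real asynchronous read would
    return backing.value
  },
  async setValue(value: WorkflowUsageState): Promise<void> {
    await Promise.resolve()
    backing.value = value
  },
}

const MAX_RECORDS = 200 // assumed cap; the real constant's value is not shown

// Tail of the write queue: each new write starts only after the previous one settles.
let pendingWrite: Promise<void> = Promise.resolve()

function recordWorkflowUsageQueued(
  record: WorkflowUsageRecord | null,
): Promise<void> {
  if (!record) return Promise.resolve()
  const next = record

  const write = pendingWrite.then(async () => {
    const current: WorkflowUsageState = (await fakeStorage.getValue()) ?? {
      version: 1,
      records: [],
    }
    const recordsById = new Map<string, WorkflowUsageRecord>()
    for (const existing of current.records) recordsById.set(existing.id, existing)
    recordsById.set(next.id, next)

    const records = Array.from(recordsById.values())
      .sort((left, right) => left.recordedAt - right.recordedAt)
      .slice(-MAX_RECORDS)
    await fakeStorage.setValue({ version: 1, records })
  })

  // A failed write must not block later writes, so the queue tail swallows the error.
  pendingWrite = write.catch(() => undefined)
  return write
}

// Two overlapping calls, as could happen when two hooks fire on turn completion.
async function demo(): Promise<string[]> {
  await Promise.all([
    recordWorkflowUsageQueued({ id: 'turn-1', recordedAt: 1 }),
    recordWorkflowUsageQueued({ id: 'task-1', recordedAt: 2 }),
  ])
  return (backing.value?.records ?? []).map((stored) => stored.id)
}

demo().then((ids) => console.log(ids))
```

A module-level queue like this only serialises writes made from the same JavaScript context. If the two hooks run in separate extension pages, they would each get their own queue, and a storage-level atomic update would be needed instead.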


Comment on lines +783 to +787
```ts
const workflowAdvisorCommand = detectWorkflowAdvisorCommand(params.text)
if (workflowAdvisorCommand) {
  void handleWorkflowAdvisorCommand(params.text, workflowAdvisorCommand)
  return
}
```

P1 MESSAGE_SENT_EVENT analytics fires for locally-handled commands

track(MESSAGE_SENT_EVENT, ...) executes before the detectWorkflowAdvisorCommand early-return check. Workflow advisor commands, which never reach an LLM, are therefore counted as sent messages in analytics, skewing provider/model/agent event data with local-only interactions. Running the detectWorkflowAdvisorCommand check before the track call would ensure the event fires only when a message is genuinely dispatched.
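The ordering fix can be sketched as below. Everything here is a simplified stand-in rather than the code in useChatSession.ts: `track`, `detectCommand`, and `sendMessage` are hypothetical, and the event name string is an assumption.

```typescript
type AdvisorCommand = 'analyze' | 'view' | 'clear'

// Hypothetical analytics stub that just remembers which events were tracked.
const trackedEvents: string[] = []
function track(eventName: string): void {
  trackedEvents.push(eventName)
}

// Simplified stand-in for the real command detector.
function detectCommand(text: string): AdvisorCommand | null {
  return text.toLowerCase().includes('analyze my workflow') ? 'analyze' : null
}

function sendMessage(text: string): 'local' | 'dispatched' {
  // Check for a locally handled command first...
  const command = detectCommand(text)
  if (command) {
    return 'local' // handled without an LLM call, so nothing is tracked
  }
  // ...and only track once the message is really going to be dispatched.
  track('message_sent')
  return 'dispatched'
}

console.log(sendMessage('Analyze my workflow'), trackedEvents.length)
console.log(sendMessage('Summarize this page'), trackedEvents.length)
```

With this ordering the tracked-event count only grows for messages that leave the local path.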


Comment on lines +62 to +72
```ts
    normalized.includes('analyze my workflow') ||
    normalized.includes('analyse my workflow') ||
    normalized.includes('what patterns do you see') ||
    normalized.includes('suggest skills') ||
    normalized.includes('find skill suggestions') ||
    normalized.includes('what can be automated') ||
    normalized.includes('analyze workflow patterns') ||
    normalized.includes('analyse workflow patterns')
  ) {
    return 'analyze'
  }
```

P1 Overly-broad trigger phrase hijacks unrelated user questions

"what patterns do you see" matches without any workflow-specific qualifier, so a user asking "what patterns do you see in this error log?" or "what patterns do you see in this regex?" will have their message silently intercepted as a workflow-advisor analyze command instead of being forwarded to the LLM. The test confirms this is current behaviour but it is likely unintentional. Adding a workflow-data qualifier (similar to the mentionsWorkflowData guard used for view/clear) would restrict the trigger to intentional workflow analysis requests.

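One way to narrow the "what patterns do you see" trigger is to require a workflow-related word in the same message, as in the sketch below. The `mentionsWorkflow` helper and its regular expression are assumptions for illustration; the PR's own `mentionsWorkflowData` guard is not shown in this review and may use different wording.

```typescript
// Hypothetical qualifier: the message must mention workflows explicitly.
function mentionsWorkflow(normalized: string): boolean {
  return /\bworkflows?\b/.test(normalized)
}

function isAnalyzeRequest(text: string): boolean {
  const normalized = text.trim().toLowerCase()

  // The generic phrase only counts when paired with a workflow qualifier.
  if (normalized.includes('what patterns do you see')) {
    return mentionsWorkflow(normalized)
  }

  // Phrases that already name the workflow keep matching on their own.
  return (
    normalized.includes('analyze my workflow') ||
    normalized.includes('analyse my workflow')
  )
}

console.log(isAnalyzeRequest('What patterns do you see in this error log?'))
console.log(isAnalyzeRequest('What patterns do you see in my workflow?'))
```

A question about an error log or a regex then falls through to the normal LLM path, while an explicit workflow question is still handled locally.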

Comment on lines +154 to +168
```ts
  const suggestions = Array.from(groups.values())
    .filter((group) => group.runCount >= minRuns)
    .map((group, index): WorkflowSkillSuggestion => {
      const pattern = group.pattern
      return {
        id: `workflow-${index + 1}`,
        title: buildSuggestionTitle(pattern),
        runCount: group.runCount,
        pattern,
        lastUsedAt: group.lastUsedAt,
        benefit: buildBenefit(pattern),
      }
    })
    .sort(compareSuggestions)
    .slice(0, limit)
```

P2 Suggestion IDs are assigned from the pre-sort .map() index, so after .sort(compareSuggestions) the label workflow-1 will not necessarily identify the top-ranked suggestion. Assigning IDs after sorting keeps them stable and consistent.

Suggested change

```diff
-  const suggestions = Array.from(groups.values())
-    .filter((group) => group.runCount >= minRuns)
-    .map((group, index): WorkflowSkillSuggestion => {
-      const pattern = group.pattern
-      return {
-        id: `workflow-${index + 1}`,
-        title: buildSuggestionTitle(pattern),
-        runCount: group.runCount,
-        pattern,
-        lastUsedAt: group.lastUsedAt,
-        benefit: buildBenefit(pattern),
-      }
-    })
-    .sort(compareSuggestions)
-    .slice(0, limit)
+  const suggestions = Array.from(groups.values())
+    .filter((group) => group.runCount >= minRuns)
+    .map((group): Omit<WorkflowSkillSuggestion, 'id'> => {
+      const pattern = group.pattern
+      return {
+        title: buildSuggestionTitle(pattern),
+        runCount: group.runCount,
+        pattern,
+        lastUsedAt: group.lastUsedAt,
+        benefit: buildBenefit(pattern),
+      }
+    })
+    .sort(compareSuggestions)
+    .slice(0, limit)
+    .map((group, index): WorkflowSkillSuggestion => ({
+      ...group,
+      id: `workflow-${index + 1}`,
+    }))
```

@shadowfax92
Contributor Author

Refinery rejected this merge request after Greptile reported branch-caused P1 defects: workflow usage storage can lose concurrent records, local workflow advisor commands are counted as sent LLM messages, and the broad trigger phrase can hijack unrelated user questions. Source issue bosmain-nj6 has been reopened with details for rework.

@shadowfax92 Nikhil (shadowfax92) deleted the polecat/flint/bosmain-nj6@moxnt96t branch May 9, 2026 02:24
Development

Successfully merging this pull request may close these issues.

Suggestion: Smart Skill Discovery — Analyze usage patterns to suggest custom skills
