feat(agent): add local workflow skill suggestions#987
feat(agent): add local workflow skill suggestions#987Nikhil (shadowfax92) wants to merge 1 commit into
Conversation
✅ Tests passed — 1237/1241
|
Greptile SummaryThis PR adds a local workflow-usage tracking system that records tool-name sequences (without URLs or page content) and surfaces a skill-suggestion advisor that users can invoke via special chat phrases. The feature wires into both the sidepanel chat session and the agent-harness conversation flows.
Confidence Score: 3/5The core data path works correctly but two defects need attention before merging: concurrent writes can silently lose records, and an overly-broad chat trigger can intercept normal user questions and prevent them from reaching the LLM. The storage module's read-modify-write pattern has no concurrency guard, meaning records from
Important Files Changed
Sequence DiagramsequenceDiagram
participant User
participant useChatSession
participant useExecutionHistoryTracker
participant useAgentConversation
participant storage as workflow-usage/storage
participant advisor as workflow-usage/advisor
User->>useChatSession: sendMessage(text)
useChatSession->>advisor: detectWorkflowAdvisorCommand(text)
alt workflow command detected
advisor-->>useChatSession: "'analyze' | 'view' | 'clear'"
useChatSession->>storage: getWorkflowUsageRecords() / clearWorkflowUsageRecords()
storage-->>useChatSession: records[]
useChatSession->>advisor: analyzeWorkflowUsage(records)
advisor-->>useChatSession: WorkflowUsageAnalysis
useChatSession->>useChatSession: appendLocalWorkflowAdvisorExchange()
else normal message
useChatSession->>useExecutionHistoryTracker: startExecutionTask()
useExecutionHistoryTracker->>useAgentConversation: stream tool calls
useAgentConversation->>useAgentConversation: upsertAgentHarnessTool() - workflowToolNamesRef
useAgentConversation->>storage: recordWorkflowUsage(record) [on turn end]
useExecutionHistoryTracker->>storage: recordWorkflowUsage(record) [on task complete]
end
Prompt To Fix All With AIFix the following 4 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 4
packages/browseros-agent/apps/agent/lib/workflow-usage/storage.ts:48-67
**Read-modify-write race condition loses records**
`recordWorkflowUsage` reads, mutates, and writes storage in separate async steps with no guard. Two concurrent calls — which are realistic given that `useExecutionHistoryTracker` and `useAgentConversation` can both fire on turn completion — will both read the same snapshot, each produce a new record set that includes only their own addition, and the last writer will overwrite the first, silently dropping the earlier record.
Consider serialising writes with a queued promise chain (e.g., a module-level `let pendingWrite = Promise.resolve()` that chains each new write onto the tail), or use a storage API that supports atomic updates.
### Issue 2 of 4
packages/browseros-agent/apps/agent/entrypoints/sidepanel/index/useChatSession.ts:783-787
**`MESSAGE_SENT_EVENT` analytics fires for locally-handled commands**
`track(MESSAGE_SENT_EVENT, ...)` executes before the `detectWorkflowAdvisorCommand` early-return check. Workflow advisor commands — which never reach an LLM — are therefore counted as sent messages in analytics, skewing provider/model/agent event data with local-only interactions. Moving the `detectWorkflowAdvisorCommand` check (or the `track` call) before the other to ensure the event only fires when a message is genuinely dispatched would fix this.
### Issue 3 of 4
packages/browseros-agent/apps/agent/lib/workflow-usage/advisor.ts:62-72
**Overly-broad trigger phrase hijacks unrelated user questions**
`"what patterns do you see"` matches without any workflow-specific qualifier, so a user asking "what patterns do you see in this error log?" or "what patterns do you see in this regex?" will have their message silently intercepted as a workflow-advisor `analyze` command instead of being forwarded to the LLM. The test confirms this is current behaviour but it is likely unintentional. Adding a workflow-data qualifier (similar to the `mentionsWorkflowData` guard used for `view`/`clear`) would restrict the trigger to intentional workflow analysis requests.
### Issue 4 of 4
packages/browseros-agent/apps/agent/lib/workflow-usage/advisor.ts:154-168
Suggestion IDs are assigned from the pre-sort `.map()` index, so after `.sort(compareSuggestions)` the label `workflow-1` will not necessarily identify the top-ranked suggestion. Assigning IDs after sorting keeps them stable and consistent.
```suggestion
const suggestions = Array.from(groups.values())
.filter((group) => group.runCount >= minRuns)
.map((group): Omit<WorkflowSkillSuggestion, 'id'> => {
const pattern = group.pattern
return {
title: buildSuggestionTitle(pattern),
runCount: group.runCount,
pattern,
lastUsedAt: group.lastUsedAt,
benefit: buildBenefit(pattern),
}
})
.sort(compareSuggestions)
.slice(0, limit)
.map((group, index): WorkflowSkillSuggestion => ({
...group,
id: `workflow-${index + 1}`,
}))
```
Reviews (1): Last reviewed commit: "feat: add local workflow skill suggestio..." | Re-trigger Greptile |
| export async function recordWorkflowUsage( | ||
| record: WorkflowUsageRecord | null, | ||
| ): Promise<void> { | ||
| if (!record) return | ||
|
|
||
| const current = (await workflowUsageStorage.getValue()) ?? { | ||
| version: 1, | ||
| records: [], | ||
| } | ||
| const recordsById = new Map( | ||
| current.records.map((existing) => [existing.id, existing]), | ||
| ) | ||
| recordsById.set(record.id, record) | ||
|
|
||
| const records = Array.from(recordsById.values()) | ||
| .sort((left, right) => left.recordedAt - right.recordedAt) | ||
| .slice(-MAX_WORKFLOW_USAGE_RECORDS) | ||
|
|
||
| await workflowUsageStorage.setValue({ version: 1, records }) | ||
| } |
There was a problem hiding this comment.
Read-modify-write race condition loses records
recordWorkflowUsage reads, mutates, and writes storage in separate async steps with no guard. Two concurrent calls — which are realistic given that useExecutionHistoryTracker and useAgentConversation can both fire on turn completion — will both read the same snapshot, each produce a new record set that includes only their own addition, and the last writer will overwrite the first, silently dropping the earlier record.
Consider serialising writes with a queued promise chain (e.g., a module-level let pendingWrite = Promise.resolve() that chains each new write onto the tail), or use a storage API that supports atomic updates.
Prompt To Fix With AI
This is a comment left during a code review.
Path: packages/browseros-agent/apps/agent/lib/workflow-usage/storage.ts
Line: 48-67
Comment:
**Read-modify-write race condition loses records**
`recordWorkflowUsage` reads, mutates, and writes storage in separate async steps with no guard. Two concurrent calls — which are realistic given that `useExecutionHistoryTracker` and `useAgentConversation` can both fire on turn completion — will both read the same snapshot, each produce a new record set that includes only their own addition, and the last writer will overwrite the first, silently dropping the earlier record.
Consider serialising writes with a queued promise chain (e.g., a module-level `let pendingWrite = Promise.resolve()` that chains each new write onto the tail), or use a storage API that supports atomic updates.
How can I resolve this? If you propose a fix, please make it concise.| const workflowAdvisorCommand = detectWorkflowAdvisorCommand(params.text) | ||
| if (workflowAdvisorCommand) { | ||
| void handleWorkflowAdvisorCommand(params.text, workflowAdvisorCommand) | ||
| return | ||
| } |
There was a problem hiding this comment.
MESSAGE_SENT_EVENT analytics fires for locally-handled commands
track(MESSAGE_SENT_EVENT, ...) executes before the detectWorkflowAdvisorCommand early-return check. Workflow advisor commands — which never reach an LLM — are therefore counted as sent messages in analytics, skewing provider/model/agent event data with local-only interactions. Moving the detectWorkflowAdvisorCommand check (or the track call) before the other to ensure the event only fires when a message is genuinely dispatched would fix this.
Prompt To Fix With AI
This is a comment left during a code review.
Path: packages/browseros-agent/apps/agent/entrypoints/sidepanel/index/useChatSession.ts
Line: 783-787
Comment:
**`MESSAGE_SENT_EVENT` analytics fires for locally-handled commands**
`track(MESSAGE_SENT_EVENT, ...)` executes before the `detectWorkflowAdvisorCommand` early-return check. Workflow advisor commands — which never reach an LLM — are therefore counted as sent messages in analytics, skewing provider/model/agent event data with local-only interactions. Moving the `detectWorkflowAdvisorCommand` check (or the `track` call) before the other to ensure the event only fires when a message is genuinely dispatched would fix this.
How can I resolve this? If you propose a fix, please make it concise.| normalized.includes('analyze my workflow') || | ||
| normalized.includes('analyse my workflow') || | ||
| normalized.includes('what patterns do you see') || | ||
| normalized.includes('suggest skills') || | ||
| normalized.includes('find skill suggestions') || | ||
| normalized.includes('what can be automated') || | ||
| normalized.includes('analyze workflow patterns') || | ||
| normalized.includes('analyse workflow patterns') | ||
| ) { | ||
| return 'analyze' | ||
| } |
There was a problem hiding this comment.
Overly-broad trigger phrase hijacks unrelated user questions
"what patterns do you see" matches without any workflow-specific qualifier, so a user asking "what patterns do you see in this error log?" or "what patterns do you see in this regex?" will have their message silently intercepted as a workflow-advisor analyze command instead of being forwarded to the LLM. The test confirms this is current behaviour but it is likely unintentional. Adding a workflow-data qualifier (similar to the mentionsWorkflowData guard used for view/clear) would restrict the trigger to intentional workflow analysis requests.
Prompt To Fix With AI
This is a comment left during a code review.
Path: packages/browseros-agent/apps/agent/lib/workflow-usage/advisor.ts
Line: 62-72
Comment:
**Overly-broad trigger phrase hijacks unrelated user questions**
`"what patterns do you see"` matches without any workflow-specific qualifier, so a user asking "what patterns do you see in this error log?" or "what patterns do you see in this regex?" will have their message silently intercepted as a workflow-advisor `analyze` command instead of being forwarded to the LLM. The test confirms this is current behaviour but it is likely unintentional. Adding a workflow-data qualifier (similar to the `mentionsWorkflowData` guard used for `view`/`clear`) would restrict the trigger to intentional workflow analysis requests.
How can I resolve this? If you propose a fix, please make it concise.| const suggestions = Array.from(groups.values()) | ||
| .filter((group) => group.runCount >= minRuns) | ||
| .map((group, index): WorkflowSkillSuggestion => { | ||
| const pattern = group.pattern | ||
| return { | ||
| id: `workflow-${index + 1}`, | ||
| title: buildSuggestionTitle(pattern), | ||
| runCount: group.runCount, | ||
| pattern, | ||
| lastUsedAt: group.lastUsedAt, | ||
| benefit: buildBenefit(pattern), | ||
| } | ||
| }) | ||
| .sort(compareSuggestions) | ||
| .slice(0, limit) |
There was a problem hiding this comment.
Suggestion IDs are assigned from the pre-sort
.map() index, so after .sort(compareSuggestions) the label workflow-1 will not necessarily identify the top-ranked suggestion. Assigning IDs after sorting keeps them stable and consistent.
| const suggestions = Array.from(groups.values()) | |
| .filter((group) => group.runCount >= minRuns) | |
| .map((group, index): WorkflowSkillSuggestion => { | |
| const pattern = group.pattern | |
| return { | |
| id: `workflow-${index + 1}`, | |
| title: buildSuggestionTitle(pattern), | |
| runCount: group.runCount, | |
| pattern, | |
| lastUsedAt: group.lastUsedAt, | |
| benefit: buildBenefit(pattern), | |
| } | |
| }) | |
| .sort(compareSuggestions) | |
| .slice(0, limit) | |
| const suggestions = Array.from(groups.values()) | |
| .filter((group) => group.runCount >= minRuns) | |
| .map((group): Omit<WorkflowSkillSuggestion, 'id'> => { | |
| const pattern = group.pattern | |
| return { | |
| title: buildSuggestionTitle(pattern), | |
| runCount: group.runCount, | |
| pattern, | |
| lastUsedAt: group.lastUsedAt, | |
| benefit: buildBenefit(pattern), | |
| } | |
| }) | |
| .sort(compareSuggestions) | |
| .slice(0, limit) | |
| .map((group, index): WorkflowSkillSuggestion => ({ | |
| ...group, | |
| id: `workflow-${index + 1}`, | |
| })) |
Prompt To Fix With AI
This is a comment left during a code review.
Path: packages/browseros-agent/apps/agent/lib/workflow-usage/advisor.ts
Line: 154-168
Comment:
Suggestion IDs are assigned from the pre-sort `.map()` index, so after `.sort(compareSuggestions)` the label `workflow-1` will not necessarily identify the top-ranked suggestion. Assigning IDs after sorting keeps them stable and consistent.
```suggestion
const suggestions = Array.from(groups.values())
.filter((group) => group.runCount >= minRuns)
.map((group): Omit<WorkflowSkillSuggestion, 'id'> => {
const pattern = group.pattern
return {
title: buildSuggestionTitle(pattern),
runCount: group.runCount,
pattern,
lastUsedAt: group.lastUsedAt,
benefit: buildBenefit(pattern),
}
})
.sort(compareSuggestions)
.slice(0, limit)
.map((group, index): WorkflowSkillSuggestion => ({
...group,
id: `workflow-${index + 1}`,
}))
```
How can I resolve this? If you propose a fix, please make it concise.|
Refinery rejected this merge request after Greptile reported branch-caused P1 defects: workflow usage storage can lose concurrent records, local workflow advisor commands are counted as sent LLM messages, and the broad trigger phrase can hijack unrelated user questions. Source issue bosmain-nj6 has been reopened with details for rework. |
Summary
Fixes #955
Test plan
git diff --check origin/dev...HEADbun run testfrompackages/browseros-agent/apps/agent(new workflow-usage tests passed; local run later hit existing zod resolution blocker tracked in bosmain-8o0)bun run lintfrompackages/browseros-agent/apps/agentbun run typecheckfrompackages/browseros-agent/apps/agent(blocked locally by existing wxt script resolution issue tracked in bosmain-090)bun run buildfrompackages/browseros-agent/apps/agent(blocked locally by existing graphql-codegen script resolution issue tracked in bosmain-4xe)