
Main to prod #123

Closed

knechtionscoding wants to merge 12 commits into prod from main

Conversation

@knechtionscoding (Collaborator) commented:

What type of PR is this?

What this PR does / why we need it:

Which issue(s) this PR is related to:

Special notes for your reviewer:

Does this PR introduce a user-facing change?


gjkim42 and others added 12 commits May 2, 2026 13:31
docs: document Workspace.spec.setupCommand in reference.md
Add ExcludePatterns field to the Slack TaskSpawner source. Regex
patterns (RE2) reject messages when any pattern matches, enabling
negative routing between multiple Slack-triggered TaskSpawners.
Leading @-mentions are stripped before matching so patterns target
semantic content. Does not apply to slash commands.

Co-Authored-By: Claude Opus 4.6 <[email protected]>

Co-authored-by: jkahuja <[email protected]>
…ude-patterns

feat(api): add excludePatterns to Slack source for negative routing
test(e2e): poll Task phase via WaitForTaskPhase to fix flake
…mmentOn

Mark bodyContains as deprecated and add reference rows for its regex
replacements (bodyPattern, excludeBodyPatterns) and the related
issue_comment scoping field (commentOn). New users following
reference.md were silently being steered toward the deprecated field.

Co-Authored-By: Claude Opus 4.7 <[email protected]>
docs: document githubWebhook bodyPattern, excludeBodyPatterns, and commentOn
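
As a purely illustrative aid to the fields named above, a hypothetical Go view of how they might sit side by side; the struct shape and the comments are assumptions based on the commit message, not the project's actual API definition:

    // GitHubWebhookFilter is a hypothetical, simplified stand-in for the fields
    // discussed in reference.md; field names come from the commit message.
    type GitHubWebhookFilter struct {
        BodyContains        string   // deprecated plain substring match
        BodyPattern         string   // RE2 replacement for bodyContains
        ExcludeBodyPatterns []string // RE2 patterns that reject an otherwise matching event
        CommentOn           string   // scopes issue_comment events; accepted values assumed to be in reference.md
    }
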
…spawners

Add `excludeAuthors: [kelos-bot[bot]]` at the top level of webhook-driven
self-development TaskSpawners and remove the now-redundant `kelos-bot[bot]`
positive author allowlist entries. Top-level ExcludeAuthors is evaluated
before filters and takes precedence over filter-level Author matches, so
agent-authored comments containing strings like `/kelos api-review` no
longer recursively trigger another spawner.

Affects: kelos-workers, kelos-planner, kelos-pr-responder, kelos-reviewer,
kelos-api-reviewer. kelos-triage is left unchanged so bot-created issues
(e.g. from kelos-self-update) continue to be triaged. kelos-squash-commits
already restricts to gjkim42 only.
self-development: prevent kelos-bot from triggering self-development spawners
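
A hypothetical sketch of the precedence rule described above, where top-level ExcludeAuthors is evaluated before any filter-level author allowlist; the type and field names here are simplified stand-ins, not the project's actual API types:

    package webhook

    // Simplified stand-ins for illustration only.
    type Filter struct {
        Authors []string // per-filter positive author allowlist
    }

    type SpawnerSpec struct {
        ExcludeAuthors []string // top-level deny list, e.g. "kelos-bot[bot]"
        Filters        []Filter
    }

    // authorAllowed sketches the evaluation order: an excluded author is rejected
    // before any filter-level Author match is consulted, so a bot listed in
    // ExcludeAuthors can never trigger the spawner through a filter allowlist.
    func authorAllowed(spec SpawnerSpec, author string) bool {
        for _, a := range spec.ExcludeAuthors {
            if a == author {
                return false
            }
        }
        for _, f := range spec.Filters {
            for _, a := range f.Authors {
                if a == author {
                    return true
                }
            }
        }
        return false
    }
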
watchTask returned nil for both Succeeded and Failed phases, so
'kelos run -w' silently exited 0 even when the task failed — breaking
CI/CD integration where downstream steps assumed success.

Return a non-nil error on TaskPhaseFailed and print a diagnostic hint
pointing users at 'kelos logs' and 'kelos get tasks -d' for details.

Co-Authored-By: Claude Opus 4.7 <[email protected]>
fix(cli): exit non-zero on failed task in 'run -w'
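
Based on the commit message and the review summary (io.Writer outputs, a diagnostic hint on stderr, a non-nil error on failure), a sketch of the shape of the fix; the function name, signature, and phase constants here are assumptions:

    package cli

    import (
        "fmt"
        "io"
    )

    // Hypothetical phase values for illustration.
    const (
        taskPhaseSucceeded = "Succeeded"
        taskPhaseFailed    = "Failed"
    )

    // reportTaskOutcome writes a diagnostic hint to errOut and returns a non-nil
    // error when the watched task failed, so 'kelos run -w' exits non-zero
    // instead of silently reporting success to CI/CD.
    func reportTaskOutcome(name, phase string, errOut io.Writer) error {
        if phase == taskPhaseSucceeded {
            return nil
        }
        fmt.Fprintf(errOut, "task %q failed; see 'kelos logs' and 'kelos get tasks -d' for details\n", name)
        return fmt.Errorf("task %q ended in phase %s", name, phase)
    }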

greptile-apps Bot commented May 5, 2026

Greptile Summary

This PR merges main into prod, bringing three primary features: (1) a new ExcludePatterns field on the Slack TaskSpawner spec that rejects messages whose text matches any listed RE2 pattern, (2) improved watchTask testability and failure UX in the CLI, and (3) more reliable e2e phase assertions using a shared WaitForTaskPhase helper.

  • Slack exclude patterns (internal/slack/filter.go, api/v1alpha1/taskspawner_types.go): adds matchesExcludePatterns and stripLeadingMentions helpers, updates MatchesSpawner to apply the exclude check after the positive-match gate, and regenerates the deepcopy and CRD manifests.
  • CLI watchTask refactor (internal/cli/run.go, run_test.go): accepts io.Writer outputs for testability, writes an actionable hint to errOut on failure, and returns a non-nil error so callers can surface task failure to the user.
  • E2E reliability (test/e2e/, framework.go): replaces bare Expect(f.GetTaskPhase(...)) point-in-time reads with f.WaitForTaskPhase(...) polling to tolerate the async gap between Job completion and Task status reconciliation.
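
As a rough illustration of the exclude-pattern helpers named in the first bullet above, a minimal sketch of how the flow might look; the Slack mention format handled by stripLeadingMentions and the skip-on-invalid-pattern behavior are assumptions, not the actual implementation:

    package slack

    import "regexp"

    // leadingMention approximates Slack's "<@USERID>" mention syntax at the start
    // of a message; the exact formats handled by the real code are an assumption.
    var leadingMention = regexp.MustCompile(`^(\s*<@[A-Z0-9]+>[\s:,]*)+`)

    // stripLeadingMentions removes leading @-mentions so exclude patterns match
    // the semantic content of the message rather than the mention prefix.
    func stripLeadingMentions(text string) string {
        return leadingMention.ReplaceAllString(text, "")
    }

    // matchesExcludePatterns reports whether any RE2 exclude pattern matches the
    // stripped message text; a single match is enough to reject the message.
    func matchesExcludePatterns(text string, patterns []string) bool {
        stripped := stripLeadingMentions(text)
        for _, p := range patterns {
            re, err := regexp.Compile(p)
            if err != nil {
                continue // how invalid patterns are handled is an assumption here
            }
            if re.MatchString(stripped) {
                return true
            }
        }
        return false
    }

MatchesSpawner would then call this only after the positive-match gate, so exclusion can never turn a non-matching message into a match.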

Confidence Score: 4/5

The change is safe to merge; logic is well-tested and the new ExcludePatterns path correctly gates on the positive-match result before applying exclusion.

Two minor concerns exist: the stdlib log call in filter.go likely doesn't integrate with the controller's structured logger, and the fixed 2-minute timeout in WaitForTaskPhase inflates the failure window for the dep-chain Waiting assertion that was previously bounded at 30 seconds. Neither affects correctness in production.

Flagged files: internal/slack/filter.go (stdlib log import) and test/e2e/framework/framework.go (hardcoded 2-minute timeout in WaitForTaskPhase).

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Slack message received] --> B{matchesChannel?}
    B -- No --> REJECT1[return false]
    B -- Yes --> C{IsSlashCommand?}
    C -- Yes --> ALLOW1["return true (bypass all filters)"]
    C -- No --> D{"len Triggers == 0?"}
    D -- Yes --> E[hasBotMention]
    D -- No --> F["matchesTriggers: strips leading mentions, checks pattern + mention"]
    E --> G{positiveMatch?}
    F --> G
    G -- No --> REJECT2[return false]
    G -- Yes --> H["matchesExcludePatterns: strips leading mentions, OR across patterns"]
    H -- matches --> REJECT3[return false]
    H -- no match --> ALLOW2[return true]

Comments Outside Diff (1)

  1. test/e2e/framework/framework.go, lines 808-816

    P2 Fixed 2-minute timeout hides fast-fail phase transitions

    WaitForTaskPhase always waits up to 2 minutes. The dep-chain test previously used a targeted 30-second window to assert that dep-chain-b reaches the Waiting phase: a transient scheduling bug that prevents Waiting from being set now takes up to 2 minutes to surface instead of 30 seconds. Consider accepting a timeout time.Duration parameter so each call site can express the expected latency, or at least using a shorter default for phase assertions that should resolve in seconds.
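
    One way the suggested timeout parameter could look, as a sketch; the Framework receiver and the GetTaskPhase signature are assumed for illustration:

        package framework

        import (
            "fmt"
            "time"
        )

        // Framework is a minimal stand-in for the e2e framework; only what the
        // sketch needs is shown, and GetTaskPhase's signature is assumed.
        type Framework struct {
            GetTaskPhase func(name string) (string, error)
        }

        // WaitForTaskPhase polls until the Task reaches the wanted phase or the
        // caller-supplied timeout elapses, so fast assertions (e.g. Waiting within
        // 30 seconds) no longer inherit a fixed 2-minute failure window.
        func (f *Framework) WaitForTaskPhase(name, want string, timeout time.Duration) error {
            deadline := time.Now().Add(timeout)
            for time.Now().Before(deadline) {
                phase, err := f.GetTaskPhase(name)
                if err == nil && phase == want {
                    return nil
                }
                time.Sleep(time.Second)
            }
            return fmt.Errorf("task %s did not reach phase %s within %s", name, want, timeout)
        }

    Call sites that previously asserted within 30 seconds could then pass 30*time.Second explicitly.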



Reviews (1): Last reviewed commit: "Merge pull request #1101 from kelos-dev/..."

Comment thread: internal/slack/filter.go, line 60

    }
    if err != nil {
        return nil, err
    log.Printf("Invalid regex pattern %q: %v", pattern, err)

P2 Stdlib log inconsistent with controller logging

log.Printf (stdlib) was introduced here, while Kubernetes controllers typically route all structured logging through logr or slog. This call produces timestamped output to stderr without any of the key/value context that the rest of the controller surface uses. If an invalid pattern is stored in a CRD and the controller is running with a structured-log aggregator, this message will likely end up in an unstructured log stream and may be missed by alerting rules.
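
A sketch of what the suggested change could look like using controller-runtime's structured logger instead of the stdlib log package; the function name, the availability of a context at this point, and the skip-on-invalid-pattern behavior are assumptions:

    package slack

    import (
        "context"
        "regexp"

        logf "sigs.k8s.io/controller-runtime/pkg/log"
    )

    // compileExcludePatterns (hypothetical name) reports invalid patterns through
    // the structured logger with key/value context, then skips them, instead of
    // calling the stdlib log.Printf.
    func compileExcludePatterns(ctx context.Context, patterns []string) []*regexp.Regexp {
        logger := logf.FromContext(ctx)
        compiled := make([]*regexp.Regexp, 0, len(patterns))
        for _, pattern := range patterns {
            re, err := regexp.Compile(pattern)
            if err != nil {
                logger.Error(err, "invalid exclude pattern", "pattern", pattern)
                continue
            }
            compiled = append(compiled, re)
        }
        return compiled
    }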



@anomalogravity (Bot) left a comment:


Auto-approved: PR assessed as low risk by Gravity.
