feat: detect duplicate/similar commit subjects #30

Merged
matmar10 merged 5 commits into master from feat/duplicate-commit-detection
Apr 21, 2026

@matmar10
Owner

Summary

  • Adds duplicate_threshold input (0.0–1.0, disabled by default) to action.yml
  • Implements Sørensen–Dice bigram coefficient — no new runtime dependency
  • Strips conventional-commit prefix (type(scope):) before comparing so feat(website): foo and feat(campaign): foo are correctly flagged as identical
  • Handles the double-colon typo (feat(auto-giving): : description) via a + quantifier on the colon group of the prefix regex
  • Reports duplicate pairs in a new Duplicate Commit Subjects section of the PR comment, showing a markdown table with links and similarity percentage
  • Increments the failure count so the action fails whenever any pair's similarity meets or exceeds the threshold
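
A minimal sketch of the prefix stripping described above. The regex and the `stripPrefix` name are hypothetical, not copied from the PR; the point is only to show how a `+` on the colon group lets one pattern absorb both `type(scope):` and the `type(scope): :` typo.

```javascript
// Hypothetical pattern (an assumption, not the PR's actual regex):
// a type, an optional (scope), an optional "!", then one or more colons,
// each optionally preceded by whitespace, then trailing whitespace.
const PREFIX_PATTERN = /^[a-z]+(?:\([^)]*\))?!?(?:\s*:)+\s*/i;

// Returns the subject with any conventional-commit prefix removed.
// Subjects without a recognisable prefix are returned unchanged.
function stripPrefix(subject) {
  return subject.replace(PREFIX_PATTERN, '');
}
```

With this pattern, `feat(website): foo` and `feat(campaign): foo` both reduce to `foo`, and `feat(auto-giving): : description` reduces to `description`.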

How it works

  1. After all commits are linted, every pair (i, j) is compared
  2. normalizeSubject takes the first line only, strips the prefix, lowercases, and collapses punctuation/whitespace
  3. diceSimilarity computes the Dice coefficient over character bigrams
  4. Any pair ≥ threshold is collected into duplicatePairs
  5. core.setFailed is called if duplicatePairs.length > 0
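
The steps above can be sketched roughly as follows. `normalizeSubject` and `diceSimilarity` are named in this PR, but the bodies here are assumptions written for illustration, and `findDuplicatePairs` is a hypothetical helper; the code in src/similarity.js may differ in its details (for example, in how punctuation or non-ASCII characters are handled).

```javascript
// Assumed prefix pattern: type, optional (scope), optional "!", one or more colons.
const PREFIX = /^[a-z]+(?:\([^)]*\))?!?(?:\s*:)+\s*/i;

// First line only, prefix stripped, lowercased, runs of non-alphanumerics
// collapsed to a single space (an assumption about "collapses punctuation").
function normalizeSubject(message) {
  return message
    .split('\n')[0]
    .replace(PREFIX, '')
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, ' ')
    .trim();
}

// Counts each character bigram of s, keeping multiplicity.
function bigramCounts(s) {
  const counts = new Map();
  for (let i = 0; i < s.length - 1; i++) {
    const gram = s.slice(i, i + 2);
    counts.set(gram, (counts.get(gram) || 0) + 1);
  }
  return counts;
}

// Sørensen–Dice coefficient: 2 * |shared bigrams| / (|bigrams(a)| + |bigrams(b)|).
function diceSimilarity(a, b) {
  if (a === b) return 1;
  if (a.length < 2 || b.length < 2) return 0;
  const countsA = bigramCounts(a);
  const countsB = bigramCounts(b);
  let shared = 0;
  for (const [gram, n] of countsA) {
    shared += Math.min(n, countsB.get(gram) || 0);
  }
  return (2 * shared) / (a.length - 1 + (b.length - 1));
}

// Hypothetical helper: compares every pair (i, j) with i < j and collects
// those whose similarity meets or exceeds the threshold.
function findDuplicatePairs(messages, threshold) {
  const subjects = messages.map(normalizeSubject);
  const duplicatePairs = [];
  for (let i = 0; i < subjects.length; i++) {
    for (let j = i + 1; j < subjects.length; j++) {
      const similarity = diceSimilarity(subjects[i], subjects[j]);
      if (similarity >= threshold) duplicatePairs.push({ i, j, similarity });
    }
  }
  return duplicatePairs;
}
```

The pairwise loop is quadratic in the number of commits, which is usually acceptable for the handful of commits in a typical PR.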

Usage

- uses: matmar10/prcolinter@v1
  with:
    token: ${{ secrets.GITHUB_TOKEN }}
    duplicate_threshold: '0.8'

Test plan

  • PR with feat(website): enable foo and feat(campaign): enable foo and threshold 0.5 → fails, shows duplicate section
  • PR with unrelated commit messages → passes even with low threshold
  • Omitting duplicate_threshold (default '') → feature disabled, no behaviour change
  • duplicate_threshold: '1.0' → only exact-duplicate subjects after normalisation fail
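
One way the input handling behind these cases might look, as a sketch only. `parseThreshold` is a hypothetical helper, and rejecting out-of-range values with an error is an assumption; the PR only states that the default `''` disables the feature and that valid values lie in 0.0–1.0.

```javascript
// Hypothetical helper: turns the raw action input string into a number,
// or null when the feature is disabled (empty or missing input).
function parseThreshold(raw) {
  const text = (raw ?? '').trim();
  if (text === '') return null; // default '' means duplicate detection is off
  const value = Number(text);
  // Assumption: invalid or out-of-range input is treated as a configuration error.
  if (!Number.isFinite(value) || value < 0 || value > 1) {
    throw new RangeError(`duplicate_threshold must be between 0.0 and 1.0, got "${raw}"`);
  }
  return value;
}
```

A caller would skip the pairwise comparison entirely when the result is `null`, which keeps existing workflows unchanged.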

🤖 Generated with Claude Code

matmar10 and others added 5 commits April 21, 2026 23:02

Previously used __dirname, which pointed to the action's own dist/
directory, so the config file was never found in the consuming repo.

Closes #28

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Adds a `duplicate_threshold` action input (0.0–1.0). When set, all
commit subject pairs are compared using the Sørensen–Dice bigram
coefficient. The conventional-commit prefix (type(scope):) is stripped
before comparison so that commits differing only in scope but sharing
an identical description are still flagged.

Pairs whose similarity meets or exceeds the threshold are reported in
the PR comment and cause the action to fail.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Extracts similarity helpers into src/similarity.js so they can be
imported by tests without pulling in GitHub Actions dependencies.
Adds jest and 16 test cases covering prefix stripping, scope
normalisation, double-colon typo, multiline messages, symmetry,
and the real-world duplicate example from the feature spec.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Adds hasDoubleColon() to src/similarity.js which detects the pattern
type(scope): : description using a single regex on the first line.
The check is wired into the per-commit report loop so violations
increment countErrors and appear inline alongside commitlint errors.

Controlled by the new check_double_colon action input (default: true).
10 new unit tests cover type-only, type+scope, breaking-change (!),
no-space variant (fix::), colon-in-description false-positive,
multiline messages, and case-insensitivity.
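
A rough sketch of what such a check could look like. The function name `hasDoubleColon` comes from this commit, but the regex below is an assumption written to match the cases listed above, not the committed implementation.

```javascript
// Assumed pattern: type, optional (scope), optional "!", then two colons
// separated only by optional whitespace. Case-insensitive.
const DOUBLE_COLON = /^[a-z]+(?:\([^)]*\))?!?\s*:\s*:/i;

// Returns true when the first line of the commit message contains the
// "type(scope): : description" typo. Later lines are ignored.
function hasDoubleColon(message) {
  const firstLine = message.split('\n')[0];
  return DOUBLE_COLON.test(firstLine);
}
```

Because the pattern is anchored to the start of the line and requires the second colon immediately after the first (whitespace aside), a colon later in the description, as in `feat: add a: b mapping`, does not trigger it.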

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Runs npm test (jest) on Node 18 before the action self-test job.
The self-test now depends on unit-tests passing first.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions

✅🙏🏻 Conventional Commit 🥳 🎉

🤖 Beep boop! Congrats, it looks like all your commit messages conform to the Conventional Commit spec! 👏👏👏

Your PR can be closed. Coffee is for closers, so here's a coffee for you: ☕️

Commit Message Lint Report

  • ✏️ 5 commit(s)
  • 👤 1 author(s)
  • 0 lint error(s)
  • ⚠️ 0 lint warning(s)

@matmar10 matmar10 merged commit 897c82e into master Apr 21, 2026
2 checks passed