Merged
Changes from 15 commits
10 changes: 10 additions & 0 deletions CLAUDE.md
@@ -90,6 +90,16 @@
- Deletion errors are logged but don't fail the operation (messages might be already deleted or too old)
- Design principle: When a spammer is detected, aggressively clean up ALL their spam messages, not just the triggering one

### Warn Auto-Ban
- `/warn` admin command optionally records each warning to `warnings` table and bans the user when count within `warn.window` reaches `warn.threshold`
- Disabled by default (`warn.threshold=0`); enabling preserves the existing warn message + delete behavior and adds the count/ban path on top
- Threshold semantics match `Report.AutoBanThreshold`: `count >= threshold` triggers ban (so threshold=2 bans on the 2nd warn)
- Storage layer (`app/storage/warnings.go`) opportunistically prunes rows older than 1 year on each `Add`; the 1y cap is a storage bound, NOT the configured window — `CountWithin` enforces the window at query time
- Ban does NOT delete the warning rows: subsequent `/warn` on an already-banned user re-triggers the ban path. Telegram's `BanChatMemberConfig` is idempotent, so the repeat ban is a no-op API call but produces a fresh admin-chat notification (audit visibility for repeat offenders)
- Warn auto-ban does NOT update spam samples (`bot.UpdateSpam` is not called) — warnings reflect admin policy, not spam content
- `executeWarnBan` mirrors `executeAutoBan` for dry/training/soft-ban handling but does not share an abstraction (the two diverge on spam-sample updates and on `From` vs `SenderChat` resolution)
- Settings: only `Warn.Threshold` is in `zeroAwarePaths` (0=disabled, must survive merges); `Warn.Window` zero is invalid and rejected by startup validation
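
Reviewer sketch (not part of the change): the bullets above separate the one-year storage bound from the configured window. A minimal in-memory model of that split might look like the following. The `memWarnings` type, the injected clock, and the exclusive boundary at the cutoff are assumptions made for illustration; the real `Add`/`CountWithin` in `app/storage/warnings.go` also take a context and a user name and are not shown in this diff.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// retention is the storage bound (1 year); it is independent of the configured window.
const retention = 365 * 24 * time.Hour

// memWarnings is a hypothetical in-memory stand-in for the warnings table.
type memWarnings struct {
	mu   sync.Mutex
	now  func() time.Time      // injectable clock, keeps the example deterministic
	rows map[int64][]time.Time // userID -> warning timestamps
}

func newMemWarnings(now func() time.Time) *memWarnings {
	return &memWarnings{now: now, rows: map[int64][]time.Time{}}
}

// Add records a warning and opportunistically prunes rows older than retention.
func (m *memWarnings) Add(userID int64) {
	m.mu.Lock()
	defer m.mu.Unlock()
	now := m.now()
	cutoff := now.Add(-retention)
	for id, stamps := range m.rows {
		kept := stamps[:0]
		for _, ts := range stamps {
			if ts.After(cutoff) {
				kept = append(kept, ts)
			}
		}
		m.rows[id] = kept
	}
	m.rows[userID] = append(m.rows[userID], now)
}

// CountWithin enforces the configured window at query time.
func (m *memWarnings) CountWithin(userID int64, window time.Duration) int {
	m.mu.Lock()
	defer m.mu.Unlock()
	cutoff := m.now().Add(-window)
	n := 0
	for _, ts := range m.rows[userID] {
		if ts.After(cutoff) {
			n++
		}
	}
	return n
}

// simulate issues three warns for one user and reports, after each one,
// whether count >= threshold holds for threshold=2 and a 168h window.
func simulate() []bool {
	clock := time.Date(2026, 1, 1, 0, 0, 0, 0, time.UTC)
	w := newMemWarnings(func() time.Time { return clock })
	const threshold = 2
	const userID = int64(42)
	window := 168 * time.Hour

	var banned []bool
	for _, gap := range []time.Duration{0, 200 * time.Hour, 24 * time.Hour} {
		clock = clock.Add(gap)
		w.Add(userID)
		banned = append(banned, w.CountWithin(userID, window) >= threshold)
	}
	return banned
}

func main() {
	fmt.Println(simulate()) // [false false true]
}
```

The first warn is 200h old when the second arrives, so it falls outside the 168h window and the second warn does not ban; the third warn, 24h later, finds two warns inside the window and crosses the threshold.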

### LLM Checker Structure
- Shared provider-agnostic LLM flow lives in `lib/tgspam/llm.go`
- Keep provider-specific transport and request construction in `lib/tgspam/openai.go`, `lib/tgspam/gemini.go`, etc
25 changes: 25 additions & 0 deletions README.md
@@ -388,6 +388,27 @@ The reporting system includes rate limiting to prevent abuse. Each user can subm

All reports are stored in the database for audit purposes and can help identify patterns of spam or abuse over time.

### Warn-Driven Auto-Ban

The admin `/warn` command can optionally escalate to an automatic ban once a user has accumulated enough warnings within a sliding time window. This complements `--report.auto-ban-threshold` (which aggregates user `/report` submissions for one message) by tracking admin warnings per user across messages.

The feature is disabled by default. To enable it, set `--warn.threshold=, [$WARN_THRESHOLD]` to a positive number and adjust `--warn.window=, [$WARN_WINDOW]` (default: `720h`). When enabled:

1. Each `/warn` issued by an admin is recorded in the `warnings` table together with the user/channel id and timestamp
2. After recording the warning, the bot counts how many warnings the same user has received within the configured window
3. If the count reaches `--warn.threshold=` (i.e. `count >= threshold`), the bot bans the user immediately, respecting `--training`, `--dry`, and `--soft-ban` modes

Copilot AI Apr 28, 2026

The flag/env-var examples in this README section appear to have stray `=` and commas (e.g. `--warn.threshold=, [$WARN_THRESHOLD]` and `--warn.threshold=`). This reads like a formatting typo and makes the enablement instructions ambiguous; please format them like the rest of the README (e.g., `--warn.threshold` / `$WARN_THRESHOLD`, and `--warn.window` / `$WARN_WINDOW`).

Suggested change
- The feature is disabled by default. To enable it, set `--warn.threshold=, [$WARN_THRESHOLD]` to a positive number and adjust `--warn.window=, [$WARN_WINDOW]` (default: `720h`). When enabled:
- 1. Each `/warn` issued by an admin is recorded in the `warnings` table together with the user/channel id and timestamp
- 2. After recording the warning, the bot counts how many warnings the same user has received within the configured window
- 3. If the count reaches `--warn.threshold=` (i.e. `count >= threshold`), the bot bans the user immediately, respecting `--training`, `--dry`, and `--soft-ban` modes
+ The feature is disabled by default. To enable it, set `--warn.threshold` / `$WARN_THRESHOLD` to a positive number and adjust `--warn.window` / `$WARN_WINDOW` (default: `720h`). When enabled:
+ 1. Each `/warn` issued by an admin is recorded in the `warnings` table together with the user/channel id and timestamp
+ 2. After recording the warning, the bot counts how many warnings the same user has received within the configured window
+ 3. If the count reaches `--warn.threshold` (i.e. `count >= threshold`), the bot bans the user immediately, respecting `--training`, `--dry`, and `--soft-ban` modes

4. A notification is posted to the admin chat in the form `**warn auto-banned** @user (12345) after N warns within <window>`. The verb adapts to the active mode: `auto-would have banned` in dry mode, `auto-would have banned (training)` in training mode, and `auto-restricted` for users in soft-ban mode (channels still fall through to `auto-banned` because Telegram has no restrict variant for channel senders).

Example: `--warn.threshold=3 --warn.window=168h` bans a user once they accumulate three warnings within a week (the third warn triggers the ban).

Notes:

- The default `--warn.threshold=0` preserves the original `/warn` behavior exactly: a warning message is posted and the offending message is deleted, but no warning is recorded and no auto-ban is performed.
- Only warnings issued within the window are counted; rows older than one year are pruned opportunistically by the storage layer. Because storage retention is capped at one year, configuring `--warn.window` beyond `8760h` is not supported.
- Unlike `/spam`, `/warn` does not update spam samples — warnings reflect admin policy, not spam content.
- Repeat bans are intentional: if an already-banned user is warned again, the threshold check fires again and re-bans them. Telegram treats banning an already-banned user as a no-op, so this is safe and serves as audit visibility for repeat offenders.
- Toggling `--warn.threshold` from `0` to a positive value (or vice versa) requires a process restart: the warnings storage is wired only at startup. Runtime changes via the settings UI are persisted but take effect only after the next restart.
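
Reviewer sketch (not part of the change): the notes above imply three constraints on the two options, namely a non-negative threshold, a positive window, and a window no longer than the one-year retention cap. A hedged illustration of such a check follows. The function name `validateWarn` and the error texts are hypothetical, and whether the project validates the window when the feature is disabled is an assumption here.

```go
package main

import (
	"fmt"
	"time"
)

// maxWarnWindow mirrors the one-year storage retention cap mentioned in the notes.
const maxWarnWindow = 8760 * time.Hour

// validateWarn is a hypothetical startup check; the project's real validation
// code is not shown in this diff.
func validateWarn(threshold int, window time.Duration) error {
	if threshold < 0 {
		return fmt.Errorf("warn.threshold must be >= 0, got %d", threshold)
	}
	// assumption: the window is checked even when threshold is 0 (feature disabled)
	if window <= 0 {
		return fmt.Errorf("warn.window must be positive, got %v", window)
	}
	if window > maxWarnWindow {
		return fmt.Errorf("warn.window %v exceeds retention cap %v", window, maxWarnWindow)
	}
	return nil
}

func main() {
	fmt.Println(validateWarn(3, 168*time.Hour))  // valid combination
	fmt.Println(validateWarn(-1, 168*time.Hour)) // negative threshold rejected
	fmt.Println(validateWarn(3, 9000*time.Hour)) // window beyond the cap rejected
}
```

Rejecting negative thresholds at this layer would also cover the case where a negative value makes `count < threshold` false on the very first warn.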

### Lua Plugins Support

TG-Spam supports custom spam detection through Lua plugins. This allows users to extend the spam detection capabilities without modifying the Go codebase.
@@ -647,6 +668,10 @@ report:
--report.rate-limit= max reports per user per period (default: 10) [$REPORT_RATE_LIMIT]
--report.rate-period= rate limit time period (default: 1h) [$REPORT_RATE_PERIOD]

warn:
--warn.threshold= auto-ban after N warns within window (0=disabled) (default: 0) [$WARN_THRESHOLD]
--warn.window= sliding window for counting warns (default: 720h) [$WARN_WINDOW]

files:
--files.samples= samples data path, defaults to dynamic data path [$FILES_SAMPLES]
--files.dynamic= dynamic data path (default: data) [$FILES_DYNAMIC]
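
Reviewer sketch (not part of the change): the `warn` options listed above pair each flag with an environment variable and a default. The project's actual option parser is not shown in this diff, so the following uses only the standard `flag` package as a stand-in. The `warnOpts` type, `parseWarnOpts`, and the precedence order (flag over environment over default) are assumptions for illustration.

```go
package main

import (
	"flag"
	"fmt"
	"os"
	"strconv"
	"time"
)

// warnOpts is a hypothetical holder for the two warn options.
type warnOpts struct {
	Threshold int
	Window    time.Duration
}

// parseWarnOpts starts from the documented defaults, applies environment
// values, and finally lets command-line flags override both.
func parseWarnOpts(args []string, getenv func(string) string) (warnOpts, error) {
	opts := warnOpts{Threshold: 0, Window: 720 * time.Hour}

	if v := getenv("WARN_THRESHOLD"); v != "" {
		n, err := strconv.Atoi(v)
		if err != nil {
			return opts, fmt.Errorf("WARN_THRESHOLD: %w", err)
		}
		opts.Threshold = n
	}
	if v := getenv("WARN_WINDOW"); v != "" {
		d, err := time.ParseDuration(v)
		if err != nil {
			return opts, fmt.Errorf("WARN_WINDOW: %w", err)
		}
		opts.Window = d
	}

	fs := flag.NewFlagSet("warn", flag.ContinueOnError)
	fs.IntVar(&opts.Threshold, "warn.threshold", opts.Threshold, "auto-ban after N warns within window (0=disabled)")
	fs.DurationVar(&opts.Window, "warn.window", opts.Window, "sliding window for counting warns")
	if err := fs.Parse(args); err != nil {
		return opts, err
	}
	return opts, nil
}

func main() {
	opts, err := parseWarnOpts(os.Args[1:], os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	fmt.Printf("threshold=%d window=%v\n", opts.Threshold, opts.Window)
}
```

With no flags and no environment values the sketch yields the documented defaults, so the feature stays disabled.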
10 changes: 9 additions & 1 deletion app/config/settings.go
@@ -33,6 +33,7 @@ type Settings struct {
Duplicates DuplicatesSettings `json:"duplicates" yaml:"duplicates" db:"duplicates"`
Reactions ReactionsSettings `json:"reactions" yaml:"reactions" db:"reactions"`
Report ReportSettings `json:"report" yaml:"report" db:"report"`
Warn WarnSettings `json:"warn" yaml:"warn" db:"warn"`

// spam detection settings
SimilarityThreshold float64 `json:"similarity_threshold" yaml:"similarity_threshold" db:"similarity_threshold"`
@@ -181,6 +182,12 @@ type ReportSettings struct {
RatePeriod time.Duration `json:"rate_period" yaml:"rate_period" db:"report_rate_period"`
}

// WarnSettings contains admin /warn auto-ban settings
type WarnSettings struct {
Threshold int `json:"threshold" yaml:"threshold" db:"warn_threshold"`
Window time.Duration `json:"window" yaml:"window" db:"warn_window"`
}
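
Reviewer sketch (not part of the change): since `Window` is a `time.Duration`, it may help to see what the `json` tag produces on the wire. With the standard `encoding/json` package a duration is encoded as an integer count of nanoseconds. The struct below is a trimmed copy with only the `json` tags, and `roundTrip` is a hypothetical helper; YAML and database encodings are not covered.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// WarnSettings is a trimmed copy of the struct above, json tags only.
type WarnSettings struct {
	Threshold int           `json:"threshold"`
	Window    time.Duration `json:"window"`
}

// roundTrip marshals the settings to JSON and decodes them back.
func roundTrip(in WarnSettings) (string, WarnSettings, error) {
	data, err := json.Marshal(in)
	if err != nil {
		return "", WarnSettings{}, err
	}
	var out WarnSettings
	if err := json.Unmarshal(data, &out); err != nil {
		return "", WarnSettings{}, err
	}
	return string(data), out, nil
}

func main() {
	s, out, err := roundTrip(WarnSettings{Threshold: 3, Window: 12 * time.Hour})
	if err != nil {
		fmt.Println("round trip failed:", err)
		return
	}
	fmt.Println(s)          // {"threshold":3,"window":43200000000000}
	fmt.Println(out.Window) // 12h0m0s
}
```

A hand-edited JSON blob therefore needs the window in nanoseconds, not a string such as `"12h"`.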

// LuaPluginsSettings contains Lua plugins settings
type LuaPluginsSettings struct {
Enabled bool `json:"enabled" yaml:"enabled" db:"lua_plugins_enabled"`
@@ -238,7 +245,7 @@ type TransientSettings struct {
// temporary auth password (used only to generate hash)
WebAuthPasswd string `json:"-" yaml:"-"`

- // AuthFromCLI marks web auth (hash or password) as held in memory rather
+ // authFromCLI marks web auth (hash or password) as held in memory rather
// than authoritative in the database. It is set by applyCLIOverrides for
// explicit --server.auth/--server.auth-hash overrides, and by
// applyAutoAuthFallback for the auto-generated password safety net. When
@@ -303,6 +310,7 @@ var zeroAwarePaths = map[string]bool{
"Duplicates.Threshold": true, // app/main.go:717 (> 0): 0 disables
"Report.AutoBanThreshold": true, // app/main.go:336, app/events/reports.go:191 (> 0): 0 disables
"Report.RateLimit": true, // app/events/reports.go:154 (<= 0): 0 disables rate limiting
"Warn.Threshold": true, // app/main.go, app/events/admin.go (> 0): 0 disables
"OpenAI.HistorySize": true, // lib/tgspam/detector.go:409 (> 0): 0 disables history
"Gemini.HistorySize": true, // lib/tgspam/detector.go:409 (> 0): 0 disables history
"FirstMessagesCount": true, // app/main.go:703, lib/tgspam/detector.go:205,208 (> 0): 0 disables
34 changes: 34 additions & 0 deletions app/config/settings_test.go
@@ -220,6 +220,9 @@ func newPopulatedSettings() *Settings {
s.Report.RateLimit = 5
s.Report.RatePeriod = 90 * time.Second

s.Warn.Threshold = 3
s.Warn.Window = 12 * time.Hour

s.AggressiveCleanup = true
s.AggressiveCleanupLimit = 50

@@ -239,6 +242,7 @@ func TestSettings_JSONRoundTrip_NewGroups(t *testing.T) {
assert.Contains(t, jsonStr, `"duplicates"`)
assert.Contains(t, jsonStr, `"reactions"`)
assert.Contains(t, jsonStr, `"report"`)
assert.Contains(t, jsonStr, `"warn"`)
assert.Contains(t, jsonStr, `"aggressive_cleanup"`)
assert.Contains(t, jsonStr, `"aggressive_cleanup_limit"`)
assert.Contains(t, jsonStr, `"contact_only"`)
@@ -253,6 +257,7 @@ func TestSettings_JSONRoundTrip_NewGroups(t *testing.T) {
assert.Equal(t, original.Duplicates, restored.Duplicates)
assert.Equal(t, original.Reactions, restored.Reactions)
assert.Equal(t, original.Report, restored.Report)
assert.Equal(t, original.Warn, restored.Warn)
assert.Equal(t, original.AggressiveCleanup, restored.AggressiveCleanup)
assert.Equal(t, original.AggressiveCleanupLimit, restored.AggressiveCleanupLimit)
assert.True(t, restored.Meta.ContactOnly)
@@ -272,6 +277,7 @@ func TestSettings_YAMLRoundTrip_NewGroups(t *testing.T) {
assert.Contains(t, yamlStr, "duplicates:")
assert.Contains(t, yamlStr, "reactions:")
assert.Contains(t, yamlStr, "report:")
assert.Contains(t, yamlStr, "warn:")
assert.Contains(t, yamlStr, "aggressive_cleanup:")
assert.Contains(t, yamlStr, "aggressive_cleanup_limit:")
assert.Contains(t, yamlStr, "contact_only:")
@@ -286,6 +292,7 @@ func TestSettings_YAMLRoundTrip_NewGroups(t *testing.T) {
assert.Equal(t, original.Duplicates, restored.Duplicates)
assert.Equal(t, original.Reactions, restored.Reactions)
assert.Equal(t, original.Report, restored.Report)
assert.Equal(t, original.Warn, restored.Warn)
assert.Equal(t, original.AggressiveCleanup, restored.AggressiveCleanup)
assert.Equal(t, original.AggressiveCleanupLimit, restored.AggressiveCleanupLimit)
assert.True(t, restored.Meta.ContactOnly)
@@ -396,6 +403,14 @@ func TestSettings_ApplyDefaults_SkipsZeroAware(t *testing.T) {
},
assertFn: func(t *testing.T, target *Settings) { assert.Equal(t, 0, target.Report.AutoBanThreshold) },
},
{
name: "Warn.Threshold",
setup: func(target, template *Settings) {
target.Warn.Threshold = 0
template.Warn.Threshold = 3
},
assertFn: func(t *testing.T, target *Settings) { assert.Equal(t, 0, target.Warn.Threshold) },
},
{
name: "Report.RateLimit",
setup: func(target, template *Settings) {
@@ -457,6 +472,25 @@ func TestSettings_ApplyDefaults_SkipsZeroAware(t *testing.T) {
}
}

func TestSettings_ApplyDefaults_FillsWarnWindow(t *testing.T) {
// warn.Window is NOT zero-aware: zero is invalid (caught by startup
// validation), so the regular merge semantics must fill it from the
// template when the persisted blob has zero.
target := New()
target.Warn.Threshold = 5 // non-zero (preserved either way)
template := &Settings{
Warn: WarnSettings{
Threshold: 1,
Window: 720 * time.Hour,
},
}

target.ApplyDefaults(template)

assert.Equal(t, 5, target.Warn.Threshold, "non-zero target threshold preserved")
assert.Equal(t, 720*time.Hour, target.Warn.Window, "zero target window filled from template")
}
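
Reviewer sketch (not part of the change): the two tests above pin down different merge rules for the two fields. A hand-written version of the merge for this one group might look like the following. The real `ApplyDefaults` is not shown in this diff and presumably works generically via `zeroAwarePaths`, so `applyWarnDefaults` and `defaultWindow` are hypothetical names used only to illustrate the rule.

```go
package main

import (
	"fmt"
	"time"
)

// defaultWindow matches the documented default for warn.window.
const defaultWindow = 720 * time.Hour

// WarnSettings is a trimmed copy of the settings group under test.
type WarnSettings struct {
	Threshold int
	Window    time.Duration
}

// applyWarnDefaults fills unset fields of target from template.
func applyWarnDefaults(target *WarnSettings, template WarnSettings) {
	// Threshold is zero-aware: 0 means "disabled" and must survive the merge,
	// so it is never overwritten from the template.

	// Window is not zero-aware: a zero value means "unset" and is filled in.
	if target.Window == 0 {
		target.Window = template.Window
	}
}

func main() {
	target := WarnSettings{Threshold: 0, Window: 0}
	applyWarnDefaults(&target, WarnSettings{Threshold: 3, Window: defaultWindow})
	fmt.Printf("threshold=%d window=%v\n", target.Threshold, target.Window)
	// threshold=0 window=720h0m0s
}
```

If `Threshold` were merged like an ordinary field, a persisted `0` would be replaced by the template value and silently re-enable the feature.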

func TestSettings_ApplyDefaults_NestedStructs(t *testing.T) {
target := New()
template := &Settings{
134 changes: 130 additions & 4 deletions app/events/admin.go
@@ -16,6 +16,14 @@ import (
"github.com/umputun/tg-spam/app/bot"
)

//go:generate moq --out mocks/warnings.go --pkg mocks --with-resets --skip-ensure . Warnings

// Warnings is an interface for admin /warn records storage used by the warn auto-ban feature
type Warnings interface {
Add(ctx context.Context, userID int64, userName string) error
CountWithin(ctx context.Context, userID int64, window time.Duration) (int, error)
}

// admin is a helper to handle all admin-group related stuff, created by listener
// public methods kept public (on a private struct) to be able to recognize the api
type admin struct {
Expand All @@ -31,6 +39,9 @@ type admin struct {
warnMsg string
aggressiveCleanup bool
aggressiveCleanupLimit int
warnings Warnings // storage for /warn records, used by DirectWarnReport auto-ban path
warnThreshold int // auto-ban after N /warn within warnWindow (0 disables auto-ban)
warnWindow time.Duration // sliding window for counting warns
}

const (
@@ -343,27 +354,142 @@ func (a *admin) DirectWarnReport(update tbapi.Update) error {
}

// make a warning message and reply to origMsg.MessageID
- warnTarget := "@" + origMsg.From.UserName
+ warnTargetName := "@" + origMsg.From.UserName
if origMsg.SenderChat != nil && origMsg.SenderChat.ID != 0 && origMsg.SenderChat.ID != a.primChatID {
chName := a.channelDisplayName(origMsg.SenderChat)
if origMsg.SenderChat.UserName != "" {
- warnTarget = "@" + chName
+ warnTargetName = "@" + chName
} else {
- warnTarget = chName
+ warnTargetName = chName
}
}
warnMsg := fmt.Sprintf("warning from %s\n\n%s %s", update.Message.From.UserName,
- warnTarget, a.warnMsg)
+ warnTargetName, a.warnMsg)
Comment on lines 356 to +367

Copilot AI Apr 28, 2026


`DirectWarnReport` builds `warnTargetName` from `origMsg.From.UserName` without checking `origMsg.From` for nil. For channel messages, Telegram may omit `From` and only set `SenderChat`, which would panic before the `SenderChat` branch runs. Please guard accesses to `ReplyToMessage.From` / `origMsg.From` and use `SenderChat` (or a safe fallback) when `From` is nil.
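
One possible shape for the guard requested in this comment, as a hedged sketch only. The `tgUser`, `tgChat`, and `tgMessage` types are hypothetical stand-ins for the `tbapi` types, the title fallback simplifies whatever `channelDisplayName` really does, and the `"unknown"` fallback is an assumption.

```go
package main

import "fmt"

// tgUser, tgChat and tgMessage are hypothetical, trimmed stand-ins for the
// Telegram API types used by DirectWarnReport.
type tgUser struct{ UserName string }

type tgChat struct {
	ID       int64
	UserName string
	Title    string
}

type tgMessage struct {
	From       *tgUser
	SenderChat *tgChat
}

// warnDisplayName resolves the name shown in the warning message without
// dereferencing a nil From. The channel branch is checked first.
func warnDisplayName(msg *tgMessage, primChatID int64) string {
	if sc := msg.SenderChat; sc != nil && sc.ID != 0 && sc.ID != primChatID {
		if sc.UserName != "" {
			return "@" + sc.UserName
		}
		return sc.Title
	}
	if msg.From != nil {
		return "@" + msg.From.UserName
	}
	return "unknown" // assumed fallback when neither identity is available
}

func main() {
	channelPost := &tgMessage{SenderChat: &tgChat{ID: 100, UserName: "news"}}
	fmt.Println(warnDisplayName(channelPost, 1)) // @news

	userPost := &tgMessage{From: &tgUser{UserName: "bob"}}
	fmt.Println(warnDisplayName(userPost, 1)) // @bob
}
```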

if err := send(tbapi.NewMessage(a.primChatID, escapeMarkDownV1Text(warnMsg)), a.tbAPI); err != nil {
errs = multierror.Append(errs, fmt.Errorf("failed to send warning to main chat: %w", err))
}

if banErr := a.trackWarnAndMaybeBan(origMsg); banErr != nil {
errs = multierror.Append(errs, banErr)
}

if err := errs.ErrorOrNil(); err != nil {
return fmt.Errorf("direct warn report failed: %w", err)
}
return nil
}

// warnTarget identifies the entity (user or channel) that a warn applies to.
// channelID is 0 for plain users; for channel posts it equals the SenderChat ID.
type warnTarget struct {
userID int64
userName string
channelID int64
}

// resolveWarnTarget extracts the warn target from the original message.
// returns (target, true) for plain users and channel posts; (target, false) for
// anonymous admin posts (SenderChat == group itself, From is shared GroupAnonymousBot)
// and updates with no resolvable identity.
func (a *admin) resolveWarnTarget(origMsg *tbapi.Message) (warnTarget, bool) {
if origMsg.SenderChat != nil && origMsg.SenderChat.ID != 0 {
// anonymous admin posts have SenderChat.ID == primChatID; From identity is the
// shared GroupAnonymousBot user. tracking warns against either is meaningless
// (banning the group itself or a shared bot id), so skip entirely.
if origMsg.SenderChat.ID == a.primChatID {
return warnTarget{}, false
}
return warnTarget{
userID: origMsg.SenderChat.ID,
userName: a.channelDisplayName(origMsg.SenderChat),
channelID: origMsg.SenderChat.ID,
}, true
}
if origMsg.From != nil && origMsg.From.ID != 0 {
return warnTarget{userID: origMsg.From.ID, userName: origMsg.From.UserName}, true
}
return warnTarget{}, false
}

// trackWarnAndMaybeBan records the warning and triggers an auto-ban when the
// configured threshold is reached within the sliding window. it is a no-op when
// the feature is disabled (threshold == 0), warnings storage is unwired, or the
// target cannot be resolved (anonymous admin posts, missing From/SenderChat).
// returns nil unless the ban itself fails - storage failures are logged but not propagated
// because the warning message has already been posted (best-effort).
func (a *admin) trackWarnAndMaybeBan(origMsg *tbapi.Message) error {
if a.warnThreshold == 0 || a.warnings == nil {

Copilot AI Apr 28, 2026


Auto-ban disablement is checked with `a.warnThreshold == 0`. Negative thresholds can be persisted (e.g., via a crafted form POST), and if warnings storage is wired this would make `count < threshold` false and trigger a ban on the first warn. Consider treating `warnThreshold <= 0` as disabled (and/or validating that the threshold is non-negative at settings validation and form parsing).

Suggested change
- // the feature is disabled (threshold == 0), warnings storage is unwired, or the
- // target cannot be resolved (anonymous admin posts, missing From/SenderChat).
- // returns nil unless the ban itself fails - storage failures are logged but not propagated
- // because the warning message has already been posted (best-effort).
- func (a *admin) trackWarnAndMaybeBan(origMsg *tbapi.Message) error {
- if a.warnThreshold == 0 || a.warnings == nil {
+ // the feature is disabled (threshold <= 0), warnings storage is unwired, or the
+ // target cannot be resolved (anonymous admin posts, missing From/SenderChat).
+ // returns nil unless the ban itself fails - storage failures are logged but not propagated
+ // because the warning message has already been posted (best-effort).
+ func (a *admin) trackWarnAndMaybeBan(origMsg *tbapi.Message) error {
+ if a.warnThreshold <= 0 || a.warnings == nil {

return nil
}
target, ok := a.resolveWarnTarget(origMsg)
if !ok {
return nil
}
ctx := context.TODO()
if err := a.warnings.Add(ctx, target.userID, target.userName); err != nil {
log.Printf("[WARN] failed to record warn for %q (%d): %v", target.userName, target.userID, err)
return nil
}
count, err := a.warnings.CountWithin(ctx, target.userID, a.warnWindow)
if err != nil {
log.Printf("[WARN] failed to count warns for %q (%d): %v", target.userName, target.userID, err)
return nil
}
if count < a.warnThreshold {
return nil
}
return a.executeWarnBan(target, count)
}

// executeWarnBan bans a user or channel after the warn-threshold is reached within warnWindow.
// it respects dry, training, and softBan modes, and posts an admin-chat notification.
// it does not update spam samples - a warn is not necessarily spam content.
func (a *admin) executeWarnBan(target warnTarget, count int) error {
log.Printf("[INFO] warn auto-ban triggered for %q (%d): %d warns within %v",
target.userName, target.userID, count, a.warnWindow)

banReq := banRequest{
duration: bot.PermanentBanDuration,
userID: target.userID,
channelID: target.channelID,
chatID: a.primChatID,
tbAPI: a.tbAPI,
dry: a.dry,
training: a.trainingMode,
userName: target.userName,
restrict: a.softBan,
}
if err := banUserOrChannel(banReq); err != nil {
return fmt.Errorf("failed to auto-ban %q (%d) after %d warns: %w",
target.userName, target.userID, count, err)
}

if a.adminChatID == 0 {
return nil
}

action := "banned"
switch {
case a.dry:
action = "would have banned"
case a.trainingMode:
action = "would have banned (training)"
case a.softBan && target.channelID == 0:
action = "restricted"
}

displayName := target.userName
if target.channelID == 0 && target.userName != "" {
displayName = "@" + target.userName
}
notification := fmt.Sprintf("**warn auto-%s** %s (%d) after %d warns within %v",
action, escapeMarkDownV1Text(displayName), target.userID, count, a.warnWindow)
if err := send(tbapi.NewMessage(a.adminChatID, notification), a.tbAPI); err != nil {
return fmt.Errorf("failed to send warn auto-ban notification: %w", err)
}
return nil
}
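
Reviewer sketch (not part of the change): the verb selection and message format in `executeWarnBan` can be expressed as two pure functions, which makes the mode precedence easy to test in isolation. `warnBanAction` and `formatNotification` are hypothetical helper names, and the markdown escaping of the display name is omitted here.

```go
package main

import (
	"fmt"
	"time"
)

// week is an example window used in main and in checks.
const week = 168 * time.Hour

// warnBanAction mirrors the switch in executeWarnBan: dry wins over training,
// training wins over soft-ban, and channels never get "restricted".
func warnBanAction(dry, training, softBan, isChannel bool) string {
	switch {
	case dry:
		return "would have banned"
	case training:
		return "would have banned (training)"
	case softBan && !isChannel:
		return "restricted"
	}
	return "banned"
}

// formatNotification builds the admin-chat text using the same format string.
func formatNotification(action, name string, id int64, count int, window time.Duration) string {
	return fmt.Sprintf("**warn auto-%s** %s (%d) after %d warns within %v",
		action, name, id, count, window)
}

func main() {
	action := warnBanAction(false, false, true, false)
	fmt.Println(formatNotification(action, "@user", 12345, 3, week))
	// **warn auto-restricted** @user (12345) after 3 warns within 168h0m0s
}
```

Note that `%v` renders a `time.Duration` through its `String` method, so a `168h` window appears as `168h0m0s` in the notification.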

// returns the user ID and username from the tg update if it's a forwarded message,
// or just the username in case the sender is a hidden user
func (a *admin) getForwardUsernameAndID(update tbapi.Update) (fwdID int64, username string) {