Skip to content

feat: add batch garbage collector#28

Open
acardace wants to merge 2 commits intollm-d-incubation:mainfrom
acardace:feat/garbage-collector
Open

feat: add batch garbage collector#28
acardace wants to merge 2 commits intollm-d-incubation:mainfrom
acardace:feat/garbage-collector

Conversation

@acardace
Copy link
Contributor

@acardace acardace commented Feb 2, 2026

Summary

  • Add batch-gc command that scans for expired batch jobs and removes them from the database
  • Designed for one-shot execution, suitable for Kubernetes CronJob deployment
  • Supports --dry-run flag to preview deletions without executing
  • Includes unit tests

TODO

  • Add file garbage collection once the Files API is implemented (files have their own ExpiresAt field)

Helm Chart

Adds CronJob template with configurable:

  • schedule (default: daily at 2am)
  • dryRun mode
  • Database connection via secret

Add a new batch-gc command that scans for expired batch jobs and
removes them from the database. Designed for one-shot execution,
suitable for Kubernetes CronJob deployment. Includes unit tests.

TODO: Add file garbage collection once the Files API is implemented.

Signed-off-by: Antonio Cardace <acardace@redhat.com>
Add Helm chart support for deploying the batch-gc as a Kubernetes
CronJob. Configurable schedule, dry-run mode, and database connection.

Signed-off-by: Antonio Cardace <acardace@redhat.com>
Copy link
Collaborator

@lioraron lioraron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!


gc := collector.NewGarbageCollector(dbClient, *dryRun)

result, err := gc.Run(ctx)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the benefit of running this as a cron job, relative to implementing this as a long running background process, and inside it will run the gc process periodically based on a frequency param?
If it's a background process it can also expose metrics on its operation?

}

// Check if job has expired
if statusInfo.ExpiresAt == nil || *statusInfo.ExpiresAt > now {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the new DB interface - get can provide all the expired items, so no need for this check?
i'm assuming we will rebase this after merging the updated db interface.

}

result.JobsExpired++
logger.V(logging.INFO).Info("Found expired job", "jobID", job.ID, "expiresAt", *statusInfo.ExpiresAt)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe info level logging would be on the iteration counters level, and specific items that are operated on would be in a debug logging level?
Another option is to get as an input config param - the level for logging individual item operations?

logger.V(logging.INFO).Info("Found expired job", "jobID", job.ID, "expiresAt", *statusInfo.ExpiresAt)

if c.dryRun {
logger.V(logging.INFO).Info("[DRY RUN] Would delete job", "jobID", job.ID)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of multiple logs line for an individual item being operated on, consider a single debug level log line per item being operated on - that log line will summarize what was applied to that item.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants