
fix: garbage cleanup migration + FalkorDB GetNeighbors path fix #36

Merged: thebtf merged 1 commit into main from fix/garbage-cleanup-and-neighbors on Mar 21, 2026

thebtf (Owner) commented Mar 21, 2026

Changes

Migration 040: garbage observation cleanup

  • Deletes observations matching 26 known garbage title patterns (PowerShell errors, auth failures, stdin checks, etc.)
  • Purges orphan vectors from observation_vectors where sqlite_id has no matching observation
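
For reviewers, the two cleanup steps above can be sketched roughly as follows. This is a minimal illustration, not the code in migration 040: the `statement` type and `cleanupStatements` helper are hypothetical, the `title` and `id` column names are assumptions, and the `?` placeholder and `->>` JSON operator assume a GORM-style driver on a PostgreSQL-like database.

```go
package main

import "fmt"

// statement pairs a SQL string with its bind arguments (hypothetical helper type).
type statement struct {
	SQL  string
	Args []any
}

// cleanupStatements builds one DELETE per title pattern, followed by a single
// orphan-vector purge. Column names title and id are assumptions.
func cleanupStatements(patterns []string) []statement {
	stmts := make([]statement, 0, len(patterns)+1)
	for _, p := range patterns {
		stmts = append(stmts, statement{
			SQL:  "DELETE FROM observations WHERE title LIKE ?",
			Args: []any{p},
		})
	}
	// The purge goes last so vectors orphaned by the deletes above are removed too.
	// Rows without a sqlite_id key are left alone (an assumption of this sketch).
	stmts = append(stmts, statement{
		SQL: "DELETE FROM observation_vectors v " +
			"WHERE v.metadata->>'sqlite_id' IS NOT NULL " +
			"AND NOT EXISTS (SELECT 1 FROM observations o " +
			"WHERE CAST(o.id AS TEXT) = v.metadata->>'sqlite_id')",
	})
	return stmts
}

func main() {
	for _, s := range cleanupStatements([]string{"PowerShell%Error%", "Stdin Terminal%"}) {
		fmt.Println(s.SQL, s.Args)
	}
}
```

Binding each pattern as an argument rather than concatenating it into the SQL keeps the `%` wildcards out of the statement text, and ordering the purge last means a single pass covers both pre-existing orphans and newly created ones.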

FalkorDB GetNeighbors fix

  • Error: "Type mismatch: expected List or Null but was Path"
  • Root cause: variable-length relationship binding [r:REL*1..N] returns Path in FalkorDB, not List
  • Fix: use named path p = ...-[:REL*1..N]-... with relationships(p) to extract list
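
A rough sketch of the query shape described above, for illustration only. The `buildNeighborsQuery` helper, the `$id` parameter, and the node variables `n` and `m` are hypothetical; `REL` is the placeholder relationship type from the description, and the actual query in `GetNeighbors` may differ.

```go
package main

import "fmt"

// buildNeighborsQuery returns a Cypher query that binds the whole traversal to a
// named path p and derives the relationship list from it with relationships(p),
// instead of binding the variable-length relationship directly.
// The hop bound is formatted into the query text as an integer literal.
func buildNeighborsQuery(maxHops int) (string, error) {
	if maxHops < 1 {
		return "", fmt.Errorf("maxHops must be at least 1, got %d", maxHops)
	}
	return fmt.Sprintf(
		"MATCH p = (n)-[:REL*1..%d]-(m) "+
			"WHERE id(n) = $id "+
			"WITH m, relationships(p) AS rels "+
			"RETURN m, type(rels[0]) AS rel_type",
		maxHops), nil
}

func main() {
	q, err := buildNeighborsQuery(2)
	if err != nil {
		panic(err)
	}
	fmt.Println(q)
}
```

Because `rels` comes from `relationships(p)`, it is a list, so indexing it with `rels[0]` and calling `type()` on the element avoids handing a Path to a function that expects a List.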

Test plan

  • Build passes
  • Migration 040 runs on deploy, logs deleted counts
  • Graph page GetNeighbors no longer errors in logs
  • Orphan vectors cleaned from observation_vectors table

Summary by CodeRabbit

Bug Fixes

  • Database cleanup: a one-time cleanup of broken records and their associated orphaned data was performed during the migration.
  • Graph database query fix: resolved a type mismatch when retrieving neighbor-node information, improving the stability of relationship handling in the graph.

…th query

Migration 040: one-time cleanup of garbage observations by title pattern
(PowerShell errors, auth failures, stdin checks, etc.) + orphan vector purge.

FalkorDB: fixed GetNeighbors "Type mismatch: expected List but was Path"
by using named path variable and relationships() function instead of
operating on the raw variable-length relationship binding.
@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request focuses on improving data integrity and system stability. It introduces a one-time database migration to clean up erroneous or irrelevant "garbage" observations and their corresponding orphan vectors, which can accumulate from SDK tool output. Additionally, it addresses a critical bug in the FalkorDB graph client's GetNeighbors query, preventing a type mismatch error that occurred when processing variable-length relationship paths.

Highlights

  • Database Migration for Data Cleanup: Implemented a new migration (040) to remove "garbage" observations based on predefined title patterns and to purge any associated orphan vectors from the observation_vectors table.
  • FalkorDB GetNeighbors Query Fix: Resolved a Type mismatch error in FalkorDB's GetNeighbors function by modifying the Cypher query to use named paths, ensuring correct extraction of relationship lists from variable-length paths.


coderabbitai bot commented Mar 21, 2026

Caution

Review failed

Pull request was closed or merged during review

Walkthrough

A new database migration (040) was added to delete stale observation records from the observations table by predefined SQL patterns and to purge orphans from observation_vectors. The GetNeighbors query was also fixed to use an explicit path variable in FalkorDB, correcting relationship-type extraction and hop counting.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Database Cleanup Migration: internal/db/gorm/migrations.go | Added migration 040, which deletes observations by SQL LIKE title patterns and removes orphaned observation_vectors records by checking metadata->>'sqlite_id'. Errors are logged as non-critical; no rollback is performed. |
| FalkorDB Query Fix: internal/graph/falkordb/client.go | Rewrote the GetNeighbors query to use an explicit path variable p instead of binding the relationship sequence directly. Fixed the distance calculation via length(p)-1 and the relationship-type extraction via relationships(p) and type(rels[0]) to resolve the FalkorDB type mismatch. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes


🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely summarizes the two main changes: a garbage cleanup migration (040) and a FalkorDB GetNeighbors path query fix. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |


Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.11.3)

Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions
The command is terminated due to an error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions



gemini-code-assist bot left a comment


Code Review

This pull request introduces a new migration to clean up garbage observations and orphan vectors, which is a valuable data hygiene improvement. Additionally, it includes a critical fix for the FalkorDB GetNeighbors query, resolving a type mismatch issue with variable-length relationship paths. The changes enhance both data quality and the robustness of graph operations.

Comment on lines +1335 to +1362
// SQL LIKE patterns matched against observation titles ("%" matches any run of characters).
garbagePatterns := []string{
	"PowerShell%Error%",
	"PowerShell%Anomaly%",
	"PowerShell Dot-Source%",
	"Stdin Terminal%",
	"Authorization Header Missing%",
	"FINDSTR%Cannot%",
	"Missing Authentication%",
	"JavaScript Property Setting%",
	"Incorrect FINDSTR%",
	"Invalid Argument in Child%",
	"bufio Over-read%",
	"Stdin Terminal Check%", // redundant: already covered by "Stdin Terminal%"
	"File Lock Handling%",
	"Upstream Connection%",
	"TRACE Logging Removal%",
	"npm install completion%",
	"Stderr Input Handling%",
	"Status Discrepancy Detection%",
	"Job-Session ID Synchronization%",
	"Incorrect Redirection Syntax%",
	"Rename node_modules%",
	"Case Sensitivity in Format%",
	"Cleanup Function%Parameter%",
	"Cleanup by startedAt%",
	"Null%Numeric Properties%",
	"User Cancellation Handling%",
}


medium

The garbagePatterns list is hardcoded directly within the migration. While functional for a one-time cleanup, for future similar cleanups, consider externalizing such lists into a configuration file or a dedicated database table. This would allow for easier updates and modifications to the patterns without requiring a new code deployment and migration, improving maintainability.
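
One possible shape for the externalized list, as a minimal sketch only. The `parsePatterns` helper and the one-pattern-per-line format with `#` comments are hypothetical and not part of this PR; reading the text from a file or table is left out.

```go
package main

import (
	"fmt"
	"strings"
)

// parsePatterns extracts LIKE patterns from newline-delimited text,
// skipping blank lines and lines that start with "#".
func parsePatterns(text string) []string {
	var patterns []string
	for _, line := range strings.Split(text, "\n") {
		line = strings.TrimSpace(line)
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		patterns = append(patterns, line)
	}
	return patterns
}

func main() {
	cfg := "# garbage title patterns\nPowerShell%Error%\n\nStdin Terminal%\n"
	for _, p := range parsePatterns(cfg) {
		fmt.Println(p)
	}
}
```

A plain text format keeps pattern edits reviewable in a diff, though a one-time migration would still need a deploy to run again, so the benefit mainly applies to recurring cleanups.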

thebtf merged commit 524c2f3 into main on Mar 21, 2026
1 of 2 checks passed
thebtf deleted the fix/garbage-cleanup-and-neighbors branch on March 21, 2026 00:54