-
Notifications
You must be signed in to change notification settings - Fork 122
Allow LLM to enhance provider merchants #455
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughAdds guarded, three-branch merchant assignment for auto-detected merchant data: map to user merchant, create/assign AI merchant, or enrich existing provider merchant with AI data; introduces helper methods, eager-loading scope changes, and error-handled logging for creation/enrichment failures. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Transaction
participant AutoDetector
participant UserMerchantDB as "UserMerchant DB"
participant ProviderMerchantDB as "ProviderMerchant DB"
participant AIMerchantDB as "AI Merchant DB"
note over AutoDetector,Transaction: Start processing enriched transactions
Transaction->>AutoDetector: provide auto_detection
AutoDetector->>AutoDetector: guard: require business_name & business_url
alt missing data
AutoDetector-->>Transaction: skip processing
else data present
AutoDetector->>UserMerchantDB: find by business_name
alt found user merchant
UserMerchantDB-->>AutoDetector: user merchant
AutoDetector->>Transaction: assign user merchant id
else not found
AutoDetector->>ProviderMerchantDB: check existing merchant on transaction
alt no existing merchant
AutoDetector->>AIMerchantDB: find_or_create AI merchant (with error handling)
AIMerchantDB-->>AutoDetector: ai_merchant (or error logged)
AutoDetector->>Transaction: assign AI merchant id, lock attr, increment mods
else existing provider merchant (non-AI)
AutoDetector->>ProviderMerchantDB: enhance with website_url/logo_url (with error handling)
ProviderMerchantDB-->>AutoDetector: updated merchant (or error logged)
AutoDetector->>Transaction: lock attr, increment mods if updated
end
end
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
app/models/family/auto_merchant_detector.rb (1)
104-117: Consider extracting logo URL generation to reduce duplication.The merchant creation logic correctly addresses the PR objective by creating AI merchants regardless of BrandFetch configuration. However, the logo URL generation on lines 110-112 is duplicated in
enhance_provider_merchant(lines 127-129).Consider extracting to a helper method:
def logo_url_for(business_url) return nil unless Setting.brand_fetch_client_id.present? "#{default_logo_provider_url}/#{business_url}/icon/fallback/lettermark/w/40/h/40?c=#{Setting.brand_fetch_client_id}" endThen use it in both places:
- if Setting.brand_fetch_client_id.present? - pm.logo_url = "#{default_logo_provider_url}/#{auto_detection.business_url}/icon/fallback/lettermark/w/40/h/40?c=#{Setting.brand_fetch_client_id}" - end + pm.logo_url = logo_url_for(auto_detection.business_url)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
app/models/family/auto_merchant_detector.rb(2 hunks)
🧰 Additional context used
📓 Path-based instructions (10)
**/*.rb
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.rb: Application supports two modes: 'managed' and 'self_hosted' via Rails.application.config.app_mode
Use Current.user and Current.family instead of current_user / current_family for authentication context
Optimize database queries with proper indexes to prevent N+1 queries using includes/joins
**/*.rb: Use Rails pluralization in i18n:t("transactions.count", count: @transactions.count).
Runbin/rubocop -f github -afor Ruby linting with auto-correct before pull requests.
**/*.rb: Ruby code should use 2-space indent,snake_casefor methods and variables, andCamelCasefor classes and modules, following Rails conventions for folders and file names
Prefer environment variables over hard-coded values for configuration
Files:
app/models/family/auto_merchant_detector.rb
app/models/**/*.rb
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
app/models/**/*.rb: Keep business logic in models using skinny controllers and fat models pattern
Store all monetary values in base currency (user's primary currency) and use Money objects for conversion
Use ActiveRecord validations for form convenience and complex business logic validations
Use Rails concerns and POROs for organizing business logic instead of services folder
app/models/**/*.rb: All monetary values stored in base currency (user's primary currency). UseMoneyobjects for currency conversion and formatting.
Place business logic inapp/models/folder. Avoid creatingapp/services/folder. Use Rails concerns and POROs for organization.
Models should answer questions about themselves using instance methods: useaccount.balance_seriesnotAccountSeries.new(account).call.
Files:
app/models/family/auto_merchant_detector.rb
**/*.{rb,erb,html.erb}
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.{rb,erb,html.erb}: UseCurrent.userfor the current user context in controllers and views. Do NOT usecurrent_user.
UseCurrent.familyfor the current family context. Do NOT usecurrent_family.
Files:
app/models/family/auto_merchant_detector.rb
**/*.{erb,html.erb,rb}
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.{erb,html.erb,rb}: All user-facing strings must use localization (i18n) via thet()helper. Update locale files for each new or changed element.
Use server-side formatting for currencies, numbers, and dates. Do not format in JavaScript.
Files:
app/models/family/auto_merchant_detector.rb
{app/models/**/*.rb,db/migrate/**/*.rb}
📄 CodeRabbit inference engine (CLAUDE.md)
Simple validations (null checks, unique indexes) should be in the database. ActiveRecord validations for form convenience (prefer client-side when possible). Complex validations and business logic in ActiveRecord.
Files:
app/models/family/auto_merchant_detector.rb
**/*.{rb,erb}
📄 CodeRabbit inference engine (.cursor/rules/general-rules.mdc)
**/*.{rb,erb}: UseCurrent.userfor accessing the current user. Do NOT usecurrent_user
UseCurrent.familyfor accessing the current family. Do NOT usecurrent_family
Files:
app/models/family/auto_merchant_detector.rb
**/*.{rb,js,erb}
📄 CodeRabbit inference engine (.cursor/rules/project-conventions.mdc)
Format currencies, numbers, dates, and other values server-side, then pass to Stimulus controllers for display only
Files:
app/models/family/auto_merchant_detector.rb
**/*.{rb,html.erb}
📄 CodeRabbit inference engine (.cursor/rules/project-conventions.mdc)
Use ActiveRecord validations for complex validations and business logic. Simple validations may be mirrored in ActiveRecord for form error handling convenience, but prioritize client-side form validation when possible
Files:
app/models/family/auto_merchant_detector.rb
app/**/*.{rb,js,ts,jsx,tsx,css,scss}
📄 CodeRabbit inference engine (AGENTS.md)
Code should be organized in
app/directory (Rails MVC, services, jobs, mailers, components), with JS inapp/javascript/, and styles/assets inapp/assets/(Tailwind, images, fonts)
Files:
app/models/family/auto_merchant_detector.rb
app/**/*.rb
📄 CodeRabbit inference engine (AGENTS.md)
Run
bin/brakemansecurity scan before major PRs to check for static analysis of common Rails issues
Files:
app/models/family/auto_merchant_detector.rb
🧠 Learnings (2)
📚 Learning: 2025-11-24T16:56:30.669Z
Learnt from: CR
Repo: we-promise/sure PR: 0
File: .cursor/rules/project-design.mdc:0-0
Timestamp: 2025-11-24T16:56:30.669Z
Learning: Account sync must perform transfer auto-matching, calculate daily balance records from start_date to Date.current, calculate holdings, and optionally enrich transaction data
Applied to files:
app/models/family/auto_merchant_detector.rb
📚 Learning: 2025-09-13T11:27:55.271Z
Learnt from: alessiocappa
Repo: we-promise/sure PR: 141
File: app/controllers/enable_banking_items_controller.rb:48-62
Timestamp: 2025-09-13T11:27:55.271Z
Learning: The `create_enable_banking_item!` method in `Family::EnableBankingConnectable` uses `find_or_create_by(id: enable_banking_id)` to either find an existing EnableBankingItem or create a new one, so it automatically handles updates to existing items without creating duplicates.
Applied to files:
app/models/family/auto_merchant_detector.rb
🧬 Code graph analysis (1)
app/models/family/auto_merchant_detector.rb (2)
app/models/concerns/enrichable.rb (2)
enrich_attribute(26-28)lock_attr!(113-115)app/models/rule/action_executor.rb (1)
family(69-71)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: ci / test
🔇 Additional comments (6)
app/models/family/auto_merchant_detector.rb (6)
33-33: LGTM - Guard clause ensures data quality.The guard requiring both
business_nameandbusiness_urlprevents processing incomplete merchant data, which aligns with the downstream logic that needs both fields.
37-46: LGTM - Correct merchant assignment logic.The two-step approach (first matching user merchants, then creating AI merchants) correctly prioritizes user-created merchants while ensuring AI merchants are available as a fallback. The conditional increment based on
was_modifiedensures accurate tracking.
48-54: LGTM - Provider merchant enhancement logic is sound.The condition correctly identifies provider merchants that can be enriched with AI data, and the attribute locking ensures these transactions aren't reprocessed in future runs.
95-98: LGTM - Efficient query with proper eager loading.The use of
includes(:merchant, :entry)prevents N+1 queries, and theenrichable(:merchant_id)filter ensures only unlocked attributes are processed.
100-102: LGTM - Simple name-based matching.The helper correctly prioritizes user-created merchants by matching names, which is the appropriate behavior for this use case.
119-139: LGTM - Robust error handling for merchant enhancement.The method correctly enriches provider merchants with missing data and handles errors gracefully. The conditional update logic ensures only necessary changes are made, and the error handling prevents failures from disrupting the entire batch process.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (3)
app/models/family/auto_merchant_detector.rb (3)
31-55: Confirm stricter requirement onbusiness_urland case-based flow behaviorThe new
next unless auto_detection&.business_name.present? && auto_detection&.business_url.present?means we now ignore LLM results that have a name but no URL. That’s reasonable if a URL is mandatory for merchants, but it does change behavior; please confirm this is intended and doesn’t drop too many usable matches.The three-case handling (no merchant → create/assign AI; non-AI ProviderMerchant → enhance; AI/FamilyMerchant → skip) looks coherent and preserves user/AI merchants. One minor optimization you could consider later is pre-indexing
result.databytransaction_id(e.g.,index = result.data.index_by(&:transaction_id)) instead of callingfindon every iteration, to avoid O(n²) scans when there are many transactions.
95-97: Scope looks good; consider avoiding repeated relation evaluationUsing
.enrichable(:merchant_id).includes(:merchant, :entry)on the scope is a good match for the “optimize queries / avoid N+1” guideline and fits the enrichment use case. As a small follow-up, you might want to memoizescopeor load to an array once inauto_detectso thatscope.none?,scope.count,transactions_input(which also callsscope), and the main loop don’t each build a fresh relation and instantiate records multiple times, even if the query cache softens the DB cost.As per coding guidelines, this keeps the merchant auto-detection query path efficient.
120-140: Provider merchant enhancement logic is solid; consider DRYing logo URL constructionThe enrichment logic for non-AI
ProviderMerchant(only updating whenwebsite_urlis blank and AI provides one, and logging onRecordInvalid) looks good and avoids unnecessary writes. Since the BrandFetch logo URL string is duplicated here and infind_or_create_ai_merchant, you might want to extract a small helper likelogo_url_for(business_url)to centralize that interpolation and make future changes (e.g., base URL or sizing params) safer:- def enhance_provider_merchant(merchant, auto_detection) + def logo_url_for(business_url) + "#{default_logo_provider_url}/#{business_url}/icon/fallback/lettermark/w/40/h/40?c=#{Setting.brand_fetch_client_id}" + end + + def enhance_provider_merchant(merchant, auto_detection) updates = {} @@ - if Setting.brand_fetch_client_id.present? - updates[:logo_url] = "#{default_logo_provider_url}/#{auto_detection.business_url}/icon/fallback/lettermark/w/40/h/40?c=#{Setting.brand_fetch_client_id}" - end + if Setting.brand_fetch_client_id.present? + updates[:logo_url] = logo_url_for(auto_detection.business_url) + end(And similarly use
logo_url_forinfind_or_create_ai_merchant.) This is purely a maintainability/DRY improvement; behavior stays the same.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
app/models/family/auto_merchant_detector.rb(2 hunks)
🧰 Additional context used
📓 Path-based instructions (10)
**/*.rb
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.rb: Application supports two modes: 'managed' and 'self_hosted' via Rails.application.config.app_mode
Use Current.user and Current.family instead of current_user / current_family for authentication context
Optimize database queries with proper indexes to prevent N+1 queries using includes/joins
**/*.rb: Use Rails pluralization in i18n:t("transactions.count", count: @transactions.count).
Runbin/rubocop -f github -afor Ruby linting with auto-correct before pull requests.
**/*.rb: Ruby code should use 2-space indent,snake_casefor methods and variables, andCamelCasefor classes and modules, following Rails conventions for folders and file names
Prefer environment variables over hard-coded values for configuration
Files:
app/models/family/auto_merchant_detector.rb
app/models/**/*.rb
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
app/models/**/*.rb: Keep business logic in models using skinny controllers and fat models pattern
Store all monetary values in base currency (user's primary currency) and use Money objects for conversion
Use ActiveRecord validations for form convenience and complex business logic validations
Use Rails concerns and POROs for organizing business logic instead of services folder
app/models/**/*.rb: All monetary values stored in base currency (user's primary currency). UseMoneyobjects for currency conversion and formatting.
Place business logic inapp/models/folder. Avoid creatingapp/services/folder. Use Rails concerns and POROs for organization.
Models should answer questions about themselves using instance methods: useaccount.balance_seriesnotAccountSeries.new(account).call.
Files:
app/models/family/auto_merchant_detector.rb
**/*.{rb,erb,html.erb}
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.{rb,erb,html.erb}: UseCurrent.userfor the current user context in controllers and views. Do NOT usecurrent_user.
UseCurrent.familyfor the current family context. Do NOT usecurrent_family.
Files:
app/models/family/auto_merchant_detector.rb
**/*.{erb,html.erb,rb}
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.{erb,html.erb,rb}: All user-facing strings must use localization (i18n) via thet()helper. Update locale files for each new or changed element.
Use server-side formatting for currencies, numbers, and dates. Do not format in JavaScript.
Files:
app/models/family/auto_merchant_detector.rb
{app/models/**/*.rb,db/migrate/**/*.rb}
📄 CodeRabbit inference engine (CLAUDE.md)
Simple validations (null checks, unique indexes) should be in the database. ActiveRecord validations for form convenience (prefer client-side when possible). Complex validations and business logic in ActiveRecord.
Files:
app/models/family/auto_merchant_detector.rb
**/*.{rb,erb}
📄 CodeRabbit inference engine (.cursor/rules/general-rules.mdc)
**/*.{rb,erb}: UseCurrent.userfor accessing the current user. Do NOT usecurrent_user
UseCurrent.familyfor accessing the current family. Do NOT usecurrent_family
Files:
app/models/family/auto_merchant_detector.rb
**/*.{rb,js,erb}
📄 CodeRabbit inference engine (.cursor/rules/project-conventions.mdc)
Format currencies, numbers, dates, and other values server-side, then pass to Stimulus controllers for display only
Files:
app/models/family/auto_merchant_detector.rb
**/*.{rb,html.erb}
📄 CodeRabbit inference engine (.cursor/rules/project-conventions.mdc)
Use ActiveRecord validations for complex validations and business logic. Simple validations may be mirrored in ActiveRecord for form error handling convenience, but prioritize client-side form validation when possible
Files:
app/models/family/auto_merchant_detector.rb
app/**/*.{rb,js,ts,jsx,tsx,css,scss}
📄 CodeRabbit inference engine (AGENTS.md)
Code should be organized in
app/directory (Rails MVC, services, jobs, mailers, components), with JS inapp/javascript/, and styles/assets inapp/assets/(Tailwind, images, fonts)
Files:
app/models/family/auto_merchant_detector.rb
app/**/*.rb
📄 CodeRabbit inference engine (AGENTS.md)
Run
bin/brakemansecurity scan before major PRs to check for static analysis of common Rails issues
Files:
app/models/family/auto_merchant_detector.rb
🧠 Learnings (2)
📚 Learning: 2025-11-24T16:56:30.669Z
Learnt from: CR
Repo: we-promise/sure PR: 0
File: .cursor/rules/project-design.mdc:0-0
Timestamp: 2025-11-24T16:56:30.669Z
Learning: Account sync must perform transfer auto-matching, calculate daily balance records from start_date to Date.current, calculate holdings, and optionally enrich transaction data
Applied to files:
app/models/family/auto_merchant_detector.rb
📚 Learning: 2025-09-13T11:27:55.271Z
Learnt from: alessiocappa
Repo: we-promise/sure PR: 141
File: app/controllers/enable_banking_items_controller.rb:48-62
Timestamp: 2025-09-13T11:27:55.271Z
Learning: The `create_enable_banking_item!` method in `Family::EnableBankingConnectable` uses `find_or_create_by(id: enable_banking_id)` to either find an existing EnableBankingItem or create a new one, so it automatically handles updates to existing items without creating duplicates.
Applied to files:
app/models/family/auto_merchant_detector.rb
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: ci / test
| def find_matching_user_merchant(auto_detection) | ||
| user_merchants_input.find { |m| m[:name] == auto_detection.business_name }&.dig(:id) | ||
| end | ||
|
|
||
| def find_or_create_ai_merchant(auto_detection) | ||
| # Only use (source, name) for find_or_create since that's the uniqueness constraint | ||
| ProviderMerchant.find_or_create_by!( | ||
| source: "ai", | ||
| name: auto_detection.business_name | ||
| ) do |pm| | ||
| pm.website_url = auto_detection.business_url | ||
| if Setting.brand_fetch_client_id.present? | ||
| pm.logo_url = "#{default_logo_provider_url}/#{auto_detection.business_url}/icon/fallback/lettermark/w/40/h/40?c=#{Setting.brand_fetch_client_id}" | ||
| end | ||
| end | ||
| rescue ActiveRecord::RecordInvalid, ActiveRecord::RecordNotUnique | ||
| # Race condition: another process created the merchant between our find and create | ||
| ProviderMerchant.find_by(source: "ai", name: auto_detection.business_name) | ||
| end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
User-merchant matching and AI merchant creation: robustness and error handling
find_matching_user_merchant currently does a linear, case-sensitive match on the in-memory user_merchants_input. For better UX and performance on large merchant lists, you could (optionally) normalize names (downcase/strip) and prebuild a hash once, e.g. @user_merchants_by_name ||= family.merchants.index_by { |m| m.name.downcase }, then look up by auto_detection.business_name.downcase.
More importantly, find_or_create_ai_merchant rescues both ActiveRecord::RecordInvalid and ActiveRecord::RecordNotUnique as if they were race conditions. RecordNotUnique is the race you expect, but RecordInvalid typically indicates a real validation failure (e.g., invalid URL or length constraints). Swallowing it and then doing find_by can silently hide data issues and leave you with nil and no log. I’d strongly recommend splitting the rescue and logging the validation error, for example:
- ProviderMerchant.find_or_create_by!(
- source: "ai",
- name: auto_detection.business_name
- ) do |pm|
- pm.website_url = auto_detection.business_url
- if Setting.brand_fetch_client_id.present?
- pm.logo_url = "#{default_logo_provider_url}/#{auto_detection.business_url}/icon/fallback/lettermark/w/40/h/40?c=#{Setting.brand_fetch_client_id}"
- end
- end
- rescue ActiveRecord::RecordInvalid, ActiveRecord::RecordNotUnique
- # Race condition: another process created the merchant between our find and create
- ProviderMerchant.find_by(source: "ai", name: auto_detection.business_name)
+ ProviderMerchant.find_or_create_by!(
+ source: "ai",
+ name: auto_detection.business_name
+ ) do |pm|
+ pm.website_url = auto_detection.business_url
+ if Setting.brand_fetch_client_id.present?
+ pm.logo_url = "#{default_logo_provider_url}/#{auto_detection.business_url}/icon/fallback/lettermark/w/40/h/40?c=#{Setting.brand_fetch_client_id}"
+ end
+ end
+ rescue ActiveRecord::RecordNotUnique
+ # Race condition: another process created the merchant between our find and create
+ ProviderMerchant.find_by(source: "ai", name: auto_detection.business_name)
+ rescue ActiveRecord::RecordInvalid => e
+ Rails.logger.error("Failed to create AI merchant for '#{auto_detection.business_name}': #{e.message}")
+ nilThis keeps the race-condition path safe while surfacing real data problems instead of silently dropping them.
🤖 Prompt for AI Agents
In app/models/family/auto_merchant_detector.rb around lines 100 to 118, improve
user-merchant matching by normalizing names and avoiding an O(n) scan: build a
memoized hash like @user_merchants_by_name ||= user_merchants_input.index_by {
|m| m[:name].to_s.strip.downcase } and then lookup with
auto_detection.business_name.to_s.strip.downcase to return the id; and change
find_or_create_ai_merchant error handling so you only treat RecordNotUnique as
the race-condition path (rescue ActiveRecord::RecordNotUnique and fallback to
ProviderMerchant.find_by(...)), while allowing ActiveRecord::RecordInvalid to
surface (or log and re-raise) instead of silently swallowing it so real
validation failures are visible.
jjmata
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gtg!
Summary by CodeRabbit
Bug Fixes
Enhancements
✏️ Tip: You can customize this high-level summary in your review settings.