fix: auto-reject PROGRAM messages with non-dict metadata #1137
Open
odesenfans wants to merge 1 commit into
Some PROGRAM messages slipped past validation while ExecutableContent.metadata accepted lists. The current validator requires a dict, so reading those rows fails parsed_content and surfaces as 500s on GET /messages/<hash>. Move them to REJECTED at startup so the API renders them like nodes that rejected them in the first place.

The transition logic also lives behind a deployment/scripts helper for ad-hoc cleanups when waiting for a restart is not an option.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
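To make the selection rule concrete, here is a minimal sketch of which rows such a repair would target: PROGRAM messages whose metadata is a list rather than a dict. The row layout (plain dicts with item_hash, type, and content keys) and the helper name find_invalid_program_hashes are assumptions made for this illustration; they are not pyaleph's actual schema or API.

```python
from typing import Any, Dict, Iterable, List


def find_invalid_program_hashes(rows: Iterable[Dict[str, Any]]) -> List[str]:
    """Return the hashes of PROGRAM rows whose metadata is a list.

    Hypothetical helper: the row layout is assumed for illustration only.
    """
    invalid = []
    for row in rows:
        if row.get("type") != "PROGRAM":
            continue  # other message types are left alone
        metadata = (row.get("content") or {}).get("metadata")
        # A dict, or a missing/None metadata, is fine; only lists are targeted.
        if isinstance(metadata, list):
            invalid.append(row["item_hash"])
    return invalid


# Made-up sample rows, one per case of interest.
rows = [
    {"item_hash": "a1", "type": "PROGRAM", "content": {"metadata": ["x"]}},
    {"item_hash": "b2", "type": "PROGRAM", "content": {"metadata": {"k": "v"}}},
    {"item_hash": "c3", "type": "PROGRAM", "content": {"metadata": None}},
    {"item_hash": "d4", "type": "POST", "content": {"metadata": ["x"]}},
]
print(find_invalid_program_hashes(rows))
```

Only the first sample row is selected; dict metadata, None metadata, and non-PROGRAM messages are left untouched, and an empty input yields an empty list.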
foxpatch-aleph
approved these changes
May 14, 2026
foxpatch-aleph
left a comment
Clean, well-structured fix for a production bug where PROGRAM messages with list-typed metadata cause 500s via parsed_content. Implements a reusable rejection utility for processed messages, wires a repair function into startup, and ships a companion CLI script. Thorough test coverage and good code quality throughout.
src/aleph/repair.py (line 69): Consider using session.execute(delete_vm_updates(...)) instead of _ = list(...) to avoid loading results into memory and make the intent clearer. The list() is needed to force execution, but a comment explaining why would help maintainers.
Summary
Some PROGRAM messages slipped past validation while ExecutableContent.metadata accepted lists. The current validator requires a dict, so reading those rows fails parsed_content and surfaces as 500s on GET /api/v0/messages/<hash> (e.g. 42a4a8...3d96f3 returns 500, while the same hash on epyc properly reports the message as rejected).

This change:
- Adds mark_processed_message_as_rejected in aleph.repair. It mirrors mark_pending_message_as_rejected but starts from a MessageDb row instead of a PendingMessageDb: it cleans up VM rows for program/instance, upserts rejected_messages, flips message_status to REJECTED, and deletes the messages row. The trigger keeps message_counts consistent; FK cascades clean message_confirmations and account_costs.
- Adds _reject_invalid_program_metadata and wires it into repair_node so the API rejects affected PROGRAM messages on every startup. The query uses jsonb_typeof(content->'metadata') = 'array'; an empty result is a no-op.
- Adds deployment/scripts/reject_processed_messages.py for ad-hoc cleanups when a restart is not an option. Dry-run by default, --commit to persist; targets specific hashes via --hash / --hashes-file. Runs from inside the API container against the deployed config at /var/pyaleph/config.yml.

Test plan
- venv/bin/python -m pytest tests/test_repair.py -v: 5 tests, all pass (rejects list metadata, preserves dict/None metadata, ignores non-program types, no-op on empty DB).
- venv/bin/python -m pytest tests/db/test_messages.py tests/db/test_credit_balances.py: adjacent suites still pass (63 total).
- venv/bin/ruff check + black + isort clean on changed files.
- GET /messages/<hash> no longer 500s.

🤖 Generated with Claude Code
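The rejection transition described in the summary (upsert into rejected_messages, flip message_status to REJECTED, delete the messages row) and the helper script's dry-run-by-default behaviour can be sketched with the standard-library sqlite3 module. This is an illustration under loud assumptions: the three-table schema is drastically simplified, the function name reject_processed_message and its commit parameter are hypothetical, and the VM-row cleanup, trigger, and FK cascades from the real change are omitted.

```python
import sqlite3
from typing import Optional


def make_demo_db() -> sqlite3.Connection:
    """Build a tiny in-memory schema (assumed; far simpler than the real one)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE messages (item_hash TEXT PRIMARY KEY, content TEXT);
        CREATE TABLE message_status (item_hash TEXT PRIMARY KEY, status TEXT);
        CREATE TABLE rejected_messages (item_hash TEXT PRIMARY KEY, reason TEXT);
        INSERT INTO messages VALUES ('a1', '{"metadata": ["x"]}');
        INSERT INTO message_status VALUES ('a1', 'PROCESSED');
        """
    )
    conn.commit()
    return conn


def reject_processed_message(
    conn: sqlite3.Connection, item_hash: str, reason: str, commit: bool = False
) -> None:
    """Move one processed message to REJECTED; roll back unless commit=True."""
    # Upsert the rejection record (INSERT OR REPLACE acts as a simple upsert).
    conn.execute(
        "INSERT OR REPLACE INTO rejected_messages (item_hash, reason) VALUES (?, ?)",
        (item_hash, reason),
    )
    # Flip the status, then drop the processed row.
    conn.execute(
        "UPDATE message_status SET status = 'REJECTED' WHERE item_hash = ?",
        (item_hash,),
    )
    conn.execute("DELETE FROM messages WHERE item_hash = ?", (item_hash,))
    if commit:
        conn.commit()
    else:
        conn.rollback()  # dry run: leave the database untouched


def status_of(conn: sqlite3.Connection, item_hash: str) -> Optional[str]:
    row = conn.execute(
        "SELECT status FROM message_status WHERE item_hash = ?", (item_hash,)
    ).fetchone()
    return row[0] if row else None


conn = make_demo_db()
reject_processed_message(conn, "a1", "metadata is a list")  # dry run
dry_run_status = status_of(conn, "a1")
reject_processed_message(conn, "a1", "metadata is a list", commit=True)
committed_status = status_of(conn, "a1")
remaining = conn.execute("SELECT COUNT(*) FROM messages").fetchone()[0]
print(dry_run_status, committed_status, remaining)
```

Running all three statements inside one transaction means a dry run can exercise the full transition and then discard it, which is one plausible way to get the "dry-run by default, --commit to persist" behaviour the script describes.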