fix: auto-reject PROGRAM messages with non-dict metadata by odesenfans · Pull Request #1137 · aleph-im/pyaleph

odesenfans · 2026-05-14T23:12:30Z

Summary

Some PROGRAM messages slipped past validation while ExecutableContent.metadata still accepted lists. The current validator requires a dict, so reading those rows fails in parsed_content and surfaces as a 500 on GET /api/v0/messages/<hash> (e.g. 42a4a8...3d96f3 returns 500, while the same hash on epyc is correctly reported as rejected).
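To make the failure mode concrete, here is a minimal stand-alone sketch. It is not the pyaleph code: `validate_metadata` and the sample rows are hypothetical stand-ins for the dict-only check described above, and the real validation lives on the ExecutableContent model.

```python
from typing import Any, Dict, List, Optional


def validate_metadata(metadata: Any) -> Optional[Dict[str, Any]]:
    """Hypothetical stand-in for the current rule: metadata is a dict or None."""
    if metadata is None or isinstance(metadata, dict):
        return metadata
    raise ValueError(f"metadata must be a dict, got {type(metadata).__name__}")


# Hypothetical stored rows. The second one mimics a message that was accepted
# while list-typed metadata was still allowed.
stored_rows = [
    {"type": "PROGRAM", "content": {"metadata": {"name": "ok"}}},
    {"type": "PROGRAM", "content": {"metadata": ["legacy", "list"]}},
]

results: List[str] = []
for row in stored_rows:
    try:
        validate_metadata(row["content"]["metadata"])
        results.append("ok")
    except ValueError:
        # On the read path, an unhandled error of this kind is what ends up as a 500.
        results.append("invalid")
```

The point of the sketch is only that a row which was valid when written can fail validation when read back under a stricter rule.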

This change:

- Adds mark_processed_message_as_rejected in aleph.repair. It mirrors mark_pending_message_as_rejected but starts from a MessageDb row instead of a PendingMessageDb: cleans up VM rows for program/instance, upserts rejected_messages, flips message_status to REJECTED, and deletes the messages row. The trigger keeps message_counts consistent; FK cascades clean message_confirmations and account_costs.
- Adds _reject_invalid_program_metadata and wires it into repair_node so the API rejects affected PROGRAM messages on every startup. The query uses jsonb_typeof(content->'metadata') = 'array'; an empty result is a no-op.
- Ships deployment/scripts/reject_processed_messages.py for ad-hoc cleanups when a restart is not an option. Dry-run by default, --commit to persist; targets specific hashes via --hash / --hashes-file. Runs from inside the API container against the deployed config at /var/pyaleph/config.yml.
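The command-line behaviour described for the script (dry-run unless --commit is given, hashes from --hash or --hashes-file) could be sketched roughly as follows. Only the three flag names come from this PR description. The function names, the repeatable --hash, and the one-hash-per-line file format are assumptions, and the sketch touches no database.

```python
import argparse
from pathlib import Path
from typing import List, Optional, Sequence, Tuple


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Reject processed messages (sketch).")
    # Assumption: --hash can be repeated to target several messages.
    parser.add_argument("--hash", dest="hashes", action="append", default=[])
    # Assumption: the file contains one item hash per line.
    parser.add_argument("--hashes-file", type=Path, default=None)
    # Without --commit, nothing is persisted (dry run).
    parser.add_argument("--commit", action="store_true")
    return parser


def collect_hashes(args: argparse.Namespace) -> List[str]:
    """Merge hashes from both sources, dropping blanks and duplicates."""
    hashes = list(args.hashes)
    if args.hashes_file is not None:
        lines = args.hashes_file.read_text().splitlines()
        hashes.extend(line.strip() for line in lines if line.strip())
    # dict.fromkeys preserves first-seen order while removing duplicates.
    return list(dict.fromkeys(hashes))


def plan(argv: Optional[Sequence[str]] = None) -> Tuple[str, List[str]]:
    """Return the run mode and the targeted hashes without performing any action."""
    args = build_parser().parse_args(argv)
    mode = "commit" if args.commit else "dry-run"
    return mode, collect_hashes(args)
```

For instance, `plan(["--hash", "abc"])` yields the dry-run mode with a single target. Defaulting to dry-run means a mistyped invocation cannot modify data by accident.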

Test plan

- venv/bin/python -m pytest tests/test_repair.py -v: 5 tests, all pass (rejects list metadata, preserves dict/None metadata, ignores non-program types, no-op on empty DB).
- venv/bin/python -m pytest tests/db/test_messages.py tests/db/test_credit_balances.py: adjacent suites still pass (63 total).
- venv/bin/ruff check + black + isort clean on changed files.
- Manual: on a staging snapshot, confirm targeted hashes flip from PROCESSED to REJECTED and GET /messages/<hash> no longer 500s.

🤖 Generated with Claude Code

Some PROGRAM messages slipped past validation while ExecutableContent.metadata accepted lists. The current validator requires a dict, so reading those rows fails parsed_content and surfaces as 500s on GET /messages/<hash>. Move them to REJECTED at startup so the API renders them like nodes that rejected them in the first place. The transition logic also lives behind a deployment/scripts helper for ad-hoc cleanups when waiting for a restart is not an option.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

foxpatch-aleph

Clean, well-structured fix for a production bug where PROGRAM messages with list-typed metadata cause 500s via parsed_content. Implements a reusable rejection utility for processed messages, wires a repair function into startup, and ships a companion CLI script. Thorough test coverage and good code quality throughout.

src/aleph/repair.py (line 69): Consider using session.execute(delete_vm_updates(...)) instead of _ = list(...) to avoid loading results into memory and make the intent clearer. The list() is needed to force execution, but a comment explaining why would help maintainers.
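The remark about list() forcing execution can be illustrated with a generic Python sketch of a lazy helper. Whether delete_vm_updates is actually a generator is only inferred from the reviewer's wording, so `delete_rows` and `store` below are hypothetical.

```python
from typing import Dict, Iterable, Iterator, List


def delete_rows(store: Dict[str, str], keys: Iterable[str]) -> Iterator[str]:
    """Hypothetical lazy helper: each deletion runs only when the next item is requested."""
    for key in keys:
        if key in store:
            del store[key]
            yield key


store = {"a": "1", "b": "2", "c": "3"}

pending = delete_rows(store, ["a", "b"])
# Calling a generator function does not run its body, so nothing is deleted yet.
size_before_consuming = len(store)

# Consuming the generator is what actually performs the deletions.
deleted: List[str] = list(pending)
size_after_consuming = len(store)
```

If the yielded values are not needed, `collections.deque(pending, maxlen=0)` drains an iterator without keeping the items. Either way, a short comment at the call site makes it clear that the consumption is intentional.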

foxpatch-aleph approved these changes May 14, 2026

odesenfans wants to merge 1 commit into main from fix/reject-program-invalid-metadata
