Skip to content

fix(legacy/netsync): classify and observe pre-warm validation errors#785

Open
ordishs wants to merge 1 commit intobsv-blockchain:mainfrom
ordishs:security/4590-prewarm-error-classify
Open

fix(legacy/netsync): classify and observe pre-warm validation errors#785
ordishs wants to merge 1 commit intobsv-blockchain:mainfrom
ordishs:security/4590-prewarm-error-classify

Conversation

@ordishs
Copy link
Copy Markdown
Collaborator

@ordishs ordishs commented Apr 29, 2026

Closes #4590.

Summary

The validateTransactions pre-warm path discarded all errors from validationClient.Validate with _, _ =. Policy errors during pre-warm are expected (warm-up runs before full validation), but critical errors (service unavailable, consensus-level tx invalidity, processing failures) were also silently swallowed. Ops had no signal that pre-warm was failing.

Decision

Add observability (log + counter), keep drop semantics. Same pattern as #4585 (PR #781). Real validation runs at subtree-validation time and catches consensus violations on its own — making pre-warm part of the validation gate would risk false-positive block rejections from classification mistakes.

Fix

  • New prometheusLegacyNetsyncPrewarmErrors counter vec with class label (tx_invalid, service, processing, policy, other).
  • New helper classifyAndCountPrewarmError(logger, err) that classifies via errors.Is and logs at the appropriate level (debug for policy, warn for service/processing/other, error for tx_invalid).
  • Both pre-warm Validate call sites in validateTransactions (small-batch path and goroutine-pool path) now capture the error and route through the helper.
  • Drop semantics preserved: function still returns nil; g.Wait() still ignored.

Test plan

  • Unit tests on classifyAndCountPrewarmError for each class; assert counter increments.
  • go test -race ./services/legacy/netsync/... passes.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 29, 2026

🤖 Claude Code Review

Status: Complete


Summary: No issues found. This PR successfully adds observability to pre-warm validation errors while preserving drop semantics.

What this PR does well:

  • Error classification logic correctly uses errors.Is for all 5 error classes (tx_invalid, service, processing, policy, other)
  • Policy errors collapse both ErrTxConflicting and ErrTxExists into a single label, which is appropriate
  • Log levels match severity: Error for tx_invalid (consensus violations shouldn't happen in pre-warm), Warn for service/processing/other, Debug for expected policy errors
  • Test coverage verifies each error class increments the correct counter
  • Both pre-warm call sites (small-batch and goroutine-pool paths) now capture and classify errors
  • Drop semantics preserved: function still returns nil, g.Wait() still ignores errors
  • Prometheus metric includes helpful godoc explaining the rationale

Design rationale confirmed:
The PR description correctly explains why pre-warm errors are dropped: real subtree validation runs later and catches consensus violations on its own. Making pre-warm part of the validation gate would risk false-positive block rejections from classification mistakes. This change adds the missing observability without changing that decision.

@sonarqubecloud
Copy link
Copy Markdown

@github-actions
Copy link
Copy Markdown
Contributor

Benchmark Comparison Report

Baseline: main (unknown)

Current: PR-785 (ba11513)

Summary

  • Regressions: 0
  • Improvements: 0
  • Unchanged: 142
  • Significance level: p < 0.05
All benchmark results (sec/op)
Benchmark Baseline Current Change p-value
_NewBlockFromBytes-4 1.677µ 1.668µ ~ 1.000
SplitSyncedParentMap_SetIfNotExists/256_buckets-4 61.37n 61.88n ~ 0.200
SplitSyncedParentMap_SetIfNotExists/16_buckets-4 61.68n 61.39n ~ 0.100
SplitSyncedParentMap_SetIfNotExists/1_bucket-4 61.60n 61.42n ~ 0.200
SplitSyncedParentMap_ConcurrentSetIfNotExists/256_buckets... 30.59n 31.68n ~ 0.100
SplitSyncedParentMap_ConcurrentSetIfNotExists/16_buckets_... 51.49n 53.56n ~ 0.200
SplitSyncedParentMap_ConcurrentSetIfNotExists/1_bucket_pa... 107.9n 107.4n ~ 1.000
MiningCandidate_Stringify_Short-4 262.5n 267.0n ~ 0.100
MiningCandidate_Stringify_Long-4 1.899µ 1.902µ ~ 0.700
MiningSolution_Stringify-4 981.6n 975.9n ~ 0.100
BlockInfo_MarshalJSON-4 1.785µ 1.773µ ~ 0.100
NewFromBytes-4 126.9n 154.7n ~ 0.200
Mine_EasyDifficulty-4 64.59µ 65.02µ ~ 0.100
Mine_WithAddress-4 5.667µ 4.958µ ~ 0.700
BlockAssembler_AddTx-4 0.02625n 0.02642n ~ 1.000
AddNode-4 11.05 10.92 ~ 0.700
AddNodeWithMap-4 10.31 10.88 ~ 1.000
DirectSubtreeAdd/4_per_subtree-4 61.61n 60.55n ~ 0.400
DirectSubtreeAdd/64_per_subtree-4 28.74n 28.70n ~ 0.700
DirectSubtreeAdd/256_per_subtree-4 27.34n 27.16n ~ 0.100
DirectSubtreeAdd/1024_per_subtree-4 26.24n 26.23n ~ 1.000
DirectSubtreeAdd/2048_per_subtree-4 25.92n 25.85n ~ 0.200
SubtreeProcessorAdd/4_per_subtree-4 311.2n 312.8n ~ 1.000
SubtreeProcessorAdd/64_per_subtree-4 306.8n 308.9n ~ 0.700
SubtreeProcessorAdd/256_per_subtree-4 308.8n 313.3n ~ 0.600
SubtreeProcessorAdd/1024_per_subtree-4 312.7n 310.4n ~ 0.700
SubtreeProcessorAdd/2048_per_subtree-4 310.6n 311.7n ~ 1.000
SubtreeProcessorRotate/4_per_subtree-4 311.5n 315.4n ~ 1.000
SubtreeProcessorRotate/64_per_subtree-4 317.6n 313.5n ~ 0.400
SubtreeProcessorRotate/256_per_subtree-4 313.8n 309.8n ~ 1.000
SubtreeProcessorRotate/1024_per_subtree-4 302.4n 307.0n ~ 0.400
SubtreeNodeAddOnly/4_per_subtree-4 66.90n 66.22n ~ 0.100
SubtreeNodeAddOnly/64_per_subtree-4 39.25n 38.88n ~ 0.200
SubtreeNodeAddOnly/256_per_subtree-4 37.75n 37.87n ~ 0.100
SubtreeNodeAddOnly/1024_per_subtree-4 37.19n 37.27n ~ 0.400
SubtreeCreationOnly/4_per_subtree-4 166.3n 165.9n ~ 0.800
SubtreeCreationOnly/64_per_subtree-4 654.4n 632.0n ~ 0.700
SubtreeCreationOnly/256_per_subtree-4 1.997µ 2.000µ ~ 0.700
SubtreeCreationOnly/1024_per_subtree-4 5.175µ 5.211µ ~ 1.000
SubtreeCreationOnly/2048_per_subtree-4 7.948µ 9.112µ ~ 0.100
SubtreeProcessorOverheadBreakdown/64_per_subtree-4 305.0n 307.3n ~ 0.100
SubtreeProcessorOverheadBreakdown/1024_per_subtree-4 307.2n 311.0n ~ 0.200
ParallelGetAndSetIfNotExists/1k_nodes-4 962.1µ 965.7µ ~ 0.400
ParallelGetAndSetIfNotExists/10k_nodes-4 1.932m 1.940m ~ 0.100
ParallelGetAndSetIfNotExists/50k_nodes-4 8.721m 8.782m ~ 0.400
ParallelGetAndSetIfNotExists/100k_nodes-4 17.35m 17.36m ~ 0.700
SequentialGetAndSetIfNotExists/1k_nodes-4 757.7µ 749.8µ ~ 0.700
SequentialGetAndSetIfNotExists/10k_nodes-4 2.988m 2.971m ~ 0.700
SequentialGetAndSetIfNotExists/50k_nodes-4 10.57m 10.84m ~ 0.400
SequentialGetAndSetIfNotExists/100k_nodes-4 20.31m 20.08m ~ 0.100
ProcessOwnBlockSubtreeNodesParallel/1k_nodes-4 1.020m 1.056m ~ 0.400
ProcessOwnBlockSubtreeNodesParallel/10k_nodes-4 4.740m 4.719m ~ 0.700
ProcessOwnBlockSubtreeNodesParallel/100k_nodes-4 19.14m 19.26m ~ 1.000
ProcessOwnBlockSubtreeNodesSequential/1k_nodes-4 837.4µ 832.6µ ~ 0.700
ProcessOwnBlockSubtreeNodesSequential/10k_nodes-4 5.935m 5.945m ~ 0.400
ProcessOwnBlockSubtreeNodesSequential/100k_nodes-4 39.21m 39.25m ~ 0.700
DiskTxMap_SetIfNotExists-4 3.847µ 3.734µ ~ 0.200
DiskTxMap_SetIfNotExists_Parallel-4 3.595µ 3.534µ ~ 0.700
DiskTxMap_ExistenceOnly-4 360.8n 314.8n ~ 0.400
Queue-4 190.9n 190.3n ~ 0.700
AtomicPointer-4 3.254n 3.236n ~ 0.400
ReorgOptimizations/DedupFilterPipeline/Old/10K-4 818.5µ 869.1µ ~ 0.200
ReorgOptimizations/DedupFilterPipeline/New/10K-4 817.0µ 842.9µ ~ 0.100
ReorgOptimizations/AllMarkFalse/Old/10K-4 117.3µ 129.1µ ~ 0.100
ReorgOptimizations/AllMarkFalse/New/10K-4 64.52µ 64.03µ ~ 0.100
ReorgOptimizations/HashSlicePool/Old/10K-4 59.58µ 61.16µ ~ 0.400
ReorgOptimizations/HashSlicePool/New/10K-4 11.50µ 11.02µ ~ 0.100
ReorgOptimizations/NodeFlags/Old/10K-4 5.491µ 5.160µ ~ 0.700
ReorgOptimizations/NodeFlags/New/10K-4 1.874µ 1.774µ ~ 0.200
ReorgOptimizations/DedupFilterPipeline/Old/100K-4 10.271m 9.907m ~ 1.000
ReorgOptimizations/DedupFilterPipeline/New/100K-4 9.383m 9.690m ~ 0.400
ReorgOptimizations/AllMarkFalse/Old/100K-4 1.171m 1.100m ~ 0.100
ReorgOptimizations/AllMarkFalse/New/100K-4 706.2µ 705.6µ ~ 0.700
ReorgOptimizations/HashSlicePool/Old/100K-4 626.6µ 588.1µ ~ 0.200
ReorgOptimizations/HashSlicePool/New/100K-4 204.0µ 202.8µ ~ 0.700
ReorgOptimizations/NodeFlags/Old/100K-4 54.38µ 54.26µ ~ 0.700
ReorgOptimizations/NodeFlags/New/100K-4 19.25µ 18.31µ ~ 0.100
TxMapSetIfNotExists-4 46.30n 46.21n ~ 0.400
TxMapSetIfNotExistsDuplicate-4 38.74n 38.60n ~ 0.700
ChannelSendReceive-4 593.6n 597.7n ~ 0.700
CalcBlockWork-4 464.3n 464.2n ~ 1.000
CalculateWork-4 635.0n 620.3n ~ 1.000
BuildBlockLocatorString_Helpers/Size_10-4 1.291µ 1.301µ ~ 0.100
BuildBlockLocatorString_Helpers/Size_100-4 12.40µ 12.43µ ~ 1.000
BuildBlockLocatorString_Helpers/Size_1000-4 154.0µ 153.4µ ~ 1.000
CatchupWithHeaderCache-4 104.2m 104.2m ~ 0.700
SubtreeSizes/10k_tx_4_per_subtree-4 1.387m 1.361m ~ 0.700
SubtreeSizes/10k_tx_16_per_subtree-4 313.6µ 318.9µ ~ 1.000
SubtreeSizes/10k_tx_64_per_subtree-4 76.07µ 77.55µ ~ 0.400
SubtreeSizes/10k_tx_256_per_subtree-4 18.86µ 19.14µ ~ 0.100
SubtreeSizes/10k_tx_512_per_subtree-4 9.394µ 9.422µ ~ 0.700
SubtreeSizes/10k_tx_1024_per_subtree-4 4.638µ 4.728µ ~ 0.100
SubtreeSizes/10k_tx_2k_per_subtree-4 2.336µ 2.357µ ~ 0.100
BlockSizeScaling/10k_tx_64_per_subtree-4 74.40µ 75.85µ ~ 0.200
BlockSizeScaling/10k_tx_256_per_subtree-4 18.71µ 18.79µ ~ 1.000
BlockSizeScaling/10k_tx_1024_per_subtree-4 4.702µ 4.746µ ~ 0.100
BlockSizeScaling/50k_tx_64_per_subtree-4 393.9µ 393.9µ ~ 0.700
BlockSizeScaling/50k_tx_256_per_subtree-4 95.16µ 94.75µ ~ 0.200
BlockSizeScaling/50k_tx_1024_per_subtree-4 23.10µ 23.12µ ~ 1.000
SubtreeAllocations/small_subtrees_exists_check-4 158.1µ 159.9µ ~ 0.700
SubtreeAllocations/small_subtrees_data_fetch-4 165.8µ 166.7µ ~ 0.700
SubtreeAllocations/small_subtrees_full_validation-4 323.1µ 326.2µ ~ 0.700
SubtreeAllocations/medium_subtrees_exists_check-4 9.360µ 9.427µ ~ 0.700
SubtreeAllocations/medium_subtrees_data_fetch-4 9.765µ 9.635µ ~ 0.200
SubtreeAllocations/medium_subtrees_full_validation-4 18.87µ 19.16µ ~ 0.200
SubtreeAllocations/large_subtrees_exists_check-4 2.296µ 2.261µ ~ 0.300
SubtreeAllocations/large_subtrees_data_fetch-4 2.374µ 2.360µ ~ 1.000
SubtreeAllocations/large_subtrees_full_validation-4 4.706µ 4.751µ ~ 0.700
_BufferPoolAllocation/16KB-4 2.668µ 2.402µ ~ 0.700
_BufferPoolAllocation/32KB-4 5.407µ 6.387µ ~ 0.100
_BufferPoolAllocation/64KB-4 10.99µ 11.68µ ~ 0.100
_BufferPoolAllocation/128KB-4 21.98µ 21.21µ ~ 0.200
_BufferPoolAllocation/512KB-4 83.09µ 82.78µ ~ 0.700
_BufferPoolConcurrent/32KB-4 12.25µ 13.58µ ~ 0.700
_BufferPoolConcurrent/64KB-4 19.72µ 22.13µ ~ 0.100
_BufferPoolConcurrent/512KB-4 106.3µ 106.1µ ~ 1.000
_SubtreeDeserializationWithBufferSizes/16KB-4 472.2µ 513.6µ ~ 0.100
_SubtreeDeserializationWithBufferSizes/32KB-4 471.2µ 499.8µ ~ 0.100
_SubtreeDeserializationWithBufferSizes/64KB-4 462.2µ 495.4µ ~ 0.100
_SubtreeDeserializationWithBufferSizes/128KB-4 463.5µ 501.8µ ~ 0.100
_SubtreeDeserializationWithBufferSizes/512KB-4 472.6µ 491.7µ ~ 0.100
_SubtreeDataDeserializationWithBufferSizes/16KB-4 27.90m 27.13m ~ 0.100
_SubtreeDataDeserializationWithBufferSizes/32KB-4 27.71m 27.21m ~ 0.700
_SubtreeDataDeserializationWithBufferSizes/64KB-4 27.42m 27.24m ~ 0.400
_SubtreeDataDeserializationWithBufferSizes/128KB-4 27.23m 26.99m ~ 0.200
_SubtreeDataDeserializationWithBufferSizes/512KB-4 26.93m 26.86m ~ 0.400
_PooledVsNonPooled/Pooled-4 647.3n 644.3n ~ 0.100
_PooledVsNonPooled/NonPooled-4 5.189µ 5.078µ ~ 0.100
_MemoryFootprint/Current_512KB_32concurrent-4 5.439µ 5.520µ ~ 1.000
_MemoryFootprint/Proposed_32KB_32concurrent-4 8.341µ 7.885µ ~ 0.100
_MemoryFootprint/Alternative_64KB_32concurrent-4 7.392µ 8.193µ ~ 0.100
_prepareTxsPerLevel-4 384.2m 387.8m ~ 1.000
_prepareTxsPerLevelOrdered-4 3.778m 3.912m ~ 0.700
_prepareTxsPerLevel_Comparison/Original-4 384.4m 393.7m ~ 0.400
_prepareTxsPerLevel_Comparison/Optimized-4 3.809m 4.078m ~ 0.100
StoreBlock_Sequential/BelowCSVHeight-4 242.2µ 245.1µ ~ 1.000
StoreBlock_Sequential/AboveCSVHeight-4 235.1µ 248.2µ ~ 0.200
GetUtxoHashes-4 270.6n 271.6n ~ 1.000
GetUtxoHashes_ManyOutputs-4 46.08µ 48.09µ ~ 0.700
_NewMetaDataFromBytes-4 229.1n 231.4n ~ 0.200
_Bytes-4 605.1n 611.7n ~ 0.200
_MetaBytes-4 566.2n 563.2n ~ 1.000

Threshold: >10% with p < 0.05 | Generated: 2026-04-29 03:30 UTC

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant