Skip to content

fix(p2p): cap broadcast goroutines with bounded semaphore#775

Open
ordishs wants to merge 1 commit intobsv-blockchain:mainfrom
ordishs:security/4574-ws-broadcast-pool
Open

fix(p2p): cap broadcast goroutines with bounded semaphore#775
ordishs wants to merge 1 commit intobsv-blockchain:mainfrom
ordishs:security/4574-ws-broadcast-pool

Conversation

@ordishs
Copy link
Copy Markdown
Collaborator

@ordishs ordishs commented Apr 28, 2026

Closes #4574.

Summary

clientChannelMap.broadcast spawned one goroutine per client per message with no concurrency cap. Under a notification burst with many clients (256 clients × 100 msg/sec ≈ 25,600 goroutines/sec, plus per-goroutine timers), the resource pressure can OOM the node.

Fix

Adds a fixed-size chan struct{} semaphore (default 256 slots) that caps in-flight broadcast goroutines. The send-with-timeout logic inside each goroutine is unchanged.

Test plan

  • Regression test: N (= 4× pool) unresponsive channels; assert broadcast wall-clock matches (N / poolSize) * timeout rather than the un-bounded timeout (proves the semaphore actually serialises).
  • go test -race ./services/p2p/... passes.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 28, 2026

🤖 Claude Code Review

Status: Complete


Summary: The implementation is solid and addresses the OOM issue effectively. The semaphore pattern correctly caps in-flight goroutines, and the test verifies the behavior with appropriate timing assertions.

Observations:

Minor - Test timing flexibility: The test uses a 2-second buffer for expectedMax (line 687 in test file). On slow CI runners or under heavy load, this might occasionally fail. Consider increasing the buffer or adding a note if flakiness occurs.

Positive highlights:

  • Semaphore correctly acquired before goroutine spawn and released in defer (lines 127, 130)
  • Test properly restores original pool size in defer (line 665)
  • Documentation clearly explains the var vs const choice (lines 100-103)
  • Wall-clock timing test confirms serialization behavior rather than just counting goroutines

@sonarqubecloud
Copy link
Copy Markdown

@github-actions
Copy link
Copy Markdown
Contributor

Benchmark Comparison Report

Baseline: main (unknown)

Current: PR-775 (d398f83)

Summary

  • Regressions: 0
  • Improvements: 0
  • Unchanged: 142
  • Significance level: p < 0.05
All benchmark results (sec/op)
Benchmark Baseline Current Change p-value
_NewBlockFromBytes-4 1.659µ 1.890µ ~ 0.100
SplitSyncedParentMap_SetIfNotExists/256_buckets-4 62.07n 62.44n ~ 1.000
SplitSyncedParentMap_SetIfNotExists/16_buckets-4 61.37n 61.85n ~ 0.200
SplitSyncedParentMap_SetIfNotExists/1_bucket-4 61.55n 61.63n ~ 0.700
SplitSyncedParentMap_ConcurrentSetIfNotExists/256_buckets... 30.53n 31.62n ~ 0.100
SplitSyncedParentMap_ConcurrentSetIfNotExists/16_buckets_... 52.38n 52.55n ~ 1.000
SplitSyncedParentMap_ConcurrentSetIfNotExists/1_bucket_pa... 107.8n 105.5n ~ 0.200
MiningCandidate_Stringify_Short-4 261.8n 264.2n ~ 0.100
MiningCandidate_Stringify_Long-4 1.897µ 1.906µ ~ 0.700
MiningSolution_Stringify-4 976.9n 981.5n ~ 0.400
BlockInfo_MarshalJSON-4 1.778µ 1.786µ ~ 0.700
NewFromBytes-4 124.6n 140.6n ~ 0.100
Mine_EasyDifficulty-4 49.87µ 50.11µ ~ 0.400
Mine_WithAddress-4 3.851µ 3.757µ ~ 0.700
BlockAssembler_AddTx-4 0.02740n 0.02705n ~ 1.000
AddNode-4 10.85 11.02 ~ 1.000
AddNodeWithMap-4 10.79 11.16 ~ 0.700
DirectSubtreeAdd/4_per_subtree-4 62.61n 59.21n ~ 0.100
DirectSubtreeAdd/64_per_subtree-4 29.89n 28.64n ~ 0.700
DirectSubtreeAdd/256_per_subtree-4 27.99n 27.87n ~ 1.000
DirectSubtreeAdd/1024_per_subtree-4 26.55n 26.48n ~ 1.000
DirectSubtreeAdd/2048_per_subtree-4 26.06n 26.19n ~ 0.400
SubtreeProcessorAdd/4_per_subtree-4 325.2n 335.5n ~ 0.200
SubtreeProcessorAdd/64_per_subtree-4 338.3n 325.7n ~ 0.100
SubtreeProcessorAdd/256_per_subtree-4 328.2n 330.2n ~ 1.000
SubtreeProcessorAdd/1024_per_subtree-4 336.9n 322.9n ~ 0.100
SubtreeProcessorAdd/2048_per_subtree-4 343.5n 318.4n ~ 0.100
SubtreeProcessorRotate/4_per_subtree-4 325.4n 322.8n ~ 0.600
SubtreeProcessorRotate/64_per_subtree-4 327.9n 320.8n ~ 0.100
SubtreeProcessorRotate/256_per_subtree-4 326.3n 328.7n ~ 1.000
SubtreeProcessorRotate/1024_per_subtree-4 320.2n 315.3n ~ 0.100
SubtreeNodeAddOnly/4_per_subtree-4 68.80n 68.36n ~ 0.700
SubtreeNodeAddOnly/64_per_subtree-4 41.10n 41.09n ~ 0.800
SubtreeNodeAddOnly/256_per_subtree-4 39.52n 39.19n ~ 0.700
SubtreeNodeAddOnly/1024_per_subtree-4 38.78n 39.13n ~ 0.100
SubtreeCreationOnly/4_per_subtree-4 169.7n 166.4n ~ 0.100
SubtreeCreationOnly/64_per_subtree-4 606.2n 623.5n ~ 0.700
SubtreeCreationOnly/256_per_subtree-4 1.991µ 1.997µ ~ 1.000
SubtreeCreationOnly/1024_per_subtree-4 4.544µ 4.823µ ~ 0.200
SubtreeCreationOnly/2048_per_subtree-4 8.297µ 7.189µ ~ 0.400
SubtreeProcessorOverheadBreakdown/64_per_subtree-4 320.5n 319.4n ~ 0.700
SubtreeProcessorOverheadBreakdown/1024_per_subtree-4 327.0n 321.9n ~ 0.700
ParallelGetAndSetIfNotExists/1k_nodes-4 957.1µ 948.3µ ~ 0.700
ParallelGetAndSetIfNotExists/10k_nodes-4 2.040m 2.040m ~ 0.700
ParallelGetAndSetIfNotExists/50k_nodes-4 9.219m 9.148m ~ 1.000
ParallelGetAndSetIfNotExists/100k_nodes-4 18.42m 18.32m ~ 1.000
SequentialGetAndSetIfNotExists/1k_nodes-4 742.3µ 763.4µ ~ 0.700
SequentialGetAndSetIfNotExists/10k_nodes-4 3.251m 3.372m ~ 0.100
SequentialGetAndSetIfNotExists/50k_nodes-4 12.43m 12.28m ~ 0.400
SequentialGetAndSetIfNotExists/100k_nodes-4 23.83m 23.47m ~ 0.700
ProcessOwnBlockSubtreeNodesParallel/1k_nodes-4 1.037m 1.029m ~ 1.000
ProcessOwnBlockSubtreeNodesParallel/10k_nodes-4 5.491m 5.371m ~ 0.400
ProcessOwnBlockSubtreeNodesParallel/100k_nodes-4 20.62m 20.47m ~ 1.000
ProcessOwnBlockSubtreeNodesSequential/1k_nodes-4 820.5µ 832.8µ ~ 0.100
ProcessOwnBlockSubtreeNodesSequential/10k_nodes-4 7.108m 7.228m ~ 0.400
ProcessOwnBlockSubtreeNodesSequential/100k_nodes-4 47.13m 44.57m ~ 0.100
DiskTxMap_SetIfNotExists-4 3.968µ 3.706µ ~ 0.100
DiskTxMap_SetIfNotExists_Parallel-4 3.523µ 3.660µ ~ 0.200
DiskTxMap_ExistenceOnly-4 314.8n 311.7n ~ 0.400
Queue-4 192.2n 191.9n ~ 0.800
AtomicPointer-4 3.275n 3.242n ~ 1.000
ReorgOptimizations/DedupFilterPipeline/Old/10K-4 896.7µ 882.7µ ~ 0.400
ReorgOptimizations/DedupFilterPipeline/New/10K-4 842.5µ 854.2µ ~ 0.400
ReorgOptimizations/AllMarkFalse/Old/10K-4 117.1µ 114.7µ ~ 0.400
ReorgOptimizations/AllMarkFalse/New/10K-4 64.31µ 64.70µ ~ 0.100
ReorgOptimizations/HashSlicePool/Old/10K-4 62.64µ 58.90µ ~ 0.400
ReorgOptimizations/HashSlicePool/New/10K-4 11.17µ 11.06µ ~ 0.400
ReorgOptimizations/NodeFlags/Old/10K-4 5.416µ 5.420µ ~ 0.700
ReorgOptimizations/NodeFlags/New/10K-4 1.910µ 1.972µ ~ 0.400
ReorgOptimizations/DedupFilterPipeline/Old/100K-4 10.089m 9.584m ~ 0.400
ReorgOptimizations/DedupFilterPipeline/New/100K-4 9.472m 10.134m ~ 0.200
ReorgOptimizations/AllMarkFalse/Old/100K-4 1.173m 1.234m ~ 0.100
ReorgOptimizations/AllMarkFalse/New/100K-4 704.9µ 703.6µ ~ 0.700
ReorgOptimizations/HashSlicePool/Old/100K-4 661.6µ 523.2µ ~ 0.100
ReorgOptimizations/HashSlicePool/New/100K-4 199.6µ 196.6µ ~ 1.000
ReorgOptimizations/NodeFlags/Old/100K-4 55.51µ 53.53µ ~ 0.200
ReorgOptimizations/NodeFlags/New/100K-4 18.93µ 19.39µ ~ 0.400
TxMapSetIfNotExists-4 46.42n 46.39n ~ 1.000
TxMapSetIfNotExistsDuplicate-4 38.76n 38.64n ~ 0.700
ChannelSendReceive-4 572.9n 604.9n ~ 0.100
CalcBlockWork-4 523.4n 523.0n ~ 1.000
CalculateWork-4 698.1n 701.7n ~ 0.700
BuildBlockLocatorString_Helpers/Size_10-4 1.325µ 1.341µ ~ 0.700
BuildBlockLocatorString_Helpers/Size_100-4 14.90µ 15.74µ ~ 1.000
BuildBlockLocatorString_Helpers/Size_1000-4 126.5µ 125.6µ ~ 0.400
CatchupWithHeaderCache-4 104.4m 104.3m ~ 1.000
_prepareTxsPerLevel-4 416.4m 405.9m ~ 0.200
_prepareTxsPerLevelOrdered-4 3.540m 3.783m ~ 0.700
_prepareTxsPerLevel_Comparison/Original-4 416.9m 419.2m ~ 0.400
_prepareTxsPerLevel_Comparison/Optimized-4 3.580m 3.608m ~ 1.000
SubtreeSizes/10k_tx_4_per_subtree-4 1.386m 1.383m ~ 1.000
SubtreeSizes/10k_tx_16_per_subtree-4 319.0µ 326.6µ ~ 0.400
SubtreeSizes/10k_tx_64_per_subtree-4 77.14µ 77.99µ ~ 0.400
SubtreeSizes/10k_tx_256_per_subtree-4 19.69µ 19.29µ ~ 0.100
SubtreeSizes/10k_tx_512_per_subtree-4 9.716µ 9.565µ ~ 0.100
SubtreeSizes/10k_tx_1024_per_subtree-4 4.843µ 4.762µ ~ 0.100
SubtreeSizes/10k_tx_2k_per_subtree-4 2.411µ 2.369µ ~ 0.700
BlockSizeScaling/10k_tx_64_per_subtree-4 74.35µ 75.53µ ~ 0.100
BlockSizeScaling/10k_tx_256_per_subtree-4 18.94µ 18.90µ ~ 1.000
BlockSizeScaling/10k_tx_1024_per_subtree-4 4.733µ 4.731µ ~ 0.700
BlockSizeScaling/50k_tx_64_per_subtree-4 395.3µ 392.9µ ~ 0.700
BlockSizeScaling/50k_tx_256_per_subtree-4 95.06µ 95.62µ ~ 0.100
BlockSizeScaling/50k_tx_1024_per_subtree-4 23.12µ 23.13µ ~ 0.700
SubtreeAllocations/small_subtrees_exists_check-4 158.5µ 158.7µ ~ 0.700
SubtreeAllocations/small_subtrees_data_fetch-4 167.2µ 168.0µ ~ 0.400
SubtreeAllocations/small_subtrees_full_validation-4 322.1µ 327.3µ ~ 0.400
SubtreeAllocations/medium_subtrees_exists_check-4 9.454µ 9.481µ ~ 0.400
SubtreeAllocations/medium_subtrees_data_fetch-4 9.831µ 9.842µ ~ 0.400
SubtreeAllocations/medium_subtrees_full_validation-4 19.00µ 19.21µ ~ 1.000
SubtreeAllocations/large_subtrees_exists_check-4 2.255µ 2.257µ ~ 1.000
SubtreeAllocations/large_subtrees_data_fetch-4 2.373µ 2.370µ ~ 0.400
SubtreeAllocations/large_subtrees_full_validation-4 4.715µ 4.745µ ~ 0.700
_BufferPoolAllocation/16KB-4 3.977µ 4.394µ ~ 0.500
_BufferPoolAllocation/32KB-4 9.469µ 8.649µ ~ 0.200
_BufferPoolAllocation/64KB-4 17.04µ 15.49µ ~ 0.200
_BufferPoolAllocation/128KB-4 33.59µ 29.31µ ~ 0.100
_BufferPoolAllocation/512KB-4 112.9µ 106.4µ ~ 0.200
_BufferPoolConcurrent/32KB-4 18.48µ 17.59µ ~ 0.200
_BufferPoolConcurrent/64KB-4 28.20µ 28.77µ ~ 0.400
_BufferPoolConcurrent/512KB-4 146.8µ 144.0µ ~ 0.400
_SubtreeDeserializationWithBufferSizes/16KB-4 667.6µ 640.6µ ~ 0.700
_SubtreeDeserializationWithBufferSizes/32KB-4 637.5µ 642.2µ ~ 0.400
_SubtreeDeserializationWithBufferSizes/64KB-4 632.5µ 637.8µ ~ 0.400
_SubtreeDeserializationWithBufferSizes/128KB-4 625.3µ 634.2µ ~ 0.700
_SubtreeDeserializationWithBufferSizes/512KB-4 628.2µ 638.4µ ~ 0.700
_SubtreeDataDeserializationWithBufferSizes/16KB-4 35.99m 35.57m ~ 0.700
_SubtreeDataDeserializationWithBufferSizes/32KB-4 36.06m 36.44m ~ 0.400
_SubtreeDataDeserializationWithBufferSizes/64KB-4 35.71m 35.96m ~ 0.400
_SubtreeDataDeserializationWithBufferSizes/128KB-4 35.59m 35.96m ~ 0.100
_SubtreeDataDeserializationWithBufferSizes/512KB-4 35.40m 35.85m ~ 0.400
_PooledVsNonPooled/Pooled-4 830.4n 830.8n ~ 1.000
_PooledVsNonPooled/NonPooled-4 7.478µ 7.126µ ~ 0.700
_MemoryFootprint/Current_512KB_32concurrent-4 7.440µ 7.670µ ~ 0.100
_MemoryFootprint/Proposed_32KB_32concurrent-4 9.594µ 10.617µ ~ 0.100
_MemoryFootprint/Alternative_64KB_32concurrent-4 9.385µ 10.172µ ~ 0.100
StoreBlock_Sequential/BelowCSVHeight-4 330.9µ 333.0µ ~ 1.000
StoreBlock_Sequential/AboveCSVHeight-4 335.1µ 332.5µ ~ 0.700
GetUtxoHashes-4 259.9n 260.9n ~ 1.000
GetUtxoHashes_ManyOutputs-4 45.11µ 47.81µ ~ 0.200
_NewMetaDataFromBytes-4 238.6n 239.9n ~ 0.500
_Bytes-4 629.4n 631.6n ~ 0.700
_MetaBytes-4 571.4n 570.0n ~ 1.000

Threshold: >10% with p < 0.05 | Generated: 2026-04-28 20:23 UTC

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant