Section-Bloom Control Optimization #6335

@bladehan1

Background

The current storage of section-bloom data is strongly coupled with the JSON-RPC interface toggle (node.jsonrpc.httpFullNodeEnable), leading to the following issues:

  1. Irreversible Data Loss: If a user disables the toggle and later re-enables it, section-bloom data for the blocks processed while it was disabled is never written and cannot be recovered, so eth_getLogs queries over that block range fail permanently.
  2. Confusing Configuration Semantics: A single toggle controls both the data layer and interface layer, blurring behavioral boundaries for users.
  3. Contradictory Default Behavior: Full nodes should natively support complete log queries, but the current configuration defaults to disabling this functionality.

Data Status (as of 2025-05-29, Mainnet Block Height 72.6 million):

  • Theoretical Storage Size: 17.3 GB
  • Actual Compressed Size: 12.9 GB (25.4% smaller than the theoretical size)
  • Annual Growth: ~10 million new blocks per year add at most ~2.4 GB of theoretical size; after the 25.4% reduction from compression, actual storage growth is ~1.79 GB/year.
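
The growth estimate can be re-derived from the snapshot figures. The sketch below is only a check of the arithmetic, assuming section-bloom size grows linearly with block count; the class and method names are made up for illustration. Computed without intermediate rounding it gives roughly 2.38 GB/year theoretical and 1.78 GB/year compressed, in line with the rounded 2.4 GB and 1.79 GB quoted above.

```java
class SectionBloomGrowth {
  // Snapshot figures quoted in this issue (2025-05-29).
  static final double THEORETICAL_GB = 17.3;
  static final double COMPRESSED_GB = 12.9;
  static final double BLOCK_HEIGHT_MILLIONS = 72.6;
  static final double NEW_BLOCKS_PER_YEAR_MILLIONS = 10.0;

  /** Fraction of the theoretical size saved by compression. */
  static double sizeReduction() {
    return 1.0 - COMPRESSED_GB / THEORETICAL_GB;
  }

  /** Uncompressed growth per year, assuming size is proportional to block count. */
  static double theoreticalGrowthGbPerYear() {
    return THEORETICAL_GB / BLOCK_HEIGHT_MILLIONS * NEW_BLOCKS_PER_YEAR_MILLIONS;
  }

  /** Growth per year after applying the observed size reduction. */
  static double compressedGrowthGbPerYear() {
    return theoreticalGrowthGbPerYear() * (1.0 - sizeReduction());
  }

  public static void main(String[] args) {
    System.out.printf("size reduction: %.1f%%%n", sizeReduction() * 100);
    System.out.printf("theoretical growth: %.2f GB/year%n", theoreticalGrowthGbPerYear());
    System.out.printf("compressed growth: %.2f GB/year%n", compressedGrowthGbPerYear());
  }
}
```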

Rationale

Why should this feature exist?

  1. Data Integrity: Index data should be decoupled from interface availability to prevent configuration changes from causing data gaps.
  2. Operational Flexibility: Users may need to temporarily disable interfaces (e.g., during security audits) without compromising data completeness.

What are the use cases?

  1. Full Node Operation: Preserve section-bloom data for future analysis even when the JSON-RPC interface is disabled.
  2. Light Node Optimization: Resource-constrained nodes can disable data writes (storage.writeSectionBloom=false) to save storage.
  3. Chain Service Providers: DApps require guaranteed completeness of eth_getLogs results, unaffected by temporary configuration changes.
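
As an illustration of the first use case, a node configuration might look like the fragment below. This is a sketch only: storage.writeSectionBloom is the option proposed in this issue and does not exist yet, and the exact nesting of both keys in the config file is an assumption.

```
storage {
  # Proposed data toggle (hypothetical until implemented):
  # keep writing section-bloom data for every block.
  writeSectionBloom = true
}

node.jsonrpc {
  # Interface toggle: JSON-RPC is switched off for now,
  # without interrupting section-bloom writes.
  httpFullNodeEnable = false
}
```

A resource-constrained node (the second use case) would instead set writeSectionBloom = false.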

Specification

Candidate Solutions Comparison

| Solution | Description | Pros | Cons |
| --- | --- | --- | --- |
| Solution 1: Unconditional Writes | Remove all write condition checks; write section-bloom data unconditionally | Simple implementation (minimal code changes); 100% data integrity | No write disable option (no choice for storage-sensitive users); violates config controllability |
| Solution 2: New Independent Config | storage.writeSectionBloom (data toggle) | Fully decouple data and interface control; user freedom to enable/disable writes; unified control for eth_getFilterChanges and eth_getLogs | New config affects eth_getFilterChanges; missing historical data requires full sync recovery |
| Solution 3: Hybrid Config | storage.writeSectionBloom + legacy control | Decouple persistent data from interface control; granular write control; compatible with legacy interface logic for eth_getFilterChanges | Code redundancy; overly complex config granularity |

Test Specification

| Test Scenario | Solution 1 | Solution 2 | Solution 3 |
| --- | --- | --- | --- |
| Default configuration | Continuous writes, interface OK | Same | Same |
| Disable writes (if applicable) | N/A | section-bloom writes halted; eth_getLogs returns existing data; eth_getFilterChanges returns no data | section-bloom writes halted; eth_getLogs returns existing data; eth_getFilterChanges returns data |
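
The expected outcomes in the test specification can be written down as a small decision function, which may be handy as an oracle when writing the actual tests. This is a sketch, not project code: the class, enum, and method names are hypothetical, and for Solution 3 the legacy interface toggle is assumed to be enabled.

```java
class SectionBloomScenarios {
  enum Solution { UNCONDITIONAL, INDEPENDENT_CONFIG, HYBRID_CONFIG }

  /** Whether section-bloom data is written for newly processed blocks. */
  static boolean writesSectionBloom(Solution solution, boolean writeSectionBloom) {
    // Solution 1 has no toggle, so the flag is ignored and writes always happen.
    return solution == Solution.UNCONDITIONAL || writeSectionBloom;
  }

  /** Whether eth_getFilterChanges returns data for newly processed blocks. */
  static boolean filterChangesReturnsData(Solution solution, boolean writeSectionBloom) {
    // Solution 2: the new data toggle also governs eth_getFilterChanges.
    // Solutions 1 and 3: unaffected by the data toggle
    // (Solution 3 keeps it under the legacy control, assumed enabled here).
    return solution != Solution.INDEPENDENT_CONFIG || writeSectionBloom;
  }

  /** eth_getLogs keeps serving data that was already written, in every scenario. */
  static boolean getLogsReturnsExistingData(Solution solution, boolean writeSectionBloom) {
    return true;
  }
}
```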

Scope Of Impact

  1. Affected Modules:
    • Block processing pipeline (BlockProcessor)
    • LevelDB storage (SectionBloomStore)
    • JSON-RPC interface layer (EthApi)
  2. Affected Interfaces:
    • eth_getLogs
    • eth_getFilterLogs
    • eth_getFilterChanges

Implementation

Approach

Recommended: Solution 2 (New independent config), with steps:

  1. Code Decoupling

    • Separate data write logic from interface control

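    • Gate the write on the new data toggle only, independent of the JSON-RPC toggle:

      // Build and persist section-bloom data only when the data toggle is on.
      if (CommonParameter.getInstance().writeSectionBloom()) {
        Bloom blockBloom = chainBaseManager.getSectionBloomStore()
            .initBlockSection(transactionRetCapsule);
        chainBaseManager.getSectionBloomStore().write(block.getNum());
        // The block bloom is set inside the same branch, so it is skipped when writes are off.
        block.setBloom(blockBloom);
      }
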
  2. Documentation Updates

    • Explicitly state: Disabling the data toggle mid-operation requires full block sync to recover missing historical data.
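
To make the intended behaviour concrete, here is a self-contained toy model of the two independent toggles. It is a sketch under simplifying assumptions, not the proposed implementation: InMemoryBloomStore is a hypothetical stand-in for SectionBloomStore that only records which block numbers were indexed, and the field names merely mirror the config keys.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

class DecoupledBloomDemo {
  /** Hypothetical stand-in for SectionBloomStore: tracks indexed block numbers. */
  static final class InMemoryBloomStore {
    private final TreeSet<Long> indexed = new TreeSet<>();

    void write(long blockNum) {
      indexed.add(blockNum);
    }
  }

  private final InMemoryBloomStore store = new InMemoryBloomStore();
  boolean writeSectionBloom = true;   // data toggle (proposed storage.writeSectionBloom)
  boolean httpFullNodeEnable = true;  // interface toggle (node.jsonrpc.httpFullNodeEnable)

  /** Block processing consults only the data toggle. */
  void processBlock(long blockNum) {
    if (writeSectionBloom) {
      store.write(blockNum);
    }
  }

  /** Query path consults only the interface toggle; returns indexed blocks in [from, to]. */
  List<Long> queryIndexedBlocks(long from, long to) {
    if (!httpFullNodeEnable) {
      throw new IllegalStateException("JSON-RPC interface disabled");
    }
    return new ArrayList<>(store.indexed.subSet(from, true, to, true));
  }

  /** Blocks in [from, to] with no index data, i.e. the gap left while writes were off. */
  List<Long> missingBlocks(long from, long to) {
    List<Long> missing = new ArrayList<>();
    for (long n = from; n <= to; n++) {
      if (!store.indexed.contains(n)) {
        missing.add(n);
      }
    }
    return missing;
  }
}
```

In this model, a block processed while only the interface is disabled is still indexed, whereas a block processed while writeSectionBloom is false leaves a permanent gap that only a resync could fill.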

Are you willing to implement this feature?

Yes. Willing to lead development.

Questions for community discussion:

  1. Solution Selection: Do you agree with recommending Solution 2?
  2. Default Value: Should storage.writeSectionBloom default to true (data priority) or false (resource priority)?

Status

Done