feat: Add balance validation to wrapped block validator #2184

Open
rockysingh wants to merge 12 commits into hiero-ledger:main from rockysingh:2173-add-balance-validator-for-wrap

@rockysingh rockysingh commented Feb 10, 2026

Summary

  • Add optional balance file validation to the wrapped block validator
  • Compare computed account balances against signed protobuf balance files from GCP mainnet bucket
  • Verify balance file signatures using address book (>1/3 node signatures required)
  • Update balance tracking to include amendments (genesis and missing transactions)
  • Add fetchBalanceCheckpoints command to pre-compile balance checkpoints into compressed resource files
  • Add lastMerkleLeaf.bin output to wrap command for quick CN access to final block hash
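
As a rough illustration of the ">1/3 node signatures" rule in the summary, the sketch below computes the smallest signature count that strictly exceeds one third of the nodes. It assumes every node in the address book counts equally, which may not match the PR's actual logic; `SignatureThreshold` and its methods are hypothetical names, not code from this PR.

```java
/** Hypothetical helper for illustration only; not taken from the PR. */
final class SignatureThreshold {
    private SignatureThreshold() {}

    /** Smallest signature count that is strictly greater than one third of totalNodes. */
    static int requiredSignatures(int totalNodes) {
        if (totalNodes <= 0) {
            throw new IllegalArgumentException("totalNodes must be positive");
        }
        return totalNodes / 3 + 1;
    }

    /** True when validSignatures is strictly more than one third of totalNodes. */
    static boolean meetsThreshold(int validSignatures, int totalNodes) {
        // Multiply instead of dividing so integer truncation cannot hide a shortfall.
        return 3L * validSignatures > totalNodes;
    }
}
```

With 9 nodes, 3 signatures are exactly one third and do not pass; a fourth signature is needed.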

Description

This adds per-account balance verification in addition to the existing 50 billion HBAR supply check. Until October 2023, balance files were published roughly every 15 minutes, each containing a signed snapshot of all account balances. By comparing our computed balances against these signed files at checkpoint timestamps, we can verify that transactions are being processed correctly.

New CLI options for validate-wrapped:

| Option | Description |
| --- | --- |
| --validate-balances | Enable balance validation |
| --balance-check-interval-days | How often to validate (default: 30 days) |
| --balance-checkpoints | Path to balance checkpoints file or directory |
| --address-book | Path to address book history JSON for signature verification |
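
To illustrate how a check interval might thin out the available checkpoints, here is a minimal sketch. It assumes checkpoints are identified by block number and that a day is roughly 20,000 blocks (the figure quoted in this PR's commit messages). `CheckpointIntervalFilter` and `select` are hypothetical names and are not the PR's `BalanceCheckpointValidator` API.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration only; not taken from the PR. */
final class CheckpointIntervalFilter {
    /** Assumed approximation of blocks produced per day. */
    static final long APPROX_BLOCKS_PER_DAY = 20_000L;

    private CheckpointIntervalFilter() {}

    /**
     * Keeps the first checkpoint, then each later checkpoint that is at least
     * intervalDays worth of blocks past the previously kept one.
     * The input must be sorted in ascending block order.
     */
    static List<Long> select(List<Long> sortedCheckpointBlocks, int intervalDays) {
        if (intervalDays <= 0) {
            throw new IllegalArgumentException("intervalDays must be positive");
        }
        final long minGap = intervalDays * APPROX_BLOCKS_PER_DAY;
        final List<Long> selected = new ArrayList<>();
        long lastKept = 0L;
        for (long block : sortedCheckpointBlocks) {
            if (selected.isEmpty() || block - lastKept >= minGap) {
                selected.add(block);
                lastKept = block;
            }
        }
        return selected;
    }
}
```

A larger interval means fewer comparisons but a longer stretch of blocks to re-examine when a mismatch is found.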

New command: blocks fetchBalanceCheckpoints

Fetches balance files from GCP, verifies signatures, and compiles them into a compressed resource file for offline validation:

| Option | Description |
| --- | --- |
| --interval-days | Checkpoint interval (7 for weekly, 30 for monthly) |
| --start-day / --end-day | Date range to fetch |
| --skip-signatures | Skip signature verification |

Pre-compiled resource files:

  • balance_checkpoints_monthly.zstd - 32 checkpoints (~14MB) for monthly validation
  • balance_checkpoints_weekly.zstd - 136 checkpoints (~20MB) for weekly validation
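
A later commit message in this PR describes the checkpoint layout as length-prefixed protobufs: `[blockNumber (8 bytes)][length (4 bytes)][raw protobuf bytes]`. The sketch below shows how such records could be split apart once the file has already been zstd-decompressed into a byte array. Big-endian byte order is an assumption, and `CheckpointRecords`, `Entry`, and `readAll` are hypothetical names rather than the PR's `BalanceCheckpointsLoader` API.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical reader for illustration only; not taken from the PR. */
final class CheckpointRecords {
    /** One checkpoint: the block number plus the still-encoded balance protobuf. */
    record Entry(long blockNumber, byte[] protobufBytes) {}

    private CheckpointRecords() {}

    /** Splits already-decompressed bytes into entries; assumes big-endian fields. */
    static List<Entry> readAll(byte[] decompressed) {
        // ByteBuffer reads big-endian by default.
        final ByteBuffer buffer = ByteBuffer.wrap(decompressed);
        final List<Entry> entries = new ArrayList<>();
        while (buffer.hasRemaining()) {
            // A truncated header raises BufferUnderflowException here.
            final long blockNumber = buffer.getLong();
            final int length = buffer.getInt();
            if (length < 0 || length > buffer.remaining()) {
                throw new IllegalStateException("Corrupt record length: " + length);
            }
            final byte[] payload = new byte[length];
            buffer.get(payload);
            entries.add(new Entry(blockNumber, payload));
        }
        return entries;
    }
}
```

Keeping the payload as raw bytes leaves protobuf parsing to whichever parser the caller prefers.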

lastMerkleLeaf.bin

The wrap command now outputs lastMerkleLeaf.bin containing the final block number (8 bytes) and block hash (48 bytes, SHA-384), allowing a consensus node (CN) to quickly access the latest block hash without loading the full merkle tree. The value is tracked in memory during processing and written once at the end (or on shutdown) for efficiency.
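
The 56-byte layout described above could be encoded and decoded roughly as follows. This is a sketch under assumptions: the block number is taken to come first and to be big-endian, neither of which is stated in the description, and `LastMerkleLeaf` is a hypothetical class name rather than code from `ToWrappedBlocksCommand.java`.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

/** Hypothetical codec for illustration only; not taken from the PR. */
final class LastMerkleLeaf {
    /** SHA-384 digests are 48 bytes long. */
    static final int HASH_LENGTH = 48;
    /** 8-byte block number followed by the 48-byte hash. */
    static final int FILE_LENGTH = Long.BYTES + HASH_LENGTH;

    private LastMerkleLeaf() {}

    /** Builds the file contents; assumes big-endian, block number first. */
    static byte[] encode(long blockNumber, byte[] blockHash) {
        if (blockHash.length != HASH_LENGTH) {
            throw new IllegalArgumentException("Expected a 48-byte SHA-384 hash");
        }
        return ByteBuffer.allocate(FILE_LENGTH).putLong(blockNumber).put(blockHash).array();
    }

    static long decodeBlockNumber(byte[] fileBytes) {
        requireLength(fileBytes);
        return ByteBuffer.wrap(fileBytes).getLong();
    }

    static byte[] decodeBlockHash(byte[] fileBytes) {
        requireLength(fileBytes);
        return Arrays.copyOfRange(fileBytes, Long.BYTES, FILE_LENGTH);
    }

    private static void requireLength(byte[] fileBytes) {
        if (fileBytes.length != FILE_LENGTH) {
            throw new IllegalArgumentException("Expected exactly " + FILE_LENGTH + " bytes");
        }
    }
}
```

A reader would pass the result of `java.nio.file.Files.readAllBytes(path)` to the decode methods, so no merkle tree state is needed to recover the latest hash.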

Files added/modified:

  • BalanceFileBucket.java - Downloads balance files from GCP bucket
  • BalanceProtobufParser.java - Manual protobuf parser to bypass PBJ's 2M account limit
  • BalanceCheckpointsLoader.java - Loads pre-compiled checkpoint files
  • FetchBalanceCheckpointsCommand.java - CLI command to fetch and compile checkpoints
  • ValidateWrappedBlocksCommand.java - Added CLI options for balance validation
  • WrappedBlockValidator.java - Fixed to process amendments in balance tracking
  • ToWrappedBlocksCommand.java - Added lastMerkleLeaf.bin output

Limitations

Balance files are available from September 2019 through October 23, 2023. Post-October 2023 validation will be addressed in a follow-up PR.

Test plan

  • Added unit tests for balance comparison logic
  • Added unit tests for timestamp formatting/parsing
  • Added tests for signature threshold calculations
  • All 1061 tests pass

This is partial work for: #2173

@rockysingh rockysingh self-assigned this Feb 10, 2026
@rockysingh rockysingh added the Block Node Tools Additional tools related to, but not part of, the Block Node label Feb 10, 2026
@rockysingh rockysingh added this to the 0.28.0 milestone Feb 10, 2026
@rockysingh rockysingh force-pushed the 2173-add-balance-validator-for-wrap branch from 752d5b0 to 6b30bc3 Compare February 10, 2026 23:01
@rockysingh rockysingh marked this pull request as ready for review February 11, 2026 00:49
@rockysingh rockysingh requested review from a team as code owners February 11, 2026 00:49
jsync-swirlds previously approved these changes Feb 11, 2026
Contributor

@jsync-swirlds jsync-swirlds left a comment

Looks good, one note regarding var usage for future reference.

@Option(
names = {"--validate-balances"},
description = "Enable validation of account balances against CSV balance files from GCP")
private boolean validateBalances = false;
Contributor

I feel like we should make this true by default. We should also add a local cache for downloaded balance files, so we only download once and can validate many times.

Contributor Author

fixed

@Option(
names = {"--balance-start-day"},
description = "Start day for balance validation in format YYYY-MM-DD (e.g., 2019-09-13)")
private String balanceStartDay;
Contributor

Does this only affect balance verification? I think it is not needed. We should validate the complete date range. We could add start/end date options for all validation. For balance files we can hard-code the first and last available dates for mainnet. We might need to add a network config property.

Contributor Author

fixed

@Option(
names = {"--gcp-project"},
description = "GCP project for requester-pays bucket access")
private String gcpProject;
Contributor

This should pick up from an environment variable by default, like we do elsewhere.

Contributor Author

fixed

names = {"--verify-signatures"},
description = "Verify balance file signatures (requires --address-book)",
defaultValue = "false")
private boolean verifySignatures;
Contributor

This should be true always I think. It is key to the trust of the data and should not be slow. Especially once balance files are downloaded.

Contributor Author

fixed

names = {"--cache-dir"},
description = "Directory for caching downloaded files",
defaultValue = "data/gcp-cache")
private Path cacheDir;
Contributor

I wonder if we should do some special caching. The reason I say that is I now have a tool for converting newer saved states into balance files. I would like to be able to drop them into a directory we could use, or maybe a separate dir for custom downloaded files. I have, for example, accountBalances_89270840.pb.gz and accountBalances_91019204.pb.gz. The numbers in the file names are the block numbers they are for. We will not have signatures for them, so we can't do that part of validation, but the tool that converted them checked the saved state signatures.

Contributor Author

fixed

* @param block the block to extract timestamp from
* @return the block timestamp as Instant, or null if not found
*/
private static Instant extractBlockTimestamp(Block block) {
Contributor

Seems like a good util method for our TimeUtils class.

Contributor Author

done

@Parameters(index = "0..1", description = "Block files, directories, or zip archives to process")
private File[] files;

@Option(
Contributor

It would be good to have an option to pick the granularity for how often we check balance files. I am thinking in days, maybe every 7 for once a week. I just want to balance the amount of downloads vs. time to failure. Seems like a week or even a month would be fine.

Contributor Author

I've changed it to a month for now.

@rockysingh rockysingh force-pushed the 2173-add-balance-validator-for-wrap branch from 6b30bc3 to 4437c02 Compare February 12, 2026 18:39
@rockysingh rockysingh force-pushed the 2173-add-balance-validator-for-wrap branch from f6e1868 to f55a284 Compare February 13, 2026 22:19
@rockysingh rockysingh force-pushed the 2173-add-balance-validator-for-wrap branch from f9a074d to 964cfca Compare February 18, 2026 15:49
jsync-swirlds previously approved these changes Feb 19, 2026
Contributor

@jsync-swirlds jsync-swirlds left a comment

LGTM

* </ul>
*/
@SuppressWarnings("CyclomaticComplexity")
public class BalanceProtobufParser {
Contributor

You should not need to parse manually, as you can just pass the max size into the PBJ parse method. We will also need to parse the tokens as well as balances.

Contributor Author

Fixed.

* @param amendments the amendment items (missing transactions) to merge in
* @return a new list containing all items sorted by consensus timestamp
*/
private static List<RecordStreamItem> mergeRecordStreamItems(
Contributor

This is not right, the amendments can replace transactions in the original file as well as add new transactions.

Contributor Author

Fixed.
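
For readers following this thread, the replace-or-add behaviour can be sketched with a map keyed by consensus timestamp: an amendment with a timestamp that already exists overwrites the original item, and any other amendment is added in order. Matching on timestamp alone is an assumption, and `AmendmentMerge` and `Item` are hypothetical stand-ins for the PR's `mergeRecordStreamItems` and `RecordStreamItem`.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical merge for illustration only; not taken from the PR. */
final class AmendmentMerge {
    /** Simplified stand-in for a record stream item. */
    record Item(Instant consensusTimestamp, String payload) {}

    private AmendmentMerge() {}

    /** Returns all items sorted by consensus timestamp, with amendments taking precedence. */
    static List<Item> merge(List<Item> originals, List<Item> amendments) {
        // TreeMap keeps keys sorted, so the result comes out in timestamp order.
        final Map<Instant, Item> byTimestamp = new TreeMap<>();
        for (Item item : originals) {
            byTimestamp.put(item.consensusTimestamp(), item);
        }
        // Inserting amendments second means a matching timestamp replaces the original.
        for (Item item : amendments) {
            byTimestamp.put(item.consensusTimestamp(), item);
        }
        return new ArrayList<>(byTimestamp.values());
    }
}
```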

* @param dayPrefix the day prefix in format "YYYY-MM-DD" (e.g., "2019-09-13")
* @return list of Instant timestamps for available balance files
*/
public List<Instant> listBalanceTimestampsForDay(String dayPrefix) {
Contributor

Feels like we should be able to target midnight balance file more directly than doing a list operation for every day needed.

Contributor Author

Fixed.

* <p>Also supports loading custom balance files in the format
* {@code accountBalances_{blockNumber}.pb.gz} from a directory.
*/
public class BalanceCheckpointsLoader {
Contributor

Feels like we should just be able to use balance file protobuf format. Not sure why we need another file format. Also this doesn't cover tokens.

Contributor Author

There was an issue with the file size. Let me see if I can get around it.

Contributor Author

This has been addressed.

@rockysingh rockysingh force-pushed the 2173-add-balance-validator-for-wrap branch 7 times, most recently from f040b08 to f19be68 Compare February 25, 2026 17:55
Contributor

@jsync-swirlds jsync-swirlds left a comment

Somehow we seem to have added the giant list of cache files here too.
I recommend adding data/gcp-cache to the .gitignore file to prevent adding these again.

   Add optional balance file validation that compares computed account
   balances against signed protobuf balance files from GCP mainnet bucket.
   This provides per-account verification in addition to the existing
   50 billion HBAR supply check.

   - Add BalanceFileBucket to download balance files from GCP
   - Add BalanceCsvValidator to compare balances at checkpoints
   - Add signature verification for balance files using address book
   - Update WrappedBlockValidator to process amendments in balance tracking
   - Add CLI options: --validate-balances, --balance-start-day,
     --balance-end-day, --address-book, --verify-signatures, --gcp-project

   Balance files are available from September 2019 through October 23, 2023

Signed-off-by: Rocky Thind <harpender.t@swirldslabs.com>
    (default: 30 days/monthly) to control how often balance checkpoints
    are validated
  - Add setCheckIntervalDays() to BalanceCheckpointValidator with
    filtering logic based on ~20,000 blocks/day
  - Update README.md command tree with fetchBalanceCheckpoints and
    validate-wrapped commands
  - Add comprehensive documentation for fetchBalanceCheckpoints command
    including options, prerequisites, output format, and examples
  - Update validate-wrapped documentation with new balance validation
    options and clarify relationship between fetch and validation intervals

Signed-off-by: Rocky Thind <harpender.t@swirldslabs.com>
  Address PR review comments:

  - Rename BalanceCsvValidator to BalanceProtobufValidator (downloads .pb.gz files)
  - Change checkpoint file format to length-prefixed protobufs:
    [blockNumber (8 bytes)][length (4 bytes)][raw protobuf bytes]
  - Add token balance support via BalanceProtobufParser.parseWithTokens()
  - Add direct midnight targeting in BalanceFileBucket.downloadMidnightBalanceFile()
  - Update mergeRecordStreamItems to handle replacements (same timestamp = replace)
  - Remove unused BalanceProtobufParser.parseAndWrite() method

  Note: Existing .zstd checkpoint files need regeneration with new format.

Signed-off-by: Rocky Thind <harpender.t@swirldslabs.com>
Remove validateCheckpoint(HBAR-only) overloads and parseProtobufBalances
method that are no longer called after token balance support was added.

Signed-off-by: Rocky Thind <harpender.t@swirldslabs.com>
@rockysingh rockysingh force-pushed the 2173-add-balance-validator-for-wrap branch from f19be68 to 8e01a28 Compare February 27, 2026 01:22
@rockysingh rockysingh requested a review from a team as a code owner February 27, 2026 01:23
Remove 128 cached GCS balance files that were accidentally committed.

Signed-off-by: Rocky Thind <harpender.t@swirldslabs.com>
Prevent cached GCS balance files from being accidentally committed.

Signed-off-by: Rocky Thind <harpender.t@swirldslabs.com>
@rockysingh rockysingh force-pushed the 2173-add-balance-validator-for-wrap branch from 65cbeb5 to 56bc4ae Compare February 27, 2026 01:26

codecov bot commented Feb 27, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

@@             Coverage Diff              @@
##               main    #2184      +/-   ##
============================================
- Coverage     81.08%   80.96%   -0.12%     
+ Complexity     1463     1460       -3     
============================================
  Files           139      139              
  Lines          6757     6757              
  Branches        727      727              
============================================
- Hits           5479     5471       -8     
- Misses          957      968      +11     
+ Partials        321      318       -3     

see 2 files with indirect coverage changes

@AlfredoG87 AlfredoG87 modified the milestones: 0.28.0, 0.29.0 Feb 27, 2026

Labels

Block Node Tools Additional tools related to, but not part of, the Block Node

Projects

Status: Todo

4 participants