feat: wrapped block validator token tracing for post october 2023 with saved states #2265
rockysingh wants to merge 13 commits into hiero-ledger:main
You're pushing a large number of large binary files in this PR that appear to be cached data, not code.
That was not intentional. I thought I had removed them, but I guess I did not stage the changes correctly. I've removed them now.
Add optional balance file validation that compares computed account
balances against signed protobuf balance files from GCP mainnet bucket.
This provides per-account verification in addition to the existing
50 billion HBAR supply check.
- Add BalanceFileBucket to download balance files from GCP
- Add BalanceCsvValidator to compare balances at checkpoints
- Add signature verification for balance files using address book
- Update WrappedBlockValidator to process amendments in balance tracking
- Add CLI options: --validate-balances, --balance-start-day,
--balance-end-day, --address-book, --verify-signatures, --gcp-project
Balance files are available from September 2019 through October 23, 2023.
Signed-off-by: Rocky Thind <[email protected]>
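To illustrate the difference between the two checks described in this commit message, here is a minimal sketch, not the PR's actual implementation. The class `BalanceComparisonSketch` and its methods are hypothetical, accounts are simplified to a single `long` account number, and an account missing from the computed state is assumed to hold zero. The only facts taken as given are the 50 billion HBAR supply and the conversion of 1 HBAR to 100,000,000 tinybars.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch; the real validator classes in the PR may look quite different. */
final class BalanceComparisonSketch {
    /** 1 HBAR = 100,000,000 tinybars. */
    static final long TINYBARS_PER_HBAR = 100_000_000L;
    /** 50 billion HBAR in tinybars (5 * 10^18, which fits in a signed 64-bit long). */
    static final long TOTAL_SUPPLY_TINYBARS = 50_000_000_000L * TINYBARS_PER_HBAR;

    /** Aggregate check: all balances must sum to exactly the fixed supply. */
    static boolean supplyMatches(Map<Long, Long> computed) {
        long sum = 0L;
        for (long balance : computed.values()) {
            sum = Math.addExact(sum, balance); // throws ArithmeticException on overflow
        }
        return sum == TOTAL_SUPPLY_TINYBARS;
    }

    /** Per-account check: report each account in the balance file whose computed balance differs. */
    static List<String> findMismatches(Map<Long, Long> computed, Map<Long, Long> expected) {
        List<String> mismatches = new ArrayList<>();
        // TreeMap gives a stable report order sorted by account number.
        for (Map.Entry<Long, Long> entry : new TreeMap<>(expected).entrySet()) {
            // Assumption: an account absent from the computed state holds zero.
            long actual = computed.getOrDefault(entry.getKey(), 0L);
            if (actual != entry.getValue()) {
                mismatches.add("account " + entry.getKey()
                        + ": expected " + entry.getValue() + ", computed " + actual);
            }
        }
        return mismatches;
    }
}
```

The supply check can pass even when two accounts have offsetting errors, which is the gap the per-account comparison closes. This sketch only reports accounts that appear in the balance file; accounts present only in the computed state would need a second pass.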
Signed-off-by: Rocky Thind <[email protected]>
(default: 30 days/monthly) to control how often balance checkpoints
are validated
- Add setCheckIntervalDays() to BalanceCheckpointValidator with
filtering logic based on ~20,000 blocks/day
- Update README.md command tree with fetchBalanceCheckpoints and
validate-wrapped commands
- Add comprehensive documentation for fetchBalanceCheckpoints command
including options, prerequisites, output format, and examples
- Update validate-wrapped documentation with new balance validation
options and clarify relationship between fetch and validation intervals
Signed-off-by: Rocky Thind <[email protected]>
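The interval filtering mentioned above could work roughly as follows. This is a sketch under stated assumptions, not the code behind `setCheckIntervalDays()`: the class `CheckpointIntervalSketch`, the method `filterByInterval`, and the "keep a checkpoint once enough blocks have passed since the last kept one" rule are all hypothetical. Only the ~20,000 blocks/day estimate and the 30-day default come from the commit message.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of interval-based checkpoint filtering. */
final class CheckpointIntervalSketch {
    /** Approximate block rate quoted in the commit message. */
    static final long APPROX_BLOCKS_PER_DAY = 20_000L;

    /**
     * Keeps the first checkpoint, then every later checkpoint that is at least
     * intervalDays worth of blocks past the previously kept one.
     * Assumes the input list is sorted by ascending block number.
     */
    static List<Long> filterByInterval(List<Long> sortedCheckpointBlocks, int intervalDays) {
        if (intervalDays <= 0) {
            throw new IllegalArgumentException("intervalDays must be positive: " + intervalDays);
        }
        long minGapBlocks = intervalDays * APPROX_BLOCKS_PER_DAY;
        List<Long> kept = new ArrayList<>();
        long lastKept = 0L;
        for (long block : sortedCheckpointBlocks) {
            if (kept.isEmpty() || block - lastKept >= minGapBlocks) {
                kept.add(block);
                lastKept = block;
            }
        }
        return kept;
    }
}
```

With weekly checkpoints (one every ~140,000 blocks) and the 30-day default (a ~600,000 block gap), a filter like this would validate only about every fifth checkpoint, which is why the fetch interval and the validation interval can be set independently.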
Signed-off-by: Rocky Thind <[email protected]>
Signed-off-by: Rocky Thind <[email protected]>
Signed-off-by: Rocky Thind <[email protected]>
Signed-off-by: Rocky Thind <[email protected]>
Address PR review comments:
- Rename BalanceCsvValidator to BalanceProtobufValidator (downloads .pb.gz files)
- Change checkpoint file format to length-prefixed protobufs:
[blockNumber (8 bytes)][length (4 bytes)][raw protobuf bytes]
- Add token balance support via BalanceProtobufParser.parseWithTokens()
- Add direct midnight targeting in BalanceFileBucket.downloadMidnightBalanceFile()
- Update mergeRecordStreamItems to handle replacements (same timestamp = replace)
- Remove unused BalanceProtobufParser.parseAndWrite() method
Note: Existing .zstd checkpoint files need regeneration with new format.
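For readers unfamiliar with the length-prefixed layout described above, here is a minimal read/write sketch of `[blockNumber (8 bytes)][length (4 bytes)][raw protobuf bytes]` records. It is illustrative only: `CheckpointFramingSketch` and `Checkpoint` are hypothetical names, big-endian byte order is an assumption (it is what `DataOutputStream` produces, but the commit message does not state it), and the sketch works on an already decompressed stream, so zstd handling is left out.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the length-prefixed checkpoint framing. */
final class CheckpointFramingSketch {
    /** One record: the block number plus the untouched protobuf payload. */
    record Checkpoint(long blockNumber, byte[] protobufBytes) {}

    /** Writes each record as 8-byte block number, 4-byte length, then the raw bytes. */
    static void write(OutputStream out, List<Checkpoint> checkpoints) throws IOException {
        DataOutputStream data = new DataOutputStream(out);
        for (Checkpoint checkpoint : checkpoints) {
            data.writeLong(checkpoint.blockNumber());        // 8 bytes, big-endian (assumed)
            data.writeInt(checkpoint.protobufBytes().length); // 4 bytes, big-endian (assumed)
            data.write(checkpoint.protobufBytes());           // payload, not parsed here
        }
        data.flush();
    }

    /** Reads records until the stream ends. */
    static List<Checkpoint> read(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        List<Checkpoint> result = new ArrayList<>();
        while (true) {
            long blockNumber;
            try {
                blockNumber = data.readLong();
            } catch (EOFException endOfStream) {
                // Simplification: a truncated block number is also treated as end of stream.
                break;
            }
            int length = data.readInt();
            if (length < 0) {
                throw new IOException("Negative record length: " + length);
            }
            byte[] payload = new byte[length];
            data.readFully(payload); // throws EOFException if the payload is truncated
            result.add(new Checkpoint(blockNumber, payload));
        }
        return result;
    }
}
```

Because the payload is stored as raw bytes, the reader can skip or index checkpoints by block number without parsing any protobuf, and the parser only runs for the checkpoints that are actually validated.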
Signed-off-by: Rocky Thind <[email protected]>
Signed-off-by: Rocky Thind <[email protected]>
Remove validateCheckpoint (HBAR-only) overloads and parseProtobufBalances method
that are no longer called after token balance support was added.
Signed-off-by: Rocky Thind <[email protected]>
Signed-off-by: Jasper Potts <[email protected]>
…ests
Add loadFromGzippedStream to BalanceCheckpointsLoader and
BalanceCheckpointValidator to support loading individual gzipped
protobuf balance files. Extend ValidateWrappedBlocksCommand to
auto-load bundled checkpoint resources (accountBalances_91019204.pb.gz)
at startup. Add unit tests for both loader and validator covering
HBAR/token balance loading, checkpoint querying, interval filtering,
range checks, and validation pass/fail paths.
Signed-off-by: Rocky Thind <[email protected]>
Signed-off-by: Rocky Thind <[email protected]>
jsync-swirlds
left a comment
Just a few more large (20-50M) files to double check if we really need them.
tools-and-tests/tools/src/main/resources/metadata/accountBalances_91019204.pb.gz
tools-and-tests/tools/src/main/resources/metadata/balance_checkpoints_monthly.zstd
This file is created using a command, but that command can be very flaky to run because of GCP, and it gets even worse at a weekly granularity. This is why I've pushed the files to the repo as a source of truth.
Could we store it separately and load/download as needed?
It is a lot of data to store in git, and that data gets copied to every clone forever.
I understand if it's actually needed, and we can perhaps fix later if necessary, but it would be nice to not store these binaries if we can reasonably avoid it.
Codecov Report
✅ All modified and coverable lines are covered by tests.

@@             Coverage Diff              @@
##               main    #2265      +/-   ##
============================================
- Coverage     80.96%   80.89%   -0.08%
- Complexity     1445     1448       +3
============================================
  Files           139      139
  Lines          6730     6730
  Branches        726      726
============================================
- Hits           5449     5444       -5
- Misses          962      964       +2
- Partials        319      322       +3

see 2 files with indirect coverage changes
Summary
- FetchBalanceCheckpointsCommand (blocks fetchBalanceCheckpoints) to download and bundle weekly/monthly balance snapshots from the mainnet GCP bucket into length-prefixed protobuf checkpoint files
- BalanceCheckpointValidator and BalanceProtobufValidator to compare computed running account/token balances against checkpoint snapshots at configurable intervals (--balance-check-interval-days, default 30)
- BalanceFileBucket utility for downloading midnight-targeted .pb.gz balance files from GCP with signature verification support
- RunningAccountsState to track per-account HBAR and token balances across blocks, processing amendments (crypto transfers, token mints/burns/wipes/transfers)
- Bundled checkpoint (accountBalances_91019204.pb.gz) so validate-wrapped works out of the box without a separate fetch step
- Options for validate-wrapped: --validate-balances, --balance-checkpoints, --custom-balances-dir, --balance-check-interval-days
- Unit tests for BalanceCheckpointValidator, BalanceCheckpointsLoader, BalanceProtobufValidator, and BalanceFileBucket
- README documentation for the fetchBalanceCheckpoints and validate-wrapped commands
Notes
The bundled accountBalances_91019204.pb.gz checkpoint was generated using the standalone hiero-state-to-balance-file project, which extracts account/token balances from a consensus node saved state at block 91,019,204. This file provides a known-good starting point for balance validation.
There is a potential divergence issue we need to raise: hiero-state-to-balance-file is a separate standalone project, not part of the hiero-block-node repo, and the generated .pb.gz file was copied over. If that project's protobuf format, extraction logic, or Hiero dependency version changes, the bundled file could become stale or incompatible, and the checkpoint will need to be regenerated.
Test plan
- blocks validate-wrapped runs with default balance validation enabled (uses bundled checkpoint)
- blocks fetchBalanceCheckpoints downloads and bundles balance files from GCP
- --no-validate-balances disables balance checking
- --balance-check-interval-days controls checkpoint frequency
- --custom-balances-dir loads balance files from a local directory
- Unit tests: BalanceCheckpointValidatorTest, BalanceCheckpointsLoaderTest, BalanceProtobufValidatorTest, BalanceFileBucketTest
Also rebased on PR #2184, which was needed to work on this issue.
This closes: #2173 and #2212