Conversation

fdefelici
Contributor

Description

This PR adds block MARF computation to the AAC test harness, so that we no longer need to pass it as an input.

I tried different ways to do it (even using an ephemeral implementation), but in the end the one that worked best was to write to the test chainstate, retrieve the root hash, and then roll back the MARF transaction.
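
As a rough illustration of the write, read root, roll back pattern described above, here is a minimal sketch. It does not use the real MARF or chainstate API: `ToyStore`, `ToyTx`, and `compute_root_then_rollback` are hypothetical names, and the "root" is a plain hash of the key/value view rather than a trie root.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Toy stand-in for the test chainstate (hypothetical, NOT the real MARF API).
#[derive(Default)]
struct ToyStore {
    committed: BTreeMap<String, String>,
}

/// An open transaction holding uncommitted writes on top of the store.
struct ToyTx<'a> {
    store: &'a mut ToyStore,
    pending: BTreeMap<String, String>,
}

impl ToyStore {
    fn begin(&mut self) -> ToyTx<'_> {
        ToyTx { store: self, pending: BTreeMap::new() }
    }
}

impl<'a> ToyTx<'a> {
    fn put(&mut self, key: &str, value: &str) {
        self.pending.insert(key.to_string(), value.to_string());
    }

    /// Hash of committed + pending state (assumption: stands in for the root hash).
    fn root_hash(&self) -> u64 {
        let mut view = self.store.committed.clone();
        view.extend(self.pending.clone());
        let mut hasher = DefaultHasher::new();
        view.hash(&mut hasher);
        hasher.finish()
    }

    /// Discard the pending writes; the store is left untouched.
    fn rollback(self) {}

    /// Apply the pending writes to the store.
    fn commit(self) {
        let ToyTx { store, pending } = self;
        store.committed.extend(pending);
    }
}

/// Apply the writes, read the resulting root, then undo everything.
fn compute_root_then_rollback(store: &mut ToyStore, writes: &[(&str, &str)]) -> u64 {
    let mut tx = store.begin();
    for (key, value) in writes {
        tx.put(key, value);
    }
    let root = tx.root_hash();
    tx.rollback();
    root
}
```

Because the transaction is rolled back, calling the helper twice with the same writes yields the same root and leaves the store unchanged, which is the property the harness relies on.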

Applicable issues

  • fixes #

Additional info (benefits, drawbacks, caveats)

Checklist

  • Test coverage for new or modified code paths
  • Changelog is updated
  • Required documentation changes (e.g., docs/rpc/openapi.yaml and rpc-endpoints.md for v2 endpoints, event-dispatcher.md for new events)
  • New clarity functions have corresponding PR in clarity-benchmarking repo
  • New integration test(s) added to bitcoin-tests.yml

@fdefelici fdefelici requested review from Jiloc, jferrant and kantai October 3, 2025 14:23
@fdefelici fdefelici self-assigned this Oct 3, 2025
@fdefelici fdefelici added the aac Avoiding Accidental Consensus label Oct 3, 2025
@fdefelici fdefelici added this to the 3.2.0.0.2 milestone Oct 3, 2025
@fdefelici fdefelici marked this pull request as ready for review October 6, 2025 07:35
@fdefelici fdefelici requested review from a team as code owners October 6, 2025 07:35
@fdefelici fdefelici added the aac-testing Avoiding Accidental Consensus Testing Specific Task label Oct 6, 2025
@fdefelici fdefelici moved this to Status: In Review in Stacks Core Eng Oct 6, 2025

codecov bot commented Oct 6, 2025

Codecov Report

❌ Patch coverage is 97.67442% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 75.69%. Comparing base (483f1ad) to head (111595a).
⚠️ Report is 15 commits behind head on develop.

Files with missing lines | Patch % | Lines
stackslib/src/chainstate/nakamoto/tests/mod.rs | 83.33% | 4 Missing ⚠️
stackslib/src/chainstate/tests/consensus.rs | 99.43% | 1 Missing ⚠️

❌ Your project status has failed because the head coverage (75.69%) is below the target coverage (80.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #6560      +/-   ##
===========================================
+ Coverage    69.88%   75.69%   +5.80%     
===========================================
  Files          568      568              
  Lines       347547   347674     +127     
===========================================
+ Hits        242887   263171   +20284     
+ Misses      104660    84503   -20157     
Files with missing lines | Coverage | Δ
stacks-signer/src/client/mod.rs | 99.24% <100.00%> | (+1.50%) ⬆️
stackslib/src/chainstate/nakamoto/test_signers.rs | 77.93% <100.00%> | (+0.95%) ⬆️
stackslib/src/chainstate/tests/consensus.rs | 91.86% <99.43%> | (-0.70%) ⬇️
stackslib/src/chainstate/nakamoto/tests/mod.rs | 95.65% <83.33%> | (+15.86%) ⬆️

... and 368 files with indirect coverage changes


Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Contributor

@Jiloc Jiloc left a comment

Great work, automatically computing the MARF will definitely make these tests easier to write! Regarding the approach used, LGTM, but I can't guarantee there isn't an easier way that also works for other types of txs. Maybe @kantai can share his thoughts on that.

One thought (that we already discussed offline): since we've removed the MARF from the input parameters (which is great for making the first test execution faster and cleaner), we might now want to include it in the expected output. It's still an important part of consensus, and we'll want to guarantee that a newer version of stacks-node produces the same MARF root for a block as previous versions.

@jferrant
Contributor

jferrant commented Oct 6, 2025

I agree that the MARF should still be listed in expected outputs :D Great to see this though.

@aaronb-stacks aaronb-stacks left a comment

I think that this PR should be updated so that marf_hash is an Option type.

My rationale is that the marf_hash is actually part of the consensus protocol. So, for creating test vectors that prevent consensus breaking changes, it's better if the marf_hash is included in the input vector. Otherwise, a change which altered that hash could pass the test vector (even though it would be a consensus breaking change).

So, for most kinds of tests that we would write here, we actually want the marf hash included explicitly in the test vector. However, there are plenty of cases where we'd want to be able to run this test harness without the marf hash. In particular, it would help during test writing and generation: when someone writes a test vector, they would create all the test blocks, execute the test with the marf hashes set to None, and then use the output to fill in the expected hashes. Then, subsequent changes to the codebase would need to continue to pass tests with those hashes. A similar pattern would be used when setting up fuzzing targets.
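
The workflow proposed in this comment could be sketched roughly as follows. This illustrates the reviewer's suggestion only, not necessarily what was merged; `TestBlockVector`, `Outcome`, and `check_marf_hash` are hypothetical names, and a 32-byte array is assumed for the hash.

```rust
/// Assumed representation of a MARF root hash (hypothetical alias).
type RootHash = [u8; 32];

/// Hypothetical test-vector entry: the hash is optional while a vector is being written.
struct TestBlockVector {
    marf_hash: Option<RootHash>,
}

/// What the harness reports for one block.
#[derive(Debug, PartialEq)]
enum Outcome {
    /// The vector pinned a hash and the computed one agrees.
    Matched,
    /// No hash was pinned; report the computed one so the author can fill it in.
    Recorded(RootHash),
    /// The vector pinned a hash and the computed one differs (consensus-breaking change).
    Mismatch { expected: RootHash, computed: RootHash },
}

fn check_marf_hash(block: &TestBlockVector, computed: RootHash) -> Outcome {
    match block.marf_hash {
        Some(expected) if expected == computed => Outcome::Matched,
        Some(expected) => Outcome::Mismatch { expected, computed },
        None => Outcome::Recorded(computed),
    }
}
```

A test author would first run with `marf_hash: None`, copy the `Recorded` value into the vector, and from then on any change that alters the hash surfaces as a `Mismatch`.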

@fdefelici fdefelici force-pushed the feat/aac-compute-marf branch from e47882b to 111595a Compare October 8, 2025 13:05
@federico-stacks
federico-stacks commented Oct 8, 2025

With this update I added marf_hash to ExpectedBlockOutput (so that it is recorded in the snapshot), and also merged in the insta implementation from develop.

Caveats:

  • Now that we no longer have an expected failure in TestOutput, the MARF hash is always computed for every test case; if the computation fails (for an invalid block), the zeroed MARF hash is set instead. (Note: we could eventually restore the old implementation if we decide to add a flag to TestBlock saying whether it should succeed or fail.)
  • As a consequence, I removed the test test_append_state_index_root_mismatches
  • Furthermore, I had to remove insta::allow_duplicates! because we now have different MARF hashes for each expected block result
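
The fallback described in the first caveat can be sketched as below. This is a simplified illustration under stated assumptions, not the merged code: `ExpectedBlockOutputSketch` and `expected_outputs` are hypothetical names, the hash is assumed to be 32 bytes, and the error type is a plain `String`.

```rust
/// Assumed representation of a MARF root hash (hypothetical alias).
type RootHash = [u8; 32];

/// All-zero hash recorded when the MARF computation fails.
const ZEROED_MARF_HASH: RootHash = [0u8; 32];

/// Hypothetical, trimmed-down stand-in for the expected output of one block.
#[derive(Debug, PartialEq)]
struct ExpectedBlockOutputSketch {
    marf_hash: RootHash,
}

/// Turn per-block computation results into expected outputs,
/// substituting the zeroed hash for blocks whose computation failed.
fn expected_outputs(results: Vec<Result<RootHash, String>>) -> Vec<ExpectedBlockOutputSketch> {
    results
        .into_iter()
        .map(|result| ExpectedBlockOutputSketch {
            marf_hash: result.unwrap_or(ZEROED_MARF_HASH),
        })
        .collect()
}
```

Since each valid block generally yields its own hash, the per-block outputs differ from one another, which is consistent with the last caveat about no longer treating them as duplicates.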

@fdefelici fdefelici added this pull request to the merge queue Oct 9, 2025
Merged via the queue into stacks-network:develop with commit a7f4240 Oct 9, 2025
299 of 303 checks passed
@fdefelici fdefelici deleted the feat/aac-compute-marf branch October 9, 2025 07:30
@github-project-automation github-project-automation bot moved this from Status: In Review to Status: ✅ Done in Stacks Core Eng Oct 9, 2025
@fdefelici fdefelici removed this from the 3.2.0.0.2 milestone Oct 9, 2025

Labels

aac Avoiding Accidental Consensus aac-testing Avoiding Accidental Consensus Testing Specific Task

Projects

Status: Status: ✅ Done

Development

Successfully merging this pull request may close these issues.

AAC Testing: Develop Integration Test Harness for append_block in stackslib

5 participants