# docs(policy): Reintroduced the testing policy for full review #6413
base: develop
## Conversation
### Jenkins Builds
### Codecov Report

All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

```
@@             Coverage Diff             @@
##           develop    #6413      +/-   ##
===========================================
- Coverage    60.83%   60.81%   -0.02%
===========================================
  Files          874      874
  Lines       112420   112420
===========================================
- Hits         68389    68368      -21
- Misses       36085    36101      +16
- Partials      7946     7951       +5
```

Flags with carried forward coverage won't be shown.
> - All new functionality MUST be introduced with tests that:
>   - Prove that the functionality performs as described
>   - Can be falsified
> Can be falsified
I think it can be elaborated. Maybe it's just me, but I think these two possible interpretations are different (see the sketch below):

- Tests SHOULD not only verify correct behavior (the "happy path"). Edge-case and failure-scenario testing proactively validates the new feature's resilience to unusual or invalid input.
- Falsifiable testing retroactively ensures that a bug is demonstrably captured by a test, ensuring that fixes address concrete, reproducible problems rather than hypothetical scenarios.
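To make the distinction concrete, here is a minimal Go sketch; the `parseAmount` function and both tests are hypothetical, purely for illustration. The happy-path test proves the functionality performs as described, while the failure-scenario test is what makes the suite falsifiable: deleting the validation inside `parseAmount` makes it fail.

```go
package example

import (
	"errors"
	"strconv"
	"testing"
)

// parseAmount is a hypothetical function under test: it parses a
// non-negative integer amount from a string.
func parseAmount(s string) (int, error) {
	n, err := strconv.Atoi(s)
	if err != nil || n < 0 {
		return 0, errors.New("invalid amount: " + s)
	}
	return n, nil
}

// Happy path: proves the documented behavior.
func TestParseAmountValid(t *testing.T) {
	got, err := parseAmount("42")
	if err != nil || got != 42 {
		t.Fatalf(`parseAmount("42") = %d, %v; want 42, nil`, got, err)
	}
}

// Failure scenarios: edge cases and invalid input. Removing the
// validation in parseAmount makes this test fail.
func TestParseAmountInvalid(t *testing.T) {
	for _, s := range []string{"", "abc", "-1"} {
		if _, err := parseAmount(s); err == nil {
			t.Errorf("parseAmount(%q): expected an error", s)
		}
	}
}
```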
Hey, as we refine this Testing Policy, I wanted to raise some questions for further discussion. Let me know what you think!

**API-level Functional Testing**: Should API-level functional tests be explicitly included in this policy? Should they have their own specific requirements?

**Performance & Load Testing**: Should we introduce any policy items about performance benchmarks for functional tests, particularly API tests?

**Handling Test Debt & Maintenance**: Do we need a process for reviewing and cleaning up outdated tests? 😬

**Fuzzing & Security Considerations**: This policy mentions resistance to fuzzing, but should we also consider security testing within our tests?

**Automation & CI Integration**: Should we mention in this policy the CI/CD automation steps currently in place that enforce test coverage?

**Flaky Test Management Process**: Should we set a maximum timeframe for how long a test can remain flaky before it must be resolved or rewritten?
Nice to see this back
>   - Prove that the functionality performs as described
>   - Can be falsified
>   - Are resistant to fuzzing
> - All new `functional tests` MUST BE validated via a minimum of 1000 test runs.
I don't know functional tests that well, but integration tests are the usual culprits of flakiness, so either this is a swap or we need to add integration there too.
This should probably apply to unit/integration (Go) tests, not functional (Python) tests. The latter have not yet been noted to be flaky, so I wouldn't apply such strict rules there.

We have test types defined here: status-go/Test Types.
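On the "Are resistant to fuzzing" bullet quoted above, here is a minimal sketch of a native Go fuzz target (Go 1.18+), reusing the hypothetical `parseAmount` from the earlier sketch; run with `go test -fuzz=FuzzParseAmount`:

```go
// Fuzzing checks an invariant over arbitrary generated inputs rather
// than a fixed list of cases. parseAmount is the hypothetical function
// defined in the sketch above.
func FuzzParseAmount(f *testing.F) {
	// Seed corpus: one known-good and one known-bad input.
	f.Add("42")
	f.Add("-1")
	f.Fuzz(func(t *testing.T, s string) {
		n, err := parseAmount(s)
		// Invariant: a successful parse never yields a negative amount.
		if err == nil && n < 0 {
			t.Errorf("parseAmount(%q) = %d with nil error", s, n)
		}
	})
}
```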
This is a welcome policy!
My impression while reading it is that it contains a bit too much detail, which leaves room for more amendments, something we mentioned in the introductory policy that we'd like to minimize.
For example, I don't think we need the policy to describe in detail how to resolve or report flaky tests (like in the section "Steps to Resolving or Reporting Flaky Tests"). Another approach could be to link to external guidelines that can evolve independently of the policy. Essentially, I'm questioning what belongs in the policy versus what should live in non-policy docs.
> ## Purpose
>
> This policy defines the requirements and guidelines for testing within the `status-go` repository. It aims to ensure all new functionality is introduced with robust, reliable, and repeatable tests while preventing flaky tests from being merged into the codebase. This aligns with Policy Zero’s principles of transparency, collaboration, and quality assurance in the development process.
> while preventing flaky tests from being merged into the codebase
In my experience, many flaky tests are tricky and end up in the main branch because they resist verification attempts. Beyond addressing flaky tests during PRs, an important process is to define how/who/when to handle flaky tests after they infect the main branch. But perhaps the details of such a process can be defined in a non-policy doc.
>   - Are resistant to fuzzing
> - All new `functional tests` MUST BE validated via a minimum of 1000 test runs.
>   - This can be achieved using the `-count` or `-test.count` flag with the test command, e.g., `-count 1000` / `-test.count 1000`.
>   - Where the CI cannot support this workflow automatically, the developer MUST perform validation tests via local testing.
How will the PR author prove that the N test runs were executed? Should the CI be responsible for detecting new functional tests and increasing their test count by default for the duration of the PR? Is it always feasible to mandate a minimum of 1000 test runs when some tests can take a significant amount of time to finish?
I agree that a CI check would be nice, but the effort to implement it might not be worth the results.

There are 3 cases when a test can become flaky, and each reason requires a different effort to implement a CI check:

| Reason | How to CI check | Implementation difficulty (out of 5) | Warnings |
|---|---|---|---|
| When implementing a new test | Check if this test exists on `develop` | 🔴 🔴 | - |
| When changing old test code | Check the tests diff and find amended tests | 🔴 🔴 🔴 🔴 | - |
| When changing tested code (without changing the test itself) | Use external tools (e.g. Sentry Carryforward flags) | 🔴 🔴 🔴 🔴 | Some code may be used in ALL tests, which would make CI verify all tests 1000 times |

I think we could some day implement (1), but I would not bother with (2), and there will always be (3), which is wrong to do (note the warning above).

👉 So for now I believe it's fine to require devs to run new/amended tests locally and perhaps attach the execution report to the PR description/comment. At the same time, teach devs to prevent flaky tests, so that we won't even need to run tests 1000 times.
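For reference, a local validation run along these lines might look like the following sketch (the package path and test name are hypothetical; `-failfast` stops the run at the first failure, so a flake surfaces immediately):

```sh
# Run one new test 1000 times, aborting as soon as it flakes.
go test ./protocol -run '^TestMessengerProfile$' -count=1000 -failfast
```

The terminal output of such a run is the kind of execution report that could be attached to the PR.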
I had a feeling CI detection would be tricky, but I wasn't expecting such a detailed answer. Thanks @igor-sirotin 💯 I completely agree with your conclusions.

One idea to improve the effectiveness of the test suite is to randomly select a subset of tests and increase their run count (randomly as well, if we like). Essentially, we'd extend the total test run time by a controlled amount. This would raise the chances of catching flaky tests before they hit `develop`. As the suite gets more stable, we could gradually reduce the randomness factor. We could also increase the number of tests affected by the random selection in CI by using dedicated jobs (e.g. twice per day, with 10% of the test suite running between 10 and 200 times).

Is the idea above practical and perhaps straightforward?
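For what it's worth, here is a rough shell sketch of the random-subset idea (the package path, subset size, and run count are all hypothetical; `go test -list` prints matching test names without running them):

```sh
# Pick a random subset of tests in a package and stress each one.
go test -list '.*' ./protocol | grep '^Test' | shuf | head -n 20 |
while read -r t; do
	go test ./protocol -run "^${t}$" -count=100 -failfast || exit 1
done
```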
>   - This can be achieved using the `-count` or `-test.count` flag with the test command, e.g., `-count 1000` / `-test.count 1000`.
>   - Where the CI cannot support this workflow automatically, the developer MUST perform validation tests via local testing.
>   - `TODO` Add link to issue for CI automation of validation test runs of new `functional tests`.
> - Ensuring that the test passes consistently every time gives confidence that the test is not flaky.
We know one source of flakiness is external network dependency. I wonder if we can just define in status-go that certain tests (e.g. within a certain standardized directory like `./unit/`, or following a specific naming pattern) should automatically block outbound traffic. A hard block using a firewall in the CI could be practical, but it would be better, I think, if we could do it from Go directly. I'm just looking for ways to rely less on conventions and trust.
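A sketch of the from-Go direction, assuming the tests in question reach the network through `net/http`'s default transport (raw `net.Dial` calls would still need the firewall approach): a package-level `TestMain` swaps in a dialer that fails fast.

```go
package unit

import (
	"context"
	"errors"
	"net"
	"net/http"
	"os"
	"testing"
)

// TestMain blocks outbound HTTP for every test in this package, so a
// test that silently depends on the network fails deterministically
// instead of flaking on connectivity.
func TestMain(m *testing.M) {
	http.DefaultTransport = &http.Transport{
		DialContext: func(ctx context.Context, network, addr string) (net.Conn, error) {
			return nil, errors.New("outbound network access blocked in unit tests: " + addr)
		},
	}
	os.Exit(m.Run())
}
```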
> ```
>
> #### Reporting Flaky Tests
> If an old test fails and/or seems flaky either locally or in CI, you MUST report the event.
Is the lack of a blank line after the headline intentional? The other one at level 2 (`## Flaky Tests`, above) has a blank line.
>   - Archival MUST be performed by submitting a PR that moves the policy to `_docs/policies/archived`.
> - The PR MUST include a rationale for the proposed amendment or archival in the PR description.
> - The PR MUST follow the [Review and Approval Process](./introduce-policy.md#review-and-approval-process) outlined in Policy Zero.
>
>
Two extra lines.
> ## Purpose
>
> This policy defines the requirements and guidelines for testing within the `status-go` repository. It aims to ensure all new functionality is introduced with robust, reliable, and repeatable tests while preventing flaky tests from being merged into the codebase. This aligns with Policy Zero’s principles of transparency, collaboration, and quality assurance in the development process.
Most lines don't have spurious line breaks, but some do, like this one. For consistency with the other policy, `introduce-policy.md`, it's better to fill all paragraphs at ~80 chars.
> ## Creating Tests
>
> - All new functionality MUST be introduced with tests that:
When it comes to the quality of tests, there are other characteristics not covered in the policy, and we could go into more detail; but by doing so, I think we are already failing to use policies adequately. Policies are not great documents to iterate on, in my opinion: they are too rigid and require too much coordination.
A few other characteristics:
- How sociable is the test?
- How does the test perform?
- How is the test structured in terms of arrange/act/assert? (see the sketch after this list)
- How much coverage should a test provide at a minimum?
- Is the test written using the best test pyramid level? Example: could the integration test be rewritten to be one or more unit tests, or a combination of different levels?
- Is the test verifying possible security threats?
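As a small illustration of the arrange/act/assert point above, a Go sketch (the `memoryStore` type is hypothetical, only there to keep the example self-contained):

```go
package example

import "testing"

// memoryStore is a hypothetical in-memory key/value store.
type memoryStore struct{ m map[string]string }

func newMemoryStore() *memoryStore                  { return &memoryStore{m: map[string]string{}} }
func (s *memoryStore) Put(k, v string) error        { s.m[k] = v; return nil }
func (s *memoryStore) Get(k string) (string, error) { return s.m[k], nil }

func TestPutThenGet(t *testing.T) {
	// Arrange: build the unit under test and its inputs.
	s := newMemoryStore()

	// Act: perform exactly one behavior.
	if err := s.Put("key", "value"); err != nil {
		t.Fatalf("Put: %v", err)
	}
	got, err := s.Get("key")

	// Assert: verify the observable outcome.
	if err != nil || got != "value" {
		t.Fatalf(`Get("key") = %q, %v; want "value", nil`, got, err)
	}
}
```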
> - This can be achieved using the `-count` or `-test.count` flag with the test command, e.g., `-count 1000` / `-test.count 1000`.
An example (to me) of something that seems too detailed to live in the policy.
Thank you @Samyoul!

I believe it is essential for this policy to exist, and we should agree on common rules for introducing new tests; otherwise we will continue to introduce more and more flaky tests and just keep skipping them when they become unbearably annoying.

P.S. This is just part of my review; I will come back with more comments. Just leaving this here in the meantime.
> ## Submitting and Maintaining the Test Policy
>
> - Any individual MAY propose amendments to this policy.
> - Proposed changes MUST be submitted as a pull request (PR) to the `_docs/policies` directory in the `status-go` repository.
> - The PR MUST include a brief justification for the amendment, answering: "Why is this change needed?"
> - The PR MUST follow the [Review and Approval Process](./introduce-policy.md#review-and-approval-process) outlined in Policy Zero.
I think this is redundant, as it is defined in Policy Zero.
>   - Prove that the functionality performs as described
>   - Can be falsified
>   - Are resistant to fuzzing
Actually, I don't think we need to specify the quality of the tests. I find our developers reasonable enough to implement tests that make sense and cover the feature. This basically means "be a good developer, write good tests" 😄 We are good developers here!
So the point is: IMO just specifying that "the new functionality MUST be covered with tests" is enough here.
- Prove that the functionality performs as described | ||
- Can be falsified | ||
- Are resistant to fuzzing | ||
- All new `functional tests` MUST BE validated via a minimum of 1000 test runs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should probably apply to unit/integration (go) tests, not functional (python). The latter have not yet been noted to be flaky, so I wouldn't apply such strict rules there.
We have tests types definition here: status-go/Test Types.
>   - Can be falsified
>   - Are resistant to fuzzing
> - All new `functional tests` MUST BE validated via a minimum of 1000 test runs.
>   - This can be achieved using the `-count` or `-test.count` flag with the test command, e.g., `-count 1000` / `-test.count 1000`.
I think this is unnecessary detail:
```diff
- - This can be achieved using the `-count` or `-test.count` flag with the test command, e.g., `-count 1000` / `-test.count 1000`.
+ - This can be achieved using the `-count` or `-test.count` flag with the `go test` command
```
Btw, there's also the `stress` tool, which is helpful: status-go/Detect flaky tests.
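For context, the linked `stress` tool (golang.org/x/tools/cmd/stress) repeatedly runs a compiled test binary; a hypothetical invocation might look like:

```sh
# Build the package's test binary, then hammer one test in parallel.
go test -c -o protocol.test ./protocol
stress -p 4 ./protocol.test -test.run '^TestMessengerProfile$'
```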
>   - Are resistant to fuzzing
> - All new `functional tests` MUST BE validated via a minimum of 1000 test runs.
>   - This can be achieved using the `-count` or `-test.count` flag with the test command, e.g., `-count 1000` / `-test.count 1000`.
>   - Where the CI cannot support this workflow automatically, the developer MUST perform validation tests via local testing.
```diff
- - Where the CI cannot support this workflow automatically, the developer MUST perform validation tests via local testing.
+ - Where the CI cannot support this workflow automatically, the developer MUST perform validation tests locally.
```
> ## Policy Amendments and Archival
>
> - Amendments:
>   - This policy MAY be amended at any time.
>   - Amendments MUST be submitted via a PR to the `status-go` repository.
> - Archival:
>   - This policy MAY be archived if it becomes obsolete or replaced by newer policies.
>   - Archival MUST be performed by submitting a PR that moves the policy to `_docs/policies/archived`.
> - The PR MUST include a rationale for the proposed amendment or archival in the PR description.
> - The PR MUST follow the [Review and Approval Process](./introduce-policy.md#review-and-approval-process) outlined in Policy Zero.
This, I believe, is also redundant, as it is specified in Policy Zero, which applies to all policies.
## PR Description: Reintroducing the `status-go` Testing Policy

### Summary

This PR reintroduces the `status-go` Testing Policy, providing clear guidelines for test creation, validation, and handling flaky tests. The policy has been updated to align with Policy Zero, ensuring a structured, transparent, and inclusive approach to test management in the repository.

### Background

The `status-go` Testing Policy previously existed in the repository but was removed to allow all Core Contributors (CCs) to review, discuss, and provide input before formal reintroduction. This aligns with Policy Zero, which establishes a framework for collaboratively developing policies in key areas of `status-go` development. The goal is to ensure that all policies reflect the needs and perspectives of contributors across the project.

### What's Changed?

### Why Change?