-
Notifications
You must be signed in to change notification settings - Fork 254
docs(policy)_: Reintroduced the testing policy for full review #6413
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
@@ -0,0 +1,121 @@ | ||||||||||||||||||||||
# `status-go` Test Policy | ||||||||||||||||||||||
|
||||||||||||||||||||||
## Purpose | ||||||||||||||||||||||
|
||||||||||||||||||||||
This policy defines the requirements and guidelines for testing within the `status-go` repository. It aims to ensure all new functionality is introduced with robust, reliable, and repeatable tests | ||||||||||||||||||||||
while preventing flaky tests from being merged into the codebase. This aligns with Policy Zero’s principles of transparency, collaboration, and quality assurance in the development process. | ||||||||||||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Most lines don't have spurious line breaks, but some do like this one. For consistency with the other policy |
||||||||||||||||||||||
|
||||||||||||||||||||||
## Submitting and Maintaining the Test Policy | ||||||||||||||||||||||
|
||||||||||||||||||||||
- Any individual MAY propose amendments to this policy. | ||||||||||||||||||||||
- Proposed changes MUST be submitted as a pull request (PR) to the `_docs/policies` directory in the `status-go` repository. | ||||||||||||||||||||||
- The PR MUST include a brief justification for the amendment, answering: "Why is this change needed?" | ||||||||||||||||||||||
- The PR MUST follow the [Review and Approval Process](./introduce-policy.md#review-and-approval-process) outlined in Policy Zero. | ||||||||||||||||||||||
Comment on lines
+8
to
+13
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think this is redundant, as this is defined in the Policy Zero. |
||||||||||||||||||||||
|
||||||||||||||||||||||
## Creating Tests | ||||||||||||||||||||||
|
||||||||||||||||||||||
- All new functionality MUST be introduced with tests that: | ||||||||||||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. When it comes to the quality of tests, there are other characteristics not covered in the policy and we could go in more detail, but then, by going in more detail, I think we are already failing to use policies adequately. The policies are not great documents to iterate in my opinion and are too rigid and require too much coordination. A few other characteristics:
An example (to me) of something that seems too detailed to live in the policy. |
||||||||||||||||||||||
- Prove that the functionality performs as described | ||||||||||||||||||||||
- Can be falsified | ||||||||||||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I think it can be elaborated. Maybe it's just me, but I think these 2 possible interpretations are different:
|
||||||||||||||||||||||
- Are resistant to fuzzing | ||||||||||||||||||||||
Comment on lines
+18
to
+20
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Actually I don't think we need to specify the quality of the tests. I find our developers reasonable enough to implement tests that makes sense and cover the feature. This basically means "be a good developer, write good tests" 😄 We are good developers here! There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. So the point is: IMO just specifying that "the new functionality MUST be covered with tests" is enough here. |
||||||||||||||||||||||
- All new `functional tests` MUST BE validated via a minimum of 1000 test runs. | ||||||||||||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't know functional tests that well, but There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This should probably apply to unit/integration (go) tests, not functional (python). The latter have not yet been noted to be flaky, so I wouldn't apply such strict rules there. We have tests types definition here: status-go/Test Types. |
||||||||||||||||||||||
- This can be achieved using the `-count` or `-test.count` flag with the test command, e.g., `-count 1000` / `-test.count 1000`. | ||||||||||||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think this is unnecessary detail
Suggested change
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Btw, there's also |
||||||||||||||||||||||
- Where the CI cannot support this workflow automatically, the developer MUST perform validation tests via local testing. | ||||||||||||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. How will the PR author prove the N number of test runs were executed? Should the CI be responsible for detecting new functional tests and increase their test count by default for the duration of the PR? Is it always feasible to mandate a minimum of 1000 test runs when some tests can take a significant amount of time to finish? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I agree that a CI check would be nice, but the effort to implement it might not be worth the results. There're 3 cases when a test can become flaky, and each reason requires different effort to implement a CI check:
I think we could some day implement (1), 👉 So for now I believe it's fine to require devs to run new/amended test locally and perhaps attach the execution report to the PR description/comment. At the same time teach devs to prevent flaky tests, so that we won't even need to run tests a 1000 times. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I had a feeling CI detection would be tricky, but I wasn't expecting such a detailed answer. Thanks @igor-sirotin 💯 I completely agree with your conclusions. One idea to improve the effectiveness of the test suite is to randomly select a subset of tests and increase their run count (randomly as well if we like). Essentially, we'd extend the total test run time by a controlled amount. This would raise the chances of catching flaky tests before they hit Is the idea above practical and perhaps straightforward? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||||||||||||||||||
- `TODO` Add link to issue for CI automation of validation test runs of new `functional tests`. | ||||||||||||||||||||||
- Ensuring that the test passes consistently every time gives confidence that the test is not flaky. | ||||||||||||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We know one source of flakyness is external network dependency. I wonder if we can just define in status-go that certain tests (e.g. within a certain standardized directory like I'm just looking for ways to rely less on conventions and trust. |
||||||||||||||||||||||
|
||||||||||||||||||||||
## Flaky Tests | ||||||||||||||||||||||
|
||||||||||||||||||||||
Flaky tests are defined as tests that fail intermittently. | ||||||||||||||||||||||
|
||||||||||||||||||||||
- All flaky tests / failing tests MUST be resolved. | ||||||||||||||||||||||
- No flaky tests may be introduced into the codebase. | ||||||||||||||||||||||
|
||||||||||||||||||||||
### Steps to Resolving or Reporting Flaky Tests | ||||||||||||||||||||||
|
||||||||||||||||||||||
#### Is it Me? | ||||||||||||||||||||||
Determine who caused the flaky test. | ||||||||||||||||||||||
|
||||||||||||||||||||||
- Is a new test you’ve written flaky or failing? | ||||||||||||||||||||||
- It was you. | ||||||||||||||||||||||
- You MUST fix the test before merge is acceptable. | ||||||||||||||||||||||
- Has an existing test become flaky? | ||||||||||||||||||||||
- Check rerun reports. `TODO` Add link to rerun reports. | ||||||||||||||||||||||
- If the test does not appear in https://github.com/status-im/status-go/labels/E%3AFlaky%20Test or in the last three nightly test runs, it is most likely that the flakiness was introduced by | ||||||||||||||||||||||
your changes and needs to be addressed before proceeding with the merge. | ||||||||||||||||||||||
- Otherwise, the test is already documented as a flaky test (appears in the GitHub issues or in the nightly test runs), proceed to below. | ||||||||||||||||||||||
|
||||||||||||||||||||||
```mermaid | ||||||||||||||||||||||
flowchart TB | ||||||||||||||||||||||
A([PR ready for merge]) --> B{Have any test failed?} | ||||||||||||||||||||||
B -->|No| C[🎉 Proceed with merge 🪄] | ||||||||||||||||||||||
B -->|Yes| D{ | ||||||||||||||||||||||
Is the failing test introduced | ||||||||||||||||||||||
or altered by this PR? | ||||||||||||||||||||||
} | ||||||||||||||||||||||
D -->|No| E[Check rerun reports.] | ||||||||||||||||||||||
D -->|Yes| F[ | ||||||||||||||||||||||
It is likely your changes introduced the flakiness. | ||||||||||||||||||||||
You MUST fix the test before merge is acceptable. | ||||||||||||||||||||||
] | ||||||||||||||||||||||
F --> A | ||||||||||||||||||||||
E --> G{Does the test appear in `E:Flaky Test` issues<br/> or in the last three nightly test runs?<br/>} | ||||||||||||||||||||||
G -->|Yes| I[The flakiness needs reporting] | ||||||||||||||||||||||
G -->|No| F | ||||||||||||||||||||||
I --> J([Proceed to Reporting flow]) | ||||||||||||||||||||||
``` | ||||||||||||||||||||||
|
||||||||||||||||||||||
#### Reporting Flaky Tests | ||||||||||||||||||||||
If an old test fails and/or seems flaky either locally or in CI, you MUST report the event. | ||||||||||||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is the lack of a blank line after the headline intentional? The other one at level 2 |
||||||||||||||||||||||
|
||||||||||||||||||||||
- Check the `status-go` GitHub repo issues for the test name(s) failing. | ||||||||||||||||||||||
- If the test appears in the list of flaky test issues: | ||||||||||||||||||||||
- If the issue is open: | ||||||||||||||||||||||
- Add a comment to the issue. | ||||||||||||||||||||||
- Detail that you have experienced the test being flaky and in what context (local vs CI, link to the PR or branch). | ||||||||||||||||||||||
- If the issue is closed: | ||||||||||||||||||||||
- Reopen the issue OR create a new issue referencing the previous issue. | ||||||||||||||||||||||
- Either is fine, use your best judgment in this case. | ||||||||||||||||||||||
- Detail that you have experienced the test being flaky and in what context (local vs CI, link to the PR or branch). | ||||||||||||||||||||||
- If the test does not appear in the list of flaky test issues: | ||||||||||||||||||||||
- Create a new issue. | ||||||||||||||||||||||
- The issue title should include the flaky test name. | ||||||||||||||||||||||
- The issue should use the https://github.com/status-im/status-go/labels/E%3AFlaky%20Test label. | ||||||||||||||||||||||
- Detail that you have experienced the test being flaky and in what context (local vs CI, link to the PR or branch). | ||||||||||||||||||||||
|
||||||||||||||||||||||
```mermaid | ||||||||||||||||||||||
flowchart TB | ||||||||||||||||||||||
A([Ready to report a flaky test]) --> B[Check the `status-go` GitHub repo<br/>issues for the test name failing.] | ||||||||||||||||||||||
B --> C{Does the test appear in<br/>the list of `E: Flaky Test` issues?} | ||||||||||||||||||||||
C -->|No| D[ | ||||||||||||||||||||||
Create a new issue | ||||||||||||||||||||||
- The issue title should include the flaky test name | ||||||||||||||||||||||
- The issue should use the `E:Flaky Test` label | ||||||||||||||||||||||
] | ||||||||||||||||||||||
D --> E[ | ||||||||||||||||||||||
Detail which test is flaky and in what context: | ||||||||||||||||||||||
local vs CI, link to the PR or branch. | ||||||||||||||||||||||
] | ||||||||||||||||||||||
E --> J | ||||||||||||||||||||||
C -->|Yes| F{Is the issue open?} | ||||||||||||||||||||||
F -->|No| G((Either)) | ||||||||||||||||||||||
H --> E | ||||||||||||||||||||||
G --> I[Reopen the issue] | ||||||||||||||||||||||
G --> D | ||||||||||||||||||||||
I --> H | ||||||||||||||||||||||
F -->|Yes| H[Add a comment to the issue] | ||||||||||||||||||||||
J([End]) | ||||||||||||||||||||||
``` | ||||||||||||||||||||||
|
||||||||||||||||||||||
## Policy Amendments and Archival | ||||||||||||||||||||||
|
||||||||||||||||||||||
- Amendments: | ||||||||||||||||||||||
- This policy MAY be amended at any time. | ||||||||||||||||||||||
- Amendments MUST be submitted via a PR to the `status-go` repository. | ||||||||||||||||||||||
- Archival: | ||||||||||||||||||||||
- This policy MAY be archived if it becomes obsolete or replaced by newer policies. | ||||||||||||||||||||||
- Archival MUST be performed by submitting a PR that moves the policy to `_docs/policies/archived`. | ||||||||||||||||||||||
- The PR MUST include a rationale for the proposed amendment or archival in the PR description. | ||||||||||||||||||||||
- The PR MUST follow the [Review and Approval Process](./introduce-policy.md#review-and-approval-process) outlined in Policy Zero. | ||||||||||||||||||||||
Comment on lines
+110
to
+119
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This I believe is also redundant, as it is specified in the policy zero, which applies to all policies. |
||||||||||||||||||||||
|
||||||||||||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Two extra lines. |
||||||||||||||||||||||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my experience, many flaky tests are tricky and end up in the main branch because they resist verification attempts. Beyond addressing flaky tests during PRs, an important process is to define how/who/when to handle flaky tests after they infect the main branch. But perhaps the details of such a process can be defined in a non-policy doc.