Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/telemetry/counter/countertest: improve ergonomics of countertest.Read #71590

Open
findleyr opened this issue Feb 6, 2025 · 4 comments
Open
Labels
FeatureRequest Issues asking for a new feature that does not need a proposal. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. telemetry x/telemetry issues
Milestone

Comments

@findleyr
Copy link
Member

findleyr commented Feb 6, 2025

While investigating a frequent gopls flake (#68659), I encountered several API problems with the countertest package that make it hard (or really impossible) to use correctly.

I believe the sequence of events that leads to the flake in question are:

  1. TestMain calls countertest.Open, because it must be called once during program execution.
  2. The test setup first performs an initial increment of counter values, since we must inc before the first Read as ReadCounter(counter.New("unknownCounter")) returns an error rather than succeeding and returning 0.
  3. ReadCounter calls rotate1 internally. I'm not sure why. Since TestMain may have occurred ~minutes ago, it's quite likely that the day has actually changed in the meantime, and so the file rotation invalidates the previous counters.

IMO, all three of these are indicative of an API problem:

  1. We should probably be able to call countertest.Open inside each test (albeit not in parallel), to increase test isolation.
  2. ReadCounter(counter.New("unknownCounter")) should return the zero value.
  3. ReadCounter should probably not call rotate1, but even if it does, countertest should not be subject to day shifts. We should either stop time, or let the countertest user control the progress of time. Otherwise, no test can assert on the output of Read.

At the very least, we should fix (3), but it would be good to fix (1) and (2) as well.

@gopherbot gopherbot added the telemetry x/telemetry issues label Feb 6, 2025
@gopherbot gopherbot added this to the Unreleased milestone Feb 6, 2025
@gabyhelp gabyhelp added the FeatureRequest Issues asking for a new feature that does not need a proposal. label Feb 6, 2025
@gopherbot
Copy link
Contributor

Change https://go.dev/cl/647436 mentions this issue: gopls/internal/test/integration: reduce flakes in TestTelemetryPrompt

@cherrymui cherrymui added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Feb 6, 2025
gopherbot pushed a commit to golang/tools that referenced this issue Feb 6, 2025
Issue golang/go#71590 describes the likely source of flakiness of
TestTelemetryPrompt_Response: the test will flake if the UTC day has
changed between the start of TestMain and the start of the test. While
we cannot fix this flakiness without a change to x/telemetry, we can
significantly reduce it by first causing a file rotation check at the
start of the test.

For golang/go#68659

Change-Id: I22b6b41ebadbea1f4af0b7d4dc64dbfcf617cefd
Reviewed-on: https://go-review.googlesource.com/c/tools/+/647436
Auto-Submit: Robert Findley <[email protected]>
LUCI-TryBot-Result: Go LUCI <[email protected]>
Reviewed-by: Jonathan Amsterdam <[email protected]>
@rsc
Copy link
Contributor

rsc commented Feb 11, 2025

Was just looking into this, but it appears to be fixed by CL 599075, which is exactly what I was going to do for (3).
I don't believe (1) and (2) should be necessary.

@rsc
Copy link
Contributor

rsc commented Feb 11, 2025

I am a little confused now because I notice that CL 599075 was submitted in July 2024. Perhaps it's just not thorough enough.

@findleyr
Copy link
Member Author

@rsc that's not quite related. Open still calls rotate1, which maps the file initially.

counter.Read also calls rotate1. I don't actually know why. It's the rotate1 in counter.Read that causes the switch to a different file.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
FeatureRequest Issues asking for a new feature that does not need a proposal. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. telemetry x/telemetry issues
Projects
None yet
Development

No branches or pull requests

5 participants