Skip to content

Conversation

peterargue
Copy link
Contributor

@peterargue peterargue commented Sep 1, 2025

Adds a new Claude command /improve-error-docs that analyzes and improves error handling documentation in Go source files.

This can be used to analyze the godocs within a file or directory, and automatically improve based on the code.

Usage

within claude code, run

/improve-error-docs engine/common/

Changes

  • Added /improve-error-docs command for batch processing files/directories
  • Added error-docs-analyzer subagent for analyzing individual files
  • Updated GoDocs.md with improved formatting and examples.
  • Enhanced AGENTS.md with clearer formatting and structure
  • Modified Makefile to remove git diff check from tidy target

@peterargue peterargue requested a review from a team as a code owner September 1, 2025 19:57
@codecov-commenter
Copy link

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

Comment on lines +105 to +110
There are 2 categories of returned errors:
- Expected benign errors
- Exceptions (unexpected errors)

Expected errors are benign sentinel errors returned by the function.
Exceptions are unexpected errors returned by the function. These include all errors that are not benign within the context of the method.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this distinction helped the agents understand that some error paths do not return sentinels. before it assumed any error returned was a sentinel and it just needed to look hard enough to find them.

// Expected errors during normal operations:
// CAUTION: not concurrency safe! (if applicable, documentation is obligatory)
//
// Expected error returns during normal operations:
Copy link
Contributor Author

@peterargue peterargue Sep 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note:

we've used several variations of this phrase. the most common are

  • Expected errors during normal operations
  • Expected error returns during normal operations

AI seems to understand the formatting and requirements better when it specifically includes the word returns or returned.

the wording Expected errors returned during normal operation: makes the most sense to me, but Expected error returns during normal operations is the most common throughout the codebase.

I think we should align on a single phrase, which will allow the agents to be more successful at helping to maintain our godocs

// Expected error returns during normal operations:
// - ErrType1: when and why this error occurs
// - ErrType2: when and why this error occurs
// - All other errors are potential indicators of bugs or corrupted internal state (continuation impossible)
Copy link
Member

@AlexHentschel AlexHentschel Sep 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️

Effectively, you could write this statement in the goDoc of any function with an error return in our code base and it would be 99.9% correct.

Our coding convention states:

All errors beyond the specified, benign sentinel errors are considered unexpected failures, i.e. a symptom of potential state corruption.

More broadly, this is a universal rule of high-assurance systems: anything outside of the paths explicitly specified as safe must be treated as a critical failure. Because of the universality of this rule for Flow, we tend to not explicitly specify this over and over again.

I am of the opinion that it is counter-productive to have vastly repetitive statements in goDoc even if they are correct, because human brains tend to filter those out -- always with the risk of tagging other stuff as "unimportant" hence bypassing conscious perception. In short, we already have the challenge that engineers don't read the existing documentation at times, this problem will only become worse if our documentation becomes bloated with repetitive phrases that provide no value to human brains.

//
// Expected error returns during normal operations:
// - ErrNotFound if no block header with the given ID exists
// - All other errors are potential indicators of bugs or corrupted internal state (continuation impossible)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

Copy link
Member

@AlexHentschel AlexHentschel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the iterations.

Though, I have major concerns that goDoc produced according to this guide will be more harmful than do good. I have used your guide in Cursor and found that the comments it generated were bloated and largely redundant with the method signature, because it documented only what was trivially apparent from the method signature.

It might be helping AI, but if humans don't properly read the godoc anymore, I think that is even worse. We are already struggling with engineers not properly reading goDoc!! In my opinion, that's a reality we must face and adjust our process to it.

I have a strong preference for documentation in natural language presenting a concise description that naturally incorporates the meaning of parameters that need to be documented. In general, we strive for descriptive method signatures and variable names.

- ONLY include expected errors documentation if the function returns an error.
- ONLY document benign errors that are expected during normal operations
- Exceptions (unexpected errors) are NOT individually documented in the error section.
- If sentinel errors are expected, include the catch-all statement about exceptions as the last item: `All other errors are potential indicators of bugs or corrupted internal state (continuation impossible)`
Copy link
Member

@AlexHentschel AlexHentschel Sep 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rather prefer not to. "No error returns expected during normal operation" already entails that any error received is abnormal.

// Expected error returns during normal operations:
// - ErrTypeName: when and why this error occurs (for sentinel errors)
// - ErrWrapped: when wrapped via fmt.Errorf, document the original sentinel error
// - All other errors are potential indicators of bugs or corrupted internal state (continuation impossible)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// - All other errors are potential indicators of bugs or corrupted internal state (continuation impossible)

- DO NOT document generic fmt.Errorf errors unless they wrap a sentinel error
- DO NOT document exceptions (unexpected errors that may indicate bugs)
- DO NOT mix benign and exceptional errors without clear distinction
- DO NOT omit the catch-all statement about other errors
Copy link
Member

@AlexHentschel AlexHentschel Sep 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

worried this will bloat the documentation for humans.

Comment on lines 143 to 148
```go
// Expected error returns during normal operations:
// - ErrTypeName: when and why this error occurs (for sentinel errors)
// - ErrWrapped: when wrapped via fmt.Errorf, document the original sentinel error
// - All other errors are potential indicators of bugs or corrupted internal state (continuation impossible)
```
Copy link
Member

@AlexHentschel AlexHentschel Sep 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[thought, may not require any action]

I am worried the AI will get confused, because this is presented as a go example, which is talking about wrapped errors. So this is not an example for something the AI should write in a goDoc. But I also don't have a quick fix ... just long thoughts on wordy explanations, that are probably not worth the time.

Mentioning this as possibly a place to come back to in case AI is talking about wrapped errors or maybe duplicating error documentation (when the error might occur in wrapped and unwrapped versions).

Copy link
Member

@AlexHentschel AlexHentschel Sep 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just came across an example where I think the AI has misunderstood this directive:

// BatchStoreMyReceipt stores blockID-to-my-receipt index entry keyed by blockID in a provided batch.
// No errors are expected during normal operation
// If entity fails marshalling, the error is wrapped in a generic error and returned.
// If Badger unexpectedly fails to process the request, the error is wrapped in a generic error and returned.
// If a different my receipt has been indexed for the same block, the error is wrapped in a generic error and returned.
func (m *MyExecutionReceipts) BatchStoreMyReceipt(lctx lockctx.Proof, receipt *flow.ExecutionReceipt, rw storage.ReaderBatchWriter) error {

It talks about wrapping errors, where all of those are actually exceptions, that we don't discuss in detail per convention.

```
- By default, we assume methods and functions to be concurrency safe.
- Every struct and interface must explicitly state whether it is safe for concurrent access.
- If not thread-safe, explain why
Copy link
Member

@AlexHentschel AlexHentschel Sep 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure if we want the AI to explain why something might not be concurrency safe. Stating that it isn't is good enough in my opinion. Especially for interfaces, the reason that an interface is not concurrency safe is typically deeply related to the implementation. Not sure that it is helpful to document in the interface (do caller's really care why the implementation behind the interface might not be concurrency safe?)

```go
// CAUTION: not concurrency safe!
```
- If **ALL** methods of a struct or interface are thread-safe, only document this in the struct's or interface's godoc and mention that all methods are thread-safe. Do NOT include the line in each method:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree.

Comment on lines 206 to 225
- For getters/setters, use simplified format:
```go
// GetterName returns the value of the field.
//
// Returns:
// - value: description of the returned value
```
- For constructors, use:
```go
// NewTypeName creates a new instance of TypeName.
//
// Parameters:
// - param1: description of param1
//
// Returns:
// - *TypeName: the newly created instance
// - error: any error that occurred during creation
//
// No errors are expected during normal operations.
```
Copy link
Member

@AlexHentschel AlexHentschel Sep 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no need to redundantly replicate the same information in long comments, where most of the contents is trivially apparent from the method signatures.

The following is a modification, which I made on your earlier version, which worked a lot better from my perspective to avoid bloated documentation:

Method Description

  • First line must be a complete sentence describing what the method does
  • Use present tense
  • Start with the method name
  • End with a period
  • Prefer a concise description that naturally incorporates the meaning of parameters
  • Example:
    // ByBlockID returns the header with the given ID. It is available for finalized and ambiguous blocks.
    // Error returns:
    //  - ErrNotFound if no block header with the given ID exists
    ByBlockID(blockID flow.Identifier) (*flow.Header, error)

Parameters

  • Only document parameters separately when they have non-obvious aspects:
    • Complex constraints or requirements
    • Special relationships with other parameters
    • Formatting or validation rules
    • Example:
      // ValidateTransaction validates the transaction against the current state.
      //   - script: must be valid BPL-encoded script with max size of 64KB
      //   - accounts: must contain at least one account with signing capability
  • Specifically pay attention to document parameters only when you are adding additional information beyond what is already obvious from the function or method signature.

Returns

  • Only document return values if there is additional information necessary to interpret the function's or method's return values, which is not apparent from the method signature's return values
  • When documenting non-error returns, be concise and focus only on non-obvious aspects:
    // Example 1 - No return docs needed (self-explanatory):
    // GetHeight returns the block's height.
    
    // Example 2 - Additional context needed:
    // GetPipeline returns the execution pipeline, or nil if not configured.
    
    // Example 3 - Complex return value needs explanation:
    // GetBlockStatus returns the block's current status.
    // We return the status PENDING if still processing, FINALIZED if complete, INVALID if failed validation
  • Error returns documentation is mandatory (see section Error Returns below)

Copy link
Contributor Author

@peterargue peterargue Sep 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's here is the same as your original edits (see commit). The only change I made was to move the Example... parts out of the comment so it wouldn't be confused with the actual content we want. I think it all shows up as changed because of some whitespace differences.

Is there something here you think I should add/remove?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed these specific "special case" examples since they don't contribute much and are more likely to confuse.

I also removed the specific "Returns" and "Parameters" sections from the top level section, and added a note that they should just be included in the main description body.

@AlexHentschel
Copy link
Member

Another directive, which I have found to be extremely important is the following

  • When updating existing code, if godocs exist, keep the existing content and improve formating/expand with additional details to conform with these rules.

Because in may cases, I have observed the following pattern:

  • There exists some concise goDoc covering important aspects (though maybe missing some details).
  • The AI will come, throw the important information away, because it doesn't conform to its perceives guidelines, and replaces it with bloated keywords about input parameters and return values, which is completely trivially apparent from the method signature.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants