Skip to content
32 changes: 30 additions & 2 deletions AGENTS.md
Original file line number Diff line number Diff line change
Expand Up @@ -29,7 +29,34 @@ Pre-commit hooks run ruff, ruff-format, and mypy. If ruff-format modifies files,

### GitHub MCP Server Setup

To use the PR review rules, configure the GitHub MCP server in Cursor's "Tools & MCP" settings. You will need a read-only personal access token with the following permissions: Pull requests, Issues, Contents, Metadata.
To use the PR review rules, configure the GitHub MCP server in Cursor's "Tools & MCP" settings.
You will need a read-only personal access token with the following permissions: Pull requests, Issues, Contents, Metadata.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the file to follow "one sentence per line" formatting, which it already mostly followed.


## Code Guidelines for Agents

### Naming

- Config classes loaded from user-specified yaml: append `Config` to the built type (`TrainStepperConfig`).
- Private functions get a `_` prefix.

### Config design

- Validate in `__post_init__`, not at runtime.
- Config loading backwards compatibility for inference is critical, but can be broken for training; use deprecation warnings for config removal. Ask user if unsure.

### Testing
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should indicate a preference for fast-running and parsimonious tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm removing this for now in light of @frodre 's article, because I've so far found Claude is very good at writing fast-running tests already.


- Create helpers for repeated test setup (threshold: 3+ instances).
- Prefer explicit helpers over pytest fixtures; use fixtures only when sharing scope across tests is valuable.
- When fixing a bug, add a failing test first.
- Prefer tests that cover user-story-level behavior over tests that lock down subjective API details.
- Helper function `validate_tensor_dict` is available for regression testing against a saved reference

### Code organization

- Consolidate duplicated code to shared locations (e.g. `fme/core/`).
- `fme/core` is not allowed to import from `fme.ace` or other submodules.
- `fme/ace` is allowed to import from `fme.core`, but not from other submodules.

## Pull Request Review Assistant

Expand Down Expand Up @@ -60,7 +87,8 @@ Use these severity buckets:
- **Suggestions (Should Consider)**: performance, error handling, clarity/design improvements
- **Minor/Nitpicks (Optional)**: style, naming, docs polish

For re-reviews, classify prior comments as **Addressed**, **Partially Addressed**, **Unaddressed**, or **Dismissed**. Treat clear author rationale as addressed when appropriate.
For re-reviews, classify prior comments as **Addressed**, **Partially Addressed**, **Unaddressed**, or **Dismissed**.
Treat clear author rationale as addressed when appropriate.

### 4) Output format

Expand Down
67 changes: 67 additions & 0 deletions CONTRIBUTING.md
Original file line number Diff line number Diff line change
Expand Up @@ -26,16 +26,83 @@ If you're an external user and want to implement a new feature, **please open an
issue first to discuss it with us**. This helps ensure the feature aligns with
the project's direction and prevents wasted effort.

## Code Guidelines
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For much of the general advice below, rather than commit to developing and maintaining our own set of detailed coding guidelines and design principles, I think it would be better to simply include links to external style guides and guidelines that match our expectations for the quality of contributions from sources that we know aren't going anywhere.

This way the contributing guidelines focus on things that are specific to fme.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have examples of external style guides to include? These came specifically from looking at our code reviews, rather than an external reference, so I don't have one (except for conventional comments).

I do feel like while some of these are general guidelines, they are ones that are specifically relevant to our code. It's hard to use a lot of CS-focused style guidelines because the nature of our code (our being the end user and developer) means a lot of the rules can be bent or broken. I find it hard in using them to pick out the "right" ones.

For example, isolating responsibilities to one abstraction level is specifically relevant to us because the freedom to make changes to our code in the future is high-value (because our requirements constantly change).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like @brianhenn mentioned, this feels overly verbose for CONTRIBUTING.md since the wall of text may dissuade humans from reading it! I like the sort of FAQ type high-level stuff under the first abstraction section. I'd vote for keeping that and the testing section since those would be most broadly relevant.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I agree. As a first pass, I'll keep those, but move everything else (including the part about refactoring using facades) to AGENTS.md and add a reference to it from here.

It's hard because I feel like this is text all humans should read >.< but I do agree with you.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.


For general Python style we follow the
[Google Python Style Guide](https://google.github.io/styleguide/pyguide.html).
The sections below cover what matters most in this project. See
[AGENTS.md](AGENTS.md) for more detailed guidelines on naming, configuration,
code organization, and other topics.

### Design: isolate responsibilities

When designing a change, think first about what absolutely must change, then
about what level of abstraction that change could be handled in. Choose the
option that splits the concern across the fewest levels of abstraction. When a
decision changes in the future, as little code as possible should need to be
touched. This does not mean minimizing the number of lines changed — you may
need to modify APIs at several levels to properly isolate a feature into one
level of abstraction.

- **Prefer polymorphism over type-checking.** If you see `if isinstance(x, A)
... elif isinstance(x, B) ...` chains, the behavior should be a method on
the types being checked.
- **Keep functions at one level of abstraction.** Don't mix high-level
orchestration with low-level details. Extract helpers when needed.
- **Centralize cross-cutting concerns.** Only `Distributed` should access
`torch.distributed`; other modules call `dist.method()`. Training concerns
(DDP, weight freezing, loss config) stay in training code.
- **Facade-first refactors.** When refactoring across multiple PRs, implement
the new class internally, keep the old class as a translation layer, then
swap in a final PR.

### Testing

- **Test helpers over copy-paste.** Create helper functions to build common
test fixtures. If the same setup appears 3+ times, extract a helper.
Comment on lines +61 to +62
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also worth noting we want to avoid using pytest fixtures, use helpers instead.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed that fixtures can quickly become unwieldy, but can also be very useful e.g. for sharing scope across tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated (see extension of bullet).

Prefer explicit helper functions over pytest fixtures, which can become
unwieldy; use fixtures only when sharing scope across tests is valuable.
- **Demonstrate bugs with failing tests.** When fixing a bug, add a test
that fails without the fix, then fix it.
- **Test behavior, not implementation.** If a test re-implements the logic
it's testing, it isn't actually verifying anything. Prefer tests that
cover important user-story-level behavior over tests that lock down
subjective API details, since the latter make it harder to evolve
interfaces. Both have a place, but use API-level tests in moderation.
- **Use xfail for known bugs.** Mark known issues with `pytest.mark.xfail`
rather than skipping them silently.
- **Exercise meaningful values.** Don't use all-ones for area weights or
trivial shapes that hide real bugs.
- **Regression tests for checkpoints.** Maintain regression checkpoints from
specific model releases. Use `strict=True` for state dict loading.

## Code Review

Automated CI tests must pass before your pull request will be reviewed. All
pull requests will be reviewed for:
- Correctness and functionality
- Code style, type-hinting and consistency
- Alignment with project goals
- Responsibility isolation and abstraction level consistency
- Appropriate test coverage

Please be responsive to feedback during the review process.

### Review comment conventions

One guide our team finds helpful is [Conventional Comments](https://conventionalcomments.org/).
Reviewers label comments to indicate priority:
- **Issue**: Must be addressed before merge.
- **Suggestion (optional)**: Worth considering but non-blocking.
- **Question**: Seeking clarification; may or may not require changes.
- **nit**: Minor style preference; does not require re-review.

### PR scope

- Keep PRs focused. Split cleanly separable changes into separate PRs.
- Non-blocking suggestions are often deferred to follow-on PRs.
- Config changes typically need multiple rounds of review.

## Developing

You can install a developer environment using the `conda` package manager with
Expand Down
Loading