Add linting and formatting to linkml-runtime #347

sneakers-the-rat · 2024-10-28T21:25:27Z

upstream_repo: dalito/linkml
upstream_branch: issue2578-fix-uri-in-snapshot

edit: not anymore ~~Builds on: #345~~

The non-modular PR is because the changes to update to 3.9 use the linting rules we're implementing here.

This sets us up with

The existing rules in linkml
UP - pyupgrade with an override for X | Y types until 3.9 gets dropped
T100 - no pdb imports or uses (i am guilty of this a lot)

I punted on a handful of rules for the tests module because there are a lot of violations and it's just in tests, those are mostly assigned values that aren't used, which are fine in tests if a little implicit.

Bigass diff, i'm aware, but that's why we add the linter rules, so future diffs are smaller :)

Review spots

To help out review, all the changes here are linter fixes with no functional change except:

linkml_runtime.utils.metamodelcore - There were a handful of invalid error handling blocks that looked like this:
```
try:
    # something
except TypeError:
    pass
except ValueError:
    pass
if not is_strict():
    return str(value)
raise e
```
And that isn't valid because e is undefined. It's also ambiguous whether or not those values should always pass, and we raise on any other error, or whether we raise those errors if in strict mode. I assumed the latter, so i changed that to
```
try:
    # something
except (TypeError, ValueError):
    if is_strict():
        raise

return str(value)
```
linkml_runtime.loaders.rdf_loader - there's a reference to pyld_jsonld_from_rdflib_graph and i'm not sure what that is, can't find it anywhere. so i marked it noqa
linkml_runtime.utils.permissiblevalueimpl:PvFormulaOptions has a default value that uses EnumDefinition. That would make a pretty nasty dependency loop if we fixed it, so i just commented it out. This prevented the module from being imported at all, so i suspect this is deprecated.
tests.__init__.py used an eval call to get a log level. eval should basically never be used, so i switched it to getattr

sneakers-the-rat · 2024-10-31T03:44:12Z

looks like we need to add a parameter to the upstream tests that lets us specify which branch to test against.

ialarmedalien · 2025-05-03T15:42:40Z

Is this PR still a going concern or has it been abandoned?

sneakers-the-rat · 2025-05-04T02:07:41Z

I just forgot about it, but it would be relatively straightforward to rebase it - it was always intended to be considered after #345 and that's in now so it could be done.

ialarmedalien · 2025-05-04T18:37:26Z

I was going to put in a PR with some ruff formatting/linting, but since this one is already done, it'd be much better to use this instead. Could you update it so it can be re-reviewed if necessary and merged? (I am happy to review.)

I think it'd make more sense to use ruff for code formatting -- the output is 99% identical to that of black and it'd be one less dependency.

Silvanoc · 2025-05-07T16:09:38Z

@sneakers-the-rat may I propose some changes to make this PR digestible for a reviewer? I would split it into smaller PRs, one per point below, each building upon the previous ones:

1. Add linter configurations

PR including:

Add the possibility of running ruff from tox, similarly to the way linkml does it.
Adapt the pyproject.toml configuration to use the desired ruff rules/linters.
Add lint-fix and format targets to the Makefile, similarly to those existing in the linkml/Makefile.
Add a pre-commit configuration similar to that provided in the linkml repository.

This should be a relatively easy review for someone with some ruff knowledge. Once such a PR is merged, everybody is capable of running it locally in an opt-in manner.

2. Lint and format

PR containing:

The result of running make lint-fix in a single commit, with that command documented in the commit message. This way any reviewer can reproduce it thanks to PR 1. That is an easy review, if the reviewer trusts the tool.
Alternatively, smaller commits can be created by manually running individual ruff commands, which should be documented in the commit message. Here again, reproducing it is peanuts for a reviewer.
Any additional beautification manually applied on top of the previous results.

A reviewer can simply check that the ruff configuration is fine and reproduce the run to confirm that all the changes were indeed generated by the tool. Such a review is also really easy, although it contains a huge amount of changes.

3. Enforce linting and formatting

PR containing:

Changes to the GitHub Workflows to add linter and formatter validation.

This is also easy to review and, if both previous PRs run well, should run really smoothly.

From that point on, we start practicing "code lookism" (don't forget never to practice real lookism!).

I'm taking the setup available in the https://github.com/linkml/linkml repo as a reference. If you disagree with it, please consider fixing it too.

I'm happy to help you on all of it, if desired!

ialarmedalien · 2025-05-07T17:03:58Z

See also the recently-implemented ruff formatting and linting in linkml-map.

Silvanoc · 2025-05-07T19:23:38Z

See also the recently-implemented ruff formatting and linting in linkml-map.

@ialarmedalien IMO the ruff rules that you use are the most interesting part. Apart from that I can see some relevant differences with the use of ruff in the linkml repo:

linkml-map is explicitly declaring ruff as a project dev dependency. I would also tend to do so, but linkml relies on tox to install it in the ephemeral virtual environments that it creates, therefore even the version is specified there.
linkml-map does not provide any pre-commit configuration, whereas linkml does. Keeping the ruff versions synchronous between pyproject.toml/tox.ini and .pre-commit-config.yaml is challenging, but I appreciate having a pre-commit configuration.
linkml-map does not provide any Makefile target for the linter.

ialarmedalien · 2025-05-08T03:39:00Z

@Silvanoc I mostly took the path of least change (least resistance) with the linkml-map set up but would prefer to have a standardised formatting/linting set up in as many of the repos as possible (ideally via template, but it would probably have to be manually configured for now).

My preference would be:

have ruff as a dev dependency (which also makes it available for local code editors)
add shortcuts (Makefile targets) for checking code formatting, linting, and codespell
pre-commit hooks and GitHub Actions for the above if they do not exist
remove formatting/linting/etc. from tox

That makes it easy for devs and contributors to "do the right thing".

Initial targets for this set up would be linkml-map, runtime, and the main linkml repo.


codecov · 2025-05-08T05:35:29Z

Codecov Report

Attention: Patch coverage is 60.84724% with 305 lines in your changes missing coverage. Please review.

Project coverage is 63.74%. Comparing base (00abef0) to head (4d89f20).

Files with missing lines	Patch %	Lines
linkml_runtime/linkml_model/datasets.py	0.00%	84 Missing ⚠️
linkml_runtime/linkml_model/validation.py	0.00%	25 Missing ⚠️
linkml_runtime/utils/schemaview_cli.py	0.00%	25 Missing ⚠️
linkml_runtime/linkml_model/mappings.py	0.00%	16 Missing ⚠️
linkml_runtime/processing/referencevalidator.py	74.46%	10 Missing and 2 partials ⚠️
linkml_runtime/utils/namespaces.py	60.00%	10 Missing and 2 partials ⚠️
linkml_runtime/utils/schemaview.py	84.61%	10 Missing and 2 partials ⚠️
linkml_runtime/utils/permissiblevalueimpl.py	0.00%	11 Missing ⚠️
linkml_runtime/utils/yamlutils.py	75.60%	9 Missing and 1 partial ⚠️
linkml_runtime/loaders/loader_root.py	65.21%	7 Missing and 1 partial ⚠️
... and 20 more

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #347      +/-   ##
==========================================
- Coverage   63.79%   63.74%   -0.06%     
==========================================
  Files          63       63              
  Lines        8946     8939       -7     
  Branches     2587     2589       +2     
==========================================
- Hits         5707     5698       -9     
  Misses       2633     2633              
- Partials      606      608       +2


sneakers-the-rat · 2025-05-08T05:55:15Z

rebased.

may I propose some changes to make this PR digestible for a reviewer? I would split it into smaller PRs, one per point below, each building upon the previous ones:

i appreciate the request, but it essentially amounts to completely redoing the PR, so i'm not going to do that. feel free to close this if it's unwanted. i've spent enough time on this doing it the first time and then rebasing it 7 months later.

there is no way to make a "reformat the whole package" PR smaller. i don't see the purpose of splitting out adding the linter rules and applying them. it's pretty easy to ctrl+f for "pyproject.toml" in the diff view.

The result of running make lint-fix in a single commit, with that command documented in the commit message. This way any reviewer can reproduce it thanks to PR 1. That is an easy review, if the reviewer trusts the tool.
Alternatively, smaller commits can be created by manually running individual ruff commands, which should be documented in the commit message. Here again, reproducing it is peanuts for a reviewer.
Any additional beautification manually applied on top of the previous results.

this is exactly how the PR is structured already, modulo rebasing.

Side note on pre-commit actions: ime they are exclusionary - pre-commit the package is sort of ridiculously wasteful of space and time, and not everybody has 100GB of free space to install the million docker images that pre-commit actions always seem to want to rely on. the purpose of CI is to lower barriers to contribution while ensuring code correctness. linting can run in CI. I don't think linkml will ever get to the code velocity where having an extra "lint" commit is actually impactful to anyone's work or understanding of the repo.

current test failures are mysterious to me. It looks like the current snapshots are incorrect, but not sure why this fixes them -
schema that's failing is this: https://github.com/linkml/linkml/blob/main/tests/test_utils/input/owl1.yaml
output snapshot that doesn't match: https://github.com/linkml/linkml/blob/main/tests/test_utils/__snapshots__/owl1.owl

notice how the URIs seem to be incorrectly generated with leading : in the generated output - as far as I know, for a schema with id: http://example.org/owl1 and some element slotopt, its uri should be http://example.org/owl1/slotopt rather than http://example.org/owl1/:slotopt
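
To make that expectation concrete, here is a small sketch of the check described above. Both helper names are hypothetical, and this is not how linkml actually derives element URIs; it only encodes the expected shape (schema id, one slash, element name) and flags the stray leading colon seen in the snapshot.

```python
def expected_element_uri(schema_id: str, element_name: str) -> str:
    """Join a schema id and an element name with exactly one slash."""
    return f"{schema_id.rstrip('/')}/{element_name}"


def has_stray_colon(uri: str) -> bool:
    """True if the last path segment starts with ':' as in the failing snapshot."""
    return uri.rsplit("/", 1)[-1].startswith(":")


expected = expected_element_uri("http://example.org/owl1", "slotopt")
observed = "http://example.org/owl1/:slotopt"
print(expected, has_stray_colon(expected), has_stray_colon(observed))
```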

ialarmedalien · 2025-05-08T11:51:26Z

This PR fixes the upstream test issues but it's awaiting the release of linkml-runtime. Gotta love tightly-coupled packages!

ialarmedalien · 2025-05-08T11:54:48Z

pyproject.toml


 [tool.ruff.lint.per-file-ignores]
-"tests/**/*.py" = ["F401"] # unused imports
+"tests/**/**.py" = ["F841", "E501", "F842", "E741"] # I ain't fixing all that


Nobody needs nitpicking on tests 😄

ialarmedalien · 2025-05-08T11:57:00Z

Do we need both black and ruff as code formatters? Due to the amount of work involved in generating this PR, I would be inclined to merge as-is and make a second PR that removes black and applies whatever formatting changes that introduces.

Apart from that, LGTM. Do we need another maintainer to take a look and approve?

Thank you for all your work here!

Silvanoc · 2025-05-08T13:02:42Z

i appreciate the request, but it essentially amounts to completely redoing the PR, so i'm not going to do that. feel free to close this if it's unwanted. i've spent enough time on this doing it the first time and then rebasing it 7 months later.

@sneakers-the-rat Sorry if I haven't shown enough sensitivity in my proposal. You invested significant time in this PR and I come proposing that you reformat the whole... As I explain further down, my proposal partially originates from a misunderstanding from my side.

I was just making a proposal to make a review easier to accomplish.

Let's switch roles. Imagine that I'm the author of this PR and you are considering reviewing it. So you would probably consider the following options:

1. Manually reviewing all the changes: you would probably discard it, because this PR is changing 156 files, adding 12,615 lines and removing 6,689.
2. Blindly trusting me and approving it: you would probably discard it too, because you deliver quality (you have shown it often enough) and wouldn't give your stamp of approval so blindly.
3. Discarding taking over the review: what is currently happening.
4. Contacting me to ask for a reviewable PR: that should be IMO the only real alternative to option 3.

there is no way to make a "reformat the whole package" PR smaller. i don't see the purpose of splitting out adding the linter rules and applying them. it's pretty easy to ctrl+f for "pyproject.toml" in the diff view.

Of course, option 4 does not necessarily mean proposing a full refactoring of this PR, and that was my error.

I should have asked first for the ruff rules that you have used for the changes so that they can be reproduced by a reviewer. Then you would have given me this answer and I would have realized how blind I've been 🤦🏻 and I would have agreed with you 😄

The result of running make lint-fix in a single commit, with that command documented in the commit message. This way any reviewer can reproduce it thanks to PR 1. That is an easy review, if the reviewer trusts the tool.
Alternatively, smaller commits can be created by manually running individual ruff commands, which should be documented in the commit message. Here again, reproducing it is peanuts for a reviewer.
Any additional beautification manually applied on top of the previous results.

this is exactly how the PR is structured already, modulo rebasing.

That's what I was expecting from you, but to be honest I could not easily identify in your commits which ones contained only changes made by ruff and which ones contained changes made by you.

Side note on pre-commit actions: ime they are exclusionary - pre-commit the package is sort of ridiculously wasteful of space and time, and not everybody has 100GB of free space to install the million docker images that pre-commit actions always seem to want to rely on. the purpose of CI is to lower barriers to contribution while ensuring code correctness. linting can run in CI. I don't think linkml will ever get to the code velocity where having an extra "lint" commit is actually impactful to anyone's work or understanding of the repo.

pre-commit is optional for developers who want to use it (like me), so I don't see a problem providing it. You don't like it and prefer to let CI tell you that something is wrong? Fine for you. But I personally hate it when I push, go grab a coffee or switch tasks, check CI after some time, and realize that the linter had told me after 30s that I'd spelled a word wrong.

Silvanoc · 2025-05-08T15:30:29Z

I would like to briefly illustrate what I had in mind.

This branch of mine has all the configurations to apply the same linting and formatting (it's the pyproject.toml of this branch) that you are applying. It's simple and easy to review. I, as a reviewer, would mostly look at the configured rules.

A developer fetching the above mentioned branch can run the linter and formatter and should get exactly this commit. A reviewer could simply reproduce it and accept the bunch of changes resulting from it.

The diff with the branch of this PR contains, in theory, only the changes that you've manually applied. That part should also be relatively easy to review.

Silvanoc

Pending tests fixing

Silvanoc · 2025-05-08T15:36:59Z

current test failures are mysterious to me. It looks like the current snapshots are incorrect, but not sure why this fixes them - schema that's failing is this: https://github.com/linkml/linkml/blob/main/tests/test_utils/input/owl1.yaml output snapshot that doesn't match: https://github.com/linkml/linkml/blob/main/tests/test_utils/__snapshots__/owl1.owl

notice how the URIs seem to be incorrectly generated with leading : in the generated output - as far as I know, for a schema with id: http://example.org/owl1 and some element slotopt, its uri should be http://example.org/owl1/slotopt rather than http://example.org/owl1/:slotopt

@sneakers-the-rat the issue with the tests is possibly related to linkml/linkml#2648. @dalito can you confirm it? If so, can it be easily fixed?

dalito · 2025-05-08T16:44:41Z

The tests failures are exactly as expected until linkml/linkml#2648 is merged.

@sneakers-the-rat could edit the first message of this PR to test against linkml/linkml#2648 to get all tests green (as I did here). If required, I can rebase that PR again.

sneakers-the-rat · 2025-05-09T02:55:02Z

"Apply linting rules from upstream"
59a8135

"Reformat with black"
d7d8975

"Apply ruff safe fixes"
74cec4f

"Mid ruff linting" aka manually fixing linter errors
6e3c85d

(A few smaller, labeled manual changes)

The one outlier step is repeating the above process for all the code that has changed since the rebase.

ed75b2d

The first commit has the commands needed to reproduce each of the programmatic stages of the PR - they are in the tox format action. The other changes that are not uncontroversially linter fixes are described in the OP. One can view the diff between any two commits on github using {repo}/compare/hash...hash
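
As a small illustration of the compare pattern, the helper below builds such a link. The function name is hypothetical and the repository URL is an assumption based on the PR title; the two hashes are the "Reformat with black" and "Apply ruff safe fixes" commits listed above.

```python
def compare_url(repo_url: str, base: str, head: str) -> str:
    """Build a GitHub compare URL of the form {repo}/compare/base...head."""
    return f"{repo_url.rstrip('/')}/compare/{base}...{head}"


# Assumed repository URL; swap in a fork or different hashes as needed.
print(compare_url("https://github.com/linkml/linkml-runtime", "d7d8975", "74cec4f"))
```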

so im not disagreeing about how taking those steps would make the PR more reviewable, im just saying I already did them.

@sneakers-the-rat could edit the first message of this PR to test against linkml/linkml#2648 to get all tests green (as I did #388 (comment)). If required, I can rebase that PR again.

Just did that, I love that thing.

Do we need black and ruff as code formatters?

I am mostly trying to get parity between upstream and this package for now, but afterwards if we want to remove black from both, I take no position on that but could be done together. Last I checked there were some minor places they disagreed with one another? But could be wrong, and it also might not matter. Idk if we take it out and ruff formats the same way, then that seems like it would be fine??

Silvanoc · 2025-05-09T08:39:15Z

so im not disagreeing about how taking those steps would make the PR more reviewable, im just saying I already did them.

I could not recognize it. So it's probably me who got overwhelmed by the surface (the huge amount of changes) without scratching it enough.

Anyway, I really appreciate the huge effort and, as always, very valuable contribution 🚀

The only intention of my proposal was lowering the review effort to ensure that it finally gets reviewed and merged.

cmungall approved these changes Oct 31, 2024


ialarmedalien mentioned this pull request May 7, 2025

SchemaView: format, lint autofix, add typing + docs #389

Merged

sneakers-the-rat added 11 commits May 7, 2025 22:11

python 3.13 update

c190a72

dang i guess ruff really just failed at that check and also removed a…

6165589

… bunch of actually used imports

add ruff, black, and codespell rules from upstreadm

59a8135

reformat with black

d7d8975

apply ruff safe fixes

74cec4f

mid ruff linting

6e3c85d

pyproject fixes - all rules, proper table format

24d9acf

lint action with fixed tox

25939ce

fixed linting

0ddddfd

fix import locations

18861ea

reapply linting after rebase

ed75b2d

sneakers-the-rat force-pushed the ruff-linting branch from 997c3e7 to ed75b2d on May 8, 2025 05:32

relock

4d89f20

ialarmedalien reviewed May 8, 2025


Silvanoc approved these changes May 8, 2025

