-
-
Notifications
You must be signed in to change notification settings - Fork 60
Language annotation support #343
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Language annotation support #343
Conversation
src/Nixfmt/Lexer.hs
Outdated
isLanguageIdentifier :: Text -> Bool | ||
isLanguageIdentifier content = | ||
let stripped = strip content | ||
in not (Text.null stripped) | ||
&& Text.length stripped <= 30 -- TODO: make configurable or remove limit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want a length limit here?
I added this logic as well as a test case for it, but I'm not sure if that's the intended behavior
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on
The rendering engine currently ignores all line length calculations involving comments, however this strongly depends on the assumption that comment tokens are always strictly followed by a line break.
-- @piegamesde #179 (comment)
I'm assuming yes.
As IIUC, the code responsible for line wrapping will assume this token is small and not worth wrapping, since it doesn't know how long comments actually are.
It's also possible that I've completely misunderstood this though!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is related to this comment
This is length limit for considering a inline comment a language notation block.
This is the reason this test case
nixfmt/test/diff/language-annotation/in.nix
Lines 100 to 103 in 6830219
# Edge case: very long language annotation | |
longAnnotation = /* this-is-a-very-long-language-annotation-that-might-affect-line-length */ '' | |
content | |
''; |
Gets formatted as a regular comment
nixfmt/test/diff/language-annotation/out.nix
Lines 105 to 109 in 6830219
# Edge case: very long language annotation | |
longAnnotation = # this-is-a-very-long-language-annotation-that-might-affect-line-length | |
'' | |
content | |
''; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is related to this comment
My interpretation is the reason we want a length limit on what is considered a "language annotation" is because we can't wrap lines based on line length when the line is long because of a comment.
If we had smarter line wrapping, then we probably wouldn't need a special case for "annotations" and could simply preserve more inline comments more generally.
If we had smarter line wrapping, your test case could probably render something like:
# Edge case: very long language annotation
longAnnotation =
/* this-is-a-very-long-language-annotation-that-might-affect-line-length */
''
content
'';
Or a similar example with some combos of long/short binding names and/or inline comments:
thisIsAVeryLongBindingNameThatWillAffectLineLengthForSure =
/* short */ ''
content
'';
thisIsALongBindingNameButNotTooLong = /* short */ ''
content
'';
thisIsAnotherLongBindingNameButNotTooLong =
/* medium-length-comment */ ''
content
'';
My argument is that the only reason we need to handle language annotations specially, is because we currently handle these kinda long comments poorly. Therefore, if we don't cap the length of the language annotations, they will suffer from the same issues: long comments being allowed to stay at the end of long lines.
...at least, if I understand correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it fine to hard code this limit here or should this come from some cli option?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it fine to hard code this limit here or should this come from some cli option?
I think it's niche enough not to need a CLI option, personally.
I'll let others review the implementation details, as my Haskell is very weak.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd be fine with removing the limit here. The formatter doesn't promise to keep lines under a certain length, this would just be another way to have a long line.
That said, I'm also totally OK with the code as written. We can always make tweaks in the future, and erring on the side of preserving the old behaviour is a safe choice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without a limit, we could end up with things like:
somewhat_long_name = /* very-long-comment-that-makes-this-line-so-long-you-cant-really-see-the-string-value */ "a moderate length string";
IIUC, the formatter will count that comment as having zero length when deciding whether to do any line wrapping.
Maybe there's a way we could have it so that normal comments are still not counted to the line length, but annotations are?
That sounds like the ideal solution, but I personally don't want to block this PR on it.
# Language annotation in function arguments | ||
processCode = builtins.readFile (/* lua */ '' | ||
return "Hello" | ||
''); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This input is not idempotent in master
Checking test/diff/language-annotation/in.nix …
<stdin>: Nixfmt is not idempotent. This is a bug in nixfmt. Please report it at https://github.com/NixOS/nixfmt
After one formatting:
builtins.readFile (
# lua
''
return "Hello"
''
)
After two:
builtins.readFile (
# lua
''
return "Hello"
'')
so we're also fixing this bug here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but also I can imagine inputs not covered by this PR which still exhibit this bug, so it might be worth tracking separately
Can you please outline the rough high-level approach taken here to make reading the code easier? |
Sure! The test cases on I added a new Most of the logic is on |
Looks like some test cases are still using the "old" formatting: https://github.com/NixOS/nixfmt/actions/runs/17923265870/job/50963146661?pr=343#step:6:1004 Is that why this is still a draft? Or are there other things to resolve too? |
@MattSturgeon Yes. I also want to get feedback regarding some edge cases that I decided how to handle, such as both of these are ignored: nixfmt/test/diff/language-annotation/in.nix Lines 44 to 53 in 6830219
nixfmt/test/diff/language-annotation/out.nix Lines 44 to 58 in 6830219
|
96b1c0e
to
0bde506
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you so much for working on this, I've been looking forward to this feature since before I joined the team!
I can't review the haskell, but from my perspective this PR LGTM. @piegamesde how does the haskell look to you?
FYI our next formatting team meeting is Tuesday 30th September. The meetings are public so you're welcome to attend: https://github.com/NixOS/nixfmt/tree/master/team
That link also has info about other things like our matrix room.
I'm assuming it'd be best to discuss this at the meeting prior to merge, but I'm fine with merging earlier if we get enough approvals.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just one minor tweak, please!
Thanks to @infinisil for reviewing the Haskell while I screen shared.
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/formatting-team-meeting-2025-09-30/70099/1 |
also update existing tests to match new logic
language annotations should remain as block comments when directly preceding string literals, while other block comments get converted to line comments. - Detect language annotations: single-line, non-doc comments with valid language identifiers - Preserve as `/* lang */` block comment syntax instead of converting to `# lang` line comments - Works with both regular strings `"..."` and indented strings `''...''`
0bde506
to
5438c2b
Compare
Initial work to get language annotation support working
This pr is not ready, but I created it mostly to get feedback on how we should handle edge cases and the direction I'm taking.
closes #179