Getting fmt to the finish-line #2757
Comments
We use it only internally, because I don't want to have 7 different files with 5 lines in each.
I'm +0.5 on
Right now, they are broken based on width only. We can add other heuristics, for example "more than two elements of a pipeline will always break into multiple lines".
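To make the heuristic concrete, here is a minimal Python sketch of how it could sit on top of the width check. It assumes each pipeline stage has already been formatted into a string; `format_pipeline` and `max_width` are hypothetical names for illustration, not part of prql-compiler.

```python
def format_pipeline(stages, max_width=80):
    """Lay out already-formatted pipeline stages (hypothetical helper)."""
    inline = " | ".join(stages)
    # Proposed heuristic: more than two stages always break onto multiple lines.
    too_many = len(stages) > 2
    # Existing rule: break when the single-line form exceeds the width.
    too_wide = len(inline) > max_width
    if too_many or too_wide:
        return "\n".join(stages)
    return inline


print(format_pipeline(["from tracks", "take 10"]))
print(format_pipeline(["from tracks", "filter length > 200", "take 10"]))
```

The two rules are independent, so further heuristics could be added as extra boolean conditions without touching the width logic.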
One big gap I just realized — comments are erased! This is something we have to think about — we want to retain some aesthetics; for example:
...these two comments are in the same place as far as AST nodes go, but can't be treated the same.
Yup. My plan was to do something similar to what rustfmt does. But that's hard, so I haven't yet.
Interesting link, thanks for sharing. One option would be, since we're also in control of the parser, to have comments in the initial AST. During compilation, we could then run through a cheap
To share my thinking: part of the reason for spending time here is that, without getting all the way there, it's not that instrumentally useful. To auto-format files, we need it to produce reasonable PRQL. It doesn't have to be completely perfect (we're OK with small line-length changes etc.), but it needs to not lose information (e.g. comments), and be acceptable PRQL.
(I edited this a few times, was working through it in my own mind...) I just added this to the description:
As a reminder, the reason we require parentheses around function calls in
select x = (sum foo) ...but here, select x = sum foo # wrong!
Here,
I'm not sure it's possible to elide the parentheses in both without changing the syntax between an assign & alias, or resolving much later in the compilation pipeline.
To have a proper formatter, we do need to fix the comments. Your approach of having comments in the AST would probably not work well, because it would move comments around, as the AST would not capture where exactly the comment was (on the same line as code, on a new line, maybe indented?). This is why I think rustfmt's approach would be better in the long term. But we don't need a proper formatter, at least not now. In its current state, the formatter can serve as "the idiomatic PRQL standard", i.e. the definition of what we want idiomatic PRQL to look like. It can also be used to format book snippets (if they don't contain comments).
Hmmm, if I look at the output of rustfmt, it seems to retain:
...but that's all: it can't have different indentations, and can't have more than one linebreak! (this might still be too much to store in the AST)
Sure, that's nice :) But much less impactful than being able to format on every save!
The codegen framework is only partially working; I'm fixing it.
FYI I looked at adding comments to the lexer (not the parser), so we could use that to grab the comments. (#4094 was some of this) It's not difficult in the lexer, but it would involve having lots of
If we can get comments working, then we're within spitting distance of
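As a rough illustration of what a lexer-level pass could record, the Python sketch below collects each comment together with its placement: whether it trails code on the same line, and how far a standalone comment is indented. This is the information that, as noted earlier in the thread, an AST alone would not capture. The `Comment` type and `collect_comments` function are hypothetical, and the sketch assumes `#` never appears inside a string literal, which a real lexer would have to handle.

```python
from dataclasses import dataclass


@dataclass
class Comment:
    line: int       # 1-based source line
    text: str       # comment text, including the leading '#'
    trailing: bool  # True if code precedes the comment on the same line
    indent: int     # leading whitespace width; 0 for trailing comments


def collect_comments(source):
    """Return the comments in `source` with their placement.

    Assumption for this sketch: '#' never appears inside a string literal.
    """
    comments = []
    for number, line in enumerate(source.splitlines(), start=1):
        code, sep, rest = line.partition("#")
        if not sep:
            continue  # no comment on this line
        standalone = not code.strip()
        comments.append(
            Comment(
                line=number,
                text="#" + rest.rstrip(),
                trailing=not standalone,
                indent=len(code) if standalone else 0,
            )
        )
    return comments


source = "# top-level note\nfrom tracks  # trailing note\n  # indented note\ntake 10\n"
for comment in collect_comments(source):
    print(comment)
```

A formatter could then re-attach each comment by line and placement after laying out the code, rather than trying to hang it off an AST node.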
For future reference: I've posted a few screenshots with a general outline of how it is possible to implement formatting with comments on Discord. Also, I've ticked off a few things from the checklist:
I've done a few PRs to add important whitespace to the lexer; I think that's now complete. We still need to decide how we add that to the output (@aljazerzen gave a few options on Discord). Another thing we might want to add is a way of turning off formatting. That could be as simple as a comment such as
What's up?
After @aljazerzen built the foundations of this, I have filled a few gaps, and now all the examples format into correct PRQL. But we still have a few gaps in getting to something we can use everywhere.
In order to roll this out fully, it's quite important to be close to 100%, since all PRQL will be formatted this way: in the book, whenever someone saves a file in VS Code, whenever someone runs `pre-commit` (both in our repo and others'), etc. Without being close to 100%, it's not that instrumentally useful.

- `{x = (sum foo)}` (from feat: Format `UnOp` correctly #2803 (comment)). This is in the docs here. It might require understanding the grandparent node (need to think more)
- `module` breaks, which breaks the standard library (though I actually thought we were only going to do files as modules @aljazerzen ?) feat: codegen for module, type and annotations #2949
- `from t=tracks` or `from t = tracks`? I marginally preferred the former; given that we have fairly few separators, having tighter expressions groups things better. (but this is not a strong view, maybe +0.3, and only aesthetics) style: Use `foo=bar` style in codegen #2779
- prql/prql-compiler/tests/integration/snapshots/integration__fmt@distinct_on.prql.snap, Line 8 in 31c3e71
- `{(-foo)}` has extra parentheses: prql/prql-compiler/tests/integration/snapshots/integration__fmt@distinct_on.prql.snap, Lines 8 to 9 in 31c3e71. feat: Format `UnOp` correctly #2803
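For the extra-parentheses item, one generic way to avoid output like `{(-foo)}` is to have the printer wrap a sub-expression only when it binds more loosely than its parent. The Python sketch below shows that idea on a toy expression tree. The node types, the `STRENGTH` table and the `write` function are all hypothetical. It is not the rule prql-compiler uses, and PRQL's real requirements also depend on context (for example, function-call arguments), which this sketch ignores.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Ident:
    name: str


@dataclass
class UnOp:
    op: str
    operand: "Expr"


@dataclass
class BinOp:
    op: str
    left: "Expr"
    right: "Expr"


Expr = Union[Ident, UnOp, BinOp]

# Hypothetical binding strengths; a higher number binds tighter.
STRENGTH = {"+": 1, "-": 1, "*": 2, "/": 2}
UNARY_STRENGTH = 3


def write(expr, parent_strength=0):
    """Print `expr`, adding parentheses only when the parent binds tighter."""
    if isinstance(expr, Ident):
        return expr.name
    if isinstance(expr, UnOp):
        own = UNARY_STRENGTH
        text = expr.op + write(expr.operand, own)
    else:
        own = STRENGTH[expr.op]
        # The right operand gets own + 1 so that `a - (b - c)` keeps its parentheses.
        text = f"{write(expr.left, own)} {expr.op} {write(expr.right, own + 1)}"
    return f"({text})" if own < parent_strength else text


def write_tuple(items):
    """A tuple's braces already delimit each item, so items start at strength 0."""
    return "{" + ", ".join(write(item) for item in items) + "}"


print(write_tuple([UnOp("-", Ident("foo"))]))
print(write(UnOp("-", BinOp("+", Ident("a"), Ident("b")))))
```

Because a tuple item is written with a parent strength of 0, a lone unary expression is never wrapped, while a unary operator applied to a looser binary expression still gets its parentheses.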