fix: using tokens from parser to handle asi #11577
📦 Binary Size-limit
❌ Size increased by 206.50KB from 47.23MB to 47.43MB (⬆️0.43%)
CodSpeed Performance Report
Merging #11577 will not alter performance.
Pull Request Overview
This PR fixes automatic semicolon insertion (ASI) handling by obtaining tokens directly from the parser during the parsing phase instead of collecting them with a separate lexer. This addresses compatibility issues that arose when migrating from the old context-aware lexer to a newer, faster lexer that carries less context and therefore struggles with ambiguous syntax when used outside of the parser.
- Removes the separate lexer instantiation and token collection approach
- Updates the parser to return both AST and tokens when requested
- Switches token imports from swc_ecma_lexer to swc_core::ecma::parser::unstable
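
To make the role of the token list concrete, here is a minimal sketch of how tokens can answer the question "was this statement ended by a written semicolon, or by ASI?". It is not the code in semicolon.rs. The `TokenAndSpan` struct, the `tok` helper, and `asi_positions` are hypothetical names, and the sketch assumes that a statement's end offset coincides with the end of its explicit `;` token when one was written.

```rust
/// Hypothetical stand-in for a token record. The field names echo the
/// token dumps in this PR, but this is NOT swc's actual type.
#[derive(Debug, Clone, PartialEq)]
struct TokenAndSpan {
    token: String,
    had_line_break: bool,
    lo: u32,
    hi: u32,
}

/// Small constructor helper (hypothetical) to keep examples short.
fn tok(token: &str, had_line_break: bool, lo: u32, hi: u32) -> TokenAndSpan {
    TokenAndSpan {
        token: token.to_string(),
        had_line_break,
        lo,
        hi,
    }
}

/// Given the end offsets of statements, return those offsets where no
/// explicit `;` token ends. Under the assumption stated above, these are
/// the places where a semicolon was inserted automatically.
fn asi_positions(tokens: &[TokenAndSpan], stmt_ends: &[u32]) -> Vec<u32> {
    stmt_ends
        .iter()
        .copied()
        .filter(|&end| !tokens.iter().any(|t| t.token == ";" && t.hi == end))
        .collect()
}
```

A check like this is only as reliable as the tokens it is given, which is why tokens that disagree with what the parser actually consumed can lead to wrong ASI decisions.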
Reviewed Changes
Copilot reviewed 5 out of 6 changed files in this pull request and generated 2 comments.
File | Description |
---|---|
crates/rspack_plugin_javascript/src/visitors/semicolon.rs | Updates token imports to use parser's unstable module |
crates/rspack_plugin_javascript/src/parser_and_generator/mod.rs | Removes separate lexer, updates parser call to collect tokens |
crates/rspack_plugin_javascript/Cargo.toml | Removes swc_ecma_lexer dependency |
crates/rspack_javascript_compiler/src/compiler/parse.rs | Adds token collection capability to parser methods |
crates/rspack_javascript_compiler/Cargo.toml | Adds ecma_parser_unstable feature and swc_ecma_lexer dependency |
Summary
There is a subtle difference between the old lexer and the new lexer, which is why we hit #11555 when migrating from the old lexer to the new one.
The key problem is that the new lexer cannot currently be used outside of the parser (a trade-off that is good for performance), so we need to collect tokens during the parse phase rather than use the old lexer, which is not only slower but also increases code size.
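
The idea of collecting tokens during the parse phase can be sketched with a small wrapper that records every token the parser pulls from its token source. This is a toy illustration only: `Capturing`, `parse_statement_count`, and `parse_with_tokens` are hypothetical names defined here, tokens are plain strings, and the "AST" is just a statement count.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical wrapper: forwards tokens from an inner source and keeps
/// a copy of each one that the consumer (the "parser") actually pulls.
struct Capturing<I: Iterator> {
    inner: I,
    captured: Rc<RefCell<Vec<I::Item>>>,
}

impl<I: Iterator> Capturing<I>
where
    I::Item: Clone,
{
    /// Returns the wrapper plus a shared handle to the captured tokens.
    fn new(inner: I) -> (Self, Rc<RefCell<Vec<I::Item>>>) {
        let captured = Rc::new(RefCell::new(Vec::new()));
        let wrapper = Capturing {
            inner,
            captured: Rc::clone(&captured),
        };
        (wrapper, captured)
    }
}

impl<I: Iterator> Iterator for Capturing<I>
where
    I::Item: Clone,
{
    type Item = I::Item;

    fn next(&mut self) -> Option<I::Item> {
        let tok = self.inner.next()?;
        // Record the token exactly as the parser sees it.
        self.captured.borrow_mut().push(tok.clone());
        Some(tok)
    }
}

/// Toy "parser": its result is simply the number of `;` tokens.
fn parse_statement_count<I: Iterator<Item = String>>(tokens: I) -> usize {
    tokens.filter(|t| t == ";").count()
}

/// Runs the toy parser and returns both its result and the tokens it consumed.
fn parse_with_tokens(source_tokens: Vec<String>) -> (usize, Vec<String>) {
    let (capturing, captured) = Capturing::new(source_tokens.into_iter());
    let result = parse_statement_count(capturing);
    let tokens = captured.borrow().clone();
    (result, tokens)
}
```

Because the tokens are recorded as the parser consumes them, they reflect whatever context-dependent decisions were made during parsing, with no second lexing pass.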
We can easily see the difference between lexer + collect and lexer + parser in the following example:
/a/g
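
Why `/a/g` is sensitive to context can be shown with a toy lexer: a `/` starts a regular expression literal where an expression is expected, and is a division operator otherwise. The sketch below is not swc's lexer and makes no claim about which tokens either swc lexer emits. The `Tok` enum, the `lex` function, and the `expr_expected` flag are hypothetical, with the flag standing in for the syntactic context a parser would supply.

```rust
/// Tokens of a toy lexer (hypothetical, unrelated to swc's token type).
#[derive(Debug, PartialEq)]
enum Tok {
    Ident(String),
    Slash,
    Regex { pattern: String, flags: String },
}

/// Lex `src`. `expr_expected` says whether an expression may start at the
/// beginning of the input; a real parser would know this from the grammar.
fn lex(src: &str, expr_expected: bool) -> Vec<Tok> {
    let chars: Vec<char> = src.chars().collect();
    let mut toks = Vec::new();
    let mut expr_expected = expr_expected;
    let mut i = 0;
    while i < chars.len() {
        let c = chars[i];
        if c == '/' {
            if expr_expected {
                // Try to read `/pattern/flags` as a single regex token.
                if let Some(off) = chars[i + 1..].iter().position(|&ch| ch == '/') {
                    let end = i + 1 + off;
                    let pattern: String = chars[i + 1..end].iter().collect();
                    let mut j = end + 1;
                    while j < chars.len() && chars[j].is_alphabetic() {
                        j += 1;
                    }
                    let flags: String = chars[end + 1..j].iter().collect();
                    toks.push(Tok::Regex { pattern, flags });
                    expr_expected = false;
                    i = j;
                    continue;
                }
            }
            // Otherwise treat `/` as the division operator.
            toks.push(Tok::Slash);
            expr_expected = true;
            i += 1;
        } else if c.is_alphanumeric() || c == '_' {
            let start = i;
            while i < chars.len() && (chars[i].is_alphanumeric() || chars[i] == '_') {
                i += 1;
            }
            toks.push(Tok::Ident(chars[start..i].iter().collect()));
            expr_expected = false;
        } else {
            // Whitespace and anything else is skipped in this sketch.
            i += 1;
        }
    }
    toks
}
```

With `expr_expected` set to true, `lex("/a/g", true)` yields one regex token; with it set to false, the same input becomes a slash, an identifier, a slash, and an identifier. A lexer running without the parser has to guess this flag.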
Let's look at a real-world case that shows how the two lexers behave differently.
code:
[
  TokenAndSpan { token: `, had_line_break: true, span: 1..2 },
  TokenAndSpan { token: template token (), had_line_break: false, span: 2..2 },
  TokenAndSpan { token: ${, had_line_break: false, span: 2..4 },
  TokenAndSpan { token: a, had_line_break: false, span: 4..5 },
  TokenAndSpan { token: }, had_line_break: false, span: 5..6 },
  TokenAndSpan { token: template token (b), had_line_break: false, span: 6..7 },
  TokenAndSpan { token: `, had_line_break: false, span: 7..8 },
  TokenAndSpan { token: ;, had_line_break: false, span: 8..9 },
]

[
  TokenAndSpan { token: `, had_line_break: true, span: 1..2 },
  TokenAndSpan { token: ${, had_line_break: false, span: 2..4 },
  TokenAndSpan { token: a, had_line_break: false, span: 4..5 },
  TokenAndSpan { token: }, had_line_break: false, span: 5..6 },
  TokenAndSpan { token: template token (b), had_line_break: false, span: 6..7 },
  TokenAndSpan { token: `, had_line_break: false, span: 7..8 },
  TokenAndSpan { token: ;, had_line_break: false, span: 8..9 },
]
Follow up
Related links
related to #11555 microsoft/typescript-go#1554
Checklist