fix: using tokens from parser to handle asi #11577

hardfist · 2025-09-04T05:25:26Z

Summary

There's a subtle difference between old lexer and new lexer,

the old lexer will carry context which makes itself can using lookahead to deal with ambiguous syntax(but it's still not 100% right)
new lexer will carry less context which makes it faster but also make it's hard to deal with ambiguous syntax out of parser right.

That's why we met #11555 when migrate from old lexer to new lexer.

The key problem is that the new lexer can not be used out of parser now(but it's good for performance), so we need to collect tokens during parse phase other than use the old lexer(which not only slow but also increase code size).

we can easily tell the difference between lexer + collect and lexer +paser in following example:

/a/g

lexer + collect

 [
    TokenAndSpan {
        token: /,
        had_line_break: true,
        span: 1..2,
    },
    TokenAndSpan {
        token: a,
        had_line_break: false,
        span: 2..3,
    },
    TokenAndSpan {
        token: /,
        had_line_break: false,
        span: 3..4,
    },
    TokenAndSpan {
        token: g,
        had_line_break: false,
        span: 4..5,
    },
    TokenAndSpan {
        token: ;,
        had_line_break: false,
        span: 5..6,
    },
]

lexer + parser

 [
    TokenAndSpan {
        token: /,
        had_line_break: true,
        span: 1..2,
    },
    TokenAndSpan {
        token: regexp literal (a, g),
        had_line_break: true,
        span: 2..5,
    },
    TokenAndSpan {
        token: ;,
        had_line_break: false,
        span: 5..6,
    },
]

Let's see some real world case to show the different behaviors of different lexers.
code:

`${a}b`;

old lexer + collect

[
    TokenAndSpan {
        token: `,
        had_line_break: true,
        span: 1..2,
    },
    TokenAndSpan {
        token: template token (),
        had_line_break: false,
        span: 2..2,
    },
    TokenAndSpan {
        token: ${,
        had_line_break: false,
        span: 2..4,
    },
    TokenAndSpan {
        token: a,
        had_line_break: false,
        span: 4..5,
    },
    TokenAndSpan {
        token: },
        had_line_break: false,
        span: 5..6,
    },
    TokenAndSpan {
        token: template token (b),
        had_line_break: false,
        span: 6..7,
    },
    TokenAndSpan {
        token: `,
        had_line_break: false,
        span: 7..8,
    },
    TokenAndSpan {
        token: ;,
        had_line_break: false,
        span: 8..9,
    },
]

old lexer + parser

[
    TokenAndSpan {
        token: `,
        had_line_break: true,
        span: 1..2,
    },
    TokenAndSpan {
        token: ${,
        had_line_break: false,
        span: 2..4,
    },
    TokenAndSpan {
        token: a,
        had_line_break: false,
        span: 4..5,
    },
    TokenAndSpan {
        token: },
        had_line_break: false,
        span: 5..6,
    },
    TokenAndSpan {
        token: template token (b),
        had_line_break: false,
        span: 6..7,
    },
    TokenAndSpan {
        token: `,
        had_line_break: false,
        span: 7..8,
    },
    TokenAndSpan {
        token: ;,
        had_line_break: false,
        span: 8..9,
    },
]

new lexer + collect(which will generate TokenError)

[
    TokenAndSpan {
        token: None,
        had_line_break: true,
        span: 1..4,
    },
    TokenAndSpan {
        token: <identifier>,
        had_line_break: false,
        span: 4..5,
    },
    TokenAndSpan {
        token: },
        had_line_break: false,
        span: 5..6,
    },
    TokenAndSpan {
        token: <identifier>,
        had_line_break: false,
        span: 6..7,
    },
    TokenAndSpan {
        token: token:error,
        had_line_break: false,
        span: 7..9,
    },
]

new lexer + parser

[
    TokenAndSpan {
        token: <template head>,
        had_line_break: true,
        span: 1..4,
    },
    TokenAndSpan {
        token: <identifier>,
        had_line_break: false,
        span: 4..5,
    },
    TokenAndSpan {
        token: },
        had_line_break: false,
        span: 5..6,
    },
    TokenAndSpan {
        token: ;,
        had_line_break: false,
        span: 8..9,
    },
]

Follow up

remove collect support for new lexer since it doesn't make sense now
collect asi directly using a asicapturing to avoid collect tokens

Checklist

Tests updated (or not required).
Documentation updated (or not required).

netlify · 2025-09-04T05:25:32Z

✅ Deploy Preview for rspack canceled.

Name	Link
🔨 Latest commit	`e9f7125`
🔍 Latest deploy log	https://app.netlify.com/projects/rspack/deploys/68c26684799d4a00087a86c4

github-actions · 2025-09-04T05:52:17Z

📦 Binary Size-limit

Comparing e9f7125 to fix: persistent cache update issuer is updated modules (#11633) by jinrui

❌ Size increased by 206.50KB from 47.23MB to 47.43MB (⬆️0.43%)

codspeed-hq · 2025-09-04T05:58:07Z

CodSpeed Performance Report

Merging #11577 will not alter performance

_{Comparing yj/fix-token (e9f7125) with main (d005492)¹}

🎉 Hooray! `codspeed-node` just leveled up to 4.0.1!

A heads-up, this is a breaking change and it might affect your current performance baseline a bit. But here's the exciting part - it's packed with new, cool features and promises improved result stability 🥳!
Curious about what's new? Visit our releases page to delve into all the awesome details about this new version.

Summary

✅ 17 untouched benchmarks

No successful run was found on main (195ce01) during the generation of this report, so d005492 was used instead as the comparison base. There might be some changes unrelated to this pull request in this report. ↩

github-actions · 2025-09-04T10:01:12Z

📝 Benchmark detail: Open

Name	Base (2025-09-04 37589c4)	Current	Change
10000_big_production-mode_disable-minimize + exec	29.1 s ± 568 ms	28.5 s ± 1.28 s	-2.30 %
10000_development-mode + exec	1.65 s ± 22 ms	1.64 s ± 39 ms	-0.94 %
10000_development-mode_hmr + exec	688 ms ± 5 ms	693 ms ± 24 ms	+0.76 %
10000_development-mode_noop-loader + exec	2.59 s ± 74 ms	2.59 s ± 54 ms	+0.15 %
10000_production-mode + exec	1.69 s ± 63 ms	1.71 s ± 25 ms	+0.83 %
10000_production-mode_persistent-cold + exec	1.87 s ± 77 ms	1.87 s ± 36 ms	+0.12 %
10000_production-mode_persistent-hot + exec	1.35 s ± 58 ms	1.37 s ± 41 ms	+1.14 %
arco-pro_development-mode + exec	1.76 s ± 95 ms	1.75 s ± 98 ms	-0.50 %
arco-pro_development-mode_hmr + exec	367 ms ± 1.6 ms	369 ms ± 6.9 ms	+0.64 %
arco-pro_production-mode + exec	3.18 s ± 114 ms	3.29 s ± 132 ms	+3.42 %
arco-pro_production-mode_generate-package-json-webpack-plugin + exec	3.3 s ± 113 ms	3.38 s ± 280 ms	+2.37 %
arco-pro_production-mode_persistent-cold + exec	3.26 s ± 108 ms	3.37 s ± 108 ms	+3.24 %
arco-pro_production-mode_persistent-hot + exec	2.02 s ± 75 ms	2.07 s ± 261 ms	+2.66 %
arco-pro_production-mode_traverse-chunk-modules + exec	3.25 s ± 67 ms	3.25 s ± 117 ms	-0.08 %
large-dyn-imports_development-mode + exec	1.87 s ± 26 ms	1.86 s ± 36 ms	-0.44 %
large-dyn-imports_production-mode + exec	1.88 s ± 11 ms	1.93 s ± 226 ms	+2.54 %
threejs_development-mode_10x + exec	1.54 s ± 99 ms	1.48 s ± 16 ms	-3.69 %
threejs_development-mode_10x_hmr + exec	970 ms ± 18 ms	937 ms ± 27 ms	-3.42 %
threejs_production-mode_10x + exec	4.57 s ± 274 ms	4.61 s ± 263 ms	+0.88 %
threejs_production-mode_10x_persistent-cold + exec	4.63 s ± 49 ms	4.73 s ± 315 ms	+2.20 %
threejs_production-mode_10x_persistent-hot + exec	4.08 s ± 44 ms	4.15 s ± 13 ms	+1.71 %
10000_big_production-mode_disable-minimize + rss memory	9944 MiB ± 416 MiB	9682 MiB ± 217 MiB	-2.64 %
10000_development-mode + rss memory	691 MiB ± 19 MiB	695 MiB ± 26.3 MiB	+0.52 %
10000_development-mode_hmr + rss memory	834 MiB ± 35.5 MiB	828 MiB ± 46.6 MiB	-0.67 %
10000_development-mode_noop-loader + rss memory	986 MiB ± 20.9 MiB	975 MiB ± 11.9 MiB	-1.06 %
10000_production-mode + rss memory	665 MiB ± 39.1 MiB	666 MiB ± 41.8 MiB	+0.11 %
10000_production-mode_persistent-cold + rss memory	777 MiB ± 31.6 MiB	778 MiB ± 39.8 MiB	+0.09 %
10000_production-mode_persistent-hot + rss memory	732 MiB ± 48.8 MiB	753 MiB ± 37.8 MiB	+2.88 %
arco-pro_development-mode + rss memory	595 MiB ± 42.5 MiB	581 MiB ± 28.2 MiB	-2.41 %
arco-pro_development-mode_hmr + rss memory	496 MiB ± 27 MiB	476 MiB ± 11.1 MiB	-4.10 %
arco-pro_production-mode + rss memory	695 MiB ± 112 MiB	694 MiB ± 65.8 MiB	-0.11 %
arco-pro_production-mode_generate-package-json-webpack-plugin + rss memory	750 MiB ± 41.2 MiB	719 MiB ± 110 MiB	-4.09 %
arco-pro_production-mode_persistent-cold + rss memory	830 MiB ± 76.2 MiB	786 MiB ± 34.3 MiB	-5.24 %
arco-pro_production-mode_persistent-hot + rss memory	653 MiB ± 53.1 MiB	646 MiB ± 85.1 MiB	-1.13 %
arco-pro_production-mode_traverse-chunk-modules + rss memory	701 MiB ± 64.2 MiB	689 MiB ± 70.8 MiB	-1.68 %
large-dyn-imports_development-mode + rss memory	712 MiB ± 6.51 MiB	704 MiB ± 1.26 MiB	-1.16 %
large-dyn-imports_production-mode + rss memory	613 MiB ± 8.31 MiB	601 MiB ± 3.05 MiB	-2.07 %
threejs_development-mode_10x + rss memory	621 MiB ± 15.6 MiB	602 MiB ± 13.8 MiB	-3.01 %
threejs_development-mode_10x_hmr + rss memory	856 MiB ± 56.6 MiB	858 MiB ± 67.3 MiB	+0.17 %
threejs_production-mode_10x + rss memory	822 MiB ± 213 MiB	850 MiB ± 179 MiB	+3.49 %
threejs_production-mode_10x_persistent-cold + rss memory	841 MiB ± 25.6 MiB	825 MiB ± 24.5 MiB	-1.88 %
threejs_production-mode_10x_persistent-hot + rss memory	737 MiB ± 33.7 MiB	696 MiB ± 36.1 MiB	-5.62 %

Copilot

Pull Request Overview

This PR fixes automatic semicolon insertion (ASI) handling by switching from using a separate lexer to collect tokens to obtaining tokens directly from the parser during the parsing phase. This addresses compatibility issues that arose when migrating from an old context-aware lexer to a newer, faster lexer that has less context but struggles with ambiguous syntax outside of the parser.

Removes the separate lexer instantiation and token collection approach
Updates the parser to return both AST and tokens when requested
Switches token imports from swc_ecma_lexer to swc_core::ecma::parser::unstable

Reviewed Changes

Copilot reviewed 5 out of 6 changed files in this pull request and generated 2 comments.

Show a summary per file

File	Description
crates/rspack_plugin_javascript/src/visitors/semicolon.rs	Updates token imports to use parser's unstable module
crates/rspack_plugin_javascript/src/parser_and_generator/mod.rs	Removes separate lexer, updates parser call to collect tokens
crates/rspack_plugin_javascript/Cargo.toml	Removes swc_ecma_lexer dependency
crates/rspack_javascript_compiler/src/compiler/parse.rs	Adds token collection capability to parser methods
crates/rspack_javascript_compiler/Cargo.toml	Adds ecma_parser_unstable feature and swc_ecma_lexer dependency

_{Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.}

crates/rspack_plugin_javascript/src/parser_and_generator/mod.rs

crates/rspack_javascript_compiler/src/compiler/parse.rs

github-actions bot added release: bug fix release: bug related release(mr only) team The issue/pr is created by the member of Rspack. labels Sep 4, 2025

hardfist force-pushed the yj/fix-token branch from 24aa288 to bedc58d Compare September 4, 2025 05:27

hardfist force-pushed the yj/fix-token branch from 7e8715e to ef318d8 Compare September 4, 2025 08:47

hardfist force-pushed the yj/fix-token branch from 853f714 to 12187ce Compare September 4, 2025 10:40

hardfist mentioned this pull request Sep 9, 2025

Support onInsertedSemicolon callback when parsing swc-project/swc#11053

Closed

fix: get tokens from parser

7a90a92

hardfist force-pushed the yj/fix-token branch from d1d9cb8 to 7a90a92 Compare September 11, 2025 02:38

chore: don't call capturing if with_tokens false

e9f7125

hardfist marked this pull request as ready for review September 11, 2025 06:05

hardfist requested a review from quininer as a code owner September 11, 2025 06:05

Copilot AI review requested due to automatic review settings September 11, 2025 06:05

Copilot AI reviewed Sep 11, 2025

View reviewed changes

crates/rspack_plugin_javascript/src/parser_and_generator/mod.rs Show resolved Hide resolved

crates/rspack_javascript_compiler/src/compiler/parse.rs Show resolved Hide resolved

CPunisher approved these changes Sep 11, 2025

View reviewed changes

quininer approved these changes Sep 11, 2025

View reviewed changes

hardfist merged commit da5778e into main Sep 11, 2025
42 of 44 checks passed

hardfist deleted the yj/fix-token branch September 11, 2025 07:06

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

fix: using tokens from parser to handle asi #11577

fix: using tokens from parser to handle asi #11577

Uh oh!

hardfist commented Sep 4, 2025 •

edited

Loading

Uh oh!

netlify bot commented Sep 4, 2025 •

edited

Loading

Uh oh!

github-actions bot commented Sep 4, 2025 •

edited

Loading

Uh oh!

codspeed-hq bot commented Sep 4, 2025 •

edited

Loading

Uh oh!

github-actions bot commented Sep 4, 2025 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

fix: using tokens from parser to handle asi #11577

fix: using tokens from parser to handle asi #11577

Uh oh!

Conversation

hardfist commented Sep 4, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Follow up

Related links

Checklist

Uh oh!

netlify bot commented Sep 4, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

✅ Deploy Preview for rspack canceled.

Uh oh!

github-actions bot commented Sep 4, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

📦 Binary Size-limit

Uh oh!

codspeed-hq bot commented Sep 4, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

CodSpeed Performance Report

Merging #11577 will not alter performance

🎉 Hooray! codspeed-node just leveled up to 4.0.1!

Summary

Footnotes

Uh oh!

github-actions bot commented Sep 4, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull Request Overview

Reviewed Changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

hardfist commented Sep 4, 2025 •

edited

Loading

netlify bot commented Sep 4, 2025 •

edited

Loading

github-actions bot commented Sep 4, 2025 •

edited

Loading

codspeed-hq bot commented Sep 4, 2025 •

edited

Loading

🎉 Hooray! `codspeed-node` just leveled up to 4.0.1!

github-actions bot commented Sep 4, 2025 •

edited

Loading