Bug Report
System Info
System:
OS: macOS 26.2
CPU: (10) arm64 Apple M4
Memory: 462.08 MB / 32.00 GB
Shell: 5.9 - /bin/zsh
Binaries:
Node: 22.17.1 - /Users/xxx/.nvm/versions/node/v22.17.1/bin/node
Yarn: 1.22.22 - /Users/xxx/.nvm/versions/node/v22.17.1/bin/yarn
npm: 10.9.2 - /Users/xxx/.nvm/versions/node/v22.17.1/bin/npm
pnpm: 10.13.1 - /Users/xxx/.nvm/versions/node/v22.17.1/bin/pnpm
bun: 1.2.19 - /opt/homebrew/bin/bun
Browsers:
Chrome: 144.0.7559.133
Safari: 26.2
Description
Rspack panics with a SIGABRT when building a project whose source files contain string literals with multibyte (CJK / Chinese) characters. The panic originates in hstr's WTF-8 implementation where a byte offset that is not on a character boundary is used to slice the string.
Panic output
Panic occurred at runtime. Please file an issue on GitHub with the backtrace below:
https://github.com/web-infra-dev/rspack/issues
panicked at index.crates.io-1949cf8c6b5b557f/hstr-3.0.3/src/wtf8/not_quite_std.rs:173:5:
index 0 and/or 14 in "## 环境信息\n+ 版本号:\n+ 账号:\n..." do not lie on character boundary
The string "## 环境信息" has the following byte layout (UTF-8):
# # (space) 环(3B) 境(3B) 信(3B) 息(3B)
0 1 2 3-5 6-8 9-11 12-14
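Byte offset 14 is the last byte of 息 (a 3-byte character spanning bytes 12-14), so it is not a character boundary. The valid boundaries in this prefix are 0, 1, 2, 3, 6, 9, 12 and 15, and slicing at any other offset makes Rust panic.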
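The boundary arithmetic above can be checked with a few lines of Rust. This is only a sketch: it uses the standard library's `str` as a stand-in for hstr's WTF-8 type (an assumption, since hstr has its own implementation), and the helper names `char_boundaries` and `checked_prefix` are hypothetical.

```rust
/// Returns every byte offset at which `s` can be sliced without panicking.
/// (Hypothetical helper for illustration; not part of hstr or rspack.)
fn char_boundaries(s: &str) -> Vec<usize> {
    let mut offsets: Vec<usize> = s.char_indices().map(|(i, _)| i).collect();
    offsets.push(s.len()); // the end of the string is also a valid boundary
    offsets
}

/// Non-panicking prefix slice: `None` when `end` is not a character boundary
/// or is past the end of the string.
fn checked_prefix(s: &str, end: usize) -> Option<&str> {
    s.get(..end)
}

fn main() {
    let heading = "## 环境信息";

    // Lists the offsets where slicing is allowed.
    println!("boundaries: {:?}", char_boundaries(heading));

    // Offset 14 falls inside 息, so it is rejected instead of panicking.
    println!("is_char_boundary(14): {}", heading.is_char_boundary(14));
    println!("checked_prefix(14): {:?}", checked_prefix(heading, 14));
    println!("checked_prefix(15): {:?}", checked_prefix(heading, 15));

    // By contrast, `&heading[..14]` would panic on std `str` as well.
}
```

Using `get` instead of index syntax turns the abort into a recoverable `None`, which may be a reasonable defensive measure at the call site while the source of the bad offset is tracked down.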
Reproduction
Any TypeScript/JavaScript source file containing a CJK string literal, e.g.:
// src/index.ts
const template = `## 环境信息
+ 版本号:
+ 账号:`;
Build it with rspack (directly or via rslib/rsbuild). The build will crash with SIGABRT.
Environment
| Component | Version |
|---|---|
| @rspack/core | ~1.7.5 |
| hstr | 3.0.3 |
| OS | macOS (Apple Silicon) |
Root cause hypothesis
hstr 3.0.3 introduced or changed WTF-8 string interning. Somewhere in the string-processing pipeline (likely during module parsing or stats generation), a byte offset derived from a JS code unit index or a regex match is used directly to slice a WTF-8 JsWord / Atom. For ASCII-only strings this accidentally works, but for multibyte characters the offset is not a valid UTF-8 boundary, triggering the panic.
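If the hypothesis holds, the missing step would be a conversion from a JS (UTF-16) code unit index to a UTF-8 byte offset before slicing. The sketch below shows what such a conversion could look like on a plain Rust `str`. The function name `utf16_index_to_byte_offset` is hypothetical, and nothing here is taken from the rspack or hstr source.

```rust
/// Converts a UTF-16 code unit index (the kind JavaScript reports) into a
/// UTF-8 byte offset into `s`.
/// Returns `None` if the index points into the middle of a surrogate pair
/// or lies past the end of the string.
/// (Hypothetical helper for illustration only.)
fn utf16_index_to_byte_offset(s: &str, utf16_index: usize) -> Option<usize> {
    let mut units_seen = 0usize; // UTF-16 code units before the current char
    for (byte_offset, ch) in s.char_indices() {
        if units_seen == utf16_index {
            return Some(byte_offset);
        }
        if units_seen > utf16_index {
            return None; // index fell inside the previous char's surrogate pair
        }
        units_seen += ch.len_utf16();
    }
    // The index may also point exactly at the end of the string.
    if units_seen == utf16_index {
        Some(s.len())
    } else {
        None
    }
}

fn main() {
    let heading = "## 环境信息";
    // The string is 7 UTF-16 code units long but 15 UTF-8 bytes long.
    println!("{:?}", utf16_index_to_byte_offset(heading, 7));
    // Code unit index 4 is the start of 境, which is byte offset 6.
    println!("{:?}", utf16_index_to_byte_offset(heading, 4));
}
```

For ASCII-only input the two index spaces coincide, which would explain why the problem only shows up with multibyte text.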
Workaround
Splitting the problematic string constant out of .ts source into a separate JSON/text resource file (loaded at runtime) avoids the panic, since rspack does not parse the string content of non-JS assets the same way.
Additional context
Reported via rslib issue reproduction. The panic is fully deterministic and reproducible on every build attempt.