fix: prefer message body fileName over Content-Disposition for non-ASCII filenames #379

Open
meijing0114 wants to merge 1 commit into larksuite:main from meijing0114:fix/chinese-filename-encoding-364

@meijing0114
Contributor

Problem

Fixes #364

When downloading files with Chinese/CJK filenames from Feishu, the saved filenames are garbled (e.g., ã_ç_å_è_.pdf instead of 助英台.pdf).

Root cause: Feishu returns raw UTF-8 bytes in the filename field of the Content-Disposition header. Node.js HTTP clients decode header bytes as Latin1 per the HTTP/1.1 spec, so each byte of a multi-byte UTF-8 character becomes its own character and non-ASCII filenames come out garbled.
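
The garbling can be reproduced without any network traffic. The sketch below is illustrative only: `simulateLatin1Decode` is a hypothetical helper, not part of the plugin, and it assumes the header carries the filename's raw UTF-8 bytes as described above. It maps every UTF-8 byte to the character with the same code point, which is what a Latin1 decode does.

```typescript
// Hypothetical helper (not from the plugin): returns the string a Latin1
// decoder would produce from the UTF-8 bytes of `name`.
function simulateLatin1Decode(name: string): string {
  // encodeURIComponent emits the UTF-8 bytes of `name` as %XX escapes;
  // turning each escape into a single character mimics a Latin1 decode.
  return encodeURIComponent(name).replace(
    /%([0-9A-F]{2})/g,
    (_match: string, hex: string) => String.fromCharCode(parseInt(hex, 16)),
  );
}

const original = "助英台.pdf";
const garbled = simulateLatin1Decode(original);

// Three CJK characters take 3 UTF-8 bytes each, so the 7-character name
// turns into a 13-character string once each byte is its own character.
console.log(original.length, garbled.length);
```

Several of the resulting characters are non-printable control characters, which may be why tools that sanitize filenames show underscores in their place.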

Fix

The message body already contains the correct UTF-8 file_name, extracted during the converter phase into ResourceDescriptor.fileName. The fix is a one-line priority swap in media-resolver.ts:

- const fileName = result.fileName || res.fileName;
+ const fileName = res.fileName || result.fileName;

res.fileName (from message body, always correct UTF-8) is now preferred over result.fileName (from Content-Disposition header, potentially garbled). The header value still serves as a fallback for cases where the message body doesn't include a filename (e.g., image messages).
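
The new priority order can be sketched in isolation. The interfaces and the `pickFileName` function below are hypothetical stand-ins for the values used in media-resolver.ts, reduced to the one field that matters here; only the `||` expression itself comes from the diff.

```typescript
// Hypothetical, simplified shapes: only `fileName` is modelled.
interface ResourceDescriptorLike {
  fileName?: string; // extracted from the message body (UTF-8)
}

interface DownloadResultLike {
  fileName?: string; // parsed from the Content-Disposition header
}

// Prefer the message-body name; fall back to the header-derived name.
function pickFileName(
  res: ResourceDescriptorLike,
  result: DownloadResultLike,
): string | undefined {
  return res.fileName || result.fileName;
}

// File message: the body carries the name, so the header value is ignored.
console.log(pickFileName({ fileName: "助英台.pdf" }, { fileName: "garbled.pdf" }));

// Image message: no name in the body, so the header value is used.
console.log(pickFileName({}, { fileName: "image.png" }));
```

Because `||` treats an empty string as missing, an empty body filename also falls through to the header value.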

Why not fix the Content-Disposition parsing?

An earlier approach attempted Latin1→UTF-8 byte recovery on the header value, but that relies on heuristic detection (guessing whether a string is misencoded). Using the message body as the primary source is deterministic and zero-guesswork.
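
To illustrate why byte recovery involves guessing, here is a minimal sketch of one possible recovery heuristic. It is not the code from the earlier approach: `tryRecoverUtf8` is a hypothetical function that reinterprets a Latin1-decoded string as UTF-8 bytes and keeps the result only if those bytes form valid UTF-8.

```typescript
// Hypothetical heuristic (not the earlier PR's code): treat each character
// as one byte and try to decode the byte sequence as UTF-8.
function tryRecoverUtf8(headerValue: string): string {
  let percentEncoded = "";
  for (let i = 0; i < headerValue.length; i++) {
    const code = headerValue.charCodeAt(i);
    // A character above 0xFF cannot come from a Latin1 decode; leave as-is.
    if (code > 0xff) return headerValue;
    percentEncoded += "%" + ("0" + code.toString(16)).slice(-2);
  }
  try {
    // decodeURIComponent throws URIError when the bytes are not valid UTF-8.
    return decodeURIComponent(percentEncoded);
  } catch {
    return headerValue;
  }
}

// Misdecoded UTF-8 bytes of "助英台.pdf" are recovered.
const misdecoded = "\u00e5\u008a\u00a9\u00e8\u008b\u00b1\u00e5\u008f\u00b0.pdf";
console.log(tryRecoverUtf8(misdecoded) === "助英台.pdf"); // true

// A genuine Latin1 name that is not valid UTF-8 is left alone.
console.log(tryRecoverUtf8("caf\u00e9.txt") === "caf\u00e9.txt"); // true

// A genuine Latin1 name that happens to be valid UTF-8 is silently changed.
console.log(tryRecoverUtf8("\u00c3\u00a9.txt") === "\u00e9.txt"); // true
```

The last case is the ambiguity: the header alone cannot say whether `Ã©.txt` was the intended name or a misdecoded `é.txt`, whereas the message body states the name directly.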

…CII filenames (larksuite#364)

Feishu returns raw UTF-8 bytes in the Content-Disposition header filename
field. Node.js HTTP clients decode headers as Latin1 per HTTP/1.1 spec,
producing garbled filenames for Chinese/CJK characters.

The message body already contains the correct UTF-8 file_name extracted
during the converter phase. This fix simply swaps the priority:
res.fileName (message body) is now preferred over result.fileName
(Content-Disposition header).

Fixes larksuite#364
@CLAassistant

CLAassistant commented Apr 2, 2026

CLA assistant check
All committers have signed the CLA.

Development

Successfully merging this pull request may close these issues.

[Bug] Feishu plugin: Chinese filenames saved as garbled text (ã_ç_å_è_)