fix: handle multi-byte UTF-8 characters in DeleteWhiteSpace by Yanhu007 · Pull Request #34 · Masterminds/goutils

Yanhu007 · 2026-04-15T03:10:14Z

Fixes #29

Problem

DeleteWhiteSpace iterates by byte index (str[i]) and converts each byte to a rune. For multi-byte UTF-8 characters (Chinese, Japanese, Korean, emoji, etc.), every byte of the encoding becomes its own rune, which produces mojibake:

DeleteWhiteSpace(" 测试 测试 ")
// Got:    "æµ\x8bè¯\x95æµ\x8bè¯\x95"
// Expect: "测试测试"
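
To make the failure mode concrete, here is a minimal sketch that contrasts the two iteration styles on a single character. The helper names bytesAsRunes and decodedRunes are hypothetical and exist only for this illustration; they are not part of goutils.

```go
package main

import "fmt"

// bytesAsRunes mirrors the pre-fix loop shape: index the string by byte
// and convert each byte to a rune. (Hypothetical helper for illustration.)
func bytesAsRunes(s string) []rune {
	out := make([]rune, 0, len(s))
	for i := 0; i < len(s); i++ {
		out = append(out, rune(s[i]))
	}
	return out
}

// decodedRunes uses a range loop, which decodes one UTF-8 sequence
// per iteration. (Hypothetical helper for illustration.)
func decodedRunes(s string) []rune {
	var out []rune
	for _, ch := range s {
		out = append(out, ch)
	}
	return out
}

func main() {
	s := "测" // encoded in UTF-8 as the three bytes E6 B5 8B
	fmt.Printf("%U\n", bytesAsRunes(s)) // [U+00E6 U+00B5 U+008B]
	fmt.Printf("%U\n", decodedRunes(s)) // [U+6D4B]
}
```

The byte-indexed version turns one character into three unrelated code points, and writing those back out as runes is what yields the garbled output above.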

Fix

Use range-based iteration, which decodes the string as UTF-8 and yields Unicode runes instead of raw bytes.

// Before
for i := 0; i < sz; i++ {
    ch := rune(str[i])  // BUG: treats each byte as a rune

// After
for _, ch := range str {  // correctly iterates runes
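
For context, a complete function built around the range loop might look like the sketch below. This is not the actual goutils implementation: the lowercase name deleteWhiteSpace, the use of strings.Builder, and the choice of unicode.IsSpace as the whitespace test are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// deleteWhiteSpace is a hypothetical stand-in for the fixed function.
// It removes every rune for which unicode.IsSpace reports true.
func deleteWhiteSpace(str string) string {
	if str == "" {
		return str
	}
	var b strings.Builder
	b.Grow(len(str)) // the result is never longer than the input
	for _, ch := range str { // range decodes one UTF-8 sequence per step
		if !unicode.IsSpace(ch) {
			b.WriteRune(ch)
		}
	}
	return b.String()
}

func main() {
	fmt.Printf("%q\n", deleteWhiteSpace(" 测试 测试 ")) // "测试测试"
	fmt.Printf("%q\n", deleteWhiteSpace("a b\tc\n"))   // "abc"
}
```

One caveat of this approach: for input that is not valid UTF-8, a range loop yields U+FFFD for each invalid byte, so such bytes would be replaced rather than preserved.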

All existing tests pass.

DeleteWhiteSpace iterated over string bytes (str[i]) and cast each byte to rune, which corrupts multi-byte UTF-8 characters (Chinese, Japanese, Korean, emoji, etc.).

DeleteWhiteSpace(" 测试测试 ") → "æµ\x8bè¯\x95æµ\x8bè¯\x95"

Fix: use range-based iteration, which correctly yields Unicode runes.

DeleteWhiteSpace(" 测试测试 ") → "测试测试"

Fixes Masterminds#29

Yanhu007 wants to merge 1 commit into Masterminds:master from Yanhu007:fix/delete-whitespace-utf8
