SNOW-3066557: UnicodeDecodeError when executing SQL files with UTF-8 encoding on Japanese Windows #2759

@KazunoriMatsuzawa

Description

SnowCLI version

3.15.0

Python version

3.11 (embedded in PyApp)

Platform

Windows 10/11 (Japanese locale, CP932 default encoding)

What happened

Snowflake CLI fails to execute UTF-8 encoded SQL files that contain non-ASCII characters (Japanese comments) in a Japanese Windows environment, even with PYTHONUTF8=1 and PYTHONIOENCODING=utf-8 set.

Console output

Actual Result:
UnicodeDecodeError: 'cp932' codec can't decode byte 0x86 in position 88: illegal multibyte sequence

Error Traceback:
File "...\snowflake\cli\_plugins\sql\statement_reader.py", line 233, in files_reader
    stmts = split_statements(io.StringIO(f.read()), remove_comments)
UnicodeDecodeError: 'cp932' codec can't decode byte 0x86 in position 88: illegal multibyte sequence
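
The decode failure can be illustrated outside the CLI. The following is a minimal sketch, assuming a hypothetical comment text (the actual contents of test.sql are not shown in this report). It encodes the text as UTF-8, as an editor would when saving the file, and then decodes the same bytes with both codecs.

```python
# Hypothetical SQL text; the real test.sql from this report is not shown.
SQL_TEXT = "-- テスト\nSELECT 1;\n"


def decode_with(raw: bytes, encoding: str) -> str:
    """Decode raw bytes, returning a short error description on failure."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError as exc:
        return f"<{type(exc).__name__}: {exc.reason}>"


raw = SQL_TEXT.encode("utf-8")        # bytes as written by a UTF-8 editor
as_utf8 = decode_with(raw, "utf-8")   # matching codec: round-trips cleanly
as_cp932 = decode_with(raw, "cp932")  # wrong codec: error or garbled text

if __name__ == "__main__":
    # ascii() keeps the output printable on consoles that are not UTF-8.
    print(ascii(as_utf8))
    print(ascii(as_cp932))
```

Whatever the exact bytes are, decoding UTF-8 data as CP932 cannot return the original text: it either raises UnicodeDecodeError, as in the traceback above, or silently produces mojibake.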

How to reproduce

Steps to Reproduce:

Create a SQL file with UTF-8 encoding containing Japanese comments:
Save the file as test.sql (UTF-8 encoding)

Run the command:

Actual Result and Error Traceback: as shown under Console output above.

Expected Result:
SQL file should be read with UTF-8 encoding and executed successfully.

What I've Tried:

✅ Set PowerShell to UTF-8 (chcp 65001) - No effect
✅ Set environment variables PYTHONUTF8=1 and PYTHONIOENCODING=utf-8 - No effect
✅ Modified PowerShell profile with UTF-8 settings - No effect
✅ Removed Japanese comments from SQL file - Workaround successful

Root Cause Analysis:

The issue occurs at statement_reader.py:233, where SecurePath.read() is called without specifying an encoding. On Japanese Windows the default encoding is CP932, not UTF-8, so the UTF-8 bytes of the Japanese comments are decoded with the wrong codec.

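The environment variables PYTHONUTF8 and PYTHONIOENCODING had no effect here, apparently because PyApp-bundled executables use their own embedded Python runtime. Note that PYTHONIOENCODING only controls the encoding of stdin, stdout, and stderr, so it would not change how files are read in any case.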
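
To check which settings are actually in effect, a small diagnostic like the one below can help. This is a sketch using only standard-library calls; the function name `encoding_report` is made up for illustration, and the result is only meaningful when run by the same interpreter that runs the CLI (how to do that inside a PyApp bundle is not covered here).

```python
import locale
import sys


def encoding_report() -> dict:
    """Collect the settings that decide how text is decoded by default."""
    return {
        # Locale encoding that open() falls back to without encoding=...
        "preferred_encoding": locale.getpreferredencoding(False),
        # 1 when UTF-8 mode is active (PYTHONUTF8=1 or -X utf8), else 0.
        "utf8_mode": sys.flags.utf8_mode,
        # Encoding of stdout, which PYTHONIOENCODING can override.
        "stdout_encoding": getattr(sys.stdout, "encoding", None),
    }


if __name__ == "__main__":
    for name, value in encoding_report().items():
        print(f"{name}: {value}")
```

If `utf8_mode` is reported as 0 and `preferred_encoding` as cp932, the PYTHONUTF8 setting did not reach the interpreter, which would be consistent with the behavior described above.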

Proposed Solution:

Explicitly specify UTF-8 encoding when opening SQL files in statement_reader.py:

Alternatively, add encoding parameter support to the SecurePath.read_text() method.
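
Both options come down to passing an explicit encoding when the file is opened. The sketch below shows the idea with the built-in open(); `read_sql_text` is a hypothetical helper, not the actual statement_reader.py or SecurePath code, and the UTF-8 default is an assumption taken from the expected result above.

```python
import io
import tempfile
from pathlib import Path


def read_sql_text(path, encoding: str = "utf-8") -> io.StringIO:
    """Read a SQL file with an explicit encoding instead of the locale default.

    Hypothetical helper for illustration only.
    """
    with open(path, "r", encoding=encoding) as f:
        return io.StringIO(f.read())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sql_file = Path(tmp) / "test.sql"
        # Hypothetical file contents, written as UTF-8 bytes.
        sql_file.write_bytes("-- テスト\nSELECT 1;\n".encode("utf-8"))
        # ascii() keeps the output printable on consoles that are not UTF-8.
        print(ascii(read_sql_text(sql_file).getvalue()))
```

Keeping `encoding` as a parameter with a UTF-8 default would still allow callers to read files saved in another encoding, such as CP932, when they know that is what they have.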

Related Documentation:

Python's open() encoding parameter
A similar issue was resolved for icacls command output (SnowflakeCLI_EncodingError.md in user documentation)

Workaround:
Remove non-ASCII characters from SQL files or use SnowSQL instead of Snowflake CLI.
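
For the first workaround, it helps to know which lines need editing. The following is a minimal sketch, assuming the file is valid UTF-8; `non_ascii_lines` is a hypothetical helper name. It works on raw bytes so that it does not depend on the locale default encoding itself.

```python
def non_ascii_lines(raw: bytes) -> list[tuple[int, str]]:
    """Return (line_number, text) for each line with non-ASCII characters."""
    text = raw.decode("utf-8")  # assumes the file was saved as UTF-8
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if not line.isascii()
    ]


if __name__ == "__main__":
    # Hypothetical file contents; in practice use Path("test.sql").read_bytes().
    sample = "-- テスト\nSELECT 1;\n".encode("utf-8")
    for number, line in non_ascii_lines(sample):
        # ascii() keeps the output printable on consoles that are not UTF-8.
        print(number, ascii(line))
```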
