FIX: Write text files with utf-8 encoding instead of utf-8-sig #1531
scott-huberty wants to merge 3 commits into mne-tools:main
Conversation
Codecov Report
❌ Patch coverage is

Additional details and impacted files:

@@           Coverage Diff            @@
##             main    #1531    +/-   ##
==========================================
- Coverage   97.00%   96.98%   -0.02%
==========================================
  Files          43       43
  Lines       10669    10679      +10
==========================================
+ Hits        10349    10357       +8
- Misses        320      322       +2
hoechenberger left a comment:
Hi, great so far!
Except for JSON reading: we should NOT support reading JSON with a BOM, as this simply isn't valid JSON. We should fail hard in this case.
As for the changelog update: I would think that the only change relevant to users is TSV writing. I would be specific about this and omit mentioning that other text files are affected as well. But this is just my personal view.
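Failing hard on a BOM-prefixed JSON file could look like the following sketch. This is not the PR's actual code; the `read_json_strict` helper name and error message are illustrative.

```python
import codecs
import json


def read_json_strict(fname):
    """Read a JSON file, failing hard if it starts with a UTF-8 BOM.

    JSON must not carry a byte-order mark (RFC 8259), so we reject it
    explicitly instead of silently stripping it via the "utf-8-sig" codec.
    """
    with open(fname, "rb") as fid:
        raw = fid.read()
    if raw.startswith(codecs.BOM_UTF8):
        raise ValueError(
            f"{fname} starts with a UTF-8 byte-order mark, which is not "
            "valid JSON. Please re-save the file as plain UTF-8."
        )
    return json.loads(raw.decode("utf-8"))
```

Checking the raw bytes up front gives a clear, actionable error message rather than the generic `JSONDecodeError` that `json.loads` would raise on the stray `\ufeff` character.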
np.loadtxt with encoding="utf-8-sig" crashes on TSV files that contain Latin-1 characters such as µ (micro-sign, 0xB5), which is common in European datasets for channel units like "µV". Add a try/except UnicodeDecodeError that retries with latin-1 encoding and emits a warning. This is related to open issue mne-tools#1530 and PR mne-tools#1531. Discovered via OpenNeuro datasets during eegdash batch ingestion.
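The retry described above could be sketched as follows. The `loadtxt_with_fallback` wrapper name and warning text are illustrative, not taken from the PR:

```python
import warnings

import numpy as np


def loadtxt_with_fallback(fname, **kwargs):
    """Load a text file as UTF-8, falling back to Latin-1 with a warning.

    Handles TSV files containing Latin-1 bytes such as 0xB5 ("µ", common
    in channel units like "µV"), which are not valid UTF-8 and make
    np.loadtxt raise UnicodeDecodeError.
    """
    try:
        return np.loadtxt(fname, encoding="utf-8-sig", **kwargs)
    except UnicodeDecodeError:
        warnings.warn(
            f"{fname} is not valid UTF-8; retrying with latin-1 encoding. "
            "Consider re-saving the file as UTF-8.",
            RuntimeWarning,
        )
        return np.loadtxt(fname, encoding="latin-1", **kwargs)
```

Latin-1 maps every byte value to a code point, so the retry cannot raise another decode error, though it may silently mis-decode files that are actually in some other encoding.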
OK! Addressed in a2f71e2. I expanded my two encoding constants into a little class, now that we have different encoding rules for TSV vs JSON I/O. To me this feels like a clean approach, but if folks think it is overkill, feel free to let me know.
Fixes #1530 cc @sappelhoff @hoechenberger
Hope you don't mind that I added 2 new constants to config.py. I think this makes the code intent clearer and will make updating these encodings easier in the future.