@jaelliot
Contributor

  • Add explicit env.sync(force=True) before close to flush writes
  • Add 100ms delay on Windows after close when clear=True
  • Ensures file locks are released before directory cleanup
  • Fixes test_mailboxing PermissionError on Windows CI

Resolves: Pre-existing Windows CI test failure
Related: PR #1143

@jaelliot force-pushed the jay/fix-windows-lmdb-cleanup branch 2 times, most recently from 53a2e3b to 27a8eee on January 23, 2026 19:52
- Add explicit env.sync(force=True) before close to flush writes
- Implement exponential backoff retry (50ms, 100ms, 200ms, 400ms, 800ms)
- Catch PermissionError on Windows during cleanup
- Ensures file locks are released before directory removal
- Fixes test_essr_stream and test_essr_mbx PermissionError on Windows CI

Resolves: Pre-existing Windows CI test failure
Related: PR WebOfTrust#1143
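
For illustration, the retry schedule described in this commit message might look roughly like the sketch below. It is not the code in this PR: `remove_dir_with_backoff` and its `remove`/`sleep` parameters are hypothetical names chosen so the logic can be exercised without touching a real LMDB directory. The delays are the ones listed above (50ms through 800ms).

```python
import shutil
import time


def remove_dir_with_backoff(path, delays=(0.05, 0.1, 0.2, 0.4, 0.8),
                            remove=shutil.rmtree, sleep=time.sleep):
    """Remove `path`, retrying on PermissionError with growing delays.

    Hypothetical helper: the first attempt runs immediately, then one
    more attempt follows each delay in `delays`.
    """
    last_error = None
    for delay in (0.0, *delays):
        if delay:
            sleep(delay)  # give Windows time to release file handles
        try:
            remove(path)
            return
        except PermissionError as ex:
            last_error = ex  # keep the most recent failure for re-raising
    raise last_error
```

Passing `remove` and `sleep` in as parameters is only a testing convenience; in real cleanup code the defaults would be used.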
@jaelliot
Contributor Author

Local Windows Testing Results

I tested the Windows LMDB fix locally and got unexpected results that warrant discussion before merging.

Test Environment

  • OS: Windows 10 via WSL2 (Ubuntu 22.04)
  • Python: 3.14.2
  • Test Method: Ran pytest from Windows PowerShell against WSL filesystem (\\wsl$\Ubuntu-22.04\...)

Results

All LMDB-related tests PASSED ✅:

pytest tests/app/test_storing.py::test_mailboxing -v
# 1 passed in 1.57s

pytest tests/db/test_dbing.py -v  
# 4 passed in 0.89s

pytest tests/app/test_habbing.py -v
# 10 passed in 6.90s

Important Caveat ⚠️

This may not reflect true native Windows behavior because:

  1. WSL filesystem layer: Files accessed through \\wsl$\... go through WSL's translation layer
  2. Different locking semantics: WSL may handle file locks differently than native Windows NTFS
  3. CI uses native Windows: GitHub Actions runs on native Windows filesystem, not WSL

Why This Matters

The original CI failures were:

  • PermissionError: [WinError 32] The process cannot access the file because it is being used by another process
  • On native Windows filesystem in GitHub Actions

My local tests may be passing because WSL provides different file locking behavior.

Recommendation

We should push this fix and verify it works in the actual GitHub Actions CI environment (native Windows) before considering it resolved. The local WSL tests are encouraging but not definitive.

Code Changes Summary

Added to LMDBer.close():

  • 10ms delay after env.close() on Windows
  • Exponential backoff retry (50ms → 25s total)
  • Explicit garbage collection between retries
  • Better error classification (only retries on WinError 32/errno 13)
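
As a rough sketch of the error classification in the last bullet, assuming the check looks at `winerror` and `errno` on the caught exception: `is_lock_error` is a hypothetical name, and the real check in `LMDBer.close()` may differ.

```python
import errno

ERROR_SHARING_VIOLATION = 32  # WinError 32: file is in use by another process


def is_lock_error(ex):
    """Return True if `ex` looks like a file-lock failure worth retrying.

    Hypothetical helper: matches WinError 32 (Windows only) or errno 13.
    """
    if not isinstance(ex, OSError):
        return False
    # `winerror` only exists on Windows, so read it defensively.
    if getattr(ex, "winerror", None) == ERROR_SHARING_VIOLATION:
        return True
    return ex.errno == errno.EACCES  # errno 13, permission denied
```

A retry loop would call this on each caught exception and re-raise immediately when it returns False.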

- Downloads libsodium 1.0.20 stable MSVC build
- Extracts and copies DLL to System32 (GitHub Actions has admin)
- Fixes missing dependency that prevented pysodium imports on Windows CI
- Resolves: Pre-existing Windows CI import failures
@jaelliot force-pushed the jay/fix-windows-lmdb-cleanup branch from 27a8eee to e60b1c4 on January 23, 2026 21:29
- Remove complex error type checking that may have been failing
- Retry on ANY PermissionError/OSError during Windows cleanup
- Increase base delay to 100ms (was 50ms)
- Store last_error to ensure proper exception re-raising
- More aggressive retry stance: assume all errors are lock-related

The previous logic was too conservative in detecting lock errors,
causing immediate failures instead of retries.

- Change from '>=3.12.6' to exact '3.14.2'
- Ensures CI uses same Python version as successful local tests
- Eliminates version variance as a potential cause of test failures

- Windows blocks access to ports 5000-5999 with permission error
- Created windows_ports.py utility for cross-platform port mapping
- Updated test_forwarding.py to use available ports (8000+)
- Resolves: RuntimeError on HTTP server creation
- Resolves: AttributeError on socket.accept() (was NoneType)

This fixes the actual root cause discovered in native Windows testing:
the failures were NOT LMDB cleanup issues, but Windows network
port access restrictions.
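
A minimal sketch of what such a port mapping could look like, for readers without the diff at hand. The contents of `windows_ports.py` are not shown in this thread, so `map_port`, the fixed offset, and the `platform` parameter are all assumptions made for illustration.

```python
import sys

BLOCKED_PORTS = range(5000, 6000)  # range reported as blocked on Windows CI
PORT_OFFSET = 3000                 # assumed offset: moves 5000-5999 to 8000-8999


def map_port(port, platform=sys.platform):
    """Return a usable port for `platform` (hypothetical helper).

    On Windows, ports in the blocked range are shifted up by PORT_OFFSET;
    everywhere else the port is returned unchanged.
    """
    if platform == "win32" and port in BLOCKED_PORTS:
        return port + PORT_OFFSET
    return port
```

For example, `map_port(5632, platform="win32")` gives 8632, while the same call with `platform="linux"` leaves the port at 5632.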
@jaelliot marked this pull request as ready for review January 23, 2026 23:37
@jaelliot marked this pull request as draft January 23, 2026 23:38
@jaelliot
Contributor Author

Converting to draft - shifting focus per team guidance

After multiple iterations attempting to fix the Windows CI failures (LMDB cleanup, port binding issues), the tests are still failing. Per team discussion, I am switching focus to Sphinx documentation work, which has no platform-specific issues.

What was attempted:

  • Windows LMDB retry logic with exponential backoff
  • Windows port mapping (5000-5999 → 8000+)
  • Python version pinning to 3.14.2
  • libsodium CI installation

Current status:

  • Tests still fail with similar errors
  • Local Windows testing via WSL was not representative of CI environment
  • Will revisit if maintainers have guidance, but not blocking other work

Moving to Sphinx documentation PRs now.
