dazzle-filekit

Cross-platform file operations with path handling, verification, and metadata preservation.

A Python toolkit for reliable file operations across Windows, Linux, and macOS. Handles path normalization between Git Bash, WSL, and native formats, file verification with multiple hash algorithms, and metadata-preserving copy/move operations.

Features

  • Cross-Platform Paths - Normalize between Git Bash (/c/...), WSL (/mnt/c/...), and native Windows/Unix paths via a single canonical normalize_cross_platform_path(path, *, resolve=False) entry point
  • Rich Metadata Preservation - The dazzle_filekit.metadata module captures Windows SDDL ACLs (JSON-serializable), NTFS creation time, Unix extended attributes, and attribute flag booleans; on restore, these are reapplied, with the creation time (ctime) set through pywin32's SetFileTime
  • File Operations - Copy, move, and manage files with metadata preservation
  • Atomic Write Primitives - atomic_write_text / atomic_write_json use tmp+rename for crash-safe config and manifest writes
  • Link-Safe Tree Copy - copy_tree_preserving_links wraps shutil.copytree(symlinks=True) with documented intent (never traverses junctions on Windows)
  • NTFS ADS Detection - platform.windows.detect_alternate_streams enumerates alternate data streams via FindFirstStreamW; has_significant_ads filters out browser Zone.Identifier noise
  • Correct Junction Detection - is_junction uses DeviceIoControl(FSCTL_GET_REPARSE_POINT) to distinguish real junctions (IO_REPARSE_TAG_MOUNT_POINT) from directory symlinks
  • File Verification - Calculate and verify file hashes (MD5, SHA1, SHA256, SHA512)
  • Disk Space Checking - Pre-flight space verification before operations
  • Platform Support - Windows, Linux, and macOS with platform-specific optimizations
  • UNC Path Detection - Native is_unc_path / get_path_type helpers; optional UNCtools peer for UNC ↔ drive-letter translation (see docs/unctools-integration.md)

Why dazzle-filekit?

While Python's standard library (shutil, pathlib, os) provides basic file operations, dazzle-filekit offers:

  • Metadata Preservation: Automatic preservation of timestamps, permissions, and extended attributes across platforms
  • Hash Verification: Built-in file verification with multiple hash algorithms (MD5, SHA1, SHA256, SHA512)
  • Cross-Platform Path Handling: Unified API for handling Windows UNC paths, network drives, and Unix paths
  • Batch Operations: Process entire directory trees with pattern matching and filtering
  • Safe Operations: Built-in conflict resolution, unique path generation, and error handling
  • Directory Comparison: Compare directory contents and verify file integrity across locations

dazzle-filekit was designed for applications requiring reliable file operations with verification, such as backup tools, file synchronization, and data preservation systems (like the preserve project).

Installation

pip install dazzle-filekit

Optional Dependencies

# UNCtools peer install (enables UNC ↔ drive-letter translation
# via user-side composition; filekit does not import unctools directly).
# See docs/unctools-integration.md for composition patterns.
pip install 'dazzle-filekit[unctools]'

# Development tools
pip install 'dazzle-filekit[dev]'

Quick Start

Cross-Platform Path Handling

from dazzle_filekit import (
    normalize_cross_platform_path,
    resolve_cross_platform_path,
    path_exists_cross_platform,
)

# Convert Git Bash style paths to native format
# On Windows: /c/Users/foo -> C:\Users\foo
# On Unix: C:\Users\foo -> /c/Users/foo
path = normalize_cross_platform_path("/c/Users/foo/file.txt")

# Also handles WSL paths: /mnt/c/Users/...
path = normalize_cross_platform_path("/mnt/c/Users/foo/file.txt")

# Resolve with probing: if the normalized path doesn't exist,
# tries alternate platform formats (WSL, MSYS, Windows)
path = resolve_cross_platform_path("/mnt/c/Users/foo/file.txt")

# Check if a cross-platform path exists (uses resolve internally)
if path_exists_cross_platform("/c/Users/foo/file.txt"):
    print("File exists!")
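
To make the drive-path mapping concrete, here is a minimal standard-library sketch of the Git Bash / WSL to Windows direction only. The helper name `to_windows_path` and its regular expression are hypothetical illustrations, not dazzle-filekit's implementation, which may handle more cases.

```python
import re

# Hypothetical helper (not part of dazzle-filekit): matches "/c/..." (Git Bash)
# and "/mnt/c/..." (WSL), capturing the drive letter and the remainder.
_DRIVE_PATTERN = re.compile(r"^(?:/mnt)?/([a-zA-Z])(?:/(.*))?$")


def to_windows_path(path: str) -> str:
    """Map a Git Bash or WSL drive-style path to native Windows form."""
    match = _DRIVE_PATTERN.match(path)
    if match is None:
        return path  # not a drive-style path; return it unchanged
    drive = match.group(1).upper()
    rest = match.group(2) or ""
    return f"{drive}:\\" + rest.replace("/", "\\")


print(to_windows_path("/c/Users/foo/file.txt"))      # C:\Users\foo\file.txt
print(to_windows_path("/mnt/c/Users/foo/file.txt"))  # C:\Users\foo\file.txt
print(to_windows_path("/usr/bin"))                   # /usr/bin
```

Paths that do not start with a single-letter drive segment, such as /usr/bin, pass through untouched in this sketch.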

Path Operations

from dazzle_filekit import normalize_path, find_files, is_unc_path

# Normalize paths (returns Path object)
path = normalize_path("/some/path/../file.txt")
print(repr(path))  # PosixPath('/some/file.txt') or WindowsPath('C:/some/file.txt')

# Find files with patterns (returns list of path strings)
files = find_files("/directory", patterns=["*.py", "*.txt"])

# Check UNC paths
if is_unc_path(r"\\server\share"):
    print("This is a UNC path")

File Operations

from dazzle_filekit import copy_file, collect_file_metadata, create_symlink

# Copy file with attribute preservation (timestamps, permissions, etc.)
success = copy_file("source.txt", "dest.txt", preserve_attrs=True)

# Collect file metadata (v0.2.4: returns SDDL ACLs on Windows,
# xattrs on Linux/macOS, ctime, and ISO timestamps alongside the raw floats)
metadata = collect_file_metadata("file.txt")
print(f"Size: {metadata['size']}, Modified: {metadata['timestamps']['modified_iso']}")

# Create symbolic link (cross-platform, with Windows fallbacks)
success = create_symlink("/path/to/target", "/path/to/link")

# Force replace existing link
success = create_symlink("/new/target", "/path/to/link", force=True)

Disk Space Checking

from dazzle_filekit import get_disk_usage, check_disk_space, ensure_disk_space

# Get disk usage statistics
usage = get_disk_usage("/path/to/check")
print(f"Total: {usage.total}, Free: {usage.free}, Used: {usage.used_percent:.1f}%")

# Check if space is available for an operation
has_space, required, available, message = check_disk_space(
    "/destination",
    required_bytes=1_000_000_000,  # 1GB
    safety_margin=0.1  # 10% extra margin
)

# Check space for a list of source files
has_space, message = ensure_disk_space(
    dest_path="/destination",
    source_paths=["/path/to/file1.zip", "/path/to/dir/"]
)
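
The pre-flight idea can be approximated with shutil.disk_usage. This sketch assumes the safety margin is applied as a multiplier on the required bytes (10% extra for 0.1); the helper name `has_enough_space` and its return shape are hypothetical, not the library's.

```python
import shutil


def has_enough_space(dest, required_bytes, safety_margin=0.1):
    """Hypothetical sketch: return (ok, needed, available) for a planned write.

    Assumes the margin is multiplicative: needed = required * (1 + margin).
    """
    needed = int(required_bytes * (1 + safety_margin))
    available = shutil.disk_usage(dest).free
    return available >= needed, needed, available


ok, needed, available = has_enough_space(".", 1_000_000_000)
print(f"ok={ok}, needed={needed}, available={available}")
```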

File Verification

from dazzle_filekit import calculate_file_hash, verify_file_hash

# Calculate hash
hash_value = calculate_file_hash("file.txt", algorithm="sha256")

# Verify against a known-good digest (here, the one just calculated)
expected_hash = hash_value
is_valid = verify_file_hash("file.txt", expected_hash, algorithm="sha256")
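
As an illustration of what hash verification involves, the following sketch hashes a file in chunks with hashlib so that large files are never loaded fully into memory. The names `hash_file` and `matches_hash` are hypothetical and do not describe dazzle-filekit's internals.

```python
import hashlib


def hash_file(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Hypothetical sketch: hex digest of a file, read in fixed-size chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_hash(path, expected, algorithm="sha256"):
    """Compare case-insensitively, ignoring surrounding whitespace."""
    return hash_file(path, algorithm) == expected.strip().lower()
```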

Atomic Writes (v0.2.4)

import datetime
from pathlib import Path

from dazzle_filekit import atomic_write_text, atomic_write_json

# Atomic text write (tmp + os.replace). Crash mid-write leaves the
# original file intact; readers see either the old or the new contents.
atomic_write_text("config.ini", "[section]\nkey=value\n")

# Atomic JSON write with sensible defaults. default=str handles
# datetime, Path, and other non-JSON-native types out of the box.
atomic_write_json("manifest.json", {
    "version": "1.0",
    "created_at": datetime.datetime.now(),
    "root": Path("/data"),
})
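
The tmp + os.replace pattern mentioned above can be sketched with the standard library as follows. This is an illustrative version under stated assumptions (temp file created in the target's directory, fsync before rename); the function name `write_text_atomically` is hypothetical and the library's own implementation may differ.

```python
import contextlib
import os
import tempfile


def write_text_atomically(path, text, encoding="utf-8"):
    """Hypothetical sketch: write to a temp file, then swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    # Same directory as the target, so the final rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding=encoding) as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())  # push data to disk before the rename
        os.replace(tmp_path, path)  # replaces any existing file in one step
    except BaseException:
        # On failure, remove the temp file and leave the original untouched.
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_path)
        raise
```

Keeping the temp file next to the target matters because os.replace cannot move a file across filesystems.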

Rich Metadata (v0.2.4)

from dazzle_filekit import metadata

# Collect rich metadata. On Windows this captures SDDL ACL strings
# (JSON-serializable), creation time, file attribute flags, and owner.
# On Linux/macOS it captures extended attributes (xattrs) as base64.
md = metadata.collect_file_metadata("important.txt")

# Save it as JSON alongside the file
import json
with open("important.txt.meta.json", "w") as f:
    json.dump(metadata.metadata_to_json(md), f, indent=2)

# Later, restore metadata to a copy (including Windows ctime)
metadata.apply_file_metadata("restored.txt", md)

# Check if the richer Windows code path is available
if metadata.is_win32_available():
    print("pywin32 present -- full SDDL/ctime/ADS support")

Link-Safe Tree Copy (v0.2.4)

from dazzle_filekit import copy_tree_preserving_links

# Copies the tree, preserving symlinks and junctions as links (never
# traversing them). Safe for copying source trees that may contain
# self-referential junctions on Windows.
copy_tree_preserving_links("src_tree", "dst_tree", dirs_exist_ok=True)
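
Since the Features list describes this function as a wrapper around shutil.copytree(symlinks=True), the underlying behavior can be sketched directly. The wrapper name `copy_tree_keep_links` is hypothetical, and the sketch requires Python 3.8+ for dirs_exist_ok.

```python
import shutil


def copy_tree_keep_links(src, dst):
    """Hypothetical sketch: copy a tree, recreating symlinks as links.

    With symlinks=True, copytree copies each link itself instead of
    following it, so a link that points back into the tree cannot
    cause unbounded recursion.
    """
    return shutil.copytree(src, dst, symlinks=True, dirs_exist_ok=True)
```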

API at a glance

The Quick Start above covers the most common use cases. For the full function-by-function reference, see docs/api-reference.md.

| Area | Key entry points |
| --- | --- |
| Paths | normalize_cross_platform_path(path, *, resolve=False) (canonical), resolve_cross_platform_path, path_exists_cross_platform, is_wsl() |
| File ops | copy_file, move_file, create_symlink, copy_tree_preserving_links, atomic_write_text, atomic_write_json |
| Metadata | dazzle_filekit.metadata: collect_file_metadata, apply_file_metadata, restore_windows_creation_time, compare_metadata, is_win32_available |
| Platform (Windows) | dazzle_filekit.platform.windows: detect_alternate_streams, has_significant_ads, is_admin |
| Disk space | get_disk_usage, check_disk_space, calculate_total_size, ensure_disk_space |
| Verification | calculate_file_hash, verify_file_hash, verify_files_with_manifest, compare_directories |
| UNC detection | is_unc_path, get_path_type (compose with UNCtools for translation; see docs/unctools-integration.md) |

Platform Support

See docs/platform-support.md for the full platform support matrix and platform-specific features.

| Platform | Status |
| --- | --- |
| Windows 10/11 | Tested |
| Linux | Tested |
| WSL / WSL2 | Tested |
| macOS | Expected to work |
| BSD | Expected to work |

Configuration

Logging

from dazzle_filekit import configure_logging, enable_verbose_logging
import logging

# Configure logging level
configure_logging(level=logging.DEBUG, log_file="dazzle-filekit.log")

# Or enable verbose logging
enable_verbose_logging()

Development

Setup Development Environment

git clone https://github.com/DazzleLib/dazzle-filekit.git
cd dazzle-filekit
pip install -e ".[dev]"

Run Tests

# Standard run
pytest tests/ -v --cov=dazzle_filekit

# Cross-platform cross-check (Windows + WSL Ubuntu from one command)
./scripts/run-cross-platform-tests.sh

Code Formatting

black dazzle_filekit tests
flake8 dazzle_filekit tests

Documentation

tests/test_import_stability.py is the automated canary that enforces docs/api-stability.md. If you rename or remove a locked symbol, that test will fail.

Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Part of DazzleLib

dazzle-filekit is part of the DazzleLib ecosystem of Python file manipulation tools.
