mtahle/s3syncy

s3syncy

Cross-platform, multithreaded S3 file synchronisation daemon.

Features

  • Continuous sync — watches directories for changes in real time (via watchdog) and runs periodic full scans as a safety net.
  • Daemon controls — start in background and control with stop, pause, resume, reload, daemon-status.
  • Multithreaded — configurable thread pool for parallel uploads/downloads.
  • Bandwidth throttling — token-bucket rate limiter (upload & download independently).
  • Resource-friendly — chunked streaming (no full-file buffering), optional soft memory cap, bounded thread pool.
  • Configurable — single config.yaml controls everything (S3 target, threads, bandwidth, conflict strategy, integrity, logging).
  • Gitignore-style exclusions — the .syncignore file uses the same pattern syntax as .gitignore.
  • Auto-reload — config and exclusion files are reloaded automatically on change.
  • Searchable local index — SQLite metadata database with full-text search on file paths and folder-prefix listing.
  • Conflict resolution — local_wins, remote_wins, newest_wins, or skip, with an optional .bak backup before overwriting.
  • Remote delete self-heal — if an object is deleted directly from S3 but still exists locally, the daemon restores it on the next scan.
  • Integrity checks — post-upload hash verification (MD5 via S3 ETag, or SHA256). Configurable reaction: warn, retry, or delete_remote.
  • Cross-platform — macOS, Linux, Windows (Python 3.10+).
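
The bandwidth throttling feature above names a token-bucket rate limiter. The README does not show s3syncy's own implementation, so the following is only a minimal sketch of the general technique; the class name `TokenBucket`, its parameters, and the choice of bytes as the token unit are assumptions made for illustration.

```python
import threading
import time


class TokenBucket:
    """Token-bucket limiter: one token is one byte, refilled at `rate` bytes/second.

    Hypothetical helper for illustration, not s3syncy's actual class.
    """

    def __init__(self, rate, capacity, clock=time.monotonic, sleep=time.sleep):
        self.rate = float(rate)          # refill speed in bytes per second
        self.capacity = float(capacity)  # largest burst the bucket can hold
        self._tokens = float(capacity)   # start with a full bucket
        self._clock = clock
        self._sleep = sleep
        self._last = clock()
        self._lock = threading.Lock()    # worker threads share one bucket

    def _refill(self):
        # Add tokens for the time elapsed since the last refill, up to capacity.
        now = self._clock()
        self._tokens = min(self.capacity, self._tokens + (now - self._last) * self.rate)
        self._last = now

    def consume(self, nbytes):
        """Block until `nbytes` tokens are available, then spend them."""
        if nbytes > self.capacity:
            raise ValueError("chunk larger than bucket capacity")
        while True:
            with self._lock:
                self._refill()
                if self._tokens >= nbytes:
                    self._tokens -= nbytes
                    return
                # Time needed for the missing tokens to accumulate.
                wait = (nbytes - self._tokens) / self.rate
            self._sleep(wait)  # sleep outside the lock so other threads can proceed
```

A transfer loop would call `consume(len(chunk))` before sending each chunk, with one bucket for uploads and another for downloads. How `upload_limit_mbps` converts to bytes per second is not specified in this README, so that conversion is left out.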

Quick Start

# Install from PyPI
pip install s3syncy

# Initialize configuration
s3syncy init

# Edit config.yaml with your S3 bucket and sync directories
# Then run:
s3syncy start -c config.yaml --background

# Show index statistics (use daemon-status for the daemon's own state)
s3syncy status -c config.yaml

CLI Commands

Command                                                  Description
s3syncy start -c config.yaml                             Start the sync daemon
s3syncy start -c config.yaml --background                Start daemon in background
s3syncy stop -c config.yaml                              Stop background daemon
s3syncy pause -c config.yaml                             Pause syncing (daemon stays alive)
s3syncy resume -c config.yaml                            Resume syncing after pause
s3syncy reload -c config.yaml                            Reload config + exclusions immediately
s3syncy daemon-status -c config.yaml                     Show daemon PID/running/state info
s3syncy search "report" -c config.yaml                   Search the index for files matching "report"
s3syncy ls "photos/2024" -c config.yaml                  List synced files under a path prefix
s3syncy pull "docs/file.pdf" ./local.pdf -c config.yaml  Download a single file from S3
s3syncy status -c config.yaml                            Show index statistics (total files, synced count, total size)
s3syncy init                                             Create starter config.yaml and .syncignore

Configuration

See config.yaml for full documentation. Key settings:

sync_dirs:
  - ~/Documents/sync
  - ~/Desktop/uploads

s3:
  bucket: "my-bucket"
  prefix: "backups"
  region: "us-east-1"

threads: 4
scan_interval_seconds: 300

bandwidth:
  upload_limit_mbps: 10    # 0 = unlimited
  download_limit_mbps: 0

conflict:
  strategy: "newest_wins"  # local_wins | remote_wins | newest_wins | skip
  backup_before_overwrite: true

integrity:
  enabled: true
  algorithm: "md5"         # md5 | sha256
  on_failure: "warn"       # warn | retry | delete_remote
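
To make the four `conflict.strategy` values concrete, here is a rough sketch of how such a decision could be expressed. It is not taken from s3syncy's source: the function names, the use of modification times as the comparison key, and the tie-breaking rule are all assumptions.

```python
import shutil
from pathlib import Path


def resolve_conflict(strategy, local_mtime, remote_mtime):
    """Return 'upload', 'download', or 'skip' for a file changed on both sides.

    Hypothetical helper; the strategy names come from config.yaml above.
    """
    if strategy == "local_wins":
        return "upload"
    if strategy == "remote_wins":
        return "download"
    if strategy == "newest_wins":
        # Assumption: on equal timestamps this sketch prefers the local copy.
        return "upload" if local_mtime >= remote_mtime else "download"
    if strategy == "skip":
        return "skip"
    raise ValueError(f"unknown conflict strategy: {strategy!r}")


def backup_before_overwrite(path):
    """Copy `path` to `<name>.bak` in the same directory and return the backup path."""
    src = Path(path)
    dst = src.with_name(src.name + ".bak")
    shutil.copy2(src, dst)  # copy2 also preserves timestamps where possible
    return dst
```

With `backup_before_overwrite: true`, a caller would presumably run the backup step before a download replaces a local file.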

When multiple sync_dirs are configured, one daemon handles all of them.
S3 keys are namespaced per root (for example Documents/file.txt, uploads-2/file.txt) to avoid collisions.
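
The `integrity` settings in the configuration above choose between MD5 (compared against the S3 ETag) and SHA256. The sketch below shows one way a local digest could be computed in chunks and compared with an ETag; the helper names are hypothetical. Note that an S3 ETag equals the object's MD5 only for single-part uploads without SSE-KMS or SSE-C encryption; multipart ETags carry a `-N` suffix and will not match a plain MD5.

```python
import hashlib


def file_digest(path, algorithm="md5", chunk_size=1024 * 1024):
    """Hash a file in fixed-size chunks so it is never fully loaded into memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def etag_matches(local_md5_hex, etag):
    """Compare a local MD5 with an S3 ETag, which arrives wrapped in double quotes."""
    return etag.strip('"').lower() == local_md5_hex.lower()
```

What happens on a mismatch (`warn`, `retry`, or `delete_remote`) is decided by `on_failure` and is not modelled here.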

.syncignore

Works exactly like .gitignore:

# OS junk
.DS_Store
Thumbs.db

# Build artefacts
node_modules/
__pycache__/
*.pyc

# Secrets
.env
*.pem
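
As a rough illustration of how patterns like the ones above select files, the following sketch uses only the standard library. It is a deliberate simplification, not s3syncy's matcher: it handles plain names, globs, and trailing-slash directory patterns, and does not implement negation (`!`), anchoring (`/`), or `**`. The function names are hypothetical.

```python
import fnmatch
from pathlib import PurePosixPath


def load_patterns(text):
    """Keep non-empty lines that are not comments."""
    lines = (line.strip() for line in text.splitlines())
    return [line for line in lines if line and not line.startswith("#")]


def is_ignored(rel_path, patterns):
    """Return True if any component of `rel_path` matches one of the patterns."""
    parts = PurePosixPath(rel_path).parts
    for pattern in patterns:
        if pattern.endswith("/"):
            # Directory pattern: test parent directories only, never the file name.
            name, candidates = pattern.rstrip("/"), parts[:-1]
        else:
            name, candidates = pattern, parts
        if any(fnmatch.fnmatchcase(part, name) for part in candidates):
            return True
    return False
```

For the full gitignore syntax, a dedicated library such as `pathspec` is a more faithful option; whether s3syncy uses it is not stated in this README.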

Signals (Unix)

  • SIGINT / SIGTERM — graceful shutdown (finish in-flight transfers, close index).
  • SIGHUP — reload config and exclusions.
  • SIGUSR1 — pause syncing.
  • SIGUSR2 — resume syncing.
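
The signal mapping above can be sketched with Python's standard `signal` module. This is an assumed structure for illustration, not s3syncy's code: `DaemonState` and `install_handlers` are hypothetical names, and the handlers only flip plain flags that a main loop would poll. It runs on Unix only, since `SIGHUP`, `SIGUSR1`, and `SIGUSR2` are not available on Windows.

```python
import signal


class DaemonState:
    """Plain flags for a main loop to poll (hypothetical names)."""

    def __init__(self):
        self.stop = False
        self.paused = False
        self.reload_requested = False


def install_handlers(state):
    """Map the signals listed above onto state flags. Call from the main thread."""

    def on_stop(signum, frame):
        state.stop = True  # main loop finishes in-flight work, then exits

    def on_reload(signum, frame):
        state.reload_requested = True

    def on_pause(signum, frame):
        state.paused = True

    def on_resume(signum, frame):
        state.paused = False

    signal.signal(signal.SIGINT, on_stop)
    signal.signal(signal.SIGTERM, on_stop)
    signal.signal(signal.SIGHUP, on_reload)
    signal.signal(signal.SIGUSR1, on_pause)
    signal.signal(signal.SIGUSR2, on_resume)
```

Keeping handlers this small avoids doing I/O or taking locks inside a signal handler; the actual shutdown, reload, and pause logic would live in the loop that reads the flags.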

Architecture

┌─────────────┐     events      ┌─────────────┐    ThreadPool    ┌──────────┐
│  watchdog   │ ──────────────▸ │   watcher   │ ──────────────▸ │  engine  │
│  (OS-level) │   debounced     │  (handler)  │   submit tasks   │ (upload/ │
└─────────────┘                 └──────┬──────┘                  │ download)│
                                       │                         └────┬─────┘
                          periodic     │                              │
                          full scan    ▼                              ▼
                                ┌─────────────┐              ┌──────────────┐
                                │   daemon    │              │   S3 (boto3) │
                                │ (main loop) │              │  + throttle  │
                                └─────────────┘              │  + integrity │
                                       │                     └──────────────┘
                                       ▼
                                ┌─────────────┐
                                │   SQLite    │
                                │   index     │
                                └─────────────┘
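
The diagram's "debounced" and "submit tasks" arrows can be illustrated with a small sketch. It assumes a simple quiet-period debounce and a `ThreadPoolExecutor`; the `Debouncer` class, the `dispatch` function, and the `transfer` callback are hypothetical names, and the real watcher and engine may work differently.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class Debouncer:
    """Collect change events and release each path once it has been quiet."""

    def __init__(self, quiet_seconds=1.0, clock=time.monotonic):
        self.quiet_seconds = quiet_seconds
        self._clock = clock
        self._last_seen = {}           # path -> time of its most recent event
        self._lock = threading.Lock()  # events may arrive from a watcher thread

    def notify(self, path):
        """Record an event; repeated events for one path collapse into one entry."""
        with self._lock:
            self._last_seen[path] = self._clock()

    def pop_ready(self):
        """Return and forget paths with no events in the last `quiet_seconds`."""
        now = self._clock()
        with self._lock:
            ready = [p for p, t in self._last_seen.items()
                     if now - t >= self.quiet_seconds]
            for p in ready:
                del self._last_seen[p]
        return ready


def dispatch(debouncer, pool, transfer):
    """Submit one `transfer(path)` task per settled path and return the futures."""
    return [pool.submit(transfer, path) for path in debouncer.pop_ready()]
```

A main loop could call `dispatch` on a short timer with `ThreadPoolExecutor(max_workers=threads)`, where `threads` is the value from config.yaml, so that a burst of writes to one file results in a single upload task.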

License

MIT
