scanc = scan c(ode)
A fast, pure‑Python project code‑scanner that outputs clean, AI‑ready Markdown or XML.
scanc helps you spill an entire codebase into an LLM prompt (or a file) in seconds—while keeping noise low, controlling token budgets, and giving you full visibility.
| Feature | Description |
|---|---|
| Blazing Fast, Pure‑Python | Zero native dependencies; easy to install and run anywhere. |
| Smart Default Ignores | Automatically skips node_modules, .venv, .git, and more. |
| Flexible Filters | Include/exclude by extension, filename, or regex patterns. |
| Optional Directory Tree | Prepend a fenced tree diagram of your project structure. |
| Token Counter | Estimate LLM token costs with tiktoken before you paste. |
| Cross‑Platform CLI | Works on macOS, Linux, and Windows out of the box. |
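The default ignores and filters in the table above can be pictured as a small predicate over file paths. The sketch below is illustrative only and is not scanc's actual implementation: the function name `should_include`, its parameters, and the three-entry `DEFAULT_EXCLUDES` tuple are assumptions, and it assumes regex patterns are searched anywhere in the path.

```python
import re
from pathlib import PurePosixPath

# Assumption: only the directories named in the feature table; the real list is longer.
DEFAULT_EXCLUDES = ("node_modules", ".venv", ".git")


def should_include(path, exts=None, include_regex=None, exclude_regex=None,
                   use_default_excludes=True):
    """Return True if `path` passes every filter (hypothetical helper)."""
    p = PurePosixPath(path)
    # Skip anything that lives under a default-ignored directory.
    if use_default_excludes and any(part in DEFAULT_EXCLUDES for part in p.parts):
        return False
    # Extension filter: compare the suffix without its leading dot.
    if exts and p.suffix.lstrip(".") not in exts:
        return False
    # Regex filters are applied to the whole path string.
    if include_regex and not re.search(include_regex, path):
        return False
    if exclude_regex and re.search(exclude_regex, path):
        return False
    return True


if __name__ == "__main__":
    for candidate in ["src/app.py", "node_modules/lib/index.js", "tests/test_app.py"]:
        print(candidate, should_include(candidate, exts={"py", "js"}, exclude_regex="tests"))
```

Checking the cheap directory test first means ignored trees are rejected before any regex runs.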
# Optional: Use a virtual environment
python3 -m venv --prompt scanc-env .venv
source .venv/bin/activate
pip install scanc[tiktoken] # installs optional token‑counter support

Scan a directory and emit Markdown:
scanc . # scan current folder
scanc -e py,js --tree # only .py and .js files + directory tree
scanc -f xml # output the scan in XML format (new in v1.2.0)
scanc -e py -x "tests" | less # only .py files, excluding paths that match "tests"
scanc --tokens gpt-4o # print only the token count for gpt-4o
scanc -e py | pbcopy # scan and copy to the clipboard (pbcopy is macOS-only)

Write output directly to a file:
scanc -e ts --tree -o scan.md src/
cat scan.md

scanc [OPTIONS] [PATHS...]

| Option | Description |
|---|---|
| -e, --ext EXTS | Comma‑separated extensions to include (e.g. py,js). |
| -i, --include-regex | Regex patterns to include (full path match). |
| -x, --exclude-regex | Regex patterns to exclude (full path match). |
| --no-default-excludes | Disable built‑in ignore list. |
| -t, --tree | Prepend directory tree (fenced code block). |
| -T, --tokens MODEL | Output only token count for given LLM model. |
| --max-size BYTES | Skip files larger than BYTES (default 1 MiB). |
| --follow-symlinks | Traverse symlinks when scanning. |
| -o, --out OUTFILE | Write result to OUTFILE instead of stdout. |
| -f, --format FORMAT | Output format (default: markdown). |
| -V, --version | Show version and exit. |
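If you want to drive the CLI from a script, one option is to assemble the argument list in Python and hand it to `subprocess`. This is a minimal sketch: it uses only flags listed above, and the helper names `build_scanc_command` and `run_scanc` are hypothetical, not part of scanc.

```python
import subprocess


def build_scanc_command(paths, exts=None, tree=False, out=None, fmt=None, max_size=None):
    """Build an argv list for the scanc CLI (hypothetical helper)."""
    cmd = ["scanc"]
    if exts:
        cmd += ["-e", ",".join(exts)]  # comma-separated extensions
    if tree:
        cmd.append("--tree")
    if fmt:
        cmd += ["-f", fmt]
    if max_size is not None:
        cmd += ["--max-size", str(max_size)]
    if out:
        cmd += ["-o", out]
    cmd += list(paths)
    return cmd


def run_scanc(paths, **options):
    """Run scanc and return its stdout; assumes scanc is installed and on PATH."""
    result = subprocess.run(
        build_scanc_command(paths, **options),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout


if __name__ == "__main__":
    # Mirrors the earlier example: scanc -e ts --tree -o scan.md src/
    print(build_scanc_command(["src/"], exts=["ts"], tree=True, out="scan.md"))
```

Passing a list rather than a shell string avoids quoting problems with paths that contain spaces.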
- Formatter Hook: Customize output by passing your own formatter via entry points.
- Extras: Use scanc[tiktoken] to enable token counting; more extras may follow.
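To give a feel for how an entry-point-based formatter hook generally works in Python, here is a rough sketch. Everything specific in it is an assumption: the entry point group name `scanc.formatters`, the formatter signature (a list of `(path, text)` pairs in, one string out), and the helper names are hypothetical and are not taken from scanc's documentation. Check the project source for the real group name and interface.

```python
from importlib.metadata import entry_points


def plain_formatter(files):
    """Hypothetical formatter: render (path, text) pairs as delimited plain text."""
    parts = []
    for path, text in files:
        parts.append(f"=== {path} ===\n{text.rstrip()}\n")
    return "\n".join(parts)


def load_formatters(group="scanc.formatters"):
    """Discover installed formatter plugins by entry point group.

    The default group name is an assumption made for illustration only.
    """
    eps = entry_points()
    if hasattr(eps, "select"):
        selected = eps.select(group=group)  # Python 3.10+ selection API
    else:
        selected = eps.get(group, [])  # older dict-style return value
    # Loading an entry point imports the module and returns the referenced object.
    return {ep.name: ep.load() for ep in selected}


if __name__ == "__main__":
    print(plain_formatter([("a.py", "x = 1\n")]))
    print(load_formatters())
```

A plugin package would typically advertise its formatter in the entry points table of its own `pyproject.toml`, so that it is discoverable once installed.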
A ready-to-run container is published to GitHub Container Registry (GHCR). It runs as non-root and scans the mounted host directory by default.
docker pull ghcr.io/mqxym/scanc-cli:latest

# Linux/macOS (Bash/Zsh)
docker run --rm -v "$PWD":/work:ro ghcr.io/mqxym/scanc-cli:latest
# Windows PowerShell
docker run --rm -v "${PWD}:/work:ro" ghcr.io/mqxym/scanc-cli:latest

Because the container’s WORKDIR is /work and ENTRYPOINT is scanc, passing . scans your host’s current folder.
Either redirect on the host:
docker run --rm -v "$PWD":/work:ro ghcr.io/mqxym/scanc-cli:latest -e py --tree > scan.md

...or mount as writable and write into /work:
docker run --rm -v "$PWD":/work ghcr.io/mqxym/scanc-cli:latest -e py --tree -o /work/scan.md

Tip (Linux/macOS): preserve file ownership when writing by mapping your UID/GID:
docker run --rm \
  --user "$(id -u)":"$(id -g)" \
  -v "$PWD":/work ghcr.io/mqxym/scanc-cli:latest -o /work/scan.md
# Only Python & JS files, include directory tree
docker run --rm -v "$PWD":/work:ro ghcr.io/mqxym/scanc-cli:latest -e py,js --tree
# Token count only (the optional 'tiktoken' dependency is baked into the image)
docker run --rm -v "$PWD":/work:ro ghcr.io/mqxym/scanc-cli:latest --tokens gpt-4o

Released under the MIT Licence. See LICENCE for details.