description: End-to-end caribou demo for MegaDetector-Overhead: download the Zenodo OWL-C weights and test patches, run OWL-C inference on GPU or CPU, and visualize the predictions.
tags: demo, quickstart, caribou, OWL-C, inference, visualization, PyTorch-Wildlife

Caribou Demo (download → infer → visualize)

This walkthrough takes you from a fresh clone to visualized OWL-C predictions on real caribou aerial patches. It uses the public Caribou Aerial Survey Dataset on Zenodo (weights + test patches), runs the same evaluation stack as tools/test.py, and renders the detections onto the patches as PNGs.

The demo auto-detects your hardware: it runs on a CUDA GPU when one is available and otherwise falls back to CPU. It makes no assumption that you have a GPU.
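
The fallback described above can be sketched in a few lines. This is an illustration only, not the script's actual logic: `pick_device` is a hypothetical helper, and the only library call it relies on is PyTorch's `torch.cuda.is_available()`. An explicit argument plays the role of a forced device.

```python
def pick_device(requested=None):
    """Return "cuda" or "cpu". `requested` mimics a forced device choice (hypothetical helper)."""
    if requested in ("cuda", "cpu"):
        return requested
    try:
        import torch  # imported lazily so the helper still works without PyTorch installed
    except ImportError:
        return "cpu"
    # Use the GPU only when PyTorch reports a usable CUDA device.
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(device)
```

On a machine with a CPU-only PyTorch build this prints `cpu` even if a GPU is physically present, which is the situation the Troubleshooting section covers.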

!!! note "About the weights"
    The Zenodo release labels the checkpoint "HerdNet (DLA-34)". In this repo the same DLA-34 detection branch is registered as OWL-C, so the demo loads it under model.name: OWLC. They are the same network.

Prerequisites

Install the environment with uv (see Installation):

uv sync
uv run python -c "import animaloc.models, dinov3; print('OK')"

You also need curl and unzip on your PATH (both are standard on Linux/macOS).

One command

./tools/demo_caribou.sh

This will:

  1. Download weights.zip (216 MB) and test.zip (1.2 GB) from Zenodo into demo_data/ (skipped if already present).
  2. Verify the weights' SHA-256 against the published checksum.
  3. Build a deterministic 50-patch subset (40 annotated + 10 background).
  4. Auto-detect the device (GPU if available, else CPU).
  5. Run OWL-C inference (tools/test.py) with Weights & Biases disabled.
  6. Render predictions onto every patch with tools/visualize_detections.py.
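
Step 3 depends on the subset being reproducible. A minimal sketch of one way to get that, assuming a fixed seed and sorted inputs; the script's real selection rule is not documented here, and `build_subset`, the seed, and the file names are all hypothetical:

```python
import random


def build_subset(annotated, background, n_annotated=40, n_background=10, seed=0):
    """Pick a reproducible subset of patch names (hypothetical helper).

    Inputs are sorted first so the result does not depend on listing order.
    """
    rng = random.Random(seed)  # private RNG: same seed, same picks on every run
    picked = rng.sample(sorted(annotated), min(n_annotated, len(annotated)))
    picked += rng.sample(sorted(background), min(n_background, len(background)))
    return sorted(picked)


# Made-up file names standing in for the extracted test patches.
annotated = [f"ann_{i:04d}.jpg" for i in range(200)]
background = [f"bg_{i:04d}.jpg" for i in range(60)]

subset = build_subset(annotated, background)
print(len(subset))
```

Because the generator is seeded and the inputs are sorted, two runs on the same extracted data select the same 50 patches, so the metrics are comparable across machines.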

Outputs:

| Path | Contents |
| --- | --- |
| demo_data/run/metrics_results.csv | F1 / precision / recall / MAE / RMSE |
| demo_data/run/detections.csv | One row per detection (images, x, y, dscores, …) |
| demo_data/viz/*.png | Patches with green = ground truth, red = predictions |
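
For a quick look at detections.csv without the visualizer, the rows can be tallied per patch. The sketch below uses only the column names listed above (images, dscores); the sample rows are made up, the real file has further columns, and treating the threshold as inclusive is an assumption.

```python
import csv
import io
from collections import Counter


def count_per_image(csv_file, score_threshold=0.0):
    """Count detections per patch whose score is at least `score_threshold`."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        if float(row["dscores"]) >= score_threshold:
            counts[row["images"]] += 1
    return counts


# Made-up rows in the documented column layout; use open("demo_data/run/detections.csv") for real data.
sample = io.StringIO(
    "images,x,y,dscores\n"
    "patch_a.jpg,10,20,0.91\n"
    "patch_a.jpg,40,52,0.15\n"
    "patch_b.jpg,100,7,0.64\n"
)

counts = count_per_image(sample, score_threshold=0.2)
print(dict(counts))
```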

Options

./tools/demo_caribou.sh --device cpu        # force CPU
./tools/demo_caribou.sh --device cuda        # force GPU
./tools/demo_caribou.sh --full               # run the full 2,607-patch test set
./tools/demo_caribou.sh --subset-size 100    # larger subset
./tools/demo_caribou.sh --score-threshold 0.3

Expected results

On the default 50-patch subset (229 ground-truth points) you should see numbers close to:

recall ≈ 0.98   precision ≈ 0.89   f1 ≈ 0.93

These match the per-patch validation regime reported for the checkpoint (val F1 = 0.937). The full test set reproduces the paper headline (F1 = 0.965 at τ = 20 px); see Datasets. GPU and CPU produce identical detections — only the speed differs (on a Tesla V100 the subset runs ~25× faster than CPU).
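
The three subset numbers are consistent with each other, since F1 is the harmonic mean of precision and recall. A short check, using the rounded values quoted above (so the result is itself approximate):

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


f1 = f1_score(precision=0.89, recall=0.98)
print(round(f1, 2))  # 0.93
```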

Manual walkthrough

If you prefer to run the steps yourself:

# 1. Download + extract
mkdir -p demo_data/weights demo_data/test
curl -fL -o demo_data/weights.zip \
    "https://zenodo.org/api/records/20767534/files/weights.zip/content"
curl -fL -o demo_data/test.zip \
    "https://zenodo.org/api/records/20767534/files/test.zip/content"
unzip -q demo_data/weights.zip -d demo_data/weights
unzip -q demo_data/test.zip   -d demo_data/test

# 2. Run OWL-C eval (CPU shown; use ++test.device_name=cuda for GPU)
export OWL_DEMO_DATA="$(pwd)/demo_data"
WANDB_MODE=disabled uv run python tools/test.py test=owlc_caribou_demo \
    ++test.device_name=cpu \
    ++test.model.pth_file="$OWL_DEMO_DATA/weights/best_model.pth" \
    ++test.dataset.root_dir="$OWL_DEMO_DATA/test" \
    ++test.dataset.csv_file="$OWL_DEMO_DATA/test/gt.csv" \
    ++hydra.run.dir="$OWL_DEMO_DATA/run"

# 3. Visualize predictions onto the patches
#    (predictions are saved in the model's down-sampled space; OWL-C uses
#     down_ratio=2, so pass --pred-scale 2 to map them onto the patch)
uv run python tools/visualize_detections.py \
    --detections "$OWL_DEMO_DATA/run/detections.csv" \
    --images-dir "$OWL_DEMO_DATA/test" \
    --output-dir "$OWL_DEMO_DATA/viz" \
    --gt "$OWL_DEMO_DATA/test/gt.csv" \
    --score-threshold 0.2 --pred-scale 2 --all-images

The portable demo config lives at configs/test/owlc_caribou_demo.yaml — unlike the author-specific eval configs, it hardcodes no machine paths (they come from OWL_DEMO_DATA or ++ overrides) and defaults to CPU.

Evaluation operating point

The demo config (configs/test/owlc_caribou_demo.yaml) evaluates with:

  • Match radius τ = 20 image px. evaluator.threshold: 10 is measured on the half-resolution heatmap (down_ratio: 2, stitcher up: False); ground truth is down-sampled by the same factor, so 10 heatmap px = 20 original px.
  • Confidence (peak selection) adapt_ts: 0.3 (LMDS), with neg_ts: 0.1 and a (3, 3) peak kernel.
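
To make the peak-selection settings concrete, here is a pure-Python sketch of local-maxima detection. It assumes the common LMDS formulation, in which adapt_ts is a fraction of the heatmap's maximum and neg_ts is a floor below which the patch is treated as empty; the repo's implementation may differ in details, and `lmds_peaks` is a hypothetical name.

```python
def lmds_peaks(heatmap, adapt_ts=0.3, neg_ts=0.1):
    """Return (row, col, score) for 3x3 local maxima that pass both thresholds."""
    h, w = len(heatmap), len(heatmap[0])
    global_max = max(max(row) for row in heatmap)
    if global_max < neg_ts:         # weak response everywhere: treat the patch as empty
        return []
    cutoff = adapt_ts * global_max  # adaptive threshold, relative to the strongest response
    peaks = []
    for r in range(h):
        for c in range(w):
            v = heatmap[r][c]
            if v < cutoff:
                continue
            # Values in the 3x3 window around (r, c), clipped at the borders.
            window = [
                heatmap[rr][cc]
                for rr in range(max(0, r - 1), min(h, r + 2))
                for cc in range(max(0, c - 1), min(w, c + 2))
            ]
            if v == max(window):
                peaks.append((r, c, v))
    return peaks


# Toy heatmap: one strong peak (0.9) and one weak bump (0.2) below 0.3 * 0.9.
heatmap = [
    [0.0, 0.1, 0.0, 0.0],
    [0.1, 0.9, 0.1, 0.0],
    [0.0, 0.1, 0.0, 0.2],
    [0.0, 0.0, 0.0, 0.0],
]
peaks = lmds_peaks(heatmap)
print(peaks)
```

In this toy case only the 0.9 peak survives: the 0.2 bump is a local maximum but falls under the adaptive cutoff of 0.27.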

This mirrors the per-patch validation regime (val F1 ≈ 0.937). The paper's headline F1 = 0.965 is reported at a slightly different operating point (c* = 0.20); see Datasets.

!!! note "Detection coordinate space"
    With up: False, tools/test.py writes detections.csv in the model's down-sampled space (x, y in 0…255 for a 512-px patch at down_ratio=2). Ground truth in gt.csv is in original 512-px space. The visualizer's --pred-scale 2 rescales predictions so the two overlay correctly.
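
The rescaling is a plain multiplication by the down ratio. The sketch below maps a down-sampled prediction into patch space and checks it against a ground-truth point at the 20 px radius (equivalent to 10 px on the heatmap). The helper names and coordinates are made up, and whether the evaluator treats the radius as inclusive is an assumption.

```python
import math


def to_patch_space(x, y, pred_scale=2):
    """Map a down-sampled prediction onto the full-resolution patch."""
    return x * pred_scale, y * pred_scale


def within_radius(pred_xy, gt_xy, radius_px=20.0):
    """True when the two points are at most `radius_px` apart (Euclidean distance)."""
    return math.dist(pred_xy, gt_xy) <= radius_px


pred_heatmap = (100, 60)  # made-up detection in down-sampled space
gt_patch = (212, 118)     # made-up ground-truth point in 512-px space

pred_patch = to_patch_space(*pred_heatmap)
print(pred_patch, within_radius(pred_patch, gt_patch))
```

Comparing the unscaled prediction to the same ground-truth point fails the radius check, which is what the shifted red dots in the Troubleshooting section look like numerically.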

Visualizing detections on your own runs

tools/visualize_detections.py works with any detections.csv produced by tools/test.py:

uv run python tools/visualize_detections.py \
    --detections path/to/detections.csv \
    --images-dir path/to/patches \
    --output-dir path/to/viz \
    --pred-scale 2 \
    [--gt path/to/gt.csv] [--score-threshold 0.2] [--all-images]

Predicted points are drawn in red; if --gt is given, ground-truth points are drawn in green. Each patch is captioned with its predicted (and GT) point count. Pass --pred-scale equal to the model's down_ratio (2 for OWL-C) so the down-sampled predictions land on the full-resolution patch; ground truth is never scaled.

Troubleshooting

| Symptom | Cause / Fix |
| --- | --- |
| wandb: ERROR ... or a login prompt | The demo sets WANDB_MODE=disabled. Running tools/test.py by hand requires WANDB_MODE=disabled (or wandb login). |
| CUDA: False even though nvidia-smi shows a GPU | A plain uv sync installs the CPU build. Install a GPU build with uv pip install torch torchvision --torch-backend=auto (see Installation → GPU support). |
| RuntimeError: ... unable to find an engine on an older GPU | Some newer wheels omit kernels for older architectures (e.g. Volta / V100). Use uv pip install torch torchvision --torch-backend=cu124, which includes them. |
| Red prediction dots look shifted toward the top-left / "smaller" | Predictions are in the model's down-sampled space; pass --pred-scale 2 (the OWL-C down_ratio) to the visualizer. |
| ImportError: libGL.so.1 / libgthread-2.0.so.0 | Image libs need system glib/GL. The project pins opencv-python-headless; re-run uv sync if it was replaced. |
| Checksum mismatch on weights | A corrupted/partial download. Delete demo_data/weights/ and re-run. |
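
To confirm a checksum mismatch by hand, the SHA-256 of a file can be computed with the standard library. This is a generic sketch, not the demo script's code: `sha256_of` and `checksum_ok` are hypothetical helpers, and the expected digest must come from the published checksum, which is not reproduced here.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so large archives never sit fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_ok(path, expected_hex):
    """Compare a file's digest with an expected hex string, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()


# Self-contained demonstration on a throwaway file instead of the real archive.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"abc")
demo_path = tmp.name
expected = hashlib.sha256(b"abc").hexdigest()
ok = checksum_ok(demo_path, expected)
os.remove(demo_path)
print(ok)
```

For the demo data, `path` would be the downloaded weights archive under demo_data/ and `expected_hex` the digest published alongside it.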

See also