WarrenGreen/ship-search

ship-search

Bathymetry-first shipwreck discovery and triage pipeline:

  • builds training tiles from NOAA BAG surveys,
  • trains deep models for wreck-presence detection,
  • runs large-area streaming discovery with aggressive download parallelism,
  • persists only positive detections plus geospatial metadata,
  • and provides a reviewer webapp for rapid analyst validation.

Scientific Basis

This project is grounded in the workflow described in:

Character et al. (2021), Archaeologic Machine Learning for Shipwreck Detection Using Lidar and Sonar
Remote Sensing, 13(9), 1759.
Paper PDF in this repo: remotesensing-13-01759-v2.pdf

Key idea we adopt: transform bathymetry into model-consumable imagery, then classify candidate targets with human-in-the-loop review.
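As a rough illustration of "model-consumable imagery", the sketch below min-max scales a depth grid to 8-bit grayscale. It is not the repo's composite tile generator; the function names, the nested-list input, and the nodata value of 1000000.0 are assumptions made for the example.

```python
import math


def is_valid(value, nodata):
    """True for a usable depth sample (not None, not nodata, not NaN)."""
    return value is not None and value != nodata and not math.isnan(value)


def depth_to_gray(rows, nodata=1_000_000.0):
    """Min-max scale a 2D depth grid (list of rows) to integers in 0..255.

    Nodata cells map to 0. `nodata` is an assumed fill value; check the
    raster's own metadata before relying on it.
    """
    valid = [v for row in rows for v in row if is_valid(v, nodata)]
    if not valid:
        return [[0] * len(row) for row in rows]
    lo, hi = min(valid), max(valid)
    span = (hi - lo) or 1.0  # avoid division by zero on a flat grid
    return [
        [round(255 * (v - lo) / span) if is_valid(v, nodata) else 0 for v in row]
        for row in rows
    ]
```

In practice a shaded-relief or multi-channel rendering usually gives a model more texture to work with than plain scaling; this only shows the basic raster-to-image step.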


What This Repo Does

The codebase has four major parts:

  1. Data ingestion + tiling

    • NOAA survey discovery and BAG URL resolution
    • BAG/TIF raster loading and tiling
    • composite tile generation for model input
  2. Modeling

    • YOLOv8 baseline detector
    • heatmap U-Net point model
    • ViT large-scale binary classifier for tile-level wreck presence
  3. Streaming discovery at scale

    • concurrent raster download workers
    • bounded producer/consumer queue
    • single inference consumer (stable MPS/CUDA behavior)
    • positive-only output persistence
  4. Review webapp

    • browse detections
    • inspect lat/lon, confidence, tile scale
    • nearest AWOIS cross-reference in modal
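The streaming design in part 3 (many download workers, a bounded queue, one inference consumer) can be sketched with the standard library as follows. This is a minimal illustration, not the repo's implementation: `run_pipeline`, `fetch`, and `infer` are hypothetical names, and threads stand in for whatever worker mechanism the project actually uses.

```python
import queue
import threading

SENTINEL = object()  # marks "this producer is done"


def run_pipeline(items, fetch, infer, workers=4, max_queue=8):
    """Fetch items on `workers` threads and run `infer` on a single consumer.

    `infer` returns None for negatives, so only positives are kept.
    """
    todo = queue.Queue()
    for item in items:
        todo.put(item)
    ready = queue.Queue(maxsize=max_queue)  # bounded: producers block when full
    results = []

    def producer():
        try:
            while True:
                try:
                    item = todo.get_nowait()
                except queue.Empty:
                    break
                ready.put(fetch(item))
        finally:
            ready.put(SENTINEL)  # always signal completion, even on error

    def consumer():
        finished = 0
        while finished < workers:
            payload = ready.get()
            if payload is SENTINEL:
                finished += 1
                continue
            out = infer(payload)
            if out is not None:
                results.append(out)

    threads = [threading.Thread(target=producer) for _ in range(workers)]
    threads.append(threading.Thread(target=consumer))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

The bounded queue caps how many downloaded rasters wait in memory at once, and keeping inference on one thread avoids concurrent access to the accelerator.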

Installation

Prereqs:

  • Python 3.11+
  • GDAL/raster stack capable of reading .bag (or gdal_translate for BAG->TIF fallback)

Setup (adjust the path to your own checkout):

cd /Users/warren/workspace/ship-search
python -m venv .venv
source .venv/bin/activate
python -m pip install -U pip
pip install -e .

Core CLI Workflows

1) Demo end-to-end (YOLO baseline)

# Download demo data (NOAA BAG + wreck points)
ship-search download --out data/demo

# Prepare YOLO-style dataset
ship-search prepare --demo data/demo --out data/yolo_h13140

# Train
ship-search train --dataset data/yolo_h13140/dataset.yaml --epochs 50 --imgsz 416

# Predict over raster
ship-search predict \
  --weights runs/detect/train/weights/best.pt \
  --raster data/demo/H13140_MB_VR_MLLW.bag \
  --out artifacts/preds

2) ViT classifier (tile-level wreck presence)

# Example train
ship-search vit-cls train \
  --dataset data/datasets/scale10x_10k/dataset.yaml \
  --model vit_giant_patch14_clip_224 \
  --project runs/vit_cls \
  --name vitg14_10k_balanced \
  --device mps \
  --balanced-sampling \
  --auto-pos-weight

3) Streaming discovery (disk-efficient)

Florida:

ship-search discover florida \
  --weights runs/vit_cls/vitg14_10k_balanced_h200/weights/best.pt \
  --out artifacts/discovery_florida_mps_085 \
  --conf-thresh 0.85 \
  --download-workers 90 \
  --device mps

Bahamas + Turks and Caicos:

ship-search discover bahamas-tci \
  --weights runs/vit_cls/vitg14_10k_balanced_h200/weights/best.pt \
  --out artifacts/discovery_bahamas_tci_mps_085 \
  --conf-thresh 0.85 \
  --download-workers 90 \
  --device mps

Important discovery behavior:

  • by default, deletes stale files under out/rasters/* at startup so a resumed run starts clean,
  • skips previously seen rasters in the same output root,
  • deletes downloaded rasters after processing unless --keep-rasters is set.

Discovery Output Contract

Under --out, discovery writes:

  • tiles/
    Positive detection PNGs only.

  • meta/detections.jsonl
    One row per positive tile with:

    • tile identity/context (tile_id, survey_id, raster_name)
    • geospatial fields (center_lat, center_lon, bbox_lonlat)
    • model confidence (score)
    • tile geometry (tile_row, tile_col)
    • source metadata (source_resolution_m)
    • tile bathymetry stats (tile_depth_min_m, tile_depth_mean_m, tile_depth_max_m)
  • meta/detections.geojson and meta/summary.json

  • meta/processed_rasters.jsonl
    Resume ledger used to avoid reprocessing imagery already seen in prior runs.
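A resume ledger of this kind can be handled as an append-only JSONL file. The sketch below is an assumption about how such a ledger might work, not the repo's code: the `raster_name` key and both function names are hypothetical, and the real ledger rows may carry different fields.

```python
import json
from pathlib import Path


def load_seen(ledger_path, key="raster_name"):
    """Return the set of raster names already recorded in the ledger."""
    path = Path(ledger_path)
    if not path.exists():
        return set()
    seen = set()
    with path.open() as fh:
        for line in fh:
            line = line.strip()
            if line:
                seen.add(json.loads(line)[key])
    return seen


def mark_processed(ledger_path, raster_name, key="raster_name"):
    """Append one row to the ledger after a raster has been fully processed."""
    path = Path(ledger_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a") as fh:
        fh.write(json.dumps({key: raster_name}) + "\n")
```

A discovery loop would call `load_seen` once at startup, skip any raster whose name is in the set, and call `mark_processed` only after inference on that raster finishes, so an interrupted raster is retried on the next run.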


AWOIS Nearest-Match Enrichment

For analyst triage, detections can be joined to nearest known records from:

/Users/warren/Downloads/Wrecks_and_Obstructions_in_AWOIS.geojson

Current enriched files:

  • meta/detections_nearest_awois.jsonl
  • meta/detections_nearest_awois.csv

These include:

  • nearest AWOIS point lat/lon
  • distance_m
  • key AWOIS fields (OBJECTID, record, depth, yearSunk, etc.)
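The nearest-match join amounts to a great-circle distance search. The following is a brute-force sketch under stated assumptions: `haversine_m` and `nearest_record` are illustrative names, records are passed as plain `(record_id, lat, lon)` tuples rather than parsed GeoJSON, and the repo may well use a spatial index instead.

```python
import math

EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points (degrees)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


def nearest_record(det_lat, det_lon, records):
    """Return (record_id, distance_m) for the closest record, or None if empty.

    `records` is an iterable of (record_id, lat, lon) tuples.
    """
    best = None
    for record_id, lat, lon in records:
        d = haversine_m(det_lat, det_lon, lat, lon)
        if best is None or d < best[1]:
            best = (record_id, d)
    return best
```

A linear scan is fine for a few thousand detections against one AWOIS file; for larger joins a KD-tree or similar index would be the usual next step.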

Webapp Review

Serve discovery outputs directly:

ship-search viewer serve \
  --data-root artifacts/discovery_florida_mps_085 \
  --host 127.0.0.1 \
  --port 8000

In modal view, the app shows:

  • detection lat/lon
  • tile width/height in meters
  • nearest AWOIS match distance and record/id
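Tile width and height in meters can be approximated from a detection's bbox_lonlat. The sketch below assumes the box is ordered [min_lon, min_lat, max_lon, max_lat] and uses a local flat-Earth approximation; neither the ordering nor the function name comes from the repo, and the viewer may instead derive size from pixel count and source_resolution_m.

```python
import math

METERS_PER_DEGREE = math.pi * 6_371_008.8 / 180.0  # along a meridian


def bbox_size_m(bbox_lonlat):
    """Approximate (width_m, height_m) of a small lon/lat bounding box.

    Assumed order: [min_lon, min_lat, max_lon, max_lat].
    """
    min_lon, min_lat, max_lon, max_lat = bbox_lonlat
    mid_lat = math.radians((min_lat + max_lat) / 2)
    # Longitude degrees shrink with cos(latitude); latitude degrees do not.
    width_m = (max_lon - min_lon) * METERS_PER_DEGREE * math.cos(mid_lat)
    height_m = (max_lat - min_lat) * METERS_PER_DEGREE
    return width_m, height_m
```

The approximation is adequate at tile scale (hundreds of meters) but degrades for boxes spanning many degrees.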

Positive ID Example (Candidate New Wreck)

Copied into a commit-friendly docs path:

docs/assets/positive_ids/H11896_H11896_MB_2m_MLLW_8of8_r00007_c00006.png

Positive ID candidate: H11896 tile

Reference row IDs:

  • tile_id: H11896_H11896_MB_2m_MLLW_8of8_r00007_c00006
  • detection source: artifacts/discovery_florida_mps_085/meta/detections.jsonl
  • nearest AWOIS source: artifacts/discovery_florida_mps_085/meta/detections_nearest_awois.jsonl

Notes

  • Research/prototyping only (not for navigation).
  • BAG support depends on GDAL build options.
  • If BAG open fails in your environment, convert with gdal_translate and run on GeoTIFF.
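The gdal_translate fallback mentioned above can be wrapped in a small helper. This is a sketch, not part of the ship-search CLI: both function names are hypothetical, and it only shells out to the standard `gdal_translate -of GTiff <src> <dst>` invocation.

```python
import shutil
import subprocess
from pathlib import Path


def bag_to_tif_command(bag_path, tif_path=None):
    """Build the gdal_translate command; output defaults to <name>.tif."""
    bag = Path(bag_path)
    tif = Path(tif_path) if tif_path else bag.with_suffix(".tif")
    return ["gdal_translate", "-of", "GTiff", str(bag), str(tif)]


def convert_bag(bag_path, tif_path=None):
    """Run the conversion and return the output path; raises if GDAL is missing."""
    if shutil.which("gdal_translate") is None:
        raise RuntimeError("gdal_translate not found on PATH")
    cmd = bag_to_tif_command(bag_path, tif_path)
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return Path(cmd[-1])
```

For example, `convert_bag("data/demo/H13140_MB_VR_MLLW.bag")` would write the GeoTIFF next to the source file. Variable-resolution BAGs may need driver-specific open options, so check the GDAL BAG driver documentation if the output looks coarser than expected.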

About

Finding shipwrecks using vision models and bathymetry
