Fetch external binaries on demand from raven-data (drop ~238 MB of committed binaries) #644

Merged: edkerk merged 4 commits into develop3 from feat/on-demand-binaries on Jun 17, 2026

edkerk (Member) commented Jun 17, 2026

What

Move RAVEN's external command-line binaries (BLAST+, DIAMOND, HMMER, WoLFPSORT) out of the git repository and download them per platform, on demand, from the raven-data repository. This mirrors the on-demand pattern already used for SCIP (setRavenSolver) and the KEGG HMM library (getKEGGModelForOrganism). It removes ~238 MB of committed binaries (tracked files under software/: 176 → 109).

Changes

  • installation/downloadRavenBinaries.m (new): fetches the binary set for the current platform from raven-data releases (websave → unzip → chmod, and clears the macOS Gatekeeper quarantine attribute). Skips tools already present.
  • Lazy guards in getBlast / getDiamond / getKEGGModelForOrganism / getWoLFScores: download the binary if it is missing before invoking it.
  • checkInstallation: prefetches any missing reconstruction binaries (so everything is ready after install); testBinary and makeBinaryExecutable updated.
  • Windows now uses the native .exe builds for BLAST+/DIAMOND/HMMER (the WSL path in testBinary/getKEGGModelForOrganism is dropped); macOS uses bare names to match the raven-data ZIP layout, so binEnd is '' on all Unix platforms and '.exe' on Windows.
  • software/{blast+,diamond,hmmer,WoLFPSORT} are gitignored with .keep placeholders, like software/scip.
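
For readers unfamiliar with the pattern, the fetch-if-missing flow described in the list above can be sketched roughly as follows. This is an illustrative Python sketch, not RAVEN's MATLAB implementation: the names bin_end, ensure_binary, and fetch_zip are hypothetical, and the download itself is left to a caller-supplied function so that no raven-data URL or manifest layout is assumed.

```python
import io
import platform
import shutil
import stat
import subprocess
import zipfile
from pathlib import Path


def bin_end(system=None):
    """Executable suffix: '.exe' on Windows, '' on Linux and macOS."""
    system = system or platform.system()
    return ".exe" if system == "Windows" else ""


def ensure_binary(tool_dir, name, fetch_zip, system=None):
    """Return the path to `name` inside `tool_dir`, fetching it only if missing.

    `fetch_zip` is a hypothetical callable that returns the tool's ZIP archive
    as bytes (in practice an HTTP download). It is not called when the binary
    is already present.
    """
    system = system or platform.system()
    tool_dir = Path(tool_dir)
    target = tool_dir / (name + bin_end(system))
    if target.exists():
        return target  # skip tools that are already present

    tool_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(io.BytesIO(fetch_zip())) as archive:
        archive.extractall(tool_dir)
    if not target.exists():
        raise FileNotFoundError(f"archive did not contain {target.name}")

    if system != "Windows":
        # Equivalent of chmod +x for user, group, and others.
        mode = target.stat().st_mode
        target.chmod(mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    if system == "Darwin" and shutil.which("xattr"):
        # Clear the Gatekeeper quarantine attribute; a non-zero exit
        # (attribute not set) is ignored.
        subprocess.run(
            ["xattr", "-d", "com.apple.quarantine", str(target)],
            check=False,
            capture_output=True,
        )
    return target
```

Passing the downloader in as a function keeps the guard cheap to call before every tool invocation: when the binary exists, the only cost is a single file-existence check.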

libSBML/GLPKmex (MEX) and apache-poi stay committed (basic SBML I/O + solving must work offline).

Depends on the raven-data hosting — blast-2.17.0, diamond-2.1.17, hmmer-3.4.0, hmmer-3.3.2, wolfpsort-0.2 releases + manifest (SysBioChalmers/raven-data#1).

Validation (MATLAB R2024b, Windows)

  • checkcode: all edited files clean. (The getKEGGModelForOrganism "parse error at END" / unused-variable flags are pre-existing on develop3, identical apart from a line-number shift.)
  • Functional: downloadRavenBinaries({'hmmer'}) downloaded and extracted hmmsearch.exe, cygwin1.dll, and LICENSE into software/hmmer/; the binary runs (HMMER 3.3.2).

Not covered here (follow-ups)

  • Full reconstruction run (getBlast/getKEGGModelForOrganism with real data) — not exercised.
  • macOS/Linux download paths not run on this Windows machine (logic is platform-symmetric).
  • Offline *-binaries full-bundle release (download once, copy to an air-gapped machine) — a separate release-process step.
  • Excel-export / apache-poi optimization — separate topic.

edkerk added 3 commits June 17, 2026 10:51
Remove the ~238 MB of committed external binaries and fetch them per-platform
on first use (or up front via checkInstallation), wiring in downloadRavenBinaries
and mirroring the existing scip / KEGG-HMM on-demand pattern.

- getBlast/getDiamond/getKEGGModelForOrganism/getWoLFScores: download the binary
  from raven-data if it is missing before invoking it.
- checkInstallation: prefetch any missing reconstruction binaries (doubles as the
  up-front fetch); testBinary and makeBinaryExecutable updated accordingly.
- Windows uses the native .exe builds for blast/diamond/hmmer (dropping the WSL
  path); macOS uses bare names to match the raven-data ZIP layout, so binEnd is
  '' on all unix and '.exe' on Windows.
- software/{blast+,diamond,hmmer,WoLFPSORT} are now gitignored with .keep
  placeholders, like software/scip.

Not yet tested in MATLAB.

github-actions bot commented Jun 17, 2026

Function test results

202 tests    178 ✅ passed     37s ⏱️
 22 suites    24 💤 skipped
  1 file       0 ❌ failed

Results for commit 6cfeb40.

♻️ This comment has been updated with latest results.

Add tBinaries (downloads each tool from raven-data, asserts it lands in
software/ and runs) and a binary-tests CI job that runs it on ubuntu-latest
and macos-latest. Running the fetched binaries also confirms the macOS
Gatekeeper quarantine is cleared. Excluded from the main function-tests suite
to avoid a redundant download there.
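
The "lands in software/ and runs" assertion can be sketched as a small check like the one below. This is a hedged Python illustration, not the actual tBinaries test: binary_runs is a hypothetical helper, and the arguments that make each tool print its version or help text differ per tool, so the default used here is only an assumption.

```python
import subprocess
from pathlib import Path


def binary_runs(path, args=("--version",), timeout=60):
    """Return True if `path` is a file and exits with status 0 when run.

    `args` is an assumption: each tool has its own version/help flag, so a
    real test would pass the appropriate arguments per tool.
    """
    path = Path(path)
    if not path.is_file():
        return False  # the download did not land where expected
    try:
        result = subprocess.run(
            [str(path), *args], capture_output=True, timeout=timeout
        )
    except (OSError, subprocess.TimeoutExpired):
        # Not executable, blocked by the OS, or hung.
        return False
    return result.returncode == 0
```

On macOS, a binary that is still quarantined would typically fail to launch, which is why actually running the fetched file is a stronger check than testing for its existence alone.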

edkerk merged commit e3111fc into develop3 on Jun 17, 2026
4 checks passed
edkerk deleted the feat/on-demand-binaries branch on June 17, 2026 09:32