Fetch external binaries on demand from raven-data (drop ~238 MB of committed binaries) #644
Merged
Remove the ~238 MB of committed external binaries and fetch them per-platform
on first use (or up front via checkInstallation), wiring in downloadRavenBinaries
and mirroring the existing scip / KEGG-HMM on-demand pattern.
- getBlast/getDiamond/getKEGGModelForOrganism/getWoLFScores: download the binary
from raven-data if it is missing before invoking it.
- checkInstallation: prefetch any missing reconstruction binaries (doubles as the
up-front fetch); testBinary and makeBinaryExecutable updated accordingly.
- Windows uses the native .exe builds for blast/diamond/hmmer (dropping the WSL
path); macOS uses bare names to match the raven-data ZIP layout, so binEnd is
'' on all unix and '.exe' on Windows.
- software/{blast+,diamond,hmmer,WoLFPSORT} are now gitignored with .keep
placeholders, like software/scip.
Not yet tested in MATLAB.
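The binEnd rule and the fetch-if-missing check described in this commit message can be sketched roughly as below. This is a minimal illustration, not the code from the PR: softwareDir is a hypothetical location chosen for the sketch, and the call to downloadRavenBinaries is left as a comment so the snippet runs without RAVEN on the MATLAB path.

```matlab
% Sketch only: softwareDir is a hypothetical location, not RAVEN's real lookup.
softwareDir = fullfile(tempdir, 'software');

% binEnd rule from the commit message: '.exe' on Windows, '' on Linux and macOS.
if ispc
    binEnd = '.exe';
else
    binEnd = '';
end

% Expected location of the HMMER search binary for the current platform.
binPath = fullfile(softwareDir, 'hmmer', ['hmmsearch' binEnd]);

% Fetch only when the binary is absent.
needsFetch = ~isfile(binPath);
if needsFetch
    % In RAVEN this is roughly where the downloader would be called, e.g.
    % downloadRavenBinaries({'hmmer'});
    fprintf('%s is missing and would be fetched first.\n', binPath);
end
```

Checking for the file before every invocation keeps the first call slow and every later call cheap, which is presumably why the same check can double as the up-front prefetch in checkInstallation.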
Function test results: 202 tests, 178 ✅, 37s ⏱️. Results for commit 6cfeb40.
Add tBinaries (downloads each tool from raven-data, asserts it lands in software/ and runs) and a binary-tests CI job that runs it on ubuntu-latest and macos-latest. Running the fetched binaries also confirms the macOS Gatekeeper quarantine is cleared. Excluded from the main function-tests suite to avoid a redundant download there.
What
Move RAVEN's external command-line binaries (BLAST+, DIAMOND, HMMER, WoLFPSORT) out of the git repository and download them per platform, on demand, from the raven-data repository. This mirrors the on-demand pattern already used for SCIP (setRavenSolver) and the KEGG HMM library (getKEGGModelForOrganism). Removes ~238 MB of committed binaries (tracked files under software/: 176 → 109).

Changes
- installation/downloadRavenBinaries.m (new): fetches the binary set for the current platform from raven-data releases (websave → unzip → chmod, then clears the macOS Gatekeeper quarantine attribute). Skips tools already present.
- getBlast / getDiamond / getKEGGModelForOrganism / getWoLFScores: download the binary if it is missing before invoking it.
- checkInstallation: prefetches any missing reconstruction binaries (so everything is ready after install); testBinary and makeBinaryExecutable updated.
- Windows uses the native .exe builds for BLAST+/DIAMOND/HMMER (the WSL path in testBinary/getKEGGModelForOrganism is dropped); macOS uses bare names to match the raven-data ZIP layout, so binEnd is '' on all unix and '.exe' on Windows.
- software/{blast+,diamond,hmmer,WoLFPSORT} are gitignored with .keep placeholders, like software/scip.
- libSBML/GLPKmex (MEX) and apache-poi stay committed (basic SBML I/O + solving must work offline).

Depends on the raven-data hosting: blast-2.17.0, diamond-2.1.17, hmmer-3.4.0, hmmer-3.3.2, wolfpsort-0.2 releases + manifest (SysBioChalmers/raven-data#1).

Validation (MATLAB R2024b, Windows)
- checkcode: all edited files clean. (The getKEGGModelForOrganism "parse error at END" / unused-variable flags are pre-existing on develop3, identical apart from a line-number shift.)
- downloadRavenBinaries({'hmmer'}) downloaded and extracted hmmsearch.exe + cygwin1.dll + LICENSE into software/hmmer/; the binary runs (HMMER 3.3.2).

Not covered here (follow-ups)
- … (getBlast/getKEGGModelForOrganism with real data): not exercised.
- *-binaries full-bundle release (download once, copy to an air-gapped machine): a separate release-process step.
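The websave → unzip → chmod → quarantine-clearing flow listed under Changes for downloadRavenBinaries might look roughly like the sketch below. It is an assumption-laden illustration rather than the function from this PR: fetchToolSketch, zipSource and destDir are hypothetical names, the real release URLs and manifest handling are not shown in this description, and accepting a local ZIP path is added here only so the sketch can be tried offline.

```matlab
function files = fetchToolSketch(zipSource, destDir)
% fetchToolSketch  Hypothetical stand-in for the download flow, not RAVEN code.
%   zipSource is an http(s) URL or a local ZIP path; destDir is the target folder.
if startsWith(zipSource, 'http')
    zipFile = websave([tempname '.zip'], zipSource);   % download to a temp file
    cleanup = onCleanup(@() delete(zipFile));          % remove the ZIP on exit
else
    zipFile = zipSource;                               % ZIP is already on disk
end

files = unzip(zipFile, destDir);                       % extract, returns file names

if isunix                                              % true on Linux and macOS
    for i = 1:numel(files)
        fileattrib(files{i}, '+x', 'a');               % equivalent of chmod a+x
    end
end

if ismac
    % Clear the Gatekeeper quarantine flag. The exit status is ignored because
    % xattr returns an error when the attribute is simply not set.
    [~, ~] = system(sprintf('xattr -dr com.apple.quarantine "%s"', destDir));
end
end
```

Called as fetchToolSketch(url, fullfile('software', 'hmmer')) with a real release asset URL, this would leave the extracted files executable in the target folder; skipping tools that are already present, as the PR does, would be a check made before calling it.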