Commit 575ded8

jplfaria and claude committed
Fix refdata path convention: version is the refdata version, not the tool version
Per Gavin: same refdata can serve multiple tool versions, and CTS already tracks (image, refdata) compatibility at registration time. Path versioning should reflect the refdata itself, not what reads it.

Updated paths:
- kofam: use KEGG download date (2025-04-30)
- bakta: use Bakta DB schema version (v6.0)

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
1 parent bf389fd commit 575ded8

2 files changed

Lines changed: 10 additions & 4 deletions

README.md

Lines changed: 9 additions & 3 deletions
@@ -77,9 +77,15 @@ Things waiting on someone else. Update as items move.

 ### Pending CTS admin (Gavin) registrations

-Refdata path convention (Gavin's): `cts-refdata/{toolname}/{tool_version}/{filename}`
+Refdata path convention: `cts-refdata/{toolname}/{refdata_version}/{filename}`
+
+The path version is the **refdata version**, not the tool version. The same refdata can serve multiple tool versions; CTS tracks (image, refdata) compatibility at registration time.

 | Tool | Move from | Move to | Image |
 |------|-----------|---------|-------|
-| kofamscan | `cts/io/jplfaria/refdata_staging/kofam/kofam_refdata.tar.gz` (~1.5GB) | `cts-refdata/kofam/1.3.0/kofam_refdata.tar.gz` | `ghcr.io/kbaseincubator/cdm_kofamscan:0.1.0@sha256:d6b20eccf4c6bf1b095e530844a8b04dbae5fca85daf0c9b2bdffb0cf10a9a42` |
-| bakta | `cts/io/jplfaria/refdata_staging/bakta/bakta_db.tar.gz` (~30GB) | `cts-refdata/bakta/1.12.0/bakta_db.tar.gz` | `ghcr.io/kbaseincubator/cdm_bakta:0.1.0@sha256:6de4c51cadd75bc6a1d9f6e6b05716ecfdcfa63510b82459477ff757200d8d06` |
+| kofamscan | `cts/io/jplfaria/refdata_staging/kofam/kofam_refdata.tar.gz` (~1.5GB) | `cts-refdata/kofam/2025-04-30/kofam_refdata.tar.gz` | `ghcr.io/kbaseincubator/cdm_kofamscan:0.1.0@sha256:d6b20eccf4c6bf1b095e530844a8b04dbae5fca85daf0c9b2bdffb0cf10a9a42` |
+| bakta | `cts/io/jplfaria/refdata_staging/bakta/bakta_db.tar.gz` (~30GB) | `cts-refdata/bakta/v6.0/bakta_db.tar.gz` | `ghcr.io/kbaseincubator/cdm_bakta:0.1.0@sha256:6de4c51cadd75bc6a1d9f6e6b05716ecfdcfa63510b82459477ff757200d8d06` |
+
+Refdata version notes:
+- **kofam** uses the date pulled from KEGG FTP (KEGG does not publish formal version numbers for the kofam dump).
+- **bakta** uses the published DB schema version (v6.0, Feb 2025), which is what Bakta 1.12.0 expects.
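
The updated convention can be sketched as a small path builder. This is an illustrative sketch only, not part of the commit or of CTS: the helper names `refdata_path` and `date_stamp` are hypothetical, and the rule that each path segment must be non-empty and contain no `/` is an assumption added for safety. It reproduces the two "Move to" paths from the table above.

```python
from datetime import date


def date_stamp(pulled: date) -> str:
    """Version stamp for refdata without a formal upstream version.

    Hypothetical helper: formats the pull date as YYYY-MM-DD.
    """
    return pulled.isoformat()


def refdata_path(toolname: str, refdata_version: str, filename: str) -> str:
    """Build cts-refdata/{toolname}/{refdata_version}/{filename}.

    refdata_version is the version of the refdata itself, not of the
    tool that reads it. The segment check below is an assumption, not
    a documented CTS rule.
    """
    segments = {
        "toolname": toolname,
        "refdata_version": refdata_version,
        "filename": filename,
    }
    for name, value in segments.items():
        if not value or "/" in value:
            raise ValueError(f"invalid {name}: {value!r}")
    return f"cts-refdata/{toolname}/{refdata_version}/{filename}"


# kofam: no formal refdata version, so use the KEGG download date.
kofam = refdata_path("kofam", date_stamp(date(2025, 4, 30)), "kofam_refdata.tar.gz")

# bakta: use the published DB schema version.
bakta = refdata_path("bakta", "v6.0", "bakta_db.tar.gz")

print(kofam)
print(bakta)
```

Because the tool version never appears in the path, a new tool release that still reads the same refdata needs no new path, only a new (image, refdata) registration in CTS.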

docs/pattern.md

Lines changed: 1 addition & 1 deletion
@@ -255,7 +255,7 @@ importer_meta:
 | Make GHCR package public | José (GitHub settings) |
 | Register image in CTS | Gavin only (`full_admin`) |
 | Register refdata in CTS | Gavin only |
-| Refdata path convention | `cts-refdata/{toolname}/{tool_version}/{filename}` (Gavin's convention, e.g. `cts-refdata/checkm2/1.1.0/uniref100.KO.1.dmnd.gz`) |
+| Refdata path convention | `cts-refdata/{toolname}/{refdata_version}/{filename}` — version in the path is the **refdata** version, not the tool version. Same refdata can serve multiple tool versions; CTS tracks (image, refdata) compatibility at registration. For tools without formal refdata versions (e.g. KEGG kofam dumps), use the date pulled from upstream as the version stamp. |
 | Submit jobs | José (needs `kbase_staff` role) |
 | Write to `cts/io/` | José |
 | Merge importer PRs | Gavin |
