Fix chembl issue 584 #585

Draft
gaurav wants to merge 34 commits into add-pipeline-tests-for-shared-identifiers from fix-chembl-issue-584

Conversation

gaurav (Collaborator) commented Sep 23, 2025

Because of a bug introduced in #446, we were removing CHEMBL IDs whose label was identical to their CHEMBL ID. With this PR, we retain those identifiers but remove their labels. Fixes #584.
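The intended behaviour can be sketched roughly as follows. This is a minimal illustration, not the actual Babel code: `clean_label` and `filter_labels` are hypothetical names, and the assumption that a label counts as "identical" when it matches either the full CURIE or its local part (case-insensitively) is mine, not something stated in this PR.

```python
from typing import Iterable, List, Optional, Tuple


def clean_label(curie: str, label: Optional[str]) -> Optional[str]:
    """Return the label, or None if it merely repeats the identifier.

    Hypothetical helper: the comparison rule here is an assumption.
    """
    if label is None:
        return None
    stripped = label.strip()
    if not stripped:
        return None
    # Local part of the CURIE, e.g. "CHEMBL4784370" for
    # "CHEMBL.COMPOUND:CHEMBL4784370".
    local_id = curie.split(":", 1)[-1]
    if stripped.upper() in (curie.upper(), local_id.upper()):
        return None
    return stripped


def filter_labels(
    pairs: Iterable[Tuple[str, Optional[str]]],
) -> List[Tuple[str, Optional[str]]]:
    """Keep every identifier; drop only the uninformative labels."""
    return [(curie, clean_label(curie, label)) for curie, label in pairs]
```

The point of the design is that the identifier always survives: only the label is set to `None` when it adds no information, so an entry such as `CHEMBL.COMPOUND:CHEMBL4784370` is no longer dropped from the output.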

While fixing this, I found another bug in how we read synonym files without an explicit label/synonym, and fixed that as well.

WIP. Should be merged after PR #692.

Base automatically changed from babel-v1.13 to master September 24, 2025 17:21
gaurav moved this from Backlog to In progress in Babel sprints Feb 28, 2026
gaurav and others added 21 commits March 5, 2026 19:31
Describes how users can file issues with Priority, Impact, Size, and
Component fields, and how to read the sprint board to track progress.
Includes a developer guide covering triage checklist, adding
babel-validation BabelTest assertions to issues, and sprint planning
heuristics. Also includes suggestions for improving the process.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…vement ideas

Documents the current development loop (building prerequisites, manual output review,
delayed feedback from SLURM runs) and its pain points, then proposes improvements
ranging from small helper scripts to large architectural changes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
- CONTRIBUTING.md: replace outdated milestone-based triage details with
  a pointer to docs/Triage.md (the authoritative sprint-based guide);
  add link to docs/Development.md for dev workflow; remove empty bullets
  from the frontends section
- README.md: add link to docs/Development.md in Contributing section
- docs/Development.md: fix pre-existing MD013 line-length violation

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ADME

- Create docs/Understanding.md: move preferred-ID algorithm, label
  selection, descriptions, IC values, and split/lumped clique reporting
  out of README.md
- Create docs/Architecture.md: human-readable source layout, data-flow
  narrative, key patterns, and runtime directories (replaces the brief
  architecture paragraph in CONTRIBUTING.md)
- Slim README.md from 295 to ~160 lines: replace removed sections with
  pointers to docs/Understanding.md; condense "Running Babel" to a
  summary + link
- Expand docs/README.md into an audience-based TOC (users / operators /
  contributors) with one-line descriptions per doc
- Update CONTRIBUTING.md: replace architecture paragraph and TODO stubs
  with links to Architecture.md and Development.md
- Trim docs/Triage.md: remove "Suggestions for improvement" section
- Update DataFormats.md IC link to point to Understanding.md

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
The file make_cliques.py does not exist. Clique merging is done by the
glom() function in src/babel_utils.py, with write_compendium() in the
same file driving the overall compendium-building process.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
gaurav and others added 6 commits March 15, 2026 02:31
…685)

Pretty significant and Claude-led reorganization of the Babel
documentation, including a description of the new GitHub issues in Babel
Validator. While I hope the documentation is as clear as possible, part
of the goal is to set up the necessary files so that future changes can
provide more detailed instructions for Babel-related tasks and fill gaps
in the current documentation.
gaurav changed the base branch from master to get-test-suite-working-again March 15, 2026 06:50
gaurav and others added 2 commits March 15, 2026 03:20
…g fixes (issue #584)

Tests cover two of the bugs fixed in PR #585: single-column lines in label
files no longer cause a silent mis-parse in SynonymFactory or an IndexError in
NodeFactory, and labels containing tabs are preserved intact via maxsplit=1.
Also derives BIOLINK_VERSION in conftest.py from config.yaml rather than a
hardcoded string.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
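
The parsing behaviour that commit message describes (single-column lines tolerated, tabs inside labels preserved via `maxsplit=1`) might look something like the sketch below. It is an illustration under assumptions, not the real `SynonymFactory` or `NodeFactory` code: `parse_label_lines` is a hypothetical name, and the two-column `identifier<TAB>label` layout is assumed from the description above.

```python
from typing import Iterable, Iterator, Optional, Tuple


def parse_label_lines(
    lines: Iterable[str],
) -> Iterator[Tuple[str, Optional[str]]]:
    """Yield (identifier, label) pairs from tab-separated label lines.

    Hypothetical helper; assumes each line is "identifier<TAB>label".
    """
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        # maxsplit=1 splits only on the first tab, so any further tabs
        # stay inside the label instead of truncating it.
        parts = line.split("\t", maxsplit=1)
        identifier = parts[0]
        # A single-column line has no label; indexing parts[1] directly
        # would raise an IndexError, so check the length first.
        label = parts[1] if len(parts) == 2 and parts[1] else None
        yield identifier, label
```

A line with no label column yields `None` for the label rather than raising or being mis-parsed, which leaves the decision about what to do with unlabelled identifiers to the caller.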
gaurav changed the base branch from get-test-suite-working-again to add-pipeline-tests-for-shared-identifiers March 15, 2026 07:33
Labels

None yet

Projects

Status: In progress

Development

Successfully merging this pull request may close these issues.

CHEMBL.COMPOUND:CHEMBL4784370 is missing from 2025sep1 but was present in 2025mar31