Fix chembl issue 584 #585
Draft
gaurav wants to merge 34 commits into add-pipeline-tests-for-shared-identifiers from
Describes how users can file issues with Priority, Impact, Size, and Component fields, and how to read the sprint board to track progress. Includes a developer guide covering triage checklist, adding babel-validation BabelTest assertions to issues, and sprint planning heuristics. Also includes suggestions for improving the process.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…vement ideas

Documents the current development loop (building prerequisites, manual output review, delayed feedback from SLURM runs) and its pain points, then proposes improvements ranging from small helper scripts to large architectural changes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
- CONTRIBUTING.md: replace outdated milestone-based triage details with a pointer to docs/Triage.md (the authoritative sprint-based guide); add link to docs/Development.md for dev workflow; remove empty bullets from the frontends section
- README.md: add link to docs/Development.md in Contributing section
- docs/Development.md: fix pre-existing MD013 line-length violation

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ADME

- Create docs/Understanding.md: move preferred-ID algorithm, label selection, descriptions, IC values, and split/lumped clique reporting out of README.md
- Create docs/Architecture.md: human-readable source layout, data-flow narrative, key patterns, and runtime directories (replaces the brief architecture paragraph in CONTRIBUTING.md)
- Slim README.md from 295 to ~160 lines: replace removed sections with pointers to docs/Understanding.md; condense "Running Babel" to a summary + link
- Expand docs/README.md into an audience-based TOC (users / operators / contributors) with one-line descriptions per doc
- Update CONTRIBUTING.md: replace architecture paragraph and TODO stubs with links to Architecture.md and Development.md
- Trim docs/Triage.md: remove "Suggestions for improvement" section
- Update DataFormats.md IC link to point to Understanding.md

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The file make_cliques.py does not exist. Clique merging is done by the glom() function in src/babel_utils.py, with write_compendium() in the same file driving the overall compendium-building process.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…685) Pretty significant and Claude-led reorganization of the Babel documentation, including a description of the new GitHub issues in Babel Validator. While I hope the documentation is as clear as possible, part of the goal is to set up the necessary files so that future changes can provide more detailed instructions for Babel-related tasks and fill gaps in the current documentation.
…g fixes (issue #584)

Tests cover two of the bugs fixed in PR #585: single-column lines in label files no longer cause a silent mis-parse in SynonymFactory or an IndexError in NodeFactory, and labels containing tabs are preserved intact via maxsplit=1. Also derives BIOLINK_VERSION in conftest.py from config.yaml rather than a hardcoded string.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
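The parsing behavior described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code in SynonymFactory or NodeFactory: the helper name `parse_label_line` is hypothetical, and returning an empty label for a single-column line is an assumption about how such lines are handled.

```python
from typing import Tuple


def parse_label_line(line: str) -> Tuple[str, str]:
    """Split one line of a tab-separated label file into (identifier, label).

    Hypothetical helper for illustration only; not part of Babel.
    """
    # Drop only the trailing newline so tabs inside the label survive.
    stripped = line.rstrip("\n")

    # maxsplit=1 splits on the first tab only, so a label that itself
    # contains tabs is kept intact in the second element.
    parts = stripped.split("\t", maxsplit=1)

    if len(parts) == 1:
        # Single-column line: there is no label to read. Indexing parts[1]
        # here would raise IndexError, so return an empty label instead
        # (assumed behavior).
        return parts[0], ""

    return parts[0], parts[1]
```

Under these assumptions, a line with only an identifier yields an empty label rather than an exception, and a line such as `"X:1\ta\tb\n"` keeps `"a\tb"` as its label.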
Because of a bug introduced in #446, we were removing CHEMBL IDs whose label was identical to the CHEMBL ID itself. With this PR, we retain those identifiers but remove their labels. Fixes #584.
While fixing this, I found another bug in how we read synonym files without an explicit label/synonym, and fixed that as well.
WIP. Should be merged after PR #692.
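The intended behavior for CHEMBL labels (keep the identifier, drop a label that only repeats it) can be sketched as below. This is a hedged illustration rather than the change made in this PR: the function name `clean_label`, the case-insensitive comparison, and the example entries are all assumptions.

```python
from typing import Optional


def clean_label(curie: str, label: Optional[str]) -> Optional[str]:
    """Return None when the label merely repeats the identifier's local part.

    Hypothetical helper for illustration only; not part of Babel.
    """
    if not label:
        return None

    # "CHEMBL.COMPOUND:CHEMBL1234" -> "CHEMBL1234"
    local_id = curie.split(":", 1)[-1]

    # Case-insensitive comparison is an assumption of this sketch.
    if label.strip().upper() == local_id.upper():
        return None

    return label


# Illustrative entries only: the identifier is always retained,
# and only the redundant label is dropped.
entries = [
    ("CHEMBL.COMPOUND:CHEMBL25", "ASPIRIN"),
    ("CHEMBL.COMPOUND:CHEMBL1234", "CHEMBL1234"),
]
cleaned = [(curie, clean_label(curie, label)) for curie, label in entries]
```

The point of returning `None` instead of filtering the entry out is that the identifier stays in the output either way, which is the difference between the old and new behavior described above.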