MAINT: Added new files to BuscoDatabaseDirFmt #231

Open
wants to merge 6 commits into
base: main

@VinzentRisch VinzentRisch commented Jan 15, 2025

closes #215

  • Adds file collection r'busco_downloads/information/.+.txt$'.
  • Adds file r'busco_downloads/lineages/fungi_odb10/missing_in_parasitic.txt$'.
  • Adds r'busco_downloads/lineages/pectobacteriaceae_odb12/no_hits$'.
  • The download action now removes the symlink found in the tetrapoda lineage.

codecov bot commented Jan 15, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 95.42%. Comparing base (b6d068a) to head (cd1c50d).
Report is 13 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #231      +/-   ##
==========================================
- Coverage   95.60%   95.42%   -0.19%     
==========================================
  Files          34       34              
  Lines        1956     1987      +31     
  Branches      226      232       +6     
==========================================
+ Hits         1870     1896      +26     
- Misses         48       50       +2     
- Partials       38       41       +3     

@VinzentRisch VinzentRisch requested a review from misialq January 23, 2025 14:42

@misialq misialq left a comment

Looks good but please see some ideas/questions about generalization 🔍

@@ -37,6 +38,12 @@ def fetch_busco_db(
f"Error during BUSCO database download: {e.returncode}"
)

# There is a symlink in the BUSCO database that needs to be removed
symlink = os.path.join(str(busco_db), "busco_downloads", "lineages",
Contributor

I'm wondering whether we should make this a bit more universal and try to detect any symlink and remove it, just in case this happens again in the future. Do you know whether this is the only link that exists there or are there more and this one is simply broken?

Contributor Author

This is the only symlink in the database. Because it's the only one and because it's broken, I think it's a mistake that it's there in the first place.
But I can implement something that searches for any symlinks and removes them.

Contributor

Hmmm, I'd say let's implement something like that then - hopefully there won't be more mistakes like this in the future but if there are then we won't need more PRs...
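
For illustration, a generic cleanup along the lines discussed above might look like the sketch below. It is not the code from this PR: the helper name `remove_symlinks` is hypothetical, and it assumes the download directory can simply be walked after the BUSCO download has finished. It removes both broken links and links that point at directories, without following them.

```python
import os


def remove_symlinks(root):
    """Delete every symbolic link under `root`; return the removed paths, sorted.

    Hypothetical helper, not part of the plugin.
    """
    removed = []
    # os.walk does not descend into symlinked directories by default
    # (followlinks=False), but it still lists them in `dirnames`.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in list(dirnames) + filenames:
            path = os.path.join(dirpath, name)
            # islink() is True for broken links as well as valid ones.
            if os.path.islink(path):
                os.unlink(path)
                removed.append(path)
                # Drop removed directory links so the walk skips them cleanly.
                if name in dirnames:
                    dirnames.remove(name)
    return sorted(removed)
```

In `fetch_busco_db`, such a helper could be called on `str(busco_db)` once the download subprocess has returned, which would replace the hard-coded path to the tetrapoda link.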

Comment on lines +145 to +149
no_hits = model.File(
r'busco_downloads\/lineages\/pectobacteriaceae_odb12\/no_hits$',
format=BuscoGenericTextFileFmt,
optional=True
)
Contributor

I think this should also be generalized, in case it happens in the future, by replacing the pectobacteriaceae lineage with a wildcard.
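
As a rough sketch of that suggestion, the lineage name in the pattern could be replaced by a wildcard that matches exactly one directory level. The `GENERAL` pattern below is an assumption, not the pattern adopted in the PR, and the check uses plain `re.match`; whether the directory format applies its patterns in the same way is not verified here.

```python
import re

# Pattern quoted in the review, tied to a single lineage.
SPECIFIC = r'busco_downloads\/lineages\/pectobacteriaceae_odb12\/no_hits$'
# Hypothetical generalisation: any one lineage directory, no nested paths.
GENERAL = r'busco_downloads\/lineages\/[^\/]+\/no_hits$'


def matches(pattern, path):
    """Return True if `path` matches `pattern` from its start."""
    return re.match(pattern, path) is not None


pecto = "busco_downloads/lineages/pectobacteriaceae_odb12/no_hits"
fungi = "busco_downloads/lineages/fungi_odb10/no_hits"

print(matches(SPECIFIC, pecto), matches(SPECIFIC, fungi))
print(matches(GENERAL, pecto), matches(GENERAL, fungi))
```

One consequence to keep in mind: a wildcard pattern can match files in several lineages at once, so a single-file declaration may no longer be the right fit and a file collection might be needed instead. That is an assumption about the framework and would need checking against its documentation.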

Successfully merging this pull request may close these issues.

BUG: unrecognised file when downloading Busco db
2 participants