Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Response to issue 34 #35

Open
wants to merge 8 commits into
base: master
Choose a base branch
from
Open

Response to issue 34 #35

wants to merge 8 commits into from

Conversation

LTibbs
Copy link

@LTibbs LTibbs commented Dec 16, 2024

I had some issues with the prokaryote trees, as described in Issue 34 (#34). So, I made the following updates to address that:

  1. Update uniprot_strata() to work with from=1 (starting from cellular organisms).
  2. Create function generate_prokaryote_tree to build new prokaryotic tree from UniProt with customizable species numbers and weights.
  3. Create function use_custom_prokaryote_tree to add the custom tree to the analysis, analogous to use_recommended_prokaryotes.

@arendsee
Copy link
Owner

@LTibbs Thanks for the PR. I appreciate the work you have put into this. I've been a bit under the rocks lately, but I'll look over your commits soon. I do have an old function for making prokaryotic trees: uniprot_sample_prokaryotes(remake=TRUE). Running that will create a new tree that is in harmony with they latest Uniprot annotation. This should avoid all the warnings you observed. I'll be back in touch soon.

@arendsee
Copy link
Owner

@LTibbs I've updated the phylostratr functions uniprot_sample_prokaryotes (see c1d3deb) and the use_recommended_prokaryotes. I'm happy to merge your contributions, but first see if these updated functions solve your problem. After pulling the latest from the main phylostratr branch, just update your code with use_recommended_prokaryotes(version=2). I do the idea behind your generate_prokaryote_tree - it is more general than uniprot_sample_prokaryotes in that it allows weights and numbers to be set. Your update to uniprot_strata also looks reasonable, though I think the prokaryote outgroup will typically need special handling. So I'll keep this pull request open.

@LTibbs
Copy link
Author

LTibbs commented Dec 20, 2024

@arendsee Thanks for pointing out the option to have remake=T in the uniprot_sample_prokaryotes function, and for your update to use_recommended_prokaryotes. That does solve my original problem of addressing warnings in the recommended prokaryotes. I know that some of the warnings in the earlier version were just due to not having a reference proteome, but others were more problematic (missing or "excluded" proteomes) so it's good to be able to update the tree easily as UniProt updates.

Still, I do think there is value in having the more general tree building function available for those who want to customize their prokaryote tree further, in both weights and size. I plan to use custom weights myself, so I think this branch and generate_prokaryote_tree could potentially be useful to others.

Depending on how things go, I could even combine the two approaches by adding an option for weights to the uniprot_sample_prokaryotes sampling?

@LTibbs
Copy link
Author

LTibbs commented Jan 6, 2025

@arendsee, I just finished code to add weights to the uniprot_sample_prokaryotes function, but because it's based on your new branch, I think adding it now may cause conflicts. Could you merge my other changes, and then I can submit a new pull request on the most updated branch?

In addition, as I was working with phylostratr more, I found what was (to me) an unexpected result: even if I set weights for a given taxon to 0 or a negative number (i.e., intending to exclude it due to low quality proteome) it could still potentially be included in the final tree, depending on its position in the tree. So, I added a drop.names argument to uniprot-access.R in order to enable completely excluding certain taxa from the final tree.

@arendsee
Copy link
Owner

arendsee commented Jan 7, 2025

Hi @LTibbs, your work here is awesome and super-appreciated; we'll get everything merged. But could we kick this down the road two weeks? I'm in serious crunch time at the moment working on a paper re-submission. Once that's done, I can allocate the time to re-familiarize myself with the phylostratr code and merge your work. Feel free to keep working on the code in the meantime, we'll get everything through eventually.

@LTibbs
Copy link
Author

LTibbs commented Jan 7, 2025

Of course! Hope your paper submission goes well. I really appreciate the work you did making this package, I'm just making these tweaks to help in my own use of the package and hope they can be useful for others!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants