I'm trying out iqtree2's gene concordance factor analysis for the first time, and I'm a bit stumped by this. If I pass in the bootstrap trees, the gCF analysis runs as expected, like so. However, if I pass in my gene trees, they are interpreted as rooted, which causes the analysis to fail. Which is weird to me. These trees were also made with iqtree2, and were not rooted. They were, however, manipulated with It looks like Also, "best viewed in FigTree"? I appreciate the suggestion, but FigTree? Really?
@ryneches I'm confused about the Can you post a (subset of?) your data, particularly the Also I suspect a workaround here would be to swap names on the alignments themselves (i.e. before estimating gene trees in IQ-TREE), so that you circumvent the need for an intermediate tree processing step. Finally, I am shocked to see the wonderful FigTree dissed. That said, I am open to suggestions of other amazing interactive tree viewers. (Particularly if we can format amazingly long and complicated branch labels and then have really flexible ways of viewing them that don't require attempting to use the various, obviously excellent, implementations of tree-viewing libraries in R and Python.)
There is no taxon named Here are my test files (gzipped to make GitHub happy): concat.treefile.gz With any luck, this will turn out to be some embarrassing oversight on my part, and we can all go about our day. I would like to avoid renaming the alignments. There are a lot of paralogous genes in this analysis, and rebuilding the trees with taxa represented by every combination of paralogs would be absolutely impossible. Anyway, I was just joking around about FigTree. We have students in the lab who are younger than the codebase! When folks ask for an approachable tool for interactive tree exploration, I think the recommendation these days is usually iTOL. All of these tools have their downsides, but it's hard to beat "browse to this website" for ease of installation.
OK, I have an explanation and a solution for you. I also wonder if @bqminh and I should think about force-unrooting gene trees for gcf calculation.
1. Your gene trees are still rooted (I think...)
The newick utilities manual says that newick assumes all trees are rooted, unless there's a trifurcation at the root. Your gene trees (at least, the one I looked at in the wonderful FigTree) have bifurcations at the root, so are strictly rooted. I suspect (but don't know) that this is why IQ-TREE thinks they are rooted.
2. Newick Utilities unroots them to IQ-TREE's satisfaction
If you don't already know about newick utilities, it's incredible:
Here's how it solves your issue. I can reproduce your error like this (IQ-TREE v. 2.3.4)

    iqtree2 -t concat.treefile --gcf genes.treefile --prefix genes

giving:
But if I first unroot with

    nw_reroot -d genes.treefile > genes_unrooted.treefile
    iqtree2 -t concat.treefile --gcf genes_unrooted.treefile --prefix genes_unrooted

Runs fine, and produces output like
A comment
Since in your case the gCFs are very low because most genes end up in gDF_P (i.e. one of the four groups required to define a branch is paraphyletic in your trees), the qCF might be more informative. For example, if most of the reason that they are ending up in gDF_P is noise, then qCF will avoid that because it uses quartets, meaning that the four groups are assumed to be monophyletic by default.
More info on that in this preprint: https://ecoevorxiv.org/repository/view/6484/
And an accompanying almost-finished how-to for getting qCFs and mapping them along with gCFs and other things (you need ASTRAL for qCFs) here: https://github.com/iqtree/iqtree2/wiki/Estimating-gene,-site,-and-quartet-concordance-vectors
Hope some of that helps. Also yes, +10 for iTOL. I do need to get good at using it, it looks amazing.
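To make the unrooting step concrete, here is a minimal Python sketch of what collapsing a bifurcating root into a trifurcation looks like on a Newick string. It is not a reimplementation of `nw_reroot -d`, and its handling of branch lengths and labels may differ from that tool. The function names are hypothetical, and the sketch assumes one tree per string with no quoted labels or bracketed comments. The support label on the dissolved node is dropped, and its branch length is added to the sibling edge.

```python
def split_top_level(body):
    """Split the text inside the outermost parentheses on depth-0 commas."""
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(body):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            parts.append(body[start:i])
            start = i + 1
    parts.append(body[start:])
    return parts


def split_length(node):
    """Split 'subtree:length' into (subtree, length), with None if no length."""
    colon = node.rfind(":")
    if colon > node.rfind(")"):
        return node[:colon], float(node[colon + 1:])
    return node, None


def unroot(newick):
    """Turn a bifurcating root into a trifurcation; leave other trees alone."""
    s = newick.strip().rstrip(";")
    if not s.startswith("("):
        return s + ";"
    close = s.rfind(")")
    children = split_top_level(s[1:close])
    if len(children) != 2:
        return s + ";"  # already multifurcating at the root
    for idx in (0, 1):
        subtree, length = split_length(children[idx])
        if subtree.startswith("("):
            # Dissolve this internal child: its children join the root directly,
            # and its branch length is merged into the sibling's branch.
            other, other_length = split_length(children[1 - idx])
            if length is not None or other_length is not None:
                other += f":{(length or 0.0) + (other_length or 0.0):g}"
            grandchildren = subtree[1:subtree.rfind(")")]
            return f"({grandchildren},{other}){s[close + 1:]};"
    return s + ";"  # two leaves only: nothing to collapse
```

For example, `unroot("((A,B),(C,D));")` gives `(A,B,(C,D));`, which has three children at the root and would therefore be read as unrooted under the convention described above.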
Yes, I can confirm @roblanf is right. When there is a bifurcation at the root in the newick string, IQ-TREE will classify the tree as rooted. If there is a trifurcation (or a higher-degree split) at the root, IQ-TREE will treat it as an unrooted tree. I believe that's also a convention in other software. Re message about
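The root-degree convention described here can be checked before running the analysis. The following is a rough Python sketch, not IQ-TREE's actual parser: it counts the children attached directly to the root by counting commas at parenthesis depth 1. The function names are hypothetical, and it assumes labels contain no quoted parentheses or commas.

```python
def root_degree(newick: str) -> int:
    """Count the children attached directly to the root of a Newick string."""
    depth = 0
    children = 0
    for ch in newick.strip():
        if ch == "(":
            depth += 1
            if depth == 1:
                children = 1  # opening the root group means at least one child
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 1:
            children += 1  # each depth-1 comma separates two root children
    return children


def looks_rooted(newick: str) -> bool:
    """Apply the convention from the thread: a bifurcating root means rooted."""
    return root_degree(newick) == 2
```

With this, `looks_rooted("((A,B),(C,D));")` is `True`, while `looks_rooted("(A,B,(C,D));")` is `False`. Running it over each line of a gene tree file is a quick way to spot trees that would be classified as rooted.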
@roblanf, you are a scholar and a gentleman. Thank you! You are absolutely right about these gene trees, and qCF will indeed be my next stop. I have several thousand of these consensus trees to analyze, so my intent is to use the gCF analysis to get a sense of how much of a mess the gene trees are. I promise to stop making fun of FigTree. I named my own phylogenetics package after a meme from 2013, so stones and glass houses, etc etc. @bqminh I thought that was where May I humbly request an option to override the rooting inference in some future release?
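With several thousand trees to process, the two-step workaround from this thread (unroot with `nw_reroot -d`, then run `iqtree2 --gcf`) could be scripted. Below is a minimal Python sketch under stated assumptions: both tools are on `PATH`, the only flags used are the ones quoted in the thread, and the helper names and the `_unrooted` naming scheme are hypothetical choices that mirror the example file names above.

```python
import subprocess
from pathlib import Path


def unroot_command(gene_trees):
    """Command that writes unrooted gene trees to stdout (as used in the thread)."""
    return ["nw_reroot", "-d", str(gene_trees)]


def gcf_command(species_tree, unrooted_trees, prefix):
    """The iqtree2 gCF invocation from the thread, with paths filled in."""
    return ["iqtree2", "-t", str(species_tree),
            "--gcf", str(unrooted_trees), "--prefix", str(prefix)]


def run_gcf(species_tree, gene_trees):
    """Unroot one gene tree file, then run the gCF analysis on the result."""
    gene_trees = Path(gene_trees)
    # genes.treefile -> genes_unrooted.treefile, next to the input file
    unrooted = gene_trees.with_name(gene_trees.stem + "_unrooted" + gene_trees.suffix)
    with open(unrooted, "w") as out:
        subprocess.run(unroot_command(gene_trees), stdout=out, check=True)
    # The output prefix is the unrooted file path without its extension
    prefix = unrooted.with_suffix("")
    subprocess.run(gcf_command(species_tree, unrooted, prefix), check=True)
    return unrooted
```

Calling `run_gcf("concat.treefile", "genes.treefile")` would reproduce the two commands shown earlier; looping it over pairs of reference and gene tree files depends on a directory layout that the thread does not describe.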
Well, there is a simpler solution: you can force all trees to be rooted via