
Build maximum likelihood phylogenetic trees with automatic model selection and ultrafast bootstrap.
IQ-TREE is a phylogenetic inference program that uses maximum likelihood to reconstruct evolutionary relationships from aligned sequences. It combines fast tree search algorithms with automated model selection to produce publication-quality phylogenetic trees for both protein and nucleotide sequences.
The software addresses two major bottlenecks in phylogenetic analysis: selecting the best substitution model and assessing branch support. ModelFinder automatically identifies the optimal evolutionary model 10-100 times faster than traditional tools like jModelTest. The ultrafast bootstrap (UFBoot) provides branch support values 10-40 times faster than standard bootstrap methods while maintaining statistical rigor.
IQ-TREE works with aligned sequences. If your sequences are unaligned, use MAFFT or Clustal Omega first to create a multiple sequence alignment. For faster approximate trees, consider FastTree.
Maximum likelihood phylogenetics finds the tree topology and branch lengths that maximize the probability of observing your sequence data under a given evolutionary model. For each tree, IQ-TREE calculates the likelihood L that the observed alignment arose from that tree:
$$\ln L = \sum_{i=1}^{n} \ln P(x_i \mid T, \theta)$$
where $x_i$ is the pattern at site $i$, $T$ is the tree topology, and $\theta$ represents model parameters like substitution rates and branch lengths. IQ-TREE uses a stochastic perturbation algorithm that efficiently explores tree space by making strategic rearrangements guided by likelihood improvements.
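As a rough illustration of summing per-site log-probabilities, the sketch below evaluates the log-likelihood of just two aligned sequences under the Jukes-Cantor (JC69) model and searches a grid for the distance that maximizes it. This is a deliberately tiny stand-in, not IQ-TREE's algorithm: the sequences, the function name, and the grid search are all assumptions made for the example.

```python
import math

def jc69_log_likelihood(seq_a, seq_b, d):
    """Sum of per-site log-probabilities for two sequences at JC69 distance d."""
    e = math.exp(-4.0 * d / 3.0)
    p_same = 0.25 * (0.25 + 0.75 * e)  # joint probability of one identical pair
    p_diff = 0.25 * (0.25 - 0.25 * e)  # joint probability of one specific mismatch
    return sum(math.log(p_same if a == b else p_diff)
               for a, b in zip(seq_a, seq_b))

# Hypothetical toy alignment: 20 sites, 2 of which differ.
seq_a = "ACGTACGTACGTACGTACGT"
seq_b = "ACGTACGAACGTACTTACGT"

# Crude grid search over the distance (substitutions per site).
grid = [i / 1000 for i in range(1, 1001)]
best_d = max(grid, key=lambda d: jc69_log_likelihood(seq_a, seq_b, d))

# Closed-form JC69 estimate, used here only as a cross-check.
p = sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)
analytic_d = -0.75 * math.log(1.0 - 4.0 * p / 3.0)

print(f"grid maximum: {best_d:.3f}, analytic estimate: {analytic_d:.3f}")
```

A real tree search optimizes the topology, every branch length, and the model parameters together, but the quantity being maximized has the same sum-over-sites form.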
ModelFinder tests a comprehensive set of substitution models and selects the best one using information criteria. For each model M, it calculates:
$$\mathrm{BIC} = -2 \ln L + k \ln n$$
where $k$ is the number of free parameters and $n$ is the alignment length. BIC penalizes complex models more heavily than AIC, reducing the risk of overfitting. ModelFinder supports 546 protein models and 286 DNA models, including rate heterogeneity options like gamma-distributed rates and FreeRate categories.
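The effect of the penalty term is easiest to see with numbers. The sketch below scores three candidate models with BIC and AIC; the log-likelihoods, parameter counts, and alignment length are made-up values chosen for illustration, not output from ModelFinder.

```python
import math

def bic(ln_l, k, n):
    """Bayesian information criterion: lower is better."""
    return -2.0 * ln_l + k * math.log(n)

def aic(ln_l, k):
    """Akaike information criterion: lower is better."""
    return -2.0 * ln_l + 2.0 * k

# Hypothetical (log-likelihood, free parameters) per model.
candidates = {
    "JC": (-5230.4, 1),
    "HKY": (-5105.9, 5),
    "GTR+G4": (-5098.2, 10),
}
n_sites = 1200  # hypothetical alignment length

bic_scores = {m: bic(ln_l, k, n_sites) for m, (ln_l, k) in candidates.items()}
aic_scores = {m: aic(ln_l, k) for m, (ln_l, k) in candidates.items()}

best_bic = min(bic_scores, key=bic_scores.get)
best_aic = min(aic_scores, key=aic_scores.get)
print("best by BIC:", best_bic)
print("best by AIC:", best_aic)
```

With these invented numbers the two criteria disagree: the small likelihood gain of the richer model is enough for AIC but not for BIC, whose per-parameter penalty is ln(1200), about 7.1, instead of 2.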
The automatic selection saves you from manually testing dozens of models. ModelFinder identifies not just the substitution matrix (like JTT or GTR) but also the optimal rate heterogeneity model and invariant site proportion.
Traditional bootstrap resampling requires rebuilding thousands of trees from resampled alignments, which is computationally expensive. UFBoot approximates the bootstrap distribution using a more efficient approach: it performs maximum likelihood tree search on fewer replicates while estimating bootstrap proportions from intermediate tree topologies encountered during the search.
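For reference, the resampling step of the traditional bootstrap can be sketched as follows: alignment columns are drawn with replacement until the replicate has the same length as the original. This shows the standard procedure only, not UFBoot's approximation, and the toy alignment and function name are assumptions for the example.

```python
import random

def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement (standard bootstrap)."""
    n_sites = len(next(iter(alignment.values())))
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    # The same column indices are applied to every sequence, so sites stay intact.
    return {name: "".join(seq[i] for i in columns)
            for name, seq in alignment.items()}

# Hypothetical toy alignment.
alignment = {
    "taxonA": "ACGTACGTAC",
    "taxonB": "ACGTTCGTAC",
    "taxonC": "ACGAACGTTC",
}

rng = random.Random(42)  # fixed seed so the example is repeatable
replicate = bootstrap_alignment(alignment, rng)
for name, seq in replicate.items():
    print(name, seq)
```

In the traditional approach a full tree search is then run on each such replicate, which is where the computational cost comes from.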
We recommend 1000 bootstrap replicates for publication-quality results. Values ≥95% indicate strong support, 70-95% moderate support, and <70% weak support. UFBoot provides approximately unbiased estimates comparable to standard bootstrap but finishes in a fraction of the time.
The SH-aLRT (Shimodaira-Hasegawa approximate likelihood ratio test) offers an alternative to bootstrap. For each branch, it tests whether the best arrangement of the subtrees around that branch is significantly more likely than the next-best alternative. SH-aLRT runs even faster than UFBoot and uses a different statistical framework, so using both can provide complementary confidence measures.
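If you run IQ-TREE from the command line rather than through a form, both support measures can be requested in one run. The sketch below only assembles the argument list and does not execute anything. It assumes the IQ-TREE 2 option names (`-s`, `-m`, `-B`, `-alrt`) and an executable called `iqtree2`; version 1 uses `-bb` in place of `-B`, so check the options of your installed version. The helper name and the file name are hypothetical.

```python
def build_iqtree_command(alignment_path, model="MFP", ufboot=1000, alrt=1000,
                         executable="iqtree2"):
    """Assemble an IQ-TREE 2 style argument list (assumed option names)."""
    cmd = [executable, "-s", alignment_path, "-m", model]
    if ufboot:
        cmd += ["-B", str(ufboot)]    # ultrafast bootstrap replicates
    if alrt:
        cmd += ["-alrt", str(alrt)]   # SH-aLRT replicates
    return cmd

command = build_iqtree_command("alignment.fasta")
print(" ".join(command))
```

The list can be passed to `subprocess.run(command, check=True)` once the executable name and options have been confirmed for your installation.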
Your input must be a pre-aligned multiple sequence alignment with at least 3 sequences. IQ-TREE will fail if you provide unaligned sequences because maximum likelihood requires site-by-site comparisons across all sequences.
Supported formats include FASTA, PHYLIP, NEXUS, and CLUSTAL. FASTA is the most common choice for its simplicity.
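A quick pre-flight check can catch unaligned input before a run is submitted. The sketch below reads FASTA text and reports two problems: fewer than three sequences, or sequences of unequal length. The function names and the example records are assumptions, and the parser is intentionally minimal.

```python
def read_fasta(text):
    """Parse FASTA text into a {name: sequence} dictionary."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records

def check_alignment(records):
    """Return a list of problems; an empty list means the basic checks passed."""
    problems = []
    if len(records) < 3:
        problems.append(f"only {len(records)} sequence(s); at least 3 are required")
    lengths = {len(seq) for seq in records.values()}
    if len(lengths) > 1:
        problems.append(f"unequal sequence lengths {sorted(lengths)}; input looks unaligned")
    return problems

# Hypothetical input: the third sequence is shorter, so it is not aligned.
fasta_text = """>seq1
ACGT-ACGT
>seq2
ACGTTACGT
>seq3
ACGTACG
"""
print(check_alignment(read_fasta(fasta_text)))
```

Equal lengths are necessary but not sufficient: sequences of the same length can still be unaligned, so this only catches the obvious cases.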
Auto (ModelFinder): Automatically selects the best model using BIC. This is the recommended option for most analyses as it tests all available models and chooses the optimal one.
ModelFinder Plus: Extended model selection that includes FreeRate models and other advanced options. Slower but more thorough than standard ModelFinder.
JTT, WAG, LG: Protein-specific substitution matrices. JTT works well for general protein data, WAG for nuclear proteins, and LG for diverse protein families.
GTR: General time reversible model for nucleotides. The most parameter-rich DNA model with six substitution rates and four base frequencies.
HKY, K2P: Simpler nucleotide models. HKY distinguishes transitions from transversions with unequal base frequencies. K2P assumes equal base frequencies.
IQ-TREE outputs a phylogenetic tree in Newick format, which displays branch relationships and support values. Branch lengths represent evolutionary distance; longer branches indicate more substitutions per site.
Support values appear at internal nodes. With UFBoot enabled, these represent the percentage of bootstrap replicates supporting that clade. Values ≥95% are considered strong evidence, 70-95% moderate, and <70% weak. Branches with weak support indicate uncertainty in the tree topology at those positions.
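The thresholds above can be applied directly to a Newick string. The sketch below pulls out the numeric labels that follow closing parentheses and sorts them into the three categories. It assumes a tree with a single support value per node (UFBoot only); the tree itself is invented, and a proper Newick parser would be more robust than a regular expression.

```python
import re

def classify_support(value):
    """Map a support percentage to the categories used in this document."""
    if value >= 95:
        return "strong"
    if value >= 70:
        return "moderate"
    return "weak"

def support_values(newick):
    """Extract numeric node labels that directly follow a closing parenthesis."""
    return [float(v) for v in re.findall(r"\)(\d+(?:\.\d+)?)(?=[:,;)])", newick)]

# Hypothetical tree with three supported internal branches.
tree = "((A:0.1,B:0.2)100:0.05,(C:0.3,D:0.1)88:0.02,(E:0.4,F:0.2)62:0.01);"

summary = [(v, classify_support(v)) for v in support_values(tree)]
print(summary)
```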
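The results include the selected substitution model and its parameters. For example, GTR+F+I+G4 indicates:
GTR: General time reversible substitution model.
+F: Empirical base frequencies counted from the alignment.
+I: A proportion of invariable sites.
+G4: Gamma-distributed rate heterogeneity approximated with four discrete categories.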
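A model string of this kind is a base model followed by `+`-separated modifiers, so it can be taken apart mechanically. The sketch below splits the string and attaches a short description to the modifiers it recognizes. It covers only a handful of common modifiers, and the helper names are assumptions for the example.

```python
def split_model(model):
    """Split a model string such as 'GTR+F+I+G4' into base model and modifiers."""
    base, *modifiers = model.split("+")
    return base, modifiers

def describe_modifier(mod):
    """Describe a few common modifiers; anything else is reported as unrecognized."""
    if mod == "F":
        return "empirical state frequencies from the alignment"
    if mod == "I":
        return "proportion of invariable sites"
    if mod.startswith("G") and mod[1:].isdigit():
        return f"discrete gamma rates, {mod[1:]} categories"
    if mod.startswith("R") and mod[1:].isdigit():
        return f"FreeRate model, {mod[1:]} categories"
    return "unrecognized modifier"

base, modifiers = split_model("GTR+F+I+G4")
print("base model:", base)
for mod in modifiers:
    print(f"+{mod}: {describe_modifier(mod)}")
```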
IQ-TREE excels when you need rigorous model selection and publication-quality results. Automatic ModelFinder testing bases your tree on the best-fitting model among those tested, rather than on a model chosen by habit.
For large alignments where speed matters more than model sophistication, FastTree provides approximate maximum likelihood trees much faster. FastTree uses simpler models but can handle datasets with tens of thousands of sequences.
For phylogenomic analyses with hundreds of genes, IQ-TREE's partition models can assign different evolutionary models to different alignment regions. This is important when combining genes with different evolutionary rates.
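For a concatenated alignment, the partition boundaries follow directly from the gene lengths. The sketch below writes them in the RAxML-style partition format, which IQ-TREE accepts as far as I am aware; confirm the expected format and the option for supplying the file against the documentation for your version. The gene names and lengths are hypothetical.

```python
def raxml_style_partitions(genes, data_type="DNA"):
    """Build partition lines from (name, length) pairs in concatenation order."""
    lines = []
    start = 1  # alignment positions are 1-based and inclusive
    for name, length in genes:
        end = start + length - 1
        lines.append(f"{data_type}, {name} = {start}-{end}")
        start = end + 1
    return "\n".join(lines)

# Hypothetical genes in the order they were concatenated.
genes = [("gene1", 500), ("gene2", 300), ("gene3", 1200)]
partition_text = raxml_style_partitions(genes)
print(partition_text)
```

Each partition can then receive its own substitution model, which is what allows fast and slow genes to be modeled separately.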
Start with ModelFinder (Auto) unless you have strong prior knowledge about which model fits your data. Manual model selection rarely outperforms ModelFinder's systematic approach.
Run at least 1000 bootstrap replicates for publication. Preliminary analyses can use 100-200 to save time, but final trees should use ≥1000 for reliable support estimates.
Check for outlier sequences before running IQ-TREE. Sequences with excessive gaps or unusual composition can distort tree topology. If bootstrap support is uniformly low, your alignment may be too divergent or contain paralogs rather than orthologs.
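One simple screen for problem sequences is the fraction of gap characters in each row of the alignment. The sketch below flags sequences above a cutoff. The 50% threshold, the choice of `-` and `?` as gap characters, and the example data are all assumptions; pick values that suit your dataset.

```python
def gap_fraction(seq, gap_chars="-?"):
    """Fraction of positions in a sequence that are gap or missing characters."""
    if not seq:
        return 0.0
    return sum(c in gap_chars for c in seq) / len(seq)

def flag_gappy(alignment, threshold=0.5):
    """Return names of sequences whose gap fraction exceeds the threshold."""
    return [name for name, seq in alignment.items()
            if gap_fraction(seq) > threshold]

# Hypothetical alignment: taxonC is mostly gaps.
alignment = {
    "taxonA": "ACGTACGTAC",
    "taxonB": "ACG-ACGTAC",
    "taxonC": "------GT--",
}
print(flag_gappy(alignment))
```

Flagged sequences are candidates for inspection, not automatic removal; a fragmentary sequence may still be the only representative of an important lineage.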
The alignment quality matters more than the phylogenetic method. Poorly aligned regions introduce noise that no tree-building algorithm can overcome. Consider using MAFFT with the L-INS-i algorithm for difficult-to-align sequences.