RAxML-NG

Infer maximum-likelihood phylogenetic trees from aligned protein or DNA sequences with bootstrap support.

What is RAxML-NG?

RAxML-NG infers phylogenetic trees from an existing multiple sequence alignment using maximum likelihood. It is the current-generation rewrite of RAxML and ExaML, built for faster likelihood calculations, more stable optimization, and large alignments where branch support still matters.

The practical point of RAxML-NG is not just tree building. It is tree building under explicit evolutionary models, with branch lengths, likelihood scores, and bootstrap support that can stand up in downstream comparative analyses. That makes it a better fit than approximate methods when the topology will be interpreted biologically or used in a manuscript.

This tool expects an alignment, not raw sequences. If the sequences are not already aligned, start with MAFFT, MUSCLE5, or Clustal Omega before running tree inference.
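
A quick pre-flight check can catch unaligned input before a job is submitted. The sketch below is not part of RAxML-NG or ProteinIQ. The helper names `read_fasta` and `looks_aligned` are hypothetical, and the only rule applied is that an aligned FASTA file has at least two sequences of identical length, gaps included.

```python
def read_fasta(text):
    """Parse FASTA text into a {name: sequence} dict (duplicate names overwrite)."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def looks_aligned(records):
    """True when there are at least 2 sequences and all share one length."""
    lengths = {len(seq) for seq in records.values()}
    return len(records) >= 2 and len(lengths) == 1


aligned = read_fasta(">s1\nACG-T\n>s2\nAC-TT\n")
raw = read_fasta(">s1\nACGT\n>s2\nACGTTT\n")
print(looks_aligned(aligned), looks_aligned(raw))
```

Equal lengths are necessary but not sufficient: raw sequences that happen to share a length pass this check, so it only rules out obviously unaligned input.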

How does RAxML-NG work?

RAxML-NG searches tree space for the topology and branch lengths that maximize the likelihood of the observed alignment under a substitution model. In practice, that means comparing many candidate trees, optimizing branch lengths and model parameters, and keeping the trees that improve the score.
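
To make "maximize the likelihood under a substitution model" concrete, here is a toy illustration for the smallest possible case: two sequences, one branch length, and the Jukes-Cantor (JC69) model. It is not RAxML-NG's search algorithm, and the site counts are made-up values. It only shows that scanning candidate branch lengths for the highest likelihood lands on the known closed-form JC69 estimate.

```python
import math


def jc69_log_likelihood(d, n_sites, n_diff):
    """Log-likelihood of a two-sequence alignment under JC69 at distance d."""
    decay = math.exp(-4.0 * d / 3.0)
    p_same = 0.25 + 0.75 * decay   # P(site shows the same base in both sequences)
    p_diff = 0.25 - 0.25 * decay   # P(site shows one specific different base)
    # Each site also contributes the stationary base frequency 1/4.
    return ((n_sites - n_diff) * math.log(0.25 * p_same)
            + n_diff * math.log(0.25 * p_diff))


n_sites, n_diff = 100, 10  # hypothetical alignment: 10 mismatches in 100 columns

# Brute-force scan over candidate branch lengths 0.001 .. 1.000.
grid = [i / 1000 for i in range(1, 1001)]
best_d = max(grid, key=lambda d: jc69_log_likelihood(d, n_sites, n_diff))

# Closed-form JC69 maximum-likelihood distance for comparison.
p = n_diff / n_sites
closed_form = -0.75 * math.log(1.0 - 4.0 * p / 3.0)
print(best_d, round(closed_form, 4))
```

A real tree search repeats this kind of optimization over every branch and many candidate topologies, which is why efficient likelihood calculation matters.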

The rewrite improved several pieces of the original RAxML search procedure. The published implementation fixes missed topological moves from earlier versions, improves optimization for models such as LG4X, adds transfer bootstrap expectation (TBE), and reports terraces in tree space when the data structure implies many equally scoring topologies. It also incorporates site-repeat optimizations and parallelization improvements that matter on taxon-rich datasets.

One consequence of maximum likelihood inference is that alignment quality dominates the final result. RAxML-NG can optimize a tree very efficiently, but it cannot rescue a poor alignment, mixed paralogs, or a nucleotide alignment forced into a protein model.
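
One way to avoid fitting the wrong model family is to inspect the alphabet before submitting. The heuristic below is an assumption for illustration, not the detection logic used by RAxML-NG or ProteinIQ. The function name and the `min_residues` cutoff are hypothetical. It relies on one fact: letters such as E, F, I, L, P, and Q are amino-acid codes but not IUPAC nucleotide codes.

```python
IUPAC_NUCLEOTIDE = set("ACGTURYSWKMBDHVN")
GAPS = set("-.?")


def guess_sequence_type(sequences, min_residues=50):
    """Return 'Protein', 'DNA', or 'Ambiguous' from the letters observed."""
    residues = [c for seq in sequences for c in seq.upper() if c not in GAPS]
    # Any letter outside the nucleotide alphabet means this cannot be DNA.
    if any(c not in IUPAC_NUCLEOTIDE for c in residues):
        return "Protein"
    # Nucleotide letters are also valid amino-acid codes, so very short
    # inputs cannot be called with confidence (the cutoff is arbitrary).
    if len(residues) < min_residues:
        return "Ambiguous"
    return "DNA"


print(guess_sequence_type(["MKLVE", "MKIVE"]))
print(guess_sequence_type(["ACGTN", "ACGAN"]))
print(guess_sequence_type(["ACGT" * 20, "ACGA" * 20]))
```

When the result is ambiguous, setting the sequence type by hand is safer than trusting any automatic guess.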

How to use RAxML-NG online

RAxML-NG runs on ProteinIQ from a pre-aligned protein or DNA dataset in FASTA or PHYLIP format. Submit the alignment, choose whether to run ML search, bootstrapping, or both, and the job returns Newick tree files, bootstrap outputs, support-annotated trees, model information, and the execution log produced by RAxML-NG.

Inputs

Input	Description
`Alignment`	One aligned protein or DNA dataset in `FASTA`, `PHYLIP`, or plain text containing one of those formats. At least 2 sequences are required.
`Job name`	Optional label for the submitted job in the ProteinIQ interface.

Settings

Setting	Description
`Analysis mode`	`ML tree search`, `Bootstrapping only`, or `ML search + bootstrapping`. The combined mode is the default.
`Sequence type`	`Auto-detect from alignment`, `Protein`, or `DNA`. Auto-detection works when the alignment alphabet is unambiguous. Short nucleotide alignments made only of `A`, `C`, `G`, `T`, and `N` may need manual selection, because those letters are also valid amino-acid codes.
`Model mode`	`Automatic (AA or DNA)` uses the published model family shorthand, `AA` or `DNA`. `Custom model string` passes an explicit RAxML-NG model such as `GTR+G` or `LG+G8+F`.
`Custom model`	Required only when `Model mode` is set to custom. Useful when a manuscript or lab workflow needs a specific model string instead of the family default.
`Starting tree`	`auto`, `pars{N}`, or `rand{N}`. `pars{10}` starts from 10 parsimony trees, `rand{10}` starts from 10 random trees. This only affects ML search workflows.
`Random seed`	Positive integer used for reproducibility. The same alignment and same settings should reproduce the same random starting conditions.
`CPU threads`	Number of CPU threads used by the hosted run. More threads can reduce wall-clock time on larger alignments.
`Bootstrap replicates`	Number of bootstrap trees when bootstrap analysis is enabled. `100` is often enough for exploratory work. Publication workflows often use several hundred to one thousand replicates.
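`Bootstrap support metric`	`FBP`, `TBE`, or `RBS`. `FBP` is the classic bootstrap proportion. `TBE` is often more stable on large trees with a few unstable taxa. `RBS` is the rapid bootstrap workflow, for when runtime matters more than strict comparability with classical bootstrap.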
`Output prefix`	Prefix used for downloadable filenames such as `raxml.bestTree.newick`.
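
For readers who also run RAxML-NG locally, the settings above correspond roughly to command-line options of the `raxml-ng` binary. The sketch below only assembles an argument list and does not execute anything. The exact command the hosted tool runs is not documented here, so this mapping is an assumption. The function name `build_raxml_ng_command` and the file name `alignment.fasta` are hypothetical.

```python
import shlex

# Assumed mapping from the hosted "Analysis mode" labels to raxml-ng commands.
MODE_FLAGS = {
    "ML tree search": "--search",
    "Bootstrapping only": "--bootstrap",
    "ML search + bootstrapping": "--all",
}


def build_raxml_ng_command(msa, mode="ML search + bootstrapping", model="GTR+G",
                           tree="pars{10}", seed=42, threads=4,
                           bs_trees=100, bs_metric="fbp", prefix="raxml"):
    """Assemble a raxml-ng argument list from tool-style settings."""
    cmd = ["raxml-ng", MODE_FLAGS[mode], "--msa", msa, "--model", model,
           "--seed", str(seed), "--threads", str(threads), "--prefix", prefix]
    if mode != "Bootstrapping only":
        cmd += ["--tree", tree]               # starting trees matter only for ML search
    if mode != "ML tree search":
        cmd += ["--bs-trees", str(bs_trees)]  # number of bootstrap replicates
    if mode == "ML search + bootstrapping":
        cmd += ["--bs-metric", bs_metric]     # support is mapped onto the best tree
    return cmd


cmd = build_raxml_ng_command("alignment.fasta", model="LG+G8+F", bs_metric="tbe")
print(shlex.join(cmd))
```

Keeping the command as a list rather than a single string avoids shell-quoting problems with values such as `pars{10}` if it is later passed to `subprocess.run`.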

Outputs

Output file	When it appears	Meaning
`prefix.bestTree.newick`	ML search or combined mode	Best maximum-likelihood tree in Newick format.
`prefix.bestTreeCollapsed.newick`	ML search or combined mode	Best tree with near-zero branches collapsed, useful when tiny branches clutter interpretation.
`prefix.bootstraps.newick`	Bootstrap or combined mode	All bootstrap replicate trees in one Newick file.
`prefix.support.newick` or metric-specific support trees	Bootstrap or combined mode	Support-annotated tree. Depending on the chosen metric, the tool can return files such as `supportFBP`, `supportTBE`, or `supportRBS`.
`prefix.bestModel.txt`	When emitted by RAxML-NG	Optimized model parameters and final model details.
`prefix.mlTrees.newick`	ML search or combined mode	Candidate ML trees from the starting-tree searches. Useful for auditing the search rather than for final reporting.
`prefix.raxml.log`	Always when the run succeeds	Full RAxML-NG log, including command details and summary statistics such as final log-likelihood, AIC, and BIC when available.
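
The AIC and BIC values in the log follow from the final log-likelihood by the standard definitions, AIC = 2k - 2 lnL and BIC = k ln(n) - 2 lnL. The sketch below applies those formulas to hypothetical numbers. How RAxML-NG counts the free parameters k is not described here, so treat `k` as a value read from the log rather than something this snippet derives.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 lnL."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_sites):
    """Bayesian information criterion: k ln(n) - 2 lnL."""
    return n_params * math.log(n_sites) - 2 * log_likelihood


# Hypothetical values: lnL = -1234.5, 20 free parameters, 500 alignment sites.
lnl, k, n = -1234.5, 20, 500
print(round(aic(lnl, k), 1), round(bic(lnl, k, n), 1))
```

Lower scores are better for both criteria, and they are only comparable between models fitted to the same alignment.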

Understanding the results

RAxML-NG results are easiest to interpret in three layers: topology, branch lengths, and branch support.

Topology

The branching pattern is the inferred evolutionary hypothesis. Internal nodes define clades. A change in topology is biologically meaningful, so any region of the tree with weak support should be treated as unresolved rather than overinterpreted.

Branch lengths

Branch lengths are reported in substitutions per site. Longer branches indicate more inferred evolutionary change. Extremely long terminal branches often signal problematic sequences, alignment issues, contamination, or highly divergent taxa.
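
Long terminal branches can be screened directly from the Newick output. The sketch below is a rough heuristic rather than an established test: it pulls leaf branch lengths out of a Newick string with a regular expression and flags leaves longer than a multiple of the median. The `factor=5.0` cutoff and the example tree are assumptions, and quoted Newick labels are not handled.

```python
import re
from statistics import median

# A leaf label directly follows "(" or ","; internal labels follow ")".
LEAF_BRANCH = re.compile(r"(?<=[(,])\s*([^(),:;\s]+):([0-9.eE+-]+)")


def terminal_branch_lengths(newick):
    """Map each leaf name to its terminal branch length."""
    return {name: float(length) for name, length in LEAF_BRANCH.findall(newick)}


def long_branches(newick, factor=5.0):
    """Leaves whose branch exceeds factor x the median terminal branch."""
    lengths = terminal_branch_lengths(newick)
    cutoff = factor * median(lengths.values())
    return sorted(name for name, length in lengths.items() if length > cutoff)


# Made-up tree: support labels 90 and 75 sit on internal nodes and are ignored.
tree = "((A:0.02,B:0.03)90:0.01,(C:0.025,D:0.9)75:0.02,E:0.04);"
print(long_branches(tree))  # ['D']
```

A flagged leaf is a prompt to re-inspect that sequence and its alignment, not proof that it is wrong.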

Support values

Support values only appear when bootstrap analysis is run. Each value summarizes how consistently a split is recovered across bootstrap replicates. Lower values indicate that resampling the alignment columns often alters that split.

Support value	Interpretation
`>= 95`	Strong support for that clade in most datasets
`80-94`	Reasonable support, often usable but worth checking against alignment quality
`70-79`	Weak to moderate support, commonly treated with caution
`< 70`	Unstable split, usually not strong evidence for that relationship

Those cutoffs are conventions, not laws. Deep trees with uneven taxon sampling can show lower classical bootstrap support even when the overall signal is meaningful. That is one reason TBE exists.
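
The resampling behind these support values is the standard nonparametric bootstrap: alignment columns are drawn with replacement until a pseudo-alignment of the original length is built, and a tree is inferred from each one. The sketch below shows only the column-resampling step on a made-up alignment. It illustrates the idea and is not how RAxML-NG implements it internally.

```python
import random


def bootstrap_replicate(alignment, rng):
    """Resample alignment columns with replacement, keeping rows in sync."""
    n_sites = len(next(iter(alignment.values())))
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[i] for i in columns)
            for name, seq in alignment.items()}


alignment = {"s1": "ACGTAC", "s2": "ACGTTC", "s3": "ACCTTC"}
rng = random.Random(42)  # fixed seed so the replicate is reproducible
replicate = bootstrap_replicate(alignment, rng)
for name, seq in replicate.items():
    print(name, seq)
```

Because every sequence uses the same sampled column indices, each replicate column is an intact column of the original alignment. Some columns appear several times and others not at all, which is what perturbs the inferred splits.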

FBP vs TBE vs RBS

Metric	What it emphasizes	When it is useful
`FBP`	Exact recovery of the same bipartition across replicates	Standard reporting and direct comparison with older phylogenetics literature
`TBE`	Similarity between splits rather than exact matching	Large trees where a few unstable taxa would otherwise deflate support for deep branches
`RBS`	Rapid bootstrap workflow	Faster support estimation when runtime matters more than strict comparability with classical bootstrap
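
The difference between FBP and TBE is easiest to see on a toy case. The sketch below scores one reference bipartition against three made-up replicate trees, each given as a list of bipartitions. FBP counts only exact matches. TBE uses the transfer distance, the number of taxa that would have to move to make two bipartitions identical, normalized by p - 1, where p is the size of the smaller side. It is a minimal illustration, not RAxML-NG's implementation, and it assumes a non-trivial split (p of at least 2).

```python
def transfer_distance(split, other, n_taxa):
    """Taxa that must move to turn one bipartition into the other."""
    diff = len(split ^ other)
    return min(diff, n_taxa - diff)  # either side may represent a bipartition


def support(split, replicate_trees, n_taxa):
    """Return (FBP, TBE) as fractions for one reference bipartition."""
    p = min(len(split), n_taxa - len(split))  # size of the lighter side
    distances = []
    for tree in replicate_trees:
        best = min(transfer_distance(split, other, n_taxa) for other in tree)
        distances.append(min(best, p - 1))  # terminal branches cap it at p - 1
    fbp = sum(d == 0 for d in distances) / len(distances)
    tbe = 1 - sum(distances) / (len(distances) * (p - 1))
    return fbp, tbe


taxa = "ABCDEF"
reference = frozenset("ABC")  # the split ABC | DEF
replicates = [
    [frozenset("AB"), frozenset("ABC"), frozenset("EF")],  # recovered exactly
    [frozenset("AB"), frozenset("EF"), frozenset("CEF")],  # one taxon misplaced
    [frozenset("AD"), frozenset("BE"), frozenset("CF")],   # not recovered
]
fbp, tbe = support(reference, replicates, len(taxa))
print(round(fbp, 3), tbe)  # 0.333 0.5
```

The second replicate counts as a full miss under FBP but as a near hit under TBE, which is why TBE deflates less when a single unstable taxon moves around.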

When to use RAxML-NG vs alternatives

RAxML-NG occupies the middle ground between very fast approximate tree builders and feature-rich ML packages with their own model-selection ecosystems.

Tool	Best use case	Tradeoff
RAxML-NG	Maximum-likelihood inference when a solid ML search and standard bootstrap workflow are the priority	Requires a pre-aligned dataset and is slower than approximate methods
IQ-TREE	Analyses where built-in model selection and ultrafast bootstrap are central	Different search heuristics and support framework, often more feature-focused on model testing
FastTree	Very large alignments where speed matters more than exact ML optimization	Much faster, but approximate

RAxML-NG is a strong default when the alignment is already prepared and the goal is a conventional ML tree with explicit bootstrap support. IQ-TREE is often the better choice when systematic model testing is part of the analysis plan. FastTree is better for exploratory work on very large datasets where approximate topology is good enough.

MAFFT: Builds multiple sequence alignments before phylogenetic inference.
MUSCLE5: Alternative MSA method for protein or nucleotide datasets.
Clustal Omega: Scalable alignment tool that fits naturally before tree building.
IQ-TREE: Another maximum-likelihood phylogeny tool with strong model-selection features.
FastTree: Approximate maximum-likelihood trees for very large alignments.

Based on the RAxML-NG paper by Kozlov et al. (2019) in Bioinformatics and the official RAxML-NG GitHub repository.
