Related tools

Salmon
Quantify transcript abundance from RNA-seq reads with Salmon selective alignment. Upload a transcript FASTA reference plus single-end or paired-end FASTA/FASTQ reads to produce TPM and estimated read-count tables.

Clustal Omega
Perform multiple sequence alignment on protein or nucleotide sequences using the Clustal Omega algorithm.

FastTree
Infer approximately-maximum-likelihood phylogenetic trees from alignments of nucleotide or protein sequences.

IQ-TREE
Build phylogenetic trees using maximum likelihood with automatic model selection (ModelFinder) and ultrafast bootstrap support.

MAFFT
Perform multiple sequence alignment using MAFFT (Multiple Alignment using Fast Fourier Transform). Supports multiple algorithms from fast progressive to highly accurate iterative methods.

MUSCLE5
Perform multiple sequence alignment using MUSCLE5 (MUltiple Sequence Comparison by Log-Expectation). Uses the PPP algorithm for high-quality alignments with support for ensemble generation.

USAlign
USAlign (Universal Structure Alignment) aligns protein, RNA, and DNA structures to compute TM-scores and generate superposed structures. Compare 3D structures to assess structural similarity.

MUMmer4
Rapidly align and compare DNA sequences using MUMmer4 nucmer. Perform pairwise genome comparisons to identify SNPs, indels, and structural variants between reference and query genomes.

MMseqs2
Ultra-fast sequence search and clustering. 10,000x faster than BLAST for database searches, with powerful sequence clustering capabilities for proteins and nucleotides.

RNAcofold
RNAcofold predicts the joint secondary structure of two interacting RNA molecules and optionally reports partition-function and concentration-dependent equilibrium metrics.
What is RNAalifold?
RNAalifold predicts consensus RNA secondary structure from multiple sequence alignments. Unlike single-sequence folding methods that rely solely on thermodynamic stability, RNAalifold incorporates covariation information from evolutionary data. When two alignment columns show compensatory mutations that preserve base pairing across species, this provides strong evidence that the positions are structurally paired in vivo.
The algorithm averages energy contributions across all aligned sequences while adding covariance bonuses for positions where mutations maintain Watson-Crick or wobble pairing. This combination of thermodynamics and phylogenetic signal yields more accurate structure predictions than either approach alone, particularly for well-conserved non-coding RNAs like riboswitches, ribozymes, and regulatory elements.
How does RNAalifold work?
RNAalifold extends the standard dynamic programming algorithms for RNA secondary structure prediction by computing an averaged free energy over all sequences in the alignment, then modifying this energy with a covariation score for each potential base pair.
Covariation scoring
The original implementation used a simple scoring scheme: +1 kcal/mol bonus for consistent mutations (e.g., A-U to G-U), +2 kcal/mol for compensatory mutations (e.g., A-U to G-C), and -2 kcal/mol penalty for inconsistent pairs. This has been replaced by RIBOSUM-like scoring matrices derived from structural alignments of ribosomal RNAs.
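The simple scheme described above can be illustrated with a small sketch. This is not ViennaRNA's code: the function name `simple_covariation_score` is hypothetical, and the way per-pair contributions are averaged over sequences is an assumption made for illustration. A positive score means the two columns support a base pair. The bonus uses the Hamming distance between two valid base pairs, which is 1 for a consistent mutation and 2 for a compensatory one.

```python
from itertools import combinations

# Watson-Crick pairs plus G-U wobble pairs
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"),
             ("C", "G"), ("G", "U"), ("U", "G")}


def simple_covariation_score(col_i, col_j):
    """Score two alignment columns (one character per sequence).

    Assumed aggregation: average the bonus over all sequence pairs and
    subtract 2 times the fraction of sequences that cannot pair.
    """
    if len(col_i) != len(col_j) or len(col_i) == 0:
        raise ValueError("columns must be non-empty and of equal length")

    pairs = list(zip(col_i.upper(), col_j.upper()))
    valid = [p for p in pairs if p in CANONICAL]
    n_invalid = len(pairs) - len(valid)

    bonus = 0
    for a, b in combinations(valid, 2):
        # Hamming distance between two base pairs:
        # 0 = identical, 1 = consistent mutation, 2 = compensatory mutation
        bonus += (a[0] != b[0]) + (a[1] != b[1])

    n_comparisons = max(1, len(pairs) * (len(pairs) - 1) // 2)
    return bonus / n_comparisons - 2.0 * n_invalid / len(pairs)
```

For example, the columns `"AGG"` and `"UUC"` hold the pairs A-U, G-U, and G-C, which contain two consistent mutations and one compensatory mutation, so the score is positive. Columns whose sequences mostly cannot pair receive a negative score.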
RIBOSUM matrices capture the empirical substitution rates between base pair types in evolutionarily related sequences. They assign higher scores to mutations that preserve base pairing and lower scores to changes that disrupt it, weighted by how frequently such changes occur in real data.
Adaptive weighting
The relative contribution of covariation versus thermodynamic energy adapts to alignment characteristics. At high sequence identity there are few mutations, so each one carries little statistical power on its own, and the covariance term is weighted more heavily to compensate. At lower identity, covariation signals are more informative by themselves and the balance shifts accordingly.
Gap handling
Alignment columns containing gaps require special treatment. Earlier versions simply ignored gap-containing columns, which could produce artifacts such as unrealistically short hairpins. The improved algorithm accounts for gaps explicitly and avoids these artifacts.
How to use RNAalifold online
Upload a multiple sequence alignment in FASTA or Clustal format and RNAalifold returns a consensus secondary structure that combines thermodynamic stability with evolutionary covariation evidence. The output includes a consensus sequence in IUPAC codes, the predicted structure in dot-bracket notation, and the minimum free energy.
Input
| Field | Description |
|---|---|
| Aligned RNA Sequences | Multiple sequence alignment in FASTA, Clustal, or plain text format. Sequences must be pre-aligned and of equal length. Minimum 2 sequences required. |
RNAalifold requires sequences that are already aligned. To create alignments from unaligned sequences, use Clustal Omega, MAFFT, or MUSCLE5 first, then pass the output to RNAalifold.
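Before uploading, it can help to confirm that a FASTA alignment meets the requirements above (at least two sequences, all of equal length). The following is a minimal sketch for that check; the function name `parse_aligned_fasta` is hypothetical and the code is not part of RNAalifold.

```python
def parse_aligned_fasta(text):
    """Parse aligned FASTA text into {name: sequence} and validate it.

    Raises ValueError if there are fewer than two sequences, if names
    repeat, or if the sequences differ in length.
    """
    chunks = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            if name in chunks:
                raise ValueError(f"duplicate sequence name: {name!r}")
            chunks[name] = []
        elif name is None:
            raise ValueError("sequence data found before the first header")
        else:
            chunks[name].append(line)

    records = {n: "".join(parts) for n, parts in chunks.items()}
    if len(records) < 2:
        raise ValueError("an alignment needs at least 2 sequences")
    if len({len(seq) for seq in records.values()}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    return records
```

A `ValueError` about unequal lengths usually means the sequences were never aligned, or that gap characters were stripped somewhere along the way.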
Settings
Prediction options
| Setting | Description |
|---|---|
| Temperature | Folding temperature in Celsius (0-100, default 37). Higher temperatures destabilize base pairs. |
| Disallow lonely pairs | Prevents isolated base pairs (helices of length 1). Often improves prediction accuracy for real structures. |
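To make "lonely pair" concrete: a base pair (i, j) is isolated when neither the enclosing pair (i-1, j+1) nor the enclosed pair (i+1, j-1) is present. The sketch below finds such pairs in a dot-bracket string. The function name `find_lonely_pairs` is hypothetical, positions are 0-based, and this is only an illustration of the definition, not how RNAalifold enforces the option.

```python
def find_lonely_pairs(structure):
    """Return sorted (i, j) pairs that form helices of length 1."""
    stack = []
    pairs = set()
    for pos, ch in enumerate(structure):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % pos)
            pairs.add((stack.pop(), pos))
        elif ch != ".":
            raise ValueError("unexpected character %r" % ch)
    if stack:
        raise ValueError("unbalanced '(' in structure")

    # A pair is lonely if it does not stack on a neighbouring pair
    return sorted(
        (i, j) for (i, j) in pairs
        if (i - 1, j + 1) not in pairs and (i + 1, j - 1) not in pairs
    )
```

In `((..)).(...)` the two-pair helix at the start is kept, while the single pair closing the second hairpin is reported as lonely.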
Energy parameters
| Setting | Description |
|---|---|
| Dangling ends | Treatment of unpaired nucleotides adjacent to helices. "Double dangles" (default) includes contributions from both sides; "Ignore dangling ends" omits them entirely; "Allow coaxial stacking" enables stacking across multi-loops. |
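For readers who run ViennaRNA locally, the settings above correspond roughly to RNAalifold command-line options. The sketch below only assembles an argument list; it does not run anything. It assumes the ViennaRNA options `-T` (temperature), `--noLP` (no lonely pairs), and `-d0`/`-d2`/`-d3` (dangle models). The function name, the keyword names, and the mapping from this page's labels to dangle models are illustrative assumptions, not a description of how this online tool invokes RNAalifold.

```python
# Assumed mapping from the setting labels on this page to ViennaRNA dangle models
DANGLE_MODELS = {"double": 2, "ignore": 0, "coaxial": 3}


def build_rnaalifold_command(alignment_path, temperature=37.0,
                             no_lonely_pairs=False, dangles="double"):
    """Return an argv-style list for a local RNAalifold run (hypothetical helper)."""
    if not 0 <= temperature <= 100:
        raise ValueError("temperature must be between 0 and 100 Celsius")
    if dangles not in DANGLE_MODELS:
        raise ValueError("dangles must be one of: %s" % ", ".join(DANGLE_MODELS))

    cmd = ["RNAalifold", "-T", str(temperature), "-d%d" % DANGLE_MODELS[dangles]]
    if no_lonely_pairs:
        cmd.append("--noLP")
    cmd.append(alignment_path)
    return cmd
```

The resulting list could be passed to `subprocess.run` on a machine where ViennaRNA is installed. Building a list rather than a shell string avoids quoting problems with file paths.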
Output
| Column | Description |
|---|---|
| Num Sequences | Number of sequences in the input alignment |
| Alignment Length | Length of the alignment in columns (including gaps) |
| Consensus Sequence | Computed consensus sequence using IUPAC ambiguity codes |
| Consensus Structure | Predicted secondary structure in dot-bracket notation |
| MFE | Minimum free energy of the consensus structure in kcal/mol |
Interpreting results
The consensus structure uses standard dot-bracket notation: dots represent unpaired positions, and each opening parenthesis pairs with its matching closing parenthesis. More negative MFE values indicate more stable structures.
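Dot-bracket strings are easy to convert into explicit base-pair lists with a stack. The following is a minimal sketch for downstream processing of the output; the function name `parse_dot_bracket` is hypothetical and positions are 0-based.

```python
def parse_dot_bracket(structure):
    """Return a sorted list of (i, j) base pairs from a dot-bracket string."""
    stack = []
    pairs = []
    for pos, ch in enumerate(structure):
        if ch == "(":
            # Remember the opening position until its partner is found
            stack.append(pos)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % pos)
            pairs.append((stack.pop(), pos))
        elif ch != ".":
            raise ValueError("unexpected character %r" % ch)
    if stack:
        raise ValueError("unbalanced '(' in structure")
    return sorted(pairs)
```

For the hairpin `(((...))).`, the outermost pair joins positions 0 and 8, and the final position is unpaired.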
The consensus sequence reflects the most common nucleotide at each position. When multiple nucleotides appear with similar frequency, IUPAC ambiguity codes are used (e.g., R for purine, Y for pyrimidine).
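The idea of a consensus with ambiguity codes can be sketched as follows. This is an illustration only: the function name `consensus_sequence` is hypothetical, and "similar frequency" is simplified here to an exact tie for the highest count, which is an assumption and may differ from the threshold the tool uses. Gaps are ignored when counting, and `T` is read as `U`.

```python
from collections import Counter

# IUPAC nucleotide ambiguity codes (RNA alphabet)
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("U"): "U",
    frozenset("AG"): "R", frozenset("CU"): "Y",
    frozenset("CG"): "S", frozenset("AU"): "W",
    frozenset("GU"): "K", frozenset("AC"): "M",
    frozenset("CGU"): "B", frozenset("AGU"): "D",
    frozenset("ACU"): "H", frozenset("ACG"): "V",
    frozenset("ACGU"): "N",
}


def consensus_sequence(sequences):
    """Build a consensus string from equal-length aligned sequences."""
    out = []
    for column in zip(*sequences):
        bases = (ch.upper().replace("T", "U") for ch in column)
        counts = Counter(b for b in bases if b in ("A", "C", "G", "U"))
        if not counts:
            out.append("-")  # column contains only gaps or unknown symbols
            continue
        top = max(counts.values())
        # Assumption: only exact ties for the top count become ambiguity codes
        tied = frozenset(b for b, n in counts.items() if n == top)
        out.append(IUPAC[tied])
    return "".join(out)
```

With the two sequences `GA` and `GG`, the first column is unambiguous and the second is an A/G tie, so the sketch returns `GR`.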
When to use RNAalifold
RNAalifold excels when homologous RNA sequences are available from multiple species. The covariation signal provides independent evidence for base pairing that pure thermodynamic folding cannot capture.
Ideal use cases include:
- Predicting structures of conserved non-coding RNAs (riboswitches, ribozymes, snoRNAs)
- Validating predicted structures against evolutionary evidence
- Identifying functionally constrained regions in RNA alignments
- Comparative analysis of viral RNA genomes
For single sequences without homologs, RNAfold provides standard MFE prediction. For very similar sequences with minimal covariation, the benefit over single-sequence folding is limited.
Limitations
Alignment quality matters: RNAalifold assumes the input alignment is correct. Misaligned sequences will produce spurious covariation signals and incorrect structures. Curated alignments from databases like Rfam typically perform better than automated alignments.
Sequence diversity: Very similar sequences provide little covariation information; very divergent sequences may be difficult to align accurately. Moderate diversity (60-90% identity) often yields the best predictions.
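A quick way to see where an alignment falls relative to that range is to compute its mean pairwise identity. The sketch below is one possible definition, stated as an assumption because identity conventions vary: columns where both sequences have a gap are skipped, and a gap opposite a nucleotide counts as a mismatch. The function name `mean_pairwise_identity` is hypothetical.

```python
from itertools import combinations


def mean_pairwise_identity(sequences):
    """Return the mean pairwise identity (in percent) of aligned sequences."""
    total = 0.0
    n_pairs = 0
    for a, b in combinations(sequences, 2):
        compared = 0
        matches = 0
        for x, y in zip(a.upper(), b.upper()):
            if x == "-" and y == "-":
                continue  # shared gap carries no information for this pair
            compared += 1
            matches += (x == y)
        if compared:
            total += matches / compared
            n_pairs += 1
    if n_pairs == 0:
        raise ValueError("need at least two sequences with comparable columns")
    return 100.0 * total / n_pairs
```

Values near 100% suggest the alignment carries little covariation signal, while much lower values are a prompt to inspect the alignment itself before trusting the predicted structure.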
No sequence weighting: The current implementation treats all sequences equally. If the alignment contains many nearly identical sequences alongside diverse ones, the similar sequences will dominate the averaged energy calculation.
Pseudoknots excluded: Like other ViennaRNA tools, RNAalifold predicts only nested secondary structures. Base pairs whose partners interleave with another pair are not modeled.
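The nesting restriction can be stated precisely: two base pairs (i, j) and (k, l) cross when i < k < j < l. The sketch below checks a list of pairs, for example from a curated reference structure, for such crossings. The function name `has_crossing_pairs` is hypothetical, and the quadratic loop is chosen for clarity rather than speed.

```python
def has_crossing_pairs(pairs):
    """Return True if any two (i, j) base pairs interleave (pseudoknot-like)."""
    # Normalise each pair so that i < j, then sort by opening position
    ordered = sorted((min(i, j), max(i, j)) for i, j in pairs)
    for idx, (i, j) in enumerate(ordered):
        for k, l in ordered[idx + 1:]:
            # After sorting, i <= k holds; a crossing needs i < k < j < l
            if i < k < j < l:
                return True
    return False
```

If a known structure for an RNA family reports crossings here, the pairs involved cannot appear together in an RNAalifold prediction.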