RNAalifold

Predict consensus RNA secondary structures from sequence alignments.


Related tools

Salmon

Quantify transcript abundance from RNA-seq reads with Salmon selective alignment. Upload a transcript FASTA reference plus single-end or paired-end FASTA/FASTQ reads to produce TPM and estimated read-count tables.

Clustal Omega

Perform multiple sequence alignment on protein or nucleotide sequences using the Clustal Omega algorithm.

FastTree

Infer approximately-maximum-likelihood phylogenetic trees from alignments of nucleotide or protein sequences.

IQ-TREE

Build phylogenetic trees using maximum likelihood with automatic model selection (ModelFinder) and ultrafast bootstrap support.

MAFFT

Perform multiple sequence alignment using MAFFT (Multiple Alignment using Fast Fourier Transform). Supports multiple algorithms from fast progressive to highly accurate iterative methods.

MUSCLE5

Perform multiple sequence alignment using MUSCLE5 (MUltiple Sequence Comparison by Log-Expectation). Uses the PPP algorithm for high-quality alignments with support for ensemble generation.

USAlign

USAlign (Universal Structure Alignment) aligns protein, RNA, and DNA structures to compute TM-scores and generate superposed structures. Compare 3D structures to assess structural similarity.

MUMmer4

Rapidly align and compare DNA sequences using MUMmer4 nucmer. Perform pairwise genome comparisons to identify SNPs, indels, and structural variants between reference and query genomes.

MMseqs2

Ultra-fast sequence search and clustering. 10,000x faster than BLAST for database searches, with powerful sequence clustering capabilities for proteins and nucleotides.

RNAcofold

RNAcofold predicts the joint secondary structure of two interacting RNA molecules and optionally reports partition-function and concentration-dependent equilibrium metrics.

What is RNAalifold?

RNAalifold predicts a consensus RNA secondary structure from a multiple sequence alignment. Unlike single-sequence folding methods that rely solely on thermodynamic stability, RNAalifold incorporates covariation information from evolutionary data. When two alignment columns show compensatory mutations that preserve base pairing across species, this provides strong evidence that the two positions are paired in the native structure.

The algorithm averages energy contributions across all aligned sequences while adding covariance bonuses for positions where mutations maintain Watson-Crick or wobble pairing. This combination of thermodynamics and phylogenetic signal yields more accurate structure predictions than either approach alone, particularly for well-conserved non-coding RNAs like riboswitches, ribozymes, and regulatory elements.

How does RNAalifold work?

RNAalifold extends the standard dynamic programming algorithms for RNA secondary structure prediction by computing an averaged free energy over all sequences in the alignment, then modifying this energy with a covariation score for each potential base pair.

Covariation scoring

The original implementation used a simple scoring scheme: +1 kcal/mol bonus for consistent mutations (e.g., A-U to G-U), +2 kcal/mol for compensatory mutations (e.g., A-U to G-C), and -2 kcal/mol penalty for inconsistent pairs. This has been replaced by RIBOSUM-like scoring matrices derived from structural alignments of ribosomal RNAs.

RIBOSUM matrices capture the empirical substitution rates between base pair types in evolutionarily related sequences. They assign higher scores to mutations that preserve base pairing and lower scores to changes that disrupt it, weighted by how frequently such changes occur in real data.
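The original bonus-and-penalty idea can be illustrated with a small toy function. This is only a sketch of the principle, not ViennaRNA's actual formula and not the RIBOSUM variant: the helper name `simple_covariation_score`, the averaging over sequence pairs, and the treatment of gaps as non-pairing are all assumptions made for illustration.

```python
from itertools import combinations

# Watson-Crick pairs plus the G-U wobble pair, in both orientations.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def simple_covariation_score(col_i, col_j):
    """Toy covariation score for two alignment columns (hypothetical helper).

    col_i and col_j are strings holding one character per aligned sequence.
    """
    observed = list(zip(col_i.upper(), col_j.upper()))
    n = len(observed)
    if n == 0:
        return 0.0
    valid = [pair for pair in observed if pair in PAIRS]

    # Bonus: for every two sequences that both form a valid pair, count how many
    # of the two positions differ: 0 = identical, 1 = consistent mutation
    # (A-U to G-U), 2 = compensatory mutation (A-U to G-C).
    bonus = sum((a[0] != b[0]) + (a[1] != b[1]) for a, b in combinations(valid, 2))
    n_comparisons = n * (n - 1) / 2
    bonus = bonus / n_comparisons if n_comparisons else 0.0

    # Penalty: 2 units scaled by the fraction of sequences that cannot pair.
    # Gaps count as non-pairing here, which is a simplification.
    penalty = 2.0 * (n - len(valid)) / n
    return bonus - penalty
```

For example, columns `"AGG"` and `"UCU"` hold the pairs A-U, G-C, and G-U, so every sequence can pair and the mutations between them earn a positive score, while a column pair containing A-A is penalized.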

Adaptive weighting

The relative contribution of covariation versus thermodynamic energy adapts to alignment characteristics. At high sequence identity, the covariance term is weighted more heavily because fewer mutations provide less statistical power. At lower identity levels, covariation signals become more informative and the balance shifts accordingly.

Gap handling

Alignment columns containing gaps require special treatment. The improved algorithm handles gaps more rationally than earlier versions, preventing artifacts like unrealistically short hairpins that could arise when gap-containing columns were simply ignored.

How to use RNAalifold online

Upload a multiple sequence alignment in FASTA or Clustal format and RNAalifold returns a consensus secondary structure that combines thermodynamic stability with evolutionary covariation evidence. The output includes a consensus sequence in IUPAC codes, the predicted structure in dot-bracket notation, and the minimum free energy.

Input

Field | Description
Aligned RNA Sequences | Multiple sequence alignment in FASTA, Clustal, or plain text format. Sequences must be pre-aligned and of equal length. Minimum 2 sequences required.

RNAalifold requires sequences that are already aligned. To create alignments from unaligned sequences, use Clustal Omega, MAFFT, or MUSCLE5 first, then pass the output to RNAalifold.
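Before submitting, it can help to confirm that an alignment meets the input rules listed above (at least two sequences, all of equal length). The following is a minimal sketch for FASTA-formatted alignments only; the function names `read_aligned_fasta` and `check_alignment` are hypothetical and not part of any RNAalifold interface.

```python
def read_aligned_fasta(text):
    """Parse FASTA text into a list of (name, aligned_sequence) tuples."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Start a new record; sequence lines are appended below.
            records.append([line[1:].strip(), ""])
        elif records:
            records[-1][1] += line
    return [(name, seq) for name, seq in records]


def check_alignment(records):
    """Return the alignment length, or raise ValueError if the input is unusable."""
    if len(records) < 2:
        raise ValueError("need at least 2 aligned sequences")
    lengths = {len(seq) for _, seq in records}
    if len(lengths) != 1:
        raise ValueError(f"sequences differ in length: {sorted(lengths)}")
    return lengths.pop()
```

A call such as `check_alignment(read_aligned_fasta(open("my_alignment.fasta").read()))` (with a file name of your own) returns the alignment length when the input looks valid.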

Settings

Prediction options

Setting | Description
Temperature | Folding temperature in Celsius (0-100, default 37). Higher temperatures destabilize base pairs.
Disallow lonely pairs | Prevents isolated base pairs (helices of length 1). Often improves prediction accuracy for real structures.

Energy parameters

Setting | Description
Dangling ends | Treatment of unpaired nucleotides adjacent to helices. "Double dangles" (default) includes contributions from both sides; "Ignore dangling ends" omits them entirely; "Allow coaxial stacking" enables coaxial stacking of helices in multi-loops.
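For readers who also run the ViennaRNA command-line program, the settings above plausibly correspond to the `--temp`, `--noLP`, and `--dangles` options of `RNAalifold`. How this web tool invokes the program internally is not documented here, so the mapping below is an assumption, and the helper `rnaalifold_command` is a hypothetical name. The sketch only builds the argument list and does not run anything.

```python
def rnaalifold_command(alignment_path, temperature=37.0, no_lonely_pairs=False, dangles=2):
    """Build an RNAalifold argument list from the web-form settings (assumed mapping).

    dangles: 2 = double dangles (default), 0 = ignore dangling ends,
             3 = allow coaxial stacking in multi-loops.
    """
    if not 0 <= temperature <= 100:
        raise ValueError("temperature must be between 0 and 100 degrees Celsius")
    if dangles not in (0, 2, 3):
        raise ValueError("dangles must be 0, 2, or 3 for the options listed above")

    cmd = ["RNAalifold", f"--temp={temperature}", f"--dangles={dangles}"]
    if no_lonely_pairs:
        cmd.append("--noLP")
    cmd.append(alignment_path)
    return cmd
```

The returned list can be passed to `subprocess.run` on a machine where ViennaRNA is installed.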

Output

Column | Description
Num Sequences | Number of sequences in the input alignment
Alignment Length | Length of the alignment in nucleotides (including gaps)
Consensus Sequence | Computed consensus sequence using IUPAC ambiguity codes
Consensus Structure | Predicted secondary structure in dot-bracket notation
MFE | Minimum free energy of the consensus structure in kcal/mol

Interpreting results

The consensus structure uses standard dot-bracket notation: dots represent unpaired nucleotides, and each opening parenthesis pairs with its matching closing parenthesis to indicate a base pair. More negative MFE values indicate more stable structures.
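To work with the consensus structure programmatically, the dot-bracket string can be converted into a list of paired positions with a simple stack. This is a minimal sketch; the function name `dot_bracket_to_pairs` is hypothetical, and positions are reported 1-based as alignment columns.

```python
def dot_bracket_to_pairs(structure):
    """Return sorted (i, j) base pairs, 1-based, from a dot-bracket string."""
    stack = []
    pairs = []
    for pos, char in enumerate(structure, start=1):
        if char == "(":
            stack.append(pos)  # remember the opening position
        elif char == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {pos}")
            pairs.append((stack.pop(), pos))  # closes the most recent '('
        elif char != ".":
            raise ValueError(f"unexpected character {char!r} at position {pos}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)
```

For the structure `((..)).`, this yields the pairs (1, 6) and (2, 5), with columns 3, 4, and 7 unpaired.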

The consensus sequence reflects the most common nucleotide at each position. When multiple nucleotides appear with similar frequency, IUPAC ambiguity codes are used (e.g., R for purine, Y for pyrimidine).
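The idea of an IUPAC consensus can be illustrated with a short sketch. The exact rule RNAalifold uses to decide when frequencies are "similar" is not stated here, so the `min_fraction` threshold below is a made-up parameter, and the function names are hypothetical.

```python
from collections import Counter

# IUPAC ambiguity codes for RNA, keyed by the set of nucleotides they cover.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C", frozenset("G"): "G", frozenset("U"): "U",
    frozenset("AG"): "R", frozenset("CU"): "Y", frozenset("CG"): "S", frozenset("AU"): "W",
    frozenset("GU"): "K", frozenset("AC"): "M", frozenset("CGU"): "B", frozenset("AGU"): "D",
    frozenset("ACU"): "H", frozenset("ACG"): "V", frozenset("ACGU"): "N",
}


def consensus_column(column, min_fraction=0.8):
    """Consensus symbol for one alignment column (illustrative rule only)."""
    counts = Counter(c for c in column.upper().replace("T", "U") if c in "ACGU")
    if not counts:
        return "-"  # column holds only gaps or unknown symbols
    top = max(counts.values())
    # Keep every nucleotide whose count is close to the most common one.
    kept = frozenset(base for base, n in counts.items() if n >= min_fraction * top)
    return IUPAC[kept]


def consensus_sequence(sequences, min_fraction=0.8):
    """Consensus of equal-length aligned sequences, column by column."""
    return "".join(consensus_column("".join(col), min_fraction) for col in zip(*sequences))
```

With the four aligned sequences `GAC`, `GGC`, `GAU`, and `GGU`, the second column is split evenly between A and G and the third between C and U, so the sketch reports `GRY`.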

When to use RNAalifold

RNAalifold excels when homologous RNA sequences are available from multiple species. The covariation signal provides independent evidence for base pairing that pure thermodynamic folding cannot capture.

Ideal use cases include:

  • Predicting structures of conserved non-coding RNAs (riboswitches, ribozymes, snoRNAs)
  • Validating predicted structures against evolutionary evidence
  • Identifying functionally constrained regions in RNA alignments
  • Comparative analysis of viral RNA genomes

For single sequences without homologs, RNAfold provides standard MFE prediction. For very similar sequences with minimal covariation, the benefit over single-sequence folding is limited.

Limitations

Alignment quality matters: RNAalifold assumes the input alignment is correct. Misaligned sequences will produce spurious covariation signals and incorrect structures. Curated alignments from databases like Rfam typically perform better than automated alignments.

Sequence diversity: Very similar sequences provide little covariation information; very divergent sequences may be difficult to align accurately. Moderate diversity (60-90% identity) often yields the best predictions.
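A quick way to see where an alignment falls relative to that range is to compute its mean pairwise identity. The sketch below uses one of several possible conventions (columns where both sequences have a gap are skipped, and a gap opposite a nucleotide counts as a mismatch); the function names are hypothetical.

```python
from itertools import combinations


def pairwise_identity(a, b):
    """Fraction of identical columns between two aligned sequences."""
    matches = compared = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" and y == "-":
            continue  # shared gap carries no information
        compared += 1
        matches += x == y
    return matches / compared if compared else 0.0


def mean_pairwise_identity(sequences):
    """Average identity over all sequence pairs in the alignment."""
    pairs = list(combinations(sequences, 2))
    if not pairs:
        raise ValueError("need at least 2 sequences")
    return sum(pairwise_identity(a, b) for a, b in pairs) / len(pairs)
```

A result near 1.0 suggests the sequences carry little covariation signal, while a very low value is a hint that the alignment itself may be unreliable.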

No sequence weighting: The current implementation treats all sequences equally. If the alignment contains many nearly identical sequences alongside diverse ones, the similar sequences will dominate the averaged energy calculation.

Pseudoknots excluded: Like other ViennaRNA tools, RNAalifold predicts only nested secondary structures. Base pairs whose partners interleave with another pair are not modeled.
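To make the nesting restriction concrete, the sketch below checks a list of base pairs for crossings, that is, pairs (i, j) and (k, l) with i < k < j < l. Such a list could come from an experimentally known structure you want to compare against; the function name `crossing_pairs` is hypothetical, and each pair is assumed to be given as (i, j) with i < j.

```python
def crossing_pairs(pairs):
    """Return every combination of two base pairs that interleave (pseudoknot-like)."""
    ordered = sorted(pairs)
    crossings = []
    for idx, (i, j) in enumerate(ordered):
        for k, l in ordered[idx + 1:]:
            # (k, l) opens inside (i, j) but closes outside it.
            if i < k < j < l:
                crossings.append(((i, j), (k, l)))
    return crossings
```

Nested pairs such as (1, 6) and (2, 5) produce no crossings, whereas (1, 5) and (3, 8) interleave and therefore cannot both appear in an RNAalifold prediction.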