What is RNAalifold?
RNAalifold predicts consensus RNA secondary structure from multiple sequence alignments. Unlike single-sequence folding methods that rely solely on thermodynamic stability, RNAalifold incorporates covariation information from evolutionary data. When two alignment columns show compensatory mutations that preserve base pairing across species, this provides strong evidence that the positions are structurally paired in vivo.
The algorithm averages energy contributions across all aligned sequences while adding covariance bonuses for positions where mutations maintain Watson-Crick or wobble pairing. This combination of thermodynamics and phylogenetic signal yields more accurate structure predictions than either approach alone, particularly for well-conserved non-coding RNAs like riboswitches, ribozymes, and regulatory elements.
How does RNAalifold work?
RNAalifold extends the standard dynamic programming algorithms for RNA secondary structure prediction by computing an averaged free energy over all sequences in the alignment, then modifying this energy with a covariation score for each potential base pair.
Covariation scoring
The original implementation used a simple scoring scheme: +1 kcal/mol bonus for consistent mutations (e.g., A-U to G-U), +2 kcal/mol for compensatory mutations (e.g., A-U to G-C), and -2 kcal/mol penalty for inconsistent pairs. This has been replaced by RIBOSUM-like scoring matrices derived from structural alignments of ribosomal RNAs.
RIBOSUM matrices capture the empirical substitution rates between base pair types in evolutionarily related sequences. They assign higher scores to mutations that preserve base pairing and lower scores to changes that disrupt it, weighted by how frequently such changes occur in real data.
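As a rough illustration of the simple scoring idea described above, the sketch below scores one candidate pair of alignment columns. It is not ViennaRNA's implementation: the function name `column_pair_score`, the aggregation over all pairs of sequences, and the choice to count gapped rows as inconsistent are assumptions made for this example.

```python
from itertools import combinations

# Watson-Crick and G-U wobble pairs, written as (5' base, 3' base).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def column_pair_score(col_i, col_j):
    """Score two alignment columns as a candidate base pair (illustrative only).

    col_i and col_j hold one character per sequence. Every pair of sequences
    that can both form a pair contributes the number of positions at which
    their pairs differ: 1 for a consistent mutation, 2 for a compensatory one.
    Every sequence that cannot pair (including gapped rows, an assumption of
    this sketch) costs 2.
    """
    observed = list(zip(col_i.upper(), col_j.upper()))
    pairing = [p for p in observed if p in PAIRS]
    score = 0
    for a, b in combinations(pairing, 2):
        score += (a[0] != b[0]) + (a[1] != b[1])
    score -= 2 * (len(observed) - len(pairing))
    return score


if __name__ == "__main__":
    print(column_pair_score("AG", "UC"))  # A-U vs G-C: compensatory
    print(column_pair_score("AG", "UU"))  # A-U vs G-U: consistent
    print(column_pair_score("AA", "UA"))  # second sequence cannot pair
```

A positive score means the columns covary in a way that supports pairing; a negative score means more sequences contradict the pair than support it.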
Adaptive weighting
The relative contribution of covariation versus thermodynamic energy adapts to alignment characteristics. At high sequence identity, the covariance term is weighted more heavily because fewer mutations provide less statistical power. At lower identity levels, covariation signals become more informative and the balance shifts accordingly.
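To judge where a given alignment falls on this scale, mean pairwise identity is a convenient summary. The sketch below is one way to compute it; the function names and the convention of skipping columns where either sequence has a gap are assumptions, and other identity definitions give slightly different numbers.

```python
from itertools import combinations

GAP_CHARS = "-."


def pairwise_identity(a, b):
    """Fraction of identical residues over columns where neither row is gapped."""
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in GAP_CHARS or y in GAP_CHARS:
            continue
        compared += 1
        matches += x == y
    return matches / compared if compared else 0.0


def mean_pairwise_identity(seqs):
    """Average pairwise_identity over every pair of aligned sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(pairwise_identity(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    alignment = ["GGAUCC", "GGCUCC", "GGAU-C"]
    print(f"{mean_pairwise_identity(alignment):.2f}")
```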
Gap handling
Alignment columns containing gaps require special treatment. The improved algorithm handles gaps more rationally than earlier versions, preventing artifacts like unrealistically short hairpins that could arise when gap-containing columns were simply ignored.
How to use RNAalifold online
ProteinIQ provides browser-based access to RNAalifold, running computations on cloud infrastructure without local installation.
Input
| Field | Description |
|---|---|
| Aligned RNA Sequences | Multiple sequence alignment in FASTA, Clustal, or plain text format. Sequences must be pre-aligned and of equal length. Minimum 2 sequences required. |
RNAalifold requires sequences that are already aligned. To create alignments from unaligned sequences, use Clustal Omega, MAFFT, or MUSCLE5 first, then pass the output to RNAalifold.
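Before submitting, it can help to confirm that an alignment meets these requirements. The following is a minimal sketch for aligned FASTA input only; `parse_aligned_fasta` and `check_alignment` are hypothetical helper names, not part of ProteinIQ or ViennaRNA.

```python
def parse_aligned_fasta(text):
    """Return {name: aligned_sequence} from aligned FASTA text."""
    records = {}
    name = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            if name in records:
                raise ValueError(f"duplicate sequence name: {name!r}")
            records[name] = ""
        elif name is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            # Sequences may be wrapped over several lines; join them.
            records[name] += line
    return records


def check_alignment(records):
    """Check the two stated requirements and return the alignment length."""
    if len(records) < 2:
        raise ValueError("at least 2 aligned sequences are required")
    lengths = {len(seq) for seq in records.values()}
    if len(lengths) != 1:
        raise ValueError(f"sequences differ in length: {sorted(lengths)}")
    return lengths.pop()


if __name__ == "__main__":
    example = ">seq1\nGGGA-AACCC\n>seq2\nGGCAUAAGCC\n"
    print(check_alignment(parse_aligned_fasta(example)))
```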
Settings
Prediction options
| Setting | Description |
|---|---|
| Temperature | Folding temperature in Celsius (0-100, default 37). Higher temperatures destabilize base pairs. |
| Disallow lonely pairs | Prevents isolated base pairs (helices of length 1). Often improves prediction accuracy for real structures. |
Energy parameters
| Setting | Description |
|---|---|
| Dangling ends | Treatment of unpaired nucleotides adjacent to helices. "Double dangles" (default) includes contributions from both sides; "Ignore dangling ends" omits them entirely; "Allow coaxial stacking" enables coaxial stacking of adjacent helices in multi-loops. |
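
For readers who also run ViennaRNA locally, the settings above correspond roughly to the `RNAalifold` command-line options `-T`, `--noLP`, and `-d`. The sketch below builds such a command. It assumes the `RNAalifold` executable is on the `PATH`; the helper names, the mode labels in `DANGLE_MODES`, and the defaults are illustrative, and how ProteinIQ invokes the program internally is not documented here.

```python
import subprocess

# Assumed mapping from the settings described above to ViennaRNA dangle models.
DANGLE_MODES = {"ignore": 0, "double": 2, "coaxial": 3}


def build_rnaalifold_command(alignment_path, temperature=37.0,
                             no_lonely_pairs=False, dangles="double"):
    """Return the argument list for a local RNAalifold run."""
    if not 0 <= temperature <= 100:
        raise ValueError("temperature must be between 0 and 100 Celsius")
    cmd = ["RNAalifold", "-T", str(temperature), f"-d{DANGLE_MODES[dangles]}"]
    if no_lonely_pairs:
        cmd.append("--noLP")
    cmd.append(alignment_path)
    return cmd


def run_rnaalifold(alignment_path, **options):
    """Run RNAalifold and return its standard output (requires ViennaRNA)."""
    result = subprocess.run(
        build_rnaalifold_command(alignment_path, **options),
        capture_output=True, text=True, check=True,
    )
    return result.stdout


if __name__ == "__main__":
    print(" ".join(build_rnaalifold_command("alignment.aln", no_lonely_pairs=True)))
```

Keeping command construction separate from execution makes the option mapping easy to inspect without ViennaRNA installed.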
Output
| Column | Description |
|---|---|
| Num Sequences | Number of sequences in the input alignment |
| Alignment Length | Length of the alignment in nucleotides (including gaps) |
| Consensus Sequence | Computed consensus sequence using IUPAC ambiguity codes |
| Consensus Structure | Predicted secondary structure in dot-bracket notation |
| MFE | Minimum free energy of the consensus structure in kcal/mol |
Interpreting results
The consensus structure uses standard dot-bracket notation: dots represent unpaired nucleotides, and each matching pair of parentheses marks one base pair. More negative MFE values indicate more stable structures.
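Dot-bracket strings are easy to convert into an explicit list of paired positions with a stack. The sketch below shows one way to do it; the function name `dot_bracket_to_pairs` is illustrative, positions are 1-based alignment columns, and only `(`, `)`, and `.` are handled.

```python
def dot_bracket_to_pairs(structure):
    """Return sorted (i, j) base pairs, 1-based, from a dot-bracket string."""
    stack = []
    pairs = []
    for pos, symbol in enumerate(structure, start=1):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {pos}")
            # The most recent unmatched '(' is the partner of this ')'.
            pairs.append((stack.pop(), pos))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {pos}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


if __name__ == "__main__":
    for i, j in dot_bracket_to_pairs("((..))"):
        print(i, j)
```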
The consensus sequence reflects the most common nucleotide at each position. When multiple nucleotides appear with similar frequency, IUPAC ambiguity codes are used (e.g., R for purine, Y for pyrimidine).
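The idea behind an ambiguity-coded consensus can be sketched as follows. This is a simplified illustration, not the rule RNAalifold applies: here an ambiguity code is emitted only when nucleotides are exactly tied for most common, gaps are ignored, and the function name `consensus` is an assumption.

```python
from collections import Counter

# Standard IUPAC nucleotide codes, keyed by the set of bases they represent.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("U"): "U",
    frozenset("AG"): "R", frozenset("CU"): "Y",
    frozenset("CG"): "S", frozenset("AU"): "W",
    frozenset("GU"): "K", frozenset("AC"): "M",
    frozenset("CGU"): "B", frozenset("AGU"): "D",
    frozenset("ACU"): "H", frozenset("ACG"): "V",
    frozenset("ACGU"): "N",
}


def consensus(seqs):
    """Most common base per column; exact ties become an IUPAC code."""
    out = []
    for column in zip(*seqs):
        bases = (c.upper().replace("T", "U") for c in column)
        counts = Counter(b for b in bases if b in "ACGU")
        if not counts:
            out.append("-")  # column contains only gaps or unknown symbols
            continue
        top = max(counts.values())
        tied = frozenset(b for b, n in counts.items() if n == top)
        out.append(IUPAC[tied])
    return "".join(out)


if __name__ == "__main__":
    print(consensus(["GAC", "GGC", "GAU", "GGU"]))
```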
When to use RNAalifold
RNAalifold excels when homologous RNA sequences are available from multiple species. The covariation signal provides independent evidence for base pairing that pure thermodynamic folding cannot capture.
Ideal use cases include:
- Predicting structures of conserved non-coding RNAs (riboswitches, ribozymes, snoRNAs)
- Validating predicted structures against evolutionary evidence
- Identifying functionally constrained regions in RNA alignments
- Comparative analysis of viral RNA genomes
For single sequences without homologs, RNAfold provides standard MFE prediction. For very similar sequences with minimal covariation, the benefit over single-sequence folding is limited.
Limitations
Alignment quality matters: RNAalifold assumes the input alignment is correct. Misaligned sequences will produce spurious covariation signals and incorrect structures. Curated alignments from databases like Rfam typically perform better than automated alignments.
Sequence diversity: Very similar sequences provide little covariation information; very divergent sequences may be difficult to align accurately. Moderate diversity (60-90% identity) often yields the best predictions.
No sequence weighting: The current implementation treats all sequences equally. If the alignment contains many nearly identical sequences alongside diverse ones, the similar sequences will dominate the averaged energy calculation.
Pseudoknots excluded: Like other ViennaRNA tools, RNAalifold predicts only nested secondary structures. Base pairs whose partners interleave with another pair are not modeled.
Related tools
- RNAfold: Single-sequence MFE structure prediction
- Clustal Omega: Create multiple sequence alignments for input to RNAalifold
- MAFFT: Alternative alignment tool with multiple accuracy/speed modes
- MUSCLE5: High-accuracy alignment using the PPP algorithm
- ViennaRNA: Unified interface to all ViennaRNA analysis methods
