ProteinIQ

RMSD calculator

Calculate RMSD between protein structures with automatic alignment

What is RMSD?

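Root Mean Square Deviation (RMSD) is the most widely used metric for quantifying structural similarity between proteins. It measures the root-mean-square distance between corresponding atoms in two superimposed structures, expressed in Ångströms (Å). Because distances are squared before averaging, large deviations count more heavily than they would in a plain average.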

An RMSD of 0 Å indicates identical structures. Values below 2 Å typically suggest high similarity, while values above 4 Å indicate significant structural differences. RMSD is essential for validating predicted structures against experimental data, tracking conformational changes in molecular dynamics, and assessing the diversity of structural models.

For more comprehensive structural comparison that includes TM-score and sequence identity, use USAlign. To search for structurally similar proteins in databases, try FoldSeek.

How does RMSD calculation work?

The RMSD formula

RMSD is the square root of the mean squared distance between $n$ pairs of equivalent atoms:

$$\text{RMSD} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} d_i^2}$$

where $d_i$ is the Euclidean distance between the $i$-th pair of corresponding atoms after superposition.
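As a minimal sketch of the formula, the Python function below computes RMSD for two coordinate lists that are assumed to be already superimposed and listed in matching order. The function name `rmsd` and the sample coordinates are illustrative only and are not part of this tool.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) tuples, already superimposed."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared Euclidean distance d_i^2 for one atom pair.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
b = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
print(rmsd(a, b))  # every pair is exactly 2 Å apart, so this prints 2.0
```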

Kabsch alignment

Before calculating RMSD, structures must be optimally superimposed. This tool uses the Kabsch algorithm to find the rotation matrix that minimizes RMSD between the two structures.

The algorithm works in three steps. First, both structures are centered on their centroids. Second, the optimal rotation matrix is computed using singular value decomposition (SVD) of the covariance matrix. Third, one structure is rotated to best match the other before computing the final RMSD.

Disable automatic alignment only if your structures are already pre-aligned in the same coordinate frame.
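The three steps above can be sketched with NumPy as follows. This is an illustrative version of the Kabsch algorithm, not this tool's actual implementation; the function name `kabsch_rmsd` and the test coordinates are assumptions. The sketch also includes the standard determinant check that prevents the SVD from returning a reflection, a detail the description above does not cover.

```python
import numpy as np

def kabsch_rmsd(P, Q):
    """Superimpose P onto Q (both N x 3, rows paired) and return the minimized RMSD."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    # Step 1: center both coordinate sets on their centroids.
    P0 = P - P.mean(axis=0)
    Q0 = Q - Q.mean(axis=0)
    # Step 2: SVD of the 3 x 3 covariance matrix yields the optimal rotation.
    H = P0.T @ Q0
    U, _, Vt = np.linalg.svd(H)
    # Flip one axis if needed so R is a proper rotation rather than a reflection.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    # Step 3: rotate P onto Q and compute the RMSD.
    P_rot = P0 @ R.T
    return float(np.sqrt(((P_rot - Q0) ** 2).sum() / len(P)))

# Q is P rotated 90 degrees about the z axis and then translated.
P = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]])
Rz = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
Q = P @ Rz.T + np.array([5.0, -3.0, 2.0])
print(kabsch_rmsd(P, Q))  # close to 0, up to floating-point error
```

Without alignment, the same two coordinate sets would give a large RMSD purely because of the rotation and translation, which is why alignment should stay enabled unless the inputs already share a coordinate frame.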

Atom selection

The choice of which atoms to compare significantly affects the result.

Alpha carbons (CA) — The standard choice for backbone comparison. Using only Cα atoms (one per residue) provides a robust measure of overall fold similarity while being insensitive to side chain conformations.

Backbone atoms (N, CA, C, O) — Includes all main chain atoms. This gives a more detailed backbone comparison but increases sensitivity to local conformational differences.

All atoms — Compares every atom in the structure. This is most sensitive to differences but can be dominated by flexible side chains and may not reflect overall structural similarity.
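To make the three selections concrete, here is a small sketch that filters ATOM records from PDB-format text by atom name, using the fixed column positions of the PDB format. The names `SELECTIONS` and `select_atoms` and the single-residue sample are hypothetical; how this tool parses files internally is not documented here.

```python
SELECTIONS = {
    "ca": {"CA"},
    "backbone": {"N", "CA", "C", "O"},
    "all": None,  # None means keep every atom
}

def select_atoms(pdb_text, mode="ca"):
    """Return {(chain, residue number, atom name): (x, y, z)} for the chosen selection."""
    wanted = SELECTIONS[mode]
    atoms = {}
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue  # skip HETATM, TER, and other record types
        name = line[12:16].strip()  # atom name, columns 13-16
        if wanted is not None and name not in wanted:
            continue
        key = (line[21], int(line[22:26]), name)  # chain ID, residue number
        atoms[key] = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
    return atoms

# One alanine residue with made-up coordinates.
SAMPLE = """\
ATOM      1  N   ALA A   1       0.000   0.000   0.000
ATOM      2  CA  ALA A   1       1.458   0.000   0.000
ATOM      3  C   ALA A   1       2.009   1.420   0.000
ATOM      4  O   ALA A   1       1.251   2.390   0.000
ATOM      5  CB  ALA A   1       1.988  -0.773  -1.199
"""

for mode in ("ca", "backbone", "all"):
    print(mode, len(select_atoms(SAMPLE, mode)))  # 1, 4, and 5 atoms respectively
```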

Inputs & settings

Structures

Reference structure — The structure you want to compare against. Upload a PDB file or fetch one from RCSB using its 4-character PDB ID (for example, 5VF9).

Comparison structures — One or more structures to compare with the reference. You can upload multiple PDB files or enter multiple PDB IDs separated by commas.

Comparison mode

Structure — Outputs one RMSD value per comparison file. Atoms are matched across the entire structure by residue number.

Chain — Compares each chain in the reference against each chain in the comparison structure. This produces a matrix of all pairwise chain comparisons, useful for identifying which chains correspond between structures.
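The two modes can be sketched as follows. The example pairs atoms by residue number and atom name, then repeats that pairing for every combination of chains. The function names, the toy chain dictionaries, and the choice to count skipped atoms as those found in only one of the two structures are all assumptions made for illustration; the tool's exact bookkeeping may differ.

```python
from itertools import product

def match_atoms(ref, other):
    """Pair atoms by (residue number, atom name); count atoms found in only one structure."""
    shared = sorted(ref.keys() & other.keys())
    skipped = len(ref.keys() ^ other.keys())
    pairs = [(ref[key], other[key]) for key in shared]
    return pairs, skipped

def chain_matrix(ref_chains, other_chains):
    """Chain mode: compare every reference chain with every comparison chain."""
    table = {}
    for (rc, r_atoms), (oc, o_atoms) in product(ref_chains.items(), other_chains.items()):
        pairs, skipped = match_atoms(r_atoms, o_atoms)
        table[(rc, oc)] = (len(pairs), skipped)  # (matched pairs, skipped atoms)
    return table

# Toy CA-only chains keyed by (residue number, atom name); coordinates are placeholders.
ref_chains = {
    "A": {(1, "CA"): (0.0, 0.0, 0.0), (2, "CA"): (3.8, 0.0, 0.0), (3, "CA"): (7.6, 0.0, 0.0)},
    "B": {(1, "CA"): (0.0, 5.0, 0.0), (2, "CA"): (3.8, 5.0, 0.0)},
}
other_chains = {
    "A": {(2, "CA"): (3.9, 0.1, 0.0), (3, "CA"): (7.5, 0.2, 0.0), (4, "CA"): (11.4, 0.0, 0.0)},
}
print(chain_matrix(ref_chains, other_chains))
# {('A', 'A'): (2, 2), ('B', 'A'): (1, 3)}
```

In this toy case, reference chain A shares two residues with comparison chain A, while reference chain B shares only one, which is the kind of pattern that helps identify corresponding chains.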

Understanding the results

| Column | Description |
| --- | --- |
| Reference | The reference structure (or chain in chain mode), e.g., 5VF9_A |
| Comparison | The comparison structure/chain being evaluated |
| RMSD (Å) | Root mean square deviation in Ångströms |
| Atoms | Number of atom pairs used in the calculation |
| Skipped | Number of atoms present in one structure but not the other |

Interpreting RMSD values

The significance of an RMSD value depends on protein size and the atoms being compared.

For Cα RMSD between similar-sized proteins, these ranges provide rough guidance. Values below 1 Å indicate nearly identical structures. Values between 1 and 2 Å suggest high similarity with minor differences. Values between 2 and 4 Å show moderate similarity with significant local variations. Values above 4 Å typically indicate different conformations or folds.
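If you post-process results in a script, the rough ranges above could be encoded like this. The function name is hypothetical, and assigning the boundary values (exactly 1, 2, or 4 Å) to a particular band is an arbitrary choice, since the guidance is approximate.

```python
def describe_ca_rmsd(value):
    """Map a Cα RMSD in Å to the rough guidance bands; boundary handling is arbitrary."""
    if value < 0:
        raise ValueError("RMSD cannot be negative")
    if value < 1.0:
        return "nearly identical"
    if value < 2.0:
        return "high similarity, minor differences"
    if value <= 4.0:
        return "moderate similarity, significant local variations"
    return "different conformations or folds"

for value in (0.4, 1.5, 3.0, 6.2):
    print(value, describe_ca_rmsd(value))
```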

A high "Skipped" count suggests the structures have different residue numbering or missing regions, which can affect the reliability of the comparison.

Limitations

RMSD has known limitations as a similarity metric.

It is sensitive to outliers. A single misaligned loop or flexible terminus can dramatically increase the global RMSD even when the core structures are nearly identical.
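A quick calculation with made-up numbers shows the effect. Suppose 95 core atom pairs each deviate by 0.5 Å, and 5 atoms in a displaced loop each deviate by 15 Å:

```python
import math

def rmsd_from_distances(distances):
    """RMSD computed directly from per-pair distances d_i."""
    return math.sqrt(sum(d * d for d in distances) / len(distances))

core = [0.5] * 95   # well-aligned core
loop = [15.0] * 5   # hypothetical displaced loop

print(round(rmsd_from_distances(core), 2))         # 0.5
print(round(rmsd_from_distances(core + loop), 2))  # 3.39
```

Only 5% of the atoms differ substantially, yet the global RMSD rises from 0.5 Å to about 3.4 Å because each distance is squared before averaging.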

RMSD is size-dependent. The same value generally indicates a closer match in a larger protein: an RMSD of 3 Å across 500 residues usually reflects greater overall similarity than 3 Å across 50 residues. For size-independent comparison, TM-score (available in USAlign) is preferred.
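One way to see the size dependence is through the length-dependent distance scale that TM-score uses to normalize deviations. The formula below comes from the published TM-score definition, not from this page, and the function name `tm_d0` is illustrative.

```python
def tm_d0(length):
    """TM-score distance scale d0 in Å for a protein of the given residue count (length > 15)."""
    return 1.24 * (length - 15) ** (1.0 / 3.0) - 1.8

print(round(tm_d0(50), 2))   # 2.26
print(round(tm_d0(500), 2))  # 7.94
```

A 3 Å deviation exceeds this scale for a 50-residue protein but falls well below it for a 500-residue protein, which is why the same RMSD carries different meaning at different sizes.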

Structures must have corresponding atoms. RMSD is defined only over one-to-one atom pairs, and this tool pairs atoms by residue number. Insertions, deletions, or different numbering schemes will therefore result in skipped atoms.