
Validate protein structure quality with all-atom contact analysis and geometry checks.
MolProbity is a comprehensive structure validation tool that assesses protein and nucleic acid quality through all-atom geometry analysis. Rather than just checking if atoms fit the electron density map, MolProbity examines whether atoms are physically positioned correctly relative to each other—detecting steric clashes, backbone geometry problems, and sidechain conformational errors that even high-resolution crystallography can miss.
Structure validation is essential before using a protein model for downstream analysis or design work. A structure with poor geometry may look reasonable in electron density but contain atomic overlaps or distortions that lead to incorrect functional predictions. MolProbity combines multiple quality metrics into a single, interpretable framework that catches both obvious errors and subtle geometric inconsistencies.
MolProbity produces detailed validation reports with flagged residues, quality scores at multiple resolutions, and automated feedback about structure reliability. This makes it invaluable during structure refinement and for establishing confidence in a model, whether it was predicted by AlphaFold2 or Boltz-2 or determined experimentally.
MolProbity performs several independent geometric analyses that together provide a comprehensive view of structure quality. Each metric targets different types of structural problems.
The clashscore measures severe steric overlaps, where the van der Waals surfaces of two nonbonded atoms overlap by 0.4 Å or more and penetrate well into forbidden space. These aren't minor geometric strains but actual atomic collisions that signal local fitting errors or mistakes in atomic coordinate assignment.
MolProbity counts all such clashes and normalizes the count per thousand atoms, making scores comparable across different protein sizes. A clashscore of 0 is excellent; clashscores above 10 indicate problematic regions. Well-built structures typically average well below 5 clashes per thousand atoms.
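The per-thousand-atoms normalization can be sketched in a few lines. This is only an illustration of the arithmetic; the function name and arguments are hypothetical, and the clash count itself is assumed to come from a separate all-atom contact analysis.

```python
def clashscore(n_clashes: int, n_atoms: int) -> float:
    """Serious clashes per 1000 atoms (illustrative helper, not MolProbity code)."""
    if n_atoms <= 0:
        raise ValueError("n_atoms must be positive")
    return 1000.0 * n_clashes / n_atoms


# Example: 12 serious clashes in a 3000-atom model
score = clashscore(12, 3000)
print(f"clashscore = {score:.2f}")
```

Because the count is divided by the number of atoms, a small peptide and a large complex can be compared on the same scale.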
The backbone conformation of each residue is described by three dihedral angles: phi (φ), psi (ψ), and omega (ω). Omega is nearly always close to 180° (trans) because of the partial double-bond character of the peptide bond. Phi and psi rotate more freely, but only certain combinations avoid atomic clashes.
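For readers who want to see how these angles are obtained from coordinates, here is a minimal sketch of a generic four-point dihedral calculation using only the standard library. The function names are illustrative; this is not MolProbity's implementation. Under the usual convention, φ is the dihedral over C(i-1), N(i), Cα(i), C(i); ψ over N(i), Cα(i), C(i), N(i+1); and ω over Cα(i), C(i), N(i+1), Cα(i+1).

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for four points (x, y, z)."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Project b0 and b2 onto the plane perpendicular to the central bond
    d0 = _dot(b0, b1)
    d2 = _dot(b2, b1)
    v = (b0[0] - d0 * b1[0], b0[1] - d0 * b1[1], b0[2] - d0 * b1[2])
    w = (b2[0] - d2 * b1[0], b2[1] - d2 * b1[1], b2[2] - d2 * b1[2])
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


# phi for residue i would be dihedral(C_prev, N, CA, C), given those coordinates
```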
MolProbity evaluates each residue's φ/ψ angles against a modern reference distribution derived from nearly a million high-quality reference residues. Each residue is classified as favored, allowed, or outlier.
The tool applies residue-specific distributions because glycine (no sidechain) has much broader allowed regions, while proline (cyclic sidechain) is severely restricted to φ ≈ -60°.
Sidechains adopt discrete rotameric conformations where chi (χ) dihedral angles cluster around staggered orientations (±60°, 180°). These rotamers are energetically favorable because they minimize clash between the sidechain and backbone.
MolProbity classifies each sidechain rotamer as favored, allowed, or outlier.
Non-glycine, non-alanine residues are evaluated; glycine has no sidechain and alanine's sidechain is too small to adopt rotamers.
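As a rough illustration of the staggered clustering described above, the sketch below assigns a single χ angle to its nearest staggered well. The letters p (+60°), t (180°), and m (-60°) follow the common rotamer naming shorthand. This is a deliberate simplification: real rotamer evaluation scores all χ angles of a sidechain jointly against reference distributions, which this helper does not attempt.

```python
STAGGERED = {"p": 60.0, "t": 180.0, "m": -60.0}


def angular_difference(a: float, b: float) -> float:
    """Smallest absolute difference between two angles in degrees (0 to 180)."""
    return abs((a - b + 180.0) % 360.0 - 180.0)


def nearest_staggered(chi: float) -> str:
    """Return 'p', 't', or 'm' for the staggered well closest to chi."""
    return min(STAGGERED, key=lambda name: angular_difference(chi, STAGGERED[name]))


for chi in (-65.0, 70.0, -175.0):
    print(chi, nearest_staggered(chi))
```

A χ angle far from all three wells (for example near 0° or ±120°) is eclipsed and is the kind of conformation that tends to show up as a rotamer outlier.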
MolProbity checks for C-beta deviations, where the measured C-beta position deviates from ideal geometry predicted from backbone (N, Cα, C) coordinates. Large deviations suggest problems with local backbone geometry or sidechain modeling.
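The idea of comparing an observed Cβ with one predicted from the backbone can be sketched as follows. Note the assumption: the ideal position here is built with a fixed-coefficient approximation that is common in protein machine-learning code, not with MolProbity's own construction, so absolute values will differ somewhat from MolProbity's report. Function names are illustrative.

```python
import math


def ideal_cb(n, ca, c):
    """Approximate ideal C-beta position from backbone N, CA, C coordinates.

    Uses a fixed-coefficient approximation (an assumption of this sketch),
    not MolProbity's construction.
    """
    b = tuple(ca[i] - n[i] for i in range(3))
    cv = tuple(c[i] - ca[i] for i in range(3))
    a = (b[1] * cv[2] - b[2] * cv[1],
         b[2] * cv[0] - b[0] * cv[2],
         b[0] * cv[1] - b[1] * cv[0])
    return tuple(-0.58273431 * a[i] + 0.56802827 * b[i] - 0.54067466 * cv[i] + ca[i]
                 for i in range(3))


def cb_deviation(n, ca, c, cb_observed):
    """Distance in angstroms between the observed and approximate ideal C-beta."""
    cb = ideal_cb(n, ca, c)
    return math.sqrt(sum((cb_observed[i] - cb[i]) ** 2 for i in range(3)))
```

A deviation computed this way is best used for ranking residues within one model rather than for applying MolProbity's numeric cutoff directly.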
For lower-resolution structures (2.5–4.0 Å), the tool also applies CaBLAM analysis, which evaluates backbone geometry and secondary structure likelihood from the Cα virtual dihedral angles. This provides useful validation feedback even when full-atom detail is uncertain.
MolProbity combines clashscore, Ramachandran favored percentage, and rotamer outlier percentage into a single score that approximates the resolution at which such a quality would be average. This allows you to compare your structure against others of similar resolution.
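A minimal sketch of how the three inputs are combined, using the log-weighted formula published with MolProbity (Chen et al., 2010) as commonly reproduced. The coefficients are quoted from that formula and should be checked against the current MolProbity source before relying on exact values; the function and parameter names are illustrative.

```python
import math


def molprobity_score(clashscore: float,
                     rotamer_outlier_pct: float,
                     rama_favored_pct: float) -> float:
    """Combined score; lower is better (sketch of the published formula)."""
    rama_not_favored = 100.0 - rama_favored_pct
    return (0.426 * math.log(1.0 + clashscore)
            + 0.33 * math.log(1.0 + max(0.0, rotamer_outlier_pct - 1.0))
            + 0.25 * math.log(1.0 + max(0.0, rama_not_favored - 2.0))
            + 0.5)


print(round(molprobity_score(4.0, 1.5, 97.0), 2))
```

The logarithms mean that the first few clashes or outliers move the score much more than additional ones in an already poor model, and the `max(0, ...)` terms give no penalty for up to 1% rotamer outliers or 2% non-favored Ramachandran residues.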
MolProbity accepts protein and nucleic acid structures as structure files, or as a PDB ID (e.g., 1UBQ) to automatically fetch the structure from the Protein Data Bank.
Your structure must contain valid atomic coordinates. For proteins, all backbone atoms (N, Cα, C, O) should be present for reliable analysis. Structures with missing atoms in the backbone will have reduced validation detail.
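Backbone completeness can be checked before submitting a structure. The sketch below assumes PDB-format text with standard fixed-column ATOM records and protein residues only (nucleic acid residues would be reported as incomplete). The function name is hypothetical and this is not part of MolProbity.

```python
BACKBONE = {"N", "CA", "C", "O"}


def residues_missing_backbone(pdb_text: str) -> dict:
    """Map (chain, residue number, residue name) to the missing backbone atoms."""
    seen = {}
    for line in pdb_text.splitlines():
        # Only standard-residue records; skip lines too short to parse
        if not line.startswith("ATOM") or len(line) < 26:
            continue
        atom_name = line[12:16].strip()   # columns 13-16
        res_name = line[17:20].strip()    # columns 18-20
        chain_id = line[21]               # column 22
        res_seq = line[22:27].strip()     # columns 23-27 (number + insertion code)
        seen.setdefault((chain_id, res_seq, res_name), set()).add(atom_name)
    return {key: sorted(BACKBONE - atoms)
            for key, atoms in seen.items()
            if not BACKBONE <= atoms}
```

An empty result means every protein residue found in the ATOM records has all four backbone atoms.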
MolProbity returns a spreadsheet with individual metrics for your structure, each with a quality assessment.
Each metric includes a status flag indicating severity.
Clashscore: Look for values below 5 for high-quality structures, below 10 for acceptable ones. Clashes above 10 per thousand atoms suggest systematic problems in specific regions. If clashes are concentrated in ligand-binding sites or flexible loops, they may reflect real conformational ensembles rather than errors.
Ramachandran outliers: 0–0.5% outliers is excellent; 0.5–1% is normal; above 2% suggests problems. However, functionally important residues (especially in active sites) sometimes adopt strained conformations for catalysis or substrate binding, appearing as outliers.
Rotamer outliers: Similar interpretation—0–0.5% is excellent, up to 1% is typical. Outliers in hydrophobic cores likely represent errors; outliers at protein surfaces or binding sites may be functionally important.
C-beta deviations: Large deviations (0.25 Å or more, the cutoff MolProbity uses to flag a residue) are rare in well-built structures and typically indicate local backbone strain or atomic position errors.
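MolProbity Score: Compare against structures at similar resolution. Lower is better. Because the score approximates the resolution at which the observed quality would be average, a value near your structure's actual resolution indicates typical quality, and a value well above it indicates worse-than-typical geometry.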
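The clashscore and outlier-percentage bands quoted above can be turned into a small triage helper. This is a sketch of the rules of thumb in this guide, not of MolProbity's own status flags. The "borderline" label for the 1–2% range, which the text leaves unspecified, is an assumption, as is treating a clashscore of exactly 10 as problematic.

```python
def rate_clashscore(value: float) -> str:
    """Band a clashscore using the guide's rules of thumb."""
    if value < 5:
        return "high quality"
    if value < 10:
        return "acceptable"
    return "problematic"


def rate_outlier_pct(pct: float) -> str:
    """Band a Ramachandran or rotamer outlier percentage."""
    if pct <= 0.5:
        return "excellent"
    if pct <= 1.0:
        return "typical"
    if pct <= 2.0:
        return "borderline"  # assumption: the guide gives no label for 1-2%
    return "problematic"


report = {"clashscore": 7.2, "rama_outliers_pct": 0.3, "rotamer_outliers_pct": 2.6}
print(rate_clashscore(report["clashscore"]),
      rate_outlier_pct(report["rama_outliers_pct"]),
      rate_outlier_pct(report["rotamer_outliers_pct"]))
```

Bands like these are only a first pass; flagged residues still need visual inspection, since outliers in active sites or binding sites can be genuine.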
MolProbity flags specific residues with problems—clashes, Ramachandran outliers, rotamer outliers, and C-beta deviations. Always examine these residues visually in your structure viewer. Some indicate genuine errors requiring correction; others reflect catalytic residues or binding sites under conformational stress.
MolProbity works exceptionally well for crystal structures at 1.5–3.0 Å resolution where atomic positions are well-determined. For cryo-EM structures or low-resolution models, interpretation requires more caution.
Best uses: Validating crystal structures, assessing quality of predicted structures from AlphaFold2 or Boltz-2, identifying problematic regions during refinement, comparing multiple conformational states.
Limitations: At very low resolution (worse than 4 Å), geometric metrics become less informative because ambiguity in the electron density can itself cause apparent clashes or deviations. For designed proteins or synthetic sequences with no evolutionary precedent, some metrics may be overly strict.
When validating structures predicted by AlphaFold2, Boltz-2, or other ML models, expect slightly different patterns than experimental structures. Predicted models often show fewer rotamer outliers (models are smoother) but may have unusual backbone angles in flexible regions. Use MolProbity's detailed feedback to identify regions you should trust versus regions requiring additional analysis.
For comprehensive protein analysis, use our Protein Parameters calculator to examine molecular weight, isoelectric point, stability indices, and other sequence-based metrics alongside MolProbity's structure-based validation.
For detailed backbone angle analysis, Ramachandran Plot provides an interactive visualization of phi/psi distributions specific to your structure.
For structure improvement, PDB Fixer can correct common structural errors and add missing atoms detected by MolProbity.
To visualize your structure and examine flagged residues in 3D, use PDB Viewer.