
Fast single-sequence protein structure prediction with multimer support
ESMFold is a protein structure prediction model developed by Meta AI Research (FAIR) and published in Science in 2023. It is built on ESM-2, a family of protein language models that scales up to 15 billion parameters, and predicts 3D protein structures directly from amino acid sequences without requiring multiple sequence alignments (MSAs).
The key advantage of ESMFold is speed. By eliminating the MSA generation step—which typically requires searching through large sequence databases—ESMFold achieves predictions up to 60x faster than AlphaFold2. This makes it particularly valuable for high-throughput screening and rapid prototyping.
ESMFold also supports multimer prediction, allowing you to predict structures of protein complexes with multiple chains. This is useful for studying protein-protein interactions and multi-subunit assemblies.
ESMFold takes a fundamentally different approach from MSA-based methods like AlphaFold2. Instead of explicitly searching for evolutionary homologs, it learns evolutionary patterns implicitly through language model pre-training.
The model is built on ESM-2, a transformer-based protein language model trained using masked residue prediction. During training, random amino acids in protein sequences are hidden, and the model learns to predict them from the surrounding context, similar to how BERT-style masked language models learn language patterns.
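
The masking step described above can be illustrated with a minimal sketch. This is not ESM-2's training code: the function name `mask_sequence`, the 15% masking fraction, and the `<mask>` placeholder are assumptions chosen for illustration, and real masked-language-model training typically uses a more elaborate corruption scheme.

```python
import random

MASK_TOKEN = "<mask>"  # illustrative placeholder; assumed, not taken from this document


def mask_sequence(sequence, mask_fraction=0.15, seed=None):
    """Hide a random subset of residues.

    Returns (tokens, targets), where targets maps each masked position
    to the residue a model would be asked to recover.
    """
    if not sequence:
        raise ValueError("sequence must not be empty")
    rng = random.Random(seed)
    n_masked = max(1, round(len(sequence) * mask_fraction))
    positions = sorted(rng.sample(range(len(sequence)), n_masked))
    tokens = list(sequence)
    targets = {}
    for pos in positions:
        targets[pos] = tokens[pos]
        tokens[pos] = MASK_TOKEN
    return tokens, targets


tokens, targets = mask_sequence("MKTAYIAKQRQISFVKSHFSRQ", seed=0)
print(" ".join(tokens))  # sequence with some residues replaced by <mask>
print(targets)           # positions and identities the model must predict
```

The training objective is then to predict each entry of `targets` from the unmasked residues around it.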
Through this training on millions of protein sequences, ESM-2 develops rich internal representations that capture evolutionary constraints and structural information. Remarkably, the attention patterns in these representations correspond to residue-residue contact maps in 3D structures.
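
For readers unfamiliar with the term, the sketch below shows what a residue-residue contact map is when computed from known 3D coordinates. It does not show how contacts are read out of attention patterns. The 8 Å distance threshold is a common convention assumed here, and `contact_map` is a hypothetical helper name.

```python
import math


def contact_map(coords, threshold=8.0, min_separation=1):
    """Return a symmetric boolean matrix marking residue pairs closer than threshold.

    coords: one (x, y, z) tuple per residue, in angstroms.
    min_separation: minimum sequence distance between residues to count as a pair.
    """
    n = len(coords)
    contacts = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(coords[i], coords[j]) < threshold:
                contacts[i][j] = contacts[j][i] = True
    return contacts


# Four residues spaced 3.8 angstroms apart along a straight line.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
for row in contact_map(coords):
    print("".join("#" if c else "." for c in row))
```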
ESMFold converts these learned representations into 3D coordinates through a structure module similar to AlphaFold2's. Training against experimentally determined structures teaches the model to translate the contact information implicit in its sequence embeddings into atomic coordinates.
The prediction process includes a recycling phase where the model feeds its output back through the network multiple times. Each recycling step refines the predicted structure, allowing the model to resolve ambiguities in difficult regions.
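
The control flow of recycling can be sketched as a simple loop. This is a toy illustration, not ESMFold's implementation: `run_with_recycling` and `toy_refine` are hypothetical names, the network is replaced by a function that halves the remaining error, and counting the recycles in addition to one initial pass is an assumption.

```python
def run_with_recycling(refine, initial_state, num_recycles=4):
    """Run one initial pass, then feed each output back in num_recycles times."""
    state = refine(initial_state)
    history = [state]
    for _ in range(num_recycles):
        state = refine(state)  # previous output becomes the next input
        history.append(state)
    return state, history


TARGET = 10.0  # stands in for the "true" structure in this toy example


def toy_refine(estimate):
    """Toy stand-in for the network: each pass halves the remaining error."""
    return estimate + 0.5 * (TARGET - estimate)


final, history = run_with_recycling(toy_refine, initial_state=0.0, num_recycles=4)
print(history)  # [5.0, 7.5, 8.75, 9.375, 9.6875]
```

Each pass starts from the previous estimate rather than from scratch, which is why additional recycles can tighten a prediction that a single pass leaves ambiguous.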
On benchmark datasets, ESMFold achieves median TM-scores of 0.95 (vs 0.96 for AlphaFold2) and median RMSD of 1.74Å (vs 1.30Å for AlphaFold2). While slightly less accurate overall, ESMFold excels for proteins with limited sequence homology—precisely the cases where MSA generation provides little benefit anyway.
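
To make the two metrics concrete, here is a simplified sketch of how RMSD and TM-score are computed from paired residue coordinates. It assumes the predicted and reference structures are already superimposed and have the same residues in the same order. Real TM-score tools also search for the best superposition, which this sketch omits. The lower bound of 0.5 Å on the `d0` scale for short chains is also an assumption.

```python
import math


def rmsd(pred, ref):
    """Root-mean-square deviation between paired coordinates, in angstroms."""
    if len(pred) != len(ref) or not pred:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(math.dist(p, r) ** 2 for p, r in zip(pred, ref))
    return math.sqrt(squared / len(pred))


def tm_score(pred, ref):
    """TM-score for a fixed superposition, using the standard length-dependent d0."""
    if len(pred) != len(ref) or not ref:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    n = len(ref)
    # d0 normalizes distances so the score is comparable across chain lengths.
    d0 = max(0.5, 1.24 * (n - 15) ** (1.0 / 3.0) - 1.8) if n > 15 else 0.5
    total = sum(1.0 / (1.0 + (math.dist(p, r) / d0) ** 2) for p, r in zip(pred, ref))
    return total / n


ref = [(3.8 * i, 0.0, 0.0) for i in range(30)]
pred = [(x, y + 1.0, z) for x, y, z in ref]  # every residue displaced by 1 angstrom
print(rmsd(pred, ref), tm_score(pred, ref))
```

RMSD is dominated by the worst-placed residues, while TM-score down-weights large deviations, so the two metrics can rank predictions differently.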
For high-confidence predictions (pLDDT > 70), ESMFold approaches experimental-level accuracy with backbone RMSD of 1.33Å.
You can provide protein sequences in several ways, including as a file (.fasta, .fa, .txt).
For multimer prediction, add multiple chains as separate inputs. Chain IDs (A, B, C...) are assigned automatically. The tool supports up to 10 chains per prediction.
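
The multimer input rules above (one sequence per chain, automatic chain IDs, at most 10 chains) can be sketched as a small preprocessing step. This is an illustration of the stated rules, not the tool's actual code. `parse_fasta` and `assign_chain_ids` are hypothetical helpers, and the example sequences are placeholders.

```python
import string

MAX_CHAINS = 10  # limit stated in the text above


def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        else:
            if header is None:
                header = ""  # tolerate a bare sequence with no header line
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def assign_chain_ids(records):
    """Map chain IDs A, B, C, ... to sequences in input order."""
    if len(records) > MAX_CHAINS:
        raise ValueError(f"at most {MAX_CHAINS} chains are supported, got {len(records)}")
    return {string.ascii_uppercase[i]: seq for i, (_, seq) in enumerate(records)}


fasta = """>heavy
MKTAYIAKQR
QISFVKSHFS
>light
GSHMLEDPVA
"""
print(assign_chain_ids(parse_fasta(fasta)))
```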
For the number of recycling steps, 4 matches what was used during training and works well for most proteins.
For chunk size, Auto is recommended for most cases. Lower values (64, 32) reduce memory usage but increase runtime, which is useful if you encounter memory errors with long sequences.
ESMFold outputs pLDDT (predicted Local Distance Difference Test) scores for each residue, stored in the B-factor field of the PDB file. These scores range from 0 to 100 and indicate local prediction confidence.
| pLDDT | Confidence | Interpretation |
|---|---|---|
| > 90 | Very high | Backbone and sidechains accurate |
| 70-90 | Good | Reliable backbone prediction |
| 50-70 | Low | Treat with caution |
| < 50 | Very low | Likely disordered or unstructured |
Regions with pLDDT below 50 often appear as extended ribbons and may represent intrinsically disordered regions rather than prediction failures. These regions may only adopt defined structures when bound to interaction partners.
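
The confidence bands in the table can be written as a small lookup. The table does not say which band the exact boundary values 50, 70, and 90 belong to, so the choice below (boundaries fall into the band the table lists them in as a range) is an assumption, and `confidence_band` is a hypothetical helper name.

```python
def confidence_band(plddt):
    """Map a pLDDT score (0-100) to the confidence label from the table."""
    if not 0 <= plddt <= 100:
        raise ValueError(f"pLDDT must be between 0 and 100, got {plddt}")
    if plddt > 90:
        return "Very high"
    if plddt >= 70:
        return "Good"
    if plddt >= 50:
        return "Low"
    return "Very low"


for score in (95.2, 81.0, 63.5, 41.7):
    print(score, confidence_band(score))
```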
Each prediction produces a PDB file containing atomic coordinates for all modeled residues. The B-factor column contains per-residue pLDDT scores, which most molecular visualization tools can display as a color gradient.
Use ESMFold when you need rapid structure predictions, are working with designed proteins lacking natural homologs, or want to quickly screen many sequences before running more expensive predictions.
Use Boltz-2 when you need the highest accuracy, want to predict protein-ligand complexes, or need binding affinity estimates. Boltz-2's MSA generation improves accuracy for natural proteins.
Use Chai-1 when you're predicting multi-component complexes involving DNA, RNA, or small molecules, and don't need affinity predictions.
ESMFold trades some accuracy for speed. For applications requiring the highest possible accuracy—such as drug design or detailed mechanistic studies—MSA-based methods like Boltz-2 or AlphaFold2 may be more appropriate.
The model works best for globular proteins with clear evolutionary signatures encoded in the sequence. Predictions for intrinsically disordered regions, membrane proteins, or highly repetitive sequences may be less reliable.
ESMFold also cannot predict ligand binding poses or non-protein molecules. For protein-small molecule or protein-nucleic acid complexes, use Boltz-2 or Chai-1 instead.