What is ESMFold?
ESMFold is a protein structure prediction model developed by Meta AI Research (FAIR) and published in Science in 2023. It is built on the ESM-2 protein language model, a family trained at scales up to 15 billion parameters (the publicly released ESMFold v1 checkpoint uses the 3-billion-parameter variant). It predicts 3D protein structures directly from amino acid sequences without requiring multiple sequence alignments (MSAs).
The key advantage of ESMFold is speed. MSA generation typically requires searching large sequence databases, and by eliminating that step ESMFold achieves predictions up to 60x faster than AlphaFold2. The largest speedups are seen on short sequences, with smaller gains on longer ones. This makes it particularly valuable for high-throughput screening and rapid prototyping.
ESMFold also supports multimer prediction, allowing you to predict structures of protein complexes with multiple chains. This is useful for studying protein-protein interactions and multi-subunit assemblies.
How does ESMFold work?
ESMFold takes a fundamentally different approach from MSA-based methods like AlphaFold2. Instead of explicitly searching for evolutionary homologs, it learns evolutionary patterns implicitly through language model pre-training.
Language model foundation
The model is built on ESM-2, a transformer-based protein language model trained using masked residue prediction. During training, random amino acids in protein sequences are hidden, and the model learns to predict them from surrounding context—similar to how GPT-style models learn language patterns.
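The masking step described above can be sketched in a few lines. This is only an illustration of how a training example is constructed, not ESM-2's actual data pipeline: the function name, the 15% masking fraction, and the `<mask>` placeholder are assumptions chosen for the example.

```python
import random

MASK = "<mask>"  # placeholder token; assumed here for illustration


def mask_sequence(sequence, mask_fraction=0.15, seed=0):
    """Hide a random subset of residues and return (tokens, targets).

    `targets` maps each masked position to the residue that was hidden
    there, which is what the model is trained to predict from context.
    """
    if not sequence:
        raise ValueError("sequence must not be empty")
    rng = random.Random(seed)
    # Always mask at least one residue so there is something to predict.
    n_mask = max(1, round(len(sequence) * mask_fraction))
    positions = rng.sample(range(len(sequence)), n_mask)
    tokens = list(sequence)
    targets = {}
    for pos in positions:
        targets[pos] = tokens[pos]
        tokens[pos] = MASK
    return tokens, targets


tokens, targets = mask_sequence("MKTAYIAKQRQISFVKSHFSRQ")
print("".join(t if t != MASK else "_" for t in tokens))
print(targets)
```

The training objective is then to recover every entry of `targets` given only `tokens`.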
Through this training on millions of protein sequences, ESM-2 develops rich internal representations that capture evolutionary constraints and structural information. Remarkably, the attention patterns in these representations correspond to residue-residue contact maps in 3D structures.
From attention to structure
ESMFold converts these learned representations into 3D coordinates by passing them through a folding trunk followed by a structure module similar to AlphaFold2's. This part of the network is trained on experimentally determined structures, so it learns to translate sequence embeddings into atomic coordinates.
The prediction process includes a recycling phase where the model feeds its output back through the network multiple times. Each recycling step refines the predicted structure, allowing the model to resolve ambiguities in difficult regions.
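The control flow of recycling can be illustrated with a toy loop. This is a schematic sketch, not ESMFold's code: `predict_with_recycling` and `toy_refine` are hypothetical names, the "structure" is just a list of numbers, and the convention of one initial pass plus `num_recycles` extra passes is an assumption made for the example.

```python
def predict_with_recycling(refine, features, num_recycles=4):
    """Run `refine` once, then feed its output back in `num_recycles` more times.

    Assumption for this sketch: the first pass has no previous prediction,
    so `refine` receives None as its second argument.
    """
    state = None
    for _ in range(num_recycles + 1):
        state = refine(features, state)
    return state


def toy_refine(target, previous):
    """Stand-in for the network: move the estimate halfway toward `target`."""
    if previous is None:
        previous = [0.0] * len(target)
    return [p + 0.5 * (t - p) for p, t in zip(previous, target)]


for n in (0, 2, 4):
    print(n, predict_with_recycling(toy_refine, [8.0], num_recycles=n))
```

Each extra pass in the toy example closes half of the remaining gap, which mirrors the diminishing returns typically seen when raising the recycle count.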
Accuracy vs. speed trade-off
On benchmark datasets, ESMFold achieves median TM-scores of 0.95 (vs 0.96 for AlphaFold2) and median RMSD of 1.74Å (vs 1.30Å for AlphaFold2). While slightly less accurate overall, ESMFold excels for proteins with limited sequence homology—precisely the cases where MSA generation provides little benefit anyway.
For high-confidence predictions (pLDDT > 70), ESMFold approaches experimental-level accuracy with backbone RMSD of 1.33Å.
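For readers unfamiliar with the RMSD figures quoted above, the metric itself is simple to compute once two structures are superimposed. The sketch below assumes the coordinates are already optimally aligned and matched atom for atom; the superposition step (for example, the Kabsch algorithm) is deliberately left out, and the function name is illustrative.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally sized lists of (x, y, z).

    Assumes the two structures are already superimposed and that atom i in
    `coords_a` corresponds to atom i in `coords_b`.
    """
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


predicted = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reference = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(predicted, reference))  # every atom is displaced by exactly 1 Å
```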
Inputs & settings
Protein sequence input
You can provide protein sequences in several ways:
- Text input: Paste raw amino acid sequences or FASTA format
- File upload: Upload FASTA files (`.fasta`, `.fa`, `.txt`)
- PDB fetch: Enter a PDB ID to retrieve sequences directly from RCSB
For multimer prediction, add multiple chains as separate inputs. Chain IDs (A, B, C...) are assigned automatically. The tool supports up to 10 chains per prediction.
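If you prepare inputs programmatically, the sketch below shows one way to split pasted text into chains and label them the way described above. It is not the tool's own parser: `parse_fasta` and `assign_chain_ids` are hypothetical helpers, and the only facts taken from the text are the alphabetical chain IDs and the 10-chain limit.

```python
import string

MAX_CHAINS = 10  # limit stated in the text above


def parse_fasta(text):
    """Return the sequences found in FASTA text (or a single raw sequence)."""
    sequences, current = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A header line closes the previous record, if any.
            if current:
                sequences.append("".join(current))
                current = []
        else:
            current.append(line.upper())
    if current:
        sequences.append("".join(current))
    return sequences


def assign_chain_ids(sequences):
    """Map chain IDs A, B, C, ... to sequences in input order."""
    if len(sequences) > MAX_CHAINS:
        raise ValueError(f"at most {MAX_CHAINS} chains are supported")
    return dict(zip(string.ascii_uppercase, sequences))


fasta = ">heavy\nMKTAYI\nAKQRQ\n>light\nGSHMLE\n"
print(assign_chain_ids(parse_fasta(fasta)))
```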
Prediction parameters
- Number of recycles: Controls how many refinement iterations the model performs. Higher values may improve accuracy for difficult targets but increase runtime. The value of `4` matches what was used during training and works well for most proteins.
- Chunk size: Memory optimization for long sequences. `Auto` is recommended for most cases. Lower values (`64`, `32`) reduce memory usage but increase runtime, which is useful if you encounter memory errors with long sequences.
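The memory and runtime trade-off behind the chunk size setting can be illustrated with a generic example. This is not ESMFold's implementation; it is a toy pairwise computation, with a hypothetical function name, showing that processing a few rows at a time gives the same answer while holding less intermediate data in memory.

```python
def pairwise_row_sums(values, chunk_size=None):
    """For each i, sum |v_i - v_j| over all j, building only a block of rows at a time.

    With chunk_size=None the full n x n table is built in one go (fastest,
    most memory). Smaller chunks build fewer rows at once (less memory,
    more loop iterations).
    """
    n = len(values)
    step = chunk_size if chunk_size else max(n, 1)
    sums = []
    for start in range(0, n, step):
        # Only this block of rows exists in memory during the iteration.
        block = [[abs(vi - vj) for vj in values] for vi in values[start:start + step]]
        sums.extend(sum(row) for row in block)
    return sums


values = [1, 2, 4]
print(pairwise_row_sums(values))                # one block of 3 rows
print(pairwise_row_sums(values, chunk_size=1))  # three blocks of 1 row, same result
```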
Understanding the results
pLDDT confidence scores
ESMFold outputs a pLDDT (predicted Local Distance Difference Test) score for each residue, stored in the B-factor field of the PDB file. These scores range from 0 to 100 and indicate local prediction confidence.
| pLDDT | Confidence | Interpretation |
|---|---|---|
| > 90 | Very high | Backbone and sidechains accurate |
| 70-90 | Good | Reliable backbone prediction |
| 50-70 | Low | Treat with caution |
| < 50 | Very low | Likely disordered or unstructured |
Regions with pLDDT below 50 often appear as extended ribbons and may represent intrinsically disordered regions rather than prediction failures. These regions may only adopt defined structures when bound to interaction partners.
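To locate such regions in a prediction, you can scan the per-residue scores for contiguous runs below the threshold. The helper below is a minimal sketch with a hypothetical name; the threshold of 50 comes from the table above, while the `min_length` filter is an added convenience, not something the tool defines.

```python
def low_confidence_segments(plddt, threshold=50.0, min_length=1):
    """Return (start, end) pairs, 0-based and inclusive, of runs with pLDDT < threshold."""
    segments = []
    start = None
    for i, score in enumerate(plddt):
        if score < threshold:
            if start is None:
                start = i  # a new low-confidence run begins here
        elif start is not None:
            if i - start >= min_length:
                segments.append((start, i - 1))
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and len(plddt) - start >= min_length:
        segments.append((start, len(plddt) - 1))
    return segments


scores = [30.0, 40.0, 85.0, 92.0, 45.0, 48.0, 49.0]
print(low_confidence_segments(scores))
print(low_confidence_segments(scores, min_length=3))
```

Segments found this way are candidates for disorder, not proof of it; a short low-confidence stretch inside an otherwise confident domain may simply be a flexible loop.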
Structure files
Each prediction produces a PDB file containing atomic coordinates for all modeled residues. The B-factor column contains per-residue pLDDT scores, which most molecular visualization tools can display as a color gradient.
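If you want the scores as numbers rather than colors, they can be read straight from the file. The sketch below assumes standard fixed-column PDB `ATOM` records (B-factor in columns 61-66) and takes one value per residue from the C-alpha atom. The function names are illustrative, and the handling of exact boundary values (70 and 90 count as "Good", 50 as "Low") is an assumption, since the table above does not specify it.

```python
def read_plddt(pdb_text):
    """Return {(chain_id, residue_number): pLDDT} using each residue's CA atom."""
    scores = {}
    for line in pdb_text.splitlines():
        # Fixed-column PDB layout: atom name 13-16, chain 22,
        # residue number 23-26, B-factor 61-66 (1-based columns).
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            chain_id = line[21]
            residue_number = int(line[22:26])
            scores[(chain_id, residue_number)] = float(line[60:66])
    return scores


def confidence_band(plddt):
    """Map a pLDDT value to the confidence labels used in the table above."""
    if plddt > 90:
        return "Very high"
    if plddt >= 70:
        return "Good"
    if plddt >= 50:
        return "Low"
    return "Very low"
```

A typical use is to call `read_plddt` on the text of a downloaded prediction and then average the values, or pass each one to `confidence_band`, to decide which sequences are worth a slower, more accurate run.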
When to use ESMFold vs. other tools
| Feature | ESMFold | Boltz-2 | Chai-1 |
|---|---|---|---|
| Speed | Very fast (~seconds) | Moderate (~20 sec) | Moderate |
| MSA required | No | Optional | Optional |
| Accuracy (TM-score) | ~0.95 | ~0.96 | ~0.95 |
| Multimer support | Yes | Yes | Yes |
| Ligand prediction | No | Yes | Yes |
| DNA/RNA support | No | Yes | Yes |
| Affinity prediction | No | Yes | No |
Which tool should you choose?
Use ESMFold when you need rapid structure predictions, are working with designed proteins lacking natural homologs, or want to quickly screen many sequences before running more expensive predictions.
Use Boltz-2 when you need the highest accuracy, want to predict protein-ligand complexes, or need binding affinity estimates. Boltz-2's MSA generation improves accuracy for natural proteins.
Use Chai-1 when you're predicting multi-component complexes involving DNA, RNA, or small molecules, and don't need affinity predictions.
Limitations
ESMFold trades some accuracy for speed. For applications requiring the highest possible accuracy—such as drug design or detailed mechanistic studies—MSA-based methods like Boltz-2 or AlphaFold2 may be more appropriate.
The model works best for globular proteins with clear evolutionary signatures encoded in the sequence. Predictions for intrinsically disordered regions, membrane proteins, or highly repetitive sequences may be less reliable.
ESMFold also cannot predict ligand binding poses or non-protein molecules. For protein-small molecule or protein-nucleic acid complexes, use Boltz-2 or Chai-1 instead.
References
- Lin, Z., et al. (2023). Evolutionary-scale prediction of atomic-level protein structure with a language model. Science. https://doi.org/10.1126/science.ade2574
