ESMFold

Fast single-sequence protein structure prediction with multimer support

What is ESMFold?

ESMFold is a protein structure prediction model developed by Meta AI Research (FAIR) and published in Science in 2023. It is built on the ESM-2 protein language model, which was trained at scales of up to 15 billion parameters, and it predicts 3D protein structures directly from amino acid sequences without requiring a multiple sequence alignment (MSA).

The key advantage of ESMFold is speed. By eliminating the MSA generation step, which typically requires searching large sequence databases, ESMFold produces predictions up to 60x faster than AlphaFold2, with the largest speedups on short sequences. This makes it particularly valuable for high-throughput screening and rapid prototyping.

ESMFold also supports multimer prediction, allowing you to predict structures of protein complexes with multiple chains. This is useful for studying protein-protein interactions and multi-subunit assemblies.

How does ESMFold work?

ESMFold takes a fundamentally different approach from MSA-based methods like AlphaFold2. Instead of explicitly searching for evolutionary homologs, it learns evolutionary patterns implicitly through language model pre-training.

Language model foundation

The model is built on ESM-2, a transformer-based protein language model trained using masked residue prediction. During training, random amino acids in protein sequences are hidden, and the model learns to predict them from the surrounding context on both sides, similar to how BERT-style masked language models learn language patterns.
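
The masking step can be illustrated with a short sketch. This is not ESM-2's training code: the `mask_residues` helper, the `<mask>` placeholder string, and the 15% masking fraction are assumptions chosen for illustration.

```python
import random

MASK = "<mask>"  # hypothetical placeholder token, for illustration only


def mask_residues(sequence, mask_fraction=0.15, seed=0):
    """Hide a random subset of residues and return (masked tokens, targets)."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    rng = random.Random(seed)
    # Mask at least one position so there is always something to predict.
    n_mask = max(1, round(len(sequence) * mask_fraction))
    positions = sorted(rng.sample(range(len(sequence)), n_mask))
    tokens = list(sequence)
    targets = {}
    for pos in positions:
        targets[pos] = tokens[pos]  # the residue the model must recover
        tokens[pos] = MASK
    return tokens, targets


tokens, targets = mask_residues("MKTAYIAKQRQISFVKSHFSRQ")
print("".join("_" if t == MASK else t for t in tokens))
print(targets)
```

A language model trained on this objective sees the masked tokens as input and is scored on how well it recovers the residues stored in `targets`.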

Through this training on millions of protein sequences, ESM-2 develops rich internal representations that capture evolutionary constraints and structural information. Remarkably, the attention patterns in these representations correspond to residue-residue contact maps in 3D structures.

From attention to structure

ESMFold converts these learned representations into 3D coordinates through a structure module similar to AlphaFold2's. During training on experimentally determined structures, the model learns to translate the language model's sequence representations into atomic coordinates.

The prediction process includes a recycling phase where the model feeds its output back through the network multiple times. Each recycling step refines the predicted structure, allowing the model to resolve ambiguities in difficult regions.
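
The control flow of recycling can be sketched as a simple loop. Everything here is a toy stand-in: `predict_with_recycling` and `toy_step` are hypothetical names, and counting one initial pass plus `num_recycles` refinement passes is an assumption about how the setting is interpreted.

```python
def predict_with_recycling(sequence, model_step, num_recycles=4):
    """Run one initial pass, then feed the output back `num_recycles` times."""
    state = None  # no previous prediction is available on the first pass
    for _ in range(num_recycles + 1):
        state = model_step(sequence, state)
    return state


def toy_step(sequence, previous):
    """Toy stand-in for the network: it only counts how often it was called."""
    passes = 0 if previous is None else previous["passes"]
    return {"passes": passes + 1, "length": len(sequence)}


result = predict_with_recycling("MKTAYIAKQRQISF", toy_step, num_recycles=4)
print(result)
```

In a real model, each pass would receive the previous structure and representations as extra input, which is why runtime grows roughly in proportion to the number of recycles.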

Accuracy vs. speed trade-off

On benchmark datasets, ESMFold achieves a median TM-score of 0.95 (vs. 0.96 for AlphaFold2) and a median RMSD of 1.74 Å (vs. 1.30 Å for AlphaFold2). Although slightly less accurate overall, ESMFold is a strong choice for proteins with limited sequence homology, which are precisely the cases where MSA generation provides little benefit anyway.

For high-confidence predictions (pLDDT > 70), ESMFold approaches experimental-level accuracy, with a backbone RMSD of 1.33 Å.

Inputs & settings

Protein sequence input

You can provide protein sequences in several ways:

Text input: Paste raw amino acid sequences or FASTA format
File upload: Upload FASTA files (.fasta, .fa, .txt)
PDB fetch: Enter a PDB ID to retrieve sequences directly from RCSB

For multimer prediction, add multiple chains as separate inputs. Chain IDs (A, B, C...) are assigned automatically. The tool supports up to 10 chains per prediction.
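
As a rough sketch of how multi-chain input maps to chain IDs, the snippet below parses raw or FASTA text and labels the chains in input order. It is not the tool's actual parser: `parse_chains` is a hypothetical helper, and restricting input to the 20 standard amino acid letters is an assumption.

```python
import string

MAX_CHAINS = 10  # per-prediction chain limit stated above
STANDARD_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")  # assumption: standard residues only


def parse_chains(text):
    """Parse raw sequence or FASTA text into an ordered {chain_id: sequence} dict."""
    sequences, current = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):  # a FASTA header starts a new chain
            if current:
                sequences.append("".join(current))
                current = []
        else:
            current.append(line.upper())
    if current:
        sequences.append("".join(current))

    if not sequences:
        raise ValueError("no sequence found")
    if len(sequences) > MAX_CHAINS:
        raise ValueError(f"{len(sequences)} chains given, limit is {MAX_CHAINS}")
    for seq in sequences:
        unknown = set(seq) - STANDARD_RESIDUES
        if unknown:
            raise ValueError(f"unexpected characters: {sorted(unknown)}")

    # Assign chain IDs A, B, C... in input order.
    return dict(zip(string.ascii_uppercase, sequences))


chains = parse_chains(">heavy\nMKTAYIAK\nQRQISF\n>light\nGSHMLE\n")
print(chains)
```

Text without any `>` header line is treated as a single chain, so a pasted raw sequence becomes chain A.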

Prediction parameters

Number of recycles: Controls how many refinement iterations the model performs. Higher values may improve accuracy for difficult targets but increase runtime. The value of 4 matches what was used during training and works well for most proteins.
Chunk size: Memory optimization for long sequences. Auto is recommended for most cases. Lower values (64, 32) reduce memory usage but increase runtime, which is useful if you encounter memory errors with long sequences.

Understanding the results

pLDDT confidence scores

ESMFold outputs pLDDT (predicted Local Distance Difference Test) scores for each residue, stored in the B-factor field of the PDB file. These scores range from 0-100 and indicate local prediction confidence.

pLDDT	Confidence	Interpretation
> 90	Very high	Backbone and sidechains accurate
70-90	Good	Reliable backbone prediction
50-70	Low	Treat with caution
< 50	Very low	Likely disordered or unstructured

Regions with pLDDT below 50 often appear as extended ribbons and may represent intrinsically disordered regions rather than prediction failures. These regions may only adopt defined structures when bound to interaction partners.
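
The confidence bands and low-confidence regions described above can be computed from a list of per-residue scores. This is a minimal sketch: both function names are hypothetical, and assigning the boundary values 70 and 50 to the higher band is an assumption, since the table does not specify it.

```python
def plddt_band(score):
    """Map a pLDDT score (0-100) to the confidence label from the table above."""
    if not 0 <= score <= 100:
        raise ValueError("pLDDT must be between 0 and 100")
    if score > 90:
        return "Very high"
    if score >= 70:  # assumption: exactly 70 counts as "Good"
        return "Good"
    if score >= 50:  # assumption: exactly 50 counts as "Low"
        return "Low"
    return "Very low"


def low_confidence_regions(scores, threshold=50.0):
    """Return 1-based inclusive (start, end) runs of residues below `threshold`."""
    regions, start = [], None
    for i, score in enumerate(scores, start=1):
        if score < threshold:
            if start is None:
                start = i  # a new low-confidence run begins here
        elif start is not None:
            regions.append((start, i - 1))
            start = None
    if start is not None:  # the run extends to the last residue
        regions.append((start, len(scores)))
    return regions


scores = [95, 88, 45, 40, 72, 30]
print([plddt_band(s) for s in scores])
print(low_confidence_regions(scores))
```

Runs reported by `low_confidence_regions` are candidates for disorder and are worth checking before they are used in downstream analysis.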

Structure files

Each prediction produces a PDB file containing atomic coordinates for all modeled residues. The B-factor column contains per-residue pLDDT scores, which most molecular visualization tools can display as a color gradient.
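
If you want the scores outside a viewer, they can be read from the fixed-width B-factor field (columns 61-66) of each ATOM record. The sketch below uses only the standard library; `read_plddt` and the `_atom_line` demo helper are hypothetical names, and taking one score per residue from the CA atom is an assumption.

```python
def read_plddt(pdb_text):
    """Return {(chain_id, residue_number): pLDDT} read from CA atom records."""
    scores = {}
    for line in pdb_text.splitlines():
        # PDB is fixed-width: atom name in columns 13-16, chain ID in column 22,
        # residue number in columns 23-26, B-factor in columns 61-66.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            chain_id = line[21]
            residue_number = int(line[22:26])
            scores[(chain_id, residue_number)] = float(line[60:66])
    return scores


def _atom_line(serial, name, resname, chain, resseq, bfactor):
    """Build a demo ATOM record with dummy coordinates (illustration only)."""
    return (f"ATOM  {serial:5d} {name:<4s} {resname:3s} {chain}{resseq:4d}    "
            f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{bfactor:6.2f}")


demo = "\n".join([
    _atom_line(1, " N", "MET", "A", 1, 85.2),
    _atom_line(2, " CA", "MET", "A", 1, 85.2),
    _atom_line(3, " CA", "LYS", "A", 2, 42.5),
])
scores = read_plddt(demo)
mean_plddt = sum(scores.values()) / len(scores)
print(scores, round(mean_plddt, 2))
```

The mean over all residues gives a quick single-number summary that is convenient for ranking many predictions.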

When to use ESMFold vs. other tools

Feature	ESMFold	Boltz-2	Chai-1
Speed	Very fast (~seconds)	Moderate (~20 sec)	Moderate
MSA required	No	Optional	Optional
Accuracy (TM-score)	~0.95	~0.96	~0.95
Multimer support	Yes	Yes	Yes
Ligand prediction	No	Yes	Yes
DNA/RNA support	No	Yes	Yes
Affinity prediction	No	Yes	No

Which tool should you choose?

Use ESMFold when you need rapid structure predictions, are working with designed proteins lacking natural homologs, or want to quickly screen many sequences before running more expensive predictions.

Use Boltz-2 when you need the highest accuracy, want to predict protein-ligand complexes, or need binding affinity estimates. Boltz-2's MSA generation improves accuracy for natural proteins.

Use Chai-1 when you're predicting multi-component complexes involving DNA, RNA, or small molecules, and don't need affinity predictions.

Limitations

ESMFold trades some accuracy for speed. For applications requiring the highest possible accuracy, such as drug design or detailed mechanistic studies, MSA-based methods like Boltz-2 or AlphaFold2 may be more appropriate.

The model works best for globular proteins with clear evolutionary signatures encoded in the sequence. Predictions for intrinsically disordered regions, membrane proteins, or highly repetitive sequences may be less reliable.

ESMFold also cannot predict ligand binding poses or non-protein molecules. For protein-small molecule or protein-nucleic acid complexes, use Boltz-2 or Chai-1 instead.

References

Lin, Z., et al. (2023). Evolutionary-scale prediction of atomic-level protein structure with a language model. Science. https://doi.org/10.1126/science.ade2574