ESMFold2

(2026-05)

Predict protein and protein-complex structures from sequence with ESMFold2, including confidence outputs.

What is ESMFold2?

ESMFold2 is Biohub's protein structure prediction model built from ESMC protein language model representations and a diffusion-based structure generator. It predicts atom-level protein structures and protein complexes directly from amino acid sequence, returning confidence scores that help separate reliable folded regions from uncertain loops, termini, or interfaces.

Compared with the original ESMFold, ESMFold2 is aimed at higher-accuracy structure and complex prediction. It is still useful for sequence-only folding, but the confidence outputs are especially important when assessing multi-chain assemblies or deciding whether to spend time on downstream analysis.

How to use ESMFold2 online

Run ESMFold2 online by entering one or more protein chains as raw amino acid sequence, FASTA text, a FASTA file, or an RCSB sequence fetch. ProteinIQ predicts the structure with ESMFold2-Fast or the full ESMFold2 model, then returns an mmCIF structure, pLDDT, pTM, iPTM, and optional confidence matrix outputs.

Inputs

Input	Accepted format	Notes
`Protein sequence`	Raw sequence, FASTA text, `.fasta`, `.fa`, `.fas`, or `.txt`	ProteinIQ accepts canonical single-letter amino acid sequences.
`Multiple chains`	Up to 10 protein chains	Each chain is modeled as part of the same protein complex.
`RCSB fetch`	PDB ID, for example `1UBQ`	ProteinIQ fetches the protein sequence from RCSB and uses the sequence, not the deposited coordinates, as the prediction input.
`Job name`	Text	Optional label for finding the prediction later.

The total sequence length across all chains is capped at 2000 residues on ProteinIQ. For long multidomain proteins, running individual domains can produce more interpretable confidence patterns than folding the full construct in one job.

ProteinIQ supports protein-only ESMFold2 jobs in this release. DNA, RNA, ligands, covalent bonds, pocket conditioning, embedding export, and MSA input are not part of the online form.
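
The input limits above can be checked locally before a job is submitted. The following is a minimal sketch, not ProteinIQ's own validation: the function names are hypothetical, and it assumes that raw text without a FASTA header is a single chain and that "canonical" means the 20 standard amino acid letters.

```python
# Hypothetical pre-submission check; the limits are taken from the text above.
CANONICAL = set("ACDEFGHIKLMNPQRSTVWY")  # assumption: the 20 standard residues
MAX_CHAINS = 10
MAX_TOTAL_RESIDUES = 2000


def parse_chains(text):
    """Split raw or FASTA text into a list of uppercase chain sequences."""
    chains, current = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A header line closes the previous record, if there is one.
            if current:
                chains.append("".join(current))
                current = []
        else:
            current.append(line.upper())
    if current:
        chains.append("".join(current))
    return chains


def validate_chains(chains):
    """Return a list of problems; an empty list means the input fits the limits."""
    problems = []
    if not chains:
        problems.append("no sequence found")
    if len(chains) > MAX_CHAINS:
        problems.append(f"{len(chains)} chains exceeds the limit of {MAX_CHAINS}")
    total = sum(len(chain) for chain in chains)
    if total > MAX_TOTAL_RESIDUES:
        problems.append(f"{total} residues exceeds the cap of {MAX_TOTAL_RESIDUES}")
    for index, chain in enumerate(chains, start=1):
        bad = sorted(set(chain) - CANONICAL)
        if bad:
            problems.append(
                f"chain {index} has non-canonical characters: {''.join(bad)}"
            )
    return problems
```

Calling `validate_chains(parse_chains(fasta_text))` returns an empty list when the input fits the chain count, residue cap, and alphabet described above.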

Settings

Setting	Default	Description
`Model variant`	`ESMFold2-Fast`	`ESMFold2-Fast` is optimized for sequence-only inference. `ESMFold2 full` uses the larger model and may be preferable when accuracy matters more than runtime.
`Number of loops`	`10`	Number of folding refinement loops. More loops give the model more opportunity to refine difficult regions, at the cost of longer runtime.
`Sampling steps`	`100`	Number of diffusion sampling steps used during structure generation. Higher values increase compute and may improve difficult predictions.
`Include PAE matrix`	Off	Adds a JSON file with predicted aligned error values for domain and interface interpretation.
`Include pair-chain iPTM`	Off	Adds a JSON file with pairwise chain-interface confidence for protein complexes when available.

For a quick first pass, keep ESMFold2-Fast, Number of loops = 10, and Sampling steps = 100. Increase compute only when a prediction is scientifically important enough to justify a slower run or when comparing alternative chain definitions.

Results

Output	Format	Meaning
`Predicted structure`	`.cif`	Native mmCIF coordinate file for the predicted protein or protein complex.
`pLDDT`	JSON and structure metadata	Per-residue local confidence, summarized as mean, minimum, and maximum pLDDT.
`pTM`	Result table	Global fold confidence. Higher values indicate a more confident overall topology.
`iPTM`	Result table	Interface confidence for multi-chain predictions. Higher values suggest more reliable chain placement.
`PAE matrix`	Optional JSON	Predicted aligned error between residue pairs. Useful for domain orientation and interface uncertainty.
`Pair-chain iPTM`	Optional JSON	Chain-by-chain interface confidence matrix for complex predictions.

The structure is returned as mmCIF rather than PDB. Legacy PDB format uses fixed-width columns that cap a file at 99,999 atoms and single-character chain identifiers; mmCIF has no such limits and carries modern structure metadata more reliably.

Interpreting ESMFold2 confidence

ESMFold2 confidence values are not experimental validation. They are model-estimated reliability signals, useful for triage and for deciding which parts of a prediction deserve closer inspection.

Metric	Range	Best use
`pLDDT`	0 to 100	Local residue-level confidence. High pLDDT supports backbone-level interpretation; low pLDDT often marks flexible or disordered regions.
`pTM`	0 to 1	Overall fold confidence. Higher pTM indicates a more reliable global topology.
`iPTM`	0 to 1	Interface confidence for complexes. Low iPTM means chain placement should be treated cautiously even if individual chains have high pLDDT.
`PAE`	Error-like matrix	Domain and interface uncertainty. Low error between two regions supports a confident relative orientation; high error suggests the regions may move independently.

Typical pLDDT interpretation follows the same convention used by AlphaFold-style structure predictors:

pLDDT	Interpretation
> 90	Very high confidence. Backbone and many side-chain placements are likely reliable.
70 to 90	Useful model. Backbone geometry is often reliable, but details may need checking.
50 to 70	Low confidence. Use for rough hypotheses, not residue-level conclusions.
< 50	Very low confidence. The region may be disordered, flexible, or outside the model's reliable regime.
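
The bands in the table above can be applied to a list of per-residue pLDDT values to get the mean, minimum, and maximum along with a count per band. This is a minimal sketch with hypothetical function names. It assumes the scores are already loaded as a flat list of numbers (the layout of the pLDDT JSON is not specified here), and it assigns the boundary values 90, 70, and 50 to the "70 to 90", "70 to 90", and "50 to 70" bands respectively, which is a choice the table leaves open.

```python
from statistics import mean

BANDS = ("very high", "useful", "low", "very low")


def plddt_band(score):
    """Map one pLDDT value (0 to 100) to a confidence band from the table."""
    if score > 90:
        return "very high"
    if score >= 70:  # assumption: exactly 70 counts as the upper band
        return "useful"
    if score >= 50:  # assumption: exactly 50 counts as "low", not "very low"
        return "low"
    return "very low"


def summarize_plddt(scores):
    """Return mean, min, max, and the number of residues in each band."""
    if not scores:
        raise ValueError("scores must contain at least one value")
    counts = {band: 0 for band in BANDS}
    for score in scores:
        counts[plddt_band(score)] += 1
    return {
        "mean": mean(scores),
        "min": min(scores),
        "max": max(scores),
        "bands": counts,
    }
```

A high mean can hide a stretch of very low confidence residues, so the per-band counts are often more informative than the mean alone.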

For complexes, pLDDT and iPTM answer different questions. A heterodimer can have confident individual chains and still have an uncertain interface. In that case, the model may know each fold but not how the chains pack together. Pair-chain iPTM and PAE are the best outputs to inspect before using the interface in docking, mutagenesis planning, or binder design.
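
One way to inspect an interface with the optional PAE output is to average the PAE values in the off-diagonal block for each chain pair. The sketch below uses hypothetical function names and rests on assumptions that are not confirmed above: the PAE JSON has already been loaded into a square list of lists, and residues are indexed in the order the chains were submitted. A low mean between two chains is consistent with a confident relative placement, and a high mean suggests the interface should be treated cautiously.

```python
def chain_ranges(chain_lengths):
    """Return the residue index range of each chain in the concatenated order."""
    ranges, start = [], 0
    for length in chain_lengths:
        if length <= 0:
            raise ValueError("chain lengths must be positive")
        ranges.append(range(start, start + length))
        start += length
    return ranges


def mean_interchain_pae(pae, chain_lengths):
    """Mean PAE for every ordered chain pair (i, j) with i != j.

    PAE is not symmetric in general, so (i, j) and (j, i) are reported separately.
    """
    total = sum(chain_lengths)
    if len(pae) != total or any(len(row) != total for row in pae):
        raise ValueError("PAE matrix must be square and match the total length")
    blocks = chain_ranges(chain_lengths)
    result = {}
    for i, rows in enumerate(blocks):
        for j, cols in enumerate(blocks):
            if i == j:
                continue
            values = [pae[r][c] for r in rows for c in cols]
            result[(i, j)] = sum(values) / len(values)
    return result
```

For a two-chain job with chain lengths `[120, 85]`, the result has two entries keyed `(0, 1)` and `(1, 0)`.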

How ESMFold2 works

Biohub's ESMFold2 starts from ESMC, a 6-billion-parameter protein language model trained on billions of protein sequences. The language model learns statistical regularities from protein evolution; ESMFold2 then uses those representations to guide a diffusion-based structure prediction model.

The diffusion component samples atomic structure rather than producing a single deterministic coordinate set in one pass. `Sampling steps` controls how many denoising steps are used, while `Number of loops` controls iterative refinement. ProteinIQ returns one ranked prediction by default so the result remains easy to inspect and download.

For multi-chain inputs, each sequence is passed as a separate protein chain in one structure prediction request. The returned mmCIF contains the predicted assembly, while pTM, iPTM, PAE, and pair-chain iPTM help identify whether the predicted interface is strong enough to use downstream.

When to use ESMFold2 vs alternatives

Tool	Best fit	Tradeoff
ESMFold2	Protein-only structure prediction and protein-protein complexes from sequence	Does not accept ligands, nucleic acids, or MSA input in the ProteinIQ form.
ESMFold	Fast single-sequence structure prediction with familiar pLDDT output	Aims for lower accuracy than ESMFold2, particularly on difficult complexes.
AlphaFold 2	MSA-assisted protein folding and established AlphaFold-style workflows	MSA generation adds runtime and can be less convenient for designed or orphan proteins.
Boltz-2	Protein, ligand, DNA, RNA, and affinity-aware complex prediction	Better fit for mixed biomolecular complexes, not the simplest protein-only prediction.
Chai-1	Multi-component protein, nucleic acid, and ligand structure prediction	Useful for broad biomolecular inputs, but not focused on ESMFold2's protein-language-model route.

ESMFold2 is a strong first choice when the input is protein-only and the key question is whether one or more chains can form a plausible folded structure. For protein-ligand binding or protein-nucleic acid assemblies, Boltz-2, Chai-1, OpenFold 3, or RosettaFold3 are more appropriate.

After prediction, common follow-up checks include MolProbity for geometry validation, DSSP for secondary structure assignment, USAlign for comparison against an experimental structure, and PDB Fixer when a downstream simulation or docking tool needs standardized coordinates.

Related tools

AlphaFold2

AlphaFold2 via ColabFold for high-accuracy protein structure prediction. Uses the MMseqs2 API for MSA generation, so no local databases are required. Supports monomer and multimer prediction.

RosettaFold3

Open-source structure prediction neural network for proteins, nucleic acids, and small molecules. State-of-the-art accuracy with multi-chain support.

AlphaFlow

Generate protein conformational ensembles with ESMFlow, the single-sequence member of the AlphaFlow model family. Produces multiple diverse structures showing protein flexibility and dynamics.

ABodyBuilder3

ABodyBuilder3 predicts antibody variable-domain structures from paired heavy and light chain sequences. It returns a PDB structure and, for the pLDDT checkpoint, per-residue confidence values.

Boltz-2

Boltz-2 is a biomolecular foundation model for structure and binding affinity prediction. Supports proteins, ligands, DNA, and RNA in multi-component complexes. Automatically scales GPU resources for large complexes. Predicts binding affinity with accuracy approaching FEP while running about 1000x faster.

Chai-1

Chai-1 is a multi-modal foundation model for molecular structure prediction. Predicts 3D structures for proteins, ligands, DNA, RNA, and multi-component complexes with high accuracy.

ESMFold

ESMFold is a fast, single-sequence protein structure predictor from Meta AI. It predicts 3D protein structures directly from amino acid sequences without requiring multiple sequence alignments (MSAs), which makes it significantly faster than AlphaFold, and it automatically scales GPU resources for larger proteins.

ImmuneBuilder

ImmuneBuilder predicts 3D structures of immune receptor proteins including antibodies, nanobodies, and T-cell receptors. It uses ABodyBuilder2, NanoBodyBuilder2, and TCRBuilder2/TCRBuilder2+ to generate structures with per-residue error estimates and optional ensemble artifacts.

IntelliFold 2

Controllable biomolecular structure prediction model for proteins, ligands, DNA, RNA, and multi-component complexes. IntelliFold 2 supports fast v2-Flash inference, optional MSA generation, and ranked confidence outputs.

LMI4Boltz

LMI4Boltz is a low-memory fork of Boltz for biomolecular structure and binding affinity prediction. It preserves Boltz inference behavior while reducing VRAM use with in-place pair updates, CPU offload, reduced precision pair representation, and aggressive chunking.
