AlphaFlow

Generate protein conformational ensembles from sequence or structure inputs.

Related tools

AlphaFold2

AlphaFold2 via ColabFold for high-accuracy protein structure prediction. Uses the MMseqs2 API for MSA generation, so no local databases are required. Supports monomer and multimer prediction.

Boltz-2

Boltz-2 is a biomolecular foundation model for structure and binding affinity prediction. Supports proteins, ligands, DNA, and RNA in multi-component complexes. Automatically scales GPU resources for large complexes. Predicts binding affinity with near-FEP accuracy while running about 1000x faster.

Chai-1

Chai-1 is a multi-modal foundation model for molecular structure prediction. Predicts 3D structures for proteins, ligands, DNA, RNA, and multi-component complexes with high accuracy.

ESMFold

ESMFold is a fast, single-sequence protein structure predictor from Meta AI. It predicts 3D protein structures directly from amino acid sequences without requiring multiple sequence alignments (MSA), which makes it significantly faster than AlphaFold. GPU resources scale automatically for larger proteins.

ImmuneBuilder

ImmuneBuilder predicts 3D structures of immune receptor proteins including antibodies, nanobodies, and T-cell receptors. It uses ABodyBuilder2, NanoBodyBuilder2, and TCRBuilder2/TCRBuilder2+ to generate structures with per-residue error estimates and optional ensemble artifacts.

IntelliFold 2

Controllable biomolecular structure prediction model for proteins, ligands, DNA, RNA, and multi-component complexes. IntelliFold 2 supports fast v2-Flash inference, optional MSA generation, and ranked confidence outputs.

LMI4Boltz

LMI4Boltz is a low-memory fork of Boltz for biomolecular structure and binding affinity prediction. It preserves Boltz inference behavior while reducing VRAM use with in-place pair updates, CPU offload, reduced precision pair representation, and aggressive chunking.

MiniFold

MiniFold is a fast single-sequence protein structure predictor that is 10-20x faster than ESMFold. It predicts 3D protein structures directly from amino acid sequences without requiring multiple sequence alignments (MSA), making it ideal for rapid structure prediction.

OpenFold-3

OpenFold-3 is an open-source AI model for biomolecular structure prediction, aiming to reproduce AlphaFold3. Predicts 3D structures for proteins, RNA, DNA, and small molecule ligands with high accuracy.

Protenix

Open-source AlphaFold 3 implementation by ByteDance for biomolecular structure prediction. Predicts 3D structures for proteins, RNA, DNA, and small molecule ligands with high accuracy.

What is AlphaFlow?

Protein structures in the PDB and in molecular dynamics simulations rarely describe a single rigid shape. Loops open and close, termini wander, domains breathe, and binding sites can appear only in a subset of conformations. AlphaFlow models that variability by turning AlphaFold-style structure prediction into a generative ensemble method.

AlphaFlow, developed by Jing, Berger, and Jaakkola, comes in two model families. AlphaFlow builds on AlphaFold and uses multiple sequence alignments (MSAs). ESMFlow builds on ESMFold and runs from sequence alone. ProteinIQ supports both: ESMFlow for single-sequence ensemble generation, and AlphaFlow for MSA-backed ensemble generation.

For a single best-guess structure, ESMFold or AlphaFold 2 is usually the clearer choice. AlphaFlow is useful when the question is about plausible conformational spread rather than one static model.

How to use AlphaFlow online

To run AlphaFlow online in ProteinIQ, select an input mode, provide one protein sequence, and choose a matching model variant. The job generates a conformational ensemble as downloadable PDB files. ESMFlow mode needs only a sequence. AlphaFlow mode accepts the same sequence plus an optional .a3m multiple sequence alignment, and generates the alignment automatically when none is uploaded.

Inputs

| Input | Accepted format | Notes |
| --- | --- | --- |
| Protein sequence | FASTA, raw amino acid sequence, .fasta, .fa, or .txt | One protein sequence between 10 and 1600 residues. Standard amino acids are expected. |
| Pre-computed MSA | Optional .a3m | Appears only after AlphaFlow input mode is set to AlphaFlow. The uploaded alignment must be valid A3M with the query sequence first. Invalid .a3m files fail clearly rather than falling back to generated MSA. |
| RCSB fetch | PDB ID, for example 1UBQ | ProteinIQ fetches FASTA from RCSB when a PDB ID is used. The resulting sequence, not the PDB coordinates, is used for ESMFlow inference. |
| Job name | Text | Optional label for finding the run later. |

AlphaFlow on ProteinIQ is designed for single-chain protein sequences. Multimer interfaces, ligand-bound states, membranes, cofactors, and chain-specific contacts are not modeled directly from the input. When a PDB ID is fetched from RCSB, the sequence is used as the input and the deposited coordinates are not used as a structural template.

A3M validation

Uploaded .a3m alignments are checked before the job starts. ProteinIQ accepts permissive A3M content, but rejects files that would make the AlphaFold-style feature pipeline ambiguous.

| Check | Requirement |
| --- | --- |
| FASTA headers | Each record starts with a non-empty header line beginning with >. |
| Sequence records | Every header has at least one sequence line. |
| Characters | Sequences use plausible A3M characters: amino acid letters, lowercase insertion letters, -, ., or *. |
| Aligned length | After stripping lowercase insertion characters, every record has the same aligned column length. |
| Query sequence | The first record has a non-empty ungapped query sequence after removing lowercase insertions and gap-like characters. |
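
The checks in the table above can be approximated locally before uploading an alignment. The following is a minimal sketch, not ProteinIQ's actual validator: the function names `parse_a3m` and `validate_a3m` are hypothetical, and the allowed alphabet (any ASCII letter plus `-`, `.`, `*`) is an assumption about what "plausible A3M characters" means.

```python
import string

# Assumption: "plausible A3M characters" means ASCII letters plus these gap-like symbols.
GAP_CHARS = set("-.*")
ALLOWED = set(string.ascii_letters) | GAP_CHARS


def parse_a3m(text):
    """Split A3M text into a list of (header, sequence) records."""
    records = []
    header, chunks = None, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        else:
            if header is None:
                raise ValueError("sequence data found before the first '>' header")
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def validate_a3m(text):
    """Return the ungapped query sequence, or raise ValueError on the first failed check."""
    records = parse_a3m(text)
    if not records:
        raise ValueError("no records found")
    aligned_lengths = set()
    for header, seq in records:
        if not header:
            raise ValueError("empty header line")
        if not seq:
            raise ValueError(f"record {header!r} has no sequence line")
        bad = sorted(set(seq) - ALLOWED)
        if bad:
            raise ValueError(f"record {header!r} has unexpected characters: {bad}")
        # Lowercase letters are insertions and do not occupy an aligned column.
        aligned_lengths.add(sum(1 for c in seq if not c.islower()))
    if len(aligned_lengths) != 1:
        raise ValueError("records differ in aligned column length")
    # The query is the first record with insertions and gap-like characters removed.
    query = "".join(c for c in records[0][1] if c.isupper())
    if not query:
        raise ValueError("query sequence is empty after removing gaps and insertions")
    return query
```

Calling `validate_a3m(open("alignment.a3m").read())` on a local file (the filename is only an example) either returns the query sequence or raises with a message naming the failed check.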

If no .a3m is uploaded in AlphaFlow mode, ProteinIQ generates an MSA server-side with an MMseqs2 workflow compatible with ColabFold-style A3M output. The generated alignment is returned with the results so the run can be inspected or repeated.

Before running

  • Trim expression tags and purification handles: Flexible His-tags or long artificial linkers can dominate the ensemble and make biologically relevant motion harder to see.
  • Keep domains interpretable: Multi-domain proteins can be useful, but long flexible linkers may cause large global rearrangements. Running individual domains can make local flexibility easier to inspect.
  • Check the sequence alphabet: Ambiguous residues such as X, B, Z, or U should be resolved before inference when possible.
  • Start with 10 samples: The default gives enough structures to spot repeated motion without making the first run slow. Increase to 25 or 50 when comparing domain movement, loop states, or binding-site exposure.
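
Several of the checks above can be run on a sequence before submitting. This is a rough sketch under stated assumptions: `preflight` and `read_single_sequence` are hypothetical helpers, the 10 to 1600 residue range comes from the Inputs table, and treating six consecutive histidines as a possible His-tag is a heuristic chosen here for illustration.

```python
STANDARD = set("ACDEFGHIKLMNPQRSTVWY")
AMBIGUOUS = set("XBZU")
MIN_LEN, MAX_LEN = 10, 1600  # accepted range from the Inputs table
HIS_TAG = "HHHHHH"           # assumption: six consecutive His suggests a purification tag


def read_single_sequence(text):
    """Accept raw sequence text or a single FASTA record and return the sequence."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if sum(ln.startswith(">") for ln in lines) > 1:
        raise ValueError("expected exactly one protein sequence")
    return "".join(ln for ln in lines if not ln.startswith(">")).upper()


def preflight(text):
    """Return (sequence, warnings) where warnings is a list of human-readable notes."""
    seq = read_single_sequence(text)
    warnings = []
    if not MIN_LEN <= len(seq) <= MAX_LEN:
        warnings.append(f"length {len(seq)} is outside {MIN_LEN}-{MAX_LEN} residues")
    ambiguous = sorted(set(seq) & AMBIGUOUS)
    if ambiguous:
        warnings.append("ambiguous residues present: " + ", ".join(ambiguous))
    unknown = sorted(set(seq) - STANDARD - AMBIGUOUS)
    if unknown:
        warnings.append("unexpected characters present: " + ", ".join(unknown))
    if HIS_TAG in seq:
        warnings.append("possible His-tag; consider trimming before running")
    return seq, warnings
```

An empty warnings list does not guarantee a useful ensemble. It only rules out the input problems listed above.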

Human ubiquitin (1UBQ, 76 residues) is a useful sanity-check target because the beta-grasp fold should remain compact while termini and surface loops vary modestly across samples. A run where the core unfolds or most samples look unrelated is a warning sign to inspect the input sequence and settings.

Model settings

| Setting | Default | Description |
| --- | --- | --- |
| AlphaFlow input mode | ESMFlow | Controls both the input form and the available model variants. ESMFlow uses sequence only. AlphaFlow shows the optional .a3m upload and otherwise generates an MSA server-side. |
| Model variant | ESMFlow MD Base in ESMFlow mode, AlphaFlow MD Base in AlphaFlow mode | Selects the checkpoint. The menu is scoped to the selected input mode, so ESMFlow variants appear only in ESMFlow mode and AlphaFlow variants appear only in AlphaFlow mode. |
| Number of samples | 10 | Number of PDB conformations to generate, from 1 to 50. More samples give a better view of ensemble spread, but runtime scales roughly linearly with sample count. |

MD models learn conformational variation from molecular dynamics trajectories. PDB models learn diversity across experimental structures. Base models are the full checkpoints. Distilled models are faster and automatically use the distilled inference settings.

Choosing a model variant

| Variant | Best fit | Practical tradeoff |
| --- | --- | --- |
| ESMFlow MD Base | Studying thermal flexibility, mobile loops, termini, and domain breathing around physiological conditions | Recommended default. Slower than distilled models, but closest to the full MD-trained ESMFlow behavior. |
| ESMFlow PDB Base | Looking for experimental-state diversity, such as alternative crystal forms or condition-dependent conformations | Can reflect broader structural heterogeneity than MD-like local motion. |
| ESMFlow MD Distilled | Fast exploratory ensemble generation when many sequences need screening | Faster, with some accuracy loss relative to the base model. Uses No diffusion and Noisy first automatically. |
| ESMFlow PDB Distilled | Fast comparison of experimental-ensemble-like conformations | Useful for triage, not the best choice for final structural interpretation. |
| AlphaFlow MD Base | MSA-backed molecular-dynamics-like ensembles | Uses AlphaFold-style MSA features. Slower and more expensive than ESMFlow, but closer to the original AlphaFlow method. |
| AlphaFlow MD Distilled | Faster MSA-backed MD-like exploratory ensembles | Uses the distilled inference mode. |
| AlphaFlow PDB Base | MSA-backed experimental-state ensemble diversity | Enables Self conditioning and MSA resampling by default for PDB-style sampling. |
| AlphaFlow PDB Distilled | Faster MSA-backed PDB-like ensembles | Uses PDB-style sampling plus the distilled inference mode. |

Advanced settings

| Setting | Default | Description |
| --- | --- | --- |
| Inference steps | 10 | Number of denoising steps, from 2 to 50. The standard setting is 10. Lower values truncate generation and can reduce diversity. Higher values cost more runtime. |
| Diversity (tmax) | 1.0 | Starting point in the generation schedule, from 0.2 to 1.0. 1.0 samples from the full schedule. Lower values keep samples closer together. |
| Self conditioning | Off | Enables --self_cond. AlphaFlow PDB variants enable it automatically for PDB-style sampling. |
| No diffusion | Off | Enables --no_diffusion. Distilled variants enable this automatically. |
| Noisy first | Off | Enables --noisy_first. Distilled variants enable this automatically. |

The two controls that most users should change first are Number of samples and Model variant. Inference steps, Diversity (tmax), and the boolean inference flags are better treated as reproducibility controls. Changing them changes the sampling procedure, not just the display.
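
When runs are compared later, it helps to record the settings in one place together with the flags that a variant switches on automatically. The sketch below is illustrative only: `RunSettings` is a hypothetical class, not a ProteinIQ API, and the variant string pattern is extrapolated from the two identifiers shown in the Results section (`esmflow_md_base`, `esmflow_pdb_distilled`), so the `alphaflow_...` names are an assumption.

```python
from dataclasses import dataclass


@dataclass
class RunSettings:
    """Hypothetical record of one run's settings, with the documented defaults."""
    mode: str = "esmflow"      # "esmflow" or "alphaflow"
    training: str = "md"       # "md" or "pdb"
    distilled: bool = False
    samples: int = 10          # documented range: 1 to 50
    steps: int = 10            # documented range: 2 to 50
    tmax: float = 1.0          # documented range: 0.2 to 1.0
    self_cond: bool = False
    no_diffusion: bool = False
    noisy_first: bool = False

    def resolved(self):
        """Validate ranges and apply the automatic flags described in the tables."""
        if self.mode not in ("esmflow", "alphaflow"):
            raise ValueError(f"unknown mode: {self.mode}")
        if self.training not in ("md", "pdb"):
            raise ValueError(f"unknown training set: {self.training}")
        if not 1 <= self.samples <= 50:
            raise ValueError("samples must be between 1 and 50")
        if not 2 <= self.steps <= 50:
            raise ValueError("steps must be between 2 and 50")
        if not 0.2 <= self.tmax <= 1.0:
            raise ValueError("tmax must be between 0.2 and 1.0")
        tier = "distilled" if self.distilled else "base"
        # AlphaFlow PDB variants enable self conditioning automatically.
        auto_self_cond = self.mode == "alphaflow" and self.training == "pdb"
        return {
            "variant": f"{self.mode}_{self.training}_{tier}",  # naming is an assumption
            "samples": self.samples,
            "steps": self.steps,
            "tmax": self.tmax,
            "self_cond": self.self_cond or auto_self_cond,
            # Distilled variants enable both of these automatically.
            "no_diffusion": self.no_diffusion or self.distilled,
            "noisy_first": self.noisy_first or self.distilled,
        }
```

Storing the resolved dictionary next to the downloaded PDB files makes it clear whether two ensembles differ because of the protein or because of the sampling procedure.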

Results

ESMFlow returns one primary PDB file per sample. AlphaFlow returns a canonical multi-model PDB containing the full sampled ensemble, plus individual sample PDB files for convenient viewing. When AlphaFlow uses an uploaded or generated .a3m, that alignment is returned as a secondary downloadable file for reproducibility.

| Output | Meaning |
| --- | --- |
| sample_01.pdb, sample_02.pdb, ... | Individual predicted conformations in PDB format. Each file is a complete protein model for the same input sequence. |
| {name}.pdb | AlphaFlow modes only. Multi-model PDB containing the sampled ensemble. |
| {name}.a3m | AlphaFlow modes only. Uploaded or generated MSA used for inference. |
| Model variant | Checkpoint used for the run, such as esmflow_md_base or esmflow_pdb_distilled. |
| Sample index | Position of the sampled conformation in the generated ensemble. It is not a confidence rank. |
| Inference settings | steps, tmax, self_cond, no_diffusion, and noisy_first are recorded so runs can be compared later. |
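
For downstream analysis, a multi-model PDB can be read into a coordinate array with a few lines of Python. The sketch below relies only on the standard fixed-column PDB layout (MODEL/ENDMDL records, atom name in columns 13 to 16, coordinates in columns 31 to 54) and keeps C-alpha atoms. The function name `read_ca_models` is hypothetical, and alternate locations or multiple chains are not handled.

```python
import numpy as np


def read_ca_models(pdb_text):
    """Return C-alpha coordinates as an array of shape (n_models, n_residues, 3)."""
    models, current = [], []
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record == "MODEL":
            current = []
        elif record == "ENDMDL":
            models.append(current)
            current = []
        elif record == "ATOM" and line[12:16].strip() == "CA":
            # Fixed-width PDB columns for x, y, z.
            current.append([float(line[30:38]), float(line[38:46]), float(line[46:54])])
    if current:
        # Single-model file without MODEL/ENDMDL records.
        models.append(current)
    if len({len(m) for m in models}) > 1:
        raise ValueError("models contain different numbers of C-alpha atoms")
    return np.array(models, dtype=float)
```

The same function works on an individual sample file, which then yields an array with a single model. Stacking the per-sample arrays with `np.concatenate` gives the same layout as the multi-model file.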

AlphaFlow does not return a pLDDT confidence table or a binding score. The value of the run is in the ensemble itself: how similar the structures are, which regions move, and whether the sampled conformations suggest more than one plausible state.

| Pattern in the ensemble | Likely interpretation | Follow-up check |
| --- | --- | --- |
| Core secondary structure overlaps across most samples | The model has a consistent fold hypothesis | Inspect loops, termini, and domain interfaces rather than whole-protein RMSD alone. |
| One loop repeatedly opens or closes | The loop may be mobile or weakly constrained by sequence context | Compare with known active-site loops, disorder predictions, or mutagenesis data. |
| Two domains stay folded but change relative orientation | Hinge-like motion may be plausible | Align one domain first, then measure displacement of the other domain. |
| Only one sample shows an extreme rearrangement | The structure is more likely an outlier than a stable alternate state | Look for the same movement in additional samples before using it for docking or design. |
| Most samples are globally inconsistent | The input may be too long, disordered, or outside the model's reliable regime | Try domain-level constructs or compare a single-structure prediction with ESMFold. |

Interpreting AlphaFlow ensembles

An AlphaFlow output should be read as a set of plausible conformations, not as a ranked list where sample 1 is best. The sample index only records generation order.

Large differences across samples usually point to regions where the model sees conformational uncertainty or mobility. In practice, the most useful checks are visual:

  • Stable core: Helices and sheets that overlap across samples are more likely to represent a consistent fold.
  • Flexible loops and termini: Disordered ends and solvent-exposed loops often spread widely. That spread is expected and should not be treated as a modeling failure by itself.
  • Domain motion: If two domains remain internally stable but shift relative to each other, the ensemble may suggest hinge-like motion.
  • Binding-site exposure: A pocket that appears in only some samples can be useful for hypothesis generation, especially before docking or mutagenesis planning.
  • Outlier samples: A single highly distorted sample should not outweigh the rest of the ensemble. Inspect whether the same movement appears repeatedly.

Quantitative RMSD analysis is useful after the job, but RMSD needs a reference choice. Whole-protein RMSD can be dominated by flexible tails. For many proteins, aligning the stable core first and then measuring loop or domain movement gives a clearer interpretation.
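
One way to do this is to superpose every sample onto a reference using only the core residues, then report the RMSD of the loop or domain of interest. The sketch below uses the Kabsch algorithm with NumPy on C-alpha coordinates. The function names, the choice of sample 0 as reference, and the index lists are assumptions for illustration. Which residues count as "core" is a judgment the user has to make.

```python
import numpy as np


def kabsch(mobile, target):
    """Return (R, t) so that mobile @ R.T + t is the least-squares fit onto target."""
    mobile_center = mobile.mean(axis=0)
    target_center = target.mean(axis=0)
    covariance = (mobile - mobile_center).T @ (target - target_center)
    u, _, vt = np.linalg.svd(covariance)
    # Guard against an improper rotation (reflection).
    sign = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rotation = vt.T @ np.diag([1.0, 1.0, sign]) @ u.T
    translation = target_center - rotation @ mobile_center
    return rotation, translation


def region_rmsd(ensemble, core_idx, region_idx, ref=0):
    """Align each model on core_idx, then return the RMSD of region_idx per model.

    ensemble has shape (n_models, n_residues, 3); indices are zero-based.
    """
    reference = ensemble[ref]
    values = []
    for model in ensemble:
        rotation, translation = kabsch(model[core_idx], reference[core_idx])
        moved = model @ rotation.T + translation
        diff = moved[region_idx] - reference[region_idx]
        values.append(np.sqrt((diff ** 2).sum(axis=1).mean()))
    return np.array(values)
```

Because the reference is one of the samples, the values describe spread relative to that sample rather than error against an experimental structure. Repeating the calculation with a different `ref` is a quick check that the conclusion does not depend on the reference choice.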

How AlphaFlow works

AlphaFlow trains structure predictors as flow-matching models. During training, the model sees an interpolation between a noisy protein backbone representation and a target structure:

$$x_t = (1 - t) \cdot x_0 + t \cdot x_1$$

Here, $x_0$ is sampled from a harmonic prior, $x_1$ is the target structure, and $t$ marks progress along the denoising path. At inference time, the model starts from a noisy configuration and repeatedly predicts a cleaner structure along the schedule.
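
The interpolation and the iterative sampling loop can be illustrated with a toy example. This is a sketch under loud assumptions, not the AlphaFlow implementation: the `denoiser` argument stands in for the structure predictor, plain Gaussian noise replaces the harmonic prior, and the structural alignment between steps is omitted. The update rule is the one that follows algebraically from the linear interpolation above when moving from time $t$ to a later time $s$.

```python
import numpy as np


def interpolate(x0, x1, t):
    """Linear path between noise x0 (t = 0) and target x1 (t = 1)."""
    return (1.0 - t) * x0 + t * x1


def sample(denoiser, x0, steps=10):
    """Walk from noise toward a structure by repeatedly predicting the clean target.

    denoiser(x, t) is a stand-in for the network and returns an estimate of x1.
    """
    x = x0
    for n in range(steps):
        t = n / steps
        s = (n + 1) / steps
        x1_hat = denoiser(x, t)
        # Move along the straight line from the current point toward the prediction.
        x = ((s - t) / (1.0 - t)) * x1_hat + ((1.0 - s) / (1.0 - t)) * x
    return x


rng = np.random.default_rng(0)
target = rng.normal(size=(76, 3))   # toy "structure": 76 points in 3D
noise = rng.normal(size=(76, 3))    # assumption: Gaussian noise instead of a harmonic prior
final = sample(lambda x, t: target, noise, steps=10)
```

With an oracle denoiser that always returns the true target, the loop lands exactly on the target at the last step because the weight on the prediction becomes 1 when $s = 1$. A real model returns a different prediction at each step, and the random starting point is what makes repeated runs produce different conformations.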

ESMFlow applies the same idea to ESMFold. Because ESMFold is a single-sequence model, ESMFlow can run without MSA generation. That makes it practical for fast web execution and for proteins where homologous sequence coverage is weak.

AlphaFlow mode uses AlphaFold-style MSA features. A supplied .a3m file is used directly after validation. Without an upload, ProteinIQ generates the alignment first, then writes it into the directory layout expected by the AlphaFlow inference code before sampling structures. That extra MSA step is the main reason AlphaFlow mode is slower than ESMFlow mode.

Training data behind the model variants

| Training set | What it captures | Interpretation consequence |
| --- | --- | --- |
| MD | Molecular dynamics trajectories from ATLAS at 300 K | Better for local thermal motion, loop mobility, and dynamic fluctuations around an existing fold. |
| PDB | Experimental structures deposited under different conditions | Better for alternative crystallographic or cryo-EM states, ligand-bound differences, and broader experimental heterogeneity. |

The distinction matters. An MD-trained model may emphasize physically local fluctuations. A PDB-trained model may sample larger state changes if similar state diversity appears in experimental structures.

When to use AlphaFlow vs alternatives

| Tool | Best use | Difference from AlphaFlow |
| --- | --- | --- |
| AlphaFlow | Generating a conformational ensemble from one protein sequence | Produces multiple plausible structures rather than one best structure. |
| ESMFold | Fast single-structure prediction without MSA generation | Better when a single model and pLDDT-style confidence are needed. |
| AlphaFold 2 | High-accuracy structure prediction where MSA information is helpful | Usually stronger for a single final structure, but not designed to sample an ensemble. |
| AF-Cluster | Exploring alternate AlphaFold conformations by clustering subsampled MSA predictions | Useful when MSA diversity may reveal different states, but it still depends on AlphaFold-style prediction runs. |
| Boltz-2 | Structure prediction for complexes involving proteins, ligands, DNA, or RNA | Better for multimolecular complexes and binding affinity workflows. |
| MD trajectory analysis | Measuring flexibility from an existing molecular dynamics trajectory | Works from simulated trajectory data instead of generating conformations from sequence. |

AlphaFlow fits best between structure prediction and simulation. It can suggest flexible regions worth checking before setting up molecular dynamics, docking, mutagenesis, or construct design. It should not replace experimental dynamics data when NMR, HDX-MS, cryo-EM heterogeneity, or high-quality MD trajectories are available.

Practical limits

  • MSA generation can dominate runtime: AlphaFlow modes require MSA features. Uploaded .a3m files make runs more reproducible; automatic MMseqs2 generation depends on external search availability.
  • No ligand or multimer conditioning: The input is a protein sequence. Ligands, cofactors, membranes, partner chains, and DNA or RNA binding partners are not included in generation.
  • Backbone-centered ensemble interpretation: The method is most useful for backbone conformational diversity. Side-chain packing and small binding-site rearrangements need follow-up validation.
  • Long sequences cost more: Memory and runtime increase sharply with sequence length. ProteinIQ accepts up to 1600 residues, but shorter constructs are usually easier to interpret.
  • Samples are hypotheses: Repeated motion across several samples is more informative than a single unusual conformation. Experimental or simulation evidence is still needed before claiming a functional state.