What is Boltz-2?
Boltz-2 is a biomolecular foundation model that jointly predicts molecular complex structures and protein-ligand binding affinities. Developed by MIT Jameel Clinic and Recursion and released in June 2025, it extends Boltz-1—the most widely used open-source alternative to AlphaFold3 across academia and industry.
The model achieves binding affinity predictions approaching the accuracy of physics-based free-energy perturbation (FEP) calculations while running over 1000x faster. Traditional FEP methods cost hundreds of dollars per prediction and require 6–12 hours to complete; Boltz-2 achieves comparable accuracy in approximately 20 seconds on a single GPU.
Boltz-2 supports multi-component biomolecular complexes including proteins, small molecule ligands, DNA, and RNA. This broad coverage makes it particularly valuable for drug discovery workflows requiring rapid screening of compound libraries before experimental synthesis.
How to use Boltz-2 online
ProteinIQ hosts Boltz-2 on GPU infrastructure, delivering structure and affinity predictions through a browser interface without local installation or command-line configuration.
Inputs
Boltz-2 accepts multiple molecule types that combine into a single prediction job. Chain IDs (A, B, C...) are assigned automatically and displayed in the interface—these identifiers are needed when defining constraints.
| Input | Description |
|---|---|
| Protein | PDB/CIF structure files, FASTA sequences, or RCSB PDB IDs (e.g., 1ABC). Up to 10 chains. |
| Ligand (SMILES) | SMILES strings, SDF/MOL/MOL2 files, or PubChem compound IDs. Up to 10 ligands. |
| Ligand (CCD) | Standard codes from the PDB Chemical Component Dictionary (e.g., ATP, NAD, HEM). |
| DNA/RNA | FASTA sequences or structure files. Up to 10 chains each. |
| Template | Known structures from homologous proteins to guide prediction. CIF files or RCSB IDs. Up to 5 templates. |
| Pre-computed MSA | A3M alignment files. Optional; Boltz-2 generates MSAs automatically when enabled. |
Settings
Prediction parameters
| Setting | Description |
|---|---|
| Number of samples | Structure predictions to generate (1–20, default 5). More samples capture conformational diversity but increase runtime. |
| Confidence threshold | Minimum confidence score for predictions (0.0–1.0, default 0.5). Higher values filter low-confidence results. |
MSA generation
Multiple sequence alignment (MSA) aligns a target protein sequence against thousands of evolutionarily related sequences. By analyzing coevolution patterns—which amino acid positions change together across evolution—the model infers spatial relationships in the 3D structure.
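The idea of positions "changing together" can be illustrated with a classical statistic: the mutual information between two alignment columns. This is only a minimal sketch on a made-up four-sequence alignment. Boltz-2 learns such signals inside its network and does not necessarily compute mutual information; the function name and the toy data are illustrative assumptions.

```python
from collections import Counter
from math import log2


def column_mutual_information(msa, i, j):
    """Mutual information (in bits) between alignment columns i and j."""
    n = len(msa)
    col_i = Counter(seq[i] for seq in msa)
    col_j = Counter(seq[j] for seq in msa)
    pairs = Counter((seq[i], seq[j]) for seq in msa)
    mi = 0.0
    for (a, b), count in pairs.items():
        p_ab = count / n
        # Compare the joint frequency with what independence would predict.
        mi += p_ab * log2(p_ab / ((col_i[a] / n) * (col_j[b] / n)))
    return mi


# Toy alignment (invented for illustration): column 0 is conserved,
# while columns 1 and 3 always change together (K with E, R with D).
toy_msa = ["AKLE", "AKLE", "ARLD", "ARLD"]

coupled = column_mutual_information(toy_msa, 1, 3)    # covarying pair
conserved = column_mutual_information(toy_msa, 0, 1)  # conserved column carries no signal
```

A high score for a column pair hints that the two residues may be in contact in the folded structure, which is the kind of signal a deeper MSA makes easier to detect.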
| Setting | Description |
|---|---|
| Generate MSA | Enables automatic MSA generation via ColabFold. Disable only for quick screening or synthetic proteins lacking natural homologs. |
| MSA depth | Search thoroughness: Shallow (2048 seqs, fast), Normal (8192 seqs, default), or Deep (16384 seqs, slow but thorough). |
| MSA pairing strategy | How MSAs from different chains combine: Greedy (default) or Complete (slower, better for obligate heterodimers). |
| Max MSA sequences | Maximum aligned sequences (512–16384, default 8192). Lower values are faster; higher values may improve accuracy. |
Advanced options
Boltz-2 uses diffusion-based structure generation with iterative refinement ("recycling"). These parameters control that process.
| Setting | Description |
|---|---|
| Recycling steps | Refinement iterations (1–10, default 3). More steps improve accuracy at increased runtime. |
| Sampling steps | Diffusion denoising steps (50–500, default 200). Higher values produce more refined structures. |
| Step scale | Step size tied to the sampling temperature of the diffusion process (1.0–2.0, default 1.5). Lower values increase diversity among samples. |
| Output format | CIF (recommended) or PDB (legacy compatibility). |
| MW-corrected affinity | Applies molecular weight correction when comparing ligands of different sizes. |
| Affinity sampling steps | Diffusion steps for affinity prediction (50–500, default 200). |
| Affinity samples | Independent affinity predictions to average (1–20, default 5). More samples reduce variance. |
Constraints
Constraints guide predictions toward specific structural features when prior knowledge exists about the system—from crystallography, mutagenesis studies, or crosslinking mass spectrometry data.
| Setting | Description |
|---|---|
| Enforce constraints | Must be enabled before defining constraints below. Activates inference-time potentials. |
| Pocket constraints | Forces ligand binding near specified residues. Format: `binder |
| Covalent bonds | Defines covalent bonds between atoms. Format: chain:residue:atom,chain:residue:atom (e.g., A:12:SG,C:1:C1). |
| Contact constraints | Forces residues within a specified distance. Format: `chain:residue,chain:residue |
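Because constraint strings are easy to mistype, it can help to validate them before submitting a job. The following is a minimal sketch that checks the covalent bond format given above (`chain:residue:atom,chain:residue:atom`). The `AtomRef` type and `parse_covalent_bond` function are hypothetical helpers for illustration, not part of Boltz-2 or ProteinIQ.

```python
from typing import NamedTuple


class AtomRef(NamedTuple):
    chain: str
    residue: int
    atom: str


def parse_covalent_bond(text: str) -> tuple[AtomRef, AtomRef]:
    """Parse 'chain:residue:atom,chain:residue:atom' into two atom references."""
    parts = text.strip().split(",")
    if len(parts) != 2:
        raise ValueError(f"expected two atoms separated by a comma: {text!r}")
    atoms = []
    for part in parts:
        fields = part.strip().split(":")
        if len(fields) != 3:
            raise ValueError(f"expected chain:residue:atom, got {part!r}")
        chain, residue, atom = fields
        if not residue.isdigit():
            raise ValueError(f"residue index must be a positive integer: {residue!r}")
        atoms.append(AtomRef(chain, int(residue), atom))
    return atoms[0], atoms[1]


# The example from the table: cysteine SG on chain A bonded to atom C1 of chain C.
first, second = parse_covalent_bond("A:12:SG,C:1:C1")
```

The chain letters must match the IDs shown in the interface, so a check like this only confirms the syntax, not that the referenced atoms exist.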
Modifications
| Setting | Description |
|---|---|
| Residue modifications | Post-translational modifications using CCD codes. Format: chain:position:CCD_code. Common codes: SEP (phosphoserine), TPO (phosphothreonine), MLY (methylated lysine). |
| Cyclic chains | Chains with head-to-tail cyclization. Enter one chain ID per line. |
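As a rough illustration of the `chain:position:CCD_code` format, the sketch below groups modification entries by chain so they can be reviewed before submission. It assumes one entry per line, which the table states only for cyclic chains, and `parse_modifications` is a hypothetical helper rather than anything ProteinIQ provides.

```python
def parse_modifications(text):
    """Group 'chain:position:CCD_code' entries (one per line, assumed) by chain."""
    by_chain = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        fields = line.split(":")
        if len(fields) != 3:
            raise ValueError(f"expected chain:position:CCD_code, got {line!r}")
        chain, position, ccd_code = fields
        by_chain.setdefault(chain, []).append((int(position), ccd_code.upper()))
    return by_chain


# Positions here are invented; the CCD codes are the ones listed in the table.
modifications = parse_modifications("A:15:SEP\nA:42:TPO\nB:7:MLY")
```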
Output
Boltz-2 returns ranked predictions, each with a structure file and associated scores.
| Column | Description |
|---|---|
| Rank | Prediction ranking by confidence. |
| Affinity (log10 IC50) | Quantitative binding strength in log10(IC50) μM. More negative = stronger binding. |
| Confidence | Structural prediction confidence (0–1). |
Interpreting affinity predictions
Boltz-2 outputs two affinity metrics serving different purposes:
- Affinity probability (0–1): Binary classifier confidence that the molecule binds. Values above 0.7 indicate strong predicted binders; below 0.5 suggests unlikely binding. Best used for initial screening to separate hits from non-binders.
- Affinity value (log10 IC50 in μM): Quantitative binding strength. More negative values indicate stronger binding.
| Value | Interpretation | Typical use case |
|---|---|---|
| < −2 | Very strong (IC50 below 10 nM) | Clinical candidates |
| −2 to 0 | Strong (10 nM to 1 μM) | Lead compounds |
| 0 to 2 | Moderate (1 to 100 μM) | Hit compounds |
| > 2 | Weak (above 100 μM) | Likely inactive |
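Since the affinity value is a base-10 logarithm of an IC50 expressed in μM, converting it to more familiar units is simple arithmetic. The sketch below shows the conversions and a two-step triage that filters on affinity probability first and then ranks by affinity value. The dictionary keys, compound names, and numbers are invented for illustration and are not the actual output schema.

```python
def ic50_nanomolar(log10_ic50_um: float) -> float:
    """Convert log10(IC50 in μM) to an IC50 in nM."""
    return (10.0 ** log10_ic50_um) * 1000.0


def pic50(log10_ic50_um: float) -> float:
    """pIC50 = -log10(IC50 in M); 1 μM = 1e-6 M, hence the offset of 6."""
    return 6.0 - log10_ic50_um


def shortlist(candidates, min_probability=0.5):
    """Drop unlikely binders, then rank the rest from strongest to weakest."""
    kept = [c for c in candidates if c["affinity_probability"] >= min_probability]
    return sorted(kept, key=lambda c: c["affinity_value"])


# Made-up predictions; keys are hypothetical.
candidates = [
    {"name": "cmpd-1", "affinity_probability": 0.91, "affinity_value": -1.0},
    {"name": "cmpd-2", "affinity_probability": 0.35, "affinity_value": -2.0},
    {"name": "cmpd-3", "affinity_probability": 0.74, "affinity_value": 0.5},
]

ranked = shortlist(candidates)
```

In this toy data, cmpd-2 has the lowest affinity value but is filtered out because its binding probability falls below 0.5, which shows why the two metrics are used together.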
Interpreting confidence scores
Confidence reflects certainty about the predicted structure, not the affinity estimate. Low confidence does not mean the prediction is wrong—it indicates the result warrants closer examination.
| Score | Interpretation |
|---|---|
| > 0.7 | High confidence—prediction likely reliable |
| 0.5–0.7 | Moderate—visual inspection recommended |
| < 0.5 | Low—may indicate a difficult target, missing MSA data, or unusual binding mode |
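When post-processing many predictions, the bands in the table above can be applied programmatically. This is a minimal sketch; `confidence_band` is a hypothetical helper, and treating a score of exactly 0.7 as "moderate" is an assumption based on reading the table literally.

```python
def confidence_band(score: float) -> str:
    """Map a 0-1 structural confidence score to the bands in the table."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"confidence must lie in [0, 1], got {score}")
    if score > 0.7:
        return "high"
    if score >= 0.5:
        return "moderate"  # visual inspection recommended
    return "low"  # difficult target, missing MSA data, or unusual binding mode


bands = [confidence_band(s) for s in (0.85, 0.62, 0.31)]
```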
How does Boltz-2 work?
Boltz-2 extends Boltz-1 with an affinity prediction module trained on millions of binding measurements. The result is a single model predicting both 3D structures and binding strength.
Architecture
The model processes input in two stages. First, the structure module generates 3D coordinates using diffusion—starting from noise and iteratively refining toward a physically plausible structure. This approach parallels image generation models but operates on molecular geometry.
Second, the affinity module takes the predicted structure and outputs binding predictions: both a binary "does it bind?" probability and a quantitative IC50 estimate.
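The "start from noise and refine" idea can be sketched with a toy loop. This is not the Boltz-2 sampler: the real model uses a trained neural network to predict each denoising step, whereas this example cheats with an oracle that already knows the target coordinates. All names, step sizes, and coordinates are illustrative assumptions.

```python
import random
from math import sqrt


def rmsd(coords, target):
    """Root-mean-square deviation between two equally sized coordinate lists."""
    squared = sum((a - b) ** 2 for p, q in zip(coords, target) for a, b in zip(p, q))
    return sqrt(squared / len(target))


def toy_denoise(target, steps=200, seed=0):
    """Iteratively pull random coordinates toward a target while noise decays."""
    rng = random.Random(seed)
    # Start from pure Gaussian noise: one (x, y, z) triple per atom.
    coords = [[rng.gauss(0.0, 10.0) for _ in range(3)] for _ in target]
    for step in range(steps):
        noise_scale = 1.0 - (step + 1) / steps  # decays linearly to zero
        for atom, goal in zip(coords, target):
            for axis in range(3):
                # Oracle "denoiser": move 10% of the way toward the answer.
                atom[axis] += 0.1 * (goal[axis] - atom[axis])
                # Re-inject a shrinking amount of noise, as samplers often do.
                atom[axis] += rng.gauss(0.0, 0.5) * noise_scale
    return coords


# Three made-up atom positions standing in for a structure.
toy_target = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
final_coords = toy_denoise(toy_target)
final_error = rmsd(final_coords, toy_target)
```

The number of loop iterations plays the role of the "Sampling steps" setting: more steps mean smaller, more gradual corrections.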
Training data
Boltz-2 was trained on approximately 5 million binding affinity measurements (Kd and IC50 values) from public assays, plus molecular dynamics simulations capturing protein flexibility.
The training data spans diverse complex types: protein-ligand, protein-DNA, protein-RNA, and protein-protein interactions. This breadth accounts for Boltz-2's effectiveness on multi-molecule complexes.
Benchmark performance
For structure prediction, Boltz-2 matches or slightly outperforms AlphaFold3, Chai-1, and Boltz-1 on recent PDB deposits (2024–2025), with particular strength on RNA, DNA-protein complexes, and antibody-antigen interactions.
For affinity prediction, Boltz-2 achieves Pearson correlation r ≈ 0.62 on the FEP+ benchmark, which is comparable to physics-based free energy perturbation methods (r ≈ 0.6–0.7) that take hours rather than seconds. In the CASP16 affinity challenge, it outperformed all submitted methods across 140 complexes.
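For readers who want to run the same kind of evaluation on their own data, the Pearson correlation between predicted and measured affinities can be computed as follows. This is a generic sketch of the statistic, not the benchmark's evaluation code, and the sample values are invented.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(var_x * var_y)


# Made-up log10(IC50) values: predictions vs. experimental measurements.
predicted = [-1.2, 0.3, 1.1, -0.4, 0.8]
measured = [-1.5, 0.1, 0.6, 0.2, 1.4]
r = pearson_r(predicted, measured)
```

A value of 1 means predictions rank and scale exactly like the measurements, 0 means no linear relationship, and negative values mean the trend is inverted.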
When to use Boltz-2
Boltz-2 excels in scenarios requiring both structure and affinity prediction in a single workflow.
| Scenario | Recommendation |
|---|---|
| Need structure + quantitative affinity | Boltz-2 |
| Multi-component complexes (protein + DNA + ligand) | Boltz-2 |
| Already have protein structure, need ligand poses only | DiffDock (faster, no affinity) |
| High-throughput screening with known binding site | AutoDock Vina or AutoDock-GPU |
| CNN-enhanced scoring on traditional docking | GNINA |
Comparison with other methods
| Feature | Boltz-2 | DiffDock | AutoDock Vina | FEP+ |
|---|---|---|---|---|
| Predicts structure | Yes | No (pose only) | No (pose only) | No |
| Predicts affinity | Yes (quantitative) | No | Yes (scoring) | Yes (quantitative) |
| Affinity accuracy | r ≈ 0.62 | N/A | r ≈ 0.3–0.4 | r ≈ 0.6–0.7 |
| Speed | ~20 sec | ~30 sec | ~1 min | 6–12 hours |
| Molecule types | Protein, DNA, RNA, ligand | Protein, ligand | Protein, ligand | Protein, ligand |
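The speed figures in the table translate into screening throughput with simple arithmetic. The sketch below uses only the numbers quoted above (about 20 seconds per Boltz-2 prediction, 6 to 12 hours per FEP calculation) and assumes predictions run one after another on a single GPU; the function names are illustrative.

```python
BOLTZ2_SECONDS = 20          # approximate time per prediction on one GPU
FEP_HOURS_RANGE = (6, 12)    # approximate time per FEP calculation


def speedup_range():
    """Speedup of Boltz-2 over FEP at both ends of the FEP time range."""
    return tuple(hours * 3600 / BOLTZ2_SECONDS for hours in FEP_HOURS_RANGE)


def screening_hours(n_compounds, seconds_per_prediction=BOLTZ2_SECONDS):
    """Wall-clock hours to screen a library sequentially on a single GPU."""
    return n_compounds * seconds_per_prediction / 3600


print(speedup_range())            # (1080.0, 2160.0)
library_hours = screening_hours(10_000)
```

The lower bound of 1080x is where the "over 1000x faster" figure comes from.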
Limitations
- Maximum system size: Complexes are limited to approximately 1,400 residues total across all chains. Larger systems may require domain splitting or template guidance.
- Memory scaling: Memory requirements increase quadratically with sequence length.
- MSA dependency: Prediction accuracy correlates with MSA depth (Spearman r ≈ 0.35–0.37). Synthetic or designed proteins lacking natural homologs may yield lower-confidence results.
- Stereochemistry sensitivity: Ligand binding is highly stereospecific. SMILES strings must correctly represent the intended enantiomer.
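The size limit and quadratic memory scaling listed above can be turned into a quick pre-submission check. This is a rough sketch that captures only the scaling trend and ignores constant factors and overhead; the 500-residue reference point and both function names are arbitrary assumptions.

```python
MAX_RESIDUES = 1400  # approximate total limit across all chains


def fits_size_limit(chain_lengths):
    """Check whether the combined chain lengths stay within the limit."""
    return sum(chain_lengths) <= MAX_RESIDUES


def relative_memory(n_residues, reference_residues=500):
    """Memory relative to a reference size, assuming purely quadratic scaling."""
    return (n_residues / reference_residues) ** 2


# Doubling the system size roughly quadruples memory use.
ratio = relative_memory(1000, reference_residues=500)
ok = fits_size_limit([450, 450, 300])   # 1200 residues in total
too_big = fits_size_limit([900, 700])   # 1600 residues in total
```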
Related tools
Structure prediction
- AlphaFold 2: Higher residue limits, no affinity prediction
- Chai-1: Alternative multi-molecule structure prediction
- ESMFold: Fast single-chain prediction without MSA
- OpenFold 3: Open-source AlphaFold3 implementation
- Protenix: ByteDance's AlphaFold3-style model
- RoseTTAFold 3: Baker lab structure prediction
Molecular docking
- DiffDock: Diffusion-based blind docking
- AutoDock Vina: Classical scoring function
- AutoDock-GPU: GPU-accelerated AutoDock
- GNINA: CNN-enhanced docking and scoring
- DynamicBind: Flexible receptor docking
Binding site analysis
- Fpocket: Geometry-based pocket detection
- PocketFlow: Deep learning pocket prediction
- ScanNet: Binding site prediction from sequence
