LMI4Boltz

Low-memory Boltz inference for biomolecular structure and affinity prediction

Related tools

Boltz-2

Boltz-2 is a biomolecular foundation model for structure and binding affinity prediction. Supports proteins, ligands, DNA, and RNA in multi-component complexes. Automatically scales GPU resources for large complexes. Predicts binding affinity with accuracy approaching FEP while running about 1000x faster than FEP.

Chai-1

Chai-1 is a multi-modal foundation model for molecular structure prediction. Predicts 3D structures for proteins, ligands, DNA, RNA, and multi-component complexes with high accuracy.

OpenFold-3

OpenFold-3 is an open-source AI model for biomolecular structure prediction, aiming to reproduce AlphaFold3. Predicts 3D structures for proteins, RNA, DNA, and small molecule ligands with high accuracy.

IntelliFold 2

Controllable biomolecular structure prediction model for proteins, ligands, DNA, RNA, and multi-component complexes. IntelliFold 2 supports fast v2-Flash inference, optional MSA generation, and ranked confidence outputs.

Protenix

Open-source AlphaFold 3 implementation by ByteDance for biomolecular structure prediction. Predicts 3D structures for proteins, RNA, DNA, and small molecule ligands with high accuracy.

RosettaFold3

Open-source structure prediction neural network for proteins, nucleic acids, and small molecules. State-of-the-art accuracy with multi-chain support.

AlphaFlow

Generate protein conformational ensembles with ESMFlow, the single-sequence AlphaFlow model family. Produces multiple diverse structures showing protein flexibility and dynamics.

AlphaFold2

AlphaFold2 via ColabFold for high-accuracy protein structure prediction. Uses the MMseqs2 API for MSA generation, so no local databases are required. Supports monomer and multimer prediction.

ESMfold

ESMfold is a fast, single-sequence protein structure predictor from Meta AI. Predicts 3D protein structures directly from amino acid sequences without requiring multiple sequence alignments (MSA), making it significantly faster than AlphaFold while automatically scaling GPU resources for larger proteins.

ImmuneBuilder

ImmuneBuilder predicts 3D structures of immune receptor proteins including antibodies, nanobodies, and T-cell receptors. It uses ABodyBuilder2, NanoBodyBuilder2, and TCRBuilder2/TCRBuilder2+ to generate structures with per-residue error estimates and optional ensemble artifacts.

What is LMI4Boltz?

Large Boltz jobs often fail for a simple reason: the model can handle the chemistry, but the GPU runs out of memory before inference finishes. LMI4Boltz solves that bottleneck by running Boltz with low-memory inference changes while keeping the same core structure and affinity prediction behavior.

The method is designed for biomolecular complexes that are too large or memory-heavy for standard Boltz-2 settings. It predicts structures for proteins, small molecules, DNA, RNA, and mixed complexes, and can use Boltz-2 affinity prediction for protein-ligand binding. LMI4Boltz is most useful when the scientific question is still a Boltz question, but the system size pushes ordinary inference close to the VRAM limit.

How to use LMI4Boltz online

Run LMI4Boltz online by combining one or more protein, ligand, DNA, or RNA inputs into a single complex prediction job. ProteinIQ accepts FASTA sequences, RCSB protein fetches, SMILES ligands, PubChem ligands, and CCD ligand codes, then returns predicted CIF or PDB structures plus confidence, pLDDT, PAE, PDE, and affinity files when generated.

Inputs

LMI4Boltz builds one prediction target from the submitted molecules. Chain IDs are shown in the input interface and become important when interpreting interface confidence, ligand affinity, and multi-chain contacts.

| Input | Accepted formats | Practical notes |
| --- | --- | --- |
| Protein | FASTA text, .fasta, .fa, .fas, or RCSB fetch | Up to 10 protein chains. Total sequence length across all sequence-based molecules is capped at 5,000 residues in ProteinIQ. |
| Ligand | SMILES text or PubChem fetch | Up to 10 small molecules. Ligands are limited to 300 heavy atoms. Peptide ligands should be entered as protein chains. |
| Ligand (CCD) | PDB Chemical Component Dictionary code, such as ATP, NAD, HEM, or SAH | Best for common cofactors, ions, and ligands already represented in the PDB chemical dictionary. |
| DNA | FASTA text, .fasta, .fa, .fas | Up to 10 DNA chains. Use only canonical DNA sequence characters. |
| RNA | FASTA text, .fasta, .fa, .fas | Up to 10 RNA chains. Useful for RNA-protein, RNA-ligand, and mixed nucleic acid complexes. |

At least one foldable biopolymer is required. Ligand-only jobs are not valid because Boltz predicts complexes, not isolated small-molecule conformer ensembles.
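
The input limits above can be checked before submitting a job. The following is a minimal sketch, not ProteinIQ's own validation: the `Molecule` class and `validate_job` function are hypothetical names, the caps are copied from the input table, and ligand heavy-atom counts are assumed to be computed elsewhere (for example with a cheminformatics toolkit).

```python
from dataclasses import dataclass

# Limits taken from the input table; everything else here is illustrative.
MAX_PER_TYPE = 10
MAX_TOTAL_RESIDUES = 5000
MAX_LIGAND_HEAVY_ATOMS = 300
BIOPOLYMERS = ("protein", "dna", "rna")


@dataclass
class Molecule:
    kind: str             # "protein", "dna", "rna", or "ligand"
    sequence: str = ""    # residues for biopolymers, empty for ligands
    heavy_atoms: int = 0  # ligand heavy-atom count, computed elsewhere


def validate_job(molecules):
    """Return a list of problems; an empty list means the job passes these checks."""
    problems = []
    polymers = [m for m in molecules if m.kind in BIOPOLYMERS]
    ligands = [m for m in molecules if m.kind == "ligand"]

    if not polymers:
        problems.append("at least one protein, DNA, or RNA chain is required")
    for kind in BIOPOLYMERS:
        count = sum(1 for m in polymers if m.kind == kind)
        if count > MAX_PER_TYPE:
            problems.append(f"too many {kind} chains: {count}")
    if len(ligands) > MAX_PER_TYPE:
        problems.append(f"too many ligands: {len(ligands)}")

    total = sum(len(m.sequence) for m in polymers)
    if total > MAX_TOTAL_RESIDUES:
        problems.append(f"total sequence length {total} exceeds {MAX_TOTAL_RESIDUES}")
    for index, ligand in enumerate(ligands):
        if ligand.heavy_atoms > MAX_LIGAND_HEAVY_ATOMS:
            problems.append(f"ligand {index} has {ligand.heavy_atoms} heavy atoms")
    return problems
```

A ligand-only job such as `validate_job([Molecule("ligand", heavy_atoms=20)])` returns the missing-biopolymer problem, while a short protein plus a small ligand returns an empty list.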

Settings

Most jobs should start with the standard settings: one diffusion sample, Boltz-2, no manual memory tuning, and MSA generation enabled only when evolutionary information is needed for protein chains. The low-memory controls are mainly for large complexes that fail or approach the memory limit.

Standard inference

| Setting | Description |
| --- | --- |
| Diffusion samples | Number of structures to generate (1 to 20, default 1). More samples help identify alternative conformations, but runtime and output size increase roughly with sample count. |
| Use MSA server | Generates protein MSAs through the configured ColabFold-compatible server. MSA information can improve protein structure accuracy, especially for natural proteins with homologs. |
| MSA server URL | MSA endpoint used when Use MSA server is on. The standard endpoint is https://api.colabfold.com. |
| MSA pairing strategy | Greedy pairs likely matching sequences across chains quickly. Complete is more exhaustive and can help obligate multimers at higher compute cost. |
| Max MSA sequences | Maximum number of MSA rows retained for protein inference (default 8192). Decreasing this value can reduce memory use; increasing it can help difficult natural proteins. |

Advanced inference

| Setting | Description |
| --- | --- |
| Recycling steps | Iterative refinement passes through the model (default 3). Higher values can improve difficult predictions but increase runtime and memory pressure. |
| Sampling steps | Diffusion denoising steps for structure generation (default 200). More steps can refine geometry, but the gain is usually smaller than adding samples or improving input quality. |
| Step scale | Optional sampling temperature override. Leave empty unless comparing controlled reruns or matching a command-line configuration. |
| Output format | mmCIF by default, with PDB available for compatibility. mmCIF is safer for larger complexes and nonstandard chemistry. |
| Model | Boltz-2 for structure plus affinity prediction. Boltz-1 is structure-only and does not support Boltz-2 method conditioning. |
| Method conditioning | Optional Boltz-2 conditioning for X-ray diffraction, electron microscopy, solution NMR, or molecular dynamics. Leave as None unless the intended structure should resemble a specific experimental modality. |
| Use potentials | Applies inference-time potentials when the native input supports them. Useful when physical pose quality matters and a job shows clashes or unrealistic contacts. |
| MW-corrected affinity | Applies molecular-weight correction to affinity predictions. Most useful when comparing ligands with very different sizes. |
| Affinity sampling steps | Diffusion steps for affinity prediction (default 200). Higher values increase cost and may stabilize difficult ligand predictions. |
| Affinity samples | Number of affinity samples averaged (default 5). More samples reduce variance for ligand ranking. |
| Write full PAE | Saves the full predicted aligned error matrix. Enable for detailed domain placement or interface uncertainty analysis. |
| Write full PDE | Saves the full predicted distance error matrix. Enable when downstream inspection needs residue-pair distance uncertainty. |
| Random seed | Fixed seed for reproducible sampling. Leave blank for normal stochastic inference. |

Low-memory controls

LMI4Boltz reduces peak memory by chunking expensive pair and MSA operations, moving some tensors to host memory, updating pair representations in place, and optionally using reduced precision. Smaller chunks usually reduce memory demand but can add overhead.

| Setting | Default | What it changes |
| --- | --- | --- |
| Transition z chunk size | Empty | Optional chunk size for pair transition operations. Lower values can help very large complexes. |
| MSA transition chunk size | 32 | Chunk size for MSA transition work. Lowering it can reduce memory when deep MSAs are active. |
| Triangle attention chunk size | 128 | Chunk size for triangle attention, one of the memory-heavy parts of pair reasoning. |
| Triangle mult gate chunks | 1 | Number of chunks for the triangle multiplication gate. Increasing chunks can reduce peak memory. |
| Outer product chunk size | 4 | Chunk size for outer product operations linking MSA information to pair features. |
| Chunk threshold | 384 | Token count threshold at which chunked low-memory behavior becomes active. |
| Use bfloat16 | Off | Uses bfloat16 precision where supported. Reduced precision can lower memory, with small numerical differences expected. |
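
The general idea behind the chunk controls can be illustrated with a toy example. This is a sketch in plain Python and not the LMI4Boltz implementation: `row_transition` is a hypothetical stand-in for a per-row operation, and the rule that chunking starts once the row count reaches the threshold is an assumption about how a setting like Chunk threshold behaves. The point is that processing rows in pieces gives the same result as processing them all at once, while only one chunk of intermediates needs to exist at a time.

```python
def chunk_ranges(n, chunk_size):
    """Yield (start, stop) pairs that cover range(n) in pieces of at most chunk_size."""
    for start in range(0, n, chunk_size):
        yield start, min(start + chunk_size, n)


def row_transition(row):
    """Hypothetical per-row operation standing in for a real transition layer."""
    return [max(0.0, 2.0 * x - 1.0) for x in row]


def apply_rows(rows, chunk_size=None, chunk_threshold=384):
    """Apply row_transition to every row, chunking only for large inputs."""
    n = len(rows)
    if chunk_size is None or n < chunk_threshold:
        # Small input or chunking disabled: process everything in one pass.
        return [row_transition(row) for row in rows]
    out = []
    for start, stop in chunk_ranges(n, chunk_size):
        # In a tensor implementation, only this slice's intermediates
        # would be resident in GPU memory at once.
        out.extend(row_transition(row) for row in rows[start:stop])
    return out
```

For example, `list(chunk_ranges(10, 4))` gives `[(0, 4), (4, 8), (8, 10)]`, and `apply_rows(rows, chunk_size=128)` matches `apply_rows(rows)` for any input. Smaller chunks mean more loop iterations, which is the overhead mentioned above.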

Results

LMI4Boltz returns a structure viewer, downloadable structure files, and data files from the prediction run. File availability depends on selected options and whether the selected model supports affinity prediction.

| Output | Meaning |
| --- | --- |
| Predicted structure (CIF) | Predicted complex in mmCIF format. This is the recommended format for large multi-chain or ligand-containing systems. |
| Predicted structure (PDB) | Predicted complex in PDB format when selected. PDB is convenient for older viewers but less expressive for complex chemistry. |
| confidence_*.json | Summary confidence metrics, including confidence_score, ptm, iptm, complex_plddt, interface-weighted confidence, and chain-pair interface scores. |
| affinity_*.json | Boltz-2 ligand binding outputs, including predicted affinity value and binder probability. Present when affinity prediction is requested for a compatible ligand complex. |
| plddt_*.npz | Per-token local confidence. Higher values indicate more reliable local geometry. |
| pae_*.npz | Predicted aligned error matrix. Useful for domain placement and relative chain orientation. |
| pde_*.npz | Predicted distance error matrix. Lower values indicate more confident inter-token distances. |

Understanding LMI4Boltz results

The first check should be structural confidence, not affinity. A strong affinity score on a low-confidence protein-ligand pose is weak evidence because the binding estimate depends on a plausible complex geometry.

Structure confidence

Boltz confidence fields are reported on a 0 to 1 scale, where higher values indicate more model confidence. The most useful distinction is between local confidence and interface confidence.

| Metric | Interpretation |
| --- | --- |
| confidence_score | Aggregate ranking score used to sort generated structures. Good for choosing among samples from the same job. |
| complex_plddt | Average local coordinate confidence across the full complex. High values mean local folds are likely stable, but not necessarily that chain placement is correct. |
| ptm | Predicted TM-like global fold confidence. Useful for single-chain and overall topology assessment. |
| iptm | Interface-focused confidence. More relevant than ptm for protein-protein, protein-ligand, and protein-nucleic acid complexes. |
| ligand_iptm | Interface confidence for protein-ligand contacts. Low values suggest the ligand pose should not be overinterpreted. |
| complex_iplddt | Local confidence weighted toward interface tokens. Useful when the interface matters more than distal flexible regions. |

As a rough guide, confidence values above 0.7 usually justify closer structural interpretation, values from 0.5 to 0.7 call for visual inspection and reruns, and values below 0.5 often indicate an underconstrained system, insufficient MSA signal, problematic input chemistry, or genuine conformational ambiguity.
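
The rough thresholds above can be applied to a confidence file automatically. The sketch below assumes a flat JSON object of metric names and 0 to 1 values; the `EXAMPLE` values are invented for illustration, and the `triage` and `summarize` helpers are hypothetical names rather than part of any Boltz tooling. Nested entries, such as per-chain score tables, are skipped.

```python
import json

# Invented example values; a real confidence_*.json may contain more fields.
EXAMPLE = (
    '{"confidence_score": 0.82, "ptm": 0.85, "iptm": 0.64, '
    '"complex_plddt": 0.88, "ligand_iptm": 0.41}'
)


def triage(value):
    """Map a 0-1 confidence value onto the rough bands described above."""
    if value > 0.7:
        return "interpret"
    if value >= 0.5:
        return "inspect and rerun"
    return "low confidence"


def summarize(confidence_json):
    """Return {metric: band} for every scalar metric in the JSON text."""
    metrics = json.loads(confidence_json)
    return {
        name: triage(value)
        for name, value in metrics.items()
        if isinstance(value, (int, float))
    }
```

With the example values, `summarize(EXAMPLE)` marks `iptm` as "inspect and rerun" and `ligand_iptm` as "low confidence", even though the aggregate `confidence_score` falls in the top band. This is why interface metrics deserve a separate look.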

Affinity outputs

Boltz-2 affinity files contain two kinds of predictions:

| Field | Meaning |
| --- | --- |
| affinity_probability_binary | Probability-like estimate that the ligand is a binder. Values closer to 1 indicate stronger predicted binder classification. |
| affinity_pred_value | Quantitative affinity value reported as log10(IC50), with IC50 in micromolar. Lower values indicate stronger predicted binding. |
| affinity_pred_value1, affinity_pred_value2 | Individual ensemble-head affinity values. Large disagreement means the ligand ranking is less stable. |
| affinity_probability_binary1, affinity_probability_binary2 | Individual ensemble-head binder probabilities. Agreement between heads is more convincing than one high value. |

The affinity value is not pIC50. In the Boltz convention, lower is stronger because the value is log10(IC50) with IC50 measured in micromolar:

| affinity_pred_value | Approximate IC50 interpretation |
| --- | --- |
| -3 | About 1 nM, strong binder |
| 0 | About 1 micromolar, moderate binder |
| 2 | About 100 micromolar, weak binder or likely decoy |
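
The table values are consistent with a base-10 logarithm, so the conversion is simple arithmetic. A minimal sketch, with hypothetical function names, that converts the reported value to an IC50 in micromolar and to the more familiar pIC50 scale:

```python
def ic50_micromolar(affinity_pred_value):
    """Invert log10(IC50 in micromolar) to get IC50 in micromolar."""
    return 10.0 ** affinity_pred_value


def pic50(affinity_pred_value):
    """pIC50 = -log10(IC50 in molar) = 6 - log10(IC50 in micromolar)."""
    return 6.0 - affinity_pred_value
```

For example, a value of -3 corresponds to 0.001 micromolar (1 nM) and a pIC50 of 9, while a value of 2 corresponds to 100 micromolar and a pIC50 of 4.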

For ligand ranking, compare compounds against the same target, same input preparation, same MSA choice, same sampling settings, and preferably the same random seed policy. Cross-target affinity values are much less meaningful.
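
One way to combine pose quality with affinity when ranking compounds against a single target is sketched below. The record layout, the `rank_ligands` name, and the 0.5 cutoff on `ligand_iptm` are assumptions chosen for illustration, not a Boltz or ProteinIQ convention; the ligand names and scores in the usage example are invented.

```python
def rank_ligands(results, min_ligand_iptm=0.5):
    """Rank ligands by predicted affinity, setting aside low-confidence poses.

    results: list of dicts with "name", "ligand_iptm", and
    "affinity_pred_value" keys (lower affinity value means stronger binding).
    Returns (ranked_names, flagged_names).
    """
    trusted = [r for r in results if r["ligand_iptm"] >= min_ligand_iptm]
    flagged = [r["name"] for r in results if r["ligand_iptm"] < min_ligand_iptm]
    ranked = sorted(trusted, key=lambda r: r["affinity_pred_value"])
    return [r["name"] for r in ranked], flagged


# Invented example records for three ligands run against the same target.
example = [
    {"name": "lig_a", "ligand_iptm": 0.81, "affinity_pred_value": -0.4},
    {"name": "lig_b", "ligand_iptm": 0.32, "affinity_pred_value": -2.1},
    {"name": "lig_c", "ligand_iptm": 0.74, "affinity_pred_value": -1.2},
]
ranked, flagged = rank_ligands(example)
```

Here `lig_b` has the strongest predicted affinity but is flagged rather than ranked first, because its pose confidence is too low for the affinity estimate to carry much weight.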

How LMI4Boltz works

LMI4Boltz changes how Boltz inference is executed, not the biological question the model is trying to answer. The underlying Boltz model remains a diffusion-based biomolecular structure predictor, with Boltz-2 adding an affinity module trained to estimate protein-ligand binding.

The low-memory changes target the expensive pair-representation operations that scale poorly as complexes grow. Pair tensors represent relationships between tokens, so memory rises quickly with longer proteins, deeper MSAs, and multi-chain assemblies. LMI4Boltz reduces peak VRAM by:

  • Updating pair representations in place: Intermediate tensors are reused instead of materializing more copies than needed.
  • Offloading less frequently used tensors to host memory: GPU memory is reserved for active computation, while some data can sit in CPU memory.
  • Using reduced precision for pair features: Lower precision reduces memory footprint, with small numerical differences expected.
  • Chunking memory-heavy operations: Triangle attention, MSA transition, outer product, and related operations can run in smaller pieces.

The LMI4Boltz README reports substantially higher token limits on 24 GB VRAM, with near-exact reproduction of ordinary Boltz outputs and negligible runtime change in the tested settings. Small numerical differences can still occur because reduced precision and altered execution order change floating-point arithmetic.
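
A back-of-the-envelope estimate shows why pair tensors dominate memory as complexes grow. The sketch below assumes a single dense pair tensor of shape tokens x tokens x channels; the default of 128 channels is an illustrative assumption, and real peak usage depends on how many such tensors and intermediates are live at once.

```python
def pair_tensor_gib(tokens, channels=128, bytes_per_value=4):
    """Size in GiB of one dense tokens x tokens x channels tensor."""
    return tokens * tokens * channels * bytes_per_value / 2**30


for n in (512, 1024, 2048, 4096):
    fp32 = pair_tensor_gib(n)                      # 4 bytes per value
    bf16 = pair_tensor_gib(n, bytes_per_value=2)   # 2 bytes per value
    print(f"{n:5d} tokens: {fp32:6.2f} GiB fp32, {bf16:6.2f} GiB bf16")
```

Doubling the token count quadruples the size of each pair tensor, and switching from 4-byte to 2-byte values halves it. Under these assumptions a 1024-token complex needs 0.5 GiB per fp32 pair tensor, and a 4096-token complex needs 8 GiB for the same tensor.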

When to use LMI4Boltz vs alternatives

LMI4Boltz is not a different scoring model from Boltz. It is the practical choice when Boltz-style prediction is desired and memory is the limiting factor.

| Situation | Better fit | Reason |
| --- | --- | --- |
| Large protein, nucleic acid, or mixed complexes strain GPU memory | LMI4Boltz | Low-memory inference can handle larger token counts than ordinary Boltz execution. |
| Routine structure plus affinity prediction where memory is not a concern | Boltz-2 | Simpler settings and the same model family for most standard-size jobs. |
| Comparing multimodal structure predictors | Chai-1, Protenix, and LMI4Boltz | Agreement across independent model families is useful for difficult complexes. |
| Protein-ligand pose prediction from an existing receptor structure | DiffDock or GNINA | Docking tools focus on ligand placement against a provided structure rather than predicting the whole biomolecular complex. |
| High-confidence lead optimization for a few close analogs | FEP or experimental binding assays | Boltz-2 affinity is fast and useful for triage, but rigorous free-energy workflows remain better for final potency claims. |

For a practical pipeline, LMI4Boltz can generate candidate complex structures for large systems, DockQ can score protein-protein interface models when a reference is available, and GROMACS or gmx_MMPBSA can support downstream simulation-based checks.

Common failure modes

  • Incorrect ligand stereochemistry: SMILES must encode the intended enantiomer. Binding predictions can change when stereocenters are missing or wrong.
  • Treating peptides as small molecules: Peptide ligands are better entered as protein chains. The SMILES ligand path is capped at 300 heavy atoms and is intended for small molecules.
  • Reading affinity without pose quality: Affinity values should be interpreted only after checking ligand interface confidence and visual pose plausibility.
  • Disabling MSA for natural proteins: Single-sequence protein predictions can work, but many natural proteins benefit from MSA signal.
  • Over-tuning memory chunks first: If a job fails, reduce sample count, MSA depth, or system size before changing every low-memory chunk parameter at once. One change at a time makes reruns easier to interpret.