ProteinIQ
DynamicBind (Beta)

AI-powered protein-ligand binding prediction with protein conformational flexibility

What is DynamicBind?

DynamicBind is a deep learning docking tool that predicts how small-molecule ligands bind to proteins while accounting for protein flexibility. Unlike traditional docking methods that treat proteins as rigid structures, DynamicBind predicts ligand-induced conformational changes—the structural adjustments proteins undergo when binding occurs.

This flexibility is critical for drug discovery. Proteins in their unbound (apo) state often have different side-chain orientations and backbone conformations compared to their ligand-bound (holo) state. DynamicBind recovers these holo-like conformations directly from apo structures, including AlphaFold predictions, without requiring prior knowledge of the binding site.

For traditional rigid docking approaches, see DiffDock, AutoDock Vina, or GNINA. For structure-based ligand generation, explore PocketFlow.

How does DynamicBind work?

Equivariant diffusion model

DynamicBind uses a diffusion-based generative model that learns to transition proteins from their apo state to ligand-specific holo conformations. The model is SE(3)-equivariant, meaning its predictions transform consistently when the input structure is rotated or translated in 3D. This property is credited with letting equivariant models train on far less data (reportedly up to 1000× less) while generalizing better than non-equivariant architectures.
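To make the equivariance property concrete, here is a minimal sketch in plain Python. It is not DynamicBind code: the toy function `centroid_offsets` and the helper names are hypothetical and chosen only for illustration. The check confirms that rotating and translating the input first, then applying the function, gives the same vectors as applying the function first and rotating its output.

```python
import math

def rotate_z(point, theta):
    """Rotate a 3D point about the z axis by theta radians."""
    x, y, z = point
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y, s * x + c * y, z)

def translate(point, shift):
    """Shift a 3D point by a translation vector."""
    return tuple(a + b for a, b in zip(point, shift))

def centroid_offsets(points):
    """Toy equivariant function: vector from the centroid to each point."""
    n = len(points)
    centroid = tuple(sum(p[i] for p in points) / n for i in range(3))
    return [tuple(p[i] - centroid[i] for i in range(3)) for p in points]

def is_equivariant(points, theta, shift, tol=1e-9):
    """Compare f(R x + t) with R f(x) for the toy function above."""
    moved = [translate(rotate_z(p, theta), shift) for p in points]
    output_of_moved = centroid_offsets(moved)
    rotated_output = [rotate_z(v, theta) for v in centroid_offsets(points)]
    return all(
        abs(a - b) < tol
        for u, v in zip(output_of_moved, rotated_output)
        for a, b in zip(u, v)
    )

# Arbitrary illustrative coordinates (not taken from any real structure).
points = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.2, 0.3), (0.2, 2.0, -1.0)]
equivariant = is_equivariant(points, theta=0.7, shift=(3.0, -2.0, 5.0))
```

The translation drops out of the output because the function only uses positions relative to the centroid, and the rotation carries through unchanged. An equivariant network has this behavior built into every layer, so it does not need to learn it from rotated copies of the training data.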

The diffusion process operates over 20 inference steps with progressively smaller perturbations. Rather than adding random Gaussian noise, DynamicBind learns biologically relevant conformational transitions by training on paired apo-holo crystal structures from the PDBbind2020 dataset (19,443 protein-ligand complexes).

Coarse-grained protein representation

Each protein residue is represented as a node with two geometric features: backbone coordinates and side-chain dihedral angles (χ angles). This coarse-graining reduces degrees of freedom while maintaining the ability to reconstruct all non-hydrogen atom positions. The model simultaneously predicts:

  • Ligand translational and rotational coordinates
  • Ligand torsional angles
  • Protein backbone translations and rotations
  • Protein side-chain χ angles
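For readers unfamiliar with χ angles, the following sketch shows how a dihedral angle is computed from four atom positions. For most residues, χ1 is the dihedral defined by the N, CA, CB, and CG atoms. The function name and the toy coordinates are illustrative assumptions, not part of DynamicBind.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )

def dihedral_degrees(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    axis = tuple(c / norm for c in b1)
    # Project the two outer bonds onto the plane perpendicular to the axis.
    v = _sub(b0, tuple(_dot(b0, axis) * c for c in axis))
    w = _sub(b2, tuple(_dot(b2, axis) * c for c in axis))
    return math.degrees(math.atan2(_dot(_cross(axis, v), w), _dot(v, w)))

# Toy coordinates standing in for N, CA, CB, CG (not from a real residue).
n, ca, cb, cg = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0)
chi1 = dihedral_degrees(n, ca, cb, cg)
```

Because bond lengths and bond angles are nearly constant, a handful of χ angles per residue is enough to place the side-chain heavy atoms once the backbone frame is known, which is why the coarse-grained representation loses little information.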

The use of tensor products of irreducible representations enables the model to capture complex protein-ligand interactions while preserving SE(3) equivariance throughout the neural network layers.

Handling conformational changes

DynamicBind accommodates large protein conformational changes, including kinase DFG-in/out transitions and cryptic pocket opening. By learning smooth energy landscapes between equilibrium states, the model can recover holo-structures even when the initial AlphaFold prediction has large pocket RMSD (> 5 Å) relative to the bound state.
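Pocket RMSD can be computed as in the sketch below. It assumes the two structures are already superimposed and that the pocket atoms are listed in the same order in both; the function name and coordinates are illustrative only.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (a - b) ** 2
        for p, q in zip(coords_a, coords_b)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(coords_a))

# Toy pocket atoms: every atom in the "holo" set is displaced by 5 Å.
apo_pocket = [(0.0, 0.0, 0.0), (1.0, 2.0, 3.0), (4.0, 0.0, 1.0)]
holo_pocket = [(x + 3.0, y + 4.0, z) for x, y, z in apo_pocket]
pocket_rmsd = rmsd(apo_pocket, holo_pocket)
```

In practice the structures are first aligned on the protein (for example on backbone atoms), and the RMSD is then evaluated only over residues lining the binding site.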

The model outputs both the predicted ligand pose and the corresponding protein conformation, allowing users to visualize how binding site residues adjust to accommodate the ligand.

Input requirements

Protein structure

  • Upload PDB file: Provide a protein structure in PDB or ENT format (max 50 MB)
  • Fetch from RCSB: Enter a 4-character PDB ID to automatically download the structure
  • AlphaFold compatibility: DynamicBind works directly with AlphaFold-predicted apo structures

DynamicBind automatically removes water molecules, non-standard residues, and ligands during preprocessing. For structure prediction, use ESMFold, Boltz-2, or Chai-1.
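The cleanup described above can be approximated with a simple filter over PDB records. This is a simplified sketch, not DynamicBind's actual preprocessing: it drops every HETATM record, which covers waters and ligands but also removes modified residues stored as HETATM. The function name and the sample records are hypothetical.

```python
def strip_heteroatoms(pdb_text):
    """Return PDB text with all HETATM records removed."""
    kept = []
    for line in pdb_text.splitlines():
        # The record name occupies the first six columns of a PDB line.
        if line[:6].strip() == "HETATM":
            continue
        kept.append(line)
    return "\n".join(kept)

# Made-up sample records for illustration.
sample = "\n".join([
    "ATOM      1  N   ALA A   1      11.104   6.134  -6.504  1.00  0.00           N",
    "ATOM      2  CA  ALA A   1      11.639   6.071  -5.147  1.00  0.00           C",
    "HETATM    3  O   HOH A 101      20.000  10.000   5.000  1.00  0.00           O",
    "HETATM    4  C1  LIG A 201      15.000   8.000   2.000  1.00  0.00           C",
    "END",
])
cleaned = strip_heteroatoms(sample)
```

Running a filter like this locally before upload is a quick way to see which atoms will remain in the docking input.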

Ligand structure

  • SMILES input: Enter the ligand as a SMILES string (recommended for small molecules)
  • Upload SDF/MOL: Provide ligand coordinates in SDF or MOL format
  • Fetch from PubChem: Search for compounds by name or CID

The ligand is automatically processed and converted to the appropriate internal representation for docking.

Docking parameters

Number of poses

Controls how many binding poses to generate per protein-ligand pair. More poses increase the chance of sampling near-native conformations but require longer computation time.

We recommend starting with 10 poses for initial exploration and increasing to 20-40 for final predictions or challenging targets where the binding site is unknown.

Inference steps

The number of diffusion denoising steps used during sampling. Higher values may improve pose quality by allowing more gradual refinement but increase computation time proportionally.

The model was trained and validated with 20 steps, which provides a good balance between quality and speed. Values below 15 may produce lower-quality poses, while values above 25 show diminishing returns.

Protein flexibility

When enabled, DynamicBind predicts ligand-induced conformational changes in both backbone and side-chain positions. When disabled, the model performs rigid docking with the input structure held fixed.

Enable flexibility (default) when working with apo structures or AlphaFold predictions. Disable for holo structures where the binding site conformation is already optimized, or when you want to test if a ligand can bind without requiring protein adjustments.

Understanding the results

DynamicBind returns ranked binding poses with two scoring metrics:

Affinity (pKd)

Predicted binding affinity in pKd units, where pKd = -log10(Kd). Higher values indicate stronger binding:

  • pKd > 8: High affinity (Kd < 10 nM)
  • pKd = 6-8: Moderate affinity (Kd = 10-1000 nM)
  • pKd < 6: Low affinity (Kd > 1 μM)

The affinity prediction is learned from PDBbind experimental binding data and provides an estimate of binding strength. Note that affinity predictions are less reliable for proteins with low sequence homology to the training set.
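The conversion between pKd and Kd is a short calculation, sketched below with Kd expressed in mol/L inside the logarithm. The function names are illustrative, and the class boundaries simply mirror the pKd thresholds listed in this section.

```python
import math

def pkd_to_kd_nanomolar(pkd):
    """Convert pKd to Kd in nM: Kd (M) = 10**(-pKd), and 1 M = 1e9 nM."""
    return 10 ** (9 - pkd)

def kd_molar_to_pkd(kd_molar):
    """Convert Kd in mol/L to pKd."""
    return -math.log10(kd_molar)

def affinity_class(pkd):
    """Map a pKd value to the high / moderate / low bins used in this section."""
    if pkd > 8:
        return "high"
    if pkd >= 6:
        return "moderate"
    return "low"

kd_at_pkd_8 = pkd_to_kd_nanomolar(8.0)   # boundary between high and moderate
kd_at_pkd_6 = pkd_to_kd_nanomolar(6.0)   # boundary between moderate and low
```

Because the scale is logarithmic, a difference of one pKd unit corresponds to a tenfold change in Kd.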

lDDT (local distance difference test)

Confidence metric ranging from 0 to 1 that estimates the accuracy of the predicted protein conformation. Higher lDDT scores indicate the model has high confidence in both the ligand pose and the induced protein conformational change.

  • lDDT > 0.7: High confidence
  • lDDT = 0.5-0.7: Moderate confidence
  • lDDT < 0.5: Low confidence
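For context, the sketch below shows how lDDT is computed in general when a reference structure is available. DynamicBind reports a predicted value, since no reference exists at inference time. The sketch is simplified: it treats the inputs as flat lists of matched atom coordinates and does not exclude pairs within the same residue. The 15 Å inclusion radius and the 0.5, 1, 2, and 4 Å tolerances follow the standard lDDT definition; the function name and toy coordinates are illustrative.

```python
import math
from itertools import combinations

def lddt(reference, model, cutoff=15.0, thresholds=(0.5, 1.0, 2.0, 4.0)):
    """Fraction of reference distances preserved in the model, averaged over thresholds."""
    if len(reference) != len(model):
        raise ValueError("reference and model must contain the same atoms")
    # Only pairs that are close in the reference structure are scored.
    pairs = [
        (i, j)
        for i, j in combinations(range(len(reference)), 2)
        if math.dist(reference[i], reference[j]) < cutoff
    ]
    if not pairs:
        raise ValueError("no atom pairs within the inclusion radius")
    preserved = 0
    for i, j in pairs:
        deviation = abs(
            math.dist(reference[i], reference[j]) - math.dist(model[i], model[j])
        )
        preserved += sum(deviation < t for t in thresholds)
    return preserved / (len(pairs) * len(thresholds))

# Toy example: three atoms on a line, with the last one shifted by 1.5 Å in the model.
reference = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (6.0, 0.0, 0.0)]
model = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (7.5, 0.0, 0.0)]
score = lddt(reference, model)
```

Because the score depends only on interatomic distances, it does not require superimposing the two structures, which makes it well suited to judging local geometry around a binding site.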

Poses are ranked by a combination of lDDT and clash scores to prioritize geometrically plausible structures with favorable predicted confidence.

Interpreting poses

The top-ranked pose represents the model's best prediction, but examining multiple poses is recommended. On the benchmark datasets, the reported success rates are:

  • 33% of predictions achieve ligand RMSD < 2 Å (near-native)
  • 65% achieve ligand RMSD < 5 Å (reasonable binding mode)

Download predicted structures in SDF format (ligand) and PDB format (protein conformation) for further analysis or molecular dynamics simulations.

When to use DynamicBind

DynamicBind is particularly valuable when:

  • Working with apo structures or AlphaFold predictions that lack bound ligand conformations
  • Targeting proteins known to undergo conformational changes upon binding
  • Investigating cryptic pockets that are closed in apo structures
  • Screening compounds against kinases, GPCRs, or nuclear receptors (well-represented in training data)

For these applications, DynamicBind achieves 1.7× higher success rates compared to DiffDock under stringent accuracy criteria.

Limitations

Training set dependence

Performance depends on similarity to the PDBbind2020 training set. Predictions for proteins with low sequence homology to known structures show reduced accuracy. The model performs best on major drug target families (kinases, GPCRs, nuclear receptors, ion channels) that are well-represented in crystallographic databases.

Blind docking challenges

When the binding site is unknown, DynamicBind may struggle to identify the correct pocket, especially for novel protein targets. Providing a general binding region or using multiple inference runs with different random seeds can improve coverage.

Confidence score calibration

With perfect pose selection (always picking the most accurate of the sampled poses), the success rate would rise from 33% to 50%. The gap indicates that the current lDDT-based ranking sometimes places an incorrect pose first. Manual inspection of the top-ranked poses is recommended for critical applications.
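The gap between ranked and ideal pose selection can be measured as in the sketch below. It assumes each complex is represented by a list of ligand RMSD values ordered by confidence rank, with the top-ranked pose first. The function name is hypothetical and the RMSD values are made up for illustration; they are not benchmark results.

```python
def success_rates(rmsds_per_complex, threshold=2.0):
    """Return (top-1 success rate, best-of-N success rate) at an RMSD threshold."""
    n = len(rmsds_per_complex)
    if n == 0:
        raise ValueError("at least one complex is required")
    # Top-1: only the highest-ranked pose counts.
    top1 = sum(poses[0] < threshold for poses in rmsds_per_complex)
    # Best-of-N: a complex succeeds if any sampled pose is under the threshold.
    best_of_n = sum(min(poses) < threshold for poses in rmsds_per_complex)
    return top1 / n, best_of_n / n

# Made-up ligand RMSDs (Å) for four complexes, three ranked poses each.
toy_rmsds = [
    [1.2, 3.0, 4.5],
    [3.5, 1.8, 6.0],
    [4.0, 5.2, 7.1],
    [2.6, 2.9, 1.1],
]
top1_rate, best_of_n_rate = success_rates(toy_rmsds)
```

A large difference between the two rates means that good poses are being sampled but not ranked first, which is the situation where inspecting several poses pays off.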

Alternative docking methods:

  • DiffDock — Rigid docking with diffusion models
  • AutoDock Vina — Physics-based scoring function
  • GNINA — CNN-based scoring with GPU acceleration
  • Smina — Fast screening with AutoDock Vina engine

Structure prediction:

  • ESMFold — Fast structure prediction from sequence
  • Boltz-2 — AlphaFold 3-level protein-ligand complex prediction
  • Chai-1 — Multi-entity complex prediction

Structure-based design:

  • PocketFlow — Generate novel ligands for binding pockets