ProteinIQ

PandaDock

Physics-based molecular docking with CPU-optimized hierarchical algorithms

What is PandaDock?

PandaDock is an open-source molecular docking program written in Python. It predicts how small-molecule ligands bind to protein targets by searching for low-energy binding poses and scoring them with physics-based or hybrid energy functions. Developed by Pritam Kumar Panda, PandaDock is designed entirely around CPU computation — no GPU is required — and achieves sub-angstrom structural accuracy on standard benchmarks (0.014 Å mean RMSD on PDBbind v2020 complexes, with 99.3% of poses within 2 Å of the crystallographic reference).

Where most docking tools offer a single search strategy, PandaDock bundles several algorithms under one interface. An enhanced hierarchical search (the default) runs a coarse global scan, intermediate refinement, and fine local optimization in three stages. Monte Carlo sampling uses simulated annealing. A genetic algorithm applies crossover and mutation operators to evolve pose populations. A crystal-guided mode constrains the search near a known binding geometry for re-docking validation. Each algorithm pairs with any of the available scoring functions.
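
Re-docking validation, and the RMSD figures quoted above, rest on comparing a predicted pose with the crystallographic ligand atom by atom. The snippet below is a minimal sketch of that comparison, assuming both poses list the same atoms in the same order and already share a coordinate frame. The function names `rmsd` and `success_rate` are illustrative and are not part of PandaDock.

```python
import math

def rmsd(pose, reference):
    """Root-mean-square deviation (in Å) between two equally ordered coordinate lists."""
    if len(pose) != len(reference) or not pose:
        raise ValueError("pose and reference must be non-empty and the same length")
    squared = sum(
        (px - rx) ** 2 + (py - ry) ** 2 + (pz - rz) ** 2
        for (px, py, pz), (rx, ry, rz) in zip(pose, reference)
    )
    return math.sqrt(squared / len(pose))

def success_rate(rmsds, cutoff=2.0):
    """Fraction of poses whose RMSD falls within the cutoff (2 Å is the usual threshold)."""
    return sum(r <= cutoff for r in rmsds) / len(rmsds)

# Made-up coordinates for illustration only: every atom is shifted by 0.1 Å along x.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
pose = [(0.1, 0.0, 0.0), (1.6, 0.0, 0.0), (1.6, 1.5, 0.0)]
print(round(rmsd(pose, reference), 3))
```

Real benchmarks usually also account for symmetry-equivalent atoms, which this sketch ignores.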

How does PandaDock work?

PandaDock's docking workflow follows three phases: search space definition, conformational sampling, and scoring/ranking.

Search space

If no binding site is specified, PandaDock computes a bounding box around all protein atoms with padding. When the binding pocket is known, coordinates and box dimensions can be set manually to focus the search and reduce computation.
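
As a rough illustration of the automatic case, the box can be derived from the minimum and maximum atom coordinates along each axis. This is a sketch, not PandaDock's own code: the function name `bounding_box` and the 5 Å padding are assumptions, since the amount of padding is not stated here.

```python
def bounding_box(coords, padding=5.0):
    """Return (center, size) of an axis-aligned box enclosing all coordinates.

    `padding` is added on every side; the 5 Å default is an assumed value.
    """
    xs, ys, zs = zip(*coords)
    lo = (min(xs), min(ys), min(zs))
    hi = (max(xs), max(ys), max(zs))
    center = tuple((a + b) / 2 for a, b in zip(lo, hi))
    size = tuple((b - a) + 2 * padding for a, b in zip(lo, hi))
    return center, size

# Three made-up atom positions in Å.
atoms = [(0.0, 0.0, 0.0), (10.0, 4.0, 2.0), (6.0, 8.0, -2.0)]
center, size = bounding_box(atoms, padding=5.0)
print(center)  # (5.0, 4.0, 0.0)
print(size)    # (20.0, 18.0, 14.0)
```

A box built this way grows with the protein, which is why specifying a known pocket reduces the volume to be searched.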

Conformational sampling

The default Enhanced Hierarchical CPU algorithm operates in three stages:

  1. Coarse search: Broad rotational and translational sampling across the binding region to identify promising orientations
  2. Intermediate refinement: Local perturbation of top candidates to improve pose geometry
  3. Fine optimization: Energy minimization of the best poses to converge on sub-angstrom accuracy
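
The three stages above can be sketched on a toy problem. The example below is only an illustration of the coarse-to-fine idea: it searches over a 3D translation, uses a hypothetical `toy_energy` in place of a real scoring function, and swaps in a simple shrinking-step coordinate search for the final energy minimization. None of the names or parameter values come from PandaDock.

```python
import random

def toy_energy(p, target=(1.0, -2.0, 0.5)):
    # Hypothetical stand-in for a scoring function: squared distance to a target position.
    return sum((a - b) ** 2 for a, b in zip(p, target))

def hierarchical_search(energy, box=10.0, n_coarse=500, n_keep=10, seed=0):
    rng = random.Random(seed)

    # Stage 1, coarse search: uniform sampling across the whole box.
    coarse = [tuple(rng.uniform(-box, box) for _ in range(3)) for _ in range(n_coarse)]
    best = sorted(coarse, key=energy)[:n_keep]

    # Stage 2, intermediate refinement: small random perturbations of the top candidates.
    refined = []
    for p in best:
        trials = [p] + [tuple(c + rng.gauss(0, 0.5) for c in p) for _ in range(50)]
        refined.append(min(trials, key=energy))

    # Stage 3, fine optimization: coordinate search whose step halves when no move helps.
    current = min(refined, key=energy)
    step = 0.25
    while step > 1e-4:
        improved = False
        for axis in range(3):
            for delta in (step, -step):
                trial = tuple(c + delta if i == axis else c for i, c in enumerate(current))
                if energy(trial) < energy(current):
                    current, improved = trial, True
        if not improved:
            step /= 2
    return current, energy(current)

pose, score = hierarchical_search(toy_energy)
print(pose, score)
```

A real docking search would also sample ligand rotations and torsions; the staging logic stays the same.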

Alternative algorithms trade off speed and thoroughness differently. Monte Carlo sampling explores conformational space through random perturbations accepted or rejected by a Boltzmann criterion, making it faster for screening. The genetic algorithm maintains a population of poses that evolve over generations — useful for complex binding sites with multiple local minima.
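
The Boltzmann acceptance rule can be written in a few lines. The sketch below pairs it with a geometric cooling schedule on a toy energy function. The step size, temperatures, and function names are assumptions chosen for illustration, and temperature is expressed directly in energy units.

```python
import math
import random

def toy_energy(p, target=(1.0, -2.0, 0.5)):
    # Hypothetical stand-in for a scoring function.
    return sum((a - b) ** 2 for a, b in zip(p, target))

def metropolis_accept(delta_e, temperature, rng):
    """Always accept downhill moves; accept uphill moves with probability exp(-dE/T)."""
    if delta_e <= 0:
        return True
    return rng.random() < math.exp(-delta_e / temperature)

def anneal(energy, start, steps=5000, t_start=5.0, t_end=0.01, seed=0):
    rng = random.Random(seed)
    current, e_current = start, energy(start)
    best, e_best = current, e_current
    for i in range(steps):
        # Geometric cooling from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (i / (steps - 1))
        trial = tuple(c + rng.gauss(0, 0.3) for c in current)
        e_trial = energy(trial)
        if metropolis_accept(e_trial - e_current, t, rng):
            current, e_current = trial, e_trial
            if e_current < e_best:
                best, e_best = current, e_current
    return best, e_best

best_pose, best_energy = anneal(toy_energy, start=(8.0, 8.0, 8.0))
print(best_pose, best_energy)
```

Early on, the high temperature lets the walk escape poor regions; as it cools, uphill moves become rare and the search settles into a minimum.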

Scoring functions

| Function | Description |
| --- | --- |
| Physics-based | Full force-field evaluation including van der Waals, electrostatics, hydrogen bonding, and desolvation terms. Best general-purpose accuracy. |
| Empirical | Statistical potential derived from known protein-ligand complexes. Faster, suited for rapid screening. |
| Precision Score | Higher-resolution energy calculation with tighter convergence criteria. Slower but more discriminating for closely ranked poses. |
| Hybrid | Combines physics-based energy terms with machine-learning rescoring for improved affinity ranking. |
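
To make the physics-based terms more concrete, here is a textbook form of two of them: a 12-6 Lennard-Jones potential for van der Waals contacts and a Coulomb term for electrostatics. This is a generic sketch, not PandaDock's actual scoring function. The parameter values (`epsilon`, `sigma`, `dielectric`) are placeholders, and the hydrogen-bonding and desolvation terms are omitted.

```python
import math

# Approximate conversion factor giving kcal/mol for charges in e and distances in Å.
COULOMB_CONSTANT = 332.06

def lennard_jones(r, epsilon, sigma):
    """12-6 Lennard-Jones energy: well depth epsilon, zero crossing at r = sigma."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 ** 2 - sr6)

def coulomb(r, q1, q2, dielectric=4.0):
    """Coulomb energy between two point charges in a uniform dielectric."""
    return COULOMB_CONSTANT * q1 * q2 / (dielectric * r)

def pair_energy(ligand, protein, epsilon=0.15, sigma=3.4, dielectric=4.0):
    """Sum both terms over all ligand-protein atom pairs.

    Each atom is ((x, y, z), partial_charge). A single epsilon/sigma pair is
    used for every atom type, which is a simplification made for this sketch.
    """
    total = 0.0
    for lpos, lq in ligand:
        for ppos, pq in protein:
            r = math.dist(lpos, ppos)
            total += lennard_jones(r, epsilon, sigma) + coulomb(r, lq, pq, dielectric)
    return total

# Made-up two-atom example: one ligand atom near one oppositely charged protein atom.
ligand = [((0.0, 0.0, 0.0), 0.3)]
protein = [((3.8, 0.0, 0.0), -0.3)]
print(pair_energy(ligand, protein))
```

The steep repulsive wall of the 12-6 term is what penalizes clashing poses, while the Coulomb term rewards complementary charges.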

How to use PandaDock online

ProteinIQ hosts PandaDock on cloud infrastructure so docking jobs can be submitted directly from the browser — no Python environment, dependencies, or command-line setup needed.

Inputs

| Input | Description |
| --- | --- |
| Protein (Receptor) | PDB file upload or 4-character PDB ID (e.g., 1HSG). The structure is fetched from RCSB if an ID is provided. |
| Ligand | SMILES string, SDF/MOL file, or PubChem compound name. SMILES are converted to 3D coordinates automatically using RDKit. |
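
For readers who want to prepare the same receptor locally, the sketch below validates a 4-character PDB ID and builds the public RCSB download URL for it. Whether ProteinIQ uses this particular endpoint is an assumption; the helper names are illustrative.

```python
import re
import urllib.request

# Classic PDB IDs: a digit followed by three alphanumeric characters.
PDB_ID_PATTERN = re.compile(r"^[0-9][A-Za-z0-9]{3}$")

def rcsb_download_url(pdb_id):
    """Return the RCSB file-download URL for a 4-character PDB ID."""
    pdb_id = pdb_id.strip()
    if not PDB_ID_PATTERN.match(pdb_id):
        raise ValueError(f"not a 4-character PDB ID: {pdb_id!r}")
    return f"https://files.rcsb.org/download/{pdb_id.upper()}.pdb"

def fetch_pdb(pdb_id, timeout=30):
    """Download the PDB file as text (requires network access)."""
    with urllib.request.urlopen(rcsb_download_url(pdb_id), timeout=timeout) as response:
        return response.read().decode("utf-8")

print(rcsb_download_url("1hsg"))  # https://files.rcsb.org/download/1HSG.pdb
```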

Settings

Docking parameters

| Setting | Description |
| --- | --- |
| Docking algorithm | Search strategy. Enhanced Hierarchical CPU (default, recommended) runs a three-stage search for maximum accuracy. Monte Carlo CPU is faster for screening. Genetic Algorithm CPU handles complex binding sites. Crystal-Guided CPU is for re-docking validation. Auto selects based on input characteristics. |
| Scoring function | Energy evaluation method. Physics-based (default) for general use, Empirical for speed, Precision Score for fine discrimination, Hybrid for ML-enhanced ranking. |
| Number of poses | How many binding poses to generate (1–20, default 5). More poses sample more of the energy landscape at the cost of longer runtime. |

Advanced settings

Manual search space configuration is available for users who know the binding pocket location. When disabled (default), PandaDock automatically computes a box enclosing the entire protein.

| Setting | Description |
| --- | --- |
| Configure search space | Toggle manual binding site specification. |
| Center X/Y/Z | Coordinates of the search box center in Angstroms. |
| Box size X/Y/Z | Dimensions of the search box (5–100 Å, default 22.5 Å per axis). |
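
When preparing jobs in a script, it can help to check values against the documented ranges before submitting. The dataclass below is a hypothetical sketch: the class and field names are invented for illustration and do not correspond to a ProteinIQ or PandaDock API. Only the ranges and defaults (1–20 poses, default 5; box size 5–100 Å, default 22.5 Å) come from the tables above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class DockingSettings:
    """Hypothetical container mirroring the documented settings."""
    n_poses: int = 5
    # None means "search space not configured": the box is computed automatically.
    center: Optional[Tuple[float, float, float]] = None
    box_size: Tuple[float, float, float] = (22.5, 22.5, 22.5)

    def __post_init__(self):
        if not 1 <= self.n_poses <= 20:
            raise ValueError("n_poses must be between 1 and 20")
        if len(self.box_size) != 3 or any(not 5.0 <= s <= 100.0 for s in self.box_size):
            raise ValueError("box_size needs three values between 5 and 100 Å")
        if self.center is not None and len(self.center) != 3:
            raise ValueError("center needs exactly three coordinates")

defaults = DockingSettings()
focused = DockingSettings(n_poses=10, center=(12.0, 24.5, 4.0), box_size=(18.0, 18.0, 18.0))
print(defaults)
print(focused)
```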

Results

PandaDock returns ranked binding poses viewable in an interactive 3D viewer alongside downloadable files.

| Output | Description |
| --- | --- |
| Ligand poses | Individual ligand conformations (PDB format), ranked by score. Displayed in the 3D viewer superimposed on the receptor. |
| Complex structures | Full protein-ligand complexes for each pose, available for download. |
| Visualization plots | Binding affinity and interaction energy charts (PNG). |
| Interaction analysis | Hydrogen bonds, hydrophobic contacts, and energy decomposition (JSON). |

Interpreting scores

PandaDock scores are unitless energy-like values where lower is better — a more negative score indicates a more favorable predicted binding interaction. Scores are most meaningful when comparing poses within the same docking run rather than across different protein-ligand pairs.

| Score range | Interpretation |
| --- | --- |
| Most negative (rank 1) | Predicted best binding pose |
| Near zero | Weak or unfavorable interaction |

Because PandaDock uses its own scoring function rather than calibrated free-energy estimates, scores should not be interpreted as binding affinities in kcal/mol. For absolute affinity prediction, consider re-scoring top poses with dedicated tools or running experimental validation.
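
Within a single run, ranking amounts to sorting poses by score in ascending order. The sketch below does that and reports each pose's gap to the best score, which is a relative quantity and therefore safer to read than the raw values. The pose names and scores are made up for illustration and are not PandaDock output.

```python
def rank_poses(scores):
    """Return (rank, pose_name, score, gap_to_best) tuples, most negative score first."""
    ordered = sorted(scores.items(), key=lambda item: item[1])
    best = ordered[0][1]
    return [(i + 1, name, s, s - best) for i, (name, s) in enumerate(ordered)]

# Illustrative values only.
scores = {"pose_a": -6.2, "pose_b": -8.5, "pose_c": -7.1}
ranking = rank_poses(scores)
for rank, name, score, gap in ranking:
    print(f"{rank}  {name}  score={score:.1f}  gap={gap:.1f}")
```

A small gap between the top poses suggests the scoring function cannot clearly separate them, which is a reason to inspect several poses rather than only rank 1.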

Limitations

  • Rigid receptor: The protein backbone remains fixed during docking. Side-chain flexibility is limited. Large induced-fit effects — common with flexible active sites — may not be captured.
  • Scoring accuracy: Physics-based scoring functions approximate binding free energy but do not account for explicit solvent, entropy, or protein dynamics. Ranking within a single run is generally reliable; comparing scores across different targets is not.
  • Ligand size: Very large ligands (>150 atoms) may require longer runtimes and can challenge the conformational sampling.
  • Metal coordination: Standard scoring may underestimate interactions at metalloprotein active sites. The crystal-guided algorithm can help for known metalloproteins.

Related tools

  • SMINA: Fork of AutoDock Vina with enhanced scoring and faster minimization; a widely used alternative for small-molecule docking
  • AutoDock-GPU: GPU-accelerated AutoDock4 for high-throughput virtual screening
  • DynamicBind: Deep learning approach that models ligand-induced conformational changes in the receptor
  • fpocket: Binding site detection using Voronoi tessellation, useful for identifying where to dock when the pocket is unknown
  • PocketFlow: Generative model that designs novel molecules directly within a protein binding pocket