
DFMDock (Denoising Force Matching Dock) is a diffusion model that unifies sampling and ranking for protein-protein docking within a single framework. It predicts docked poses for protein-protein complexes from unbound structures using denoising score matching with optional clash force guidance.

EquiDock is an SE(3)-equivariant graph neural network for rigid protein-protein docking. It predicts a binding pose for a protein-protein complex from unbound structures using geometric deep learning, with DIPS and DB5 pretrained checkpoints from the upstream release.

ColabDock is a protein-protein docking framework that uses AlphaFold2 to predict complex structures guided by experimental restraints from cross-linking mass spectrometry, NMR, or other sources.

HADDOCK (High Ambiguity Driven protein-protein DOCKing) is an integrative modeling platform for biomolecular complexes. It uses experimental data and bioinformatic predictions to guide the docking process, generating accurate protein-protein complex structures.

LightDock is a protein-protein, protein-peptide, and protein-DNA docking framework using Glowworm Swarm Optimization (GSO). It predicts macromolecular binding modes and interfaces for biological complexes.

Geometric deep learning model for predicting protein binding sites directly from 3D structure. Identifies where proteins interact with other proteins, antibodies, or disordered proteins with high accuracy, including for novel protein folds.

SPRINT (Structure-aware PRotein ligand INTeraction) predicts drug-target interactions using co-embedded protein and ligand representations. Screen thousands of compounds against a protein target in seconds.

Antibody-specific language model for predicting non-germline residues (NGL) in antibody sequences. AbLang-2 addresses germline bias in existing antibody language models by focusing on somatic hypermutation patterns, enabling more accurate prediction of amino acid likelihoods and generation of context-aware embeddings for antibody sequences.

GNINA is a molecular docking tool that combines traditional physics-based docking with deep learning CNN scoring for protein-small-molecule complexes. It provides accurate binding predictions with confidence scores, optimized for high-throughput virtual screening.

Humatch is an antibody humanization tool that transforms non-human antibody sequences into humanized variants. Uses three lightweight CNNs to identify optimal human V-genes and generate paired heavy and light chain sequences with minimal edits while maintaining functionality.
ParaSurf is a surface-based deep learning model that predicts paratope binding sites on antibodies. A paratope is the region of an antibody that physically contacts and binds to an antigen—identifying these sites is essential for understanding immune recognition, designing therapeutic antibodies, and developing vaccines.
The model is antigen-agnostic: it only requires the antibody structure, not the antigen. This makes ParaSurf useful for predicting binding sites before experimental characterization or when the target antigen is unknown.
ParaSurf achieves state-of-the-art accuracy on multiple benchmarks, with particularly strong performance on the highly variable CDR3 loops that are critical for antigen specificity. If you don't have an antibody structure yet, you can generate one using ESMFold, Chai-1, or Boltz-2.
Rather than operating on raw atomic coordinates, ParaSurf extracts the solvent-accessible surface of the antibody and samples points across this surface. Each surface point becomes a local prediction target.
For each surface point, the model constructs a voxel grid centered on that point with 1 Å resolution, covering approximately a 20 Å radius. The grid is aligned perpendicular to the surface normal, which reduces sensitivity to arbitrary rotations.
ParaSurf encodes 22 features per voxel, grouped into three categories:
Chemical features (18 channels): Nine atom type classes (C, N, O, S, etc.), hybridization state, valence metrics, partial charge, and SMARTS-based descriptors for hydrophobicity, aromaticity, hydrogen bond donor/acceptor properties, and ring membership.
Electrostatic features (4 channels): Force field values from AMBER and CHARMM, plus atomic radii calculated via PDB2PQR.
Geometric features: Van der Waals surface representation with outward-pointing normal vectors.
The extracted features pass through a hybrid architecture:
Individual surface point predictions are aggregated to residue-level scores using the maximum:
where is the predicted binding probability for surface point belonging to that residue. Residues with scores above 0.5 are classified as binding sites.
Upload a PDB file containing your antibody structure, or fetch one directly from the RCSB PDB by entering the 4-letter code. The structure should include the Fab region (variable and constant domains of heavy and light chains).
Before running ParaSurf, ensure your structure is properly prepared. Use PDB Fixer to add missing atoms, fix non-standard residues, or remove water molecules and ions that may interfere with surface generation.
ParaSurf offers four model variants trained on different benchmark datasets:
Paragraph Expanded: Trained on 1,086 antibody-antigen complexes. We recommend this for general use—it has the largest and most diverse training set.PECAN: Trained on 460 complexes from the PECAN benchmark. Use this if you want predictions consistent with PECAN-based literature.Paragraph - Heavy Chains Only: Predicts binding sites only on the heavy chain. Useful when you're specifically interested in heavy chain contributions.Paragraph - Light Chains Only: Predicts binding sites only on the light chain.Controls the resolution of the molecular surface sampling. Lower values (toward 0.1) generate a coarser mesh with fewer surface points, resulting in faster predictions but less detail. Higher values (toward 1.0) produce a denser mesh with more precise predictions at the cost of longer computation time.
For most antibody structures, the default value provides a good balance between accuracy and speed. Increase the density for detailed analysis of specific binding interfaces.
ParaSurf outputs a PDB file with predicted binding scores encoded in the B-factor column. The interactive viewer colors residues by their predicted binding probability:
The model predicts across three antibody regions with varying accuracy:
| Region | Description | Typical performance |
|---|---|---|
| CDR ± 2 | Complementarity-determining regions plus 2 flanking residues | Highest accuracy |
| Fv | Full variable region (all CDRs and framework regions) | High accuracy |
| Fab | Entire antigen-binding fragment | Good accuracy |
Predictions in the CDR regions, especially CDR-H3 (AUC-ROC ~0.96), tend to be the most reliable since these loops are directly responsible for antigen recognition.
A residue score of 0.7 indicates the model is fairly confident that residue participates in antigen binding. Scores near 0.5 represent uncertainty—these residues may warrant experimental validation.
For therapeutic antibody development, focus on residues with scores above 0.6 as candidates for mutagenesis studies or epitope mapping experiments.
ParaSurf is particularly valuable in several scenarios:
ParaSurf requires the antibody structure to include the Fab region for accurate predictions. Predictions on isolated Fv fragments or single-domain antibodies (nanobodies) may be less reliable.
The model was trained on conventional antibody-antigen complexes. Performance on unusual binding modes (e.g., antibodies that bind through framework regions) has not been extensively validated.
Surface generation requires atomic coordinates—ParaSurf cannot process sequence-only inputs. Generate a structure first using a structure prediction tool if you only have the sequence.