PPAP (Protein-Protein Affinity Predictor) is a deep learning model that predicts binding affinity between protein chains in a complex. Given a PDB structure with two or more protein chains, it estimates the free energy of binding (ΔG) and dissociation constant (Kd) for each pair of interacting chains.
The model combines structural features extracted from the protein interface with sequence representations from ESM2-3B, a large protein language model. An interfacial contact-aware attention mechanism focuses on residue pairs at the binding interface, where interaction strength is primarily determined.
PPAP is designed specifically for protein-protein interactions. For small molecule binding, see molecular docking tools like AutoDock Vina or GNINA.
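PPAP integrates two complementary information sources: structural features extracted from the protein interface, and sequence representations from the ESM2-3B protein language model.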
The interfacial contact-aware attention mechanism weights residue pairs by their relevance to binding, prioritizing contacts at the interface over distant residues. This architecture allows PPAP to focus on the regions most critical for affinity determination.
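One way to picture contact-aware weighting is a distance-based mask applied before a softmax over residue pairs. The sketch below is illustrative only and is not PPAP's implementation: the function names, the 8 Å cutoff, the use of one coordinate per residue, and the hard (all-or-nothing) mask are all assumptions made for the example.

```python
import math


def contact_mask(coords_a, coords_b, cutoff=8.0):
    """Flag residue pairs (i in chain A, j in chain B) closer than `cutoff` angstroms.

    Each coordinate is an (x, y, z) tuple, e.g. one representative atom per residue.
    The 8 A cutoff is an assumed value, not one taken from PPAP.
    """
    return [[math.dist(a, b) <= cutoff for b in coords_b] for a in coords_a]


def contact_aware_weights(scores, mask):
    """Softmax over residue-pair scores, restricted to pairs flagged as contacts.

    Non-contact pairs receive weight 0. A real model might down-weight them
    softly instead of excluding them outright.
    """
    kept = [(i, j) for i, row in enumerate(mask) for j, flag in enumerate(row) if flag]
    weights = [[0.0] * len(row) for row in mask]
    if not kept:
        return weights
    top = max(scores[i][j] for i, j in kept)  # subtract the max for numerical stability
    exps = {(i, j): math.exp(scores[i][j] - top) for i, j in kept}
    total = sum(exps.values())
    for (i, j), value in exps.items():
        weights[i][j] = value / total
    return weights
```

With this masking, the weights of all contacting pairs sum to 1 and distant pairs contribute nothing, which mirrors the idea of concentrating attention on the interface.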
On benchmark datasets, PPAP achieved a Pearson correlation of 0.63 on an external test set, outperforming sequence-only language models and comparable structure-based approaches.
ProteinIQ provides GPU-accelerated PPAP predictions directly in the browser. No software installation or command-line experience is needed.
| Input | Description |
|---|---|
| Protein Complex Structure | PDB file containing at least 2 protein chains. Upload a file or fetch directly from RCSB using a PDB ID (e.g., 1BRS). |
The structure must contain multiple protein chains. Ligands, small molecules, and single-chain structures are not supported.
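Before uploading, it can help to check locally that a file meets the multi-chain requirement. The following is a minimal sketch, assuming a standard fixed-column PDB file in which the chain ID sits in column 22 of each ATOM record; the function names are hypothetical and this is not the check ProteinIQ itself performs. Note that ATOM records also cover nucleic acids, so this counts polymer chains rather than strictly protein chains.

```python
def polymer_chain_ids(pdb_text: str) -> list[str]:
    """Return chain IDs found in ATOM records, in order of first appearance.

    HETATM records (ligands, waters, ions) are ignored on purpose.
    """
    chains: list[str] = []
    for line in pdb_text.splitlines():
        # Column 22 of the PDB format is index 21 in a zero-based string.
        if line.startswith("ATOM") and len(line) > 21:
            chain = line[21]
            if chain.strip() and chain not in chains:
                chains.append(chain)
    return chains


def has_multiple_chains(pdb_text: str) -> bool:
    """True if the structure contains at least two chains with ATOM records."""
    return len(polymer_chain_ids(pdb_text)) >= 2
```

A structure that fails this check (for example, a single chain plus a HETATM ligand) would not be a suitable input.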
| Setting | Description |
|---|---|
| Analyze all chain pairs | When enabled (default), PPAP calculates affinity for every possible pair of chains in the structure. |
| Chain pairs to analyze | Manually specify which pairs to analyze. Format: A_B, A_C where the first chain is the receptor and the second is the ligand. For multi-chain partners, concatenate IDs: HL_Y means chains H+L together binding to chain Y. |
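The chain-pair format can be illustrated with a small parser. This is a sketch under stated assumptions rather than ProteinIQ's own parsing code: the function names are hypothetical, chain IDs are assumed to be single characters (as in the PDB format), and "every possible pair" is assumed to mean each unordered pair once.

```python
from itertools import combinations


def parse_chain_pairs(spec: str) -> list[tuple[list[str], list[str]]]:
    """Parse a spec such as "A_B, HL_Y" into (receptor chains, ligand chains) tuples."""
    pairs = []
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        parts = item.split("_")
        if len(parts) != 2 or not all(parts):
            raise ValueError(f"Expected RECEPTOR_LIGAND, got {item!r}")
        receptor, ligand = parts
        # Concatenated IDs such as "HL" are split into individual chains.
        pairs.append((list(receptor), list(ligand)))
    return pairs


def all_chain_pairs(chain_ids: list[str]) -> list[str]:
    """Enumerate each unordered pair of chains once, in the A_B notation."""
    return [f"{a}_{b}" for a, b in combinations(chain_ids, 2)]
```

For example, `parse_chain_pairs("HL_Y")` treats chains H and L together as the receptor and chain Y as the ligand, while `all_chain_pairs(["A", "B", "C"])` lists the three pairs that the default setting would cover under the assumption above.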
Results are returned as a table with one row per chain pair:
| Column | Description |
|---|---|
| Chain Pair | The analyzed receptor-ligand pair (e.g., A_B). |
| ΔG (kcal/mol) | Predicted Gibbs free energy of binding. More negative values indicate stronger binding. |
| Kd | Predicted dissociation constant, derived from ΔG. Lower Kd indicates tighter binding. |
The binding free energy describes how favorable the interaction is thermodynamically. At physiological conditions:
| ΔG (kcal/mol) | Binding strength |
|---|---|
| < −12 | Very strong (sub-nanomolar) |
| −10 to −12 | Strong (low nanomolar) |
| −7 to −10 | Moderate (nanomolar to micromolar) |
| −5 to −7 | Weak (micromolar) |
| > −5 | Very weak or non-binding |
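When post-processing many predictions, the ranges in this table can be applied programmatically. The helper below is a hypothetical convenience function, not part of PPAP; how values exactly on a boundary (for example, exactly −12) are assigned is an assumption, since the table does not say.

```python
def binding_strength(dg_kcal_per_mol: float) -> str:
    """Map a predicted ΔG (kcal/mol) to the qualitative label from the table.

    Boundary values fall into the weaker category (an assumed convention).
    """
    if dg_kcal_per_mol < -12:
        return "Very strong (sub-nanomolar)"
    if dg_kcal_per_mol < -10:
        return "Strong (low nanomolar)"
    if dg_kcal_per_mol < -7:
        return "Moderate (nanomolar to micromolar)"
    if dg_kcal_per_mol < -5:
        return "Weak (micromolar)"
    return "Very weak or non-binding"
```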
Kd is the free concentration of one binding partner at which half of the other partner is bound at equilibrium. Lower Kd means tighter binding:
| Kd range | Interpretation | Typical examples |
|---|---|---|
| pM (10⁻¹²) | Extremely tight | High-affinity antibodies |
| nM (10⁻⁹) | Strong | Therapeutic antibodies, enzyme inhibitors |
| μM (10⁻⁶) | Moderate | Transient signaling interactions |
| mM (10⁻³) | Weak | Non-specific or transient contacts |
The relationship between ΔG and Kd follows $\Delta G = RT \ln K_d$, where R is the gas constant, T is the absolute temperature, and Kd is expressed in mol/L.
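The standard thermodynamic relation ΔG = RT ln(Kd), with Kd in mol/L, makes it easy to convert between the two output columns. The sketch below assumes a temperature of 298.15 K; the temperature PPAP uses internally is not stated here, so converted values may differ slightly from the reported Kd. Function names are illustrative.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal/(mol*K)


def kd_from_delta_g(delta_g: float, temperature_k: float = 298.15) -> float:
    """Convert ΔG (kcal/mol) to Kd (mol/L) via Kd = exp(ΔG / RT)."""
    return math.exp(delta_g / (R_KCAL_PER_MOL_K * temperature_k))


def delta_g_from_kd(kd_molar: float, temperature_k: float = 298.15) -> float:
    """Convert Kd (mol/L) to ΔG (kcal/mol) via ΔG = RT ln(Kd)."""
    return R_KCAL_PER_MOL_K * temperature_k * math.log(kd_molar)
```

At the assumed temperature, a ΔG of −12 kcal/mol corresponds to a Kd of roughly 1.6 nM, and each additional −1.36 kcal/mol lowers Kd by about a factor of ten.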
Predicted affinities are estimates, not experimental measurements. Use them for:
Experimental validation with surface plasmon resonance (SPR), isothermal titration calorimetry (ITC), or similar techniques is recommended for quantitative applications.