ProteinIQ

fpocket

Identify protein pockets and ligand binding sites with druggability scores.

What is fpocket?

fpocket is an open-source protein pocket detection algorithm that identifies ligand binding sites using Voronoi tessellation and alpha sphere geometry.

The software analyzes protein structures to locate and characterize cavities and clefts where small molecules can bind. The algorithm ranks detected pockets by druggability and provides geometric and physicochemical descriptors.

Traditional binding site detection methods often rely on grid-based approaches or require ligand-bound structures as templates. fpocket instead uses geometric principles to identify pockets directly from protein coordinates, making it applicable to both bound and unbound (apo) protein structures.

In benchmarks on the PocketPicker dataset, fpocket places the true binding site among its top three ranked pockets for 94% of unbound structures (92% for bound structures), and it runs in under 3 seconds per structure.

Applications

  • Virtual screening: Identifying druggable pockets before molecular docking campaigns
  • Cryptic pocket discovery: Detecting transient or induced-fit binding sites in apo structures
  • Protein function annotation: Predicting functionally important cavities in novel protein structures
  • Structure-based design: Guiding where to position ligands for AutoDock Vina or GNINA docking studies

How to use fpocket online

ProteinIQ provides a web interface for running fpocket without command-line installation. You upload a protein structure or enter a PDB ID, and the tool returns ranked pockets with geometric and chemical descriptors.

Inputs

Input | Description
Protein Structure | The target protein for pocket detection. Upload a PDB or mmCIF file, or enter a 4-character PDB ID (e.g., 1HSG) to fetch from RCSB. Maximum file size is 50 MB.

Results

fpocket returns a ranked table of detected pockets with quantitative descriptors.

Column | Description
Pocket | Pocket rank, where 1 represents the most druggable predicted site.
Score | Overall pocket score derived from geometric and physicochemical descriptors. Higher values indicate more favorable binding characteristics.
Drug Score | Druggability score estimating suitability for small molecule binding. Values range from 0 to 1, with values above 0.5 considered druggable.
Volume (Å³) | Pocket volume in cubic angstroms. Typical drug binding sites range from 200–800 Å³.
Apolar SASA | Solvent-accessible surface area of hydrophobic residues in Å². Higher values suggest hydrophobic binding environments.
Polar SASA | Solvent-accessible surface area of polar residues in Å².
Alpha Spheres | Number of alpha spheres comprising the pocket. Larger values indicate more extensive cavities.

Interpreting druggability scores

The druggability score integrates multiple descriptors using a partial least squares model trained on known drug-binding sites:

  • Above 0.7: Highly druggable, comparable to validated pharmaceutical targets
  • 0.5–0.7: Moderately druggable, suitable for fragment-based approaches
  • Below 0.5: Challenging targets requiring specialized design strategies

Scores reflect both geometric properties (pocket depth, volume) and chemical features (hydrophobicity, polarity balance).
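
The tiers above can be applied programmatically when triaging many pockets. The following is a minimal sketch, not part of fpocket itself: the function name `druggability_tier` and the tier labels are hypothetical, and the handling of the exact boundary values (0.5 and 0.7 both fall in the middle tier) is an assumption, since the text does not say which side they belong to.

```python
def druggability_tier(drug_score: float) -> str:
    """Map a druggability score in [0, 1] to one of the three tiers.

    Assumption: scores of exactly 0.5 and 0.7 are placed in the
    middle tier; the source text does not specify the boundaries.
    """
    if not 0.0 <= drug_score <= 1.0:
        raise ValueError("druggability score must lie between 0 and 1")
    if drug_score > 0.7:
        return "highly druggable"
    if drug_score >= 0.5:
        return "moderately druggable"
    return "challenging"


# Example: label a few illustrative (made-up) scores.
for score in (0.85, 0.60, 0.20):
    print(score, druggability_tier(score))
```

A tier label is only a first filter; the volume and SASA columns still need to be checked against the intended ligand class.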

How does fpocket work?

fpocket employs alpha sphere theory based on computational geometry. An alpha sphere is defined as a sphere that touches exactly four protein atoms on its boundary while containing no atoms in its interior. These spheres naturally concentrate in protein cavities and clefts, making them ideal markers for binding site detection.
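
To make the definition concrete, the sketch below computes the sphere passing through four atom centers and checks that no other atom lies inside it. It is an illustration of the geometric definition only, written in pure Python; the function names are hypothetical and this is not how fpocket implements the step (fpocket obtains candidate centers from a Voronoi decomposition rather than testing atom quadruples).

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def circumsphere(p0, p1, p2, p3):
    """Return (center, radius) of the sphere through four non-coplanar points.

    Subtracting |c - p0|^2 = r^2 from |c - pi|^2 = r^2 gives the linear
    system 2 (pi - p0) . c = |pi|^2 - |p0|^2, solved here by Cramer's rule.
    """
    rows, rhs = [], []
    for p in (p1, p2, p3):
        rows.append([2.0 * (p[k] - p0[k]) for k in range(3)])
        rhs.append(sum(p[k] ** 2 - p0[k] ** 2 for k in range(3)))
    d = det3(rows)
    if abs(d) < 1e-12:
        raise ValueError("points are coplanar; no unique sphere")
    center = []
    for col in range(3):
        m = [row[:] for row in rows]
        for r in range(3):
            m[r][col] = rhs[r]
        center.append(det3(m) / d)
    return tuple(center), math.dist(center, p0)


def is_empty(center, radius, other_atoms, tol=1e-9):
    """True if no other atom center lies strictly inside the sphere."""
    return all(math.dist(center, a) >= radius - tol for a in other_atoms)


# Four points on the unit sphere around the origin.
quad = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, 0, 1)]
center, radius = circumsphere(*quad)
print(center, radius)
print(is_empty(center, radius, [(0, 0, -3)]))   # atom outside the sphere
print(is_empty(center, radius, [(0.2, 0, 0)]))  # atom inside the sphere
```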

Voronoi tessellation

The algorithm begins by computing a Voronoi decomposition of 3D space around the protein using the Qhull library. Voronoi vertices—points equidistant from four neighboring atoms—correspond to potential alpha sphere centers. fpocket filters these vertices by radius, discarding spheres too small (tight atomic packing in the protein core) or too large (solvent-exposed surface regions).

The default radius range is 3.0–6.0 Å, optimized for typical small molecule binding sites. This geometric criterion eliminates ~80% of candidate spheres before clustering.
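
The radius filter itself is simple to express. This is a minimal sketch under stated assumptions: candidate spheres are represented as hypothetical `(center, radius)` tuples, and the bounds are treated as inclusive, which the text does not specify.

```python
def filter_by_radius(spheres, r_min=3.0, r_max=6.0):
    """Keep (center, radius) pairs whose radius lies in [r_min, r_max] Å.

    Assumption: both bounds are inclusive.
    """
    return [(c, r) for c, r in spheres if r_min <= r <= r_max]


# Made-up candidates: one too small, one too large, three in range.
candidates = [
    ((0.0, 0.0, 0.0), 1.8),
    ((5.0, 1.0, 2.0), 3.0),
    ((6.5, 1.2, 2.4), 4.2),
    ((9.0, 3.0, 1.0), 6.0),
    ((20.0, 8.0, 4.0), 9.5),
]
kept = filter_by_radius(candidates)
print([r for _, r in kept])
```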

Clustering algorithm

fpocket groups neighboring alpha spheres into pockets using a three-pass clustering procedure:

  1. Rough segmentation: Initial clusters form from spheres within 3.3 Å of each other
  2. Center-of-mass aggregation: Small clusters merge if their centers lie within 4.5 Å
  3. Multiple linkage: Final refinement connects clusters sharing boundary spheres

This hierarchical approach handles irregular pocket geometries better than single-threshold methods. The algorithm leverages Qhull's neighbor lists to avoid pairwise distance calculations, achieving near-linear runtime scaling.
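
The first pass can be illustrated with single-linkage grouping over sphere centers. The sketch below covers only the rough segmentation step, uses a union-find structure, and compares all pairs of centers for simplicity; it does not reproduce the neighbor-list optimization or the two later passes. The function name and the inclusive 3.3 Å cutoff are assumptions.

```python
import math


def rough_segmentation(centers, cutoff=3.3):
    """Group sphere centers into clusters by single linkage at `cutoff` (Å).

    Returns a sorted list of clusters, each a list of indices into `centers`.
    """
    parent = list(range(len(centers)))

    def find(i):
        # Follow parent links to the cluster root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All-pairs comparison: quadratic, acceptable only for a small sketch.
    for i in range(len(centers)):
        for j in range(i + 1, len(centers)):
            if math.dist(centers[i], centers[j]) <= cutoff:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[rj] = ri

    clusters = {}
    for i in range(len(centers)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Three centers chained 3 Å apart and one distant outlier.
centers = [(0, 0, 0), (3, 0, 0), (6, 0, 0), (20, 0, 0)]
print(rough_segmentation(centers))
```

Note that the first and third centers are 6 Å apart yet end up in the same cluster, because single linkage joins them through the middle sphere. This chaining is what lets elongated or irregular pockets stay in one piece.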

Scoring function

Each pocket receives a composite score derived from five weighted descriptors:

Descriptor | Weight | Meaning
Normalized alpha sphere count | 0.3 | Pocket size indicator
Mean local hydrophobic density | 0.25 | Apolar residue concentration
Proportion of apolar spheres | 0.2 | Hydrophobic character
Polarity score | 0.15 | Polar residue presence
Alpha sphere density | 0.1 | Spatial compactness

Weights derive from partial least squares regression against a training set of 48 known ligand-binding sites. The scoring function prioritizes deep, hydrophobic pockets with balanced polarity—characteristics of successful drug targets.
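
As a worked illustration of the weighting, the sketch below combines five descriptor values as a plain weighted sum using the weights from the table. Several things here are assumptions rather than documented behavior: the dictionary keys are hypothetical names, each descriptor is assumed to be pre-scaled to the 0 to 1 range, and the combination is assumed to be linear with no intercept. The real scoring function may scale or combine the terms differently.

```python
# Weights copied from the descriptor table; key names are hypothetical.
WEIGHTS = {
    "normalized_alpha_sphere_count": 0.30,
    "mean_local_hydrophobic_density": 0.25,
    "proportion_apolar_spheres": 0.20,
    "polarity_score": 0.15,
    "alpha_sphere_density": 0.10,
}


def composite_score(descriptors):
    """Weighted sum of descriptor values (assumed pre-scaled to [0, 1])."""
    missing = WEIGHTS.keys() - descriptors.keys()
    if missing:
        raise KeyError(f"missing descriptors: {sorted(missing)}")
    return sum(WEIGHTS[name] * descriptors[name] for name in WEIGHTS)


# Made-up descriptor values for one pocket.
pocket = {
    "normalized_alpha_sphere_count": 0.8,
    "mean_local_hydrophobic_density": 0.6,
    "proportion_apolar_spheres": 0.5,
    "polarity_score": 0.4,
    "alpha_sphere_density": 0.2,
}
# 0.24 + 0.15 + 0.10 + 0.06 + 0.02 = 0.57
print(round(composite_score(pocket), 2))
```

Because the five weights sum to 1.0, a pocket with every descriptor at its maximum would score 1.0 under these assumptions.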

Descriptor calculation

Beyond the scoring function, fpocket computes additional physicochemical properties:

  • Volume: Calculated from the union of alpha spheres using numerical integration
  • SASA decomposition: Solvent-accessible surface area partitioned by residue hydrophobicity
  • Charge: Net electrostatic character from charged residue contributions
  • Residue composition: Count of each amino acid type lining the pocket

These descriptors enable users to assess pockets beyond the druggability score alone.
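
The volume descriptor is the least obvious of these, because overlapping alpha spheres must not be counted twice. One simple form of numerical integration is to sample a regular grid over the bounding box and count the grid cells whose midpoint falls inside at least one sphere. The sketch below takes that approach; it is an assumption about the general technique, not a reproduction of fpocket's own integration scheme, and the function name and `step` parameter are hypothetical.

```python
import math


def union_volume(spheres, step=0.25):
    """Estimate the volume (Å³) of a union of spheres by midpoint grid sampling.

    `spheres` is a list of ((x, y, z), radius) tuples; `step` is the grid
    spacing in Å. Smaller steps are more accurate but cost more time.
    """
    if not spheres:
        return 0.0
    lo = [min(c[k] - r for c, r in spheres) for k in range(3)]
    hi = [max(c[k] + r for c, r in spheres) for k in range(3)]
    counts = [max(1, math.ceil((hi[k] - lo[k]) / step)) for k in range(3)]

    inside = 0
    for ix in range(counts[0]):
        x = lo[0] + (ix + 0.5) * step
        for iy in range(counts[1]):
            y = lo[1] + (iy + 0.5) * step
            for iz in range(counts[2]):
                z = lo[2] + (iz + 0.5) * step
                # A cell counts once, however many spheres cover it.
                if any(
                    (x - c[0]) ** 2 + (y - c[1]) ** 2 + (z - c[2]) ** 2 <= r * r
                    for c, r in spheres
                ):
                    inside += 1
    return inside * step ** 3


single = union_volume([((0.0, 0.0, 0.0), 3.0)])
exact = 4.0 / 3.0 * math.pi * 3.0 ** 3
print(single, exact)
```

Passing the same sphere twice returns the same estimate as passing it once, which is the property that distinguishes a union volume from a naive sum of sphere volumes.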

Performance characteristics

Benchmark studies on the PocketPicker dataset (48 diverse proteins) demonstrate:

  • Bound structures: 83% rank-1 accuracy, 92% rank-3 accuracy
  • Unbound structures: 69% rank-1 accuracy, 94% rank-3 accuracy
  • Speed: Under 3 seconds per structure on a single CPU core

On the Astex Diverse set (85 high-quality pharmaceutical complexes):

  • Rank-1 detection: 67–73% depending on pocket size
  • Rank-3 detection: 82–88%

The algorithm outperforms CAST, PASS, SURFNET, and LIGSITE at rank-3 while executing 10–100× faster than grid-based competitors. This speed advantage enables proteome-scale screening applications.

Limitations

  • Shallow pockets: Surface grooves lacking depth may score poorly despite biological relevance
  • Induced-fit sites: Pockets requiring significant backbone rearrangement upon ligand binding may remain undetected in apo structures
  • Allosteric sites: Cryptic pockets distant from the protein surface often fall below detection thresholds
  • Protein-protein interfaces: Extended, relatively flat interaction surfaces score lower than compact small molecule binding sites

Related tools

  • PocketFlow: AI-powered molecular generation within detected binding pockets
  • AutoDock Vina: Molecular docking to evaluate ligand binding in fpocket-identified sites
  • GNINA: Deep learning docking with improved scoring for ranked pockets