What is fpocket?
fpocket is an open-source protein pocket detection algorithm that identifies ligand binding sites using Voronoi tessellation and alpha sphere geometry.
The software analyzes protein structures to locate and characterize cavities and clefts where small molecules can bind. The algorithm ranks detected pockets by druggability and provides geometric and physicochemical descriptors.
Traditional binding site detection methods often rely on grid-based approaches or require ligand-bound structures as templates. fpocket instead uses geometric principles to identify pockets directly from protein coordinates, making it applicable to both bound and unbound (apo) protein structures.
On the PocketPicker benchmark, the algorithm places the known binding site among its top three ranked pockets for 92% of bound and 94% of unbound structures, and it runs in under 3 seconds per structure.
Applications
- Virtual screening: Identifying druggable pockets before molecular docking campaigns
- Cryptic pocket discovery: Screening apo structures for transient or induced-fit binding sites that are at least partially formed (fully closed sites may be missed; see Limitations)
- Protein function annotation: Predicting functionally important cavities in novel protein structures
- Structure-based design: Guiding where to position ligands for AutoDock Vina or GNINA docking studies
How to use fpocket online
ProteinIQ provides a web interface for running fpocket without command-line installation. You upload a protein structure or enter a PDB ID, and the tool returns ranked pockets with geometric and chemical descriptors.
Inputs
| Input | Description |
|---|---|
| Protein Structure | The target protein for pocket detection. Upload a PDB or mmCIF file, or enter a 4-character PDB ID (e.g., 1HSG) to fetch from RCSB. Maximum file size is 50 MB. |
Results
fpocket returns a ranked table of detected pockets with quantitative descriptors.
| Column | Description |
|---|---|
| Pocket | Pocket rank, where 1 represents the most druggable predicted site. |
| Score | Overall pocket score derived from geometric and physicochemical descriptors. Higher values indicate more favorable binding characteristics. |
| Drug Score | Druggability score estimating suitability for small molecule binding. Values range from 0 to 1, with values above 0.5 considered druggable. |
| Volume (Å³) | Pocket volume in cubic angstroms. Typical drug binding sites range from 200–800 Å³. |
| Apolar SASA | Solvent-accessible surface area of hydrophobic residues in Å². Higher values suggest hydrophobic binding environments. |
| Polar SASA | Solvent-accessible surface area of polar residues in Å². |
| Alpha Spheres | Number of alpha spheres comprising the pocket. Larger values indicate more extensive cavities. |
Interpreting druggability scores
The druggability score integrates multiple descriptors using a partial least squares model trained on known drug-binding sites:
- Above 0.7: Highly druggable, comparable to validated pharmaceutical targets
- 0.5–0.7: Moderately druggable, suitable for fragment-based approaches
- Below 0.5: Challenging targets requiring specialized design strategies
Scores reflect both geometric properties (pocket depth, volume) and chemical features (hydrophobicity, polarity balance).
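The bands above can be applied to a results table with a few lines of code. The following is a minimal Python sketch, not part of fpocket or ProteinIQ: the function name `druggability_band` and the example rows are hypothetical, and treating a score of exactly 0.5 or 0.7 as "moderately druggable" is an assumption about how the boundaries are handled.

```python
def druggability_band(drug_score: float) -> str:
    """Map a druggability score in [0, 1] to the bands listed above."""
    if not 0.0 <= drug_score <= 1.0:
        raise ValueError("drug score is expected to lie between 0 and 1")
    if drug_score > 0.7:
        return "highly druggable"
    if drug_score >= 0.5:  # assumption: both boundary values count as moderate
        return "moderately druggable"
    return "challenging"


# Hypothetical (pocket rank, drug score) rows, not real fpocket output.
pockets = [(1, 0.82), (2, 0.55), (3, 0.12)]
for rank, score in pockets:
    print(rank, score, druggability_band(score))
```

A banding like this is only a triage aid; the geometric and chemical descriptors in the results table remain the basis for any closer comparison between pockets.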
How does fpocket work?
fpocket employs alpha sphere theory based on computational geometry. An alpha sphere is defined as a sphere that touches exactly four protein atoms on its boundary while containing no atoms in its interior. These spheres naturally concentrate in protein cavities and clefts, making them ideal markers for binding site detection.
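To make the definition concrete, the sketch below computes the sphere passing through four atom positions and checks that no other atom lies inside it. It is an illustration in plain Python rather than fpocket's own code: atoms are treated as points, and the helper names `circumsphere` and `is_alpha_sphere` are hypothetical.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def circumsphere(p0, p1, p2, p3):
    """Return (center, radius) of the sphere through four non-coplanar points."""
    others = (p1, p2, p3)
    # Equal distance to p0 and p_i gives 2 (p_i - p0) . c = |p_i|^2 - |p0|^2.
    rows = [[2.0 * (p[k] - p0[k]) for k in range(3)] for p in others]
    rhs = [sum(p[k] ** 2 - p0[k] ** 2 for k in range(3)) for p in others]
    d = det3(rows)
    if abs(d) < 1e-12:
        raise ValueError("points are coplanar; no unique sphere exists")
    center = []
    for col in range(3):  # Cramer's rule, one coordinate at a time
        m = [row[:] for row in rows]
        for r in range(3):
            m[r][col] = rhs[r]
        center.append(det3(m) / d)
    return tuple(center), math.dist(center, p0)


def is_alpha_sphere(center, radius, other_atoms, tol=1e-9):
    """True if none of the other atoms lies strictly inside the sphere."""
    return all(math.dist(center, a) >= radius - tol for a in other_atoms)


center, radius = circumsphere((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, 0, 1))
print(radius, is_alpha_sphere(center, radius, [(3, 0, 0)]))
```

Testing every quadruple of atoms this way would be far too slow for a whole protein, which is why the candidate centers are taken from a Voronoi decomposition instead.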
Voronoi tessellation
The algorithm begins by computing a Voronoi decomposition of 3D space around the protein using the Qhull library. Voronoi vertices—points equidistant from four neighboring atoms—correspond to potential alpha sphere centers. fpocket filters these vertices by radius, discarding spheres too small (tight atomic packing in the protein core) or too large (solvent-exposed surface regions).
The default radius range is 3.0–6.0 Å, optimized for typical small molecule binding sites. This geometric criterion eliminates ~80% of candidate spheres before clustering.
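The radius filter itself is a simple range test. The sketch below assumes candidate spheres are already available as `(center, radius)` pairs; the function name, the inclusive bounds, and the example values are illustrative assumptions, with only the 3.0 and 6.0 Å limits taken from the text.

```python
MIN_RADIUS = 3.0  # Å, lower bound quoted above
MAX_RADIUS = 6.0  # Å, upper bound quoted above


def filter_alpha_spheres(spheres, r_min=MIN_RADIUS, r_max=MAX_RADIUS):
    """Keep (center, radius) pairs whose radius lies within [r_min, r_max]."""
    return [(center, radius) for center, radius in spheres
            if r_min <= radius <= r_max]


# Hypothetical candidates: one core-like, two pocket-like, one surface-like.
candidates = [
    ((0.0, 0.0, 0.0), 1.2),
    ((5.0, 1.0, 2.0), 3.5),
    ((9.0, 4.0, 4.0), 5.9),
    ((30.0, 0.0, 0.0), 8.0),
]
kept = filter_alpha_spheres(candidates)
print(len(kept), "of", len(candidates), "candidates kept")
```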
Clustering algorithm
fpocket groups neighboring alpha spheres into pockets using a three-pass clustering procedure:
- Rough segmentation: Initial clusters form from spheres within 3.3 Å of each other
- Center-of-mass aggregation: Small clusters merge if their centers lie within 4.5 Å
- Multiple linkage: Final refinement connects clusters sharing boundary spheres
This hierarchical approach handles irregular pocket geometries better than single-threshold methods. The algorithm leverages Qhull's neighbor lists to avoid pairwise distance calculations, achieving near-linear runtime scaling.
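The first two passes can be sketched as two rounds of single-linkage grouping, first on sphere centers and then on cluster centers of mass. This is a simplified illustration under several assumptions: it uses naive pairwise distances instead of Qhull neighbor lists, it merges clusters of any size in the second pass, and it omits the multiple-linkage pass entirely. The function names are hypothetical.

```python
import math
from itertools import combinations


def single_linkage(points, cutoff):
    """Group point indices so that points within `cutoff` share a group."""
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        if math.dist(points[i], points[j]) <= cutoff:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


def centroid(points):
    """Unweighted center of mass of a list of 3D points."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(3))


def cluster_spheres(centers, rough_cutoff=3.3, com_cutoff=4.5):
    """Pass 1: link nearby spheres. Pass 2: merge clusters with close centroids."""
    rough = single_linkage(centers, rough_cutoff)
    coms = [centroid([centers[i] for i in group]) for group in rough]
    merged = single_linkage(coms, com_cutoff)
    return [sorted(i for g in groups for i in rough[g]) for groups in merged]


# Hypothetical sphere centers in Å.
centers = [(0, 0, 0), (4, 0, 0), (20, 0, 0), (22, 0, 0)]
print(cluster_spheres(centers))
```

In this example the first two spheres are 4 Å apart, so they stay separate after the first pass and are joined only by the center-of-mass pass.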
Scoring function
Each pocket receives a composite score derived from five weighted descriptors:
| Descriptor | Weight | Meaning |
|---|---|---|
| Normalized alpha sphere count | 0.3 | Pocket size indicator |
| Mean local hydrophobic density | 0.25 | Apolar residue concentration |
| Proportion of apolar spheres | 0.2 | Hydrophobic character |
| Polarity score | 0.15 | Polar residue presence |
| Alpha sphere density | 0.1 | Spatial compactness |
Weights derive from partial least squares regression against a training set of 48 known ligand-binding sites. The scoring function prioritizes deep, hydrophobic pockets with balanced polarity—characteristics of successful drug targets.
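As a rough illustration of how such weights combine, the sketch below computes a plain weighted sum using the values from the table. Treating the score as a simple linear combination of descriptors that are each already scaled to the range 0 to 1 is an assumption made for the example; the dictionary keys and the example pocket are hypothetical, and fpocket's actual normalization may differ.

```python
# Weights copied from the table above; the key names are illustrative.
WEIGHTS = {
    "normalized_alpha_sphere_count": 0.30,
    "mean_local_hydrophobic_density": 0.25,
    "proportion_apolar_spheres": 0.20,
    "polarity_score": 0.15,
    "alpha_sphere_density": 0.10,
}


def composite_score(descriptors):
    """Weighted sum of descriptors, each assumed to be scaled to [0, 1]."""
    missing = WEIGHTS.keys() - descriptors.keys()
    if missing:
        raise KeyError(f"missing descriptors: {sorted(missing)}")
    return sum(weight * descriptors[name] for name, weight in WEIGHTS.items())


# Hypothetical descriptor values for a single pocket.
pocket = {
    "normalized_alpha_sphere_count": 0.8,
    "mean_local_hydrophobic_density": 0.6,
    "proportion_apolar_spheres": 0.5,
    "polarity_score": 0.4,
    "alpha_sphere_density": 0.2,
}
print(round(composite_score(pocket), 3))
```

Because the five weights sum to 1, a pocket that scores 1 on every descriptor receives a composite score of 1 under these assumptions.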
Descriptor calculation
Beyond the scoring function, fpocket computes additional physicochemical properties:
- Volume: Calculated from the union of alpha spheres using numerical integration
- SASA decomposition: Solvent-accessible surface area partitioned by residue hydrophobicity
- Charge: Net electrostatic character from charged residue contributions
- Residue composition: Count of each amino acid type lining the pocket
These descriptors enable users to assess pockets beyond the druggability score alone.
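The volume descriptor is the least obvious of these, because overlapping spheres must not be counted twice. The sketch below estimates the volume of a union of spheres by counting grid cells whose centers fall inside at least one sphere. The regular grid, the 0.25 Å step, and the function name are assumptions chosen for illustration; the integration scheme fpocket actually uses may differ.

```python
import math


def union_volume(spheres, step=0.25):
    """Estimate the volume (Å³) of a union of (center, radius) spheres."""
    lo = [min(c[k] - r for c, r in spheres) for k in range(3)]
    hi = [max(c[k] + r for c, r in spheres) for k in range(3)]
    counts = [max(1, math.ceil((hi[k] - lo[k]) / step)) for k in range(3)]

    inside = 0
    for i in range(counts[0]):
        x = lo[0] + (i + 0.5) * step  # sample at the cell center
        for j in range(counts[1]):
            y = lo[1] + (j + 0.5) * step
            for k in range(counts[2]):
                z = lo[2] + (k + 0.5) * step
                # A cell counts once, however many spheres cover it.
                if any((x - c[0]) ** 2 + (y - c[1]) ** 2 + (z - c[2]) ** 2 <= r * r
                       for c, r in spheres):
                    inside += 1
    return inside * step ** 3


single = [((0.0, 0.0, 0.0), 3.0)]
print(union_volume(single), 4.0 / 3.0 * math.pi * 3.0 ** 3)
```

A smaller step improves accuracy at a cubic cost in runtime, so the step size is a trade-off rather than a fixed constant.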
Performance characteristics
Benchmark studies on the PocketPicker dataset (48 diverse proteins) demonstrate:
- Bound structures: 83% rank-1 accuracy, 92% rank-3 accuracy
- Unbound structures: 69% rank-1 accuracy, 94% rank-3 accuracy
- Speed: Under 3 seconds per structure on a single CPU core
On the Astex Diverse set (85 high-quality pharmaceutical complexes):
- Rank-1 detection: 67–73% depending on pocket size
- Rank-3 detection: 82–88%
The algorithm outperforms CAST, PASS, SURFNET, and LIGSITE at rank-3 while executing 10–100× faster than grid-based competitors. This speed advantage enables proteome-scale screening applications.
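For readers unfamiliar with the metric, rank-N accuracy is the fraction of benchmark structures whose known binding site appears among the top N ranked pockets. The sketch below shows the calculation on made-up data; the function name and the ranks are hypothetical and do not reproduce any benchmark listed above.

```python
def top_n_success_rate(true_site_ranks, n):
    """Fraction of structures whose known site is ranked within the top n.

    `true_site_ranks` holds, per structure, the rank of the pocket that
    covers the known binding site, or None if no pocket covers it.
    """
    if not true_site_ranks:
        raise ValueError("at least one structure is required")
    hits = sum(1 for rank in true_site_ranks if rank is not None and rank <= n)
    return hits / len(true_site_ranks)


# Hypothetical ranks for eight structures.
ranks = [1, 1, 2, 4, None, 3, 1, 2]
print(top_n_success_rate(ranks, 1), top_n_success_rate(ranks, 3))
```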
Limitations
- Shallow pockets: Surface grooves lacking depth may score poorly despite biological relevance
- Induced-fit sites: Pockets requiring significant backbone rearrangement upon ligand binding may remain undetected in apo structures
- Allosteric sites: Cryptic pockets distant from the protein surface often fall below detection thresholds
- Protein-protein interfaces: Extended, relatively flat interaction surfaces score lower than compact small molecule binding sites
Related tools
- PocketFlow: AI-powered molecular generation within detected binding pockets
- AutoDock Vina: Molecular docking to evaluate ligand binding in fpocket-identified sites
- GNINA: Deep learning docking with improved scoring for ranked pockets
