What is AutoDock Vina?

AutoDock Vina is one of the most widely used molecular docking programs in computational drug discovery. It predicts how small molecules bind to protein targets and estimates their binding affinity in kcal/mol.

Vina achieves an approximately 100-fold speedup over AutoDock 4 while also improving binding-mode prediction accuracy. It combines a machine learning-optimized scoring function with an efficient global search algorithm, which has made it a standard choice for structure-based virtual screening.

For ligand preparation and analysis, consider using Lipinski's Rule of Five to assess drug-likeness or ADMET-AI for comprehensive pharmacokinetic predictions before docking.

How does AutoDock Vina work?

Vina pairs its scoring function with a gradient-guided global search, both described below. The authors characterize the scoring approach as "more of 'machine learning' than directly physics-based in its nature," justified by empirical performance rather than theoretical assumptions.

Scoring function

The scoring function evaluates protein-ligand interactions through several components:

Steric interactions: Gaussian functions model attractive van der Waals forces, while a repulsion term prevents atomic clashes
Hydrophobic interactions: Favorable contacts between hydrophobic atoms on protein and ligand
Hydrogen bonding: Directional interactions between donors and acceptors
Rotatable bond penalty: Accounts for entropy loss upon ligand binding

The predicted binding affinity (kcal/mol) is calculated from the intermolecular portion of the lowest-scoring conformation, combined with a torsional penalty based on the number of rotatable bonds.
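The torsional penalty can be illustrated with a short sketch. It assumes the form given in the original Vina paper (Trott and Olson, 2010), where the intermolecular score is divided by a factor that grows with the rotatable bond count, using a weight of 0.0585. The function and variable names are illustrative and not part of Vina's API.

```python
# Weight on the rotatable-bond count as reported in the original Vina paper.
# Treated as an assumption here; check it against the version you run.
W_ROT = 0.0585


def predicted_affinity(inter_score: float, n_rot: int, w_rot: float = W_ROT) -> float:
    """Scale an intermolecular score (kcal/mol) by the torsional penalty.

    affinity = inter_score / (1 + w_rot * n_rot)
    """
    if n_rot < 0:
        raise ValueError("n_rot must be non-negative")
    return inter_score / (1.0 + w_rot * n_rot)


if __name__ == "__main__":
    # The same intermolecular score becomes less negative as flexibility grows.
    for n in (0, 5, 10):
        print(n, round(predicted_affinity(-10.0, n), 2))
```

Under this form, two ligands with identical intermolecular scores are ranked differently if one has more rotatable bonds, which is how the entropy loss enters the final number.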

Search algorithm

Vina implements Iterated Local Search with the BFGS (Broyden-Fletcher-Goldfarb-Shanno) quasi-Newton method for local optimization. Unlike earlier genetic algorithm approaches, BFGS uses scoring function gradients to efficiently navigate the conformational landscape.

The global search uses random mutations of position, orientation, and torsion angles, with a Metropolis acceptance criterion to balance exploration and exploitation. Multiple independent runs (controlled by exhaustiveness) start from randomized positions to improve coverage of the search space.
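The mutate, optimize, accept loop described above can be sketched in a few lines. This is a toy version for intuition only, not Vina's implementation: the `score`, `mutate`, and `local_opt` callables are hypothetical stand-ins for the scoring function, the random perturbation, and the BFGS step, and the temperature value is arbitrary.

```python
import math
import random


def metropolis_accept(e_old: float, e_new: float, temperature: float,
                      rng: random.Random) -> bool:
    """Accept improvements always; accept worse candidates with
    probability exp(-(e_new - e_old) / temperature)."""
    if e_new <= e_old:
        return True
    return rng.random() < math.exp(-(e_new - e_old) / temperature)


def iterated_local_search(score, mutate, local_opt, start, steps,
                          temperature=1.0, seed=0):
    """Toy iterated local search: mutate, locally optimize, Metropolis-accept."""
    rng = random.Random(seed)
    current = local_opt(start)
    best = current
    for _ in range(steps):
        candidate = local_opt(mutate(current, rng))
        if metropolis_accept(score(current), score(candidate), temperature, rng):
            current = candidate
        # Track the best state seen, independent of the current walker position.
        if score(current) < score(best):
            best = current
    return best
```

Occasionally accepting a worse candidate is what lets the search escape local minima; a lower temperature makes the walk greedier.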

Multithreading

Vina supports parallel execution on multi-core processors. Independent docking runs are distributed across available CPU cores, with benchmarks showing 7.25x speedup on 8-core systems compared to single-threaded execution.
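To put the 7.25x figure in context, the sketch below computes the parallel efficiency and, assuming Amdahl's law is a reasonable model for this workload (an assumption, not something the benchmark states), the implied serial fraction. The function names are illustrative.

```python
def parallel_efficiency(speedup: float, cores: int) -> float:
    """Fraction of ideal linear scaling that was achieved."""
    return speedup / cores


def amdahl_serial_fraction(speedup: float, cores: int) -> float:
    """Solve Amdahl's law, S = 1 / (s + (1 - s) / N), for the serial fraction s."""
    if cores < 2:
        raise ValueError("need at least 2 cores to estimate a serial fraction")
    return (cores / speedup - 1.0) / (cores - 1.0)


if __name__ == "__main__":
    print(f"efficiency: {parallel_efficiency(7.25, 8):.1%}")
    print(f"implied serial fraction: {amdahl_serial_fraction(7.25, 8):.2%}")
```

An efficiency of roughly 91% corresponds to a serial fraction of about 1.5% under this model.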

Input requirements

Protein (Receptor): PDB file or RCSB PDB ID. The protein should be properly protonated with missing residues fixed. Use PDB Fixer for automated preparation.

Ligand: SMILES string (simplest for small molecules), SDF, MOL, or PDBQT file. For peptides and complex molecules (>150 atoms), PDBQT format is recommended as it bypasses automatic conversion.

Scoring functions

AutoDock Vina 1.2 supports three scoring functions optimized for different use cases:

Vina (default)

The original Vina scoring function offers the best balance of speed and accuracy for most applications. We recommend this for general-purpose docking and virtual screening.

Vinardo

Vinardo (Vina RaDii Optimized) was trained on curated datasets using a novel approach that optimizes docking performance rather than just binding affinity correlation. It removes the long-range attraction term and doubles the contribution of hydrophobic interactions compared to Vina.

Use Vinardo when you need improved ranking of compounds in virtual screening campaigns.

AutoDock4

The classical AutoDock4 force field is better suited for metalloproteins and systems with metal coordination. It takes a more physics-based approach than the empirical Vina function.

Use AutoDock4 when docking to zinc-containing proteins, heme groups, or other metalloenzymes.
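The guidance above can be condensed into a small helper. This is only a sketch of the decision rule as described here; the function and its two flags are hypothetical, and the returned strings assume the names accepted by the `--scoring` option of the Vina 1.2 command-line program.

```python
def choose_scoring(has_metal_site: bool = False, ranking_screen: bool = False) -> str:
    """Pick a scoring function name following the recommendations above.

    Metal coordination takes priority, then compound ranking in a screen,
    otherwise the default Vina function is used.
    """
    if has_metal_site:
        return "ad4"
    if ranking_screen:
        return "vinardo"
    return "vina"


if __name__ == "__main__":
    print(choose_scoring())                     # general-purpose docking
    print(choose_scoring(ranking_screen=True))  # virtual screening ranking
    print(choose_scoring(has_metal_site=True))  # zinc or heme binding site
```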

Docking parameters

Core settings

Exhaustiveness: Number of independent docking runs. Higher values increase search thoroughness and reduce the chance of missing the optimal binding mode. Runtime scales linearly with this value. Use 16-32 for publication-quality results.
Number of poses: Maximum binding poses to generate (1-20). More poses provide broader coverage of the binding landscape.
Energy range: Maximum energy difference (kcal/mol) from the best pose for inclusion in results. Only poses within this range are returned.
Min RMSD between poses: Minimum structural difference between generated poses. Lower values produce more similar poses; higher values ensure diverse binding modes.
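If you run Vina locally, these four settings map onto keys in its configuration file. The sketch below renders them as `key = value` lines. The class is hypothetical, and the defaults for energy range (3 kcal/mol) and min RMSD (1 Å) are assumed from the Vina 1.2 command-line defaults rather than stated in this guide.

```python
from dataclasses import asdict, dataclass


@dataclass
class CoreSettings:
    """Core docking settings; defaults are assumed Vina 1.2 CLI defaults."""
    exhaustiveness: int = 8
    num_modes: int = 9
    energy_range: float = 3.0
    min_rmsd: float = 1.0

    def validate(self) -> None:
        if self.exhaustiveness < 1:
            raise ValueError("exhaustiveness must be at least 1")
        if not 1 <= self.num_modes <= 20:
            raise ValueError("num_modes must be between 1 and 20")
        if self.energy_range <= 0 or self.min_rmsd < 0:
            raise ValueError("energy_range must be positive and min_rmsd non-negative")

    def to_config(self) -> str:
        """Render the settings as 'key = value' lines for a config file."""
        self.validate()
        return "\n".join(f"{key} = {value}" for key, value in asdict(self).items())


if __name__ == "__main__":
    print(CoreSettings(exhaustiveness=16).to_config())
```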

Search space

The search space defines a 3D box where the ligand can bind.

Auto mode calculates a box covering the entire protein surface. This works well when you don't know the binding site but increases computation time.

Manual mode lets you specify exact center coordinates and dimensions. We recommend keeping the box under 30×30×30 Å unless you also increase exhaustiveness proportionally.
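One way to read "proportionally" is to scale exhaustiveness with box volume relative to the 30×30×30 Å reference. The helper below is a heuristic sketch under that assumption; Vina does not apply any such scaling itself, and the function names are illustrative.

```python
import math
from typing import Tuple

REFERENCE_VOLUME = 30.0 ** 3  # volume of a 30 x 30 x 30 Å box, in Å^3


def box_volume(size: Tuple[float, float, float]) -> float:
    """Volume of the search box in cubic ångströms."""
    sx, sy, sz = size
    return sx * sy * sz


def scaled_exhaustiveness(size: Tuple[float, float, float], base: int = 8) -> int:
    """Scale exhaustiveness with box volume, never going below the base value."""
    ratio = box_volume(size) / REFERENCE_VOLUME
    return max(base, math.ceil(base * ratio))


if __name__ == "__main__":
    for size in ((20.0, 20.0, 20.0), (30.0, 30.0, 30.0), (60.0, 30.0, 30.0)):
        print(size, scaled_exhaustiveness(size))
```

Doubling one edge of the reference box doubles the volume, so the suggested exhaustiveness doubles from 8 to 16.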

Flexible residues

Flexible docking allows specified protein residues to move during docking, modeling induced-fit binding. Enter residues in the format Chain:ResidueName+Number, separated by commas (e.g., A:ARG120,A:TYR135).
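A quick way to check a residue list before submitting is to parse it. The sketch below assumes a single-character chain ID and a three-letter residue name, as in the example above; the parser and its names are illustrative and not part of any Vina tooling.

```python
import re
from typing import List, NamedTuple


class FlexResidue(NamedTuple):
    chain: str
    name: str
    number: int


# Chain:ResidueName+Number, e.g. A:ARG120 (three-letter residue names assumed).
_PATTERN = re.compile(r"^([A-Za-z0-9]):([A-Za-z]{3})(-?\d+)$")


def parse_flex_residues(spec: str) -> List[FlexResidue]:
    """Parse a comma-separated flexible residue list into structured records."""
    residues = []
    for token in spec.split(","):
        token = token.strip()
        if not token:
            continue
        match = _PATTERN.match(token)
        if match is None:
            raise ValueError(f"invalid residue specifier: {token!r}")
        residues.append(
            FlexResidue(match.group(1), match.group(2).upper(), int(match.group(3)))
        )
    return residues


if __name__ == "__main__":
    print(parse_flex_residues("A:ARG120,A:TYR135"))
```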

Use flexible residues for:

Known active site residues that undergo conformational changes
Gatekeeper residues controlling pocket access
Cases where rigid docking produces steric clashes

Note that flexible docking significantly increases computation time and search space complexity.

Understanding the results

Binding affinity

Binding affinity is reported in kcal/mol. Lower (more negative) values indicate stronger predicted binding:

Range	Interpretation
-4 to -6	Weak binding
-6 to -8	Moderate binding
-8 to -10	Strong binding
< -10	Very strong binding

Values below -12 kcal/mol may indicate scoring artifacts and should be validated experimentally or with additional computational methods.
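The table and the artifact warning can be expressed as a small lookup. The ranges share their endpoints, so this sketch assumes a boundary value (for example exactly -8) falls into the stronger category; that tie-breaking choice and the function name are ours.

```python
def interpret_affinity(kcal_per_mol: float) -> str:
    """Map a predicted affinity (kcal/mol) onto the interpretation table."""
    if kcal_per_mol < -12:
        return "Very strong binding (possible scoring artifact)"
    if kcal_per_mol < -10:
        return "Very strong binding"
    if kcal_per_mol <= -8:
        return "Strong binding"
    if kcal_per_mol <= -6:
        return "Moderate binding"
    if kcal_per_mol <= -4:
        return "Weak binding"
    return "Weaker than the tabulated range"


if __name__ == "__main__":
    for value in (-3.5, -5.1, -7.2, -9.4, -11.0, -13.2):
        print(value, interpret_affinity(value))
```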

Multiple poses

Vina generates multiple binding poses ranked by predicted affinity. The top-ranked pose represents the most favorable binding mode, but examining the top 3-5 poses is recommended. Alternative binding modes may be biologically relevant.
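If you download the output PDBQT file, the per-pose scores can be pulled out for side-by-side inspection. The sketch below assumes each pose carries a `REMARK VINA RESULT:` line with the affinity followed by lower-bound and upper-bound RMSD values relative to the best pose, which is the layout Vina writes; the parser itself is illustrative.

```python
from typing import List, NamedTuple


class VinaMode(NamedTuple):
    rank: int
    affinity: float  # kcal/mol
    rmsd_lb: float   # RMSD lower bound vs. the best pose, in Å
    rmsd_ub: float   # RMSD upper bound vs. the best pose, in Å


def parse_vina_results(pdbqt_text: str) -> List[VinaMode]:
    """Collect affinity and RMSD bounds from REMARK VINA RESULT lines."""
    modes: List[VinaMode] = []
    for line in pdbqt_text.splitlines():
        if line.startswith("REMARK VINA RESULT:"):
            fields = line.split(":", 1)[1].split()
            affinity, rmsd_lb, rmsd_ub = (float(value) for value in fields[:3])
            modes.append(VinaMode(len(modes) + 1, affinity, rmsd_lb, rmsd_ub))
    return modes
```

Poses with similar affinities but large RMSD values relative to the top pose are the alternative binding modes worth a visual check.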

Validation

Vina achieves a ~87% success rate (ligand RMSD < 2 Å from the crystal structure) on benchmark datasets. Predicted affinities have a standard error of approximately 2.85 kcal/mol, so relative rankings are more reliable than absolute values.
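To see why a 2.85 kcal/mol error matters, it helps to convert free energies into dissociation constants. The sketch below assumes the predicted affinity can be read as a binding free energy at 298.15 K and applies the standard relation ΔG = RT ln(Kd); the function names are illustrative.

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal/(mol*K)


def kd_from_dg(dg_kcal: float, temperature: float = 298.15) -> float:
    """Dissociation constant (mol/L) implied by a binding free energy."""
    return math.exp(dg_kcal / (R_KCAL * temperature))


def kd_fold_uncertainty(error_kcal: float, temperature: float = 298.15) -> float:
    """Multiplicative uncertainty in Kd implied by an error in the free energy."""
    return math.exp(error_kcal / (R_KCAL * temperature))


if __name__ == "__main__":
    print(f"Kd at -8 kcal/mol: {kd_from_dg(-8.0):.2e} M")
    print(f"fold uncertainty for 2.85 kcal/mol: {kd_fold_uncertainty(2.85):.0f}x")
```

Under these assumptions, an error of 2.85 kcal/mol spans roughly two orders of magnitude in Kd, which is why absolute values should not be over-interpreted.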

Comparison to other docking tools

Feature	AutoDock Vina	Smina	GNINA	DiffDock
Method	Physics-based + ML	Vina fork	Vina + CNN scoring	Diffusion model
Scoring	Vina/Vinardo/AD4	Vinardo + custom	CNN affinity	Confidence score
Speed	~1-2 min	~1-2 min	~2-3 min	~5-10 min
Best for	General docking	Custom scoring	Pose accuracy	Blind docking

Smina is a Vina fork with additional scoring function options and an extended command-line interface.

GNINA adds convolutional neural network rescoring for improved pose prediction accuracy.

DiffDock uses a diffusion generative model and excels at blind docking when the binding site is unknown.

Best practices

We recommend starting with default parameters (exhaustiveness 8, 9 poses) for initial screening.

For publication-quality results, increase exhaustiveness to 16-32 and validate top hits with molecular dynamics or experimental binding assays.

When the binding site is known, define the search space manually. This reduces runtime and focuses sampling on the relevant region.
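For users running the Vina 1.2 command-line program locally, a manual search box translates into center and size options. The sketch below only assembles the argument list; it assumes a `vina` executable on the PATH, and the helper name and file paths are hypothetical.

```python
from typing import List, Optional, Sequence


def build_vina_command(
    receptor: str,
    ligand: str,
    center: Sequence[float],
    size: Sequence[float],
    out: str,
    exhaustiveness: int = 8,
    num_modes: int = 9,
    seed: Optional[int] = None,
) -> List[str]:
    """Assemble an argument list for the vina CLI with a manual search box."""
    args = ["vina", "--receptor", receptor, "--ligand", ligand]
    for axis, c, s in zip("xyz", center, size):
        args += [f"--center_{axis}", str(c), f"--size_{axis}", str(s)]
    args += ["--exhaustiveness", str(exhaustiveness),
             "--num_modes", str(num_modes), "--out", out]
    if seed is not None:
        # A fixed seed makes repeated runs reproducible.
        args += ["--seed", str(seed)]
    return args


if __name__ == "__main__":
    cmd = build_vina_command("receptor.pdbqt", "ligand.pdbqt",
                             (12.5, -3.0, 8.2), (20, 20, 20), "out.pdbqt", seed=42)
    print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once the input files exist.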

For complex ligands (peptides, macrocycles, molecules >150 atoms), prepare PDBQT files externally using Meeko or OpenBabel.

Related tools

PDB Fixer - Prepare proteins by adding missing residues and fixing common issues
Smina - Vina fork with additional scoring functions
GNINA - CNN-enhanced docking for improved pose accuracy
DiffDock - Diffusion-based blind docking
LightDock - Protein-protein docking
PocketFlow - Structure-based ligand design
Lipinski's Rule of Five - Assess drug-likeness before docking
ADMET-AI - Predict pharmacokinetic properties

Frequently asked questions

Why is my docking job taking so long?

Runtime scales with three factors: search space volume, ligand flexibility, and exhaustiveness. Large search boxes (especially Auto mode on big proteins) dramatically increase computation time because Vina must sample more conformational space.

To speed things up: define a manual search box around the known binding site, reduce exhaustiveness for initial screening, or use a smaller ligand. For very large proteins, consider whether blind docking is truly necessary.

Why did I get fewer poses than I requested?

Vina only returns poses within the specified energy range of the best pose. If you requested 9 poses but only got 3, the remaining poses exceeded your energy cutoff or were too similar (below the min RMSD threshold).

This often indicates the binding site has limited conformational diversity for your ligand. Try increasing the energy range parameter or decreasing the min RMSD threshold to generate more poses.
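The interplay between the three limits can be seen in a simplified selection routine. This is a reconstruction for intuition, not Vina's actual clustering code: poses are visited from best to worst, dropped once they fall outside the energy range, and skipped if they sit within the min RMSD of a pose already kept. Poses are modeled as hypothetical (energy, coordinates) pairs.

```python
import math
from typing import List, Sequence, Tuple

Coords = Sequence[Tuple[float, float, float]]
Pose = Tuple[float, Coords]  # (energy in kcal/mol, atom coordinates in Å)


def rmsd(a: Coords, b: Coords) -> float:
    """Root-mean-square deviation between two equally sized coordinate sets."""
    if len(a) != len(b) or not a:
        raise ValueError("coordinate sets must be non-empty and the same length")
    squared = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(squared / len(a))


def select_poses(candidates: Sequence[Pose], num_modes: int = 9,
                 energy_range: float = 3.0, min_rmsd: float = 1.0) -> List[Pose]:
    """Greedy selection: best first, within energy range, mutually dissimilar."""
    kept: List[Pose] = []
    for energy, coords in sorted(candidates, key=lambda pose: pose[0]):
        if kept and energy - kept[0][0] > energy_range:
            break  # every remaining pose is even further from the best one
        if all(rmsd(coords, other) >= min_rmsd for _, other in kept):
            kept.append((energy, coords))
        if len(kept) == num_modes:
            break
    return kept
```

Widening `energy_range` or lowering `min_rmsd` in this sketch admits more of the candidates, mirroring the advice above.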

What does exhaustiveness actually control?

Exhaustiveness determines how many independent docking runs Vina performs. Each run starts from a random position and explores the search space through iterated local search, using BFGS for the local optimization steps.

Higher exhaustiveness reduces the probability of missing the global energy minimum but increases runtime linearly. Think of it as "how hard should Vina search?" For quick screening, 8 is adequate. For publication or lead optimization, use 16-32.

Why is my binding affinity extremely negative (below -12 kcal/mol)?

Predicted affinities below -12 kcal/mol are often scoring artifacts rather than true predictions. Common causes include ligands with many rotatable bonds, very large binding pockets, or ligands that clash with the protein despite appearing to fit.

We recommend treating such results skeptically. Validate with alternative methods like GNINA for CNN-based rescoring, or check the 3D pose visually for unrealistic binding modes.

Can I compare binding affinities between different proteins?

No. Vina's binding affinity predictions are only meaningful for comparing ligands docked to the same protein structure. The scoring function is calibrated relative to a specific binding site.

Comparing affinities across different proteins—even related family members—is not valid. Each protein-ligand system must be evaluated independently.

Which scoring function should I use?

Use Vina (default) for most applications—it offers the best speed-accuracy trade-off for general docking. Use Vinardo when ranking compounds in virtual screening campaigns, as it was optimized for pose discrimination rather than absolute affinity prediction.

Use AutoDock4 specifically for metalloproteins (zinc fingers, heme enzymes, metalloenzymes). The classical force field handles metal coordination better than the empirical Vina function.

Do I need to prepare my protein before docking?

Yes. AutoDock Vina expects a properly protonated protein with no missing residues in the binding site. Use PDB Fixer to add missing atoms and residues automatically.

Crystal structures often have missing loops, alternate conformations, and no hydrogen atoms. Skipping preparation can lead to failed jobs or unreliable results.

Can I dock peptides or macrocyclic ligands?

Yes, but with limitations. Vina handles molecules up to about 150 atoms reasonably well. For larger peptides, cyclic peptides, or macrocycles, prepare PDBQT files externally using Meeko to ensure proper torsion handling.

Very flexible ligands (>15 rotatable bonds) exponentially increase search space complexity. Consider increasing exhaustiveness proportionally or using DiffDock, which handles flexibility differently.

When should I use flexible residues?

Use flexible residues when you know specific active site residues undergo conformational changes upon ligand binding (induced fit). Common candidates include gatekeeper residues, catalytic residues, and residues known to adopt multiple rotamers.

Adding flexibility significantly increases runtime and search complexity. Start with rigid docking, then add flexibility only if poses show steric clashes with known flexible residues or if redocking a crystal ligand fails.

Why doesn't my top-ranked pose match the crystal structure?

Several factors can cause pose prediction failures: incorrect protonation states, missing water molecules that mediate binding, metal coordination issues, or insufficient exhaustiveness. Vina's ~87% success rate means approximately 1 in 8 dockings will fail to reproduce the native pose.

Try increasing exhaustiveness, using a different scoring function, or using GNINA which adds CNN rescoring for improved pose accuracy.

How do I find binding site coordinates for manual search?

If you have a crystal structure with a bound reference ligand, use the ligand's center as the box center. Many PDB structures include co-crystallized ligands: download one and calculate the geometric center of the ligand's atoms.
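The geometric center can be computed directly from the HETATM records of a PDB file. The sketch below relies on the fixed-column PDB layout (residue name in columns 18-20, coordinates in columns 31-54) and adds a padding around the ligand extent to suggest box dimensions. The 5 Å padding is an arbitrary illustrative default, and the function name is ours.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def ligand_box(pdb_text: str, resname: str, padding: float = 5.0) -> Tuple[Vec3, Vec3]:
    """Return (center, size) of a box around a ligand's HETATM records."""
    coords: List[Vec3] = []
    for line in pdb_text.splitlines():
        # Fixed-column PDB format: resName at [17:20], x/y/z at [30:54].
        if line.startswith("HETATM") and line[17:20].strip() == resname:
            coords.append((float(line[30:38]), float(line[38:46]), float(line[46:54])))
    if not coords:
        raise ValueError(f"no HETATM records found for residue {resname!r}")
    center = tuple(sum(c[i] for c in coords) / len(coords) for i in range(3))
    size = tuple(
        max(c[i] for c in coords) - min(c[i] for c in coords) + 2.0 * padding
        for i in range(3)
    )
    return center, size
```

Pass the three-letter ligand code from the PDB entry as `resname`; if the structure contains several copies of the ligand, filter by chain first so the center is not averaged across binding sites.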

For novel targets without known ligands, use cavity detection tools or literature to identify likely binding pockets. Consider using Auto mode for initial blind docking, then refine with manual coordinates around the identified site.

What's the difference between Vina and GNINA?

GNINA is built on Vina's codebase but adds convolutional neural network (CNN) scoring. It generates poses the same way but rescores them using a model trained to distinguish correct from incorrect binding modes.

Use Vina for speed and established workflows. Use GNINA when pose accuracy is critical and you can tolerate slightly longer runtimes. GNINA often recovers correct poses that Vina's scoring ranks poorly.

When should I use DiffDock instead?

DiffDock uses a diffusion generative model rather than a search guided by a scoring function. It excels at blind docking when you don't know the binding site, and handles protein flexibility implicitly.

Use DiffDock for exploratory docking on novel targets. Use Vina when you know the binding site, need interpretable scoring, want faster turnaround, or require reproducibility with a specific random seed.
