ProteinIQ

PROPKA 3

Predict pKa values of ionizable groups in proteins based on 3D structure.

What is PROPKA 3?

PROPKA 3 predicts the pKa values of ionizable amino acid residues in proteins based on their 3D structure. The pKa is the pH at which a group is half protonated, so it determines whether a residue is predominantly protonated or deprotonated at a given pH. This in turn affects protein stability, enzyme activity, and ligand binding.

Amino acids in solution have well-characterized "model" pKa values. However, when a residue is buried inside a protein, the local environment can shift these values dramatically, sometimes by several pKa units. PROPKA uses empirical rules to estimate these shifts from desolvation effects, hydrogen bonds, and electrostatic interactions.

Understanding pKa values is essential for structure preparation in molecular dynamics simulations, docking studies, and electrostatics calculations. For preparing structures for electrostatics workflows, consider using PDB2PQR, which can use PROPKA internally to assign protonation states.

How does PROPKA 3 work?

PROPKA calculates pKa values using an empirical approach that starts with model pKa values and applies perturbations based on the protein environment.

The pKa equation

The predicted pKa for each ionizable group is:

$$pK_a = pK_{a,model} + \Delta pK_{a,desolv} + \Delta pK_{a,HB} + \Delta pK_{a,coul}$$

Where:

  • $pK_{a,model}$ is the reference pKa in solution
  • $\Delta pK_{a,desolv}$ is the desolvation penalty from burial
  • $\Delta pK_{a,HB}$ accounts for hydrogen bond interactions
  • $\Delta pK_{a,coul}$ captures coulombic interactions with other ionizable groups
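
As a minimal sketch of how the terms combine, the snippet below sums the model pKa and three perturbations for a partially buried aspartate. The function name `predict_pka` and the three perturbation values are illustrative assumptions, not PROPKA's API or real PROPKA output.

```python
def predict_pka(model: float, desolv: float, hb: float, coul: float) -> float:
    """Sum the model pKa and the three environmental perturbations."""
    return model + desolv + hb + coul


# Hypothetical contributions for a partially buried ASP (illustrative numbers only).
pka = predict_pka(model=3.80, desolv=1.50, hb=-0.80, coul=0.40)
print(f"Predicted pKa: {pka:.2f}")
```

Each term is signed, so contributions can partly cancel: here the assumed desolvation penalty raises the pKa while the assumed hydrogen-bond term lowers it.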

Model pKa values

PROPKA uses these reference pKa values for amino acids in solution:

| Residue | Model pKa |
|---|---|
| ASP (Aspartic acid) | 3.80 |
| GLU (Glutamic acid) | 4.25 |
| HIS (Histidine) | 6.50 |
| CYS (Cysteine) | 9.00 |
| TYR (Tyrosine) | 10.07 |
| LYS (Lysine) | 10.50 |
| ARG (Arginine) | 12.50 |
| N-terminus (N+) | 8.00 |
| C-terminus (C-) | 3.20 |
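
For scripting, the table above can be kept as a simple lookup. This is a sketch: the dictionary name `MODEL_PKA` and the helper `model_pka` are hypothetical names chosen for illustration, and the keys `N+` and `C-` reuse the terminus labels from the table.

```python
# Model pKa values copied from the table above.
MODEL_PKA = {
    "ASP": 3.80,
    "GLU": 4.25,
    "HIS": 6.50,
    "CYS": 9.00,
    "TYR": 10.07,
    "LYS": 10.50,
    "ARG": 12.50,
    "N+": 8.00,   # N-terminus
    "C-": 3.20,   # C-terminus
}


def model_pka(residue: str) -> float:
    """Return the model pKa for a residue code.

    Raises KeyError for residues that are not in the table (e.g. ALA).
    """
    return MODEL_PKA[residue.upper()]


print(model_pka("his"))
```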

Desolvation effects

When a charged group is buried inside a protein, it loses favorable interactions with water. This desolvation penalty is the largest contributor to pKa shifts for buried residues.

PROPKA estimates desolvation based on how much solvent volume the protein excludes around the ionizable group. Deeply buried residues experience larger pKa shifts than surface-exposed ones.
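
To build intuition for burial, the sketch below counts how many atoms lie within a cutoff distance of an ionizable group. This is a crude neighbor-count proxy, not PROPKA's actual desolvation formula; the function name `count_neighbors`, the 10 Å cutoff, and the toy coordinates are all assumptions made for illustration.

```python
import math


def count_neighbors(center, atoms, cutoff=10.0):
    """Count atoms within `cutoff` angstroms of `center`.

    Atoms at zero distance (the center itself) are not counted.
    """
    return sum(1 for atom in atoms if 0.0 < math.dist(center, atom) <= cutoff)


# Toy coordinates in angstroms (hypothetical, not from a real structure).
group = (0.0, 0.0, 0.0)
atoms = [
    (0.0, 0.0, 0.0),   # the group itself, ignored
    (3.0, 0.0, 0.0),   # 3 A away
    (0.0, 4.0, 0.0),   # 4 A away
    (6.0, 8.0, 0.0),   # 10 A away, on the cutoff
    (20.0, 0.0, 0.0),  # 20 A away, outside the cutoff
]
print(count_neighbors(group, atoms))
```

A higher count suggests a more crowded, more buried environment, which is the situation in which the text above expects a larger desolvation shift.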

Prediction accuracy

PROPKA 3 achieves RMSD values (in pKa units) of approximately 0.79 for ASP/GLU, 0.65 for LYS, 0.75 for TYR, and 1.00 for HIS residues when compared to experimental data.
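
The RMSD quoted here is the root-mean-square deviation between predicted and experimental pKa values. The sketch below shows the calculation; the function name `rmsd` is a hypothetical helper and the pKa lists are made-up numbers, not benchmark data.

```python
import math


def rmsd(predicted, experimental):
    """Root-mean-square deviation between two equal-length lists of pKa values."""
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - e) ** 2 for p, e in zip(predicted, experimental)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up values for three residues (illustrative only).
predicted = [4.0, 3.5, 5.0]
experimental = [3.0, 3.5, 4.0]
print(f"RMSD: {rmsd(predicted, experimental):.2f}")
```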

Surface-exposed residues are predicted more accurately since their pKa values remain close to model values. Buried residues with large pKa shifts are more challenging to predict.

Understanding the results

PROPKA outputs a table with one row per ionizable residue:

| Column | Description |
|---|---|
| Structure | PDB identifier (when processing multiple structures) |
| Residue | Three-letter amino acid code (ASP, GLU, HIS, etc.) |
| Position | Residue number in the sequence |
| Chain | Chain identifier |
| pKa | Predicted pKa value |
| Model pKa | Reference pKa in solution |
| Shift | Difference between predicted and model pKa ($pK_a - pK_{a,model}$) |

Interpreting pKa shifts

A positive shift means the residue is harder to deprotonate than in solution. For acidic residues (ASP, GLU), this indicates stabilization of the protonated form—often from burial or hydrogen bonding.

A negative shift means the residue is easier to deprotonate. For basic residues (LYS, ARG, HIS), a negative shift indicates destabilization of the protonated form.

Shifts larger than 2 pKa units in either direction suggest significant environmental perturbation and warrant careful examination of the local structure.
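
The 2-unit rule of thumb is easy to apply to a results table. The sketch below assumes each row is a tuple of (residue, position, chain, predicted pKa, model pKa); that layout, the function name `flag_large_shifts`, and the example rows are hypothetical and do not reflect PROPKA's file format.

```python
def flag_large_shifts(rows, threshold=2.0):
    """Return (label, shift) pairs for residues whose |shift| exceeds `threshold`."""
    flagged = []
    for residue, position, chain, pka, model in rows:
        shift = pka - model  # same sign convention as the Shift column
        if abs(shift) > threshold:
            flagged.append((f"{residue} {position} {chain}", round(shift, 2)))
    return flagged


# Hypothetical rows (illustrative numbers only).
rows = [
    ("ASP", 25, "A", 6.30, 3.80),    # shifted up
    ("GLU", 35, "A", 4.60, 4.25),    # close to model value
    ("LYS", 102, "B", 7.90, 10.50),  # shifted down
]
for label, shift in flag_large_shifts(rows):
    print(f"{label}: shift {shift:+.2f}")
```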

Determining protonation states

To determine protonation states at a specific pH, compare the predicted pKa to your target pH:

  • If $pH < pK_a$: the residue is predominantly protonated
  • If $pH > pK_a$: the residue is predominantly deprotonated

At physiological pH (7.4), ASP and GLU are typically deprotonated (negatively charged), while LYS and ARG are protonated (positively charged). The model pKa of HIS (6.50) is close to physiological pH, so a modest shift can tip it into either state.
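
"Predominantly" can be made quantitative with the Henderson-Hasselbalch relation, which gives the protonated fraction as 1 / (1 + 10^(pH - pKa)). The sketch below treats each group as an independent single-site titration, which is a simplifying assumption; the function name `fraction_protonated` is a hypothetical helper, and the pKa values are the model values from this page rather than predictions for a specific protein.

```python
def fraction_protonated(pka: float, ph: float) -> float:
    """Fraction of a titratable group that is protonated at the given pH."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Model pKa values at physiological pH.
for residue, pka in [("ASP", 3.80), ("HIS", 6.50), ("LYS", 10.50)]:
    fraction = fraction_protonated(pka, ph=7.4)
    print(f"{residue}: {fraction:.1%} protonated at pH 7.4")
```

With an unshifted pKa of 6.50, about 11% of HIS is protonated at pH 7.4, which is why its state depends so strongly on the predicted shift.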

Use cases

PROPKA predictions help with several common workflows:

  • MD simulation setup: Determine correct protonation states before running simulations
  • Enzyme mechanism studies: Identify catalytic residues with unusual pKa values
  • pH-dependent behavior: Understand why proteins aggregate or unfold at certain pH
  • Drug design: Assess how ligand binding might shift active site pKa values

For calculating the overall isoelectric point (pI) of a protein from sequence, use our Isoelectric Point calculator. PROPKA provides residue-level detail from structure, while pI gives a single whole-protein value from sequence.

Limitations

PROPKA works best for globular, well-folded proteins with reliable structures. Keep these limitations in mind:

  • Predictions are less accurate for membrane proteins and intrinsically disordered regions
  • Structures with missing atoms or poor resolution may give unreliable results
  • Ligand effects on pKa are handled in PROPKA 3.1+, but accuracy depends on ligand parameterization
  • ARG predictions are less validated due to limited experimental data at high pH

We recommend using PDB Fixer to repair structures with missing atoms before running PROPKA.