What's the largest protein?

Dr. Matic Broz·April 9, 2026

Key takeaways

  • Largest known protein: PKZILLA-1, 45,212 amino acids, 4,760,854 Da (4.76 MDa)
  • Computed theoretical pI 7.08, instability index 43.78, GRAVY +0.118 (UniProt A0AB34IYJ6)
  • Previous record holder: titin, 34,350 amino acids, ~3.8 MDa (UniProt Q8WZ42)
  • PKZILLA-1 contains 140 enzyme domains and is ~100 times larger than an average protein

What is PKZILLA-1?

The largest known protein is PKZILLA-1, a giant enzyme containing 45,212 amino acids with a molecular mass of 4.7 megadaltons (MDa). Discovered in August 2024 by researchers at UC San Diego's Scripps Institution of Oceanography, this protein surpasses titin by 25% in molecular weight.[1]

PKZILLA-1 was identified in the golden alga Prymnesium parvum, where it functions as a polyketide synthase enzyme. The protein contains 140 enzyme domains arranged in sequence, making it an extraordinarily complex molecular machine.[1]

"This is the Mount Everest of proteins," noted Bradley Moore, the study's senior author at Scripps Oceanography.[2]

| Feature | PKZILLA-1 | Titin (previous record) |
| --- | --- | --- |
| Mass | 4.7 MDa | 3.7–3.8 MDa |
| Amino acids | 45,212 | 34,350 |
| Enzyme domains | 140 | N/A (structural) |
| Function | Polyketide synthesis | Muscle elasticity |
| Organism | Prymnesium parvum (algae) | Human muscle |
| Discovery | 2024 | 1979 |

How big is titin, the previous record holder?

Titin consists of 34,350 amino acids in its canonical human isoform, with a molecular weight of approximately 3.8 MDa. The protein stretches over 1 micrometer in length, spanning the entire length of the muscle sarcomere from Z-disk to M-line.[4]

Titin functions as a molecular spring in striated muscle, providing passive elasticity that allows muscles to stretch and return to their resting state. According to RCSB PDB-101, titin consists of more than 34,000 amino acids organized into several hundred modular domains, including Ig-like domains, fibronectin-like domains, and a disordered PEVK region.[5]

The mouse homologue is slightly larger, comprising 35,213 amino acids with a molecular weight of 3.9 MDa. Titin constitutes approximately 10% of muscle mass, making it the third most abundant protein in muscle after actin and myosin.[4]

How big is PKZILLA-1 compared to typical proteins?

PKZILLA-1 is approximately 100 times larger than an average protein. According to Cell Biology by the Numbers, eukaryotic proteins average 472 amino acids, while bacterial proteins average 320 amino acids. At the commonly used average of about 110 Da per residue, those averages correspond to roughly 52 kDa and 35 kDa, so a typical protein of 300–500 residues weighs approximately 33–55 kDa.[6]

| Protein category | Amino acids | Molecular weight |
| --- | --- | --- |
| Typical protein | 300–500 | 33–55 kDa |
| Large protein | 1,000–5,000 | 100–500 kDa |
| Giant protein (titin) | 34,350 | 3,700–3,800 kDa |
| Largest protein (PKZILLA-1) | 45,212 | 4,700 kDa |
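
The size comparison above is simple arithmetic, and a short Python sketch makes it easy to check. It assumes the 110 Da per residue rule of thumb quoted in the text; the helper name `approx_mass_kda` is hypothetical, and the PKZILLA-1 mass is the sequence-computed value given in the key takeaways.

```python
AVG_RESIDUE_DA = 110  # rule-of-thumb mean residue mass quoted in the text


def approx_mass_kda(n_residues: int, residue_da: float = AVG_RESIDUE_DA) -> float:
    """Estimate protein mass in kDa from residue count (hypothetical helper)."""
    return n_residues * residue_da / 1000


# Residue counts taken from the article
lengths = {
    "bacterial average": 320,
    "eukaryotic average": 472,
    "titin (human)": 34_350,
    "PKZILLA-1": 45_212,
}

for name, n in lengths.items():
    ratio = n / lengths["eukaryotic average"]
    print(f"{name:20s} {n:>6,d} aa  ~{approx_mass_kda(n):8,.1f} kDa  "
          f"{ratio:5.1f}x eukaryotic average")

# Mean residue mass implied by the sequence-computed PKZILLA-1 mass
PKZILLA1_MASS_DA = 4_760_854
mean_residue_da = PKZILLA1_MASS_DA / lengths["PKZILLA-1"]
print(f"PKZILLA-1 mean residue mass: {mean_residue_da:.1f} Da")
```

By length, PKZILLA-1 is about 96 times the eukaryotic average and about 141 times the bacterial average. The 110 Da rule overestimates its mass (about 4.97 MDa versus the computed 4.76 MDa) because its implied mean residue mass is closer to 105 Da, in line with a composition rich in small residues such as alanine, glycine, and serine.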

PKZILLA-1 reaches approximately 1 micrometer in length. Despite its enormous size, it remains far smaller than most cells; a typical human cell measures 10–100 micrometers in diameter.

[Figure: Largest proteins by molecular weight]

Biochemical properties of PKZILLA-1

I ran the full UniProt A0AB34IYJ6 sequence through my protein parameters calculator to obtain biochemical properties that the discovery publications did not report. At 4,760,854 Da (4.76 MDa), the computed molecular weight is slightly higher than the widely cited 4.7 MDa figure, which is rounded down.[3]

| Property | Value | Notes |
| --- | --- | --- |
| Length | 45,212 residues | Largest protein on record |
| Molecular weight | 4,760,854 Da (4.76 MDa) | Computed from sequence; ~25% heavier than titin |
| Theoretical pI | 7.08 | Nearly neutral, unusual for a protein of this size |
| Instability index | 43.78 | Above ExPASy's 40 threshold; classified as unstable in vitro |
| Aliphatic index | 94.2 | High; suggests good thermostability |
| GRAVY (hydropathy avg) | +0.118 | Slightly hydrophobic overall; unusual for a soluble enzyme |
| Aromaticity (F+W+Y) | 5.5% | Low; sparse UV-visible aromatic coverage |
| Extinction coefficient (ε₂₈₀) | 3,904,800 M⁻¹ cm⁻¹ (reduced) | A 1 mg/mL PKZILLA-1 solution would give A₂₈₀ ≈ 0.82 |
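
The A₂₈₀ figure in the last row follows from dividing the molar extinction coefficient by the molecular weight. As a rough sketch, the coefficient itself can be approximated with the widely used per-residue estimate of 5,500 M⁻¹ cm⁻¹ per tryptophan and 1,490 M⁻¹ cm⁻¹ per tyrosine, with no cystine term for the reduced form. This assumes, without confirmation from the source, that the calculator uses that formula. The Trp and Tyr counts are the ones listed in the amino acid composition table in this article, and the function names are hypothetical.

```python
# Per-residue molar absorptivities at 280 nm (M^-1 cm^-1); assumed formula
EPS_TRP = 5500
EPS_TYR = 1490


def extinction_reduced(n_trp: int, n_tyr: int) -> int:
    """Estimate eps280 for the reduced protein (no cystine contribution)."""
    return n_trp * EPS_TRP + n_tyr * EPS_TYR


def a280_at_1mg_per_ml(eps280: float, mass_da: float) -> float:
    """Absorbance of a 1 mg/mL solution over a 1 cm path: eps / molecular weight."""
    return eps280 / mass_da


MASS_DA = 4_760_854       # sequence-computed mass from the table
EPS_REPORTED = 3_904_800  # reduced-form coefficient from the table

# Counts as listed in the composition table (W = 542, Y = 619)
eps_estimate = extinction_reduced(n_trp=542, n_tyr=619)

print(f"estimated eps280: {eps_estimate:,} M^-1 cm^-1")
print(f"A280 at 1 mg/mL:  {a280_at_1mg_per_ml(EPS_REPORTED, MASS_DA):.2f}")
```

With the listed counts, the estimate lands within about 0.04% of the reported coefficient, and the reported coefficient reproduces the A₂₈₀ of roughly 0.82.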

The instability index above 40 is striking. Most stable enzymes score well below that threshold, and it suggests PKZILLA-1 relies on cellular machinery (chaperones, membrane anchoring, or co-translational folding) to remain functional at its enormous length. Jon Clardy's observation that these proteins approach the upper size limit fits the biochemistry: the protein is right at the edge of what cells can assemble and hold together.[7]

The positive GRAVY score is also unusual. Most soluble enzymes have negative GRAVY values (more hydrophilic than hydrophobic). PKZILLA-1 being slightly hydrophobic on average likely reflects the large number of hydrophobic interfaces between its 140 enzyme domains, which must pack against each other along the assembly line.

What is PKZILLA-1's amino acid composition?

PKZILLA-1's composition diverges sharply from the average protein, with heavy biases toward alanine, leucine, serine, and glycine. I computed the full breakdown using my amino acid composition tool.

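| Rank | Amino acid | PKZILLA-1 | Count (of 45,212) | Average protein |
| --- | --- | --- | --- | --- |
| 1 | Alanine (A) | 13.55% | 6,126 | ~8.3% |
| 2 | Leucine (L) | 11.46% | 5,181 | ~9.7% |
| 3 | Serine (S) | 10.31% | 4,661 | ~6.6% |
| 4 | Glycine (G) | 8.65% | 3,911 | ~7.1% |
| 5 | Valine (V) | 8.05% | 3,640 | ~6.9% |
| 6 | Arginine (R) | 6.37% | 2,880 | ~5.5% |
| 7 | Threonine (T) | 5.52% | 2,496 | ~5.4% |
| 8 | Glutamate (E) | 4.26% | 1,926 | ~6.7% |
| 9 | Proline (P) | 4.17% | 1,885 | ~4.7% |
| 10 | Aspartate (D) | 4.07% | 1,840 | ~5.5% |
| 11 | Glutamine (Q) | 3.94% | 1,781 | ~3.9% |
| 12 | Isoleucine (I) | 3.23% | 1,460 | ~5.9% |
| 13 | Histidine (H) | 3.08% | 1,392 | ~2.3% |
| 14 | Phenylalanine (F) | 2.95% | 1,334 | ~3.9% |
| 15 | Asparagine (N) | 2.24% | 1,013 | ~4.1% |
| 16 | Methionine (M) | 1.99% | 900 | ~2.4% |
| 17 | Cysteine (C) | 1.84% | 832 | ~1.4% |
| 18 | Lysine (K) | 1.75% | 791 | ~5.8% |
| 19 | Tyrosine (Y) | 1.37% | 619 | ~2.9% |
| 20 | Tryptophan (W) | 1.20% | 542 | ~1.1% |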

Three features stand out:

  • Alanine is strongly over-represented at 13.55%, roughly 1.6 times the ~8.3% seen in the average protein. This is consistent with PKZILLA-1 containing many short linker and ACP (acyl carrier protein) regions, which are alanine-rich.
  • Lysine is strikingly depleted at 1.75%, compared to ~5.8% in the average protein. Combined with the elevated arginine content (6.37%), PKZILLA-1 favors arginine over lysine for its positive charge, which can affect protease susceptibility and electrostatic interactions.
  • Tryptophan is the rarest residue at 1.20%, consistent with the general trend across proteins but notable given the enzyme still contains 542 tryptophan residues in absolute terms.

Multiplied out, PKZILLA-1 contains approximately 6,126 alanines, 5,181 leucines, and 4,661 serines. Each of these counts alone exceeds the total length of nearly any typical protein.
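
Three of the values reported earlier (GRAVY, the aliphatic index, and aromaticity) depend only on composition, so they can be approximately reproduced from the percentages in the table. The sketch below assumes the Kyte-Doolittle hydropathy scale for GRAVY and Ikai's formula for the aliphatic index. Whether the calculator uses exactly these definitions is an assumption, and small differences are expected because the table percentages are rounded.

```python
# Composition in mole percent, copied from the table above
COMPOSITION = {
    "A": 13.55, "L": 11.46, "S": 10.31, "G": 8.65, "V": 8.05,
    "R": 6.37, "T": 5.52, "E": 4.26, "P": 4.17, "D": 4.07,
    "Q": 3.94, "I": 3.23, "H": 3.08, "F": 2.95, "N": 2.24,
    "M": 1.99, "C": 1.84, "K": 1.75, "Y": 1.37, "W": 1.20,
}

# Kyte-Doolittle hydropathy values (assumed scale for GRAVY)
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(comp: dict) -> float:
    """Composition-weighted mean hydropathy."""
    total = sum(comp.values())
    return sum(pct * KYTE_DOOLITTLE[aa] for aa, pct in comp.items()) / total


def aliphatic_index(comp: dict) -> float:
    """Ikai's index: A + 2.9*V + 3.9*(I + L), in mole percent."""
    total = sum(comp.values())
    mol = {aa: 100 * pct / total for aa, pct in comp.items()}
    return mol["A"] + 2.9 * mol["V"] + 3.9 * (mol["I"] + mol["L"])


def aromaticity_pct(comp: dict) -> float:
    """Combined Phe + Trp + Tyr content in percent."""
    total = sum(comp.values())
    return 100 * (comp["F"] + comp["W"] + comp["Y"]) / total


print(f"GRAVY:           {gravy(COMPOSITION):+.3f}")
print(f"Aliphatic index: {aliphatic_index(COMPOSITION):.1f}")
print(f"Aromaticity:     {aromaticity_pct(COMPOSITION):.1f}%")
```

From the rounded percentages this gives a GRAVY of about +0.117, an aliphatic index of about 94.2, and an aromaticity of about 5.5%, in line with the reported +0.118, 94.2, and 5.5%.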

What does PKZILLA-1 do?

PKZILLA-1 and its companion enzyme PKZILLA-2 (3.2 MDa, 99 enzyme domains) work together to produce prymnesin, a potent ichthyotoxin responsible for massive fish kills during harmful algal blooms. According to the Science publication, these giant enzymes catalyze 239 sequential chemical reactions to assemble the complex toxin molecule.[1]

This discovery has two important practical implications:

  1. Detecting PKZILLA genes in water samples could enable early warning systems for toxic algal blooms. As co-author Timothy Fallon noted, "Monitoring for the genes instead of the toxin could allow us to catch blooms before they start."
  2. Understanding how nature assembles such complex molecules could inform synthesis of new compounds for medical applications. According to Moore, "Understanding how nature has evolved its chemical wizardry gives us as scientific practitioners the ability to apply those insights to creating useful products."

What are the top 10 largest proteins?

Based on molecular weight, here are the largest known proteins. You can calculate the molecular weight and protein parameters for any sequence using my tools.

| Rank | Protein | Molecular weight | Amino acids | Function |
| --- | --- | --- | --- | --- |
| 1 | PKZILLA-1 | 4,700 kDa | 45,212 | Polyketide synthesis (algae) |
| 2 | Titin | 3,000–3,800 kDa | 27,000–34,350 | Muscle elasticity |
| 3 | PKZILLA-2 | 3,200 kDa | ~30,000 | Polyketide synthesis (algae) |
| 4 | Versican | ~1,000 kDa | Variable | Extracellular matrix |
| 5 | Obscurin | 720–900 kDa | ~7,000 | Muscle organization |
| 6 | Nebulin | 600–900 kDa | ~6,700 | Thin filament regulation |
| 7 | AHNAK | ~700 kDa | 5,890 | Membrane organization |
| 8 | Ryanodine receptor (RyR1) | ~565 kDa (monomer) | 5,038 | Calcium release channel |
| 9 | Apolipoprotein B-100 | ~515 kDa | 4,536 | Lipid transport |
| 10 | Plectin | ~500 kDa | ~4,500 | Cytoskeletal crosslinking |

Sources: PKZILLA-1 discovery report, UniProt titin record, and cited reviews for obscurin, AHNAK, and plectin.[1][4][8][9][10]

Many of the largest proteins are found in muscle tissue (titin, obscurin, nebulin, dystrophin), reflecting the complex structural requirements of the contractile apparatus.

Why are some proteins so large?

Giant proteins serve specialized functions that require their extraordinary size:

  • Titin spans the entire muscle sarcomere (~1 μm), physically connecting the Z-disk to the M-line. This requires a protein large enough to bridge this distance while maintaining elastic properties.
  • PKZILLA-1 contains 140 enzyme domains that work in sequence. Rather than relying on 140 separate enzymes finding each other through diffusion, consolidating them into a single protein ensures efficient, sequential processing of chemical reactions.
  • Large muscle proteins like titin, nebulin, and obscurin act as molecular rulers that define sarcomere dimensions and organize other proteins during muscle development.

Jon Clardy, a biological chemist at Harvard, noted that these PKZILLA proteins likely approach the upper size limits for proteins because of the sheer workload of expressing and assembling something that large.[7]

How many different proteins exist?

While PKZILLA-1 represents the extreme upper end of protein size, the total number of known protein sequences exceeds 465 million (UniRef100, 2025). The human genome encodes approximately 19,000–20,000 protein-coding genes, which produce roughly 70,000 distinct protein isoforms through alternative splicing.

Sources
  1. Giant polyketide synthase enzymes in haptophyte algae. Science, 2024. https://www.science.org/doi/10.1126/science.adp7199
  2. Largest protein yet discovered builds algal toxins. Scripps Institution of Oceanography, 2024. https://scripps.ucsd.edu/news/largest-protein-yet-discovered-builds-algal-toxins
  3. PKZILLA-1 UniProtKB entry A0AB34IYJ6. UniProt. https://www.uniprot.org/uniprotkb/A0AB34IYJ6/entry
  4. Titin UniProtKB entry Q8WZ42. UniProt. https://www.uniprot.org/uniprotkb/Q8WZ42/entry
  5. Molecule of the Month: Titin. RCSB PDB-101. https://pdb101.rcsb.org/motm/185
  6. How big is the average protein? Cell Biology by the Numbers. https://book.bionumbers.org/how-big-is-the-average-protein/
  7. PKZILLA proteins smash protein size record. Chemical & Engineering News, 2024. https://cen.acs.org/biological-chemistry/PKZILLA-proteins-smash-protein-size/102/web/2024/08
  8. Obscurin, a giant sarcomeric Rho guanine nucleotide exchange factor protein involved in sarcomere assembly. PMC. https://pmc.ncbi.nlm.nih.gov/articles/PMC3957234/
  9. The AHNAK nucleoprotein. PMC. https://pmc.ncbi.nlm.nih.gov/articles/PMC49314/
  10. Plectin. Current Biology. https://www.cell.com/current-biology/fulltext/S0960-9822(22)01998-4

Matic Broz

Founder & CEO, ProteinIQ

Matic founded ProteinIQ to make computational biology accessible to every researcher. He builds code-free bioinformatics tools used by thousands of scientists worldwide for protein analysis, molecular docking, and drug discovery.