
Amino acid composition
Calculate the percentage and frequency of each amino acid in protein sequences
Amino acid composition is the frequency of each of the 20 standard amino acids within a protein sequence. This fundamental metric provides a quantitative fingerprint of any protein, revealing patterns related to structure, function, and evolutionary history.
The composition of a protein directly influences its physical and chemical properties. Proteins enriched in hydrophobic amino acids (Ile, Leu, Val) tend to form stable hydrophobic cores, while those rich in charged residues (Asp, Glu, Lys, Arg) are typically more soluble and interact with other charged molecules.
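As a rough illustration of the point above, the share of a sequence that falls into a residue group can be computed directly. This is a minimal sketch: the function name `group_percentage` and the example sequence are hypothetical, and the two groups contain only the residues named in the paragraph above, not a complete hydrophobicity or charge classification.

```python
# Residue groups named in the text above, as one-letter codes.
HYDROPHOBIC = set("ILV")  # Ile, Leu, Val
CHARGED = set("DEKR")     # Asp, Glu, Lys, Arg


def group_percentage(sequence, group):
    """Percentage of residues in `sequence` that belong to `group`."""
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("sequence is empty")
    return 100.0 * sum(1 for aa in seq if aa in group) / len(seq)


example = "MKVLDEIR"  # made-up sequence, for illustration only
print(group_percentage(example, HYDROPHOBIC))  # 37.5
print(group_percentage(example, CHARGED))      # 50.0
```

A high hydrophobic percentage alone does not establish a stable core, and a high charged percentage does not guarantee solubility; these numbers are only a first-pass summary.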
Amino acid composition analysis serves as the foundation for many downstream calculations. Parameters such as GRAVY and isoelectric point are derived from the same residue counts, while the instability index builds on the closely related dipeptide composition. For a comprehensive analysis that calculates all these properties at once, use our Protein Parameters calculator.
The calculation counts each amino acid in the sequence and expresses the result either as a raw count or as a percentage of total residues.
For each amino acid, the percentage composition is:
$$\text{Percentage}_i = \frac{n_i}{N} \times 100$$
Where $n_i$ is the count of amino acid type $i$ and $N$ is the total number of residues in the sequence.
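The calculation described above can be sketched in a few lines of Python. This is an illustrative sketch rather than the tool's actual implementation: the function name `amino_acid_composition` and the `as_percentage` flag are assumptions, and the input is assumed to contain only standard one-letter codes.

```python
from collections import Counter

# One-letter codes of the 20 standard amino acids, in alphabetical order.
STANDARD_AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence, as_percentage=True):
    """Map each standard one-letter code to its percentage or raw count."""
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("sequence is empty")
    counts = Counter(seq)  # count of each residue type
    total = len(seq)       # total number of residues in the sequence
    if as_percentage:
        return {aa: 100.0 * counts[aa] / total for aa in STANDARD_AMINO_ACIDS}
    return {aa: counts[aa] for aa in STANDARD_AMINO_ACIDS}


composition = amino_acid_composition("MKKLAAGG")  # made-up sequence
print(composition["K"])  # 25.0 (2 of 8 residues)
print(amino_acid_composition("MKKLAAGG", as_percentage=False)["A"])  # 2
```

Residues absent from the sequence are reported as 0 so that every result has the same 20 keys, which makes results from different sequences easy to line up in a table.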
The tool analyzes all 20 standard amino acids, displayed using their three-letter codes with one-letter codes in parentheses:
| Three-letter | One-letter | Property |
|---|---|---|
| Ala | A | Nonpolar |
| Cys | C | Polar |
| Asp | D | Acidic |
| Glu | E | Acidic |
| Phe | F | Aromatic |
| Gly | G | Nonpolar |
| His | H | Basic |
| Ile | I | Nonpolar |
| Lys | K | Basic |
| Leu | L | Nonpolar |
| Met | M | Nonpolar |
| Asn | N | Polar |
| Pro | P | Nonpolar |
| Gln | Q | Polar |
| Arg | R | Basic |
| Ser | S | Polar |
| Thr | T | Polar |
| Val | V | Nonpolar |
| Trp | W | Aromatic |
| Tyr | Y | Aromatic |

The output table shows one row per input sequence with columns for each amino acid.
The Residues column shows the total sequence length. Each amino acid column displays either the percentage (when Show percentage is enabled) or the raw count of that residue.
Typical globular proteins show characteristic composition patterns. Leucine (Leu) is often the most abundant amino acid at around 9-10%, while tryptophan (Trp) and cysteine (Cys) are typically rare at 1-2%.
Unusual compositions can indicate specialized function. Membrane proteins are enriched in hydrophobic residues. Intrinsically disordered proteins often have high proportions of charged and polar amino acids with reduced hydrophobic content.
The average amino acid composition varies by organism and protein type. Comparing your protein's composition against these references can highlight enrichments or depletions that may be functionally significant.
The tool accepts .fasta, .fa, .fas, .pdb, .csv, and .txt file uploads.
Amino acid composition analysis supports many research applications.
Composition analysis captures the global amino acid distribution but not positional information. Two proteins with identical compositions can have completely different sequences and structures.
Non-standard amino acids (selenocysteine, pyrrolysine) and modified residues are not recognized. These will cause validation errors if present in the input sequence.
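Because only the 20 standard one-letter codes are recognized, it can help to check a sequence before submitting it. The sketch below is one possible pre-check, not the tool's own validation logic; the function name `find_nonstandard_residues` is hypothetical. It reports the position of every character outside the standard set, including U (selenocysteine) and O (pyrrolysine).

```python
STANDARD_AMINO_ACIDS = frozenset("ACDEFGHIKLMNPQRSTVWY")


def find_nonstandard_residues(sequence):
    """Return (position, character) pairs, 1-based, for non-standard characters."""
    seq = sequence.strip().upper()
    return [
        (position, aa)
        for position, aa in enumerate(seq, start=1)
        if aa not in STANDARD_AMINO_ACIDS
    ]


print(find_nonstandard_residues("MKULA"))  # [(3, 'U')]
print(find_nonstandard_residues("MKVLA"))  # []
```

An empty list means the sequence contains only standard residues. Whether to remove or substitute the flagged residues depends on the analysis and is left to the user.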