
Chou-Fasman
Predict protein secondary structure from amino acid propensities for helices, sheets, and turns.
What is the Chou-Fasman method?
The Chou-Fasman method predicts protein secondary structure from amino acid sequence alone. Published by Peter Chou and Gerald Fasman in 1974, it was the first widely used algorithm to show that local amino acid composition carries enough information to predict helices, sheets, and turns.
The approach is statistical rather than physical. Chou and Fasman counted how often each amino acid appeared in known crystal structures within helices, sheets, or turns, then converted those frequencies into propensity scores. A propensity above 100 means that amino acid favors the structure; below 100 means it disfavors it. These 60 numbers (20 amino acids times 3 structure types), plus a table of positional turn frequencies used for turn prediction, are the entire model.
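As a rough illustration of how a count becomes a propensity, the sketch below assumes the usual normalization: the share of an amino acid's occurrences that fall in a structure, divided by the share of all residues in that structure, scaled to 100. The function name and the counts are made up for the example and are not Chou and Fasman's data.

```python
def propensity(count_in_structure, count_total, structure_total, residues_total):
    """Propensity (x100): how over- or under-represented a residue is in a structure."""
    # Share of this amino acid's occurrences found in the structure (e.g. helix).
    residue_fraction = count_in_structure / count_total
    # Share of all residues in the dataset found in that structure.
    background_fraction = structure_total / residues_total
    return round(100 * residue_fraction / background_fraction)

# Hypothetical counts, for illustration only.
print(propensity(count_in_structure=540, count_total=1000,
                 structure_total=3800, residues_total=10000))  # 142
```

A residue that occurs in helices exactly as often as the average residue scores 100, which is why 100 serves as the dividing line between formers and breakers.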
Three-state accuracy is usually reported at roughly 50-65%, well below modern deep learning methods like ProstT5 or ESMFold. The method remains useful as a teaching tool and as a fast, interpretable baseline. Because it uses no MSA, no neural network, and no GPU, predictions return in seconds.
How to use Chou-Fasman online
ProteinIQ runs the canonical Chou-Fasman algorithm (1978 parameter set) on its servers. Paste one or more sequences and results are returned within seconds.
Inputs
| Input | Description |
|---|---|
| Protein sequence | One or more sequences in FASTA format, raw text, or uploaded as .fasta/.txt/.csv/.pdb files. |
| Batch fetch | Fetch sequences by PDB ID from RCSB. |
Output columns
| Column | Description |
|---|---|
| Structure | Full per-residue prediction string using H (helix), E (sheet), T (turn), C (coil). |
| Helix % / Sheet % / Turn % / Coil % | Fraction of residues assigned to each state. |
| Helix # / Sheet # / Turn # | Number of distinct regions of each type. |
| Helix avg / Sheet avg / Turn avg | Mean length (in residues) of each region type. |
Results can be downloaded as CSV, JSON, or copied directly.
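The summary columns can be recomputed from a downloaded Structure string. The sketch below is one plausible way to do it, assuming a "region" means a maximal run of the same letter and that the string is non-empty; `summarize` is a hypothetical helper, not part of ProteinIQ, and the tool's own rounding may differ.

```python
from itertools import groupby

def summarize(structure):
    """Compute percentage, region count, and mean region length per state."""
    # Collapse the string into (state, run_length) pairs, e.g. ("H", 6).
    runs = [(state, len(list(group))) for state, group in groupby(structure)]
    summary = {}
    for state in "HETC":
        lengths = [n for s, n in runs if s == state]
        total = sum(lengths)
        summary[state] = {
            "percent": round(100 * total / len(structure), 1),
            "regions": len(lengths),
            "avg_length": round(total / len(lengths), 1) if lengths else 0.0,
        }
    return summary

stats = summarize("CCHHHHHHCCTTEEEEECCC")
print(stats["H"])  # {'percent': 30.0, 'regions': 1, 'avg_length': 6.0}
```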
How does Chou-Fasman work?
The algorithm runs in four steps, applied sequentially to each sequence.
1. Nucleation
The method scans for local stretches rich in structure-forming residues:
- Helix nucleation: At least 4 of any 6 consecutive residues must have a helix propensity $P(\alpha) > 100$.
- Sheet nucleation: At least 3 of any 5 consecutive residues must have a sheet propensity $P(\beta) > 100$.
These thresholds reflect the cooperative nature of secondary structure formation. A single strong former surrounded by breakers won't nucleate.
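The nucleation scan can be sketched as a sliding window that counts formers. This is a minimal illustration, not ProteinIQ's implementation: `nucleation_sites` is a hypothetical name, a former is assumed to be any residue with propensity above 100, and the small table holds commonly tabulated helix values for a handful of residues that should be checked against a reference before real use.

```python
# Helix propensities for a few residues (illustrative subset; verify before use).
P_ALPHA = {"E": 151, "M": 145, "A": 142, "L": 121, "S": 77, "N": 67, "G": 57, "P": 57}

def nucleation_sites(seq, table, window, min_formers, threshold=100):
    """Return start indices of windows that contain enough structure formers."""
    sites = []
    for start in range(len(seq) - window + 1):
        # Count residues in the window whose propensity exceeds the threshold.
        formers = sum(table[res] > threshold for res in seq[start:start + window])
        if formers >= min_formers:
            sites.append(start)
    return sites

# Helix rule: at least 4 formers in any 6 consecutive residues.
print(nucleation_sites("GSPEALLMAEGNPG", P_ALPHA, window=6, min_formers=4))
# [1, 2, 3, 4, 5, 6]
```

The same function covers sheet nucleation when called with a sheet propensity table, `window=5`, and `min_formers=3`.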
2. Extension
Each nucleated region extends outward one residue at a time in both directions. Extension continues as long as the average propensity of 4 consecutive residues at the boundary stays above 100. Once this running average drops below the threshold, the region terminates.
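One way to read the extension rule is sketched below. The exact placement of the 4-residue window "at the boundary" varies between implementations; this version assumes the window's outermost residue is the candidate being added, and it assumes the seed is at least 3 residues long. `extend` and `mean_propensity` are hypothetical helpers, and the propensity subset is illustrative.

```python
# Helix propensities for a few residues (illustrative subset; verify before use).
P_ALPHA = {"E": 151, "M": 145, "A": 142, "L": 121, "S": 77, "N": 67, "G": 57, "P": 57}

def mean_propensity(seq, table, start, size=4):
    """Average propensity of the window beginning at `start`."""
    window = seq[start:start + size]
    return sum(table[res] for res in window) / len(window)

def extend(seq, table, start, end, size=4, threshold=100):
    """Grow the half-open region [start, end) while boundary windows average above threshold."""
    # Right side: the tested window ends on the candidate residue at index `end`.
    while end < len(seq) and mean_propensity(seq, table, end - size + 1, size) > threshold:
        end += 1
    # Left side: the tested window begins on the candidate residue at index `start - 1`.
    while start > 0 and mean_propensity(seq, table, start - 1, size) > threshold:
        start -= 1
    return start, end

# Seed covering residues 3..8 ("EALLMA") of the example sequence.
print(extend("GSPEALLMAEGNPG", P_ALPHA, start=3, end=9))  # (1, 12)
```

Under this reading a weak residue can still be absorbed at the edge when its three inner neighbors are strong formers, because only the window average is tested.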
3. Turn prediction
Turns are predicted for each tetrapeptide (4-residue window) that satisfies all three conditions:
- The product of positional turn frequencies, $f(i) \cdot f(i+1) \cdot f(i+2) \cdot f(i+3)$, exceeds $7.5 \times 10^{-5}$. These frequencies capture the preference of each amino acid for specific positions within a turn.
- The average turn propensity of the four residues exceeds 100.
- The sum of turn propensities $P(t)$ exceeds both the sum of helix propensities $P(\alpha)$ and the sum of sheet propensities $P(\beta)$ for the same four residues. This ensures turns are only predicted where turn tendency genuinely dominates.
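The three turn conditions translate into a single boolean test per window. In the sketch below, the cutoff of 0.000075 is the value usually quoted for the method and is treated as an assumption; the positional frequencies are rounded, illustrative numbers rather than an authoritative table, and `is_turn` is a hypothetical name.

```python
from math import prod

# Illustrative values for five residues only (verify before use).
POS_FREQ = {  # preference for turn positions i, i+1, i+2, i+3
    "N": (0.16, 0.08, 0.19, 0.09), "P": (0.10, 0.30, 0.03, 0.07),
    "G": (0.10, 0.09, 0.19, 0.15), "D": (0.15, 0.11, 0.18, 0.08),
    "L": (0.06, 0.03, 0.04, 0.07),
}
P_TURN = {"N": 156, "G": 156, "P": 152, "D": 146, "L": 59}
P_ALPHA = {"N": 67, "G": 57, "P": 57, "D": 101, "L": 121}
P_BETA = {"N": 89, "G": 75, "P": 55, "D": 54, "L": 130}

def is_turn(window, pos_freq, p_turn, p_alpha, p_beta, cutoff=7.5e-5):
    """Apply the three turn conditions to one 4-residue window."""
    # Condition 1: product of the position-specific frequencies.
    product = prod(pos_freq[res][i] for i, res in enumerate(window))
    turn_sum = sum(p_turn[res] for res in window)
    return (product > cutoff
            and turn_sum / 4 > 100                               # condition 2
            and turn_sum > sum(p_alpha[res] for res in window)   # condition 3
            and turn_sum > sum(p_beta[res] for res in window))

print(is_turn("NPGD", POS_FREQ, P_TURN, P_ALPHA, P_BETA))  # True
print(is_turn("LLLL", POS_FREQ, P_TURN, P_ALPHA, P_BETA))  # False
```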
4. Conflict resolution and cleanup
When predicted helix and sheet regions overlap, the algorithm compares the total helix propensity ($\sum P(\alpha)$) against total sheet propensity ($\sum P(\beta)$) across the overlap. The higher sum wins. After resolution, any helix or sheet region shorter than 5 residues is removed as unreliable.
Positions not assigned to helix, sheet, or turn default to random coil (C).
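Conflict resolution and cleanup can be sketched with two small helpers. Both names are hypothetical, regions are assumed to be half-open `(start, end)` index pairs, and ties are given to the sheet here only because the text does not say how they are broken. The propensity values cover three residues and are illustrative.

```python
# Propensities for three residues (illustrative subset; verify before use).
P_ALPHA = {"L": 121, "V": 106, "I": 108}
P_BETA = {"L": 130, "V": 170, "I": 160}

def resolve_overlap(seq, helix, sheet, p_alpha, p_beta):
    """Return 'H' or 'E' for the overlap of two regions, or None if they are disjoint."""
    start, end = max(helix[0], sheet[0]), min(helix[1], sheet[1])
    if start >= end:
        return None
    overlap = seq[start:end]
    helix_total = sum(p_alpha[res] for res in overlap)
    sheet_total = sum(p_beta[res] for res in overlap)
    # Assumption: a tie goes to the sheet.
    return "H" if helix_total > sheet_total else "E"

def drop_short(regions, min_length=5):
    """Discard regions shorter than min_length residues."""
    return [(start, end) for start, end in regions if end - start >= min_length]

# Helix 0..6 and sheet 4..9 overlap on residues 4..6 ("LVI").
print(resolve_overlap("MAELLVIVYT", (0, 7), (4, 10), P_ALPHA, P_BETA))  # E
print(drop_short([(0, 7), (12, 15)]))  # [(0, 7)]
```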
Understanding the results
The Structure string is the primary output. Each character maps directly to a position in the input sequence:
- H (helix): The residue is in a predicted alpha-helix region
- E (sheet): The residue is in a predicted beta-sheet region
- T (turn): The residue is at a predicted turn
- C (coil): No regular structure predicted
Typical globular proteins show 30-40% helix and 15-25% sheet. If a prediction shows 0% for both, the sequence may be intrinsically disordered, or too short for meaningful nucleation (sequences under ~20 residues often produce all-coil predictions).
The percentage and region count columns are useful for comparing across sequences. A protein with 35% helix in 4 regions has a different architecture than one with 35% helix in 1 long region.
When to use Chou-Fasman vs alternatives
Chou-Fasman is a sequence-only, statistics-only method. Several alternatives exist depending on what matters most:
- ProstT5 uses a protein language model and achieves over 80% accuracy for 3-state (H/E/C) prediction. For any serious research application, ProstT5 is the better choice.
- DSSP assigns secondary structure from a solved 3D structure (PDB file), not from sequence. If a crystal or cryo-EM structure exists, DSSP gives ground truth rather than a prediction.
- ESMFold and AlphaFold 2 predict full 3D structures, from which secondary structure can be derived. These are slower but far more informative.
Chou-Fasman's niche is speed and interpretability. Every prediction can be traced back to a specific propensity value and no black-box model is involved. This makes it well-suited for teaching the fundamentals of structure prediction, for quickly screening large sequence sets, and for situations where understanding why a prediction was made matters more than accuracy.