Chou-Fasman

Predict protein secondary structure from amino acid propensities for helices, sheets, and turns.

Related tools

Aggrescan3D

Faithful static-mode Aggrescan3D wrapper for per-residue aggregation propensity analysis from a single protein structure.

Protein charge plot

Plot net charge vs pH for protein sequences. Visualize how protein charge changes across pH 0-14 and identify the isoelectric point (pI) where the net charge crosses zero.

DSSP

Assign protein secondary structure using the DSSP algorithm. The gold standard for hydrogen bond-based structure assignment from coordinates.

FindPept

Match experimental peptide masses against theoretical digest fragments of a protein sequence. Identify peptides from mass spectrometry data by peptide mass fingerprinting.

Hydropathy plot

Generate Kyte-Doolittle hydropathy plots to visualize hydrophobic and hydrophilic regions along protein sequences. Identify transmembrane domains and surface-exposed regions.

Hydrophobicity plot

Generate hydrophobicity plots using 24 different amino acid scales. Visualize hydrophobic and hydrophilic regions for protein analysis, epitope prediction, and membrane protein studies.

Peptide cutter

Predict protease and chemical cleavage sites across a protein sequence for up to 39 enzymes simultaneously. Identify where each enzyme cuts, the cleavage residue, and context window around each site.

Peptide mass calculator

Cleave a protein sequence with a chosen protease and compute the masses of the resulting peptides. Supports multiple enzymes, missed cleavages, chemical modifications, and different ion types for mass spectrometry experiment planning.

PROPKA 3

Predict pKa values of ionizable groups in proteins and protein-ligand complexes from 3D structure. PROPKA calculates environment-driven pKa shifts for standard ionizable residues, terminal groups, and supported ligand atom types.

Protein parameters

Calculate protein parameters, including molecular weight, theoretical pI, extinction coefficients, aromaticity, secondary structure fractions, atomic composition, estimated half-life, and indices such as the instability index, aliphatic index, and GRAVY.

What is the Chou-Fasman method?

The Chou-Fasman method predicts protein secondary structure from amino acid sequence alone. Published by Peter Chou and Gerald Fasman in 1974, it was the first widely used algorithm to show that local amino acid composition carries enough information to predict helices, sheets, and turns.

The approach is statistical rather than physical. Chou and Fasman counted how often each amino acid appeared in helices, sheets, or turns within known crystal structures, then converted those frequencies into propensity scores. A propensity above 100 means the amino acid favors that structure; below 100 means it disfavors it. These 60 numbers (20 amino acids times 3 structure types) form the core of the model; turn prediction additionally uses position-specific turn frequencies.
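As a rough illustration of how a count becomes a propensity, the sketch below assumes the usual normalization: the fraction of a residue found in a given state, divided by the fraction of all residues found in that state, scaled so that 100 means no preference. The function name and all counts are made up for illustration and do not come from any real dataset.

```python
def propensity(count_in_state, count_total, state_total, grand_total):
    """Propensity scaled so that 100 means 'no preference'.

    count_in_state: occurrences of one amino acid inside the state (e.g. helix)
    count_total:    occurrences of that amino acid anywhere
    state_total:    residues of any type inside the state
    grand_total:    residues of any type anywhere
    """
    # Equivalent to 100 * (count_in_state / count_total) / (state_total / grand_total),
    # rearranged so the integer products are formed before the single division.
    return 100 * (count_in_state * grand_total) / (count_total * state_total)


# Hypothetical counts: 40% of all residues are helical in this toy dataset.
strong_former = propensity(60, 100, 4000, 10000)   # 60% of this residue is helical
strong_breaker = propensity(24, 100, 4000, 10000)  # only 24% of this residue is helical
print(strong_former, strong_breaker)
```

A residue that appears in helices more often than the dataset average lands above 100, and one that appears less often lands below it.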

Accuracy sits around 60-65%, well below modern deep learning methods like ProstT5 or ESMFold. The method remains useful as a teaching tool and as a fast, interpretable baseline. Because it uses no MSA, no neural network, and no GPU, predictions return in seconds.

How to use Chou-Fasman online

ProteinIQ runs the canonical Chou-Fasman algorithm (1978 parameter set) on its servers. Paste one or more sequences and results are returned within seconds.

Inputs

| Input | Description |
| --- | --- |
| Protein sequence | One or more sequences in FASTA format, raw text, or uploaded as .fasta/.txt/.csv/.pdb files. |
| Batch fetch | Fetch sequences by PDB ID from RCSB. |

Output columns

| Column | Description |
| --- | --- |
| Structure | Full per-residue prediction string using H (helix), E (sheet), T (turn), C (coil). |
| Helix % / Sheet % / Turn % / Coil % | Percentage of residues assigned to each state. |
| Helix # / Sheet # / Turn # | Number of distinct regions of each type. |
| Helix avg / Sheet avg / Turn avg | Mean length (in residues) of each region type. |

Results can be downloaded as CSV, JSON, or copied directly.

How does Chou-Fasman work?

The algorithm runs in four steps, applied sequentially to each sequence.

1. Nucleation

The method scans for local stretches rich in structure-forming residues:

  • Helix nucleation: At least 4 of any 6 consecutive residues must have $P_\alpha > 100$.
  • Sheet nucleation: At least 3 of any 5 consecutive residues must have $P_\beta > 100$.

These thresholds reflect the cooperative nature of secondary structure formation. A single strong former surrounded by breakers won't nucleate.
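The nucleation rule can be sketched as a sliding-window count. This is a minimal illustration, not ProteinIQ's implementation: the function name is hypothetical, and the small helix propensity table holds illustrative values resembling the published 1978 set, which should be checked against the real parameter table before any serious use.

```python
# Illustrative helix propensities (x100) for a handful of residues.
# Assumption: values resemble the published table; verify before relying on them.
P_ALPHA = {"A": 142, "E": 151, "L": 121, "M": 145, "K": 114,
           "G": 57, "P": 57, "S": 77, "N": 67}


def nucleation_windows(seq, table, window, min_formers, threshold=100):
    """Return start indices of windows with at least `min_formers` residues above threshold."""
    starts = []
    for i in range(len(seq) - window + 1):
        # Residues missing from the table count as non-formers.
        formers = sum(1 for aa in seq[i:i + window] if table.get(aa, 0) > threshold)
        if formers >= min_formers:
            starts.append(i)
    return starts


# Helix rule: 4 of 6. The sheet rule would use window=5, min_formers=3 with a sheet table.
helix_seeds = nucleation_windows("GPSAELMKAEGPNS", P_ALPHA, window=6, min_formers=4)
lonely_former = nucleation_windows("GPGEPGPG", P_ALPHA, window=6, min_formers=4)
print(helix_seeds, lonely_former)
```

The second call shows the cooperative behaviour described above: a single glutamate among glycines and prolines produces no seed.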

2. Extension

Each nucleated region extends outward one residue at a time in both directions. Extension continues as long as the average propensity of 4 consecutive residues at the boundary stays above 100. Once this running average drops below the threshold, the region terminates.
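One way to read the extension rule is sketched below. It assumes the 4-residue window is the one that ends (or starts) at the residue being considered for inclusion; other implementations place the window differently, so treat this placement as an assumption. The function names are hypothetical and the propensity values are illustrative only.

```python
# Illustrative helix propensities (x100); verify against the published table.
P_ALPHA = {"A": 142, "E": 151, "L": 121, "M": 145, "K": 114,
           "G": 57, "P": 57, "S": 77, "N": 67}


def window_mean(residues, table):
    """Average propensity of a non-empty stretch of residues."""
    return sum(table.get(aa, 0) for aa in residues) / len(residues)


def extend(seq, table, start, end, window=4, threshold=100):
    """Grow the half-open region [start, end) outward while the boundary window stays above threshold."""
    # Rightward: the window ends at the candidate residue seq[end].
    while end < len(seq) and window_mean(seq[max(0, end - window + 1):end + 1], table) > threshold:
        end += 1
    # Leftward: the window starts at the candidate residue seq[start - 1].
    while start > 0 and window_mean(seq[start - 1:start - 1 + window], table) > threshold:
        start -= 1
    return start, end


# Start from a 6-residue seed covering "AELMKA" (indices 3 to 8).
region = extend("GPSAELMKAEGPNS", P_ALPHA, start=3, end=9)
print(region)
```

Because the test is an average, a weak residue can still be absorbed when its three neighbours are strong formers, which is one reason region boundaries from this method are approximate.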

3. Turn prediction

Turns are predicted for each tetrapeptide (4-residue window) that satisfies all three conditions:

  • The product of positional turn frequencies exceeds $7.5 \times 10^{-5}$. These frequencies capture the preference of each amino acid for specific positions within a turn.
  • The average turn propensity $P_t$ of the four residues exceeds 100.
  • The sum of $P_t$ values exceeds both the sum of $P_\alpha$ and the sum of $P_\beta$ for the same four residues. This ensures turns are only predicted where turn tendency genuinely dominates.
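The three turn conditions translate into a short check per tetrapeptide. The sketch below is a hedged illustration: the function names and table layout are hypothetical, and the numbers in `PARAMS` are illustrative values that only resemble the published tables, so substitute the real parameter set before trusting any output.

```python
from math import prod

# residue: (P_alpha, P_beta, P_turn, positional frequencies for turn positions 1 to 4)
# Assumption: illustrative values only, not an authoritative parameter set.
PARAMS = {
    "N": (67, 89, 156, (0.161, 0.083, 0.191, 0.091)),
    "P": (57, 55, 152, (0.102, 0.301, 0.034, 0.068)),
    "G": (57, 75, 156, (0.102, 0.085, 0.190, 0.152)),
    "D": (101, 54, 146, (0.147, 0.110, 0.179, 0.081)),
    "A": (142, 83, 66, (0.060, 0.076, 0.035, 0.058)),
    "L": (121, 130, 59, (0.061, 0.025, 0.036, 0.070)),
}


def is_turn(tetrapeptide, params=PARAMS, cutoff=7.5e-5):
    """Apply the three turn conditions to a 4-residue window."""
    rows = [params[aa] for aa in tetrapeptide]
    # Condition 1: each residue contributes the frequency for its own position in the turn.
    positional = prod(row[3][k] for k, row in enumerate(rows))
    sum_alpha = sum(row[0] for row in rows)
    sum_beta = sum(row[1] for row in rows)
    sum_turn = sum(row[2] for row in rows)
    return (positional > cutoff                  # condition 1
            and sum_turn / 4 > 100               # condition 2
            and sum_turn > sum_alpha             # condition 3, helix
            and sum_turn > sum_beta)             # condition 3, sheet


def turn_positions(seq):
    """Start indices of every tetrapeptide that passes all three conditions."""
    return [i for i in range(len(seq) - 3) if is_turn(seq[i:i + 4])]


print(is_turn("NPDG"), is_turn("ALLA"), turn_positions("ALNPDGLA"))
```

The positional product is what separates this from a plain propensity average: the same four residues in a different order can fail condition 1.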

4. Conflict resolution and cleanup

When predicted helix and sheet regions overlap, the algorithm compares the total helix propensity ($\sum P_\alpha$) against total sheet propensity ($\sum P_\beta$) across the overlap. The higher sum wins. After resolution, any helix or sheet region shorter than 5 residues is removed as unreliable.

Positions not assigned to helix, sheet, or turn default to random coil (C).
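Conflict resolution and cleanup can be sketched on a per-residue basis. The code below is one possible arrangement under stated assumptions: regions are half-open `(start, end)` intervals, ties go to sheet (the text does not say how ties are handled), turns are left out for brevity, and the function name and propensity values are illustrative.

```python
from itertools import groupby

# Illustrative propensities (x100); verify against the published table.
P_ALPHA = {"E": 151, "V": 106, "G": 57}
P_BETA = {"E": 37, "V": 170, "G": 75}


def resolve(seq, helix_regions, sheet_regions, p_alpha, p_beta, min_len=5):
    """Merge helix and sheet regions into one H/E/C string."""
    n = len(seq)
    helix = [any(s <= i < e for s, e in helix_regions) for i in range(n)]
    sheet = [any(s <= i < e for s, e in sheet_regions) for i in range(n)]
    # "X" marks positions claimed by both a helix and a sheet region.
    tags = ["X" if h and s else "H" if h else "E" if s else "C"
            for h, s in zip(helix, sheet)]

    resolved, pos = [], 0
    for tag, run in groupby(tags):
        length = len(list(run))
        if tag == "X":
            segment = seq[pos:pos + length]
            alpha = sum(p_alpha.get(aa, 0) for aa in segment)
            beta = sum(p_beta.get(aa, 0) for aa in segment)
            tag = "H" if alpha > beta else "E"  # assumption: ties go to sheet
        resolved.extend(tag * length)
        pos += length

    # Cleanup: helix or sheet runs shorter than min_len revert to coil.
    final = []
    for tag, run in groupby(resolved):
        length = len(list(run))
        final.append(("C" if tag in "HE" and length < min_len else tag) * length)
    return "".join(final)


seq = "EEEEEEVVVVVVGGG"
sheet_wins = resolve(seq, [(0, 8)], [(6, 12)], P_ALPHA, P_BETA)
helix_wins = resolve(seq, [(0, 8)], [(4, 12)], P_ALPHA, P_BETA)
print(sheet_wins, helix_wins)
```

In the second call the helix takes the contested stretch, which leaves a 4-residue sheet remnant that the cleanup step then turns back into coil.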

Understanding the results

The Structure string is the primary output. Each character maps directly to a position in the input sequence:

  • H (helix): The residue is in a predicted alpha-helix region
  • E (sheet): The residue is in a predicted beta-sheet region
  • T (turn): The residue is at a predicted turn
  • C (coil): No regular structure predicted

Typical globular proteins show 30-40% helix and 15-25% sheet. If a prediction shows 0% for both, the sequence may be intrinsically disordered, or too short for meaningful nucleation (sequences under ~20 residues often produce all-coil predictions).

The percentage and region count columns are useful for comparing across sequences. A protein with 35% helix in 4 regions has a different architecture than one with 35% helix in 1 long region.
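The summary columns can be recomputed from any Structure string, which is handy when comparing predictions from different tools. This is a minimal sketch with a hypothetical function name; it assumes a region is a maximal run of identical characters.

```python
from itertools import groupby


def summarize(structure):
    """Percent, region count, and mean region length for each state in an H/E/T/C string."""
    runs = [(state, len(list(group))) for state, group in groupby(structure)]
    summary = {}
    for state in "HETC":
        lengths = [n for s, n in runs if s == state]
        summary[state] = {
            "percent": 100 * sum(lengths) / len(structure),
            "regions": len(lengths),
            "avg_length": sum(lengths) / len(lengths) if lengths else 0.0,
        }
    return summary


stats = summarize("CCHHHHHHCCEEEEETTCCC")
print(stats["H"], stats["E"])
```

Two proteins with the same helix percentage but different region counts will show different `regions` and `avg_length` values here, which is the architectural difference described above.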

When to use Chou-Fasman vs alternatives

Chou-Fasman is a sequence-only, statistics-only method. Several alternatives exist depending on what matters most:

  • ProstT5 uses a protein language model and achieves over 80% accuracy for 3-state (H/E/C) prediction. For any serious research application, ProstT5 is the better choice.
  • DSSP assigns secondary structure from a solved 3D structure (PDB file), not from sequence. If a crystal or cryo-EM structure exists, DSSP gives ground truth rather than a prediction.
  • ESMFold and AlphaFold 2 predict full 3D structures, from which secondary structure can be derived. These are slower but far more informative.

Chou-Fasman's niche is speed and interpretability. Every prediction can be traced back to a specific propensity value and no black-box model is involved. This makes it well-suited for teaching the fundamentals of structure prediction, for quickly screening large sequence sets, and for situations where understanding why a prediction was made matters more than accuracy.