
Chou-Fasman
Predict protein secondary structure using the classic Chou-Fasman algorithm based on amino acid propensities for α-helices, β-sheets, and turns.
The Chou-Fasman method is an empirical algorithm for predicting protein secondary structure from amino acid sequences. Developed by Peter Y. Chou and Gerald D. Fasman in 1974, it was one of the first computational approaches to predicting protein structure from sequence, and it relies on statistical analysis of known protein structures.
This method laid the foundation for modern secondary structure prediction and remains valuable for understanding the relationship between amino acid composition and structural preferences.
The Chou-Fasman algorithm operates on the principle that different amino acids have varying propensities to form specific secondary structures. The method consists of three main steps:
First, each of the 20 standard amino acids is assigned numerical propensities for forming α-helices, β-sheets, and turns.
These propensities were derived from statistical analysis of known protein crystal structures available in the 1970s.
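As a rough illustration of how such a propensity can be derived, the sketch below divides the fraction of a residue type found in a given structure by the fraction of all residues found in that structure, so values above 1 indicate a preference for it. The function name `propensities`, the single-letter state labels, and the toy sequence are assumptions made for this example. They are not the published data set or the ProteinIQ implementation.

```python
from collections import Counter


def propensities(sequence: str, states: str, target: str) -> dict[str, float]:
    """Return the propensity of each residue type for the `target` state.

    P = (fraction of that residue's occurrences labelled `target`)
        / (fraction of all residues labelled `target`)
    """
    if len(sequence) != len(states):
        raise ValueError("sequence and states must have the same length")
    total = Counter(sequence)
    in_target = Counter(aa for aa, s in zip(sequence, states) if s == target)
    overall = sum(in_target.values()) / len(sequence)
    if overall == 0:
        raise ValueError(f"no residues are labelled {target!r}")
    return {aa: (in_target[aa] / n) / overall for aa, n in total.items()}


# Toy, made-up data: H = helix, C = everything else (hypothetical labels).
toy_sequence = "AEAEGGPGAE"
toy_states = "HHHHCCCCHH"
helix = propensities(toy_sequence, toy_states, "H")

for residue, value in sorted(helix.items()):
    print(f"{residue}: {value:.2f}")
```

In this toy example alanine and glutamate score above 1 because every occurrence lies in a helix, while glycine and proline score 0 because none do.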
Next, the algorithm scans the sequence for potential nucleation sites: short windows rich in residues that favor a given structure, where a helix or strand can begin.
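The nucleation search can be sketched as a sliding-window scan. The example below assumes the commonly cited helix rule of at least four structure-favoring residues in a window of six, with a cutoff of 1.0. These parameters, the function name `find_nucleation_sites`, and the partial propensity table are illustrative assumptions, and the values should be checked against the published tables before any real use.

```python
# Illustrative helix propensities for a handful of residues only.
# These are example values for the sketch, not a complete or verified table.
HELIX_P = {
    "A": 1.42, "E": 1.51, "L": 1.21, "M": 1.45, "K": 1.16,
    "G": 0.57, "P": 0.57, "S": 0.77, "N": 0.67, "T": 0.83,
}


def find_nucleation_sites(seq, table, window=6, min_formers=4, cutoff=1.0):
    """Return start indices of windows with enough structure-favoring residues."""
    sites = []
    for start in range(len(seq) - window + 1):
        segment = seq[start:start + window]
        # Residues missing from the table are treated as non-formers.
        formers = sum(1 for aa in segment if table.get(aa, 0.0) > cutoff)
        if formers >= min_formers:
            sites.append(start)
    return sites


sites = find_nucleation_sites("GSPAELMKAGNS", HELIX_P)
print(sites)
```

Overlapping hits, as in this example, would normally be merged into a single candidate region before extension.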
Finally, once a nucleation site is identified, the algorithm extends the predicted structure in both directions along the sequence until the local propensity for that structure falls too low to sustain it.
The Chou-Fasman method classifies each amino acid position into one of four categories:
α-Helix: Extended regions of right-handed helical structure, typically 4-40 residues in length. Helices are common in globular proteins and provide structural stability through backbone hydrogen bonding.
β-Sheet: Extended conformations that can participate in sheet formation with other strands. Sheets can be parallel, antiparallel, or mixed, and are crucial components of many protein architectures.
Turn: Short regions (typically 3-5 residues) where the protein chain reverses direction. Turns often connect secondary structure elements and are frequently found on protein surfaces.
Coil: Irregular, flexible regions that don't conform to regular secondary structure patterns. These regions often serve as linkers between structured domains or participate in protein function through conformational flexibility.
The Chou-Fasman method achieves approximately 60-65% accuracy in secondary structure prediction, which was remarkable for its time but is moderate by current standards.
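Accuracy figures of this kind are usually computed per residue, as the percentage of positions where the predicted state matches the experimentally observed one. The sketch below shows that arithmetic on two made-up state strings. The function name `per_residue_accuracy` and the single-letter state codes are assumptions for this example.

```python
def per_residue_accuracy(predicted: str, observed: str) -> float:
    """Return the percentage of positions where the two state strings agree."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not predicted:
        raise ValueError("cannot score an empty prediction")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(predicted)


# Made-up strings: H = helix, E = strand, C = coil (hypothetical labels).
accuracy = per_residue_accuracy("HHHHCCEEEC", "HHHCCCEEHH")
print(f"{accuracy:.1f}%")
```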
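Despite its limitations, the Chou-Fasman method demonstrated that secondary structure can be predicted, at least approximately, from statistical properties of the amino acid sequence alone.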
This pioneering work paved the way for modern machine learning approaches that achieve >80% accuracy in secondary structure prediction.
The Chou-Fasman method remains useful for:
Providing a quick overview of likely secondary structure content before applying more sophisticated prediction methods or experimental techniques.
Teaching the fundamental concepts of secondary structure prediction and the relationship between amino acid properties and structural preferences.
Analyzing how sequence variations might affect secondary structure preferences, particularly in protein engineering or evolutionary studies.
Performing preliminary structural analysis of large sequence datasets where computational efficiency is more important than maximum accuracy.
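For the quick-overview and large-dataset uses above, a per-residue prediction is often reduced to the fraction of residues in each state. The sketch below assumes the prediction is available as a string of single-letter codes (H, E, T, C). That encoding and the function name `structure_content` are assumptions for this example, and the actual output format of any given tool may differ.

```python
from collections import Counter


def structure_content(prediction: str, states: str = "HETC") -> dict[str, float]:
    """Return the fraction of residues assigned to each state."""
    if not prediction:
        raise ValueError("prediction must not be empty")
    counts = Counter(prediction)
    return {state: counts[state] / len(prediction) for state in states}


# Made-up prediction string for a ten-residue peptide.
content = structure_content("HHHHHEEETC")
for state, fraction in content.items():
    print(f"{state}: {fraction:.0%}")
```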
Using the Chou-Fasman predictor through ProteinIQ costs 1 credit per analysis, regardless of the number of sequences analyzed in a single job.
Based on: Chou, P.Y. & Fasman, G.D. (1974). Prediction of protein conformation. Biochemistry, 13(2), 222-245. DOI: 10.1021/bi00699a002