HighFold

Cyclic peptide structure prediction with CycPOEM-enhanced AlphaFold2

What is HighFold?

HighFold predicts the three-dimensional structures of cyclic peptides — molecules where the chain loops back on itself through a head-to-tail peptide bond, sometimes reinforced by disulfide bridges. Standard protein structure prediction methods like AlphaFold 2 treat sequences as linear chains and encode residue positions accordingly, which means they systematically misrepresent the topology of cyclic peptides.

HighFold solves this by replacing AlphaFold's linear position encoding with CycPOEM (Cyclic Position Offset Encoding Matrix), a distance matrix that accounts for head-to-tail cyclization and disulfide bond shortcuts. The rest of the AlphaFold2/ColabFold pipeline — MSA generation, Evoformer attention, and structure module — remains unchanged. The result is substantially more accurate cyclic peptide structures without retraining any neural network weights.

The method was developed at Zhejiang University of Technology and Shanghai Highslab Therapeutics and was published in Briefings in Bioinformatics (2024).

How CycPOEM works

AlphaFold encodes the relative position between two residues as a simple linear offset: residue $i$ and residue $j$ are $|i - j|$ apart. For a cyclic peptide of length $N$, this ignores that residues 1 and $N$ are covalently bonded.

CycPOEM constructs a more accurate distance matrix using a modified Floyd-Warshall shortest-path algorithm:

An $N \times N$ adjacency matrix is initialized with distance 1 between consecutive residues
A head-to-tail edge connects residue 1 to residue $N$ (distance 1)
If disulfide bonds are specified, each pair of bridged cysteine residues also receives distance 1
The Floyd-Warshall algorithm then computes the shortest path between every pair of residues through this graph

This means residues 1 and 20 of a 20-residue cyclic peptide are encoded as being 1 apart rather than 19, and no pair of residues is more than 10 apart, reflecting the actual molecular topology.
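
The construction above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not HighFold's own code: the function name `cyclic_offset_matrix`, its parameters, and the 1-based pair format are hypothetical, and the plain Floyd-Warshall loop stands in for whatever modification the published method applies.

```python
import numpy as np

def cyclic_offset_matrix(n, disulfide_pairs=(), head_to_tail=True):
    """Shortest-path distances between residues of a (cyclic) peptide.

    n               -- number of residues
    disulfide_pairs -- iterable of 1-based (i, j) residue pairs (assumed format)
    head_to_tail    -- add the edge between residue 1 and residue n
    """
    dist = np.full((n, n), np.inf)
    np.fill_diagonal(dist, 0)

    # Consecutive residues are one bond apart.
    idx = np.arange(n - 1)
    dist[idx, idx + 1] = dist[idx + 1, idx] = 1

    # Head-to-tail peptide bond closes the ring.
    if head_to_tail and n > 1:
        dist[0, n - 1] = dist[n - 1, 0] = 1

    # Each disulfide bridge is an extra shortcut edge of length 1.
    for i, j in disulfide_pairs:
        if i == j:
            raise ValueError("a residue cannot be bridged to itself")
        dist[i - 1, j - 1] = dist[j - 1, i - 1] = 1

    # Floyd-Warshall: allow each residue k in turn as an intermediate stop.
    for k in range(n):
        dist = np.minimum(dist, dist[:, [k]] + dist[[k], :])

    return dist.astype(int)

ring = cyclic_offset_matrix(20)
bridged = cyclic_offset_matrix(20, disulfide_pairs=[(3, 13)])
```

In `ring`, residues 1 and 20 are 1 apart and residues 1 and 11 are 10 apart. In `bridged`, the added edge between residues 3 and 13 shortens paths that would otherwise go around the ring, for example from residue 2 to residue 14.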

Sign strategies

CycPOEM also encodes directionality. The Upper Negative (UN) strategy, which assigns negative values to the upper triangle of the matrix, produced the best results in benchmarks. This captures the asymmetric nature of peptide bonds (N→C directionality).
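
As a rough illustration of the UN strategy, the sketch below negates the upper triangle of a symmetric distance matrix. It assumes that "upper triangle" means entries whose column index exceeds the row index and that only the sign changes; the function name `apply_upper_negative` is hypothetical, and the toy ring distances are computed directly rather than taken from HighFold.

```python
import numpy as np

def apply_upper_negative(dist):
    """Return a signed copy of a symmetric distance matrix.

    Entries above the diagonal (column > row) are negated; the lower
    triangle and the diagonal are left unchanged.
    """
    signed = np.array(dist, dtype=int)
    upper = np.triu_indices(signed.shape[0], k=1)
    signed[upper] = -signed[upper]
    return signed

# Toy input: ring distances for a 6-residue head-to-tail cycle.
n = 6
i, j = np.indices((n, n))
ring = np.minimum(np.abs(i - j), n - np.abs(i - j))

signed = apply_upper_negative(ring)
```

The signed matrix is antisymmetric, so the entry for the pair (1, 2) and the entry for (2, 1) differ only in sign, which lets the encoding distinguish the two directions along the chain.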

Benchmark results

On a test set of 63 NMR-resolved cyclic peptide structures, HighFold achieved a median backbone RMSD of 1.058 Å, compared to 1.737 Å for AfCycDesign and 1.956 Å for standard AlphaFold. For cyclic peptides with disulfide bridges, the improvement was even more pronounced: 1.720 Å average RMSD versus 3.256 Å for AfCycDesign.

How to use HighFold online

ProteinIQ runs HighFold on A100 GPU infrastructure with pre-loaded AlphaFold2 model weights, eliminating the need to configure ColabFold, install HighFold's overlay, or manage GPU environments locally.

Input

| Input | Description |
| --- | --- |
| `Cyclic peptide` | Amino acid sequence in FASTA or raw format. Single chain only. Standard amino acids (20 canonical residues). |
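
To check a sequence locally before submitting, something like the following may help. It is a sketch of the stated input rules (FASTA or raw, one chain, 20 canonical residues); the function name `read_single_sequence` is hypothetical, and the service's own validation may be stricter or more lenient.

```python
CANONICAL = set("ACDEFGHIKLMNPQRSTVWY")  # the 20 canonical one-letter codes

def read_single_sequence(text):
    """Extract one amino acid sequence from FASTA or raw text."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]

    # More than one FASTA header means more than one chain.
    headers = [ln for ln in lines if ln.startswith(">")]
    if len(headers) > 1:
        raise ValueError("only a single chain is supported")

    # Join the sequence lines, drop whitespace, and normalize case.
    seq = "".join(ln for ln in lines if not ln.startswith(">"))
    seq = "".join(seq.split()).upper()
    if not seq:
        raise ValueError("empty sequence")

    bad = sorted(set(seq) - CANONICAL)
    if bad:
        raise ValueError(f"non-canonical residues: {', '.join(bad)}")
    return seq
```

Both `read_single_sequence(">pep\nCKLW\nGAC")` and `read_single_sequence("cklwgac")` return the same cleaned string, while input containing letters such as `X` or `B` raises `ValueError`.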

Cyclization settings

| Setting | Description |
| --- | --- |
| `Disulfide bond pairs` | Residue pairs forming disulfide bridges, e.g. `1 5, 3 8`. Each pair is two space-separated residue numbers; multiple pairs are comma-separated. Leave empty for head-to-tail cyclization only. |
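
The pair format can be parsed as sketched below. The function name `parse_disulfide_pairs` is hypothetical, and the optional check that both positions are cysteines is an assumption based on disulfide chemistry, not a documented behavior of the service.

```python
def parse_disulfide_pairs(text, sequence=None):
    """Parse a string such as "1 5, 3 8" into 1-based residue pairs.

    If `sequence` is given, also check that every position is in range
    and refers to a cysteine (assumed validation, see lead-in).
    """
    pairs = []
    for chunk in text.split(","):
        chunk = chunk.strip()
        if not chunk:
            continue  # empty field means head-to-tail cyclization only
        fields = chunk.split()
        if len(fields) != 2:
            raise ValueError(f"expected two residue numbers, got {chunk!r}")
        i, j = (int(f) for f in fields)
        if i == j:
            raise ValueError(f"residue {i} is paired with itself")
        if sequence is not None:
            for pos in (i, j):
                if not 1 <= pos <= len(sequence):
                    raise ValueError(f"residue {pos} is out of range")
                if sequence[pos - 1] != "C":
                    raise ValueError(f"residue {pos} is not a cysteine")
        pairs.append((min(i, j), max(i, j)))
    return pairs
```

For the example in the table, `parse_disulfide_pairs("1 5, 3 8")` yields the pairs (1, 5) and (3, 8), and an empty string yields an empty list.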

Prediction settings

| Setting | Default | Description |
| --- | --- | --- |
| `Number of models` | `5` | AlphaFold2 models to run (1–5). All five gives the most reliable ranking but increases runtime proportionally. |
| `Random seeds` | `1` | Number of random seeds per model (1–5). More seeds produce greater diversity in predicted conformations. |
| `Model type` | `AlphaFold2` | Model variant. `AlphaFold2` or `AlphaFold2-ptm`. Both are monomer models suitable for cyclic peptides. |

MSA settings

| Setting | Default | Description |
| --- | --- | --- |
| `MSA mode` | `UniRef+Environmental` | Database for multiple sequence alignment generation via MMseqs2. `UniRef+Environmental` searches UniRef and environmental sequence databases. `UniRef only` is faster. `No MSA` runs single-sequence mode (no evolutionary information). |
| `Use templates` | On | Query the PDB for structural templates. Useful when homologous structures exist. |

Advanced settings

| Setting | Default | Description |
| --- | --- | --- |
| `AMBER relaxation` | On | Run OpenMM/AMBER energy minimization on predicted structures. Recommended for cyclic peptides, where strained geometry is common. |
| `Structures to relax` | `1` | Number of top-ranked predictions to relax (0–5). Only applies when AMBER relaxation is enabled. |

Output

Each prediction run produces up to 5 ranked structures (depending on model count and seeds). Results include:

PDB files: 3D coordinates for each ranked prediction, viewable in the built-in 3D viewer
pLDDT scores: Per-residue confidence (0–100), averaged across the structure for ranking
pTM scores: Predicted TM-score (when available), used as the primary ranking metric
MSA alignment: The multiple sequence alignment used as input (.a3m format)
ColabFold plots: Coverage and pLDDT visualization plots (.png)

Interpreting confidence scores

| pLDDT | Interpretation |
| --- | --- |
| > 90 | High confidence: backbone and sidechain positions likely accurate |
| 70–90 | Good confidence: backbone probably correct, sidechains less certain |
| 50–70 | Low confidence: treat with caution, may need experimental validation |
| < 50 | Very low confidence: structure unreliable in this region |

For cyclic peptides, pLDDT values tend to be lower than for globular proteins of similar size, particularly in loop regions. A pLDDT of 70+ for a short cyclic peptide generally indicates a successful prediction.
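
The bands above can be applied programmatically, for example when screening many predictions. The sketch below is illustrative only: the function names are hypothetical, and because the table's ranges share their endpoints, the choice to place a score of exactly 90 or 70 in the "good" band and exactly 50 in the "low" band is an assumption.

```python
def plddt_band(score):
    """Map a pLDDT value (0-100) to the confidence band from the table."""
    if not 0 <= score <= 100:
        raise ValueError("pLDDT must lie between 0 and 100")
    if score > 90:
        return "high"
    if score >= 70:   # boundary handling is an assumption
        return "good"
    if score >= 50:
        return "low"
    return "very low"

def mean_plddt(per_residue):
    """Average per-residue pLDDT across the structure."""
    scores = list(per_residue)
    if not scores:
        raise ValueError("no pLDDT values given")
    return sum(scores) / len(scores)

# Hypothetical per-residue scores for a 7-residue cyclic peptide.
scores = [82.0, 78.5, 71.0, 64.0, 69.5, 80.0, 85.0]
overall = plddt_band(mean_plddt(scores))
bands = [plddt_band(s) for s in scores]
```

Reporting both the overall band and the per-residue bands makes it easier to spot a peptide whose average looks acceptable while one loop segment falls into a lower band.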

Limitations

Natural amino acids only: HighFold supports the 20 canonical amino acids. Cyclic peptides containing non-natural amino acids, N-methylated residues, or D-amino acids cannot be modeled. (The successor HighFold3 addresses this limitation but is not yet available here.)
Single chain: Only monomeric cyclic peptides are supported. Cyclic peptide–protein complexes require HighFold_Multimer, which uses a different model architecture.
Sequence length: Very short peptides (< 5 residues) may not generate meaningful MSAs, reducing prediction quality. For such sequences, consider the `No MSA` mode.
AMBER relaxation failures: Energy minimization can fail on highly strained or unusual cyclic topologies. When this happens, the unrelaxed structure is returned instead.

Related tools

AlphaFold 2: General-purpose protein structure prediction; the foundation HighFold builds on
Boltz-2: Biomolecular structure prediction including protein-ligand complexes
ESMFold: Fast single-sequence structure prediction (no MSA), useful for quick screening before HighFold
Chai-1: Multi-modal biomolecular structure prediction
