
AF-Cluster is a method for predicting multiple protein conformations by clustering a multiple sequence alignment (MSA) before running AlphaFold2. Standard AlphaFold2 predictions converge on a single dominant structure, even for proteins that adopt two or more biologically relevant folds. AF-Cluster addresses this by splitting the MSA into sequence subgroups using the DBSCAN density-based clustering algorithm, then generating separate AlphaFold2 predictions from each cluster.
The approach was developed by Hannah Wayment-Steele, Sergey Ovchinnikov, Lucy Colwell, Dorothee Kern, and colleagues at Brandeis University, and published in Nature in 2023. The authors validated the method on metamorphic proteins, including the cyanobacterial clock protein KaiB, where AF-Cluster correctly predicted both the ground-state and fold-switched conformations. NMR spectroscopy confirmed that a KaiB variant predicted by AF-Cluster was indeed stabilized in the opposite fold.
Proteins evolve under selective pressure to maintain function, and function often requires switching between conformational states. Homologous sequences in an MSA may carry co-evolutionary signals for different conformations. When the full MSA is fed to AlphaFold2, these conflicting signals average out and the prediction collapses onto a single state.
AF-Cluster separates these signals by clustering the MSA:
DBSCAN treats a sequence as a core point when at least min_samples sequences lie within epsilon distance of it, and sequences within epsilon distance of a core point are assigned to the same cluster. Sequences that do not fall within any dense region are labeled as noise.
Each cluster's alignment can then be used as input to AlphaFold2 independently, producing structure predictions that may capture different conformational states.
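
The core-point and noise rules described above can be illustrated with a small, dependency-free sketch. It is not the AF-Cluster implementation: the function names are hypothetical, and the distance is assumed to be the Euclidean distance between one-hot encodings of aligned sequences, which equals the square root of twice the number of mismatching columns.

```python
from math import sqrt

NOISE = -1  # label used for sequences outside every dense region


def distance(a: str, b: str) -> float:
    """Euclidean distance between one-hot encodings of two aligned sequences.

    Assumption: each mismatching column adds 2 to the squared distance.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    mismatches = sum(x != y for x, y in zip(a, b))
    return sqrt(2 * mismatches)


def dbscan(seqs: list[str], eps: float, min_samples: int) -> list[int]:
    """Return one cluster label per sequence; NOISE marks unclustered ones."""
    n = len(seqs)
    # Each neighborhood includes the sequence itself.
    neighbors = [
        [j for j in range(n) if distance(seqs[i], seqs[j]) <= eps]
        for i in range(n)
    ]
    is_core = [len(nb) >= min_samples for nb in neighbors]
    labels = [NOISE] * n
    cluster = 0
    for i in range(n):
        if labels[i] != NOISE or not is_core[i]:
            continue
        # Grow a new cluster outward from this unlabeled core point.
        labels[i] = cluster
        stack = [i]
        while stack:
            p = stack.pop()
            if not is_core[p]:
                continue  # border points join a cluster but do not expand it
            for q in neighbors[p]:
                if labels[q] == NOISE:
                    labels[q] = cluster
                    stack.append(q)
        cluster += 1
    return labels


seqs = ["AAAA", "AAAC", "AACA", "GGGG", "GGGT", "GGTG", "TCTC"]
labels = dbscan(seqs, eps=2.0, min_samples=3)
print(labels)  # [0, 0, 0, 1, 1, 1, -1]
```

In this toy alignment the two dense groups become clusters 0 and 1, while the last sequence has no close neighbors and is labeled as noise.
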
ProteinIQ provides AF-Cluster through a browser interface and runs the clustering pipeline on cloud infrastructure, so no software installation is needed.
| Input | Description |
|---|---|
| Multiple Sequence Alignment | An MSA in FASTA or A3M format. The first sequence is treated as the query. Minimum 10 sequences required. Upload a file (up to 100 MB) or paste directly. |

MSAs can be generated with alignment tools such as Clustal Omega or MAFFT, or with a homology search such as ColabFold, which searches sequence databases including UniRef.
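
The input rules above (first record is the query, at least 10 sequences) can be checked before upload with a short script. This is a sketch under stated assumptions, not ProteinIQ's parser: `read_msa` is a hypothetical helper, it assumes FASTA input is uppercase, it skips lines starting with `#`, and it drops lowercase characters, which A3M uses to mark insertions relative to the query.

```python
def read_msa(text: str, min_sequences: int = 10) -> list[tuple[str, str]]:
    """Parse FASTA or A3M text into (header, aligned_sequence) pairs."""
    records: list[tuple[str, str]] = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # assumption: blank lines and '#' comment lines are ignorable
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))

    # Dropping lowercase A3M insertions leaves every row in query coordinates.
    records = [
        (h, "".join(c for c in s if not c.islower())) for h, s in records
    ]

    if len(records) < min_sequences:
        raise ValueError(
            f"need at least {min_sequences} sequences, found {len(records)}"
        )
    query_len = len(records[0][1])  # the first record is treated as the query
    if any(len(s) != query_len for _, s in records):
        raise ValueError("all rows must match the query length after cleanup")
    return records


lines = [">query", "ACDE"]
for i in range(9):
    lines += [f">hit{i}", "AcCDE" if i == 0 else "AC-E"]
records = read_msa("\n".join(lines))
print(len(records), records[1])
```
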
| Setting | Description |
|---|---|
| Min samples per cluster | Minimum number of sequences required to form a DBSCAN cluster (2–20, default 3). Higher values produce fewer, more populated clusters. |
| Gap fraction cutoff | Remove sequences with more than this fraction of gaps relative to the query (0–1, default 0.25). Lower values enforce stricter filtering. |
| DBSCAN epsilon | Maximum distance for points to be grouped into a cluster (0–10, default 0 for automatic). When set to 0, AF-Cluster scans a range of epsilon values and selects the one yielding the most clusters. |

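
The gap filter and the automatic epsilon choice in the table above can be sketched as follows. Both functions are hypothetical illustrations: the gap fraction is assumed to be the number of `-` characters divided by the query length, and `count_clusters` stands in for whatever routine runs DBSCAN at a given epsilon and reports how many clusters it found.

```python
from typing import Callable, Iterable, Optional


def filter_by_gaps(seqs: list[str], cutoff: float = 0.25) -> list[str]:
    """Keep the query plus every sequence whose gap fraction is <= cutoff."""
    query = seqs[0]
    kept = [query]  # the query (first row) is always retained
    for seq in seqs[1:]:
        gap_fraction = seq.count("-") / len(query)
        if gap_fraction <= cutoff:
            kept.append(seq)
    return kept


def pick_epsilon(
    candidates: Iterable[float],
    count_clusters: Callable[[float], int],
) -> Optional[float]:
    """Return the candidate epsilon that yields the most clusters.

    Ties go to the first candidate scanned; None means no candidates.
    """
    best_eps, best_count = None, -1
    for eps in candidates:
        n_clusters = count_clusters(eps)
        if n_clusters > best_count:
            best_eps, best_count = eps, n_clusters
    return best_eps


msa = ["ACDEFGHK", "AC-EFGHK", "A---FGHK", "ACDEFG--"]
kept = filter_by_gaps(msa, cutoff=0.25)
print(kept)  # the row with three gaps (0.375) is removed

# Made-up cluster counts per epsilon, used only to exercise the scan.
toy_counts = {3.0: 1, 4.0: 4, 5.0: 2}
print(pick_epsilon(toy_counts, toy_counts.get))  # 4.0
```
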
| Setting | Description |
|---|---|
| Generate PCA plot | Project clustered sequences onto their first two principal components. Useful for seeing how clusters separate in sequence space. |
| Generate t-SNE plot | Non-linear embedding of sequence distances. Can reveal cluster structure that PCA misses, but is more computationally intensive and results vary between runs. |

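
A PCA projection like the one described above can be reproduced locally with NumPy. This sketch assumes sequences are one-hot encoded over a 21-letter alphabet (20 amino acids plus the gap character); the alphabet, the function names, and the encoding are illustrative assumptions, not the tool's documented internals.

```python
import numpy as np

ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"  # assumed alphabet: 20 residues plus gap


def one_hot(seqs: list[str]) -> np.ndarray:
    """Encode aligned sequences as a (n_sequences, length * 21) matrix."""
    index = {ch: i for i, ch in enumerate(ALPHABET)}
    length = len(seqs[0])
    encoded = np.zeros((len(seqs), length * len(ALPHABET)))
    for row, seq in enumerate(seqs):
        for col, ch in enumerate(seq):
            k = index.get(ch)
            if k is not None:  # unknown characters (e.g. X) stay all-zero
                encoded[row, col * len(ALPHABET) + k] = 1.0
    return encoded


def pca_2d(features: np.ndarray) -> np.ndarray:
    """Project rows onto their first two principal components via SVD."""
    centered = features - features.mean(axis=0)
    # Rows of vt are principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T


seqs = ["AAAA", "AAAC", "AACA", "GGGG", "GGGT", "GGTG"]
coords = pca_2d(one_hot(seqs))
print(coords.shape)  # (6, 2)
```

The sign of each component is arbitrary, so two runs may produce mirrored plots; what carries meaning is the relative separation of the points. In this toy example the two sequence groups land on opposite sides of the first component.
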
AF-Cluster produces several files:
AF-Cluster is most valuable for proteins suspected of adopting multiple folds: