CleaveNet

Predict peptide cleavage by 17 matrix metalloproteinases (MMPs) using AI. Input peptide sequences and receive a predicted cleavage z-score for each MMP variant.

What is CleaveNet?

CleaveNet is a deep learning model developed by Microsoft Research that predicts how efficiently matrix metalloproteinases (MMPs) cleave peptide sequences. Given a peptide up to 10 residues long, CleaveNet outputs one z-score for each of 17 MMP variants, indicating how efficiently that variant is predicted to cleave the sequence.

Matrix metalloproteinases are zinc-dependent enzymes that digest extracellular matrix components. They play essential roles in tissue remodeling, wound healing, and embryonic development, but are also implicated in cancer progression, where MMPs facilitate tumor invasion and metastasis by degrading physical barriers between tissues. Knowing which sequences a particular MMP will cleave is valuable for designing protease-activated therapeutics and diagnostic biosensors, and for studying enzyme specificity.

CleaveNet uses a transformer architecture trained on mRNA display data from Kukreja et al. (2015), which profiled cleavage activity across thousands of peptide substrates. The model learns sequence patterns associated with efficient cleavage by different MMP family members.

How to use CleaveNet online

ProteinIQ provides cloud-based access to CleaveNet. Results are returned in seconds, and no Python installation or GPU hardware is required.

Input

Field | Description
Peptide Sequences | One or more sequences, up to 10 residues each. Accepts FASTA format or plain text (one sequence per line). Shorter sequences are automatically padded.
MMP Variants | Which MMPs to predict for. Enter all for all 17 variants, or a comma-separated list like MMP1,MMP2,MMP9.
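
As a rough illustration of the two accepted input styles, the sketch below normalizes FASTA or one-sequence-per-line text into a list of peptides and rejects anything longer than 10 residues. This is not ProteinIQ's actual parser: the function name `parse_peptides`, the restriction to the 20 standard amino acid letters, and the error behavior are all assumptions made for the example.

```python
MAX_LEN = 10  # maximum peptide length stated in the Input table
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")  # assumption: 20 standard residues only


def parse_peptides(text: str) -> list[str]:
    """Hypothetical helper: read peptides from FASTA or plain text."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    is_fasta = any(line.startswith(">") for line in lines)

    peptides: list[str] = []
    current: list[str] = []  # sequence lines under the current FASTA header
    for line in lines:
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if current:
                peptides.append("".join(current))
                current = []
        elif is_fasta:
            current.append(line.upper())
        else:
            # Plain text: one peptide per line.
            peptides.append(line.upper())
    if current:
        peptides.append("".join(current))

    for peptide in peptides:
        if len(peptide) > MAX_LEN:
            raise ValueError(f"{peptide!r} is longer than {MAX_LEN} residues")
        if not set(peptide) <= AMINO_ACIDS:
            raise ValueError(f"{peptide!r} contains non-standard residues")
    return peptides
```

Calling `parse_peptides` on either a FASTA block or a plain list of lines yields the same list of uppercase sequences, which makes the later steps independent of the input style.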

Supported MMP variants

CleaveNet predicts cleavage for 17 matrix metalloproteinases:

MMP1, MMP2, MMP3, MMP7, MMP8, MMP9, MMP10, MMP11, MMP12, MMP13, MMP14, MMP15, MMP16, MMP17, MMP19, MMP20, MMP25
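
The MMP Variants field accepts either `all` or a comma-separated list. A minimal sketch of how such a value could be validated against the 17 supported variants is shown below; `parse_mmp_selection` is a hypothetical helper, and the case-insensitive matching and duplicate removal are assumptions, not documented behavior.

```python
SUPPORTED_MMPS = [
    "MMP1", "MMP2", "MMP3", "MMP7", "MMP8", "MMP9", "MMP10", "MMP11", "MMP12",
    "MMP13", "MMP14", "MMP15", "MMP16", "MMP17", "MMP19", "MMP20", "MMP25",
]


def parse_mmp_selection(value: str) -> list[str]:
    """Hypothetical helper: expand 'all' or validate a comma-separated list."""
    value = value.strip()
    if value.lower() == "all":
        return list(SUPPORTED_MMPS)

    requested = [name.strip().upper() for name in value.split(",") if name.strip()]
    if not requested:
        raise ValueError("no MMP variants given")

    unknown = [name for name in requested if name not in SUPPORTED_MMPS]
    if unknown:
        raise ValueError(f"unsupported MMP variants: {', '.join(unknown)}")

    # dict.fromkeys keeps the first occurrence of each name, in input order.
    return list(dict.fromkeys(requested))
```

Note that the family numbering has gaps: names such as MMP4 or MMP18 are not in the supported list, so a validation step like this catches them early.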

Output

Results are returned as a spreadsheet with one row per input peptide. Each MMP variant has its own column containing a z-score.

Column | Description
Sequence | The input peptide sequence
MMP1, MMP2, ... | Z-score for each MMP variant
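
If the spreadsheet is saved as CSV, it could be loaded into a nested dictionary along the lines of the sketch below. The CSV export, the exact `Sequence` header spelling, and the helper name `load_results` are assumptions; the numbers in the usage example are made up for illustration and are not real predictions.

```python
import csv
import io


def load_results(csv_text: str) -> dict[str, dict[str, float]]:
    """Hypothetical helper: map each peptide to its {MMP: z-score} dict."""
    reader = csv.DictReader(io.StringIO(csv_text))
    results: dict[str, dict[str, float]] = {}
    for row in reader:
        sequence = row.pop("Sequence")  # assumed header name
        # Every remaining column is treated as one MMP variant.
        results[sequence] = {mmp: float(z) for mmp, z in row.items()}
    return results


# Illustrative data only (not actual CleaveNet output).
example = "Sequence,MMP1,MMP2\nPLGLAG,1.2,-0.3\n"
scores = load_results(example)
```

With this layout, `scores["PLGLAG"]["MMP1"]` gives the z-score of one peptide for one variant.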

Interpreting z-scores

Z-scores quantify cleavage efficiency relative to the training distribution. Higher positive values indicate stronger predicted cleavage.

Z-score | Interpretation
> 2.0 | Strong predicted cleavage
1.0 to 2.0 | Moderate cleavage likelihood
0 to 1.0 | Weak cleavage
< 0 | Below-average cleavage, unlikely substrate
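
The thresholds above translate directly into a small lookup function. The table does not say which band the boundary values 0, 1.0, and 2.0 belong to, so the sketch below assumes that exactly 2.0 counts as moderate, and that exactly 1.0 and 0 fall into the moderate and weak bands respectively; `interpret_z` is a hypothetical name.

```python
def interpret_z(z: float) -> str:
    """Map a z-score to the interpretation bands in the table above.

    Boundary handling is an assumption: 2.0 -> moderate, 1.0 -> moderate,
    0.0 -> weak.
    """
    if z > 2.0:
        return "Strong predicted cleavage"
    if z >= 1.0:
        return "Moderate cleavage likelihood"
    if z >= 0.0:
        return "Weak cleavage"
    return "Below-average cleavage, unlikely substrate"
```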

When designing selective substrates, compare z-scores across MMP variants. A peptide with a high z-score for MMP13 but low z-scores for the other MMPs would be a candidate MMP13-selective substrate.
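
One simple way to make this comparison concrete is to score each peptide by the gap between its target z-score and its highest off-target z-score. This margin is an illustrative heuristic, not a metric defined by CleaveNet, and the function names and example values below are made up.

```python
def selectivity_margin(scores: dict[str, float], target: str) -> float:
    """Target z-score minus the best off-target z-score (heuristic)."""
    off_target = [z for mmp, z in scores.items() if mmp != target]
    if not off_target:
        raise ValueError("need at least one off-target MMP to compare against")
    return scores[target] - max(off_target)


def rank_by_selectivity(
    results: dict[str, dict[str, float]], target: str
) -> list[str]:
    """Return peptides ordered from most to least selective for the target."""
    return sorted(
        results,
        key=lambda seq: selectivity_margin(results[seq], target),
        reverse=True,
    )


# Illustrative z-scores only (not actual CleaveNet output).
example = {
    "PEPTIDEA": {"MMP13": 2.5, "MMP1": 0.2, "MMP9": -0.4},
    "PEPTIDEB": {"MMP13": 2.8, "MMP1": 2.6, "MMP9": 1.9},
}
ranking = rank_by_selectivity(example, "MMP13")
```

In this made-up example, PEPTIDEB has the higher MMP13 z-score but is also cleaved well by MMP1, so PEPTIDEA ranks first: a large margin matters more than a large target score alone.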

How does CleaveNet work?

CleaveNet uses a transformer neural network to predict cleavage from peptide sequence alone. The model tokenizes each amino acid in the input sequence, processes them through self-attention layers that capture dependencies between positions, and outputs a z-score for each MMP variant.

The predictor was trained on data from Kukreja et al. (2015), an mRNA display experiment that measured cleavage rates for thousands of randomized peptide substrates across multiple MMPs. Each peptide in the training set has experimentally measured cleavage values, allowing the model to learn which sequence motifs correlate with efficient cleavage by specific proteases.

Input peptides shorter than 10 residues are padded with special tokens. The model processes the padded sequence and predicts a z-score independently for each MMP. Each z-score expresses how far above or below the mean cleavage rate a given peptide falls, in units of the standard deviation of the training distribution.
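
The padding and normalization steps described above can be sketched as follows. The pad token string `<pad>`, right-side padding, and the helper names are assumptions for illustration; the original codebase may use a different token and layout. The z-score itself is the standard definition, z = (x - mean) / standard deviation, with the mean and standard deviation taken from the training data.

```python
MAX_LEN = 10
PAD_TOKEN = "<pad>"  # hypothetical token name


def pad_peptide(sequence: str, max_len: int = MAX_LEN) -> list[str]:
    """Split a peptide into residue tokens and right-pad to max_len."""
    if len(sequence) > max_len:
        raise ValueError(f"peptide exceeds {max_len} residues")
    return list(sequence) + [PAD_TOKEN] * (max_len - len(sequence))


def to_z_score(value: float, train_mean: float, train_sd: float) -> float:
    """Standardize a cleavage value against training-set statistics."""
    if train_sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (value - train_mean) / train_sd
```

For example, a 6-residue peptide gains four pad tokens, and a cleavage value two standard deviations above the training mean maps to a z-score of 2.0.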

An alternative LSTM architecture is also available in the original codebase for cases where the transformer may overfit to training patterns. The LSTM backbone can sometimes extrapolate better to novel cleavage motifs not well-represented in the training data.

Applications

Protease-activated drug delivery: Peptide linkers connecting antibodies to cytotoxic payloads can be designed to release the drug only when cleaved by tumor-associated MMPs, reducing off-target toxicity.

Diagnostic biosensors: Peptides that fluoresce upon cleavage serve as activity-based probes. CleaveNet helps identify sequences with appropriate MMP selectivity for detecting specific cancers or inflammatory conditions.

Enzyme specificity studies: Comparing z-scores across the 17 MMP variants reveals which positions in a peptide drive selectivity, informing mechanistic understanding of MMP substrate preferences.

Limitations

  • Predictions are calibrated for 10-residue peptides. The model has not been extensively validated on shorter or longer sequences.
  • CleaveNet predicts relative cleavage efficiency, not absolute kinetic rates. Z-scores indicate ranking, not quantitative Km or kcat values.
  • Training data comes from in vitro mRNA display experiments; in vivo cleavage may differ due to protein folding, localization, and inhibitor presence.