DeepImmuno

Deep learning-based peptide immunogenicity prediction for supported MHC-I alleles

What is DeepImmuno?

DeepImmuno is a deep learning model that predicts whether a peptide will trigger a T cell immune response when presented by a specific MHC class I molecule. Developed at Cincinnati Children's Hospital Medical Center, it uses a convolutional neural network (DeepImmuno-CNN) trained on over 8,900 experimentally validated peptide-MHC immunogenicity assays from the Immune Epitope Database (IEDB).

Predicting immunogenicity is harder than predicting MHC binding. A peptide may bind an MHC molecule tightly yet fail to activate T cells. DeepImmuno addresses this downstream question directly: given a peptide-MHC pair, how likely is it to be immunogenic?

On independent benchmarks, DeepImmuno-CNN identified 83% of immunogenic tumor neoantigens (compared to 63% for IEDB and 34% for DeepHLApan) and showed similarly strong performance on SARS-CoV-2 and dengue virus epitope datasets.

How does DeepImmuno work?

Sequence encoding

Rather than one-hot encoding, DeepImmuno represents each amino acid using 566 physicochemical features from the AAindex1 database, compressed via principal component analysis (PCA). This captures biologically meaningful properties like hydrophobicity, charge, and size while keeping the feature space manageable. HLA allele sequences are encoded the same way using paratope information from the IMGT database, creating a combined peptide-MHC feature matrix.
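
The encoding step can be sketched with NumPy as follows. This is an illustration under stated assumptions, not the tool's actual code: the AAindex1 table is replaced by random placeholder values, the number of retained components (`N_COMPONENTS = 12`) is assumed, and the names `pca_reduce` and `encode` are hypothetical.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
N_FEATURES = 566     # physicochemical indices per amino acid (from the text)
N_COMPONENTS = 12    # assumed number of principal components to keep

# Placeholder values only: a real pipeline would load the AAindex1 table here.
rng = np.random.default_rng(0)
aaindex = rng.normal(size=(len(AMINO_ACIDS), N_FEATURES))

def pca_reduce(table, n_components):
    """Project each row onto the top principal components via SVD."""
    centered = table - table.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

reduced = pca_reduce(aaindex, N_COMPONENTS)
LOOKUP = {aa: reduced[i] for i, aa in enumerate(AMINO_ACIDS)}

def encode(sequence):
    """Return a (length, N_COMPONENTS) matrix for an amino acid sequence."""
    return np.stack([LOOKUP[aa] for aa in sequence])

peptide_matrix = encode("GILGFVFTL")
print(peptide_matrix.shape)
```

An HLA paratope sequence would pass through the same `encode` function, and the two matrices could then be stacked with `np.vstack` to form the combined peptide-MHC input.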

CNN architecture

Each peptide-MHC pair passes through two convolutional layers followed by two fully connected dense layers. The convolutional layers extract local sequence motifs, while the dense layers learn the non-linear relationships between these motifs and immunogenicity.
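
The layer sequence described above can be traced with a small NumPy forward pass. This is only a sketch: it assumes 1D convolutions along the sequence axis, ReLU activations, a sigmoid output, and randomly initialized weights. The input shape (56 positions by 12 features), kernel width, and layer sizes are illustrative assumptions, not the published configuration.

```python
import numpy as np

def conv1d_relu(x, kernels, bias):
    """Valid 1D convolution plus ReLU.
    x: (length, c_in); kernels: (width, c_in, c_out); bias: (c_out,)."""
    width = kernels.shape[0]
    windows = np.stack([x[i:i + width] for i in range(x.shape[0] - width + 1)])
    out = np.tensordot(windows, kernels, axes=([1, 2], [0, 1])) + bias
    return np.maximum(out, 0.0)

def init_params(rng, length=56, channels=12, width=3):
    """Random weights for two conv layers and two dense layers (assumed sizes)."""
    flat = (length - 2 * (width - 1)) * 32
    return {
        "k1": rng.normal(scale=0.01, size=(width, channels, 16)), "b1": np.zeros(16),
        "k2": rng.normal(scale=0.01, size=(width, 16, 32)), "b2": np.zeros(32),
        "w1": rng.normal(scale=0.01, size=(flat, 64)), "c1": np.zeros(64),
        "w2": rng.normal(scale=0.01, size=(64, 1)), "c2": np.zeros(1),
    }

def forward(pair_matrix, p):
    """Two convolutional layers, then two dense layers, then a sigmoid."""
    h = conv1d_relu(pair_matrix, p["k1"], p["b1"])
    h = conv1d_relu(h, p["k2"], p["b2"])
    h = np.maximum(h.reshape(-1) @ p["w1"] + p["c1"], 0.0)
    logit = h @ p["w2"] + p["c2"]
    return float(1.0 / (1.0 + np.exp(-logit[0])))

rng = np.random.default_rng(0)
params = init_params(rng)
score = forward(rng.normal(size=(56, 12)), params)
print(score)
```

With untrained weights the score is meaningless; the point is only to show how a peptide-MHC matrix flows through convolutional and dense stages to a single value between 0 and 1.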

Training with beta-binomial scoring

Instead of treating immunogenicity as a binary yes/no, the training labels are continuous scores derived from a beta-binomial model. A peptide that elicited consistently positive responses across 40 tested subjects receives a higher immunogenicity score than one with the same response pattern tested in only 6 subjects, because the larger sample provides stronger evidence. This probabilistic labeling captures experimental uncertainty and evidence quality.
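
The effect of sample size can be illustrated with the posterior mean of a beta-binomial model. The sketch below assumes a uniform Beta(1, 1) prior, which is a choice made for illustration; the prior and exact scoring formula used to train DeepImmuno may differ, and `posterior_mean` is a hypothetical helper name.

```python
def posterior_mean(positive, tested, alpha=1.0, beta=1.0):
    """Posterior mean response rate under a Beta(alpha, beta) prior.

    alpha = beta = 1 is an assumed uniform prior, used here for illustration.
    """
    if not 0 <= positive <= tested:
        raise ValueError("positive must be between 0 and tested")
    return (alpha + positive) / (alpha + beta + tested)

# Both peptides responded in every subject, but the evidence differs.
print(round(posterior_mean(40, 40), 3))  # 0.976
print(round(posterior_mean(6, 6), 3))    # 0.875
```

Both peptides have a 100% observed response rate, yet the one backed by 40 subjects scores higher because the prior is outweighed by more data.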

Key residue positions

Occlusion sensitivity analysis of the trained model shows that peptide positions P4, P5, and P6 contribute most to predictions. These are the central residues that contact the T cell receptor, consistent with structural immunology. Anchor positions P2 and P9, which bind the MHC groove, are less informative for immunogenicity because the model takes MHC binding as given and scores only the downstream T cell response.
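
Occlusion sensitivity in general works by masking one position at a time and recording how much the score drops. The sketch below shows the idea with a toy scoring function; `toy_score` is a hypothetical stand-in whose weights were chosen by hand to mimic the reported pattern, and masking with `X` is an assumption about how a position would be occluded.

```python
def occlusion_sensitivity(score_fn, peptide, allele, mask="X"):
    """Return the score drop caused by masking each position in turn."""
    baseline = score_fn(peptide, allele)
    drops = []
    for i in range(len(peptide)):
        occluded = peptide[:i] + mask + peptide[i + 1:]
        drops.append(baseline - score_fn(occluded, allele))
    return drops

def toy_score(peptide, allele):
    """Hypothetical scorer for 9-mers; NOT the DeepImmuno model.
    Hand-picked weights give P4-P6 three times the influence of other positions."""
    weights = [1, 1, 1, 3, 3, 3, 1, 1, 1]
    kept = sum(w for w, aa in zip(weights, peptide) if aa != "X")
    return kept / sum(weights)

drops = occlusion_sensitivity(toy_score, "GILGFVFTL", "HLA-A*0201")
for position, drop in enumerate(drops, start=1):
    print(f"P{position}: {drop:.3f}")
```

With a real model, `score_fn` would wrap the trained network, and positions with the largest drops would be the ones the model relies on most.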

How to use DeepImmuno online

ProteinIQ provides cloud-hosted access to DeepImmuno-CNN, requiring no Python environment or GPU setup.

Inputs

Input	Description
`Peptide Sequences`	One or more peptide sequences, 9 or 10 amino acids each. Accepts raw sequences (one per line), FASTA format, or file upload (.txt, .fasta, .csv, .tsv).

DeepImmuno-CNN only supports 9-mer and 10-mer peptides. Longer sequences must be pre-processed into overlapping windows before submission.
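
A longer sequence can be split into overlapping 9-mer and 10-mer windows with a few lines of Python. This is a minimal sketch; the function name `peptide_windows` and the 1-based start positions in the output are choices made here for illustration.

```python
def peptide_windows(sequence, lengths=(9, 10)):
    """Return (start, peptide) pairs for every overlapping window.

    Start positions are 1-based. Sequences shorter than a window length
    simply yield no windows of that length.
    """
    sequence = sequence.strip().upper()
    windows = []
    for k in lengths:
        for start in range(len(sequence) - k + 1):
            windows.append((start + 1, sequence[start:start + k]))
    return windows

for start, peptide in peptide_windows("ACDEFGHIKLM"):
    print(start, peptide)
```

An 11-residue input yields three 9-mers and two 10-mers, each of which can be pasted into `Peptide Sequences` on its own line.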

Settings

Setting	Description
`HLA assignment mode`	`Single HLA for all peptides` (default) applies one allele to every sequence. `One HLA per peptide` allows a different allele for each sequence.
`HLA allele`	The MHC-I allele to use in single mode. Default: `HLA-A*0201`, the most common allele in many populations.
`Per-sequence HLA alleles`	One allele per line, matching the order of input peptides. Required when using per-sequence mode.
`Immunogenicity threshold`	Score cutoff for the immunogenic/non-immunogenic label (0.0–1.0, default `0.5`).
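
Before submitting a batch, it can help to check locally that peptides and alleles line up the way the two assignment modes expect. The sketch below is a hypothetical pre-check, not part of the tool: `build_pairs` is an invented helper, and rejecting residues outside the 20 standard amino acids is an assumption about what the model accepts.

```python
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")

def build_pairs(peptides, hla, per_sequence=False):
    """Pair each peptide with an allele.

    per_sequence=False: `hla` is one allele string applied to every peptide.
    per_sequence=True:  `hla` is a list with one allele per peptide, in order.
    """
    if per_sequence:
        if len(hla) != len(peptides):
            raise ValueError("need exactly one allele per peptide")
        alleles = list(hla)
    else:
        alleles = [hla] * len(peptides)

    pairs = []
    for peptide, allele in zip(peptides, alleles):
        if len(peptide) not in (9, 10):
            raise ValueError(f"{peptide}: only 9-mers and 10-mers are supported")
        if not set(peptide) <= VALID_RESIDUES:
            raise ValueError(f"{peptide}: non-standard residue")  # assumed rule
        pairs.append((peptide, allele))
    return pairs

print(build_pairs(["GILGFVFTL", "NLVPMVATV"], "HLA-A*0201"))
print(build_pairs(["GILGFVFTL", "NLVPMVATV"],
                  ["HLA-A*0201", "HLA-B*0702"], per_sequence=True))
```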

Supported HLA alleles

DeepImmuno supports 20 MHC class I alleles covering the most common HLA-A, HLA-B, and HLA-C types:

HLA-A	HLA-B	HLA-C
A*01:01	B*07:02	C*07:02
A*02:01	B*08:01
A*02:02	B*15:01
A*02:03	B*27:05
A*02:06	B*35:01
A*03:01	B*40:01
A*11:01	B*44:02
A*24:02	B*44:03
A*26:01	B*51:01
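
Allele names appear on this page both with and without the `HLA-` prefix and colon (`HLA-A*0201` versus `A*02:01`). The sketch below normalizes either spelling and checks it against the alleles listed in the table above. The set is transcribed from that table as printed and may not be the tool's authoritative list; `normalize_allele` and `is_supported` are hypothetical helper names.

```python
import re

# Transcribed from the table above; not an authoritative list.
SUPPORTED = {
    "A*01:01", "A*02:01", "A*02:02", "A*02:03", "A*02:06",
    "A*03:01", "A*11:01", "A*24:02", "A*26:01",
    "B*07:02", "B*08:01", "B*15:01", "B*27:05", "B*35:01",
    "B*40:01", "B*44:02", "B*44:03", "B*51:01",
    "C*07:02",
}

_ALLELE = re.compile(r"^(?:HLA-)?([ABC])\*(\d{2}):?(\d{2})$")

def normalize_allele(name):
    """Convert 'HLA-A*0201' or 'A*02:01' to the form 'A*02:01'."""
    match = _ALLELE.match(name.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized allele name: {name!r}")
    gene, group, protein = match.groups()
    return f"{gene}*{group}:{protein}"

def is_supported(name):
    return normalize_allele(name) in SUPPORTED

print(normalize_allele("HLA-A*0201"))
print(is_supported("HLA-B*27:05"))
print(is_supported("A*68:01"))
```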

Output

Results are returned as a spreadsheet with one row per peptide-MHC pair.

Column	Description
`Sequence ID`	Identifier from FASTA header or auto-generated index.
`Peptide`	The input peptide sequence.
`HLA Allele`	The MHC-I allele used for this prediction.
`Immunogenicity Score`	Continuous score from 0 to 1. Higher values indicate greater predicted immunogenicity.
`Label`	`Immunogenic` or `Non-immunogenic` based on the threshold setting.

Interpreting results

The immunogenicity score reflects the model's confidence that a peptide-MHC complex will elicit a T cell response. It is not a probability in the strict statistical sense but correlates with experimental immunogenicity rates.

Score range	Interpretation
0.8–1.0	Strong immunogenic signal. High-priority candidates for experimental validation.
0.5–0.8	Moderate signal. Worth investigating, especially if supported by other evidence (binding affinity, expression level).
0.2–0.5	Weak signal. Unlikely to be immunogenic but cannot be ruled out.
0.0–0.2	Minimal immunogenic potential.

The default threshold of 0.5 balances sensitivity and specificity. For vaccine design, where missing a true epitope is costly, lowering the threshold to 0.3–0.4 may be appropriate. For neoantigen prioritization with limited validation capacity, raising it to 0.6–0.7 focuses on the most confident predictions.
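
The threshold and score ranges above can be applied to downloaded results with a short script. This is a sketch under stated assumptions: scores exactly at the threshold are treated as immunogenic, each range in the table is treated as closed at its lower bound, and the example scores are made up for illustration.

```python
def label(score, threshold=0.5):
    """Binary label; a score equal to the threshold counts as immunogenic (assumed)."""
    return "Immunogenic" if score >= threshold else "Non-immunogenic"

def band(score):
    """Map a score to the interpretation ranges; lower bounds are inclusive (assumed)."""
    if score >= 0.8:
        return "strong"
    if score >= 0.5:
        return "moderate"
    if score >= 0.2:
        return "weak"
    return "minimal"

# Made-up scores for illustration only.
scores = {"pep1": 0.91, "pep2": 0.55, "pep3": 0.35, "pep4": 0.08}

for name, score in scores.items():
    print(name, band(score), label(score), label(score, threshold=0.3))
```

Lowering the threshold to 0.3 relabels `pep3` as immunogenic, which is the kind of trade-off a vaccine design screen might accept to avoid missing true epitopes.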

Limitations

Only 9-mer and 10-mer peptides are supported. Peptides of other lengths require external pre-processing.
Coverage is limited to 20 MHC class I alleles. Uncommon alleles or MHC class II are not supported.
Immunogenicity depends on factors beyond peptide-MHC interaction (antigen processing, expression level, T cell repertoire) that the model does not capture.
Training data from IEDB reflects experimental biases: some alleles and viral peptides are over-represented.

Related tools

TLimmuno2: MHC class II immunogenicity prediction for CD4+ T cell epitopes, complementing DeepImmuno's MHC class I focus
PepMLM: De novo peptide binder design using language modeling
ImmuneBuilder: 3D structure prediction for antibodies, nanobodies, and T cell receptors
