
IgGM

AI-powered antibody design with a generative foundation model

What is IgGM?

IgGM (Immunoglobulin Generative Model) is a generative foundation model for designing antibodies and nanobodies against target antigens. Developed by Tencent AI4S and published at ICLR 2025, it generates sequences and 3D structures simultaneously, unlike many existing methods that handle the two separately.

The model excels at designing complementarity-determining regions (CDRs), the hypervariable loops responsible for antigen recognition. On benchmark tests, IgGM achieved a 36% sequence recovery rate for CDR-H3 (the most flexible and challenging region), representing a 22% improvement over prior state-of-the-art methods.
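
To make the recovery metric concrete, here is a minimal sketch that assumes sequence recovery means the fraction of positions where the designed residue is identical to the native one. The page does not spell out the exact definition used in the benchmark, and the two CDR-H3 strings below are hypothetical examples, not IgGM output.

```python
def sequence_recovery(designed: str, native: str) -> float:
    """Fraction of positions where the designed residue matches the native one.

    Assumption: position-wise identity over equal-length loops.
    """
    if len(designed) != len(native):
        raise ValueError("sequences must have the same length")
    if not native:
        raise ValueError("sequences must be non-empty")
    matches = sum(d == n for d, n in zip(designed, native))
    return matches / len(native)


# Hypothetical 11-residue CDR-H3 loops, for illustration only.
native_h3 = "ARDYYGSSYFD"
designed_h3 = "ARDGYASTWFD"

recovery = sequence_recovery(designed_h3, native_h3)
print(f"CDR-H3 recovery: {recovery:.1%}")
```

A benchmark figure such as 36% would then be this quantity averaged over many test complexes.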

IgGM supports both conventional antibodies (heavy + light chain) and single-domain nanobodies. It won a top-three prize in the AIntibody competition, an experimentally validated antibody design challenge.

How does IgGM work?

IgGM combines three components:

  1. Pre-trained language model (ESM-PPI): Extracts evolutionary and structural features from protein sequences. ESM-PPI extends ESM2 with improved handling of inter-chain relationships in multi-chain complexes.
  2. Feature learning module: Processes 3D coordinate information through structure encoding and inter-chain interaction modules, integrating spatial relationships between antibody and antigen.
  3. Diffusion-based prediction module: Uses SO(3) diffusion to model rotational degrees of freedom in protein structure. The system samples initial translations from a Gaussian distribution and initial rotations from SO(3), the group of 3D rotations, then iteratively refines both sequence and structure.
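
The initialization described in the third component can be sketched as follows. This is not IgGM's code: it assumes one rigid frame per residue, a standard normal distribution for translations, and a uniform distribution over SO(3) for rotations (the page does not state which distribution is used). The names `random_rotation` and `init_frames` are hypothetical, and the iterative refinement loop is omitted.

```python
import numpy as np


def random_rotation(rng: np.random.Generator) -> np.ndarray:
    """Draw a 3x3 rotation matrix uniformly from SO(3).

    A normalized 4D Gaussian vector is a uniformly random unit quaternion,
    which maps to a uniformly random rotation.
    """
    q = rng.normal(size=4)
    q /= np.linalg.norm(q)
    w, x, y, z = q
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w)],
        [2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w)],
        [2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y)],
    ])


def init_frames(n_residues: int, seed: int = 0):
    """Sample noisy starting frames: one rotation and one translation per residue."""
    rng = np.random.default_rng(seed)
    translations = rng.normal(size=(n_residues, 3))  # Gaussian translations
    rotations = np.stack([random_rotation(rng) for _ in range(n_residues)])
    return rotations, translations


rotations, translations = init_frames(n_residues=12)
print(rotations.shape, translations.shape)
```

Each sampled matrix is orthogonal with determinant +1, which is what makes it a valid starting orientation for a residue frame.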

Training proceeds in two phases. First, a diffusion model learns structure prediction while preserving original sequence information. Then a consistency model is distilled from the diffusion model, enabling fast generation from arbitrary noise levels. This two-phase approach proved essential for model performance.

How to use IgGM online

ProteinIQ provides GPU-accelerated IgGM without local installation or the complexity of managing PyTorch Geometric dependencies.

Inputs

| Input | Description |
| --- | --- |
| Heavy Chain (VH) | Antibody heavy chain variable region sequence. Mark positions to design with X characters. |
| Light Chain (VL) | Light chain variable region (optional for nanobodies). |
| Antigen Structure | Required. PDB file of the target antigen. Upload directly or fetch by PDB ID. |
| Antigen Sequence | Optional sequence for epitope-guided design. If omitted, extracted from PDB. |
| Original Sequence | Reference sequence for affinity maturation comparisons. |

The X character indicates positions for the model to redesign. For CDR design, mask the CDR residues while providing the framework sequence.
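
As an illustration of preparing a masked input, the sketch below replaces chosen positions with X. The helper `mask_positions`, the heavy-chain fragment, and the region boundaries are all hypothetical; real CDR boundaries depend on the numbering scheme (for example Kabat, Chothia, or IMGT) and should come from an annotation tool rather than fixed offsets.

```python
def mask_positions(sequence: str, regions: list[tuple[int, int]]) -> str:
    """Replace residues in 1-based, inclusive (start, end) regions with 'X'."""
    residues = list(sequence)
    for start, end in regions:
        if not (1 <= start <= end <= len(sequence)):
            raise ValueError(f"region {start}-{end} is outside the sequence")
        for i in range(start - 1, end):
            residues[i] = "X"
    return "".join(residues)


# Hypothetical 30-residue heavy-chain fragment; positions 26-30 stand in for a CDR.
vh_fragment = "EVQLVESGGGLVQPGGSLRLSCAASGFTFS"
masked = mask_positions(vh_fragment, [(26, 30)])
print(masked)
```

The framework residues stay untouched, so the masked string has the same length as the input and can be pasted into the Heavy Chain (VH) field.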

Design tasks

| Task | Description |
| --- | --- |
| CDR Design | Design CDR loops to bind the target epitope. Provide framework sequence with X at CDR positions. |
| Inverse Design | Generate a sequence compatible with an existing antibody structure. |
| Framework Design | Redesign framework regions while preserving CDRs. Useful for humanization. |
| Affinity Maturation | Optimize an existing antibody for improved binding. Provide original sequence for comparison. |

Settings

| Setting | Description |
| --- | --- |
| Number of designs | Samples to generate (1–100, default 10). More samples provide diversity but increase runtime. |
| Epitope residues | Target binding site positions on the antigen (e.g., 45,46,47,50-55). If omitted, calculated automatically from the antigen structure. |
| Apply PyRosetta relaxation | Energy minimization to improve structural quality. Increases runtime. |
| Random seed | For reproducible results. Leave empty for random seed. |
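
The epitope residue format (single positions and inclusive ranges, separated by commas) can be expanded into an explicit list with a few lines of Python. This is a sketch of one reasonable parser, not the one ProteinIQ uses; `parse_residue_spec` is a hypothetical name.

```python
def parse_residue_spec(spec: str) -> list[int]:
    """Expand a spec such as '45,46,47,50-55' into a sorted list of positions."""
    residues: set[int] = set()
    for token in spec.split(","):
        token = token.strip()
        if not token:
            continue  # tolerate stray commas and whitespace
        if "-" in token:
            start_text, end_text = token.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"invalid range: {token}")
            residues.update(range(start, end + 1))  # ranges are inclusive
        else:
            residues.add(int(token))
    return sorted(residues)


epitope = parse_residue_spec("45,46,47,50-55")
print(epitope)  # [45, 46, 47, 50, 51, 52, 53, 54, 55]
```

Checking the expanded list before submitting a job is a quick way to catch typos in a long specification.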

Outputs

IgGM produces:

  • Designed sequences in FASTA format for each sample
  • Predicted structures as PDB files showing the antibody-antigen complex
  • Statistics (CSV) with amino acid distributions and generation frequencies for affinity maturation

All outputs can be visualized in the integrated 3D viewer and downloaded as a ZIP archive.
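
For readers who want to recompute amino acid distributions from the downloaded sequences, the following sketch tallies per-position residue frequencies across a set of equal-length designs. It does not reproduce the layout of the statistics CSV, which is not documented here; the function name and the four short example designs are hypothetical.

```python
from collections import Counter


def position_frequencies(sequences: list[str]) -> list[dict[str, float]]:
    """Return, for each position, the fraction of designs carrying each residue."""
    if not sequences:
        raise ValueError("no sequences given")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("all sequences must have the same length")
    frequencies = []
    for column in zip(*sequences):  # iterate position by position
        counts = Counter(column)
        frequencies.append({aa: n / len(sequences) for aa, n in counts.items()})
    return frequencies


# Hypothetical designed loops, for illustration only.
designs = ["ARDY", "ARDF", "AKDY", "ARDY"]
freqs = position_frequencies(designs)
for position, dist in enumerate(freqs, start=1):
    print(position, dist)
```

Positions where one residue dominates across samples are candidates for conserved contacts, while broad distributions indicate positions the model treats as tolerant to substitution.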

Epitope specification

The epitope, the antigen region the antibody should bind, guides design. IgGM requires epitope information and obtains it in one of three ways:

  1. Provide comma-separated residue numbers (e.g., 417,449,453,455-456). Numbers correspond to sequence positions, not PDB residue IDs.
  2. For multi-chain PDB files (e.g., an existing antibody-antigen complex), IgGM identifies interface residues using a 5Å distance cutoff.
  3. For single-chain antigen structures, surface-exposed residues are identified by solvent-accessible surface area (SASA > 25 Ų).

Epitope size is limited to 50 residues to manage computational cost.
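
The distance-based option for multi-chain files can be sketched with NumPy. The version below assumes that an antigen residue counts as interface if any of its atoms lies within the cutoff of any atom on the partner chain; the page does not say whether all atoms or only selected atoms are compared, so treat that as an assumption. The function name and toy coordinates are hypothetical, and parsing the PDB file into arrays is left out.

```python
import numpy as np


def interface_residues(
    antigen_coords: np.ndarray,     # (N, 3) antigen atom coordinates in Å
    antigen_res_index: np.ndarray,  # (N,) residue position of each antigen atom
    partner_coords: np.ndarray,     # (M, 3) atom coordinates of the partner chain
    cutoff: float = 5.0,
) -> list[int]:
    """Antigen residue positions with any atom within `cutoff` Å of the partner."""
    diff = antigen_coords[:, None, :] - partner_coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)          # (N, M) pairwise distances
    close_atoms = (dist <= cutoff).any(axis=1)    # antigen atoms near the partner
    return sorted(set(antigen_res_index[close_atoms].tolist()))


# Toy example: three antigen residues along the x axis, two partner atoms.
antigen_coords = np.array([[0.0, 0, 0], [1.0, 0, 0], [10.0, 0, 0], [20.0, 0, 0]])
antigen_res_index = np.array([1, 1, 2, 3])
partner_coords = np.array([[4.0, 0, 0], [24.5, 0, 0]])

epitope_positions = interface_residues(antigen_coords, antigen_res_index, partner_coords)
print(epitope_positions)  # [1, 3]
```

The full pairwise distance matrix is fine for a single complex; for very large structures a spatial index such as a k-d tree would scale better.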

Limitations

  • Antigen structure required: Unlike some tools that accept sequence-only input, IgGM needs a 3D antigen structure for epitope-guided design.
  • Memory constraints: Large antigens may need to be truncated. The max_antigen_size parameter caps the antigen at 384 residues by default.
  • Binding affinity not predicted: IgGM generates plausible sequences but does not estimate binding strength. Experimental validation remains necessary.
  • Rigid docking: The antigen structure is treated as fixed during design.

Related tools

  • AntiFold: Inverse folding specifically for antibody structures, useful for sequence optimization when a structure is already known
  • IgDesign: CDR design via inverse folding from antibody-antigen complex structures
  • BioPhi: Antibody humanization and humanness scoring for therapeutic development
  • AbLang-2: Antibody language model for sequence analysis and embedding generation