What is IgGM?
IgGM (Immunoglobulin Generative Model) is a generative foundation model for designing antibodies and nanobodies against target antigens. Developed by Tencent AI4S and published at ICLR 2025, it simultaneously generates both sequences and 3D structures—unlike many existing methods that handle these separately.
The model excels at designing complementarity-determining regions (CDRs), the hypervariable loops responsible for antigen recognition. On benchmark tests, IgGM achieved a 36% sequence recovery rate for CDR-H3 (the most flexible and challenging region), representing a 22% improvement over prior state-of-the-art methods.
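As a rough illustration of the metric, sequence recovery can be read as the fraction of designed positions where the generated residue matches the native one. The sketch below assumes pre-aligned sequences of equal length; the function name and the two CDR-H3 strings are made up for illustration and are not taken from the IgGM benchmark.

```python
def sequence_recovery(native: str, designed: str, positions: list[int]) -> float:
    """Fraction of the given 0-based positions where designed matches native."""
    if len(native) != len(designed):
        raise ValueError("sequences must be aligned to the same length")
    if not positions:
        raise ValueError("no positions to score")
    matches = sum(native[i] == designed[i] for i in positions)
    return matches / len(positions)


# Hypothetical CDR-H3 loops (illustrative only, not benchmark data).
native_h3 = "ARDYYGSSYFDY"
designed_h3 = "ARDGYGSSWFDV"
all_positions = list(range(len(native_h3)))
print(f"{sequence_recovery(native_h3, designed_h3, all_positions):.2f}")  # 0.75
```

Here 9 of the 12 positions match, giving a recovery of 0.75.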
IgGM supports both conventional antibodies (heavy + light chain) and single-domain nanobodies. It won a top-three prize in the AIntibody competition, an experimentally validated antibody design challenge.
How does IgGM work?
IgGM combines three components:
- Pre-trained language model (ESM-PPI): Extracts evolutionary and structural features from protein sequences. ESM-PPI extends ESM2 with improved handling of inter-chain relationships in multi-chain complexes.
- Feature learning module: Processes 3D coordinate information through structure encoding and inter-chain interaction modules, integrating spatial relationships between antibody and antigen.
- Diffusion-based prediction module: Uses SO(3) diffusion to model rotational degrees of freedom in protein structure. The system samples initial translations from a Gaussian distribution and rotations from the standard SO(3) group, then iteratively refines both sequence and structure.
Training proceeds in two phases. First, a diffusion model learns structure prediction while preserving original sequence information. Then a consistency model is distilled from the diffusion model, enabling fast generation from arbitrary noise levels. This two-phase approach proved essential for model performance.
How to use IgGM online
ProteinIQ provides GPU-accelerated IgGM without local installation or the complexity of managing PyTorch Geometric dependencies.
Inputs
| Input | Description |
|---|---|
| Heavy Chain (VH) | Antibody heavy chain variable region sequence. Mark positions to design with X characters. |
| Light Chain (VL) | Light chain variable region (optional for nanobodies). |
| Antigen Structure | Required. PDB file of the target antigen. Upload directly or fetch by PDB ID. |
| Antigen Sequence | Optional sequence for epitope-guided design. If omitted, extracted from PDB. |
| Original Sequence | Reference sequence for affinity maturation comparisons. |
The X character indicates positions for the model to redesign. For CDR design, mask the CDR residues while providing the framework sequence.
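Masking can be done by hand, but a small helper makes it less error-prone. The sketch below assumes you already know the CDR boundaries as 1-based inclusive positions in your own sequence; those boundaries depend on the numbering scheme you use (for example IMGT, Kabat, or Chothia). The helper name and the toy sequence are hypothetical.

```python
def mask_regions(sequence: str, regions: list[tuple[int, int]], mask: str = "X") -> str:
    """Replace residues in each 1-based inclusive (start, end) range with the mask."""
    residues = list(sequence)
    for start, end in regions:
        if not 1 <= start <= end <= len(sequence):
            raise ValueError(f"region {start}-{end} is outside the sequence")
        for i in range(start - 1, end):
            residues[i] = mask
    return "".join(residues)


# Toy 10-residue fragment; positions 4-6 stand in for a CDR.
print(mask_regions("EVQLVESGGG", [(4, 6)]))  # EVQXXXSGGG
```

The masked string keeps its original length, so framework residues stay at the same positions as in the input.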
Design tasks
| Task | Description |
|---|---|
| CDR Design | Design CDR loops to bind the target epitope. Provide framework sequence with X at CDR positions. |
| Inverse Design | Generate a sequence compatible with an existing antibody structure. |
| Framework Design | Redesign framework regions while preserving CDRs. Useful for humanization. |
| Affinity Maturation | Optimize an existing antibody for improved binding. Provide original sequence for comparison. |
Settings
| Setting | Description |
|---|---|
| Number of designs | Samples to generate (1–100, default 10). More samples provide diversity but increase runtime. |
| Epitope residues | Target binding site positions on the antigen (e.g., 45,46,47,50-55). If omitted, calculated automatically from the antigen structure. |
| Apply PyRosetta relaxation | Energy minimization to improve structural quality. Increases runtime. |
| Random seed | For reproducible results. Leave empty for random seed. |
Outputs
IgGM produces:
- Designed sequences in FASTA format for each sample
- Predicted structures as PDB files showing the antibody-antigen complex
- Statistics (CSV) with amino acid distributions and generation frequencies for affinity maturation
All outputs can be visualized in the integrated 3D viewer and downloaded as a ZIP archive.
Epitope specification
The epitope, the antigen region the antibody should bind, guides design. IgGM requires epitope information and obtains it in one of three ways:
- Provide comma-separated residue numbers (e.g., `417,449,453,455-456`). Numbers correspond to sequence positions, not PDB residue IDs.
- For multi-chain PDB files (e.g., an existing antibody-antigen complex), IgGM identifies interface residues using a 5 Å distance cutoff.
- For single-chain antigen structures, surface-exposed residues are identified by solvent-accessible surface area (SASA > 25 Ų).
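The distance-cutoff idea used for multi-chain files can be sketched as follows. This is an illustration only: it assumes coordinates have already been parsed from the PDB file, and it does not claim to match which atoms IgGM actually compares (all atoms, heavy atoms, or Cα only). The function name and toy coordinates are hypothetical.

```python
import math


def interface_residues(antigen_atoms, partner_atoms, cutoff=5.0):
    """Antigen residues with at least one atom within `cutoff` Å of the partner.

    antigen_atoms: list of (residue_index, (x, y, z)) tuples
    partner_atoms: list of (x, y, z) tuples from the other chain(s)
    """
    hits = set()
    for res_idx, coord in antigen_atoms:
        if res_idx in hits:
            continue  # residue already counted as interface
        if any(math.dist(coord, p) <= cutoff for p in partner_atoms):
            hits.add(res_idx)
    return sorted(hits)


# Toy coordinates in Å (not from a real structure).
antigen = [(1, (0, 0, 0)), (1, (1.5, 0, 0)), (2, (10, 0, 0)), (3, (20, 0, 0))]
antibody = [(0, 4, 0), (12, 3, 0)]
print(interface_residues(antigen, antibody))  # [1, 2]
```

The loop is quadratic in the number of atoms, which is fine for a sketch; a spatial index would be the usual choice for large complexes.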
Epitope size is limited to 50 residues to manage computational cost.
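To check an epitope string before submitting it, the format shown above (comma-separated positions with optional `start-end` ranges) can be expanded with a short parser. This is a sketch, not ProteinIQ's own parser: the function name is hypothetical, and rejecting oversized epitopes with an error is an assumption, since the text does not say how the 50-residue limit is enforced.

```python
def parse_epitope(spec: str, max_size: int = 50) -> list[int]:
    """Expand a spec like '417,449,453,455-456' into sorted unique positions."""
    positions = set()
    for token in spec.split(","):
        token = token.strip()
        if not token:
            continue  # tolerate stray commas and whitespace
        if "-" in token:
            start_text, end_text = token.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"invalid range: {token}")
            positions.update(range(start, end + 1))
        else:
            positions.add(int(token))
    if len(positions) > max_size:
        # Assumed behavior: reject rather than silently truncate.
        raise ValueError(f"epitope has {len(positions)} residues; limit is {max_size}")
    return sorted(positions)


print(parse_epitope("417,449,453,455-456"))  # [417, 449, 453, 455, 456]
```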
Limitations
- Antigen structure required: Unlike some tools that accept sequence-only input, IgGM needs a 3D antigen structure for epitope-guided design.
- Memory constraints: Large antigens may need truncation (the `max_antigen_size` parameter, up to 384 residues by default).
- Binding affinity not predicted: IgGM generates plausible sequences but does not estimate binding strength. Experimental validation remains necessary.
- Rigid docking: The antigen structure is treated as fixed during design.
Related tools
- AntiFold: Inverse folding specifically for antibody structures, useful for sequence optimization when a structure is already known
- IgDesign: CDR design via inverse folding from antibody-antigen complex structures
- BioPhi: Antibody humanization and humanness scoring for therapeutic development
- AbLang-2: Antibody language model for sequence analysis and embedding generation
