
What is Genie 3?
Genie 3 is an all-atom SE(3)-equivariant diffusion model for protein design. Unlike backbone-only generators that produce only Cα traces, it models every heavy atom during generation, so output structures are ready for visualization, sequence inversion, and downstream validation without an extra full-atom reconstruction step.
The model supports three design tasks from a single checkpoint:
- Unconditional generation: creates novel protein structures from noise
- Motif scaffolding: holds fixed structural motifs and redesigns the surrounding scaffold
- Binder design: generates a candidate binder conditioned on hotspot residues of a target surface
Genie 3 is developed by AQLaboratory. The implementation on ProteinIQ runs the official generation pipeline and returns each sampled structure as a PDB file.
How to use Genie 3 online
ProteinIQ hosts Genie 3 on GPU infrastructure and surfaces the three design modes through one form. Choose a mode, set the length or motif constraints, and request up to 20 samples. The job returns a PDB file for every generated structure plus a file list. No local installation of the model, sampling code, or evaluation dependencies is required.
Inputs
The required input depends on the selected design mode.
| Mode | Required input | Accepted formats |
|---|---|---|
| Unconditional | None | Not applicable (length settings only) |
| Motif scaffolding | One motif structure | .pdb, .ent, or RCSB PDB ID |
| Binder design | One target structure | .pdb, .ent, or RCSB PDB ID |
For motif scaffolding and binder design, the uploaded structure should be clean: one conformer per residue, no alternate-location records, and residue numbering that matches the segment or hotspot specification. When an RCSB PDB ID is given, the PDB file is fetched directly from RCSB.
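The cleanliness criteria above can be checked locally before uploading. The following is a minimal sketch that relies only on the standard fixed-column PDB layout; the function name `check_structure` is hypothetical and is not part of Genie 3 or ProteinIQ.

```python
def check_structure(pdb_text):
    """Summarize chain IDs and alternate-location records in ATOM lines.

    Uses the fixed-column PDB layout (1-based columns): altLoc in
    column 17, chain ID in column 22, residue number in columns 23-26.
    """
    chains = set()
    altloc_residues = set()
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        chain_id = line[21]
        res_seq = int(line[22:26])
        chains.add(chain_id)
        if line[16] != " ":  # a non-blank altLoc marks an alternate conformer
            altloc_residues.add((chain_id, res_seq))
    return {
        "chains": sorted(chains),
        "altloc_residues": sorted(altloc_residues),
    }
```

A non-empty `altloc_residues` list suggests the file still carries alternate conformers that should be resolved before upload, and the `chains` list shows which chain IDs are available for segment or hotspot specifications.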
Core settings
| Setting | Type | Range / values | Default | Description |
|---|---|---|---|---|
| design_mode | Select | unconditional, motif_scaffolding, binder_design | unconditional | Which design task to run. |
| n_sample | Slider | 1–20 | 5 | Number of independent structures to generate. More samples increase cost and runtime. |
| n_sample_step | Slider | 20–200 | 100 | Number of diffusion denoising steps. 100 is the Genie 3 default. |
| predict_sequence | Switch | On / off | Off | When enabled, Genie 3 also predicts an amino acid sequence for each generated structure. |
Unconditional generation settings
| Setting | Range | Default | Description |
|---|---|---|---|
| direction_scale_unconditional | 0–2 | 0.8 | DDIM guidance scale for unconditional generation. |
| min_length | 20–256 | 50 | Minimum total number of residues. |
| max_length | 20–256 | 50 | Maximum total number of residues. |
| length_step | 1–236 | 50 | Step size for sampling lengths between min_length and max_length. |
If min_length equals max_length, every sample has the same length. If they differ, Genie 3 samples lengths across the range and generates n_sample structures per sampled length.
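Because n_sample applies per sampled length, the total number of structures can grow quickly when the range is wide. The sketch below estimates the job size under one assumption that the text does not confirm: lengths are enumerated from min_length upward in increments of length_step, with max_length included when the step lands on it. The function names are hypothetical.

```python
def planned_lengths(min_length, max_length, length_step):
    """Enumerate candidate lengths (assumed: inclusive range, fixed step)."""
    if min_length > max_length:
        raise ValueError("min_length must not exceed max_length")
    if length_step < 1:
        raise ValueError("length_step must be at least 1")
    return list(range(min_length, max_length + 1, length_step))


def total_structures(n_sample, min_length, max_length, length_step):
    """n_sample structures are generated for every sampled length."""
    return n_sample * len(planned_lengths(min_length, max_length, length_step))


# Defaults: a single length, so the job yields n_sample structures.
print(total_structures(5, 50, 50, 50))
# A wider range multiplies the job size by the number of lengths.
print(total_structures(5, 50, 150, 50))
```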
Motif scaffolding settings
| Setting | Range | Default | Description |
|---|---|---|---|
| motif_segments | Text | A1-10 | Fixed residue ranges in ChainStart-End format. Multiple segments are separated by commas. |
| direction_scale_motif | 0–2 | 0.1 | DDIM guidance scale for motif scaffolding. |
| scaffold_flank_min | 0–100 | 5 | Minimum residues generated before and after the motif. |
| scaffold_flank_max | 0–100 | 15 | Maximum residues generated before and after the motif. |
The motif structure must contain a single protein chain. The total design length includes both the fixed motif residues and every generated flanking scaffold segment. The combined length must stay within the 256-residue limit.
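The length budget can be checked before submitting a job. This is a rough sketch with hypothetical helper names: it parses the ChainStart-End format and assumes exactly one generated flank on each side of the motif. Residues that may be generated between multiple segments are not counted, so for multi-segment motifs the result is only a lower bound on the largest possible design.

```python
import re

SEGMENT_RE = re.compile(r"^([A-Za-z])(\d+)-(\d+)$")
MAX_TOTAL_LENGTH = 256  # residue limit stated for this deployment


def parse_motif_segments(spec):
    """Parse 'A1-10,A25-40' into [(chain, start, end), ...]."""
    segments = []
    for token in spec.split(","):
        match = SEGMENT_RE.match(token.strip())
        if match is None:
            raise ValueError(f"not in ChainStart-End format: {token!r}")
        chain, start, end = match.group(1), int(match.group(2)), int(match.group(3))
        if end < start:
            raise ValueError(f"segment end precedes start: {token!r}")
        segments.append((chain, start, end))
    return segments


def motif_length(spec):
    """Number of fixed residues across all segments (ranges are inclusive)."""
    return sum(end - start + 1 for _, start, end in parse_motif_segments(spec))


def max_design_length(spec, scaffold_flank_max):
    """Assumed upper size: fixed residues plus one maximal flank per side."""
    return motif_length(spec) + 2 * scaffold_flank_max


# Default settings: a 10-residue motif with flanks of up to 15 residues.
print(max_design_length("A1-10", 15) <= MAX_TOTAL_LENGTH)
```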
Binder design settings
| Setting | Range | Default | Description |
|---|---|---|---|
| target_hotspots | Text | empty | Target interface residues in ChainResidue format, for example A65,A74. |
| direction_scale_binder | 0–2 | 0.0 | DDIM guidance scale for binder design. |
| binder_length_min | 20–256 | 40 | Minimum length of the designed binder. |
| binder_length_max | 20–256 | 80 | Maximum length of the designed binder. |
Hotspot residues define where the binder should contact the target. They must be present in the uploaded target structure and use the same chain IDs and residue numbers.
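A mismatch between the hotspot string and the target numbering is easy to catch before upload. The sketch below parses the ChainResidue format and reports any hotspot that has no ATOM record in the target file. The helper names are hypothetical, and single-letter chain IDs are assumed.

```python
import re

HOTSPOT_RE = re.compile(r"^([A-Za-z])(\d+)$")


def parse_hotspots(spec):
    """Parse 'A65,A74' into [('A', 65), ('A', 74)]."""
    hotspots = []
    for token in spec.split(","):
        match = HOTSPOT_RE.match(token.strip())
        if match is None:
            raise ValueError(f"not in ChainResidue format: {token!r}")
        hotspots.append((match.group(1), int(match.group(2))))
    return hotspots


def residues_present(pdb_text):
    """Collect (chain, residue number) pairs from ATOM records."""
    present = set()
    for line in pdb_text.splitlines():
        if line.startswith("ATOM"):
            # PDB fixed columns: chain ID in column 22, resSeq in 23-26.
            present.add((line[21], int(line[22:26])))
    return present


def missing_hotspots(spec, pdb_text):
    """Return the hotspots that do not occur in the target structure."""
    present = residues_present(pdb_text)
    return [h for h in parse_hotspots(spec) if h not in present]
```

An empty list from `missing_hotspots` means every hotspot matches a residue in the file; anything else points to a chain ID or numbering mismatch.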
Outputs
Genie 3 returns one PDB file per generated structure. ProteinIQ renders the structures in an interactive 3D viewer and lists all files for download.
| Output | Description |
|---|---|
| PDB files | One structure per sample. Atom records include all heavy atoms generated by the diffusion model. |
| File list | Spreadsheet-like list of generated files with metadata for each sample. |
If predict_sequence is enabled, residue names in the generated PDB files reflect the predicted amino acid sequence. By default, the returned structures are neither energy-minimized nor refolded with a structure predictor.
How Genie 3 works
Genie 3 treats protein design as a diffusion process over 3D atomic coordinates. During training, the model learns to reverse a noise-corruption process that gradually perturbs real protein structures. At inference time, it starts from random coordinates and iteratively denoises them into physically plausible protein geometries.
All-atom SE(3)-equivariance
SE(3)-equivariance means the model's predictions transform consistently with rotations and translations of the input. If the input structure is rotated, the output rotates by the same amount without changing the internal geometry. This removes the need for data augmentation and lets the model learn geometric relationships in a coordinate-independent way.
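The property can be illustrated with a toy example. The sketch below is not Genie 3's network: `toy_update` is a hypothetical stand-in that pulls every point toward the centroid, which happens to be SE(3)-equivariant. For brevity the rigid transform is restricted to a rotation about the z-axis plus a translation. Applying the transform before or after the update gives the same coordinates.

```python
import math


def toy_update(coords, shrink=0.9):
    """Toy equivariant map: move each point toward the centroid."""
    n = len(coords)
    centroid = [sum(p[i] for p in coords) / n for i in range(3)]
    return [
        [centroid[i] + shrink * (p[i] - centroid[i]) for i in range(3)]
        for p in coords
    ]


def rigid_transform(coords, angle, shift):
    """Rotate about the z-axis by `angle` radians, then translate by `shift`."""
    c, s = math.cos(angle), math.sin(angle)
    return [
        [c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2]]
        for x, y, z in coords
    ]


def max_deviation(a, b):
    """Largest absolute coordinate difference between two point sets."""
    return max(abs(p[i] - q[i]) for p, q in zip(a, b) for i in range(3))


coords = [[0.0, 0.0, 0.0], [3.8, 0.0, 0.0], [5.0, 3.6, 0.0], [8.1, 4.0, 2.2]]
angle, shift = 1.1, [10.0, -4.0, 2.5]

transform_then_update = toy_update(rigid_transform(coords, angle, shift))
update_then_transform = rigid_transform(toy_update(coords), angle, shift)

# Equivariance: the two orders agree up to floating-point rounding.
print(max_deviation(transform_then_update, update_then_transform))
```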
Modeling all atoms, not just the backbone, matters most for design tasks where side-chain packing determines success. Binder interfaces, motif scaffolding constraints, and hydrogen-bond networks all depend on accurate heavy-atom geometry.
Guidance and sampling
The mode-specific direction scale controls classifier-free guidance strength during DDIM sampling. Genie 3's example configurations use 0.8 for unconditional short proteins, 0.1 for motif scaffolding, and 0.0 for binder design. Higher values follow the model direction more strongly; lower values allow more diversity.
n_sample_step sets the number of denoising iterations. 100 steps match the Genie 3 default and balance quality with runtime. Fewer steps run faster but may leave residual noise. More steps refine local geometry at increased cost.
Interpreting generated structures
Generated structures should be treated as design candidates, not final experimental structures. The diffusion model produces geometrically plausible backbones and side-chain conformations, but it does not guarantee foldability, stability, or binding affinity.
A practical validation workflow on ProteinIQ looks like this:
- Inspect the PDB files in the interactive viewer. Look for closed, compact folds in unconditional designs and continuous backbone connectivity in scaffolds.
- Run promising structures through AlphaFold 2, ESMFold, or Boltz-2 to check whether the sequence folds back to the designed structure.
- For binders, predict the binder-target complex and inspect interface confidence. Low predicted aligned error at the interface and high interface pTM are positive signs.
- Use ProteinMPNN or LigandMPNN to redesign sequences for generated backbones before experimental testing.
Common warning signs include:
- Clashing side chains or broken chain connectivity
- Extended, unstructured loops that dominate the design
- Binder designs where the generated chain does not contact the specified hotspots
- Self-consistency RMSD much larger than 2 Å when the sequence is folded back through a structure predictor
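The hotspot-contact warning sign can be screened automatically. The sketch below reads a generated complex and reports, for each hotspot, whether any binder atom lies within a distance cutoff. Several details are assumptions rather than documented behavior: the binder's chain ID in the output file must be supplied by the user, the 5 Å cutoff is a common heuristic rather than a Genie 3 setting, and the function names are hypothetical.

```python
import math


def parse_atoms(pdb_text):
    """Return (chain, residue number, (x, y, z)) for each ATOM record."""
    atoms = []
    for line in pdb_text.splitlines():
        if line.startswith("ATOM"):
            # PDB fixed columns: x in 31-38, y in 39-46, z in 47-54.
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            atoms.append((line[21], int(line[22:26]), xyz))
    return atoms


def hotspot_contacts(pdb_text, binder_chain, hotspots, cutoff=5.0):
    """Map each (chain, residue) hotspot to True if the binder touches it."""
    atoms = parse_atoms(pdb_text)
    binder = [xyz for chain, _, xyz in atoms if chain == binder_chain]
    contacts = {}
    for hotspot in hotspots:
        target = [xyz for chain, res, xyz in atoms if (chain, res) == hotspot]
        distances = [math.dist(a, b) for a in target for b in binder]
        # A hotspot missing from the file counts as not contacted.
        contacts[hotspot] = bool(distances) and min(distances) <= cutoff
    return contacts
```

Designs where most hotspots map to `False` are candidates for discarding before spending compute on complex prediction.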
When to use Genie 3 vs alternatives
Genie 3 is a generative structure model. It is most useful when the goal is to explore new protein geometries rather than evaluate an existing sequence or complex.
| Task | Recommended tool | Why |
|---|---|---|
| Generate novel protein backbones | Genie 3 | Direct all-atom diffusion with length control. |
| Design binders to a target surface | Genie 3 or BindCraft | Genie 3 is fast and diffusion-native; BindCraft uses AlphaFold2-guided hallucination with built-in filtering. |
| Scaffolding a functional motif | Genie 3 | Holds fixed segments and redesigns flanking scaffold regions. |
| Redesign sequence for an existing backbone | ProteinMPNN | Inverse folding is faster and more controllable than full diffusion. |
| Predict structure of an existing sequence | AlphaFold 2 or Boltz-2 | These are predictors, not generators. |
| Small-molecule docking | AutoDock Vina or DiffDock | Genie 3 does not model ligand poses. |
Genie 3 vs BindCraft
Both tools generate protein binders, but their philosophies differ. Genie 3 samples from a diffusion model conditioned on target hotspots and returns raw structures quickly. BindCraft runs a longer optimization loop that uses AlphaFold2-Multimer as an objective function and filters designs with interface metrics. Genie 3 is better for rapid exploration; BindCraft is better when built-in in-silico filtering and relaxation are worth the extra compute.
Limitations
- No built-in folding validation: Genie 3 returns generated structures only. It does not run ESMFold, ColabFold, or AlphaFold2 to check self-consistency. Users must validate separately.
- Length cap: The current ProteinIQ deployment supports designs up to 256 residues and 2,048 tokens per batch.
- Sequence prediction is optional: Predicted sequences from predict_sequence are model proposals. They should be re-evaluated with an inverse-folding model or structure predictor before ordering genes.
- Binder design requires hotspot knowledge: The quality of binder designs depends heavily on choosing reasonable hotspot residues. Poorly chosen hotspots produce binders that miss the intended interface.
- No small-molecule or nucleic-acid support: Genie 3 designs proteins only. It cannot generate DNA, RNA, or ligand structures.
- Stochastic output: Results vary between runs unless the underlying sampling seed is fixed. Multiple samples are recommended to explore the design space.
Related tools

ODesign
All-atom generative AI for designing protein binders. Specify target binding sites and generate diverse binding proteins with fine-grained control over interaction parameters.

ProFam
ProFam-1 is a protein family language model for family-conditioned sequence generation. Provide a protein family FASTA/MSA and generate new sequences with model likelihood scores for downstream ranking and screening.

ProGen2
ProGen2 is Salesforce Research's protein language model suite for prompt-based de novo protein sequence generation. It samples novel amino acid sequences from a plain-text context string using top-p sampling and temperature control.

Proteo-R1
Reasoning-guided antibody CDR co-design for antibody-antigen complexes. Proteo-R1 identifies residue-level functional decisions and uses conditional diffusion to generate ranked designed structures with confidence metrics.

EvoDiff
EvoDiff is a diffusion-based protein sequence generation framework from Microsoft Research. ProteinIQ currently wraps the EvoDiff-Seq OA_DM_38M model for unconditional protein generation, motif scaffolding, and user-sequence inpainting.

PocketFlow
PocketFlow is a structure-based molecular generative model that designs novel drug-like molecules within protein binding pockets. It uses autoregressive flow modeling with chemical knowledge to generate 100% chemically valid, highly drug-like compounds.

PocketXMol
PocketXMol is a pocket-interacting generative foundation model for docking, small-molecule design, and peptide design in protein binding pockets.

Proteina-Complexa
Design protein binders against a target structure with NVIDIA BioNeMo's Proteina-Complexa generative pipeline.

RFdiffusion
RFdiffusion is a state-of-the-art protein structure generation tool that uses diffusion models to design proteins de novo, create binders, scaffold motifs, and generate symmetric oligomers with atomic precision.

RFdiffusion 2
RFdiffusion2 is an atom-level enzyme active site scaffolding tool that generates protein scaffolds around your input motif. REQUIRES an input PDB structure containing the active site residues to scaffold. For ligand-aware design, ligands must be embedded in the input PDB as HETATM records.