RFdiffusion3

All-atom generative diffusion for designing binders, enzymes, and symmetric protein assemblies


Related tools

RFdiffusion 2

RFdiffusion2 is an atom-level enzyme active site scaffolding tool that generates protein scaffolds around your input motif. REQUIRES an input PDB structure containing the active site residues to scaffold. For ligand-aware design, ligands must be embedded in the input PDB as HETATM records.

ODesign

All-atom generative AI for designing protein binders. Specify target binding sites and generate diverse binding proteins with fine-grained control over interaction parameters.

RFdiffusion

RFdiffusion is a state-of-the-art protein structure generation tool that uses diffusion models to design proteins de novo, create binders, scaffold motifs, and generate symmetric oligomers with atomic precision.

BoltzGen

BoltzGen is a state-of-the-art AI model for designing protein and peptide binders against any biomolecular target. Using generative diffusion models, it creates novel binders (proteins, peptides, nanobodies) with nanomolar-level binding affinity.

EvoDiff

EvoDiff is a diffusion-based protein sequence generation framework from Microsoft Research. ProteinIQ currently wraps the EvoDiff-Seq OA_DM_38M model for unconditional protein generation, motif scaffolding, and user-sequence inpainting.

IgGM

IgGM is a generative foundation model for antibody and nanobody design against a target antigen. Supports CDR design, affinity maturation, inverse design, and framework design. Requires an antigen structure (PDB) and antibody sequences with "X" marking positions to design.

LigandMPNN

Design protein sequences with atomic context from ligands, metals, and nucleotides. Achieves 63.3% sequence recovery at binding sites, significantly outperforming ProteinMPNN (50.5%).

PocketFlow

PocketFlow is a structure-based molecular generative model that designs novel drug-like molecules within protein binding pockets. It uses autoregressive flow modeling with chemical knowledge to generate 100% chemically valid, highly drug-like compounds.

RFantibody

Structure-based de novo antibody and nanobody design pipeline combining antibody-tuned RFdiffusion, ProteinMPNN sequence design, and antibody-tuned RoseTTAFold2 filtering.

AntiFold

Inverse folding for antibody variable domains and nanobodies. Predicts amino acid sequences compatible with antibody structures using IMGT numbering while preserving upstream AntiFold chain handling and structural constraints.

What is RFdiffusion3?

RFdiffusion3 (RFD3) is a generative diffusion model for de novo protein design. Unlike structure prediction tools that fold existing sequences, RFD3 creates entirely new protein backbones through an iterative denoising process. The model can generate proteins from scratch or design new chains that interact with specified targets.

RFD3 extends the original RFdiffusion architecture with all-atom capabilities, enabling design tasks involving small molecules, metals, and other non-protein components. The model learns to reverse a diffusion process that gradually adds noise to protein structures, generating novel backbones that satisfy specified constraints.

After generating backbones, pair RFD3 with LigandMPNN to design amino acid sequences for them. To validate the designed structures, predict them with RoseTTAFold3 or Boltz-2.

How does RFdiffusion3 work?

Diffusion process

RFD3 operates on a diffusion framework where protein structures are progressively corrupted with Gaussian noise during training. At inference time, the model reverses this process:

  1. Initialization: Start from random coordinates (pure noise)
  2. Denoising: Iteratively predict and remove noise at each timestep
  3. Refinement: Final coordinates converge to a valid protein backbone

The number of diffusion steps controls the quality-speed tradeoff. More steps (100-200) produce higher quality structures but take longer, while fewer steps (10-50) enable rapid prototyping.
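
The three inference stages can be illustrated with a toy loop. This is only a sketch of the general idea, not the RFD3 sampler: the "coordinates" are a flat list of numbers, the trained network is replaced by an oracle that already knows the clean answer, and the names `toy_denoise`, `num_steps`, and the 0.1 noise factor are all hypothetical choices made for the example.

```python
import random


def toy_denoise(target, num_steps=50, seed=0):
    """Toy reverse-diffusion loop on a flat list of coordinates.

    Illustration only: a real model predicts the clean structure with a
    neural network; here an oracle simply returns `target`.
    """
    rng = random.Random(seed)

    # 1. Initialization: start from pure Gaussian noise.
    x = [rng.gauss(0.0, 1.0) for _ in target]

    for step in range(num_steps, 0, -1):
        # 2. Denoising: ask the (stand-in) model for the clean coordinates
        #    and move a fraction 1/step of the way toward them.
        predicted_clean = target
        x = [xi + (pi - xi) / step for xi, pi in zip(x, predicted_clean)]

        # Re-inject a little noise; the scale shrinks to zero at the last step.
        noise_scale = (step - 1) / num_steps
        x = [xi + 0.1 * noise_scale * rng.gauss(0.0, 1.0) for xi in x]

    # 3. Refinement: after the final step the coordinates have converged.
    return x


final = toy_denoise([1.0, -2.0, 3.5], num_steps=50)
print([round(value, 6) for value in final])
```

In this toy setting every step count converges because the oracle is perfect. With a learned model, each step only partially corrects the structure, which is why more steps tend to give more refined results.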

Constraint satisfaction

RFD3 can incorporate various constraints during generation:

Structural constraints: Fix specific residue positions from a template structure while designing new regions around them. The contig specification language allows precise control over which regions are fixed versus designed.

Hotspot targeting: For binder design, specify which target residue atoms should be contacted by the designed protein. The model orients the new chain to maximize interactions with these hotspots.

Length constraints: Control the size of designed proteins or protein regions using length ranges. The model samples within the specified range during generation.

Inputs and settings

Target structure (optional)

Provide a target protein structure for constrained design tasks like binder design. Upload a PDB/CIF file or fetch directly from RCSB using a PDB ID (e.g., 4ZXB). The target structure defines the binding interface for binder design or provides structural motifs for scaffolding tasks.

For unconditional generation (creating proteins from scratch), leave this empty and RFD3 will generate novel folds based solely on the specified length.

Design task

Select the type of protein design:

  • Binder Design: Design a new protein that binds to the target structure. Requires a target structure input.
  • Enzyme Design: Design proteins with functional active sites. Often combined with hotspot specification.
  • Symmetric Assembly: Generate proteins with internal symmetry (homo-oligomers).
  • Unconditional Generation: Create novel protein folds without any structural constraints.

Number of designs

Generate multiple independent designs (1-50) in a single job. Each design samples a different trajectory through the diffusion process, producing structural diversity. More designs increase the chance of finding high-quality candidates but require more computation time.

Diffusion steps

Controls the number of denoising iterations (10-200). Higher values produce more refined structures:

Steps      Use case
10-20      Quick prototyping, initial exploration
50         Standard quality (default)
100-200    High-quality final designs

Design length range

Specify the length of designed protein regions as a range (e.g., 50-100). The model samples within this range during generation. For binder design, this controls the size of the designed binder chain.
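
The limits described in this section can be collected into a small pre-submission check. The sketch below is not part of ProteinIQ or RFD3: the class `DesignSettings`, its field names, the task identifiers, and the default of one design are assumptions made for illustration. Only the numeric limits (1-50 designs, 10-200 steps, default 50 steps) and the target requirement for binder design come from the text above.

```python
import re
from dataclasses import dataclass

# Hypothetical task identifiers; the form's real internal names are not documented.
VALID_TASKS = ("binder", "enzyme", "symmetric", "unconditional")


def parse_length_range(text):
    """Parse a range such as '50-100' into (low, high)."""
    match = re.fullmatch(r"\s*(\d+)\s*-\s*(\d+)\s*", text)
    if match is None:
        raise ValueError(f"length range must look like '50-100', got {text!r}")
    low, high = int(match.group(1)), int(match.group(2))
    if low < 1 or low > high:
        raise ValueError(f"length range must satisfy 1 <= low <= high, got {text!r}")
    return low, high


@dataclass
class DesignSettings:
    task: str = "unconditional"
    num_designs: int = 1          # assumed default; allowed range is 1-50
    diffusion_steps: int = 50     # documented default; allowed range is 10-200
    length_range: str = "50-100"
    has_target: bool = False

    def problems(self):
        """Return a list of human-readable issues (empty if settings look valid)."""
        issues = []
        if self.task not in VALID_TASKS:
            issues.append(f"unknown task: {self.task!r}")
        if self.task == "binder" and not self.has_target:
            issues.append("binder design requires a target structure")
        if not 1 <= self.num_designs <= 50:
            issues.append("number of designs must be between 1 and 50")
        if not 10 <= self.diffusion_steps <= 200:
            issues.append("diffusion steps must be between 10 and 200")
        try:
            parse_length_range(self.length_range)
        except ValueError as exc:
            issues.append(str(exc))
        return issues


print(DesignSettings(task="binder", num_designs=20, has_target=True).problems())
```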

Advanced options

Custom contig

The contig specification provides fine-grained control over which regions are kept from the uploaded structure and which are designed.

If your uploaded structure has missing residues or gaps, leave Custom contig blank unless you explicitly split the fixed residue ranges to match residues that actually exist in the structure.
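
One way to split a fixed range around gaps is to group the residue numbers that are actually present into contiguous runs and emit one ChainStart-End piece per run. The helpers below are a hypothetical sketch, not a ProteinIQ feature, and they assume you have already collected the residue numbers for the chain, for example from the ATOM records of the uploaded file.

```python
def contiguous_runs(residue_numbers):
    """Group residue numbers into inclusive (start, end) runs with no gaps."""
    runs = []
    for number in sorted(set(residue_numbers)):
        if runs and number == runs[-1][1] + 1:
            runs[-1][1] = number           # extend the current run
        else:
            runs.append([number, number])  # gap found: start a new run
    return [tuple(run) for run in runs]


def fixed_pieces(chain, residue_numbers):
    """Format each run as a fixed-residue piece such as 'A1-40'."""
    return [f"{chain}{start}-{end}" for start, end in contiguous_runs(residue_numbers)]


# Chain A with residues 41-44 missing from the structure.
present = list(range(1, 41)) + list(range(45, 96))
print(fixed_pieces("A", present))  # ['A1-40', 'A45-95']
```

How the gap between two pieces should be bridged (a designed segment of some length, or a chain break) depends on the design goal and is not decided by this sketch.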

Contigs are written as comma-separated pieces:

  • A1-95 or B5-12 references residues from the uploaded structure
  • 50 or 50-100 asks RFD3 to design a region of that length
  • /0 creates a chain break

Examples:

  • 50-100,/0,A1-95 - Design a 50-100 residue binder and place it on a separate chain from target residues A1-A95
  • A40-60,70,A120-170,/0,B3-45,60-80 - Keep residues A40-A60, design a 70-residue segment, keep A120-A170, then start a new chain with B3-B45 followed by a designed 60-80 residue segment
  • 100 - Generate a 100-residue protein unconditionally

Whitespace around commas is tolerated by the ProteinIQ form and will be normalized before submission, but the underlying grammar is still comma-separated pieces. Use /0, not a bare /, for chain breaks.
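
The grammar above is small enough to check before submitting. The parser below is a minimal sketch that covers only the three piece types listed in this section. The function name `parse_contig` and the tuple layout are invented for the example, and single-letter chain IDs are an assumption based on the examples shown. RFD3 itself may accept forms that this sketch rejects.

```python
import re

FIXED = re.compile(r"^([A-Za-z])(\d+)-(\d+)$")   # e.g. A1-95
DESIGNED = re.compile(r"^(\d+)(?:-(\d+))?$")     # e.g. 50 or 50-100


def parse_contig(contig):
    """Split a contig string into ('fixed' | 'design' | 'break', ...) tuples."""
    pieces = []
    for raw in contig.split(","):
        token = raw.strip()  # whitespace around commas is tolerated
        if token == "/0":
            pieces.append(("break",))
        elif (m := FIXED.match(token)):
            chain, start, end = m.group(1), int(m.group(2)), int(m.group(3))
            if start > end:
                raise ValueError(f"fixed range runs backwards: {token!r}")
            pieces.append(("fixed", chain, start, end))
        elif (m := DESIGNED.match(token)):
            low = int(m.group(1))
            high = int(m.group(2)) if m.group(2) else low
            if low > high:
                raise ValueError(f"designed range runs backwards: {token!r}")
            pieces.append(("design", low, high))
        else:
            # Includes a bare "/", which is not a valid chain break.
            raise ValueError(f"unrecognized contig piece: {token!r}")
    return pieces


print(parse_contig("50-100, /0, A1-95"))
# [('design', 50, 100), ('break',), ('fixed', 'A', 1, 95)]
```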

Hotspot residues

Target specific atoms on the target protein for binder design. This focuses the designed interface on functionally important residues.

Format: JSON object mapping residue IDs to atom names

    {
      "A64": "CD2,CZ",
      "A88": "CG,CZ",
      "A96": "CD1,CZ"
    }

Residue ID format: ChainResidueNumber (e.g., A64 = chain A, residue 64)

The field accepts a JSON object only. Residue IDs must match residues that are actually present in the uploaded target structure.
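
Both rules can be checked locally before pasting the value into the form. The following is a sketch under stated assumptions: `parse_hotspots` and `present_residues` are hypothetical names, the residue ID pattern assumes a single-letter chain followed by a number as in the example above, and the set of present residues has to be supplied by you from the target structure.

```python
import json
import re

RESIDUE_ID = re.compile(r"^[A-Za-z]\d+$")  # e.g. A64 = chain A, residue 64


def parse_hotspots(text, present_residues=None):
    """Validate hotspot JSON and return {residue_id: [atom names]}."""
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("hotspots must be a JSON object")

    hotspots = {}
    for residue_id, atoms in data.items():
        if not RESIDUE_ID.match(residue_id):
            raise ValueError(f"bad residue ID: {residue_id!r}")
        if not isinstance(atoms, str):
            raise ValueError(f"atom names for {residue_id} must be a string")
        names = [name.strip() for name in atoms.split(",") if name.strip()]
        if not names:
            raise ValueError(f"no atom names given for {residue_id}")
        if present_residues is not None and residue_id not in present_residues:
            raise ValueError(f"{residue_id} is not present in the target structure")
        hotspots[residue_id] = names
    return hotspots


text = '{"A64": "CD2,CZ", "A88": "CG,CZ", "A96": "CD1,CZ"}'
print(parse_hotspots(text, present_residues={"A64", "A88", "A96"}))
```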

Common atom names:

  • Backbone: N, CA, C, O
  • Aromatic: CD1, CD2, CE1, CE2, CZ
  • Aliphatic: CB, CG, CD
  • Charged: NZ (Lys), OD1/OD2 (Asp), OE1/OE2 (Glu)

Specifying hotspots generally improves binder design success rates because it steers the designed protein toward critical interface residues instead of an arbitrary surface patch.

Orientation strategy

Controls how the designed chain is positioned relative to the target:

  • Hotspot-based (default): Orient based on specified hotspot atoms. Recommended when hotspots are defined.
  • Center of Mass (com): Orient based on the target's center of mass. Use when no specific hotspots are known.

Understanding the results

Output structures

Each design generates a PDB file containing the designed backbone coordinates. Files are named sequentially (e.g., design_0.pdb, design_1.pdb). The structures contain:

  • Full backbone atoms (N, CA, C, O) for all residues
  • Designed chain(s) with placeholder sequence (typically poly-glycine or poly-alanine)
  • Target chain(s) if provided in the input

Interpreting designs

RFD3 outputs are backbone-only structures, so each design needs inspection, sequence design, and validation before use:

  1. Visualize: Check that designs have reasonable topology and target contact
  2. Filter: Remove designs with obvious clashes or disconnected regions
  3. Sequence design: Use LigandMPNN to assign optimal amino acid sequences
  4. Validate: Predict structure with RoseTTAFold3 to verify foldability

Quality indicators

Unlike structure prediction tools, RFD3 does not report explicit confidence scores. Design quality can instead be assessed by:

  • Compactness: Well-designed proteins have tightly packed cores
  • Secondary structure: Presence of regular helices and sheets indicates valid topology
  • Interface contacts: For binders, check that the designed chain contacts the target
  • Sequence design scores: LigandMPNN confidence scores indicate designability
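
Two of these indicators, compactness and interface contacts, can be estimated directly from the CA atoms of an output PDB file. The sketch below relies only on the fixed-column layout of PDB ATOM records. The function names, the use of radius of gyration as a compactness proxy, and the 8 Å CA-CA cutoff are illustrative assumptions, not thresholds recommended by RFD3.

```python
import math


def ca_coords(pdb_text, chain=None):
    """Extract CA coordinates from PDB ATOM records, optionally for one chain."""
    coords = []
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            if chain is None or line[21] == chain:
                coords.append(
                    (float(line[30:38]), float(line[38:46]), float(line[46:54]))
                )
    return coords


def radius_of_gyration(coords):
    """Root-mean-square distance of the atoms from their centroid (in Å)."""
    if not coords:
        raise ValueError("no coordinates given")
    n = len(coords)
    centroid = tuple(sum(c[i] for c in coords) / n for i in range(3))
    return math.sqrt(sum(math.dist(c, centroid) ** 2 for c in coords) / n)


def count_interface_contacts(binder, target, cutoff=8.0):
    """Count binder CA atoms within `cutoff` Å of any target CA atom."""
    return sum(
        1 for b in binder if any(math.dist(b, t) <= cutoff for t in target)
    )


def summarize(pdb_text, binder_chain, target_chain):
    """Return (binder radius of gyration, number of contacting binder residues)."""
    binder = ca_coords(pdb_text, binder_chain)
    target = ca_coords(pdb_text, target_chain)
    return radius_of_gyration(binder), count_interface_contacts(binder, target)
```

For a design read from disk, `summarize(open("design_0.pdb").read(), "A", "B")` would return both numbers, assuming the binder is chain A and the target is chain B in that file. A smaller radius of gyration at a given length suggests a more compact fold, and a contact count of zero flags a binder that drifted away from the target.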

Use cases

RFD3 excels at:

  • Therapeutic binders: Design proteins targeting disease-relevant receptors
  • Enzyme scaffolds: Create new protein architectures for catalytic function
  • Symmetric assemblies: Generate homo-oligomeric protein cages and rings
  • Protein diversification: Explore the space of foldable proteins

Example workflow: Binder design

  1. Prepare target: Upload target protein structure (e.g., receptor extracellular domain)
  2. Identify hotspots: Select key interface residues from known binding data or structural analysis
  3. Configure: Set binder length (e.g., 50-100), specify hotspots, select "Binder Design" task
  4. Generate: Run RFD3 with 10-50 designs
  5. Design sequences: Use LigandMPNN on promising backbones
  6. Validate: Predict structures with RoseTTAFold3 and filter by pLDDT

Limitations

  • Backbone only: RFD3 generates backbones without sequences; requires separate sequence design
  • No side chains: Output structures lack side chain atoms until sequence is assigned
  • Computational cost: High-quality designs require significant GPU time
  • Success rate varies: Not all designs will fold correctly; generate multiple candidates
  • Limited small molecule support: While RFD3 has all-atom capabilities, complex ligand interactions may require specialized tools