Use RFdiffusion online

What is RFdiffusion?

RFdiffusion is a groundbreaking protein design tool developed at the University of Washington's Baker Lab that uses diffusion models to generate protein structures with atomic precision. Published in Nature (2023), RFdiffusion represents a major advance in computational protein design, enabling the creation of proteins with specific functions, binding properties, and structural features that don't exist in nature.

Unlike traditional structure prediction tools like AlphaFold that predict structures from sequences, RFdiffusion works in reverse—it designs entirely new protein backbones from scratch or scaffolds specific structural motifs to create functional proteins.

Key capabilities

RFdiffusion supports five design modes, each tailored for specific protein engineering tasks:

1. Binder design

Design proteins that bind to specific target proteins with high affinity. RFdiffusion can create binders for therapeutic targets, biosensors, or protein-protein interaction modulators. This mode is particularly powerful for designing peptide binders, protein therapeutics, and diagnostic tools.

2. Motif scaffolding

Embed functional motifs (enzyme active sites, binding sites) into novel protein scaffolds. This enables engineering proteins with specific catalytic activities or binding properties while maintaining structural stability and solubility.

3. Partial diffusion

Redesign specific regions of existing proteins while preserving the rest of the structure. Useful for protein optimization, stabilization, or introducing new functionalities into known scaffolds.

4. Unconditional protein generation

Generate entirely new protein structures from scratch without constraints. Create novel folds, design symmetric oligomers (cyclic, dihedral, tetrahedral, octahedral, icosahedral), or explore uncharted regions of protein structure space.

5. Custom design

Advanced mode for expert users who want full control over the diffusion process using custom contig specifications. Enables complex multi-domain designs, flexible motif positioning, and sophisticated sequence inpainting.

How does RFdiffusion work?

RFdiffusion applies the same diffusion model technology behind AI image generators to protein structure design. Instead of pixels, it operates on protein backbone residue frames: the position of each residue (its Cα coordinate) and the orientation of its backbone atoms.

Diffusion on SE(3) manifold

The model defines a forward diffusion process that gradually adds noise to protein backbone coordinates and orientations, transforming any structure into random noise. The reverse process is learned by a neural network that denoises random structures into valid protein backbones through iterative refinement steps.
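As a rough illustration of the forward process, the sketch below noises only the residue positions of a toy Cα trace. The cosine schedule, the unit-variance Gaussian, and the names `alpha_bar` and `noise_coords` are assumptions made for this example; the actual model also noises residue orientations, which is omitted here.

```python
import math
import random

def alpha_bar(t: int, T: int) -> float:
    """Fraction of the original signal kept at step t (cosine schedule, assumed)."""
    return math.cos(0.5 * math.pi * t / T) ** 2

def noise_coords(coords, t, T, rng):
    """Forward-noise a list of (x, y, z) positions to step t of T."""
    keep = math.sqrt(alpha_bar(t, T))          # weight on the original coordinates
    spread = math.sqrt(1.0 - alpha_bar(t, T))  # weight on the Gaussian noise
    return [
        tuple(keep * c + spread * rng.gauss(0.0, 1.0) for c in xyz)
        for xyz in coords
    ]

rng = random.Random(0)
backbone = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]  # toy CA trace
slightly_noised = noise_coords(backbone, t=1, T=50, rng=rng)
fully_noised = noise_coords(backbone, t=50, T=50, rng=rng)  # almost pure noise
```

At `t=0` the structure is returned unchanged, and at `t=T` essentially none of the original signal remains, which is the starting point for the learned reverse process.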

RoseTTAFold architecture

RFdiffusion builds on RoseTTAFold, combining 1D sequence processing, 2D distance map prediction, and 3D structure refinement in a three-track neural network. The model uses SE(3)-equivariant layers that respect the rotational and translational symmetries of 3D space, ensuring physically realistic outputs.

Self-conditioning

A key innovation is self-conditioning: at each denoising step, the model conditions on its own prediction of the final structure from the previous step, progressively refining the structure with greater certainty. This substantially improves sample quality. Although the model was trained with 200 diffusion steps, inference works well with about 50.
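The control flow of self-conditioning can be sketched with a toy sampler. Everything here is hypothetical: `toy_denoiser` stands in for the neural network, and the update rule is a simple deterministic interpolation rather than the real reverse diffusion step. The point is only that each call receives the previous step's prediction.

```python
def toy_denoiser(x_t, t, prev_x0):
    """Stand-in for the network: returns a guess at the final values.

    Hypothetical rule: shrink the noisy input, then average with the
    previous prediction when one is available (the self-conditioning input).
    """
    guess = [0.5 * v for v in x_t]
    if prev_x0 is None:
        return guess
    return [0.5 * (g + p) for g, p in zip(guess, prev_x0)]

def sample(x_T, steps, denoiser):
    """Run the reverse loop, feeding each prediction into the next call."""
    x_t, prev_x0 = list(x_T), None
    used_self_conditioning = []
    for t in range(steps, 0, -1):
        x0_pred = denoiser(x_t, t, prev_x0)
        used_self_conditioning.append(prev_x0 is not None)
        # Move a fraction 1/t of the way toward the predicted final state.
        x_t = [x + (p - x) / t for x, p in zip(x_t, x0_pred)]
        prev_x0 = x0_pred  # becomes the conditioning input at the next step
    return x_t, used_self_conditioning

final, used = sample([4.0, -2.0], steps=5, denoiser=toy_denoiser)
```

Only the first step runs without a previous prediction; every later step is conditioned on the one before it.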

Guided diffusion with potentials

Optional guiding potentials can bias the diffusion process toward desired properties—compact structures, high contact density, or specific binding interfaces. These potentials act as soft constraints that nudge the generative process without breaking the diffusion framework.
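One way to picture a compactness potential is as a small gradient step on the squared radius of gyration, applied between denoising steps. The sketch below is an assumption-laden illustration (the function names and the `weight` parameter are invented for this example), not the potential as implemented in RFdiffusion.

```python
import math

def centroid(coords):
    """Mean position of a list of (x, y, z) points."""
    n = len(coords)
    return tuple(sum(p[i] for p in coords) / n for i in range(3))

def radius_of_gyration(coords):
    """Root-mean-square distance of the points from their centroid."""
    c = centroid(coords)
    total = sum(sum((p[i] - c[i]) ** 2 for i in range(3)) for p in coords)
    return math.sqrt(total / len(coords))

def rog_guidance_step(coords, weight=0.1):
    """Nudge every point toward the centroid.

    The gradient of Rg^2 with respect to a point is proportional to
    (point - centroid), so this is one descent step with the constant
    factor folded into `weight`.
    """
    c = centroid(coords)
    return [tuple(p[i] - weight * (p[i] - c[i]) for i in range(3)) for p in coords]

points = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 10.0, 0.0), (0.0, 0.0, 10.0)]
before = radius_of_gyration(points)
after = radius_of_gyration(rog_guidance_step(points, weight=0.1))
```

With `weight=0.1` the radius of gyration shrinks by 10% per step, which shows why a small weight acts as a soft constraint rather than forcing a collapse.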

Design modes explained

Binder design mode

Creates proteins that bind to specified target regions on a protein of interest.

Key parameters:

Target chain: Which chain to design a binder for
Binding pocket: Crop region around the desired binding site (residue range)
Binder length: Minimum and maximum size of the designed binder
Hotspots: Specific residues that should be involved in binding (biases diffusion)

Tips for success:

Cropping the target around the binding site significantly speeds up computation
Define hotspots to avoid designing binders for exposed hydrophobic patches (artifacts of cropping)
Start with 10-20 residue binders; longer binders are harder to express and less stable
The cyclic chains option enables macrocyclic peptide binders with enhanced stability
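For readers who also run the open-source RFdiffusion command line, the parameters above roughly correspond to a contig string and a hotspot list. The helper below is a sketch: `binder_overrides` is a hypothetical name, the override keys `contigmap.contigs` and `ppi.hotspot_res` follow the public RFdiffusion README, and how the ProteinIQ form maps onto them internally is an assumption.

```python
def binder_overrides(target_chain, crop_start, crop_end, min_len, max_len, hotspots):
    """Build command-line style overrides for a binder design run."""
    if crop_start > crop_end or min_len > max_len:
        raise ValueError("ranges must be given as (low, high)")
    # Cropped target, a chain break ("/0" followed by a space), then the new binder.
    contig = f"{target_chain}{crop_start}-{crop_end}/0 {min_len}-{max_len}"
    hotspot_list = ",".join(f"{target_chain}{res}" for res in hotspots)
    return [
        f"contigmap.contigs=[{contig}]",
        f"ppi.hotspot_res=[{hotspot_list}]",
    ]

overrides = binder_overrides("A", 1, 150, 10, 20, hotspots=[59, 83, 91])
# overrides[0] == "contigmap.contigs=[A1-150/0 10-20]"
# overrides[1] == "ppi.hotspot_res=[A59,A83,A91]"
```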

Motif scaffolding mode

Builds protein scaffolds around functional motifs (catalytic sites, binding loops).

Key parameters:

Motif chain: Chain containing the motif to scaffold
N/C-terminal extensions: How much to extend the motif on each end
Scaffold range: Which residues of the input to preserve as scaffold

Use cases:

Transplanting enzyme active sites into novel scaffolds
Stabilizing functional loops by embedding them in rigid protein frameworks
Creating de novo enzymes with specified catalytic geometries
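In contig terms, motif scaffolding amounts to flanking a fixed residue range with designed segments of variable length. The sketch below assumes the N/C-terminal extension settings map onto those flanking ranges; `motif_contig` is a hypothetical helper, and the residue numbers are placeholders.

```python
def motif_contig(chain, motif_start, motif_end, n_ext, c_ext):
    """Build a contig that scaffolds one motif between two designed segments.

    n_ext and c_ext are (min, max) lengths for the new residues placed
    before and after the motif.
    """
    if motif_start > motif_end:
        raise ValueError("motif_start must not exceed motif_end")
    n_min, n_max = n_ext
    c_min, c_max = c_ext
    return f"{n_min}-{n_max}/{chain}{motif_start}-{motif_end}/{c_min}-{c_max}"

contig = motif_contig("A", 163, 181, n_ext=(10, 40), c_ext=(10, 40))
# contig == "10-40/A163-181/10-40"
```

Each run samples lengths from the flanking ranges, so repeated designs explore scaffolds of different sizes around the same fixed motif.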

Partial diffusion mode

Redesigns specific regions while keeping the rest of the structure fixed.

Key parameters:

Partial diffusion chain: Which chain to modify
Diffused residue range: Which residues to redesign (rest are fixed)
Partial timesteps: Amount of noise to add (lower = more conservative, higher = more creative)

Applications:

Protein stabilization by redesigning flexible loops
Interface engineering without disrupting the core fold
Introducing functional mutations in defined regions

Unconditional generation mode

Generates entirely new protein structures with specified length and symmetry.

Key parameters:

Protein length: Size of the monomeric unit (10-500 residues recommended)
Symmetry: None (monomer), cyclic, dihedral, tetrahedral, octahedral, icosahedral
Order: Number of copies for cyclic/dihedral symmetries
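To make the role of the order parameter concrete, the sketch below builds a cyclic (Cn) assembly by rotating one subunit about the z-axis in steps of 360°/n. It illustrates the geometry only; the function name and the choice of z as the symmetry axis are assumptions, and this is not how RFdiffusion enforces symmetry during diffusion.

```python
import math

def cyclic_copies(subunit, order):
    """Return `order` rotated copies of a subunit around the z-axis."""
    copies = []
    for k in range(order):
        angle = 2.0 * math.pi * k / order
        cos_a, sin_a = math.cos(angle), math.sin(angle)
        copies.append([
            (cos_a * x - sin_a * y, sin_a * x + cos_a * y, z)
            for x, y, z in subunit
        ])
    return copies

subunit = [(10.0, 0.0, 0.0), (10.0, 0.0, 3.8)]  # toy two-residue subunit
c3 = cyclic_copies(subunit, order=3)            # three copies, 120 degrees apart
```

The total assembly size is the monomer length multiplied by the number of copies, which is worth keeping in mind when choosing both parameters.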

Achievements:

Novel protein folds not found in nature
Symmetric nanocages for drug delivery
Artificial enzymes with designed active sites
Structural proteins with enhanced stability

Custom design mode

For advanced users familiar with RFdiffusion's contig syntax.

Contig format examples:

150-150: Generate 150-residue protein
A10-25/30-40: Use chain A residues 10-25, then design 30-40 new residues
B1-100/0 100-100: Chain B residues 1-100, a chain break (/0 followed by a space), then a new 100-residue chain

Enables complex designs like multi-domain proteins, flexible motif positioning, and partial sequence inpainting.
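The three example contigs can be read mechanically, as in the minimal parser below. It is a sketch that covers only the forms shown above (fixed segments with a chain letter, new segments as length ranges, and `/0` chain breaks); the full contig syntax has more options, and `parse_contig` is a hypothetical helper, not part of RFdiffusion.

```python
import re

# Optional chain letter, then "start-end".
SEGMENT = re.compile(r"^(?P<chain>[A-Za-z])?(?P<start>\d+)-(?P<end>\d+)$")

def parse_contig(contig):
    """Split a contig string into chains, each a list of segments.

    Each segment is (kind, chain, low, high), where kind is "fixed" for
    residues taken from the input structure and "new" for designed residues.
    """
    chains = []
    for chain_text in contig.split():        # whitespace separates chains
        segments = []
        for token in chain_text.split("/"):
            if token in ("", "0"):           # "/0" marks the chain break
                continue
            match = SEGMENT.match(token)
            if match is None:
                raise ValueError(f"unrecognized contig token: {token!r}")
            kind = "fixed" if match["chain"] else "new"
            segments.append((kind, match["chain"], int(match["start"]), int(match["end"])))
        chains.append(segments)
    return chains

parsed = parse_contig("B1-100/0 100-100")
# parsed == [[("fixed", "B", 1, 100)], [("new", None, 100, 100)]]
```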

Input requirements

Required inputs

For binder/scaffolding/partial diffusion: PDB file with chain IDs (upload or PDB ID)
For unconditional generation: No input required (de novo design)
For custom mode: Depends on contig specification

PDB preparation

Ensure chain IDs are properly assigned
Remove water molecules unless critical for motif function
Resolve missing residues, or note the gaps so that diffusion can fill them in
Use PDB Fixer for automated preparation
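Two of the checks above (chain IDs and water removal) are simple enough to do by hand. The sketch below reads the fixed-column PDB format directly, using the residue name in columns 18-20 and the chain ID in column 22. It does not call PDB Fixer, it does not repair missing residues, and the function names are made up for this example.

```python
def strip_waters(pdb_text):
    """Drop ATOM/HETATM records whose residue name is HOH."""
    kept = []
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record in ("ATOM", "HETATM") and line[17:20].strip() == "HOH":
            continue
        kept.append(line)
    return "\n".join(kept)

def chain_ids(pdb_text):
    """List chain IDs in order of first appearance."""
    ids = []
    for line in pdb_text.splitlines():
        if line[:6].strip() in ("ATOM", "HETATM") and len(line) > 21:
            chain = line[21]
            if chain not in ids:
                ids.append(chain)
    return ids

SAMPLE = "\n".join([
    "ATOM      1  CA  ALA A   1",
    "ATOM      2  CA  GLY B   1",
    "HETATM    3  O   HOH A 101",
])
cleaned = strip_waters(SAMPLE)
```

A blank entry in the `chain_ids` result indicates records without a chain ID, which should be assigned before upload.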

Understanding the results

RFdiffusion outputs designed protein backbones as PDB files. Each design is scored based on confidence metrics.

Interpreting scores

Higher scores indicate higher confidence in the design
Scores are model-predicted estimates of designability and stability
Top-ranked designs aren't always the best—examine multiple outputs

Next steps after RFdiffusion

RFdiffusion only generates backbones. For functional proteins, you typically:

1. Sequence design: Use ProteinMPNN to design amino acid sequences for the backbone
2. Structure prediction: Validate with AlphaFold2 to ensure the sequence folds correctly
3. Experimental validation: Express, purify, and characterize the designed protein

ProteinIQ can automate steps 1-2 with the "Backbone only" toggle disabled (default).
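After sequence design and structure prediction, designs are commonly filtered by how confidently and how closely the predicted structure matches the designed backbone. The sketch below shows such a filter; the `DesignCheck` fields, the threshold values, and the example numbers are illustrative assumptions, not ProteinIQ's scoring or output.

```python
from dataclasses import dataclass

@dataclass
class DesignCheck:
    name: str
    plddt: float          # prediction confidence on a 0-100 scale
    backbone_rmsd: float  # Å between designed backbone and predicted structure

def passes_filter(check, min_plddt=80.0, max_rmsd=2.0):
    """Keep designs that refold confidently and close to the intended backbone.

    The default thresholds are assumed for illustration; tune them per project.
    """
    return check.plddt >= min_plddt and check.backbone_rmsd <= max_rmsd

# Made-up values, used only to show the filter in action.
candidates = [
    DesignCheck("design_0", plddt=91.2, backbone_rmsd=0.9),
    DesignCheck("design_1", plddt=62.5, backbone_rmsd=1.4),
    DesignCheck("design_2", plddt=88.0, backbone_rmsd=3.7),
]
shortlist = [c.name for c in candidates if passes_filter(c)]
```

Only designs that clear both thresholds move on to experimental validation, which keeps the number of constructs to express manageable.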

Best practices

Timesteps

Default (50 steps): Good balance of quality and speed
Lower (20-30): Faster but lower quality—acceptable for rapid prototyping
Higher (100-200): Marginally better quality but 2-4× slower—rarely necessary

Binder design

Start with 10-20 residue peptide binders before attempting protein binders
Crop the target protein around the binding site for 5-10× speedup
Use hotspots to guide binders toward specific epitopes
Consider cyclic peptides for enhanced stability and binding affinity

Motif scaffolding

Keep motifs compact (fewer than 20 residues) for better success rates
Substrate contacts potential helps maintain binding site geometry
Validate scaffold stability with AlphaFold2 before synthesis

Symmetry design

Start with low symmetry orders (C2-C3, D2-D3) before attempting complex geometries
Higher symmetries (icosahedral) require more timesteps (100+)
Oligomer contacts potential stabilizes multimeric interfaces

Guiding potentials

Use sparingly—start without potentials, then add if needed
Monomer ROG (radius of gyration) compacts structures (useful for small domains)
Contact potentials increase stability but may reduce structural diversity
Not all potentials work with all modes—check tooltips for compatibility

Limitations

No sequence design: RFdiffusion only generates backbones; use ProteinMPNN for sequences
Rigid backbone assumption: Doesn't model conformational flexibility during design
No small molecule support: Can't directly incorporate ligands (yet—use V2 for this)
Computational cost: Memory use grows roughly quadratically with protein length, so large designs (>300 residues) are expensive
Experimental success rate: Not all designs fold correctly—validation with AlphaFold2 recommended

Comparison: RFdiffusion vs AlphaFold

RFdiffusion (Design)

Creates new proteins: Generates structures that don't exist
Backward direction: Structure → sequence
Applications: Therapeutics, enzymes, materials
Output: Backbone coordinates (+ sequences via ProteinMPNN)

AlphaFold (Prediction)

Predicts existing proteins: Folds sequences from nature
Forward direction: Sequence → structure
Applications: Structural biology, function annotation
Output: Atomic coordinates with confidence scores

Think of RFdiffusion as the "protein designer" and AlphaFold as the "protein validator."

Cost

Using RFdiffusion through ProteinIQ costs 150 credits per design job, regardless of the number of designs generated.

References

Watson, J.L., Juergens, D., Bennett, N.R. et al. (2023). De novo design of protein structure and function with RFdiffusion. Nature 620, 1089–1100. https://doi.org/10.1038/s41586-023-06415-8
Yim, J., Trippe, B.L., De Bortoli, V. et al. (2023). SE(3) diffusion model with application to protein backbone generation. ICML 2023. https://arxiv.org/abs/2302.02277
Official GitHub: https://github.com/RosettaCommons/RFdiffusion
