ProteinIQ

PDB to SDF Converter

Convert PDB protein structure files to SDF format for cheminformatics and molecular analysis. Upload a PDB file or paste the structure data below.

What is PDB to SDF conversion?

PDB (Protein Data Bank) and SDF (Structure Data File) are both molecular file formats, but they serve different purposes. PDB files store 3D coordinates of macromolecules with biological context, while SDF files are designed for cheminformatics with rich chemical property storage.

The SDF format was developed by MDL (now part of BIOVIA) and has become the standard exchange format in computational chemistry. It wraps the older MOL file format while adding support for multiple molecules and associated data fields.

Converting PDB to SDF is essential when preparing structures for virtual screening, QSAR modeling, or chemical database applications that require the flexible data storage capabilities of SDF.

How does the conversion work?

The converter transforms PDB's fixed-width column format into SDF's sectioned structure while preserving atomic coordinates and connectivity.

Atom block transformation

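PDB ATOM records contain coordinates, atom names, residue information, and chain identifiers in fixed columns. These are mapped to SDF's atom block, which stores only coordinates and element symbols for each atom in a more compact format. Residue and chain information has no slot in the atom block, so it can only be carried over as data fields.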
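The column mapping can be illustrated with a minimal sketch. It assumes the column layout of the wwPDB format (element symbol in columns 77-78) and the V2000 atom line layout; the function names `parse_atom_record` and `to_molfile_atom_line` are hypothetical and do not describe this converter's internals.

```python
def parse_atom_record(line):
    """Slice one PDB ATOM/HETATM line by its fixed columns.

    Python slices are 0-based, so PDB columns 31-38 become line[30:38].
    Assumes the element symbol is present in columns 77-78.
    """
    return {
        "serial": int(line[6:11]),
        "name": line[12:16].strip(),
        "res_name": line[17:20].strip(),
        "chain": line[21],
        "res_seq": int(line[22:26]),
        "x": float(line[30:38]),
        "y": float(line[38:46]),
        "z": float(line[46:54]),
        "element": line[76:78].strip().capitalize(),
    }


def to_molfile_atom_line(atom):
    """Format a V2000 atom line: three 10.4f coordinates, the element
    symbol, and twelve integer fields that are left at zero here."""
    return (
        f"{atom['x']:10.4f}{atom['y']:10.4f}{atom['z']:10.4f} "
        f"{atom['element']:<3} 0" + "  0" * 11
    )


# Example record assembled field by field so the column widths stay visible.
example_line = (
    "ATOM  "      # columns 1-6: record name
    + "    1"     # columns 7-11: serial number
    + " "         # column 12: blank
    + " N  "      # columns 13-16: atom name
    + " "         # column 17: alternate location
    + "MET"       # columns 18-20: residue name
    + " "         # column 21: blank
    + "A"         # column 22: chain ID
    + "   1"      # columns 23-26: residue number
    + "    "      # columns 27-30: insertion code and padding
    + "  27.340"  # columns 31-38: x
    + "  24.430"  # columns 39-46: y
    + "   2.614"  # columns 47-54: z
    + "  1.00"    # columns 55-60: occupancy
    + "  9.67"    # columns 61-66: B-factor
    + " " * 10    # columns 67-76: blank
    + " N"        # columns 77-78: element symbol
)

atom = parse_atom_record(example_line)
print(to_molfile_atom_line(atom))
```

The residue name, chain ID, and residue number are parsed but not written to the atom line, which is why they have to be kept elsewhere if they matter downstream.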

Bond generation

SDF files require explicit bond information in the connection table. When "Generate bonds" is enabled, the converter infers chemical bonds from interatomic distances using standard covalent radii.

PDB CONECT records, if present, are also incorporated to ensure accurate bond representation for non-standard connectivity.
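Distance-based bond inference can be sketched as follows. This is an illustration under stated assumptions, not this converter's actual algorithm: the covalent radii are approximate published values in ångströms, the 0.45 Å tolerance and the 0.4 Å minimum distance are assumed thresholds, and `infer_bonds` is a hypothetical name.

```python
import math
from itertools import combinations

# Approximate single-bond covalent radii in angstroms (assumed values).
COVALENT_RADII = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66, "P": 1.07, "S": 1.05}
DEFAULT_RADIUS = 0.76  # fallback for elements missing from the table (assumption)
TOLERANCE = 0.45       # slack added to the sum of radii (assumption)
MIN_DISTANCE = 0.4     # ignore overlapping or duplicated atoms (assumption)


def infer_bonds(atoms, conect_pairs=()):
    """Return sorted 0-based index pairs of bonded atoms.

    atoms: list of (element, x, y, z) tuples.
    conect_pairs: index pairs taken from CONECT records; they are always
    kept, whatever the interatomic distance.
    """
    bonds = {tuple(sorted(pair)) for pair in conect_pairs}
    for (i, a), (j, b) in combinations(enumerate(atoms), 2):
        cutoff = (
            COVALENT_RADII.get(a[0], DEFAULT_RADIUS)
            + COVALENT_RADII.get(b[0], DEFAULT_RADIUS)
            + TOLERANCE
        )
        distance = math.dist(a[1:], b[1:])
        if MIN_DISTANCE < distance <= cutoff:
            bonds.add((i, j))
    return sorted(bonds)


# A water molecule: both O-H pairs fall under the cutoff, the H-H pair does not.
water = [("O", 0.0, 0.0, 0.0), ("H", 0.96, 0.0, 0.0), ("H", -0.24, 0.93, 0.0)]
print(infer_bonds(water))
```

The all-pairs loop is quadratic in the number of atoms. That is fine for ligands, but a full protein would normally call for a spatial grid or neighbor list.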

Property blocks

SDF files support arbitrary data fields after the "M  END" line that closes each molecule's connection table. When metadata is included, information such as chain IDs, residue names, and original PDB header data is preserved as searchable properties.
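As a sketch of what a record with data fields looks like, the helper below appends properties to a MOL block and closes the record with the `$$$$` delimiter. The field names `PDB_CHAINS` and `PDB_RESNAME` are hypothetical, since the names this converter actually emits are not documented here, and `format_sdf_record` is an illustrative function.

```python
def format_sdf_record(molblock, properties):
    """Append SDF data fields to a MOL block that ends with 'M  END'.

    Each field is a '> <NAME>' header, the value, and a blank line.
    The record is terminated by '$$$$'. Values must not contain blank
    lines, because a blank line ends the field.
    """
    lines = [molblock.rstrip("\n")]
    for name, value in properties.items():
        lines.append(f"> <{name}>")
        lines.append(str(value))
        lines.append("")
    lines.append("$$$$")
    return "\n".join(lines) + "\n"


# A placeholder single-atom MOL block, used only for demonstration.
molblock = "\n".join([
    "ligand",
    "  hypothetical",
    "",
    "  1  0  0  0  0  0  0  0  0  0999 V2000",
    "    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "M  END",
])

record = format_sdf_record(molblock, {"PDB_CHAINS": "A,B", "PDB_RESNAME": "LIG"})
print(record)
```

Because every record ends with `$$$$`, several records can be concatenated into one multi-molecule SDF file.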

Input requirements

Structure selection

  • Chain selection: Choose which chains to include. All chains exports the complete structure, First chain only extracts just chain A (or the first defined chain), and Specific chains lets you define exactly which chains to export.
  • Chain IDs: Comma-separated list of chain identifiers when using specific chain selection (e.g., A,B,C).
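The three chain selection modes can be approximated with a short filter on the chain ID column (column 22 of ATOM and HETATM records). This is a sketch only: the mode names `"all"`, `"first"`, and `"specific"` and the function `select_chains` are assumptions made for illustration, not the tool's API.

```python
def select_chains(pdb_text, mode="all", chain_ids=""):
    """Return the ATOM/HETATM lines that match the chosen chain mode.

    mode: "all", "first", or "specific" (hypothetical names).
    chain_ids: comma-separated chain identifiers, e.g. "A,B,C";
    only used when mode is "specific".
    """
    atom_lines = [
        line for line in pdb_text.splitlines()
        if line.startswith(("ATOM", "HETATM"))
    ]
    if mode == "all":
        return atom_lines
    if mode == "first":
        # The first chain is whichever chain ID appears on the first atom line.
        wanted = {atom_lines[0][21:22]} if atom_lines else set()
    elif mode == "specific":
        wanted = {cid.strip() for cid in chain_ids.split(",") if cid.strip()}
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    # line[21:22] is column 22; the slice avoids an IndexError on short lines.
    return [line for line in atom_lines if line[21:22] in wanted]
```

For example, `select_chains(text, mode="specific", chain_ids="A, C")` keeps only chains A and C, and whitespace around the identifiers is ignored.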

Conversion options

  • Generate bonds: Automatically detect and create chemical bonds based on interatomic distances. Disable this only if your PDB already has complete CONECT records that you want to preserve exclusively.
  • Include metadata properties: Add PDB header information as SDF data fields. Useful for maintaining provenance in chemical databases.

When to use SDF format

SDF is the preferred format for:

  • Virtual screening libraries — Tools like Gypsum-DL and VSFlow expect SDF input for ligand preparation and screening
  • Chemical databases — SDF's property fields make it ideal for storing compounds with associated bioactivity data
  • QSAR modeling — Structure-activity datasets are typically distributed in SDF format
  • Cheminformatics pipelines — RDKit, Open Babel, and most toolkits have excellent SDF support

Use PDB to MOL2 instead if you need Tripos atom typing for molecular dynamics or SYBYL-based workflows.

Other conversion methods

Using Open Babel

# Basic conversion
obabel input.pdb -O output.sdf

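# Add hydrogens (PDB files already carry 3D coordinates; --gen3d would regenerate and overwrite them)
obabel input.pdb -O output.sdf -h

# Extract a specific chain (filter the PDB by chain ID in column 22 before converting)
awk '/^(ATOM|HETATM)/ && substr($0, 22, 1) == "A"' input.pdb > chain_A.pdb
obabel chain_A.pdb -O output.sdf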

Using RDKit (Python)

from rdkit import Chem

# Read PDB and write SDF (MolFromPDBFile returns None if parsing fails)
mol = Chem.MolFromPDBFile('input.pdb')
if mol is not None:
    writer = Chem.SDWriter('output.sdf')
    writer.write(mol)
    writer.close()

Using PyMOL

# In PyMOL command line
load input.pdb
save output.sdf