PDB to SDF Converter

Convert PDB structures to SDF with covalent-radii bond perception and chain selection.

How to convert PDB to SDF

Paste or upload a PDB file, or fetch one from the RCSB by ID, and the converter rewrites it into SDF records in the browser. Each chain becomes one V2000 record, bonds are perceived from the coordinates, and metadata such as the chain ID and a molecular formula is written as data fields after M  END. The result downloads as a .sdf file. Nothing is uploaded to a server.

A small ligand converts as follows. Input:

HETATM    1  C1  UNL A   1       0.000   0.000   0.000  1.00  0.00           C
HETATM    2  O1  UNL A   1       1.220   0.000   0.000  1.00  0.00           O
HETATM    3  H1  UNL A   1      -0.540   0.930   0.000  1.00  0.00           H
HETATM    4  H2  UNL A   1      -0.540  -0.930   0.000  1.00  0.00           H
END

Output with default settings:

Chain_A
  ProteinIQ PDB to SDF

  4  3  0  0  0  0  0  0  0  0999 V2000
    0.0000    0.0000    0.0000 C    0  0  0  0  0  0  0  0  0  0  0  0
    1.2200    0.0000    0.0000 O    0  0  0  0  0  0  0  0  0  0  0  0
   -0.5400    0.9300    0.0000 H    0  0  0  0  0  0  0  0  0  0  0  0
   -0.5400   -0.9300    0.0000 H    0  0  0  0  0  0  0  0  0  0  0  0
  1  4  1  0  0  0  0
  1  3  1  0  0  0  0
  1  2  1  0  0  0  0
M  END
>  <NAME>
Chain_A

>  <CHAIN>
A

>  <ATOMS>
4

>  <BONDS>
3

>  <FORMULA>
COH2

$$$$

Coordinates and atom order are preserved. The element comes from columns 77 to 78 when present and is otherwise read from the atom name. Bonds are perceived from interatomic distances using element covalent radii, so the C, O, and H atoms here are connected without any CONECT records in the input. Every bond is written as a single bond, which is why the carbonyl above appears as 1 2 1 rather than a double bond.
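
The distance test described above can be sketched in a few lines of Python. This is a minimal illustration, not the converter's own code: the radii are commonly tabulated covalent radii and may differ from the table the tool uses, the 0.45 Å tolerance is the value quoted for Generate bonds, and the 0.4 Å lower bound that skips overlapping atoms is an assumption.

```python
from itertools import combinations
from math import dist

# Commonly tabulated covalent radii in angstroms (assumed, illustrative values).
COVALENT_RADII = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66, "S": 1.05}
TOLERANCE = 0.45      # angstroms, as quoted for the Generate bonds setting
MIN_DISTANCE = 0.4    # assumption: ignore overlapping or duplicated atoms


def perceive_bonds(atoms):
    """Return 1-based (i, j) pairs for atoms close enough to count as bonded.

    atoms is a list of (element, (x, y, z)) tuples in file order.
    """
    bonds = []
    indexed = enumerate(atoms, start=1)
    for (i, (el_i, xyz_i)), (j, (el_j, xyz_j)) in combinations(indexed, 2):
        cutoff = COVALENT_RADII[el_i] + COVALENT_RADII[el_j] + TOLERANCE
        if MIN_DISTANCE < dist(xyz_i, xyz_j) <= cutoff:
            bonds.append((i, j))
    return bonds


# The four atoms from the ligand example above.
ligand = [
    ("C", (0.000, 0.000, 0.000)),
    ("O", (1.220, 0.000, 0.000)),
    ("H", (-0.540, 0.930, 0.000)),
    ("H", (-0.540, -0.930, 0.000)),
]
bonds = perceive_bonds(ligand)
print(bonds)
```

For these coordinates the sketch finds the same three bonds as the sample output (1-2, 1-3, and 1-4), although it lists them in a different order. The all-pairs loop is fine for a ligand; a whole protein chain would normally use a spatial grid or neighbor search instead.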

A structure with several chains keeps each chain as its own record. With default settings, two chains produce two molecules:

Chain_A
  ProteinIQ PDB to SDF

  2  1  0  0  0  0  0  0  0  0999 V2000
    0.0000    0.0000    0.0000 N    0  0  0  0  0  0  0  0  0  0  0  0
    1.4500    0.0000    0.0000 C    0  0  0  0  0  0  0  0  0  0  0  0
  1  2  1  0  0  0  0
M  END
>  <CHAIN>
A
...
$$$$
Chain_B
  ProteinIQ PDB to SDF

  2  1  0  0  0  0  0  0  0  0999 V2000
   10.0000    0.0000    0.0000 N    0  0  0  0  0  0  0  0  0  0  0  0
   11.4500    0.0000    0.0000 C    0  0  0  0  0  0  0  0  0  0  0  0
  1  2  1  0  0  0  0
M  END
>  <CHAIN>
B
...
$$$$

Bonds are only formed inside a record, so connectivity that spans two chains, such as an inter-chain disulfide, is not represented when chains are written separately. To pull a single residue or ligand out of a structure, turn on Split by residue, or restrict the output with Chain selection.
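
The record grouping described above can be pictured as a bucketing step that runs before bond perception. The sketch below is an assumption about how such grouping might look, not the converter's implementation; the function names, the dictionary keys, and the LIG residue name are hypothetical.

```python
def parse_chain_ids(text):
    """Split a comma-separated Chain IDs value such as "A,B,C" into a list."""
    return [part.strip() for part in text.split(",") if part.strip()]


def group_atoms(atoms, chains=None, split_by_residue=False):
    """Bucket atoms into records, keeping file order within each record.

    chains=None keeps every chain; otherwise only the listed chain IDs are kept.
    """
    groups = {}
    for atom in atoms:
        if chains is not None and atom["chain"] not in chains:
            continue
        key = (atom["chain"],)
        if split_by_residue:
            key += (atom["resname"], atom["resseq"])
        groups.setdefault(key, []).append(atom)
    return groups


atoms = [
    {"chain": "A", "resname": "ALA", "resseq": 1, "name": "N"},
    {"chain": "A", "resname": "ALA", "resseq": 1, "name": "CA"},
    {"chain": "A", "resname": "LIG", "resseq": 201, "name": "C1"},
    {"chain": "B", "resname": "GLY", "resseq": 1, "name": "N"},
]

per_chain = group_atoms(atoms)
per_residue = group_atoms(atoms, chains=parse_chain_ids("A"), split_by_residue=True)
print(list(per_chain))
print(list(per_residue))
```

Because bonds are perceived inside each bucket, any bond between atoms that land in different buckets is lost, which matches the inter-chain disulfide caveat above.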

Input

| Format | Description |
| --- | --- |
| .pdb, .ent | Protein Data Bank coordinate file with ATOM and HETATM records. |

Paste content directly, upload a file up to 50 MB, or fetch a structure from the RCSB by its four-character PDB ID. Multi-model files such as NMR ensembles are converted using the first model, with a notice when extra models are dropped.
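
Keeping only the first model can be sketched as follows. This is a hedged illustration that relies on the MODEL records of the PDB format; the function name and the wording of the notice are hypothetical and are not taken from the converter.

```python
def first_model_atoms(pdb_text):
    """Return (ATOM/HETATM lines of the first model, number of models seen).

    Files without MODEL records are treated as a single model.
    """
    atom_lines = []
    model_count = 0
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record == "MODEL":
            model_count += 1
        elif record in ("ATOM", "HETATM") and model_count <= 1:
            atom_lines.append(line)
    return atom_lines, max(model_count, 1)


ensemble = "\n".join([
    "MODEL        1",
    "ATOM      1  N   ALA A   1       0.000   0.000   0.000",
    "ENDMDL",
    "MODEL        2",
    "ATOM      1  N   ALA A   1       0.100   0.000   0.000",
    "ENDMDL",
    "END",
])

atom_lines, model_count = first_model_atoms(ensemble)
if model_count > 1:
    print(f"Using model 1; {model_count - 1} extra model(s) dropped.")
```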

Settings

| Setting | Description |
| --- | --- |
| Chain selection | Which chains to export. All chains keeps every chain, First chain only keeps the first chain in the file, and Specific chains keeps only the IDs you list. Default: All chains. |
| Chain IDs | Comma-separated chain IDs (for example A,B,C), used when chain selection is Specific chains. Default: A,B. |
| Split by residue | Write one record per residue instead of one per chain. Useful for extracting a single ligand or residue. Default: off. |
| Generate bonds | Perceive bonds from interatomic distances using element covalent radii with a 0.45 Å tolerance. Turn off for an atoms-only record with an empty connection table. Default: on. |
| Include metadata properties | Write the chain ID, atom and bond counts, and molecular formula as SDF data fields. Default: on. |
| Include hydrogens | Keep hydrogen atoms. Turn off for a heavy-atom-only file. Default: on. |
| Include water | Keep water molecules such as HOH and WAT. Default: off. |
| Include 3D coordinates | Write atomic coordinates. Turn off to zero them for consumers that need only the connection table. Default: on. |

Results

| File | Contents |
| --- | --- |
| name.sdf | The converted structure: one V2000 record per chain (or per residue), each with its atom block, a perceived bond block, and optional data fields after M END. |
| conversion-summary.txt | The chains found and exported, atom and bond counts, hydrogens and waters removed, model count, and a note on how bonds were perceived. |

The SDF file is named after the input file (an uploaded 1abc.pdb produces 1abc.sdf), and the summary is written as conversion-summary.txt. Both appear in the Files tab to copy or download. Warnings, such as the multi-model notice or a request for chains that were not found, are also shown above the output.

What are PDB and SDF

PDB (Protein Data Bank) is a fixed-width coordinate format built for macromolecules. It records atom positions along with biological context such as chains, residues, and CONECT connectivity, but it has no field for bond order.
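
Because the format is fixed-width, a record is read by column position rather than by splitting on whitespace. The sketch below slices one HETATM line using the standard PDB column layout and takes the element from columns 77 to 78, falling back to the atom name when that field is blank. The fallback shown (first letter of the atom name) is a simplifying assumption and can misread names such as CA, which may mean an alpha carbon or calcium.

```python
def parse_atom_line(line):
    """Parse one ATOM/HETATM record (0-based slices of the 1-based PDB columns)."""
    name = line[12:16]                                  # columns 13-16
    element = line[76:78].strip().capitalize()          # columns 77-78
    if not element:
        # Assumed fallback: first alphabetic character of the atom name.
        letters = [ch for ch in name if ch.isalpha()]
        element = letters[0].upper() if letters else ""
    return {
        "record": line[0:6].strip(),                    # columns 1-6
        "serial": int(line[6:11]),                      # columns 7-11
        "name": name.strip(),
        "resname": line[17:20].strip(),                 # columns 18-20
        "chain": line[21],                              # column 22
        "resseq": int(line[22:26]),                     # columns 23-26
        "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
        "element": element,
    }


line = "HETATM    2  O1  UNL A   1       1.220   0.000   0.000  1.00  0.00           O"
atom = parse_atom_line(line)
truncated = parse_atom_line(line[:54])  # no element column, so the name is used
print(atom["element"], atom["xyz"], truncated["element"])
```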

SDF (Structure-Data File), developed by MDL, stores small-molecule atoms, explicit bonds with bond orders, and tagged data fields. It is the standard exchange format for chemical libraries and the output of most cheminformatics toolkits.
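
The tagged data fields are easy to read back without a cheminformatics toolkit. The sketch below is a minimal reader, assuming the layout shown in the sample output: a `>  <TAG>` header line, one or more value lines, and a blank line ending each field. The function name is hypothetical, and the atom and bond blocks are left out of the sample string for brevity.

```python
def read_sdf_data_fields(sdf_text):
    """Return one {tag: value} dict per record in an SDF string."""
    records = []
    for block in sdf_text.split("$$$$"):
        if "M  END" not in block:
            continue  # skip the empty tail after the last delimiter
        fields = {}
        tag = None
        for line in block.split("M  END", 1)[1].splitlines():
            if line.startswith(">") and "<" in line:
                # The tag name sits between angle brackets, e.g. >  <CHAIN>
                tag = line[line.index("<") + 1 : line.rindex(">")]
                fields[tag] = ""
            elif not line.strip():
                tag = None  # a blank line ends the current field
            elif tag is not None:
                fields[tag] = f"{fields[tag]}\n{line}" if fields[tag] else line
        records.append(fields)
    return records


sample = "\n".join([
    "Chain_A",
    "  ProteinIQ PDB to SDF",
    "",
    "  (counts line, atom block, and bond block omitted in this sketch)",
    "M  END",
    ">  <NAME>",
    "Chain_A",
    "",
    ">  <CHAIN>",
    "A",
    "",
    ">  <FORMULA>",
    "COH2",
    "",
    "$$$$",
    "",
])

fields = read_sdf_data_fields(sample)
print(fields)
```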

Because SDF is designed for small molecules, a whole protein chain becomes a single record whose bonds are inferred from geometry rather than read from chemistry. The conversion preserves coordinates and records the perceived connectivity, and is common for preparing a bound ligand for virtual screening, building a chemical database, or feeding a structure into a QSAR pipeline. Bond orders are not assigned, so keep the original chemistry source if a downstream tool needs them.

Which converter should I use

| Goal | Tool |
| --- | --- |
| Turn an SDF back into a PDB structure | SDF to PDB |
| Convert a PDB to Tripos MOL2 with atom typing | PDB to MOL2 |
| Keep full macromolecular metadata as mmCIF | PDB to CIF |
| Extract the protein sequence from a structure | PDB to FASTA |

FAQ

Does PDB to SDF assign bond orders?

No. Bonds are perceived from interatomic distances and are all written as single bonds. The geometry and which atoms are connected are preserved, but double, triple, and aromatic orders are not. A carbonyl carbon and oxygen, for example, are written as a single bond.

Why does my whole protein become one molecule?

SDF groups atoms into records, and by default each chain is one record. SDF is meant for small molecules, so a full protein converts as a single distance-bonded molecule. To work with one piece at a time, turn on Split by residue to get a record per residue, or use Chain selection to export a specific chain.

How do I extract just the ligand from a structure?

Turn on Split by residue so each residue, including the HETATM ligand, becomes its own SDF record, then keep the record you need. If the ligand sits on its own chain, you can also select that chain with Chain selection.

Are PDB CONECT records used?

No. Connectivity comes entirely from the coordinates. Bonds are perceived with element covalent radii, which is why disulfides and other longer bonds inside a record are still detected.

How are hydrogens and water handled?

Hydrogens are kept by default and can be removed by turning off Include hydrogens. Water molecules such as HOH and WAT are removed by default and can be kept by turning on Include water. The summary file reports how many of each were dropped.
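
The two filters can be sketched as a single pass over the parsed atoms. This is an illustration under assumptions: the function and key names are hypothetical, only HOH and WAT are treated as water, and the sketch counts dropped water atoms, whereas the converter's summary may count water molecules.

```python
WATER_RESNAMES = {"HOH", "WAT"}


def filter_atoms(atoms, include_hydrogens=True, include_water=False):
    """Apply the two Include settings and count what was dropped."""
    kept = []
    dropped = {"hydrogens": 0, "water_atoms": 0}
    for atom in atoms:
        if not include_water and atom["resname"] in WATER_RESNAMES:
            dropped["water_atoms"] += 1
            continue
        if not include_hydrogens and atom["element"] == "H":
            dropped["hydrogens"] += 1
            continue
        kept.append(atom)
    return kept, dropped


atoms = [
    {"resname": "UNL", "element": "C"},
    {"resname": "UNL", "element": "O"},
    {"resname": "UNL", "element": "H"},
    {"resname": "UNL", "element": "H"},
    {"resname": "HOH", "element": "O"},
]

default_kept, default_dropped = filter_atoms(atoms)
heavy_kept, heavy_dropped = filter_atoms(atoms, include_hydrogens=False)
print(default_dropped, heavy_dropped)
```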