PDB to SDF converter
Convert PDB protein structure files to SDF format for cheminformatics and molecular analysis. Upload a PDB file or paste the structure data below.
Input (PDB format)
Output (SDF format)
What is PDB to SDF Conversion?
The PDB to SDF converter transforms Protein Data Bank files into Structure Data Format files, bridging the gap between structural biology and cheminformatics. This conversion enables researchers to take protein structures from crystallographic databases and prepare them for chemical analysis, drug discovery workflows, and molecular modeling applications.
PDB format structure
PDB files use a rigid, fixed-width column format that dates back to the 1970s, when 80-column punch cards set the line length. Each line is one record whose type (HEADER, ATOM, HETATM, and so on) is given by its first six columns, and atomic coordinates are stored at precisely defined character positions.
Here's what a typical PDB file looks like:
HEADER    HYDROLASE/HYDROLASE INHIBITOR           20-MAY-97   1HTM
ATOM      1  N   ALA A   1      20.154  16.967  27.462  1.00 11.99           N
ATOM      2  CA  ALA A   1      19.030  16.097  26.849  1.00 11.85           C
ATOM      3  C   ALA A   1      17.666  16.817  26.697  1.00 11.89           C
ATOM      4  O   ALA A   1      17.657  18.049  26.530  1.00 12.15           O
ATOM      5  CB  ALA A   1      19.464  15.573  25.481  1.00 11.77           C
HETATM 1234  C1  ATP A 501      15.678  14.234  23.456  1.00 20.15           C
HETATM 1235  N1  ATP A 501      16.789  13.567  24.123  1.00 19.87           N
END
Each atom occupies one line, with coordinates at fixed positions: X in columns 31-38, Y in columns 39-46, and Z in columns 47-54. The other fields are equally fixed: atom names in columns 13-16, residue names in columns 18-20, and the chain identifier in column 22.
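Because every field lives at a known column, an ATOM or HETATM record can be read with plain string slicing. The following is a minimal sketch, not the converter's actual code: the function name `parse_atom_line` and the dictionary keys are illustrative choices, and only the fields discussed above (plus the serial number and element symbol) are extracted.

```python
def parse_atom_line(line: str) -> dict:
    """Slice one ATOM/HETATM record by its fixed column positions."""
    line = line.ljust(80)  # pad short lines so every slice is valid
    return {
        "record": line[0:6].strip(),      # columns 1-6
        "serial": int(line[6:11]),        # columns 7-11
        "name": line[12:16].strip(),      # columns 13-16
        "res_name": line[17:20].strip(),  # columns 18-20
        "chain": line[21],                # column 22
        "res_seq": int(line[22:26]),      # columns 23-26
        "x": float(line[30:38]),          # columns 31-38
        "y": float(line[38:46]),          # columns 39-46
        "z": float(line[46:54]),          # columns 47-54
        "element": line[76:78].strip(),   # columns 77-78
    }


# The CA atom from the example, assembled field by field so the
# column layout stays visible (adjacent string literals are joined).
example = (
    "ATOM  " "    2" " " " CA " " " "ALA" " " "A" "   1" "    "
    "  19.030" "  16.097" "  26.849" "  1.00" " 11.85" + " " * 10 + " C"
)

atom = parse_atom_line(example)
print(atom["name"], atom["res_name"], atom["chain"], atom["x"], atom["y"], atom["z"])
```

Splitting on whitespace would work for this line too, but it breaks on real files where adjacent fields touch (for example, a five-digit serial number or a large negative coordinate), which is why column slicing is the safer habit.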
SDF format structure
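SDF files take a different approach, organizing each molecule as a sequence of blocks rather than as independent one-line records. Each molecule begins with a three-line header block, followed by a connection table (a counts line, an atom block, and a bond block, closed by M END), then optional property data items, and ends with a $$$$ delimiter. The connection table in the common V2000 variant is itself fixed-width, but the property data is free-form text.
Here's the alanine residue from the PDB example above represented in SDF format: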
Alanine residue from 1HTM
Converted from PDB format

  5  4  0  0  0  0  0  0  0  0999 V2000
   20.1540   16.9670   27.4620 N   0  0  0  0  0  0  0  0  0  0  0  0
   19.0300   16.0970   26.8490 C   0  0  0  0  0  0  0  0  0  0  0  0
   17.6660   16.8170   26.6970 C   0  0  0  0  0  0  0  0  0  0  0  0
   17.6570   18.0490   26.5300 O   0  0  0  0  0  0  0  0  0  0  0  0
   19.4640   15.5730   25.4810 C   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  1  0  0  0  0
  2  3  1  0  0  0  0
  2  5  1  0  0  0  0
  3  4  2  0  0  0  0
M  END
> <CHAIN_ID>
A

> <RESIDUE_NAME>
ALA

> <RESIDUE_NUMBER>
1

$$$$
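The SDF format defines molecular connectivity explicitly in a bond block: each line after the atom block lists two atom numbers followed by the bond order (for example, 3 4 2 is the C=O double bond). PDB leaves most bonds implicit: standard residues are bonded according to residue templates, CONECT records may list bonds for heteroatoms, and anything else has to be inferred from interatomic distances.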
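To make the block layout concrete, here is a small sketch that writes one V2000 record from a list of atoms and an explicit bond list. The function name `to_sdf_record` and its argument shapes are assumptions made for illustration; the program and comment header lines are simply left blank, and stereo, charge, and isotope fields are all written as zero.

```python
def to_sdf_record(title, atoms, bonds, props):
    """Build one V2000 SDF record as a string.

    atoms: list of (element, x, y, z)
    bonds: list of (first_atom, second_atom, order), atoms numbered from 1
    props: dict mapping data-item name to value
    """
    lines = [title, "", ""]  # header block; program and comment lines left blank
    # Counts line: atom count, bond count, eight unused fields, "999 V2000"
    lines.append(f"{len(atoms):3d}{len(bonds):3d}" + "  0" * 8 + "999 V2000")
    for element, x, y, z in atoms:
        # 10.4 coordinates, symbol left-justified in 3 columns, 12 zeroed fields
        lines.append(f"{x:10.4f}{y:10.4f}{z:10.4f} {element:<3s} 0" + "  0" * 11)
    for i, j, order in bonds:
        lines.append(f"{i:3d}{j:3d}{order:3d}" + "  0" * 4)
    lines.append("M  END")
    for name, value in props.items():
        lines += [f"> <{name}>", str(value), ""]  # blank line ends each data item
    lines.append("$$$$")
    return "\n".join(lines) + "\n"


alanine_atoms = [
    ("N", 20.154, 16.967, 27.462),
    ("C", 19.030, 16.097, 26.849),
    ("C", 17.666, 16.817, 26.697),
    ("O", 17.657, 18.049, 26.530),
    ("C", 19.464, 15.573, 25.481),
]
alanine_bonds = [(1, 2, 1), (2, 3, 1), (2, 5, 1), (3, 4, 2)]

record = to_sdf_record(
    "Alanine residue from 1HTM",
    alanine_atoms,
    alanine_bonds,
    {"CHAIN_ID": "A", "RESIDUE_NAME": "ALA", "RESIDUE_NUMBER": 1},
)
print(record)
```

The three-digit count fields cap a V2000 record at 999 atoms and 999 bonds, so a sketch like this suits residues and ligands rather than whole proteins.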
Format comparison in practice
The fundamental difference becomes apparent when examining how each format handles the same chemical information. PDB treats molecules as collections of atoms with implicit relationships, while SDF treats them as explicit chemical graphs with defined connectivity.
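Turning PDB's implicit relationships into an explicit graph usually involves some form of distance test. The sketch below shows one common heuristic, assuming approximate covalent radii and a 0.4 Å tolerance (both illustrative choices, not values taken from this converter): two atoms are treated as bonded when their separation is no more than the sum of their radii plus the tolerance.

```python
import math
from itertools import combinations

# Approximate single-bond covalent radii in angstroms (illustrative values).
COVALENT_RADIUS = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66, "S": 1.05}
TOLERANCE = 0.4  # assumed slack added to the radius sum


def infer_bonds(atoms, tolerance=TOLERANCE):
    """Return 1-based (i, j) pairs for atoms close enough to be bonded.

    atoms: list of (element, x, y, z)
    """
    bonds = []
    for (i, a), (j, b) in combinations(enumerate(atoms, start=1), 2):
        limit = COVALENT_RADIUS[a[0]] + COVALENT_RADIUS[b[0]] + tolerance
        if math.dist(a[1:], b[1:]) <= limit:
            bonds.append((i, j))
    return bonds


alanine = [
    ("N", 20.154, 16.967, 27.462),
    ("C", 19.030, 16.097, 26.849),  # CA
    ("C", 17.666, 16.817, 26.697),  # C
    ("O", 17.657, 18.049, 26.530),  # O
    ("C", 19.464, 15.573, 25.481),  # CB
]

print(infer_bonds(alanine))
```

For the alanine coordinates this recovers the same four atom pairs as the bond block in the SDF example. It does not recover bond orders: distance alone cannot reliably tell the C=O double bond from a single bond, which is where residue templates come in. The all-pairs loop is also quadratic, so a real converter would likely use a spatial grid or neighbor search for full proteins.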
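Data Organization: PDB organizes information hierarchically by biological relevance (chain → residue → atom), whereas SDF organizes each record by chemical connectivity (atoms → bonds → properties). SDF has no chain or residue level, so a converter must either write the whole structure as one record or split it into separate records (for example, one per residue or ligand, as in the alanine example above) and carry the chain and residue labels as data items.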
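One way to picture the reorganization is to group PDB atom lines by their chain, residue number, and residue name, with each group becoming a candidate SDF record. This is a sketch under that assumption; `pdb_line` is a hypothetical helper that only exists to lay out sample data in the right columns, and `group_by_residue` is an illustrative name.

```python
def pdb_line(record, serial, name, res, chain, seq, x, y, z):
    """Hypothetical helper: lay out sample data in PDB columns.

    Assumes atom names of up to three characters (starting in column 14).
    """
    return (f"{record:<6s}{serial:5d}  {name:<3s} {res:>3s} {chain}{seq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


def group_by_residue(pdb_lines):
    """Map (chain, residue number, residue name) to a list of atom names."""
    groups = {}
    for line in pdb_lines:
        if line[0:6].strip() not in ("ATOM", "HETATM"):
            continue  # skip HEADER, END, and other record types
        key = (line[21], int(line[22:26]), line[17:20].strip())
        groups.setdefault(key, []).append(line[12:16].strip())
    return groups


sample = [
    "HEADER",
    pdb_line("ATOM", 1, "N", "ALA", "A", 1, 20.154, 16.967, 27.462),
    pdb_line("ATOM", 2, "CA", "ALA", "A", 1, 19.030, 16.097, 26.849),
    pdb_line("HETATM", 1234, "C1", "ATP", "A", 501, 15.678, 14.234, 23.456),
    "END",
]

groups = group_by_residue(sample)
for key, names in groups.items():
    print(key, names)
```

Each key here corresponds to the CHAIN_ID, RESIDUE_NUMBER, and RESIDUE_NAME data items in the SDF example, which is how the hierarchy survives once the atoms are flattened into separate records.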
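Coordinate Precision: PDB stores coordinates as fixed-precision text with three decimal places (0.001 Å). The V2000 atom block used in most SDF files is also fixed-width but writes four decimal places, while the newer V3000 variant is free-format. Conversion therefore pads the PDB values (20.154 becomes 20.1540) without adding accuracy, which matters for downstream computational chemistry calculations that expect coordinates finer than 0.001 Å.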
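The effect is easy to see by round-tripping a single value. In this sketch the starting number is a made-up coordinate with more digits than PDB can hold, and `pdb_field_to_sdf_field` is an illustrative name; the four-decimal output width follows the V2000 atom lines in the SDF example above.

```python
def pdb_field_to_sdf_field(text: str) -> str:
    """Re-express a PDB 8.3 coordinate field as a V2000 10.4 field."""
    return f"{float(text):10.4f}"


original = 19.03049                 # hypothetical value before PDB rounding
pdb_field = f"{original:8.3f}"      # what a PDB file can store
sdf_field = pdb_field_to_sdf_field(pdb_field)

print(repr(pdb_field), repr(sdf_field))
print(abs(float(sdf_field) - original))  # digits lost in the PDB step stay lost
```

The extra decimal place in the SDF field is always a padded zero when the source is a PDB file, so any rounding introduced when the PDB file was written carries through unchanged.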
Chemical Context: PDB preserves biological context like secondary structure and experimental conditions, while SDF focuses on chemical properties and molecular descriptors. Converting from PDB to SDF necessarily loses some biological context while gaining chemical specificity.