What is Open Babel? Chemical file conversion explained

Dr. Matic Broz·June 9, 2026

What is Open Babel?

Open Babel is an open-source cheminformatics toolkit for reading, writing, converting, searching, and manipulating chemical data. It is widely used when molecular information needs to move between file formats used by drug discovery, computational chemistry, structural biology, crystallography, molecular docking, and materials science.

The project was described in a 2011 Journal of Cheminformatics paper by O'Boyle and colleagues as an open chemical toolbox designed to handle the "many languages" of chemical data. That need remains practical: SMILES stores connectivity compactly, PDB stores atom coordinates for macromolecular structures, SDF can carry molecule properties, MOL2 carries atom types and charges, and PDBQT is used by AutoDock-family docking tools.

Open Babel is not limited to format conversion. It also includes utilities for hydrogen handling, 2D and 3D coordinate generation, conformer searching, molecular fingerprints, SMARTS searching, filtering, descriptors, and force-field operations. The official project distributes Open Babel under the GNU General Public License; GitHub lists the repository license as GPL-2.0.

| Format family | Examples | Common use |
| --- | --- | --- |
| Line notations | SMILES, canonical SMILES, InChI, InChIKey | Compact molecule identifiers, database exchange, deduplication |
| Small-molecule structure files | MOL, SDF, MOL2, CML, XYZ | Ligand files, property tables, molecular modeling |
| Biomolecular and crystallographic files | PDB, mmCIF, CIF, PQR | Protein structures, crystallography, electrostatics preparation |
| Docking and simulation formats | PDBQT, Amber Prep, GRO, GROMOS96, LAMMPS data | Docking input, molecular dynamics, force-field workflows |
| Computational chemistry files | Gaussian, GAMESS, MOPAC, ORCA, NWChem | Quantum chemistry input and output conversion |

The main limitation is that conversion cannot create information the input never contained. A SMILES string has no measured 3D coordinates, a basic PDB file may not encode bond order, and some formats cannot preserve SDF property fields or partial charges. Open Babel can infer or generate some missing details, but those choices affect downstream chemistry.
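One way to make this limitation concrete is a pre-flight check that compares what the source and target formats can carry. The sketch below is illustrative only: the `FORMAT_CARRIES` table and the `conversion_report` function are hypothetical names, and the capability flags are a deliberately coarse, hand-written simplification of the examples in the paragraph above, not data taken from Open Babel.

```python
# Coarse, hand-written capability flags (an assumption for illustration,
# not something queried from Open Babel). Real formats have more nuance.
FORMAT_CARRIES = {
    "smi":  {"coordinates": False, "bond_orders": True,  "properties": False, "partial_charges": False},
    "sdf":  {"coordinates": True,  "bond_orders": True,  "properties": True,  "partial_charges": False},
    "mol2": {"coordinates": True,  "bond_orders": True,  "properties": False, "partial_charges": True},
    "pdb":  {"coordinates": True,  "bond_orders": False, "properties": False, "partial_charges": False},
}


def conversion_report(src, dst):
    """Return (lost, absent_in_source) for a src -> dst conversion.

    lost: data the source carries but the target cannot keep.
    absent_in_source: data the target can hold but the source never had,
    so it must be inferred, generated, or left empty.
    """
    source, target = FORMAT_CARRIES[src], FORMAT_CARRIES[dst]
    lost = sorted(k for k in source if source[k] and not target[k])
    absent = sorted(k for k in source if target[k] and not source[k])
    return lost, absent


if __name__ == "__main__":
    for src, dst in [("smi", "sdf"), ("sdf", "pdb"), ("pdb", "mol2")]:
        lost, absent = conversion_report(src, dst)
        print(f"{src} -> {dst}: lost={lost} absent_in_source={absent}")
```

Anything listed under `absent_in_source` is a place where a converter has to make a chemical decision for you, which is why those steps deserve a manual check.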

How does Open Babel work?

Open Babel reads a chemical file through a format-specific reader, maps the result into an internal molecular representation, applies any requested operations, and writes the molecule through a format-specific writer. Its format system is plugin-based, which lets different communities contribute support for common cheminformatics formats, computational chemistry package formats, crystallographic files, docking files, drawing formats, and utility outputs.
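The reader, internal representation, operations, writer flow can be illustrated with a toy registry. This is a minimal sketch of the general pattern, not Open Babel's actual code or API: the `Molecule` class, the `READERS` and `WRITERS` registries, the `center` operation, and the simple CSV output format are all hypothetical and exist only for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Tuple

Atom = Tuple[str, float, float, float]  # element symbol, x, y, z


@dataclass
class Molecule:
    """Toy internal representation shared by every reader and writer."""
    title: str = ""
    atoms: List[Atom] = field(default_factory=list)


# Format plugins register themselves under a short identifier.
READERS: Dict[str, Callable[[str], Molecule]] = {}
WRITERS: Dict[str, Callable[[Molecule], str]] = {}


def reader(fmt: str):
    def register(func):
        READERS[fmt] = func
        return func
    return register


def writer(fmt: str):
    def register(func):
        WRITERS[fmt] = func
        return func
    return register


@reader("xyz")
def read_xyz(text: str) -> Molecule:
    """Parse a single XYZ record: atom count, title line, then atom lines."""
    lines = text.strip().splitlines()
    count = int(lines[0])
    atoms = []
    for line in lines[2:2 + count]:
        element, x, y, z = line.split()[:4]
        atoms.append((element, float(x), float(y), float(z)))
    return Molecule(title=lines[1].strip(), atoms=atoms)


@writer("csv")
def write_csv(mol: Molecule) -> str:
    """Write atoms as comma-separated rows (a made-up output format)."""
    rows = ["element,x,y,z"]
    rows += [f"{e},{x:.3f},{y:.3f},{z:.3f}" for e, x, y, z in mol.atoms]
    return "\n".join(rows)


def center(mol: Molecule) -> Molecule:
    """Example operation: translate the centroid to the origin."""
    n = len(mol.atoms)
    cx = sum(a[1] for a in mol.atoms) / n
    cy = sum(a[2] for a in mol.atoms) / n
    cz = sum(a[3] for a in mol.atoms) / n
    shifted = [(e, x - cx, y - cy, z - cz) for e, x, y, z in mol.atoms]
    return Molecule(mol.title, shifted)


def convert(text: str, in_fmt: str, out_fmt: str,
            operations: Iterable[Callable[[Molecule], Molecule]] = ()) -> str:
    """Read, apply operations in order, then write."""
    mol = READERS[in_fmt](text)
    for operation in operations:
        mol = operation(mol)
    return WRITERS[out_fmt](mol)


if __name__ == "__main__":
    h2 = "2\nhydrogen\nH 0.0 0.0 0.0\nH 0.0 0.0 0.74\n"
    print(convert(h2, "xyz", "csv", [center]))
```

Because every reader targets the same internal object, adding a new format in this pattern means writing one reader or writer rather than a converter for every pair of formats.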

The obabel command line tool uses short format identifiers such as smi, sdf, mol2, pdb, pdbqt, and cif. A typical conversion specifies an input format, input file, output format, and output file. When only filenames are provided, Open Babel can guess file types from extensions; when text is provided without a specified input format, SMILES is the default assumption.
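The invocation patterns described above can be sketched as a small Python helper that assembles an `obabel` argument list. The helper name `build_obabel_command` and the file names are hypothetical; the `-i<format>`, `-o<format>`, `-O <file>`, and `-:"text"` forms follow the obabel documentation as I understand it, so check them against `obabel -H` for your installed version.

```python
import shutil
import subprocess
from typing import List, Optional


def build_obabel_command(output_path: str,
                         input_path: Optional[str] = None,
                         text: Optional[str] = None,
                         in_format: Optional[str] = None,
                         out_format: Optional[str] = None) -> List[str]:
    """Assemble an obabel argument list (hypothetical helper).

    Exactly one of input_path or text must be given. When formats are
    omitted, obabel is left to guess from file extensions, and pasted
    text is treated as SMILES by default.
    """
    if (input_path is None) == (text is None):
        raise ValueError("give exactly one of input_path or text")
    cmd = ["obabel"]
    if in_format:
        cmd.append(f"-i{in_format}")      # e.g. -ismi
    if text is not None:
        cmd.append(f"-:{text}")           # inline molecule text
    else:
        cmd.append(input_path)
    if out_format:
        cmd.append(f"-o{out_format}")     # e.g. -osdf
    cmd += ["-O", output_path]
    return cmd


if __name__ == "__main__":
    cmd = build_obabel_command("ethanol.sdf", text="CCO", out_format="sdf")
    print(" ".join(cmd))
    # Only run the conversion when obabel is actually installed.
    if shutil.which("obabel"):
        subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with SMILES characters such as parentheses, `#`, and `=`.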

Operations such as coordinate generation and hydrogen handling run before the output writer. The official documentation notes that Open Babel does not generate coordinates unless asked, so converting SMILES to SDF without requesting 2D or 3D coordinate generation (--gen2d or --gen3d) can produce an SDF record without useful coordinates. Hydrogen handling is similarly explicit: -h adds hydrogens, -d deletes hydrogens, and -p with a pH value adds hydrogens according to Open Babel's protonation transforms.
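As a rough sketch, these operations map onto a handful of obabel flags that can be appended to a conversion command. The function name `operation_flags` and the file names are hypothetical, and rejecting a pH value combined with an explicit hydrogen mode is this sketch's own design choice, not a documented obabel rule.

```python
from typing import List, Optional


def operation_flags(coordinates: Optional[str] = None,
                    hydrogens: Optional[str] = None,
                    ph: Optional[float] = None) -> List[str]:
    """Translate high-level choices into obabel option flags.

    coordinates: None, "2d", or "3d"
    hydrogens:   None, "add", or "delete"
    ph:          None, or a pH value for pH-dependent protonation
    """
    flags: List[str] = []

    if coordinates == "2d":
        flags.append("--gen2d")
    elif coordinates == "3d":
        flags.append("--gen3d")
    elif coordinates is not None:
        raise ValueError("coordinates must be None, '2d', or '3d'")

    # Sketch-level choice: pH correction already adds hydrogens, so it is
    # treated as exclusive with an explicit add/delete request.
    if ph is not None and hydrogens is not None:
        raise ValueError("use either hydrogens or ph, not both")

    if hydrogens == "add":
        flags.append("-h")
    elif hydrogens == "delete":
        flags.append("-d")
    elif hydrogens is not None:
        raise ValueError("hydrogens must be None, 'add', or 'delete'")

    if ph is not None:
        flags += ["-p", str(ph)]

    return flags


if __name__ == "__main__":
    # Hypothetical file names; SMILES in, 3D SDF with explicit hydrogens out.
    cmd = ["obabel", "ligands.smi", "-O", "ligands.sdf"]
    cmd += operation_flags(coordinates="3d", hydrogens="add")
    print(" ".join(cmd))
```

Generated 3D coordinates are a starting geometry from a force-field-based procedure, so they are worth inspecting before docking or simulation.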

How to use Open Babel online

Open Babel is available as a command line program, programming library, graphical interface, and source distribution from the official project. As of June 2026, the GitHub repository lists Open Babel 3.2.0 as the latest release. The official software is free and open source under GPL-2.0, but local use requires installation and basic familiarity with command line syntax or the GUI.

For online conversion, ProteinIQ's Open Babel tool runs in the browser. It accepts pasted molecular text or an uploaded file, auto-detects common input formats from filename extensions and content signatures, and writes the selected output format, chosen from a searchable catalog of Open Babel formats. Optional controls include 2D or 3D coordinate generation, hydrogen addition or deletion, and pH correction.
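Detection from extensions and content signatures could work roughly as follows. This is a guess at the general technique, not ProteinIQ's or Open Babel's implementation: the `detect_format` function, the extension table, and the order of the checks are assumptions, although the markers themselves (`$$$$`, `M  END`, `@<TRIPOS>MOLECULE`, `InChI=`, `ATOM`/`HETATM` records, `data_` blocks) are standard features of those formats.

```python
import os
from typing import Optional

# Hypothetical extension table covering a few common formats.
EXTENSIONS = {
    ".smi": "smi", ".smiles": "smi", ".inchi": "inchi",
    ".mol": "mol", ".sdf": "sdf", ".mol2": "mol2",
    ".pdb": "pdb", ".pdbqt": "pdbqt", ".cif": "cif", ".xyz": "xyz",
}


def detect_format(filename: Optional[str] = None, content: str = "") -> Optional[str]:
    """Guess a format identifier from the file name, then from the content."""
    if filename:
        extension = os.path.splitext(filename)[1].lower()
        if extension in EXTENSIONS:
            return EXTENSIONS[extension]

    text = content.strip()
    if text.startswith("InChI="):
        return "inchi"
    if "@<TRIPOS>MOLECULE" in content:
        return "mol2"
    if "$$$$" in content:            # SDF record terminator
        return "sdf"
    if "M  END" in content:          # end of a MOL connection table
        return "mol"
    if any(line.startswith(("ATOM  ", "HETATM")) for line in content.splitlines()):
        return "pdb"                 # PDBQT looks similar; content alone is ambiguous
    if text.startswith("data_"):
        return "cif"
    if text and "\n" not in text:
        return "smi"                 # single line of text: fall back to SMILES
    return None


if __name__ == "__main__":
    print(detect_format("ligand.MOL2"))
    print(detect_format(content="CCO"))
    print(detect_format(content="InChI=1S/CH4/h1H4"))
```

Content heuristics like these can misfire, for example on PDB versus PDBQT, so an explicit input format is the safer choice when the source is known.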

| Access option | Best for | Notes |
| --- | --- | --- |
| ProteinIQ Open Babel | Quick browser conversion and downloadable logs | No local installation. Supports the format catalog and common conversion options available in the browser runtime. |
| obabel command line | Batch conversion, scripting, filtering, advanced format options | Best for repeated workflows and large local datasets. |
| Open Babel GUI | Local desktop conversion without writing commands | Useful for occasional conversion on a workstation. |
| Open Babel library and bindings | Building cheminformatics software | Used by developers integrating format conversion or chemistry operations into applications. |

Open Babel alternatives

RDKit is the closest open-source alternative for many cheminformatics workflows. It is stronger as a programmable toolkit for molecule standardization, descriptors, fingerprints, substructure search, conformer generation, and machine learning pipelines. Open Babel remains especially useful for broad file-format interconversion, including many computational chemistry and docking formats.

The Chemistry Development Kit (CDK) is another open-source chemistry toolkit, mainly used in Java ecosystems. Commercial platforms such as OpenEye and Schrödinger provide polished chemistry toolkits and enterprise support, but access and redistribution terms differ from Open Babel's GPL license.

ProteinIQ includes focused conversion tools when a narrow conversion should have tighter defaults than a general Open Babel run. SMILES to SDF generates 3D SDF files from single molecules or batches, SDF to SMILES extracts SMILES from SDF records, and PDB to SDF converts Protein Data Bank files with chain, residue, hydrogen, water, and metadata controls. For docking after conversion, AutoDock Vina predicts ligand binding poses and affinities instead of only changing file format.

Matic Broz

Founder & CEO, ProteinIQ

Matic founded ProteinIQ to make computational biology accessible to every researcher. He builds code-free bioinformatics tools used by thousands of scientists worldwide for protein analysis, molecular docking, and drug discovery.