List of biological molecules
A curated list of small biological molecules, metabolites, cofactors, amino acids, nucleobases, lipids, and related structures.
| Name | Formula | Biological class | SMILES | PubChem CID | ChEBI ID |
|---|---|---|---|---|---|
| Water | H2O | Solvent | O | 962 | CHEBI:15377 |
| D-Glucose | C6H12O6 | Carbohydrate | C([C@@H]1[C@H]([C@@H]([C@H](C(O1)O)O)O)O)O | 5793 | CHEBI:17234 |
| Glycine | C2H5NO2 | Amino acid | C(C(=O)O)N | 750 | CHEBI:15428 |
| Adenine | C5H5N5 | Nucleobase | C1=NC2=C(N1)N=CN=C2N | 190 | CHEBI:16708 |
| Adenosine triphosphate | C10H16N5O13P3 | Cofactor / nucleotide | C1=NC2=C(C(=N1)N)N=CN2C3C(C(C(O3)COP(=O)(O)OP(=O)(O)OP(=O)(O)O)O)O | 5957 | CHEBI:15422 |
| Cholesterol | C27H46O | Sterol lipid | CC(C)CCCC(C)C1CCC2C1(CCC3C2CCC4=CC(O)CCC34C)C | 5997 | CHEBI:16113 |
| Lactic acid | C3H6O3 | Metabolite | CC(C(=O)O)O | 612 | CHEBI:28358 |
| Citric acid | C6H8O7 | Central-carbon metabolite | C(C(=O)O)C(CC(=O)O)(C(=O)O)O | 311 | CHEBI:30769 |
| Palmitic acid | C16H32O2 | Fatty acid | CCCCCCCCCCCCCCCC(=O)O | 985 | CHEBI:15756 |
| Dopamine | C8H11NO2 | Neurotransmitter | C1=CC(=C(C=C1CCN)O)O | 681 | CHEBI:18243 |
| L-Alanine | C3H7NO2 | Amino acid | C[C@@H](C(=O)O)N | 5950 | CHEBI:16977 |
| L-Valine | C5H11NO2 | Amino acid | CC(C)[C@@H](C(=O)O)N | 6287 | CHEBI:16414 |
| L-Serine | C3H7NO3 | Amino acid | C([C@@H](C(=O)O)N)O | 5951 | CHEBI:17115 |
| Uracil | C4H4N2O2 | Nucleobase | C1=CNC(=O)NC1=O | 1174 | CHEBI:17568 |
| Cytosine | C4H5N3O | Nucleobase | C1=C(NC(=O)N=C1)N | 597 | CHEBI:16040 |
| Guanine | C5H5N5O | Nucleobase | C1=NC2=C(N1)C(=O)NC(=N2)N | 135398634 | CHEBI:16235 |
| Thymine | C5H6N2O2 | Nucleobase | CC1=CNC(=O)NC1=O | 1135 | CHEBI:17821 |
| Pyruvic acid | C3H4O3 | Central-carbon metabolite | CC(=O)C(=O)O | 1060 | CHEBI:15361 |
| Fructose | C6H12O6 | Carbohydrate | C1[C@H]([C@H]([C@@H](C(O1)(CO)O)O)O)O | 2723872 | - |
| Ribose | C5H10O5 | Carbohydrate | C1[C@H]([C@H]([C@H](C(O1)O)O)O)O | 10975657 | - |
| Galactose | C6H12O6 | Carbohydrate | C([C@@H]1[C@@H]([C@@H]([C@H](C(O1)O)O)O)O)O | 6036 | - |
| Sucrose | C12H22O11 | Disaccharide | C([C@@H]1[C@H]([C@@H]([C@H]([C@H](O1)O[C@]2([C@H]([C@@H]([C@H](O2)CO)O)O)CO)O)O)O)O | 5988 | - |
| Maltose | C12H22O11 | Disaccharide | C([C@@H]1[C@H]([C@@H]([C@H]([C@H](O1)O[C@@H]2[C@H](OC([C@@H]([C@H]2O)O)O)CO)O)O)O)O | 439186 | - |
| Lysine | C6H14N2O2 | Amino acid | C(CCN)C[C@@H](C(=O)O)N | 5962 | - |
| Glutamic acid | C5H9NO4 | Amino acid | C(CC(=O)O)[C@@H](C(=O)O)N | 33032 | - |
| Aspartic acid | C4H7NO4 | Amino acid | C([C@@H](C(=O)O)N)C(=O)O | 5960 | - |
| Phenylalanine | C9H11NO2 | Amino acid | C1=CC=C(C=C1)C[C@@H](C(=O)O)N | 6140 | - |
| Tryptophan | C11H12N2O2 | Amino acid | C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)N | 6305 | - |
| Tyrosine | C9H11NO3 | Amino acid | C1=CC(=CC=C1C[C@@H](C(=O)O)N)O | 6057 | - |
| Cysteine | C3H7NO2S | Amino acid | C([C@@H](C(=O)O)N)S | 5862 | - |
| Methionine | C5H11NO2S | Amino acid | CSCC[C@@H](C(=O)O)N | 6137 | - |
| Histidine | C6H9N3O2 | Amino acid | C1=C(NC=N1)C[C@@H](C(=O)O)N | 6274 | - |
| Proline | C5H9NO2 | Amino acid | C1C[C@H](NC1)C(=O)O | 145742 | - |
| Arginine | C6H14N4O2 | Amino acid | C(C[C@@H](C(=O)O)N)CN=C(N)N | 6322 | - |
| Asparagine | C4H8N2O3 | Amino acid | C([C@@H](C(=O)O)N)C(=O)N | 6267 | - |
| Glutamine | C5H10N2O3 | Amino acid | C(CC(=O)N)[C@@H](C(=O)O)N | 5961 | - |
| Creatine | C4H9N3O2 | Metabolite | CN(CC(=O)O)C(=N)N | 586 | - |
| Glutathione | C10H17N3O6S | Peptide antioxidant | C(CC(=O)N[C@@H](CS)C(=O)NCC(=O)O)[C@@H](C(=O)O)N | 124886 | - |
| Serotonin | C10H12N2O | Neurotransmitter | C1=CC2=C(C=C1O)C(=CN2)CCN | 5202 | - |
| Epinephrine | C9H13NO3 | Hormone / neurotransmitter | CNC[C@@H](C1=CC(=C(C=C1)O)O)O | 5816 | - |
| Melatonin | C13H16N2O2 | Hormone | CC(=O)NCCC1=CNC2=C1C=C(C=C2)OC | 896 | - |
| Ascorbic acid | C6H8O6 | Vitamin | C([C@@H]([C@@H]1C(=C(C(=O)O1)O)O)O)O | 54670067 | - |
| Riboflavin | C17H20N4O6 | Vitamin | CC1=CC2=C(C=C1C)N(C3=NC(=O)NC(=O)C3=N2)C[C@@H]([C@@H]([C@@H](CO)O)O)O | 493570 | - |
| Deoxyribose | C5H10O4 | Carbohydrate | C1[C@@H]([C@H](OC1O)CO)O | 9828112 | - |
| Lactose | C12H22O11 | Disaccharide | C([C@@H]1[C@@H]([C@@H]([C@H]([C@@H](O1)O[C@@H]2[C@H](OC([C@@H]([C@H]2O)O)O)CO)O)O)O)O | 440995 | - |
| Glycogen | C24H42O21 | Polysaccharide | C([C@@H]1[C@H]([C@@H]([C@H]([C@H](O1)OC[C@@H]2[C@H]([C@@H]([C@H]([C@H](O2)O[C@@H]3[C@H](O[C@@H]([C@@H]([C@H]3O)O)O)CO)O)O)O[C@@H]4[C@@H]([C@H]([C@@H]([C@H](O4)CO)O)O)O)O)O)O)O | 439177 | - |
| Leucine | C6H13NO2 | Amino acid | CC(C)C[C@@H](C(=O)O)N | 6106 | - |
| Isoleucine | C6H13NO2 | Amino acid | CC[C@H](C)[C@@H](C(=O)O)N | 6306 | - |
| Threonine | C4H9NO3 | Amino acid | C[C@H]([C@@H](C(=O)O)N)O | 6288 | - |
| Adenosine | C10H13N5O4 | Nucleoside | C1=NC(=C2C(=N1)N(C=N2)[C@H]3[C@@H]([C@@H]([C@H](O3)CO)O)O)N | 60961 | - |
| Guanosine | C10H13N5O5 | Nucleoside | C1=NC2=C(N1[C@H]3[C@@H]([C@@H]([C@H](O3)CO)O)O)N=C(NC2=O)N | 135398635 | - |
| Cytidine | C9H13N3O5 | Nucleoside | C1=CN(C(=O)N=C1N)[C@H]2[C@@H]([C@@H]([C@H](O2)CO)O)O | 6175 | - |
| Uridine | C9H12N2O6 | Nucleoside | C1=CN(C(=O)NC1=O)[C@H]2[C@@H]([C@@H]([C@H](O2)CO)O)O | 6029 | - |
| Adenosine diphosphate | C10H15N5O10P2 | Nucleotide | C1=NC(=C2C(=N1)N(C=N2)[C@H]3[C@@H]([C@@H]([C@H](O3)COP(=O)(O)OP(=O)(O)O)O)O)N | 6022 | - |
| Adenosine monophosphate | C10H14N5O7P | Nucleotide | C1=NC(=C2C(=N1)N(C=N2)[C@H]3[C@@H]([C@@H]([C@H](O3)COP(=O)(O)O)O)O)N | 6083 | - |
| Nicotinamide adenine dinucleotide | C21H27N7O14P2 | Cofactor | C1=CC(=C[N+](=C1)[C@H]2[C@@H]([C@@H]([C@H](O2)COP(=O)([O-])OP(=O)(O)OC[C@@H]3[C@H]([C@H]([C@@H](O3)N4C=NC5=C(N=CN=C54)N)O)O)O)O)C(=O)N | 5892 | - |
| Flavin adenine dinucleotide | C27H33N9O15P2 | Cofactor | CC1=CC2=C(C=C1C)N(C3=NC(=O)NC(=O)C3=N2)C[C@@H]([C@@H]([C@@H](COP(=O)(O)OP(=O)(O)OC[C@@H]4[C@H]([C@H]([C@@H](O4)N5C=NC6=C(N=CN=C65)N)O)O)O)O)O | 643975 | - |
| Coenzyme A | C21H36N7O16P3S | Cofactor | CC(C)(COP(=O)(O)OP(=O)(O)OC[C@@H]1[C@H]([C@H]([C@@H](O1)N2C=NC3=C(N=CN=C32)N)O)OP(=O)(O)O)[C@H](C(=O)NCCC(=O)NCCS)O | 87642 | - |
| Isocitric acid | C6H8O7 | Central-carbon metabolite | C(C(C(C(=O)O)O)C(=O)O)C(=O)O | 1198 | - |
| Alpha-ketoglutaric acid | C5H6O5 | Central-carbon metabolite | C(CC(=O)O)C(=O)C(=O)O | 51 | - |
| Succinic acid | C4H6O4 | Central-carbon metabolite | C(CC(=O)O)C(=O)O | 1110 | - |
| Fumaric acid | C4H4O4 | Central-carbon metabolite | C(=C/C(=O)O)\C(=O)O | 444972 | - |
| Malic acid | C4H6O5 | Central-carbon metabolite | C(C(C(=O)O)O)C(=O)O | 525 | - |
| Oxaloacetic acid | C4H4O5 | Central-carbon metabolite | C(C(=O)C(=O)O)C(=O)O | 970 | - |
| Acetyl-CoA | C23H38N7O17P3S | Central-carbon metabolite | CC(=O)SCCNC(=O)CCNC(=O)[C@@H](C(C)(C)COP(=O)(O)OP(=O)(O)OC[C@@H]1[C@H]([C@H]([C@@H](O1)N2C=NC3=C(N=CN=C32)N)O)OP(=O)(O)O)O | 444493 | - |
| Creatinine | C4H7N3O | Metabolite | CN1CC(=O)N=C1N | 588 | - |
| Urea | CH4N2O | Metabolite | C(=O)(N)N | 1176 | - |
| Uric acid | C5H4N4O3 | Metabolite | C12=C(NC(=O)N1)NC(=O)NC2=O | 1175 | - |
| Stearic acid | C18H36O2 | Fatty acid | CCCCCCCCCCCCCCCCCC(=O)O | 5281 | - |
| Oleic acid | C18H34O2 | Fatty acid | CCCCCCCC/C=C\CCCCCCCC(=O)O | 445639 | - |
| Linoleic acid | C18H32O2 | Fatty acid | CCCCC/C=C\C/C=C\CCCCCCCC(=O)O | 5280450 | - |
| Linolenic acid | C18H30O2 | Fatty acid | CC/C=C\C/C=C\C/C=C\CCCCCCCC(=O)O | 5280934 | - |
| Arachidonic acid | C20H32O2 | Fatty acid | CCCCC/C=C\C/C=C\C/C=C\C/C=C\CCCC(=O)O | 444899 | - |
| Docosahexaenoic acid | C22H32O2 | Fatty acid | CC/C=C\C/C=C\C/C=C\C/C=C\C/C=C\C/C=C\CCC(=O)O | 445580 | - |
| Eicosapentaenoic acid | C20H30O2 | Fatty acid | CC/C=C/C/C=C/C/C=C/C/C=C/C/C=C/CCCC(=O)O | 5282847 | - |
| Glycerol | C3H8O3 | Metabolite | C(C(CO)O)O | 753 | - |
| Sphingosine | C18H37NO2 | Sphingolipid | CCCCCCCCCCCCC/C=C/[C@H]([C@H](CO)N)O | 5280335 | - |
| Norepinephrine | C8H11NO3 | Hormone / neurotransmitter | C1=CC(=C(C=C1[C@H](CN)O)O)O | 439260 | - |
| Histamine | C5H9N3 | Neurotransmitter | C1=C(NC=N1)CCN | 774 | - |
| Acetylcholine | C7H16NO2+ | Neurotransmitter | CC(=O)OCC[N+](C)(C)C | 187 | - |
| Gamma-aminobutyric acid | C4H9NO2 | Neurotransmitter | C(CC(=O)O)CN | 119 | - |
| Niacin | C6H5NO2 | Vitamin | C1=CC(=CN=C1)C(=O)O | 938 | - |
| Pyridoxine | C8H11NO3 | Vitamin | CC1=NC=C(C(=C1O)CO)CO | 1054 | - |
| Folic acid | C19H19N7O6 | Vitamin | C1=CC(=CC=C1C(=O)N[C@@H](CCC(=O)O)C(=O)O)NCC2=CN=C3C(=N2)C(=O)NC(=N3)N | 135398658 | - |
| Biotin | C10H16N2O3S | Vitamin | C1[C@H]2[C@@H]([C@@H](S1)CCCCC(=O)O)NC(=O)N2 | 171548 | - |
| Thiamine | C12H17N4OS+ | Vitamin | CC1=C(SC=[N+]1CC2=CN=C(N=C2N)C)CCO | 1130 | - |
| Retinol | C20H30O | Vitamin | CC1=C(C(CCC1)(C)C)/C=C/C(=C/C=C/C(=C/CO)/C)/C | 445354 | - |
| Cholecalciferol | C27H44O | Vitamin | C[C@H](CCCC(C)C)[C@H]1CC[C@@H]\2[C@@]1(CCC/C2=C\C=C/3\C[C@H](CCC3=C)O)C | 5280795 | - |
| Tocopherol | C28H48O2 | Vitamin | CC1=C(C=C2CCC(OC2=C1C)(C)CCCC(C)CCCC(C)CCCC(C)C)O | 14986 | - |
| Phylloquinone | C31H46O2 | Vitamin | CC1=C(C(=O)C2=CC=CC=C2C1=O)C/C=C(\C)/CCC[C@H](C)CCC[C@H](C)CCCC(C)C | 5284607 | - |
| Heme | C34H32FeN4O4 | Cofactor | CC1=C(C2=CC3=C(C(=C([N-]3)C=C4C(=C(C(=N4)C=C5C(=C(C(=N5)C=C1[N-]2)C)C=C)C)C=C)C)CCC(=O)O)CCC(=O)O.[Fe+2] | 26945 | - |
| Bilirubin | C33H36N4O6 | Metabolite | CC1=C(NC(=C1CCC(=O)O)CC2=C(C(=C(N2)/C=C\3/C(=C(C(=O)N3)C)C=C)C)CCC(=O)O)/C=C\4/C(=C(C(=O)N4)C=C)C | 5280352 | - |
| Biliverdin | C33H34N4O6 | Metabolite | CC\1=C(/C(=C/C2=C(C(=C(N2)/C=C\3/C(=C(C(=O)N3)C)C=C)C)CCC(=O)O)/N/C1=C\C4=NC(=O)C(=C4C)C=C)CCC(=O)O | 5280353 | - |
| Cortisol | C21H30O5 | Steroid hormone | C[C@]12CCC(=O)C=C1CC[C@@H]3[C@@H]2[C@H](C[C@]4([C@H]3CC[C@@]4(C(=O)CO)O)C)O | 5754 | - |
| Testosterone | C19H28O2 | Steroid hormone | C[C@]12CC[C@H]3[C@H]([C@@H]1CC[C@@H]2O)CCC4=CC(=O)CC[C@]34C | 6013 | - |
| Estradiol | C18H24O2 | Steroid hormone | C[C@]12CC[C@H]3[C@H]([C@@H]1CC[C@@H]2O)CCC4=C3C=CC(=C4)O | 5757 | - |
| Progesterone | C21H30O2 | Steroid hormone | CC(=O)[C@H]1CC[C@@H]2[C@@]1(CC[C@H]3[C@H]2CCC4=CC(=O)CC[C@]34C)C | 5994 | - |
| Aldosterone | C21H28O5 | Steroid hormone | C[C@]12CCC(=O)C=C1CC[C@@H]3[C@@H]2[C@H](C[C@]4([C@H]3CC[C@@H]4C(=O)CO)C=O)O | 5839 | - |
| Thyroxine | C15H11I4NO4 | Hormone | C1=C(C=C(C(=C1I)OC2=CC(=C(C(=C2)I)O)I)I)C[C@@H](C(=O)O)N | 5819 | - |
| Triiodothyronine | C15H12I3NO4 | Hormone | C1=CC(=C(C=C1OC2=C(C=C(C=C2I)C[C@@H](C(=O)O)N)I)I)O | 5920 | - |
Methodology
ChEBI is the preferred ontology source for biological roles and chemical classes; PubChem supplies compound identifiers and structures. HMDB is useful for metabolite expansion but needs license review before redistribution.
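One way to apply this source preference when assembling a row is a per-field precedence table. The sketch below is illustrative only: the `SourceRecord` type, the `FIELD_PRIORITY` mapping, and the `merge_records` helper are hypothetical names, not part of any ChEBI or PubChem client. It assumes records have already been downloaded and parsed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SourceRecord:
    """One parsed record from a single source (hypothetical structure)."""
    source: str                      # "ChEBI" or "PubChem"
    bio_class: Optional[str] = None  # biological role or chemical class
    smiles: Optional[str] = None     # structure as SMILES


# Assumed precedence, following the methodology text:
# ChEBI first for classes, PubChem first for structures.
FIELD_PRIORITY = {
    "bio_class": ["ChEBI", "PubChem"],
    "smiles": ["PubChem", "ChEBI"],
}


def merge_records(records):
    """Pick each field from the highest-priority source that has a value,
    and record which source supplied it."""
    by_source = {r.source: r for r in records}
    merged = {}
    for field, order in FIELD_PRIORITY.items():
        for src in order:
            rec = by_source.get(src)
            value = getattr(rec, field) if rec is not None else None
            if value:
                merged[field] = value
                merged[field + "_source"] = src
                break
    return merged


glycine = merge_records([
    SourceRecord("ChEBI", bio_class="Amino acid"),
    SourceRecord("PubChem", smiles="C(C(=O)O)N"),
])
```

Keeping the `*_source` key next to each value preserves provenance, so a later license review can find every field that came from a given source.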
Expansion plan
Add filtered views for amino acids, nucleobases, sugars, lipids, cofactors, vitamins, neurotransmitters, and central-carbon metabolites with ChEBI role/class provenance.
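A filtered view can be sketched as a grouping over the "Biological class" column. The example below uses only the standard library and a three-row excerpt of the table. The `parse_table` and `filtered_views` helpers are hypothetical names, and the parser assumes no cell contains a literal `|` character, which holds for the plain SMILES strings in this list.

```python
from collections import defaultdict

# Excerpt of the table above, used as sample input.
TABLE = """\
| Name | Formula | Biological class | SMILES | PubChem CID | ChEBI ID |
|---|---|---|---|---|---|
| Glycine | C2H5NO2 | Amino acid | C(C(=O)O)N | 750 | CHEBI:15428 |
| Adenine | C5H5N5 | Nucleobase | C1=NC2=C(N1)N=CN=C2N | 190 | CHEBI:16708 |
| L-Alanine | C3H7NO2 | Amino acid | C[C@@H](C(=O)O)N | 5950 | CHEBI:16977 |
"""


def parse_table(text):
    """Parse a pipe table into a list of dicts keyed by the header cells."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip().startswith("|")]
    header = [c.strip() for c in lines[0].strip("|").split("|")]
    rows = []
    for ln in lines[2:]:  # skip the header row and the |---| separator row
        cells = [c.strip() for c in ln.strip("|").split("|")]
        if len(cells) == len(header):  # ignore malformed rows
            rows.append(dict(zip(header, cells)))
    return rows


def filtered_views(rows):
    """Group molecule names by biological class, preserving table order."""
    views = defaultdict(list)
    for row in rows:
        views[row["Biological class"]].append(row["Name"])
    return dict(views)


views = filtered_views(parse_table(TABLE))
```

Here `views["Amino acid"]` holds the names of the amino-acid rows. A provenance column for the ChEBI role or class could be carried through the same dictionaries once it exists in the table.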
Sources
ChEBI
Chemical ontology classes, biological roles, synonyms, formulas, SMILES, InChI, and ChEBI identifiers.
PubChem
Canonical compound identifiers, names, formulas, structures, SMILES, and downloadable compound records.
RDKit
Derived descriptors such as molecular weight, LogP, TPSA, hydrogen-bond counts, and canonicalized structures.
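A minimal sketch of how these descriptors might be derived from a SMILES column, assuming RDKit is installed. The `describe` helper and its dictionary keys are hypothetical; the RDKit calls are the standard descriptor functions.

```python
from rdkit import Chem
from rdkit.Chem import Crippen, Descriptors, Lipinski, rdMolDescriptors


def describe(smiles):
    """Return derived descriptors for a SMILES string,
    or None if RDKit cannot parse it."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None
    return {
        "canonical_smiles": Chem.MolToSmiles(mol),   # RDKit canonical form
        "mol_weight": Descriptors.MolWt(mol),        # average molecular weight
        "logp": Crippen.MolLogP(mol),                # Crippen LogP estimate
        "tpsa": rdMolDescriptors.CalcTPSA(mol),      # topological polar surface area
        "h_bond_donors": Lipinski.NumHDonors(mol),
        "h_bond_acceptors": Lipinski.NumHAcceptors(mol),
    }


# Glycine, using the SMILES from the table above.
glycine = describe("C(C(=O)O)N")
```

Because canonicalization depends on the toolkit, the RDKit canonical SMILES need not match the string PubChem publishes. Comparing two RDKit-canonicalized strings is a reasonable check that two rows describe the same structure.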