Three-to-one converter
Convert three-letter amino acid codes (Ala, Arg) to single-letter format (A, R)
What is three-to-one amino acid conversion?
Three-to-one amino acid conversion is the process of translating three-letter amino acid codes (such as Ala, Arg, Asn) into their corresponding single-letter codes (A, R, N). Both notation systems were standardized by the International Union of Pure and Applied Chemistry (IUPAC) and the International Union of Biochemistry (IUB) in the early 1970s to provide consistent representations of protein sequences across scientific literature and databases.
The three-letter code generally uses the first three letters of each amino acid name, with exceptions such as Asn, Gln, Ile, and Trp to avoid ambiguity. This makes it readable and common in clinical settings, publications, and educational materials. The one-letter code, devised by Margaret Dayhoff in the 1960s and published as tentative IUPAC-IUB rules in 1968, provides a more compact representation that is essential for displaying long protein sequences, performing sequence alignments, and storing data efficiently in bioinformatics databases.
How to use the three-to-one converter online
ProteinIQ provides a web-based interface for converting three-letter amino acid codes to single-letter format without any software installation. Paste sequences directly or upload a file, configure delimiter and output options, and receive the converted sequence instantly.
Input
| Input | Description |
|---|---|
| Input | Three-letter amino acid sequences. Accepts FASTA format with headers, plain sequences, or delimited lists (e.g., Ala-Arg-Asn or Ala Arg Asn). Supports .txt, .fasta, .fa, .fas, and .seq files up to 50 MB. |
Settings
Conversion options
| Setting | Description |
|---|---|
| Input delimiter | Specifies how amino acids are separated in the input. Auto-detect (default) identifies the delimiter automatically. Manual options include Hyphen (-), Space, Comma (,), Tab, Semicolon (;), or Custom. |
| Custom delimiter | User-defined delimiter character(s). Appears when Custom is selected for input delimiter. Default is /. |
| Unknown residues | Determines handling of unrecognized three-letter codes. Convert to X (default) replaces unknowns with X. Skip omits them from output. Keep original code preserves them in brackets. |
Output formatting
| Setting | Description |
|---|---|
| Output case | Letter case for output codes. Uppercase (A, R, N) (default) or Lowercase (a, r, n). |
| Preserve FASTA format | When enabled (default), maintains FASTA headers and structure in the output. When disabled, outputs only the sequence. |
| Characters per line | Number of characters before line wrapping (0–200). Default 0 outputs the entire sequence on one line. Common values are 60 or 80 for standard FASTA formatting. |
Output
The converter produces a plain text sequence in single-letter format. When FASTA preservation is enabled, headers remain intact, with the converted sequence below each header. The output can be copied to the clipboard or downloaded as a text file.
Amino acid code reference
The table below lists the standard IUPAC amino acid codes, including the 20 canonical amino acids, two rare amino acids (selenocysteine and pyrrolysine), and three ambiguity codes.
| Amino acid | Three-letter | One-letter |
|---|---|---|
| Alanine | Ala | A |
| Arginine | Arg | R |
| Asparagine | Asn | N |
| Aspartic acid | Asp | D |
| Cysteine | Cys | C |
| Glutamic acid | Glu | E |
| Glutamine | Gln | Q |
| Glycine | Gly | G |
| Histidine | His | H |
| Isoleucine | Ile | I |
| Leucine | Leu | L |
| Lysine | Lys | K |
| Methionine | Met | M |
| Phenylalanine | Phe | F |
| Proline | Pro | P |
| Serine | Ser | S |
| Threonine | Thr | T |
| Tryptophan | Trp | W |
| Tyrosine | Tyr | Y |
| Valine | Val | V |
| Selenocysteine | Sec | U |
| Pyrrolysine | Pyl | O |
| Asparagine or Aspartic acid | Asx | B |
| Glutamine or Glutamic acid | Glx | Z |
| Unknown amino acid | Xaa | X |
How the converter works
The three-to-one converter parses input sequences by first detecting or applying the specified delimiter to separate individual amino acid codes. Each three-letter code is then matched against a lookup table containing all standard IUPAC amino acids. The matching is case-insensitive, accepting Ala, ALA, or ala equivalently.
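The lookup step can be illustrated with a minimal Python sketch. It is not ProteinIQ's actual implementation: the names `THREE_TO_ONE` and `three_to_one`, the `unknown` and `lowercase` parameters, and the use of square brackets for kept codes are assumptions made for illustration, modeled on the settings described earlier.

```python
# Hypothetical lookup table built from the amino acid code reference.
# Keys are uppercase so that matching can be case-insensitive.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLU": "E", "GLN": "Q", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
    "SEC": "U", "PYL": "O", "ASX": "B", "GLX": "Z", "XAA": "X",
}


def three_to_one(codes, unknown="x", lowercase=False):
    """Convert an iterable of three-letter codes to a one-letter string.

    unknown: "x" emits X for unrecognized codes, "skip" drops them,
    and "keep" emits the original code in square brackets (assumed format).
    """
    if unknown not in ("x", "skip", "keep"):
        raise ValueError(f"unsupported unknown-residue mode: {unknown!r}")

    out = []
    for code in codes:
        key = code.strip().upper()  # Ala, ALA, and ala all map to the same key
        if key in THREE_TO_ONE:
            letter = THREE_TO_ONE[key]
            out.append(letter.lower() if lowercase else letter)
        elif unknown == "keep":
            out.append(f"[{code.strip()}]")
        elif unknown == "x":
            out.append("x" if lowercase else "X")
        # unknown == "skip": emit nothing for this code
    return "".join(out)
```

For example, `three_to_one(["Ala", "ARG", "asn"])` returns `"ARN"`, and passing `unknown="keep"` with an unrecognized code such as `Foo` leaves `[Foo]` in the output.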
Delimiter detection
When set to auto-detect, the converter analyzes the input to identify the most likely delimiter:
- Character frequency — Counts occurrences of hyphens, spaces, and commas
- Pattern matching — Checks for common patterns like Xxx-Xxx or Xxx Xxx
- Continuous string detection — If no delimiters are found and the string length is divisible by three, assumes continuous concatenation (e.g., AlaArgAsn)
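A rough Python sketch of this kind of detection follows. It covers only the frequency count and the continuous-string fallback and omits the pattern-matching step. The function names `detect_delimiter` and `split_codes`, the candidate list, and the tie-breaking order are assumptions, not the tool's actual logic.

```python
# Candidate delimiters in priority order. On a tie (e.g. "Ala, Arg"),
# the earlier entry wins, so a comma is preferred over a space.
CANDIDATES = ["-", ",", " "]


def detect_delimiter(text):
    """Return the most frequent candidate delimiter, or None for a continuous string."""
    text = text.strip()
    counts = {d: text.count(d) for d in CANDIDATES}
    best = max(counts, key=counts.get)  # first maximal entry in insertion order
    if counts[best] > 0:
        return best
    if len(text) % 3 == 0:
        return None  # assume concatenated codes such as AlaArgAsn
    raise ValueError("no delimiter found and length is not a multiple of three")


def split_codes(text):
    """Split input into individual three-letter codes using the detected delimiter."""
    text = text.strip()
    delim = detect_delimiter(text)
    if delim is None:
        return [text[i:i + 3] for i in range(0, len(text), 3)]
    # Strip each part so that "Ala, Arg" yields clean codes, and drop empties.
    return [part.strip() for part in text.split(delim) if part.strip()]
```

Raising an error when a continuous string's length is not a multiple of three is a design choice of this sketch. A real tool might instead convert what it can and flag the remainder.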
FASTA format handling
When processing FASTA-formatted input, the converter:
- Preserves header lines (lines beginning with >) without modification
- Accumulates sequence lines until the next header or end of input
- Applies the conversion to the complete sequence
- Optionally wraps output at the specified line length
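The steps above can be sketched in Python as follows. This is an illustrative outline under stated assumptions rather than the converter's real code: `convert_fasta` and `wrap` are hypothetical names, and `convert` is a caller-supplied function that receives one record's accumulated sequence text (line breaks preserved) and returns the one-letter string.

```python
def wrap(seq, width):
    """Split seq into lines of at most `width` characters; width <= 0 means no wrapping."""
    if width <= 0:
        return [seq]
    return [seq[i:i + width] for i in range(0, len(seq), width)]


def convert_fasta(text, convert, width=0):
    """Convert each record's sequence with `convert`, leaving header lines unchanged."""
    out, seq_lines = [], []

    def flush():
        # Convert the sequence accumulated so far as one unit, then reset the buffer.
        if seq_lines:
            out.extend(wrap(convert("\n".join(seq_lines)), width))
            seq_lines.clear()

    for line in text.splitlines():
        if line.startswith(">"):
            flush()            # finish the previous record first
            out.append(line)   # header is copied without modification
        elif line.strip():
            seq_lines.append(line.strip())
    flush()                    # handle the final record at end of input
    return "\n".join(out)
```

Converting the whole accumulated sequence at once, rather than line by line, avoids mis-splitting a record whose lines break in the middle of a delimited list.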
Applications
- Database submission — Converting sequences from literature format to database-compatible single-letter format
- Sequence alignment — Preparing sequences for alignment tools that require one-letter codes
- Bioinformatics pipelines — Batch processing of sequences from experimental outputs or clinical reports
- Educational use — Learning amino acid abbreviations and sequence notation conventions
Related tools
- One-to-three converter — Converts single-letter codes to three-letter format
- Amino acid composition — Analyzes the frequency of amino acids in a protein sequence
