ProteinIQ

Three-to-one converter

Convert three-letter amino acid codes (Ala, Arg) to single-letter format (A, R)

What is three-to-one amino acid conversion?

Three-to-one amino acid conversion is the process of translating three-letter amino acid codes (such as Ala, Arg, Asn) into their corresponding single-letter codes (A, R, N). Both notation systems were standardized by the International Union of Pure and Applied Chemistry (IUPAC) and the International Union of Biochemistry (IUB) in the early 1970s to provide consistent representations of protein sequences across scientific literature and databases.

The three-letter code uses the first three letters of each amino acid name (with minor modifications to avoid ambiguity), making it readable and commonly used in clinical settings, publications, and educational materials. The one-letter code, proposed in 1968 by Margaret Dayhoff, provides a more compact representation essential for displaying long protein sequences, performing sequence alignments, and storing data efficiently in bioinformatics databases.

How to use the three-to-one converter online

ProteinIQ provides a web-based interface for converting three-letter amino acid codes to single-letter format without any software installation. Paste sequences directly or upload a file, configure delimiter and output options, and receive the converted sequence instantly.

Input

Input | Description
Input | Three-letter amino acid sequences. Accepts FASTA format with headers, plain sequences, or delimited lists (e.g., Ala-Arg-Asn or Ala Arg Asn). Supports .txt, .fasta, .fa, .fas, and .seq files up to 50 MB.

Settings

Conversion options

Setting | Description
Input delimiter | Specifies how amino acids are separated in the input. Auto-detect (default) identifies the delimiter automatically. Manual options include Hyphen (-), Space, Comma (,), Tab, Semicolon (;), or Custom.
Custom delimiter | User-defined delimiter character(s). Appears when Custom is selected for input delimiter. Default is /.
Unknown residues | Determines handling of unrecognized three-letter codes. Convert to X (default) replaces unknowns with X. Skip omits them from output. Keep original code preserves them in brackets.
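
The three "Unknown residues" options can be illustrated with a short Python sketch. It is not the converter's actual code: the function name apply_unknown_policy, the policy strings, and the tiny SAMPLE_CODES table are hypothetical, and square brackets are assumed for the "Keep original code" option because the description does not say which bracket style is used.

```python
# Hypothetical illustration of the three "Unknown residues" policies.
# SAMPLE_CODES is a deliberately tiny subset, not the tool's full table.
SAMPLE_CODES = {"ALA": "A", "ARG": "R", "ASN": "N"}


def apply_unknown_policy(codes, policy="x"):
    """Convert three-letter codes, handling unrecognized ones per policy.

    policy: "x" (replace with X), "skip" (omit), or "keep" (bracketed original).
    """
    out = []
    for code in codes:
        letter = SAMPLE_CODES.get(code.upper())
        if letter is not None:
            out.append(letter)
        elif policy == "x":
            out.append("X")
        elif policy == "keep":
            # Assumption: square brackets around the original code.
            out.append(f"[{code}]")
        elif policy != "skip":
            raise ValueError(f"unknown policy: {policy!r}")
    return "".join(out)


codes = ["Ala", "Foo", "Arg"]
print(apply_unknown_policy(codes, "x"))     # AXR
print(apply_unknown_policy(codes, "skip"))  # AR
print(apply_unknown_policy(codes, "keep"))  # A[Foo]R
```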

Output formatting

Setting | Description
Output case | Letter case for output codes. Uppercase (A, R, N) (default) or Lowercase (a, r, n).
Preserve FASTA format | When enabled (default), maintains FASTA headers and structure in the output. When disabled, outputs only the sequence.
Characters per line | Number of characters before line wrapping (0–200). Default 0 outputs the entire sequence on one line. Common values are 60 or 80 for standard FASTA formatting.
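
As a rough sketch of how the case and line-wrapping options could combine, the Python function below applies both to an already converted sequence. The name format_output and its parameters are hypothetical. Only the 0–200 range and the meaning of 0 come from the settings above.

```python
def format_output(sequence, lowercase=False, width=0):
    """Apply case and line-wrapping options to a one-letter sequence.

    width=0 means no wrapping, mirroring the default described above.
    """
    if not 0 <= width <= 200:
        raise ValueError("width must be between 0 and 200")
    text = sequence.lower() if lowercase else sequence.upper()
    if width == 0:
        return text
    # Slice the sequence into chunks of `width` characters, one per line.
    return "\n".join(text[i:i + width] for i in range(0, len(text), width))


print(format_output("ARNDCEQGHI", width=4))
# ARND
# CEQG
# HI
```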

Output

The converter produces a plain text sequence in single-letter format. When FASTA preservation is enabled, headers remain intact with the converted sequence below. The output can be copied to clipboard or downloaded as a text file.

Amino acid code reference

The table below lists the standard IUPAC amino acid codes, including the 20 canonical amino acids, two rare amino acids (selenocysteine and pyrrolysine), and three ambiguity codes.

Amino acid | Three-letter | One-letter
Alanine | Ala | A
Arginine | Arg | R
Asparagine | Asn | N
Aspartic acid | Asp | D
Cysteine | Cys | C
Glutamic acid | Glu | E
Glutamine | Gln | Q
Glycine | Gly | G
Histidine | His | H
Isoleucine | Ile | I
Leucine | Leu | L
Lysine | Lys | K
Methionine | Met | M
Phenylalanine | Phe | F
Proline | Pro | P
Serine | Ser | S
Threonine | Thr | T
Tryptophan | Trp | W
Tyrosine | Tyr | Y
Valine | Val | V
Selenocysteine | Sec | U
Pyrrolysine | Pyl | O
Asparagine or Aspartic acid | Asx | B
Glutamine or Glutamic acid | Glx | Z
Unknown amino acid | Xaa | X

How the converter works

The three-to-one converter first splits the input into individual amino acid codes, using either the auto-detected delimiter or the one specified in the settings. Each three-letter code is then matched against a lookup table containing the codes listed in the reference table above. The matching is case-insensitive, so Ala, ALA, and ala are treated as equivalent.
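
The lookup step can be sketched in Python as a dictionary keyed by uppercase three-letter codes. This is a minimal illustration rather than the converter's implementation: the names THREE_TO_ONE and three_to_one are hypothetical, the input is assumed to be already split into codes, and unrecognized codes are mapped to X as in the default setting.

```python
# Three-letter to one-letter map, taken from the reference table in this document.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLU": "E", "GLN": "Q", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
    "SEC": "U", "PYL": "O", "ASX": "B", "GLX": "Z", "XAA": "X",
}


def three_to_one(codes):
    """Map already-split three-letter codes to a one-letter string.

    Keys are uppercase, so normalizing with .upper() makes the lookup
    case-insensitive. Unrecognized codes become "X" (the default policy).
    """
    return "".join(THREE_TO_ONE.get(code.strip().upper(), "X") for code in codes)


print(three_to_one(["Ala", "ARG", "asn"]))  # ARN
```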

Delimiter detection

When set to auto-detect, the converter analyzes the input to identify the most likely delimiter:

  1. Character frequency — Counts occurrences of hyphens, spaces, and commas
  2. Pattern matching — Checks for common patterns like Xxx-Xxx or Xxx Xxx
  3. Continuous string detection — If no delimiters are found and the string length is divisible by three, assumes continuous concatenation (e.g., AlaArgAsn)
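
The three steps above can be approximated with the following Python sketch. It is an assumption-laden illustration, not the tool's algorithm: detect_delimiter and split_codes are hypothetical names, only the hyphen, space, and comma from step 1 are considered, and mixed delimiters such as a comma followed by a space are not handled.

```python
import re

CANDIDATES = ["-", " ", ","]  # the delimiters named in step 1


def detect_delimiter(text):
    """Return the likely delimiter, "" for continuous input, or None if unclear."""
    text = text.strip()
    # Step 1: character frequency.
    counts = {d: text.count(d) for d in CANDIDATES}
    best = max(counts, key=counts.get)
    if counts[best] > 0:
        # Step 2: confirm an Xxx<delim>Xxx pattern for the most frequent character.
        pattern = rf"[A-Za-z]{{3}}{re.escape(best)}+[A-Za-z]{{3}}"
        return best if re.search(pattern, text) else None
    # Step 3: no delimiter found; accept continuous codes if the length fits.
    if text and len(text) % 3 == 0 and text.isalpha():
        return ""
    return None


def split_codes(text):
    """Split input into three-letter codes using the detected delimiter."""
    delim = detect_delimiter(text)
    if delim is None:
        raise ValueError("could not determine delimiter")
    text = text.strip()
    if delim == "":
        # Continuous input: cut into consecutive chunks of three letters.
        return [text[i:i + 3] for i in range(0, len(text), 3)]
    return [part for part in text.split(delim) if part]


print(split_codes("Ala-Arg-Asn"))  # ['Ala', 'Arg', 'Asn']
print(split_codes("AlaArgAsn"))    # ['Ala', 'Arg', 'Asn']
```

Returning None for unclear input, rather than guessing, is a design choice of this sketch. It leaves the caller free to fall back to a manually selected delimiter.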

FASTA format handling

When processing FASTA-formatted input, the converter:

  • Preserves header lines (lines beginning with >) without modification
  • Accumulates sequence lines until the next header or end of input
  • Applies the conversion to the complete sequence
  • Optionally wraps output at the specified line length
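
The FASTA handling listed above can be sketched in Python as a single pass over the input lines. The function convert_fasta, its convert callback, and the demo helper are hypothetical. The sketch also assumes that sequence lines can be concatenated directly, which suits continuous input such as AlaArgAsn but would need adjusting for delimited input split across lines.

```python
def convert_fasta(text, convert, width=0):
    """Convert each FASTA record's sequence while keeping headers unchanged.

    convert: callable that turns the joined three-letter sequence text
             into a one-letter string (hypothetical hook for this sketch).
    width:   wrap converted sequences at this many characters; 0 disables wrapping.
    """
    out, buffer = [], []

    def flush():
        # Convert the accumulated sequence lines as one complete sequence.
        if not buffer:
            return
        seq = convert("".join(buffer))
        if width > 0:
            out.extend(seq[i:i + width] for i in range(0, len(seq), width))
        else:
            out.append(seq)
        buffer.clear()

    for line in text.splitlines():
        if line.startswith(">"):
            flush()           # finish the previous record first
            out.append(line)  # header is passed through untouched
        elif line.strip():
            buffer.append(line.strip())
    flush()
    return "\n".join(out)


# Demo converter for continuous input, using a deliberately tiny table.
SAMPLE = {"ALA": "A", "ARG": "R", "ASN": "N"}


def demo_convert(seq):
    return "".join(SAMPLE.get(seq[i:i + 3].upper(), "X") for i in range(0, len(seq), 3))


fasta = ">pep1 example\nAlaArg\nAsn\n>pep2\nArgAla\n"
print(convert_fasta(fasta, demo_convert))
# >pep1 example
# ARN
# >pep2
# RA
```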

Applications

  • Database submission — Converting sequences from literature format to database-compatible single-letter format
  • Sequence alignment — Preparing sequences for alignment tools that require one-letter codes
  • Bioinformatics pipelines — Batch processing of sequences from experimental outputs or clinical reports
  • Educational use — Learning amino acid abbreviations and sequence notation conventions