
CSV to FASTA
Convert CSV or TSV sequence tables to FASTA with automatic delimiter detection and column mapping.

CSV to FASTA converts tabular sequence data—stored in CSV, TSV, or other delimited formats—into FASTA format. Spreadsheets and databases often store sequences alongside metadata in columns, but most bioinformatics tools expect FASTA input. This converter bridges that gap.
The tool handles common spreadsheet quirks: quoted fields containing commas, inconsistent delimiters, and extraneous characters mixed into sequence columns. It auto-detects column names and delimiters, though manual overrides are available when needed.
ProteinIQ's converter runs entirely in the browser, with no uploads to external servers and no software installation. Files are processed on the client, so conversion speed depends on the local machine rather than on a network connection.
| Format | Description |
|---|---|
| CSV | Comma-separated values |
| TSV | Tab-separated values |
| Other delimited | Semicolon or pipe-delimited files |
The converter expects at least two columns: one for sequence identifiers (headers) and one for the sequences themselves. Additional metadata columns are ignored.
| Setting | Description |
|---|---|
| Delimiter | Auto-detect (default), Comma, Semicolon, Tab, or Pipe. Auto-detection examines the first few rows. |
| Has header row | Whether the first row contains column names. Default: on. |
| Quote character | Character used to wrap fields containing delimiters: `"` (default), `'`, or None. |
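
The delimiter auto-detection described above can be approximated as follows. This is a minimal sketch, not ProteinIQ's actual implementation: the function name `detect_delimiter`, the five-row sample size, and the "most columns with a consistent width wins" rule are all assumptions made for illustration.

```python
import csv

# Candidate delimiters, matching the options listed in the table above.
CANDIDATES = [",", ";", "\t", "|"]

def detect_delimiter(text, quotechar='"', sample_rows=5):
    """Guess the delimiter from the first few non-empty rows (hypothetical helper).

    A candidate qualifies when every sampled row splits into the same number
    of fields; among qualifying candidates, the one giving the most fields wins.
    Falls back to a comma when nothing qualifies.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()][:sample_rows]
    best, best_cols = ",", 1
    for delim in CANDIDATES:
        # csv.reader honours the quote character, so delimiters inside
        # quoted fields are not counted as field separators.
        rows = list(csv.reader(lines, delimiter=delim, quotechar=quotechar))
        widths = {len(row) for row in rows}
        if len(widths) == 1:
            cols = widths.pop()
            if cols > best_cols:
                best, best_cols = delim, cols
    return best
```

Parsing the sample with `csv.reader` rather than counting raw characters means a quoted field such as `"kinase, putative"` does not mislead the guess in a semicolon-delimited file.
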
The converter looks for common column names automatically: "id", "name", "header" for identifiers; "sequence", "seq", "protein" for sequences. Override these when your columns use different names.
| Setting | Description |
|---|---|
| Use column indices | Map by position (0-indexed) instead of column name. Useful for headerless files. |
| ID column name | Column containing sequence identifiers. Default: id. |
| Sequence column name | Column containing sequences. Default: sequence. |
| ID column index | Zero-based position of the ID column (when using indices). |
| Sequence column index | Zero-based position of the sequence column (when using indices). |
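
The column mapping can be sketched like this. The candidate names come from the description above; the helper names `find_column` and `resolve_columns`, the case-insensitive matching, and the error behaviour are assumptions for illustration only.

```python
# Column names the converter is described as recognising automatically.
ID_NAMES = ("id", "name", "header")
SEQUENCE_NAMES = ("sequence", "seq", "protein")

def find_column(header, candidates):
    """Return the index of the first candidate name present in the header, or None."""
    # Assumption: matching ignores case and surrounding whitespace.
    lowered = [cell.strip().lower() for cell in header]
    for name in candidates:
        if name in lowered:
            return lowered.index(name)
    return None

def resolve_columns(header=None, id_index=None, seq_index=None):
    """Pick (id, sequence) positions from explicit zero-based indices or header names."""
    if id_index is not None and seq_index is not None:
        return id_index, seq_index
    if header is None:
        raise ValueError("headerless input needs explicit column indices")
    id_col = find_column(header, ID_NAMES)
    seq_col = find_column(header, SEQUENCE_NAMES)
    if id_col is None or seq_col is None:
        raise ValueError("could not find ID and sequence columns; set them manually")
    return id_col, seq_col
```

For a header row of `Name, Organism, Seq`, `resolve_columns` returns positions 0 and 2 and the `Organism` column is ignored, consistent with extra metadata columns being dropped.
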
| Setting | Description |
|---|---|
| Line wrapping | 80 characters (default), 60, or no wrapping. Standard FASTA uses 60–80. |
| Case format | UPPERCASE (default), lowercase, or Preserve original. |
| Header prefix | Optional text prepended to all FASTA headers. |
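
The effect of these output settings on a single record can be illustrated with a short sketch. The function `format_record` and its parameter names are hypothetical; only the defaults (80-character wrapping, uppercase, no prefix) are taken from the table above.

```python
def format_record(identifier, sequence, width=80, case="upper", prefix=""):
    """Render one FASTA record (hypothetical helper).

    width: characters per sequence line; 0 or None disables wrapping.
    case:  "upper", "lower", or "preserve".
    """
    if case == "upper":
        sequence = sequence.upper()
    elif case == "lower":
        sequence = sequence.lower()
    if width:
        # Slice the sequence into fixed-width chunks.
        lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    else:
        lines = [sequence]
    return "\n".join([f">{prefix}{identifier}", *lines])
```

With `width=4`, the ten-residue input `atcgatcgat` becomes three sequence lines (`ATCG`, `ATCG`, `AT`) under the header `>seq1`.
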
Spreadsheet data often contains formatting artifacts—position numbers, spaces for readability, or stray punctuation. The cleanup options remove these automatically.
| Setting | Description |
|---|---|
| Character cleanup | Master toggle for cleanup options. Default: on. |
| Remove spaces | Strip whitespace. Default: on. |
| Remove numbers | Strip digits 0–9. Default: on. |
| Remove punctuation | Strip punctuation marks. Default: on. |
| Remove invalid characters | Strip characters not valid in biological sequences. Default: on. |
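
As a rough illustration of what these toggles do, the sketch below applies each cleanup step to a raw cell value. The function `clean_sequence` is hypothetical, and treating "invalid" as "anything that is not an ASCII letter" is an assumption; the tool's actual rule is not documented here.

```python
import string

def clean_sequence(raw, remove_spaces=True, remove_numbers=True,
                   remove_punctuation=True, remove_invalid=True):
    """Strip formatting artifacts from a raw sequence cell (hypothetical helper)."""
    drop = set()
    if remove_spaces:
        drop.update(string.whitespace)
    if remove_numbers:
        drop.update(string.digits)
    if remove_punctuation:
        drop.update(string.punctuation)
    cleaned = "".join(ch for ch in raw if ch not in drop)
    if remove_invalid:
        # Assumption: only ASCII letters count as valid sequence characters.
        cleaned = "".join(ch for ch in cleaned if ch in string.ascii_letters)
    return cleaned
```

A cell copied from a numbered sequence listing, such as `1 ATCG ATCG 10` followed by `11 GCTA.` on the next line, reduces to `ATCGATCGGCTA` with all options on.
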
| Setting | Description |
|---|---|
| Validate sequences | Check for invalid characters after cleanup. Default: on. |
| Skip empty rows | Ignore rows with missing sequence values. Default: on. |
| Skip invalid sequences | Exclude sequences that fail validation instead of including them with warnings. Default: off. |
| Show sequence statistics | Display count, total length, and detected sequence type. Default: on. |
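
One way the validation, skipping, and statistics options might interact is sketched below. The alphabets, the type-detection order (DNA, then RNA, then protein), and the names `detect_type` and `summarize` are assumptions; the tool's real validation rules may differ.

```python
# Assumed alphabets: core nucleotide codes plus N, and amino acid
# one-letter codes plus common ambiguity codes and the stop symbol.
DNA = set("ACGTN")
RNA = set("ACGUN")
PROTEIN = set("ACDEFGHIKLMNPQRSTVWYBXZUO*")

def detect_type(sequence):
    """Classify a sequence as DNA, RNA, protein, empty, or unknown."""
    letters = set(sequence.upper())
    if not letters:
        return "empty"
    if letters <= DNA:
        return "DNA"
    if letters <= RNA:
        return "RNA"
    if letters <= PROTEIN:
        return "protein"
    return "unknown"

def summarize(records, skip_invalid=False):
    """Filter (id, sequence) pairs and collect simple statistics."""
    kept, warnings = [], []
    for identifier, sequence in records:
        if not sequence:
            continue  # skip rows with no sequence value
        if detect_type(sequence) == "unknown":
            warnings.append(identifier)
            if skip_invalid:
                continue  # exclude instead of keeping with a warning
        kept.append((identifier, sequence))
    stats = {
        "count": len(kept),
        "total_length": sum(len(seq) for _, seq in kept),
    }
    return kept, warnings, stats
```

With `skip_invalid` left off, a sequence that fails validation is still written to the output but its identifier is reported in `warnings`, mirroring the default described in the table.
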
The converter produces standard FASTA with headers beginning with > followed by the identifier, then sequence lines wrapped at the specified length.
>seq1
ATCGATCGATCGATCGATCG
>seq2
GCTAGCTAGCTAGCTAGCTA

Results can be copied to the clipboard or downloaded as a .fasta file.