
CSV to FASTA
Convert spreadsheet data to FASTA format. Upload a CSV file or paste your data below. Configure column mapping and formatting options on the right.
What is CSV to FASTA?
CSV to FASTA converts tabular sequence data—stored in CSV, TSV, or other delimited formats—into FASTA format. Spreadsheets and databases often store sequences alongside metadata in columns, but most bioinformatics tools expect FASTA input. This converter bridges that gap.
The tool handles common spreadsheet quirks: quoted fields containing commas, inconsistent delimiters, and extraneous characters mixed into sequence columns. It auto-detects column names and delimiters, though manual overrides are available when needed.
How to use CSV to FASTA online
ProteinIQ's converter runs entirely in the browser, with no uploads to external servers and no software installation. Files are processed on the client, so large inputs are converted without any upload wait.
Input
| Format | Description |
|---|---|
| CSV | Comma-separated values |
| TSV | Tab-separated values |
| Other delimited | Semicolon or pipe-delimited files |
The converter expects at least two columns: one for sequence identifiers (headers) and one for the sequences themselves. Additional metadata columns are ignored.
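
As a rough illustration of that two-column contract, the sketch below reads an identifier column and a sequence column with Python's standard `csv` module and ignores everything else. The function name `csv_to_fasta` and the sample `organism` column are hypothetical; this is a minimal equivalent under those assumptions, not ProteinIQ's implementation.

```python
import csv
import io


def csv_to_fasta(text, id_col="id", seq_col="sequence"):
    """Convert CSV text to FASTA using only the ID and sequence columns.

    Hypothetical helper for illustration; other columns are ignored.
    """
    reader = csv.DictReader(io.StringIO(text))
    records = []
    for row in reader:
        # Extra metadata columns in `row` are simply never read.
        records.append(f">{row[id_col]}\n{row[seq_col]}")
    return "\n".join(records) + "\n" if records else ""


example = (
    "id,organism,sequence\n"
    "seq1,E. coli,ATCGATCGATCGATCGATCG\n"
    "seq2,Human,GCTAGCTAGCTAGCTAGCTA\n"
)
print(csv_to_fasta(example), end="")
```

The `organism` column is present in the input but never appears in the output, which mirrors how additional metadata columns are dropped.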
Settings
CSV parsing
| Setting | Description |
|---|---|
| Delimiter | Auto-detect (default), Comma, Semicolon, Tab, or Pipe. Auto-detection examines the first few rows. |
| Has header row | Whether the first row contains column names. Default: on. |
| Quote character | Character used to wrap fields containing delimiters. `"` (default), `'`, or None. |
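
The tool's actual detection logic is not documented here. One plausible way to auto-detect a delimiter from the first few rows is to parse them with each candidate and keep the one that yields the most columns with a consistent width. The sketch below does that with Python's `csv.reader`; the function name `detect_delimiter`, the five-row sample size, and the comma fallback are all assumptions.

```python
import csv

# Candidate delimiters, matching the options listed in the settings table.
CANDIDATES = [",", ";", "\t", "|"]


def detect_delimiter(text, sample_rows=5, quotechar='"'):
    """Guess the delimiter from the first non-empty rows (hypothetical helper).

    A candidate wins if every sampled row splits into the same number of
    columns and that number is larger than for any earlier candidate.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()][:sample_rows]
    best, best_cols = ",", 1  # fall back to comma when nothing splits
    for delim in CANDIDATES:
        rows = list(csv.reader(lines, delimiter=delim, quotechar=quotechar))
        widths = {len(row) for row in rows}
        if len(widths) == 1:
            cols = widths.pop()
            if cols > best_cols:
                best, best_cols = delim, cols
    return best


sample = 'id;sequence\nseq1;ATCG\nseq2;"GC;TA"\n'
print(repr(detect_delimiter(sample)))
```

Because the rows go through `csv.reader`, a delimiter inside a quoted field (as in `"GC;TA"`) does not inflate the column count. Quoted fields that contain line breaks are not handled by this simple line-based sample.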
Column mapping
The converter looks for common column names automatically: "id", "name", "header" for identifiers; "sequence", "seq", "protein" for sequences. Override these when your columns use different names.
| Setting | Description |
|---|---|
| Use column indices | Map by position (0-indexed) instead of column name. Useful for headerless files. |
| ID column name | Column containing sequence identifiers. Default: id. |
| Sequence column name | Column containing sequences. Default: sequence. |
| ID column index | Zero-based position of the ID column (when using indices). |
| Sequence column index | Zero-based position of the sequence column (when using indices). |
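
The mapping rules above can be sketched as a small lookup: try the common column names first, or fall back to zero-based positions when indices are requested or no header exists. The helper names, the case-insensitive matching, and the default indices of 0 and 1 are assumptions made for illustration.

```python
# Common column names the converter is described as recognizing.
ID_ALIASES = ("id", "name", "header")
SEQ_ALIASES = ("sequence", "seq", "protein")


def find_by_name(header, aliases):
    """Return the zero-based index of the first alias found in the header."""
    # Case-insensitive matching is an assumption, not documented behavior.
    lowered = [name.strip().lower() for name in header]
    for alias in aliases:
        if alias in lowered:
            return lowered.index(alias)
    raise ValueError(f"none of {aliases} found in header {header}")


def resolve_columns(header=None, use_indices=False, id_index=0, seq_index=1):
    """Return (id_position, sequence_position), both zero-based."""
    if use_indices or header is None:
        # Headerless files can only be mapped by position.
        return id_index, seq_index
    return find_by_name(header, ID_ALIASES), find_by_name(header, SEQ_ALIASES)


print(resolve_columns(["Name", "Length", "Seq"]))
print(resolve_columns(use_indices=True, id_index=2, seq_index=0))
```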
FASTA formatting
| Setting | Description |
|---|---|
| Line wrapping | 80 characters (default), 60, or no wrapping. Standard FASTA uses 60–80. |
| Case format | UPPERCASE (default), lowercase, or Preserve original. |
| Header prefix | Optional text prepended to all FASTA headers. |
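
To make the three formatting options concrete, here is a minimal sketch of how a single record might be rendered. The function `format_record` and its parameter names are hypothetical; `width=None` stands in for the "no wrapping" option.

```python
def format_record(identifier, sequence, width=80, case="upper", prefix=""):
    """Render one FASTA record (hypothetical helper).

    width:  characters per sequence line, or None for no wrapping
    case:   "upper", "lower", or "preserve"
    prefix: text placed between ">" and the identifier
    """
    if case == "upper":
        sequence = sequence.upper()
    elif case == "lower":
        sequence = sequence.lower()
    if width:
        # Slice the sequence into fixed-width chunks.
        lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    else:
        lines = [sequence]
    return "\n".join([f">{prefix}{identifier}", *lines])


print(format_record("seq1", "atcgatcgat", width=4))
```

With `width=4` the ten-base sequence is split into lines of 4, 4, and 2 characters; the final line is simply shorter, which is the usual FASTA convention.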
Sequence cleanup
Spreadsheet data often contains formatting artifacts—position numbers, spaces for readability, or stray punctuation. The cleanup options remove these automatically.
| Setting | Description |
|---|---|
| Character cleanup | Master toggle for cleanup options. Default: on. |
| Remove spaces | Strip whitespace. Default: on. |
| Remove numbers | Strip digits 0–9. Default: on. |
| Remove punctuation | Strip punctuation marks. Default: on. |
| Remove invalid characters | Strip characters not valid in biological sequences. Default: on. |
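
The cleanup options map naturally onto a chain of filters. The sketch below is one way to express them in Python; `clean_sequence` is a hypothetical name, and treating only ASCII letters as valid sequence characters is an assumption, since the exact alphabet the tool accepts is not stated.

```python
import re
import string


def clean_sequence(raw, enabled=True, spaces=True, numbers=True,
                   punctuation=True, invalid=True):
    """Strip formatting artifacts from a sequence cell (hypothetical helper)."""
    if not enabled:  # master toggle: leave the cell untouched
        return raw
    seq = raw
    if spaces:
        seq = re.sub(r"\s+", "", seq)   # spaces, tabs, line breaks
    if numbers:
        seq = re.sub(r"[0-9]", "", seq)  # position numbers
    if punctuation:
        seq = seq.translate(str.maketrans("", "", string.punctuation))
    if invalid:
        # Assumption: anything other than an ASCII letter is invalid.
        seq = re.sub(r"[^A-Za-z]", "", seq)
    return seq


print(clean_sequence("1 ATCG ATCG 10\n11 GGCC,TTAA."))
```

The input imitates a sequence pasted from a formatted report, with position numbers, grouping spaces, and stray punctuation; only the letters survive.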
Validation
| Setting | Description |
|---|---|
| Validate sequences | Check for invalid characters after cleanup. Default: on. |
| Skip empty rows | Ignore rows with missing sequence values. Default: on. |
| Skip invalid sequences | Exclude sequences that fail validation instead of including them with warnings. Default: off. |
| Show sequence statistics | Display count, total length, and detected sequence type. Default: on. |
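
How the tool validates sequences and detects their type is not specified, so the following is only a sketch of one reasonable approach: compare each sequence's letters against small alphabets, then apply the skip rules. The alphabets, the DNA-before-protein ordering, and all function names are assumptions.

```python
# Assumed alphabets; real tools may accept more IUPAC ambiguity codes.
DNA = set("ACGTN")
RNA = set("ACGUN")
PROTEIN = set("ACDEFGHIKLMNPQRSTVWYBXZ*")


def detect_type(seq):
    """Classify a sequence by the letters it contains (heuristic)."""
    letters = set(seq.upper())
    if not letters:
        return "empty"
    if letters <= DNA:
        return "DNA"
    if letters <= RNA:
        return "RNA"
    if letters <= PROTEIN:
        return "protein"
    return "unknown"


def filter_records(records, skip_empty=True, skip_invalid=False):
    """Apply the skip options; return (kept_records, warnings)."""
    kept, warnings = [], []
    for identifier, seq in records:
        if not seq:
            if skip_empty:
                continue
        elif detect_type(seq) == "unknown":
            if skip_invalid:
                continue
            warnings.append(f"{identifier}: contains unexpected characters")
        kept.append((identifier, seq))
    return kept, warnings


def summarize(records):
    """Basic statistics: record count and total sequence length."""
    return {"count": len(records),
            "total_length": sum(len(seq) for _, seq in records)}


rows = [("a", "ATCG"), ("b", ""), ("c", "AT1G")]
kept, warnings = filter_records(rows)
print(kept, warnings, summarize(kept))
```

Type detection from letters alone is inherently ambiguous: a short peptide made up only of A, C, G, and T would be reported as DNA by this heuristic.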
Output
The converter produces standard FASTA with headers beginning with > followed by the identifier, then sequence lines wrapped at the specified length.

    >seq1
    ATCGATCGATCGATCGATCG
    >seq2
    GCTAGCTAGCTAGCTAGCTA

Results can be copied to the clipboard or downloaded as a .fasta file.
Related tools
- TXT to FASTA: Convert plain text sequences without tabular structure
- GenBank to FASTA: Extract sequences from GenBank records
- PDB to FASTA: Extract sequences from protein structures
- FASTA Splitter: Divide large FASTA files into smaller chunks