How to convert TXT to FASTA?
The easiest way to convert TXT to FASTA format online is to use this ProteinIQ TXT to FASTA converter. To convert TXT to FASTA, paste a raw sequence or upload a .txt file, choose the sequence type if you know it, and run the converter. ProteinIQ removes copied line numbers and spacing, adds or preserves FASTA headers, wraps sequence lines, validates sequence characters, and returns a .fasta file.
For one sequence, the result contains one FASTA record:
>Sequence_1
ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG
For multi-sequence input, each sequence gets its own header line and record:
>Sequence_1
ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG
>Sequence_2
MVLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHF
For multiple sequences, separate entries with blank lines, existing FASTA headers, or the selected custom separator. The output keeps each sequence as a separate FASTA record so it can be used in alignment tools, sequence search tools, or downstream bioinformatics pipelines.
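The conversion steps described above can be sketched in a few lines of Python. This is a minimal illustration, not ProteinIQ's implementation: the function name `text_to_fasta`, the `Sequence_N` naming, and the letters-only cleanup rule are assumptions made for the example, and only the blank-line splitting mode is covered.

```python
import re
import textwrap


def text_to_fasta(raw_text: str, prefix: str = "Sequence", width: int = 80) -> str:
    """Turn blank-line-separated sequence blocks into FASTA records (sketch)."""
    records = []
    # Blocks are separated by one or more blank (or whitespace-only) lines.
    blocks = re.split(r"\n\s*\n", raw_text.strip())
    for block in blocks:
        # Keep ASCII letters only: drops copied line numbers, spaces,
        # tabs, and punctuation. Gap and stop characters are NOT preserved.
        sequence = re.sub(r"[^A-Za-z]", "", block).upper()
        if not sequence:
            continue
        wrapped = "\n".join(textwrap.wrap(sequence, width))
        records.append(f">{prefix}_{len(records) + 1}\n{wrapped}")
    return "\n".join(records) + "\n" if records else ""


if __name__ == "__main__":
    pasted = "1 ATGGCC ATTGTA\n2 ATGGGC\n\nmvlspadktn vkaaw\n"
    print(text_to_fasta(pasted, width=10), end="")
```

Because the cleanup keeps letters only, this sketch would also strip alignment gaps and stop markers; a fuller version would need options comparable to the settings listed in the table that follows.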
| Input | Description |
|---|---|
| Input | Plain text containing one or more sequences. Accepts pasted text or file uploads. Common extensions: .txt, .fasta, .fa, .fas, .seq, .dat. Maximum file size: 50 MB. |
| Setting | Description |
|---|---|
| Multi-sequences | Method for identifying separate sequences. Auto-detect sequences (default) analyzes text structure, keeps likely wrapped sequence lines together, and avoids treating prose labels as sequence records. Split on empty lines treats each block separated by blank lines as a distinct sequence. Custom separator uses a specified delimiter string. |
| Custom separator | Delimiter string for separating sequences when Custom separator mode is selected. Default: ---. An empty custom separator is rejected instead of falling back to auto-detection. |
| Sequence type | Select Auto-detect (default), DNA, RNA, or Protein. Auto-detect avoids deleting valid protein residues before classification. Explicit DNA mode rejects U, explicit RNA mode rejects T, and Protein mode preserves valid amino acid residues. |
| Header format | Controls how sequence identifiers are generated. Preserve existing headers (default) maintains existing FASTA header lines. seq_1, seq_2, ... or sequence_1, sequence_2, ... provide simple incrementing names. Custom prefix allows defining a custom naming scheme. Extract from text (smart) attempts to identify meaningful names from surrounding text. |
| Custom prefix | Prefix string for sequence headers when Custom prefix mode is selected. Default: seq. Prefix text is sanitized so spaces, punctuation, or pasted line breaks cannot create malformed FASTA headers. |
| Header extraction pattern | Refines smart extraction behavior when using Extract from text (smart) mode. First word of each sequence block takes the initial word before each sequence. Line numbers searches for patterns like "1.", "2.". Sequence identifiers looks for conventions like "seq1" or "protein_a". |
| Line wrapping | Number of characters per line in the output. 80 characters per line (standard) (default) follows NCBI recommendations. 60 characters per line is common in many workflows. No wrapping (single line) outputs each sequence on a single line. |
| Case format | Letter case for output sequences. UPPERCASE (default) matches the convention used by most sequence databases. lowercase converts all residues to lowercase. Preserve original maintains input capitalization. |
| Character cleanup | Master switch enabling automatic removal of non-sequence characters. Default: enabled. |
| Remove spaces | Strips whitespace characters from sequences. Default: enabled. |
| Remove numbers | Strips numeric characters (0-9) from sequences, useful for sequences copied from numbered formats. Default: enabled. |
| Remove tabs | Strips tab characters from sequences. Default: enabled. |
| Remove punctuation | Strips punctuation marks from sequences. Default: enabled. |
| Preserve alignment gaps | Keeps - and . gap characters when converting aligned FASTA or MSA-style input. Default: disabled. |
| Preserve stop codons | Keeps terminal * stop markers in protein sequences. Internal * characters are still rejected during validation. Default: disabled. |
| [Remove invalid characters](/app/filter-dna) | Strips characters that are not valid biological residues. Type-specific residue removal only runs when a Sequence type is selected, so in auto-detect mode short proteins made mostly of nucleotide-overlap letters are not truncated. Default: enabled. |
| Validate sequences | Performs a final check that all output characters are valid biological sequence codes for the selected sequence type. Default: enabled. |
| Validation strictness | Lenient (auto-clean) removes invalid characters and reports what changed. Strict (reject invalid) fails any sequence that would require cleanup, including spaces, tabs, numbers, punctuation, non-letter characters, or sequence-type mismatches. |
| Add line numbers to headers | Includes original line numbers from the input file in FASTA headers, useful for tracking sequence sources. Default: disabled. |
| Show sequence statistics | Displays statistics including sequence count, total length, average length, and detected sequence type. Default: enabled. |
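The difference between lenient and strict validation can be illustrated with a short Python sketch. Everything here is an assumption made for illustration: the `clean_sequence` name, the alphabets (IUPAC nucleotide codes and one-letter amino acid codes), and the choice to remove mismatched residues in lenient mode. ProteinIQ's actual behavior, including how it handles gaps and stop markers, may differ.

```python
# Hypothetical alphabets: IUPAC nucleotide codes and one-letter amino acid codes.
ALPHABETS = {
    "dna": set("ACGTRYSWKMBDHVN"),      # no U
    "rna": set("ACGURYSWKMBDHVN"),      # no T
    "protein": set("ACDEFGHIKLMNPQRSTVWYXBZJOU"),
}


def clean_sequence(raw: str, seq_type: str, strict: bool = False):
    """Return (cleaned_sequence, removed_characters) for one sequence.

    Lenient mode drops anything outside the alphabet and reports it.
    Strict mode raises ValueError if any cleanup would be needed.
    """
    alphabet = ALPHABETS[seq_type]
    kept, removed = [], []
    for char in raw.upper():
        if char in alphabet:
            kept.append(char)
        else:
            removed.append(char)
    if strict and removed:
        raise ValueError(f"sequence needs cleanup; invalid characters: {removed}")
    return "".join(kept), removed


if __name__ == "__main__":
    print(clean_sequence("atg 12\tcc-n", "dna"))
    try:
        clean_sequence("AUG", "dna", strict=True)
    except ValueError as error:
        print("strict mode rejected the input:", error)
```

Returning the removed characters alongside the cleaned sequence mirrors the idea that lenient mode reports what changed rather than silently altering the input.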
Results
The converter produces FASTA-formatted output that can be copied to clipboard or downloaded as a .fasta file.
| Output | Description |
|---|---|
| FASTA text | Properly formatted sequences with > headers and wrapped sequence lines. Each sequence appears on separate lines following its header. |
| Statistics | When enabled, displays sequence count, total length, average length, shortest and longest length, and detected sequence type (protein, DNA, or RNA). |
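The length statistics listed in the table are straightforward to reproduce for any FASTA text. The sketch below is an illustration only; the function name `fasta_stats` and the dictionary keys are hypothetical, and sequence-type detection is left out.

```python
def fasta_stats(fasta_text: str) -> dict:
    """Compute count, total, average, shortest, and longest sequence length."""
    lengths = []
    current = None  # length of the record being read; None before the first header
    for line in fasta_text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif line and current is not None:
            # Wrapped sequence lines all count toward the same record.
            current += len(line)
    if current is not None:
        lengths.append(current)
    if not lengths:
        return {"count": 0, "total": 0, "average": 0.0, "shortest": 0, "longest": 0}
    total = sum(lengths)
    return {
        "count": len(lengths),
        "total": total,
        "average": total / len(lengths),
        "shortest": min(lengths),
        "longest": max(lengths),
    }


if __name__ == "__main__":
    print(fasta_stats(">a\nATGC\nAT\n>b\nMK\n"))
```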
How to make a FASTA file from a TXT file
A plain text file can be converted to FASTA by adding a header line that starts with > and placing the sequence on the next line. This is the standard way to create a FASTA file from raw sequence text before saving it with a .fasta or .fa extension.
Edit the file in a text editor
For small files, open the .txt file in a plain text editor such as Notepad or TextEdit and format each sequence like this:
>Sequence_1
MTEITAAMVKELRESTGAGMMDCKNALSETQHEWAYK
If your file contains multiple sequences, repeat the same pattern for each entry:
>Sequence_1
MTEITAAMVKELRESTGAGMMDCKNALSETQHEWAYK
>Sequence_2
MVLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHF
How to save a FASTA file
After formatting the header and sequence lines, use Save As in a text editor and save the file as plain text with a .fasta or .fa extension. If the editor appends .txt, choose plain text output explicitly and rename the file so the final filename ends in .fasta or .fa.
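The same save step can be scripted. The following Python sketch assumes the .txt file holds a single raw sequence with no header; the function name `save_as_fasta` and the default header are hypothetical. It writes a new file next to the original so the final filename ends in .fasta rather than .txt.

```python
from pathlib import Path


def save_as_fasta(txt_path, header: str = "Sequence_1") -> Path:
    """Write a one-record FASTA file next to a plain text sequence file."""
    source = Path(txt_path)
    # Join all whitespace-separated chunks so line breaks and spaces disappear.
    sequence = "".join(source.read_text(encoding="utf-8").split())
    # with_suffix swaps the extension: example.txt -> example.fasta
    target = source.with_suffix(".fasta")
    target.write_text(f">{header}\n{sequence}\n", encoding="utf-8")
    return target
```

Calling `save_as_fasta("example.txt")` would leave the original file in place and create `example.fasta` in the same folder.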
Use a converter when the input is messy
If the text contains numbering, spaces, or inconsistent formatting, use a dedicated converter to clean the sequences and generate headers automatically. ProteinIQ supports pasted text and uploaded files, so it is useful when manual editing would be slow or error-prone.
FASTA format rules
FASTA files are simple, but a few rules matter for downstream tool compatibility.
- Header line: Each sequence starts with a single header line beginning with >. The identifier should be unique within the file.
- Sequence line: Put the sequence directly below the header. Many tools accept wrapped lines, but one continuous line per sequence is often easier to inspect.
- Valid characters: Use standard nucleotide codes such as A, C, G, T, U, and N, or standard one-letter amino acid codes for proteins.
- Alignment gaps: Gapped alignment FASTA files may contain - or . gap characters. Enable Preserve alignment gaps when converting aligned sequences.
- Protein stops: Translated protein FASTA records sometimes end with * to mark a stop codon. Enable Preserve stop codons if you need to keep that terminal marker.
- No numbering or spaces: Remove residue numbers, tabs, spaces, and other non-sequence characters unless a tool explicitly allows them.
- Plain text file: Save the file as plain text before renaming it to .fasta or .fa.
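Several of these rules can be checked mechanically before handing a file to another tool. The sketch below is one possible checker, not a reference validator: the name `check_fasta`, the `extra_allowed` parameter, and the message wording are assumptions for illustration. It takes the identifier to be the first whitespace-delimited word after >.

```python
def check_fasta(fasta_text: str, extra_allowed: str = "") -> list:
    """Return a list of human-readable problems found in FASTA text.

    extra_allowed lists non-letter characters to tolerate in sequence
    lines, for example "-." for alignment gaps or "*" for stop markers.
    """
    problems = []
    seen_ids = set()
    have_header = False
    for number, line in enumerate(fasta_text.splitlines(), start=1):
        if not line.strip():
            continue  # blank lines are ignored in this sketch
        if line.startswith(">"):
            have_header = True
            words = line[1:].split()
            identifier = words[0] if words else ""
            if not identifier:
                problems.append(f"line {number}: header has no identifier")
            elif identifier in seen_ids:
                problems.append(f"line {number}: duplicate identifier {identifier}")
            seen_ids.add(identifier)
        elif not have_header:
            problems.append(f"line {number}: sequence before any header")
        else:
            # Anything that is not an ASCII letter or explicitly allowed is flagged.
            bad = sorted(
                {c for c in line if not ((c.isascii() and c.isalpha()) or c in extra_allowed)}
            )
            if bad:
                problems.append(f"line {number}: invalid characters {bad}")
    return problems


if __name__ == "__main__":
    for problem in check_fasta(">a desc\nATGC\n>a\nAT G1\n"):
        print(problem)
```

An empty list means none of the checked rules were violated; it does not confirm that the residues are valid for a particular sequence type.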
Which FASTA converter should I use?
Use TXT to FASTA when your input is a plain text sequence, copied sequence block, or .txt file. Use a more specific converter when the source file has a structured format with metadata or quality scores.
| Input format | Best converter | Use when |
|---|---|---|
| TXT or pasted sequence text | TXT to FASTA | You need to create a FASTA file from raw DNA, RNA, or protein text. |
| CSV or spreadsheet-like text | CSV to FASTA | Sequence IDs and sequences are stored in columns. |
| FASTQ sequencing reads | FASTQ to FASTA | You need to remove quality-score lines from sequencing-read data. |
| GenBank records | GenBank to FASTA | You want sequence records from annotated GenBank files. |
| GenBank feature tables | GenBank Feature Extractor | You need coding sequences, genes, or other annotated features before FASTA conversion. |
| PDB structure files | PDB to FASTA | You need the amino acid sequence from a protein structure. |
FAQ
How do I convert a text file to FASTA format?
Upload the .txt file or paste its contents, choose DNA, RNA, Protein, or Auto-detect, and run the converter. The result can be copied or downloaded as a .fasta file with headers and wrapped sequence lines.
How do I save a TXT file as FASTA?
Add a header line that starts with > above each sequence, put the sequence on the next line, and save the file as plain text with a .fasta or .fa extension. If your editor appends .txt, rename the file so it ends in .fasta or .fa. This converter can generate the headers and the downloaded file for you.
What characters are valid in DNA, RNA, and protein FASTA?
DNA uses A, C, G, T, and ambiguity codes such as N, R, and Y. RNA uses U in place of T. Protein uses one-letter amino acid codes such as A, C, D, E, and M, plus ambiguity and extended codes such as X, B, Z, J, O, and U.
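These alphabets overlap, which is why guessing the sequence type from characters alone is only a heuristic. The Python sketch below shows one simple approach under stated assumptions: the `guess_sequence_type` name and the check order (DNA, then RNA, then protein) are choices made for this example and are not taken from ProteinIQ.

```python
DNA_CODES = set("ACGTRYSWKMBDHVN")        # IUPAC nucleotide codes with T
RNA_CODES = set("ACGURYSWKMBDHVN")        # same codes with U in place of T
PROTEIN_CODES = set("ACDEFGHIKLMNPQRSTVWYXBZJOU")  # 20 standard + extended codes


def guess_sequence_type(sequence: str) -> str:
    """Guess 'DNA', 'RNA', 'protein', or 'unknown' from the characters used."""
    residues = set(sequence.upper())
    if not residues:
        return "unknown"
    # Order matters: every nucleotide code is also a valid protein letter,
    # so nucleotide alphabets are tested first.
    if residues <= DNA_CODES:
        return "DNA"
    if residues <= RNA_CODES:
        return "RNA"
    if residues <= PROTEIN_CODES:
        return "protein"
    return "unknown"


if __name__ == "__main__":
    for example in ("ATGGCCN", "AUGGCC", "MVLSPADKTNVKAAW", "ATG1"):
        print(example, "->", guess_sequence_type(example))
```

A short peptide made only of letters such as A, C, G, and T would be labeled DNA by this heuristic, so selecting the sequence type explicitly is safer when it is known.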
Can FASTA contain gaps?
Yes. Aligned FASTA files often contain - or . gap characters to keep homologous positions in the same columns, while raw unaligned FASTA usually should not. Enable Preserve alignment gaps when your input is already an alignment.
Should I use TXT to FASTA or FASTQ to FASTA?
Use TXT to FASTA for raw sequences copied from notes, spreadsheets, or plain text files. Use FASTQ to FASTA when your input is sequencing-read data, since FASTQ includes quality-score lines that need a dedicated converter to strip.
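To show why FASTQ needs its own handling, here is a minimal Python sketch that drops the quality-score lines. It assumes the common four-line layout (an @ header, the sequence, a + separator, and the quality string) with no wrapped sequences; the function name `fastq_to_fasta` is hypothetical, and a real converter would add validation and filtering.

```python
def fastq_to_fasta(fastq_text: str) -> str:
    """Convert four-line FASTQ records to FASTA by dropping quality lines."""
    lines = [line for line in fastq_text.splitlines() if line.strip()]
    if len(lines) % 4 != 0:
        raise ValueError("expected a multiple of four non-blank lines")
    records = []
    for start in range(0, len(lines), 4):
        header, sequence, separator, _quality = lines[start:start + 4]
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed record starting at non-blank line {start + 1}")
        # The FASTQ '@' marker becomes the FASTA '>' marker.
        records.append(f">{header[1:]}\n{sequence}")
    return "\n".join(records) + "\n" if records else ""


if __name__ == "__main__":
    reads = "@read1\nACGT\n+\nIIII\n@read2 extra\nGGCA\n+read2\n!!!!\n"
    print(fastq_to_fasta(reads), end="")
```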
Related tools

CSV to FASTA
Convert CSV and TSV files containing sequence data to FASTA format with flexible column mapping and automatic delimiter detection

GenBank Feature Extractor
Extract sequence features (CDS, mRNA, gene, etc.) from GenBank files in FASTA format with support for spliced features

FASTA to FASTQ Converter
Convert FASTA sequence files to FASTQ format with mock quality scores

FASTQ to FASTA converter
Convert standard FASTQ reads to FASTA with validation, IUPAC nucleotide support, average-quality filtering, and downloadable summaries

GenBank to FASTA Converter
Convert GenBank files to FASTA format

DNA to Protein Converter
Translate DNA sequences to protein sequences using genetic code

DNA to RNA converter
Convert DNA sequences to RNA (transcription) - replaces T with U

Protein to DNA converter
Reverse translate protein sequences to possible DNA sequences

RNA to DNA converter
Convert RNA sequences to DNA (reverse transcription) - replaces U with T

One-to-Three Converter
Convert single-letter amino acid codes to three-letter codes
