
FASTA to FASTQ Converter
Convert FASTA sequencing files to FASTQ format with mock quality scores. Upload a FASTA file or paste your sequences below. Configure quality score generation methods to customize your output.
FASTA and FASTQ are the two most common formats for storing biological sequences. While FASTA contains only sequence identifiers and nucleotide/amino acid sequences, FASTQ also has quality scores for each base. Converting from FASTA to FASTQ means adding quality information where none previously existed.
This conversion is necessary when downstream tools require FASTQ input but your sequences are in FASTA format. Many sequence alignment tools (like BWA or Bowtie2), quality control pipelines, and assembly programs expect FASTQ files. Our converter generates synthetic quality scores, which is useful for reference sequences, synthetic constructs, or sequences from sources that don't provide quality data.
We recommend this tool for testing pipelines, working with reference sequences, or preparing synthetic DNA designs for tools that require FASTQ format. If your original data had quality scores that were lost during processing, consider re-obtaining the original FASTQ files instead.
Quality scores in FASTQ files use the Phred scale, developed during the Human Genome Project. The Phred score represents the probability of a sequencing error at each position using a logarithmic formula:
$$Q = -10 \log_{10} P$$
Where $Q$ is the Phred quality score and $P$ is the probability of the base call being incorrect. This can be rearranged to calculate error probability from a known quality score:
$$P = 10^{-Q/10}$$
| Phred Score | Error Probability | Base Call Accuracy | Typical Use |
|---|---|---|---|
| Q10 | 1 in 10 (10%) | 90% | Minimum threshold for most analyses |
| Q20 | 1 in 100 (1%) | 99% | Standard quality threshold |
| Q30 | 1 in 1,000 (0.1%) | 99.9% | High-quality data |
| Q40 | 1 in 10,000 (0.01%) | 99.99% | Excellent sequencing data |
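The rows of this table follow directly from the Phred relationship, Q = -10 · log10(P). The short Python sketch below converts in both directions; the function names are illustrative and are not part of the converter itself.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Return the error probability P for a Phred score Q (P = 10^(-Q/10))."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Return the Phred score Q for an error probability P (Q = -10 * log10(P))."""
    if not 0 < p <= 1:
        raise ValueError("error probability must be in the interval (0, 1]")
    return -10 * math.log10(p)


if __name__ == "__main__":
    # Print the error probability and accuracy for the scores listed in the table.
    for q in (10, 20, 30, 40):
        p = phred_to_error_prob(q)
        print(f"Q{q}: error probability {p:.4%}, accuracy {1 - p:.4%}")
```

Because the scale is logarithmic, every 10-point increase in Q corresponds to a tenfold drop in error probability.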
Modern Illumina sequencing typically produces reads with average quality scores between Q30 and Q40. Oxford Nanopore and PacBio long-read technologies have historically had lower average scores (Q10-Q20) but continue to improve.
FASTQ files encode quality scores as single ASCII characters for compact storage. The most common encoding (Phred+33, used by Illumina 1.8+) adds 33 to the Phred score and converts to the corresponding ASCII character:
| Phred Score | ASCII Code (Q + 33) | Character |
|---|---|---|
| Q0 | 33 | ! |
| Q10 | 43 | + |
| Q20 | 53 | 5 |
| Q30 | 63 | ? |
| Q40 | 73 | I |
Our converter uses Phred+33 encoding, which is the current standard for all major sequencing platforms.
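As a rough illustration of Phred+33, the Python sketch below encodes a list of scores into a quality string and decodes it back. The helper names are hypothetical, and the upper bound of 93 is simply the largest score that still maps to a printable ASCII character (`~`, code 126).

```python
PHRED33_OFFSET = 33   # ASCII offset used by Phred+33
MAX_PHRED33 = 93      # 93 + 33 = 126, the last printable ASCII character


def encode_phred33(scores):
    """Turn a sequence of integer Phred scores into a Phred+33 quality string."""
    chars = []
    for q in scores:
        if not 0 <= q <= MAX_PHRED33:
            raise ValueError(f"Phred score out of range for Phred+33: {q}")
        chars.append(chr(q + PHRED33_OFFSET))
    return "".join(chars)


def decode_phred33(quality):
    """Turn a Phred+33 quality string back into a list of integer scores."""
    return [ord(ch) - PHRED33_OFFSET for ch in quality]


if __name__ == "__main__":
    print(encode_phred33([0, 10, 20, 30, 40]))
    print(decode_phred33("!+5?I"))
```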
A FASTQ file contains four lines per sequence:

    @SEQ_ID description
    GATTTGGGGTTCAAAGCAGTATCGATCAAATAGTAAATCCATTTGTTCAACTCACAGTTT
    +
    !''*((((***+))%%%++)(%%%%).1***-+*''))**55CCF>>>>>>CCCCCCC65

- Line 1 begins with @, containing the sequence identifier and optional description
- Line 2 contains the raw sequence letters
- Line 3 begins with +, optionally repeating the identifier
- Line 4 contains the encoded quality scores, one character per base in line 2

When converting from FASTA, our tool preserves your original sequence headers and generates line 4 based on your selected quality score method.
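To make the four-line layout concrete, here is a minimal Python sketch that assembles one FASTQ record from a header, a sequence, and a list of Phred scores. `format_fastq_record` is a hypothetical helper written for illustration; it is not the converter's actual code.

```python
def format_fastq_record(header, sequence, scores):
    """Build one four-line FASTQ record with Phred+33 encoded qualities.

    header   -- identifier and optional description, without the leading '@'
    sequence -- the bases (or residues) as a single string
    scores   -- one integer Phred score per character of `sequence`
    """
    if len(scores) != len(sequence):
        raise ValueError("need exactly one quality score per base")
    quality = "".join(chr(q + 33) for q in scores)
    # Line 1: '@' + header, line 2: sequence, line 3: '+', line 4: qualities.
    return f"@{header}\n{sequence}\n+\n{quality}\n"


if __name__ == "__main__":
    print(format_fastq_record("SEQ_ID description", "GATTTGGG", [40] * 8), end="")
```

The length check matters because many FASTQ parsers reject records whose quality line is shorter or longer than the sequence line.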
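High quality (Q40) assigns Q40 to every position, representing excellent sequencing data with 99.99% accuracy per base. Use this for reference sequences and synthetic DNA.
Assigns Q20 to every position, representing the standard quality threshold (99% accuracy). Use this when testing a pipeline whose real input is expected to be of about this quality.
Assigns Q10 to every position, representing marginal quality (90% accuracy). Use this when testing how a pipeline handles marginal-quality input.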
Set any Phred score from 0 to 40, applied uniformly across all bases. This provides fine-grained control for specific testing scenarios or when you have prior knowledge about expected data quality.
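The uniform methods (Q40, Q20, Q10, or a custom value) all reduce to repeating one Phred+33 character for the length of the sequence. A small Python sketch, assuming the 0 to 40 range described here and using a hypothetical function name:

```python
def uniform_quality(length, q=40):
    """Return a Phred+33 quality string with the same score at every position."""
    if length < 0:
        raise ValueError("length must be non-negative")
    if not 0 <= q <= 40:
        raise ValueError("Phred score must be between 0 and 40")
    # One character per base: chr(q + 33) repeated `length` times.
    return chr(q + 33) * length


if __name__ == "__main__":
    print(uniform_quality(8))       # Q40 at every position
    print(uniform_quality(8, 20))   # Q20 at every position
    print(uniform_quality(8, 10))   # Q10 at every position
```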
Generates random quality scores between 0 and 40 for each base independently. This produces position-to-position variation in quality and is useful for building diverse test datasets.
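One way to sketch per-base random scores in Python is shown below. It assumes each score is drawn uniformly and independently from 0 to 40; the distribution the converter actually uses is not specified here, and the `seed` parameter is an addition for reproducible tests.

```python
import random


def random_scores(length, low=0, high=40, seed=None):
    """Draw one independent integer Phred score per base from [low, high]."""
    if length < 0:
        raise ValueError("length must be non-negative")
    if not 0 <= low <= high <= 93:
        raise ValueError("scores must satisfy 0 <= low <= high <= 93")
    rng = random.Random(seed)  # private generator, so global state is untouched
    return [rng.randint(low, high) for _ in range(length)]


if __name__ == "__main__":
    scores = random_scores(20, seed=1)
    # Encode as Phred+33 for use as line 4 of a FASTQ record.
    print("".join(chr(q + 33) for q in scores))
```

Passing a fixed seed yields the same scores on every run, which keeps test fixtures stable.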
Simulates the characteristic quality degradation seen in Illumina sequencing, where bases near the end of reads typically have lower quality than those at the beginning. The score starts at Q40 and gradually decreases toward Q10 at the end of each sequence.
This pattern reflects how sequencing chemistry degrades during read extension and is useful for testing quality trimming algorithms or simulating realistic Illumina data.
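The declining profile can be approximated with a simple ramp. The Python sketch below assumes a linear decrease from Q40 at the first base to Q10 at the last; the exact curve the converter applies is not stated, so treat the linear shape and the function name as assumptions.

```python
def declining_scores(length, start=40, end=10):
    """Return integer Phred scores falling linearly from `start` to `end`."""
    if length < 0:
        raise ValueError("length must be non-negative")
    if length == 0:
        return []
    if length == 1:
        return [start]  # a single base gets the starting score
    step = (end - start) / (length - 1)
    # Position 0 gets `start`, the last position gets `end`.
    return [round(start + step * i) for i in range(length)]


if __name__ == "__main__":
    scores = declining_scores(12)
    print(scores)
    print("".join(chr(q + 33) for q in scores))  # Phred+33 quality string
```

A quality trimmer run on reads built this way should cut each read at roughly the same relative position, which makes the expected result easy to check.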
If your original data had real quality scores, recovering the original FASTQ files is preferable to generating synthetic scores. Real quality data reflects actual sequencing confidence and improves downstream analysis accuracy.
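You should use this tool for testing pipelines, working with reference sequences, or preparing synthetic DNA designs for tools that require FASTQ format.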
If you have FASTQ data and need FASTA format, use our FASTQ to FASTA converter, which removes quality scores rather than synthesizing them.
Many bioinformatics tools require FASTQ format even when quality filtering isn't the primary goal. Reference sequences, synthetic constructs, and Sanger-sequenced data often exist only in FASTA format but need to be processed by FASTQ-dependent pipelines.
For most alignment tools, uniform high-quality scores (Q40) will not negatively impact alignment. Quality scores primarily affect variant calling and consensus building, where they weight the confidence of each base. For simple alignment or format compatibility, synthetic scores work well.
For reference sequences and synthetic DNA, use High quality (Q40). For pipeline testing, match the expected quality profile of your real data. Use Declining quality to simulate Illumina-like behavior or Random scores for diverse test datasets.
Multi-sequence FASTA files are supported. Paste or upload one, and each sequence will be converted to FASTQ format with independent quality score generation.
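For readers who want to reproduce a multi-sequence conversion offline, the following Python sketch parses FASTA text (including sequences wrapped over several lines) and writes one FASTQ record per sequence. It uses a uniform score for simplicity; the function names and the default of Q40 are assumptions made for this example.

```python
def parse_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]  # keep identifier and description as-is
            chunks = []
        elif header is not None:
            chunks.append(line)  # wrapped sequence lines are concatenated
    if header is not None:
        yield header, "".join(chunks)


def fasta_to_fastq(fasta_text, q=40):
    """Convert FASTA text to FASTQ text with a uniform Phred+33 score."""
    records = []
    for header, seq in parse_fasta(fasta_text.splitlines()):
        quality = chr(q + 33) * len(seq)  # generated separately per sequence
        records.append(f"@{header}\n{seq}\n+\n{quality}\n")
    return "".join(records)


if __name__ == "__main__":
    example = ">seq1 first example\nACGT\nACGT\n>seq2\nGGCC\n"
    print(fasta_to_fastq(example), end="")
```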