FASTA to FASTQ Converter

Convert FASTA sequence files to FASTQ format with mock quality scores. Upload a FASTA file or paste your sequences, then choose a quality score generation method to customize the output.

What is FASTA to FASTQ conversion?

FASTA and FASTQ are the two most common formats for storing biological sequences. While FASTA contains only sequence identifiers and nucleotide/amino acid sequences, FASTQ also has quality scores for each base. Converting from FASTA to FASTQ means adding quality information where none previously existed.

This conversion is necessary when downstream tools require FASTQ input but your sequences are in FASTA format. Many sequence alignment tools (like BWA or Bowtie2), quality control pipelines, and assembly programs expect FASTQ files. Our converter generates synthetic quality scores, which is useful for reference sequences, synthetic constructs, or sequences from sources that don't provide quality data.

We recommend this tool for testing pipelines, working with reference sequences, or preparing synthetic DNA designs for tools that require FASTQ format. If your original data had quality scores that were lost during processing, consider re-obtaining the original FASTQ files instead.

Understanding Phred quality scores

Quality scores in FASTQ files use the Phred scale, developed during the Human Genome Project. A Phred score expresses the estimated probability of a base-calling error at a given position on a logarithmic scale:

$Q = -10 \log_{10}(P)$

Where $Q$ is the Phred quality score and $P$ is the probability of the base call being incorrect. This can be rearranged to calculate error probability from a known quality score:

$P = 10^{-Q/10}$
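
The two formulas above can be checked numerically. The following is a minimal Python sketch; the function names `error_prob_to_phred` and `phred_to_error_prob` are illustrative assumptions and are not part of the converter.

```python
import math


def error_prob_to_phred(p: float) -> float:
    """Q = -10 * log10(P), for an error probability 0 < P <= 1."""
    if not 0 < p <= 1:
        raise ValueError("error probability must be in the range (0, 1]")
    return -10 * math.log10(p)


def phred_to_error_prob(q: float) -> float:
    """P = 10^(-Q/10), the rearranged form of the formula above."""
    return 10 ** (-q / 10)
```

For example, an error probability of 0.001 maps to a score of about 30, and a score of 20 maps back to an error probability of about 0.01.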

Quality score interpretation

Phred Score	Error Probability	Base Call Accuracy	Typical Use
Q10	1 in 10 (10%)	90%	Minimum threshold for most analyses
Q20	1 in 100 (1%)	99%	Standard quality threshold
Q30	1 in 1,000 (0.1%)	99.9%	High-quality data
Q40	1 in 10,000 (0.01%)	99.99%	Excellent sequencing data

Modern Illumina sequencing typically produces reads with average quality scores between Q30 and Q40. Long-read technologies such as Oxford Nanopore and PacBio have often produced lower average scores (Q10 to Q20), though their accuracy continues to improve.

ASCII encoding in FASTQ

FASTQ files encode quality scores as single ASCII characters for compact storage. The most common encoding (Phred+33, used by Illumina 1.8+) adds 33 to the Phred score and converts to the corresponding ASCII character:

Q0 → ASCII 33 → !
Q10 → ASCII 43 → +
Q20 → ASCII 53 → 5
Q30 → ASCII 63 → ?
Q40 → ASCII 73 → I

Our converter uses Phred+33 encoding, which is the current standard for all major sequencing platforms.
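
The Phred+33 mapping listed above can be sketched in a few lines of Python. The names `PHRED_OFFSET`, `phred_to_char`, and `char_to_phred` are hypothetical helpers chosen for illustration; the upper bound of 93 is the largest score that still maps to a printable ASCII character (`~`, ASCII 126).

```python
PHRED_OFFSET = 33  # Phred+33 encoding


def phred_to_char(q: int) -> str:
    """Encode one Phred score as a single ASCII character."""
    if not 0 <= q <= 93:
        raise ValueError("Phred+33 can only encode scores from 0 to 93")
    return chr(q + PHRED_OFFSET)


def char_to_phred(c: str) -> int:
    """Decode one quality character back to its Phred score."""
    return ord(c) - PHRED_OFFSET


# Reproduces the list above: Q0 -> !, Q10 -> +, Q20 -> 5, Q30 -> ?, Q40 -> I
encoded = "".join(phred_to_char(q) for q in (0, 10, 20, 30, 40))
```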

How the FASTQ format works

A FASTQ file contains four lines per sequence:

@SEQ_ID description
GATTTGGGGTTCAAAGCAGTATCGATCAAATAGTAAATCCATTTGTTCAACTCACAGTTT
+
!''*((((***+))%%%++)(%%%%).1***-+*''))**55CCF>>>>>>CCCCCCC65

Line 1: Header starting with @, containing the sequence identifier and optional description
Line 2: The nucleotide sequence (A, T, G, C, N)
Line 3: A separator line starting with +, optionally repeating the identifier
Line 4: Quality scores, one ASCII character per base (must match sequence length exactly)
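
The four-line layout described above can be checked programmatically. The sketch below is a simplified parser, assuming unwrapped records (exactly four lines each, no sequence wrapped over several lines); `parse_fastq` is a hypothetical name and not part of the converter.

```python
from typing import List, Tuple


def parse_fastq(text: str) -> List[Tuple[str, str, str]]:
    """Return (header, sequence, quality) tuples from four-line FASTQ text."""
    lines = text.rstrip("\n").splitlines()
    if len(lines) % 4 != 0:
        raise ValueError("FASTQ input must contain four lines per record")
    records = []
    for i in range(0, len(lines), 4):
        header, seq, separator, qual = lines[i:i + 4]
        if not header.startswith("@"):
            raise ValueError(f"line {i + 1}: header must start with '@'")
        if not separator.startswith("+"):
            raise ValueError(f"line {i + 3}: separator must start with '+'")
        if len(seq) != len(qual):
            raise ValueError(f"line {i + 4}: quality length must match sequence length")
        records.append((header[1:], seq, qual))
    return records
```

A record whose quality line is shorter or longer than its sequence line raises `ValueError`, which mirrors the length requirement on line 4.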

When converting from FASTA, our tool preserves your original sequence headers and generates line 4 based on your selected quality score method.
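
As a rough illustration of this conversion, the following Python sketch parses FASTA text, keeps each header unchanged, and fills line 4 with a uniform quality character. It is an assumption-laden sketch rather than our tool's actual implementation; `parse_fasta` and `fasta_to_fastq` are hypothetical names.

```python
from typing import Iterator, Tuple


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs; wrapped sequence lines are joined."""
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def fasta_to_fastq(text: str, q: int = 40) -> str:
    """Convert FASTA text to FASTQ text with one uniform Phred+33 score."""
    qual_char = chr(q + 33)
    records = []
    for header, seq in parse_fasta(text):
        # Line 1: header, line 2: sequence, line 3: separator, line 4: quality
        records.append(f"@{header}\n{seq}\n+\n{qual_char * len(seq)}\n")
    return "".join(records)
```

Because the quality string is built from `len(seq)`, line 4 always matches the sequence length, including for multi-sequence input.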

Score generation methods

High quality (Q40)

Assigns Q40 to every position, representing excellent sequencing data with 99.99% accuracy per base. Use this when:

Working with high-confidence reference sequences from databases like NCBI RefSeq
Testing pipelines where quality filtering shouldn't remove any data
Preparing synthetic DNA sequences designed in silico

Medium quality (Q20)

Assigns Q20 to every position, representing the standard quality threshold (99% accuracy). Use this when:

Simulating typical sequencing data for pipeline testing
Working with sequences that may have some uncertainty
Creating test datasets that should pass basic quality filters

Low quality (Q10)

Assigns Q10 to every position, representing marginal quality (90% accuracy). Use this when:

Testing how your pipeline handles low-quality data
Simulating degraded samples or challenging sequencing conditions
Validating quality filtering steps in your workflow

Custom quality score

Set any single Phred score from 0 to 40, applied uniformly across all bases. This provides fine-grained control for specific testing scenarios or when you have prior knowledge about expected data quality.
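
The uniform methods (Q40, Q20, Q10, and custom) all reduce to repeating one character. A minimal sketch, assuming Phred+33 encoding and the 0 to 40 range stated above; `uniform_quality` is an illustrative name only.

```python
def uniform_quality(length: int, q: int) -> str:
    """Build a quality string that assigns the same Phred score to every base."""
    if not 0 <= q <= 40:
        raise ValueError("score must be between 0 and 40")
    return chr(q + 33) * length


high = uniform_quality(4, 40)    # "IIII"
medium = uniform_quality(4, 20)  # "5555"
low = uniform_quality(4, 10)     # "++++"
```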

Random scores (Q0-Q40)

Generates a random quality score between 0 and 40 for each base independently. This introduces base-to-base variation in quality and is useful for:

Testing quality-aware alignment algorithms
Benchmarking quality trimming tools
Creating diverse test datasets
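
Independent per-base random scores could be generated along these lines. This sketch assumes a uniform distribution over Q0 to Q40, which the description above does not specify; `random_quality` is a hypothetical helper, and the optional `rng` argument exists only to make test data reproducible.

```python
import random
from typing import Optional


def random_quality(length: int, rng: Optional[random.Random] = None) -> str:
    """Draw one Phred score in [0, 40] per base and encode it as Phred+33."""
    if rng is None:
        rng = random.Random()
    # randint is inclusive on both ends, so Q0 and Q40 can both occur
    return "".join(chr(rng.randint(0, 40) + 33) for _ in range(length))


# A seeded generator yields the same quality string on every run
sample = random_quality(50, random.Random(0))
```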

Declining quality

Simulates the characteristic quality degradation seen in Illumina sequencing, where bases near the end of reads typically have lower quality than those at the beginning. The score starts at Q40 and gradually decreases toward Q10 at the end of each sequence.

This pattern reflects how sequencing chemistry degrades during read extension and is useful for testing quality trimming algorithms or simulating realistic Illumina data.
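
One way to produce such a profile is linear interpolation from Q40 at the first base to Q10 at the last. The exact curve is not specified above, so the linear shape, the rounding, and the name `declining_quality` are all assumptions made for illustration.

```python
def declining_quality(length: int, q_start: int = 40, q_end: int = 10) -> str:
    """Build a quality string that falls linearly from q_start to q_end."""
    if length <= 0:
        return ""
    if length == 1:
        return chr(q_start + 33)
    scores = [
        round(q_start + (q_end - q_start) * i / (length - 1))
        for i in range(length)
    ]
    return "".join(chr(q + 33) for q in scores)


# Four bases get Q40, Q30, Q20, Q10
profile = declining_quality(4)  # "I?5+"
```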

When should you use the FASTA to FASTQ converter?

If your original data had real quality scores, recovering the original FASTQ files is preferable to generating synthetic scores. Real quality data reflects actual sequencing confidence and improves downstream analysis accuracy.

You should use this tool for:

Converting reference sequences for use in alignment pipelines
Preparing synthetic DNA designs for tools requiring FASTQ input
Testing bioinformatics pipelines with controlled quality profiles
Creating training or benchmark datasets

If you have FASTQ data and need FASTA format, use our FASTQ to FASTA converter, which removes quality scores rather than synthesizing them.

FAQ

Why would I need synthetic quality scores?

Many bioinformatics tools require FASTQ format even when quality filtering isn't the primary goal. Reference sequences, synthetic constructs, and Sanger-sequenced data often exist only in FASTA format but need to be processed by FASTQ-dependent pipelines.

Do synthetic quality scores affect alignment accuracy?

For most alignment tools, uniform high-quality scores (Q40) will not negatively impact alignment. Quality scores primarily affect variant calling and consensus building, where they weight the confidence of each base. For simple alignment or format compatibility, synthetic scores work well.

Which quality score method should I choose?

For reference sequences and synthetic DNA, use High quality (Q40). For pipeline testing, match the expected quality profile of your real data. Use Declining quality to simulate Illumina-like behavior or Random scores for diverse test datasets.

Can I convert multiple sequences at once?

Yes. Paste or upload a multi-sequence FASTA file, and each sequence will be converted to FASTQ format with independent quality score generation.
