DNA Shuffle

Shuffle DNA sequences while preserving nucleotide, dinucleotide, or k-mer composition.

What is DNA Shuffle?

DNA Shuffle generates randomized DNA sequences that preserve specific compositional properties of the original sequence. Three shuffling methods are available: mononucleotide shuffling preserves exact nucleotide counts (A, T, C, G), dinucleotide shuffling preserves all 16 dinucleotide frequencies, and k-mer shuffling preserves the frequencies of all subsequences of a chosen length k.

Shuffled sequences serve as statistical controls in bioinformatics analyses. When testing whether a sequence property (such as predicted secondary structure stability or regulatory motif enrichment) is significant, comparing against randomized sequences with matching composition provides a null distribution for hypothesis testing.

How to use DNA Shuffle online

ProteinIQ runs DNA shuffling directly in the browser and returns results immediately, with no installation or account required.

Input

Input	Description
`DNA sequences`	One or more sequences in FASTA format, or a raw sequence without headers. Only A, T, C, G nucleotides are accepted.

Settings

Shuffle options

Setting	Description
`Shuffle method`	Algorithm for randomization. `Mononucleotide` (default) preserves single nucleotide counts. `Dinucleotide` preserves all 16 dinucleotide frequencies. `K-mer` preserves frequencies of specified k-mer length.
`K-mer size`	Size of k-mers to preserve (2–6), only used with `K-mer` method. Larger values constrain the shuffle more heavily.
`Number of shuffles`	How many randomized sequences to generate per input (1–100). Multiple shuffles provide replicates for statistical analyses.
`Random seed`	Seed value for reproducibility. A value of 0 picks a random seed on each run; setting a specific nonzero seed ensures identical output across runs with the same input and settings.

Output formatting

Setting	Description
`Output case`	`Uppercase` (default) or `Lowercase` for output sequences.
`Add suffix to headers`	Appends `_shuffled` to FASTA headers, or the numbered form `_shuffled_N` when several shuffles are generated per input. Enabled by default.
`Line length`	Characters per line in output (0–200, default 80). Set to 0 for no line wrapping.

Output

FASTA-formatted sequences with shuffled nucleotide order. When generating multiple shuffles per input, each receives a numbered suffix.
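The output formatting options can be illustrated with a short Python sketch. This is not the tool's own code: `format_fasta` and its parameters are hypothetical names, and the exact rule for where the suffix is placed in a header is an assumption based on the settings described above.

```python
def format_fasta(header, shuffles, line_length=80, add_suffix=True, lowercase=False):
    """Format shuffled sequences as FASTA text (illustrative sketch only).

    Assumption: a single shuffle gets "_shuffled", several shuffles get
    "_shuffled_1", "_shuffled_2", ... appended to the full header.
    """
    records = []
    for n, seq in enumerate(shuffles, start=1):
        name = header
        if add_suffix:
            name += "_shuffled" if len(shuffles) == 1 else f"_shuffled_{n}"
        seq = seq.lower() if lowercase else seq.upper()
        if line_length > 0:
            # Wrap the sequence into fixed-width lines
            seq = "\n".join(
                seq[i:i + line_length] for i in range(0, len(seq), line_length)
            )
        # line_length == 0 leaves the sequence on a single line
        records.append(f">{name}\n{seq}")
    return "\n".join(records)


print(format_fasta("seq1", ["acgtac", "cagtca"], line_length=4))
```

With `line_length=0` each sequence stays on one line, matching the "no line wrapping" setting.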

How DNA Shuffle works

Mononucleotide shuffling

The simplest method uses the Fisher-Yates algorithm to randomly permute all nucleotides. The result has identical nucleotide counts but completely randomized order, destroying any dinucleotide or higher-order patterns.
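A minimal Python sketch of a Fisher-Yates shuffle on a nucleotide string is shown below. The function name `mono_shuffle` is hypothetical, and treating a seed of 0 as "unseeded" simply mirrors the `Random seed` setting described above; it is not taken from the tool's source.

```python
import random
from collections import Counter
from typing import Optional


def mono_shuffle(seq: str, seed: Optional[int] = None) -> str:
    """Return a random permutation of seq using the Fisher-Yates algorithm."""
    # Assumption: seed 0 (or None) means "use a fresh random seed"
    rng = random.Random(seed if seed else None)
    bases = list(seq)
    # Walk from the last position down to 1, swapping each element with a
    # uniformly chosen element at or before it
    for i in range(len(bases) - 1, 0, -1):
        j = rng.randint(0, i)  # randint is inclusive on both ends
        bases[i], bases[j] = bases[j], bases[i]
    return "".join(bases)


original = "ACGTACGGTTAACC"
shuffled = mono_shuffle(original, seed=42)
# Nucleotide counts are unchanged by construction
print(Counter(shuffled) == Counter(original))
```

Python's built-in `random.shuffle` implements the same idea; the loop is written out here only to make the swap step visible.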

Dinucleotide shuffling

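Dinucleotide shuffling uses the Altschul-Erickson algorithm, which models the sequence as a directed multigraph. Each nucleotide (A, T, C, G) becomes a vertex, and each dinucleotide occurrence in the sequence becomes a directed edge from its first base to its second. The shuffled sequence is reconstructed by sampling a random Eulerian path through this graph, that is, a path that traverses each edge exactly once. The algorithm starts the path at the original first nucleotide and ends it at the original last nucleotide, so the two end bases are unchanged.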

Because the graph preserves all dinucleotide transitions from the original sequence, the shuffled output maintains the same dinucleotide composition. This matters for RNA folding analyses where stacking energies depend on adjacent base pairs.
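The Eulerian-path idea can be sketched in Python as follows. This is an illustrative reading of the Altschul-Erickson procedure, not the tool's implementation: the names `dinuc_shuffle`, `_reaches`, and `dinuc_counts` are hypothetical, and the simple retry loop for choosing each vertex's final exit edge is an assumption made for brevity.

```python
import random
from collections import Counter, defaultdict


def _reaches(v, target, exit_edge):
    """Follow the chosen final exit edges from v; True if they lead to target."""
    seen = set()
    while v != target:
        if v in seen:  # ran into a cycle that never reaches the target
            return False
        seen.add(v)
        v = exit_edge[v]
    return True


def dinuc_shuffle(seq, seed=None):
    """Shuffle seq while keeping every dinucleotide count (sketch)."""
    rng = random.Random(seed)
    # One directed edge per dinucleotide occurrence: base -> next base
    succ = defaultdict(list)
    for a, b in zip(seq, seq[1:]):
        succ[a].append(b)
    if not succ:
        return seq  # sequences shorter than 2 have nothing to shuffle
    first, last = seq[0], seq[-1]
    others = [v for v in succ if v != last]
    # Pick, for every vertex except the final one, the edge it will use last.
    # The choice is valid only if those edges all lead to the final vertex.
    while True:
        exit_edge = {v: rng.choice(succ[v]) for v in others}
        if all(_reaches(v, last, exit_edge) for v in others):
            break
    # Randomize the remaining edges of each vertex, keeping its exit edge last
    order = {}
    for v, targets in succ.items():
        pool = list(targets)
        if v in exit_edge:
            pool.remove(exit_edge[v])
        rng.shuffle(pool)
        if v in exit_edge:
            pool.append(exit_edge[v])
        order[v] = pool
    # Walk the graph from the first base, consuming each vertex's edges in order
    used = {v: 0 for v in order}
    out, cur = [first], first
    for _ in range(len(seq) - 1):
        nxt = order[cur][used[cur]]
        used[cur] += 1
        out.append(nxt)
        cur = nxt
    return "".join(out)


def dinuc_counts(seq):
    return Counter(zip(seq, seq[1:]))


original = "ACGTTGCAACGGTCATGCA"
shuffled = dinuc_shuffle(original, seed=7)
print(dinuc_counts(shuffled) == dinuc_counts(original))
```

Fixing each vertex's final exit edge so that all of them lead to the last base is what guarantees the walk uses every edge before it stops.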

K-mer shuffling

The generalized Euler algorithm extends dinucleotide shuffling to arbitrary k-mer sizes. Instead of single nucleotides, the graph uses (k-1)-mers as vertices. Each k-mer occurrence in the original sequence adds a directed edge from its prefix (k-1)-mer to its suffix (k-1)-mer. A random Eulerian path through this graph yields a sequence with the same k-mer frequencies as the original.

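Larger k values impose stronger constraints. With k=6, the shuffled sequence maintains the same hexanucleotide composition as the original, which may be important when short patterns such as restriction sites need to occur as often as in the original. Codon usage is preserved only approximately, because k-mer counts ignore reading frame. Larger k also leaves fewer distinct shuffles, so a short sequence may come back unchanged.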
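To make the (k-1)-mer graph concrete, the Python sketch below builds its edge multiset and checks whether two sequences share the same k-mer counts. The helper names `kmer_graph`, `kmer_counts`, and `preserves_kmers` are hypothetical and the example sequences are made up for illustration.

```python
from collections import Counter


def kmer_graph(seq, k):
    """Edge multiset of the graph: one (prefix, suffix) edge per k-mer."""
    edges = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        # Edge from the leading (k-1)-mer to the trailing (k-1)-mer
        edges[(kmer[:-1], kmer[1:])] += 1
    return edges


def kmer_counts(seq, k):
    """Count every overlapping k-mer in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def preserves_kmers(original, shuffled, k):
    """True if both sequences have identical k-mer counts."""
    return kmer_counts(original, k) == kmer_counts(shuffled, k)


original = "ACGACTAC"
# A different Eulerian path through the same graph for k=3
print(preserves_kmers(original, "ACTACGAC", 3))   # True
# Same nucleotide counts, but the 3-mers differ
print(preserves_kmers(original, "AACCGTAC", 3))   # False
```

With k=2 the vertices are single nucleotides, so this reduces to the dinucleotide graph.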

Applications

Shuffled sequences commonly serve as negative controls for:

Motif discovery: Testing whether identified patterns occur more frequently than expected by chance
RNA structure prediction: Determining if predicted folding stability exceeds that of composition-matched random sequences
Regulatory element analysis: Validating that putative binding sites show genuine enrichment
Alignment scoring: Establishing background distributions for sequence similarity statistics

Dinucleotide shuffling is particularly important for RNA analyses because secondary structure free energies depend heavily on stacking interactions between adjacent bases. Mononucleotide-shuffled controls may have systematically different folding energies simply due to altered dinucleotide composition.
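As a rough sketch of how shuffled controls become a null distribution, the Python example below estimates an empirical p-value for a motif count. The function names are hypothetical, the motif count is only a stand-in for whatever statistic an analysis uses, and a mononucleotide shuffle is used for brevity; for RNA folding statistics a dinucleotide-preserving shuffle would be substituted.

```python
import random


def count_motif(seq, motif):
    """Count overlapping occurrences of motif in seq."""
    return sum(
        1 for i in range(len(seq) - len(motif) + 1) if seq.startswith(motif, i)
    )


def empirical_p_value(seq, motif, n_shuffles=1000, seed=0):
    """Fraction of shuffled controls with at least as many motif hits."""
    rng = random.Random(seed)
    observed = count_motif(seq, motif)
    at_least_as_extreme = 0
    for _ in range(n_shuffles):
        bases = list(seq)
        rng.shuffle(bases)  # composition-matched control (mononucleotide)
        if count_motif("".join(bases), motif) >= observed:
            at_least_as_extreme += 1
    # Add-one correction so the estimate is never exactly zero
    return (at_least_as_extreme + 1) / (n_shuffles + 1)


sequence = "TATAAATATAAAGCGCTATAAAGGCCTATAAA"
print(empirical_p_value(sequence, "TATAAA", n_shuffles=500, seed=1))
```

The smallest p-value this can report is 1 / (n_shuffles + 1), so the number of shuffles limits the resolution of the test.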

Related tools

DNA mutator

Generate batches of mutated DNA variants from one or more FASTA sequences. Create substitution, insertion, deletion, or mixed variant libraries with reproducible settings.

DNAGenIQ - Random DNA sequence generator

Generate random DNA sequences with customizable length, GC content, and restriction sites for molecular cloning and testing purposes.

ProtGenIQ - Random protein sequence generator

Generate random protein sequences with customizable length, composition, and amino acid properties

RNAGenIQ - Random RNA sequence generator

Generate random RNA sequences with customizable types and structural features

Filter DNA

Clean and filter DNA sequences by removing or replacing non-standard nucleotide characters. Supports multiple filter modes including standard 4 bases, IUPAC ambiguity codes, and custom character sets.

GenBank Feature Extractor

Extract sequence features (CDS, mRNA, gene, etc.) from GenBank files in FASTA format with support for spliced features

Reverse complement generator

Generate reverse, complement, or reverse-complement of DNA/RNA sequences

CSV to FASTA

Convert CSV and TSV files containing sequence data to FASTA format with flexible column mapping and automatic delimiter detection

DNA to Protein Converter

Translate DNA sequences to protein sequences using genetic code

DNA to RNA converter

Convert DNA sequences to RNA (transcription) - replaces T with U
