ORF Finder
Identify all potential protein-coding regions (Open Reading Frames) in your DNA sequences across all six reading frames.
What is ORF Finder?
An open reading frame (ORF) is a stretch of DNA that begins with a start codon and ends with a stop codon, representing a potential protein-coding region. ORF Finder scans DNA sequences to identify all such regions across all six reading frames—three on the forward strand and three on the reverse complement.
Because DNA is double-stranded and codons are read in triplets, any sequence can be read in six different ways. A start codon (typically ATG) signals where translation might begin, while stop codons (TAA, TAG, or TGA in the standard genetic code) mark termination. ORF Finder systematically locates these boundaries and reports the predicted protein translations.
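The six readings described above can be sketched in a few lines of Python. This is only an illustration of the idea, not ProteinIQ's implementation. The function names `reverse_complement` and `six_frames` are hypothetical, and the sketch assumes unambiguous bases (A, C, G, T, plus N).

```python
# Complement lookup for plain DNA bases; N stays N (assumption: no other IUPAC codes).
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def six_frames(seq):
    """Split a sequence into codons for all six reading frames.

    Keys are (strand, frame) pairs such as ("+", 1) or ("-", 3).
    Trailing bases that do not fill a whole codon are dropped.
    """
    frames = {}
    forward = seq.upper()
    for strand, s in (("+", forward), ("-", reverse_complement(forward))):
        for offset in range(3):
            codons = [s[i:i + 3] for i in range(offset, len(s) - 2, 3)]
            frames[(strand, offset + 1)] = codons
    return frames


frames = six_frames("ATGAAATAG")
print(frames[("+", 1)])  # codons of forward frame 1
print(frames[("-", 1)])  # codons of frame 1 on the reverse complement
```

Reverse-strand frames are read from the reverse complement here, so their codon positions are relative to that strand, not to the original input.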
Finding ORFs is often the first step in gene annotation for newly sequenced DNA, though the presence of an ORF doesn't guarantee it encodes a functional protein—additional evidence like expression data or sequence conservation is usually needed to confirm coding potential.
How to use ORF Finder online
ProteinIQ's ORF Finder runs entirely in the browser: sequences are processed locally, without uploading data to a server.
Input
| Input | Description |
|---|---|
| DNA Sequence | One or more DNA sequences in FASTA format or raw nucleotide text. Supports files up to 50 MB. |
Settings
Search parameters
| Setting | Description |
|---|---|
| Minimum ORF length | Filter out short ORFs. Options: 75 nt (25 aa), 150 nt (50 aa), 225 nt (75 aa), 300 nt (100 aa). Default is 75 nt. |
| Genetic code | NCBI translation table for codon interpretation. 25 codes available, from standard eukaryotic to various mitochondrial and bacterial variants. |
| Start codon mode | Which codons initiate translation: ATG only (canonical), ATG + Alternative (includes TTG, CTG, GTG), or Any sense codon (for finding all potential reading frames). |
| Strand | Search Both strands, Forward only (+), or Reverse only (-). |
| Ignore nested ORFs | When enabled, suppresses smaller ORFs that fall entirely within a larger ORF on the same reading frame. |
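To show how the minimum length, start codon, and nested-ORF settings interact, here is a minimal forward-strand sketch. It is not the tool's actual code. The name `find_orfs` and its parameters are hypothetical. The sketch assumes the standard stop codons, counts length from the start codon up to but not including the stop codon (so 75 nt corresponds to 25 aa), and reports only ORFs that end in a stop codon.

```python
STOPS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code


def find_orfs(seq, starts=("ATG",), min_len_nt=75, ignore_nested=True):
    """Scan the forward strand and return (frame, start, end, length_nt) tuples.

    Coordinates are 1-based and inclusive. Assumption of this sketch: the
    stop codon is excluded from both the end coordinate and the length.
    """
    seq = seq.upper()
    orfs = []
    for offset in range(3):
        open_starts = []  # start codons seen since the last stop in this frame
        for i in range(offset, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if codon in STOPS:
                # Every start since the last stop shares this stop codon, so
                # later starts give ORFs nested inside the first one.
                kept = open_starts[:1] if ignore_nested else open_starts
                for s in kept:
                    length = i - s
                    if length >= min_len_nt:
                        orfs.append((offset + 1, s + 1, i, length))
                open_starts = []
            elif codon in starts:
                open_starts.append(i)
    return orfs


# ATG AAA ATG CCC TAA: two in-frame starts sharing one stop codon
demo = "ATGAAAATGCCCTAA"
print(find_orfs(demo, min_len_nt=6, ignore_nested=True))
print(find_orfs(demo, min_len_nt=6, ignore_nested=False))
```

Passing `starts=("ATG", "TTG", "CTG", "GTG")` would mimic the ATG + Alternative mode. Searching the reverse strand amounts to running the same scan on the reverse complement and converting the coordinates back.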
Output
Results display in an interactive table with columns for:
| Column | Description |
|---|---|
| ORF ID | Identifier combining sequence name, strand, frame, and ORF number |
| Strand | + (forward) or - (reverse complement) |
| Frame | Reading frame (1, 2, or 3) |
| Start | Nucleotide position where the ORF begins (1-based) |
| Stop | Nucleotide position where the ORF ends |
| Length (nt) | ORF length in nucleotides |
| Length (aa) | Predicted protein length in amino acids |
| Protein | Translated amino acid sequence |
Results can be exported as CSV, JSON, or FASTA (protein sequences).
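As a rough picture of what a result row and its FASTA export might look like, the sketch below serializes rows into protein FASTA records. The dictionary keys, the ORF ID string, and the header layout are all hypothetical, chosen only to mirror the columns listed above. The files the tool actually exports may be laid out differently.

```python
def to_fasta(rows, width=60):
    """Format result rows as protein FASTA text, wrapping sequences at `width`."""
    lines = []
    for row in rows:
        # Hypothetical header layout: ID, then strand, frame, and coordinates.
        header = (
            f">{row['orf_id']} strand={row['strand']} "
            f"frame={row['frame']} {row['start']}-{row['stop']}"
        )
        lines.append(header)
        protein = row["protein"]
        lines.extend(protein[i:i + width] for i in range(0, len(protein), width))
    return "\n".join(lines) + "\n"


# A made-up row for illustration only (not real tool output).
rows = [
    {
        "orf_id": "seq1_+_1_1",
        "strand": "+",
        "frame": 1,
        "start": 1,
        "stop": 12,
        "protein": "MKMP",
    }
]
print(to_fasta(rows))
```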
Genetic codes
Different organisms use different codon-to-amino-acid mappings. The standard code applies to most nuclear genes in eukaryotes and many prokaryotes, but mitochondria, plastids, and certain protists have reassigned codons.
Common genetic codes:
| Code | Name | Key differences |
|---|---|---|
| 1 | Standard | TAA, TAG, TGA are stop codons |
| 2 | Vertebrate Mitochondrial | TGA→Trp, ATA→Met, AGA/AGG→Stop |
| 4 | Mold/Protozoan Mitochondrial | TGA→Trp |
| 6 | Ciliate Nuclear | TAA/TAG→Gln |
| 11 | Bacterial/Plastid | Same codon assignments as standard, with more alternative start codons |
Select the appropriate code for the organism being analyzed—using the wrong table will produce incorrect translations.
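The effect of choosing a different table can be seen in a small translation sketch. It builds the standard code from the NCBI-style 64-letter amino acid string (codons ordered over T, C, A, G) and applies per-table overrides for the codes discussed here. The function `translate` and the `OVERRIDES` dictionary are illustrative names, not part of the tool. Only tables 1, 2, 4, 6, and 11 are covered, and start-codon handling (an alternative start codon still initiates with Met) is deliberately left out.

```python
from itertools import product

BASES = "TCAG"
# Standard code (NCBI table 1), one letter per codon in TCAG order; '*' marks a stop.
STANDARD_AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), STANDARD_AA)
}

# Codon reassignments relative to the standard code.
OVERRIDES = {
    1: {},
    2: {"TGA": "W", "AGA": "*", "AGG": "*", "ATA": "M"},  # vertebrate mitochondrial
    4: {"TGA": "W"},                                        # mold/protozoan mitochondrial
    6: {"TAA": "Q", "TAG": "Q"},                            # ciliate nuclear
    11: {},  # bacterial/plastid: same codons, differs only in allowed starts
}


def translate(seq, table=1):
    """Translate frame 1 of `seq`; codons with ambiguous bases become 'X'."""
    code = {**STANDARD, **OVERRIDES[table]}
    seq = seq.upper()
    return "".join(code.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3))


dna = "ATGTGATAA"
for table in (1, 4, 6):
    print(table, translate(dna, table))
```

The same nine bases translate differently under each table, which is why a mismatched code can split or merge predicted ORFs.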
Interpreting results
Not every ORF encodes a real protein. Consider these factors:
- Length: Longer ORFs are more likely to be genuine. In random sequence with equal base frequencies, 3 of the 64 codons are stops, so a stop codon appears roughly every 21 codons on average, and ORFs under 100 codons may arise by chance.
- Context: True genes often have regulatory elements upstream (promoters, ribosome binding sites) not detected by ORF scanning alone.
- Conservation: ORFs shared across related species are more likely functional.
- Codon bias: Coding regions often show non-random codon usage characteristic of the organism.
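The length argument in the list above can be made concrete with a short calculation. It assumes a simplified model in which all 64 codons are equally likely and independent. Real genomes deviate from this, for example through GC content, so the numbers are only a rough guide. The helper name `p_no_stop` is hypothetical.

```python
def p_no_stop(n_codons, n_stops=3):
    """Probability that n_codons random codons contain no stop codon.

    Assumes uniform, independent codons: each is a stop with probability n_stops/64.
    """
    return ((64 - n_stops) / 64) ** n_codons


# Expected spacing between stop codons in one frame under this model
mean_spacing = 64 / 3
print(f"mean codons per stop: {mean_spacing:.1f}")

# Chance that a stretch of a given length stays open purely by chance
for n in (25, 50, 100):
    print(f"{n} codons without a stop: {p_no_stop(n):.4f}")
```

Under this model, open stretches of 25 codons are common in random sequence, while stretches of 100 codons are rare in any single frame. Across all six frames of a long sequence they can still occur, which is why length alone does not confirm a gene.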
For prokaryotic sequences, ORFs closely correspond to coding sequences (CDS). Eukaryotic genes with introns require splice-aware prediction methods—ORF Finder operates on continuous sequences and won't identify genes interrupted by introns.
Related tools
- DNA to Protein: Translate DNA in a specific reading frame
- Reverse Complement: Generate the reverse complement strand
- GC Content: Analyze nucleotide composition
