Does transcription start at AUG?
The start codon is the first codon of a messenger RNA (mRNA) transcript translated by a ribosome. The start codon always codes for methionine in eukaryotes and Archaea and for N-formylmethionine (fMet) in bacteria, mitochondria and plastids.
The start codon is often preceded by a 5' untranslated region (5' UTR). In prokaryotes this includes the ribosome binding site.
Alternative start codons are different from the standard AUG codon and are found in both prokaryotes (bacteria and archaea) and eukaryotes. Alternate start codons are still translated as Met when they are at the start of a protein (even if the codon encodes a different amino acid otherwise). This is because a separate transfer RNA (tRNA) is used for initiation.
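The initiator rule described above can be illustrated with a minimal sketch. The names decode and INTERNAL_MEANING are hypothetical, only three codons are covered, and the rule is simplified: any codon used as a start is reported as Met, and exceptions to this rule are not modeled.

```python
# Amino acids these codons specify at internal positions in the standard code.
INTERNAL_MEANING = {"AUG": "Met", "GUG": "Val", "UUG": "Leu"}


def decode(codon, is_start):
    """Return the amino acid for a codon, applying the initiator rule.

    Simplification: a codon used as a start is read by the separate
    initiator tRNA and reported as "Met" (fMet in bacteria); exceptions
    to this rule are not modeled.
    """
    if is_start:
        return "Met"
    return INTERNAL_MEANING[codon]


for codon in ("AUG", "GUG", "UUG"):
    # Print the codon, its reading as a start, and its internal reading.
    print(codon, decode(codon, is_start=True), decode(codon, is_start=False))
```

In this sketch GUG is read as Met at the start of a protein and as Val anywhere else, which is the behavior the paragraph above describes.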
Alternate start codons (non-AUG) are very rare in eukaryotic genomes. However, naturally occurring non-AUG start codons have been reported for some cellular mRNAs. Seven out of the nine possible single-nucleotide substitutions at the AUG start codon of dihydrofolate reductase are functional as translation start sites in mammalian cells. In addition to the canonical Met-tRNA(Met) and AUG codon pathway, mammalian cells can initiate translation with leucine using a specific leucyl-tRNA that decodes the codon CUG.
Candida albicans uses a CAG start codon.
Prokaryotes use alternate start codons significantly, mainly GUG and UUG.
In E. coli, 83% of start codons are AUG (3542 of 4284), 14% are GUG (612) and 3% are UUG (103), with one or two others (e.g., an AUU and possibly a CUG).
Well-known coding regions that do not have AUG initiation codons are those of lacI (GUG) and lacA (UUG) in the E. coli lac operon. Two more recent studies have independently shown that 17 or more non-AUG start codons may initiate translation in E. coli.
Mitochondrial genomes use alternate start codons more extensively (AUA and AUU in humans). Many such examples, with codons, systematic range, and citations, are given in the NCBI list of translation tables.
Upstream start codons are "alternative" in the sense that they lie upstream of the regular start codon and could therefore be used in its place. More than half of all human mRNAs have at least one AUG codon upstream (uAUG) of their annotated translation initiation site (TIS) (58% in the current versions of the human RefSeq sequence). Their potential use as TISs could result in translation of so-called upstream open reading frames (uORFs). uORF translation usually results in the synthesis of short polypeptides, some of which have been shown to be functional, e.g., in ASNSD1, MIEF1, MKKS, and SLC35A4. However, it is believed that most translated uORFs have only a mild inhibitory effect on downstream translation, either because most uORF starts are leaky (i.e., they often fail to initiate translation) or because ribosomes that terminate after translating a short ORF are often capable of reinitiating.
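As a rough illustration of how uAUGs and their uORFs can be located, here is a minimal sketch. The function name find_uorfs and the toy transcript are made up for this example. It only reports where an upstream AUG and its first in-frame stop codon sit; it says nothing about whether a ribosome would actually initiate there.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def find_uorfs(mrna, annotated_start):
    """List (start, end) index pairs for AUGs upstream of annotated_start.

    start is the 0-based index of the upstream AUG; end is the index just
    past its first in-frame stop codon, or None if the transcript ends
    before an in-frame stop is reached.
    """
    uorfs = []
    for start in range(annotated_start):
        if mrna[start:start + 3] != "AUG":
            continue
        end = None
        # Walk codon by codon in the frame set by this upstream AUG.
        for i in range(start + 3, len(mrna) - 2, 3):
            if mrna[i:i + 3] in STOP_CODONS:
                end = i + 3
                break
        uorfs.append((start, end))
    return uorfs


# Toy transcript (hypothetical): a short uORF (AUG CCU UAA) followed by
# the annotated start codon at index 17.
mrna = "GGAUGCCUUAAGGCACCAUGGCUUGA"
print(find_uorfs(mrna, annotated_start=17))  # [(2, 11)]
```

Here the slice mrna[2:11] is the uORF AUGCCUUAA, which would encode a two-residue peptide before its stop codon.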
Engineered initiator tRNAs (tRNAfMet2 with CUA anticodon) have been used to initiate translation at the amber stop codon UAG. This type of engineered tRNA is called a nonsense suppressor tRNA because it suppresses the translation stop signal that normally occurs at UAG codons. One study has shown that the amber initiator tRNA does not initiate translation to any measurable degree from genomically-encoded UAG codons, only plasmid-borne reporters with strong upstream Shine-Dalgarno sites.
The genetic code is made up of codons, each a triplet of bases. The standard code has evolved over time to minimize coding errors. There are a total of 64 codons in the genetic code, one for every ordered triplet of the 4 bases in nucleic acids (4 × 4 × 4 = 64).
The genetic code is degenerate, i.e., more than one codon can code for a single amino acid. As a result, 61 of the 64 codons code for the 20 amino acids, and the remaining 3 are STOP codons.
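The counts above can be checked with a short enumeration. This is only a sketch: the variable names are arbitrary, and it assumes the three STOP codons of the standard code (UAA, UAG, UGA).

```python
from itertools import product

BASES = "UCAG"
STOP_CODONS = {"UAA", "UAG", "UGA"}  # standard genetic code

# Every ordered triplet of the four RNA bases: 4 ** 3 = 64 codons.
all_codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]

# Codons that specify an amino acid rather than a stop signal.
sense_codons = [codon for codon in all_codons if codon not in STOP_CODONS]

print(len(all_codons))    # 64
print(len(sense_codons))  # 61
```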
There are two punctuation marks in the genetic code, called the START and STOP codons, which signal the start and the end of protein synthesis, respectively, in all organisms.
The genetic code can be read in multiple ways depending on where the reading starts. For example, if the base sequence is GGGAAACCC, reading could start from the first letter, G and there will be 3 codons - GGG, AAA, and CCC. If reading starts at G in the second position, the string will have two codons - GGA and AAC. If reading starts at the third base G, 2 codons will again result - GAA and ACC.
Thus, there are 3 ways of reading the code of every strand of genetic material. These different ways of reading a nucleotide sequence are known as reading frames. Each reading frame will produce a different sequence of amino acids and hence a different protein. Since double-stranded DNA has two strands with 3 reading frames each, there are 6 possible reading frames.
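The GGGAAACCC example can be reproduced with a small sketch. The helper names codons_in_frame and reverse_complement are hypothetical, and incomplete triplets at the end of a frame are simply dropped.

```python
def codons_in_frame(seq, offset):
    """Split seq into complete triplets, starting at a 0-based offset."""
    return [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]


def reverse_complement(dna):
    """Return the opposite DNA strand, read in its own 5' to 3' direction."""
    complement = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(complement[base] for base in reversed(dna))


seq = "GGGAAACCC"

# Three frames on the given strand.
for offset in range(3):
    print("forward", offset, codons_in_frame(seq, offset))

# Three more frames on the complementary strand, six in total.
other_strand = reverse_complement(seq)
for offset in range(3):
    print("reverse", offset, codons_in_frame(other_strand, offset))
```

The forward frames come out as GGG AAA CCC, then GGA AAC, then GAA ACC, matching the example in the text. The complementary strand is GGGTTTCCC and contributes the other three frames.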
The codon AUG is called the START codon as it is the first codon in the transcribed mRNA that undergoes translation. AUG is the most common START codon and it codes for the amino acid methionine (Met) in eukaryotes and formyl methionine (fMet) in prokaryotes. During protein synthesis, the initiator tRNA recognizes the START codon AUG with the help of some initiation factors, and translation of the mRNA begins.
Some alternative START codons are found in both eukaryotes and prokaryotes. Alternate codons usually code for amino acids other than methionine, but when they act as START codons they code for Met due to the use of a separate initiator tRNA.
Non-AUG START codons are rarely found in eukaryotic genomes. Apart from the usual Met codon, mammalian cells can also start translation with the amino acid leucine with the help of a leucyl-tRNA decoding the CUG codon. Human mitochondrial genomes use AUA and AUU as alternate START codons, while prokaryotes mainly use GUG and UUG.
In prokaryotes, E. coli is found to use AUG 83%, GUG 14%, and UUG 3% as START codons. The lacA and lacI coding regions in the E. coli lac operon do not have an AUG START codon and instead use UUG and GUG as initiation codons, respectively.
There are 3 STOP codons in the genetic code - UAG, UAA, and UGA. These codons signal the end of the polypeptide chain during translation. These codons are also known as nonsense codons or termination codons as they do not code for an amino acid.
The three STOP codons are named amber (UAG), opal or umber (UGA), and ochre (UAA). The amber codon UAG was discovered by Charles Steinberg and Richard Epstein, who named it after their friend Harris Bernstein, whose last name means "amber" in German. The remaining two STOP codons were then named "ochre" and "opal" to keep the color theme.
During protein synthesis, STOP codons cause the release of the new polypeptide chain from the ribosome. This occurs because there are no tRNAs with anticodons complementary to the STOP codons; instead, protein release factors recognize them and trigger release of the chain.
After binding to the mRNA, the ribosome begins translation at the start codon, AUG, and then moves down the mRNA transcript one codon (three nucleotides) at a time until it reaches a stop codon.
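The codon-by-codon movement described above can be sketched as a toy translation routine. The function name translate and the example transcript are made up for illustration. The sketch assumes the standard genetic code and starts at the first AUG it finds, which ignores ribosome binding sites, sequence context, and alternative start codons.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acids of the standard code, in UCAG codon order;
# "*" marks the three stop codons.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Translate from the first AUG until a stop codon or the end of mrna."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, nothing is translated
    protein = []
    # Advance one codon (three nucleotides) at a time.
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break  # stop codon: the chain is released
        protein.append(amino_acid)
    return "".join(protein)


print(translate("GGAUGGCUUGGUAAGC"))  # MAW
```

In the toy transcript, the two leading bases are skipped, AUG GCU UGG are read as Met, Ala and Trp, and UAA ends the chain.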