Overview of transcription chemistry
Transcription is the process of RNA synthesis, and RNA molecules are also called transcripts.
Within a chromosome, gene sequences are interspersed with intergenic sequence (in-between sequences that are not transcribed into RNA), as shown in Figure 1.
The proportion of gene to intergenic sequence varies among organisms. In bacteria, most of the genome encodes protein. In eukaryotes the fraction is much lower: in humans, only about 2% of the genome encodes protein. Some of the other 98% encodes functional RNAs, and some intergenic regions include sequences that are important for the regulation of gene expression, but the function of much of the rest is not well understood. When a gene is transcribed, it is said to be expressed. Not all genes are expressed in every cell type.
The biochemistry of transcription is similar to that of replication: one strand of DNA is used as a template to synthesize a daughter strand with a complementary sequence. In both processes, nucleoside triphosphates are the building blocks for the new polymeric daughter strand, and synthesis proceeds from the 5’ to the 3’ end, with nucleotides added to the 3’ end of a growing polymer. Once nucleotides are incorporated into the polymer, they are called nucleotide residues. The chemistry of synthesis is discussed in more detail in the Replication module and is illustrated in Figure 2.
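The shared logic of template-directed synthesis can be sketched in a few lines of Python. This is a simplified illustration, not a model of the enzymes involved: the function name `synthesize`, the two pairing tables, and the example sequence are all hypothetical. The RNA table anticipates the thymine-to-uracil substitution described in the next section.

```python
# Pairing rules: template base -> base added to the new strand.
DNA_PAIRING = {"A": "T", "T": "A", "G": "C", "C": "G"}  # replication
RNA_PAIRING = {"A": "U", "T": "A", "G": "C", "C": "G"}  # transcription


def synthesize(template_5to3, pairing):
    """Return the daughter strand (written 5'->3') for a template written 5'->3'."""
    daughter = []
    # The two strands are antiparallel, so the template is read from its 3' end.
    for base in reversed(template_5to3):
        # Each new residue is appended to the 3' end of the growing strand.
        daughter.append(pairing[base])
    return "".join(daughter)


template = "TACGGT"  # hypothetical template strand, written 5'->3'
print(synthesize(template, DNA_PAIRING))  # ACCGTA
print(synthesize(template, RNA_PAIRING))  # ACCGUA
```

The same loop serves both processes. Only the pairing table changes, which mirrors the point that the two reactions share their underlying chemistry.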
However, replication and transcription differ in notable ways. DNA and RNA differ in chemical composition. Replication produces a double-helical DNA molecule, while transcription produces a single-stranded RNA. Replication must faithfully copy the entire genome in every dividing cell, but transcription copies only a small part of the genome, and it is regulated so that only a subset of genes is transcribed in any cell at a given time.
Chemical differences between DNA and RNA
First, the chemical structure of DNA and RNA nucleotides differs. DNA (deoxyribonucleic acid) nucleotides use deoxyribose as the sugar, while RNA (ribonucleic acid) nucleotides use ribose. DNA nucleotides include the bases adenine, thymine, guanine, and cytosine, but RNA nucleotides include the base uracil instead of thymine. Uracil and thymine are structurally similar and differ only by one additional methyl group in thymine (Figure 3). This methyl group does not affect base pairing: both thymine and uracil base pair with adenine.
Three-dimensional structure of RNA
Second, DNA usually exists as a stable double helix of two complementary strands, but most functional RNA is single-stranded. Its bases can – and do! – still form base pairs with one another. Because RNA is single-stranded, intra-strand base pairs often form, and the three-dimensional structure of folded RNA molecules can be quite variable. An example is shown in Figure 4.
The structure of RNA molecules, like that of proteins, can be described in terms of primary, secondary, tertiary, and quaternary structures. The primary structure (1°) is the sequence of the nucleotide residues. The secondary structure (2°) is a collection of recognizable structural elements, including double helices and stem loops, that can form when one RNA molecule folds and base-pairs internally. The tertiary structure (3°) is the three-dimensional structure of the whole, folded RNA molecule, which can include elements of secondary structure. Some higher-order complexes can form from multiple RNA molecules. In those cases, quaternary structure (4°) refers to the resulting multimeric structure.
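To make the link between primary and secondary structure concrete, the following Python sketch checks how many intra-strand base pairs can form around a proposed loop in a single RNA sequence. It is a toy example under stated assumptions: the sequence and the function names `stem_length` and `dot_bracket` are hypothetical, only A-U and G-C pairs are counted, and no attempt is made to predict real folding energetics. The output uses dot-bracket notation, a common shorthand in which matched parentheses mark paired residues and dots mark unpaired ones.

```python
# Only standard A-U and G-C pairs are considered in this sketch.
RNA_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def stem_length(rna, loop_start, loop_end):
    """Count consecutive base pairs flanking the loop rna[loop_start:loop_end]."""
    i, j = loop_start - 1, loop_end  # residues just outside the loop
    pairs = 0
    while i >= 0 and j < len(rna) and (rna[i], rna[j]) in RNA_PAIRS:
        pairs += 1
        i -= 1  # walk toward the 5' end
        j += 1  # walk toward the 3' end
    return pairs


def dot_bracket(rna, loop_start, loop_end):
    """Describe the stem loop as a dot-bracket string."""
    n = stem_length(rna, loop_start, loop_end)
    left = loop_start - n            # unpaired residues 5' of the stem
    right = len(rna) - loop_end - n  # unpaired residues 3' of the stem
    loop = loop_end - loop_start
    return "." * left + "(" * n + "." * loop + ")" * n + "." * right


rna = "GGCAUGAAAAUGCC"  # primary structure (hypothetical), written 5'->3'
print(stem_length(rna, 5, 9))  # 5
print(rna)
print(dot_bracket(rna, 5, 9))  # (((((....)))))
```

Here the primary structure is the string itself, and the dot-bracket line is one description of its secondary structure: a five-pair stem closed by a four-residue loop.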
Only parts of the genome are transcribed
Finally, although both strands of the entire genome must be replicated in each cell cycle, only part of the genome is transcribed into RNA: genes are interspersed with long stretches of intergenic DNA, as shown in Figure 1. In addition, usually only one strand of the double helix is used as an RNA template (Figure 5). Within and around a gene, the strands of the DNA double helix are therefore referred to as the template and nontemplate strands.
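One consequence of using a single template strand is that the transcript has the same sequence as the nontemplate strand, with uracil in place of thymine. The short Python sketch below illustrates this with a hypothetical nine-base sequence. The helper names `reverse_complement` and `transcribe` are illustrative choices, and all strands are written 5'->3'.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna):
    """Return the complementary DNA strand, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def transcribe(template_5to3):
    """Return the RNA (5'->3') made from a DNA template strand (5'->3')."""
    return reverse_complement(template_5to3).replace("T", "U")


nontemplate = "ATGGCCTAA"                   # hypothetical nontemplate strand
template = reverse_complement(nontemplate)  # the opposite strand of the helix

print(template)                 # TTAGGCCAT
print(transcribe(template))     # AUGGCCUAA
print(transcribe(nontemplate))  # UUAGGCCAU

# The transcript matches the nontemplate strand, with U in place of T.
assert transcribe(template) == nontemplate.replace("T", "U")
```

The last `print` shows why the choice of strand matters: transcribing the other strand of the same stretch of DNA would yield a different RNA.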
Because only certain parts of the genome are destined to be transcribed, RNA synthesis depends on sequences in the DNA that signal where the transcription machinery should bind (and which strand to use as a template), where transcription should begin, and where transcription should end. These signal sequences, called DNA elements, are important to the function of a gene, despite not being incorporated into the RNA itself. They often serve as binding sites for components of the transcriptional machinery.
The process of transcription can be broken into three stages: initiation, elongation, and termination. The rest of this module looks at the mechanism of transcription and the DNA elements controlling RNA production.
Media Attributions
- Dogma of molecular genetics © Amanda Simons is licensed under a CC BY-SA (Attribution ShareAlike) license
- DNA polymerase © Wikipedia is licensed under a CC0 (Creative Commons Zero) license
- Ribonucleotides © OpenStax is licensed under a CC0 (Creative Commons Zero) license
- 3D structure of RNA © Wikipedia is licensed under a CC0 (Creative Commons Zero) license
- Transcription bubble © OpenStax is licensed under a CC BY-NC-SA (Attribution NonCommercial ShareAlike) license