Overview of transcription chemistry

Diagram of a chromosome, with three genes shown. Genes (pink) are interspersed with non-gene sequence (black). Genes are transcribed into RNA (green). RNAs may be translated into protein (blue), or they may serve other purposes in the cell (functional RNA). Information flows from DNA to RNA to protein.
Figure 1 The Central Dogma of molecular genetics states that genetic information stored in DNA can be used to make more DNA through the process of replication and can be used to make RNA through transcription. RNA can be used as is by the cell, or it can be used to make protein through the process of translation. Here, genes (pink) are interspersed with non-gene sequence (black). Genes are the regions of the genome that are transcribed into RNA (green). RNAs may be translated into protein (blue), or they may serve other purposes in the cell (functional RNA).

Transcription is the process of RNA synthesis, and RNA molecules are also called transcripts.

Within a chromosome, gene sequences are interspersed with intergenic sequence (in-between sequences that are not transcribed into RNA), as shown in Figure 1.

The relative abundance of gene to intergenic sequence varies among organisms. In bacteria, most of the genome encodes protein. In eukaryotes the protein-coding fraction is much lower: in humans, only about 2% of the genome encodes protein. Some of the remaining 98% encodes functional RNAs, but the function of the rest is not well understood. However, some of the intergenic regions include sequences that are important for the regulation of gene expression. When a gene is transcribed, it is said to be expressed. Not all genes are expressed in every cell type.

Diagram depicting the biochemistry of elongation during DNA synthesis. The alpha phosphate of incoming nucleotide triphosphate is connected to the 3' end of a growing daughter strand. In the process, the beta and gamma phosphates are lost.
Figure 2 The chemistry of DNA and RNA synthesis. Both replication and transcription use a single-stranded DNA template. New strands are synthesized from nucleotide triphosphates. The beta and gamma phosphates are lost as a phosphodiester bond is formed between the 3’ end of the growing chain and the incoming nucleotide.

The biochemistry of transcription is similar to replication: one strand of DNA is used as a template to synthesize a daughter strand with a complementary sequence. In both replication and transcription, nucleotide triphosphates are used as the building block for the new polymeric daughter strand. In both replication and transcription, synthesis proceeds from the 5’ to 3’ end, with nucleotides added on to the 3’ end of a growing polymer. Once nucleotides are incorporated into the polymer, they are called nucleotide residues. The chemistry of synthesis is discussed in more detail in the Replication module and is illustrated in Figure 2.
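The shared logic of template-directed synthesis can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `synthesize` and the example template sequence are hypothetical, and the template is assumed to be written 3’ to 5’ so that the daughter strand comes out 5’ to 3’. The RNA product uses uracil where a DNA product would use thymine.

```python
# Pairing partner chosen for each DNA template base, per product type.
# In an RNA product, uracil (U) takes the place of thymine (T).
PARTNER = {
    "DNA": {"A": "T", "T": "A", "G": "C", "C": "G"},
    "RNA": {"A": "U", "T": "A", "G": "C", "C": "G"},
}


def synthesize(template_3_to_5: str, product: str = "RNA") -> str:
    """Build the daughter strand (5'->3') on a DNA template written 3'->5'."""
    daughter = ""
    for base in template_3_to_5.upper():
        # Each new residue is added to the 3' end of the growing chain.
        daughter += PARTNER[product][base]
    return daughter


template = "TACGGATTC"  # hypothetical template, written 3'->5'
print(synthesize(template, "DNA"))  # ATGCCTAAG
print(synthesize(template, "RNA"))  # AUGCCUAAG
```

The two calls differ only in the pairing table, which mirrors the point that the chemistry of chain growth is the same in replication and transcription.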

However, there are notable differences between replication and transcription: there are differences in chemical composition of DNA and RNA, replication results in a double-helical DNA molecule while transcription produces a single-stranded RNA, replication must faithfully replicate the entire genome in every dividing cell but transcription copies only a small part of the genome, and transcription is regulated so that only a subset of genes are transcribed at a given time in any cell.

A) Diagrams of ribose (in RNA) and deoxyribose (in DNA). Both have a pentagon shape with oxygen at the top point of the pentagon. Both have an OH at carbons 1 and 3 and a CH2OH attached to carbon 4 (the carbon of this group is carbon 5). The difference is that ribose has an OH at carbon 2 and deoxyribose has an H at carbon 2. B) Diagrams of thymine (T in DNA) and uracil (U in RNA). Both have a single hexagon ring containing carbons and nitrogens. Both have a double-bonded O at the top carbon and at the bottom left carbon. The difference is that the top right carbon has an H in uracil and a CH3 in thymine.
Figure 3 Chemical differences between DNA and RNA. A (left): DNA contains the sugar deoxyribose, while RNA contains the sugar ribose. Ribose has one additional OH on the 2’ carbon. B (right): DNA contains the base thymine, while RNA contains the base uracil. Thymine has one additional methyl (-CH3) group.

Chemical differences between DNA and RNA

First, the chemical structure of DNA and RNA nucleotides differs: DNA (deoxyribonucleic acid) nucleotides use deoxyribose as the sugar. RNA (ribonucleic acid) nucleotides use ribose. DNA nucleotides include the bases adenine, thymine, guanine, and cytosine, but RNA nucleotides include the base uracil instead of thymine. Uracil and thymine are structurally similar and differ only by one additional methyl group in thymine (Figure 3). This methyl group does not affect base pairing: Both thymine and uracil base pair with adenine.

Three dimensional structure of RNA

Second, DNA usually exists as a stable double-helical structure with two complementary strands. But most functional RNA is single-stranded. The bases can – and do! – form base pairs with one another. Because of the single-stranded nature of RNA, intra-strand base pairs often form, and the three-dimensional structure of folded RNA molecules can be quite variable. An example is shown in Figure 4.

Molecular structures showing four levels of RNA structure. Primary structure refers to the sequence of nucleotides. Secondary structures are recognizable features like a stem loop or pseudoknot created by the folding of the single-stranded RNA. The three-dimensional structure of an RNA molecule is called the tertiary structure. It can include elements of secondary structure. Some functional RNA complexes are made up of multiple RNA molecules. In those cases, the RNA complex has quaternary structure.
Figure 4 Illustration of levels of three-dimensional structure of RNA molecules: primary, secondary, tertiary, and quaternary. The tertiary and quaternary structures shown here are of the VS ribozyme, an RNA enzyme responsible for a particular mode of replication in the mitochondria of Neurospora crassa.

The structure of RNA molecules, like that of proteins, can be described in terms of primary, secondary, tertiary, and quaternary structures. The primary structure (1°) is the sequence of the nucleotide residues. The secondary structure (2°) is a collection of recognizable structural elements, including double helices and stem loops, that can form when one RNA molecule folds and base-pairs internally. The tertiary structure (3°) is the three-dimensional structure of the whole, folded RNA molecule, which can include elements of secondary structure. Some higher-order complexes can form from multiple RNA molecules. In those cases, quaternary structure (4°) refers to the resulting multimeric structure.
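To make the step from primary to secondary structure concrete, the sketch below checks whether the two ends of a short RNA can pair with each other to form a stem loop. It is a toy model, not a structure-prediction method: the function names, the example sequence, and the minimum loop size of three unpaired bases are all assumptions made for illustration, and only A-U and G-C pairs are considered. The result is written in dot-bracket notation, where matching parentheses mark paired bases and dots mark unpaired ones.

```python
# Base pairs allowed in this simplified model (A-U and G-C only).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def stem_length(rna: str, min_loop: int = 3) -> int:
    """Count consecutive base pairs formed by pairing the two ends inward."""
    i, j = 0, len(rna) - 1
    paired = 0
    # Pair base i with base j while they are complementary and at least
    # `min_loop` unpaired bases would remain between them (assumed limit).
    while j - i - 1 >= min_loop and (rna[i], rna[j]) in PAIRS:
        paired += 1
        i += 1
        j -= 1
    return paired


def dot_bracket(rna: str) -> str:
    """Describe the hairpin as a stem of '(' and ')' around a loop of '.'."""
    n = stem_length(rna)
    return "(" * n + "." * (len(rna) - 2 * n) + ")" * n


rna = "GGCACUUCGGUGCC"  # hypothetical sequence: stem, 4-base loop, stem
print(rna)
print(dot_bracket(rna))  # (((((....)))))
```

Here the sequence itself is the primary structure, and the dot-bracket string is a compact description of one element of secondary structure.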

Only parts of the genome are transcribed

Finally, although both strands of the entire genome must be replicated in each cell cycle, only part of the genome is transcribed into RNA: genes are interspersed between long stretches of intergenic DNA, as shown in Figure 1. In addition, usually only one strand of the double helix will be used as an RNA template (Figure 5). Within and around a gene, the strands of the DNA double helix are therefore referred to as the template and nontemplate strands.

Diagram of a transcription bubble. Parent double helix strands are melted apart to form a "bubble" of single-stranded DNA in the middle of the image. The nontemplate strand is on the top and the template strand is on the bottom. An RNA molecule is shown in the process of being transcribed. The 3' end is paired with the bottom strand of the transcription bubble, but the 5' end extends outward from the bubble and is unpaired.
Figure 5 Diagram of a transcription bubble. During elongation, the bacterial RNA polymerase tracks along the DNA template, synthesizes mRNA in the 5’ to 3’ direction, and unwinds and rewinds the DNA as it is read. In this image, the ellipsis “…” indicates additional bases not shown.
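The relationship between the template strand, the nontemplate strand, and the RNA can be checked with a short Python sketch. The helper names and the example sequence are hypothetical, and both DNA strands are assumed to be written 5’ to 3’. Because the RNA is complementary to the template, it should carry the same sequence as the nontemplate strand, with U in place of T.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna: str) -> str:
    """Return the complementary DNA strand, also written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def transcribe_from_template(template_5_to_3: str) -> str:
    """Return the RNA (5'->3') made from a template strand written 5'->3'."""
    # The RNA is complementary and antiparallel to the template,
    # and it contains uracil wherever DNA would contain thymine.
    return reverse_complement(template_5_to_3).replace("T", "U")


nontemplate = "ATGCCTAAG"                   # hypothetical sequence, 5'->3'
template = reverse_complement(nontemplate)  # the opposite strand, 5'->3'
rna = transcribe_from_template(template)

print(template)  # CTTAGGCAT
print(rna)       # AUGCCUAAG
# The RNA matches the nontemplate strand except for U in place of T.
print(rna == nontemplate.replace("T", "U"))  # True
```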

Because only certain parts of the genome are destined to be transcribed, RNA synthesis depends on sequences in the DNA that signal where the transcription machinery should bind (and which strand to use as a template), where transcription should begin, and where transcription should end. These signal sequences, called DNA elements, are important to the function of a gene, despite not being incorporated into the RNA itself. They often serve as binding sites for components of the transcriptional machinery.

The process of transcription can be broken into three stages: initiation, elongation, and termination. The rest of this module looks at the mechanism of transcription and the DNA elements controlling RNA production.
