
Sequence, chromosome structure, structural variants: evolution of genomes

While a lot of information can be gathered by comparing DNA sequences, chromosome structure and structural variants also offer a wealth of molecular information about evolution. Genomes themselves evolve, and new genes can arise through large-scale structural variants.

Over the course of evolution, genomes don’t just accumulate single nucleotide polymorphisms. Larger structural variants also accumulate. Like all mutations, structural variants can be advantageous, disadvantageous, or neutral, but most will only be maintained in a population if they confer some sort of reproductive advantage. Some large-scale chromosomal rearrangements may affect far more than just one gene at a time, and they likely play a correspondingly larger role in speciation.

Structural variants include chromosomal rearrangements, exon shuffling, gene deletions and duplications, and, in some cases, even whole-genome duplications! Duplications can result in multiple copies of a gene. Exon shuffling is a type of chromosomal rearrangement that rearranges exons. It may involve moving, duplicating, or deleting exons of a single gene, or it can involve a recombination event that links exons of two different genes.

Not all duplications result in functional genes: some may copy only part of a gene or reassemble exons in a nonproductive manner, resulting in an incomplete, nonfunctional gene. Gene duplication and exon shuffling events are followed by further mutations, so the duplicated sequences diverge from each other over time. These events can generate new genes with new functions. Genes that are related to one another through duplication within a genome are called paralogs. Sometimes, additional loss-of-function mutations accumulate, inactivating one or more of the paralogs. The resulting nonfunctional pseudogenes persist in the genome but are usually not translated into protein (Figure 14).

Pseudogenes actually make up quite a bit of most eukaryotic genomes. In humans, for example, there appear to be more pseudogenes than protein-coding genes![1]

Figure 14. Paralogs result from gene duplication. Duplicated genes can undergo further mutation over time, resulting in the divergence of the paralogs. Mutations can inactivate a paralog, resulting in a nonfunctional pseudogene. Eukaryotic genomes have many pseudogenes.

We’ve seen examples of large-scale rearrangements and paralogs in this text already: the chapter on gene expression during eukaryotic development looked at the hedgehog gene in Drosophila and its three corresponding orthologs in vertebrates: sonic hedgehog, desert hedgehog, and Indian hedgehog. The three vertebrate genes are paralogs of one another. (Remember that the word ortholog refers to a related gene found in a different species, and the word paralog refers to duplicated genes within the genome of an organism.)

The chapter on cancer biology looked at paralogs of p53 in proboscideans (elephants), which likely evolved concomitantly with an increase in body size.

Large-scale structural variants also result in genome structures that can vary considerably from species to species, so a comparison of chromosome structure among species can often give additional clues to evolutionary relationships. For example, mouse and human protein-coding genes are about 85% identical, depending on what sequences are counted. (This varies from gene to gene: some genes are >85% identical, and some are <85% identical.) But if we compare how those sequences are arranged within the genome, genome structure looks quite different, as shown in Figure 15. In Figure 15, genome synteny refers to how sequences align within two or more genomes.
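To make a figure like "85% identical" concrete, here is a minimal Python sketch of how percent identity could be computed for two sequences that have already been aligned. The function name `percent_identity` and the two short sequences are made up for illustration; they are not real human or mouse genes. The sketch also assumes that positions with a gap (`-`) in either sequence are left out of the count, which is only one of several possible conventions and is one reason the reported value depends on what is counted.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of aligned, non-gap positions at which two sequences match.

    Assumes the sequences are already aligned (same length, '-' for gaps).
    Gap columns are skipped, which is one convention among several.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip positions where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no aligned positions to compare")
    return 100.0 * matches / compared


# Hypothetical toy sequences, invented for this example.
human_like = "ATGGCCAAGTTCGGAGACCT"
mouse_like = "ATGGCTAAGTTTGGAGATCT"

print(percent_identity(human_like, mouse_like))  # 17 of 20 positions match: 85.0
```

Note that this calculation says nothing about where the sequences sit in the genome. Two genomes can score high on sequence identity and still have their genes arranged on very different chromosomes.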

Figure 15. Synteny between the human and mouse genomes. The 22 autosomes plus the X and Y of the human genome are arranged in the inner circle, each chromosome represented by a different colored arc. The mouse genome has 19 autosomes, plus X and Y. The mouse chromosomes are color-coded to match the human chromosome where a similar sequence is found.


  1. Karro, J. E. et al. Pseudogene.org: a comprehensive database and comparison platform for pseudogene annotation. Nucleic Acids Res. 35, D55–D60 (2007).