Comparing Genomes

A key challenge of modern evolutionary biology is to find a way to link the evolution of DNA sequences, which we are now able to study in great detail, with the evolution of the complex morphological characters used to construct a traditional phylogeny. Many different genes make contributions to complex characters like feathers, and making the connection between a specific change in one of them and a modification in a morphological character is particularly difficult. Comparing genomes (entire DNA sequences) of different species provides a powerful new tool to explore these relationships. Genomes are more than instruction books for building and maintaining an organism; they contain vast amounts of information on the history of life. The growing number of fully sequenced genomes in all kingdoms is leading to a revolution in comparative evolutionary biology. It is now possible to explore the genetic differences between species in a very direct way, examining, one-by-one, the footprints of the evolutionary path between different species.

Over a hundred different prokaryotic genomes have been completed, and at least 18 different eukaryotic genomes have been or are being sequenced. As the draft (preliminary) sequences of these and other genomes become available in the next few years, our view of the evolution of life on Earth can be expected to become far clearer and our knowledge of phylogenetic relationships far more certain. The few genomic sequences that are already completed give exciting clues of what is to come.

The Tiger Pufferfish

The draft sequence of the tiger pufferfish (Fugu rubripes) genome was completed in 2002, making it only the second vertebrate genome to be fully sequenced. For the first time, we were able to compare the genomes of two vertebrates. Some human and pufferfish genes have been conserved during the 450 million years since mammals and teleost fish diverged, while other genes are unique to each species. About 25% of human genes have no counterparts in Fugu. There have also been extensive rearrangements over that time, indicating a considerable scrambling of gene order. The human genome also has more repetitive DNA: repetitive DNA accounts for less than one-sixth of the Fugu sequence.

The pufferfish genome, 365 million base pairs (365 Mb) in length, has only one-ninth as much DNA as the human genome, although both vertebrate species have approximately the same number of genes. Why is there so much extra DNA in humans? Much of it appears to be in the form of introns that are substantially bigger than those in pufferfish. The Fugu genome has a handful of “giant” genes containing long introns; studying them should provide insight into what evolutionary forces have driven the change in genome size during vertebrate evolution.

Sequences that are conserved between human and pufferfish provide valuable clues for understanding the genetic basis of many human diseases. Amino acids critical to protein function tend to be preserved over the course of evolution, and changes at such sites within genes are more likely to cause disease. It is difficult to distinguish functionally conserved sites in the protein sequences of humans when comparing human proteins to those of other mammals, as there has not been enough time for sufficient changes to accumulate at nonconserved sites. Because the pufferfish genome is more distantly related to humans, conserved sequences are far more easily distinguished.
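The reasoning in the paragraph above can be illustrated with a minimal Python sketch. It assumes the protein sequences have already been aligned, column by column, and the three short sequences are invented for illustration only; they are not real human, mammalian, or pufferfish proteins, and the function names are hypothetical.

```python
def conserved_positions(seq_a, seq_b):
    """Return 0-based alignment columns where both sequences carry the same residue."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    return [
        i
        for i, (a, b) in enumerate(zip(seq_a, seq_b))
        if a == b and a != "-"  # a shared gap is not a conserved residue
    ]


def percent_identity(seq_a, seq_b):
    """Identical columns as a percentage of columns with no gap in either sequence."""
    compared = sum(1 for a, b in zip(seq_a, seq_b) if a != "-" and b != "-")
    if compared == 0:
        return 0.0
    return 100.0 * len(conserved_positions(seq_a, seq_b)) / compared


# Invented, pre-aligned fragments (assumption: toy data, not real proteins).
human = "MKTAYIAKQR"
mammal = "MKTAYIAKQR"      # close relative: no differences have accumulated yet
pufferfish = "MRTSYLAKHR"  # distant relative: only some sites still match

print(percent_identity(human, mammal))         # every column matches
print(conserved_positions(human, pufferfish))  # the sites that survived
```

In this toy case the close relative matches at every column, so the comparison cannot separate constrained sites from unconstrained ones. The distant comparison leaves a smaller set of matching columns, which are the candidates for functionally important residues.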

The Mouse

Later in 2002, a draft sequence of the mouse (Mus musculus) genome was completed by an international consortium of investigators, allowing for the first time a comparison of two mammalian genomes. The human genome has about 400 million more nucleotides than the mouse genome. A comparison of the two genomes reveals that both have about 30,000 genes, and they share the bulk of them: the human genome shares 99% of its genes with mice. Humans and mice diverged about 75 million years ago, too little time for many evolutionary differences to accumulate. Only about 300 genes are unique to one organism or the other, about 1% of the total. Most of the nearly 150 genes unique to mice are linked with the sense of smell, which is highly developed in rodents, and with reproduction; mice produce frequent, large litters. The genomes of humans and mice are so similar that the best explanation for why a mouse develops into a mouse and not a human is that the genes are expressed at different times and possibly in different tissues.

Comparison of the mouse and human genomes reveals that since mice and humans last shared a common ancestor about 75 million years ago, mouse DNA has mutated about twice as fast as human DNA. This is a fascinating observation in search of an explanation. The difference in generation time between mice and humans could account for some of this difference, as mice would have had more opportunities to mix and match genomic components during meiosis.

Perhaps the most unexpected finding in comparing the mouse and human genomes lies in the similarities between the “junk” DNA, mostly retrotransposons, in the two species. This DNA does not code for proteins. A survey of the location of retrotransposon DNA in both species shows that it has independently ended up in comparable regions of the genome. It’s beginning to look like this “junk” DNA may have more of a function than was previously assumed. The possibility that it is rich in regulatory RNA sequences such as those described in chapter 18 is being actively investigated. In one such study, researchers collected almost all of the RNA transcripts made by mouse cells taken from every tissue. While most of the transcripts code for mouse proteins, as many as 4,280 could not be matched to any known mouse protein. This suggests that a large part of the transcribed genome consists of nonprotein-encoding genes, that is, of transcripts that function as RNA.

A draft of the rat genome has just been completed, and there may be even more exciting news about the evolution of mammalian genomes ahead.

Variation in the organization of genomes is as intriguing as gene sequence differences. Over long segments of chromosomes, the linear order of mouse and human genes is the same—the common ancestral sequence has been preserved in both species. This conservation of synteny was anticipated from earlier gene mapping studies, and provides strong evidence that evolution actively shapes the organization of the mammalian genome.

The Chimpanzee


The chimpanzee genome project is still underway. Humans and chimps diverged from a common ancestor only about 5 million years ago, too little time for much genetic differentiation to evolve between the two species. Preliminary sequence comparisons indicate that chimp DNA is 98.7% identical to human DNA. If just the gene sequences encoding proteins are considered, the similarity increases to 99.2%. How could two species differ so much in body and behavior, and yet have almost equivalent sets of genes?

One potential answer to this question is provided by the observation that chimp and human genomes show very different patterns of gene transcription activity, at least in brain cells. Investigators used gene chips containing up to 18,000 human genes. Fluid extracted from living brain cells was washed over the array of genes on the chip. Each gene lights up if a transcript of that gene is present in the fluid. The more copies, the more intense the signal. Because the chimp genome is so similar, the chip could detect the activity of chimp genes reasonably well. While the same genes were transcribed in chimp and human brain cells, the levels of transcription varied widely. It would seem that much of the difference between humans and chimps lies in which genes are transcribed, and when.
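The kind of comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the gene names and signal intensities are invented, the twofold cutoff is an arbitrary choice, and the source does not say how the investigators actually analyzed their chip data.

```python
import math


def log2_fold_changes(human, chimp):
    """log2(human signal / chimp signal) for every gene detected in both samples."""
    shared = human.keys() & chimp.keys()
    # Signals are assumed to be strictly positive intensities.
    return {gene: math.log2(human[gene] / chimp[gene]) for gene in sorted(shared)}


def differentially_transcribed(human, chimp, min_fold=2.0):
    """Genes whose signal differs by at least min_fold in either direction."""
    threshold = math.log2(min_fold)
    changes = log2_fold_changes(human, chimp)
    return [gene for gene, lfc in changes.items() if abs(lfc) >= threshold]


# Hypothetical chip intensities (assumption: toy numbers, not measured data).
human_signal = {"geneA": 800.0, "geneB": 120.0, "geneC": 50.0}
chimp_signal = {"geneA": 200.0, "geneB": 110.0, "geneC": 200.0}

print(log2_fold_changes(human_signal, chimp_signal))
print(differentially_transcribed(human_signal, chimp_signal))
```

All three hypothetical genes are transcribed in both samples, yet two of them differ fourfold in signal. That is the pattern the paragraph describes: the same genes, transcribed at very different levels.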

Humans have one fewer pair of chromosomes than chimpanzees, gorillas, and orangutans. It’s not that we have lost a chromosome. Rather, at some point, two mid-sized ape chromosomes fused to make what is now human chromosome 2, the second largest chromosome in our genome.

The fusion leading to human chromosome 2 is an example of the sort of genome reorganization that has occurred in many species. Rearrangements like this can provide evolutionary clues, but are not always definitive proof of how closely related two species are. Consider the organization of known orthologous genes (that is, genes with the same ancestral sequence) shared by humans, chickens, and mice. One study estimated that 72 chromosome rearrangements had occurred since the chicken and human last shared a common ancestor. This is substantially fewer than the estimated 128 rearrangements between chicken and mouse or 171 between mouse and human. Does this mean that chickens and humans are more closely related than mice and humans or mice and chickens? No, what these data actually show is that chromosome rearrangements have occurred at a much lower frequency in humans and chickens than in mice. Chromosomal rearrangements in mice seem to have occurred at twice the rate seen in humans.
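One way to see how the three pairwise counts lead to this conclusion is to split them among the three lineages. The Python sketch below assumes the counts are simply additive along the branches of the tree, with no rearrangement later undone or counted twice. That is a simplification introduced here for illustration and not necessarily the method used in the study.

```python
def branch_counts(d_hc, d_cm, d_mh):
    """Split three pairwise rearrangement counts among three lineages.

    If counts add along branches, each pairwise count is the sum of two
    branch totals, for example d_hc = human + chicken. Solving the three
    equations gives the formulas below.
    """
    human = (d_hc + d_mh - d_cm) / 2
    chicken = (d_hc + d_cm - d_mh) / 2
    mouse = (d_cm + d_mh - d_hc) / 2
    return {"human": human, "chicken": chicken, "mouse": mouse}


# Pairwise estimates quoted in the text.
branches = branch_counts(d_hc=72, d_cm=128, d_mh=171)
mouse_to_human = branches["mouse"] / branches["human"]

print(branches)
print(round(mouse_to_human, 2))
```

Under this assumption, most of the rearrangements separating mouse from the other two species fall on the mouse lineage. The human and mouse lineages have had the same amount of time since their common ancestor, so the ratio of their branch totals, close to 2, can be read as a ratio of rates, in line with the twofold difference stated above.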

This finding of marked differences in the rate of chromosomal rearrangement among vertebrates raises new questions about genome evolution that are currently being explored. Identifying genomes that have undergone relatively slow chromosome change is most helpful in reconstructing the hypothetical genomes of ancestral vertebrates. If regions of chromosomes have changed little in distantly related vertebrates over the last 300 million years, it is reasonable to hypothesize that the common ancestor's genome was organized in a similar way in those regions.
