The Human Genome Project
In 1990, American geneticists embarked on an ambitious attempt to map and ultimately sequence the entire human genome. This effort, which quickly became an international program, presented no small challenge, as the human genome is huge: more than 3 billion base pairs. To get an idea of the magnitude of the task, consider that if all 3.2 billion base pairs were written down on the pages of this book, the book would be 500,000 pages long, and it would take you about 60 years, working eight hours a day, every day, at 5 bases a second, to read it all.
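The reading-time estimate above can be checked with a few lines of arithmetic. This is a minimal sketch in Python that uses only the figures quoted in the paragraph. The variable names are illustrative, and the 365-day year is an assumption.

```python
# Back-of-the-envelope check of the reading-time estimate.
GENOME_BASES = 3_200_000_000   # 3.2 billion base pairs (from the text)
BASES_PER_SECOND = 5           # reading speed (from the text)
HOURS_PER_DAY = 8              # reading hours per day (from the text)
PAGES = 500_000                # length of the hypothetical book (from the text)

seconds_needed = GENOME_BASES / BASES_PER_SECOND
seconds_per_day = HOURS_PER_DAY * 3600
days_needed = seconds_needed / seconds_per_day
years_needed = days_needed / 365   # assumption: 365-day years, no days off

bases_per_page = GENOME_BASES / PAGES

print(f"{years_needed:.1f} years of reading")
print(f"{bases_per_page:.0f} bases per page")
```

The result comes out at roughly 61 years and 6,400 bases per page, consistent with the "about 60 years" quoted above.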
In sequencing DNA, a DNA fragment of unknown sequence is first amplified, so that there are thousands of copies of the fragment. The fragments are then mixed with a primer, DNA polymerase, a supply of the four nucleotide bases, and a supply of four different chain-terminating chemical tags, each of which can act as one of the four nucleotide bases in DNA synthesis. After heat is applied to denature the double-stranded DNA, the primer binds to one strand of the DNA, and synthesis of the complementary strand proceeds. Whenever a tag is added instead of a nucleotide base, the synthesis stops. However, because the concentration of the chemical tags is low relative to that of the nucleotides, a tag that pairs with A on the DNA fragment, for example, will not necessarily be added at the first A site. Thus, the mixture will contain a series of fragments of different lengths, corresponding to the different distances the polymerase traveled from the primer before a chain-terminating tag was incorporated.
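The way a low tag concentration produces fragments of every length can be illustrated with a small simulation. This is a toy sketch, not a model of real reaction chemistry. The function name `synthesize_fragments`, the termination probability, and the example template are all assumptions chosen for illustration, and the template is written in the order the polymerase reads it.

```python
import random

# Watson-Crick pairing used to build the complementary strand.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def synthesize_fragments(template, n_copies=2000, p_terminate=0.1, seed=0):
    """Simulate chain-termination synthesis on many copies of one template.

    p_terminate is the (assumed) chance that any added base is a
    chain-terminating tag rather than an ordinary nucleotide.
    Returns the set of distinct terminated strands.
    """
    rng = random.Random(seed)
    fragments = set()
    for _ in range(n_copies):
        strand = []
        for base in template:
            strand.append(COMPLEMENT[base])
            if rng.random() < p_terminate:
                # A tag was incorporated: synthesis stops on this copy.
                fragments.add("".join(strand))
                break
        # Copies that reach the end without a tag are not recorded.
    return fragments

template = "TACGGATCCA"  # hypothetical fragment of "unknown" sequence
fragments = synthesize_fragments(template)
for fragment in sorted(fragments, key=len):
    print(len(fragment), fragment)
```

Because termination is rare at any single position, some copies stop early and others late, so with enough copies every possible length is represented.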
The series of fragments is then separated according to size by gel electrophoresis. The fragments become arrayed like the rungs of a ladder, each rung one base longer than the one preceding it. In the manual method of sequencing DNA (the Sanger method), the synthesized fragments are radioactively labeled and are visualized on X-ray film in four different columns that correspond to the four different nucleotide bases. The DNA sequence can then be read directly from the film by researchers. This time-consuming procedure was improved with the development of automated DNA sequencing. In this method, the chemical tags used are fluorescently colored, one color corresponding to each nucleotide. Computers read off the colors on the gel to determine the DNA sequence and display this sequence as a series of colored peaks (figure 2).
What made the attempt to sequence 3.2 billion bases practical was the development in the mid-1990s of automated sequencers that perform electrophoresis of DNA fragments in capillary tubes instead of the traditional gel slabs. These systems can handle about 1000 samples a day, with only 15 minutes of human attention. An institute with several hundred such instruments can produce about 100 Mb (million base pairs) every day.
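The ladder-reading step described above, in which fragments are ordered by size and the tag on each rung identifies one base, can be sketched as follows. This is a simplified illustration under stated assumptions: each band is represented as a hypothetical `(length, terminal_base)` pair, and `read_ladder` is an illustrative name, not the software used by any real sequencer.

```python
def read_ladder(bands):
    """Read a sequence from terminated fragments, shortest first.

    Each band is (length, terminal_base): the length sets the rung's
    position after electrophoresis, and the tag color identifies the
    last base of that fragment.
    """
    ladder = sorted(bands)  # order the rungs by fragment size
    lengths = [length for length, _ in ladder]
    if lengths != list(range(1, len(ladder) + 1)):
        raise ValueError("ladder has a missing or duplicated rung")
    return "".join(base for _, base in ladder)

# Bands in arbitrary order, as they might come out of the reaction mix.
bands = [(3, "G"), (1, "A"), (4, "C"), (2, "T"), (5, "C")]
print(read_ladder(bands))  # ATGCC
```

The check on rung lengths reflects the requirement that each rung be exactly one base longer than the one before it. A gap would leave a base unread.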
Two Strategies for Sequencing Such a Large Genome
The original plan for the publicly financed Human Genome Project was systematic and conservative. First, detailed genetic maps of each of the 23 human chromosomes would be prepared. For each segment of the map, fragments of DNA would be isolated and cloned into bacterial plasmids. Cloning (http://biologywriter.com/backgrounder//cloning-2) allows investigators to get enough material to sequence the fragments. The map would then allow the sequenced fragments to be pieced together in the proper order.
Then, in May of 1998, the researcher who had sequenced the first bacterial genome, Craig Venter, announced he had established a private company to sequence the human genome. Venter set an astonishing schedule, proposing to finish a rough draft of the entire human genome in only two years. Instead of relying on a map, with the time-consuming ordering of clones used to build it, Venter proposed a shotgun sequencing approach in which the mapping step would be skipped altogether. Instead, the entire human genome would be chopped up, the fragments cloned, and the DNA sequence of each clone determined. Finally, the sequences would be pieced together using powerful computer programs that looked for overlaps between fragments.
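The idea of piecing sequences together by looking for overlaps can be shown with a toy example. The sketch below uses a simple greedy merge, which is an assumption made for illustration and not the assembler used by either project. Real assemblers must also cope with sequencing errors, repeated regions, and both strand orientations. The function names and the example reads are hypothetical.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the longest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no overlaps left; the reads stay as separate contigs
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)
    return reads

# Three overlapping reads, given in scrambled order.
shotgun_reads = ["CCTTAGA", "ATGGCGTA", "CGTACCTT"]
print(greedy_assemble(shotgun_reads))  # ['ATGGCGTACCTTAGA']
```

No map is consulted at any point: the order of the reads is recovered purely from where their ends match, which is the essence of the shotgun approach.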
The publicly financed venture rose to the challenge, and the two ventures raced to see who could complete the human genome first. The upshot of this race was a tie of sorts. On June 26, 2000, the two research groups jointly announced success.
The entire 3.2 billion base pair human genome has been sequenced, using automated machines to sequence random shotgun cloned fragments and powerful computers to order the sequenced fragments.
A DNA strand is sequenced by adding complementary bases to it, and looking to see which base is added at each stage in the process of assembling the new chain.