TABLE 2 Classes of DNA sequences found in the human genome
Class                        Frequency  Description
Protein-encoding genes       1%         Transcribed exons, within some 30,000 genes scattered about the chromosomes
Non-coding DNA within genes  24%        Non-coding DNA comprising the great majority of most genes
Structural DNA                          Constitutive heterochromatin, localized near centromeres and telomeres
Repeated sequences           3%         Simple sequence repeats (SSRs) of a few nucleotides repeated millions of times
Transposable elements        45%        20%: long interspersed elements (LINEs), active transposons
                                        15%: other transposable elements, including long terminal repeats (LTRs)
                                        10%: the parasitic sequence ALU, present in half a million copies
One of the most notable characteristics of the human genome is the startling amount of non-coding DNA it possesses. Only 1 to 1.5 percent of the human genome is coding DNA, devoted to genes encoding proteins. Each of your cells has about six feet of DNA stuffed into it, but of that, less than one inch is devoted to genes! Nearly 99% of the DNA in your cells seems to have little or nothing to do with the instructions that make you you. True genes are scattered about the human genome in clumps among the much larger amount of non-coding DNA, like isolated hamlets in a desert.
There are four major sorts of non-coding human DNA (table 2).
Non-coding DNA within genes As we discussed on the previous page, a human gene is made up of numerous fragments of protein-encoding information (exons) embedded within a much larger matrix of non-coding DNA (introns). Together, introns make up about 24% of the human genome, and exons about 1.5%.
Structural DNA Some regions of the chromosomes remain highly condensed, tightly coiled, and untranscribed throughout the cell cycle. Called constitutive heterochromatin, these portions tend to be localized around the centromere or near the ends of the chromosome.
Repeated sequences Scattered about chromosomes are simple sequence repeats (SSRs). An SSR is a two- or three-nucleotide sequence like CA or CGG, repeated like a broken record thousands and thousands of times. SSRs make up about 3% of the human genome. An additional 7% is devoted to other sorts of duplicated sequences.
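To make the idea of a "broken record" repeat concrete, here is a minimal Python sketch that scans a short DNA string for runs of a repeated two- or three-base unit. The function name `find_ssrs`, the `min_repeats` cutoff, and the toy sequence are all illustrative assumptions, not part of any standard tool; real SSRs run to thousands of copies, and real repeat finders are far more careful.

```python
import re

def find_ssrs(seq, min_repeats=4):
    """Return (start, unit, copies) for each run of a repeated 2- or 3-base unit.

    Hypothetical helper for illustration only; min_repeats is an arbitrary cutoff.
    """
    # ([ACGT]{2,3}) captures a candidate unit; \1{n,} demands n or more further copies.
    pattern = re.compile(r"([ACGT]{2,3})\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits

# Toy sequence: five copies of CA and four copies of CGG, with filler in between.
toy = "TTG" + "CA" * 5 + "TTA" + "CGG" * 4 + "AT"
print(find_ssrs(toy))
```

On the toy sequence the sketch reports one CA run and one CGG run, the two repeat units named in the text.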
Repetitive sequences with excess C and G tend to be found in the neighborhood of genes, while A- and T-rich repeats dominate the non-gene deserts. The light bands on chromosome karyotypes now have an explanation: they are regions rich in GC and genes. Dark bands signal neighborhoods rich in AT and thin on genes. Chromosome 19, dense with genes, has few dark bands. Roughly 25% of the human genome has no genes at all.
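The distinction between GC-rich and AT-rich neighborhoods comes down to a simple count of bases. The following Python sketch computes the fraction of G and C in a stretch of sequence; the function names, the 0.5 threshold, and the two toy sequences are assumptions chosen for illustration, not values taken from the text.

```python
def gc_fraction(seq):
    """Fraction of bases in seq that are G or C (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)

def classify_region(seq, threshold=0.5):
    """Label a region GC-rich or AT-rich; the threshold is an arbitrary assumption."""
    return "GC-rich" if gc_fraction(seq) > threshold else "AT-rich"

# Two made-up ten-base stretches, one of each kind.
near_gene = "GCGCGGCCAT"
desert = "ATTATAAGCA"
print(gc_fraction(near_gene), classify_region(near_gene))
print(gc_fraction(desert), classify_region(desert))
```

In practice such a count would be taken over windows many thousands of bases long rather than over ten-base toys.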
Transposable elements Fully 45% of the human genome consists of mobile bits of DNA called transposable elements. Discovered by Barbara McClintock in 1950 (she won the Nobel Prize for her discovery in 1983), transposable elements are bits of DNA that are able to jump from one location on a chromosome to another, tiny molecular versions of Mexican jumping beans.
Human chromosomes contain five sorts of transposable elements. Fully 20% of the genome consists of long interspersed elements (LINEs). Ancient and very successful elements, LINEs are about 6 kb (6,000 DNA bases) long, and contain all the equipment needed for transposition, including genes for a DNA-loop-nicking enzyme and a reverse transcriptase.
Nested within the genome's LINEs are over half a million copies of a parasitic element called ALU, composing 10% of the human genome. ALU is only about 300 bases long, and has no transposition machinery of its own; like a flea on a dog, ALU moves with the LINE it resides within. Just as a flea sometimes jumps to a different dog, so ALU sometimes uses the enzymes of its LINE to move to a new chromosome location. Often jumping right into genes, ALU transpositions cause many harmful mutations.
Three other sorts of transposable elements are also present in the human genome. Eight percent of the genome is devoted to long terminal repeats (LTRs), also called retroposons. Three percent is devoted to DNA transposons, which copy themselves as DNA rather than RNA. And some 4% is devoted to dead transposons, elements that have lost the signals for replication and so can no longer jump.
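As a quick check on the arithmetic, the five percentages quoted in the text can be tallied to confirm that they account for the 45% figure given for transposable elements as a whole. This Python sketch uses only the numbers stated above; the dictionary and variable names are illustrative.

```python
# Percent of the human genome for each sort of transposable element, as quoted in the text.
transposable_elements = {
    "LINEs": 20,
    "ALU": 10,
    "LTRs": 8,
    "DNA transposons": 3,
    "dead transposons": 4,
}

total = sum(transposable_elements.values())

# The three "other" sorts correspond to the single 15% row in table 2.
other = sum(transposable_elements[name]
            for name in ("LTRs", "DNA transposons", "dead transposons"))

print(f"All transposable elements: {total}%")
print(f"Other transposable elements: {other}%")
```

The two sums come to 45% and 15%, matching the totals given in the text and in table 2.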
Gene sequences in humans vary greatly in copy number, some occurring many thousands of times, others only once. Only about 1% of the human genome is devoted to protein-encoding genes. Much of the rest consists of transposable elements.