Transcription Factors

DNA-binding transcription factors recognize specific DNA sequences

The preceding discussion has emphasized the structure of the gene and the cis-acting elements that regulate gene expression. We now turn to the proteins that interact with these DNA elements and thus regulate gene transcription. Because the basal transcriptional machinery—Pol II and the general transcription factors—is incapable of efficient gene transcription alone, additional proteins are required to stimulate the activity of the enzyme complex. The additional proteins include transcription factors that recognize and bind to specific DNA sequences (enhancers) located near their target genes, as well as others (see pp. 83–84) that do not bind to DNA.

Examples of DNA-binding transcription factors are shown in Table 4-1. The general mechanism of action of a specific transcription factor is depicted in Figure 4-6B. After the basal transcriptional machinery assembles on the gene promoter, it can interact with a transcription factor that binds to a specific DNA element, the enhancer (or silencer). Looping out of the intervening DNA permits physical interaction between the activator (or repressor) and the basal transcriptional machinery, which subsequently leads to stimulation (or inhibition) of gene transcription. The specificity with which transcription factors bind to DNA depends on the interactions between the amino-acid side chains of the transcription factor and the purine and pyrimidine bases in DNA. Most of these interactions consist of noncovalent hydrogen bonds between amino acids and DNA bases. A peptide capable of a specific pattern of hydrogen bonding can recognize and bind to the reciprocal pattern in the major (and to a lesser extent the minor) groove of DNA. Interaction with the DNA backbone may also occur and involves electrostatic interactions (salt bridge formation) with anionic phosphate groups. The site that a transcription factor recognizes (see Table 4-1) is generally short, usually fewer than about a dozen base pairs.

TABLE 4-1 DNA-Binding Transcription Factors and the DNA Sequences They Recognize

NAME               TYPE         RECOGNITION SITE       BINDS AS
Sp1                Zinc finger  5′-GGGCGG-3′           Monomer
AP-1               bZIP         5′-TGASTCA-3′          Dimer
C/EBP              bZIP         5′-ATTGCGCAAT-3′       Dimer
Heat shock factor  bZIP         5′-NGAAN-3′            Trimer
ATF/CREB           bZIP         5′-TGACGTCA-3′         Dimer
c-Myc              bHLH         5′-CACGTG-3′           Dimer
Oct-1              HTH          5′-ATGCAAAT-3′         Monomer
NF-1               Novel        5′-TTGGCN₅GCCAA-3′     Dimer

ATF, activating transcription factor; NF-1, nuclear factor 1. In the recognition sites, N denotes any nucleotide (N₅, any five nucleotides) and S denotes C or G.
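The recognition sites in Table 4-1 can be read as short patterns to be matched against either strand of a DNA sequence. The following is a minimal sketch of that idea, not a method described in this chapter: the function names (`site_to_regex`, `reverse_complement`, `find_sites`) and the test sequence are hypothetical, and the sketch assumes that S means C or G and that N means any nucleotide.

```python
import re

# Degenerate nucleotide codes used in Table 4-1 (assumed: S = C or G, N = any base).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "S": "[CG]", "N": "[ACGT]"}

# Watson-Crick partners, used to read the opposite strand.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def site_to_regex(site):
    """Translate a recognition site such as 'TGASTCA' into a regular expression."""
    return "".join(IUPAC[base] for base in site)


def reverse_complement(dna):
    """Return the opposite strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def find_sites(site, dna):
    """Return sorted 0-based start positions (top-strand coordinates)
    at which the site matches on either strand."""
    # A lookahead is used so that overlapping matches are also reported.
    pattern = re.compile("(?=" + site_to_regex(site) + ")")
    hits = {m.start() for m in pattern.finditer(dna)}
    # Scan the opposite strand and convert its positions to top-strand coordinates.
    width = len(site)
    for m in pattern.finditer(reverse_complement(dna)):
        hits.add(len(dna) - m.start() - width)
    return sorted(hits)


# Hypothetical test sequence containing both AP-1 variants.
example = "GGTGACTCAAATGAGTCACC"
print(find_sites("TGASTCA", example))
```

Because the AP-1 site is nearly its own reverse complement, both strands report the same two positions here. A site such as the Sp1 site (5′-GGGCGG-3′) is not palindromic, so it can be found on one strand only.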

DNA-binding transcription factors do not recognize single, unique DNA sequences; rather, they recognize a family of closely related sequences. For example, the transcription factor AP-1 (activator protein 1) recognizes the sequences

  • 5′-TGACTCA-3′
  • 5′-TGAGTCA-3′

and so on, as well as each of the complementary sequences. That is, some redundancy is usually built into the recognition sequence for a transcription factor. An important consequence of these properties is that the recognition site for a transcription factor may occur many times in the genome. For example, if a transcription factor recognizes a 6-bp sequence, the sequence would be expected to occur by chance once every 4⁶ (or 4096) base pairs, that is, about 7 × 10⁵ times in the human genome (roughly 3 × 10⁹ bp). If redundancy is permitted, recognition sites will occur even more frequently. Of course, most of these sites will not be relevant to gene regulation but will instead have occurred simply by chance. This high frequency of recognition sites leads to an important concept: transcription factors act in combination. Thus, high-level expression of a gene requires that a combination of multiple transcription factors bind to multiple regulatory elements. Although it is complicated, this system ensures that transcription activation occurs only at appropriate locations. Moreover, this arrangement permits finer tuning of expression, inasmuch as the activity of individual transcription factors can be altered to modulate the overall level of transcription of a gene.
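The arithmetic behind this estimate can be checked with a short calculation. The sketch below rests on simplifying assumptions that are not part of the text: the four bases are treated as equally frequent and independent, and the human genome is taken as roughly 3 × 10⁹ bp. The names `GENOME_BP`, `expected_spacing`, and `expected_occurrences` are illustrative only.

```python
# Assumed haploid human genome size (about 3 billion base pairs).
GENOME_BP = 3e9


def expected_spacing(site_length):
    """Average number of base pairs between chance occurrences of one exact
    sequence of the given length, assuming four equally likely bases."""
    return 4 ** site_length


def expected_occurrences(site_length, genome_bp=GENOME_BP):
    """Expected number of chance occurrences of one exact sequence in the genome."""
    return genome_bp / expected_spacing(site_length)


# A single 6-bp site: once every 4096 bp, about 7 x 10^5 times per genome.
print(expected_spacing(6))
print(f"{expected_occurrences(6):.1e}")

# Two 6-bp sites required together at one fixed spacing behave like a single
# 12-bp site under these assumptions, which is far rarer by chance.
print(f"{expected_occurrences(12):.0f}")
```

Under these assumptions, requiring two 6-bp sites in a fixed arrangement lowers the expected number of chance occurrences from several hundred thousand to a few hundred. This is one way to see why combinations of transcription factors give much greater specificity than any single site.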

Basic Zipper

Also known as the leucine zipper family, the basic zipper (bZIP) family consists of transcription factors that bind to DNA as dimers (see Fig. 4-9B). Members include C/EBPβ (CCAAT/enhancer-binding protein-β), c-Fos, c-Jun, and CREB. Each monomer consists of two domains, a basic region that contacts DNA and a leucine zipper region that mediates dimerization. The basic region contains about 30 amino acids and is enriched in arginine and lysine residues. This region is responsible for sequence-specific binding to DNA via an α helix that inserts into the major groove of DNA. The leucine zipper consists of a region of about 30 amino acids in which every seventh residue is a leucine. Because of this spacing, the leucine residues align on a common face every second turn of an α helix. Two protein subunits that both contain leucine zippers can associate because of hydrophobic interactions between the leucine side chains; they form a structure called a coiled coil. Proteins of this family interact with DNA as homodimers or as structurally related heterodimers. Dimerization is essential for transcriptional activity because mutations of the leucine residues abolish both dimer formation and the ability to bind DNA and activate transcription. The crystal structure reveals that these transcription factors resemble scissors in which the blades represent the leucine zipper domains and the handles represent the DNA-binding domains (see Fig. 4-9B).
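The heptad spacing of the leucine zipper lends itself to a simple illustration. The sketch below scans a protein sequence, written in one-letter amino-acid code, for leucines (L) spaced exactly seven residues apart. It is a toy example under stated assumptions: the function name `leucine_heptad_runs`, the `min_repeats` threshold, and the test peptide are hypothetical, and the figure of 3.6 residues per α-helical turn is a standard value that the text does not state.

```python
# Standard alpha-helix geometry (assumed here): about 3.6 residues per turn.
RESIDUES_PER_TURN = 3.6


def leucine_heptad_runs(protein, min_repeats=4):
    """Return (start, count) pairs for runs of leucines spaced exactly
    seven residues apart, keeping runs with at least min_repeats leucines."""
    runs = []
    for start, residue in enumerate(protein):
        if residue != "L":
            continue
        # Skip leucines that merely continue a run begun seven residues earlier.
        if start >= 7 and protein[start - 7] == "L":
            continue
        count = 1
        position = start + 7
        while position < len(protein) and protein[position] == "L":
            count += 1
            position += 7
        if count >= min_repeats:
            runs.append((start, count))
    return runs


# Seven residues span just under two helical turns, so successive leucines
# fall on nearly the same face of the helix.
print(round(7 / RESIDUES_PER_TURN, 2))

# Hypothetical toy peptide: a leucine every seventh residue, alanine elsewhere.
toy_peptide = "LAAAAAA" * 4 + "L"
print(leucine_heptad_runs(toy_peptide))
```

A region of about 30 amino acids accommodates four or five such leucines, which is why the default threshold is set to four. Real bZIP proteins tolerate substitutions at some heptad positions, so a strict scan of this kind is only a first approximation.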