Basic Science for Challenge A – Eukaryotic promoter structure

Basic Science for Challenge A (PDF)

Eukaryotes (such as fungi, animals, and plants) have three types of RNA polymerase, each of which transcribes a different set of genes:
RNA polymerase I: transcribes the genes encoding ribosomal RNA (18S, 5.8S, and 28S rRNA).
RNA polymerase II: transcribes all protein-coding genes and some genes encoding snRNA. Most genes are transcribed by this polymerase.
RNA polymerase III: transcribes the genes encoding 5S ribosomal RNA (5S rRNA), transfer RNA (tRNA), and other small RNAs (such as 7SL RNA and some snRNAs).

Promoter: the region around the site where RNA polymerase binds to the DNA. It contains the basal and regulatory binding sites necessary for transcription initiation. These sites are organized around the transcriptional start site (TSS) and can occur upstream (negative base pair coordinates) or downstream (positive base pair coordinates) of it.
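
The coordinate convention above can be made concrete with a short sketch. It assumes the common convention that the TSS base is numbered +1 and that there is no position 0, so the base immediately upstream is -1; the text above does not state this explicitly. The function name and the toy sequence are hypothetical.

```python
def tss_relative(index: int, tss_index: int) -> int:
    """Convert a 0-based string index into a TSS-relative coordinate.

    Assumption (not stated in the text): the TSS base is +1 and there is
    no position 0, so the base immediately upstream of the TSS is -1.
    """
    offset = index - tss_index
    return offset + 1 if offset >= 0 else offset


# Hypothetical toy sequence; the TSS is taken to be the base at index 8.
sequence = "GGTATAAAAGCT"
tss_index = 8

for i in (0, 7, 8, 11):
    # Prints the index, the base, and its TSS-relative coordinate.
    print(i, sequence[i], tss_relative(i, tss_index))
```

Upstream indices map to negative coordinates and the TSS itself maps to +1, which matches the sign convention used for the promoter regions in this section.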

General (or basal) transcription factors (GTFs) are usually required for transcription. For RNA polymerase II they include TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIIH. Basal transcription factors are usually strict about where they bind relative to the TSS. Some GTFs are protein complexes, and the GTFs make several contacts with each other and with the RNA polymerase to form the basal transcription apparatus (Table 1).

Regulatory transcription factors (TFs) are more diverse: they may act only in response to environmental or cellular signals, they may bind and regulate from very far away from the TSS, and they may be flexible about the position and orientation of their DNA binding sites. Often multiple TFs can activate a target promoter, with either an OR-like (any one TF is sufficient) or an AND-like (several TFs are required together) regulatory architecture. Activators are TFs that activate transcription; repressors are TFs that inhibit transcription.
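
The OR-like and AND-like architectures can be illustrated with a minimal Boolean sketch. This is a deliberate simplification (real promoters respond in a graded way, and repressors are ignored here); the function names and the factor names `TF_A` and `TF_B` are hypothetical.

```python
def or_like(bound_tfs: set[str], activators: set[str]) -> bool:
    """OR-like logic: any one of the listed activators is sufficient."""
    return bool(bound_tfs & activators)


def and_like(bound_tfs: set[str], activators: set[str]) -> bool:
    """AND-like logic: every listed activator must be bound."""
    return activators <= bound_tfs


# Hypothetical promoter that responds to two activators.
activators = {"TF_A", "TF_B"}
bound = {"TF_A"}  # only one of the two is currently bound

print(or_like(bound, activators))   # True: one activator is enough
print(and_like(bound, activators))  # False: TF_B is missing
```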


Fig 1. Eukaryotic promoter structure.
Example of human insulin gene promoter. (Plant promoter structure will differ.)

The RNA polymerase II promoter is composed of three regions:
Core promoter: is located from about -50 to about +40 bp relative to the TSS. Generally only GTFs act in this region.
Proximal promoter region: is the region located immediately upstream of the core promoter (around -200 to -50 bp). This region binds regulatory TFs, which have position-specific interactions with the basal factors and the RNA polymerase. GTFs may also make additional contacts within this region. One feature that occurs in this region in many eukaryotic promoters is the CCAAT box, which can be recognized by multiple transcription factors, including the ubiquitous NF-Y. Many proximal promoter binding sites are organism specific, such as the GC box in mammals, the MIG1 site in yeast, and the Y-patch in plants.
Distal promoter region: is the sequence region located upstream of -200 bp. This includes regulatory sites with variable position, such as enhancers.
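
As a rough illustration of the three regions, the sketch below maps a TSS-relative position to a region name and reports where CCAAT motifs fall in an upstream sequence. The boundaries are the approximate values quoted above; treating -50 as core and -200 as proximal is an arbitrary choice made here. The function names and the toy sequence are hypothetical, and only the forward strand is scanned.

```python
import re


def promoter_region(position: int) -> str:
    """Classify a TSS-relative position using the approximate boundaries above."""
    if position < -200:
        return "distal"
    if position < -50:
        return "proximal"
    if position <= 40:
        return "core"
    return "downstream of core promoter"


def find_motif(upstream: str, motif: str = "CCAAT") -> list[tuple[int, str]]:
    """Return (TSS-relative start, region) for each forward-strand match.

    Assumption: `upstream` ends at position -1, i.e. its last base sits
    immediately upstream of the TSS.
    """
    n = len(upstream)
    hits = []
    # A lookahead is used so that overlapping matches are also reported.
    for match in re.finditer(f"(?={re.escape(motif)})", upstream.upper()):
        position = match.start() - n
        hits.append((position, promoter_region(position)))
    return hits


# Hypothetical 180 bp upstream sequence with one CCAAT box starting at -80.
upstream = "A" * 100 + "CCAAT" + "G" * 75
print(find_motif(upstream))  # [(-80, 'proximal')]
```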

For the GenoCon2 challenge, the region downstream of -50 is fixed as the Cauliflower Mosaic Virus 35S minimal promoter (CaMV35Smp), which determines most of the basal transcription factor interactions. Contestants in the design challenge will design artificial sequences in the proximal or distal regions only (as far upstream as -550 bp).
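
The design constraint can be sketched as a simple check. The sketch assumes that the designed sequence is contiguous, occupies at most the span from -550 to -50 (500 bp), and is joined directly upstream of the minimal promoter; these details are assumptions, not rules quoted from the challenge. The function name is hypothetical, and the minimal promoter is passed in as a parameter because its sequence is not given here.

```python
MAX_UPSTREAM = 550   # most distal designable position (-550), from the text
CORE_BOUNDARY = 50   # CaMV35Smp covers the region downstream of -50
MAX_DESIGN_LEN = MAX_UPSTREAM - CORE_BOUNDARY  # 500 bp, assuming a contiguous design


def assemble_promoter(design: str, minimal_promoter: str) -> str:
    """Validate a designed upstream sequence and prepend it to the minimal promoter."""
    design = design.upper()
    invalid = set(design) - set("ACGT")
    if invalid:
        raise ValueError(f"design contains non-DNA characters: {sorted(invalid)}")
    if len(design) > MAX_DESIGN_LEN:
        raise ValueError(
            f"design is {len(design)} bp; at most {MAX_DESIGN_LEN} bp fit upstream of -50"
        )
    return design + minimal_promoter


# Placeholder only: this is NOT the real CaMV35Smp sequence.
placeholder_minimal_promoter = "N" * 8
full = assemble_promoter("acgt" * 10, placeholder_minimal_promoter)
print(len(full))  # 48
```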

Table 1. General transcription factors needed for transcription initiation by eukaryotic RNA polymerase II. TFII means transcription factor for RNA polymerase II.
General transcription factor: Function
TFIID (transcription factor II D): is composed of the TATA box-binding protein (TBP) and TBP-associated factors (TAFs), which bind nearby sequences. This is the first factor to bind TATA-containing promoters, and it provides binding scaffolds for other factors in the basal transcription apparatus.
TFIIB (transcription factor II B): contacts TFIID and acts as a bridge to recruit RNA polymerase II. It also determines the location of the TSS.
TFIIA (transcription factor II A): stabilizes the binding of TBP to the TATA box.
TFIIF (transcription factor II F): binds RNA polymerase II and TFIIB.
TFIIE (transcription factor II E): associates with TFIIH. It can bind to single-stranded DNA, and is thought to be involved in promoter melting.
TFIIH (transcription factor II H): contains a helicase, which unwinds DNA at the promoter region, and a protein kinase, which phosphorylates the C-terminal domain (CTD) of RNA polymerase II to activate it. The activated polymerase leaves the basal transcription apparatus during initiation.

An overview of the process of basal transcriptional apparatus formation and activation is illustrated in Fig 2.

  1. TFIID binds the core promoter region. In promoters containing a TATA box, the TBP subunit binds to this AT-rich region around -30 (+/- 10 bp). TFIID can also bind to additional contact sequences near the TSS, such as the INR (initiator sequence).
  2. TFIIA and TFIIB help to stabilize the binding of TFIID to the promoter. TATA box-containing promoters may also have a BRE (TFIIB recognition element) just upstream and/or just downstream of the TATA box.
  3. TFIIB also presents a contact to begin recruiting the RNA polymerase II.
  4. RNA polymerase II makes contacts with TFIIB and TFIID. The contact between RNA polymerase II and TFIIB can be further stabilized by TFIIH.
  5. TFIIF binds both RNA polymerase II and TFIIB, mediating the recruitment of RNA polymerase II.
  6. TFIIE and TFIIH are recruited. These factors help to melt and unwind the promoter region, and activate the polymerase to begin RNA synthesis. The helicase activity of TFIIH uses ATP to unwind the DNA at the TSS. After the kinase activity of TFIIH phosphorylates RNA polymerase II, the polymerase is released from the general transcription factors and can proceed to the elongation step of transcription.
Fig 2. Formation of basal transcription apparatus and initiation of transcription
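
Step 1 above places the TATA box around -30 (+/- 10 bp). A minimal sketch of how a sequence could be checked for a TATA-like motif in that window follows. The consensus pattern `TATA[AT]A[AT]` is a commonly cited approximation and is an assumption here, not something taken from this page; the function name and the toy sequence are hypothetical.

```python
import re

# Assumed TATA box consensus (TATAWAW); a lookahead lets overlapping matches be found.
TATA_PATTERN = re.compile(r"(?=TATA[AT]A[AT])")


def find_tata_box(sequence: str, tss_index: int,
                  center: int = -30, tolerance: int = 10) -> list[int]:
    """Return TSS-relative start positions of TATA-like motifs near `center`.

    `tss_index` is the 0-based index of the TSS base in `sequence`; upstream
    bases get negative coordinates (index - tss_index).
    """
    hits = []
    for match in TATA_PATTERN.finditer(sequence.upper()):
        position = match.start() - tss_index
        if position < 0 and abs(position - center) <= tolerance:
            hits.append(position)
    return hits


# Hypothetical toy promoter: a TATA-like motif 31 bp upstream of the TSS.
sequence = "G" * 20 + "TATAAAA" + "C" * 24 + "A" + "G" * 10
tss_index = 51  # index of the single "A" after the run of C's

print(find_tata_box(sequence, tss_index))  # [-31]
```

A promoter without a hit in this window is not necessarily inactive, since TFIID can also use other contacts near the TSS such as the INR.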

The following additional proteins associate with transcription factors as transcriptional mediators (Fig 3).
Coactivator: a protein that bridges activators and the basal transcription apparatus, which is composed of the general transcription factors and RNA polymerase.
Corepressor: a protein that bridges repressors and the basal transcription apparatus.

Fig 3. Tissue-specific transcription factor (activator, repressor) and mediator (coactivator, corepressor)

The following regulatory sequences modulate the basal level of transcription driven by the core promoter.
Enhancer: is a positive regulatory sequence that increases the basal level of transcription initiated by the core promoter sequences (e.g. the CCAAT box).
Silencer: is a negative regulatory sequence that decreases the basal level of transcription initiated by the core promoter sequences (e.g. an NRE, or negative regulatory element).
Insulator: is a sequence that blocks the regulatory effects of enhancers and silencers.

Fig 4. Enhancer, silencer and insulator

