Biochemistry and Molecular Biology
Penn State Science
David Gilmour

  • Professor of Molecular and Cell Biology
465A North Frear Laboratory
University Park, PA 16802
Email: dsg11@psu.edu
Phone: (814) 863-8905

Research Interests

Eukaryotic transcriptional regulation.

Graduate Programs

BMMB, MCIBS

Research Summary

The overall goal of our research is to understand mechanisms of transcriptional regulation. Transcription is the process by which information contained within a gene in the DNA is retrieved to produce RNA, which in turn often directs synthesis of a specific protein. We focus on transcription by RNA polymerase II (Pol II), the enzyme that is responsible for transcribing all protein-encoding genes in eukaryotes. Many diseases including cancer are caused by defects in the level or location in the body at which specific genes are transcribed.

We use Drosophila as a model system because of the combination of genetic, genomic, biochemical and microscopic approaches that can be used to provide comprehensive investigations of complex molecular processes. For example, transcription reactions in cell extracts provide functional tests of specific proteins while immunofluorescent microscopy of Drosophila salivary glands evaluates the behavior of the protein on and off the chromosomes.

Research in my laboratory currently focuses on two aspects of transcriptional control. One is the function of the carboxy-terminal domain of the largest subunit of Pol II. The other is the mechanism of promoter proximal pausing.

Function of the carboxy-terminal domain of RNA polymerase II.

Background. The largest subunit of Pol II has a domain called the CTD that consists of a repeating array of amino acids with the consensus YSPTSPS. The Pol II CTD functions as a hub during gene expression by associating with proteins involved in transcription, RNA processing and chromatin structure modulation. The CTDs of simple organisms are composed mostly of repeats that exactly match the consensus YSPTSPS, whereas the CTDs of multicellular organisms often have many more motifs and contain motifs that differ from the consensus. For example, yeast has 26 repeats and Drosophila has 42. While 19 of the repeats in the yeast CTD are YSPTSPS, Drosophila has only two of these. These two consensus motifs, along with the 40 remaining divergent motifs, are conserved among 12 species of Drosophila, indicating that the divergent motifs are functionally important in the fly.
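
The comparison of consensus and divergent repeats described above can be sketched in a few lines of Python. This is only an illustration, not the analysis used in our work: it assumes the CTD can be read in a fixed register of seven residues, which real CTD sequences do not always obey, and the function names and the toy sequence (three consensus heptads followed by one made-up divergent heptad) are hypothetical.

```python
CONSENSUS = "YSPTSPS"  # consensus heptad named in the text


def split_heptads(ctd: str) -> list[str]:
    """Split a sequence into consecutive 7-residue windows.

    Assumption: repeats sit in a fixed 7-residue register. A trailing
    window shorter than 7 residues is kept as-is.
    """
    return [ctd[i:i + 7] for i in range(0, len(ctd), 7)]


def count_consensus(ctd: str) -> tuple[int, int]:
    """Return (number of exact consensus heptads, total number of windows)."""
    heptads = split_heptads(ctd)
    matches = sum(1 for heptad in heptads if heptad == CONSENSUS)
    return matches, len(heptads)


if __name__ == "__main__":
    # Hypothetical toy sequence, not a real CTD.
    toy_ctd = CONSENSUS * 3 + "YSPTSPK"
    matches, total = count_consensus(toy_ctd)
    print(f"{matches} of {total} heptads match the consensus")
```

Applied to a real CTD, a count like this would first require aligning the repeats by hand or with an alignment tool, since divergent motifs can differ in length.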

 

An unexpected finding. The number of repeats and the complexity of the CTD sequence found in multicellular organisms have long been thought to be important for the intricate patterns of gene regulation that provide for development and cellular differentiation. However, virtually everything we know about the function of the CTD comes from studies done in yeast or mammalian cultured cells, so the role of the divergent motifs in development remained to be investigated. We developed RNAi-based and CRISPR-based approaches to investigate the role of the divergent motifs in flies. We were shocked to discover that the divergent motifs can be eliminated and that CTDs composed solely of consensus motifs fully support Drosophila viability and fecundity. However, the number of consensus heptads impacts Pol II's functionality: 20, 24, and 29 consensus heptads fully support the fly whereas 10, 42, and 52 do not.

The CTD functions as a signal sequence. Previous studies of CTD function have focused on the association of individual proteins with regions of the CTD spanning 1 to 3 repeats and how these associations are influenced by post-translational modifications. However, it is difficult to reconcile this knowledge with our finding that all of the highly conserved, Drosophila-specific motifs can be replaced with consensus heptads. Recently, purified preparations of the CTD have been found to coalesce into liquid phase-separated droplets and to partition into liquid phase-separated droplets formed by other proteins involved in gene expression. This raises the possibility that CTD interactions may be malleable and depend not so much on lock-and-key types of interactions encompassing a few heptads but more on an extended array of binding sites, a property often referred to as valency.

There is growing interest in the possibility that transcriptional regulation involves formation of phase-separated condensates in which Pol II, Mediator, and other components of the transcriptional apparatus coalesce. Our finding that the functionality of the consensus CTDs in Drosophila is length dependent, and that too few or too many consensus heptads impair Pol II function, prompted us to investigate how the CTD behaves in cells when it is expressed separately from the rest of Pol II. We fused various derivatives of the CTD to GFP and ectopically expressed them in salivary glands, where we could use fluorescence microscopy to assess their interaction with polytene chromosomes. Polytene chromosomes consist of approximately 2000 copies of the chromosome aligned side-by-side, which makes them visible in the light microscope. Highly transcribed regions are easily identified by staining the chromosomes with antibodies against Pol II. Remarkably, we observe GFP-tagged CTDs associating with transcribed regions of the chromosome. These interactions are dynamic and have properties consistent with the CTD partitioning into liquid phase-separated domains. Moreover, we find that the partitioning characteristics of the GFP-tagged CTDs correlate with their functionality when the CTD is part of Pol II. CTDs composed of 20, 24, and 29 consensus heptads associate with transcribed regions on chromosomes while the CTD composed of 10 heptads does not. CTDs of 42 and 52 heptads associate with transcribed regions but are also observed to form many static extrachromosomal foci. A characteristic of multivalent polymers is that too many binding sites can lead to the formation of aggregates. Hence, we speculate that CTDs with too many consensus heptads may be misdirecting Pol II to static extrachromosomal locations.

Constructive neutral evolution. The finding that all Drosophila-specific motifs can be eliminated from the CTD without any apparent impact on the fly came as a shock. Phase partitioning provides one explanation for why these mutations could be tolerated, since both the Drosophila and the consensus CTDs are multivalent. However, the conundrum remains why the Drosophila-specific motifs are so highly conserved among various Drosophila species. If this is viewed as a simple repeating consensus heptad substituting for a more complex Drosophila sequence, a possible explanation can be found in the theory of constructive neutral evolution. This theory was formulated to explain how relatively simple molecular complexes in one organism perform the same function as much more complicated derivatives in other organisms. It has been argued that the consensus heptad constitutes the primordial CTD. We speculate that a common core of proteins shared among all eukaryotes mediates the essential functions of the consensus motifs. According to constructive neutral evolution theory, chance mutations that created binding sites for other proteins that themselves interacted with the common core would be tolerated. These mutations could then set the stage for other chance mutations that would be tolerated because they too served as binding sites for proteins that associate with the common core. Hence, the trajectory for the evolution of the CTD sequence along a specific evolutionary lineage would be set by chance. This could explain why the simple repeating YSPTSPS motif and the CTD of human Pol II each support Pol II function in the fly even though their sequences are significantly different from that of the fly CTD.

Current and future work. To gain new insight into CTD function, we are working to compare the protein interactions occurring on the various CTDs using biochemistry and by screening for second site modifiers that might act only in the context of a subset of CTDs. The constructive neutral evolution hypothesis predicts that the consensus CTD will associate with the common core while the natural Drosophila CTD will also associate with adaptors that are involved in recruiting the common core to the CTD. In addition, we will be comparing the genomic distribution of the various Pol II derivatives and investigating how the various CTDs impact the rapid induction of the heat shock genes in response to heat shock.

Promoter proximal pausing

Background. In Drosophila and mammals, Pol II initiates transcription but pauses 30 to 50 nucleotides downstream. The duration of the pause can control the level of transcription. Pausing potentiates the gene for activation and may provide time for the elongation complex to be modified so that it can productively transcribe the chromatin template. Paused Pol II also serves as a placeholder that prevents nucleosomes from assembling over the promoter region and repressing transcription.

Mechanisms. Using a combination of biochemical, molecular genetic, and genomic approaches, we have demonstrated that promoter proximal pausing requires two proteins, DSIF and NELF. The association of these two proteins with the elongation complex requires the nascent transcript to be greater than 18 nucleotides, which is when the transcript begins to emerge from the Pol II elongation complex. We have identified two mechanisms for how these proteins capture the elongating Pol II in Drosophila. Approximately 1500 genes associate with a protein called GAGA factor, and our biochemical data indicate that GAGA factor recruits NELF to the promoter. This recruitment allows NELF to capture the elongation complex before it advances beyond the promoter proximal region. This pausing appears to be independent of nucleosomes and depends on a kinetic competition between elongation and NELF binding. A key observation supporting this kinetic competition model is our finding that slow, mutant forms of Pol II pause further upstream than their normal counterpart.
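
The logic of the kinetic competition model can be illustrated with a toy calculation. The sketch below is not the model fitted in our publications; it assumes, purely for illustration, that Pol II moves at a constant speed, that NELF can bind only after the transcript reaches 18 nucleotides, and that binding then occurs with a constant rate, so the waiting time is exponentially distributed. The function names and the numerical rates are hypothetical.

```python
import math


def expected_pause_position(speed_nt_per_s: float,
                            bind_rate_per_s: float,
                            start_nt: float = 18.0) -> float:
    """Mean position (nt) at which Pol II is captured.

    With an exponential waiting time (mean 1 / bind_rate) and constant
    speed, Pol II travels speed / bind_rate nucleotides on average past
    the point where binding first becomes possible.
    """
    return start_nt + speed_nt_per_s / bind_rate_per_s


def fraction_captured_before(limit_nt: float,
                             speed_nt_per_s: float,
                             bind_rate_per_s: float,
                             start_nt: float = 18.0) -> float:
    """Probability that NELF binds before Pol II reaches limit_nt."""
    if limit_nt <= start_nt:
        return 0.0
    time_available = (limit_nt - start_nt) / speed_nt_per_s
    return 1.0 - math.exp(-bind_rate_per_s * time_available)


if __name__ == "__main__":
    # Hypothetical rates chosen only to show the trend.
    for label, speed in (("normal Pol II", 20.0), ("slow Pol II", 10.0)):
        position = expected_pause_position(speed, bind_rate_per_s=1.0)
        captured = fraction_captured_before(50.0, speed, bind_rate_per_s=1.0)
        print(f"{label}: mean pause at {position:.0f} nt, "
              f"{captured:.0%} captured before 50 nt")
```

Under these assumptions a slower polymerase is captured closer to the promoter and a larger fraction is captured within the promoter proximal region, which is the qualitative behavior described above.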

A distinct set of genes, numbering about 2000, associates with a different transcription factor we call M1BP. A characteristic of these genes is that they are bound by ordered arrays of nucleosomes. High resolution mapping of the distribution of the paused Pol II reveals that the Pol II is located adjacent to the +1 nucleosome suggesting that Pol II slows as it collides with the nucleosome and becomes bound by NELF.

Current and future work. We are investigating the mechanics of how NELF and DSIF cause Pol II to pause. Association of NELF with the elongation complex requires DSIF. Recent high-resolution structures of the Pol II elongation complex bound by DSIF and NELF provide a framework for introducing mutations into these proteins and testing if and how these mutations impact pausing in a cell-free transcription system and the binding of DSIF and NELF to the elongation complex.

Selected Publications

The CTD of RNA polymerase II

  • Lu, F., B. Portz, D.S. Gilmour (2019) The C-terminal domain of RNA polymerase II is a multivalent targeting sequence that supports Drosophila development with only consensus heptads. Molecular Cell 73: 1232-1242.
  • Lu, F., D.S. Gilmour (2019) Genetic analysis of the RNA polymerase II CTD in Drosophila. Methods 159-160: 129-137.
  • Gibbs, E.B., F. Lu, B. Portz, M.J. Fisher, B.P. Medellin, T.N. Laremore, Y.J. Zhang, D.S. Gilmour, S.A. Showalter (2017) Phosphorylation induces sequence-specific conformational switches in the RNA polymerase II C-terminal domain. Nature Communications 8:15233.
  • Portz, B., F. Lu, E.B. Gibbs, J.E. Mayfield, M.R. Mehaffey, Y.J. Zhang, J.S. Brodbelt, S.A. Showalter, and D.S. Gilmour (2017) Structural heterogeneity in the intrinsically disordered RNA polymerase II C-terminal domain. Nature Communications 8:15231.

Promoter proximal pausing

  • Qiu, Y., and D.S. Gilmour (2017) Identification of Regions in the Spt5 Subunit of DRB Sensitivity-inducing Factor (DSIF) That Are Involved in Promoter-proximal Pausing. Journal of Biological Chemistry 292:5555-5570.
  • Li, J., and D.S. Gilmour (2015) Reconstitution of factor-dependent, promoter proximal pausing in Drosophila nuclear extracts. Methods in Molecular Biology 1276:133-152.
  • Li, J., Y. Liu, H.S. Rhee, S.K. Ghosh, L. Bai, B.F. Pugh, D.S. Gilmour (2013) Kinetic Competition between Elongation Rate and Binding of NELF Controls Promoter-Proximal Pausing. Molecular Cell 50:711-722.
  • Li, J., and D.S. Gilmour (2013) Distinct mechanisms of transcriptional pausing orchestrated by GAGA factor and M1BP, a novel transcription factor. EMBO J. 32:1829-1841.
  • Missra, A. and D.S. Gilmour (2010) Interactions between DSIF (DRB sensitivity inducing factor), NELF (negative elongation factor), and the Drosophila RNA polymerase II transcription elongation complex. Proceedings of the National Academy of Sciences, USA 107:11301-11306.
  • Wu, C.-H., Y. Yamaguchi, L.R. Benjamin, M. Horvat-Gordon, J. Washinski, E. Enerly, J. Larsson, A. Lambertsson, H. Handa, and D. Gilmour (2003) NELF and DSIF cause promoter proximal pausing on the hsp70 promoter in Drosophila. Genes and Development 17:1402-1414.
  • Gilmour, D.S. and J.T. Lis (1986) RNA polymerase II interacts with the promoter region of the noninduced hsp70 gene in Drosophila melanogaster cells. Molecular and Cellular Biology 6:3984-3989.

Additional recent publications

  • Baumann, D.G., M.S. Dai, H. Lu, D.S. Gilmour (2017) GFZF, a glutathione S-transferase protein implicated in cell cycle regulation and hybrid inviability, is a transcriptional co-activator. Molecular and Cellular Biology 38:1-16.
  • Baumann, D.G., D.S. Gilmour (2017) A sequence-specific core promoter-binding transcription factor recruits TRF2 to coordinately transcribe ribosomal protein genes. Nucleic Acids Research 45:10481-10491.