Protein-Protein Interaction Map of the Arabidopsis thaliana General Transcription Factors A, B, D, E, and F


PROTEIN-PROTEIN INTERACTION MAP OF Arabidopsis thaliana GENERAL TRANSCRIPTION FACTORS A, B, D, E, AND F

By

SHAI JOSHUA LAWIT

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA

2003

Copyright 2003 by Shai Joshua Lawit

This document is dedicated to my family, both genetic and scientific.

ACKNOWLEDGMENTS

I thank my dearest wife, Kristel Lynn, for her undying support for me. I also thank my son, Benjamin Owen, for constant interest in this document as I wrote and unending smiles and hugs at all times. I thank my parents and all of my family for instilling me with a desire for education and excellence. Of course, I have a great appreciation for the members of the Gurley lab (past, present, and future) who continually contribute to this field of research. John Davis and the members of his lab (especially Chris Dervinis for helping me to get access to the Poplar genomic sequences and Ram Kishore Alavalapati for running all the PAUP analyses) deserve special thanks for technical assistance, collaboration, and helpful discussions. I would finally like to thank William B. Gurley, Eva Czarnecka-Verner, Robert Ferl, Alice Harmon, Karen Koch, Donald McCarty, Thomas Yang, Robert R. Schmidt, Waltraud I. Dunn, and the entire teaching faculty who have molded me into the scientist that I am.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
ABSTRACT

CHAPTER

1 INTRODUCTION TO THE LITERATURE
   General Transcription Factors
      TATA Binding Protein and TFIID
      TATA Binding Protein-Associated Factors
         Histone-like TAFs
         TAF1 family
         Other TAFs and interactions of TFIID
      Alternative TBP- or TAF-Containing Complexes
      TAFs: Required Factors or Optional Accessories
      Interplay of GTFs
   Transcriptional Activators That Bind DNA

2 PHYLOGENETIC ANALYSIS OF POPLAR, Arabidopsis AND OTHER PLANT GENERAL TRANSCRIPTION FACTORS
   Introduction
   Methods
   Results
      TFIIA Large and Small Subunits
      TFIIB Family
      Representative TFIID Components
      TFIIEα and TFIIEβ Subunits
      TFIIFα and TFIIFβ Subunits
   Discussion
      TFIIA Large and Small Subunits
      TFIIB Family
      Representative TFIID Components
      TFIIEα and TFIIEβ Subunits
      TFIIFα Family
      TFIIFβ Family

3 BINARY PROTEIN-PROTEIN INTERACTIONS OF THE Arabidopsis thaliana GENERAL TRANSCRIPTION FACTOR IID
   Introduction
   Materials and Methods
   Results
   Discussion

4 BINARY PROTEIN-PROTEIN INTERACTIONS OF Arabidopsis TFIIA, TFIIB, TFIID, TFIIE, AND TFIIF
   Introduction
   Materials and Methods
   Results
   Discussion

5 DISCUSSION
   TFIIA Large and Small Subunits
   TFIIB Family
   TFIID Components
   TFIIEα and TFIIEβ Subunits
   TFIIFα and TFIIFβ Subunits
   Conclusion

APPENDIX

A NUCLEOTIDE AND AMINO ACID SEQUENCES OF GENERAL TRANSCRIPTION FACTORS
   TFIIA Small Subunit Sequences
   TFIIA Large Subunit Sequences
   TFIIB Family Sequences
   TATA Binding Protein Sequences
   TAF6 Sequences
   TAF9 Sequences
   TAF10 Sequences
   TAF11 Sequences
   TFIIEα Sequences
   TFIIEβ Sequences
   TFIIFα Sequences
   TFIIFβ Sequences

B AMINO ACID MULTIPLE SEQUENCE ALIGNMENTS FOR CORE DOMAINS OF THE GENERAL TRANSCRIPTION FACTORS
   TFIIA Small Subunit Alignment
   TFIIA Large Subunit Alignment
   TFIIB Family Alignment
   TBP Alignment
   TAF6 Alignment
   TAF9 Alignment
   TAF10 Alignment
   TAF11 Alignment
   TFIIEα Alignment
   TFIIEβ Alignment
   TFIIFα Alignment
   TFIIFβ Alignment

LIST OF REFERENCES
BIOGRAPHICAL SKETCH

LIST OF TABLES

1-1. TATA binding protein-associated factors of the TFIID complex
1-2. Protein-protein interactions of TFIID in Homo sapiens, Drosophila melanogaster, and Saccharomyces cerevisiae with corresponding references
1-3. Protein-protein interactions between TFIIA, TFIIB, TFIID, TFIIE, and TFIIF subunits in Homo sapiens, Drosophila melanogaster, and Saccharomyces cerevisiae with corresponding references
2-1. Arabidopsis GTF genes, loci, genomic sizes, coding sequence sizes (counting stop codons), predicted protein molecular weights, and pI of the predicted proteins
2-2. Similarity and identity percentage ranges of the GTF protein families examined
3-1. Primers for amplification of TBP and TAF-like cDNAs and cloning into pENTR/D-Topo or pDONR207 vectors
3-2. Primers for cloning of TAF12 N-terminal, middle, and C-terminal fragments
3-3. Arabidopsis thaliana TFIID subunit cDNA GenBank accession numbers
3-4. A yeast two-hybrid targeted protein-protein interaction matrix between subunits of the Arabidopsis thaliana TFIID complex
4-1. Primers for amplification of cDNAs to Arabidopsis homologs of TFIIA, TFIIB, TFIIE, and TFIIF and cloning into the pENTR/D-Topo vector
4-2. Primers for cloning of TFIIE 2 N-terminal and C-terminal fragments
4-3. Arabidopsis thaliana TFIIA, TFIIB, TFIIE, and TFIIF component cDNA GenBank accession numbers
4-4. A yeast two-hybrid targeted protein-protein interaction matrix between components of Arabidopsis thaliana TFIIA, TFIIB, TFIIE, and TFIIF with subunits of the TFIID complex

LIST OF FIGURES

1-1. The “two-step handoff” model of removal of auto-inhibition of TFIID by the TAF1 N-terminal domains TAND1 and TAND2 (T1 and T2, respectively)
1-2. Binary protein-protein interactions of the Homo sapiens general transcription factors TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and their homologs
1-3. Binary protein-protein interactions of Drosophila melanogaster general transcription factors TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and their homologs
2-1. Unrooted phylogram of TFIIA small subunit proteins from plants, humans, fruit flies, and yeast
2-2. Unrooted phylogram of TFIIA large subunit proteins from plants, humans, fruit flies, and yeast
2-3. Unrooted phylogram of TFIIB-related proteins from plants, humans, fruit flies, yeast, and Archaea
2-4. Unrooted phylogram of TBP-related proteins from plants, humans, fruit flies, yeast, and Archaea
2-5. Unrooted phylogram of TAF6-related proteins from plants, humans, fruit flies, and yeast
2-6. Unrooted phylogram of TAF9-related proteins from plants, humans, fruit flies, and yeast
2-7. Unrooted phylogram of TAF10-related proteins from plants, humans, fruit flies, and yeast
2-8. Unrooted phylogram of TAF11-related proteins from plants, humans, fruit flies, and yeast
2-9. Unrooted phylogram of TFIIEα-related proteins from plants, humans, fruit flies, yeast, and Archaea
2-10. Unrooted phylogram of TFIIEβ-related proteins from plants, humans, fruit flies, and yeast
2-11. Unrooted phylogram of TFIIFα-related proteins from plants, humans, fruit flies, and yeast
2-12. Unrooted phylogram of TFIIFβ-related proteins from plants, humans, fruit flies, and yeast
2-13. Multiple sequence alignment of the TFIIB region containing the conserved lysine residue that is acetylated in human and yeast TFIIB (in green)
2-14. Exon-intron diagrams of Arabidopsis TAF6 and TAF6b alternative splicing forms
3-1. Histogram of percent of matings, per bait construct, that yielded colony growth
3-2. Histogram of percent of matings, per prey construct, that yielded colony growth
3-3. Immunoblots of TFIID components expressed as bait fusion proteins in MaV204K
3-4. Immunoblots of TFIID components expressed as prey fusion proteins in AH109
3-5. Colorimetric assays of the β-galactosidase reporter levels in yeast diploids containing both bait and prey plasmids
3-6. Protein-protein interactions of Arabidopsis thaliana TFIID subunits as determined by yeast two-hybrid and β-galactosidase confirmations
3-7. Protein-protein interactions of Arabidopsis thaliana TFIID subunits as determined by yeast two-hybrid and β-galactosidase confirmations
4-1. Histogram of percent of matings, per bait construct, that yielded colony growth
4-2. Histogram of percent of matings, per prey construct, that yielded colony growth
4-3. Immunoblots of TFIIA, TFIIB, TFIIE, and TFIIF components expressed as bait fusion proteins in MaV204K
4-4. Immunoblots of TFIIA, TFIIB, TFIIE, and TFIIF components expressed as bait fusion proteins in AH109
4-5. Colorimetric assays of the β-galactosidase reporter levels in yeast diploids containing both bait and prey plasmids
4-6. Protein-protein interactions of Arabidopsis thaliana TFIIA subunits with components of TFIIB, TFIID, TFIIE, TFIIF, and other TFIIA components as determined by yeast two-hybrid and β-galactosidase confirmations
4-7. Protein-protein interactions of Arabidopsis thaliana TFIIB homologs with components of TFIIA, TFIID, TFIIE, TFIIF, and other homologs of TFIIB as determined by yeast two-hybrid and β-galactosidase confirmations
4-8. Protein-protein interactions of Arabidopsis thaliana TFIID components with components of TFIIA, TFIIB, TFIIE, and TFIIF as determined by yeast two-hybrid and β-galactosidase confirmations
4-9. Protein-protein interactions of Arabidopsis thaliana TFIIE subunits with components of TFIIA, TFIIB, TFIID, TFIIF, and other TFIIE components as determined by yeast two-hybrid and β-galactosidase confirmations
4-10. Protein-protein interactions of Arabidopsis thaliana TFIIF subunits with components of TFIIA, TFIIB, TFIID, TFIIE, and other TFIIF components as determined by yeast two-hybrid and β-galactosidase confirmations
4-11. Protein-protein interactions of Arabidopsis thaliana TFIIA, TFIIB, TFIIE, and TFIIF with each other and subunits of TFIID as determined by yeast two-hybrid and β-galactosidase confirmations
4-12. Strong protein-protein interactions of Arabidopsis thaliana TFIIA, TFIIB, TFIIE, and TFIIF with each other and subunits of TFIID as determined by yeast two-hybrid and β-galactosidase confirmations
5-1. Protein-protein interactions among TFIIA, TFIIB, TFIIE, TFIID, and TFIIF that are unique to Arabidopsis thaliana as determined by yeast two-hybrid and β-galactosidase confirmations
5-2. Protein-protein interactions of Arabidopsis thaliana TFIIA, TFIIB, TFIID, TFIIE, and TFIIF that have been reported previously for Homo sapiens, Drosophila melanogaster, and/or Saccharomyces cerevisiae homologs
5-3. Interactions of Homo sapiens, Drosophila melanogaster, and/or Saccharomyces cerevisiae TFIIA, TFIIB, TFIID, TFIIE, or TFIIF that were not confirmed for Arabidopsis thaliana homologs

Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

PROTEIN-PROTEIN INTERACTION MAP OF THE Arabidopsis thaliana GENERAL TRANSCRIPTION FACTORS A, B, D, E, AND F

By

Shai Joshua Lawit

December 2003

Chair: William B. Gurley
Cochair: Eva Czarnecka-Verner
Major Program: Plant Molecular and Cellular Biology

General transcription factor IID (TFIID) is a protein complex central to the nucleation of the deoxyribonucleic acid (DNA)-dependent ribonucleic acid (RNA) polymerase II (PolII) preinitiation complex (PIC) and is critical to the transcriptional activation of many genes. The presence of TFIID at a promoter leads to recruitment of the other general transcription factors (GTFs) TFIIA, TFIIB, TFIIE, TFIIH, and TFIIF (in association with PolII). While GTFs have been heavily studied in metazoans and yeast, little is known about their functions in the plant kingdom. Recent studies of selected GTF proteins in plants have uncovered possible plant-specific and developmental roles, suggesting that some GTF proteins have evolved different functions since the last common ancestor of plants, animals, and fungi.

The specific objectives for characterization of the GTFs from Arabidopsis were to identify the GTF proteins and uncover their binary interactions. A number of genes for putative GTFs were identified by homology-based searches. These newly identified genes added to the two previously known TATA-binding protein (TBP) genes, three genes encoding subunits of TFIIA, and two TFIIB genes. Of these genes, 16 encoded TBP-associated factor-like proteins (TAFs), and 14 encoded putative components of TFIIA, TFIIB, TFIIE, and TFIIF in Arabidopsis. Many of their complementary DNAs (cDNAs) were cloned using reverse transcriptase-mediated polymerase chain reaction (PCR). Often, these clones were the first confirmation of messenger RNAs for their respective genes.

The cDNAs of these Arabidopsis GTF genes have been subcloned into yeast two-hybrid bait and prey vectors, and transformed into yeast MATa and MATα strains, respectively. Using a targeted interaction scheme, 1598 interactions were tested. Interactions that yielded colony growth in the yeast two-hybrid system were verified using β-galactosidase assays.

A map of binary protein-protein interactions between the subunits of Arabidopsis TFIIA, TFIIB, TFIID, TFIIE, and TFIIF was constructed. Of the 112 interactions, 36.4% were protein interactions that were previously characterized in other systems. However, 63.6% (112) of the interactions were novel. This is the first comprehensive protein-protein interaction map for TFIIA, TFIIB, TFIID, TFIIE, and TFIIF and has elucidated new PIC nucleation pathways (i.e., a TAF8-TAF10 heterotetramer with extensive protein contacts).

CHAPTER 1
INTRODUCTION TO THE LITERATURE

General Transcription Factors

The central dogma of molecular biology defines the flow of genetic information as directed from deoxyribonucleic acid (DNA), to ribonucleic acid (RNA), and ultimately to protein. In eukaryotes, DNA-dependent RNA polymerase II (PolII) is responsible for the transcription of some small nuclear RNAs and all messenger RNA (mRNA is the only RNA that is translated into protein) (Burley and Roeder, 1996). Initiation of transcription by PolII is a complex process and requires interactions of many proteins that comprise the general transcription factors (GTFs) TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIIH (Matsui et al., 1980; Samuels et al., 1982; Albright and Tjian, 2000). In the first step, TFIID nucleates the assembly of the GTFs at the TATA element of PolII promoters. This is achieved in part by sequence-specific recognition of DNA by TATA binding protein (TBP).

Two predominant models describe PolII assembly at a promoter. The stepwise model states that GTFs are sequentially recruited to a promoter in a predetermined order (TFIID, TFIIB, TFIIA, TFIIF with PolII, TFIIE, and TFIIH) (Buratowski et al., 1989; Flores et al., 1992; Koleske and Young, 1995). The second model is called the holoenzyme model because TFIIB, TFIIE, TFIIF, TFIIH, Mediator (a compilation of suppressor of RNA polymerase B proteins, and numerous other proteins), and PolII are pre-assembled as a multi-protein megadalton (MDa) complex before recruitment to the promoter (Koleske and Young, 1994, 1995). Both models require either TBP or TFIID to nucleate the assembly of the preinitiation complex (PIC) at the promoter as a first step. In total, the PolII PIC is composed of at least 48 protein subunits: 12 PolII; 18-20 TFIID, including multiple copies of some TBP-associated factor (TAF) subunits; one TFIIB; four TFIIF (α2β2) (Flores et al., 1990); four TFIIE (α2β2) (Ohkuma et al., 1990); and 9 TFIIH.

In recent years, the holoenzyme model has gained preference in the transcription-oriented community. Evidence in support of the holoenzyme model includes the isolation of the large holoenzyme complex from a variety of species and the finding that artificial recruitment of holoenzyme subunits (out of order according to the stepwise model) to a promoter leads to high levels of transcription (Koleske and Young, 1994; Berk, 1999). Furthermore, early purifications of PolII lacking some members of the holoenzyme can be accounted for by low abundance relative to PolII and unfavorable protein-protein interaction conditions in the purification schemes (Koleske and Young, 1994; Berk, 1999). However, arguing against the importance of the holo-form of the enzyme is the finding that quantitative immunoblots of HeLa extracts demonstrated that only 3% of soluble PolII was present in a holoenzyme-sized complex (Kimura et al., 1999).

The stepwise model implies that multiple assembly steps are involved in formation of the PIC. Conversely, the holoenzyme model suggests that there are only two major regulatory steps in PIC formation: recruitment of either TFIID or holoenzyme to the promoter and subsequent recruitment of the reciprocal complex through protein-protein interaction (Koleske and Young, 1995). In either case, recognition of a promoter by TFIID is certainly a prominent regulatory step in the initiation of transcription by PolII.
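The subunit tally quoted above for the PolII PIC can be checked with a few lines of arithmetic. The following is only an illustrative sketch: the dictionary name, the function name, and the (minimum, maximum) representation are assumptions introduced here, the counts are copied from the text, and TFIIA is omitted because the text gives no count for it.

```python
# Subunit counts as quoted in the text for the PolII preinitiation complex.
# Each entry is (minimum, maximum); only TFIID is given as a range (18-20).
# PIC_SUBUNITS and pic_subunit_range are hypothetical names for illustration.
PIC_SUBUNITS = {
    "PolII": (12, 12),
    "TFIID": (18, 20),  # includes multiple copies of some TAFs
    "TFIIB": (1, 1),
    "TFIIF": (4, 4),    # heterotetramer, read here as alpha2-beta2
    "TFIIE": (4, 4),    # heterotetramer, read here as alpha2-beta2
    "TFIIH": (9, 9),
}


def pic_subunit_range(counts):
    """Return the (minimum, maximum) total number of subunits."""
    low = sum(lo for lo, _ in counts.values())
    high = sum(hi for _, hi in counts.values())
    return low, high


if __name__ == "__main__":
    low, high = pic_subunit_range(PIC_SUBUNITS)
    print(f"PIC subunits: {low}-{high}")
```

Under these counts the lower bound is 48, which matches the "at least 48 protein subunits" stated in the text; the upper bound reflects only the 18-20 range given for TFIID.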

3 TATA Binding Protein and TFIID The saddle-shaped TBP constitutes the core of the TFIID complex. This evolutionarily ancient protei n is essential for recogniti on of promoters by all three eukaryotic RNA polymerases, as well as the archaeal RNA polymerase (Rowlands et al ., 1994). This was reviewed by Buratowski (1994); and Burley a nd Roeder (1996). Promoter specificity for DNAdependent RNA polymerase I (PolI), which transcribes ribosomal DNA, is determined by selectivity fact ors. These selectivity factors are SL1 in humans (Learned et al ., 1985), TIF-IB in mice (Clos et al ., 1986), and CF in yeast (Lin et al ., 1996). These selectivity factors contai n TBP and three TBP associated factors (TAFIs): TAFI48, TAFI63, and TAFI110) (Comai et al ., 1994; Zomerdijk et al ., 1994). The TAFIs contact the consensus elements in ribosomal DNA core promoters and are vital to PolI transcription (Learned et al ., 1985; Clos et al ., 1986). In addition, TAFIs contact a transcriptional activ ator that enhances Selectiv ity factor 1 (SL1) binding and increases PolI transcription (Beckmann et al ., 1995; Steffan et al ., 1996). RNA polymerase III is responsible for the transcription of transfer RNAs, 5S ribosomal RNA, small ribonuclear protein RNAs, and small nuclear RNAs (snRNAs; which in some cases are also transcribed by PolII) (Carmo-Fonseca et al ., 2000). In mammals, TBP plus the PolIII selectivity -factors PTF/SNAPc (proximal-sequence element-binding transcription-factor/small nuclear RNA-activati ng protein complex) recognize the snRNA promoters (Murphy et al ., 1992; Sadowski et al ., 1993). PTF/SNAPc is composed of four subunits th at bind core snRNA promoter elements and stabilize TBP binding at these promoters (Y oon and Roeder, 1996; Mittal and Hernandez, 1997). In yeast, two essential TAFIIIs bind TBP, forming the TFIIIB complex, which

PAGE 17

4 functions as an snRNA promoter selectivity factor (Margottin et al ., 1991; Joazeiro et al ., 1994). In addition to its role in PolI and PolIII-mediated gene expression, TBP also serves as a promoter selectivity factor for PolII within the TFIID complex. In early experiments, TBP was thought to be the sole component of TFIID. These experiments showed that TBP alone could form the foundation of the PIC in vitro (Peterson et al ., 1990). TBP is a saddle-shaped prot ein with near symmetry (Nikolov et al ., 1992). The concave surface of TBP intera cts with the minor groove of the DNA TATA-element; and the convex surface is left open to protein-protein interactions (Chasman et al ., 1993; Kim et al ., 1993a; Kim et al ., 1993b; Albright and Tjian, 2000). The C-terminal stirrup of TBP interacts with the 5’-end of the TATA element and TFIIB, which facilitates the directionality of the TATA inte raction and transcription (Nikolov et al ., 1995). The combined interaction of TBP and TFIIB with TATA leads to an 80 bend in the promoter (Kim et al ., 1993b) that is believed to lead to a wrapping of DNA from –60 to +40 around PolII (Coulombe, 1999; Coulombe and Burton, 1999). Later experiments demonstrated that TB P alone could not initiate activatordependent transcription with some transactivators (Hoey et al ., 1990; Kambadur et al ., 1990; Peterson et al ., 1990; Pugh and Tjian, 1990; Burl ey and Roeder, 1996). Moreover, it was found that TBP binding to DNA is not required at TATA-less promoters that contain an Initiator element (Zhou et al ., 1992; Martinez et al ., 1994). These data suggested that TFIID might be compos ed of TBP and accessory factors (TAFIIs, referred to as TAFs from here forward). When Dynlacht et al (1991) found peptides associated with TBP that provided coactiv ator function, it was realized that TFIID was composed of

PAGE 18

5 more than just TBP. It is now confirme d that TBP and 8 to 14 TAFs comprise TFIID (Dynlacht et al ., 1991; Sanders and Weil, 2000). Ther efore, TBP recruitment (often a limiting step of PolII transcription) (Klein and Struhl, 1994; Chatterj ee and Struhl, 1995) can be achieved by targeted recr uitment TAFs to a promoter. Furthermore, some of the TAFs (in addition to TBP) form the promot er specific DNA interacting surface of TFIID, probably increasing the stability of a promoter-TFIID interaction (Martinez et al ., 1994; Burke and Kadonaga, 1997; Chalkley and Ve rrijzer, 1999)}. Since SLI, PTF/SNAPc, and TFIID all rely on TBP, they must be fo rmed by mutual exclusion of the various TAF protein-protein interactions with TBP. This model has been experimentally demonstrated in that TAFIs exclude TAF2 and TAF1 binding to TBP, and vice versa (Comai et al ., 1994). Arabidopsis TBP has been well characterized and indeed was the first TBP structure from any organism to be el ucidated by X-ray cr ystallography (Nikolov et al ., 1992). Many subsequent papers described the structure of Arabidopsis TBP in complex with the TATA-box, TFIIA, and TFIIB and various combinations thereof (Kim et al ., 1993a; Kim and Burley, 1994; Ni kolov and Burley, 1994; Nikolov et al ., 1995). The Arabidopsis TBP was utilized in these studies not because of an interest in the mechanisms of plant gene regulation, but ra ther because it closely approximates a TBP core structure, lacking the l ong N-terminal extensions found in metazoans. Nonetheless, the structure of the Arabidopsis TBP, its DNA-sequence recognition sites (Mukumoto et al ., 1993), bending of the TATA element (Takeda et al ., 1994), interactions with transcriptional regulators (Qadri et al ., 1995; Reindl and Schoffl, 1998; Le Gourrierec et

PAGE 19

al., 1999; Pan et al., 1999), and its effect on growth of Arabidopsis when overexpressed (Li et al., 2001) have been well characterized.

TATA Binding Protein-Associated Factors

TAFs may have a wide range of activities (such as coactivation, repression, and even protein modification) that are thus bestowed on the TFIID complex (Albright and Tjian, 2000). It is now accepted that the TFIID complex contains at least 14 TAF subunits. Some of these subunits have enzymatic activity, while many do not have apparent catalytic properties. Most of the TAFs are in stoichiometric unity, while some are in multiple copies in a complex, and others are substoichiometric. These proteins have been studied extensively, and this work will be discussed below.

Histone-like TAFs

It is believed that a structural core of TFIID is composed of the histone-like TAFs. Yeast TAF6 (yTAF6), yTAF9, and yTAF12 have similarity with histones H4, H3, and H2B, respectively. Additionally, evidence supports a theory that yTAF4 has structural similarity with the histone H2A, forming a heterodimer with yTAF12 (Gangloff et al., 2000; Sanders and Weil, 2000). Humans contain two distinct TAF4 homologs: TAF4 and TAF4b. TAF4b appears to be a likely target of activators responsible for transcription of genes in B-cells, specifically the -promoter or enhancer (Dikstein et al., 1996b). Interestingly, TAF4b protein levels are post-transcriptionally regulated such as to reduce the protein levels in non-B-cells to below detection limits (Dikstein et al., 1996b). The crystal structure of Drosophila TAF9 and 6 (dTAF9 and 6) displays histone-fold domains (HFDs) that interact with one another, forming a dTAF9-dTAF6 dimer (Xie et al., 1996). TAF dimerization is critical to TFIID formation because

PAGE 20

TAF HFDs are predicted to not properly fold when the binding partner is lacking (Burley and Roeder, 1996).

It has been hypothesized that some of the histone-like TAFs bind DNA and form a structure similar to a nucleosome, perhaps in conjunction with the TBP-TFIIB bending of DNA (Coulombe, 1999). This hypothesis was supported by data showing that TFIID introduces a negative supercoil in bound DNA, much like a nucleosome (Oelgeschlager et al., 1996). However, TAFs do not contain the arginine residues found in histone tails that bind the minor groove of DNA (Luger et al., 1997; Workman and Kingston, 1998; Albright and Tjian, 2000). Additionally, structural evidence refutes a nucleosome-like organization. Human TFIID is composed of four dimensionally equivalent domains, none of which is large enough to contain the histone-fold TAF-octamer (Brand et al., 1999). Despite this evidence, DNase I footprinting experiments suggest that DNA is in some way wound around TFIID, approximately one turn as opposed to the two turns characteristic of nucleosomes (Michel et al., 1998). Furthermore, dTAF6 and 9 contact a conserved downstream promoter element (DPE) in photo-crosslinking experiments (Burke and Kadonaga, 1997). Selleck and colleagues (Selleck et al., 2001) demonstrated that bacterially expressed yeast histone-like TAFs 4, 6, 9, and 12 can self-assemble into a TAF octamer in a 2:2:2:2 ratio in vitro. Overexpression of these yeast histone-like TAFs individually has the capacity to suppress mutations in the other members of the octamer (Selleck et al., 2001). Additionally, by co-immunoprecipitation, TAF4 and TAF4b have been shown to be present within the same TFIID complex, indicating that there are likely two copies of TAF4 in TFIID (Dikstein et al., 1996b). This evidence constitutes strong support for a histone-like octamer in TFIID, or some variation thereof. Irrespective of

PAGE 21

DNA binding, multiple, possibly interchangeable histone-fold proteins within TFIID suggest that this complex has a flexible composition modulated in response to cellular signals (Albright and Tjian, 2000).

More recently, it has been realized that other TAFs contain conserved histone-fold domains, such as TAF3, TAF8, TAF10, TAF11, and TAF13 (Leurent et al., 2002). In addition to the TAF6-TAF9 association, a number of TAF heterodimers containing HFDs have been shown to form, including human TAF11 (hTAF11)-hTAF13 (Birck et al., 1998), human and yeast TAF4-TAF12 (Gangloff et al., 2000; Reese et al., 2000; Sanders and Weil, 2000), TAF3-TAF10 (Gangloff et al., 2001b), and TAF8-TAF10 (Gangloff et al., 2001b). TAFs 11 and 13 are unique in that they are a histone-fold binding-pair that is specific to the TFIID complex (Grant et al., 1998). However, it is believed that they are structural and functional orthologs of the Spt3 subunit of the Spt-Ada-Gcn5-acetyltransferase (SAGA) complex (Apone et al., 1998; Birck et al., 1998). The full complement of putative histone-like TAFs has been identified in Arabidopsis thaliana with the exception of TAF3 (Table 1-1). Several of the putative Arabidopsis TAFs are members of two-gene families. These include the TAF1/1b, TAF4/4b, TAF6/6b, TAF11/11b, TAF12/12b, TAF14/14b, and TAF15/15b genes.

TAF1 family

Proteins in the TAF1 family represent the largest TAFs in animal and plant cells, and have a bevy of biochemical roles including histone acetyltransferase (HAT) activity and protein kinase activity (Takada et al., 1992; Lee and Young, 1998). The protein kinase phosphorylates the large subunit of TFIIF ( or RAP74), but ultimately has an unknown downstream function (Dikstein et al., 1996a; O'Brien and Tjian, 1998). Yeast

PAGE 22

TAF1 can also phosphorylate TFIIA, apparently raising levels of transcription and increasing the affinity for TBP (Solow et al., 2001). The TAF1 family also has a recently identified ubiquitin conjugating activity (ubac); however, the utility of this function is yet to be elucidated (Pham and Sauer, 2000). The substrate of the ubac activity is histone H1, which it monoubiquitylates. Histone H1 monoubiquitylation may lead to transcriptional activation, since monoubiquitylation of histones H2A and H2B has been correlated with actively transcribed genes (Davie and Murphy, 1990).

Generally, TBP binding to the TATA box and PIC formation are inhibited by nucleosomes, suggesting that a condensed chromatin state directly inhibits transcription (Workman et al., 1991; Imbalzano et al., 1994; Workman and Kingston, 1998). Furthermore, a correlation between hyperacetylated (open) chromatin and regions that are transcriptionally active leads to the hypothesis that HAT activity functions in the remodeling of chromatin to activate transcription (Ayer, 1999; Grant and Berger, 1999; Wolffe and Guschin, 2000). The TAF1 family HAT activity implies that histone acetylation is important at the core promoter to aid transcription factor/chromatin contacts (Workman and Kingston, 1998). Furthermore, it is known that human TAF1 also acetylates TFIIF and the subunit of TFIIE (Imhof et al., 1997). This activity is known as Factor Acetyltransferase (FAT) activity; however, the significance of FAT activity is unclear (Grant and Berger, 1999). The yTAF1 FAT/HAT might acetylate histones near the core promoter, basal transcription factors, or other unknown protein targets (Jacobson et al., 2000). FAT/HAT and kinase activities may function in conjunction as a signal transduction cascade targeting GTFs and histones and ultimately result in gene activation (Albright and Tjian, 2000; Jacobson et al., 2000). Two putative

PAGE 23

Arabidopsis TAF1 homologs (AtTAF1 and AtTAF1b) also appear to have these conserved domains responsible for the activities described above (HAT/FAT, protein kinase, and ubiquitin conjugating activity; E. Czarnecka-Verner and W.B. Gurley, unpublished data).

At the C-terminus, human and Drosophila TAF1 contain two tandem bromodomains that are known to bind acetylated lysine residues of histone H4 (Jacobson et al., 2000). The first bromodomain shown to bind acetylated histones (H3 and H4 NH2-terminal peptide) was from p300/CBP-associated factor (PCAF) (Dhalluin et al., 1999). A later binding study with hTAF1 found that the double bromodomain motif has an affinity for acetylated H4 peptide 70-fold greater than the single domain of PCAF (Dhalluin et al., 1999; Jacobson et al., 2000). Acetylation of four histone H4 lysines is correlated with transcriptional activity, raising the possibility that the role of the hTAF1 bromodomain(s) is to bind to the acetyl-lysines, thereby facilitating histone modification by the HAT domain (Jacobson et al., 2000). Interestingly, yeast TAF1 is lacking bromodomains and a C-terminal kinase domain; however, a separate protein, bromodomain factor 1 (Bdf1), was identified that contains two bromodomains and a kinase domain and is found in association with TFIID (Matangkasombut et al., 2000). Due to these structural and functional similarities to the C-terminus of hTAF1, Bdf1 is hypothesized to be functionally analogous to the hTAF1 C-terminus.

In Arabidopsis, TAF1 and TAF1b each have a single bromodomain (W.B. Gurley, unpublished data), and thus are hypothesized to have a limited affinity for acetylated histone H4, perhaps as much as 70-fold lower affinity than hTAF1 (Jacobson et al., 2000). Drawing on the parallel with yeast, it is hypothesized that Arabidopsis may also

PAGE 24

express a protein analogous to Bdf1, supplementing the AtTAF1 and AtTAF1b single bromodomains. However, analyses of Arabidopsis genomic sequence did not reveal a Bdf1-analogous protein to be present.

Another property of the TAF1 family is a capacity to auto-inhibit TFIID function (Kokubo et al., 1993b). This regulatory property resides in the N-terminal domain (TAND1) that acts as a TATA-element minor-groove mimic. Inhibition is achieved by competition between the TAND1 and the TATA-box for binding to the concave surface of TBP (Liu et al., 1998; Kotani et al., 2000). While the TAND1-TBP interaction may add to the stability of the TFIID complex, it is not required for TFIID integrity (Albright and Tjian, 2000). A TAND1-adjacent domain, TAND2, appears to stabilize the TAND1-TBP interaction by binding the helix H2 on the convex face of TBP (Burley and Roeder, 1998).

It has been shown through domain swapping that there is a functional conservation between the TAND1 and acidic activation domains of transactivator proteins such as VP16 (Kotani et al., 2000). In domain-swapping experiments, acidic activator domains are capable of TFIID inhibition when translationally fused to yTAF1; conversely, TAND1 is capable of serving as an activator when fused to a DNA-binding domain (Kotani et al., 2000). This leads to the “two-step hand-off model” in which the auto-inhibition caused by TAND1 is competed by acidic activators. Subsequently, the TAND2 interaction with the TBP convex surface is competed by TFIIA, ultimately leading to a cooperative removal of the inhibitory region of TAF1 from TBP (Figure 1-1) (Kotani et al., 2000). In this model, the TFIIA-TFIID-acidic activator intermediate allows some TAFs to bind near the transcriptional initiation site (TAF2 binds Initiator, TAF6/TAF9 dimer binds DPE) and leads to the TATA-box displacing the acidic activator

PAGE 25

from the concave surface of TBP (Kotani et al., 2000). However, this model may not prove useful in humans or Drosophila, where the TBP-TAND1 affinity is much higher than that of yeast (Kotani et al., 2000). A study using HeLa heat-treated chromatin demonstrated that TFIID had less capacity for transcriptional initiation than TBP alone, possibly due to the TAND1-based inhibition of TBP in the absence of functional transcriptional activators (Remboutsika et al., 2001).

The genomic region encoding A. thaliana TAF1b is lacking a conserved sequence corresponding to the yeast and metazoan TBP-inhibiting TAND1 and TAND2. On the other hand, AtTAF1 appears to contain a TAND1 and TAND2 (E. Czarnecka-Verner and W.B. Gurley, unpublished data). AtTAF1b seems to be the only example of a TAF1 homolog lacking a TAND1/2, possibly altering the transcriptional activation characteristics of AtTFIID.

Interestingly, in a microarray experiment utilizing a temperature sensitive (TS) mutant yeast, only 16% of PolII promoters showed dependence on yTAF1 (Holstege et al., 1998). Many of the genes that were down-regulated upon temperature shift were cell cycle and DNA repair genes (not unexpected given the original identification of yTAF1 as a cyclin). The surprise is in the low percentage of yTAF1-dependent genes. This apparent low dependence on yTAF1 can be explained by the overlap in function between TFIID and other TAF-containing complexes (such as SAGA) that do not require yTAF1. Alternatively, the yTAF1 TS mutation could be leaky in its disruption of function, as seen for other TS TAF mutations, and thus underrepresent the magnitude of the yTAF1 contribution to gene regulation in these experiments (Michel et al., 1998).

PAGE 26

Other TAFs and interactions of TFIID

Many other TAFs are known, but for the most part their functions are quite nebulous. The metazoan TAF2 family represents one exception in that it has been shown to bind the core promoter Initiator element. This was shown directly by DNase I footprinting and electrophoretic mobility shift assays (EMSA) with recombinant Drosophila TAF2 (Albright and Tjian, 2000). However, it seems unlikely that this should be the only function of TAF2 due to its large size (~150 kDa). For example, TBP binds a specific DNA sequence with only 20% the mass of TAF2. With so many properties attributed to the TAF1 family, it would be intellectually satisfying to find other roles for TAF2. Interestingly, consensus sequences for plant and fungal Initiator elements have not been defined and it is unknown whether the TAF2 proteins in these kingdoms bind to core promoter elements.

Other than TAF1, the only other stoichiometric TAF with enzymatic activity is TAF7 (personal communication, M. Horikoshi). TAF7 from humans and yeast (but not Drosophila or Arabidopsis) has similarity with the von Willebrand factor type A domain (VWA). Interestingly, the majority of VWA-containing proteins are extracellular. However, very ancient VWA-containing proteins (found in all eukaryotes) are intracellular proteins involved in various cellular tasks such as transcription, DNA repair, ribosomal and membrane transport, and the proteasome protein degradation pathway (Colombatti et al., 1993). One feature common to these proteins is the formation of multiprotein complexes (Colombatti et al., 1993). It is important to note that yTAF7 most closely resembles the ATPases associated with diverse cellular activities (AAA) ATPase family of VWA-containing proteins. A yeast two-hybrid experiment from the lab of Laurie Stargell recently demonstrated that of all the TAFs, only yeast TAF7 was

PAGE 27

directly associated with TBP (Yatherajam et al., 2003). This, taken together with the fact that the TAND1 domain of TAF1 binds to the concave surface of TBP, inhibiting TBP dimerization and promoter binding, may suggest that a role of some TAFs is to prevent and dissociate nonproductive TBP interactions. Gegonne et al. (2001) demonstrated that hTAF7 bound hTAF1 and inhibited the FAT/HAT activity. Therefore, I propose an alternative function for the putative TAF7 ATPase activity: shutting off the TAF1 FAT/HAT activity by acting as a chaperone protein. Regardless of the veracity of these models, the developing TAF7 story is of great interest.

Other classical TAFs have no recognized role besides protein-protein or protein-DNA interactions. However, it is thought that these interactions may stabilize the PIC (Burley and Roeder, 1996). The recent work of Yatherajam et al. (2003) elaborated the binary protein-protein interactions within the TFIID complex of yeast. These interactions and others for TFIID from humans, Drosophila, and yeast are assembled in Table 1-2. It is significant that a large number of the binary interactions of yeast TFIID are nucleated by five histone-fold TAFs (TAFs 4, 6, 9, 10, and 12), four of which (TAFs 4, 6, 9, and 12) are proposed members of a nucleosome-like octamer (Yatherajam et al., 2003). In addition, a large number of protein-protein interactions occur between TAFs and other GTFs, as will be discussed in detail below.

Human TAF10, which has affinity for the estrogen receptor and thus may play a role in estrogen dependent activation, is found in only a subset of cellular TFIID complexes (Jacq et al., 1994). Nevertheless, a TS mutation of yTAF10 was shown to impede bulk transcription of mRNA and destabilize TFIID and SAGA upon temperature shift (Sanders et al., 1999). These results suggest a more general role for yTAF10 than

PAGE 28

for its human homolog hTAF10. Another example of a TFIID specialization is seen with hTAF13 (which binds hTAF11; see above). The human TAF11-TAF13 pair may be an alternative to hTAF10 because it is found only in TFIID complexes lacking hTAF10 (Jacq et al., 1994; Mengus et al., 1995).

Several TAFs (besides the TAF2, TAF6, and TAF9 families mentioned above) contact promoter DNA and these include hTAFs 1, 4, 5, and 7 (Oelgeschlager et al., 1996). These interactions aid in the stabilization of promoter-TFIID interactions. TBP interactions with the TATA box have similar affinity to TAF interactions with promoter DNA (Purnell et al., 1994); therefore, TAF-promoter binding is critical for transcription from TATA-less promoters (Bell and Tora, 1999). Thus, it is hypothesized that some TAFs function in recruiting TFIID to TATA-less promoters, participating in promoter recognition by TFIID (Bell and Tora, 1999).

The TAF15 family of TAFs is a group of pro-oncoproteins that are common sites of chromosome translocations in human sarcomas (Bertolotti et al., 1996; Attwooll et al., 1999). These are hTAF15, TLS/FUS (translocated in liposarcoma/fusion of CHOP) and EWS (Ewing sarcoma) and are all RNA binding proteins with high similarity to RNA-binding domain (RNP-CS). TAF15 binds not only RNA but also single stranded DNA (ssDNA) (Bertolotti et al., 1996). Like TAF15, TLS/FUS and EWS associate with TFIID in a mutually exclusive manner (Bertolotti et al., 1996; Bertolotti et al., 1998). TAF15 and EWS contact exactly the same subunits of TFIID (Bertolotti et al., 1998), suggesting that EWS (and possibly TLS/FUS) are TAF15b (and TAF15c) proteins. Similarly to TAF14, TAF15 and EWS are also associated with another core transcription complex, in this case PolII (Bertolotti et al., 1998). TAF15 and EWS contacted the hRPB3 subunit

PAGE 29

(Bertolotti et al., 1998). However, only TAF15 contacted hRPB5 and hRPB7 (Bertolotti et al., 1998).

Recent work (presented at the 22nd Summer Symposium in Molecular Biology: Chromatin Structure and Function) from the laboratory of Stephen Buratowski, in which a large-scale purification of TFIID was performed from yeast cells, has demonstrated sub-stoichiometric association of four ubiquitin machinery proteins with TFIID (Auty et al., 2003). Although these proteins (BRE5, BUL1, UBP3, and UBP8) are found in many other complexes, they may be operationally defined as TAFs due to their association in TFIID. Yeast TAF1 does not have a demonstrated ubiquitylation activity, nor the domains associated with this activity (Wassarman and Sauer, 2001), as reported for Drosophila TAF1 (Pham and Sauer, 2000). Perhaps the two ubiquitin-conjugases are in some way adopting this role in yeast. Recent evidence from the Drosophila melanogaster genome-wide protein-protein interaction study shows a ubiquitin conjugase interaction with TAF10 (Giot et al., 2003), suggesting that the presence of ubiquitylation machinery in TFIID is conserved (and not due to a complementation of an activity that yeast TAF1 is lacking).

Proteins involved in ubiquitylation (such as the E2-conjugases) could be involved in activation of transcription and lead to degradation of inhibitory proteins (i.e., histones) or may result in a rapid turnover of transcriptional activators, which would be critical for shutting off a promoter after the activation triggers are removed. However, the role of the two ubiquitin-hydrolases is more mysterious. It has recently been suggested by Shelly Berger (2003) that histone H2B ubiquitylation, followed rapidly by deubiquitylation, is required for histone H3 methylation on K4 and K36 resulting in gene

PAGE 30

activation. If this is the case, the presence of both ubiquitylation and de-ubiquitylation enzymes in one complex may be mechanistically linked for gene activation.

Little information is available about the TFIID complex in plants. A crude preparation of TFIID has been purified from wheat germ, which appears to be a stable complex similar to that from metazoans (Washburn et al., 1997). Unfortunately, this purified complex was sparse, not homogeneous, and did not lead to the identification of any subunits.

Some information on subunits of TFIID from plants is beginning to become available. In Arabidopsis, TAF10 was found to interact with a ubiquitin conjugase as tested by yeast two-hybrid screen (S.J. Lawit, P. Michaluk, E. Czarnecka-Verner, W.B. Gurley, unpublished results), suggesting that plant TFIID complexes also include ubiquitylation machinery. Also, Tamada et al. (2003) have shown that the Arabidopsis TAF10 is transcriptionally regulated to a high degree, as it is highly expressed in developing tissues but expressed below detection levels in mature tissues. Along this line, TAF10 is not expressed in non-reproductive tissues following bolting of the inflorescences of Arabidopsis. Such a close tie to development is consistent with biochemical information on human TAF10 that is present in a subset of TFIID complexes and interacts with the estrogen receptor, potentially playing a part in development.

Alternative TBP- or TAF-Containing Complexes

Several protein complexes other than TFIID contain TBP or TAFs, blurring the lines of what can be considered general transcription factors. At least four types of coactivators interact with TBP and display promoter selection properties (for review see Lee and Young, 1998). These are TAFIs, TAFIIs (TAFs), TAFIIIs, and PTF/SNAPc. Other coactivators/corepressors/GTFs such as SAGA, Mot1, NC2, Nots, and TFIIA,

PAGE 31

along with TAFs, play important roles in regulating expression of mRNA (Lee and Young, 1998; Mitsiou and Stunnenberg, 2000). The SAGA complex does not copurify with TBP; however, multiple SAGA subunits do bind TBP individually (Spt3, Ada2, Ada5/Spt20) and may recruit it to a promoter (Eisenmann et al., 1992; Barlev et al., 1995; Roberts and Winston, 1996; Saleh et al., 1997; Sterner et al., 1999). Interestingly, western blots have demonstrated that TBP in yeast is around ten-fold more abundant than TAFs, SAGA, BTAF1 (Mot1), NC2, and Nots (Lee and Young, 1998). This is consistent with the observations that TBP may be a component of a variety of protein complexes.

One alternate TFIID complex, B-TFIID, is found in yeast and humans. Human B-TFIID is capable of nucleating basal transcription, but is unresponsive to activators much like TBP alone (Chang and Jaehning, 1997). B-TFIID functions as a global transcriptional co-repressor and was initially believed to contain several core TAFs (Wade and Jaehning, 1996; Chang and Jaehning, 1997). Later it was established that B-TFIID consisted of TBP and at least one TAF (BTAF1, or Mot1), but not the full complement of TAFs (Auble et al., 1994; Wade and Jaehning, 1996). BTAF1 is an essential protein in yeast and affects only a subset of the organismal transcriptosome in microarray studies, possibly through promoter recruitment (Wade and Jaehning, 1996), or adenosine triphosphate (ATP)-dependent release of the rate limiting TBP from high affinity TATA elements for use at lower affinity promoters (Collart, 1996). This second model is more likely since BTAF1 seems to function through dissociating TBP from DNA in an ATP-dependent manner (Lee and Young, 1998).

Other alternative TAF complexes are hTFTC, hPCAF, ySAGA, human SPT3-TAFII31-GCN5-L acetyltransferase (hSTAGA), ySLIK (yeast SAGA-like), and

PAGE 32

ySALSA (yeast SAGA altered Spt8 absent), etc. These complexes are quite similar in structure and function (Struhl and Moqtaderi, 1998). Significantly, none of these complexes contain either TBP or TAF1; however, each complex contains HAT activity and a subset of TAFs (only four to five of nine histone-fold motif TAFs) (Bell and Tora, 1999). Interestingly, some well-characterized histone-binding partners are replaced in SAGA and other alternative complexes. Examples include hTAF11-hTAF13 partnering being replaced by an intramolecular Spt3 histone-fold pairing, and yTAF12 being paired with Ada51 (Birck et al., 1998; Gangloff et al., 2000).

Outside of TFIID, perhaps the most recognized TAF-containing complex is the yeast SAGA and its human counterpart STAGA (Green, 2000). SAGA is a 1.8 – 2.0 MDa complex containing TAFs, ubiquitin-machinery proteins, Spt, Ada (alteration/deficiency in activation), and Gcn5 subunits (Grant et al., 1998; Lee and Young, 1998; Grant and Berger, 1999; Berger, 2003). The TAFs that copurify with SAGA are TAF5 (WD40 domain), 6 (H4-like), 9 (H3-like), 10 (histone-fold domain), 12 (H2B-like), and 13 (histone-fold domain) (Grant et al., 1998; Grant and Berger, 1999). The presence of ubiquitin-machinery proteins in SAGA, which like TFIID is a TAF-containing complex, further supports a functional relationship between TAFs and ubiquitylation. Yeast Gcn5 (a HAT) is additionally found in other large protein complexes containing Ada proteins (Grant and Berger, 1999).

Yeast SAGA is somewhat redundant with other complexes including TFIID, SWI/SNF (Switch/Sucrose non-fermenting, a chromatin remodeling complex), and suppressor of RNA polymerase B (SRB)/Mediators (Grant and Berger, 1999). A temperature sensitive mutation of yTAF9, a member of both TFIID and SAGA, tested by

PAGE 33

a microarray experiment demonstrated that 67% of PolII genes require yTAF9 (Apone et al., 1998; Holstege et al., 1998). A TS mutation in yTAF10 (also a member of the same complexes) showed that it was also required for bulk mRNA expression (Sanders et al., 1999). Similarly, a TS mutation in yTAF11 (a member of TFIID only) also decreased PolII transcription to background levels when tested under the nonpermissive temperature (Komarnitsky et al., 1999). However, a similarly tested Gcn5 mutation showed that Gcn5 was required by only 5% of PolII-transcribed genes (Holstege et al., 1998). An even more dramatic demonstration of the functional redundancy of SAGA is the fact that a deletion of Spt20 causing complete loss of SAGA does not create an apparent deficiency in transcription as monitored by total mRNA levels (Berk, 1999). Interestingly, mutants of Ada1 and Spt20 individually had more dramatic phenotypes than mutants with inactivated or eliminated Gcn5, suggesting that HAT activity may not be the major role of SAGA (Struhl and Moqtaderi, 1998). While SAGA function may not be requisite for viability, the association of TAFs and other transcription regulators may convey the potential for SAGA to regulate at many different promoters (Grant et al., 1998).

Recent work by Pugh and co-workers (Huisinga and Pugh, 2003; Pugh, 2003) has further elaborated the role of SAGA (and the related SLIK and SALSA complexes). Microarray analysis of Gcn5 deletion mutants demonstrated that 10% of the genome was activated by these Gcn5 HAT complexes, whereas a strict TAF1 TS mutant demonstrated a non-overlapping 90% dependence on TFIID (Huisinga and Pugh, 2003). Interestingly, 46% of the Gcn5-dependent genes are stress inducible, thus nearly 1/3 of all stress-inducible genes in yeast are dependent upon SAGA for activation (Huisinga and Pugh, 2003). Pugh’s group has also discovered that the 90% of the promoters regulated by

PAGE 34

TFIID are TATA-less, while those that are SAGA-dependent contain TATA-boxes (Pugh, 2003). This is explained most easily by TFIID being able to correctly position TBP at TATA-less promoters by having a firmly incorporated TBP and being able to read other contextual cues in a promoter (i.e., the DPE and Initiator element, if present). On the other hand, SAGA may not be able to read such core promoter elements since it lacks several TAFs involved in promoter recognition. This suggests that SAGA may be able to recruit TBP, but not position it correctly.

In recent years evidence has arisen that suggests a SAGA-like complex exists in Arabidopsis. The presence of a SAGA-like complex suggests that AtTAFs may interact with alternative complexes like TAFs in other organisms (Stockinger et al., 2001; Vlachonasios et al., 2003). Vlachonasios et al. (2003), in a series of microarray experiments, demonstrated that Arabidopsis Ada52b and Gcn5 regulate 5% of the genes represented in the 8,200 gene Affymetrix array. This result is strikingly similar to the findings for yeast Gcn5, suggesting a very similar extent of regulation by a potential SAGA-like Arabidopsis complex.

The human PCAF complex is structurally related to SAGA, containing many homologous subunits. In addition to several hAda subunits, hTAFs 9, 10, and 12 are also found associated with PCAF, which has approximately 20 subunits in all (Ogryzko et al., 1998). The histone H4-like hTAF6 is missing from the PCAF complex, but is apparently replaced in the histone-octamer-like structure by an ortholog with 42% similarity (PCAF associated factor – TAF6L) (Ogryzko et al., 1998). There is also an hTAF5 ortholog with WD40 repeats, TAF5L (Ogryzko et al., 1998).

PAGE 35

Drosophila TBP-related factor 1 (TRF1) is expressed only in neuronal tissues and is a component of yet another protein complex with similarities to TFIID. However, this complex contains no bona fide TAFs (Hansen et al., 1997). TRF (TRF2 or TATA binding protein-like, TLF) homologs are found in humans, Drosophila, and C. elegans with a unique subset of TRF associated factors (Chang and Jaehning, 1997; Albright and Tjian, 2000). An interesting finding from the laboratory of Robert Roeder (Xiao et al., 1999) places a human TRF proximal (hTRFP) in the mediator complex. The function of the various TRFs appears to be, in general, to mediate transcriptional responses (potentially not mediated by TFIID) by a variety of activators.

TBP-free TAF-containing complex (TFTC) is a human protein complex with similarities to TFIID. TFTC lacks TBP and TAF1, but contains most other TAFs (Grant and Berger, 1999). In addition to TAFs, TFTC includes hAda53, hSPT3, and hGcn5L that provides a HAT activity in place of TAF1 (Grant and Berger, 1999). TFTC can functionally substitute for TFIID at both TATA-containing and TATA-less promoters by nucleating PIC assembly (Wieczorek et al., 1998; Bell and Tora, 1999; Grant and Berger, 1999).

The existence of multiple TAF-containing protein complexes with HAT activity emphasizes that chromatin remodeling is essential for transcriptional activation of many promoters. In addition, this multiplicity of TAF/HAT complexes suggests a functional redundancy in activator complexes. These arguments imply that TBP and TAF recruitment to promoters is complex and the role of specific TAF-containing complexes is not well understood, even in the well-studied metazoans (Lee and Young, 1998).

PAGE 36

TAFs: Required Factors or Optional Accessories

Several studies indicate that TAFs may have redundant or even optional roles in transcription of PolII-dependent genes. For instance, several well-studied, strong activators such as VP16 and Gal4 have redundant activation motifs that interact separately with TFIID and/or holoenzyme (Chang and Jaehning, 1997). This redundant interaction suggests that strong activators may be capable of activating transcription in the absence of TFIID, by contacting holoenzyme through other GTFs such as SRBs, TFIIA, and TFIIB (Burley and Roeder, 1996). Furthermore, in human embryonal carcinoma cells, a novel TFIIA-TBP complex has been identified that is capable of activating transcription but completely lacks TAFs (Mitsiou and Stunnenberg, 2000). In addition, several novel coactivator complexes in mammalian systems such as vitamin D3 receptor interacting proteins/activator-recruited factor (DRIP/ARC), thyroid-hormone associated protein complex (TRAP), and cofactor required for Sp1 activation (CRSP) completely lack TBP and TAFs (Fondell et al., 1996; Naar et al., 1999; Rachez et al., 1999; Ryu et al., 1999). Taken together, TFIID (and TAFs) may be optional accessories to transcription. Alternatively, protein complexes other than TFIID and SAGA (containing TAF subunits or other coactivators) such as TFTC and TRAP can compensate for the lack of TFIID and SAGA under some conditions (Albright and Tjian, 2000).

Early studies employing TAF TS mutations resulting in down-regulated expression and targeted degradation of TAFs in yeast did not demonstrate promoter dependency on these coactivators. In some studies, PolII holoenzyme alone (no TAFs present) supported activated transcription in vitro (Koleske and Young, 1994; Berk, 1999). However, in apparent contradiction to earlier results, experiments utilizing TS mutations demonstrated

PAGE 37

significant promoter-dependency on TAFs. Michel et al. (1998) hypothesized that this was due to the use of tighter TS mutations than in previous studies, and that only TS mutations causing rapid cessation of growth upon temperature shift were appropriate for such studies (Berk, 1999). In fact, with “tight” TS mutants, a significant loss of transcription was observed only 30 min after the shift to the nonpermissive temperature (Michel et al., 1998). This temperature shift also caused rapid degradation of not only the mutated protein but also the other proteins of the TFIID and SAGA complexes (Michel et al., 1998). This result suggested that PolII cannot transcribe without a functional TFIID complex (Michel et al., 1998). Interestingly, temperature shift also resulted in a degradation of western blot-detectable TAFs and two thirds of TBP (Michel et al., 1998; Berk, 1999). This suggests that two thirds of TBP is associated with TFIID, while the other one third is either free or bound by TAFIs or TAFIIIs (Berk, 1999).

A recent study utilizing an inducible depletion strategy for chicken TAF9 (histone H3-like TAF) demonstrated a high level of PolII transcription without detectable levels of chicken TAF9 (Chen and Manley, 2000). This elegant experimental system measured transcription directly through pulse-labeling and included steady state measurements of several gene transcripts. This apparent inconsistency with yTAF TS experiments was mainly explained by an increased functional redundancy in mammalian transcriptional machinery (Chen and Manley, 2000). Such alternative mammalian complexes as PCAF and TRAP were proposed to play much larger roles than SAGA in yeast (Chen and Manley, 2000).

Other studies lead to the conclusion that different promoters have distinct dependencies on TAFs for their expression. In yeast, TAF-independent (TAFind) and
TAF-dependent (TAFdep) promoters were identified using the TAF TS mutants. After temperature shift of these mutants, transcript profiling was used to detect genes that were transcriptionally dependent on or independent of the various TAFs (Li et al., 2000). It was established with the use of formaldehyde DNA-crosslinking chromatin immunoprecipitations (ChIPs) that TAFdep promoters recruited TAFs and TBP at similar levels (Li et al., 2000). However, ChIP of TAFind promoters indicated that TAFs were only present at background levels (Li et al., 2000). These TAFind promoters still recruited TBP, apparently sans TAFs, as TBP bound all promoters at levels that correlated well with transcript accumulation (Li et al., 2000).

Interestingly, when yeast TBP was inactivated in a temperature shift experiment, TAFs continued to be recruited to TAFdep promoters (Li et al., 2000). In general, TAFdep promoters recruit TAFs in an activator-dependent fashion, independent of other GTFs (Li et al., 2000). For instance, the binding of TBP with TAFs to a TAFdep promoter (the yeast RPS5 promoter) is lost after removal of the activator binding sites, but inactivation of TFIIB or SRB4 had minimal effect on binding of TBP to the TAFdep (ACT1 and RPS5) promoters (Li et al., 2000). This is compelling evidence for a model in which yeast TAFs are directly targeted for recruitment by transcriptional activators and, in parallel, pull TBP to the promoter, nucleating PIC assembly (Li et al., 2000). On the other hand, inactivation of TFIIB or SRB4 substantially reduced binding of TBP to a TAFind (ADH1) promoter (Le Gourrierec et al., 1999). This evidence seems to indicate that holoenzyme recruits TBP (sans TAFs) to TAFind promoters, leading to stabilization of each other’s interaction with the promoter.

It is clear that most TAFs are essential to yeast survival (Green, 2000). Work with temperature-sensitive mutants of the histone-like yTAFs demonstrated that they were essential to PolII transcription as well (Michel et al., 1998). Michel et al. (1998) made the argument that loss of SAGA did not cause a large drop-off in transcription because most of its components were not required for yeast viability or PolII transcription. Therefore, it seems that TFIID is required for viability, but not necessarily for correct transcription of every PolII-dependent gene. From the evidence accumulated to date, the model of TFIID serving as the major PolII coactivator seems secure in most organisms. However, there is clearly significant redundancy of coactivator complexes in yeast and metazoans. With so little known about similar coactivators in plants, it is futile at this point to postulate how their transcription is regulated.

Interplay of GTFs

Assembly of the PIC involves many GTFs and nearly an order of magnitude greater number of individual proteins. Therefore, implicit in formation of this complex are a large number of binary protein-protein interactions involving intra-GTF and inter-GTF binding partners. The interactions between TFIIA, TFIIB, TFIID, TFIIE, and TFIIF are summarized in Table 1-3 and in Figures 1-2, 1-3, and 1-4 for humans, Drosophila, and yeast, respectively. Just a few examples of the many known TAF-GTF interactions are dTAF9/hTAF9 with TFIIB; yTAF14 with TFIIF; hTAF6 with TFIIF and TFIIE; and dTAF4/hTAF4 with TFIIA (Tjian and Maniatis, 1994; Burley and Roeder, 1996).

While TFIID is generally responsible for nucleation of the PIC, the other GTFs play major roles as well. Regardless of which model is assumed (ordered multi-step assembly or holoenzyme), TFIIA has a somewhat controversial presence. TFIIA is composed of either two subunits (L and S) in yeast and Arabidopsis or three subunits (α, β, and γ) in metazoans. TFIIAα and β are derived from post-translational cleavage of a protein homologous to the larger (L) subunit in fungi and plants (Li et al., 1999). TFIIA is able to integrate into the PIC at any step of assembly and is even capable of binding TBP in the absence of DNA (Orphanides et al., 1996). However, when TFIIA does bind TBP at a promoter, it is able to interact with both the N-terminal stirrup of TBP and DNA upstream of the TATA element, increasing the stability of the DNA-protein complex (Langelier et al., 2001). Besides this function of TFIIA, it is suggested that TFIIA is involved in TBP anti-repression because it is able to remove repressors like Mot1. Only TFIIAβ and γ are required for anti-repression, but not α. However, all three subunits are required for activation mediated by trans-activators that recruit TFIIA. In the work of Langelier et al. (2001), TFIIA stimulated basal transcriptional activity only in the presence of TFIIE and TFIIF, suggesting that TFIIA may somehow be involved in enhancing the activities of TFIIE and TFIIF.

The structures of TFIIB as well as TFIIA in association with the TBP-TATA complex have been determined by x-ray crystallography (Nikolov et al., 1995; Geiger et al., 1996). The binding of TFIIB to the C-terminal stirrup of TBP at a promoter is a required step in PIC formation (regardless of which model is followed). Like the binding of TFIIA to TBP-TATA, the TFIIB-TBP-TATA complex is stabilized by TFIIB interactions with both TBP and DNA, in this case both upstream and downstream of TATA (due to the 80° bend in the TATA-element) (Malik et al., 1993; Lee and Hahn, 1995; Tang et al., 1996). The element directly upstream of TATA that is contacted by TFIIB is termed the IIB recognition element (BRE) (Lagrange et al., 1998). The BRE is contacted in a sequence-specific manner by a helix-turn-helix DNA binding domain of TFIIB
(Lagrange et al., 1998). The BRE has a consensus sequence of 5'-G/C-G/C-G/A-C-G-C-C-3' and represents the fourth known core promoter element in addition to the TATA-element, the DPE, and the Initiator (Lagrange et al., 1998). Along with creating a more stable TBP-TATA interaction, TFIIB makes contact with a TAF, TFIIF, and PolII (Ha et al., 1993; Fang and Burton, 1996). In fact, some mutations of TFIIB have a great effect on transcription start sites, suggesting that TFIIB plays a significant role in positioning of PolII in the PIC (Orphanides et al., 1996).

Choi et al. (2003) recently demonstrated that human TFIIB has the capacity to autoacetylate on K238. This autoacetylation was competitively inhibited by coenzyme A and was reversible under conditions of high coenzyme A concentration, indicating that this is a catalytic process (Choi et al., 2003). Interestingly, yeast TFIIB had the same autoacetylation capacity, and the TFIIB affinity for TFIIF was then significantly increased (glutathione S-transferase pulldown efficiency increased from 15% to 90%) (Choi et al., 2003). This affinity increase suggests that TFIIB acetylation is a key mechanism for recruitment of TFIIF and PolII to a promoter.

Similarly to TBP and TFIIA, TFIIB has been studied to a limited degree in plants (Baldwin and Gurley, 1996; Pan et al., 2000). Pan et al. (2000) demonstrated that TBP-TFIIB interactions were dispensable for basal transcription and activated transcription at strong complex promoters (CaMV 35S). However, the TBP-TFIIB interaction was required for activated transcription at simplified artificial promoters (Pan et al., 2000). These results can be most simply interpreted to mean that during basal and activated transcription from complex promoters, other factors besides TBP and TFIIB (perhaps TAFs and transcriptional activators, respectively) are able to recruit PolII to the promoter.
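
The BRE consensus quoted above (5'-G/C-G/C-G/A-C-G-C-C-3') can be written as a regular expression, which offers one simple way to scan a promoter sequence for candidate sites. The Python sketch below is illustrative only: the function name `find_bre_sites` and the example fragment are hypothetical and are not taken from the studies cited, only the top strand is searched, and a consensus match by itself does not show that TFIIB contacts the site.

```python
import re

# BRE consensus from the text: 5'-G/C-G/C-G/A-C-G-C-C-3'
# (each "X/Y" position accepts either base). The lookahead lets
# overlapping candidate sites be reported as well.
BRE_PATTERN = re.compile(r"(?=([GC][GC][GA]CGCC))")


def find_bre_sites(sequence):
    """Return (0-based start, matched 7-mer) for each consensus match on the given strand."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in BRE_PATTERN.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical promoter fragment: a BRE-like 7-mer directly upstream of a TATA box.
    fragment = "ttacGGGCGCCTATAAAAggc"
    for start, site in find_bre_sites(fragment):
        print(start, site)
```

A fuller analysis would also scan the reverse complement and check the position of each match relative to the TATA-element, since the BRE is defined as lying directly upstream of it.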

However, at the simple, artificial promoters the rate-limiting step is no longer TBP recruitment (due to recruitment by the transactivator) but recruitment of PolII. Since the artificial promoters are TATA-containing, they may act in a TAFind manner, and thus TAF-holoenzyme interactions may not fully complement the lack of TBP-TFIIB interaction.

In separate work, Pan et al. (1999) showed that a 14-3-3 protein binds TBP and TFIIB independently of known 14-3-3 protein-binding motifs. It was also demonstrated that this 14-3-3 protein was capable of stimulating transcription in vivo, suggesting that 14-3-3s might act as co-activators, thus creating another layer of complexity in transcriptional regulation (Pan et al., 1999).

An exciting story that is beginning to unfold is that of plant-specific TFIIB-related protein (pBrp), or AtTFIIB5 (chapters 2 and 4). This protein was shown to interact in vitro to form a TBP-TFIIB5-TATA complex (Lagrange et al., 2003). Using enhanced yellow fluorescent protein-tagging, chloroplast-fractionation and proteolytic-cleavage experiments analyzed by western blots, as well as plastid agglutination experiments, Lagrange et al. (2003) have shown that AtTFIIB5 was normally localized to the cytosolic face of the plastid envelope. This localization to the chloroplast is the first observation of any GTF stably located outside of the nucleus. AtTFIIB5 was also observed to contain a P/E/D/S/T-rich domain that appears to play a role in targeting this protein for degradation by the proteasome (Lagrange et al., 2003). Upon pharmacological disruption of proteasome function and in COP9 signalosome mutants, AtTFIIB5 was observed to localize to the nucleus. Lagrange et al. (2003) suggest a model in which an unknown plastid-derived signal leads to release of the TFIIB5 protein from the outer envelope and movement into the nucleus. In the
nucleus, TFIIB5 would then induce transcription of genes appropriate for response to the original signal. In this model, proteasome-mediated degradation provides a rapid turnover of the nuclear-localized TFIIB5 protein, leading to tighter control of the response to the signal. However, this model lacks an explanation as to why the TFIIB5 protein appeared to be released from the chloroplast under proteasome/COP9 signalosome dysfunction. I propose a model in which the COP9 signalosome leads to degradation of a TFIIB5/pBrp co-factor that allows the protein to localize to the nucleus. The co-factor could be either a chaperone that escorts TFIIB5/pBrp to the nucleus or a chloroplast-docking antagonist. Alternatively, the proteasome/COP9 signalosome may be activating either a plastid docking factor or a chloroplast targeting signal in TFIIB5/pBrp by partial proteolysis (Gille et al., 2003). In either of these cases, TFIIB5/pBrp transport to the nucleus and induction of transcription is likely to be the culmination of this signal transduction pathway. Whatever their role, the TFIIB5/pBrp subfamily of TFIIB-like factors is sure to play a novel role in transcriptional regulation.

Part of the role of TFIIB is to recruit TFIIF into the PIC. TFIIF is a heterotetrameric complex of two TFIIFα (RNA polymerase-associated protein 74 kDa, RAP74) and two TFIIFβ (RAP30) molecules (Flores et al., 1990) that is tightly associated with PolII. However, in yeast a third factor interacts as part of TFIIF, the yeast TAF14 (also a member of the TFIID and SWI/SNF complexes) (Henry et al., 1994; Cairns et al., 1996). Interestingly, although both human TFIIFα and TFIIFβ bind to TFIIB individually, it has been shown that TFIIFα blocks the binding of TFIIFβ to TFIIB by simultaneously binding to both proteins in the regions required for their respective
interaction (Fang and Burton, 1996). TFIIF is required for recruitment of PolII to the TATA-TBP-TFIIA-TFIIB complex, and is found tightly associated with PolII. Indeed, TFIIF is credited with stimulating the rate of transcriptional elongation (which implies that it is an elongation factor in addition to its function as an initiation factor) (Flores et al., 1989).

Besides elongation stimulation, TFIIF (specifically the β-subunit) also inhibits and reverses PolII binding to non-promoter DNA regions, making this interaction promoter-specific, similarly to the bacterial σ-factor (Killeen and Greenblatt, 1992). TFIIFβ has sequence similarity with bacterial σ factors and is able to bind E. coli RNA polymerase in the same region as σ70 (McCracken and Greenblatt, 1991). Interestingly, a dimer of TFIIFβ alone is able to recruit PolII to promoters and support proper initiation of transcription (Flores et al., 1991).

Three functional domains compose TFIIFβ: the TFIIFα-binding N-terminus (Fang and Burton, 1996); the polymerase-binding middle domain (Killeen and Greenblatt, 1992); and the C-terminal winged-helix domain (Groft et al., 1998). Similarly, TFIIFα can be functionally divided into three domains: the N-terminal TFIIFβ-binding domain; the highly charged middle region; and the C-terminal winged helix domain (Kamada et al., 2001). TFIIFα seems to largely play a role in stimulating transcriptional elongation and aids TFIIFβ in removing PolII from non-specific DNA interactions. One interesting observation is the presence of a serine/threonine kinase activity in TFIIFα that is involved in transcriptional elongation (Rossignol et al., 1999). Rossignol and co-workers (1999) were unable to find an identifiable ATPase domain in TFIIFα that must be present for kinase activity; however, a weak similarity with AAA ATPase VWA-containing
proteins is clearly identifiable in A. thaliana TFIIF (as mentioned in Chapter 2). Two other interesting findings are the phosphorylation of TFIIF by the TAF1 kinase domain and by a protein kinase in TFIIH (Dikstein et al., 1996a; O'Brien and Tjian, 1998; Rossignol et al., 1999). Although the significance of these activities utilizing TFIIF as a substrate is still unknown, some data suggest that both TFIIF initiation and elongation activities are stimulated by this phosphorylation, similarly to the stimulation of TFIIA activities by phosphorylation (Kitajima et al., 1994).

After TFIIF and PolII, TFIIE is the next GTF to enter the assembling PIC, possibly with PolII. TFIIE, like TFIIF, is a heterotetramer composed of two different proteins (α and β) (Ohkuma et al., 1990; Inostroza et al., 1991). It was found that one TFIIE subunit without the other has the capacity to mediate basal transcription when added to the other required factors (Ohkuma et al., 1990; Inostroza et al., 1991); however, in the recombinant form both subunits were required (Peterson et al., 1991). TFIIEα contains a zinc-finger domain that is critical for its stable incorporation into the PIC (Maxon and Tjian, 1994), a leucine repeat, and a helix-turn-helix domain, as well as sequence similarity to E. coli σ factor region 2.1 (Ohkuma et al., 1991). TFIIE also contains a clearly identifiable catalytic loop domain found in many protein kinases (Peterson et al., 1991), although TFIIE has no known ATPase activity. The protein-protein interaction between TFIIEα and TFIIEβ may be mediated by leucine repeats, since TFIIEβ also contains this recognizable motif (Sumimoto et al., 1991). TFIIEβ, like TFIIEα, has some sequence similarity to σ-factors, but in this case it is with region 3, which is implicated in promoter recognition (Sumimoto et al., 1991). TFIIEβ also has similarity with TFIIFβ in a region
similar to a σ-factor domain that binds to core RNA polymerase, consistent with the interactions of both TFIIE and TFIIF with PolII (Sumimoto et al., 1991).

TFIIE contacts ssDNA through a C-terminal winged helix domain that may play a role in stabilization of the open promoter and assist DNA melting (Okamoto et al., 1998; Okuda et al., 2000). This ssDNA-binding domain is novel in that it binds DNA on the face opposite to where winged helix domains typically interact with DNA, in a positively charged channel (Okuda et al., 2000). Since TFIIE interacts with PolII and GTFs (TFIIB and TFIIF), Okuda et al. (2000) suggested a model in which these properties, in addition to the ssDNA binding, lead to a stabilization of the PIC where the promoter starts to open.

Interestingly, like TBP and TFIIB, TFIIE appears to have ancient roots. Homologs of TFIIE (TFE) have been identified in Archaea. TFE was not required for transcription, but was stimulatory to transcription under conditions of limiting TBP (Bell et al., 2001). This suggests a conserved function for TFE/TFIIE in stabilization of the PIC.

Once TFIIE is incorporated into the PIC, it recruits TFIIH, a multi-subunit GTF with two ATP-dependent helicases and a protein kinase (Orphanides et al., 1996). One of the TFIIH helicases appears to be required for transcriptional initiation, but TFIIE, TFIIH, and ATP are all dispensable on templates that are highly negatively supercoiled (Parvin and Sharp, 1993). It is believed that this negative supercoiling greatly lowers the energetic requirement for strand separation and precludes the need for helicase activity. Interestingly, both TFIIB and TFIIE contain zinc-ribbon motifs, and both TFIIB and TFIIE bind DNA between the TATA-element and the start site of transcription (Robert et al., 1996). It has been speculated that these DNA binding motifs may play a role in stabilizing the melted region of the promoter and as such supplement one of the main
functions of TFIIH in transcription; however, more recent evidence suggests this is a function of the TFIIE ssDNA binding domain (Okamoto et al., 1998).

Another main function of TFIIH (besides transcription-coupled nucleotide excision repair, which will not be discussed here) is to stimulate elongation by hyperphosphorylation of the C-terminal domain (CTD) of the largest subunit of PolII. This evolutionarily conserved CTD is composed of many tandem repeats of a heptapeptide (YSPTSPS) of which five residues are potential recipients of phosphate moieties. The cdk7 subunit, a cyclin-dependent kinase, of human TFIIH has been shown to be the subunit responsible for hyperphosphorylation of the CTD, an activity that potentially leads to PolII promoter escape (Orphanides et al., 1996). One additional function of both subunits of TFIIE is to stimulate this CTD kinase activity of TFIIH (Okamoto et al., 1998).

PolII is a 12-subunit complex of approximately 500 kDa (Dvir et al., 2001). The core of PolII is composed of two large subunits, RNA polymerase B protein 1 (Rpb1) and Rpb2 (Dvir et al., 2001). The ten remaining subunits (Rpb3-12) coat the surface of these two proteins in single copies (Dvir et al., 2001). PolII is capable of unwinding double-stranded DNA (dsDNA), adding ribonucleotides to RNA transcripts, and proofreading the nascent transcript (Cramer et al., 2000). The structure of PolII shows a deep cleft between Rpb1 and Rpb2 through which ~20 bp of dsDNA is held as it enters the active site (Cramer et al., 2000). This entering dsDNA is gripped by a “pair of jaws” formed by a portion of Rpb1 with Rpb5 on one jaw and Rpb9 on the other jaw (Cramer et al., 2000). A “sliding clamp” composed of the C-terminal region of Rpb2 and the N-terminal parts of Rpb6 and Rpb1 greatly stabilizes the interaction with downstream DNA, leading to the processivity
of the enzyme (Cramer et al., 2000). Cramer et al. (2000) propose that a groove leading away from the active site behind the hinge of the downstream-DNA sliding clamp binds the emerging RNA and that this acts as a lock on the clamp, increasing processivity. Underneath the base of the cleft is an inverted funnel leading to two pores that give access to the DNA-RNA hybrid and are near the active site (Cramer et al., 2000). These pores may provide access for elongation factors and nucleotides, and an exit for the 3’ end of the mRNA during backtracking (Cramer et al., 2000). PolII is clearly a complex assembly of protein domains with a multitude of functions (many of which are elucidated by the structure). Unfortunately, further details must be left for other manuscripts.

The CTD of PolII is not a naked protein-tail structure as once thought; instead, it is covered by a large Mediator complex of co-activators (Orphanides et al., 1996). While the Mediator complex is not technically a GTF (because it was not identified as a factor required for basal transcription in vitro), it does merit some discussion here. Mediator is composed of approximately 60 proteins and is ~3.5 MDa (reviewed in Myers and Kornberg, 2000; Rachez and Freedman, 2001). Many Mediator genes were first identified as suppressors of CTD truncation mutants of RNA polymerase B (SRB) in yeast. Many of these so-called SRB proteins have little (if any) recognizable sequence similarity with other proteins, except SRB10 and SRB11, which are also known as cdk8 and cyclin C, respectively.

Besides the GTFs, RNA PolII, and Mediator needed for regulation of transcription, there are many co-activator and co-repressor complexes that are required in the cell, but detailed discussions are beyond the scope of this document. Virtually all of them in some way modulate the access of GTFs to their promoters. Some of these include the TAF-containing HAT complexes (TFTC; SAGA; STAGA; PCAF; nucleosome acetyltransferase of histone H4, NuA4; etc.), histone deacetylase complexes (e.g., SWI independent 3 complex, SIN3; nucleosome remodeling HD complex, NuRD; regulator of nucleolar silencing and telophase exit complex, RENT) (reviewed in Lawit and Czarnecka-Verner, 2002), ATP-dependent chromatin remodeling complexes (SWI/SNF, BRG1-associated factor (BAF), and related factors), and DNA and histone methyltransferase complexes (e.g., Complex Proteins Associated with Set1, COMPASS; Enhancer of Zeste) (Miller et al., 2001; Czermin et al., 2002), just to name the best-studied classes. However, all of these protein complexes require some type of contextual cues to find their substrates. At some point, nearly all of these cues originate with sequence-specific DNA-binding transcriptional regulators (either activators or repressors).

Transcriptional Activators That Bind DNA

The GTFs alone are capable of conveying basal levels of transcription from core promoters (for a review of core promoters see Smale and Kadonaga, 2003). However, for activated transcription several additional layers of control are often necessary. The first of these are transcriptional trans-activators; the second are cis-acting DNA elements in promoters to which transcriptional activators (and repressors) can bind. In general, DNA binding transcriptional activators are composed minimally of two functionally separable domains: a DNA binding domain and a transcriptional activation domain (Ptashne, 1988).

Differential expression of the 27,000 genes of Arabidopsis implies involvement of many different transcriptional regulators that are capable of differential combinations of protein-protein and protein-DNA interactions. The team of scientists that analyzed the Arabidopsis genomic information found 1,709 putative proteins with similarity to known
classes of DNA-binding domain-containing transcription factors (The Arabidopsis Genome Initiative, 2000). This large, and potentially underestimated, set of transcriptional regulators is certainly involved in a multitude of protein-protein interactions with each other (homo- and heterodimers and trimers) and with co-activator and/or co-repressor complexes.

Many transcriptional regulators have been studied in various organisms, including plants. Interestingly, only 8-23% of the genes encoding proteins containing DNA-binding domains identified in Arabidopsis are similar to genes in other non-plant eukaryotic genomes (The Arabidopsis Genome Initiative, 2000). Unlike transcription-related genes, 48-60% of Arabidopsis protein synthesis genes have homologs in the other eukaryotic genomes (The Arabidopsis Genome Initiative, 2000). This great disparity reflects an independent evolution of plant transcription factors in general. More than half of the transcription factor families in Arabidopsis (16 of 29) appear to be specific to plants. The Apetala 2/Ethylene response element binding protein-related to ABI3/VP1 (AP2/EREBP-RAV), no apical meristem/cup shaped cotyledon 2 (NAC), and auxin response factor (ARF-AUX/IAA) families have DNA-binding domains not found outside of the plant kingdom. DOF zinc-finger, WRKY zinc-finger, and the two-repeat MYB families contain plant-specific variants of more widespread domains. Some large families of transcription factors (R2R3-repeat MYB, WRKY families, etc.) have expanded in plants, with approximately 100 members in some groups. Other classes of DNA-binding proteins are completely missing in plants, such as the Rel-like DNA-binding domain, nuclear steroid receptors, forkhead-winged helix, and POU
(Pit-1, Oct and Unc-86) domain protein families (The Arabidopsis Genome Initiative, 2000).

The functions of the individual transcription-factor family members can be regulated by expression characteristics. Another way to add a layer of control to a transcriptional activator is for it to target a co-activator or GTF that is only expressed at certain temporal or spatial coordinates. With the plethora of different GTFs in Arabidopsis, it seems likely that differential expression of GTFs targeted by different transcription factors is a key mechanism of control in plants. Therefore, to truly begin to understand transcriptional regulation of any gene in plants we must attempt to understand the regulatory circuitry of the transcription factors, the GTFs, and the coactivator/corepressor complexes, as well as all of their protein-protein interactions.
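
The statement earlier in this chapter that five residues of the PolII CTD heptapeptide (YSPTSPS) are potential recipients of phosphate moieties can be checked with a short count. The Python sketch below assumes that "potential recipient" means a residue with a side-chain hydroxyl group (serine, threonine, or tyrosine); the function names and the repeat number used in the example are hypothetical and are not taken from the cited work.

```python
# Heptapeptide repeat of the PolII CTD, as given in the text.
CTD_HEPTAD = "YSPTSPS"

# Assumption: residues with a side-chain hydroxyl (Ser, Thr, Tyr) can accept phosphate.
PHOSPHO_ACCEPTORS = frozenset("STY")


def phospho_acceptor_positions(heptad=CTD_HEPTAD):
    """Return (1-based position, residue) pairs that could be phosphorylated."""
    return [(i, aa) for i, aa in enumerate(heptad, start=1) if aa in PHOSPHO_ACCEPTORS]


def potential_sites(n_repeats, heptad=CTD_HEPTAD):
    """Upper bound on acceptor sites in n_repeats perfect tandem copies of the heptad."""
    return n_repeats * len(phospho_acceptor_positions(heptad))


if __name__ == "__main__":
    print(phospho_acceptor_positions())
    # n_repeats=10 is an arbitrary illustrative value, not a measured repeat count.
    print(potential_sites(10))
```

Because real CTDs contain degenerate repeats, the product computed by `potential_sites` should be read as an upper bound for idealized repeats rather than as a count for any particular organism.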

Table 1-1. TATA binding protein-associated factors of the TFIID complex. The TAFs are displayed as identified in (from left to right) Arabidopsis thaliana, Saccharomyces cerevisiae, Drosophila melanogaster, and Homo sapiens. Note: The names displayed in red are from the unified nomenclature designated by Laszlo Tora (Tora, 2002). Other names are based on accession numbers, mutant designations, or molecular weight (either observed or predicted). Histone-fold domain, HFD; bromodomains, BD; protein kinase, PK; transcriptional activators, TAs; acidic activators, AAs; Broad-complex, Tramtrack, and Bric-à-brac, BTB.

BDs, PK, HAT, TAND, Ub ligase 145(130) (1) 250 (1) 225 (1) /205 (1b) 135 (4) /105 (4b) 230(250) (1) 150 ( TSM1 ) (2) 150 ( 2 ) 110 (4) 90 (5) 100(95) (5) 70(80) (6) 80(85) (5) 60 (6) 62(60) (6) 59 (6) /55 (6b) 31(32) (9) 17(20) (9) 42(40) (9) 21 (9) 20 ( 15 ) ( 12 ) 61 ( 68 ) ( 12 ) 30 ( 22 ) ( 12 ) 58 (12) /75 (12b) 30 (ANC-1) (14) contacts initiator H2A-like HFD, contacts Q-rich TAs, and IIA WD-40 Repeats, contacts IIF H3-like HFD, contacts acidic TAs, IIB, & DPE H2B-like HFD SWI/SNF, TFIIF, TFIID member 150 ( CIF150 ) (2) 68 (15) 47 (3) = required in yeast 172 ( BTAF1 ) 228 ( BTAF1 ) helicase similarity, TBP negative regulator Mot1 (BTAF1) Not in TFIID complex part of B-TFIID 142 ( 2 ) 48 (4) Blue = Present in ySAGA, or hTFTC; = Present in hPCAF Complex substoichiometric, contacts RNA and ssDNA 78 (5) 155 (BIP1) (3) 140 (3) H2A-like HFD, contacts BTB domain proteins 55 ( 7 ) contacts multiple TAs, and Bdf1 67 ( 7 ) AAF54162 (7) 65 ( 8 ) Prodos (8) 30 (10) 23(25) (10) 15 (10) H3-like HFD 24 (10) /16 (10b) 28 (11) 40 (11) 30 ( 11 ) 24 (11) /19 (11b) H3-like HFD 18 (13) 19 (FUN81) (13) H4-like HFD 43 (8) AAF53875 (13) 14 (13) 76 (4) /69 (4b) 22 ( 7 ) 40 ( 8 ) 23 (14) /30 (14b) 41 (15) /39 (15b) Yeast Human Arabidopsis Drosophila H4-like HFD, contacts IIE IIF AAs, & DPE H3-like HFD

Figure 1-1. The “two-step handoff” model of removal of auto-inhibition of TFIID by the TAF1 N-terminal domains TAND1 and TAND2 (T1 and T2, respectively). TAFs are labeled and shown in light blue, the acidic activator is shown in red, and TFIIA is shown in light green.

Table 1-2. Protein-protein interactions of TFIID in Homo sapiens, Drosophila melanogaster, and Saccharomyces cerevisiae with corresponding references.
Interaction    References (Human, Drosophila, Yeast)
TBP-TAF1 (Ruppert et al., 1993; Xenarios et al., 2002) (Weinzierl et al., 1993a) (Reese et al., 1994; Kokubo et al., 1998)
TBP-TAF2 (Verrijzer et al., 1994) (Verrijzer et al., 1994)
TBP-TAF5 (Dubrovskaya et al., 1996; Tao et al., 1997) (Kokubo et al., 1993c)
TBP-TAF6 (Weinzierl et al., 1993b; Hisatake et al., 1995) (Weinzierl et al., 1993b; Kokubo et al., 1994)
TBP-TAF7 (Yatherajam et al., 2003)
TBP-TAF9 (Kokubo et al., 1994)
TBP-TAF10 (Jacq et al., 1994) (Klebanow et al., 1996)
TBP-TAF11 (Xenarios et al., 2002) (Kraemer et al., 2001)
TBP-TAF12 (Mengus et al., 1995) (Yokomori et al., 1993b; Kokubo et al., 1994) (Reese et al., 2000)
TBP-TAF13 (Mengus et al., 1995)
TAF1-TAF2 (Verrijzer et al., 1994) (Verrijzer et al., 1994)
TAF1-TAF4 (Mengus et al., 1995; Burley and Roeder, 1996) TAF1-TAF4b (Dikstein et al., 1996b) (Kokubo et al., 1993a; Weinzierl et al., 1993a) (Yatherajam et al., 2003)
TAF1-TAF5 (Dubrovskaya et al., 1996; Tao et al., 1997) (Yatherajam et al., 2003)
42 Table 1-2 continued. References Interaction Human Drosophila Yeast TAF1-TAF6 (Weinzierl et al ., 1993b; Hisatake et al ., 1995) (Weinzierl et al ., 1993b) (Yatherajam et al ., 2003) TAF1-TAF7 (Lavigne et al ., 1996) TAF1TAF7L (Pointud et al ., 2003) (Yatherajam et al ., 2003) TAF1-TAF9 (Kokubo et al ., 1994) (Yatherajam et al ., 2003) TAF1-TAF10 (Jacq et al ., 1994) TAF1-TAF11 (Yokomori et al., 1993b) (Yatherajam et al ., 2003) TAF1-TAF12 (Yokomori et al., 1993b) TAF2-TAF3 (Yatherajam et al ., 2003) TAF2-TAF4 TAF2TAF4b (Dikstein et al., 1996b) (Yatherajam et al ., 2003) TAF2-TAF7 (Yatherajam et al ., 2003) TAF2-TAF8 (Yatherajam et al ., 2003) TAF2-TAF10 (Yatherajam et al ., 2003) TAF2-TAF11 (Yokomori et al., 1993b) TAF2-TAF12 (Yokomori et al., 1993b) TAF3-TAF10 (Gangloff et al., 2001a) (Gangloff et al., 2001a) (Gangloff et al., 2001b; Yatherajam et al., 2003) TAF4-TAF5 (Kokubo et al ., 1993c) (Yatherajam et al ., 2003) TAF4-TAF7 (Yatherajam et al ., 2003) TAF4-TAF8 (Yatherajam et al ., 2003) TAF4-TAF9 (Kokubo et al ., 1994) (Yatherajam et al ., 2003) TAF4-TAF10 (Yatherajam et al ., 2003)

43 Table 1-2 continued. References Interaction Human Drosophila Yeast TAF4-TAF11 (Yokomori et al., 1993b) (Yatherajam et al ., 2003) TAF4-TAF12 (Hoffmann et al ., 1996; Gangloff et al ., 2000; Werten et al ., 2002) (Yokomori et al., 1993b; Kokubo et al., 1994) (Selleck et al ., 2001; Yatherajam et al ., 2003) TAF5-TAF6 (Tao et al ., 1997) (Ito et al ., 2001) TAF5-TAF7 (Dubrovskaya et al ., 1996; Lavigne et al ., 1996) TAF5-TAF8 (Yatherajam et al ., 2003) TAF5-TAF9 (Tao et al ., 1997) (Kokubo et al ., 1994) (Uetz et al ., 2000) TAF5-TAF10 (Yatherajam et al ., 2003) TAF5-TAF11 (Lavigne et al ., 1996; Tao et al ., 1997) TAF5-TAF12 (Lavigne et al ., 1996; Tao et al ., 1997) (Yatherajam et al ., 2003) TAF5-TAF13 (Dubrovskaya et al ., 1996; Lavigne et al ., 1996) TAF5-TAF15 (Bertolotti et al ., 1998) TAF6-TAF9 (Weinzierl et al ., 1993b; Hisatake et al ., 1995; Xenarios et al ., 2002) (Weinzierl et al ., 1993b; Kokubo et al ., 1994; Xie et al ., 1996) (Uetz et al ., 2000; Ito et al ., 2001; Selleck et al ., 2001; Yatherajam et al ., 2003) TAF6-TAF10 (Yatherajam et al ., 2003) TAF6-TAF11 (Yatherajam et al ., 2003) TAF6-TAF12 (Hisatake et al ., 1995; Hoffmann et al ., 1996) TAF7-TAF7 (Yatherajam et al ., 2003)

44 Table 1-2 continued. References Interaction Human Drosophila Yeast TAF7-TAF8 (Yatherajam et al ., 2003) TAF7-TAF11 (Lavigne et al ., 1996) (Yatherajam et al ., 2003) TAF7-TAF12 (Hoffmann et al ., 1996) TAF7-TAF15 (Bertolotti et al ., 1998) TAF8-TAF10 TAF8-TAF10b (HernandezHernandez and Ferrus, 2001) (Uetz et al., 2000; Gangloff et al., 2001b; Yatherajam et al., 2003) TAF8-TAF12 (Yatherajam et al ., 2003) TAF9-TAF10 (Yatherajam et al ., 2003) TAF9-TAF12 (Yatherajam et al ., 2003) TAF10-TAF10 (Klebanow et al., 1996; Gangloff et al., 2001b; Yatherajam et al., 2003) TAF10-TAF11 (Uetz et al ., 2000; Yatherajam et al ., 2003) TAF10-TAF12 (Mengus et al ., 1995; Xenarios et al ., 2002) (Yatherajam et al ., 2003) TAF10-TAF13 (Mengus et al ., 1995; Xenarios et al ., 2002) (Ito et al ., 2000; Yatherajam et al ., 2003) TAF10-TAF14 (Uetz et al ., 2000) TAF11-TAF12 (Mengus et al ., 1995) (Yatherajam et al ., 2003) TAF11-TAF13 (Mengus et al ., 1995; Birck et al ., 1998; Xenarios et al ., 2002) (Giot et al ., 2003) (Ito et al ., 2001; Yatherajam et al ., 2003) TAF11-TAF15 (Bertolotti et al ., 1998)

45 Table 1-2 continued. References Interaction Human Drosophila Yeast TAF12-TAF12 (Yatherajam et al ., 2003) TAF13-TAF15 (Bertolotti et al ., 1998)

46 Table 1-3. Protein-protei n interactions between TF IIA, TFIIB, TFIID, TFIIE, and TFIIF subunits in Homo sapiens Drosophila melanogaster and Saccharomyces cerevisiae with corresponding references. References Interaction Human Drosophila Yeast TBP-TFIIA-S/ (Sun et al ., 1994) (Yokomori et al ., 1994) (Geiger et al ., 1996; Kokubo et al ., 1998) TBP-TFIIA-L TFIIA – TBP (Sun et al ., 1994) TFIIA-L20 – TBP (Yokomori et al ., 1994) (Geiger et al ., 1996; Kokubo et al ., 1998) TBP-TFIIB (Maldonado et al ., 1990; Ha et al ., 1993; Xenarios et al ., 2002) (Yamashita et al ., 1993) TBP-TFIIE (Yokomori et al ., 1998) (Maxon et al ., 1994) TBP-TFIIE (Okamoto et al ., 1998) TAF1-TFIIE Acetylation (Imhof et al ., 1997) TAF1-TFIIF Acetylation (Ruppert and Tjian, 1995; Dikstein et al., 1996a) TAF4-TFIIA-L TAF4bTFIIA (Dikstein et al., 1996b) (Yokomori et al., 1993a) TAF5-TFIIF (Dubrovskaya et al ., 1996) TAF6-TFIIE (Hisatake et al ., 1995) TAF6-TFIIF (Hisatake et al ., 1995) TAF9-TFIIB (Klemm et al ., 1995; Xenarios et al ., 2002) (Goodrich et al ., 1993) TAF11-TFIIA-S (Kraemer et al ., 2001) TAF13-TFIIA-L (Giot et al ., 2003)

47 Table 1-3 continued. References Interaction Human Drosophila Yeast TAF14-TFIIF (Henry et al ., 1994) TFIIA-L TFIIA-S TFIIA -TFIIA (Sun et al ., 1994) TFIIA-Like Factor (ALF)-TF TFIIA (Upadhyaya et al ., 1999) (Giot et al ., 2003) (Ranish and Hahn, 1991) TFIIA-L (TFIIA / ) TFIIE (Yokomori et al ., 1998; Langelier et al ., 2001) TFIIA-L TFIIB (Uetz et al ., 2000) TFIIA (S)TFIIE (Langelier et al ., 2001) TFIIB-TFIIE (Okamoto et al ., 1998) (Ito et al ., 2001) TFIIB-TFIIF (Fang and Burton, 1996; Xenarios et al ., 2002) TFIIB-TFIIF (Ha et al ., 1993; Fang and Burton, 1996) (Ito et al ., 2001) TFIIE -TFIIE (Okamoto et al ., 1998) (Austin and Biggin, 1996; Giot et al ., 2003) (Riechmann and Ratcliffe, 2000; Uetz et al ., 2000) TFIIE -TFIIE (Okamoto et al ., 1998) TFIIE -TFIIF (Okamoto et al ., 1998) TFIIE -TFIIF (Okamoto et al ., 1998) TFIIF -TFIIF (Flores et al ., 1990; Killeen and Greenblatt, 1992 ) (Austin and Biggin, 1996)

Figure 1-2. Binary protein-protein interactions of the Homo sapiens general transcription factors TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and their homologs.

Figure 1-3. Binary protein-protein interactions of Drosophila melanogaster general transcription factors TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and their homologs.

Figure 1-4. Binary protein-protein interactions of Saccharomyces cerevisiae general transcription factors TFIIA, TFIIB, TFIID, TFIIE, and TFIIF.

CHAPTER 2
PHYLOGENETIC ANALYSIS OF POPLAR, Arabidopsis AND OTHER PLANT GENERAL TRANSCRIPTION FACTORS

Introduction

While there has been a great deal of interest in DNA-binding transcriptional activators in plants, comparatively little work has been done on plant regulatory factors downstream of the activators themselves. One exception is a current project that seeks to clone and characterize the majority of chromatin regulators that are readily identifiable based on similarity (Pandey et al., 2002). Another exception is the plant histone deacetylases, which have been characterized to some degree (reviewed in Lawit and Czarnecka-Verner, 2002). Because plant histone deacetylases have historically been of interest for their involvement in the maize disease caused by an inhibitor, HC-toxin from Cochliobolus carbonum, they have been characterized more with respect to enzymology than to their function in protein complexes or gene regulation. Only two plant histone acetyltransferase complexes have been characterized to any extent: a SAGA-like complex in Arabidopsis (Stockinger et al., 2001), and TFIID in wheat (Washburn et al., 1997).

The present study provides a detailed phylogenetic analysis of peptide subunits from the principal eukaryotic complexes responsible for transcriptional activation in plants. The model plant system Arabidopsis thaliana (ecotype Columbia) was chosen as the primary source of peptide sequences because it was the first of the plant genomes to be fully sequenced and annotated (The Arabidopsis Genome Initiative, 2000). General
transcription factors (GTFs) from plant systems are readily identifiable using homology-based searches since these proteins are highly conserved within eukaryotes and in some cases within Archaea as well.

Putative peptide subunits of plant GTFs TFIIA, TFIIB, TFIID, TFIIE, and TFIIF were identified using basic local alignment search tool (BLAST) searches of the Arabidopsis genomic data. In many cases, multiple genes were identified in Arabidopsis based on sequence similarity with GTFs in other eukaryotes. These putative Arabidopsis GTFs were then used to identify analogous genes in other plants to evaluate how widespread the gene duplication and evolutionary changes found in Arabidopsis are in the plant kingdom. Phylogenetic analyses were performed using these other plant homologs and the well-characterized proteins from Homo sapiens, Drosophila melanogaster, and Saccharomyces cerevisiae. The greatest emphasis was placed on identifying homologs in Arabidopsis, rice, and poplar (given that these plant genome projects were the most advanced). This provided an examination of differences between GTFs from dicotyledonous and monocotyledonous plants, as well as between herbaceous and woody dicotyledonous plants.

Functional analysis tied with genome sequence scrutiny has identified families of transcription factors found uniquely in plants (Riechmann et al., 2000). These large families have encouraged efforts to systematically examine the functions of the individual family members, looking for redundancy and divergence (Eulgem et al., 2000; Jakoby et al., 2002; Heim et al., 2003; Toledo-Ortiz et al., 2003). However, a similar kingdom-level investigation of GTFs has not been reported. There are numerous examples of GTF family members in metazoans that are specialized for certain functions.
Mammalian TRF2 (TBP-related factor 2) appears to be required for spermiogenesis, Drosophila Cannonball (TAF5L) is required for spermatogenesis, and human TAF4b is part of a unique TFIID complex in follicle cells of the ovary (reviewed in Levine and Tjian, 2003). These and other findings suggest that some GTF homologs have evolved to undertake specialized functions in animals, and similar events may have occurred in plants. The present analysis was undertaken because: 1) the sequences of three divergent seed-plant genomes are nearly complete and have the potential to allow discovery of plant-specific GTF family members; 2) transcriptional regulation is a fundamental adaptive mechanism in plants to short-term stresses, and information has hitherto been lacking on potential functional redundancy or divergence in GTFs.

Methods

The putative proteins of Arabidopsis GTFs TFIIA, TFIIB, TFIID, TFIIE, and TFIIF were identified using two BLAST resources. The National Center for Biotechnology Information (NCBI) protein BLAST (Arabidopsis thaliana organism setting; no filtering) was used to identify the following GTF subunits: TBP1, TBP2, TAF1, TAF1b, TAF2, TAF6, TAF6b, TAF9, TAF10, TAF11, TAF11b, and TAF12. The Arabidopsis Information Resource (TAIR) WU-BLAST2 BLASTp using the protein database was used to identify TAF4, TAF4b, TAF5, TAF7, TAF8, TAF13, TAF14, TAF14b, TAF15, TAF15b, TFIIA-S, TFIIA-L1-3, TFIIB1-6, TFIIEα1-3, TFIIEβ1-2, TFIIFα, and TFIIFβ1-2 (see Table 2-1 for a summary of these genes and their respective proteins). All searches used human protein sequences as queries, with the exception of the TAF14 search, in which the yeast protein sequence was used. Iterative searches using the identified Arabidopsis protein sequences were performed with TAIR WU-BLAST2 BLASTp to identify TAF12b. Nucleotide sequences and other relevant annotated data were identified by following the appropriate links provided by TAIR and NCBI.

Putative poplar GTFs were identified in the Department of Energy Joint Genome Institute poplar database genomic sequence of the female black cottonwood (Populus balsamifera subsp. trichocarpa) clone Nisqually-1 (Wullschleger et al., 2002). Searches were performed using the tBLASTn (Altschul et al., 1990; Altschul et al., 1997) function at http://aluminum.jgipsf.org/prod/bin/runBlast.pl?db=poplar0&dump=1&matchReads=1. Genomic contig sequences were assembled using Contig Assembly Program 3 (CAP3) (Huang and Madan, 1999). Contig gaps were filled using iterative searches of the genomic sequence (BLASTn) and CAP3 assembly. Contigs were analyzed to predict cDNA sequence using Softberry FGENESH software (http://www.softberry.com/berry.phtml?topic=gfind) with either the Dicots (Arabidopsis) or Nicotiana tabacum settings (Chicurel, 2002). Since taxonomists and the group sequencing the Populus genome disagree as to the proper name of the species, either trichocarpa or balsamifera subsp. trichocarpa, these two names are used interchangeably in this text.

Putative GTF amino acid sequences from plants other than Arabidopsis and poplar were identified using three different resources with Arabidopsis protein sequences as queries. NCBI BLASTp (http://www.ncbi.nlm.nih.gov/BLAST/) was utilized to search all available protein sequences using the Viridiplantae organism setting. H. sapiens, D. melanogaster, and S. cerevisiae sequences were collected using text searches in Entrez (http://www.ncbi.nlm.nih.gov/Entrez/). TBP, TFIIB, and TFIIE homologies
can be traced back to Archaea (Rowlands et al., 1994; Qureshi et al., 1995; Bell et al., 2001); therefore, in the case of these proteins archaeal homologs were assembled as well. The Institute for Genomic Research (TIGR) plant expressed sequence tag (EST) databases were searched using Arabidopsis sequence queries and a tBLASTn algorithm (http://tigrblast.tigr.org/tgi/). Only full-length EST contigs were considered, and the open reading frame (ORF) encoding the sequence with GTF similarity was translated using the Java based Molecular Biologist's Workbench 1.1 (JaMBW 1.1) (Toldo, 1997) TranslatER. Finally, the Plant Genome Database (PlantGDB; http://www.plantgdb.org/) was searched with Arabidopsis queries using BLASTp on protein sequences and tBLASTn on EST contigs (ORFs were translated as above). The sequences of predicted TATA-binding proteins from poplar were assembled as above by John Davis.

The collected amino acid sequences of each GTF homology group were aligned using the Gonnet weight matrix (Gonnet et al., 1994) in ClustalX (Thompson et al., 1997). Unaligned N-terminal or C-terminal sequence extensions were deleted by hand and the datasets realigned to conservatively estimate phylogenetic distances based on conserved protein regions. ClustalX alignment outputs were produced in Nexus format and phylogenetically analyzed using PAUP* (phylogenetic analysis using parsimony *and other methods) parsimony with 500 bootstrap replicates (Swofford, 2003); the PAUP analysis was done by Ram Kishore Alavalapati. Phylogenetic trees were created using TreeView (Page, 1996).

Similarity and identity matrices were created using Matrix Global Alignment Tool (MatGAT; http://www.angelfire.com/nj2/arabidopsis/MatGAT.html) with the blocks substitution matrix 62 (BLOSUM62) similarity matrix (Campanella et al., 2003).
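
The similarity and identity values produced by this step can be illustrated with a minimal Python sketch. It is not the MatGAT implementation: it assumes the two sequences are already aligned (gapped to equal length), uses every column in which at least one sequence has a residue as the denominator, and substitutes a small set of hypothetical conservative-substitution groups for the BLOSUM62 matrix. All function names are illustrative.

```python
from itertools import combinations

# Hypothetical conservative-substitution groups. They stand in for a
# BLOSUM62-based similarity call and are an assumption of this sketch.
SIMILAR_GROUPS = [set("ILVM"), set("FYW"), set("KRH"),
                  set("DE"), set("NQ"), set("ST"), set("AG")]


def _similar(x, y):
    """True if two residues are identical or share an assumed group."""
    return x == y or any(x in g and y in g for g in SIMILAR_GROUPS)


def pairwise_scores(a, b, gap="-"):
    """Return (percent identity, percent similarity) for two aligned sequences.

    Columns in which both sequences have a gap are ignored; every other
    column counts toward the denominator (an assumption of this sketch).
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(a.upper(), b.upper())
               if not (x == gap and y == gap)]
    if not columns:
        return 0.0, 0.0
    identical = sum(1 for x, y in columns if x == y)
    similar = sum(1 for x, y in columns
                  if gap not in (x, y) and _similar(x, y))
    n = len(columns)
    return 100.0 * identical / n, 100.0 * similar / n


def score_matrix(aligned):
    """Map each pair of sequence names to its (identity, similarity) scores."""
    return {(p, q): pairwise_scores(aligned[p], aligned[q])
            for p, q in combinations(sorted(aligned), 2)}
```

Calling score_matrix on a dictionary of trimmed, aligned sequences gives one (identity, similarity) pair per sequence pair, from which a range for each protein family can be read off.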

Truncated sequences were used as input data for those cases in which protein sequences were shortened for ClustalX alignments.

Results

The amino acid sequences of the assembled GTFs are shown in Appendix A. The amino acid multiple sequence alignments for core domains of the GTFs (sans N-terminal and C-terminal extensions) are found in Appendix B. Similarity and identity ranges for protein families are found in Table 2-2. Phylogenetic trees derived from the parsimony analyses are found in Figures 2-1 through 2-12.

TFIIA Large and Small Subunits

The poplar TFIIA-L1 coding sequences predicted using the Softberry, GlimmerM (http://www.tigr.org/tdb/glimmerm/glmr_form.html) (Majoros et al., 2003), and Eukaryotic GeneMark.hmm (http://opal.biology.gatech.edu/GeneMark/eukhmm.cgi) (Lukashin and Borodovsky, 1998) programs all had a C-terminal extension of varying length that did not have homology with any other group, and all were missing the last seven amino acids that were highly conserved among all other organisms examined. However, further analysis using the GeneSeqer program (http://www.plantgdb.org/cgi-bin/PlantGDB/GeneSeqer/PlantGDBgs.cgi) (Schlueter et al., 2003), which used cDNA data from all plants and from Populus only, predicted a shorter C-terminus that was highly similar to other organisms and ended with the ATGEFEF plant consensus. Since this model was truncated at the N-terminus due to the paucity of 5' cDNA sequences, it was merged with the Softberry FGENESH output from the Nicotiana tabacum and Arabidopsis thaliana settings to yield the final prediction of the poplar TFIIA-L1 coding sequence.
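
The comparison of competing gene models against the conserved C-terminus can be sketched as follows. This is an illustrative helper, not part of any of the gene-prediction programs named in the text. The function names are hypothetical, and only the ATGEFEF consensus comes from the text.

```python
CONSENSUS = "ATGEFEF"  # plant C-terminal consensus named in the text


def c_terminal_mismatches(protein, consensus=CONSENSUS):
    """Count mismatches between the protein's last residues and the consensus.

    A trailing '*' (stop symbol) is ignored. Returns None when the protein
    is shorter than the consensus.
    """
    tail = protein.upper().rstrip("*")[-len(consensus):]
    if len(tail) < len(consensus):
        return None
    return sum(1 for a, b in zip(tail, consensus) if a != b)


def rank_models(models):
    """Sort (name, mismatches) pairs so the best-matching model comes first.

    `models` maps a hypothetical model name to its predicted protein sequence;
    models too short to compare are dropped.
    """
    scored = [(name, c_terminal_mismatches(seq)) for name, seq in models.items()]
    usable = [(name, m) for name, m in scored if m is not None]
    return sorted(usable, key=lambda item: item[1])
```

A model that ends in the consensus scores zero mismatches, while a model carrying an unrelated C-terminal extension scores close to the full length of the consensus.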

TFIIB Family

A large number of full-length cDNAs and predicted genes encoding TFIIB homologs have been identified in plants (30 plant homologs in all). It appears that the TFIIB protein family has undergone many duplications as well as differentiations, including a novel homolog that has a functional connection to the plastid (Lagrange et al., 2003). Arabidopsis and poplar both have four distinct phylogenetic TFIIB clusters (Class A, Class C, Class D, and Class E). Clearly identifiable plant homologs of TFIIB-related factors (BRFs) associated with DNA-dependent RNA polymerase III (PolIII) were excluded from the phylogenetic analysis; however, it appears that distantly related homologs from Lycopersicon esculentum and Populus may have remained.

Plant TFIIB homologs appear to have many conserved motifs first identified in metazoan TFIIB. Among these is a lysine residue that has recently been shown to be autoacetylated in human and yeast TFIIBs (Choi et al., 2003). This lysine is conserved in many members of the Class A TFIIB family (Figures 2-3 and 2-13). Similarly, the putative zinc-ribbon domain at the N-terminus has been conserved in most family members (Appendix B3). Although it is not apparent in Appendix B3 due to N-terminal trimming for alignment purposes, AtTFIIB4 also contains this conserved metal-binding domain. Poplar TFIIB3, poplar TFIIB9, poplar TFIIB8, Lycopersicon esculentum TFIIB AF273333, and Sulfolobus solfataricus TFB AAK40772.1 are all missing the conserved cysteine and/or histidine residues essential to this N-terminal motif. Since a known functional TFB from the archaeon Sulfolobus solfataricus lacks this motif, it may not be required for TFIIB function in all cases. Thus, poplar TFIIB3, poplar TFIIB9, poplar TFIIB8, and Lycopersicon esculentum TFIIB AF273333 may all be functional TFIIB homologs despite the absence of this conserved motif.
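
A screen for the N-terminal metal-binding residues can be sketched with a regular expression. The pattern below (C-x2-[C/H]-x10-25-C-x2-C, searched within the first 80 residues) is an assumed, deliberately loose approximation of a zinc-ribbon signature chosen for illustration. It is not the criterion used in this study, and the spacing range, window size, and function name are all assumptions.

```python
import re

# Assumed, loosely specified signature: C-x2-[C/H]-x(10..25)-C-x2-C.
# The spacing limits are illustrative and are not taken from the text.
ZINC_RIBBON = re.compile(r"C.{2}[CH].{10,25}C.{2}C")


def has_zinc_ribbon(protein, window=80):
    """Return True if the assumed signature occurs in the N-terminal window."""
    n_terminal = protein.upper()[:window]
    return ZINC_RIBBON.search(n_terminal) is not None
```

Sequences that fail such a screen would still need to be inspected in the untrimmed alignment, since N-terminal trimming can hide the domain, as noted for AtTFIIB4.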

Likewise, the imperfect direct repeats of amino acid sequence found in human core TFIIB (Nikolov et al., 1995) have been well conserved (Appendix B3). AtTFIIB6 is lacking the second direct repeat region. This region is involved in protein-protein interactions with PolII (Ha et al., 1993). Therefore, it is suggested that the AtTFIIB6 protein may be deficient in this PolII interaction, and could possibly function as a negative regulator. Vitis vinifera TFIIB TC9302, as well as amino acid predictions from poplar TFIIB4, TFIIB5, and TFIIB6, are notably lacking both direct repeats, suggesting that these proteins, if expressed, are not functional TFIIB homologs since they would likely be deficient in TBP and PolII interactions (Ha et al., 1993).

In addition to the canonical TFIIB proteins (Class A) and BRFs (Class B), the plastid envelope-associated (Class C) TFIIB-like proteins (Lagrange et al., 2003) are conserved in all plant lines with available sequence. The Arabidopsis AtTFIIB5/pBrp shows two closely related homologs in Lycopersicon esculentum (Accession AAG01118) and poplar (TFIIB7/pBrp) and high relatedness to the partial cDNAs from Spinacia oleracea and Zea mays reported by Lagrange et al. (2003).

Representative TFIID Components

TBP has been highly conserved in plants (Table 2-2 and Figure 2-4), even in cases where plants contain duplicate TBP genes. These duplicated TBP proteins are in all cases highly similar and are not likely to have diverged functions. Plant TBP genes are tightly clustered phylogenetically, although somewhat diverged from metazoan, fungal, protistan, and archaeal TBPs and TBP-like proteins. Significantly, both imperfect repeat motifs within the protein structure are conserved in all the TBP-like proteins.

As is the case with TBP, A. thaliana has two loci that encode TAF6 homologs. Both of these genes (designated TAF6 and TAF6b) are transcribed and the
latter has at least four alternative splicing variants (E. Czarnecka-Verner, S.J. Lawit, W.B. Gurley, unpublished data). Clones were sequenced by the University of Florida ICBR DNA sequencing facility. Intron-exon diagrams of TAF6 and TAF6b isoforms are shown in Figure 2-14.

The phylogenies of TAFs 6, 9, 10, and 11 cluster readily into monocot and dicot families. The plant proteins are well conserved and somewhat divergent from metazoan, fungal, and protistan proteins. However, poplar TAF9b is more closely related to the Chlamydomonas reinhardtii TAF9 and S. cerevisiae TAF9 than to the plant TAF9 proteins. This TAF9 homolog is perhaps a TAF9-like protein involved in other transcriptional complexes, similar to H. sapiens TAF9L (Chen and Manley, 2003). Similarly, poplar TAF11 is roughly equally related to plant and fungal family members, suggesting that it represents a more ancient form of TAF11 than is found in other plants.

TFIIEα and TFIIEβ Subunits

As with the TAF proteins, the TFIIEα phylogeny pattern also formed along dicot-monocot lines. The six dicot proteins and single monocot protein were diverged and branched separately with 100% bootstrap support. Interestingly, the archaeal TFE proteins were phylogenetically more similar to plant TFIIEα than to the yeast and metazoan counterparts. This suggests a higher degree of primary structure conservation in the plant TFIIEα proteins than in the other kingdoms, possibly indicating a greater reliance on TFIIEα for stabilization of the open promoter conformation.

Monocot TFIIEβ proteins cluster closely with one another, as do dicot proteins with the exception of two. These exceptions are the A. thaliana TFIIEβ family members.
Both proteins from Arabidopsis are found well outside of the core plant cluster, suggesting a significant divergence in sequence in the Arabidopsis TFIIEβ genes.

TFIIFα and TFIIFβ Subunits

A. thaliana TFIIFα has an 87 amino acid C-terminal extension and S. cerevisiae TFIIFα has a 68 amino acid C-terminal extension; both were removed from their sequences for alignment purposes. Due to the large size of the TFIIFα subunit, relatively few full-length plant cDNAs could be assembled for comparison. Furthermore, due to a lack of sequence in one region of a poplar TFIIFα gene, a full-length genomic region could not be assembled. Therefore, only one of two putative poplar TFIIFα proteins was phylogenetically analyzed. These proteins are well conserved throughout the length of the protein.

The N-terminus and C-terminus of S. cerevisiae TFIIFβ were trimmed by 32 and 33 amino acids, respectively, for alignment purposes. Similarly, D. melanogaster TFIIFβ was trimmed at the C-terminus by nine amino acids. TFIIFβ proteins are likewise well conserved, with two exceptions. Both A. thaliana TFIIFβ1 and poplar TFIIFβ1 have large, non-conserved insertions. It should be noted that neither of these genes has cDNA clone representation, and both are therefore only predictions (despite numerous attempts at RT-PCR cloning of AtTFIIFβ1; data not shown).

Discussion

TFIIA Large and Small Subunits

TFIIA is composed of either two subunits (L and S) in fungi and plants or three subunits (α, β, and γ) in metazoans, where α and β are derived from post-translational cleavage of a protein homologous to the fungal and plant TFIIA-L subunit (Li et al., 1999).
TFIIA interacts with both the N-terminal stirrup of TBP and DNA upstream of the TATA element (Langelier et al., 2001).

TFIIA in A. thaliana seems to be encoded by four genes, three encoding large subunit homologs and one encoding a small subunit homolog. AtTFIIA-L1 and AtTFIIA-L2 appear to result from a recent gene duplication, given their high degree of identity and their close juxtaposition in chromosomal location (Fig. 2-2). The genes encoding TFIIA-L1 and TFIIA-L2 are oriented in a tail-to-tail fashion with their polyadenylation sites separated by 1,922 bp. AtTFIIA-L3 appears to have arisen from a more ancient gene duplication, and has significantly diverged from other TFIIA-L genes (Figure 2-2). The AtTFIIA-L3 protein is approximately half the size of its two Arabidopsis homologs, although it has maintained a similar isoelectric point (pI) and appears to be competent for assembly of the TFIIA complex based on yeast two-hybrid interactions (Chapter 4). One hypothesis is that AtTFIIA-L3 represents an ancestral form of the TFIIA-L protein family due to its phylogenetic clustering with fungal and metazoan sequences.

Poplar TFIIA also appears to be encoded by four genes (two encoding large subunit proteins, and two encoding small subunit proteins). Unfortunately, two contigs that most likely encode one of the TFIIA-L genes could not be connected due to the presence of a large, low-complexity (most likely intronic) region; however, the predicted TFIIA-L amino acid sequence grouped solidly with other dicot TFIIA-L sequences (data not shown). The predicted amino acid sequence from the incomplete TFIIA-L sequence has a high degree of identity with the complete form (Fig. 2-2), suggesting a recent gene duplication and redundancy in function. Interestingly, poplar TFIIA-S1 is highly
conserved (Fig. 2-1), whereas poplar TFIIA-S2 is quite diverged from other plant proteins, being nearly equidistant from other plant proteins and metazoan proteins.

Arabidopsis and poplar both encode TFIIA proteins that are highly diverged from other plant proteins (TFIIA-L3 and TFIIA-S2, respectively). These proteins may have evolved specialized functions within their respective organisms; however, they do not seem to be conserved within other plants that have been sequenced. This suggests that these proteins are potentially the products of evolutionary experiments in progress or may be ancestral forms that have been retained in their respective species. However, the significance of their presence cannot be reliably predicted.

Overall, TFIIA conservation appears to be quite high. The TFIIA-S family is conserved throughout the length of the protein, while the TFIIA-L family is conserved mainly at the N-terminal and C-terminal ends. The TFIIA-L sequence conservation pattern is consistent with the observation that human and fruit fly TFIIA is composed of three subunits, the two largest of which are derived from proteolytic cleavage of the TFIIA-α/β (TFIIA-L) pre-protein. This suggests that the middle region of TFIIA-L proteins may function as a flexible linker.

TFIIB Family

Full-length cDNAs of 30 plant homologs of TFIIB have been identified. The TFIIB protein family has undergone myriad duplications and differentiations, including one (the Class C TFIIB-related proteins) that has evolved a functional interaction with the defining plant organelle, the plastid (Lagrange et al., 2003). The canonical member of the Class C group (TFIIB5/pBrp) was discovered by Lagrange et al. (2003) to bind the outer envelope of the plastid, suggesting a function in signal
transduction from the plastid to the nucleus. Six distinct phylogenetic TFIIB groups are apparent in Arabidopsis and Populus (if one accounts for the BRFs). Clear homologs of DNA-dependent RNA polymerase III (PolIII)-associated TFIIB-related factors (BRFs) from plants were excluded from my phylogenetic analysis, with the exception of the Arabidopsis proteins, which were used as an out-group.

Plant TFIIB homologs have a number of conserved motifs. These include a lysine residue, located 28 amino acids from the N-terminus of the second direct repeat (in the human sequence), which has recently been shown to be autoacetylated in human and yeast TFIIBs (Choi et al., 2003). This lysine is conserved in many members of the plant Class A TFIIB family (Figures 2-3 and 2-13), suggesting a conservation of this autoacetylation activity in plants. Choi et al. (2003) did not identify the catalytic domain of this autoacetylase activity; therefore, the conservation of this domain could not be assessed. Choi et al. (2003) reported that the presence of the acetyllysine group in TFIIB increases the affinity of this protein for TFIIF, implying a role in transcriptional initiation. It is likely that an activity involved in this critical process will be conserved not only in metazoans and fungi, but also in plants. Equally significant is the absence of this lysine in several of the plant TFIIBs, suggesting plant-specific specialization among members of the TFIIB family.

Similarly, the putative zinc-ribbon domain at the N-terminus has been conserved in most family members (Appendix B3), including AtTFIIB4, although this is not apparent in Appendix B3 due to N-terminal trimming. Significantly, poplar TFIIB3, poplar TFIIB8, poplar TFIIB9, Lycopersicon esculentum TFIIB AF273333, and Sulfolobus solfataricus TFB AAK40772.1 are all missing the conserved cysteine and/or histidine residues essential to
this N-terminal motif. However, at least one archaeal species (Sulfolobus solfataricus) is lacking the zinc-ribbon in its TFB, suggesting that this motif may not be required for TFIIB function in all cases.

Another conserved domain, the imperfect direct repeats (Nikolov et al., 1995), is found in most plant TFIIB homologs (Appendix B3). AtTFIIB6 is lacking the second direct repeat region, which interacts directly with PolII in animals (Ha et al., 1993). Therefore, it is suggested that this protein may be deficient in this PolII interaction and, if it is functional, could possibly play a role as a negative regulator. Four TFIIB-related proteins are lacking both direct repeats, suggesting that these proteins, if expressed, are not functional TFIIB homologs (Ha et al., 1993).

In addition to the TFIIB-family proteins in Class A and Class B (BRFs, which were not analyzed extensively in this study), a clear conservation has been shown for the plastid envelope-associated (Class C) TFIIB-like proteins in this study and by Lagrange et al. (2003). This plant-specific TFIIB is localized to the outer plastid membrane and is not detectable in the nucleus of wild-type plants (Lagrange et al., 2003). The characterized protein AtTFIIB5/pBrp has two closely related homologs (Lycopersicon esculentum Accession AAG01118, and poplar TFIIB7/pBrp) in addition to the partial cDNAs from Spinacia oleracea and Zea mays reported by Lagrange et al. (2003). This suggests that this protein has a conserved activity that is critical to plant cell functions.

Lagrange et al. (2003) suggested a functional model in which a plastid-derived signal leads to release of the TFIIB5 protein from the outer envelope and movement into the nucleus. Once in the nucleus, TFIIB5 induces expression of its unique transcriptosome. The nuclear accumulation of TFIIB5 in plants deficient in COP9 suggests that proteasome-mediated degradation provides a rapid turnover of the nuclear-localized TFIIB5 protein, facilitating temporal control of the signal response. However, this model does not explain the lack of TFIIB5/pBrp protein on the plastid under conditions of proteasome/COP9 signalosome dysfunction.

In the present study, a model is proposed in which the COP9 signalosome leads to degradation of a TFIIB5/pBrp co-factor that allows the protein to move to the nucleus. Such a co-factor may be either a chaperone that escorts TFIIB5/pBrp to the nucleus, a chloroplast-docking antagonist, or a post-translational modifying regulator (e.g., a kinase or a 14-3-3 protein). In any of these cases, TFIIB5/pBrp transport to the nucleus and induction of transcription is likely to be the culmination of this signal transduction pathway. Whatever the role of TFIIB5/pBrp, the Class C subfamily of TFIIB-like factors plays a novel role in plant transcriptional regulation.

There is weak bootstrap support for two additional conserved classes of TFIIB-like proteins in plants. The Class D group contains Arabidopsis TFIIB4 and poplar TFIIB8. Class E contains Arabidopsis TFIIB3 and TFIIB6, as well as poplar TFIIB2. The functions of these proteins are unknown; however, Arabidopsis TFIIB3 and TFIIB6 have similar interactions with other GTFs (Chapter 4).

Representative TFIID Components

TBP is widely regarded as being the rate-limiting factor of PIC formation. Consistent with this critical role, it is among the most highly conserved proteins of the GTFs across all the organisms examined in this study, with 73.7% and 63.1% average similarity and identity, respectively. Likewise, TBP is highly conserved in plants (84% average identity). As in animals, many plants contain duplicate TBP genes; however, unlike the case in animals, the plant proteins are highly similar and are likely to
be largely redundant. Plant TBPs are tightly clustered, although significantly diverged from metazoan, fungal, protist, and archaeal TBPs and TBP-like proteins. As would be expected, the two repeated structural domains of TBP are conserved in all the proteins in the TBP-like family.

In general, TBP-associated factors are more highly variable than TBP. There are many cases of duplicate TAFs as well as TAF-like proteins in fungi and animals (reviewed in Tora, 2002). One example of this in plants is the presence of two genetic loci encoding homologs of TAF6 in Arabidopsis. Upon cloning of these cDNAs (Chapter 3), it was found that one of these genes, TAF6b, is represented by four alternatively spliced mRNAs (Figure 2-14). TAF6b has 12 coding exons in the mRNAs of three isoforms, and TAF6b-4 has only 5 coding exons due to what appears to be a premature stop codon in exon V caused by the lack of splicing of intron II. In contrast, TAF6 has 11 coding exons and no detected alternative splicing.

The TAF genes that have been investigated in this work are clearly divergent along taxonomic lines. This is demonstrated by TAF9, of which monocot and dicot sequences cluster separately in the unrooted phylogram (Figure 2-6). This situation is also evident in the TAF6 phylogram (Figure 2-5). However, poplar TAF9b is more closely related to the C. reinhardtii TAF9 and S. cerevisiae TAF9 than to the monocot or dicot TAF9 proteins. This TAF9 homolog is perhaps a TAF9-like protein involved in other transcriptional complexes, similarly to H. sapiens TAF9L (Chen and Manley, 2003). A second possibility is that poplar TAF9b may be a bona fide TAF9 that regulates a subset of genes in poplar, or is merely redundant. Finally, poplar TAF9b could represent an ancient form of the gene that has been maintained in this lineage.

PAGE 80

Similarly to the situation with TAF9, TAF10 proteins are plainly grouped as either monocots or dicots, with other kingdoms clustering separately. One gymnosperm (pine) TAF10 protein was included in this analysis, and it was found to be more similar to the TAF10 proteins of dicots than to those of monocots, consistent with the more recent evolution of monocots.

TAF11 proteins are similarly clustered in the phylogram, with the exception of the protein encoded by poplar TAF11. Poplar TAF11 is equally similar to yeast and plant TAF11 proteins. Interestingly, AtTAF11 is located on Arabidopsis chromosome 4 only five loci away from TFIIEβ2 (~27 kbp), which is in turn located very near TFIIEα2 (see TFIIE section below). Although TAF11 has no known direct connection to TFIIE, this close genomic proximity seems unusually coincidental. Chromosome 3 of Oryza sativa (rice) has genes encoding both TAF11-like and TFIIE-like proteins; however, these genes are separated by over 8 Mb. Unfortunately, fine mapping of chromosomes has not yet been performed for any dicot except Arabidopsis; therefore, comparison of GTF synteny within dicots is not yet possible.

Arabidopsis TAF11b does not have representation in EST collections, nor has it been amplified by RT-PCR. These data suggest that AtTAF11b may be a non-expressed or very low expression gene. This hypothesis is supported by the evolutionary divergence of the protein sequence in relation to other plant TAF11 amino acid sequences. However, the sequence of AtTAF11 is actually more similar to other plant sequences than is the putative poplar TAF11 protein.

PAGE 81

TFIIEα and TFIIEβ Subunits

A. thaliana has genes encoding three homologs of TFIIEα and two of TFIIEβ (Table 2-1). TFIIEα and TFIIEβ of H. sapiens are acidic (pI of 4.5) and basic (pI of 9.5) proteins, respectively (Peterson et al., 1991). The acidic properties of TFIIEα appear to be well conserved in Arabidopsis, with pI values of 4.75, 4.95, and 4.72 for Eα1, Eα2, and Eα3, respectively. Likewise, the basic pI values are conserved in Arabidopsis TFIIEβ proteins (10.23 and 10.04 for Eβ1 and Eβ2, respectively).

Four of these Arabidopsis TFIIE genes are clustered on chromosome 4. TFIIEα2 and TFIIEβ2 neighbor each other in a head-to-head inverted fashion, sharing a common promoter region. TFIIEα3 and TFIIEβ1 are in relatively close proximity, both in the same orientation (18 genetic loci inserted between the genes, 83 kbp apart). The extreme proximity (only 972 bp between start codons) of TFIIEα2 and TFIIEβ2 suggests that they are direct descendants of the ancestral genes in plants and have been duplicated to create the other loci. This hypothesis is supported by phylogenetic data in the case of TFIIEβ2, in which the Arabidopsis protein (along with the gene product of TFIIEβ1) is clustered separately from all other TFIIEβ proteins. However, the TFIIEα2 protein clusters with the other Arabidopsis TFIIEα sequences, within the dicot grouping.

TFIIFα Family

The poplar genome clearly encodes two TFIIFα genes; however, two contigs encoding what appear to be the N-terminal and C-terminal regions could not be connected due to lack of sequence in what appears to be a low-complexity intronic

PAGE 82

region. Therefore, only one poplar putative TFIIFα amino acid sequence is included in my analyses.

TFIIFα is highly conserved throughout the length of the primary structure. In metazoans and yeast, TFIIFα can be functionally divided into three domains: the N-terminal TFIIFβ binding domain, the highly charged middle domain, and the C-terminal winged helix domain (Kamada et al., 2001). The high conservation of the plant TFIIFα primary structures carries through to the hydrophobic residues in the C-termini (Appendix B). Therefore, the conclusions of Kamada et al. (2001) are followed, suggesting that these proteins have a conserved winged helix domain in their C-termini. This winged helix domain is not yet implicated in DNA binding, as are the winged helix domains of TFIIEβ, TFIIFβ, and many of the winged helix superfamily members (Kamada et al., 2001).

TFIIFα has been reported to contain a serine/threonine kinase activity that is involved in transcriptional elongation (Rossignol et al., 1999). However, Rossignol and co-workers (1999) were unable to find an identifiable ATPase domain in TFIIFα. Interestingly, I have identified a weak similarity with AAA ATPase VWA-containing proteins in all the plant TFIIFα homologs studied within this work, suggesting that this kinase activity is retained in plants.

TFIIFβ Family

The poplar genomic sequence has four coding regions with homology to the Arabidopsis TFIIFβ proteins. However, only three of the poplar contigs created from the genomic sequences appear to be transcribed into RNA. The fourth sequence (analyzed using BLASTx) (Altschul et al., 1997) does not support an mRNA prediction of

PAGE 83

reasonable length and encodes a stop codon within a highly conserved region of the predicted amino acid sequence. Thus, this fourth contig region is most likely a remnant of a non-functional gene duplication. TFIIFβ proteins are highly conserved, except for large, non-conserved insertions in A. thaliana TFIIFβ1 and poplar TFIIFβ1. It should be noted that neither of these genes has cDNA clone representation, and both are therefore only predictions. Despite the lack of cDNA support for these genes, the insertions seem to be grouped between conserved functional domains identified by Tan et al. (1995). Therefore, these insertions may be in flexible linker regions between more defined structural domains, suggesting that these gene products, if expressed, may be functional.

The tight clustering of GTF proteins (TAF6, TAF9, TAF10, TAF11, TFIIEα, and TFIIEβ) along evolutionary lines suggests that these proteins are evolving with the species that encode them and have functions that are somewhat resistant to minor sequence variations. This may be due to linker regions in proteins that have little need for conservation of sequence. Conversely, the lack of obvious clustering of proteins within evolutionary groupings below the kingdom level (such as in TFIIB and TBP) suggests that these proteins are very tightly conserved and that minor sequence alterations may drastically disrupt functions. Thus, in the case of TFIIB and TBP, it appears more likely that the conservation is so tight (at least within subclasses) that very few changes have occurred below the kingdom level, and thus phylogenetic analysis cannot produce subgroups reliably. Three plant-specific clusters of TFIIB-like proteins have been identified. At least one of these has evidence of a plant-specific function due to its

PAGE 84

intracellular localization to the plastid outer membrane under normal conditions (Lagrange et al., 2003).

PAGE 85

Table 2-1. Arabidopsis GTF genes, loci, genomic sizes, coding sequence sizes (counting stop codons), predicted protein molecular weights, and pI of the predicted proteins.

Gene       Locus       Genomic size of CDS (bp)   CDS size (bp)   Predicted Mw (kDa)   Predicted pI
TFIIA-S    At4g24440   1,094                      321             12.1                 5.61
TFIIA-L1   At1g07480   2,510                      1,128           41.3                 3.98
TFIIA-L2   At1g07470   2,628                      1,128           41.2                 4.02
TFIIA-L3   At5g59230   825                        561             20.9                 3.94
TFIIB1     At2g41630   1,846                      939             34.3                 6.77
TFIIB2     At3g10330   1,736                      939             34.2                 6.66
TFIIB3     At3g29380   1,118                      1,011           37.7                 6.32
TFIIB4     At3g57370   1,637                      1,083           39.7                 7.76
TFIIB5     At4g36650   2,245                      1,512           55.7                 6.14
TFIIB6     At4g10680   549                        549             19.9                 8.89
TBP1       At3g13445   1,453                      603             22.4                 10.21
TBP2       At1g55520   1,334                      603             22.4                 10.31
TAF1       At1g32750   9,877                      5,760           217.2                5.55
TAF1b      At3g19040   8,107                      5,103           192.1                7.65
TAF2       At1g73960   9,201                      4,113           153.5                6.18
TAF4       At5g43130   5,537                      2,163           80.6                 9.34
TAF4b      At1g27720   4,026                      1,854           68.9                 9.84
TAF5       At5g25150   4,942                      2,010           74.4                 6.65
TAF6       At1g04950   3,508                      1,584           58.9                 8.73
TAF6b1     At1g54360   2,562                      1,515           56.5                 8.83
TAF6b2     At1g54360   2,562                      1,494           55.7                 8.84
TAF6b3     At1g54360   2,562                      1,431           53.1                 8.56
TAF6b4     At1g54360   857                        588             22.6                 9.62
TAF7       At1g55300   1,321                      612             22.5                 4.11
TAF8       At4g34340   1,062                      1,062           39.5                 4.96
TAF9       At1g54140   794                        552             20.6                 4.67
TAF10      At4g31720   1,482                      405             14.9                 5.50
TAF11      At4g20280   873                        633             23.7                 5.39
TAF11b     At1g20000   769                        615             23.4                 9.59
TAF12      At3g10070   2,604                      1,620           57.7                 10.42
TAF12b     At1g17440   3,257                      2,052           74.8                 10.46
TAF13      At1g02680   836                        384             14.3                 5.81
TAF14      At2g18000   1,025                      609             22.8                 6.16
TAF14b     At5g45600   1,608                      807             30.2                 7.03
TAF15      At1g50300   2,845                      1,119           41.3                 7.98
TAF15b-1   At5g58470   2,131                      1,164           38.9                 7.93
TAF15b-2   At5g58470   2,289                      1,269           42.3                 8.73
TFIIEα1    At1g03280   3,124                      1,440           54.1                 4.75

PAGE 86

Table 2-1 Continued.

Gene       Locus       Genomic size of CDS (bp)   CDS size (bp)   Predicted Mw (kDa)   Predicted pI
TFIIEα2    At4g20340   2,738                      1,428           54.6                 4.95
TFIIEα3    At4g20810   2,171                      1,251           47.8                 4.72
TFIIEβ1    At4g21010   1,047                      828             31.5                 10.23
TFIIEβ2    At4g20330   1,357                      861             32.4                 10.04
TFIIFα     At4g12610   3,349                      1,950           72.3                 5.22
TFIIFβ1    At3g52270   1,708                      1,095           42.1                 7.70
TFIIFβ2    At1g75510   1,376                      761             29.7                 6.92
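
Because the CDS sizes in Table 2-1 are reported counting stop codons, each value should correspond to a whole number of codons. The following is a minimal sketch of that consistency check, using a few rows copied from the table and keyed by locus. The function names are illustrative only and are not part of the original analysis.

```python
# Rows copied from Table 2-1: locus -> CDS size in bp (stop codon included).
CDS_SIZES_BP = {
    "At4g24440": 321,   # TFIIA-S
    "At3g13445": 603,   # TBP1
    "At3g10070": 1620,  # TAF12
    "At1g75510": 761,   # final row of Table 2-1, value as printed
}


def protein_length(cds_bp: int) -> int:
    """Number of residues encoded by a CDS whose length includes the stop codon."""
    if cds_bp % 3 != 0:
        raise ValueError(f"{cds_bp} bp is not a whole number of codons")
    return cds_bp // 3 - 1  # subtract the stop codon


def flag_incomplete_codons(sizes: dict[str, int]) -> list[str]:
    """Return the loci whose CDS size is not a multiple of three."""
    return sorted(locus for locus, bp in sizes.items() if bp % 3 != 0)


if __name__ == "__main__":
    for locus, bp in CDS_SIZES_BP.items():
        if bp % 3 == 0:
            print(locus, bp, "bp ->", protein_length(bp), "aa")
    print("Not divisible by three:", flag_incomplete_codons(CDS_SIZES_BP))
```

In this subset, the 1,620 bp TAF12 CDS corresponds to 539 residues, which matches the TAF12 fragment boundaries (aa 1-200, 201-394, and 395-539) used in Chapter 3. The 761 bp entry is flagged; the table alone does not show what the intended value is.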

PAGE 87

Table 2-2: Similarity and identity percentage ranges of the GTF protein families examined.

Protein family  Similarity range         Identity range           Similarity range         Identity range
                (average)                (average)                within plants (average)  within plants (average)
TFIIA-S         57.4 – 100.0 (83.5)      36.2 – 100.0 (72.2)      62.5 – 100.0 (90.1)      46.7 – 100.0 (83.4)
TFIIA-L         23.8 – 97.3 (49.4)       14.8 – 96.5 (34.5)       33.6 – 97.3 (63.2)       21.5 – 96.5 (50.5)
TFIIB (all)     13.2 – 99.7 (46.0)       8.6 – 99.4 (33.2)        16.3 – 99.7 (50.4)       10.0 – 99.4 (37.8)
TFIIB Class A   21.7 – 99.7 (58.3)       10.0 – 99.4 (46.8)       24.1 – 99.7 (63.3)       10.0 – 99.4 (54.7)
TFIIB Class C   88.4 – 91.2 (89.9)       80.2 – 86.3 (82.7)       88.4 – 91.2 (89.9)       80.2 – 86.3 (82.7)
TBP             31.4 – 99.5 (73.7)       16.5 – 98.5 (63.1)       61.5 – 99.5 (88.2)       52.8 – 98.5 (84.0)
TAF6            15.5 – 98.6 (49.8)       10.2 – 98.6 (35.2)       20.0 – 98.6 (59.3)       15.2 – 98.6 (47.5)
TAF9            22 – 98.4 (52.7)         12.1 – 96.2 (40.1)       24.1 – 98.4 (61.8)       13.2 – 96.2 (50.7)
TAF10           47.3 – 98.7 (71.5)       22.0 – 98.0 (57.3)       66.9 – 98.7 (82.5)       49.7 – 98.0 (71.6)
TAF11           21.4 – 97.2 (51.1)       10.6 – 96.3 (35.5)       31.3 – 97.2 (61.6)       17.2 – 96.3 (47.1)
TFIIEα          15.4 – 85.0 (42.6)       8.8 – 73.5 (28.3)        53.3 – 85.0 (69.1)       39.6 – 73.5 (53.5)
TFIIEβ          36.3 – 100.0 (69.6)      16.8 – 99.6 (53.4)       67.4 – 100.0 (80.7)      48.6 – 99.6 (67.1)
TFIIFα          29.1 – 94.9 (49.2)       15.1 – 89.1 (33.4)       70.8 – 94.9 (76.2)       58.5 – 89.1 (65.3)
TFIIFβ          35.4 – 100 (62.3)        18.8 – 99.6 (46.1)       48.9 – 100 (74.4)        34.8 – 99.6 (60.2)
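
To illustrate how pairwise percentages of the kind reported in Table 2-2 can be derived from an alignment, the sketch below computes similarity and identity for two aligned sequences. It is not the procedure used to produce the table. The residue groupings that define "similar" and the choice of denominator (all columns except those where both sequences have a gap) are assumptions made here for illustration. The two input strings are the short TFIIB fragments shown in Figure 2-13, so the output will not match the full-length values in the table.

```python
# Hypothetical residue groups used to call two different residues "similar".
SIMILAR_GROUPS = ["AVLIMC", "FWY", "KRH", "DE", "NQ", "ST", "G", "P"]
GROUP_OF = {aa: i for i, group in enumerate(SIMILAR_GROUPS) for aa in group}


def percent_similarity_identity(a: str, b: str) -> tuple[float, float]:
    """Return (similarity %, identity %) for two equal-length aligned sequences.

    Columns where both sequences have a gap are ignored; columns with a gap
    in only one sequence count toward the denominator as mismatches.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    compared = identical = similar = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == "-" or y == "-":
            continue
        if x == y:
            identical += 1
            similar += 1
        elif x in GROUP_OF and GROUP_OF[x] == GROUP_OF.get(y):
            similar += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * similar / compared, 100.0 * identical / compared


if __name__ == "__main__":
    tomato = "IKVVQETVQK-AEEFDIR"       # Lycopersicon esculentum TFIIB fragment
    arabidopsis = "VKAAQEAVQK-SEEFDIR"  # Arabidopsis thaliana TFIIB1 fragment
    sim, ident = percent_similarity_identity(tomato, arabidopsis)
    print(f"similarity {sim:.1f}%, identity {ident:.1f}%")
```

Changing the grouping scheme or the gap convention changes the similarity figure, which is one reason such values are only comparable when computed with the same settings.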

PAGE 88

Figure 2-1: Unrooted phylogram of TFIIA small subunit proteins from plants, humans, fruit flies, and yeast. Bootstrap percentage support values are shown on branches. Scale bar: 1 amino acid change.

PAGE 89

Figure 2-2: Unrooted phylogram of TFIIA large subunit proteins from plants, humans, fruit flies, and yeast. Bootstrap percentage support values are shown on branches. Scale bar: 10 amino acid changes.

PAGE 90

Figure 2-3: Unrooted phylogram of TFIIB-related proteins from plants, humans, fruit flies, yeast, and Archaea. Bootstrap percentage support values are shown on branches. Archetypical TFIIB proteins (Class A) are enclosed in blue boxes, PolIII-associated TFIIIB-related factors (Class B) are in a red box, the plastid-associated TFIIBs (Class C) are in a green box, Class D proteins are in a gold box, and Class E proteins are in an orange box. Scale bar: 10 amino acid changes.

PAGE 91

Figure 2-4: Unrooted phylogram of TBP-related proteins from plants, humans, fruit flies, yeast, and Archaea. Bootstrap percentage support values are shown on branches. Scale bar: 10 amino acid changes.

PAGE 92

Figure 2-5: Unrooted phylogram of TAF6-related proteins from plants, humans, fruit flies, and yeast. Bootstrap percentage support values are shown on branches. Dicot TAF6 proteins are grouped in a green box, monocot proteins are grouped in a gold box. Scale bar: 10 amino acid changes.

PAGE 93

Figure 2-6: Unrooted phylogram of TAF9-related proteins from plants, humans, fruit flies, and yeast. Bootstrap percentage support values are shown on branches. Dicot TAF9 proteins are grouped in a green box, monocot proteins are grouped in a gold box. Scale bar: 10 amino acid changes.

PAGE 94

Figure 2-7: Unrooted phylogram of TAF10-related proteins from plants, humans, fruit flies, and yeast. Bootstrap percentage support values are shown on branches. Dicot TAF10 proteins are grouped in a green box, monocot proteins are grouped in a gold box. Scale bar: 10 amino acid changes.

PAGE 95

Figure 2-8: Unrooted phylogram of TAF11-related proteins from plants, humans, fruit flies, and yeast. Bootstrap percentage support values are shown on branches. Dicot TAF11 proteins are grouped in a green box, monocot proteins are grouped in a gold box. Scale bar: 10 amino acid changes.

PAGE 96

Figure 2-9: Unrooted phylogram of TFIIEα-related proteins from plants, humans, fruit flies, yeast, and Archaea. Bootstrap percentage support values are shown on branches. Dicot TFIIEα proteins are grouped in a green box. Scale bar: 10 amino acid changes.

PAGE 97

Figure 2-10: Unrooted phylogram of TFIIEβ-related proteins from plants, humans, fruit flies, and yeast. Bootstrap percentage support values are shown on branches. Monocot TFIIEβ proteins are grouped in a gold box. Scale bar: 10 amino acid changes.

PAGE 98

Figure 2-11: Unrooted phylogram of TFIIFα-related proteins from plants, humans, fruit flies, and yeast. Bootstrap percentage support values are shown on branches. Scale bar: 10 amino acid changes.

PAGE 99

Figure 2-12: Unrooted phylogram of TFIIFβ-related proteins from plants, humans, fruit flies, and yeast. Bootstrap percentage support values are shown on branches. Dicot TFIIFβ proteins are grouped in green boxes, monocot proteins are grouped in a gold box. Scale bar: 10 amino acid changes.

PAGE 100

Arabidopsis_thaliana_BRF1_At2g45100          VLTATHIIASMKRDWMQT
Arabidopsis_thaliana_BRF2_At3g09360          VATARDIIASMKRDWIQT
Arabidopsis_thaliana_BRF3_At2g01280          ANTAKNIISSMKRDWIQT
Drosophila_melanogaster_BRF_AAF72065         SMTALRIVQRMKKDCMHS
Homo_sapiens_BRF_NP_001510.2                 SMTALRLLQRMKRDWMHT
Saccharomyces_cerevisiae_BRF_NP_011762.1     VKDAVKLAQRMSKDWMFE
Populus_balsamifera_TFIIB7/pBrp              QELATHIGEVVINKCFCT
Arabidopsis_thaliana_TFIIB5_At4g36650        QELATHIGEVVINKCFCT
Lycopersicon_esculentum_AAG01118             QELATHIGEVIINKCFCT
Populus_balsamifera_TFIIB6                   ------------------
Populus_balsamifera_TFIIB4                   LMKV--------------
Populus_balsamifera_TFIIB5                   IK----------------
Oryza_sativa_TFIIB1_AF464908                 VKAAQEAVQR-SEELDIR
Triticum_aestivum_TC68795                    VKAAQEAVQR-SEELDIR
Mesembryanthemum_crystallinum_TFIIB_TC5895   MKAAQEAVQK-SEEIDIR
Vitis_vinifera_TFIIB_TC19782                 VKAAQEAVQK-SEEFDIR
Populus_balsamifera_TFIIB1                   VKAATEAVKT-SEQFDIR
Citrus_sinensis_TFIIB_CB292941               VKAAQEAVQK-SEEFDIR
Glycine_max_TFIIB_U31097                     VKAAQEAVQK-SEEFDIR
Medicago_truncatula_TC86832                  VKAAQESVQK-SEEFDIR
Arabidopsis_thaliana_TFIIB2_At3g10330        VKAAQESVQK-SEEFDIR
Arabidopsis_thaliana_TFIIB1_At2g41630        VKAAQEAVQK-SEEFDIR
Lycopersicon_esculentum_TFIIB_TC124975       IKVVQETVQK-AEEFDIR
Solanum_tuberosum_TFIIB1_TC58701             IKVVQETVQK-AEEFDIR
Populus_balsamifera_TFIIB3                   VKAATEAVKT-SEQFDIR
Populus_balsamifera_TFIIB8                   -----------ELKRDG-
Arabidopsis_thaliana_TFIIB4_At3g57370        VEAALEAAESYDYMTNGR
Populus_balsamifera_TFIIB2                   VKAVHEAVEK-IQDVDIR
Oryza_sativa_TFIIB2_AAN59779                 VREAQRAAQTLEDKLDVR
Drosophila_melanogaster_TFIIB_NM_057540      QRAATHIAKK-AVEMDIV
Homo_sapiens_TFIIB_NM_001514                 QMAATHIARK-AVELDLV
Arabidopsis_thaliana_TFIIB3_At3g29380        IMAIPEAVEK-AENFDIR
Arabidopsis_thaliana_TFIIB6_At4g10680        ------------------
Saccharomyces_cerevisiae_TFIIB_M81380        TTSAEYTAKKCKEIKEIA
Methanosarcina_acetivorans_TFB_NP_615574.1   QSKSVEILRQ-ASEKELT
Sulfolobus_solfataricus_TFB_AAK40772.1       MKTAAEIIDK-AKGSGLT
Populus_balsamifera_TFIIB9                   -NPDGDLIQGFEIIETMA
Arabidopsis_thaliana_BRF4_At4g35540          VRTDGFCVEDLVMDCLSK
Lycopersicon_esculentum_TFIIB_AF273333       DIISLNVLANTHSNTMQI

Figure 2-13. Multiple sequence alignment of the TFIIB region containing the conserved lysine residue that is acetylated in human and yeast TFIIB (in green). The poplar TFIIB1 and TFIIB3 predicted amino acid sequences have a lysine one amino acid off register (in blue) that might be autoacetylated.
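
The register comparison described in the Figure 2-13 caption can be sketched as a lookup of lysine (K) at, or next to, a reference alignment column. In the sketch below, the reference column (the tenth column of the 18-column block, index 9 when counting from zero) was chosen by inspection of the human and yeast rows and is an assumption of this example. The helper name and the one-column search window are also illustrative.

```python
from typing import Optional

# A few rows from the Figure 2-13 alignment block (18 columns each).
ALIGNED = {
    "Homo_sapiens_TFIIB": "QMAATHIARK-AVELDLV",
    "Saccharomyces_cerevisiae_TFIIB": "TTSAEYTAKKCKEIKEIA",
    "Arabidopsis_thaliana_TFIIB1": "VKAAQEAVQK-SEEFDIR",
    "Populus_balsamifera_TFIIB1": "VKAATEAVKT-SEQFDIR",
    "Oryza_sativa_TFIIB1": "VKAAQEAVQR-SEELDIR",
}

REFERENCE_COLUMN = 9  # zero-based; assumed position of the acetylated lysine


def lysine_offset(seq: str, column: int = REFERENCE_COLUMN, window: int = 1) -> Optional[int]:
    """Offset of the nearest lysine to the reference column.

    Returns 0 if K is in register, a signed offset if K lies within
    `window` columns, or None if no lysine is found in that range.
    """
    if seq[column] == "K":
        return 0
    for distance in range(1, window + 1):
        for offset in (-distance, distance):
            pos = column + offset
            if 0 <= pos < len(seq) and seq[pos] == "K":
                return offset
    return None


if __name__ == "__main__":
    for name, seq in ALIGNED.items():
        print(f"{name}: {lysine_offset(seq)}")
```

With these rows, the human, yeast, and Arabidopsis sequences report an offset of 0, the poplar TFIIB1 sequence reports -1 (one column off register), and the rice TFIIB1 fragment reports no lysine within one column of the reference position.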

PAGE 101

Figure 2-14: Exon-intron diagrams of Arabidopsis TAF6 and TAF6b alternative splicing forms. Exons are depicted as orange boxes; introns are depicted as a black line. Blue boxes show exons that are modified in different clones due to differential splicing. A. TAF6 and B. TAF6b. Nucleotide annotations (AUG codon and stop codon positions) are relative to the genomic sequence. TAF6b forms are, from top to bottom, TAF6b-1, TAF6b-4, TAF6b-2, and TAF6b-3.

PAGE 102

CHAPTER 3
BINARY PROTEIN-PROTEIN INTERACTIONS OF THE Arabidopsis thaliana GENERAL TRANSCRIPTION FACTOR IID

Introduction

The existence of the two Arabidopsis thaliana TBP proteins has been accepted for many years (Gasch et al., 1990; Nikolov et al., 1992; Chasman et al., 1993; Heard et al., 1993; Kim et al., 1993a; Nikolov and Burley, 1994; Rowlands et al., 1994); however, there has been very little examination of other GTFs in plants. In fact, there has been no publication of a TBP-associated factor (TAF) from plants having been cloned and biochemically verified as a TAF. With the advent of genome sequencing, gene annotation has often been performed based on homology alone, and with sufficient similarity, gene function is often assumed. TAFs, however, despite frequent sequence similarity, are defined by their association with TBP. If a protein is not found in TFIID (with TBP), it technically is not a TAF despite sequence similarity with bona fide TAFs. There are examples of human and Drosophila proteins that are not part of TFIID yet are very similar in sequence to TAFs and are designated "TAF-like" (Tora, 2002). Homo sapiens has TAF5L, TAF6L, TAF7L, TAF9L, and TAF11L (Tora, 2002); TAF5L and TAF6L are both found in the p300/CBP-associated factor (PCAF) complex (PAF65β and PAF65α, respectively). Without a direct demonstration of integration into the TFIID complex, the next best experiment is to test protein-protein interactions between putative TAF proteins and TBP. The yeast two-hybrid system was chosen to undertake a comprehensive analysis of the Arabidopsis TFIID protein-protein interactions.

PAGE 103

The Stargell laboratory has recently conducted a similar analysis with the proteins of yeast TFIID (Yatherajam et al., 2003). This group found that 17% of the potential interactions tested resulted in a strong to intermediate growth response, and another 8% demonstrated a potential weak interaction (Yatherajam et al., 2003). Significantly, this study was unable to reproduce some interactions that were previously characterized by other experimental methods. These unconfirmed interactions included TBP-TAF1, TAF1-TAF2, and TBP-TAF12, among others (Yatherajam et al., 2003). In fact, the entire complement of TBP-TAF interactions was not reproduced, with the exception of TBP-TAF7 (Yatherajam et al., 2003). Yatherajam et al. (2003) suggested either that this observation was the result of less extensive TBP-TAF interactions than previously thought or that some necessary TAF truncations removed TBP-interaction domains.

Arabidopsis homologs of all TAFs, other than TAF3, have been identified (Chapter 2, Table 2-1). These putative TAFs, as well as TBP1 and TBP2, were cloned into the Gateway cloning system (Invitrogen, USA) and further into MATCHMAKER III yeast two-hybrid vectors (Clontech, USA). These clones were transformed into yeast and protein-protein interaction tests were performed. Of the 720 combinations tested, 102 (14.2%) were positive.

Materials and Methods

Total RNA was extracted from A. thaliana ecotype Columbia suspension cells (kindly provided by Robert Ferl, University of Florida) using Plant RNeasy (Qiagen, USA) with the additional on-column DNase treatment (Qiagen, USA). RNA extraction was performed according to the manufacturer's protocols. First strand cDNA synthesis

PAGE 104

was performed with 1 µg of total Arabidopsis RNA with Superscript II reverse transcriptase (Invitrogen) following the manufacturer's protocol.

Primers compatible with the pENTR/D-Topo vector (Invitrogen) were designed to amplify the coding sequences (CDS) of the identified proteins (Table 3-1). Primer designs targeted a melting temperature of 65 °C with a G or C 3' end nucleotide preceded by an A or T. Primers were ordered at the 10 nmole synthesis level with standard preparation from Invitrogen. Primers were resuspended at 20 pmol/µl in TE buffer (10 mM Tris, 1 mM EDTA, pH 8.0). Blunt-end polymerase chain reaction (PCR) products were created using the high-fidelity enzyme Platinum Pfx (Invitrogen), 1x PCR Optimizer (included with Platinum Pfx), Superscript II product derived from 30 ng of RNA, 35 amplification cycles, and an annealing temperature of 52 °C. Elongation times varied by template, using 1 min/kb + ~15 s. The PCR program ended with an 8 min elongation step followed by a hold at 4 °C. TAF12b was subcloned from the Arabidopsis Biological Resource Center (ABRC) clone U11077 because PCR product was not obtained from the cDNA preparation. All PCR reactions were carried out using an Eppendorf Mastercycler Personal thermocycler (Brinkmann).

The Gateway system (Invitrogen, USA) was selected for rapid cloning and subcloning of cDNAs through recombination into appropriately modified vectors. As such, PCR products were cloned into pENTR/D-Topo (Invitrogen, USA) according to the manufacturer's instructions. The directional Topo cloning reaction products were transformed into chemically competent One Shot TOP10 Escherichia coli cells (Invitrogen, USA). TAF1, TAF2, and TAF4 were cloned by BP reactions (Invitrogen, USA) into pDONR207 (Invitrogen, USA) from PCR products generated using primers with attB sites. The

PAGE 105

resultant colonies were screened by PCR using a combination of coding region (CDS)-specific and vector-specific primers. Positive clones were verified by sequencing at the University of Florida Microbiology and Cell Science Sequencing Core or Macrogen, Inc. (Seoul, Korea).

Yeast two-hybrid vectors pGAD-T7 and pGBK-T7 (Clontech, USA) were converted to be Gateway compatible by the insertion of the Gateway reading frame A cassette by blunt-end ligation into the digested SmaI sites. The constructs were verified by DNA sequencing by Macrogen, Inc. Verified pENTR/D-Topo constructs were recombined (Gateway LR and BP reactions, Invitrogen) successively through pGAD-T7Gateway, pDONR207, and pGBK-T7Gateway. These plasmid clones were transformed into electrocompetent MACH1 E. coli (Invitrogen, USA). Clones in pGAD-T7Gateway and pGBK-T7Gateway were confirmed by digestion with BamHI. Clones in pDONR207 were confirmed by digestion with BamHI and EcoRV. The Gateway recombination system is dependent on the proper function of a negative selection gene (ccdB), which is normally replaced by the DNA fragment being cloned. However, if the ccdB gene becomes mutated, the un-recombined vector can grow on the selective media. If this is the case, BamHI digestion of pGAD-T7Gateway and pGBK-T7Gateway results in the production of a 600 bp and a 700 bp band (in addition to a ~8 kb vector band). In a similar case with pDONR207, BamHI and EcoRV digestion results in the formation of 3.9 kb and 2.0 kb bands.

The resultant yeast two-hybrid bait (containing the Gal4 DNA binding domain) and prey (containing the Gal4 activation domain) constructs were transformed into MaV204K (MAT a, trp1-901 leu2-3,112 his3Δ200 ade2-101::kanMX gal4Δ gal80Δ SPAL10::URA3

PAGE 106

UASGAL1::HIS3 GAL1::lacZ; a kind gift of Dr. T. Ito, Kanazawa University, Japan) (Ito et al., 2000) and AH109 (MATa trp1-901 leu2-3,112 ura3-52 his3-200 gal4Δ gal80Δ LYS2::GAL1UAS-GAL1TATA-HIS3 MEL1 GAL2UAS-GAL2TATA-ADE2 URA3::MEL1UAS-MEL1TATA-lacZ; Clontech) yeast strains, respectively. All transformations of vector plasmids were conducted using Frozen-EZ Yeast Transformation II (Zymo Research, USA). Bait constructs in MaV204K were plated on SD -Trp, +0.1% 5’ fluoroorotic acid (5’FOA) to check for spurious activation of the SPAL10::URA3 reporter. In the event that the bait protein contains an activation domain, the URA3 gene would be activated by the recruitment of the bait protein to the SPAL10 promoter, which contains the Gal4 upstream activation site (UASGal4). The URA3 gene product catalyzes the formation of a toxic product from 5’FOA, preventing yeast growth; lack of growth therefore indicates an unsuitable bait construct for yeast two-hybrid analysis.

Western blots were performed to examine bait and prey protein expression levels. Protein was harvested and prepared using a modification of the method developed by Horvath and Riezman (1994). Bait and prey transformants were grown in their appropriate SD medium to an OD600 between 0.65 and 0.9. Culture volumes were linearly adjusted to normalize harvested cultures to 1.5 mL of culture at 0.7 OD600. Cultures were harvested by centrifugation at 16,000 x g for 1 min, washed with 1 mL of deionized H2O, and collected again by centrifugation. The resulting pellet was resuspended in 100 µl of reducing 2x Laemmli loading buffer (Laemmli, 1970) containing 5 mM EDTA and 2x Halt Protease Inhibitor Cocktail (Pierce, USA), placed in a boiling water bath for 5 min, and separated on a 10% SDS-PAGE gel. The separated proteins were transferred to Immobilon-P PVDF membranes (Millipore, USA) in a BIO-

PAGE 107

RAD TRANS-BLOT SD semi-dry transfer cell at 75 mA per minigel for 35 min. Blots were blocked in Tris-buffered saline (pH 7.4) containing 0.2% Tween-20 (TTBS) and 3% non-fat dried milk for 1 h. Blots containing bait proteins were probed overnight with anti-Myc-tag 9B11 monoclonal antibody (Cell Signaling Technology, USA) diluted 1:800 in TTBS with 1% non-fat dried milk. Blots containing prey proteins were probed overnight with anti-hemagglutinin-epitope HA.11 monoclonal antibody (Covance, USA) diluted 1:500 in TTBS with 1% non-fat dried milk. All blots were probed with goat anti-mouse horseradish peroxidase (HRP) conjugated Immunopure Antibody (Pierce) as a secondary antibody for 1 h. Four washes of 5 min each were performed in TTBS following each incubation with antibody. HRP activity was detected after incubation of blots with ECL+ chemiluminescent substrate (Amersham, UK) using X-ray film (RPI, USA).

Bait/MaV204K and prey/AH109 transformants were then mated with one another and selected on SD -Trp, -Leu (to test for mating efficiency) and SD -Trp, -Leu, -His, -Ade (to test for interaction). Yeast matings were performed similarly to the protocol in the Yeast Protocols Handbook (Clontech). Several fresh, large (2-3 mm diameter) colonies were resuspended in 1 mL of YPD medium and vortexed vigorously for 30 s to disrupt clumps of cells. Into sterile 96-well plates containing 180 µL of YPD in each well, 10 µL each of bait/MaV204K and prey/AH109 suspensions were added. The 96-well plates were incubated at 30 °C overnight with shaking at 200 rpm. Using a 30 µL 8-channel pipette (Matrix Technologies, USA), 10 µL of each overnight mating culture was removed from each well and serially diluted 10-fold three successive times in sterile 96-well plates containing 90 µL of YPD in each well. Each

PAGE 108

dilution was mixed by three 30 µL aspiration/blow-out cycles before removing an aliquot for the next dilution. From each dilution, 3 µL of cell suspension was removed and spotted in a grid on both an SD -Trp, -Leu and an SD -Trp, -Leu, -His, -Ade 25 cm x 25 cm screening plate. The plates were incubated at 30 °C and spots were monitored for growth over 14 days.

β-galactosidase assays were performed to obtain semi-quantitative data regarding the strength of individual interactions. The activation of the GAL1::lacZ reporter is driven by the reconstituted Gal4 protein in a manner that correlates with the strength of interaction and activation potential of the two fusion proteins. β-galactosidase assays of colonies exhibiting growth were performed using CPRG substrate (Roche, Germany) according to the Clontech Yeast Protocols Handbook. All values were expressed as Miller units (Miller, 1972, 1992). The normalized activity (NAct) for a positive β-galactosidase test was determined by the following equation: NAct = AVGt - SDt - 1.1 × AVGc - 1.1 × SDc. In this equation, AVGt is the average of the test activities, SDt is the standard deviation of the test activities, AVGc is the average of the activities of appropriate bait negative controls, and SDc is the standard deviation of the activities of appropriate bait negative controls. If the NAct was determined to be greater than zero, the β-galactosidase assay was considered positive. If the NAct was equal to or greater than one, the interaction was deemed strong. Thus, for a positive result, the average test activity minus its standard deviation must exceed 110% of the sum of the average negative control activity and its standard deviation.
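
The NAct calculation and its thresholds can be sketched as follows. This is a minimal illustration of the equation given above, not the script used in the study. The text does not state whether a sample or population standard deviation was used, so the sample standard deviation (`statistics.stdev`) is an assumption here. The function names and the example Miller unit values are hypothetical.

```python
from statistics import mean, stdev


def normalized_activity(test_units, control_units, margin=1.1):
    """NAct = AVGt - SDt - margin * AVGc - margin * SDc (Miller units).

    Uses the sample standard deviation, which is an assumption; each
    input needs at least two replicate measurements.
    """
    avg_t, sd_t = mean(test_units), stdev(test_units)
    avg_c, sd_c = mean(control_units), stdev(control_units)
    return (avg_t - sd_t) - margin * (avg_c + sd_c)


def classify(nact):
    """Apply the thresholds from the text: >0 positive, >=1 strong."""
    if nact >= 1:
        return "strong"
    if nact > 0:
        return "positive"
    return "negative"


if __name__ == "__main__":
    # Hypothetical replicate activities, for illustration only.
    test = [4.0, 5.0, 6.0]
    control = [1.0, 1.0, 1.0]
    nact = normalized_activity(test, control)
    print(f"NAct = {nact:.2f} ({classify(nact)})")
```

Subtracting the test standard deviation and inflating the control by 10% plus its own standard deviation makes the criterion conservative: a noisy test measurement or a noisy control both lower NAct.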

PAGE 109

Results

The TAF7, TAF12, and TAF15 bait constructs were spurious activators, as defined by the lack of growth on 5’FOA-containing plates. These bait constructs were thus excluded from further studies. It was also found that the TAF12 prey construct caused the activation of the reporter genes, and thus TAF12 was cloned as N-terminal (aa 1-200), middle (aa 201-394), and C-terminal (aa 395-539) fragments into the pENTR/D-Topo vector and subcloned into the bait and prey vectors as detailed above (Table 3-2). Histograms of the percentage of matings that formed colonies for each bait and prey construct were created (Figures 3-1 and 3-2). These figures exclude the full-length TAF12 to avoid redundancy (TAF12 is represented as the separate peptide fragments consisting of amino acids 1-200, 201-394, and 395-539). The bait proteins that interacted with the Gal4 activation domain (Gal4 AD) alone (TAF1 #6, N-terminal one-third of TAF1; TAF12 amino acids 1-200; TAF12 amino acids 395-539; and TAF15b) formed colonies with a frequency of over 70% of their matings. These baits were the only constructs to have colony frequencies above 70%, confirming the assertion that only baits that were not observed to interact with the Gal4 AD were suitable for analysis.

Immunoblots to detect bait and prey proteins (Figures 3-3 and 3-4, respectively) demonstrated a large variability in protein expression levels. The majority of bait proteins were at steady-state levels that were below the level of detection. However, interaction data suggest that these proteins are present in the yeast cells.

The results of the targeted two-hybrid screens are presented in Table 3-4 and depicted pictorially in Figure 3-6. Of the combinations in Table 3-4, 552 were reciprocated, meaning that bait X was tested with prey Y and bait Y was tested with prey X. Only TAF1 #8 (middle region; MR) was lacking in interactions with other TFIID components, but it


does interact with the small subunit of TFIIF (Chapter 4).

The results of the β-galactosidase assays are shown in Figure 3-5. The β-galactosidase normalized activities were used to verify or exclude protein-protein interactions based on colony growth. NAct values below zero were considered negative interactions, while NAct values above one were considered evidence of strong interactions. The criterion for a positive interaction was that greater than 50% of the interaction evidence must be positive. Thus, if an interaction is not supported by growth of diploid yeast containing reciprocated constructs, it must be supported by a NAct value greater than zero to be considered positive. If an interaction is supported by growth of diploid yeast containing reciprocated constructs, only one of the two tested β-galactosidase activities must have a NAct greater than zero for the interaction to be considered positive. Strong interactions (NAct ≥ 1) are depicted in Figure 3-7.

Of the 720 total combinations tested, which do not include negative controls, 102 (14.2%) were positive. Thirty (4.2%) of the total combinations grew in yeast but were determined to be negative based on NAct values. However, of these, five were reciprocated interactions that were verified by the NAct values of their reciprocal constructs in the β-galactosidase assays. The total number of protein-protein interactions that were tested regardless of bait or prey conformation was 444 (the non-redundant interaction set), leaving 276 protein-protein interactions that were reciprocally tested. Of the 444 non-redundant interactions tested, 72 (16.2%) formed colonies that were verified and 25 (5.6%) produced colonies that were not verified by NAct values.
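The rule for calling an interaction positive can be expressed as a small decision function. The sketch below is one possible reading of the criteria, not the procedure actually used: the function name and arguments are hypothetical, and it assumes that when growth occurred in only one orientation, the NAct that must exceed zero is the one measured for that same orientation.

```python
def is_positive(growth_xy, growth_yx, nact_xy=None, nact_yx=None):
    """Decide whether a protein pair X/Y is scored as interacting.

    growth_xy: colony growth with X as bait and Y as prey
    growth_yx: colony growth with Y as bait and X as prey
    nact_xy, nact_yx: NAct for each orientation (None if not assayed)
    """
    xy_verified = nact_xy is not None and nact_xy > 0
    yx_verified = nact_yx is not None and nact_yx > 0

    if growth_xy and growth_yx:
        # Reciprocated growth: one verified NAct is enough.
        return xy_verified or yx_verified
    if growth_xy:
        # Growth in one orientation only: that orientation must verify.
        return xy_verified
    if growth_yx:
        return yx_verified
    return False


# Hypothetical examples
print(is_positive(True, True, nact_xy=-0.2, nact_yx=0.4))   # reciprocated
print(is_positive(True, False, nact_xy=-0.2))               # not verified
```

Under this reading, reciprocated growth already contributes two pieces of positive evidence, so a single verified NAct brings the total above 50%.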


Discussion

Using homology-based searches of the A. thaliana genomic sequence database, 23 loci encoding putative TFIID subunits have been identified. Of these, 30 putative TFIID subunit coding sequences (including 4 splice variants of TAF6b and fragments of TAF1 and TAF12) have been cloned into the MATCHMAKER III yeast two-hybrid system (Clontech). Interestingly, full-length TAF12 was found to act as an activator of the reporter genes in both the bait and prey constructs (data not shown), and has therefore been cloned as three separate sub-fractions (amino acids 1-200, 201-394, and 395-539). Surprisingly, none of these TAF12 sub-fractions interfere or act as spurious activators in bait or prey forms (all baits passed the 5’FOA test). However, the amino acid 1-200 and 395-538 fragments interact with all prey constructs tested, indicating an interaction with the Gal4 AD.

The majority of interactions identified in this study have been described for homologs in other systems. For instance, TAF10 was shown to interact with TAF4/TAF4b, TAF6b (splice variants 2 and 3), TAF8, TAF9, TAF10 (dimer formation), TAF11, TAF12b, and TAF13. All of these interactions have been described previously for homologs in other systems (Chapter 1). Unique interactions have also been described, such as TAF5 and TAF8 homodimers and interactions of TAF8 with TAF13 and TAF14.

TAF8 and TAF10 are an HFD binding pair and interact strongly in this study. Both proteins independently form dimers, suggesting formation of an α2β2 tetramer. Consistent with their tight interaction are a number of shared interactions with other TAFs (TAF4, TAF4b, TAF12b, and TAF13).


As with any system, a number of false positives and false negatives can occur. False positives were minimized with the use of β-galactosidase activity measurements and analysis of bait constructs (5’FOA tests and positive interaction frequency analyses). Of the 444 non-redundant interactions tested to date (via 720 matings), 16.2% (72) have resulted in confirmed interactions. This rate is in line with the observations of Yatherajam et al. (2003) in yeast TFIID and suggests that molecular bridging of TBP-TBP, TBP-TAF, and TAF-TAF interactions is not being mediated by yeast GTF components in this assay. If molecular bridging were problematic, interactions between two proteins that have a known common interactor would be expected. For example, TAF4 interacts strongly with both TAF1 #9 (C-terminal one-third) and TAF12 395-538. If molecular bridging of TAF1 #9 with TAF12 395-538 occurred through yeast TAF4, then a TAF1 #9-TAF12 395-538 interaction would have been observed.

TBP1 and TBP2 notably lacked interaction with any TAFs besides TAF1 #6 (N-terminus). This interaction could potentially be mediated by the TAND domain of TAF1, which in other systems interacts with the concave surface of TBP. TBP1 and TBP2 prey constructs interacted with both the TAF1 bromodomain and TAF10 in yeast growth studies. However, β-galactosidase NAct values did not verify these interactions. TBP1 and TBP2 could potentially dampen the Gal4 AD fusion activity by intermolecularly or intramolecularly masking the activation domain, leading to false negatives in both yeast growth studies and β-galactosidase assays. Yatherajam et al. (2003) observed that yeast TBP only interacted with TAF7 in yeast two-hybrid assays. Arabidopsis TAF7 failed the 5’FOA test as a bait protein; thus, only the prey protein construct was analyzed further.
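The bridging argument made earlier on this page amounts to a check on the interaction graph: for every protein with two or more partners, ask whether those partners were also scored as interacting with each other. A minimal Python sketch of that check follows; the function names are hypothetical and the input holds only the TAF4 example from the text, not the full data set.

```python
from itertools import combinations


def build_partner_map(interactions):
    """Map each protein to the set of proteins it was observed to bind."""
    partners = {}
    for a, b in interactions:
        partners.setdefault(a, set()).add(b)
        partners.setdefault(b, set()).add(a)
    return partners


def unbridged_pairs(interactions):
    """Return (protein_1, protein_2, shared_partner) triples in which the
    two proteins share a partner but were NOT observed to interact."""
    partners = build_partner_map(interactions)
    observed = {frozenset(pair) for pair in interactions}
    found = set()
    for hub, neighbours in partners.items():
        for a, c in combinations(sorted(neighbours - {hub}), 2):
            if frozenset((a, c)) not in observed:
                found.add((a, c, hub))
    return sorted(found)


# The example from the text: TAF4 binds both fragments, but the two
# fragments were not observed to bind each other.
observed_interactions = [
    ("TAF4", "TAF1 #9"),
    ("TAF4", "TAF12 395-538"),
]
print(unbridged_pairs(observed_interactions))
```

Each triple returned is a case where bridging through the shared partner did not produce a signal; many such cases argue against widespread bridging, although they cannot exclude it for any single pair.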


Although this prevented testing a bait-TAF7/prey-TBP interaction, bait-TBP did interact with TAF1 #6, indicating viability in some cases.

AtTAF2, as in the study by Yatherajam et al. (2003), interacted with only one other TFIID component. In this study, TAF2 interacted with the N-terminus of TAF1, although in the yeast study it interacted with TAF4. TAF2 in metazoans is known to have sequence-specific DNA interactions with the Initiator element. Neither yeast nor plants have been shown to contain an initiator consensus sequence in their promoters. The role of TAF2 in these organisms is therefore unknown, and yeast two-hybrid assays show limited structural interactions with other TFIID components.

Despite the pitfalls listed above, a number of plant interactions unique among eukaryotes have been elucidated. The strongest unique interaction appears to be TAF1 #9-TAF1 #9 (a dimer of the C-terminal one-third of TAF1). No previous study has suggested that TAF1 may form dimers. This interaction may be an artifact of incorrect protein folding due to the examination of a protein fragment or expression in a heterologous system.

Formation of a TAF1 dimer in Arabidopsis is an attractive model. Both Arabidopsis TAF1 and TAF1b encode only one bromodomain. Bromodomains are known to bind acetylated lysine residues on histone tails; however, single bromodomains have approximately 70-fold lower affinity for acetylated lysines than do double bromodomains (Dhalluin et al., 1999; Jacobson et al., 2000). Human and Drosophila TAF1 proteins both contain two tandem bromodomains, and while yeast TAF1 has no bromodomains, a novel TFIID member in yeast encodes two bromodomains (Matangkasombut et al., 2000). The Arabidopsis genome does not appear to encode such


a novel bromodomain-containing protein; therefore, the presence of a TAF1 dimer could compensate for the single bromodomain.

Arabidopsis is the first organism to be reported to encode two TAF1 homologs (Pandey et al., 2002). In silico analyses suggest that TAF1b does not encode an N-terminal domain (TAND; W.B. Gurley and E. Czarnecka-Verner, unpublished data). The TAND domains have been shown to be auto-inhibitors of TFIID binding to the TATA box (Kokubo et al., 1993b), and there are no known examples of TAF1 proteins lacking the TAND domains outside of Arabidopsis. A TAF1-TAF1b heterodimer would leave TFIID with two bromodomains and a single complement of TAND inhibitory domains. This is one possible model to explain the lack of the TAND domains in TAF1b and the single bromodomains in both TAF1 and TAF1b.

Three other novel strong interactions in Arabidopsis TFIID are the TAF12 1-200 with TAF13, TAF12b-TAF15b, and TAF14-TAF15b pairs. The novel finding that TAF12 homologs and TAF13 interact in Arabidopsis may suggest a unique structure of this TFIID complex. TAF13 is a histone-fold-containing protein, as are the TAF12 homologs. However, the HFDs of the TAF12 homologs are in the C-terminus, not the N-terminus that interacts strongly with TAF13. Although TAF13 does interact with the HFD-containing amino acid 395-538 fragment of TAF12, the much stronger interaction with the amino acid 1-200 fragment suggests that the primary interaction is not through the HFDs of these proteins.

The strong TAF12b-TAF15b interaction is also unique. Neither TAF12 nor TAF12b was found to interact with TAF15, nor did TAF12 interact with TAF15b. Like the TAF12-TAF13 interaction, this may also suggest a novel structure of plant TFIID.


Although both TAF15 and TAF15b were observed to interact with TAF4 and TAF4b (histone-fold binding partners for TAF12/TAF12b), the selectivity of TAF12b and TAF15b for each other could have two possible ramifications. The histone-fold core TAFs (TAF4, TAF6, TAF9, and TAF12) are suggested to be in TFIID as an octamer structure similar to the nucleosome, although there are discrepancies between this model and the data. One such discrepancy is the known dimerization of the H2B-like TAF12 proteins, since the two histone H2B polypeptides lie on opposite sides of the nucleosome. The likely presence of two TAF12-like proteins in Arabidopsis TFIID offers three possible configurations. TAF12-TAF12, TAF12-TAF12b, and TAF12b-TAF12b dimers are all consistent with the interaction data presented here. Therefore, TAF15b could be interacting with 0, 1, or 2 copies of TAF12b. Furthermore, TAF15b may or may not be present in an Arabidopsis TFIID complex, depending on the presence of TAF12b. However, TAF15b does have a number of other weak interactions with other TAFs which may cooperatively incorporate this protein into the complex.

Interestingly, the protein designated as TAF15 appears to be less tightly connected to TFIID in general. This protein was only shown to interact weakly with both TAF4 and TAF4b. However, analysis of interactions with several TFIID components was restricted by the activating nature of several constructs. Potentially, TAF15 may interact strongly with TAF12 (either the N-terminus or the C-terminus), interactions that could not be tested in this system. Even if these TAF15 proteins are not present in TFIID, they may still be recruited to the PIC through an interaction with PolII, as is seen for their human homologs (Bertolotti et al., 1998).
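The three TAF12-type dimer configurations discussed above, and the resulting 0, 1, or 2 copies of TAF12b available to TAF15b, are the unordered pairs that can be drawn with repetition from two paralogs. The short Python sketch below enumerates them; the function names are hypothetical and the code illustrates only the counting, not any structural model.

```python
from itertools import combinations_with_replacement


def dimer_configurations(paralogs):
    """All unordered dimers that can be formed from the given paralogs."""
    return list(combinations_with_replacement(paralogs, 2))


def copies_in_dimer(dimer, subunit):
    """Number of copies of one subunit within a dimer."""
    return dimer.count(subunit)


for dimer in dimer_configurations(["TAF12", "TAF12b"]):
    print("-".join(dimer), "TAF12b copies:", copies_in_dimer(dimer, "TAF12b"))
```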


TAF15b also interacts strongly with TAF14. Although no organism has previously been reported to contain both TAF14 and TAF15, Arabidopsis contains two homologs of each. TAF14 and TAF15 both interact with a number of the same partners, including TAF4b, TAF5, and TAF12b. Their common interactions and strong pairing suggest that TAF14 and TAF15b are likely to be localized to a lobe of TFIID containing TAF4(b), TAF5, and TAF12(b). The presence of such a lobe is consistent with the results presented here and by Yatherajam et al. (2003).

TAF9 is expected to interact with TAF6-like proteins (a histone H3/H4-like interaction) (Hisatake et al., 1995), and this was upheld in our studies, with the exception of the TAF6b-3 cDNA. It is interesting to note that TAF6b-3 contains an altered histone-fold structure (N-terminus) due to splicing differences, and this is the only TAF6-like protein shown to interact with TAF11 (containing a histone H3-like fold).

Another interesting interaction is that of TAF7 with the embedded ubiquitin domain unique to the plant TAF1 genes. Personal communication (M. Horikoshi) and sequence analysis of TAF7 genes have suggested that this TAF may have a VWA-related AAA ATPase activity. This similarity with proteasome subunits could potentially be part of a chromatin or factor remodeling activity within TFIID targeting ubiquitylated proteins.

Despite a number of studies of the structure of TFIID, X-ray crystallographic analysis of this complex has been elusive. Two studies have characterized the gross molecular structure of TFIID (Andel et al., 1999; Brand et al., 1999), and the HFD TAFs have been mapped on this structure (Leurent et al., 2002). However, absolute stoichiometries have not been determined, and the positioning of each component within


the structure is unknown after more than a decade of study.

This work represents the first detailed analysis of the TFIID complex from any plant species. While no TAF3 homolog has been observed in plants (possibly due to poor conservation of sequence similarity), at least one homolog has been identified for each of the other 15 confirmed TFIID components. In all, Arabidopsis appears to have 23 genes encoding TBP or TAF homologs. Of these, 21 were examined in this work. A total of 72 binary interactions were identified, including 26 novel protein interactions. In addition, Arabidopsis is the first organism reported to encode both TAF14 and TAF15.


Table 3-1. Primers for amplification of TBP and TAF-like cDNAs and cloning into pENTR/D-Topo or pDONR207 vectors.

Gene     Primer        Sequence
TBP1     Upper Primer  caccATGACTGATCAAGGATTGGAAGGGAGTAATC
TBP1     Lower Primer  TTGCTGTATCTTTCTGAATTCCGAGAGCAC
TBP2     Upper Primer  caccATGGCTGATCAAGGAACGGAAGGGAG
TBP2     Lower Primer  TTGCTGGACCTTCCTGAATTCTCTAAGAAC
TAF2     Upper Primer  ggggacaagtttgtacaaaaaagcaggcttaATGGCCAAGGCTCGAAAGCCGAAG
TAF2     Lower Primer  ggggaccactttgtacaagaaagctgggttTGAGTTGTTGAACGCTTTGCTTTTCAGTTTGATTC
TAF4     Upper Primer  ggggacaagtttgtacaaaaaagcaggcttaATGGATCTCTCCATTGTCAAGCTCCTC
TAF4     Lower Primer  ggggaccactttgtacaagaaagctgggttAACATCCGAGCAGATTCTATTGTATACGCGATAC
TAF4b    Upper Primer  caccATGGATCCTTCAATTTTCAAGCTCCTTGAAG
TAF4b    Lower Primer  TTGAATTAATCGATACATCAGAGTGGATTTGGAC
TAF5     Upper Primer  caccATGGATCCAGAGCAAATCAACGAGTTCGTC
TAF5     Lower Primer  ATATCTGATACCATTGTTTTGATCAGTTTGCGGGT
TAF6     Upper Primer  ggggacaagtttgtacaaaaaagcaggcttaATGAGCATTGTACCTAAGGAAACGGTTGAG
TAF6     Lower Primer  ggggaccactttgtacaagaaagctgggttGAGGAATACTGACATCTCTGTAGAAGGGATAAAG
TAF6b-1  Upper Primer  caccATGGTGACGAAAGAATCCATTGAAGTGATAGCTC
TAF6b-1  Lower Primer  CAAGAAGAAACTGAGCTCATGTGTG
TAF6b-2  Upper Primer  caccATGGTGACGAAAGAATCCATTGAAGTGATAGCTC
TAF6b-2  Lower Primer  CAAGAAGAAACTGAGCTCATGTGTG
TAF6b-3  Upper Primer  caccATGGTGACGAAAGAATCCATTGAAGTGATAGCTC
TAF6b-3  Lower Primer  CAAGAAGAAACTGAGCTCATGTGTG
TAF6b-4  Upper Primer  caccATGGTGACGAAAGAATCCATTGAAGTGATAGCTC
TAF6b-4  Lower Primer  AAGCCCACTCCGTGACTTTGTCAAAGTAAATC
TAF7     Upper Primer  caccATGGAAGAACAGTTCATACTTAGGGTTC
TAF7     Lower Primer  CATTGAATCATCAGAATCTTCTGATTCACTTCTCTC
TAF8     Upper Primer  caccATGAACACAGAGAGAGCTCAAGAAGGTGATAG
TAF8     Lower Primer  CAACTGATTGAGGTCTACTGGGTTCTCCATAC
TAF9     Upper Primer  caccATGGCAGGAGAAGGTGAAGAAGATGTACCTAGAGATGCTAAG
TAF9     Lower Primer  TTTGGGTCGTCTAGAGAGTGGGAAAGAGACCCTTTGAGGA
TAF10    Upper Primer  caccATGAATCACGGCCAACAATCTGGTGAGGC
TAF10    Lower Primer  TTCGTCCCTTGTTGCAGGGTCCATTCCAGTCGA
TAF11    Upper Primer  caccATGAAGCATTCAAAGGATCCGTTTGAAGCAGCGA
TAF11    Lower Primer  GCGGAAAAGGCGTGGAACTGATCTTTTAGGCAC
TAF11b   Upper Primer  caccATGGCCTTTAACGCAAGGTCTTGTTGTTTTGCTAG
TAF11b   Lower Primer  GCGAAAAAGCCGTTGAACTGATCTTTGAG
TAF12    Upper Primer  caccATGGATCAGCCACGGCAAAGCTCGA


Table 3-1 Continued.

Gene     Primer        Sequence
TAF12    Lower Primer  GTGATTGAAAGTTGTAGAGCCCATGGGA
TAF12b   Upper Primer  caccATGGCGGAACCGATTCCCTCATCGTC
TAF12b   Lower Primer  GTATCGTGTCATGTGTTGTAATATGTGAGGACCGGATG
TAF13    Upper Primer  caccATGAGTAACACACCAGCAGCG
TAF13    Lower Primer  ATCAACGAGTTCCTTTTCGTCGACATC
TAF14    Upper Primer  caccATGGAGTCGGATATCGAGATTTTGTCTGAAG
TAF14    Lower Primer  GAACAAGAATGCACCTGGAGGAGGCAG
TAF14b   Upper Primer  ggggacaagtttgtacaaaaaagcaggcttaATGACGAACAGCTCGTCATCGAAGAAACAAGCTC
TAF14b   Lower Primer  ggggaccactttgtacaagaaagctgggttCAGGTCTGATCCTGTTTTAACGGTCTGATTC
TAF15    Upper Primer  caccATGGCTGGATATCCTACTAATGGATCAGTCTAC
TAF15    Lower Primer  GTTACGGTACCTGCTTCCACGTTCGCGAC
TAF15b   Upper Primer  caccATGGCTGGGATGTACAATCAAGACGGCGGCGGAG
TAF15b   Lower Primer  GATCAGACACAGACATCTCTGGTCAAAGGTAGGAGCAC

Note: Lower case sequences are not homologous to the gene of interest, but are required for cloning into the appropriate vector. The sequence “cacc” on the upper primer is for directional cloning into pENTR/D-Topo; the attB1 “ggggacaagtttgtacaaaaaagcaggctta” and attB2 “ggggaccactttgtacaagaaagctgggtt” sequences are for directional BP cloning into pDONR207.
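The primer layout described in the note to Table 3-1 can be reproduced with a few lines of Python. This is a hedged sketch rather than the design procedure actually used: the function name and arguments are hypothetical, and only the adapter sequences and the two example gene-specific sequences are taken from the table.

```python
# Adapter sequences quoted in the note to Table 3-1.
TOPO_PREFIX = "cacc"                        # directional pENTR/D-Topo cloning
ATTB1 = "ggggacaagtttgtacaaaaaagcaggctta"   # upper primer, BP cloning
ATTB2 = "ggggaccactttgtacaagaaagctgggtt"    # lower primer, BP cloning


def add_cloning_adapter(gene_specific, vector, strand):
    """Prepend the vector-specific adapter (lower case) to a gene-specific
    primer sequence (upper case), following the table's convention.

    vector: "pENTR/D-Topo" or "pDONR207"; strand: "upper" or "lower".
    """
    seq = gene_specific.upper()
    if strand not in ("upper", "lower"):
        raise ValueError("strand must be 'upper' or 'lower'")
    if vector == "pENTR/D-Topo":
        # Only the upper primer carries the cacc overhang.
        return TOPO_PREFIX + seq if strand == "upper" else seq
    if vector == "pDONR207":
        return (ATTB1 if strand == "upper" else ATTB2) + seq
    raise ValueError("unknown vector: " + vector)


# Gene-specific portions copied from the TBP1 and TAF2 upper primers.
print(add_cloning_adapter("ATGACTGATCAAGGATTGGAAGGGAGTAATC",
                          "pENTR/D-Topo", "upper"))
print(add_cloning_adapter("ATGGCCAAGGCTCGAAAGCCGAAG", "pDONR207", "upper"))
```

Keeping the adapter in lower case and the gene-specific part in upper case preserves the table's visual distinction between vector-derived and gene-derived bases.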


Table 3-2. Primers for cloning of TAF12 N-terminal, middle, and C-terminal fragments. Lower case sequences are not homologous to the gene of interest, but are required for cloning into the pENTR/D-Topo vector.

Gene           Primer        Sequence
TAF12 1-200    Upper Primer  caccATGGATCAGCCACGGCAAAGCTCGA
TAF12 1-200    Lower Primer  CTGAGTTCCTTGCATCATTCTAACCTGAG
TAF12 201-394  Upper Primer  caccGGAATTGGGATGATGGGAACACTTG
TAF12 201-394  Lower Primer  CGGCTCGGTCTCTGCAGAAACTG
TAF12 395-539  Upper Primer  caccTCTGATGATCGTATCCTGGGGAAACGAAGCATC
TAF12 395-539  Lower Primer  GTGATTGAAAGTTGTAGAGCCCATGGGA


Table 3-3. Arabidopsis thaliana TFIID subunit cDNA GenBank accession numbers.

Gene      Locus      Accession  CDS Size (bp)  Mw (kDa)  pI
TBP1      At3g13445  AY463625   603            22.4      10.21
TBP2      At1g55520  AY463626   603            22.4      10.31
TAF1      At1g32750  AF510669   5,760          217.2     5.55
TAF1b     At3g19040  N/A        5,103          192.1     7.65
TAF2      At1g73960  AY457045   4,113          153.5     6.18
TAF4      At5g43130  AY457043   2,163          80.6      9.34
TAF4b     At1g27720  AY457044   1,854          68.9      9.84
TAF5      At5g25150  AY463620   2,010          74.4      6.65
TAF6      At1g04950  AY463621   1,584          58.9      8.73
TAF6b1    At1g54360  AY463630   1,515          56.5      8.83
TAF6b2    At1g54360  AY463631   1,494          55.7      8.84
TAF6b3    At1g54360  AY463632   1,431          53.1      8.56
TAF6b4    At1g54360  AY463633   588            22.6      9.62
TAF7      At1g55300  AY463622   612            22.5      4.11
TAF8      At4g34340  AY463623   1,062          39.5      4.96
TAF9      At1g54140  AY463624   552            20.6      4.67
TAF10     At4g31720  AY463628   405            14.9      5.50
TAF11     At4g20280  AY463612   633            23.7      5.39
TAF11b    At1g20000  N/A        615            23.4      9.59
TAF12     At3g10070  AY463613   1,620          57.7      10.42
TAF12b    At1g17440  AY463614   2,052          74.8      10.46
TAF13     At1g02680  AY463615   384            14.3      5.81
TAF14     At2g18000  AY463616   609            22.8      6.16
TAF14b    At5g45600  AY463617   807            30.2      7.03
TAF15     At1g50300  AY463618   1,119          41.3      7.98
TAF15b-1  At5g58470  AY463619   1,164          38.9      7.93
TAF15b-2  At5g58470  N/A        1,269          42.3      8.73


Figure 3-1. Histogram of percent of matings, per bait construct, that yielded colony growth. Bars marked in red represent bait constructs that were excluded from the analysis and considered spurious bait activators. [Histogram axes: Percent Growth (0-10 through 90-100, in bins of 10) versus Frequency (0 to 16).]


Figure 3-2. Histogram of percent of matings, per prey construct, that yielded colony growth. [Histogram axes: Percent Growth (0-10 through 90-100, in bins of 10) versus Frequency (0 to 16).]


Figure 3-3. Immunoblots of TFIID components expressed as bait fusion proteins in MaV204K. Proteins were expressed as Gal4 DNA binding domain fusion proteins with an internal Myc-tag from a Gateway-converted pGBK-T7 vector. Blots were probed with anti-Myc 9B11 antibody. The lane labeled pGBK-T7 is the non-Gateway-modified empty vector control. Ubiquitin domain, UBD; bromodomain, BD; middle region, MR; C-terminal domain, CD.
[Lane labels: pGBK-T7, TBP1, TBP2, TAF1 UBD, TAF1 BD, TAF1 #8 (MR), TAF1 #9 (CD), TAF2, TAF4, TAF4b, TAF5, TAF6, TAF6b-1, TAF6b-2, TAF6b-3, TAF6b-4, TAF8, TAF9, TAF10, TAF11, TAF12 201-394, TAF12b, TAF13, TAF14, TAF14b. Size markers (kDa): 120, 100, 80, 60, 50, 40, 30.]


Figure 3-4. Immunoblots of TFIID components expressed as prey fusion proteins in AH109. Proteins were expressed as Gal4 AD fusion proteins with an internal HA-tag from a Gateway-converted pGAD-T7 vector. Blots were probed with HA.11 antibody. The lane labeled pGAD-T7 is the non-Gateway-modified empty vector control. Ubiquitin domain, UBD; bromodomain, BD; N-terminal domain, ND; middle region, MR; C-terminal domain, CD.
[Lane labels: pGAD-T7, TBP1, TBP2, TAF1 UBD, TAF1 BD, TAF1 #6 (ND), TAF1 #8 (MR), TAF1 #9 (CD), TAF2, TAF4, TAF4b, TAF5, TAF6, TAF6b-1, TAF6b-2, TAF6b-3, TAF6b-4, TAF7, TAF8, TAF9, TAF10, TAF11, TAF12 1-200, TAF12 201-394, TAF12 395-538, TAF12b, TAF13, TAF14, TAF14b, TAF15, TAF15b. Size markers (kDa): 120, 100, 80, 60, 50, 40, 30.]


Table 3-4. A yeast two-hybrid targeted protein-protein interaction matrix between subunits of the Arabidopsis thaliana TFIID complex. Note: Data describe the average number of days after spotting (DAS) at which colonies formed and how many of the five spots formed colonies (i.e., 6/3+ indicates that three spots in the serial-dilution series developed colonies in an average of 6 DAS). Boxes highlighted in orange and green are reciprocated growth interactions; bold boxes are tests for dimer formation. Green and blue interactions were validated by β-Gal assays, with darker boxes being stronger interactions. Red and orange interactions were negated by β-Gal NAct values. Prey constructs highlighted in blue are constructs that have not been shown as bait constructs (either because the bait proteins interact with the Gal4 AD, the bait proteins did not grow in the 5’FOA test, or they have not yet been tested as bait constructs).

Bait constructs (columns): pGBK-T7, LAM, 53, TBP1, TBP2, TAF1 #4 UbD, TAF1 #5 BD, TAF1 #8, TAF1 #9, TAF2, TAF4, TAF4b, TAF5, TAF6, TAF6b-1, TAF6b-2, TAF6b-3, TAF6b-4, TAF8, TAF9, TAF10, TAF11, TAF12 201-394, TAF12b, TAF13, TAF14, TAF14b.

Prey constructs (rows), with the growth values recorded in each row (cells without growth are not shown):
pGAD-T7: 8/1+
TBP1: 13/1+, 13/1+
TBP2: 14/1+, 12/4+
TAF1 #4 UbD: none
TAF1 #5 BD: none
TAF1 #6: 3/2+, 3/2+, 5/2+
TAF1 #8: 11/1+
TAF1 #9: 9/1+, 5/2+, 3/2+, 11/1+, 8/1+
TAF2: none
TAF4: 5/1+, 6/1+, 10/1+
TAF4b: 5/1+, 2/3+, 4/2+, 3/4+, 3/3+, 3/3+, 4/2+
TAF5: 14/1+, 5/1+, 5/4+, 9/3+, 12/1+, 8/4+, 6/3+, 14/1+, 7/4+, 6/3+
TAF6: 7/2+
TAF6b-1: 12/1+, 7/4+, 12/1+
TAF6b-2: 6/1+, 13/1+, 10/4+, 8/2+, 11/1+
TAF6b-3: 8/1+, 10/1+, 8/1+, 10/2+, 10/1+
TAF6b-4: 12/2+
TAF7: 2/3+, 4/2+
TAF8: 3/3+, 2/3+, 4/2+, 3/1+, 4/3+, 2/4+, 3/3+, 4/3+
TAF9: 3/2+, 5/4+, 7/3+, 6/3+, 6/4+, 12/1+
TAF10: 14/1+, 3/2+, 2/4+, 2/4+, 6/1+, 2/3+, 2/4+, 8/3+
TAF11: 10/1+, 10/1+, 10/1+, 6/2+, 3/4+
TAF12 1-200: none
TAF12 201-394: 13/1+, 13/1+, 8/2+
TAF12 395-538: 13/1+, 2/3+, 2/3+, 5/4+, 3/3+, 6/1+, 4/4+, 3/3+, 2/4+, 6/4+, 4/3+
TAF12b: 3/2+, 2/3+, 9/1+, 2/3+, 2/3+, 3/3+, 8/3+, 3/2+
TAF13: 5/3+, 6/3+, 14/1+, 5/3+
TAF14: 8/1+, 3/2+, 8/1+, 4/2+, 10/1+, 6/3+
TAF14b: 10/2+, 3/2+, 10/1+, 3/2+, 4/1+, 4/2+
TAF15: 6/1+, 3/3+, 7/2+
TAF15b: 3/3+, 3/4+, 7/3+, 4/4+, 2/4+
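The cell notation defined in the note to Table 3-4 (for example, 6/3+ for three spots growing in an average of 6 DAS) is simple to parse by machine. The Python sketch below shows one way to do so; the regular expression and function name are assumptions made for illustration and are not part of the original analysis.

```python
import re

# One cell: <average days after spotting>/<number of spots with colonies>+
CELL_PATTERN = re.compile(r"(\d+)/(\d)\+")


def parse_growth_cells(text):
    """Return (average_DAS, spots_with_colonies) for every cell in text.

    Cells may be separated by spaces or commas, or run together.
    """
    return [(int(das), int(spots)) for das, spots in CELL_PATTERN.findall(text)]


print(parse_growth_cells("6/3+"))          # the example given in the note
print(parse_growth_cells("3/2+3/2+5/2+"))  # three cells run together
```

A lower DAS value combined with a higher spot count indicates faster and more robust growth, so the parsed tuples can be ranked directly.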


Figure 3-5. Colorimetric assays of the β-galactosidase reporter levels in yeast diploids containing both bait and prey plasmids. Assays were performed using the CPRG liquid culture assay protocol in the Yeast Protocols Handbook (Clontech). Green bars indicate negative controls of empty vector and two control baits with Gal4 AD. Yellow bars indicate the negative control for each bait construct with the empty prey vector. Blue bars are interactions yielding β-galactosidase activities with NAct values above zero; red bars indicate β-galactosidase activities with NAct values below zero. [Bar chart: x-axis, bait/prey pair (controls pGBK-T7, LAM, and 53, then baits TBP1 through TAF6b-4); y-axis, β-galactosidase units (1 µmol CPRG x min-1 x cell-1), 0.0 to 9.0.]


Figure 3-5 continued. [Bar chart: x-axis, bait/prey pair (controls pGBK-T7, LAM, and 53, then baits TAF8 through TAF14b); y-axis, β-galactosidase units (1 µmol CPRG x min-1 x cell-1), 0.0 to 8.0.]


Figure 3-6. Protein-protein interactions of Arabidopsis thaliana TFIID subunits as determined by yeast two-hybrid and β-galactosidase confirmations. Dashed black lines are interactions found in this study that have been demonstrated to occur with homologs from either Homo sapiens, Drosophila melanogaster, and/or Saccharomyces cerevisiae. Solid green lines are novel interactions demonstrated only in this study. Striped figures could not be tested as baits. [Interaction diagram. Node labels: TBP1, TBP2, TAF1, TAF1b, TAF2, TAF4, TAF4b, TAF5, TAF6, TAF6b-1, TAF6b-2, TAF6b-3, TAF6b-4, TAF7, TAF8, TAF9, TAF10, TAF11, TAF11b, TAF12, TAF12b, TAF13, TAF14, TAF14b, TAF15, TAF15b. Annotations: "Low conservation in histone-fold domain"; "Possible pseudo-gene".]


Figure 3-7. Protein-protein interactions of Arabidopsis thaliana TFIID subunits as determined by yeast two-hybrid and β-galactosidase confirmations. Dashed black lines are interactions found in this study that have been demonstrated to occur with homologs from Homo sapiens, Drosophila melanogaster, and/or Saccharomyces cerevisiae. Solid green lines are novel interactions demonstrated only in this study. Striped figures could not be tested as baits. [Interaction diagram. Node labels: TBP1, TBP2, TAF1, TAF1b, TAF2, TAF4, TAF4b, TAF5, TAF6, TAF6b-1, TAF6b-2, TAF6b-3, TAF6b-4, TAF7, TAF8, TAF9, TAF10, TAF11, TAF11b, TAF12, TAF12b, TAF13, TAF14, TAF14b, TAF15, TAF15b. Annotations: "Low conservation in histone-fold domain"; "Possible pseudo-gene".]


CHAPTER 4
BINARY PROTEIN-PROTEIN INTERACTIONS OF Arabidopsis TFIIA, TFIIB, TFIID, TFIIE, AND TFIIF

Introduction

Relatively little research has been conducted on GTFs of the plant kingdom despite the large number of genes present in the fully sequenced Arabidopsis thaliana genome that are related to the GTFs of metazoans and yeast. A. thaliana TFIIA large subunit 1 (TFIIA-L1) has been shown to interact with TFIIA small subunit (TFIIA-S), and the reconstituted recombinant TFIIA-L1/TFIIA-S complex is able to bind a TBP2/CaMV 35S promoter complex in vitro (Li et al., 1999). Furthermore, TBP2 has been shown to interact with TFIIB1 (Pan et al., 2000).

Given the relative paucity of information regarding GTFs in plants, our objectives were to identify and clone GTFs from a model plant system (Arabidopsis) and to develop a rapid and reliable assay of GTF protein-protein interactions (TFIIH and PolII have been excluded due to their many-subunit composition and greater roles in transcription elongation than in initiation). No comprehensive analysis of GTF interactions from any species has been conducted previously (with the exception of the protein interaction maps of Arabidopsis and yeast TFIID) (Yatherajam et al., 2003), and in fact there has been little or no publication on plant TFIIE or TFIIF complexes.

In this study, 878 binary protein-protein interactions between Arabidopsis TFIIA, TFIIB, TFIID, TFIIE, and TFIIF components were examined (with the exception of inter-TFIID interactions). Of these potential interactions, 118 (13.4%) were positive. The


total number of protein-protein interactions that were tested regardless of bait or prey conformation was 664 (the non-redundant interaction set), leaving 214 protein-protein interactions that were reciprocally tested. Of the 664 non-redundant interactions, 104 (15.7%) were positive. Seventy-seven of these interactions are novel, having not been found to occur between homologs. The work shown here adds significantly to the knowledge base of GTF-GTF binary protein interactions and lends credence to the functionality of each GTF in the multiple gene families. With the plethora of different GTFs in Arabidopsis, it seems likely that differential expression of GTFs targeted by different transcription factors is a key mechanism of control in plants.

Materials and Methods

Experiments with TFIIA, TFIIB, TFIIE, and TFIIF were performed similarly to those with TFIID subunits (Chapter 3). First-strand cDNA synthesis was performed as in Chapter 3. Primers compatible with the pENTR/D-Topo or pDONR207 vectors (Invitrogen) were designed to amplify the coding sequences (CDS) of the identified proteins (Table 4-1). Primers were designed and prepared and PCR products were made as in Chapter 3. The TFIIA-L2 entry clone was produced by PCR and pENTR/D-Topo cloning from the Arabidopsis Biological Resource Center (ABRC) clone. TFIIE 1 was also cloned in this way from ABRC clone C103330. PCR products were reacted with pENTR/D-Topo per the manufacturer's instructions. The directional Topo cloning reaction products were transformed into One Shot TOP10 E. coli cells (Invitrogen, USA). The resultant colonies were screened by PCR using a CDS-specific and a vector-specific primer. Positive clones were verified by sequencing at the University of Florida Microbiology and Cell Science Sequencing Core or Macrogen, Inc. (Seoul, Korea).


Upon screening and sequencing numerous TFIIE 2 clones, it became apparent that this gene was toxic in the forward orientation in pENTR/D-Topo (clones were always found in the reverse orientation). Therefore, this gene was cloned in two fragments into pENTR/D-Topo (TFIIE 2 1-215 and TFIIE 2 200-475; see Table 4-2 for primer sequences).

Verified pENTR/D-Topo constructs were recombined (Gateway LR and BP reactions, Invitrogen) successively through pGAD-T7Gateway, pDONR207, and pGBK-T7Gateway and propagated in E. coli MACH1 cells (Invitrogen). Clones in pGAD-T7Gateway and pGBK-T7Gateway were confirmed by digestion with BamHI as discussed in Chapter 3. A positive pGBK-T7Gateway clone of TFIIE 1 was not obtained after numerous attempts, suggesting that this construct is toxic to E. coli. Clones in pDONR207 were confirmed by digestion with BamHI and EcoRV as discussed in Chapter 3.

The resultant yeast two-hybrid bait and prey constructs were transformed into MaV204K (Ito et al., 2000) and AH109 (MATa trp1-901 leu2-3 112 ura3-52 his3-200 gal4Δ gal80Δ LYS2::GAL1UAS-GAL1TATA-HIS3 GAL2UAS-GAL2TATA-ADE2 URA3::MEL1UAS-MEL1TATA-lacZ, MEL1; Clontech) Saccharomyces cerevisiae strains, respectively, as in Chapter 3. Bait constructs in MaV204K were plated on SD –Trp, +0.1% 5’FOA to check for the potential to activate the SPAL10::URA3 reporter. Expression of the URA3 protein resulted in no growth and indicated spurious activation by the bait fusion protein. Suitable bait/MaV204K (non-activators) and prey/AH109 transformants were then mated with one another and selected on SD -Trp, -Leu (to test for mating efficiency) and SD -Trp, -Leu, -His, -Ade (to test for interaction). Double selection on plates lacking His and Ade served as a stringent screen for interactions.
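The two bait-quality filters used in this work (growth on 5’FOA, and the colony-frequency criterion of over 70% of matings reported in the Results for baits that interact with the Gal4 AD) can be combined into a single screening step. The Python sketch below is illustrative only: the record layout, field names, function name, and example values are hypothetical and are not data from this study.

```python
def select_suitable_baits(baits, max_colony_frequency=0.70):
    """Return names of baits that pass both filters.

    Each bait is a dict with hypothetical keys:
      "name"             - construct name
      "grows_on_5FOA"    - False means URA3 was activated (spurious activator)
      "colony_frequency" - fraction of matings that formed colonies
    """
    suitable = []
    for bait in baits:
        if not bait["grows_on_5FOA"]:
            continue  # spurious activator of the URA3 reporter
        if bait["colony_frequency"] > max_colony_frequency:
            continue  # grows with nearly every prey: likely binds the Gal4 AD
        suitable.append(bait["name"])
    return suitable


# Hypothetical records, for illustration only.
example_baits = [
    {"name": "bait A", "grows_on_5FOA": True, "colony_frequency": 0.25},
    {"name": "bait B", "grows_on_5FOA": False, "colony_frequency": 0.10},
    {"name": "bait C", "grows_on_5FOA": True, "colony_frequency": 0.85},
]
print(select_suitable_baits(example_baits))
```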


Yeast matings were performed similarly to the protocol in the Yeast Protocols Handbook (Clontech), as outlined in Chapter 3. β-galactosidase assays were performed to obtain semi-quantitative data regarding the strength of individual interactions. β-galactosidase assays with the resulting interactions were performed using CPRG substrate (Roche, Germany) according to the Clontech Yeast Protocols Handbook. Normalized activities (NAct) were determined and evaluated as in Chapter 3, with the exception of those involving the TFIIF 2 prey protein. The normalization value for interactions involving the TFIIF 2 prey protein was determined by the following equation: NAct = AVGt – SDt – 1.1 x [AVG (AVGc1 + AVGc2) – 1.1 x AVG (SDc1 – SDc2)]. In this equation AVGt is the average of the test activities, SDt is the standard deviation of the test activities, AVGc1 is the average activity of the appropriate bait negative control, SDc1 is the standard deviation of the activity of the appropriate bait negative control, AVGc2 is the average activity of the empty vector bait interacting with the TFIIF 2 prey, and SDc2 is the standard deviation of the activity of the empty vector bait interacting with the TFIIF 2 prey. Immunoblots were performed as described in Chapter 2.

Results

The TFIIA-L1 and TFIIA-L2 bait constructs did not grow on 5’FOA-containing plates, as expected, since TFIIA-L1 was shown to be an activator when artificially recruited to a yeast promoter (Li et al., 1999). After screening, it became apparent that several bait proteins were interacting with the Gal4 activation domain (Gal4 AD), as shown by growth on SD -Trp, -Leu, -His, -Ade in all prey combinations. The Gal4 AD-interacting bait clones TFIIB2, TFIIB4, TFIIB5, and TFIIE 2 200-475 have been

PAGE 135

eliminated from interactions for these reasons. Histograms of the percentage of matings that formed colonies for each bait and prey construct were created (Figures 4-1 and 4-2). These figures exclude the full-length TAF12 to avoid redundancy (as in Chapter 2). The bait proteins that interacted with the Gal4 AD formed colonies in over 70% of their matings and were the only bait constructs with frequencies in this range. This confirmed that the baits that were not observed to interact with the Gal4 AD were suitable for use as bait proteins.

Immunoblots to detect bait and prey proteins (Figures 4-3 and 4-4, respectively) demonstrated a large variability in protein expression levels. Bait and prey proteins used in Chapter 3 are included in this study; the results of the immunoblots for these proteins are in Figures 3-3 and 3-4. The majority of bait proteins were at steady-state levels below the level of detection. However, interaction data based on growth suggest that these proteins are present in the yeast cells.

The results of the targeted two-hybrid screens are presented in Table 4-4. These interactions are depicted for TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and as a summary in Figures 4-6 through 4-11, respectively. The results of the β-galactosidase assays are shown in Figure 4-5. The β-galactosidase normalized activities were utilized to verify or exclude protein-protein interactions based on colony growth. NAct values below zero were not considered to indicate interactions, while NAct values above one were considered evidence of strong interactions. For an interaction to be deemed positive, greater than 50% of the interaction evidence was required to be positive. Thus, if an interaction is not supported by growth of diploid yeast containing reciprocated constructs, it must be supported by a NAct value greater than zero in order to be considered positive. If an interaction is supported by

PAGE 136

growth of diploid yeast containing reciprocated constructs, only one of the two tested β-galactosidase activities must have a NAct greater than zero for the interaction to be considered positive. Strong interactions (NAct ≥ 1) are depicted in Figure 4-12.

A total of 878 partially reciprocated binary interactions were tested, which included most possible interactions of TFIIA, TFIIB, TFIID, TFIIE, and TFIIF. Interactions among TFIID subunits were excluded from this study, but are shown in Chapter 3. For a variety of reasons, bait constructs of TFIIA-L1, TFIIA-L2, TFIIB2, TFIIB4, TFIIB5, TAF1 #6, TAF7, TAF12 1-200, TAF12 395-538, TAF15, TAF15b, TFIIEα2 200-475, and TFIIEβ1 were not included in this study. Therefore, interactions between these subunits could not be tested. This leaves a minimum of 169 untested interactions between these five complexes.

Discussion

Using homology-based searches of the A. thaliana genomic sequence database, 18 loci encoding putative TFIIA, TFIIB, TFIIE, and TFIIF subunits have been identified, in addition to the 23 loci discussed in Chapter 3 for TFIID. Of these, 47 putative GTF subunit coding sequences (including 4 splice variants of TAF6b and fragments of TAF1, TAF12, and TFIIEα2) have been cloned into the MATCHMAKER III yeast two-hybrid system (Clontech). Thirteen of these proteins lead to transcriptional activation of the yeast reporter promoters in artificial recruitment assays when expressed as baits, either without or with the Gal4 AD being co-expressed. Interactions among these 13 (TFIIA-L1, TFIIA-L2, TFIIB2, TFIIB4, TFIIB5, TAF1 #6, TAF7, TAF12 1-200, TAF12 395-538, TAF15, TAF15b, TFIIEα2 200-475, and TFIIEβ1) have therefore not been tested.

PAGE 137

Development of another protein-protein interaction assay system will be necessary to test interactions of these proteins.

TFIIA in Arabidopsis appears to potentially be represented by three different complexes, resulting from heterodimer formation of TFIIA-S with TFIIA-L1, TFIIA-L2, or TFIIA-L3. Because of the high degree of similarity between TFIIA-L1 and TFIIA-L2, it is predicted that these two proteins have redundant functions. However, TFIIA-L3 is significantly diverged in sequence from the other plant TFIIA-L proteins (Chapter 2). The small subunit of TFIIA displays a number of novel interactions that have not been found in the literature. These include interactions with TFIIB3, TFIIB6, TAF4b, TAF8, TAF10, TAF12 1-200, TAF12b, TAF13, TAF14, and TFIIFβ2. Of the large subunits, only TFIIA-L3 interacts with a TFIIB homolog (TFIIB6), although in yeast, TFIIA-L interacts with TFIIB in two-hybrid experiments. Since there are essentially two diverged versions of TFIIA-L in Arabidopsis, some of the functions (interactions) of TFIIA may have been evolutionarily transferred to TFIIA-S, which is present in every TFIIA complex.

TFIIB is represented by six different proteins in Arabidopsis. TFIIB1 and TFIIB2 are very similar in sequence and are suggested to play a canonical TFIIB role in formation of the PIC. Interestingly, TFIIB1 and TFIIB2 have very dissimilar yeast two-hybrid interactions. TFIIB1 interacts with TAF4, TAF4b, TAF8, TAF10, TAF12b, TAF13, TFIIB5, and TFIIEα1, and forms a homooligomer (possibly a dimer). TFIIB2 was only shown to interact with TAF10. This discrepancy in interactions was unexpected for two close homologs. When expressed as a bait construct, TFIIB2 was found to interact

PAGE 138

with the Gal4 AD. This interaction might dampen the transcriptional response of the TFIIB2 prey construct, leading to false negatives in this system.

TFIIB3 and TFIIB6 group closely in a phylogenetic analysis of the TFIIB family of proteins. This is remarkable because TFIIB6 lacks the second direct repeat region found in TFIIB proteins, while TFIIB3 does not. Nonetheless, TFIIB3 and TFIIB6 have multiple interactions with other proteins in common: TFIIA-S, TFIIB5, TAF12 395-538, TAF12b, and TFIIFβ2. While neither of these proteins has been tested for interaction with TBP at the TATA-element (the gold standard for TFIIB function), their multiple protein-protein interactions with other GTF proteins strongly suggest a role in the PIC.

TFIIB5/pBrp has recently been shown to associate with the plastid outer envelope and presumably is involved in a signaling pathway from the plastid to the nucleus, triggering a transcriptional response (Lagrange et al., 2003). This protein is a bona fide TFIIB, because it interacts with TBP bound to the TATA-element in electrophoretic mobility shift assays. While the exact function of TFIIB5/pBrp is still unknown, the data presented here demonstrate that it interacts with many other proteins in TFIID, TFIIE, and TFIIF as well as with several of its TFIIB homologs. The protein interaction information presented here, along with the trafficking of this protein from the plastid to the nucleus under conditions of proteasome/signalosome dysfunction (Lagrange et al., 2003), strongly suggests a signal transduction role resulting in direct manipulation of the central proteins regulating transcription.

Arabidopsis TFIID appears to be composed of at least 15 different protein subunits, some of which are present in multiple copies. There is no evidence of a TAF3-like

PAGE 139

protein in plants other than a minimal similarity with TAF8. Interactions of TFIID subunits with each other are explored in detail within Chapter 3.

Interestingly, TAF8 and TAF10 (which both dimerize and interact very strongly with one another) have a very similar pattern of interactions with the proteins of other GTFs. Both TAF8 and TAF10 were shown to interact with TFIIA-S, TFIIB1, TFIIB5/pBrp, TFIIEα1, and TFIIFβ2. These data suggest that TAF8 and TAF10 (perhaps as an α2β2 structure) mediate interactions between TFIID and TFIIA, TFIIB, TFIIE, and TFIIF. Some TAF-GTF interactions of TAF8 and TAF10 are not shared. These include TAF8-TFIIEβ2, TAF10-TFIIB2, TAF10-TFIIB3, and TAF10-TFIIB6; however, all of these interactions are consistent with the shared TAF8 and TAF10 interactions. The TAF8-TFIIEβ2 interaction is congruous with a TAF8-TAF10 heterotetramer interacting with the TFIIEα1-TFIIEβ2 heterotetramer. Similarly, the TAF10-TFIIB2 and TAF10-TFIIB6 interactions are consistent with a TAF8-TAF10 complex interacting with TFIIB proteins. None of these interactions of TAF8 or TAF10 with other GTFs have been reported previously in any organism. These data suggest a previously unknown role of a TAF8-TAF10 heterotetramer in PIC nucleation, at least within plants.

Arabidopsis TFIIE proteins are encoded by five genes (three encoding TFIIEα subunits and two encoding TFIIEβ subunits). TFIIEα1 was shown to interact strongly with TFIIEβ2, although TFIIEα2 fragments did not interact with TFIIEβ proteins in this study. This may be caused by improper folding of TFIIEα2 domains when expressed in a fragmented form. TFIIEα3 was not examined in this study because it was not amplified under the RT-PCR conditions utilized. This gene is not represented by EST

PAGE 140

sequences and may therefore be a non-expressed pseudogene or an under-represented mRNA/cDNA. Interestingly, TFIIEβ1 did not interact with any other proteins in this study. This could be due to a two amino acid truncation of the C-terminus, or another artifact of the protein expression system. However, this protein was readily expressed in the prey form, as evidenced by Figure 4-2.

TFIIF is a heterotetrameric complex of two TFIIFα and two TFIIFβ molecules (Orphanides et al., 1996). The yeast two-hybrid data presented here suggest a composition for plant TFIIF, since the TFIIFβ2 protein was shown to dimerize while TFIIFα apparently did not. TFIIFβ1, like TFIIEα3, was not examined in this study because an RT-PCR product was not obtained. This gene is also not represented by ESTs, and given its divergence in primary structure from other TFIIFβ proteins in plants, it is considered a possible pseudogene (Chapter 2, Figure 2-12). In yeast, however, a third factor, the yeast TAF14, interacts as part of TFIIF (Henry et al., 1994). The data presented here demonstrate a connection of TAF14 and TAF14b with TFIIF; however, these interactions were between TAF14(b) and TFIIFβ2, not TFIIFα as in yeast. Since the TAF14(b)-TFIIF connection differs from that in yeast, support for TAF14 or TAF14b acting as TFIIF subunits is tenuous without direct evidence of their localization to an isolated TFIIF complex.

Strikingly, TFIIFβ2 interacts with many other GTF subunits (24 of 47), while TFIIFα only interacts with TFIIFβ2. TFIIFα did not interact with any subfragment of TAF1, an interaction that is known to occur in other systems since TAF1 acetylates TFIIFα. Interestingly, TFIIFβ2 interacts with both TAF1 #8 (middle region) and TAF1 UBD (an internally coded ubiquitin moiety plus over one hundred amino acids on either side,

PAGE 141

which is located in the middle region construct). The HAT/FAT domain of TAF1 is located in the TAF1 #8 construct and is partially represented in the TAF1 UBD construct. This suggests that in Arabidopsis either TFIIFβ2 is acetylated in place of TFIIFα or TFIIFβ2 is a major stabilizer of the TAF1-TFIIF interaction.

TFIIFβ2 interacted with four of six TFIIB homologs, as is expected since this interaction is a major connection point for TFIIB and PolII. Of the TFIIBs, only TFIIB2 and TFIIB4 did not interact with TFIIFβ2. In general, these two TFIIB homologs had the fewest interactions of the family, although both were detectably expressed as prey proteins. Both TFIIB2 and TFIIB4 contain putative zinc-binding domains that are implicated in interactions with TFIIF homologs (Buratowski and Zhou, 1993). Since false negatives are often a problem in any protein interaction study, definitive conclusions cannot be drawn with respect to these failed interactions. However, a lack of interactions with TFIIFβ2 and nearly all other GTFs tested here calls into question the veracity of TFIIB2 and TFIIB4 as functional TFIIB homologs.

Of the 118 interactions identified in this study, 86 were novel. A significant portion of these are likely to be due to the lack of a previously performed systematic study to specifically test interactions among TFIIA, TFIIB, TFIID, TFIIE, and TFIIF from any organism. However, with the large number of novel plant GTF homologs identified by homology-based searches, various specializations are to be expected. The data provided by this study can lead to specific, testable hypotheses as to variable PIC conformations. Further detailed analyses will be necessary to unravel the meanings behind these varied binary interactions.

PAGE 142

Table 4-1. Primers for amplification of cDNAs of Arabidopsis homologs of TFIIA, TFIIB, TFIIE, and TFIIF for cloning into the pENTR/D-Topo vector.

Gene       Primer        Sequence
TFIIA-L1   Upper Primer  caccATGGGTACAACAACGACAACAAGCGCTGTGTATATCCATG
TFIIA-L1   Lower Primer  GAAGTCAAACTCGCCTGCTGCTTTGTTGAAGAGAATGTCCTTATC
TFIIA-L2   Upper Primer  caccATGGGTACAACAACGACAACAAGCGCTGTG
TFIIA-L2   Lower Primer  GAAGTCGAACTCGCCTGTTGCTTTGTTGAAGAGAATG
TFIIA-L3   Upper Primer  caccATGGTGTTATCAACGAGCGATACGAGTAGCTCTTACAACTATG
TFIIA-L3   Lower Primer  GAAGTTGAAATCTCCTGTTGCCTGTGAG
TFIIA-S    Upper Primer  caccATGGCGACGTTTGAGCTGTACAGGAGATCGACGATC
TFIIA-S    Lower Primer  CTGTGTGAGCAGCTTGGAATCACATGCCACTATCTTC
TFIIB1     Upper Primer  caccATGTCGGATGCGTATTGTACGGATTG
TFIIB1     Lower Primer  AGGACTTGACAGGTTTTTCAGATCCTCTTCCTTTGCATACCAAC
TFIIB2     Upper Primer  caccATGAGTGACGCGTTTTGTTCGGACTGTAAGAGGCACACGGA
TFIIB2     Lower Primer  AGGGCTTTGAAGGTTCTTGAGATCTTCTTCTTTAGCGTACCAAGCT
TFIIB3     Upper Primer  caccATGGAAGAAGAGACCTGCTTGGACTG
TFIIB3     Lower Primer  TACTGAAAATTTTGCAGAATCCCAGGACGTGATG
TFIIB4     Upper Primer  caccATGACGATGAAGTGGGGTCACAGTTGCAGGAGATGTAAG
TFIIB4     Lower Primer  AGGAGCTCCAAGGTTTTTCAGGTCATTTGCATTGGCAAACCAC
TFIIB5     Upper Primer  caccATGAAGTGTCCGTACTGTTCATC
TFIIB5     Lower Primer  GAAGTCTCCATGGGGATTATCAGCATTC
TFIIB6     Upper Primer  caccATGAAAGAAGACGGAATTTGCTTGGAGTGCAAGAGGCCAAC
TFIIB6     Lower Primer  AATAGTACCGAAAGAATCTCCAAGAAGCTTCACCGCTTTG
TFIIEα1    Upper Primer  caccATGGAAAAATCAGGCCCGGTGCAGAAAGCCGTTGTTCTC
TFIIEα1    Lower Primer  GCCTTCTTCCCAATCGACGTCGTCTTCCTCTTCTTCTTC
TFIIEα2    Upper Primer  caccATGGACAAATCAATCACGGTGGTGCGGAAAACCGTTGTG
TFIIEα2    Lower Primer  GCCTTCTTCCCAGTCGATGTCGTCATCTTCGTCTCCATCT
TFIIEα3    Upper Primer  caccATGGTGAAGCTTGTAGCGAAAAC
TFIIEα3    Lower Primer  GCATTCTTGCCAATCGACATCGTTTTCGTC
TFIIEβ1    Upper Primer  caccATGGCTTTGCGGGAGCAGCTTG
TFIIEβ1    Lower Primer  ACTCTGGAAGAGCTCGAGCATATGGGAATTG
TFIIEβ2    Upper Primer  caccATGGCTCTAAAGGAACAGCTAG
TFIIEβ2    Lower Primer  GTTCCGGGAACTGCTGCCGTTAAG

PAGE 143

Table 4-1 Continued.

Gene       Primer        Sequence
TFIIFα     Upper Primer  caccATGTCGAACTGTTTGCAATTGAATACGTCTTGTGTTGGTTGCGGATCAC
TFIIFα     Lower Primer  AGCAAGCGGAGTAACATTATCTCTCAAAACAACAACAAACTTTTCAGAAC
TFIIFβ1    Upper Primer  caccATGGAAGATGTAAAGGTGGAAATGAAGGTAAG
TFIIFβ1    Lower Primer  TTCCTGAGTGGCTTTCTTATATTCAGGCTTCAG
TFIIFβ2    Upper Primer  caccATGGAAGATATTCATAATCTCGATATAGAG
TFIIFβ2    Lower Primer  CTGCCCACCTGTATCATCTTCAGCAG

Note: The lower case sequence “cacc” on the upper primers is for directional cloning into pENTR/D-Topo.
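The primer layout in Table 4-1 (a lower case "cacc" overhang followed by the start of the coding sequence on the upper primer, and a reverse-strand lower primer) can be illustrated with a minimal sketch. The function names, the primer lengths, and the choice to take the lower primer as the plain reverse complement of the 3' end of the supplied region are assumptions made for illustration only; they are not the design procedure used for the primers listed above.

```python
# Hypothetical helper names; a sketch of directional-TOPO style primer layout.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def topo_primers(region: str, fwd_len: int = 24, rev_len: int = 24):
    """Build an upper/lower primer pair for a coding region.

    Assumptions (not taken from the source): the upper primer is 'cacc'
    plus the first fwd_len bases, and the lower primer is the reverse
    complement of the last rev_len bases of the region supplied.
    """
    region = region.upper()
    if not region.startswith("ATG"):
        raise ValueError("region is expected to begin with the ATG start codon")
    upper = "cacc" + region[:fwd_len]  # lower-case overhang, as in Table 4-1
    lower = reverse_complement(region[-rev_len:])
    return upper, lower


if __name__ == "__main__":
    # The 5' end of TFIIEα2 as it appears in its upper primer (Table 4-1).
    five_prime = "ATGGACAAATCAATCACGGTGGTGCGGAAAACCGTTGTG"
    up, low = topo_primers(five_prime, fwd_len=39, rev_len=10)
    print(up)   # matches the TFIIEα2 upper primer in Table 4-1
    print(low)  # toy lower primer built from this short region only
```

The lower primer printed here is a toy value derived from the short 5' region, not one of the lower primers in the table, which anneal to the 3' ends of the full coding sequences.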

PAGE 144

Table 4-2. Primers for cloning of TFIIEα2 N-terminal and C-terminal fragments. Lower case sequences are not homologous to the gene of interest, but are required for cloning into the pENTR/D-Topo vector.

Gene              Primer        Sequence
TFIIEα2 1-215     Upper Primer  caccATGGACAAATCAATCACGGTGGTGCGGAAAACCGTTGTG
TFIIEα2 1-215     Lower Primer  TCTATCTACTACTTCTTCGGAAATTAGCTTGTTACATTC
TFIIEα2 200-475   Upper Primer  caccGTTATGGAATGTAACAAGCTAATTTCCGAAGAAG
TFIIEα2 200-475   Lower Primer  GCCTTCTTCCCAGTCGATGTCGTCATCTTCGTCTCCATCT

PAGE 145

Table 4-3. Arabidopsis thaliana TFIIA, TFIIB, TFIIE, and TFIIF component cDNA GenBank accession numbers.

Gene              Locus      Accession  CDS Size (bp)  Mw (kDa)  pI
TFIIA-S           At4g24440  AY463599   321            12.1      5.61
TFIIA-L1          At1g07480  AY463627   1,128          41.3      3.98
TFIIA-L2          At1g07470  AY463597   1,128          41.2      4.02
TFIIA-L3          At5g59230  AY463598   561            20.9      3.94
TFIIB1            At2g41630  AY463600   939            34.3      6.77
TFIIB2            At3g10330  AY463601   939            34.2      6.66
TFIIB3            At3g29380  AY463629   1,011          37.7      6.32
TFIIB4            At3g57370  AY463602   1,083          39.7      7.76
TFIIB5            At4g36650  AY463603   1,512          55.7      6.14
TFIIB6            At4g10680  AY463604   549            19.9      8.89
TFIIEα1           At1g03280  AY463605   1,440          54.1      4.75
TFIIEα2           At4g20340  N/A        1,428          54.6      4.95
TFIIEα2 1-215     At4g20340  AY463606   645            25.5      9.04
TFIIEα2 200-475   At4g20340  AY463607   825            30.9      4.32
TFIIEα3           At4g20810  N/A        1,251          47.8      4.72
TFIIEβ1           At4g21010  AY463610   828            31.5      10.23
TFIIEβ2           At4g20330  AY463608   861            32.4      10.04
TFIIFα            At4g12610  AY463611   1,950          72.3      5.22
TFIIFβ1           At3g52270  N/A        1,095          42.1      7.70
TFIIFβ2           At1g75510  AY463609   761            29.7      6.92

PAGE 146

Figure 4-1. Histogram of percent of matings, per bait construct, that yielded colony growth. Bars marked in red represent bait constructs that were excluded from the analysis and considered spurious bait activators.
[Histogram. x-axis: Percent Growth (10% bins, 0-10 through 90-100); y-axis: Frequency.]

PAGE 147

Figure 4-2. Histogram of percent of matings, per prey construct, that yielded colony growth.
[Histogram. x-axis: Percent Growth (10% bins, 0-10 through 90-100); y-axis: Frequency.]
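The tally behind Figures 4-1 and 4-2 (the percentage of matings per construct that yielded colonies, grouped into 10% bins, with baits above 70% treated as spurious activators) can be sketched as follows. This is an illustrative reconstruction under stated assumptions: the construct names and growth records are invented toy data, the function names are hypothetical, and the handling of values that fall exactly on a bin edge is a choice made here, not something the figures specify.

```python
from collections import Counter


def percent_growth(results: dict) -> dict:
    """Percent of matings that formed colonies, per construct.

    `results` maps a construct name to a list of booleans, one per mating
    (True = colony growth). The data layout is an assumption.
    """
    return {name: 100.0 * sum(grew) / len(grew) for name, grew in results.items()}


def bin_label(pct: float, width: int = 10) -> str:
    """Label of the histogram bin holding pct (100% goes in the top bin)."""
    low = min(int(pct // width) * width, 100 - width)
    return f"{low}-{low + width}"


def histogram(percents: dict) -> Counter:
    """Number of constructs falling into each percent-growth bin."""
    return Counter(bin_label(p) for p in percents.values())


def flag_activators(percents: dict, cutoff: float = 70.0) -> list:
    """Constructs whose mating frequency exceeds the 70% cutoff in the text."""
    return sorted(name for name, p in percents.items() if p > cutoff)


if __name__ == "__main__":
    # Toy records for three hypothetical baits, ten matings each.
    toy = {
        "bait_A": [True] + [False] * 9,
        "bait_B": [True] * 3 + [False] * 7,
        "bait_X": [True] * 9 + [False],
    }
    pct = percent_growth(toy)
    print(histogram(pct))
    print(flag_activators(pct))
```

Only the hypothetical bait_X exceeds the cutoff in this toy data set, mirroring how the Gal4 AD-interacting baits stand apart from the remaining constructs in Figure 4-1.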

PAGE 148

Figure 4-3. Immunoblots of TFIIA, TFIIB, TFIIE, and TFIIF components expressed as bait fusion proteins in MaV204K. Proteins were expressed as Gal4 DNA binding domain fusion proteins with an internal Myc-tag from a Gateway-converted pGBK-T7 vector. Blots were probed with anti-Myc 9B11 antibody. The lane labeled pGBK-T7 is the non-Gateway-modified empty vector control.
[Immunoblot image. Lanes: pGBK-T7, TFIIA-L3, TFIIA-S, TFIIB1, TFIIB3, TFIIB6, TFIIEα1, TFIIEα2 1-215, TFIIEβ2, TFIIFα, TFIIFβ2. Size markers (kDa): 120, 100, 80, 60, 50, 40, 30.]

PAGE 149

Figure 4-4. Immunoblots of TFIIA, TFIIB, TFIIE, and TFIIF components expressed as prey fusion proteins in AH109. Proteins were expressed as Gal4 AD fusion proteins with an internal HA-tag from a Gateway-converted pGAD-T7 vector. Blots were probed with HA.11 antibody. The lane labeled pGAD-T7 is the non-Gateway-modified empty vector control.
[Immunoblot image. Lanes: pGAD-T7, TFIIA-L1, TFIIA-L2, TFIIA-L3, TFIIA-S, TFIIB1, TFIIB2, TFIIB3, TFIIB4, TFIIB5, TFIIB6, TFIIEα1, TFIIEα2 1-215, TFIIEα2 200-475, TFIIEβ1, TFIIEβ2, TFIIFα, TFIIFβ2. Size markers (kDa): 120, 100, 80, 60, 50, 40, 30.]

PAGE 150

Table 4-4. A yeast two-hybrid targeted protein-protein interaction matrix between components of Arabidopsis thaliana TFIIA, TFIIB, TFIIE, and TFIIF with subunits of the TFIID complex.
Note: Data describe the average number of days after spotting (DAS) and how many of the five spots formed colonies (e.g., 6/3+ indicates that three spots in the serial-dilution series developed colonies in an average of 6 DAS). Boxes highlighted in orange and green are reciprocated growth interactions; bold boxes are tests for dimer formation. Green and blue interactions were validated by β-Gal assays, with darker boxes being stronger interactions. Red and orange interactions were negated by β-Gal NAct values. Prey constructs highlighted in blue are constructs that have not been shown as bait constructs (either because the bait proteins interacted with the Gal4 AD, the bait proteins did not grow in the 5-FOA test, or they have not yet been tested as bait constructs).

PAGE 151

Figure 4-5. Colorimetric assays of the β-galactosidase reporter levels in yeast diploids containing both bait and prey plasmids. Assays were performed using the CPRG liquid culture assay protocol in the Yeast Protocols Handbook (Clontech). Green bars indicate negative controls of empty vector and two control baits with Gal4 AD. Yellow bars indicate the negative control for each bait construct with the empty prey vector. Blue bars are interactions yielding β-galactosidase activities with NAct values above zero; red bars indicate β-galactosidase activities with NAct values below zero.
[Bar chart, TFIID baits (TBP1 and TAF constructs) against pGAD-T7 and GTF preys. x-axis: Bait/Prey; y-axis: β-galactosidase units (1 µmol CPRG × min⁻¹ × cell⁻¹).]

PAGE 152

Figure 4-5 continued.
[Bar chart, TFIIA, TFIIB, TFIIE, and TFIIF baits against pGAD-T7 and GTF preys. x-axis: Bait/Prey; y-axis: β-galactosidase units (1 µmol CPRG × min⁻¹ × cell⁻¹).]
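The rule used in the Results to combine colony growth with the β-galactosidase data (more than 50% of the available evidence must be positive, with NAct above zero counting as positive and NAct above one as strong) can be sketched as below. This is one reading of that rule, under the assumption that each growth test and each NAct value counts as one equally weighted piece of evidence; the function name and the returned labels are hypothetical.

```python
def classify_interaction(growth, nact):
    """Classify a bait/prey pair from growth calls and NAct values.

    growth: iterable of booleans, one per orientation tested for growth
            (two entries when the constructs were reciprocated).
    nact:   iterable of normalized β-galactosidase activities, one per
            orientation assayed.

    Assumption: every growth call and every NAct value is one equally
    weighted piece of evidence, and NAct > 0 counts as positive.
    """
    nact = list(nact)
    evidence = [bool(g) for g in growth] + [value > 0 for value in nact]
    if not evidence:
        raise ValueError("no evidence supplied for this pair")
    positive = sum(evidence) / len(evidence) > 0.5  # strictly more than half
    if not positive:
        return "negative"
    # NAct values above one are treated as evidence of a strong interaction.
    return "strong" if any(value > 1 for value in nact) else "positive"


if __name__ == "__main__":
    # Non-reciprocated growth needs a NAct above zero.
    print(classify_interaction([True], [0.4]))
    print(classify_interaction([True], [-0.2]))
    # Reciprocated growth needs only one of the two NAct values above zero.
    print(classify_interaction([True, True], [0.3, -0.1]))
    print(classify_interaction([True, True], [1.5, 0.2]))
```

Under this reading, a pair with growth in one orientation and a negative NAct has exactly half of its evidence positive and is therefore rejected, while reciprocated growth with a single positive NAct reaches three of four and is accepted, matching the two cases spelled out in the text.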

PAGE 153

Figure 4-6. Protein-protein interactions of Arabidopsis thaliana TFIIA subunits with components of TFIIB, TFIID, TFIIE, TFIIF, and other TFIIA components as determined by yeast two-hybrid and β-galactosidase confirmations. Dashed lines are interactions found in this study that have been demonstrated to occur with homologs from Homo sapiens, Drosophila melanogaster, and/or Saccharomyces cerevisiae. Solid lines are novel interactions demonstrated only in this study. Striped figures could not be tested as baits.
[Interaction diagram. Nodes are the TFIIA, TFIIB, TFIID, TFIIE, and TFIIF subunits; annotations in the diagram read "Low conservation in histone-fold domain" and "Possible pseudogene".]

PAGE 154

Figure 4-7. Protein-protein interactions of Arabidopsis thaliana TFIIB homologs with components of TFIIA, TFIID, TFIIE, TFIIF, and other homologs of TFIIB as determined by yeast two-hybrid and β-galactosidase confirmations. Dashed black lines are interactions found in this study that have been demonstrated to occur with homologs from Homo sapiens, Drosophila melanogaster, and/or Saccharomyces cerevisiae. Solid green lines are novel interactions demonstrated only in this study. Striped figures could not be tested as baits.
[Interaction diagram. Nodes are the TFIIA, TFIIB, TFIID, TFIIE, and TFIIF subunits; annotations in the diagram read "Low conservation in histone-fold domain" and "Possible pseudogene".]

PAGE 155

Figure 4-8. Protein-protein interactions of Arabidopsis thaliana TFIID components with components of TFIIA, TFIIB, TFIIE, and TFIIF as determined by yeast two-hybrid and β-galactosidase confirmations. Dashed black lines are interactions found in this study that have been demonstrated to occur with homologs from Homo sapiens, Drosophila melanogaster, and/or Saccharomyces cerevisiae. Solid green lines are novel interactions demonstrated only in this study. Striped figures could not be tested as baits.
[Interaction diagram. Nodes are the TFIIA, TFIIB, TFIID, TFIIE, and TFIIF subunits; annotations in the diagram read "Low conservation in histone-fold domain" and "Possible pseudogene".]

PAGE 156

Figure 4-9. Protein-protein interactions of Arabidopsis thaliana TFIIE subunits with components of TFIIA, TFIIB, TFIID, TFIIF, and other TFIIE components as determined by yeast two-hybrid and β-galactosidase confirmations. Dashed black lines are interactions found in this study that have been demonstrated to occur with homologs from Homo sapiens, Drosophila melanogaster, and/or Saccharomyces cerevisiae. Solid green lines are novel interactions demonstrated only in this study. Striped figures could not be tested as baits.
[Interaction diagram. Nodes are the TFIIA, TFIIB, TFIID, TFIIE, and TFIIF subunits; annotations in the diagram read "Low conservation in histone-fold domain" and "Possible pseudogene".]

PAGE 157

Figure 4-10. Protein-protein interactions of Arabidopsis thaliana TFIIF subunits with components of TFIIA, TFIIB, TFIID, TFIIE, and other TFIIF components as determined by yeast two-hybrid and β-galactosidase confirmations. Dashed black lines are interactions found in this study that have been demonstrated to occur with homologs from Homo sapiens, Drosophila melanogaster, and/or Saccharomyces cerevisiae. Solid green lines are novel interactions demonstrated only in this study. Striped figures could not be tested as baits.
[Interaction diagram. Nodes are the TFIIA, TFIIB, TFIID, TFIIE, and TFIIF subunits; annotations in the diagram read "Low conservation in histone-fold domain" and "Possible pseudogene".]

PAGE 158

Figure 4-11. Protein-protein interactions of Arabidopsis thaliana TFIIA, TFIIB, TFIIE, and TFIIF with each other and subunits of TFIID as determined by yeast two-hybrid and β-galactosidase confirmations. Dashed black lines are interactions found in this study that have been demonstrated to occur with homologs from humans, fruit flies, and/or yeast. Solid green lines are novel interactions demonstrated only in this study. Striped figures could not be tested as baits.
[Interaction diagram. Nodes are the TFIIA, TFIIB, TFIID, TFIIE, and TFIIF subunits; annotations in the diagram read "Low conservation in histone-fold domain" and "Possible pseudogene".]

PAGE 159

Figure 4-12. Strong protein-protein interactions of Arabidopsis thaliana TFIIA, TFIIB, TFIIE, and TFIIF with each other and subunits of TFIID as determined by yeast two-hybrid and β-galactosidase confirmations. Dashed black lines are interactions found in this study that have been demonstrated to occur with homologs from humans, fruit flies, and/or yeast. Solid green lines are novel interactions from this study. Striped figures could not be tested as baits.
[Interaction diagram. Nodes are the TFIIA, TFIIB, TFIID, TFIIE, and TFIIF subunits; annotations in the diagram read "Low conservation in histone-fold domain" and "Possible pseudogene".]

PAGE 160

CHAPTER 5
DISCUSSION

Using homology-based searches of the Arabidopsis thaliana genomic sequence database, 41 loci encoding putative TFIIA, TFIIB, TFIID, TFIIE, and TFIIF subunits have been identified. TFIIA, TFIIB, TFIID, TFIIE, and TFIIF are encoded by four, six, twenty-three, five, and three of these loci, respectively. In total, 38 cDNAs encoding these genes have been cloned. Their protein-protein interactions and phylogenetic relationships have been analyzed and are discussed below.

TFIIA Large and Small Subunits

TFIIA is composed of either two subunits (L and S) in fungi and plants or three subunits (α, β, and γ) in metazoans, where α and β are derived from post-translational cleavage of a protein homologous to the fungal and plant TFIIA-L subunit (Li et al., 1999). In A. thaliana, TFIIA is encoded by four genes. Three of these encode large subunit homologs and one encodes a small subunit homolog. Poplar TFIIA also appears to be encoded by four genes (two encoding large subunit proteins and two encoding small subunit proteins).

AtTFIIA-L1 and AtTFIIA-L2 appear to have resulted from a recent gene duplication, given their high degree of identity and their close juxtaposition in chromosomal location (Fig. 2-2). Because of this similarity, TFIIA-L1 and TFIIA-L2 are predicted to have redundant functions. The TFIIA-S family is conserved throughout the length of the protein, while the TFIIA-L family is conserved mainly at the N-terminal

PAGE 161

and C-terminal ends. The TFIIA-L sequence conservation pattern is consistent with the observation that human and fruit fly TFIIA is composed of three subunits, the two largest of which are derived from proteolytic cleavage of the TFIIAα/β (TFIIA-L) pre-protein. This suggests that the middle region of TFIIA-L proteins may function as a flexible linker.

AtTFIIA-L3 appears to have arisen from a more ancient gene duplication and has significantly diverged from other TFIIA-L genes (Figure 2-2). The AtTFIIA-L3 protein is approximately half the size of its two Arabidopsis homologs, although it has maintained a similar pI and appears to be competent for assembly of the TFIIA complex based on yeast two-hybrid interactions (Chapter 4). One hypothesis is that AtTFIIA-L3 represents an ancestral form of the TFIIA-L protein family due to its phylogenetic clustering with fungal and metazoan sequences.

Arabidopsis and poplar both encode TFIIA proteins that are highly diverged from other plant proteins (TFIIA-L3 and TFIIA-S2, respectively). These proteins may have evolved specialized functions within their respective organisms; however, they do not seem to be conserved within other plant genomes that have been sequenced. This suggests that these proteins are potentially evolutionary experiments in progress or may be ancestral forms of these proteins that have been retained in their respective species.

The small subunit of Arabidopsis TFIIA displays a number of novel interactions that have not been previously identified. These include interactions with TFIIB3, TFIIB6, TAF4b, TAF8, TAF10, TAF12 1-200, TAF12b, TAF13, TAF14, and TFIIFβ2. Of the large subunits, only TFIIA-L3 interacts with a TFIIB homolog (TFIIB6), although in yeast TFIIA-L interacts with TFIIB in two-hybrid experiments. Since essentially two diverged versions

of TFIIA-L (TFIIA-L1/TFIIA-L2 and TFIIA-L3) exist in Arabidopsis, some of the functions (interactions) of TFIIA may have been evolutionarily selected to occur in TFIIA-S, which is present in every TFIIA complex.

TFIIB Family

Full-length cDNAs of over 30 plant homologs of TFIIB have been identified. The TFIIB protein family has undergone a myriad of duplications and differentiations, including one group (the Class C TFIIB-related proteins) that has evolved a functional interaction with the defining plant organelle, the plastid (Lagrange et al., 2003). Six distinct phylogenetic TFIIB groups are apparent in Arabidopsis and Populus (if one accounts for the BRFs). Clear homologs of DNA-dependent RNA polymerase III (PolIII) associated TFIIB-related factors (BRFs) from plants were excluded from the phylogenetic analysis, with the exception of the Arabidopsis proteins, which were used as an outgroup.

Plant TFIIB homologs have a number of conserved motifs. These include a lysine residue in the second direct repeat, which has been shown to be autoacetylated in human and yeast TFIIBs (Choi et al., 2003). Choi et al. (2003) reported that the presence of the acetyllysine group increases TFIIB affinity for TFIIF, implying a role in transcriptional initiation. It is likely that an activity involved in this critical process will be conserved not only in metazoans and fungi, but also in plants. Equally significant is the absence of this lysine in several of the plant TFIIBs, suggesting plant-specific specialization among members of the TFIIB family. Arabidopsis TFIIB1, TFIIB2, and TFIIB3 all contain this lysine residue, while TFIIB4, TFIIB5, and TFIIB6 do not. Interestingly, TFIIB2 and TFIIB4 do not interact with TFIIFβ2 in yeast two-hybrid assays while the other TFIIB

homologs do. This indicates that the acetyllysine is not the only factor involved in the TFIIB-TFIIF interaction.

The putative zinc-ribbon domain at the N-terminus has been conserved in most TFIIB-family members. Significantly, poplar TFIIB3, poplar TFIIB8, poplar TFIIB9, Lycopersicon esculentum TFIIB AF273333, and Sulfolobus solfataricus TFB AAK40772.1 are all missing the conserved cysteine and/or histidine residues essential to this N-terminal motif. The absence of the zinc ribbon from the TFB of at least one archaeal species (Sulfolobus solfataricus) suggests that this motif may not be required for TFIIB function in all cases.

Another conserved domain, the pair of imperfect direct repeats (Nikolov et al., 1995), is found in most plant TFIIB homologs (Appendix B3). Although not closely related, AtTFIIB6 and the predicted amino acid sequence of poplar TFIIB4 are both lacking the second direct repeat region, which interacts directly with PolII in animals (Ha et al., 1993). Therefore, these proteins may be deficient in this PolII interaction and, if they are functional, could possibly play a role as negative regulators. Vitis vinifera TFIIB TC9302, as well as the amino acid predictions from poplar TFIIB5 and TFIIB6, notably lack both direct repeats, suggesting that these proteins, if expressed, are not functional TFIIB homologs, since they would likely be deficient in TBP and PolII interactions (Ha et al., 1993).

In addition to the TFIIB-family proteins in Class A and Class B (BRFs, which were not analyzed extensively in this study), a clear conservation has been shown for the plastid-envelope associated TFIIB-like proteins (Class C). The canonical member of the Class C group (TFIIB5/pBrp) was discovered by Lagrange et al. (2003) to bind the outer

envelope of the plastid, suggesting a function in signal transduction from the plastid to the nucleus. This plant-specific TFIIB is not detectable in the nucleus of wild-type plants (Lagrange et al., 2003). The characterized protein AtTFIIB5/pBrp has two closely related homologs (Lycopersicon esculentum accession AAG01118 and poplar TFIIB7/pBrp) in addition to the partial cDNAs from Spinacia oleracea and Zea mays reported by Lagrange et al. (2003). This suggests that this protein is widely distributed in plants and has a conserved activity that is critical to plant cell functions. TFIIB5/pBrp is also a bona fide TFIIB, because it interacts with TBP bound to the TATA element in EMSA experiments. While the exact function of TFIIB5/pBrp is still unknown, the data presented here demonstrate that it interacts with many other proteins in TFIID, TFIIE, and TFIIF, as well as with several of its TFIIB homologs. These interactions, along with the trafficking of this protein from the plastid to the nucleus under conditions of proteasome/signalosome dysfunction (Lagrange et al., 2003), strongly suggest a signal transduction role resulting in direct manipulation of the central proteins regulating transcription.

There is weak bootstrap support for two additional conserved classes of TFIIB-like proteins in plants. The Class D group contains Arabidopsis TFIIB4 and poplar TFIIB8. Class E contains Arabidopsis TFIIB3 and TFIIB6, as well as poplar TFIIB2. The functions of these proteins are unknown; however, Arabidopsis TFIIB3 and TFIIB6 have similar interactions with other GTFs (Chapter 4).

TFIIB is represented by six different proteins in Arabidopsis. TFIIB1 and TFIIB2 are both closely related Class A proteins. Interestingly, TFIIB1 and TFIIB2 have very dissimilar yeast two-hybrid interactions. TFIIB1 interacts with TAF4, TAF4b, TAF8,

TAF10, TAF12b, TAF13, TFIIB5, TFIIE 1, and forms a homooligomer (possibly a dimer). However, TFIIB2 only interacts with TAF10. This discrepancy in interactions was unexpected for two close homologs and is difficult to explain. When expressed as a bait construct, TFIIB2 was found to interact with the Gal4 AD; this interaction might dampen the transcriptional response of the TFIIB2 prey construct, leading to false negatives.

TFIIB3 and TFIIB6 group closely in a phylogenetic analysis of the TFIIB family of proteins. However, TFIIB6 lacks the second direct repeat region found in TFIIB proteins, whereas TFIIB3 retains it. This suggests that these two proteins may have vastly different roles in transcription, with TFIIB6 perhaps playing a role as a repressor of transcription. Alternatively, TFIIB6 may have lost this second repeat domain through a functionally deleterious mutation.

TFIID Components

TBP is widely regarded as being the rate-limiting factor of PIC formation (Klein, 1994; Collart, 1996; Chatterjee, 1995). Consistent with this critical role, it is among the most highly conserved proteins of the GTFs across all the organisms examined in this study, with 73.7% average similarity and 63.1% average identity. Likewise, TBP is highly conserved in plants in general (84% average identity). As in animals, many plants contain duplicate TBP genes; however, unlike the case in animals, the plant proteins are highly similar and are likely redundant. Plant TBPs are tightly clustered phylogenetically, although significantly diverged from metazoan, fungal, protist, and archaeal TBPs and TBP-like proteins. As would be expected, the two repeated structural domains of TBP are conserved in all the proteins in the

TBP-like family.

Arabidopsis TBPs, like their counterpart in yeast, interact with very few proteins in yeast two-hybrid assays. TBP1 and TBP2 interact with the N-terminus of TAF1, and TBP1 interacts with TFIIFβ2. The paucity of TBP interactions may be due to a dampening of the Gal4 AD activity by TBP itself.

Arabidopsis TFIID is composed of at least 15 different protein subunits, some of which are present in multiple copies. In general, TBP-associated factors are more highly variable than TBP. There are many cases of duplicate TAFs as well as TAF-like proteins in fungi and animals (reviewed in Tora, 2002). One extreme example of this type of duplication in Arabidopsis is provided by the TAF6 homologs. Two genetic loci encoding homologs of TAF6 have been identified in Arabidopsis. One of these genes, TAF6b, is alternatively spliced into four distinct messages.

The TAF genes that were phylogenetically analyzed were found to be clearly divergent along taxonomic lines. This was demonstrated by TAF9, for which monocot and dicot sequences clustered separately in the unrooted phylogram. However, poplar TAF9b is more closely related to the Chlamydomonas reinhardtii TAF9 than to the monocot or dicot TAF9 proteins. This TAF9 homolog is perhaps a TAF9-like protein involved in other transcriptional complexes, similar to H. sapiens TAF9L (Chen and Manley, 2003). A second possibility is that poplar TAF9b may be a bona fide TAF9 that regulates a subset of genes in poplar, or is merely redundant. Finally, poplar TAF9b could represent an ancient form of the gene that has been maintained in this lineage.

TAF8 and TAF10 both dimerize and interact very strongly with one another. These two proteins also have a very similar pattern of interactions with the proteins of other GTFs. Both TAF8 and TAF10 were shown to interact with TFIIA-S, TFIIB1,

TFIIB5/pBrp, TFIIE 1, and TFIIFβ2. These data suggest that TAF8 and TAF10 (perhaps as an α2β2 structure) mediate interactions between TFIID and TFIIA, TFIIB, TFIIE, and TFIIF. Some non-shared interactions are consistent with the shared interactions, such as the TAF8-TFIIE 2 interaction, which supports the TAF8-TAF10 heterotetramer interacting with the TFIIE 1-TFIIE 2 heterotetramer. To the best of my knowledge, none of these interactions of TAF8 or TAF10 with other GTFs has been reported previously in any organism. These data suggest a previously unknown role of a TAF8-TAF10 heterotetramer in PIC nucleation, at least within plants.

TFIIEα and TFIIEβ Subunits

Arabidopsis thaliana has genes encoding three homologs of TFIIEα and two of TFIIEβ (Table 2-1). TFIIEα and TFIIEβ of H. sapiens are acidic (pI of 4.5) and basic (pI of 9.5) proteins, respectively (Peterson et al., 1991). The acidic properties of TFIIEα appear to be well conserved in Arabidopsis, with pI values of 4.75, 4.95, and 4.72 for Eα1, Eα2, and Eα3, respectively. Likewise, the basic pI values are conserved in Arabidopsis TFIIEβ proteins (10.23 and 10.04 for Eβ1 and Eβ2, respectively).

Four of these Arabidopsis TFIIE genes are clustered on chromosome 4. TFIIEα2 and TFIIEβ2 neighbor each other in a head-to-head inverted fashion, sharing a common promoter region. TFIIEα3 and TFIIE1 are in relatively close proximity, both in the same orientation (18 genetic loci inserted between the genes, 83 kbp apart). The extreme proximity (only 972 bp between start codons) of TFIIEα2 and TFIIEβ2 suggests that they are direct descendants of the ancestral genes in plants and have been duplicated to create the other loci. This hypothesis is supported by phylogenetic data in the case of TFIIE2, in which the Arabidopsis protein (along with the gene product of TFIIE1) is clustered

separately from all other TFIIE proteins. However, the TFIIE 2 protein clusters with the other Arabidopsis TFIIE sequences, within the dicot grouping.

TFIIE 1 interacts strongly with TFIIE 2, although, surprisingly, TFIIE 2 fragments did not interact with TFIIE proteins. This may be caused by improper folding of TFIIE 2 domains when expressed in yeast in a fragmented form. TFIIEα3 was not examined in this study because it was not amplified from the cDNA population. TFIIEα3 is not represented by EST sequences and may therefore be a non-expressed pseudogene or an under-represented mRNA/cDNA. Interestingly, TFIIE 1 did not interact with any other proteins in this study, a result that may be caused by a two amino acid truncation of the C-terminus. However, this protein was readily expressed in the prey form, as evidenced by Figure 4-2, suggesting that protein stability and folding were not factors in the lack of interaction.

TFIIFα and TFIIFβ Subunits

TFIIF is a heterotetrameric complex of two TFIIFα and two TFIIFβ molecules (Orphanides et al., 1996). TFIIF is highly conserved throughout the length of the primary structure across the animal, plant, and fungal kingdoms. The poplar genome clearly encodes two TFIIF genes; however, the genomic sequence of one of these could not be completed due to a lack of sequence in what appears to be a low-complexity intronic region. In yeast two-hybrid assays, Arabidopsis TFIIFα only interacted with TFIIFβ2. The yeast two-hybrid data presented here would suggest a different composition for plant TFIIF, since the TFIIFβ2 protein was shown to dimerize while TFIIFα apparently did not. This is an unusual conformation for this complex, which is considered to be an α2β2 heterotetramer (Flores et al., 1990). In yeast, a third factor interacts as part of TFIIF

as a distinct complex, the yeast TAF14 (Henry et al., 1994). The data presented here demonstrate a connection of TAF14 and TAF14b with TFIIF; however, these interactions were between TAF14(b) and TFIIFβ2, not TFIIFα as in yeast. Since the TAF14(b)-TFIIF connection is different from that in yeast, support for TAF14 or TAF14b acting as TFIIF subunits is weak.

Interestingly, TFIIFβ2 interacts with many other GTF subunits (25 of 47), while TFIIFα did not interact with any subunits except TFIIFβ2. TFIIFα binds TAF1 in other systems and is acetylated by the TAF1 FAT activity (Ruppert and Tjian, 1995). Interestingly, TFIIFβ2 interacts with both TAF1 #8 (middle region) and TAF1 UBD (an internally coded ubiquitin moiety plus over one hundred amino acids on either side, which is located in the middle region construct). The HAT/FAT domain of TAF1 is located in the TAF1 #8 construct and is partially represented in the TAF1 UBD construct. This suggests that in Arabidopsis either TFIIFβ2 is acetylated in the place of TFIIFα or TFIIFβ2 is a stabilizer of the TAF1-TFIIFα interaction.

A major anchoring point of PolII to the GTFs is the binding of TFIIB through the tightly associated TFIIF complex. Consistent with this, TFIIFβ2 interacted with four of six TFIIB homologs. Of the TFIIBs, only TFIIB2 and TFIIB4 did not interact with TFIIFβ2. These TFIIB homologs had relatively few interactions, despite detectable expression as prey proteins. Both TFIIB2 and TFIIB4 contain putative zinc-binding domains that are implicated in interactions with TFIIF homologs (Buratowski and Zhou, 1993). This lack of interactions between TFIIB2/TFIIB4 and TFIIFβ2 may indicate false negatives in the yeast two-hybrid assay. False negatives are a problem in any protein interaction study and often lead to a lack of definitive conclusions with respect to these

failed interactions. However, a lack of interactions with TFIIFβ2 and nearly all other GTFs tested here calls into question the status of TFIIB2 and TFIIB4 as functional TFIIB homologs.

Conclusion

Throughout this work, 11 plant GTF protein families have been phylogenetically analyzed, 39 Arabidopsis cDNAs for TFIIA, TFIIB, TFIID, TFIIE, and TFIIF protein homologs have been cloned, and 1598 potential protein-protein interactions have been tested by yeast two-hybrid analyses. Of the 1108 tested non-redundant interactions, 176 (15.9%) were positive. Of these, 112 (63.6%) were novel for any system (Figure 5-1), and 64 (36.4%) were confirmations of interactions from human, Drosophila, or yeast (Figure 5-2). Figure 5-3 shows 52 interactions that have been described in human, Drosophila, and/or yeast that were not found for the Arabidopsis homologs. While it is somewhat surprising to be missing this many interactions in a comprehensive study of this type, there are several explanations. Poor expression or misfolding of proteins in yeast may account for some of these missing binary interactions. A few may be missing because of the inability to test the interactions (non-testable bait constructs). However, it is also likely that many of these “lacking” interactions are simply not found in the Arabidopsis PIC. Many changes have occurred in these protein homologs since the last common ancestor of plants, animals, and fungi. This degree of shuffling of protein-protein interactions in the PIC since the last common ancestor of these eukaryotes is not unreasonable.
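The percentages quoted in the Conclusion can be re-derived from the stated counts with a few lines of arithmetic. The sketch below is only a consistency check on those counts; the variable and function names are illustrative and are not part of the original analysis.

```python
# Counts taken from the Conclusion; all names here are illustrative only.
tested_nonredundant = 1108  # non-redundant interactions tested
positive = 176              # positive yeast two-hybrid interactions
novel = 112                 # positives not reported in any other system
confirmed = 64              # positives previously reported elsewhere


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Novel and confirmed interactions should partition the positives.
assert novel + confirmed == positive

positive_rate = percent(positive, tested_nonredundant)
novel_share = percent(novel, positive)
confirmed_share = percent(confirmed, positive)

print(f"positive: {positive_rate}% of tested non-redundant pairs")
print(f"novel: {novel_share}% of positives")
print(f"confirmed: {confirmed_share}% of positives")
```

The two shares are computed against the 176 positives rather than against the 1108 tested pairs, which matches how the percentages are reported in the text.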

Figure 5-1. Protein-protein interactions among TFIIA, TFIIB, TFIIE, TFIID and TFIIF that are unique to Arabidopsis thaliana as determined by yeast two-hybrid and β-galactosidase confirmations. Legend entries in the diagram: low conservation in histone-fold domain; possible pseudogene.

Figure 5-2. Protein-protein interactions of Arabidopsis thaliana TFIIA, TFIIB, TFIID, TFIIE, and TFIIF that have been reported previously for Homo sapiens, Drosophila melanogaster, and/or Saccharomyces cerevisiae homologs. Striped proteins represent those that could not be tested as baits. Legend entries in the diagram: low conservation in histone-fold domain; possible pseudogene.

Figure 5-3. Interactions of Homo sapiens, Drosophila melanogaster, and/or Saccharomyces cerevisiae TFIIA, TFIIB, TFIID, TFIIE, or TFIIF that were not confirmed for Arabidopsis thaliana homologs.
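Figures 5-1 through 5-3 amount to a three-way partition obtained by comparing the Arabidopsis two-hybrid positives with interactions reported in other organisms. A minimal sketch of that comparison follows, assuming each interaction is stored as an unordered pair of protein names. The two small sets are toy placeholders with made-up labels, not data from this study, and the function names are hypothetical.

```python
def pair(a, b):
    """Represent an interaction as an unordered pair so that A-B equals B-A."""
    return frozenset((a, b))


def partition_interactions(observed, reported_elsewhere):
    """Split interactions into novel, confirmed, and not-recovered sets."""
    novel = observed - reported_elsewhere          # category shown in Figure 5-1
    confirmed = observed & reported_elsewhere      # category shown in Figure 5-2
    not_recovered = reported_elsewhere - observed  # category shown in Figure 5-3
    return novel, confirmed, not_recovered


# Toy placeholder data (hypothetical labels, for illustration only).
observed = {pair("A", "B"), pair("B", "C"), pair("C", "D")}
reported_elsewhere = {pair("B", "A"), pair("D", "E")}

novel, confirmed, not_recovered = partition_interactions(observed, reported_elsewhere)
print(len(novel), len(confirmed), len(not_recovered))
```

Storing each pair as a frozenset keeps the comparison independent of bait and prey orientation, which is one simple way to treat reciprocal tests as a single non-redundant interaction.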

APPENDIX A
NUCLEOTIDE AND AMINO ACID SEQUENCES OF GENERAL TRANSCRIPTION FACTORS

TFIIA Small Subunit Sequences

Arabidopsis thaliana TFIIA-S At4g24440
MATFELYRRSTIGMCLTETLDEMVQSGTLSPELAIQVLVQFDKSMTEALESQVKTKVSI
KGHLHTYRFCDNVWTFILQDAMFKSDDRQENVSRVKIVACDSKLLTQ

Drosophila melanogaster TFIIA-S NP_524467
MSYQLYRNTTLGNTLQESLDELIQYGQITPGLAFKVLLQFDKSINNALNQRVKARVTFK
AGKLNTYRFCDNVWTLMLNDVEFREVHEIVKVDKVKIVACDGKSGEF

Glycine max TFIIA-S TC148651
MATFELYRRSTIGMCLTETLDEMVQNGTLSPELAIQVLVQFDKSMTEALETQVKSKVSI
KGHLHTYRFCDNVWTFILQDALFKNEDSQENVGRVKIVACDSKLLTQ

Homo sapiens TFIIAγ NP_004483
MAYQLYRNTTLGNSLQESLDELIQSQQITPQLALQVLLQFDKAINAALAQRVRNRVNFR
GSLNTYRFCDNVWTFVLNDVEFREVTELIKVDKVKIVACDGKNTGSNTTE

Hordeum vulgare TFIIA-S TC66396
MATFELYRRSTIGMCLTETLDEMVSSGTLSPELAIQVLVQFDKSMTEALENQVKSKVTV
KGHLHTYRFCDNVWTFILTDAQFKNEEITEQVSKVKIVACDSKLLSQ

Lycopersicon esculentum TFIIA-S TC119445
MATFELYRRSTIGMCLTETLDEMVSNGILSPEHAIQVLVQFDKSMTEALETQVKSKVTI
KGHLHTYRFCDNVWTFILQDAVFKSEECQETVNRVKIVACDSKLLTQ

Medicago truncatula TFIIA-S TC79554
MATFELYRRSTIGMCLTETLDEMVQNGTLSPEIAIQVLVQFDKSMTEALETQVKSKVSI
KGHLHTYRFCDNVWTFILQDALFKNEDNQENVGRVKIVACDSKLLSQ

Mesembryanthemum crystallinum TFIIA-S TC5775
MATFELYRRSTIGMCLTETLDEMVQSGTLSPELAIQVLVQFDKSMTEALEAQVKTKVTI
KGHLHTYRFCDNVWTFMLQDALFKSEECQENVSRVKIVACDSKLLTQ

Oryza sativa TFIIA-S AAK73129
MATFELYRRSTIGMCLTETLDEMVSSGTLSPELAIQVLVQFDKSMTEALENQVKSKVSI
KGHLHTYRFCDNVWTFILTEASFKNEETTEQVGKVKIVACDSKLLSQ

162 Oryza_sativa_TFIIA-S2 MATFELYRRSTIGMCLTDTLDDMVSSGALSPELAIQVLVQFDKSMTSALEHQVKSKVTV KGHLHTYRFCDNVWTFILTDAIFKNEEITETINKVKIVACDSKLLETKEE Pinus TFIIA-S TC16392 MATFELYRKSTIGTCLTETLDELVLNGTLSPEHAIQVLVQFDKSMAEALETQVKSKVTI KGHLHTYRFCDNVWTFLLQDAQFKGEDIHEQAGRVKIVACDSKILTQ Populus trichocarpa TFIIA-S1 MATFELYRRSTIGMCLTETLDDMVQNGTLSPELAFQVLVQFDKSMTEALETKVKSKVTI KGHLHTYRFCDNVWTFILQDANFKNEDSQENVGRVKIVACDSKLLTQ Populus trichocarpa TFIIA-S2 MSTNGNNPAPYFELYRRSSVGLALTDALDELIQSGHINPQLALTVLKQFDKSASQVLST QLRSKCLIKGHLSTYRLCDEVWTFLLRDSIYKLEGGEQVGPVKRVKIVACKGNAGASAP PA Saccharomyces cerevisiae TFIIA-S TOA2p NP_012865 MAVPGYYELYRRSTIGNSLVDALDTLISDGRIEASLAMRVLETFDKVVAETLKDNTQSK LTVKGNLDTYGFCDDVWTFIVKNCQVTVEDSHRDASQNGSGDSQSVISVDKLRIVACNS KKSE Solanum tuberosum TFIIA-S TC60470 MATFELYRRSTIGMCLTETLDEMVSNGILSPEHAIQVLVQFDKSMTEALETQVKSKVTI KGHLHTYRFCDNVWTFILQDAVFKSEECQETVNRVKIVACDSKLLTQ Triticum aestivum TFIIA-S TC71252 MATFELYRRSTIGMCLTETLDEMVSSGTLSPELAIQVLVQFDKSMTEALENQVKSKVTV KGHLHTYRFCDNVWTFILTDAQFKNEETTEQVGKVKIVACDSKLLSQ Triticum aestivum TFIIA-S TC71251 MATFELYRRSTIGMCLTETLDEMVSSGTLSPELAIQVLVQFDKSMTDALETQVKSKVTV KGHLHTYRFCDNVWTFILTDAQFKNEETTEQVGKVKIVACDSKLLSQ Triticum aestivum TFIIA-S CA484144 MATFELYRRSTIGMCLTETLDEMVSSGTLSPELAIQVLVQFDKSMTDALENQVKSKVNI KGHLHTYRFCDNVWTFILTDASFKNEETTEQVGKVKIVACDSKLLGQ Vitis vinifera TFIIA-S TC15540 MATFELYRRSTIGMCLTETLDEMVQNGTLSPELAIQVLVQFDKSMTEALESQVKSKVTI KGHLHTYRFCDNVWTFILQDALFKNEESQENVGRVKIVACDSKLLTQ Zea mays TFIIA-S TC170582 MATFELYRRSTIGMCLTETLDEMVSNGTLSPELAIQVLVQFDKSMTDALENQVKSKVTV KGHLHTYRFCDNVWTFILTDASFKNEEATEQVGKVKIVACDSKLLGQ

163 Zea mays TFIIA-S TC173972 MATFELYRRSTIGTCLTETLDELVSSGAVSPELAIQVLVQFDKSMTEALEMQVKSKVSV KGHLHTYRFCDNVWTFILTDATFKSEEIQETLGRVKIVACDSKLLQPQHP TFIIA Large Subunit Sequences Arabidopsis thaliana TFIIA-L1 MGTTTTTSAVYIHVIEDVVNKVREEFINNGGPGESVLSELQGIWETKMMQAGVLNGPIE RSSAQKPTPGGPLTHDLNVPYEGTEEYETPTAEMLFPPTPLQTPLPTPLPGTADNSSMY NIPTGSSDYPTPGTENGVNIDVKARPSPYMPPPSPWTNPRLDVNVAYVDGRDEPERGNS NQQFTQDLFVPSSGKRKRDDSSGHYQNGGSIPQQDGAGDAIPEANFECDAFRITSIGDR KVPRDFFSSSSKIPQVDGPMPDPYDEMLSTPNIYSYQGPSEEFNEARTPAPNEIQTSTP VAVQNDIIEDDEELLNEDDDDDELDDLESGEDMNTQHLVLAQFDKVTRTKSRWKCSLKD GIMHINDKDILFNKAAGEFDF Arabidopsis thaliana TFIIA-L2 MGTTTTTSAVYIHVIEDVVNKVREEFINNGGPGESVLSELQGIWETKMMQAGVLNGPIE RSSAQKPTPGGPLTHDLNVPYEGTEEYETPTAEMLFPPTPLQTPLPTPLPGTADNSSMY NIPTGSSDYPTPGTENGVNIDVKARPSPYMPPPSPWTNPRLDVNVAYVDGRDEPERGNS NQQFTQDLFVPSSGKRKRDDSSAHYQNGGSIPQQDGASDAIPEANFECAALRITYVGDR KIPRDLIGSSSKIPQVDGPMPDPYDEMLSTPNIYSYQGPNEEFNEARTPAPNEIQTSTP VAVPNDIIEDDEELLNEDDDDDELDDLESGEDMNTQHLVLAQFDKVTRTKSRWKCSLKD GIMHINDKDILFNKATGEFDF Arabidopsis thaliana TFIIA-L3 MVLSTSDTSSSYNYVIDDVINKSRCDLVYNGELDESVLSQIQSMWKTKMIQAGAMSGTI ETSSASIPTTPVIVQTTLQTPDAIPLPEKKMSPKKESDGFYYIPQQDGARDEAIVDVDE NEEPLNEDDDDEEDDIDDDDMNIQHLVMCQFDKVKRSKNKWECKFNAGVMQINGKNVLF SQATGDFNF Drosohphila melanogaster TFIIA-alpha-beta NP_476996 MALCQTSVLKVYHAVIEDVITNVRDAFLDEGVDEQVLQEMKQVWRNKLLASKAVELSPD SGDGSHPPPIVANNPKAANAKAKKAAAATAVTSHQHIGGNSSMSSLVGLKSSAGMAAGS GIRNGLVPIKQEVNSQNPPPLHPTSAASMMQKQQQAASSGQGSIPIVATLDPNRIMPVN ITLPSPAGSASSESRVLTIQVPASALQENQLTQILTAHLISSIMSLPTTLASSVLQQHV NAALSSANHQKTLAAAKQLDGALDSSDEDESEESDDNIDNDDDDDLDKDDDEDAEHEDA AEEEPLNSEDDVTDEDSAEMFDTDNVIVCQYDKITRSRNKWKFYLKDGIMNMRGKDYVF QKSNGDAEW Glycine max TFIIA-L TC192713 MAASTTSQVYIQVIDDVMNKVRDEFVNNGGPGDEVLKELQSIWESKMMQAGAIVGPIER SGAPKPTPGGPITPVHDLNMPYEGTEEYETPTAEMLFPPTPLQTPLQTPLPGTVDNSMY NIPTGPSDYPSAGNEPGANNEIKGGRPGPYMQPPPSPWTNQNQNQNQRAPLDVNVAYVE GRDEAERGASNQPLTQDFFMSSGKRKRDEIASQYNAGGYIPQQDGAGDAASQILEIEVY GGGMSIDAGHSTSKGKMPAQSDRPASQIPQLDGPIPYDDDVLSTPNIYNYGVFNEDYNI 
ANTPAPSEVPASTPAPIAQNEVDEEDDDDEPPLNENDDDDLDDLDQGEDQNTHHLVLAQ FDKVTRTKSRWKCTLKDGIMHINNKDILFNKATGEFDF

164 Homo sapiens TFIIA-alpha/beta-like factor NM_006872 MACLNPVPKLYRSVIEDVIEGVRNLFAEEGIEEQVLKDLKQLWETKVLQSKATEDFFRN SIQSPLFTLQLPHSLHQTLQSSTASLVIPAGRTLPSFTTAELGTSNSSANFTFPGYPIH VPAGVTLQTVSGHLYKVNVPIMVTETSGRAGILQHPIQQVFQQLGQPSVIQTSVPQLNP WSLQATTEKSQRIETVLQQPAILPSGPVDRKHLENATSDILVSPGNEHKIVPEALLCHQ ESSHYISLPGVVFSPQVSQTNSNVESVLSGSASMAQNLHDESLSTSPHGALHQHVTDIQ LHILKNRMYGCDSVKQPRNIEEPSNIPVSEKDSNSQVDLSIRVTDDDIGEIIQVDGSGD TSSNEEIGSTRDADENEFLGNIDGGDLKVPEEEADSISNEDSATNSSDNEDPQVNIVEE DPLNSGDDVSEQDVPDLFDTDNVIVCQYDKIHRSKNKWKFYLKDGVMCFGGRDYVFAKA IGDAEW Homo sapiens TFIIA-alpha-beta NP_056943 MANSANTNTVPKLYRSVIEDVINDVRDIFLDDGVDEQVLMELKTLWENKLMQSRAVDGF HSEEQQLLLQVQQQHQPQQQQHHHHHHHQQAQPQQTVPQQAQTQQVLIPASQQATAPQV IVPDSKLIQHMNASNMSAAATAATLALPAGVTPVQQILTNSGQLLQVVRAANGAQYIFQ PQQSVVLQQQVIPQMQPGGVQAPVIQQVLAPLPGGISPQTGVIIQPQQILFTGNKTQVI PTTVAAPTPAQAQITATGQQQPQAQPAQTQAPLVLQVDGTGDTSSEEDEDEEEDYDDDE EEDKEKDGAEDGQVEEEPLNSEDDVSDEEGQELFDTENVVVCQYDKIHRSKNKWKFHLK DGIMNLNGRDYIFSKAIGDAEW Hordeum vulgare TFIIA-L Barley1_09796 MASGNVSTVYISVIDDVVAKVREDFITYGVGDAVLNELQALWEMKMLHCGAISGNIDRN RAPPASAGGAPGAGATPPVHDLNVPYEATSEEYATPTADMLFPPTPLQTPIQTPLPGID TGMYNIPTGPSDYAPSPISDMRNGMGMNGSDPKTGRPSPYMQPPSPWMNQRPLGVDVNV AYEESREDPDRLMQPQPLTKDFLMMSSGKRKRDEYPGQLPSGSFVPQQDGCADQVAEFV GSKDNAQQVWNSILNKQESVTKTLSIKESTIPPVLPQRDGIQDDYNDQFFFPGVPTEDY NTPGESSEYRTPTPAIATPKPRNDMAGGDDDDDDDDEPPLNEDDDDDDEIDDLQDGDEE PNTQHLVLAQFDKVTRTKNRWKCTLKDGIMHLNGRDVLFNKASGEFDF Oryza sativa TFIIA-L MASSNVSTVYISVIDDVISKVRDDFISYGVGDAVLNELQALWEMKMLHCGAISGTIDRS KAAPAPSAGTPGAGTTPPVHDLNVPYEATSEEYATPTADMLFPPTPLQTPIQTPLPGTD AGMYNIPTGPSDYAPSPISDVRNGMAMNGADPKTGRPSPYMPPPSPWMTQRPLGVDVNV AYVENREDPDRTGQPPQLTKDFLMMSSGKRKRDEYPGQLPSGSFVPQQDGSADQIVEFV VSKDNAQQLWSSIVNKQGTATKESSTKETIIAPTIPQRDGMDDYNDPFYFQGVPTEDYN TPGESSEYRAPTPAVGTPKPRNDVGDDDEPPLNEDDDDDDELDDLEQGEDEPNTQHLVL AQFDKVTRTKNRWKCTLKDGIMHLNGRDVLFNKVVNMIF Populus trichocarpa TFIIA-L1 MASSATSTVYTEVIEDVIDKVRDEFINNGGPGETVLSELQGLWEKKLMQAGVLSGPIVR SSANKQLVPGGLTPVHDLNVPYEGTEEYETPTAEILFPPTPLQTPMQTPLPGSAQTPLP 
GNVQTPLPGNVPTPLPGSVDNSSMYNISTGSSSDYPTPVSDAGGSTDVKAGRPSHFMQS PSPLMHQRPPLDVNVGKSYFYAPRRVHGQKDFFMSSGKRKRGDFAPKYNNGGFIPQQDG AVDSASEVSQVSQGNNPHGRCDTITTKNREILARVSRSYVKIPQVDGPIPDPYDDMLST PNIYNYQGVANEDYNIASTPAPNDLQASTPAVVSQNDDVDDDDDEPLNEDDDDDEDLDG VDQGEELNTQHLILAQFDKVTRTKSRWKCTLKDGVMHINNRDILFNKATGEFEF

165 Saccharomyces cerevisiae TOA1p NP_014837 MSNAEASRVYEIIVESVVNEVREDFENAGIDEQTLQDLKNIWQKKLTETKVTTFSWDNQ FNEGNINGVQNDLNFNLATPGVNSSEFNIKEENTGNEGLILPNINSNNNIPHSGETNIN TNTVEATNNSGATLNTNTSGNTNADVTSQPKIEVKPEIELTINNANITTVENIDDESEK KDDEEKEEDVEKTRKEKEQIEQVKLQAKKEKRSALLDTDEVGSELDDSDDDYLISEGEE DGPDENLMLCLYDKVTRTKARWKCSLKDGVVTINRNDYTFQKAQVEAEWV Solanum tuberosum TFIIA-L STtuc02-10-23.4519 MASSTTSNVYIHVIEDVISKVRDEFISNGGPGESILKELQALWEVKMMNAGAILGTIER NSAAKATPGGPITPVHDLNMPYEGNEEYETPTADILFPPTPLQTPLPGTAQTPLPGTVQ TPLPGTAQTPLPGTADSSMYNIPTGGTPFTPSDYSPLNDTGGATELKAGPGRPSPFMHP PSPWLNQRPPLDVNGAYVEGREEVGDRGGSQQPMTQDFFMNSAGKRKREDFPPQYHNGG YIPQQDGAADSIYDNLKSGEGSNIQLELVTVGPVQASAYRIPQFDGPIPDSYDDALSTP NIYYQGVVNEDYNIVNTPAPNDMQAPTPAPALQNDDIDDDDEPLNEDDDDDLDDVDQGE DLNTAHLVLAQFDKVTRTKSRWKCTLKDGIMHINNKDILFNKANGEFDF Zea mays TFIIA-L TC183075 MASSNVSTVYISVIDDVISKVREDFITYGVGDAVLNELQALWEMKMLHCGAISGNIDRT KAAAASVGGTTGTTAPVHDLNVPYEATSEEYATPTADMLFPPTPLQTPIQTPLPGTDTA MYNIPTGPSDYAPSPISDIRNGMTINGADPKAGHPSPYMPPPSPWMNQRPLGVDVNVAY VEGREDPDRGVQPQPLTQDFLTMSSGKRKRDEYPGQLPSGSFVPQQDGSADQIVEFVVS KENANQHWSSIINKLETPTKTVTPVIPQCDGIQDDYNDQFFFPGVPTEDYNTPGESAEY RAPTPAVGTPKPRNDAGDDNDDDDDDEEPPLNEDDDDDDDLDDLEEGEDEPNTQHLVLA QFDKVTRTKNRWKCTLKDGIMHLNGRDVLFNKATGEFDF TFIIB Family Sequences Arabidopsis thaliana BRF1 At2g45100 MVWCKHCGKNVPGIRPYDAALSCDLCGRILENFNFSTEVTFVKNAAGQSQASGNILKSV QSGMSSSRERIIRKATDELMNLRDALGIGDDRDDVIVMASNFFRIALDHNFTKGRSKEL VFSSCLYLTCRQFKLAVLLIDFSSYLRVSVYDLGSVYLQLCDMLYITENHNYEKLVDPS IFIPRFSNMLLKGAHNNKLVLTATHIIASMKRDWMQTGRKPSGICGAALYTAALSHGIK CSKTDIVNIVHICEATLTKRLIEFGDTEAASLTADELSKTEREKETAALRSKRKPNFYK EGVVLCMHQDCKPVDYGLCESCYDEFMTVSGGLEGGSDPPAFQRAEKERMEEKASSEEN DKQVNLDGHSDESSTLSDVDDRESDRFTVSQLDCYFRTPEEVRLVKIFFDHENPGYDEK EAAKKAAGLNACNNASNIFEASKAAAAKSRKEKRQQRAEEEKNAPPPATGIEAVDSMVK RKKFRDINCDYLEELFDASVEKSPKRSKTETVMEKKKKEEHEIVENEQEEEDYAAPYEQ DEEDYAAPYEMNTDKKFYESEVEEEEDGYDFGLY Arabidopsis thaliana BRF2 At3g09360 MVWCNHCVKNVPGIRPYDGALACNLCGRILENFHFSTEVTFVKNAAGQSQASGNIVRSV 
QSGITSSRERRFRIARDELMNLKDALGIGDERDDVIVIAAKFFEMAVEQNFTKGRRTEL VQASCLYLTCRELNIALLLIDFSSYLRVSVYELGSVYLQLCEMLYLVENRNYEKLVDPS IFMDRFSNSLLKGKNNKDVVATARDIIASMKRDWIQTGRKPSGICGAALYTAALSHGIK CSKTDIVNIVHICEATLTKRLIEFGDTDSGNLNVNELRERESHKRSFTMKPTSNKEAVL CMHQDSKPFGYGLCEDCYKDFINVSGGLVGGSNPPAFQRAEKERMEKAAREENEGGISS

166 LNHDEQLYHLRIYLGCVAEKGEKDKDGAEEHADTSDESDNFSDISDDEVNGYINNEEET HYKTITWTEMNKDYLEEQAAKEAALKAASEALKASNSNCPEDARKAFEAAKADAAKSRK EKQQKKAEEAKNAAPPATAVEAVRRTLDKKRLSSVINYDVLESLFDTSAPEKSPKRSKT ETDIEKKKEENKEMKSNEHENGENEDEDEEDEEEGNVESYDMKTDFQNGEKFYEEDEEE EEDGNDFGLY Arabidopsis thaliana BRF3 At2g01280 MDQNFTKGRRAELVQSSCLYLACRDMKISLLFIDFSSYLRVSVYELGSVYLQLCEMLYL VQNKNYEELVDPSIFIPRFTNSLLKGAHAKAKDVANTAKNIISSMKRDWIQTGRKPSGI CGAAIYMAALSHGIMYSRADIAKVVHMCEATITKRLNEFANTEAGSLTLLVGRILLLIS EQRKREWKKQLEKKTREELAANCPEDARNLVEASKAAVANSRKEKRRKRAEEAKNAPPS ATATEAVCRTLERKIKIYSIF Arabidopsis thaliana BRF4 At4g35540 MRCKRCNGSNFERDEDTGNSYCGGCGTLREYDNYEAQLGGIRGPQGTYIRVGTIGRGSV LDYKDKKIYEANNLIEETTERLNLGNKTEVIKSMISKLTDGEFGQGEWFPILIGACCYA VVREEGKGVLSMEEVAYEVGCDLHQLGPMIKRVVDHLDLELREFDLVGLFTKTVTNSPR LTDVDRDKKEKIIKQGTFLMNCALKWFLSTGRRPMPLVVAVLAFVVQVNGVKVKIDDLA KDASVSLTTCKTRYKELSEKLVKVAEEVGLPWAKDVTVKNVLKHSGTLFALMEAKSMKK RKQGTGKELVRTDGFCVEDLVMDCLSKESMYCYDDDARQDTMSRYFDVEGERQLSLCNY DDNISENQLSTKYNEFEDRVCGGTLAKRSQGSSQSMWQRRSVFGMVSTENWWKGKSELS KRLLLKDLLEKDVGLEALPPSYIKGCVAVERRREKIKAAKLRINAIQHPSDNVSEGALS LELEHSKKKRKKGSEIDWEDLVIQTLVLHNVNEEEIEKGHYKTLLDLHVFNSGEV Arabidopsis thaliana TFIIB1 At2g41630 MSDAYCTDCKKETELVVDHSAGDTLCSECGLVLESHSIDETSEWRTFANESSNSDPNRV GGPTNPLLADSALTTVIAKPNGSSGDFLSSSLGRWQNRNSNSDRGLIQAFKTIATMSER LGLVATIKDRANELYKRLEDQKSSRGRNQDALYAACLYIACRQEDKPRTIKEICVIANG ATKKEIGRAKDYIVKTLGLEPGQSVDLGTIHAGDFMRRFCSNLAMSNHAVKAAQEAVQK SEEFDIRRSPISIAAVVIYIITQLSDDKKTLKDISHATGVAEGTIRNSYKDLYPHLSKI APSWYAKEEDLKNLSSP Arabidopsis thaliana TFIIB2 At3g10330 MSDAFCSDCKRHTEVVFDHSAGDTVCSECGLVLESHSIDETSEWRTFANESGDNDPVRV GGPTNPLLADGGLTTVISKPNGSSGDFLSSSLGRWQNRGSNPDRGLIVAFKTIATMADR LGLVATIKDRANEIYKRVEDQKSSRGRNQDALLAACLYIACRQEDKPRTVKEICSVANG ATKKEIGRAKEYIVKQLGLETGQLVEMGTIHAGDFMRRFCSNLGMTNQTVKAAQESVQK SEEFDIRRSPISIAAAVIYIITQLSDEKKPLRDISVATGVAEGTIRNSYKDLYPHLSKI IPAWYAKEEDLKNLQSP Arabidopsis thaliana TFIIB3 At3g29380 MEEETCLDCKRPTIMVVDHSSGDTICSECGLVLEAHIIEYSQEWRTFASDDNHSDRDPN RVGAATNPFLKSGDLVTIIEKPKETASSVLSKDDISTLFRAHNQVKNHEEDLIKQAFEE 
IQRMTDALDLDIVINSRACEIVSKYDGHANTKLRRGKKLNAICAASVSTACRELQLSRT LKEIAEVANGVDKKDIRKESLVIKRVLESHQTSVSASQAIINTGELVRRFCSKLDISQR EIMAIPEAVEKAENFDIRRNPKSVLAAIIFMISHISQTNRKPIREIGIVAEVVENTIKN SVKDMYPYALKIIPNWYACESDIIKRLDGVITSWDSAKFSV

167 Arabidopsis thaliana TFIIB4 At3g57370 MTMKWGHSCRRCKQINVVTDHVTRRTRCFGCGLEFKYRPIGDLSPVAENDTVRLPDPTN TLLSNTDLSIVTTEHKNGSFDDSLSLNLGNSSKPRLDPVSIATAKLMNGSSNDFLSLGT SQNSETITASSDEFLFSDLGHLQKFSFDPLSMASTKPNKALSIVSIEAISNGLKLPATI KGQANEIFKVVESYARGKERNVLFAACIYIACRDNDMTRTMREISRFANKASISDISET VGFIAEKLEINKNWYMSIETANFIKRFCSIFRLDKEAVEAALEAAESYDYMTNGRRAPV SVAAGIVYVIARLSYEKHLLKGLIEATGVAENTIKGTYGDLYPNLPTIVPTWFANANDL KNLGAP Arabidopsis thaliana TFIIB5 At4g36650 MKCPYCSSAQGRCTTTSSGRSITECSSCGRVMEERQTQNHHLFHLRAQDTPLCLVTSDL QTAAQPSPEDEEDPFEPTGFITAFSTWSLEPSPIFARSSLSFSGHLAELERTLELASST SNSNSSTVVVDNLRAYMQIIDVASILGLDCDISEHAFQLFRDCCSATCLRNRSVEALAT ACLVQAIREAQEPRTLQEISIAANVQQKEIGKYIKILGEALQLSQPINSNSISVHMPRF CTLLQLNKSAQELATHIGEVVINKCFCTRRNPISISAAAIYLACQLEDKRKTQAEICKI TGLTEVTLRKVYKELLENWDDLLPSNYTPAVPPEKAFPTTTISTTRSTTPRAVDPPEPS FVEKDKPSAKPIETFDHTYQQPKGKEDKQPKFRQPWLFGTASVMNPAEMISEPAKPNAM DYEKQQLDKQQQQQLGDKETLPIYLRDHNPFPSNPSPSTGISTINWSFRPSVVPGSSSN LPVIHPPKLPPGYAEIRGSGSRNADNPHGDF Arabidopsis thaliana TFIIB6 At4g10680 MKEDGICLECKRPTETVVNYKNGDTICIECGHVIENNIIDDLDGASTNPNLKSGHLPTI IFKLSGKSSSLASKLRRTQNEMIKNKQEEDVIKIAYAEIERMTEALGLTFGISNTACKI LSKLDKKNLRGGKSLRGLCAASVSRACRQVNIPKTLKEISAVANVDMKEINKAVKLLGD SFG Citrus sinensis TFIIB CB292941 MTDAFCSDCKKHTEVVFDHSAGDTVCSECGLVLESHSIDETSEWRTFANESGDNDPVRV GGPTNPLLADGGLSTVIAKPNGASGEFLSSSLGRWQNRGSNPDRGLILAFKTIATMSDR LGLVATIKDRANEIYKKVEDQKSSRGRNQDALLAACLYIACRQEDKPRTVKEICSVANG ATKKEIGRAKEYIVKQLGLETGQSVEMGTIHAGDFMRRFCSNLGMNNQAVKAAQEAVQK SEEFDIRRSPISIAAAVIYIITQLSDDKKPLKDISVATGVAEGTIRNSYKDLYPHVSKI IPNWYAKEEDLKNLCSP Drosophila melanogaster BRF AAF72065 MSTGLKCRNCGSNEIEEDNARGDRVCMNCGSVLEDSLIVSEVQFEEVGHGAAAIGQFVS AESSGGATNYGYGKFQVGSGTESREVTIKKAKKDITLLCQQLQLSQHYADTALNFFKMA LGRHLTRGRKSTHIYAACVYMTCRTEGTSHLLIDISDVQQICSYELGRTYLKLSHALCI NIPSLDPCLYIMRFANRLQLGAKTHEVSMTALRIVQRMKKDCMHSGRRPTGLCGAALLI AARMHDFSRTMLDVIGVVKIHESTLRKRLSEFAETPSGGLTLEEFMTVDLEREQDPPSF KAARKKDRERIKDMGEHELTELQKEIDAHLEKDLGKYSNSVYRQLTKGKGLSPLSSPST PNSSSEKDIELEESRQFIEQSNAEVIKELIAKNEDVKKSEPGGLVAGIEGLRPDIEAIC 
RVTQSDLEDVEKAKQPQEQELITDDLNDDELDQYVLTEEESVAKLEMWKNLNAEYLQEQ KERDERLAKEREEGKPERKKRKPRKKVIGPSSTAGEAIEKMLQEKKISSKINYEILKTL TDGMGGLTDDSPTTSADTKPSTLEELKHQPVIVEEGPVPSKSRGNRAAYDLPGPSRKRP

168 KLEVGLPVSQAADVEQPETKPAVVVEADDLDEDADDPDVEPEAEPEATLQDMLNTGGDD DEFGYGFDEEEEY Drosophila melanogaster TFIIB NM_057540 MASTSRLDNNKVCCYAHPESPLIEDYRAGDMICSECGLVVGDRVIDVGSEWRTFSNEKS GVDPSRVGGPENPLLSGGDLSTIIGPGTGSASFDAFGAPKYQNRRTMSSSDRSLISAFK EISSMADRINLPKTIVDRANNLFKQVHDGKNLKGRSNDAKASACLYIACRQEGVPRTFK EICAVSKISKKEIGRCFKLTLKALETSVDLITTADFMCRFCANLDLPNMVQRAATHIAK KAVEMDIVPGRSPISVAAAAIYMASQASEHKRSQKEIGDIAGVADVTIRQSYKLMYPHA AKLFPEDFKFTTPIDQLPQM Glycine max TFIIB U31097 MSDAFCSDCKRQTEVVFDHSAGDTVCSECGLVLESHSIDETSEWRTFANESGDNDPNRV GGPSNPLLTDGGLSTVIAKPNGGGGGEFLSSSLGRWQNRGSNPDRALIQAFKTIATMSD RLGLVATIKDRANEIYKRVEDQKSSRGRNQDALLAACLYIACRQEDKPRTVKEICSVAN GATKKEIGRAKEYIVKQLGLENGNAVEMGTIHAGDFMRRFCSNLCMNNQAVKAAQEAVQ KSEEFDIRRSPISIAAAVIYIITQLSDDKKPLKDISLATGVAEGTIRNSYKDLYPHVSK IIPNWYAKEEDLKNLCSP Homo sapiens BRF NP_001510.2 MTGRVCRGCGGTDIELDAARGDAVCTACGSVLEDNIIVSEVQFVESSGGGSSAVGQFVS LDGAGKTPTLGGGFHVNLGKESRAQTLQNGRRHIHHLGNQLQLNQHCLDTAFNFFKMAV SRHLTRGRKMAHVIAACLYLVCRTEGTPHMLLDLSDLLQVNVYVLGKTFLLLARELCIN APAIDPCLYIPRFAHLLEFGEKNHEVSMTALRLLQRMKRDWMHTGRRPSGLCGAALLVA ARMHDFRRTVKEVISVVKVCESTLRKRLTEFEDTPTSQLTIDEFMKIDLEEECDPPSYT AGQRKLRMKQLEQVLSKKLEEVEGEISSYQDAIEIELENSRPKAKGGLASLAKDGSTED TASSLCGEEDTEDEELEAAASHLNKDLYRELLGGAPGSSEAAGSPEWGGRPPALGSLLD PLPTAASLGISDSIRECISSQSSDPKDASGDGELDLSGIDDLEIDRYILNESEARVKAE LWMRENAEYLREQREKEARIAKEKELGIYKEHKPKKSCKRREPIQASTAREAIEKMLEQ KKISSKINYSVLRGLSSAGGGSPHREDAQPEHSASARKLSRRRTPASRSGADPVTSVGK RLRPLVSTQPAKKVATGEALLPSSPTLGAEPARPQAVLVESGPVSYHADEEADEEEPDE EDGEPCVSALQMMGSNDYGCDGDEDDGY Homo sapiens TFIIB NM_001514 MASTSRLDALPRVTCPNHPDAILVEDYRAGDMICPECGLVVGDRVIDVGSEWRTFSNDK ATKDPSRVGDSQNPLLSDGDLSTMIGKGTGAASFDEFGNSKYQNRRTMSSSDRAMMNAF KEITTMADRINLPRNIVDRTNNLFKQVYEQKSLKGRANDAIASACLYIACRQEGVPRTF KEICAVSRISKKEIGRCFKLILKALETSVDLITTGDFMSRFCSNLCLPKQVQMAATHIA RKAVELDLVPGRSPISVAAAAIYMASQASAEKRTQKEIGDIAGVADVTIRQSYRLIYPR APDLFPTDFKFDTPVDKLPQL Lycopersicon esculentum TFIIB AF273333 MDRGKIPDLAARSNRIYLDLEDIIKENALPFLPAKSAVKFQAVCRDWRLQISAPLFAHK 
QSLSCNSTSGIFSQLNRGSPFLIPIDANSCGVPDPFLNFLPEPVDIKSSSNGLLCCRGR EGDKVYYICNPFTKQWKELPKSNAYHGSDPAIVLLFEPSLLNFVAEYKIICAFPSTDFD KATEFDIYYSREGCWKIAEEMCFGSRTIFPKSGIHVNGVVYWMTSKNILAFDLTKGRTQ LLESYGTRGFLGTFSGKLCKVDVSGDIISLNVLANTHSNTMQIGSQIKMWSEKEIVVLD

169 SEIVGDGAARNHTVLHVDSDIMVVLCGRRTCSYDFKSRLTKFLSSKVGILDRCFPYVNS LVSL Lycopersicon esculentum TFIIB TC124975 MDTYCSDCKRNTEVVFDHAAGDTVCSECGLVLESRSIDETSEWRTFADESGGDDPNRVG GPVNPLLGDAALSTVISKGPNGSNGDGSLARLQNRGGDPDRAIVLAFKAIATMADRLSL VSTIRDRASEIYKRLEDQKCTRGRNLDALVAACIYIACRQEGKPRTVKEICSIANGASK KEIGRAKEFIVKQLKVEMGESMEMGTIHAGDYLRRFCSNLGMNHEEIKVVQETVQKAEE FDIRRSPISIAAAIIYMITQLSDSKKPVLRDISVATTVAEGTIKNAYKDLYPHASKIIP EWYVKDKDLKSLCSPKA Lycopersicon esculentum TFIIB AAG01118 MRCPYCSAEQGRCTSSTSGRPITECTSCGRVVEERLTQSHHLFHTRAQDSPLCLATSDL PTLPISATNDDEDPFEPTGFITTFSTWSLEPYPVFAQSSISFAGHLAELERVLEMTSTS SSSSSSSVVVENLRAYLQIIDVASILRLDSDISDHAFQLFRDCSSATCLRNRSVEALAT AALVHAIREAQEPRTLQEISVAANLPQKEIGKYIKILGEALQLSQPINSNSISVHMPRF CTLLQLNKSAQELATHIGEVIINKCFCTRRNPISISAAAIYLACQLEDKRKTQAEICKV TGLTEVTLRKVYKELLENWDDLLPSSYKPVVPPEKAFPSATIATGRSSTPRVDIVEGTS SERDKPVKPVDSLDISPQIRGKEDSDSKDNINTTQLSWPPPFWKPQAPAEGGVKSATDK SQNATEEMEIDL Mesembryanthemum crystallinum TFIIB TC5895 MSDAFCSDCKKCTEVVFDHSAGDTVCSECGLVLESHSIDETSEWRTFANESNDNDPVRV GGPTNPLLSDGGLSTVISKPNGTTGDYLSSSLGRWQNRGANPDRGLILAFKTIATMADR LGLVATIKDRASEIYKKVEDQKSSRGRNQDAILAACLYIACRQEDKPRTVKEICSVANG ASKKEIGRAKEYIVKQLELEMGKSVTIGTIHAADFLRRFCSNLGMNNQAMKAAQEAVQK SEEIDIRRSPISIAAAVIYIITQLSEEKKPLRDISLATGVAEGTIRNAYKDLYPHISKI IPVWYATEDDLKTSAAHKVKQ Methanosarcina acetivorans TFB NP_615574.1 MVEVERVRYSDTLEREKIRAMIKARKEKQKEQSFENEKAVCPECGSRNLVHDYERAELV CGDCGLVIDADFVDEGPEWRAFDHDQRMKRSRVGAPMTYTIHDKGLSTMIDWRNRDSYG KSISSKNRAQLYRLRKWQRRIRVSNATERNLAFALSELDRMASALGLPRTVRETAAVVY RKAVDKNLIRGRSIEGVAAAALYAACRQCSVPRTLDEIEEVSRVSRKEIGRTYRFISRE LALKLMPTSPIDYVPRFCSGLNLKGEVQSKSVEILRQASEKELTSGRGPTGVAAAAIYI ASILCGERRTQREVADVAGVTEVTIRNRYKELAEELDIEIIL Medicago truncatula TC86832 MSDAFCSDCKRATEVVFDHSAGDAVCSECGLVLESHSIDETSEWRTFANESGDNDPVRV GGPSNPLLTDGGLSTVIAKPNGASGDFLSSSLGRWQNRGSNPDRGLILAFKTIGTMAER LGLVPTIKDRANEIYKRVEDQKSSRGRNQDALLAACLYIACRQEDKPRTVKEICSIANG ATKKEIGRAKEYIVKQLGLENGGQSVEMGTIHAGDFMRRFCSNLGMNHQAVKAAQESVQ KSEEFDIRRSPISIAAAVIYIITQLSDDKKPLKDISVATGVAEGTIRNSYKDLYPHVSK IIPNWYAKEEDLKNLCSP 
Oryza sativa TFIIB AF464908 MSDSFCPDCKKHTEVAFDHSAGDTVCTECGLVLEAHSVDETSEWRTFANESSDNDPVRV

170 GGPTNPLLTDGGLSTVIAKPNGAQGEFLSSSLGRWQNRGSNPDRSLILAFRTIANMADR LGLVATIKDRANEIYKKVEDLKSIRGRNQDAILAACLYIACRQEDRPRTVKEICSVANG ATKKEIGRAKEFIVKQLEVEMGQSMEMGTIHAGDFLRRFCSTLGMNNQAVKAAQEAVQR SEELDIRRSPISIAAAVIYMITQLSDDKKPLKDISLATGVAEGTIRNSYKDLYPYASRL IPNTYAKEEDLKNLCTP Populus trichocarpa TFIIB6 MVKTKTNPLVSRAKQETSNKYVYQTYPKVIDLLTHSNPFLLSFSLPPHPSSHGSKPKNK QEMGDAFCSDCKRHTEVVFDHSAGDTVCSECGLVLESHSIDETSEWRTFANESGDNDPV RVGGPTNPLLTDGGLSTVIAKPNGASGEFLSSSLGRWQNRGSNPDRGLITAFKTIATMS DRWVREYKLLDVEFGGF Populus trichocarpa TFIIB3 MNRTITNIKSASPSLSLTRKGLDAFLVWLFMRISPNVSFSFSGDAFFIILKDRANEIYK KVEDQKSSRGRNQDALLAACLYIACRQEDKPRTVKEICSVANGATKKEIGRAKEYIVKQ LGLEAGQSVEMGTIHAGDFMRRFCSNLGMSNHTVKAATEAVKTSEQFDIRRSPISIAAA VIYIITQLSDDKKPLRDISLATGVAEGTIRNSYKDLYPHVSKIIPAWYANEEDLKNLSS P Populus trichocarpa TFIIB5 MEDSYCPDCKRLTEIVFDHSAGDTICSECGLILEAHSVDETSEWRTFSNESSDHDPNRV GGPLNPLLADGGLSTTISKTNGGSNELLSCSLGKWQSRGANPDRNRIQAFKSIAAMADR FFLFYFLWEKNDVCQLWIRLLAIVCRLGKCMLNSCWLWNFIRQHHGKIK Populus trichocarpa TFIIB2 MARNGEIDDYRDYCKDCKANTYIVLDHCTGDTICSDCGLVLESCYIDEIAEWRTFNDDN NDKDPNRVGYNVNPLLSQGNLKTLISNNKGDHAIPRWQDGVSNSDRVLLQGFDIIEIIA NRLGLVRPIKDRAKEIFKKIEEQKTCVMRKRDSICAACLFISSRENKLPRTLNEISSVV YGVTKKEINKAVQSIKRHVELEDMGTLNPSELVRRFCSNLGMKNHAVKAVHEAVEKIQD VDIRRNPKSVLAAIIYTITQLSDEKKPLRDISLAADVAEGTIKKSFKDISPHVSRLVPK WYAREEDIRRIRIPRNCGAKQLN Populus trichocarpa TFIIB1 MVKTKTNPLVSRAKQETSNKYVYQTYPKVIDLLTHSNPFLLSFSLPPHPSSHGSKPKNK QEMGDAFCSDCKRHTEVVFDHSAGDTVCSECGLVLESHSIDETSEWRTFANESGDNDPV RVGGPTNPLLTDGGLSTVIAKPNGASGEFLSSSLGRWQNRGSNPDRGLITAFKTIATMS DRLGLVATIKDRANEIYKKVEDQKSSRGRNQDALLAACLYIACRQEDKPRTVKEICSVA NGATKKEIGRAKEYIVKQLGLETGQSVEMGTIHAGDFMRRFCSNLGMSNHTVKAATEAV KTSEQFDIRRSPISIAAAVIYIITQLSDDKKPLRDISLATGVAEGTIRNSYKDLYPHVS KIIPSWYASEEDLKNLCSP Populus trichocarpa TFIIB4 MGDAFCSDCKKHTEVVCDHSAGDTVCSECGLVLESHSIDETSEWRIFANESGDNDPVRV GGPTNPLLTDGGLSTVIAKPNGASGDFLSTSLGRWQNRGSNPDRGLILAFKTIATMSDR LGLVATIKVFILDVCTLLLPLMVSTVSLKHPLMNMALNLNYHVNKVCFLGISRVLPGNE PYILFHPDLSFSSILKMYRHILMKV

171 Populus trichocarpa TFIIB9 MSPALASSGASIELALYAGSKWFGSNRVFFPPPPPSLTALGCIGQAKKKWVSLSSPTWR QCVIKQSCELSKIRFLSPPFQETYTHPYQLRKKKKKTHQEKQNRKKYSANVLDHLTGDT ICIDCGLVLISYYVDEEPEWRTFGIEDNINEYDPNHLGSLSDPLLTHANLATTISKPAK GGTTAVAISKNWLINRQSNPDGDLIQGFEIIETMARRRNREAMPAACLFISCKENKLPR TLKETCSAASCNGGGGGGLTMKEACTIGGYDRRDHES Populus trichocarpa TFIIB7/pBrp MKCPYCSATQGRCATTTTTNRCITECTSCGRVVEERQFHPHHLFHLRAQDTPLCLVTSD LPTLHHHHQNEEDPFEPTGFITSFSTWSLEPNPVSLRSSLSFSGHLAELERTIELSAST PASSSNVVVDNLRAYMQIIDVASILGLDCDISDHAFQLFRDCCSATCLRNRSVEALATA ALVQAIREAQEPRTLQVGTLVNGEEISIAANVPQKEIGKYIKILGEALQLSQPINSNSI SVHMPRFCTLLQLNKSAQELATHIGEVVINKCFCTRRNPISISAAAIYLACQLEDKRKT QAEICKVTGLTEVTLRKVYKELLENWGDLLPKNYTPAVPPEKAFPTTTITSGRSSAPKI DPVELVSSSSEKDKQLESKSNKPSELARGKEDAENNGNSRGIQPPPWQNFRQPWLQFVT SGVRMVGDTNQNLARVDINESQPRRQEFEEKADKQKMDKDPTASAWPNQLSSSPASGAS TISWPFRSPTLSGPSPIVQPPPKLTPGYAELKGIGSQNGGKTGNNSGDNK Populus trichocarpa TFIIB8 MENNLMFVWRSPISIAAAVIYIIIQLSDDKKPLKDISVVTQVAEGTIKNSYKDLSPHLS QIIPSWFAKEEDIKNLHSKHTNLDEGINICLRLKEAPPHNNEQYTATFLLLFVTNELKR DGGGKVLLLLNLCMDRKILESGKQPSEAKLYALSVPTDTHRPTMLPE Saccharomyces cerevisiae BRF NP_011762.1 MPVCKNCHGTEFERDLSNANNDLVCKACGVVSEDNPIVSEVTFGETSAGAAVVQGSFIG AGQSHAAFGGSSALESREATLNNARRKLRAVSYALHIPEYITDAAFQWYKLALANNFVQ GRRSQNVIASCLYVACRKEKTHHMLIDFSSRLQVSVYSIGATFLKMVKKLHITELPLAD PSLFIQHFAEKLDLADKKIKVVKDAVKLAQRMSKDWMFEGRRPAGIAGACILLACRMNN LRRTHTEIVAVSHVAEETLQQRLNEFKNTKAAKLSVQKFRENDVEDGEARPPSFVKNRK KERKIKDSLDKEEMFQTSEEALNKNPILTQVLGEQELSSKEVLFYLKQFSERRARVVER IKATNGIDGENIYHEGSENETRKRKLSEVSIQNEHVEGEDKETEGTEEKVKKVKTKTSE EKKENESGHFQDAIDGYSLETDPYCPRNLHLLPTTDTYLSKVSDDPDNLEDVDDEELNA HLLNEEASKLKERIWIGLNADFLLEQESKRLKQEADIATGNTSVKKKRTRRRNNTRSDE PTKTVDAAAAIGLMSDLQDKSGLHAALKAAEESGDFTTADSVKNMLQKASFSKKINYDA IDGLFR Saccharomyces cerevisiae TFIIB_M81380 MMTRESIDKRAGRRGPNLNIVLTCPECKVYPPKIVERFSEGDVVCALCGLVLSDKLVDT RSEWRTFSNDDHNGDDPSRVGEASNPLLDGNNLSTRIGKGETTDMRFTKELNKAQGKNV MDKKDNEVQAAFAKITMLCDAAELPKIVKDCAKEAYKLCHDEKTLKGKSMESIMAASIL IGCRRAEVARTFKEIQSLIHVKTKEFGKTLNIMKNILRGKSEDGFLKIDTDNMSGAQNL 
TYIPRFCSHLGLPMQVTTSAEYTAKKCKEIKEIAGKSPITIAVVSIYLNILLFQIPITA AKVGQTLQVTEGTIKSGYKILYEHRDKLVDPQLIANGVVSLDNLPGVEKK

172 Solanum tuberosum TFIIB TC58701 MDTYCSDCKRNTEVVFDHAAGDTVCSECGLVLESRSIDETSEWRTFADESGGDDPNRVG GPVNPLLGDAALSTVISKGPNGSNGDGSLARLQNRGGDPDRAIVLAFKAIATMADRLSL VSTIRDRASEIYKRLEDQKCTRGRNLDALVAACIYIACRQEGKPRTVKEICSIANGASK KEIGRAKEFIVKQLKVEMGESMEMGTIHAGDYLRRFCSNLGMNHEEIKVVQETVQKAEE FDIRRSPISIAAAIIYMITQLSDSKKPVLRADISVATTVAEGTIKNAYKDLYPHASKII PEWYVKDKDLKNLCSPKA Sulfolobus solfataricus TFB AAK40772.1 MLYLSEENKSVSTPCPPDKIIFDAERGEYICSETGEVLEDKIIDQGPEWRAFTPEEKEK RSRVGGPLNNTIHDRGLSTLIDWKDKDAMGRTLDPKRRLEALRWRKWQIRARIQSSIDR NLAQAMNELERIGNLLNLPKSVKDEAALIYRKAVEKGLVRGRSIESVVAAAIYAACRRM KLARTLDEIAQYTKANRKEVARCYRLLLRELDVSVPVSDPKDYVTRIANLLGLSGAVMK TAAEIIDKAKGSGLTAGKDPAGLAAAAIYIASLLHDERRTQKEIAQVAGVTEVTVRNRY KELTQELKISIPTQ Triticum aestivum TFIIB TC68795 MGDSYCQDCKKHTEVAFDHSAGDTVCTECGLVLEAHSVDETSEWRTFANESNDNDPVRV GGPSNPLLTDGGLSTVIAKPNGAHGDFLSSSLGRWQNRGSNPDRSLILAFRTIANMADR LGLVATIKDRANEIYKKVEDLKSIRGRNQDAILAACLYIACRQEDRPRTVKEICSVANG ATKKEIGRAKEFIVKQLEVEMGQSMEMGTIHAGDFLRRFCSTLGMNNTAVKAAQEAVQR SEELDIRRSPISIAAAVIYMITQLSEDKKPLKDISLATGVAEGTIRNSYKDLYPYAARL IPNSYAKEEDLKNLCTP Vitis vinifera TFIIB TC19782 MADAFCTDCKKNTEVVFDHSAGDTVCSECGLVLESHSIDETSEWRTFANESGDNDPVRV GGPSNPLLTDGGLSTVIAKPNGVSGDFLSSSLGRWQNRGSNPDRGLILAFKTIATMSDR LGLVATIKDRANEIYKKVEDQKSTRGRNQDALLAACLYIACRQEDKPRTVKEICSVANG ATKKEIGRAKEYIVKQLEAEKGQSVEMGTIHAGDFMRRFCSNLGMTNQVVKAAQEAVQK SEEFDIRRSPVSIAAAVIYIITQLSDEKKLLRDISIATGVAEGTIRNSYKDLYPHISRI IPSWYAKEEDLRNLCSP TATA Binding Protein Sequences Arabidopsis thaliana TBP1 At3g13445 MTDQGLEGSNPVDLSKHPSGIVPTLQNIVSTVNLDCKLDLKAIALQARNAEYNPKRFAA VIMRIREPKTTALIFASGKMVCTGAKSEDFSKMAARKYARIVQKLGFPAKFKDFKIQNI VGSCDVKFPIRLEGLAYSHAAFSSYEPELFPGLIYRMKVPKIVLLIFVSGKIVITGAKM RDETYKAFENIYPVLSEFRKIQQ Arabidopsis thaliana TBP2 At1g55520 MADQGTEGSQPVDLTKHPSGIVPTLQNIVSTVNLDCKLDLKAIALQARNAEYNPKRFAA VIMRIREPKTTALIFASGKMVCTGAKSEHLSKLAARKYARIVQKLGFPAKFKDFKIQNI VGSCDVKFPIRLEGLAYSHSAFSSYEPELFPGLIYRMKLPKIVLLIFVSGKIVITGAKM REETYTAFENIYPVLREFRKVQQ

173 Cenarchaeum symbiosum TBP AAC62688 MLDPRTRPRVVNVVSTSDLVQRVSAKKMAAMPCCMYDEAVYGGRCGYIKTPGMQGRVTV FISGKMISVGARSVRASFGQLHEARLHLVRNGAAGDCKIRPVVRNIVATVDAGRNVPID RISSRMPGAVYDPGSFPGMILKGLDSCSFLVFASGKMVIAGAKSPDELRRSSFDLLTRL NNAGA Chlamydomonas reinhardtii TBP TC24902 MMAAAEAPPATPQLSAADVEAEMAAHVSGIKPQLQNVVATVNLGTKLDLKEIAMHARNA EYNPKRFAAVIMRIREPKTTALIFASGKMVCTGAKSEDDSRTAARRYAKIVQKLGFPAT FKEFKIQNIVGSCDVKFPIRLEGLAYAHSLFASYEPELFPGLIYRMKQPKIVLLIFVSG KVVLTGAKTRGEIYQAYMNIYPTLIQYKKGDAVVPTLPNNLMGPPRALPAAKQGGQADV GEPQQAQEQDGAGPSGVRGAGASAGAAAVPAAGSGWHDAAAASGGDASAGPEPAAADTH APPPAAAHASAPGAGGYTQPEPPPAAALTAAVRRCKAGAQRRAELDARRMAGGSAGGGG GQCLVASAYHHVASVYHSSMHFGGMVLVGNGRRRVGVAXRQLDVTLKIDRSVYAVGR Glycine max TBP TC146463 MADQGLEGSQPVDLQKHPSGIVPTLQNIVSTVNLDCKLDLKTIALQARNAEYNPKRFAA VIMRIRDPKTTALIFASGKMVCTGAKSEQQSKLAARKYARIIQKLGFPAKFKDFKIQNI VGSCDVKFPIRLEGLAYSHGAFSSYEPELFPGLIYRMKQPKIVLLIFVSGKIVLTGAKV RDETYTAFENIYPVLTEFRKNQQ Hordeum vulgare TBP TC78738 MAEAALEGSQPVDLSKHPSGIVPTLQNIVSTVNLDCKLDLKAIALQARNAEYNPKRFAA VIMRIREPKTTALIFASGKMVCTGAKSEQQSKLAARKYARIIQKLGFPAKFKDFKIQNI VASCDVKFPIRLEGLAYSHGAFSSYEPELFPGLIYRMRQPKIVLLIFVSGKIVLTGAKV REETYTAFENIYPVLTEFRKVQQ Medicago truncatula TBP TC86874 MADQGLEGSQPVDLSKHPSGIVPTLQNIVSTVNLDCKLELKSIALQARNAEYNPKRFAA VIMRIREPKTTALIFASGKMVCTGAKSEVQSKLAARKYARIIQKLGFPAKFKDFKIQNI VGSCDVKFPIRLEGLAYSHGAFSSVKYDTKLLLSISPGEFEEIMYHYYQSHMTLALFPS IIYLKSSILQKTGTSLETLVSKVCPMENKTDTNQS Medicago truncatula TBP TC88717 MADQGLEGSQPVDLAKHPSGIVPTLQNIVSTVNLDTKLDLKAIALQARNAEYNPKRFAA VIMRIREPKTTALIFASGKMVCTGAKSEQQSKLAARKYARIIQKLGFNAKFKDFKIQNI VGSCDVKFPIRLEGLAYSHGAFSSVSYFIYSYLHTSSSLDVICIGISMAKRFRFFLKI Mesembryanthemum crystallinum TBP TC7116 MAEQGLEGSQPVDPIKHPSGIVPTLQNIVSTVNLDCKLDLKAIALQARNAEYNPKRFAA VIMRIREPKTTALIFASGKMVCTGAKSEQQSKLAARKYARIIQKLGFPAKFKDFKIQNI VGSCDVKFPIRLEGLAYSHGAFSSYEPELFPGLIYRMKQPKIVLLIFVSGKIVLTGAKV REETYTAFENIYPVLTEFRKNQQ Oryza sativa TBP TC116362 MAAEAAAALEGSEPVDLAKHPSGIIPTLQNIVSTVNLDCKLDLKAIALQARNAEYNPKR FAAVIMRIREPKTTALIFASGKMVCTGAKSEQQSKLAARKYARIIQKLGFPAKFKDFKI

174 QNIVGSCDVKFPIRLEGLAYSHGAFSSYEPELFPGLIYRMKQPKIVLLIFVSGKIVLTG AKVRDETYTAFENIYPVLTEFRKVQQ Populus trichocarpa TBP1 MAEQGGLEGSQPVDLSKHPSGIVPTLQNIVSTVNLDCKLELKQIALQARNAEYNPKRFA AVIMRIREPKTTALIFASGKMVCTGAKSEQQSKLAARKYARIIQKLGFAAKFKDFKIQN IVGSCDVKFPIRLEGLAYSHGAFSSYEPELFPGLIYRMKQPKIVLLIFVSGKIVITGAK VREETYTAFENIYPVLAEFRKVQQWYTSQSLCPAL Populus trichocarpa TBP2 MAEQGGLEGSQPVDLSKHPSGIVPILQNIVSTVNLDCRLDLKQIALQARNAEYNPKRFA AVIMRIREPKTTALIFASGKMVCTGAKSEQQSKLAARKYARIIQKLGFAAKFKDFKIQN IVGSCDVKFPIRLEGLAYSHGAFSSYEPEIFPGLIYRMKQPKIVLLIFVSGKIVITGAK VRDETYTAFGNIYPVLTEFRKVQQW Pyrococcus woesei TBP AAA73447 MVDMSKVKLRIENIVASVDLFAQLDLEKVLDLCPNSKYNPEEFPGIICHLDDPKVALLI FSSGKLVVTGAKSVQDIERAVAKLAQKLKSIGVKFKRAPQIDVQNMVFSGDIGREFNLD VVALTLPNCEYEPEQFPGVIYRVKEPKSVILLFSSGKIVCSGAKSEADAWEAVRKLLRE LDKYGLLEEEEEEL Solanum tuberosum TBP TC74102 MADQGLEGSQPVDLTKHPSGIVPTLQNIVSTVNLDCKLDLKAIALQARNAEYNPKRFAA VIMRIREPKTTALIFASGKMVCTGAKSEQQSKLAARKYARIIQKLGFPAKFKDFKIQNI VGSCDVKFPIRLEGLAYAHGAFSSYEPELFPGLIYRMKQPKIVLLIFVSGKIVITGAKV RDETYTAFENIYPVLTEFRKNQQ Sorghum bicolor TBP TC54739 MAEPGLEGSQPVDLSKHPSGIVPTLHFPVLGASKRANIVLNSWGFGGNYLVVILSPRVD FRNIVSTVNLDCKLDLKAIALQARNAEYNPKRFAAVIMRIREPKTTALIFASGKMYARI IQKLGFPAKFKDFKIQNIVGSCDVKFPIRLEGLAYSHGAFSSYEPELFPGLIYRMKQPK IVLLIFVSGKIVLTGAKVREETYTAFENIYPVLTEFRKVQQ Triticum aestivum TBP TC72701 MAEATLEGSEPVDLSKHPSGIIPTLQNIVSTVNLDCKLDLKAIALQARNAEYNPKRFAA VIMRIREPKTTALIFASGKMVCTGAKSEQQSKLAARKYARIIQKLGFPAKFKDFKIQNI VGSCDVKFPIRLEGLAYSHGAFSSYEPELFPGLIYRMKQPKIVLLIFVSGKIVLTGAKV REETYTAFENIYPVLTEFRKVQQ Triticum aestivum TBP TC88519 MAEAAALEGSEPVDLTKHPSGIIPTLQNIVSTVNLDCKLDLKAIALQARNAEYNPKRFA AVIMRIREPKTTALIFASGKMVCTGAKSEQQSKLAARKYARIIQKLGFPAKFKDFKIQN IVASCDVKFPIRLEGLAYSHGAFSSYEPELFPGLIYRMRQPKIVLLIFVSGKIVLTGAK VREETYTAFENIYPVLTEFRKVQQ Triticum aestivum TBP TC90291 MAAAAVDPMVLGLGTSGGASGSGVVGGGVGRAGGGGAVMEGAQPVDLARHPSGIVPVLQ

175 NIVSTVNLDCRLDLKQIALQARNAEYNPKRFAAVIMRIRDPKTTALIFASGKMVCTGAK SEEHSKLAARKYARIVQKLGFPATFKDFKIQNIVASCDVKFPIRLEGLAYSHGAFSSYE PELFPGLIYRMKQPKIVLLVFVSGKIVLTGAKVRDEIYAAFENIYPVLTEYRKSQQ Zea mays TBP TC182979 MAEPGLEGSQPVDLSKHPSGIVPTLQNIVSTVNLDCKLDLKAIALQARNAEYNPKRFAA VIMRIREPKTTALIFASGKMVCTGAKSEQQSKLAARKYARIIQKLGFPAKFKDFKIQNI VGSCDVKFPIRLEGLAYSHGAFSSYEPELFPGLIYRMKQPKIVLLIFVSGKIVLTGAKV REETYTAFENIYPVLSEFRKIQQ Zea mays TBP TC171023 MAEPGLEDSQPVDLSKHPSGIVPTLQNIVSTVNLDCKLDLKAIALQARNAEYNPKRFAA VIMRIREPKTTALIFASGKMVCTGAKSEQQSKLAARKYARIIQKLGFPAKFKDFKIQNI VGSCDVKFPIRLEGLAYSHGAFSSYEPELFPGLIYRMKQPKIVLLIFVSGKIVLTGAKV REETYTAFENIYPVLAEFRKVQQ Zea mays TBP X90652.1 MAEPRLEDSQPVDLSKHPSGIVPTLQNIVSTVNLDCKLDLKAIALQARNAEYNPKRFAA VIMRIREPKTTALIFASGKMVCTGAKSEQQSKLAARKYARIIQKLGFPAKFKDFKIQNI VGSCDVKFPIRLEGLAYSHGAFSSYEPELFPGLIYRMKQPKIVLLIFVSGKIVLTGAKV REETYTAFENIYPVLAEFRKVQQWYVVLFYHVSIIVRS Saccharomyces cerevisiae TBP NP_011075.1 MADEERLKEFKEANKIVFDPNTRQVWENQNRDGTKPATTFQSEEDIKRAAPESEKDTSA TSGIVPTLQNIVATVTLGCRLDLKTVALHARNAEYNPKRFAAVIMRIREPKTTALIFAS GKMVVTGAKSEDDSKLASRKYARIIQKIGFAAKFTDFKIQNIVGSCDVKFPIRLEGLAF SHGTFSSYEPELFPGLIYRMVKPKIVLLIFVSGKIVLTGAKQREEIYQAFEAIYPVLSE FRKM Homo sapiens TBP_L1 NP_004856 MDADSDVALDILITNVVCVFRTRCHLNLRKIALEGANVIYKRDVGKVLMKLRKPRITAT IWSSGKIICTGATSEEEAKFGARRLARSLQKLGFQVIFTDFKVVNVLAVCNMPFEIRLP EFTKNNRPHASYEPELHPAVCYRIKSLRATLQIFSTGSITVTGPNVKAVATAVEQIYPF VFESRKEIL Homo sapiens TBP NP_003185.1 MDQNNSLPPYAQGLASPQGAMTPGIPIFSPMMPYGTGLTPQPIQNTNSLSILEEQQRQQ QQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQAVAAAAVQQSTSQQATQGTSGQA PQLFHSQTLTTAPLPGTTPLYPSPMTPMTPITPATPASESSGIVPQLQNIVSTVNLGCK LDLKTIALRARNAEYNPKRFAAVIMRIREPRTTALIFSSGKMVCTGAKSEEQSRLAARK YARVVQKLGFPAKFLDFKIQNMVGSCDVKFPIRLEGLVLTHQQFSSYEPELFPGLIYRM IKPRIVLLIFVSGKVVLTGAKVRAEIYEAFENIYPILKGFRKTT Drosophila melanogaster TRF_Q27896 MQFHFKVADAERDRDNVAATSNAAANPHAALQPQQPVALVEPKDAQHEIRLQNIVATFS VNCELDLKAINSRTRNSEYSPKRFRGVIMRMHSPRCTALIFRTGKVICTGARNEIEADI

176 GSRKFARILQKLGFPVKFMEYKLQNIVATVDLRFPIRLENLNHVHGQFSSYEPEMFPGL IYRMVKPRIVLLIFVNGKVVFTGAKSRKDIMDCLEAISPILLSFRKT Drosophila melanogaster TBP NP_523805 MDQMLSPNFSIPSIGTPLHQMEADQQIVANPVYHPPAVSQPDSLMPAPGSSSVQHQQQQ QQSDASGGSGLFGHEPSLPLAHKQMQSYQPSASYQQQQQQQQLQSQAPGGGGSTPQSMM QPQTPQSMMAHMMPMSERSVGGSGAGGGGDALSNIHQTMGPSTPMTPATPGSADPGIVP QLQNIVSTVNLCCKLDLKKIALHARNAEYNPKRFAAVIMRIREPRTTALIFSSGKMVCT GAKSEDDSRLAARKYARIIQKLGFPAKFLDFKIQNMVGSCDVKFPIRLEGLVLTHCNFS SYEPELFPGLIYRMVRPRIVLLIFVSGKVVLTGAKVRQEIYDAFDKIFPILKKFKKQS Drosophila melanogaster TRF2 AAD28784 MQNDMVSIPVANLNGGLKAASSGSGVGVVTPGGVVSSAVLANAPRVYLTPSSTFMTNRQ MAGVASTGRMSGQVVGGSSGTASTAGTVRHFSQFSKMQTAGGPSLQRKLANGDTIVLAT GSKNMFLTSSENKANLPTVASNGNGLITAKMDLLEEEVMQSITVIDDDDEEKKEVAEDE EESSNNAKPIDLHQPIADNEHELDIVINNVVCSFSVGCHLKLREIALQGSNVEYRRENG MVTMKLRHPYTTASIWSSGRITCTGATSESMAKVAARRYARCLGKLGFPTRFLNFRIVN VLGTCSMPWAIKIVNFSERHRENASYEPELHPGVTYKMRDPDPKATLKIFSTGSVTVTA ASVNHVESAIQHIYPLVFDFRKQRSAEELQHLRQKQRLQAGGDPHELEKNVLADNKTAS LDNIFVNTTAAHSKSSSNDQTSAPATILSSTVDSMPRLKQMVNYHQMMKQTQEERRHIM FNGEKANPASTSSAAAAPSTSSSSSSSGDNICANARRRATECWATKLQNKRPRYNDPGT TGTINAASSTASAATSSLASQATHLRNPLKTAALANARMLGAKVTTCTRNSIIVQQPQR IQMQQQQQQLQPQQQQTSFSPSEFDVDDLIEEEENNELDMPF TAF6 Sequences Arabidopsis thaliana TAF6 At1g04950 MSIVPKETVEVIAQSIGITNLLPEAALMLAPDVEYRVREIMQEAIKCMRHSKRTTLTAS DVDGALNLRNVEPIYGFASGGPFRFRKAIGHRDLFYTDDREVDFKDVIEAPLPKAPLDT EIVCHWLAIEGVQPAIPENAPLEVIRAPAETKIHEQKDGPLIDVRLPVKHVLSRELQLY FQKIAELAMSKSNPPLYKEALVSLASDSGLHPLVPYFTNFIADEVSNGLNDFRLLFNLM HIVRSLLQNPHIHIEPYLHQLMPSVVTCLVSRKLGNRFADNHWELRDFAANLVSLICKR YGTVYITLQSRLTRTLVNALLDPKKALTQHYGAIQGLAALGHTVVRLLILSNLEPYLSL LEPELNAEKQKNQMKIYEAWRVYGALLRAAGLCIHGRLKIFPPLPSPSPSFLHKGKGKG KIISTDPHKRKLSVDSSENQSPQKRLITMDGPDGVHSQDQSGSAPMQVDNPVENDNPPQ NSVQPSSSEQASDANESESRNGKVKESGRSRAITMKAILDQIWKDDLDSGRLLVKLHEL YGDRILPFIPSTEMSVFL Arabidopsis thaliana TAF6b-1 At1g54360 MVTKESIEVIAQSIGLSTLSPDVSAALAPDVEYRVREVMQEAIKCMRHARRTTLMAHDV DSALHFRNLEPTSGSKSMRFKRAPENRDLYFFDDKDVELKNVIEAPLPNAPPDASVFSH 
WLAIDGIQPSIPQNSPLQAISDLKRSEYKDDGLAARQVLSKDLQIYFDKVTEWALTQSG STLFRQALASLEIDPGLHPLVPFFTSFIAEEIVKNMDNYPILLALMRLARSLLHNPHVH IEPYLHQLMPSIITCLIAKRLGRRSSDNHWDLRNFTASTVASTCKRFGHVYHNLLPRVT RSLLHTFLDPTKALPQHYGAIQGMVALGLNMVRFLVLPNLGPYLLLLLPEMGLEKQKEE AKRHGAWLVYGALMVAAGRCLYERLKTSETLLSPPTSSVWKTNGKLTSPRQSKRKASSD

177 NLTHQPPLKKIAVGGIIQMSSTQMQMRGTTTVPQQSHTDADARHHNSPSTIAPKTSAAA GTDVDNYLFPLFEYFGESMLMFTPTHELSFFL Arabidopsis thaliana TAF6b-2 At1g54360 MVTKESIEVIAQSIGLSTLSPDVSAALAPDVEYRVREVMQEAIKCMRHARRTTLMAHDV DSALHFRNLEPTSGSKSMRFKRAPENRDLYFFDDKDVELKNVIEAPLPNAPPDASVFSH WLAIDGIQPSIPQNSPLQAISDLKRSEYKDDGLAARQIYFDKVTEWALTQSGSTLFRQA LASLEIDPGLHPLVPFFTSFIAEEIVKNMDNYPILLALMRLARSLLHNPHVHIEPYLHQ LMPSIITCLIAKRLGRRSSDNHWDLRNFTASTVASTCKRFGHVYHNLLPRVTRSLLHTF LDPTKALPQHYGAIQGMVALGLNMVRFLVLPNLGPYLLLLLPEMGLEKQKEEAKRHGAW LVYGALMVAAGRCLYERLKTSETLLSPPTSSVWKTNGKLTSPRQSKRKASSDNLTHQPP LKKIAVGGIIQMSSTQMQMRGTTTVPQQSHTDADARHHNSPSTIAPKTSAAAGTDVDNY LFPLFEYFGESMLMFTPTHELSFFL Arabidopsis thaliana TAF6b-4 At1g54360 MVTKESIEVIAQSIGLSTLSPDVSAALAPDVEYRVREVMQEAIKCMRHARRTTLMAHDV DSALHFRNLEVSSSSLLLLFHTVDPDFDFFLYSLPLAPKVCGSRELLRTEIYTSSMTKM SSSRMLSKLLYQMHLLMHLFSLIGWQLMVFNLPFHRILLSKPYLTLNDRNIRTMAWLLH RCFLRTFRFTLTKSRSGL Arabidopsis thaliana TAF6b-3 At1g54360 MVTKESIEVIAQSIGLSTLSPDVSAALAPDVDSALHFRNLEPTSGSKSMRFKRAPENRD LYFFDDKDVELKNVIEAPLPNAPPDASVFSHWLAIDGIQPSIPQNSPLQAISDLKRSEY KDDGLAARQVLSKDLQIYFDKVTEWALTQSGSTLFRQALASLEIDPGLHPLVPFFTSFI AEEIVKNMDNYPILLALMRLARSLLHNPHVHIEPYLHQLMPSIITCLIAKRLGRRSSDN HWDLRNFTASTVASTCKRFGHVYHNLLPRVTRSLLHTFLDPTKALPQHYGAIQGMVALG LNMVRFLVLPNLGPYLLLLLPEMGLEKQKEEAKRHGAWLVYGALMVAAGRCLYERLKTS ETLLSPPTSSVWKTNGKLTSPRQSKRKASSDNLTHQPPLKKIAVGGIIQMSSTQMQMRG TTTVPQQSHTDADARHHNSPSTIAPKTSAAAGTDVDNYLFPLFEYFGESMLMFTPTHEL SFFL Drosophila melanogaster TAF6 NP_524161 MSGKPSKPSSPSSSMLYGSSISAESMKVIAESIGVGSLSDDAAKELAEDVSIKLKRIVQ DAAKFMNHAKRQKLSVRDIDMSLKVRNVEPQYGFVAKDFIPFRFASGGGRELHFTEDKE IDLGEITSTNSVKIPLDLTLRSHWFVVEGVQPTVPENPPPLSKDSQLLDSVNPVIKMDQ GLNKDAAGKPTTGKIHKLKNVETIHVKQLATHELSVEQQLYYKEITEACVGSDEPRRGE ALQSLGSDPGLHEMLPRMCTFIAEGVKVNVVQNNLALLIYLMRMVRALLDNPSLFLEKY LHELIPSVMTCIVSKQLCMRPELDNHWALRDFASRLMAQICKNFNTLTNNLQTRVTRIF SKALQNDKTHLSSLYGSIAGLSELGGEVIKVFIIPRLKFISERIEPHLLGTSISNTDKT AAGHIRAMLQKCCPPILRQMRSAPDTAEDYKNDFGFLGPSLCQAVVKVRNAPASSIVTL SSNTINTAPITSAAQTATTIGRVSMPTTQRQGSPGVSSLPQIRAIQANQPAQKFVIVTQ 
NSPQQGQAKVVRRGSSPHSVVLSAASNAASASNSNSSSSGSLLAAAQRSSDNVCVIAGS EAPAVDGITVQSFRAS Homo sapiens TAF6 NP_647476 MAEEKKLKLSNTVLPSESMKVVAESMGIAQIQEETCQLLTDEVSYRIKEIAQDALKFMH MGKRQKLTTSDIDYALKLKNVEPLYGFHAQEFIPFRFASGGGRELYFYEEKEVDLSDII

178 NTPLPRVPLDVCLKAHWLSIEGCQPAIPENPPPAPKEQQKAEATEPLKSAKPGQEEDGP LKGKGQGATTADGKGKEKKAPPLLEGAPLRLKPRSIHELSVEQQLYYKEITEACVGSCE AKRAEALQSIATDPGLYQMLPRFSTFISEGVRVNVVQNNLALLIYLMRMVKALMDNPTL YLEKYVHELIPAVMTCIVSRQLCLRPDVDNHWALRDFAARLVAQICKHFSTTTNNIQSR ITKTFTKSWVDEKTPWTTRYGSIAGLAELGHDVIKTLILPRLQQEGERIRSVLDGPVLS NIDRIGADHVQSLLLKHCAPVLAKLRPPPDNQDAYRAEFGSLGPLLCSQVVKARAQAAL QAQQVNRTTLTITQPRPTLTLSQAPQPGPRTPGLLKVPGSIALPVQTLVSARAAAPPQP SPPPTKFIVMSSSSSAPSTQQVLSLSTSAPGSGSTTTSPVTTTVPSVQPIVKLVSTATT APPSTAPSGPGSVQKYIVVSLPPTGEGKGGPTSHPSPVPPPASSPSPLSGSALCGGKQE AGDSPPPAPGTPKANGSQPNSGSPQPAP Homo sapiens TAF6L NP_006464 MSEREERRFVEIPRESVRLMAESTGLELSDEVAALLAEDVCYRLREATQNSSQFMKHTK RRKLTVEDFNRALRWSSVEAVCGYGSQEALPMRPAREGELYFPEDREVNLVELALATNI PKGCAETAVRVHVSYLDGKGNLAPQGSVPSAVSSLTDDLLKYYHQVTRAVLGDDPQLMK VALQDLQTNSKIGALLPYFVYVVSGVKSVSHDLEQLHRLLQVARSLFRNPHLCLGPYVR CLVGSVLYCVLEPLAASINPLNDHWTLRDGAALLLSHIFWTHGDLVSGLYQHILLSLQK ILADPVRPLCCHYGAVVGLHALGWKAVERVLYPHLSTYWTNLQAVLDDYSVSNAQVKAD GHKVYGAILVAVERLLKMKAQAAEPNRGGPGGRGCRRLDDLPWDSLLFQESSSGGGAEP SFGSGLPLPPGGAGPEDPSLSVTLADIYRELYAFFGDSLATRFGTGQPAPTAPRPPGDK KEPAAAPDSVRKMPQLTASAIVSPHGDESPRGSGGGGPASASGPAASESRPLPRVHRAR GAPRQQGPGTGTRDVFQKSRFAPRGAPHFRFIIAGRQAGRRCRGRLFQTAFPAPYGPSP ASRYVQKLPMIGRTSRPARRWALSDYSLYLPL Hordeum vulgare TAF6 Barley1_10250 MSIVPKETIEVIAQSIGIPSLPADVSAALAPDVEYRLREIMQEAIKCMRHAKRTVLTAD DVDSALSLRNVEPVYGFASGDPLRFKRAVGHKDLFYIDDREVDFKEIIEAPLPKAPLDT AVVAHWLAIEGVQPAIPENPPIDAISAPTENKRTEQVKDDGLPVDIKLPVKHILSRELQ MYFDKIAELTMSRSSTPIFREALVSLSKDSGLHPLVPYFSYFIADEVTRSLADLPVLFA LMRVVQSLLRNPHIHIEPYLHQLMPSMITCIVAKRLGHRLSDNHWELRDFSANLVASVC RRYGHVYHNLQIRLTKTLVHAFLDPHKALTQHYGAVQGISALGPSAIRLLLLPNLQTYM QLLDPELQLEKQSNEMKRKEAWRVYGALLCAAGKCLYERLKLFPNLLCPSTRPLLRSNS RVATNNPNKRKSSTDLSASQPPLKKMASDVSMSPMGSAAPVAGNMAGSMDGFSAQLPNP GMMQASSSGQKVESMTAAGAIRRDQGSNHAQRVSAVLRQAWKEDQDAGHLLGSLHEVFG EAIFSFIQPPELSIFL Oryza sativa TAF6 BAB92191 MSIVPKETIEVIGQSVGIANLPADVSAALAPDVEYRLREIMQEAIKCMRHAKRTVLTAD DVDSALSLRNVEPVYGFASGDPLRFKRAVGHKDLFYIDDREVDFKEIIEAPLPKAPLDT 
AVVAHWLAIEGVQPAIPENPPVDAIVAPTENKRTEHGKDDGLPVDIKLPVKHVLSRELQ MYFDKIAELTMSRSETSVFREALVSLSRDSGLHPLVPYFSYFIADEVTRSLGDLPVLFA LMRVVQSLLHNPHIHIEPYLHQLMPSIITCMVAKRLGHRLSDNHWELRDFSANLVGSVC RRFGHAYHNIQTRVTRTLVQGFLDPQKSLTQHYGAIQGISALGPSAIRLLLLPNLETYM QLLEPELQLDKQKNEMKRKEAWRVYGALLCAAGKCLYDRLKLFPNLLSPSTRPLLRSNK RVVTNNPNKRKSSTDLSTSQPPLKKMTTDGAMNSMTSAPMPGTMDGFSTQLPNPSMTQT SSSGQLVESTASGVIRRDQGSNHTQRVSTVLRLAWKEDQNAGHLLSSLYEVFGEAIFSF VQPPEISFFL

179 Populus trichocarpa TAF6 MSIVAKETIEVIAQSIGISNLSEDVALTLAPDVEFRMRQIMQEAIKCMRHSKRTRLTTD DVDGALNLTNVEPIYGFASGGALQFKRAIGHRDLFYVDDKDIDFKDVIEAPLPKAPLDT AVVCHWLAIEGVQPAIPENAPLEVIAPPSDGKISEQNDEFPVDIKLPVKHVLSRELQLY FDKITDLTVRRSDSVLFKEALVSLATDSGLHPLIPYFTYFIADEVARGLNDYSLLFALM RVVWSLLQNPHIHIEPYIIVNVLSFVFRIMSSIDEYKIKVQSLKLRRRWISCQLHQLMP SVVTCLVARKLGNRFADNHWELRDFTANLVAPICKRVHGWQHSALILCKHSLTEYVPRV SWSGCCRFGHVYNSLQTRLTKTLLNALLDPKRSLTQHYGAIQGLAALGPNVVRLLLLPN LKPYLQLLEPEMLLEKQKNEMKRHEAWHVYGALLCAAGQSIYDRLKMFPALMSHPACAV LRTNEKVVTKRPGDFYDFSFQKLYHLNATVCDVSMPMYLWVESNLFPLIENYQDKRKAS MEHMEQPPPKKIATDGPVDMQVEPIAPVPLGDSKTGLSTSSEHTPNYSEAGSRNQKDKG DSQAIKTSAILSQVWKDDLNSGHLLVSLFELFGESILSFIPSPEMSLFL Populus trichocarpa TAF6b MSSSIVAKEAIEVIAQGIGITNLSPDVSLTLAPDVEYRLREIIQEAIKCMRHSRRTALT AHDVDTALILRNVEPIYGFGSGGDKVPLRFKRAAAAGHKDLYYIDDKDVNFKHVIEAPP PKPPLDTSLTSHWLAIEGVQPAIPENVPIEALGVISDGKKSDYKDDGLSIDVKLPVKDI LSRELQLYFEKVTELTARRSESAIFKQALVSLATDSGLHPLVPYFIQFIADEVSRNLNN FSLLLAVMRIARSLLQNPYIHIEPYLHQLMPSIITCLVAKRLGNRFSDNHWELRNFTAN LVASICKRFGHAYHNLQPRIIRTLVHAFLDPTKSLPQHYGSIQGLAALGPSVVRLLILP NLEPYLLLLEQEMLLEKQKNEIKRHEAWQRAAGLCMYDRLKMLPGLFIPPSRAIWKSNG RVMTAMPSMTCFNLSHWDTFIHASINPVTGYVYCLKIPVNACVEMGLYVGTSSFHYVHL TLYPACISCRSLLANQDKRKASTDNLMQQPLLKKIATDSAIGAMPMNSMPVEMQGAASG FPTAVGASSVSVSAISRQLSNENVPRREISGRGLKTSTVLAQAWKEDMDAGHLLASLFE LFSESMFSFTPKPELSFFL Saccharomyces cerevisiae TAF6 NP_011403 MSTQQQSYTIWSPQDTVKDVAESLGLENINDDVLKALAMDVEYRILEIIEQAVKFKRHS KRDVLTTDDVSKALRVLNVEPLYGYYDGSEVNKAVSFSKVNTSGGQSVYYLDEEEVDFD RLINEPLPQVPRLPTFTTHWLAVEGVQPAIIQNPNLNDIRVSQPPFIRGAIVTALNDNS LQTPVTSTTASASVTDTGASQHLSNVKPGQNTEVKPLVKHVLSKELQIYFNKVISTLTA KSQADEAAQHMKQAALTSLRTDSGLHQLVPYFIQFIAEQITQNLSDLQLLTTILEMIYS LLSNTSIFLDPYIHSLMPSILTLLLAKKLGGSPKDDSPQEIHEFLERTNALRDFAASLL DYVLKKFPQAYKSLKPRVTRTLLKTFLDINRVFGTYYGCLKGVSVLEGESIRFFLGNLN NWARLVFNESGITLDNIEEHLNDDSNPTRTKFTKEETQILVDTVISALLVLKKDLPDLY EGKGEKVTDEDKEKLLERCGVTIGFHILKRDDAKELISAIFFGE TAF9 Sequences Arabidopsis thaliana TAF9 At1g54140 
MAGEGEEDVPRDAKIVKSLLKSMGVEDYEPRVIHQFLELWYRYVVEVLTDAQVYSEHAS KPNIDCDDVKLAIQSKVNFSFSQPPPREVLLELAASRNKIPLPKSIAGPGVPLPPEQDT LLSPNYQLVIPKKSVSTEPEETEDDEEMTDPGQSSQEQQQQQQQTSDLPSQTPQRVSFP LSRRPK

180 Chlamydomonas reinhardtii TAF9 TC21330 MDAARGAGGAVSDGAQPQDVATMHALLRSMGVEEFEPRVVNQLMDFMYKYTTDVLLDAE VFSEHAGRQPGQVDASGVTMAIQSRTALYVQPPPQERVTELARQVNDTGTARPGHQAPA CRCRPRASR Drosophila melanogaster TAF9 A49067 MSAEKSDKAKISAQIKHVPKDAQVIMSILKELNVQEYEPRVVNQLLEFTFRYVTCILDD AKVYANHARKKTIDLDDVRLATEVTLDKSFTGPLERHVLAKVADVRNSMPLPPIKPHCG LRLPPDRYCLTGVNYKLRATNQPKKMTKSAVEGRPLKTVVKPVSSANGPKRPHSVVAKQ QVVTIPKPVIKFTTTTTTKTVGSSGGSGGGGGQEVKSESTGAGGDLKMEVDSDAAAVGS IAGASGSGAGSASGGGGGGGSSGVGVAVKREREEEEFEFVTN Gossypium arboretum TAF9 TC14563 MAEGEEDLPRDAKIVKSLLKSMGVEDYEPRVIHQFLELWYRYVVDVLTDAQVYSEHAGK QTIDCDDVKLAIQSKVNFSFSQPPPREVLLELARNRNKVPLPKAIPGPGIPLPPEQDTL ISTNYQLAIPKKQPAQAMEEMEEDEESVEPNSSQEHKTDAPHPTSQRVSFPLTKRSK Homo sapiens TAF9 NP_003178 MESGKTASPKSMPKDAQMMAQILKDMGITEYEPRVINQMLEFAFRYVTTILDDAKIYSS HAKKATVDADDVRLAIQCRADQSFTSPPPRDFLLDIARQRNQTPLPLIKPYSGPRLPPD RYCLTAPNYRLKSLQKKASTSAGRITVPRLSVGSVTSRPSTPTLGTPTPQTMSVSTKVG TPMSLTGQRFTVQMPTSQSPAVKASIPATSAVQNVLINPSLIGSKNILITTNMMSSQNT ANESSNALKRKREDDDDDDDDDDDYDNL Homo sapiens TAF9L NP_057059 MESGKMAPPKNAPRDALVMAQILKDMGITEYEPRVINQMLEFAFRYVTTILDDAKIYSS HAKKPNVDADDVRLAIQCRADQSFTSPPPRDFLLDIARQKNQTPLPLIKPYAGPRLPPD RYCLTAPNYRLKSLIKKGPNQGRLVPRLSVGAVSSKPTTPTIATPQTVSVPNKVATPMS VTSQRFTVQIPPSQSTPVKPVPATTAVQNVLINPSMIGPKNILITTNMVSSQNTANEAN PLKRKHEDDDDNDIM Hordeum vulgare TAF9 TC68170 MDSGGVRPSLPSAAAAGGASVPDEPRDARVVRELLRSMGLGEGEYEPRVVGQFLDLAYR YVGDVLGDAQVYADHADKPQIDADDVRLAIQANVNFSFSQPPPREVLLELARSRNKIPL PKSIAPPGSIPLPPEQDTLLSENYQLLPALKPPTQTEEAEDDNEGADAIPANPSPSYSQ DQRGSEQHQPQSQSQRVSFQLNAVAAAAAKRPLVTTDQLNMG Lycopersicon esculentum TAF9 TC128464 MAEGGEEDLPRDAKIVKTLLKSMGVDDYEPRVVHQFLELWYRYVVDVLTDAQVYSEHAR KASIDSDDIKLAIQSKVNFSFSQPPPREVLLELARNRNKIPLPKSIAGSGVPLPPEQDT LINPNYQLAIAKKQTNQPEETEEDEESADPNPAPSKNPTLSHEKTDLPQGTPQRVSFPL GAKRPR Medicago truncatula TAF9 TC85341 MADNEEDSNMPRDAKIMQSLLKSMGVEEYEPRVINKFLELWYRYVVDVLTDAQVYSEHA GKPAIDVDDVKLAIQSQVNFSFSQPPPREVLLELAQNRNKIPLPKSIAGPGFPLPPDQD

181 TLIAPNYQFAIPNKRSVEPMEETEDEEVPNADPNPSQEEKTDAEQNPHQRVSFPLPKRQ KD Medicago truncatula TAF9 TC85342 MADNEEDSNMPRDAKIVQSLLKSMGVEEYEPRVINKFLELWYRYVVDVLTDAQVYSEHA GKPAIDVDDVKLAIQSQVNFSFSQPPPREVLLELAQNRNKIPLPKSIAGPGFPLSPDQD TLIAPNYQFAIPNKRSVEPMEETEDERSSQWPIPTHLKKRRQMRNKIPIKECHFPCLNP KGLI Oryza sativa TAF9 BAC21319.1 MDTGADQAPPPPPPPPVAAASAAADEPRDLRVVREILHSLGLREGDYEEAAVHKLLLFA HRYAGDVLGEAKAYAGHAGRESLQADDVRLAIQARGMSSAAPPSREEMLDIAHKCNEIP IPKPCVPSGSISLPHYEDMLLNKKHIFVPRVEPTPHQIEETEDDYNDDGSNANVASPNS NYDQDLFGSISLPHYQDMLLNQNHLSVHRVEPAHDQLEKIKDDGSNDNADSSHSNYVQD SSGSVSLQHHQDMSLNQNHLFVHQVELTLDQIEEIKDDGSNDNVDSPNFNCVQDPSRSV SFPHYQVMPLNQNHLSFHQVEPMLDQVEEIKDDSSNDNVASLDSNCIQDPHYQDMLLNQ DHLSVRGVEPTLDQVEEIEDDCSSDNVASPDSNYDKEKNDSNKQKPSKKVSQLNTLVAA GKDKVDCSTELS Oryza sativa TAF9 AAP12985 MDPGGLRPAPQSAAAAAAAAAAGAGAGASAADEPRDARVVRELLRSMGLSEGEYEPRVV HQFLDLAYRYVGDVLGDAQVYADHAGKPQLDADDVRLAIQSKVNFSFSQPPPRECSEFF HSDQDFRSRSLPSDNPLFFSMVLLEVARNRNKIPLPKSIAPPGSIPLPPEQDTLLSQNY QLLAPLKPPPQFEETEDDNAGANPTPTSNPSNPSPNNLQEQQQLPQHGQRVSFQLNAVA AAKRRGTMDQLNMG Populus trichocarpa TAF9 MAEGEEDMPRDAKIVKSLLKSMGVEDYEPRVVHQFLELWYRYVVDVLTDAQVYSEHANK TAIDCDDVKLAIQSKVNFSFSQPPPREVLLELARNRNKIPLPKSIAGPGIPLPPEQDTL ISPNYQLAIPKKRTAQAIEETEEDEESADPNQSQEQKTDPPQLTPQRVSFPLTKRPNYR FQVMSSISCSSSMNSPDSSTLFTRLKFELCDMRIALI Populus trichocarpa TAF9b MGEGTVPLEVQIRPKEMHLQAEFGFAAHWRYKEGDCKHSSFVLQVVEWARWVITWQCET MSKDRPSIGCDDSIKPPCTFPSHSDGCPYSYKPHCGQDGPIFIIMIENDKMSVQEFPAD STVMDLLERAGRASSRWSAYGFPVKEELRPRLNHRPVHDATCKLKMGDVVELTPAIPDK SLSDYREEIQRMYEHGSATVSSTAPAVSGTVGRRS Saccharomyces cerevisiae TAF9 NP_013963 MNGGGKNVLNKNSVGSVSEVGPDSTQEETPRDVRLLHLLLASQSIHQYEDQVPLQLMDF AHRYTQGVLKDALVYNDYAGSGNSAGSGLGVEDIRLAIAARTQYQFKPTAPKELMLQLA AERNKKALPQVMGTWGVRLPPEKYCLTAKEWDLEDPKSM Solanum tuberosum TAF9 TC67183 MAEGGEEDLPRDAKIVKTLLKSMGVDDYEPRVVHQFLELWYRYVVDVLMDAQVYSEHAG KASIDSDDIKLAIQSKVNFSFSQPPPREVLLELARNRNKIPLPKSIAGSGVPLPPEQDT

182 LINPNYQLAIAKKQTSQPEETEEDEERADPNPAPSKNPSLSHEKTDVPQGTPQQVSFPL GAKRPR Solanum tuberosum TAF9 TC67182 MAEGGEEDLPRDAKIDKTSLKSMGVDDYEPRVVQQFLELRNSYVVDVLTDAQVYSEHAG KTSIDSDDIKLAIQSKVNFSFSQPPPREVLLELARNRNKIPLPKSIAGSGVPLPPQQDT LINPNYQLAIAKKQTSQPEETEEDEESADPNPAPSKNPSLSHEKTDVPQGTPQRVSFPL GAKRPR Triticum aestivum TAF9 TC70841 MDGGGGGGGRPALQPAAAGGGASGPDEPRDARVVRELLRSMGLGEGEYEPRVVHQFLDL AYRYAGDVLGDAQVYADHAGKPQLDADDVRLAIQAKVNFSFSQPPPREVLLELARSRNK IPLPKSIAPPGSIPLPPEQDTLLSQNYQLLPALKPPTQTEEAEDEEEGANADAANANPN SSQDQRGNEAXSSSLRARARAQGFFQA Vitis vinifera TAF9 TC11580 MAGGDEDLPRDAKIVKSLLKSMGVDDYEPRVIHQFLELWYRYVVDVLTDAQVYSEHASK LAIDCDDVKLAIHFKVNFSFFQPPAREVLLELARNRNKIPLPKSIAGPGIPLPPEQDTL ISPNYQLAIPKKRTAQAVEETEEDEEGADPSHASQEGRTDLPQHTPQRVSFPIGAKRPR Zea mays TAF9 TC182853 MDAGAARPSAPSTAAVAGASVADEPRDARVVRELLRSMGLREGEYEPRVVHQFLDLAYR YVGDVLGDAQVYADHAGKAQIDADDVRLAIQAKVNFSFSQPPPREVLLELARSRNRMPL PKSIAPPGSIPLPPEQDTLLAQNYQLLPPLKPPPQYEEIEDETEEPNPSNPANSNPSYS QDQSSKEQQQQHTPQHGQRVSFQLNAVAAAAAAAKRPRMAIDQLNMG Zea mays TAF9 TC182854 MDAADARPSAPSAAAAAVAGASVADEPRDARVVRELLRSMGLGEGEYEPRVVHQFLDLA YRYVGDVLGDAQVYADHAGKAQIDADDVRLAIQAKVNFSFSQPPPREVLLELARSRNRM PLPKSIAPPGSIPLPPEQDTLLAQNYQLLPPLKPPPQYEENEDENEESNPSLTPNPANS NPTFSQDQRSNEQQHTPQHGQRVSFQLNAVAAAAAKRPRMTVDQLNIG TAF10 Sequences Arabidopsis thaliana TAF10 AAK29671 MNHGQQSGEAKHEDDAALTEFLASLMDYTPTIPDDLVEHYLAKSGFQCPDVRLIRLVAV ATQKFVADVASDALQHCKARPAPVVKDKKQQKDKRLVLTMEDLSKALREYGVNVKHPEY FADSPSTGMDPATRDE Beta vulgaris TAF10 BVSVtuc03-04-08.1346 MNPQTSDGRHDDDAALSEFLASLMDYTPTIPDELVEHYLAKSGFQCPDVRLIRLVAVAT QKFISEVATDALQHCKARQSSVVKDKRDKLQKDKRLVLTMEDLSRALKEYGVNLKHQEY FADNPSTGMDPASRDE Drosophila melanogaster TAF10 CAC08819 MASDGEDISVTPAESVTSATDTEEEDIDSPLMQSELHSDEEQPDVEEVPLTTEESEMDE

183 LIKQLEDYSPTIPDALTMHILKTAGFCTVDPKIVRLVSVSAQKFISDIANDALQHCKTR TTNIQHSSGHSSSKDKKNPKDRKYTLAMEDLVPALADHGITMRKPQYFV Drosophila melanogaster TAF10b AAL48842 MVGSNFGIIYHNSAGGASSHGQSSGGGGGGDRDRTTPSSHLSDFMSQLEDYTPLIPDAV TSHYLNMGGFQSDDKRIVRLISLAAQKYMSDIIDDALQHSKARTHMQTTNTPGGSKAKD RKFTLTMEDLQPALADYGINVRKVDYSQ Glycine max TAF10 TC162515 MNQNPQSSDGRNDDDSALSDFLASLMDYTPTIPDELVEHYLAKSGFQCPDVRLTRLVAV ATQKFVAEVAGDALQHCKARQATIPKDKRDKQQKDKRLVLTMEDLSKALREYGVNLKHQ EYFADSPSTGMDPATREE Glycine max TAF10 TC162516 MNQNPQSSEGRNDDDSALSDFLASLMDYTPTIPDELVEHYLAKSGFQCPDVRLTRLVAV ATQKFVAEVAGDALQHCKARQATIPKDKRDKQQKDKRLVLTMEDLSQALREYGANLTDQ EYFADSPSTVMDPATREE Gossypium arboretum TAF10 BQ401852 MNHNPQSSDGKHDDDSALSDFLASLMDYAPTIPDELVEHYLAKSGFQCPDVRLIRLVAV ATQKFVAEVASDALQHCKARQAAVVKDKREKQQKDKRLILTMDDLSKSLREYGVNVKHQ EYFADSPSTGIDPASREE Homo sapiens TAF10 Q12962 MSCSGSGADPEAAPASAASAPGPAPPVSAPAALPSSTAAENKASPAGTAGGPGAGAAAG GTGPLAARAGEPAERRGAAPVSAGGAAPPEGAISNGVYVLPSAANGDVKPVVSSTPLVD FLMQLEDYTPTIPDAVTGYYLNRAGFEASDPRIIRLISLAAQKFISDIANDALQHCKMK GTASGSSRSKSKDRKYTLTMEDLTPALSEYGINVKKPHYFT Hordeum vulgare TAF10 Barley1_07779 MGSNNSGGAGGGGGMAPGTGAGGSDGRHDDEAVLTEFLSSLMDYNPTIPDELVEHYLGR SGFHCPDLRLTRLVAVAAQKFISDIASDSLQHCKARVAAPIKDNKSKQPKDRRLVLTMD DLSKALREHGVNLRHPEYFADSPSAGMAPSTRDE Hordeum vulgare TAF10 TC68796 MMGSNNPGGAGGGMAPGMGAGGSDGRHDDEAVLTEFLSSLMDYNPMIPDELVEHYLGRS GFHXPDLRLTRLVAVATQKFISDVASDSLQHCKARVAAPIKDNKSKQPKDRRLVLTMDD LSKALREHGVNLKHPEYFADSPSAGMGHSTREE Hordeum vulgare TAF10 HVtuc02-11-10.5382 MGSNNSGGAGGGGGMAPGTGAGGSDGRHDDEAVLTEFLSSLMDYNPTIPDELVEHYLGR SGFHCPDLRLTRLVAVAAQKFISDIASDSLQHCKARVAAPIKDNKSKQPKDRRLVLTMD DLSKALREHGVNLRHPEYFADSPSAGMGHSTREE Lycopersicon esculentum TAF10 TC118341 MNQSQGQQTSEGRHEDDAVLADFLASLMDYTPTIPDELVEHYLGKSGFQCPDVRLIRLV

184 AVATQKFIADVATDALQHCKARQSTIVKDKRDKQQKDKRLTLTMDDLSKSLREYGVNVK HQDYFADSPSAGLDPASREE Oryza sativa TAF10 TC129171 MVPGGMGGGGPMGAAPPGGGGGGDGRHDDEAVLTEFLSSLMDYTPTIPDELVEHYLGRS GFYCPDLRLTRLVAVATQKFISDIASDSLQHCKARVAAPIKDNKSKQPKDRRLVLTMDD LSKALQEHGVNLKHPEYFADSPSAGMAPAAREE Pinus TAF10 TC9616 MAESKQDDDAVLIEFLSSLMDYTPTIPDELAEYYLSKSGFQCPDVRIIRMVSIATQKFI AEIASDAFQLCKARQSAVNKEKRDKQQKDKSFVLTTEDLSMALREYGVNMKRQEYFADN PSAGTNPTSKDE Populus trichocarpa TAF10 MNNTSSSNSQQQQQSSEARHDDDAVLTEFLASLMDYTPTIPDELVEHYLAKSGFQCPDV RLVRLVAVATQKFVADVATDALQQCKARPAPVVKDKRDKQQKEKRLILTMEDLSKALSE YGVNVKHQEYFADSPSTGMDPASREE Saccharomyces cerevisiae TAF10 NP_010451 MDFEEDYDAEFDDNQEGQLETPFPSVAGADDGDNDNDDSVAENMKKKQKREAVVDDGSE NAFGIPEFTRKDKTLEEILEMMDSTPPIIPDAVIDYYLTKNGFNVADVRVKRLLALATQ KFVSDIAKDAYEYSRIRSSVAVSNANNSQARARQLLQGQQQPGVQQISQQQHQQNEKTT ASKVVLTVNDLSSAVAEYGLNIGRPDFYR Triticum aestivum TAF10 TC64687 MMGSNNPGGAGGGGGMAPGTGAGGSDGRHDDEAVLTEFLSSLMDYNPTIPDELVEHYLG RSGFHCPDLRLTRLVAVAAQKFISDIASDSLQHCKARVAAPVKDNKSKQPKDRRLVLTM DDLSKALREHGGNLKHPEYFADSPSAGMPPSTREE Triticum aestivum TAF10 TC64747 MMGSNNPGGAGGGGGGGMAPGTGGGGSDGRHDDEAVLTDFLSSLMDYNPTIPDELVEHY LGRSGFHCPDLRLTRLVAVAAQKFISDIASDSLQHCKARVAAPIKDNKSKQPKDRRLVL TMDDLSKALREHGVNLRHPEYFADSPSAGXPLKREE Triticum aestivum TAF10 CA620043 MAPGMGAGSSDGRHDDEAVLTEFLSSLMDYNPMIPDELVEHYLGRSGFHCPDLRLTRLV AIATQKFISDVASDSLQHCKARVAAPIKDNKSKQPKDRRLVLTMDDLSKALREHGVNLK HPEYFADSPSARMGPSTREE Zea mays TAF10 TC184169 MGTGVGGGGDGRHDDEAALTEFLSSLMDYTPTIPDELVEHYLGRSGFHCPDLRLTRLVA VATQKFLSDIASDSLQHCKARVAAPIKDNKSKQPKDRRLVLTMDDLSKALREHGVNLKH AEYFADSPSAGMAPSTREE

185 TAF11 Sequences Arabidopsis thaliana TAF11 At4g20280 MKHSKDPFEAAIEEEQEESPPESPVGGGGGGDGSEDGRIEIDQTQDEDERPVDVRRPMK KAKTSVVVTEAKNKDKDEDDEEEEENMDVELTKYPTSSDPAKMAKMQTILSQFTEDQMS RYESFRRSALQRPQMKKLLIGVTGSQKIGMPMIIVACGIAKMFVGELVETARVVMAERK ESGPIRPCHIRESYRRLKLEGKVPKRSVPRLFR Arabidopsis thaliana TAF11b At1g20000 MAFNARSCCFASSNERVTCNCNCLKDQPVPSVVGCATKKLAEFWSFKIQRYVIFVKVLL RMKHSKDPFEAAMEEQEESPVETEQTLEGDERAVKKCKTSVVAEAKNKDEVEFTKNITG ADPVTRANKMQKILSQFTEEQMSRYESFRRSGFKKSDMEKLVQRITGGPKMDDTMNIVV RGIAKMFVGDLVETARVVMRERKESGPIRPCHIRESYRRLKLQGKVPQRSVQRLFR Drosophila melanogaster TAF11 NP_723484 MDEILFPTQQKSNSLSDGDDVDLKFFQSASGERKDSDTSDPGNDADRDGKDADGDNDNK NTDGDGDSGEPAHKKLKTKKELEEEERERMQVLVSNFTEEQLDRYEMYRRSAFPKAAVK RLMQTITGCSVSQNVVIAMSGIAKVFVGEVVEEALDVMEAQGESGALQPKFIREAVRRL RTKDRMPIGRYQQPYFRLN Homo sapiens TAF11 NP_005634 MDDAHESPSDKGGETGESDETAAVPGDPGATDTDGIPEETDGDADVDLKEAAAEEGELE SQDVSDLTTVEREDSSLLNPAAKKLKIDTKEKKEKKQKVDEDEIQKMQILVSSFSEEQL NRYEMYRRSAFPKAAIKRLIQSITGTSVSQNVVIAMSGISKVFVGEVVEEALDVCEKWG EMPPLQPKHMREAVRRLKSKGQIPNSKHKKIIFF Hordeum vulgare TAF11 TC81880 MKDPFEAAVEEQESPPDSPAPPEEGPATAVPHTIDEDYDGSAGAGGSRPPPPRPRPSAL AAPSTSAAPAAAKAKVRPQKEQDDDDDEEDPMEVDLDKLPSGTSDPDKLAKMNALLSQF TEDQMNRYESFRRSGFQKSNMKKLLASITGSQKISMPTTIVVSGIAKMFVGEVIETARI IMSERKDSGPIRPCHIREAYRRLKLEGKIPKRSVPRLFR Medicago truncatula TAF11 TC80073 MAGGISFGIGLKRMKQSKDPFEAAFEESPPESPIETEPDPDASTENPNSTNSSLPQSTL THEEEHNHIKTPNSNNTITKHKDEEDDEEEDNMDVELAKFPTAGDPHKMAKMQAILSQF TEEQMSRYESFRRAGFQKANMKRLLTSITGTQKISIPITIAVSGIAKVFVGEVVETART IMKERKETGPIRPCHLREAHRRLKLEGKIFKRTTSRLFR Oryza sativa TAF11 BAB90043 MATRIAQARKRQGRDRRSSTRTPLNRGQPDKATLLQLQPGLALQRSAQPKRGIIIGNDS GSLVRRASDEQPVEYSLSSPAKRKKKHDDICGSRRFIFTCMMYHTEYDSIYRFDSPVKI LAAATEKSAQKSGPRKRPTREAAHQARRRRAAAAMKDPFEAAVEEQESPPESPAANEED AAGAPEGYDGASGSRGPPLRLPPSRAAPSGSGGAAAAAARGKVVRVQKEQQEEEDDEED HMEVDLDKLPSGTSDPDKLAKMNAILSQFTEDQMNRYESFRRSGFQKSNMKKLLASITG SQKISLPTTIVVSGIAKMFVARIVMTERKDSGPQGNQSKQYVQAEVLRYY


186 Oryza sativa TAF11b TC124761 MKDPFEAAVEEQESPPESPAANEEDAAGAPEGYDGASGSRGPPLRLPPSRAAPSGSGGA AAAAARGKVVRVQKEQQEEEDDEEDHMEVDLDKLPSGTSDPDKLAKMNAILSQFTEDQM NRYESFRRSGFQKSNMKKLLASITGSQKISLPTTIVVSGIAKMFVGELVETARIVMTER KDSGPVRPCHIREAYRRLKLEGKIPRRTVPRLFR Populus trichocarpa TAF11 MKQSKDPFEAAYVEQEESPPESPVAQDDYDTQASNAAAAADDSQGAVVGQDDDDLGGGG RNDFAHSSDHPSASRPMLGSARSKAKNKDDDEEEEEDNMDVELSKLASTADPDKMAKMQ FGNSRTEIFQELSSYVHSALHGRRASAPVHAYCKEYHAQTIASSIRPSGLKYFYNSLCK KG Saccharomyces cerevisiae TAF11 NP_013697 MTEPQGPLDTIPKVNYPPILTIANYFSTKQMIDQVISEDQDYVTWKLQNLRTGGTSINN QLNKYPKYKYQKTRINQQDPDSINKVPENLIFPQDILQQQTQNSNYEDTNTNEDENEKL AQDEQFKLLVTNLDKDQTNRFEVFHRTSLNKTQVKKLASTVANQTISENIRVFLQAVGK IYAGEIIELAMIVKNKWLTSQMCIEFDKRTKIGYKLKKYLKKLTFSIIENQQYKQDYQS DSVPEDEPDFYFDDEEVDKRETTLGNSLLQSKSLQQSDHNSQDLKLQLIEQYNKLVLQF NKLDVSIEKYNNSPLLPEHIREAWRLYRLQSDTLPNAYWRTQGEGQGSMFR Triticum aestivum TAF11 TC91943 MKDPFEAAVEEQDSPPDSPAPPEEDPATAVPHTAAEDYDGSAGAGGSRAPPPRPRPSAL AAPSTSVAPAAAKAKVRPHKEQDDDDDEEDPMEVDLDKLPSGTSDPDKLAKMNALLSQF TEDQMNRYESFRRSGFQKSNMKKLLASITGSQKISMPTTIVVSGIAKMFVGEVIETARI VMSERKDSGPIRPCHIREAYRRLKLEGKIPKRSVPRLFR TFIIE Sequences Arabidopsis thaliana TFIIE 1 At1g03280 MEKSGPVQKAVVLQPFVKLVRLVARAFYDDYTTKSDNQQKSARSDNRGIAAVVLDALAR RQWVREEDLAKDLQLHAKQLRKIIRLFEEEKLIMRDHRKETAKGAKMYSAAVAATTDGR AEDKVKLHTHSYCCLDYAQICDVVRFRLHRMKKRLKDELEDKNTVQEYGCPNCQRKYNA LDALRLISMVDDSFHCENCNGELVVECNKLTSEEVVDGDDNARRRRRENLKNMLQKLEV QMKPLMDQLNRVKDLPIPEFGSFLAWEARAAMAARENGDLNPNDPLRSQGGYGSTPMPF LGETKVEVNLGDGNEDVKSKGGDSSLKVLPPWMIKEGMNLTEEQRGEMRQEAKVDGGAG AAAKLSDDKKSAIGNGDEKDLKDEYLKAYYAELMKQQELAARRNQQESAGEPTSGIQSG TVYSGRQVSMKAKREEDEDEDEEEVEWEEKAPVTANGNYKVDLNVEAEASGGEEEEEED DVDWEEG Arabidopsis thaliana TFIIE 2 At4g20340 MDKSITVVRKTVVLEPFVKLVRLLVRIFYDNYTPESDNQQKSVKNVKGSAVIVLDALTR RQWVREEDLAKEVKRNAKELRKLIRHFEEQKFVMRYHRKETAKRAKMYSYAVGGTTDGR AEDNVKFHTHSYCCLDYAQIYDIVRYKLHRLKKKFKDELEDRNTVQEYGCPNCKRKYNA LDALRLISMEDDSFHCENCNGELVMECNKLISEEVVDRGDNARRRQREKVKVWLQDLEG ELKPLMELINRVKDLPFPAFEPFPAWEARAAKAARENGDFNPDDPSRSLGGYGSTPMPF 
LGETKVEVNLGEGNEDVTSTGGDSSLKMLPPWMIKQGMKLTEEQRGEMRQEANVDGEAA KLSDDKKSVMENGDDNKDLKDEYLKAYYAAIMEQQKLAAKLNEQESAGESTTTDIESAT


187 TYSDRQVGMKSKREEEEEDVEWEEGASVAANGNYKVDLNVEAEEAEEKEDGDEDDDIDW EEG Arabidopsis thaliana TFIIE 3 At4g20810 MVKLVAKTFYDNYTPKNNNQKKSAKNGSGGIAVLVLDALTRRQWVREEDLAKELKLNTK QLRTILRYFEEQQFIMRVHRKEKSSATTNGRGEDKVKVHMYSYCCLDYSQIYDVIRYKL HRMKKEFKDVLEDKDNVQEYGCPNCKRKIFFHCENCNGELVMECNKLTSEEVVVDGSDN PRSRRDHLKDLLQNMEVRLKPLMDHINRIKDLPVPSFESFPAWETRVAKAARENGDLNP DDTLRPQGGYGSTPMPFLGETEIEVNLGEENEDVKSDEVGDSSRRKLTPSWLIKKGMNL SDEQRGEIRHEAKADDGGSSMENGDDDRNLKDEYLKAYYAAILEEQELAEKLNQQESAG KVTTDIELATSSSDRQVGMKSKREEEEEEEASVAANGNYKVDLNVEAEEAEQDENDVDW QEC Drosophila melanogaster TFIIE NP_524026 MSSTSTAAANAAPAKTEVRYVTEVPSSLKQLARLVVRGFYSLEDALIIDMLVRNPCMKE DDIGELLRFEKKQLRARITTLRTDKFIQIRLKMETGPDGKAQKVNYYFINYKTFVNVVK YKLDLMRKRMETEERDATSRASFKCSSCSKTFTDLEADQLFDMATLEFRCTFCGSSVEE DSAAMPKKDSRLMLAHFNEQLQPLYDLLREVEGIKLAPEVLEPEPVDIDTIRGLNKPNA TRPDGMAWSGEATRNQGFAVEETRVDVTIGGDDTSDAVIERKSRPIWMTESTVITDTDA ADGAADAVQTASGSGHRNRKENEDIMSVLLQHEKQPGQKEPHMKGMRVGSSNANSSDSS DDEKDIENSKIPDVDFDNYINSDSAEEDDDVPTVLVAGRPHPLDQLDDNLIAQMTPQEK ENYIHVYQQHYSHIFE Homo sapiens TFIIE NP_005504 MADPDVLTEVPAALKRLAKYVIRGFYGIEHALALDILIRNSCVKEEDMLELLKFDRKQL RSVLNNLKGDKFIKCRMRVETAADGKTTRHNYYFINYRTLVNVVKYKLDHMRRRIETDE RDSTNRASFKCPVCSSTFTDLEANQLFDPMTGTFRCTFCHTEVEEDESAMPKKDARTLL ARFNEQIEPIYALLRETEDVNLAYEILEPEPTEIPALKQSKDHAATTAGAASLAGGHHR EAWATKGPSYEDLYTQNVVINMDDQEDLHRASLEGKSAKERPIWLRESTVQGAYGSEDM KEGGIDMDAFQEREEGHAGPDDNEEVMRALLIHEKKTSSAMAGSVGAAAPVTAANGDDS ESETSESDDDSPPRPAAVAVHKREEDEEEDDEFEEVADDPIVMVAGRPFSYSEVSQRPE LVAQMTPEEKEAYIAMGQRMFEDLFE Hordeum vulgare TFIIE TC90346 MGSLEPFNRLVRLTARAFYDDISIKGDTQAKTSRGDNRGMAVVVLDGLTRRQWVREEDL AKSLKLHSKQLRRVLRFFEEEKLVTRDHRKESAKGAKIYSAAAAAAGDGQPTKEGEEKV KLHTHSYCCLDYAQICDVVRYRIHRMKKTLKDELDSRNTVQHYICPNCKKRYSAFDALQ LISYTDEYFHCENCNGELLAESDKLSSEEMGDGDDNARKRRREKLNDMQQRIDEQLKPL QAQLKRVKDLPAPEFGSLQSWERLNLGAFAHGDSAAAEAARNAQGQYNGTPMPYLGDTK VDVELAGSGVKEEGAESGRDGTVLKVLPPWMVREGMNLTKEQRGESSNTSKGDEKSDVK DEKKQDSKEDEKSIQDEYLKAYYEAFKKKQEEEDAKRMQQEGQAFSSEIHSERQLGMKA KREDENVEDDGVEWEEEQPAGNASEEPYKFVDLNAEAPESGDEEDEIDWEEG 
Methanosarcina acetivorans TFE NP_618742 MNTLVDLNDKVIRGYLISLVGEEGLRMIEEMPEGEVTDEEIAAKTGVLLNTVRRTLFIL YENKFAICRRERDSNSGWLTYLWHLDFSDVEHQLMREKKKLLRNLKTRLEFEENNVFYV CPQGCVRLLFDEATETEFLCPMCGEDLVYYDNSRFVSALKKRVDALSSV


188 Oryza sativa TFIIE 1 MGSMEPFNRLVRLAARAFYDDISMKGDNQPKTSRGDNRGMAVVVLDALTRRQWVREEDL AKALKLHSKQLRRILRFFEEEKLVTRDHRKESAKGAKIYSAAAAAAGDGQSITKEGEEK VKMHTHSYCCLDYAQICDVVRYRIHRMKKKLKDELDSRNTIQHYICPNCKKRYSAFDAL QLISYTDEYFHCENCNGELVAESDKLASEEMGDGDDNARKRRREKLKDMQQRIDEQLKP LQAQLNRVKDLPAPEFGSLQSWERANIGAFGTADPSAADSSRNPQGQYGTPMPYLGETK VEVALSGTGVKDEGAESGTNGNGLKVLPPWMIKQGMNLTKEQRGETSNSSNLDEKSEVK DEKKQDSKEDEKSIQDEYIKAYYEALRKRQDEEEAKRKIQQEGDTFASASHSERQVGMK SKREDDDEGVEWEEEQPAGNTAETYKLADLNVEAQESGDEEDEIDWEEG Oryza sativa TFIIE 2 MVYDVVRYRIHRMRKKLKDGLDDRDTVQHYVCPNCKRRYSAFDALQLVSDMDDYFHCEH CKGELRPESEKLTLDEIVCGGGNAIKHTHDKLKDMQQRMEEQLKPLIAVLDRVKDLPFP SFMSLQDWERATIGASANGAVGSSQNSEGRYSSKPMPFLGETEVEVNFLGSTGAQEGVE SGMESIKPQHSWMNRKRTVLAGEHKEENNNTANLDQSSEAKSDKKQLSEEDEMKSIQEA YAKAYYEAIQKRQEDEGKRAIQEESLACISDQPFASDAQFERRLGAKSKRDDGGESGDD GIELKVRQSTGNIEEVYKFADLNVETQELVEKNCIPPAE Oryza sativa TFIIE 3 MDTMEQLNRLVRMVARGFYEDVSLEEDQSKPNGSGSCGIVVVVLDALTRQQWVREEDLA RSLMIPFNRLRQITHFLEQQKLVRRYYRKEAIHDASISTASPSHVSHDAHLVPTNVAGK LKMIMQPYCCLHYGQVYDVTLYRIHEMKKKLKDELDGNYMIQNYVCPNCERRYSSLNAL DLVSHIDNNFHCKHCNEELSQDFGDLAWGGRGGDGDNARRDRHAKLKDFLQRMEHQMER LISQLNKVKDLDFPEFLALETWERNMREPAGGDDVSRPMLFLGEVMSHEHQKGSASCID ADEEIFEFRVQDARPIPSFVIRKDINHTEDKEEQL Oryza sativa TFIIE 4 MSINERLVKCAAQLLYGNVGFKAGEVRIDCDENRGVVVMVLDALTRYQWVPDTHLAKSL KVQKKKLCLILEFLEKQMFVRRCEVKAKTGRNVSNTATTAGVSAIPRNEKVKSKHPKWY CCINYAKICSVVRYHIMQMEANLKSQLENTNTVDKYTCPNCGKSFSAFDVKDLVSCTDG NFYCESCKHELVACSEYGNYNEREGRSANLLDFLENMKEKLRPLKTKLDLLEDLPAPDF GSTPDFKGTYNISDWSRTSVPLPEPTNGDDSFSSPCAKDDESDAGVSELKILPSWLIRK GMKLKQAHLSNSSTVCGEGGTNIQEEYMKAYYEAIQKRQEDRIRHSGQSSVPGGPSVSS ERPMGVKRQKLCNDINNNALECQGEEPPGDTFRT Populus trichocarpa TFIIE 2 MDMNTTISVEPFKRLVKLAARAFYDDVSTKGENQSKNNARGDNKGIAVVVLDALTRRLW VNEEGLAKDLKIHIKQLRRILRLFEEDKLLTRAHRKETAKVTKKPNAGGADSQRKFGSR EDDKNKLHTHSYCCLDYAQIYDVVRYRLHRMRKMIKDELENNNAVQQYICPICERRYNA LDALRLISLVDEDFHCENCDGVLVAESDKLAAQEGGDGDDNARKRRREKLKDMLQNMEC YFMVPNFDFESINCKNWPARFWLAKVQLKPLMDQLSRVKDLPIPEIGSLQAWQLHENAA 
GRATNGDPNSDDHFKYSQGPGYGGTPMPFLGETKVEVAFAGDESKENIKSETASTSLKV LPPWMIKQGMNLTKEQRGEVKQESKMDSSSTAVEFSDEKKSAKVNGDSIKEEYVKAYYA ALLEQQRQAEESAKQQQELSQTSMSNGLSESSSNRQVGMKSKREEGEGDDDVEWEEAPI EGKSNNWNLIALLSY


189 Populus trichocarpa TFIIE 1 MAEFGSKLVNKFEESPRGTTAFIKINEAHTEVKKELVKLAARAFYDDITTKGDNQPKTG RSDNRGIAVVVLDALTRRQWVREEDLAKELKLHSKQLRRTLRFFEEEKLVTRDHRKETA KAAKMHNAAVANTTDGHRTKEGDDKIKMHTHSYCCLDYAQIYDVVRYRLHRMRKKLKDE LEDKNTVQEYTCPNCGRRYNALDALRLMSLVDEYFHCENCDGELVAESDKLAAQEGGDG DDNARRRRREKLKDMLQKMEDASNLFLFKCYLLLMKACYRVIEEVLGRRFIFSMTGQIE MARVKDLPVPEFGSLQEWQIHASAAGRAANGDSSYNDPSRSSQGYGGTPMPFLGETKHR VEFNASKRCQLRHDQEKDSSTKGRVEVSFSGVEGKEDLKSETASTGLKVLPPWMIKQGM NLTKEQRGEVKQGSKMDDSSAAAEPPDDKKISIENDDKIKDEYVKAYYAALLQKQREAE ESAEKQQELLQTSISNGFSKSSSDRQVGMKSKREEDDEPDDDVEWEEAPIGGMSYLSME WDPLQSY Saccharomyces cerevisiae TFIIE NP_012897 MDRPIDDIVKNLLKFVVRGFYGGSFVLVLDAILFHSVLAEDDLKQLLSINKTELGPLIA RLRSDRLISIHKQREYPPNSKSVERVYYYVKYPHAIDAIKWKVHQVVQRLKDDLDKNSE PNGYMCPICLTKYTQLEAVQLLNFDRTEFLCSLCDEPLVEDDSGKKNKEKQDKLNRLMD QIQPIIDSLKKIDDSRIEENTFEIALARLIPPQNQSHAAYTYNPKKGSTMFRPGDSAPL PNLMGTALGNDSSRRAGANSQATLHINITTASDEVAQRELQERQAEEKRKQNAVPEWHK QSTIGKTALGRLDNEEEFDPVVTASAMDSINPDNEPAQETSYQNNRTLTEQEMEERENE KTLNDYYAALAKKQAKLNKEEEEEEEEEEDEEEEEEEEMEDVMDDNDETARENALEDEF EDVTDTAGTAKTESNTSNDVKQESINDKTEDAVNATATASGPSANAKPNDGDDDDDDDD DEMDIEFEDV Solanum tuberosum TFIIE TC67033 MSIEPFNRLVKLAARAFYDDITTKGDNQPKSGRSDNRGIAVVILDALTRRQWVREEDLA KDLKLHTKQLRRTLRFFEEEKLITRDHRKEGAKGAKVYNSAVAATVDGLQNGKEGDDKI KMHTHSYCCLDYAQIYDVVRYRLHRMKKKLRDELDNKNTVQEYICPNCGKRYTALDALR LISPVDEYFHCESCNEELVAESDKLASQGTTDGDDNDRRRRREKLEDMLHRVEAQLKPL MDQLARVKDLPAPEFGSLQAWEVRANAVARGANGDNANDSKSGQGLGFGGTPMPFVGET KVEVAFSGLEEKGDIKSEVSVTPMKVLPPWMIKEGMNLTKEQRGEVKQESNMEGTSTAA GLSDDKKSIGFEDVKNIQDEYIKAYYEALFKRQKEQEEATKMLPETSTTDGVYNTSTER QVGMKSKREEEDEGEDVEWEEAPPAGNTTTGNLKVDLNVQADASEDDNDEEDDIDWEEG Sulfolobus solfataricus TFE NP_341815 MVNAEDLFINLAKSLLGDDVIDVLRILLDKGTEMTDEEIANQLNIKVNDVRKKLNLLEE QGFVSYRKTRDKDSGWFIYYWKPNIDQINEILLNRKRLILDKLKTRLEYEKNNTFFICP QDNSRYSFEEAFENEFKCLKCGSQLTYYDTDKIKSFLEQKIRQIEEEIDKETKLGANKN H TFIIE Sequences Arabidopsis thaliana TFIIE 1 At4g21010 MALREQLDKFNKQQEKCQSTLSSISSSRTALSRSYVPAATTSQKPNVFRGKFSENTKQL 
QHITNIRNSAVGAQMKIVIDLLFKTRLAYTAEQINEACYVDMHNNKAVFDSLRKNPKVH YDGRRFSYKATHNIKDKKQLLSFVNKSDKVIDVSDLKDAYPNVMEDLKSLKSSGEIFWL LSNTDSKEGTVYRNNMEYPKIDDELKALFRDIIPSDMLEVEKELLKIGLKPATNIAERR AAEQLHGVSNKPKDKKKKKKEITNRTKLTNSHMLELFQS


190 Arabidopsis thaliana TFIIE 2 At4g20330 MALKEQLDKFNKQQVKCQSTLSSIASSRERTSSSRQNVPLPAAITQKKPDAAPVKFSSD TERLQNINNIRKAPVGAQIKRVIDLLYERRLALTPEQINEWCHVDMHANKAVFDSLRKN PKAHYDGRRFSYKATHDVNDKNQLLSLVRKYLDGIAVVDLKDAYPNVMEDLKALSASGD IYLLSNSQEDIAYPNDFKCEIKVDDEFKALFRDINIPNDMLDVEKELLKIGLKPATNTA ERRAAAQTHGISNKPKDKKKKKQEISKRTKLTNAHLPELFQNLNGSSSRN Drosophila melanogaster TFIIE NP_523923 MDPALLREREAFKKRAMATPTVEKKSKPDRPAPPPPSDDSRRKMRPPNAPRLDATTYKT MSGSSQYRFGVLAKIVKFMRTRHQDGDDHPLTIDEILDETNQLDIGQSVKNWLASEALH NNPKVEASPCGTKFSFKPVYKIKDGKTLMRLLKQHDLKGLGGILLDDVQESLPHCEKVL KNRSAEILFVVRPIDKKKILFYNDRTANFSVDDEFQKLWRSATVDAMDDAKIDEYLEKQ GIRSMQDHGLKKAIPKRKKAANKKRQFKKPRDNEHLADVLEVYEDNTLTLKGVNPT Glycine max TFIIE TC192062 MTLQEKLDKFKKQQEKCQTTLSSIAASKAAATQKSAAHGSANGRNAAPAVKFSNDTERL QHINSIRKAPVGAQMKRVIDLLLETRQAFTPEQINGACYVDMKANKDVFENLRKNPKVN YDGQRFSYKSKYGLKDKTELLQLIRKYPEGLAVIDLKDAYPTVMEDLQAMKAAGQIWLL SNFDSQEDIAYPNDPKVHIKVDDDLKHLFRSIELPRDMIDIEKDLQKNGMKPATNTAQR RSAAQIQGISSKPKPKKKKSEISKRTKLTNAHLPELFQNLNSS Helianthus annuus TFIIE TC9497 MGSLRESLNRFKQQQEKCQSTLTSIAAGSKTSNRTTTPAPRVAPAASTLAKNPVPAVKF SNDTERLQHINNVRKSPVGAQIKKVIDLLFESRQAFTAEQINEACYVDVKGNKAVFESL AKNPKVNYDGKRFSYKSKHNVRDQKELLRLIRTFAEGIAVADLKDAYPTVMEDLQALKA GRQIWLLSNFDSQEEIAYPNDPRVPIKVDDELKQLFRSIELPRDMLDIERDLQKNGMKP ATNTAKRRVDGSKWQYFE Homo sapiens TFIIE NP_002086 MDPSLLRERELFKKRALSTPVVEKRSASSESSSSSSKKKKTKVEHGGSSGSKQNSDHSN GSFNLKALSGSSGYKFGVLAKIVNYMKTRHQRGDTHPLTLDEILDETQHLDIGLKQKQW LMTEALVNNPKIEVIDGKYAFKPKYNVRDKKALLRLLDQHDQRGLGGILLEDIEEALPN SQKAVKALGDQILFVNRPDKKKILFFNDKSCQFSVDEEFQKLWRSVTVDSMDEEKIEEY LKRQGISSMQESGPKKVAPIQRRKKPASQKKRRFKTHNEHLAGVLKDYSDITSSK Hordeum vulgare TFIIE TC102892 MDLKDSLSRFKQQQERCQSSLASIAASQASTTKPKHRAQPINAQSAPARPAQPIKFSND TERLQHINSIRKSPIGAQIKLVIELLYKTRQAFTAEQINDETYVDINGNKAVFESLRNN LKVHYDGRRFSYKSKHDLEGKDQLLELIRCHQEGLAVVEVKDAYPSVLEDLQALKAAGE VWLLSNMDSQEDIVYPNDPKVKIKVDDDLKELFRGIELPRDMVDIEKDLQKNGMKPMTD TTKRRAAAQIHGVKPKAKPKKKQREITKRTKLTNAHLPELFQHLKS Hordeum vulgare TFIIE TC89335 
MALNERLSKFKQQQERCQTTLSSIAATQASTTKSHNAPRSRPANAPSAPAKQIQAIKFS NDTERLQHINSVRKSPVGAQIKLVIELLYKTRLAYTAEQINEATYVAINSNKAVFDSLT


191 NNPKVQFDGKRFSYKSKHDLKGKDQLLHLIRRFPEGLPVVEVKDSYPTVLDDLQALKAS GDVWWLSSMDSQEDIVYPNDPKSKIKLDADLKQLYREIELPRDMIDIEKELLKNGHKPA TDTTKRRAAAQIHGQRPKPKAKKKQKEITKRTKLTNAHLPELFDLPR Lycopersicon esculentum TFIIE TC116522 MASLQESLQRFKKQQEKCQAITSMAARAGPSKGAPPRPANAKPPAPAVKFSNDTERLQH INTIRKGPVGSQMKRVIDLLLETRQAFTPEQINEACYVDLIGNKPVFDSLRKNVKVYYD GNRFSYKSKHALKNKEQLLILIRKFPEGIAVIDLKDAYPTVMEDLQALKGAGQIWLLSN FDSQEDIAFPNDPRVPIKVDDDLKQLFRGIELPRDMLDIERDLQKNGMKPATNTAKRRA MAQVHGIAPKPKTKKKKHEISKRTKLTNAHLPELFKL Medicago truncatula TFIIE TC77471 partial MALQGKLDRFKKQQEKCQSTLSSIAANKAVSASVPNALAPVKFSTDTERLQHINSIRKA PVGAQMKRVIDLLFETRQALTLEQINETCHVDMKANKDVFDNMRKNPKVRYDGERFSYK SKHALRDKKELLFLIRKFPEGIAVIDLKDSYPTVMEDLQALKGGREIWLLSNFDSQEDI AYPNDPKVPIKVDDDLKQLFRGIELPRDMIDIERDLQKNGMKPATNTAKRRSAAQMEGI SSKPKPKKKKNEITKRTKLTNAHLPE Oryza sativa TFIIE AAM01137 MDLKDSLSKFKQQQERCQSSLASIAASTSKPKHRAQPVNAPSAPARPLQPIKFSNDTER LQHINSVRKSPIGAQIKLVIELLYKTRQAFTAEQINETTYVDIHGNKSVFDSLRNNPKV HYDGRRFSYKSKHDLKGKDQLLVLVRKYPEGLAVVEVKDAYPTVMEDLQALKAAGEVWL LSNMDSQEDIVYPNDPKAKIKVDDDLKQLFREMELPRDMVDIEKELQKNGIKPMTNTAK RRAAAQINGVQPKAKPKKKQREITRRTKLTNAHLPELFQNLNT Oryza sativa TFIIE TC151474 MDLKDSLSKFKQQQERCQSSLASIAASTSKPKHRAQPVNAPSAPARPLQPIKFSNDTER LQHINSVRKSPIGAQIKLVIELLYKTRQAFTAEQINETTYVDIHGNKSVFDSLRNNPKV HYDGRRFSYKSKHDLKGKDQLLVLVRKYPEGLAVVEVKDAYPTVMEDLQALKAAGEVWL LSNMDSQEDIVYPNDPKAKIKVDDDLKQLFREMELPRDMVDIEKELQKNGIKPMTNTAK RRAAAQINGVQPKAKPKKKQREITRRTKLTNAHLPELFQNLNT Populus trichocarpa TFIIE 1 MALQEQLDRFKKQQEKCQSTLTSIAKSRPSKSSLTQKTVAVAPAPSTSARTPAPAVKFS NDTERLQHINSIRKAPAGAQIKRVIDLLLETRQAFTPEQINDHCYVDMNSNKAVFDSLR NNPKVHYDGKRFSYKSKHDLKDKSQLLVLIRKFPEGIAVIDLKDSYPSVMDDLQALKAV GQIWLLSNFDSQEDIAYPNDPRMVIKVDDDLKQLFRGIELPRDMLDIEKDLQKNGMKPA TNTAKRRAAAQVQGISTKQKAKKKKHEISKRTKLTNAHLPELFKNLGS Saccharomyces cerevisiae TFIIE NP_012988 MSKNRDPLLANLNAFKSKVKSAPVIAPAKVGQKKTNDTVITIDGNTRKRTASERAQENT LNSAKNPVLVDIKKEAGSNSSNAISLDDDDDDEDFGSSPSKKVRPGSIAAAALQANQTD ISKSHDSSKLLWATEYIQKKGKPVLVNELLDYLSMKKDDKVIELLKKLDRIEFDPKKGT 
FKYLSTYDVHSPSELLKLLRSQVTFKGISCKDLKDGWPQCDETINQLEEDSKILVLRTK KDKTPRYVWYNSGGNLKCIDEEFVKMWENVQLPQFAELPRKLQDLGLKPASVDPATIKR QTKRVEVKKKRQRKGKITNTHMTGILKDYSHRV


192 Solanum tuberosum TFIIE TC60506 MASLQESLQRFKKQQEKCQAISSMAARAGPSKGAPPRPANAKPPAPAVKFSNDTERLQH INSIRKGPVGAQIKRVIDLLLETRQAFTPEQINEACYVDINGNKAVFDSLRNNLKVYYD GNRFSYKSKHALKNKEQLLILIRKFPEGIAVIDLKDAYPTVMEDLQALKGAGQIWLLSK FDSQEDIAFPNDPRVPIKVDDDLKQLFRSIELPRDMLDIERDLQKNGMKPATNTAKRRA MAQVHGIVPKPKTKKKKHEISKRTKLTNAHLPELFKL Sorghum bicolor TFIIE TC59949 MDLKDSLSRFKQQQERCQSSLASIAASSSKPKHRAQPAHAPNVPARPSQPVKFSNDTER LQHINSIRKSPVGAQIKLVIELLYKTRQAFTAEQINDATYVDIHGNKAVFDSLRNNPKV SYDGRRFSYKSKHDLKGKDQLLVLIRKFPEGLAVVEVKDAYPNVLEDLQALKAAGEVWL LSNMDSQEDIVYPNDPKAKIKVDDDLKQLFREIELPRDMVDIEKELQRNGFKPMTNTAK RRAAAQINGVKPKAKPKKKQREITKRTKLTNAHLPELFQNLNT Sorghum bicolor TFIIE TC67168 MALNDRLNKFKQQQERCQNTLSSIFASQTSISTSKHVPGIQPVNAPLAPIKPLHPIKFS NDTERLQHINSVRKSAVGVQIKLVVELLYKTRQSFTAKQVNEATYVDIHGNKAVSDSLR NNPKVLFDGTRFSYKPKHILTGRDELLGLIKEKECGLPVEDIKDAYPSVLEDLQALKAS GDVWWLSSTQSQEDMAYFNDPRYNITVDNDLKELFLKTELPRDMLDVEKEIKKSGEKPM TNTTKRRALAQILDAAPKTKTKGSKKKQRRLTGKSKGLTNIHMPELFDA Triticum aestivum TFIIE TC110564 MDLKDSLSRFKQQQERCQSSLASIAASQASTTKPKHRAQPINAPSAPARPAQPIKFSND TERLQHINSIRKSPVGAQIKLVIELLYKTRQAFTAEQINDATYVDINANKAVFDSLRNN LKVQYDGRRFSYKSKHDLEGKDQLLDLIRCHQEGLAVVEVKDAYPSVLEDLQALKAAGE VWLLSNMDSQEDIVYPNDPKVKIKVDDDLKELFRGIELPRDMVDIEKELQKNGMKPMTD TTKRRAAAQIHGVKPKAKPKKKQREITKRTKLTNAHLPELFQHLKS Triticum aestivum TFIIEb TC129305 MALNERLSKFKQQQERCQTTLSSIAATQASTTKSHNAPRSRPANAPSAPAKQIQAIKFS NDTERLQHINSVRKSPVGAQIKLVIELLYKTRLAYTAEQINEATYVAINSNKAVFDSLT NNPKVQFDGKRFSYKSKHDLKGKDQLLHLIRRFPEGLPVVEVKDSYPTVLDDLQALKAS GDVWWLSSMDSQEDIVYPNDPKSKIKVDADLKQLYREIELPRDMIDIEKELLKNGHKPA TDTTKRRAAAQIHGQRPKPKAKKKQKEITKRTKLTNAHLPELFDLPR Zea mays TFIIE TC209727 MDLKDSLSKFKQQQERCQSSLASIAASTSKPKHRAQPAHAPNVPARPSQPIKFSNDTER LQHINSIRKSPVGAQIKLVIELLYKTRQAFTAEQINEATYVDIHGNKAVFDSLRNNPKV SYDGRRFSYKSKHDLKGKDQLLVLIRKFPEGLAVVEVKDAYSNVLEDLQALKAAGEVWL LSNMDSQEDIVYPNDPKAKIKVDDDLKQLFREIELPRDMVDIEKELQKNGFKPMTNTAK RRAAAQINGVKPKAKPKKKQREITKRTKLTNAHLPELFQNLNT TFIIF Sequences Arabidopsis thaliana TFIIF At4g12610 
MSNCLQLNTSCVGCGSQSDLYGSSCRHMTLCLKCGRTMAQNKSKCHECGTVVTRLIREY NVRAAAPTDKNYFIGRFVTGLPNFKKGSENKWSLRKDIPQGRQFTDAQREKLKNKPWIL


193 EDETGQFQYQGHLEGSQSATYYLLVMQNKEFVAIPAGSWYNFNKVAQYKQLTLEEAEEK MKNRRKTADGYQRWMMKAANNGPALFGEVDNEKESGGTSGGGGRGRKKSSGGDEEEGNV SDRGDEDEEEEASRKSRLGLNRKSNDDDDEEGPRGGDLDMDDDDIEKGDDWEHEEIFTD DDEAVGNDPEEREDLLAPEIPAPPEIKQDEDDEENEEEEGGLSKSGKELKKLLGKANGL DESDEDDDDDSDDEEETNYGTVTNSKQKEAAKEEPVDNAPAKPAPSGPPRGTPPAKPSK GKRKLNDGDSKKPSSSVQKKVKTENDPKSSLKEERANTVSKSNTPTKAVKAEPASAPAS SSSAATGPVTEDEIRAVLMEKKQVTTQDLVSRFKARLKTKEDKNAFANILRKISKIQKN AGSQNFVVLREKCQPKPGKRESRVNKLNIRSNLQPRKMELVTEDEIRKVLMEKKQLTTL ELVMRFKERLTTTEDKDSFSHILKKIAKLQKNPGSEKFVVVLRDNVTPLASDLTRLSIS Drosophila melanogaster TFIIF NP_524246 MSSASKSTPSAASGSSTSAAAAAAASVASGSASSSANVQEFKIRVPKMPKKHHVMRFNA TLNVDFAQWRNVKLERENNMKEFRGMEEDQPKFGAGSEYNRDQREEARRKKFGIIARKY RPEAQPWILKVGGKTGKKFKGIREGGVGENAAFYVFTHAPDGAIEAYPLTEWYNFQPIQ RYKSLSAEEAEQEFGRRKKVMNYFSLMLRKRLRGDEEEEQDPEEAKLIKAATKKSKELK ITDMDEWIDSEDESDSEDEEDKKKKEQEDSDDGKAKGKGKKGADKKKKKRDVDDEAFEE SDDGDEEGREMDYDTSSSEDEPDPEAKVDKDMKGVAEEDALRKLLTSDEEEDDEKKSDE SDKEDADGEKKKKDKGKDEVSKDKKKKKPTKDDKKGKSNGSGDSSTDFSSDSTDSEDDL SNGPPKKKVVVKDKDKEKEKEKESAASSKVIASSSNANKSRSATPTLSTDASKRKMNSL PSDLTASDTSNSPTSTPAKRPKNEISTSLPTSFSGGKVEDYGITEEAVRRYLKRKPLTA TELLTKFKNKKTPVSSDRLVETMTKILKKINPVKHTIQGKMYLWIK Homo sapiens TFIIF /RAP74 NP_002087 MAALGPSSQNVTEYVVRVPKNTTKKYNIMAFNAADKVNFATWNQARLERDLSNKKIYQE EEMPESGAGSEFNRKLREEARRKKYGIVLKEFRPEDQPWLLRVNGKSGRKFKGIKKGGV TENTSYYIFTQCPDGAFEAFPVHNWYNFTPLARHRTLTAEEAEEEWERRNKVLNHFSIM QQRRLKDQDQDEDEEEKEKRGRRKASELRIHDLEDDLEMSSDASDASGEEGGRVPKAKK KAPLAKGGRKKKKKKGSDDEAFEDSDDGDFEGQEVDYMSDGSSSSQEEPESKAKAPQQE EGPKGVDEQSDSSEESEEEKPPEEDKEEEEEKKAPTPQEKKRRKDSSEESDSSEESDID SEASSAFFMAKKKTPPKRERKPSGGSSRGNSRPGTPSAEGGSTSSTLRAAASKLEQGKR VSEMPAAKRLRLDTGPQSLSGKSTPQPPSGKTTPNSGDVQVTEDAVRRYLTRKPMTTKD LLKKFQTKKTGLSSEQTVNVLAQILKRLNPERKMINDKMHFSLKE Oryza sativa TFIIF TC148835 MGSADLVLKAACEGCGSPSDLYGTSCKHTTLCSSCGKSMALSGARCLVCSAPITNLIRE YNVRANATTDKSFSIGRFVTGLPPFSKKKSAENKWSLHKEGLQGRQIPENMREKYNRKP WILEDETGQYQYQGQMEGSQSSTATYYLLMMHGKEFHAYPAGSWYNFSKIAQYKQLTLE 
EAEEKMNKRKTSATGYERWMMKAATNGPAAFGSDVKKLEPTNGTEKENARPKKGKNNEE GNNSDKGEEDEEEEAARKNRLALNKKSMDDDEEGGKDLDFDLDDEIEKGDDWEHEETFT DDDEAVDIDPEERADLAPEIPAPPEIKQDDEENEEEGGLSKSGKELKKLLGKAAGLNES DADEDDEDDDQEDESSPVLAPKQKDQPKDEPVDNSPAKPTPSGHARGTPPASKSKQKRK SGGGDDSKASGGAASKKAKVESDTKPSVAKDETPSSSKPASKATAASKTSANVSPVTED EIRTVLLAVAPVTTQDLVSRFKSRLRGPEDKNAFAEILKKISKIQKTNGHNYVVLRDDK K Populus trichocarpa TFIIF 1 MSFDLLLKPSCSGCGSTTDLYGSNCKHMTLCLNCGKTMAENRGKCFDCGTTEYNVRAST


194 SSDKNYFIGRFVTGLPSFSKKKNAENKWSLHKEGILGRQITDALREKFKNKPWLLEDET GQSQYQGHLEGSQSATYYLLMMTGKEFVAIPAGSWYNFNKVAHYKQLTLEEAEEKMKNR RKTADGYERWMMKAANNGAAAFGEVEKVDDKEGVSAGGRGGRRKASGDDDEGNVSDRGE EDEEEEAGRKSRLGLNKQGGDDDEEGPRGGDLDMDDDDIEKGDDWEHEEIFTDDDEAVA IDPEEREDLAPEVPAPPEIKQDEDDEDEENEEGGLSKSGKELKKLLGKANGLNESDVED DDDDEDMDDDISPVLAPKQKDVVPKEEAADISPAKPTPSGSAKGTPSTSKSAKGKRKLN GEDAKSSNGAPVKKVKTENEVKPAVKEESSPATKGTATPKVTPPSSKTGSTSGSTGPVT EEEIRAVLLQNGPVTTQDLVARFKSRLRTPECFTADYSLGLSVRLCMLLRICVVHDGIN TISGVWVAKFLHRTWGFYQFNKGWVGGTG Saccharomyces cerevisiae TFIIF-L/ AAA61640 MSRRNPPGSRNGGGPTNASPFIKRDRMRRNFLRMRMGQNGSNSSSPGVPNGDNSRGSLV KKDDPEYAEEREKMLLQIGVEADAGRSNVKVKDEDPNEYNEFPLRAIPKEDLENMRTHL LKFQSKKKINPVTDFHLPVRLHRKDTRNLQFQLTRAEIVQRQKEISEYKKKAEQERSTP NSGGMNKSGTVSLNNTVKDGSQTPTVDSVTKDNTANGVNSSIPTVTGSSVPPASPTTVS AIESNGLSNGSTSAANGLDGNASTANLANGRPLVTKLEDAGPAEDPTKVGMVKYDGKEV TNEPEFEEGTMDPLADVAPDGGGRAKRGNLRRKTRQLKVLDENAKKLRFEEFYPWVMED FDGYNTWVGSYEAGNSDSYVLLSVEDDGSFTMIPADKVYKFTARNKYATLTIDEAEKRM DKKSGEVPRWLMKHLDNIGTTTTRYDRTRRKLKAVADQQAMDEDDRDDNSEVELDYDEE FADDEEAPIIDGNEQENKESEQRIKKEMLQANAMGLRDEEAPSENEEDELFGEKKIDED GERIKKALQKTELAALYSSDENEINPYLSESDIENKENESPVKKEEDSDTLSKSKRSSP KKQQKKATNAHVHKEPTLRVKSIKNCVIILKGDKKILKSFPEGEWNPQTTKAVDSSNNA SNTVPSPIKQEEGLNSTVAEREETPAPTITEKDIIEAIGDGKVNIKEFGKFIRRKYPGA ENKKLMFAIVKKLCRKVGNDHMELKKE Triticum aestivum TFIIF TC106270 MGSVDLVLKPACEGCGSTSDLYGTGCKHTTLCSSCGKSMALSRARCLVCSAPITNLIRE YNVRANASTDKAFSIGRFVTGLPPFSKKKNAENKWSLHKEGLQGRQLTDKMLEKYNRKP WILEDETGQYQFQGHMEGSQSATATYYLLMLHGKEFHAFPAGSWYNFSKVAQYKQLTLE EAEEKMNKRKTSATGYERWMMKAATNGPAAFGSDMMKLEPANDGEKESARHKKGKDNEE GNNSDKGEENEEEEAARKDRLGLSKRGMDDDEEGGKDLDFDLDDDIEKGDDWEHEETFT DDDEAVDIDPEERADLAPEIPAPPEIKQDDEENEEEGGLSKSGKELKKLLGRSSGQNES DADDDDEEDDQDDESSPVLAPKQTDQPKDEPVDNSPAKPTPSSGHARSTPPASKSKQKR KSGGDDAKASSGAASKKAKVESDTKTSSIKEETPSSSKPTPKASASSRSANVSPVTEDE IRTVLLAVAPVTTQDLVSRFKSRLRGPEDKNAFAEILKKISKIQKTNGHNYVVLREDKK TFIIF Sequences Arabidopsis thaliana TFIIF 1 At3g52270 MEDVKVEMKVRKNENEALETGLAERSMLLMKAPSLVASSLQSHSFPDDPYRPDDPYRPD 
AKVILGVDPLAHEDEGTQLFRVSSNHSGKFHPLRNLLLHSLKFHGFGEMGFLSLEISSG PHFDHEIPCNLRIGLLSMNFLARHLNYEFLGVKHGNSFALQFVMELARADSGNMPRRYT LDMSKDFIPMNVFCESSDDFGSLGEEFSIGMFIYSPGKMSVEGKIKNKFDMRPHNENIE SYGRLCRERTNKYMGKNRQIQVIDNARGMHMRPMPGMIIPTAAPEKKKLTNRTSEMKRT RRDRREMEEVMFNLFERQSNWTLRLLIQETDQPEQFLKDLLKDLCIYNNKGSNQGTYEL KPEYKKATQE


195 Arabidopsis thaliana TFIIF 2 At1g75510 MEDIHNLDIEKSDRSIWLMKCPVVVDKAWHKIAASSSSSFASSDSPPDMAKIVREVDPL RDDSPPEFKMYMVGAEYGNMPKCYALNMFTDFVPMGGFSDVNQGCAAAEGKVDHKFDMK PYGETIEEYARLCRERTSKAMVKNRQIQVIDNDRGVHMRPMPGMLGLVSSNSKEKRKPP PVKQTEVKRTRRDRGELEAIMFKLFEGQPNWTLKQLVQETDQPAQFLKEILNELCVYNK RGSNQGTYELKPEYKKSAEDDTGGQ Drosophila melanogaster TFIIF NP_524305 MSKEDKEKTQIIDKDLDLSNAGRGVWLVKVPKYIAQKWEKAPTNMDVGKLRINKTPGQK AQVSLSLTPAVLALDPEEKIPTEHILDVSQVTKQTLGVFSHMAPSDGKENSTTSAAQPD NEKLYMEGRIVQKLECRPIADNCYMKLKLESIRKASEPQRRVQPIDKIVQNFKPVKDHA HNIEYRERKKAEGKKARDDKNAVMDMLFHAFEKHQYYNIKDLVKITNQPISYLKEILKD VCDYNMKNPHKNMWELKKEYRHYKTEEKKEEEHKSGSSDSE Glycine max TFIIF TC178154 MDEENGYSGSISSNLETTKAERSVWLMKCPLVVAKSWQTHPPSQPLAKVVLSLDPLHPE EDDPSAVQFTMEMAGTEAVNMSKTYSLNMFKDFVPMCVFSETSQGGKVAMEGKVEHKFD MKPHGENIEEYGKLCRERTNKSMIKNRQIQVIDNDRGVLMRPMPGMIGLVSSNSKDKKK TQPVKQSDTKRTRRDRGELEDIMFKLFERQPNWALKQLVQETDQPAQFLKEILNELCVY NKRGANQGTYELKPEYKKSVEDTSAE Homo sapiens TFIIF NP_004119 MAERGELDLTGAKQNTGVWLVKVPKYLSQQWAKASGRGEVGKLRIAKTQGRTEVSFTLN EDLANIHDIGGKPASVSAPREHPFVLQSVGGQTLTVFTESSSDKLSLEGIVVQRAECRP AASENYMRLKRLQIEESSKPVRLSQQLDKVVTTNYKPVANHQYNIEYERKKKEDGKRAR ADKQHVLDMLFSAFEKHQYYNLKDLVDITKQPVVYLKEILKEIGVQNVKGIHKNTWELK PEYRHYQGEEKSD Hordeum vulgare TFIIF TC103743 MGDEAKYLETARADRSVWLMKCPPVVSQAWQGASASSGDANPNPVVAKVVLSLDPLSSA EPSLQFKMEMSQTSVASTCNLPKSYSLNMFKDFVPMCVFSETNQGKLSCEGKVEHKFDM EPHKDNLLNYAKLCRERTQKSMVKTRKVQVLDNDHGMSMRPMPGMVGLISSSSKEKRKP TPTKPSDVKRTRRDRRELENIIFKLFEKQPNWALKALVQETDQPEQFLKEILNDLCMYN KRGPNQGTHELKPEYKKSSEDAAGAP Medicago truncatula TFIIF TC78885 MEDENSYGGSSGGSNLETSKAERSVWLMKCPVAVAKSWQNHPPSQPLSKVVFSIDPLLP EDDPAHLQFTMEMSGTEAVNMPKTYSLNMFKDFVPMCIFSETSEGDKVAMEGKVEHKFD MKPRHENMDDYGKLCRERTKKSMIKNRQVQIIADDRGTHMRPMPGMVGLVSSNFKDKKR TQPVKQTDTKRTRRDRGELEDIMFKLFERQPNWALKQLVQETDQPAQFLKEILNELCVY NKRGANQGTYELKPEYKKSVEDANAE Oryza sativa TFIIF TC137623 MAEEAKNLETARADRSVWLMKCPTVVSRAWQEAATAAASSSSSSDAAAGANSNSNANPN PVVAKFKMEMAQTGNGNTPKSYSLNMFKDFVPMCVFSESNQGKLSCEGKVGHKFDMEPH 
SDNLVNYGKLCRERTQKSMIKNRKLMVLANDNGMSMRPLPGLVGLMSSGPKQKEKKPLP


196 VKPSDMKRTRRDRRELENILFKLFERQPNWSLKNLMQETDQPEQFLKEILNDLCFYNKR GPNQGTHELKPEYKKSTEDADATAT Populus trichocarpa TFIIF 1 MDDEASNSSSGNNNNNNKNLTNDNNNKSPVLGGFLDASKAEKSVWLMKCPSIVSRFLRS QEHEVGDGDASSPPVAKVIVSVDPLKSNDDDNSATEDYPNALNFELSLVLFCLVFTLHD FFCSLWKWLGTGLGDGLKSYSMEMSKDLVDMSVFSESSQGKLSVEGRILNKFDVRPHSE NLENYRKICRERTKKYMVKSRQIKVIDNDTGSHMMPMPGMIISGLAVLSFFYIFVNDKK KLPIKASDMKRTRRDRREMEGIMFKLFEKQPNWTLKQLVQETDQPEQFVKDMLKDLCVY NNKGSNQGSYELKPEYKKSNEEPAPE Populus trichocarpa TFIIF 2 MEEDHSNGGNSSSSGNLETSKADKAVWLMKCPVVVAKSWKSHHTSSSDSAPLAKVVLSL DPLQSDDPSAIQFTMEMARTETGNVPKSYSLNMFKDFVPMGVFSETPQGRVSMEGKVEH KFDMKPHEENIEEYSKLCRDRTKKSMIKNRQIRVIDNDRGVHMRPMPGMVGLISSTSKD KKKTQPVKQSDVKRTRRDRGELEDIMFKLFERQPNWALKQLVQETDQPAQFLKEILNEL CVYNKRGTNQGTYELKPEYKKTAEDTGAD Populus trichocarpa TFIIF 3 MEEDNSSSSANLETSKADKSVWLMKCPVVVAKSWKTHTSPSSSDSAPLAKVVLSLDPLQ SDDPSALQFTMEMARTEAGNVPKSYSLNMFKDFVPMCVFSETPQGKVAMEGKVEHKFDM KPHEQNIEEYHKLCRERTKKSMVKIRQIQVINNDRGVHMRPMPGMVGLISSSSKDKKRP QPVKQSDVKRTRRDRGELEDIMFKLFERQPNWALKQLVQETDQPAQFLKEILNELCVYN KRGTNQGTYELKPEYKKTVEDTGAD Saccharomyces cerevisiae TFIIF NP_011519 MSSGSAGAPALSNNSTNSVAKEKSGNISGDEYLSQEEEVFDGNDIENNETKVYEESLDL DLERSNRQVWLVRLPMFLAEKWRDRNNLHGQELGKIRINKDGSKITLLLNENDNDSIPH EYDLELTKKVVENEYVFTEQNLKKYQQRKKELEADPEKQRQAYLKKQEREEELKKKQQQ QKRRNNRKKFNHRVMTDRDGRDRYIPYVKTIPKKTAIVGTVCHECQVMPSMNDPNYHKI VEQRRNIVKLNNKERITTLDETVGVTMSHTGMSMRSDNSNFLKVGREKAKSNIKSIRMP KKEILDYLFKLFDEYDYWSLKGLKERTRQPEAHLKECLDKVATLVKKGPYAFKYTLRPE YKKLKEEERKATLGELADEQTGSAGDNAQGDAEADLEDEIEMEDVV Triticum aestivum TFIIF TC122239 MGDEAKYLETARADRSVWLMKCPPVVSQAWQGASSSSGDANPNPVVAKVVLSLDPLSSA EPSLQFKMEMSQTSVASTCNLPKSYSLNMFKDFVPMCVFSETNQGKLSCEGKVEHKFDM EPHKDNLLNYAKLCRERTQKSMVKTRKVQVLDNDHGMSMRPMPGMVGLISSSSKEKRKP TPTKPSDVKRTRRDRRELENIIFKLFEKQPNWALKALVQETDQPEQFLKEILNDLCMYN KRGPNQGTHELKPEYKKSSEDAAGAP Vitis vinifera TFIIF TC20528 MEEEQGNSSSSNLETGKAERSVWLMKCPLAVSKSWQSHSSSESQPVAKVVLSLDPLRSE DPSALEFTMEMTGTGAPNMPKSYSLNMFKDFVPMCVFSETNQGRVAMEGKVEHKFDMKP HNENIEEYGKLCRERTNKSMIKNRQIQVIDNDRGVHMRPMPGMVGLIASNSKDKKKTAP 
VKGSDMKRTRRDRGELEDIMFKLFERQPNWALKQLVQETDQPAQFLKEILNELCVYNKR GTNQGTYELKPEYKKSAEDTGAE


197 APPENDIX B AMINO ACID MULTIPLE SEQUENCE AL IGNMENTS FOR CORE DOMAINS OF THE GENERAL TRANSCRIPTION FACTORS TFIIA Small Subunit Alignment CLUSTAL X (1.83) multiple sequence alignment Homo_sapiens_TFIIA-gamma_NP_00 MAY---------QLYRNTTLGNSLQESLDELIQSQQITPQLALQVLLQFD Drosophila_melanogaster_TFIIAMSY---------QLYRNTTLGNTLQESLDELIQYGQITPGLAFKVLLQFD Arabidopsis_thaliana_TFIIA-S_A MAT--------FELYRRSTIGMCLTETLDEMVQSGTLSPELAIQVLVQFD Mesembryanthemum_crystallinum_ MAT--------FELYRRSTIGMCLTETLDEMVQSGTLSPELAIQVLVQFD Medicago_truncatula_TFIIA-S_TC MAT--------FELYRRSTIGMCLTETLDEMVQNGTLSPEIAIQVLVQFD Glycine_max_TFIIA-S_TC148651 MAT--------FELYRRSTIGMCLTETLDEMVQNGTLSPELAIQVLVQFD Populus_balsamifera_TFIIA-S1 MAT--------FELYRRSTIGMCLTETLDDMVQNGTLSPELAFQVLVQFD Vitis_vinifera_TFIIA-S MAT--------FELYRRSTIGMCLTETLDEMVQNGTLSPELAIQVLVQFD Lycopersicon_esculentum_TFIIAMAT--------FELYRRSTIGMCLTETLDEMVSNGILSPEHAIQVLVQFD Solanum_tuberosum_TFIIA-S_TC60 MAT--------FELYRRSTIGMCLTETLDEMVSNGILSPEHAIQVLVQFD Triticum_aestivum_TFIIA-S1_TC7 MAT--------FELYRRSTIGMCLTETLDEMVSSGTLSPELAIQVLVQFD Triticum_aestivum_TFIIA-S2_TC7 MAT--------FELYRRSTIGMCLTETLDEMVSSGTLSPELAIQVLVQFD Oryza_sativa_TFIIA-S_AAK73129 MAT--------FELYRRSTIGMCLTETLDEMVSSGTLSPELAIQVLVQFD Triticum_aestivum_TFIIA-S3_CA4 MAT--------FELYRRSTIGMCLTETLDEMVSSGTLSPELAIQVLVQFD Zea_mays_TFIIA-S1_TC170582 MAT--------FELYRRSTIGMCLTETLDEMVSNGTLSPELAIQVLVQFD Hordeum_vulgare_TFIIA-S_TC6639 MAT--------FELYRRSTIGMCLTETLDEMVSSGTLSPELAIQVLVQFD Oryza_sativa_TFIIA-S2 MAT--------FELYRRSTIGMCLTDTLDDMVSSGALSPELAIQVLVQFD Zea_mays_TFIIA-S2_TC173972 MAT--------FELYRRSTIGTCLTETLDELVSSGAVSPELAIQVLVQFD Pinus_TFIIA-S_TC16392 MAT--------FELYRKSTIGTCLTETLDELVLNGTLSPEHAIQVLVQFD Populus_trichocarpa_TFIIA-S2 MSTNGNNPAPYFELYRRSSVGLALTDALDELIQSGHINPQLALTVLKQFD Saccharomyces_cerevisiae_TFIIA MAVPG-----YYELYRRSTIGNSLVDALDTLISDGRIEASLAMRVLETFD Homo_sapiens_TFIIA-gamma_NP_00 
KAINAALAQRVRNRVNFR-GSLNTYRFCDNVWTFVLNDVEFREV-----Drosophila_melanogaster_TFIIAKSINNALNQRVKARVTFKAGKLNTYRFCDNVWTLMLNDVEFREV-----Arabidopsis_thaliana_TFIIA-S_A KSMTEALESQVKTKVSIK-GHLHTYRFCDNVWTFILQDAMFKSD-----Mesembryanthemum_crystallinum_ KSMTEALEAQVKTKVTIK-GHLHTYRFCDNVWTFMLQDALFKSE-----Medicago_truncatula_TFIIA-S_TC KSMTEALETQVKSKVSIK-GHLHTYRFCDNVWTFILQDALFKNE-----Glycine_max_TFIIA-S_TC148651 KSMTEALETQVKSKVSIK-GHLHTYRFCDNVWTFILQDALFKNE-----Populus_balsamifera_TFIIA-S1 KSMTEALETKVKSKVTIK-GHLHTYRFCDNVWTFILQDANFKNE-----Vitis_vinifera_TFIIA-S KSMTEALESQVKSKVTIK-GHLHTYRFCDNVWTFILQDALFKNE-----Lycopersicon_esculentum_TFIIAKSMTEALETQVKSKVTIK-GHLHTYRFCDNVWTFILQDAVFKSE-----Solanum_tuberosum_TFIIA-S_TC60 KSMTEALETQVKSKVTIK-GHLHTYRFCDNVWTFILQDAVFKSE-----Triticum_aestivum_TFIIA-S1_TC7 KSMTEALENQVKSKVTVK-GHLHTYRFCDNVWTFILTDAQFKNE-----Triticum_aestivum_TFIIA-S2_TC7 KSMTDALETQVKSKVTVK-GHLHTYRFCDNVWTFILTDAQFKNE-----Oryza_sativa_TFIIA-S_AAK73129 KSMTEALENQVKSKVSIK-GHLHTYRFCDNVWTFILTEASFKNE-----Triticum_aestivum_TFIIA-S3_CA4 KSMTDALENQVKSKVNIK-GHLHTYRFCDNVWTFILTDASFKNE-----Zea_mays_TFIIA-S1_TC170582 KSMTDALENQVKSKVTVK-GHLHTYRFCDNVWTFILTDASFKNE-----Hordeum_vulgare_TFIIA-S_TC6639 KSMTEALENQVKSKVTVK-GHLHTYRFCDNVWTFILTDAQFKNE-----Oryza_sativa_TFIIA-S2 KSMTSALEHQVKSKVTVK-GHLHTYRFCDNVWTFILTDAIFKNE-----Zea_mays_TFIIA-S2_TC173972 KSMTEALEMQVKSKVSVK-GHLHTYRFCDNVWTFILTDATFKSE-----Pinus_TFIIA-S_TC16392 KSMAEALETQVKSKVTIK-GHLHTYRFCDNVWTFLLQDAQFKGE-----Populus_balsamifera_TFIIA-S2 KSASQVLSTQLRSKCLIK-GHLSTYRLCDEVWTFLLRDSIYKLEG----Saccharomyces_cerevisiae_TFIIA KVVAETLKDNTQSKLTVK-GNLDTYGFCDDVWTFIVKNCQVTVEDSHRDA Homo_sapiens_TFIIA-gamma_NP_00 --------TELIKVDKVKIVACDGKNTGSNTTE Drosophila_melanogaster_TFIIA--------HEIVKVDKVKIVACDGK-SGEF--Arabidopsis_thaliana_TFIIA-S_A --------DRQENVSRVKIVACDSKLLTQ---Mesembryanthemum_crystallinum_ --------ECQENVSRVKIVACDSKLLTQ---Medicago_truncatula_TFIIA-S_TC --------DNQENVGRVKIVACDSKLLSQ---Glycine_max_TFIIA-S_TC148651 
--------DSQENVGRVKIVACDSKLLTQ---Populus_balsamifera_TFIIA-S1 --------DSQENVGRVKIVACDSKLLTQ---Vitis_vinifera_TFIIA-S --------ESQENVGRVKIVACDSKLLTQ---Lycopersicon_esculentum_TFIIA--------ECQETVNRVKIVACDSKLLTQ----

PAGE 211

198 Solanum_tuberosum_TFIIA-S_TC60 --------ECQETVNRVKIVACDSKLLTQ---Triticum_aestivum_TFIIA-S1_TC7 --------ETTEQVGKVKIVACDSKLLSQ---Triticum_aestivum_TFIIA-S2_TC7 --------ETTEQVGKVKIVACDSKLLSQ---Oryza_sativa_TFIIA-S_AAK73129 --------ETTEQVGKVKIVACDSKLLSQ---Triticum_aestivum_TFIIA-S3_CA4 --------ETTEQVGKVKIVACDSKLLGQ---Zea_mays_TFIIA-S1_TC170582 --------EATEQVGKVKIVACDSKLLGQ---Hordeum_vulgare_TFIIA-S_TC6639 --------EITEQVSKVKIVACDSKLLSQ---Oryza_sativa_TFIIA-S2 --------EITETINKVKIVACDSKLLETKEEZea_mays_TFIIA-S2_TC173972 --------EIQETLGRVKIVACDSKLLQPQHPPinus_TFIIA-S_TC16392 --------DIHEQAGRVKIVACDSKILTQ---Populus_balsamifera_TFIIA-S2 -------GEQVGPVKRVKIVACKGNAGASAPPA Saccharomyces_cerevisiae_TFIIA SQNGSGDSQSVISVDKLRIVACNSKKSE----TFIIA Large Subunit Alignment CLUSTAL X (1.83) multiple sequence alignment Arabidopsis_thaliana_TFIIA-L1_ ---MGTTTTTSAVYIHVIEDVVNKVREEFINNGGPGESVLSELQGIWETK Arabidopsis_thaliana_TFIIA-L2_ ---MGTTTTTSAVYIHVIEDVVNKVREEFINNGGPGESVLSELQGIWETK Glycine_max_TFIIA-L_TC192713 ----MAASTTSQVYIQVIDDVMNKVRDEFVNNGGPGDEVLKELQSIWESK Populus_balsamifera_TFIIA-L1 ----MASSATSTVYTEVIEDVIDKVRDEFINNGGPGETVLSELQGLWEKK Solanum_tuberosum_TFIIA-L-STtu ----MASSTTSNVYIHVIEDVISKVRDEFISNGGPGESILKELQALWEVK Hordeum_vulgare_TFIIA-L_Barley ----MASGNVSTVYISVIDDVVAKVREDFITYG-VGDAVLNELQALWEMK Oryza_sativa_TFIIA-L ----MASSNVSTVYISVIDDVISKVRDDFISYG-VGDAVLNELQALWEMK Zea_mays_TFIIA-L_TC183075 ----MASSNVSTVYISVIDDVISKVREDFITYG-VGDAVLNELQALWEMK Arabidopsis_thaliana_TFIIA-L3_ --MVLSTSDTSSSYNYVIDDVINKSRCDLVYNGELDESVLSQIQSMWKTK Homo_sapiens_TFIIA-alpha/beta---MACLNPVPKLYRSVIEDVIEGVRNLFAEEG-IEEQVLKDLKQLWETK Homo_sapeins-TFIIA-alpha-beta_NP MANSANTNTVPKLYRSVIEDVINDVRDIFLDDG-VDEQVLMELKTLWENK Drosohphila_melanogaster_TFIIA --MALCQTSVLKVYHAVIEDVITNVRDAFLDEG-VDEQVLQEMKQVWRNK Saccharomyces_cerevisiae_TOA1p ----MSNAEASRVYEIIVESVVNEVREDFENAG-IDEQTLQDLKNIWQKK Arabidopsis_thaliana_TFIIA-L1_ MMQAGVLNGPIERSSAQKPTP-----GGPLT--HDLNVPYEGT-EEYETP Arabidopsis_thaliana_TFIIA-L2_ 
MMQAGVLNGPIERSSAQKPTP-----GGPLT--HDLNVPYEGT-EEYETP Glycine_max_TFIIA-L_TC192713 MMQAGAIVGPIERSGAPKPTP-----GGPITPVHDLNMPYEGT-EEYETP Populus_balsamifera_TFIIA-L1 LMQAGVLSGPIVRSSANKQLV-----PGGLTPVHDLNVPYEGT-EEYETP Solanum_tuberosum_TFIIA-L-STtu MMNAGAILGTIERNSAAKATP-----GGPITPVHDLNMPYEGN-EEYETP Hordeum_vulgare_TFIIA-L_Barley MLHCGAISGNIDRNRAPPASAGGAPGAGATPPVHDLNVPYEATSEEYATP Oryza_sativa_TFIIA-L MLHCGAISGTIDRSKAAPAPSAGTPGAGTTPPVHDLNVPYEATSEEYATP Zea_mays_TFIIA-L_TC183075 MLHCGAISGNIDRTKAAAASVGGTT--GTTAPVHDLNVPYEATSEEYATP Arabidopsis_thaliana_TFIIA-L3_ MIQAGAMSGTIETSSAS-------------------------------IP Homo_sapiens_TFIIA-alpha/betaVLQSKATEDFFR--NSIQSPLFTLQLPHSLHQTLQSSTASLVIPAGRTLP Homo_sapeins-TFIIA-alpha-beta_NP LMQSRAVDGFHS--EEQQLLLQVQQQHQPQQQQHHHHHHHQQAQPQQTVP Drosohphila_melanogaster_TFIIA LLASKAVELSPDSGDGSHPPPIVANNPKAANAKAKKAAAATAVTSHQHIG Saccharomyces_cerevisiae_TOA1p LTETKVTTFSWDNQFNEGNIN------GVQNDLNFNLATPGVNSSEFNIK Arabidopsis_thaliana_TFIIA-L1_ TAEMLFPPTPLQT------------PLP--------------TPLPGTAD Arabidopsis_thaliana_TFIIA-L2_ TAEMLFPPTPLQT------------PLP--------------TPLPGTAD Glycine_max_TFIIA-L_TC192713 TAEMLFPPTPLQT------------PLQ--------------TPLPGTVD Populus_balsamifera_TFIIA-L1 TAEILFPPTPLQTPMQTPLPGSAQTPLPGNVQTPLPG--NVPTPLPGSVD Solanum_tuberosum_TFIIA-L-STtu TADILFPPTPLQT----PLPGTAQTPLPGTVQTPLPG--TAQTPLPGTAD Hordeum_vulgare_TFIIA-L_Barley TADMLFPPTPLQT------------PIQ--------------TPLPGIDT Oryza_sativa_TFIIA-L TADMLFPPTPLQT------------PIQ--------------TPLPGTDA Zea_mays_TFIIA-L_TC183075 TADMLFPPTPLQT------------PIQ--------------TPLPGTDT Arabidopsis_thaliana_TFIIA-L3_ TTPVIVQTT-LQT------------PDA--------------IPLPEKKM Homo_sapiens_TFIIA-alpha/betaSFTTAELGTSNSSANFTFPGYPIHVPAGVTLQTVSGHLYKVNVPIMVTET Homo_sapeins-TFIIA-alpha-beta_NP QQAQTQQVLIPASQQATAP--QVIVPDSKLIQHMN----ASNMSAAATAA Drosohphila_melanogaster_TFIIA GNSSMSSLVGLKSSAGMAAGSGIRNGLVPIKQEVN------SQNPPPLHP Saccharomyces_cerevisiae_TOA1p EENTGNEGLILPN-----------------------------INSNNNIP 
Arabidopsis_thaliana_TFIIA-L1_ NSSMYNIPTG-----SSDYP-TPGTENG-----VNIDVK-A--RPSPYMArabidopsis_thaliana_TFIIA-L2_ NSSMYNIPTG-----SSDYP-TPGTENG-----VNIDVK-A--RPSPYMGlycine_max_TFIIA-L_TC192713 N-SMYNIPTG-----PSDYP-SAGNEPG-----ANNEIKGG--RPGPYMQ Populus_balsamifera_TFIIA-L1 NSSMYNISTG----SSSDYP-TPVSDAG-----GSTDVKAG--RPSHFMSolanum_tuberosum_TFIIA-L-STtu -SSMYNIPTGGTPFTPSDY--SPLNDTG-----GATELKAGPGRPSPFMHordeum_vulgare_TFIIA-L_Barley --GMYNIPTG-----PSDYAPSPISDMRNGMGMNGSDPKTG--RPSPYMOryza_sativa_TFIIA-L --GMYNIPTG-----PSDYAPSPISDVRNGMAMNGADPKTG--RPSPYMZea_mays_TFIIA-L_TC183075 --AMYNIPTG-----PSDYAPSPISDIRNGMTINGADPKAG--HPSPYMArabidopsis_thaliana_TFIIA-L3_ S------------------------------------------------Homo_sapiens_TFIIA-alpha/betaSGRAGILQHPIQQVFQQLGQPSVIQTSVPQLNPWSLQATTEKSQRIETVL Homo_sapeins-TFIIA-alpha-beta_NP TLALPAGVTPVQQILTNSGQLLQVVR--------AANGAQYIFQPQQSVV

199 Drosohphila_melanogaster_TFIIA TSAASMMQKQQQAASSGQGSIPIVAT-----------------LDPNRIM Saccharomyces_cerevisiae_TOA1p HSGETNINTN---------------------------------------Arabidopsis_thaliana_TFIIA-L1_ PPPSPWTNPR--------LDVNVAYVDGRD-EPERGNSNQQFTQDLFVPS Arabidopsis_thaliana_TFIIA-L2_ PPPSPWTNPR--------LDVNVAYVDGRD-EPERGNSNQQFTQDLFVPS Glycine_max_TFIIA-L_TC192713 PPPSPWTNQNQNQNQRAPLDVNVAYVEGRD-EAERGASNQPLTQDFFMSS Populus_balsamifera_TFIIA-L1 QSPSPLMHQR------PPLDVN---VGKSYFYAPRRVHGQ---KDFFMSS Solanum_tuberosum_TFIIA-L-STtu HPPSPWLNQR------PPLDVNGAYVEGREEVGDRGGSQQPMTQDFFMNS Hordeum_vulgare_TFIIA-L_Barley QPPSPWMNQRP-----LGVDVNVAYEESRE-DPDRLMQPQPLTKDFLMMS Oryza_sativa_TFIIA-L PPPSPWMTQRP-----LGVDVNVAYVENRE-DPDRTGQPPQLTKDFLMMS Zea_mays_TFIIA-L_TC183075 PPPSPWMNQRP-----LGVDVNVAYVEGRE-DPDRGVQPQPLTQDFLTMS Arabidopsis_thaliana_TFIIA-L3_ -------------------------------------------------Homo_sapiens_TFIIA-alpha/betaQQPAILPSGPVDRKHLENATSDILVSPGNEHKIVPEALLCHQESSHYISL Homo_sapeins-TFIIA-alpha-beta_NP LQQQVIP----------------QMQPGGVQAPVIQQVLAPLPG-GISPQ Drosohphila_melanogaster_TFIIA PVNITLP------------------SPAGSASSESRVLTIQVPASALQEN Saccharomyces_cerevisiae_TOA1p -----------------------TVEATNNSGATLNTNTSGNTNADVTSQ Arabidopsis_thaliana_TFIIA-L1_ SGKRKRDDSSGHYQNGGSIPQQDGAGDA---IPEANFECDAFR------Arabidopsis_thaliana_TFIIA-L2_ SGKRKRDDSSAHYQNGGSIPQQDGASDA---IPEANFECAALR------Glycine_max_TFIIA-L_TC192713 -GKRKRDEIASQYNAGGYIPQQDGAGDAASQILEIEVYGGGMS------Populus_balsamifera_TFIIA-L1 -GKRKRGDFAPKYNNGGFIPQQDGAVDSASEVSQVSQGNNPHG------Solanum_tuberosum_TFIIA-L-STtu AGKRKREDFPPQYHNGGYIPQQDGAADSIYDNLKSGEGSNIQL------Hordeum_vulgare_TFIIA-L_Barley SGKRKRDEYPGQLPSGSFVPQQDGCADQVAEFVGSKDNAQQVWNS----Oryza_sativa_TFIIA-L SGKRKRDEYPGQLPSGSFVPQQDGSADQIVEFVVSKDNAQQLWSS----Zea_mays_TFIIA-L_TC183075 SGKRKRDEYPGQLPSGSFVPQQDGSADQIVEFVVSKENANQHWSS----Arabidopsis_thaliana_TFIIA-L3_ -PKKESDGFY-------YIPQQDGARDEA--------------------Homo_sapiens_TFIIA-alpha/betaPGVVFSPQVSQTNSNVESVLSGSASMAQNLHDESLSTSPHGALHQHVTDI 
Homo_sapeins-TFIIA-alpha-beta_NP TGVIIQPQQILFTGNKTQVIPTTVAAPTPAQAQITATGQQ---------Drosohphila_melanogaster_TFIIA Q-----LTQILTAHLISSIMSLPTTLASSVLQQHVNAALS---------Saccharomyces_cerevisiae_TOA1p PKIEVKPEIELTINNANITTVENIDDESEKKDDEEKEE-----------Arabidopsis_thaliana_TFIIA-L1_ -----------ITS---IGDRKVPRD------------------FFSSSS Arabidopsis_thaliana_TFIIA-L2_ -----------ITY---VGDRKIPRD------------------LIGSSS Glycine_max_TFIIA-L_TC192713 -----------IDAGHSTSKGKMPAQ------------------SDRPAS Populus_balsamifera_TFIIA-L1 -----------RCDTITTKNREILAR------------------VSRSYV Solanum_tuberosum_TFIIA-L-STtu -----------ELVTVGP--------------------------VQASAY Hordeum_vulgare_TFIIA-L_Barley -----------ILNKQESVTKTLSIK------------------ESTIPP Oryza_sativa_TFIIA-L -----------IVNKQGTATKESSTK------------------ETIIAP Zea_mays_TFIIA-L_TC183075 -----------IINKLETPTKT-------------------------VTP Arabidopsis_thaliana_TFIIA-L3_ -------------------------------------------------Homo_sapiens_TFIIA-alpha/betaQLHILKNRMYGCDSVKQPRNIEEPSNIPVSEKDSNSQVDLSIRVTDDDIG Homo_sapeins-TFIIA-alpha-beta_NP ------------QPQAQPAQTQAP-------------------------Drosohphila_melanogaster_TFIIA -------------SANHQKTLAAAK------------------------Saccharomyces_cerevisiae_TOA1p -------------------------------------------------Arabidopsis_thaliana_TFIIA-L1_ KIPQVDGPMPDPYDEMLSTPNIYSYQGP-SEEFNEARTPAPNEIQTSTPV Arabidopsis_thaliana_TFIIA-L2_ KIPQVDGPMPDPYDEMLSTPNIYSYQGP-NEEFNEARTPAPNEIQTSTPV Glycine_max_TFIIA-L_TC192713 QIPQLDGPIPY-DDDVLSTPNIYNYGVF-NEDYNIANTPAPSEVPASTPA Populus_balsamifera_TFIIA-L1 KIPQVDGPIPDPYDDMLSTPNIYNYQGVANEDYNIASTPAPNDLQASTPA Solanum_tuberosum_TFIIA-L-STtu RIPQFDGPIPDSYDDALSTPNIY-YQGVVNEDYNIVNTPAPNDMQAPTPA Hordeum_vulgare_TFIIA-L_Barley VLPQRDGIQDDYNDQFF-------FPGVPTEDYNTPGESSEYRTPTPAIA Oryza_sativa_TFIIA-L TIPQRDG-MDDYNDPFY-------FQGVPTEDYNTPGESSEYRAPTPAVG Zea_mays_TFIIA-L_TC183075 VIPQCDGIQDDYNDQFF-------FPGVPTEDYNTPGESAEYRAPTPAVG Arabidopsis_thaliana_TFIIA-L3_ 
-IVDVD-------------------------------------------Homo_sapiens_TFIIA-alpha/betaEIIQVDGSGDTSSNEEIGSTRDADENEFLGNIDGGDLKVPEEEADSISNE Homo_sapeins-TFIIA-alpha-beta_NP LVLQVDGTGDTSSEE--------DEDE-------------EEDYDDDEEE Drosohphila_melanogaster_TFIIA ---QLDGALDSSDED--------ESEE------------SDDNIDNDDDD Saccharomyces_cerevisiae_TOA1p ---DVEKTRKE-----------------------------KEQIEQVKLQ Arabidopsis_thaliana_TFIIA-L1_ AVQN-DIIE--------DDEELLNE-DDDDDELDDLESGEDM-NTQHLVL Arabidopsis_thaliana_TFIIA-L2_ AVPN-DIIE--------DDEELLNE-DDDDDELDDLESGEDM-NTQHLVL Glycine_max_TFIIA-L_TC192713 PIAQ-NEVDE----EDDDDEPPLNE-NDDDD-LDDLDQGEDQ-NTHHLVL Populus_balsamifera_TFIIA-L1 VVSQNDDVD-------DDDDEPLNEDDDDDEDLDGVDQGEEL-NTQHLIL Solanum_tuberosum_TFIIA-L-STtu PALQNDDID-------DDD-EPLNEDDDD--DLDDVDQGEDL-NTAHLVL Hordeum_vulgare_TFIIA-L_Barley TPKPRNDMAGGDDDDDDDDEPPLNEDDDDDDEIDDLQDGDEEPNTQHLVL Oryza_sativa_TFIIA-L TPKPRNDVG-------DDDEPPLNEDDDDDDELDDLEQGEDEPNTQHLVL Zea_mays_TFIIA-L_TC183075 TPKPRNDAGDDNDDDDDDEEPPLNEDDDDDDDLDDLEEGEDEPNTQHLVL Arabidopsis_thaliana_TFIIA-L3_ -----------------ENEEPLNE--DDDDEEDDID--DDDMNIQHLVM Homo_sapiens_TFIIA-alpha/betaDSATNSSDNEDPQVN-IVEEDPLNSGDDVSE-----QDVPDLFDTDNVIV Homo_sapeins-TFIIA-alpha-beta_NP DKEKDGAEDGQ------VEEEPLNSEDDVSD-----EEGQELFDTENVVV

Drosohphila_melanogaster_TFIIA DLDKDDDEDAEHED--AAEEEPLNSEDDVTD-----EDSAEMFDTDNVIV
Saccharomyces_cerevisiae_TOA1p AKKEKRSAL--------LDTDEVGSELDDSDDDYLISEGEEDGPDENLML

Arabidopsis_thaliana_TFIIA-L1_ AQFDKVTRTKSRWKCSLKDGIMHINDKDILFNKAAGEFDF
Arabidopsis_thaliana_TFIIA-L2_ AQFDKVTRTKSRWKCSLKDGIMHINDKDILFNKATGEFDF
Glycine_max_TFIIA-L_TC192713 AQFDKVTRTKSRWKCTLKDGIMHINNKDILFNKATGEFDF
Populus_balsamifera_TFIIA-L1 AQFDKVTRTKSRWKCTLKDGVMHINNRDILFNKATGEFEF
Solanum_tuberosum_TFIIA-L-STtu AQFDKVTRTKSRWKCTLKDGIMHINNKDILFNKANGEFDF
Hordeum_vulgare_TFIIA-L_Barley AQFDKVTRTKNRWKCTLKDGIMHLNGRDVLFNKASGEFDF
Oryza_sativa_TFIIA-L AQFDKVTRTKNRWKCTLKDGIMHLNGRDVLFNKVVNMIF-
Zea_mays_TFIIA-L_TC183075 AQFDKVTRTKNRWKCTLKDGIMHLNGRDVLFNKATGEFDF
Arabidopsis_thaliana_TFIIA-L3_ CQFDKVKRSKNKWECKFNAGVMQINGKNVLFSQATGDFNF
Homo_sapiens_TFIIA-alpha/beta CQYDKIHRSKNKWKFYLKDGVMCFGGRDYVFAKAIGDAEW
Homo_sapeins-TFIIA-alpha-beta_NP CQYDKIHRSKNKWKFHLKDGIMNLNGRDYIFSKAIGDAEW
Drosohphila_melanogaster_TFIIA CQYDKITRSRNKWKFYLKDGIMNMRGKDYVFQKSNGDAEW
Saccharomyces_cerevisiae_TOA1p CLYDKVTRTKARWKCSLKDGVVTINRNDYTFQKAQVEAEWV

TFIIB Family Alignment

Cysteine and histidine residues potentially involved in metal ion chelation are highlighted in yellow. A conserved lysine residue found to be autoacetylated in human and yeast TFIIB proteins is shown in green (Choi et al., 2003). The first core-TFIIB imperfect direct repeat is highlighted in grey, and the second direct repeat is in a red font (Nikolov et al., 1995).
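The colour highlighting described in this caption does not survive in a plain-text rendering of the alignment, but cysteine and histidine positions can be recovered from any aligned row. The following is a minimal Python sketch and is not part of the original analysis: the function name `find_residues` and its parameters are illustrative assumptions, and the example row is the first 50-column block of the Arabidopsis_thaliana_TFIIB2_At sequence in the TFIIB alignment. It lists every C and H in the row; it makes no claim about which of them actually chelate a metal ion.

```python
def find_residues(aligned_row, residues="CH", gap="-"):
    """Return (alignment_column, ungapped_position, residue) for each match.

    Both coordinates are 1-based. Gap characters advance the alignment
    column but not the position within the ungapped sequence.
    """
    hits = []
    position = 0  # running residue count in the ungapped sequence
    for column, symbol in enumerate(aligned_row, start=1):
        if symbol == gap:
            continue
        position += 1
        if symbol in residues:
            hits.append((column, position, symbol))
    return hits


# First 50-column block of the Arabidopsis_thaliana_TFIIB2_At row:
# 16 leading gap columns followed by the N-terminal residues.
row = "-" * 16 + "MSDAFCSDCKRHT-EVVFDHSAGDTVCS-----E"
hits = find_residues(row)
for column, position, symbol in hits:
    print(f"column {column:2d}  residue {symbol}{position}")
```

Reporting both the alignment column and the ungapped position keeps the result usable for comparing rows (columns are shared across sequences) and for referring to a residue within a single protein.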
CLUSTAL X (1.83) multiple sequence alignment Arabidopsis_thaliana_BRF1_At2g ------------------MVWCKHCGKNVPG--IRPYDAALSCD-----L Arabidopsis_thaliana_BRF2_At3g ------------------MVWCNHCVKNVPG--IRPYDGALACN-----L Arabidopsis_thaliana_BRF3_At2g -------------------------------------------------Drosophila_melanogaster_BRF_AA ---------------MSTGLKCRNCGSNEIE--EDNARGDRVCM-----N Homo_sapiens_BRF_NP_001510.2 ----------------MTGRVCRGCGGTDIE--LDAARGDAVCT-----A Saccharomyces_cerevisiae_BRF_N ------------------MPVCKNCHGTEFERDLSNANNDLVCK-----A Populus_balsamifera_TFIIB7/pBrp ----------------MKCPYCSATQGRC-ATTTTTNRCITECT-----S Arabidopsis_thaliana_TFIIB5_At ----------------MKCPYCSSAQGRC-TTTSSG-RSITECS-----S Lycopersicon_esculentum_AAG011 ----------------MRCPYCSAEQGRC-TSSTSG-RPITECT-----S Populus_balsamifera_TFIIB6 ----------------MGDAFCSDCKRHT-EVVFDHSAGDTVCS-----E Populus_balsamifera_TFIIB4 ----------------MGDAFCSDCKKHT-EVVCDHSAGDTVCS-----E Populus_balsamifera_TFIIB5 ----------------MEDSYCPDCKRLT-EIVFDHSAGDTICS-----E Oryza_sativa_TFIIB1_AF464908 ----------------MSDSFCPDCKKHT-EVAFDHSAGDTVCT-----E Triticum_aestivum_TC68795 ----------------MGDSYCQDCKKHT-EVAFDHSAGDTVCT-----E Mesembryanthemum_crystallinum_ ----------------MSDAFCSDCKKCT-EVVFDHSAGDTVCS-----E Vitis_vinifera_TFIIB_TC19782 ----------------MADAFCTDCKKNT-EVVFDHSAGDTVCS-----E Populus_balsamifera_TFIIB1 ----------------MGDAFCSDCKRHT-EVVFDHSAGDTVCS-----E Citrus_sinensis_TFIIB_CB292941 ----------------MTDAFCSDCKKHT-EVVFDHSAGDTVCS-----E Glycine_max_TFIIB_U31097 ----------------MSDAFCSDCKRQT-EVVFDHSAGDTVCS-----E Medicago_truncatula_TC86832 ----------------MSDAFCSDCKRAT-EVVFDHSAGDAVCS-----E Arabidopsis_thaliana_TFIIB2_At ----------------MSDAFCSDCKRHT-EVVFDHSAGDTVCS-----E Arabidopsis_thaliana_TFIIB1_At ----------------MSDAYCTDCKKET-ELVVDHSAGDTLCS-----E Lycopersicon_esculentum_TFIIB_T -----------------MDTYCSDCKRNT-EVVFDHAAGDTVCS-----E Solanum_tuberosum_TFIIB1_TC587 -----------------MDTYCSDCKRNT-EVVFDHAAGDTVCS-----E 
Populus_balsamifera_TFIIB3 ---------------------------------MNRTITNIKSA-----S Populus_balsamifera_TFIIB8 -------------------------------------------------Arabidopsis_thaliana_TFIIB4_At --------CGLEFKYRPIGDLSPVAENDT-VRLPDPTNTLLSNT-----D Populus_balsamifera_TFIIB2 --------MARNGEIDDYRDYCKDCKANT-YIVLDHCTGDTICS-----D Oryza_sativa_TFIIB2_AAN59779 ----------MCSVAGNEQCYCPECHRTT-VVVVDHATGDTICT-----E Drosophila_melanogaster_TFIIB_ --------MASTSRLDNN-KVCCYAHPES-PLIEDYRAGDMICS-----E Homo_sapiens_TFIIB_NM_001514 --------MASTSRLDALPRVTCPNHPDA-ILVEDYRAGDMICP-----E Arabidopsis_thaliana_TFIIB3_At ----------------MEEETCLDCKRPT-IMVVDHSSGDTICS-----E Arabidopsis_thaliana_TFIIB6_At ---------------MKEDGICLECKRPT-ETVVNYKNGDTICI-----E

201 Saccharomyces_cerevisiae_TFIIB ----------------NIVLTCPECKVYPPKIVERFSEGDVVCA-----L Methanosarcina_acetivorans_TFB --------------FENEKAVCPECGSRN--LVHDYERAELVCG-----D Sulfolobus_solfataricus_TFB_AA ----------MLYLSEENKSVSTPCPPDK--IIFDAERGEYICS-----E Populus_balsamifera_TFIIB9 --------------------------------MSPALASSGASI-----E Arabidopsis_thaliana_BRF4_At4g MRCKRCNGSNFERDEDTGNSYCGGCGTLREYDNYEAQLGGIRGPQGTYIR Lycopersicon_esculentum_TFIIB_AF ---------------YLDLEDIIKENALPFLPAKSAVKFQAVCR-----D Arabidopsis_thaliana_BRF1_At2g CGRILENFNFSTEVTFVKNAAGQSQ---ASGNILKSVQSGMS-------Arabidopsis_thaliana_BRF2_At3g CGRILENFHFSTEVTFVKNAAGQSQ---ASGNIVRSVQSGIT-------Arabidopsis_thaliana_BRF3_At2g -------------------------------------------------Drosophila_melanogaster_BRF_AA CGSVLEDSLIVSEVQFEEV-GHGAA---AIGQFVSAESSGGATNYGYG-Homo_sapiens_BRF_NP_001510.2 CGSVLEDNIIVSEVQFVESSGGGSS---AVGQFVSLDGAGKTPTLG-G-Saccharomyces_cerevisiae_BRF_N CGVVSEDNPIVSEVTFGETSAGAAV---VQGSFIGAGQSHAA-------Populus_balsamifera_TFIIB7/pBrp CGRVVEERQFHPHHLFHLRAQDTPL---CLVTSDLPTLHHHHQ-NEEDPF Arabidopsis_thaliana_TFIIB5_At CGRVMEERQTQNHHLFHLRAQDTPL---CLVTSDLQTAAQPSPEDEEDPF Lycopersicon_esculentum_AAG011 CGRVVEERLTQSHHLFHTRAQDSPL---CLATSDLPTLPISATNDDEDPF Populus_balsamifera_TFIIB6 CGLVLESHSIDETSEWRTFANESG-----DNDPVRVGGPTNPLLTDGG-Populus_balsamifera_TFIIB4 CGLVLESHSIDETSEWRIFANESG-----DNDPVRVGGPTNPLLTDGG-Populus_balsamifera_TFIIB5 CGLILEAHSVDETSEWRTFSNESS-----DHDPNRVGGPLNPLLADGG-Oryza_sativa_TFIIB1_AF464908 CGLVLEAHSVDETSEWRTFANESS-----DNDPVRVGGPTNPLLTDGG-Triticum_aestivum_TC68795 CGLVLEAHSVDETSEWRTFANESN-----DNDPVRVGGPSNPLLTDGG-Mesembryanthemum_crystallinum_ CGLVLESHSIDETSEWRTFANESN-----DNDPVRVGGPTNPLLSDGG-Vitis_vinifera_TFIIB_TC19782 CGLVLESHSIDETSEWRTFANESG-----DNDPVRVGGPSNPLLTDGG-Populus_balsamifera_TFIIB1 CGLVLESHSIDETSEWRTFANESG-----DNDPVRVGGPTNPLLTDGG-Citrus_sinensis_TFIIB_CB292941 CGLVLESHSIDETSEWRTFANESG-----DNDPVRVGGPTNPLLADGG-Glycine_max_TFIIB_U31097 
CGLVLESHSIDETSEWRTFANESG-----DNDPNRVGGPSNPLLTDGG-Medicago_truncatula_TC86832 CGLVLESHSIDETSEWRTFANESG-----DNDPVRVGGPSNPLLTDGG-Arabidopsis_thaliana_TFIIB2_At CGLVLESHSIDETSEWRTFANESG-----DNDPVRVGGPTNPLLADGG-Arabidopsis_thaliana_TFIIB1_At CGLVLESHSIDETSEWRTFANESS-----NSDPNRVGGPTNPLLADSA-Lycopersicon_esculentum_TFIIB_T CGLVLESRSIDETSEWRTFADESG-----GDDPNRVGGPVNPLLGDAA-Solanum_tuberosum_TFIIB1_TC587 CGLVLESRSIDETSEWRTFADESG-----GDDPNRVGGPVNPLLGDAA-Populus_balsamifera_TFIIB3 PSLSLTRKGLDAFLVW-LFMRIS---------------PNVSFSFSGD-Populus_balsamifera_TFIIB8 ------------MENNLMFVWRS------------PISIAAAVIY----Arabidopsis_thaliana_TFIIB4_At LSIVTTEHKNGSFDDSLSLNLGNS-----SKPRLDPVSIATAKLMNGSSN Populus_balsamifera_TFIIB24 CGLVLESCYIDEIAEWRTFNDDNN-----DKDPNRVGYNVNPLLSQGN-Oryza_sativa_TFIIB2_AAN59779 CALVLEERYIDETSEWRTFSDAGSG---EDRDPNRVGGCSDPFLSHAE-Drosophila_melanogaster_TFIIB_ CGLVVGDRVIDVGSEWRTFSNEKS-----GVDPSRVGGPENPLLSGGD-Homo_sapiens_TFIIB_NM_001514 CGLVVGDRVIDVGSEWRTFSNDKA-----TKDPSRVGDSQNPLLSDGD-Arabidopsis_thaliana_TFIIB3_At CGLVLEAHIIEYSQEWRTFASDDNH---SDRDPNRVGAATNPFLKSGD-Arabidopsis_thaliana_TFIIB6_At CGHVIENNIID------------------DLD----GASTNPNLKSGH-Saccharomyces_cerevisiae_TFIIB CGLVLSDKLVDTRSEWRTFSNDDHN----GDDPSRVGEASNPLLDGNN-Methanosarcina_acetivorans_TFB CGLVIDADFVDEGPEWRAFDHDQR------MKRSRVGAPMTYTIHDKG-Sulfolobus_solfataricus_TFB_AA TGEVLEDKIIDQGPEWRAFTPEEK------EKRSRVGGPLNNTIHDRG-Populus_balsamifera_TFIIB9 LALYAGSKWFGSNRVFFPPPPPSL------TALGCIGQAKKKWVSLSS-Arabidopsis_thaliana_BRF4_At4g VGTIGRGSVLDYKDKKIYEANNLIE---ETTERLNLGNKTEVIKSMISKL Lycopersicon_esculentum_TFIIB_AF WRLQISAPLFAHKQSLSCNSTSGIFSQLNRGSPFLIPIDANSCGVPDP-Arabidopsis_thaliana_BRF1_At2g ------------------------------------------------SS Arabidopsis_thaliana_BRF2_At3g ------------------------------------------------SS Arabidopsis_thaliana_BRF3_At2g -------------------------------------------------Drosophila_melanogaster_BRF_AA ----------------------------------------KFQVGSGTES Homo_sapiens_BRF_NP_001510.2 
----------------------------------------GFHVNLGKES Saccharomyces_cerevisiae_BRF_N -----------------------------------------FGGSSALES Populus_balsamifera_TFIIB7/pBrp EPTG--FITSFSTWSLEPNPVSLRSSLSFSGHLAELERTIELSASTPASArabidopsis_thaliana_TFIIB5_At EPTG--FITAFSTWSLEPSPIFARSSLSFSGHLAELERTLELASSTSNSN Lycopersicon_esculentum_AAG011 EPTG--FITTFSTWSLEPYPVFAQSSISFAGHLAELERVLEMTSTSSSSS Populus_balsamifera_TFIIB6 ----------LSTVIAK-PNGASG-EFL---------SSSLGRWQNRGSN Populus_balsamifera_TFIIB4 ----------LSTVIAK-PNGASG-DFL---------STSLGRWQNRGSN Populus_balsamifera_TFIIB5 ----------LSTTISK-TNGGSN-ELL---------SCSLGKWQSRGAN Oryza_sativa_TFIIB1_AF464908 ----------LSTVIAK-PNGAQG-EFL---------SSSLGRWQNRGSN Triticum_aestivum_TC68795 ----------LSTVIAK-PNGAHG-DFL---------SSSLGRWQNRGSN Mesembryanthemum_crystallinum_ ----------LSTVISK-PNGTTG-DYL---------SSSLGRWQNRGAN Vitis_vinifera_TFIIB_TC19782 ----------LSTVIAK-PNGVSG-DFL---------SSSLGRWQNRGSN Populus_balsamifera_TFIIB1 ----------LSTVIAK-PNGASG-EFL---------SSSLGRWQNRGSN Citrus_sinensis_TFIIB_CB292941 ----------LSTVIAK-PNGASG-EFL---------SSSLGRWQNRGSN Glycine_max_TFIIB_U31097 ----------LSTVIAK-PNGGGGGEFL---------SSSLGRWQNRGSN Medicago_truncatula_TC86832 ----------LSTVIAK-PNGASG-DFL---------SSSLGRWQNRGSN Arabidopsis_thaliana_TFIIB2_At ----------LTTVISK-PNGSSG-DFL---------SSSLGRWQNRGSN Arabidopsis_thaliana_TFIIB1_At ----------LTTVIAK-PNGSSG-DFL---------SSSLGRWQNRNSN Lycopersicon_esculentum_TFIIB_T ----------LSTVISKGPNGSNG-------------DGSLARLQNRGGD Solanum_tuberosum_TFIIB1_TC587 ----------LSTVISKGPNGSNG-------------DGSLARLQNRGGD

202 Populus_balsamifera_TFIIB3 ----------AFFIILK--------------------------------Populus_balsamifera_TFIIB8 ------------IIIQLSD------------------------------D Arabidopsis_thaliana_TFIIB4_At DFLSLGTSQNSETITASSDEFLFSDLGH---------LQKFSFDPLSMAS Populus_balsamifera_TFIIB2 ----------LKTLISN-NKG----------------DHAIPRWQDGVSN Oryza_sativa_TFIIB2_AAN59779 ----------LGTVVAPAKRQAKD---T---------ASPPHVRVDSKSG Drosophila_melanogaster_TFIIB_ ----------LSTIIGP-GTGSASFDAF---------GAPKYQNRRTMSS Homo_sapiens_TFIIB_NM_001514 ----------LSTMIGK-GTGAASFDEF---------GNSKYQNRRTMSS Arabidopsis_thaliana_TFIIB3_At ----------LVTIIEKPKETASSVLSKDDI----STLFRAHNQVKN--H Arabidopsis_thaliana_TFIIB6_At ----------LPTIIFKLSGKSSSLASK---------LRRTQNEMIKNKQ Saccharomyces_cerevisiae_TFIIB ----------LSTRIGKGETTDMRFTKELN----------KAQGKNVMDK Methanosarcina_acetivorans_TFB ----------LSTMIDWRNRDSYGKSISSKNRAQLYRLRKWQRRIRVSNA Sulfolobus_solfataricus_TFB_AA ----------LSTLIDWKDKDAMGRTLDPKRRLEALRWRKWQIRARIQSS Populus_balsamifera_TFIIB9 -----------------------------------------PTWRQCVIK Arabidopsis_thaliana_BRF4_At4g TDGEFGQGEWFPILIGACCYAVVREEGKG----VLSMEEVAYEVGCDLHQ Lycopersicon_esculentum_TFIIB_AF ---------FLNFLPEPVDIKSSSNGLLCCR------GREGDKVYYICNP Arabidopsis_thaliana_BRF1_At2g RERIIRKATDELMNLRDALGIGDDRDDVIVMASNFFRIALDHN------Arabidopsis_thaliana_BRF2_At3g RERRFRIARDELMNLKDALGIGDERDDVIVIAAKFFEMAVEQN------Arabidopsis_thaliana_BRF3_At2g ---------------------------------------MDQN------Drosophila_melanogaster_BRF_AA REVTIKKAKKDITLLCQQLQLS---QHYADTALNFFKMALGRH------Homo_sapiens_BRF_NP_001510.2 RAQTLQNGRRHIHHLGNQLQLN---QHCLDTAFNFFKMAVSRH------Saccharomyces_cerevisiae_BRF_N REATLNNARRKLRAVSYALHIP---EYITDAAFQWYKLALANN------Populus_balsamifera_TFIIB7/pBrp SSNVVVDNLRAYMQIIDVASILGLDCDISDHAFQLFRDCCSAT------Arabidopsis_thaliana_TFIIB5_At SSTVVVDNLRAYMQIIDVASILGLDCDISEHAFQLFRDCCSAT------Lycopersicon_esculentum_AAG011 SSSVVVENLRAYLQIIDVASILRLDSDISDHAFQLFRDCSSAT------Populus_balsamifera_TFIIB6 
PDRGLITAFKTIATMSDRW------------------------------Populus_balsamifera_TFIIB4 PDRGLILAFKTIATMSDRLGLVATIKVFILDVCTLLL--PLM-------Populus_balsamifera_TFIIB5 PDRNRIQAFKSIAAMADRF------------------------------Oryza_sativa_TFIIB1_AF464908 PDRSLILAFRTIANMADRLGLVATIKDRANEIYKKVE--DLK-------Triticum_aestivum_TC68795 PDRSLILAFRTIANMADRLGLVATIKDRANEIYKKVE--DLK-------Mesembryanthemum_crystallinum_ PDRGLILAFKTIATMADRLGLVATIKDRASEIYKKVE--DQK-------Vitis_vinifera_TFIIB_TC19782 PDRGLILAFKTIATMSDRLGLVATIKDRANEIYKKVE--DQK-------Populus_balsamifera_TFIIB1 PDRGLITAFKTIATMSDRLGLVATIKDRANEIYKKVE--DQK-------Citrus_sinensis_TFIIB_CB292941 PDRGLILAFKTIATMSDRLGLVATIKDRANEIYKKVE--DQK-------Glycine_max_TFIIB_U31097 PDRALIQAFKTIATMSDRLGLVATIKDRANEIYKRVE--DQK-------Medicago_truncatula_TC86832 PDRGLILAFKTIGTMAERLGLVPTIKDRANEIYKRVE--DQK-------Arabidopsis_thaliana_TFIIB2_At PDRGLIVAFKTIATMADRLGLVATIKDRANEIYKRVE--DQK-------Arabidopsis_thaliana_TFIIB1_At SDRGLIQAFKTIATMSERLGLVATIKDRANELYKRLE--DQK-------Lycopersicon_esculentum_TFIIB_T PDRAIVLAFKAIATMADRLSLVSTIRDRASEIYKRLE--DQK-------Solanum_tuberosum_TFIIB1_TC587 PDRAIVLAFKAIATMADRLSLVSTIRDRASEIYKRLE--DQK-------Populus_balsamifera_TFIIB3 -DR-------------------------ANEIYKKVE--DQK-------Populus_balsamifera_TFIIB8 KKPLKDISVVTQVAEG---TIKNSYKDLSPHLSQIIP--S---------Arabidopsis_thaliana_TFIIB4_At TKPNKALSIVSIEAISNGLKLPATIKGQANEIFKVVE--S---------Populus_balsamifera_TFIIB2 SDRVLLQGFDIIEIIANRLGLVRPIKDRAKEIFKKIE--EQK-------Oryza_sativa_TFIIB2_AAN59779 QDSSLAVAFRAISDMADRLQLVATIRDRAKELFKKME--EAKL------Drosophila_melanogaster_TFIIB_ SDRSLISAFKEISSMADRINLPKTIVDRANNLFKQVH--DGK-------Homo_sapiens_TFIIB_NM_001514 SDRAMMNAFKEITTMADRINLPRNIVDRTNNLFKQVY--EQK-------Arabidopsis_thaliana_TFIIB3_At EEDLIKQAFEEIQRMTDALDLDIVINSRACEIVSKYDGHANTK------Arabidopsis_thaliana_TFIIB6_At EEDVIKIAYAEIERMTEALGLTFGISNTACKILSKLD---KKN------Saccharomyces_cerevisiae_TFIIB KDNEVQAAFAKITMLCDAAELPKIVKDCAKEAYKLCH---DEK------Methanosarcina_acetivorans_TFB 
TERNLAFALSELDRMASALGLPRTVRETAAVVYRKAV---DKN------Sulfolobus_solfataricus_TFB_AA IDRNLAQAMNELERIGNLLNLPKSVKDEAALIYRKAV---EKG------Populus_balsamifera_TFIIB9 QSCELSKIRFLSPPFQETYTHPYQLRKKKKKTHQEKQ---NRK------Arabidopsis_thaliana_BRF4_At4g LGPMIKRVVDHLDLELREFDLVGLFTKTVTNSPRLTDVDRDKKEKIIKQG Lycopersicon_esculentum_TFIIB_AF FTKQWKELPKSNAYHGSDPAIVLLFEPSLLNFVAEYKIICAFP------Arabidopsis_thaliana_BRF1_At2g -----------FTKGRSKELVFSSCLYLTCRQFKLAVLLI--------DF Arabidopsis_thaliana_BRF2_At3g -----------FTKGRRTELVQASCLYLTCRELNIALLLI--------DF Arabidopsis_thaliana_BRF3_At2g -----------FTKGRRAELVQSSCLYLACRDMKISLLFI--------DF Drosophila_melanogaster_BRF_AA -----------LTRGRKSTHIYAACVYMTCRTEGTSHLLI--------DI Homo_sapiens_BRF_NP_001510.2 -----------LTRGRKMAHVIAACLYLVCRTEGTPHMLL--------DL Saccharomyces_cerevisiae_BRF_N -----------FVQGRRSQNVIASCLYVACRKEKTHHMLI--------DF Populus_balsamifera_TFIIB7/pBrp -----------CLRNRSVEALATAALVQAIREAQEPRTLQVGTLVNGEEI Arabidopsis_thaliana_TFIIB5_At -----------CLRNRSVEALATACLVQAIREAQEPRTLQ--------EI Lycopersicon_esculentum_AAG011 -----------CLRNRSVEALATAALVHAIREAQEPRTLQ--------EI Populus_balsamifera_TFIIB6 --------------------------VREYKLLD---------------V Populus_balsamifera_TFIIB4 -----------VSTVSLKHPLMNMALNLNYHVNK---------------V Populus_balsamifera_TFIIB5 -----------------------FLFYFLWEKND---------------V Oryza_sativa_TFIIB1_AF464908 -----------SIRGRNQDAILAACLYIACRQEDRPRTVK--------EI Triticum_aestivum_TC68795 -----------SIRGRNQDAILAACLYIACRQEDRPRTVK--------EI Mesembryanthemum_crystallinum_ -----------SSRGRNQDAILAACLYIACRQEDKPRTVK--------EI

203 Vitis_vinifera_TFIIB_TC19782 -----------STRGRNQDALLAACLYIACRQEDKPRTVK--------EI Populus_balsamifera_TFIIB1 -----------SSRGRNQDALLAACLYIACRQEDKPRTVK--------EI Citrus_sinensis_TFIIB_CB292941 -----------SSRGRNQDALLAACLYIACRQEDKPRTVK--------EI Glycine_max_TFIIB_U31097 -----------SSRGRNQDALLAACLYIACRQEDKPRTVK--------EI Medicago_truncatula_TC86832 -----------SSRGRNQDALLAACLYIACRQEDKPRTVK--------EI Arabidopsis_thaliana_TFIIB2_At -----------SSRGRNQDALLAACLYIACRQEDKPRTVK--------EI Arabidopsis_thaliana_TFIIB1_At -----------SSRGRNQDALYAACLYIACRQEDKPRTIK--------EI Lycopersicon_esculentum_TFIIB_T -----------CTRGRNLDALVAACIYIACRQEGKPRTVK--------EI Solanum_tuberosum_TFIIB1_TC587 -----------CTRGRNLDALVAACIYIACRQEGKPRTVK--------EI Populus_balsamifera_TFIIB3 -----------SSRGRNQDALLAACLYIACRQEDKPRTVK--------EI Populus_balsamifera_TFIIB8 -----------WFAKEE-------------------------------DI Arabidopsis_thaliana_TFIIB4_At -----------YARGKERNVLFAACIYIACRDNDMTRTMR--------EI Populus_balsamifera_TFIIB2 -----------TCVMRKRDSICAACLFISSRENKLPRTLN--------EI Oryza_sativa_TFIIB2_AAN59779 -----------CARVRNRDAAYAACLHIACRNEGNPRTLK--------EL Drosophila_melanogaster_TFIIB_ -----------NLKGRSNDAKASACLYIACRQEGVPRTFK--------EI Homo_sapiens_TFIIB_NM_001514 -----------SLKGRANDAIASACLYIACRQEGVPRTFK--------EI Arabidopsis_thaliana_TFIIB3_At -----------LRRGKKLNAICAASVSTACRELQLSRTLK--------EI Arabidopsis_thaliana_TFIIB6_At -----------LRGGKSLRGLCAASVSRACRQVNIPKTLK--------EI Saccharomyces_cerevisiae_TFIIB -----------TLKGKSMESIMAASILIGCRRAEVARTFK--------EI Methanosarcina_acetivorans_TFB -----------LIRGRSIEGVAAAALYAACRQCSVPRTLD--------EI Sulfolobus_solfataricus_TFB_AA -----------LVRGRSIESVVAAAIYAACRRMKLARTLD--------EI Populus_balsamifera_TFIIB9 -----------KYSANVLDHLTGDTICIDCGLVLISYYVDEEPEWRTFGI Arabidopsis_thaliana_BRF4_At4g TFLMNCALKWFLSTGRRPMPLVVAVLAFVVQVNGVKVKIDD---LAKDAS Lycopersicon_esculentum_TFIIB_AF -----------STDFDKATEFDIYYSREGCWKIAEEMCFG---------Arabidopsis_thaliana_BRF1_At2g 
SSYLRVSV--YDLGSVYLQLCDMLYITENH-------NYEKLV DPSIFIP Arabidopsis_thaliana_BRF2_At3g SSYLRVSV--YELGSVYLQLCEMLYLVENR-------NYEKLV DPSIFMD Arabidopsis_thaliana_BRF3_At2g SSYLRVSV--YELGSVYLQLCEMLYLVQNK-------NYEELV DPSIFIP Drosophila_melanogaster_BRF_AA SDVQQICS--YELGRTYLKLSHALCIN------------IPSL DPCLYIM Homo_sapiens_BRF_NP_001510.2 SDLLQVNV--YVLGKTFLLLARELCIN------------APAI DPCLYIP Saccharomyces_cerevisiae_BRF_N SSRLQVSV--YSIGATFLKMVKKLHITE-----------LPLA DPSLFIQ Populus_balsamifera_TFIIB7/pBrp SIAANVPQ--KEIGKYIKILGEALQLSQP----------INSN SISVHMP Arabidopsis_thaliana_TFIIB5_At SIAANVQQ--KEIGKYIKILGEALQLSQP----------INSN SISVHMP Lycopersicon_esculentum_AAG011 SVAANLPQ--KEIGKYIKILGEALQLSQP----------INSN SISVHMP Populus_balsamifera_TFIIB6 EFGG----------------F---------------------------Populus_balsamifera_TFIIB4 CFLG----------------ISRVLPGN----------EPYIL FHPDLSPopulus_balsamifera_TFIIB5 CQLW----------------IRLLAIVC----------RLGKC MLNSCWOryza_sativa_TFIIB1_AF464908 CSVANGAT-KKEIGRAKEFIVKQLEVEMG------QSMEMGTI HAGDFLR Triticum_aestivum_TC68795 CSVANGAT-KKEIGRAKEFIVKQLEVEMG------QSMEMGTI HAGDFLR Mesembryanthemum_crystallinum_ CSVANGAS-KKEIGRAKEYIVKQLELEMG------KSVTIGTI HAADFLR Vitis_vinifera_TFIIB_TC19782 CSVANGAT-KKEIGRAKEYIVKQLEAEKG------QSVEMGTI HAGDFMR Populus_balsamifera_TFIIB1 CSVANGAT-KKEIGRAKEYIVKQLGLETG------QSVEMGTI HAGDFMR Citrus_sinensis_TFIIB_CB292941 CSVANGAT-KKEIGRAKEYIVKQLGLETG------QSVEMGTI HAGDFMR Glycine_max_TFIIB_U31097 CSVANGAT-KKEIGRAKEYIVKQLGLENG------NAVEMGTI HAGDFMR Medicago_truncatula_TC86832 CSIANGAT-KKEIGRAKEYIVKQLGLENGG-----QSVEMGTI HAGDFMR Arabidopsis_thaliana_TFIIB2_At CSVANGAT-KKEIGRAKEYIVKQLGLETG------QLVEMGTI HAGDFMR Arabidopsis_thaliana_TFIIB1_At CVIANGAT-KKEIGRAKDYIVKTLGLEPG------QSVDLGTI HAGDFMR Lycopersicon_esculentum_TFIIB_T CSIANGAS-KKEIGRAKEFIVKQLKVEMG------ESMEMGTI HAGDYLR Solanum_tuberosum_TFIIB1_TC587 CSIANGAS-KKEIGRAKEFIVKQLKVEMG------ESMEMGTI HAGDYLR Populus_balsamifera_TFIIB3 CSVANGAT-KKEIGRAKEYIVKQLGLEAG------QSVEMGTI HAGDFMR 
Populus_balsamifera_TFIIB8 KNLHSKHT-NLDEGINICLRLKEAPPHN---------------NEQYTA Arabidopsis_thaliana_TFIIB4_At SRFANKAS-ISDISETVGFIAEKLEINKN---------WYMSI ETANFIK Populus_balsamifera_TFIIB2 SSVVYGVT-KKEINKAVQSIKRHVELED-----------MGTL NPSELVR Oryza_sativa_TFIIB2_AAN59779 ASVMRDCQDKKEIGRMERIIRRHLGEEAG------TAMEMGVV RAADYMS Drosophila_melanogaster_TFIIB_ CAVSKISK--KEIGRCFKLTLKALETSVD------------LI TTADFMC Homo_sapiens_TFIIB_NM_001514 CAVSRISK--KEIGRCFKLILKALETSVD------------LI TTGDFMS Arabidopsis_thaliana_TFIIB3_At AEVANGVD-KKDIRKESLVIKRVLESHQTS-----VSASQAII NTGELVR Arabidopsis_thaliana_TFIIB6_At SAVAN-VD-MKEINK---AVKLLGDSFG--------------------Saccharomyces_cerevisiae_TFIIB QSLIHVKT--KEFGKTLNIMKNILRGKSEDGFLKIDTDNMSGA QNLTYIP Methanosarcina_acetivorans_TFB EEVSRVSR--KEIGRTYRFISRELALKLM------------PT SPIDYVP Sulfolobus_solfataricus_TFB_AA AQYTKANR--KEVARCYRLLLRELDVSVP------------VS DPKDYVT Populus_balsamifera_TFIIB9 EDNINEYD-PNHLGSLSDPLLTHANLATT---------ISKPA KGGTTAV Arabidopsis_thaliana_BRF4_At4g VSLTTCKTRYKELSEKLVKVAEEVGLPWAKD---VTVKNVLKH SGTLFAL Lycopersicon_esculentum_TFIIB_AF SRTIFPKSGIHVNGVVYWMTSKNILAFDLTKG---RTQLLESY GTRGFLG Arabidopsis_thaliana_BRF1_At2g RFSNMLLKGAHN--NKLVLTATHIIASMKRDWMQTG-RKPSGICGAALYT Arabidopsis_thaliana_BRF2_At3g RFSNSLLKGKNN--KDVVATARDIIASMKRDWIQTG-RKPSGICGAALYT Arabidopsis_thaliana_BRF3_At2g RFTNSLLKGAHAKAKDVANTAKNIISSMKRDWIQTG-RKPSGICGAAIYM Drosophila_melanogaster_BRF_AA RFANRLQLGAKT--HEVSMTALRIVQRMKKDCMHSG-RRPTGLCGAALLI Homo_sapiens_BRF_NP_001510.2 RFAHLLEFGEKN--HEVSMTALRLLQRMKRDWMHTG-RRPSGLCGAALLV Saccharomyces_cerevisiae_BRF_N HFAEKLDLADKK--IKVVKDAVKLAQRMSKDWMFEG-RRPAGIAGACILL

204 Populus_balsamifera_TFIIB7/pBrp RFCTLLQLNKSA-----QELATHIGEVVINKCFCTR-RNPISISAAAIYL Arabidopsis_thaliana_TFIIB5_At RFCTLLQLNKSA-----QELATHIGEVVINKCFCTR-RNPISISAAAIYL Lycopersicon_esculentum_AAG011 RFCTLLQLNKSA-----QELATHIGEVIINKCFCTR-RNPISISAAAIYL Populus_balsamifera_TFIIB6 -------------------------------------------------Populus_balsamifera_TFIIB4 -FSSILKMYRHI-----LMKV----------------------------Populus_balsamifera_TFIIB5 -LWNFIRQHHGK-----IK------------------------------Oryza_sativa_TFIIB1_AF464908 RFCSTLGMNNQA-----VKAAQEAVQR-SEELDIR--RSPISIAAAVIYM Triticum_aestivum_TC68795 RFCSTLGMNNTA-----VKAAQEAVQR-SEELDIR--RSPISIAAAVIYM Mesembryanthemum_crystallinum_ RFCSNLGMNNQA-----MKAAQEAVQ K -SEEIDIR--RSPISIAAAVIYI Vitis_vinifera_TFIIB_TC19782 RFCSNLGMTNQV-----VKAAQEAVQ K -SEEFDIR--RSPVSIAAAVIYI Populus_balsamifera_TFIIB1 RFCSNLGMSNHT-----VKAATEAV K T-SEQFDIR--RSPISIAAAVIYI Citrus_sinensis_TFIIB_CB292941 RFCSNLGMNNQA-----VKAAQEAVQ K -SEEFDIR--RSPISIAAAVIYI Glycine_max_TFIIB_U31097 RFCSNLCMNNQA-----VKAAQEAVQ K -SEEFDIR--RSPISIAAAVIYI Medicago_truncatula_TC86832 RFCSNLGMNHQA-----VKAAQESVQ K -SEEFDIR--RSPISIAAAVIYI Arabidopsis_thaliana_TFIIB2_At RFCSNLGMTNQT-----VKAAQESVQ K -SEEFDIR--RSPISIAAAVIYI Arabidopsis_thaliana_TFIIB1_At RFCSNLAMSNHA-----VKAAQEAVQ K -SEEFDIR--RSPISIAAVVIYI Lycopersicon_esculentum_TFIIB_T RFCSNLGMNHEE-----IKVVQETVQ K -AEEFDIR--RSPISIAAAIIYM Solanum_tuberosum_TFIIB1_TC587 RFCSNLGMNHEE-----IKVVQETVQ K -AEEFDIR--RSPISIAAAIIYM Populus_balsamifera_TFIIB3 RFCSNLGMSNHT-----VKAATEAV K T-SEQFDIR--RSPISIAAAVIYI Populus_balsamifera_TFIIB8 TFLLLFVTN-------------------ELKRDG----------GGKVLL Arabidopsis_thaliana_TFIIB4_At RFCSIFRLDKEA-----VEAALEAAESYDYMTNGR--RAPVSVAAGIVYV Populus_balsamifera_TFIIB2 RFCSNLGMKNHA-----VKAVHEAVE K -IQDVDIR--RNPKSVLAAIIYT Oryza_sativa_TFIIB2_AAN59779 RFGSRLGMGKPE-----VREAQRAAQ T LEDKLDVR--RNPESIAAAIIYM Drosophila_melanogaster_TFIIB_ RFCANLDLPNMV-----QRAATHIAK K -AVEMDIVPGRSPISVAAAAIYM Homo_sapiens_TFIIB_NM_001514 
RFCSNLCLPKQV-----QMAATHIAR K -AVELDLVPGRSPISVAAAAIYM Arabidopsis_thaliana_TFIIB3_At RFCSKLDISQRE-----IMAIPEAVE K -AENFDIR--RNPKSVLAAIIFM Arabidopsis_thaliana_TFIIB6_At -----------------------------------------------Saccharomyces_cerevisiae_TFIIB RFCSHLGLPMQV-----TTSAEYTAK K CKEIKEIAG-KSPITIAVVSIYL Methanosarcina_acetivorans_TFB RFCSGLNLKGEV-----QSKSVEILRQ-ASEKELTSGRGPTGVAAAAIYI Sulfolobus_solfataricus_TFB_AA RIANLLGLSGAV-----MKTAAEIID K -AKGSGLTAGKDPAGLAAAAIYI Populus_balsamifera_TFIIB9 AISKNWLINRQS------NPDGDLIQGFEIIETMARRRNREAMPAACLFI Arabidopsis_thaliana_BRF4_At4g MEAKSMKKRKQGTGKELVRTDGFCVEDLVMDCLSKE-SMYCYDDDARQDT Lycopersicon_esculentum_TFIIB_AF TFSGKLCKVDVSG----DIISLNVLANTHSNTMQIGSQIKMWSEKEIVVL Arabidopsis_thaliana_BRF1_At2g AALSHGIKCSKTD--IVNIVHICEATL--TKRLIEFGDTEAASLTADE LS Arabidopsis_thaliana_BRF2_At3g AALSHGIKCSKTD--IVNIVHICEATL--TKRLIEFGDTDSGNLNVNE LR Arabidopsis_thaliana_BRF3_At2g AALSHGIMYSRAD--IAKVVHMCEATI--TKRLNEFANTEAGSLTLLV GR Drosophila_melanogaster_BRF_AA AARMHDFSRTMLD--VIGVVKIHESTL--RKRLSEFAETPSGGLTLEE FM Homo_sapiens_BRF_NP_001510.2 AARMHDFRRTVKE--VISVVKVCESTL--RKRLTEFEDTPTSQLTIDE FM Saccharomyces_cerevisiae_BRF_N ACRMNNLRRTHTE--IVAVSHVAEETL--QQRLNEFKNTKAAKLSVQK FR Populus_balsamifera_TFIIB7/pBrp ACQLEDKRKTQAE--ICKVTGLTEVTL--RKVYKELLENWGDLLPKNY TP Arabidopsis_thaliana_TFIIB5_At ACQLEDKRKTQAE--ICKITGLTEVTL--RKVYKELLENWDDLLPSNY TP Lycopersicon_esculentum_AAG011 ACQLEDKRKTQAE--ICKVTGLTEVTL--RKVYKELLENWDDLLPSSY KP Populus_balsamifera_TFIIB6 ------------------------------------------------Populus_balsamifera_TFIIB4 ------------------------------------------------Populus_balsamifera_TFIIB5 ------------------------------------------------Oryza_sativa_TFIIB1_AF464908 ITQLSDDKKPLKD--ISLATGVAEGTI--RNSYKDLYPYASRLIPNTY AK Triticum_aestivum_TC68795 ITQLSEDKKPLKD--ISLATGVAEGTI--RNSYKDLYPYAARLIPNSY AK Mesembryanthemum_crystallinum_ ITQLSEEKKPLRD--ISLATGVAEGTI--RNAYKDLYPHISKIIPVWY AT Vitis_vinifera_TFIIB_TC19782 ITQLSDEKKLLRD--ISIATGVAEGTI--RNSYKDLYPHISRIIPSWY 
AK Populus_balsamifera_TFIIB1 ITQLSDDKKPLRD--ISLATGVAEGTI--RNSYKDLYPHVSKIIPSWY AS Citrus_sinensis_TFIIB_CB292941 ITQLSDDKKPLKD--ISVATGVAEGTI--RNSYKDLYPHVSKIIPNWY AK Glycine_max_TFIIB_U31097 ITQLSDDKKPLKD--ISLATGVAEGTI--RNSYKDLYPHVSKIIPNWY AK Medicago_truncatula_TC86832 ITQLSDDKKPLKD--ISVATGVAEGTI--RNSYKDLYPHVSKIIPNWY AK Arabidopsis_thaliana_TFIIB2_At ITQLSDEKKPLRD--ISVATGVAEGTI--RNSYKDLYPHLSKIIPAWY AK Arabidopsis_thaliana_TFIIB1_At ITQLSDDKKTLKD--ISHATGVAEGTI--RNSYKDLYPHLSKIAPSWY AK Lycopersicon_esculentum_TFIIB_T ITQLSDSKKPVLR-DISVATTVAEGTI--KNAYKDLYPHASKIIPEWY VK Solanum_tuberosum_TFIIB1_TC587 ITQLSDSKKPVLRADISVATTVAEGTI--KNAYKDLYPHASKIIPEWY VK Populus_balsamifera_TFIIB3 ITQLSDDKKPLRD--ISLATGVAEGTI--RNSYKDLYPHVSKIIPAWY AN Populus_balsamifera_TFIIB8 LLNLCMDRKILE-----SGKQPSEAKL--YALSVPTDTHRPTMLPE--Arabidopsis_thaliana_TFIIB4_At IARLSYEKHLLKG--LIEATGVAENTI--KGTYGDLYPNLPTIVPTWF AN Populus_balsamifera_TFIIB2 ITQLSDEKKPLRD--ISLAADVAEGTI--KKSFKDISPHVSRLVPKWY AR Oryza_sativa_TFIIB2_AAN59779 VVQRAGAQTSARD--VSKASGVAEATI--KEACKELSQHEELLFSS--Drosophila_melanogaster_TFIIB_ ASQASEHKRSQKE--IGDIAGVADVTI--RQSYKLMYPHAAKLFPEDF KF Homo_sapiens_TFIIB_NM_001514 ASQASAEKRTQKE--IGDIAGVADVTI--RQSYRLIYPRAPDLFPTDF KF Arabidopsis_thaliana_TFIIB3_At ISHISQTNRKPIR-EIGIVAEVVENTI--KNSVKDMYPYALKIIPNWY AC Arabidopsis_thaliana_TFIIB6_At ------------------------------------------------Saccharomyces_cerevisiae_TFIIB NILLFQIPITAAK--VGQTLQVTEGTI--KSGYKILYEHRDKLVDPQL IA Methanosarcina_acetivorans_TFB ASILCGERRTQRE--VADVAGVTEVTI--RNRYKELAEELDIEIIL--Sulfolobus_solfataricus_TFB_AA ASLLHDERRTQKE--IAQVAGVTEVTV--RNRYKELTQELKISIPTQ-Populus_balsamifera_TFIIB9 SCKENKLPRTLKE------TCSAASCN--GGGGGGLTMKEACTIGGYD RR

Arabidopsis_thaliana_BRF4_At4g MSRYFDVEGERQLSLCNYDDNISENQL--STKYNEFEDRVCGGTLAKR SQ
Lycopersicon_esculentum_TFIIB_AF DSEIVGDGAARNHTVLHVDSDIMVVLCGRRTCSYDFKSRLTKFLSSKV GI

Arabidopsis_thaliana_BRF1_At2g KTEREKETAA----
Arabidopsis_thaliana_BRF2_At3g ERESHKRSFT----
Arabidopsis_thaliana_BRF3_At2g ILLLISEQR-----
Drosophila_melanogaster_BRF_AA TVDLER-EQDP---
Homo_sapiens_BRF_NP_001510.2 KIDLEE-ECDP---
Saccharomyces_cerevisiae_BRF_N ENDVEDGEARP---
Populus_balsamifera_TFIIB7/pBrp AVPPEKAFPT----
Arabidopsis_thaliana_TFIIB5_At AVPPEKAFPT----
Lycopersicon_esculentum_AAG011 VVPPEKAFPS----
Populus_balsamifera_TFIIB6 --------------
Populus_balsamifera_TFIIB4 --------------
Populus_balsamifera_TFIIB5 --------------
Oryza_sativa_TFIIB1_AF464908 EEDLKNLCTP----
Triticum_aestivum_TC68795 EEDLKNLCTP----
Mesembryanthemum_crystallinum_ EDDLKTSAAH----
Vitis_vinifera_TFIIB_TC19782 EEDLRNLCSP----
Populus_balsamifera_TFIIB1 EEDLKNLCSP----
Citrus_sinensis_TFIIB_CB292941 EEDLKNLCSP----
Glycine_max_TFIIB_U31097 EEDLKNLCSP----
Medicago_truncatula_TC86832 EEDLKNLCSP----
Arabidopsis_thaliana_TFIIB2_At EEDLKNLQSP----
Arabidopsis_thaliana_TFIIB1_At EEDLKNLSSP----
Lycopersicon_esculentum_TFIIB_T DKDLKSLCSPKA--
Solanum_tuberosum_TFIIB1_TC587 DKDLKNLCSPKA--
Populus_balsamifera_TFIIB3 EEDLKNLSSP----
Populus_balsamifera_TFIIB8 --------------
Arabidopsis_thaliana_TFIIB4_At ANDLKNLGAP----
Populus_balsamifera_TFIIB2 EEDIRRIRIP----
Oryza_sativa_TFIIB2_AAN59779 --------------
Drosophila_melanogaster_TFIIB_ TTPIDQLPQM----
Homo_sapiens_TFIIB_NM_001514 DTPVDKLPQL----
Arabidopsis_thaliana_TFIIB3_At ESDIIKRLDG----
Arabidopsis_thaliana_TFIIB6_At --------------
Saccharomyces_cerevisiae_TFIIB NGVVSLDNLPGVEKK
Methanosarcina_acetivorans_TFB --------------
Sulfolobus_solfataricus_TFB_AA --------------
Populus_balsamifera_TFIIB9 DHES----------
Arabidopsis_thaliana_BRF4_At4g GSSQSMWQRR----
Lycopersicon_esculentum_TFIIB_AF LDRCFPYVNSLVSL

TBP Alignment

The N-terminal imperfect repeat is highlighted in yellow; the C-terminal imperfect repeat is shaded in grey (Nikolov et al., 1995).

CLUSTAL X (1.83) multiple sequence alignment

Arabidopsis_thaliana_TBP1_At3g --------------------------MTDQG------------------
Arabidopsis_thaliana_TBP2_At1g --------------------------MADQG------------------
Medicago_truncatula_TBP_TC8687 --------------------------MADQG------------------
Medicago_truncatula_TBP_TC8871 --------------------------MADQG------------------
Zea_mays_TBP_TC171023 --------------------------MAEPG------------------
Zea_mays_TBP_X90652.1 --------------------------MAEPR------------------
Oryza_sativa_TBP_TC116362 -----------------------MAAEAAAA------------------
Triticum_aestivum_TBP_TC88519 -------------------------MAEAAA------------------
Triticum_aestivum_TBP_TC72701 --------------------------MAEAT------------------
Hordeum_vulgare_TBP_TC78738 --------------------------MAEAA------------------
Sorghum_bicolor_TBP_TC54739 --------------------------MAEPG------------------
Zea_mays_TBP_TC182979 --------------------------MAEPG------------------
Mesembryanthemum_crystallinum_ --------------------------MAEQG------------------
Populus_trichocarpa_TBP_Contig1 --------------------------MAEQGG-----------------
Populus_trichocarpa_TBP_Contig2 --------------------------MAEQGG-----------------
Glycine_max_TBP_TC146463 --------------------------MADQG------------------
Solanum_tuberosum_TBP_TC74102 --------------------------MADQG-------------------


206 Triticum_aestivum_TBP_TC90291 --------------------------MAAAAVDPMVLGLGTSGGASGSGV Chlamydomonas_reinhardtii_TBP_ -------------------------MMAAAEAPPATP------------Saccharomyces_cerevisiae_TBP_N --------------------------QSEED------------------Homo_sapiens_TBP_NP_003185.1 -------------------------QATQGTSGQAPQLFH--SQTLTTAP Drosophila_melanogaster_TBP_NP ------------------------MPMSERSVGGSGAGGG--GDALSNIH Drosophila_melanogaster_TRF_Q2 --------------------MQFHFKVADAERDRDNVAATSNAAAN---Homo_sapiens_TBP_L1_NP_004856 --------------------------MD---------------------Drosophila_melanogaster_TRF2_A VASNGNGLITAKMDLLEEEVMQSITVIDDDDEEKKEVAED---------Arabidopsis_thaliana_TBP1_At3g --------------LEGSNPVDLSKHPSGIVPTL---------------Arabidopsis_thaliana_TBP2_At1g --------------TEGSQPVDLTKHPSGIVPTL---------------Medicago_truncatula_TBP_TC8687 --------------LEGSQPVDLSKHPSGIVPTL---------------Medicago_truncatula_TBP_TC8871 --------------LEGSQPVDLAKHPSGIVPTL---------------Zea_mays_TBP_TC171023 --------------LEDSQPVDLSKHPSGIVPTL---------------Zea_mays_TBP_X90652.1 --------------LEDSQPVDLSKHPSGIVPTL---------------Oryza_sativa_TBP_TC116362 --------------LEGSEPVDLAKHPSGIIPTL---------------Triticum_aestivum_TBP_TC88519 --------------LEGSEPVDLTKHPSGIIPTL---------------Triticum_aestivum_TBP_TC72701 --------------LEGSEPVDLSKHPSGIIPTL---------------Hordeum_vulgare_TBP_TC78738 --------------LEGSQPVDLSKHPSGIVPTL---------------Sorghum_bicolor_TBP_TC54739 --------------LEGSQPVDLSKHPSGIVPTLHFPVLGASKRANIVLN Zea_mays_TBP_TC182979 --------------LEGSQPVDLSKHPSGIVPTL---------------Mesembryanthemum_crystallinum_ --------------LEGSQPVDPIKHPSGIVPTL---------------Populus_trichocarpa_TBP_Contig1 --------------LEGSQPVDLSKHPSGIVPTL---------------Populus_trichocarpa_TBP_Contig2 --------------LEGSQPVDLSKHPSGIVPIL---------------Glycine_max_TBP_TC146463 --------------LEGSQPVDLQKHPSGIVPTL---------------Solanum_tuberosum_TBP_TC74102 --------------LEGSQPVDLTKHPSGIVPTL---------------Triticum_aestivum_TBP_TC90291 
VGGGVGRAGGGGAVMEGAQPVDLARHPSGIVPVL---------------Chlamydomonas_reinhardtii_TBP_ ------------QLSAADVEAEMAAHVSGIKPQL---------------Saccharomyces_cerevisiae_TBP_N ------------IKRAAPESEKDTSATSGIVPTL---------------Homo_sapiens_TBP_NP_003185.1 LPGTTPLYPSPMTPMTPITPATPASESSGIVPQL---------------Drosophila_melanogaster_TBP_NP QT---------MGPSTPMTPATPGSADPGIVPQL---------------Drosophila_melanogaster_TRF_Q2 ----------PHAALQPQQPVALVEPKDAQHEIR---------------Homo_sapiens_TBP_L1_NP_004856 ---------------------------ADSDVALD--------------Drosophila_melanogaster_TRF2_A ----------EEESSNNAKPIDLHQPIADNEHELD--------------Arabidopsis_thaliana_TBP1_At3g -------------------QNIVSTVNLDCKLDLKAIALQARNAEYNPKR Arabidopsis_thaliana_TBP2_At1g -------------------QNIVSTVNLDCKLDLKAIALQARNAEYNPKR Medicago_truncatula_TBP_TC8687 -------------------QNIVSTVNLDCKLELKSIALQARNAEYNPKR Medicago_truncatula_TBP_TC8871 -------------------QNIVSTVNLDTKLDLKAIALQARNAEYNPKR Zea_mays_TBP_TC171023 -------------------QNIVSTVNLDCKLDLKAIALQARNAEYNPKR Zea_mays_TBP_X90652.1 -------------------QNIVSTVNLDCKLDLKAIALQARNAEYNPKR Oryza_sativa_TBP_TC116362 -------------------QNIVSTVNLDCKLDLKAIALQARNAEYNPKR Triticum_aestivum_TBP_TC88519 -------------------QNIVSTVNLDCKLDLKAIALQARNAEYNPKR Triticum_aestivum_TBP_TC72701 -------------------QNIVSTVNLDCKLDLKAIALQARNAEYNPKR Hordeum_vulgare_TBP_TC78738 -------------------QNIVSTVNLDCKLDLKAIALQARNAEYNPKR Sorghum_bicolor_TBP_TC54739 SWGFGGNYLVVILSPRVDFRNIVSTVNLDCKLDLKAIALQARNAEYNPKR Zea_mays_TBP_TC182979 -------------------QNIVSTVNLDCKLDLKAIALQARNAEYNPKR Mesembryanthemum_crystallinum_ -------------------QNIVSTVNLDCKLDLKAIALQARNAEYNPKR Populus_trichocarpa_TBP_Contig1 -------------------QNIVSTVNLDCKLELKQIALQARNAEYNPKR Populus_trichocarpa_TBP_Contig2 -------------------QNIVSTVNLDCRLDLKQIALQARNAEYNPKR Glycine_max_TBP_TC146463 -------------------QNIVSTVNLDCKLDLKTIALQARNAEYNPKR Solanum_tuberosum_TBP_TC74102 -------------------QNIVSTVNLDCKLDLKAIALQARNAEYNPKR Triticum_aestivum_TBP_TC90291 
-------------------QNIVSTVNLDCRLDLKQIALQARNAEYNPKR Chlamydomonas_reinhardtii_TBP_ -------------------QNVVATVNLGTKLDLKEIAMHARNAEYNPKR Saccharomyces_cerevisiae_TBP_N -------------------QNIVATVTLGCRLDLKTVALHARNAEYNPKR Homo_sapiens_TBP_NP_003185.1 -------------------QNIVSTVNLGCKLDLKTIALRARNAEYNPKR Drosophila_melanogaster_TBP_NP -------------------QNIVSTVNLCCKLDLKKIALHARNAEYNPKR Drosophila_melanogaster_TRF_Q2 ------------------LQNIVATFSVNCELDLKAINSRTRNSEYSPKR Homo_sapiens_TBP_L1_NP_004856 ----------------ILITNVVCVFRTRCHLNLRKIALEGANVIYK-RD Drosophila_melanogaster_TRF2_A ----------------IVINNVVCSFSVGCHLKLREIALQGSNVEYR-RE Arabidopsis_thaliana_TBP1_At3g FAAVIMRIREPKTTALIFASGKMVCTGAKSEDFSKMAARKYARIVQKLGF Arabidopsis_thaliana_TBP2_At1g FAAVIMRIREPKTTALIFASGKMVCTGAKSEHLSKLAARKYARIVQKLGF Medicago_truncatula_TBP_TC8687 FAAVIMRIREPKTTALIFASGKMVCTGAKSEVQSKLAARKYARIIQKLGF Medicago_truncatula_TBP_TC8871 FAAVIMRIREPKTTALIFASGKMVCTGAKSEQQSKLAARKYARIIQKLGF Zea_mays_TBP_TC171023 FAAVIMRIREPKTTALIFASGKMVCTGAKSEQQSKLAARKYARIIQKLGF Zea_mays_TBP_X90652.1 FAAVIMRIREPKTTALIFASGKMVCTGAKSEQQSKLAARKYARIIQKLGF Oryza_sativa_TBP_TC116362 FAAVIMRIREPKTTALIFASGKMVCTGAKSEQQSKLAARKYARIIQKLGF Triticum_aestivum_TBP_TC88519 FAAVIMRIREPKTTALIFASGKMVCTGAKSEQQSKLAARKYARIIQKLGF Triticum_aestivum_TBP_TC72701 FAAVIMRIREPKTTALIFASGKMVCTGAKSEQQSKLAARKYARIIQKLGF Hordeum_vulgare_TBP_TC78738 FAAVIMRIREPKTTALIFASGKMVCTGAKSEQQSKLAARKYARIIQKLGF


207 Sorghum_bicolor_TBP_TC54739 FAAVIMRIREPKTTALIFASGKM-----------------YARIIQKLGF Zea_mays_TBP_TC182979 FAAVIMRIREPKTTALIFASGKMVCTGAKSEQQSKLAARKYARIIQKLGF Mesembryanthemum_crystallinum_ FAAVIMRIREPKTTALIFASGKMVCTGAKSEQQSKLAARKYARIIQKLGF Populus_trichocarpa_TBP_Contig1 FAAVIMRIREPKTTALIFASGKMVCTGAKSEQQSKLAARKYARIIQKLGF Populus_trichocarpa_TBP_Contig2 FAAVIMRIREPKTTALIFASGKMVCTGAKSEQQSKLAARKYARIIQKLGF Glycine_max_TBP_TC146463 FAAVIMRIRDPKTTALIFASGKMVCTGAKSEQQSKLAARKYARIIQKLGF Solanum_tuberosum_TBP_TC74102 FAAVIMRIREPKTTALIFASGKMVCTGAKSEQQSKLAARKYARIIQKLGF Triticum_aestivum_TBP_TC90291 FAAVIMRIRDPKTTALIFASGKMVCTGAKSEEHSKLAARKYARIVQKLGF Chlamydomonas_reinhardtii_TBP_ FAAVIMRIREPKTTALIFASGKMVCTGAKSEDDSRTAARRYAKIVQKLGF Saccharomyces_cerevisiae_TBP_N FAAVIMRIREPKTTALIFASGKMVVTGAKSEDDSKLASRKYARIIQKIGF Homo_sapiens_TBP_NP_003185.1 FAAVIMRIREPRTTALIFSSGKMVCTGAKSEEQSRLAARKYARVVQKLGF Drosophila_melanogaster_TBP_NP FAAVIMRIREPRTTALIFSSGKMVCTGAKSEDDSRLAARKYARIIQKLGF Drosophila_melanogaster_TRF_Q2 FRGVIMRMHSPRCTALIFRTGKVICTGARNEIEADIGSRKFARILQKLGF Homo_sapiens_TBP_L1_NP_004856 VGKVLMKLRKPRITATIWSSGKIICTGATSEEEAKFGARRLARSLQKLGF Drosophila_melanogaster_TRF2_A NGMVTMKLRHPYTTASIWSSGRITCTGATSESMAKVAARRYARCLGKLGF Arabidopsis_thaliana_TBP1_At3g PAKFKDFKIQNIVGSCDVKFPIRLEGLAYSHAAFSSYEP----------E Arabidopsis_thaliana_TBP2_At1g PAKFKDFKIQNIVGSCDVKFPIRLEGLAYSHSAFSSYEP----------E Medicago_truncatula_TBP_TC8687 PAKFKDFKIQNIVGSCDVKFPIRLEGLAYSHGAFSSVKYDTKLLLSISPG Medicago_truncatula_TBP_TC8871 NAKFKDFKIQNIVGSCDVKFPIRLEGLAYSHGAFSSVSY----------Zea_mays_TBP_TC171023 PAKFKDFKIQNIVGSCDVKFPIRLEGLAYSHGAFSSYEP----------E Zea_mays_TBP_X90652.1 PAKFKDFKIQNIVGSCDVKFPIRLEGLAYSHGAFSSYEP----------E Oryza_sativa_TBP_TC116362 PAKFKDFKIQNIVGSCDVKFPIRLEGLAYSHGAFSSYEP----------E Triticum_aestivum_TBP_TC88519 PAKFKDFKIQNIVASCDVKFPIRLEGLAYSHGAFSSYEP----------E Triticum_aestivum_TBP_TC72701 PAKFKDFKIQNIVGSCDVKFPIRLEGLAYSHGAFSSYEP----------E Hordeum_vulgare_TBP_TC78738 
PAKFKDFKIQNIVASCDVKFPIRLEGLAYSHGAFSSYEP----------E Sorghum_bicolor_TBP_TC54739 PAKFKDFKIQNIVGSCDVKFPIRLEGLAYSHGAFSSYEP----------E Zea_mays_TBP_TC182979 PAKFKDFKIQNIVGSCDVKFPIRLEGLAYSHGAFSSYEP----------E Mesembryanthemum_crystallinum_ PAKFKDFKIQNIVGSCDVKFPIRLEGLAYSHGAFSSYEP----------E Populus_trichocarpa_TBP_Contig1 AAKFKDFKIQNIVGSCDVKFPIRLEGLAYSHGAFSSYEP----------E Populus_trichocarpa_TBP_Contig2 AAKFKDFKIQNIVGSCDVKFPIRLEGLAYSHGAFSSYEP----------E Glycine_max_TBP_TC146463 PAKFKDFKIQNIVGSCDVKFPIRLEGLAYSHGAFSSYEP----------E Solanum_tuberosum_TBP_TC74102 PAKFKDFKIQNIVGSCDVKFPIRLEGLAYAHGAFSSYEP----------E Triticum_aestivum_TBP_TC90291 PATFKDFKIQNIVASCDVKFPIRLEGLAYSHGAFSSYEP----------E Chlamydomonas_reinhardtii_TBP_ PATFKEFKIQNIVGSCDVKFPIRLEGLAYAHSLFASYEP----------E Saccharomyces_cerevisiae_TBP_N AAKFTDFKIQNIVGSCDVKFPIRLEGLAFSHGTFSSYEP----------E Homo_sapiens_TBP_NP_003185.1 PAKFLDFKIQNMVGSCDVKFPIRLEGLVLTHQQFSSYEP----------E Drosophila_melanogaster_TBP_NP PAKFLDFKIQNMVGSCDVKFPIRLEGLVLTHCNFSSYEP----------E Drosophila_melanogaster_TRF_Q2 PVKFMEYKLQNIVATVDLRFPIRLENLNHVHGQFSSYEP----------E Homo_sapiens_TBP_L1_NP_004856 QVIFTDFKVVNVLAVCNMPFEIRLPEFTKNNRPHASYEP----------E Drosophila_melanogaster_TRF2_A PTRFLNFRIVNVLGTCSMPWAIKIVNFSERHRENASYEP----------E Arabidopsis_thaliana_TBP1_At3g LFPGLIYRMKV--PKIVLLIFVSGKIVITGAKMRDETYKAFENIYPVLSE Arabidopsis_thaliana_TBP2_At1g LFPGLIYRMKL--PKIVLLIFVSGKIVITGAKMREETYTAFENIYPVLRE Medicago_truncatula_TBP_TC8687 EFEEIMYHYYQ--SHMTLALFPS--IIYLKSSILQKTGTSLETLVSKVCP Medicago_truncatula_TBP_TC8871 ----FIYSYLH--TSSSLD------VICIGISMAKRFRFFLKI------Zea_mays_TBP_TC171023 LFPGLIYRMKQ--PKIVLLIFVSGKIVLTGAKVREETYTAFENIYPVLAE Zea_mays_TBP_X90652.1 LFPGLIYRMKQ--PKIVLLIFVSGKIVLTGAKVREETYTAFENIYPVLAE Oryza_sativa_TBP_TC116362 LFPGLIYRMKQ--PKIVLLIFVSGKIVLTGAKVRDETYTAFENIYPVLTE Triticum_aestivum_TBP_TC88519 LFPGLIYRMRQ--PKIVLLIFVSGKIVLTGAKVREETYTAFENIYPVLTE Triticum_aestivum_TBP_TC72701 LFPGLIYRMKQ--PKIVLLIFVSGKIVLTGAKVREETYTAFENIYPVLTE Hordeum_vulgare_TBP_TC78738 
LFPGLIYRMRQ--PKIVLLIFVSGKIVLTGAKVREETYTAFENIYPVLTE Sorghum_bicolor_TBP_TC54739 LFPGLIYRMKQ--PKIVLLIFVSGKIVLTGAKVREETYTAFENIYPVLTE Zea_mays_TBP_TC182979 LFPGLIYRMKQ--PKIVLLIFVSGKIVLTGAKVREETYTAFENIYPVLSE Mesembryanthemum_crystallinum_ LFPGLIYRMKQ--PKIVLLIFVSGKIVLTGAKVREETYTAFENIYPVLTE Populus_trichocarpa_TBP_Contig1 LFPGLIYRMKQ--PKIVLLIFVSGKIVITGAKVREETYTAFENIYPVLAE Populus_trichocarpa_TBP_Contig2 IFPGLIYRMKQ--PKIVLLIFVSGKIVITGAKVRDETYTAFGNIYPVLTE Glycine_max_TBP_TC146463 LFPGLIYRMKQ--PKIVLLIFVSGKIVLTGAKVRDETYTAFENIYPVLTE Solanum_tuberosum_TBP_TC74102 LFPGLIYRMKQ--PKIVLLIFVSGKIVITGAKVRDETYTAFENIYPVLTE Triticum_aestivum_TBP_TC90291 LFPGLIYRMKQ--PKIVLLVFVSGKIVLTGAKVRDEIYAAFENIYPVLTE Chlamydomonas_reinhardtii_TBP_ LFPGLIYRMKQ--PKIVLLIFVSGKVVLTGAKTRGEIYQAYMNIYPTLIQ Saccharomyces_cerevisiae_TBP_N LFPGLIYRMVK--PKIVLLIFVSGKIVLTGAKQREEIYQAFEAIYPVLSE Homo_sapiens_TBP_NP_003185.1 LFPGLIYRMIK--PRIVLLIFVSGKVVLTGAKVRAEIYEAFENIYPILKG Drosophila_melanogaster_TBP_NP LFPGLIYRMVR--PRIVLLIFVSGKVVLTGAKVRQEIYDAFDKIFPILKK Drosophila_melanogaster_TRF_Q2 MFPGLIYRMVK--PRIVLLIFVNGKVVFTGAKSRKDIMDCLEAISPILLS Homo_sapiens_TBP_L1_NP_004856 LHPAVCYRIKS--LRATLQIFSTGSITVTGPNVK-AVATAVEQIYPFVFE Drosophila_melanogaster_TRF2_A LHPGVTYKMRDPDPKATLKIFSTGSVTVTAASVN-HVESAIQHIYPLVFD Arabidopsis_thaliana_TBP1_At3g FRKIQQ--------------Arabidopsis_thaliana_TBP2_At1g FRKVQQ--------------Medicago_truncatula_TBP_TC8687 MENKTDTNQS-----------


Medicago_truncatula_TBP_TC8871 ---------------------
Zea_mays_TBP_TC171023 FRKVQQ---------------
Zea_mays_TBP_X90652.1 FRKVQQWYVVLFYHVSIIVRS
Oryza_sativa_TBP_TC116362 FRKVQQ---------------
Triticum_aestivum_TBP_TC88519 FRKVQQ---------------
Triticum_aestivum_TBP_TC72701 FRKVQQ---------------
Hordeum_vulgare_TBP_TC78738 FRKVQQ---------------
Sorghum_bicolor_TBP_TC54739 FRKVQQ---------------
Zea_mays_TBP_TC182979 FRKIQQ---------------
Mesembryanthemum_crystallinum_ FRKNQQ---------------
Populus_trichocarpa_TBP_Contig1 FRKVQQWYTSQSLCPAL----
Populus_trichocarpa_TBP_Contig2 FRKVQQW--------------
Glycine_max_TBP_TC146463 FRKNQQ---------------
Solanum_tuberosum_TBP_TC74102 FRKNQQ---------------
Triticum_aestivum_TBP_TC90291 YRKSQQ---------------
Chlamydomonas_reinhardtii_TBP_ YKKGDAVVPTLPN--------
Saccharomyces_cerevisiae_TBP_N FRKM-----------------
Homo_sapiens_TBP_NP_003185.1 FRKTT----------------
Drosophila_melanogaster_TBP_NP FKKQS----------------
Drosophila_melanogaster_TRF_Q2 FRKT-----------------
Homo_sapiens_TBP_L1_NP_004856 SRKEIL---------------
Drosophila_melanogaster_TRF2_A FRKQRS---------------

TAF6 Alignment

CLUSTAL X (1.83) multiple sequence alignment

Drosophila_melanogaster_TAF6_N MSGKPSKPSSPSSSMLYGSSISAESMKVIAESIGVGSLSDDAAKELAEDV
Homo_sapiens_TAF6_NP_647476 MAEE-------KKLKLSNTVLPSESMKVVAESMGIAQIQEETCQLLTDEV
Arabidopsis_thaliana_TAF6b_1 -------------------MVTKESIEVIAQSIGLSTLSPDVSAALAPDV
Arabidopsis_thaliana_TAF6b_3 -------------------MVTKESIEVIAQSIGLSTLSPDVSAALAP--
Arabidopsis_thaliana_TAF6b_2 -------------------MVTKESIEVIAQSIGLSTLSPDVSAALAPDV
Arabidopsis_thaliana_TAF6_At1g -----------------MSIVPKETVEVIAQSIGITNLLPEAALMLAPDV
Populus_trichocarpa_TAF6_Contig1 -----------------MSIVAKETIEVIAQSIGISNLSEDVALTLAPDV
Hordeum_vulgare_TAF6_Barley1_1 -----------------MSIVPKETIEVIAQSIGIPSLPADVSAALAPDV
Oryza_sativa_TAF6_BAB92191 -----------------MSIVPKETIEVIGQSVGIANLPADVSAALAPDV
Populus_trichocarpa_TAF6_Contig2 ---------------MSSSIVAKEAIEVIAQGIGITNLSPDVSLTLAPDV
Arabidopsis_thaliana_TAF6b_4 -------------------MVTKESIEVIAQSIGLSTLSPDVSAALAPDV
Saccharomyces_cerevisiae_TAF6_ ---------MSTQQQSYTIWSPQDTVKDVAESLGLENINDDVLKALAMDV
Homo_sapiens_TAF6L_NP_006464 ---------MSEREERRFVEIPRESVRLMAESTGL-ELSDEVAALLAEDV

Drosophila_melanogaster_TAF6_N SIKLKRIVQDAAKFMNHAKRQKLSVRDIDMSLKVRNVEPQYGFVAKD---
Homo_sapiens_TAF6_NP_647476 SYRIKEIAQDALKFMHMGKRQKLTTSDIDYALKLKNVEPLYGFHAQE---
Arabidopsis_thaliana_TAF6b_1 EYRVREVMQEAIKCMRHARRTTLMAHDVDSALHFRNLEPTSGSKS-----
Arabidopsis_thaliana_TAF6b_3 --------------------------DVDSALHFRNLEPTSGSKS-----
Arabidopsis_thaliana_TAF6b_2 EYRVREVMQEAIKCMRHARRTTLMAHDVDSALHFRNLEPTSGSKS-----
Arabidopsis_thaliana_TAF6_At1g EYRVREIMQEAIKCMRHSKRTTLTASDVDGALNLRNVEPIYGFASGG---
Populus_trichocarpa_TAF6_Contig1 EFRMRQIMQEAIKCMRHSKRTRLTTDDVDGALNLTNVEPIYGFASGG---
Hordeum_vulgare_TAF6_Barley1_1 EYRLREIMQEAIKCMRHAKRTVLTADDVDSALSLRNVEPVYGFASGD---
Oryza_sativa_TAF6_BAB92191 EYRLREIMQEAIKCMRHAKRTVLTADDVDSALSLRNVEPVYGFASGD---
Populus_trichocarpa_TAF6_Contig2 EYRLREIIQEAIKCMRHSRRTALTAHDVDTALILRNVEPIYGFGSGGDK-
Arabidopsis_thaliana_TAF6b_4 EYRVREVMQEAIKCMRHARRTTLMAHDVDSALHFRNLEVS----------
Saccharomyces_cerevisiae_TAF6_ EYRILEIIEQAVKFKRHSKRDVLTTDDVSKALRVLNVEPLYGYYDGSEVN
Homo_sapiens_TAF6L_NP_006464 CYRLREATQNSSQFMKHTKRRKLTVEDFNRALRWSSVEAVCGYGSQE---

Drosophila_melanogaster_TAF6_N --FIPFRFASGGGRELHFTEDKEIDLGEITSTN-SVKIPLDLTLRSHWFV
Homo_sapiens_TAF6_NP_647476 --FIPFRFASGGGRELYFYEEKEVDLSDIINTP-LPRVPLDVCLKAHWLS
Arabidopsis_thaliana_TAF6b_1 --MRFKR--APENRDLYFFDDKDVELKNVIEAP-LPNAPPDASVFSHWLA
Arabidopsis_thaliana_TAF6b_3 --MRFKR--APENRDLYFFDDKDVELKNVIEAP-LPNAPPDASVFSHWLA
Arabidopsis_thaliana_TAF6b_2 --MRFKR--APENRDLYFFDDKDVELKNVIEAP-LPNAPPDASVFSHWLA
Arabidopsis_thaliana_TAF6_At1g -PFRFRK--AIGHRDLFYTDDREVDFKDVIEAP-LPKAPLDTEIVCHWLA
Populus_trichocarpa_TAF6_Contig1 -ALQFKR--AIGHRDLFYVDDKDIDFKDVIEAP-LPKAPLDTAVVCHWLA
Hordeum_vulgare_TAF6_Barley1_1 -PLRFKR--AVGHKDLFYIDDREVDFKEIIEAP-LPKAPLDTAVVAHWLA
Oryza_sativa_TAF6_BAB92191 -PLRFKR--AVGHKDLFYIDDREVDFKEIIEAP-LPKAPLDTAVVAHWLA
Populus_trichocarpa_TAF6_Contig2 VPLRFKRAAAAGHKDLYYIDDKDVNFKHVIEAP-PPKPPLDTSLTSHWLA
Arabidopsis_thaliana_TAF6b_4 ---------SSSLLLLFHTVDPDFDF---FLYS-LPLAP----------K
Saccharomyces_cerevisiae_TAF6_ KAVSFSKVNTSGGQSVYYLDEEEVDFDRLINEP-LPQVPRLPTFTTHWLA
Homo_sapiens_TAF6L_NP_006464 --ALPMR--PAREGELYFPEDREVNLVELALATNIPKGCAETAVRVHVSY


209 Drosophila_melanogaster_TAF6_N VEGVQPTVPENPPPLSKDSQLLDSVNPVIKMDQGLNKD-----------Homo_sapiens_TAF6_NP_647476 IEGCQPAIPENPPPAPKEQQKAEATEPLKSAKPGQEEDGPLKGKGQGATT Arabidopsis_thaliana_TAF6b_1 IDGIQPSIPQNSPLQAIS-------------------------------Arabidopsis_thaliana_TAF6b_3 IDGIQPSIPQNSPLQAIS-------------------------------Arabidopsis_thaliana_TAF6b_2 IDGIQPSIPQNSPLQAIS-------------------------------Arabidopsis_thaliana_TAF6_At1g IEGVQPAIPENAPLEVIRAP-----------------------------Populus_trichocarpa_TAF6_Contig1 IEGVQPAIPENAPLEVIAPP-----------------------------Hordeum_vulgare_TAF6_Barley1_1 IEGVQPAIPENPPIDAISAP-----------------------------Oryza_sativa_TAF6_BAB92191 IEGVQPAIPENPPVDAIVAP-----------------------------Populus_trichocarpa_TAF6_Contig2 IEGVQPAIPENVPIEALGVI-----------------------------Arabidopsis_thaliana_TAF6b_4 VCGSRELLRT---------------------------------------Saccharomyces_cerevisiae_TAF6_ VEGVQPAIIQNPNLNDIRVSQPPFIRGAIVTALNDNSLQTPVTSTTASAS Homo_sapiens_TAF6L_NP_006464 LDGKGNLAPQGSVPSAVSS------------------------------Drosophila_melanogaster_TAF6_N AAGKPTTGKIHKLKNVETIHVKQLATHELSVEQQLYYKEIT-----EACV Homo_sapiens_TAF6_NP_647476 ADGKGKEKKAPPLLEGAPLRLKPRSIHELSVEQQLYYKEIT-----EACV Arabidopsis_thaliana_TAF6b_1 ----DLKRSEYK-------DDGLAARQVLSKDLQIYFDKVT-----EWAL Arabidopsis_thaliana_TAF6b_3 ----DLKRSEYK-------DDGLAARQVLSKDLQIYFDKVT-----EWAL Arabidopsis_thaliana_TAF6b_2 ----DLKRSEYK-------DDGLAAR-------QIYFDKVT-----EWAL Arabidopsis_thaliana_TAF6_At1g ---AETKIHEQ--KDGPLIDVRLPVKHVLSRELQLYFQKIA-----ELAM Populus_trichocarpa_TAF6_Contig1 ---SDGKISEQ--NDEFPVDIKLPVKHVLSRELQLYFDKIT-----DLTV Hordeum_vulgare_TAF6_Barley1_1 ---TENKRTEQVKDDGLPVDIKLPVKHILSRELQMYFDKIA-----ELTM Oryza_sativa_TAF6_BAB92191 ---TENKRTEHGKDDGLPVDIKLPVKHVLSRELQMYFDKIA-----ELTM Populus_trichocarpa_TAF6_Contig2 ---SDGKKSDYK-DDGLSIDVKLPVKDILSRELQLYFEKVT-----ELTA Arabidopsis_thaliana_TAF6b_4 ---------------------------------EIYTSSMT-----KMSS Saccharomyces_cerevisiae_TAF6_ 
VTDTGASQHLSNVKPGQNTEVKPLVKHVLSKELQIYFNKVISTLTAKSQA Homo_sapiens_TAF6L_NP_006464 ----------------------------LTDDLLKYYHQVT------RAV Drosophila_melanogaster_TAF6_N G-SDEPRRGEALQSLGSDPGLHEMLPRMCTFIAEGVKVNVVQNNLALLIY Homo_sapiens_TAF6_NP_647476 G-SCEAKRAEALQSIATDPGLYQMLPRFSTFISEGVRVNVVQNNLALLIY Arabidopsis_thaliana_TAF6b_1 TQSGSTLFRQALASLEIDPGLHPLVPFFTSFIAE--EIVKNMDNYPILLA Arabidopsis_thaliana_TAF6b_3 TQSGSTLFRQALASLEIDPGLHPLVPFFTSFIAE--EIVKNMDNYPILLA Arabidopsis_thaliana_TAF6b_2 TQSGSTLFRQALASLEIDPGLHPLVPFFTSFIAE--EIVKNMDNYPILLA Arabidopsis_thaliana_TAF6_At1g SKSNPPLYKEALVSLASDSGLHPLVPYFTNFIAD--EVSNGLNDFRLLFN Populus_trichocarpa_TAF6_Contig1 RRSDSVLFKEALVSLATDSGLHPLIPYFTYFIAD--EVARGLNDYSLLFA Hordeum_vulgare_TAF6_Barley1_1 SRSSTPIFREALVSLSKDSGLHPLVPYFSYFIAD--EVTRSLADLPVLFA Oryza_sativa_TAF6_BAB92191 SRSETSVFREALVSLSRDSGLHPLVPYFSYFIAD--EVTRSLGDLPVLFA Populus_trichocarpa_TAF6_Contig2 RRSESAIFKQALVSLATDSGLHPLVPYFIQFIAD--EVSRNLNNFSLLLA Arabidopsis_thaliana_TAF6b_4 SRMLSKLLYQ----------MHLLMHLFS------------LIGWQLMVF Saccharomyces_cerevisiae_TAF6_ DEAAQHMKQAALTSLRTDSGLHQLVPYFIQFIAE--QITQNLSDLQLLTT Homo_sapiens_TAF6L_NP_006464 LGDDPQLMKVALQDLQTNSKIGALLPYFVYVVSG---VKSVSHDLEQLHR Drosophila_melanogaster_TAF6_N LMRMVRALLDNPSLFLEKY------------------------------Homo_sapiens_TAF6_NP_647476 LMRMVKALMDNPTLYLEKY------------------------------Arabidopsis_thaliana_TAF6b_1 LMRLARSLLHNPHVHIEPY------------------------------Arabidopsis_thaliana_TAF6b_3 LMRLARSLLHNPHVHIEPY------------------------------Arabidopsis_thaliana_TAF6b_2 LMRLARSLLHNPHVHIEPY------------------------------Arabidopsis_thaliana_TAF6_At1g LMHIVRSLLQNPHIHIEPY------------------------------Populus_trichocarpa_TAF6_Contig1 LMRVVWSLLQNPHIHIEPYIIVNVLSFVFRIMSSIDEYKIKVQSLKLRRR Hordeum_vulgare_TAF6_Barley1_1 LMRVVQSLLRNPHIHIEPY------------------------------Oryza_sativa_TAF6_BAB92191 LMRVVQSLLHNPHIHIEPY------------------------------Populus_trichocarpa_TAF6_Contig2 
VMRIARSLLQNPYIHIEPY------------------------------Arabidopsis_thaliana_TAF6b_4 NLPFHRILLSKPYLTLNDRNIR---------------------------Saccharomyces_cerevisiae_TAF6_ ILEMIYSLLSNTSIFLDPY------------------------------Homo_sapiens_TAF6L_NP_006464 LLQVARSLFRNPHLCLGPY------------------------------Drosophila_melanogaster_TAF6_N -----LHELIPSVMTCIVSKQLCMRP----------ELDNHWALRDFASR Homo_sapiens_TAF6_NP_647476 -----VHELIPAVMTCIVSRQLCLRP----------DVDNHWALRDFAAR Arabidopsis_thaliana_TAF6b_1 -----LHQLMPSIITCLIAKRLGRR-----------SSDNHWDLRNFTAS Arabidopsis_thaliana_TAF6b_3 -----LHQLMPSIITCLIAKRLGRR-----------SSDNHWDLRNFTAS Arabidopsis_thaliana_TAF6b_2 -----LHQLMPSIITCLIAKRLGRR-----------SSDNHWDLRNFTAS Arabidopsis_thaliana_TAF6_At1g -----LHQLMPSVVTCLVSRKLGNR-----------FADNHWELRDFAAN Populus_trichocarpa_TAF6_Contig1 WISCQLHQLMPSVVTCLVARKLGNR-----------FADNHWELRDFTAN Hordeum_vulgare_TAF6_Barley1_1 -----LHQLMPSMITCIVAKRLGHR-----------LSDNHWELRDFSAN Oryza_sativa_TAF6_BAB92191 -----LHQLMPSIITCMVAKRLGHR-----------LSDNHWELRDFSAN Populus_trichocarpa_TAF6_Contig2 -----LHQLMPSIITCLVAKRLGNR-----------FSDNHWELRNFTAN Arabidopsis_thaliana_TAF6b_4 -------------------------------------------------Saccharomyces_cerevisiae_TAF6_ -----IHSLMPSILTLLLAKKLGGSPKDDSPQEIHEFLERTNALRDFAAS Homo_sapiens_TAF6L_NP_006464 -----VRCLVGSVLYCVLEPLAASIN----------PLNDHWTLRDGAAL


210 Drosophila_melanogaster_TAF6_N LMAQICK------------------------------NFNTLTNNLQTRV Homo_sapiens_TAF6_NP_647476 LVAQICK------------------------------HFSTTTNNIQSRI Arabidopsis_thaliana_TAF6b_1 TVASTCK------------------------------RFGHVYHNLLPRV Arabidopsis_thaliana_TAF6b_3 TVASTCK------------------------------RFGHVYHNLLPRV Arabidopsis_thaliana_TAF6b_2 TVASTCK------------------------------RFGHVYHNLLPRV Arabidopsis_thaliana_TAF6_At1g LVSLICK------------------------------RYGTVYITLQSRL Populus_trichocarpa_TAF6_Contig1 LVAPICKRVHGWQHSALILCKHSLTEYVPRVSWSGCCRFGHVYNSLQTRL Hordeum_vulgare_TAF6_Barley1_1 LVASVCR------------------------------RYGHVYHNLQIRL Oryza_sativa_TAF6_BAB92191 LVGSVCR------------------------------RFGHAYHNIQTRV Populus_trichocarpa_TAF6_Contig2 LVASICK------------------------------RFGHAYHNLQPRI Arabidopsis_thaliana_TAF6b_4 --------------------------------------------TMAWLL Saccharomyces_cerevisiae_TAF6_ LLDYVLK------------------------------KFPQAYKSLKPRV Homo_sapiens_TAF6L_NP_006464 LLSHIFWT------------------------------HGDLVSGLYQHI Drosophila_melanogaster_TAF6_N TRIFSKALQNDKTHLSSLYGSIAGLSELGGEVIKVFIIPRLKFISERIEP Homo_sapiens_TAF6_NP_647476 TKTFTKSWVDEKTPWTTRYGSIAGLAELGHDVIKTLILPRLQQEGERIRS Arabidopsis_thaliana_TAF6b_1 TRSLLHTFLDPTKALPQHYGAIQGMVALGLNMVRFLVLPNLGPYLLLLLP Arabidopsis_thaliana_TAF6b_3 TRSLLHTFLDPTKALPQHYGAIQGMVALGLNMVRFLVLPNLGPYLLLLLP Arabidopsis_thaliana_TAF6b_2 TRSLLHTFLDPTKALPQHYGAIQGMVALGLNMVRFLVLPNLGPYLLLLLP Arabidopsis_thaliana_TAF6_At1g TRTLVNALLDPKKALTQHYGAIQGLAALGHTVVRLLILSNLEPYLSLLEP Populus_trichocarpa_TAF6_Contig1 TKTLLNALLDPKRSLTQHYGAIQGLAALGPNVVRLLLLPNLKPYLQLLEP Hordeum_vulgare_TAF6_Barley1_1 TKTLVHAFLDPHKALTQHYGAVQGISALGPSAIRLLLLPNLQTYMQLLDP Oryza_sativa_TAF6_BAB92191 TRTLVQGFLDPQKSLTQHYGAIQGISALGPSAIRLLLLPNLETYMQLLEP Populus_trichocarpa_TAF6_Contig2 IRTLVHAFLDPTKSLPQHYGSIQGLAALGPSVVRLLILPNLEPYLLLLEQ Arabidopsis_thaliana_TAF6b_4 HRCFLRTFR----------------------------------------Saccharomyces_cerevisiae_TAF6_ 
TRTLLKTFLDINRVFGTYYGCLKGVSVLEGESIR-FFLGNLNNWARLVFN Homo_sapiens_TAF6L_NP_006464 LLSLQKILADPVRPLCCHYGAVVGLHALGWKAVERVLYPHLSTYWTNLQA Drosophila_melanogaster_TAF6_N HLLGTSISNTDKTAAGHIRAMLQKCCPPILRQMRSAPDTAEDYKND---F Homo_sapiens_TAF6_NP_647476 VLDGPVLSNIDRIGADHVQSLLLKHCAPVLAKLRPPPDNQDAYRAE---F Arabidopsis_thaliana_TAF6b_1 EMGLEKQKEEAKRHGAWLVYGALMVAAGRCLYERLKTSETLLSPPT---S Arabidopsis_thaliana_TAF6b_3 EMGLEKQKEEAKRHGAWLVYGALMVAAGRCLYERLKTSETLLSPPT---S Arabidopsis_thaliana_TAF6b_2 EMGLEKQKEEAKRHGAWLVYGALMVAAGRCLYERLKTSETLLSPPT---S Arabidopsis_thaliana_TAF6_At1g ELNAEKQKNQMKIYEAWRVYGALLRAAGLCIHGRLKIFPPLPSPSP---S Populus_trichocarpa_TAF6_Contig1 EMLLEKQKNEMKRHEAWHVYGALLCAAGQSIYDRLKMFPALMSHPA---C Hordeum_vulgare_TAF6_Barley1_1 ELQLEKQSNEMKRKEAWRVYGALLCAAGKCLYERLKLFPNLLCPST---R Oryza_sativa_TAF6_BAB92191 ELQLDKQKNEMKRKEAWRVYGALLCAAGKCLYDRLKLFPNLLSPST---R Populus_trichocarpa_TAF6_Contig2 EMLLEKQKNEIKRHEAWQR------AAGLCMYDRLKMLPGLFIPPS---R Arabidopsis_thaliana_TAF6b_4 -FTLTKSRSGL--------------------------------------Saccharomyces_cerevisiae_TAF6_ ESGITLDNIEEHLNDDSNPTRTKFTKEETQILVDTVISALLVLKKD---L Homo_sapiens_TAF6L_NP_006464 VLDDYSVSNAQVKADGHKVYGAILVAVERLLKMKAQAAEPNRGGPGGRGC Drosophila_melanogaster_TAF6_N GFLGPSLCQAVVKVR----------------------------------Homo_sapiens_TAF6_NP_647476 GSLGPLLCSQVVKARAQA-------------------------------Arabidopsis_thaliana_TAF6b_1 SVWKTN--GKLTSPRQ---------------------------------Arabidopsis_thaliana_TAF6b_3 SVWKTN--GKLTSPRQ---------------------------------Arabidopsis_thaliana_TAF6b_2 SVWKTN--GKLTSPRQ---------------------------------Arabidopsis_thaliana_TAF6_At1g FLHKGKGKGKIISTDP---------------------------------Populus_trichocarpa_TAF6_Contig1 AVLRTN--EKVVTKRPGDFYD----------------------------F Hordeum_vulgare_TAF6_Barley1_1 PLLRSN--SRVATNNP---------------------------------Oryza_sativa_TAF6_BAB92191 PLLRSN--KRVVTNNP---------------------------------Populus_trichocarpa_TAF6_Contig2 AIWKSN--GRVMTAMPSMTCFNLSHWDTFIHASINPVTGYVYCLKIPVNA 
Arabidopsis_thaliana_TAF6b_4 -------------------------------------------------Saccharomyces_cerevisiae_TAF6_ PDLYEGKGEKVTDEDK---------------------------------Homo_sapiens_TAF6L_NP_006464 RRLDDLPWDSLLFQESSSGGGAEPSFGSGLPLPPGGAGPEDPSLSVTLAD Drosophila_melanogaster_TAF6_N ----NAPASSIVTLSSN------TINTAP--------------------Homo_sapiens_TAF6_NP_647476 --ALQAQQVNRTTLTITQPRPTLTLSQAPQPGPRTPGLLKVPGSIALPVQ Arabidopsis_thaliana_TAF6b_1 ------SKRKASSDNLT--------HQPPL-------------------Arabidopsis_thaliana_TAF6b_3 ------SKRKASSDNLT--------HQPPL-------------------Arabidopsis_thaliana_TAF6b_2 ------SKRKASSDNLT--------HQPPL-------------------Arabidopsis_thaliana_TAF6_At1g ------HKRKLSVDSSE--------NQSPQ-------------------Populus_trichocarpa_TAF6_Contig1 SFQKLYHLNATVCDVSMPMYLWVESNLFPL-------------------Hordeum_vulgare_TAF6_Barley1_1 ------NKRKSSTDLSA--------SQPPL-------------------Oryza_sativa_TAF6_BAB92191 ------NKRKSSTDLST--------SQPPL-------------------Populus_trichocarpa_TAF6_Contig2 CVEMGLYVGTSSFHYVHLTLYPACISCRSL-------------------Arabidopsis_thaliana_TAF6b_4 -------------------------------------------------Saccharomyces_cerevisiae_TAF6_ -------------------------------------------------Homo_sapiens_TAF6L_NP_006464 IYRELYAFFGDSLATRFGTGQPAPTAPRPPGD-----------------


Drosophila_melanogaster_TAF6_N ---------------------ITSAAQTATTIGRVSMPTTQRQGSPGVSS
Homo_sapiens_TAF6_NP_647476 TLVSARAAAPPQPSPPPTKFIVMSSSSSAPSTQQVLSLSTSAPGSGSTTT
Arabidopsis_thaliana_TAF6b_1 ------------------------KKIAVGG---------------IIQM
Arabidopsis_thaliana_TAF6b_3 ------------------------KKIAVGG---------------IIQM
Arabidopsis_thaliana_TAF6b_2 ------------------------KKIAVGG---------------IIQM
Arabidopsis_thaliana_TAF6_At1g ------------------------KRLITMDGPDGVHSQDQSGSAPMQVD
Populus_trichocarpa_TAF6_Contig1 ------------------------IENYQDKRKASMEHMEQPPPKKIATD
Hordeum_vulgare_TAF6_Barley1_1 ------------------------KKMASDVSMSPMGSAAPVAGNMAGSM
Oryza_sativa_TAF6_BAB92191 ------------------------KKMTTDGAMNSMTSAP-----MPGTM
Populus_trichocarpa_TAF6_Contig2 ------------------------LANQDKRKASTDNLMQQPLLKKIATD
Arabidopsis_thaliana_TAF6b_4 --------------------------------------------------
Saccharomyces_cerevisiae_TAF6_ --------------------------------------------------
Homo_sapiens_TAF6L_NP_006464 ----------------------KKEPAAAPDSVRKMPQLTASAIVSPHGD

Drosophila_melanogaster_TAF6_N LPQIRAIQANQPAQKFVIVTQNSP----QQGQAKVVR--------RGSSP
Homo_sapiens_TAF6_NP_647476 SPVTTTVPSVQPIVKLVSTATTAPPSTAPSGPGSVQKYIVVSLPPTGEGK
Arabidopsis_thaliana_TAF6b_1 SSTQMQMRGTTTVPQ--------------QSHTDADARHH-------NSP
Arabidopsis_thaliana_TAF6b_3 SSTQMQMRGTTTVPQ--------------QSHTDADARHH-------NSP
Arabidopsis_thaliana_TAF6b_2 SSTQMQMRGTTTVPQ--------------QSHTDADARHH-------NSP
Arabidopsis_thaliana_TAF6_At1g NPVENDNPPQNSVQP--------SSSEQASDANESESRNGK----VKESG
Populus_trichocarpa_TAF6_Contig1 GPVDMQVEPIAPVPLGDSKTGLSTSSEHTPNYSEAGSRNQ------KDKG
Hordeum_vulgare_TAF6_Barley1_1 DGFSAQLPNPGMMQA--------SSSGQKVESMTAAGAIR------RDQG
Oryza_sativa_TAF6_BAB92191 DGFSTQLPNPSMTQT--------SSSGQLVES-TASGVIR------RDQG
Populus_trichocarpa_TAF6_Contig2 SAIGAMPMNSMPVEMQGAASGFPTAVGASSVSVSAISRQLSNENVPRREI
Arabidopsis_thaliana_TAF6b_4 --------------------------------------------------
Saccharomyces_cerevisiae_TAF6_ --------------------------------------------------
Homo_sapiens_TAF6L_NP_006464 ESPRGSGGGGPASASGPAASESRPLPRVHRARGAPRQQGPGTGTRDVFQK

Drosophila_melanogaster_TAF6_N HSVVLSAASNAASASNSNSSSSGSLLAAAQRSSDNVCVIAGSEAPAVDGI
Homo_sapiens_TAF6_NP_647476 GGPTSHPSPVPPPASSPSPLSGSALCGGKQEAGDSPPPAPGTPKANGSQP
Arabidopsis_thaliana_TAF6b_1 STIAPKTSAAAG------TDVDNYLFPLFEYFGESMLMFTPTHELSFFL-
Arabidopsis_thaliana_TAF6b_3 STIAPKTSAAAG------TDVDNYLFPLFEYFGESMLMFTPTHELSFFL-
Arabidopsis_thaliana_TAF6b_2 STIAPKTSAAAG------TDVDNYLFPLFEYFGESMLMFTPTHELSFFL-
Arabidopsis_thaliana_TAF6_At1g RSRAITMKAILDQIWKDDLDSGRLLVKLHELYGDRILPFIPSTEMSVFL-
Populus_trichocarpa_TAF6_Contig1 DSQAIKTSAILSQVWKDDLNSGHLLVSLFELFGESILSFIPSPEMSLFL-
Hordeum_vulgare_TAF6_Barley1_1 SNHAQRVSAVLRQAWKEDQDAGHLLGSLHEVFGEAIFSFIQPPELSIFL-
Oryza_sativa_TAF6_BAB92191 SNHTQRVSTVLRLAWKEDQNAGHLLSSLYEVFGEAIFSFVQPPEISFFL-
Populus_trichocarpa_TAF6_Contig2 SGRGLKTSTVLAQAWKEDMDAGHLLASLFELFSESMFSFTPKPELSFFL-
Arabidopsis_thaliana_TAF6b_4 --------------------------------------------------
Saccharomyces_cerevisiae_TAF6_ EKLLERCGVTIGFHILKRDDAKELISAIFFGE------------------
Homo_sapiens_TAF6L_NP_006464 SRFAPRGAPHFRFIIAGRQAGRRCRGRLFQTAFPAPYGPSPASRYVQKLP

Drosophila_melanogaster_TAF6_N TVQSFRAS---------------
Homo_sapiens_TAF6_NP_647476 NSGSPQPAP--------------
Arabidopsis_thaliana_TAF6b_1 -----------------------
Arabidopsis_thaliana_TAF6b_3 -----------------------
Arabidopsis_thaliana_TAF6b_2 -----------------------
Arabidopsis_thaliana_TAF6_At1g -----------------------
Populus_trichocarpa_TAF6_Contig1 -----------------------
Hordeum_vulgare_TAF6_Barley1_1 -----------------------
Oryza_sativa_TAF6_BAB92191 -----------------------
Populus_trichocarpa_TAF6_Contig2 -----------------------
Arabidopsis_thaliana_TAF6b_4 -----------------------
Saccharomyces_cerevisiae_TAF6_ -----------------------
Homo_sapiens_TAF6L_NP_006464 MIGRTSRPARRWALSDYSLYLPL

TAF9 Alignment

CLUSTAL X (1.83) multiple sequence alignment

Gossypium__Cotton__TAF9_TC1456 MAEG-------------------------EEDLPRDAKIVKSLLKSMGVE
Populus_balsamifera_TAF9 MAEG-------------------------EEDMPRDAKIVKSLLKSMGVE
Vitis_vinifera_TAF9_TC11580 MAGG-------------------------DEDLPRDAKIVKSLLKSMGVD
Solanum_tuberosum_TAF9_TC67183 MAEGG------------------------EEDLPRDAKIVKTLLKSMGVD
Solanum_tuberosum_TAF9_TC67182 MAEGG------------------------EEDLPRDAKIDKTSLKSMGVD
Lycopersicon_esculentum_TAF9_T MAEGG------------------------EEDLPRDAKIVKTLLKSMGVD
Arabidopsis thaliana TAF9 MAGEG------------------------EEDVPRDAKIVKSLLKSMGVE
M_truncatula_TAF9_TC85341 MADNEE-----------------------DSNMPRDAKIMQSLLKSMGVE


212 M_truncatula_TAF9_TC85342 MADNEE-----------------------DSNMPRDAKIVQSLLKSMGVE Zea_mays_TAF9_TC182853 MDAGAARPSAPS--TAAVA-------GASVADEPRDARVVRELLRSMGLR Zea_mays_TAF9_TC182854 MDAADARPSAPSAAAAAVA-------GASVADEPRDARVVRELLRSMGLG Hordeum_vulgare_TAF9_TC68170 MDSGGVRPSLPS--AAAAG-------GASVPDEPRDARVVRELLRSMGLG Oryza_sative_TAF9_AAP12985 MDPGGLRPAPQSAAAAAAAAAAGAGAGASAADEPRDARVVRELLRSMGLS Triticum_aestivum_TAF9_TC70841 MDGGGGGGGRPALQPAAAGG------GASGPDEPRDARVVRELLRSMGLG Oryza_sativa_TAF9_BAC21319.1 MDTGADQAPPPPPPPPVAAAS-------AAADEPRDLRVVREILHSLGLR Chlamydomonas_reinhardtii_TAF9 MDAARGAGGAVS-----------------DGAQPQDVATMHALLRSMGVE Homo_sapiens_TAF9_NP_003178 MESGK---------------------TASPKSMPKDAQMMAQILKDMGIT Homo_sapiens_TAF9L_NP_057059 MESGK---------------------MAPPKNAPRDALVMAQILKDMGIT Drosophila_melanogaster_TAF9_A MSAEKSDKAKI---------------SAQIKHVPKDAQVIMSILKELNVQ Saccharomyces_cerevisiae_TAF9_ MNGGGKNVLNKNSVGSVSEVGP----DSTQEETPRDVRLLHLLLASQSIH Populus_balsamifera_TAF9b MGEGTVP--------------------LEVQIRPKEMHLQAEFGFAAHWR Gossypium__Cotton__TAF9_TC1456 D--YEPRVIHQFLELWYRYVVDVLTDAQVYSEHAGKQ-----TIDCDDVK Populus_balsamifera_TAF9 D--YEPRVVHQFLELWYRYVVDVLTDAQVYSEHANKT-----AIDCDDVK Vitis_vinifera_TAF9_TC11580 D--YEPRVIHQFLELWYRYVVDVLTDAQVYSEHASKL-----AIDCDDVK Solanum_tuberosum_TAF9_TC67183 D--YEPRVVHQFLELWYRYVVDVLMDAQVYSEHAGKA-----SIDSDDIK Solanum_tuberosum_TAF9_TC67182 D--YEPRVVQQFLELRNSYVVDVLTDAQVYSEHAGKT-----SIDSDDIK Lycopersicon_esculentum_TAF9_T D--YEPRVVHQFLELWYRYVVDVLTDAQVYSEHARKA-----SIDSDDIK Arabidopsis thaliana TAF9 D--YEPRVIHQFLELWYRYVVEVLTDAQVYSEHASKP-----NIDCDDVK M_truncatula_TAF9_TC85341 E--YEPRVINKFLELWYRYVVDVLTDAQVYSEHAGKP-----AIDVDDVK M_truncatula_TAF9_TC85342 E--YEPRVINKFLELWYRYVVDVLTDAQVYSEHAGKP-----AIDVDDVK Zea_mays_TAF9_TC182853 EGEYEPRVVHQFLDLAYRYVGDVLGDAQVYADHAGKA-----QIDADDVR Zea_mays_TAF9_TC182854 EGEYEPRVVHQFLDLAYRYVGDVLGDAQVYADHAGKA-----QIDADDVR Hordeum_vulgare_TAF9_TC68170 EGEYEPRVVGQFLDLAYRYVGDVLGDAQVYADHADKP-----QIDADDVR Oryza_sative_TAF9_AAP12985 
EGEYEPRVVHQFLDLAYRYVGDVLGDAQVYADHAGKP-----QLDADDVR Triticum_aestivum_TAF9_TC70841 EGEYEPRVVHQFLDLAYRYAGDVLGDAQVYADHAGKP-----QLDADDVR Oryza_sativa_TAF9_BAC21319.1 EGDYEEAAVHKLLLFAHRYAGDVLGEAKAYAGHAGRE-----SLQADDVR Chlamydomonas_reinhardtii_TAF9 E--FEPRVVNQLMDFMYKYTTDVLLDAEVFSEHAGRQP---GQVDASGVT Homo_sapiens_TAF9_NP_003178 E--YEPRVINQMLEFAFRYVTTILDDAKIYSSHAKKA-----TVDADDVR Homo_sapiens_TAF9L_NP_057059 E--YEPRVINQMLEFAFRYVTTILDDAKIYSSHAKKP-----NVDADDVR Drosophila_melanogaster_TAF9_A E--YEPRVVNQLLEFTFRYVTCILDDAKVYANHARKK-----TIDLDDVR Saccharomyces_cerevisiae_TAF9_ Q--YEDQVPLQLMDFAHRYTQGVLKDALVYNDYAGSGNSAGSGLGVEDIR Populus_balsamifera_TAF9b YKEGDCKHSSFVLQVVEWARWVITWQCETMSKDRPSIG-CDDSIKPPCTF Gossypium__Cotton__TAF9_TC1456 LAIQSKVNFSFSQPPPRE--------------------------VLLELA Populus_balsamifera_TAF9 LAIQSKVNFSFSQPPPRE--------------------------VLLELA Vitis_vinifera_TAF9_TC11580 LAIHFKVNFSFFQPPARE--------------------------VLLELA Solanum_tuberosum_TAF9_TC67183 LAIQSKVNFSFSQPPPRE--------------------------VLLELA Solanum_tuberosum_TAF9_TC67182 LAIQSKVNFSFSQPPPRE--------------------------VLLELA Lycopersicon_esculentum_TAF9_T LAIQSKVNFSFSQPPPRE--------------------------VLLELA Arabidopsis thaliana TAF9 LAIQSKVNFSFSQPPPRE--------------------------VLLELA M_truncatula_TAF9_TC85341 LAIQSQVNFSFSQPPPRE--------------------------VLLELA M_truncatula_TAF9_TC85342 LAIQSQVNFSFSQPPPRE--------------------------VLLELA Zea_mays_TAF9_TC182853 LAIQAKVNFSFSQPPPRE--------------------------VLLELA Zea_mays_TAF9_TC182854 LAIQAKVNFSFSQPPPRE--------------------------VLLELA Hordeum_vulgare_TAF9_TC68170 LAIQANVNFSFSQPPPRE--------------------------VLLELA Oryza_sative_TAF9_AAP12985 LAIQSKVNFSFSQPPPRECSEFFHSDQDFRSRSLPSDNPLFFSMVLLEVA Triticum_aestivum_TAF9_TC70841 LAIQAKVNFSFSQPPPRE--------------------------VLLELA Oryza_sativa_TAF9_BAC21319.1 LAIQARG-MSSAAPPSRE--------------------------EMLDIA Chlamydomonas_reinhardtii_TAF9 MAIQSRTALYVQPPPQER---------------------------VTELA Homo_sapiens_TAF9_NP_003178 
LAIQCRADQSFTSPPPRD--------------------------FLLDIA Homo_sapiens_TAF9L_NP_057059 LAIQCRADQSFTSPPPRD--------------------------FLLDIA Drosophila_melanogaster_TAF9_A LATEVTLDKSFTGPLERH--------------------------VLAKVA Saccharomyces_cerevisiae_TAF9_ LAIAARTQYQFKPTAPKE--------------------------LMLQLA Populus_balsamifera_TAF9b PSHSDGCPYSYKPHCGQDG-------------------------PIFIIM Gossypium__Cotton__TAF9_TC1456 RNRNKVPLPKAIPGPG-IPLPPEQDTLISTNYQLAIPKKQPAQAMEEMEE Populus_balsamifera_TAF9 RNRNKIPLPKSIAGPG-IPLPPEQDTLISPNYQLAIPKKRTAQAIEETEE Vitis_vinifera_TAF9_TC11580 RNRNKIPLPKSIAGPG-IPLPPEQDTLISPNYQLAIPKKRTAQAVEETEE Solanum_tuberosum_TAF9_TC67183 RNRNKIPLPKSIAGSG-VPLPPEQDTLINPNYQLAIAKKQTSQP-EETEE Solanum_tuberosum_TAF9_TC67182 RNRNKIPLPKSIAGSG-VPLPPQQDTLINPNYQLAIAKKQTSQP-EETEE Lycopersicon_esculentum_TAF9_T RNRNKIPLPKSIAGSG-VPLPPEQDTLINPNYQLAIAKKQTNQP-EETEE Arabidopsis thaliana TAF9 ASRNKIPLPKSIAGPG-VPLPPEQDTLLSPNYQLVIPKKSVSTEPEETED M_truncatula_TAF9_TC85341 QNRNKIPLPKSIAGPG-FPLPPDQDTLIAPNYQFAIPNKRSVEPMEETED M_truncatula_TAF9_TC85342 QNRNKIPLPKSIAGPG-FPLSPDQDTLIAPNYQFAIPNKRSVEPMEETED Zea_mays_TAF9_TC182853 RSRNRMPLPKSIAPPGSIPLPPEQDTLLAQNYQLLPPLKPPPQY-EEIED Zea_mays_TAF9_TC182854 RSRNRMPLPKSIAPPGSIPLPPEQDTLLAQNYQLLPPLKPPPQY-EENED Hordeum_vulgare_TAF9_TC68170 RSRNKIPLPKSIAPPGSIPLPPEQDTLLSENYQLLPALKPPTQT-EEAED Oryza_sative_TAF9_AAP12985 RNRNKIPLPKSIAPPGSIPLPPEQDTLLSQNYQLLAPLKPPPQF-EETED

PAGE 226

213 Triticum_aestivum_TAF9_TC70841 RSRNKIPLPKSIAPPGSIPLPPEQDTLLSQNYQLLPALKPPTQT-EEAED Oryza_sativa_TAF9_BAC21319.1 HKCNEIPIPKPCVPSGSISLPHYEDMLLNKKHIFVPRVEPTPHQIEETED Chlamydomonas_reinhardtii_TAF9 RQVNDTGTARPGHQAPACRCRPRASR-----------------------Homo_sapiens_TAF9_NP_003178 RQRNQTPLPLIKPYSG-PRLPPDRYCLTAPNYRLKS----LQKKASTSAG Homo_sapiens_TAF9L_NP_057059 RQKNQTPLPLIKPYAG-PRLPPDRYCLTAPNYRLKS----LIKKG-PNQG Drosophila_melanogaster_TAF9_A DVRNSMPLPPIKPHCG-LRLPPDRYCLTGVNYKLRATNQPKKMTKSAVEG Saccharomyces_cerevisiae_TAF9_ AERNKKALPQVMGTWG-VRLPPEKYCLTAKEWDLEDPKSM---------Populus_balsamifera_TAF9b IENDKMSVQEFPADSTVMDLLERAGRASSRWSAYGFPVKEELRPRLNHRP Gossypium__Cotton__TAF9_TC1456 DEE---------SVEP-NSSQEH----------------KTDAPHPTSQR Populus_balsamifera_TAF9 DEE---------SADP-NQSQEQ----------------KTDPPQLTPQR Vitis_vinifera_TAF9_TC11580 DEE---------GADPSHASQEG----------------RTDLPQHTPQR Solanum_tuberosum_TAF9_TC67183 DEE---------RADPNPAPSKNPSLSHE----------KTDVPQGTPQQ Solanum_tuberosum_TAF9_TC67182 DEE---------SADPNPAPSKNPSLSHE----------KTDVPQGTPQR Lycopersicon_esculentum_TAF9_T DEE---------SADPNPAPSKNPTLSHE----------KTDLPQGTPQR Arabidopsis thaliana TAF9 DEE---------MTDPGQSSQEQQQQQQQ----------TSDLPSQTPQR M_truncatula_TAF9_TC85341 EEVP--------NADPNPSQEEK-----T----------DAEQN--PHQR M_truncatula_TAF9_TC85342 ERS---------SQWPIPTHLKK-----R----------RQMRNKIPIKE Zea_mays_TAF9_TC182853 ETEEPNPS---NPANSNPSYSQDQSSKEQQ---------QQHTPQHG-QR Zea_mays_TAF9_TC182854 ENEESNPSLTPNPANSNPTFSQDQRSNEQ-----------QHTPQHG-QR Hordeum_vulgare_TAF9_TC68170 DNEGADAI----PANPSPSYSQDQRGSEQ------------HQPQSQSQR Oryza_sative_TAF9_AAP12985 DNAGANPTPTSNPSNPSPNNLQEQQ----------------QLPQHG-QR Triticum_aestivum_TAF9_TC70841 EEEGANAD----AANANPNSSQDQR---------------------GNEA Oryza_sativa_TAF9_BAC21319.1 DYNDDGSNAN--VASPNSNYDQDLFGSISLPHYQDMLLNQNHLSVHRVEP Chlamydomonas_reinhardtii_TAF9 -------------------------------------------------Homo_sapiens_TAF9_NP_003178 RITVPRLSVGSVTSRPSTPTLGTPTPQTMSVSTKVGTPMSLTGQRFTVQM 
Homo_sapiens_TAF9L_NP_057059 RL-VPRLSVGAVSSKPTTPTIATP--QTVSVPNKVATPMSVTSQRFTVQI Drosophila_melanogaster_TAF9_A RPLKTVVKPVSSANGPKRPHSVVAKQQVVTIPKPVIKFTTTTTTKTVGSS Saccharomyces_cerevisiae_TAF9_ -------------------------------------------------Populus_balsamifera_TAF9b VHDATCKLKMGDVVELTPAIPDKSLSDYR------------EEIQRMYEH Gossypium__Cotton__TAF9_TC1456 VSFPL-TKRSK--------------------------------------Populus_balsamifera_TAF9 VSFPL-TKRPNYRFQVMSSISCSSSMNSPDSSTLFTRLKFELCDMRIALI Vitis_vinifera_TAF9_TC11580 VSFPIGAKRPR--------------------------------------Solanum_tuberosum_TAF9_TC67183 VSFPLGAKRPR--------------------------------------Solanum_tuberosum_TAF9_TC67182 VSFPLGAKRPR--------------------------------------Lycopersicon_esculentum_TAF9_T VSFPLGAKRPR--------------------------------------Arabidopsis thaliana TAF9 VSFPL-SRRPK--------------------------------------M_truncatula_TAF9_TC85341 VSFPLPKRQKD--------------------------------------M_truncatula_TAF9_TC85342 CHFPCLNPKGLI-------------------------------------Zea_mays_TAF9_TC182853 VSFQLNAVAAAAAAAKRPRMAIDQLNMG---------------------Zea_mays_TAF9_TC182854 VSFQLNAVAAAA--AKRPRMTVDQLNIG---------------------Hordeum_vulgare_TAF9_TC68170 VSFQLNAVAAAA--AKRPLVTTDQLNMG---------------------Oryza_sative_TAF9_AAP12985 VSFQLNAVAAAK---RRG--TMDQLNMG---------------------Triticum_aestivum_TAF9_TC70841 XSSSLRARARAQ----------GFFQA----------------------Oryza_sativa_TAF9_BAC21319.1 AHDQLEKIKDDGSNDNADSSHSNYVQDSSGSVSLQHHQDMSLNQNHLFVH Chlamydomonas_reinhardtii_TAF9 -------------------------------------------------Homo_sapiens_TAF9_NP_003178 PTSQS---PAVKASIPATSAVQNVLINPSLIGSKNILITTNMMSSQNTAN Homo_sapiens_TAF9L_NP_057059 PPSQS---TPVKP-VPATTAVQNVLINPSMIGPKNILITTNMVSSQNTAN Drosophila_melanogaster_TAF9_A GGSGGGGGQEVKSESTGAGGDLKMEVDSDAAAVGSIAGASGSGAGSASGG Saccharomyces_cerevisiae_TAF9_ -------------------------------------------------Populus_balsamifera_TAF9b GSATVSSTAPAVSGTVGRRS-----------------------------Gossypium__Cotton__TAF9_TC1456 
------------------------------------------------Populus_balsamifera_TAF9 ------------------------------------------------Vitis_vinifera_TAF9_TC11580 ------------------------------------------------Solanum_tuberosum_TAF9_TC67183 ------------------------------------------------Solanum_tuberosum_TAF9_TC67182 ------------------------------------------------Lycopersicon_esculentum_TAF9_T ------------------------------------------------Arabidopsis thaliana TAF9 ------------------------------------------------M_truncatula_TAF9_TC85341 ------------------------------------------------M_truncatula_TAF9_TC85342 ------------------------------------------------Zea_mays_TAF9_TC182853 ------------------------------------------------Zea_mays_TAF9_TC182854 ------------------------------------------------Hordeum_vulgare_TAF9_TC68170 ------------------------------------------------Oryza_sative_TAF9_AAP12985 ------------------------------------------------Triticum_aestivum_TAF9_TC70841 ------------------------------------------------Oryza_sativa_TAF9_BAC21319.1 QVELTLDQIEEIKDDGSNDNVDSPNFNCVQDPSRSVSFPHYQVMPLNQN Chlamydomonas_reinhardtii_TAF9 ------------------------------------------------Homo_sapiens_TAF9_NP_003178 ESS-------NALKRKREDDDDDDDDDDDYDNL---------------Homo_sapiens_TAF9L_NP_057059 EA--------NPLKRKHEDDDDNDIM-----------------------

PAGE 227

214
Drosophila_melanogaster_TAF9_A GGGGGSSGVGVAVKREREEEEFEFVTN---------------------
Saccharomyces_cerevisiae_TAF9_ ------------------------------------------------
Populus_balsamifera_TAF9b ------------------------------------------------

TAF10 Alignment
CLUSTAL X (1.83) multiple sequence alignment

Homo_sapiens_TAF10_Q12962 ----------------------------NGDVKPVVSSTPLVDFLMQLED
Drosophila_melanogaster_TAF10b MVGSNFGIIYHNSAGGASSHGQSSGGGGGGDRDRTTPSSHLSDFMSQLED
Drosophila_melanogaster_TAF10_ -----TEEEDIDSPLMQSELHSDEEQPDVEEVPLTTEESEMDELIKQLED
Hordeum_vulgare_TAF10_Barley1_ -----MGSNNSGGAGGGGG--MAPGTGAGGSDGRHDDEAVLTEFLSSLMD
Hordeum_vulgare_TAF10c_HVtuc02 -----MGSNNSGGAGGGGG--MAPGTGAGGSDGRHDDEAVLTEFLSSLMD
Triticum_aestivum_TAF10b_TC647 ----MMGSNNPGGAGGGGGGGMAPGTGGGGSDGRHDDEAVLTDFLSSLMD
Triticum_aestivum_TAF10_TC6468 ----MMGSNNPGGAGGGGG--MAPGTGAGGSDGRHDDEAVLTEFLSSLMD
Hordeum_vulgare_TAF10b_TC68796 ----MMGSNNPGGAG--GG--MAPGMGAGGSDGRHDDEAVLTEFLSSLMD
Triticum_aestivum_TAF10c_CA620 ---------------------MAPGMGAGSSDGRHDDEAVLTEFLSSLMD
Oryza_sativa_TAF10_TC129171 --------MVPGGMGGGGPMGAAPPGGGGGGDGRHDDEAVLTEFLSSLMD
Zea_mays_TAF10_TC184169 -------------MG---------TGVGGGGDGRHDDEAALTEFLSSLMD
Arabidopsis_thaliana_TAF10_AAK -------------MN----------HGQQSGEAKHEDDAALTEFLASLMD
Populus_trichocarpa_TAF10 -------------MNNTSS--SNSQQQQQSSEARHDDDAVLTEFLASLMD
Glycine_max_TAF10_TC162515 -------------MNQ----------NPQSSDGRNDDDSALSDFLASLMD
Glycine_max_TAF10b_TC162516 -------------MNQ----------NPQSSEGRNDDDSALSDFLASLMD
Gossypium_arboreum_TAF10_BQ401 -------------MNH----------NPQSSDGKHDDDSALSDFLASLMD
Beta_vulgaris_TAF10_BVSVtuc03 -------------MN------------PQTSDGRHDDDAALSEFLASLMD
Lycopersicon_esculentum_TAF10_ -------------MNQS--------QGQQTSEGRHEDDAVLADFLASLMD
Pinus_TAF10_TC9616 -----------------------------MAESKQDDDAVLIEFLSSLMD
Saccharomyces_cerevisiae_TAF10 -----------------------------GIPEFTRKDKTLEEILEMMDS

Homo_sapiens_TAF10_Q12962 YTPTIPDAVTGYYLNRAGFEASDPRIIRLISLAAQKFISDIANDALQHCK
Drosophila_melanogaster_TAF10b YTPLIPDAVTSHYLNMGGFQSDDKRIVRLISLAAQKYMSDIIDDALQHSK Drosophila_melanogaster_TAF10_ YSPTIPDALTMHILKTAGFCTVDPKIVRLVSVSAQKFISDIANDALQHCK Hordeum_vulgare_TAF10_Barley1_ YNPTIPDELVEHYLGRSGFHCPDLRLTRLVAVAAQKFISDIASDSLQHCK Hordeum_vulgare_TAF10c_HVtuc02 YNPTIPDELVEHYLGRSGFHCPDLRLTRLVAVAAQKFISDIASDSLQHCK Triticum_aestivum_TAF10b_TC647 YNPTIPDELVEHYLGRSGFHCPDLRLTRLVAVAAQKFISDIASDSLQHCK Triticum_aestivum_TAF10_TC6468 YNPTIPDELVEHYLGRSGFHCPDLRLTRLVAVAAQKFISDIASDSLQHCK Hordeum_vulgare_TAF10b_TC68796 YNPMIPDELVEHYLGRSGFHXPDLRLTRLVAVATQKFISDVASDSLQHCK Triticum_aestivum_TAF10c_CA620 YNPMIPDELVEHYLGRSGFHCPDLRLTRLVAIATQKFISDVASDSLQHCK Oryza_sativa_TAF10_TC129171 YTPTIPDELVEHYLGRSGFYCPDLRLTRLVAVATQKFISDIASDSLQHCK Zea_mays_TAF10_TC184169 YTPTIPDELVEHYLGRSGFHCPDLRLTRLVAVATQKFLSDIASDSLQHCK Arabidopsis_thaliana_TAF10_AAK YTPTIPDDLVEHYLAKSGFQCPDVRLIRLVAVATQKFVADVASDALQHCK Populus_trichocarpa_TAF10 YTPTIPDELVEHYLAKSGFQCPDVRLVRLVAVATQKFVADVATDALQQCK Glycine_max_TAF10_TC162515 YTPTIPDELVEHYLAKSGFQCPDVRLTRLVAVATQKFVAEVAGDALQHCK Glycine_max_TAF10b_TC162516 YTPTIPDELVEHYLAKSGFQCPDVRLTRLVAVATQKFVAEVAGDALQHCK Gossypium_arboreum_TAF10_BQ401 YAPTIPDELVEHYLAKSGFQCPDVRLIRLVAVATQKFVAEVASDALQHCK Beta_vulgaris_TAF10_BVSVtuc03YTPTIPDELVEHYLAKSGFQCPDVRLIRLVAVATQKFISEVATDALQHCK Lycopersicon_esculentum_TAF10_ YTPTIPDELVEHYLGKSGFQCPDVRLIRLVAVATQKFIADVATDALQHCK Pinus_TAF10_TC9616 YTPTIPDELAEYYLSKSGFQCPDVRIIRMVSIATQKFIAEIASDAFQLCK Saccharomyces_cerevisiae_TAF10 TPPIIPDAVIDYYLTKNGFNVADVRVKRLLALATQKFVSDIAKDAYEYSR Homo_sapiens_TAF10_Q12962 MKG----T----------------------ASGSSRSK----SKDRKYTL Drosophila_melanogaster_TAF10b ARTH-MQT----------------------TNTPGGSK----AKDRKFTL Drosophila_melanogaster_TAF10_ TRTTNIQH----------------------SSGHSSSKDKKNPKDRKYTL Hordeum_vulgare_TAF10_Barley1_ ARV----------------------------AAPIKDNKSKQPKDRRLVL Hordeum_vulgare_TAF10c_HVtuc02 ARV----------------------------AAPIKDNKSKQPKDRRLVL Triticum_aestivum_TAF10b_TC647 
ARV----------------------------AAPIKDNKSKQPKDRRLVL Triticum_aestivum_TAF10_TC6468 ARV----------------------------AAPVKDNKSKQPKDRRLVL Hordeum_vulgare_TAF10b_TC68796 ARV----------------------------AAPIKDNKSKQPKDRRLVL Triticum_aestivum_TAF10c_CA620 ARV----------------------------AAPIKDNKSKQPKDRRLVL Oryza_sativa_TAF10_TC129171 ARV----------------------------AAPIKDNKSKQPKDRRLVL Zea_mays_TAF10_TC184169 ARV----------------------------AAPIKDNKSKQPKDRRLVL Arabidopsis_thaliana_TAF10_AAK ARP----------------------------APVVKDK--KQQKDKRLVL Populus_trichocarpa_TAF10 ARP----------------------------APVVKDKRDKQQKEKRLIL Glycine_max_TAF10_TC162515 ARQ----------------------------ATIPKDKRDKQQKDKRLVL Glycine_max_TAF10b_TC162516 ARQ----------------------------ATIPKDKRDKQQKDKRLVL Gossypium_arboreum_TAF10_BQ401 ARQ----------------------------AAVVKDKREKQQKDKRLIL Beta_vulgaris_TAF10_BVSVtuc03ARQ----------------------------SSVVKDKRDKLQKDKRLVL Lycopersicon_esculentum_TAF10_ ARQ----------------------------STIVKDKRDKQQKDKRLTL

PAGE 228

215
Pinus_TAF10_TC9616 ARQ----------------------------SAVNKEKRDKQQKDKSFVL
Saccharomyces_cerevisiae_TAF10 IRSSVAVSNANNSQARARQLLQGQQQPGVQQISQQQHQQNEKTTASKVVL

Homo_sapiens_TAF10_Q12962 TMEDLTPALSEYGINVKKPHYFT-------------
Drosophila_melanogaster_TAF10b TMEDLQPALADYGINVRKVDYSQ-------------
Drosophila_melanogaster_TAF10_ AMEDLVPALADHGITMRKPQYFV-------------
Hordeum_vulgare_TAF10_Barley1_ TMDDLSKALREHGVNLRHPEYFADSPSAGMAPSTRDE
Hordeum_vulgare_TAF10c_HVtuc02 TMDDLSKALREHGVNLRHPEYFADSPSAGMGHSTREE
Triticum_aestivum_TAF10b_TC647 TMDDLSKALREHGVNLRHPEYFADSPSAGX-PLKREE
Triticum_aestivum_TAF10_TC6468 TMDDLSKALREHGGNLKHPEYFADSPSAGMPPSTREE
Hordeum_vulgare_TAF10b_TC68796 TMDDLSKALREHGVNLKHPEYFADSPSAGMGHSTREE
Triticum_aestivum_TAF10c_CA620 TMDDLSKALREHGVNLKHPEYFADSPSARMGPSTREE
Oryza_sativa_TAF10_TC129171 TMDDLSKALQEHGVNLKHPEYFADSPSAGMAPAAREE
Zea_mays_TAF10_TC184169 TMDDLSKALREHGVNLKHAEYFADSPSAGMAPSTREE
Arabidopsis_thaliana_TAF10_AAK TMEDLSKALREYGVNVKHPEYFADSPSTGMDPATRDE
Populus_trichocarpa_TAF10 TMEDLSKALSEYGVNVKHQEYFADSPSTGMDPASREE
Glycine_max_TAF10_TC162515 TMEDLSKALREYGVNLKHQEYFADSPSTGMDPATREE
Glycine_max_TAF10b_TC162516 TMEDLSQALREYGANLTDQEYFADSPSTVMDPATREE
Gossypium_arboreum_TAF10_BQ401 TMDDLSKSLREYGVNVKHQEYFADSPSTGIDPASREE
Beta_vulgaris_TAF10_BVSVtuc03 TMEDLSRALKEYGVNLKHQEYFADNPSTGMDPASRDE
Lycopersicon_esculentum_TAF10_ TMDDLSKSLREYGVNVKHQDYFADSPSAGLDPASREE
Pinus_TAF10_TC9616 TTEDLSMALREYGVNMKRQEYFADNPSAGTNPTSKDE
Saccharomyces_cerevisiae_TAF10 TVNDLSSAVAEYGLNIGRPDFYR-------------

TAF11 Alignment
CLUSTAL X (1.83) multiple sequence alignment

Drosophila_melanogaster_TAF11_ -------MDEILFPTQQKSNSLSDGDDV-DLKFFQSASGERKDSDTSDPG
Homo_sapiens_TAF11_NP_005634 -------MDDAHESPSDKGGETGESDET-AAVPGDPGATD-TDGIPEETD
Arabidopsis_thaliana_TAF11_At4 ----MKHSKDPFEAAIEEEQEES--------PPESPVGGGGGGDGSEDGR
Arabidopsis_thaliana_TAF11b_AA ----MAFNARSCCFASSNERVTCNCNCL-KDQPVPSVVGCATKKLAEFWS
Medicago_truncatula_TAF11_TC80 MAGGISFGIGLKRMKQSKDPFEAAFE---ESPPESPIETEPDPDASTENP
Hordeum_vulgare_TAF11_TC81880
----------------MKDPFEAAVEEQ-ESPPDSPAPPEEGPATAVPHT Triticum_aestivum_TAF11_TC9194 ----------------MKDPFEAAVEEQ-DSPPDSPAPPEEDPATAVPHT Oryza_sativa_TAF11_BAB90043 TREAAHQARRRRAAAAMKDPFEAAVEEQ-ESPPESPAANEEDAAGAP--Oryza_sativa_TAF11b_TC124761 ----------------MKDPFEAAVEEQ-ESPPESPAANEEDAAGAP--Populus_trichocarpa_TAF11_Contig1 -------------MKQSKDPFEAAYVEQEESPPESPVAQDDYDTQASNAA Saccharomyces_cerevisiae_TAF11 --MTEPQGPLDTIPKVNYPPILTIANYFSTKQMIDQVISEDQDYVTWKLQ Drosophila_melanogaster_TAF11_ NDAD------------------RDGKDADGDNDNKNTD----------GD Homo_sapiens_TAF11_NP_005634 GDAD------------------VDLKEAAAEEGELESQDVSDLTTVERED Arabidopsis_thaliana_TAF11_At4 IEID-------------QTQDEDERPVDVRR--PMKKAKTSVVVTEAKNK Arabidopsis_thaliana_TAF11b_AA FKIQRYVIFVKVLLRMKHSKDPFEAAMEEQEESPVETEQTLEGDERAVKK Medicago_truncatula_TAF11_TC80 NSTN--------------SSLPQSTLTHEEEHNHIKTPNSN----NTITK Hordeum_vulgare_TAF11_TC81880 IDEDYDGSAGAGGSR-PPPPRPRPSALAAPSTSAAPAAAKAK---VRPQK Triticum_aestivum_TAF11_TC9194 AAEDYDGSAGAGGSR-APPPRPRPSALAAPSTSVAPAAAKAK---VRPHK Oryza_sativa_TAF11_BAB90043 --EGYDG---ASGSR-GPPLR-LPPSRAAPSGSGGAAAAAARGKVVRVQK Oryza_sativa_TAF11b_TC124761 --EGYDG---ASGSR-GPPLR-LPPSRAAPSGSGGAAAAAARGKVVRVQK Populus_trichocarpa_TAF11_Contig1 AAAD-----------------DSQGAVVGQDDDDLGGGGRND------FA Saccharomyces_cerevisiae_TAF11 NLRTGG----------TSINNQLNKYPKYKYQKTRINQQDPDSINKVPEN Drosophila_melanogaster_TAF11_ GDSGEPAHKKLKT--------KKELEEEERE---RMQVLVSNFTEEQLDR Homo_sapiens_TAF11_NP_005634 SSLLNPAAKKLKIDTKEKKEKKQKVDEDEIQ---KMQILVSSFSEEQLNR Arabidopsis_thaliana_TAF11_At4 DKDEDDEEEEENMDVELTKYPTS-SDPAKMA---KMQTILSQFTEDQMSR Arabidopsis_thaliana_TAF11b_AA CKTSVVAEAKNKDEVEFTKNITG-ADPVTRAN--KMQKILSQFTEEQMSR Medicago_truncatula_TAF11_TC80 HKDEEDDEEEDNMDVELAKFPTA-GDPHKMA---KMQAILSQFTEEQMSR Hordeum_vulgare_TAF11_TC81880 EQ-DDDDDEEDPMEVDLDKLPSGTSDPDKLA---KMNALLSQFTEDQMNR Triticum_aestivum_TAF11_TC9194 EQ-DDDDDEEDPMEVDLDKLPSGTSDPDKLA---KMNALLSQFTEDQMNR Oryza_sativa_TAF11_BAB90043 
EQQEEEDDEEDHMEVDLDKLPSGTSDPDKLA---KMNAILSQFTEDQMNR Oryza_sativa_TAF11b_TC124761 EQQEEEDDEEDHMEVDLDKLPSGTSDPDKLA---KMNAILSQFTEDQMNR Populus_trichocarpa_TAF11_Contig1 HSSDHPSASRPMLGSARSKAKNKDDDEEEEED--NMDVELSKLASTADPD Saccharomyces_cerevisiae_TAF11 LIFPQDILQQQTQNSNYEDTNTNEDENEKLAQDEQFKLLVTNLDKDQTNR Drosophila_melanogaster_TAF11_ YEMYRRSAFPKAAVKRLMQTITGCS-VSQNVVIAMSGIAKVFVGEVVEEA Homo_sapiens_TAF11_NP_005634 YEMYRRSAFPKAAIKRLIQSITGTS-VSQNVVIAMSGISKVFVGEVVEEA Arabidopsis_thaliana_TAF11_At4 YESFRRSALQRPQMKKLLIGVTGSQKIGMPMIIVACGIAKMFVGELVETA Arabidopsis_thaliana_TAF11b_AA YESFRRSGFKKSDMEKLVQRITGGPKMDDTMNIVVRGIAKMFVGDLVETA Medicago_truncatula_TAF11_TC80 YESFRRAGFQKANMKRLLTSITGTQKISIPITIAVSGIAKVFVGEVVETA

PAGE 229

216
Hordeum_vulgare_TAF11_TC81880 YESFRRSGFQKSNMKKLLASITGSQKISMPTTIVVSGIAKMFVGEVIETA
Triticum_aestivum_TAF11_TC9194 YESFRRSGFQKSNMKKLLASITGSQKISMPTTIVVSGIAKMFVGEVIETA
Oryza_sativa_TAF11_BAB90043 YESFRRSGFQKSNMKKLLASITGSQKISLPTTIVVSGIAKMFV------A
Oryza_sativa_TAF11b_TC124761 YESFRRSGFQKSNMKKLLASITGSQKISLPTTIVVSGIAKMFVGELVETA
Populus_trichocarpa_TAF11_Contig1 KMAKMQFGNSRTEIFQELSSYVHSALHGRRASAPVHAYCKEYH------
Saccharomyces_cerevisiae_TAF11 FEVFHRTSLNKTQVKKLASTVANQT-ISENIRVFLQAVGKIYAGEIIELA

Drosophila_melanogaster_TAF11_ LDVMEAQGES---------------------------------------
Homo_sapiens_TAF11_NP_005634 LDVCEKWGEM---------------------------------------
Arabidopsis_thaliana_TAF11_At4 RVVMAERKES---------------------------------------
Arabidopsis_thaliana_TAF11b_AA RVVMRERKES---------------------------------------
Medicago_truncatula_TAF11_TC80 RTIMKERKET---------------------------------------
Hordeum_vulgare_TAF11_TC81880 RIIMSERKDS---------------------------------------
Triticum_aestivum_TAF11_TC9194 RIVMSERKDS---------------------------------------
Oryza_sativa_TAF11_BAB90043 RIVMTERKDS---------------------------------------
Oryza_sativa_TAF11b_TC124761 RIVMTERKDS---------------------------------------
Populus_trichocarpa_TAF11_Contig1 -----AQTIA---------------------------------------
Saccharomyces_cerevisiae_TAF11 MIVKNKWLTSQMCIEFDKRTKIGYKLKKYLKKLTFSIIENQQYKQDYQSD

Drosophila_melanogaster_TAF11_ -------------------------------------------------
Homo_sapiens_TAF11_NP_005634 -------------------------------------------------
Arabidopsis_thaliana_TAF11_At4 -------------------------------------------------
Arabidopsis_thaliana_TAF11b_AA -------------------------------------------------
Medicago_truncatula_TAF11_TC80 -------------------------------------------------
Hordeum_vulgare_TAF11_TC81880 -------------------------------------------------
Triticum_aestivum_TAF11_TC9194 -------------------------------------------------
Oryza_sativa_TAF11_BAB90043 -------------------------------------------------
Oryza_sativa_TAF11b_TC124761 -------------------------------------------------
Populus_trichocarpa_TAF11_Contig1 -------------------------------------------------
Saccharomyces_cerevisiae_TAF11 SVPEDEPDFYFDDEEVDKRETTLGNSLLQSKSLQQSDHNSQDLKLQLIEQ

Drosophila_melanogaster_TAF11_ --------------------GALQPKFIREAVRRLRTKDRMPIGRYQQPY
Homo_sapiens_TAF11_NP_005634 --------------------PPLQPKHMREAVRRLKSKGQIPNSKHKKII
Arabidopsis_thaliana_TAF11_At4 --------------------GPIRPCHIRESYRRLKLEGKVPKRSVPRLF
Arabidopsis_thaliana_TAF11b_AA --------------------GPIRPCHIRESYRRLKLQGKVPQRSVQRLF
Medicago_truncatula_TAF11_TC80 --------------------GPIRPCHLREAHRRLKLEGKIFKRTTSRLF
Hordeum_vulgare_TAF11_TC81880 --------------------GPIRPCHIREAYRRLKLEGKIPKRSVPRLF
Triticum_aestivum_TAF11_TC9194 --------------------GPIRPCHIREAYRRLKLEGKIPKRSVPRLF
Oryza_sativa_TAF11_BAB90043 --------------------GPQG----NQSKQYVQAE-------VLRYY
Oryza_sativa_TAF11b_TC124761 --------------------GPVRPCHIREAYRRLKLEGKIPRRTVPRLF
Populus_trichocarpa_TAF11_Contig1 --------------------SSIRPSGLKYFYNSLCKKG----------
Saccharomyces_cerevisiae_TAF11 YNKLVLQFNKLDVSIEKYNNSPLLPEHIREAWRLYRLQSDTLPNAYWRTQ

Drosophila_melanogaster_TAF11_ FRLN----
Homo_sapiens_TAF11_NP_005634 FF------
Arabidopsis_thaliana_TAF11_At4 R-------
Arabidopsis_thaliana_TAF11b_AA R-------
Medicago_truncatula_TAF11_TC80 R-------
Hordeum_vulgare_TAF11_TC81880 R-------
Triticum_aestivum_TAF11_TC9194 R-------
Oryza_sativa_TAF11_BAB90043 --------
Oryza_sativa_TAF11b_TC124761 R-------
Populus_trichocarpa_TAF11_Contig1 --------
Saccharomyces_cerevisiae_TAF11 GEGQGSMFR

TFIIE Alignment
CLUSTAL X (1.83) multiple sequence alignment

Drosophila_melanogaster_TFIIEa ------------------------MSSTSTAAANAAPAKTEVRYVTEVPS
Homo_sapiens_TFIIE-alpha_NP_00 --------------------------------------MADPDVLTEVPA
Arabidopsis_thaliana_TFIIEa2_A -MDKSITVVRKT---------------VVLEPFVKLVRLLVRIFYDNYTP
Arabidopsis_thaliana_TFIIEa3_A
-----------------------------------MVKLVAKTFYDNYTP Arabidopsis_thaliana_TFIIEa1_A -MEKSG-PVQKA---------------VVLQPFVKLVRLVARAFYDDYTT Populus_balsamifera_TFIIE-alpha 2 -MDMNTTIS--------------------VEPFKRLVKLAARAFYDDVST Populus_balsamifera_TFIIE-alpha 1 MAEFGSKLVNKFEESPRGTTAFIKINEAHTEVKKELVKLAARAFYDDITT Solanum_tuberosum_TFIIEa_TC670 ---MS------------------------IEPFNRLVKLAARAFYDDITT Hordeum_vulgare_TFIIEa_TC90346 --------------------------MGSLEPFNRLVRLTARAFYDDISI Oryza_sativa_TFIIEa1 --------------------------MGSMEPFNRLVRLAARAFYDDISM

PAGE 230

217 Oryza_sativa_TFIIEa2 -------------------------------------------------Oryza_sativa_TFIIEa3 --------------------------MDTMEQLNRLVRMVARGFYEDVSL Oryza_sativa_TFIIEa4 -----------------------------MSINERLVKCAAQLLYGNVGF Methanosarcina_acetivorans_TFE --------------------------------------------MNTLVD Sulfolobus_solfataricus_TFE_NP -----------------------------------------------MVN Saccharomyces_cerevisiae_TFIIE -------------------------------------------MDRPIDD Drosophila_melanogaster_TFIIEa SLKQLARLVVRGFYSLEDALIIDMLVRN-PCMKEDDIGELLRFEKKQLRA Homo_sapiens_TFIIE-alpha_NP_00 ALKRLAKYVIRGFYGIEHALALDILIRN-SCVKEEDMLELLKFDRKQLRS Arabidopsis_thaliana_TFIIEa2_A ESDNQQK-SVKN-VKGSAVIVLDALTRR-QWVREEDLAKEVKRNAKELRK Arabidopsis_thaliana_TFIIEa3_A KNNNQKK-SAKNGSGGIAVLVLDALTRR-QWVREEDLAKELKLNTKQLRT Arabidopsis_thaliana_TFIIEa1_A KSDNQQK-SARSDNRGIAAVVLDALARR-QWVREEDLAKDLQLHAKQLRK Populus_balsamifera_TFIIE-alpha 2 KGENQSKNNARGDNKGIAVVVLDALTRR-LWVNEEGLAKDLKIHIKQLRR Populus_balsamifera_TFIIE-alpha 1 KGDNQPK-TGRSDNRGIAVVVLDALTRR-QWVREEDLAKELKLHSKQLRR Solanum_tuberosum_TFIIEa_TC670 KGDNQPK-SGRSDNRGIAVVILDALTRR-QWVREEDLAKDLKLHTKQLRR Hordeum_vulgare_TFIIEa_TC90346 KGDTQAK-TSRGDNRGMAVVVLDGLTRR-QWVREEDLAKSLKLHSKQLRR Oryza_sativa_TFIIEa1 KGDNQPK-TSRGDNRGMAVVVLDALTRR-QWVREEDLAKALKLHSKQLRR Oryza_sativa_TFIIEa2 -------------------------------------------------Oryza_sativa_TFIIEa3 E-EDQSK-PNGSGSCGIVVVVLDALTRQ-QWVREEDLARSLMIPFNRLRQ Oryza_sativa_TFIIEa4 KAGEVRI--DCDENRGVVVMVLDALTRY-QWVPDTHLAKSLKVQKKKLCL Methanosarcina_acetivorans_TFE LNDKVIRGYLISLVGEEGLRMIEEMPEG--EVTDEEIAAKTGVLLNTVRR Sulfolobus_solfataricus_TFE_NP AEDLFIN-LAKSLLGDDVIDVLRILLDKGTEMTDEEIANQLNIKVNDVRK Saccharomyces_cerevisiae_TFIIE IVKNLLKFVVRGFYGGSFVLVLDAILFH-SVLAEDDLKQLLSINKTELGP Drosophila_melanogaster_TFIIEa RITTLRTDKFIQIRLKMETGPDGKAQKVN--------------------Homo_sapiens_TFIIE-alpha_NP_00 VLNNLKGDKFIKCRMRVETAADGKTTRHN--------------------Arabidopsis_thaliana_TFIIEa2_A LIRHFEEQKFVMRYHRKETAKRAKMYSYA-VGGTTDGRA-----EDNVKF 
Arabidopsis_thaliana_TFIIEa3_A ILRYFEEQQFIMRVHRKEKSS-----------ATTNGRG-----EDKVKV Arabidopsis_thaliana_TFIIEa1_A IIRLFEEEKLIMRDHRKETAKGAKMYSAA-VAATTDGRA-----EDKVKL Populus_balsamifera_TFIIE-alpha 2 ILRLFEEDKLLTRAHRKETAKVTKKPNAG-GADSQRKFG-SRE-DDKNKL Populus_balsamifera_TFIIE-alpha 1 TLRFFEEEKLVTRDHRKETAKAAKMHNAA-VANTTDGHR-TKEGDDKIKM Solanum_tuberosum_TFIIEa_TC670 TLRFFEEEKLITRDHRKEGAKGAKVYNSA-VAATVDGLQNGKEGDDKIKM Hordeum_vulgare_TFIIEa_TC90346 VLRFFEEEKLVTRDHRKESAKGAKIYSAA-AAAAGDGQ-PTKEGEEKVKL Oryza_sativa_TFIIEa1 ILRFFEEEKLVTRDHRKESAKGAKIYSAA-AAAAGDGQSITKEGEEKVKM Oryza_sativa_TFIIEa2 -------------------------------------------------Oryza_sativa_TFIIEa3 ITHFLEQQKLVRRYYRKEAIHDASISTASPSHVSHDAHLVPTNVAGKLKM Oryza_sativa_TFIIEa4 ILEFLEKQMFVRRCEVKAKTGRNVSNTATTAGVSAIPRN-----EKVKSK Methanosarcina_acetivorans_TFE TLFILYENKFAICRRERDSNSGWLTYLWH--------------------Sulfolobus_solfataricus_TFE_NP KLNLLEEQGFVSYRKTRDKDSGWFIYYWK--------------------Saccharomyces_cerevisiae_TFIIE LIARLRSDRLISIHKQREYPPNSKSVERV--------------------Drosophila_melanogaster_TFIIEa ----YYFINYKTFVNVVKYKLDLMRKRMETEERDATSRASFKCSSCSKTF Homo_sapiens_TFIIE-alpha_NP_00 ----YYFINYRTLVNVVKYKLDHMRRRIETDERDSTNRASFKCPVCSSTF Arabidopsis_thaliana_TFIIEa2_A HTHSYCCLDYAQIYDIVRYKLHRLKKKFKDELEDRNTVQEYGCPNCKRKY Arabidopsis_thaliana_TFIIEa3_A HMYSYCCLDYSQIYDVIRYKLHRMKKEFKDVLEDKDNVQEYGCPNCKRKI Arabidopsis_thaliana_TFIIEa1_A HTHSYCCLDYAQICDVVRFRLHRMKKRLKDELEDKNTVQEYGCPNCQRKY Populus_balsamifera_TFIIE-alpha 2 HTHSYCCLDYAQIYDVVRYRLHRMRKMIKDELENNNAVQQYICPICERRY Populus_balsamifera_TFIIE-alpha 1 HTHSYCCLDYAQIYDVVRYRLHRMRKKLKDELEDKNTVQEYTCPNCGRRY Solanum_tuberosum_TFIIEa_TC670 HTHSYCCLDYAQIYDVVRYRLHRMKKKLRDELDNKNTVQEYICPNCGKRY Hordeum_vulgare_TFIIEa_TC90346 HTHSYCCLDYAQICDVVRYRIHRMKKTLKDELDSRNTVQHYICPNCKKRY Oryza_sativa_TFIIEa1 HTHSYCCLDYAQICDVVRYRIHRMKKKLKDELDSRNTIQHYICPNCKKRY Oryza_sativa_TFIIEa2 -----------MVYDVVRYRIHRMRKKLKDGLDDRDTVQHYVCPNCKRRY Oryza_sativa_TFIIEa3 IMQPYCCLHYGQVYDVTLYRIHEMKKKLKDELDGNYMIQNYVCPNCERRY 
Oryza_sativa_TFIIEa4 HPKWYCCINYAKICSVVRYHIMQMEANLKSQLENTNTVDKYTCPNCGKSF Methanosarcina_acetivorans_TFE -------LDFSDVEHQLMREKKKLLRNLKTRLEFEENNVFYVCPQGCVRL Sulfolobus_solfataricus_TFE_NP -------PNIDQINEILLNRKRLILDKLKTRLEYEKNNTFFICPQDNSRY Saccharomyces_cerevisiae_TFIIE ----YYYVKYPHAIDAIKWKVHQVVQRLKDDLDKNSEPNGYMCPICLTKY Drosophila_melanogaster_TFIIEa TDLEADQLFDMATLEFRCTFCGSSVEEDSAAMPKKDSRLMLAHFN-EQLQ Homo_sapiens_TFIIE-alpha_NP_00 TDLEANQLFDPMTGTFRCTFCHTEVEEDESAMPKKDARTLLARFN-EQIE Arabidopsis_thaliana_TFIIEa2_A NALDALRLISMEDDSFHCENCNGELVMECNKLISEEVVDRGDNARRRQRE Arabidopsis_thaliana_TFIIEa3_A --------------FFHCENCNGELVMECNKLTSEEVVVDGSDNPRSRRD Arabidopsis_thaliana_TFIIEa1_A NALDALRLISMVDDSFHCENCNGELVVECNKLTSEEVVDGDDNARRRRRE Populus_balsamifera_TFIIE-alpha 2 NALDALRLISLVDEDFHCENCDGVLVAESDKLAAQEGGDGDDNARKRRRE Populus_balsamifera_TFIIE-alpha 1 NALDALRLMSLVDEYFHCENCDGELVAESDKLAAQEGGDGDDNARRRRRE Solanum_tuberosum_TFIIEa_TC670 TALDALRLISPVDEYFHCESCNEELVAESDKLASQGTTDGDDNDRRRRRE Hordeum_vulgare_TFIIEa_TC90346 SAFDALQLISYTDEYFHCENCNGELLAESDKLSSEEMGDGDDNARKRRRE Oryza_sativa_TFIIEa1 SAFDALQLISYTDEYFHCENCNGELVAESDKLASEEMGDGDDNARKRRRE Oryza_sativa_TFIIEa2 SAFDALQLVSDMDDYFHCEHCKGELRPESEKLTLDEIVCGGGNAIKHTHD Oryza_sativa_TFIIEa3 SSLNALDLVSHIDNNFHCKHCNEELSQDFGDLAWGGRGGDGDNARRDRHA Oryza_sativa_TFIIEa4 SAFDVKDLVSCTDGNFYCESCKHELVACS--------EYGNYNEREGRSA

PAGE 231

218 Methanosarcina_acetivorans_TFE L------FDEATETEFLCPMCGEDLVYYD-------------------NS Sulfolobus_solfataricus_TFE_NP S------FEEAFENEFKCLKCGSQLTYYD-------------------TD Saccharomyces_cerevisiae_TFIIE TQLEAVQLLNFDRTEFLCSLCDEPLVEDDSGKKNKE-KQDKLNRLMDQIQ Drosophila_melanogaster_TFIIEa PLYDLLREVEG-----------------------------IKLAPEVLEHomo_sapiens_TFIIE-alpha_NP_00 PIYALLRETED-----------------------------VNLAYEILEArabidopsis_thaliana_TFIIEa2_A KVKVWLQDLEG----------------------------ELKPLMELINArabidopsis_thaliana_TFIIEa3_A HLKDLLQNMEV----------------------------RLKPLMDHINArabidopsis_thaliana_TFIIEa1_A NLKNMLQKLEV----------------------------QMKPLMDQLNPopulus_balsamifera_TFIIE-alpha 2 KLKDMLQNMECYFMVPNFDFESINCKNWP--ARFWLAKVQLKPLMDQLSPopulus_balsamifera_TFIIE-alpha 1 KLKDMLQKMEDASNLFLFKCYLLLMKACYRVIEEVLGRRFIFSMTGQIEM Solanum_tuberosum_TFIIEa_TC670 KLEDMLHRVEA----------------------------QLKPLMDQLAHordeum_vulgare_TFIIEa_TC90346 KLNDMQQRIDE----------------------------QLKPLQAQLKOryza_sativa_TFIIEa1 KLKDMQQRIDE----------------------------QLKPLQAQLNOryza_sativa_TFIIEa2 KLKDMQQRMEE----------------------------QLKPLIAVLDOryza_sativa_TFIIEa3 KLKDFLQRMEH----------------------------QMERLISQLNOryza_sativa_TFIIEa4 NLLDFLENMKE----------------------------KLRPLKTKLDMethanosarcina_acetivorans_TFE RFVSALKKRVD-------------------------------ALSSV--Sulfolobus_solfataricus_TFE_NP KIKSFLEQKIR-------------------------------QIEEEIDSaccharomyces_cerevisiae_TFIIE PIIDSLKKIDDSRIEEN----------------------TFEIALARLIP Drosophila_melanogaster_TFIIEa --PEPVDIDTIRGLNKPNATRPDGMAWSG----EATRNQGFAVEETRVDHomo_sapiens_TFIIE-alpha_NP_00 --PEPTEIPALKQSKDHAATTAGAASLAGGHHREAWATKGPSYEDLYTQN Arabidopsis_thaliana_TFIIEa2_A -RVKDLPFPAFEPFPAWEARAAKAAR-ENGDFNPDDPSRSLG--GYGSTP Arabidopsis_thaliana_TFIIEa3_A -RIKDLPVPSFESFPAWETRVAKAAR-ENGDLNPDDTLRPQG--GYGSTP Arabidopsis_thaliana_TFIIEa1_A -RVKDLPIPEFGSFLAWEARAAMAAR-ENGDLNPNDPLRSQG--GYGSTP Populus_balsamifera_TFIIE-alpha 2 -RVKDLPIPEIGSLQAWQLHENAAGRATNGDPNSDDHFKYSQGPGYGGTP 
Populus_balsamifera_TFIIE-alpha 1 ARVKDLPVPEFGSLQEWQIHASAAGRAANGDSSYNDPSRSSQ--GYGGTP Solanum_tuberosum_TFIIEa_TC670 -RVKDLPAPEFGSLQAWEVRANAVARGANGDNAND--SKSGQGLGFGGTP Hordeum_vulgare_TFIIEa_TC90346 -RVKDLPAPEFGSLQSWER--LNLGAFAHGDSAAAEAARNAQ-GQYNGTP Oryza_sativa_TFIIEa1 -RVKDLPAPEFGSLQSWER--ANIGAFGTADPSAADSSRNPQ-GQY-GTP Oryza_sativa_TFIIEa2 -RVKDLPFPSFMSLQDWER--ATIGASANG---AVGSSQNSE-GRYSSKP Oryza_sativa_TFIIEa3 -KVKDLDFPEFLALETWER---NMREPAGGD--------------DVSRP Oryza_sativa_TFIIEa4 -LLEDLPAPDFGSTPDFKG--------------------TYNISDWSRTS Methanosarcina_acetivorans_TFE -------------------------------------------------Sulfolobus_solfataricus_TFE_NP ------KETKLGANKNH--------------------------------Saccharomyces_cerevisiae_TFIIE PQNQSHAAYTYNPKKGSTMFRPGDSAPLPNLMGTALGNDSSRRAGANSQA Drosophila_melanogaster_TFIIEa VTIGGDDTSD--------------------------AVIERKS---RPIW Homo_sapiens_TFIIE-alpha_NP_00 VVINMDDQEDL-----------------------HRASLEGKSAKERPIW Arabidopsis_thaliana_TFIIEa2_A MPFLGETK--------------------------VEVNLG--EGNED-VT Arabidopsis_thaliana_TFIIEa3_A MPFLGETE--------------------------IEVNLG--EENED-VK Arabidopsis_thaliana_TFIIEa1_A MPFLGETK--------------------------VEVNLG--DGNED-VK Populus_balsamifera_TFIIE-alpha 2 MPFLGETK--------------------------VEVAFAGDESKEN-IK Populus_balsamifera_TFIIE-alpha 1 MPFLGETKHRVEFNASKRCQLRHDQEKDSSTKGRVEVSFSGVEGKED-LK Solanum_tuberosum_TFIIEa_TC670 MPFVGETK--------------------------VEVAFSGLEEKGD-IK Hordeum_vulgare_TFIIEa_TC90346 MPYLGDTK--------------------------VDVELAGSGVKEEGAE Oryza_sativa_TFIIEa1 MPYLGETK--------------------------VEVALSGTGVKDEGAE Oryza_sativa_TFIIEa2 MPFLGETE--------------------------VEVNFLGSTGAQEGVE Oryza_sativa_TFIIEa3 MLFLGEVMS--------------------------HEHQKGSASCIDADE Oryza_sativa_TFIIEa4 VPLPEPTNG--------------------------DDSFSSPCAKDD--E Methanosarcina_acetivorans_TFE -------------------------------------------------Sulfolobus_solfataricus_TFE_NP 
-------------------------------------------------Saccharomyces_cerevisiae_TFIIE TLHINITTAS---------------------DEVAQRELQERQAEEKRKQ Drosophila_melanogaster_TFIIEa MTESTVITDTDAADG-----AADAVQTASGSGHRNRKENE---------Homo_sapiens_TFIIE-alpha_NP_00 LRESTVQGAYGSEDMKEGGIDMDAFQERE-EGHAGPDDNE---------Arabidopsis_thaliana_TFIIEa2_A S--TGGDSSLKMLPPWMIKQGMKLTEEQRGEMRQEANVDG---EAAKLSD Arabidopsis_thaliana_TFIIEa3_A SDEVGDSSRRKLTPSWLIKKGMNLSDEQRGEIRHEAKAD----------Arabidopsis_thaliana_TFIIEa1_A S--KGGDSSLKVLPPWMIKEGMNLTEEQRGEMRQEAKVDGGAGAAAKLSD Populus_balsamifera_TFIIE-alpha 2 S--ETASTSLKVLPPWMIKQGMNLTKEQRGEVKQESKMDSSSTAVEFSDE Populus_balsamifera_TFIIE-alpha 1 S--ETASTGLKVLPPWMIKQGMNLTKEQRGEVKQGSKMDDSSAAAEPPDD Solanum_tuberosum_TFIIEa_TC670 S--EVSVTPMKVLPPWMIKEGMNLTKEQRGEVKQESNMEGTSTAAGLSDD Hordeum_vulgare_TFIIEa_TC90346 S--GRDGTVLKVLPPWMVREGMNLTKEQRGESSNTSKGDE---KSDVKDE Oryza_sativa_TFIIEa1 S--GTNGNGLKVLPPWMIKQGMNLTKEQRGETSNSSNLDE---KSEVKDE Oryza_sativa_TFIIEa2 S--GMES--IKPQHSWMNRKRTVLAGEHKEENNNTANLDQ---SSEAKSD Oryza_sativa_TFIIEa3 EIFEFRVQDARPIPSFVIRKDINHTEDKEEQL-----------------Oryza_sativa_TFIIEa4 S--DAGVSELKILPSWLIRKGMKLKQAHLSNSSTVCGEGG---------Methanosarcina_acetivorans_TFE -------------------------------------------------Sulfolobus_solfataricus_TFE_NP -------------------------------------------------Saccharomyces_cerevisiae_TFIIE NAVPEWHKQSTIGKTALGRLDNEEEFDPVVTASAMDSINPDNEPAQETSY


219 Drosophila_melanogaster_TFIIEa ----DIMSVLLQHEKQPGQKEPHMKGMRVGSSNANSSDSSDDEKDIENSK Homo_sapiens_TFIIE-alpha_NP_00 ----EVMRALLIHEKKTSSAMAGSVGAAAPVTAANGDDSESETSESDDDS Arabidopsis_thaliana_TFIIEa2_A DKKSVMENGDDNKDLKDEYLKAYYAAIMEQQKLA-AKLNEQESAGESTTT Arabidopsis_thaliana_TFIIEa3_A DGGSSMENGDDDRNLKDEYLKAYYAAILEEQELA-EKLNQQESAGK-VTT Arabidopsis_thaliana_TFIIEa1_A DKKSAIGNGDE-KDLKDEYLKAYYAELMKQQELA-ARRNQQESAGE-PTS Populus_balsamifera_TFIIE-alpha 2 KK-SAKVNGDS---IKEEYVKAYYAALLEQQRQA-EESAKQQQELSQTSM Populus_balsamifera_TFIIE-alpha 1 KK-ISIENDDK---IKDEYVKAYYAALLQKQREA-EESAEKQQELLQTSI Solanum_tuberosum_TFIIEa_TC670 KKSIGFEDVKN---IQDEYIKAYYEALFKRQKEQ-EEATK---MLPETST Hordeum_vulgare_TFIIEa_TC90346 KK--QDSKEDE-KSIQDEYLKAYYEAFKKKQEEEDAKR-MQQEG-----Oryza_sativa_TFIIEa1 KK--QDSKEDE-KSIQDEYIKAYYEALRKRQDEEEAKRKIQQEG-----Oryza_sativa_TFIIEa2 KK--QLSEEDEMKSIQEAYAKAYYEAIQKRQEDE-GKRAIQEESLACISD Oryza_sativa_TFIIEa3 -------------------------------------------------Oryza_sativa_TFIIEa4 ------------TNIQEEYMKAYYEAIQKRQEDR--------IRHSGQSS Methanosarcina_acetivorans_TFE -------------------------------------------------Sulfolobus_solfataricus_TFE_NP -------------------------------------------------Saccharomyces_cerevisiae_TFIIE QNNRTLTEQEMEERENEKTLNDYYAALAKKQAKLNKEEEEEEEEEEDEEE Drosophila_melanogaster_TFIIEa IP-DVDFDNYINSDSAEEDD------DVPTVLVAGRPHPLDQLDDN--LI Homo_sapiens_TFIIE-alpha_NP_00 PPRPAAVAVHKREEDEEEDDEFEEVADDPIVMVAGRPFSYSEVSQRPELV Arabidopsis_thaliana_TFIIEa2_A DIESATTYSDRQVGMKSKREE----EEEDVEWEEGASVAANGNY---KVD Arabidopsis_thaliana_TFIIEa3_A DIELATSSSDRQVGMKSKREE----EEE-----EEASVAANGNY---KVD Arabidopsis_thaliana_TFIIEa1_A GIQSGTVYSGRQVSMKAKREEDEDEDEEEVEWEEKAPVTANGNY---KVD Populus_balsamifera_TFIIE-alpha 2 SNGLSESSSNRQVGMKSKREEG---EGDDDVEWEEAPIEGKSN------N Populus_balsamifera_TFIIE-alpha 1 SNGFSKSSSDRQVGMKSKREEDD--EPDDDVEWEEAPIGGMSYL-----S Solanum_tuberosum_TFIIEa_TC670 TDGVYNTSTERQVGMKSKREEED--EGED-VEWEEAPPAGNTTTGNLKVD Hordeum_vulgare_TFIIEa_TC90346 
QAFSSEIHSERQLGMKAKREDE-NVEDDGVEWEEEQPAGNASEEPYKFVD Oryza_sativa_TFIIEa1 DTFASASHSERQVGMKSKRED----DDEGVEWEEEQPAGNTAET-YKLAD Oryza_sativa_TFIIEa2 QPFASDAQFERRLGAKSKRDDGGESGDDGIELKVRQSTGN-IEEVYKFAD Oryza_sativa_TFIIEa3 -------------------------------------------------Oryza_sativa_TFIIEa4 VPGGPSVSSERPMGVKRQKLCNDINNNALECQGEEPPGDTFRT------Methanosarcina_acetivorans_TFE -------------------------------------------------Sulfolobus_solfataricus_TFE_NP -------------------------------------------------Saccharomyces_cerevisiae_TFIIE EEEEEMEDVMDDNDETARENALEDEFEDVTDTAGTAKTESNTSNDVKQES Drosophila_melanogaster_TFIIEa AQMTPQEKENYIHVYQQHYSHIFE--------------------Homo_sapiens_TFIIE-alpha_NP_00 AQMTPEEKEAYIAMGQRMFEDLFE--------------------Arabidopsis_thaliana_TFIIEa2_A LNVEAEEAEEKEDGDEDDDIDWEEG-------------------Arabidopsis_thaliana_TFIIEa3_A LNVEAEEAEQDE-----NDVDWQEC-------------------Arabidopsis_thaliana_TFIIEa1_A LNVEAEASGGEEE-EEEDDVDWEEG-------------------Populus_balsamifera_TFIIE-alpha 2 WNLIALLSY-----------------------------------Populus_balsamifera_TFIIE-alpha 1 MEWDPLQSY-----------------------------------Solanum_tuberosum_TFIIEa_TC670 LNVQADASEDDND--EEDDIDWEEG-------------------Hordeum_vulgare_TFIIEa_TC90346 LNAEAPESGDEED-----EIDWEEG-------------------Oryza_sativa_TFIIEa1 LNVEAQESGDEED-----EIDWEEG-------------------Oryza_sativa_TFIIEa2 LNVETQELVEKN------CIPPAE--------------------Oryza_sativa_TFIIEa3 --------------------------------------------Oryza_sativa_TFIIEa4 --------------------------------------------Methanosarcina_acetivorans_TFE --------------------------------------------Sulfolobus_solfataricus_TFE_NP --------------------------------------------Saccharomyces_cerevisiae_TFIIE INDKTEDAVNATATASGPSANAKPNDGDDDDDDDDDEMDIEFEDV TFIIE Alignment CLUSTAL X (1.83) multiple sequence alignment Arabidopsis_thaliana_TFIIEb1_A -----MALREQLDKFNKQQEKCQSTL-----------------------S Arabidopsis_thaliana_TFIIEb2_A -----MALKEQLDKFNKQQVKCQSTL-----------------------S 
Lycopersicon_esculentum_TFIIEb ----MASLQESLQRFKKQQEKCQAIT-----------------------S Solanum_tuberosum_TFIIEb_TC605 ----MASLQESLQRFKKQQEKCQAIS-----------------------S Populus_trichocarpa_TFIIEb_Contig1 -----MALQEQLDRFKKQQEKCQSTL-----------------------T Glycine_max_TFIIEb_TC192062 -----MTLQEKLDKFKKQQEKCQTTL-----------------------S Medicago_truncatula_TFIIEb_TC7 -----MALQGKLDRFKKQQEKCQSTL-----------------------S Helianthus_annuus_TFIIEb_TC949 ----MGSLRESLNRFKQQQEKCQSTL-----------------------T Hordeum_vulgare_TFIIEb_TC10289 -----MDLKDSLSRFKQQQERCQSSL-----------------------A Triticum_aestivum_TFIIEb_TC110 -----MDLKDSLSRFKQQQERCQSSL-----------------------A Oryza_sativa_TFIIE-beta_AAM011 -----MDLKDSLSKFKQQQERCQSSL-----------------------A Oryza_sativa_TFIIEb_TC151474 -----MDLKDSLSKFKQQQERCQSSL-----------------------A Sorghum_bicolor_TFIIEb_TC59949 -----MDLKDSLSRFKQQQERCQSSL-----------------------A


220 Zea_mays_TFIIEb_TC209727 -----MDLKDSLSKFKQQQERCQSSL-----------------------A Hordeum_vulgare_TFIIEb_TC89335 -----MALNERLSKFKQQQERCQTTL-----------------------S Triticum_aestivum_TFIIEb_TC129 -----MALNERLSKFKQQQERCQTTL-----------------------S Sorghum_bicolor_TFIIEb_TC67168 -----MALNDRLNKFKQQQERCQNTL-----------------------S Drosophila_melanogaster_TFIIE---MDPALLREREAFKKRAMATPTVE-----------------------K Homo_sapiens_TFIIE-beta_NP_002 ---MDPSLLRERELFKKRALSTPVVE-----------------------K Saccharomyces_cerevisiae_TFIIE MSKNRDPLLANLNAFKSKVKSAPVIAPAKVGQKKTNDTVITIDGNTRKRT Arabidopsis_thaliana_TFIIEb1_A SISSSR--TALSRS---YVP--AATTSQKPNVFRGKFSENTKQLQHITNI Arabidopsis_thaliana_TFIIEb2_A SIASSRERTSSSRQ---NVPLPAAITQKKPDAAPVKFSSDTERLQNINNI Lycopersicon_esculentum_TFIIEb MAARAGPSKG---------APPRPANAKPP-APAVKFSNDTERLQHINTI Solanum_tuberosum_TFIIEb_TC605 MAARAGPSKG---------APPRPANAKPP-APAVKFSNDTERLQHINSI Populus_trichocarpa_TFIIEb_Contig1 SIAKSRPSKSSLTQKTVAVAPAPSTSARTP-APAVKFSNDTERLQHINSI Glycine_max_TFIIEb_TC192062 SIAASKAAATQKS------AAHGSANGRNA-APAVKFSNDTERLQHINSI Medicago_truncatula_TFIIEb_TC7 SIAANKAVS---------------ASVPNA-LAPVKFSTDTERLQHINSI Helianthus_annuus_TFIIEb_TC949 SIAAGSKTSNRTTTPAPRVAPAASTLAKNP-VPAVKFSNDTERLQHINNV Hordeum_vulgare_TFIIEb_TC10289 SIAASQASTTKPKHR--AQPINAQSAPARP-AQPIKFSNDTERLQHINSI Triticum_aestivum_TFIIEb_TC110 SIAASQASTTKPKHR--AQPINAPSAPARP-AQPIKFSNDTERLQHINSI Oryza_sativa_TFIIE-beta_AAM011 SIAAS---TSKPKHR--AQPVNAPSAPARP-LQPIKFSNDTERLQHINSV Oryza_sativa_TFIIEb_TC151474 SIAAS---TSKPKHR--AQPVNAPSAPARP-LQPIKFSNDTERLQHINSV Sorghum_bicolor_TFIIEb_TC59949 SIAAS---SSKPKHR--AQPAHAPNVPARP-SQPVKFSNDTERLQHINSI Zea_mays_TFIIEb_TC209727 SIAAS---TSKPKHR--AQPAHAPNVPARP-SQPIKFSNDTERLQHINSI Hordeum_vulgare_TFIIEb_TC89335 SIAATQASTTKSHNAPRSRPANAPSAPAKQ-IQAIKFSNDTERLQHINSV Triticum_aestivum_TFIIEb_TC129 SIAATQASTTKSHNAPRSRPANAPSAPAKQ-IQAIKFSNDTERLQHINSV Sorghum_bicolor_TFIIEb_TC67168 SIFASQTSISTSKHVPGIQPVNAPLAPIKP-LHPIKFSNDTERLQHINSV 
Drosophila_melanogaster_TFIIEKSKP-DRPAPPPPSDDSRRKMRPPNAPRLD-ATT------YKTMSGSSQY Homo_sapiens_TFIIE-beta_NP_002 RSASSESSSSSSKKKKTKVEHGGSSGSKQN-SDHSNGSFNLKALSGSSGY Saccharomyces_cerevisiae_TFIIE ASERAQENTLNSAKNPVLVDIKKEAGSNSSNAISLDDDDDDEDFGSSPSK Arabidopsis_thaliana_TFIIEb1_A RNSAVGAQMKIVIDLLFK-------TRLAYTAEQIN-------EACYVDM Arabidopsis_thaliana_TFIIEb2_A RKAPVGAQIKRVIDLLYE-------RRLALTPEQIN-------EWCHVDM Lycopersicon_esculentum_TFIIEb RKGPVGSQMKRVIDLLLE-------TRQAFTPEQIN-------EACYVDL Solanum_tuberosum_TFIIEb_TC605 RKGPVGAQIKRVIDLLLE-------TRQAFTPEQIN-------EACYVDI Populus_trichocarpa_TFIIEb_Contig1 RKAPAGAQIKRVIDLLLE-------TRQAFTPEQIN-------DHCYVDM Glycine_max_TFIIEb_TC192062 RKAPVGAQMKRVIDLLLE-------TRQAFTPEQIN-------GACYVDM Medicago_truncatula_TFIIEb_TC7 RKAPVGAQMKRVIDLLFE-------TRQALTLEQIN-------ETCHVDM Helianthus_annuus_TFIIEb_TC949 RKSPVGAQIKKVIDLLFE-------SRQAFTAEQIN-------EACYVDV Hordeum_vulgare_TFIIEb_TC10289 RKSPIGAQIKLVIELLYK-------TRQAFTAEQIN-------DETYVDI Triticum_aestivum_TFIIEb_TC110 RKSPVGAQIKLVIELLYK-------TRQAFTAEQIN-------DATYVDI Oryza_sativa_TFIIE-beta_AAM011 RKSPIGAQIKLVIELLYK-------TRQAFTAEQIN-------ETTYVDI Oryza_sativa_TFIIEb_TC151474 RKSPIGAQIKLVIELLYK-------TRQAFTAEQIN-------ETTYVDI Sorghum_bicolor_TFIIEb_TC59949 RKSPVGAQIKLVIELLYK-------TRQAFTAEQIN-------DATYVDI Zea_mays_TFIIEb_TC209727 RKSPVGAQIKLVIELLYK-------TRQAFTAEQIN-------EATYVDI Hordeum_vulgare_TFIIEb_TC89335 RKSPVGAQIKLVIELLYK-------TRLAYTAEQIN-------EATYVAI Triticum_aestivum_TFIIEb_TC129 RKSPVGAQIKLVIELLYK-------TRLAYTAEQIN-------EATYVAI Sorghum_bicolor_TFIIEb_TC67168 RKSAVGVQIKLVVELLYK-------TRQSFTAKQVN-------EATYVDI Drosophila_melanogaster_TFIIERFGVLAKIVKFMRTRHQDG-----DDHPLTIDEILD-------ETNQLDI Homo_sapiens_TFIIE-beta_NP_002 KFGVLAKIVNYMKTRHQRG-----DTHPLTLDEILD-------ETQHLDI Saccharomyces_cerevisiae_TFIIE KVRPGSIAAAALQANQTDISKSHDSSKLLWATEYIQKKGKPVLVNELLDY Arabidopsis_thaliana_TFIIEb1_A HNNKA---VFDSLRKNPKVHYDGR--RFSYKATHNIKDKKQLLSFVN-KS Arabidopsis_thaliana_TFIIEb2_A 
HANKA---VFDSLRKNPKAHYDGR--RFSYKATHDVNDKNQLLSLVR-KY Lycopersicon_esculentum_TFIIEb IGNKP---VFDSLRKNVKVYYDGN--RFSYKSKHALKNKEQLLILIR-KF Solanum_tuberosum_TFIIEb_TC605 NGNKA---VFDSLRNNLKVYYDGN--RFSYKSKHALKNKEQLLILIR-KF Populus_trichocarpa_TFIIEb_Contig1 NSNKA---VFDSLRNNPKVHYDGK--RFSYKSKHDLKDKSQLLVLIR-KF Glycine_max_TFIIEb_TC192062 KANKD---VFENLRKNPKVNYDGQ--RFSYKSKYGLKDKTELLQLIR-KY Medicago_truncatula_TFIIEb_TC7 KANKD---VFDNMRKNPKVRYDGE--RFSYKSKHALRDKKELLFLIR-KF Helianthus_annuus_TFIIEb_TC949 KGNKA---VFESLAKNPKVNYDGK--RFSYKSKHNVRDQKELLRLIR-TF Hordeum_vulgare_TFIIEb_TC10289 NGNKA---VFESLRNNLKVHYDGR--RFSYKSKHDLEGKDQLLELIR-CH Triticum_aestivum_TFIIEb_TC110 NANKA---VFDSLRNNLKVQYDGR--RFSYKSKHDLEGKDQLLDLIR-CH Oryza_sativa_TFIIE-beta_AAM011 HGNKS---VFDSLRNNPKVHYDGR--RFSYKSKHDLKGKDQLLVLVR-KY Oryza_sativa_TFIIEb_TC151474 HGNKS---VFDSLRNNPKVHYDGR--RFSYKSKHDLKGKDQLLVLVR-KY Sorghum_bicolor_TFIIEb_TC59949 HGNKA---VFDSLRNNPKVSYDGR--RFSYKSKHDLKGKDQLLVLIR-KF Zea_mays_TFIIEb_TC209727 HGNKA---VFDSLRNNPKVSYDGR--RFSYKSKHDLKGKDQLLVLIR-KF Hordeum_vulgare_TFIIEb_TC89335 NSNKA---VFDSLTNNPKVQFDGK--RFSYKSKHDLKGKDQLLHLIR-RF Triticum_aestivum_TFIIEb_TC129 NSNKA---VFDSLTNNPKVQFDGK--RFSYKSKHDLKGKDQLLHLIR-RF Sorghum_bicolor_TFIIEb_TC67168 HGNKA---VSDSLRNNPKVLFDGT--RFSYKPKHILTGRDELLGLIK-EK Drosophila_melanogaster_TFIIEGQSVKNWLASEALHNNPKVEASPCGTKFSFKPVYKIKDGKTLMRLLK-QH Homo_sapiens_TFIIE-beta_NP_002 GLKQKQWLMTEALVNNPKIEVIDG--KYAFKPKYNVRDKKALLRLLD-QH Saccharomyces_cerevisiae_TFIIE LSMKKDDKVIELLKKLDRIEFDPKKGTFKYLSTYDVHSPSELLKLLRSQV


221 Arabidopsis_thaliana_TFIIEb1_A D-KVIDVSDLKDAYPNVMEDLKSLKSSGEIFWLLSNTDSKEGTVYRNNME Arabidopsis_thaliana_TFIIEb2_A L-DGIAVVDLKDAYPNVMEDLKALSASGDIY-LLSN--SQEDIAYPNDFK Lycopersicon_esculentum_TFIIEb P-EGIAVIDLKDAYPTVMEDLQALKGAGQIW-LLSNFDSQEDIAFPNDPR Solanum_tuberosum_TFIIEb_TC605 P-EGIAVIDLKDAYPTVMEDLQALKGAGQIW-LLSKFDSQEDIAFPNDPR Populus_trichocarpa_TFIIEb_Contig1 P-EGIAVIDLKDSYPSVMDDLQALKAVGQIW-LLSNFDSQEDIAYPNDPR Glycine_max_TFIIEb_TC192062 P-EGLAVIDLKDAYPTVMEDLQAMKAAGQIW-LLSNFDSQEDIAYPNDPK Medicago_truncatula_TFIIEb_TC7 P-EGIAVIDLKDSYPTVMEDLQALKGGREIW-LLSNFDSQEDIAYPNDPK Helianthus_annuus_TFIIEb_TC949 A-EGIAVADLKDAYPTVMEDLQALKAGRQIW-LLSNFDSQEEIAYPNDPR Hordeum_vulgare_TFIIEb_TC10289 Q-EGLAVVEVKDAYPSVLEDLQALKAAGEVW-LLSNMDSQEDIVYPNDPK Triticum_aestivum_TFIIEb_TC110 Q-EGLAVVEVKDAYPSVLEDLQALKAAGEVW-LLSNMDSQEDIVYPNDPK Oryza_sativa_TFIIE-beta_AAM011 P-EGLAVVEVKDAYPTVMEDLQALKAAGEVW-LLSNMDSQEDIVYPNDPK Oryza_sativa_TFIIEb_TC151474 P-EGLAVVEVKDAYPTVMEDLQALKAAGEVW-LLSNMDSQEDIVYPNDPK Sorghum_bicolor_TFIIEb_TC59949 P-EGLAVVEVKDAYPNVLEDLQALKAAGEVW-LLSNMDSQEDIVYPNDPK Zea_mays_TFIIEb_TC209727 P-EGLAVVEVKDAYSNVLEDLQALKAAGEVW-LLSNMDSQEDIVYPNDPK Hordeum_vulgare_TFIIEb_TC89335 P-EGLPVVEVKDSYPTVLDDLQALKASGDVW-WLSSMDSQEDIVYPNDPK Triticum_aestivum_TFIIEb_TC129 P-EGLPVVEVKDSYPTVLDDLQALKASGDVW-WLSSMDSQEDIVYPNDPK Sorghum_bicolor_TFIIEb_TC67168 E-CGLPVEDIKDAYPSVLEDLQALKASGDVW-WLSSTQSQEDMAYFNDPR Drosophila_melanogaster_TFIIEDLKGLGGILLDDVQESLPHCEKVLKNRSAEILFVVRPIDKKKILFYNDRT Homo_sapiens_TFIIE-beta_NP_002 DQRGLGGILLEDIEEALPNSQKAVKALGDQILFVNRP-DKKKILFFNDKS Saccharomyces_cerevisiae_TFIIE TFKGISCKDLKDGWPQCDETINQLEEDSKILVLRTKKDKTPRYVWYNSGG Arabidopsis_thaliana_TFIIEb1_A YP-KIDDELKALFRDI-IPSDMLE-VEKELLKIGLKPATNIAERRAAEQL Arabidopsis_thaliana_TFIIEb2_A CEIKVDDEFKALFRDINIPNDMLD-VEKELLKIGLKPATNTAERRAAAQT Lycopersicon_esculentum_TFIIEb VPIKVDDDLKQLFRGIELPRDMLD-IERDLQKNGMKPATNTAKRRAMAQV Solanum_tuberosum_TFIIEb_TC605 VPIKVDDDLKQLFRSIELPRDMLD-IERDLQKNGMKPATNTAKRRAMAQV Populus_trichocarpa_TFIIEb_Contig1 
MVIKVDDDLKQLFRGIELPRDMLD-IEKDLQKNGMKPATNTAKRRAAAQV Glycine_max_TFIIEb_TC192062 VHIKVDDDLKHLFRSIELPRDMID-IEKDLQKNGMKPATNTAQRRSAAQI Medicago_truncatula_TFIIEb_TC7 VPIKVDDDLKQLFRGIELPRDMID-IERDLQKNGMKPATNTAKRRSAAQM Helianthus_annuus_TFIIEb_TC949 VPIKVDDELKQLFRSIELPRDMLD-IERDLQKNGMKPATNTAKRR----V Hordeum_vulgare_TFIIEb_TC10289 VKIKVDDDLKELFRGIELPRDMVD-IEKDLQKNGMKPMTDTTKRRAAAQI Triticum_aestivum_TFIIEb_TC110 VKIKVDDDLKELFRGIELPRDMVD-IEKELQKNGMKPMTDTTKRRAAAQI Oryza_sativa_TFIIE-beta_AAM011 AKIKVDDDLKQLFREMELPRDMVD-IEKELQKNGIKPMTNTAKRRAAAQI Oryza_sativa_TFIIEb_TC151474 AKIKVDDDLKQLFREMELPRDMVD-IEKELQKNGIKPMTNTAKRRAAAQI Sorghum_bicolor_TFIIEb_TC59949 AKIKVDDDLKQLFREIELPRDMVD-IEKELQRNGFKPMTNTAKRRAAAQI Zea_mays_TFIIEb_TC209727 AKIKVDDDLKQLFREIELPRDMVD-IEKELQKNGFKPMTNTAKRRAAAQI Hordeum_vulgare_TFIIEb_TC89335 SKIKLDADLKQLYREIELPRDMID-IEKELLKNGHKPATDTTKRRAAAQI Triticum_aestivum_TFIIEb_TC129 SKIKVDADLKQLYREIELPRDMID-IEKELLKNGHKPATDTTKRRAAAQI Sorghum_bicolor_TFIIEb_TC67168 YNITVDNDLKELFLKTELPRDMLD-VEKEIKKSGEKPMTNTTKRRALAQI Drosophila_melanogaster_TFIIEANFSVDDEFQKLWRSATVDAMDDAKIDEYLEKQGIRSMQDHGLKKAIPKHomo_sapiens_TFIIE-beta_NP_002 CQFSVDEEFQKLWRSVTVDSMDEEKIEEYLKRQGISSMQESGPKKVAPIQ Saccharomyces_cerevisiae_TFIIE NLKCIDEEFVKMWENVQLPQFAEL--PRKLQDLGLKPASVDPATIKRQTK Arabidopsis_thaliana_TFIIEb1_A HGVSNKPKDK--KKKKKEITNRTK-LTNSHMLELFQS-------Arabidopsis_thaliana_TFIIEb2_A HGISNKPKDK--KKKKQEISKRTK-LTNAHLPELFQNLNGSSSRN Lycopersicon_esculentum_TFIIEb HGIAPKPKTK---KKKHEISKRTK-LTNAHLPELFKL-------Solanum_tuberosum_TFIIEb_TC605 HGIVPKPKTK---KKKHEISKRTK-LTNAHLPELFKL-------Populus_trichocarpa_TFIIEb_Contig1 QGISTKQKAK---KKKHEISKRTK-LTNAHLPELFKNLGS----Glycine_max_TFIIEb_TC192062 QGISSKPKPK---KKKSEISKRTK-LTNAHLPELFQNLNSS---Medicago_truncatula_TFIIEb_TC7 EGISSKPKPK---KKKNEITKRTK-LTNAHLPE-----------Helianthus_annuus_TFIIEb_TC949 DGS------K---WQYFE--------------------------Hordeum_vulgare_TFIIEb_TC10289 HGVKPKAKPK---KKQREITKRTK-LTNAHLPELFQHLKS----Triticum_aestivum_TFIIEb_TC110 
HGVKPKAKPK---KKQREITKRTK-LTNAHLPELFQHLKS----Oryza_sativa_TFIIE-beta_AAM011 NGVQPKAKPK---KKQREITRRTK-LTNAHLPELFQNLNT----Oryza_sativa_TFIIEb_TC151474 NGVQPKAKPK---KKQREITRRTK-LTNAHLPELFQNLNT----Sorghum_bicolor_TFIIEb_TC59949 NGVKPKAKPK---KKQREITKRTK-LTNAHLPELFQNLNT----Zea_mays_TFIIEb_TC209727 NGVKPKAKPK---KKQREITKRTK-LTNAHLPELFQNLNT----Hordeum_vulgare_TFIIEb_TC89335 HGQRPKPKAK---KKQKEITKRTK-LTNAHLPELFDLPR-----Triticum_aestivum_TFIIEb_TC129 HGQRPKPKAK---KKQKEITKRTK-LTNAHLPELFDLPR-----Sorghum_bicolor_TFIIEb_TC67168 LDAAPKTKTKGSKKKQRRLTGKSKGLTNIHMPELFDA-------Drosophila_melanogaster_TFIIE-RKK-AANKKRQFKKPRDNEHLADVLEVYEDNTLTLKGVNPT--Homo_sapiens_TFIIE-beta_NP_002 RRKKPASQKKRRFKT--HNEHLAGVLKDYSDITSSK--------Saccharomyces_cerevisiae_TFIIE RVEVKKKRQR---KGKITNTHMTGILKDYSHRV-----------


TFIIF Alignment
Conserved C-terminal hydrophobic amino acids are highlighted in yellow.

CLUSTAL X (1.83) multiple sequence alignment

Drosophila_melanogaster_TFIIFa      -----------------MSSASKSTPSAASGSSTSAAAAAAASVASGSAS
Homo_sapiens_TFIIF_RAP74_NP_00      --------------------------MAALGP------------------
Oryza_sativa_TFIIFa_TC148835        --------------------------MGSADLVLKAACEGCGSPSDLYGT
Triticum_aestivum_TFIIFa_TC106      --------------------------MGSVDLVLKPACEGCGSTSDLYGT
Arabidopsis_thaliana_TFIIF-alp      ---------------------------MSNCLQLNTSCVGCGSQSDLYGS
Populus_trichocarpa_TFIIF_alpha_C3  ---------------------------MSFDLLLKPSCSGCGSTTDLYGS
Saccharomyces_cerevisiae_TFIIF      MSRRNPPGSRNGGGPTNASPFIKRDRMRRNFLRMRMGQNGSNSSSPGVPN

Drosophila_melanogaster_TFIIFa      SSANVQEFKIRVPK-MPKKHHVMRFNATLNVDFAQWRNVKLERENNMK--
Homo_sapiens_TFIIF_RAP74_NP_00      SSQNVTEYVVRVPKNTTKKYNIMAFNAADKVNFATWNQARLERDLSNK--
Oryza_sativa_TFIIFa_TC148835        SCKHTTLCSSCGKSMALSGARCLVCSAPITNLIREYNVRANATTDKSF--
Triticum_aestivum_TFIIFa_TC106      GCKHTTLCSSCGKSMALSRARCLVCSAPITNLIREYNVRANASTDKAF--
Arabidopsis_thaliana_TFIIF-alp      SCRHMTLCLKCGRTMAQNKSKCHECGTVVTRLIREYNVRAAAPTDKNY--
Populus_trichocarpa_TFIIF_alpha_C3  NCKHMTLCLNCGKTMAENRGKCFDCGT------TEYNVRASTSSDKNY--
Saccharomyces_cerevisiae_TFIIF      GDNSRGSLVKKDDPEYAEEREKMLLQIGVEADAGRSNVKVKDEDPNEYNE

Drosophila_melanogaster_TFIIFa      ---------EFRGMEEDQPKF--GAGSEYNRDQREEARRKKFGIIARKYR
Homo_sapiens_TFIIF_RAP74_NP_00      ---------KIY-QEEEMPES--GAGSEFNRKLREEARRKKYGIVLKEFR
Oryza_sativa_TFIIFa_TC148835        ---------SIGRFVTGLPPFSKKKSAENKWSLHKEGLQGRQIPENMREK
Triticum_aestivum_TFIIFa_TC106      ---------SIGRFVTGLPPFSKKKNAENKWSLHKEGLQGRQLTDKMLEK
Arabidopsis_thaliana_TFIIF-alp      ---------FIGRFVTGLPNF--KKGSENKWSLRKDIPQGRQFTDAQREK
Populus_trichocarpa_TFIIF_alpha_C3  ---------FIGRFVTGLPSFSKKKNAENKWSLHKEGILGRQITDALREK
Saccharomyces_cerevisiae_TFIIF      FPLRAIPKEDLENMRTHLLKFQSKKKINPVTDFHLPVRLHRKDTRNLQFQ

Drosophila_melanogaster_TFIIFa      PEAQPWILKVGGKTGKKFKG-----------------------IREGGVG
Homo_sapiens_TFIIF_RAP74_NP_00      PEDQPWLLRVNGKSGRKFKG-----------------------IKKGGVT
Oryza_sativa_TFIIFa_TC148835        YNRKPWILEDETGQ-YQYQG-----------------------QMEGSQS
Triticum_aestivum_TFIIFa_TC106      YNRKPWILEDETGQ-YQFQG-----------------------HMEGSQS
Arabidopsis_thaliana_TFIIF-alp      LKNKPWILEDETGQ-FQYQG-----------------------HLEGSQS
Populus_trichocarpa_TFIIF_alpha_C3  FKNKPWLLEDETGQ-SQYQG-----------------------HLEGSQS
Saccharomyces_cerevisiae_TFIIF      LTRAEIVQRQKEISEYKKKAEQERSTPNSGGMNKSGTVSLNNTVKDGSQT

Drosophila_melanogaster_TFIIFa      ENAAFYVFTHAPDGAIEAYPLTEWYNFQPIQRYKSLSAEEAEQEFGRRKK
Homo_sapiens_TFIIF_RAP74_NP_00      ENTSYYIFTQCPDGAFEAFPVHNWYNFTPLARHRTLTAEEAEEEWERRNK
Oryza_sativa_TFIIFa_TC148835        STATYYLLMMHGK-EFHAYPAGSWYNFSKIAQYKQLTLEEAEEKMNKRKT
Triticum_aestivum_TFIIFa_TC106      ATATYYLLMLHGK-EFHAFPAGSWYNFSKVAQYKQLTLEEAEEKMNKRKT
Arabidopsis_thaliana_TFIIF-alp      --ATYYLLVMQNK-EFVAIPAGSWYNFNKVAQYKQLTLEEAEEKMKNRRK
Populus_trichocarpa_TFIIF_alpha_C3  --ATYYLLMMTGK-EFVAIPAGSWYNFNKVAHYKQLTLEEAEEKMKNRRK
Saccharomyces_cerevisiae_TFIIF      PTVDSVTKDNTANGVNSSIPTVTGSSVPPASPTTVSAIESNGLSNGSTSA

Drosophila_melanogaster_TFIIFa      VMNYFSLMLRKRLRGDEEEEQDPEEA--KLI--KAATKKSKELKITDMDE
Homo_sapiens_TFIIF_RAP74_NP_00      VLNHFSIMQQRRLK---DQDQDEDEE--EKE--KRGRRKASELRIHDLED
Oryza_sativa_TFIIFa_TC148835        SATGYERWMMKAATNGPAAFGSDVKK--LEP--TNGTEKENARPKKGKNN
Triticum_aestivum_TFIIFa_TC106      SATGYERWMMKAATNGPAAFGSDMMK--LEP--ANDGEKESARHKKGKDN
Arabidopsis_thaliana_TFIIF-alp      TADGYQRWMMKAANNGPALFGEVDNE--KESGGTSGGGGRGRKKSSGGDE
Populus_trichocarpa_TFIIF_alpha_C3  TADGYERWMMKAANNGAAAFGEVEKV--DDKEGVSAGGRGGRRKASG-DD
Saccharomyces_cerevisiae_TFIIF      ANGLDGNASTANLANGRPLVTKLEDAGPAEDPTKVGMVKYDGKEVTNEPE

Drosophila_melanogaster_TFIIFa      WID-SEDESDSEDEEDKKKKEQEDSDDGKAKGKGKKGADKKKKKRDVDDE
Homo_sapiens_TFIIF_RAP74_NP_00      DLEMSSDASDASGEEGGR------VPKAKKKAPLAKGGRKKKKKKGSDDE
Oryza_sativa_TFIIFa_TC148835        EEGNNSDKGEEDEEEEAA----------------------RKNRLALNKK
Triticum_aestivum_TFIIFa_TC106      EEGNNSDKGEENEEEEAA----------------------RKDRLGLSKR
Arabidopsis_thaliana_TFIIF-alp      EEGNVSDRGDEDEEEEAS----------------------RKSRLGLNRK
Populus_trichocarpa_TFIIF_alpha_C3  DEGNVSDRGEEDEEEEAG----------------------RKSRLGLNKQ
Saccharomyces_cerevisiae_TFIIF      FEEGTMDPLADVAPDGGG------------RAKRGNLRRKTRQLKVLDEN

Drosophila_melanogaster_TFIIFa      AFEESDDGDEEGREMDYDT---SSSEDEPDPE------AKVDKDMKGVAE
Homo_sapiens_TFIIF_RAP74_NP_00      AFEDSDDGDFEGQEVDYMSDGSSSSQEEPESK------AKAPQQEEGPKG
Oryza_sativa_TFIIFa_TC148835        SMDDD-EEGGKDLDFDLDD-EIEKGDDWEHEE------TFTDDDEAVDID
Triticum_aestivum_TFIIFa_TC106      GMDDD-EEGGKDLDFDLDD-DIEKGDDWEHEE------TFTDDDEAVDID
Arabidopsis_thaliana_TFIIF-alp      SNDDDDEEGPRGGDLDMDDDDIEKGDDWEHEE------IFTDDDEAVGND
Populus_trichocarpa_TFIIF_alpha_C3  GGDDD-EEGPRGGDLDMDDDDIEKGDDWEHEE------IFTDDDEAVAID
Saccharomyces_cerevisiae_TFIIF      AKKLRFEEFYPWVMEDFDGYNTWVGSYEAGNSDSYVLLSVEDDGSFTMIP

Drosophila_melanogaster_TFIIFa      EDALRKLLTSDEEEDDEKKSDESDKEDADGEKKKKDKGKDEVSKDKKKKK
Homo_sapiens_TFIIF_RAP74_NP_00      VDEQ----SDSSEESEEEKPPEEDKEEEE-----------------EKKA
Oryza_sativa_TFIIFa_TC148835        PEERAD-LAPEIPAPPEIKQD--DEENEEE-------------GGLSKSG
Triticum_aestivum_TFIIFa_TC106      PEERAD-LAPEIPAPPEIKQD--DEENEEE-------------GGLSKSG
Arabidopsis_thaliana_TFIIF-alp      PEEREDLLAPEIPAPPEIKQDEDDEENEEEE------------GGLSKSG
Populus_trichocarpa_TFIIF_alpha_C3  PEERED-LAPEVPAPPEIKQDEDDEDEENEE------------GGLSKSG
Saccharomyces_cerevisiae_TFIIF      ADKVYKFTARNKYATLTIDEAEKRMDKKSGEVPRWLMKHLDNIGTTTTRY

Drosophila_melanogaster_TFIIFa      PTKDDKKGKSNGSGDSSTDFSSDSTDSEDDLS---NGPPKKKVVVKDKDK
Homo_sapiens_TFIIF_RAP74_NP_00      PTPQEKKRRKDSSEES------DSSE-ESDID---SEASSAFFMAKKKTP
Oryza_sativa_TFIIFa_TC148835        KELKKLLGKAAGLNESDADEDDEDDDQEDESS-PVLAPKQKDQ-PKDEPV
Triticum_aestivum_TFIIFa_TC106      KELKKLLGRSSGQNESDADDDDEEDDQDDESS-PVLAPKQTDQ-PKDEPV
Arabidopsis_thaliana_TFIIF-alp      KELKKLLGKANGLDESDEDDDDDSDDEEETNYGTVTNSKQKEA-AKEEPV
Populus_trichocarpa_TFIIF_alpha_C3  KELKKLLGKANGLNESDVEDDDDDEDMDDDIS-PVLAPKQKDVVPKEEAA
Saccharomyces_cerevisiae_TFIIF      DRTRRKLKAVADQQAMDEDDRDDNSEVELDYDEEFADDEEAPIIDGNEQE

Drosophila_melanogaster_TFIIFa      EKEKE---KESA--ASSKVIASS-SNANKSRSATPTLSTDASKRKMNSLP
Homo_sapiens_TFIIF_RAP74_NP_00      PKRER---KPSG--GSSRGNSRPGTPSAEGGSTSSTLRAAASKLEQGKRV
Oryza_sativa_TFIIFa_TC148835        DNSPA---KPTP-SGHARGTPPASK-SKQKRKSGGGDDSKASGGAASKKA
Triticum_aestivum_TFIIFa_TC106      DNSPA---KPTPSSGHARSTPPASK-SKQKRKSGG-DDAKASSGAASKKA
Arabidopsis_thaliana_TFIIF-alp      DNAPA---KPAP-SGPPRGTPPAKP-SKGKRKLNDGDSKKPSS-SVQKKV
Populus_trichocarpa_TFIIF_alpha_C3  DISPA---KPTP-SGSAKGTPSTSKSAKGKRKLNG-EDAKSSNGAPVKKV
Saccharomyces_cerevisiae_TFIIF      NKESEQRIKKEMLQANAMGLRDEEAPSENEEDELFGEKKIDEDGERIKKA

Drosophila_melanogaster_TFIIFa      SDLTASDTSNSPTSTPAKRPKNEI-----STSLPTSFSGGKVEDYGITEE
Homo_sapiens_TFIIF_RAP74_NP_00      SEMPAAKRLRLDTGPQSLSGKS-------TPQPPSGKTTPNSGDVQVTED
Oryza_sativa_TFIIFa_TC148835        KVESDTKPSVAKDETPSSSKP---------ASKATAASKTSANVSPVTED
Triticum_aestivum_TFIIFa_TC106      KVESDTKTSSIKEETPSSSKP---------TPKASASSR-SANVSPVTED
Arabidopsis_thaliana_TFIIF-alp      KTENDPKSSLKEERANTVSKSNTPTKAVKAEPASAPASSSSAATGPVTED
Populus_trichocarpa_TFIIF_alpha_C3  KTENEVKPAVKEESSPATKGTATP----KVTPPSSKTGSTSGSTGPVTEE
Saccharomyces_cerevisiae_TFIIF      LQKTELAALYSSDENEINPYLSESD---IENKENESPVKKEEDSDTLSKS

Drosophila_melanogaster_TFIIFa      AVRRYLKRK-PLTATELLTKFKNKKTPVSS-----DRLVETMTKILKKIN
Homo_sapiens_TFIIF_RAP74_NP_00      AVRRYLTRK-PMTTKDLLKKFQTKKTGLSS-----EQTVNVLAQILKRLN
Oryza_sativa_TFIIFa_TC148835        EIRTVLLAVAPVTTQDLVSRFKSRLRGPE--------DKNAFAEILKKIS
Triticum_aestivum_TFIIFa_TC106      EIRTVLLAVAPVTTQDLVSRFKSRLRGPE--------DKNAFAEILKKIS
Arabidopsis_thaliana_TFIIF-alp      EIRAVLMEKKQVTTQDLVSRFKARLKTKE--------DKNAFANILRKIS
Populus_trichocarpa_TFIIF_alpha_C3  EIRAVLLQNGPVTTQDLVARFKSRLRTPECFTADYSLGLSVRLCMLLRIC
Saccharomyces_cerevisiae_TFIIF      KRSSPKKQQKKATNAHVHKEPTLRVKSIKN----CVIILKGDKKILKSFP

Drosophila_melanogaster_TFIIFa      PVKHTIQGKMYLWIK---------------------
Homo_sapiens_TFIIF_RAP74_NP_00      PERKMINDKMHFSLKE--------------------
Oryza_sativa_TFIIFa_TC148835        KIQKTNG-HNYVVLRDDKK-----------------
Triticum_aestivum_TFIIFa_TC106      KIQKTNG-HNYVVLREDKK-----------------
Arabidopsis_thaliana_TFIIF-alp      KIQKNAGSQNFVVLREKCQPKPGKRESRVNKLNIRS
Populus_trichocarpa_TFIIF_alpha_C3  VVHDGINTISGVWVAKFLHRTWGFYQFNKGWVGGTG
Saccharomyces_cerevisiae_TFIIF      EGEWNPQTTKAVDSSNNASNTVPSPIKQEEGLNSTV

TFIIF Alignment

CLUSTAL X (1.83) multiple sequence alignment

Glycine_max_TFIIFb_TC178154         MDEENGYS-GSIS-------------------SNLETTKAERSVWLMKCP
Medicago_truncatula_TFIIFb_TC7      MEDENSYG-GSSGG------------------SNLETSKAERSVWLMKCP
Vitis_vinifera_TFIIFb_TC20528       MEEEQ----GNSSS------------------SNLETGKAERSVWLMKCP
Populus_balsamifera_TFIIF-beta 2    MEEDHSNGGNSSSS------------------GNLETSKADKAVWLMKCP
Populus_balsamifera_TFIIF-beta 3    MEED-----NSSSS------------------ANLETSKADKSVWLMKCP
Arabidopsis_thaliana_TFIIFb-2_      MEDIH----------------------------NLDIEKSDRSIWLMKCP
Hordeum_vulgare_TFIIFb_TC10374      MGDEA---------------------------KYLETARADRSVWLMKCP
Triticum_aestivum_TFIIFb_TC122      MGDEA---------------------------KYLETARADRSVWLMKCP
Oryza_sativa_TFIIFb_TC137623        MAEEA---------------------------KNLETARADRSVWLMKCP
Arabidopsis_thaliana_TFIIFb-1_      MEDVKVEMKVRKNEN-----------------EALETGLAERSMLLMKAP
Populus_balsamifera_TFIIF-beta 1    MDDEASNSSSGNNNNNNKNLTNDNNNKSPVLGGFLDASKAEKSVWLMKCP
Drosophila_melanogaster_TFIIF_      MSKEDKEK-------------------TQIIDKDLDLSNAGRGVWLVKVP
Homo_sapiens_TFIIFb_NP_004119       MAERG--------------------------ELDLTGAKQNTGVWLVKVP
Saccharomyces_cerevisiae_TFIIF      LSQEEEVFDGNDIENNE--------TKVYEESLDLDLERSNRQVWLVRLP

Glycine_max_TFIIFb_TC178154         LVVAKSWQTHPP----------S--QPLAKVVLSLDPLHPEE--------
Medicago_truncatula_TFIIFb_TC7      VAVAKSWQNHPP----------S--QPLSKVVFSIDPLLPE---------
Vitis_vinifera_TFIIFb_TC20528       LAVSKSWQSHSS----------SESQPVAKVVLSLDPLRSE---------
Populus_balsamifera_TFIIF-beta 2    VVVAKSWKSHHT--------SSSDSAPLAKVVLSLDPLQSD---------
Populus_balsamifera_TFIIF-beta 3    VVVAKSWKTHTSP-------SSSDSAPLAKVVLSLDPLQSD---------
Arabidopsis_thaliana_TFIIFb-2_      VVVDKAWHKIAASSSS-SFASSDSPPDMAKIVREVDPLRDD---------
Hordeum_vulgare_TFIIFb_TC10374      PVVSQAWQGASASS-----GDANPNPVVAKVVLSLDPLSSA---------
Triticum_aestivum_TFIIFb_TC122      PVVSQAWQGASSSS-----GDANPNPVVAKVVLSLDPLSSA---------
Oryza_sativa_TFIIFb_TC137623        TVVSRAWQEAATAA-----ASSSSSSDAAAGANSNSNANPN---------
Arabidopsis_thaliana_TFIIFb-1_      SLVASSLQSHSFPDDP--YRPDDPYRPDAKVILGVDPLAHEDEGTQLFRV
Populus_balsamifera_TFIIF-beta 1    SIVSRFLRSQEHEVG----DGDASSPPVAKVIVSVDPLKSNDDDNS----
Drosophila_melanogaster_TFIIF_      KYIAQKWEKAPTNMDVGKLRINKTPGQKAQVSLSLTPAVLAL--------
Homo_sapiens_TFIIFb_NP_004119       KYLSQQWAKASGRGEVGKLRIAKTQG-RTEVSFTLN--------------
Saccharomyces_cerevisiae_TFIIF      MFLAEKWRDRNNLHG-QELGKIRINKDGSKITLLLNENDNDS--------

Glycine_max_TFIIFb_TC178154         --------------------------------------------------
Medicago_truncatula_TFIIFb_TC7      --------------------------------------------------
Vitis_vinifera_TFIIFb_TC20528       --------------------------------------------------
Populus_balsamifera_TFIIF-beta 2    --------------------------------------------------
Populus_balsamifera_TFIIF-beta 3    --------------------------------------------------
Arabidopsis_thaliana_TFIIFb-2_      --------------------------------------------------
Hordeum_vulgare_TFIIFb_TC10374      --------------------------------------------------
Triticum_aestivum_TFIIFb_TC122      --------------------------------------------------
Oryza_sativa_TFIIFb_TC137623        --------------------------------------------------
Arabidopsis_thaliana_TFIIFb-1_      SSNHSGKFHPLRNLLLHSLKFHGFGEMGFLSLEISSGPHFDHEIPCNLRI
Populus_balsamifera_TFIIF-beta 1    --------------------------------------------------
Drosophila_melanogaster_TFIIF_      --------------------------------------------------
Homo_sapiens_TFIIFb_NP_004119       --------------------------------------------------
Saccharomyces_cerevisiae_TFIIF      --------------------------------------------------

Glycine_max_TFIIFb_TC178154         ---------------------DDPSAVQFTMEMA----------------
Medicago_truncatula_TFIIFb_TC7      ---------------------DDPAHLQFTMEMS----------------
Vitis_vinifera_TFIIFb_TC20528       ----------------------DPSALEFTMEMT----------------
Populus_balsamifera_TFIIF-beta 2    ----------------------DPSAIQFTMEMA----------------
Populus_balsamifera_TFIIF-beta 3    ----------------------DPSALQFTMEMA----------------
Arabidopsis_thaliana_TFIIFb-2_      ------------------------SPPEFKMYMV----------------
Hordeum_vulgare_TFIIFb_TC10374      -----------------------EPSLQFKMEMS----------------
Triticum_aestivum_TFIIFb_TC122      -----------------------EPSLQFKMEMS----------------
Oryza_sativa_TFIIFb_TC137623        -----------------------PVVAKFKMEMA----------------
Arabidopsis_thaliana_TFIIFb-1_      GLLSMNFLARHLNYEFLGVKHGNSFALQFVMELA----------------
Populus_balsamifera_TFIIF-beta 1    ------------------ATEDYPNALNFELSLVLFCLVFTLHDFFCSLW
Drosophila_melanogaster_TFIIF_      ---------------------DPEEKIPTEHILD----------------
Homo_sapiens_TFIIFb_NP_004119       ------------------------EDLANIHDIG----------------
Saccharomyces_cerevisiae_TFIIF      ----------------------IPHEYDLELTKKVVENEYVFTEQNLKKY

Glycine_max_TFIIFb_TC178154         ---GTEAV---NMSKTYSLNMFKDFVPMCVFSETSQGG------------
Medicago_truncatula_TFIIFb_TC7      ---GTEAV---NMPKTYSLNMFKDFVPMCIFSETSEGD------------
Vitis_vinifera_TFIIFb_TC20528       ---GTGAP---NMPKSYSLNMFKDFVPMCVFSETNQG-------------
Populus_balsamifera_TFIIF-beta 2    ---RTETG---NVPKSYSLNMFKDFVPMGVFSETPQG-------------
Populus_balsamifera_TFIIF-beta 3    ---RTEAG---NVPKSYSLNMFKDFVPMCVFSETPQG-------------
Arabidopsis_thaliana_TFIIFb-2_      ---GAEYG---NMPKCYALNMFTDFVPMGGFSDVNQG-------------
Hordeum_vulgare_TFIIFb_TC10374      ---QTSVASTCNLPKSYSLNMFKDFVPMCVFSETNQG-------------
Triticum_aestivum_TFIIFb_TC122      ---QTSVASTCNLPKSYSLNMFKDFVPMCVFSETNQG-------------
Oryza_sativa_TFIIFb_TC137623        ---QTGNG---NTPKSYSLNMFKDFVPMCVFSESNQG-------------
Arabidopsis_thaliana_TFIIFb-1_      ---RADSG---NMPRRYTLDMSKDFIPMNVFCESSDDFGSLGEE------
Populus_balsamifera_TFIIF-beta 1    KWLGTGLG---DGLKSYSMEMSKDLVDMSVFSESSQG-------------
Drosophila_melanogaster_TFIIF_      ---VSQVTKQTLGVFSHMAPSDGKENSTTSAAQPDNE-------------
Homo_sapiens_TFIIFb_NP_004119       ---GKPASVSAPREHPFVLQSVG-GQTLTVFTESSSD-------------
Saccharomyces_cerevisiae_TFIIF      QQRKKELEADPEKQRQAYLKKQEREEELKKKQQQQKRRNNRKKFNHRVMT

Glycine_max_TFIIFb_TC178154         -----------------KVAMEGKVEHKFDMKPHGENIEEYGKLCRERTN
Medicago_truncatula_TFIIFb_TC7      -----------------KVAMEGKVEHKFDMKPRHENMDDYGKLCRERTK
Vitis_vinifera_TFIIFb_TC20528       -----------------RVAMEGKVEHKFDMKPHNENIEEYGKLCRERTN
Populus_balsamifera_TFIIF-beta 2    -----------------RVSMEGKVEHKFDMKPHEENIEEYSKLCRDRTK
Populus_balsamifera_TFIIF-beta 3    -----------------KVAMEGKVEHKFDMKPHEQNIEEYHKLCRERTK
Arabidopsis_thaliana_TFIIFb-2_      -----------------CAAAEGKVDHKFDMKPYGETIEEYARLCRERTS
Hordeum_vulgare_TFIIFb_TC10374      -----------------KLSCEGKVEHKFDMEPHKDNLLNYAKLCRERTQ
Triticum_aestivum_TFIIFb_TC122      -----------------KLSCEGKVEHKFDMEPHKDNLLNYAKLCRERTQ
Oryza_sativa_TFIIFb_TC137623        -----------------KLSCEGKVGHKFDMEPHSDNLVNYGKLCRERTQ
Arabidopsis_thaliana_TFIIFb-1_      ------FSIGMFIYSPGKMSVEGKIKNKFDMRPHNENIESYGRLCRERTN
Populus_balsamifera_TFIIF-beta 1    -----------------KLSVEGRILNKFDVRPHSENLENYRKICRERTK
Drosophila_melanogaster_TFIIF_      -----------------KLYMEGRIVQKLECRPIAD--NCYMKLKLESIR
Homo_sapiens_TFIIFb_NP_004119       -----------------KLSLEGIVVQRAECRPAAS--ENYMRLKRLQIE
Saccharomyces_cerevisiae_TFIIF      DRDGRDRYIPYVKTIPKKTAIVGTVCHECQVMPSMNDPNYHKIVEQRRNI

Glycine_max_TFIIFb_TC178154         KSMIKNRQIQVIDNDRGVLMRPMPGMIG-------LVSSNSKDK-KKTQP
Medicago_truncatula_TFIIFb_TC7      KSMIKNRQVQIIADDRGTHMRPMPGMVG-------LVSSNFKDK-KRTQP
Vitis_vinifera_TFIIFb_TC20528       KSMIKNRQIQVIDNDRGVHMRPMPGMVG-------LIASNSKDK-KKTAP
Populus_balsamifera_TFIIF-beta 2    KSMIKNRQIRVIDNDRGVHMRPMPGMVG-------LISSTSKDK-KKTQP
Populus_balsamifera_TFIIF-beta 3    KSMVKIRQIQVINNDRGVHMRPMPGMVG-------LISSSSKDK-KRPQP
Arabidopsis_thaliana_TFIIFb-2_      KAMVKNRQIQVIDNDRGVHMRPMPGMLG-------LVSSNSKEK-RKPPP
Hordeum_vulgare_TFIIFb_TC10374      KSMVKTRKVQVLDNDHGMSMRPMPGMVG-------LISSSSKEK-RKPTP
Triticum_aestivum_TFIIFb_TC122      KSMVKTRKVQVLDNDHGMSMRPMPGMVG-------LISSSSKEK-RKPTP
Oryza_sativa_TFIIFb_TC137623        KSMIKNRKLMVLANDNGMSMRPLPGLVG-------LMSSGPKQKEKKPLP
Arabidopsis_thaliana_TFIIFb-1_      KYMGKNRQIQVIDNARGMHMRPMPGMI---------IPTAAPEK--KKLT
Populus_balsamifera_TFIIF-beta 1    KYMVKSRQIKVIDNDTGSHMMPMPGMIISGLAVLSFFYIFVNDK--KKLP
Drosophila_melanogaster_TFIIF_      KASEPQRRVQPIDKIVQN-FKPVKDHAH---------NIEYRER------
Homo_sapiens_TFIIFb_NP_004119       ESSKPVRLSQQLDKVVTTNYKPVANHQY---------NIEYERK------
Saccharomyces_cerevisiae_TFIIF      VKLNNKERITTLDETVGVTMSHTGMSMR----------SDNSNFLKVGRE

Glycine_max_TFIIFb_TC178154         VKQSDTKRTRRDRGELEDIMFKLFERQPNWALKQLVQETDQPAQFLKEIL
Medicago_truncatula_TFIIFb_TC7      VKQTDTKRTRRDRGELEDIMFKLFERQPNWALKQLVQETDQPAQFLKEIL
Vitis_vinifera_TFIIFb_TC20528       VKGSDMKRTRRDRGELEDIMFKLFERQPNWALKQLVQETDQPAQFLKEIL
Populus_balsamifera_TFIIF-beta 2    VKQSDVKRTRRDRGELEDIMFKLFERQPNWALKQLVQETDQPAQFLKEIL
Populus_balsamifera_TFIIF-beta 3    VKQSDVKRTRRDRGELEDIMFKLFERQPNWALKQLVQETDQPAQFLKEIL
Arabidopsis_thaliana_TFIIFb-2_      VKQTEVKRTRRDRGELEAIMFKLFEGQPNWTLKQLVQETDQPAQFLKEIL
Hordeum_vulgare_TFIIFb_TC10374      TKPSDVKRTRRDRRELENIIFKLFEKQPNWALKALVQETDQPEQFLKEIL
Triticum_aestivum_TFIIFb_TC122      TKPSDVKRTRRDRRELENIIFKLFEKQPNWALKALVQETDQPEQFLKEIL
Oryza_sativa_TFIIFb_TC137623        VKPSDMKRTRRDRRELENILFKLFERQPNWSLKNLMQETDQPEQFLKEIL
Arabidopsis_thaliana_TFIIFb-1_      NRTSEMKRTRRDRREMEEVMFNLFERQSNWTLRLLIQETDQPEQFLKDLL
Populus_balsamifera_TFIIF-beta 1    IKASDMKRTRRDRREMEGIMFKLFEKQPNWTLKQLVQETDQP--------
Drosophila_melanogaster_TFIIF_      -KKAEGKKARDDKNAVMDMLFHAFEKHQYYNIKDLVKITNQPISYLKEIL
Homo_sapiens_TFIIFb_NP_004119       -KKEDGKRARADKQHVLDMLFSAFEKHQYYNLKDLVDITKQPVVYLKEIL
Saccharomyces_cerevisiae_TFIIF      KAKSNIKSIRMPKKEILDYLFKLFDEYDYWSLKGLKERTRQPEAHLKECL

Glycine_max_TFIIFb_TC178154         NELCVYNKRGANQGTYELKPEYKKSVEDTSAE--
Medicago_truncatula_TFIIFb_TC7      NELCVYNKRGANQGTYELKPEYKKSVEDANAE--
Vitis_vinifera_TFIIFb_TC20528       NELCVYNKRGTNQGTYELKPEYKKSAEDTGAE--
Populus_balsamifera_TFIIF-beta 2    NELCVYNKRGTNQGTYELKPEYKKTAEDTGAD--
Populus_balsamifera_TFIIF-beta 3    NELCVYNKRGTNQGTYELKPEYKKTVEDTGAD--
Arabidopsis_thaliana_TFIIFb-2_      NELCVYNKRGSNQGTYELKPEYKKSAEDDTGGQ-
Hordeum_vulgare_TFIIFb_TC10374      NDLCMYNKRGPNQGTHELKPEYKKSSEDAAGAP-
Triticum_aestivum_TFIIFb_TC122      NDLCMYNKRGPNQGTHELKPEYKKSSEDAAGAP-
Oryza_sativa_TFIIFb_TC137623        NDLCFYNKRGPNQGTHELKPEYKKSTEDADATAT
Arabidopsis_thaliana_TFIIFb-1_      KDLCIYNNKGSNQGTYELKPEYKKATQE------
Populus_balsamifera_TFIIF-beta 1    -EVCFLTT--------------------------
Drosophila_melanogaster_TFIIF_      KDVCDYNMKNPHKNMWELKKEYRHYKTEEKKEEE
Homo_sapiens_TFIIFb_NP_004119       KEIGVQNVKGIHKNTWELKPEYRHYQGEEKSD--
Saccharomyces_cerevisiae_TFIIF      DKVATLVKKGPYAFKYTLRPEYKKLKEEERKATL


LIST OF REFERENCES
Albright, S.R., and Tjian, R. (2000). TAFs revisited: more data reveal new twists and confirm old ideas. Gene 242, 1-13.
Altschul, S.F., Gish, W., Miller, W., Myers, E.W., and Lipman, D.J. (1990). Basic local alignment search tool. J Mol Biol 215, 403-410.
Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D.J. (1997). Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25, 3389-3402.
Andel, F.r., Ladurner, A.G., Inouye, C., Tjian, R., and Nogales, E. (1999). Three-dimensional structure of the human TFIID-IIA-IIB complex. Science 286, 2153-2156.
Apone, L.M., Virbasius, C.A., Holstege, F.C., Wang, J., Young, R.A., and Green, M.R. (1998). Broad, but not universal, transcriptional requirement for yTAFII17, a histone H3-like TAFII present in TFIID and SAGA. Mol Cell 2, 653-661.
Attwooll, C., Tariq, M., Harris, M., Coyne, J.D., Telford, N., and Varley, J.M. (1999). Identification of a novel fusion gene involving hTAFII68 and CHN from a t(9;17)(q22;q11.2) translocation in an extraskeletal myxoid chondrosarcoma. Oncogene 18, 7599-7601.
Auble, D.T., Hansen, K.E., Mueller, C.G., Lane, W.S., Thorner, J., and Hahn, S. (1994). Mot1, a global repressor of RNA polymerase II transcription, inhibits TBP binding to DNA by an ATP-dependent mechanism. Genes Dev 8, 1920-1934.
Austin, R.J., and Biggin, M.D. (1996). Purification of the Drosophila RNA polymerase II general transcription factors. Proc Natl Acad Sci U S A 93, 5788-5792.
Auty, R., Steen, H., Myers, L., Gygi, S., and Buratowski, S. (2003). The Purification and Initial Characterization of TFIID from Saccharomyces cerevisiae. In 22nd Summer Symposium in Molecular Biology: Chromatin Structure and Function (State College, PA), pp. 151.
Ayer, D.E. (1999). Histone deacetylases: transcriptional repression with SINers and NuRDs. Trends Cell Biol 9, 193-198.


Baldwin, D.A., and Gurley, W.B. (1996). Isolation and characterization of cDNAs encoding transcription factor IIB from Arabidopsis and soybean. Plant J 10, 561-568.
Barlev, N.A., Candau, R., Wang, L., Darpino, P., Silverman, N., and Berger, S.L. (1995). Characterization of physical interactions of the putative transcriptional adaptor, ADA2, with acidic activation domains and TATA-binding protein. J Biol Chem 270, 19337-19344.
Beckmann, H., Chen, J.L., O'Brien, T., and Tjian, R. (1995). Coactivator and promoter-selective properties of RNA polymerase I TAFs. Science 270, 1506-1509.
Bell, B., and Tora, L. (1999). Regulation of gene expression by multiple forms of TFIID and other novel TAFII-containing complexes. Exp Cell Res 246, 11-19.
Bell, S.D., Brinkman, A.B., van der Oost, J., and Jackson, S.P. (2001). The archaeal TFIIE alpha homologue facilitates transcription initiation by enhancing TATA-box recognition. EMBO Rep 2, 133-138.
Berger, S.L. (2003). Histone Covalent Modifications in Gene Regulation. In 22nd Summer Symposium in Molecular Biology: Chromatin Structure and Function (State College, PA), pp. 57.
Berk, A.J. (1999). Activation of RNA polymerase II transcription. Curr Opin Cell Biol 11, 330-335.
Bertolotti, A., Lutz, Y., Heard, D.J., Chambon, P., and Tora, L. (1996). hTAF(II)68, a novel RNA/ssDNA-binding protein with homology to the pro-oncoproteins TLS/FUS and EWS is associated with both TFIID and RNA polymerase II. Embo J 15, 5022-5031.
Bertolotti, A., Melot, T., Acker, J., Vigneron, M., Delattre, O., and Tora, L. (1998). EWS, but not EWS-FLI-1, is associated with both TFIID and RNA polymerase II: interactions between two members of the TET family, EWS and hTAFII68, and subunits of TFIID and RNA polymerase II complexes. Mol Cell Biol 18, 1489-1497.
Birck, C., Poch, O., Romier, C., Ruff, M., Mengus, G., Lavigne, A.C., Davidson, I., and Moras, D. (1998). Human TAF(II)28 and TAF(II)18 interact through a histone fold encoded by atypical evolutionary conserved motifs also found in the SPT3 family. Cell 94, 239-249.
Brand, M., Leurent, C., Mallouh, V., Tora, L., and Schultz, P. (1999). Three-dimensional structures of the TAFII-containing complexes TFIID and TFTC. Science 286, 2151-2153.


Buratowski, S., Hahn, S., Guarente, L., and Sharp, P.A. (1989). Five intermediate complexes in transcription initiation by RNA polymerase II. Cell 56, 549-561.
Buratowski, S., and Zhou, H. (1993). Functional domains of transcription factor TFIIB. Proc Natl Acad Sci U S A 90, 5633-5637.
Burke, T.W., and Kadonaga, J.T. (1997). The downstream core promoter element, DPE, is conserved from Drosophila to humans and is recognized by TAFII60 of Drosophila. Genes Dev 11, 3020-3031.
Burley, S.K., and Roeder, R.G. (1998). TATA box mimicry by TFIID: autoinhibition of pol II transcription. Cell 94, 551-553.
Burley, S.K., and Roeder, R.G. (1996). Biochemistry and Structural Biology of Transcription Factor IID (TFIID). In Annual Review of Biochemistry, C.C. Richardson, J.N. Abelson, and C.R.H. Raetz, eds (Palo Alto, CA: Annual Reviews Inc.), pp. 769-799.
Cairns, B.R., Henry, N.L., and Kornberg, R.D. (1996). TFG/TAF30/ANC1, a component of the yeast SWI/SNF complex that is similar to the leukemogenic proteins ENL and AF-9. Mol Cell Biol 16, 3308-3316.
Campanella, J., Bitincka, L., and Smalley, J. (2003). MatGAT: An application that generates similarity/identity matrices using protein or DNA sequences. BMC Bioinformatics 4, 29.
Carmo-Fonseca, M., Mendes-Soares, L., and Campos, I. (2000). To be or not to be in the nucleolus. Nat Cell Biol 2, E107-112.
Chalkley, G.E., and Verrijzer, C.P. (1999). DNA binding site selection by RNA polymerase II TAFs: a TAF(II)250-TAF(II)150 complex recognizes the initiator. Embo J 18, 4835-4845.
Chang, M., and Jaehning, J.A. (1997). A multiplicity of mediators: alternative forms of transcription complexes communicate with transcriptional regulators. Nucleic Acids Res 25, 4861-4865.
Chasman, D.I., Flaherty, K.M., Sharp, P.A., and Kornberg, R.D. (1993). Crystal structure of yeast TATA-binding protein and model for interaction with DNA. Proc Natl Acad Sci U S A 90, 8174-8178.
Chatterjee, S., and Struhl, K. (1995). Connecting a promoter-bound protein to TBP bypasses the need for a transcriptional activation domain. Nature 374, 820-822.


Chen, Z., and Manley, J.L. (2000). Robust mRNA transcription in chicken DT40 cells depleted of TAF(II)31 suggests both functional degeneracy and evolutionary divergence. Mol Cell Biol 20, 5064-5076.
Chen, Z., and Manley, J.L. (2003). In vivo functional analysis of the histone 3-like TAF9 and a TAF9-related factor, TAF9L. J Biol Chem 278, 35172-35183.
Chicurel, M. (2002). Putting a name on it. Nature 419, 755, 757.
Choi, C.H., Hiromura, M., and Usheva, A. (2003). Transcription factor IIB acetylates itself to regulate transcription. Nature 424, 965-969.
Clos, J., Buttgereit, D., and Grummt, I. (1986). A purified transcription factor (TIF-IB) binds to essential sequences of the mouse rDNA promoter. Proc Natl Acad Sci U S A 83, 604-608.
Collart, M.A. (1996). The NOT, SPT3, and MOT1 genes functionally interact to regulate transcription at core promoters. Mol Cell Biol 16, 6668-6676.
Colombatti, A., Bonaldo, P., and Doliana, R. (1993). Type A modules: interacting domains found in several non-fibrillar collagens and in other extracellular matrix proteins. Matrix 13, 297-306.
Comai, L., Zomerdijk, J.C., Beckmann, H., Zhou, S., Admon, A., and Tjian, R. (1994). Reconstitution of transcription factor SL1: exclusive binding of TBP by SL1 or TFIID subunits. Science 266, 1966-1972.
Coulombe, B. (1999). DNA wrapping in transcription initiation by RNA polymerase II. Biochem Cell Biol 77, 257-264.
Coulombe, B., and Burton, Z.F. (1999). DNA bending and wrapping around RNA polymerase: a "revolutionary" model describing transcriptional mechanisms. Microbiol Mol Biol Rev 63, 457-478.
Cramer, P., Bushnell, D.A., Fu, J., Gnatt, A.L., Maier-Davis, B., Thompson, N.E., Burgess, R.R., Edwards, A.M., David, P.R., and Kornberg, R.D. (2000). Architecture of RNA polymerase II and implications for the transcription mechanism. Science 288, 640-649.
Czermin, B., Melfi, R., McCabe, D., Seitz, V., Imhof, A., and Pirrotta, V. (2002). Drosophila enhancer of Zeste/ESC complexes have a histone H3 methyltransferase activity that marks chromosomal Polycomb sites. Cell 111, 185-196.
Davie, J.R., and Murphy, L.C. (1990). Level of ubiquitinated histone H2B in chromatin is coupled to ongoing transcription. Biochemistry 29, 4752-4757.


Dhalluin, C., Carlson, J.E., Zeng, L., He, C., Aggarwal, A.K., and Zhou, M.M. (1999). Structure and ligand of a histone acetyltransferase bromodomain. Nature 399, 491-496.
Dikstein, R., Ruppert, S., and Tjian, R. (1996a). TAFII250 is a bipartite protein kinase that phosphorylates the base transcription factor RAP74. Cell 84, 781-790.
Dikstein, R., Zhou, S., and Tjian, R. (1996b). Human TAFII 105 is a cell type-specific TFIID subunit related to hTAFII130. Cell 87, 137-146.
Dubrovskaya, V., Lavigne, A.C., Davidson, I., Acker, J., Staub, A., and Tora, L. (1996). Distinct domains of hTAFII100 are required for functional interaction with transcription factor TFIIF beta (RAP30) and incorporation into the TFIID complex. Embo J 15, 3702-3712.
Dvir, A., Conaway, J.W., and Conaway, R.C. (2001). Mechanism of transcription initiation and promoter escape by RNA polymerase II. Curr Opin Genet Dev 11, 209-214.
Dynlacht, B.D., Hoey, T., and Tjian, R. (1991). Isolation of coactivators associated with the TATA-binding protein that mediate transcriptional activation. Cell 66, 563-576.
Eisenmann, D.M., Arndt, K.M., Ricupero, S.L., Rooney, J.W., and Winston, F. (1992). SPT3 interacts with TFIID to allow normal transcription in Saccharomyces cerevisiae. Genes Dev 6, 1319-1331.
Eulgem, T., Rushton, P.J., Robatzek, S., and Somssich, I.E. (2000). The WRKY superfamily of plant transcription factors. Trends Plant Sci 5, 199-206.
Fang, S.M., and Burton, Z.F. (1996). RNA polymerase II-associated protein (RAP) 74 binds transcription factor (TF) IIB and blocks TFIIB-RAP30 binding. J Biol Chem 271, 11703-11709.
Flores, O., Ha, I., and Reinberg, D. (1990). Factors involved in specific transcription by mammalian RNA polymerase II. Purification and subunit composition of transcription factor IIF. J Biol Chem 265, 5629-5634.
Flores, O., Lu, H., Killeen, M., Greenblatt, J., Burton, Z.F., and Reinberg, D. (1991). The small subunit of transcription factor IIF recruits RNA polymerase II into the preinitiation complex. Proc Natl Acad Sci U S A 88, 9999-10003.
Flores, O., Lu, H., and Reinberg, D. (1992). Factors involved in specific transcription by mammalian RNA polymerase II. Identification and characterization of factor IIH. J Biol Chem 267, 2786-2793.


Flores, O., Maldonado, E., and Reinberg, D. (1989). Factors involved in specific transcription by mammalian RNA polymerase II. Factors IIE and IIF independently interact with RNA polymerase II. J Biol Chem 264, 8913-8921.
Fondell, J.D., Ge, H., and Roeder, R.G. (1996). Ligand induction of a transcriptionally active thyroid hormone receptor coactivator complex. Proc Natl Acad Sci U S A 93, 8329-8333.
Gangloff, Y.G., Pointud, J.C., Thuault, S., Carre, L., Romier, C., Muratoglu, S., Brand, M., Tora, L., Couderc, J.L., and Davidson, I. (2001a). The TFIID components human TAF(II)140 and Drosophila BIP2 (TAF(II)155) are novel metazoan homologues of yeast TAF(II)47 containing a histone fold and a PHD finger. Mol Cell Biol 21, 5109-5121.
Gangloff, Y.G., Sanders, S.L., Romier, C., Kirschner, D., Weil, P.A., Tora, L., and Davidson, I. (2001b). Histone folds mediate selective heterodimerization of yeast TAF(II)25 with TFIID components yTAF(II)47 and yTAF(II)65 and with SAGA component ySPT7. Mol Cell Biol 21, 1841-1853.
Gangloff, Y.G., Werten, S., Romier, C., Carre, L., Poch, O., Moras, D., and Davidson, I. (2000). The human TFIID components TAF(II)135 and TAF(II)20 and the yeast SAGA components ADA1 and TAF(II)68 heterodimerize to form histone-like pairs. Mol Cell Biol 20, 340-351.
Gasch, A., Hoffmann, A., Horikoshi, M., Roeder, R.G., and Chua, N.H. (1990). Arabidopsis thaliana contains two genes for TFIID. Nature 346, 390-394.
Gegonne, A., Weissman, J.D., and Singer, D.S. (2001). TAFII55 binding to TAFII250 inhibits its acetyltransferase activity. Proc Natl Acad Sci U S A 98, 12432-12437.
Geiger, J.H., Hahn, S., Lee, S., and Sigler, P.B. (1996). Crystal structure of the yeast TFIIA/TBP/DNA complex. Science 272, 830-836.
Gille, C., Goede, A., Schloetelburg, C., Preissner, R., Kloetzel, P.M., Gobel, U.B., and Frommel, C. (2003). A comprehensive view on proteasomal sequences: implications for the evolution of the proteasome. J Mol Biol 326, 1437-1448.


Giot, L., Bader, J.S., Brouwer, C., Chaudhuri, A., Kuang, B., Li, Y., Hao, Y.L., Ooi, C.E., Godwin, B., Vitols, E., Vijayadamodar, G., Pochart, P., Machineni, H., Welsh, M., Kong, Y., Zerhusen, B., Malcolm, R., Varrone, Z., Collis, A., Minto, M., Burgess, S., McDaniel, L., Stimpson, E., Spriggs, F., Williams, J., Neurath, K., Ioime, N., Agee, M., Voss, E., Furtak, K., Renzulli, R., Aanensen, N., Carrolla, S., Bickelhaupt, E., Lazovatsky, Y., DaSilva, A., Zhong, J., Stanyon, C.A., Finley Jr., R.L., White, K.P., Braverman, M., Jarvie, T., Gold, S., Leach, M., Knight, J., Shimkets, R.A., McKenna, M.P., Chant, J., and Rothberg, J.M. (2003). A Protein Interaction Map of Drosophila melanogaster. Science 1090289.
Gonnet, G.H., Cohen, M.A., and Benner, S.A. (1994). Analysis of amino acid substitution during divergent evolution: the 400 by 400 dipeptide substitution matrix. Biochem Biophys Res Commun 199, 489-496.
Goodrich, J.A., Hoey, T., Thut, C.J., Admon, A., and Tjian, R. (1993). Drosophila TAFII40 interacts with both a VP16 activation domain and the basal transcription factor TFIIB. Cell 75, 519-530.
Grant, P.A., and Berger, S.L. (1999). Histone acetyltransferase complexes. Semin Cell Dev Biol 10, 169-177.
Grant, P.A., Schieltz, D., Pray-Grant, M.G., Steger, D.J., Reese, J.C., Yates, J.R., and Workman, J.L. (1998). A subset of TAF(II)s are integral components of the SAGA complex required for nucleosome acetylation and transcriptional stimulation. Cell 94, 45-53.
Green, M.R. (2000). TBP-associated factors (TAFIIs): multiple, selective transcriptional mediators in common complexes. Trends Biochem Sci 25, 59-63.
Groft, C.M., Uljon, S.N., Wang, R., and Werner, M.H. (1998). Structural homology between the Rap30 DNA-binding domain and linker histone H5: implications for preinitiation complex assembly. Proc Natl Acad Sci U S A 95, 9117-9122.
Ha, I., Roberts, S., Maldonado, E., Sun, X., Kim, L.U., Green, M., and Reinberg, D. (1993). Multiple functional domains of human transcription factor IIB: distinct interactions with two general transcription factors and RNA polymerase II. Genes Dev 7, 1021-1032.
Hansen, S.K., Takada, S., Jacobson, R.H., Lis, J.T., and Tjian, R. (1997). Transcription properties of a cell type-specific TATA-binding protein, TRF. Cell 91, 71-83.


Heard, D.J., Kiss, T., and Filipowicz, W. (1993). Both Arabidopsis TATA binding protein (TBP) isoforms are functionally identical in RNA polymerase II and III transcription in plant cells: evidence for gene-specific changes in DNA binding specificity of TBP. Embo J 12, 3519-3528.
Heim, M.A., Jakoby, M., Werber, M., Martin, C., Weisshaar, B., and Bailey, P.C. (2003). The Basic Helix-Loop-Helix Transcription Factor Family in Plants: A Genome-wide Study of Protein Structure and Functional Diversity. Mol Biol Evol.
Henry, N.L., Campbell, A.M., Feaver, W.J., Poon, D., Weil, P.A., and Kornberg, R.D. (1994). TFIIF-TAF-RNA polymerase II connection. Genes Dev 8, 2868-2878.
Hernandez-Hernandez, A., and Ferrus, A. (2001). Prodos Is a Conserved Transcriptional Regulator That Interacts with dTAF(II)16 in Drosophila melanogaster. Mol Cell Biol 21, 614-623.
Hisatake, K., Ohta, T., Takada, R., Guermah, M., Horikoshi, M., Nakatani, Y., and Roeder, R.G. (1995). Evolutionary conservation of human TATA-binding-polypeptide-associated factors TAFII31 and TAFII80 and interactions of TAFII80 with other TAFs and with general transcription factors. Proc Natl Acad Sci U S A 92, 8195-8199.
Hoey, T., Dynlacht, B.D., Peterson, M.G., Pugh, B.F., and Tjian, R. (1990). Isolation and characterization of the Drosophila gene encoding the TATA box binding protein, TFIID. Cell 61, 1179-1186.
Hoffmann, A., Chiang, C.M., Oelgeschlager, T., Xie, X., Burley, S.K., Nakatani, Y., and Roeder, R.G. (1996). A histone octamer-like structure within TFIID. Nature 380, 356-359.
Holstege, F.C., Jennings, E.G., Wyrick, J.J., Lee, T.I., Hengartner, C.J., Green, M.R., Golub, T.R., Lander, E.S., and Young, R.A. (1998). Dissecting the regulatory circuitry of a eukaryotic genome. Cell 95, 717-728.
Horvath, A., and Riezman, H. (1994). Rapid protein extraction from Saccharomyces cerevisiae. Yeast 10, 1305-1310.
Huang, X., and Madan, A. (1999). CAP3: A DNA Sequence Assembly Program. Genome Res. 9, 868-877.


Huisinga, K.L., and Pugh, B.F. (2003). A Genome-wide Housekeeping Role for TFIID and Highly Regulated Stress-related Role for SAGA in Saccharomyces cerevisiae. In 22nd Summer Symposium in Molecular Biology: Chromatin Structure and Function (Pennsylvania State University, University Park, PA), pp. 101.
Imbalzano, A.N., Kwon, H., Green, M.R., and Kingston, R.E. (1994). Facilitated binding of TATA-binding protein to nucleosomal DNA. Nature 370, 481-485.
Imhof, A., Yang, X.J., Ogryzko, V.V., Nakatani, Y., Wolffe, A.P., and Ge, H. (1997). Acetylation of general transcription factors by histone acetyltransferases. Curr Biol 7, 689-692.
The Arabidopsis Genome Initiative (2000). Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408, 796-815.
Inostroza, J., Flores, O., and Reinberg, D. (1991). Factors involved in specific transcription by mammalian RNA polymerase II. Purification and functional analysis of general transcription factor IIE. J Biol Chem 266, 9304-9308.
Ito, T., Chiba, T., Ozawa, R., Yoshida, M., Hattori, M., and Sakaki, Y. (2001). A comprehensive two-hybrid analysis to explore the yeast protein interactome. Proc Natl Acad Sci U S A 98, 4569-4574.
Ito, T., Tashiro, K., Muta, S., Ozawa, R., Chiba, T., Nishizawa, M., Yamamoto, K., Kuhara, S., and Sakaki, Y. (2000). Toward a protein-protein interaction map of the budding yeast: A comprehensive system to examine two-hybrid interactions in all possible combinations between the yeast proteins. Proc Natl Acad Sci U S A 97, 1143-1147.
Jacobson, R.H., Ladurner, A.G., King, D.S., and Tjian, R. (2000). Structure and function of a human TAFII250 double bromodomain module. Science 288, 1422-1425.
Jacq, X., Brou, C., Lutz, Y., Davidson, I., Chambon, P., and Tora, L. (1994). Human TAFII30 is present in a distinct TFIID complex and is required for transcriptional activation by the estrogen receptor. Cell 79, 107-117.
Jakoby, M., Weisshaar, B., Droge-Laser, W., Vicente-Carbajosa, J., Tiedemann, J., Kroj, T., and Parcy, F. (2002). bZIP transcription factors in Arabidopsis. Trends Plant Sci 7, 106-111.
Joazeiro, C.A., Kassavetis, G.A., and Geiduschek, E.P. (1994). Identical components of yeast transcription factor IIIB are required and sufficient for transcription of TATA box-containing and TATA-less genes. Mol Cell Biol 14, 2798-2808.


Kamada, K., De Angelis, J., Roeder, R.G., and Burley, S.K. (2001). Crystal structure of the C-terminal domain of the RAP74 subunit of human transcription factor IIF. Proc Natl Acad Sci U S A 98, 3115-3120.
Kambadur, R., Culotta, V., and Hamer, D. (1990). Cloned yeast and mammalian transcription factor TFIID gene products support basal but not activated metallothionein gene transcription. Proc Natl Acad Sci U S A 87, 9168-9172.
Killeen, M.T., and Greenblatt, J.F. (1992). The general transcription factor RAP30 binds to RNA polymerase II and prevents it from binding nonspecifically to DNA. Mol Cell Biol 12, 30-37.
Kim, J.L., and Burley, S.K. (1994). 1.9 A resolution refined structure of TBP recognizing the minor groove of TATAAAAG. Nat Struct Biol 1, 638-653.
Kim, J.L., Nikolov, D.B., and Burley, S.K. (1993a). Co-crystal structure of TBP recognizing the minor groove of a TATA element. Nature 365, 520-527.
Kim, Y., Geiger, J.H., Hahn, S., and Sigler, P.B. (1993b). Crystal structure of a yeast TBP/TATA-box complex. Nature 365, 512-520.
Kimura, H., Tao, Y., Roeder, R.G., and Cook, P.R. (1999). Quantitation of RNA polymerase II and its transcription factors in an HeLa cell: little soluble holoenzyme but significant amounts of polymerases attached to the nuclear substructure. Mol Cell Biol 19, 5383-5392.
Kitajima, S., Chibazakura, T., Yonaha, M., and Yasukochi, Y. (1994). Regulation of the human general transcription initiation factor TFIIF by phosphorylation. J Biol Chem 269, 29970-29977.
Klebanow, E.R., Poon, D., Zhou, S., and Weil, P.A. (1996). Isolation and characterization of TAF25, an essential yeast gene that encodes an RNA polymerase II-specific TATA-binding protein-associated factor. J Biol Chem 271, 13706-13715.
Klein, C., and Struhl, K. (1994). Increased recruitment of TATA-binding protein to the promoter by transcriptional activation domains in vivo. Science 266, 280-282.
Klemm, R.D., Goodrich, J.A., Zhou, S., and Tjian, R. (1995). Molecular cloning and expression of the 32-kDa subunit of human TFIID reveals interactions with VP16 and TFIIB that mediate transcriptional activation. Proc Natl Acad Sci U S A 92, 5788-5792.


Kokubo, T., Gong, D.W., Roeder, R.G., Horikoshi, M., and Nakatani, Y. (1993a). The Drosophila 110-kDa transcription factor TFIID subunit directly interacts with the N-terminal region of the 230-kDa subunit. Proc Natl Acad Sci U S A 90, 5896-5900.
Kokubo, T., Gong, D.W., Wootton, J.C., Horikoshi, M., Roeder, R.G., and Nakatani, Y. (1994). Molecular cloning of Drosophila TFIID subunits. Nature 367, 484-487.
Kokubo, T., Gong, D.W., Yamashita, S., Horikoshi, M., Roeder, R.G., and Nakatani, Y. (1993b). Drosophila 230-kD TFIID subunit, a functional homolog of the human cell cycle gene product, negatively regulates DNA binding of the TATA box-binding subunit of TFIID. Genes Dev 7, 1033-1046.
Kokubo, T., Gong, D.W., Yamashita, S., Takada, R., Roeder, R.G., Horikoshi, M., and Nakatani, Y. (1993c). Molecular cloning, expression, and characterization of the Drosophila 85-kilodalton TFIID subunit. Mol Cell Biol 13, 7859-7863.
Kokubo, T., Swanson, M.J., Nishikawa, J.I., Hinnebusch, A.G., and Nakatani, Y. (1998). The yeast TAF145 inhibitory domain and TFIIA competitively bind to TATA-binding protein. Mol Cell Biol 18, 1003-1012.
Koleske, A.J., and Young, R.A. (1994). An RNA polymerase II holoenzyme responsive to activators. Nature 368, 466-469.
Koleske, A.J., and Young, R.A. (1995). The RNA polymerase II holoenzyme and its implications for gene regulation. Trends Biochem Sci 20, 113-116.
Komarnitsky, P.B., Michel, B., and Buratowski, S. (1999). TFIID-specific yeast TAF40 is essential for the majority of RNA polymerase II-mediated transcription in vivo. Genes Dev 13, 2484-2489.
Kotani, T., Banno, K., Ikura, M., Hinnebusch, A.G., Nakatani, Y., Kawaichi, M., and Kokubo, T. (2000). A role of transcriptional activators as antirepressors for the autoinhibitory activity of TATA box binding of transcription factor IID. Proc Natl Acad Sci U S A 97, 7178-7183.
Kraemer, S.M., Ranallo, R.T., Ogg, R.C., and Stargell, L.A. (2001). TFIIA interacts with TFIID via association with TATA-binding protein and TAF40. Mol Cell Biol 21, 1737-1746.
Laemmli, U.K. (1970). Cleavage of structural proteins during the assembly of the head of bacteriophage T4. Nature 227, 680-685.


Lagrange, T., Hakimi, M.A., Pontier, D., Courtois, F., Alcaraz, J.P., Grunwald, D., Lam, E., and Lerbs-Mache, S. (2003). Transcription factor IIB (TFIIB)-related protein (pBrp), a plant-specific member of the TFIIB-related protein family. Mol Cell Biol 23, 3274-3286.
Lagrange, T., Kapanidis, A.N., Tang, H., Reinberg, D., and Ebright, R.H. (1998). New core promoter element in RNA polymerase II-dependent transcription: sequence-specific DNA binding by transcription factor IIB. Genes Dev 12, 34-44.
Langelier, M.F., Forget, D., Rojas, A., Porlier, Y., Burton, Z.F., and Coulombe, B. (2001). Structural and functional interactions of transcription factor (TF) IIA with TFIIE and TFIIF in transcription initiation by RNA polymerase II. J Biol Chem 276, 38652-38657.
Lavigne, A.C., Mengus, G., May, M., Dubrovskaya, V., Tora, L., Chambon, P., and Davidson, I. (1996). Multiple interactions between hTAFII55 and other TFIID subunits. Requirements for the formation of stable ternary complexes between hTAFII55 and the TATA-binding protein. J Biol Chem 271, 19774-19780.
Lawit, S.J., and Czarnecka-Verner, E. (2002). Histone Deacetylase Complexes: Implications for Plants. Biotechnologia 3, 39-52.
Le Gourrierec, J., Li, Y.F., and Zhou, D.X. (1999). Transcriptional activation by Arabidopsis GT-1 may be through interaction with TFIIA-TBP-TATA complex. Plant J 18, 663-668.
Learned, R.M., Cordes, S., and Tjian, R. (1985). Purification and characterization of a transcription factor that confers promoter specificity to human RNA polymerase I. Mol Cell Biol 5, 1358-1369.
Lee, S., and Hahn, S. (1995). Model for binding of transcription factor TFIIB to the TBP-DNA complex. Nature 376, 609-612.
Lee, T.I., and Young, R.A. (1998). Regulation of gene expression by TBP-associated proteins. Genes Dev 12, 1398-1408.
Leurent, C., Sanders, S., Ruhlmann, C., Mallouh, V., Weil, P.A., Kirschner, D.B., Tora, L., and Schultz, P. (2002). Mapping histone fold TAFs within yeast TFIID. Embo J 21, 3424-3433.
Levine, M., and Tjian, R. (2003). Transcription regulation and animal diversity. Nature 424, 147-151.
Li, X.Y., Bhaumik, S.R., and Green, M.R. (2000). Distinct classes of yeast promoters revealed by differential TAF recruitment. Science 288, 1242-1244.


Li, Y.F., Dubois, F., and Zhou, D.X. (2001). Ectopic expression of TATA box-binding protein induces shoot proliferation in Arabidopsis. FEBS Lett 489, 187-191.
Li, Y.F., Le Gourierrec, J., Torki, M., Kim, Y.J., Guerineau, F., and Zhou, D.X. (1999). Characterization and functional analysis of Arabidopsis TFIIA reveal that the evolutionarily unconserved region of the large subunit has a transcription activation domain. Plant Mol Biol 39, 515-525.
Lin, C.W., Moorefield, B., Payne, J., Aprikian, P., Mitomo, K., and Reeder, R.H. (1996). A novel 66-kilodalton protein complexes with Rrn6, Rrn7, and TATA-binding protein to promote polymerase I transcription initiation in Saccharomyces cerevisiae. Mol Cell Biol 16, 6436-6443.
Liu, D., Ishima, R., Tong, K.I., Bagby, S., Kokubo, T., Muhandiram, D.R., Kay, L.E., Nakatani, Y., and Ikura, M. (1998). Solution structure of a TBP-TAF(II)230 complex: protein mimicry of the minor groove surface of the TATA box unwound by TBP. Cell 94, 573-583.
Luger, K., Mader, A.W., Richmond, R.K., Sargent, D.F., and Richmond, T.J. (1997). Crystal structure of the nucleosome core particle at 2.8 A resolution. Nature 389, 251-260.
Lukashin, A.V., and Borodovsky, M. (1998). GeneMark.hmm: new solutions for gene finding. Nucleic Acids Res 26, 1107-1115.
Majoros, W.H., Pertea, M., Antonescu, C., and Salzberg, S.L. (2003). GlimmerM, Exonomy and Unveil: three ab initio eukaryotic genefinders. Nucleic Acids Res 31, 3601-3604.
Maldonado, E., Ha, I., Cortes, P., Weis, L., and Reinberg, D. (1990). Factors involved in specific transcription by mammalian RNA polymerase II: role of transcription factors IIA, IID, and IIB during formation of a transcription-competent complex. Mol Cell Biol 10, 6335-6347.
Malik, S., Lee, D.K., and Roeder, R.G. (1993). Potential RNA polymerase II-induced interactions of transcription factor TFIIB. Mol Cell Biol 13, 6253-6259.
Margottin, F., Dujardin, G., Gerard, M., Egly, J.M., Huet, J., and Sentenac, A. (1991). Participation of the TATA factor in transcription of the yeast U6 gene by RNA polymerase C. Science 251, 424-426.
Martinez, E., Chiang, C.M., Ge, H., and Roeder, R.G. (1994). TATA-binding protein-associated factor(s) in TFIID function through the initiator to direct basal transcription from a TATA-less class II promoter. Embo J 13, 3115-3126.


Matangkasombut, O., Buratowski, R.M., Swilling, N.W., and Buratowski, S. (2000). Bromodomain factor 1 corresponds to a missing piece of yeast TFIID. Genes Dev 14, 951-962.
Matsui, T., Segall, J., Weil, P.A., and Roeder, R.G. (1980). Multiple factors required for accurate initiation of transcription by purified RNA polymerase II. J Biol Chem 255, 11992-11996.
Maxon, M., and Tjian, R. (1994). Transcriptional Activity of Transcription Factor IIE is Dependent on Zinc Binding. PNAS 91, 9529-9533.
Maxon, M.E., Goodrich, J.A., and Tjian, R. (1994). Transcription factor IIE binds preferentially to RNA polymerase IIa and recruits TFIIH: a model for promoter clearance. Genes Dev 8, 515-524.
McCracken, S., and Greenblatt, J. (1991). Related RNA polymerase-binding regions in human RAP30/74 and Escherichia coli sigma 70. Science 253, 900-902.
Mengus, G., May, M., Jacq, X., Staub, A., Tora, L., Chambon, P., and Davidson, I. (1995). Cloning and characterization of hTAFII18, hTAFII20 and hTAFII28: three subunits of the human transcription factor TFIID. Embo J 14, 1520-1531.
Michel, B., Komarnitsky, P., and Buratowski, S. (1998). Histone-like TAFs are essential for transcription in vivo. Mol Cell 2, 663-673.
Miller, J.H. (1972). Experiments in Molecular Genetics. (Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press).
Miller, J.H. (1992). A Laboratory Manual for Escherichia coli and Related Bacteria: A Short Course in Bacterial Genetics. (Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press).
Miller, T., Krogan, N.J., Dover, J., Erdjument-Bromage, H., Tempst, P., Johnston, M., Greenblatt, J.F., and Shilatifard, A. (2001). COMPASS: a complex of proteins associated with a trithorax-related SET domain protein. Proc Natl Acad Sci U S A 98, 12902-12907.
Mitsiou, D.J., and Stunnenberg, H.G. (2000). TAC, a TBP-sans-TAFs complex containing the unprocessed TFIIA alpha/beta precursor and the TFIIA gamma subunit. Mol Cell 6, 527-537.
Mittal, V., and Hernandez, N. (1997). Role for the amino-terminal region of human TBP in U6 snRNA transcription. Science 275, 1136-1140.

PAGE 253

240 Mukumoto, F., Hirose, S., Imaseki, H., and Yamazaki, K. (1993). DNA sequence requirement of a TATA element-binding protein from Arabidopsis for transcription in vitro Plant Mol Biol 23, 995-1003. Murphy, S., Yoon, J.B., Gerster, T., and Roeder, R.G. (1992). Oct-1 and Oct-2 potentiate functional interact ions of a transcription f actor with the proximal sequence element of small nuclear RNA genes. Mol Cell Biol 12, 3247-3261. Myers, L.C., and Kornberg, R.D. (2000). Mediator of tran scriptional regulation. Annu Rev Biochem 69, 729-749. Naar, A.M., Beaurang, P.A., Zhou, S., Abraham, S., Solomon, W., and Tjian, R. (1999). Composite co-activator ARC mediat es chromatin-directed transcriptional activation. Nature 398, 828-832. Nikolov, D.B., and Burley, S.K. (1994). 2.1 A resolution refi ned structure of a TATA box-binding protein (TBP). Nat Struct Biol 1, 621-637. Nikolov, D.B., Chen, H., Halay, E.D., Usheva, A.A., Hisatake, K., Lee, D.K., Roeder, R.G., and Burley, S.K. (1995). Crystal structure of a TFIIB-TBP-TATA-element ternary complex. Nature 377, 119-128. Nikolov, D.B., Hu, S.H., Lin, J., Gasch, A., Hoffmann, A., Horikoshi, M., Chua, N.H., Roeder, R.G., and Burley, S.K. (1992). Crystal struct ure of TFIID TATAbox binding protein. Nature 360, 40-46. O. Brien, T., and Tjian, R. (1998). Functional analysis of the human TAFII250 Nterminal kinase domain. Mol Cell 1, 905-911. Oelgeschlager, T., Chiang, C.M., and Roeder, R.G. (1996). Topology and reorganization of a human TFIID-promoter complex. Nature 382, 735-738. Ogryzko, V.V., Kotani, T., Zhang, X., Sc hlitz, R.L., Howard, T., Yang, X.J., Howard, B.H., Qin, J., and Nakatani, Y. (1998). Histone-like TAFs within the PCAF histone acetylase complex. Cell 94, 35-44. Ohkuma, Y., Sumimoto, H., Hoffmann, A., Shimasaki, S., Horikoshi, M., and Roeder, R.G. (1991). Structural motifs and pot ential sigma homologies in the large subunit of human general tr anscription factor TFIIE. Nature 354, 398-401. 
Ohkuma, Y., Sumimoto, H., Horikoshi, M., and Roeder, R.G. (1990). Factors involved in specific transcription by ma mmalian RNA polymerase II: purification and characterization of gene ral transcription factor TFIIE. Proc Natl Acad Sci U S A 87, 9163-9167.

PAGE 254

241 Okamoto, T., Yamamoto, S., Watanabe, Y ., Ohta, T., Hanaoka, F., Roeder, R.G., and Ohkuma, Y. (1998). Analysis of the role of TFIIE in transcriptional regulation through structure-f unction studies of the TFIIE beta subunit. J Biol Chem 273, 19866-19876. Okuda, M., Watanabe, Y., Okamura, H., Ha naoka, F., Ohkuma, Y., and Nishimura, Y. (2000). Structure of the central core domain of TFIIE beta with a novel doublestranded DNA-binding surface. Embo J 19, 1346-1356. Orphanides, G., Lagrange, T., and Reinberg, D. (1996). The general transcription factors of RNA polymerase II. Genes Dev 10, 2657-2683. Page, R.D. (1996). TreeView: an app lication to display phyloge netic trees on personal computers. Comput Appl Biosci 12, 357-358. Pan, S., Czarnecka-Verner, E., and Gurley, W.B. (2000). Role of the TATA binding protein-transcription factor IIB intera ction in supporting basal and activated transcription in plant cells. Plant Cell 12, 125-136. Pan, S., Sehnke, P.C., Ferl, R.J., and Gurley, W.B. (1999). Specific interactions with TBP and TFIIB in vitro suggest that 143-3 proteins may participate in the regulation of transcription when pa rt of a DNA binding complex. Plant Cell 11, 1591-1602. Pandey, R., Muller, A., Napoli, C.A., Selinger, D.A., Pikaard, C.S., Richards, E.J., Bender, J., Mount, D.W., and Jorgensen, R.A. (2002). Analysis of histone acetyltransferase and histone deacetylase families of Arabidopsis thaliana suggests functional diversification of chromatin modification among multicellular eukaryotes. Nucleic Acids Res 30, 5036-5055. Parvin, J.D., and Sharp, P.A. (1993). DNA topology and a minimal set of basal factors for transcription by RNA polymerase II. Cell 73, 533-540. Peterson, M.G., Inostroza, J., Maxon, M.E., Flores, O., Admon, A., Reinberg, D., and Tjian, R. (1991). Structure and functiona l properties of human general transcription factor IIE. Nature 354, 369-373. Peterson, M.G., Tanese, N., Pugh, B.F., and Tjian, R. (1990). 
Functional domains and upstream activation properties of clon ed human TATA binding protein. Science 248, 1625-1630. Pham, A.D., and Sauer, F. (2000). Ubiquitin-activati ng/conjugating activity of TAFII250, a mediator of activa tion of gene expression in Drosophila Science 289, 2357-2360.

PAGE 255

242 Pointud, J.-C., Mengus, G., Brancorsini, S., Monaco, L., Parvinen, M., SassoneCorsi, P., and Davidson, I. (2003). The intracellular lo calisation of TAF7L, a paralogue of transcriptio n factor TFIID subunit TAF7, is developmentally regulated during male germ-cell differentiation. J Cell Sci 116, 1847-1858. Ptashne, M. (1988). How eukaryotic transcri ptional activators work. Nature 335, 683689. Pugh, B.F. (2003). Short Talk: Coordination of TFIID and SAGA. In Keystone Symposia: The Enzymology of Chromatin and Transcription (Santa Fe, NM), pp. 16. Pugh, B.F., and Tjian, R. (1990). Mechanism of transcriptional activation by Sp1: evidence for coactivators. Cell 61, 1187-1197. Purnell, B.A., Emanuel, P.A., and Gilmour, D.S. (1994). TFIID sequence recognition of the initiator and sequen ces farther downstream in Drosophila class II genes. Genes Dev 8, 830-842. Qadri, I., Maguire, H.F., and Siddiqui, A. (1995). Hepatitis B virus transactivator protein X interacts with the TATA-bi nding protein. Proc Natl Acad Sci U S A 92, 1003-1007. Qureshi, S.A., Khoo, B., Baumann, P., and Jackson, S.P. (1995). Molecular cloning of the transcription factor TFIIB homolog from Sulfolobus shibatae Proc Natl Acad Sci U S A 92, 6077-6081. Rachez, C., and Freedman, L.P. (2001). Mediator complexes and transcription. Curr Opin Cell Biol 13, 274-280. Rachez, C., Lemon, B.D., Suldan, Z., Br omleigh, V., Gamble, M., Naar, A.M., Erdjument-Bromage, H., Tempst, P., and Freedman, L.P. (1999). Liganddependent transcription activation by nuclear receptors requires the DRIP complex. Nature 398, 824-828. Ranish, J.A., and Hahn, S. (1991). The yeast general tran scription factor TFIIA is composed of two polypeptide subunits. J Biol Chem 266, 19320-19327. Reese, J.C., Apone, L., Walker, S. S., Griffin, L.A., and Green, M.R. (1994). Yeast TAFIIS in a multisubunit complex required for activated transcription. Nature 371, 523-527. Reese, J.C., Zhang, Z., and Kurpad, H. (2000). 
Identification of a yeast transcription factor IID subunit, TSG2/TAF48. J Biol Chem 275, 17391-17398.

PAGE 256

243 Reindl, A., and Schoffl, F. (1998). Interaction between the Arabidopsis thaliana heat shock transcription factor HSF1 and th e TATA binding protein TBP. FEBS Lett 436, 318-322. Remboutsika, E., Jacq, X., and Tora, L. (2001). Chromatin is permissive to TBPmediated transcription initiation. J Biol Chem Accepted Manuscript, 22. Riechmann, J.L., Heard, J., Martin, G., Reuber, L., Jiang, C., Keddie, J., Adam, L., Pineda, O., Ratcliffe, O.J., Samaha, R.R ., Creelman, R., Pilgrim, M., Broun, P., Zhang, J.Z., Ghandehari, D., Sherman, B.K., and Yu, G. (2000). Arabidopsis transcription factors: genome-w ide comparative analysis among eukaryotes. Science 290, 2105-2110. Riechmann, J.L., and Ratcliffe, O.J. (2000). A genomic perspective on plant transcription factors. Curr Opin Plant Biol 3, 423-434. Robert, F., Forget, D., Li, J., Greenblatt, J., and Coulombe, B. (1996). Localization of subunits of transcription factors IIE and IIF immediately upstream of the transcriptional initiat ion site of the adenovirus majo r late promoter. J Biol Chem 271, 8517-8520. Roberts, S.M., and Winston, F. (1996). SPT20/ADA5 encodes a novel protein functionally related to the TATA-binding protein and im portant for transcription in Saccharomyces cerevisiae Mol Cell Biol 16, 3206-3213. Rossignol, M., Keriel, A., Staub, A., and Egly, J.M. (1999). Kinase activity and phosphorylation of the largest subunit of TF IIF transcription factor. J Biol Chem 274, 22387-22392. Rowlands, T., Baumann, P., and Jackson, S.P. (1994). The TATA-binding protein: a general transcription fa ctor in eukaryotes a nd archaebacteria. Science 264, 13261329. Ruppert, S., and Tjian, R. (1995). Human TAFII250 in teracts with RAP74: implications for RNA polymerase II initiation. Genes Dev 9, 2747-2755. Ruppert, S., Wang, E.H., and Tjian, R. (1993). Cloning and expression of human TAFII250: a TBP-associated factor impli cated in cell-cycle regulation. Nature 362, 175-179. Ryu, S., Zhou, S., Ladurner, A.G., and Tjian, R. 
(1999). The transcriptional cofactor complex CRSP is required for activity of the enhancer-binding protein Sp1. Nature 397, 446-450.

PAGE 257

244 Sadowski, C.L., Henry, R.W., Lo bo, S.M., and Hernandez, N. (1993). Targeting TBP to a non-TATA box cis-regulatory element: a TBP-containing complex activates transcription from snRNA promot ers through the PSE. Genes Dev 7, 1535-1548. Saleh, A., Lang, V., Cook, R., and Brandl, C.J. (1997). Identification of native complexes containing the yeast coactivat or/repressor prot eins NGG1/ADA3 and ADA2. J Biol Chem 272, 5571-5578. Samuels, M., Fire, A., and Sharp, P.A. (1982). Separation and characterization of factors mediating accurate transcrip tion by RNA polymerase II. J Biol Chem 257, 14419-14427. Sanders, S.L., Klebanow, E.R., and Weil, P.A. (1999). TAF25p, a non-histone-like subunit of TFIID and SAGA complexes, is essential for total mRNA gene transcription in vivo. J Biol Chem 274, 18847-18850. Sanders, S.L., and Weil, P.A. (2000). Identification of tw o novel TAF subunits of the yeast Saccharomyces cerevisiae TFIID complex. J Biol Chem 275, 13895-13900. Schlueter, S.D., Dong, Q., and Brendel, V. (2003). GeneSeqer@PlantGDB: Gene structure prediction in plant genomes. Nucleic Acids Res 31, 3597-3600. Selleck, W., Howley, R., Fang, Q., Podoln y, V., Fried, M.G., Buratowski, S., and Tan, S. (2001). A histone fold TAF oc tamer within the yeast TFIID transcriptional coactivator. Nat Struct Biol 8, 695-700. Smale, S.T., and Kadonaga, J.T. (2003). The RNA polymerase II core promoter. Annual Review of Biochemistry 72, 449-479. Solow, S., Salunek, M., Ryan, R., and Lieberman, P.M. (2001). Taf(II) 250 phosphorylates human transcription factor IIA on serine residues important for TBP binding and transcription activity. J Biol Chem 276, 15886-15892. Steffan, J.S., Keys, D.A., Dodd, J.A., and Nomura, M. (1996). The role of TBP in rDNA transcription by RNA polymerase I in Saccharomyces cerevisiae : TBP is required for upstream activation factor-dep endent recruitment of core factor. Genes Dev 10, 2551-2563. 
Sterner, D.E., Grant, P.A., Roberts, S.M., Duggan, L.J., Belotserkovskaya, R., Pacella, L.A., Winston, F., Workman, J.L., and Berger, S.L. (1999). Functional organization of the yeast SAGA complex: distinct components involved in structural integrity, nucleosome acetylation, and TATA-binding protein interaction. Mol Cell Biol 19, 86-98.

PAGE 258

245 Stockinger, E.J., Mao, Y., Regier, M.K., Triezenberg, S.J., and Thomashow, M.F. (2001). Transcriptional adaptor and hi stone acetyltransferase proteins in Arabidopsis and their interactions with CBF1, a transcriptional activator involved in cold-regulated gene expression. Nucleic Acids Res 29, 1524-1533. Struhl, K., and Moqtaderi, Z. (1998). The TAFs in the HAT. Cell 94, 1-4. Sumimoto, H., Ohkuma, Y., Sinn, E., Kato, H., Shimasaki, S., Horikoshi, M., and Roeder, R.G. (1991). Conserved sequence motifs in the small subunit of human general transcription factor TFIIE. Nature 354, 401-404. Sun, X., Ma, D., Sheldon, M., Yeung, K., and Reinberg, D. (1994). Reconstitution of human TFIIA activity from recombinant pol ypeptides: a role in TFIID-mediated transcription. Genes Dev 8, 2336-2348. Swofford, D.L. (2003). PAUP*. Phylogene tic Analysis Using Parsimony (*and Other Methods) (Sunderland, Massachuset ts: Sinauer Associates). Takada, R., Nakatani, Y., Hoffmann, A., Kokubo, T., Hasegawa, S., Roeder, R.G., and Horikoshi, M. (1992). Identification of human TFIID components and direct interaction between a 25 0-kDa polypeptide and the TATA box-binding protein (TFIID tau). Proc Natl Acad Sci U S A 89, 11809-11813. Takeda, Y., Hirokawa, H., and Yamazaki, K. (1994). Bending of DNA in solution caused by a protein from Arabidopsis that binds to a TATA element. Biosci Biotechnol Biochem 58, 916-920. Tamada, Y., Nakamori, K., Matsuda, K., Furumoto, T., and Izui, K. (2003). Characterization of TAF10, a general tran scription factor, in plants. In 7th International Congress of Plant Molecu lar Biology (Barcelona, SP), pp. S05-22. Tan, S., Conaway, R.C., and Conaway, J.W. (1995). Dissection of transcription factor TFIIF functional domains required for in itiation and elongation. Proc Natl Acad Sci U S A 92, 6042-6046. Tang, H., Sun, X., Reinberg, D., and Ebright, R.H. (1996). 
Protein-prot ein interactions in eukaryotic transcription initiation: st ructure of the preinitiation complex. Proc Natl Acad Sci U S A 93, 1119-1124. Tao, Y., Guermah, M., Martinez, E., Oelges chlager, T., Hasegawa, S., Takada, R., Yamamoto, T., Horikoshi, M., and Roeder, R.G. (1997). Specific interactions and potential functions of human TAFII100. J Biol Chem 272, 6714-6721. Thompson, J.D., Gibson, T.J., Plewniak, F., Jeanmougin, F., and Higgins, D.G. (1997). The ClustalX windows interface: flex ible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res 25, 4876-4882.

PAGE 259

246 Tjian, R., and Maniatis, T. (1994). Transcriptional activation: a complex puzzle with few easy pieces. Cell 77, 5-8. Toldo, L.I. (1997). JaMBW 1.1: Java-based Molecu lar Biologists' Workbench. Comput Appl Biosci 13, 475-476. Toledo-Ortiz, G., Huq, E., and Quail, P.H. (2003). The Arabidopsis basic/helix-loophelix transcription factor family. Plant Cell 15, 1749-1770. Tora, L. (2002). A unified nomenclature fo r TATA box binding protein (TBP)associated factors (TAFs) involved in RNA polymerase II transcription. Genes Dev 16, 673-675. Uetz, P., Giot, L., Cagney, G., Mansfi eld, T.A., Judson, R.S., Knight, J.R., Lockshon, D., Narayan, V., Srinivasan M., Pochart, P., Qureshi-Emili, A., Li, Y., Godwin, B., Conover, D., Kalb fleisch, T., Vijayadamodar, G., Yang, M., Johnston, M., Fields, S., and Rothberg, J.M. (2000). A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae Nature 403, 623-627. Upadhyaya, A.B., Lee, S.H., and DeJong, J. (1999). Identification of a general transcription factor TFIIA alpha/beta homolog selectivel y expressed in testis. J Biol Chem 274, 18040-18048. Verrijzer, C.P., Yokomori, K., Chen, J.L., and Tjian, R. (1994). Drosophila TAFII150: similarity to yeast gene TSM1 and specific binding to core promoter DNA. Science 264, 933-941. Vlachonasios, K.E., Thomashow, M.F., and Triezenberg, S.J. (2003). Disruption mutations of ADA2b and GCN5 transcripti onal adaptor genes dramatically affect Arabidopsis growth, development, and gene expression. Plant Cell 15, 626-638. Wade, P.A., and Jaehning, J.A. (1996). Transcriptional corepression in vitro : a Mot1passociated form of TATA-binding prot ein is required for repression by Leu3p. Mol Cell Biol 16, 1641-1648. Washburn, K.B., Davis, E.A., and Ackerman, S. (1997). Coactivators and TAFs of transcription activation in wheat. Plant Mol Biol 35, 1037-1043. Wassarman, D.A., and Sauer, F. (2001 Aug). TAF(II)250: a transcription toolbox. J Cell Sci 114, 2895-2902. 
Weinzierl, R.O., Dynlacht, B.D., and Tjian, R. (1993a). Largest subunit of Drosophila transcription factor IID directs assemb ly of a complex containing TBP and a coactivator. Nature 362, 511-517.

PAGE 260

247 Weinzierl, R.O., Ruppert, S., Dynlacht B.D., Tanese, N., and Tjian, R. (1993b). Cloning and expression of Drosophila TAFII60 and human TAFII70 reveal conserved interactions with ot her subunits of TFIID. Embo J 12, 5303-5309. Werten, S., Mitschler, A., Romier, C., Gang loff, Y.G., Thuault, S., Davidson, I., and Moras, D. (2002). Crystal structure of a s ubcomplex of human transcription factor TFIID formed by TATA bindi ng protein-associated factors hTAF4 (hTAF(II)135) and hTAF12 (hTAF(II)20). J Biol Chem 277, 45502-45509. Wieczorek, E., Brand, M., Jacq, X., and Tora, L. (1998). Function of TAF(II)containing complex without TBP in tran scription by RNA polymerase II. Nature 393, 187-191. Wolffe, A.P., and Guschin, D. (2000). Chromatin structural features and targets that regulate transcripti on. J Struct Biol 129, 102-122. Workman, J.L., and Kingston, R.E. (1998). Alteration of nuc leosome structure as a mechanism of transcriptional regulat ion. Annual Review of Biochemistry 67, 545-579. Workman, J.L., Taylor, I.C., and Kingston, R.E. (1991). Activation domains of stably bound GAL4 derivatives alleviate repressi on of promoters by nucleosomes. Cell 64, 533-544. Wullschleger, S.D., Jansson, S., and Taylor, G. (2002). Genomics and Forest Biology: Populus Emerges as the Perennial Favorite. Plant Cell 14, 2651-2655. Xenarios, I., Salwinski, L., Duan, X .J., Higney, P., Kim, S.M., and Eisenberg, D. (2002). DIP, the Database of Interacti ng Proteins: a resear ch tool for studying cellular networks of protein in teractions. Nucleic Acids Res 30, 303-305. Xiao, H., Tao, Y., and Roeder, R.G. (1999). The human homologue of Drosophila TRF-proximal protein is associated w ith an RNA polymerase II-SRB complex. J Biol Chem 274, 3937-3940. Xie, X., Kokubo, T., Cohen, S.L., Mirza, U.A., Hoffmann, A., Chait, B.T., Roeder, R.G., Nakatani, Y., and Burley, S.K. (1996). Structural similarity between TAFs and the heterotetrameric core of the histone octamer. Nature 380, 316-322. 
Yamashita, S., Hisatake, K., Kokubo, T., Do i, K., Roeder, R.G., Horikoshi, M., and Nakatani, Y. (1993). Transcription factor TFIIB sites important for interaction with promoter-bound TFIID. Science 261, 463-466. Yatherajam, G., Zhang, L., Kraemer, S.M., and Stargell, L.A. (2003). Protein-protein interaction map for yeast TFIID. Nucleic Acids Res 31, 1252-1260.

PAGE 261

248 Yokomori, K., Admon, A., Goodri ch, J.A., Chen, J.L., and Tjian, R. (1993a). Drosophila TFIIA-L is processed into two subunits that are associated with the TBP/TAF complex. Genes Dev 7, 2235-2245. Yokomori, K., Chen, J.L., Admon, A., Zhou, S., and Tjian, R. (1993b). Molecular cloning and characterization of dTAFII 30 alpha and dTAFII30 beta: two small subunits of Drosophila TFIID. Genes Dev 7, 2587-2597. Yokomori, K., Verrijzer, C.P., and Tjian, R. (1998). An interplay between TATA boxbinding protein and transcription factor s IIE and IIA modulates DNA binding and transcription. Proc Natl Acad Sci U S A 95, 6722-6727. Yokomori, K., Zeidler, M.P., Chen, J.L., V errijzer, C.P., Mlodzik, M., and Tjian, R. (1994). Drosophila TFIIA directs cooperative DNA binding with TBP and mediates transcriptiona l activation. Genes Dev 8, 2313-2323. Yoon, J.B., and Roeder, R.G. (1996). Cloning of two proximal sequence elementbinding transcription factor subunits (gamma and delta ) that are required for transcription of small nuclear RNA ge nes by RNA polymerases II and III and interact with the TATA-bi nding protein. Mol Cell Biol 16, 1-9. Zhou, Q., Lieberman, P.M., Boyer, T.G., and Berk, A.J. (1992). Holo-TFIID supports transcriptional stimulation by diverse ac tivators and from a TATA-less promoter. Genes Dev 6, 1964-1974. Zomerdijk, J.C., Beckmann, H., Comai, L., and Tjian, R. (1994). Assembly of transcriptionally active RNA polymerase I initiation factor SL1 from recombinant subunits. Science 266, 2015-2018.

PAGE 262

BIOGRAPHICAL SKETCH

Shai Joshua Lawit was born October 5, 1976 in Daytona Beach, FL to Steven A. Lawit and Donna B. Lawit. He graduated magna cum laude from Spruce Creek High School, Port Orange, FL with an International Baccalaureate Diploma in 1995. He enrolled at the University of Florida in August 1995 with his new bride Kristel. In August 1998, he graduated first in his College of Agriculture class with Highest Honors. He received a Bachelor of Science degree majoring in microbiology and cell science, with a minor in chemistry. Thereafter, he immediately enrolled in the Plant Molecular and Cellular Biology Doctor of Philosophy program.


Permanent Link: http://ufdc.ufl.edu/UFE0002362/00001

Material Information

Title: Protein-Protein Interaction Map of the Arabidopsis thaliana General Transcription Factors A, B, D, E, and F
Physical Description: Mixed Material
Copyright Date: 2008

Record Information

Source Institution: University of Florida
Holding Location: University of Florida
Rights Management: All rights reserved by the source institution and holding location.
System ID: UFE0002362:00001

PROTEIN-PROTEIN INTERACTION MAP OF Arabidopsis thaliana GENERAL
TRANSCRIPTION FACTORS A, B, D, E, AND F

By

SHAI JOSHUA LAWIT


A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA


2003

Copyright 2003

by

Shai Joshua Lawit

This document is dedicated to my family, both genetic and scientific.

ACKNOWLEDGMENTS

I thank my dearest wife, Kristel Lynn, for her undying support for me. I also thank
my son, Benjamin Owen, for constant interest in this document as I wrote and unending
smiles and hugs at all times. I thank my parents and all of my family for instilling me
with a desire for education and excellence. Of course, I have a great appreciation for the
members of the Gurley lab (past, present, and future) who continually contribute to this
field of research. John Davis and the members of his lab (especially Chris Dervinis for
helping me to get access to the Poplar genomic sequences and Ram Kishore Alavalapati
for running all the PAUP analyses) deserve special thanks for technical assistance,
collaboration, and helpful discussions. I would finally like to thank William B. Gurley,
Eva Czarnecka-Verner, Robert Ferl, Alice Harmon, Karen Koch, Donald McCarty,
Thomas Yang, Robert R. Schmidt, Waltraud I. Dunn, and the entire teaching faculty who
have molded me into the scientist that I am.

TABLE OF CONTENTS
Page

ACKNOWLEDGMENTS ..... iv

LIST OF TABLES ..... viii

LIST OF FIGURES ..... ix

ABSTRACT ..... xii

CHAPTER

1 INTRODUCTION TO THE LITERATURE ..... 1

   General Transcription Factors ..... 1
      TATA Binding Protein and TFIID ..... 3
      TATA Binding Protein-Associated Factors ..... 6
         Histone-like TAFs ..... 6
         TAF1 family ..... 8
         Other TAFs and interactions of TFIID ..... 13
      Alternative TBP- or TAF-Containing Complexes ..... 17
      TAFs: Required Factors or Optional Accessories ..... 23
      Interplay of GTFs ..... 26
   Transcriptional Activators That Bind DNA ..... 36

2 PHYLOGENETIC ANALYSIS OF POPLAR, Arabidopsis AND OTHER PLANT
  GENERAL TRANSCRIPTION FACTORS ..... 51

   Introduction ..... 51
   Methods ..... 53
   Results ..... 56
      TFIIA Large and Small Subunits ..... 56
      TFIIB Family ..... 57
      Representative TFIID Components ..... 58
      TFIIEα and TFIIEβ Subunits ..... 59
      TFIIFα and TFIIFβ Subunits ..... 60
   Discussion ..... 60
      TFIIA Large and Small Subunits ..... 60
      TFIIB Family ..... 62
      Representative TFIID Components ..... 65
      TFIIEα and TFIIEβ Subunits ..... 68
      TFIIF Family ..... 68
      TFIIF Family ..... 69

3 BINARY PROTEIN-PROTEIN INTERACTIONS OF THE Arabidopsis thaliana
  GENERAL TRANSCRIPTION FACTOR IID ..... 89

   Introduction ..... 89
   Materials and Methods ..... 90
   Results ..... 96
   Discussion ..... 98

4 BINARY PROTEIN-PROTEIN INTERACTIONS OF Arabidopsis TFIIA, TFIIB,
  TFIID, TFIIE, AND TFIIF ..... 118

   Introduction ..... 118
   Materials and Methods ..... 119
   Results ..... 121
   Discussion ..... 123

5 DISCUSSION ..... 147

   TFIIA Large and Small Subunits ..... 147
   TFIIB Family ..... 149
   TFIID Components ..... 152
   TFIIEα and TFIIEβ Subunits ..... 154
   TFIIFα and TFIIFβ Subunits ..... 155
   Conclusion ..... 157

APPENDIX

A NUCLEOTIDE AND AMINO ACID SEQUENCES OF GENERAL
  TRANSCRIPTION FACTORS ..... 161

   TFIIA Small Subunit Sequences ..... 161
   TFIIA Large Subunit Sequences ..... 163
   TFIIB Family Sequences ..... 165
   TATA Binding Protein Sequences ..... 172
   TAF6 Sequences ..... 176
   TAF9 Sequences ..... 179
   TAF10 Sequences ..... 182
   TAF11 Sequences ..... 185
   TFIIEα Sequences ..... 186
   TFIIEβ Sequences ..... 189
   TFIIFα Sequences ..... 192
   TFIIFβ Sequences ..... 194

B AMINO ACID MULTIPLE SEQUENCE ALIGNMENTS FOR CORE DOMAINS
  OF THE GENERAL TRANSCRIPTION FACTORS ..... 197

   TFIIA Small Subunit Alignment ..... 197
   TFIIA Large Subunit Alignment ..... 198
   TFIIB Family Alignment ..... 200
   TBP Alignment ..... 205
   TAF6 Alignment ..... 208
   TAF9 Alignment ..... 211
   TAF10 Alignment ..... 214
   TAF11 Alignment ..... 215
   TFIIEα Alignment ..... 216
   TFIIEβ Alignment ..... 219
   TFIIFα Alignment ..... 222
   TFIIFβ Alignment ..... 223

LIST OF REFERENCES ..... 226

BIOGRAPHICAL SKETCH ..... 249

LIST OF TABLES


Table p

1-1. TATA binding protein-associated factors of the TFIID complex. ......................39

1-2. Protein-protein interactions of TFIID in Homo sapiens, Drosophila melanogaster,
and Saccharomyces cerevisiae with corresponding references ...........................41

1-3. Protein-protein interactions between TFIIA, TFIIB, TFIID, TFIIE, and TFIIF
subunits in Homo sapiens, Drosophila melanogaster, and Saccharomyces
cerevisiae with corresponding references................... ...... ......... .......46

2-1. Arabidopsis GTF genes, loci, genomic sizes, coding sequence sizes (counting stop
codons), predicted protein molecular weights, and pi of the predicted proteins. ..72

2-2: Similarity and identity percentage ranges of the GTF protein families examined.74

3-1. Primers for amplification of TBP and TAF-like cDNAs and cloning into
pENTR/D-Topo or pDONR207 vectors. .....................................................105

3-2. Primers for cloning of TAF 12 N-terminal, middle, and C-terminal fragments... 107

3-3. Arabidopsis thaliana TFIID subunit cDNA GenBank accession numbers. ....... 108

3-4. A yeast two-hybrid targeted protein-protein interaction matrix between subunits
of the Arabidopsis thaliana TFIID complex............................ ... .......... 113

4-1. Primers for amplification of cDNAs to Arabidopsis homologs of TFIIA, TFIIB,
TFIIE, and TFIIF cloning into the pENTR/D-Topo vector..............................129

4-2. Primers for cloning of TFIIEα2 N-terminal and C-terminal fragments.............131

4-3. Arabidopsis thaliana TFIIA, TFIIB, TFIIE, and TFIIF component cDNA
GenBank accession numbers. ................................................ 132

4-4. A yeast two-hybrid targeted protein-protein interaction matrix between
components of Arabidopsis thaliana TFIIA, TFIIB, TFIIE, and TFIIF with
subunits of the TFIID complex ................................................ 137
















LIST OF FIGURES


Figure page

1-1. The "two-step handoff" model of removal of auto-inhibition of TFIID by the
TAF1 N-terminal domains TAND1 and TAND2 (T1 and T2, respectively). .......40

1-2. Binary protein-protein interactions of the Homo sapiens general transcription
factors TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and their homologs.......................48

1-3. Binary protein-protein interactions of Drosophila melanogaster general
transcription factors TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and their homologs....49

2-1: Unrooted phylogram of TFIIA small subunit proteins from plants, humans, fruit
flies, and yeast ................................................ 75

2-2: Unrooted phylogram of TFIIA large subunit proteins from plants, humans, fruit
flies, and yeast ................................................ 76

2-3: Unrooted phylogram of TFIIB-related proteins from plants, humans, fruit flies,
yeast, and Archaea. ................................................ 77

2-4: Unrooted phylogram of TBP-related proteins from plants, humans, fruit flies,
yeast, and Archaea. ................................................ 78

2-5: Unrooted phylogram of TAF6-related proteins from plants, humans, fruit
flies, and yeast ................................................ 79

2-6: Unrooted phylogram of TAF9-related proteins from plants, humans, fruit
flies, and yeast ................................................ 80

2-7: Unrooted phylogram of TAF10-related proteins from plants, humans, fruit
flies, and yeast ................................................ 81

2-8: Unrooted phylogram of TAF11-related proteins from plants, humans, fruit
flies, and yeast ................................................ 82

2-9: Unrooted phylogram of TFIIEα-related proteins from plants, humans, fruit flies,
yeast, and Archaea. ................................................ 83

2-10: Unrooted phylogram of TFIIEβ-related proteins from plants, humans, fruit
flies, and yeast ................................................ 84









2-11: Unrooted phylogram of TFIIFα-related proteins from plants, humans, fruit
flies, and yeast ................................................ 85

2-12: Unrooted phylogram of TFIIFβ-related proteins from plants, humans, fruit
flies, and yeast ................................................ 86

2-13. Multiple sequence alignment of the TFIIB region containing the conserved lysine
residue that is acetylated in human and yeast TFIIB (in green). .........................87

2-14: Exon-Intron diagrams of Arabidopsis TAF6 and TAF6b alternative
splicing forms ................................................ 88

3-1. Histogram of percent of matings, per bait construct, that yielded
colony growth. ................................................ 109

3-2. Histogram of percent of matings, per prey construct, that yielded
colony growth. ................................................ 110

3-3. Immunoblots of TFIID components expressed as bait fusion proteins in
MaV204K. ................................................ 111

3-4. Immunoblots of TFIID components expressed as prey fusion
proteins in AH109 ................................................ 112

3-5. Colorimetric assays of the β-galactosidase reporter levels in yeast diploids
containing both bait and prey plasmids. ................................................ 114

3-6. Protein-protein interactions of Arabidopsis thaliana TFIID subunits as determined
by yeast two-hybrid and β-galactosidase confirmations.................................... 116

3-7. Protein-protein interactions of Arabidopsis thaliana TFIID subunits as determined
by yeast two-hybrid and β-galactosidase confirmations.................................... 117

4-1. Histogram of percent of matings, per bait construct, that yielded
colony growth. ................................................ 133

4-2. Histogram of percent of matings, per prey construct, that yielded
colony growth. ................................................ 134

4-3. Immunoblots of TFIIA, TFIIB, TFIIE, and TFIIF components expressed as bait
fusion proteins in MaV204K ................................................ 135

4-4. Immunoblots of TFIIA, TFIIB, TFIIE, and TFIIF components expressed as bait
fusion proteins in AH109 ................................................ 136

4-5. Colorimetric assays of the β-galactosidase reporter levels in yeast diploids
containing both bait and prey plasmids. ................................................ 138









4-6. Protein-protein interactions of Arabidopsis thaliana TFIIA subunits with
components of TFIIB, TFIID, TFIIE, TFIIF, and other TFIIA components as
determined by yeast two-hybrid and β-galactosidase confirmations.................140

4-7. Protein-protein interactions of Arabidopsis thaliana TFIIB homologs with
components of TFIIA, TFIID, TFIIE, TFIIF, and other homologs of TFIIB as
determined by yeast two-hybrid and β-galactosidase confirmations.................141

4-8. Protein-protein interactions of Arabidopsis thaliana TFIID components with
components of TFIIA, TFIIB, TFIIE, and TFIIF as determined by yeast two-
hybrid and β-galactosidase confirmations. ................................................ 142

4-9. Protein-protein interactions of Arabidopsis thaliana TFIIE subunits with
components of TFIIA, TFIIB, TFIID, TFIIF, and other TFIIE components as
determined by yeast two-hybrid and β-galactosidase confirmations.................143

4-10. Protein-protein interactions of Arabidopsis thaliana TFIIF subunits with
components of TFIIA, TFIIB, TFIID, TFIIE, and other TFIIF components as
determined by yeast two-hybrid and β-galactosidase confirmations.................144

4-11. Protein-protein interactions of Arabidopsis thaliana TFIIA, TFIIB, TFIIE, and
TFIIF with each other and subunits of TFIID as determined by yeast two-hybrid
and β-galactosidase confirmations.............................................................145

4-12. Strong protein-protein interactions of Arabidopsis thaliana TFIIA, TFIIB, TFIIE,
and TFIIF with each other and subunits of TFIID as determined by yeast two-
hybrid and β-galactosidase confirmations. ................................................ 146

5-1. Protein-protein interactions among TFIIA, TFIIB, TFIIE, TFIID, and TFIIF that
are unique to Arabidopsis thaliana as determined by yeast two-hybrid and
β-galactosidase confirmations. ................................................ 158

5-2. Protein-protein interactions of Arabidopsis thaliana TFIIA, TFIIB, TFIID, TFIIE,
and TFIIF that have been reported previously for Homo sapiens, Drosophila
melanogaster, and/or Saccharomyces cerevisiae homologs............................ 159

5-3. Interactions of Homo sapiens, Drosophila melanogaster, and/or Saccharomyces
cerevisiae TFIIA, TFIIB, TFIID, TFIIE or TFIIF that were not confirmed for
Arabidopsis thaliana homologs ................................................ 160















Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy

PROTEIN-PROTEIN INTERACTION MAP OF THE Arabidopsis thaliana GENERAL
TRANSCRIPTION FACTORS A, B, D, E, AND F

By

Shai Joshua Lawit

December 2003

Chair: William B. Gurley
Cochair: Eva Czarnecka-Verner
Major Program: Plant Molecular and Cellular Biology

General transcription factor IID (TFIID) is a protein complex central to the

nucleation of the deoxyribonucleic acid (DNA) dependent ribonucleic acid (RNA)

polymerase II (PolII) preinitiation complex (PIC) and is critical to the transcriptional

activation of many genes. The presence of TFIID at a promoter leads to recruitment of

the other general transcription factors (GTFs) TFIIA, TFIIB, TFIIE, TFIIH, and TFIIF (in

association with PolII). While GTFs have been heavily studied in metazoans and yeast,

little is known about their functions in the plant kingdom. Recent studies of selected

GTF proteins in plants have uncovered possible plant-specific and developmental roles,

suggesting that some GTF proteins have evolved different functions since the last

common ancestor of plants, animals, and fungi.

The specific objectives for characterization of the GTFs from Arabidopsis were to

identify the GTF proteins and uncover their binary interactions. A number of genes for

putative GTFs were identified by homology-based searches. These newly identified









genes added to the two previously known TATA-binding protein (TBP) genes, three

genes encoding subunits of TFIIA, and two TFIIB genes. Of these genes, 16 encoded

TBP-associated factor-like proteins (TAFs), and 14 encoded putative components of

TFIIA, TFIIB, TFIIE, and TFIIF in Arabidopsis. Many of their complementary DNAs

(cDNAs) were cloned using reverse transcriptase-mediated polymerase chain reaction

(PCR). Often, these clones were the first confirmation of messenger RNAs for their

respective genes.

The cDNAs of these Arabidopsis GTF genes have been subcloned into yeast two-

hybrid bait and prey vectors, and transformed into yeast MATa and MATα strains,

respectively. Using a targeted interaction scheme, 1598 interactions were tested.

Interactions that yielded colony growth in the yeast two-hybrid system were verified

using β-galactosidase assays. A map of binary protein-protein interactions between the

subunits of Arabidopsis TFIIA, TFIIB, TFIID, TFIIE, and TFIIF was constructed. Of the

112 interactions, 36.4% were protein interactions that were previously characterized in

other systems. However, 63.6% (112) of the interactions were novel. This is the first

comprehensive protein-protein interaction map for TFIIA, TFIIB, TFIID, TFIIE and

TFIIF and has elucidated new PIC nucleation pathways (i.e. a TAF8-TAF10

heterotetramer with extensive protein contacts).
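
The percentages reported above can be cross-checked with simple arithmetic. The sketch below is illustrative only: it assumes that the 112 novel interactions correspond to the stated 63.6% share, and the variable names are hypothetical rather than taken from the study.

```python
# Cross-check of the interaction percentages reported in the abstract.
# Assumption (not stated explicitly in the text): the 112 novel
# interactions make up the 63.6% share of all positive interactions.
novel = 112
novel_fraction = 0.636
tested = 1598  # pairwise combinations tested in the targeted scheme

implied_total = round(novel / novel_fraction)    # positives implied by the share
previously_known = implied_total - novel         # also reported in other systems
known_pct = 100 * previously_known / implied_total
hit_rate_pct = 100 * implied_total / tested      # positives per tested pair

print(implied_total, previously_known, round(known_pct, 1), round(hit_rate_pct, 1))
```

Under that assumption, the previously characterized share works out to 36.4%, which agrees with the figure given in the text.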














CHAPTER 1
INTRODUCTION TO THE LITERATURE

General Transcription Factors

The central dogma of molecular biology defines the flow of genetic information as

directed from deoxyribonucleic acid (DNA), to ribonucleic acid (RNA), and ultimately to

protein. In eukaryotes, DNA-dependent RNA polymerase II (PolII) is responsible for the

transcription of some small nuclear RNAs and all messenger RNAs (mRNAs, the only

RNAs that are translated into protein) (Burley and Roeder, 1996). Initiation of transcription

by PolII is a complex process and requires interactions of many proteins that comprise

the general transcription factors (GTFs) TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIIH

(Matsui et al., 1980; Samuels et al., 1982; Albright and Tjian, 2000). In the first step,

TFIID nucleates the assembly of the GTFs at the TATA element of PolII promoters. This

is achieved in part by sequence specific recognition of DNA by TATA binding protein

(TBP).

Two predominant models describe PolII assembly at a promoter. The stepwise

model states that GTFs are sequentially recruited to a promoter in a predetermined order

(TFIID, TFIIB, TFIIA, TFIIF with PolII, TFIIE, and TFIIH) (Buratowski et al., 1989;

Flores et al., 1992; Koleske and Young, 1995). The second model is called the

holoenzyme model because TFIIB, TFIIE, TFIIF, TFIIH, Mediator (a compilation of

suppressor of RNA polymerase B proteins, and numerous other proteins), and PolII are

pre-assembled as a multi-protein megadalton (MDa) complex before recruitment to the

promoter (Koleske and Young, 1994, 1995). Both models require either TBP or TFIID to









nucleate the assembly of the preinitiation complex (PIC) at the promoter as a first step.

In total, the PolII PIC is composed of at least 48 protein subunits: 12 PolII; 18-20 TFIID,

including multiple copies of some TBP-associated factor (TAF) subunits; one TFIIB;

four TFIIF (α2β2) (Flores et al., 1990); four TFIIE (α2β2) (Ohkuma et al., 1990); and 9

TFIIH.
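
The subunit total quoted above can be reproduced by summing the listed components. The following is a minimal sketch, assuming only the components and counts enumerated in this paragraph; the dictionary name and layout are illustrative.

```python
# Subunit counts per PIC component as listed in the text.
# TFIID is given as a range (18-20), so each entry is stored as (min, max).
pic_subunits = {
    "PolII": (12, 12),
    "TFIID": (18, 20),
    "TFIIB": (1, 1),
    "TFIIF": (4, 4),   # alpha2-beta2 heterotetramer
    "TFIIE": (4, 4),   # alpha2-beta2 heterotetramer
    "TFIIH": (9, 9),
}

minimum = sum(low for low, _ in pic_subunits.values())
maximum = sum(high for _, high in pic_subunits.values())

print(minimum, maximum)
```

The lower bound of the sum matches the "at least 48" figure; the upper bound reflects the 20-subunit estimate for TFIID.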

In recent years, the holoenzyme model has gained preference in the transcription-

oriented community. Evidence in support of the holoenzyme model includes the

isolation of the large holoenzyme complex from a variety of species; and that artificial

recruitment of holoenzyme subunits (out of order according to the stepwise model) to a

promoter leads to high levels of transcription (Koleske and Young, 1994; Berk, 1999).

Furthermore, early purifications of PolII lacking some members of the holoenzyme can

be accounted for by low abundance relative to PolII and unfavorable protein-protein

interaction conditions in the purification schemes (Koleske and Young, 1994; Berk,

1999). However, arguing against the importance of the holo-form of the enzyme is the

finding that quantitative immunoblots of HeLa extracts demonstrated that only 3% of

soluble PolII was present in a holoenzyme-sized complex (Kimura et al., 1999).

The stepwise model implies that multiple assembly steps are involved in formation

of the PIC. Conversely, the holoenzyme model suggests that there are only two major

regulatory steps in PIC formation: recruitment of either TFIID or holoenzyme to the

promoter and subsequent recruitment of the reciprocal complex through protein-protein

interaction (Koleske and Young, 1995). In either case, recognition of a promoter by

TFIID is certainly a prominent regulatory step in the initiation of transcription by PolII.









TATA Binding Protein and TFIID

The saddle-shaped TBP constitutes the core of the TFIID complex. This

evolutionarily ancient protein is essential for recognition of promoters by all three

eukaryotic RNA polymerases, as well as the archaeal RNA polymerase (Rowlands et al.,

1994). This was reviewed by Buratowski (1994); and Burley and Roeder (1996).

Promoter specificity for DNA-dependent RNA polymerase I (PolI), which transcribes

ribosomal DNA, is determined by selectivity factors. These selectivity factors are SL1 in

humans (Learned et al., 1985), TIF-IB in mice (Clos et al., 1986), and CF in yeast (Lin et

al., 1996). These selectivity factors contain TBP and three TBP-associated factors

(TAFIs): TAFI48, TAFI63, and TAFI110 (Comai et al., 1994; Zomerdijk et al., 1994).

The TAFIs contact the consensus elements in ribosomal DNA core promoters and are

vital to PolI transcription (Learned et al., 1985; Clos et al., 1986). In addition, TAFIs

contact a transcriptional activator that enhances Selectivity factor 1 (SL1) binding and

increases PolI transcription (Beckmann et al., 1995; Steffan et al., 1996).

RNA polymerase III is responsible for the transcription of transfer RNAs, 5S

ribosomal RNA, small ribonuclear protein RNAs, and small nuclear RNAs (snRNAs;

which in some cases are also transcribed by PolII) (Carmo-Fonseca et al., 2000). In

mammals, TBP plus the PolIII selectivity-factors PTF/SNAPc (proximal-sequence

element-binding transcription-factor/small nuclear RNA-activating protein complex)

recognize the snRNA promoters (Murphy et al., 1992; Sadowski et al., 1993).

PTF/SNAPc is composed of four subunits that bind core snRNA promoter elements and

stabilize TBP binding at these promoters (Yoon and Roeder, 1996; Mittal and Hernandez,

1997). In yeast, two essential TAFIIIs bind TBP, forming the TFIIIB complex, which









functions as an snRNA promoter selectivity factor (Margottin et al., 1991; Joazeiro et al.,

1994).

In addition to its role in PolI- and PolIII-mediated gene expression, TBP also

serves as a promoter selectivity factor for PolII within the TFIID complex. In early

experiments, TBP was thought to be the sole component of TFIID. These experiments

showed that TBP alone could form the foundation of the PIC in vitro (Peterson et al.,

1990). TBP is a saddle-shaped protein with near symmetry (Nikolov et al., 1992). The

concave surface of TBP interacts with the minor groove of the DNA TATA-element; and

the convex surface is left open to protein-protein interactions (Chasman et al., 1993; Kim

et al., 1993a; Kim et al., 1993b; Albright and Tjian, 2000). The C-terminal stirrup of

TBP interacts with the 5'-end of the TATA element and TFIIB, which facilitates the

directionality of the TATA interaction and transcription (Nikolov et al., 1995). The

combined interaction of TBP and TFIIB with TATA leads to an 80° bend in the promoter

(Kim et al., 1993b) that is believed to lead to a wrapping of DNA from -60 to +40

around PolII (Coulombe, 1999; Coulombe and Burton, 1999).

Later experiments demonstrated that TBP alone could not initiate activator-

dependent transcription with some transactivators (Hoey et al., 1990; Kambadur et al.,

1990; Peterson et al., 1990; Pugh and Tjian, 1990; Burley and Roeder, 1996). Moreover,

it was found that TBP binding to DNA is not required at TATA-less promoters that

contain an Initiator element (Zhou et al., 1992; Martinez et al., 1994). These data

suggested that TFIID might be composed of TBP and accessory factors (TAFIIs, referred

to as TAFs from here forward). When Dynlacht et al. (1991) found peptides associated

with TBP that provided coactivator function, it was realized that TFIID was composed of









more than just TBP. It is now confirmed that TBP and 8 to 14 TAFs comprise TFIID

(Dynlacht et al., 1991; Sanders and Weil, 2000). Therefore, TBP recruitment (often a

limiting step of PolII transcription) (Klein and Struhl, 1994; Chatterjee and Struhl, 1995)

can be achieved by targeted recruitment of TAFs to a promoter. Furthermore, some of the

TAFs (in addition to TBP) form the promoter specific DNA interacting surface of TFIID,

probably increasing the stability of a promoter-TFIID interaction (Martinez et al., 1994;

Burke and Kadonaga, 1997; Chalkley and Verrijzer, 1999). Since SL1, PTF/SNAPc,

and TFIID all rely on TBP, they must be formed by mutual exclusion of the various TAF

protein-protein interactions with TBP. This model has been experimentally demonstrated

in that TAFIs exclude TAF2 and TAF1 binding to TBP, and vice versa (Comai et al.,

1994).

Arabidopsis TBP has been well characterized and indeed was the first TBP

structure from any organism to be elucidated by X-ray crystallography (Nikolov et al.,

1992). Many subsequent papers described the structure of Arabidopsis TBP in complex

with the TATA-box, TFIIA, and TFIIB and various combinations thereof (Kim et al.,

1993a; Kim and Burley, 1994; Nikolov and Burley, 1994; Nikolov et al., 1995). The

Arabidopsis TBP was utilized in these studies not because of an interest in the

mechanisms of plant gene regulation, but rather because it closely approximates a TBP

core structure, lacking the long N-terminal extensions found in metazoans. Nonetheless,

the structure of the Arabidopsis TBP, its DNA-sequence recognition sites (Mukumoto et

al., 1993), bending of the TATA element (Takeda et al., 1994), interactions with

transcriptional regulators (Qadri et al., 1995; Reindl and Schoffl, 1998; Le Gourrierec et









al., 1999; Pan et al., 1999), and its effect on growth of Arabidopsis when overexpressed

(Li et al., 2001) have been well characterized.

TATA Binding Protein-Associated Factors

TAFs may have a wide range of activities (such as coactivation, repression, and

even protein modification) that are thus bestowed on the TFIID complex (Albright and

Tjian, 2000). It is now accepted that the TFIID complex contains at least 14 TAF

subunits. Some of these subunits have enzymatic activity, while many do not have

apparent catalytic properties. Most of the TAFs are in stoichiometric unity, while some

are in multiple copies in a complex, and others are substoichiometric. These proteins

have been studied extensively, and this work will be discussed below.

Histone-like TAFs

It is believed that a structural core of TFIID is composed of the histone-like

TAFs. Yeast TAF6 (yTAF6), yTAF9, and yTAF12 have similarity with histones H4, H3,

and H2B, respectively. Additionally, evidence supports a theory that yTAF4 has

structural similarity with the histone H2A, forming a heterodimer with yTAF12

(Gangloff et al., 2000; Sanders and Weil, 2000). Humans contain two distinct TAF4

homologs: TAF4 and TAF4b. TAF4b appears to be a likely target of activators

responsible for transcription of genes in B-cells, specifically the κ-promoter or enhancer

(Dikstein et al., 1996b). Interestingly, TAF4b protein levels are post-transcriptionally

regulated such as to reduce the protein levels in non-B-cells to below detection limits

(Dikstein et al., 1996b). The crystal structure of Drosophila TAF9 and 6 (dTAF9 and 6)

displays histone-fold domains (HFDs) that interact with one another, forming a dTAF9-

dTAF6 dimer (Xie et al., 1996). TAF dimerization is critical to TFIID formation because









TAF HFDs are predicted to not properly fold when the binding partner is lacking (Burley

and Roeder, 1996).

It has been hypothesized that some of the histone-like TAFs bind DNA and form a

structure similar to a nucleosome, perhaps in conjunction with the TBP-TFIIB bending of

DNA (Coulombe, 1999). This hypothesis was supported by data showing that TFIID

introduces a negative supercoil in bound DNA, much like a nucleosome (Oelgeschlager

et al., 1996). However, TAFs do not contain the arginine residues found in histone tails

that bind the minor groove of DNA (Luger et al., 1997; Workman and Kingston, 1998;

Albright and Tjian, 2000). Structural evidence also refutes a nucleosome-

like organization. Human TFIID is composed of four dimensionally equivalent domains,

none of which are large enough to contain the histone-fold TAF-octamer (Brand et al.,

1999). Despite this evidence, DNase I footprinting experiments suggest that DNA is in

some way wound around TFIID, approximately one turn as opposed to the two turns

characteristic of nucleosomes (Michel et al., 1998). Furthermore, dTAF6 and 9 contact a

conserved downstream promoter element (DPE) in photo-crosslinking experiments

(Burke and Kadonaga, 1997). Selleck and colleagues (Selleck et al., 2001) demonstrated

that bacterially expressed, yeast histone-like TAFs 4, 6, 9, and 12 can self-assemble into a

TAF octamer in a 2:2:2:2 ratio in vitro. Overexpression of these yeast histone-like TAFs

individually has the capacity to suppress mutations in the other members of the octamer

(Selleck et al., 2001). Additionally, by co-immunoprecipitation, TAF4 and TAF4b have

been shown to be present within the same TFIID complex, indicating that there are likely

two copies of TAF4 in TFIID (Dikstein et al., 1996b). This evidence constitutes strong

support for a histone-like octamer in TFIID, or some variation thereof. Irrespective of









DNA binding, multiple, possibly interchangeable, histone-fold proteins within TFIID

suggest that this complex has a flexible composition modulated in response to cellular

signals (Albright and Tjian, 2000).

More recently, it has been realized that other TAFs contain conserved histone-fold

domains, such as TAF3, TAF8, TAF10, TAF11, and TAF13 (Leurent et al., 2002). In

addition to the TAF6-TAF9 association, a number of TAF heterodimers containing HFDs

have been shown to form, including human TAF11 (hTAF11)-hTAF13 (Birck et al.,

1998), human and yeast TAF4-TAF12 (Gangloff et al., 2000; Reese et al., 2000; Sanders

and Weil, 2000), TAF3-TAF10 (Gangloff et al., 2001b), and TAF8-TAF10 (Gangloff et

al., 2001b). TAFs 11 and 13 are unique in that they are a histone-fold binding-pair that is

specific to the TFIID complex (Grant et al., 1998). However, it is believed that they are

structural and functional orthologs of the Spt3 subunit of the Spt-Ada-Gcn5-

acetyltransferase (SAGA) complex (Apone et al., 1998; Birck et al., 1998).
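
The histone-fold heterodimers named above can be collected into a simple partner map, which makes it easy to see which TAFs have been reported with more than one histone-fold partner. This is a minimal sketch: the pair list is transcribed from the preceding paragraph (pooling pairs reported in different organisms), and the variable names are illustrative.

```python
from collections import defaultdict

# Histone-fold heterodimers named in the text, as unordered pairs.
hfd_pairs = [
    ("TAF6", "TAF9"),
    ("TAF11", "TAF13"),
    ("TAF4", "TAF12"),
    ("TAF3", "TAF10"),
    ("TAF8", "TAF10"),
]

# Build a symmetric map from each TAF to its reported partners.
partners = defaultdict(set)
for a, b in hfd_pairs:
    partners[a].add(b)
    partners[b].add(a)

# TAFs reported with more than one histone-fold partner.
shared = {taf: sorted(p) for taf, p in partners.items() if len(p) > 1}
print(shared)
```

In this list only TAF10 appears in two pairs (with TAF3 and with TAF8); every other TAF has a single listed partner.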

The full complement of putative histone-like TAFs has been identified in

Arabidopsis thaliana, with the exception of TAF3 (Table 1-1). Several of the putative

Arabidopsis TAFs are members of two-gene families. These include the TAF1/1b,

TAF4/4b, TAF6/6b, TAF11/11b, TAF12/12b, TAF14/14b, and TAF15/15b genes.

TAF1 family

Proteins in the TAF1 family represent the largest TAFs in animal and plant cells,

and have a bevy of biochemical roles including histone acetyltransferase (HAT) activity

and protein kinase activity (Takada et al., 1992; Lee and Young, 1998). The protein

kinase phosphorylates the large subunit of TFIIF (α or RAP74), but ultimately has an

unknown downstream function (Dikstein et al., 1996a; O'Brien and Tjian, 1998). Yeast









TAF1 can also phosphorylate TFIIA, apparently raising levels of transcription and

increasing the affinity for TBP (Solow et al., 2001). The TAF1 family also has a recently

identified ubiquitin conjugating activity (ubac); however, the utility of this function is yet

to be elucidated (Pham and Sauer, 2000). The substrate of the ubac activity is histone

H1, which it monoubiquitylates. Histone H1 monoubiquitylation may lead to

transcriptional activation, since monoubiquitylation of histones H2A and H2B has been

correlated with actively transcribed genes (Davie and Murphy, 1990).

Generally, TBP binding to the TATA box and PIC formation are inhibited by

nucleosomes, suggesting that a condensed chromatin state directly inhibits transcription

(Workman et al., 1991; Imbalzano et al., 1994; Workman and Kingston, 1998).

Furthermore, a correlation between hyperacetylated (open) chromatin and regions that are

transcriptionally active leads to the hypothesis that HAT activity functions in the

remodeling of chromatin to activate transcription (Ayer, 1999; Grant and Berger, 1999;

Wolffe and Guschin, 2000). The TAF1 family HAT activity implies that histone

acetylation is important at the core promoter to aid transcription factor/chromatin

contacts (Workman and Kingston, 1998). Furthermore, it is known that human TAF1

also acetylates TFIIFα and the β subunit of TFIIE (Imhof et al., 1997). This activity is

known as Factor Acetyltransferase (FAT) activity; however, the significance of FAT

activity is unclear (Grant and Berger, 1999). The yTAF1 FAT/HAT might acetylate

histones near the core promoter, basal transcription factors, or other unknown protein

targets (Jacobson et al., 2000). FAT/HAT and kinase activities may function in

conjunction as a signal transduction cascade targeting GTFs and histones and ultimately

result in gene activation (Albright and Tjian, 2000; Jacobson et al., 2000). Two putative









Arabidopsis TAF1 homologs (AtTAF1 and AtTAF1b) also appear to have these

conserved domains responsible for the activities described above (HAT/FAT, protein

kinase, and ubiquitin conjugating activity; E. Czarnecka-Verner and W.B. Gurley;

unpublished data).

At the C-terminus, human and Drosophila TAF1 contain two tandem

bromodomains that are known to bind acetylated lysine residues of histone H4 (Jacobson

et al., 2000). The first bromodomain shown to bind acetylated histones (H3 and H4 NH2-

terminal peptide) was from p300/CBP-associated factor (PCAF) (Dhalluin et al., 1999).

A later binding study with hTAF1 found that the double bromodomain motif has an

affinity for acetylated H4 peptide 70-fold greater than the single domain of PCAF

(Dhalluin et al., 1999; Jacobson et al., 2000). Acetylation of four histone H4 lysines is

correlated with transcriptional activity, raising the possibility that the role of the hTAF1

bromodomain(s) is to bind to the acetyl-lysines, thereby facilitating histone modification

by the HAT domain (Jacobson et al., 2000). Interestingly, yeast TAF1 is lacking

bromodomains and a C-terminal kinase domain; however, a separate protein,

bromodomain factor 1 (Bdf1), was identified that contains two bromodomains and a

kinase domain, and is found in association with TFIID (Matangkasombut et al., 2000).

Due to these structural and functional similarities to the C-terminus of hTAF1, Bdf1 is

hypothesized to be functionally analogous to the hTAF1 C-terminus.

In Arabidopsis, TAF1 and TAF1b each have a single bromodomain (W.B. Gurley,

unpublished data), and thus are hypothesized to have a limited affinity for acetylated

histone H4, perhaps as much as 70-fold lower affinity than hTAF1 (Jacobson et al.,

2000). Drawing on the parallel with yeast, it is hypothesized that Arabidopsis may also









express a protein analogous to Bdf1, supplementing the AtTAF1 and AtTAF1b single

bromodomains. However, analyses of Arabidopsis genomic sequence did not reveal a

Bdf1-analogous protein.

Another property of the TAF1 family is a capacity to auto-inhibit TFIID function

(Kokubo et al., 1993b). This regulatory property resides in the N-terminal domain

(TAND1) that acts as a TATA-element minor-groove mimic. Inhibition is achieved by

competition between the TAND1 and the TATA-box for binding to the concave surface

of TBP (Liu et al., 1998; Kotani et al., 2000). While the TAND1-TBP interaction may

add to the stability of the TFIID complex, it is not required for TFIID integrity (Albright

and Tjian, 2000). A TAND1-adjacent domain TAND2 appears to stabilize the TAND1-

TBP interaction by binding the helix H2 on the convex face of TBP (Burley and Roeder,

1998). It has been shown through domain swapping that there is a functional

conservation between the TAND1 and acidic activation domains of transactivator

proteins such as VP16 (Kotani et al., 2000). In domain-swapping experiments, acidic

activator domains are capable of TFIID inhibition when translationally fused to yTAF1;

conversely, TAND1 is capable of serving as an activator when fused to a DNA-binding

domain (Kotani et al., 2000). This leads to the "two-step handoff" model, in which the

auto-inhibition caused by TAND1 is competed by acidic activators. Subsequently, the

TAND2 interaction with the TBP convex surface is competed by TFIIA, ultimately

leading to a cooperative removal of the inhibitory region of TAF1 from TBP (Figure 1-1)

(Kotani et al., 2000). In this model, the TFIIA-TFIID-acidic activator intermediate

allows some TAFs to bind near the transcriptional initiation site (TAF2 binds Initiator,

TAF6/TAF9 dimer binds DPE) and leads to the TATA-box displacing the acidic activator









from the concave surface of TBP (Kotani et al., 2000). However, this model may not

prove useful in humans or Drosophila where the TBP-TAND1 affinity is much higher

than that of yeast (Kotani et al., 2000). A study using HeLa heat-treated chromatin

demonstrated that TFIID had less capacity for transcriptional initiation than TBP alone,

possibly due to the TAND1-based inhibition of TBP in the absence of functional

transcriptional activators (Remboutsika et al., 2001).

The genomic region encoding A. thaliana TAF lb is lacking a conserved sequence

corresponding to the yeast and metazoan TBP-inhibiting TAND1 and TAND2. On the

other hand, AtTAF1 appears to contain a TAND1 and TAND2 (E. Czarnecka-Verner and

W.B. Gurley, unpublished data). AtTAF1b seems to be the only example of a TAF1

homolog lacking a TAND1/2, possibly altering the transcriptional activation

characteristics of AtTFIID.

Interestingly, in a microarray experiment utilizing a temperature-sensitive (TS)

mutant yeast, only 16% of PolII promoters showed dependence on yTAF1 (Holstege et

al., 1998). Many of the genes that were down-regulated upon temperature shift were cell

cycle and DNA repair genes (not unexpected given the original identification of yTAF1

as a cyclin). The surprise is in the low percentage of yTAF1-dependent genes. This

apparent low dependence on yTAF1 can be explained by the overlap in function between

TFIID and other TAF-containing complexes (such as SAGA) that do not require yTAF1.

Alternatively, the yTAF1 TS mutation could be leaky in its disruption of function, as seen

for other TS TAF mutations, thereby underrepresenting the magnitude of the yTAF1 contribution

to gene regulation in these experiments (Michel et al., 1998).









Other TAFs and interactions of TFIID

Many other TAFs are known, but for the most part their functions are quite

nebulous. The metazoan TAF2 family represents one exception in that it has been shown

to bind the core promoter Initiator element. This was shown directly by DNase I

footprinting and electrophoretic mobility shift assays (EMSA) with recombinant

Drosophila TAF2 (Albright and Tjian, 2000). However, it seems unlikely that this

should be the only function of TAF2 due to its large size (~150 kDa). For example, TBP

binds a specific DNA sequence with only 20% of the mass of TAF2. With so many

properties attributed to the TAF1 family, it would be intellectually satisfying to find other

roles for TAF2. Interestingly, consensus sequences for plant and fungal Initiator

elements have not been defined and it is unknown whether the TAF2 proteins in these

kingdoms bind to core promoter elements.

Other than TAF1, the only other stoichiometric TAF with enzymatic activity is

TAF7 (personal communication, M. Horikoshi). TAF7 from humans and yeast (but not

Drosophila or Arabidopsis) has similarity to the von Willebrand factor type A domain

(VWA). Interestingly, the majority of VWA-containing proteins are extracellular.

However, very ancient VWA-containing proteins (found in all eukaryotes) are

intracellular proteins involved in various cellular tasks such as transcription, DNA repair,

ribosomal and membrane transport, and the proteasome protein degradation pathway

(Colombatti et al., 1993). One feature common to these proteins is the formation of

multiprotein complexes (Colombatti et al., 1993). It is important to note that yTAF7

most closely resembles the ATPases associated with diverse cellular activities (AAA)

ATPase family of VWA-containing proteins. A yeast two-hybrid experiment from the

lab of Laurie Stargell recently demonstrated that of all the TAFs, only yeast TAF7 was









directly associated with TBP (Yatherajam et al., 2003). This, taken together with the fact

that the TAND1 domain of TAF1 binds to the concave surface of TBP, inhibiting TBP

dimerization and promoter binding, may suggest that a role of some TAFs is to prevent

and dissociate nonproductive TBP interactions. Gegonne et al. (2001) demonstrated that

hTAF7 bound hTAF1 and inhibited the FAT/HAT activity. Therefore, I propose an

alternative function for the putative TAF7 ATPase activity: shutting off

the TAF1 FAT/HAT activity by acting as a chaperone protein. Regardless of the veracity

of these models, the developing TAF7 story is of great interest.

Other classical TAFs have no recognized role beside protein-protein or protein-

DNA interactions. However, it is thought that these interactions may stabilize the PIC

(Burley and Roeder, 1996). The recent work of Yatherajam et al. (2003) elaborated the

binary protein-protein interactions within the TFIID complex of yeast. These interactions

and others for TFIID from humans, Drosophila and yeast are assembled in Table 1-2. It

is significant that a large number of the binary interactions of yeast TFIID are nucleated

by five histone-fold TAFs (TAFs 4, 6, 9, 10, and 12), four of which (TAFs 4, 6, 9, and

12) are proposed members of a nucleosome-like octamer (Yatherajam et al., 2003). In

addition, a large number of protein-protein interactions occur between TAFs and other

GTFs, as will be discussed in detail below.

Human TAF10, which has affinity for the estrogen receptor and thus may play a

role in estrogen-dependent activation, is found in only a subset of cellular TFIID

complexes (Jacq et al., 1994). Nevertheless, a TS mutation of yTAF10 was shown to

impede bulk transcription of mRNA and destabilize TFIID and SAGA upon temperature

shift (Sanders et al., 1999). These results suggest a more general role for yTAF10 than









for its human homolog hTAF10. Another example of a TFIID specialization is seen with

hTAF13 (which binds hTAF11; see above). The human TAF11-TAF13 pair may be an

alternative to hTAF10 because it is found only in TFIID complexes lacking hTAF10

(Jacq et al., 1994; Mengus et al., 1995).

Several TAFs (beside the TAF2, TAF6, TAF9 families mentioned above) contact

promoter DNA and these include hTAFs 1, 4, 5, and 7 (Oelgeschlager et al., 1996).

These interactions aid in the stabilization of promoter-TFIID interactions. TBP

interactions with the TATA box have similar affinity to TAF interactions with promoter

DNA (Purnell et al., 1994); therefore, TAF-promoter binding is critical for transcription

from TATA-less promoters (Bell and Tora, 1999). Thus, it is hypothesized that some

TAFs function in recruiting TFIID to TATA-less promoters, participating in promoter

recognition by TFIID (Bell and Tora, 1999).

The TAF15 family of TAFs is a group of proto-oncoproteins whose genes are common sites

of chromosome translocations in human sarcomas (Bertolotti et al., 1996; Attwooll et al.,

1999). These are hTAF15, TLS/FUS (translocated in liposarcoma/fusion of CHOP) and

EWS (Ewing sarcoma), and all are RNA-binding proteins with high similarity to the RNA-

binding domain (RNP-CS). TAF15 binds not only RNA but also single-stranded DNA

(ssDNA) (Bertolotti et al., 1996). Like TAF15, TLS/FUS and EWS associate with TFIID

in a mutually exclusive manner (Bertolotti et al., 1996; Bertolotti et al., 1998). TAF15

and EWS contact exactly the same subunits of TFIID (Bertolotti et al., 1998), suggesting

that EWS (and possibly TLS/FUS) are TAF15b (and TAF15c) proteins. Similarly to

TAF14, TAF15 and EWS are also associated with another core transcription complex, in

this case PolII (Bertolotti et al., 1998). TAF15 and EWS contacted the hRPB3 subunit









(Bertolotti et al., 1998). However, only TAF15 contacted hRPB5 and hRPB7 (Bertolotti

et al., 1998).

Recent work (presented at the 22nd Summer Symposium in Molecular

Biology: Chromatin Structure and Function) from the laboratory of Stephen Buratowski,

in which a large-scale purification of TFIID was performed from yeast cells, has

demonstrated sub-stoichiometric association of four ubiquitin machinery proteins with

TFIID (Auty et al., 2003). Although these proteins (BRE5, BUL1, UBP3, and UBP8) are

found in many other complexes, they may be operationally defined as TAFs due to their

association in TFIID. Yeast TAF1 does not have a demonstrated ubiquitylation activity,

nor the domains associated with this activity (Wassarman and Sauer, 2001), as

reported for Drosophila TAF1 (Pham and Sauer, 2000). Perhaps the two ubiquitin-

conjugases are in some way adopting this role in yeast. Recent evidence from the

Drosophila melanogaster genome-wide protein-protein interaction study shows a

ubiquitin conjugase interaction with TAF10 (Giot et al., 2003), suggesting that the

presence of ubiquitylation machinery in TFIID is conserved (and not due to a

complementation of an activity that yeast TAF1 is lacking).

Proteins involved in ubiquitylation (such as the E2-conjugases) could be involved

in activation of transcription and lead to degradation of inhibitory proteins (i.e., histones)

or may result in a rapid turn over of transcriptional activators, which would be critical for

shutting off a promoter after the activation triggers are removed. However, the role of

the two ubiquitin-hydrolases is more mysterious. It has recently been suggested by

Shelly Berger (2003) that histone H2B ubiquitylation, followed rapidly by de-

ubiquitylation, is required for histone H3 methylation on K4 and K36 resulting in gene









activation. If this is the case, the presence of both ubiquitylation and de-ubiquitylation

enzymes in one complex may be mechanistically linked for gene activation.

Little information is available about the TFIID complex in plants. A crude

preparation of TFIID has been purified from wheat germ, which appears to be a stable

complex similar to that from metazoans (Washburn et al., 1997). Unfortunately, this

purified complex was sparse, not homogeneous, and did not lead to the identification of

any subunits.

Some information on TFIID subunits from plants is beginning to become available.

In Arabidopsis, TAF10 was found to interact with a ubiquitin conjugase as tested by

yeast two-hybrid screen (S.J. Lawit, P. Michaluk, E. Czarnecka-Verner, W.B. Gurley,

unpublished results), suggesting that plant TFIID complexes also include ubiquitylation

machinery. Also, Tamada et al. (2003) have shown that the Arabidopsis TAF10 is

transcriptionally regulated to a high degree: it is highly expressed in developing

tissues, but expressed below detection levels in mature tissues. Along this line, TAF10 is

not expressed in non-reproductive tissues following bolting of the inflorescences of

Arabidopsis. Such a close tie to development is consistent with biochemical information

on human TAF10 that is present in a subset of TFIID complexes and interacts with the

estrogen receptor, potentially playing a part in development.

Alternative TBP- or TAF-Containing Complexes

Several protein complexes other than TFIID contain TBP or TAFs, blurring the

lines of what can be considered general transcription factors. At least four types of

coactivators interact with TBP and display promoter selection properties (for review see

Lee and Young, 1998). These are TAFIs, TAFIIs (TAFs), TAFIIIs, and PTF/SNAPc.

Other coactivators/corepressors/GTFs such as SAGA, Mot1, NC2, Nots, and TFIIA,









along with TAFs, play important roles in regulating expression of mRNA (Lee and

Young, 1998; Mitsiou and Stunnenberg, 2000). The SAGA complex does not copurify

with TBP; however, multiple SAGA subunits do bind TBP individually (Spt3, Ada2,

Ada5/Spt20) and may recruit it to a promoter (Eisenmann et al., 1992; Barlev et al.,

1995; Roberts and Winston, 1996; Saleh et al., 1997; Sterner et al., 1999). Interestingly,

western blots have demonstrated that TBP in yeast is around ten-fold more abundant than

TAFs, SAGA, BTAF1 (Mot1), NC2, and Nots (Lee and Young, 1998). This is consistent

with the observations that TBP may be a component of a variety of protein complexes.

One alternate TFIID complex, B-TFIID, is found in yeast and humans. Human B-

TFIID is capable of nucleating basal transcription, but is unresponsive to activators much

like TBP alone (Chang and Jaehning, 1997). B-TFIID functions as a global

transcriptional co-repressor and was initially believed to contain several core TAFs

(Wade and Jaehning, 1996; Chang and Jaehning, 1997). Later it was established that B-

TFIID consisted of TBP and at least one TAF (BTAF1, or Mot1), but not the full

complement of TAFs (Auble et al., 1994; Wade and Jaehning, 1996). BTAF1 is an

essential protein in yeast and affects only a subset of the organismal transcriptome in

microarray studies, possibly through promoter recruitment (Wade and Jaehning, 1996), or

adenosine triphosphate (ATP)-dependent release of the rate limiting TBP from high

affinity TATA elements for use at lower affinity promoters (Collart, 1996). This second

model is more likely since BTAF1 seems to function through dissociating TBP from

DNA in an ATP-dependent manner (Lee and Young, 1998).

Other alternative TAF complexes are hTFTC, hPCAF, ySAGA, human SPT3-

TAFII31-GCN5-L acetyltransferase (hSTAGA), ySLIK (yeast SAGA-like), and









ySALSA (yeast SAGA altered Spt8 absent), etc. These complexes are quite similar in

structure and function (Struhl and Moqtaderi, 1998). Significantly, none of these

complexes contain either TBP or TAF1; however, each complex contains HAT activity

and a subset of TAFs (only four to five of nine histone-fold motif TAFs) (Bell and Tora,

1999). Interestingly, some well-characterized histone-binding partners are replaced in

SAGA and other alternative complexes. Examples include hTAF11-hTAF13 partnering

being replaced by an intramolecular Spt3 histone-fold pairing, and yTAF12 being paired

with Ada1 (Birck et al., 1998; Gangloff et al., 2000).

Outside of TFIID, perhaps the most recognized TAF-containing complex is the

yeast SAGA and its human counterpart STAGA (Green, 2000). SAGA is a 1.8-2.0

MDa complex containing TAFs, ubiquitin-machinery proteins, Spt, Ada

(alteration/deficiency in activation), and Gcn5 subunits (Grant et al., 1998; Lee and

Young, 1998; Grant and Berger, 1999; Berger, 2003). The TAFs that copurify with

SAGA are TAF5 (WD40 domain), 6 (H4-like), 9 (H3-like), 10 (histone-fold domain), 12

(H2B-like), and 13 (histone-fold domain) (Grant et al., 1998; Grant and Berger, 1999).

The presence of ubiquitin-machinery proteins in SAGA, as in TFIID (another TAF-containing

complex), further supports a functional relationship between TAFs and ubiquitylation. Yeast

Gcn5 (a HAT) is additionally found in other large protein complexes containing Ada

proteins (Grant and Berger, 1999).

Yeast SAGA is somewhat redundant with other complexes including TFIID,

SWI/SNF (Switch/Sucrose non-fermenting, a chromatin remodeling complex), and

suppressor of RNA polymerase B (SRB)/Mediators (Grant and Berger, 1999). A

temperature-sensitive mutation of yTAF9, a member of both TFIID and SAGA, tested by









a microarray experiment demonstrated that 67% of PolII genes require yTAF9 (Apone et

al., 1998; Holstege et al., 1998). A TS mutation in yTAF10 (also a member of the same

complexes) showed that it was also required for bulk mRNA expression (Sanders et al.,

1999). Similarly, a TS mutation in yTAF11 (a member of TFIID only) also decreased

PolII transcription to background levels when tested under the nonpermissive temperature

(Komarnitsky et al., 1999). However, a similarly tested Gcn5 mutation showed that Gcn5 was required by

only 5% of PolII-transcribed genes (Holstege et al., 1998). An even more dramatic

demonstration of the functional redundancy of SAGA is the fact that a deletion of Spt20

causing complete loss of SAGA does not create an apparent deficiency in transcription as

monitored by total mRNA levels (Berk, 1999). Interestingly, mutants of Ada1 and Spt20

individually had more dramatic phenotypes than mutants with inactivated or eliminated

Gcn5, suggesting that HAT activity may not be the major role of SAGA (Struhl and

Moqtaderi, 1998). While SAGA function may not be requisite for viability, the

association of TAFs and other transcription regulators may convey the potential for

SAGA to regulate at many different promoters (Grant et al., 1998).

Recent work by Pugh and co-workers (Huisinga and Pugh, 2003; Pugh, 2003) has

further elaborated the role of SAGA (and the related SLIK and SALSA complexes).

Microarray analysis of Gcn5 deletion mutants demonstrated that 10% of the genome was

activated by these Gcn5 HAT complexes, whereas a strict TAF1 TS mutant demonstrated

a non-overlapping 90% dependence on TFIID (Huisinga and Pugh, 2003). Interestingly,

46% of the Gcn5-dependent genes are stress inducible, thus nearly 1/3 of all stress-

inducible genes in yeast are dependent upon SAGA for activation (Huisinga and Pugh,

2003). Pugh's group has also discovered that the 90% of the promoters regulated by









TFIID are TATA-less, while those that are SAGA-dependent contain TATA-boxes

(Pugh, 2003). This is explained most easily by TFIID being able to correctly position

TBP at TATA-less promoters by having a firmly incorporated TBP and being able to read

other contextual cues in a promoter (i.e., the DPE and Initiator element, if present). On

the other hand, SAGA may not be able to read such core promoter elements since it lacks

several TAFs involved in promoter recognition. This suggests that SAGA may be able to

recruit TBP, but not position it correctly.

In recent years evidence has arisen that suggests a SAGA-like complex exists in

Arabidopsis. The presence of a SAGA-like complex suggests that AtTAFs may interact

with alternative complexes, as TAFs do in other organisms (Stockinger et al., 2001;

Vlachonasios et al., 2003). Vlachonasios et al. (2003) in a series of microarray

experiments demonstrated that Arabidopsis Ada2b and Gcn5 regulate 5% of the genes

represented in the 8,200 gene Affymetrix array. This result is strikingly similar to the

findings for yeast Gcn5, suggesting a very similar extent of regulation by a potential

SAGA-like Arabidopsis complex.

The human PCAF complex is structurally related to SAGA, containing many

homologous subunits. In addition to several hAda subunits, hTAFs 9, 10, and 12 are also

found associated with PCAF, which has approximately 20 subunits in all (Ogryzko et al.,

1998). The histone H4-like hTAF6 is missing from the PCAF complex, but is apparently

replaced in the histone-octamer-like structure by an ortholog with 42% similarity (PCAF

associated factor TAF6L) (Ogryzko et al., 1998). There is also an hTAF5 ortholog

with WD40 repeats, TAF5L (Ogryzko et al., 1998).









Drosophila TBP-related factor 1 (TRF1) is expressed only in neuronal tissues and

is a component of yet another protein complex with similarities to TFIID. However, this

complex contains no bona fide TAFs (Hansen et al., 1997). TRF (TRF2 or TATA

binding protein-like, TLF) homologs are found in humans, Drosophila, and C. elegans

with a unique subset of TRF associated factors (Chang and Jaehning, 1997; Albright and

Tjian, 2000). An interesting finding from the laboratory of Robert Roeder (Xiao et al.,

1999) places a human TRF-proximal protein (hTRFP) in the mediator complex. The function of

the various TRFs appears to be, in general, to mediate transcriptional responses

(potentially not mediated by TFIID) by a variety of activators.

TBP-free TAF-containing complex (TFTC) is a human protein complex with

similarities to TFIID. TFTC lacks TBP and TAF1, but contains most other TAFs (Grant

and Berger, 1999). In addition to TAFs, TFTC includes hAda3, hSPT3, and hGcn5L,

which provides a HAT activity in place of TAF1 (Grant and Berger, 1999). TFTC can

functionally substitute for TFIID at both TATA-containing and TATA-less promoters by

nucleating PIC assembly (Wieczorek et al., 1998; Bell and Tora, 1999; Grant and Berger,

1999).

The existence of multiple TAF-containing protein complexes with HAT activity

emphasizes that chromatin remodeling is essential for transcriptional activation of many

promoters. In addition, this multiplicity of TAF/HAT complexes suggests a functional

redundancy in activator complexes. These arguments imply that TBP and TAF

recruitment to promoters is complex and the role of specific TAF-containing complexes

is not well understood, even in the well-studied metazoans (Lee and Young, 1998).









TAFs: Required Factors or Optional Accessories

Several studies indicate that TAFs may have redundant or even optional roles in

transcription of PolII-dependent genes. For instance, several well-studied, strong

activators such as VP16 and Gal4 have redundant activation motifs that interact

separately with TFIID and/or holoenzyme (Chang and Jaehning, 1997). This redundant

interaction suggests that strong activators may be capable of activating transcription in

the absence of TFIID, by contacting holoenzyme through other GTFs such as SRBs,

TFIIA, and TFIIB (Burley and Roeder, 1996). Furthermore, in human embryonal

carcinoma cells, a novel TFIIA-TBP complex has been identified that is capable of

activating transcription but completely lacks TAFs (Mitsiou and Stunnenberg, 2000). In

addition, several novel coactivator complexes in mammalian systems such as vitamin D3

receptor interacting proteins/activator-recruited factor (DRIP/ARC), thyroid-hormone

associated protein complex (TRAP), and cofactor required for Sp1 activation (CRSP)

completely lack TBP and TAFs (Fondell et al., 1996; Naar et al., 1999; Rachez et al.,

1999; Ryu et al., 1999). Taken together, TFIID (and TAFs) may be optional accessories

to transcription. Alternatively, protein complexes other than TFIID and SAGA

(containing TAF subunits or other coactivators) such as TFTC and TRAP can

compensate for the lack of TFIID and SAGA under some conditions (Albright and Tjian,

2000).

Early studies employing TAF TS mutations resulting in down-regulated expression

and targeted degradation of TAFs in yeast did not demonstrate promoter dependency on

these coactivators. In some studies, PolII holoenzyme alone (no TAFs present) supported

activated transcription in vitro (Koleske and Young, 1994; Berk, 1999). However, in

apparent contradiction to earlier results, experiments utilizing TS mutations demonstrated









significant promoter-dependency on TAFs. Michel et al. (1998) hypothesized that this

was due to the use of tighter TS mutations than in previous studies, and that only TS

mutations causing rapid cessation of growth upon temperature shift were appropriate for

such studies (Berk, 1999). In fact, using "tight" TS mutants, a significant loss of

transcription was observed only 30 min after the shift to the nonpermissive temperature

(Michel et al., 1998). This temperature shift also caused rapid degradation of not only

the mutated protein but also the other proteins of the TFIID and SAGA complexes

(Michel et al., 1998). This result suggested that PolII cannot transcribe without a

functional TFIID complex (Michel et al., 1998). Interestingly, temperature shift also

resulted in a degradation of western blot-detectable TAFs and two thirds of TBP (Michel

et al., 1998; Berk, 1999). This suggests that two thirds of TBP is associated with TFIID,

while the other one third is either free or bound by TAFIs or TAFIIIs (Berk, 1999).

A recent study utilizing an inducible depletion strategy for chicken TAF9 (histone

H3-like TAF) demonstrated a high level of PolII transcription without detectable levels of

chicken TAF9 (Chen and Manley, 2000). This elegant experimental system measured

transcription directly through pulse-labeling and included steady state measurements of

several gene transcripts. This apparent inconsistency with yTAF TS experiments was

mainly explained by an increased functional redundancy in mammalian transcriptional

machinery (Chen and Manley, 2000). Such alternative mammalian complexes as PCAF

and TRAP were proposed to play much larger roles than SAGA does in yeast (Chen and

Manley, 2000).

Other studies lead to the conclusion that different promoters have distinct

dependencies on TAFs for their expression. In yeast, TAF-independent (TAFind) and









TAF-dependent (TAFdep) promoters were identified using the TAF TS mutants. After

temperature shift of these mutants, transcript profiling was used to detect genes that were

transcriptionally dependent or independent of the various TAFs (Li et al., 2000). It was

established with the use of formaldehyde DNA-crosslinking chromatin

immunoprecipitations (ChIPs) that TAFdep promoters recruited TAFs and TBP at similar

levels (Li et al., 2000). However, ChIP of TAFind promoters indicated that TAFs were

only present at background levels (Li et al., 2000). These TAFind promoters still recruited

TBP, apparently sans TAFs, as TBP bound all promoters at levels that correlated well

with transcript accumulation (Li et al., 2000).

Interestingly, when yeast TBP was inactivated in a temperature shift experiment,

TAFs continued to be recruited to TAFdep promoters (Li et al., 2000). In general, TAFdep

promoters recruit TAFs in an activator-dependent fashion, independent of other GTFs (Li

et al., 2000). For instance, the binding of TBP with TAFs to a TAFdep promoter (the

yeast RPS5 promoter) is lost after removal of the activator binding sites, but inactivation

of TFIIB or SRB4 had minimal effect on binding of TBP to the TAFdep (ACT1 and

RPS5) promoters (Li et al., 2000). This is compelling evidence for a model in which

yeast TAFs are directly targeted for recruitment by transcriptional activators, in

parallel pulling TBP to the promoter and nucleating PIC assembly (Li et al., 2000). On the

other hand, inactivation of TFIIB or SRB4 substantially reduced binding of TBP to a

TAFind (ADH1) promoter (Le Gourrierec et al., 1999). This evidence seems to indicate

that holoenzyme recruits TBP (sans TAFs) to TAFind promoters, leading to stabilization

of each other's interaction with the promoter.









It is clear that most TAFs are essential to yeast survival (Green, 2000). Work

with temperature sensitive mutants of the histone-like yTAFs demonstrated that they

were essential to PolII transcription as well (Michel et al., 1998). Michel et al. (1998)

made the argument that loss of SAGA did not cause a large drop off in transcription

because most of its components were not required for yeast viability or PolII

transcription. Therefore, it seems that TFIID is required for viability, but not necessarily

for correct transcription of every PolII dependent gene. From the evidence accumulated

to date, the model of TFIID serving as the major PolII coactivator seems secure in most

organisms. However, there is clearly significant redundancy of coactivator complexes in

yeast and metazoans. With so little known about similar coactivators in plants, it is futile

at this point to postulate how their transcription is regulated.

Interplay of GTFs

Assembly of the PIC involves many GTFs and nearly an order of magnitude greater

number of individual proteins. Therefore, implicit in formation of this complex are a

large number of binary protein-protein interactions involving intra-GTF and inter-GTF

binding partners. The interactions between TFIIA, TFIIB, TFIID, TFIIE, and TFIIF are

summarized in Table 1-3 and in Figures 1-2, 1-3, and 1-4 for humans, Drosophila, and

yeast, respectively. Just a few examples of the many known TAF-GTF interactions are

dTAF9/ hTAF9 with TFIIB; yTAF14 with TFIIF; hTAF6 with TFIIF and TFIIE; and

dTAF4/ hTAF4 with TFIIA (Tjian and Maniatis, 1994; Burley and Roeder, 1996).
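
The binary interactions listed above lend themselves to a simple graph representation. The following Python sketch is illustrative only: it encodes just the example pairings named in this paragraph as undirected edges, not the contents of Table 1-3. The names EXAMPLE_INTERACTIONS, build_map, and partners are hypothetical helpers introduced here, and reading "dTAF9/ hTAF9" as two separate proteins that each contact TFIIB is an assumption.

```python
from collections import defaultdict

# Example TAF-GTF pairings named in the text (Tjian and Maniatis, 1994;
# Burley and Roeder, 1996). Assumption: "dTAF9/ hTAF9" and "dTAF4/ hTAF4"
# are read as the Drosophila and human proteins each making the contact.
EXAMPLE_INTERACTIONS = [
    ("dTAF9", "TFIIB"), ("hTAF9", "TFIIB"),
    ("yTAF14", "TFIIF"),
    ("hTAF6", "TFIIF"), ("hTAF6", "TFIIE"),
    ("dTAF4", "TFIIA"), ("hTAF4", "TFIIA"),
]

def build_map(pairs):
    """Return an undirected adjacency map {protein: set of partners}."""
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].add(b)
        graph[b].add(a)
    return dict(graph)

def partners(graph, protein):
    """Return the sorted partners recorded for one protein (empty if unknown)."""
    return sorted(graph.get(protein, set()))

interaction_map = build_map(EXAMPLE_INTERACTIONS)
```

Calling partners(interaction_map, "hTAF6") lists the GTFs recorded as contacting hTAF6 in this small example set. An undirected map is used because binary binding data of this kind carry no direction.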

While TFIID is generally responsible for nucleation of the PIC, the other GTFs

play major roles as well. Regardless of which model is assumed (ordered multi-step

assembly or holoenzyme), TFIIA has a somewhat controversial presence. TFIIA is

composed of either two subunits (L and S) in yeast and Arabidopsis or three subunits (α,

β, and γ) in metazoans. TFIIA-α and -β are derived from post-translational cleavage of a

protein homologous to the larger (L) subunit in fungi and plants (Li et al., 1999). TFIIA

is able to integrate into the PIC at any step of assembly and is even capable of binding

TBP in the absence of DNA (Orphanides et al., 1996). However, when TFIIA does bind

TBP at a promoter it is able to interact with both the N-terminal stirrup of TBP and DNA

upstream of the TATA element increasing the stability of the DNA-protein complex

(Langelier et al., 2001). Besides this function of TFIIA, it is suggested that TFIIA is

involved in TBP anti-repression because it is able to remove repressors like Mot1. Only

TFIIA-β and -γ, but not -α, are required for anti-repression. However, all three subunits

are required for activation mediated by trans-activators that recruit TFIIA. In the work of

Langelier et al. (2001), TFIIA stimulated basal transcriptional activity only in the

presence of TFIIEβ and TFIIFα, suggesting that TFIIA may somehow be involved in

enhancing the activities of TFIIE and TFIIF.

The structures of TFIIB as well as TFIIA in association with the TBP-TATA

complex have been determined by x-ray crystallography (Nikolov et al., 1995; Geiger et

al., 1996). The binding of TFIIB to the C-terminal stirrup of TBP at a promoter is a

required step to PIC formation (regardless of which model is followed). Like the binding

of TFIIA to TBP-TATA, the TFIIB-TBP-TATA complex is stabilized by TFIIB

interactions with both TBP and DNA, in this case both upstream and downstream of TATA

(due to the 80° bend in the TATA-element) (Malik et al., 1993; Lee and Hahn, 1995;

Tang et al., 1996). The element directly upstream of TATA that is contacted by TFIIB is

termed the IIB recognition element (BRE) (Lagrange et al., 1998). The BRE is contacted

in a sequence specific manner by a helix-turn-helix DNA binding domain of TFIIB









(Lagrange et al., 1998). The BRE has a consensus sequence of 5'-G/C-G/C-G/A-C-G-C-

C-3' and represents the fourth known core promoter element in addition to the TATA-

element, the DPE, and the Initiator (Lagrange et al., 1998). Along with creating a more

stable TBP-TATA interaction, TFIIB makes contact with a TAF, TFIIF, and PolII (Ha et

al., 1993; Fang and Burton, 1996). In fact, some mutations of TFIIB have a great effect

on transcription start sites suggesting that TFIIB plays a significant role in positioning of

PolII in the PIC (Orphanides et al., 1996).
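
As an illustration of how the BRE consensus quoted above could be used to screen a sequence, the following Python sketch translates 5'-G/C-G/C-G/A-C-G-C-C-3' into a regular expression and reports every match on the strand supplied. It is a minimal sketch, not a published method: the names BRE_PATTERN and find_bre_sites are hypothetical, the test sequence is invented, and the scan neither checks the reverse strand nor requires that a TATA element follow the match.

```python
import re

# Consensus from the text: 5'-G/C-G/C-G/A-C-G-C-C-3' (Lagrange et al., 1998).
# The lookahead lets the scan report overlapping matches as well.
BRE_PATTERN = re.compile(r"(?=([GC][GC][GA]CGCC))")

def find_bre_sites(sequence):
    """Return (0-based start, matched 7-mer) for each consensus match."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in BRE_PATTERN.finditer(seq)]

# Invented sequence (not a real promoter): one consensus match placed
# directly upstream of a TATAAA motif.
toy_promoter = "ttGGACGCCTATAAAag"
sites = find_bre_sites(toy_promoter)
```

A strict consensus match is a coarse filter; real promoters may carry functional elements that deviate from the consensus, so hits and misses from such a scan would need experimental confirmation.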

Choi et al. (2003) recently demonstrated that human TFIIB has the capacity to

autoacetylate on K238. This autoacetylation was competitively inhibited by

coenzyme A and was reversible under the conditions of high coenzyme A concentrations,

indicating that this is a catalytic process (Choi et al., 2003). Interestingly, yeast TFIIB

had the same autoacetylation capacity and the TFIIB affinity for TFIIF was then

significantly increased (glutathione S-transferase pulldown efficiency increased from

15% to 90%) (Choi et al., 2003). This affinity increase suggests that TFIIB acetylation is

a key mechanism for recruitment of TFIIF and PolII to a promoter.

Like TBP and TFIIA, TFIIB has been studied to a limited degree in plants (Baldwin and
Gurley, 1996; Pan et al., 2000). Pan et al. (2000) demonstrated that TBP-TFIIB
interactions were dispensable for basal transcription and for activated transcription at
strong complex promoters (CaMV 35S). However, the TBP-TFIIB interaction was required
for activated transcription at simplified artificial promoters (Pan et al., 2000). These
results can be most simply interpreted to mean that during basal and activated
transcription from complex promoters, factors other than TBP and TFIIB (perhaps TAFs and
transcriptional activators, respectively) are able to recruit PolII to the promoter.
However, at the simple, artificial promoters the rate-limiting step is no longer TBP
recruitment (which is accomplished by the transactivator) but recruitment of PolII. Since
the artificial promoters are TATA-containing, they may act in a TAF-independent manner,
and thus TAF-holoenzyme interactions may not fully complement the lack of TBP-TFIIB
interaction. In separate work, Pan et al. (1999) showed that a 14-3-3 protein binds TBP
and TFIIB independently of known 14-3-3 protein-binding motifs. It was also demonstrated
that this 14-3-3 protein was capable of stimulating transcription in vivo, suggesting
that 14-3-3s might act as co-activators, thus adding another layer of complexity to
transcriptional regulation (Pan et al., 1999).

An exciting story that is beginning to unfold is that of plant-specific TFIIB-related

protein (pBrp), or AtTFIIB5 (chapters 2 and 4). This protein was shown to interact in

vitro to form a TBP-TFIIB5-TATA complex (Lagrange et al., 2003). Using enhanced

yellow fluorescent protein-tagging, chloroplast-fractionation and proteolytic-cleavage

experiments analyzed by western blots, as well as plastid agglutination experiments,

Lagrange et al. (2003) have shown that AtTFIIB5 was normally localized to the cytosolic

face of the plastid envelope. This is the first observation of any GTF stably localized
outside of the nucleus.

AtTFIIB5 was also observed to contain a P/E/D/S/T-rich domain that appears to

play a role in targeting this protein for degradation by the proteasome (Lagrange et al.,

2003). Upon pharmacological disruption of proteasome function and in COP9

signalosome mutants, AtTFIIB5 was observed to localize to the nucleus. Lagrange et al.

(2003) suggest a model in which an unknown plastid-derived signal leads to release of

the TFIIB5 protein from the outer envelope and movement into the nucleus. In the









nucleus, TFIIB5 would then induce transcription of genes appropriate for response to the

original signal. In this model, proteasome-mediated degradation provides a rapid
turnover of the nuclear-localized TFIIB5 protein, leading to tighter control of the response to

the signal.

However, this model lacks an explanation as to why the TFIIB5 protein appeared to

be released from the chloroplast under proteasome/COP9 signalosome dysfunction. I

propose a model in which the COP9 signalosome leads to degradation of a TFIIB5/pBrp

co-factor that allows the protein to localize to the nucleus. The co-factor could be either

a chaperone that escorts TFIIB5/pBrp to the nucleus or a chloroplast-docking antagonist.

Alternatively, the proteasome/COP9 signalosome may be activating either a plastid

docking factor or a chloroplast targeting signal in TFIIB5/pBrp by partial proteolysis

(Gille et al., 2003). In either of these cases, TFIIB5/pBrp transport to the nucleus and

induction of transcription is likely to be the culmination of this signal transduction

pathway. Whatever the mechanism, the TFIIB5/pBrp subfamily of TFIIB-like factors is sure to
play a novel role in transcriptional regulation.

Part of the role of TFIIB is to recruit TFIIF into the PIC. TFIIF is a heterotetrameric
complex of two TFIIFα (RNA polymerase-associated protein 74 kDa, RAP74) and two TFIIFβ
(RAP30) molecules (Flores et al., 1990) that is tightly associated with PolII. However,
in yeast a third factor interacts as part of TFIIF, the yeast TAF14 (also a member of the
TFIID and SWI/SNF complexes) (Henry et al., 1994; Cairns et al., 1996). Interestingly,
although both human TFIIFα and TFIIFβ bind to TFIIB individually, it has been shown that
TFIIFα blocks the binding of TFIIFβ to TFIIB by simultaneously binding to both proteins
in the regions required for their interaction (Fang and Burton, 1996). TFIIF is required
for recruitment of PolII to the TATA-TBP-TFIIA-TFIIB complex. Indeed, TFIIF is also
credited with stimulating the rate of transcriptional elongation (which implies that it
is an elongation factor in addition to an initiation factor) (Flores et al., 1989).

Besides stimulating elongation, TFIIF (specifically the β-subunit) also inhibits and
reverses PolII binding to non-promoter DNA regions, making this interaction
promoter-specific, similar to the bacterial σ-factor (Killeen and Greenblatt, 1992).
TFIIFβ has sequence similarity with bacterial σ-factors and is able to bind E. coli RNA
polymerase in the same region as σ70 (McCracken and Greenblatt, 1991). Interestingly, a
dimer of TFIIFβ alone is able to recruit PolII to promoters and support proper initiation
of transcription (Flores et al., 1991).

Three functional domains compose TFIIFβ: the TFIIFα-binding N-terminus (Fang and Burton,
1996); the polymerase-binding middle domain (Killeen and Greenblatt, 1992); and the
C-terminal winged-helix domain (Groft et al., 1998). Similarly, TFIIFα can be
functionally divided into three domains: the N-terminal TFIIFβ-binding domain; the highly
charged middle region; and the C-terminal winged-helix domain (Kamada et al., 2001).
TFIIFα seems largely to play a role in stimulating transcriptional elongation and aids
TFIIFβ in removing PolII from non-specific DNA interactions. One interesting observation
is the presence of a serine/threonine kinase activity in TFIIFα that is involved in
transcriptional elongation (Rossignol et al., 1999). Rossignol and co-workers (1999) were
unable to identify in TFIIFα the ATPase domain that must be present for kinase activity;
however, a weak similarity with AAA ATPase VWA-containing proteins is clearly
identifiable in A. thaliana TFIIFα (as mentioned in Chapter 2). Two other interesting
findings are the phosphorylation of TFIIFα by the TAF1 kinase domain and by a protein
kinase in TFIIH (Dikstein et al., 1996a; O'Brien and Tjian, 1998; Rossignol et al.,
1999). Although the significance of these activities utilizing TFIIF as a substrate is
still unknown, some data suggest that both the initiation and elongation activities of
TFIIF are stimulated by this phosphorylation, similar to the stimulation of TFIIA
activities by phosphorylation (Kitajima et al., 1994).

After TFIIF and PolII, TFIIE is the next GTF to enter the assembling PIC, possibly
together with PolII. TFIIE, like TFIIF, is a heterotetramer composed of two different
proteins (α and β) (Ohkuma et al., 1990; Inostroza et al., 1991). It was found that
TFIIEα without TFIIEβ has the capacity to mediate basal transcription when added to the
other required factors (Ohkuma et al., 1990; Inostroza et al., 1991); however, in the
recombinant form both subunits were required (Peterson et al., 1991). TFIIEα contains a
zinc-finger domain that is critical for its stable incorporation into the PIC (Maxon and
Tjian, 1994), a leucine repeat, and a helix-turn-helix domain, as well as sequence
similarity to E. coli σ-factor region 2.1 (Ohkuma et al., 1991). TFIIEα also contains a
clearly identifiable catalytic loop domain found in many protein kinases (Peterson et
al., 1991), although TFIIE has no known ATPase activity. The protein-protein interaction
between TFIIEα and TFIIEβ may be mediated by leucine repeats, since TFIIEβ also contains
this recognizable motif (Sumimoto et al., 1991). TFIIEβ, like TFIIEα, has some sequence
similarity to σ-factors, but in this case it is with region 3, which is implicated in
promoter recognition (Sumimoto et al., 1991). TFIIEβ also has similarity with TFIIFβ in a
region similar to a σ-factor domain that binds to core RNA polymerase, consistent with
the interactions of both TFIIE and TFIIF with PolII (Sumimoto et al., 1991).

TFIIEβ contacts ssDNA through a C-terminal winged helix domain that may play a role in
stabilization of the open promoter and assist DNA melting (Okamoto et al., 1998; Okuda et
al., 2000). This ssDNA-binding domain is novel in that it binds DNA in a positively
charged channel on the face opposite to the one that winged helix domains typically use
to interact with DNA (Okuda et al., 2000). Since TFIIE interacts with PolII and GTFs
(TFIIB and TFIIF), Okuda et al. (2000) suggested a model in which these properties, in
addition to the ssDNA binding, lead to a stabilization of the PIC where the promoter
starts to open. Interestingly, like TBP and TFIIB, TFIIE appears to have ancient roots.
Homologs of TFIIEα (TFE) have been identified in Archaea. TFE was not required for
transcription, but was stimulatory to transcription under conditions of limiting TBP
(Bell et al., 2001). This suggests a conserved function for TFE/TFIIEα in stabilization
of the PIC.

Once TFIIE is incorporated into the PIC, it recruits TFIIH, a multi-subunit GTF with two
ATP-dependent helicases and a protein kinase (Orphanides et al., 1996). One of the TFIIH
helicases appears to be required for transcriptional initiation, but TFIIE, TFIIH, and
ATP are all dispensable on templates that are highly negatively supercoiled (Parvin and
Sharp, 1993). It is believed that this negative supercoiling greatly lowers the energetic
requirement for strand separation and precludes the need for helicase activity.
Interestingly, both TFIIB and TFIIEα contain zinc-ribbon motifs, and both TFIIB and TFIIE
bind DNA between the TATA-element and the start site of transcription (Robert et al.,
1996). It has been speculated that these DNA-binding motifs may play a role in
stabilizing the melted region of the promoter and as such supplement one of the main
functions of TFIIH in transcription; however, more recent evidence suggests this is a
function of the TFIIEβ ssDNA-binding domain (Okamoto et al., 1998).

Another main function of TFIIH (besides transcription-coupled nucleotide excision repair,
which will not be discussed here) is to stimulate elongation by hyperphosphorylation of
the C-terminal domain (CTD) of the largest subunit of PolII. This evolutionarily
conserved CTD is composed of many tandem repeats of a heptapeptide (YSPTSPS) of which
five residues are potential recipients of phosphate moieties. The cdk7 subunit, a
cyclin-dependent kinase, of human TFIIH has been shown to be the subunit responsible for
hyperphosphorylation of the CTD, an activity that potentially leads to PolII promoter
escape (Orphanides et al., 1996). One additional function of both subunits of TFIIE is to
stimulate this CTD kinase activity of TFIIH (Okamoto et al., 1998).
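The statement that five of the seven residues in the CTD heptapeptide are potential
phosphate acceptors can be checked with a short sketch. It assumes only that serine,
threonine, and tyrosine are the residues considered as acceptors; the function names
and the repeat count of 26 are illustrative choices that do not come from the text.

```python
# Consensus heptapeptide repeat of the PolII CTD, as given in the text.
HEPTAD = "YSPTSPS"

# Assumption: residues with a side-chain hydroxyl (Ser, Thr, Tyr) are the
# potential phosphate acceptors.
PHOSPHOACCEPTORS = {"S", "T", "Y"}


def phosphoacceptor_positions(repeat):
    """Return (1-based position, residue) for each potential phospho-site."""
    return [(i, aa) for i, aa in enumerate(repeat, start=1)
            if aa in PHOSPHOACCEPTORS]


def potential_sites(repeat, n_repeats):
    """Total potential phospho-sites in a CTD made of n_repeats perfect repeats."""
    return len(phosphoacceptor_positions(repeat)) * n_repeats


sites = phosphoacceptor_positions(HEPTAD)
print(sites)       # Tyr1, Ser2, Thr4, Ser5, Ser7
print(len(sites))  # 5, matching "five residues" in the text

# Illustrative only: 26 is a hypothetical repeat count used for the arithmetic.
print(potential_sites(HEPTAD, 26))
```

Only the two prolines (positions 3 and 6) are excluded, which is why a CTD built from
many repeats offers a large number of potential phosphorylation sites.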

PolII is a 12-subunit complex of approximately 500 kDa (Dvir et al., 2001). The core of
PolII is composed of two large subunits, RNA polymerase B protein 1 (Rbp1) and Rbp2 (Dvir
et al., 2001). The ten remaining subunits (Rbp3-12) coat the surface of these two
proteins in single copies (Dvir et al., 2001). PolII is capable of unwinding
double-stranded DNA (dsDNA), adding ribonucleotides to RNA transcripts, and proofreading
the nascent transcript (Cramer et al., 2000). The structure of PolII shows a deep cleft
between Rbp1 and Rbp2 through which ~20 bp of dsDNA is held as it enters the active site
(Cramer et al., 2000). This entering dsDNA is gripped by a "pair of jaws" formed by a
portion of Rbp1 with Rbp5 on one jaw and Rbp9 on the other jaw (Cramer et al., 2000). A
"sliding clamp" composed of the C-terminal region of Rbp2 and the N-terminal parts of
Rbp6 and Rbp1 greatly stabilizes the interaction with downstream DNA, leading to the
processivity of the enzyme (Cramer et al., 2000). Cramer et al. (2000) propose that a
groove leading away from the active site behind the hinge of the downstream-DNA sliding
clamp binds the emerging RNA, which acts as a lock on the clamp and increases
processivity. Underneath the base of the cleft is an inverted funnel leading to two pores
that give access to the DNA-RNA hybrid and are near the active site (Cramer et al.,
2000). These pores may provide access for elongation factors and nucleotides, and an exit
for the 3' end of the mRNA during backtracking (Cramer et al., 2000). PolII is clearly a
complex assembly of protein domains with a multitude of functions (many of which are
elucidated by the structure). Unfortunately, further details must be left for other
manuscripts.

The CTD of PolII is not a naked protein-tail structure as once thought; instead, it is
covered by a large Mediator complex of co-activators (Orphanides et al., 1996). While the
Mediator complex is not technically a GTF (because it was not identified as a factor
required for basal transcription in vitro), it does merit some discussion here. Mediator
is composed of approximately 60 proteins and is ~3.5 MDa (reviewed in Myers and Kornberg,
2000; Rachez and Freedman, 2001). Many Mediator genes were first identified as
suppressors of CTD truncation mutants of RNA polymerase B (SRB) in yeast. Many of these
so-called SRB proteins have little (if any) recognizable sequence similarity with other
proteins, except SRB10 and SRB11, which are also known as cdk8 and cyclin C,
respectively.

Besides the GTFs, RNA PolII, and Mediator needed for regulation of transcription, there
are many co-activator and co-repressor complexes that are required in the cell, but
detailed discussions are beyond the scope of this document. Virtually all of them in some
way modulate the access of GTFs to their promoters. Some of these include the
TAF-containing HAT complexes (TFTC; SAGA; STAGA; PCAF; nucleosome acetyltransferase of
histone H4, NuA4; etc.), histone deacetylase complexes (e.g., SWI independent 3 complex,
SIN3; nucleosome remodeling HD complex, NuRD; regulator of nucleolar silencing and
telophase exit complex, RENT) (reviewed in Lawit and Czarnecka-Verner, 2002),
ATP-dependent chromatin remodeling complexes (SWI/SNF, BRG1-associated factor (BAF), and
related factors), and DNA and histone methyltransferase complexes (e.g., Complex Proteins
Associated with Set1, COMPASS; Enhancer of Zeste) (Miller et al., 2001; Czermin et al.,
2002), just to name the best-studied classes. However, all of these protein complexes
require some type of contextual cues to find their substrates. At some point, nearly all
of these cues originate with sequence-specific DNA-binding transcriptional regulators
(either activators or repressors).

Transcriptional Activators That Bind DNA

The GTFs alone are capable of conveying basal levels of transcription from core

promoters (for a review of core promoters see Smale and Kadonaga, 2003). However,

for activated transcription several additional layers of control are often necessary. The
first of these are transcriptional trans-activators; the second are cis-acting DNA elements
in promoters to which transcriptional activators (and repressors) can bind. In general,

DNA binding transcriptional activators are composed minimally of two functionally

separable domains: a DNA binding domain and a transcriptional activation domain

(Ptashne, 1988).

Differential expression of the 27,000 genes of Arabidopsis implies involvement of

many different transcriptional regulators that are capable of differential combinations of

protein-protein and protein-DNA interactions. The team of scientists that analyzed the

Arabidopsis genomic information found 1,709 putative proteins with similarity to known









classes of DNA-binding domain-containing transcription factors (The Arabidopsis

Genome Initiative, 2000). This large, and potentially underestimated, set of

transcriptional regulators is certainly involved in a multitude of protein-protein

interactions with each other (homo- and hetero- dimers and trimers) and with co-activator

and/or co-repressor complexes.

Many transcriptional regulators have been studied in various organisms, including

plants. Interestingly, only 8-23% of the genes encoding proteins containing DNA-

binding domains identified in Arabidopsis are similar to genes in other non-plant

eukaryotic genomes (The Arabidopsis Genome Initiative, 2000). Unlike transcription-

related genes, 48-60% of Arabidopsis protein synthesis genes have homologs in the other

eukaryotic genomes (The Arabidopsis Genome Initiative, 2000). This great disparity

reflects an independent evolution of plant transcription factors in general.

More than half of the transcription factor families in Arabidopsis (16 of 29) appear to
be specific to plants. The Apetala 2/Ethylene response element binding protein-related to
ABI3/VP1 (AP2/EREBP-RAV), no apical meristem/cup shaped cotyledon 2 (NAC), and auxin
response factor (ARF-AUX/IAA) families have DNA-binding domains not found outside of the
plant kingdom. DOF zinc-finger, WRKY zinc-finger, and the two-repeat MYB families contain
plant-specific variants of more widespread domains. Some large families of transcription
factors (R2R3-repeat MYB, WRKY families, etc.) have expanded in plants, with
approximately 100 members in some groups. Other classes of DNA-binding proteins are
completely missing in plants, such as the Rel-like DNA-binding domain, nuclear steroid
receptor, forkhead-winged helix, and POU (Pit-1, Oct and Unc-86) domain protein families
(The Arabidopsis Genome Initiative, 2000).
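The proportions quoted in this section can be put on a common scale with a small
calculation. The sketch below uses only the counts given in the text (1,709 putative
DNA-binding transcription factors among 27,000 genes, and 16 plant-specific families
out of 29); the helper name percent is an illustrative choice.

```python
def percent(part, whole):
    """Express part/whole as a percentage."""
    return 100.0 * part / whole


# Counts taken from the text (The Arabidopsis Genome Initiative, 2000).
tf_fraction = percent(1709, 27000)   # putative transcription factors per genome
plant_specific = percent(16, 29)     # plant-specific transcription factor families

print(f"{tf_fraction:.1f}% of Arabidopsis genes")          # about 6.3%
print(f"{plant_specific:.1f}% of families plant-specific")  # about 55.2%
```

On these figures roughly one gene in sixteen encodes a putative DNA-binding
transcription factor, and "more than half" of the families corresponds to about 55%.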

The functions of the individual transcription-factor family-members can be

regulated by expression characteristics. Another way to add a layer of control to a

transcriptional activator is for it to target a co-activator or GTF that is only expressed at

certain temporal or spatial coordinates. With the plethora of different GTFs in

Arabidopsis, it seems likely that differential expression of GTFs targeted by different

transcription factors is a key mechanism of control in plants. Therefore, to truly begin
to understand transcriptional regulation of any gene in plants we must attempt to
understand the regulatory circuitry of the transcription factors, the GTFs, and the
co-activator/co-repressor complexes, as well as all of their protein-protein interactions.








Table 1-1. TATA binding protein-associated factors of the TFIID complex. The TAFs
are displayed as identified in (from left to right) Arabidopsis thaliana,
Saccharomyces cerevisiae, Drosophila melanogaster, and Homo sapiens.

Arabidopsis      Yeast           Drosophila       Human            Features
225(1)/205(1b)   145(130)(1)     230(250)(1)      250(1)           BDs, PK, HAT, TAND
142(2)           150(TSM1)(2)    150(2)           150(CIF150)(2)   contacts initiator
-                47(3)           155(BIP1)(3)     140(3)           H2A-like HFD, contacts BTB domain proteins
76(4)/69(4b)     48(4)           110(4)           135(4)/105(4b)   H2A-like HFD, contacts Q-rich TAs, and IIA
78(5)            90(5)           80(85)(5)        100(95)(5)       WD-40 repeats, contacts IIFβ
59(6)/55(6b)     60(6)           62(60)(6)        70(80)(6)        H4-like HFD, contacts IIEα, IIFα, AAs, & DPE
22(7)            67(7)           AAF54162(7)      55(7)            contacts multiple TAs, and Bdf1
40(8)            65(8)           Prodos(8)        43(8)            H3-like HFD
21(9)            17(20)(9)       42(40)(9)        31(32)(9)*       H3-like HFD, contacts acidic TAs, IIB, & DPE
15(10)           23(25)(10)      24(10)/16(10b)   30(10)*          H3-like HFD
24(11)/19(11b)   40(11)          30β(11)          28(11)           H3-like HFD
58(12)/75(12b)   61(68)(12)      30α(22)(12)      20(15)(12)*      H2B-like HFD
14(13)           19(FUN81)(13)   AAF53875(13)     18(13)           H4-like HFD
23(14)/30(14b)   30(ANC-1)(14)   -                -                SWI/SNF, TFIIF, TFIID
41(15)/39(15b)   -               -                68(15)           substoichiometric, contacts RNA and ssDNA
Not in TFIID complex, part of B-TFIID
228(BTAF1)       Mot1(BTAF1)     -                172(BTAF1)       helicase similarity, TBP negative regulator


= required in yeast


Blue = Present in ySAGA, or hTFTC;
= Present in hPCAF Complex


Note: The names displayed in red are from the unified nomenclature designated by Lazlo Tora
(Tora, 2002). Other names are based on accession numbers, mutant designations, or molecular
weight (either observed or predicted). Histone-fold domain, HFD; bromodomains, BD; protein
kinase, PK; transcriptional activators, TAs; acidic activators, AAs; Broad-complex, Tramtrack,
and Bric-à-brac, BTB.















Figure 1-1. The "two-step handoff" model of removal of auto-inhibition of TFIID by the
TAF1 N-terminal domains TAND1 and TAND2 (T1 and T2, respectively).
TAFs are labeled and shown in light blue, the acidic activator is shown in
red, and TFIIA is shown in light green.











Table 1-2. Protein-protein interactions of TFIID in Homo sapiens, Drosophila
melanogaster, and Saccharomyces cerevisiae with corresponding
references.

References
Interaction Human Drosophila Yeast
(Ruppert et al., Weinzier et al (Reese et al., 1994;
TBP-TAF1 1993; Xenarios et e Kokubo et al.,
al., 2002) 1 a 1998)
TBP-TAF2 (Verrijzer et al., (Verrijzer et al.,
1994) 1994)
(Dubrovskaya et (Kokubo et a.,
TBP-TAF5 al., 1996; Tao et 1993c
al., 1997) c)
(Weinzierl et al., (Weinzierl et al.,
TBP-TAF6 1993b; Hisatake et 1993b; Kokubo et
al., 1995) al., 1994)
TBP-TAF7 (Yatheraj am et al.,
TBP-TAF7 2
2003)
TBP-TAF9 (Kokubo et al.,
TBP-TAF9 '
1994)
(Klebanow et al
TBP-TAF10 (Jacq et al., 1994) 1Kleal
1996)
TBP-TAF711 (Xenarios et al., (Kraemer et al.,
TBP-TAF11 2
2002) 2001)
(Mengus et., (Yokomori et al.,
TBP-TAF12 1993b; Kokubo et (Reese et al., 2000)
al., 1994)
TBP-TAF13 (Mengus et al.,
TBP-TAF13 1
1995)
TAF1-TA-F2 (Verrijzer et al., (Verrijzer et al.,
1994) 1994)
(Mengus et al.,
1995; Burley and (Kokubo et.,
(Kokubo et al.,
Roeder, 1996) 1993a; Weinzierl et (Yatherajam et al.,
TAF1-TAF4 1993a; Weinzierl et '
TAF1-TAF4b 2003)
(Dikstein et al.,
1996b)
(Dubrovskaya et
(Dubrovskaya et (Yatheraj am et al.,
TAF1-TAF5 al., 1996; Tao et 2003)
al., 1997) 0)









Table 1-2 continued.
References
Interaction Human Drosophila Yeast
(Weinzierl et al., .
TWeinzierl et al., (Weinzierl et al., (Yatherajam et al.,
TAF1-TAF6 1993b; Hisatake et 1 2
a!., 1995) 1993b) 2003)
(Lavigne et al.,
TAF1-TAF7 1996) TAF1- (Yatheraj am et al.,
TAF7L (Pointud et 2003)
al., 2003)
TAF1-TA9 (Kokubo et al., (Yatherajam et al.,
1994) 2003)
TAF1-TAF10 (Jacq et al., 1994)
TAF1-TAF1 71 (Yokomori et al., (Yatheraj am et al.,
1993b) 2003)
TAF1-TAF12 (Yokomori et al.,
TAF1-TAF12 1993b)
1993b)
TAF-TAF (Yatheraj am et al.,
TAF2-TAF3 2003)
2003)
TAF2-
A 1TA- (Yatherajam et al.,
TAF2-TAF4 TAF4b(Dikstein et (Yath
al., 1996b)20)
TAF-TA7 (Yatheraj am et al.,
TAF2-TAF7 2003)
2003)
TAF-TAF (Yatheraj am et al.,
TAF2-TAF8 2003)
2003)
TAF-TAF0 (Yatheraj am et al.,
TAF2-TAF10 2003)
2003)
TATF2-TAF11 (Yokomori et al.,
TAF2-TAF11 1993b)
1993b)
TAF2-TAF12 (Yokomori et al.,
TAF2-TAF121 b
1993b)
(Gangloff et al.,
(Gangloff et al., (Gangloff et al., n a,
TAF3-TAF10 2001a) 2001a) 2001b; Yatherajam
et al., 2003)
TAF4-TAF5 (Kokubo et al., (Yatheraj am et al.,
1993c) 2003)
TAF4-TAF7 (Yatherajam et al.,
TAF4-TAF7 2
2003)
TAF4-TAF8 (Yatherajam et al.,
TAF4-TAF8 2
2003)
TAF4-TAF9 (Kokubo et al., (Yatherajam et al.,
1994) 2003)
TAF4-TAF1(Yatherajam et al.,
TAF4-TAF102003)
2003)









Table 1-2 continued.
References
Interaction Human Drosophila Yeast
TAF4-TAF11 (Yokomori et al., (Yatheraj am et al.,
1993b) 2003)
(Hoffmann et al.
(Hofmann et al, (Yokomori et al., (Selleck et al.,
TAF4-TAF12 G f e 1993b; Kokubo et 2001; Yatherajam et
al., 2000; Werten et
., 200al., 1994) al., 2003)
al., 2002)
TAF5-TAF6 (Tao et al., 1997) (Ito et al., 2001)
(Dubrovskaya et al.,
TAF5-TAF7 1996; Lavigne et
al., 1996)
TAF-TAF (Yatherajam et al.,
TAF5-TAF8 2003)
2003)
TAF5-TAF9 (Tao et al., 1997) (Kokubo et al (etz et al 2000)
1994) Utea, 2000)
TAF-TAF (Yatherajam et al.,
TAF5-TAF10 2003)
2003)
(Lavigne et al.,
TAF5-TAF11 1996; Tao et al.,
1997)
(Lavigne et al.,
(Lavigne et al., (Yatheraj am et al.,
TAF5-TAF12 1996; Tao et al., 20
1997)2003)
1997)
(Dubrovskaya et al.,
TAF5-TAF13 1996; Lavigne et
al., 1996)
TAF5-TAF15 (Bertolotti et al.,
1998)
(Weinzierl et al., (Weinzierl et al., (Uetz et a, 2000;
Ito et al., 2001;
TAF6-TAF9 1993b; Hisatake et 1993b; Kokubo et et 2
al., 1995; Xenarios al., 1994; Xie et al., Selleck al., 200
et al., 2002) 1996) erajam a,
2003)
TAF-TAF1 (Yatheraj am et al.,
TAF6-TAF10 2003)
2003)
TAF6-TAF11 (Yatherajam et al.,
TAF6-TAF11 2
2003)
(Hisatake et al.,
TAF6-TAF12 1995; Hoffmann et
al., 1996)
TAF7TA7 (Yatherajam et al.,
TAF7-TAF72003)
2003)









Table 1-2 continued.
References
Interaction Human Drosophila Yeast
TAF7-TA (Yatherajam et al.,
TAF7-TAF8 2003)
2003)
TAF7-TAF11 (Lavigne et al., (Yatherajam et al.,
TAF7-TAF119 2
1996) 2003)
TAF7-TAF12 (Hoffmann et al.,
TAF7-TAF12 1
1996)
TAF7-TAF15 (Bertolotti et al.,
1998)
TAF8-TAF10b (Uetz et al., 2000;
TAF8-TAF1(Hernandez- Gangloff et al.,
Hemandez and 2001b; Yatherajam
Ferrus, 2001) et al., 2003)
TAF8-TAF12 (Yatheraj am et al.,
TAF8-TAF12 2
2003)
TAF9-TAF10 (Yatheraj am et al.,
TAF9-TAF10 2
2003)
TAF9-TAF12 (Yatheraj am et al.,
TAF9-TAF12 2
2003)
(Klebanow et al.,
1996; Gangloff et
TAF10-TAF10 al., 2001b;
Yatheraj am et al.,
2003)
(Uetz et al., 2000;
TAF10-TAF11 Yatherajam et al.,
2003)
(Mengus etal.,
(Mengus et al., (Yatheraj am et al.,
TAF10-TAF12 1995; Xenarios ethe m
al., 2002)2003)
(Mengus et al., (Ito et al., 2000;
TAF10-TAF13 1995; Xenarios et Yatherajam et al.,
al., 2002) 2003)
TAF10-TAF14 (Uetz et al., 2000)
TAF11-TAF12 (Mengus et al., (Yatherajam et al.,
TAF11-TAF1295 2
1995) 2003)
(Mengus et al.,
(Mens et a., (Ito et al., 2001;
TAF11-TAF13 199; Birc et (Giot et al., 2003) Yatherajam et al.,
1998; Xenarios et
al., 2002)2003)
TAF1t1-TAF15 (Bertolotti et al.,
1998)









Table 1-2 continued.
References
Interaction Human Drosophila Yeast
TAF12-TAF12 (Yatherajam et al.,
2003)
TAF 13- TAF 15 (Bertolotti et al.,
1998)









Table 1-3. Protein-protein interactions between TFIIA, TFIIB, TFIID, TFIIE, and
TFIIF subunits in Homo sapiens, Drosophila melanogaster, and
Saccharomyces cerevisiae with corresponding references.

References
Interaction Human Drosophila Yeast
(Geiger et al., 1996;
TBP-TFIIA-S/y (Sun et al., 1994) okomoeta., Kokubo et al.,
1994)1998)
TFITA- TBTFIIA-L20 TBP (Geiger et al., 1996;
TBP-TFIIA-L TFIIA TBP (Sun (Yokomori et al., Kokubo et al.,
et al., 1994) 1994) 1998)
1994) 1998)
(Maldonado et al.,
TBP-TFII 1990; Ha et al., (Yamashita et al.,
1993; Xenarios et 1993)
al., 2002)
(Yokomori et al.,
TBP-TFIIEa (Yo et (Maxon et al., 1994)
1998)
(Okamoto et al.,
TBP-TFIIE3 1998)
1998)
TAF1-TFE Acetylation (Imhof
TAF1-TFIIEP eta.,1997)
et al., 1997)
Acetylation
TAF1-TFF (Ruppert and Tjian,
1995; Dikstein et
al., 1996a)
TAF4b-
(Yokomori et al.,
TAF4-TFIIA-L TFIIAa(Dikstein et
al., 1996b)1993a)
(Dubrovskaya et al.,
TAF5-TFIIF3 1996) '
1996)
(Hisatake et al.,
TAF6-TFIIEa 1995)
1995)
(Hisatake et al.,
TAF6-TFIIFc 1995)
1995)
(Klemm et al., (Goodrich et.,
TAF9-TFIIB 1995; Xenarios et (Good
al., 2002)1993)
TAF 1-TFIITA-S (Kraemer et al.,
TAF11-TFIIA-S 2
2001)
TAF13-TFIIA-L (Giot et al., 2003)









Table 1-3 continued.
References
Interaction Human Drosophila Yeast
TAF14-TFIIFa (Henry et al., 1994)
TFIIAa-TFIIAy
(Sun et al., 1994)
TFIIA-L TFIIA-Like Factor ., 2 ) (Ranish and Hahn,
TFIIA-S (ALF)-TF TFIIAy 1991)
(Upadhyaya et al.,
1999)
TFIIA-L (Yokomori et al.,
(TFIIAca/) 1998; Langelier et
TFIIEP al., 2001)
TFIIA-L -
TFIB (Uetz et al., 2000)
TFIIB
TFIIAy(S)- (Langelier et al.,
TFIIEa 2001)
(Okamoto et al.,
TFIIB-TFIIEP 19) (Ito et al., 2001)
1998)
(Fang and Burton,
TFIIB-TFIIFc 1996; Xenarios et
al., 2002)
(Ha et al., 1993;
TFIIB-TFIIFp Fang and Burton, (Ito et al., 2001)
1996)
Se (Austin and Biggin, (Riechmann and
TFIIEa-TFIIEp 1998) 1996; Giot et al., Ratcliffe, 2000;
19) 2003) Uetz et al., 2000)
(Okamoto et al.,
TFIIE-TFIIE1998)
1998)
(Okamoto et al.,
TFIIEp-TFIIFa '
1998)
(Okamoto et al.,
TFIIEP-TFIIF 1998)
(Flores et al., 1990; n ad
TFIIFa-TFIIFp Killeen and 1996)
Greenblatt, 1992) 1996)





































Figure 1-2. Binary protein-protein interactions of the Homo sapiens general
transcription factors TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and their
homologs.



































Figure 1-3. Binary protein-protein interactions of Drosophila melanogaster general
transcription factors TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and their
homologs.




































Figure 1-4. Binary protein-protein interactions of Saccharomyces cerevisiae general
transcription factors TFIIA, TFIIB, TFIID, TFIIE, and TFIIF.
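Binary interaction maps such as those summarized in Tables 1-2 and 1-3 and drawn in
Figures 1-2 through 1-4 can be represented as an undirected graph. The following is a
minimal sketch under stated assumptions: the edge list is a small, arbitrary selection
of row labels from the two tables, pooled without regard to species, and the names
EDGES, build_graph, and partners are illustrative and not part of the original work.

```python
from collections import defaultdict

# A handful of interaction pairs copied from row labels of Tables 1-2 and 1-3.
# Assumption: species is ignored here; this is not a complete or curated map.
EDGES = [
    ("TBP", "TAF1"),
    ("TBP", "TAF2"),
    ("TAF1", "TAF2"),
    ("TAF5", "TAF6"),
    ("TAF10", "TAF10"),        # self-interaction
    ("TAF10", "TAF14"),
    ("TAF14", "TFIIFalpha"),
]


def build_graph(edges):
    """Build an undirected adjacency map: protein -> set of binding partners."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def partners(graph, protein):
    """Return the sorted list of reported partners for one protein."""
    return sorted(graph.get(protein, set()))


graph = build_graph(EDGES)
print(partners(graph, "TBP"))    # ['TAF1', 'TAF2']
print(partners(graph, "TAF10"))  # ['TAF10', 'TAF14']
```

Storing partners in sets makes each pair symmetric and collapses duplicate reports of
the same interaction, which is convenient when several references support one row.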














CHAPTER 2
PHYLOGENETIC ANALYSIS OF POPLAR, Arabidopsis AND OTHER PLANT
GENERAL TRANSCRIPTION FACTORS

Introduction

While there has been a great deal of interest in DNA-binding transcriptional

activators in plants, comparatively little work has been done on plant regulatory factors

downstream of the activators themselves. One exception is a current project that seeks to

clone and characterize the majority of chromatin regulators that are readily identifiable

based on similarity (Pandey et al., 2002). Another exception is the plant histone

deacetylases, which have been characterized to some degree (reviewed in Lawit
and Czarnecka-Verner, 2002). Since plant histone deacetylases have historically been of

interest due to their involvement with the maize disease caused by an inhibitor, HC-toxin

from Cochliobolus carbonum, they have been characterized more so with respect to

enzymology than their function in protein complexes or gene regulation. Only two plant

histone acetyltransferase complexes have been characterized to any extent: a SAGA-like

complex in Arabidopsis (Stockinger et al., 2001), and TFIID in wheat (Washburn et al.,

1997).

The present study provides a detailed phylogenetic analysis of peptide subunits from the
principal eukaryotic complexes responsible for transcriptional activation in plants. The
model plant system Arabidopsis thaliana (ecotype Columbia) was chosen as a primary source
of peptide sequences because it was the first of the plant genomes to be fully sequenced
and annotated (The Arabidopsis Genome Initiative, 2000). General transcription factors
(GTFs) from plant systems are readily identifiable using homology-based searches since
these proteins are highly conserved within eukaryotes and in some cases within Archaea as
well.

Putative peptide subunits of plant GTFs TFIIA, TFIIB, TFIID, TFIIE, and TFIIF were
identified using basic local alignment search tool (BLAST) searches of the Arabidopsis
genomic data. In many cases, multiple genes were identified in Arabidopsis based on
sequence similarity with GTFs in other eukaryotes. These putative Arabidopsis GTFs were
then used to identify analogous genes in other plants to evaluate how widespread the gene
duplication and evolutionary changes found in Arabidopsis were in the plant kingdom.
Phylogenetic analyses were performed using these other plant homologs and the
well-characterized proteins from Homo sapiens, Drosophila melanogaster, and Saccharomyces
cerevisiae. The greatest emphasis was placed on identifying homologs in Arabidopsis,
rice, and poplar (given that these plant genome projects were the most advanced). This
provided an examination of differences between GTFs from dicotyledonous and
monocotyledonous plants, as well as between herbaceous and woody dicotyledonous plants.

Functional analysis tied with genome sequence scrutiny has identified families of

transcription factors found uniquely in plants (Riechmann et al., 2000). These large

families have encouraged efforts to systematically examine the functions of the

individual family members looking for redundancy and divergence (Eulgem et al., 2000;

Jakoby et al., 2002; Heim et al., 2003; Toledo-Ortiz et al., 2003). However, a similar

kingdom-level investigation of GTFs has not been reported. There are numerous

examples of GTF family members in metazoans that are specialized for certain functions.









Mammalian TRF2 (TBP-related factor 2) appears to be required for spermiogenesis,

Drosophila Cannonball (TAF5L) is required for spermatogenesis, and human TAF4b is

part of a unique TFIID complex in follicle cells of the ovary (reviewed in Levine and

Tjian, 2003). These and other findings suggest that some GTF homologs have evolved to

undertake specialized functions in animals, and similar events may have occurred in

plants. The present analysis was undertaken because: 1) the sequences of three divergent
seed-plant genomes are nearly complete and have the potential to allow discovery of
plant-specific GTF family members; and 2) transcriptional regulation is a fundamental
adaptive mechanism in plants to short-term stresses, and information has hitherto been
lacking on potential functional redundancy or divergence in GTFs.

Methods

The putative proteins of Arabidopsis GTFs TFIIA, TFIIB, TFIID, TFIIE, and

TFIIF were identified using two BLAST resources. The National Center for

Biotechnology Information protein BLAST (Arabidopsis thaliana organism setting; no

filtering) was used to identify the following GTF subunits: TBP1, TBP2, TAF1, TAF1b,

TAF2, TAF6, TAF6b, TAF9, TAF10, TAF11, TAF11b, and TAF12. The Arabidopsis

Information Resource (TAIR) WU-BLAST2 BLASTp using the protein database was

used to identify TAF4, TAF4b, TAF5, TAF7, TAF8, TAF13, TAF14, TAF14b, TAF15,

TAF15b, TFIIA-S, TFIIA-L1-3, TFIIB1-6, TFIIEα1-3, TFIIEβ1-2, TFIIFα, and

TFIIFβ1-2 (see Table 2-1 for a summary of these genes and their respective proteins). All

searches used human protein sequences as queries, with the exception of the TAF14

search in which the yeast protein sequence was used. Iterative searches were performed

using the identified Arabidopsis protein sequences to identify TAF12b with TAIR WU-BLAST2

BLASTp. Nucleotide sequences and other relevant annotated data were

identified by following the appropriate links provided by TAIR and the US National

Center for Biotechnology Information (NCBI).

Putative poplar GTFs were identified in the Department of Energy Joint Genome

Institute poplar database genomic sequence of the female black cottonwood (Populus

balsamifera subsp. trichocarpa) clone Nisqually-1 (Wullschleger et al., 2002). Searches

were performed using the tBLASTn (Altschul et al., 1990; Altschul et al., 1997) function

at http://aluminum.jgi-psf.org/prod/bin/runBlast.pl?db=poplar0&dump=l&matchReads=l. Genomic contig

sequences were assembled using Contig Assembly Program 3 (CAP3) (Huang and

Madan, 1999). Contig gaps were filled using iterative searches of the genomic sequence

(BLASTn) and CAP3 assembly. Contigs were analyzed to predict cDNA sequence using

Softberry software (http://www.softberry.com/berry.phtml?topic=gfind) FGENESH with

either the Dicots (Arabidopsis) or Nicotiana tabacum settings (Chicurel, 2002). Since

there is a disagreement between taxonomists and the group sequencing the Populus

genome as to the proper name of the species, either Populus trichocarpa or Populus balsamifera subsp.

trichocarpa, these two names are used interchangeably in this text.
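The contig-extension step described above (joining genomic fragments that share sequence at their ends) can be illustrated with a minimal sketch. This is not the CAP3 algorithm, which tolerates mismatches and uses quality values; it only merges two sequences through an exact suffix-prefix overlap. The function names and the `min_len` threshold are hypothetical choices made for illustration.

```python
def longest_overlap(left: str, right: str, min_len: int = 20) -> int:
    """Return the length of the longest suffix of `left` that exactly
    matches a prefix of `right`, or 0 if it is shorter than `min_len`."""
    if min_len < 1:
        raise ValueError("min_len must be at least 1")
    max_len = min(len(left), len(right))
    # Try the longest candidate overlap first and shrink until one matches.
    for k in range(max_len, min_len - 1, -1):
        if left[-k:] == right[:k]:
            return k
    return 0


def merge_contigs(left: str, right: str, min_len: int = 20):
    """Merge two contigs through an exact end overlap.

    Returns the merged sequence, or None when no overlap of at least
    `min_len` bases exists (the gap would then need further searches).
    """
    k = longest_overlap(left, right, min_len)
    if k == 0:
        return None
    return left + right[k:]


if __name__ == "__main__":
    # Toy sequences (invented for illustration, not poplar data).
    a = "ACGTACGTTTGACC"
    b = "TTGACCGGATA"
    print(merge_contigs(a, b, min_len=4))
```

In practice an overlap threshold much larger than the toy value used here would be needed to avoid chance joins in low-complexity regions.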

Putative GTF amino acid sequences from plants other than Arabidopsis and

poplar were identified using three different resources with Arabidopsis protein sequences

as queries. NCBI BLASTp (http://www.ncbi.nlm.nih.gov/BLAST/) was utilized to

search all available protein sequences using the Viridiplantae organism setting. H.

sapiens, D. melanogaster, and S. cerevisiae sequences were collected using text searches

in Entrez (http://www.ncbi.nlm.nih.gov/Entrez/). TBP, TFIIB, and TFIIEα homologies









can be traced back to Archaea (Rowlands et al., 1994; Qureshi et al., 1995; Bell et al.,

2001); therefore, in the case of these proteins archaeal homologs were assembled as well.

The Institute for Genomic Research (TIGR) plant expressed sequence tag (EST) databases

were searched using Arabidopsis sequence queries and a tBLASTn algorithm

(http://tigrblast.tigr.org/tgi/). Only full-length EST contigs were considered and the open

reading frame (ORF) encoding the sequence with GTF similarity was translated using

Java based Molecular Biologist's Workbench 1.1 (JaMBW 1.1) (Toldo, 1997)

TranslatER. Finally, the Plant Genome Database (PlantGDB; http://www.plantgdb.org/)

was searched with Arabidopsis queries using BLASTp on protein sequences and

tBLASTn on EST contigs (ORFs were translated as above). The sequences of predicted

TATA binding proteins from poplar were assembled as above by John Davis.
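The ORF translation step applied to EST contigs can be sketched as follows. This is an illustrative stand-in, not the JaMBW TranslatER implementation: it scans all six reading frames, keeps only ORFs that begin with ATG and end at a stop codon, and returns the longest translation using the standard genetic code. The function names are hypothetical.

```python
# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq: str) -> str:
    """Reverse-complement an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def longest_orf_protein(dna: str) -> str:
    """Translate the longest complete ORF (ATG ... stop) in six frames.

    Codons containing ambiguous bases are translated as 'X'.
    Returns an empty string when no complete ORF is present.
    """
    dna = dna.upper()
    best = ""
    for strand in (dna, reverse_complement(dna)):
        for frame in range(3):
            current = None  # residues of the ORF being read, if any
            for i in range(frame, len(strand) - 2, 3):
                aa = CODON_TABLE.get(strand[i:i + 3], "X")
                if current is None:
                    if aa == "M":
                        current = ["M"]
                elif aa == "*":
                    if len(current) > len(best):
                        best = "".join(current)
                    current = None
                else:
                    current.append(aa)
    return best


if __name__ == "__main__":
    # Toy sequence (invented for illustration).
    print(longest_orf_protein("CCATGGCTACTGGTGAATTTGAATTTTAAGG"))
```

Requiring a stop codon is a conservative choice consistent with considering only full-length EST contigs; a truncated contig would need a different rule.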

The collected amino acid sequences of each GTF homology group were aligned

using the Gonnet weight matrix (Gonnet et al., 1994) on ClustalX software (Thompson et

al., 1997). Unaligned N-terminal or C-terminal sequence extensions were deleted by

hand and the datasets were realigned to conservatively estimate phylogenetic distances based on

conserved protein regions. ClustalX alignment outputs were produced in Nexus format

and phylogenetically analyzed using PAUP* (Phylogenetic Analysis Using Parsimony

*and other methods) parsimony with 500 bootstrap replicates (PAUP analysis done by

Ram Kishore Alavalapati) (Swofford, 2003). Phylogenetic trees were created using

TreeView (Page, 1996).
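In this work the unaligned N-terminal and C-terminal extensions were removed by hand before realignment. A programmatic approximation of that step is sketched below, under the assumption that an "extension" is any run of leading or trailing alignment columns in which too few sequences carry a residue. The `min_occupancy` threshold and the function name are hypothetical and were not part of the original procedure.

```python
def trim_alignment_ends(alignment: dict, min_occupancy: float = 0.8,
                        gap: str = "-") -> dict:
    """Trim poorly occupied leading and trailing columns from an alignment.

    `alignment` maps sequence names to equal-length gapped sequences.
    Columns are kept from the first to the last column in which at least
    `min_occupancy` of the sequences have a residue rather than a gap.
    Interior columns are never removed.
    """
    seqs = list(alignment.values())
    if not seqs:
        raise ValueError("alignment is empty")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("aligned sequences must have equal length")

    def occupied(col: int) -> bool:
        filled = sum(1 for s in seqs if s[col] != gap)
        return filled / len(seqs) >= min_occupancy

    kept = [c for c in range(length) if occupied(c)]
    if not kept:
        return {name: "" for name in alignment}
    start, end = kept[0], kept[-1] + 1
    return {name: s[start:end] for name, s in alignment.items()}


if __name__ == "__main__":
    # Toy alignment (invented): only the shared core should survive.
    toy = {
        "a": "MKT--ACDEF--",
        "b": "---GHACDEFKL",
        "c": "-----ACDEF--",
    }
    print(trim_alignment_ends(toy))
```

After trimming, the gaps would be stripped and the shortened sequences realigned, as described above, before export to Nexus format.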

Similarity and identity matrices were created using Matrix Global Alignment Tool

(MatGAT; http://www.angelfire.com/nj2/arabidopsis/MatGAT.html) with the blocks

substitution matrix 62 (BLOSUM62) similarity matrix (Campanella et al., 2003).









Truncated sequences were used as input data for those cases in which protein sequences

were shortened for ClustalX alignments.
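The percentages reported by a similarity/identity matrix can be illustrated for a single, already aligned pair of sequences. The sketch below is not MatGAT: it does not perform the pairwise alignment itself, it replaces BLOSUM62 with a small set of conservative residue groups, and it divides by the number of alignment columns. All three choices are assumptions made only to keep the example short.

```python
# Illustrative conservative-substitution groups (NOT the BLOSUM62 matrix).
SIMILAR_GROUPS = ("ILVM", "FYW", "KRH", "DE", "NQ", "ST", "AG")


def percent_identity_similarity(a: str, b: str, gap: str = "-"):
    """Return (% identity, % similarity) for two aligned sequences.

    Columns where both sequences have a gap are ignored; the remaining
    columns, including those with a gap in one sequence, form the
    denominator. Identical residues also count as similar.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(a, b) if not (x == gap and y == gap)]
    if not columns:
        raise ValueError("no comparable columns")

    def similar(x: str, y: str) -> bool:
        return x == y or any(x in g and y in g for g in SIMILAR_GROUPS)

    identical = sum(1 for x, y in columns if x == y)
    alike = sum(1 for x, y in columns
                if x != gap and y != gap and similar(x, y))
    n = len(columns)
    return 100.0 * identical / n, 100.0 * alike / n


if __name__ == "__main__":
    # Toy pair (invented for illustration).
    print(percent_identity_similarity("MKVLAE-F", "MRILADGF"))
```

Because the denominator and the scoring scheme differ from those of MatGAT, values from this sketch are not expected to reproduce Table 2-2.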

Results

The amino acid sequences of the assembled GTFs are shown in Appendix A. The

amino acid multiple sequence alignments for core domains of the GTFs (sans N-terminal

and C-terminal extensions) are found in Appendix B. Similarity and identity ranges for

protein families are found in Table 2-2. Phylogenetic trees derived from the parsimony

analyses are found in Figures 2-1 through 2-12.

TFIIA Large and Small Subunits

The poplar TFIIA-L1 coding sequences predicted by the Softberry, GlimmerM

(http://www.tigr.org/tdb/glimmerm/glmrform.html) (Majoros et al., 2003), and

Eukaryotic GeneMark.hmm (http://opal.biology.gatech.edu/GeneMark/eukhmm.cgi)

(Lukashin and Borodovsky, 1998) programs all had a C-terminal

extension of varying length that had no homology with any other group, and all were

missing the last seven amino acids that are highly conserved among all other organisms

examined. However, further analysis using the GeneSeqer program

(http://www.plantgdb.org/cgi-bin/PlantGDB/GeneSeqer/PlantGDBgs.cgi) (Schlueter et

al., 2003), run with cDNA data from all plants and from Populus alone, predicted a shorter

C-terminus that was highly similar to those of other organisms and ended with the ATGEFEF plant

consensus. Since this model was truncated at the N-terminus due to the paucity of 5'

cDNA sequences, it was merged with the Softberry FGENESH output from the Nicotiana

tabacum and Arabidopsis thaliana settings to yield the final prediction of the poplar TFIIA-L1

coding sequence.









TFIIB Family

A large number of full-length cDNAs and predicted genes encoding TFIIB

homologs have been identified in plants (30 plant homologs in all). It appears that the

TFIIB protein-family has undergone many duplications as well as differentiations

including a novel homolog which has a functional connection to the plastid (Lagrange et

al., 2003). Arabidopsis and poplar both have four distinct phylogenetic TFIIB clusters

(Class A, Class C, Class D, and Class E). Clearly identifiable plant homologs of TFIIB-

related factors (BRFs) associated with DNA-dependent RNA polymerase III (PolIII)

were excluded from the phylogenetic analysis; however, it appears that distantly related

homologs from Lycopersicon esculentum and Populus may have remained.

Plant TFIIB homologs appear to have many conserved motifs first identified in the

metazoan TFIIB. Among these is a lysine residue that has recently been shown to be

autoacetylated in human and yeast TFIIBs (Choi et al., 2003). This lysine is conserved in

many members of the Class A TFIIB family (Figures 2-3 and 2-13). Similarly, the

putative zinc-ribbon domain at the N-terminus has been conserved in most family

members (Appendix B3). Although it is not apparent in Appendix B3 due to N-terminal

trimming for alignment purposes, AtTFIIB4 also contains this conserved metal-binding

domain. Poplar TFIIB3, poplar TFIIB9, poplar TFIIB8, Lycopersicon esculentum TFIIB

AF273333, and Sulfolobus solfataricus TFB AAK40772.1 are all missing the conserved

cysteine and/or histidine residues essential to this N-terminal motif. Since a known

functional TFB from the archaea Sulfolobus solfataricus is lacking this motif, it may not

be required for TFIIB function in all cases. Thus, poplar TFIIB3, poplar TFIIB9, poplar

TFIIB8, and Lycopersicon esculentum TFIIB AF273333 may all be functional TFIIB-

homologs despite the absence of this conserved motif.









Likewise, the imperfect direct repeats of amino acid sequences found in human

core-TFIIB (Nikolov et al., 1995) have been well conserved (Appendix B3). AtTFIIB6 is

lacking the second direct repeat region. This region is involved in protein-protein

interactions with PolII (Ha et al., 1993). Therefore, it is suggested that the AtTFIIB6

protein may be deficient in this PolII interaction, and could possibly function as a

negative regulator. Vitis vinifera TFIIB TC9302, as well as amino acid predictions from

poplar TFIIB4, TFIIB5 and TFIIB6 are notably lacking both direct repeats suggesting

that these proteins, if expressed, are not functional TFIIB homologs since they would

likely be deficient in TBP and PolII interactions (Ha et al., 1993).

In addition to the canonical TFIIB proteins (Class A) and BRFs (Class B), the

plastid envelope associated (Class C) TFIIB-like proteins (Lagrange et al., 2003) are

conserved in all plant lines with available sequence. The Arabidopsis AtTFIIB5/pBrp

shows two closely related homologs in Lycopersicon esculentum (Accession AAG01118)

and poplar (TFIIB7/pBrp) and high relatedness to the partial cDNAs from Spinacia

oleracea and Zea mays reported by Lagrange et al. (2003).

Representative TFIID Components

TBP has been highly conserved in plants (Table 2-2 and Figure 2-4), even in cases

where plants contain duplicate TBP genes. These duplicated TBP proteins are in all

cases highly similar and are not likely to have diverged functions. Plant TBP genes are

tightly clustered phylogenetically, although somewhat diverged from metazoan, fungal,

protistan, and archaeal TBPs and TBP-like proteins. Significantly, both imperfect repeat

motifs within the protein structure are conserved in all the TBP-like proteins.

Similarly to the case with TBP, A. thaliana has two loci that encode TAF6

homologs. Both of these genes (designated TAF6 and TAF6b) are transcribed and the









latter has at least four alternative splicing variants (E. Czarnecka-Verner, S.J. Lawit,

W.B. Gurley unpublished data). Clones were sequenced by the University of Florida

ICBR DNA sequencing facility. Intron-Exon diagrams of TAF6 and TAF6b isoforms are

shown in Figure 2-14.

The TAF6, TAF9, TAF10, and TAF11 phylogenies cluster readily into monocot and dicot

families. The plant proteins are well conserved and somewhat divergent from metazoan,

fungal and protistan proteins. However, poplar TAF9b is more related to the

Chlamydomonas reinhardtii TAF9 and S. cerevisiae TAF9 than the plant TAF9 proteins.

This TAF9 homolog is perhaps a TAF9-like protein involved in other transcriptional

complexes similarly to H. sapiens TAF9L (Chen and Manley, 2003). Similarly, poplar

TAF11 is roughly equally related to plant and fungal family members, suggesting that it

represents a more ancient form of TAF11 than is found in other plants.

TFIIEα and TFIIEβ Subunits

Similarly to the TAF proteins, the TFIIEα phylogeny pattern also formed along

dicot-monocot lines. The six dicot proteins and single monocot protein were diverged

and branched separately with 100% bootstrap support. Interestingly, the archaeal TFE

proteins were phylogenetically more similar to plant TFIIEα than to the yeast and

metazoan counterparts. This suggests a higher degree of primary structure conservation

in the plant TFIIEα proteins than in the other kingdoms, possibly indicating a greater

reliance on TFIIE for stabilization of the open promoter conformation.

Monocot TFIIEβ proteins cluster closely with one another as do dicots with the

exception of two proteins. These exceptions are the A. thaliana TFIIEβ family members.









Both proteins from Arabidopsis are found well outside of the core plant cluster

suggesting a significant divergence in sequence in the Arabidopsis TFIIEβ genes.

TFIIFα and TFIIFβ Subunits

A. thaliana TFIIFα has an 87 amino acid C-terminal extension and S. cerevisiae

TFIIFα has a 68 amino acid C-terminal extension that were removed from their

sequences for alignment purposes. Due to the large size of the TFIIFα subunit, relatively

few full-length plant cDNAs could be assembled for comparison. Furthermore, due to a

lack of sequence in one region of a poplar TFIIF gene, a full-length genomic region could

not be assembled. Therefore, only one of two poplar putative TFIIFα proteins was

phylogenetically analyzed. These proteins are well conserved throughout the length of

the protein.

The N-terminus and C-terminus of S. cerevisiae TFIIFβ were trimmed by 32 and 33

amino acids, respectively, for alignment purposes. Similarly, D. melanogaster TFIIFβ

was trimmed at the C-terminus by nine amino acids. TFIIFβ proteins are likewise well

conserved, with two exceptions. Both A. thaliana TFIIFβ1 and poplar TFIIFβ1 have

large, non-conserved insertions. It should be noted that neither of these genes has

cDNA clone representation; both are therefore only predictions (despite numerous attempts at

RT-PCR cloning of AtTFIIFβ1; data not shown).

Discussion

TFIIA Large and Small Subunits

TFIIA is composed of either two subunits (L and S) in fungi and plants or three

subunits (α, β, and γ) in metazoans, where α and β are derived from post-translational

cleavage of a protein homologous to the fungal and plant TFIIA-L subunit (Li et al., 1999).









TFIIA interacts with both the N-terminal stirrup of TBP and DNA upstream of the TATA

element (Langelier et al., 2001). TFIIA in A. thaliana seems to be encoded by four

genes, three encoding large subunit homologs and one encoding a small subunit homolog.

AtTFIIA-L1 and AtTFIIA-L2 appear to result from a recent gene-duplication due to their

high degree of identity and their close juxtaposition in chromosomal location (Fig. 2-2).

The genes encoding TFIIA-L1 and TFIIA-L2 are oriented in a tail-to-tail fashion with

their polyadenylation sites separated by 1,922 bp. AtTFIIA-L3 appears to have arisen

from a more ancient gene duplication, and has significantly diverged from other TFIIA-L

genes (Figure 2-2). The AtTFIIA-L3 protein is approximately half the size of its two

Arabidopsis homologs, although it has maintained a similar isoelectric point (pI) and

appears to be competent for assembly of the TFIIA complex based on yeast two-hybrid

interactions (Chapter 4). One hypothesis is that AtTFIIA-L3 represents an ancestral form

of the TFIIA-L protein family due to its phylogenetic clustering with fungal and

metazoan sequences.

Poplar TFIIA also appears to be encoded by four genes (two encoding large subunit

proteins, and two encoding small subunit proteins). Unfortunately, two contigs that most

likely encode one of the TFIIA-L genes could not be connected due to the presence of a

large, low-complexity (most likely intronic) region; however, the predicted TFIIA-L

amino acid sequence grouped solidly with other dicot TFIIA-L sequences (data not

shown). The predicted amino acid sequence from the incomplete TFIIA-L sequence has

a high degree of identity with the complete form (Fig. 2-2), suggesting a recent gene

duplication and redundancy in function. Interestingly, poplar encodes one small subunit that is highly

conserved (TFIIA-S1; Fig. 2-1) and one that is quite diverged from other plant proteins (TFIIA-S2),

lying nearly equidistant from other plant proteins and metazoan proteins.

Arabidopsis and poplar both encode TFIIA proteins that are highly diverged from

other plant proteins (a TFIIA-L3, and a TFIIA-S2, respectively). These proteins may

have evolved specialized functions within their respective organisms; however, they do

not seem to be conserved within other plants that have been sequenced. This suggests

that these proteins are potentially the products of evolutionary experiments in progress or

may be ancestral forms of these proteins that have been retained in their respective

species. However, the significance of their presence cannot be reliably predicted.

Overall, TFIIA conservation appears to be quite high. The TFIIA-S family is conserved

throughout the length of the protein, while the TFIIA-L family is conserved mainly at the

N-terminal and C-terminal ends. The TFIIA-L sequence conservation pattern is

consistent with the observation that human and fruit fly TFIIA is composed of three

subunits, the two largest of which are derived from proteolytic cleavage of the TFIIAα/β

(TFIIA-L) pre-protein. This suggests that the middle region of TFIIA-L proteins may

function as a flexible linker.

TFIIB Family

Full-length cDNAs or predicted genes for 30 TFIIB homologs have been identified in plants.

The TFIIB protein-family has undergone myriad duplications and differentiations

including one (the Class C TFIIB-related proteins) that has evolved a functional

interaction with the defining plant organelle, the plastid (Lagrange et al., 2003). The

canonical member of the Class C group (TFIIB5/pBrp) was discovered by Lagrange et al.

(2003) to bind the outer envelope of the plastid, suggesting a function in signal









transduction from the plastid to the nucleus. Six distinct phylogenetic TFIIB groups are

apparent in Arabidopsis and Populus (if one accounts for the BRFs). Clear homologs of

DNA-dependent RNA polymerase III (PolIII) associated TFIIB-related factors (BRFs)

from plants were excluded from my phylogenetic analysis with the exception of the

Arabidopsis proteins for use as an out-group.

Plant TFIIB homologs have a number of conserved motifs. These include a lysine

residue, located 28 amino acids from the N-terminus of the second direct repeat (in the

human sequence), which has recently been shown to be autoacetylated in human and

yeast TFIIBs (Choi et al., 2003). This lysine is conserved in many members of the plant

Class A TFIIB family (Figures 2-3 and 2-13) suggesting a conservation of this

autoacetylation activity in plants. Choi et al. (2003) did not identify the catalytic domain

of this autoacetylase activity; therefore, the conservation of this domain could not be

assessed. Choi et al. (2003) reported that the presence of the acetyllysine group in TFIIB

increases the affinity of this protein for TFIIF, implying a role in transcriptional

initiation. It is likely that an activity involved in this critical process will be conserved

not only in metazoans and fungi, but also in plants. Equally significant is the absence of

this lysine in several of the plant TFIIBs, suggesting plant-specific specialization among

members of the TFIIB family.

Similarly, the putative zinc-ribbon domain at the N-terminus has been conserved in

most family members (Appendix B3), including AtTFIIB4, although this is not apparent in

Appendix B3 due to N-terminal trimming. Significantly, poplar TFIIB3, poplar TFIIB8, poplar

TFIIB9, Lycopersicon esculentum TFIIB AF273333, and Sulfolobus solfataricus TFB

AAK40772.1 are all missing the conserved cysteine and/or histidine residues essential to









this N-terminal motif. However, at least one archaeal species (Sulfolobus solfataricus) is

lacking the zinc-ribbon in its TFB suggesting that this motif may not be required for

TFIIB function in all cases.

Other conserved domains, the imperfect direct repeats (Nikolov et al., 1995), are

found in most plant TFIIB homologs (Appendix B3). AtTFIIB6 is lacking the second

direct repeat region, which interacts directly with PolII in animals (Ha et al., 1993).

Therefore, it is suggested that this protein may be deficient in this PolII interaction and,

if it is functional, could possibly play a role as a negative regulator. Four TFIIB-

related proteins are lacking both direct repeats, suggesting that these proteins, if

expressed, are not functional TFIIB homologs (Ha et al., 1993).

In addition to the TFIIB-family proteins in Class A and Class B (BRFs, which were

not analyzed extensively in this study), a clear conservation has been shown for the

plastid envelope associated (Class C) TFIIB-like proteins in this study and by Lagrange

et al. (2003). This plant-specific TFIIB is localized to the outer plastid membrane and is

not detectable in the nucleus of wild type plants (Lagrange et al., 2003). The

characterized protein AtTFIIB5/pBrp has two closely related homologs (Lycopersicon

esculentum Accession AAG01118, and poplar TFIIB7/pBrp) in addition to the partial

cDNAs from Spinacia oleracea and Zea mays reported by Lagrange et al. (2003). This

suggests that this protein has a conserved activity that is critical to plant cell functions.

Lagrange et al. (2003) suggested a functional model in which a plastid-derived signal

leads to release of the TFIIB5 protein from the outer envelope and movement into the

nucleus. Once in the nucleus, TFIIB5 induces expression of its unique transcriptosome.

The nuclear accumulation of TFIIB5 in plants deficient in COP9 suggests proteasome-









mediated degradation provides a rapid turnover of the nuclear-localized TFIIB5 protein,

facilitating temporal control of the signal response. However, this model does not

explain the lack of TFIIB5/pBrp protein on the plastid under conditions of

proteasome/COP9 signalosome dysfunction.

In the present study, a model is proposed in which the COP9 signalosome leads to

degradation of a TFIIB5/pBrp co-factor that allows the protein to move to the nucleus. Such a co-

factor may be either a chaperone that escorts TFIIB5/pBrp to the nucleus, a chloroplast-

docking antagonist, or a post-translational modifying regulator (e.g. a kinase or a 14-3-3

protein). In any of these cases, TFIIB5/pBrp transport to the nucleus and induction of

transcription is likely to be the culmination of this signal transduction pathway.

Whatever the role of TFIIB5/pBrp, the Class C subfamily of TFIIB-like factors plays a

novel role in plant transcriptional regulation.

There is weak bootstrap support for two additional conserved classes of TFIIB-like

proteins in plants. The Class D group contains Arabidopsis TFIIB4 and Poplar TFIIB8.

Class E contains Arabidopsis TFIIB3 and TFIIB6, as well as Poplar TFIIB2. The

functions of these proteins are unknown; however, Arabidopsis TFIIB3 and TFIIB6 have

similar interactions with other GTFs (Chapter 4).

Representative TFIID Components

TBP is widely regarded as being the rate-limiting factor of PIC formation.

Consistent with this critical role, it is among the most highly conserved proteins of the

GTFs, through all the organisms examined in this study, with 73.7% and 63.1% average

similarity and identity, respectively. Likewise, TBP is highly conserved in plants (84%

average identity). Similarly to animals, many plants contain duplicate TBP genes;

however, unlike the case in animals the plant proteins are highly similar and are likely to









be largely redundant. Plant TBPs are tightly clustered, although significantly diverged

from metazoan, fungal, protist, and archaeal TBPs and TBP-like proteins. As would be

expected, the two repeated structural domains of TBP are conserved in all the proteins in the

TBP-like family.

In general, TBP-associated factors are more highly variable than TBP. There are

many cases of duplicate TAFs as well as TAF-like proteins in fungi and animals

(reviewed in Tora, 2002). One example of this in plants is the presence of two genetic

loci encoding homologs of TAF6 in Arabidopsis. Upon cloning of these cDNAs

(Chapter 3) it was found that one of these genes, TAF6b, is represented by four

alternatively spliced mRNAs (Figure 2-14). TAF6b has 12 coding exons in the mRNAs

of three isoforms, and TAF6b-4 has only 5 coding exons due to what appears to be a

premature stop codon in exon V caused by the lack of splicing of intron II. In contrast,

TAF6 has 11 coding exons and no detected alternative splicing.

The TAF genes that have been investigated in this work are clearly divergent along

taxonomic lines. This is clearly demonstrated by TAF9, of which monocot and dicot

TAF9 sequences cluster separately in the unrooted phylogram (Figure 2-6). This

situation is also evident in the TAF6 phylogram (Figure 2-5). However, poplar TAF9b is

more closely related to the C. reinhardtii TAF9 and S. cerevisiae TAF9 than the monocot

or dicot TAF9 proteins. This TAF9 homolog is perhaps a TAF9-like protein involved in

other transcriptional complexes similarly to H. sapiens TAF9L (Chen and Manley, 2003).

A second possibility is that poplar TAF9b may be a bona fide TAF9 that regulates a

subset of genes in poplar, or is merely redundant. Finally, poplar TAF9b could represent

an ancient form of the gene that has been maintained in this lineage.









Similarly to the situation with TAF9, TAF10 proteins are plainly grouped as either

monocots or dicots with other kingdoms clustering separately. One gymnosperm (Pine)

TAF10 protein was included in this analysis, and it was found to be more similar to the

TAF10 proteins of dicots than those of monocots, consistent with the more recent

evolution of monocots.

TAF11 proteins are similarly clustered in the phylogram, with the exception of the

protein encoded by poplar TAF11. Poplar TAF11 is equally similar to yeast and plant

TAF11 proteins. Interestingly, AtTAF11 is located on Arabidopsis chromosome 4 only

five loci away from TFIIEβ2 (~27 kbp), which is in turn located very near TFIIEα2 (see

TFIIE section below). Although TAF11 has no known direct connection to TFIIE, this

close genomic proximity seems unusually coincidental. Chromosome 3 of Oryza sativa

(rice) has genes encoding both TAF11-like and TFIIEβ-like proteins; however, these

genes are separated by over 8 Mb. Unfortunately, fine mapping of chromosomes has

not yet been performed for any dicot except Arabidopsis; therefore,

comparison of GTF synteny within dicots is not yet possible.

Arabidopsis TAF11b does not have representation in EST collections, nor has it

been amplified by RT-PCR. These data suggest that AtTAF11b may be a non-expressed

or very low expression gene. This hypothesis is supported by the evolutionary

divergence of the protein sequence in relation to other plant TAF11 amino acid

sequences. However, the sequence of AtTAF11 is actually more similar to other plant

sequences than the putative poplar TAF11 protein.









TFIIEα and TFIIEβ Subunits

A. thaliana has genes encoding three homologs of TFIIEα and two of TFIIEβ

(Table 2-1). TFIIEα and TFIIEβ of H. sapiens are acidic (pI of 4.5) and basic (pI of 9.5)

proteins, respectively (Peterson et al., 1991). The acidic properties of TFIIEα appear to

be well conserved in Arabidopsis with pI values of 4.75, 4.95, and 4.72 for Eα1, Eα2,

and Eα3, respectively. Likewise, the basic pI values are conserved in Arabidopsis

TFIIEβ proteins (10.23 and 10.04 respectively for Eβ1 and Eβ2).
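A predicted pI of the kind compared here is the pH at which the net charge of the protein is zero. The sketch below estimates it by bisection on the Henderson-Hasselbalch net charge. The pKa values are one commonly used approximate set and are an assumption: the tool and pKa table behind the values in Table 2-1 are not stated in this chapter, so results from this sketch may differ slightly from the tabulated values.

```python
# Approximate side-chain and terminal pKa values (assumed, illustrative).
POSITIVE_PKA = {"K": 10.8, "R": 12.5, "H": 6.5}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
N_TERM_PKA = 8.6
C_TERM_PKA = 3.6


def net_charge(sequence: str, ph: float) -> float:
    """Net charge of a protein at a given pH (Henderson-Hasselbalch)."""
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERM_PKA))    # free amino terminus
    charge -= 1.0 / (1.0 + 10 ** (C_TERM_PKA - ph))   # free carboxyl terminus
    for aa in sequence.upper():
        if aa in POSITIVE_PKA:
            charge += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[aa]))
        elif aa in NEGATIVE_PKA:
            charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_PKA[aa] - ph))
    return charge


def isoelectric_point(sequence: str, tolerance: float = 1e-4) -> float:
    """Estimate pI by bisection; net charge falls monotonically with pH."""
    low, high = 0.0, 14.0
    while high - low > tolerance:
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid    # still positively charged: pI lies at higher pH
        else:
            high = mid
    return round((low + high) / 2.0, 2)


if __name__ == "__main__":
    # Toy peptides (invented): one acidic, one basic.
    print(isoelectric_point("DDEEDDEE"), isoelectric_point("KKRRKKRR"))
```

An acidic, TFIIEα-like composition yields a low pI and a basic, TFIIEβ-like composition a high one, which is the contrast drawn in the paragraph above.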

Four of these Arabidopsis TFIIE genes are clustered on chromosome 4. TFIIEα2

and TFIIEβ2 neighbor each other in a head-to-head inverted fashion, sharing a common

promoter region. TFIIEα3 and TFIIEβ1 are in relatively close proximity, both in the

same orientation (18 genetic loci inserted between the genes, 83 kbp apart). The extreme

proximity (only 972 bp between start codons) of TFIIEα2 and TFIIEβ2 suggests that they

are direct descendants of the ancestral genes in plants and have been duplicated to create

the other loci. This hypothesis is supported by phylogenetic data in the case of TFIIEβ2

in which the Arabidopsis protein (along with the gene product of TFIIEβ1) is clustered

separately from all other TFIIEβ proteins. However, the TFIIEα2 protein clusters with

the other Arabidopsis TFIIEα sequences, within the dicot grouping.

TFIIFα Family

The poplar genome clearly encodes two TFIIFα genes; however, two contigs

encoding what appears to be the N-terminal and C-terminal regions could not be

connected due to lack of sequence in what appears to be a low-complexity intronic









region. Therefore, only one poplar putative-TFIIFα amino acid sequence is included in

my analyses.

TFIIFα is highly conserved throughout the length of the primary structure. In

metazoans and yeast, TFIIFα can be functionally divided into three domains: the N-

terminal TFIIFβ binding domain, the highly charged middle domain, and the C-terminal

winged helix domain (Kamada et al., 2001). The high conservation of the plant TFIIFα

primary structures carries through to the hydrophobic residues in the C-termini

(Appendix B). Therefore, the conclusions of Kamada et al. (2001) are followed,

suggesting that these proteins have a conserved winged helix domain in their C-termini.

This winged helix domain is not yet implicated in DNA-binding as are the winged helix

domains of TFIIEβ, TFIIFβ, and many of the winged helix superfamily members

(Kamada et al., 2001).

TFIIFα has been reported to contain a serine/threonine kinase activity that is

involved in transcriptional elongation (Rossignol et al., 1999). However, Rossignol and

co-workers (1999) were unable to find an identifiable ATPase domain in TFIIFα.

Interestingly, I have identified a weak similarity with AAA ATPase VWA-containing

proteins in all the plant TFIIFα homologs studied within this work, suggesting that this

kinase activity is retained in plants.

TFIIFβ Family

The poplar genomic sequence has four coding regions with homology to the

Arabidopsis TFIIFβ proteins. However, only three of the poplar contigs created from the

genomic sequences appear to be transcribed into RNA. The fourth sequence (analyzed

using BLASTx) (Altschul et al., 1997) does not support an mRNA prediction of









reasonable length and encodes a stop codon within a highly conserved region of the

predicted amino acid sequence. Thus, this fourth contig region is most likely a remnant

of a non-functional gene duplication.

TFIIFβ proteins are highly conserved, except for large, non-conserved insertions in

A. thaliana TFIIFβ1 and poplar TFIIFβ1. It should be noted that neither of these genes

has cDNA clone representation; both are therefore only predictions. Despite the lack of

cDNA support for these genes, the insertions seem to be grouped between conserved

functional domains identified by Tan et al. (1995). Therefore, these insertions may be in

flexible linker regions between more defined structural domains suggesting that these

gene products, if expressed, may be functional.

The tight clustering of GTF proteins (TAF6, TAF9, TAF10, TAF11, TFIIEα, and

TFIIEβ) along evolutionary lines suggests that these proteins are evolving with the

species that encode them in their genomes and have functions that are somewhat resistant

to minor sequence variations. This may be due to linker regions in proteins that have

little need for conservation of sequence. Conversely, the lack of obvious clustering of

proteins within evolutionary groupings below the kingdom level (such as in TFIIB and

TBP) suggests that these proteins are very tightly conserved and that minor sequence alterations

may drastically disrupt functions. Thus in the case of TFIIB and TBP, it appears more

likely that the conservation may be so tight (at least within subclasses) that very few

changes have occurred below the kingdom level and thus phylogenetic analysis cannot

produce subgroups reliably. Three plant-specific clusters of TFIIB-like proteins have

been identified. At least one of these has evidence of a plant-specific function due to its

intracellular localization to the plastid outer membrane under normal conditions

(Lagrange et al., 2003).









Table 2-1. Arabidopsis GTF genes, loci, genomic sizes, coding sequence sizes (counting stop codons), predicted protein molecular weights, and pI of the predicted proteins.

Gene        Locus        Genomic size    CDS size    Predicted    Predicted
                         of CDS (bp)     (bp)        Mw (kDa)     pI
TFIIA-S     At4g24440    1,094           321         12.1         5.61
TFIIA-L1    At1g07480    2,510           1,128       41.3         3.98
TFIIA-L2    At1g07470    2,628           1,128       41.2         4.02
TFIIA-L3    At5g59230    825             561         20.9         3.94
TFIIB1      At2g41630    1,846           939         34.3         6.77
TFIIB2      At3g10330    1,736           939         34.2         6.66
TFIIB3      At3g29380    1,118           1,011       37.7         6.32
TFIIB4      At3g57370    1,637           1,083       39.7         7.76
TFIIB5      At4g36650    2,245           1,512       55.7         6.14
TFIIB6      At4g10680    549             549         19.9         8.89
TBP1        At3g13445    1,453           603         22.4         10.21
TBP2        At1g55520    1,334           603         22.4         10.31
TAF1        At1g32750    9,877           5,760       217.2        5.55
TAF1b       At3g19040    8,107           5,103       192.1        7.65
TAF2        At1g73960    9,201           4,113       153.5        6.18
TAF4        At5g43130    5,537           2,163       80.6         9.34
TAF4b       At1g27720    4,026           1,854       68.9         9.84
TAF5        At5g25150    4,942           2,010       74.4         6.65
TAF6        At1g04950    3,508           1,584       58.9         8.73
TAF6b1      At1g54360    2,562           1,515       56.5         8.83
TAF6b2      At1g54360    2,562           1,494       55.7         8.84
TAF6b3      At1g54360    2,562           1,431       53.1         8.56
TAF6b4      At1g54360    857             588         22.6         9.62
TAF7        At1g55300    1,321           612         22.5         4.11
TAF8        At4g34340    1,062           1,062       39.5         4.96
TAF9        At1g54140    794             552         20.6         4.67
TAF10       At4g31720    1,482           405         14.9         5.50
TAF11       At4g20280    873             633         23.7         5.39
TAF11b      At1g20000    769             615         23.4         9.59
TAF12       At3g10070    2,604           1,620       57.7         10.42
TAF12b      At1g17440    3,257           2,052       74.8         10.46
TAF13       At1g02680    836             384         14.3         5.81
TAF14       At2g18000    1,025           609         22.8         6.16
TAF14b      At5g45600    1,608           807         30.2         7.03
TAF15       At1g50300    2,845           1,119       41.3         7.98
TAF15b-1    At5g58470    2,131           1,164       38.9         7.93
TAF15b-2    At5g58470    2,289           1,269       42.3         8.73
TFIIEα1     At1g03280    3,124           1,440       54.1         4.75









Table 2-1. Continued.

Gene        Locus        Genomic size    CDS size    Predicted    Predicted
                         of CDS (bp)     (bp)        Mw (kDa)     pI
TFIIEα2     At4g20340    2,738           1,428       54.6         4.95
TFIIEα3     At4g20810    2,171           1,251       47.8         4.72
TFIIEβ1     At4g21010    1,047           828         31.5         10.23
TFIIEβ2     At4g20330    1,357           861         32.4         10.04
TFIIFα      At4g12610    3,349           1,950       72.3         5.22
TFIIFβ1     At3g52270    1,708           1,095       42.1         7.70
TFIIFβ2     At1g75510    1,376           761         29.7         6.92
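The relationship between the CDS size and predicted molecular weight columns of Table 2-1 can be cross-checked with a rough calculation. The following is only a minimal sketch, not the method used to generate the table: it assumes an average residue mass of about 110 Da (a common rule of thumb) and that each CDS length includes one 3-bp stop codon, as the table caption states. The function names are hypothetical.

```python
# Rule-of-thumb average mass of one amino acid residue (an assumption,
# not a value taken from the dissertation).
AVG_RESIDUE_MASS_DA = 110.0


def residues_from_cds(cds_bp: int) -> int:
    """Number of amino acids encoded by a CDS whose length includes the stop codon."""
    if cds_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return cds_bp // 3 - 1  # subtract the stop codon


def approx_mw_kda(cds_bp: int) -> float:
    """Approximate protein mass in kDa from the CDS length in base pairs."""
    return residues_from_cds(cds_bp) * AVG_RESIDUE_MASS_DA / 1000.0


if __name__ == "__main__":
    # (CDS size in bp, predicted Mw in kDa) as listed in Table 2-1.
    examples = {"TFIIA-S": (321, 12.1), "TBP1": (603, 22.4), "TAF1": (5760, 217.2)}
    for gene, (cds_bp, table_mw) in examples.items():
        print(f"{gene}: {residues_from_cds(cds_bp)} residues, "
              f"approx. {approx_mw_kda(cds_bp):.1f} kDa (table: {table_mw} kDa)")
```

The estimates should fall close to, but not exactly on, the tabulated values, because the true mass depends on amino acid composition.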









Table 2-2: Similarity and identity percentage ranges of the GTF protein families examined.

Protein Family    Similarity Range      Identity Range        Similarity Range        Identity Range
                  (Average)             (Average)             within Plants           within Plants
                                                              (Average)               (Average)
TFIIA-S           57.4-100.0 (83.5)     36.2-100.0 (72.2)     62.5-100.0 (90.1)       46.7-100.0 (83.4)
TFIIA-L           23.8-97.3 (49.4)      14.8-96.5 (34.5)      33.6-97.3 (63.2)        21.5-96.5 (50.5)
TFIIB (all)       13.2-99.7 (46.0)      8.6-99.4 (33.2)       16.3-99.7 (50.4)        10.0-99.4 (37.8)
TFIIB Class A     21.7-99.7 (58.3)      10.0-99.4 (46.8)      24.1-99.7 (63.3)        10.0-99.4 (54.7)
TFIIB Class C     88.4-91.2 (89.9)      80.2-86.3 (82.7)      88.4-91.2 (89.9)        80.2-86.3 (82.7)
TBP               31.4-99.5 (73.7)      16.5-98.5 (63.1)      61.5-99.5 (88.2)        52.8-98.5 (84.0)
TAF6              15.5-98.6 (49.8)      10.2-98.6 (35.2)      20.0-98.6 (59.3)        15.2-98.6 (47.5)
TAF9              22-98.4 (52.7)        12.1-96.2 (40.1)      24.1-98.4 (61.8)        13.2-96.2 (50.7)
TAF10             47.3-98.7 (71.5)      22.0-98.0 (57.3)      66.9-98.7 (82.5)        49.7-98.0 (71.6)
TAF11             21.4-97.2 (51.1)      10.6-96.3 (35.5)      31.3-97.2 (61.6)        17.2-96.3 (47.1)
TFIIEα            15.4-85.0 (42.6)      8.8-73.5 (28.3)       53.3-85.0 (69.1)        39.6-73.5 (53.5)
TFIIEβ            36.3-100.0 (69.6)     16.8-99.6 (53.4)      67.4-100.0 (80.7)       48.6-99.6 (67.1)
TFIIFα            29.1-94.9 (49.2)      15.1-89.1 (33.4)      70.8-94.9 (76.2)        58.5-89.1 (65.3)
TFIIFβ            35.4-100 (62.3)       18.8-99.6 (46.1)      48.9-100 (74.4)         34.8-99.6 (60.2)
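Ranges and averages of the kind reported in Table 2-2 can be derived from pairwise comparisons of aligned sequences. The sketch below is illustrative only and rests on several assumptions: the software and scoring scheme used for the table are not stated in this section, so the similarity groups, the choice to ignore columns containing a gap, and all function names are hypothetical.

```python
from itertools import combinations

# Illustrative groups of chemically similar residues (an assumption; the
# scoring scheme behind Table 2-2 is not given in this section).
SIMILAR_GROUPS = ("ILVM", "FWY", "KRH", "DE", "NQ", "ST", "AG")


def is_similar(a: str, b: str) -> bool:
    """True if two residues are identical or share an illustrative group."""
    return a == b or any(a in g and b in g for g in SIMILAR_GROUPS)


def pairwise_percentages(seq1: str, seq2: str) -> tuple[float, float]:
    """Return (percent similarity, percent identity) for two aligned sequences.

    Columns with a gap ('-') in either sequence are ignored (an assumption).
    """
    if len(seq1) != len(seq2):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(seq1.upper(), seq2.upper())
               if a != "-" and b != "-"]
    if not columns:
        return 0.0, 0.0
    n = len(columns)
    identical = sum(a == b for a, b in columns)
    similar = sum(is_similar(a, b) for a, b in columns)
    return 100.0 * similar / n, 100.0 * identical / n


def family_summary(aligned: list[str]):
    """Return (min, max, mean) of similarity and of identity over all pairs."""
    if len(aligned) < 2:
        raise ValueError("need at least two sequences")
    results = [pairwise_percentages(s1, s2) for s1, s2 in combinations(aligned, 2)]
    sims = [r[0] for r in results]
    idents = [r[1] for r in results]

    def stats(values):
        return min(values), max(values), sum(values) / len(values)

    return stats(sims), stats(idents)


if __name__ == "__main__":
    # Short TFIIB fragments from the alignment in Figure 2-13 (fly, human, Arabidopsis).
    fragments = ["QRAATHIAKK-AVEMDIV", "QMAATHIARK-AVELDLV", "VKAAQEAVQK-SEEFDIR"]
    similarity, identity = family_summary(fragments)
    print("similarity (min, max, mean):", similarity)
    print("identity   (min, max, mean):", identity)
```

Full-length alignments, rather than short fragments, would be needed to approach the values in the table.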
















Figure 2-1: Unrooted phylogram of TFIIA small subunit proteins from plants, humans, fruit flies, and yeast. Bootstrap percentage support values are shown on branches. Scale bar = 1 amino acid change.
















Figure 2-2: Unrooted phylogram of TFIIA large subunit proteins from plants, humans, fruit flies, and yeast. Bootstrap percentage support values are shown on branches. Scale bar = 10 amino acid changes.
















Figure 2-3: Unrooted phylogram of TFIIB-related proteins from plants, humans, fruit flies, yeast, and Archaea. Bootstrap percentage support values are shown on branches. Archetypical TFIIB proteins (Class A) are enclosed in blue boxes, Pol III-associated TFIIIB-related factors (Class B) are in a red box, the plastid-associated TFIIBs (Class C) are in a green box, Class D proteins are in a gold box, and Class E proteins are in an orange box. Scale bar = 10 amino acid changes.
















Figure 2-4: Unrooted phylogram of TBP-related proteins from plants, humans, fruit flies, yeast, and Archaea. Bootstrap percentage support values are shown on branches. Scale bar = 10 amino acid changes.














Figure 2-5: Unrooted phylogram of TAF6-related proteins from plants, humans, fruit flies, and yeast. Bootstrap percentage support values are shown on branches. Dicot TAF6 proteins are grouped in a green box, monocot proteins are grouped in a gold box. Scale bar = 10 amino acid changes.




























Figure 2-6: Unrooted phylogram of TAF9-related proteins from plants, humans, fruit flies, and yeast. Bootstrap percentage support values are shown on branches. Dicot TAF9 proteins are grouped in a green box, monocot proteins are grouped in a gold box. Scale bar = 10 amino acid changes.

















Figure 2-7: Unrooted phylogram of TAF10-related proteins from plants, humans, fruit flies, and yeast. Bootstrap percentage support values are shown on branches. Dicot TAF10 proteins are grouped in a green box, monocot proteins are grouped in a gold box. Scale bar = 10 amino acid changes.














Figure 2-8: Unrooted phylogram of TAF11-related proteins from plants, humans, fruit flies, and yeast. Bootstrap percentage support values are shown on branches. Dicot TAF11 proteins are grouped in a green box, monocot proteins are grouped in a gold box. Scale bar = 10 amino acid changes.















Figure 2-9: Unrooted phylogram of TFIIEα-related proteins from plants, humans, fruit flies, yeast, and Archaea. Bootstrap percentage support values are shown on branches. Dicot TFIIEα proteins are grouped in a green box. Scale bar = 10 amino acid changes.
















Figure 2-10: Unrooted phylogram of TFIIEβ-related proteins from plants, humans, fruit flies, and yeast. Bootstrap percentage support values are shown on branches. Monocot TFIIEβ proteins are grouped in a gold box. Scale bar = 10 amino acid changes.














Figure 2-11: Unrooted phylogram of TFIIFα-related proteins from plants, humans, fruit flies, and yeast. Bootstrap percentage support values are shown on branches. Scale bar = 10 amino acid changes.















Figure 2-12: Unrooted phylogram of TFIIFβ-related proteins from plants, humans, fruit flies, and yeast. Bootstrap percentage support values are shown on branches. Dicot TFIIFβ proteins are grouped in green boxes, monocot proteins are grouped in a gold box. Scale bar = 10 amino acid changes.











Arabidopsis thaliana BRF1 At2g45100               VLTATHIIASMKRDWMQT
Arabidopsis thaliana BRF2 At3g09360               VATARDIIASMKRDWIQT
Arabidopsis thaliana BRF3 At2g01280               ANTAKNIISSMKRDWIQT
Drosophila melanogaster BRF AAF72065              SMTALRIVQRMKKDCMHS
Homo sapiens BRF NP_001510.2                      SMTALRLLQRMKRDWMHT
Saccharomyces cerevisiae BRF NP_011762.1          VKDAVKLAQRMSKDWMFE
Populus balsamifera TFIIB7/pBrp                   QELATHIGEVVINKCFCT
Arabidopsis thaliana TFIIB5 At4g36650             QELATHIGEVVINKCFCT
Lycopersicon esculentum AAG01118                  QELATHIGEVIINKCFCT
Populus balsamifera TFIIB6                        ------------------
Populus balsamifera TFIIB4                        LMKV--------------
Populus balsamifera TFIIB5                        IK----------------
Oryza sativa TFIIB1 AF464908                      VKAAQEAVQR-SEELDIR
Triticum aestivum TC68795                         VKAAQEAVQR-SEELDIR
Mesembryanthemum crystallinum TFIIB TC5895        MKAAQEAVQK-SEEIDIR
Vitis vinifera TFIIB TC19782                      VKAAQEAVQK-SEEFDIR
Populus balsamifera TFIIB1                        VKAATEAVKT-SEQFDIR
Citrus sinensis TFIIB CB292941                    VKAAQEAVQK-SEEFDIR
Glycine max TFIIB U31097                          VKAAQEAVQK-SEEFDIR
Medicago truncatula TC86832                       VKAAQESVQK-SEEFDIR
Arabidopsis thaliana TFIIB2 At3g10330             VKAAQESVQK-SEEFDIR
Arabidopsis thaliana TFIIB1 At2g41630             VKAAQEAVQK-SEEFDIR
Lycopersicon esculentum TFIIB TC124975            IKVVQETVQK-AEEFDIR
Solanum tuberosum TFIIB1 TC58701                  IKVVQETVQK-AEEFDIR
Populus balsamifera TFIIB3                        VKAATEAVKT-SEQFDIR
Populus balsamifera TFIIB8                        -----------ELKRDG-
Arabidopsis thaliana TFIIB4 At3g57370             VEAALEAAESYDYMTNGR
Populus balsamifera TFIIB2                        VKAVHEAVEK-IQDVDIR
Oryza sativa TFIIB2 AAN59779                      VREAQRAAQTLEDKLDVR
Drosophila melanogaster TFIIB NM_057540           QRAATHIAKK-AVEMDIV
Homo sapiens TFIIB NM_001514                      QMAATHIARK-AVELDLV
Arabidopsis thaliana TFIIB3 At3g29380             IMAIPEAVEK-AENFDIR
Arabidopsis thaliana TFIIB6 At4g10680             ------------------
Saccharomyces cerevisiae TFIIB M81380             TTSAEYTAKKCKEIKEIA
Methanosarcina acetivorans TFB NP_615574.1        QSKSVEILRQ-ASEKELT
Sulfolobus solfataricus TFB AAK40772.1            MKTAAEIIDK-AKGSGLT
Populus balsamifera TFIIB9                        -NPDGDLIQGFEIIETMA
Arabidopsis thaliana BRF4 At4g35540               VRTDGFCVEDLVMDCLSK
Lycopersicon esculentum TFIIB AF273333            DIISLNVLANTHSNTMQI

Figure 2-13. Multiple sequence alignment of the TFIIB region containing the conserved lysine residue that is acetylated in human and yeast TFIIB (in green). The poplar TFIIB1 and TFIIB3 predicted amino acid sequences have a lysine one amino acid off register (in blue) that might be autoacetylated.
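The register shift described in the Figure 2-13 legend can be checked programmatically. The sketch below is a hedged illustration, not part of the original analysis: it uses a few aligned fragments from the figure and assumes that the conserved, acetylated lysine corresponds to the only lysine in the human TFIIB fragment. The function name and status labels are hypothetical.

```python
# Aligned 18-column fragments copied from Figure 2-13.
ALIGNED_FRAGMENTS = {
    "Homo sapiens TFIIB": "QMAATHIARK-AVELDLV",
    "Saccharomyces cerevisiae TFIIB": "TTSAEYTAKKCKEIKEIA",
    "Arabidopsis thaliana TFIIB1": "VKAAQEAVQK-SEEFDIR",
    "Populus balsamifera TFIIB1": "VKAATEAVKT-SEQFDIR",
    "Populus balsamifera TFIIB3": "VKAATEAVKT-SEQFDIR",
}
REFERENCE = "Homo sapiens TFIIB"


def lysine_status(fragment: str, column: int) -> str:
    """Classify a fragment by where a lysine sits relative to the reference column."""
    if fragment[column] == "K":
        return "aligned"
    neighbours = [c for c in (column - 1, column + 1) if 0 <= c < len(fragment)]
    if any(fragment[c] == "K" for c in neighbours):
        return "offset by one"
    return "absent"


# Assumption: the single lysine in the human fragment marks the conserved column.
reference_column = ALIGNED_FRAGMENTS[REFERENCE].index("K")

if __name__ == "__main__":
    for name, fragment in ALIGNED_FRAGMENTS.items():
        print(f"{name}: {lysine_status(fragment, reference_column)}")
```

Under these assumptions, the two poplar fragments are reported as offset by one position, in agreement with the legend, while the human, yeast, and Arabidopsis TFIIB1 fragments carry a lysine in the reference column.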