ENDOXYLANASES FROM GLYCOSYL HYDROLASE FAMILIES 5 AND 10: BACTERIAL ENZYMES FOR DEVELOPMENT OF GRAM-POSITIVE BIOCATALYSTS FOR PRODUCTION OF BIO-BASED PRODUCTS

By FRANZ JOSEF ST. JOHN

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA 2006

Copyright 2006 by Franz Josef St. John

I dedicate this work to the memory of
Josefa Florentine St. John
1940-1990
Wife to one, mother of four, five who miss, evermore

ACKNOWLEDGMENTS

I acknowledge the steadfast support of my family and wife who have vicariously enjoyed graduate school through my continued praise. Without their support this may have been impossible. Further, I thank my advisor, James F. Preston III, for taking the time to help me grow as a scientist and teaching me that my imagination may be the best scientific tool of all. I would also like to thank my Ph.D. research committee, Doctors A. Edison, L. O. Ingram, J. Maupin-Furlow and K. T. Shanmugam, for their support and guidance. Their comments and suggestions have helped guide this work to a fruitful conclusion. I thank them.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES

CHAPTER

1 INTRODUCTION

2 BIOMASS AND BIOCONVERSION
    The Complex Composition of Bioenergy Crops and Agricultural Crop Residues
    Cellulose
    Hemicellulose
    Lignin
    Polymer Interactions Which Create Recalcitrant Tissues
    Pretreatment of Lignocellulose
    Description of Biomass Pretreatment Methods
    Analysis of the Methods to Determine Which Pretreatment Protocols are Most Effective
    Enzyme Systems for Utilization of Glucuronoxylan

3 FAMILY 10 GLYCOSYL HYDROLASES: STRUCTURE, FUNCTION AND PHYLOGENETIC RELATIONSHIPS
    Xylanases of Glycosyl Hydrolase Family 10
    Enzyme Structure and Mechanism
    Modular Characteristics of GH 10 Xylanases
    CBM classification by target substrate
    CBM modules common in bacterial GH 10 xylanases and their general architectural arrangement
    Fungal modules
    Other modules and sequences from GH 10 xylanase
    Glycosyl Hydrolase Accessory Module Discussion
    Hydrolysis of Substituted Xylans by GH 10 Xylanases
    Hydrolysis of Methylglucuronoxylan
    Hydrolysis of Methylglucuronoarabinoxylan
    Hydrolysis of Rhodymenan by GH 10 xylanases
    GH 10 Xylanase Substrate Binding Cleft Studies
    Phylogenetic Relationships of Glycosyl Hydrolase Family 10 Xylanases
    Plant and Related Bacterial GH 10 Xylanase
    Fungal and Streptomyces Association
    Bacterial GH 10 Xylanases: Tools to Work With
    Conclusion

4 Paenibacillus SPECIES STRAIN JDR-2 AND XynA1: A NOVEL SYSTEM FOR GLUCURONOXYLAN UTILIZATION
    Introduction
    Materials and Methods
    Results
    Discussion

5 CHARACTERIZATION OF XynC FROM Bacillus subtilis SUBSPECIES subtilis STRAIN 168 AND ANALYSIS OF ITS ROLE IN DEPOLYMERIZATION OF GLUCURONOXYLAN
    Introduction
    Materials and Methods
    Results
    Discussion

6 SUMMARY DISCUSSION
    Current Research Directions
    Gram-positive Biocatalysts

LIST OF REFERENCES
BIOGRAPHICAL SKETCH

LIST OF TABLES

Table
2-1 Composition of potential biofuel crops and other biomass sources
3-1 Distribution by bacterial genus of carbohydrate binding modules and other functional domains associated with GH 10 xylanases
3-2 Glycosyl hydrolase family 10 sequences included in phylogenetic studies and some of their properties
4-1 Source and characteristics of sequences used for phylogenetic comparison
5-1 Relationship of XynC activity to the degree of MeGA substitution on MeGAXn
5-2 MALDI-TOF peak assignments
5-3 Relative transcript quantity measured by Q-RT-PCR for gapA, abnA, xynA and xynC genes

LIST OF FIGURES

Figure
2-2 Common structural elements and sites of enzymatic hydrolysis which degrade methylglucuronoxylan and methylglucuronoarabinoxylan.
2-3 Generation of hydrolysis products by different families of xylanases.
2-4 Xylanase structure and function.
3-1 Common domain arrangements found in GH 10 xylanases.
3-2 Products formed by the hydrolysis of methylglucuronoxylan and methylglucuronoarabinoxylan by a glycosyl hydrolase family 10 xylanase.
3-3 Phylogenetic distribution of catalytic domains of glycosyl hydrolase family 10 xylanases.
4-1 Growth of Paenibacillus sp. strain JDR-2.
4-2 Genetic map of xynA1 and surrounding sequence resulting from sequencing of the Paenibacillus sp. strain JDR-2 genomic DNA insert of pFSJ4.
4-3 Domain alignment of GH 10 subset B and subset A sequences.
4-4 Phylogenetic analysis of a randomly selected set of GH 10 xylanases with respect to the XynA1 CD GH 10B subset.
4-5 Localization of modular XynA1 in subcellular fractions.
4-6 Lineweaver-Burk kinetic analysis of XynA1 CD.
4-7 Kinetic analysis of product formation catalyzed by XynA1 CD hydrolysis of SG MeGAXn.
4-8 Differential carbohydrate utilization by Paenibacillus sp. strain JDR-2.
5-1 Optimization of XynC activity.
5-2 MALDI-TOF MS analysis of the Filtrate (A) and Retentate (B) resulting from 3 kDa ultrafiltration of a SG MeGAXn XynC digest.
5-3 1H-NMR of SG MeGAXn 3 kDa filtrate revealing the general action of XynC hydrolysis of MeGAXn and the predicted limit product of XynC MeGAXn digestion.
5-4 Identification of products generated by XynA (GH 11) and XynC (GH 5) secreted by B. subtilis 168.
5-5 Regulation of expression of xynA and xynC genes in early- to mid-exponential phase growth cultures of B. subtilis 168 with different sugars as substrate.
5-6 Limit aldouronates expected from a SG MeGAXn digestion with a GH 11 xylanase and a GH 5 xylanase co-secreted in the growth medium of B. subtilis 168.

Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

ENDOXYLANASES FROM GLYCOSYL HYDROLASE FAMILIES 5 AND 10: BACTERIAL ENZYMES FOR DEVELOPMENT OF GRAM-POSITIVE BIOCATALYSTS FOR PRODUCTION OF BIO-BASED PRODUCTS

By Franz Josef St. John
December 2006
Chair: James F. Preston III
Major Department: Microbiology and Cell Science

Fossil fuels are a nonrenewable resource. Their wide geological distribution and relatively simple acquisition have allowed massive increases in human population and associated energy expenditure over the last century. The current rate of consumption and the expectation of reduced fuel supplies predicate the need to develop new energy sources that may be merged with the current energy infrastructure. As an underutilized renewable resource, lignocellulosic biomass may, through microbial bioconversion, contribute to the environmentally benign production of alternative fuels and chemical feedstocks.
A major target for this conversion is methylglucuronoxylan, the predominant structural polysaccharide in the hemicellulose fraction of hardwood and crop residues. The research described herein has focused on xylanolytic bacteria and their secreted endoxylanases that are involved in the depolymerization of methylglucuronoxylan. In this work, endoxylanases of glycosyl hydrolase families 5 and 10 (GH 5 and GH 10 xylanases) have been characterized with emphasis on the native bacterial host and utilization of the hydrolysis limit products of methylglucuronoxylan. Recombinant constructs of the genes encoding these xylanases have made them available for definitive characterization and for expression in transgenic organisms. XynA1, a multimodular GH 10 xylanase anchored to the cell surface of Paenibacillus sp. strain JDR-2, generates aldouronates that are efficiently assimilated and metabolized by this organism. The proximity between XynA1 and the native Paenibacillus host, and the proficient utilization of the resulting hydrolysis products, identify a process of vectoral assimilation of methylglucuronoxylan-derived products. A GH 5 xylanase encoded by the ynfF gene in Bacillus subtilis 168 is directed for catalysis by methylglucuronosyl substitutions on the xylan chain, supporting its application in an accessory role in the overall depolymerization process. Secretion of this GH 5 and a GH 11 endoxylanase by the genetically malleable B. subtilis 168, for which the entire genome has been sequenced, recommends it as a target for introduction of genes encoding the GH 10 endoxylanase XynA1 and aldouronate-utilization enzymes for efficient depolymerization and metabolism of methylglucuronoxylan. These discoveries provide insight needed for the development of second-generation bacterial biocatalysts for the direct conversion of lignocellulosic biomass to alternative fuels and bio-based products.

CHAPTER 1
INTRODUCTION

The political and social stability of the world is dependent on a plentiful supply of energy. For the last century, this supply has been in the form of fossil fuels mined or pumped from the earth. At the current rate of use, it is predicted that approximately halfway through this century, the supply of easily obtainable fossil fuels will be significantly limited. Three recent studies of proven world oil supplies show an average estimate of currently obtainable crude oil reserves of 1188 billion barrels (Energy Information Administration, 2006a). In 2003 world oil consumption was estimated at about 29 billion barrels per year (Energy Information Administration, 2006b). Dividing the reserves by the annual consumption gives an estimated 41 years before the current proven reserves are exhausted. Although new additions to the world crude oil reserve are identified frequently, consumption is increasing steadily and outpacing discovery (Erickson, 2003). Regardless of the number of years until a severe shortage in crude oil reserves, the inevitable consequence of dependence on this finite resource is troubling. Increasing population and industrialization around the world exacerbate this situation. Furthermore, as crude oil supplies decrease, that which remains becomes more difficult to obtain. These circumstances result in high demand for a diminishing commodity, which can stifle worldwide growth and challenge international relations. Fossil fuels are also a primary contributor to "greenhouse" gases (GHG) which could, over the next few decades, have a major effect on global climate. There is no need to wait until this inevitable energy crisis suffocates the world economy. It is not a requirement that we utilize every obtainable drop of crude oil. In fact, it would be extremely irresponsible to do so.
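
The reserve-lifetime figure quoted in this chapter is a simple ratio of proven reserves to annual consumption. The short Python sketch below reproduces that arithmetic from the two numbers given in the text. The `growth` parameter is a hypothetical addition, not taken from the cited sources, included only to illustrate how steadily increasing consumption would shorten the estimate.

```python
import math

# Figures quoted in the text (Energy Information Administration, 2006a, 2006b).
PROVEN_RESERVES_BBL = 1188e9    # barrels
ANNUAL_CONSUMPTION_BBL = 29e9   # barrels per year (2003 estimate)


def years_remaining(reserves, consumption, growth=0.0):
    """Years until `reserves` are used up at `consumption` per year.

    `growth` is an assumed (hypothetical) annual fractional increase in
    consumption; with growth == 0 this is the plain ratio used in the text.
    """
    if growth == 0.0:
        return reserves / consumption
    # Cumulative use after n years is a geometric series:
    #   consumption * ((1 + growth)**n - 1) / growth = reserves
    # Solving for n gives the expression below.
    return math.log(1.0 + growth * reserves / consumption) / math.log(1.0 + growth)


static_estimate = years_remaining(PROVEN_RESERVES_BBL, ANNUAL_CONSUMPTION_BBL)
print(f"Static estimate: {static_estimate:.0f} years")
```

Calling `years_remaining(PROVEN_RESERVES_BBL, ANNUAL_CONSUMPTION_BBL, growth=0.02)` with an illustrative 2% annual growth returns a shorter horizon than the static ratio, which is the qualitative point made in the text about consumption outpacing discovery.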
Research efforts must focus on increasing the energy yield of current alternative energy sources, on developing novel methods for harvesting energy, and on decreasing per capita energy consumption. Together, these goals may add up to a sustainable energy future for the world. Further, investment in this new energy infrastructure may spur growth equivalent to that witnessed during the 20th century in the United States. After all, survival is a strong motive for reform. Over the last several decades, methods have been developed by which the constituent sugars of lignocellulosics can be converted to clean-burning ethanol using microorganisms as biocatalysts (Dien et al., 2003; Ingram et al., 1999; Jeffries, 2005). Large-scale application of this technology to renewable resources, such as woody biomass and crop residues, may help to balance the demand for fossil fuels by supplementing the supply with ethanol. In the long term, this could become a major contributor to liquid fuel supplies for transportation needs. Use of ethanol produced from biomass is carbon neutral, meaning that it does not increase the net atmospheric CO2 concentration. Therefore, it does not contribute to GHG emissions. The process by which this conversion takes place can be broken down into two key steps. The first is a preprocessing step, which is typically required to prepare the complex, recalcitrant lignocellulosic biomass for conversion. The second utilizes engineered microbial biocatalysts to convert the simple sugars released by the pretreatment to ethanol. In some systems, conversion of the sugars in lignocellulose to ethanol has been accomplished at near theoretical yields (Ingram et al., 1998). Two food crops which are used for large-scale ethanol production are corn grain and sugarcane.
In the US, corn grain is the primary source for ethanol production, representing approximately 2% of liquid fuel use, whereas in Brazil, sugarcane has been used to produce enough ethanol to significantly displace gasoline. Another biofuel, biodiesel, used as a diesel fuel replacement, can be prepared from triglyceride-rich crops such as soybean, rapeseed and palm. A primary concern frequently raised when considering ethanol or biodiesel production from food crops is the net energy balance. Recent studies have shown that production of ethanol from corn grain and biodiesel from soybean are energy positive (Dewulf et al., 2005; Farrell et al., 2006). When considering these food crops as substrate biomass, the balance is obtained by considering all inputs, including all costs of farming and biomass preprocessing, and all outputs, including the net yield of fuel and all other valuable components. In a recent study, bioconversion of corn grain to ethanol yielded only a 25% energy increase over the energy consumed in the process, and biodiesel from soybean yielded a 93% energy increase (Hill et al., 2006). The lower energy yield for corn conversion is primarily attributed to the greater growth requirements of corn and the use of gas-fired boilers for ethanol purification. Although soybean may be a good candidate crop for biodiesel production, devoting the total crop land required to grow any of these crops as major sources of fuel is thought to be very ambitious, and would still supply only a small fraction of total liquid fuel requirements (Hill et al., 2006). Large-scale food crop cultivation for bioenergy also concerns scientists as it links energy production directly to food production and also increases demand for valuable agricultural land.
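
The net energy balance discussed above can be expressed as a fractional gain of energy output over energy input. The Python sketch below is a minimal illustration under that reading of the reported percentages (25% for corn grain ethanol, 93% for soybean biodiesel; Hill et al., 2006). The function names are hypothetical and are not drawn from the cited studies.

```python
# Fractional energy gains as reported in the text (Hill et al., 2006).
REPORTED_GAIN = {
    "corn grain ethanol": 0.25,
    "soybean biodiesel": 0.93,
}


def net_energy_gain(energy_out, energy_in):
    """Fractional energy gain: (output - input) / input.

    Both arguments must be in the same energy units; all farming and
    processing inputs and all co-product outputs are assumed to be
    already included in the totals.
    """
    if energy_in <= 0:
        raise ValueError("energy_in must be positive")
    return (energy_out - energy_in) / energy_in


def output_per_unit_input(gain):
    """Energy returned per unit of energy invested for a given fractional gain."""
    return 1.0 + gain


for fuel, gain in REPORTED_GAIN.items():
    ratio = output_per_unit_input(gain)
    print(f"{fuel}: {ratio:.2f} units of energy out per unit in")
```

Read this way, a process is energy positive whenever `net_energy_gain` is greater than zero, and the comparison between fuels reduces to comparing the two ratios.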
It is generally acknowledged that the greatest benefit for society will come from production of cellulosic ethanol that results from conversion of lignocellulosics and crop residues (Dewulf et al., 2005; Farrell et al., 2006; Hammerschlag, 2006; Hill et al., 2006; Perlack et al., 2005). Idealized substrates for ethanol production will be lignocellulosics such as crop residues like corn stover, forestry products including pulp and paper mill waste, and renewable energy crops such as switchgrass and poplar. As considered above, bioenergy crops are expected to compete for generally useful agricultural lands. However, they will not directly compete with agricultural crop products. Use of these underutilized renewable biomass resources as substrates for biofuel production greatly reduces the input requirements, thereby increasing energy yield. The United States Department of Agriculture recently published a report entitled "Biomass as Feedstock for a Bioenergy and Bioproducts Industry: The Technical Feasibility of a Billion-Ton Annual Supply" (Perlack et al., 2005). The analysis presented in this report proposes that it is possible to replace 30% of the US petroleum consumption with biofuels by 2030. Development of bioenergy crops like poplar (hardwoods) and switchgrass (graminaceous plants) along with food crop residues, such as corn stover and sugar cane bagasse, will contribute greatly to ethanol production and accomplishment of the "Billion-Ton Annual Supply" and energy independence.

CHAPTER 2
BIOMASS AND BIOCONVERSION

The Complex Composition of Bioenergy Crops and Agricultural Crop Residues

Energy harvested from the sun through the process of photosynthesis is stored by plants as fixed carbon in the form of several different, tightly associated complex polymers. The combined characteristics of these interacting polymers impart to the plant the strength and resilience that is common in wood products.
For this reason, wood has been ubiquitous in the urbanization of society, allowing for relatively affordable, easily constructed homes and buildings. When considering bioconversion processes, these characteristics are referred to as recalcitrance, and are a primary concern for development of processes which liberate the complex sugar components as simple utilizable sugars. The primary components of wood which interact to create this strength and recalcitrance are cellulose, hemicellulose and lignin. Cellulose is the most abundant carbohydrate polymer in the biosphere and is the major structural component of plant life. In bioenergy crops and crop residues, cellulose can range from 36% to 50% of total mass (Table 2-1) (Lynd et al., 1999; Scurlock). The second most abundant polymer in biomass is the combined hemicellulose fraction, which in hardwoods is composed primarily of methylglucuronoxylan, and can range from 20% to 35% of total mass (Timell, 1967). This fraction in graminaceous plants such as switchgrass and the crop residue corn stover represents up to 35% of mass and is primarily composed of the polymer methylglucuronoarabinoxylan (Lynd et al., 1999). A secondary hemicellulose from hardwoods, glucomannan, is a minor sugar component representing no more than 4% of mass, and will not be directly considered in this discussion. Lignin is a complex non-carbohydrate polymer which is directly associated with recalcitrance of biomass. The renewable biomass sources receiving the most intense scrutiny contain lignin at levels ranging from 5.5% for early-cut switchgrass up to 24% for common hardwood sources such as poplar (Lynd et al., 1999; Timell, 1967). Together, these interacting polymers impart the characteristics for which wood has been cherished, but they further act to prevent enzymatic accessibility to the individual carbohydrate polymers for direct use of woody biomass as a renewable resource.

Cellulose

Cellulose is a homopolymer composed of repeating β-1,4-linked glucose molecules. When derived from woody biomass it has a degree of polymerization of approximately 10,000 glucose residues, making it the largest naturally occurring polymer (O'Sullivan, 1997). Although the word cellulose refers to the β-1,4-linked polymer described above, it also describes the tightly associated crystalline fibers that result from many individual cellulose strands hydrogen bonding together to form the chemically and physically recalcitrant cellulose fiber. In general, cellulose from biomass is composed of many identical molecules which are tightly associated through hydrogen bonding interactions, intermixed with hemicellulose. This tight association between cellulose molecules and fibrils makes this specific polymer recalcitrant to chemical and enzymatic degradation.

Hemicellulose

Hardwood hemicellulose differs from graminaceous sources of hemicellulose primarily by not having arabinose substitutions. As the name suggests, methylglucuronoxylan is a β-1,4-linked xylose chain randomly substituted (Jacobs et al., 2001) with α-1,2-linked 4-O-methylglucuronate moieties. Common hardwood sources have methylglucuronate substitutions on one in every ten xylose residues, but there have been reports of hardwoods with significantly higher methylglucuronate content (Hurlbert and Preston, 2001; Preston et al., 2003; Puls, 1997). Detailed 13C-NMR studies of methylglucuronoxylan from sweetgum, a candidate bioenergy hardwood, reflect degrees of substitution as high as 1 in every 6 xylose residues. Methylglucuronoarabinoxylan from graminaceous biomass resources has fewer methylglucuronate substitutions, but can contain arabinose at a ratio of one for every six xylose residues (Puls, 1997; Wilkie, 1979). An additional substitution on both the methylglucuronoxylan and methylglucuronoarabinoxylan polymers is O-linked acetyl groups (Puls, 1997; Timell, 1967).
Methylglucuronoxylan has this substitution in either the O2 or O3 position on 70% of all the xylose residues, and methylglucuronoarabinoxylan contains approximately half this amount (Puls, 1997; Timell, 1967).

Lignin

Of all the polymers in lignocellulose, lignin is the most complex due to its random amorphous nature. There is little if any in the primary cell wall, but it forms a major component in the secondary wall and the middle lamella. It is composed of characteristic phenyl propane derivatives, randomly linked through carbon-carbon bonds by an enzymatic dehydrogenation process, which explains its complexity. The phenyl propane derivatives differ depending on the lignin source, but in general lignin from most plants is composed of guaiacyl, syringyl and p-hydroxyphenyl moieties (Fujita and Harada, 2001).

Polymer Interactions Which Create Recalcitrant Tissues

The secondary cell wall in woody tissue is the primary structural component of biomass. It imparts rigidity and strength to the plant and is the primary source of stored carbon, composed of cellulose, hemicellulose and lignin. Based on data reviewed by Mellerowicz et al. (2001), 86% to 88% (v/v) of cells in poplar wood have secondary cell walls consisting of 80% utilizable carbohydrate. This quantifies the value of hardwood as a biomass resource. This wood type is characterized by its high content of methylglucuronoxylan and lower content of lignin when compared to softwoods. These properties make this wood type more amenable to pretreatment processes. Formation of a full secondary cell wall begins with synthesis of a primary wall. Cellulose microfibrils, which are composed of many individual cellulose molecules, are synthesized on the cell surface by large complexes of cellulose synthase enzymes called rosettes (Ding and Himmel, 2006; Doblin et al., 2002).
The layers of cellulose microfibrils that are produced in the primary wall form a net-like cellulose matrix (Fujita and Harada, 2001). Together with this matrix are the primary wall hemicellulose polymers and associated polysaccharides and proteins, which lend support and provide flexibility for cell expansion (Carpita et al., 2001; Somerville et al., 2004). After the primary wall is synthesized, if the cell is a xylem fiber type, it begins to synthesize a secondary cell wall. The secondary cell wall is much thicker than the primary wall and is typically divided into at least three distinct layers. As can be observed in Figure 2-1, cellulose fiber synthesis in each layer is offset from the next by some degree so that after synthesis of the complete secondary cell wall, there are cellulose fibers bracing the wall from most angles (Fujita and Harada, 2001). It is thought that hemicellulose interacts with and coats the outer layer of cellulose microfibrils to allow for movement, in effect acting as a lubricant and preventing formation of larger cellulose fibers through hydrogen bonding (Ding and Himmel, 2006; Fengel, 1971; Reis et al., 1994). Layers of the secondary cell wall are described as showing a helicoid structure. It has been postulated that it is the intimate interaction with hemicellulose polymers, which are at a greater concentration between secondary cell wall layers, that controls the formation of this helicoid structure (Reis et al., 1994). Although the mass of cellulose and associated hemicellulose polymers imparts strength to the cell wall and the plant, the wall is further reinforced by cross-linking with lignin. This randomly formed, complex, non-carbohydrate polymer forms ester linkages to various moieties on the hemicellulose chain (Puls, 1997; Reis et al., 1994).
The final secondary cell wall structure of mature xylem cells contains three heavy layers of cellulose (Table 2-1) intertwined with hemicellulose, which is lignified through covalent bonds to lignin. The middle lamella is fully lignified, filling in between individual cells (Fromm et al., 2003). The complex matrix formed by these three associated materials can be compared to reinforced concrete, and the outermost layer of lignin filling the middle lamella acts to glue it all together. Softwood conifers are composed of the same three major components discussed above. However, the hemicellulose fraction is not composed of methylglucuronoxylan as in hardwood, but rather consists of two different polymers. The minor polymer is similar to methylglucuronoarabinoxylan of the graminaceous plants, but has a greater amount of methylglucuronate substitution and no acetyl groups. The predominant hemicellulose polymer in softwood is galactoglucomannan. This polymer is an O-acetylated polymer composed of terminally branched galactose, glucose and mannose in the ratio 1 : 1 : 3. Generally, softwood is more heavily lignified than hardwood and is considered to be less efficient in bioconversion processes. Further, the pulp and paper and construction industries are major consumers of softwood supplies. It is possible that future bioconversion endeavors may combine all wood sources for bioconversion, but current objectives primarily include hardwoods and switchgrass as biofuel crops and waste agricultural residues, e.g., corn stover and sugar cane bagasse.

Pretreatment of Lignocellulose

Due to the complex nature of lignocellulosics, utilization of the embedded carbohydrates requires preprocessing, which usually includes a physical and chemical pretreatment. Following this, all pretreatment methods in development require supplementation with fungal cellulolytic hydrolase mixtures (e.g., Spezyme CP). The most established pretreatment methods have recently been reviewed.
In this particularly constructive effort, several laboratories worked together to obtain equivalent, directly comparable data (Kim and Holtzapple, 2005; Kim and Lee, 2005; Liu and Wyman, 2005; Lloyd and Wyman, 2005; Mosier et al., 2005a; Teymouri et al., 2005; Wyman et al., 2005b). The seven preprocessing methods studied used NREL standard methods for data collection and result analysis. Results were reported as total sugars released after the pretreatment step and also after the subsequent enzyme hydrolysis. Besides the review of the individual preprocessing methods and general results of each (Wyman et al., 2005a), further articles published simultaneously (same volume) combined complete data sets of all seven pretreatments for comparative analysis (Eggeman and Elander, 2005). They also provided an in-depth economic analysis of the most promising approaches (Mosier et al., 2005b) and considered the effect of preprocessing on the biomass structure (Lloyd and Wyman, 2005). Here I briefly consider these processing methods to better understand the requirements for subsequent bioconversion to ethanol.

Description of Biomass Pretreatment Methods

The pretreatment methods compared include dilute sulfuric acid, flow-through, pH controlled water, ammonia fiber explosion (AFEX), ammonia recycle percolation (ARP) and lime pretreatment. Each pretreatment method is reviewed to better understand how it affects biomass. The first, dilute sulfuric acid pretreatment, applies mild acid conditions (0.5-3.0% H2SO4) at temperatures from 130°C to 200°C and pressures from 3 atm to 15 atm. In this method, the acid, temperature and pressure function to liberate the hemicellulose in an almost completely utilizable form. In addition, dilute acid treatment disrupts the normally recalcitrant cellulose for effective subsequent enzymatic hydrolysis (Liu and Wyman, 2005).
The flow-through pretreatment method was simultaneously compared to an improved method termed partial flow-through pretreatment (PFP), both of which are similar in principle to the dilute acid pretreatment. Both of these methods apply temperatures of 190-200°C and pressures of 20-24 atm. Partial flow-through pretreatment was superior to water flow-through pretreatment because the final solubilized xylan was more concentrated as a result of the reduced water elution volumes. This significantly affects downstream costs associated with product recovery for fermentation. Furthermore, this method can remove significant lignin content prior to enzyme hydrolysis (Mosier et al., 2005a). The pH controlled water pretreatment method uses high temperature (160-190°C) and high pressure (6-14 atm) for times between 10 and 30 minutes. The application of high temperature and neutral pH water has several advantages over the acid hydrolysis methods. Hemicellulose is solubilized and primarily remains as small oligomers, reducing the formation of side products such as furfural, which can inhibit fermentations. Furthermore, this method significantly reduces the lignin content to increase enzymatic hydrolysis (Teymouri et al., 2005). Ammonia fiber explosion (AFEX) is a promising pretreatment method which applies ammonia at elevated temperatures (70-90°C) and pressures (15-20 atm) for only short periods of time (<5 minutes). The inherent solvent properties and volatility of ammonia are the characteristics which allow this unique approach to disrupt the biomass. The explosion occurs with the sudden release of pressure, resulting in rapid expansion of the volatile heated ammonia. The method does not remove any component of the biomass, but rather disrupts it sufficiently to allow near complete enzyme hydrolysis (Kim et al., 2006). ARP uses a 10-15% (w/w) ammonia solution at 150-170°C and 10-20 atm of pressure for 10-20 minutes.
This method is somewhat similar to the PFP described above, but uses an aqueous ammonia solvent rather than acidic water. As with PFP, this method has the advantage of obtaining significant lignin removal, but requires downstream separation of solubilized carbohydrates for maximum realized sugar yield (Kim and Holtzapple, 2005; Kim and Holtzapple, 2006a). Use of lime (Ca(OH)2) in pretreatment applies chemistry opposite to that of the acid based pretreatments. Combination of lime and oxygen with lignocellulose degrades hemicellulose and cellulose by the peeling reaction (endwise reducing end β-elimination), releasing glucoisosaccharinate and xylosaccharinate. In general, this pretreatment method produces a complex mixture of degradation products, and allows removal of a large percentage of lignin (Wyman et al., 2005a). This treatment, as with all the discussed pretreatment methods, significantly improves enzymatic hydrolysis of the resulting pretreated biomass.

Analysis of the Methods to Determine Which Pretreatment Protocols are Most Effective

From the pretreatment studies of corn stover described above, comparisons were performed by analysis of total sugar yields (Eggeman and Elander, 2005). With a standard enzyme amount of 15 filter paper units Spezyme CP per gram glucan applied post pretreatment, all methods yielded greater than 86% of the total carbohydrate, indicating that they are all candidates for further refinement and optimization. The lowest yield (86.8%) was obtained with an optimized lime treatment and three of the seven methods resulted in yields over 90%. These included dilute acid treatment yielding 92.4%, AFEX yielding 94.4% and flow-through yielding 96.6%. These three pretreatment methods differ greatly in their resulting carbohydrate product mixtures. The dilute acid treatment converted approximately 83% of the total xylose content to free xylose and slightly more than 2% to xylooligomers.
This achieved near complete removal of xylan from the cellulose and lignin matrix and provides a likely explanation for the almost complete hydrolysis of cellulose (~92%) observed with the subsequent cellulase treatment. AFEX is by far the most interesting case and resulted in the second best sugar yield (94.4%). The pretreatment did not release any sugars, but resulted in a much altered biomass structure, which facilitated near complete enzymatic hydrolysis. The best results were obtained with the flow-through pretreatment method. With this approach, almost complete xylose conversion occurred, but it was primarily in the form of xylooligomers (~92%). Only a small amount of free xylose was detected (~4.5%). Economic analysis of these preprocessing methods showed that no single method had a clear advantage (Eggeman and Elander, 2005). The dilute acid, AFEX, ARP and lime pretreatments were each estimated to require a similar fixed capital investment. As an example of this, in the dilute acid method, the primary cost was associated with the equipment needed to handle the corrosive conditions, and a minor cost was associated with chemical supply requirements. The opposite case was observed for AFEX pretreatment, which requires costly pure ammonia but is less corrosive than acid. Use of pure ammonia is a substantial cost, although AFEX plants are designed to recover most of it by condensation. Dilute acid, AFEX and lime pretreatments resulted in the lowest total fixed capital per gallon of annual ethanol production capacity, making these the best current methods for large scale pretreatment. Both dilute acid and AFEX were at $3.72/gallon and lime was significantly lower at $3.35/gallon. Except for AFEX, it is clear that all these pretreatment methods act primarily by removing xylose or lignin, and in some cases, a significant amount of both.
Since the hemicellulose and lignin fractions are thoroughly embedded within the cellulose matrix, it seems likely that methods which remove either will significantly alter cellulose accessibility and render it susceptible to enzymatic hydrolysis. In the case of AFEX, no degradation of the treated biomass is apparent, but based on the subsequent ability for almost complete enzyme hydrolysis it seems likely that this process readily alters the lignin and xylan association with cellulose (Kim and Holtzapple, 2006b; Teymouri et al., 2005). Although it is thought that residual lignin in biomass inhibits enzymes added for hydrolysis (Berlin et al., 2006), these studies show no indication of this. Dilute acid preprocessing does not remove lignin, but enzyme hydrolysis resulted in the third highest yield of carbohydrate. AFEX, as mentioned, retained the majority of its lignin content and resulted in the second highest yield of carbohydrate. This suggests that 15 FPU Spezyme CP/g glucan may be a wasteful amount of hydrolytic activity for complete hydrolysis following some pretreatments.

Enzyme Systems for Utilization of Glucuronoxylan

Although further research into pretreatment methods is required, it seems likely that the methods reviewed above approach their optimal performance. Limitations with the current methods for ethanol production from biomass include the high cost of pretreatment and the cost of commercial enzyme preparations required to obtain maximum yield. Advancements which lower the cost of bioconversion of lignocellulosics to ethanol are likely to come from the development of less expensive, more robust enzyme systems with a greater range of enzymatic activities and development of robust microbial biocatalysts.
The latter research direction includes biocatalyst advances such as decreasing growth requirements and increasing the substrate range, development of hydrolytic enzyme secretion systems to reduce commercial enzyme use, and in general, optimizing a specific biocatalyst for use with a specific preprocessing and bioconversion method. The ultimate biocatalyst would secrete most, if not all, of the required hydrolytic activity and efficiently transport and ferment hydrolysis limit products to fuel ethanol. Research in this direction will facilitate advances for low-cost, high yield bioconversion processes.

Dilute acid pretreatment is currently being developed as a leading pretreatment method. Other than the energy and chemical requirements discussed above, limitations specific to this process include formation of acid hydrolysis side products such as furfural and α-1,2-glucuronoxylose. Furfural forms from the acid and heat catalyzed dehydration of xylose. Formation of this side product reduces process efficiency in two ways. First, it reduces the net convertible xylose concentration and second, furfural is known to inhibit microbial growth and fermentation (Zaldivar et al., 1999). The aldouronate, α-1,2-glucuronoxylose, results from acid hydrolysis of methylglucuronoxylan and methylglucuronoarabinoxylan due to the stability of the α-1,2 glycosyl linkage, which is thought to form an internal lactone between the carboxylate moiety on the glucuronic acid and a hydroxyl on the substituted xylose while under acidic conditions (M. E. Rodriquez, A. Martinez, S. W. York, K. Zuobi-Hasona, L. O. Ingram, K. T. Shanmugam, J. F. Preston, Abstr. 101st ASM General Meeting, abstr. O-21, 2001) (Jones et al., 1961). Whereas the arabinose and acetyl linkages are considered acid labile, the stability of the α-1,2 glucuronosyl linkage allows for the buildup of singly substituted aldouronates, which cannot be utilized by any current biocatalyst.
Considering that the frequency of substitution of methylglucuronate is at least 1 for every 10 xylose residues, this suggests that, at best, a bioconversion process can recover only 90% of the total xylose fraction. This does not take into account the significant potential contribution made by free glucuronic acid to the net ethanol yield. In the short term, feasible goals can be met by developing enzyme systems which function efficiently to allow reduced pretreatment of biomass, in effect lowering the total cost of preprocessing. Limited pretreatment with dilute acid would allow for reduced energy and/or acid consumption in the pretreatment process and would also lower the formation of furfural and other fermentation inhibiting compounds. The resulting pretreated biomass would still have a significant content of polymerized xylose and may require enzymatic treatment to fully release fermentable carbohydrate. For this reason, enzymes which degrade hemicellulose are primary research targets to facilitate utilization of methylglucuronoxylan and methylglucuronoarabinoxylan by biocatalysts. As detailed above, these two polymers make up the second most abundant carbohydrate in bioenergy crops and agricultural crop residues. Unlike the chemically simple but physically recalcitrant cellulose polymer, methylglucuronoxylan and methylglucuronoarabinoxylan are chemically complex and require a battery of enzymes with a wide range of activities to fully degrade them to simple sugars. As shown in Figure 2-2, these activities include several different xylanases, an α-glucuronidase, acetyl esterases, arabinofuranosidases, and lignin esterases (not shown). Xylanases have a primary role in the degradation of xylan as they reduce the large linear polymer to small xylooligomers and small substituted xylooligomers.
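The 90% ceiling on xylose recovery estimated earlier in this section follows from simple arithmetic. The sketch below is only an illustration, assuming that each methylglucuronate substitution ties up exactly one xylose residue in an unusable aldouronate and that all unsubstituted xylose is recovered; the function name `max_free_xylose_fraction` is hypothetical.

```python
def max_free_xylose_fraction(xylose_per_substitution: int) -> float:
    """Upper bound on the fraction of xylose recoverable as free xylose.

    Assumption (illustrative only): one xylose residue is lost per
    methylglucuronate substitution, and every other residue is recovered.
    """
    if xylose_per_substitution < 1:
        raise ValueError("need at least one xylose residue per substitution")
    return 1.0 - 1.0 / xylose_per_substitution


if __name__ == "__main__":
    # One substitution per 10 xylose residues is the frequency cited in the text.
    for residues in (10, 8, 6):
        bound = max_free_xylose_fraction(residues)
        print(f"1 substitution per {residues} xylose: at most {bound:.1%} recoverable")
```

With one substitution for every 10 xylose residues the bound is 90%; a more heavily substituted xylan gives a lower ceiling under the same assumption.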
Accessory enzymes such as the α-1,2-glucuronidase are known to have activity on small substituted hydrolysis products resulting from xylanase digestion, but not on the intact polymer (Nagy et al., 2002; Nurizzo et al., 2002). This research will address how xylanases of glycosyl hydrolase (GH) families 5 and 10 function to hydrolyze polymeric methylglucuronoxylan. Throughout this dissertation, methylglucuronoxylan is considered an idealized substrate consisting of a β-1,4-xylan substituted with α-1,2-linked 4-O-methylglucuronate moieties. Based on the acid labile character of the less significant substitutions on the methylglucuronoxylan and methylglucuronoarabinoxylan polymers, pretreatment utilizing limited dilute acid conditions may result in methylglucuronoxylan being the primary retained polymer. Processing of methylglucuronoxylan by the three major families of xylanase enzymes is depicted in Figure 2-3. Xylanases of glycosyl hydrolase family 5 (GH 5) are the newest xylanases to be characterized. This work (Chapter 5) details the current understanding of this novel xylanase family. Although all indications are that it is specific for the hydrolysis of methylglucuronoxylan, the resulting products are thought to be too large for direct utilization by biocatalysts. The abilities of this enzyme may well complement the activities of the other two primary xylanase families. Xylanases from families GH 10 and GH 11 are relatively well characterized. Both of these xylanase families are known to produce xylobiose and xylotriose as the primary neutral limit products of methylglucuronoxylan. However, while GH 10 xylanases yield the smallest substituted aldouronate, aldotetrauronate (MeGAX3) (Fig. 2-3), which is substituted directly on the nonreducing terminal xylose with α-1,2 glucuronate, GH 11 xylanases yield aldopentauronate (MeGAX4), which is substituted penultimate to the nonreducing terminal xylose (Biely et al., 1997).
This slight difference is significant because the xylanolytic α-1,2-glucuronidase accessory enzyme can only hydrolyze methylglucuronate from xylooligomers when it is substituted directly on the nonreducing terminal xylose (Nagy et al., 2002; Nagy et al., 2003; Nurizzo et al., 2002). Further, most bacterial α-1,2-glucuronidase enzymes are intracellular and supporting research has indicated that MeGAX3 is the largest aldouronate which is readily transported for catabolism (G. Nong, V. Chow, J. D. Rice, F. St. John, J. F. Preston, Abstr. 105th ASM General Meeting, abstr. O-055, 2005) (Shulami et al., 1999; St. John et al., 2006). Figure 2-3 depicts the processing of methylglucuronoxylan as an idealized substrate for utilization by bacterial biocatalysts. Although GH 10 and GH 11 xylanases share identical hydrolytic mechanisms (as with GH 5), these two families differ in primary protein fold. Hydrolysis of the β-1,4 xylan chain proceeds by a double displacement mechanism with retention of anomeric configuration. Figure 2-4 identifies the structural differences and presents the common mechanism by which these xylanases function. The different limit aldouronates produced by GH 10 and GH 11 xylanases result from steric interactions between the substituted xylan polymer and the binding clefts of these two xylanases with different protein folds. For consistency, throughout the following chapters and just below, 4-O-methylglucuronoxylan will be abbreviated MeGAXn and corresponding substituted xylooligomers as MeGAXx, where x equals the number of xylose residues. In sections considering arabinose substitutions the name methylglucuronoarabinoxylan will be used and arabinose substituted xylooligomers will be denoted as AXx, where x equals the number of xylose residues. The following chapters contain the analysis of xylanases from glycosyl hydrolase families 5 and 10 and explore their mode of action and their hydrolysis products on the substrate MeGAXn.
By developing a strong understanding of how these enzymes act to hydrolyze MeGAXn, how they function to benefit the native bacterial host in MeGAXn utilization and how they may facilitate enzyme systems for complete hydrolysis and utilization of MeGAXn, we may better employ these enzymes in development of bacterial bioconversion processes and next-generation biocatalysts.

Table 2-1. Composition of potential biofuel crops and other biomass sources

Biomass Resource                      Carbohydrate Composition              Non-carbohydrate Composition
Hardwood
  Populus tremuloides (Poplar) a      48% cellulose, 24% glucuronoxylan     21% lignin
  Betula papyrifera (Paper Birch) a   42% cellulose, 35% glucuronoxylan     19% lignin
Herbaceous
  Switchgrass (early cut) b           40.7% cellulose, 35.1% hemicellulose  5.5% lignin
  Switchgrass (late cut) b            44.9% cellulose, 31.4% hemicellulose  12% lignin
Crop residue
  Corn stover b                       36.4% glucan, 18.0% xylan             16.6% lignin
  Wheat straw                         38.2% glucan, 21.2% xylan             23.4% lignin
  Sugarcane bagasse c                 32-48% cellulose, 19-24% xylan        23-32% lignin

Adapted from:
b Timell, T. E. (1967). Recent progress in the chemistry of wood hemicelluloses. Wood Sci Technol 1, 45-70.
a Lynd, L. R., C. E. Wyman, and T. U. Gerngross (1999). Biocommodity Engineering. Biotechnol Prog 15, 777-793.
c Scurlock, J. (http://bioenergy.ornl.gov/papers/misc/biochar factsheet.html). Bioenergy feedstock characteristics (Oak Ridge: Oak Ridge National Laboratory, Department of Energy).

Figure 2-1. Pattern of cellulose fiber deposit in different layers of the primary and secondary cell wall. A "P" designation refers to layers of the primary cell wall while an "S" designation refers to layers of the secondary cell wall. Glucuronoxylan is thought to be more concentrated at the interface between secondary cell wall layers. Figure adapted from Fujita, M., and H. Harada (2001). Ultrastructure and formation of wood cell wall, In Wood and Cellulosic Chemistry, D. N.-S. Hon, and N. Shiraishi, eds. (New York: Marcel Dekker), pp. 1-49.

Figure 2-2. Common structural elements and sites of enzymatic hydrolysis which degrade methylglucuronoxylan and methylglucuronoarabinoxylan. Enzymes labeled in the figure: β-1,4-xylosidase (acting at the non-reducing end), acetylesterase, α-1,2-glucuronidase and α-1,3-arabinofuranosidase.

Figure 2-3. Generation of hydrolysis products by different families of xylanases highlighting the intricate role of GH 10 xylanases in complete pentosan utilization with respect to the other xylanases. Panel title: Processing of Glucuronoxylan by Bacterial Enzyme Systems. Primary aldouronates generated: aldotetrauronate (GH 10), aldopentauronate (GH 11) and a product substituted penultimate to the reducing end (GH 5). Figure notes: most characterized α-glucuronidases act only on aldouronates with nonreducing end substitutions; aldotetrauronate has been shown to act as a catabolite activator binding the UxuR repressor in Geobacillus stearothermophilus (Shulami et al., 1999). Other labels: transporter, xylosidase, xylotriose, α-glucuronidase, glucuronic acid, value added products.

Figure 2-4. Xylanase structure and function. Diverse xylanase structures catalyze identical reactions by the same mechanism. Structures shown: Cellvibrio japonicus GH 10, RCSB 1CLX (Harris et al., 1996), (β/α)8 barrel; Bacillus circulans GH 11, RCSB 1XNB (Campbell et al., 1994), β-jellyroll fold; Erwinia chrysanthemi GH 5, RCSB 1NOF (Larson et al., 2003), (β/α)8 barrel with attached beta structure. Mechanism panel: retaining reaction mechanism for Bacillus circulans GH 11 endoxylanase, showing the glycosylation and deglycosylation steps involving Glu78 and Glu172.

CHAPTER 3
FAMILY 10 GLYCOSYL HYDROLASES: STRUCTURE, FUNCTION AND PHYLOGENETIC RELATIONSHIPS

A Paenibacillus sp. (strain JDR-2) has been isolated that is capable of efficient utilization of MeGAXn. Studies of this organism have attributed this ability to the production of a large multimodular GH 10 xylanase.
This 154 kDa secreted protein has eight separate modules which contribute functions for efficient hydrolysis and utilization of polymeric xylan. Two different modules are involved with substrate association while another module is involved in cell surface localization. The proximity between the cell, substrate and hydrolysis products which results from the combined function of these appended modules is thought to facilitate vectorial or directional utilization of xylan hydrolysis products (St. John et al., 2006). Analysis of Paenibacillus sp. strain JDR-2 and characterization of XynA1 is presented in Chapter 4. This chapter reviews GH 10 xylanases and endeavors to establish functional themes through analysis of associated modules and application of phylogenetics.

Xylanases of Glycosyl Hydrolase Family 10

Glycosyl hydrolase family 10 (GH 10) xylanases are arguably the best studied and understood family of xylanolytic enzymes. Their substrate is the ubiquitous β-1,4-linked xylose backbone of the major xylans of hardwood and crop residues, the primary sources of biomass for bioconversion to ethanol. It is expected that GH 10 xylanases have a leading role in the degradation of MeGAXn, allowing for subsequent turnover of this major biomass component. To date, they have been found in all three domains of life. The Carbohydrate Active Enzymes database (CAZy) (www.cazy.org/CAZY/) has 175 bacterial GH 10 entries and 110 eukaryote entries (Davies and Henrissat, 1995; Henrissat, 1991; Henrissat and Bairoch, 1993). As with any sequence database in this era of genomics, most of the sequences have been deposited in conjunction with genome sequencing projects. Hence, only a few have been studied for kinetic properties and fewer have received detailed molecular analysis to understand mechanistic enzyme substrate interactions.

Enzyme Structure and Mechanism

The primary unit of GH 10 xylanases is the catalytic domain (CD). This module typically ranges in size from 30 kDa to 40 kDa.
Several examples of GH 10 xylanases have been crystallized, and x-ray diffraction for structural analysis revealed a common (α/β)8 protein fold. As with many endo acting glycosyl hydrolases, GH 10 xylanases have a substrate binding cleft that appears from crystal structures to run the breadth of the enzyme (Figure 2-3). The binding site of GH 10 xylanases (as with most glycosyl hydrolases) is composed of a series of subsites that position and bind individual xylose residues. The nomenclature for describing the organization of subsites has been reviewed (Davies et al., 1997). It is based on the convention for the naming of polymeric carbohydrates and the point of hydrolysis within the enzyme. Subsites are numbered increasing from the point of hydrolysis, with a negative designation toward the nonreducing terminus (glycone) (left) and a positive designation in the direction of the reducing terminus (aglycone) (right). Hydrolysis of xylan occurs through a double displacement mechanism with retention of anomeric configuration (Davies and Henrissat, 1995; Gebler et al., 1992; Henrissat et al., 1995) (Fig. 2-4). Two glutamate residues have been identified which catalyze this hydrolysis, one acting as the primary nucleophile and the other as the proton donor.

Modular Characteristics of GH 10 Xylanases

GH 10 xylanases are often associated, within their translated protein product, with additional separately folding domains. Although a variety of different functional domains have been identified, the majority represent carbohydrate binding modules (CBM). These separately folding modules are β-sheet structures that bind target carbohydrates, not necessarily xylan. Further, it is common for there to be modular repeats such that there are multiple modules of the same family. The largest GH 10 xylanase in the CAZy database has eight separate modules and is 194 kDa in mass. Six of the modules represent three different CBMs and a seventh module represents an additional enzymatic activity.
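The subsite numbering convention described earlier in this section (Davies et al., 1997) can be illustrated with a short sketch. The function name `subsite_labels` and its arguments are hypothetical and serve only to show the labeling order, listed from the nonreducing (glycone) side to the reducing (aglycone) side, with hydrolysis occurring between subsites -1 and +1.

```python
from typing import List


def subsite_labels(glycone_subsites: int, aglycone_subsites: int) -> List[str]:
    """Return subsite labels ordered from nonreducing to reducing terminus.

    Negative labels count away from the point of hydrolysis toward the
    nonreducing end; positive labels count away toward the reducing end.
    There is no subsite 0: the scissile bond lies between -1 and +1.
    """
    if glycone_subsites < 1 or aglycone_subsites < 1:
        raise ValueError("at least one subsite is needed on each side")
    negative = [f"-{i}" for i in range(glycone_subsites, 0, -1)]
    positive = [f"+{i}" for i in range(1, aglycone_subsites + 1)]
    return negative + positive


if __name__ == "__main__":
    # A cleft with three glycone and two aglycone subsites (illustrative sizes).
    print(" ".join(subsite_labels(3, 2)))
```

For a cleft with three glycone and two aglycone subsites this lists -3, -2, -1, +1, +2; the subsite counts are arbitrary and are not taken from any particular GH 10 structure.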
No direct interaction has been identified between a GH 10 CD and its associated CBM. The modules are generally connected through linker regions that in some cases have characteristic protein sequences. It is generally thought that linker regions lack structure, having the singular task of connecting two functional domains. In only two cases has a CBM been crystallized together with its cognate CD (Fujimoto et al., 2000; Pell et al., 2004a). In these studies the linker did not yield a precise electron density map for structure analysis. The tethering action of the linker sequence between a CD and CBM identifies a simple spatial relationship required for enhancement in CD function. This contrasts with the concept of a coordinated interaction in which accessory modules may directly interact with catalytic modules for enhanced functionality (Akin et al., 2006; Irwin et al., 1998; Sakon et al., 1997). Boraston et al. have recently reviewed the structure, function and classification of CBM modules (Boraston et al., 2004). Conventional wisdom suggests that these modules help target the CD to the expected substrate, thereby increasing the localized substrate concentration. In contrast to idealized kinetic systems with a soluble substrate, the recalcitrant, composite character of lignocellulosic biomass requires enzymes to be targeted to specific regions of the substrate for effective hydrolysis. The frequent occurrence of GH 10 xylanases that do not contain an appended carbohydrate binding module suggests that in many cases CBMs are not necessary for the desired function. There are 46 families of carbohydrate binding modules in the CAZy database, assigned on the basis of sequence similarity and hydrophobic cluster analysis (Henrissat and Bairoch, 1993; Tomme et al., 1995). Currently, eleven are found associated with GH 10 xylanases. These include members of families 1, 2, 3, 4, 5, 6, 9, 10, 13, 15 and 22. The occurrence of these modules within GH 10 xylanases varies greatly.
For instance, of the 106 different families of glycosyl hydrolases in the CAZy database, CBM 22 modules are primarily found associated with bacterial and plant GH 10 xylanases (Table 3-1). CBM 9 modules are only associated with GH 10 xylanases that already have a CBM 22 module, but for these modules, only bacterial associations are known. CBMs of families 2, 3, 5, 6 and 10 are primarily found in bacterial enzymes, with only a few from fungal glycosyl hydrolases. Many of these modules are associated with a variety of enzymatic activities. To exemplify an extreme distribution, CBM family 5 has only a single module associated with a GH 10 xylanase, but has about 200 entries in the CAZy database associated mainly with chitinase and cellulase enzymes, while family 15 has only two entries in the database and both are associated with Cellvibrio GH 10 xylanases. Of all the families, CBM family 13 is probably the most diverse, with representatives in bacterial, fungal, plant and mammalian proteins. These modules are only common in GH 10 xylanases from Streptomyces (Table 3-1). CBM 1 modules are common in fungal GH 10 xylanases, but are found more often in other families of glycosyl hydrolase enzymes of fungal origin.

CBM classification by target substrate

A relatively new classification system for carbohydrate binding modules identifies their target substrate rather than their protein fold. Type A modules bind crystalline substrates such as crystalline cellulose, which is not necessarily the target of the associated catalytic module. Type B modules bind soluble substrates, which are usually the intended substrate for the catalytic module, and Type C modules bind small soluble sugars such as cellobiose. The CBM modules listed above which are found appended to GH 10 catalytic modules have representatives in each of these types.
Modules 1, 2a (see below), 3, 5 and 10 classify as Type A, modules 2b (see below), 4, 6, 15 and 22 classify as Type B and modules 9 and 13 classify as Type C (Boraston et al., 2004). The following descriptions clarify their carbohydrate binding preferences, detailing the differences between the three types.

CBM modules common in bacterial GH 10 xylanases and their general architectural arrangement

Modules of CBM family 22 are associated with GH 10 xylanases. There are examples of this module associated with GH 10 xylanases of bacterial and plant origin. Even with this diversity, there is only one example of a characterized CBM 22 that is not of bacterial origin. Early studies assigned a thermostabilizing function to these domains, as removal of the module resulted in decreased thermal stability of the cognate xylanase CD (Fontes et al., 1995). It was soon realized that these domains have a primary role in binding carbohydrate polymers. Most CBM 22 modules are located N-terminal to the CD and are often observed as a duplicate or triplicate set (Ali et al., 2001b; St. John et al., 2006). Xyn10B of Clostridium thermocellum has a unique CBM 22 configuration and also has been well characterized (Charnock et al., 2000; Dias et al., 2004; Xie et al., 2001). In this case the CD is flanked on both sides by a single CBM 22 module. Substrate binding studies showed that while the module on the C-terminal side of the CD has affinity for xylan, the N-terminal localized CBM 22 has no detectable affinity for tested substrates. Crystal structure analysis of the functional module revealed a β-sandwich structure with a small cleft for binding substrate sugars. The tandem N-terminal CBM 22 modules of Xyn10A of Clostridium josui, expressed together, showed carbohydrate binding properties similar to those of the C-terminal located CBM 22 of Xyn10B from Clostridium thermocellum described above (Ali et al., 2005a). Xyn10C of Clostridium thermocellum has a single CBM 22 in the N-terminal region.
In a recent report, adsorption assays with the recombinantly expressed CBM showed that it bound best to acid-swollen cellulose and ball milled cellulose, but native affinity polyacrylamide gel electrophoresis (NAPAGE) analysis showed the greatest gel retardation with birch wood xylan. Although in these studies substrate affinity for this CBM 22 is not definitive, results showed a 4-fold activity increase between the Xyn10C CD expression product and the native Xyn10C protein product, confirming the generally accepted role of CBM modules (Ali et al., 2005b). The nonbacterial contribution to CBM 22 characterizations comes from the ruminal protozoan Polyplastron multivesiculatum. Xyn10B of this protozoan has a single N-terminal CBM 22 as described above for Xyn10C of Clostridium thermocellum. While it was shown to bind xylan, it did not function to enhance catalytic activity (Devillard et al., 2003). In two cases, CBM 22 modules have been shown to bind mixed linkage β-1,3-1,4 glucan chains. Work with Xyn10B of Clostridium stercorarium did not differentiate between the modules of an N-terminal CBM 22 duplicate, but showed that they only slightly increased activity on oat spelt xylan with respect to the separate catalytic domain. Unexpectedly, activity on β-1,3-1,4 glucan, which was very low with the separate catalytic domain, was higher than activity on oat spelt xylan for the native nontruncated enzyme (Araki et al., 2004). Previous work with this enzyme showed that the CBM modules facilitated binding to cellulose (Ali et al., 2001b). XynA of Thermotoga maritima has an identical CBM 22 arrangement. Very detailed studies of these modules identified major differences between the first (CBM 22-1) and second (CBM 22-2) (left to right) modules. Meissner and colleagues showed by NAPAGE that CBM 22-2 bound β-1,3-1,4 glucan, β-1,3-1,4 xylan, and β-1,4 xylans while CBM 22-1 failed to show separate affinity for these potential substrates (Meissner et al., 2000).
Their widespread diversity and apparent variety of specificity make CBM 22 modules interesting platforms for studying carbohydrate epitope recognition and binding. Further structural work may lead to engineering of the sugar binding cleft for development of tools in biotechnology. Although thorough studies of CBM 22 modules clearly show that sequence-based identification of these modules cannot confidently be correlated with a specific function, it is clear that these modules are involved in binding carbohydrate polymers.

CBM 9 modules are frequently associated with CBM 22 modules. These modules are usually positioned just C-terminal to the CD, and their presence is most common in modular GH 10 xylanases which already have a CBM 22 module on the N-terminal side of the CD. In several cases, modular GH 10 xylanases from thermophiles have tandem CBM 9 modules. Defining research characterizing the second CBM 9 module of T. maritima Xyn10A (CBM 9-2) showed that this module had high affinity for small soluble saccharides, including glucose, xylose, cellobiose and xylobiose. This attribute classifies these modules as Type C. Binding affinities for oligomers of more than two residues did not increase, indicating that the binding epitope recognized no more than two sugar residues. This type of CBM also displayed affinity toward xylans and cellulose of all types (Ali et al., 2001a; Clarke et al., 1996; Notenboom et al., 2001). The mechanism of binding was characterized using sodium borohydride reduced polymers. Replacement of the hemiacetal reducing terminal sugar with a sugar alcohol prevented binding of CBM 9-2. Subsequent crystal structure analysis supported these findings, revealing that every hydroxyl group of the reducing terminal sugar in a cyclic conformation interacted with the protein via multiple hydrogen bonding interactions (Boraston et al., 2001; Notenboom et al., 2001). Analysis of CBM 9-1 of T. 
maritima Xyn10A failed to identify a functional role for this very similar module. Modeling of CBM 9-1 using CBM 9-2 as template, together with a sequence alignment of many CBM 9 sequences, showed that CBM 9-1, as well as all other CBM 9 modules in the same modular position in a tandem arrangement, lacked the structurally characterized conserved sugar binding amino acids identified in CBM 9-2. Based on the differences between CBM 9-1 and CBM 9-2, it has been proposed that two subfamilies be designated (CBM 9a and 9b). In all of the CBM 9 tandem arrangements, the first CBM 9 classifies as a CBM 9a and the second CBM 9 classifies as a CBM 9b (Notenboom et al., 2001). Many other GH 10 xylanases, including three in the alignment discussed above, have single CBM 9 modules. The three included in the alignment and the single CBM 9 of Paenibacillus sp. strain JDR-2, a mesophilic, aggressive xylan-utilizing organism, show near complete conservation of the key residues attributed to sugar binding in CBM 9-2 of T. maritima. This suggests that in GH 10 xylanases with a single CBM 9 module, the module may function similarly to CBM 9-2 of T. maritima. The concise studies performed with Xyn10A CBM 9b of T. maritima identified a role for these modules in binding of reducing terminal sugars. Although this module showed affinity for the reducing ends of xylan and cellulose, it had a much higher association constant with cellobiose than with xylobiose, possibly indicating a preference for binding of cellulose.

Modules of CBM family 2 can bind crystalline cellulose and xylan. While there are eleven sequences within the GH 10 family that contain this domain, there are about 200 in the database from glycosyl hydrolase families of chitinases and various cellulases. This family has been grouped into two subfamilies designated CBM 2a (Type A) and CBM 2b (Type B). While CBM 2a binds to crystalline cellulose, CBM 2b has been shown to bind soluble xylan. 
The difference in structure that changes substrate specificity is attributed to a single amino acid switch which reorients a tryptophan for binding to xylan (Simpson et al., 2000). Based on this analysis, of the eleven CBM 2 modules found in GH 10 xylanases, only one is classified as a CBM 2b. The other ten classify as CBM 2a, presumably having specificity for crystalline cellulose. More examples of CBM 2b modules are associated with GH 11 xylanases. CBM 2a modules bind crystalline cellulose irreversibly (Creagh et al., 1996) but are thought to be mobile, allowing for movement on the surface of crystalline cellulose fibers (Jervis et al., 1997). A CBM 2a module from a GH 6 cellobiohydrolase has been shown to disrupt crystalline cellulose (Din et al., 1994), revealing a synergism between the CBM 2 module and the associated CD. Thermodynamic and structural analyses of these modules conclude that binding of crystalline cellulose occurs through an entropically driven process, probably due to displacement of water molecules between the cellulose surface and the near planar face of the carbohydrate binding module (Creagh et al., 1996; McLean et al., 2000).

Family 3 CBM modules bind crystalline cellulose. This family is of notable interest for cellulases of GH family 9. It can be found in five GH 10 xylanases, three of which have two separated modules. These modules have been divided into three subfamilies. Although both CBM 3a and 3b bind to the surface of crystalline cellulose, CBM 3a differs from CBM 3b primarily in a loop structure which contributes to substrate binding (Jindou et al., 2006). Further, CBM 3a modules are associated with scaffoldin components of the cellulosome (Shimon et al., 2000), whereas CBM 3b modules are enzyme localized (Gilad et al., 2003). The last subtype, CBM 3c, is a glycosyl hydrolase family 9 CD fusion domain which is thought to feed or guide the cellulose chain into the GH 9 catalytic domain. 
This type of association is credited with enabling processive degradation of cellulose (Irwin et al., 1998; Sakon et al., 1997).

CBM modules of family 6 bind soluble polymeric sugar substrates. There are seven GH 10 xylanases containing this Type B CBM module. They differ from other Type B CBM domains in that the substrate binding location is a ridge rather than the typical cleft of the β-sandwich structures. They have been shown to bind a variety of soluble substrate sugars with similar affinities. A CBM 6 module from Cellvibrio mixtus endoglucanase 5A has revealed two binding sites, each with unique substrate specificities. Binding of xylan was specific for cleft A, which could also bind cellooligosaccharides, while cleft B also bound cellooligosaccharides but was specific for β-1,3-1,4-glucans (Boraston et al., 2003; Henshaw et al., 2004; Pires et al., 2004).

Xylanases from Streptomyces spp. have CBM 13 modules. Of the 10 CBM 13 modules associated with GH 10 xylanases, 7 are in Streptomyces sequences (Table 3-1). These modules have similarity to the lectin-like B-chain of ricin toxin, which has specificity for galactose. Each CBM is composed of a triplicate repeat of approximately 40 amino acids, and each repeat is a separate site for carbohydrate binding. CBM 13 modules are selective for pyranose sugars, with generally low association constants at each site. Upon binding of polymeric xylan there is a "cooperative and additive" effect (Notenboom et al., 2002), increasing the affinity for this substrate beyond the simple sum of the three contributing sugar binding sites (Boraston et al., 2000; Fujimoto et al., 2002; Notenboom et al., 2002). Studies have indicated that the three binding sites (α, β, γ) accommodate three different xylooligomers (Scharpf et al., 2002).

CBM modules of families 4, 5, 10 and 15 are rare in GH 10 xylanases. CBM modules of family 4 have been identified in about 30 sequences from the CAZy database. 
Only one of these sequences is a GH 10 xylanase, which has an N-terminal tandem set. All the others which have been identified are associated with various β-1,4 and β-1,3 glucanases. Structural studies have characterized this module family as having a β-sandwich jelly roll fold (Johnson et al., 1996b). Binding of soluble carbohydrates occurs within a binding cleft. The bottom of the cleft is lined with hydrophobic residues, and the walls have hydrophilic residues for hydrogen bonding interactions with the carbohydrate polymers. Several subfamilies of CBM 4 modules have been identified. In general, the CBM 4 modules bind the substrate of the associated catalytic module (Simpson et al., 2002; Zverlov et al., 2001). The first studies of this module family were performed with the N-terminal tandem CBM 4 modules from Cellulomonas fimi CenC. These modules were specific for soluble β-1,4 glucan and did not associate with xylan (Brun et al., 2000; Johnson et al., 1996a; Johnson et al., 1996b; Tomme et al., 1996). Xyn10A of Rhodothermus marinus was found to have a related tandem N-terminal set of modules, and substrate binding studies showed that although they had a low affinity for soluble cellulose, they showed specificity for xylans (Abou Hachem et al., 2000). Structural analysis of the second module of this system allowed the researchers to postulate the differences that dictate substrate specificity between the modules from C. fimi CenC that bind soluble cellulose and those from R. marinus Xyn10A that bind xylan (Simpson et al., 2002). The CBM 4 modules from T. neapolitana Lam 6A, a laminarinase (β-1,3-glucanase), do not bind soluble cellulose but are specific for various β-1,3 linked glucan polymers (Zverlov et al., 2001). The diversity of carbohydrate binding in this family is similar to that found in family 22 modules. Both are classified together as Type B CBMs in a larger superfamily (Sunna et al., 2001).

There is only one example of a CBM 5 module in GH 10 xylanases. 
These modules are thought to bind cellulose, but the primary associated enzymatic activity is that of chitinases. The CBM 5 of Erwinia chrysanthemi endoglucanase Z has been structurally characterized, and the authors correlated it with CBM 5 modules associated with chitinase enzymes (Brun et al., 1995; Brun et al., 1997).

At present, family 15 CBMs have only been found in two enzymes. Both are GH 10 xylanases of Cellvibrio (Table 3-1). With these modules, association constants increase up to xylohexaose, indicating there are six subsites in the binding cleft. Although no natural substituted polymer such as MeGAXn achieved as high an association constant as observed with xylohexaose, affinity in the worst case decreased by only one-half, which is not a significant decrease in the measure of association. These modules are thought to efficiently bind decorated xylan because the O2 and O3 hydroxyls (the substituted positions in native xylan) of most xylan binding subsites are solvent exposed (Szabo et al., 2001).

Only a single CBM 10 module has been found associated with GH 10 xylanases. These 45 amino acid modules have a hydrophobic side involved in cellulose binding. The mechanism of association with crystalline cellulose is similar to that of CBM 2 modules, with coplanar aromatic amino acid residues stacking on the cellulose surface (Millward-Sadler et al., 1995; Ponyi et al., 2000; Raghothama et al., 2000).

Fungal modules

Of all the domains associated with GH 10 xylanases, CBM 1 modules are found strictly in sequences from fungal enzymes. However, they are not restricted to xylanases, being primarily found in fungal cellobiohydrolases and cellulases. These small 36 amino acid structures have four highly conserved cysteine residues involved in the formation of disulfide bridges (Kraulis et al., 1989). This module has been shown to facilitate association of cellobiohydrolases and cellulases with cellulose (Carrard et al., 2000; Gilkes et al., 1991). 
In GH 10 xylanases these modules are usually located in the far N- or C-terminal region, some distance from the CD. One report showed that the CBM 1 of a GH 7 reducing terminal cellobiohydrolase (Cbh1) from Penicillium janthinellum had a disruptive effect on cellulose that enhanced activity (Boraston et al., 2004). No research has determined possible differences between the CBM 1 modules in cellulose-active fungal enzymes and those in fungal GH 10 xylanases. It may be that the similarities among these small modules are sufficiently high to discourage such an endeavor. If the primary purpose of this module is to associate the fungal GH 10 CD with cellulose, it serves a role similar to that of several bacterial CBM modules.

Other modules and sequences from GH 10 xylanase

Surface Layer Homology (SLH) domains anchor proteins to the cell surface. SLH domains have several roles in bacterial physiology. With respect to GH 10 xylanase and glycosyl hydrolase function, these domains are often arranged as C-terminal sets (up to three separate domains) and are involved in anchoring the associated enzyme to the cell surface. They also serve as a primary surface anchoring mechanism for the multicomponent lignolytic cellulosome complex produced by several Clostridium spp. Bacterial surface binding studies of several SLH module sets have identified two mechanisms of binding. Several studies have identified binding to secondary cell wall polysaccharide (SCWP). These binding sites consist of carbohydrates associated with the peptidoglycan cell wall. Genetic verification of this binding mechanism was obtained from a csaB gene knockout in Bacillus anthracis (Mesnage et al., 2000). Other results indicate that SLH domains bind directly to the cell wall peptidoglycan layer. Recent work with the SLH C-terminal triplicate of the scaffoldin dockerin binding protein (SdbA) from C. thermocellum found that it bound to the peptidoglycan layer of Escherichia coli. 
This was in contrast to the SLH doublets from the xylanases Xyn10A and Xyn10B of C. josui and C. stercorarium, respectively. In this case, these SLH domains displayed specificity for the Clostridia SCWP extract and reduced affinity for hydrofluoric acid extracted secondary cell wall polymer (Zhao et al., 2006a). This preference for binding native peptidoglycan suggests that the SLH modules from the SdbA protein may be used in biotechnology applications. Recently, similar binding selectivity was observed between the two surface layer proteins (Slp1 and Slp2) and the cellulosome anchoring protein (Anc1) of C. thermocellum (Zhao et al., 2006b). Studies of XynA1 and Xyn5, both GH 10 xylanases from different Paenibacillus spp. (Ito et al., 2003; St. John et al., 2006), have shown that the C-terminal triplicate SLH module anchors the GH 10 xylanase to the cell surface. These triplicate SLH domains, as well as the linker region on their N-terminal side, have homology to the same region of the SdbA protein discussed above. Based on this homology, these two xylanases may bind with specificity similar to that of the examples above which bind native peptidoglycan with no requirement for SCWP. There may now be enough characterized examples available to allow sequence-based determination of amino acid functionality and to define the differences between these two similar modules that bind different polysaccharides on bacterial cell surfaces.

Characteristic linker regions connect modules in some enzymes. Although the ascribed function of linker regions in glycosyl hydrolases is to connect functional domains, the identification of linkers with unique amino acid sequences has made them an interesting topic of research. These unique sequences are characterized as having very high content of specific amino acids. 
These include the serine-rich linker (Sr) (Cellvibrio, Pseudomonas, and Saccharophagus) (Hall et al., 1989), the asparagine-rich linker (Nr) (Ruminococcus), the proline- and threonine-rich linker (PTr) (Caldibacillus, Caldicellulosiruptor, and Cellulomonas), the proline- and glutamate-rich linker (PEr) (Colwellia, Pseudomonas, and Saccharophagus), and the proline- and glycine-rich linker (PGr) (Thermobifida). The serine-rich linker regions of XynA and XynC (an arabinofuranosidase) of Cellvibrio japonicus have been characterized (Table 3-1) (Black et al., 1997; Black et al., 1996; Ferreira et al., 1990). Initial studies with XynA determined that the linker sequence was not required for activity and substrate binding functions. Removal of this intervening sequence nevertheless resulted in lower activities. Although this could be attributed to other functions, it was concluded that it resulted from reduced flexibility of the CD with respect to the CBM. A completely novel linker sequence has been identified in XynB of Neocallimastix patriciarum. It is composed of 12 tandem repeats of the core amino acid sequence TLPG followed by 45 tandem repeats of the octapeptide XSKTLPGG (X = S, K or N). This linker region connects a C-terminal family 1 CBM. Research to elucidate this modular system failed to obtain good expression for functional analysis but showed that the CD sequence coded for a functional GH 10 xylanase (Black et al., 1994). As discussed above, CBM 22 modules were originally thought to confer thermostabilizing properties on GH 10 xylanases. New research shows it is possible that these conclusions resulted from the presence of the linker sequence. The 18 amino acids connecting the CBM 22 module with its cognate GH 10 CD have recently been shown to confer thermostabilization and resistance to proteolysis (Dias et al., 2004). 
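The compositional bias and tandem repeats that define these linkers can be checked computationally. The following is a minimal sketch, assuming a hypothetical linker sequence assembled from the repeat units reported for XynB of N. patriciarum (12 copies of TLPG followed by 45 copies of XSKTLPGG); the X positions are filled arbitrarily by cycling through S, K and N, and the function names are illustrative, not taken from any cited work.

```python
import re
from collections import Counter


def residue_fractions(seq):
    """Return the fraction of each amino acid in a one-letter sequence."""
    counts = Counter(seq)
    return {aa: n / len(seq) for aa, n in counts.items()}


def longest_tandem_run(seq, unit_pattern):
    """Return the largest number of back-to-back copies of a repeat unit.

    unit_pattern is a regular expression describing one repeat unit.
    """
    best = 0
    for run in re.finditer(f"(?:{unit_pattern})+", seq):
        copies = len(re.findall(unit_pattern, run.group(0)))
        best = max(best, copies)
    return best


# Hypothetical linker (assumption): X cycles through S, K, N for illustration only.
x_choices = "SKN"
linker = "TLPG" * 12 + "".join(x_choices[i % 3] + "SKTLPGG" for i in range(45))

fractions = residue_fractions(linker)
tlpg_copies = longest_tandem_run(linker, "TLPG")
octa_copies = longest_tandem_run(linker, "[SKN]SKTLPGG")

# Report the most abundant residues and the detected repeat counts.
for aa, frac in sorted(fractions.items(), key=lambda kv: -kv[1]):
    print(f"{aa}: {frac:.2f}")
print("TLPG tandem copies:", tlpg_copies)
print("XSKTLPGG tandem copies:", octa_copies)
```

The same two functions could be applied to a real linker sequence retrieved from a database entry; a residue fraction threshold for calling a linker "rich" in a given amino acid would need to be chosen by the user, since the text does not define one.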
Glycosyl Hydrolase Accessory Module Discussion

The descriptions above regarding the modules appended to GH 10 CDs exemplify the functional diversity common in glycosyl hydrolase families for lignocellulose degradation. Whether a module is a xylan binding or a crystalline cellulose binding domain, its assumed goal is to facilitate interaction with the substrate. Modules like those of family CBM 2, which target crystalline cellulose, may have roles in xylan hydrolysis by GH 10 xylanases that cannot easily be determined. Endeavors to distinguish the functionality of these modules with respect to the GH 10 catalytic core may facilitate development of applications for efficient enzymatic hydrolysis of lignocellulosics.

Bacterial GH 10 domain architecture. As can be observed in Figure 3-1, common domain arrangements are evident. Significant modular arrangements include the following: CBM 22 modules are localized to the N-terminal side of the GH 10 CD (except for one GH 10 of C. thermocellum); all CBM 9 modules are localized to the C-terminal side of the CD, and all but one are associated with CDs that also have a CBM 22 module. All sequences which have SLH modules for possible cell surface anchoring also have both CBM 22 and CBM 9 modules. CBM 3 modules are always immediately flanked by proline- and threonine-rich linker regions and are only found in Caldibacillus and Caldicellulosiruptor. In several cases there are two of these in the same xylanase. CBM 2 modules are in GH 10 xylanases from Cellvibrio, Saccharophagus, Cellulomonas, Streptomyces and Thermobifida (Table 3-1). These modules are also often flanked by a proline- and threonine-rich or a serine-rich linker sequence. Predictions of GH 10 xylanase function can be proposed based on common architectural module associations and an understanding of the function of these carbohydrate binding modules. Information concerning the method by which the CBM facilitates CD activity can also be deduced from positional relationships. 
While many of the CBM modules of bacterial GH 10 xylanases are usually in a specific position with respect to the CD, some modules are not. The CBM 1 module is only found associated with fungal CDs. Of the fourteen which have this domain, seven have it toward the N-terminus and seven have it toward the C-terminus. From this, it seems that the only purpose of this module is to ensure localization to the lignocellulose substrate. From Figure 3-1 and the brief description of associated modules above, we can imagine a mode of action for these GH 10 xylanases based on their respective module assemblages. The Thermoanaerobacterium saccharolyticum (P36917) and Paenibacillus species JDR-2 (62990090) xylanases would be expected to associate with soluble polymers such as xylan or β-1,3-1,4 glucan through their N-terminal CBM 22 domains. The CBM 9 module is expected to bind the reducing terminus of a cellulose chain, fixing the catalytic module in place, and the C-terminal SLH modules should anchor this enzyme system to the cell surface. The combined properties of these appended modules favor simultaneous substrate and cell surface localization, perhaps increasing hydrolysis product recovery by the cell through a process of vectorial transport. The triplicate family 22 CBM in the N-terminal region of the Arabidopsis thaliana GH 10 xylanase (Q9SM08) is expected to facilitate substrate localization for this CD. How these enzymes function in A. thaliana is difficult to determine, but it can be imagined that they may function in expansion of the cell wall. The Caldibacillus cellulovorans xylanase (7385020) has a C-terminal double CBM 3 set. These modules would be expected to bind the crystalline surface of cellulose, and the N-terminal CBM 22 would promote association with soluble substrate. A similar mode of action can be imagined for the xylanase from Cellulomonas fimi (73427793). 
The irreversible binding and mobile character of the C-terminal CBM 2 module would allow an associated CD to translate across the surface of cellulose crystals in search of substrate. The Streptomyces coelicolor (Q8CJQ1) modular xylanase is the simplest of all the examples. The CBM 13 module in the C-terminal region is expected to associate with soluble xylan and increase the localized substrate concentration to enhance enzymatic efficiency. These examples offer a glimpse into the possible mode of action of several modular GH 10 xylanases. Although these descriptions are not absolute, they provide a framework for development of methods which utilize these enzymes for complex biomass degradation.

Hydrolysis of Substituted Xylans by GH 10 Xylanases

Xylan hydrolysis by GH 10 xylanases primarily results in the limit products xylose, xylobiose, xylotriose and small substituted xylooligomers. Early studies using GH 10 xylanases from Cryptococcus albidus and Streptomyces lividans to digest methylglucuronoxylan resulted in the characterization of aldotetrauronate (MeGAX3) as the smallest substituted xylooligosaccharide (Fig. 3-2) (Biely et al., 1997). Similar work digesting insoluble wheat arabinoxylan with the GH 10 xylanase XylA from Thermoascus aurantiacus resulted in two small substituted limit products. Arabinofuranose-xylobiose (AX2), with the substitution in the O3 position of the nonreducing xylose of xylobiose, and arabinofuranose-xylotriose (AX3), with the same substitution on the middle xylose of xylotriose, accounted for 50% and 30%, respectively, of the total arabinofuranose (Araf) substituted products (Fig. 3-2) (Vardakou et al., 2003). These biochemical findings have recently been validated by detailed structural studies of two GH 10 xylanases together with these limit products (Fujimoto et al., 2004; Pell et al., 2004b). Xylan has been reported to have a threefold helical symmetry. 
Binding subsites of GH 10 xylanases and CBM modules specific for xylan accommodate this characteristic. Native xylan is usually substituted at the O2 or O3 hydroxyl positions (Chapter 2). Substitutions in these positions along the xylan chain can either be accommodated in the protein structure or exposed to the solvent so as not to interfere with subsite xylan interaction. Specific interactions can be understood from the positioning of the O2 and O3 hydroxyls of the subsite-bound xylose residue relative to the protein structure. If a subsite orients the bound xylose moiety such that these positions of the xylose are sterically confined by protein structure, no substitution can be accommodated in that position. For a subsite to bind a substituted xylose moiety in the xylan chain, there can either be a pocket in the protein tertiary structure into which the substitution fits, or the O2 and O3 hydroxyl positions can be solvent exposed, away from the protein surface. As will be seen below, substituted hydrolysis products can also result from subsite flexibility. The resulting substituted hydrolysis limit products reflect subsite accommodation by GH 10 xylanases.

Hydrolysis of Methylglucuronoxylan

Crystal structure analysis of GH 10 xylanases from Streptomyces olivaceoviridis (Xyn10A) and Cellvibrio mixtus (Xyn10B) has provided molecular-level determination of subsite interactions with the methylglucuronosyl moiety on the xylan chain (Fujimoto et al., 2004; Pell et al., 2004b). The limit product, MeGAX3, was cocrystallized with an active site mutant, and structure analysis revealed binding of this hydrolysis product in the -3 through -1 subsites and the +1 through +3 subsites. Binding of MeGAX3 reflected enzyme substrate interactions, indicating that the -3 and +1 subsites accommodate methylglucuronosyl substitutions. For C. mixtus Xyn10B, the -3 subsite methylglucuronosyl could not be modeled as electron density was diffuse, but for the same position in S. 
olivaceoviridis Xyn10A electron density was clear. In this position the O2 hydroxyl is solvent exposed and the substituted methylglucuronate extends up into the solvent. No interactions between the methylglucuronate and protein were identified to explain the clear electron density observed for Xyn10A. In the +1 subsite, the O2 position points into the protein. A pocket in this position accommodates O2 substituted methylglucuronosyl moieties. For S. olivaceoviridis Xyn10A, diffuse electron density did not allow modeling, indicating that the protein has minimal interactions with this carbohydrate residue but is structured to allow unrestricted access in this position. In the case of C. mixtus Xyn10B, clear electron density was observed for the methylglucuronosyl in this position. In Xyn10B, the +1 subsite has more xylose-binding interactions than in other GH 10 xylanases, and while in the methylglucuronosyl pocket the glucuronate moiety is hydrogen bonded to two separate amino acid residues. The additional stability in this position is used to explain the clear electron density for the methylglucuronate and the significantly increased activity, with respect to other xylanases, on the polymeric substrate MeGAXn (Fujimoto et al., 2004; Pell et al., 2004b). Identification of the methylglucuronosyl pocket within the aglycone +1 subsite suggests that GH 10 xylanases may have evolved to address this specific O2 hydroxyl substitution.

Hydrolysis of Methylglucuronoarabinoxylan

GH 10 xylanase crystal structure analysis of the Araf substituted hydrolysis products, AX2 and AX3, did not identify conserved Araf-protein interactions. Results for Xyn10B of C. mixtus and Xyn10A of S. olivaceoviridis were comparable. Xyn10B binding of AX2 and AX3 in the glycone subsites resulted in clear xylose modeling in subsites -1 through -2 and -1 through -3, respectively. In both cases the Araf substitution yielded clear electron density. The Araf of AX2 had two alternative conformations. 
In one, Araf hydrogen bonds with the protein, and in the other, similar to the positioning of Araf in AX3, the O3 hydroxyl hydrogen bonds to the O5 endocyclic oxygen of the xylose in subsite -3, having no direct interaction with the protein. Xyn10A is similar to this, but electron density is not clear for the Araf of AX2 in the -1 through -2 subsites. AX3, however, resulted in clear modeling of the Araf moiety. In this case the O3 hydroxyl of Araf hydrogen bonded with two separate positions within Xyn10B. Interactions of AX2 and AX3 in the aglycone subsite region identified possible xylose subsite binding flexibility. Xyn10B of C. mixtus did not have clear electron density data for AX2, but the xylotriose backbone of AX3 modeled into subsites +1 through +3 as expected. For both enzymes, no Araf moiety could be clearly modeled in the aglycone sites. For Xyn10C of S. olivaceoviridis, the oligomers only allowed modeling of xylobiose in subsites +1 through +2, with the third xylose of AX3 not clear in subsite +3. Based on the modeling of xylose residues in the +1 and +2 subsites for both oligomers, it is assumed that Araf is positioned in these subsites for AX2 and AX3, respectively. In the case of AX3, the +2 subsite xylose was slightly displaced from the binding subsite, suggesting that Araf was wedged into an awkward position. That Xyn10C was used to generate AX3 as the major Araf substituted hydrolysis product of wheat arabinoxylan is a good indication that this flexibility in arabinose accommodation is normal. AX2 was produced by hydrolysis of the same substrate with Xyn10B. Binding of this oligomer in the expected aglycone subsites did not allow modeling. Further, the authors identified restrictions on the O3 hydroxyl of xylose in the +2 subsite of Xyn10B, suggesting that accommodation of AX3 would be more difficult than in Xyn10C. It is apparent that the glycone subsites of both enzymes can accommodate an O2 linked glucuronosyl in the -3 subsite and an O3 linked Araf in the -2 subsite. 
These substitutions are accommodated because the O2 and O3 hydroxyls in these positions are solvent exposed. For aglycone subsites, O2 glucuronosyl substitutions in the +1 subsite are easily accommodated within a pocket. Araf accommodation in this area of the catalytic cleft seems to be variable among xylanases. The differences between these two enzymes are highlighted by the fact that Xyn10B of C. mixtus was used to produce AX2 and Xyn10C of S. olivaceoviridis was used to produce AX3. The latter, as positioned in the aglycone binding region of Xyn10C, revealed a flexibility of xylose binding in the +2 subsite which helps explain how the Araf in this position is accommodated. Xyn10B was suggested not to have this flexibility for O3 linked Araf in the +2 subsite but, based on hydrolysis product analysis, must accommodate it in the +1 subsite.

Hydrolysis of Rhodymenan by GH 10 xylanases

Only one reported study has considered the hydrolytic products of GH 10 xylanases on substrates other than β-1,4-linked xylans. Rhodymenan, a β-1,3-1,4-linked xylan, when digested with the two GH 10 xylanases of Cryptococcus albidus and Streptomyces lividans discussed above, yielded the hydrolysis limit product xylosyl-β-1,3-xylosyl-β-1,4-xylose (Biely et al., 1997).

GH 10 Xylanase Substrate Binding Cleft Studies

An important expectation has recently been addressed, which changes the way we must consider synergy of methylglucuronoxylan hydrolysis between different families of xylanases. This expectation was that the smallest methylglucuronate substituted hydrolysis product released by a GH 11 xylanase, aldopentauronate (MeGAX4), would be further hydrolyzed by a GH 10 xylanase with release of xylose and generation of aldotetrauronate. MeGAX4 is substituted penultimate to the nonreducing terminal xylose of xylotetraose, and this methylglucuronosyl substitution would be expected to guide the substrate into the +1 subsite where the methylglucuronosyl can be accommodated. 
The additional xylose would then be expected to lie across the active site residues, and hydrolysis would release xylose. In this interesting study, four different GH 10 xylanases did not have activity on this substrate (Kolenova et al., 2006). However, hydrolysis of the substrate aldohexauronate, in which the methylglucuronosyl moiety is substituted on the middle xylose of xylopentaose, resulted in release of xylobiose. This study may have identified a substrate requirement for GH 10 xylanases. The inability of these GH 10 xylanases to use MeGAX4 as substrate while using MeGAX5 indicates that binding of xylose to the -1 subsite does not occur with a single xylose (Kolenova et al., 2006). This reflects strong binding of xylose at the -2 subsite compared to binding at the -1 subsite. In a similar study, the GH 10 xylanase of T. aurantiacus (Xyn10) was shown to use an O3 Araf substitution in the -2 subsite as a substrate specificity determinant (Vardakou et al., 2005). It was determined that multiple interactions between the Araf moiety and amino acids in the enzyme stabilized the interaction with this substituted substrate more than with unsubstituted substrate. The purpose of this interaction was validated by comparison of activity on xylotriose and on AX3 in which the arabinose was substituted O3 on the nonreducing terminal xylose. The results showed a four-fold higher activity on the Araf substituted substrate. An interesting comparison between the above result and the previous discussion of MeGAX4 hydrolysis is that a single xylose extending across the active site from the glycone region was hydrolyzed, suggesting that the +1 subsite binds xylose without additional interactions in the +2 subsite. In the work described for GH 10-catalyzed hydrolysis of MeGAX4, a single xylose residue extending across the active site into the glycone region was not hydrolyzed, but the larger xylobiose was hydrolyzed. 
It seems possible that the methylglucuronosyl substitution in MeGAX4 may limit the binding of xylose in the -1 subsite. These findings suggest that GH 10 and GH 11 xylanases may not function synergistically. Rather, GH 11 xylanases may hinder the full potential of a GH 10 xylanase.

Identifying how decorated substrates interact with the catalytic cleft of GH 10 xylanases is important. This knowledge can be used to develop enzyme mixtures for efficient hydrolysis of target biomass substrates. Studies to determine the functional properties of GH 10 xylan binding subsites will also help to develop synthetic xylanases with engineered characteristics.

Phylogenetic Relationships of Glycosyl Hydrolase Family 10 Xylanases

Phylogenetic analysis of 241 GH 10 CD sequences is presented in Figure 3-3. A complete list of compared sequences and their attributes is presented in Table 3-2. The tree identifies three major branches of divergence (A, B and C). The first major branch, A, diverges to plants (A1) and a bacterial clade (A2). The bacteria in this clade associate closely with the B branch bacterial clade, which contains most members of the phytopathogenic genus Xanthomonas. This large bacterial grouping is extended further because it groups close to the C1 bacterial subgroup, which diverges from the major C branch. The C1 clade contains interesting bacterial genera such as Rhizobium, Agrobacterium, Synechococcus, Anabaena and Nostoc. The divergent C2a branch leads to most fungal sequences. It diverges into two major fungal clades, one of which splits into a Streptomyces clade. This association is of notable interest, as filamentous prokaryotic Streptomyces spp. have cell structure and morphological stages similar to those of some fungi. The third fungal group is composed entirely of Fusarium, which branches separately from other fungi. Sequences 1 through 59 are almost entirely from bacterial species.
Many of these sequences cluster by bacterial genus or by enzyme characteristics, such as the associated modular architecture. For instance, sequences 22 through 44 are all highly modular enzymes with similar module types and architecture.

Plant and Related Bacterial GH 10 Xylanase

The phylogenetic tree highlights associations between GH 10 xylanases of plants and bacteria which can only be attributed to close evolutionary origin or interaction. Bacterial sequences 189 through 206, which comprise part of branch B and all of A2, are closest to plant sequences but represent such a diverse assemblage of genera that it is difficult to draw conclusions. However, the remaining sequences in branch B (177 through 188) and those in the closely associated C1 clade come from bacteria with clear similarities or associations to plants. These represent sequences from the phytopathogenic genera Xanthomonas and Agrobacterium and the well studied plant pathogen Pseudomonas syringae. Also included are three different genera of cyanobacteria, the nonsulfur purple photosynthetic bacterium Rhodopseudomonas palustris and two rhizosphere nitrogen-fixing plant endosymbionts. Similarities between these sequences may arise from common ancestry or from common evolutionary ascendancy defined by prolonged plant-bacterial interaction.

It would be expected that plant GH 10 xylanases have a role in expansion of the cell wall and may have the inherent capability to hydrolyze highly substituted xylans. Plant pathogens would probably benefit from these same properties, as their GH 10 xylanases are expected to perform a similar task in a similar environment. Given these possibilities, GH 10 xylanases from plant pathogens may have interesting characteristics when compared to those from saprophytic microorganisms.
Although the goal of each of these enzymes is considered to be the same, the substrate for saprophytes is not unaltered plant tissue but rather decaying biomass, produced through the function of saprophytic microbial consortia. The combined activities of many hydrolytic enzymes within this environment may present a significantly altered substrate. These GH 10 xylanases may have evolved high turnover on simplified substrates, whereas others may have lower rates on complex substrates.

Fungal and Streptomyces Association

GH 10 xylanases of fungal origin are intriguing in that they seem significantly less complex than the modularly diverse bacterial xylanases. Of the two major fungal clades, the one containing sequences 127 through 162 has seven sequences with an appended CBM 1 module, six having this module at the N-terminus. The other clade, containing fungal sequences 97 through 113 (17 sequences), contains seven which have a CBM 1 module at the C-terminus. From this, it seems that the difference between the two clades, including the positioning of the CBM 1 module, is reflected within the sequence of the CD. A clade for the genus Streptomyces intervenes between the two fungal groupings. Every sequence in this clade has a C-terminal CBM module. Most have CBM 13 modules (β-trefoil), but there are four CBM 2 modules, two found with Cellulomonas fimi sequences in this clade. These are the only two which are not Streptomyces spp.

Bacterial GH 10 Xylanases: Tools to Work With

Sequences 1 through 79 and 89 through 96 are primarily bacterial. Approximately 52% of these sequences contain accessory modules. Several small clades come directly off the C2 branch, but most branch from C2b (Fig. 3-3). In this subgroup, sequences 1 through 9, 14 through 21 and 45 through 59 consist of CDs only. Of these, sequences 1 through 9 do not contain detectable secretion signal sequences and are therefore considered intracellular.
Sequences 10 through 13 and 22 through 44 are all modular, many showing a common modular architecture consisting of a CBM 22 and a CBM 9 appended to the CD. Several of these that group closely, including the enzyme from Paenibacillus sp. strain JDR-2, also have SLH modules involved in cell surface anchoring.

Conclusion

This review emphasizes the wide distribution and significant diversity of glycosyl hydrolase family 10 xylanases. The array of accessory modules often found associated with GH 10 xylanases highlights possible functional variability and suggests that directed efforts to develop xylanases that facilitate preprocessing may benefit from inclusion of these modules.

Substrate binding studies of GH 10 xylanases have revealed details of the interaction of the GH 10 catalytic cleft with substitutions on the methylglucuronoxylan chain. For subsites which bind xylose with the O2 or O3 hydroxyls oriented into the protein, substitutions can only be accommodated by open secondary structure, such as the pocket that accepts O2-substituted methylglucuronate in the +1 subsite, or by subsite flexibility, as suggested for binding of AX3 in the +2 subsite. Glycone substrate binding accepts methylglucuronate in the -3 subsite and O3-substituted Araf in the -2 subsite. These positions are solvent exposed and generally display little to no interaction between the protein and the substrate appendage. The product variability found for hydrolysis of methylglucuronoarabinoxylan suggests that minor amino acid changes within the xylan binding cleft may contribute to large differences in hydrolysis product profiles. Even though the catalytic cleft is well conserved, differences in the abilities of GH 10 xylanases can occur as a result of appended accessory modules and variations in the catalytic cleft.

Phylogenetic tree analysis identified interesting associations between the GH 10 xylanases from different organisms.
Although the role of GH 10 xylanases has not been determined in plants, the number of available sequences from Arabidopsis and other plant genera suggests that they may have some function in cell wall alteration. Proximity in the phylogenetic tree identifies a close similarity between GH 10 xylanases from plants and those from several plant pathogens and photosynthetic organisms. Apart from the CBM 22 module, which is prominent in plant sequences as well as bacterial ones, and the CBM 1 module, which is strictly fungal, all other modules are found only in bacterial GH 10 xylanases. This again highlights the significant diversity available to accentuate GH 10 catalytic abilities and identifies bacterial GH 10 xylanases as a significant biotechnology resource for bioengineering and development of next-generation bacterial biocatalysts.

Figure 3-1. Common domain arrangements found in GH 10 xylanases. Enzymes shown: Thermoanaerobacterium saccharolyticum (P36917), Paenibacillus sp. strain JDR-2 (27227837), Arabidopsis thaliana (Q9SM08), Caldibacillus cellulovorans (7385020), Cellulomonas fimi (73427793) and Streptomyces coelicolor (Q8CJQ1). Legend: CBM 22, GH 10, CBM 9, CBM 2, CBM 13, CBM 3, SLH domain.

Table 3-1. Distribution by bacterial genus of carbohydrate binding modules and other functional domains associated with GH 10 xylanases.
Columns: Genus; Family of Carbohydrate Binding Module (2, 3, 4, 5, 6, 9, 10, 13, 15, 22); Linking Sequence (Nr, Sr, PEr, PTr, PGr); SLH Domain; Other
Genera: Aeromonas, Anaerocellum, Bacillus, Caldibacillus, Caldicellulosiruptor, Cellulomonas, Cellvibrio, Clostridium, Colwellia, Cytophaga, Eubacterium, Fibrobacter, Nonomuraea, Paenibacillus, Prevotella, Pseudomonas, Rhodothermus, Ruminococcus, Saccharophagus, Streptomyces, Thermoanaerobacterium, Thermobifida, Thermotoga
a GH 5 cellulase module and a truncated GH 43 module.
b Chitin binding module and a deacetylation module.
c Esterase module.
d Cadherin repeat module and Salmonella repeat of unknown function.
e GH 11 module.
f Chitin binding module and a truncated GH 43 module.
g GH 62 module and a chitinase module.

Figure 3-2. Products formed by the hydrolysis of methylglucuronoxylan and methylglucuronoarabinoxylan by a glycosyl hydrolase family 10 xylanase. Substituted hydrolysis limit products are determined by the interaction between the substitutions and binding subsites. (Diagram labels: glycone, nonreducing end; aglycone, reducing end; substrates glucuronoxylan and arabinoxylan; products MeGAX3, arabinoxylobiose, arabinoxylotriose, X2 and X3.)

Table 3-2. Glycosyl hydrolase family 10 sequences included in phylogenetic studies and some of their properties.
Columns: Number, Organism, Accession, Module Architecture, Secreted
Number: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
Organism: Caldicellulosiruptor saccharolyticus; Caulobacter crescentus CB15; thermophilic anaerobe NA10; Ampullaria crossean; Eucalyptus globulus subsp. globulus; Hordeum vulgare subsp. vulgare; Aeromonas punctata; Bacillus sp. BP-23; uncultured bacterium; Geobacillus stearothermophilus; Geobacillus stearothermophilus; Bacillus alcalophilus; Bacillus sp.; Thermobacillus xylanilyticus; Butyrivibrio fibrisolvens; Thermotoga sp. strain Fj SS3-B.1; Thermotoga maritima; Thermotoga neapolitana; Thermotoga sp. strain Fj SS3-B.1; Clostridium stercorarium; Clostridium stercorarium; Geobacillus stearothermophilus; Bacillus sp. NG-27; Bacillus firmus; Bacillus halodurans; Bacillus sp.
Accession and Module Architecture: 144299 CD; 16127272 CD; 024820 CD/PTr/CBM3/PTr/Cel5; 66474472 CD; 88659658 CD; Q8GZB5 CD; 61287936 CD; 3201483 CD; Q7X3W7 CD; 499714 CD; 73332107 CD; 37694736 CD; 662884 CD; 069261 CD; 48963 CD/tAes; Q9WWJ9 CBM22(2)/CD/CBM9(2); Q60037 CBM22(2)/CD/CBM9(2); Q60042 CBM22(2)/CD/CBM9(2); Q9R6T4 CBM22(2)/CD/CBM9(2); 216419 CD; 23304849 CD; P40943 CD; 2429332 CD; 34978678 CD; 56567273 CD; 216371 CD
Organism Accession Module Architecture Secreted Number Bacillus halodurans unidentified Caldicellulosiruptor saccharolyticus Caldicellulosiruptor sp. Rt8B.4 Anaerocellum thermophilum Caldicellulosiruptor saccharolyticus Caldicellulosiruptor saccharolyticus Caldicellulosiruptor sp. Tok7B. 1 unidentified Caldicellulosiruptor sp. Tok7B. 1 Aeromonas punctata Bacillus sp. BP-23 Cellulomonas fimi Cellulomonas pachnodae Clostridium thermocellum 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 22597186 CD Yes 39749821 2645425 P40944 1208895 2645417 40646 4836168 39749823 4836167 3810965 3201481 1103639 5880612 144776 Q60046 P36917 Q60043 5360744 23304851 12225048 62990090 27227837 7385020 109701250 P48789 21110686 tCBM22/CD/CBM9t CBM22(2)/CD CBM22(2)/CD CBM22(2)/CD CBM22(2)/CD CD/PTr/CBM3/PTr/Cel5 CD/PTr/CBM3/PTr/CBM3(2)/PTr/Cel5 CD CBM22(2)/CD/PTr/CBM3(2)/PTr/CBM3/PTr/GH43t/CBM6 CBM22/CD/CBM9(2) CBM22(2)/CD/CBM9(2) Deac/CBM22/CD/CBM9 CBM22(2)/CD/CBM9/CBM5/ChtBD3 CBM22/CD/CBM9(2)/SLH(3) CBM22(2)/CD/CBM9(2)/SLH(3) CBM22(2)/CD/CBM9(2)/SLH(2) CBM22(2)/CD/CBM9(2)/SLH(3) CBM22/CD/CBM9 CBM22/CD/CBM9 CBM22/CD/CBM9/tSLH CBM22(3)/CD/CBM9/SLH(3) CBM22(2)/CD/CBM9/SLH(2) CBM22/CD/PTr/CBM3/PTr/CBM3 CD CD CD 21 Thermoanaerobacterium ;i, ,. iiil--, ,., ,. , Thermoanaerobacterium saccharolyticum Thermoanaerobacterium sp.strain JW/SL-YS 485 Clostridium stercorarium Clostridium stercorarium Clostridium josui Paenibacillus sp. JDR-2 Paenibacillus sp. W-61 Caldibacillus cellulovorans Pseudoalteromonas atlantica T6c Prevotella ruminicola Xanthomonas axonopodis pv. citri str. 306 Table 3-2. Glycosyl hydrolase family 10 sequences included in phylogenetic studies and some of their properties. Continued. Organism Accession Module Architecture Secreted Number 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 Xanthomonas campestris pv. vesicatoria str. 85-10 Xanthomonas campestris pv. campestris str. 8004 Xanthomonas campestris pv. 
campestris str. Bacteroides ovatus uncultured bacterium Flavobacterium sp. MSY2 Saccharophagus degradans 2-40 Cellvibrio japonicus Cellvibrio mixtus Caldicellulosiruptor saccharolyticus Dictyoglomus thermophilum uncultured bacterium Clostridium cellulovorans Clostridium thermocellum Polyplastron multivesiculatum Clostridium thermocellum Ruminococcus flavefaciens Epidinium caudatum ButyrivibrioJ, *-h.. a, Eubacterium ruminantium Cellvibrio japonicus Cellvibrio mixtus Cellvibrio japonicus Saccharophagus degradans 2-40 Neocallimastix patriciarum Clostridium thermocellum Rhodopirellula baltica SH 1 78038341 66575835 21115393 450852 56709936 68525474 89951878 5690438 37962277 2645419 973983 Q8VPE4 47716661 4850306 Q9UOG1 P51584 P29126 28569972 P23551 974180 38323070 757809 45520 89952176 Q02290 P10478 32446690 CD CD CD CD CD CD CD CD CD CD CD CD CBM22/CD CBM22/CD CBM22/CD CBM22/CD/CBM22/Est GH11/Nr/CD CD/CBM13 CD CBM22/CD/CBM9 Sr/CBM15/CD Sr/CBM15/CD CBM2/Sr/CBM10/Sr/CD CBM2/Sr(2)/CD CD/XSKTLPGG(45)/CBM1 Est/CBM6/CD CD No Yes Yes Yes Yes ND Yes Yes TAT No ND Yes Yes Yes Yes Yes Yes No Yes No ND ND Yes Yes Yes Yes No Table 3-2. Glycosyl hydrolase family 10 sequences included in phylogenetic studies and some of their properties. Continued. Organism Accession Module Architecture Secreted Number 75 76 77 78 79 80 81 82 83 84 85 86 87 00 88 89 90 91 92 93 94 95 96 97 98 99 100 101 Thermotoga sp. Thermotoga neapolitana Thermotoga maritima Clavibacter michiganensis subsp. michiganensis Streptomyces turgidiscabies Aspergillus nidulans FGSC A4 Fusarium oxysporum Fusarium oxysporum Fusarium oxysporum f. sp. 
lycopersici Fusarium oxysporum Fusarium oxysporum Fusarium oxysporum Fusarium oxysporum Fusarium oxysporum Prevotella ruminicola Streptomyces avermitilis MA-4680 Thermobifidafusca YX Streptomyces avermitilis MA-4680 Thermobifida alba Thermobifidafusca YX Colwellia psychrerythraea 34H Cryptococcus adeliensis Patent 5693518 Aspergillus nidulans FGSC A4 Aspergillus oryzae Penicillium funiculosum Talaromyces emersonii Q60044 Q60041 Q9WXS5 31559721 57338460 40745311 Q8TGC2 Q8TGC3 093976 Q8TGC4 19912845 19912843 21699819 19912853 P72234 29605742 71916922 29608643 P74912 71917054 71145740 013436 3015123 40742582 83775646 53747929 21437253 CD CD CD CD CD/ChNT CD CD CD CD CD CD CD CD CD CBM22/tCD CD CD CD/CBM2 CD/PGr/CBM2 CD/PGr/CBM2 CD CD CD/CBM1 CD/CBM1 CD CD/CBM1 CD/CBM1 Table 3-2. Glycosyl hydrolase family 10 sequences included in phylogenetic studies and some of their properties. Continued. Organism Accession Module Architecture Secreted Number 102 103 104 105 106 107 108 109 110 111 112 113 114 C 115 116 117 118 119 120 121 122 123 124 125 126 127 128 - fl i,,y ..'..' i,." grisea 70-15 Neurospora crassa OR74A Neurospora crassa OR74A J f,, .... ioh.. grisea Gibberella zeae Neurospora crassa OR74A Humicola grisea i i,,.''.... thi '.. grisea 70-15 Aureobasidium pullulans var. 
melanigenum Gibberella zeae Agaricus bisporus Phanerochaete chrysosporium Streptomyces coelicolor Streptomyces lividans Streptomyces olivaceoviridis Streptomyces thermocyaneoviolaceus Streptomyces thermoviolaceus Streptomyces avermitilis Patent 6300114 Nonomuraea flexuosa Cellulomonas fimi Cellulomonas fimi Streptomyces chattanoogensis Streptomyces coelicolor Streptomyces halstedii Alternaria alternate Cochliobolus carbonum 39973147 32416834 32413873 24496243 50844266 32407695 P79046 39963865 84469404 56555501 060206 Q9HEZ1 Q8CJQ1 P26514 Q7SI98 Q9RMM5 38524461 Q9X584 34606109 Q8GMV6 73427793 144425 Q9X583 Q9RJ91 Q59922 Q9UVP5 49066418 Bpht/CD CD CD CD CD CD/CBM1 CD/CBM1 CD/CBM1 CD CD CD CBM1/CD CD/CBM13 CD/CBM13 CD/CBM13 CD/CBM13 CD/CBM13 CD/CBM13 CD/CBM13t CD/CBM13 CD/PTr/CBM2 CD/PTr/CBM2 CD/CBM13/GH62 CD/CBM2 CD/CBM2 CBM1/CD CD Yes Yes Yes Yes Yes Yes Yes Yes Yes No Yes Yes Yes Yes No Yes Yes Yes Yes Yes Yes Yes ND Yes Yes Yes Yes Table 3-2. Glycosyl hydrolase family 10 sequences included in phylogenetic studies and some of their properties. Continued. Organism Accession Module Architecture Secreted Number 129 Claviceps purpurea 074717 CD Yes 130 Fusarium oxysporum P46239 CBM1/CD Yes 131 Fusarium oxysporum f. sp. lycopersici 059937 CBM1/CD Yes 132 Gibberella zeae 50844270 CD Yes 133 i h,.i,lt... h, .i grisea 22415585 CBM1/CD Yes 134 Coniothyrium minitans 11876710 CBM1/CD Yes 135 Fusarium oxysporum f. sp. lycopersici 059938 CD Yes 136 Gibberella zeae 50844272 CD Yes 137 Cryptovalsa sp. BCC 7197 53636303 CD Yes 138 Neurospora crassa OR74A 32410597 CD Yes 139 Hypocreajecorina 6705997 CD Yes 140 il,,lq... di. grisea 70-15 39951799 CD Yes 141 il,,,q ..... 
icgrisea Q01176 CD Yes
142 Agaricus bisporus Q9HGX1 CD No
143 Volvariella volvacea Q7Z948 CBM1/CD Yes
144 Emericella nidulans 95025700 CD Yes
145 Emericella nidulans Q00177 CD Yes
146 Aspergillus oryzae 83772405 CD Yes
147 Thermoascus aurantiacus P23360 CD Yes
148 Aspergillus oryzae 15823785 CD Yes
149 Aspergillus oryzae 83766611 CD Yes
150 Aspergillus sojae Q9P955 CD Yes
151 Aspergillus terreus 68161138 CD Yes
152 Aspergillus oryzae O94163 CD Yes
153 Aspergillus oryzae 83775732 CD Yes
154 Penicillium canescens 55792811 CD Yes
155 Penicillium simplicissimum P56588 CD No
156 Penicillium purpurogenum Q9P8J1 CD Yes
157 Aspergillus aculeatus O59859 CD Yes
158 Patent 6197564 14480380 CD Yes
159 Aspergillus kawachii P33559 CD Yes
160 Penicillium chrysogenum 46406032 CD Yes
161 Penicillium chrysogenum 83416731 CD Yes
162 Penicillium chrysogenum P29417 CD Yes
163 Rhizobium etli CFN 42 86282913 CD Yes
164 Rhizobium leguminosarum bv. trifolii 88657052 CD Yes
165 Anabaena variabilis ATCC 29413 75701321 CD No
166 Nostoc sp. PCC 7120 Q8YNW3 CD No
167 Pseudomonas syringae pv. phaseolicola 1448A 71555629 CD Yes
168 Pseudomonas syringae pv. syringae B728a 63258442 CD Yes
169 Caulobacter crescentus CB15 16127035 CD TAT
170 Synechococcus elongatus PCC 7942 81169090 CD Yes
171 Synechococcus elongatus PCC 6301 56685123 CD Yes
172 Acidobacterium capsulatum 13591553 CD No
173 Agrobacterium tumefaciens str. C58 17740854 CD Yes
174 Agrobacterium tumefaciens str. C58 15157542 CD Yes
175 Bradyrhizobium japonicum USDA 110 27377352 CD TAT
176 Rhodopseudomonas palustris BisB18 90104203 CD Yes
177 Xanthomonas oryzae pv. oryzae KACC10331 58428646 CD No
178 Xanthomonas oryzae pv. oryzae MAFF 311018 84369769 CD No
179 Xanthomonas oryzae pv. oryzae Q9AM29 CD No
180 Xanthomonas campestris pv. vesicatoria str.
85-10 78038346 CD Yes 181 Xanthomonas axonopodis pv. citri str. 306 21110692 CD Yes 182 Xanthomonas campestris pv. campestris str. 8004 66575838 CD Yes Table 3-2. Glycosyl hydrolase family 10 sequences included in phylogenetic studies and some of their properties. Continued. Organism Accession Module Architecture Secreted Number 183 184 185 186 187 188 189 190 191 192 193 194 195 196 S 197 ) 198 199 200 201 202 203 204 205 206 207 208 209 210 Xanthomonas campestris pv. campestris str. Xanthomonas campestris pv. vesicatoria str. 85-10 Xanthomonas oryzae pv. oryzae KACC10331 Xanthomonas oryzae pv. oryzae Xanthomonas axonopodis pv. citri str. 306 Xanthomonas oryzae pv. oryzae MAFF 311018 Rhodothermus marinus Fibrobacter succinogenes S85 Fibrobacter succinogenes S85 Fibrobacter succinogenes S85 Cellvibrio japonicus Saccharophagus degradans 2-40 Pseudomonas sp. ND137 Saccharophagus degradans 2-40 Clostridium acetobutylicum ATCC 824 Clostridium acetobutylicum ATCC 824 Cytophaga hutchinsonii ATCC 33406 Cytophaga hutchinsonii ATCC 33406 Colwellia psychrerythraea 34H Colwellia psychrerythraea 34H Pseudomonas sp. PE2 Saccharophagus degradans 2-40 Cytophaga hutchinsonii ATCC 33406 Rhodopirellula baltica SH 1 Arabidopsis thaliana Arabidopsis thaliana Arabidopsis thaliana Medicago truncatula 21115397 78038344 58428645 12658424 21110690 84369768 P96988 11526752 9965987 9965986 45524 89952852 57999823 89949430 15004819 15004757 110280325 110281120 71145380 71143508 25137524 89949572 110281182 32446276 081754 081751 081752 92868656 CD CD CD CD CBM4(2)CD CD/CBM6 CD/CBM6t CD/CBM6 CBM2/Sr/CBM6/Sr/CD GH43t/CBM6/Sr/CBM2/Sr/CBM22/CD Sr/CD CD/Sr(2)/ChtBD3 CD CD CD/CBM22 CD/CBM9 PEr/CD/Cad/DUF823 CD PEr/CD PEr/CD CBM22/CD CD CBM22/CD CBM22/CD tCD/CBM22/CD CBM22/CD Yes Yes Yes Yes No Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes ND Yes No ND Yes Yes Yes No No Yes Table 3-2. Glycosyl hydrolase family 10 sequences included in phylogenetic studies and some of their properties. 
Continued. Organism Accession Module Architecture Secreted Number 211 212 213 214 215 216 217 218 219 220 221 222 223 S 224 225 226 227 228 229 230 231 232 233 234 235 236 Triticum aestivum Oryza sativa Oryza sativa (japonica cultivar-group) Zea mays Carica papaya Arabidopsis thaliana Arabidopsis thaliana Arabidopsis thaliana Arabidopsis thaliana Thermosynechococcus elongatus BP- 1 Bacillus pumilus Clostridium thermocellum Ampullaria crossean Oryza sativa (japonica cultivar-group) Nicotiana tabacum Nicotiana tabacum Populus tremula x Populus tremuloides Arabidopsis thaliana Arabidopsis thaliana Arabidopsis thaliana Oryza sativa (japonica cultivar-group) Oryza sativa (japonica cultivar-group) Oryza sativa (japonica cultivar-group) Oryza sativa (japonica cultivar-group) Hordeum vulgare Hordeum vulgare subsp. vulgare 40363757 19920133 Q7XFF8 Q9ZTB8 Q8GTJ2 Q9ZVK8 081897 082111 Q9SZP3 22295628 20386142 37651955 Q7Z1V6 28411931 73624749 73624751 60656567 Q9SM08 Q9SYE3 080596 29788834 15528604 55168219 15528602 P93186 71142590 CD CBM22/CD CD CD CBM22/CD CD CD CD CD CD CD CBM22/CD CD CBM22(4)/CD CBM22(3)/CD CBM22(3)/CD CBM22(3)/CD CBM22(3)/CD tCBM22/CD CBM22(4)/CD CBM22/tCBM22/CD CBM22/CD CD CD CD CBM22/CD Table 3-2. Glycosyl hydrolase family 10 sequences included in phylogenetic studies and some of their properties. Continued. Organism Accession Module Architecture Secreted Number 237 Hordeum vulgare 71142588 CBM22/CD No 238 Triticum aestivum (bread wheat) Q9XGT8 CD Yes 239 Hordeum vulgare P93185 CD No 240 Hordeum vulgare 14861199 CBM22/CD No 241 Hordeum vulgare subsp. 
vulgare 71142586 CBM22/CD No

CD refers to a GH 10 catalytic module, Aes refers to esterase/lipase module, Cel5 refers to a GH 5 cellulase module, Deac refers to deacetylase domain, ChtBD3 refers to chitin-binding domain, Est refers to esterase, Cad refers to Cadherin repeat domain, DUF823 refers to Salmonella repeat of unknown function, Nr refers to asparagine rich domain, Sr refers to serine rich domain, PEr refers to proline glutamate rich domain, PTr refers to proline threonine rich domain, PGr refers to proline glycine rich domain, XSKTLPGG refers to unique Neocallimastix linker, GH 62 refers to arabinofuranosidase domain, ChNT refers to chitinase N-terminal domain, Bph refers to bacterial phosphatase, t refers to a predicted truncation and is positioned to the side of the truncation.

Figure 3-3. Phylogenetic distribution of catalytic domains of glycosyl hydrolase family 10 xylanases. Numbering corresponds to Table 3-2. Full xylanase sequences were aligned in MEGA 3.1 and the highly conserved catalytic domain was trimmed out. These sequences were realigned and used to generate a neighbor-joining bootstrap phylogenetic tree.

CHAPTER 4
Paenibacillus SPECIES STRAIN JDR-2 AND XynA1: A NOVEL SYSTEM FOR GLUCURONOXYLAN UTILIZATION

Introduction

A version of this chapter has previously been published as a peer reviewed manuscript in the February 2006 issue of Applied and Environmental Microbiology.

The increasing cost of and demand for fossil fuels highlight the need to develop efficient methods for converting renewable resources to alternative energy sources such as fuel ethanol (Aldhous, 2005; Sun and Cheng, 2002). Supplementing the energy infrastructure with ethanol may help to shift economic dependence away from petroleum-based energy.
Microbial biocatalysts, both yeast and bacteria, have been developed for the conversion of glucose derived from cellulose and pentoses from hemicellulose to ethanol (Dien et al., 2003; Ingram et al., 1999; Jeffries and Jin, 2004; Jin et al., 2004), and similar approaches with bacteria have been successfully applied to the formation of value-added products such as optically pure lactic acid (Dien et al., 2002; Zhou et al., 2003a; Zhou et al., 2003b). Current research efforts are directed at improving the pretreatment processes to maximize the release of fermentable pentoses as well as glucose, and at further development of the biocatalysts for specific fermentations of these sugars. The adoption of microbial strategies for the efficient depolymerization and assimilation of hemicellulose-derived carbohydrates offers promise for maximizing the conversion of the hemicellulose fraction of lignocellulosic biomass to alternative fuels and biobased products (Preston et al., 2003).

Hemicellulose represents 20% to 30% of lignocellulosic biomass, and methylglucuronoxylan (MeGAXn) is the predominant form of hemicellulose found in hardwood and crop residues (Preston et al., 2003; Singh et al., 2003). This polymer consists of β-1,4-linked xylan in which 10% to 20% of the xylose residues are periodically substituted with α-1,2-4-O-methylglucuronic acid (MeGA) moieties. Complete enzymatic hydrolysis of MeGAXn requires the combined action of several families of glycosyl hydrolases, including β-1,4-endoxylanase, α-1,2-4-O-methylglucuronidase and β-1,4-xylosidase (Preston et al., 2003). Secreted microbial xylanases that catalyze the depolymerization of MeGAXn are primarily represented by two families of glycosyl hydrolase, GH 10 and GH 11, defined by sequence similarity and hydrophobic cluster analysis (http://afmb.cnrs-mrs.fr/CAZY) (Gilkes et al., 1991; Henrissat and Bairoch, 1993; Henrissat and Davies, 1997).
In bacteria capable of utilizing MeGAXn, the metabolism of the aldouronates generated by enzyme-catalyzed depolymerization depends on their assimilation and on cleavage of the MeGA substitution. Most substrate and structural studies of α-glucuronidases, the enzymes required to initiate complete degradation of MeGA-substituted xylooligosaccharides, have clearly established that only aldouronates in which MeGA is linked to the nonreducing terminal xylose are suitable substrates (Nagy et al., 2002; Nurizzo et al., 2002). This distinguishes the role of GH 10 xylanases from that of GH 11 xylanases in generating products for direct assimilation and metabolism. This argument is further supported by evidence that aldotetrauronate acts as a catabolic signaling molecule for its further metabolism (Shulami et al., 1999). Studies of the glucuronic acid utilization gene cluster of Geobacillus stearothermophilus have identified a putative MeGAX3 transporter in an operon composed of genes involved in the degradative and catabolic processing of glucuronoxylan. The uxuR gene product, a DNA binding protein, was found to be a self-regulating element of this operon that acts to repress transcription. Binding of MeGAX3 by UxuR alleviates repression. From this, it appears that GH 10 xylanases play a prominent role, both directly and indirectly, in processing of MeGAXn for its complete catabolism. There is no evidence to support a similar role for the GH 11 xylanases. It is possible that GH 11 xylanases act primarily to hydrolyze polymeric xylan into shorter fragments that can then be further acted upon by GH 10 xylanases and β-xylosidases (Pell et al., 2004a).

Another factor affecting the efficiency of metabolism is the localization of the xylanase relative to the cell. The cellulosome, found primarily in anaerobic Clostridium spp.
and some ruminant microorganisms, and the xylanosome from aerobic soil bacteria, often have associated GH 10 xylanases (Bayer et al., 2004; Doi et al., 2003; Jiang et al., 2004). These extracellular surface-anchored complexes often display a variety of enzymes from several glycosyl hydrolase families with diverse functions. Clostridium thermocellum has a well described cellulosome with twenty-six glycosyl hydrolases (62% cellulases, 23% xylanases, 15% other) and several associated esterase activities that contribute to hydrolysis of the lignocellulose complex (Doi and Kosugi, 2004). Localization of this complex to the surface of the organism presumably allows efficient utilization of hydrolysis products, which may provide a competitive advantage in an anaerobic niche.

Extracellular GH 10 xylanases may also occur as large multi-modular surface-anchored enzymes separate from other glycosyl hydrolases. Representative GH 10 xylanases from Clostridium, Thermoanaerobacterium, Caldicellulosiruptor, Thermotoga, Promicromonospora, Paenibacillus and several other genera have been shown to have similar modular architectures. Here we describe the properties of an extracellular multidomain endoxylanase from an aggressively xylanolytic Paenibacillus sp. (strain JDR-2). The association of XynA1 with cell wall preparations indicates an anchoring role for the SLH domains near the C-terminus of the 155 kDa enzyme. The marked preference of this organism for polymeric MeGAXn as a growth substrate, compared to xylose or the aldouronates generated by the action of the GH 10 endoxylanase, supports a role for this enzyme in the vectorial processing of MeGAXn for subsequent transport and metabolism.

Materials and Methods

Isolation and identification of Paenibacillus sp. strain JDR-2. Paenibacillus sp.
strain JDR-2 was isolated from fresh cut discs (5 cm diameter by 2-4 mm thick) of sweetgum stem wood (Liquidambar styraciflua) incubated about one inch below the soil surface in a sweetgum stand for approximately three weeks. Discs were suspended in 50 ml sterile deionized water and sonicated in a 125 Watt Branson Ultrasonic Cleaner water bath for 10 min. The sonicate was inoculated into 0.2% (w/v) sweetgum (SG) MeGAXn containing the mineral salts of Zucker and Hankin (Z-H) (Zucker and Hankin, 1970) at pH 7.4 and incubated at 30 °C. The SG MeGAXn was prepared and characterized by 13C-NMR as described previously (Hurlbert and Preston, 2001; Jones et al., 1961; Kardosova et al., 1998). Isolated colonies were passed several times in MeGAXn broths and agars until pure. A culture growing on 0.2% MeGAXn Z-H medium was cryostored by mixing 0.5 ml of exponentially growing culture with 0.5 ml of 50% (v/v) sterile glycerol and freezing at -70 °C. The purified isolate was submitted to MIDI Labs (http://www.midilabs.com) for partial 16S rRNA sequencing. The organism was identified as Paenibacillus sp. with 96% identity to Paenibacillus granivorans by blastn submission of 530 nucleotides of sequenced 16S rRNA. The organism has been deposited with the Bacillus Genetic Stock Center (BGSC) (http://www.bgsc.org).

Growth studies. A common protocol was applied in the maintenance and analysis of Paenibacillus sp. strain JDR-2 cultures. Each time a culture was prepared for study, a sample from the cryostored stock culture was transferred into 4 ml of 0.5% SG MeGAXn Z-H medium in 16 x 100 mm test tubes. After 36 to 48 hours of growth, the culture was plated on agar medium containing 1.0% yeast extract (YE) and 0.5% oat spelt xylan in Z-H and grown for 36 to 48 hours until appropriately sized colonies were observed. Alternatively, inoculum from the cryostored culture was plated directly onto the agar medium and grown for 48 to 72 hours before picking an isolated colony.
All colonies regularly displayed the expected phenotype, i.e. a clearing zone on the opaque oat spelt xylan background with the expected colony morphology. For the various growth studies described below a single colony was inoculated into medium specified for the particular experiment. All growth was performed at 30 °C. Growth optimization studies were performed aerobically in 16 x 100 mm test tubes containing 4 ml volumes of medium, and optical densities of cultures were measured at 600 nm with a Beckman DU500 series spectrophotometer with a 16 x 100 mm test tube holder. Individual 4 ml cultures for study were inoculated with 200 µl (5% volume) of an exponentially growing culture (4 ml medium of 1.0% YE in Z-H). For these test tube cultures, agitation was achieved by setting a test tube rack in a large flask holder on a New Brunswick G-2 gyrotory shaker at an angle of approximately 45°. Under these conditions, rotation at 200 rpm yielded the best agitation when compared to simple rotation. Studies comparing Paenibacillus sp. strain JDR-2 utilization of MeGAXn, with or without xylose or glucose as co-substrates, were performed in 125 ml baffle flasks with shaking at 150 rpm on a G-2 gyrotory shaker. Cultures were initiated by the addition of 4 ml (8% volume) of Z-H mineral salts washed cells from an overnight culture (25 ml) of 1.0% YE Z-H medium. Growth was monitored using an HP Diode Array spectrophotometer at 600 nm in a 1.00 cm cuvette. For these cultures, sample dilutions were performed to obtain OD 600 nm readings between 0.2 and 0.8 absorbance units and the resulting value was corrected by the dilution factor. Culture aliquots were centrifuged, supernatants filtered and carbohydrate utilization was measured by HPLC using a complete modular Waters chromatography system comprising a 600 controller, 610 solvent delivery unit, 2410 RI detector and a 710B WISP automated injector.
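The dilution correction applied to the OD 600 nm readings above can be illustrated with a short sketch. The function name and the error handling are illustrative assumptions, not part of the original protocol; only the 0.2 to 0.8 absorbance window and the multiplication by the dilution factor come from the text.

```python
def corrected_od600(measured_od, dilution_factor, low=0.2, high=0.8):
    """Return the culture OD600 implied by a reading taken on a diluted sample.

    measured_od     -- absorbance of the diluted sample at 600 nm
    dilution_factor -- final volume / sample volume (e.g. 5 for a 1:5 dilution)
    low, high       -- accepted reading window (0.2 to 0.8 in the text)
    """
    if dilution_factor < 1:
        raise ValueError("dilution factor must be at least 1")
    if not (low <= measured_od <= high):
        # Outside the window the sample would be re-diluted (assumed handling).
        raise ValueError("reading outside the accepted window; adjust the dilution")
    return measured_od * dilution_factor


if __name__ == "__main__":
    # A 1:5 dilution reading 0.45 corresponds to a culture OD600 of 2.25.
    print(corrected_od600(0.45, 5))
```

Rejecting out-of-window readings rather than correcting them mirrors the stated practice of diluting until the reading falls within the window.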
Carbohydrate separation was achieved with a Bio-Rad HPX-87H column running in 0.01 N H2SO4 with a flow rate of 0.8 ml/min at 65 °C. Data analysis was performed using Waters Millennium Software. The differential utilization of MeGAXn and XynA1 CD-generated products from MeGAXn as growth substrates by the organism was evaluated by the initiation of 50 ml cultures with 4 ml (8% volume) of Z-H washed cells from 25 ml overnight cultures in 0.5% SG MeGAXn Z-H medium. Growth was monitored as described above and aliquots examined by TLC (see procedure below).

DNA cloning, sequencing and analysis. A genomic library of Paenibacillus sp. strain JDR-2 DNA, prepared in pUC18 with gel-purified 6-9 kb fragments obtained from a partial Sau3AI digest, was kindly provided by Ms. Loraine Yomano from the laboratory of Professor Lonnie Ingram. All cloning and general DNA manipulation methods originate from Molecular Cloning: A Laboratory Manual (Sambrook et al., 1989). In addition, DNA purification and gel extraction were performed using kits purchased from Qiagen (Valencia, California). Cloning analysis, planning and image preparation were performed with Clone Manager 6 and Enhance (Scientific and Educational Software, Cary, NC). Analysis of sequences for regulatory elements was conducted using the online tools available through Softberry (http://www.softberry.com/berry.phtml). The pUC18-based 6-9 kb library was transformed into E. coli DH5α and screened for xylanase-positive clones by plating transformed cells on Remazol Brilliant Blue xylan plates and observing agar clearing after 24 hours (Braun and Rodrigues, 1993). Sequencing of cloned DNA was done in house by subcloning the insert into smaller sizes and using pUC18 M13 priming sites for sequencing of both strands. Primer walking at the ICBR Genome Sequencing Services Laboratory at the University of Florida filled in gaps and completed 2x coverage. All sequencing employed the Sanger dideoxy chain termination method.
The final sequence was assembled using the CAP3 sequence assembly program (Huang and Madan, 1999) located on the Pôle Bio-Informatique Lyonnais server (http://pbil.univ-lyon1.fr/). Sequence analysis was performed with online resources available through the NCBI (http://www.ncbi.nlm.nih.gov) and BCM (http://searchlauncher.bcm.tmc.edu) websites. The main tools employed were BLAST and CD-Search of the CDD (Conserved Domain Database) (Marchler-Bauer et al., 2003) on the NCBI site and the 6 Frame Translation and Readseq utility at the BCM site.

Phylogenetic analysis of Paenibacillus sp. strain JDR-2 XynA1. All presented phylogenetic analyses resulted from sequences that had been trimmed to contain only the highly conserved catalytic domain from the proton donor (WDVVNE) to the catalytic nucleophile (ITELDI). These sequences were aligned using ClustalX and phylogenetic trees constructed using MEGA 2.1 (Molecular Evolutionary Genetics Analysis, Version 2.1; Kumar et al., 2002). The domain arrangement of the whole xylanase was determined with CDD (Marchler-Bauer et al., 2003) at NCBI (http://www.ncbi.nih.gov/Structure/cdd/wrpsb.cgi). Signal sequences were analyzed by the on-line program SignalP (Bendtsen et al., 2004) (http://www.cbs.dtu.dk/services/SignalP). Eighty-four bacterial GH 10 xylanases were downloaded from the CAZy (ModO) database (http://afmb.cnrs-mrs.fr/CAZY/) and processed as described above. This processing ensured the strictest comparison between all the bacterial GH 10 xylanases. Four xylanases showed high similarity to Paenibacillus sp. strain JDR-2 XynA1. These and eleven randomly chosen sequences from the remaining set of eighty were presented in figures for this chapter.

Carbohydrate and protein assays. Total carbohydrate concentrations related to substrate preparations and enzymatic kinetic analysis were determined by the phenol-sulfuric acid assay (Dubois et al., 1956).
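The trimming step used for the phylogenetic analysis above, which keeps only the region from the proton donor motif (WDVVNE) to the catalytic nucleophile motif (ITELDI), might be sketched as follows. This is a minimal illustration, not the procedure actually used for the chapter: the function name is hypothetical, and exact string matching of the two motifs is an assumption, since real GH 10 sequences may carry variant motifs that would need an alignment-based approach.

```python
def trim_catalytic_region(seq, donor_motif="WDVVNE", nucleophile_motif="ITELDI"):
    """Return the subsequence spanning donor motif through nucleophile motif.

    Returns None when either motif is missing (exact matching is an
    assumption; variant motifs are not handled in this sketch).
    """
    seq = seq.upper()
    start = seq.find(donor_motif)
    if start == -1:
        return None
    # Search for the nucleophile motif only downstream of the donor motif.
    end = seq.find(nucleophile_motif, start + len(donor_motif))
    if end == -1:
        return None
    return seq[start:end + len(nucleophile_motif)]


if __name__ == "__main__":
    # Toy sequence for illustration only; not a real xylanase.
    toy = "MKK" + "WDVVNE" + "A" * 10 + "ITELDI" + "GG"
    print(trim_catalytic_region(toy))
```

Trimming every sequence to the same landmarks before alignment is what allows modular enzymes of very different lengths to be compared on the catalytic domain alone.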
In conjunction with the total carbohydrate assay, measurements to define the degree of polymerization of substrate and increased reducing terminus levels due to xylanolytic activities were performed by the method of Nelson (Nelson, 1944). Xylose was used as the reference for both assays. Protein levels were determined using Bradford assay reagents (BioRad) with BSA (Fraction V) as the standard (Bradford, 1976).

XynA1 CD cloning, overexpression and purification. The expression vector pET15b+ (Novagen, San Diego, CA) was used to overexpress the catalytic domain of XynA1 independent of other modules. Primers were designed to delimit the CD based on the modular endpoints identified by Pfam (Bateman et al., 2004). The forward primer (5'aagcatggctccactcaaa) included an NdeI site for in-frame fusion with the His-Tag sequence, and the reverse primer (5'tgtgctcagccggatcaat) contained a BlpI (Bpu1102I) site for directional cloning into pET15b+. This primer selection method added a Gly, Ser, His, Met sequence to the N-terminus just prior to the beginning of the Pfam-designated sequence. This additional sequence was derived from the pET15 expression ORF coding sequence. There was additional sequence corresponding to Ala, Glu, Gln at the C-terminal end resulting from vector-derived sequence just upstream from the vector-encoded stop codon. The PCR product was generated using ProofStart high fidelity PCR (Qiagen, Valencia, CA). The construct was verified by sequencing. For expression, the pET15XynA1 CD N-terminal His construct was transformed into E. coli Rosetta (DE3) chemically competent cells, and grown with selection (see below) for about 24 hours. A single colony was picked and inoculated into 50 ml LB containing 50 µg/ml ampicillin and 34 µg/ml chloramphenicol, grown at 37 °C to an OD 600 nm of about 0.6 to 0.8, and the entire culture centrifuged for 10 minutes at 5000 x g at 35 °C.
The pellet was resuspended in 100 ml LB containing 100 µg/ml ampicillin and 34 µg/ml chloramphenicol and grown for about 1 hr at 37 °C. Cells were harvested as above, resuspended in 10 ml LB medium and used to inoculate a 1 L LB batch culture (preequilibrated at 37 °C) containing antibiotics as above. This was grown at 37 °C to an OD 600 nm of 0.6 to 0.8 and overexpression induced by the addition of IPTG to 1.0 mM final concentration, and incubation continued for no more than 3 hours before cells were harvested. Cell pellets representing the growth of 1 liter of culture were suspended in 35 ml of 20 mM sodium phosphate buffer, pH 7.0 and lysed at 16000 psi with a single pass through a French pressure cell. The total volume was estimated to be 37.5 ml and 1 M MgCl2 was added to a final concentration of 1.5 mM to obtain optimal Benzonase (Novagen, San Diego, California) activity for hydrolysis of nucleic acid. Benzonase was added at 8 units/ml and the cell lysate was incubated with gentle mixing at room temperature for about 45 minutes. The crude cell lysate was centrifuged at 4 °C for 30 minutes at 35000 x g, the supernatant filtered through a 0.45 µm syringe tip filter, and 5.0 M NaCl added to a final concentration of 0.50 M. Ten ml aliquots of the cell-free extract were affinity-purified using the HiTrap Chelating HP column procedure (Amersham Biosciences, Piscataway, NJ). Each loaded volume yielded a single expected band detected with Coomassie Blue (CB) following SDS-PAGE analysis. Removal of the N-terminal His-Tag was accomplished using the thrombin cleavage capture kit available through Novagen. XynA1 CD and XynA1 CD N-terminal His were stored for short periods of time at 4 °C in 50 mM potassium phosphate buffer, pH 6.5. For longer storage, these stocks were split with equal volumes of glycerol and stored at -20 °C. Enzyme analysis by activity measurement and protein profiles following SDS-PAGE staining with CB were the same after 6 months storage at -20 °C.
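The stock additions described for the lysate above (1 M MgCl2 to 1.5 mM in an estimated 37.5 ml, and 5.0 M NaCl to 0.50 M) follow from a simple mass balance. The sketch below is illustrative only: the function name is hypothetical, and it assumes the starting solution contains a negligible amount of the added solute, which the text does not state.

```python
def stock_volume_to_add(initial_volume_ml, stock_conc, final_conc):
    """Volume of stock (ml) to add so the mixture reaches final_conc.

    Mass balance: stock_conc * v = final_conc * (initial_volume_ml + v),
    assuming the starting solution contains none of the solute.
    Concentrations may be in any unit as long as both use the same one.
    """
    if not 0 <= final_conc < stock_conc:
        raise ValueError("final concentration must be below the stock concentration")
    return initial_volume_ml * final_conc / (stock_conc - final_conc)


if __name__ == "__main__":
    # 1 M MgCl2 into 37.5 ml lysate to reach 1.5 mM (0.0015 M)
    print(round(stock_volume_to_add(37.5, 1.0, 0.0015) * 1000, 1), "ul MgCl2 stock")
    # 5.0 M NaCl into a 10 ml portion to reach 0.50 M (10 ml is an example volume)
    print(round(stock_volume_to_add(10.0, 5.0, 0.50), 2), "ml NaCl stock")
```

Because the added volume is counted in the final volume, the NaCl addition works out to one ninth of the starting volume rather than one tenth.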
Xylanase activity measurements for enzyme optimization and kinetic analysis. The temperature optimum for XynA1 CD xylanase activity was determined by incubation in 0.25 ml reaction mixes containing 1.0% SG MeGAXn in 0.1 M potassium phosphate, pH 7.0 for 10 to 30 minutes over a 40 °C to 60 °C range. Reactions were halted by the addition of 0.25 ml Nelson's A:B reagent (25:1) (v:v) and the increase in reducing termini determined (Nelson, 1944). The resulting temperature optimum (45 °C) was subsequently used to determine the optimal activity for the enzyme over the pH range from 5.5 to 7.0 in reaction mixes containing 1.0% SG MeGAXn in 0.1 M potassium phosphate. The optimal conditions from these determinations (pH 6.5 at 45 °C) were used in experiments to examine the reaction kinetics of the enzyme with SG MeGAXn as substrate. Activity units are described as the amount of enzyme producing 1 µmole of reducing termini per minute at 45 °C. Production rates were linear through 30 minutes and data obtained are averages of 3 separate experiments performed in triplicate.

Chromatographic resolution and detection of aldouronates and xylooligosaccharides. Standards were obtained by acid and enzymatic hydrolysis of SG MeGAXn. Aldouronate oligomers, MeGAX1 through MeGAX5, were prepared by acid hydrolysis of MeGAXn in 0.1 N H2SO4 at 121 °C for 60 min. The acid hydrolysate was neutralized with BaCO3 and the aldouronates adsorbed onto Bio-Rad AG2-X8 anion exchange resin in the acetate form. Xylose and xylooligosaccharides were eluted with water, and the aldouronates were then eluted with 20% acetic acid. After concentration by flash evaporation, aldouronates were fractionated with 50 mM formic acid eluent on a 2.5 cm x 160 cm BioGel P-2 column (BioRad, Hercules, CA) equilibrated in the same buffer. Identities of MeGAX1 and MeGAX2 were confirmed by 13C and 1H-NMR spectrometry (K. Hasona, unpublished data).
Identities of MeGAX3, MeGAX4 and MeGAX5 are based upon the elution profile from the BioGel P-2 column and TLC analysis of aldouronates resulting from GH 10 and GH 11 catalyzed MeGAXn hydrolysis. Xylobiose and xylotriose were generated by hydrolysis with a GH 11 xylanase, XynII of Trichoderma longibrachiatum (Hampton Research, Aliso Viejo, CA), and fractionated using water-based BioGel P-2 column chromatography. These methods allowed the isolation of X2, X3 and the aldouronates MeGAX1 through MeGAX5. To follow the depolymerization of MeGAXn catalyzed by XynA1 CD, a 250 µl reaction containing 0.5 units of enzyme and 5 mg SG MeGAXn in potassium phosphate buffer, pH 6.5, was incubated at 30 °C. Samples (5 µl) were removed every 10 min up to 120 min, and spotted on 20 x 20 cm pre-coated 0.25 mm Silica Gel 60 TLC plates (EM Reagents, Darmstadt, Germany). An additional 0.5 units of XynA1 CD was added after the initial 120 minutes and incubation was continued for an additional 16 h. A 5 µl sample representing the reaction limit products was also spotted on the plate. Oligomers were separated by ascension with a solvent system containing chloroform: glacial acetic acid: water (6:7:1) (v:v:v) two times for 4 hours each with at least 1 hour of drying time between each solvent presentation. After the second development the plate was allowed to dry for at least 30 minutes and then sprayed with 6.5 mM N-(1-naphthyl)ethylenediamine dihydrochloride in methanol containing 3% sulfuric acid with subsequent heating to detect the carbohydrates (Bounias, 1980). To compare the ability of Paenibacillus sp. strain JDR-2 to utilize MeGAXn, aldouronates and xylooligosaccharides, MeGAXn was digested with XynA1 CD to generate primarily X2, X3, and MeGAX3. For digestions with XynA1 CD, 50 ml of substrate containing 30 mg/ml SG MeGAXn were prepared with 10 mM sodium phosphate buffer, pH 6.5.
Digestions were initiated by addition of 3.5 units XynA1 CD and incubated with rocking at 30 °C for 24 h. An additional 1 unit was added after 24 hours and incubation was continued for 40 h. Digests were processed by stir-cell filtration under nitrogen pressure through YM-3 ultrafiltration membranes (3 kDa MWCO) (Millipore, Billerica, MA). The filtrates containing oligomers with molecular weights less than 3000 Da were concentrated by flash evaporation and analyzed by the phenol-sulfuric acid assay (Dubois et al., 1956). These concentrated fractions then served as substrates comparing the growth rates and yield of Paenibacillus sp. strain JDR-2 on MeGAXn. Cultures were incubated at 30 °C in baffle flasks containing 50 ml medium supplemented with 10 mg/ml of anhydroxylose equivalents (determined by the total carbohydrate assay). Growth was followed by measuring OD 600 nm. Samples of 250 µl were removed at selected times, centrifuged at maximum speed in a microfuge and the supernatant and pellet were separated and saved. The supernatant was incubated at 70 °C for 10 minutes prior to storage. A volume of 6 µl was used for each sample on the TLC plate. TLC plates were developed as described above.

Immunolocalization of XynA1. Sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) was performed according to Laemmli (Laemmli, 1970) using a MINI-PROTEAN 3 electrophoresis cell, a 12% Ready Gel and Precision Plus Dual Color prestained molecular weight standards (Bio-Rad Laboratories, Hercules, CA) following described methods (Mini-PROTEAN 3 Cell Instruction Manual, Bio-Rad Laboratories, Hercules, CA). Immunodetection was performed as previously described (Schmidt et al., 2003). XynA1 CD was purified to homogeneity as judged by SDS-PAGE after staining with CB. Chickens were inoculated with XynA1 CD as antigen. An amount of 100 µg was delivered in a total volume of 1 ml PBS.
A volume of 700 µl was injected subcutaneously under the wing and 300 µl was injected in the footpad. No adjuvant was used. Eggs were collected from before injection and through the entire process. A boost injection was administered as above about two weeks post primary injection. The eggs were screened by ELISA in groups of three as crude unprocessed yolk in PBS. Peak fractions were pooled and chicken IgY polyclonal antibody obtained following the method of Polson et al. (Polson et al., 1985). Immunolocalization studies were performed with cell fractions following growth on SG MeGAXn. Bacillus subtilis 168 was cultured as a negative control and compared with Paenibacillus sp. strain JDR-2. Colonies of B. subtilis and Paenibacillus sp. strain JDR-2 were suspended in 2 ml of 1x Z-H and vortexed until cells were fully suspended. The complete 2 ml volume was used to inoculate 50 ml of media in 250 ml baffle flasks containing 0.2% YE, 0.36% SG MeGAXn in Z-H. Cultures were grown overnight (16 hr) at 30 °C with shaking at 150 rpm on a New Brunswick G-2 gyrotory shaker. Cells were harvested at an OD 600 nm of 1.0 (Paenibacillus sp. strain JDR-2) and 0.7 (B. subtilis). Cultures were centrifuged at 5000 x g for 15 minutes at room temperature and the supernatant was recovered. Cell pellets were resuspended in 50 mM sodium phosphate buffer, pH 6.5, and centrifuged as above. The procedure was repeated with 50 mM sodium phosphate pH 6.5 containing 0.5 M NaCl. The final cell pellet was resuspended in 5 ml of 50 mM sodium phosphate, pH 6.5. Some cell lysis of Paenibacillus sp. strain JDR-2 was apparent, observed as increased viscosity, probably due to osmotic shock. A volume of 50 µl Promega DNase RQ1 at 1 unit/µl was added with 1/10 the volume of 10x DNase RQ1 buffer (0.40 M Tris-HCl, 0.10 M MgCl2, 0.01 M CaCl2, pH 8.0) and the suspension was incubated for 30 minutes at room temperature. Cells were then lysed by two passes at 16,000 psi through the French pressure cell.
Lysates were centrifuged at 30,600 x g for 20 min at 4 °C. Supernatant was collected as the cell free extract; the pellet was resuspended in 1 ml of 50 mM sodium phosphate pH 6.5 and designated the cell wall suspension. All supernatants were concentrated using YM-10 Centriprep concentrators (Millipore, Billerica, MA) to volumes less than 4 ml. Samples of the media supernatant concentrate (MSC), NaCl wash (NaCl), cell free extract (CFE) and cell wall suspension (CWS) were analyzed by SDS-PAGE. Reactive antigens were detected on immunoblots using rabbit anti-chicken alkaline phosphatase conjugate (Sigma, St. Louis, Missouri) as previously described, and proteins were detected in gels with CB (Schmidt et al., 2003).

Results

Growth analysis of Paenibacillus sp. strain JDR-2. Based on OD 600 nm measurements, the initial growth analysis of Paenibacillus sp. strain JDR-2 indicated that the organism utilized MeGAXn more efficiently compared to glucose or xylose as substrates (Figure 4-1A). More detailed studies (Figure 4-1B-D) using HPLC to follow substrate concentration showed that MeGAXn is almost completely utilized. Additionally, Paenibacillus sp. strain JDR-2 preferentially utilized the MeGAXn in the presence of glucose or xylose. Under these conditions, the concentrations of glucose and xylose in the medium decreased more slowly, and at a nearly linear rate. Figure 4-1B also shows that xylose accumulates in the medium to a small extent during growth on MeGAXn, indicating that what is produced during the extracellular depolymerization may not be directly assimilated.

Identification and sequencing of xynA1 encoding a secreted modular GH 10 endoxylanase. Analysis of the Paenibacillus sp. strain JDR-2 chromosomal DNA library in E. coli for xylanases led to the isolation of four clones. Restriction analysis of these clones suggested that the inserts were from the same genomic DNA location.
Plasmid pFSJ4 was selected for sequencing, which revealed an insert (Figure 4-2) including a large modular xylanase gene (xynA1) of 4401 nt (1467 aa). Sequencing of the complete genomic DNA insert identified genes flanking xynA1. In the 5' direction on the same strand there is a mdep gene encoding a putative multi-drug efflux permease with 43% amino acid identity to the same in Bacillus halodurans (gene = BH3482) determined by blastp. In the 3' direction on the opposite strand there is a putative α-1,6-mannanase gene (amanA) that codes for a protein with 67% identity to Aman6 protein (aman6 gene) from Bacillus circulans. Domain analysis revealed that AmanA has the exact modular structure of Aman6 with a GH 76 catalytic module followed by triplicate family 6 CBM. In silico sequence analysis identified a probable promoter region and rho-independent terminator for xynA1, but only a terminator for the mdep gene and a promoter for the gene encoding AmanA. Much like many other glycosyl hydrolases, XynA1 is a modular protein composed of 8 separate modules (Figure 4-2). The domains include a triplicate N-terminal set of CBM 22 modules which have previously been shown to bind soluble xylan and β-1,3-1,4-glucan (Dias et al., 2004; Xie et al., 2001). These modules are followed in sequence by a GH 10 CD and a CBM 9, which has been shown to bind to the reducing end of carbohydrate chains (Boraston et al., 2001; Notenboom et al., 2001). Following CBM 9 is an undefined sequence with high similarity to the same region in Xyn5, a GH 10B xylanase of Paenibacillus sp. W-61. This region, as previously reported, has high identity to the lysine-rich region of the SdbA protein of C. thermocellum. Xyn5 and XynA1 have 36% and 35% amino acid identity, respectively, to this region of SdbA, and this region in XynA1 has 49% identity to the same in Xyn5.
Although these identities to the lysine-rich region of SdbA are relatively high, XynA1 and Xyn5 contain only about 5% and 6.5% lysine, respectively, compared to the same region of SdbA, which has 13% lysine (data not shown) (Ito et al., 2003; Leibovitz et al., 1997). The C-terminal region includes a triplicate set of SLH modules which are predicted to function in surface anchoring (Cava et al., 2004; Kosugi et al., 2002; Mesnage et al., 2000). The xynA1 coding sequence has been deposited in EMBL with the accession number AJ938162.

Phylogenetic analysis of XynA1. Initial phylogenetic analysis revealed that XynA1 and XynA1 CD amino acid sequences, when subjected to blastp, had high bit scores to the same set of four modular GH 10 xylanases (Figure 4-3). Comparison of the top 9 blastp hits to XynA1 of Paenibacillus sp. strain JDR-2 shows the comparative modular structures. Additionally, the bit scores are represented for the whole sequence blastp and the CD sequence blastp. XynA1 and the top four hits were classified as GH 10B and the lower set as GH 10A based on the number of amino acids separating the glutamate residue functioning as the catalytic proton donor from the glutamate functioning as the catalytic nucleophile. Although there are some exceptions, most catalytic domains of GH 10 xylanases have about 105 amino acids separating the two catalytic residues. In the case of sequence group GH 10B, the distance separating the catalytic residues is about 123 amino acids (Table 4-1). For further analysis we reasoned that the catalytic residue bridge sequence was probably the most highly conserved portion of GH 10 xylanases and compared this sequence from many xylanases. Figure 4-4 represents a phylogenetic comparison of the GH 10B subset to eleven randomly selected GH 10A xylanases. Table 4-1 characterizes the modular structures for the xylanases, indicating significant diversity among those represented in subset 10A.
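The GH 10A versus GH 10B distinction described above rests on the number of residues separating the two catalytic glutamates (about 105 versus about 123). A rough sketch of that count is given below. It is illustrative only: the function names are hypothetical, the exact-match motifs WDVVNE and ITELDI are taken from the Methods, and the cutoff of 114 residues is an assumed midpoint between the two reported values rather than a criterion stated in the text.

```python
def catalytic_residue_spacing(seq, donor_motif="WDVVNE", nucleophile_motif="ITELDI"):
    """Count residues lying between the proton-donor Glu and the nucleophile Glu.

    The donor Glu is taken as the final E of WDVVNE and the nucleophile Glu
    as the E of ITELDI. Returns None if either motif is not found exactly.
    """
    seq = seq.upper()
    d = seq.find(donor_motif)
    if d == -1:
        return None
    n = seq.find(nucleophile_motif, d + len(donor_motif))
    if n == -1:
        return None
    donor_glu = d + donor_motif.rindex("E")
    nucleophile_glu = n + nucleophile_motif.index("E")
    return nucleophile_glu - donor_glu - 1


def classify_gh10(seq, cutoff=114):
    """Label a sequence GH 10A or GH 10B by spacing (cutoff is an assumption)."""
    spacing = catalytic_residue_spacing(seq)
    if spacing is None:
        return None
    return "GH 10B" if spacing >= cutoff else "GH 10A"


if __name__ == "__main__":
    # Toy sequences only: poly-Ala spacers chosen to give spacings of 105 and 123.
    short = "WDVVNE" + "A" * 103 + "ITELDI"
    long_ = "WDVVNE" + "A" * 121 + "ITELDI"
    print(catalytic_residue_spacing(short), classify_gh10(short))
    print(catalytic_residue_spacing(long_), classify_gh10(long_))
```

A spacing count of this kind is only a screening step; the subset assignment in the chapter is supported by the alignment and tree, not by the spacing alone.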
In Figure 4-4A the Clustal alignment of the CD region used to prepare the phylogenetic tree revealed three areas in which the GH 10B subset differs from GH 10A. The additional sequences accounted for the extra length between the catalytic proton donor and nucleophilic glutamate residues. A phylogenetic tree developed with the neighbor-joining method for the alignment in Figure 4-4A shows sequences within a clade distinct from the others (Figure 4-4B). It should be noted that this comparison set is biased to the extent that it contains five very similar GH 10B sequences with eleven other random GH 10A sequences. However, the presentation of data identifies a relationship that indicates that GH 10B sequences have a common lineage. Large-scale analysis of 84 bacterial GH 10 xylanases obtained from CAZy identified GH 10B as a subset and allowed few other subsets to be created with a >95% bootstrap value. Many of these sequences did not place with confidence in any subset, potentially allowing for only a few well-defined subgroups of GH 10 xylanases.

XynA1 localization. Chicken polyclonal IgY generated against XynA1 CD (anti-CD) was used to examine the localization of XynA1 in Paenibacillus sp. strain JDR-2 cell fractions. CB stained SDS-PAGE bands of both Paenibacillus sp. strain JDR-2 and Bacillus subtilis 168 proteins were primarily greater than 100 kDa for all fractions (Figure 4-5A). However, anti-CD showed reactivity with Paenibacillus sp. strain JDR-2 CWS protein (approx. 150 kDa) which was not apparent with the CB stained gel (Figure 4-5B). The antibody reacted well with XynA1 CD with essentially no cross reactivity towards B. subtilis fractions. Size estimation of the reactive Paenibacillus sp. strain JDR-2 CWS protein at approximately 150 kDa compares favorably with the MW (154 kDa) obtained from the translated amino acid sequence of the XynA1 modular enzyme.
The band identified as XynA1 in the immunoblot is not visible in the CB-stained gel (Figure 4-5A), indicating that XynA1 represents a minor component of the surface protein complement. What is obvious from the CB-stained gel is a band of approximately 80 kDa, by far the most prominent protein in the gel. Observing that XynA1 is anchored to the surface supports the possibility that Paenibacillus sp. strain JDR-2 produces a crystalline surface layer. The prominent band at 80 kDa is roughly the same size as the Sap and 80K surface layer proteins from Bacillus anthracis and Bacillus sphaericus, respectively (Bowditch et al., 1989; Etienne-Toumelin et al., 1995).

Kinetic and product analysis of XynA1 CD. XynA1 CD was overexpressed in pET15b+ and affinity purified using an N-terminal His-Tag. Removal of the affinity tag by thrombin protease treatment resulted in an increased activity against SG MeGAXn of approximately 50%. Initial characterization showed XynA1 CD to have an optimal pH and temperature of 6.5 and 45 °C, respectively (data not shown). Kinetic analysis with SG MeGAXn as substrate (Figure 4-6) showed XynA1 CD to have Vmax and Km values of 8 units/mg and 1.96 mg/ml, respectively, and a kcat of 306.8 /min. Analysis of products by TLC (Figure 4-7) showed that XynA1 CD is a typical GH 10 xylanase hydrolyzing MeGAXn primarily to X2 and MeGAX3. Small amounts of X3 and MeGAX4 were also produced. True limit products of the reaction included xylose, which built up from the seemingly slow conversion of X3 and MeGAX4 to X2 and MeGAX3. Thirty minutes after reaction initiation, X2 and MeGAX3 are the predominant products (Figure 4-7). The small amounts of X3 and MeGAX4 disappeared by 24 hr.

Paenibacillus sp. strain JDR-2 utilization of aldouronates and xylooligosaccharides in comparison to MeGAXn. Xylooligosaccharides and aldouronates were generated by hydrolysis of SG MeGAXn with XynA1 CD.
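The kinetic constants reported for XynA1 CD in the preceding subsection (Vmax of 8 units/mg, Km of 1.96 mg/ml, kcat of 306.8 /min) can be related through the Michaelis-Menten equation, as in the sketch below. The function names are hypothetical. The enzyme mass is not stated in the text; the value printed here is back-calculated from kcat and Vmax under the assumption that kcat was obtained as Vmax multiplied by the molecular mass of the catalytic domain (1 unit = 1 µmol/min, and 1 mg/µmol = 1 kDa).

```python
VMAX = 8.0    # units/mg, i.e. umol reducing termini per min per mg enzyme (from the text)
KM = 1.96     # mg/ml SG MeGAXn (from the text)
KCAT = 306.8  # per min (from the text)


def michaelis_menten_rate(substrate_mg_per_ml, vmax=VMAX, km=KM):
    """Specific activity (units/mg) predicted at a given substrate concentration."""
    if substrate_mg_per_ml < 0:
        raise ValueError("substrate concentration cannot be negative")
    return vmax * substrate_mg_per_ml / (km + substrate_mg_per_ml)


def implied_mass_kda(kcat_per_min=KCAT, vmax=VMAX):
    """Enzyme mass (kDa) implied by kcat = Vmax * mass; an inference, not a reported value."""
    return kcat_per_min / vmax


if __name__ == "__main__":
    # At [S] = Km the predicted rate is half of Vmax.
    print(michaelis_menten_rate(KM))
    # Predicted rate at 1.0% (10 mg/ml) substrate, the assay concentration in the Methods.
    print(round(michaelis_menten_rate(10.0), 2))
    print(round(implied_mass_kda(), 2), "kDa implied")
```

At the 1.0% substrate concentration used in the activity assays, this model predicts a rate somewhat below Vmax, which is one reason a full substrate series is needed to estimate the constants.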
A 3 kDa molecular weight cutoff ultrafiltration filtrate product was used to evaluate growth of Paenibacillus sp. strain JDR-2 on xylanase-generated aldouronates and xylooligosaccharides. Growth was compared for SG MeGAXn and XynA1 CD filtrate. Data presented in Figure 4-8 show the aldouronates and xylooligosaccharides resolved by TLC during the growth of Paenibacillus sp. strain JDR-2. Through the time course for growth on SG MeGAXn, neither aldouronates nor xylooligosaccharides were detected in the media during exponential growth. In contrast to growth observed on the XynA1 CD-generated products, higher growth rates and yields were observed with MeGAXn as substrate, indicating preferred utilization of polymeric glucuronoxylan compared to aldouronates and xylooligosaccharides generated by the in vitro XynA1 CD-catalyzed depolymerization of MeGAXn.

Discussion

Based upon growth and substrate utilization analysis, Paenibacillus sp. strain JDR-2 has been shown to more efficiently utilize the biomass polymer MeGAXn compared to simple sugars such as glucose and xylose. In addition, growth on MeGAXn with competing simple sugars does not seem to affect its utilization of MeGAXn (Figure 4-1). This observation stands in contrast to a similar xylanolytic system from Paenibacillus sp. W-61 in which the investigators found that glucose strongly repressed xylanase activity (Viet et al., 1991). Although there appear to be metabolic differences, Paenibacillus sp. W-61 produces Xyn5, a GH 10B xylanase that is the top blastp hit of XynA1. With 51% identity the two full sequences are very similar, with Xyn5 differing in having only two CBM 22 modules rather than three. Kinetic properties of the two xylanases are similar but the generation of aldouronates by Paenibacillus sp. W-61 was not determined, precluding a comparison to XynA1 secreted by Paenibacillus sp. strain JDR-2 (Ito et al., 2003). Even though Paenibacillus sp.
strain JDR-2 utilizes MeGAXn very efficiently, it is probable that XynA1 is the only extracellular xylanase responsible for this ability. Genomic library screening led to the isolation of four xylanolytic clones with identical restriction profiles, each containing the same xynA1 coding sequence. Only one other xylanase gene has been identified from this organism during intensive cosmid library screening in E. coli, and this encodes a 40 kDa GH 10 catalytic domain designated XynA2. The primary amino acid sequence for XynA2 does not have a detectable secretion signal sequence and is expected to be localized to the cytosol. The xynA2 gene sequence is located within an operon including aguA, encoding a GH 67 α-glucuronidase, and encodes a GH 10 xylanase that may be involved in the intracellular processing of aldouronates and xylooligosaccharides generated by the action of XynA1 on the cell surface (G. Nong, V. Chow, J. D. Rice, F. St. John, J. F. Preston, Abstr. 105th ASM General Meeting, abstr. O-055, 2005). MeGAX3, the primary aldouronate limit product of GH 10 xylanases (Biely et al., 1997), is presumably efficiently assimilated as it has been identified as an inducer for genes involved in hydrolysis and catabolism of glucuronoxylan in Geobacillus stearothermophilus (Shulami et al., 1999). Phylogenetic characterization of XynA1 placed the sequence with a highly similar set of GH 10 xylanases referred to in this paper as the GH 10B subset (Figure 4-3). This classification is supported by differences observed in the CD coding sequence. Specifically, the area bridging the two catalytic glutamate residues contains three areas of additional sequence that are not observed in other xylanases. Although GH 10B xylanases are modular, there are many similarly modular xylanases that may not be classified as GH 10B. This suggests a unique mode of action for GH 10B xylanases. It is interesting to note that this subset includes GH 10 xylanases in anaerobic Clostridium spp.
and aerobic Paenibacillus spp., all of which are found in soil environments. The common modular architecture (Figures 4-3 and 4-4) that, in this case, includes anchoring motifs, suggests a positive role in niche development of these bacteria. The variability in the number of CBM and SLH modules suggests these may be mobile elements that may be combined from different genes during evolution. XynA1 CD analysis identified XynA1 as a typical GH 10 endoxylanase, producing primarily X2, X3 and MeGAX3 in the early stages of the reaction (Figure 4-7) (Biely et al., 1997; Preston et al., 2003). Hydrolysis seemed to proceed in two stages. The first stage resulted in formation of X2 and MeGAX3 with small amounts of xylose, X3 and MeGAX4. The second stage included the slow conversion of the minor products to X2 and MeGAX3 with increased formation of xylose. The XynA1 CD Km for SG MeGAXn was within a comparable range to other GH 10 xylanases but the rate of catalysis was significantly lower than that found in other reports (Figure 4-6). This may in part be due to the special attention given to xylanases showing high activity. Wild type purification of Xyn5 from Paenibacillus sp. W-61 yielded an enzyme showing a specific activity similar to that of XynA1 (Roy et al., 2000). Xylan binding subsite analysis revealed that XynA1 contains a CD that has four well-conserved subsites. Subsites -2 through +2 are highly conserved in GH 10 xylanases and XynA1 is no exception. In addition, by alignment analysis (data not included), XynA1 does not appear to have a +3 subsite and subsites -3 and +4 do not exist as defined in some other GH 10 xylanases (Charnock et al., 1998; Pell et al., 2004a). Analysis of these subsites as they may impact the catalysis of MeGAXn seems to support the results of product analysis by TLC. A xylanase with strong binding -2 through +2 subsites should yield X2 as a primary product of MeGAXn hydrolysis.
Accumulation of xylose and small odd-numbered xylooligomers (xylose, X3) would only result from processing through odd-numbered oligomers such as X5 and X7. Structural studies have also identified a glucuronic acid pocket in the +1 subsite that would facilitate hydrolysis of MeGAXn, leaving the MeGA substitution on the nonreducing terminus (Pell et al., 2004b). Since MeGAX3 is a primary limit product of GH 10 xylanases, it follows that large xylooligomers containing MeGA as a substituent on the nonreducing terminal residue can only be further processed by positioning into the -3 subsite, yielding MeGAX3 (Fujimoto et al., 2004).

XynA1 is the largest GH 10 xylanase so far identified from a Paenibacillus sp. The overall modular architecture is similar to that of xylanases from other Paenibacillus spp.; however, the triplicate N-terminal CBM 22 is unique to Paenibacillus sp. strain JDR-2. Although at least one other bacterial GH 10 xylanase has been identified with a triplicate set of N-terminal CBM 22 modules (XynB from Caldicellulosiruptor sp. Rt69B.1), it classifies as a GH 10A subset member according to this analysis (Figure 4-3) (Morris et al., 1999). Published reports that consider the role of carbohydrate binding modules with respect to the function of the catalytic domain suggest that these modules accentuate activity by increasing the localized substrate concentration (Boraston et al., 2004). It is difficult to imagine hydrolysis of MeGAXn by XynA1 being potentiated by the addition of a third CBM 22 module. Two publications concerned with a set of CBM 22 modules (not necessarily in tandem) from different xylanases have shown that while one seems to function to bind a potential carbohydrate substrate, the other does not (Charnock et al., 2000; Meissner et al., 2000). Additionally, in some cases these modules have been shown to bind better to β-1,3-1,4-glucan (barley β-glucan) than to xylan (Araki et al., 2004; Meissner et al., 2000).
Based on these inconsistent findings it is not possible to assign any precise function to these CBM 22 modules. Analysis of this system by expression of each module separately and the application of native affinity polyacrylamide gel electrophoresis (NAPAGE) would clarify the role of each CBM 22 as they may function to accentuate CD catalytic ability (Meissner et al., 2000).

It seems likely that there is a competitive advantage in colonizing a niche for an organism that utilizes surface-anchored enzymes to hydrolyze biomass polymers. The proximity of the resulting hydrolysis products to the cell would reduce the dependence of assimilation on diffusion. This strategy has been attributed to Clostridium spp. that produce the cell surface-localized cellulosome (Bayer et al., 2004; Doi and Kosugi, 2004; Doi et al., 2003). However, in anaerobic ecosystems in which Clostridium spp. are the primary utilizers of biomass, it has been suggested that the main product of crystalline cellulose hydrolysis, cellobiose, actually decreases cellulolytic activity and that excess cellobiose and other free sugars are utilized by other members of the ecosystem. This relationship is truly communal in that these other bacteria receive carbon substrates and may return the favor in the form of vitamins or other beneficial growth factors for the cellulolytic organisms (Bayer et al., 1994).

Inclusion of xylanolytic enzymes in the cellulosome does not establish that these organisms can utilize the products resulting from hydrolysis of MeGAXn. It is probable that hemicellulolytic activities are associated with the cellulosome to increase cellulase access to cellulose by removing associated non-cellulose polymers (Bayer et al., 2004). It has been reported that the mesophilic Clostridium cellulovorans can ferment xylan, but there is no clear analysis of the hydrolysis products or of the extent to which the xylan is utilized (Kosugi et al., 2001; Sleat et al., 1984).
Additionally, although there is increasing evidence of GH 10 xylanases in Clostridium spp., there is no evidence of accessory enzymes such as an α-glucuronidase, which is thought to be required for complete utilization of MeGAXn (Han et al., 2004).

All of these characteristics pertaining to the cellulosomal systems seem to stand in contrast to the MeGAXn hydrolytic system of Paenibacillus sp. strain JDR-2. This system does not utilize the hydrolytic products of the XynA1 CD efficiently and seems to require the activity of XynA1 anchored to the cell surface for efficient utilization of MeGAXn. This would suggest that the XynA1 anchoring/vectoral transport mechanism has evolved to yield almost complete recovery of hydrolytic products as an advantage against potential niche competitors. This may be further supported by the fact that Paenibacillus sp. strain JDR-2 requires no nutritional supplement for growth on MeGAXn, whereas growth of C. thermocellum and C. cellulovorans requires medium supplemented with yeast extract (Bayer et al., 1983; Quinn et al., 1963; Sleat et al., 1984).

Anchoring of Paenibacillus sp. strain JDR-2 XynA1 to the cell surface may be considered somewhat analogous to the surface anchoring of the cellulosome in the Clostridia (Bayer et al., 2004). An important distinction between these genera is that Clostridia are strictly anaerobic organisms, whereas Paenibacillus sp. strain JDR-2 requires oxygen for growth. This physiological difference spatially separates these two genera in environmental niche development. In addition, the cell surface association of cellulases and other enzymes in Clostridium spp. is mediated through the cellulosome. This complex of enzymes is maintained via interactions between dockerin modules on individual enzyme proteins and cohesin modules on a scaffoldin protein, which is then anchored to the cell surface (Doi et al., 2003).
Surface anchoring of biomass degradative enzymes may provide a strategy for efficient hydrolysis and transport of resulting products, yielding a distinct advantage over organisms with free-enzyme lignocellulose degradative systems. In the example of cellulose utilization by C. thermocellum, cellobiose and cellodextrins (degrees of polymerization < 4) are directly transported by a cellodextrin ABC transporter. Transport of these oligosaccharides conserves ATP through the action of intracellular phosphorylase, yielding a significant growth advantage (Zhang and Lynd, 2005). Paenibacillus sp. strain JDR-2 may utilize a similar mechanism for the efficient utilization of X2 and MeGAX3. Additionally, Paenibacillus sp. strain JDR-2 seems to couple the action of substrate hydrolysis to product uptake. This may be a secondary method for the conservation of energy. Once internalized, the MeGAX3 is apparently processed by α-glucuronidase (AguA)-mediated hydrolysis to MeGA and X3, with subsequent hydrolysis of xylotriose by intracellular GH 10 xylanases (XynA2) with specificity for small xylooligosaccharides and by β-xylosidase (Gallardo et al., 2003; Pell et al., 2004a; Preston et al., 2003). Although this may be the case, it is also possible that Paenibacillus sp. strain JDR-2 utilizes an intracellular phosphorylase as described for several cellulolytic organisms (Lou et al., 1996; Lou et al., 1997; Reichenbecher et al., 1997).

Considering the efficient utilization of MeGAXn by Paenibacillus sp. strain JDR-2, this organism may provide a platform for future biocatalyst development. Under conditions of low oxygen, Paenibacillus sp. strain JDR-2 produces succinate and acetate as fermentation products (unpublished data). Alternatively, the genes encoding the cell surface-anchored XynA1, as well as those involved in the assimilation and metabolism of XynA1-generated products, may be used to engineer other bacterial platforms to efficiently convert MeGAXn to desired fermentation products.
The aggressive utilization of MeGAXn by Paenibacillus sp. strain JDR-2 supports its further development and genetic exploitation for the conversion of lignocellulosic biomass to alternative fuels and bio-based products.