MUSCLEBLIND-LIKE 1 REGULATION OF ALTERNATIVE SPLICING: THE IMPACTS OF PROTEIN DOMAIN ARCHITECTURE AND TRANS-ACTING ELEMENTS

By

MELISSA ANN HALE

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA

2018
© 2018 Melissa Ann Hale
To my family, in loving memory of my grandfather Gerald Yoder
ACKNOWLEDGMENTS

Completing my PhD has been a journey that has covered vast distances, both figuratively and literally. Through this process I have been blessed to be supported by so many people. Most of all, I would like to thank my family. During the time of my scientific education, I have been supported, loved, and cared for in ways that I expected and in ways that I did not. Firstly, I would like to thank my mother, Linda S. Hale, for her care and support these last 27 years. Thank you for encouraging me and reminding me that failure is a part of learning. Thank you for reminding me to never give up and to keep moving forward, even when the steps seemed so small. In turn, thank you to my father, Matthew T. Hale, for being one of the most inspiring and impactful men in my life. You are the one who showed me that the pursuit one of the most important things that you can do as a citizen of this global world. My father also impressed upon me to live my life in kindness. Thank you for reminding me in times of anger and frustration to continue to be kind. Finally, thank you to my sibling, Laura C. Hale, for being one of the strongest and most inspiring people that I know. Thank you for showing me that being true to yourself and pursuing your own dreams, no matter what those around you may think, is ultimately one of the most intense, yet fruitful challenges you will face in life.

I would like to thank my husband, Jacob P. Morris, for being the best part of my graduate experience. Thank you for loving me throughout this entire process, even on the days when I crumpled under the stress. Thank you for holding me tight in times of anxiety. Most of all, thank you for supporting me beyond what I thought possible of any person in my life. You moved for me once to Oregon when I began graduate school in support, and even before we were married, put me and my scientific education above
yourself in a manner of self-sacrifice that I was unsure I would ever experience. When you agreed to move to Florida, I was reminded of how much I was loved and supported. I will spend a lifetime thanking you and trying to repay you for your incredible actions.

I would also like to thank my mentor, Dr. Andy Berglund, for the opportunity to learn and grow in my scientific aspirations under his mentorship. Thank you for your kindness and support during these last five years. Most of all, thank you for believing in me enough to bring me from Oregon to Florida in the middle of my graduate education such that I could continue to grow and learn. Thank you for reminding me to believe in myself, even when I was unsure that I would make it through. I would also like to thank my committee here in Florida, including Dr. Eric Wang, Dr. Maurice Swanson, Dr. Kevin Brown, and Dr. Linda Bloom. Thank you all for being a vast resource of scientific knowledge and thinking. Thank you for the encouragement and support, even as I transitioned between graduate programs.

I would like to give a very special thank you to Dr. Brian Cain for the opportunity to grow as an educator while at the University of Florida. Thank you for the teaching opportunities you have provided and also for some of the most illuminating conversations of my graduate career. Thank you for reminding me that I was doing the right things even when I felt as if I was failing. Although you may have been unaware, your scientific and moral support have been of the utmost importance to me. The opportunity to work with you provided me a much needed respite from the lab whenever I was feeling down.

Finally, thank you to all the members of the Berglund labs, both past and present, for their teaching and scientific feedback. Much of this work would not have been
possible without the experimental education and support they provided. I would also like to thank the many undergraduates I have had the pleasure of teaching. Thank you especially to Ryan C. Day, who contributed substantially to the work presented here. Thank you for your enthusiasm, dedication, and overall sense of fun. Additionally, thank you to the Promise to Kate Foundation for contributing to my funding from 2017 to 2018. Connections with the network of patients associated with this group were incredibly motivating and inspiring. Finally, I would like to thank Leslie Coonrod, Stacey Wagner, Sunny Ketchum, Amy Mahady, Marco Esters, Jodi Bubenik, Marina Scotti, Tammy Reid, John Cleary, and Curtis Nutter. Thank you for being my confidants, allies, and most of all, friends during the course of this journey. Without all of you the work here would not have been possible.
TABLE OF CONTENTS

(page numbers follow each entry)

ACKNOWLEDGMENTS  4
LIST OF TABLES  11
LIST OF FIGURES  12
LIST OF ABBREVIATIONS  17
ABSTRACT  20

CHAPTER

1 INTRODUCTION  22
  RNA Splicing  22
    RNA Splicing is Performed by the Spliceosome  22
    Alternative Splicing Generates Transcriptome and Proteome Diversity  23
    RNA Binding Proteins Regulate Alternative Splicing  25
    RNA Binding Proteins Work in Complex Networks to Regulate Alternative Splicing through a Splicing Code  26
  MBNL Proteins: Highly Conserved Master Regulators of RNA Processing  28
    MBNL Proteins Are Functionally Conserved in Metazoans  28
    MBNL Proteins Regulate Alternative Splicing During Tissue-Specific Development  29
    MBNL Positional-Dependent Splicing Regulation  30
    MBNL Functional Domains Mediate Interaction with Target RNAs  32
  RNA Splicing and Disease  33
    RNA Processing by MBNL Proteins is Mis-Regulated in Myotonic Dystrophy  34
    Mis-regulation of MBNL in Other Diseases  37
  Implications of Studying Muscleblind Alternative Splicing Regulation  38

2 AN ENGINEERED RNA BINDING PROTEIN WITH IMPROVED SPLICING REGULATION  54
  Background  54
  Results  58
    Synthetic MBNL1 Proteins with Modified Zinc Finger Domain Organization Possess Different Splicing Activities  58
    Controlled Dosing of Synthetic MBNL Proteins in Two Different Systems Reveals Significantly Different Activities for Splicing Regulation  61
    Synthetic MBNL1 Proteins Possess Distinct RNA Binding Specificities  65
    Synthetic MBNL1 Proteins Rescue CUG-Dependent Mis-Splicing in a DM1 Cell Model  70
  Discussion  71
    Synthetic MBNL1 Proteins with Altered RNA Binding Specificity have Differential Splicing Activity  71
    ZF1-2 And ZF3-4 Possess Distinct RNA Binding Activities that Modulate MBNL1 Activity  74
    Modular Architecture of MBNL1 ZF Domains Provides a Unique Platform For RNA Recognition  76
    Engineered MBNL1s as Protein Therapeutics in Neuromuscular Disorders  78
  Materials and Methods  79
    Protein Design, Synthesis, and Cloning  79
    Creation of Stable, Inducible Synthetic MBNL Expression Cell Lines  80
    Cell Culture and Transfection  81
    Western Blot Analysis  82
    Cell-based Splicing Assay  83
    Protein Expression and Purification  84
    RNA Radiolabeling and Electrophoretic Mobility Shift Assays (EMSAs)  85
    In Vitro Transcription of RNA Bind-N-Seq (RBNS) Random Input RNA  86
    RBNS and Computational Analysis  86
    Immunofluorescence and Microscopy  88
    Real-time PCR  89

3 RBFOX1 MODIFIES THE MBNL1 DOSE RESPONSE OF ALTERNATIVELY SPLICED PRE-MRNAS  124
  Background  124
    Alternative Splicing is Impacted by Splicing Factor Concentration in Development and Disease  124
    MBNL Proteins Regulate Splicing in a Dose-Dependent Manner  126
    MBNL Networks of Alternative Splicing Regulation Overlap with Those of Other Splicing Factors  127
    RBFOX RNA Binding Proteins: Regulators of Alternative Splicing in Development and Disease  127
    Potential for MBNL and RBFOX Co-regulation of Alternative Splicing  129
  Results  130
    Utilization of a Tunable MBNL1 Dosing System Indicates the Potential for RBFOX Co-Regulation of Select Alternative Splicing Events  130
    RBFOX1 Expression Dampens MBNL1 Dose-Dependent Regulation of the INSR Minigene  132
    MBNL1 Regulation of Alternative Splicing is Conserved in Two Distinct Dosing Cell Lines  135
    RNAseq in HEK-293 Cells at Variable MBNL1 Expression Levels in the Presence or Absence of RBFOX1 Reveals Common Modes of Co-Regulation  137
    Inducible Expression of RBFOX1 in Mbnl1/2 knockout mouse embryonic fibroblasts  139
    Targeting of Orthologous Cassette Exons in HEK-293 and MEF Cells Reveals Similarities and Differences in RBFOX Co-Regulation  140
    Full Dose-Response Curves Magnify the Impacts of RBFOX1 Expression on MBNL1 Dose-Dependent Regulation of Alternative Splicing  144
  Discussion and Future Directions  147
    Dose-Dependent Regulation of Alternative Splicing by MBNL1 is Conserved Across Distinct Species and Cellular Environments  147
    Utilization of Dosing Systems Provides a Unique Method to Characterize the Impacts of Differential Trans-Acting Splicing Factor Expression on the MBNL1 Dose Response  148
    RBFOX and MBNL Proteins Co-Regulate a Group of Events Via Common Mechanisms  149
    Compensatory Action of Other Members of Splicing Factor Families Complicates Characterization of Splicing Co-Regulation  152
    RBFOX Expression Buffers MBNL Splicing Activity: Impacts on Disease Phenotypes and Biomarker Selection for Therapeutic Development  153
    Continuing to Explore Interesting and Unique Events Co-Regulated by MBNL and RBFOX  155
  Materials and Methods  156
    Plasmids and Cloning  156
    Creation of Stable, Inducible MBNL1 / RBFOX1 Expression Cell Lines  156
    Cell culture, Transfection, and Doxycycline / Ponasterone A Treatment  158
    Cell-Based Splicing Assay  160
    Immunoblot Analysis  161
    Immunofluorescence and Microscopy  162
    High-Throughput RNA Sequencing  162

4 FUTURE DIRECTIONS  198
  Developing Synthetic MBNL Proteins for Therapeutic Applications  198
    Synthetic MBNL Proteins as Therapeutic Biologics  199
    Engineering a Minimal MBNL Protein for Direct Delivery in Disease via Modification of the Linker Region  201
    Creating a Minimal MBNL Protein for Therapeutic Delivery via Use of the TAT Cell-Penetrating Peptide  206
  Development of New Synthetic MBNL Proteins Via Domain Swapping to Probe Mechanisms of RNA Processing  209
    MBNL-RS Fusions to Analyze the Importance of Individual Domains in Modulating Positional-Dependent Splicing  209
    MBNL-dsRBD Fusions to Modulate MBNL RNA Secondary Structure Recognition  211
  Characterization of RBFOX Dose-Dependent Splicing Regulation  212
  Materials and Methods  215
    Synthetic MBNL Protein Design and Cloning  215
    Cell Culture, Transfection, and Protein Expression Induction  215
    Cell-based Splicing Assay  216
    Immunoblot Analysis  217

5 SUMMARY AND CONCLUDING REMARKS  229

LIST OF REFERENCES  232
BIOGRAPHICAL SKETCH  251
LIST OF TABLES

Table (page number follows each entry)

2-1  Sequences of primers used for RT-PCR of endogenous splicing events in MEFs  121
2-2  Index primers used to identify each RBNS library within the multiplexed sequencing reads  122
3-1  Orthologous exons from MEF RNAseq that respond to MBNL1 (M), RBFOX1 (F), or both (B) proteins with > 0.1, Bayes factor > 5  193
3-2  Sequences of primers used for RT-PCR of endogenous splicing events in HEK-293 and MEF cells  195
LIST OF FIGURES

Figure (page number follows each entry)

1-1  RNA sequences recognized by the spliceosome  41
1-2  Mechanism of spliceosome assembly on pre-mRNA substrate  42
1-3  Mechanism of splicing catalysis  43
1-4  Types of alternative splicing  44
1-5  Cis-acting splicing regulatory elements (SREs)  45
1-6  Positional-dependent splicing regulation by trans-acting RNA binding proteins  46
1-7  RNA binding proteins (RBPs) work in complex networks to regulate alternative splicing of large collections of RNA targets  47
1-8  MBNL regulates alternative splicing in a position-dependent manner via recognition of YGCY motifs.  48
1-9  Mechanisms of MBNL-mediated alternative splicing. Diagrams depicting known mechanisms of MBNL-mediated exon exclusion  49
1-10  MBNL domain organization.  50
1-11  Location of expanded repeats in myotonic dystrophy type 1 (DM1) and myotonic dystrophy type 2 (DM2).  51
1-12  Molecular mechanism of DM1  52
1-13  Sequestration of MBNL by toxic RNA in DM1 tissue leads to mis-splicing of CLCN1  53
2-1  Synthetic MBNL protein amino acid sequences and MBNL1-3 zinc finger domain alignments.  91
2-2  Zinc finger domain architecture and protein expression levels of synthetic MBNL proteins.  93
2-3  Subcellular localization and mRNA expression levels are not impacted by ZF domain rearrangement in transfected HeLa cells.  94
2-4  Synthetic MBNL proteins are expressed at different levels in transfected HEK-293 cells.  95
2-5  Synthetic MBNL proteins regulate splicing of minigenes in HeLa cells with different activities.  96
2-6  Synthetic MBNL proteins regulate splicing of minigenes in HEK-293 cells with different activities.  97
2-7  Average splicing activity of synthetic MBNL proteins across six minigene events.  98
2-8  Representative immunoblots used to create dose-response curves in plasmid dosing system.  99
2-9  Representative splicing gels used to calculate changes in exon inclusion across the gradient of protein expression produced within the plasmid dosing system and quantitative parameters derived to describe dose-response behavior.  100
2-10  MBNL-AA and MBNL-BB proteins regulate splicing at different relative protein levels compared to MBNL-AB in a plasmid dosing system.  101
2-11  N-terminal GFP-tagged MBNL-AB and MBNL-AA localize with the nucleus in response to doxycycline treatment in inducible MEFs.  103
2-12  MBNL-AA regulates splicing at similar relative protein levels in an inducible tet-on system in Mbnl1/2 KO MEFs.  104
2-13  Quantification of relative MBNL-AB and MBNL-AA protein levels across a gradient of doxycycline treatment in inducible MEF system.  106
2-14  Representative splicing gels used to calculate changes in exon inclusion across the gradient of protein expression produced within the inducible MEF system.  107
2-15  MBNL-AB and MBNL-AA produce similar dose-response curves for an additional 12 endogenous splicing events assayed in the inducible MEF system  108
2-16  Quantitative parameters used to describe dose-response behavior of 15 endogenous splicing events assayed in inducible MEF system.  110
2-17  Reorganization of zinc finger domains does not significantly impact RNA binding of synthetic MBNL proteins.  112
2-18  Comparison of R values and kmers derived from this and other RBNS studies with MBNL1  113
2-19  RBNS analysis of engineered MBNL proteins indicates that the ZF domains have differential RNA binding specificity.  115
2-20  Representative splicing gels used to create dose-response curves with plasmid dosing system in the presence of toxic RNA.  116
2-21  Expression of CUG repeat toxic RNA alters the MBNL dose-response curve.  117
2-22  Dose curves of synthetic MBNLs are altered in the presence of toxic RNA.  118
2-23  Model summarizing differences between synthetic MBNL proteins. MBNL-AA is a more active alternative splicing regulator while MBNL-BB is significantly weaker compared to MBNL-AB.  120
3-1  RBFOX and MBNL proteins regulate alternative splicing in the same positional-dependent manner.  164
3-2  Titration of doxycycline produces a gradient of HA-MBNL1 protein expression from an integrated transgene in HEK-293 cells.  165
3-3  MBNL1 and RBFOX1 regulate splicing of the INSR and Nfix minigenes.  166
3-4  Evaluating splicing across a gradient of MBNL1 protein expression allows for characterization of dose-dependent splicing behavior due to variances in cis-element parameters and trans-acting factor environment.  167
3-5  RBFOX1 expression dampens the MBNL1 dose response via an overlapping RNA binding motif in the INSR minigene.  168
3-6  Titration of doxycycline generates a gradient of GFP-MBNL1 expression in Mbnl1:Mbnl2 KO MEFs that regulates splicing of several MBNL1-dependent splicing events.  170
3-7  Many orthologous cassette exons are regulated in a similar MBNL1-dependent manner in both HEK-293 and MEF dosing cell lines as assayed by RNAseq.  172
3-8  Validation of HA-MBNL1 / HA-RBFOX1 expression and splicing regulation in HEK-293 cells.  173
3-9  Expression of RBFOX1 in HEK-293 MBNL1 dosing line significantly alters MBNL1 dose-dependent splicing behavior.  174
3-10  Schematic showing potential working models of MBNL1 and RBFOX1 co-regulation based on the event classes, or modes, described in Figure 3-9.  176
3-11  GFP-MBNL1 and mOrange-RBFOX1 expression can be selectively induced via tetracycline- or ecdysone-inducible systems in Mbnl1:Mbnl2 KO MEFs.  178
3-12  Distribution of MBNL and RBFOX RNA binding motifs in orthologous cassette exon events.  180
3-13  Limited dosing of MBNL1 in the presence or absence of RBFOX1 expression shows differences and similarities of splicing co-regulation for positively regulated events in HEK-293 and MEF dosing cell lines.  182
3-14  Limited dosing of MBNL1 in the presence or absence of RBFOX1 expression shows differences and similarities of splicing co-regulation for negatively regulated events in HEK-293 and MEF dosing cell lines.  184
3-15  RBFOX1 expression alters the MBNL1 dose response in MEFs.  186
3-16  RBFOX1 does not alter the MBNL1 dose response in HEK-293 cells.  188
3-17  RBFOX2 knockdown via siRNA alters the MBNL1 dose response in HEK-293 cells.  189
3-18  Orthologous CLASP1 / Clasp1 exon 20 does not respond in the same manner to MBNL1 dose in HEK-293 and MEF dosing lines.  190
3-19  Increased gene expression of RBFOX2 but not RBFOX1 correlates with inferred levels of free functional MBNL1 levels in RNAseq samples from DM1 patient tibialis anterior muscle.  191
3-20  Distribution of MBNL and RBFOX RNA binding motifs in orthologous PLOD2 / Plod2 e14  192
4-1  Synthetic MBNL proteins with varying linker lengths.  218
4-2  Synthetic MBNL proteins with altered linker lengths regulate splicing of minigenes in HeLa cells with different activities.  219
4-3  Synthetic MBNL proteins as therapeutic biologics for DM. A synthetic MBNL protein could be used as a therapeutic biologic to rescue DM-associated spliceopathy.  220
4-4  Synthetic MBNL proteins with single RNA binding domains and HIV TAT cell-penetrating peptides.  221
4-5  Synthetic MBNL proteins with only a single RBD regulate splicing of minigenes in HeLa cells with different activities.  222
4-6  Domain organization of natural and synthetic Drosophila Mbl proteins  223
4-7  A shortened, synthetic version of the Drosophila Mbl protein does not regulate splicing of four minigene reporters in HeLa cells.  224
4-8  Differences in positional-dependent alternative splicing regulation between MBNL and SR proteins and potential predictions for mechanism of action for MBNL-RS chimeric proteins.  225
4-9  Model representing predicted preferences of MBNL-dsRBD fusions for toxic CUG RNA over less structured RNA substrates.  226
4-10  Titration of ponasterone A generates a gradient of mOrange-RBFOX1 expression in Mbnl1:Mbnl2 KO MEFs that regulates splicing of three target events  227
4-11  MBNL1 expression alters the RBFOX1 dose response in inducible MEFs.  228
LIST OF ABBREVIATIONS

ALS      Amyotrophic lateral sclerosis
AS       Alternative splicing
ATP2A1   Sarcoplasmic/endoplasmic reticulum Ca2+ ATPase 1
BPS      Branchpoint sequence
CLIPseq  Cross-linking immunoprecipitation and sequencing
CNS      Central nervous system
CPP      Cell-penetrating peptides
DCM      Dilated cardiomyopathy
DM       Myotonic dystrophy
DMPK     Dystrophia myotonica protein kinase
dsRBD    Double-stranded RNA binding domains
EMSA     Electrophoretic mobility shift assay
ES       Embryonic stem
ESE      Exonic splicing enhancer
ESS      Exonic splicing silencer
FECD     Fuchs endothelial corneal dystrophy
GFP      Green fluorescent protein
HEK      Human embryonic kidney
hnRNP    Heterogeneous nuclear ribonucleoprotein
INSR     Insulin receptor
iPSC     Induced pluripotent stem cells
ISE      Intronic splicing enhancer
ISS      Intronic splicing silencer
KO       Knockout
LASR     Large assembly of splicing regulators
mbl      muscleblind
Mbl      Muscleblind
MBNL     Muscleblind-like
MEF      Mouse embryonic fibroblast
mRNA     Messenger RNA
Nfix     Nuclear factor I/X
NLS      Nuclear localization signal
NMD      Nonsense-mediated decay
PPT      Polypyrimidine tract
PSI      Percent spliced in
QKI      Quaking
RAN      Repeat-associated non-ATG translation
RBD      RNA binding domain
RBM20    RNA binding motif protein 20
RBNS     RNA Bind-n-Seq
RBP      RNA binding protein
RNAseq   RNA sequencing
RRM      RNA recognition motif
RT-PCR   Reverse transcription polymerase chain reaction
RxR      Retinoid X receptor
SCA8     Spinocerebellar ataxia type 8
SCN5A    Sodium voltage-gated channel alpha subunit 5
SELEX    Systematic evolution of ligands by exponential enrichment
SF1      Branchpoint binding protein
SMA      Spinal muscular atrophy
snRNP    Small nuclear ribonucleoprotein
SR       Serine/arginine-rich proteins
SRE      Splicing regulatory element
SRSF1    Serine/arginine-rich splicing factor 1
TALEN    Transcription activator-like effector nucleases
TNNT2    Cardiac troponin T type 2
TRE      Tet response element
UTR      Untranslated region
Vldlr    Very low density lipoprotein receptor
WT       Wild type
ZF       Zinc finger
20 Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy MUSCLEBLIND LIKE 1 REGULATION OF ALTERNATIVE SPLICING: THE IMPACTS OF PROTEIN DOMAIN ARCH ITECTURE AND TRANS ACTING ELEMENTS By Melissa Ann Hale May 2018 Chair: J. Andrew Berglund Major: Medical Sciences Biochemistry and Molecular Biology The muscleblind like (MBNL) family of proteins are key developmental regulators of alternative splicing. Sequestration of MBNL proteins by expanded CUG/CCUG repeat RNA transcripts is a major pathogenic mechanism in the neuromuscular disorder myotonic dystr ophy (DM). Despite two decades of intense study, questions still remain regarding how this master regulator of RNA metabolism regula tes alternative splicing (AS), specifically how the individual protein domains of MBNL contribute to and function in AS regu lation and how trans acting factors, such as other RNA binding proteins (RBPs), can modify MBNL AS regulation. MBNL1 contains four zinc finger (ZF) motifs that form two tandem RNA binding domains (ZF1 2 and ZF3 4) which each bind YGCY RNA motifs. In an eff ort to determine the differences in fu nction between these domains, synthetic MBNL proteins with duplicate ZF1 2 or ZF3 4 domains, referred to as MBNL AA and MBNL BB were designed and characterized Analysis of splicing regulation revealed that MBNL AA ha d up to 5 fold increased splicing activity while MBNL BB had 4 fold decreased acti vity compared to a MBNL protein with the canonical arrangement of zinc finger domains RNA binding analysis revealed that the variations in splicing activity are due to diffe rences in RNA binding specificities between the two
ZF domains rather than binding affinity. Our findings indicate that ZF1-2 drives splicing regulation via recognition of YGCY RNA motifs while ZF3-4 acts as a general RNA binding domain. These studies suggest that synthetic MBNL proteins with improved or altered splicing activity have the potential to be used as both tools for investigating splicing regulation and protein therapeutics for DM and other microsatellite diseases.

In addition to exploring how MBNL domains contribute to RNA binding and AS regulation, much of this work has focused on how other trans-acting splicing regulators, specifically RBFOX, can modulate MBNL AS regulation. These studies indicate that MBNL and RBFOX have many modes of co-regulation. Understanding how RBPs work in cooperative and antagonistic ways informs not only how these proteins co-regulate AS during development but also which AS events will serve as informative rescue biomarkers for DM therapeutics in tissues with both low and high RBFOX expression.
CHAPTER 1
INTRODUCTION

RNA Splicing

One of the most striking features of the eukaryotic genome is the split arrangement of protein coding sequences (exons), which are separated by intervening non-coding sequences (introns). The reconstitution of these coding regions into a continuous, mature RNA molecule following transcription is a highly complex and dynamic process that is critical for effective gene expression. While the biochemical process of splicing is chemically quite simple (a two-step transesterification reaction that results in an intron lariat and the ligation of adjacent exons), its execution inside the cell is far more complicated.

RNA Splicing is Performed by the Spliceosome

The co-transcriptional process of RNA splicing is predominantly performed by two exceptionally dynamic macromolecular complexes, the major (U2-dependent) and minor (U12-dependent) spliceosomes. The major spliceosome, which processes ~95.5% of all introns, is composed of five main small nuclear ribonucleoprotein (snRNP) complexes: U1, U2, U4, U5, and U6 (1-5). A mature major spliceosome contains these core splicing machinery components along with more than 150 accessory proteins, making it one of the most complex macromolecular machines within the cell (6, 7). Each snRNP complex contains a distinct non-coding, non-polyadenylated small nuclear RNA (snRNA) that is critical for recognition of specific sequences within a target pre-mRNA via base pairing. These interactions are vital for initiating and directing spliceosomal assembly at exon/intron junctions (5, 8).
The pre-mRNA sequences recognized by the spliceosome are summarized in Figure 1-1. Spliceosomal assembly and catalysis begin with recognition of the 5' splice site by the U1 snRNP; this site is defined by a 6 nucleotide consensus (9, 10). Branchpoint binding protein (SF1) binds to the BPS while the large (65 kDa) and small (35 kDa) subunits of U2AF bind the polypyrimidine tract and the 3' splice site, respectively (4, 5). These interactions lead to the formation of the early splicing complex (E complex) (4, 8, 11). Following this step, U2AF65 recruits the U2 snRNP to the BPS in an ATP-dependent manner, displacing SF1 to form the A complex (4, 5, 12). Next, the B complex is formed via recruitment of the U4/U5/U6 tri-snRNP (4, 5). In a complex set of spliceosomal rearrangements, U4 snRNP is lost to form the catalytic C complex (Figure 1-2) (4, 5, 13). Within the catalytic C complex, two sequential transesterification reactions are performed. In the first, the 2' hydroxyl of the branch point adenosine attacks the 5' splice site, generating an intron lariat with a 2'-5' phosphodiester linkage. This intron lariat is removed by a second reaction in which the free 3' hydroxyl of the upstream exon attacks the 3' splice site, resulting in release of the lariat and formation of a splice junction between the exons through a 3'-5' phosphodiester bond (Figure 1-3) (5).

Alternative Splicing Generates Transcriptome and Proteome Diversity

While the removal of introns in a multi-exon pre-mRNA is a critical process for effective RNA processing, alternative splicing (AS), the differential inclusion of coding
or non-coding regions within a single pre-mRNA, adds yet another layer of complexity to splicing regulation. Via AS, a single gene can produce multiple mRNA isoforms that can dramatically impact the complexity of the cellular transcriptome and proteome. In fact, although the human genome contains only ~20,000 protein coding genes, an estimated 90-95% of multi-exon genes produce alternatively spliced transcripts (14-16). Continuous advances in the sensitivity of high-throughput sequencing technologies have indicated that, on average, 6.3 alternatively spliced transcripts are produced from a single protein coding, multi-exon gene locus, and 3.9 of these are predicted to encode different protein isoforms (17). Many splicing choices can be made to generate a vast array of mRNA isoforms within the cell. Cassette exon usage, an AS event in which a target exon is either included or excluded in the final splice isoform, is the most common type of AS in complex organisms (15, 18). However, there are many other types of AS events, including mutually exclusive exons and alternative 5' or 3' splice site selection (Figure 1-4). The complex combinatorial use of many of these AS mechanisms provides the capacity for the generation of a diverse array of RNA products.

Proteins from mRNA isoforms produced via AS can vary by only a few amino acids that may impact enzymatic activity or ligand affinity. However, AS may also lead to entire domain deletions or insertions (19-21). AS can also impact the fate of mRNAs by regulating the inclusion of regions that impact RNA localization, translation, and turnover (20, 22). Overall, AS provides a platform for a complex means of post-transcriptional gene regulation that can be
precisely manipulated to alter gene expression in a developmental, spatio-temporal, or tissue-dependent manner (23, 24).

RNA Binding Proteins Regulate Alternative Splicing

AS is precisely controlled by two major, interacting factors: (1) cis-regulatory sequences within the pre-mRNA that influence splice site usage by the spliceosome and (2) their recognition by trans-acting factors, most notably RNA binding proteins (RBPs) that act as alternative splicing factors. Within a pre-mRNA molecule there are four main types of cis-acting RNA elements, called splicing regulatory elements (SREs), that are separate from the splicing signals recognized by the spliceosome (25, 26). These SREs are defined by their location within the pre-mRNA (intron or exon) and their impact on splice site selection (enhancer or inhibitor/silencer): exonic splicing enhancers (ESEs), exonic splicing silencers (ESSs), intronic splicing enhancers (ISEs), and intronic splicing silencers (ISSs) (Figure 1-5) (25-27). These SREs are bound by a vast collection of trans-acting RBPs that recognize these sequences through a variety of defined and yet to be explored mechanisms and signal the spliceosome to utilize or skip over an affected splice site, leading to the modes of AS discussed above (Figure 1-4) (28).

Although there are hundreds of RBPs that can bind and modify splicing of a pre-mRNA, some of the largest and best studied classes of RBPs include serine/arginine-rich (SR) proteins and heterogeneous nuclear ribonucleoproteins (hnRNPs) (26, 29). These families of auxiliary splicing factors and their RNA interactions result in the recruitment of key spliceosome components and subsequent snRNP remodeling required for spliceosomal activation and the completion of an RNA splicing reaction (26, 29, 30). Beyond SR and hnRNP proteins, there are many other splicing regulatory factors,
including the RBFOX family, the muscleblind-like (MBNL) family, and Quaking (QKI), that bind to their target cis-regulatory motifs to modify AS decisions (26).

In general, alternative splicing factors regulate AS in a positional-dependent manner; that is, these RBPs modulate splice site selection and exon inclusion based on the location of binding within a target pre-mRNA substrate. RBPs exhibit a variety of positional-dependent modes of splicing regulation. hnRNPs and SR proteins bind to either exonic or intronic regions to modify exon inclusion or exclusion, albeit with opposite impacts on recognition of adjacent splice sites by the core splicing machinery when binding to similar regions in the pre-mRNA (Figure 1-6a-b) (26). Other groups of RBPs bind primarily to intronic regions flanking target exons. Many RBPs, specifically QKI, RBFOX, and MBNL, enhance exon inclusion when binding to the downstream intron and repress exon inclusion via association with the upstream intron (Figure 1-6c). In contrast to these RBPs, CELF proteins promote exon inclusion by binding to the upstream intron while exclusion is driven by binding to the downstream intron (Figure 1-6d) (31, 32). Consistent with these opposite mechanisms of positional-dependent AS regulation, it has been shown that MBNL and CELF proteins antagonistically regulate AS of many target genes, emphasizing the complexity of combinatorial regulation of AS by RBPs (33, 34).

RNA Binding Proteins Work in Complex Networks to Regulate Alternative Splicing through a Splicing Code

During cell development and differentiation, RBP expression levels, localization, and mRNA/protein stability are finely regulated. Groups of these RBPs work to regulate complex networks of AS events to facilitate control of complicated cellular processes, most notably cell differentiation and tissue development (Figure 1-7) (23).
Many studies of different developmental systems have indicated that RBP expression is critical for maintenance of these coordinated AS networks and the subsequent developmental decisions that ensue (23). Global analysis of AS during development has shown that splicing transitions occur for groups of genes at the same time and that specific RBPs work together to contribute to this splicing coordination (35, 36). These splicing networks are cell type and/or region specific within tissues, and this, in part, correlates with RBP expression (26). Overall, RBP maintenance and spatio-temporal organization of these AS networks rely on the accurate, coordinated recognition and execution of what is often referred to as the splicing code, that is, the combination of RNA cis-element motifs and constitutive splicing signals (37, 38).

Due to the complexity of AS regulation, deciphering this code has not been a simple or trivial task. Firstly, the complete set of RBPs that can participate in AS regulation is likely still unknown. Secondly, many RBPs can recognize the same short, degenerate sequence with a range of affinities and specificities, a fact that becomes more complex when considering the varying expression profiles of RBPs across cell types. Additionally, many RBPs work in conjunction with each other and with other proteins, which in turn can modify their recognition of target cis-elements. Finally, there is no simple division of RBPs into positive or negative regulators of AS; many RBPs exert differential effects on AS depending on their location of binding relative to a target exon (Figure 1-6) (26). While enormous progress has been made in the splicing field toward understanding this code, especially with the plethora of high-throughput sequencing-based approaches like RNAseq and CLIPseq that can be utilized to predict experimental outcomes upon
modification of the RBP landscape, there is still much to learn about how RBPs modulate AS, especially as it relates to development and disease.

MBNL Proteins: Highly Conserved Master Regulators of RNA Processing

MBNL proteins are a family of highly conserved RBPs that regulate RNA metabolism during tissue-specific development, most notably the activation or repression of alternative exon inclusion (39, 40). MBNL proteins have been specifically implicated in regulating fetal to adult mRNA isoform transitions in heart and muscle (33, 34, 41-43). In addition, MBNL proteins have been linked to the regulation of other RNA metabolic processes including RNA localization (44), turnover (45), gene expression (46, 47), alternative polyadenylation (48), and microRNA processing (49).

MBNL Proteins Are Functionally Conserved in Metazoans

muscleblind (mbl) was initially discovered in Drosophila, where this embryonic lethal gene was found to be both essential and autonomously required for photoreceptor differentiation in the fly eye (50). Further characterization of several hypomorphic mbl alleles in Drosophila larvae revealed that the protein product of this gene (i.e. Mbl) is expressed in several muscles, the nerve cord, and the photoreceptor system. While these alleles were lethal in the larval stage, larvae displayed distinct phenotypes including partial paralysis, a contracted abdomen, and reduced muscle striation (51). These initial studies in Drosophila revealed that the Mbl protein is a key developmental regulator of muscle differentiation.

Following its discovery in this model organism, homologs of mbl have been found in a wide variety of species. Although plants, fungi, and bacteria lack any protein resembling Mbl, homologs are found in nearly all metazoans sequenced to date across several distinct phyla, including mice (M. musculus), worms (C. elegans), sea squirts
(Ciona intestinalis), and one of the most basal metazoans, Trichoplax adhaerens (52). While invertebrates typically encode a single mbl gene, vertebrates have multiple genes (39). Humans and other mammals, including mice, have three muscleblind-like (MBNL) genes, MBNL1-3 (53). Experimental evidence indicates that these proteins are functionally exchangeable (39, 52). Ubiquitous expression of full-length human MBNL1 cDNA in mutant Drosophila muscleblind embryos rescues embryonic lethality and the contracted abdomen phenotype (54). Expression of several Mbl homologs in mouse cells revealed that all homologs could regulate alternative splicing of mammalian transcripts (52). These results indicate that the overall AS activity and role of Mbl proteins are conserved across metazoans, further exemplifying the importance of this RBP.

MBNL Proteins Regulate Alternative Splicing During Tissue-Specific Development

While vertebrates express three MBNL paralogs (MBNL1-3), the expression patterns of these genes vary across tissue types. MBNL1 and MBNL2 are ubiquitously expressed in adult tissues, but MBNL1 shows increased expression in skeletal muscle and heart while MBNL2 levels are higher within the brain (55-57). In contrast, the literature indicates that MBNL3 is developmentally regulated and primarily expressed in placental tissue and activated myoblasts, where it has been shown to play roles in muscle regeneration and differentiation (57-59).

In the majority of tissues, MBNL mRNA and protein levels rise during tissue differentiation. This is especially dramatic in heart, skeletal muscle, and brain, where there is a several-fold increase in MBNL1 and MBNL2 mRNA expression in adult compared to fetal tissue (60). Within these tissue types it has been shown that MBNL1 and MBNL2 have largely compensatory roles. In Mbnl1 knockout (KO) mice, MBNL2
expression is dramatically increased, and profound aberrations in AS are observed upon combined functional loss of MBNL1 and MBNL2 that are not observed when either protein is depleted individually (44, 61).

Increased levels of MBNL1 and MBNL2 during tissue differentiation have been shown to play a major role in promoting transitions from the fetal to adult AS patterns for MBNL target mRNAs. Analysis of key splicing events in Mbnl1 KO mice revealed that MBNL proteins control the coordinated transitions of AS for a large subset of events that are regulated in a temporal manner during development in the heart and skeletal muscle (34, 41). Other studies revealed that MBNL1 and MBNL2 proteins contribute to the coordination of differentiation and tissue development by repressing embryonic stem (ES) cell specific AS and negatively regulating ES cell pluripotency (62). In contrast to MBNL1 and MBNL2, MBNL3 appears to antagonize differentiation, specifically myotube formation, and the functional consequences of its expression and AS regulation are still being explored (58, 59, 63).

MBNL Positional-Dependent Splicing Regulation

In order to regulate specific splicing events, MBNL1 acts as an enhancer or repressor of exon inclusion in a transcript-dependent manner. In general, if MBNL1 binds upstream of or within a regulated exon it represses inclusion, while if it binds downstream of a regulated exon it enhances inclusion (Figure 1-8a) (44, 47, 64). This specific pattern of positional-dependent splicing is conserved among other families of RBPs, including Nova, RBFOX, and QKI (28).

Many studies using a wide variety of methods have found that YGCY (Y = C or U) is the consensus MBNL1 binding motif within its RNA targets. Both SELEX (systematic evolution of ligands by exponential enrichment) and computational analysis
of motif enrichment in events derived from splicing-sensitive microarrays identified YGCY, and in particular UGCU, as the most common MBNL1 binding motif, especially when repeated several times over a short sequence region (Figure 1-8b) (47, 64). Ultraviolet cross-linking and immunoprecipitation combined with deep sequencing (CLIPseq) of MBNL/RNA complexes confirmed that UGCU tetranucleotide sequences are indeed a major binding motif for all three MBNL paralogs (MBNL1-3) (44, 45, 58, 65). This consensus tetranucleotide binding sequence was expanded to the heptamer GCUUGCU following completion of an in vitro screen in which MBNL1 was bound to a randomized RNA sequence population followed by deep sequencing (RNA Bind-n-Seq) (66). Overall, the binding of MBNL proteins to YGCY consensus motifs within the introns/exons of pre-mRNAs is associated with AS regulation, whereas MBNL binding within 3' UTRs is linked to roles in regulation of RNA localization and stability, including the compartmentalization of mRNAs (44, 45).

Despite extensive evidence describing the importance of MBNL proteins in promoting AS transitions during development and definition of the consensus binding motif, the specific mechanisms by which MBNL proteins regulate inclusion or exclusion of a target cassette exon remain elusive. Based on the positional-dependent mode of RNA binding described, it has been predicted that MBNL binding influences recognition of adjacent splice sites by the splicing machinery, and studies of individual MBNL1-regulated AS events validate this hypothesis. For example, it has been shown that MBNL1 binds and stabilizes the formation of a stem-loop structure within intron 4 of TNNT2 which blocks
U2AF65 recognition of the polypyrimidine tract, in turn inhibiting U2 snRNP recruitment and resulting in skipping of exon 5 (Figure 1-9a) (67, 68). Similar modes of MBNL1-mediated negative regulation of exon inclusion have been described for exon F of TNNT3 and exon 7a of CLCN1 (69, 70). Fewer mechanistic insights exist for MBNL1-mediated promotion of cassette exon inclusion. While it has been predicted that MBNL might promote inclusion by competing for binding with splicing silencers, activating ISEs, or promoting enhanced splice site recognition, few such mechanisms have been biochemically characterized. One example described in the literature is MBNL1 binding of a highly conserved ISE within intron 11 of INSR that enhances U2AF65 binding and splicing of the upstream intron (Figure 1-9b) (71, 72).

MBNL Functional Domains Mediate Interaction with Target RNAs

Of the 10 coding exons in MBNL1, six undergo AS, generating a diverse set of mRNA isoforms that encode at least 7 distinct MBNL1 protein isoforms (39, 55, 73). This extensive AS of MBNL1 adds to the functional range of this protein. Of the three human MBNL paralogs, MBNL1 is the best studied due to its high expression in muscle tissue. As such, much of the remaining content will focus on studies using MBNL1.

RNA binding by MBNL proteins is mediated by four highly conserved CCCH-type (CX7CX4-6CX3H) zinc finger (ZF) motifs, named for the four amino acids that coordinate the zinc ion within each ZF (74). These motifs fold into two tandem domains oriented toward the N-terminal region of the protein and are separated by a flexible linker that is predicted to mediate interactions of each ZF pair with YGCY motifs within a target RNA (Figure 1-10) (75). Both crystallographic and NMR-based structural studies
of MBNL1 have revealed how these ZF domains associate with YGCY elements within target RNAs (76, 77). NMR studies have shown that each ZF pair interacts with single-stranded RNA molecules in a 1:1 stoichiometric ratio and that the individual ZFs within each pair are not equivalent; ZF2 and ZF4 form the primary RNA binding surface while ZF1 and ZF3 serve as secondary stabilizing domains (77). The only available crystal structure of MBNL1 in complex with RNA is of the ZF3-4 pair bound to CGCUGU RNA. This protein-RNA interaction is based on the stacking of aromatic and arginine residues with the RNA guanine and cytosine bases (76). These nucleotides are additionally coordinated by a network of hydrogen bonds (76). Interestingly, this mode of RNA recognition, which requires interaction with the Watson-Crick face of the GC dinucleotide, suggests that MBNL binding to RNAs destabilizes RNA structure (78).

Beyond the ZF domains lies the C-terminal region of MBNL1, which, like the linker region separating the ZF pairs, is predicted to be relatively unstructured (Figure 1-10). Extensive analysis of MBNL1 deletion constructs indicates that while this region is not required for RNA binding, it contains sequences that influence nuclear localization (73, 79, 80). Additionally, another section within the C-terminus has been shown to induce MBNL1 self-dimerization in a yeast two-hybrid assay and has been linked to the formation of multimeric ring-like structures (70, 73).

RNA Splicing and Disease

Point mutations are the most common cause of hereditary disease, and it is estimated that about 50% of these mutations result in aberrant splicing (81, 82). Splicing regulation in disease can be disrupted by mutations within key cis-regulatory sequences required for accurate pre-mRNA processing, mutations of spliceosome proteins that
alter their function in the splicing reaction, or alterations to trans-acting splicing components that are required for snRNP biogenesis or splicing regulation (82-84). Diseases impacted by AS include those with both dominant and recessive inheritance patterns and with varying frequency of occurrence in the population, including retinitis pigmentosa, the most common form of hereditary blindness, amyotrophic lateral sclerosis (ALS), and spinal muscular atrophy (SMA). Additionally, mis-regulation of AS is common in other diseases such as cancer, where splicing alterations have been linked to tumor progression, metastasis, and other oncogenic processes (85-87).

RNA Processing by MBNL Proteins is Mis-Regulated in Myotonic Dystrophy

Myotonic dystrophy (DM) is the second most common form of muscular dystrophy following Duchenne muscular dystrophy and is the most common form of adult-onset muscular dystrophy, with an incidence of approximately 1 in 8,000. While this disease presents in two genetically distinct forms (DM type 1 and type 2, DM1 and DM2), both are multi-systemic, neurodegenerative disorders that impact the musculoskeletal, gastric, cardiac, and central nervous systems (CNS). Phenotypically, DM is characterized by symptoms of muscle wasting, myotonia, early-onset iridescent cataracts, insulin resistance, reduced fertility, cognitive impairment, and cardiac dysfunction including conduction abnormalities and arrhythmia (88, 89).

The molecular basis of DM1 is a CTG trinucleotide repeat expansion in the 3' untranslated region of the dystrophia myotonica protein kinase gene (DMPK) (Figure 1-11a) (90-92). Unaffected individuals have between 5 and 38 CTG repeats while patients with the disease have between 50 and 2,000 repeats (93, 94). DM2 is caused by a CCTG repeat expansion within intron 1 of the CNBP gene (Figure 1-11b) (95, 96). Patients with DM2 have between 75 and 11,000 repeats (96). These expansions are
inherited in an autosomal dominant manner and the number of repeats increases with each successive generation (97). Disease severity escalates and age of onset decreases with increasing repeat size, although these correlations are not as strong for DM2 as for DM1 (98-101).

Once transcribed into RNA, the CUG/CCUG repeats sequester members of the MBNL family of proteins in discrete nuclear foci (55, 102, 103). The depletion of these alternative splicing factors from the nucleoplasm via sequestration leads to mis-splicing events responsible for DM symptoms (104, 105). This RNA gain-of-function mechanism is the prevailing model for DM pathology (Figure 1-12).

Mis-splicing events caused by sequestration of MBNL proteins within the CUG/CCUG repeat RNA foci have been linked to many DM symptoms. Direct comparison of transcriptome changes in CUG repeat-expressing and Mbnl1 knockout mouse models revealed that loss of MBNL1 accounts for greater than 80% of the splicing pathology observed in DM1 (47, 53). The fetal isoforms of several MBNL1-regulated transcripts have been found in many adult DM1-affected tissues. For example, cardiac troponin T type 2 (TNNT2) and cardiac sodium voltage-gated channel alpha subunit 5 (SCN5A) have been found to be mis-spliced in the heart (42, 67, 68, 106, 107). These genes contain MBNL1-regulated exons that are included in fetal tissue, which lacks MBNL protein expression, and excluded in adult tissue due to the high expression of MBNL (42, 67, 68, 106, 107). Inclusion of these fetal exons in DM1 patients due to MBNL depletion has been linked with cardiac symptoms including conduction defects and arrhythmia (42, 107).
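To illustrate that CUG and CCUG repeat RNAs themselves contain the YGCY consensus motif (Y = C or U) recognized by MBNL proteins, the following minimal Python sketch scans a sequence for every YGCY occurrence. The function name `find_ygcy` and the short repeat lengths are hypothetical choices made only for illustration; the sketch reports sequence matches and makes no claim about binding affinity or RNA structure.

```python
import re

# Y = C or U. The lookahead allows overlapping motifs to be reported.
YGCY = re.compile(r"(?=([CU]GC[CU]))")


def find_ygcy(sequence):
    """Return (0-based position, motif) for every YGCY match.

    DNA input is accepted: T is converted to U before scanning.
    """
    rna = sequence.upper().replace("T", "U")
    return [(m.start(), m.group(1)) for m in YGCY.finditer(rna)]


if __name__ == "__main__":
    # Short illustrative repeats; pathogenic expansions are far longer.
    print(find_ygcy("CUG" * 4))   # [(1, 'UGCU'), (4, 'UGCU'), (7, 'UGCU')]
    print(find_ygcy("CCUG" * 3))  # [(2, 'UGCC'), (6, 'UGCC')]
```

In this sketch each additional CUG or CCUG unit contributes one further YGCY match, so the number of potential motifs grows with repeat length.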
The hallmark symptom of DM1, myotonia, has also been directly linked with MBNL-dependent mis-splicing of the CLCN1 chloride channel. Loss of MBNL leads to inclusion of exon 7a, which contains a premature stop codon, and to degradation of the CLCN1 message via nonsense-mediated decay (Figure 1-13) (108, 109). This reduction in functional CLCN1 protein leads to reduced conductance of chloride ions in the sarcolemma of muscle tissue and subsequent myotonia (108). The connection of this mis-splicing event to loss of MBNL was discovered in Mbnl1 knockout mice, which display CLCN1 mis-splicing and the myotonia phenotype (53). MBNL1 overexpression in a DM1 mouse model restores Clcn1 splicing to the adult pattern and rescues myotonia (110).

While the primary driver of DM disease symptoms is the induced MBNL-dependent spliceopathy, MBNL proteins have also been implicated in the regulation of repeat-associated non-ATG (RAN) translation (111). This non-canonical form of translation, which occurs in the absence of an AUG start codon, has been shown to occur across a wide spectrum of microsatellite repeat expansion disorders and to produce a vast array of toxic expansion proteins that accumulate and aggregate within the cell (112, 113). Toxic polyglutamine proteins derived from the (CAG)n expansion antisense RNA transcript have been detected in DM1 mouse models and patient-derived myoblasts (114). Toxic poly-(LPAC) and poly-(QAGR) RAN proteins produced from the DM2 CCTG/CAGG expansion sense and antisense RNA transcripts, respectively, have also been shown to accumulate in DM2 patient brains (111). In the case of DM2, RNA foci and nuclear sequestration of toxic RNA by MBNL proteins are inversely correlated with poly-(LPAC) expression (111). Based on these data, models of disease now suggest that
nuclear retention of these repeat expansion RNAs by MBNL within ribonuclear foci limits toxic RAN protein expression until either increased repeat length or RNA expression overwhelms MBNL sequestration, leading to nuclear export and production of RAN translation products (111).

Mis-regulation of MBNL in Other Diseases

Beyond their well-studied role in DM, MBNL proteins have also been implicated in other diseases. Spinocerebellar ataxia type 8 (SCA8), like DM, is a neurodegenerative disorder and is characterized by progressive motor deficits and cerebellar atrophy (115, 116). At the molecular level this disease is due to a CTG trinucleotide repeat expansion in the ATXN8OS gene that produces toxic CUG expansion RNA transcripts, which form ribonuclear foci that co-localize with MBNL proteins in SCA8 patient brains (115). Additionally, it has been shown that these transcripts trigger splicing changes of the MBNL1-regulated event Gabt4 in the CNS (115).

MBNL proteins have also been implicated in Fuchs endothelial corneal dystrophy (FECD), an inherited degenerative disease that impacts the endothelial layer of the cornea and can result in corneal edema and vision loss (117). This disease impacts approximately 5% of middle-aged Caucasians in the United States and is associated with more than 14,000 corneal transplants annually (117). Although there are several genes associated with this disease, the greatest correlation is with an intronic CTG trinucleotide repeat expansion in the TCF4 gene. As in DM and SCA8, the CUG expansion RNAs produced from this locus form foci that co-localize with MBNL proteins, resulting in mis-splicing (117, 118).

Finally, downregulation of MBNL expression has been linked with cancerous tumor metastasis (119, 120). AS is a common process that is disrupted in cancer and
has been linked with extensive changes in expression of splicing regulators including RBPs (84-87). Large-scale analysis of transcriptomic data revealed that the MBNL family of proteins is often downregulated in tumors, particularly in breast and prostate cancers (119, 120). This downregulation is linked with an AS profile that mimics differentiation (119). Additionally, MBNL1 downregulation has been linked with changes in splicing of cancer driver genes like NUMA1 (119). Conversely, increased MBNL1 expression in human breast tumors has been shown to reduce the risk of disease relapse by stabilizing transcripts from the metastatic suppressor genes DBNL and TACC1 (120). From rare genetic disorders to diseases that impact millions of patients, MBNL proteins have been found to be key RBPs that are implicated in global mis-regulation of splicing across multiple disorders.

Implications of Studying Muscleblind Alternative Splicing Regulation

MBNL proteins have been implicated as key regulators of development and facilitators of tissue-dependent differentiation via regulation of fetal to adult AS transitions. The importance of MBNL proteins is further highlighted by their role in disease, in which disruption of MBNL protein function is implicated in a variety of symptomatic presentations in patients, from myotonia and muscle degeneration in DM to vision loss in FECD. While this connection to disease has led to nearly 20 years of intensive research, there are still questions regarding the complex functions of this protein and how it regulates diverse RNA processing events. More specifically, work is still underway to understand the rules that govern MBNL recognition of cis-elements within a diverse RNA target population and the downstream impacts of this cis-element recognition (i.e. cassette exon inclusion
or exclusion). Additionally, how other trans-acting factors within complex AS regulatory networks impact MBNL protein function has yet to be fully explored.

Continuing to improve our understanding of MBNL AS regulation and its impacts may inform our understanding of disease symptoms and outcomes. Studies focusing on MBNL RNA processing can also contribute to our current understanding of the role of this RBP in development and tissue differentiation, including stem cell pluripotency and reprogramming. A greater understanding of these processes may aid in informing and improving methods for iPSC generation for research and therapeutic applications.

The focus of this dissertation is working towards answering some of these questions regarding MBNL AS regulation. Chapter 2 is published work that focuses on the use of an engineering approach to design and characterize synthetic MBNL proteins as tools to understand the importance of individual MBNL domains and their contribution to protein function. Chapter 3 focuses on utilizing precise cellular systems to control levels of MBNL protein expression and investigate how AS changes across a gradient of MBNL expression, especially in the context of other trans-acting RBPs, specifically RBFOX1. Finally, Chapter 4 describes a collection of projects currently under development that utilize the synthetic MBNL platform, including work directed at potential therapeutic delivery in CUG repeat expansion disorders and the creation of new MBNL proteins with altered functionality via the fusion of other modular RNA binding domains. Future directions utilizing the dosing systems described in Chapter 3 are also outlined. Overall, this collection of work focuses on further elucidating the layers of
MBNL splicing regulation, specifically the roles of the individual RBDs of MBNL and this
Figure 1-1. RNA sequences recognized by the spliceosome. Diagram depicting consensus sequences for major (U2-dependent) spliceosome recognition of pyrimidine tract (PPT) (8) sequences within the diagram represent their relative position within introns (lines) and exons (boxes) of a pre-mRNA substrate.

Figure 1-2. Mechanism of spliceosome assembly on pre-mRNA substrate. Diagram depicting sequential major (U2-dependent) spliceosome assembly on pre-mRNA from E complex (top) to final spliced mRNA product (bottom). The splicing components and their relative binding positions within the pre-mRNA are shown.

Figure 1-3. Mechanism of splicing catalysis. Diagram depicting two-step biochemical process performed by the spliceosome to remove introns and join exons, resulting in an exon-exon junction and removal of an intron lariat.

Figure 1-4. Types of alternative splicing. Diagram depicting common types of alternative splicing. Exons (boxes) targeted for alternative splicing regulation are colored blue while constitutively included exons are colored grey.

Figure 1-5. Cis-acting splicing regulatory elements (SREs). Diagram depicting four main types of cis-acting SREs. This diagram shows their relative location within a pre specifically enhancing or silencing their recognition by the spliceosome.

Figure 1-6. Positional dependent splicing regulation by trans-acting RNA binding proteins. RNA binding proteins have positional dependent effects on exon inclusion within a final mRNA transcript based on their location of binding to a pre-mRNA. hnRNPs A) and SR proteins B) regulate exon inclusion in an antagonistic manner whereby hnRNPs promote inclusion of an exon by binding intronic sequences; SR proteins repress exon inclusion via this binding pattern. Conversely, SR proteins promote inclusion by binding to the regulated exon; hnRNPs repress inclusion by binding within an exon. C-D) Other RBPs regulate exon inclusion based on binding to the upstream or downstream intron flanking the regulated exon. C) Muscleblind-like (MBNL), RBFOX, and Quaking (QKI) promote exon inclusion by binding to the downstream intron and exon exclusion via binding to an upstream intron. D) CELF proteins promote exon inclusion by binding to the flanking introns, but have opposite positional dependent effects from the RBPs shown in panel C.

Figure 1-7. RNA binding proteins (RBPs) work in complex networks to regulate alternative splicing of large collections of RNA targets. [Adapted from (121)]

Figure 1-8. MBNL regulates alternative splicing in a position-dependent manner via recognition of YGCY motifs. A) Diagram depicting how MBNL binding to a pre-mRNA regulates AS of a target exon. When MBNL binds within the downstream intron, exon inclusion is promoted; MBNL binding within the upstream intron leads to exon exclusion. B) Sequence logo depicting MBNL1 RNA binding consensus motif from SELEX experiments (64). Height of each
Figure 1-9. Mechanisms of MBNL-mediated alternative splicing. Diagrams depicting known mechanisms of MBNL-mediated exon exclusion (A) and exon inclusion (B). A) MBNL1 binds to and stabilizes a predicted helical element within intron 4 of TNNT2 exclusion of exon 5. B) MBNL binds a downstream conserved intronic splicing enhancer (ISE) within intron 11 of INSR that enhances U2AF65 binding to the upstream intron and subsequent inclusion of exon 11.
Figure 1-10. MBNL domain organization. Diagram depicting MBNL1 domain organization, specifically the location of the two tandem zinc finger (ZF) pairs as well as the linker and C-terminal region (splice isoform a; NCBI accession number NP_066368). Lines represent regions predicted to be unstructured. Length of each region corresponds to relative number of amino acids within each section of MBNL1.
Figure 1-11. Location of expanded repeats in myotonic dystrophy type 1 (DM1) and myotonic dystrophy type 2 (DM2). A) DM1 is caused by a CTG repeat expansion in exon 15 of DMPK untranslated region (UTR) of the mRNA transcript. B) DM2 is caused by a CCTG repeat expansion in intron 1 of CNBP.
Figure 1-12. Molecular mechanism of DM1. Upon transcription of the DMPK gene, the repeat expansion RNA is predicted to form structured RNA elements that MBNL proteins from the nucleoplasm and leads to reduced functional levels of MBNL protein.

Figure 1-13. Sequestration of MBNL by toxic RNA in DM1 tissue leads to mis-splicing of CLCN1. Diagram depicting MBNL-mediated splicing of CLCN1 exon 7a. In unaffected tissue with few CUG repeats in the DMPK locus, MBNL acts as a regulator of CLCN1 exon 7a exclusion. This transcript is translated and cells have normal chloride ion conductance due to adequate levels of CLCN1 protein. In DM1 tissue with a large CUG repeat load, MBNL proteins are sequestered in foci and exon 7a is included in the CLCN1 transcript. This transcript contains a premature stop codon which induces nonsense-mediated decay (NMD) of the transcript and reduced CLCN1 protein levels. This leads to reduced chloride ion conductance and is causative of myotonia in DM1 patients.
CHAPTER 2
AN ENGINEERED RNA BINDING PROTEIN WITH IMPROVED SPLICING REGULATION

This work was published in Volume 46 Issue 6 of the Nucleic Acids Research Journal in April 2018. Melissa A. Hale, the author of this dissertation, was first author on this publication and performed most experiments described with the assistance of Ryan C. Day (122). Jared I. Richardson performed the RNA Bind-n-Seq experiments described. Ona L. McConnell and Juan Arboleda of the Eric T. Wang lab generated the synthetic MBNL inducible cell lines utilized.

Background

Alternative splicing (AS) is a complex and versatile process of post-transcriptional gene regulation whereby exons within a precursor RNA transcript are differentially joined and introns removed to produce a mature mRNA. AS generates multiple mRNA isoforms from an individual gene, most often resulting in the expression of a diverse set of protein products. Additionally, AS alters the fate of mRNAs through the inclusion of regions that impact RNA localization, translation, and turnover (24). It is now recognized due to large-scale transcriptome-based studies that more than 90% of human protein coding genes undergo AS, making regulation of this process critical for proper cellular function (15, 16). Trans-acting protein factors, including RNA binding proteins (RBPs, reviewed in (26) and (123)), can function as regulators of AS by interacting with specific RNA motifs, or splicing regulatory elements, to enhance or repress the inclusion of alternative exons. RBPs also act in a spatio-temporal and developmentally dependent manner to modulate the overall profile of mRNAs produced within specific cell types, developmental stages, or in response to varying environmental conditions (23, 24).
Muscleblind-like (MBNL) proteins are a family of highly conserved RBPs that regulate RNA metabolism during tissue-specific development, most notably the activation or repression of alternative exon inclusion (39, 40). MBNL proteins have been specifically implicated in regulating fetal-to-adult mRNA isoform transitions in heart and muscle (33, 34, 41-43). In addition, MBNL proteins have been linked to the regulation of other RNA metabolic processes including localization (44), turnover (15), gene expression (46, 47), alternative polyadenylation (48), and microRNA processing (49).

MBNL proteins, particularly MBNL1, have been the focus of intense study for the past 15 years due to their prominent role in the pathogenesis of myotonic dystrophy (DM). DM is a multi-systemic neuromuscular disorder caused by expression of CTG or untranslated region of DMPK (DM Type 1) or intron 1 of CNBP (DM Type 2), respectively (90, 96). Once transcribed into RNA, these expanded CUG or CCUG repeats sequester MBNL proteins into discrete nuclear RNA-protein aggregates called foci (55, 102, 103). Sequestration of MBNL by these toxic, expanded RNAs leads to dysregulation of MBNL-mediated AS, which is linked to some of the disease symptoms (89, 105, 124, 125). Although most commonly associated with DM, loss of MBNL1 function has also been associated with other disorders, specifically spinocerebellar ataxia type 8 (SCA8) and Fuchs Endothelial Corneal Dystrophy (FECD) (115, 117).

In order to regulate specific splicing events, MBNL1 acts as an enhancer or repressor in a transcript-dependent manner. In general, if MBNL1 binds upstream of a regulated exon it suppresses inclusion, and if it binds downstream it enhances inclusion (47, 64, 74). RNA binding by MBNL1 is mediated via four highly conserved CCCH type
(CX7CX4-6CX3H) zinc finger (ZF) motifs that fold into two tandem RNA binding domains commonly referred to as ZF1-2 and ZF3-4 (74, 76). These two domains are located within the N-terminal region of the protein and are separated by a flexible linker predicted to mediate MBNL1 binding to a wide variety of RNAs (75, 76). Studies have shown that MBNL proteins bind YGCY (Y = C or U) motifs within their RNA targets (32, 64). Crosslinking immunoprecipitation (CLIP)-seq and RNA Bind-n-Seq (RBNS) experiments have identified several additional related motifs (44, 66). The expanded CUG / CCUG repeat RNAs in DM patients contain many YGCY motifs, providing a sink for MBNL and leading to the subsequent dysregulation of RNA processing mediated by MBNL proteins.

Sequence alignment and secondary structural overlay of the two ZF domains show that ZF1 / ZF3 and ZF2 / ZF4 have high sequence similarity and nearly identical structures (74, 76, 77). The major differences between the domains are (i) helix at the end of ZF2, (ii) an interdomain linker that is two amino acids shorter in the ZF1-2 domain (Figure 2-1) and (iii) a short N-terminal helix before ZF1 absent in the ZF3-4 domain (74, 76, 77). Due to the high degree of similarity between the two domains as well as their physical separation via the linker, it has been predicted that the ZF domains have the same or very similar RNA binding activities and may be functionally redundant (74, 76). The hypothesis of functional redundancy is further supported by studies with the D. melanogaster and C. elegans orthologs of the MBNL1 gene, muscleblind (mbl). Mbl from these organisms contains only a single ZF domain or the major isoform contains only a single ZF domain orthologous to the human ZF1-2
and yet is able to regulate splicing of many MBNL1 target transcripts in mammalian cell culture (52, 126, 127).

Despite the similarity between these domains, combinatorial mutagenic analysis of the four ZFs found that ZF1-2 and ZF3-4 are not functionally equivalent (74). Using this approach it was discovered that an MBNL1 protein with a single functional ZF1-2 bound with higher affinity to all tested RNA substrates compared to an MBNL1 with only a functional ZF3-4 (74). Additionally, an MBNL1 with only an active ZF1-2 retained approximately 80% of splicing activity while the MBNL1 mutant with only an active ZF3-4 maintained 50% splicing regulation (74). Despite these observations it still remained unclear if the ZF pairs truly act as independent domains. Additionally, the function of the individual ZF domains and whether they cooperate in some manner through higher order interactions to achieve AS regulation remained ambiguous.

In order to address these questions, we utilized a synthetic biology approach to generate chimeric MBNL1 proteins with novel ZF domain organization. Specifically, we hypothesized that a synthetic MBNL1 protein with higher RNA binding affinity and subsequent splicing activity could be engineered by replacing ZF3-4 with a second ZF1-2. Additionally, we predicted that substitution of the ZF1-2 domain with a ZF3-4 would result in weakened RNA binding and reduced splicing regulation. To test these hypotheses, two synthetic MBNL constructs were designed with duplicate ZF domains: (i) an MBNL in which the ZF3-4 domain is replaced with a ZF1-2 (defined as domain A) to create MBNL AA and (ii) MBNL BB in which the ZF1-2 domain is substituted with a ZF3-4 (defined as domain B) (Figure 2-2a).
Using this approach we discovered that the ZF1-2 and ZF3-4 domains act as independent units with distinct characteristics, most notably different RNA binding specificities. We also showed that the ZF domains can be organized in novel ways to produce synthetic MBNL1 proteins with different activities as assayed by AS and RNA binding assays. The creation and characterization of these synthetic proteins has not only given us additional insights into the function of the individual ZF domains, but also provides a framework to develop novel MBNL proteins with the potential to serve as tools to investigate AS regulation and as potential therapeutic biologics for DM and other microsatellite diseases.

Results

Synthetic MBNL1 Proteins with Modified Zinc Finger Domain Organization Possess Different Splicing Activities

In order to evaluate the importance of ZF domain organization and content in MBNL proteins, two synthetic MBNL proteins with different ZF domain content were created, MBNL AA and MBNL BB, the activities of which we planned to compare to an MBNL protein with the canonical ZF domain content and arrangement, MBNL AB (Figure 2-2a). Using the extensively studied 41 kDa isoform (1-382 amino acids) of MBNL1 as a platform for our synthetic protein design, we defined the ZF1-2 domain (domain A) from 9 to 101 (93 amino acids) and the ZF3-4 domain (domain B) from 178 to 253 (76 amino acids). Although the domain boundaries previously published were used as a guide (76, 128), we chose to extend the C-terminal sequence of ZF1-2 to include the Q-rich region (amino acids 91-101) downstream of ZF2, which we have shown previously to be important for ZF1-2 splicing function (74). To reduce the overall size of our synthetic proteins and facilitate in vitro purification, the C-terminal region
(amino acids 261-382) was removed and replaced with an eight amino acid nuclear localization signal (SV40 NLS). Although predicted to be relatively unstructured, the C-terminus of MBNL1 has been shown to contain several regions required for nuclear localization and potential MBNL1 dimerization (73). However, previous work has shown that this region is not required for high-affinity RNA binding and MBNL1 proteins with the C-terminus removed retain nearly full splicing activity compared to full-length MBNL1 (67, 80). Finally, an N-terminal HA tag was also added for use in immunoblot and immunofluorescence detection methods (see Figure 2-1a for the amino acid sequences of all MBNL protein constructs used in this study).

Prior to functional characterization of our synthetic MBNL proteins, we evaluated relative protein expression levels and subcellular localization. Immunofluorescence detection in transfected HeLa cells showed predominant nuclear localization with a modest signal in the cytoplasm for MBNL AB and both synthetic proteins (Figure 2-3a). This distribution is comparable to past results, including those using full-length MBNL1 (52, 74). The only noticeable difference in the subcellular distribution of the synthetic proteins was a lack of nucleolar definition in cells expressing MBNL BB. Surprisingly, we detected significant differences in steady-state protein levels in transfected HeLa cells as determined by immunoblot (Figure 2-2b). When normalized to MBNL AB, MBNL AA is expressed at an approximately 0.5-fold lower level while MBNL BB is expressed at a 2.5-fold higher level (Figure 2-2c). This pattern of expression was maintained in transfected HEK-293 cells, indicating that the observed relative expression levels are independent of cell type and transfection method (Figure 2-4a and 2-4b). These variations in protein levels were not due to changes in mRNA expression
as assayed by RT-qPCR (Figure 2-3b). Overall, these data suggest that the ZF3-4 domain confers additional stability to MBNL1 compared to ZF1-2 in the context of these protein constructs.

To explore how variation of ZF content within the synthetic proteins would impact MBNL1 splicing activity, a cell-based splicing assay was used with a series of splicing reporter minigenes, many of which are derived from events known to be mis-regulated in DM. These reporters include (i) human insulin receptor exon 11 (INSR) (71, 128, 129), (ii) human cardiac troponin T type 2 exon 5 (TNNT2) (32, 67), (iii) human sarcoplasmic / endoplasmic reticulum Ca2+ ATPase 1 exon 22 (ATP2A1) (64, 130), (iv) mouse nuclear factor I/X exon 8 (Nfix) (47), (v) mouse very low density lipoprotein receptor exon 16 (Vldlr) (47), and (vi) human MBNL1 exon 5 (131). HeLa cells were co-transfected with synthetic MBNL expression plasmids or empty vector (mock) and a single minigene reporter. Inclusion levels of each alternative exon were then quantified via RT-PCR and expressed as percent spliced in (PSI). Splicing activity of MBNL AA and MBNL BB was then determined as a percentage of activity relative to MBNL AB for each minigene event.

The data for the six minigenes tested revealed that MBNL AA regulated splicing at a level equivalent to or better than MBNL AB. MBNL BB, while still functional, had significantly reduced splicing activity (Figure 2-5a to 2-5f). These patterns of splicing regulation were maintained for both inclusion (INSR, ATP2A1, and Vldlr) and exclusion (TNNT2, MBNL1, and Nfix) events, indicating that the splicing activity of these synthetic proteins is independent of RNA target and regulation type. Overall, these observations are consistent with our hypothesis that the splicing activity of MBNL AA would be high
while that of MBNL BB would be low. Importantly, disruption of the canonical ZF domain organization and removal / replacement of specific ZF domains did not render our synthetic MBNL proteins dysfunctional, indicating that (i) MBNL proteins are amenable to major sequence alterations and substitutions, and (ii) the splicing activity of the individual ZF domains can be uncoupled.

The only reporter that showed large differences in regulation was TNNT2. Within the context of this event, MBNL AA displayed enhanced activity (147%) while MBNL BB had only minimal splicing activity (16% activity) (Figure 2-5d). In contrast, all three proteins were able to regulate splicing of the MBNL1 reporter with similar activity (Figure 2-5e). For all other reporters utilized, MBNL AA regulated splicing at equivalent levels to MBNL AB while MBNL BB retained approximately 50% of MBNL AB splicing activity (Figure 2-7a). These trends were maintained in HEK for each minigene between the two different cell types (Figure 2-6a to 2-6f and 2-7b).

An important point regarding these results is that equal amounts of plasmid were transfected into cells and this resulted in differences in the amount of each MBNL protein expressed (Figure 2-2b and 2-2c for HeLa, Figure 2-4a and 2-4b for HEK-293). Interestingly, the high levels of MBNL BB were not sufficient to regulate splicing as well as MBNL AB. In contrast, MBNL AA maintained comparable splicing regulation to MBNL AB with half the amount of protein present.

Controlled Dosing of Synthetic MBNL Proteins in Two Different Systems Reveals Significantly Different Activities for Splicing Regulation

To gain further insight into MBNL AA and MBNL BB AS regulation, especially as it relates to protein concentration, we performed the same cell-based splicing assays previously utilized across a gradient of MBNL expression. We found that this
experimental analysis was necessary as our synthetic proteins had different expression profiles (Figure 2-2b and 2-2c, Figure 2-4a and 2-4b). To create the range of protein levels required within this system, HeLa cells were transfected with increasing amounts of protein expression plasmid for each synthetic MBNL protein tested. Immunoblot analysis against the HA tag was then used to quantify relative MBNL1 levels at each concentration of plasmid transfected (see Figure 2-8 for representative blots and quantification). As expected, MBNL BB maintained relatively high levels of expression across the gradient while MBNL AA protein levels remained lower compared to MBNL AB. PSI values for three different minigenes (TNNT2, MBNL1, and ATP2A1) for each individual point along the protein gradient were determined (representative images are shown in Figure 2-9a). These values were then plotted against log [MBNL] to create dose-response curves for each protein.

MBNL1 and ATP2A1 were selected from the pool of minigene reporters to test in this system because (i) these two based splicing assay (Figure 2-5b and 2-5e), (ii) they represent both MBNL1-regulated exclusion and inclusion events, respectively, (iii) both minigenes have been well characterized (64, 131), and (iv) MBNL AA and MBNL BB both show similar splicing activity and maximal d to MBNL AB (Figure 2-5b and 2-5e). TNNT2 was chosen as an additional reporter to test in this dosing system because it displayed the largest difference in splicing activity for synthetic MBNL1 proteins (Figure 2-5d). Creation of these dose curves allowed for the derivation of several quantitative parameters that describe the splicing regulation of each event, i.e. EC50 and slope. The slope of the response curve
provides a relative measure of cooperativity while the EC50 value provides a relative measure of how much protein is required to obtain splicing regulation at 50% of the maximal response.

Results from these experiments revealed different dose-response curves for each MBNL protein tested and for each minigene assayed (Figure 2-10a to 2-10c). Both MBNL AB and MBNL AA displayed typical dose-response curves that show a plateau in minigene events tested (Figure 2-10a to 2-10c). Based on the EC50 values derived, MBNL AB required approximately 5-fold more protein compared to MBNL AA to achieve similar levels of splicing regulation (Figure 2-10d). For all three events tested the slope of the dose-response curves for MBNL AB was steeper compared to MBNL AA (Figure 2-10d). Interestingly, this indicates that while less MBNL AA protein is required to reach th regulation.

In contrast to MBNL AB and MBNL AA, the dose-response curves for MBNL BB revealed that, as expected, high expression levels are required to achieve modest splicing regulation (Figure 2-10d) TNNT2 (Figure 2-10c). Even for minigene events assayed in which MBNL BB was able to achieve splicing regulation in the overexpression system (ATP2A1 and MBNL1, Figure 2-5b and 2-5e), the EC50 values are high and the slopes are shallow compared to the other two proteins (Figure 2-10d). Overall, the controlled dosing of our synthetic MBNL1 proteins in this system revealed that, as predicted, MBNL BB has significantly reduced splicing activity while MBNL AA should be considered a high-activity, synthetic derivative of MBNL AB with a 5-fold increase in splicing activity.
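The relationship between EC50, slope, and the shape of a dose-response curve can be illustrated with a minimal sketch. It assumes a four-parameter Hill (log-logistic) model, which is a common choice for curves of this kind; the text does not state which model or software was used for fitting, so the function names, the coarse grid-search fit, and the example values below are all hypothetical and serve only as an illustration.

```python
import math


def hill_psi(conc, psi_start, psi_end, ec50, slope):
    """Four-parameter Hill curve: PSI as a function of relative protein level.

    psi_start is the PSI with no regulator and psi_end the PSI at saturation,
    so the same form covers inclusion (psi_end > psi_start) and exclusion
    (psi_end < psi_start) events. conc and ec50 must be positive.
    """
    # Written in log10 space to match a plot of PSI against log [MBNL].
    exponent = (math.log10(ec50) - math.log10(conc)) * slope
    return psi_start + (psi_end - psi_start) / (1.0 + 10.0 ** exponent)


def fit_ec50_slope(concs, psis, psi_start, psi_end):
    """Coarse least-squares grid search for EC50 and slope (plateaus fixed).

    Illustrative assumption only: a real analysis would normally use a
    nonlinear least-squares routine and fit the plateaus as well.
    """
    lo, hi = math.log10(min(concs)), math.log10(max(concs))
    best = None
    for i in range(201):                      # EC50 grid, evenly spaced in log10
        ec50 = 10.0 ** (lo + (hi - lo) * i / 200)
        for j in range(1, 101):               # slope grid from 0.1 to 10.0
            slope = j * 0.1
            err = sum(
                (hill_psi(c, psi_start, psi_end, ec50, slope) - p) ** 2
                for c, p in zip(concs, psis)
            )
            if best is None or err < best[0]:
                best = (err, ec50, slope)
    return best[1], best[2]


# Hypothetical example: an inclusion event rising from 20% to 80% PSI.
example_concs = [1.0, 3.0, 10.0, 30.0, 100.0]
example_psis = [hill_psi(c, 20.0, 80.0, 10.0, 2.0) for c in example_concs]
example_ec50, example_slope = fit_ec50_slope(example_concs, example_psis, 20.0, 80.0)
```

In this form, a lower EC50 shifts the curve toward lower protein levels (less protein is needed for half-maximal regulation), while a smaller slope flattens the transition between the two plateaus, which is the sense in which slope is used above as a relative measure of cooperativity.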
To expand our analysis of synthetic MBNL AS regulation as a function of protein concentration, we established stable cell lines expressing GFP-tagged MBNL AB or MBNL AA controlled with tet-on regulation. A cell line with MBNL BB was not generated due to its weak splicing activity. Both a constitutively expressed rtTA and an N-terminal GFP-tagged synthetic MBNL protein under control of a tet response element were stably integrated into Mbnl1/2 double knockout mouse embryonic fibroblasts (MEFs). Integration of both cassettes was driven by puromycin selection. After selection and treatment with doxycycline to activate GFP-MBNL protein expression, fluorescence-activated cell sorting was used to isolate and select individual clones for each cell line that have high expression of the synthetic MBNL protein in response to drug treatment. The fluorescence of the GFP tag was utilized to show that, as in the transfected HeLa system, MBNL AB and MBNL AA co-localize in the nucleus of doxycycline-treated cells (Figure 2-11).

In this system, the concentration of synthetic MBNL proteins can be precisely controlled as a function of doxycycline (0-60 ng/mL for MBNL AB, 0-2000 ng/mL for MBNL AA). In both cell lines, MBNL expression covered a broad range (Figure 2-12a, Figure 2-13). In contrast to the plasmid dosing system in HeLa cells, expression levels of MBNL AB and MBNL AA at matched doxycycline doses are statistically equivalent except at the highest dose, where MBNL AA expression levels were slightly increased (Figure 2-13). The differences in protein expression levels in the two systems are likely due to the presence of the N-terminal tag (GFP vs. HA) and possibly the different cellular environments.

Next we tested the AS activity of the synthetic MBNL proteins for 15 endogenous splicing events across a range of protein expression generated via
doxycycline gradient. These 15 endogenous events (9 inclusion, 6 exclusion) were selected from RNAseq data sets previously published from the Mbnl1/2 knockout MEFs (52). RT-PCR was then performed to determine PSI at each individual point along the protein gradient (representative images used to calculate PSI are shown in Figure 2-14) and plotted against log [MBNL] levels to generate dose-response curves (Figure 2-12a to 2-12c, Figure 2-15). EC50, slope, and values were then derived from these dose-response curves (Figure 2-16).

MBNL AB and MBNL AA displayed nearly identical dose-response curves with similar EC50 and slope values (Figure 2-12b to 2-12d and Figure 2-15). In most cases, the dose-response curves overlapped (Figure 2-12b, Figure 2-15c to 2-15g, 2-15j, and 2-15l). For a few select events, while the overall shape of the dose-response curves was similar, the minimal and maximal PSI for MBNL AB or MBNL AA were shifted (Figure 2-12c and Figure 2-15b, 2-15e, 2-15i, and 2-15k). These shifts in the curves did, for some events (Mta, Add3, and Exoc1), result in increased EC50 values for MBNL AA compared to MBNL AB (Figure 2-16a and 2-16b). Depdc5 (Figure 2-12d) was the only event for which MBNL AA showed a significantly lower EC50 and reduced slope compared to MBNL AB, the same pattern of activity displayed in the HeLa plasmid dosing system. Overall, MBNL AB and MBNL AA showed similar activities across many splicing events, suggesting these two proteins regulate most or all splicing events with similar activities.

Synthetic MBNL1 Proteins Possess Distinct RNA Binding Specificities

To determine if enhanced or disrupted RNA binding correlated with the observed splicing activities of MBNL AA and MBNL BB, electrophoretic mobility shift assays
(EMSAs) were performed with purified MBNL proteins and short model RNAs. The first substrate tested was a CUG4 RNA, which contains two UGCU motifs (Figure 2-17a) and is predicted to form a short hairpin designed to mimic the structure CUG repeats are proposed to adopt in DM1 (67). Surprisingly, MBNL AB and MBNL BB possessed nearly identical binding affinities to the CUG4 RNA while MBNL AA had a slightly higher KD (Figure 2-17b). As expected, all three proteins had no observable binding to the CAG4 RNA substrate (Figure 2-17a), in which the UGCU motifs were mutated to AGCA to weaken MBNL1 RNA interactions (Figure 2-17b) (all KD values > 2500 nM).

Second, we assayed binding to NV11, a 24-nucleotide, single-stranded RNA substrate that serves as a model for sites in pre-mRNAs with minimal RNA structure (75). This RNA contains two GC dinucleotides separated by an eleven-uridine spacer, creating two UGCU binding motifs (Figure 2-17a) (75). In conjunction we tested the NV2CC substrate in which both GC dinucleotides are mutated to CC (Figure 2-17a). This modification to the sequence leads to disruption of the YGCY binding motifs (UGCU to UCCU) and has been shown to significantly weaken MBNL1 binding (75). MBNL AB and MBNL AA had nearly identical, low nanomolar binding affinities to the NV11 substrate (Figure 2-17d and 2-17e). In a manner similar to CAG4, both proteins displayed a substantial decrease in RNA binding affinity to the NV2CC construct (Figure 2-17d and 2-17e). Differential protein-RNA complex migration was observed for NV2CC compared to NV11. We suggest these differences are due to alterations in the binding mode of the MBNL proteins for specific vs. non-specific binding modes. Overall, these results indicate that MBNL AB and MBNL AA both recognize YGCY motifs with
relatively high levels of specificity (59-fold increased recognition of specific motifs for MBNL AB and 18-fold for MBNL AA).

MBNL BB exhibited a 6-fold decrease in RNA binding affinity for the NV11 RNA substrate compared to MBNL AB (Figure 2-17d and 2-17e). Interestingly, MBNL BB only displayed a 2-fold decrease in RNA binding affinity for NV2CC compared to NV11 (Figure 2-17d and 2-17e). This result indicates that MBNL BB partially lost the ability to specifically recognize target YGCY motifs in the context of a pyrimidine-rich RNA. This pattern is significantly different from both MBNL AA and MBNL AB, as both proteins exhibit high affinities for NV11 with significantly increased KD values for NV2CC (Figure 2-17e). Overall, these data suggest that MBNL BB is primarily a non-specific RNA binding protein.

Although we originally predicted that the differences in splicing activity observed between the synthetic MBNL1 proteins would be due to changes in RNA binding affinity, the results from our EMSA analysis suggested that differences in RNA binding specificity might be responsible. To more broadly test the RNA binding specificities of the synthetic MBNL proteins, we performed RNA Bind-n-Seq (RBNS), a comprehensive, next-gen sequencing-based approach to characterize sequence specificity of RBPs (66). MBNL AB, MBNL AA, and MBNL BB were incubated at increasing concentrations with a pool of random 40-mer RNAs. The bound RNA, as well as a sample of the unprocessed input RNA, was then used to produce cDNA libraries for deep sequencing (66). Following sequencing, for each protein at each of the tested concentrations, motif read frequency of the kmer in the experimental pool as compared to that of the input RNA
library. Using this approach, a higher R value is indicative of increased enrichment of a specific motif in the bound RNA pool, where R = 1 indicates no significant enrichment.

First, we compared data from our RBNS analysis of MBNL AB to that of a previously published RBNS MBNL1 data set (66). We observed many of the same top 7mers in both RBNS analyses as well as similar R values with correlations across the range of protein concentrations (Figure 2-18b and 2-18c). This indicates that (i) our MBNL AB protein had similar levels of binding activity compared to the truncated MBNL1 protein utilized in other independent studies and (ii) there is only a modest difference between the experiments, likely due to the use of different tags in the experimental methodology and changes in the washing step of the protocol (GST vs. a streptavidin binding peptide (66); see Materials and Methods for additional information about experimental design).

Next, we compared the unimodal enrichment profiles of the top four 7mers for MBNL AB and the two synthetic MBNL derivatives (Figure 2-19a to 2-19c). Analysis of these plots revealed several interesting patterns. First, the top three kmers identified were the same for MBNL AB and MBNL AA (GCUUGCU, CGCUUGC, and UGCUUGC). All three kmers contain either YGCU or GCUU motifs, with the top kmer of GCUUGCU containing both motifs. Overall, there was significant overlap in the top 50 kmers identified for both proteins as well as similar nucleotide occurrence in the selected motifs (Figure 2-19d and 2-19e). This indicates that both MBNL proteins recognize and bind similar RNA motifs. Interestingly, the R values for many top kmers are significantly increased for MBNL AA compared to MBNL AB (10 versus 7), albeit at different protein
concentrations (500 nM vs. 250 nM) (Figure 2-19b and 2-19a). Although this is the most striking difference between these two proteins, the overall pattern is that at lower concentrations, MBNL-AA has lower R values compared to MBNL-AB until these R values dramatically increase at 500 nM and then drop to nearly identical levels at higher protein concentrations (Figure 2-19b). In contrast, R values for MBNL-AB increase modestly at lower concentrations, peaking at 250 nM and then staying relatively constant (Figure 2-19a). This overall pattern suggests that higher concentrations of MBNL-AA may be needed to achieve specific sequence binding relative to MBNL-AB, potentially due to loss of cooperative binding, as was suggested by the shallower slopes of the splicing dose-response curves generated using minigene reporter substrates (assuming RNA binding correlates with splicing). Despite changes in the shape of the enrichment profiles, there is high correlation in the R values across the protein gradient (Figure 2-18a). Overall, RBNS analysis indicates that while MBNL-AA and MBNL-AB bind and recognize similar RNA motifs, MBNL-AA has increased RNA binding specificity for many of these motifs.

MBNL-BB selected a different set of top kmers that contain fewer uridines (GCGCUGC, GCUGCGC, CGCUGCU, and CUGCUGC) (Figure 2-19c). Percent nucleotide occurrence within the top 100 kmers showed a reduction of uridines and a modest enrichment in guanosines and cytosines (Figure 2-19d). Due to this change in the distribution of nucleotides in the enriched MBNL-BB kmers, fewer YGCU and UGCU motifs were identified. As such, fewer overlapping motifs were found between the top 50 kmers of MBNL-BB and the other MBNL proteins (Figure 2-19e). Consistent with modest RNA binding specificity, the enrichment profiles for MBNL-BB have low R
values across the gradient of protein concentrations, peaking at R = 3 at 1000 nM for the most enriched motifs (Figure 2-19c).

Overall, the EMSA and RBNS data for MBNL-BB indicate that this synthetic MBNL1 has significantly reduced RNA binding specificity while maintaining general RNA binding affinity. This is in sharp contrast to MBNL-AA, for which RBNS revealed enhanced YGCY RNA sequence recognition over MBNL-AB. These overall changes in binding specificity are consistent with a model in which ZF1-2 confers specific sequence recognition while ZF3-4 acts as a more general RNA binding domain in the context of MBNL proteins and our synthetic derivatives.

Synthetic MBNL1 Proteins Rescue CUG-Dependent Mis-Splicing in a DM1 Cell Model

Given the differences in splicing activity and RNA binding specificity of the synthetic MBNL proteins, we sought to determine if these proteins could rescue CUG-mediated mis-splicing like that found in DM1. This was accomplished by expressing CUG repeats in culture from a plasmid containing 960 interrupted CTG repeats (DMPK-CTG960, a.k.a. CUG960) (132). Interrupted repeats were used due to challenges with the instability of pure repeats (133). Similar to previous studies, the use of these long repeats with interruptions in HeLa cells leads to MBNL co-localization with CUG repeat RNA in nuclear foci and mis-splicing of MBNL-regulated minigenes (64, 67, 131). The same plasmid dosing system used previously (Figure 2-10) was used with co-transfection of the CUG960 repeat plasmid, minigene reporter, and synthetic MBNL the MBNL1 and ATP2A1 minigenes
in Figure 2-20). These results were also compared to those generated in the absence of CUG repeat RNA expression (Figure 2-21). Co-expression of CUG960 led to reduced splicing activity for all three proteins at low protein levels (Figure 2-22a and 2-22b), presumably due to sequestration of endogenous MBNL proteins. At higher protein expression levels, all three MBNL the absence of CUG repeats (Figure 2-22a and 2-22b, Figure 2-21a and 2-21b). The addition of CUG repeats had no effect or only a modest effect on the EC50 values for all three proteins and the two minigene reporters (Figure 2-22c). An increase in the slopes of the dose-response curves for MBNL-AB for both events studied was significant (Figure 2-22c). The overall effects on the dose-response curves with the addition of CUG repeats are consistent with a model in which, at low levels of MBNL, all of the protein is sequestered by the CUG repeats and unable to regulate splicing. As the concentration of MBNL increases, binding to the CUG repeats is saturated, leading to a replenishment of free, active MBNL in the nucleus and effective splicing rescue. Despite the changes in the dose-response curves in the presence of toxic RNA, MBNL-AA remained the most active protein (lowest EC50 values) for both minigene reporters (Figure 2-22c).

Discussion

Synthetic MBNL1 Proteins with Altered RNA Binding Specificity have Differential Splicing Activity

To gain insight into the function of the individual ZF domains, we utilized a synthetic biology approach to engineer and biochemically characterize two synthetic MBNL proteins with altered ZF domain content, i.e., MBNL-AA and MBNL-BB. Using this system, we determined that ZF1-2 has increased RNA binding specificity over ZF3-4
that led to enhanced AS activity of the synthetic MBNL-AA protein. Additionally, we showed that MBNL-AA was capable of rescuing CUG-dependent mis-splicing in a DM1 cell model at lower protein concentrations than MBNL-AB, indicating that these synthetic proteins could potentially be used as therapeutics to replace or displace sequestered MBNL from foci in DM patient cells.

Splicing assays, in vitro EMSAs, and RBNS analysis revealed that MBNL-AA is a more active derivative of MBNL1 and can regulate AS of RNA targets at reduced protein concentrations (Figure 2-10 and Figure 2-22). Overall, MBNL-AA had either equivalent or 5-fold increased activity compared to MBNL-AB depending on the splicing assay and cell system utilized. We predict that the differences in activity observed for MBNL-AA in the transfection experiments versus the inducible expression system in the mbnl 1/2 double knockout MEFs are likely due to (i) differences in the N-terminal tag (HA vs. GFP, respectively) and/or (ii) assessing AS with minigene events versus endogenously expressed pre-mRNA substrates in the two systems. Fusion of MBNL-AB and MBNL to load onto RNA substrates for effective regulation. Additionally, it is possible that the differences in MBNL-AB and MBNL-AA activity are magnified with RNA substrates at high cellular concentrations produced from transfected minigenes. Overall, these results indicate that protein modifications (tags), cellular environment, and substrate concentrations can affect MBNL1 protein activity, but that RNA binding specificity of MBNL1 is a primary determinant of splicing regulation.

In contrast to MBNL-AA, MBNL-BB, while still functional, had 4-fold weaker activity compared to MBNL-AB. We predicted that these variations in splicing activity
would be due to altered binding affinity to RNA targets, but we found through EMSA and RBNS studies that changes in RNA specificity appear to be primarily responsible for the observed alterations in splicing regulation (Figures 2-17 and 2-19). MBNL-AA retained the ability to recognize YGCY motifs with increased specificity compared to MBNL-AB (Figure 2-19). In contrast, MBNL-BB had very low RNA binding specificity overall with diminished recognition of canonical YGCY motifs (Figure 2-19).

Overall, our working model (summarized in Figure 2-23) is that in the context of the canonical arrangement of ZF domains in MBNL proteins (MBNL-AB), ZF1-2 drives splicing activity via specific binding to YGCY motifs in the appropriate sites of pre-mRNA substrates. ZF3-4, with its modest preference for YGCY motifs, will sample and bind many RNA motifs, providing general binding affinity for MBNL. We propose that MBNL-AA, with two high-specificity domains, has heightened recognition of MBNL1 YGCY regulatory elements, leading to increased splicing activity. In contrast, our model is that MBNL-BB will bind many off-target sites, leading to reduced occupancy at the sites needed for regulation of AS by MBNL1 and resulting in the need for high concentrations of this protein for splicing regulation. In this model, the addition of a third ZF1-2 domain to create the synthetic protein MBNL-AAA would lead to enhanced interactions with multiple YGCY motifs and improved specificity of RNA recognition and regulation.

The results from the cell-based assays are consistent with our proposed model of AS regulation by the synthetic MBNL proteins. In general, MBNL-BB weakly regulated all tested minigene events, but achieved nearly complete splicing regulation with MBNL1 and ATP2A1. Both events possess many functional clusters of YGCY motifs
(64, 134), and MBNL-BB, with its low specificity, may bind the high-density sites with sufficient occupancy to regulate splicing. The TNNT2 substrate contains only two UGCU motifs separated by the polypyrimidine tract within intron 4 (67), and these two sites may not be sufficient to recruit MBNL-BB to this substrate, accounting for the acutely weak regulation of this event. Alternatively, it is possible that the synthetic MBNL proteins have altered recognition of specific RNA structural elements. Recognition of a structured element within the TNNT2 pre-mRNA and its subsequent disruption into a single-stranded segment is proposed to be required for regulation by MBNL1 (67, 77). MBNL-AA may have enhanced recognition of structured RNAs, which may account in part for its increased splicing regulation of this event. An alternative possibility is that the ZF1-2 domain interacts with other splicing factors that bind the TNNT2 pre-mRNA and the duplication of the ZF1-2 domain improves recruitment and leads to the higher level of splicing activity observed (Figure 2-5d and Figure 2-10c).

ZF1-2 And ZF3-4 Possess Distinct RNA Binding Activities that Modulate MBNL1 Activity

Previous work in the field had attempted to determine the activities and RNA binding of the individual ZF1-2 and ZF3-4 domains and how each contributes to the overall function of MBNL1 (74). Truncated MBNL1 proteins possessing only ZF1-2 or ZF3-4 bind weakly to RNA substrates, suggesting that the tandem ZFs work cooperatively to bind their RNA targets (135). Other studies have utilized point mutations to eliminate the RNA binding function of one ZF pair while leaving the other functional (74). It was shown using this strategy that ZF3-4 binds RNA with lower affinity than ZF1-2 (74). While these results are consistent with our binding studies and RBNS analysis, previous studies had not addressed questions of RNA binding specificity. The
duplication of the ZFs in this study was important because it allowed us to study MBNL proteins that maintained nanomolar binding affinity for RNA targets and revealed significant differences in specificity. This duplication strategy should be useful for other domains that bind RNA with low affinity in isolation, assuming the domains operate independently.

Overall, the results with our synthetic MBNL proteins indicate that ZF1-2 and ZF3-4 are independent domains and can be reorganized without obvious negative impacts on protein function. Our studies indicate that ZF1-2 drives splicing regulation (Figures 2-5, 2-10, and 2-12) via specific recognition of YGCY motifs (Figure 2-19). This activity is consistent with observations that MBNL1 orthologs in D. melanogaster and C. elegans, which contain only a single ZF pair orthologous to ZF1-2, can regulate exon inclusion in a manner similar to human MBNL1 in cell culture (52). The differential protein levels of the synthetic proteins (Figure 2-2b and 2-2c, Figure 2-4) suggest that the ZF3-4 domain may confer stability to MBNL1. Consistent with this hypothesis, increased mammalian (Figure 2-2b and 2-2c, Figure 2-4) and bacterial protein expression levels were observed for MBNL-BB; in contrast, levels of MBNL-AA were reduced in both systems except for the inducible MEF system, where fusion to a GFP tag may stabilize relative protein levels. Interestingly, in many of the immunoblots performed in this study, two bands for MBNL-BB and, to a lesser extent, MBNL-AB can be visualized. These bands could represent incorrectly processed protein or differential levels of post-translational modifications, which may in part account for the MBNL reduced activity.
We propose that the differential activities of ZF1-2 and ZF3-4 observed in this study, namely the differences in RNA binding specificity and splicing activity, are due to subtle changes in the architecture and sequence of the ZF domains (Figure 2-1b), specifically the N-terminal helical turn of ZF1 and the slightly extended C-terminal helix of ZF2, both of which are absent in the ZF3-4 domain. Both of these structural elements were shown to be important for coordinating RNA in the binding pocket observed in the NMR solution structure of ZF1-2 bound to a short, human TNNT2 RNA fragment containing YGCY MBNL1 binding sites (77). We propose that the coordination and packing of RNA along the extended C-terminal helical element aids in specific RNA recognition, making this region of the ZF act as an RNA discrimination domain. Although the structures of ZF1-2 and ZF3-4 are similar (76, 77), due to the absence of the N-terminal helical turn in ZF3 and the shortened C-terminal helical region of ZF4 (76, 77), we predict that specific RNA binding by this domain is diminished, potentially due to reduced RNA coordination in the predicted binding pocket. These changes in domain architecture between ZF1-2 and ZF3-4 are conserved in all three human MBNL homologs (MBNL1 / MBNL2 / MBNL3) (see Figure 2-1c for sequence alignment). We predict that the differences in activity between these domains are maintained across MBNL1, MBNL2, and MBNL3 and potentially more broadly across MBNL proteins throughout metazoans (52).

Modular Architecture of MBNL1 ZF Domains Provides a Unique Platform For RNA Recognition

Although RBPs have a broad range of functions, they are often built from relatively few RNA binding domains. To increase the functionality and specificity of their target interactions, multiple RNA binding domains are frequently found in RBPs. A
classic example of this is the Pumilio (PUF) family of RBPs, where up to 8 tandem domains that each recognize a single nucleotide can be combined in a single polypeptide chain to create a highly specific RNA interaction surface (136). In a similar manner, we propose that the modular architecture of MBNL1 with its two tandem ZF pairs increases the protein-RNA binding surface. Our working model is that ZF1-2 drives splicing regulation through specific recognition of YGCY motifs and the ZF3-4 domain binds secondarily to a broader range of motifs to allow MBNL1 to recognize a wide range of substrates (Figure 2-23). Additionally, the domain organization and differences in RNA binding specificity between the ZF pairs may explain the relative levels of cooperativity observed in the dosing experiments (Figures 2-10, 2-12, and 2-22), assuming binding of the MBNL proteins correlates with splicing. One possible mechanism for cooperativity is that binding of one MBNL protein facilitates the binding of one or more additional MBNL proteins or other splicing factors to a pre-mRNA substrate. These additional binding events mediated by MBNL may shift splicing decisions over tighter protein gradients compared to the two synthetic MBNL proteins utilized in this study, which in general have less cooperative splicing curves.

This model of a modular RBP containing multiple domains, one for specific RNA recognition and the other with broader target binding, has been utilized by several other RBPs, including those containing CCCH zinc finger motifs. One example is the neuronal protein Unkempt, a highly conserved RBP that binds to its target mRNAs to reduce translation and is required for the establishment of neuronal morphology in development (137, 138). Unkempt contains six CCCH ZF domains that form two tandem clusters, each with three ZFs [ZF1-3 and ZF4-6] (137, 138). Both CLIP and structural data
confirm that ZF1-3 binds to a UAG trinucleotide while ZF4-6 binds to a more variable U-rich motif (137, 138). Mutational analysis of RNAs bound to Unkempt in vitro revealed that the UAG motif was mandatory for binding while alterations to the downstream U-rich element were more tolerated (138). These data suggest that, in a manner similar to MBNL1 ZF1-2, Unkempt ZF1-3 drives RNA recognition via binding to the UAG motif, while binding to the U-rich motif by ZF4-6 allows for recognition of a wider array of RNA substrates in a manner similar to MBNL1 ZF3-4. The similar modes of MBNL1 and Unkempt RNA interactions indicate that this might be a common strategy for RNA binding proteins with ZF domains.

Engineered MBNL1s as Protein Therapeutics in Neuromuscular Disorders

The creation of designer RBPs has increased over the past several years as a means to modulate RNA function (139, 140). Although the traditional methodology of engineering these proteins often focuses on combining domains to target a specific RNA sequence of interest, such as with the PUF proteins (139, 140), we chose to utilize a different synthetic design strategy. We focused on enhancing the pre-existing activity and specificity of MBNL by recombining its ZF domains. No such designer RBP has previously been created that focuses on enhancement of protein function via duplication of specific modular RNA binding domains. This design strategy may be the most effective for engineering RBPs as protein therapeutics in which the function of a target protein is decreased or absent, such as in DM.

MBNL1 overexpression has been proposed as a therapeutic strategy in the DM field to ameliorate symptoms caused by the loss of free MBNL1 in CUG/CCUG RNA foci. Delivery of MBNL1 through adeno-associated virus (AAV) has been shown to rescue mis-splicing events in a DM1 mouse model and reverse disease-associated
symptoms in skeletal muscle, including myotonia (110). Additionally, MBNL1 overexpression has been shown to be well tolerated in non-disease mice (141), suggesting that therapies designed to increase levels of free, active MBNL1 in the cell could be an effective strategy for treatment of DM. Delivery of a synthetic MBNL with increased activity, such as MBNL-AA, could be a powerful approach to correct disease-specific mis-splicing. The use of a synthetic MBNL as a protein therapeutic may be ideally suited for Fuchs Endothelial Corneal Dystrophy, where the protein would only need to be delivered to a single tissue, the eye (117).

Our work to create a synthetic MBNL serves as a proof of principle that MBNL proteins tolerate domain reorganization and can be manipulated while retaining function. Further rational design strategies to modify MBNL could be utilized moving forward to create a smaller, more stable, and higher activity synthetic MBNL for use in disease therapies. Overall, our studies indicate that engineered RNA binding proteins with improved splicing activity may represent a therapeutic avenue for DM and other microsatellite diseases.

Materials and Methods

Protein Design, Synthesis, and Cloning

The wild-type (WT) MBNL1 protein (amino acids 1-382; splice isoform a; NCBI accession number NP_066368) was used as a template for the construction of the MBNL-AB, MBNL-AA, and MBNL-BB constructs. Due to the difficulty of purifying MBNL1 with the C-terminal region (amino acids 261-382) and to reduce the size of our synthetic proteins, we chose to exclude this portion of the protein in our synthetic design. Previous studies have shown that the C-terminal region is not required for high-affinity RNA binding (67, 142). MBNL-AB was created using primers to add the N-terminal HA tag and the C-terminal nuclear localization signal (SV40 NLS). The
sequence of MBNL-AA and MBNL-BB was synthesized (GenScript). All three proteins were cloned into pCI (Promega) for mammalian expression and pGEX-6P-1 (Amersham) for bacterial protein expression using XhoI and NotI sites. The amino acid sequences of all MBNL constructs are reported in Figure 2-1a.

Creation of Stable, Inducible Synthetic MBNL Expression Cell Lines

N-terminal GFP-tagged constructs encoding MBNL-AB and MBNL-AA with the HA tag removed were cloned into PB-PuroTet, a vector containing PiggyBac Transposon sequences (143) flanking a PGK-driven puromycin cassette and a minimal CMV promoter downstream of a TetR response element (TRE) to drive doxycycline-inducible expression of the GFP-MBNL construct. The In-Fusion cloning system tagged constructs into the PB-PuroTet vector. At 60% confluency in 6-well plates, mouse embryonic fibroblasts (MEFs) deficient in MBNL1 and MBNL2, gifted by Maurice Swanson, were transfected with 1 µg of PB-PuroTet vector encoding GFP-tagged MBNL-AB or MBNL-AA, 1 µg of PB-Tet-On Advanced (a vector containing PiggyBac Transposon sequences (143) flanking rtTA-Advanced (Clontech) under CMV-driven expression as well as a puromycin selection cassette), and 1 µg of PiggyBac transposase (total = 3 µg) using TransIT instructions. After 24 hours the cells were subjected to pu allowed to recover for several days, and then exposed to 1000 ng/mL doxycycline (Sigma) for 24 hours. Cells were then sorted for high GFP expression using the SH800S Cell Sorter (Sony). Individual clones were isolated and the populations expanded in the presence of puromycin. Individual clones for each cell line were
selected for experimental use based on GFP-MBNL expression across a range of doxycycline concentrations.

Cell Culture and Transfection

HeLa cells were cultured as a monolayer in Dulbecco's modified Eagle's medium (DMEM) Glutamax (Gibco) supplemented with 10% fetal bovine serum (FBS) and 1X antibiotic-antimycotic (Gibco) at 37 °C under 5% CO2. Prior to transfection, cells were plated in twelve-well plates at a density of 8 x 10^4 cells/well. Cells were transfected approximately 36 hours later at roughly 80% confluency. Plasmids (400 ng/well) were transfected using 2 protocol. Cells were placed in Opti-MEM I reduced serum media (Gibco) at the time of transfection. Six hours later the Opti-MEM I was replaced with our supplemented DMEM. 18 hours post medium exchange, cells were harvested using TrypLE (Gibco) and pelleted using centrifugation.

For overexpression cell-based splicing assays (Figure 2-5) and Western blots (Figure 2-2b), 200 ng of protein plasmid or empty pCI vector (mock) were co-transfected with 200 ng of minigene. In the context of the plasmid dosing system (both splicing assays and Western blots) (Figure 2-10 and Figure 2-8), 200 ng of a selected minigene was co-transfected with increasing amounts of protein expression plasmid up to 200 ng. In cases where less than 200 ng were transfected, empty pCI vector was used to make up the remainder of the total 400 ng transfected. When plasmid dosing was performed in the context of CUG repeat RNA (Figure 2-22), the amount of protein expression vector remained unchanged from previous dosing experiments, but only 100 ng of the selected minigene was transfected with 100 ng of a DMPK-CUG960 expressing plasmid (34).
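The plasmid bookkeeping for the HeLa dosing experiments described above can be sketched in a few lines of Python. This is only an illustration: the function name `dosing_mix`, the dictionary keys, and the example doses in the loop are hypothetical and do not come from the original protocol, and the sketch assumes the total stays at 400 ng per well when the CUG960 plasmid is included.

```python
# Illustrative sketch only; names and example doses are hypothetical.
TOTAL_NG = 400  # total plasmid DNA per well in the HeLa dosing experiments


def dosing_mix(protein_ng, with_cug_repeats=False):
    """Return the ng of each plasmid for one well of the dosing series."""
    if not 0 <= protein_ng <= 200:
        raise ValueError("protein plasmid must be between 0 and 200 ng")
    if with_cug_repeats:
        # Assumption: 100 ng minigene + 100 ng CUG960 plasmid replace the
        # 200 ng of minigene, so the total remains 400 ng.
        mix = {"minigene": 100, "CUG960": 100}
    else:
        mix = {"minigene": 200}
    mix["protein"] = protein_ng
    # Empty pCI vector makes up the remainder of the total.
    mix["empty_pCI"] = TOTAL_NG - sum(mix.values())
    return mix


if __name__ == "__main__":
    for dose in (0, 50, 100, 200):  # example doses, not the actual series
        print(dose, dosing_mix(dose), dosing_mix(dose, with_cug_repeats=True))
```

Keeping the total DNA constant with empty vector means that differences in splicing across the series reflect the amount of protein expression plasmid rather than the total amount of DNA transfected.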
MEFs were regularly maintained in DMEM Glutamax supplemented with 10% 37 °C under 5% CO2. To assay endogenous splicing regulation (Figure 2-12b to 2-12d, Figure 2-15) and GFP-MBNL expression levels via Western blot (Figure 2-12a and Supplemental Figure 2-13), cells were plated in twelve-well plates at a density of 6 x 10^4 cells/well. After 24 hours, fresh doxycycline was prepared at 1 mg/mL, diluted, and then added to the cells at the appropriate concentrations to induce a range of GFP-MBNL protein expression. 24 hours post doxycycline treatment, cells were harvested using TrypLE and pelleted using centrifugation.

HEK 293 cells (Flp-In T-REx 293, Invitrogen) were routinely cultured as a monolayer in DMEM Glutamax (Gibco) supplemented with 10% fetal bovine serum and 10 blasticidin / 300 zeocin at 37 °C under 5% CO2. Prior to transfection, cells were plated in twenty-four-well plates at a density of 1.5 x 10^5 cells/well. Cells were transfected 24 hours later at roughly 80% confluency. Plasmids (500 ng/well) were transfected using 1.5 293 (Mirus all overexpression cell-based splicing assays (Figure 2-6) and Western blots (Figure 2-4) 250 ng of protein expressing plasmid or empty pCI vector (mock) were co-transfected with 250 ng of minigene reporter. 24 hours post transfection, cells were harvested using TrypLE (Gibco) and pelleted using centrifugation.

Western Blot Analysis

Cell pellets were lysed in RIPA (25 mM Tris-HCl pH 7.6, 150 mM NaCl, 1% NP-40, 1% sodium deoxycholate, 0.1% SDS) (ThermoFisher) supplemented with 1 mM phenylmethylsulfonyl fluoride and 1X protease inhibitor cocktail (SigmaFAST, Sigma) by light agitation for 15 minutes via vortex. Equal amounts of lysate were resolved on a 10
% (HeLa / HEK 293 cells, Figure 2-2b and Figure 2-4a) or 4-15% (MEF cells, Figure 2-12a) SDS-PAGE gel prior to transfer. For blots with lysates from HeLa / HEK 293 cells, MBNL proteins were probed using a mouse anti-HA antibody (1:1000 dilution, 6E2, Cell Signaling Technology) and goat anti-mouse secondary IRDye 800CW (1:15000 dilution, LI-COR). A GAPDH loading control was probed using rabbit anti-GAPDH antibody (1:1000 dilution, 14C10, Cell Signaling Technology) followed by goat anti-rabbit IRDye 680RD (LI-COR). Blots from MEF lysates were probed with a rabbit anti-GFP antibody (1:1000 dilution, D5.1, Cell Signaling Technology) and a donkey anti-rabbit secondary IRDye 680RD (1:15000 dilution, LI-COR). A GAPDH loading control was probed using a chicken anti-GAPDH antibody (1:2000 dilution, ab14247, Abcam) followed by a donkey anti-chicken secondary 800CW (1:15000 dilution, LI-COR). In both systems, fluorescence was measured using a LI-COR Odyssey Fc or LI-COR Odyssey CLx Imaging instrument. Quantification was performed using the associated Image Studio analysis software (LI-COR).

Cell-based Splicing Assay

RNA was isolated from HeLa and HEK 293 cells using an RNeasy kit (Qiagen) or Aurum Total RNA Mini kit (Bio-Rad). The isolated RNA was processed via reverse transcription (RT) for each minigene event upon protein or mock treatment was determined as previously described (74) (Figure 2-5 and Figure 2-6). The only difference from this previously published protocol was that SuperScript IV (Invitrogen) was utilized for some RT steps. Additionally, some cDNA samples were visualized and the percent exon inclusion (Ψ) values determined using the Fragment Analyzer (DNF-905 dsDNA 905 reagent kit, 1-500 bp, Advanced Analytical Technologies) and associated ProSize data analysis
visualized using 6% native gels and SYBR green I nucleic acid stain (Invitrogen) and the Fragment Analyzer system. In the plasmid dosing system with and without CUG repeat RNA expression (Figure 2-10 and Figure 2-22), Ψ values were plotted against relative MBNL levels as determined by immunoblot (Figure 2-8) and fit to a four-parameter dose curve ($\Psi = \Psi_{min} + (\Psi_{max} - \Psi_{min}) / (1 + 10^{(\log(EC_{50}) - \log[MBNL1]) \cdot slope})$). Parameters that correlate to biological data, i.e., concentration (EC50) and steepness of response (slope), were then derived from these curves (Figure 2-10d and Figure 2-22c).

RNA was isolated from MEF cells using the Aurum Total RNA mini kit and DNase treated on column. 1000 ng of DNase-treated RNA was reverse transcribed with SuperScript IV with random hexamer priming according to th half of the recommended SuperScript IV was utilized. cDNA was then PCR amplified for 25-32 cycles using flanking exon-specific primers. Primer sequences, annealing temperatures, and inclusion and exclusion product sizes in base pairs are listed in Table 2-1. Samples were visualized and quantified using the Fragment Analyzer system. Ψ values were plotted against MBNL levels (Figure 2-12b to 2-12d, Figure 2-15) as determined by Western blot (Figure 2-12a and Figure 2-13) relative to GAPDH and fit to a four-parameter dose curve as described above. Parameters that correlate to biological data, i.e., concentration (EC50) and steepness of response (slope), were then derived from these curves (Figure 2-16).

Protein Expression and Purification

All proteins were expressed as N-terminal glutathione S-transferase (GST) fusions. Using BL21 Star (DE3) cells (Invitrogen), protein expression was induced using 0.5 mM IPTG at an OD600 = 0.6-0.7 for 2 hours at 37 °C. Following induction, cells were
lysed in B-PER (bacterial protein extraction reagent) (Pierce) supplemented with DNase I (5 U/mL) and lysozyme (0.1 mg/mL) for 30 minutes at room temperature. The lysate was then diluted with 1 volume of 1X PBS and incubated for 30 minutes on ice prior to centrifugation at 17,000 rpm. The supernatant was isolated and mixed with glutathione agarose (Sigma) for 2 hours at 4 °C. The resin was washed twice with 5 volumes of GST buffer (40 mM bicine pH 8.3, 50 mM NaCl), twice with 5 volumes of GST buffer supplemented with 1 M NaCl, and finally 3 times with 5 volumes of GST buffer 20 mM NaCl. GST-tagged MBNL-AB, MBNL-AA, and MBNL-BB were then eluted with 10 mM glutathione in GST buffer 20 mM NaCl. The resulting elution was then concentrated and dialyz ME), 50% glycerol). Final purity of the proteins was assessed via SDS-PAGE gel analysis and no significant differences were detected. The GST-MBNL fusions were the most prominent band with very few non-specific carryover products from the purification. Working concentrations were determined via the Pierce 660 nm protein assay reagent using BSA standards.

RNA Radiolabeling and Electrophoretic Mobility Shift Assays (EMSAs)

end labelled using T4 PNK (NEB 32 P] ATP. All RNAs were purified on 10% polyacrylamide denaturing gels. Prior to incubation with protein, these RNAs were denatured by incubation at 95 °C for 2 minutes followed by a 5-minute incubation on ice. Once cooled, the RNA was mixed with increasing concentrations of protein (final volume = 10 µL) to ME, 0.01 mM EDTA, 10% glycerol, 5 mM MgCl2, 0.1 mg/mL heparin, 2 mg/mL bovine serum albumin (BSA), and 0.02% xylene cyanol. This protein-RNA mixture was incubated for
30 minutes at room temperature for binding to reach equilibrium prior to electrophoresis. 3 µL of the sample was then loaded on a pre-chilled, 1.5 mm, 6% native acrylamide (37.5:1) gel and run for 45 minutes at 150 V at 4 °C. Gels were dried for overnight exposure on phosphorus plates (Figure 2-12b and 2-12d). Binding curves were quantified using ImageQuant software (GE Healthcare Life Sciences). The fraction of RNA bound was calculated as the signal from all RNA-protein complexes divided by the total RNA signal in each lane. The apparent Kd was then determined using the following equation: $f_{bound} = f_{max} \cdot [MBNL] / ([MBNL] + K_d)$ (Figure 2-17c and 2-17e).

In Vitro Transcription of RNA Bind-N-Seq (RBNS) Random Input RNA

RBNS random input RNA was prepared by in vitro transcription using the RBNS ACACTCTTTCCCTACACGACGCTCTTCCGATCT(N) 40 GATCGGAAGAGCACACGTCT GAACTCCAGTCAC CCTATAGTGAGTCGTATTA a DNA oligo containing a random 40mer sequence flanked by priming sites for the addition of Illumina adaptors and the T7 promoter sequence. To artificially create a double-stranded T7 promoter, a T7 oligo TAATACGACTCACTATAGGG 3 template corresponding to the T7 promoter sequence by heating the template and T7 oligo in equal proportions up to 95 °C and cooling down at a rate of 0.1 °C per second to 45 °C. The RBNS input RNA pool was then in vitro transcribed using the HiScribe T7 in vitro transcription kit (NEB). The produced RNA was then bead purified using AMPure XP RNase-free beads (Beckman Coulter Inc.).

RBNS and Computational Analysis

RBNS was performed using the same proteins purified as GST fusions for EMSAs. Eight concentrations of each MBNL protein (nM = 0, 16, 32, 125, 250, 500,
1000, 2000), including a no-MBNL condition, were equilibrated in binding buffer (25 mM Tris pH 7.5, 150 mM KCl, 3 mM MgCl2, 0.01% Tween 20, 1 mg/mL BSA, 1 mM DTT, )) for 30 minutes at room temperature. In vitro transcribed RBNS random input RNA was then added to a final concentratio SUPERase In (Ambion) and incubated for 1 hour at room temperature. During this incubation, 50 µL aliquots of glutathione magnetic agarose beads (Pierce) were washed four times with 0.2 mL of wash buffer (25 mM Tris pH 7.5, 150 mM KCl, 60 BSA, 0.5 mM EDTA, 0.01% Tween 20). The beads were then placed in 50 µL of binding buffer until needed. To pull down the tagged MBNL and interacting RNA, each RNA/protein solution was added to 15 µL of equilibrated and washed glutathione magnetic agarose beads and incubated for 1 hour at room temperature. Unbound RNA was removed by washing the beads 3 times with 0.2 mL of wash buffer. The beads were incubated at 70 °C for 10 minutes in 100 µL of elution buffer (10 mM Tris pH 7.0, 1 mM EDTA, 1% SDS) and the eluted material (bound RNA) collected with AMPure XP RNase-free beads. The RNA was then reverse transcribed into cDNA using SuperScript IV ACTGACCTCAAGTCTGCACACGAGAAGGCTAG was also reverse transcribed to control for any nucleotide biases in the input library. Illumina sequencing libraries were prepared using primers with Illumina adaptors and unique sequencing barcodes (to allow for multiplexing all samples) to amplify the cDNA using Phusion high-fidelity DNA polymerase (NEB) for 16 amplification cycles. Primers used to index each sample library with unique barcodes are listed in Table 2-2. PCR products were bead purified using AMPure XP RNase-free beads. Sequencing
libraries corresponding to all concentrations of a given MBNL were pooled in a single lane and the random 40mer sequenced using the Illumina NextSeq 500. Motif (kmer) R values were calculated as the motif frequency in the selected RBP pool over the frequency in the input RNA library (Figure 2-19a to 2-19c). Frequencies were controlled for the respective library read depth. The overall rate of kmer enrichment in the no-protein condition relative to the input library was defined as the false discovery rate (FDR). More detailed methods and the theoretical assumptions utilized have been previously reported (66).

Immunofluorescence and Microscopy

To acquire immunofluorescence images in transfected HeLa cells, eight-well culture slides were treated with poly-lysine solution (Sigma) for 30 minutes at 37 °C. HeLa cells were then plated at 2 x 10^4 cells/chamber. Cells were transfected 24 hours post plating with 200 ng total plasmid (100 ng protein expression plasmid and 100 ng of empty pCI vector, 200 ng of pCI for mock) using 1 µL of Lipofectamine 2000 (Invitrogen) in Opti-MEM I reduced serum media (Gibco) at the time of transfection. Six hours later the Opti-MEM I was replaced with DMEM Glutamax (Gibco) supplemented with 10% fetal bovine serum (FBS) and 1X antibiotic-antimycotic (Gibco). 18 hours post medium exchange, cells were fixed for 10 minutes on ice with 4% paraformaldehyde. Cells were then permeabilized with 0.1% Triton X-100 in 1X PBS for 10 minutes at room temperature (RT). Next, cells were treated with Image-iT FX Signal Enhancer (Invitrogen) for 30 minutes at RT. The cells were probed overnight at 4 °C with mouse anti-HA antibody (1:100 dilution, 6E2, Cell Signaling Technology). After 3 washes in 1X PBS for 5 minutes at RT, cells were then probed with goat anti-mouse Alexa 488 (5 dilution, Invitrogen) for 1 hour at RT.
Finally, cells were mounted using Prolong Diamond Antifade Mountant with DAPI (Invitrogen). After the slides had cured, images were acquired using a Zeiss Axioskop 2 with equal exposures across all samples (Figure 2-3a).

To acquire images of MBNL proteins in doxycycline-treated MEFs, glass cover slips were placed in six-well plates and treated with poly-lysine solution for 30 minutes at 37 °C. MEFs were then plated at 1.5 x 10^4 cells/well. 24 hours later, fresh doxycycline (Sigma) was prepared at 1 mg/mL, diluted, and then added to the cells at the appropriate concentrations to induce GFP-MBNL protein expression. 24 hours post drug treatment, cells were fixed for 10 minutes on ice with 4% paraformaldehyde. Cells were then permeabilized with 0.1% Triton X-100 in 1X PBS for 10 minutes at room temperature (RT). Cells were then mounted using Prolong Diamond Antifade Mountant with DAPI. After the slides had cured, images were acquired using the EVOS FL Cell Imaging System (ThermoFisher) (Figure 2-11).

Real-time PCR

RNA was isolated from HeLa cells transfected with equivalent amounts of empty vector (mock) or plasmid expressing a synthetic MBNL protein using the Aurum Total RNA mini kit (Bio-Rad). 500 ng of RNA was DNAsed using Promega RQ1 RNAse Free SuperScript IV with random hexamer pri except that half of the recommended SuperScript IV was utilized. Real-time PCR analysis was then conducted using SsoAdvanced Universal SYBR Green Supermix (Bio-Rad) on a CFX96 Touch Real-Time PCR Detection System (Bio-Rad) as per the or reverse transcriptase. Primers utilized are as listed: MBNL-AB/MBNL-AA/MBNL-BB
(AGAGAAAGGTCGAATGAGCGG and TGCATTCTAGTTGTGGTTTGTCC) and GAPDH (AATCCCATCACCATCTTCCA and TGGACTCCACGACGTACTCA). No differences in amplification efficiency were detected for GAPDH or target primer pairs. Expression levels of MBNL were determined via normalization of the cycle threshold (Ct) to GAPDH. Calculations of fold change relative to MBNL-AB were determined via the formula $2^{-\Delta\Delta C_t}$.
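The fold-change calculation described above can be sketched in a few lines. This is a minimal illustration that assumes the standard comparative Ct form (target Ct normalized to GAPDH, then expressed relative to a calibrator sample such as MBNL-AB) and roughly equal amplification efficiencies for both primer pairs; the function name and the Ct values are hypothetical and are not taken from the experiments reported here.

```python
def relative_expression(ct_target, ct_reference, ct_target_cal, ct_reference_cal):
    """Fold change of a sample relative to a calibrator via 2^(-ddCt).

    ct_target / ct_reference: Ct of the target and reference (e.g. GAPDH)
    transcripts in the sample of interest.
    ct_target_cal / ct_reference_cal: the same two Ct values measured in
    the calibrator sample (assumed here to be MBNL-AB).
    """
    delta_ct_sample = ct_target - ct_reference              # normalize sample to reference
    delta_ct_calibrator = ct_target_cal - ct_reference_cal  # normalize calibrator to reference
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values, for illustration only.
fold_ab = relative_expression(22.0, 18.0, 22.0, 18.0)  # calibrator against itself
fold_aa = relative_expression(21.0, 18.0, 22.0, 18.0)  # one cycle earlier than calibrator
print(fold_ab, fold_aa)
```

A sample whose normalized Ct is one cycle lower than the calibrator comes out at a two-fold higher expression level, which is the behavior the formula is meant to capture.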
Figure 2-1. Synthetic MBNL protein amino acid sequences and MBNL1-3 zinc finger domain alignments. A) Sequences of the MBNL-AB, MBNL-AA, and MBNL-BB proteins are shown. The ZF1-2 domain, ZF3-4 domain, HA tag, and nuclear localization signal (NLS) are highlighted. B) Sequence alignment of MBNL1 ZF1-2 and ZF3-4 derived using MUSCLE (144). Amino acid residues shown to contact RNA in the crystal structure of ZF3-4 and NMR solution structure of ZF1-2 are highlighted (76, 77). Numbering above and below sequences corresponds to ZF1-2 and ZF3-4, respectively. C) Sequence alignment of the ZF1-2 and ZF3-4 domains within the three human MBNL1 homologs using MUSCLE (144). Amino acid residues shown to contact RNA in the crystal structure of ZF3-4 and NMR solution structure of ZF1-2 are highlighted (76, 77). The protein amino acid sequences used were from the following NCBI accession numbers: NP_066368, NP_659002, and NP_060858 for MBNL1, MBNL2, and MBNL3, respectively.
Figure 2-2. Zinc finger domain architecture and protein expression levels of synthetic MBNL proteins. A) Schematic of synthetic MBNL proteins showing the organization of zinc finger domains (ZF1-2 (Domain A) and ZF3-4 (Domain B)) and the location of the HA tag and nuclear localization signal (NLS). The lengths of the individual segments are proportional to the size of each region of the protein. B) Representative immunoblot comparing relative protein levels of synthetic MBNL proteins in transfected HeLa cells. C) Quantification of synthetic protein levels in HeLa cells via Western blot against the HA tag (n = 4). Relative levels of each protein were normalized to GAPDH. MBNL-AB expression values were then set equal to 1 and MBNL-AA and MBNL-BB protein levels normalized (data represented as mean ± standard error; **p < 0.01, ***p < 0.001, test).
Figure 2-3. Subcellular localization and mRNA expression levels are not impacted by ZF domain rearrangement in transfected HeLa cells. A) Subcellular protein localization of synthetic proteins was determined using immunofluorescence against the HA tag. No significant differences in subcellular localization were detected between synthetic MBNL proteins. B) Real-time qPCR analysis of synthetic MBNL RNA levels normalized to GAPDH in transfected HeLa cells. Determination of fold change in expression showed no differences in RNA levels of MBNL-AA and MBNL-BB compared to MBNL-AB (data represented as mean ± standard error).
Figure 2-4. Synthetic MBNL proteins are expressed at different levels in transfected HEK-293 cells. A) Representative immunoblot comparing relative protein levels of synthetic MBNL proteins in transfected HEK-293 cells. B) Quantification of protein levels in HEK-293 cells (n = 3) via Western blot against the HA tag. Relative levels of each protein were normalized to GAPDH. MBNL-AB levels were then set equal to 1 and MBNL-AA and MBNL-BB protein levels normalized (data represented as mean ± standard error; *p test).
Figure 2-5. Synthetic MBNL proteins regulate splicing of minigenes in HeLa cells with different activities. A-F) Jitter plot representations of cell-based splicing assays using INSR, ATP2A1, Vldlr, TNNT2, MBNL1, and Nfix minigenes, respectively. HeLa cells were transfected with empty vector (mock) or MBNL protein expression plasmids and a single minigene reporter. Percent spliced in was then quantified. Each point is from a single experiment and the line represents the average of all experiments for that condition (at least n = 5 for each protein treatment). Average values (± standard deviation) and percent splicing activity (displayed in white) are listed below the representative splicing gels. (Op < 0.05 vs. mock, #p < 0.0001 vs. mock, p < 0.05 vs. MBNL vs. MBNL test)
Figure 2-6. Synthetic MBNL proteins regulate splicing of minigenes in HEK-293 cells with different activities. A-F) Jitter plot representations of splicing assays using INSR, ATP2A1, Vldlr, TNNT2, MBNL1, and Nfix minigenes, respectively. HEK-293 cells were transfected with empty vector (mock), MBNL-AB, MBNL-AA, or MBNL-BB protein expression plasmids and a single minigene reporter. RNA was isolated from cells, RT-PCR performed, and the percent spliced in (i.e., percent exon inclusion) for each protein treatment was then quantified. Each point is from a single experiment and the line represents the average of all experiments for that condition (n = 5 for each protein treatment). Average values (± standard deviation) and percent splicing activity (displayed in white) are listed below representative splicing gels. (O p < 0.05 vs. mock, p < 0.01 vs. mock, p < 0.001 vs. mock, #p < 0.0001 vs. mock, p < 0.05 vs. MBNL-AB, p < 0.001 vs. MBNL p < 0.0001 vs. MBNL-AB, test)
Figure 2-7. Average splicing activity of synthetic MBNL proteins across six minigene events. A-B) Jitter plot representation of average splicing activities for MBNL-AB and synthetic proteins in HeLa and HEK-293 cells, respectively. Each point is the splicing activity of each protein for a single minigene event tested (see Figure 2-5 and Figure 2-6) and the line represents the average of all splicing activities. MBNL-AB was considered to have 100% splicing activity for each event and the values for MBNL-AA and MBNL-BB were calculated accordingly (**p < 0.01, ****p < 0.001, test).
Figure 2-8. Representative immunoblots used to create dose-response curves in plasmid dosing system. A) Representative immunoblots used to determine relative MBNL1 protein levels in the plasmid dosing assay. HeLa cells were transfected with increasing concentrations of MBNL1 expression plasmid and MBNL1 levels detected using an anti-HA antibody. Protein levels at each plasmid dose were normalized to a GAPDH loading control. Please note that breaks in the MBNL-AA and MBNL-BB blots are due to the removal of lanes that were not used for quantification of protein levels. All lanes are from the same blot. B) Normalized quantification of protein expression levels at each plasmid dose for all synthetic MBNL proteins (n = 3 at each dose, data represented as mean ± standard error).
Figure 2-9. Representative splicing gels used to calculate changes in exon inclusion across the gradient of protein expression produced within the plasmid dosing system and quantitative parameters derived to describe dose-response behavior. A) Representative splicing gels for three minigenes tested, acquired using the Advanced Analytical Fragment Analyzer. Bands are representative of relative fluorescence units (RFU) for each cDNA product and were used to plasmid dose are listed below each gel (n = 3-5 for each). B) Table of quantitative parameters (log(EC50) and Hill slope, ± standard error) generated from the dose-response curves. Overall quality of the fit, as represented by R^2 values, is also listed.
Figure 2-10. MBNL-AA and MBNL-BB proteins regulate splicing at different relative protein levels compared to MBNL-AB in a plasmid dosing system. A-C) Plasmid dosing assays for MBNL1, ATP2A1, and TNNT2 minigene events, respectively. Increasing amounts of plasmid expressing MBNL-AB, MBNL-AA, or MBNL-BB were transfected into HeLa cells along with a minigene reporter (n = 3 (data represented as mean ± standard deviation) were then quantified, plotted against log [MBNL] levels, and fit to a four-parameter dose-response curve. Relative MBNL expression levels for each protein were determined via immunoblot (n = 3) at each plasmid dose and normalized to GAPDH. MBNL-AB levels at the highest plasmid dose (200 ng) were then set equal to 1 and all other values for MBNL-AB, MBNL-AA, and MBNL-BB normalized. Representative immunoblots with quantification and splicing gels can be found in Figure 2-8 and 2-9a, respectively. D) Bar plots of log(EC50) values and Hill slopes derived from the dose-response curves (a table of exact values ± standard error is listed in Figure 2-9b). Due to ambiguous curve fitting of MBNL-BB, the bottom (MBNL1 and TNNT2) or top (ATP2A1) of the curve was constrained to MBNL-AB at the highest plasmid dose. (*p < 0.05, ****p < 0.0001, test)
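The four-parameter dose-response fit referred to in this caption can be illustrated with a short sketch. The parameterization below (bottom, top, log(EC50), Hill slope as a function of log concentration) is a commonly used form and is an assumption here, since the text does not state the exact equation or software used. The grid search is a deliberately simple stand-in for a nonlinear least-squares routine, and all data values and function names are hypothetical. The top and bottom are held fixed, mirroring the plateau constraint described above.

```python
def four_param_logistic(log_x, bottom, top, log_ec50, hill):
    """Response at log10 concentration log_x for a four-parameter curve."""
    return bottom + (top - bottom) / (1.0 + 10.0 ** ((log_ec50 - log_x) * hill))


def fit_ec50_and_hill(log_x, response, bottom, top):
    """Grid search for log(EC50) and Hill slope with fixed plateaus.

    Scans log(EC50) from -3.00 to 0.00 (step 0.01) and Hill slope from
    0.5 to 6.0 (step 0.1); returns the pair with the lowest sum of
    squared errors. The grid limits are arbitrary illustration choices.
    """
    best = None
    for i in range(-300, 1):
        log_ec50 = i / 100
        for j in range(5, 61):
            hill = j / 10
            sse = sum(
                (four_param_logistic(x, bottom, top, log_ec50, hill) - y) ** 2
                for x, y in zip(log_x, response)
            )
            if best is None or sse < best[0]:
                best = (sse, log_ec50, hill)
    return best[1], best[2]


# Hypothetical noise-free data: inclusion rising from 20% to 80%.
log_mbnl = [-2.0, -1.5, -1.0, -0.5, 0.0]
psi = [four_param_logistic(x, 20.0, 80.0, -1.0, 2.0) for x in log_mbnl]
log_ec50_fit, hill_fit = fit_ec50_and_hill(log_mbnl, psi, bottom=20.0, top=80.0)
print(log_ec50_fit, hill_fit)
```

In this form, log(EC50) is the log protein level at which the response is halfway between the two plateaus, and the Hill slope controls how steeply the response changes around that point.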
Figure 2-11. N-terminal GFP-tagged MBNL-AB and MBNL-AA localize to the nucleus in response to doxycycline treatment in inducible MEFs. Subcellular protein localization of synthetic MBNL proteins was determined using fluorescence of the GFP tag in the presence and absence of doxycycline. In the absence of doxycycline treatment, there was no visible protein expression. In the presence of doxycycline, no significant differences in subcellular localization were detected between synthetic MBNL proteins.
Figure 2-12. MBNL-AA regulates splicing at similar relative protein levels in an inducible tet-on system in Mbnl1/2 KO MEFs. A) Representative immunoblots used to determine relative MBNL protein levels across a gradient of doxycycline treatment (0-60 ng/mL for MBNL-AB and 0-2000 ng/mL for MBNL-AA). Relative MBNL protein expression levels for each protein were determined via immunoblot (n = 3) at each doxycycline dose and normalized to GAPDH. MBNL-AB levels at the highest dose were then set equal to 1 and all other values for MBNL-AB and MBNL-AA normalized. Quantification of MBNL levels from triplicate immunoblots can be found in Figure 2-13. B-D) Dose curves of three endogenous splicing events, Apbb2, Mta1, and Depdc5, respectively. MEFs were treated with increasing amounts of splicing gels u Figure 2-14 values (data represented as mean ± standard deviation) were then plotted against log [MBNL] levels and fit to a four-parameter dose curve. Due to ambiguous curve fitting in some cases, for all dose-response curves, the top or bottom (i.e. inclusion or exclusion event, respectively) of the curve was constrained to match the average value at the highest doxycycline dose. Additional dose-response curves and R^2 values for each curve fit can be found in Figure 2-15. Quantitative parameters derived from these dose curves can be found in Figure 2-16.
Figure 2-13. Quantification of relative MBNL-AB and MBNL-AA protein levels across a gradient of doxycycline treatment in inducible MEF system. Normalized quantification of protein expression levels at each matched doxycycline dose for MBNL-AB (0-60 ng/mL) and MBNL-AA (0-2000 ng/mL) (n = 3 at each dose, data represented as mean ± standard error). Relative MBNL expression levels for each protein were determined via immunoblot at each doxycycline dose and normalized to GAPDH. MBNL-AB levels at the highest dose were then set equal to 1 and all other values for MBNL-AB and MBNL-AA normalized accordingly. No statistical significance for relative protein levels was detected except at the highest dose in each cell line (*p < 0.05, t test).
Figure 2-14. Representative splicing gels used to calculate changes in exon inclusion across the gradient of protein expression produced within the inducible MEF system. Representative splicing gels for the 15 endogenous events tested, acquired using the Advanced Analytical Fragment Analyzer. Bands are representative of relative fluorescence units (RFU) for each cDNA product ± standard deviation at each plasmid dose are listed below each gel (n = 3).
Figure 2-15. MBNL-AB and MBNL-AA produce similar dose-response curves for an additional 12 endogenous splicing events assayed in the inducible MEF system. (A-L) Dose-response curves generated for 12 endogenous splicing events in MBNL-AB and MBNL-AA inducible MEFs. MEFs were treated with increasing amounts of doxycycline (n = 3) and values quantified at each dose (representative splicing gels used to quantify values can be found in Figure 2-14). These values (data represented as mean ± standard deviation) were then plotted against log [MBNL] levels (quantified via immunoblot, see Figure 2-12a and Figure 2-13) and fit to a four-parameter dose curve. Due to ambiguous curve fitting in some cases, for all events the top or bottom (i.e. inclusion or exclusion event, respectively) of the curve was constrained to match the average at the highest doxycycline dose. (M) Overall quality of fit of the four-parameter dose curve, as represented by R^2 values, is listed for all 12 events in this figure and in Figure 2-12b to 2-12d.
Figure 2-16. Quantitative parameters used to describe dose-response behavior of 15 endogenous splicing events assayed in inducible MEF system. Bar plots and tables of log(EC50) (A and B), Hill slopes (C-D), and (E-F) values derived from dose-response curves (Figure 2-12b to 2-12d and Figure 2-15a to 2-15l) to describe the dose-response behavior of MBNL-AB and MBNL-AA. Data represented as mean ± standard error for all values except for in which only the mean is reported due to the dose curve fitting constraints.
Figure 2-17. Reorganization of zinc finger domains does not significantly impact RNA binding of synthetic MBNL proteins. A) Sequence of four RNAs used in EMSAs with synthetic MBNL proteins. The occurrence of specific UGCU motifs (bold and underlined) and mutated non-specific motifs (red, bold, and underlined) within the RNA substrates are noted. B) Representative EMSA gels for CUG4/CAG4 RNA substrates (n = 3 for each RNA). C) Binding curves of all synthetic MBNL proteins for CUG4. Apparent dissociation constants (Kd) (± standard error) for each MBNL1 are listed. D) Representative EMSA gels for NV11/NV2CC RNA substrates (n = 3 for each RNA). E) Binding curves for each MBNL protein comparing the differences in affinity between NV11 and the non-specific mutant NV2CC. Kd values (± standard error) for each RNA are listed below each plot.
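The apparent Kd values in these binding curves come from fitting the fraction of RNA bound to a single-site model with an adjustable maximum. A minimal sketch of that calculation follows, assuming fraction bound is computed from band intensities as complex over total signal and that the model is f_bound = f_max × [MBNL] / ([MBNL] + Kd). The log-spaced scan over Kd with a closed-form least-squares f_max is an illustrative substitute for whatever fitting routine was actually used, and the concentrations and function names are hypothetical.

```python
def fraction_bound(complex_signal, free_signal):
    """Fraction of RNA in protein-RNA complexes within one gel lane."""
    return complex_signal / (complex_signal + free_signal)


def binding_model(protein_nM, f_max, kd_nM):
    """Single-site binding isotherm with an adjustable plateau f_max."""
    return f_max * protein_nM / (protein_nM + kd_nM)


def fit_apparent_kd(protein_nM, bound):
    """Scan Kd from 1 nM to 10 uM on a log grid (arbitrary limits).

    For each candidate Kd, the best f_max has a closed form because the
    model is linear in f_max; the Kd with the lowest squared error wins.
    """
    best = None
    for step in range(0, 401):
        kd = 10.0 ** (step / 100.0)
        g = [p / (p + kd) for p in protein_nM]
        denom = sum(v * v for v in g)
        if denom == 0.0:
            continue
        f_max = sum(y * v for y, v in zip(bound, g)) / denom
        sse = sum((f_max * v - y) ** 2 for y, v in zip(bound, g))
        if best is None or sse < best[0]:
            best = (sse, f_max, kd)
    return best[1], best[2]


# Hypothetical noise-free titration generated with f_max = 0.9, Kd = 50 nM.
conc = [0.0, 5.0, 10.0, 25.0, 50.0, 100.0, 250.0, 500.0]
obs = [binding_model(c, 0.9, 50.0) for c in conc]
f_max_fit, kd_fit = fit_apparent_kd(conc, obs)
print(round(f_max_fit, 3), round(kd_fit, 1))
```

Under this model the apparent Kd is the protein concentration at which half of the maximal fraction bound is reached, so a weaker binder shifts the curve to the right without necessarily changing its plateau.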
Figure 2-18. Comparison of R values and kmers derived from this and other RBNS studies with MBNL1. A) Scatterplots comparing R values for all kmers (k = 7) between MBNL-AB, MBNL-AA, and MBNL-BB at three different protein concentrations. Correlation coefficients indicate strong similarities between MBNL-AA and MBNL-AB, although some kmers for MBNL-AA display increased R values. MBNL-BB and MBNL-AB are not as strongly correlated and the scatterplots display the overall low R values of MBNL-BB across all protein concentrations. Comparison of MBNL-AA and MBNL-BB magnifies these differences in RNA binding specificity between the two proteins. B) Scatterplots comparing R values for all kmers between MBNL-AB and those identified with MBNL1 in a previous RBNS study at three protein concentrations (66). Correlation coefficients indicate a strong correlation of R values between each MBNL1 protein, indicating a similarity in motif recognition and RNA binding activity in each protein population. C) Area-proportional Venn diagram showing overlap in top 50 kmers for MBNL-AB and those identified using MBNL1 in a previous RBNS study (66). Values listed represent the number of kmers within each sub-population.
Figure 2-19. RBNS analysis of engineered MBNL proteins indicates that the ZF domains have differential RNA binding specificity. A-C) RBNS R values for the top four kmers (k = 7) are shown as a function of MBNL1 protein concentration for MBNL-AB, MBNL-AA, and MBNL-BB, respectively. The top four kmers for each protein are determined based on the concentration of protein that shows the greatest R values (250 nM, 500 nM, and 1000 nM for MBNL-AB, MBNL-AA, and MBNL-BB, respectively). R values at all other concentrations for the respective kmers were then determined to create the unimodal enrichment plots shown. D) Percent nucleotide occurrence within the top 100 kmers for each MBNL protein. E) Area-proportional Venn diagram showing overlap in top 50 kmers for each MBNL protein. Values listed represent the number of kmers within each sub-population. [RBNS data can be accessed via Sequence Read Archive SUB2513163; RNA Bind-n-Seq for Synthetic MBNL.]
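The R values plotted here are kmer enrichments: the frequency of a kmer in the protein-selected pool divided by its frequency in the input library. The sketch below illustrates that ratio on toy reads. It assumes overlapping kmers are all counted and that frequencies are normalized by the total kmer count in each library (which accounts for read depth); kmers absent from the input are simply skipped, which is a simplification chosen for this example. The full pipeline and its assumptions are described in the cited RBNS methods (66), and the function names and reads here are hypothetical.

```python
from collections import Counter


def kmer_frequencies(reads, k):
    """Frequency of every overlapping kmer across a list of reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


def enrichment(pulldown_reads, input_reads, k):
    """R value per kmer: pulldown frequency over input frequency.

    Kmers never observed in the input library are omitted because their
    ratio is undefined in this simplified version.
    """
    f_pulldown = kmer_frequencies(pulldown_reads, k)
    f_input = kmer_frequencies(input_reads, k)
    return {
        kmer: f_pulldown[kmer] / f_input[kmer]
        for kmer in f_pulldown
        if kmer in f_input
    }


# Toy example with k = 2 (the analysis in the text uses k = 7).
input_library = ["ACGU"]   # kmers AC, CG, GU at 1/3 each
selected_pool = ["ACAC"]   # kmers AC (2/3) and CA (1/3)
r_values = enrichment(selected_pool, input_library, k=2)
print(r_values)
```

An R value above 1 means the kmer is over-represented among protein-bound RNAs relative to the starting pool; applying the same ratio to the no-protein condition gives the background enrichment used in the text as the false discovery rate.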
Figure 2-20. Representative splicing gels used to create dose-response curves with plasmid dosing system in the presence of toxic RNA. A-B) Representative splicing gels from plasmid dosing assay with the MBNL1 and ATP2A1 minigene reporters, respectively, in the presence of CUG repeat RNA. Images were acquired using the Advanced Analytical Fragment Analyzer. Bands are representative of relative fluorescence units (RFU) for each cDNA ± standard deviation at each plasmid dose are listed below each gel (n = 3-5 for each).
Figure 2-21. Expression of CUG repeat toxic RNA alters the MBNL dose-response curve. A-B) Comparison of dose curves with the MBNL1 and ATP2A1 minigenes in the presence and absence of DMPK CTG960 transfection for all three MBNL1 proteins tested (data represented as mean ± standard deviation). C) Table of quantitative parameters (log(EC50) and Hill slope, mean ± standard error) generated from the dose-response curves in the presence and absence of toxic repeat RNA expression. Overall quality of the fit, as represented by R^2 values, is also listed.
Figure 2-22. Dose curves of synthetic MBNLs are altered in the presence of toxic RNA. (A-B) Plasmid dosing assays were performed using the MBNL1 and ATP2A1 minigenes, respectively, in the presence of CUG repeat RNA. HeLa cells were transfected with the same increasing amounts of plasmid to create a gradient of protein expression as done in Figure 2-10. Cells were transfected with a minigene reporter and a CTG 960 at each plasmid dose were quantified (n = 3-4 for each plasmid dose), plotted against log [MBNL] levels, and fit to a four-parameter dose-response curve (data represented as mean ± standard deviation). Representative splicing gels can be found in Figure 2-20. C) Bar plots of log(EC50) values and slopes derived from the dose-response curves (a table of exact values ± standard error is listed in Figure 2-21c). Due to ambiguous curve fitting of MBNL-BB, the bottom (MBNL1) or top (ATP2A1) of the curve was constrained to MBNL-AB at the highest plasmid dose. These values are compared to those determined in the absence of toxic RNA expression (Figure 2-10d and Figure 2-9b). Comparison of the dose curves in the presence and absence of CUG960 RNA expression can be found in Figure 2-21a and 2-21b for MBNL1 and ATP2A1, respectively. (*p < 0.05, test)
Figure 2-23. Model summarizing differences between synthetic MBNL proteins. MBNL-AA is a more active alternative splicing regulator while MBNL-BB is significantly weaker compared to MBNL-AB. These differences in activity are represented by the size of the arrows showing how each MBNL protein promotes exon inclusion/exclusion. In the context of the canonical arrangement of ZF domains within MBNL proteins (MBNL-AB), ZF1-2 binds YGCY motifs with high specificity while ZF3-4 has a modest preference for YGCY motifs but will sample and bind many motifs with similar affinities. This activity is represented by the dotted lines illustrating ZF3-4 binding to both canonical and non-canonical RNA motifs. MBNL-AA possesses two high-specificity ZF1-2 motifs driving RNA recognition and subsequently increased splicing regulation at lower protein concentrations. MBNL-BB has significantly reduced RNA binding specificity and samples many specific and non-specific RNA motifs. Due to the reduced motif recognition, regulatory sites are not bound until high concentrations of protein are present, leading to an overall reduction in splicing regulation. Structures of domains shown here are derived from PDB ID 3D2N (ZF1-2) and PDB ID 3D2Q (ZF3-4) (76).
Table 2-1. Sequences of primers used for RT-PCR of endogenous splicing events in MEFs.

Event  Exon  Primer Pair  Inclusion (bp)  Exclusion (bp)  Tm (°C)  Cycle Number
Add3  14  GCAGTTTGACGATGACGATCAGG / GACATCATGCATCTCGTCCTTGC  301  205  55  26
Apbb2  7  TCATTTCAGACAGATCCCGATTTGC / GTTTCAAGGATGCATAGCGTAGC  334  271  55  26
Clstn1  3  CATGGGATAGTCACCGAGAACG / CCTCACCCGTGGACTTATCC  196  166  57  26
Exoc1  12  TGACTGGCACCTCTAAAGAAAGC / GGTCAGAGGCAGACATGTTCC  210  165  57  28
Madd  16  TCACACTGCCCACCAAAGG / GAAGGTTCTCTTTTCACGGTTGG  190  130  55  28
Nfix  7  CCATCGACGACAGTGAGATGG / CTGGATGATGGACGTGGAAGG  296  173  51  27
Numa1  2  CGACAAGAAGCACAGAGTACTAGC / CTTCTGTTGCTGCACCTTGC  231  189  55  26
Numb  8  AGCATCAGCTCCTTGTGTTCC / GCAGCACCAGAAGACTGACC  302  155  57  25
Pla2g6  9  AGAAGTGGACACCCCAAACG / CTCATGGAGCTCAGGATGAACG  294  129  57  28
Plod2  14  AGGAATCTGGAATGTCCCATATATGG / TCTGCCAGAAGTCATTGTTAAGATGG  302  239  55  26
Synj2  23  CTCGGTGGAGACAACTCTTCC / GTGCTCCTGGGAGAAGTTTCG  335  200  55  28
Mta1  17  ACCCCGTGAAGAGTTCATCC / GTGCCTGGTCTGTCCATGG  191  155  57  28
Ktn1  38  AGATGGAGCGATCGACTTACG / AATCATCAGCTACCTTCTTTCTCTCC  198  114  51  27
Depdc5  33  CGACTGTGCACGGAAAAAGC / CCAGAGTTTGCAGAGGGAAAGG  295  229  57  28
Dtx2  6  CCGTGCAGATGCCAAAGG / TCAGGGGCCACTTTCAGC  253  115  51  32
Table 2-2. Index primers used to identify each RBNS library within the multiplexed sequencing reads.

Protein  Concentration [nM]  Index Primer  Sequence (5' -> 3')
MBNL-AB  0  PEIndex_8  CAAGCAGAAGACGGCATACGAGAT CACTGT GTGACTGGAGTTCAGACGTGTGCTCTTCCGATC
MBNL-AB  16  PEIndex_7  CAAGCAGAAGACGGCATACGAGAT ATTGGC GTGACTGGAGTTCAGACGTGTGCTCTTCCGATC
MBNL-AB  32  PEIndex_6  CAAGCAGAAGACGGCATACGAGAT GATCTG GTGACTGGAGTTCAGACGTGTGCTCTTCCGATC
MBNL-AB  125  PEIndex_5  CAAGCAGAAGACGGCATACGAGAT TCAAGT GTGACTGGAGTTCAGACGTGTGCTCTTCCGATC
MBNL-AB  250  PEIndex_4  CAAGCAGAAGACGGCATACGAGAT CTGATC GTGACTGGAGTTCAGACGTGTGCTCTTCCGATC
MBNL-AB  500  PEIndex_3  CAAGCAGAAGACGGCATACGAGAT AAGCTA GTGACTGGAGTTCAGACGTGTGCTCTTCCGATC
MBNL-AB  1000  PEIndex_2  CAAGCAGAAGACGGCATACGAGAT GTAGCC GTGACTGGAGTTCAGACGTGTGCTCTTCCGATC
MBNL-AB  2000  PEIndex_1  CAAGCAGAAGACGGCATACGAGAT TACAAG GTGACTGGAGTTCAGACGTGTGCTCTTCCGATC
MBNL-AA  0  PEIndex_16  CAAGCAGAAGACGGCATACGAGAT GGACGG GTGACTGGAGTTCAGACGTGTGCTCTTCCGATC
MBNL-AA  16  PEIndex_15  CAAGCAGAAGACGGCATACGAGAT TGACAT GTGACTGGAGTTCAGACGTGTGCTCTTCCGATC
MBNL-AA  32  PEIndex_14  CAAGCAGAAGACGGCATACGAGAT GGAACT GTGACTGGAGTTCAGACGTGTGCTCTTCCGATC
MBNL-AA  125  PEIndex_13  CAAGCAGAAGACGGCATACGAGAT TTGACT GTGACTGGAGTTCAGACGTGTGCTCTTCCGATC
MBNL-AA  250  PEIndex_12  CAAGCAGAAGACGGCATACGAGAT CGTGAT GTGACTGGAGTTCAGACGTGTGCTCTTCCGATC
MBNL-AA  500  PEIndex_11  CAAGCAGAAGACGGCATACGAGAT ACATCG GTGACTGGAGTTCAGACGTGTGCTCTTCCGATC
MBNL-AA  1000  PEIndex_10  CAAGCAGAAGACGGCATACGAGAT GCCTAA GTGACTGGAGTTCAGACGTGTGCTCTTCCGATC
MBNL-AA  2000  PEIndex_9  CAAGCAGAAGACGGCATACGAGAT TGGTCA GTGACTGGAGTTCAGACGTGTGCTCTTCCGATC
MBNL-BB  0  PEIndex_24  CAAGCAGAAGACGGCATACGAGAT CTCTAC GTGACTGGAGTTCAGACGTGTGCTCTTCCGATC
MBNL-BB  16  PEIndex_23  CAAGCAGAAGACGGCATACGAGAT GCGGAC GTGACTGGAGTTCAGACGTGTGCTCTTCCGATC
MBNL-BB  32  PEIndex_22  CAAGCAGAAGACGGCATACGAGAT TTTCAC GTGACTGGAGTTCAGACGTGTGCTCTTCCGATC
MBNL-BB  125  PEIndex_21  CAAGCAGAAGACGGCATACGAGAT GGCCAC GTGACTGGAGTTCAGACGTGTGCTCTTCCGATC
MBNL-BB  250  PEIndex_20  CAAGCAGAAGACGGCATACGAGAT CGAAAC GTGACTGGAGTTCAGACGTGTGCTCTTCCGATC
MBNL-BB  500  PEIndex_19  CAAGCAGAAGACGGCATACGAGAT CGTACG GTGACTGGAGTTCAGACGTGTGCTCTTCCGATC
MBNL-BB  1000  PEIndex_18  CAAGCAGAAGACGGCATACGAGAT CCACTC GTGACTGGAGTTCAGACGTGTGCTCTTCCGATC
MBNL-BB  2000  PEIndex_17  CAAGCAGAAGACGGCATACGAGAT GCTACC GTGACTGGAGTTCAGACGTGTGCTCTTCCGATC
Input RNA  N/A  PEIndex_25  CAAGCAGAAGACGGCATACGAGAT ATCAGT GTGACTGGAGTTCAGACGTGTGCTCTTCCGATC
CHAPTER 3
RBFOX1 MODIFIES THE MBNL1 DOSE RESPONSE OF ALTERNATIVELY SPLICED PRE-MRNAS

Background

Alternative Splicing is Impacted by Splicing Factor Concentration in Development and Disease

Alternative splicing (AS) is an intricate and dynamic step of RNA processing that is regulated by hundreds of trans-acting factors which bind to cis-regulatory RNA sequences to modulate specific mRNA isoform expression in a tissue-specific and development-dependent manner (23, 123). These trans-acting factors, namely RNA binding proteins (RBPs), bind to specific sites in pre-mRNAs to dictate which regions of the transcript are included in the final, processed isoform (26). Regulation of these AS decisions by RBPs is critical for proper tissue differentiation and development (23, 24). With the explosion of available high-throughput sequencing technologies, thousands of splicing events have been identified via RNAseq and vast collections of RBP binding sites have been identified via CLIPseq (15, 145, 146). Using these approaches, research in the splicing field has focused on attempting to decipher the "splicing code", the combination of specific cell type RBP expression and predicted regulatory RNA features, both RNA structure and sequence (28, 37). Using the splicing code as a predictive tool, extensive work to understand how individual cis-regulatory sequences and specific RBP expression contribute to AS regulation has been performed over the last several decades. Most of these critical studies have been performed in the context of RBP knockdown or over-expression in model systems. However, changes in overall cellular concentrations of RBPs have been shown to impact
AS beyond simply the absence of the RBP (knockdown experiments) or highly increased expression (over-expression). Changes in RBP concentrations have been implicated in a variety of human diseases including cancer, heart disease, and neurodegenerative disorders. Serine/arginine-rich splicing factor 1 (SRSF1) is often overexpressed in solid tumors, including 13% of breast cancers, 25% of colon cancers, and 25% of lung cancers (147). In contrast, down-regulation of Quaking (QKI) has been linked to aberrant splicing in lung cancer and its reduced expression is associated with poor disease prognosis (148). Mutations in RNA binding motif protein 20 (RBM20) have been linked with dilated cardiomyopathy (DCM) and its differential expression leads to aberrant alternative splicing and reduced cardiac function (149, 150). Finally, in myotonic dystrophy (DM), the most common form of adult-onset muscular dystrophy, the expression of an expanded CUG (DM type 1) or CCUG (DM type 2) RNA serves as a sink for muscleblind-like (MBNL) proteins, a family of RNA binding proteins that regulate fetal-to-adult RNA isoform transitions in development. This reduction of functional MBNL protein levels via a toxic RNA gain-of-function mechanism leads to mis-splicing linked as causative of some disease symptoms (reviewed in (105) and (151)). These examples and others (reviewed in (24)) provide evidence that regulation of effective RBP concentrations within the cell and their perturbation in disease can profoundly impact AS. With these observations in mind, active areas of investigation within the splicing field include how changes in splicing factor concentration contribute to specific RNA isoform expression and what properties of the splicing code, particularly
cis RNA elements and cell type-specific RBP co-expression, dictate the responsiveness of a particular AS event. Careful characterization of how AS decisions are altered across a range of RBP levels will both inform how specific RNA isoform expression changes are regulated during development and provide insights on how to correct perturbations to RNA isoform expression in disease.

MBNL Proteins Regulate Splicing in a Dose-Dependent Manner

MBNL proteins are master regulators of RNA processing that are known to be important in muscle, heart, and central nervous system (CNS) development (reviewed in (60)). MBNL1 and its paralogs, MBNL2 and MBNL3, bind to YGCY elements in pre-mRNAs to regulate AS in a position-dependent manner (64). Previous work revealed that RNA splicing responds to changes in MBNL concentrations across a broad, dynamic range of expression (134, 152). Using a tunable MBNL1 expression system, it was revealed that individual MBNL1-regulated AS events require different amounts of MBNL to achieve half-maximal regulation and are regulated more or less cooperatively than other events (134). This system was also utilized to study the importance of predicted RNA cis-regulatory elements in the auto-regulation of MBNL1 exon 5. Using deletion analysis of five sections of YGCY motifs within the highly conserved intron of MBNL1 upstream of exon 5, it was found that while several of the deletions impacted the MBNL1 dose impacted MBNL1 regulation of exon inclusion (134). The evaluation of MBNL1 AS regulation across a range of protein expression provided insights into the mechanism of MBNL-dependent splicing of specific events that could not have been discerned by only measuring splicing responses at low and high MBNL concentrations.
MBNL Networks of Alternative Splicing Regulation Overlap with Those of Other Splicing Factors

While a variety of work in both the splicing and DM fields has focused on characterizing MBNL regulation of AS, little work has focused on identifying other splicing factors that coordinate with MBNL to regulate AS of groups of target genes. The only major family of RBPs studied to date is the CELF family of proteins, whose steady-state levels are increased in DM tissue via PKC-mediated hyperphosphorylation (153, 154). MBNL and CELF proteins antagonistically regulate AS during tissue-specific developmental transitions, especially in the heart and skeletal muscle (33, 34, 69). CELF proteins play roles in early embryonic development of these tissues to promote fetal RNA isoforms. This AS pattern is antagonized by MBNL, which promotes the processing of adult splicing isoforms during tissue differentiation. Upregulation of steady-state CELF protein levels in combination with MBNL sequestration contributes to the increase in fetal mRNA isoform expression detected in DM patient tissues (41, 69). Interestingly, recent studies have indicated that MBNL and another family of RBPs, RBFOX, may co-regulate a subset of AS events in a similar, rather than antagonistic, manner. In the human embryonic muscle cell line HFN, siRNA knockdown of either MBNL1 or RBFOX1 demonstrated that a group of events was under combinatorial control by both RBPs, including some events that are mis-regulated in DM (155). Additionally, recent reports have indicated that RBFOX2 and MBNL cooperate to regulate pluripotent stem cell differentiation (156).

RBFOX RNA Binding Proteins: Regulators of Alternative Splicing in Development and Disease

The RBFOX proteins are, like MBNL, a highly conserved family of tissue-specific AS regulators in metazoans, with known orthologs present in C. elegans,
D. melanogaster, zebrafish, and mice. There are three RBFOX homologs in humans (RBFOX1-3) that display differential expression patterns. RBFOX1 is expressed primarily in brain, heart, and skeletal muscle. RBFOX2 is more ubiquitously expressed, and RBFOX3, also known as NeuN, is expressed exclusively in the brain (157-160). The RBFOX family shares an overall protein domain architecture containing a single, highly conserved RRM that resides in the middle of the protein and is 100% identical in RBFOX1 and RBFOX2 (161). The N-terminal and C-terminal regions of the protein are less conserved, although domain deletion studies have revealed that at least some regions are required for protein localization and splicing regulation and may facilitate associations with other RBPs (157, 162-164). RBFOX proteins, like MBNL, regulate AS in a position-dependent manner where, in general, RBFOX binding to an upstream intron induces exon exclusion while binding to a downstream intron promotes exon inclusion (157, 159, 165-167). RBFOX proteins interact with (U)GCAUG motifs within target RNAs with extremely high affinity and specificity, although these proteins can also bind to the GCAUG pentanucleotide motif (66, 157, 165, 168). These motifs have been shown to be highly enriched within 200-300 nucleotides of tissue-specific alternative exons, and, in general, the degree of exon inclusion correlates with the number of motifs (169, 170). RBFOX regulation of AS can also be expanded, as these proteins can reside in a larger complex termed LASR (Large Assembly of Splicing Regulators). Within this complex, RBFOX proteins can be recruited to RNA indirectly via the binding of other RBPs such as hnRNP M, allowing them to exert splicing regulation on RNA targets in the absence of a UGCAUG binding motif.
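The sequence motifs named in this chapter, YGCY for MBNL (where Y is a pyrimidine, C or U) and GCAUG for RBFOX, can be located in an RNA sequence with a simple pattern search. The sketch below is a hedged illustration only: the function name, the dictionary keys, and the toy intron sequence are hypothetical, and a motif match by itself says nothing about whether a site is bound or functional in vivo.

```python
import re

# Lookahead patterns so that overlapping motif occurrences are all reported.
MOTIFS = {
    "MBNL_YGCY": re.compile(r"(?=([CU]GC[CU]))"),   # Y = C or U
    "RBFOX_GCAUG": re.compile(r"(?=(GCAUG))"),      # minimal pentanucleotide
}


def find_motifs(rna):
    """Return 0-based start positions of each motif in an RNA (or DNA) string."""
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    return {
        name: [match.start() for match in pattern.finditer(rna)]
        for name, pattern in MOTIFS.items()
    }


# Hypothetical toy sequence, not taken from any gene discussed in the text.
toy_intron = "AAUGCAUGAACUGCUGCUAA"
hits = find_motifs(toy_intron)
print(hits)

# The RBFOX hexamer UGCAUG contains GCAUG but no YGCY, because the fourth
# position of UGCA is an adenosine rather than a pyrimidine.
print(find_motifs("UGCAUG"))
```

Restricting the search to a window around an alternative exon (for example, the first few hundred nucleotides of each flanking intron) is a straightforward extension: slice the sequence before calling `find_motifs`.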
RBFOX protein expression has been found to be critical for proper skeletal muscle, heart, and CNS development. Knockout or reduced expression of RBFOX in these tissues leads to severe alterations in organ morphology and functionality that, in some cases, have been partially linked with the induced aberrant RBFOX-dependent splicing (171-175). Consistent with its function in tissue development, disruption of RBFOX within these organs has also been implicated in disease. Genome-wide studies have repeatedly identified gene alterations in RBFOX1 in patients with autism and epilepsy (176). Alterations in RBFOX1 and RBFOX2 expression have been linked to diabetic heart disease (177).

Potential for MBNL and RBFOX Co-Regulation of Alternative Splicing

Both the MBNL and RBFOX families of splicing factors regulate AS in the same position-dependent manner (Figure 3-1). Members from each family are also expressed in the same tissues, their appropriate function is required for proper tissue development, and both have been implicated in disease of the same tissues. Additionally, while the binding sites for each RBP do not completely overlap, we hypothesize that there is the capacity for recognition of the same cis-regulatory RNA sequences, especially if MBNL binds sub-optimal sequences with slightly divergent nucleotide content (Figure 3-1). Given such striking similarities in AS regulation, the potential for recognition of the same or similar RNA regulatory sequences, and recent observations of MBNL and RBFOX co-regulation of AS (155, 156), we chose to use our tunable dosing system to investigate how the expression of RBFOX1 alters the MBNL dose response and to dissect potential mechanisms of co-regulation.
Using both a human embryonic kidney (HEK 293) and a mouse embryonic fibroblast (MEF) MBNL1 dosing cell line, we co-expressed RBFOX1 and used targeted RT-PCR of endogenous and minigene splicing events, as well as high-throughput sequencing-based approaches, to begin to elucidate potential mechanisms of co-regulation. Using these two cell lines, in which MBNL1 expression can be precisely controlled, we determined that RBFOX1 and MBNL1 co-regulate many AS events via groupings of common mechanisms that are event specific. Evaluation of the importance of specific cis-regulatory sequences indicates that, for some events, MBNL and RBFOX regulate AS via the same RNA regulatory elements. Finally, these methods of co-regulation appear to be highly conserved between the mouse and human model systems utilized, highlighting the importance of this co-regulation across species and tissue types.

Results

Utilization of a Tunable MBNL1 Dosing System Indicates the Potential for RBFOX Co-Regulation of Select Alternative Splicing Events

To test whether RBFOX1 expression has an impact on MBNL1 regulation of AS, a previously published tetracycline-inducible HEK 293 cell line was used to express an HA-tagged MBNL1 (full-length, 41 kDa isoform) in a precise and controlled manner from a single integration site (134). This cell line was additionally chosen due to its low endogenous expression of MBNL1, MBNL2, and MBNL3 (number of transcripts assayed by RNAseq (134)). In the absence of doxycycline, minimal HA-MBNL1 expression can be detected (Figure 3-2a). A broad range of HA-MBNL1 concentrations can be achieved as a function of doxycycline treatment (1.0 ng/mL to 10 ng/mL), creating a dynamic ~10- to 15-fold range of HA-MBNL1 protein expression as quantified
by immunoblot (Figure 3-2a-b). To begin to study the potential co-regulation of AS by MBNL1 and RBFOX1, we asked whether minigene reporters with putative RBFOX binding sites were in fact regulated by RBFOX1 by co-expressing each protein and assaying the splicing response. We selected two reporters, human INSR and mouse Nfix, both of which are well characterized as MBNL1 targets (47, 71, 72, 128, 129). These reporters were chosen because sequence analysis indicated that each contains putative RBFOX binding motifs (Figure 3-3a). The inducible HEK 293 cells were then co-transfected with an HA-RBFOX1 expression plasmid or empty vector (mock) and a single minigene reporter. High levels of MBNL1 expression were achieved via doxycycline treatment (10 ng/mL). Inclusion levels of each alternative exon were then quantified via RT-PCR and expressed as percent spliced in (PSI, Ψ) (i.e., percent exon inclusion). The predicted RBFOX binding site within the INSR minigene resides downstream of the regulated exon 11 between several MBNL binding motifs and, based on the pattern of RBFOX position-dependent splicing, would be predicted, like MBNL1, to mediate exon inclusion (Figure 3-3a). In contrast, the putative RBFOX motif in the Nfix minigene lies within the alternative exon itself (Figure 3-3a). As is the case for MBNL proteins, previous reports have indicated that RBFOX binding to an alternative exon leads to its exclusion from the final RNA transcript. Cell-based splicing assays indicate that, consistent with previous reports, MBNL1 expression leads to changes in Ψ as compared to mock (0.89 ± 0.005 vs. 0.63 ± 0.02 and 0.74 ± 0.01 vs. 0.38 ± 0.01, MBNL1 vs. mock for INSR and Nfix minigenes, respectively) (Figure 3-3b). HA-RBFOX1 expression also altered Ψ for both events (0.81 ± 0.005 and 0.61 ± 0.01 for INSR and Nfix minigenes, respectively)
(Figure 3-3b). Interestingly, while the presence of RBFOX1 alone led to a similar splicing outcome for the INSR minigene compared to MBNL1 expression alone, RBFOX1 was not able to reduce inclusion of Nfix exon 8 with the same efficiency as MBNL1. Overall, this analysis demonstrated that RBFOX1 was capable of regulating splicing of both reporters, albeit with different efficiencies. Co-expression of both proteins further increased or decreased the measured Ψ value for INSR and Nfix, respectively, beyond the expression of each splicing factor individually (Figure 3-3b). Interestingly, RBFOX1 had a more substantial impact on the splicing outcome of the two tested events when MBNL1 expression was not induced, whereas only a modest effect was observed at high concentrations of MBNL1. Overall, these basic splicing assays demonstrated that (1) RBFOX1 is capable of altering splicing of events traditionally demonstrated to be regulated by MBNL1 and (2) RBFOX1 may have differential impacts on splicing of events at varying concentrations of MBNL1 expression.

RBFOX1 Expression Dampens MBNL1 Dose-Dependent Regulation of the INSR Minigene

To further investigate the potential for concentration-dependent effects of RBFOX1 on the regulation of AS by MBNL1, the inducible HEK 293 cell line was used to vary the concentration of HA-MBNL1 in the presence or absence of HA-RBFOX1. Ψ values were plotted against the log of the doxycycline concentration (i.e., log(dox)) as a proxy for HA-MBNL1 protein concentration. These data can then be fit to a four-parameter dose curve, allowing for the derivation of several quantitative parameters that describe the splicing regulation of a specific event, including EC50 and slope (Figure 3-4a). The slope of the response
curve provides a relative measure of cooperativity, while the EC50 value indicates how much protein is required to obtain splicing regulation at 50% of maximum. This system provides a unique platform not only to study how RNA cis-elements contribute to the MBNL1 dose response (Figure 3-4b), but also how the co-expression of other trans-acting splicing factors, such as RBFOX1, modulates the dose response (Figure 3-4c). To more carefully characterize how RBFOX1 alters MBNL1 dose-dependent splicing regulation, we began with the human INSR minigene, which contains a single predicted RBFOX binding motif in the downstream intron (Figure 3-5a). At no/low MBNL1 expression, Ψ for wild-type (WT) INSR was 0.63 ± 0.03 and increased to 0.89 ± 0.01 at high MBNL1 expression levels (Figure 3-5b, blue curve). Interestingly, RBFOX1 expression led to a general upward shift of the curve (Figure 3-5b, red curve) ncreased to 0. 81 0.0 0. 18, ~20% increase). However, induction of higher cellular concentrations of MBNL1 did not RBFOX1 was co expressed was only minimally increased ( 0.0 4 ) While a truly only be 1), this observation suggests that MBNL1 and RBFOX1 interact in some manner to regulate AS of INSR, potentially through use of the same or nearby cis-regulatory elements. response by decreasing the ion. These observations support a model of co-regulation by both RBFOX1 and MBNL1 in which both splicing factors positively regulate exon inclusion. This conclusion is further supported by the reduced
amount of MBNL1 (reduced EC50 value) required for 50% maximal regulation when HA-RBFOX1 is co-expressed (Figure 3-5e). To test the function of the predicted RBFOX binding site (UGCAUG) within intron 11 of the INSR minigene, it was mutated to UCGAUG, inverting the sequence of the GC dinucleotide (INSR mutant 1). Interestingly, this modest change in sequence, predicted to abrogate RBFOX1 binding and prevent splicing regulation, dramatically impacted the dosing curve of MBNL1 and the extent to which MBNL1 increased exon inclusion (Figure 3-5c, compare grey and blue lines). As predicted, addition of HA-RBFOX1 expression had no effect on the splicing of the mutant 1 minigene (Figure 3-5c). Overall, the fact that this mutation significantly blunted the ability of MBNL1 to regulate exon inclusion was surprising because (1) many additional YGCY motifs that have been previously reported to be important for MBNL1-mediated splicing regulation are present within the intron (71, 72, 128, 129) and (2) the RBFOX UGCAUG binding motif does not contain a standard MBNL1 consensus motif (UGCA) due to the presence of an adenosine residue at the fourth position (64). These results suggest that MBNL1 and RBFOX1 may compete for the same binding site. To test this hypothesis, an INSR mutant minigene was generated in which the RBFOX site UGCAUG was mutated to CGCUUG (INSR mutant 2); this mutation disrupts the RBFOX binding site as in INSR mutant 1 but generates a new MBNL binding site (CGCU) at the same position as in the WT minigene. With this minigene, RBFOX1, as predicted, had no impact on the MBNL1 dose response (Figure 3-5d). These data are consistent with a model in which MBNL1 and RBFOX1 bind to
overlapping binding sites within intron 11 to regulate exon inclusion. This conclusion fits the dose response observed for the WT minigene. At no/low concentrations of MBNL1, RBFOX1 binds to the target site to promote exon inclusion. As the cellular concentration of MBNL1 protein rises, MBNL1 may outcompete RBFOX1 to bind the same, overlapping site such that the maximal splicing regulation for this event is achieved. The implication of this result is that MBNL and RBFOX proteins may be able to replace one another for events that have regulatory sites for both families of proteins. As such, these protein families may act as buffers for splicing regulation of co-regulated events in instances where the expression or functionality of the other splicing factor is reduced.

MBNL1 Regulation of Alternative Splicing is Conserved in Two Distinct Dosing Cell Lines

While we began this study with an inducible MBNL1 HEK 293 cell line, we chose to generate a new cell line to compare and contrast with the HEK 293 system. As such, the PiggyBac transposase system was used to perform transgenesis of both an rTetR expression element and a GFP-tagged MBNL1 construct (the same full-length human isoform expressed in the HEK 293 system) in mouse embryonic fibroblasts (MEFs) genetically lacking both MBNL1 and MBNL2 expression. As in the HEK 293 system, minimal GFP-MBNL1 expression is detected in the absence of doxycycline, while titration of doxycycline leads to a broad range of GFP-MBNL1 expression that can be precisely controlled (Figure 3-6a and Figure 3-6b). Immunofluorescence detection of GFP-MBNL1 shows predominantly nuclear localization with a modest signal in the cytoplasm (Figure 3-6c). Interestingly, although GFP-MBNL1 expression in this system needs to be more fully characterized, the gradient of protein levels produced seems to
be broader and to require higher concentrations of doxycycline to induce (10 ng/mL to 500 ng/mL). However, both systems are capable of generating similar expression levels of MBNL1. Next, we characterized MBNL1-dependent splicing regulation using several events shown previously to be mis-spliced in Mbnl1:Mbnl2 knockout MEFs compared to control MEFs (52). Some of these events have also been shown to be mis-spliced in DM1 patient tibialis anterior muscle, and their mis-splicing correlated with changes in available concentrations of MBNL1 (134). Upon induction of a gradient of GFP-MBNL1 expression, dose-response curves for several MBNL1-dependent splicing events, similar to those previously reported in the HEK 293 cell line, were generated (plotted against log(dox) in Figure 3-6d and against relative [MBNL1] in Figure 3-6f). These 7 splicing events show a ~3-fold change in EC50 values with slopes ranging from 1 to 4, demonstrating a breadth of different dose-dependent behaviors with varying measures of cooperative regulation (Figure 3-6e and Figure 3-6g). This behavior matches well with the HEK 293 cell line; although a different subset of MBNL1-regulated targets was studied, the authors observed a 3-fold change in measured EC50 values and slopes ranging from 1 to 6 (134). To more directly compare MBNL1-dependent splicing regulation in both systems, RNAseq was performed on samples from each cell line in which cells were treated with either no doxycycline (0 ng/mL) or high concentrations (6.5 ng/mL for HEK 293 cells, 500 ng/mL for MEFs). Interestingly, these data revealed that 68 orthologous cassette exons respond similarly to induction of MBNL1 expression with a high degree of correlation (R spearman = 0.73) (Figure 3-7). This surprising result indicates that MBNL1
regulation is highly conserved across both species and cell types, further supporting the use of both cell lines in our studies. Based on this information, we would predict that events with a high correlation of MBNL1 regulation are highly conserved at the RNA sequence and structural level between mouse and human, while events that behave dissimilarly may exhibit lower conservation or be regulated differentially due to differences in the trans-acting factor environment of each cell type.

RNAseq in HEK 293 Cells at Variable MBNL1 Expression Levels in the Presence or Absence of RBFOX1 Reveals Common Modes of Co-Regulation

To more globally assess co-regulation of AS by RBFOX and MBNL, RNAseq was performed on HEK 293 cells treated with no/low (0 ng/mL), medium (2.2 ng/mL), and high (6.5 ng/mL) concentrations of doxycycline in the presence or absence of HA-RBFOX1 expression. This treatment induced a gradient of HA-MBNL1 expression that effectively regulated exon inclusion of a well-established MBNL-dependent splicing event, CLASP1 e20 (Figure 3-8a-d) (134). Interestingly, levels of HA-RBFOX1 expression achieved via transient transfection were minimal compared to the robust increase of HA-MBNL1 expression achieved via treatment with doxycycline (Figure 3-8a-b). Additionally, HA-RBFOX1 expression had no impact on MBNL1-regulated exon inclusion of CLASP1 e20 (Figure 3-8c-d). We began by analyzing alternative splicing at low and high concentrations of MBNL1 in the presence or absence of RBFOX1 expression to determine the global effect of RBFOX1 expression on MBNL1-regulated events at the two ends of the dosing curves. The presence of RBFOX1 shifted the splicing of a large number of events (>500 events including both cassette exon and alt changes are plotted in Figure 3-9a where the change of the splicing response due to
RBFOX1 expression is modeled as the direction and length of an arrow (the pointed end of the arrow shows the change in splicing when RBFOX1 is expressed). For example, in the context of endogenous INSR exon 11, in the absence of RBFOX1, the end of the arrow is plotted as (0.06, 0.56), representing a change in Ψ of 0.50 between low and high MBNL1 expression (Figure 3-9b). RBFOX1 expression induces a change at both the low and high Ψ values estimated via RNAseq (0.13, 0.52). The pointed end of the arrow drawn to this point shows how the expression of RBFOX1 alters MBNL1 splicing regulation, especially at low MBNL1 levels. This change in MBNL1 splicing regulation of endogenous INSR is consistent with the observed effects of RBFOX1 expression on the MBNL1 dose response for the WT INSR minigene (Figure 3-5b). This analysis was then used to infer changes in the shape of the MBNL1 dose-response curve upon RBFOX1 expression and to categorize the changes into 9 different classes, or modes, of splicing regulation for both positively and negatively regulated events (Figure 3-9c and Figure 3-9d). From these categorizations we can infer potential mechanisms of co-regulation and their relative usage within the cell from a more global perspective. Interestingly, the most common class for both positive and negative regulation (212 and 260 events, mode E and L, respectively) is one in which the addition of RBFOX1 dampens the MBNL1 dose dependency at low MBNL1 concentrations but has no impact at high concentrations of MBNL1 expression. This observation matches the mechanism inferred from dissection of the INSR minigene. Interestingly, the next two most common classes for both positive and negative regulation are those in which RBFOX1 appears to abrogate the MBNL1 dose response at all doses (lower right for positively regulated events, upper left for negatively regulated
events), potentially via an antagonistic mechanism whereby RBFOX1 mediates the opposite mode of exon inclusion and works against MBNL1 function. These and other potential modes of co-regulation are pictorially described in Figure 3-10. Overall, RNAseq analysis indicates that RBFOX1 co-regulates many MBNL1-regulated splicing events and that RBFOX1 favors specific mechanisms of co-regulation.

Inducible Expression of RBFOX1 in Mbnl1/2 Knockout Mouse Embryonic Fibroblasts

Due to the difficulty of transfecting MEFs, in order to characterize the impacts of RBFOX regulation in this system we chose to use the PiggyBac transposase system again to separately integrate an mOrange-RBFOX1 (the same nuclear isoform expressed via transient transfection in HEK 293 cells) under the control of an ecdysone-inducible promoter (178, 179). In this system, GFP-MBNL1 can be selectively expressed using doxycycline (Figure 3-11a) and mOrange-RBFOX1 induced via ponasterone A (PonA, Figure 3-11b). Additionally, both proteins can be co-expressed via dual drug treatment as detected by immunoblot (Figure 3-11c). Interestingly, via this ecdysone-inducible expression system we are able to increase relative mOrange-RBFOX1 levels above those of GFP-MBNL1 (~20-fold upregulation of GFP-MBNL1 vs. ~70-80-fold upregulation of mOrange-RBFOX1, both compared to untreated cells (mock)) (Figure 3-11d). Splicing analysis of two endogenous events, Plod2 and Numa1, revealed that the expression of mOrange-RBFOX1 led to changes in AS regulation compared to untreated cells (mock) as measured by ΔΨ. However, these changes were weaker than regulation of these events by MBNL1 alone, despite the distinct differences in protein levels as quantified by immunoblot (Numa1 ΔΨ = 0.90 for MBNL1, ΔΨ = 0.13 for RBFOX1 and Plod2 ΔΨ = 0.71 for MBNL1, ΔΨ = 0.37 for RBFOX1) (Figure 3-11e-g).
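The arrow representation described for the HEK 293 RNAseq analysis (Figure 3-9) reduces to simple arithmetic on two points, each giving exon inclusion at low and at high MBNL1 expression. As a hedged worked example, the sketch below uses the endogenous INSR exon 11 values quoted in the text, (0.06, 0.56) without RBFOX1 and (0.13, 0.52) with RBFOX1. The function name and dictionary keys are hypothetical, and the code does not attempt to reproduce the mode classification itself, whose exact criteria are not stated here.

```python
def arrow_summary(psi_without_rbfox, psi_with_rbfox):
    """Summarize how RBFOX1 changes a two-point MBNL1 response.

    Each argument is a (psi_at_low_MBNL1, psi_at_high_MBNL1) pair.
    """
    low0, high0 = psi_without_rbfox
    low1, high1 = psi_with_rbfox
    return {
        "span_without": round(high0 - low0, 2),  # MBNL1-driven change, no RBFOX1
        "span_with": round(high1 - low1, 2),     # MBNL1-driven change, plus RBFOX1
        "shift_low": round(low1 - low0, 2),      # RBFOX1 effect at low MBNL1
        "shift_high": round(high1 - high0, 2),   # RBFOX1 effect at high MBNL1
    }


# Endogenous INSR exon 11 values as quoted in the text.
insr = arrow_summary((0.06, 0.56), (0.13, 0.52))
print(insr)
```

For these values the MBNL1-dependent span shrinks from 0.50 to 0.39, with the larger RBFOX1-dependent shift occurring at low MBNL1 (+0.07) rather than at high MBNL1 (-0.04), in line with the dampening behavior described for this event.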
The differences in the strength of RBFOX1 regulation for each event did not appear to correlate with the number of RBFOX1 binding motifs in these target exons (Figure 3-12). Co-expression of both proteins did not change the overall splicing response beyond that of MBNL1 expression alone, indicating the potential for a dampening mechanism of co-regulation similar to that observed for the INSR minigene and predicted from RNAseq in the HEK 293 dosing system. Interestingly, while RBFOX1 expression can be generated in the HEK 293 system via transient transfection, it is important to note that the fold change in RBFOX1 expression achieved in the MEFs cannot be reproduced in that system (Figure 3-8b and 3-11d). These observed variations in expression could be due to differences in (1) the N-terminal fusion tag (HA vs. mOrange), (2) means of expression, (3) cellular background, or (4) means of detection (HA vs. mOrange antibody). We have previously reported that usage of a large tag stabilizes RBP expression levels in a similar MEF dosing system (122). Despite these differences, this initial characterization indicates that we can modulate both RBFOX1 and MBNL1 expression independently within the MEF dosing line to measure and characterize splicing co-regulation by both splicing factors.

Targeting of Orthologous Cassette Exons in HEK 293 and MEF Cells Reveals Similarities and Differences in RBFOX Co-Regulation

At the time of the publication of this dissertation, RNAseq analysis had not been fully completed on the MEFs in which RBFOX1 expression had been induced (samples described and validated in Figure 3-11). However, because we were interested in examining the regulation of orthologous exons in both systems by MBNL1 and RBFOX1, we used the MEF RNAseq data to identify orthologous cassette exons that
met a ΔΨ cutoff of 0.1 at either high MBNL1 expression, high RBFOX1 expression, or high expression of both splicing factors compared to untreated MEF cells (Bayes factor cutoff of 1). These 16 events were then used to generate targeted primer sets in both HEK 293 and MEF cells to perform RT-PCR-based splicing analysis at no/low, medium, and high MBNL1 expression levels in the presence and absence of RBFOX1 expression. We began with this limited dosing approach to (1) quickly compare and contrast general trends in MBNL1-dependent splicing regulation in both cell lines (inclusion or exclusion, general ΔΨ) and (2) compare the impacts of RBFOX1 expression on this regulation. In general, events responded similarly across the limited gradient of MBNL1 expression in both HEK 293 and MEF cells, although the general change in Ψ varied between cell lines for each event (compare blue bars in Figure 3-13 and 3-14 for positively and negatively regulated events, respectively). Interestingly, for two of the events probed (Map4k4/MAP4K4 and Insr/INSR; the lowercase gene name refers to the mouse event while the uppercase name refers to the orthologous human event), increased MBNL1 expression led to differential regulation (exon inclusion vs. exclusion) in the two cell lines (Figure 3-13b and Figure 3-14c, respectively). In the case of the Insr/INSR event, expression of the same human MBNL1 isoform promoted exon inclusion in the HEK 293 cells but promoted exon exclusion in MEFs. This difference in regulation does not appear to be due to significant differences in the distribution of MBNL cis-regulatory motifs (Figure 3-12), as their position is relatively conserved within the downstream intron. Interestingly, there are no putative YGCY motifs within 200 bp of the upstream intron in the mouse Insr locus that would be consistent with the observed exon skipping (Figure 3-12). In the case of the MAP4K4 /
Map4k4 event, exon 20 inclusion is promoted in the MEFs while exon exclusion is observed in the HEK 293 cells. Interestingly, while YGCY motifs are found on both sides of the regulated exon in both the mouse and human transcripts, there is a higher concentration of these motifs within the downstream intron 20 in the mouse Map4k4 transcript (Figure 3-12). It is possible that this increase in motifs leads to heightened recruitment of MBNL proteins and subsequent exon inclusion activity. Additionally, two events (Arhgap17/ARHGAP17 and Golim4/GOLIM4) responded to the dosing of MBNL1 in HEK 293 cells but not in MEF cells (Figure 3-14d and 3-14f) despite few differences in the distribution of potential regulatory motifs (Figure 3-12). Overall, the differences in MBNL1 regulation within the two systems highlight how variations in both the distribution of cis-regulatory elements and trans-acting factor expression, or cellular

When RBFOX1 was expressed in conjunction with this limited MBNL1 expression gradient, distinct differences in the MBNL1 dose response in both systems were discerned. The most striking observation was the relatively subtle, if not extremely limited, impact of RBFOX1 expression on the splicing response in the HEK 293 cells. In contrast, significant changes were observed when RBFOX1 expression was induced in the MEF dosing line. Grey boxes within Figure 3-13 and 3-14 outline the range in Ψ for the MBNL1 dose response in the absence of RBFOX1 to help visualize these changes in splicing regulation. While the expression of RBFOX1, in general, did not appear to change the directionality of splicing regulation (positive or negative exon inclusion), most of the impacts observed appeared to occur at either the no/low MBNL1 or high MBNL1 expression levels, consistent with the most common predicted mechanisms of co-regulation
observed in the HEK 293 RNAseq data (i.e., dampening (mode E and L) and antagonism (mode G and J)). Interestingly, the events that did not respond to MBNL1 expression in the MEF line (Arhgap17 and Golim4) did change in response to high levels of RBFOX1 expression (Figure 3-14d and 3-14f). Splicing regulation of some targets was altered in an interesting pattern upon RBFOX1 expression, specifically in the MEF system, where RBFOX1 appeared to have the most distinct impact. Firstly, in the case of Kif23, expression of RBFOX1 raised the Ψ value for all levels of MBNL1 expression and appeared to impact splicing in an 13f), suggestive of a co-regulation mechanism where both splicing factors independently regulate exon inclusion (mode C, Figure 3-10d). In fact, there is distinct separation between the single minimal RBFOX (GCAUG) and single MBNL binding motif in intron 8 of Kif23 that supports this mechanism (Figure 3-12). Another event of interest is Insr (Figure 3-14c). While the directionality of splicing regulation for Insr was not impacted by RBFOX1 expression, the overall Ψ value at each MBNL1 concentration was increased in a manner suggestive of a co-regulation mechanism where RBFOX1 antagonizes MBNL1 splicing regulation at all doses (Figure 3-9d, mode K). This observation is interesting because this predicted mode of co-regulation does not match that observed for the human orthologous exon in the HEK 293 RNAseq (Figure 3-9b) and the human INSR minigene (Figure 3-5). While there are no predicted YGCY motifs within 200 bp of the upstream intron consistent with the MBNL1-regulated exon skipping (Figure 3-12), there are two putative RBFOX binding motifs in the downstream intron 11 that, if utilized by RBFOX in vivo, would correspond with the exon inclusion activity observed.
Full Dose-Response Curves Magnify the Impacts of RBFOX1 Expression on MBNL1 Dose-Dependent Regulation of Alternative Splicing

Next, based on the preliminary testing of 16 select orthologous events, full MBNL1 dose-response curves were generated in the presence and absence of RBFOX1 expression in both dosing cell lines. In the MEF system, as expected based on the effects of RBFOX1 expression in our limited dosing range experiments, RBFOX1 expression reduced the ability of MBNL1 to regulate splicing at low concentrations but did not impact Ψ at high doses of MBNL1 for several events (Figure 3-15a-e). These observations are consistent with the HEK 293 RNAseq analysis, in which dampening of the MBNL1 dose-response curve was the most common predicted co-regulation mechanism observed (Figure 3-9c-d, mode E and L). Interestingly, despite these changes in the overall shape and slope of the MBNL1 dose response upon RBFOX1 expression, EC50 values were only minimally reduced or not significantly impacted for most events (Figure 3-15h), indicating that roughly the same concentration of MBNL1 was required for 50% maximal regulation. For Exoc1, RBFOX1 impacted Ψ at all concentrations of MBNL1 expression, shifting the entire dose-response curve without impacting the EC50 or slope (Figure 3-15f and Figure 3-15h). These observations are consistent with an additive mechanism of regulation whereby RBFOX1 and MBNL1 independently regulate AS of the target exon without impacting regulation by the other splicing factor (Figure 3-9d and Figure 3-10d, mode N). As expected based on the interesting mode of co-regulation observed in the limited dosing range for Insr (Figure 3-14c), RBFOX1 expression shifted the MBNL1 dosing curve upwards at all concentrations of MBNL1 expression without a significant
impact on slope, EC50 or span (Figure 3-15g and 3-15h). This change suggests a differential mechanism of co-regulation from that observed for the human INSR minigene whereby both splicing factors antagonize the function of the other by promoting either exon inclusion or exclusion in an independent manner (Figure 3-9d, mode K). In contrast to the MEF system, RBFOX1 expression in the HEK 293 cells had no apparent impact on the MBNL1 dose response curve for 5 events, including endogenous INSR exon 11 (Figure 3-16). This was quite surprising based on the general changes we detected in the RNAseq data (Figure 3-9). Although there are several reasons why these differences may have occurred, one potential confounding factor is the relatively high expression of RBFOX2 in this cell line. While endogenous RBFOX1 expression is relatively low based on read coverage from RNAseq, RBFOX2 expression was much higher, and high protein levels of RBFOX2 have been previously reported in HEK 293 cell lines (180). In a manner similar to that reported for MBNL1 and MBNL2, previous studies have indicated that RBFOX1 and RBFOX2 have compensatory functions. For example, in mouse RBFOX1 knockout brains, relatively few splicing changes can be detected due to the up-regulation of RBFOX2 levels (171). As such, we hypothesized that the modest levels of exogenous HA-RBFOX1 expression achieved via transfection were unable to increase the overall functional levels of RBFOX expression enough to effectively modify the MBNL1 dose response. While predicted to restrict the influence of RBFOX1 on the MBNL1 dose response, this background expression of RBFOX2 may also alter the MBNL1 dose response in our assays, even in the absence of exogenous RBFOX1 expression.
To test these hypotheses, a similar dosing experiment was performed, but alterations in levels of RBFOX expression were achieved by treating cells with siRNA against RBFOX2 (low RBFOX protein levels) or control, non-targeting siRNA (high RBFOX protein levels). After 48 hours of siRNA treatment, cells were treated with the same gradient of doxycycline previously utilized to induce a gradient of MBNL1 expression. Although the overall efficacy of RBFOX2 knockdown has not been evaluated at this time, it is clear from the splicing regulation analysis of 4 events that the curves are beginning to separate and spread in a manner not observed in the original experiments (Figure 3-17a-d vs. Figure 3-16a-e). Consistent with the lack of RBFOX RNA motifs for this event, no change in the MBNL1 dose response curve was observed for PALM e8 even with RBFOX2 knockdown (Figure 3-17b and Figure 3-12). While at this time we can only make limited conclusions from these data, the overall analysis performed to date has likely been impacted, at least in part, by expression of the cellular trans-acting splicing factor RBFOX2. Additionally, while we did not have the same difficulties in observing RBFOX1 and MBNL1 co-regulation in the MEF dosing line, RBFOX2 expression in this system still appears to be quite high based on read coverage from RNAseq experiments. However, we hypothesize that co-regulation by MBNL1 and RBFOX1 was observable in this system simply due to the extremely high levels of RBFOX1 expression achieved (Figure 3-11c), such that the overall functional concentration of RBFOX was significantly increased.
Discussion and Future Directions

Dose-Dependent Regulation of Alternative Splicing by MBNL1 is Conserved Across Distinct Species and Cellular Environments

One of the major purposes of this study was to determine whether MBNL-regulated splicing events exhibit differential dose-dependent behavior in varying cellular environments. To evaluate this question, two cell lines (HEK 293 and MEF) from distinct species and tissues were utilized in which MBNL1 cellular concentrations can be precisely controlled via a tet-on system. RNAseq analysis revealed that 68 orthologous cassette exons responded in a similar manner to MBNL1 dosing, indicating a distinct conservation of MBNL1 regulation across species (Figure 3-7). Despite differences in genomic context, tissue type ancestry, and trans-acting factor background, MBNL splicing was conserved for many events, further exhibiting the importance of this splicing factor in RNA processing. Targeted RT-PCR splicing analysis of select orthologous cassette exons also revealed that nearly all events responded to MBNL1 in a similar concentration-dependent manner, specifically in the direction of exon inclusion. Unsurprisingly, due to the variances in cellular context, the overall span and shape (slope) of the curves varied from event to event (Figure 3-13 to Figure 3-16). Despite these striking similarities, some orthologous splicing events did not respond in a similar manner. One of the most interesting events was CLASP1 / Clasp1 exon 20. While previously reported to robustly respond to MBNL1 dose in the HEK 293 system, this exon was not regulated by MBNL1 at any concentration in the MEFs (134) (Figure 3-18a-b). Interestingly, this distinct difference in splicing regulation is observed despite a similar distribution and number of predicted MBNL1 binding motifs in the downstream intron. However, there are several MBNL / RBFOX motifs within the
upstream intron flanking Clasp1 exon 20 that, if bound by either RBP, may induce exon exclusion as observed in the MEF system (Figure 3-18c). This example highlights that (1) splicing regulation in many cases cannot be exclusively explained by the presence of predicted cis-regulatory elements and (2) observations about and characterization of the splicing behavior of an RBP of interest can be complicated by the cellular background in which characterization is completed. Overall, these similarities and differences in splicing regulation not only highlight the conservation and importance of MBNL-dependent splicing across systems, but also emphasize how differences in the cellular context, such as differential trans-acting factor expression, can impact RBP splicing regulation despite conservation of the cis RNA elements of the splicing code.

Utilization of Dosing Systems Provides a Unique Method to Characterize the Impacts of Differential Trans-Acting Splicing Factor Expression on the MBNL1 Dose Response

While dose-dependent splicing regulation by MBNL has been previously characterized and the impacts of specific predicted cis-regulatory elements on these dosing curves evaluated (134, 152), the effects of differential trans-acting splicing factor concentrations on dose-dependent splicing behavior have not been characterized. Most splicing research to date has relied on binary on/off measurements under conditions of splicing factor knockdown or over-expression. This experimental perspective limits deep mechanistic insights into splicing regulation and stratification of splicing events by their sensitivity. Because of the advantages of dose-based experimental analysis, this system was chosen to characterize the impacts of RBFOX1 expression on the MBNL1 dose response and to parse apart splicing co-regulatory mechanisms. Initial experiments in HEK 293 cells using the human INSR minigene revealed response due to the use of an
overlapping regulatory motif not previously predicted to be required for MBNL1-dependent splicing regulation (Figure 3-5). When MBNL concentrations were low, RBFOX1 had a large effect on Ψ. As concentrations of MBNL1 increased, this effect was minimized. The impact of RBFOX1 expression was mitigated by mutation of the putative RBFOX site to a sequence where MBNL1 binding was not impaired (INSR mutant 2) (Figure 3-5). These observations support a co-regulatory mechanism where both factors positively regulate exon inclusion via use of the same cis element. At low MBNL concentrations, RBFOX1 is able to bind its target motif, leading to an increase in Ψ; as MBNL1 concentration increases, RBFOX1 binding to the same site is outcompeted. Support of this mechanistic hypothesis could be improved by in vitro competition binding assays. Overall, these initial experiments indicated that (1) RBFOX1 and MBNL1 are capable of co-regulating AS and (2) dampening of the MBNL dose response by RBFOX will likely be a common co-regulation mechanism because the non-perfect MBNL binding site found to be critical for INSR minigene regulation (UGCA) is present in all RBFOX hexanucleotide binding motifs (UGCAUG). Overall, the use of this dosing system allows for complex visualization of AS regulation in a way that is not often appreciated or capitalized upon in the splicing field and was critical for evaluating mechanisms of MBNL1 / RBFOX1 co-regulation of the INSR reporter.

RBFOX and MBNL Proteins Co-Regulate a Group of Events Via Common Mechanisms

Based on the mutational analysis with the INSR minigene, we discovered the importance of the non-perfect MBNL motif within the RBFOX RNA binding element UGCAUG (MBNL motif: UGCA) for MBNL / RBFOX splicing co-regulation (Figure 3-5). Because the UGCA MBNL RNA binding motif is present in all hexanucleotide
RBFOX binding motifs, this mode of co-regulation was predicted to be commonly observed. RNAseq and targeted RT-PCR based splicing analysis in HEK 293 and MEF cells revealed that the dampening effect of RBFOX1 expression on the MBNL1 dose response was observed for many endogenous splicing events, including Plod2 and Numb (Figure 3-9, Figure 3-15a and 3-15d). Mutational analysis of minigenes derived from these events or the use of antisense oligonucleotides (ASOs) to block RBP binding in cells will be required to clarify whether the observed dampening effect is due to competition for binding to overlapping regulatory RNA elements in the pre-mRNA (working model for this mode of co-regulation in Figure 3-10a). Other probabilistic modes of RBFOX / MBNL co-regulation were discerned from this experimental analysis, most notably additive effects. For example, in the case of Exoc1 and Kif23 (Figure 3-13f and Figure 3-15f), RBFOX1 expression concentrations of MBNL1, enhancing the overall regulation of the event. We hypothesize for these splicing targets that while both splicing factors enhance or repress exon inclusion, they do not interact and bind separate RNA motifs. If this model holds true, mutation or blockage of the RBFOX1 binding motif would not impact the MBNL1 dose response or the binding affinity of MBNL1 for the pre-mRNA as assayed by in vitro RNA binding assays. Overall, these extensive analyses appear to indicate that, in general, RBFOX1 and MBNL1 are members of an alternative splicing network based on compensatory or additive regulation of AS. Interestingly, recent studies have found that RBFOX motifs are enriched downstream of exons repressed by CELF proteins and that there is a global, antagonistic co-regulation of alternative splicing by these two families of splicing factors
(181). Additionally, it is well established that CELF and MBNL proteins antagonistically regulate AS of many events during heart, muscle, and neuronal development (33, 34, 69). CELF protein levels are also up-regulated in DM tissues, contributing to the spliceopathy of this disease (153, 154). Interestingly, many of the events identified to be antagonistically regulated by RBFOX and CELF proteins were either previously reported to be regulated by MBNL1 (CLASP1, MBNL2) or were found to be co-regulated in our studies by both MBNL1 and RBFOX1 (ADD3) (134, 181). This combination of work suggests that the families of CELF, MBNL, and RBFOX proteins are members of a complex AS regulatory network where MBNL/RBFOX work in an antagonistic manner to CELF. Understanding how cellular concentrations of all three families of proteins vary, and the impact of each on splicing behavior, will be critical for comprehending the contribution of each during developmental AS transitions and mis-splicing in disease. Despite clear evidence that RBFOX and MBNL proteins regulate AS in the same manner for many events, one of the most interesting events analyzed in this study is INSR / Insr exon 11. Surprisingly, in the dosing systems the same human MBNL1 isoform mediates opposite patterns of regulation (inclusion in the HEK 293 cells, exclusion in MEFs) (Figure 3-14c). Additionally, RBFOX1 expression has differential and distinct impacts on the MBNL dose-dependent regulation. While we are not able to fully evaluate the impacts of RBFOX1 expression on the endogenous INSR event due to high levels of RBFOX2 expression in HEK 293 cells, in the context of the INSR minigene we determined that RBFOX1 and MBNL1 both bind to an overlapping cis-regulatory element to enhance exon inclusion (Figure 3-5, working model in Figure 3
MBNL1, antagonistically functioning against MBNL1 regulation of exon exclusion (Figure 3-15g). We hypothesize that in this case, RBFOX1 and MBNL utilize distinct regulatory motifs on opposite sides of the regulated exon. The fact that the direction of exon inclusion and shape of the MBNL1 dose curve was not altered in the presence or absence of RBFOX further supports this probabilistic mode of co-regulation. Interestingly, while MBNL regulation of this orthologous exon is not conserved across species, RBFOX1 splicing is preserved, potentially via use of relatively conserved RNA motifs (Figure 3-12).

Compensatory Action of Other Members of Splicing Factor Families Complicates Characterization of Splicing Co-Regulation

Despite significant progress in characterizing the co-regulation of AS by MBNL1 and RBFOX1 proteins, one of the problems encountered in this study was the compensatory action of other RBFOX family members. While the sensitivity of RNAseq allowed for the detection of RBFOX1 impacts on MBNL1-dependent splicing regulation in HEK 293 cells, targeted RT-PCR of endogenous splicing events in this same system indicated that the impacts of RBFOX1 expression were far subtler than would be predicted from either minigene or RNAseq based analysis (Figure 3-9 and Figure 3-16). We postulated and preliminarily confirmed that the lack of effects observed was due to high expression levels of RBFOX2 (Figure 3-17). Many previous reports have indicated that RBFOX1 and RBFOX2 have the same RNA binding motif and can act to regulate RNA splicing in a compensatory manner in the absence of the other homolog (171, 182, 183). The presence of high levels of RBFOX2 in HEK 293 cells may limit impacts from the modest over-expression of RBFOX1 achieved via transient
transfection. Further work will need to be performed to validate these preliminary conclusions. While RBFOX2 is also expressed in the MEFs based on preliminary RNAseq analysis, we hypothesize that the extremely high levels of RBFOX1 induced via titration of ponasterone A are able to increase functional levels of RBFOX protein such that impacts on splicing regulation are observed, although this may be due to use of non-perfect or not normally utilized binding motifs. In a similar manner, the effects of RBFOX1 expression on the INSR minigene may have been observed due to the high number of pre-mRNA transcripts produced from these vectors. The increased substrate concentration may have magnified the impacts of RBFOX1 expression on splicing in this case. In continuing to evaluate MBNL and RBFOX co-regulation of AS, the functionality of RBFOX2 in both dosing lines will need to be considered. While not directly addressed in this study, it is important to note that MBNL3 expression, while predicted to be low in the HEK 293 cells (134), may still have impacts on the MBNL dose response in the MEFs as it is not genetically knocked out like Mbnl1 and Mbnl2. Levels of MBNL3 expression should be evaluated moving forward with splicing analysis in this system.

RBFOX Expression Buffers MBNL Splicing Activity: Impacts on Disease Phenotypes and Biomarker Selection for Therapeutic Development

Coordinated regulation of AS by RBFOX and MBNL proteins has many significant biological implications. Most importantly, co-regulation by these two families of splicing factors provides flexibility in regulating sub-networks of cassette exons beyond simply that of each individual RBP. This flexibility in regulation is especially critical in instances where one splicing factor is absent or present at a
low cellular concentration. The ability of RBFOX proteins to buffer MBNL splicing activity is especially impactful in the context of disease, specifically DM, where functional MBNL concentrations are reduced due to sequestration by the toxic RNA produced (55, 102, 103). The presence of RBFOX in MBNL-depleted tissues may buffer the impacts of functional MBNL loss, potentially mitigating the degree of mis-splicing observed in these tissues. In fact, increased RBFOX2 expression highly correlates with reduced inferred levels of functional, non-sequestered MBNL proteins from RNAseq data from a large cohort of DM patients (Figure 3-19). This suggests that (1) RBFOX expression may be upregulated in DM as a cellular compensatory mechanism to rescue mis-splicing and (2) some of the phenotypic variation experienced by patients may in part be due to variations in expression of this family of trans-acting splicing factors. These observations also have important consequences for splicing biomarker selection in therapeutic trials for DM. One of the major goals in the DM field is the development of effective biomarkers that estimate changes in functional levels of MBNL1 upon therapeutic intervention (134, 184). Selection of biomarkers has, in part, proven difficult due to use of splicing events that do not provide sufficient predictive power when considering the broad range of DM severity. While the analysis of MBNL dose-dependent splicing regulation has revealed splicing events that respond gradually to MBNL expression (134), because of the now identified RBFOX and MBNL co-regulation it will be important to characterize the impact of variable RBFOX expression on the MBNL1 dose response for these events. If RBFOX alters the shape of the MBNL dose response such that it no longer responds to perturbations of MBNL expression in a
gradual manner, the event may not serve as an adequate biomarker to accurately measure the therapeutic value of a treatment in clinical trials.

Continuing to Explore Interesting and Unique Events Co-Regulated by MBNL and RBFOX

Within the context of this study, many MBNL1- and RBFOX1-dependent splicing events were identified, some with both interesting predicted modes of regulation and relevance in development and disease. One such event is NUMB, the expression of which is important for proper cellular differentiation and which acts as an antagonist of Notch function in mammalian systems (185). In our studies, MBNL1 and RBFOX1 regulated exclusion of the orthologous cassette exon in both systems (Figure 3-15d and 3-16c). Previous reports have indicated that exclusion of the same cassette exon identified in this study via an upstream UGCAUG RBFOX3 binding motif is critical for proper neuronal differentiation during development (186). Interestingly, the mis-splicing of the same exon leads to an increased inclusion isoform that has been linked with cancer and enhanced cell proliferation (148, 187, 188). As such, the mis-splicing of NUMB may play a part in the increased cancer risk experienced by DM patients (189, 190). PLOD2 is another interesting event identified in our analysis. The procollagen lysine, 2-oxoglutarate 5-dioxygenase (i.e. lysyl hydroxylase 2, PLOD2) gene encodes a protein that is important in collagen biosynthesis, and the mis-splicing of the same orthologous exon targeted in this study is linked with scleroderma (191). Surprisingly, upstream RBFOX binding motifs in a PLOD2 minigene were found to be the most critical for regulation despite RBFOX expression regulating inclusion of the cassette exon (191). In our dosing system, MBNL1 expression leads to exon 14 inclusion for both the mouse and human exon and, similar to other events, RBFOX1 expression
dampens this response in the MEFs (Figure 3-15a and Figure 3-16a). Interestingly, there is a high density of conserved MBNL and RBFOX binding motifs in the upstream intron. Additionally, CLIP-seq of MBNL1 in mouse C2C12 myoblasts also revealed two large CLIP clusters in this conserved region, including one that overlaps with an RBFOX motif (Figure 3-20) (44). Experimental evidence indicates that MBNL and RBFOX proteins mediate exon inclusion by binding to upstream motifs, a pattern that does not match classical RNA splicing maps for either splicing factor. Currently, we are investigating how both proteins work to co-regulate this event by generating minigenes for both the mouse and human cassette exon.

Materials and Methods

Plasmids and Cloning

N-terminal HA-tagged RBFOX1 (1-397 amino acids, RNA binding protein fox-1 homolog isoform 4; NCBI accession number NP_061193.2) was synthesized (Life Technologies) and cloned into pcDNA3.1+ using standard cloning procedures. Mutagenesis of the INSR minigene was performed using a QuikChange protocol (Agilent) and Phusion polymerase (NEB).

Creation of Stable, Inducible MBNL1 / RBFOX1 Expression Cell Lines

The doxycycline-inducible HA-MBNL1 (amino acids 1-382; splice isoform a; NCBI accession number NP_066368) in the Flp-In T-REx HEK 293 cell lines (Life Technologies) was created as previously reported (134). To generate the GFP-MBNL1 inducible MEFs, N-terminal GFP-tagged constructs encoding the same MBNL1 isoform were cloned into PB PuroTet, a vector containing PiggyBac Transposon sequences (143) flanking a PGK-driven puromycin cassette and a minimal CMV promoter downstream of a TetR response element (TRE) to drive doxycycline-inducible
expression of GFP-MBNL1. The In-Fusion cloning system (Clontech) was utilized according to the manufacturer's instructions to clone the GFP-tagged constructs into the PB PuroTet vector. At 60% confluency in 6-well plates, mouse embryonic fibroblasts (MEFs) deficient in MBNL1 and MBNL2, gifted by Maurice Swanson, were transfected PuroTet vector encoding GFP-tagged MBNL1 Tet On Advanced (vector containing PiggyBac Transposon sequences (143) flanking rtTA Advanced (Clontech) under CMV-driven expression as well as a puromycin selection LT1 (Mirus) as were subjected to puromycin selection (4 ), allowed to recover for several days, and then exposed to 1000 ng/mL doxycycline (Sigma) for 24 hours. Cells were then sorted for high GFP expression using the SH800S Cell Sorter (Sony). Individual clones were isolated and the populations expanded in the presence of puromycin. Individual clones were selected for experimental use based on GFP-MBNL1 expression across a range of doxycycline concentrations. The ponasterone A inducible mOrange-RBFOX1 MEFs were generated by taking the Mbnl1: Mbnl2 KO MEFs reconstituted with the doxycycline-inducible GFP-MBNL1 and repeating an additional round of PiggyBac Transposase integration. N-terminal mOrange-tagged RBFOX1 (same isoform used in HEK 293 transfection experiments) was cloned into PB PuroPonA, a vector containing PiggyBac Transposon sequences (143) flanking a PGK-driven puromycin cassette and a minimal CMV promoter downstream of a 5x E/GRE response element to drive ponasterone A inducible expression of mOrange-RBFOX1. The In-Fusion cloning system was utilized
according to the manufacturer's instructions to clone the mOrange-tagged construct into the PB PuroPonA vector. At 60% confluency in 6-well plates, MEFs were transfected g of PB PuroPonA vector encoding mOrange-RBFOX1 PERV3 (vector containing PiggyBac Transposon sequences (143) flanking RxR and VgEcR under CMV-driven expression as well as a puromycin selection ca g) using TransIT LT1 instructions. After 24 hours the cells were subjected to puromycin selection (4 ), allowed to recover for several days, and then exposed to 1000 ng/mL doxycycline (Sigma) and 10 µM ponasterone A (ThermoFisher) for 24 hours. Cells were then sorted for high GFP and high mOrange expression using the SH800S Cell Sorter (Sony). Individual clones were isolated and the populations expanded in the presence of puromycin. Individual clones were selected for experimental use based on GFP-MBNL1 and mOrange-RBFOX1 expression across a range of doxycycline and ponasterone A concentrations, respectively.

Cell Culture, Transfection, and Doxycycline / Ponasterone A Treatment

HEK (DMEM) Glutamax (Gibco) supplemented with 10% fetal bovine serum (FBS), 10 blasticidin, and 150 hygromycin at 37 °C under 5% CO2. For all minigene work, cells were plated in twenty-four well plates at a density of 1.5 x 10^5 cells/well. Cells were transfected 24 hours later at approximately 80% confluence. Plasmid (500 ng/well) was trans l of TransIT 293 (Mirus) following the RBFOX1 expression vector was co were co-transfected with 250 ng of a single minigene reporter. Fresh doxycycline (Sigma) was prepared at 1 mg/mL, diluted, and added to
the cells at the appropriate concentrations four hours post transfection. 20 hours post doxycycline treatment, cells were harvested using TrypLE (Gibco) and pelleted using centrifugation. To evaluate changes in endogenous splicing, HEK 293 cells were plated in twenty-four well plates at a density of 1.5 x 10^5 cells/well. Cells were transfected 24 hours later at approximately 80% confluence. Empty vector (mock, pcDNA3.1+) or our HA-RBFOX1 expression vector (500 ng/w l of TransIT 293 Fresh doxycycline was prepared at 1 mg/mL, diluted, and added to the cells at the appropriate concentrations four hours post transfection. 20 hours post doxycycline treatment, cells were harvested using TrypLE and pelleted using centrifugation. For siRNA treatment in HEK 293 cells, cells were plated at a density of 7.5 x 10^4 cells/well. 24 hours post plating, INTERFERin (Polyplus) was utilized as per the targeting siRNA (SMARTpool: ON-TARGETplus non-targeting, Dharmacon) or RBFOX2 siRNA (SMARTpool: ON-TARGETplus RBFOX2, Dharmacon) at a final concentration of 1 nM. 48 hours post transfection, fresh doxycycline was prepared at 1 mg/mL, diluted, and added to the cells. 24 hours post doxycycline treatment, cells were harvested using TrypLE and pelleted using centrifugation. MEFs were regularly maintained in DMEM Glutamax supplemented with 10% FBS and 2 puromycin at 37 °C under 5% CO2. To assay endogenous splicing regulation, cells were plated in twelve-well plates at a density of 6 x 10^4 cells/well. After 24 hours, fresh doxycycline was prepared at 1 mg/mL, diluted, and then added to the
cells at the appropriate concentrations to induce a range of GFP-MBNL1 protein expression. If mOrange-RBFOX1 expression was to be induced, fresh ponasterone A (ThermoFisher) was dissolved in 100% ethanol and added to the cells at the along with doxycycline. Cells were treated with 100% ethanol as a vehicle control if not exposed to ponasterone A. 24 hours post drug treatment, cells were harvested using TrypLE and pelleted using centrifugation.

Cell-Based Splicing Assay

RNA was isolated from HEK 293 cells for minigene experiments using an RNeasy kit (Qiagen). The isolated RNA was processed via reverse transcription (RT)-PCR and the percent spliced in (PSI, Ψ) (i.e. percent exon inclusion) for each minigene event determined as previously described (196). For all other experiments targeting endogenous events in both HEK 293 and MEF cells, RNA was harvested utilizing the Aurum Total RNA Mini kit (Bio-Rad) and DNase treated on column. 1000 ng of DNase-treated RNA was reverse transcribed with SuperScript IV (Life Technologies) with random recommended SuperScript IV was utilized. cDNA was then PCR amplified for 25-32 cycles using flanking exon-specific primers. Primer sequences, annealing temperatures, and inclusion and exclusion product sizes in base pairs are listed in Table 3-2. Samples were visualized and quantified using the Fragment Analyzer (DNF-905 dsDNA 905 reagent kit, 1-500 bp, Advanced Analytical Technologies) and associated ProSize data analysis software. Ψ values were plotted against relative MBNL levels as determined by immunoblot or log(doxycycline (ng/mL)) and fit to a four-parameter dose curve ($\Psi = \Psi_{\min} + (\Psi_{\max} - \Psi_{\min}) / (1 + 10^{(\log(\mathrm{EC}_{50}) - \log[\mathrm{MBNL1}]) \times \mathrm{slope}})$). Parameters that correlate to
biological data, i.e. concentration (EC50) and steepness of response (slope), were then derived from these curves.

Immunoblot Analysis

Cell pellets were lysed in RIPA (25 mM Tris-HCl pH 7.6, 150 mM NaCl, 1% NP-40, 1% sodium deoxycholate, 0.1% SDS) (ThermoFisher) supplemented with 1 mM phenylmethylsulfonyl fluoride and 1X protease inhibitor cocktail (SigmaFAST, Sigma) by light agitation for 15 minutes via vortex. Equal amounts of lysate were resolved on 4-15% SDS-PAGE gels prior to transfer. For blots with lysates from HEK 293 cells, HA-MBNL1 and HA-RBFOX1 proteins were probed using a mouse anti-HA antibody (1:1,000 dilution, 6E2, Cell Signaling Technology) and goat anti-mouse secondary IRDye 800CW (1:15,000 dilution, LI-COR). A GAPDH loading control was probed using rabbit anti-GAPDH antibody (1:1,000 dilution, 14C10, Cell Signaling Technology) followed by goat anti-rabbit IRDye 680RD (1:15,000 dilution, LI-COR). Blots from MEF lysates were probed with a rabbit anti-GFP (1:1,000 dilution, D5.1, Cell Signaling Technology) to detect GFP-MBNL1 expression and a donkey anti-rabbit secondary IRDye 680RD (1:15,000 dilution, LI-COR). mOrange-RBFOX1 expression was detected with a mouse anti-mOrange antibody (1:2,000 dilution, OriGene 4E10) and a donkey anti-mouse IRDye 680RD (1:15,000 dilution, LI-COR). A GAPDH loading control was probed using a chicken anti-GAPDH antibody (1:2,000 dilution, ab14247, Abcam) followed by a donkey anti-chicken secondary 800CW (1:15,000 dilution, LI-COR). In both systems, fluorescence was measured using a LI-COR Odyssey Fc or LI-COR Odyssey CLx Imaging instrument. Quantification was performed using the associated Image Studio analysis software (LI-COR).
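The four-parameter dose curve described under the Cell-Based Splicing Assay can be sketched in a few lines of Python. This is a minimal illustration, not the analysis pipeline used in the study: the function names and all numeric values are hypothetical, and computing Ψ as the inclusion product's share of the total quantified signal is an assumption about the calculation cited as reference (196).

```python
def percent_spliced_in(inclusion, exclusion):
    """Psi as the inclusion product's share of total signal, in percent.

    Assumption: inclusion and exclusion are quantified product amounts
    in the same units (for example, molar amounts from the analyzer).
    """
    total = inclusion + exclusion
    if total <= 0:
        raise ValueError("total signal must be positive")
    return 100.0 * inclusion / total


def four_parameter_psi(log_mbnl1, psi_min, psi_max, log_ec50, slope):
    """Psi = min + (max - min) / (1 + 10**((log EC50 - log[MBNL1]) * slope))."""
    exponent = (log_ec50 - log_mbnl1) * slope
    return psi_min + (psi_max - psi_min) / (1.0 + 10.0 ** exponent)


# Hypothetical curve parameters, for illustration only.
params = dict(psi_min=20.0, psi_max=80.0, log_ec50=0.5, slope=1.5)

for log_dose in (-1.0, 0.0, 0.5, 1.0, 2.0):
    psi = four_parameter_psi(log_dose, **params)
    print(f"log[MBNL1] = {log_dose:5.1f}  ->  Psi = {psi:5.1f}")
```

When log[MBNL1] equals log(EC50), the exponent is zero and the model returns the midpoint between the minimum and maximum Ψ, which is why the EC50 reads directly as the concentration giving half-maximal regulation.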
Immunofluorescence and Microscopy

To acquire immunofluorescence images in GFP-MBNL1 inducible MEFs, glass cover slips were placed in six-well plates and treated with poly-lysine solution for 30 minutes at 37 °C. MEFs were then plated at 8 x 10^4 cells/well. 24 hours later, fresh doxycycline was prepared at 1 mg/mL, diluted, and then added to the cells at the appropriate concentrations to induce GFP-MBNL protein expression. 24 hours post drug treatment, cells were fixed for 10 minutes on ice with 4% paraformaldehyde. Cells were then permeabilized with 0.1% Triton X-100 in 1X PBS for 10 minutes at room temperature (RT). Next, cells were treated with Image-iT FX Signal Enhancer (Invitrogen) for 30 minutes at RT. The cells were probed overnight at 4 °C with chicken anti-GFP (1:2,000 dilution, Abcam ab13970). After 3 washes in 1X PBS for 5 minutes at RT, cells were then probed with goat anti dilution, Invitrogen) for 1 hour at RT. Finally, cells were mounted using ProLong Diamond Antifade Mountant with DAPI (Invitrogen). After the slides had cured, images were acquired using a Zeiss Axioskop 2 with equal exposures across all samples.

High Throughput RNA Sequencing

RNA to be utilized for RNAseq experiments was harvested utilizing the Aurum Total RNA Mini kit (Bio-Rad) and DNase treated on column. RNA quality was assessed using the Fragment Analyzer (DNF-471 standard sensitivity RNA analysis kit, RQN > 9.5). RNAseq libraries were prepared using the NEBNext Ultra II Directional RNA Library Kit with rRNA depletion. Library quality was assessed using the Fragment Analyzer DNF-474 High Sensitivity NGS Fragment Analysis Kit. Concentrations of each library were quantified using the KAPA Library Quantification Kit, pooled, and
sequenced using the Illumina NextSeq 500 75 x 75 paired-end reads kit. All protocols
Figure 3-1. RBFOX and MBNL proteins regulate alternative splicing in the same positional-dependent manner. The families of RBFOX and MBNL RBPs regulate splicing via binding to (U)GCAUG and YGCY RNA motifs, respectively. Binding to these motifs upstream of a regulated exon represses exon inclusion while binding to these same motifs downstream enhances exon inclusion. We hypothesize that MBNL may be able to bind to RBFOX binding motifs, especially if binding to motifs with disfavored nucleotide content in the fourth position (A vs Y (C or U)) (upper right image).
Figure 3-2. Titration of doxycycline produces a gradient of HA-MBNL1 protein expression from an integrated transgene in HEK-293 cells. A) Representative immunoblot showing increasing levels of HA-MBNL1 expression resulting from doxycycline titration (ng/mL) in HEK-293 cells. B) Quantification of HA-MBNL1 levels via immunoblot against the HA tag (n=1). Relative levels of HA-MBNL1 were normalized to GAPDH. The HA-MBNL1 expression level at 1 ng/mL doxycycline was then set equal to 1, and protein levels at all other doses were normalized to it and plotted against log(dox). Data are fit to a sigmoidal dose curve.
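The two-step normalization described in panel B (first to the GAPDH loading control, then to a reference dose) can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names and the band intensities in the example are hypothetical and are not measurements from this work.

```python
import math

def relative_expression(target, loading, reference_dose):
    """Normalize band intensities to a loading control, then to a reference dose.

    target, loading: dicts mapping doxycycline dose (ng/mL) to band intensity
    (e.g. HA-MBNL1 and GAPDH). The ratio at reference_dose is set equal to 1.
    Hypothetical helper for illustration only.
    """
    ratios = {dose: target[dose] / loading[dose] for dose in target}
    reference = ratios[reference_dose]
    return {dose: ratio / reference for dose, ratio in ratios.items()}

def log10_dose(dose):
    """x-axis value for plotting; only defined for doses greater than zero."""
    return math.log10(dose)

# Made-up intensities, for illustration only.
example = relative_expression(
    target={1: 100.0, 10: 400.0},
    loading={1: 200.0, 10: 400.0},
    reference_dose=1,
)
```

Because log(dox) is undefined at 0 ng/mL, an untreated sample cannot be placed on the log axis directly and would need to be handled separately when plotting.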
Figure 3-3. MBNL1 and RBFOX1 regulate splicing of the INSR and Nfix minigenes. A) Schematic showing portions of each minigene (intron and regulated exon; INSR exon 11, Nfix exon 8) containing predicted RBFOX binding motifs (red). Predicted MBNL binding motifs are shown in blue. B) Bar plot showing changes in percent spliced in (Ψ) for the INSR or Nfix minigene upon HA-MBNL1 expression induced via doxycycline treatment, HA-RBFOX1 expression induced via transient transfection, or expression of both proteins (n=3 for each treatment). (Data represented as average ± SEM; *** p < 0.001, ** p < 0.01, * p < 0.05, t test.) (Courtesy of Sunny Ketchum)
Figure 3-4. Evaluating splicing across a gradient of MBNL1 protein expression allows for characterization of dose-dependent splicing behavior due to variances in cis-element parameters and trans-acting factor environment. A) Schematic showing a theoretical example of the dose-dependent relationship between log[MBNL1] and Ψ. Curve-fitting parameters (including EC50) are shown. B) Both Ψ and the overall shape of the MBNL1 dose-response curve will vary as a function of MBNL1 concentration as well as cis-element parameters, including RNA sequence and structure. C) We predict that dose-dependent behaviors will be altered as a function of the trans-factor environment. Schematic shows a theoretical shift in the MBNL1 dose-response curve from cells or model systems containing low RBFOX expression to those having high RBFOX levels.
Figure 3-5. RBFOX1 expression dampens the MBNL1 dose response via an overlapping RNA binding motif in the INSR minigene. A) Schematic showing sections of the INSR minigene containing a predicted RBFOX binding site (red) and MBNL binding sites (blue). The sequence surrounding the RBFOX binding site is shown, with the RBFOX site boxed in red. Sequences of the mutated minigenes (INSR mutant 1 and INSR mutant 2) are shown, and the preserved MBNL binding site in the INSR mutant 2 minigene is boxed in blue. B-D) MBNL1 dose-response curves for the INSR wild-type, mutant 1, and mutant 2 minigenes. Cells were transfected with either an empty plasmid (blue, -RBFOX1) or an HA-RBFOX1 expression plasmid (red, +RBFOX1), and a gradient of MBNL1 expression was generated via titration of doxycycline. Ψ was measured at each concentration, plotted against log(dox) as a proxy for relative MBNL1 expression levels, and fit to a four-parameter dose curve. Representative splicing gels are shown below each plot, and average Ψ values (± SEM) are listed (n=3 for each treatment). E) Table summarizing parameters of each dose-response curve. (Courtesy of Sunny Ketchum)
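For readers unfamiliar with the four-parameter dose curve referenced in these captions, the sketch below evaluates one common parameterization (a variable-slope sigmoid on a log-dose axis). It is an assumption that this matches the exact form used for fitting here; the function and parameter names (`bottom`, `top`, `log_ec50`, `slope`) are illustrative, and the example values are made up.

```python
def four_param_psi(log_dose, bottom, top, log_ec50, slope):
    """Evaluate a four-parameter (variable-slope) dose-response curve.

    log_dose : log10 of the dose (e.g. log10 of doxycycline in ng/mL)
    bottom   : Psi plateau at low dose
    top      : Psi plateau at high dose
    log_ec50 : log10 dose giving a half-maximal change in Psi
    slope    : Hill-type slope controlling the steepness of the transition
    """
    return bottom + (top - bottom) / (1.0 + 10.0 ** ((log_ec50 - log_dose) * slope))

def span(bottom, top):
    """Total change in Psi across the dose range."""
    return top - bottom

# Made-up parameters for an exon whose inclusion rises with MBNL1 dose.
curve = [four_param_psi(x, bottom=0.2, top=0.8, log_ec50=0.5, slope=2.0)
         for x in (-1.0, 0.0, 0.5, 1.0, 2.0)]
```

For an exon whose inclusion is repressed by MBNL1, the same function applies with `top` smaller than `bottom`, giving a negative span. Under this parameterization, a dampened dose response would appear as a reduced slope or a shifted log(EC50) at an otherwise similar span.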
Figure 3-6. Titration of doxycycline generates a gradient of GFP-MBNL1 expression in Mbnl1:Mbnl2 KO MEFs that regulates splicing of several MBNL1-dependent splicing events. A) Representative immunoblot showing increasing levels of GFP-MBNL1 expression resulting from doxycycline titration (ng/mL) in MEFs. B) Quantification of GFP-MBNL1 levels via immunoblot against the GFP tag (n=1). Relative levels of GFP-MBNL1 were normalized to GAPDH. The GFP-MBNL1 expression level at 10 ng/mL doxycycline was then set equal to 1, and protein levels at all other doses were normalized to it and plotted against log(dox). C) Subcellular localization of GFP-MBNL1 upon doxycycline induction was determined using immunofluorescence against the GFP tag. In untreated cells, no discernible GFP-MBNL1 was detected. Exposure to 100 ng/mL doxycycline led to a significant increase in GFP-MBNL1 detection, which showed a mainly nuclear localization pattern with some signal in the cytoplasm. D) MBNL dose-response curves for seven endogenous splicing events. Cells were treated with a titration of doxycycline to create a gradient of GFP-MBNL1 protein expression. Ψ was measured at each concentration, plotted against log(dox) as a proxy for relative MBNL1 expression levels, and fit to a four-parameter dose curve (n=1). E) Table summarizing parameters of each dose-response curve in panel D. F) Same MBNL dose-response curves as in panel D plotted against relative MBNL1 levels as quantified in panel B. G) Table summarizing parameters of each dose-response curve in panel F.
Figure 3-7. Many orthologous cassette exons are regulated in a similar MBNL1-dependent manner in both HEK-293 and MEF dosing cell lines as assayed by RNAseq. Cells were treated with either no doxycycline (no dox) or high concentrations of drug (6.5 ng/mL for HEK-293 cells, 500 ng/mL for MEFs). Changes in Ψ (high dox - no dox) for orthologous cassette exons in MEFs were plotted against the same values in HEK-293 cells. Green points mark the 68 orthologous cassette exons (top right and bottom left quadrants) regulated in a similar manner in both cell lines with a strong degree of correlation (Spearman R = 0.73). Only a few cassette exons (8 total, top left and bottom right quadrants) are significantly differentially regulated between the cell lines. (Analysis and figure courtesy of Eric Wang)
Figure 3-8. Validation of HA-MBNL1 / HA-RBFOX1 expression and splicing regulation in HEK-293 cells. A) Immunoblot showing HA-MBNL1 and HA-RBFOX1 expression in HEK-293 cells. Cells were left untreated (0 ng/mL) or exposed to medium (2.2 ng/mL) and high (6.5 ng/mL) concentrations of doxycycline to induce increasing levels of HA-MBNL1. Cells were transfected with either empty plasmid (mock) or an expression vector for HA-RBFOX1. B) Quantification of HA-MBNL1 and HA-RBFOX1 levels via immunoblot against the HA tag (n=1). Relative levels of both proteins were normalized to GAPDH. HA-MBNL1 expression levels at 0 ng/mL doxycycline were then set equal to 1 and all other protein levels normalized accordingly. C) Representative splicing gel showing changes in exon inclusion (Ψ) for CLASP1 e20. D) Bar plot showing quantification of CLASP1 e20 splicing regulation from the splicing gel in panel C.
Figure 3-9. Expression of RBFOX1 in the HEK-293 MBNL1 dosing line significantly alters MBNL1 dose-dependent splicing behavior. A) Color-coded arrows in the large scatter plot show how addition of RBFOX alters the MBNL1 dose response as predicted by measuring data points from each end of the dosing curve (no/low MBNL1 and high MBNL1 expression, +/- RBFOX1) using RNAseq. Two examples of how these data are used to extrapolate predicted changes in the MBNL1 dose-response curves are shown in the right panel. B) Example of how the ends of the MBNL1 dose-response curve for the endogenous INSR e11 event respond to RBFOX1 expression. These data can be used to extrapolate model dose-response curves. C) Summary of 9 different classes (modes) of events that are positively regulated by MBNL1 (see heatmap legend for the predicted dose-dependency curves associated with each category, or mode, labeled A-H). (Pos. = positive, N.C. = no change, Neg. = negative) D) Summary of event classes for negatively regulated events, similar to panel C. (Analysis and figure courtesy of Eric Wang)
Figure 3-10. Schematic showing potential working models of MBNL1 and RBFOX1 co-regulation based on the event classes, or modes, described in Figure 3-9. A) Model of a dampening mechanism, or an overlapping regulation mechanism, whereby both proteins bind to an overlapping cis-regulatory element to promote exon inclusion (mode E) or repress exon inclusion (mode L). In this model, the MBNL1 dose response is dampened due to RBFOX1 binding to MBNL1 RNA regulatory sequences. B) Model of an antagonistic co-regulation mechanism where RBFOX1 and MBNL1 regulate alternative splicing differentially due to binding to target cis-regulatory elements on opposite sides of the target cassette exon (modes G and J). If this model of co-regulation holds, we would expect RBFOX1 to disrupt the MBNL1 dose response at all doses, limiting the overall change in Ψ promoted by MBNL1 expression. C) Model of a synergistic mechanism whereby RBFOX1 and MBNL1 cooperate in some way to increase the splicing regulation of a target splicing event beyond that achieved by either splicing factor alone. For this model we would predict an increase in overall ΔΨ at high MBNL1 doses in the presence of RBFOX1 expression (modes A/B and O/P). D) In the additive model of co-regulation, MBNL1 and RBFOX1 both regulate splicing of a target exon in the same manner, likely via recognition of non-overlapping cis-regulatory elements. In the context of this model we would expect an increase in Ψ (mode C) or a decrease in Ψ (mode N) across the gradient of MBNL1 expression due to an additive function of
Figure 3-11. GFP-MBNL1 and mOrange-RBFOX1 expression can be selectively induced via tetracycline- or ecdysone-inducible systems in Mbnl1:Mbnl2 KO MEFs. A) In the MEF dosing line, the rTetR protein is always expressed. rTetR can only bind the tet response element (TRE) in the presence of doxycycline to induce expression of GFP-MBNL1. B) In the MEF dosing line, both the retinoid X receptor (RxR) and the VP16 activation domain fused to an ecdysone receptor (VpECR) are constitutively expressed. In the presence of ponasterone A, an analog of ecdysone, these two proteins heterodimerize and bind to a unique, synthetic response element (5xE/GRE) to induce expression of mOrange-RBFOX1 (140, 197). C) Representative immunoblot comparing relative protein levels of GFP-MBNL1 and mOrange-RBFOX1 in treated MEFs (Dox = 1000 ng/mL [...]). MBNL1 was detected via blot against GFP and RBFOX1 was detected via blot against mOrange. D) Quantification of relative protein levels of MBNL1 and RBFOX1 (n=6). Relative levels of each protein were normalized to GAPDH. The average signal for untreated (mock) samples was set equal to 1 and all other protein levels normalized accordingly. E) Bar [...] MBNL1, mOrange-RBFOX1, or both as measured by RT-PCR for the endogenous Numa1 e2 (n=6). F) Same as in panel E for the endogenous Plod2 e14 standa[...] relative to untreated cells (mock) for splicing assays in panels E and F.
Figure 3-12. Distribution of MBNL and RBFOX RNA binding motifs in orthologous cassette exon events. MBNL binding motifs (YGCY) and RBFOX binding motifs (UGCAUG and GCAUG) are colored as in the legend. The relative position of these motifs is marked in the regulated exon and in 200 bp regions of the upstream and downstream introns. Events in all capital letters refer to human events targeted in HEK-293 cells. Events with only the first letter capitalized refer to mouse events targeted in MEFs. The ARHGAP17 / Arhgap17 exon (light grey box) is drawn at half the scale of all other exons / introns due to size constraints.
Figure 3-13. Limited dosing of MBNL1 in the presence or absence of RBFOX1 expression shows differences and similarities of splicing co-regulation for positively regulated events in HEK-293 and MEF dosing cell lines. A-I) Both HEK-293 cells and MEFs were treated with doxycycline concentrations that induce no/low (0 ng/mL), medium (2.2 ng/mL in HEK-293, 80 ng/mL in MEFs), and high (6.5 ng/mL and 1000 ng/mL) MBNL1 expression levels. HA-RBFOX1 expression was achieved via transient transfection in the HEK-293 cells, while cells were treated with [...]M PonA to induce high RBFOX1 expression in the MEFs. Splicing of orthologous cassette exons (events in all capital letters refer to human events targeted in HEK-293 cells, while events with only the first letter capitalized refer to mouse events targeted in MEFs) as measured by Ψ was then analyzed at each point using targeted RT-PCR. Blue bars represent MBNL1 expression alone, while red bars represent co-expression of RBFOX1. The grey boxes show the change in Ψ observed in the absence of RBFOX1.
Figure 3-14. Limited dosing of MBNL1 in the presence or absence of RBFOX1 expression shows differences and similarities of splicing co-regulation for negatively regulated events in HEK-293 and MEF dosing cell lines. A-G) Both HEK-293 cells and MEFs were treated with doxycycline concentrations that induce no/low (0 ng/mL), medium (2.2 ng/mL in HEK-293, 80 ng/mL in MEFs), and high (6.5 ng/mL and 1000 ng/mL) MBNL1 expression levels. HA-RBFOX1 expression was achieved via transient transfection in the HEK-293 cells, while cel[...] expression in the MEFs. Splicing of orthologous cassette exons (events in all capital letters refer to human events targeted in HEK-293 cells, while events with only the first letter capitalized refer to mouse events targeted in MEFs) as measured by Ψ was then analyzed at each point using targeted RT-PCR. Blue bars represent MBNL1 expression alone, while red bars represent co-expression of RBFOX1. The grey boxes show the change in Ψ observed in the absence of RBFOX1.
Figure 3-15. RBFOX1 expression alters the MBNL1 dose response in MEFs. A-G) Dose-response curves of MBNL1-regulated exons in the absence (blue) or presence (red) of RBFOX1 expression. Doxycycline (ng/mL) was titrated to induce a gradient of GFP-MBNL1 expression. Cells were either treated with a vehicle (100% ethanol, [...] mOrange-RBFOX1 expression. Splicing of these orthologous cassette exons as measured by Ψ was then plotted against log(dox) as a proxy for GFP-MBNL1 expression and fit to a four-parameter dose curve (n=3). H) Table summarizing curve-fitting parameters, including slope, log(EC50), and span.
Figure 3-16. RBFOX1 does not alter the MBNL1 dose response in HEK-293 cells. A-E) Dose-response curves of MBNL1-regulated exons in the absence (blue) or presence (red) of RBFOX1 expression. Doxycycline (ng/mL) was titrated to induce a gradient of HA-MBNL1 expression. Cells were transfected with either an empty vector (-RBFOX1) or a plasmid expressing HA-RBFOX1 (+RBFOX1). Splicing of these orthologous cassette exons as measured by Ψ was then plotted against log(dox) as a proxy for HA-MBNL1 expression and fit to a four-parameter dose curve (n=2). F) Table summarizing curve-fitting parameters, including slope, log(EC50), and span.
Figure 3-17. RBFOX2 knockdown via siRNA alters the MBNL1 dose response in HEK-293 cells. A-E) Dose-response curves of MBNL1-regulated exons upon treatment with control, non-targeting siRNA (red, high endogenous RBFOX2 levels) or RBFOX2 siRNA (blue, low endogenous RBFOX2). Cells were treated for 48 hours with 1 nM control or RBFOX2 siRNA. Doxycycline (ng/mL) was then titrated to induce a gradient of HA-MBNL1 expression. Splicing of these orthologous cassette exons as measured by Ψ was then plotted against log(dox) as a proxy for HA-MBNL1 expression and fit to a four-parameter dose curve (n=1). E) Table summarizing curve-fitting parameters, including slope, log(EC50), and span.
Figure 3-18. Orthologous CLASP1 / Clasp1 exon 20 does not respond in the same manner to MBNL1 dose in HEK-293 and MEF dosing lines. A-B) Representative splicing gels for CLASP1 / Clasp1 exon 20 splicing regulation across a gradient of MBNL1 expression in HEK-293 cells and MEFs, respectively. The doxycycline dose (ng/mL) is noted above each lane on the representative gel and Ψ is listed below (n=1). C) Distribution of MBNL and RBFOX binding motifs in the regulated exon and 200 bp of the upstream and downstream introns.
Figure 3-19. Increased gene expression of RBFOX2 but not RBFOX1 correlates with inferred levels of free, functional MBNL1 in RNAseq samples from DM1 patient tibialis anterior muscle. A-B) Gene expression (TPM) of RBFOX1 and RBFOX2, respectively, plotted against inferred levels of free, non-sequestered MBNL1 (197) ([MBNL1] inferred = 1 means no MBNL1 sequestration). C) Same as in panels A and B, but both RBFOX1 and RBFOX2 are plotted against [MBNL1] inferred. (Analysis and figure courtesy of Eric Wang)
Figure 3-20. Distribution of MBNL and RBFOX RNA binding motifs in orthologous PLOD2 / Plod2 e14. MBNL binding motifs (YGCY) and RBFOX binding motifs (UGCAUG and GCAUG) are colored as in the legend. The relative position of these motifs is marked in the regulated exon and in a 250 bp region of the upstream and downstream introns. Events in all capital letters refer to human events targeted in HEK-293 cells, while events with only the first letter capitalized refer to mouse events targeted in MEFs. CLIP clusters from MBNL1 CLIP-seq in C2C12 myoblasts are noted in orange boxes (44).
Table 3-1. Orthologous exons from MEF RNAseq that respond to MBNL1 (M), RBFOX1 (F), or both (B) proteins with |ΔΨ| > 0.1, Bayes factor > 5.

Gene (Human:Mouse)  Mouse Coordinates  Human Coordinates  Group
NCOR2:Ncor2  chr5:125022871:125022924:-@chr5:125019819:125020046:-@chr5:125018811:125018990:-  chr12:124815391:124815444:-@chr12:124811955:124812179:-@chr12:124810737:124810916:-  B
MAP4K4:Map4k4  chr1:40007433:40007530:+@chr1:40008331:40008339:+@chr1:40009688:40009863:+  chr2:102483674:102483771:+@chr2:102484491:102484499:+@chr2:102486084:102486259:+  MB
MAP2K7:Map2k7  chr8:4238740:4239033:+@chr8:4240688:4240735:+@chr8:4243304:4243445:+  chr19:7968765:7968953:+@chr19:7970693:7970740:+@chr19:7974640:7974781:+  MB
ADD3:Add3  chr19:53242504:53242627:+@chr19:53244310:53244405:+@chr19:53244998:53247399:+  chr10:111890121:111890244:+@chr10:111892063:111892158:+@chr10:111893084:111895323:+  FB
INSR:Insr  chr8:3184951:3185152:-@chr8:3181702:3181737:-@chr8:3174614:3174888:-  chr19:7152737:7152938:-@chr19:7150508:7150543:-@chr19:7142827:7143101:-  FM
NUMA1:Numa1  chr7:101998300:102001659:+@chr7:102002295:102002336:+@chr7:102007486:102007554:+  chr11:71723941:71727306:-@chr11:71723447:71723488:-@chr11:71721832:71721900:-  FM
PALM:Palm  chr10:79815185:79815241:+@chr10:79816792:79816923:+@chr10:79819041:79820896:+  chr19:736019:736078:+@chr19:740352:740483:+@chr19:746285:748330:+  F
ARHGAP17:Arhgap17  chr7:123296410:123296566:-@chr7:123294472:123294705:-@chr7:123292036:123292205:-  chr16:24953308:24953464:-@chr16:24950685:24950918:-@chr16:24946791:24946960:-  F
KIF23:Kif23  chr9:61935380:61935550:-@chr9:61933229:61933270:-@chr9:61932075:61932187:-  chr15:69715498:69715668:+@chr15:69717621:69717662:+@chr15:69718409:69718521:+  F
NUMB:Numb  chr12:83799508:83799657:-@chr12:83797197:83797343:-@chr12:83795439:83796154:-  chr14:73749067:73749213:-@chr14:73745989:73746132:-@chr14:73741918:73744001:-  F
MADD:Madd  chr2:91165366:91165467:-@chr2:91163963:91164022:-@chr2:91163499:91163601:-  chr11:47307984:47308085:+@chr11:47310519:47310578:+@chr11:47310942:47311044:+  F
GOLIM4:Golim4  chr3:75903247:75903329:-@chr3:75902401:75902484:-@chr3:75897971:75898132:-  chr3:167759180:167759262:-@chr3:167758574:167758657:-@chr3:167754624:167754782:-  F
EXOC1:Exoc1  chr5:76554100:76554205:+@chr5:76557819:76557863:+@chr5:76559004:76559167:+  chr4:56749989:56750094:+@chr4:56755054:56755098:+@chr4:56756389:56756552:+  F
FAT1:Fat1  chr8:45045031:45045168:+@chr8:45049842:45049877:+@chr8:45050781:45052257:+  chr4:187516843:187516980:-@chr4:187511522:187511557:-@chr4:187508937:187510374:-  F
Table 3-1. Continued

Gene (Human:Mouse)  Mouse Coordinates  Human Coordinates  Group
ITGA6:Itga6  chr2:71849381:71849506:+@chr2:71853534:71853663:+@chr2:71855854:71856758:+  chr2:173362703:173362828:+@chr2:173366500:173366629:+@chr2:173368819:173371181:+  F
INF2:Inf2  chr12:112611410:112612036:+@chr12:112612548:112612604:+@chr12:112614895:112615557:+  chr14:105180540:105181193:+@chr14:105181621:105181677:+@chr14:105185132:105185947:+  F
Table 3-2. Sequences of primers used for RT-PCR of endogenous splicing events in HEK-293 cells and MEFs.

Event  Exon  Species  Forward  Reverse  Inclusion (bp)  Exclusion (bp)  Tm (°C)  Cycle Number
Add3  14  Mouse  GCAGTTTGACGATGACGATCAGG  GACATCATGCATCTCGTCCTTGC  301  205  55  26
ADD3  14  Human  ACCCATTTAGTCATCTCACAGAAGG  CCTACTCACTCGCTTAGCAAGC  290  194  55  28
Arhgap17  17  Mouse  AGTGATGGAAGGGGACTTGG  TGCAGACACGGAGTCTTTGG  298  64  55  28
ARHGAP17  17  Human  TCACTCATTCCACACTGGAAACG  TGCCTGGGGCTGATTTTGG  430  196  55  28
Clasp1  20  Mouse  CAAATCTGTGTCGACGACAGGA  GCTGAGACTGTGAAACCACTTTGG  311  263  55  28
CLASP1  20  Human  CAAAGTCTCCTCATCTTCGGGCACG  GCTGGGACTGTGAAACCACTTTAGC  230  182  55  28
Exoc1  12  Mouse  TGACTGGCACCTCTAAAGAAAGC  GGTCAGAGGCAGACATGTTCC  210  165  57  28
EXOC1  11  Human  GAAGTTGCAAAGATCAAGATGACTGG  AGATCAGAGGCAGACATGTTTCC  230  185  55  28
Fat1  13  Mouse  AAGCCCCTGGAGGAAAAGC  GGGTGTGTGTTCGTCAATAGC  234  198  55  28
FAT1  3  Human  TGATCCCTGTCTTTCCAAGAAGC  CTCCAGGGTAATAGTCCGTATCG  362  266  55  28
Golim4  7  Mouse  CACACCAAGACATTCACACACAG  CAGGATTTCGAAAGCTTGGAATCC  189  105  55  28
GOLIM4  7  Human  AGAATAGACAACTAAGGAAAGCACACC  AGGTTTTCGAAGGCTTGGAATCC  209  125  55  28
Inf2  22  Mouse  AGGACGCAGTGACAGATTCC  AGAGCACTCACTTGGCTTTGG  204  147  55  28
INF2  22  Human  ATCCCTGGACAAGTCCTTCTCC  GGCACGGAGTTTTGGTTTCC  316  259  55  28
Insr  11  Mouse  AGACTGACTCTCAGATCCTGAAGG  CTGACTTGTGGGCACAATGG  236  200  55  28
INSR  11  Human  CAACCAGAGTGAGTATGAGGATTCG  CCGTCACATTCCCAACATCG  224  188  55  28
Itga6  25  Mouse  ATCCTCCTGGCTGTTCTTGC  GGGTAGTGTGAGGTGTTCTTTCC  391  261  55  28
ITGA6  25  Human  CCTCAAAGACTGTAGCTCAGTATTCG  TCTTTGATCTCTCGCTCTTCTTTCC  326  196  55  28
Table 3-2. Continued

Event  Exon  Species  Forward  Reverse  Inclusion (bp)  Exclusion (bp)  Tm (°C)  Cycle Number
Kif23  8  Mouse  GAAGTTGATGAAGACAGTGTCTATGG  TACATCCTGCCACATACATATTGTGG  214  172  55  28
KIF23  8  Human  ACGACAAGTAGATCCAGAGTTTGC  ATGTTATGGTTCTTATCTTCACGAAGC  258  216  55  28
Madd  16  Mouse  TCACACTGCCCACCAAAGG  GAAGGTTCTCTTTTCACGGTTGG  190  130  55  28
MADD  16  Human  GTACCAGCTTCAGTCTTTCAAACC  TGGAGGTTCTCTTTTCACTGTTGG  215  155  55  28
Map2k7  2  Mouse  CCAAGCTGAAGCAGGAGAACC  CGGTGTGAACAAGGTTGATGGG  257  209  55  28
MAP2K7  2  Human  GTCCTCCCTGGAACAGAAGC  GGTGTGAACAGGGTTGACG  291  243  55  28
Map4k4  20  Mouse  AGGAAAGTGCAGCCAAAAAGC  CCACTGCTCGAAGCTCTTTGG  108  99  55  28
MAP4K4  20  Human  GGAAGTTTTCAGACCCCTCAAGC  CACTGCTCGAAGCTCTTTGG  73  64  55  28
Mpdz  27  Mouse  GATGGACCTCAGAGATGCAAGC  ATGGCTGTGCACAATCACTACC  308  209  55  28
Myo1b  23  Mouse  GGAGCTGAAGCATCAGAAGC  CCAAATGACAGCAACTGCATGC  223  136  55  28
Ncor2  45  Mouse  GGGACGGAAATCTTCAACATGC  GAGTGTACTGAGGAGACAGAAGG  380  152  55  28
NCOR2  46  Human  CTGGGACGGAGATCTTCAATATGC  CGGGGACTTGGCTTTTCG  326  101  55  28
Numa1  2  Mouse  CGACAAGAAGCACAGAGTACTAGC  CTTCTGTTGCTGCACCTTGC  231  189  55  26
NUMA1  16  Human  GGGAGAAGTATGTCCAAGAGTTGG  TGCTGGCTTGGTCAGAGTC  279  242  55  28
Numb  8  Mouse  AGCATCAGCTCCTTGTGTTCC  GCAGCACCAGAAGACTGACC  302  155  57  25
NUMB  11  Human  TGTGCTCACAGATCACCAATGC  CGCTCTTAGACACCTCTTCTAACC  363  219  55  28
Palm  8  Mouse  GGGGATCCACAATGATGAAAGC  CACCTCGGAGGAACTTAGAGG  218  86  55  28
PALM  8  Human  CAAGCGAGTCTCCAACACG  CCGCTTTGTGGATGAGTTCG  275  143  55  28
Table 3-2. Continued

Event  Exon  Species  Forward  Reverse  Inclusion (bp)  Exclusion (bp)  Tm (°C)  Cycle Number
Plod2  14  Mouse  AGGAATCTGGAATGTCCCATATATGG  TCTGCCAGAAGTCATTGTTAAGATGG  302  239  55  26
PLOD2  14  Human  ACTCCGATCAGAGATGAATGAAAGG  CTGCCAGAGGTCATTGTTATAATGG  247  184  55  28
Slain2  5  Mouse  ACGCAGTCTTCCAAACCTATCC  TCTTGGCTACCAGAACCAGAGC  312  234  55  28
Spag9  27  Mouse  TCGTGAGGATAACAGCTCTTATGG  TGCATGTGCCATTGAACAGTAGG  251  212  55  28
CHAPTER 4
FUTURE DIRECTIONS

Developing Synthetic MBNL Proteins for Therapeutic Applications

To satisfy vast functional requirements and RNA target recognition, RBPs are often built from a few simple RNA binding domains (RBDs) arranged in various manners such that these individual modules cooperate to regulate RNA binding specificity and biological activity of the RBP (136). MBNL proteins also possess a modular architecture consisting of two tandem ZF pairs (ZF1-2 and ZF3-4) separated by a flexible linker (75). Despite structural and sequence similarities previously thought to indicate functional redundancy of these RBDs (76), several studies, including those described here, have revealed that ZF1-2 is a higher-activity RBD than ZF3-4 (74, 122). Because of differences in functionality of the modular ZF domains, we hypothesized that a synthetic MBNL protein with increased activity could be generated via a domain duplication approach whereby ZF3-4 (domain B) was replaced with a ZF1-2 (domain A) to create MBNL AA. Additionally, we predicted that substitution of the ZF1-2 domain with a ZF3-4 (i.e., MBNL BB) would result in an MBNL protein with weak RNA binding ability and reduced splicing regulation. Using this approach, we discovered that the ZF1-2 and ZF3-4 domains act as independent units with distinct characteristics, most notably different RNA binding specificities. Through several experimental strategies we discovered that MBNL AA is in fact a more active derivative of wild-type (WT) MBNL1 (122). While these studies aided in elucidating how the individual RBDs of MBNL proteins cooperate to bind and regulate the AS of target RNAs, more importantly, they revealed that the ZF domains can be organized in novel ways to produce new, functional, synthetic MBNL1 proteins with different activities. Because the domain
architecture of MBNL is modular in nature, this RBP is amenable to new and exciting engineering approaches to generate synthetic MBNL variants that can be used to probe mechanisms of RNA processing or utilized as therapeutic biologics in disease.

Synthetic MBNL Proteins as Therapeutic Biologics

Although biologics are still a relatively new class of drugs, their development, especially that of protein therapeutics, is a rapidly growing field in which interest and investment are high. Since the approval of recombinant human insulin as a treatment for diabetes in 1982, the FDA has approved more than 100 protein therapies, and many more are in clinical trials (192). In fact, biologics made up 32% of the drugs approved by the FDA in 2016 (193). The use of protein therapeutics has several advantages over other therapeutic strategies, specifically over small-molecule drugs. Firstly, as proteins are designed to be highly specific and serve a complex set of functions, these biologics have the capacity to rescue a broader range of dysregulation with fewer off-target effects than other avenues (192). Secondly, as natural products, proteins are assumed to be better tolerated and less likely to elicit an immune response in patients (80 (192). Thirdly, studies have shown that clinical development and FDA approval timelines may be faster for these molecules than for chemical compounds (80, 128). Finally, due to the unique form and function of proteins, companies are able to obtain far-reaching patent protection, making these biologics more attractive from a financial perspective. Although many years of research have led to a better understanding of the molecular mechanisms that cause diseases in which the regulation of RNA processing by MBNL proteins is perturbed by toxic RNA gain-of-function mechanisms, namely DM, SCA8, and FECD, there are currently no therapeutic interventions available for
patients beyond palliative care. In the case of DM, most approaches have focused on targeting toxic RNA levels via small molecules or antisense RNA oligos (ASOs) that lead to reduced production or degradation of the toxic RNA (reviewed in (80, 128)). However, a strategy that has been relatively unexplored is counteracting the MBNL loss-of-function pathology via upregulation of functional MBNL levels. Delivery of MBNL1 through adeno-associated virus (AAV) has been shown to rescue mis-splicing events in a DM1 mouse model and reverse disease-associated symptoms in skeletal muscle, including myotonia (74). Additionally, MBNL1 overexpression has been shown to be well tolerated in non-disease mice, suggesting that therapies designed to increase levels of free, active MBNL could be an effective strategy for treatment of DM (141). Delivery of a high-activity, synthetic MBNL would significantly improve the efficacy of this approach. The use of synthetic MBNL1 proteins as biologics for DM also has the advantage of potentially complementing other therapeutic strategies already in development, including ASOs (208). The combination of ASOs targeting the toxic RNA for degradation and an increase in functional MBNL levels via delivery of an MBNL protein biologic could act in an additive or synergistic manner to provide enhanced therapeutic benefit. The overall advantages of protein biologics and previous evidence suggesting that MBNL1 upregulation can rescue disease phenotypes merit further exploration of synthetic MBNL proteins as therapeutics for DM. We are particularly excited about exploring the development of synthetic MBNL proteins due to the success in the past two decades of generating modular DNA binding proteins containing a domain to bind specific DNA sequences and another to
manipulate the DNA target (200). These artificial proteins have utilized zinc fingers (ZF), transcription activator-like effector nucleases (TALENs), or the CRISPR-Cas9 system to cleave DNA targets or alter (activate / repress) gene transcription. The success of these strategies has encouraged us and many others to explore the development of modular RNA binding proteins (200, 209, 210). Because RBPs are involved in a broader range of functions which are transient and reversible, these proteins may ultimately be more efficacious with a higher safety profile since they should induce no permanent genomic damage.

Engineering a Minimal MBNL Protein for Direct Delivery in Disease via Modification of the Linker Region

While multiple paths are actively being explored to design and characterize a variety of synthetic MBNL proteins following the success of our initial studies, one of the most promising and immediate strategies is to develop minimal MBNL proteins for direct delivery into disease tissues. While biologically active molecules like proteins are highly advantageous for a wide range of therapeutic applications, their increased size and often hydrophilic makeup can limit their therapeutic value simply due to their low membrane penetrability and subsequent reduced cellular transduction. As such, many studies have focused on utilizing a variety of strategies, including AAV, cell-penetrating peptides, or membrane-penetrable antibodies, to improve uptake of therapeutic cargos. AAV is a small virus from the Parvoviridae family that contains a single-stranded DNA genome packaged in a non-enveloped protein capsid. This particular viral technology has been optimized to package gene therapies to either correct genetic mutations or normalize expression of overactive / underactive genes connected to disease and has been found to be well tolerated in clinical and pre-clinical trials (reviewed
in (198)). Another strategy that has been explored for delivery of protein biologics is the murine anti-DNA autoantibody, 3E10, which has recently been developed as a novel intracellular delivery vehicle and been shown to be capable of delivering full-length, functional proteins into living cells (199). It was found that a small portion of this antibody, the Fv fragment, was capable of cell transduction and protein delivery into nuclei with no measurable toxicity (199). Finally, cell-penetrating peptides, or CPPs, are a class of short, positively charged peptide sequences (~10 to 30 amino acids) that can be taken up by cells by directly penetrating the hydrophobic cell membrane or via endocytosis. These particular peptides have been covalently and non-covalently fused to a variety of bioactive cargos, including proteins, peptides, small-molecule drugs, and oligonucleotides, to facilitate delivery (reviewed in (200)). While all of the above-described strategies are viable approaches for improving delivery of bioactive molecules, larger cargo sizes can impact the usage and efficacy of each of these approaches. For example, AAVs are only capable of encapsulating a limited amount of DNA (201), and the efficacy of cellular uptake via CPPs is reduced with increasing cargo size (202, 203). As such, one of the main goals in moving forward with developing a synthetic MBNL for therapeutic use was to develop and test methodologies to reduce the overall size of the synthetic protein. While several studies have already revealed that the C-terminal region of MBNL can be removed and the [...] (80), another region of MBNL that has not been [...] the two tandem ZF RNA binding domains.
Interdomain linkers are a critical consideration when engineering RBPs with multiple domains. In a natural context these regions are often critical for mediating cooperative binding of RBPs with multiple, modular domains (204-206). Despite the importance of this region, designing an effective linker can be complicated, in part due to the limited structural information often available for these regions. The use of too long a linker region with extensive flexibility may lead to RNA-independent activities of a fused effector domain. Conversely, a short, rigid linker may limit the functions of an RBP by reducing the independent activities of each domain (207). In the past, strings of Gly-rich peptides have been utilized because they are considered inert and unlikely to impact the binding mode of each RBD (197). However, a more sophisticated strategy may be to test natural linkers in the context of an engineered RBP to potentially preserve cooperative binding patterns mediated by the linker in an established setting (i.e. a wild-type RBP).

The linker of MBNL proteins, like that in many other RBPs, while predicted to be relatively unstructured, is hypothesized to serve as a flexible tether between the two tandem ZF RBDs of MBNL (75). While several studies have aimed to determine if this region aids in splicing regulation via deletion analysis, these studies have (1) been performed in a manner that disrupts other regions of the MBNL proteins (i.e. large scale deletions from either the N-terminal or C-terminal end) or (2) been performed when artificially tethered to a regulatory target via use of an MS2 system (80, 128). In general, these studies have found that while the linker is not absolutely required for splicing regulation, large scale deletions do cause significant reductions in splicing activity (80, 128).
To evaluate the importance of this region and begin optimizing a minimal linker length sufficient to achieve substantial splicing regulation while reducing overall protein size, we systematically reduced the size of the MBNL linker region while preserving all other MBNL domain architecture (i.e. ZF1-2 and ZF3-4). Because the major aim of this study was to continue to reduce the size of this RBP for potential therapeutic use, we utilized the MBNL AB backbone (C-terminal region replaced with NLS, N-terminal HA tag) to generate synthetic MBNLs with reduced linker lengths (Figure 4-1a). Based on the designation of the ZF1-2 and ZF3-4 domains, the linker region of MBNL consists of 76 amino acids. We chose to methodically reduce the size of the linker by separating this region into segments of approximately 19 amino acids, which were removed from the protein from the center outwards to create 3 synthetic MBNL proteins with 57, 38, and 19 amino acid linkers, named MBNL A 57 B, MBNL A 38 B, and MBNL A 19 B, respectively (Figure 4-1a and Figure 4-1b). This strategy was utilized over deletions from the C-terminal or N-terminal end of the linker because previous work has indicated that specific residues within this region nearest to the tandem ZF domain boundaries contribute to splicing regulation (74).

Testing of the splicing activity of these synthetic MBNL proteins utilizing three minigene reporters (MBNL1 ex. 5, ATP2A1 ex. 22, and TNNT2 ex. 5) revealed that the splicing activity of these synthetic variants, in contrast to our previous studies with MBNL AA and MBNL BB, appeared to be event dependent (Figure 4-2). For the two reporters in which MBNL mediates exon exclusion (MBNL1 and TNNT2), little reduction in activity was observed as the linker size decreased (Figure 4-2a and 4-2c, respectively). This was especially apparent for the MBNL1 auto-regulation event,
whereby all three synthetic MBNL proteins with reduced linker lengths retained at least 95% splicing activity compared to MBNL AB (Figure 4-2a). Conversely, for the single MBNL-mediated activation (i.e. exon inclusion) reporter utilized, ATP2A1, a dramatic decrease in splicing activity was observed for each 19 amino acid deletion until MBNL A 19 B only retained 10% splicing activity (Figure 4-2b).

While these three reporters only represent a small subset of MBNL-dependent splicing events, some preliminary conclusions can be drawn from this limited data set. Firstly, in accordance with previous studies, while not absolutely required, the presence of this linker region does mediate MBNL splicing regulation, especially in the context of exon inclusion events. Additionally, this region does not appear to be as necessary in the context of MBNL-mediated exon repression. Previous work utilizing MBNL deletion mutants in an MS2 tethering assay indicated that amino acids 2-102, which encompass, based on our domain definitions, ZF1-2 and a small region within the N-terminal region of the linker, can act as a minimal repressor domain capable of facilitating exon exclusion (80). The data presented here are consistent with this model whereby the linker region is not absolutely required for exon exclusion activity. However, based on these data we cannot conclusively confirm that the ZF3-4 domain is required for this specific MBNL mode of splicing activity; the presence of two domains may be required to mediate binding to an RNA target. Protein levels of each synthetic MBNL protein will need to be evaluated via immunoblot to validate these activities in the context of relative protein concentrations.

The second major conclusion that can be taken from these data is that MBNL exon inclusion activity is likely dependent on this linker. While we can make no extensive interpretations at this time, it is possible that this flexible region mediates the binding of
each tandem ZF such that the RNA is organized in a manner that facilitates exon inclusion. Additionally, this region of MBNL may aid in recruiting spliceosomal components or other splicing factors required for proper exon definition and inclusion of the target exon in the final RNA transcript. Overall, these preliminary experiments indicate that our synthetic MBNL proteins with reduced linker lengths can be used to probe mechanisms of MBNL exon inclusion and exclusion activity. Additionally, these data suggest that reducing both the linker length and content may serve as a viable strategy to minimize the size of a synthetic MBNL for therapeutic applications.

Creating a Minimal MBNL Protein for Therapeutic Delivery via Use of the TAT Cell Penetrating Peptide

While developing a minimal MBNL protein through shortening of the linker region, we are also beginning to actively explore methods for delivery, specifically the use of the minimal human immunodeficiency virus type 1 TAT peptide shown to be required for transduction into cells (208). This particular CPP is one of the best studied and is currently conjugated to a variety of therapeutic molecules in pre-clinical and clinical trials to facilitate delivery (200). These therapies are currently in development to treat a variety of disorders, including cerebral ischemia, cancers, and amyotrophic lateral sclerosis (ALS), among others (200, 209, 210). Using this system, we are investigating whether in vitro purified synthetic MBNL proteins can be directly delivered to cells and mediate the reversal of DM-associated spliceopathy by either displacing endogenous MBNL proteins from the toxic RNA or by acting as a high activity splicing factor that specifically regulates MBNL-dependent splicing events (Figure 4-3). This strategy was further supported by recent reports showing that CCHH ZF domains that bind to specific DNA targets are able to be directly delivered to mammalian cells in culture independent of
any targeted delivery system (211, 212). As such, we hypothesized that a minimal MBNL, further aided by fusion to a TAT CPP, could be directly taken up by cells and retain activity.

To begin to test this hypothesis, we have begun to characterize a new group of synthetic MBNL proteins with the TAT CPP. Based on results from modulating the length of the linker domain, we also hypothesized that the second ZF3-4 domain of MBNL may be removed and minimum functionality retained. This was predicted to also be a viable strategy to continue to reduce the size of the MBNL protein, in part because other muscleblind homologs, like the Drosophila Mbl, contain only a single ZF RBD that can bind with high affinity in vitro to several RNA targets and effectively regulate AS in mammalian systems (52, 213). As such, two additional synthetic MBNL proteins were designed. In the first, ZF3-4 is removed but the 76 amino acid linker remains intact (MBNL A 76 TAT). An NLS sequence and HIV TAT CPP were fused to the C-terminal region of this synthetic protein to drive nuclear localization and to be used for direct protein delivery pending successful cellular characterization. A second synthetic MBNL was created in which the linker region was fully removed, leaving only a ZF1-2 domain fused to an NLS and TAT (MBNL A TAT). A schematic visualizing the domain organization of these synthetic MBNL proteins and their amino acid sequences are shown in Figure 4-4.

Preliminary minigene splicing analysis using three reporters (ATP2A1, INSR, and MBNL1) was performed to quickly assess the splicing activity of these small, synthetic MBNL proteins. Interestingly, MBNL A TAT had almost no splicing activity for any of the reporters utilized while MBNL A 76 TAT, with the linker region intact, retained 80-90%
splicing activity for all three reporters compared to MBNL AB (Figure 4-5). Again, while preliminary, these results suggest that while the ZF3-4 domain is not absolutely required for splicing regulation, the linker region does play a critical role.

Interestingly, the results for these minimal MBNL proteins follow those of Drosophila Mbl. The well-studied Drosophila MBNL homolog, Mbl, contains only a single ZF domain orthologous to ZF1-2 oriented towards the N-terminal region of the protein followed by an unstructured C-terminal domain (Figure 4-6). Removal of this C-terminal region for in vitro purification does not eliminate RNA binding (213). However, the same deletion construct expressed in HeLa cells with a minigene reporter has no splicing regulatory activity (Figure 4-7, compare mock and Mbl short / Mbl short TAT). This is in sharp contrast to full-length Mbl, which maintains equivalent splicing regulation to MBNL AB for all four minigenes (Figure 4-7, compare MBNL AB and Mbl). While all these results could be complicated by differences in protein expression and subcellular distribution, they coalesce around a model where regions outside of the MBNL ZF RBD are crucial for splicing regulatory activity, potentially by aiding in RNA binding or facilitating RNA organization required for effective splicing regulation. Overall, these minimal, synthetic MBNLs can be utilized to understand which regions outside of the traditional, designated RNA binding domains are required for MBNL-dependent splicing activity. Additionally, future work will focus on utilizing the TAT CPP to enable direct delivery of these proteins in cell culture to evaluate their potential as therapeutic biologics.
Development of New Synthetic MBNL Proteins Via Domain Swapping to Probe Mechanisms of RNA Processing

One of the major findings from the characterization of the synthetic MBNL proteins discussed in Chapter 2 is that the tandem ZF domains of MBNL operate independently and are interchangeable, where ZF3-4 appears to function primarily as a low specificity RBD. These findings suggested that the ZF3-4 domain of MBNL could be replaced with other domains from a variety of RBPs to create new synthetic MBNL proteins to be used to investigate mechanisms of RNA processing. Some examples are described below.

MBNL RS Fusions to Analyze the Importance of Individual Domains in Modulating Positional Dependent Splicing

One of many outstanding questions in the splicing field is how an important class of RBPs, including MBNL, RBFOX, NOVA, and QKI, all of which have been implicated in a range of diseases (reviewed in (214)), regulate AS in a positional-dependent manner (26). In general, this class of splicing regulators excludes exons when bound upstream of or within the regulated exon and promotes exon inclusion when bound downstream of the target exon. This mode of positional-dependent splicing regulation for MBNL is shown in Figure 4-8a. However, the splicing field does not fully understand the rules and mechanisms that dictate this important mode of splicing regulation.

A better understood class of splicing regulators is the family of SR proteins (SRSF1-SRSF12) (reviewed in (215-217)). This particular class of splicing regulators is characterized by the presence of a C-terminal domain with a high density of arginine (R) and serine (S) amino acids called an RS domain and an N-terminal RNA recognition motif (RRM). These domains are also modular in nature and have independent
activities; the RRM is utilized to specifically bind RNA sequences whereas RS domains participate in vast protein-protein and non-specific RNA-protein interactions to recruit spliceosomal components (reviewed in (216)). In contrast to MBNL, SR proteins display a different pattern of positional-dependent splicing regulation. When SR proteins bind to a target exon they promote exon inclusion, whereas binding to either the upstream or downstream intron flanking a target exon promotes exon exclusion (Figure 4-8b). Interestingly, tethering RS domains from SR proteins to the viral MS2 coat protein with MS2 RNA binding motifs placed in exonic regions demonstrated that the RS domains were necessary and sufficient for splicing activation (218). Additionally, other studies have demonstrated that the RS domain is functionally interchangeable between SR proteins (219).

We are currently working to derive new synthetic MBNL proteins in which the MBNL ZF1-2 RBD is fused to various RS domains, including a small, synthetic RS 10 as well as canonical RS domains from SRSF1 and SRSF3 (220). Using our current understanding of how these domains individually contribute to splicing regulation in a natural context, we are aiming to use these synthetic proteins to ask and answer questions about how these domains may cooperate to regulate splicing, and in particular, what positional-dependent patterns or strength of splicing regulatory activity are preserved or altered. These questions are outlined in graphical detail in Figure 4-8c. For this group of synthetic MBNL proteins it will be important to characterize changes in RNA binding affinity and specificity, assuming changes in splicing regulation are observed compared to WT MBNL.
MBNL dsRBD Fusions to Modulate MBNL RNA Secondary Structure Recognition

Another group of synthetic MBNL proteins actively being designed, synthesized, and characterized are ZF1-2 domains fused to double-stranded RNA binding domains (dsRBDs). dsRBDs are found in many proteins including Dicer, Staufen, and ADAR, all of which are involved in various aspects of RNA processing, including mRNA localization, A-to-I RNA editing, and miRNA/siRNA processing (221). In contrast to a variety of other RBDs, this small domain consisting of ~60-75 amino acids binds to RNA in a shape-dependent, rather than sequence-dependent, manner (reviewed in (221, 222)). Structure-based analysis of dsRBD binding to RNA indicates that these domains specifically bind RNA-RNA duplexes in an A-form helical structure (223, 224).

The toxic CUG repeat RNA present in DM1 tissues has previously been shown in vitro to adopt a structure similar to A-form RNA helices (225). Interestingly, while sequestration of MBNL leads to the spliceopathy observed in DM, MBNL binds with 10-fold or greater affinity to short model RNAs predicted to have less RNA structure than to the expanded CUG repeats (Figure 4-9a) (67). Shifting the conformation of the CUG repeat RNA to a more rigid, double-stranded RNA structure also reduces the affinity of MBNL proteins for the toxic RNA (78). As such, we hypothesized that the fusion of the ZF1-2 domain of MBNL to a dsRBD may enhance affinity for the CUG repeat RNA over other substrates in the cellular environment. By utilizing the MBNL ZF1 RNA sequence recognition, the dsRBDs may be directed to the CUG repeats, stabilizing the double-stranded RNA structure and enhancing this synthetic MBNL association either by limiting shifts in RNA structure required for MBNL binding or by
occupying MBNL binding sites (Figure 4-9b). This reduction in endogenous MBNL binding to the toxic RNA may then allow for rescue of disease-associated mis-splicing. Complementing this potential reversal of DM spliceopathy, stabilization of the toxic RNA by an MBNL dsRBD fusion within nuclear foci may additionally limit RNA nuclear export and the production of toxic polypeptide products by RAN translation (111).

Characterization of RBFOX Dose Dependent Splicing Regulation

The work presented in this dissertation has focused on characterizing the dose-dependent regulation of alternative splicing by MBNL proteins utilizing two cell lines in which MBNL protein concentration can be precisely controlled. The advantage of these dosing systems is that the same principles can be applied to characterize dose-dependent regulation of other alternative splicing factors. The MEF (Mbnl1:Mbnl2 KO) with ecdysone-inducible mOrange RBFOX1 serves as a pre-existing system in which to generate a gradient of RBFOX1 protein expression and to monitor RBFOX-dependent splicing regulation.

Preliminary characterization of RBFOX-dependent splicing across a gradient of mOrange RBFOX1 protein expression (Figure 4-10a) indicates that RBFOX does indeed regulate splicing in a dose-dependent manner with a wide range of EC50 values and slopes for three splicing events (Plod2, Add3, and Numb) (Figure 4-10b and Figure 4-10c). Interestingly, the overall changes in Ψ for these same events in the context of MBNL dose-dependent regulation are reduced and the slopes steeper (compare tables in Figure 4-10a-b and Figure 3-15h (-RBFOX1)). This may be due to differences in the number and distribution of RBFOX vs. MBNL motifs. MBNL regulatory motifs (i.e. YGCY RNA elements) are present in splicing targets in greater numbers compared to RBFOX (UGCAUG / GCAUG) motifs (Figure 3-12). As such, it is possible that regulation of AS
may occur over a smaller protein concentration gradient in a steeper, more cooperative manner compared to MBNL due to the few RNA regulatory motifs present through which RBFOX mediates exon inclusion or exclusion.

Work described in this dissertation has indicated that for a variety of events MBNL and RBFOX proteins appear to regulate AS through the use of overlapping cis-regulatory elements. Dosing of mOrange RBFOX1 in the presence and absence of high GFP MBNL1 expression provides the opportunity to validate this model and gain additional insights into the interplay of splicing regulation by these two RBPs. In comparison to the converse experiments described in Chapter 3, the presence of high GFP MBNL1 expression leads to a complete flattening of the dose-response curve in the case of the Plod2 event to the maximal Ψ achieved by MBNL alone (Figure 4-11a). Combined with the dampening of the MBNL1 dose-response curve observed in the reverse experiments (Figure 3-15a), these results further support a model where MBNL and RBFOX bind overlapping cis-regulatory elements to facilitate exon inclusion. At high expression levels of MBNL1, all splicing regulatory motifs are bound, and differential levels of RBFOX1 expression do not alter this pattern. This may suggest that, either simply due to the concentration of MBNL motifs over RBFOX motifs or potentially due to a higher binding affinity of MBNL1 for these motifs, splicing peaks at all concentrations of RBFOX1 assayed. This may also suggest that, in general, MBNL out-competes RBFOX for binding to these overlapping cis-regulatory elements used by both factors to achieve regulation.

Similar preliminary dosing experiments targeting the INSR splicing event also confirm predicted patterns of RBFOX regulation previously hypothesized in Chapter 3.
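
The competition model proposed above for the Plod2 event can be illustrated with a toy calculation. The sketch below is not taken from the experiments in this chapter: it assumes a single shared site at equilibrium, simple competitive binding, and that occupancy by either protein promotes inclusion equally. The function names, dissociation constants, and Ψ plateaus are all hypothetical placeholders chosen only to show why a saturating level of one factor would be expected to flatten the dose-response curve of the other.

```python
def site_occupancy(mbnl, rbfox, kd_mbnl, kd_rbfox):
    """Fraction of a shared site bound by EITHER protein.

    Assumes simple competitive binding of two ligands to one site at
    equilibrium (hypothetical simplification, not a fitted model).
    """
    m = mbnl / kd_mbnl      # MBNL concentration scaled by its Kd
    r = rbfox / kd_rbfox    # RBFOX concentration scaled by its Kd
    return (m + r) / (1.0 + m + r)


def predicted_psi(mbnl, rbfox, psi_min=0.2, psi_max=0.9,
                  kd_mbnl=1.0, kd_rbfox=1.0):
    """Map total site occupancy linearly onto exon inclusion (Psi).

    psi_min, psi_max, and both Kd values are illustrative assumptions.
    """
    occ = site_occupancy(mbnl, rbfox, kd_mbnl, kd_rbfox)
    return psi_min + (psi_max - psi_min) * occ


if __name__ == "__main__":
    rbfox_doses = [0.1, 1.0, 10.0]          # arbitrary units
    for mbnl in (0.0, 100.0):               # absent vs. saturating MBNL
        curve = [round(predicted_psi(mbnl, r), 3) for r in rbfox_doses]
        print(f"MBNL={mbnl}: Psi across RBFOX doses = {curve}")
```

Under these assumptions the RBFOX titration spans most of the Ψ range when MBNL is absent and is nearly flat when MBNL is saturating, which is the qualitative pattern described for Plod2. An antagonistic event such as INSR would need a different mapping from occupancy to Ψ and is not covered by this sketch.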
GFP MBNL1 dosing experiments in the presence and absence of high mOrange RBFOX1 expression suggested that, in contrast to MBNL, RBFOX1 promoted exon 11 inclusion through putative downstream RNA motifs. Dosing of mOrange RBFOX1 alone verifies this regulatory activity (Figure 4-11b, blue curve). Interestingly, expression of MBNL1 shifts the curve downward, consistent with the confirmed exon exclusion activity of this RBP described in Chapter 3 (Figure 4-11b and Figure 3-15g). However, in contrast to the Plod2 event, the shape of the curve is conserved, and as mOrange RBFOX1 expression goes up, Ψ also increases. These alterations to the dose-response curves are conserved upon MBNL1 dosing with and without RBFOX1 expression (Figure 3-15g). Overall, these data, while preliminary, continue to support a model of antagonistic regulation of INSR e11 AS by RBFOX and MBNL.

As discussed in Chapter 3, RBFOX2 expression appears to be quite high in these MEFs. As such, further characterization of RBFOX dosing will need to be performed in the context of RBFOX2 knockdown via siRNA treatment or after using genome editing tools, such as CRISPR-Cas9, to permanently knock out RBFOX2. RBFOX2 is highly homologous to RBFOX1 and binds to the same regulatory motifs through a 100% conserved RRM domain (182, 183). The presence of this protein likely limits the full range of RBFOX1 splicing activity observed in this system. Overall, these dosing cell lines provide the opportunity to continue to characterize RBFOX and MBNL co-regulation of AS from multiple angles, and current work is focusing on utilizing these tools to continue to understand splicing regulation by these two highly conserved RBPs, both individually and in combination.
Materials and Methods

Synthetic MBNL Protein Design and Cloning

MBNL AB was used as a template for the construction of MBNL A 57 B, MBNL A 38 B, MBNL A 19 B, MBNL A 76 TAT, and MBNL A TAT. These synthetic MBNL proteins were cloned using the pCI MBNL AB construct and the Q5 site-directed mutagenesis kit (NEB). N-terminal HA-tagged Mbl (1-243 amino acids, muscleblind, isoform D, Drosophila melanogaster; NCBI accession number NP_788390.1) in pcDNA3.1+ was utilized as a template to design Mbl short and Mbl short TAT. These synthetic MBNL protein sequences were PCR amplified and inserted into pCI (Promega) for mammalian expression using standard cloning procedures. The amino acid sequences for all of these synthetic MBNL variants are listed in Figure 4-1b, Figure 4-4c, and Figure 4-6b.

Cell Culture, Transfection, and Protein Expression Induction

(DMEM) Glutamax (Gibco) supplemented with 10% fetal bovine serum (FBS) and 1X antibiotic-antimycotic (Gibco) at 37 °C under 5% CO2. For minigene splicing experiments in Figure 4-2, cells were plated in twelve-well plates at a density of 8 x 10^4 cells/well. Cells were transfected approximately 36 hours later at roughly 80% confluency. Plasmids (400 ng/well: 200 ng protein expression vector or mock and 200 ng of a single minigene reporter) were transfected using 2 of Lipofectamine 2000 MEM I reduced serum media (Gibco) at the time of transfection. Six hours later the Opti-MEM I was replaced with our supplemented DMEM. 18 hours post medium exchange, cells were harvested using TrypLE (Gibco) and pelleted using centrifugation. For minigene
experiments in Figure 4-5 and Figure 4-7, cells were plated in twelve-well plates at a density of 1.6 x 10^4 cells/well. At the time of transfection, plasmids (400 ng/well: 200 ng protein expression vector or mock and 200 ng of a single minigene reporter) were transfected using 2 of TransIT-LT1 (Mirus). 24 hours post transfection, cells were harvested using TrypLE and pelleted using centrifugation.

MEFs were regularly maintained in DMEM Glutamax supplemented with 10% FBS and 2 puromycin at 37 °C under 5% CO2. To assay endogenous splicing regulation, cells were plated in twelve-well plates at a density of 6 x 10^4 cells/well. After 24 hours, ponasterone A (ThermoFisher) was dissolved in 100% ethanol, diluted, and then added to the cells at the appropriate concentrations to induce a range of mOrange RBFOX1 protein expression (doses between 0 ). If GFP MBNL1 expression was to be induced, fresh doxycycline (Sigma) was prepared and added to the cells (final concentration = 1000 ng/mL) along with ponasterone A. Cells were treated with water as a vehicle control if not exposed to doxycycline. 24 hours post drug treatment, cells were harvested using TrypLE and pelleted using centrifugation.

Cell-based Splicing Assay

RNA was harvested from HeLa and MEF cells using an Aurum Total RNA Mini kit (Bio-Rad) and DNase treated on column. RNA from both HeLa and MEF cells was reverse transcribed using SuperScript IV (Life Technologies) according to the ded SuperScript IV was utilized (random hexamer priming for MEFs, minigene-specific priming for HeLa described in (74)). cDNA derived from minigene experiments was then PCR amplified
upon protein or mock treatment was determined as previously described (74). cDNA derived from MEF RNA was amplified using gene-specific primers (listed in Table 3-2). Samples were visualized and quantified using the Fragment Analyzer (DNF-905 dsDNA 905 reagent kit, 1-500 bp, Advanced Analytical Technologies) and associated ProSize data analysis software. For endogenous events in the MEFs, Ψ values were plotted against relative RBFOX levels (as determined by immunoblot or log (ponasterone A)) and fit to a four-parameter dose curve:

$\Psi = \Psi_{\min} + \dfrac{\Psi_{\max} - \Psi_{\min}}{1 + 10^{(\log(\mathrm{EC}_{50}) - \log[\mathrm{RBFOX1}]) \cdot \mathrm{slope}}}$

Parameters that correlate to biological data, i.e. concentration (EC50) and steepness of response (slope), were then derived from these curves.

Immunoblot Analysis

Cell pellets were lysed in RIPA (25 mM Tris-HCl pH 7.6, 150 mM NaCl, 1% NP-40, 1% sodium deoxycholate, 0.1% SDS) (ThermoFisher) supplemented with 1 mM phenylmethylsulfonyl fluoride and 1X protease inhibitor cocktail (SigmaFAST, Sigma) by light agitation for 15 minutes via vortex. Equal amounts of lysate were resolved on a 10% SDS-PAGE gel prior to transfer. Blots were probed with a mouse anti-mOrange antibody (1:2000 dilution, OriGene 4E10) and a donkey anti-mouse IRDye 680RD (1:15000 dilution, LI-COR) to detect induced mOrange RBFOX1 expression. A GAPDH loading control was probed using a chicken anti-GAPDH antibody (1:2000 dilution, ab14247, Abcam) followed by a donkey anti-chicken secondary 800CW (1:15000 dilution, LI-COR). Fluorescence was measured using a LI-COR Odyssey Fc or LI-COR Odyssey CLx Imaging instrument. Quantification was performed using the associated Image Studio analysis software (LI-COR).
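
The four-parameter dose curve described under the Cell-based Splicing Assay heading can be sketched in a few lines of Python. This is a minimal illustration, not the analysis pipeline used for the figures: the fitting software is not stated in the text, so the coarse grid search below (with the plateaus held fixed) is a stand-in chosen only to keep the example dependency-free, and all function names, grids, and data values are hypothetical.

```python
def four_parameter_psi(log_dose, psi_min, psi_max, log_ec50, slope):
    """Four-parameter dose curve:
    Psi = Psi_min + (Psi_max - Psi_min) / (1 + 10**((log EC50 - log dose) * slope))
    """
    exponent = (log_ec50 - log_dose) * slope
    return psi_min + (psi_max - psi_min) / (1.0 + 10.0 ** exponent)


def fit_ec50_and_slope(log_doses, psi_values, psi_min, psi_max,
                       log_ec50_grid, slope_grid):
    """Least-squares grid search over log(EC50) and slope.

    Illustrative stand-in for a nonlinear regression routine; the
    plateaus psi_min and psi_max are assumed known and held fixed.
    """
    best = None
    for log_ec50 in log_ec50_grid:
        for slope in slope_grid:
            sse = sum(
                (four_parameter_psi(x, psi_min, psi_max, log_ec50, slope) - y) ** 2
                for x, y in zip(log_doses, psi_values)
            )
            if best is None or sse < best[0]:
                best = (sse, log_ec50, slope)
    return best[1], best[2]


if __name__ == "__main__":
    # Hypothetical, noise-free data generated from the curve itself.
    log_doses = [-1.0, -0.5, 0.0, 0.5, 1.0, 1.5]
    psi = [four_parameter_psi(x, 0.2, 0.8, 0.5, 2.0) for x in log_doses]
    grid_ec50 = [i / 10 for i in range(-20, 21)]
    grid_slope = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
    print(fit_ec50_and_slope(log_doses, psi, 0.2, 0.8, grid_ec50, grid_slope))
```

At log[dose] = log(EC50) the exponent is zero, so the curve returns the midpoint between the two plateaus, which is why EC50 can be read as the dose giving a half-maximal splicing response. The slope parameter controls how many log units of dose the transition spans.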
Figure 4-1. Synthetic MBNL proteins with varying linker lengths. A) Schematic of synthetic MBNL proteins that shows organization of zinc finger domains (ZF1-2 (Domain A) and ZF3-4 (Domain B)) and location of HA tag and nuclear localization signal (NLS). Alterations in the natural size of the linker region connecting domains A and B are noted in red. The length of the individual segments is proportional to the size of each region of the protein. B) Amino acid sequences of all proteins shown in schematic in panel A.
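
The linker truncation scheme summarized in Figure 4-1 (a 76 amino acid linker shortened to 57, 38, and 19 residues by removing material from the center outwards) can be sketched as a simple string operation. This is an illustration under stated assumptions only: the exact segment boundaries used for the real constructs are given in Figure 4-1b and are not reproduced here, so the symmetric trimming rule, the function name, and the placeholder sequence below are all hypothetical.

```python
def shorten_linker(linker, target_length):
    """Delete residues from the CENTER of `linker`, keeping both ends.

    Assumption: the retained residues are split as evenly as possible
    between the N-terminal and C-terminal sides (the N-terminal side
    keeps the extra residue when target_length is odd).
    """
    if not 0 <= target_length <= len(linker):
        raise ValueError("target_length must be between 0 and len(linker)")
    n_keep = (target_length + 1) // 2          # residues kept next to ZF1-2
    c_keep = target_length - n_keep            # residues kept next to ZF3-4
    return linker[:n_keep] + linker[len(linker) - c_keep:]


if __name__ == "__main__":
    # Placeholder 76-residue string, NOT the real MBNL linker sequence.
    placeholder_linker = "".join(chr(ord("A") + i % 26) for i in range(76))
    for length in (57, 38, 19):
        variant = shorten_linker(placeholder_linker, length)
        print(length, len(variant), variant)
```

Trimming from the center preserves the residues nearest each ZF domain boundary, which matches the rationale given in the text for preferring this scheme over deletions from either end of the linker.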
Figure 4-2. Synthetic MBNL proteins with altered linker lengths regulate splicing of minigenes in HeLa cells with different activities. A-C) Jitter plot representations of cell-based splicing assays using MBNL1, ATP2A1, and TNNT2 minigenes, respectively. HeLa cells were transfected with empty vector (mock) or MBNL protein expression plasmids and a single minigene protein treatment was then quantified. Each point is from a single experiment and the line represents the average of all experiments for that condition (n = 2-3). in white) are listed below
Figure 4-3. Synthetic MBNL proteins as therapeutic biologics for DM. A synthetic MBNL protein could be used as a therapeutic biologic to rescue DM-associated spliceopathy. An ideal synthetic MBNL protein will be able to directly enter the cell [1], potentially via the use of a cell-penetrating peptide, and localize to the nucleus [2]. Once within the nucleus, some synthetic MBNL proteins may interact with the disease-causative toxic RNA [3a], displacing endogenous MBNL and limiting its re-association. This will ideally lead to a rescue of mis-splicing by endogenous MBNL [3b]. Other synthetic MBNL proteins may act as high activity splicing regulators, functioning to reconstitute endogenous MBNL and rescuing mis-splicing events linked with disease symptoms [4].
Figure 4-4. Synthetic MBNL proteins with single RNA binding domains and HIV TAT cell-penetrating peptides. A) Schematic of synthetic MBNL proteins with only a single RNA binding domain. Presence of the linker region is designated with a black line and the location of a nuclear localization signal (NLS) and HIV TAT cell-penetrating peptide (CPP) is also shown. The length of the individual segments is proportional to the size of each region of the protein. B) Amino acid sequences of all proteins shown in schematic in panel A.
Figure 4-5. Synthetic MBNL proteins with only a single RBD regulate splicing of minigenes in HeLa cells with different activities. A-C) Jitter plot representations of cell-based splicing assays using ATP2A1, INSR, and MBNL1 minigenes, respectively. HeLa cells were transfected with empty vector (mock) or MBNL protein expression plasmids and a single minigene protein treatment was then quantified. Each point is from a single experiment and the line represents the average of all experiments for that condition (n = 2). in white) are listed below
Figure 4-6. Domain organization of natural and synthetic Drosophila Mbl proteins. A) Schematic of natural and synthetic Mbl proteins with only a single RNA binding domain. The Drosophila Mbl protein contains a single ZF domain (tan color) orthologous to MBNL ZF1-2 followed by a C-terminal region predicted to be unstructured. This figure shows the organization of these regions compared to MBNL AB, as well as the location of an HA tag, TAT CPP, and 6x His tag. The length of the individual segments is proportional to the size of each region of the protein. B) Amino acid sequences of all proteins shown in schematic in panel A.
Figure 4-7. A shortened, synthetic version of the Drosophila Mbl protein does not regulate splicing of four minigene reporters in HeLa cells. A-D) Bar plot representations of cell-based splicing assays using ATP2A1 Nfix and TNNT2 minigenes, respectively. HeLa cells were transfected with empty vector (mock) or MBNL protein expression plasmids and a single minigene (i.e. percent exon inclusion) for each protein treatment was then quantified and plotted
Figure 4-8. Differences in positional-dependent alternative splicing regulation between MBNL and SR proteins and potential predictions for mechanism of action for MBNL RS chimeric proteins. A) MBNL proteins promote exon inclusion when bound downstream of the exon (represented by arrow). MBNL proteins repress exon inclusion when bound within the exon or upstream intron. B) SR proteins, which contain C-terminal RS domains, generally promote exon inclusion when bound in the exon and inhibit exon inclusion when bound to intronic sequences. C) Potential outcomes and changes in positional-dependent splicing regulation when RS domains are fused to MBNL.
Figure 4-9. Model representing predicted preferences of MBNL dsRBD fusions for toxic CUG RNA over less structured RNA substrates. A) Model representing predicted binding preferences of synthetic MBNL dsRBD chimeric proteins for more structured RNA targets, such as the toxic CUG RNA repeats found in DM, over more minimally structured endogenous sites in pre-mRNA substrates. This is the opposite of MBNL proteins. B) Model representing predicted reduction of endogenous MBNL association with the toxic CUG RNA in the presence of MBNL dsRBD chimeras. If synthetic MBNL dsRBD proteins bind the toxic CUG RNA with increased affinity, they may prevent endogenous MBNL binding by increasing the double-stranded nature of the RNA.
Figure 4-10. Titration of ponasterone A generates a gradient of mOrange-RBFOX1 expression in Mbnl1 : Mbnl2 KO MEFs that regulates splicing of three target events. A) Representative immunoblot showing increasing levels of mOrange-RBFOX1. mOrange-RBFOX1 expression was detected via immunoblot against the mOrange tag. Equal loading was confirmed by detection of a GAPDH loading control. B) RBFOX1 dose-response curves for three splicing events. Cells were treated with a titration of ponasterone A to create a gradient of mOrange-RBFOX1 protein expression. Splicing was quantified at each concentration, plotted against log (PonA) as a proxy for relative RBFOX1 expression levels, and fit to a four-parameter dose curve (n = 3-4). A table summarizing parameters of each dose-response curve is listed below. C) Same RBFOX1 dose-response curves as in panel B plotted against relative mOrange-RBFOX1 expression levels as quantified by immunoblot. Relative levels of mOrange-RBFOX1 were normalized to GAPDH in each immunoblot. mOrange-RBFOX1 expression at 0 1 and protein levels at all other doses normalized accordingly (n = 5). A table summarizing parameters of each dose-response curve is listed below. (Courtesy of Joseph Ellis).
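The normalization described for panel C can be sketched as below. This is only an illustration under stated assumptions: band intensities are first divided by the loading control in the same lane and then scaled to a reference lane. The function name, the choice of reference lane, and all intensity values are hypothetical; the caption does not fully specify which dose served as the reference.

```python
def relative_expression(target, loading_control, reference_index=0):
    """Normalize target band intensities to a loading control, then scale
    so that the lane at reference_index equals 1.

    Assumption: reference_index marks the lane chosen as the reference dose.
    """
    if len(target) != len(loading_control):
        raise ValueError("each lane needs a target and a loading-control value")
    # Step 1: correct each lane for the amount of protein loaded.
    ratios = [t / c for t, c in zip(target, loading_control)]
    reference = ratios[reference_index]
    if reference <= 0:
        raise ValueError("reference lane must have positive signal")
    # Step 2: express every lane relative to the reference lane.
    return [r / reference for r in ratios]


# Hypothetical band intensities across four inducer doses.
morange = [10.0, 40.0, 90.0, 120.0]
gapdh = [100.0, 100.0, 90.0, 100.0]
levels = relative_expression(morange, gapdh)
print(levels)
```

Plotting splicing against these relative levels, rather than against inducer concentration, ties the response to measured protein abundance.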
Figure 4-11. MBNL1 expression alters the RBFOX1 dose response in inducible MEFs. A-B) RBFOX1 dose-response curves for two splicing events in the absence (blue) or presence (green) of GFP-MBNL1 expression. Ponasterone A was titrated to induce a gradient of mOrange-RBFOX1 expression. Cells were then treated with a vehicle (water, -MBNL1) or 1000 ng/mL doxycycline (+MBNL1) to induce GFP-MBNL1 expression. Splicing of these exons was measured, plotted against log (PonA) as a proxy for mOrange-RBFOX1 expression, and fit to a four-parameter dose curve (n = 1).
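For readers unfamiliar with the four-parameter dose curve named in these captions, the sketch below shows one common parameterization of a four-parameter logistic model as a function of log10(dose). It is an assumption that this form matches the one used for the fits; the parameter names and example values are hypothetical.

```python
def four_parameter_curve(log_dose, bottom, top, log_ec50, slope):
    """Four-parameter logistic response as a function of log10(dose).

    bottom, top: lower and upper plateaus of the response
    log_ec50:    log10(dose) giving the response midway between plateaus
    slope:       steepness of the transition (Hill-type slope)
    """
    return bottom + (top - bottom) / (1.0 + 10.0 ** ((log_ec50 - log_dose) * slope))


# Hypothetical parameters: exon inclusion rising from 20% to 80%.
params = dict(bottom=20.0, top=80.0, log_ec50=0.0, slope=1.0)
midpoint = four_parameter_curve(0.0, **params)
low_dose = four_parameter_curve(-6.0, **params)
high_dose = four_parameter_curve(6.0, **params)
print(midpoint, low_dose, high_dose)
```

At `log_dose == log_ec50` the response sits halfway between the two plateaus, which is why shifts in this parameter are a convenient way to compare curves obtained with and without a second regulator.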
CHAPTER 5
SUMMARY AND CONCLUDING REMARKS

Alternative splicing is a complex and dynamic process of gene regulation that alters tissue-specific and developmental patterns of exon inclusion and diversifies the transcriptome and proteome of individual cells. Surprisingly, given the extensive AS patterns observed, a limited population of regulators, or RNA binding proteins, works to modulate this intricate process. Decades of research have revealed that modes of AS control by individual RBPs alone can be quite elaborate, and the intricacy of the process increases when networks of RBPs function to co-regulate groups of genes. Through the course of the research studies presented here, I have worked to elucidate how one such family of master regulators of RNA processing, MBNL, works to regulate AS by (1) understanding how the individual RNA binding domains of MBNL contribute to its function, and (2) determining how MBNL coordinates with other RBPs, namely RBFOX, to co-regulate AS decisions.

Using a synthetic engineering approach, I aimed to determine if the tandem zinc finger RBDs of MBNL (ZF1-2 and ZF3-4, domain A and B) act as independent, modular units to contribute to MBNL-mediated splicing regulation. Via characterization of MBNL variants with a non-canonical rearrangement of ZF domains (MBNL-AA and MBNL-BB), it was discovered that (1) the RBDs of MBNL are in fact independent domains that can be separated and retain functionality, (2) ZF1-2 has increased splicing activity via enhanced recognition of RNA targets (i.e. RNA binding specificity), and (3) ZF3-4 acts as a modifier domain in the context of wild-type MBNL, expanding the RNA binding capacity of MBNL and potentially the breadth of target pre-mRNA recognition and
regulation. Overall, this work revealed that the domain architecture of MBNL clearly contributes to its function.

Additionally, I utilized two doxycycline-inducible cell lines to determine how the presence of another family of AS regulators modulates splicing regulation of a panel of MBNL-dependent splicing events. Using this system, it was determined that RBFOX contributes to the regulation of many of these splicing events through predicted mechanisms based on recognition of overlapping regulatory motifs. Discovery of this pattern of co-regulation indicated that MBNL and RBFOX proteins may act as compensatory and/or cooperative RBPs to maintain critical, intricate patterns of AS during developmental transitions.

While these studies focused on different levels of MBNL-dependent alternative splicing regulation, the success of each was dictated by carefully characterizing dose-dependent regulation of AS by either (1) our synthetic MBNL proteins or (2) WT MBNL1 in the presence or absence of RBFOX1 expression. In the splicing field, characterization of AS by any wild-type or mutagenized (either point mutations or domain deletions) splicing factor is performed via knockdown/knockout or overexpression-based methodologies. While providing valuable insights, conclusions gleaned from these approaches reflect only limited points on the dose-response curve. Ultimately, understanding the interplay of substrate (pre-mRNA splicing target) and RBP (splicing factor) concentration is critical for accurately defining splicing activity; the use of plasmid dosing assays, inducible expression systems, or comparison of splicing in the presence of varying expression levels provides the opportunity to discern these insights in a biochemical and quantitative manner. While
clearly important for splicing regulation, dose-dependent characterization may also prove valuable for studying other aspects of RNA processing, such as RNA turnover and microRNA processing. While understanding the many layers of RNA processing may be challenging, dose-dependent approaches like those used in this collection of work could contribute to and advance the work of others studying these and other facets of RNA biology.
232 LIST OF REFERENCES 1. Turunen,J.J., Niemel,E.H., Verma,B. and Frilander,M.J. (2013) The significant other: splicing by the minor spliceosome. Wiley Interdiscip Rev RNA 4 61 76. 2. Matera,A.G. and Wang,Z. (2014) A day in the life of the spliceosome. Nat Rev Mol Cell Biol 15 108 121. 3. Staley,J.P. and Guthrie,C. (1998) Mechanical devices of the spliceosome: motors, clocks, springs, and things. CELL 92 315 326. 4. House,A.E. and Lynch,K.W. (2008) Regulation of alternative splicing: more than just the ABCs. J. Biol. Chem. 283 1217 1221. 5. Shi,Y. (2017) Mechanistic insights into precursor messenger RNA splicing by the spliceosome. Nat Rev Mol Cell Biol 18 655 670. 6. Jurica,M.S. and Moore,M.J. (2003) Pre mRNA splicing: awash in a sea of proteins. Molecular Cell 12 5 14. 7. Zhou,Z., Licklider,L.J., Gygi,S.P. and Reed,R. (2002) Comprehensive proteomic analysis of the human spliceosome. Nature 419 182 185. 8. Lee,Y. and Rio,D.C. (2015) Mechanisms and Regulation of Alternative Pre mRNA Splicing. Annu. Rev. Biochem. 84 291 323. 9. Schwartz,S.H., Silva,J., Burstein,D., Pupko,T., Eyras,E. and Ast,G. (2008) Large scale comparative analysis of splicing signals and their corresponding splicing factors in eukaryotes. Genome Res. 18 88 103. 10. Iwata,H. and Gotoh,O. (2011) Comparative analysis of information contents relevant to recognition of introns in many species. BMC Genomics 12 45. 11. Das,R., Zhou,Z. and Ree d,R. (2000) Functional association of U2 snRNP with the ATP independent spliceosomal complex E. Molecular Cell 5 779 787. 12. Valcrcel,J., Gaur,R.K., Singh,R. and Green,M.R. (1996) Interaction of U2AF65 RS region with pre mRNA branch point and promotion of base pairing with U2 snRNA [corrected]. Science 273 1706 1709. 13. Lamond,A.I., Konarska,M.M., Grabowski,P.J. and Sharp,P.A. (1988) Spliceosome assembly involves the binding and release of U4 small nuclear ribonucleoprotein. Proc. Natl. Acad. Sci. U. S.A. 85 411 415. 14. Gerstein,M.B., Rozowsky,J., Yan,K. 
K., Wang,D., Cheng,C., Brown,J.B., Davis,C.A., Hillier,L., Sisu,C., Li,J.J., et al. (2014) Comparative analysis of the transcriptome across distant species. Nature 512 445 448.
233 15. Wang,E.T., Sand berg,R., Luo,S., Khrebtukova,I., Zhang,L., Mayr,C., Kingsmore,S.F., Schroth,G.P. and Burge,C.B. (2008) Alternative isoform regulation in human tissue transcriptomes. Nature 456 470 476. 16. Pan,Q., Shai,O., Lee,L.J., Frey,B.J. and Blencowe,B.J. (2008) De ep surveying of alternative splicing complexity in the human transcriptome by high throughput sequencing. Nat. Genet. 40 1413 1415. 17. ENCODE Project Consortium (2012) An integrated encyclopedia of DNA elements in the human genome. Nature 489 57 74. 1 8. Kim,E., Magen,A. and Ast,G. (2007) Different levels of alternative splicing among eukaryotes. Nucleic Acids Research 35 125 131. 19. Black,D.L. (2003) Mechanisms of alternative pre messenger RNA splicing. Annu. Rev. Biochem. 72 291 336. 20. Kelemen, O., Convertini,P., Zhang,Z., Wen,Y., Shen,M., Falaleeva,M. and Stamm,S. (2013) Function of alternative splicing. Gene 514 1 30. 21. Park,E., Pan,Z., Zhang,Z., Lin,L. and Xing,Y. (2018) The Expanding Landscape of Alternative Splicing Variation in Human Po pulations. Am. J. Hum. Genet. 102 11 26. 22. Braunschweig,U., Gueroussov,S., Plocik,A.M., Graveley,B.R. and Blencowe,B.J. (2013) Dynamic integration of splicing within gene regulatory pathways. CELL 152 1252 1269. 23. Baralle,F.E. and Giudice,J. (2017) Alternative splicing as a regulator of development and tissue identity. Nat Rev Mol Cell Biol 18 437 451. 24. Kalsotra,A. and Cooper,T.A. (2011) Functional consequences of developmentally regulated alternative splicing. Nat. Rev. Genet. 12 715 729. 25 Wang,Z. and Burge,C.B. (2008) Splicing regulation: from a parts list of regulatory elements to an integrated splicing code. RNA 14 802 813. 26. Fu,X. D. and Ares,M. (2014) Context dependent control of alternative splicing by RNA binding proteins. Nat. Rev. Genet. 15 689 701. 27. Hertel,K.J. (2008) Combinatorial control of exon recognition. J. Biol. Chem. 283 1211 1215. 28. Witten,J.T. and Ule,J. 
(2011) Understanding splicing regulation through RNA splicing maps. Trends Genet. 27 89 97. 29. Busch,A and Hertel,K.J. (2012) Evolution of SR protein and hnRNP splicing regulatory factors. Wiley Interdiscip Rev RNA 3 1 12.
234 30. Graveley,B.R. (2000) Sorting out the complexity of SR protein functions. RNA 6 1197 1211. 31. Barreau,C., Paillard,L., Mreau, A. and Osborne,H.B. (2006) Mammalian CELF/Bruno like RNA binding proteins: molecular characteristics and biological functions. Biochimie 88 515 525. 32. Ho,T.H., Charlet B,N., Poulos,M.G., Singh,G., Swanson,M.S. and Cooper,T.A. (2004) Muscleblind protein s regulate alternative splicing. EMBO J. 23 3103 3112. 33. Wang,E.T., Ward,A.J., Cherone,J.M., Giudice,J., Wang,T.T., Treacy,D.J., Lambert,N.J., Freese,P., Saxena,T., Cooper,T.A., et al. (2015) Antagonistic regulation of mRNA expression and splicing by C ELF and MBNL proteins. Genome Res. 25 858 871. 34. Kalsotra,A., Xiao,X., Ward,A.J., Castle,J.C., Johnson,J.M., Burge,C.B. and Cooper,T.A. (2008) A postnatal switch of CELF and MBNL proteins reprograms alternative splicing in the developing heart. Proc. N atl. Acad. Sci. U.S.A. 105 20333 20338. 35. Fagnani,M., Barash,Y., Ip,J.Y., Misquitta,C., Pan,Q., Saltzman,A.L., Shai,O., Lee,L., Rozenhek,A., Mohammad,N., et al. (2007) Functional coordination of alternative splicing in the mammalian central nervous sys tem. Genome Biol. 8 R108. 36. Tilgner,H., Jahanbani,F., Gupta,I., Collier,P., Wei,E., Rasmussen,M. and Snyder,M. (2018) Microfluidic isoform sequencing shows widespread splicing coordination in the human transcriptome. Genome Res. 28 231 242. 37. Baras h,Y., Calarco,J.A., Gao,W., Pan,Q., Wang,X., Shai,O., Blencowe,B.J. and Frey,B.J. (2010) Deciphering the splicing code. Nature 465 53 59. 38. Fu,X. D. (2004) Towards a splicing code. CELL 119 736 738. 39. Pascual,M., Vicente,M., Monferrer,L. and Artero ,R. (2006) The Muscleblind family of proteins: an emerging class of regulators of developmentally programmed alternative splicing. Differentiation 74 65 80. 40. Fernandez Costa,J.M., Llamusi,M.B., Garcia Lopez,A. and Artero,R. (2011) Alternative splicing regulation by Muscleblind proteins: from development to disease. 
Biological Reviews 86 947 958. 41. Lin,X., Miller,J.W., Mankodi,A., Kanadia,R.N., Yuan,Y., Moxley,R.T., Swanson,M.S. and Thornton,C.A. (2006) Failure of MBNL1 dependent post natal splicing transitions in myotonic dystrophy. Human Molecular Genetics 15 2087 2097. 42. Dixon,D.M., Choi,J., El Ghazali,A., Park,S.Y., Roos,K.P., Jordan,M.C., Fishbein,M.C., Comai,L. and Reddy,S. (2015) Loss of muscleblind like 1 results in cardiac pathology and persistence of embryonic splice isoforms. Sci. Rep. 5 9042.
235 43. Blech Hermoni,Y. and Ladd,A.N. (2013) RNA binding proteins in the regulation of heart development. Int. J. Biochem. Cell Biol. 45 2467 2478. 44. Wang,E.T., Cody,N.A.L., Jog,S., Biancolella ,M., Wang,T.T., Treacy,D.J., Luo,S., Schroth,G.P., Housman,D.E., Reddy,S., et al. (2012) Transcriptome wide Regulationof Pre mRNA Splicing and mRNA Localization by Muscleblind Proteins. CELL 150 710 724. 45. Masuda,A., Andersen,H.S., Doktor,T.K., Okamoto ,T., Ito,M., Andresen,B.S. and Ohno,K. (2012) CUGBP1 and MBNL1 preferentially bind to 3' UTRs and facilitate mRNA decay. Sci. Rep. 2 209. 46. Osborne,R.J., Lin,X., Welle,S., Sobczak,K., O'Rourke,J.R., Swanson,M.S. and Thornton,C.A. (2009) Transcriptional and post transcriptional impact of toxic RNA in myotonic dystrophy. Human Molecular Genetics 18 1471 1481. 47. Du,H., Cline,M.S., Osborne,R.J., Tuttle,D.L., Clark,T.A., Donohue,J.P., Hall,M.P., Shiue,L., Swanson,M.S., Thornton,C.A., et al. (2010) Aberrant alternative splicing and extracellular matrix gene expression in mouse models of myotonic dystrophy. Nature Structural & Molecular Biology 17 187 193. 48. Batra,R., Charizanis,K., Manchanda,M., Mohan,A., Li,M., Finn,D.J., Goodwin,M., Zha ng,C., Sobczak,K., Thornton,C.A., et al. (2014) Loss of MBNL leads to disruption of developmentally regulated alternative polyadenylation in RNA mediated disease. Molecular Cell 56 311 322. 49. Rau,F., Freyermuth,F., Fugier,C., Villemin,J. P., Fischer,M. C., Jost,B., Dembele,D., Gourdon,G., Nicole,A., Duboc,D., et al. (2011) Misregulation of miR 1 processing is associated with heart defects in myotonic dystrophy. Nature Structural & Molecular Biology 18 840 845. 50. Begemann,G., Paricio,N., Artero,R., K iss,I., Prez Alonso,M. and Mlodzik,M. (1997) muscleblind, a gene required for photoreceptor differentiation in Drosophila, encodes novel nuclear Cys3His type zinc finger containing proteins. Development 124 4321 4331. 51. 
Artero,R., Prokop,A., Paricio,N., Begemann,G., Pueyo,I., Mlodzik,M., Prez Alonso,M. and Baylies,M.K. (1998) The muscleblind gene participates in the organization of Z bands and epidermal attachments of Drosophila muscles and is regulated by Dmef2. Dev Biol. 195 131 143. 52. Oddo,J.C., Saxena,T., McConnell,O.L., Berglund,J.A. and Wang,E.T. (2016) Conservation of context dependent splicing activity in distant Muscleblind homologs. Nucleic Acids Research 44 8352 8362. 53. Kanadia,R.N., Johnstone,K.A. Mankodi,A., Lungu,C., Thornton,C.A., Esson,D., Timmers,A.M., Hauswirth,W.W. and Swanson,M.S. (2003) A muscleblind knockout model for myotonic dystrophy. Science 302 1978 1980.
236 54. Monferrer,L. and Artero,R. (2006) An interspecific functional complement ation test in Drosophila for introductory genetics laboratory courses. J. Hered. 97 67 73. 55. Fardaei,M., Rogers,M.T., Thorpe,H.M., Larkin,K., Hamshere,M.G., Harper,P.S. and Brook,J.D. (2002) Three proteins, MBNL, MBLL and MBXL, co localize in vivo with nuclear foci of expanded repeat transcripts in DM1 and DM2 cells. Human Molecular Genetics 11 805 814. 56. Miller,J.W., Urbinati,C.R., Teng Umnuay,P., Stenberg,M.G., Byrne,B.J., Thornton,C.A. and Swanson,M.S. (2000) Recruitment of human muscleblind prot eins to (CUG)(n) expansions associated with myotonic dystrophy. EMBO J. 19 4439 4448. 57. Kanadia,R.N., Urbinati,C.R., Crusselle,V.J., Luo,D., Lee,Y. J., Harrison,J.K., Oh,S.P. and Swanson,M.S. (2003) Developmental expression of mouse muscleblind genes M bnl1, Mbnl2 and Mbnl3. Gene Expr. Patterns 3 459 462. 58. Poulos,M.G., Batra,R., Li,M., Yuan,Y., Zhang,C., Darnell,R.B. and Swanson,M.S. (2013) Progressive impairment of muscle regeneration in muscleblind like 3 isoform knockout mice. Human Molecular Gen etics 22 3547 3558. 59. Squillace,R.M., Chenault,D.M. and Wang,E.H. (2002) Inhibition of muscle differentiation by the novel muscleblind related protein CHCR. Dev. Biol. 250 218 230. 60. Konieczny,P., Stepniak Konieczna,E. and Sobczak,K. (2014) MBNL pr oteins and their target RNAs, interaction and splicing regulation. Nucleic Acids Research 42 10873 10887. 61. Lee,K. Y., Li,M., Manchanda,M., Batra,R., Charizanis,K., Mohan,A., Warren,S.A., Chamberlain,C.M., Finn,D., Hong,H., et al. (2013) Compound loss of muscleblind like function in myotonic dystrophy. EMBO Mol Med 5 1887 1900. 62. Han,H., Irimia,M., Ross,P.J., Sung,H. K., Alipanahi,B., David,L., Golipour,A., Gabut,M., Michael,I.P., Nachman,E.N., et al. (2013) MBNL proteins repress ES cell specific al ternative splicing and reprogramming. Nature 498 241 245. 63. Lee,K. S., Smith,K., Amieux,P.S. and Wang,E.H. 
(2008) MBNL3/CHCR prevents myogenic differentiation by inhibiting MyoD dependent gene transcription. Differentiation 76 299 309. 64. Goers,E.S. Purcell,J., Voelker,R.B., Gates,D.P. and Berglund,J.A. (2010) MBNL1 binds GC motifs embedded in pyrimidines to regulate alternative splicing. Nucleic Acids Research 38 2467 2484.
237 65. Charizanis,K., Lee,K. Y., Batra,R., Goodwin,M., Zhang,C., Yuan,Y., Shiue,L., Cline,M., Scotti,M.M., Xia,G., et al. (2012) Muscleblind like 2 Mediated Alternative Splicing in the Developing Brain and Dysregulationin Myotonic Dystrophy. Neuron 75 437 4 50. 66. Lambert,N., Robertson,A., Jangi,M., McGeary,S., Sharp,P.A. and Burge,C.B. (2014) RNA Bind n Seq: quantitative assessment of the sequence and structural binding specificity of RNA binding proteins. Molecular Cell 54 887 900. 67. Warf,M.B. and Berg lund,J.A. (2007) MBNL binds similar RNA structures in the CUG repeats of myotonic dystrophy and its pre mRNA substrate cardiac troponin T. RNA 13 2238 2251. 68. Warf,M.B., Diegel,J.V., Hippel,von,P.H. and Berglund,J.A. (2009) The protein factors MBNL1 an d U2AF65 bind alternative RNA structures to regulate splicing. Proc. Natl. Acad. Sci. U.S.A. 106 9203 9208. 69. Kino,Y., Washizu,C., Oma,Y., Onishi,H., Nezu,Y., Sasagawa,N., Nukina,N. and Ishiura,S. (2009) MBNL and CELF proteins regulate alternative spli cing of the skeletal muscle chloride channel CLCN1. Nucleic Acids Research 37 6477 6490. 70. Yuan,Y., Compton,S.A., Sobczak,K., Stenberg,M.G., Thornton,C.A., Griffith,J.D. and Swanson,M.S. (2007) Muscleblind like 1 interacts with RNA hairpins in splicing target and pathogenic RNAs. Nucleic Acids Research 35 5474 5486. 71. Sen,S., Talukdar,I., Liu,Y., Tam,J., Reddy,S. and Webster,N.J.G. (2010) Muscleblind like 1 (Mbnl1) Promotes Insulin Receptor Exon 11 Inclusion via Binding to a Downstream Evolutionaril y Conserved Intronic Enhancer. Journal of Biological Chemistry 285 25426 25437. 72. Echeverria,G.V. and Cooper,T.A. (2014) Muscleblind like 1 activates insulin receptor exon 11 inclusion by enhancing U2AF65 binding and splicing of the upstream intron. Nu cleic Acids Research 42 1893 1903. 73. Tran,H., Gourrier,N., Lemercier Neuillet,C., Dhaenens,C.M., Vautrin,A., Fernandez Gomez,F.J., Arandel,L., Carpentier,C., Obriot,H., Eddarkaoui,S., et al. 
(2011) Analysis of Exonic Regions Involved in Nuclear Localiz ation, Splicing Activity, and Dimerization of Muscleblind like 1 Isoforms. Journal of Biological Chemistry 286 16435 16446. 74. Purcell,J., Oddo,J.C., Wang,E.T. and Berglund,J.A. (2012) Combinatorial Mutagenesis of MBNL1 Zinc Fingers Elucidates Distinct Classes of Regulatory Events. Molecular and Cellular Biology 32 4155 4167. 75. Cass,D., Hotchko,R., Barber,P., Jones,K., Gates,D.P. and Berglund,J.A. (2011) The four Zn fingers of MBNL1 provide a flexibleplatform for recognition of its RNA bindingelements. BMC Molecular Biology 12 20.
238 76. Teplova,M. and Patel,D.J. (2008) Struct ural insights into RNA recognition by the alternative splicing regulator muscleblind like MBNL1. Nature Structural & Molecular Biology 15 1343 1351. 77. Park,S., Phukan,P.D., Zeeb,M., Martinez Yamout,M.A., Dyson,H.J. and Wright,P.E. (2017) Structural Bas is for Interaction of the Tandem Zinc Finger Domains of Human Muscleblind with Cognate RNA from Human Cardiac Troponin T. Biochemistry 56 4154 4168. 78. deLorimier,E., Coonrod,L.A., Copperman,J., Taber,A., Reister,E.E., Sharma,K., Todd,P.K., Guenza,M.G. and Berglund,J.A. (2014) Modifications to toxic CUG RNAs induce structural stability, rescue mis splicing in a myotonic dystrophy cell model and reduce toxicity in a myotonic dystrophy zebrafish model. Nucleic Acids Research 42 12768 12778. 79. Kino,Y., Washizu,C., Kurosawa,M., Oma,Y., Hattori,N., Ishiura,S. and Nukina,N. (2015) Nuclear localization of MBNL1: splicing mediated autoregulation and repression of repeat derived aberrant proteins. Human Molecular Genetics 24 740 756. 80. Edge,C., Gooding,C. and Smith,C.W. (2013) Dissecting domains necessary for activation and repression of splicing by muscleblind like protein 1. BMC Molecular Biology 14 29. 81. Lpez Bigas,N., Audit,B., Ouzounis,C., Parra,G. and Guig,R. (2005) Are splicing mutations the mo st frequent cause of hereditary disease? FEBS Lett. 579 1900 1903. 82. Ward,A.J. and Cooper,T.A. (2010) The pathobiology of splicing. J. Pathol. 220 152 163. 83. Scotti,M.M. and Swanson,M.S. (2016) RNA mis splicing in disease. Nat. Rev. Genet. 17 19 32. 84. Cooper,T.A., Wan,L. and Dreyfuss,G. (2009) RNA and disease. CELL 136 777 793. 85. Singh,B. and Eyras,E. (2017) The role of alternative splicing in cancer. Transcription 8 91 98. 86. Sveen,A., Kilpinen,S., Ruusulehto,A., Lothe,R.A. and Skotheim,R.I. (2016) Aberrant RNA splicing in cancer; expression changes and driver mutations of splicing factor genes. Oncogene 35 2413 2427. 87. Venables,J.P. 
(2004) Aberrant and alternative spli cing in cancer. Cancer Res. 64 7647 7654.
239 88. Ranum,L.P.W. and Cooper,T.A. (2006) RNA mediated neuromuscular disorders. Annu. Rev. Neurosci. 29 259 277. 89. Lee,J.E. and Cooper,T.A. (2009) Pathogenic mechanisms of myotonic dystrophy. Biochem. Soc. Tran s 37 1281 1286. 90. Brook,J.D., McCurrach,M.E., Harley,H.G., Buckler,A.J., Church,D., Aburatani,H., Hunter,K., Stanton,V.P., Thirion,J.P. and Hudson,T. (1992) Molecular basis of myotonic dystrophy: expansion of a trinucleotide (CTG) repeat at the 3' end of a transcript encoding a protein kinase family member. CELL 68 799 808. 91. Fu,Y.H., Pizzuti,A., Fenwick,R.G., King,J., Rajnarayan,S., Dunne,P.W., Dubel,J., Nasser,G.A., Ashizawa,T. and de Jong,P. (1992) An unstable triplet repeat in a gene related to myotonic muscular dystrophy. Science 255 1256 1258. 92. Mahadevan,M., Tsilfidis,C., Sabourin,L., Shutler,G., Amemiya,C., Jansen,G., Neville,C., Narang,M., Barcel,J. and O'Hoy,K. (1992) Myotonic dystrophy mutation: an unstable CTG repeat in the 3' untran slated region of the gene. Science 255 1253 1255. 93. Harley,H.G., Rundle,S.A., MacMillan,J.C., Myring,J., Brook,J.D., Crow,S., Reardon,W., Fenton,I., Shaw,D.J. and Harper,P.S. (1993) Size of the unstable CTG repeat sequence in relation to phenotype and parental transmission in myotonic dystrophy. Am. J. Hum. Genet. 52 1164 1174. 94. Yum,K., Wang,E.T. and Kalsotra,A. (2017) Myotonic dystrophy: disease repeat range, penetrance, age of onset, and relationship between repeat size and phenotypes. Current Op inion in Genetics & Development 44 30 37. 95. Ranum,L.P., Rasmussen,P.F., Benzow,K.A., Koob,M.D. and Day,J.W. (1998) Genetic mapping of a second myotonic dystrophy locus. Nat. Genet. 19 196 198. 96. Liquori,C.L., Ricker,K., Moseley,M.L., Jacobsen,J.F., Kress,W., Naylor,S.L., Day,J.W. and Ranum,L.P. (2001) Myotonic dystrophy type 2 caused by a CCTG expansion in intron 1 of ZNF9. Science 293 864 867. 97. 
Morales,F., Couto,J.M., Higham,C.F., Hogg,G., Cuenca,P., Braida,C., Wilson,R.H., Adam,B., del Valle, G., Brian,R., et al. (2012) Somatic instability of the expanded CTG triplet repeat in myotonic dystrophy type 1 is a heritable quantitative trait and modifier of disease severity. Human Molecular Genetics 21 3558 3567. 98. Groh,W.J., Groh,M.R., Shen,C., Monckton,D.G., Bodkin,C.L. and Pascuzzi,R.M. (2011) Survival and CTG repeat expansion in adults with myotonic dystrophy type 1. Muscle Nerve 43 648 651. 99. Tsilfidis,C., MacKenzie,A.E., Mettler,G., Barcel,J. and Korneluk,R.G. (1992) Correlation between CTG trinucleotide repeat length and frequency of severe congenital myotonic dystrophy. Nat. Genet. 1 192 195.
240 100. Hunter,A., Tsilfidis,C., Mettler,G., Jacob,P., Mahadevan,M., Surh,L. and Korneluk,R. (1992) The correlation of age of onset with CTG trinucleotide repeat amplification in myotonic dystrophy. J. Med. Genet. 29 774 779. 101. Day,J.W., Ricker,K., Jacob sen,J.F., Rasmussen,L.J., Dick,K.A., Kress,W., Schneider,C., Koch,M.C., Beilman,G.J., Harrison,A.R., et al. (2003) Myotonic dystrophy type 2: molecular, diagnostic and clinical spectrum. Neurology 60 657 664. 102. Fardaei,M., Larkin,K., Brook,J.D. and Ha mshere,M.G. (2001) In vivo co localisation of MBNL protein with DMPK expanded repeat transcripts. Nucleic Acids Research 29 2766 2771. 103. Mankodi,A., Urbinati,C.R., Yuan,Q.P., Moxley,R.T., Sansone,V., Krym,M., Henderson,D., Schalling,M., Swanson,M.S. a nd Thornton,C.A. (2001) Muscleblind localizes to nuclear foci of aberrant RNA in myotonic dystrophy types 1 and 2. Human Molecular Genetics 10 2165 2170. 104. Kuyumcu Martinez,N.M. and Cooper,T.A. (2006) Misregulation of alternative splicing causes patho genesis in myotonic dystrophy. Prog. Mol. Subcell. Biol. 44 133 159. 105. Chau,A. and Kalsotra,A. (2015) Developmental insights into the pathology of and therapeutic strategies for DM1: Back to the basics. Dev. Dyn. 244 377 390. 106. Terenzi,F. and Lad d,A.N. (2010) Conserved developmental alternative splicing of muscleblind like (MBNL) transcripts regulates MBNL localization and activity. rnabiology 7 43 55. 107. Wahbi,K., Algalarrondo,V., Bcane,H.M., Fressart,V., Beldjord,C., Azibi,K., Lazarus,A., Berber,N., Radvanyi Hoffman,H., Stojkovic,T., et al. (2013) Brugada syndrome and abnormal splicing of SCN5A in myotonic dystrophy type 1. Arch Cardiovasc Dis 106 635 643. 108. Charlet B,N., Savkur,R.S., Singh,G., Philips,A.V., Grice,E.A. and Cooper,T.A. (2002) Loss of the muscle specific chloride channel in type 1 myotonic dystrophy due to misregulated alternative splicing. Molecular Cell 10 45 53. 109. Lueck,J. 
D., Lungu,C., Mankodi,A., Osborne,R.J., Welle,S.L., Dirksen,R.T. and Thornton,C.A. (2007) Chloride channelopathy in myotonic dystrophy resulting from loss of posttranscriptional regulation for CLCN1. Am. J. Physiol., Cell Physiol. 292 C1291 7. 110. Kanad ia,R.N., Shin,J., Yuan,Y., Beattie,S.G., Wheeler,T.M., Thornton,C.A. and Swanson,M.S. (2006) Reversal of RNA missplicing and myotonia after muscleblind overexpression in a mouse poly(CUG) model for myotonic dystrophy. Proc. Natl. Acad. Sci. U.S.A. 103 11 748 11753.
241 111. Zu,T., Cleary,J.D., Liu,Y., Baez Coronel,M., Bubenik,J.L., Ayhan,F., Ashizawa,T., Xia,G., Clark,H.B., Yachnis,A.T., et al. (2017) RAN Translation Regulated by Muscleblind Proteins in Myotonic Dystrophy Type 2. Neuron 95 1292 1305.e5. 112 Cleary,J.D. and Ranum,L.P.W. (2014) Repeat associated non ATG (RAN) translation: new starts in microsatellite expansion disorders. Current Opinion in Genetics & Development 26 6 15. 113. Cleary,J.D. and Ranum,L.P. (2017) New developments in RAN transla tion: insights from multiple diseases. Current Opinion in Genetics & Development 44 125 134. 114. Zu,T., Gibbens,B., Doty,N.S., Gomes Pereira,M., Huguet,A., Stone,M.D., Margolis,J., Peterson,M., Markowski,T.W., Ingram,M.A.C., et al. (2011) Non ATG initia ted translation directed by microsatellite expansions. Proc. Natl. Acad. Sci. U.S.A. 108 260 265. 115. Daughters,R.S., Tuttle,D.L., Gao,W., Ikeda,Y., Moseley,M.L., Ebner,T.J., Swanson,M.S. and Ranum,L.P.W. (2009) RNA gain of function in spinocerebellar a taxia type 8. PLoS Genet. 5 e1000600. 116. Day,J.W., Schut,L.J., Moseley,M.L., Durand,A.C. and Ranum,L.P. (2000) Spinocerebellar ataxia type 8: clinical features in a large family. Neurology 55 649 657. 117. Du,J., Aleff,R.A., Soragni,E., Kalari,K., Ni e,J., Tang,X., Davila,J., Kocher,J. P., Patel,S.V., Gottesfeld,J.M., et al. (2015) RNA toxicity and missplicing in the common eye disease fuchs endothelial corneal dystrophy. Journal of Biological Chemistry 290 5979 5990. 118. Wieben,E.D., Aleff,R.A., Ta ng,X., Butz,M.L., Kalari,K.R., Highsmith,E.W., Jen,J., Vasmatzis,G., Patel,S.V., Maguire,L.J., et al. (2017) Trinucleotide Repeat Expansion in the Transcription Factor 4 (TCF4) Gene Leads to Widespread mRNA Splicing Changes in Fuchs' Endothelial Corneal Dy strophy. Invest. Ophthalmol. Vis. Sci. 58 343 352. 119. Sebestyn,E., Singh,B., Miana,B., Pags,A., Mateo,F., Pujana,M.A., Valcarcel,J. and Eyras,E. 
(2016) Large scale analysis of genome and transcriptome alterations in multiple tumors unveils novel can cer relevant splicing networks. Genome Res. 26 732 744. 120. Fish,L., Pencheva,N., Goodarzi,H., Tran,H., Yoshida,M. and Tavazoie,S.F. (2016) Muscleblind like 1 suppresses breast cancer metastatic colonization and stabilizes metastasis suppressor transcri pts. Genes Dev. 30 386 398. 121. Vuong,C.K., Black,D.L. and Zheng,S. (2016) The neurogenetics of alternative splicing. Nat. Rev. Neurosci. 17 265 281.
242 122. Hale,M.A., Richardson,J.I., Day,R.C., McConnell,O.L., Arboleda,J., Wang,E.T. and Berglund,J.A. ( 2018) An engineered RNA binding protein with improved splicing regulation. Nucleic Acids Research 12 715. 123. Jangi,M. and Sharp,P.A. (2014) Building robust transcriptomes with master splicing factors. CELL 159 487 498. 124. Klein,A.F., Gasnier,E. and Furling,D. (2013) Gain of RNA function in pathological cases: Focus on myotonic dystrophy. Biochimie 10.1016/j.biochi.2011.06.028. 125. Meola,G. and Cardani,R. (2015) Myotonic dystrophies: An update on clinical aspects, genetic, pathology, and molecular pathomechanisms. Biochim. Biophys. Acta 1852 594 606. 126. Irion,U. (2012) Drosophila muscleblind codes for proteins with one and t wo tandem zinc finger motifs. PLoS ONE 7 e34248. 127. Vicente Crespo,M., Pascual,M., Fernandez Costa,J.M., Garcia Lopez,A., Monferrer,L., Miranda,M.E., Zhou,L. and Artero,R.D. (2008) Drosophila muscleblind is involved in troponin T alternative splicing and apoptosis. PLoS ONE 3 e1613. 128. Grammatikaki s,I., Goo,Y.H., Echeverria,G.V. and Cooper,T.A. (2011) Identification of MBNL1 and MBNL3 domains required for splicing activation and repression. Nucleic Acids Research 39 2769 2780. 129. Kosaki,A., Nelson,J. and Webster,N.J. (1998) Identification of int ron and exon sequences involved in alternative splicing of insulin receptor pre mRNA. J. Biol. Chem. 273 10331 10337. 130. Hino,S. I., Kondo,S., Sekiya,H., Saito,A., Kanemoto,S., Murakami,T., Chihara,K., Aoki,Y., Nakamori,M., Takahashi,M.P., et al. (2007 ) Molecular mechanisms responsible for aberrant splicing of SERCA1 in myotonic dystrophy type 1. Human Molecular Genetics 16 2834 2843. 131. Gates,D.P., Coonrod,L.A. and Berglund,J.A. (2011) Autoregulated splicing of muscleblind like 1 (MBNL1) Pre mRNA. Journal of Biological Chemistry 286 34224 34233. 132. Philips,A.V., Timchenko,L.T. and Cooper,T.A. (1998) Disruption of splicing regulated by a CUG binding protein in myotonic dystrophy. 
Science 280 737 741. 133. Cleary,J.D. and Pearson,C.E. (2003) The contribution of cis elements to disease associated repeat instability: clinical and experimental evidence. Cytogenet. Genome Res. 100 25 55.
134. Wagner,S.D., Struck,A.J., Gupta,R., Farnsworth,D.R., Mahady,A.E., Eichinger,K., Thornton,C.A., Wang,E.T. and Berglund,J.A. (2016) Dose Dependent Regulation of Alternative Splicing by MBNL Proteins Reveals Biomarkers for Myotonic Dystrophy. PLoS Genet. 12 e1006316.
135. Fu,Y., Ramisetty,S.R., Hussain,N. and Baranger,A.M. (2011) MBNL1 RNA Recognition: Contributions of MBNL1 Sequence and RNA Conformation. ChemBioChem 13 112-119.
136. Lunde,B.M., Moore,C. and Varani,G. (2007) RNA binding proteins: modular design for efficient function. Nat Rev Mol Cell Biol 8 479-490.
137. Murn,J., Teplova,M., Zarnack,K., Shi,Y. and Patel,D.J. (2016) Recognition of distinct RNA motifs by the clustered CCCH zinc fingers of neuronal protein Unkempt. Nature Structural & Molecular Biology 23 16-23.
138. Murn,J., Zarnack,K., Yang,Y.J., Durak,O., Murphy,E.A., Cheloufi,S., Gonzalez,D.M., Teplova,M., Curk,T., Zuber,J., et al. (2015) Control of a neuronal morphology program by an RNA binding zinc finger protein, Unkempt. Genes Dev. 29 501-512.
139. Mackay,J.P., Font,J. and Segal,D.J. (2011) The prospects for designer single stranded RNA binding proteins. Nature Structural & Molecular Biology 18 256-261.
140. Chen,Y. and Varani,G. (2013) Engineering RNA binding proteins for biology. FEBS J. 280 3734-3754.
141. Chamberlain,C.M. and Ranum,L.P.W. (2012) Mouse model of muscleblind like 1 overexpression: skeletal muscle effects and therapeutic promise. Human Molecular Genetics 21 4645-4654.
142. Kino,Y., Mori,D., Oma,Y., Takeshita,Y., Sasagawa,N. and Ishiura,S. (2004) Muscleblind protein, MBNL1/EXP, binds specifically to CHHG repeats. Human Molecular Genetics 13 495-507.
143. Li,X., Burnight,E.R., Cooney,A.L., Malani,N., Brady,T., Sander,J.D., Staber,J., Wheelan,S.J., Joung,J.K., McCray,P.B., et al. (2013) piggyBac transposase tools for genome engineering. Proc. Natl. Acad. Sci. U.S.A. 110 E2279-87.
144. Edgar,R.C. (2004) MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Research 32 1792-1797.
145. Djebali,S., Davis,C.A., Merkel,A., Dobin,A., Lassmann,T., Mortazavi,A., Tanzer,A., Lagarde,J., Lin,W., Schlesinger,F., et al. (2012) Landscape of transcription in human cells. Nature 489 101-108.
146. Van Nostrand,E.L., Pratt,G.A., Shishkin,A.A., Gelboin Burkhart,C., Fang,M.Y., Sundararaman,B., Blue,S.M., Nguyen,T.B., Surka,C., Elkins,K., et al. (2016) Robust transcriptome wide discovery of RNA binding protein binding sites with enhanced CLIP (eCLIP). Nat. Methods 13 508-514.
147. Anczuków,O. and Krainer,A.R. (2016) Splicing factor alterations in cancers. RNA 22 1285-1301.
148. Zong,F. Y., Fu,X., Wei,W. J., Luo,Y. G., Heiner,M., Cao,L. J., Fang,Z., Fang,R., Lu,D., Ji,H., et al. (2014) The RNA binding protein QKI suppresses cancer associated aberrant splicing. PLoS Genet. 10 e1004289.
149. van den Hoogenhof,M.M.G., Pinto,Y.M. and Creemers,E.E. (2016) RNA Splicing: Regulation and Dysregulation in the Heart. Circ. Res. 118 454-468.
150. Maatz,H., Jens,M., Liss,M., Schafer,S., Heinig,M., Kirchner,M., Adami,E., Rintisch,C., Dauksaite,V., Radke,M.H., et al. (2014) RNA binding protein RBM20 represses splicing to orchestrate cardiac pre mRNA processing. J. Clin. Invest. 124 3419-3430.
151. Mohan,A., Goodwin,M. and Swanson,M.S. (2014) RNA protein interactions in unstable microsatellite diseases. Brain Res. 1584 3-14.
152. Jog,S.P., Paul,S., Dansithong,W., Tring,S., Comai,L. and Reddy,S. (2012) RNA splicing is responsive to MBNL1 dose. PLoS ONE 7 e48825.
153. Timchenko,N.A., Patel,R., Iakova,P., Cai,Z. J., Quan,L. and Timchenko,L.T. (2004) Overexpression of CUG triplet repeat binding protein, CUGBP1, in mice inhibits myogenesis. J. Biol. Chem. 279 13129-13139.
154. Kuyumcu Martinez,N.M., Wang,G. S. and Cooper,T.A. (2007) Increased steady state levels of CUGBP1 in myotonic dystrophy 1 are due to PKC mediated hyperphosphorylation. Molecular Cell 28 68-78.
155. Klinck,R., Fourrier,A., Thibault,P., Toutant,J., Durand,M., Lapointe,E., Caillet Boudin,M. L., Sergeant,N., Gourdon,G., Meola,G., et al. (2014) RBFOX1 cooperates with MBNL1 to control splicing in muscle, including events altered in myotonic dystrophy type 1. PLoS ONE 9 e107324.
156. Venables,J.P., Lapasset,L., Gadea,G., Fort,P., Klinck,R., Irimia,M., Vignal,E., Thibault,P., Prinos,P., Chabot,B., et al. (2013) MBNL1 and RBFOX2 cooperate to establish a splicing programme involved in pluripotent stem cell differentiation. Nat Commun 4 2480.
157. Jin,Y., Suzuki,H., Maegawa,S., Endo,H., Sugano,S., Hashimoto,K., Yasuda,K. and Inoue,K. (2003) A vertebrate RNA binding protein Fox 1 regulates tissue specific splicing via the pentanucleotide GCAUG. EMBO J. 22 905-912.
158. Shibata,H., Huynh,D.P. and Pulst,S.M. (2000) A novel protein with RNA binding motifs interacts with ataxin 2. Human Molecular Genetics 9 1303-1313.
159. Underwood,J.G., Boutz,P.L., Dougherty,J.D., Stoilov,P. and Black,D.L. (2005) Homologues of the Caenorhabditis elegans Fox 1 protein are neuronal splicing regulators in mammals. Molecular and Cellular Biology 25 10005-10016.
160. Kim,K.K., Adelstein,R.S. and Kawamoto,S. (2009) Identification of neuronal nuclei (NeuN) as Fox 3, a new member of the Fox 1 gene family of splicing factors. Journal of Biological Chemistry 284 31052-31061.
161. Kuroyanagi,H. (2009) Fox 1 family of RNA binding proteins. Cell. Mol. Life Sci. 66 3895-3907.
162. Nakahata,S. and Kawamoto,S. (2005) Tissue dependent isoforms of mammalian Fox 1 homologs are associated with tissue specific splicing activities. Nucleic Acids Research 33 2078-2089.
163. Sun,S., Zhang,Z., Fregoso,O. and Krainer,A.R. (2012) Mechanisms of activation and repression by the alternative splicing factors RBFOX1/2. RNA 18 274-283.
164. Mauger,D.M., Lin,C. and Garcia Blanco,M.A. (2008) hnRNP H and hnRNP F complex with Fox2 to silence fibroblast growth factor receptor 2 exon IIIc. Molecular and Cellular Biology 28 5403-5419.
165. Ponthier,J.L., Schluepen,C., Chen,W., Lersch,R.A., Gee,S.L., Hou,V.C., Lo,A.J., Short,S.A., Chasis,J.A., Winkelmann,J.C., et al. (2006) Fox 2 splicing factor binds to a conserved intron motif to promote inclusion of protein 4.1R alternative exon 16. J. Biol. Chem. 281 12468-12474.
166. Yeo,G.W., Coufal,N.G., Liang,T.Y., Peng,G.E., Fu,X. D. and Gage,F.H. (2009) An RNA code for the FOX2 splicing regulator revealed by mapping RNA protein interactions in stem cells. Nature Structural & Molecular Biology 16 130-137.
167. Zhang,C., Zhang,Z., Castle,J., Sun,S., Johnson,J., Krainer,A.R. and Zhang,M.Q. (2008) Defining the regulatory network of the tissue specific splicing factors Fox 1 and Fox 2. Genes Dev. 22 2550-2563.
168. Auweter,S.D., Fasan,R., Reymond,L., Underwood,J.G., Black,D.L., Pitsch,S. and Allain,F.H. T. (2006) Molecular basis of RNA recognition by the human alternative splicing factor Fox 1. EMBO J. 25 163-173.
169. Minovitsky,S., Gee,S.L., Schokrpur,S., Dubchak,I. and Conboy,J.G. (2005) The splicing regulatory element, UGCAUG, is phylogenetically and spatially conserved in introns that flank tissue specific alternative exons. Nucleic Acids Research 33 714-724.
170. Das,D., Clark,T.A., Schweitzer,A., Yamamoto,M., Marr,H., Arribere,J., Minovitsky,S., Poliakov,A., Dubchak,I., Blume,J.E., et al. (2007) A correlation with exon expression approach to identify cis regulatory elements for tissue specific alternative splicing. Nucleic Acids Research 35 4845-4857.
171. Gehman,L.T., Stoilov,P., Maguire,J., Damianov,A., Lin,C. H., Shiue,L., Ares,M., Mody,I. and Black,D.L. (2011) The splicing regulator Rbfox1 (A2BP1) controls neuronal excitation in the mammalian brain. Nat. Genet. 43 706-711.
172. Gehman,L.T., Meera,P., Stoilov,P., Shiue,L., O'Brien,J.E., Meisler,M.H., Ares,M., Otis,T.S. and Black,D.L. (2012) The splicing regulator Rbfox2 is required for both cerebellar development and mature motor function. Genes Dev. 26 445-460.
173. Singh,R.K., Xia,Z., Bland,C.S., Kalsotra,A., Scavuzzo,M.A., Curk,T., Ule,J., Li,W. and Cooper,T.A. (2014) Rbfox2 coordinated alternative splicing of Mef2d and Rock2 controls myoblast fusion during myogenesis. Molecular Cell 55 592-603.
174. Gao,C., Ren,S., Lee,J. H., Qiu,J., Chapski,D.J., Rau,C.D., Zhou,Y., Abdellatif,M., Nakano,A., Vondriska,T.M., et al. (2016) RBFox1 mediated RNA splicing regulates cardiac hypertrophy and heart failure. J. Clin. Invest. 126 195-206.
175. Wei,C., Qiu,J., Zhou,Y., Xue,Y., Hu,J., Ouyang,K., Banerjee,I., Zhang,C., Chen,B., Li,H., et al. (2015) Repression of the Central Splicing Regulator RBFox2 Is Functionally Linked to Pressure Overload Induced Heart Failure. Cell Rep 10 1521-1533.
176. Bill,B.R., Lowe,J.K., Dybuncio,C.T. and Fogel,B.L. (2013) Orchestration of neurodevelopmental programs by RBFOX1: implications for autism spectrum disorder. Int. Rev. Neurobiol. 113 251-267.
177. Nutter,C.A., Jaworski,E.A., Verma,S.K., Deshmukh,V., Wang,Q., Botvinnik,O.B., Lozano,M.J., Abass,I.J., Ijaz,T., Brasier,A.R., et al. (2016) Dysregulation of RBFOX2 Is an Early Event in Cardiac Pathogenesis of Diabetes. Cell Rep 15 2200-2213.
178. No,D., Yao,T.P. and Evans,R.M. (1996) Ecdysone inducible gene expression in mammalian cells and transgenic mice. Proc. Natl. Acad. Sci. U.S.A. 93 3346-3351.
179. Graham,L.D. (2002) Ecdysone controlled expression of transgenes. Expert Opin Biol Ther 2 525-535.
180. Chen,Y., Zubovic,L., Yang,F., Godin,K., Pavelitz,T., Castellanos,J., Macchi,P. and Varani,G. (2016) Rbfox proteins regulate microRNA biogenesis by sequence specific binding to their precursors and target downstream Dicer. Nucleic Acids Research 44 4381-4395.
181. Gazzara,M.R., Mallory,M.J., Roytenberg,R., Lindberg,J.P., Jha,A., Lynch,K.W. and Barash,Y. (2017) Ancient antagonism between CELF and RBFOX families tunes mRNA splicing outcomes. Genome Res. 27 1360-1370.
182. Lovci,M.T., Ghanem,D., Marr,H., Arnold,J., Gee,S., Parra,M., Liang,T.Y., Stark,T.J., Gehman,L.T., Hoon,S., et al. (2013) Rbfox proteins regulate alternative mRNA splicing through evolutionarily conserved RNA bridges. Nature Structural & Molecular Biology 20 1434-1442.
183. Weyn Vanhentenryck,S.M., Mele,A., Yan,Q., Sun,S., Farny,N., Zhang,Z., Xue,C., Herre,M., Silver,P.A., Zhang,M.Q., et al. (2014) HITS CLIP and integrative modeling define the Rbfox splicing regulatory network linked to brain development and autism. Cell Rep 6 1139-1152.
184. Nakamori,M., Sobczak,K., Puwanant,A., Welle,S., Eichinger,K., Pandya,S., Dekdebrun,J., Heatwole,C.R., McDermott,M.P., Chen,T., et al. (2013) Splicing biomarkers of disease severity in myotonic dystrophy. Ann. Neurol. 74 862-872.
185. Gulino,A., Di Marcotullio,L. and Screpanti,I. (2010) The multiple functions of Numb. Exp. Cell Res. 316 900-906.
186. Kim,K.K., Nam,J., Mukouyama,Y. S. and Kawamoto,S. (2013) Rbfox3 regulated alternative splicing of Numb promotes neuronal differentiation during development. J. Cell Biol. 200 443-458.
187. Bechara,E.G., Sebestyén,E., Bernardis,I., Eyras,E. and Valcarcel,J. (2013) RBM5, 6, and 10 differentially regulate NUMB alternative splicing to control cancer cell proliferation. Molecular Cell 52 720-733.
188. Misquitta Blencowe,B.J. (2011) Global profiling and molecular characterization of alternative splicing events misregulated in lung cancer. Molecular and Cellular Biology 31 138-150.
189. Gadalla,S.M., Lund,M., Pfeiffer,R.M., Gørtz,S., Mueller,C.M., Moxley,R.T., Kristinsson,S.Y., Björkholm,M., Shebl,F.M., Hilbert,J.E., et al. (2011) Cancer risk among patients with myotonic muscular dystrophy. JAMA 306 2480-2486.
190. Win,A.K., Perattur,P.G., Pulido,J.S., Pulido,C.M. and Lindor,N.M. (2012) Increased cancer risks in myotonic dystrophy. Mayo Clin. Proc. 87 130-135.
191. Seth,P. and Yeowell,H.N. (2010) Fox 2 protein regulates the alternative splicing of scleroderma associated lysyl hydroxylase 2 messenger RNA. Arthritis Rheum. 62 1167-1175.
192. Leader,B., Baca,Q.J. and Golan,D.E. (2008) Protein therapeutics: a summary and pharmacological classification. Nat Rev Drug Discov 7 21-39.
193. Torre,B.G. de L. and Albericio,F. (2017) The Pharmaceutical Industry in 2016. An Analysis of FDA Drug Approvals from a Perspective of the Molecule Type. Molecules 22 368.
194. Reichert,J.M. (2003) Trends in development and approval times for new therapeutics in the United States. Nat Rev Drug Discov 2 695-702.
195. Thornton,C.A., Wang,E. and Carrell,E.M. (2017) Myotonic dystrophy: approach to therapy. Current Opinion in Genetics & Development 44 135-140.
196. Wheeler,T.M., Leger,A.J., Pandey,S.K., MacLeod,A.R., Nakamori,M., Cheng,S.H., Wentworth,B.M., Bennett,C.F. and Thornton,C.A. (2012) Targeting nuclear RNA for in vivo correction of myotonic dystrophy. Nature 488 111-115.
197. Wei,H. and Wang,Z. (2015) Engineering RNA binding proteins with diverse activities. Wiley Interdiscip Rev RNA 6 597-613.
198. Naso,M.F., Tomkowicz,B., Perry,W.L. and Strohl,W.R. (2017) Adeno Associated Virus (AAV) as a Vector for Gene Therapy. BioDrugs 31 317-334.
199. Hansen,J.E., Weisbart,R.H. and Nishimura,R.N. (2005) Antibody mediated transduction of therapeutic proteins into living cells. ScientificWorldJournal 5 782-788.
200. Guidotti,G., Brambilla,L. and Rossi,D. (2017) Cell Penetrating Peptides: From Basic Research to Clinics. Trends Pharmacol. Sci. 38 406-424.
201. Colella,P., Ronzitti,G. and Mingozzi,F. (2018) Emerging Issues in AAV Mediated In Vivo Gene Therapy. Mol Ther Methods Clin Dev 8 87-104.
202. Madani,F., Lindberg,S., Langel,U., Futaki,S. and Gräslund,A. (2011) Mechanisms of cellular uptake of cell penetrating peptides. J Biophys 2011 414729 10.
203. Tünnemann,G., Martin,R.M., Haupt,S., Patsch,C., Edenhofer,F. and Cardoso,M.C. (2006) Cargo dependent mode of uptake and bioavailability of TAT containing proteins and peptides in living cells. FASEB J. 20 1775-1784.
204. Shamoo,Y., Abdul Manan,N. and Williams,K.R. (1995) Multiple RNA binding domains (RBDs) just don't add up. Nucleic Acids Research 23 725-728.
205. Pérez Cañadillas,J.M. (2006) Grabbing the message: structural basis of mRNA 3'UTR recognition by Hrp1. EMBO J. 25 3167-3178.
206. Beuth,B., Pennell,S., Arnvig,K.B., Martin,S.R. and Taylor,I.A. (2005) Structure of a Mycobacterium tuberculosis NusA RNA complex. EMBO J. 24 3576-3587.
207. Bhaskara,R.M., de Brevern,A.G. and Srinivasan,N. (2013) Understanding the role of domain domain linkers in the spatial orientation of domains in multi domain proteins. J. Biomol. Struct. Dyn. 31 1467-1480.
208. Park,J., Ryu,J., Kim,K. A., Lee,H.J., Bahn,J.H., Han,K., Choi,E.Y., Lee,K.S., Kwon,H.Y. and Choi,S.Y. (2002) Mutational analysis of a human immunodeficiency virus type 1 Tat protein transduction domain which is required for delivery of an exogenous protein into mammalian cells. J. Gen. Virol. 83 1173-1181.
209. Martorana,F., Brambilla,L., Valori,C.F., Bergamaschi,C., Roncoroni,C., Aronica,E., Volterra,A., Bezzi,P. and Rossi,D. (2012) The BH4 domain of Bcl X(L) rescues astrocyte degeneration in amyotrophic lateral sclerosis by modulating intracellular calcium signals. Human Molecular Genetics 21 826-840.
210. Borsello,T., Clarke,P.G.H., Hirt,L., Vercelli,A., Repici,M., Schorderet,D.F., Bogousslavsky,J. and Bonny,C. (2003) A peptide inhibitor of c Jun N terminal kinase protects against excitotoxicity and cerebral ischemia. Nat. Med. 9 1180-1186.
211. Gaj,T. and Liu,J. (2015) Direct protein delivery to mammalian cells using cell permeable Cys2 His2 zinc finger domains. J Vis Exp 10.3791/52814.
212. Hossain,M.A., Shen,Y., Knudson,I., Thakur,S., Stees,J.R., Qiu,Y., Pace,B.S., globin Gene Expression via Direct Protein Delivery of Synthetic Zinc finger DNA Binding Domains. Mol Ther Nucleic Acids 5 e378.
213. Goers,E.S., Voelker,R.B., Gates,D.P. and Berglund,J.A. (2008) RNA binding specificity of Drosophila muscleblind. Biochemistry 47 7284-7294.
214. Cieply,B. and Carstens,R.P. (2015) Functional roles of alternative splicing factors in human disease. Wiley Interdiscip Rev RNA 6 311-326.
215. Jeong,S. (2017) SR Proteins: Binders, Regulators, and Connectors of RNA. Mol. Cells 40 1-9.
216. Shepard,P.J. and Hertel,K.J. (2009) The SR protein family. Genome Biol. 10 242.
217. Long,J.C. and Caceres,J.F. (2009) The SR protein family of splicing factors: master regulators of gene expression. Biochem. J. 417 15-27.
218. Graveley,B.R. and Maniatis,T. (1998) Arginine/serine rich domains of SR proteins can function as activators of pre mRNA splicing. Molecular Cell 1 765-771.
219. van Der Houven Van Oordt,W., Newton,K., Screaton,G.R. and Cáceres,J.F. (2000) Role of SR protein modular domains in alternative splicing specificity in vivo. Nucleic Acids Research 28 4822-4831.
220. Cazalla,D., Zhu,J., Manche,L., Huber,E., Krainer,A.R. and Caceres,J.F. (2002) Nuclear export and retention signals in the RS domain of SR proteins. Molecular and Cellular Biology 22 6871-6882.
221. Masliah,G., Barraud,P. and Allain,F.H. T. (2013) RNA recognition by double stranded RNA binding domains: a matter of shape and sequence. Cell. Mol. Life Sci. 70 1875-1895.
222. Stefl,R., Skrisovska,L. and Allain,F.H. T. (2005) RNA sequence and shape dependent recognition by proteins in the ribonucleoprotein particle. EMBO Rep. 6 33-38.
223. specificity of double stranded RNA binding proteins. Biochemistry 53 3457-3466.
224. Chang,K. Y. and Ramos,A. (2005) The double stranded RNA binding motif, a versatile macromolecular docking platform. FEBS J. 272 2109-2117.
225. Mooers,B.H.M., Logue,J.S. and Berglund,J.A. (2005) The structural basis of myotonic dystrophy from the crystal structure of CUG repeats. Proc. Natl. Acad. Sci. U.S.A. 102 16626-16631.
BIOGRAPHICAL SKETCH
Melissa A. Hale obtained her Bachelor of Science in biochemistry at Seattle Pacific University in 2012, where she was a University Honors Scholar. During her undergraduate education she worked as an undergraduate research associate in the lab of Dr. Benjamin McFarland, studying the binding interface between two immune response proteins (MICA and NKG2D) paramount to triggering the autoimmune response and the clearance of virally infected and potentially oncogenic cells. To continue her scientific education, Melissa joined the graduate program in the Department of Chemistry and Biochemistry at the University of Oregon in 2012. Her research interests included alternative splicing and the regulation of RNA processing by MBNL proteins in the context of the neuromuscular disorder myotonic dystrophy. While she was at the University of Oregon, her advisor chose to move his research group to the University of Florida. Melissa chose to receive her Master of Science in chemistry and biochemistry from the University of Oregon in 2015 and to continue her studies in the University of Florida's graduate Interdisciplinary Program in Biomedical Sciences. Melissa completed her doctoral research in the lab of Dr. J. Andrew Berglund in the Department of Biochemistry and Molecular Biology and received her Doctor of Philosophy in medical sciences with a concentration in biochemistry and molecular biology in May 2018.