Citation
Restriction fragment analysis of the evolutionary relationship of the murine H-2 Aα and Aβ alleles

Material Information

Title:
Restriction fragment analysis of the evolutionary relationship of the murine H-2 Aα and Aβ alleles
Creator:
McConnell, Thomas John, 1955-
Publication Date:
Language:
English
Physical Description:
viii, 135 leaves : ill. ; 29 cm.

Subjects

Subjects / Keywords:
Alleles ( jstor )
Digestion ( jstor )
DNA ( jstor )
Exons ( jstor )
Genomics ( jstor )
Haplotypes ( jstor )
Inbred strains ( jstor )
MHC class II genes ( jstor )
Mice ( jstor )
Molecules ( jstor )
Alleles ( mesh )
Dissertations, Academic -- Pathology -- UF ( mesh )
Pathology thesis Ph.D ( mesh )
Polymorphism (Genetics) ( mesh )
City of Miami ( local )
Genre:
bibliography ( marcgt )
theses ( marcgt )
non-fiction ( marcgt )

Notes

Thesis:
Thesis (Ph. D.)--University of Florida, 1986.
Bibliography:
Includes bibliographical references (leaves 120-134).
Additional Physical Form:
Also available online.
General Note:
Typescript.
General Note:
Vita.
Statement of Responsibility:
by Thomas John McConnell.

Record Information

Source Institution:
University of Florida
Holding Location:
University of Florida
Rights Management:
Copyright Thomas John McConnell. Permission granted to the University of Florida to digitize, archive and distribute this item for non-profit research and educational purposes. Any reuse of this item in excess of fair use or other copyright exemptions requires permission of the copyright holder.
Resource Identifier:
16955333 ( OCLC )
022707377 ( ALEPH )











A RESTRICTION FRAGMENT ANALYSIS OF THE EVOLUTIONARY RELATIONSHIP OF THE MURINE H-2 Aα AND Aβ ALLELES







By

THOMAS JOHN MCCONNELL






















A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY


UNIVERSITY OF FLORIDA

1986



















































Copyright 1986

by

Thomas John McConnell





















This dissertation is dedicated to
my wife Ann, my parents, and to Bosco and Max.














ACKNOWLEDGEMENTS



I would like to express my appreciation to the many people who have helped me complete this dissertation.

Most especially I would like to thank Dr. Edward Wakeland for his help and support throughout this project. As I meet more scientists from other parts of the country, my respect for Dr. Wakeland only increases.

My appreciation is also extended to Drs. Ammon Peck and Arthur Kimura who have been helpful through their comments and friendship. My thanks are extended to Dr. Noel Maclaren, without whose help and support I might not have had the opportunity to join this excellent department. Also, my thanks are extended to Dr. Edward Siden whom I respect for his questioning mind and his enthusiasm for science.

My thanks are extended to Dr. Linda Smith for her friendship and for the reagents we have been able to borrow back. My thanks are extended to Dr. William Winter for his friendship and scientific curiosity in the laboratory. And my thanks are extended to Dr. David Cooper for his friendship.

The students and research personnel in the laboratory have been a particularly excellent group. William Talbot and Randy Horwitz have been two of my best friends through the past few years, for which I would like to thank them. I would like to thank Vickie Henson for her friendship and support throughout the time we have both been students in Dr. Wakeland's laboratory. Stefen Boehme, Roy Tarnuzzer, and Richard McIndoe have all provided me a tremendous amount of support and some excellent lunch trips. Marge Price-LaFace has been an integral part of this project, for which I would like to thank her. Cheryl Zack has offered encouragement to me during the writing process, for which I would like to thank her.

Many thanks are offered to people in other laboratories who have been very good friends. My thanks are extended to Lena Dingler, a very good friend I hope to see in the future. My thanks are extended to Jane Strandberg and my congratulations to her on her choice of an excellent graduate program. My thanks are extended to Judith Nutkis, Jim Xiang, and to Dan Cook, who have been supportive friends during my graduate studies. And my thanks are extended to Drake LaFace and Kathy Edmundson for their friendship.

For their help in last minute typing and printing rushes, I would like to thank Crystal Grimes and Rose Mills.





















TABLE OF CONTENTS


Page

ACKNOWLEDGEMENTS .................................. iv

ABSTRACT .......................................... vii

INTRODUCTION ...................................... 1

REVIEW OF THE LITERATURE .......................... 5

Major Histocompatibility Complex Structure ........ 6
Polymorphism of the Class II Genes ................ 26
Variation in Wild Mice ............................ 39

MATERIALS AND METHODS ............................. 61

Mice .............................................. 61
Isolation of Genomic DNA .......................... 61
Restriction Endonuclease Digestion and
Agarose Gel Electrophoresis ....................... 62
Capillary Transfer and Hybridization .............. 63
Data Analysis ..................................... 64

RESULTS ........................................... 66

RFLP Analysis of the Aα and Aβ Genes of
Standard Laboratory Inbred and Wild Mice .......... 66
RFLP Analysis of the Divergence of the Aα and
Aβ Genes Within the I-Ap Family ................... 91
RFLP Analysis of the Divergence of the Aα and
Aβ Genes Within the I-Ak Family ................... 100

DISCUSSION ........................................ 110

RFLP Analysis of the Genomic Structures of
Aα and Aβ Alleles ................................. 110
Comparisons of Class II Molecules and
Class II Gene RF Genotypes Within the I-Ap
and I-Ak Family ................................... 117

REFERENCES ........................................ 120

BIOGRAPHICAL SKETCH ............................... 135

















Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy


A RESTRICTION FRAGMENT ANALYSIS OF THE EVOLUTIONARY
RELATIONSHIPS OF THE MURINE H-2 Aα AND Aβ ALLELES


By

Thomas John McConnell


August 1986

Chairman: Dr. Edward K. Wakeland
Major Department: Pathology

The presence of three Aβ evolutionary groupings, designated Aβb, Aβd, and Aβk, has been established on the basis of a restriction fragment length polymorphism (RFLP) analysis of the Aβ genes of 37 independently derived mouse H-2 haplotypes. All mice analyzed, which included laboratory inbred and wild-derived haplotypes of Mus musculus domesticus, Mus musculus musculus, and Mus musculus castaneus, were found to have an Aβ allele which could be related to one of these three Aβ groupings. No similar grouping was possible, however, in the RFLP analysis of the Aα alleles of these same haplotypes. All of the 8 t haplotypes studied were found to fall into either the Aβb or Aβd evolutionary group based on the same RFLP analysis.








Restriction fragments were detected with a 5.8 kb Aβd genomic DNA probe and a 1.2 kb Aαb genomic DNA probe. The I-A alleles in each group have 50% or more of their restriction fragments in common. Alleles in separate groups share less than 20% of their restriction fragments.

The polymorphism of the Aα and Aβ genes as detected by RFLP analysis did not always correlate with known protein sequences. In an RFLP analysis of I-Ak protein related H-2 haplotypes, 5 mouse strains known to be closely related to one another by serology and tryptic peptide mapping were found to fall into two different Aβ evolutionary groups. There are also examples of 2 Aβ alleles being very different at the protein level, but very similar when their RFLPs are compared. Possible explanations of these data include the existence of mechanisms which allow or promote gene conversion or some form of intragenic recombination occurring in or near the introns, possibly even exon shuffling. The presence of three different Mus subspecies in one of the Aβ evolutionary groups suggests that these groups arose prior to the divergence of the subspecies, approximately one million years ago.
























INTRODUCTION



The major histocompatibility complex of the mouse, known as the H-2 complex, is a cluster of loci encoding proteins whose function in part includes the immune defense of the animal in its natural environment. There are three classes of genes in the H-2 complex: the class I genes (the strongest histocompatibility genes), the class II genes (which encode proteins involved in the presentation of foreign antigen to the regulatory T lymphocytes; Benacerraf 1981; Klein 1975; Klein 1979), and the class III genes (which encode complement components). Polymorphisms in the class I and class II genes, i.e. the presence of multiple allelic forms of the gene at frequencies of greater than 1% in the general population, are thought to help a species to survive the continuous onslaught of environmental pathogens. The polymorphic nature of the class II Aα and Aβ genes, as related to evolutionary genetic mechanisms, is the subject of this dissertation.

Genetic and biochemical studies of class II molecules have identified two molecules, designated I-A and I-E, which normally are expressed on the surfaces of antigen presenting cells and B lymphocytes (Cullen et al. 1976; Uhr et al. 1979). The molecular cloning and sequencing of a large portion of the murine I region has supplied extensive information on the organization of the murine class II genes and the molecules they encode (Benoist et al. 1983a; Benoist et al. 1983b; Choi et al. 1983; Malissen et al. 1983; Steinmetz et al. 1982). The Aα, Aβ, Eα, and Eβ genes are single copy genes present on a segment of about 110 kb of DNA in the I region of the H-2 complex (Steinmetz et al. 1982). The Aα and Aβ genes each encode transmembrane glycoproteins which noncovalently associate with one another on the cell surface to form the heterodimeric I-A molecule. Similarly, the Eα and Eβ genes encode transmembrane glycoproteins which form the heterodimeric I-E molecule. Studies of the DNA sequence of these class II genes have established that they all have a common evolutionary origin and that they represent one branch of the immunoglobulin supergene family (Benoist et al. 1983a; Choi et al. 1983; Malissen et al. 1983). Each class II gene consists of at least 6 exons and occupies more than 5 kb of genomic DNA. The Aα, Aβ, and part of the Eβ genes are located within a portion of the I region which exhibits extensive sequence polymorphism. This region extends 5' from a recombinational hot spot located within the central intron of Eβ (Steinmetz et al. 1984). The Eα gene is located 3' to Eβ in a "conserved tract" of the I region, and exhibits much less polymorphism than Aα, Aβ, or Eβ. The evolutionary mechanisms responsible for the production and maintenance of polymorphic and conserved tracts within the I region are unknown.

Previous studies on the genetic polymorphisms of class II genes in wild mouse populations have provided some insights into the genetic mechanisms responsible for their diversification (Wakeland and Darby 1983; Wakeland and Klein 1979a; Wakeland and Klein 1979b; Wakeland and Klein 1983; Wakeland et al. 1985). Serologic and structural analyses of the I-A molecules expressed among a collection of H-2 haplotypes derived from wild mice led to the definition of "families" of I-A alleles (Wakeland and Klein 1979a; Wakeland and Klein 1983). The I-A alleles within the same family encode antigenically similar molecules that are identical in more than 90% of their tryptic peptides when compared by high pressure liquid chromatography tryptic peptide fingerprinting (Wakeland and Darby 1983; Wakeland and Klein 1979b; Wakeland and Klein 1983). The I-A molecules encoded by alleles in separate families have distinct antigenic phenotypes and are identical in less than 70% of their tryptic peptides. An analysis of over 40 H-2 haplotypes derived from laboratory and wild mouse strains led to the definition of 8 distinct I-A families (Wakeland and Klein 1983).

This dissertation describes the RFLP analysis of the Aα and Aβ genes of 37 standard laboratory inbred and wild mouse strains from the 3 separate subspecies Mus musculus domesticus, Mus musculus musculus, and Mus musculus castaneus, and includes wild mice bearing t haplotypes. Based on the data generated using 6 to 7 restriction endonucleases (RE) on each haplotype examined for Aβ, all of the homozygous haplotypes examined could be placed into one of three evolutionary groupings. Because the Aβ genes of the subspecies Mus musculus musculus, Mus musculus domesticus, and Mus musculus castaneus are found in one Aβ evolutionary group, these evolutionary groups probably predate subspeciation in the mouse, which occurred approximately 1 million years ago. Since both the Aα and Aβ probes contain approximately 80% noncoding sequence, and in some instances the protein sequences or tryptic peptide maps could be compared between the haplotypes being examined, the Aβ evolutionary groups that I have found appear to be determined predominantly by noncoding sequence. Finally, the Aα and Aβ genes of wild t haplotype bearing mice examined by RFLP analysis also fell into the separate Aβ evolutionary groups.
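The grouping criterion stated in the abstract (alleles in the same group share 50% or more of their restriction fragments, alleles in separate groups share less than 20%) can be illustrated with a short sketch. This is only an illustration under stated assumptions: the formula used here for the shared fraction, 2 x shared / (n1 + n2), is an assumption, since the calculation actually used is not given in this part of the text, and the allele names, enzymes, and fragment sizes below are hypothetical.

```python
def shared_fraction(frags_a, frags_b):
    """Fraction of restriction fragments shared by two alleles.

    Each fragment is an (enzyme, size_kb) pair. The formula
    2 * shared / (n_a + n_b) is an assumed definition, not one taken
    from the dissertation.
    """
    a, b = set(frags_a), set(frags_b)
    if not a or not b:
        raise ValueError("each allele needs at least one fragment")
    return 2 * len(a & b) / (len(a) + len(b))


def relationship(fraction, same_group_min=0.50, separate_group_max=0.20):
    """Apply the 50% and 20% thresholds quoted in the abstract."""
    if fraction >= same_group_min:
        return "same group"
    if fraction < separate_group_max:
        return "separate groups"
    return "indeterminate"


# Hypothetical fragment patterns for three alleles (illustrative only).
allele_x = {("EcoRI", 5.8), ("BamHI", 9.4), ("HindIII", 3.1), ("PstI", 2.2)}
allele_y = {("EcoRI", 5.8), ("BamHI", 9.4), ("HindIII", 4.0), ("PstI", 2.2)}
allele_z = {("EcoRI", 7.5), ("BamHI", 12.0), ("HindIII", 6.3), ("PstI", 1.1)}

for name, other in (("x vs y", allele_y), ("x vs z", allele_z)):
    f = shared_fraction(allele_x, other)
    print(name, f, relationship(f))
```

Fragments are matched here by exact size for simplicity; with real gel data, sizes would need to be compared within a measurement tolerance.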














REVIEW OF LITERATURE



The major histocompatibility complex (MHC) of the mouse, also known as the H-2 complex, has been an area of fascination in modern genetics since its discovery laid the groundwork for serology as a tool to study genetics (Gorer 1936; Gorer 1938). The groundwork for the discovery of the genetic basis of the H-2 complex had been established by Little and Tyzzer (1916) using inbred mouse lines. From the initial characterization of erythrocyte antigens, through protein biochemistry and sequencing, and up to modern genetic engineering, the MHC has continued to allow us to learn about genetics, natural selection, cellular physiology, and other far reaching areas of biology.

The class II genes of the H-2 complex are located in the I region, so named because that region was originally defined by the differential ability of inbred mouse strains to mount an immune response to certain simple antigens (Martin et al. 1981; McDevitt and Chinitz 1969; McDevitt and Sela 1965; Martin et al. 1971). The actual mapping of the immune response genes within the H-2 complex was accomplished with the use of inbred congenic and recombinant mouse strains (Benacerraf and McDevitt 1972; McDevitt et al. 1972), although the actual identity of these gene products was in question for a number of years. The class II genes encode proteins that restrict the recognition of foreign antigen by the regulatory T lymphocytes to those antigen presenting cells which are of the same allelic form, and are thus critical for the development of a normal immune response. This literature review will focus primarily on class II gene structure and possible functional correlates of the genetic structure.



Major Histocompatibility Complex Structure


General Organization and Protein Structure

The murine major histocompatibility complex (or H-2 complex) encompasses about 2 centimorgans of DNA, which may be equivalent to as much as 2000-4000 kb of DNA (Hood et al. 1982; Klein 1975), and contains at least 3 classes of immunologically related genes, denoted class I, class II, and class III (Klein 1975; Snell et al. 1976). The molecules encoded by the class I genes are of 2 general types. The class I genes designated K, D, L, and R encode the classical transplantation antigens located on the cell surface of most nucleated cells and are known to be very polymorphic. These molecules are primarily involved in the restricted recognition of some viral antigens by cytotoxic T lymphocytes (Zinkernagel 1979). The other general type of class I genes, designated Qa and Tla, are expressed on nucleated blood cells (Qa) or on thymocytes and certain leukemias (Tla) (Michaelson et al. 1983), are much less polymorphic, and have functions that are not yet known (Flaherty 1980). The Qa and Tla genes are located telomeric to the D, L, and R genes and number over 30 (Winoto et al. 1983). The molecular structure of both types of class I molecules consists of a 40-45,000 dalton membrane bound glycoprotein of approximately 350 amino acids which forms 3 extracellular domains of about 90 amino acids per domain. A fourth domain in the form of β2 microglobulin, a 12,000 dalton polypeptide encoded on chromosome 2 in the mouse, noncovalently associates with the class I gene product, possibly in a stabilizing role (Klein et al. 1983b). The relative locations of the class I genes to the class II and class III genes, as well as their general protein structure, are illustrated in figure 1.

The class III genes of the H-2 complex encode complement components such as C2, Bf, Slp, and C4. The linkage association of some of the complement genes to the MHC of different animals varies (Alper 1981). While there has been some argument for the inclusion of the complement loci in the MHC on the basis of the MHC and linked loci possibly evolving as a genetic unit (Bodmer 1976), Klein et al. (1983b) argue in their review article against a purely physical inclusion of the complement genes as part of the MHC.

Figure 1. The known functional class II loci and their relative position on chromosome 17. Also illustrated are the class II gene products, the I-A and I-E cell-surface glycoproteins.

Diagram labels: I-A; I-E; class I molecules; cell surface membrane; TM; CT.

The class II genes map between the K and S regions of the H-2 complex (figure 1). There are 4 functional class II genes, denoted Aα, Aβ, Eα, and Eβ, as well as the pseudogenes Aβ3, Aβ2, and Eβ2 (Steinmetz et al. 1986; Widera and Flavell 1985). The overall organization of the I region and its protein products are illustrated in figure 1. The class II glycoproteins are made up of a 28,000 dalton β chain of about 230 amino acids, and a 34,000 dalton α chain of about 220 amino acids (Klein et al. 1983b). Both the α and the β chains consist of five protein domains, including a hydrophobic leader peptide of about 25 amino acids absent in the mature cell-surface form of the molecule, two approximately 90 amino acid extramembrane domains (α1α2 or β1β2), a hydrophobic transmembrane segment of about 25 amino acids, and a cytoplasmic tail region containing a high proportion of charged residues (Mengle-Gaw and McDevitt 1985). Each of the α1, α2, β1, and β2 domains is formed by a disulfide cystine bridge. The two α and two β domains noncovalently associate to form the I-A molecule. The presence of four extramembrane protein domains appears to be a stabilizing configuration for both the class I and class II molecules. The structuring of the MHC protein molecules into domains reflects the basic organization of the encoding DNA into exons and introns.









Structure of the Class II Genes

Both the Aβ and Eβ genes are made up of six exons, one exon for each of the five protein domains, plus one exon encoding the 3' untranslated region (Saito et al. 1983). Both the Aα and Eα genes, though very similar to the β genes, are made up of only five exons. This difference is due to the transmembrane and cytoplasmic tail regions of the α chains being encoded by a single exon (Mathis et al. 1983; McNicholas et al. 1982).

It is important to note the polymorphic nature of the Aβ, Aα, and Eβ genes, although they will be discussed in detail in a later section. The presence of multiple allelic forms of the Aβ and Eβ genes may imply a unique role for the encoded proteins in the ability of a population to respond to an antigen. The actual molecular interplay between the class II molecule, the antigen, and the T cell receptor is still largely unknown.


Brief History of the I Region Loci

The I region, which contains the class II genes, has had a relatively short, but turbulent, history. As has already been mentioned, the I region was discovered and mapped by the differential ability of inbred and congenic animals to respond to certain simple antigens (Benacerraf and McDevitt 1972; Martin et al. 1971; McDevitt and Sela 1965; McDevitt et al. 1972). Even then, the I region gene products were mistaken for being related to, but not a part of, the MHC and somehow related to the T cell receptor (McDevitt and Chinitz 1969).

Historically, five subregions were defined in the I region: the A, B, J, E, and C subregions (Murphy 1981). The B locus was originally postulated by Lieberman et al. (1972) to explain what appeared to be a response to an allotypic determinant on an IgG2a molecule known as MOPC173. Responses to several other antigens were mapped to the B locus, including lactate dehydrogenase B (Melchers et al. 1973), staphylococcal nuclease (Lozner et al. 1974), oxazolone (Fachet and Ando 1977), and H-Y (Hurme et al. 1978). But the data from the different laboratories have not always corroborated the existence of the B locus, and several alternative explanations have in fact been offered based on an interplay of the A and E loci (Baxevanis et al. 1981). Moreover, no protein product has been detected, and sequencing data have not demonstrated the presence of any gene corresponding to the B locus (Hood et al. 1983; Steinmetz et al. 1982; Steinmetz et al. 1986).

The C subregion was first discovered with an H-2h2 anti-H-2h4 antiserum (David and Shreffler 1974). Other evidence in support of the existence of the C subregion was found by Rich et al. (1979a; 1979b) with the presence of C-specific antibodies that reacted with a suppressor factor produced in a mixed lymphocyte culture. The C locus was mapped, using recombinant inbred strains, telomeric to the Eα locus and centromeric to the genes encoding the C4 component of complement. Other investigators have been unable to confirm many of the results dealing with the C locus and so question its existence (Juretic et al. 1981; Livnat et al. 1973). In the most current molecular cloning data of this region (Steinmetz et al. 1986) there is no evidence for the C locus in the 150 kb of DNA telomeric to the Eα locus, although the entire chromosomal segment in question has not been characterized. And once again, no protein product has been isolated from the C locus. The existence of both the B and the C loci is based entirely on their possible regulatory effect on the immune response of the mouse.

The J subregion is the third subregion from which no protein product has been well characterized, although anti-J antiserum and monoclonal antibodies have been produced by several laboratories (Kanno et al. 1981; Murphy 1978; Waltenbaugh 1981). In fact, the J locus has been perhaps the most publicized single gene in immunology, and the most controversial, other than the T cell receptor. Thousands of papers have been published on either the J product, or on its role as the class II element controlling the T suppressor cells. The J subregion was originally defined by reciprocal antisera raised between the inbred congenic mouse strains B10.A(3R) and B10.A(5R), as well as the mouse strains B10.HTT and B10.S(7R). The same mouse combinations mapped the location of the J subregion between the A and E loci (Murphy et al. 1976; Murphy 1978). These alloantisera recognize soluble suppressor factors secreted by these cells as well as recognizing polymorphic determinants expressed on T suppressor cells (Krupen et al. 1982; Murphy 1978; Tada et al. 1976; Taniguchi et al. 1980; Waltenbaugh 1981). Although no protein has been positively identified for the J locus, Taniguchi et al. (1982) report finding a 25,000 dalton protein using an anti-J monoclonal antibody.

In the first extensive DNA level characterization of the murine I region, Steinmetz et al. (1982) found no evidence for the existence of the J locus within the I region. Due to a hotspot for recombination at the 3' end of the Eβ gene, these authors were able to map the suspected position of J between the A and E loci and found that if it was located there it would have to be encoded by less than 3.4 kb of DNA. In further DNA cloning analysis of this region and more RFLP mapping of additional intra-I region recombinants, Kobori et al. (1984) shortened the distance down to about 2.0 kb, making it even more unlikely that J might be encoded here. Related experiments have shown that cloned DNA encompassing this critical 3.4 kb segment fails to hybridize to RNA from J positive T suppressor cell lines (Kronenberg et al. 1983), thus making unlikely the presence of an I region encoded J gene product. Alternative explanations for the location of J have been offered (Hayes et al. 1984; Klyczek et al. 1984), but have not been substantiated.

The A and E subregions have survived the despoilment of the I region which occurred with the onslaught of molecular analysis of the I region DNA. These two subregions contain genes which encode four cell surface protein products which have been identified by serological and biochemical methods (Jones 1977; Uhr et al. 1979). The A subregion contains at least three loci that encode class II molecules which are expressed on the cell surface: Aβ, Aα, and Eβ (Jones et al. 1978). The E subregion contains a fourth locus that is known to encode a molecule expressed on the cell surface, Eα (Jones et al. 1978). It is important to note that with the molecular characterization of the genome containing the I region, the nomenclature of I "subregion" is no longer appropriate. The term originally defined the I region loci, several of which are now generally believed to have been artifactual for the reasons listed above. Recombinational events are more accurately represented when the class II genes are viewed as part of a continuum of DNA rather than through the archaic concept of the subregion. Henceforth in this literature review individual loci shall be referred to by their gene designation, for example, Aβ. These loci, plus other class II loci recently discovered in the genome, will now be discussed in more detail.



Cloning and Sequencing of the Murine Class II Genes

The cloning and sequencing of the murine class II genes were based in part on technical advances and new approaches made while isolating the human class I (Ploegh et al. 1980; Sood et al. 1980) and class II (Auffray et al. 1982; Korman et al. 1982a, 1982b; Larhammar et al. 1982a; Lee et al. 1982a; Yang et al. 1982) genes. Protein sequence comparisons done earlier, reviewed by Nathenson et al. (1981), had already established the homology between humans and mice when comparing DNA sequences of the class I genes. Similar work on the class II gene products also reveals strong homologies between mice and humans (Allison et al. 1978; Cook et al. 1979). More indicative of the evolutionary stability of the class II gene products in a dynamic molecular environment is the maintenance of the domain structure as the basic functional unit of the molecule.

The most revealing feature of the class II proteins in terms of their evolutionary origins is the aforementioned domain structure and their sequence relatedness to other immunological molecules. Among all the class II genes there is a consistent pattern of each structural domain being encoded by a separate exon, as there is for the class I, β2-microglobulin, Thy-1, and antibody genes (Kaufman et al. 1984). Domain structure alone does not indicate homology between proteins, since similar domains have been found in such proteins as superoxide dismutase (Richardson et al. 1976), but taken together with the nucleotide sequence homology found between these immunological molecules (Benoist et al. 1983a; Bregegere et al. 1981; Korman et al. 1982b; Larhammar et al. 1982b; McNicholas et al. 1982; Parnes and Seidman 1982; Steinmetz et al. 1981), there is strong evidence for the existence of a common ancestral gene (Peterson et al. 1975). Other similarities between members of the immunoglobulin supergene family include similar placement and size of the disulfide bridges and RNA splicing according to the GT/AG rule (Hood et al. 1983). The T8 cell surface glycoprotein expressed by most cytotoxic T lymphocytes has also been determined to belong to the immunoglobulin supergene family by domain structure and cDNA sequencing (Sukhatme et al. 1985), as has the T4 molecule (Maddon et al. 1985). Thus, evolution through gene duplication and divergence (Ohno 1970) may be an ancient mechanism for the immune system gene family.

Although the murine class II genes have an exon-intron organization that corresponds to the domain organization of the expressed protein product, the murine class II α gene structure differs from that of the murine class II β gene. A large intron separates the exons encoding the signal peptide from the first domain in the class II α gene, and the 3' untranslated region is split between two exons, but the transmembrane and cytoplasmic regions are encoded by a single exon (Benoist et al. 1982; McNicholas et al. 1982). This genetic structure is similar to that of the murine β2-microglobulin gene (Suggs et al. 1981). In contrast, the murine class II β genes have a large intron between the exons encoding the first and second extracellular protein domains. The transmembrane, cytoplasmic, and 3' untranslated regions are split over three exons (Larhammar et al. 1983a; Saito et al. 1983), more similar, though not identical, to the class I heavy chain gene structure (Malissen et al. 1982). The genomic structure of the Aβ gene can be seen in figure 2.

As mentioned above, because of all of the common structural and sequence homologies between the members of the immunoglobulin supergene family, there is a strong possibility that they have all evolved from a common ancestral gene. It is important to keep in mind that one cannot distinguish between convergent and divergent evolution (Hood et al. 1983). The membrane proximal domains of these molecules have the most sequence homology and are therefore even more likely to have a common origin. But the nearly identical external domain size and disulfide bridge placement of the different members of the immunoglobulin supergene family argues strongly for a common ancestral gene evolving in a divergent manner following gene duplication.

Figure 2. The 5.4 kilobase Eco RI Aβ fragment derived from an H-2d cosmid library, shown complete with exon and intron structure. Also shown is a section of the I region from Aβ to Eα, including the recombination hotspot as defined by Steinmetz et al. (1982). Adapted from Hood et al. (1983).

Diagram labels: Recombination Hot Spot; Polymorphic Domain; Conserved Domain; exons SP, TM, CY, 3'UT; scale bar 500 bp.


Genome Organization of the Class II Genes

The first evidence at the DNA level of the linkage of class II genes was provided by Steinmetz et al. (1982) with their cloning of about 230 kb of DNA isolated from a BALB/c sperm DNA cosmid library. The cosmid library was first probed with a DRα cDNA probe (characterized by Wake et al. 1982), and then probed with single copy genetic fragments subcloned from contiguous cosmids. In this manner, Steinmetz et al. were able to "walk" along the chromosome as long as there were cosmid clones in the genomic library that contained overlapping fragments of genomic DNA, identifying approximately 200 kb of linked DNA in the process. The telomeric boundary of the I region was defined as the structural gene for the C4 complement component, mapping about 90 kb downstream from Eα. The centromeric boundary of the I region was not determined in this particular publication, but several other important discoveries were made. First, four class II genes were identified, one as a possible pseudogene because a 5' probe failed to hybridize to the gene. Second, the BALB/c genome was determined to contain two α and four to six β genes, a finding which has been borne out in more recent work (Widera and Flavell 1985). Third, Steinmetz et al. (1982) also reported that the Eα and Eβ genes are present in strains of mice which do not express an E molecule, e.g., the b and s haplotypes, which express the protein in the cytoplasm, and the f and q haplotypes, which do not express the protein at any detectable level. This finding has led to more work on control of gene expression in the class II gene system. Fourth, correlation of the molecular map with the serologically and genetically determined map of the I region led scientists to question the existence and location of the B and J genes. Finally, a recombinational hotspot was identified where nine independently generated recombinant mouse lines were found to all have recombined within the same 3.4 kb of DNA. Kobori et al. (1984) have further characterized six of the murine I region MHC recombinants using Southern blot DNA analysis to limit the recombination region in these strains to less than 2.0 kb of DNA. This 2.0 kb contains part of the intron between the first and second protein encoding exons, and part of the second domain encoding exon.

Figure 3 shows the most recent concept of the I region at the DNA level. Other class II genes which have been characterized recently are Aβ2, Aβ3, and Eβ2. Larhammar et al. (1983a) identified Aβ2 and located it about 20 kb centromeric to Aβ. Larhammar et al. (1983b) sequenced the genomic Aβ2 of the b haplotype isolated from cosmid clone IP-101. The exon-intron structure of Aβ2 is the same as for the other class II genes. The predicted amino acid sequence of Aβ2, as interpreted from the nucleotide sequence, shows only up to 56% homology to the other β chains, including the human class II β chain proteins. These other β chains typically show up to almost 80% homology to each other. On this basis, the Aβ2 second domain sequence was determined to be the most divergent member of the class II β gene family. Larhammar et al. (1983b) also cloned and sequenced a cDNA clone, showing that Aβ2 is transcribed, although its product was not detected on the cell surface and some possible splicing errors were detected. When Aβ2 was used as a probe to hybridize blots of other strains, a lesser degree of polymorphism was detected.

The latest class II gene, and possibly the last in the I region, is Aβ3. Widera and Flavell (1985) isolated Aβ3 from a b haplotype cosmid library and were able to link it 75 kb telomeric to the class I H-2K region. The nucleotide sequence of the β2 domain of Aβ3 has homology to the immunoglobulin-like domains of other class II genes, and 83% homology to the human SBβ gene. An examination of the nucleotide sequence also showed a deletion of 8 nucleotides which makes translation of this gene into a functional protein impossible. The existence of Aβ3 in another haplotype was confirmed by Steinmetz et al. (1986) with their cosmid cloning of the BALB/c Aβ3. Whereas Widera and Flavell (1985) linked the
















Figure 3. A current molecular map of the relative positions and distances of the class II loci and the K and K2 genes. Adapted from Steinmetz et al. (1986).















[Figure 3 graphic not reproduced. Genes shown, in map order: K2, K, Aβ3, Aβ2, Aβ, Aα, Eβ, Eβ2, Eα. Scale: 0 to 600 kb.]







K region with Aβ3, Steinmetz et al. (1986) were also able to link Aβ3 with the rest of the I region, effectively providing a 600 kb continuous DNA map of the K and I regions. Therefore, as illustrated in Figure 3, the order of the genes discussed is K2, K, Aβ3, Aβ2, Aβ, Aα, Eβ, Eβ2, and Eα. In addition, Steinmetz et al. (1986) also localized two short regions of DNA which had recombination frequencies of 0.6% to 1.5% between genes from Mus musculus castaneus and standard laboratory mouse strains (Mus musculus domesticus). Such hotspots for recombination may be instrumental in the generation of polymorphism in the class II genes.



Polymorphism of the Class II Genes



The most unusual feature of the MHC in the murine system, or in vertebrate systems, is the extensive polymorphism of certain of the class I and class II genes. Of the class II molecules, the β chain proteins have been known to be the most polymorphic, and Eα the least polymorphic (Klein et al. 1983a). The polymorphic nature of the class II genes has agreed with that found in the proteins in general, but Aα has been determined to be more polymorphic than originally thought by many (Benoist et al. 1983a). As mentioned earlier, this unique degree of polymorphism implies a unique biological role for the encoded glycoproteins.







Biological Role of Polymorphism

The class II molecules are involved in the communication between immunocompetent cells (thymic education aside) to induce and maintain a defensive reaction to what the body perceives as a foreign invasion. The class II molecules are key elements in the activation of an immune response via regulatory T lymphocytes.

The discovery and characterization of the class II molecules have already been described in detail. The interaction between the antigen, the class II glycoprotein, and the T cell receptor determines whether an animal is able to mount an immune response to a particular antigen. The T cells apparently cannot recognize free antigen as the B cells can (Moller 1978; Moller 1980). The function of the class II glycoprotein, then, is to enable the T lymphocytes to recognize a foreign antigen so that they can respond appropriately. This process is known as MHC restriction. Because the T cell receptor only recognizes the class II glycoprotein which is of the identical allelic haplotype as itself, the process is also sometimes referred to as I-region restriction or self-MHC restriction (Klein et al. 1981; Nagy et al. 1981).

The T cell receptor, therefore, must recognize and form a ternary complex with two ligands (Schwartz 1985). One of these ligands is the antigen itself, which is usually a partial degradation product generated by an antigen presenting cell. The other ligand is the class II gene product expressed as a transmembrane glycoprotein present most abundantly on B lymphocytes and antigen presenting cells (Asano et al. 1983; Hammerling et al. 1974; Katz et al. 1973; Kindred and Shreffler 1972; Nagy et al. 1981). Each member of the ternary complex possesses a precise and high binding affinity for each other member of the complex; otherwise, the biological triggering of the T helper cell, and consequently the stimulation of an antibody response, does not occur. It is in this specificity of binding that the role of polymorphism of class II glycoproteins can best be understood.

The extent of the polymorphism of the class II gene products, although very high, is not nearly enough to explain the ability of the class II glycoproteins to control immune responsiveness to the enormous number of foreign antigens to which an animal is able to respond. The precise mechanism by which the class II glycoproteins trigger specific immune responses is still not known. There is evidence that the T cells can distinguish class II gene products in association with molecules which are subtle structural variants, as with insulin (Rosenthal 1978), lysozyme (Adorini et al. 1979), and cytochrome c (Solinger et al. 1979).

The polymorphic nature of the class II glycoproteins might be explained by the fine balance the immune system seems to maintain. There are many animal models, and human models, of diseases caused by the immunocompetent cells attacking self. Even the prevalence of allergies and asthma in the human population suggests that control of the immune system is relatively easily thrown off. If so, then the presence of many haplotypes in a population would mean that a given animal, with at most two haplotypes, would be less likely to react with an innocuous antigen, thus lowering that animal's selective advantage. However, having many alleles in the population at large to best defend the species against a threatening plague would be of tremendous selective advantage. If one class II glycoprotein could not present a particular dangerous antigen to the immune system, then perhaps another allele in the population could (Zinkernagel 1979). Although selection operates on the individual level, mechanisms which would enhance the introduction of new alleles could have a selective advantage. The proof of a postulate such as the one suggested above awaits appropriate experimental design and statistical analysis. The possible heterozygous advantage involving class II genes also needs to be taken into account. Nevertheless, the existence of the polymorphism through evolutionary time suggests its importance in the survival of the species and the importance of the mechanisms which have generated and maintained the polymorphism.








Mechanisms of Generating Polymorphism

If the class II genes are viewed as being in a dynamic state of flux in evolutionary time rather than being static structures, then visualizing the genetic mechanisms which have generated the polymorphisms, and the selective pressures which have maintained them, is more revealing. The entire immunoglobulin gene superfamily, which includes the class II genes, appears to have arisen by gene duplication and divergence. Two of the most popular possibilities for divergence of the class II genes into polymorphic alleles are unequal crossing over and gene conversion.

Unequal crossing over and gene conversion, originally found in fungi (Radding 1978), are mechanisms whereby DNA sequence is transferred or copied from one gene to another. Although by definition the DNA sequence can be transferred from and to genes anywhere in the genome, transfer is much more likely to occur within tandem multigenic or multiallelic families (Baltimore 1981; Egel 1981; Robertson 1982; Slightom et al. 1980). Pairing between partially homologous sequences during meiosis or mitosis would occur, followed by mismatch repair which converts part of one sequence to the other. The primary evidence for gene conversion is the discovery of clusters of substitutions, especially at the DNA level. While these "tracts" of nucleotide substitutions have been clearly demonstrated in class I genes (Mellor et al. 1983; Weiss et al. 1983a; Weiss et al. 1983b), there is also evidence (Mengle-Gaw et al. 1984; Widera and Flavell 1984), though not as thoroughly documented, for a similar mechanism acting on the class II genes.

Regions of allelic hypervariability have been reported in the murine Aα gene (Benoist et al. 1983b), suggesting that this gene is more polymorphic than previously thought by some (Cullen et al. 1976; Klein and Figueroa 1981; Klein et al. 1981), though a few scientists had evidence for an unexpectedly high degree of polymorphism for the Aα gene (Cecka et al. 1979; Cook et al. 1979). Benoist et al. (1983b) sequenced a total of six different Aα alleles, from the k, d, b, f, u, and s haplotypes, and compared their cDNA sequences. Not only did they find a surprising degree of polymorphism, they also found that the amino acid substitutions were clustered in the first domain exon. In fact, many of the substitutions were localized at a few highly variable positions within the first domain exon. Also, 40 out of 46 dinucleotide changes, which are indicative of nucleotide sequence fluidity, occur in the first domain exon. A translation of the cDNA nucleotide sequence into the corresponding amino acid sequence for the six haplotypes reveals the polymorphism of domain one, and the corresponding Kabat-Wu variability plot (Kabat et al. 1979) shows two regions of "allelic hypervariability" at residues 11-15 and at 56-57. These regions, however, are not nearly as variable as the immunoglobulin hypervariable regions.
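For readers unfamiliar with the Kabat-Wu plot, the variability at an aligned position is conventionally defined as the number of different amino acids observed at that position divided by the frequency of the most common one. The following is a minimal Python sketch of that calculation. The function name and the short peptide strings are illustrative assumptions only; they are not the Aα sequences of Benoist et al. and this is not their analysis code.

```python
from collections import Counter


def kabat_wu_variability(aligned_seqs):
    """Return the Kabat-Wu variability for each column of an alignment.

    variability = (number of distinct residues at the position)
                  / (frequency of the most common residue at the position)
    """
    if not aligned_seqs:
        raise ValueError("need at least one sequence")
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be aligned to equal length")

    n = len(aligned_seqs)
    scores = []
    for column in zip(*aligned_seqs):
        counts = Counter(column)
        # Frequency (0..1) of the most common residue in this column.
        top_freq = counts.most_common(1)[0][1] / n
        scores.append(len(counts) / top_freq)
    return scores


# Hypothetical toy alignment of four "alleles" (NOT real Aα sequences).
toy_alleles = ["MKTA", "MRTA", "MSTG", "MKTA"]
scores = kabat_wu_variability(toy_alleles)
print(scores)
```

An invariant column scores 1, and the score rises both with the number of alternative residues and with how evenly they are distributed, which is why a few positions stand out as peaks in such a plot.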

The polymorphism of Aα still leaves open the question of how it was generated. Because Aα is not a member of a large gene family it might not be a good candidate for gene conversion, although one must still consider interallelic gene conversion. Benoist et al. (1983b) do mention the likelihood of interallelic conversion in heterozygous wild mice, which will be discussed later in this dissertation, but they do not feel it sufficient to explain the generation of polymorphism in Aα. The Aα gene lacks the clustering of nucleotide substitutions that would make it a good candidate for gene conversion, and a clear donor of sequence material has not been detected as yet (Benoist et al. 1983a). They offer instead a hypothesis of a gene duplication event, after which one of the copies was subject to slow drift while the other acquired a degree of sequence instability which would lead to a high rate of point mutations. Data presented in the results of this dissertation tend to support some type of conversion event over simply the accumulation of point mutations.

Regions of allelic hypervariability have also been reported for Eβ (Mengle-Gaw and McDevitt 1983). Again, these regions were found only in the first domain and correspond to the hypervariable regions found both in the alleles at a particular locus, and between β loci. Clusters of polymorphism separated by sequences of nucleotide homology found both among the Eβ alleles and between the β loci suggested to the authors the possibility of generation of this polymorphism by a gene conversion type event.

Genomic clones from three different haplotypes, the b, d, and k haplotypes, have been isolated and their DNA sequences compared to one another (Choi et al. 1983). While the overall structural organization of these genomic clones was determined, unfortunately only the exons were sequenced at the time. The authors determined that there is a concentration of amino acid substitutions in the amino terminal portion of the encoded molecule and that the pattern of nucleotide substitutions is consistent with multiple independent mutational events. Their restriction map analysis of sequences flanking the exons suggests that there may be large differences between the haplotypes, which agrees with the data presented in this dissertation. They interpret their data as being inconsistent with gene conversion, but do not take into consideration the low number of haplotypes they analyzed.

Evidence for gene conversion in a class II β gene has been reported by Mengle-Gaw et al. (1984). They have isolated an alloreactive T cell clone, 4.1.4, that recognizes a determinant present on both Eβb and Aβbm12. Comparison of the nucleotide sequences of Aβb (Choi et al. 1983) and Aβbm12 (McIntyre and Seidman 1984) to the cDNA sequence of Eβb revealed that the bm12 sequence is identical to the Eβb sequence in the region where it differs from Aβb. The particular region where the conversion event may have occurred includes three nucleotide differences clustered within a region of 14 nucleotides in the sequence coding for amino acids 67-71. This region is also flanked by regions of exact homology which extend 20 nucleotides 5' and 9 nucleotides 3'. These flanking regions may provide stabilization of heteroduplex formation between the genes, which might potentiate sequence transfer. The T cell clone 4.1.4 was found to recognize a determinant shared by Aβbm12 and Eβb, so the possible gene conversion event would have occurred in a functional zone. Previous information which led to this interest in the bm12 mutation includes genetic mapping of the bm12 mutation to within the Aβb gene (Hansen et al. 1980), and tryptic peptide data showing the bm12 mutant to differ from its C57BL/6 parent only in its Aβ polypeptide (Lee et al. 1982b; McKean et al. 1981).
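The logic of this three-way comparison can be illustrated with a short Python sketch: given aligned parent, mutant, and putative donor sequences, list the positions at which the mutant differs from its parent but matches the donor, and measure the window those positions span. The function names and the 20-nucleotide strings below are hypothetical and chosen only for illustration; they are not the Aβb, Aβbm12, or Eβb sequences, and this is not the procedure used by Mengle-Gaw et al.

```python
def donor_like_changes(parent, mutant, donor):
    """Return 0-based positions where the mutant differs from its parent
    but is identical to the putative donor sequence."""
    if not (len(parent) == len(mutant) == len(donor)):
        raise ValueError("sequences must be aligned to equal length")
    return [
        i
        for i, (p, m, d) in enumerate(zip(parent, mutant, donor))
        if m != p and m == d
    ]


def tract_span(positions):
    """Length, in nucleotides, of the window covering all listed positions."""
    return positions[-1] - positions[0] + 1 if positions else 0


# Hypothetical toy sequences (NOT real class II gene sequences).
parent = "ACGTACGTACGTACGTACGT"
mutant = "ACGTATGTACCTACGAACGT"  # differs from parent at positions 5, 10, 15
donor = "TCGTATGTACCTACGAACGA"   # matches the mutant at those three positions

changes = donor_like_changes(parent, mutant, donor)
print(changes, tract_span(changes))
```

A small cluster of donor-like changes bounded by stretches where all three sequences agree is the pattern described in the text; on its own, such a pattern is suggestive rather than proof of a conversion event.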

There is tremendous difficulty in distinguishing between gene conversion and unequal crossing over as mechanisms of the genetic exchanges in the MHC. The discovery of gene conversion in fungi was only possible because the products of a single meiosis in some species remain in a tightly clustered tetrad in which Mendelian ratios are directly detectable (Baltimore 1981; Radding 1978). A change in gene number might be expected in unequal crossover, but if the crossover event took place totally within the genes, then one might find an insertion or deletion of genetic sequence as the only evidence, which is something our laboratory is looking for in the intron between the first and second protein coding domains. Steinmetz et al. (1982) have even postulated that unequal crossover may occur using pseudogenes as a genetic reservoir for polymorphic sequence material. Possible evidence for gene conversion at the DNA sequence level is the strong homology seen in the flanking regions of suspected conversion events; perhaps such sequences have been selected for indirectly within introns as shuttle elements to continually generate polymorphism on an evolutionary timescale. Still, there are now three known Aβ sequences, as well as a very large number of alleles, available for generation of diversity in Aβ, and there is no rule that requires one mechanism to operate for all class II genes or that requires only one mechanism to generate that diversity. More nucleotide sequence information, especially in the introns of the class II genes, should do much to elucidate the mechanisms involved.

Mechanisms for the generation of polymorphism should take into account the variable and conserved tracts within the I region characterized by Steinmetz et al. (1984). Single copy probes were isolated from the class II region of a BALB/c library and were used to screen DNA cosmid libraries of AKR and B10.WR7, haplotypes H-2k and H-2wr7 respectively. The isolated clones were aligned to provide a nearly continuous stretch of DNA through the I region of the three haplotypes, which was restriction endonuclease (RE) mapped and oriented. Using probes spanning the I region in a Southern blot analysis, a variable tract was found in the left half of the I region, and a conserved tract in the right half, with the dividing point in the middle of the Eβ gene, probably overlapping the hot spot for recombination located there. The Aβ, Aα, and Eβ genes, which show extensive polymorphism, are located in the variable tract, whereas the much less polymorphic Eα gene is located in the conserved tract. Noncoding sequences located in the variable tract were found to be just as polymorphic as, or often more polymorphic than, the coding regions in the variable tract. Again, only more nucleotide sequence information is likely to elucidate the mechanisms operating to generate and maintain the polymorphism in the I region.

The hotspots for recombination are of special interest in the generation of polymorphism in the I region. The recombination rates may even be strain dependent. Shiroishi et al. (1982) examined a congenic mouse strain, B10.MOL-SGR, which has an H-2wm7 haplotype bred onto a C57BL/10 background. This H-2 haplotype, of Mus musculus molossinus origin, tremendously enhanced recombination rates between the K and A loci. A similar dramatic increase in specific recombination rates has been reported in another wild mouse haplotype (Steinmetz et al. 1986). Two haplotypes from Mus musculus castaneus (CAS3 and CAS4) showed recombination at the same high frequency, 0.6%-1.5%, as was seen in Mus musculus molossinus derived MHC genes.

Steinmetz et al. (1986) went on to sequence the intron between the first and second protein coding domains of the Eβ gene, which probably contains the hotspot region, and found that the sequence contained a CAGG tetramer repeated in tandem 22 times, if a mismatch of one nucleotide is allowed. The sequence has some homology to the lambda Chi sequence, which promotes recombination, but the homology is not very strong. A much stronger degree of homology was found to the core sequence of the hypervariable minisatellite regions found in human DNA (Jeffreys et al. 1985). These regions could generate allelic variability by facilitating unequal crossover events during meiosis, or perhaps even by initiating a gene conversion event.
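What "repeated in tandem, if a mismatch of one nucleotide is allowed" means can be made concrete with a small Python sketch that walks a sequence in unit-length steps and counts consecutive blocks differing from the repeat unit at no more than one position. The function names and the toy sequence are assumptions made for illustration; the toy string is not the Eβ intron sequence, and this is not the method Steinmetz et al. used.

```python
def hamming(a, b):
    """Number of mismatched positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def longest_tandem_run(seq, unit="CAGG", max_mismatch=1):
    """Return (start, copies) for the longest run of back-to-back blocks,
    each within `max_mismatch` mismatches of `unit`."""
    k = len(unit)
    best_start, best_copies = -1, 0
    for start in range(len(seq) - k + 1):
        copies = 0
        pos = start
        # Extend the run one whole repeat unit at a time.
        while pos + k <= len(seq) and hamming(seq[pos:pos + k], unit) <= max_mismatch:
            copies += 1
            pos += k
        if copies > best_copies:
            best_start, best_copies = start, copies
    return best_start, best_copies


# Hypothetical toy sequence (NOT the real intron): three exact CAGG units
# plus two units carrying a single mismatch (CTGG and CAGA).
toy = "TTACAGGCAGGCTGGCAGGCAGATT"
print(longest_tandem_run(toy))                  # one mismatch allowed per unit
print(longest_tandem_run(toy, max_mismatch=0))  # exact copies only
```

Allowing one mismatch per unit lengthens the detected run considerably in the toy example, which illustrates why the reported copy number depends on the stated tolerance.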

Control of expression of the class II genes may also play a role in their generation of diversity, either by differential control of expression, or by polymorphism in the control elements themselves. Some standard laboratory inbred mouse strains carry mutations that cause failure of expression of the class II E molecule on the cell surface (Jones et al. 1978, 1981). These mutations can be of any one of three types (Hyldig-Nielsen et al. 1983; Mathis et al. 1983): the H-2b and H-2s haplotypes have a deletion in the Eα gene, the H-2f haplotype makes an Eα mRNA of aberrant size, and the H-2q haplotype has a defect in RNA processing or RNA stability. The lack of a cell surface expressed E molecule for any reason is referred to as an Eo mutation. The Eo mutations have been identified in over 50% of the t haplotype bearing strains (Nizetic et al. 1984). Eighteen t haplotype carrying strains have been found to be Eo by Dembic et al. (1984). Three strains, CR0437, tw2, and t0, were found to transcribe Eα but not to make a functional protein. All fifteen other Eo strains had a deletion of approximately 650 bp encompassing the promoter region, the RNA initiation site, and the first exon. Study of the role these mutations might play in the polymorphism of class II genes is just now getting underway.

In the human system, there are cell lines which have specifically lost expression of all class II molecules (Levine et al. 1985). The cell line 6.1.6 is a variant of a normal lymphoblastoid line which has been shown to have a regulatory defect in class II gene expression (Gladstone and Pious 1978, 1980; Levine and Pious 1984). P30 is a partial revertant of the 6.1.6 cell line. Levine et al. (1985) used Southern and Northern blotting of these two cell lines to show evidence that class II and Ii (I invariant) chain expression may be linked. Characterization of the regulatory elements of class II genes, and of their polymorphism, is just beginning.



Variation in Wild Mice



The major purpose of this dissertation is to address the question of how the polymorphism of the MHC class II genes arose and how this polymorphism is maintained. To address this question realistically requires an understanding of the evolutionary relationships of the model system being studied. The more thorough the understanding of the strain development of the system, the more informative the study.

A major limitation of many previous studies of the extensive genetic polymorphism of the murine class II genes is that only a limited number of class II alleles have been studied, and nearly all of these come from the standard laboratory inbred strains of mice. These strains were derived from a limited number of sources with a high degree of interbreeding early in their development. As such, they represent a highly biased sampling of the mouse population and an artificial collection of considerable genetic homogeneity (Ferris et al. 1982; Klein 1974). Wild mouse populations have a relatively high degree of genetic variation, particularly at the H-2 complex, when compared to the standard laboratory inbred strains of mice. It is this variability, generated and maintained through natural selection, which makes wild mice a near ideal model for the study of the genetics of class II polymorphism. In turn, the class II polymorphism is a near ideal model to study the evolution of a species.

A useful definition of wild mice is a population whose reproduction is not controlled directly by humans (Bruell 1970). This study will examine the polymorphic nature of the class II genes at the DNA level in mice from wild populations of different subspecies and geographic origins, as well as in the standard laboratory inbred strains. For this reason it is of major importance to understand wild mice as a genetic model.



Natural History of Wild Mice

Basic to the understanding of the evolutionary implications of the selection process on wild murine class II gene products is a rudimentary understanding of the natural history of wild mice. The degree of association of the wild mice with humans can be used to distinguish three groups (Sage 1981). Aboriginal mice live predominantly unassociated with human dwellings or food sources. Commensal mice live in close association







with human buildings and food supplies, while feral mice have made a transition from the commensal stage back to an aboriginal existence. Much of what is known about the natural history of wild mice has been learned from studies on commensal mice.

The term house mouse is defined here because it describes essentially all wild mice used in this dissertation research. House mouse literally refers to the commensal relationship between human dwellings and certain species of mice. The number of species comprising what we call the house mouse varies depending on the person defining the term. In this dissertation, the house mouse shall be split into seven species and subspecies, following Marshall (1981). These include the commensal mice Mus musculus domesticus, Mus musculus musculus, Mus musculus castaneus, and Mus musculus molossinus, and the closely related aboriginal mice Mus hortulanus, Mus spretus, and Mus abbotti. From fossil evidence, nuclear genetic variation, and mitochondrial genetic variation, it has been estimated that the commensal association between humans and mice began more than a million years ago (Ferris et al. 1983).

The native distribution of the wild mouse species ranges across Europe, North Africa, and northern Asia. M.m. domesticus and M.m. castaneus, two commensal species, have followed man into North and South America, Australia, and southeastern Africa, presumably as stowaways on sailing ships. Thus, most of the standard laboratory mice are of M.m. domesticus origin, derived from mice already introduced into the New World. The three related aboriginal species, M. hortulanus, M. spretus, and M. abbotti, have a native distribution in Europe and Asia Minor. The mound-building species, M. hortulanus, is restricted to the steppe grassland regions of the Carpathian basin and the Ukraine (Petrov 1979). M. spretus is found in the warmer parts of the western Mediterranean regions from France to Libya, and M. abbotti is found in southeastern Europe and Asia Minor, although its geographic distribution is less well characterized (Sage 1981). Distribution of these three aboriginal species is consistent with patterns of other animal and plant groups, suggesting that their present locations were determined by natural factors, not humans, as opposed to M. domesticus.

The western European house mouse, M. domesticus, has the most diverse geographic distribution of the house mouse species and has provided the most information about the range of genetic variability of the house mouse species. This species, due to its occupation of buildings and sailing ships during an era of worldwide colonization, established founding populations in areas as diverse as the Americas, Australia, and varied temperate and tropic Pacific island chains. M.m. domesticus may be a more advanced member of the genus based on its great adaptability and spectacular variation in color which matches its various geographic environments (Marshall 1981). Many studies have been carried out in areas where mice, usually M.m. domesticus, have been introduced, a fact which should be kept in mind when reviewing the older studies (Schwarz and Schwarz 1943). This problem is potentiated by mice which go feral after colonizing a new land, thus subjecting themselves to new and different natural selective pressures.

The native distribution of the house mouse species has not been thoroughly documented, but some informative observations have been made (Sage 1981). M. spretus, for instance, is native to western Europe and North Africa, but has been found in agricultural fields, often cornfields, in Spain and France, and grasslands in North Africa. This species has been found inside buildings in at least one instance (Sage 1981). M. hortulanus, a well-studied aboriginal species (Mikes 1971), has been found in grain fields and some native steppe grasslands. Whether or not it inhabits buildings is still questionable. Information on the natural habitats of M. abbotti is sparse (Osborn 1965). They have been reported in agricultural habitats in southern Georgia, U.S.S.R., in grain fields and bamboo groves in Turkey, and adjacent to cornfields in southern Yugoslavia.

The more commensal of the house mouse species are most often found associated with human buildings, but not always. M. castaneus is found indoors in Malaya (Harrison 1955), India (Srivastva and Wattal 1973), Indonesia (Hadi et al. 1976), and Nepal and Thailand (Marshall 1977). In fact, it has not been reported outdoors in these areas and may be the most commensal of the house mouse species. M.m. molossinus has been found in houses, farms, cultivated fields, and even along river levees in Japan, as well as abandoned agricultural fields in Korea (Hamajima 1962; Jones and Johnson 1965), suggesting that it is less of an obligate commensal than M.m. castaneus. The native range of M.m. musculus includes central Asia to northeastern Europe. Its microhabitats vary from inside buildings and haystacks in much of northern Russia and central Europe (Pelikan 1974; Romanova 1970; Zejda 1975) to agricultural fields and meadows in Denmark (Ursin 1952). In Sweden this species has been reported in natural wild locations independent of any human influence whatsoever (Zimmerman 1949).

M.m. domesticus is presently found throughout the world, although it is an adventive species in most of these areas. Its native range extends from Nepal to North Africa and western Europe. Within its native range it can be found in habitats ranging from agricultural fields to barren stony ravines isolated from human settlements, especially in Afghanistan and Pakistan (Gaisler 1975; Hassinger 1973; Roberts 1977). In the desert of the south Arabian peninsula it has been found living in burrows of sand rats (Harrison 1972). The versatile adaptability of this species is demonstrated by unusual commensal habitats it has occupied such as coal mines (Philip 1938) and frozen meat lockers (Mohr and Dunker 1930). M.m. domesticus is at least as versatile in non-native lands, and has been found in environments ranging from salt marshes (Breakey 1963) and grasslands (Pearson 1963) to the Andes mountains (Harland 1958), although it is predominantly a commensal species. It is worth noting that it has not been reported in woodland forests in Europe, nor in the Americas, although it has been found to occupy the native silver beech forests in New Zealand (Taylor 1978).

Interspecies competitive interactions are difficult to study in rodents in their native habitats. The most thoroughly studied case remains one involving two species, M.m. domesticus and the vole Microtus californicus, in the California grassland ecosystems (Lidicker 1966). A population of approximately 12,000 mice on an island was extinguished within one year after the introduction of a small number of voles. DeLong (1966; 1967) studied two enclosed populations of mice, one with voles present and one without. The population of mice in the enclosure with the voles had a significantly lower survival rate for postnatal, preweaning mice. Lidicker (1966) also found that the voles dominated house mice in 94% of their encounters.







DeLong and Lidicker's studies are actually some of the few experimental approaches in this area. These rodent interactions demonstrate part of the natural selection process when a new species enters a territory. How these interactions affect the evolution of a native species over hundreds of thousands of years remains to be determined.



Variation of Non-MHC Features in Wild Mice

Factors affecting the evolution of the murine MHC class II molecules are probably numerous. The wild mice are an excellent system to study evolution of morphological features, protein structure, and DNA structure. The morphological features of the wild mice have been especially instrumental in organizing the phylogenetic relationships of the different species of Mus while anatomical features such as dental structure (Bader 1965; Van Valen 1965), skull shape (Hussain et al. 1976), and relative tail length and foot size (Ursin 1952) have all contributed to the classification scheme. Relative tail length has also been a useful feature in the classification of wild mice, particularly long tail length of M.m. domesticus, because of a genetic region known as the t complex, which complicates tail length inheritance.

Color variation as a morphological feature was critical in establishing the mouse as an excellent model in the twentieth century to study inherited traits. Geographic factors and microhabitat have played major roles in determining the resulting coat color for the species in their native ranges. There is a notable polymorphism of coat color, particularly in ventral coloration, detectable in some species of mice such as M.m. molossinus (Hamajima 1964) and M.m. musculus (Serafinski 1965). The coat color patterns are important for geneticists not only because of their great utility as genetic markers, but also because a multifactorial nature has been shown to be involved (Falconer 1947). Genes on five or more chromosomes have been found controlling melanism (Radbruch 1973). Just as the coat color genes are important markers for geneticists, coloration affecting natural selection and survival will influence the polymorphism of some of the biochemical factors.

Variation in the proteins in wild mice has been assayed most commonly by electrophoresis and serology, as reviewed by Sage (1981). Many variant forms of proteins have been localized to a particular chromosomal position (Womack 1979), but the function of these proteins has not always been identified. Protein variation in wild mouse populations has also proved useful in learning about the heterozygosity levels in M. domesticus populations around the world (Berry and Peters 1977; Rice and O'Brian 1980; Sage 1978). Such studies have aided in the classification of the different subspecies of wild mice (Bonhomme et al.






1978; Minezawa et al. 1979). One example of how protein variants have led to important discoveries in evolution and ecology is the discovery of the hybrid zone in Europe by Selander (Selander et al. 1969; Hunt and Selander 1973). A zone of contact between M.m. domesticus and M.m. musculus runs across the Jutland peninsula in Denmark and continues through the eastern part of West Germany. The hybrid zone is as narrow as 20 kilometers in some places, and has possibly been in existence for 5000 years. Free interbreeding occurs between the "semispecies" within the zone, but not on either side of it. M.m. domesticus alleles have been detected within the M.m. musculus populations within the hybrid zone, but not vice versa, perhaps reflecting social dominance of M.m. domesticus over M.m. musculus (Thuesen 1977). While selection operates most often at the protein level, this dissertation will examine the DNA coding for variation in a specific group of proteins.

Variability in chromosome structure, once thought nonexistent, has been discovered to be quite prevalent in certain regions of the world, e.g. Italy and Switzerland (Gropp et al. 1969; 1970; 1972). The previously "normal" chromosomal complement was thought to be 20 pairs of acrocentric chromosomes. In an excellent review article by the discoverer of Robertsonian translocations (Gropp and Winking 1981), Gropp and Winking describe the presence in wild mouse







populations of metacentric chromosomes formed by the joining of two acrocentric chromosomes.

Studies on the variability of mitochondrial DNA sequence in various species of house mice using an RFLP analysis were first reported by Yonekawa et al. (1980). The 25 standard laboratory strains they analyzed showed no variation and were identical with a sample of wild M.m. domesticus from Canada, but the patterns from M.m. castaneus and M.m. molossinus were very different. Variability of mitochondrial DNA within M.m. molossinus populations appears to be limited.

An extensive analysis of mitochondrial DNA evolution in 208 mice by Ferris et al. (1983) has reinforced the phylogenetic classification scheme of Marshall and Sage (1981) for the seven Mus species and subspecies of house mice discussed here. An RFLP analysis of the mitochondrial DNA of four commensal and three aboriginal species of house mice and the standard laboratory mice led to the construction of evolutionary trees on the basis of mitochondrial polymorphisms. These evolutionary trees emphasized the distinctiveness of M.m. domesticus from the other commensal species of mice. All 50 of the standard laboratory mouse strains analyzed were found to be M.m. domesticus. The mitochondrial evolutionary tree also reinforces that the three European aboriginal species of mice which have been discussed, M. spretus, M. abbotti, and M. hortulanus, differ substantially from






the commensal mouse species and are each an individual species of Mus.

The first commensal mice may have begun their relationship with humans one to two million years ago, assuming the rate of mutational divergence in mitochondrial DNA is between 2% and 4% per million years (Ferris et al. 1983). Mitochondrial DNA comparisons between mammals whose divergence times have been estimated from fossil records (Brown et al. 1979; Brown et al. 1982; Ferris et al. 1981; Upholt and Dawid 1977) have provided this estimate of mitochondrial DNA divergence. This estimate of commensalism between mice and humans fits with Sage's (1981) protein comparisons and may correlate with Mus species divergence.
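The dating argument above amounts to dividing an observed sequence divergence by an assumed rate of divergence. The following is a minimal sketch in Python, not the method of Ferris et al. It assumes a divergence of 4%, a value back-calculated here from the quoted range of one to two million years rather than reported in the text, and the function name is hypothetical.

```python
def divergence_time_range(percent_divergence, rate_low=2.0, rate_high=4.0):
    """Return (youngest, oldest) divergence time in millions of years.

    Rates are percent sequence divergence per million years; the 2% and 4%
    defaults are the bounds quoted in the text.
    """
    if rate_low <= 0 or rate_high < rate_low:
        raise ValueError("rates must satisfy 0 < rate_low <= rate_high")
    youngest = percent_divergence / rate_high  # fast clock gives the younger date
    oldest = percent_divergence / rate_low     # slow clock gives the older date
    return youngest, oldest


# Assumed 4% divergence (illustrative only, back-calculated from the text).
print(divergence_time_range(4.0))  # (1.0, 2.0)
```

The width of the resulting interval comes entirely from the uncertainty in the rate, which is why the estimate is quoted as a range.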



The "t" Complex in Wild Mice

The term t complex indicates the part of the chromosome which is occupied by a complete t haplotype. Occurring at a frequency of 10% (Artzt et al. 1985) to 40% in most of the sampled wild mouse populations (Dembic et al. 1984), t haplotypes are structurally variant forms of a segment of murine chromosome 17. Mouse t haplotypes are thoroughly reviewed in a recent article by Silver (1985). When first discovered (Dobrovoloskaia-Zavadskaia and Kobozieff 1932), and for many years thereafter, t haplotypes were thought to be recessive alleles at the Brachyury (T) locus. Although there is a T locus near the centromere on chromosome 17, it is a well-defined single locus which is only a small part of a t haplotype (Bennett et al. 1975).

The different t haplotypes all appear to be related to one another structurally. A complete t haplotype encompasses about 30 x 10^3 kb, which accounts for approximately 1% of the entire mouse genome and includes the entire murine H-2 complex, hence the connection with the polymorphism of the class II genes. There are also polymorphisms within the t haplotypes themselves, among the most t specific of which may be the t complex proteins (TCP) (Silver et al. 1979; Silver et al. 1983). Also within the chromosomal region occupied by t haplotypes are many other normal genes common to non-t bearing mice, along with a smaller number of mutant t genes which must effect the t specific characteristics.

The t haplotypes have been known to influence tail length, fertility, embryogenesis, male transmission ratio, and meiotic recombination (Dunn and Gluecksohn-Schoenheimer 1950; Silver 1985). Of these characteristics, it is believed that suppression of recombination is the one through which the t haplotypes have been maintained as a distinct genomic unit. Furthermore, a distorted male-specific transmission ratio permits propagation through mouse populations despite the deleterious effects which accompany complete t haplotypes.






The suppression of recombination which occurs in complete t haplotypes with non-t wild haplotypes, as first discovered by Dunn and Caspari (1945), is related to the t genomic structure. This suppression extends from T and includes the H-2 complex (Hammerling and Klein 1975) and the Tla and Qa-2 regions (Shin et al. 1982; Silver 1981), but not the Pgk-2 locus (Nadeau 1983; Rudolph and Vanderberg 1981). Thus, a complete t haplotype consists of a 12 to 15 cM region of the chromosome with concomitant suppression of recombination from somewhere between the centromere and T and extending to somewhere between the distal part of the MHC and Pgk-2. Rare chromosomes which had recombined within the t haplotype were discovered and designated as partial t haplotypes. These rare recombinants were subsequently found to be of critical importance in understanding the physical structure of the t haplotypes (Lyon 1960; Lyon and Meredith 1959).

With partial t haplotypes as a tool used to infer structure, the region of suppression of recombination was found to occur only along the extent of t DNA present (Bechtol and Lyon 1978; Bennett et al. 1979). Normal recombination rates between t haplotypes also suggested that the structures of t haplotypes were similar to one another, and different from the same chromosomal region in wild type DNA (Artzt et al. 1982a; Condamine et al. 1983). Artzt et al. (1982b) were able to demonstrate that






the physical locations of the H-2 and the tf locus were reversed in t haplotypes relative to their location in the wild type chromosome. These results were confirmed by others (Shin et al. 1983b; Shin et al. 1984). A complete t haplotype therefore consists of a distal inversion, which includes tf and H-2, a proximal inversion which includes T and the genes encoding the Tcp (T complex proteins) products (Herrmann et al. 1986), and possibly a small central inversion.

Many complete t haplotypes are known to have lethal effects in homozygous t embryos (Klein et al. 1984). This can be a useful tool as one of the few ways to distinguish complete t haplotypes from one another, as different chromosomes carrying different lethal mutations can complement each other in genetic tests (Bennett 1975; Klein et al. 1984; Winking and Guenet 1978). It also became possible to analyze the genetic basis for t lethal effects with the finding that normal crossover occurs between two different t haplotypes (Silver and Artzt 1981). The majority of the complete t lethal mutations analyzed appear to be single-locus mutations, and lethal mutations of complementing t haplotypes are not allelic to each other (Artzt et al. 1982a). Although there is some evidence for clustering (Artzt 1984), the different lethal mutations appear to be distributed over the entire length of complete t haplotypes. Overall, the entire genetic basis for the t lethal mutations seems to be








straightforward, but the molecular mechanism by which they effect their lethality is still unknown.

The male-specific transmission ratio distortion (TRD) inherent in t haplotypes is responsible in part for propagation of t haplotypes through the wild mouse population, even though the t haplotypes carry deleterious genes. Wild males with a complete t haplotype will transmit it to well over 90% of their offspring (Lyon and Meredith 1964a, 1964b). Mice carrying a single partial t haplotype cannot transmit it at a high ratio, but the TRD can be restored in males carrying particular pairs of partial t haplotypes in cis or trans configuration (Silver 1985). This effect was higher with certain trans combinations for a portion of the t haplotypes, leading Lyon (1984) to propose a model in which partial t haplotypes carry different lengths of t DNA with particular sets of distortion loci. Lyon (1984) hypothesized that a series of t-specific distorter loci, Tcd, act on a single t-specific responder locus, Tcr. The effects of the Tcd loci would be additive, and they could act cis or trans to the Tcr locus to transmit it at a high ratio when enough Tcd loci are present. Evidence in support of this model has been obtained by Fox et al. (1985). Further research based on this model should be forthcoming.

Sterility is another effect which accompanies the presence of any two complete t haplotypes in male mice.






The physiological reasons for this sterility are still unknown. The sperm appear to be morphologically normal (Hillman and Nadijcka 1980). This sterile condition is of particular interest because of its strong similarities to the t's TRD effect. The possibility exists that the two are related, but this has not yet been proven.

The association of the t haplotypes with the murine histocompatibility class II molecules has long fascinated geneticists, due to the inclusion of the H-2 complex in the recombination suppression of complete t haplotypes. The extreme polymorphic nature of the class II molecules has provided excellent markers and an approach to study evolution of the t haplotypes. Dembic et al. (1984) and Nizetic et al. (1984) have drawn correlations between an Ia deletion and its association with t haplotypes which suggest an ancient origin for this deletion. Association of the members of the same t complementary group with the same H-2 haplotype supports this view, but interpretations should be made cautiously, as the evidence that H-2 haplotype association with t chromosomes is derived from a single ancestor is not conclusive. More recently, Figueroa et al. (1985) have revealed the existence of three major groups of class II alleles associated with particular t haplotypes. These results are not in conflict with those to be presented in this dissertation.








Polymorphism of the H-2 Class II Genes in Wild Mice

The murine class II histocompatibility genes are one of the most polymorphic gene complexes in mammalian genetics. Research on the genetic basis for this polymorphism and its functional significance has led to many critical discoveries in transplantation biology, cancer research, and genetics. However, most of the research in these areas has been carried out using standard laboratory mice. As was mentioned earlier, almost all of these mice are of M.m. domesticus origin and derived from a very limited number of stocks which were not well characterized.

The presence in wild mouse populations of private H-2 antigen specificities absent from the standard laboratory inbred mice led to the realization that there was a need to identify and characterize these new H-2 specificities. The methodology of choice was a serological characterization of the wild H-2 haplotypes, but the problem was to isolate these antigens from non-H-2 antigens so that antisera specific for only the H-2 could be produced. Klein developed the B10.W congenic lines (Klein 1973, 1975), where "W" stands for wild. The wild males were bred with a B10.BR female and the progeny were backcrossed 8 to 14 times to the same inbred strain with a continual selection for an H-2 marker (Ssh) specific for the wild mouse's H-2 haplotype. The Ssh animals were then intercrossed and progeny with Ssh and






a H-2.23 negative phenotype (wild haplotype) were selected to establish homozygous lines with brother x sister matings to maintain the line. Thus, each wild H-2 haplotype is bred onto a C57BL/10 background for specific analysis of the wild type H-2.

Once the B10.W lines were established, a serological examination of sixteen of their wild H-2 haplotypes substantiated their extreme polymorphism (Klein 1975; Zaleska-Rutczynska and Klein 1977). A few wild haplotypes appeared to be identical to one another serologically, and a few of them resembled standard laboratory strains of mice, but most were different from one another and different from all known laboratory inbred mouse strains (Zaleska-Rutczynska 1977). A serological analysis of 29 wild-derived H-2 haplotypes (Wakeland and Klein 1979a; Wakeland and Klein 1981) defined five new I region antigens, with the inclusion of three new haplotypes, u, v, and j, on their inbred panel. Also mentioned in this same report are the beginnings of discernible "phenogroups." In addition, wild mouse haplotypes which showed evidence of possible recombination in the H-2 complex were characterized (Duncan and Klein 1980; Wakeland and Klein 1979b), suggesting that the wild mouse haplotypes may be of use in analyzing recombination mechanisms as they occur in a natural population.

The combination of serological and tryptic peptide mapping analyses proved to be very informative for






Wakeland and Klein (1983). They were able to organize 29 B10.W lines into 8 distinct antigenic groups. The tryptic peptide mapping correlated with the serology, but also demonstrated an extremely high degree of similarity of the class II molecules of members of the same family. These class II families often had a standard laboratory inbred mouse strain as a "prototypic" member. The discovery of the existence of groupings of class II wild haplotypes bears directly on the question of how the polymorphism of the class II molecules arose and how it is maintained. The process of generation of diversity in class II molecules may not be as random as once thought. These aspects are stressed here because these groupings formed the basis for this dissertation.

Two of the groupings established by Wakeland and Klein (1983), the Ak and Ap families, were subsequently selected for more detailed analysis. The tryptic peptide fingerprints of the Aα, Aβ, Eα, and Eβ subunits encoded by four of the wild H-2 genes in the Ak group were compared. The Aα and Aβ subunits of all of the related haplotypes differed from Aαk and Aβk by less than 10% of their tryptic peptides (Wakeland and Darby 1983). The tryptic peptide fingerprint comparisons for the Eβ gene in these same strains were Eβd-like in two wild haplotypes and Eβs-like in another wild haplotype, suggesting that recombination between Aα and Eβ may be significant in the wild. This may reflect different







evolutionary patterns of the Aβ and Aα genes with respect to the Eβ genes.

The Ak and Ap families (Wakeland and Darby 1983; Wakeland and Klein 1983) were also analyzed to determine the effect of their minor structural variations on allorecognition by T lymphocytes (Peck et al. 1983). Minor structural variations in the A molecule were usually found to cause major functional changes in allorecognition. These changes were always detected when the Aβ subunit contained the structural variation. Peck et al. (1983) also found that more than one site in the A molecule can be recognized by alloreactive T lymphocytes. These results suggest that specific sites in the A molecule are critical for allorecognition. Thus it would be informative to know the location of the minor structural differences between wild H-2 haplotypes in either the Ak or the Ap family to determine if the differences are in a critical binding area of the molecule. If so, the evolutionary mechanism generating the polymorphism found in one of these haplotype families would seem to be operating as a non-random process.

Radiochemical sequence analysis of tryptic peptides of wild-derived H-2 complexes of Ak family members has localized structural variations of the A molecule to the α1 and β1 domains (Wakeland et al. 1985). The variations have been localized in the Aα molecule to two adjacent peptides. In the Aβ subunit the differences have been






localized to single amino acid changes, possibly due to single point mutations in the encoding DNA. Thus, the Ak family of class II alleles is probably diversifying by the accumulation of discrete mutations within the exons encoding the α1 and β1 domains. Again this suggests that wild-derived variants in exon structure are not random. Recent data on the DNA structure of the Ak and Ap families based on RFLP analysis (McConnell et al. 1986) suggest that the intron structure may also be informative in determining the evolutionary lineage of H-2 class II haplotypes.














MATERIALS AND METHODS



Mice



All mice were from the mouse colony in the Tumor Biology Unit at the Department of Pathology, University of Florida, or from our wild mouse colony at the Animal Care Facility, University of Florida. Strains used included AZROU 1, AZROU 2, BELGRADE 1, C57BL/10, B10.BR, B10.BUA16, B10.CAA2, B10.CAS2, B10.CHA2, B10.D2, B10.F, B10.KEA5, B10.M, B10.PL, B10.Q, B10.RIII(71NS), B10.S, B10.SAA48, B10.SM, B10.STC77, B10.STC90, B10.WB, tw71, TT6, t6-JR1, tw8, tw75, tw12, tw5, tw32, JERUSALEM 3, JERUSALEM 4, METKOVIC 1, STU, VIBORG 5, and W12A. The inbred mouse strains are maintained by full brother x sister mating with a single line of descent. All mouse strains are homozygous at the H-2 complex unless otherwise noted.



Isolation of Genomic DNA



Genomic DNA was prepared from liver tissue according to the methods of Maniatis et al. (1982). The mice were deprived of food for 24 hours prior to sacrifice. The






livers were minced with surgical scissors and placed in a mortar which contained liquid nitrogen. The frozen tissue pieces were then ground to a fine powder and added to 40 ml of TES buffer (10 mM Tris HCl, pH 7.5; 5 mM EDTA; 100 mM NaCl) with 1% SDS and 0.4 mg/ml of proteinase K (Sigma, St. Louis, MO). This DNA preparation was then incubated at 65°C for 16 hours, extracted three times with Tris equilibrated phenol (pH 7.5) and twice with chloroform and isoamyl alcohol (96:4 v/v), and then precipitated by the addition of 2.5 volumes of isopropanol. The high molecular weight genomic DNA was hooked from the isopropanol solution with a drawn Pasteur pipette, dissolved in 0.5 ml TE (10 mM Tris HCl, pH 7.5; 1 mM EDTA), and dialyzed extensively against TE buffer. The resulting genomic DNA preparations were then analyzed for purity and quantitated by spectrophotometry and agarose gel electrophoresis. All DNA preparations used in this study had 260/280 ratios in excess of 1.8 and migrated as high molecular weight DNA on 0.7% agarose gels.



Restriction Endonuclease Digestions
and Agarose Gel Electrophoresis



A Tris buffered solution containing 15 µg of genomic DNA was digested with 30 units of enzyme for 16 hours at 37°C under conditions described by the supplier (Bethesda Research Laboratories, Bethesda, MD). An additional 15 units of enzyme was then added for 8 hours. The







efficiency of each endonuclease digestion was monitored by removing 10% of the digest reaction volume immediately following the final addition of endonuclease and adding this aliquot to 0.5 µg of lambda phage DNA. Following an 8 hour incubation, digestion of the genomic DNA was analyzed by electrophoresis in 0.7% agarose gels. Complete digestion of the genomic and lambda phage DNA was detected as a "smear" of genomic DNA-derived restriction fragments together with a pattern of lambda DNA derived restriction fragments characteristic of complete digestion with each specific enzyme. The bulk of the digested genomic DNA (13.5 µg) was stored at -20°C until electrophoresis. Digested genomic DNAs were electrophoresed through 0.7% agarose gels for 40 hours at 1.5 V/cm or for 20 hours at 3.0 V/cm in a high resolution horizontal electrophoresis apparatus with cooling thermoplate (International Biotechnologies Incorporated, New Haven, CT).
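The quantities in the digestion protocol can be tallied with a few lines of arithmetic. The sketch below, in Python, uses only the amounts stated above; the function and variable names are hypothetical and are introduced purely for illustration.

```python
GENOMIC_DNA_UG = 15.0        # genomic DNA per digest, from the protocol
FIRST_ENZYME_UNITS = 30.0    # initial enzyme addition (16 hour digestion)
SECOND_ENZYME_UNITS = 15.0   # supplementary addition (further 8 hours)
MONITOR_FRACTION = 0.10      # share of the reaction removed for the lambda check


def digest_summary():
    """Return the DNA split and overall enzyme excess for one digest."""
    monitor_ug = GENOMIC_DNA_UG * MONITOR_FRACTION
    bulk_ug = GENOMIC_DNA_UG - monitor_ug
    total_units = FIRST_ENZYME_UNITS + SECOND_ENZYME_UNITS
    return {
        "monitor_ug": round(monitor_ug, 2),   # genomic DNA mixed with lambda DNA
        "bulk_ug": round(bulk_ug, 2),         # genomic DNA kept for the gel
        "units_per_ug": round(total_units / GENOMIC_DNA_UG, 2),
    }


print(digest_summary())
```

Removing 10% of a 15 µg digest leaves the 13.5 µg quoted in the text, and the two enzyme additions together correspond to 3 units per µg of genomic DNA.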



Capillary Transfer and Hybridization.



Following electrophoresis, DNA was transferred from the gel to nylon filters (Zetabind, AMF, Meriden, CT) by the method of Southern (1980). Transfer efficiency was monitored by comparing the amount of DNA remaining in the gel following transfer with the amount present prior to transfer by ethidium bromide staining and photographic







analysis. The nylon filters were vacuum dried for 2 hours at 80°C and stored on desiccant at 4°C until hybridization. The filters were hybridized with a 32P-labeled 5.8 kb Eco RI fragment containing the entire Aβd gene (Malissen et al. 1983) or with a 1.2 kb Hind III fragment containing part of the Aαb gene derived from I-Ab (J. Seidman, personal communication 1984). The probes were radiolabeled with 32P-dCTP to a specific activity of >2 x 10^8 dpm/µg by nick translation (Bethesda Research Laboratories, Bethesda, MD). Hybridization and rehybridization conditions were as described by the supplier of the Zetabind nylon filters (AMF, Meriden, CT). Final stringency was established by two 30 minute washes at 65°C with 0.015 M NaCl, 0.0015 M sodium citrate, 0.1% SDS. Autoradiographs were produced by exposure for 2-6 days on XAR-5 X-ray film (Kodak, Rochester, NY) with intensifying screens (Dupont, Wilmington, DE) at -70°C.



Data Analysis.



RFLP analyses were performed using equation 21 from Nei and Li (1979), F = 2n_xy/(n_x + n_y), in which n_x and n_y are the numbers of fragments in populations X and Y, respectively, and n_xy is the number of fragments shared by the two populations. The validity of the formula was tested by Nei and Li in known pairwise






sequence comparisons. An F value was calculated for each pairwise comparison for all restriction digests. Restriction fragments which weakly hybridized with probes for either Aα or Aβ were not included in the analysis.
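The F calculation can be sketched in a few lines of Python. This is an illustration, not the procedure actually used for the dissertation: fragments are matched by exact equality of enzyme and size, the function name is hypothetical, and the only data used are the Pvu II fragment sizes quoted in the Results for four Aβ alleles, whereas the full analysis pools seven enzymes.

```python
def fraction_shared(fragments_x, fragments_y):
    """Nei and Li (1979) F = 2*n_xy / (n_x + n_y) for two sets of fragments.

    Each fragment is an (enzyme, size_kb) tuple so that fragments of equal
    size produced by different enzymes are never counted as shared.
    """
    x, y = set(fragments_x), set(fragments_y)
    if not x and not y:
        raise ValueError("at least one allele must have scored fragments")
    n_xy = len(x & y)  # fragments present in both alleles
    return 2 * n_xy / (len(x) + len(y))


# Pvu II fragments (kb) quoted in the Results; one enzyme only.
pvu2 = {
    "d":   {("Pvu II", 2.89), ("Pvu II", 1.85)},
    "w22": {("Pvu II", 2.89), ("Pvu II", 1.85)},
    "p":   {("Pvu II", 2.89), ("Pvu II", 2.75)},
    "b":   {("Pvu II", 4.83), ("Pvu II", 2.92)},
}

print(fraction_shared(pvu2["d"], pvu2["w22"]))  # 1.0
print(fraction_shared(pvu2["d"], pvu2["p"]))    # 0.5
print(fraction_shared(pvu2["d"], pvu2["b"]))    # 0.0
```

Using sets means that two co-migrating fragments from the same digest would be scored once; a real analysis would also have to decide how close two band sizes must be to count as the same fragment.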














RESULTS



RFLP Analysis of the Aα and Aβ Genes
of Standard Laboratory Inbred and Wild Mice



Table 1 presents the 37 mouse strains analyzed in this study, including 13 standard laboratory inbred strains, 15 strains containing wild derived H-2 haplotypes, and 9 t haplotype bearing strains. The 28 wild and standard laboratory inbred mouse haplotypes will be dealt with in this first section of the results. Also relevant to the data presented here, mice representative of three different subspecies, Mus musculus domesticus, Mus musculus musculus, and Mus musculus castaneus, were analyzed in this study, as seen in Table 1. The genomic structures of the Aα and Aβ alleles of these haplotypes were compared by RFLP analysis with DNA probes specific for these genes, as described in the materials and methods. A more detailed diagram of the Aβ probe is illustrated in figure 2 in the literature review. The probe consists of a 5.4 kilobase (kb) Eco RI genomic fragment derived from the H-2b haplotype.

In the initial RFLP analysis of the Aβ gene of the 28 standard laboratory inbred and wild mouse strains, I








Table 1. Mouse strains used in this study.

Mouse         Subspecies          H-2    Geographic     Aβ
strain                                   origin         group

C57BL/10      M.m. domesticus a   b      Old inbred     b
AZROU 1       "                   w201   Morocco        b
JERUSALEM 3   "                   w203   Israel         b
JERUSALEM 4   "                   w204   Israel         b
B10.M         "                   f      Old inbred     b
B10.WB        "                   j      Old inbred     b
B10.S         "                   s      Old inbred     b
B10.STC90     "                   w15    Michigan       b
tw12          "                                         b
TT6-865       "                                         b
TT6-866       "                                         b
AZROU 2       "                   w217   Morocco        b
STU           "                   w34    Eur. inbred    b
W12A          "                   w216   Netherlands    b

B10.D2        M.m. domesticus     d      Old inbred     d
B10.RIII      "                   r      Old inbred     d
METKOVIC 1    "                   w205   Yugoslavia     d
B10.BUA16     "                   w22    Michigan       d
B10.CAS2      M.m. castaneus      w17    Thailand       d
t6-JR1        M.m. domesticus                           d
tw5           "                                         d
B10.SM        "                   v      Old inbred     d
B10.F         "                   p      Old inbred     d
B10.Q         "                   q      Old inbred     d
B10.KEA5      "                   w5     Michigan       d
B10.CAA2      "                   w11    Michigan       d
B10.STC77     "                   w14    Michigan       d
BELGRADE 1    M.m. musculus       w202   Yugoslavia     d
tw8           M.m. domesticus                           d
tw32          "                                         d
B10.SAA48     "                   w3     Michigan       d

B10.BR        M.m. domesticus     k      Old inbred     k
B10.CHA2      "                   w26    Old inbred     k
B10.PL        "                   u      Old inbred     k
NZW           "                   z      Old inbred     k

a The abbreviation M.m. represents the species designation Mus musculus.







obtained restriction fragments such as those illustrated in figure 4. Patterns of similarity between different Aβ alleles began to stand out. The Aβd and Aβw22 alleles are seen to have identical Pvu II fragments of 2.89 kb and 1.85 kb. The Aβb, Aβf, and Aβj alleles have identical 4.83 kb and 2.92 kb Pvu II RFs; the extra band in the Aβb lane has been shown to be a plasmid contaminant in this particular Southern blot. The Aβp, Aβr, Aβv, and Aβw201 alleles all have Pvu II RFs of 2.89 kb and 2.75 kb.

As the RFLP patterns became more obvious, I arranged the mouse strains according to their similarities and carried out more Southern blots. Figure 5 is a representative autoradiogram of the same strains of mice, minus one, that were shown in the Pvu II autoradiogram of figure 4, but reorganized to emphasize the groupings. The Aβ alleles of r, p, d, v, and w22 all have Sac I RFs of 5.2, 3.8, and 2.65 kb. The one evolutionary group d member on this autoradiogram without the 5.2 and 2.65 kb restriction fragments (RF), although it does have the 3.8 kb RF, is BELGRADE 1 (Aβw201). When the two missing 2.65 and 5.2 kb RFs from the d group alleles are added together, the sum is 7.85 kb, which matches the new 7.85 kb RF of the Aβw201 allele within experimental error and suggests that the new RF is due to the absence of a single Sac I





























Figure 4. Autoradiogram of a Southern blot of DNA from the different mouse strains, which was digested with the restriction endonuclease Pvu II, electrophoresed, blotted, and probed with the Aβ genomic probe as described. Each band represents the relative location of a restriction fragment in the gel, which differs according to the positions in the gene of the recognition sites for the restriction endonuclease used, Pvu II in this case. Molecular weight markers are indicated at the right side of the figure.





[Figure 4 autoradiogram. Lanes are headed "I region alleles tested"; molecular weight markers are given in kb.]




























Figure 5. Autoradiogram of a Sac I restriction endonuclease digestion of standard laboratory inbred and wild mouse strains' DNA probed with Aβ. This panel shows representative members of the three evolutionary Aβ groups discovered.




[Figure 5 autoradiogram. Lanes, left to right: r, p, d, v, w22, w201 (group d); f, j, b, s (group b); k, u (group k). Size markers are given in kb.]






site in BELGRADE 1 that is present in the Aβ r, p, d, v, and w22 alleles.
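Loss of a restriction site fuses the two fragments that flanked it, so the new fragment should be about as long as the two missing fragments combined. A minimal check in Python using the Sac I sizes quoted above (5.2 kb and 2.65 kb missing, 7.85 kb new) follows. The 0.1 kb tolerance is an assumed allowance for sizing error on the gel, and the function name is hypothetical.

```python
def fused_fragment_check(lost_fragments_kb, new_fragment_kb, tolerance_kb=0.1):
    """Compare the summed size of the lost fragments with the new fragment.

    Returns (expected_kb, consistent), where `consistent` is True when the
    new fragment lies within `tolerance_kb` of the summed size.
    """
    expected = sum(lost_fragments_kb)
    # Small epsilon guards against floating-point noise at the boundary.
    consistent = abs(expected - new_fragment_kb) <= tolerance_kb + 1e-9
    return round(expected, 2), consistent


expected, consistent = fused_fragment_check([5.2, 2.65], 7.85)
print(expected, consistent)  # 7.85 True
```

A size match of this kind is consistent with a single lost site, but it does not by itself exclude other rearrangements that happen to give a fragment of similar length.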

Of the group b Aβ alleles on this autoradiogram, the f, j, b, and s alleles have a Sac I 1.58 kb RF in common. The f, j, and b alleles have a common 7.3 kb Sac I RF while the f, j, and s alleles have a common 3.8 kb Sac I RF. The k group members shown on this autoradiogram have a common Aβ Sac I 4.6 kb RF which has been found only in k group members and not in any of the other 33 mouse strains examined.

Digestion of the DNA of the mouse strains with a total of seven REs makes possible a detailed and accurate comparison between alleles to better define the groups. Consequently, the DNA sequence homology among these Aβ alleles can be quantitatively estimated from the RFLP data by calculating the fraction homologous (F) value as defined by Nei and Li (1979). The F value is the fraction of RFs which the two alleles have in common. An F value of 1.00 indicates that all RFs for all seven REs are identical for the two alleles being compared. An F value of 0 indicates that no RFs are shared between the two alleles being compared. A number of mouse strains, which would always be in the same group, had identical RFLPs for all seven restriction enzymes, giving them F values of 1.00 when compared to one another. B10, AZROU 1, JERUSALEM 3, and tw75 all have identical Aβ alleles, thereby establishing them as a core of the b group,






named after the b haplotype of the prototypic C57BL/10 mouse strain (also designated B10). The tw71 and tw12 strains have identical Aβ alleles, as do the two TT6 t haplotype strains, although these two may be identical chromosomes. The two mouse strains B10.WB and JERUSALEM 4 are another pair with an F value of 1.00 between them, with B10.M differing from them by a single Bgl II fragment. In the d group, named after the well characterized prototypic d haplotype, the Aβ alleles of B10.RIII and METKOVIC 1 are identical to one another, as are those of tw8 and tw32. The mouse strains B10.F, B10.Q, B10.CAA2, B10.KEA5, and B10.STC77 all possess identical Aβ alleles, as do the pair STU and W12A, and the k group members B10.BR and B10.CHA2, which will all be considered in detail shortly. RFLP analysis with seven different restriction endonucleases identified 22 different Aβ alleles among the 36 different mouse strains analyzed, including t haplotype bearing mice.

Table 2 presents a matrix comparison of the divergence of Aβ within and between representative members of groups b, d, and k expressed as F values, and based on results obtained with seven RE's. The full strain designations can be found in Table 1. Once the F values for the Aβ alleles of the 37 standard laboratory inbred strains, wild derived strains, and t haplotype bearing strains were calculated and compared, the existence of three groups became obvious. The F






Table 2. RFLP analysis of mouse Aβ alleles representative of the 3 evolutionary groups.

Aβ     Mouse
group  strain   B10     AZR 1   JER 4   .M      .D2     MET 1   .BUA16  .CAS2   BEL 1   .BR     .PL

1      B10      -       1.00    .69     .69     0       0       0       0       0       .17     .15
1      AZR 1    26/26   -       .69     .69     0       0       0       0       0       .17     .15
1      JER 4    18/26   18/26   -       .85     .08     .16     .09     .10     .17     0       .15
1      B10.M    18/26   18/26   22/26   -       .08     .08     0       .10     .08     .17     .15

2      .D2      0/26    0/26    2/26    2/26    -       .72     .87     .70     .58     0       .23
2      MET 1    0/25    0/25    4/25    2/25    18/25   -       .82     .84     .87     .09     .24
2      .BUA16   0/22    0/22    2/22    0/22    20/23   18/22   -       .78     .67     0       .26
2      .CAS2    0/21    0/21    2/21    2/21    14/20   16/19   14/18   -       .67     .10     .27
2      BEL 1    0/24    0/24    4/24    2/24    14/24   20/23   14/21   12/18   -       .18     .08

3      .BR      4/24    4/24    0/24    4/24    0/24    2/23    0/21    2/20    4/22    -       .67
3      .PL      4/26    4/26    4/26    4/26    6/26    6/25    6/23    6/22    2/24    16/24   -

a The fraction homologous (F) values as defined by Nei and Li (1979).
b Number of shared RF/total RF scored for both alleles, based on restriction digestions with Bam HI, Bgl II, Eco RI, Hind III, Pst I, Pvu II, and Sac I.






values within any one of the three groups are usually greater than 0.50, and most often 0.67 or higher. The discrete nature of the groups is demonstrated by the F values between the groups, which range from zero to 0.18, indicating very little homology.
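The way discrete groups fall out of a matrix of F values can be illustrated with a small single-linkage sketch. This is not the procedure used in the study. It assumes a cutoff of 0.50 (taken from the within-group range quoted above), and the function name is hypothetical. The F values are copied from Table 2 for six of the representative strains.

```python
def group_alleles(strains, f_values, threshold=0.5):
    """Single-linkage grouping: join two strains whenever F >= threshold."""
    parent = {s: s for s in strains}

    def find(s):
        # Follow parent links to the group root, compressing the path as we go.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for (a, b), f in f_values.items():
        if f >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for s in strains:
        groups.setdefault(find(s), []).append(s)
    return sorted(sorted(members) for members in groups.values())


strains = ["B10", "JER 4", ".D2", "MET 1", ".BR", ".PL"]
f_values = {  # upper-triangle F values from Table 2
    ("B10", "JER 4"): 0.69, ("B10", ".D2"): 0.0, ("B10", "MET 1"): 0.0,
    ("B10", ".BR"): 0.17, ("B10", ".PL"): 0.15, ("JER 4", ".D2"): 0.08,
    ("JER 4", "MET 1"): 0.16, ("JER 4", ".BR"): 0.0, ("JER 4", ".PL"): 0.15,
    (".D2", "MET 1"): 0.72, (".D2", ".BR"): 0.0, (".D2", ".PL"): 0.23,
    ("MET 1", ".BR"): 0.09, ("MET 1", ".PL"): 0.24, (".BR", ".PL"): 0.67,
}

groups = group_alleles(strains, f_values)
```

With these values the six strains separate into three pairs, matching the group assignments shown in Table 2.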

Table 1 contains the complete listing of the 37 strains analyzed, with the Aβ group to which each has been assigned shown in the far right column. Every mouse strain for which RFLP data were obtained using seven RE's can be placed into one of these three groups. Also presented in Table 1 is the subspecies of each of the strains tested. There is no correlation between the subspecies of the mice and the Aβ group to which they belong. For example, the subspecies Mus musculus domesticus is present in all three groups, and three different subspecies are present in the d group, indicating that the existence of these Aβ groups predated the subspeciation of Mus musculus, at least into the three subspecies represented here. Subspeciation probably occurred approximately one million years ago. The ancient origin of these evolutionary Aβ groups indicates a continual maintenance of at least part of the Aβ gene structure.

To further substantiate the existence and the discrete nature of the evolutionary Aβ groups, a statistical comparison of the Aβ groups is shown in table 3. The average F values between any two members



















Table 3. Statistical comparison of Aβ groups defined by RFLP analysis.

               Mean F value ± S.D.
         Within         Between                     Student's
Group    same group     different groups    d.f.a   t test

1        .641 ± .157    .098 ± .074         466     50.13
2        .644 ± .161    .103 ± .088         511     48.94
3        .697 ± .158    .147 ± .091         140     13.99

a degrees of freedom






within the same group are no lower than .641, while the average F values between different groups are no higher than .147 (table 3). The Student's t test value for each of the three groups corresponds to a probability of p < 0.001, indicating the statistical validity of these three evolutionary groupings.
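The comparison of within-group and between-group F values can be sketched as a two-sample Student's t test. The text does not state which form of the test was used. The sketch below assumes the pooled-variance form, whose degrees of freedom (n_a + n_b - 2) are at least consistent in kind with the d.f. column of table 3. The function name is hypothetical, and the two short samples are a handful of values read from Table 2, not the full data set behind table 3.

```python
import math
from statistics import mean, variance


def pooled_t(sample_a, sample_b):
    """Two-sample Student's t statistic with pooled variance; returns (t, d.f.)."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # statistics.variance is the sample variance (n - 1 in the denominator).
    pooled_var = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / standard_error, df


# Illustrative subsets only: pairwise F values among the four group 1
# representatives in Table 2, and a few group 1 versus group 2 values.
within_group = [1.00, 0.69, 0.69, 0.69, 0.69, 0.85]
between_groups = [0.0, 0.0, 0.08, 0.16, 0.09, 0.10]

t_value, degrees_of_freedom = pooled_t(within_group, between_groups)
```

A large positive t value with this few degrees of freedom still points in the same direction as table 3, but the numbers in table 3 cannot be reproduced without the complete set of pairwise comparisons.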

The genetic diversity of Aα in 35 of the 36 strains of mice was analyzed by RFLP using a 1.2 kb Hind III restriction fragment derived from Aαᵇ (J. Seidman, personal communication). This fragment contains part of the exon which encodes the α1 domain of Aα together with approximately one kb of flanking intron sequences (McIndoe and Wakeland, unpublished observation). Representative examples of the restriction fragment patterns detected at high stringency with this Aα probe are shown in figure 6. Digestion with the restriction endonuclease Eco RI resulted in a 10.6 kb Aα fragment in all of the 35 mouse strains examined, without exception. Upon digestion with the other six restriction endonucleases used in this study, however, a significant number of restriction fragment polymorphisms were noted. Of the 35 mouse strains assessed for Aα, 27 separate alleles are discernible by RFLP analysis, indicating the polymorphic nature of the Aα gene.
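Counting separate alleles from RFLP data amounts to grouping strains whose fragment patterns are identical for every enzyme. The sketch below illustrates that bookkeeping only. The function name, strain names, and fragment sizes are hypothetical, apart from the 10.6 kb Eco RI fragment mentioned above.

```python
from collections import defaultdict


def distinct_alleles(patterns):
    """Group strains that share an identical set of (enzyme, size) fragments."""
    by_pattern = defaultdict(list)
    for strain, fragments in patterns.items():
        by_pattern[frozenset(fragments)].append(strain)
    return sorted(sorted(strains) for strains in by_pattern.values())


# Hypothetical patterns: strain_A and strain_B are identical, strain_C differs
# by one Bgl II fragment and is therefore scored as a separate allele.
patterns = {
    "strain_A": [("Eco RI", 10.6), ("Bgl II", 7.4), ("Sac I", 3.8)],
    "strain_B": [("Eco RI", 10.6), ("Bgl II", 7.4), ("Sac I", 3.8)],
    "strain_C": [("Eco RI", 10.6), ("Bgl II", 4.9), ("Sac I", 3.8)],
}

alleles = distinct_alleles(patterns)
allele_count = len(alleles)
```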

Table 4 presents a matrix comparison of the calculated F values based on the restriction fragments obtained with the total of seven restriction





























Figure 6. Autoradiogram of an Eco RI restriction endonuclease digestion of wild and standard laboratory inbred strains of mice, probed with a 1.2 kb Hind III fragment from Aα kindly supplied by Dr. John Seidman. As with the Aβ probe, this Aα probe is derived from a genomic clone and contains predominantly noncoding sequence.












[Figure 6 autoradiogram. Lane labels are not recoverable from the scan; fragment size markers are legible at 7.4 and 4.9 kb.]





Table 4. RFLP analysis of representative mouse Aα alleles.

Aβ     Mouse
group  strain   B10     AZR 1   JER 4   .M      .D2     MET 1   .BUA16  .CAS2   BEL 1   .BR     .PL

b      B10      -       .61a    .44     .57     .56     .44     .46     .53     .32     .32     .25
b      AZR 1    14/23b  -       .61     .61     .43     .52     .53     .58     .42     .42     .30
b      JER 4    8/18    14/23   -       .48     .33     .67     .31     .63     .32     .53     .12
b      .M       12/21   16/26   10/21   -       .38     .48     .67     .64     .36     .27     .44

d      .D2      10/18   10/23   6/18    8/21    -       .33     .46     .53     .42     .32     .25
d      MET 1    8/18    12/23   12/18   10/21   6/18    -       .31     .63     .42     .42     .25
d      .BUA16   6/13    8/15    4/13    10/15   6/13    4/13    -       .57     .57     .27     .53
d      .CAS2    10/19   14/24   12/19   14/22   10/19   12/19   8/14    -       .40     .40     .35
d      BEL 1    6/19    10/24   6/19    8/22    8/19    8/19    8/14    8/20    -       .40     .23

k      .BR      6/19    10/24   10/19   6/22    6/19    8/19    4/15    8/20    8/20    -       .21
k      .PL      4/16    6/20    2/16    8/18    4/16    4/16    8/15    6/17    4/17    4/19    -

a The fraction homologous (F) value as defined by Nei and Li (1979).
b Number of shared RF/total RF scored for both alleles, based on restriction digestions with Bam HI, Bgl II, Eco RI, Hind III, Pst I, Pvu II, and Sac I.







endonucleases. The mouse strains in this Aα matrix are the same strains listed in table 2 and represent the three evolutionary Aβ groups. Within each of the three Aβ groups, the Aα gene shows lower F values, indicating a lesser degree of homology between members of the same Aβ group. More significantly, the F values of the Aα alleles when comparing between the different Aβ groups are much higher than those of the Aβ alleles, falling very close to the F values seen within an Aβ group. Thus no groups are detected in this RFLP analysis of the Aα gene of 35 different strains of mice.



RFLP Analysis of t Haplotype-Bearing Mice



A number of t haplotype bearing mice have also been examined in this protocol because there exists some controversy regarding their evolutionary origin. To address this question, nine different t haplotype strains were examined by RFLP analysis in a collaborative study with Dr. Joseph Nadeau of Jackson Laboratories. Eight different t haplotype bearing mice, plus two k haplotype controls, were examined with the same seven restriction endonucleases used for the RFLP analysis of the other 28 mouse strains.

Figure 7 shows a representative autoradiogram of an Aβ-probed Sac I restriction endonuclease digestion of the nine t haplotype bearing mouse strains as listed in





























Figure 7. Autoradiogram of a Sac I restriction endonuclease digestion of t haplotype bearing mouse strains, probed with the Aβ probe. Many of the strains are heterozygous with either the k haplotype or the t haplotype, as described in the text, explaining the overabundance of Aβ restriction fragments seen.












the figure and in table 1. The t haplotype strains are often maintained as heterozygotes for the t haplotype because some of them are lethal when homozygous. This may explain the multitude of restriction fragments seen in this figure. To compensate for this, the k restriction fragments (7.85, 4.6, and 2.0 kb) have been subtracted from t haplotypes tw71, tw75, tw12, tw5, and tw32. In addition, the TT6 strains, which may have identical chromosomes, are heterozygous with t6-JR1, so those restriction fragments have been subtracted from TT6 for each restriction endonuclease in order to calculate relevant F values. The 7.3 kb Aβ Sac I restriction fragment is present in strains tw71, TT6, tw75, and tw12. The 5.2 kb Aβ Sac I fragment is found in strains TT6, t6, tw8, tw5, and tw32. The 3.6 kb fragment is present in all eight of the t haplotype strains except for tw75, which is identical to C57BL/10 for all seven restriction endonucleases. Finally, the 1.58 kb Sac I fragment shown in figure 7 is present in strains tw71, TT6, tw75, and tw12, all of which are in the b group.
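The subtraction step described above is a set difference: fragments attributable to the partner haplotype are removed from the heterozygous lane before F values are calculated. The sketch below is an illustration under that reading. The function name is hypothetical, and the observed lane is a hypothetical composite assembled from the Sac I fragment sizes quoted in the text, not a lane read from figure 7.

```python
# Sac I fragments (kb) attributed to the k haplotype in the text.
K_SAC_I_FRAGMENTS = {7.85, 4.6, 2.0}


def subtract_partner(observed, partner):
    """Remove the partner haplotype's fragments from a heterozygous lane."""
    return sorted(set(observed) - set(partner), reverse=True)


# Hypothetical lane for a t haplotype strain heterozygous with k.
observed_lane = [7.85, 7.3, 4.6, 3.6, 2.0, 1.58]

t_specific = subtract_partner(observed_lane, K_SAC_I_FRAGMENTS)
```

One limitation of this simple approach is that a fragment genuinely shared by both haplotypes would also be removed, which would lower the calculated F value for that strain.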

Because various Aβ restriction fragments in the t haplotype strains were shared with members of the three Aβ evolutionary groups, the RFLP's of the t haplotype strains were compared to those of the other 28 mouse strains analyzed. The Aβ alleles of all nine of the t haplotype strains examined by RFLP analysis fit into either the b or the d group, as listed in table 1 and shown with






representative strains in table 5. The F values for their Aβ alleles range from 0.50 to 1.00 within a group, and from 0 to 0.17 between groups in table 5. Again, as with some of the other wild and standard laboratory inbred strains, a few are identical to one another, such as tw12 and tw71, and tw8 and tw32. Not shown in table 5 are the F values comparing the t haplotype strains in the b and d groups to members of the k group. The F values in this comparison ranged from 0 to 0.17; therefore none of the t haplotype strains examined are members of the k group.

Table 6 presents a statistical comparison of the Aβ alleles of the t haplotype strains to determine whether they are more closely related to one another within either of the Aβ evolutionary groups than they are to the other members of their respective Aβ group. As was just mentioned, there is no question that each belongs to one or the other of the two groups. The Student's t test values shown in table 6 indicate that in neither group are the Aβ alleles of the t haplotypes more related to one another than to other members of the same Aβ group at the p < 0.001 level of significance, particularly in the d Aβ evolutionary group.

The Aα alleles of the t haplotype strains were also examined by RFLP analysis with the same seven restriction endonucleases. Figure 8 shows representative Bgl II restriction fragments seen in the t haplotype strains










Table 5. RFLP analysis of the Aβ alleles of t haplotype bearing mice.

Aβ     Mouse
group  strain   B10     tw75    tw71    tw12    TT6     .D2     .RIII   t6-JR1  tw5     tw8     tw32

b      B10      -       1.00a   .50     .50     .50     0       0       0       0       0       0
b      tw75     26/26b  -       .50     .50     .50     0       0       0       0       0       0
b      tw71     12/24   12/24   -       1.00    .91     .08     .09     .09     .09     .09     .09
b      tw12     12/24   12/24   22/22   -       .91     .08     .09     .09     .09     .09     .09
b      TT6      12/24   12/24   20/22   20/22   -       .08     .17     .17     .17     .17     .17

d      .D2      0/26    0/26    2/24    2/24    2/24    -       .87     .56     .64     .48     .48
d      .RIII    0/25    0/25    2/23    2/23    4/23    18/25   -       .72     .75     .67     .67
d      t6-JR1   0/25    0/25    2/23    2/23    4/23    14/25   18/25   -       .58     .83     .83
d      tw5      0/25    0/25    2/23    2/23    4/23    16/25   18/24   14/24   -       .58     .58
d      tw8      0/25    0/25    2/23    2/23    4/23    12/25   16/24   20/24   14/24   -       1.00
d      tw32     0/25    0/25    2/23    2/23    4/23    12/25   16/24   20/24   14/24   24/24   -

a The fraction homologous (F) value as defined by Nei and Li (1979).
b Number of shared RF/total RF scored for both alleles, based on restriction digestions with Bam HI, Bgl II, Eco RI, Hind III, Pst I, Pvu II, and Sac I.





















Table 6. Statistical comparison of Aβ alleles of t haplotype mice within related t haplotype groupings and between t haplotype groupings compared to their related evolutionary Aβ grouping.

               Mean F value ± S.D.
Aβ       Within         Between                     Student's
group    t group        t and Aβ groups     d.f.a   t test

b        .764 ± .230    .614 ± .137         63      2.85
d        .733 ± .179    .606 ± .136         60      2.11

a degrees of freedom





























Figure 8. Autoradiogram of a Bgl II restriction endonuclease digestion of the t haplotype bearing strains of mice, probed with the Aα fragment.















with the Aα probe which has already been described. Although Aα is a polymorphic gene, no grouping patterns were detected by RFLP analysis, as is readily demonstrated in figure 8.



RFLP Analysis of the Divergence
of the Aα and Aβ Genes Within the I-Aᵖ Family



I have compared the organization and structures of the Aα and Aβ alleles present in the I-Aᵖ family members by RFLP analysis with DNA probes specific for each gene. The I-Aᵖ family consists of 6 M.m. domesticus H-2 haplotypes and 1 M.m. castaneus H-2 haplotype derived from Asian and North American wild mouse populations (Wakeland and Klein 1983). Their grouping of the 7 mouse strains into the I-Aᵖ family is based on similarities in the antigenic phenotypes of the respective class II molecules. High pressure liquid chromatography (HPLC) tryptic peptide fingerprint comparisons of the Aα and Aβ subunits of the I-Aᵖ family members have demonstrated close structural relationships among the I-A molecules encoded by alleles in the I-Aᵖ family (McConnell et al. 1986). A gradation in the relatedness of the 7 I-A molecules became apparent from these tryptic peptide fingerprint data.

Representative examples of the restriction fragment patterns detected by probing at high stringency with Aβ are presented in figure 9. Digestion with endonuclease





























Figure 9. Autoradiogram of Eco RI and Bam HI restriction endonuclease digestions of the seven members of the I-Aᵖ family, probed with Aβ, demonstrating the RFLP relatedness of the five core strains.




A RESTRICTION FRAGMENT ANALYSIS OF THE EVOLUTIONARY RELATIONSHIP OF THE MURINE H-2 Aα AND Aβ ALLELES

By

THOMAS JOHN MCCONNELL

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA
1986


Copyright 1986 by Thomas John McConnell

This dissertation is dedicated to my wife Ann, my parents, and to Bosco and Max.

ACKNOWLEDGEMENTS

I would like to express my appreciation to the many people who have helped me complete this dissertation. Most especially I would like to thank Dr. Edward Wakeland for his help and support throughout this project. As I meet more scientists from other parts of the country, my respect for Dr. Wakeland only increases. My appreciation is also extended to Drs. Ammon Peck and Arthur Kimura who have been helpful through their comments and friendship. My thanks are extended to Dr. Noel Maclaren, without whose help and support I might not have had the opportunity to join this excellent department. Also, my thanks are extended to Dr. Edward Siden whom I respect for his questioning mind and his enthusiasm for science. My thanks are extended to Dr. Linda Smith for her friendship and for the reagents we have been able to borrow back. My thanks are extended to Dr. William Winter for his friendship and scientific curiosity in the laboratory. And my thanks are extended to Dr. David Cooper for his friendship. The students and research personnel in the laboratory have been a particularly excellent group.

William Talbot and Randy Horwitz have been two of my best friends through the past few years, for which I would like to thank them. I would like to thank Vickie Henson for her friendship and support throughout the time we have both been students in Dr. Wakeland's laboratory. Stefen Boehme, Roy Tarnuzzer, and Richard McIndoe have all provided me a tremendous amount of support and some excellent lunch trips. Marge Price-LaFace has been an integral part of this project, for which I would like to thank her. Cheryl Zack has offered encouragement to me during the writing process, for which I would like to thank her. Many thanks are offered to people in other laboratories who have been very good friends. My thanks are extended to Lena Dingier, a very good friend I hope to see in the future. My thanks are extended to Jane Strandberg and my congratulations to her on her choice of an excellent graduate program. My thanks are extended to Judith Nutkis, Jim Xiang, and to Dan Cook, who have been supportive friends during my graduate studies. And my thanks are extended to Drake LaFace and Kathy Edmundson for their friendship. For their help in last minute typing and printing rushes, I would like to thank Crystal Grimes and Rose Mills.

TABLE OF CONTENTS

ACKNOWLEDGEMENTS
ABSTRACT
INTRODUCTION
REVIEW OF THE LITERATURE
    Major Histocompatibility Complex Structure
    Polymorphism of the Class II Genes
    Variation in Wild Mice
MATERIALS AND METHODS
    Mice
    Isolation of Genomic DNA
    Restriction Endonuclease Digestion and Agarose Gel Electrophoresis
    Capillary Transfer and Hybridization
    Data Analysis
RESULTS
    RFLP Analysis of the Aα and Aβ Genes of Standard Laboratory Inbred and Wild Mice
    RFLP Analysis of the Divergence of the Aα and Aβ Genes Within the I-Aᵖ Family
    RFLP Analysis of the Divergence of the Aα and Aβ Genes Within the I-Aᵏ Family
DISCUSSION
    RFLP Analysis of the Genomic Structures of Aα and Aβ Alleles
    Comparisons of Class II Molecules and Class II Gene RF Genotypes within the I-Aᵖ and I-Aᵏ Family
REFERENCES
BIOGRAPHICAL SKETCH

Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

A RESTRICTION FRAGMENT ANALYSIS OF THE EVOLUTIONARY RELATIONSHIPS OF THE MURINE H-2 Aα AND Aβ ALLELES

By

Thomas John McConnell

August 1986

Chairman: Dr. Edward K. Wakeland
Major Department: Pathology

The presence of three Aβ evolutionary groupings, designated Aβᵇ, Aβᵈ, and Aβᵏ, has been established on the basis of a restriction fragment length polymorphism (RFLP) analysis of the Aβ genes of 37 independently derived mouse H-2 haplotypes. All mice analyzed, which included laboratory inbred and wild derived haplotypes of Mus musculus domesticus, Mus musculus musculus, and Mus musculus castaneus, were found to have an Aβ allele which could be related to one of these three Aβ groupings. No similar grouping in the RFLP analysis of the Aα alleles of these same haplotypes, however, was possible. All of the 8 t haplotypes studied are found to fall into either the Aβᵇ or Aβᵈ evolutionary group based on the same RFLP analysis.

Restriction fragments were detected with a 5.8 kb I-Aβᵈ genomic DNA probe and a 1.2 kb I-Aαᵇ genomic DNA probe. The I-A alleles in each group have 50% or more of their restriction fragments in common. Alleles in separate groups share less than 20% of their restriction fragments. The polymorphism of the Aα and Aβ genes as detected by RFLP analysis did not always correlate with known protein sequences. In an RFLP analysis of I-Aᵏ protein related H-2 haplotypes, 5 mouse strains that were known to be closely related to one another by serology and tryptic peptide mapping were found to fall into two different Aβ evolutionary groups. There are also examples of 2 Aβ alleles being very different at the protein level, but very similar when their RFLP's are compared. Possible explanations of these data include the existence of mechanisms which allow or promote gene conversion or some form of intragenic recombination occurring in or near the introns, possibly even exon shuffling. The presence of three different Mus subspecies in one of the Aβ evolutionary groups suggests that these groups arose prior to the divergence of the subspecies, approximately one million years ago.

INTRODUCTION

The major histocompatibility complex of the mouse, known as the H-2 complex, is a cluster of loci encoding proteins whose function in part includes the immune defense of the animal in its natural environment. There are three classes of genes in the H-2 complex: the class I genes (the strongest histocompatibility genes), the class II genes (genes encoding proteins which are involved in the presentation of foreign antigen to the regulatory T lymphocytes) (Benacerraf 1981; Klein 1975; Klein 1979), and the class III genes (which encode the complement components). Polymorphisms in the class I and class II genes, i.e., the presence of multiple allelic forms of the gene at frequencies of greater than 1% in the general population, are thought to help a species to survive the continuous onslaught of environmental pathogens. The polymorphic nature of the class II Aα and Aβ genes, as related to evolutionary genetic mechanisms, is the subject of this dissertation. Genetic and biochemical studies of class II molecules have identified two molecules, designated I-A and I-E, which normally are expressed on the surfaces of antigen presenting cells and B lymphocytes (Cullen et

al. 1976; Uhr et al. 1979). The molecular cloning and sequencing of a large portion of the murine I region has supplied extensive information on the organization of the murine class II genes and the molecules they encode (Benoist et al. 1983a; Benoist et al. 1983b; Choi et al. 1983; Malissen et al. 1983; Steinmetz et al. 1982). The Aα, Aβ, Eα, and Eβ genes are single copy genes present on a segment of about 110 kb of DNA in the I region of the H-2 complex (Steinmetz et al. 1982). The Aα and Aβ genes each encode transmembrane glycoproteins which noncovalently associate with one another on the cell surface to form the heterodimeric I-A molecule. Similarly, the Eα and Eβ genes encode transmembrane glycoproteins which form the heterodimeric I-E molecule. Studies of the DNA sequence of these class II genes have established that they all have a common evolutionary origin and that they represent one branch of the immunoglobulin supergene family (Benoist et al. 1983a; Choi et al. 1983; Malissen et al. 1983). Each class II gene consists of at least 6 exons and occupies more than 5 kb of genomic DNA. The Aα, Aβ, and part of the Eβ genes are located within a portion of the I region which exhibits extensive sequence polymorphism. This region extends 5' from a recombinational hot spot located within the central intron of Eβ (Steinmetz et al. 1984). The Eα gene is located 3' to Eβ in a "conserved tract" of the I region, and exhibits much less polymorphism than Aα, Aβ,

or Eβ. The evolutionary mechanisms responsible for the production and maintenance of polymorphic and conserved tracts within the I region are unknown. Previous studies on the genetic polymorphisms of class II genes in wild mouse populations have provided some insights into the genetic mechanisms responsible for their diversification (Wakeland and Darby 1983; Wakeland and Klein 1979a; Wakeland and Klein 1979b; Wakeland and Klein 1983; Wakeland et al. 1985). Serologic and structural analyses of the I-A molecules expressed among a collection of H-2 haplotypes derived from wild mice led to the definition of "families" of I-A alleles (Wakeland and Klein 1979a; Wakeland and Klein 1983). The I-A alleles within the same family encode antigenically similar molecules that are identical in more than 90% of their tryptic peptides when compared by high pressure liquid chromatography tryptic peptide fingerprinting (Wakeland and Darby 1983; Wakeland and Klein 1979b; Wakeland and Klein 1983). The I-A molecules encoded by alleles in separate families have distinct antigenic phenotypes and are identical in less than 70% of their tryptic peptides. An analysis of over 40 H-2 haplotypes derived from laboratory and wild mouse strains led to the definition of 8 distinct I-A families (Wakeland and Klein 1983). This dissertation describes the RFLP analysis of the Aα and Aβ genes of 37 standard laboratory inbred and wild

mouse strains from the 3 separate subspecies Mus musculus domesticus, Mus musculus musculus, and Mus musculus castaneus, and includes wild mice bearing t haplotypes. Based on the data generated using 6 to 7 restriction endonucleases (RE) on each haplotype examined for Aβ, all of the homozygous haplotypes examined could be placed into one of three evolutionary groupings. Because the Aβ genes of the subspecies Mus musculus musculus, Mus musculus domesticus, and Mus musculus castaneus are in one Aβ evolutionary group, these evolutionary groups probably predate subspeciation in the mouse, which occurred approximately 1 million years ago. Since both the Aα and Aβ probes contain approximately 80% noncoding sequence, and in some instances the protein sequences or tryptic peptide maps could be compared between the haplotypes being examined, the Aβ evolutionary groups that I have found appear to be determined predominantly by noncoding sequence. Finally, the Aα and Aβ genes of wild t haplotype bearing mice examined by RFLP analysis also fell into the separate Aβ evolutionary groups.

REVIEW OF LITERATURE

The major histocompatibility complex (MHC) of the mouse, also known as the H-2 complex, has been an area of fascination in modern genetics since its discovery laid the groundwork for serology as a tool to study genetics (Gorer 1936; Gorer 1938). The groundwork for the discovery of the genetic basis of the H-2 complex had been established by Little and Tyzzer (1916) using inbred mouse lines. From the initial characterization of erythrocyte antigens, through protein biochemistry and sequencing, and up to modern genetic engineering, the MHC has continued to allow us to learn about genetics, natural selection, cellular physiology, and other far-reaching areas of biology. The class II genes of the H-2 complex are located in the I region, so named because that region was originally defined by the differential ability of inbred mouse strains to mount an immune response to certain simple antigens (Martin et al. 1981; McDevitt and Chinitz 1969; McDevitt and Sela 1965; Martin et al. 1971). The actual mapping of the immune response genes within the H-2 complex was accomplished with the use of inbred congenic and recombinant mouse strains (Benacerraf and

McDevitt 1972; McDevitt et al. 1972), although the actual identity of these gene products was in question for a number of years. The class II genes encode proteins that restrict the recognition of foreign antigen by the regulatory T lymphocytes to those antigen presenting cells which are of the same allelic form, and are thus critical for the development of a normal immune response. This literature review will focus primarily on class II gene structure and possible functional correlates of the genetic structure.

Major Histocompatibility Complex Structure

General Organization and Protein Structure

The murine major histocompatibility complex (or H-2 complex) encompasses about 2 centimorgans of DNA, which may be equivalent to as much as 2000-4000 kb of DNA (Hood et al. 1982; Klein 1975), and contains at least 3 classes of immunologically related genes, denoted class I, class II, and class III (Klein 1975; Snell et al. 1976). The molecules encoded by the class I genes are of 2 general types. The class I genes designated K, D, L, and R encode the classical transplantation antigens located on the cell surface of most nucleated cells and are known to be very polymorphic. These molecules are primarily involved in the restricted recognition of some viral antigens by cytotoxic T lymphocytes (Zinkernagel

1979). The other general type of class I genes, designated Qa and Tla, are expressed on nucleated blood cells (Qa) or on thymocytes and certain leukemias (Tla) (Michaelson et al. 1983), are much less polymorphic, and have functions that are not yet known (Flaherty 1980). The Qa and Tla genes are located telomeric to the D, L, and R genes and number over 30 (Winoto et al. 1983). The molecular structure of both types of class I molecules consists of a 40-45,000 dalton membrane bound glycoprotein of approximately 350 amino acids which form 3 extracellular domains of about 90 amino acids per domain. A fourth domain in the form of β2 microglobulin, a 12,000 dalton polypeptide encoded on chromosome 2 in the mouse, noncovalently associates with the class I gene product, possibly in a stabilizing role (Klein et al. 1983b). The relative locations of the class I genes to the class II and class III genes, as well as their general protein structure, are illustrated in figure 1. The class III genes of the H-2 complex encode complement components such as the C2, Bf, Slp, and C4 genes. The linkage association of some of the complement genes to the MHC of different animals varies (Alper 1981). While there has been some argument for the inclusion of the complement loci in the MHC on the basis of the MHC and linked loci possibly evolving as a genetic unit (Bodmer 1976), Klein et al. (1983b) argue against a


purely physical inclusion of the complement genes as part of the MHC in their review article. The class II genes map between the K and S regions of the H-2 complex (figure 1). There are 4 functional class II genes denoted Aα, Aβ, Eα, and Eβ, as well as the pseudogenes Aβ3, Aβ2, and Eβ2 (Steinmetz et al. 1986; Widera and Flavell 1985). The overall organization of the I region and its protein products are illustrated in figure 1. The class II glycoproteins are made up of a 28,000 dalton β chain of about 230 amino acids, and a 34,000 dalton α chain of about 220 amino acids (Klein et al. 1983b). Both the α and the β chains consist of five protein domains including a hydrophobic leader peptide of about 25 amino acids absent in the mature cell-surface form of the molecule, two extramembrane domains of approximately 90 amino acids each (α1α2 or β1β2), a hydrophobic transmembrane segment of about 25 amino acids, and a cytoplasmic tail region containing a high proportion of charged residues (Mengle-Gaw and McDevitt 1985). Each of the α1, α2, β1, and β2 domains is formed by a disulfide cystine bridge. The two α and two β domains noncovalently associate to form the I-A molecule. The presence of four extramembrane protein domains appears to be a stabilizing configuration for both the class I and class II molecules. The structuring of the MHC protein molecules into domains reflects the basic organization of the encoding DNA into exons and introns.

Structure of the Class II Genes

Both the Aβ and Eβ genes are made up of six exons, one exon for each of the five protein domains, plus one exon encoding the 3' untranslated region (Saito et al. 1983). Both the Aα and Eα genes, though very similar to the β genes, are made up of only five exons. This difference is due to the transmembrane and cytoplasmic tail regions of the α chains being encoded by a single exon (Mathis et al. 1983; McNicholas et al. 1982). It is important to note the polymorphic nature of the Aβ, Aα, and Eβ genes, although they will be discussed in detail in a later section. The presence of multiple allelic forms of the Aβ and Eβ genes may imply a unique role for the encoded proteins in the ability of a population to respond to an antigen. The actual molecular interplay between the class II molecule, the antigen, and the T cell receptor is still largely unknown.

Brief History of the I Region Loci

The I region, which contains the class II genes, has had a relatively short, but turbulent, history. As has already been mentioned, the I region was discovered and mapped by the differential ability of inbred and congenic animals to respond to certain simple antigens (Benacerraf and McDevitt 1972; Martin et al. 1971; McDevitt and Sela 1965; McDevitt et al. 1972). Even then, the I region gene products were mistaken for being

related to, but not a part of, the MHC and somehow related to the T cell receptor (McDevitt and Chinitz 1969). Historically, five subregions were defined in the I region: the A, B, J, E, and C subregions (Murphy 1981). The B locus was originally postulated by Lieberman et al. (1972) to explain what appeared to be a response to an allotypic determinant on an IgG2a molecule known as MOPC173. Responses to several other antigens were mapped to the B locus, including lactate dehydrogenase B (Melchers et al. 1973), staphylococcal nuclease (Lozner et al. 1974), oxazolone (Fachet and Ando 1977), and H-Y (Hurme et al. 1978). But the data from the different laboratories have not always corroborated the existence of the B locus, and several alternative explanations have in fact been offered based on an interplay of the A and E loci (Baxevanis et al. 1981). Moreover, no protein product has been detected, and sequencing data have not demonstrated the presence of any gene corresponding to the B locus (Hood et al. 1983; Steinmetz et al. 1982; Steinmetz et al. 1986). The C subregion was first discovered with an H-2^h2 anti-H-2^h4 antiserum (David and Shreffler 1974). Other evidence in support of the existence of the C subregion was found by Rich et al. (1979a; 1979b) with the presence of C-specific antibodies that reacted with a suppressor factor produced in a mixed lymphocyte culture. The C

locus was mapped, using recombinant inbred strains, telomeric to the Eα locus and centromeric to the genes encoding the C4 component of complement. Other investigators have been unable to confirm many of the results dealing with the C locus and so question its existence (Juretic et al. 1981; Livnat et al. 1973). In the most current molecular cloning data of this region (Steinmetz et al. 1986) there is no evidence for the C locus in the 150 kb of DNA telomeric to the Eα locus, although the entire chromosomal segment in question has not been characterized. And once again, no protein product has been isolated from the C locus. The existence of both the B and the C loci is based entirely on their possible regulatory effect on the immune response of the mouse. The J subregion is the third subregion from which no protein product has been well characterized, although anti-J antiserum and monoclonal antibodies have been produced by several laboratories (Kanno et al. 1981; Murphy 1978; Waltenbaugh 1981). In fact, the J locus has been perhaps the most publicized single gene in immunology, and the most controversial, other than the T cell receptor. Thousands of papers have been published on either the J product or its role as the class II element controlling the T suppressor cells. The J subregion was originally defined by reciprocal antisera raised between inbred congenic mouse strains B10.A(3R)

and B10.A(5R) as well as the mouse strains B10.HTT and B10.S(7R). The same mouse combinations mapped the location of the J subregion between the A and E loci (Murphy et al. 1976; Murphy 1978). These alloantisera recognize soluble suppressor factors secreted by these cells as well as polymorphic determinants expressed on T suppressor cells (Krupen et al. 1982; Murphy 1978; Tada et al. 1976; Taniguchi et al. 1980; Waltenbaugh 1981). Although no protein has been positively identified for the J locus, Taniguchi et al. (1982) report finding a 25,000 dalton protein using an anti-J monoclonal antibody. In the first extensive DNA level characterization of the murine I region, Steinmetz et al. (1982) found no evidence for the existence of the J locus within the I region. Owing to a hotspot for recombination at the 3' end of the Eβ gene, these authors were able to map the suspected position of J between the A and E loci and found that, if it was located there, it would have to be encoded by less than 3.4 kb of DNA. In further DNA cloning analysis of this region and more RFLP mapping of additional intra-I region recombinants, Kobori et al. (1984) shortened the distance to about 2.0 kb, making it even more unlikely that J might be encoded here. Related experiments have shown that cloned DNA encompassing this critical 3.4 kb segment fails to hybridize to RNA from J positive T suppressor cell lines

(Kronenberg et al. 1983), thus making unlikely the presence of an I region encoded J gene product. Alternative explanations for the location of J have been offered (Hayes et al. 1984; Klyczek et al. 1984), but have not been substantiated. The A and E subregions have survived the despoilment of the I region which occurred with the onslaught of molecular analysis of the I region DNA. These two subregions contain genes which encode four cell surface protein products which have been identified by serological and biochemical methods (Jones 1977; Uhr et al. 1979). The A subregion contains at least three loci that encode class II molecules which are expressed on the cell surface: Aβ, Aα, and Eβ (Jones et al. 1978). The E subregion contains a fourth locus that is known to encode a molecule expressed on the cell surface, Eα (Jones et al. 1978). It is important to note that with the molecular characterization of the genome containing the I region, the nomenclature of I "subregion" is no longer appropriate. The term originally defined the I region loci, several of which are now generally believed to have been artifactual for the reasons listed above. Recombinational events are more accurately represented when the class II genes are viewed as part of a continuum of DNA rather than through the archaic concept of subregions. Henceforth in this literature review individual loci shall be referred to by their gene designation, for

example, Aβ. These loci, plus other class II loci recently discovered in the genome, will now be discussed in more detail.

Cloning and Sequencing of the Murine Class II Genes

The cloning and sequencing of the murine class II genes were based in part on technical advances and new approaches made while isolating the human class I (Ploegh et al. 1980; Sood et al. 1980) and class II (Auffray et al. 1982; Korman et al. 1982a, 1982b; Larhammar et al. 1982a; Lee et al. 1982a; Yang et al. 1982) genes. Protein sequence comparisons done earlier, reviewed by Nathenson et al. (1981), have already established the homology between humans and mice that is seen when comparing DNA sequences of the class I genes. Similar work on the class II gene products also reveals strong homologies between mice and humans (Allison et al. 1978; Cook et al. 1979). More indicative of the evolutionary stability of the class II gene products in a dynamic molecular environment is the maintenance of the domain structure as the basic functional unit of the molecule. The most revealing features of the class II proteins in terms of their evolutionary origins are the aforementioned domain structure and their sequence relatedness to other immunological molecules. There is a consistent correlation between all the class II genes of each structural domain being encoded by a separate exon,

as there is for the class I, β2-microglobulin, Thy-1, and antibody genes (Kaufman et al. 1984). Domain structure alone does not indicate homology between proteins, since similar domains have been found in such proteins as superoxide dismutase (Richardson et al. 1976), but taken together with the nucleotide sequence homology found between these immunological molecules (Benoist et al. 1983a; Bregegere et al. 1981; Korman et al. 1982b; Larhammar et al. 1982b; McNicholas et al. 1982; Parnes and Seidman 1982; Steinmetz et al. 1981), there is strong evidence for the existence of a common ancestral gene (Peterson et al. 1975). Other similarities between members of the immunoglobulin supergene family include similar placement and size of the disulfide bridges and RNA splicing according to the GT/AG rule (Hood et al. 1983). The T8 cell surface glycoprotein expressed by most cytotoxic T lymphocytes has also been determined to belong to the immunoglobulin supergene family by domain structure and cDNA sequencing (Sukhatme et al. 1985), as has the T4 molecule (Maddon et al. 1985). Thus, evolution through gene duplication and divergence (Ohno 1970) may be an ancient mechanism for the immune system gene family. Although the murine class II genes have an exon-intron organization that corresponds to the domain organization of the expressed protein product, the murine class II α gene structure differs from that of the murine

class II β gene. A large intron separates the exons encoding the signal peptide from the first domain in the class II α gene, and the 3' untranslated region is split between two exons, but the transmembrane and cytoplasmic regions are encoded by a single exon (Benoist et al. 1982; McNicholas et al. 1982). This genetic structure is similar to that of the murine β2-microglobulin gene (Suggs et al. 1981). In contrast, the murine class II β genes have a large intron between the exons encoding the first and second extracellular protein domains. The transmembrane, cytoplasmic, and 3' untranslated regions are split over three exons (Larhammar et al. 1983a; Saito et al. 1983), an arrangement more similar, though not identical, to the class I heavy chain gene structure (Malissen et al. 1982). The genomic structure of the Aβ gene can be seen in figure 2. As mentioned above, because of the common structural and sequence homologies between the members of the immunoglobulin supergene family, there is a strong possibility that all have evolved from a common ancestral gene. It is important to keep in mind that one cannot distinguish between convergent and divergent evolution (Hood et al. 1983). The membrane proximal domains of these molecules have the most sequence homology and are therefore even more likely to have a common origin. But the nearly identical external domain size and disulfide bridge placement of the different

[Figure 2. Genomic structure of the Aβ gene.]

members of the immunoglobulin supergene family argues strongly for a common ancestral gene evolving in a divergent manner following gene duplication.

Genome Organization of the Class II Genes

The first evidence at the DNA level of the linkage of class II genes was provided by Steinmetz et al. (1982) with their cloning of about 230 kb of DNA isolated from a BALB/c sperm DNA cosmid library. The cosmid library was first probed with a DR α cDNA probe (characterized by Wake et al. 1982), and then probed with single copy genetic fragments subcloned from contiguous cosmids. In this manner, Steinmetz et al. were able to "walk" along the chromosome as long as there were cosmid clones in the genomic library that contained overlapping fragments of genomic DNA, identifying approximately 200 kb of linked DNA in the process. The telomeric boundary of the I region was defined as the structural gene for the C4 complement component, mapping about 90 kb downstream from Eα. The centromeric boundary of the I region was not determined in this particular publication, but several other important discoveries were made. First, four class II genes were identified, one as a possible pseudogene because a 5' probe failed to hybridize to the gene. Second, the BALB/c genome was determined to contain two α and four to six β genes, a finding which has been borne out in more recent work (Widera and Flavell 1985). Third,

Steinmetz et al. (1982) also reported that the Eα and Eβ genes are present in strains of mice which do not express an E molecule, e.g., the b and s haplotypes, which express the protein in the cytoplasm, and the f and q haplotypes, which do not express the protein at any detectable level. This finding has led to more work on control of gene expression in the class II gene system. Fourth, correlation of the molecular map with the serologically and genetically determined map of the I region led scientists to question the existence and location of the B and J genes. Finally, a recombinational hotspot was identified where nine independently generated recombinant mouse lines were all found to have recombined within the same 3.4 kb of DNA. Kobori et al. (1984) have further characterized six of the murine I region MHC recombinants using Southern blot DNA analysis to limit the recombination region in these strains to less than 2.0 kb of DNA. This 2.0 kb contains part of the intron between the first and second protein encoding exons, and part of the second domain encoding exon. Figure 3 shows the most recent concept of the I region at the DNA level. Other class II genes which have been characterized recently are Aβ2, Aβ3, and Eβ2. Larhammar et al. (1983a) identified Aβ2 and located it about 20 kb centromeric to Aβ. Larhammar et al. (1983b) sequenced the genomic Aβ2 of the b haplotype isolated from cosmid clone 19-101. The exon-intron

structure of Aβ2 is the same as for the other class II β genes. The predicted amino acid sequence of Aβ2, as interpreted from the nucleotide sequence, shows only up to 56% homology to the other β chains, including the human class II β chain proteins. These other β chains typically show up to almost 80% homology to each other. On this basis, the Aβ2 second domain sequence was determined to be the most divergent member of the class II β gene family. Larhammar et al. (1983b) also cloned and sequenced a cDNA clone, proving that transcription of Aβ2 does occur, although the product was not detected on the cell surface and some possible splicing errors were detected. When Aβ2 was used as a probe to hybridize blots of other strains, a lesser degree of polymorphism was detected. The latest class II gene, and possibly the last in the I region, is Aβ3. Widera and Flavell (1985) isolated Aβ3 from a b haplotype cosmid library and were able to link it 75 kb telomeric to the class I H-2K region. The nucleotide sequence of the β2 domain of Aβ3 has homology to the immunoglobulin-like domains of other class II genes, and 83% homology to the human SB β gene. An examination of the nucleotide sequence also showed a deletion of 8 nucleotides which makes translation of this gene into a functional protein impossible. The existence of Aβ3 in another haplotype was confirmed by Steinmetz et al. (1986) with their cosmid cloning of the BALB/c Aβ3. Whereas Widera and Flavell (1985) linked the

[Figure 3. The I region at the DNA level.]

K region with Aβ3, Steinmetz et al. (1986) were also able to link Aβ3 with the rest of the I region, effectively providing a 600 kb continuous DNA map of the K and I regions. Therefore, as illustrated in figure 3, the order of the genes discussed is K2, K, Aβ3, Aβ2, Aβ, Aα, Eβ, Eβ2, and Eα. In addition, Steinmetz et al. (1986) localized two short regions of DNA which had recombination frequencies of 0.6% to 1.5% between genes from Mus musculus castaneus and standard laboratory mouse strains (Mus musculus domesticus). Such hotspots for recombination may be instrumental in the generation of polymorphism in the class II genes.

Polymorphism of the Class II Genes

The most unusual feature of the MHC in the murine system, or in vertebrate systems generally, is the extensive polymorphism of certain of the class I and class II genes. Of the class II molecules, the β chain proteins have been known to be the most polymorphic, and Eα the least polymorphic (Klein et al. 1983a). The polymorphism found in the class II genes has in general agreed with that found in the proteins, but Aα has been determined to be more polymorphic than many originally thought (Benoist et al. 1983a). As mentioned earlier, this unique degree of polymorphism implies a unique biological role for the encoded glycoproteins.

Biological Role of Polymorphism

The class II molecules are involved in the communication between immunocompetent (thymic education aside) cells to induce and maintain a defensive reaction to what the body perceives as a foreign invasion. The class II molecules are key elements in the activation of an immune response via regulatory T lymphocytes. The discovery and characterization of the class II molecules have already been described in detail. The interaction between the antigen, the class II glycoprotein, and the T cell receptor determines whether an animal is able to mount an immune response to a particular antigen. The T cells apparently cannot recognize free antigen as the B cells can (Moller 1978; Moller 1980). The function of the class II glycoprotein, then, is to enable the T lymphocytes to recognize a foreign antigen so that they can respond appropriately. This process is known as MHC restriction. Because the T cell receptor only recognizes the class II glycoprotein which is of the identical allelic haplotype as itself, the process is also sometimes referred to as I-region restriction or self-MHC restriction (Klein et al. 1981; Nagy et al. 1981). The T cell receptor, therefore, must recognize and form a ternary complex with two ligands (Schwartz 1985). One of these ligands is the antigen itself, which is usually a partial degradation product of an antigen

presenting cell. The other ligand is the class II gene product expressed as a transmembrane glycoprotein present most abundantly on B lymphocytes and antigen presenting cells (Asano et al. 1983; Hammerling et al. 1974; Katz et al. 1973; Kindred and Shreffler 1972; Nagy et al. 1981). Each member of the ternary complex possesses a precise and high binding affinity for each other member of the complex; otherwise, the biological triggering of the T helper cell, and consequently the stimulation of an antibody response, does not occur. It is in this specificity of binding that the role of polymorphism of class II glycoproteins can best be understood. The extent of the polymorphism of the class II gene products, although very high, is not nearly enough to explain the ability of the class II glycoproteins to control immune responsiveness to the enormous number of foreign antigens to which an animal is able to respond. The precise mechanism by which the class II glycoproteins trigger specific immune responses is still not known. There is evidence that the T cells can differentiate class II gene products in association with molecules which are subtle structural variants, as with insulin (Rosenthal 1978), lysozyme (Adorini et al. 1979), and cytochrome c (Solinger et al. 1979). The polymorphic nature of the class II glycoproteins might be explained by the fine balance the immune system

seems to maintain. There are many animal models, and human models, of diseases caused by the immunocompetent cells attacking self. Even the prevalence of allergies and asthma in the human population suggests that control of the immune system is relatively easily thrown off. If so, then the presence of many haplotypes in a population would mean that a given animal with at the most two haplotypes would be less likely to react with an innocuous antigen, and thus lower that animal's selective advantage. However, having many alleles with which to best defend the species against a threatening plague would be of tremendous selective advantage to the population at large. If one class II glycoprotein could not present a particular dangerous antigen to the immune system, then perhaps another allele in the population could (Zinkernagel 1979). Although selection operates on the individual level, mechanisms which would enhance the introduction of new alleles could have a selective advantage. The proof of a postulate such as the one suggested above awaits appropriate experimental design and statistical analysis. The possible heterozygous advantage involving class II genes also needs to be taken into account. Nevertheless, the persistence of the polymorphism through evolutionary time suggests its importance in the survival of the species and the importance of the mechanisms which have generated and maintained the polymorphism.

Mechanisms of Generating Polymorphism

If the class II genes are viewed as being in a dynamic state of flux in evolutionary time rather than being static structures, then visualizing the genetic mechanisms which have generated the polymorphisms, and the selective pressures which have maintained them, is more revealing. The entire immunoglobulin gene superfamily, which includes the class II genes, appears to have arisen by gene duplication and divergence. Two of the most popular possibilities for divergence of the class II genes into polymorphic alleles are unequal crossing over and gene conversion. Unequal crossing over and gene conversion, originally found in fungi (Radding 1978), are mechanisms whereby DNA sequence is transferred or copied from one gene to another. Although by definition the DNA sequence can be transferred from and to genes anywhere in the genome, transfer is much more probable within tandem multigenic or multiallelic families (Baltimore 1981; Egel 1981; Robertson 1982; Slightom et al. 1980). Pairing between partially homologous sequences during meiosis or mitosis would occur, followed by mismatch repair which converts part of one sequence to the other. The primary evidence for gene conversion is the discovery of clusters of substitutions, especially at the DNA level. While these "tracts" of nucleotide substitutions have been clearly demonstrated in class I genes (Mellor et al. 1983; Weiss

et al. 1983a; Weiss et al. 1983b), there is also evidence (Mengle-Gaw et al. 1984; Widera and Flavell 1984), though not as thoroughly documented, for a similar mechanism acting on the class II genes. Regions of allelic hypervariability have been reported in the murine Aα gene (Benoist et al. 1983b), suggesting that this gene is more polymorphic than previously thought by some (Cullen et al. 1976; Klein and Figueroa 1981; Klein et al. 1981), though a few scientists had evidence for an unexpectedly high degree of polymorphism for the Aα gene (Cecka et al. 1979; Cook et al. 1979). Benoist et al. (1983b) sequenced a total of six different Aα alleles, from the k, d, b, f, u, and q haplotypes, and compared their cDNA sequences. Not only did they find a surprising degree of polymorphism, they also found that the amino acid substitutions were clustered in the first domain exon. In fact, many of the substitutions were localized at a few highly variable positions within the first domain exon. Also, 40 out of 46 dinucleotide changes, which are indicative of nucleotide sequence fluidity, occur in the first domain exon. A translation of the cDNA nucleotide sequence into the corresponding amino acid sequence for the six haplotypes reveals not only the polymorphism of domain one, but the corresponding Kabat-Wu variability plot (Kabat et al. 1979) also shows two regions of "allelic hypervariability" at residues 11-15 and at

56-57. These regions, however, are not nearly as variable as the immunoglobulin hypervariable regions. The polymorphism of Aα still leaves open the question of how it was generated. Because Aα is not a member of a large gene family it might not be a good candidate for gene conversion, although one must still consider interallelic gene conversion. Benoist et al. (1983b) do mention the likelihood of interallelic conversion in heterozygous wild mice, which will be discussed later in this dissertation, but they do not feel it sufficient to explain the generation of polymorphism in Aα. The Aα gene lacks the clustering of nucleotide substitutions, and a clear donor of sequence material has not yet been detected, that would make it a good candidate for gene conversion (Benoist et al. 1983a). They offer instead a hypothesis of a gene duplication event followed by slow drift in one of the copies, with the other copy acquiring a degree of sequence instability which would lead to a high rate of point mutations. Data presented in the results of this dissertation tend to support some type of conversion event over the simple accumulation of point mutations. Regions of allelic hypervariability have also been reported for Eβ (Mengle-Gaw and McDevitt 1983). Again, these regions were found only in the first domain and correspond to the hypervariable regions found both in the alleles at a particular locus and between β loci.
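
The Kabat-Wu variability measure referred to above can be illustrated with a minimal sketch. It assumes the usual definition of the index at each aligned position: the number of different amino acids observed, divided by the frequency of the most common amino acid. The function name wu_kabat_variability and the toy alignment below are hypothetical and chosen only for illustration; they are not the Aα sequences of Benoist et al. (1983b).

```python
from collections import Counter


def wu_kabat_variability(aligned_seqs):
    """Return the variability index for each column of an alignment.

    Assumed definition: (number of different residues at the position)
    divided by (frequency of the most common residue at the position).
    All sequences must already be aligned and of equal length.
    """
    if not aligned_seqs:
        raise ValueError("at least one sequence is required")
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")

    n = len(aligned_seqs)
    variability = []
    for column in zip(*aligned_seqs):
        counts = Counter(column)
        most_common_count = counts.most_common(1)[0][1]
        # distinct / (most_common_count / n), written to avoid a second division
        variability.append(len(counts) * n / most_common_count)
    return variability


# Hypothetical toy alignment of six short peptides (not real allele data).
TOY_ALIGNMENT = ["ACDE", "ACDE", "ACDF", "ACGF", "ACHF", "ATHF"]
```

An invariant position scores 1, and the score rises both with the number of different residues seen and with how evenly they are distributed, so a run of adjacent high-scoring positions is what appears as a region of "allelic hypervariability" in such a plot.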

Clusters of polymorphism separated by sequences of nucleotide homology, found both among the Eβ alleles and between the loci, suggested to the authors the possibility that this polymorphism was generated by a gene conversion type event. Genomic clones from three different haplotypes, the b, d, and k haplotypes, have been isolated and their DNA sequences compared to one another (Choi et al. 1983). While the overall structural organization of these genomic clones was determined, unfortunately only the exons were sequenced at the time. The authors determined that there is a concentration of amino acid substitutions in the amino terminal portion of the encoded molecule and that the pattern of nucleotide substitutions is consistent with multiple independent mutational events. Their restriction map analysis of sequences flanking the exons suggests that there may be large differences between the haplotypes, which agrees with the data presented in this dissertation. They interpret their data as being inconsistent with gene conversion, but do not take into consideration the low number of haplotypes they analyzed. Evidence for gene conversion in a class II β gene has been reported by Mengle-Gaw et al. (1984). They have isolated an alloreactive T cell clone, 4.1.4, that recognizes a determinant present on both Eβ^b and Aβ^bm12. Comparison of the nucleotide sequence of Aβ^b (Choi et al.

1983) and Aβ^bm12 (McIntyre and Seidman 1984) to the cDNA sequence of Eβ^b revealed that the bm12 sequence is identical to the Eβ^b sequence in the region where it differs from Aβ^b. The particular region where the conversion event may have occurred includes three nucleotides in a clustered region of 14 nucleotides within the sequence coding for amino acids 67-71. This region is also flanked by regions of exact homology which extend 20 nucleotides 5' and 9 nucleotides 3'. These flanking regions may provide stabilization of heteroduplex formation between the genes, which might potentiate sequence transfer. The T cell clone 4.1.4 was found to recognize a determinant shared by Aβ^bm12 and Eβ^b, so the possible gene conversion event would have occurred in a functional zone. Previous information which led to this interest in the bm12 mutation includes genetic mapping of the bm12 mutation to within the Aβ^b gene (Hansen et al. 1980), and tryptic peptide data showing the bm12 mutant to differ from its C57BL/6 parent only in its Aβ polypeptide (Lee et al. 1982b; McKean et al. 1981). There is tremendous difficulty in distinguishing between gene conversion and unequal crossing over as mechanisms of the genetic exchanges in the MHC. The discovery of gene conversion in fungi was only possible because the products of a single meiosis in some species remain in a tightly clustered tetrad in which mendelian

ratios are directly detectable (Baltimore 1981; Radding 1978). A change in gene number might be expected in unequal crossover, but if the crossover event took place totally within the genes, then one might find an insertion or deletion of genetic sequence as the only evidence, which is something our laboratory is looking for in the intron between the first and second protein coding domains. Steinmetz et al. (1982) have even postulated that unequal crossover may occur using pseudogenes as a genetic reservoir for polymorphic sequence material. Possible evidence for gene conversion at the DNA sequence level is the strong homology seen in the flanking regions of suspected conversion events; perhaps such sequences have been selected for indirectly within introns as shuttle elements to continually generate polymorphism on an evolutionary timescale. Still, there are now three known Aβ sequences, as well as a very large number of alleles, for generation of diversity in Aβ, and there is no rule that requires one mechanism to operate for all class II genes or that requires only one mechanism to generate that diversity. More nucleotide sequence information, especially in the introns of the class II genes, should do much to elucidate the mechanisms involved. Mechanisms for the generation of polymorphism should take into account the variable and conserved tracts within the I region characterized by Steinmetz et al.

(1984). Single copy probes were isolated from the class II region of a BALB/c library and were used to screen DNA cosmid libraries of AKR and B10.WR7, haplotypes H-2^k and H-2^wr7 respectively. The isolated clones were aligned to provide a nearly continuous stretch of DNA through the I region of the three haplotypes, which was restriction endonuclease (RE) mapped and oriented. Using probes spanning the I region in a Southern blot analysis, a variable tract was found in the left half of the I region and a conserved tract in the right half, with the dividing point being in the middle of the Eβ gene, probably overlapping the hotspot for recombination in the middle of that gene. The Aβ, Aα, and Eβ genes, which show extensive polymorphism, are located in the variable tract, whereas the much less polymorphic Eα gene is located in the conserved tract. Noncoding sequences located in the variable tract were found to be just as polymorphic as, and often more polymorphic than, the coding regions in the variable tract. Again, only more nucleotide sequence information is likely to elucidate the mechanisms operating to generate and maintain the polymorphism in the I region. The hotspots for recombination are of special interest in the generation of polymorphism in the I region. The recombination rates may even be strain dependent. Shiroishi et al. (1982) examined a congenic mouse strain, B10.MOL-SGR, which has an H-2^wm7 haplotype

bred onto a C57BL/10 background. This H-2 haplotype, of Mus musculus molossinus origin, tremendously enhanced recombination rates between the K and A loci. A similar dramatic increase in specific recombination rates has been reported in another wild mouse haplotype (Steinmetz et al. 1986). Two haplotypes from Mus musculus castaneus (CAS3 and CAS4) showed recombination at the same high frequency, 0.6%-1.5%, as was seen in Mus musculus molossinus derived MHC genes. Steinmetz et al. (1986) went on to sequence the intron between the first and second protein coding domains of the Eβ gene, which probably contains the hotspot region, and found that the sequence contained a CAGG tetramer repeated in tandem 22 times, if a mismatch of one nucleotide is allowed. The sequence has some homology to the lambda Chi sequence, which promotes recombination, but the homology is not very strong. A much stronger degree of homology was found to the core sequence of the hypervariable minisatellite regions found in human DNA (Jeffreys et al. 1985). These regions could generate allelic variability by facilitating unequal crossover events during meiosis, or perhaps even by initiating a gene conversion event. Control of expression of the class II genes may also play a role in their generation of diversity, either by differential control of expression or by polymorphism in the control elements themselves. Some standard laboratory

inbred mouse strains carry mutations that cause failure of expression of the class II E molecule on the cell surface (Jones et al. 1978, 1981). These mutations can be of any one of three types (Hyldig-Nielsen et al. 1983; Mathis et al. 1983): the H-2^b and H-2^s haplotypes have a deletion in the Eα gene, the H-2^f haplotype makes an Eα mRNA of aberrant size, and the H-2^q haplotype has a defect in RNA processing or RNA stability. The lack of a cell surface expressed E molecule for any reason is referred to as an E^0 mutation. E^0 mutations have been identified in over 50% of the t-bearing strains (Nizetic et al. 1984). Eighteen t haplotype carrying strains have been found to be E^0 by Dembic et al. (1984). Three strains, CR0437, t^w2, and t^0, were found to transcribe E but not to make a functional protein. All fifteen other E^0 strains had a deletion encompassing the promoter region, the RNA initiation site, and the first exon, which amounts to a deletion of approximately 650 bp. The study of the role these mutations might play in the polymorphism of class II genes is just now getting underway. In the human system, there are cell lines which have specifically lost expression of all class II molecules (Levine et al. 1985). The cell line 6.1.6 is a variant of a normal lymphoblastoid line which has been shown to have a regulatory defect in class II gene expression (Gladstone and Pious 1978, 1980; Levine and Pious 1984). P30 is a partial revertant of the 6.1.6 cell line. Levine


et al. (1985) used Southern and Northern blotting of these two cell lines to show evidence that class II and Ii (I invariant) chain expression may be linked. The characterization of the regulatory elements of class II genes, and of their polymorphic nature, is just beginning.

Variation in Wild Mice

The major purpose of this dissertation is to address the question of how the polymorphism of the MHC class II genes arose and how this polymorphism is maintained. To address this question realistically requires an understanding of the evolutionary relationships of the model system being studied. The more thorough the understanding of the strain development of the system, the more informative the study. A major limitation of many previous studies of the extensive genetic polymorphism of the murine class II genes is that only a limited number of class II alleles have been studied, and nearly all of these come from the standard laboratory inbred strains of mice. These strains were derived from a limited number of sources with a high degree of interbreeding early in their development. As such, they represent a highly biased sampling of the mouse population and an artificial collection of considerable genetic homogeneity (Ferris et al. 1982;


Klein 1974). Wild mouse populations have a relatively high degree of genetic variation, particularly at the H-2 complex, when compared to the standard laboratory inbred strains of mice. It is this variability, generated and maintained through natural selection, which makes wild mice a near-ideal model for the study of the genetics of class II polymorphism. In turn, class II polymorphism is a near-ideal model for studying the evolution of a species. A useful definition of wild mice is a population whose reproduction is not controlled directly by humans (Bruell 1970). This study will examine the polymorphic nature of the class II genes at the DNA level in mice from wild populations of different subspecies and geographic origins as well as in the standard laboratory inbred strains. For this reason it is of major importance to understand wild mice as a genetic model.

Natural History of Wild Mice

Basic to the understanding of the evolutionary implications of the selection process on wild murine class II gene products is a rudimentary understanding of the natural history of wild mice. The degree of association of wild mice with humans can be used to distinguish three groups (Sage 1981). Aboriginal mice live predominantly unassociated with human dwellings or food sources. Commensal mice live in close association


with human buildings and food supplies, while feral mice have made a transition from the commensal stage back to an aboriginal existence. Much of what is known about the natural history of wild mice has been learned from studies on commensal mice. The term house mouse is defined here because it describes essentially all wild mice used in this dissertation research. House mouse literally refers to the commensal relationship between human dwellings and certain species of mice. The number of species comprising what we call the house mouse varies depending on the person defining the term. In this dissertation, the house mouse shall be split into seven species and subspecies as per Marshall (1981). These include the commensal mice Mus musculus domesticus, Mus musculus musculus, Mus musculus castaneus, and Mus musculus molossinus, and the closely related aboriginal mice Mus hortulanus, Mus spretus, and Mus abbotti. From fossil evidence, nuclear genetic variation, and mitochondrial genetic variation, it has been estimated that the commensal association between humans and mice began more than a million years ago (Ferris et al. 1983). The native distribution of the wild mouse species ranges across Europe, North Africa, and northern Asia. M.m. domesticus and M.m. castaneus, two commensal species, have followed man into North and South America, Australia, and southeastern Africa, presumably as


stowaways on sailing ships. Thus, most of the standard laboratory mice of M.m. domesticus origin derive from mice that had already been introduced into the New World. The three related aboriginal species, M. hortulanus, M. spretus, and M. abbotti, have a native distribution in Europe and Asia Minor. The mound-building species, M. hortulanus, is restricted to the steppe grassland regions of the Carpathian basin and the Ukraine (Petrov 1979). M. spretus is found in the warmer parts of the western Mediterranean region from France to Libya, and M. abbotti is found in southeastern Europe and Asia Minor, although its geographic distribution is less well characterized (Sage 1981). The distribution of these three aboriginal species is consistent with the patterns of other animal and plant groups, suggesting that their present locations were determined by natural factors, not by humans, in contrast to M. domesticus. The western European house mouse, M. domesticus, has the most diverse geographic distribution of the house mouse species and has provided the most information about the range of genetic variability of the house mouse species. This species, due to its occupation of buildings and sailing ships during an era of worldwide colonization, established founding populations in areas as diverse as the Americas, Australia, and various temperate and tropical Pacific island chains. M.m. domesticus may be a more advanced member of the genus


based on its great adaptability and spectacular variation in color which matches its various geographic environments (Marshall 1981). Many studies have been carried out in areas where mice, usually M.m. domesticus, have been introduced, a fact which should be kept in mind when reviewing the older studies (Schwarz and Schwarz 1943). This problem is compounded by mice which go feral after colonizing a new land, thus subjecting themselves to new and different natural selective pressures. The native distribution of the house mouse species has not been thoroughly documented, but some informative observations have been made (Sage 1981). M. spretus, for instance, is native to western Europe and North Africa, and has been found in agricultural fields, often cornfields, in Spain and France, and in grasslands in North Africa. This species has been found inside buildings in at least one instance (Sage 1981). M. hortulanus, a well-studied aboriginal species (Mikes 1971), has been found in grain fields and some native steppe grasslands. Whether or not it inhabits buildings is still questionable. Information on the natural habitats of M. abbotti is sparse (Osborn 1965). It has been reported in agricultural habitats in southern Georgia, U.S.S.R., in grain fields and bamboo groves in Turkey, and adjacent to cornfields in southern Yugoslavia. The more commensal of the house mouse species are most often found associated with human buildings, but not


always. M. castaneus is found indoors in Malaya (Harrison 1955), India (Srivastva and Wattal 1973), Indonesia (Hadi et al. 1976), and Nepal and Thailand (Marshall 1977). In fact, it has not been reported outdoors in these areas and may be the most commensal of the house mouse species. M.m. molossinus has been found in houses, farms, cultivated fields, and even along river levees in Japan, as well as in abandoned agricultural fields in Korea (Hamajima 1962; Jones and Johnson 1965), suggesting that it is less of an obligate commensal than M.m. castaneus. The native range of M.m. musculus extends from central Asia to northeastern Europe. Its microhabitats vary from the interiors of buildings and haystacks in much of northern Russia and central Europe (Pelikan 1974; Romanova 1970; Zejda 1975) to agricultural fields and meadows in Denmark (Ursin 1952). In Sweden this species has been reported in natural wild locations independent of any human influence whatsoever (Zimmerman 1949). M.m. domesticus is presently found throughout the world, although it is an adventive species in most of these areas. Its native range extends from Nepal to North Africa and western Europe. Within its native range it can be found in habitats ranging from agricultural fields to barren stony ravines isolated from human settlements, especially in Afghanistan and Pakistan (Gaisler 1975; Hassinger 1973; Roberts 1977). In the desert of the south Arabian peninsula it has been


found living in burrows of sand rats (Harrison 1972). The versatile adaptability of this species is demonstrated by the unusual commensal habitats it has occupied, such as coal mines (Philip 1938) and frozen meat lockers (Mohr and Dunker 1930). M.m. domesticus is at least as versatile in non-native lands, and has been found in environments ranging from salt marshes (Breakey 1963) and grasslands (Pearson 1963) to the Andes mountains (Harland 1958), although it is predominantly a commensal species. It is worth noting that it has not been reported in woodland forests in Europe, nor in the Americas, although it has been found to occupy the native silver beech forests in New Zealand (Taylor 1978). Interspecies competitive interactions are difficult to study in rodents in their native habitats. The most thoroughly studied case remains one involving two species, M.m. domesticus and the vole Microtus californicus, in the California grassland ecosystems (Lidicker 1966). A population of approximately 12,000 mice on an island was extinguished within one year after the introduction of a small number of voles. DeLong (1966; 1967) studied two enclosed populations of mice, one in the presence of voles and one without. The population of mice in the enclosure with the voles had a significantly lower survival rate for postnatal, preweaning mice. Lidicker (1966) also found that the voles dominated house mice in 94% of their encounters.


DeLong's and Lidicker's studies are among the few experimental approaches in this area. These rodent interactions demonstrate part of the natural selection process when a new species enters a territory. How these interactions affect the evolution of a native species over hundreds of thousands of years remains to be determined.

Variation of Non-MHC Features in Wild Mice

Factors affecting the evolution of the murine MHC class II molecules are probably numerous. The wild mice are an excellent system in which to study the evolution of morphological features, protein structure, and DNA structure. The morphological features of the wild mice have been especially instrumental in organizing the phylogenetic relationships of the different species of Mus, and anatomical features such as dental structure (Bader 1965; Van Valen 1965), skull shape (Hussain et al. 1976), and relative tail length and foot size (Ursin 1952) have all contributed to the classification scheme. Relative tail length has also been a useful feature in the classification of wild mice, particularly the long tail length of M.m. domesticus, because of a genetic region known as the t complex, which complicates tail length inheritance. Color variation as a morphological feature was critical in establishing the mouse as an excellent model


in the twentieth century to study inherited traits. Geographic factors and microhabitat have played major roles in determining coat color for the species in their native ranges. There is a notable polymorphism of coat color, particularly in ventral coloration, detectable in some species of mice such as M.m. molossinus (Hamajima 1964) and M.m. musculus (Serafinski 1965). The coat color patterns are important for geneticists because of their great utility as genetic markers, but also because their inheritance has been shown to be multifactorial (Falconer 1947). Genes on five or more chromosomes have been found controlling melanism (Radbruch 1973). Just as the coat color genes are important markers for geneticists, coloration affecting natural selection and survival will influence the polymorphism of some of the biochemical factors. Variation in the proteins of wild mice has been assayed most commonly with electrophoresis and serology, as reviewed by Sage (1981). Many variant forms of proteins have been localized to a particular chromosomal position (Womack 1979), but the function of these proteins has not always been identified. Protein variation in wild mouse populations has also proved useful in learning about the heterozygosity levels in M. domesticus populations around the world (Berry and Peters 1977; Rice and O'Brien 1980; Sage 1978). Such studies have aided in the classification of the different subspecies of wild mice (Bonhomme et al.


1978; Minezawa et al. 1979). One example of how protein variants have led to important discoveries in evolution and ecology is the discovery of the hybrid zone in Europe by Selander (Selander et al. 1969; Hunt and Selander 1973). A zone of contact between M.m. domesticus and M.m. musculus runs across the Jutland peninsula in Denmark and continues through the eastern part of West Germany. The hybrid zone is as narrow as 20 kilometers in some places, and has possibly been in existence for 5000 years. Free interbreeding occurs between the "semispecies" within the zone, but not on either side of it. M.m. domesticus alleles have been detected within the M.m. musculus populations within the hybrid zone, but not vice versa, perhaps reflecting social dominance of M.m. domesticus over M.m. musculus (Thuesen 1977). While selection operates most often at the protein level, this dissertation will examine the DNA coding for variation in a specific group of proteins. Variability in chromosome structure, once thought nonexistent, has been discovered to be quite prevalent in certain regions of the world, e.g. Italy and Switzerland (Gropp et al. 1969; 1970; 1972). The "normal" chromosomal complement was previously thought to be 20 pairs of acrocentric chromosomes. In an excellent review article by the discoverer of Robertsonian translocations (Gropp and Winking 1981), Gropp and Winking describe the presence in wild mouse


populations of metacentric chromosomes formed by the joining of two acrocentric chromosomes. Studies on the variability of mitochondrial DNA sequence in various species of house mice using an RFLP analysis were first reported by Yonekawa et al. (1980). The 25 standard laboratory strains they analyzed showed no variation and were identical with a sample of wild M.m. domesticus from Canada, but the patterns from M.m. castaneus and M.m. molossinus were very different. Variability of mitochondrial DNA within M.m. molossinus populations appears to be limited. An extensive analysis of mitochondrial DNA evolution in 208 mice by Ferris et al. (1983) has reinforced the phylogenetic classification scheme of Marshall and Sage (1981) for the seven Mus species and subspecies of house mice discussed here. An RFLP analysis of the mitochondrial DNA of four commensal and three aboriginal species of house mice and the standard laboratory mice led to the construction of evolutionary trees on the basis of mitochondrial polymorphisms. These evolutionary trees emphasized the distinctiveness of M.m. domesticus from the other commensal species of mice. All 50 of the standard laboratory mouse strains analyzed were found to be M.m. domesticus. The mitochondrial evolutionary tree also reinforces the view that the three European aboriginal species of mice which have been discussed, M. spretus, M. abbotti, and M. hortulanus, differ substantially from


the commensal mouse species and are each an individual species of Mus. The first commensal mice may have begun their relationship with humans one to two million years ago, assuming the rate of mutational divergence in mitochondrial DNA is between 2% and 4% per million years (Ferris et al. 1983). Mitochondrial DNA comparisons between mammals whose divergence times have been estimated from fossil records (Brown et al. 1979; Brown et al. 1982; Ferris et al. 1981; Upholt and Dawid 1977) have provided this estimate of mitochondrial DNA divergence. This estimate of the age of commensalism between mice and humans fits with Sage's (1981) protein comparisons and may correlate with Mus species divergence.

The "t" Complex in Wild Mice

The term t complex indicates the part of the chromosome which is occupied by a complete t haplotype. Occurring at a frequency of 10% (Artzt et al. 1985) to 40% in most of the sampled wild mouse populations (Dembic et al. 1984), t haplotypes are structurally variant forms of a segment of murine chromosome 17. Mouse t haplotypes are thoroughly reviewed in a recent article by Silver (1985). When first discovered (Dobrovoloskaia-Zavadskaia and Kobozieff 1932), and for many years thereafter, t haplotypes were thought to be recessive alleles at the Brachyury (T) locus. Although


there is a T locus near the centromere on chromosome 17, it is a well-defined single locus which is only a small part of a t haplotype (Bennett et al. 1975). The different t haplotypes all appear to be related to one another structurally. A complete t haplotype encompasses about 30 x 10^3 kb, which accounts for approximately 1% of the entire mouse genome and includes the entire murine H-2 complex, hence the connection with the polymorphism of the class II genes. There are also polymorphisms within the t haplotypes themselves, among the most t-specific of which may be the t complex proteins (TCP) (Silver et al. 1979; Silver et al. 1983). Also within the chromosomal region occupied by t haplotypes are many other normal genes common to non-t-bearing mice, along with a smaller number of mutant t genes which must effect the t-specific characteristics. The t haplotypes have been known to influence tail length, fertility, embryogenesis, male transmission ratio, and meiotic recombination (Dunn and Gluecksohn-Schoenheimer 1950; Silver 1985). Of these characteristics, it is believed to be through suppression of recombination that the t haplotypes have been maintained as distinct genomic units. Furthermore, a distorted male-specific transmission ratio permits propagation through mouse populations despite the deleterious effects which accompany complete t haplotypes.
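
The size estimate quoted above for a complete t haplotype can be checked with simple arithmetic. The sketch below assumes a haploid mouse genome of roughly 3 x 10^6 kb; that round figure is an assumption for illustration and is not stated in the text, and the variable names are likewise illustrative.

```python
# Size of a complete t haplotype as quoted in the text: about 30 x 10^3 kb.
T_HAPLOTYPE_KB = 30e3

# ASSUMPTION (not from the text): haploid mouse genome of roughly 3 x 10^6 kb.
ASSUMED_MOUSE_GENOME_KB = 3e6

# Fraction of the genome occupied by a complete t haplotype, as a percentage.
t_haplotype_percent = 100 * T_HAPLOTYPE_KB / ASSUMED_MOUSE_GENOME_KB

print(f"t haplotype share of genome: {t_haplotype_percent:.1f}%")
```

Under this assumed genome size the result agrees with the "approximately 1%" given in the text.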


The suppression of recombination which occurs between complete t haplotypes and non-t wild haplotypes, as first discovered by Dunn and Caspari (1945), is related to the t genomic structure. This suppression extends from T and includes the H-2 complex (Hammerling and Klein 1975) and the Tla and Qa-2 regions (Shin et al. 1982; Silver 1981), but not the Pgk-2 locus (Nadeau 1983; Rudolph and Vanderberg 1981). Thus, a complete t haplotype consists of a 12 to 15 cM region of the chromosome with concomitant suppression of recombination starting somewhere between the centromere and T and extending to somewhere between the distal part of the MHC and Pgk-2. Rare chromosomes which had recombined within the t haplotype were discovered and designated as partial t haplotypes. These rare recombinants were subsequently found to be of critical importance in understanding the physical structure of the t haplotypes (Lyon 1960; Lyon and Meredith 1959). With partial t haplotypes as a tool used to infer structure, the region of suppression of recombination was found to occur only along the extent of t DNA present (Bechtol and Lyon 1978; Bennett et al. 1979). Normal recombination rates between t haplotypes also suggested that the structures of t haplotypes were similar to one another, and different from the same chromosomal region in wild type DNA (Artzt et al. 1982a; Condamine et al. 1983). Artzt et al. (1982b) were able to demonstrate that


the physical locations of the H-2 and the tf locus were reversed in t haplotypes relative to their location in the wild type chromosome. These results were confirmed by others (Shin et al. 1983b; Shin et al. 1984). A complete t haplotype therefore consists of a distal inversion, which includes tf and H-2, a proximal inversion, which includes T and the genes encoding the Tcp (T complex proteins) products (Herrmann et al. 1986), and possibly a small central inversion. Many complete t haplotypes are known to have lethal effects in homozygous t embryos (Klein et al. 1984). This provides one of the few ways to distinguish complete t haplotypes from one another, as different chromosomes carrying different lethal mutations can complement each other in genetic tests (Bennett 1975; Klein et al. 1984; Winking and Guenet 1978). It also became possible to analyze the genetic basis for t lethal effects with the finding that normal crossover occurs between two different t haplotypes (Silver and Artzt 1981). The majority of the complete t lethal mutations analyzed appear to be single-locus mutations, and lethal mutations of complementing t haplotypes are not allelic to each other (Artzt et al. 1982a). Although there is some evidence for clustering (Artzt 1984), the different lethal mutations appear to be distributed over the entire length of complete t haplotypes. Overall, the entire genetic basis for the t lethal mutations seems to be


straightforward, but the molecular mechanism by which they effect their lethality is still unknown. The male-specific transmission ratio distortion (TRD) inherent in t haplotypes is responsible in part for the propagation of t haplotypes through the wild mouse population, even though the t haplotypes carry deleterious genes. Wild males with a complete t haplotype will transmit it to well over 90% of their offspring (Lyon and Meredith 1964a, 1964b). Mice carrying a single partial t haplotype cannot transmit it at a high ratio, but the TRD can be restored in males carrying particular pairs of partial t haplotypes in cis or trans configuration (Silver 1985). This effect was higher with certain trans combinations for a portion of the t haplotypes, leading Lyon (1984) to propose a model in which partial t haplotypes carry different lengths of t DNA with particular sets of distortion loci. Lyon (1984) hypothesized that a series of t-specific distorter loci, Tcd, act on a single t-specific responder locus, Tcr. The effects of the Tcd loci would be additive, and they could act in cis or trans to the Tcr locus to transmit it at a high ratio when enough Tcd loci are present. Evidence in support of this model has been obtained by Fox et al. (1985). Further research based on this model should be forthcoming. Sterility is another effect which accompanies the presence of any two complete t haplotypes in male mice.


The physiological reasons for this sterility are still unknown. The sperm appear to be morphologically normal (Hillman and Nadijcka 1980). This sterile condition is of particular interest because of its strong similarities to the TRD effect of t haplotypes. The possibility exists that the two are related, but this has not yet been proven. The association of the t haplotypes with the murine histocompatibility class II molecules has long fascinated geneticists, due to the inclusion of the H-2 complex in the recombination suppression of complete t haplotypes. The extreme polymorphic nature of the class II molecules has provided excellent markers and an approach to study the evolution of the t haplotypes. Dembic et al. (1984) and Nizetic et al. (1984) have drawn correlations between an Eα deletion and its association with t haplotypes which suggest an ancient origin for this deletion. Association of the members of the same t complementation group with the same H-2 haplotype supports this view, but interpretations should be made cautiously, as the evidence that the H-2 haplotypes associated with t chromosomes are derived from a single ancestor is not conclusive. More recently, Figueroa et al. (1985) have revealed the existence of three major groups of class II alleles associated with particular t haplotypes. These results are not in conflict with those to be presented in this dissertation.


Polymorphism of the H-2 Class II Genes in Wild Mice

The murine class II histocompatibility genes are one of the most polymorphic gene complexes in mammalian genetics. Research on the genetic basis for this polymorphism and its functional significance has led to many critical discoveries in transplantation biology, cancer research, and genetics. However, most of the research in these areas has been carried out using standard laboratory mice. As was mentioned earlier, almost all of these mice are of M.m. domesticus origin and derived from a very limited number of stocks which were not well characterized. The presence in wild mouse populations of private H-2 antigen specificities absent from the standard laboratory inbred mice led to the realization that there was a need to identify and characterize these new H-2 specificities. The methodology of choice was a serological characterization of the wild H-2 haplotypes, but the problem was to isolate these antigens from non-H-2 antigens so that antisera specific for only the H-2 could be produced. Klein developed the B10.W congenic lines (Klein 1973, 1975), where "W" stands for wild. The wild males were bred with a B10.BR female and the progeny were backcrossed 8 to 14 times to the same inbred strain with continual selection for an H-2 marker (Ss^h) specific for the wild mouse's H-2 haplotype. The Ss^h animals were then intercrossed and progeny with Ss^h and


an H-2.23-negative phenotype (wild haplotype) were selected to establish homozygous lines, with brother x sister matings to maintain the line. Thus, each wild H-2 haplotype is bred onto a C57BL/10 background for specific analysis of the wild type H-2. Once the B10.W lines were established, a serological examination of sixteen of their wild H-2 haplotypes substantiated their extreme polymorphism (Klein 1975; Zaleska-Rutczynska and Klein 1977). A few wild haplotypes appeared to be identical to one another serologically, and a few of them resembled standard laboratory strains of mice, but most were different from one another and different from all known laboratory inbred mouse strains (Zaleska-Rutczynska 1977). A serological analysis of 29 wild-derived H-2 haplotypes (Wakeland and Klein 1979a; Wakeland and Klein 1981) defined five new I region antigens, with the inclusion of three new haplotypes, u, v, and j, on their inbred panel. Mentioned in this same report are the beginnings of discernible "phenogroups." Also, wild mouse haplotypes which showed evidence of possible recombination in the H-2 complex were characterized (Duncan and Klein 1980; Wakeland and Klein 1979b), suggesting that the wild mouse haplotypes may be of use in analyzing recombination mechanisms as they occur in a natural population. The combination of serological and tryptic peptide mapping analyses proved to be very informative for


Wakeland and Klein (1983). They were able to organize 29 B10.W lines into 8 distinct antigenic groups. The tryptic peptide mapping correlated with the serology, but also demonstrated an extremely high degree of similarity of the class II molecules of members of the same family. These class II families often had a standard laboratory inbred mouse strain as a "prototypic" member. The discovery of the existence of groupings of class II wild haplotypes bears directly on the question of how the polymorphism of the class II molecules arose and how it is maintained. The process of generation of diversity in class II molecules may not be as random as once thought. These aspects are stressed here because these groupings formed the basis for this dissertation. Two of the groupings established by Wakeland and Klein (1983), the A^k and A^p families, were subsequently selected for more detailed analysis. The tryptic peptide fingerprints of the Aα, Aβ, Eα, and Eβ subunits encoded by four of the wild H-2 genes in the A^k group were compared. The Aα and Aβ subunits of all of the related haplotypes differed from Aα^k and Aβ^k by less than 10% of their tryptic peptides (Wakeland and Darby 1983). The tryptic peptide fingerprints of the Eβ subunit in these same strains were Eβ^d-like in two wild haplotypes and Eβ^s-like in another wild haplotype, suggesting that recombination between Aβ and Eβ may be significant in the wild. This may reflect different


evolutionary patterns of the Aβ and Aα genes with respect to the Eβ genes. The A^k and A^p families (Wakeland and Darby 1983; Wakeland and Klein 1983) were also analyzed to determine the effect of their minor structural variations on allorecognition by T lymphocytes (Peck et al. 1983). Minor structural variations in the A molecule were usually found to cause major functional changes in allorecognition. These changes were always detected when the Aβ subunit contained the structural variation. Peck et al. (1983) also found that more than one site in the A molecule can be recognized by alloreactive T lymphocytes. These results suggest that specific sites in the A molecule are critical for allorecognition. Thus it would be informative to know the location of the minor structural differences between wild H-2 haplotypes in either the A^k or the A^p family to determine if the differences are in a critical binding area of the molecule. If so, the evolutionary mechanism generating the polymorphism found in one of these haplotype families would seem to be operating in a non-random manner. Radiochemical sequence analysis of tryptic peptides of wild-derived H-2 complexes of A^k family members has localized structural variations of the A molecule to the α1 and β1 domains (Wakeland et al. 1985). The variations have been localized in the Aα subunit to two adjacent peptides. In the Aβ subunit the differences have been


localized to single amino acid changes, possibly due to single point mutations in the encoding DNA. Thus, the A^k family of class II alleles is probably diversifying by the accumulation of discrete mutations within the exons encoding the α1 and β1 domains. Again this suggests that wild-derived variants in exon structure are not random. Recent data on the DNA structure of the A^k and A^p families based on RFLP analysis (McConnell et al. 1986) suggest that the intron structure may also be informative in determining the evolutionary lineage of H-2 class II haplotypes.
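
RFLP comparisons of the kind mentioned above are commonly summarized as the fraction of restriction fragments that two patterns share. A minimal sketch follows, using the shared-fragment index F = 2n_XY/(n_X + n_Y) of Nei and Li (1979), which is the formula this dissertation names under Data Analysis. The function name, the fragment sizes, and the rule that two bands count as shared only when their recorded sizes match exactly are illustrative assumptions, not details taken from the text.

```python
def shared_fragment_fraction(fragments_x, fragments_y):
    """Return F = 2 * n_XY / (n_X + n_Y) for two restriction patterns.

    Each pattern is an iterable of fragment sizes. Two fragments are treated
    as shared only if their recorded sizes are identical (an assumption made
    for this sketch; real band matching needs a size tolerance).
    """
    x = set(fragments_x)
    y = set(fragments_y)
    if not x and not y:
        raise ValueError("at least one pattern must contain fragments")
    n_xy = len(x & y)  # fragments present in both patterns
    return 2 * n_xy / (len(x) + len(y))


# Hypothetical fragment sizes in kb for two haplotypes (not data from the text).
pattern_a = [9.4, 6.6, 4.4, 2.3]
pattern_b = [9.4, 6.6, 2.0]

f_value = shared_fragment_fraction(pattern_a, pattern_b)
print(f"F = {f_value:.3f}")
```

F equals 1 when two patterns are identical and 0 when they share no fragments, so higher values indicate more closely related restriction patterns.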


MATERIALS AND METHODS

Mice

All mice were from the mouse colony in the Tumor Biology Unit at the Department of Pathology, University of Florida, or from our wild mouse colony at the Animal Care Facility, University of Florida. Strains used included AZROU 1, AZROU 2, BELGRADE 1, C57BL/10, B10.BR, B10.BUA16, B10.CAA2, B10.CAS2, B10.CHA2, B10.D2, B10.F, B10.KEA5, B10.M, B10.PL, B10.Q, B10.RIII(71NS), B10.S, B10.SAA48, B10.SM, B10.STC77, B10.STC90, B10.WB, t^w71, TT6, t 6 -JRl, t"8, t^w75, t^ ^5, ^2, JERUSALEM 3, JERUSALEM 4, METKOVIC 1, STU, VIBORG 5, and W12A. The inbred mouse strains are maintained by full brother x sister mating with a single line of descent. All mouse strains are homozygous at the H-2 complex unless otherwise noted.

Isolation of Genomic DNA

Genomic DNA was prepared from liver tissue according to the methods of Maniatis et al. (1982). The mice were deprived of food for 24 hours prior to sacrifice. The


livers were minced with surgical scissors and placed in a mortar which contained liquid nitrogen. The frozen tissue pieces were then ground to a fine powder and added to 40 ml of TES buffer (10 mM Tris HCl, pH 7.5; 5 mM EDTA; 100 mM NaCl) with 1% SDS and 0.4 mg/ml of proteinase K (Sigma, St. Louis, MO). This DNA preparation was then incubated at 65 °C for 16 hours, extracted three times with Tris-equilibrated phenol (pH 7.5) and twice with chloroform and isoamyl alcohol (96:4 v/v), and then precipitated by the addition of 2.5 volumes of isopropanol. The high molecular weight genomic DNA was hooked from the isopropanol solution with a drawn Pasteur pipette, dissolved in 0.5 ml TE (10 mM Tris HCl, pH 7.5; 1 mM EDTA), and dialyzed extensively against TE buffer. The resulting genomic DNA preparations were then analyzed for purity and quantitated by spectrophotometry and agarose gel electrophoresis. All DNA preparations used in this study have 260/280 ratios in excess of 1.8 and migrate as high molecular weight DNA on 0.7% agarose gels.

Restriction Endonuclease Digestions and Agarose Gel Electrophoresis

A Tris-buffered solution containing 15 µg of genomic DNA was digested with 30 units of enzyme for 16 hours at 37 °C under conditions described by the supplier (Bethesda Research Laboratories, Bethesda, MD). An additional 15 units of enzyme was then added for 8 hours. The


efficiency of each endonuclease digestion was monitored by removing 10% of the digest reaction volume immediately following the final addition of endonuclease and adding this aliquot to 0.5 µg of lambda phage DNA. Following an 8 hour incubation, digestion of the genomic DNA was analyzed by electrophoresis in 0.7% agarose gels. Complete digestion of the genomic and lambda phage DNA was detected as a "smear" of genomic DNA-derived restriction fragments together with a pattern of lambda DNA-derived restriction fragments characteristic of complete digestion with each specific enzyme. The bulk of the digested genomic DNA (13.5 µg) was stored at -20 °C until electrophoresis. Digested genomic DNAs were electrophoresed through 0.7% agarose gels for 40 hours at 1.5 V/cm or for 20 hours at 3.0 V/cm in a high resolution horizontal electrophoresis apparatus with cooling thermoplate (International Biotechnologies Incorporated, New Haven, CT).

Capillary Transfer and Hybridization

Following electrophoresis, DNA was transferred from the gel to nylon filters (Zetabind, AMF, Meriden, CT) by the method of Southern (1980). Transfer efficiency was monitored by comparing the amount of DNA remaining in the gel following transfer with the amount present prior to transfer by ethidium bromide staining and photographic


analysis. The nylon filters were vacuum dried for 2 hours at 80 °C and stored on desiccant at 4 °C until hybridization. The filters were hybridized with a 32P-labeled 5.8 kb Eco RI fragment containing the entire Aβ^d gene (Malissen et al. 1983) or with a 1.2 kb Hind III fragment containing part of the Aα gene derived from I-A ° (J. Seidman, personal communication 1984). The probes were radiolabeled with 32P-dCTP to a specific activity of >2 x 10^8 dpm/µg by nick translation (Bethesda Research Laboratories, Bethesda, MD). Hybridization and rehybridization conditions were as described by the supplier of the Zetabind nylon filters (AMF, Meriden, CT). Final stringency was established by two 30 minute washes at 65 °C with 0.015 M NaCl, 0.0015 M sodium citrate, 0.1% SDS. Autoradiographs were produced by exposure for 2-6 days on XAR-5 X-ray film (Kodak, Rochester, NY) with intensifying screens (Dupont, Wilmington, DE) at -70 °C.

Data Analysis

RFLP analyses were performed using equation 21 from Nei and Li (1979), $F = 2n_{XY}/(n_X + n_Y)$, in which $n_X$ and $n_Y$ are the numbers of fragments in populations X and Y, respectively, and $n_{XY}$ is the number of fragments shared by the two populations. The validity of the formula was tested by Nei and Li in known pairwise


sequence comparisons. An F value was calculated for each pairwise comparison for all restriction digests. Restriction fragments which weakly hybridized with probes for either Aα or Aβ were not included in the analysis.
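The F value described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure actually used in the study: the function name, the representation of each fragment as an (enzyme, size in kb) pair, and the fragment lists themselves are assumptions made for the example, and fragments are treated as shared only when enzyme and size match exactly.

```python
def fraction_shared(frags_x, frags_y):
    """Nei and Li (1979) F = 2 * n_XY / (n_X + n_Y) for two fragment sets."""
    x, y = set(frags_x), set(frags_y)
    if not x and not y:
        raise ValueError("neither allele has any scored fragments")
    return 2 * len(x & y) / (len(x) + len(y))


# Illustrative fragment lists only (enzyme, size in kb); not data from the study.
allele_x = {("Sac I", 5.2), ("Sac I", 3.8), ("Sac I", 2.65),
            ("Pvu II", 2.89), ("Pvu II", 1.85)}
allele_y = {("Sac I", 7.85), ("Sac I", 3.8),
            ("Pvu II", 2.89), ("Pvu II", 2.75)}

# Two fragments are shared; n_X = 5 and n_Y = 4, so F = 4/9.
print(round(fraction_shared(allele_x, allele_y), 2))  # 0.44
```

In practice fragment sizes read from a gel carry measurement error, so a real comparison would presumably need a size tolerance rather than exact equality.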


RESULTS

RFLP Analysis of the Aα and Aβ Genes of Standard Laboratory Inbred and Wild Mice

Table 1 presents the 37 mouse strains analyzed in this study, including 13 standard laboratory inbred strains, 15 strains containing wild-derived H-2 haplotypes, and 9 t haplotype-bearing strains. The 28 wild and standard laboratory inbred mouse haplotypes are dealt with in this first section of the results. Mice representative of three different subspecies, Mus musculus domesticus, Mus musculus musculus, and Mus musculus castaneus, were analyzed in this study, as seen in table 1. The genomic structures of the Aα and Aβ alleles of these haplotypes were compared by RFLP analysis with DNA probes specific for these genes, as described in the materials and methods. A more detailed diagram of the Aβ probe is illustrated in figure 2 in the literature review. The probe consists of a 5.4 kilobase (kb) Eco RI genomic fragment derived from the H-2^b haplotype. In the initial RFLP analysis of the Aβ gene of the 28 standard laboratory inbred and wild mouse strains, I


Table 1. Mouse strains used in this study.

Mouse strain   Subspecies           H-2    Geographic origin   Aβ group
C57BL/10       M.m. domesticus (a)  b      Old inbred          b
AZROU 1        "                    w201   Morocco             b
JERUSALEM 3    "                    w203   Israel              b
t^w75          "                                               b
JERUSALEM 4    "                    w204   Israel              b
B10.M          "                    f      Old inbred          b
B10.WB         "                    j      Old inbred          b
B10.S          "                    s      Old inbred          b
B10.STC90      "                    w15    Michigan            b
t^w71          "                                               b
t^w12          "                                               b
TT6-865        "                                               b
TT6-866        "                                               b
AZROU 2        "                    w217   Morocco             b
STU            "                    w34    Eur. inbred         b
W12A           "                    w216   Netherlands         b
B10.D2         M.m. domesticus      d      Old inbred          d
B10.RIII       "                    r      Old inbred          d
METKOVIC 1     "                    w205   Yugoslavia          d
B10.BUA16      "                    w22    Michigan            d
B10.CAS2       M.m. castaneus       w17    Thailand            d
t^6-JR1        M.m. domesticus                                 d
t^w5           "                                               d
B10.SM         "                    v      Old inbred          d
B10.F          "                    p      Old inbred          d
B10.Q          "                    q      Old inbred          d
B10.KEA5       "                    w5     Michigan            d
B10.CAA2       "                    w11    Michigan            d
B10.STC77      "                    w14    Michigan            d
BELGRADE 1     M.m. musculus        w202   Yugoslavia          d
t^w8           M.m. domesticus                                 d
t^w32          "                                               d
B10.SAA48      "                    w3     Michigan            d
B10.BR         M.m. domesticus      k      Old inbred          k
B10.CHA2       "                    w26    Old inbred          k
B10.PL         "                    u      Old inbred          k
NZW            "                           Old inbred          k

(a) M.m. abbreviation represents the species designation Mus musculus,


obtained restriction fragments such as those illustrated in figure 4. Patterns of similarity between different Aβ alleles began to stand out. The Aβ^d and Aβ^w22 alleles have identical Pvu II restriction fragments (RFs) of 2.89 kb and 1.85 kb. The Aβ^b, Aβ^f, and h$i alleles have identical 4.83 kb and 2.92 kb Pvu II RFs; the extra band in the Aβ^b allele has been shown to be a plasmid contaminant in this particular Southern blot. The Aβ^p, Aβ^r, Aβ^v, and Aβ^w201 alleles all have Pvu II RFs of 2.89 kb and 2.75 kb. As the RFLP patterns became more obvious, I arranged the mouse strains according to their similarities and carried out more Southern blots. Figure 5 is a representative autoradiogram of the same strains of mice, minus one, that were analyzed in the Pvu II autoradiogram of figure 4, but reorganized to emphasize the groupings. The Aβ alleles of r, p, d, v, and w22 all have Sac I RFs of 5.2, 3.8, and 2.65 kb. The one evolutionary group d member on this autoradiogram without the 5.2 and 2.65 kb RFs, although it does have the 3.8 kb RF, is BELGRADE 1 (Aβ^w201). When the two missing RFs of 2.65 and 5.2 kb from the d group alleles are added together, the sum is 7.95 kb, 0.1 kb different from the new 7.85 kb RF of Aβ^w201 and well within experimental error, suggesting that the new RF is due to the absence of a single Sac I


Figure 4. Autoradiogram of a Southern blot of DNA from the different mouse strains, digested with the restriction endonuclease Pvu II, electrophoresed, blotted, and probed with the Aβ genomic probe as described. Each band represents the relative location of a restriction fragment in the gel, which differs according to the positions in the gene of the recognition sites for the restriction endonuclease used, Pvu II in this case. Molecular weight markers are indicated at the right side of the figure.


[Figure 4 image. Lane heading: I region alleles tested; molecular weight markers in kb.]


Figure 5. Autoradiogram of a Sac I restriction endonuclease digestion of DNA from standard laboratory inbred and wild mouse strains, probed with Aβ. This panel shows representative members of the three evolutionary Aβ groups discovered.


[Figure 5 image. Lanes: Group d (r, p, d, v, w22, w201); Group b (f, j, b, s); k, u. Molecular weight markers in kb.]


site in BELGRADE 1 that is present in the Aβ r, p, d, v, and w22 alleles. Of the group b Aβ alleles on this autoradiogram, the f, j, b, and s alleles have a 1.58 kb Sac I RF in common. The f, j, and b alleles have a common 7.3 kb Sac I RF, while the f, j, and s alleles have a common 3.8 kb Sac I RF. The k group members shown on this autoradiogram have a common 4.6 kb Aβ Sac I RF which has been found only in k group members and not in any of the other 33 mouse strains examined. The digestion of DNA from the mouse strains with a total of seven restriction endonucleases (RE) makes possible a detailed and accurate comparison between alleles to better define the groups. Consequently, the DNA sequence homology among these Aβ alleles can be quantitatively estimated from the RFLP data by calculating the fraction homologous (F) value as defined by Nei and Li (1979). The F value is the fraction of RFs which the two alleles have in common. An F value of 1.00 indicates that all RFs for all seven REs are identical for the two alleles being compared. An F value of 0 indicates that no RFs are shared between the two alleles being compared. A number of mouse strains, which would always be in the same group, had identical RFLPs for all seven restriction enzymes, giving them F values of 1.00 when compared to one another. B10, AZROU 1, JERUSALEM 3, and t^w75 all have identical Aβ alleles, thereby establishing them as a core of the b group,


named after the b haplotype of the prototypic C57BL/10 mouse strain (also designated B10). The t^w71 and t^w12 strains have identical Aβ alleles, as do the two TT6 t haplotype strains, although these two may carry identical chromosomes. The two mouse strains B10.WB and JERUSALEM 4 are another pair with an F value of 1.00 between them, with B10.M differing from them by a single Bgl II fragment. In the d group, named after the well characterized prototypic d haplotype, the Aβ alleles of B10.RIII and METKOVIC 1 are identical to one another, as are those of t^w8 and t^w32. The mouse strains B10.F, B10.Q, B10.CAA2, B10.KEA5, and B10.STC77 all possess identical Aβ alleles, as do the pair STU and W12A, and the k group members B10.BR and B10.CHA2, which will all be considered in detail shortly. RFLP analysis with seven different restriction endonucleases identified 22 different Aβ alleles among the 36 mouse strains analyzed, including the t haplotype-bearing mice. Table 2 presents a matrix comparison of the divergence of Aβ within and between representative members of groups b, d, and k, expressed as F values and based on results obtained with seven REs. The full strain designations can be found in Table 1. Once the F values for the Aβ alleles of the 37 standard laboratory inbred strains, wild-derived strains, and t haplotype-bearing strains were calculated and compared, the existence of three groups became obvious. The F


[Table 2. Matrix comparison of F values for Aβ alleles within and between representative members of groups b, d, and k. The table body is illegible in the scanned text.]


values within any one of the three groups are usually greater than 0.50, and most often 0.67 or higher. The discrete nature of the groups is demonstrated by the F values between the groups, which range from zero to 0.18, indicating very little homology. Table 1 contains the complete listing of the 37 strains analyzed, with the Aβ group to which each has been assigned shown in the far right column. Every mouse strain for which RFLP data were obtained using seven REs can be placed into one of these three groups. Also presented in table 1 is the subspecies of each of the strains tested. There is no correlation between the subspecies of the mice and the Aβ group to which they belong. For example, the subspecies Mus musculus domesticus is present in all three groups, and three different subspecies are present in the d group, indicating that the existence of these Aβ groups predated the subspeciation of Mus musculus at least into the three subspecies represented here. Subspeciation probably occurred approximately one million years ago. The ancient origin of these evolutionary Aβ groups indicates a continual maintenance of at least part of the Aβ gene structure. To further substantiate the existence and the discrete nature of the evolutionary Aβ groups, a statistical comparison of the Aβ groups is shown in table 3. The average F values between any two members


Table 3. Statistical comparison of Aβ groups defined by RFLP analysis.

                  Mean F value ± S.D.
Group   Within same group   Between different groups   d.f. (a)   Student's t test
1       .641 ± .157         .098 ± .074                466        50.13
2       .644 ± .161         .103 ± .088                511        48.94
3       .697 ± .158         .147 ± .091                140        13.99

(a) Degrees of freedom.


within the same group are no lower than .641, while the average F values between different groups are no higher than .147. The Student's t test value for each of the three groups corresponds to a probability of p < .001, indicating the statistical validity of these three evolutionary groupings. The genetic diversity of Aα in 35 of the 36 strains of mice was analyzed by RFLP using a 1.2 kb Hind III restriction fragment derived from Aα (J. Seidman, personal communication). This fragment contains part of the exon which encodes the a^_ domain of Aα together with approximately one kb of flanking intron sequences (McIndoe and Wakeland, unpublished observation). Representative examples of the restriction fragment patterns detected at high stringency with this Aα probe are shown in figure 6. Digestion with the restriction endonuclease Eco RI resulted in a 10.6 kb Aα fragment in all of the 35 mouse strains examined, without exception. Upon digestion with the other six restriction endonucleases used in this study, however, a significant number of restriction fragment polymorphisms were noted. Of the 35 mouse strains assessed for Aα, 27 separate alleles are discernible by the RFLP analysis, indicating the polymorphic nature of the Aα gene. Table 4 presents a matrix comparison of the calculated F values based on the restriction fragments obtained with the total of seven restriction


Figure 6. Autoradiogram of an Eco RI restriction endonuclease digestion of DNA from wild and standard laboratory inbred strains of mice, probed with a 1.2 kb Hind III fragment from Aα kindly supplied by Dr. John Seidman. As with the Aβ probe, this Aα probe is derived from a genomic clone and contains predominantly noncoding sequence.


[Figure 6 image. Lanes: w221, w219, w205, w202, w220, r, v, w22, w15, w17, w3. Molecular weight markers in kb.]


[Table 4. Matrix comparison of F values for Aα alleles of the strains listed in table 2. The table body is illegible in the scanned text.]


endonucleases. The mouse strains in this Aα matrix are the same strains listed in table 2 and represent the three evolutionary Aβ groups. Within each of the three Aβ groups, the Aα gene shows lower F values, indicating a lesser degree of homology between members of the same Aβ group. More significantly, the F values of the Aα alleles compared between the different Aβ groups are much higher than those for the Aβ alleles, ranging very close to the F values seen within an Aβ group. Thus no groups are detected in this RFLP analysis of the Aα gene of 35 different strains of mice.

RFLP Analysis of t Haplotype-Bearing Mice

A number of t haplotype-bearing mice have also been examined in this protocol because there exists some controversy regarding their evolutionary origin. In an attempt to address this question, nine different t haplotype strains were examined by RFLP analysis in a collaborative study with Dr. Joseph Nadeau of Jackson Laboratories. Eight different t haplotype-bearing mice, plus two k haplotype controls, were examined with the same seven restriction endonucleases used for the other 28 mouse strains. Figure 7 shows a representative autoradiogram of an Aβ-probed Sac I restriction endonuclease digestion of the nine t haplotype-bearing mouse strains as listed in


Figure 7. Autoradiogram of a Sac I restriction endonuclease digestion of DNA from t haplotype-bearing mouse strains, probed with the Aβ probe. Many of the strains are heterozygous with either the k haplotype or the t^6 haplotype, as described in the text, explaining the overabundance of Aβ restriction fragments seen.



the figure and in table 1. The t haplotype strains are often maintained as heterozygotes for the t haplotype because some t haplotypes are lethal when homozygous. This may explain the multitude of restriction fragments seen in this figure. To compensate for this, the k restriction fragments (7.85, 4.6, and 2.0 kb) have been subtracted from t haplotypes t^w71, t^w75, t^w12, t^w5, and t^w32. In addition, the TT6 strains, which may have identical chromosomes, are heterozygous with t^6-JR1, so those restriction fragments have been subtracted from TT6 for each restriction endonuclease in order to calculate relevant F values. The 7.3 kb Aβ Sac I restriction fragment is present in strains t^w71, TT6, t^w75, and t^w12. The 5.2 kb Aβ Sac I fragment is found in strains TT6, t^6, t^w8, t^w5, and t^w32. The 3.6 kb fragment is present in all eight of the t haplotype strains except for t^w75, which is identical to C57BL/10 for all seven restriction endonucleases. Finally, the 1.58 kb Sac I fragment shown in figure 7 is present in strains t^w71, TT6, t^w75, and t^w12, all of which are in the b group. Because various Aβ restriction fragments in the t haplotype strains were shared with members of the three Aβ evolutionary groups, the RFLPs of the t haplotype strains were compared to those of the other 28 mouse strains analyzed. The Aβ alleles of all nine t haplotype strains examined by RFLP analysis fit into either the b or the d group, as listed in table 1 and shown with


representative strains in table 5. The F values for their Aβ alleles range from 0.50 to 1.00 within a group, and from 0 to 0.17 between groups in table 5. Again, as with some of the other wild and standard laboratory inbred strains, a few are identical to one another, such as t^w12 and t^w71, and t^w8 and t^w32. Not shown in table 5 are the F values comparing the t haplotype strains in the b and d groups to members of the k group. The F values in this comparison ranged from 0 to 0.17; therefore none of the t haplotype strains examined are members of the k group. Table 6 presents a statistical comparison of the Aβ alleles of the t haplotype strains to determine whether they are more related to one another within either of the Aβ evolutionary groups, or are just as related to any other member of their respective Aβ group. As was just mentioned, there is no question that each belongs to one of the two groups. The Student's t test values shown in table 6 indicate that in neither group are the Aβ alleles of the t haplotypes more related to one another than to other members of the same Aβ group at the p < 0.001 level of significance, particularly in the d Aβ evolutionary group. The Aα alleles of the t haplotype strains were also examined by RFLP analysis with the same seven restriction endonucleases. Figure 8 shows representative Bgl II restriction fragments seen in the t haplotype strains


[Table 5. Matrix comparison of F values for Aβ alleles of representative t haplotype strains and other members of the b and d groups. The table body is illegible in the scanned text.]


Table 6. Statistical comparison of Aβ alleles of t haplotype mice within related t haplotype groupings and between t haplotype groupings compared to their related evolutionary Aβ grouping.

                  Mean F value ± S.D.
Group   Within Aβ t group   Between t and Aβ groups   d.f. (a)   Student's t test
b       .764 ± .230         .614 ± .137               63         2.85
d       .733 ± .179         .606 ± .136               60         2.11

(a) Degrees of freedom.


Figure 8. Autoradiogram of a Bgl II restriction endonuclease digestion of DNA from the t haplotype-bearing strains of mice, probed with the Aα fragment.




with the Aα probe which has already been described. Although Aα is a polymorphic gene, no grouping patterns were detected by RFLP analysis, as is readily demonstrated in figure 8.

RFLP Analysis of the Divergence of the Aα and Aβ Genes Within the A^p Family

I have compared the organization and structures of the Aα and Aβ alleles present in the A^p family members by RFLP analysis with DNA probes specific for each gene. The A^p family consists of 6 M.m. domesticus H-2 haplotypes and 1 M.m. castaneus H-2 haplotype derived from Asian and North American wild mouse populations (Wakeland and Klein 1983). Their grouping of the 7 mouse strains into the A^p family is based on similarities in the antigenic phenotypes of the respective class II molecules. High pressure liquid chromatography (HPLC) tryptic peptide fingerprint comparisons of the Aα and Aβ subunits of the A^p family members have demonstrated close structural relationships of the I-A molecules encoded by alleles in the A^p family (McConnell et al. 1986). A gradation in the relatedness of the 7 I-A molecules became apparent from these tryptic peptide fingerprint data. Representative examples of the restriction fragment patterns detected by probing at high stringency with Aβ are presented in figure 9. Digestion with endonuclease


Figure 9. Autoradiogram of an Eco RI and a Bam HI restriction endonuclease digestion of the seven members of the I-A^p family, probed with Aβ, demonstrating the RFLP relatedness of the five core strains.


[Figure 9 image. Eco RI (left) and Bam HI (right) digests; lanes q, p, w11, w14, w5, w17, w3. Molecular weight markers in kb.]


Eco RI yields 2 different sizes of restriction fragments for Aβ among the 7 A^p family alleles. Aβ^w3 is present on a 6.2 kb fragment while the remainder of the A^p family alleles are present on 5.8 kb fragments (Figure 9, left). Similarly, digestion with endonuclease Bam HI yields two patterns of restriction fragments within the A^p family (Figure 9, right). Digestion of Aβ^p, Aβ^q, Aβ^w5, Aβ^w14, and Aβ^w11 with Bam HI yields 3.6 kb and 5.4 kb fragments, while digestion of Aβ^w3 and Aβ^w17 yields a single 9.0 kb fragment. The results with Bam HI are consistent with the loss of a single Bam HI site within the Aβ^w3 and Aβ^w17 alleles during their evolutionary divergence from the rest of the A^p family. The DNA sequence homology among these Aβ alleles, as before, is quantitated from the RFLP data by calculating the fraction homologous (F) value. Table 7 presents a matrix comparison of the divergence of Aβ within the A^p family based on results obtained with 7 restriction enzymes. The Aβ^p, Aβ^w11, Aβ^w5, and Aβ^w14 alleles are indistinguishable by RFLP, consistent with the similar structures of the Aβ subunits they encode. Although Aβ^w3 differs from Aβ^p by only 6 peptides, Aβ^w3 and Aβ^p differ by 60% of their restriction fragments. In contrast, Aβ^w17, which encodes an Aβ subunit differing from Aβ^p by only 25% of restriction fragments compared. As with the Aβ gene, the genetic diversity of the Aα gene within the A^p family was assessed by RFLP analysis.


Table 7. RFLP analysis of the A^p family Aα alleles

Allele   p           q        w11      w5       w14      w17      w3
p        -           1.00 (a) .95      .95      .95      .76      1.00
q        22/22 (b)   -        .95      .95      .95      .76      1.00
w11      20/21       20/21    -        1.00     1.00     .80      .95
w5       20/21       20/21    20/20    -        1.00     .80      .95
w14      20/21       20/21    21/21    21/21    -        .80      .95
w17      16/21       16/21    16/20    16/20    16/20    -        .76
w3       22/22       22/22    20/21    20/21    20/21    16/21    -

(a) The upper right half of each checkerboard lists the fraction homologous (F) value as defined by Nei and Li (1979).
(b) Number of shared restriction fragments/total restriction fragments scored for both alleles. This analysis is based on restriction digestions with Bam HI, Bgl II, Eco RI, Hind III, Pst I, Pvu II, and Sac I.
(c) The genomic DNA from w17 could not be digested to completion with Hind III and consequently results with this enzyme for w17 were not included in this analysis.
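The two halves of the checkerboard above can be cross-checked against each other. The sketch below assumes that each lower-left entry is written as 2n_XY over (n_X + n_Y), so that dividing it out gives the F value printed in the upper right; that reading is inferred from entries such as 22/22 = 1.00 and 16/20 = .80 and is not stated explicitly in the table. The function and variable names are hypothetical.

```python
# Lower-left entries copied from the table above as (numerator, denominator).
# Assumption: each entry is 2 * n_XY over (n_X + n_Y), so the ratio equals F.
lower_left = {
    ("q", "p"): (22, 22),
    ("w11", "p"): (20, 21),
    ("w17", "p"): (16, 21),
    ("w17", "w11"): (16, 20),
}

# Upper-right F values printed in the table for the same allele pairs.
upper_right = {
    ("q", "p"): 1.00,
    ("w11", "p"): 0.95,
    ("w17", "p"): 0.76,
    ("w17", "w11"): 0.80,
}


def mismatches(fractions, reported, tolerance=0.005):
    """Return the allele pairs whose fraction does not reproduce the reported F."""
    bad = []
    for pair, (shared, total) in fractions.items():
        if abs(shared / total - reported[pair]) > tolerance:
            bad.append(pair)
    return bad


print(mismatches(lower_left, upper_right))  # [] when both halves agree
```

The tolerance of 0.005 simply allows for the two-decimal rounding used in the table.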


A 1.2 kb Hind III genomic restriction fragment derived from Aα was used to probe DNA of the A^p family digested with the same 7 restriction endonucleases used for the Aβ analysis. Representative examples of restriction fragment patterns detected at high stringency with this probe for Aα are shown in Figure 10. Digestion with restriction endonuclease Eco RI yielded a single 10.6 kb fragment containing Aα for every allele in the A^p family (Figure 10, left side). Similarly, digestion with Bam HI yielded a 5.2 kb Aα fragment with every A^p family allele except Aα^w17, which yielded a 5.4 kb fragment (Figure 10, right side). Table 8 presents a matrix comparison of the calculated F values between the various Aα alleles present within the A^p family. The evolutionary divergence of these alleles coincides closely with that detected for Aβ. The Aα^p and Aα^q alleles are indistinguishable with the 7 restriction enzymes used and can be distinguished from Aα^w5, Aα^w11, and Aα^w14 by a single restriction fragment, indicating that their DNA sequences are very homologous. These results are consistent with the structural similarity of the Aα subunits encoded by the gene (McConnell et al. 1986), and correlate precisely with RFLP results obtained with Aβ (see Table 7). Although by RFLP analysis Aα is not organized into evolutionary groups as is Aβ, the restriction fragment genotypes of Aα^w3 and Aα^w17 were distinguishable from


Figure 10. Autoradiogram of an Eco RI and a Bam HI restriction endonuclease digestion of the seven members of the I-A^p family, probed with Aα. The 10.6 kb Eco RI fragment is present in all the mouse strains tested.


[Figure 10 image. Eco RI (left) and Bam HI (right) digests; lanes q, p, w11, w14, w5, w17, w3.]


Table 8. RFLP analysis of the A^p family Aβ alleles

Allele   p           q        w11      w5       w14      w17      w3
p        -           1.00 (a) 1.00     1.00     1.00     .80      .43
q        24/24 (b)   -        1.00     1.00     1.00     .80      .43
w11      24/24       24/24    -        1.00     1.00     .80      .43
w5       24/24       24/24    24/24    -        1.00     .80      .43
w14      24/24       24/24    24/24    24/24    -        .80      .43
w17      16/20       16/20    16/20    16/20    16/20    -        .53
w3       10/23       10/23    10/23    10/23    10/23    10/19    -

(a) The upper right half of each checkerboard lists the fraction homologous (F) value as defined by Nei and Li (1979).
(b) Number of shared restriction fragments/total restriction fragments scored for both alleles. This analysis is based on restriction digestions with Bam HI, Bgl II, Eco RI, Hind III, Pst I, Pvu II, and Sac I.
(c) The genomic DNA from w17 could not be digested to completion with Hind III and consequently results with this enzyme for w17 were not included in the analysis.


each other and from the other 5 members of the A^p family. As with Aβ, the structural relationships of the Aα^w3 and Aα^w17 subunits with those of the other members of the A^p family did not correlate precisely with the evolutionary relationships of their RF genotypes. As before, Aα^w3 was less related than Aα^w17 to the rest of the A^p family, although Aα^w3 is more similar to Aα^p than is Aα^w17 by tryptic peptide fingerprinting (McConnell et al. 1986).

RFLP Analysis of the Divergence of the Aα and Aβ Genes Within the A^k Family

The A^k family contains 5 M.m. domesticus H-2 haplotypes which were derived from either European or North American wild mouse populations (Wakeland and Darby 1983). The antigenic phenotypes expressed by alleles in the A^k family are very similar, but at least 3 minor variants of the A^k molecule are present (Wakeland and Darby 1983). Tryptic peptide fingerprinting and radiochemical sequencing studies have demonstrated that these 3 forms of the A^k molecule differ by only 2 or 3 amino acid substitutions in the α1 and β1 protein domains (Wakeland et al. 1985). Examples of the restriction fragment patterns detected by probing at high stringency with Aβ are presented in Figure 11. Digestion with restriction endonuclease Sac I yields 3 distinct restriction fragment patterns for Aβ among the 5 A^k family alleles. Aβ^k and


Figure 11. Autoradiogram of a Sac I and an Eco RI restriction endonuclease digestion of the I-A^k family, probed with Aβ.


[Figure 11 image. Sac I (left) and Eco RI (right) digests; lanes k, w26, w216, w15, w34. Molecular weight markers in kb.]


Aβ^w26 are detected on 7.85, 4.6, and 2.0 kb fragments, Aβ^w216 and Aβ^w34 are detected on 5.2, 3.8, 2.3, and 1.58 kb fragments, and Aβ^w15 is detected on 7.3, 3.5, and 1.58 kb fragments. Aβ^k and Aβ^w26 are indistinguishable and share no restriction fragments with Aβ^w15, Aβ^w34, and Aβ^w216. Similarly, Aβ^w216 and Aβ^w34 are indistinguishable with Sac I and share only a 1.58 kb fragment with Aβ^w15 (Figure 11, left side). Results obtained following digestion with restriction endonuclease Eco RI are similar except that with this enzyme Aβ^w15, Aβ^w34, and Aβ^w216 are indistinguishable (Figure 11, right side). Aβ^k and Aβ^w26 are detected on a 16 kb fragment while Aβ^w15, Aβ^w34, and Aβ^w216 are detected on a 6.4 kb fragment. Table 9 presents a pairwise comparison of the homology of Aβ within the A^k family based on RFLP results obtained with 7 restriction enzymes. These results clearly divide the Aβ alleles in the A^k family into two distinct groups based on homology of their genomic structures. I found that these two A^k groups are members of Aβ evolutionary groups k and b, and share less than 15% of their restriction fragments. There is also some heterogeneity of Aβ within the Aβ b evolutionary group, as Aβ^w34 and Aβ^w216, which were derived from European wild mice, can be distinguished from the North American-derived Aβ^w15 allele by variations in 4 restriction fragments.


Table 9. RFLP analysis of the A^k family Aβ alleles

Group                            Alleles
designation   Allele   k           w26      w216     w15      w34
k             k        -           1.00 (a) .09      .09      .09
k             w26      22/22 (b)   -        .09      .09      .09
b             w216     2/23        2/23     -        .58      1.00
b             w15      2/23        2/23     14/24    -        .58
b             w34      2/23        2/23     24/24    14/24    -

(a) The upper right half of each checkerboard lists the fraction homologous (F) value as defined by Nei and Li (1979).
(b) Number of shared restriction fragments/total restriction fragments scored for both alleles. This analysis is based on restriction digests with Bam HI, Bgl II, Eco RI, Hind III, Pst I, Pvu II, and Sac I.
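The division of the A^k family into two groups can be illustrated by linking any two alleles whose F value reaches a cutoff and collecting the connected sets. This is only a sketch of one way to do it: the grouping in the study was made by inspection of the F values, and the single-linkage rule, the 0.5 cutoff (chosen because within-group F values are usually above 0.50 and between-group values do not exceed 0.18), and the function name are all assumptions. The F values are those of Table 9.

```python
def group_alleles(alleles, f_values, threshold=0.5):
    """Single-linkage grouping: alleles joined by any chain of pairs with F >= threshold."""
    parent = {a: a for a in alleles}

    def find(a):
        # Follow parent links to the representative of a's group.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for (a, b), f in f_values.items():
        if f >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for a in alleles:
        groups.setdefault(find(a), []).append(a)
    return sorted(sorted(g) for g in groups.values())


alleles = ["k", "w26", "w216", "w15", "w34"]
# Upper-right F values from Table 9.
f_values = {
    ("k", "w26"): 1.00, ("k", "w216"): 0.09, ("k", "w15"): 0.09, ("k", "w34"): 0.09,
    ("w26", "w216"): 0.09, ("w26", "w15"): 0.09, ("w26", "w34"): 0.09,
    ("w216", "w15"): 0.58, ("w216", "w34"): 1.00, ("w15", "w34"): 0.58,
}

print(group_alleles(alleles, f_values))  # [['k', 'w26'], ['w15', 'w216', 'w34']]
```

With these values the result does not depend on the exact cutoff, since any threshold between 0.09 and 0.58 yields the same two groups.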

PAGE 113

Figure 12 presents representative examples of the restriction fragment patterns detected at high stringency with the 1.2 kb Hind III fragment probe for Aα. Digestion with restriction endonuclease Hind III yielded 3 distinct restriction fragment patterns among the 5 A^k family alleles. Aα^k and Aα^w26 are detected on 9.4 kb fragments, Aα^w34 and Aα^w216 are detected on >20 kb fragments, and Aα^w15 is detected on a 12.5 kb fragment (Figure 12, right side). Aα^k and Aα^w26 are indistinguishable and share no restriction fragments with Aα^w34, Aα^w216, or Aα^w15; and Aα^w15 is distinguishable from Aα^w34 and Aα^w216. Digestion with restriction endonuclease Pvu II yielded 2 distinct restriction fragment patterns in which Aα^k and Aα^w26 are detected on 3.3 kb fragments while Aα^w34, Aα^w216, and Aα^w15 are detected on 3.8 kb fragments (Figure 12, left side). The Aα^k and Aα^w26 alleles are again closely related to one another and different from the other alleles, although all five are related at the protein level. A pairwise comparison of the homology of Aα within the I-A^k family is presented in Table 10. This analysis clearly divides the Aα alleles in the A^k family into two groups based on the homology of their gene structures. These two groups coincide precisely with the Aβ evolutionary groups of which the corresponding Aβ alleles are members. As with the Aβ gene, some additional genetic diversification of the Aα gene was detected in the Aα alleles within each group. Aα^k can be distinguished from

PAGE 114

Figure 12. Autoradiogram of a Pvu II and a Hind III restriction endonuclease digestion of the I-A^k family, probed with Aα.

PAGE 115

[Figure 12 image: autoradiogram with Pvu II (left) and Hind III (right) panels. Lanes in each panel: k, w26, w216, w15, w34. Size markers (kb): 21.2, 6.7, 4.9.]

PAGE 116

Table 10. RFLP analysis of the A^k family Aα alleles.

Group
Designation   Alleles   k         w26       w216      w15       w34
k             k         -         .90 a     .63       .38       .63
k             w26       18/20 b   -         .74       .48       .74
b             w216      12/19     14/19     -         .50       1.00
b             w15       8/21      10/21     10/20     -         .50
b             w34       12/19     14/19     18/18     10/20     -

a The upper right half of each checkerboard lists the fraction homologous (F) value as defined by Nei and Li (1979).
b Number of shared restriction fragments/total restriction fragments scored for both alleles. This analysis is based on restriction digests with Bam HI, Bgl II, Eco RI, Hind III, Pst I, Pvu II, and Sac I.
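The two halves of the checkerboard in Table 10 carry the same information, and this can be checked by arithmetic. The sketch below assumes that the "total" in each shared/total entry is the mean number of fragments scored for the two alleles, so that shared/total equals the Nei and Li (1979) value F = 2 n_xy / (n_x + n_y). That reading is an inference from the tabulated numbers rather than something stated in the footnote, and the function name is illustrative.

```python
# Table 10 (A-alpha): (shared, total) from the lower left, F from the upper right.
table10 = {
    ("k", "w26"): ((18, 20), 0.90),
    ("k", "w216"): ((12, 19), 0.63),
    ("k", "w15"): ((8, 21), 0.38),
    ("k", "w34"): ((12, 19), 0.63),
    ("w26", "w216"): ((14, 19), 0.74),
    ("w26", "w15"): ((10, 21), 0.48),
    ("w26", "w34"): ((14, 19), 0.74),
    ("w216", "w15"): ((10, 20), 0.50),
    ("w216", "w34"): ((18, 18), 1.00),
    ("w15", "w34"): ((10, 20), 0.50),
}

def mismatches(table, tol=0.005):
    """Return the pairs whose shared/total ratio does not round to the listed F."""
    return [pair for pair, ((shared, total), f) in table.items()
            if abs(shared / total - f) > tol + 1e-9]

print(mismatches(table10))  # an empty list means the two halves agree
```

Under the stated assumption every ratio rounds to the listed F value, for example 12/19 to .63 and 10/21 to .48.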

PAGE 117

Aα^w26 by one restriction fragment, and Aα^w15 can be distinguished from Aα^w216 and Aα^w34 by three restriction fragments. This pattern of diversification also coincides with that seen for Aβ, except that Aβ^w26 and Aβ^k could not be distinguished with the 7 restriction endonucleases used in this analysis.

PAGE 118

DISCUSSION

RFLP Analysis of the Genomic Structures of the Murine Class II Histocompatibility Alleles

Comparisons of the RF genotypes of Aβ and Aα alleles provide information on DNA sequence variations throughout the interval of genomic DNA containing Aα or Aβ, including sequences contained in exons, introns, and flanking regions. The size of the interval of genomic DNA analyzed with each probe can be estimated by summing the sizes of all the restriction fragments detected with each restriction endonuclease and averaging these sums over the enzymes used. For the 7 restriction enzymes used in this analysis, the Aβ probe hybridized to an average of 9.4 kb of genomic DNA per restriction enzyme digest and the Aα probe hybridized to an average of 7.2 kb. Therefore, the polymorphic restriction enzyme sites assayed in this study are distributed over a fairly large segment of genomic DNA. Because Aα and Aβ are each encoded by about 700 base pairs of exon DNA, the majority of the genomic DNA assayed by this RFLP analysis is from intron and flanking regions. Thus, although RF genotypes reflect sequence variations in the entire segment of genomic DNA containing the assayed gene, the majority of

PAGE 119

the restriction sites detected reflect DNA sequence variations in non-coding regions.

Standard Laboratory Inbred and Wild Mice

As was stressed in the section on polymorphism in the literature review, the use of wild H-2 haplotypes has been critical to this study. With the use of mice of known subspecies origin and known geographic origin, more accurate interpretations about the generation of polymorphism of the Aα and Aβ genes can be made from the data. The roles of natural selection and evolution in the generation of polymorphism are essentially unknown in the standard laboratory mouse strains when compared to the wild mice. This advantage of wild mice in analyzing the polymorphism of class II genes, combined with the discovery and characterization of wild-derived H-2 haplotypes into distinct antigenic families (Wakeland and Klein 1977; Wakeland and Klein 1983), has laid the groundwork for this dissertation. The 36 mouse strains listed in table 1 all fit into one of the three Aβ groups, as shown in the right hand column of table 1. The fact that Mus musculus domesticus is present in all three evolutionary groups, as well as the fact that three different subspecies are present in one of the groups, indicates that these three Aβ groups were present as discrete groups prior to subspeciation of Mus musculus. Therefore there must be some selective

PAGE 120

pressures to maintain the genomic structure as detected by the RFLP analysis, even through the evolutionary process of subspeciation. As previously mentioned, the Aα and Aβ probes used in this study are both derived from genomic clones and contain predominantly noncoding sequence. Whatever genetic element has maintained these evolutionary groups, it is very probably located in an intron somewhere in the Aβ gene. The striking difference between the protein and genomic structures of the A^p and the A^k families provides particularly strong evidence for an intron element being fundamental to the difference between evolutionary Aβ groups.

Definition of Evolutionary Groups by RFLP Analysis

The distinct grouping of the Aβ alleles is demonstrated by the representative mouse strains and their F values as shown in table 2. Because of the number of mouse strains analyzed, they could not all be electrophoresed and Southern blotted onto the same sheet of nylon membrane; therefore each strain was compared with every other strain on at least three and on as many as five different autoradiograms of Southern blots. In this manner every restriction fragment detected was electrophoresed on the same autoradiogram with any other restriction fragment that was close to the same size.

PAGE 121

The significance of the existence of the three Aβ groups relates directly to the evolution of the subspecies because of the presence of M.m. domesticus, M.m. castaneus, and M.m. musculus in one group. Thus these evolutionary groups have been in existence, and must have been maintained, for over one million years, since subspeciation is estimated to have occurred. These results indicate that Aβ has diversified during evolution as a limited number of discrete allelic forms rather than as a random array of genetic variants. Choi et al. (1983) sequenced exon portions of genomic clones, and determined overall genomic structural features including noncoding regions, from three different haplotypes, the b, d, and k haplotypes, and concluded from the data that the generation of polymorphism has been a random evolutionary process, most probably through multiple independent mutational events. Because they analyzed only three Aβ alleles, their interpretation was overstated, in part because the three alleles they chose to analyze happened to be the three prototypic Aβ alleles for the three distinct evolutionary groups, as the data presented in this dissertation demonstrate. Therefore, the genetic mechanisms which generated the Aβ polymorphism are most probably something other than single random mutational events, and are more likely either gene conversion or some form of intragenic recombination.

PAGE 122

Evidence for gene conversion in Aβ has been presented (Mengle-Gaw et al. 1984), but because of the widespread patterns detected for all of the Aβ alleles analyzed in this study, the conversion or recombination event may be more ancient and established in the mouse haplotypes than previously suspected. The variety of postulated genetic mechanisms for generation of class II polymorphism is discussed in the Literature Review of this dissertation. An intragenic, which might also be termed an interallelic, recombination mechanism at an evolutionary stage preceding subspeciation would explain the discrete grouping of the Aβ alleles, which might then diversify more or less within the respective Aβ evolutionary group. Data consistent with such a mechanism have been presented by Wakeland and Darby (1983). The presence of RFLP identical alleles, along with related but not identical alleles within each evolutionary group, would be explained by such a mechanism. The Aα gene has also been examined in detail at the DNA sequence level. Benoist et al. (1983b) examined six different standard laboratory inbred Aα alleles and found a high degree of polymorphism in the Aα gene, which is substantiated by the results of this dissertation, and found hypervariable clustering of amino acid encoding nucleotides in the exon encoding the first protein domain. The RFLP analysis of Aα presented in this

PAGE 123

dissertation is not a fine enough structural detection method to confirm those results, but it is readily apparent that the Aα alleles did not conform to any evolutionary grouping pattern as did the Aβ alleles. Therefore the mechanisms generating the polymorphism may have operated over the relatively short span of DNA within the Aβ gene itself, and perhaps even within the intron between the first two domain encoding exons, as this is likely to be the most polymorphic intron. This possibility is corroborated by the fact that the genomic probe used in this RFLP analysis would predominantly detect intron sequence, as described in the results. In the most closely related mouse strains of the Aβ evolutionary groups, one pattern of the Aα alleles became apparent. For mouse strains which have an Aβ F value of 0.85 or higher, a close relatedness of the same strains' Aα alleles was demonstrated. These results indicate that the Aβ and Aα genes show coevolution over at least one million years which can be detected in closely related alleles.

Organization of the t Haplotypes into Aβ Evolutionary Groups

The t haplotypes, as discussed in detail in the literature review, all appear to be related to one another, at least on a general structural level. Therefore it would be reasonable to assume they have

PAGE 124

a very ancient common evolutionary ancestor. As can be seen in tables 5 and 6, the t haplotypes are definitively organized into the b and d Aβ evolutionary groups. The same patterns of identical F values and closely related F values being found within the same evolutionary group are seen. Table 6 demonstrates that the t haplotypes are not an evolutionarily isolated genetic anomaly but have undergone Aβ evolutionary changes similar to those of the other wild mice.

Comparisons of class II molecules and class II gene RF genotypes within the A^p and A^k families.

The grouping of some of the Aβ alleles has its foundation in serology (Wakeland and Klein 1979a, 1983) and in tryptic peptide mapping (Wakeland and Darby 1983). These related alleles have been referred to as the A^p and A^k families in past publications because of the close relatedness of their class II molecules, and to avoid confusion will continue to be referred to as such here. It is important to note, however, that their genomic structure indicates that the A^p family Aβ alleles are integral members of the evolutionary Aβ d group. The A^k family shows a dramatic split in the RFLP categorization of its Aβ alleles, as will be discussed shortly. In the results section of this dissertation, the serologic and HPLC tryptic peptide fingerprinting data

PAGE 125

have been described which demonstrate a gradation in the structural similarities of A molecules encoded by alleles in the A^p family. The most closely-related or "core" alleles of the I-A^p family, which are A^p, A^q, A^w14, A^w5, and A^w11, encode molecules which probably differ by only 2 or 3 point mutations in Aβ. The I-A^w3 allele (B10.SAA48) encodes a molecule which is structurally similar to the A molecules encoded by the core alleles, but differs from them by minor structural variations in both subunits. The A^w17 allele (B10.CAS2) encodes a molecule which differs from the rest of the A^p family by numerous structural variations in both subunits (McConnell et al. 1986). In fact, the amount of structural variation distinguishing A^w17 from the rest of the I-A molecules encoded by A^p family alleles indicates it is at best a very peripheral member of the A^p family. Some interesting features about the evolutionary divergence of the A^p family are revealed when the RF genotypes of the different alleles involved are compared. The five core alleles of the A^p family all have closely related or indistinguishable RF genotypes for Aα and Aβ, indicating that these alleles have very similar genomic structures. These results are consistent with the structural similarity of the A molecules they encode and suggest that these minor variant A alleles must have recently diverged from a common ancestral allele, as

PAGE 126

other alleles with identical RFLPs. But in the case of the A^p family, a close evolutionary relationship between A^w3 and the A^p family core alleles is not established by a comparison of their RFLPs. These results suggest that the noncoding region of A^w3 may differ in the region which defines the Aβ evolutionary groups, although both genes encode molecules with similar structures. In contrast, A^w17 encodes an A molecule with much less similarity to A^p, but appears more related by RFLP analysis. These observations indicate that the comparison of RFLPs between two class II alleles may not always accurately predict the structural similarity of their expressed gene products. The A^k family alleles were previously divided into two groups on the basis of tryptic peptide analysis and radiochemical sequencing (Wakeland and Darby 1983; Wakeland et al. 1985). The A molecules of A^k and A^w26 differ from those of A^w216, A^w15, and A^w34 by sequence variations affecting amino acid positions 28 and 95 in the β1 domain and by variations affecting two adjacent tryptic peptides in the α1 domain. These five alleles, which are closely related by protein analysis techniques, are as different as possible by RFLP analysis, and in fact are in two separate evolutionary Aβ groups. These results indicate that the Aα and Aβ alleles of these two subsets of what has been called the A^k family have very unrelated genomic structures despite the fact

PAGE 127

that the I-A molecules they encode differ from one another by only two or three amino acid interchanges. The genetic mechanisms which have generated the noncoding differences between these sets of alleles are unknown, but as discussed earlier, may reflect an intragenic recombination event, particularly considering that the exons of these alleles are so nearly identical. The Aα alleles in the A^k and A^p protein family members followed a pattern identical to that of the Aβ gene. Therefore, especially with the closely related haplotypes, the alleles of Aα and Aβ may be evolving as an AαAβ gene duplex in natural mouse populations, as has been previously published (Wakeland and Darby 1983). Although the Aα alleles exhibit less RF variability than alleles of Aβ based on F values, alleles at both loci in these two families exhibit the same evolutionary relationships.

PAGE 128

REFERENCES

Adorini, L., M.A. Harvey, A. Miller, and E.E. Sercarz. 1979. J. Exp. Med. 150:293.
Allison, J.P., L.E. Walker, W.A. Russell, M.A. Pellegrino, S. Ferrone, R.A. Reisfeld, J.A. Frelinger, and J. Silver. 1978. Proc. Natl. Acad. Sci. USA 74:5135.
Alper, C. 1981. The Role of the Major Histocompatibility Complex in Immunobiology. Dorf, M., ed., Garland STPM, New York, p. 173.
Artzt, K. 1984. Cell 39:565.
Artzt, K., P. McCormick, and D. Bennett. 1982a. Cell 28:463.
Artzt, K., H.-S. Shin, and D. Bennett. 1982b. Cell 28:471.
Artzt, K., H. Shin, D. Bennett, and A. Dimeo-Talento. 1985. J. Exp. Med. 162:93.
Asano, Y., A. Singer, and R. Hodes. 1983. J. Immunol. 130:67.
Auffray, C., A. Ben-Nun, M. Roux-Dosseto, R.N. Germain, J.G. Seidman, and J.L. Strominger. 1983. EMBO J. 2:121.
Auffray, C., A.J. Korman, M. Roux-Dosseto, R. Bono, and J.L. Strominger. 1982. Proc. Natl. Acad. Sci. USA 79:6337.
Bader, R.S. 1965. Growth 29:291.
Baltimore, D. 1981. Cell 24:592.
Baxevanis, C.N., Z.A. Nagy, and J. Klein. 1981. Proc. Natl. Acad. Sci. USA 78:3809.
Bechtol, K.B., and M.F. Lyon. 1978. Immunogenetics 6:571.

PAGE 129

Benacerraf, B. 1981. Science 212:1229.
Benacerraf, B., and H.O. McDevitt. 1972. Science 175:273.
Bennett, D. 1975. Cell 6:441.
Bennett, D., K. Artzt, J. Cookingham, and C. Calo. 1979. Genet. Res. 33:269.
Bennett, D., L.C. Dunn, M. Spiegelman, K. Artzt, J. Cookingham, and E. Schermerhorn. 1975. Genet. Res. 26:95.
Benoist, C.O., D.J. Mathis, M.R. Kanter, V.E. Williams II, and H.O. McDevitt. 1983a. Proc. Natl. Acad. Sci. USA 80:534.
Benoist, C.O., D.J. Mathis, M.R. Kanter, V.E. Williams II, and H.O. McDevitt. 1983b. Cell 34:169.
Berry, R.J., and J. Peters. 1977. Proc. R. Soc. London, Ser. B 197:485.
Bodmer, W.F. 1976. Harvey Lecture Ser. 72:91.
Bonhomme, F., J. Britton-Davidian, L. Thaler, and C. Triantaphyllides. 1978. C. R. Hebd. Seances Acad. Sci., Ser. D 287:631.
Breakey, D.R. 1963. J. Mammal. 44:153.
Bregegere, F., J.P. Abastado, S. Kvist, L. Rask, J.L. Lalanne, H. Garoff, B. Cami, K. Wiman, D. Larhammar, P.A. Peterson, G. Gachelin, P. Kourilsky, and B. Dobberstein. 1981. Nature 292:78.
Brown, W.M., M. George, Jr., and A.C. Wilson. 1979. Proc. Natl. Acad. Sci. USA 76:1967.
Brown, W.M., E.M. Prager, A. Wang, and A.C. Wilson. 1982. J. Mol. Evol. 18:225.
Bruell, J.H. 1970. Contributions to Behavior-Genetic Analysis, p. 261. Lindzey, G., and D.D. Thiessen, eds., Appleton, New York.
Cecka, J., M. McMillan, D. Murphy, H. McDevitt, and L. Hood. 1979. Eur. J. Immunol. 9:955.
Choi, E., K. McIntyre, R.N. Germain, and J.G. Seidman. 1983. Science 221:283.

PAGE 130

Cohen-Haguenauer, O., E. Robbins, C. Massart, M. Bosson, I. Deschamps, J. Hors, J.M. Lalouel, J. Dausset, and D. Cohen. 1985. Proc. Natl. Acad. Sci. USA 82:3335.
Condamine, H., J.-L. Guenet, and F. Jacob. 1983. Genet. Res. 42:335.
Cook, R.G., J.D. Capra, J.L. Bednarczyk, J.W. Uhr, and E.S. Vitetta. 1979. J. Immunol. 123:2799.
Cook, R., J.D. Capra, J.W. Uhr, and E.S. Vitetta. 1981. Current Trends in Histocompatibility 1, p. 349. Reisfeld, R.A., and S. Ferrone, eds., Plenum, New York, N.Y.
Cullen, S.E., J.H. Freed, and S.G. Nathenson. 1976. Transplant. Rev. 30:236.
David, C.S., and D.C. Shreffler. 1974. Transplantation 17:462.
DeLong, K.T. 1966. Ecology 47:481.
DeLong, K.T. 1967. Ecology 48:611.
Dembic, Z., P.A. Singer, and J. Klein. 1984. EMBO J. 3:1647.
Dobrovoloskaia-Zavadskaia, N., and N. Kobozieff. 1932. C. R. Soc. Biol. Paris 110:782.
Duncan, W.R., and J. Klein. 1980. Immunogenetics 10:45.
Dunn, L.C., and E. Caspari. 1945. Genetics 30:543.
Dunn, L.C., and S. Gluecksohn-Schoenheimer (Waelsch). 1950. Proc. Natl. Acad. Sci. USA 36:233.
Egel, R. 1981. Nature 290:191.
Fachet, J., and I. Ando. 1977. Eur. J. Immunol. 7:223.
Falconer, D.S. 1947. J. Hered. 38:215.
Ferris, S.D., R.D. Sage, E.M. Prager, U. Ritte, and A.C. Wilson. 1983. Genetics 105:681.
Ferris, S.D., R.D. Sage, and A.C. Wilson. 1982. Nature 295:163.
Ferris, S.D., A.C. Wilson, and W.M. Brown. 1981. Proc. Natl. Acad. Sci. USA 78:2432.

PAGE 131

Figueroa, F., M. Golubic, D. Nizetic, and J. Klein. 1985. Proc. Natl. Acad. Sci. USA 82:2819.
Flaherty, L. 1980. The Role of the Major Histocompatibility Complex in Immunology. Dorf, M.E., ed., Garland, New York, p. 33.
Fox, H., G. Martin, M.F. Lyon, B. Herrmann, A.-M. Frischauf, H. Lehrach, and L. Silver. 1985. Cell 40:63.
Gaisler, J. 1975. Rodents in Desert Environments, p. 59. Prekash, I., and P.K. Ghosh, eds., Junk, The Hague.
Germain, R.N., R.I. Lechler, M.A. Norcross, D.M. Bentley, and D.H. Margulies. 1985. Advances in Gene Technology: Molecular Biology of the Immune System, p. 167. Streilein, J.W., F. Ahmad, S. Black, B. Blomberg, and R.W. Voellmy, eds., ICSU Press, Cambridge, U.K.
Gladstone, P., and D. Pious. 1978. Nature 271:459.
Gladstone, P., and D. Pious. 1980. Somatic Cell Genet. 6:285.
Gorer, P.A. 1936. Br. J. Exp. Pathol. 17:42.
Gorer, P.A. 1938. J. Pathol. Bacteriol. 47:231.
Gropp, A., U. Tettenborn, and E. von Lehmann. 1969. Experientia 25:875.
Gropp, A., U. Tettenborn, and E. von Lehmann. 1970. Cytogenetics 9:9.
Gropp, A., and H. Winking. 1981. Biology of the House Mouse, p. 141. Berry, R.J., ed., Academic Press, New York, N.Y.
Gropp, A., H. Winking, L. Zech, and H. Muller. 1972. Chromosoma 39:265.
Hadi, J.R., E.E. Stafford, F. Sukaeri, and W. Riberu. 1976. Southeast Asian J. Trop. Med. Public Health 7:487.
Hamajima, F. 1962. Kyushu Daigaku Nogakubu Gakugei Zasshi 20:61.
Hamajima, F. 1964. Kyushu Daigaku Nogakubu Gakugei Zasshi 21:73.

PAGE 132

Hammerberg, C., and J. Klein. 1975. Nature 253:137.
Hammerling, G., B. Deak, G. Mauve, U. Hammerling, and H. McDevitt. 1974. Immunogenetics 1:68.
Hansen, T.H., R.W. Melvold, J.S. Arn, and D.H. Sachs. 1980. Nature 285:340.
Harland, P.S.E.G. 1958. Ann. Mag. Nat. Hist. 1:193.
Harrison, J.L. 1955. Proc. Zool. Soc. London 125:445.
Harrison, D.L. 1972. The Mammals of Arabia, Vol. III. Ernest Benn Ltd., London.
Hassinger, J.D. 1973. Fieldiana, Zool. 60:1.
Hayes, C.E., K.K. Klyczek, D.P. Krum, R.M. Whitcomb, D.A. Hullet, and H. Cantor. 1984. Science 223:559.
Herrmann, B., M. Bucan, P.E. Mains, A. Frischauf, L.M. Silver, and H. Lehrach. 1986. Cell 44:469.
Hillman, N., and M. Nadijcka. 1980. J. Embryol. Exp. Morphol. 59:27.
Hood, L., M. Steinmetz, and R. Goodenow. 1982. Cell 28:685.
Hood, L., M. Steinmetz, and B. Malissen. 1983. Ann. Rev. Immunol. 1:529.
Hunt, W.G., and R.K. Selander. 1973. Heredity 31:11.
Hurme, M., P.R. Chandler, C.M. Heterington, and E. Simpson. 1978. J. Exp. Med. 147:768.
Hussain, S.R., A.A. Khan, and M.A. Beg. 1976. Biologia (Lahore) 22:261.
Hyldig-Nielsen, J.J., L. Schenning, U. Hammerling, E. Widmark, E. Helding, P. Lind, B. Servenius, T. Lund, R. Flavell, J.S. Lee, J. Trowsdale, P.H. Schreier, F. Zablitzky, D. Larhammar, P.A. Peterson, and L. Rask. 1983. Nucleic Acid Res. 11:5055.
Jeffreys, A.J., V. Wilson, and S.L. Thein. 1985. Nature 314:67.
Jones, P. 1977. J. Exp. Med. 146:1261.
Jones, J.K., Jr., and D.H. Johnson. 1965. Univ. Kans. Publ., Mus. Nat. Hist. 16:357.

PAGE 133

Jones, P.P., D.B. Murphy, and H.O. McDevitt. 1978. J. Exp. Med. 148:925.
Jones, P.P., D.B. Murphy, and H.O. McDevitt. 1981. Immunogenetics 12:321.
Juretic, A., Z.A. Nagy, and J. Klein. 1981. Nature 298:308.
Kabat, E.A., T.T. Wu, and H. Bilofsky. 1979. U.S. Dept. of Health, Education, and Welfare, NIH Publication No. 80-2008, Bethesda.
Kanno, M., S. Kobayashi, T. Tokuhisa, I. Takei, N. Shinohara, and M. Taniguchi. 1981. J. Exp. Med. 154:1290.
Katz, D., T. Hamaoka, M.E. Dorf, and B. Benacerraf. 1973. Proc. Natl. Acad. Sci. USA 70:2624.
Kaufman, J.A., C. Auffray, A.J. Korman, D.A. Shackelford, and J. Strominger. 1984. Cell 36:1.
Kindred, B., and D.C. Shreffler. 1972. J. Immunol. 109:940.
Klein, J. 1973. International Symposium on Standardization of HL-A Reagents, p. 251. Regamey, R.H., and J.V. Sparck, eds., Karger, Basel.
Klein, J. 1974. Ann. Rev. Genet. 8:63.
Klein, J. 1975. Biology of the Mouse Histocompatibility-2 Complex. Springer-Verlag, New York.
Klein, J. 1979. Science 203:516.
Klein, J., and F. Figueroa. 1981. Immunol. Rev. 60:23.
Klein, J., F. Figueroa, and C.S. David. 1983a. Immunogenetics 17:553.
Klein, J., F. Figueroa, and Z. Nagy. 1983b. Ann. Rev. Immunol. 1:119.
Klein, J., A. Juretic, C.N. Baxevanis, and Z.A. Nagy. 1981. Nature 291:455.
Klein, J., P. Sipos, and F. Figueroa. 1984. Genet. Res. 44:39.
Klyczek, K.K., H. Cantor, and C.E. Hayes. 1984. J. Exp. Med. 159:1604.

PAGE 134

Kobori, J.A., A. Winoto, J. McNicholas, and L. Hood. 1984. J. Mol. Cell. Immunol. 1:125.
Korman, A.J., C. Auffray, A. Schamboeck, and J.L. Strominger. 1982b. Proc. Natl. Acad. Sci. USA 79:6013.
Korman, A.J., P.J. Knudsen, J.F. Kaufman, and J.L. Strominger. 1982a. Proc. Natl. Acad. Sci. USA 79:1844.
Kronenberg, M., M. Steinmetz, T. Kobori, E. Kraig, J.A. Kapp, C.W. Pierce, C.M. Sorensen, G. Suzuki, T. Tada, and L. Hood. 1983. Proc. Natl. Acad. Sci. USA 80:5704.
Krupen, K., B.A. Araneo, L. Brink, J.A. Kapp, S. Stein, K.J. Wieder, and D.R. Weeb. 1982. Proc. Natl. Acad. Sci. USA 79:1254.
Kvist, S., F. Bregegere, L. Rask, B. Cami, H. Garoff, F. Daniel, K. Wiman, D. Larhammar, J.P. Abastado, G. Gachelin, P.A. Peterson, D. Dobberstein, and P. Kourilsky. 1981. Proc. Natl. Acad. Sci. USA 78:2772.
Larhammar, D., K. Gustafsson, L. Claesson, P. Bill, K. Wiman, L. Schenning, J. Sundelin, E. Widmark, P.A. Peterson, and L. Rask. 1982b. Cell 30:153.
Larhammar, D., U. Hammerling, M. Denaro, T. Lund, R.A. Flavell, L. Rask, and P.A. Peterson. 1983a. Cell 34:179.
Larhammar, D., U. Hammerling, L. Rask, and P.A. Peterson. 1985. J. Biol. Chem. 260:14111.
Larhammar, D., J.J. Hyldig-Nielsen, B. Servenius, G. Andersson, L. Rask, and P.A. Peterson. 1983b. Proc. Natl. Acad. Sci. USA 80:7313.
Larhammar, D., L. Schenning, K. Gustafsson, K. Wiman, L. Claesson, L. Rask, and P. Peterson. 1982a. Proc. Natl. Acad. Sci. USA 79:3687.
Lee, D.R., T.H. Hansen, and S.E. Cullen. 1982b. J. Immunol. 129:245.
Lee, J.S., J. Trowsdale, P.J. Travers, J. Carey, F. Grosveld, J. Jenkins, and W.F. Bodmer. 1982a. Nature 299:750.
Levine, F., H.A. Erlich, B. Mach, and D. Pious. 1985. J. Immunol. 134:637.

PAGE 135

Levine, F., and D. Pious. 1984. J. Immunol. 132:959.
Lidicker, W.Z., Jr. 1966. Ecol. Monogr. 36:27.
Lieberman, R., W.E. Paul, W. Humphrey, Jr., and J.H. Stimpfling. 1972. J. Exp. Med. 136:1231.
Little, C.C., and E.E. Tyzzer. 1916. J. Med. Res. 33:393.
Livnat, S., J. Klein, and F.H. Bach. 1973. Nature 243:42.
Lozner, E.C., D.H. Sachs, and G.M. Shearer. 1974. J. Exp. Med. 139:1204.
Lyon, M.F. 1960. Heredity 14:247.
Lyon, M.F. 1984. Cell 37:621.
Lyon, M.F., and R. Meredith. 1964a. Heredity 19:301.
Lyon, M.F., and R. Meredith. 1964b. Heredity 19:313.
Lyon, M.F., and R.J.S. Phillips. 1959. Heredity 13:23.
Maddon, P.J., D.R. Littman, M. Godfrey, D.E. Maddon, L. Chess, and R. Axel. 1985. Cell 42:93.
Malissen, M., T. Hunkapiller, and L. Hood. 1983. Science 221:283.
Malissen, M., B. Malissen, and B. Jordan. 1982. Proc. Natl. Acad. Sci. USA 79:893.
Maniatis, T., E.F. Fritsch, and J. Sambrook. 1982. Cold Spring Harbor Laboratory, Cold Spring, New York.
Marshall, J.T. 1977. Bull. Am. Mus. Nat. Hist. 158:177.
Marshall, J.T. 1981. The Mouse in Biomedical Research, Vol. I, p. 17. Foster, H.L., J.D. Small, and J.G. Fox, eds., Academic Press, Inc., New York, N.Y.
Marshall, J.T., and R.D. Sage. 1981. Symp. Zool. Soc. Lond. 47:15.
Martin, W.J., P.H. Maurer, and B. Benacerraf. 1971. J. Immunol. 107:715.
Mathis, D.J., C.O. Benoist, V.E. Williams II, M.R. Kanter, and H.O. McDevitt. 1983. Cell 32:745.

PAGE 136

McConnell, T.J., B. Darby, and E.K. Wakeland. 1986. J. Immunol. 136:3076.
McDevitt, H.O., and A. Chinitz. 1969. Science 163:1207.
McDevitt, H.O., B.D. Deak, D.C. Shreffler, J. Klein, J.H. Stimpfling, and G.D. Snell. 1972. J. Exp. Med. 135:1259.
McDevitt, H.O., and M. Sela. 1965. J. Exp. Med. 122:517.
McIntyre, K., and J. Seidman. 1984. Nature 308:551.
McKean, D.J., R.W. Melvold, and C.S. David. 1981. Immunogenetics 14:41.
McNicholas, J., M. Steinmetz, T. Hunkapiller, P. Jones, and L. Hood. 1982. Science 218:1229.
Melchers, I., K. Rajewsky, and D.C. Shreffler. 1973. Eur. J. Immunol. 3:754.
Mellor, A.L., E.H. Weiss, K. Ramachandran, and R.A. Flavell. 1983. Nature 306:792.
Mengle-Gaw, L., and H.O. McDevitt. 1983. Proc. Natl. Acad. Sci. USA 80:7621.
Mengle-Gaw, L., S. Conner, H.O. McDevitt, and C.G. Fathman. 1984. J. Exp. Med. 160:1184.
Mengle-Gaw, L., and H.O. McDevitt. 1985. Ann. Rev. Immunol. 3:367.
Michaelson, J., E.A. Boyse, M. Chorney, L. Flaherty, I. Fleisner, U. Hammerling, C. Reinisch, R. Rosenson, and F.-W. Shen. 1983. Transplantation Proc. 15:2033.
Mikes, M. 1971. Matica Srpska, Novi Sad Zb. Ser. Prir. Nauka 40:52.
Minezawa, M., K. Moriwaki, and K. Kondo. 1979. Jpn. J. Genet. 54:165.
Mohr, E., and G. Dunker. 1930. Zool. Jahrb., Abt. Syst. (Oekol.) Geogr. Biol. 59:65.
Moller, G. 1978. Immunol. Rev. 38:1.
Moller, G., ed. 1980. Immunol. Rev. 58.

PAGE 137

Murphy, D.B. 1978. Springer Sem. Immunopathol. 1:111.
Murphy, D.B. 1981. The Role of the Major Histocompatibility Complex in Immunobiology. Dorf, M., ed., Garland STPM, New York, N.Y.
Murphy, D.B., L.A. Herzenberg, K. Okumura, L.A. Herzenberg, and H.O. McDevitt. 1976. J. Exp. Med. 144:699.
Nadeau, J. 1983. Genet. Res. 42:323.
Nagy, Z., C.N. Baxevanis, N. Ishii, and J. Klein. 1981. Immunol. Rev. 60:59.
Nathenson, S.G., H. Uehara, B.M. Ewenstein, T.J. Kindt, and J.E. Coligan. 1981. Ann. Rev. Biochem. 50:1025.
Nei, M., and W.H. Li. 1979. Proc. Natl. Acad. Sci. USA 76:5269.
Nizetic, D., F. Figueroa, and J. Klein. 1984. Immunogenetics 19:311.
Ohno, S. 1970. Evolution Through Gene Duplication. Springer-Verlag, New York, N.Y.
Osborn, D.J. 1965. J. Egypt. Public Health Assoc. 40:401.
Owerbach, D., A. Lernmark, P. Platz, L.P. Ryder, L. Rask, P. Peterson, and J. Ludvigsson. 1983. Nature 303:815.
Parnes, J.R., and J.G. Seidman. 1982. Cell 29:661.
Pearson, O.P. 1963. Ecology 44:540.
Peck, A.B., B. Darby, and E.K. Wakeland. 1983. J. Immunol. 131:2432.
Pelikan, J. 1974. Prirodoved. Pr. Ustavu Cesk. Akad. Ved. Brne 8:1.
Peterson, P.A., L. Rask, K. Sege, L. Klareskog, H. Anundi, and L. Ostberg. 1975. Proc. Natl. Acad. Sci. USA 72:1612.
Petrov, B. 1979. Folia Zool. 28:13.
Philip, U. 1938. J. Genet. 36:197.

PAGE 138

Ploegh, H.L., H.T. Orr, and J.L. Strominger. 1980. Proc. Natl. Acad. Sci. USA 77:6081.
Radbruch, A.V. 1973. Z. Saeugetierkd. 38:168.
Radding, C.M. 1978. Ann. Rev. Biochem. 47:847.
Rice, M.C., and S.J. O'Brien. 1980. Nature 283:157.
Rich, S.S., C.S. David, and R.R. Rich. 1979a. J. Exp. Med. 149:114.
Rich, R.R., D.A. Sudberry, D.L. Kastner, and L. Chu. 1979b. J. Exp. Med. 150:1555.
Richardson, J.S., D.C. Richardson, K.A. Thomas, E.W. Silverton, and D.R. Davies. 1976. J. Mol. Biol. 102:221.
Roberts, T.J. 1977. The Mammals of Pakistan. Ernest Benn Ltd., London.
Robertson, M. 1982. Nature 297:629.
Rogers, M.J., R.N. Germain, J. Hare, E. Long, and D.S. Singer. 1985. J. Immunol. 134:630.
Romanova, G.A. 1970. Sov. J. Ecol. (Eng. Transl.) 1:98.
Rosenthal, A.S. 1978. Immunol. Rev. 40:136.
Rudolph, N.S., and J.L. Vanderberg. 1981. J. Exp. Zool. 217:455.
Sage, R.D. 1978. Origins of Inbred Mice, p. 519. Morse, H.C. III, ed., Academic Press, New York.
Sage, R.D. 1981. The Mouse in Biomedical Research, Vol. I, p. 39. Foster, H.L., J.D. Small, and J.G. Fox, eds., Academic Press, New York.
Saito, H., R.A. Maki, L.K. Clayton, and S. Tonegawa. 1983. Proc. Natl. Acad. Sci. USA 80:5520.
Schlauder, G.G., M.P. Bell, B.N. Beck, A. Nilson, and D.J. McKean. 1985. J. Immunol. 135:1945.
Schwartz, R.H. 1985. Ann. Rev. Immunol. 3:237.
Schwarz, E., and H.K. Schwarz. 1943. J. Mammal. 24:59.
Selander, R.K., W.G. Hunt, and S.Y. Yang. 1969. Evolution 23:379.

PAGE 139

Serafinski, W. 1965. Ekol. Pol. 13:305.
Shaut, D.M., E.K. Wakeland, P.H. Maurer, and A.B. Peck. 1984. J. Immunol. 133:1410.
Shin, H.-S., D. Bennett, and K. Artzt. 1984. Cell 39:573.
Shin, H.-S., L. Flaherty, K. Artzt, D. Bennett, and J. Ravetch. 1983b. Nature 306:380.
Shin, H.-S., P. McCormick, K. Artzt, and D. Bennett. 1983a. Cell 33:925.
Shin, H.-S., J. Stavnezer, K. Artzt, and D. Bennett. 1982. Cell 29:969.
Shiroishi, T., T. Sagai, and K. Moriwaki. 1982. Nature 300:370.
Silver, L.M. 1981. Genet. Res. 38:115.
Silver, L.M. 1985. Ann. Rev. Genet. 19:179.
Silver, L.M., and K. Artzt. 1981. Nature 290:68.
Silver, L.M., K. Artzt, and D. Bennett. 1979. Cell 17:275.
Silver, L.M., D. Lukralle, and J.I. Garrels. 1983. Nature 301:422.
Slightom, J., A.E. Blechl, and O. Smithies. 1980. Cell 21:627.
Solinger, A.M., M.E. Ultee, E. Margoliash, and R.H. Schwartz. 1979. J. Exp. Med. 150:293.
Snell, G.D., J. Dausset, and S. Nathenson. 1976. Histocompatibility. Academic Press, New York.
Sood, A.K., D. Pereira, and S.M. Weissman. 1980. Proc. Natl. Acad. Sci. USA 78:616.
Southern, E. 1980. Methods Enzymol. 69:152.
Srivastva, S.P., and B.L. Wattal. 1973. Indian J. Entomol. 35:306.
Steinmetz, M., J.G. Frelinger, D. Fisher, T. Hunkapiller, D. Pereira, S.M. Weissman, H. Uehara, S. Nathenson, and L. Hood. 1981. Cell 24:125.

Steinmetz, M., M. Malissen, L. Hood, A. Orn, R.A. Maki, G.R. Dastoornikoo, D. Stephan, E. Gibb, and R. Romaniuk. 1984. EMBO J. 3:2995.
Steinmetz, M., K. Minard, S. Horvath, J. McNicholas, J. Frelinger, C. Wake, E. Long, B. Mach, and L. Hood. 1982. Nature 300:35.
Steinmetz, M., D. Stephan, and K. Fisher Lindahl. 1986. Cell 44:895.
Suggs, S.V., R.B. Wallace, T. Hirose, E.H. Kawashima, and K. Itakura. 1981. Proc. Natl. Acad. Sci. USA 78:6613.
Sukhatme, V.P., K.C. Sizer, A.C. Vollmer, T. Hunkapiller, and J.R. Parnes. 1985. Cell 40:591.
Tada, T., M. Taniguchi, and C.S. David. 1976. J. Exp. Med. 144:713.
Taniguchi, M., I. Takei, and T. Tada. 1980. Nature 283:227.
Taniguchi, M., T. Tokuhisa, M. Kanno, Y. Yaoita, A. Shimizu, and T. Honjo. 1982. Nature 298:172.
Taylor, R.H. 1978. The Ecology and Control of Rodents, p. 135. Dingwall, P.R., I.A.E. Atkinson, and C. Hays, eds., New Zealand Dept. of Lands and Survey, Wellington.
Thuesen, P. 1977. Vidensk. Medd. Dan. Naturhist. Foren. 140:117.
Uhr, J., J.D. Capra, E.S. Vitetta, and R.G. Cook. 1979. Science 206:292.
Upholt, W.B., and I.B. Dawid. 1977. Cell 11:571.
Ursin, E. 1952. Vidensk. Medd. Dan. Naturhist. Foren. 114:217.
Van Valen, L. 1965. Genetica 36:119.
Wake, C.T., E.O. Long, M. Strubin, N. Gross, R. Accolla, S. Carrel, and B. Mach. 1982. Proc. Natl. Acad. Sci. USA 79:6979.
Wakeland, E.K., and B.R. Darby. 1983. J. Immunol. 131:3052.
Wakeland, E.K., B.R. Darby, and J.E. Coligan. 1985. J. Immunol. 131:2432.

Wakeland, E.K., and J. Klein. 1979a. Immunogenetics 8:27.
Wakeland, E.K., and J. Klein. 1979b. Immunogenetics 9:535.
Wakeland, E.K., and J. Klein. 1981. J. Immunol. 126:1734.
Wakeland, E.K., and J. Klein. 1983. J. Immunol. 130:1280.
Waltenbaugh, C. 1981. J. Exp. Med. 154:1570.
Weiss, E., L. Golden, R. Zakut, A. Mellor, K. Fahrner, S. Kvist, and R.A. Flavell. 1983. EMBO J. 2:453.
Weiss, E.H., A. Mellor, L. Golden, K. Fahrner, E. Simpson, J. Hurst, and R.A. Flavell. 1983a. Nature 301:671.
Widera, G., and R.A. Flavell. 1984. EMBO J. 3:1221.
Widera, G., and R.A. Flavell. 1985. Proc. Natl. Acad. Sci. USA 82:5500.
Winking, H., and J.L. Guenet. 1978. Mouse News Lett. 59:33.
Winoto, A., M. Steinmetz, and L. Hood. 1983. Proc. Natl. Acad. Sci. USA 80:3425.
Womack, J.E. 1979. Genetics 92:5.
Yang, C., H. Kratzin, H. Gotz, F.P. Thinnes, T. Kruse, G. Egert, E. Pauly, S. Kolbel, P. Wernet, and N. Hilschmann. 1982. Hoppe-Seyler's Z. Physiol. Chem. 363:671.
Yonekawa, H., K. Moriwaki, O. Gotoh, J. Watanabe, J. Hayashi, N. Miyashita, M.L. Petras, and Y. Tagashira. 1980. Jpn. J. Genet. 55:289.
Zaleska-Rutczynska, Z., and J. Klein. 1977. J. Immunol. 119:6.
Zejda, J. 1975. Zool. Listy 24:99.
Zimmerman, K. 1949. Zool. Jahrb. Abt. Syst. (Oekol.) Geogr. Tiere 78:217.
Zinkernagel, R.M. 1979. Ann. Rev. Microbiol. 33:201.

Zinkernagel, R.M., and P.C. Doherty. 1980. Adv. Immunol. 27:51.

BIOGRAPHICAL SKETCH

Thomas John McConnell was born in Teaneck, New Jersey, on November 22, 1955. His family moved to Miami, Florida, when he was four years old, where he and his one brother and two sisters grew up. He graduated from Miami Palmetto High School in 1973, and then attended Junior College in Miami for one and one half years before attending Florida State University in Tallahassee, Florida, for one year. He transferred to the University of Florida in Gainesville, Florida, in 1976 where he graduated with his Bachelor of Science in zoology in 1979. After working as a laboratory technologist for one and one half years, he started in the graduate program in the Department of Pathology at the University of Florida in 1981. He received his Doctor of Philosophy degree from the Department of Pathology at the University of Florida in 1986.

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

Edward K. Wakeland, Chairman
Associate Professor of Pathology

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

Ammon B. Peck
Associate Professor of Pathology

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

Noel K. Maclaren
Professor of Pathology

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

Arthur Kimura
Associate Professor of Pathology

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

Edward Siden
Assistant Professor of Immunology and Medical Microbiology

This dissertation was submitted to the Graduate Faculty of the College of Medicine and to the Graduate School and was accepted as partial fulfillment of the requirements for the degree of Doctor of Philosophy.

August 1986

Dean, College of Medicine

Dean, Graduate School
