The influence of recombination on the diversification of the murine I region


Material Information

The influence of recombination on the diversification of the murine I region
Physical Description:
viii, 132 leaves : ill. ; 29 cm.
Tarnuzzer, Roy William, 1960-
Publication Date:


Subjects / Keywords:
DNA, Recombinant   ( mesh )
Recombination, Genetic   ( mesh )
Polymorphism (Genetics)   ( mesh )
Haplotypes   ( mesh )
Pathology thesis Ph.D   ( mesh )
Dissertations, Academic -- Pathology -- UF   ( mesh )
bibliography   ( marcgt )
non-fiction   ( marcgt )


Thesis (Ph.D.)--University of Florida, 1988.
Bibliography: leaves 123-131.
Statement of Responsibility:
by Roy William Tarnuzzer.

Record Information

Source Institution:
University of Florida
Rights Management:
All applicable rights reserved by the source institution and holding location.
Resource Identifier:
aleph - 001071818
oclc - 20674830
notis - AFF6297
System ID:








This dissertation is dedicated to Lena
as a token of my appreciation for
the love, support, and encouragement
she has provided.


I would like to thank my mentor and committee chairman,

Dr. Edward K. Wakeland, for his input and help in the

compilation of this dissertation.

I would also like to express my appreciation to my other

committee members, Drs. Ammon Peck, Bill Winter, Harry

Ostrer, and Linda Smith, for their help and interest in this


My fellow students and laboratory technicians in the

lab definitely have made these past five and a half years

quite interesting. I would especially like to thank Stefen

Boehme for the "unique" environment we were able to maintain

in the lab and the many entertaining excursions we made into

the outside world. And of course, what would life be

without Rick McIndoe, the only human being who could put up

with anything I could dish out. I would also like to extend

a thank you to all the other people in the lab who, through

the years, have helped and encouraged me.

I would also like to thank the many other people in the

department for anything they might have done to make my stay

in the Department of Pathology a pleasant one.

Finally, I would like to thank all those fish I have

caught over the last half a dozen years for the healing

therapy they provided when it was needed most. Thanks are

extended to my brother, Bob, for without him and the


numerous fishing trips, all those fish would have gone

through life without the thrill of fighting me at the other end

of the line.



ACKNOWLEDGEMENTS .................................... iii

ABSTRACT ............................................ vii

CHAPTER I: INTRODUCTION ............................. 1


Organization of the Major Histocompatibility
Complex ....................................... 4
Class II Gene Polymorphism ..................... 15
Functional Role of MHC Polymorphism ............ 22
Homologous Recombination Within the MHC ........ 29
Wild Mice ...................................... 38


Mice ........................................... 47
Antibody Isolation and Conjugation ............. 47
Spleen Cell Isolation, Immunostaining, and
Flow Cytometric Analysis ...................... 48
Isolation of Genomic DNA ....................... 50
Endonuclease Digestion and Agarose Gel
Electrophoresis ............................... 51
Capillary Transfer and Hybridization ........... 51
RNA Isolation and Analysis ..................... 52
Probes ......................................... 53
Genomic Restriction Mapping .................... 56
Data Analysis .................................. 56

CHAPTER IV: RESULTS ................................ 57

RFLP Analysis of the I Region .................. 57
Allele Lineages ................................ 75
Evidence for Site Specific Recombination
Within the I Region: Identification of
RHSs in Eβ and Eα ............................. 77
Identification of a Recombinational Hotspot
Between Aα and Eβ ............................. 81
Identification of Recombinationally Depressed
Segments (REDS) within the I Region ........... 84
Correlation Analysis of REDS ................... 90
Lineage Analysis of REDS ....................... 98
Haplotype Characterization of t Forms of
Chromosome 17 ................................. 100


CHAPTER V: DISCUSSION .............................. 108

Polymorphism of Genes Within the I Region ...... 108
Recombination Within the I Region .............. 111
The Influence of RHSs on Evolution of I
Region Haplotypes at the Genomic Level......... 114
Recombination, Selection, and Generation of
I Region Haplotypes ........................... 115

REFERENCES .......................................... 124

BIOGRAPHICAL SKETCH ................................. 133

Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy



Roy William Tarnuzzer

December 1988

Chairman: Edward K. Wakeland
Major Department: Pathology and Laboratory Medicine

Seven single copy DNA probes were isolated that span

110 kilobases of the murine I region and used in a

restriction fragment length polymorphism (RFLP) analysis

with five restriction endonucleases on genomic DNAs from

28 H-2 homozygous wild and laboratory inbred mice.

Polymorphic restriction sites were used to cluster alleles

at each locus to form lineages representing groups of

minor variants of a single progenitor allele. These

lineages were then used to identify recombinational events

between the loci probed. Three recombinational hotspots

(RHS) were identified from the 26 unique I region

haplotypes analyzed. These RHSs are located: 1) in the

second intron of Eβ, 2) at the centromeric end of Eα, and

3) approximately 5 kb telomeric of the Aα gene. The Eβ and

Eα RHSs correspond to those already documented, while the

RHS adjacent to Aα has not been previously defined. This

RHS maps to a 4.7 kb stretch of DNA 3' of the Aα gene and


its activity appears to be haplotype dependent. The three

RHSs separate the I region into four genomic segments

where the sequences within a particular segment accumulate

mutations at the same rate. These segments were termed

recombinationally depressed segments (REDS) since

recombination is localized to the RHSs with only a few

rare recombinational events occurring within a defined

REDS. These REDS were grouped into lineages which

represent a limited number of evolutionary units which are

shuffled between haplotypes during evolution. The genes

within a REDS, for example, Aα and Aβ in REDS1, show a

strong linkage disequilibrium which results in the

coordinate evolution of these two genes. In the case of

the A molecule, this linkage disequilibrium between these

co-expressed genes appears to be necessary for the proper

expression on the cell surface. This same pattern of

evolution is seen in the t haplotype mice which contain a

large number of wild type alleles suggesting a much higher

degree of recombination between these two different forms

of chromosome 17 than previously expected.



The murine major histocompatibility complex (MHC)

located on chromosome 17 and called H-2 is a multigene

family coding for polymorphic surface glycoproteins

involved in cell recognition and the generation of immune

responses to foreign antigen. The I region, spanning

approximately 120 kilobases of DNA, lies within H-2 and

encodes the two class II molecules, I-A and I-E. These

class II or Ia molecules are heterodimers composed of an α

and a β chain which non-covalently associate in the

cytoplasm and are expressed predominantly on B lymphocytes

and activated macrophages (Flavell and Widera 1986).

The genes for the class II molecules are arranged from

the centromere in the order Aβ, Aα, Eβ, Eβ2, and Eα. The

Aβ, Aα, Eβ, and, to a lesser extent, Eα molecules are very

polymorphic and show many distinct forms or alleles at the

protein and DNA level. The set of alleles present in the

I region of a specific mouse constitutes its haplotype.

The high amount of polymorphism, i.e. the large number

of distinct alleles, makes the H-2 ideal for the study of

homologous recombination and its influence on the

evolution of the regions adjacent to the sites of

recombination. Homologous recombination is a mechanism by

which homologous nucleotide sequences or allelic sequences


on homologous chromosomes are exchanged with high fidelity

during meiosis. Homologous recombination can occur

anywhere along a chromosome. It has been observed that

recombination frequencies can vary for different stretches

of DNA of the same length and that genetic map distances

do not always agree with molecular map distances

(Steinmetz et al. 1982b). This suggests that there are

regions of DNA that either concentrate or suppress

recombinational events. A site where recombination

appears to be localized, first identified in procaryotes,

has been termed a recombinational hotspot (RHS) (Song


The H-2 contains four documented RHSs with two falling

within the I region (Steinmetz et al. 1987). The RHS in

Eβ, defined by 12 breakpoints, is localized to a 10 kb

stretch of DNA and the RHS in Eα is defined by 7

breakpoints localized to a 12-14 kb stretch just

centromeric to the gene (Steinmetz et al. 1982b; Lafuse et

al. 1986). All RHSs in H-2 show three characteristics in

common: 1) high frequency of homologous recombination, 2)

localization to a small stretch of DNA, and 3) haplotype

specificity (Steinmetz et al. 1987).
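
The breakpoint counts and interval sizes quoted above can be turned into a rough density figure for comparison. The following is a minimal sketch, assuming that "breakpoints per kilobase" is an acceptable crude summary; the function name `breakpoints_per_kb` and the dictionary labels are introduced here for illustration and do not come from the cited studies. The result counts mapped breakpoints per kilobase and is not a recombination frequency per meiosis.

```python
def breakpoints_per_kb(n_breakpoints, interval_kb):
    """Return mapped recombination breakpoints per kilobase of interval."""
    if interval_kb <= 0:
        raise ValueError("interval_kb must be positive")
    return n_breakpoints / interval_kb


# Figures quoted in the text: (number of breakpoints, interval size in kb).
# The E-alpha interval is given as a 12-14 kb range, so both bounds are used.
hotspots = {
    "E-beta RHS": (12, 10.0),
    "E-alpha RHS (12 kb bound)": (7, 12.0),
    "E-alpha RHS (14 kb bound)": (7, 14.0),
}

densities = {
    name: breakpoints_per_kb(n, kb) for name, (n, kb) in hotspots.items()
}

for name, density in densities.items():
    print(f"{name}: {density:.2f} breakpoints per kb")
```

Under these figures the Eβ interval carries roughly twice the breakpoint density of the Eα interval, although the comparison depends on how many recombinant strains were screened in each case.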

The presence or absence of an active RHS in different

individuals would be expected to have a distinct influence

on the generation of haplotypes over an evolutionary

timespan. Homologous equal recombination would shuffle

and generate new combinations of alleles which would lead


to new haplotypes in a population. Depending on the

number of active RHSs in a population, the extent of

allele shuffling would vary as would the number of unique

haplotypes. Because recombination appears to be localized

to specific sites within the I region, markers located in

these regions flanked by RHSs should show linkage disequilibrium.
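
The expectation that markers lying between hotspots remain associated can be made concrete with the standard two-locus disequilibrium statistics. The sketch below is illustrative only: it assumes two alleles per locus (the I region loci are multi-allelic, so this is a simplification), and the function name and the haplotype counts are hypothetical values chosen for the example, not data from this study.

```python
def linkage_disequilibrium(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, r_squared) from counts of the four two-locus haplotypes.

    D   = p(AB) - p(A) * p(B)
    r^2 = D^2 / (p(A) * (1 - p(A)) * p(B) * (1 - p(B)))
    """
    total = n_AB + n_Ab + n_aB + n_ab
    if total == 0:
        raise ValueError("at least one haplotype must be observed")
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total
    p_B = (n_AB + n_aB) / total
    d = p_AB - p_A * p_B
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    # r^2 is undefined when either locus is monomorphic in the sample.
    r_squared = d * d / denom if denom > 0 else float("nan")
    return d, r_squared


# Hypothetical counts: alleles always travel together (no recombinants seen).
d_linked, r2_linked = linkage_disequilibrium(10, 0, 0, 10)

# Hypothetical counts: alleles fully shuffled between the two loci.
d_free, r2_free = linkage_disequilibrium(5, 5, 5, 5)

print(f"linked:   D = {d_linked:.3f}, r^2 = {r2_linked:.3f}")
print(f"shuffled: D = {d_free:.3f}, r^2 = {r2_free:.3f}")
```

Markers within a segment bounded by two hotspots would be expected to resemble the first case, and markers separated by an active hotspot the second.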


The aims of this dissertation are to survey a large

collection of independently derived I region haplotypes

and to identify and localize RHSs. Once these RHSs were

characterized, I sought to determine their influence on the

generation of I region haplotypes and the relationships of

the genes flanked by RHSs.


The major histocompatibility complex (MHC), located

on chromosome 17 in the mouse, was first characterized

based on its involvement in graft rejection between

different inbred mouse lines (Little and Tyzzer 1916).

With the advent of serologic techniques and their

application to the study of the H-2 complex (Gorer 1936),

the genetics of this region began to interest more and

more biologists. From this flourishing interest and

advances in various chemical and molecular techniques, a

more exact picture of the organization, structure and

function of the H-2 and its gene products has emerged.

Organization of the Major Histocompatibility Complex
Genomic Characteristics and Structure of Encoded Products

The murine major histocompatibility complex, commonly

referred to as H-2, is a large multigene family which

codes for the cell surface glycoproteins involved in cell

recognition and the control of immune responses to foreign

antigens. The H-2 complex, in genetic map distances,

encompasses approximately 2 centiMorgans of DNA (Klein

1975) which translates into a physical distance of 2000 to

4000 kilobases (Hood et al. 1982). The H-2 encodes three


classes of immune related proteins: class I, class II and

class III (Klein 1975). Based on functional parameters,

the H-2 has been divided into four regions which

correspond to the classes of molecules which they encode.

The K and D regions contain the class I genes, the I

region contains the class II genes, and the S region

contains the class III genes. The class I products fall

into 2 general categories, those involved in graft

rejection and those related to development. The first

group, the classical transplantation antigens, is encoded

by genes denoted K, D, L and R and is expressed on the

surface of all nucleated cells. Although they mediate

heterologous graft rejection in the laboratory, this is

not their function in vivo. These class I molecules

function in the restricted presentation of viral and tumor

antigens to cytotoxic T lymphocytes (Zinkernagel 1979).

The second group of class I molecules is of two families.

The Qa family of molecules is expressed on mammalian

nucleated blood cells and the Tla molecules are expressed

on certain leukemias (Michaelson et al. 1983). Whereas

the classical transplantation antigens are very

polymorphic, the Qa and Tla antigens exhibit very low

polymorphism and their functions are still unknown

(Flaherty 1980). Molecular cloning and analysis of the

H-2 have revealed over 32 genes spanning at least 800 kb

for the Qa and Tla products alone (Steinmetz et al.

1982a). The class I molecules show a unified structure of


three extracellular domains, a transmembrane domain and a

cytoplasmic domain which constitutes a 40-45,000 dalton

glycoprotein of approximately 350 amino acids. This

glycoprotein chain non-covalently associates with a 12,000

dalton molecule encoded on chromosome 2 known as

β2-microglobulin (Klein et al. 1983b).

The class III genes, contained within the S region,

encode the complement proteins C2, Bf, Slp, and C4.

Although these genes are physically contained in the H-2,

Klein and Figueroa (1981) argue against their inclusion in

the MHC because they are not functionally related to the

class I or class II loci.

The class II genes are contained within the I region,

or immune response region, which was first defined by the

differential ability of inbred mouse strains to mount an

immune response to certain antigens (McDevitt and Sela

1965; Martin et al. 1971) and was later mapped by the

use of recombinant and congenic strains of mice

(Benacerraf and McDevitt 1972). There are two class II

molecules encoded within the I region, I-A and I-E,

assembled from four functional class II genes. These

genes are Aβ, Aα, Eβ and Eα as well as the pseudogenes

Aβ3, Aβ2 and Eβ2 (Widera and Flavell 1985). A molecular

map of the H-2 and the I region is given in Figure 2-1.

These class II molecules are composed of a 35,000 dalton α

and 29,000 dalton β chain of about 220 and 230 amino acids

respectively (Klein et al. 1983b) which non-covalently

associate in the cytoplasm and are subsequently expressed

on the surface of the cell as a heterodimer. The α and β

chains are organized similarly into five protein domains.

They consist of a hydrophobic leader peptide of 25 amino

acids, two 90 amino acid extracellular domains (α1, α2;

β1, β2), a hydrophobic transmembrane segment of 25 amino

acids and a cytoplasmic domain. The domain structures of

the α2, β1 and β2 regions are formed due to disulfide

bonds between pairs of cysteine residues located within

each domain. The domain organization of the protein

directly reflects the intron/exon organization of their

respective genes.

The β chain genes of the class II molecules are

composed of six exons, one for each protein domain, and an

exon for the 3' untranslated region (Saito et al. 1983).

The α genes are very similar except they are composed of

five instead of six exons because the transmembrane and

cytoplasmic regions are combined in a single exon

(Mathis et al. 1983; McNicholas et al. 1982). A diagram

of the organization of the class II α and β genes is given

in Figure 2-2.

The Inclusion of the MHC Genes Into the Immunoglobulin
Supergene Family

With the advent of cloning and sequencing techniques,

a very detailed analysis of class I and class II gene

structure could be performed. Comparisons of protein and

DNA sequences reveal a very similar domain structure for

most of the genes of the immune system, with the domain

organization reflecting the intron/exon organization of

the genes which encode them. This is true for the class

II, class I, Thy-1, β2-microglobulin, T4, T8, T cell

receptor and immunoglobulin genes (Kaufman et al. 1984;

Benoist et al. 1983; McNicholas et al. 1982; Parnes and

Seidman 1982; Larhammar et al. 1982; Sukhatme et al. 1985;

Maddon et al. 1985; Hood et al. 1983; Davis 1985). The

similarity in size, structure, and sequence among the

membrane proximal domains of these molecules has led to

the theory that the genes of the immune system arose from

a single ancestral gene by gene duplication followed by

divergent evolution (Hood et al. 1983).

I Region Organization

Before the advent of molecular cloning, the I region

was considered by immunologists to consist of four

subregions as determined by recombinational analysis based

on serologic and immune response assays (Klein 1975; Klein

et al. 1983a; Mengle-Gaw and McDevitt 1985). The four

defined subregions were I-A, I-B, I-J, and I-E. The I-A

and I-E subregions were serologically defined and encode


the conventional Ia antigens. The genes for Aβ, Aα, and

Eβ map to the I-A subregion, whereas Eα maps to the I-E

subregion (Jones et al. 1978; Murphy et al. 1980). The I-

B subregion was defined by the regulation of immune

responses to IgG2a and lactate dehydrogenase (Lieberman et

al. 1972; Melchers et al. 1973). The I-B subregion was

later abandoned when Dorf and Benacerraf (1975) showed

that this immune response phenotype is controlled by the

complementation of two genes, one in the I-A subregion

and one in the I-E subregion. The I-J

subregion was defined serologically by reagents directed

against an I-J polypeptide, which was believed to be a

suppressor factor from suppressor T lymphocytes (Murphy et

al. 1976; Murphy et al. 1980). However, attempts to

isolate and purify I-J in sufficient quantity for protein

sequence analysis have failed. Molecular characterization

of the I region by Steinmetz et al. (1982b) in the

recombinant strains used to define I-J showed that the

product of I-J would have to be encoded by a 3.4 kilobase

stretch of DNA. Sequence analysis of this fragment showed

that there was no gene which could code for the I-J product

(Kobori et al. 1986).

The exact order and number of the class II genes came

into view when 240,000 contiguous base pairs of the I

region were cloned from the BALB/c mouse (Steinmetz et al.

1982b). Four class II genes were identified with one

being a pseudogene due to the lack of hybridization with a


5' probe. It was determined that the BALB/c genome

contains two α genes and from four to six β genes. This

was confirmed in later work by Widera and Flavell (1985).

The positions of the genes for Aβ, Aα, Eβ, Eβ2, and Eα

were conclusively mapped within the I region.

Subsequently, two other class II β genes were

discovered and determined to be pseudogenes. Larhammar et

al. (1983a) identified Aβ2 and positioned it approximately

20 kilobases centromeric to Aβ. The Aβ2 gene was

sequenced (Larhammar et al. 1983b) and the exon/intron

organization was found to be the same as for the other

class II β genes. The Aβ2 molecule, from the predicted

amino acid sequence, shows only 56% homology to the other

β chains, in contrast to the typical homology of around

80% seen among these β chains. Based on this, Aβ2 was

determined to be the most divergent member of the class II

β genes. Widera and Flavell (1985) isolated and

characterized Aβ3 and localized it to 75 kilobases

telomeric of the K region. Steinmetz et al. (1986) were

able to link the Aβ3 gene from BALB/c to the rest of the I

region thereby providing a continuous 600 kilobase map of

the K and I regions. The pseudogene Aβ3 shows strong

homology to the other β genes and 83% homology to the

human SBβ gene. Whereas the Aβ2 gene is transcribed

(Larhammar et al. 1983a) but not expressed on the cell

surface due to splicing errors, Aβ3 shows an 8 nucleotide


deletion which would make transcription of the gene


Figure 2-3 illustrates the content and organization

of the I region. The genes of the I region are arranged

centromerically in the order Aβ3, Aβ2, Aβ, Aα, Eβ, Eβ2 and

Ea, and span approximately 300 kilobases of DNA with the

functional class II genes confined to a 110 kilobase


Class II Gene Polymorphism

The MHC genes are the most polymorphic loci known in

vertebrates, which has made the class I and class II genes

of great interest to investigators. Based on serologic

and molecular studies, it has been determined that, in

general, the Aβ, Aα, and Eβ chains are the most

polymorphic (Klein 1975; Benoist et al. 1983), and Eα

is the least polymorphic (Klein et al. 1983a). The

genes of the class II molecules have been shown to exhibit

the same degree of polymorphism as the protein products

they encode. This unique variability of the class II

genes is therefore a reflection of the unique biological

role of these molecules with respect to the immune system.


Mechanisms for the Generation of Polymorphism

The entire region containing the class II genes has

been cloned and analyzed extensively (Larhammar et al.

1983b; Choi et al. 1983; Benoist et al. 1983; Widera and

Flavell 1985; Steinmetz et al. 1986). When the I regions

of several laboratory inbred strains of mice were compared

using single copy probes spanning the I region, a variable

tract was found in the telomeric half of the I region

characterized by extensive sequence diversity determined

by an RFLP analysis, and a conserved tract on the

centromeric end showing very low sequence diversity, with

the two tracts meeting at the 3' portion of the Eβ gene

(Steinmetz et al. 1984). The genes for Aβ, Aα, and the 5'

end of Eβ occupy the 60 kilobases that compose the

variable tract. The conserved tract, spanning 50

kilobases, contains the genes for Eβ2 and Eα. Non-coding

sequences within the two tracts showed the same patterns

of diversity. The mechanisms maintaining this pattern of

diversity are not known but perhaps extensive sequence

comparisons would shed some light on this unknown.
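
One simple way to express how similar two RFLP patterns are is a band-sharing coefficient. The sketch below is illustrative only: the Dice-style coefficient, the function name `band_sharing`, and the fragment sizes are assumptions introduced here and are not the analysis used in the studies cited above. Fragment sizes are given in base pairs as integers so that identical bands compare exactly.

```python
def band_sharing(fragments_x, fragments_y):
    """Return a Dice-style band-sharing coefficient for two RFLP patterns.

    F = 2 * (shared fragments) / (fragments in x + fragments in y)
    F is 1.0 for identical patterns and 0.0 when no fragment is shared.
    """
    x, y = set(fragments_x), set(fragments_y)
    if not x and not y:
        raise ValueError("both patterns are empty")
    return 2 * len(x & y) / (len(x) + len(y))


# Hypothetical fragment sizes (bp) for one probe/enzyme combination.
strain_a = [2300, 4700, 9100]
strain_b = [2300, 4700, 6000]   # shares two of three fragments with strain_a
strain_c = [1100, 5500]         # shares no fragments with strain_a

f_ab = band_sharing(strain_a, strain_b)
f_ac = band_sharing(strain_a, strain_c)

print(f"a vs b: F = {f_ab:.3f}")
print(f"a vs c: F = {f_ac:.3f}")
```

A tract of low sequence diversity would yield coefficients near 1.0 across strains, whereas a variable tract would yield lower and more scattered values.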

Nucleotide sequence comparisons of Aβ, Aα, and Eβ

reveal extensive sequence polymorphism within laboratory

strains of mice (Benoist et al. 1983; Estess et al. 1986).

Variations in nucleotide sequence of 5% to 10% are not

unusual between alleles of Aβ and Aα with the most

diversity localized within the β1 and α1 domains. Choi et

al. (1983) did a sequence comparison on genomic clones of

Aβ from the b, d, and k haplotypes and determined that the

majority of amino acid substitutions are localized to the

amino terminus of the encoded molecule. These mutations

are indicative of a pattern of multiple independent

events. Recent work by McConnell et al. (1988) on Aβ

reveals evidence for segmental exchange between alleles to

generate diversity.

Benoist et al. (1983) sequenced six alleles of Aα for

the k, d, b, f, u, and g haplotypes and found that most

substitutions within the α1 exon are clustered into what

they term "regions of allelic hypervariability." A Kabat-

Wu variability plot (Kabat et al. 1979) of the

corresponding amino acid sequence reveals that amino acid

substitutions fall into two hypervariable regions at

residues 11-15 and at residues 56-57 which correspond to

the regions of the molecule responsible for the binding of

foreign antigen (Brown et al. 1988).
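
The variability index behind such a plot is, for each aligned position, the number of different amino acids observed divided by the frequency of the most common one. The following is a minimal sketch of that calculation; the function name and the four short sequences are hypothetical toy values, not the Aα alleles discussed above.

```python
from collections import Counter


def wu_kabat_variability(aligned_seqs):
    """Return the per-position variability index for aligned sequences.

    variability = (number of different residues at the position)
                  / (frequency of the most common residue at the position)
    """
    if not aligned_seqs:
        raise ValueError("at least one sequence is required")
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")

    scores = []
    for column in zip(*aligned_seqs):
        counts = Counter(column)
        most_common_freq = counts.most_common(1)[0][1] / len(column)
        scores.append(len(counts) / most_common_freq)
    return scores


# Toy alignment (hypothetical): position 1 is invariant, position 4 differs
# in every sequence.
toy_alignment = ["ACDE", "ACDF", "AGDG", "ACDH"]
scores = wu_kabat_variability(toy_alignment)

for position, score in enumerate(scores, start=1):
    print(f"position {position}: variability = {score:.2f}")
```

An invariant position scores 1.0, and the score rises as more residues are seen and as the most common residue becomes less dominant, so hypervariable regions appear as peaks when the scores are plotted against residue number.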

Mengle-Gaw and McDevitt (1983) have reported regions

of allelic hypervariability between alleles of Eβ also.

These regions of diversity are localized to the β1 exon

and are separated by tracts of sequence homology which the

authors suggest might reflect diversification by gene conversion.


The mechanisms for the generation of diversity of the

class II genes are unknown; however, two hypotheses

dominate speculations. The first hypothesis proposes that

new alleles in a population are generated by


hypermutational mechanisms such as gene conversion or

segmental exchange. Segmental exchange or gene conversion

was originally defined in fungi (Radding et al. 1978) and

is a mechanism by which DNA sequence is copied or

transferred to or from genes, usually belonging to

multigenic or multiallelic families (Baltimore 1981;

Robertson 1982). During meiosis or mitosis, there is

pairing of partially homologous sequences followed by

mismatch repair thereby converting part of one sequence to

that of another. Gene conversion events are characterized

by clusters of substitutions at the DNA level. This

pattern of diversity is clearly documented in class I

genes (Mellor et al. 1983; Weiss et al. 1983) and evidence

for the same mechanism of diversification of the class II

genes, although to a lesser extent, is also seen (Mengle-

Gaw et al. 1984; Widera and Flavell 1985; McConnell et al.


As mentioned earlier, the majority of mutations

within the class II genes appear to be clustered into

tracts of allelic hypervariability. Direct evidence for

gene conversion in class II genes has been reported by

Mengle-Gaw et al. (1984), where an alloreactive T cell

clone reacted with determinants present on both Eβb and

Aβbm12. Sequence comparisons between Aβb, Aβbm12, and Eβb

(Choi et al. 1983; McIntyre and Seidman 1984) reveal

sequence homology between bm12 and Eβb where it differs

from Aβb. The region that is exchanged, encompassing


approximately 14 nucleotides, is flanked by regions of

exact homology extending for distances of 20 base pairs

on either side of the recombinational event.
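
The three-way comparison described here (parental allele, mutant allele, and putative donor) can be sketched as a simple scan over aligned sequences. This is a hedged illustration, not the method of the cited authors: the function name `candidate_conversion_tract` and the 16-base sequences are hypothetical, and real analyses would also weigh the length of flanking homology and the chance of independent point mutations.

```python
def candidate_conversion_tract(parent, mutant, donor):
    """Return (start, end, n_changes) if every change is donor-like.

    Positions are 0-based and inclusive. Returns None when the mutant is
    identical to the parent, or when any changed position does not match
    the donor (so the donor cannot explain all of the differences).
    """
    if not (len(parent) == len(mutant) == len(donor)):
        raise ValueError("sequences must be aligned to the same length")

    changed = [i for i, (p, m) in enumerate(zip(parent, mutant)) if p != m]
    if not changed:
        return None
    if any(mutant[i] != donor[i] for i in changed):
        return None
    return changed[0], changed[-1], len(changed)


# Hypothetical aligned sequences.
parent = "AAAACCCCGGGGTTTT"
donor = "AAAACTCAGGGGTTTT"
mutant = "AAAACTCAGGGGTTTT"   # differs from parent only where donor differs

tract = candidate_conversion_tract(parent, mutant, donor)
print(tract)

# A change that the donor does not share is not reported as a tract.
unrelated = candidate_conversion_tract(parent, "AAAACCCCGGGGTTTA", donor)
print(unrelated)
```

A cluster of donor-like substitutions bounded by sequence identical in all three alleles is the pattern reported for the bm12 mutation.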

By examining the nucleotide sequence of eight alleles

of Aβ, McConnell et al. (1988) found that, for six of the

eight alleles, the evolutionary lineage of the β1 and β2

exons corresponds to the presence or absence of a

retroposon insertion within the second intron which is

used to define these lineages. The β1 exon of two

alleles, Aβb and Aβnod, did not reflect their evolutionary

lineage by RFLP, and therefore reflects the exchange of

sequence, by segmental exchange, from alleles of a

different evolutionary lineage.

The second hypothesis for the generation of

diversity, termed "trans-species evolution," proposes that

the polymorphism arose from the steady accumulation of

mutations over long evolutionary periods, and multiple

advantageous alleles have survived speciation (Klein

1980). Trans-species evolution, therefore, represents a

mechanism for the maintenance of diversity in natural

populations. A recent report has shown (McConnell et al.

1988) that 90% of 115 Aβ alleles examined by RFLP analysis

fall into two evolutionary lineages based on the presence

or absence of a short interspersed nucleotide element

(SINE). Using the SINE sequence as an evolutionary tag

for the analysis of nine separate species and sub-species

of the genus Mus, the authors determined that the SINE


sequence could be identified in species that diverged over

eight million years ago. Therefore, these alleles

containing the retroposon insertion must have survived

speciation suggesting the role of trans-species evolution

in the generation of polymorphism seen in modern Mus


The above findings indicate that both hypermutational

mechanisms and trans-species evolution contribute to the

diversity of class II genes. The diversity within class

II genes is localized to the regions of the molecule

responsible for antigen binding, suggesting that selection

for functional diversity in the binding sites of these

molecules may maintain these polymorphisms in natural

populations. Taken together, this suggests that strong

selective pressures play an important role in the

maintenance of MHC polymorphism.

Functional Role of MHC Polymorphism

The MHC molecules are involved in cell recognition

and generation of the immune response to foreign antigen.

The interaction of foreign antigen, the class II

molecules, and the T cell receptor determines if an animal

can mount an immune response. Therefore, the MHC

molecules play a pivotal role in the survival of the animal

when challenged by pathogens in its natural environment.


Regulation and Expression of MHC Molecules

Class II molecules are expressed predominantly on two

cell types, collectively called antigen presenting cells

(APC), and typified by the macrophage and the B

lymphocyte. It is well documented that macrophages and

macrophage-like cells play a fundamental role in the

induction of immune responses. The interaction of the

regulatory T lymphocyte and the APC is under the control

of the I region of the MHC, termed MHC restriction, and

the ability of the APC to present antigen is dependent on

the cell surface expression of a class II molecule (Unanue

1983). It has been demonstrated that the expression of

class II is not constitutive in macrophages and can come

under both positive and negative control (Steinman et al.

1980; Snyder et al. 1982). In the case of positive

control, McNicholas et al. (1982) have shown that factors

secreted from mitogen activated spleen cells induce the

biosynthesis and cell surface expression of MHC antigens.

Subsequent studies have determined this factor to be

gamma-interferon (Steeg et al. 1982; King and Jones 1983).

When macrophages are incubated with immune interferon,

there is a coordinate increase in mRNA for the four class

II chains within an eight hour period (Paulnock-King et

al. 1985). Further studies on class I induction on

macrophages by gamma-interferon suggest the role of a

common sequence in the promoter of the genes, in

association with a functional enhancer sequence, necessary


for the induction of their expression (Israel et al.


The B lymphocyte, in contrast to the macrophage,

shows a heterologous constitutive level of class II on its

cell surface (Mond et al. 1981; Monroe and Cambier 1983).

Mitogen activated T cell supernatants were shown to

increase the levels of cell surface Ia on resting B cells

(Roehm et al. 1984), and later studies identified this

factor as B-cell stimulatory factor 1 (BSF-1) (Noelle et

al. 1984). BSF-1, or interleukin 4, induces mRNA levels

within one hour and cell surface levels as early as two

hours (Polla et al. 1986).

These two mechanisms of class II induction reflect

the importance of the cell surface expression of class II

for the interactions of the regulatory T lymphocytes and

the APC for the initiation of an immune response.

Chain Association and the Functional Expression of Class
II Molecules

Early studies have shown that the class II molecules

are heterodimeric in nature requiring the association of

an α and a β chain (Cullen et al. 1974; Jones et al. 1978).

By evaluating the functional role of the class II

molecules, it was observed that certain immune responses

in recombinant mice of the b and k haplotypes mapped to

separate subregions of the I region and were therefore

under the control of two genes (Jones et al. 1978).


It was noted that some laboratory inbred mice carry

mutations that cause the failure of expression of the

class II E molecule on the cell surface (Jones et al.

1981). The cloning and analysis of the I region by

Steinmetz et al. (1982b) showed that the genes for Eβ and

Eα are present in the strains of mice that do not express

an E molecule. These defects in expression fall into

three categories (Hyldig-Nielsen et al. 1983; Mathis et

al. 1983): The H-2b and H-2s haplotypes have a deletion

in Eα, the H-2f haplotype makes an Eα message of aberrant

size, and the H-2q haplotype has a defect in the stability

of the Eα message. Lack of E molecule expression has been

documented to be as high as 30% in wild mice with levels

of 50% in the t haplotypes, which can be found in

frequencies of up to 40% in wild populations (Nizetic et

al. 1984). Eighteen t haplotype strains were shown to

lack expression of an E molecule (Dembic et al. 1984).

Sixteen of the eighteen strains carry a deletion in the

promoter of Eα identical to that seen in inbred mouse

strains. The three non-expressing strains which do not

carry this deletion carry a mutation where the gene is

transcribed but no protein is expressed on the cell

surface. These three mutations represent the extreme case

where changes in one chain of the class II molecule affect

cell surface expression and the ability of an animal to

mount an immune response to certain antigens.


There are no reports of the lack of expression of an

A molecule within natural populations of mice. The

importance of maximizing the amount of class II variation

is believed to be reflected in the observation that α and

β chains of a given isotype (i.e., A or E) can

transassociate in heterozygotes (Fathman and Kimoto 1981).

These findings have given rise to the notion of free

association among alleles within an isotype. Studies on

wild derived haplotypes, by the analysis of tryptic

peptide fingerprints from serologically related groups of

mice (Wakeland and Klein 1983), show that Aα and Aβ within

these strains differ by less than 10% of their tryptic

peptides (Wakeland and Darby 1983). RFLP analysis of Aβ

and Aα for this same allelic family corroborates this

observation at the DNA level (McConnell et al. 1986).

Recent studies indicate that polymorphism can

dramatically affect the Aα and Aβ subunits' ability to

assemble as an A molecule for functional expression on the

cell surface (Germain et al. 1985; Gilfillan et al. 1988).

In these studies, Aα and Aβ genes from different

haplotypes were either co-transfected or introduced into

transgenic mice. It was observed that haplotype-mismatched

chains cannot associate effectively enough to yield

appreciable levels of the transassociated pairs. Taken

together, these findings suggest that, for proper

assembly and cell surface expression, the α and β chains


of the A molecule need to be co-adapted and, therefore, be

from the same or similar haplotype.

Role of Class II in the Presentation of Foreign Antigen

It is the interaction of foreign antigen, class II

molecules and the T cell receptor which determines if an

animal will mount an immune response. Unlike the B

lymphocyte, the T cell's receptor cannot bind and

recognize free antigen (Moller 1978; Moller 1980). It is,

therefore, the role of the class II molecule to present

antigen in such a way to enable the T lymphocyte to

respond and initiate an immune response.

The T cell receptor must recognize a bimolecular

ligand composed of the antigen and the MHC class II

molecule (Schwartz 1985). Studies have shown that most T

cells recognize non-native forms of the antigen as seen

with lysozyme (Adorini et al. 1979), ovalbumin

(Shimonkevitz et al. 1983), myoglobin (Streicher et al.

1984), and insulin (Falo et al. 1986). Conversion of

antigen from a native to a non-native form is termed

antigen processing, and it is performed by APCs which

express class II antigens (Allen 1987).

The nature of the interaction of MHC molecules and

processed foreign antigen is of great interest for the

understanding of the functional role of MHC polymorphism.

An early study by Babbitt et al. (1985) revealed that

immunogenic peptides from hen egg lysozyme bind


specifically to class II molecules from a responder, but

not a non-responder, haplotype. Subsequent studies have

focused on the residues responsible for this interaction

and the exact nature of the binding between antigen and

class II (Buus et al. 1986; Sette et al. 1987; Buus et al.

1987). Recently, the three dimensional structure of a

class I molecule was determined by X-ray crystallography

(Bjorkman et al. 1987) and because of the significant

domain and sequence homologies between class I and class

II molecules, Brown et al. (1988) proposed a model for

the class II molecule similar to that determined for class I

molecules. The cell-surface portions of each subunit

contain two domains (α1, α2, β1, and β2), of which the α1

and β1 domains are postulated to jointly form the binding

site which interacts with peptide antigens. The binding

site is a groove produced by two parallel alpha helices

which rest atop a platform formed by an eight-strand beta

pleated sheet. The α1 and β1 domains of the class II

molecules are each postulated to donate one of the alpha

helices and four strands of the beta pleated sheet,

which combine symmetrically to form the binding site. The

polymorphic residues within the α1 and β1

domains have been shown to occupy sites within this groove,

thereby representing contact points between MHC, antigen,

and the T cell receptor. This model has not yet been

confirmed by X-ray crystallography, but conforms


to a variety of structural and functional studies (Allen

et al. 1987; Buus et al. 1987; Guillet et al. 1987).

Taken together, these data show the importance of the

combinatorial associations of the α and β chains of the

class II molecules. The influence of polymorphisms on

their association can drastically affect the development

of an effective immune response. Maintenance of

co-adapted α and β chains will ensure the proper assembly and

expression of a class II molecule.

Homologous Recombination Within the MHC

The mouse MHC offers a unique opportunity for

investigating whether homologous meiotic recombination

happens at random or at specific sites within the genome.

Many recombinant strains of mice have been characterized,

and the advent of molecular cloning has made it possible

to map the crossover sites.

Hotspots of Homologous Recombination

The DNA in the genome is not a static structure and

undergoes reorganizational events during evolution and

development. There is exchange of nucleotide sequences

between chromosomes by homologous and non-homologous

recombination. Recombination is a mechanism by which DNA

sequences are exchanged between homologous and non-

homologous chromosomes either during meiosis or mitosis.


There are three types of recombination: non-homologous,

homologous equal, and homologous unequal. Non-homologous

recombination occurs between sites within distinct

structural environments which may be located on the same

or different chromosomes. Non-homologous recombination is

also referred to as site specific recombination and acts

in the differentiation of some prokaryotic and eukaryotic

cells. Some specific examples of non-homologous

recombination are the integration or excision of

bacteriophages and bacterial transposons (Bauer et al.

1984; Calos and Miller 1980), and the rearrangement of

immunoglobulin and T cell receptor genes in eukaryotes

(Tonegawa 1983; Davis 1985). Non-homologous recombination

usually represents mitotic or somatic events.

Homologous recombination, or allelic recombination,

occurs between homologous or allelic nucleotide sequences

on homologous chromosomes. Homologous recombination can

generate new combinations of alleles by the exchange of

sequences between homologous chromosomes. Homologous

equal recombination breaks and rejoins nucleotide

sequences at precisely the same position whereas

homologous unequal recombination cuts and rejoins at

different locations on homologous chromosomes leading to

the accumulation of duplications and deletions.

Homologous recombination can, theoretically, occur

anywhere along the chromosome. It has been recognized

that recombination frequencies can vary for different


stretches of DNA of the same size and that genetic map

distances do not always agree with molecular map distances

(Steinmetz et al. 1982b). Both these observations suggest

that recombination is site specific, in that there are

regions that either enhance or suppress recombinational

events. A site where recombination is localized has been

termed a recombinational hotspot (RHS) (Smith 1983).

Recombinational hotspots were first reported in

bacteriophage lambda where mutants arose that grew better

in E. coli than the wild type phage due to a

recombinational event localized to an eight nucleotide

sequence. This sequence, GCTGGTGG, or Chi (for crossover

hotspot instigator), enhances recombination leading to

better growth in the host bacteria (Smith 1983). The

function of the Chi sequence has been determined to be the

stimulation of homologous recombination. It exerts its

greatest activity within 10 kilobases upstream, and there

appears to be a second sequence involved which is located

downstream (Smith et al. 1981). Chi sequences are found

in E. coli at a frequency of one every five kilobases

(Malone et al. 1978). The Chi sequence may therefore

represent a molecular basis for recombination.
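
The enrichment of Chi implied by the figures above can be illustrated with a short calculation. The following is a minimal sketch only: the toy sequence is hypothetical (it is not lambda or E. coli DNA), and the comparison assumes a random sequence with equal base frequencies scanned on one strand.

```python
CHI = "GCTGGTGG"  # Chi octamer as given in the text


def find_chi_sites(sequence, motif=CHI):
    """Return 0-based start positions of every (possibly overlapping) motif match."""
    seq = sequence.upper()
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


def expected_spacing_random(motif_length):
    """Mean spacing (bp) of a fixed motif on one strand of random, equal-base DNA."""
    return 4 ** motif_length


# Hypothetical toy sequence used only to exercise the scanner.
toy = "AAGCTGGTGGTTTTGCTGGTGGCC"
sites = find_chi_sites(toy)

random_spacing = expected_spacing_random(len(CHI))  # 4**8 = 65536 bp
observed_spacing = 5000  # one Chi per five kilobases, as cited in the text
enrichment = random_spacing / observed_spacing
```

Under these assumptions an octamer would be expected roughly once every 65.5 kilobases by chance, so a spacing of one per five kilobases corresponds to about a thirteen-fold over-representation.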

Homologous recombination has also been documented in

eukaryotic systems, such as yeasts, which also serve as a

vector system for the study of recombination prone

sequences (Song 1985). Recombination has been documented

in the human genome within the β-globin gene cluster


(Orkin and Kazazian 1984). By examining haplotype

associations of polymorphic restriction endonuclease

sites, recombination could be localized to a 9.1 kilobase

stretch of DNA located between the δ-globin gene and the

first exon of the β-globin gene. Because several

haplotypes carry identical mutations in the 5' genomic

segment which are found in association with different 3'

genomic segments, this site was determined to be a

recombinational hotspot.
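
The reasoning in this paragraph, that one 5' segment found with several different 3' segments implies exchange between them, can be sketched as a simple tabulation. The pattern labels below are hypothetical placeholders, not the published globin haplotypes, and the sketch ignores the alternative explanation of recurrent mutation.

```python
from collections import defaultdict


def shuffled_segments(haplotypes):
    """Map each 5' pattern to the sorted 3' patterns it occurs with,
    keeping only 5' patterns found with more than one 3' pattern."""
    partners = defaultdict(set)
    for five_prime, three_prime in haplotypes:
        partners[five_prime].add(three_prime)
    return {k: sorted(v) for k, v in partners.items() if len(v) > 1}


# Hypothetical (5' pattern, 3' pattern) pairs scored from restriction site data.
observed = [("5A", "3X"), ("5A", "3Y"), ("5B", "3X"), ("5C", "3Z")]
result = shuffled_segments(observed)
# result holds only "5A", which is paired with both "3X" and "3Y"
```

A 5' pattern that appears in the result is a candidate for having recombined with more than one 3' segment somewhere in the interval separating the two marker clusters.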

Recombination Within H-2

Hotspots for homologous recombination have been

identified and characterized in the MHC of the mouse.

Because of the ability to breed homozygous mouse strains

and the large number of distinct alleles for the genes

within H-2, researchers have been able to compile an

extensive collection of intra-I region recombinant

congenic inbred mouse strains. These strains were first

identified serologically and functionally by demonstrating

the co-expression of two distinct parental epitopes for

the A and E molecule in a single offspring (Stimpfling and

Durham 1972; Benacerraf and Dorf 1976).

Molecular cloning and characterization of the class

II genes made it possible to locate the recombinational

breakpoints at the molecular level. Steinmetz and co-

workers (1982b) did a molecular characterization of nine

intra-I region recombinant strains and found that all the


recombinational events map to a single site within the I

region. This site is localized to a 3.4 kilobase region

encompassing the second intron of the Eβ gene. These

breakpoints were further characterized through Southern

blot analysis by Kobori et al. (1984), and were localized

to only 2.0 kilobases. Sequence analysis of three

parental and four I region recombinants reveals that, in

three of the recombinants, the recombinational event

occurs within a 1 kilobase region of DNA (Kobori et al.

1986). Several subsequent studies have identified more

intra-I region recombinants in which the breakpoints map

to this RHS (Saha and Cullen 1986a, 1986b; Lafuse and

David 1986). In all, there have been 12 breakpoints

defined, which are localized within 10 kilobases of DNA

spanning the Eβ gene.
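
The logic of mapping a breakpoint from restriction site data can be sketched as follows: each marker in a recombinant is scored for its parental origin, and the crossover must lie between the last marker of one origin and the first marker of the other. This is an illustrative sketch only; the marker names and positions are hypothetical and are not taken from the studies cited above.

```python
def breakpoint_intervals(markers):
    """markers: list of (name, position_bp, parental_origin), ordered along
    the chromosome. Return (left_marker, right_marker, interval_bp) for
    every adjacent pair whose parental origin differs."""
    intervals = []
    for (n1, p1, o1), (n2, p2, o2) in zip(markers, markers[1:]):
        if o1 != o2:
            intervals.append((n1, n2, p2 - p1))
    return intervals


# Hypothetical recombinant between a b-haplotype and a k-haplotype parent.
recombinant = [
    ("m1", 0, "b"),
    ("m2", 12000, "b"),
    ("m3", 15400, "k"),
    ("m4", 30000, "k"),
]
intervals = breakpoint_intervals(recombinant)
# intervals == [("m2", "m3", 3400)]
```

The resolution of the mapped interval is set by the spacing of informative markers, which is why additional polymorphic sites or direct sequencing narrow the estimate.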

Shiroishi and co-workers (1982) examined a congenic

mouse strain, B10.MOL-SGR (Mus musculus molossinus) and

found a tremendously enhanced frequency of recombination

between the K and A locus. Steinmetz et al. (1986)

examined a similar mouse, CAS4 (Mus musculus castaneus)

which shows a recombination rate as high as 1.5% within

this same portion of the genome which encompasses 40


A third recombinational hotspot was identified in

another strain of M. m. castaneus, CAS3, which exhibits a

recombination rate of 0.6% with breakpoints localized to a

9.5 kilobase stretch of DNA between Aβ3 and Aβ2 (Steinmetz


et al. 1986). Further analysis of the nucleotide sequence

of five similar recombinant haplotypes revealed that all

the breakpoints are confined to a 3.5 kilobase region of

DNA (Uematsu et al. 1986). Of the breakpoints examined,

all show homologous recombination without any DNA

sequences duplicated or deleted between the parental and

recombinant haplotypes.
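
To give a sense of scale for the CAS3 figures, the recombination frequency can be expressed per kilobase and compared with a genome-wide average. This is a rough sketch under two stated assumptions that are not from the studies cited: 1% recombination is treated as 1 centiMorgan, and the genome-wide average is taken as a round 0.5 cM per megabase.

```python
def cm_per_kb(recombination_percent, interval_kb):
    """Recombination intensity, treating 1% recombination as 1 cM
    (a reasonable approximation over short distances)."""
    return recombination_percent / interval_kb


# Assumption for illustration only: about 0.5 cM per megabase genome-wide.
ASSUMED_GENOME_AVERAGE_CM_PER_KB = 0.5 / 1000

cas3 = cm_per_kb(0.6, 9.5)  # 0.6% recombination over 9.5 kb, from the text
fold = cas3 / ASSUMED_GENOME_AVERAGE_CM_PER_KB
```

With these assumptions the CAS3 interval recombines on the order of a hundred times more often per kilobase than the assumed average, which is the sense in which the site is a hotspot.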

Recently, a fourth RHS has been identified which maps

to a 12 to 14 kilobase region centromeric to the Eα gene

as characterized by seven breakpoints (Lafuse and David

1986). Figure 2-4 gives a map of the RHSs within the MHC.

Earlier data from serologic and tryptic peptide

fingerprints (Singh et al. 1981; Wakeland and Darby 1983)

have provided evidence for the existence of a fifth RHS

within the MHC which maps between the Aα and Eβ genes in

some wild derived haplotypes. Data presented in this

dissertation help to confirm the existence of this site of


Haplotype Specificity of Recombination within the MHC

The presence of RHSs at a given site depends on the

haplotype of the MHC involved in the genetic cross.

Recombination within the hotspot in Eβ is exhibited by

strains of the b and k haplotypes in genetic crosses

producing recombinant offspring (Steinmetz et al. 1982b;

Lafuse et al. 1986). Recombination within two distinct M.

m. castaneus haplotypes, c3 and c4, reveals two distinct



recombinational hotspots in the interval between K and Aβ2

(Steinmetz et al. 1986; Uematsu et al. 1986).

Recombinational events in the c3 haplotype mapped to the

Aβ3/Aβ2 hotspot, whereas recombination in the c4 haplotype

occurred in the K/Aβ3 hotspot. Crossing over in the

hotspot between Eβ2 and Eα has so far only been detected

in crosses of the R haplotype (Lafuse et al. 1986; Lafuse

and David 1986).

Molecular Basis For Recombinational Hotspots in the MHC

There are similarities between the Chi sequence in E.

coli and the recombinational hotspots in the MHC. As seen

for Chi, breakpoints are clustered, only homologous

recombination is seen, and the activity of the RHS appears

to be dominant. Therefore, the nucleotide sequences

around the RHSs were examined for sequences with homology

to Chi. Studies by Steinmetz et al. (1986) and Kobori et

al. (1986) showed that the region around the Eβ RHS

contains a sequence composed of a tetramer, CAGG, which is

repeated 22 times. This sequence has limited homology to

the Chi sequence of bacteriophage lambda, which is known

to promote recombination. A much stronger degree of

homology is found between this sequence and the core

sequence of human minisatellite DNA, which may facilitate

recombination in human chromosomes (Jeffreys et al. 1985).

There is no functional evidence, however, to suggest that

these repeated sequences are important in recombination


within the MHC. Sequence comparisons can only suggest

possible control sequences for recombination, and only

functional assays can identify structural elements

required for recombination.
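
A sequence comparison of the kind described here begins with locating the repeat itself. The following minimal sketch finds the longest uninterrupted run of a short repeat unit; the toy sequence is hypothetical and simply assumes, for illustration, that the 22 copies of CAGG are arranged in perfect tandem.

```python
import re


def longest_tandem_run(sequence, unit="CAGG"):
    """Return (start, copies) for the longest uninterrupted run of `unit`,
    or (-1, 0) if the unit does not occur."""
    best = (-1, 0)
    pattern = "(?:%s)+" % re.escape(unit)
    for match in re.finditer(pattern, sequence.upper()):
        copies = len(match.group(0)) // len(unit)
        if copies > best[1]:
            best = (match.start(), copies)
    return best


# Hypothetical toy sequence: 22 tandem copies followed by a shorter run.
toy = "TTA" + "CAGG" * 22 + "GATC" + "CAGG" * 2
start, copies = longest_tandem_run(toy)
# start == 3, copies == 22
```

As the text notes, finding such a repeat only nominates a candidate element; it does not show that the repeat is required for recombination.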

Wild Mice

The goals of this dissertation are to survey

polymorphism at loci evenly spaced across the I region and

to determine the location and influences of

recombinational hotspots on the evolution of I region

haplotypes in modern species of Mus. Previous studies on

this subject were limited in scope due to the nature of

the strains of mice used to address these questions.

Inbred mice represent only a small subset of highly biased

haplotypes. These strains are derived from a limited

number of sources that were generated from a high degree

of inbreeding, thereby representing a biased sampling of

the mouse population and an artificial collection of

considerable homogeneity (Ferris et al. 1982; Klein 1974).

Wild mice represent a population whose breeding is

not controlled by humans (Bruell 1970), and, therefore,

represent a collection of I region haplotypes of

considerable heterogeneity. These mice also represent the

product of evolutionary processes where the I region

haplotypes are fixed and maintained through natural

selective pressures.


Natural History of the Wild Mouse

Wild mice can be divided into three groups based on

their associations with humans (Sage 1981). Aboriginal

mice are free living mice with essentially no interaction

with humans. Commensal mice, on the other hand, live in

close association with humans, and in most cases rely on

humans for their source of food and shelter. The third

group, feral mice, represent mice which have made the

transition from a commensal association back to the

aboriginal state.

The commensal mice fall into four subspecies of Mus

musculus: M.m.domesticus, M.m.musculus, M.m.castaneus, and

M.m.molossinus (Marshall 1981). The research in this

dissertation is concerned only with mice of the

M.m.domesticus subspecies. Based on fossil evidence,

nuclear genetic variation and mitochondrial genetic

variation, Ferris et al. (1983) estimate that this

commensal relationship between mouse and man has been in

existence for more than 1 million years.

In contrast to aboriginal mice whose range

encompasses only the Eurasian continent, commensal mice,

also indigenous to Eurasia, radiated to the New World

at much the same time as the first human settlers.

The commensal mice have adapted remarkably well to

extremely varied climatic conditions with habitats ranging

from Europe, the Americas, Australia, Africa, and several

South Pacific islands. They may represent the most


evolutionarily advanced member of the genus (Marshall


M.m.domesticus is presently found throughout the

world, and within its native range, can be found in

habitats as diverse as households, agricultural fields,

barren rocky ravines (Gaisler 1975; Hassinger 1973), salt

marshes (Breakey 1963), grasslands (Pearson 1963), coal

mines (Philip 1938), and mountain environments (Harland


The t Complex

The t complex is a gene complex located on the

centromeric one third of chromosome 17 adjacent to H-2,

and accounts for nearly 1% of the mouse genome. There are

two major structural forms of chromosome 17, the wild type

and the t form. The t form is carried in 10% to 40% of

wild mice (Artzt et al. 1985; Dembic et al. 1984). A

complete t haplotype is one that, by definition,

suppresses recombination along the entire 12 centiMorgan

region from the gene locus Brachyury (T) to the H-2

complex. The different t haplotypes are all structurally

related to one another. Within the chromosomal region

occupied by the t haplotypes, there are genes common to

non-t bearing mice and mutant genes characteristic of the

t complex.

Mutant genes within the t haplotypes have been shown

to cause abnormalities in tail length, embryogenesis,


fertility, male transmission ratios, and meiotic

recombination (Dunn and Gluecksohn-Schoenheimer 1950;

Silver 1985). The t complex has been termed "selfish DNA"

(Klein et al. 1986) which serves no apparent purpose other

than self propagation and dissemination throughout a

population. Two reasons for the prevalence of t

chromosomes in the wild are their ability to sway their

own transmission and their ability to keep the genetic

elements responsible for segregation distortion together

by the suppression of recombination.

The molecular nature of the segregation distorters is

not known. It is well documented that wild males carrying

a complete t haplotype will transmit their t chromosome to

greater than 90% of their offspring (Lyon and Meredith

1964a, 1964b). Mice carrying a partial t haplotype can

transmit it only when complemented by another chromosome

which can restore the transmission distorter (Silver

1985). Lyon (1984) suggests that a series of distorter

loci, Tcd, can act on a single responder locus, Tcr. The

Tcd loci act in an additive fashion in either a cis or

trans configuration to the Tcr, and when the additive

effect of the Tcd loci reaches a certain level, a high

degree of transmission of that chromosome is seen.

The mechanism responsible for the suppression of

meiotic recombination is much more straightforward than

for transmission distortion. The partial t haplotypes

were an important tool in elucidating the molecular basis


of the recombination suppression. The region of

suppression in the partial t haplotypes extends only as

far as the t DNA present (Bechtol and Lyon 1978; Bennett

et al. 1978). Normal levels of recombination are observed

between t chromosomes as opposed to the wild type. This

suggests that the structures of the t haplotypes are

similar to each other, yet different from wild type DNA

(Artzt et al. 1982a; Condamine et al. 1983).

Subsequently, it was shown that the t haplotypes have a

proximal inversion encompassing T and the Tcp (t complex

proteins) products (Herrmann et al. 1986), and a distal

inversion containing tf (tufted locus) and H-2 (Artzt et

al. 1982b; Shin et al. 1983; Shin et al. 1984).

Recombination, therefore, is suppressed between t and the

wild type due to the inversion of these regions.

The other characteristics of the t complex, sterility

and lethality, have been suggested to be secondary add-ons

to the primary properties of the t chromosome:

transmission distortion and recombination suppression

(Klein et al. 1986). This region of the chromosome is

believed to carry genes instrumental in embryogenesis and

development which have become mutated, hence the lethality

and sterility seen, and which are carried along with the

"selfish DNA" and disseminated throughout wild

populations. This hypothesis is supported by reports that

lethality mutations appear to be single locus

mutations which can complement each other in genetic tests


(Bennett 1975; Klein et al. 1984; Winking and Guenet


Due to the inclusion of the MHC in the recombination

suppression of the t haplotypes, the association of the

alleles at MHC loci with the t haplotypes is of great

interest. t forms of chromosome 17 are believed to be of

single founder origin, or at least of a limited founder

origin (Klein et al. 1986). This suggests that the H-2

haplotype associated with a t haplotype will represent an

evolutionary lineage separate from those seen

in the wild type. Figueroa et al. (1985) have reported

the existence of three groups of class II alleles

associated with particular t haplotypes. Dembic et al.

(1984) and Nizetic et al. (1984) have shown a correlation

between a deletion in Eα and its association with the t

haplotypes. This identical deletion is seen in wild type

forms of chromosome 17 also, which has been interpreted as

an ancient origin of this deletion, but which also may be

interpreted as having been introduced through

recombination with the wild type. A recent study has

shown alleles for Aβ shared between t and the wild type

(McConnell et al. 1988). This, in conjunction with data

in this dissertation, suggests a higher degree of

recombination between the t and wild type forms of

chromosome 17.


Class II Gene Polymorphisms in Wild Mice

Evidence for the presence of H-2 specificities which

are unique to wild mice, and not present in panels of

laboratory strains, led to the quest for new alleles from

wild mouse populations. Serology was a powerful tool for

the analysis and characterization of these H-2

specificities, but there was a problem in separating these

reactivities from non-H-2 antigens and other H-2

specificities in the heterozygous animal. This problem

led to the development of wild derived congenic mouse

lines on a B10 background, collectively referred to as the

B10.W congenic lines (Klein 1973, 1975). Wild male mice

were bred with B10.BR female mice and the offspring were

backcrossed 8 to 14 times with the continual selection of

an H-2 marker specific for the wild haplotype. These

lines were maintained by brother x sister matings and the

wild H-2 haplotypes selected for on a C57BL/10 background.
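
The effect of repeated backcrossing on the genetic background can be illustrated with the standard expectation that each backcross halves the donor contribution at unlinked loci. This is a sketch under that textbook assumption only; it does not apply to the selected H-2 region or to loci linked to it, which are retained for much longer.

```python
def expected_donor_fraction(backcrosses):
    """Expected fraction of unlinked donor (wild) genome remaining after the
    initial outcross (50% donor) followed by `backcrosses` backcross
    generations, each of which halves the donor contribution."""
    return 0.5 ** (backcrosses + 1)


# The text reports 8 to 14 backcross generations.
fractions = {n: expected_donor_fraction(n) for n in (8, 14)}
```

On this expectation, roughly 0.2% of the unlinked wild genome would remain after 8 backcrosses and about 0.003% after 14, which is the basis for treating the lines as congenic.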

Serologic examination of the B10.W lines revealed the

extreme polymorphisms of the class II genes within wild

populations (Klein 1975; Zaleska-Rutczynska and Klein

1977). Of the 16 haplotypes examined, a few appear

identical to inbred haplotypes, a few are identical to

each other, but the majority represent novel haplotypes

which are not seen in laboratory inbred strains. Later

serologic analysis of 29 wild derived haplotypes by

Wakeland and Klein (1979a; 1981), revealed three new I

region haplotypes; u, v, and j. These same types of


analyses of wild haplotypes show evidence for possible

recombination events (Duncan and Klein 1980; Wakeland and

Klein 1979b), which points out the value of wild derived

H-2 haplotypes for the study of recombination.

Tryptic peptide mapping, in conjunction with the

serologic data, demonstrates that many of the haplotypes

can be grouped into families of variant alleles (Wakeland

and Klein 1983). The nature of the variations between

alleles of the A molecule within these families was

investigated, and it was found that these alleles differ

by less than 10% of their tryptic peptides. Most of these

differences are localized in the α1 and β1 domains of the

A molecule (Wakeland and Darby 1983; Wakeland et al.

1985). Studies at the DNA level confirm the relatedness

of these alleles, and have added more insight into the

mechanisms which play a role in the generation of

diversity of class II molecules (McConnell et al. 1986,


Wild derived lines have continually shown evidence

for recombination within the I region (Duncan and Klein

1980; Wakeland and Klein 1979b; Wakeland and Darby 1983;

Singh et al. 1981). A recent study by Soper and

co-workers (1988) reveals the prevalence of the RHS in Eβ in

the diversification of I region haplotypes within a small

panel of wild mice. Singh et al. (1981) demonstrated that

recombination commonly occurs between Aα and Eβ, a fact

which has not been borne out at the molecular level until


the work in this dissertation. These data suggest the

important role of the wild mouse in the understanding of

the evolution of the I region and its application to the

study of recombination.



All mice used in this study were from the mouse

colony in the Tumor Biology Unit at the Department of

Pathology and Laboratory Medicine, University of Florida,

or from our wild mouse colony located at the Animal Care

Facility, University of Florida. Strains included in this

analysis are listed in Table 3-1. The wild derived mouse

strains were maintained by brother x sister matings and

are homozygous at the H-2 complex unless otherwise noted.

t haplotype mice were supplied by Dr. Joseph Nadeau,

Jackson Laboratories, and were maintained as heterozygotes

due to the lethality of the t mutations carried by these


Antibody Isolation and Conjugation

Monoclonal antibody, 14.4.4, (Ozato et al. 1980) was

produced by injecting 0.5 ml containing 2 x 10^6 hybridoma

cells intraperitoneally into sub-lethally irradiated,

Pristane (Sigma, St. Louis, MO) primed, male BALB/c mice.

Ascites fluid was harvested every 2 days for a 1 month

period. The ascites fluid was run over a protein A-Sepharose

column (Pharmacia Fine Chemicals, Uppsala) at 50



ml/hr and then washed with PBS until the absorbance at 280

nm was at baseline as determined spectrophotometrically.

IgG was eluted from the column with a solution of 0.58%

acetic acid, 0.15 M NaCl and the eluate collected in 5 ml

fractions. The eluted IgG was dialyzed 18 hours at 4°C

against pH 9.3 carbonate/bicarbonate buffer (17.3 g

NaHCO3/8.6 g Na2CO3 in 1 liter H2O). Fluorescein

isothiocyanate (FITC) (Sigma, St. Louis, MO) was dissolved

in dimethyl sulfoxide at a concentration of 1.0 mg/ml and

the appropriate amount was added to the IgG and allowed to

react at room temperature for 2 hours. This mixture was

loaded on a G-75 (Bio-Rad Laboratories, Richmond, CA)

column in PBS plus 0.1% NaN3 and the first colored band

was collected and used in subsequent experiments.

Spleen Cell Isolation, Immunostaining, and
Flow Cytometric Analysis

Freshly explanted spleens were minced through wire

screens to make single cell suspensions. Red blood cells

were lysed with an ammonium sulfate solution (0.5% w/v)

and washed extensively with PBS. 1 x 10^6 cells were

resuspended in PBS plus 0.1% NaN3 and incubated with a

1:150 dilution of the FITC conjugated antibody for 30

minutes at 4°C. Samples were washed 3 times with PBS and

brought up in 0.5 ml for flow cytometry. Cells were

passed through a 44 μm nylon mesh filter and then run on a

TABLE 3-1. H-2 Homozygous Wild and Inbred Mice.

Strain of
Mus m. domesticus H-2 Geographic Origin







old inbred
old inbred
old inbred
old inbred
old inbred
old inbred
old inbred
old inbred
old inbred
old inbred
old inbred

West Germany


FACS II fluorescence activated cell sorter (Becton-

Dickinson, Mountain View, CA) at a flow rate of 200-250


Isolation of genomic DNA

Genomic DNA was isolated from liver tissue by a

Proteinase K (Sigma, St. Louis, MO)/SDS method as detailed in

Maniatis et al. (1982). After 24 hrs of starvation, mice

were sacrificed and their livers were surgically removed.

Liver tissue was minced with scissors and added to a mortar

containing liquid nitrogen and ground to a fine powder. The

frozen powder was then added to 40 ml of a TES buffer (10 mM

Tris HCl, pH 7.5; 5 mM EDTA; 100 mM NaCl) containing 1% SDS

and 0.4 mg/ml proteinase K. This solution was incubated at

65°C in a water bath for 18 hours. DNA solutions were then

extracted three times with Tris equilibrated phenol (pH

7.5), twice with a chloroform/isoamyl alcohol solution (25:1

v/v) and precipitated with 2.5 times the volume of

isopropanol. DNA was hooked out of solution with Pasteur

pipettes, resuspended in 1.0 ml TE buffer (10 mM Tris HCl,

pH 7.5; 1 mM EDTA), and dialyzed against TE buffer.

Resulting DNA solutions were quantitated spectrophotometrically

and electrophoresed on 0.7% agarose gels to confirm their

high molecular weight.
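
The spectrophotometric quantitation step can be sketched as a short calculation. The conversion factor below is the widely used approximation that an absorbance of 1.0 at 260 nm corresponds to about 50 μg/ml of double-stranded DNA; the absorbance reading and dilution factor are hypothetical values chosen for illustration, not measurements from this study.

```python
DSDNA_UG_PER_ML_PER_A260 = 50.0  # common approximation for double-stranded DNA


def dna_concentration(a260, dilution_factor=1.0):
    """Concentration (ug/ml) of the undiluted stock from an A260 reading."""
    return a260 * DSDNA_UG_PER_ML_PER_A260 * dilution_factor


def volume_for_mass_ul(mass_ug, conc_ug_per_ml):
    """Volume (microliters) of stock that contains `mass_ug` of DNA."""
    return mass_ug * 1000.0 / conc_ug_per_ml


# Hypothetical reading: A260 of 0.25 on a 1:40 dilution of the stock.
conc = dna_concentration(0.25, dilution_factor=40)  # 500 ug/ml
vol = volume_for_mass_ul(20, conc)                  # volume holding 20 ug
```

A helper of this kind gives the volume of stock needed to deliver a fixed mass of DNA to each restriction digest.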


Endonuclease Digestion and
Agarose Gel Electrophoresis

Aqueous solutions containing 20 μg of high molecular

weight DNA were digested with 40 units of one of 5

restriction endonucleases (Bam HI, Bgl II, Eco RI, Pvu II,

or Sac I) for 18 hours at 37°C under conditions described by

the supplier (Bethesda Research Laboratories, Bethesda, MD).

Complete digestion was confirmed by removing 10% of the

reaction mixture, adding 0.5 μg of phage lambda DNA,

incubating an additional 4 hours, and electrophoresing on an

agarose gel. Characteristic restriction patterns of phage

lambda DNA with a homogeneous smear of genomic DNA were

indicative of complete digestion of the DNA mixture. The

remaining digested genomic DNA solution was electrophoresed

on 0.7% agarose gels for 20 hours at 3 V/cm in a water

cooled horizontal electrophoresis apparatus (International

Biotechnologies Incorporated, New Haven, CT).
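
The fragment patterns expected from a complete digest can be illustrated in silico. The sketch below uses the standard recognition sequences and top-strand cut positions of the five enzymes named above; the toy sequence is hypothetical, and the function treats the DNA as a single linear molecule.

```python
# Recognition site and cut offset (bases from the start of the site, top strand).
ENZYMES = {
    "BamHI": ("GGATCC", 1),  # G^GATCC
    "BglII": ("AGATCT", 1),  # A^GATCT
    "EcoRI": ("GAATTC", 1),  # G^AATTC
    "PvuII": ("CAGCTG", 3),  # CAG^CTG
    "SacI": ("GAGCTC", 5),   # GAGCT^C
}


def digest(sequence, enzyme):
    """Return fragment lengths (bp), in order, from a complete digest of a
    linear molecule. All five sites are palindromic, so scanning one strand
    is sufficient."""
    site, offset = ENZYMES[enzyme]
    seq = sequence.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Hypothetical toy sequence with two EcoRI sites and no BamHI site.
toy = "A" * 10 + "GAATTC" + "T" * 20 + "GAATTC" + "C" * 5
fragments = digest(toy, "EcoRI")  # [11, 26, 10]
uncut = digest(toy, "BamHI")      # [47]
```

A polymorphism that creates or destroys one of these sites changes the fragment lengths, which is the basis of the restriction fragment length comparisons used in this study.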

Capillary DNA Transfer and Hybridization

Restriction endonuclease digested DNA was transferred

from the gels onto nylon membranes (Zetabind, AMF, Meriden,

CT) by the method of Southern (1980). The nylon membranes

were dried in a vacuum oven at 80°C for 3 hours and stored

at room temperature until hybridization. Membranes were

washed in a 0.1x SSC (0.015 M NaCl, 0.0015 M sodium citrate)

solution containing 0.5% SDS at 65°C for 1 hour in a shaking

water bath. Prehybridization and hybridization of the

filters were performed as described by the supplier (AMF,

Meriden, CT). Membranes were hybridized for 18 hours at 42°C

with DNA probes 32P-labeled by nick translation (Bethesda

Research Laboratories, Bethesda, MD) to a specific activity

of approximately 2 x 10^8 dpm/µg. Non-specifically bound

probe was removed by two successive washes in 0.1x SSC/0.1%

SDS at 65°C in a shaking water bath. Membranes were exposed

for 2-5 days to XAR-5 X-ray film (Kodak, Rochester, NY) with

Lightning Plus intensifying screens (Dupont, Wilmington,

DE). After autoradiography, hybridized probes were removed

by washing with 0.1x SSC/0.5% SDS at 80°C for 20 minutes, and

the membranes were sequentially rehybridized with other labeled probes.

RNA Isolation and Analysis

Total cellular RNA was isolated from mouse splenocytes

with guanidine isothiocyanate (International Biotechnologies

Inc., New Haven, CT) as described (Chirgwin et al. 1979;

Dingler et al. 1986). Isolated spleen cells were dissolved

in 4 M guanidine isothiocyanate and layered onto a buffer of

5.7 M cesium chloride (Bethesda Research Laboratories,

Bethesda, MD) and centrifuged at 20,000 rpm for 18 hours.

The supernatant was aspirated and the precipitated RNA was

resuspended in sterile DEPC-treated water, phenol and

chloroform extracted twice, and ethanol precipitated. RNA


solutions were stored as precipitates in 100% ethanol at

-20°C. The RNA was quantitated spectrophotometrically and

10 µg were electrophoresed in 1.0% formaldehyde/agarose gels

to check for degradation. RNAs that were intact were

blotted to nylon membranes and hybridized with specific

probes as described in the above section. These filters

were probed with a 700 bp cDNA for the Eα gene.


Seven single-copy DNA probes, evenly spaced across the I

region and flanking the known recombinational hotspots, were

isolated from I region cosmid clones supplied by Dr. Michael

Steinmetz, except where otherwise noted. Figure 3-1 shows the

location of these probes in relation to the genes within the

I region. Probe 1 is a 5.8 kb Eco RI fragment containing the

entire Aβ^d gene (Malissen et al. 1983); probe 2 is a 1.2 kb

Hind III fragment containing the α1 and α2 exons of Aα^b (J.

Seidman, personal communication 1984); probe 3 is a 4.2 kb

Eco RI/Xho I fragment midway between Aα and Eβ; probe 4 is a

1.8 kb Eco RI fragment containing the β1 exon of Eβ^d; probe 5

is a 700 bp cDNA containing the β2, TM, CYT, and 3'UT portions

of the Eβ^d gene; probe 6 is a 4.5 kb Bam HI fragment

containing the Eβ2 pseudogene; and probe 7 is a 6.5 kb Bam HI

fragment containing the 5' promoter region of the Eα^d gene.

[Figure 3-1. Location of the seven DNA probes in relation to the genes within the I region.]



Genomic Restriction Mapping

Genomic restriction maps were determined for the

pertinent strains by first determining the restriction map

of the different I region cosmids with the 5 restriction

endonucleases used in this study. Strains likely to show

recombination were double digested with the restriction

enzymes, and fragment sizes were determined after blotting and

hybridization with a probe for the locus of interest. These

fragments were then used to map the restriction endonuclease

sites for each strain.

Data Analysis

A restriction fragment length polymorphism (RFLP)

analysis was performed on the data using equation 21 of Nei

and Li (1979) where:

F = 2n_xy / (n_x + n_y)

in which n_x and n_y are the numbers of fragments from alleles

x and y, respectively, and n_xy is the number of shared

fragments between the alleles. A pairwise correlation

analysis of the F values obtained in the RFLP analyses was

performed for adjacent loci using a computer program from

SAS (Statistical Analysis Software) where a cumulative

correlation coefficient (R) was calculated for each allele

as compared to all other alleles sampled.
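The two calculations described in this section can be illustrated with a short Python sketch. It is only an illustration: the analysis itself was run in SAS, and the function names, the use of the Pearson coefficient for R, and the example F values for the two neighboring loci are assumptions made for this sketch rather than the program or data used in this study.

```python
from math import sqrt


def nei_li_f(n_x, n_y, n_xy):
    """Fraction of shared fragments, F = 2*n_xy / (n_x + n_y)."""
    if n_x + n_y == 0:
        raise ValueError("at least one allele must have fragments")
    return 2 * n_xy / (n_x + n_y)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Two alleles with six fragments each that share five fragments.
f_example = nei_li_f(6, 6, 5)  # 10/12

# Hypothetical F values for one haplotype compared with four others
# at two neighboring loci (made-up numbers, not data from this study).
f_locus_a = [0.83, 0.50, 0.33, 0.67]
f_locus_b = [0.86, 0.57, 0.29, 0.71]
r_example = pearson_r(f_locus_a, f_locus_b)

print(f"F = {f_example:.2f}")  # prints: F = 0.83
print(f"R = {r_example:.2f}")
```

An R value near 1.0 for a pair of loci indicates that haplotypes which are divergent at one locus tend to be divergent at its neighbor as well.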


RFLP Analysis of the I Region

An RFLP analysis of the Aβ gene was done previously

(McConnell et al. 1986, 1988). Briefly, a 5.5 kb Aβ

fragment, probe 1, identified twenty-one alleles with an

average F value of 0.29 within this panel of mice. This

high degree of divergence (low F value) was found to be due

to a retroposon insertion within the second intron. This

insertion separated the alleles into three evolutionary

groups based on either the absence of a retroposon, the

presence of an 851 bp retroposon, or the presence of a 1.1 kb

retroposon insertion. When these retroposon polymorphisms

were taken into account during the RFLP analysis, the mean F

value rose to 0.64.

Probe 2, a 1.1 kb Aα fragment containing the α1 and α2

exons, detects a degree of polymorphism similar to that seen

with Aβ, and a high number of alleles (Table 4-1). These

results show that the degree of diversity detected by each

restriction enzyme may vary. By using a combination of

several enzymes, the relative diversity between alleles can

be better estimated. The data in Table 4-1 also show the

frequency of each allele within this panel of mice. For

example, seven strains carry the a allele at this locus

whereas only one strain carries the g allele. These data also

show the degree of relatedness between particular alleles,

some of which are minor variants of each other that differ by

only a single restriction fragment; for example, allele b and

allele g differ by a single Bam HI fragment. The F values

calculated between alleles are shown in Table 4-2. There are

six fragments for each allele for the five enzymes, and each

pair of alleles shares from two to ten of its twelve combined

fragments. The result of the RFLP analysis is depicted in

Figure 4-1, where probe 2 identifies twelve alleles with a

mean F value of 0.49 ± 0.18. All polymorphisms at this locus

are due to restriction enzyme site changes rather than to

insertions or deletions, as determined by restriction

fragment length comparisons.
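As a check on how the mean F value for a locus follows from the pairwise comparisons, the sketch below averages the 66 pairwise F values listed for the twelve Aα alleles in Table 4-2. The use of the sample standard deviation is an assumption; the text does not state which form was used.

```python
from statistics import mean, stdev

# Lower-triangle F values for the twelve Aα alleles (a through l),
# transcribed from Table 4-2. Each row lists the comparisons of that
# allele with the alleles that precede it alphabetically.
F_LOWER = {
    "b": [0.50],
    "c": [0.33, 0.50],
    "d": [0.50, 0.50, 0.50],
    "e": [0.50, 0.67, 0.33, 0.33],
    "f": [0.17, 0.67, 0.33, 0.50, 0.50],
    "g": [0.67, 0.83, 0.33, 0.67, 0.50, 0.50],
    "h": [0.67, 0.83, 0.50, 0.50, 0.83, 0.50, 0.67],
    "i": [0.33, 0.17, 0.50, 0.33, 0.33, 0.17, 0.33, 0.17],
    "j": [0.83, 0.50, 0.33, 0.50, 0.50, 0.17, 0.67, 0.67, 0.33],
    "k": [0.67, 0.67, 0.50, 0.33, 0.50, 0.33, 0.50, 0.67, 0.17, 0.50],
    "l": [0.50, 0.50, 0.67, 0.83, 0.33, 0.33, 0.67, 0.50, 0.50, 0.50, 0.33],
}

pairwise_f = [f for row in F_LOWER.values() for f in row]
mean_f = mean(pairwise_f)
sd_f = stdev(pairwise_f)  # sample standard deviation (an assumption)

print(f"{len(pairwise_f)} pairs, mean F = {mean_f:.2f} +/- {sd_f:.2f}")
# prints: 66 pairs, mean F = 0.49 +/- 0.18
```

The result agrees with the mean F value of 0.49 ± 0.18 reported for probe 2.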

Probe 3, a 4.2 kb fragment midway between Aα and Eβ and

designated intergenic sequence 1 (I1), identified thirteen

alleles (Table 4-3), and nineteen of the twenty-eight

mouse strains can be grouped into the first four allelic

lineages. This probe detects six to seven fragments per

allele, as seen in Table 4-4, and these alleles show an

average F value of 0.47 ± 0.20 (Figure 4-1). There are no

discernible insertion or deletion polymorphisms when the

restriction fragment lengths are compared at this locus.

The probe for the 5' portion of the Eβ gene, probe 4,

identifies nine alleles (Table 4-5) in which there are only

three allelic groups without multiple members, and one


Table 4-1. RFLP sizes and allelic grouping of strains for Aα.

allele Bam HI Bgl II Eco RI Pvu II Sac I strains

a 5.2 3.5 12.0 7.0 13.7 B10.F, B10.Q,
6.6 B10.KEA5, MET2,

b 5.4 4.8 12.0 7.0 11.1 W12A, STU, FAI4,
6.6 MET3, FAI5

c 5.4 5.2 12.0 7.0 8.0 B10, JER3, AZR1,
6.0 B10.STC90

d 5.2 4.8 12.0 7.0 8.0 B10.WB, JER4

e 5.4 4.8 12.0 4.0 13.7 MET1, B10.RIII

f 5.4 4.8 12.0 6.0 11.1 B10.BR, B10.CHA2

g 5.2 4.8 12.0 7.0 11.1 B10.S

h 5.4 4.8 12.0 7.0 13.7 B10.SM

i 5.2 5.2 12.0 4.0 10.5 FAI3

j 5.2 6.5 12.0 7.0 13.7 B10.PL

k 5.4 3.5 12.0 7.0 8.7 B10.D2

l 5.2 4.8 12.0 7.0 8.0 B10.M

aAllele designations in ascending alphabetical order based on frequency.
bValues are expressed in kilobases.


Table 4-2. RFLP analysis of Aα alleles.

allele  a      b      c      d      e      f      g      h      i      j      k      l

a       -      6/12   4/12   6/12   6/12   2/12   8/12   8/12   4/12   10/12  8/12   6/12
b       .50    -      6/12   6/12   8/12   8/12   10/12  10/12  2/12   6/12   8/12   6/12
c       .33    .50    -      6/12   4/12   4/12   4/12   6/12   6/12   4/12   6/12   8/12
d       .50    .50    .50    -      4/12   6/12   8/12   6/12   4/12   6/12   4/12   10/12
e       .50    .67    .33    .33    -      6/12   6/12   10/12  4/12   6/12   6/12   4/12
f       .17    .67    .33    .50    .50    -      6/12   6/12   2/12   2/12   4/12   4/12
g       .67    .83    .33    .67    .50    .50    -      8/12   4/12   8/12   6/12   8/12
h       .67    .83    .50    .50    .83    .50    .67    -      2/12   8/12   8/12   6/12
i       .33    .17    .50    .33    .33    .17    .33    .17    -      4/12   2/12   6/12
j       .83    .50    .33    .50    .50    .17    .67    .67    .33    -      6/12   6/12
k       .67    .67    .50    .33    .50    .33    .50    .67    .17    .50    -      4/12
l       .50    .50    .67    .83    .33    .33    .67    .50    .50    .50    .33    -

aLower left: the fraction homologous (F) value as defined by Nei and Li (1979).
bUpper right: number of shared RF / total RF for both alleles, based on restriction
digestion with Bam HI, Bgl II, Eco RI, Pvu II, and Sac I.


Table 4-3. RFLP sizes and allelic grouping of strains for I1.

allele Bam HI Bgl II Eco RI Pvu II Sac I strains

a 10.0

B10, B10.SM,
B10.S, AZR1,
B10.STC90, FAI4

B10.Q, B10.CAA2

B10.BR, B10.CHA2

aAllele designations in ascending alphabetical order based on frequency.
bValues are expressed in kilobases.


Table 4-4. RFLP analysis of I1 alleles.

allele  a      b      c      d      e      f      g      h      i      j      k      l      m

a       -      12/14  8/14   6/14   6/13   2/14   6/13   10/14  8/14   12/14  10/14  8/13   6/13
b       .86    -      8/14   6/14   6/13   4/14   6/13   10/14  8/14   12/14  8/14   8/13   6/13
c       .57    .57    -      4/14   4/13   2/14   10/13  6/14   10/14  8/14   8/14   6/13   4/13
d       .43    .43    .29    -      4/13   2/14   2/13   8/14   4/14   6/14   6/14   4/13   6/13
e       .46    .46    .31    .31    -      2/13   2/12   6/13   4/13   6/13   6/13   8/12   8/12
f       .14    .29    .14    .14    .15    -      2/13   2/14   2/14   2/14   4/14   2/13   2/13
g       .46    .46    .77    .15    .17    .15    -      4/13   10/13  8/13   6/13   6/12   2/12
h       .71    .71    .43    .57    .46    .14    .31    -      6/14   10/14  8/14   6/13   8/13
i       .57    .57    .71    .29    .31    .14    .77    .43    -      10/14  8/14   8/13   4/13
j       .86    .86    .57    .43    .46    .14    .62    .71    .71    -      8/14   10/13  6/13
k       .71    .57    .57    .43    .46    .29    .46    .57    .57    .57    -      6/13   8/13
l       .62    .62    .46    .31    .67    .15    .50    .46    .62    .77    .46    -      8/13
m       .46    .46    .31    .46    .67    .15    .17    .62    .31    .46    .62    .62    -

aLower left: the fraction homologous (F) value as defined by Nei and Li (1979).
bUpper right: number of shared RF / total RF for both alleles, based on restriction
digestion with Bam HI, Bgl II, Eco RI, Pvu II, and Sac I.

Table 4-5. RFLP sizes and allelic grouping of strains for 5'Eβ.

allele Bam HI Bgl II Eco RI Pvu II Sac I strains

a 8.6

4.8 2.7 2.0 3.0 5.0
6.9 3.0 2.0 3.4 5.2
3.8 3.0 2.0 3.6 4.2
3.8 3.0 1.0 3.4 2.5
3.8 2.5 2.0 3.4 2.6
3.8 17.1 -
8.6 3.0 2.0 3.6 5.2

B10, B10.S,
B10.SM, JER3,
B10.STC90, MET1,

B10.F, B10.CAA2,
B10.KEA5, B10.Q,

B10.D2, W12A,

B10.BR, AZR1,

JER4, B10.WB

B10.SAA48, FAI3

aAllele designations in ascending alphabetical order based on frequency.
bValues are expressed in kilobases.


Table 4-6. RFLP analysis of 5'Eβ alleles.

allele  a      b      c      d      e      f      g      h      i

a       -      2/10   2/10   8/10   4/10   4/10   4/10   0/10   8/10
b       .20    -      0/10   2/10   2/10   6/10   4/10   2/10   0/10
c       .20    0      -      2/10   2/10   0/10   2/10   0/10   2/10
d       .80    .20    .20    -      4/10   4/10   4/10   0/10   6/10
e       .40    .20    .20    .40    -      4/10   4/10   2/10   6/10
f       .40    .60    0      .40    .40    -      4/10   2/10   2/10
g       .40    .40    .20    .40    .40    .40    -      2/10   2/10
h       0      .20    0      0      .20    .20    .20    -      0/10
i       .80    0      .20    .60    .60    .20    .20    0      -

aLower left: the fraction homologous (F) value as defined by Nei
and Li (1979).
bUpper right: number of shared RF / total RF for both alleles, based
on restriction digestions with Bam HI, Bgl II, Eco RI,
Pvu II, and Sac I.


lineage containing thirteen mice of the H-2b haplotype. The

H-2P-like alleles show a 1.0 kb deletion when compared to

the other alleles in this panel. This deletion was analyzed

previously by an RFLP analysis, and was shown to have no

effect on the transcription or expression of this gene in

the H-2P haplotype (Soper et al. 1988). Because eight

strains share this polymorphism, the average F value for

this locus is extremely low, 0.28 ± 0.22 (Figure 4-1 and

Table 4-6). There is also one other small 100 to 200 bp

deletion detected in the c allele when compared to the a

allele for this probe. This deletion also shows no effect

on the expression of the gene. When these deletions are

taken into account, the average F value rises to 0.60. This

locus, and the three previous loci, all map into what was

described as the variable tract of the I region (Steinmetz

et al. 1984) as characterized by a large number of very

distinct alleles.

Probe 5, a 700 bp cDNA for the 3' portion of the Eβ

gene, defines fifteen alleles which are more closely related

to one another than alleles detected at previous loci

(Tables 4-7 and 4-8). Fourteen strains comprise the four

most frequent alleles, and ten of the remaining mice

represent minor variants differing by a single restriction

fragment (RF). Therefore, twenty-four of the twenty-eight

strains can be categorized into these four most common

alleles, which share approximately 60% of their restriction

fragments. This homogeneity accounts for the high overall

mean F value of 0.57 ± 0.19 (Figure 4-1).

The Eβ2 pseudogene, probe 6, shows seven alleles, of

which only one differs from the others by more than a

single RF, and this allele is shared by only three strains

(Table 4-9). Table 4-10 demonstrates the relatedness of

these alleles as reflected in the extremely high F values

between mice. The mean F value for this locus is 0.80 ±

0.08 (Figure 4-1).

The Eα gene, characterized by probe 7, is another locus

with very little polymorphism. The Eα gene shows only four

alleles, which can be grouped into two allelic lineages.

Alleles b and c are minor variants of the a allele and together

with it form one major evolutionary lineage, while allele

d-like mice make up the second predominant lineage (Table

4-11). Allele a and its variants represent a closely related

family of alleles when compared to the d lineage, as shown by

the F values generated in an RFLP analysis (Table 4-12). In

total, these data show that the Eα alleles can be separated

into two classes which correlate with the presence of a 650 bp

deletion in the centromeric portion of the gene. The allele

a lineage (a, b, and c) does not carry the deletion and,

therefore, can transcribe and express this gene. Mice

carrying allele d do not transcribe Eα mRNA and do not

express an I-E molecule on their cell surface (Table 4-13).

The nature of the defects in E molecule expression within

these mice was examined at both the DNA and the RNA level by

Table 4-7. RFLP sizes and allelic grouping of strains for 3'L8.

allele Bam HI Bgl II Eco RI Pvu II Sac I strains

a b
a 10.0 4.0 12.0 4.5 2.6 B10.F, B10.CAA2,
3.0 2.4 B10.STC77, JER4,

b 10.0 3.1 12.2 4.5 6.4 B10.D2, B10.S,
3.0 2.6 FAI5, W12A, FAI5

C 10.0 2.8 12.2 4.5 6.4 B10.BR, MET2,
3.0 2.5 FAI4

d 10.0 2.8 6.0 4.5 6.4 B10.CHA2, JER3
3.0 2.5

e 10.0 2.8 12.2 4.5 2.6 B10.M, MET1
3.0 2.4

f 10.0 4.0 12.0 4.5 6.4 FAI3, B10.WB
3.0 2.6

g 10.0 3.1 6.0 4.5 6.4 B10.RIII
2.7 2.5

h 10.0 3.1 6.0 4.5 6.4 B10.SM
3.0 2.5

i 10.0 4.6 12.2 4.5 6.4 B10.PL
2.7 2.6

j 10.0 2.8 12.0 4.5 2.6 B10.Q
3.0 2.4

k 8.6 4.6 12.2 5.0 2.6 B10.SAA48
4.6 2.4

1 10.0 3.1 12.2 4.5 2.6 AZR1
3.0 2.4

m 10.0 2.8 6.0 4.5 6.4 B1O.STC90
2.7 2.6

n 10.0 3.1 6.0 4.5 6.4 B10
2.7 2.6

o 10.0 4.6 5.0 4.5 6.4 MET3
3.0 2.6

aAllele designations in ascending alphabetical order based on frequency.
values are expressed in kilobases.


Table 4-8. RFLP analysis of 3'Eβ alleles.

allele  a      b      c      d      e      f      g      h      i      j      k      l      m      n      o

a       -      8/14   6/14   6/14   10/14  12/14  4/14   6/14   6/14   12/14  4/14   10/14  6/14   6/14   8/14
b       .57    -      10/14  8/14   10/14  10/14  4/14   6/14   6/14   8/14   4/14   12/14  8/14   10/14  10/14
c       .43    .71    -      12/14  10/14  8/14   8/14   10/14  8/14   8/14   2/14   8/14   8/14   6/14   8/14
d       .43    .57    .86    -      8/14   8/14   10/14  12/14  6/14   8/14   2/14   6/14   10/14  8/14   8/14
e       .71    .71    .71    .57    -      8/14   4/14   6/14   8/14   12/14  6/14   12/14  8/14   6/14   8/14
f       .86    .71    .57    .57    .57    -      6/14   8/14   8/14   10/14  2/14   8/14   8/14   8/14   10/14
g       .29    .29    .57    .71    .29    .43    -      12/14  8/14   4/14   0/14   6/14   10/14  12/14  6/14
h       .43    .43    .71    .86    .43    .57    .86    -      6/14   6/14   0/14   8/14   8/14   10/14  6/14
i       .43    .43    .57    .43    .57    .57    .57    .43    -      6/14   6/14   8/14   10/14  10/14  10/14
j       .86    .57    .57    .57    .86    .71    .29    .43    .43    -      4/14   10/14  8/14   6/14   8/14
k       .29    .29    .14    .14    .43    .14    0      0      .43    .29    -      6/14   2/14   2/14   4/14
l       .71    .86    .57    .43    .86    .57    .43    .57    .57    .71    .43    -      6/14   8/14   8/14
m       .43    .57    .57    .71    .57    .57    .71    .57    .71    .57    .14    .43    -      12/14  8/14
n       .43    .71    .43    .57    .43    .57    .86    .71    .71    .43    .14    .57    .86    -      8/14
o       .57    .71    .57    .57    .57    .71    .43    .43    .71    .57    .29    .57    .57    .57    -

aLower left: the fraction homologous (F) value as defined by Nei and Li (1979).
bUpper right: number of shared RF / total RF for both alleles, based on restriction digestion
with Bam HI, Bgl II, Eco RI, Pvu II, and Sac I.

Table 4-9. RFLP sizes and allelic grouping of strains for Eβ2.

allele Bam HI Bgl II Eco RI Pvu II Sac I strains

B10.D2, W12A,
B10.CHA2, B10.S,
B10.Q, B10.CAA2,
B10.STC77, B10.M
B10.PL, STU,
FAI5, B10.SM,
B10.F, B10.KEA5

aAllele designations in ascending alphabetical order based on frequency.
bValues are expressed in kilobases.


Table 4-10. RFLP analysis of Eβ2 alleles.

allele  a      b      c      d      e      f      g

a       -      18/20  18/20  16/20  16/20  18/20  18/20
b       .90    -      14/20  16/20  16/20  16/20  16/20
c       .90    .70    -      18/20  14/20  14/20  16/20
d       .80    .80    .90    -      14/20  16/20  18/20
e       .80    .80    .70    .70    -      14/20  14/20
f       .90    .80    .70    .80    .70    -      16/20
g       .90    .80    .80    .90    .70    .80    -

aLower left: the fraction homologous (F) value as defined
by Nei and Li (1979).
bUpper right: number of shared RF / total RF for both
alleles, based on restriction digestion
with Bam HI, Bgl II, Eco RI, Pvu II, and Sac I.

Table 4-11. RFLP sizes and allelic grouping of strains for Eα.

allele Bam HI Bgl II Eco RI Pvu II Sac I strains

a 6.5 6.0 8.5 6.0 6.0 B10.D2, B10.SM,
2.5 B10.BR, B10.PL,
1.2 B10.STC90, MET3
B10.CAA2, JER4,
B10.SAA48, FAI5
B10.RIII, B10.Q
B10.M, B10.KEA5

b 6.5 6.0 8.5 6.0 6.0 W12A, STU, AZR1,
1.2 JER3, B10.F

c 6.5 6.0 7.5 6.0 6.0 B10.STC77, FAI3,
2.5 B10.WB, B10.CHA2

d 6.0 5.5 8.0 5.5 8.0 B10, B10.S,
2.5 MET2, FAI4

aAllele designations in ascending alphabetical order based on frequency.
bValues are expressed in kilobases.


Table 4-12. RFLP analysis of Eα alleles.

allele  a      b      c      d

a       -      12/15  12/14  2/13
b       .80    -      10/15  0/15
c       .86    .67    -      2/13
d       .15    0      .15    -

aLower left: the fraction homologous (F) value
as defined by Nei and Li (1979).
bUpper right: number of shared RF / total RF for
both alleles, based on restriction
digestion with Bam HI, Bgl II, Eco RI,
Pvu II, and Sac I.

Table 4-13. Characterization of EO strains.

strain haplotype surface expression Eα message expression deletion present

AZR1 w201 -

B10 b +

B10.S s +
B10.M f + -
B10.Q q + -

FAI4 w207 +

MET2 w218 +

aMice show aberrant sized message.
bMice show low message levels due to a defect affecting the ... of the message.


RFLP and northern blot analysis. Of the three identified

defects in Eα gene expression (the 650 bp deletion, a

splicing defect, or a message stability defect), only the

deletion mutation is found in our panel of wild mice (Table

4-13). This deletion is the only major polymorphism

seen at this locus and accounts for the low mean F value

(0.44 ± 0.38) seen between these mice (Table 4-12 and Figure

4-1).

Probes 5, 6, and 7 all map within the conserved tract

of the I region, which is characterized by a low number of

closely related alleles (Steinmetz et al. 1984).

Allele Lineages

When two or more mouse strains shared an F value of

>0.80, thus differing by a single RF, they were grouped into

evolutionary lineages. This is demonstrated in Figure 4-2

where, for Aα, there are five strains which carry allele b

with two minor variants, bv1 and bv2. Each variant allele

is represented by a single member which differs from b by a

unique Bam HI or Sac I fragment, respectively. The average

F value between mice within a lineage is 0.83 whereas the

value is 0.50 between lineages. The lineages identified at

each of the loci probed are shown in Table 4-14. Grouping

the alleles at each locus across the I region lowers the

number of distinct alleles twofold. This helps simplify

the data for further analysis and shows that there are a

limited number of old alleles carried by the mice in this

panel.
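The grouping rule can be sketched in Python using the Aα F values from Table 4-2. The procedure shown here is one plausible reading of the rule and is an assumption, not a record of how the lineages were actually assigned: alleles are taken in order of frequency, and each allele either joins the first existing lineage whose founder allele it matches with F > 0.80 or founds a new lineage. The function names are hypothetical.

```python
ALLELES = "abcdefghijkl"  # Aα alleles, ordered by frequency (Table 4-1)

# Lower-triangle F values transcribed from Table 4-2.
F_LOWER = {
    "b": [0.50],
    "c": [0.33, 0.50],
    "d": [0.50, 0.50, 0.50],
    "e": [0.50, 0.67, 0.33, 0.33],
    "f": [0.17, 0.67, 0.33, 0.50, 0.50],
    "g": [0.67, 0.83, 0.33, 0.67, 0.50, 0.50],
    "h": [0.67, 0.83, 0.50, 0.50, 0.83, 0.50, 0.67],
    "i": [0.33, 0.17, 0.50, 0.33, 0.33, 0.17, 0.33, 0.17],
    "j": [0.83, 0.50, 0.33, 0.50, 0.50, 0.17, 0.67, 0.67, 0.33],
    "k": [0.67, 0.67, 0.50, 0.33, 0.50, 0.33, 0.50, 0.67, 0.17, 0.50],
    "l": [0.50, 0.50, 0.67, 0.83, 0.33, 0.33, 0.67, 0.50, 0.50, 0.50, 0.33],
}


def f_value(x, y):
    """Look up the pairwise F value for two alleles."""
    if x == y:
        return 1.0
    first, second = sorted((x, y))
    return F_LOWER[second][ALLELES.index(first)]


def assign_lineages(alleles, threshold=0.80):
    """Map each allele to the founder allele of its lineage."""
    founders = []
    lineage = {}
    for allele in alleles:
        for founder in founders:
            if f_value(allele, founder) > threshold:
                lineage[allele] = founder  # minor variant of the founder
                break
        else:
            founders.append(allele)        # no close match: new lineage
            lineage[allele] = allele
    return lineage


lineage = assign_lineages(ALLELES)
founders = sorted(set(lineage.values()))
variants = {a: f for a, f in lineage.items() if a != f}

print(founders)   # ['a', 'b', 'c', 'd', 'e', 'f', 'i', 'k']
print(variants)   # {'g': 'b', 'h': 'b', 'j': 'a', 'l': 'd'}
```

Under this reading the twelve Aα alleles reduce to eight lineages, with alleles g and h grouped with b, j with a, and l with d, which agrees with the Aα column of Table 4-14.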


Evidence for Site Specific Recombination Within the I
Region: Identification of RHSs in Eβ and Eα

The lineage designations determined in the preceding

section were used to determine if and where recombination

occurs within the I region haplotypes in our panel. It was

expected that if recombination is occurring between two

loci, the pattern of allele associations of the recombinant

haplotypes will change with respect to that of the donor

haplotypes. This is confirmed by comparing the results shown

in Table 4-15 with the data presented in Table 4-16. The

data presented in Table 4-16 demonstrate the linkage

disequilibrium between two loci, I1 and 5'Eβ, seen in the

absence of recombination. The results in Table 4-15

demonstrate the predicted switching of allele associations

between two loci, 5'Eβ and 3'Eβ, known to undergo

recombination. These associations are so pronounced in

haplotypes not undergoing recombination that by knowing the

lineage of one locus, the lineage of the adjacent locus can

be predicted accurately in 90% of the cases. For example,

referring to Table 4-16, if a strain belongs to lineage a

for I1, then in nine of ten cases these same mice are

lineage a for 5'Eβ. This same pattern holds true for all

other lineages. In contrast, especially for lineage a, this





Table 4-14. Allele Lineages Across the I region.

strain      Aβ    Aα    I1    5'Eβ  3'Eβ  Eβ2   Eα

B10         h     c     a     a     gv2   av3   d
B10.BR      p     f     d     d     d     av5   a
B10.CAA2    a     a     av1   b     a     a     a
B10.CHA2    p     f     d     d     dv1   a     av2
B10.D2      cv1   k     cv1   c     b     a     a
B10.F       a     a     av1   b     a     a     av1
B10.KEA5    a     a     av1   b     a     a     a
B10.M       l     dv1   i     h     cv1   a     a
B10.PL      q     av1   m     av1   i     a     a
B10.Q       a     a     av1   b     av1   a     a
B10.RIII    c     e     a     a     av2   av1   a
B10.S       m     bv1   a     a     b     a     d
B10.SAA48   bv1   a     av2   f     k     av4   a
B10.SM      d     bv2   a     a     av3   a     a
B10.STC77   a     a     av1   b     a     a     av2
B10.STC90   n     c     a     a     cv3   av1   a
B10.WB      i     d     h     e     av4   c     av2
AZR1        h     c     a     d     l     a     av1
FAI3        l     i     k     f     av4   a     av2
FAI4        jv1   b     a     a     d     c     d
FAI5        jv1   b     c     c     b     a     a
JER3        h     c     a     a     dv1   av2   av1
JER4        i     d     f     e     a     a     a
MET1        c     e     a     a     cv1   av2   av2
MET2        a     a     a     a     d     c     d
MET3        mv1   b     e     g     o     av1   a
STU         j     b     c     c     b     a     av1
W12A        j     b     c     c     b     a     av1

Table 4-15. Lineage associations within Eβ.

strain      5'Eβ  3'Eβ  recombinant

B10         a     gv2   -
B10.RIII    a     av2   +
B10.S       a     b     +
B10.SM      a     av3   +
B10.STC90   a     cv2   -
FAI4        a     d     +
JER3        a     dv1   +
MET1        a     cv1   -
MET2        a     d     +
B10.PL      av1   i     -

B10.F       b     a     -
B10.KEA5    b     a     -
B10.Q       b     av1   -
B10.STC77   b     a     -
B10.CAA2    b     a     -

B10.D2      c     b     -
FAI5        c     b     -
STU         c     b     -
W12A        c     b     -

B10.BR      d     d     -
B10.CHA2    d     dv1   -

aRecombinants defined by switching of lineage
association between loci.

Table 4-16. Lineage associations centromeric of Eβ.

strain      I1    5'Eβ  recombinant

B10         a     a     -
B10.RIII    a     a     -
B10.S       a     a     -
B10.SM      a     a     -
B10.STC90   a     a     -
AZR1        a     d     +
FAI4        a     a     -
JER3        a     a     -
MET1        a     a     -
MET2        a     a     -
B10.CAA2    av1   b     -
B10.F       av1   b     -
B10.KEA5    av1   b     -
B10.Q       av1   b     -
B10.STC77   av1   b     -
B10.SAA48   av2   f     -

FAI5        c     c     -
W12A        c     c     -
STU         c     c     -
B10.D2      cv1   c     -

B10.BR      d     d     -
B10.CHA2    d     d     -


predictive ability is lost due to the high degree of

recombination that occurs between 5'Eβ and 3'Eβ (Table 4-15).

All recombination events at this site occur between an a

allele at 5'Eβ and some other allele. Six recombination

events are scored at this site as compared to only one

possible event between I1 and 5'Eβ in all the mouse

haplotypes tested. This site is therefore designated a

recombinational hotspot (RHS) based on the high frequency of

localized recombinational events specific for the a lineage.

These recombinational events can be represented graphically

(Figure 4-3A), where the fill pattern at each locus

represents the lineage origin of the genomic segment of

interest. This diagram represents the identification of an

RHS within Eβ for mice with the w22, w26, w207, and s

haplotypes.
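The contrast between the two pairs of loci can be tallied directly from Tables 4-16 and 4-15. The Python sketch below counts, for strains of a given lineage at the first locus, how often the most common partner lineage appears at the second locus. Measuring predictability by the most common partner is an assumption made for this sketch, and the function name is hypothetical; the recombinants marked in the tables were scored by the criteria described in the text.

```python
from collections import Counter

# (lineage at first locus, lineage at second locus) for each strain,
# transcribed in table order from Table 4-16 (I1, 5'Eβ).
I1_TO_5E = [
    ("a", "a"), ("a", "a"), ("a", "a"), ("a", "a"), ("a", "a"),
    ("a", "d"), ("a", "a"), ("a", "a"), ("a", "a"), ("a", "a"),
    ("av1", "b"), ("av1", "b"), ("av1", "b"), ("av1", "b"), ("av1", "b"),
    ("av2", "f"),
    ("c", "c"), ("c", "c"), ("c", "c"), ("cv1", "c"),
    ("d", "d"), ("d", "d"),
]

# The same for Table 4-15 (5'Eβ, 3'Eβ).
E5_TO_E3 = [
    ("a", "gv2"), ("a", "av2"), ("a", "b"), ("a", "av3"), ("a", "cv2"),
    ("a", "d"), ("a", "dv1"), ("a", "cv1"), ("a", "d"),
    ("av1", "i"),
    ("b", "a"), ("b", "a"), ("b", "av1"), ("b", "a"), ("b", "a"),
    ("c", "b"), ("c", "b"), ("c", "b"), ("c", "b"),
    ("d", "d"), ("d", "dv1"),
]


def most_common_partner(pairs, lineage):
    """Return (partner, count, total) for strains of `lineage`."""
    partners = Counter(second for first, second in pairs if first == lineage)
    if not partners:
        raise ValueError(f"no strains carry lineage {lineage!r}")
    partner, count = partners.most_common(1)[0]
    return partner, count, sum(partners.values())


print(most_common_partner(I1_TO_5E, "a"))  # ('a', 9, 10)
print(most_common_partner(E5_TO_E3, "a"))  # ('d', 2, 9)
```

For lineage a, the adjacent lineage is shared by nine of ten strains between I1 and 5'Eβ, but by no more than two of nine strains between 5'Eβ and 3'Eβ.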


A similar analysis was performed on the genomic

segment containing the Eβ2 and Eα genes. Because of the

lower degree of polymorphism at the telomeric end of the I

region, the 650 bp deletion in Eα is used as a marker in

order to identify recombination within this region.

Recombination is scored when a deleted allele of Eα is

observed adjacent to a proximal expressor-associated allele

for Eβ2 (Figure 4-3B). Of the three possible lineages of

Eβ2 and two lineages at Eα, three recombinant mouse

haplotypes can be identified: b, s, and 1.


Both of these RHSs correspond to the hotspots previously

identified in laboratory inbred strains (Steinmetz et al.

1982b; Lafuse et al. 1986).

Identification of a Recombinational Hotspot
Between Aα and Eβ

By the same methods and criteria as above, a third RHS

was identified between Aα and Eβ. This recombinational

hotspot maps to a 4.7 to 9.2 kb stretch of DNA midway

between the two genes (Figure 4-4). The minimum distance

was defined at the centromeric end by a Bgl II site in w207

and at the telomeric end by a Pvu II site. These sites were

confirmed with a 2.8 kb Bam HI fragment probe which lies

equidistant between the Aα and I1 probes. Five recombinational

events were scored at this site, and the data for some of the

representative haplotypes (s, w207, and w218) are presented

in Figure 4-3C. All recombinational events occur between

mice of the a lineage for I1 and, usually, strains of the b

lineage for Aα, and therefore are haplotype specific. This

RHS represents a site for homologous recombination not

previously identified in laboratory strains of Mus m.









Identification of Recombinationally Depressed
Segments (REDS) Within the I Region

By lining up the haplotypes in this panel as done for

the identification of RHSs, a pattern of associations

between loci emerged. It was observed that recombination is

localized almost exclusively to the three recombinational

hotspots, with one event or less occurring elsewhere within

the I region. This restricted pattern of recombination at

specific, non-random sites separates the I region into

discrete genomic segments flanked by the RHSs. Strong

linkage disequilibrium is seen between the loci within

any of these three genomic segments. Depending on the RHSs

active within a particular haplotype, the extent of this

linkage disequilibrium varies (Figure 4-5). Some haplotypes

carry linkage groups extending across the entire I region,

for example H-2P, while others, with more active forms of

the different RHSs, show shorter linkage groups. For

example, in the H-2w208 haplotype, Aβ and Aα form one

linkage group while I1 and 5'Eβ form another short linkage

group. Because of this pattern of recombination, the genomic

segments flanked by RHSs were defined as recombinationally

depressed segments (REDS). These REDS are characterized by

the lack of recombination between the loci within a

particular REDS and, therefore, by strong linkage

disequilibrium between these loci. Recombinationally

depressed segment 1 (REDS1)

contains 44 kb of DNA including the genes for Aβ and Aα,

REDS2 contains 19 kb extending from just telomeric of Aα to

the RHS in Eβ, REDS3 spans 21 kb containing 3'Eβ, Eβ2, and 9 kb

telomeric to the RHS at Eα, and REDS4 contains Eα and

approximately 15 kb telomeric of this gene. These REDS are

the genetic units exchanged between haplotypes by homologous

recombination during the evolution of the genus Mus, which

led to the diversification of the I region.

Correlation Analysis of REDS

To test how strong the linkage disequilibrium is

between loci within a REDS and across the I region, a

statistical pairwise correlation analysis of the F values

for each pair of neighboring loci was performed. The

correlation coefficients (R values) calculated in this analysis

show the relationship between the degree of divergence (F

value) for the neighboring loci. For example, by assessing

the degree of divergence of one locus, A, in a particular

haplotype as compared to a certain subset of haplotypes, if

there is correlation between this locus and its neighbor, B,

then an analogous pattern of divergence or F value will be

seen for locus B as compared to the same subset of

haplotypes (Figure 4-6). As the strength of the correlation

between loci increases, the R value approaches 1.0. Table

4-17 shows the cumulative results of this type of analysis

for all the loci within the I region. The center diagonal
