The Generation of antigen binding site diversity in the murine Mhc class II A beta molecule


Material Information

Title:
The Generation of antigen binding site diversity in the murine Mhc class II A beta molecule
Physical Description:
viii, 172 leaves : ill. ; 29 cm.
Language:
English
Creator:
Boehme, Stefen A., 1962-
Publication Date:

Subjects

Subjects / Keywords:
Research   ( mesh )
Binding Sites, Antibody -- genetics   ( mesh )
Binding Sites, Antibody -- chemistry   ( mesh )
Major Histocompatibility Complex -- genetics   ( mesh )
Genes, MHC Class II -- genetics   ( mesh )
Antigen-Antibody Reactions   ( mesh )
Point Mutation -- genetics   ( mesh )
Polymorphism (Genetics)   ( mesh )
Selection (Genetics)   ( mesh )
Evolution, Molecular   ( mesh )
Molecular Sequence Data   ( mesh )
Amino Acid Sequence   ( mesh )
Base Sequence   ( mesh )
Department of Pathology and Laboratory Medicine thesis Ph.D   ( mesh )
Dissertations, Academic -- College of Medicine -- Department of Pathology and Laboratory Medicine -- UF   ( mesh )
Genre:
bibliography   ( marcgt )
non-fiction   ( marcgt )

Notes

Thesis:
Thesis (Ph.D.)--University of Florida, 1990.
Bibliography:
Bibliography: leaves 161-171.
Statement of Responsibility:
by Stefen A. Boehme.
General Note:
Typescript.
General Note:
Vita.

Record Information

Source Institution:
University of Florida
Rights Management:
All applicable rights reserved by the source institution and holding location.
Resource Identifier:
aleph - 002357426
oclc - 50763024
notis - ALW1869
sobekcm - AA00006099_00001
System ID:
AA00006099:00001

Full Text












THE GENERATION OF ANTIGEN BINDING SITE
DIVERSITY IN THE MURINE MHC CLASS II
A BETA MOLECULE















By

STEFEN A. BOEHME


A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA


1990
























I would like to dedicate this dissertation to the three
people whose altruistic love, support and encouragement
made this work possible. Kathy, Mom, and Dad, from the
bottom of my heart, thank you, and I love you.
God bless you.
















ACKNOWLEDGEMENTS


First and foremost, I would like to thank my mentor,

Dr. Ward Wakeland, for his guidance, patience, and

friendship during my tenure as his student.

I would also like to express my thanks to my committee

members, Drs. Smith, Johnson, Nick and Hauswirth, for their

assistance and encouragement.

Additionally, I certainly appreciated the support and

help from the faculty members of the Department of Pathology

and Laboratory Medicine. Furthermore, I sincerely thank Liz

(soon to be even more bored) Wilkerson, Rose (Lil' Pork

chop) Mills, and Crystal (thanks for just being you) Grimes,

for making my life easier, and well delivered doses of

sanity.

I also would like to wish the Wakeland laboratory

continued success in its scientific endeavors.

Additionally, I want to thank Drs. McConnell, Hensen,

Tarnuzzer, Zack, Potts and She, and soon to be Drs. Lu and

McIndoe for all their help and assistance.










Finally, I wish to thank my peers who brought life into

lab. Of particular notoriety are Roy Tarnuzzer, Jane

Gibson, Lena Dingler, Rick McIndoe, Lee Grimes, Jeff

Anderson, Linda Yaswen, and Sussanna Lamers. Best of luck

to you all, and Baa-Baa-Roo!













TABLE OF CONTENTS

Page

ACKNOWLEDGEMENTS ...................................... iii

ABSTRACT ............................................. vii

CHAPTER I: INTRODUCTION ............................. 1

CHAPTER II: REVIEW OF THE LITERATURE ................. 4

Genomic Organization of the Major
Histocompatibility Complex......................... 4
Generation of Mhc Class II Gene Polymorphism .... 28
Functional Role of Mhc Polymorphism ............. 43
Wild Mice ....................................... 59

CHAPTER III: MATERIALS AND METHODS ................... 67

Isolation of Genomic DNA ........................ 67
Polymerase Chain Reaction Amplification,
Cloning, and Sequencing ......................... 68
Spleen Cell Isolation, Immunostaining, and
Flow Cytometric Analysis ........................ 69
Data Analysis ................................... 70

CHAPTER IV: RESULTS .................................. 71

The Generation of Mhc Class II Ab Gene
Polymorphism in Rodents ........................ 71
The Impact of Mhc Class II A Gene Polymorphisms
on the Structure of the Antigen Binding
Site ........................................... 99
Serological Characterization of the Mhc
Class II A Molecule ............................ 135

CHAPTER V: DISCUSSION ............................... 146

The Genetic Mechanisms of Mhc Gene
Diversification ................................ 146
Combinatorial Association of the Aa and Ab
Chains ......................................... 147
Mhc Influence on Immune Responsiveness .......... 149
The Selective Maintenance of Antigen Binding
Site Diversity .................................. 153









LITERATURE CITED ..................................... 161

BIOGRAPHICAL SKETCH .................................. 172












Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy


THE GENERATION OF ANTIGEN BINDING SITE DIVERSITY IN
THE MURINE MHC CLASS II A BETA MOLECULE


By

Stefen A. Boehme

August 1990


Chairman: Edward K. Wakeland
Major Department: Pathology and Laboratory Medicine


The genetic polymorphism of the murine major

histocompatibility complex class II Ab gene is generated by

the slow accumulation of point mutations over long

evolutionary time periods. These point mutations, which are

predominantly located in the antigen binding site,

frequently result in nonsynonymous (amino acid replacement)

mutations, and usually change the biochemical class of amino

acid. This diversity is then extensively amplified by

mechanisms that dramatically modify the antigen binding site

in a single step; intra-exonic recombination between

different alleles of Ab exon 2 (which contains the Ab

portion of the antigen binding site), and the introduction

of amino acid deletions. Consequently, natural mouse

populations contain an array of Ab alleles with highly










divergent antigen binding site structures and presumably

antigen binding properties. The accumulation of such rare

and unusual genetic events specifically within the antigen

binding site of the Mhc class II Ab gene suggests that

specialized selective mechanisms may favor the maintenance

of alleles encoding highly divergent forms of the antigen

binding site. This type of selection, referred to as

divergent allele advantage, may act in concert with other

forms of balancing selection, and drive the diversification

of the antigen binding site by selectively maintaining the

most divergent Mhc alleles within populations.














CHAPTER I
INTRODUCTION




A crucial step in the initiation of all antigen-

specific immune responses is T lymphocyte recognition of

processed antigen bound to molecules encoded by the major

histocompatibility complex (Mhc). The two classes of Mhc

molecules, class I and class II, bind peptide fragments that

are derived from different cellular compartments, and

generated by various antigen processing pathways. This

enables T lymphocytes to efficiently detect

cellular alterations via stimulation of their clonally

distributed T cell antigen receptor. Regulatory T

lymphocytes normally recognize antigen bound to class II

molecules, while cytotoxic T lymphocytes normally recognize

antigen complexed to class I molecules. Mhc molecules

specifically bind antigenic fragments in their antigen

binding site, which X-ray crystallographic analysis of a

class I molecule has shown to be a groove produced by two

parallel α-helices overlaying a platform composed of an

eight-strand β-pleated sheet (Bjorkman et al. 1987a; Brown

et al. 1988). Fragments of processed antigen, approximately

9-17 amino acids in length, are bound with micromolar









affinity within the antigen binding site groove (Buus et al.

1987).

The importance of Mhc gene products to immune

recognition has dramatically influenced their evolution,

such that these genes exhibit an unparalleled degree of

polymorphism. Alleles often differ in greater than 10% of

their nucleotide sequence, and most of this diversity is

concentrated in the exons encoding the antigen binding site

(Benoist et al. 1983; Estess et al. 1986). These

polymorphisms modify the functional properties of Mhc

molecules in antigen presentation, causing changes in the

immune responsiveness of individuals to foreign antigens.
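
The statement that alleles differ in more than 10% of their nucleotide sequence refers to the proportion of aligned sites at which two sequences differ. A minimal sketch of that calculation follows. The function name `p_distance` and the two 20-nucleotide fragments are assumptions made for illustration; they are not Mhc allele sequences from this study.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of aligned sites at which two sequences differ.

    Assumes the sequences are already aligned and of equal length.
    Positions with a gap ('-') in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x != y:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return mismatches / compared


# Hypothetical fragments (invented for illustration only).
allele_1 = "ATGGCTCTGCAGATCCCCAG"
allele_2 = "ATGGCACTGCAGGTCCCCAG"

# Two of the twenty sites differ, i.e. 10% divergence.
print(p_distance(allele_1, allele_2))
```

Under these assumptions, a pair of alleles at the 10% level would differ at roughly one site in ten across the compared region.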

The evolutionary mechanisms responsible for the

generation and maintenance of Mhc diversity have been a

controversial issue in the field of immunogenetics for many

years. The goals of this dissertation are to elucidate the

molecular mechanisms responsible for this unprecedented

genetic diversity, and to determine whether selective

pressures are acting to exclusively diversify the antigen

binding site. These issues were approached by analyzing the

divergence of the antigen binding site in the murine Mhc

class II Ab gene. The polymerase chain reaction coupled

with DNA sequencing technology was employed to obtain the

nucleotide sequence of 46 alleles of Ab exon 2 (which

encodes the Ab portion of the antigen binding site).

Together with ten published sequences, the data set included









nucleotide sequence from 56 alleles derived from 12 Mus

species and 2 Rattus species.

The results of this analysis indicate that the

diversification of Ab exon 2 is generated by the slow

accumulation of point mutations predominantly in the antigen

binding site over long evolutionary time spans. These point

mutations frequently result in nonsynonymous (amino acid

replacement) mutations, and often change the biochemical

class of amino acid at that position. This diversity is

then extensively amplified by mechanisms that dramatically

modify the antigen binding site in a single step; intra-

exonic recombination between Ab alleles, and the

introduction of amino acid deletions. As a result, natural

mouse populations contain an array of alleles with highly

divergent binding properties. The accumulation of such rare

and unusual events specifically within the antigen binding

site of Mhc class II genes suggests that specialized

selective mechanisms may favor the maintenance of alleles

encoding highly divergent forms of the antigen binding site.

This type of selection, referred to as divergent allele

advantage, may act in concert with the other two forms of

balancing selection (overdominance and rare allele

advantage) and drive the diversification of the antigen

binding site by selectively maintaining the most divergent

Mhc alleles within populations.
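
The distinction drawn above between synonymous and nonsynonymous point mutations, and between replacements that do or do not change the biochemical class of the amino acid, can be sketched as follows. This is an illustration only and not the analysis procedure used in this dissertation. The standard genetic code is used; the four-way grouping of amino acids (nonpolar, polar, acidic, basic) and the function name `classify_substitution` are assumptions of the sketch.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# One common grouping of amino acids (an assumption of this sketch).
CLASSES = {
    "nonpolar": "AVLIMFWPG",
    "polar": "STCYNQ",
    "acidic": "DE",
    "basic": "KRH",
}
AA_CLASS = {aa: name for name, members in CLASSES.items() for aa in members}


def classify_substitution(codon_before: str, codon_after: str) -> str:
    """Classify the effect of changing one codon into another."""
    aa_before = CODON_TABLE[codon_before.upper()]
    aa_after = CODON_TABLE[codon_after.upper()]
    if aa_before == aa_after:
        return "synonymous"
    if "*" in (aa_before, aa_after):
        return "stop codon gained or lost"
    if AA_CLASS[aa_before] == AA_CLASS[aa_after]:
        return "nonsynonymous, same class"
    return "nonsynonymous, class change"


print(classify_substitution("GAA", "GAG"))  # Glu -> Glu
print(classify_substitution("CTT", "ATT"))  # Leu -> Ile
print(classify_substitution("GAA", "AAA"))  # Glu -> Lys
```

A different grouping of amino acids would shift some replacements between the last two categories, so the class boundaries should be read as one convention among several.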















CHAPTER II
REVIEW OF THE LITERATURE


The major histocompatibility complex (Mhc) was

initially detected based on its involvement in graft

rejection between different inbred mouse lines (Little and

Tyzzer 1916). The development of serological techniques

immensely augmented the study of the H-2 complex (Gorer

1936; 1938), and thus allowed for a greater understanding

and appreciation of the genetic diversity encoded within the

genes of the Mhc. The extensive polymorphism of the Mhc

genes, as well as their critical role in regulating immune

responsiveness has made their study particularly interesting

for scientists from many different disciplines of biology.

This literature review will concentrate on Mhc class II gene

structure, function, and genetic diversity.



Genomic Organization of the Major Histocompatibility Complex



The murine major histocompatibility complex, referred

to as H-2, is a large multigene family located on chromosome

17 of the mouse. The H-2 complex encompasses approximately












2 centiMorgans of DNA, which is equivalent to a physical

distance of at least 2.4 megabase pairs (Hood et al. 1982;

Klein 1975). Chromosomal walking and mapping techniques

have provided a detailed picture of the molecular

organization of the Mhc. A total of 50 genes have been

cloned and partially characterized from the Mhc of a BALB/c

mouse, most of which encode immune related proteins. A

molecular map of the H-2 complex, as well as the general

protein structure, is illustrated in Figure 2-1.

Mhc genes encode three families of proteins based on

their structure and function. The H-2 complex has

accordingly been divided into four regions which correspond

to the class of molecules encoded. The K and D regions

contain the class I genes, of which there are two general

types. First, the K, D, and L genes encode the

classical transplantation antigens and are expressed on most

nucleated cells of the body. These molecules are extremely

polymorphic, and as such are responsible for mediating

heterologous graft rejection. Their physiological function

in vivo, however, is to present viral and tumor antigens to

cytotoxic T lymphocytes (Zinkernagel 1979). The second type

of class I gene, designated Qa and Tla, is much less

polymorphic. Molecular cloning studies have revealed more

than 32 genes of this type (Steinmetz et al. 1982a), located

telomeric to the D and L loci (Winoto et al. 1983).









Although some of these genes are expressed on nucleated

blood cells (Qa) or on thymocytes and certain leukemias

(Tla) (Michealson et al. 1983), their function is not yet

known (Flaherty 1980). It has been postulated, however,

that Tla molecules may serve as restriction elements for the

γδ lineage of T lymphocytes (Janeway et al. 1989).

Both types of class I molecules have a similar protein

structure. They consist of three extracellular domains, a

transmembrane domain, and a cytoplasmic domain, thereby

constituting a 40-45,000 dalton membrane bound glycoprotein

of approximately 350 amino acids. A fourth extracellular

domain is contributed by β2-microglobulin. This 12,000

dalton molecule, encoded on chromosome 2 of the mouse,

noncovalently associates with class I proteins, and is

thought to play a role in stabilizing the extracellular

domain structure of class I proteins (Klein et al. 1983b).

The S region encodes a heterogeneous assortment of

genes. Included are the classical class III genes, which

encode the complement proteins C2, Bf, Slp, and C4, as well

as the two homologous genes 21-OHA and 21-OHB, one of which

codes for the steroid 21-hydroxylase (Steinmetz et al.

1984). Centromeric to the D locus are the genes for two

cytotoxins, TNF-α and TNF-β (Muller et al. 1987). Although

these S region genes are physically located within the H-2

complex, and may therefore evolve as a single genetic unit

(Bodmer 1976), Klein et al. (1983b) argue against their









inclusion in the Mhc because they are not functionally

related to the class I or class II histocompatibility loci.

The class II genes are contained within the I region,

which maps between the K and S regions of the H-2 complex

(Figure 2-1). Class II genes were first defined by the

differential ability of inbred mouse strains to mount an

immune response to the synthetic peptide (T,G)-A--L

(McDevitt and Sela 1965). These immune response genes were

then definitively mapped using recombinant and congenic

strains of mice (McDevitt et al. 1972; Benacerraf and

McDevitt 1972). There are two isotypic forms of class II

molecules encoded within the I region, denoted A and E, and

these are assembled from polypeptides encoded by the four

functional genes Ab, Aa, Eb, and Ea. In addition, there are

three pseudogenes, termed Ab3, Ab2, and Eb2 (Widera and

Flavell 1985; Steinmetz et al. 1986). Class II molecules

are heterodimeric glycoproteins composed of a 35,000 dalton

a (alpha) chain and a 28,000 dalton b (beta) chain (Klein et

al. 1983b). These polypeptides noncovalently associate in

the cytoplasm and are subsequently expressed on the surface

of antigen presenting cells. Both the a and b chains are

organized into 5 protein domains including a hydrophobic

leader peptide of approximately 25 amino acids absent in the

mature cell surface form of the molecule, 2 approximately 90

amino acid extracellular domains (termed a1 and a2, or b1 and b2), a

hydrophobic transmembrane segment of 25 amino acids, and a



















highly charged cytoplasmic domain. The tertiary structure

of the a2, b1, and b2 domains is stabilized by the formation of

disulfide bonds between pairs of cysteine residues located

within each domain (Mengle-Gaw and McDevitt 1985).

The domain organization of class II polypeptides

directly reflects the exon/intron organization of their

respective genes. The b chain genes, for instance, are

composed of six exons, one for each protein domain, and an

additional exon encoding the 3' untranslated region (Saito

et al. 1983). The a chain genes are very similar, except

that the transmembrane and cytoplasmic domains are combined

into a single exon. Thus, they are composed of 5 exons

(Mathis et al. 1983; McNichols et al. 1982). A diagram of

the organization of class II a and b genes is given in

Figure 2-2.



Organization of the I-Region

The I region was originally divided into 5 sub-regions,

I-A, I-B, I-J, I-E, and I-C based on recombinational

analysis of various immune responsiveness traits (reviewed

by Klein 1975). The I-A and I-E subregions encode the

serologically and biochemically defined A and E molecules,

which are the immune response antigens. The Ab, Aa, and Eb

polypeptides are encoded in the I-A subregion and the Ea

chain is encoded by the I-E subregion (Jones et al. 1978;

Murphy et al. 1980).

























The I-J subregion was serologically defined by reagents

directed against an I-J polypeptide, which was thought to be

a suppressor T cell factor capable of suppressing immune

responses (Murphy et al. 1976; Murphy et al. 1980).

Although these I-J suppressor factors have been

serologically defined, attempts to isolate and biochemically

characterize them have failed.

The I-B and I-C subregions were defined

entirely by regulatory effects on immune responsiveness.

The I-B subregion was originally defined by Leiberman et al.

(1972) for its ability to regulate the antibody response to

an allotypic determinant on the myeloma protein MOPC 173.

Immune responses to at least 5 other antigens have been

attributed to the B region including lactate dehydrogenase B

(Melchers et al. 1973), staphylococcal nuclease (Lozner et

al. 1974), oxazolone (Fachet and Ando 1977), H-Y antigen

(Hurme et al. 1978), and trinitrophenylated mouse serum

albumin (Urba and Hildemann 1978). No protein product has

ever been detected from the I-B subregion, and its effects

can be explained by the complementation of gene products

from both the I-A and I-E subregions (Dorf and Benacerraf

1975; Klein et al. 1981).

The C locus was first discovered with H-2h2 anti-H-2h4

antiserum (David and Shreffler 1974). Rich et al. (1979a,

1979b) subsequently produced antisera containing g specific

antibodies that reacted with a suppressor factor produced in









allogeneic mixed lymphocyte reactions. Mapping of the C

subregion using recombinant inbred strains suggested a

position telomeric to the Ea locus and centromeric to C4.

As for the B subregion, no protein product has ever been

found.

The advent of molecular genetic analysis has allowed

the elucidation of a molecular map of the I-region.

Steinmetz et al. (1982) provided the first evidence at the

molecular level of the exact linkage of class II genes by

cloning a stretch of 200,000 contiguous base pairs from the

I-region of a BALB/c mouse. This study identified 3 of the

biochemically defined class II genes, Ab, Eb, and Ea; and

Eb2, designated a pseudogene because it did not hybridize to

a 5' probe. Southern blot analysis of the BALB/c genome

suggested that the I-region contains 2 a chain genes and

from 4 to 6 b chain genes, a conclusion later confirmed by

the work of Widera and Flavell (1985). Steinmetz et al.

(1982) also showed that the Ea and Eb genes are in fact

present in strains not expressing an E molecule. Thus, the

failure to express an E molecule is not a consequence of the

deletion of the entire gene, but rather must occur at the

level of transcription or translation. Subsequent

experiments involving the screening of cosmid libraries by

Davis et al. (1984) led to the identification of the Aa

gene, and it was mapped just telomeric to the Ab locus.









Comparison of the molecular map of the I-region with

the genetic map has confirmed the location of the Aa and Ab

genes in the I-A subregion, and the location of the Ea gene

in the I-E subregion. The Eb gene, however, is located with

its 5' end in the I-A subregion and its 3' end in the I-E

subregion. This confines the I-J and I-B subregions to less

than 3.4 Kb of DNA at the 3' end of the Eb gene (Steinmetz

et al. 1982). Sequence analysis of this DNA fragment

definitively showed that no gene could encode for I-J in

this segment (Kobori et al. 1986).

Two other class II genes have subsequently been

discovered and determined to be pseudogenes. Larhammer et

al. (1983a) identified the Ab2 gene and mapped it

approximately 20 Kb centromeric to the Ab gene. Sequence

analysis of the Ab2 gene and an Ab2 cDNA clone shows the

exon/intron organization to be the same as other class II b

genes (Larhammer et al. 1983b). The predicted amino acid

sequence of Ab2 shows only about 60% homology to other b

chains. In contrast, the typical homology among other b

chains in human and mouse is around 80%, thus indicating

that Ab2 is the most divergent member of the family.

Detection of incompletely spliced Ab2 mRNA and the finding

of a cDNA clone containing intron sequences suggests that

Ab2 transcripts are not properly processed. No cell surface

product has been isolated from the Ab2 locus. Hybridization

to restriction enzyme-digested genomic DNA of different









inbred strains with Ab2 probes indicated that this gene

displays a lesser degree of polymorphism than Ab.

Widera and Flavell (1985) isolated the Ab3 gene. It

shows 83% nucleotide homology with the human SBb gene and

strong homology with other class II b genes. However, an 8

nucleotide deletion makes the translation of this gene into

a functional protein an impossibility.
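
One reason an 8 nucleotide deletion is so disruptive is that 8 is not a multiple of 3, so every codon downstream of the deletion is read in a shifted frame. The sketch below illustrates this with an invented 30-nucleotide sequence; it is not the Ab3 sequence, and the helper names `codons` and `delete` are hypothetical.

```python
def codons(seq: str) -> list[str]:
    """Split a sequence into complete codons, ignoring trailing bases."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def delete(seq: str, start: int, length: int) -> str:
    """Return seq with `length` bases removed beginning at index `start`."""
    return seq[:start] + seq[start + length:]


def shifts_reading_frame(deletion_length: int) -> bool:
    """A deletion shifts the frame unless its length is a multiple of 3."""
    return deletion_length % 3 != 0


# Invented coding sequence of ten codons (illustration only).
original = "ATGGCTCTGCAGATCCCCAGCCTCCTCCTG"

in_frame = delete(original, 3, 9)   # removes exactly three codons
shifted = delete(original, 3, 8)    # removes 8 bases, as described for Ab3

print(codons(original))
print(codons(in_frame))   # downstream codons are unchanged
print(codons(shifted))    # downstream codons are read in a new frame
```

After the 9-base deletion the remaining codons match the original downstream codons, whereas after the 8-base deletion they do not.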

The position of the Ab3 gene is 75 Kb telomeric to the K

gene (Widera and Flavell 1985). Steinmetz et al. (1986)

subsequently linked the Ab3 gene from the BALB/c mouse to

the rest of the I-region, thereby providing a contiguous 600

kilobase map of the K and I regions of the Mhc. The

organization of the I-region is shown in Figure 2-3. The

genes are arranged, from centromere to telomere, in the order Ab3, Ab2,

Ab, Aa, Eb, Eb2, and Ea, and span approximately 300

kilobases of DNA, with the functional genes confined to a

110 kilobase region.



Homologous Recombination Within the Mhc

The molecular cloning and characterization of large

segments of the Mhc has made it possible to map meiotic

recombinational breakpoints at the nucleotide level.

Steinmetz et al. (1982a) initially analyzed 9 intra-I region

recombinant mouse strains and found that all the

recombinational events map to a 10 kilobase segment of DNA

covering part of the Eb gene. Subsequent Southern blot and




















sequence analysis revealed that 3 of the recombinational

events occurred within a 1 kilobase region of DNA in the

intron between the Eb1 and Eb2 exons (Kobori et al. 1984;

1986). Several succeeding studies have identified 3 more

intra-I region recombinants in which the breakpoints map to

this recombinational hotspot (Saha and Cullen 1986a; 1986b;

Lafuse and David 1986). The finding of highly localized

meiotic recombination points in the mouse Mhc indicates that

recombination is highly focal, and that the genetic and physical

maps are therefore not congruent.

Further studies have revealed 4 additional

recombinational hotspots within the Mhc. These map to (1) a

40 kilobase stretch of DNA between the K and A loci

(Steinmetz et al. 1986; Shiroshi et al. 1982). (2) A 9.5

kilobase region of DNA between the Ab3 and Ab2 loci

(Steinmetz et al. 1986). Further analysis of this

recombinational hotspot by Uematsu et al. (1986) revealed

that all the breakpoints were confined to a 3.5 kilobase

stretch of DNA. All the breakpoints examined showed

homologous recombination without any DNA sequences

duplicated or deleted between the parental and recombinant

haplotypes. (3) Seven breakpoints have been characterized

mapping to a 12-14 kilobase region centromeric to the Ea

gene (Lafuse and David 1986). (4) Another recombinational

hotspot was identified by Tarnuzzer (1988), that maps to a

4.7 kilobase stretch of DNA approximately









5 kilobases telomeric to the Aa gene. These observations

indicate that most of the recombinations within the H-2

complex occur in clusters, defining 5 recombinational

hotspots shown in Figure 2-4.

All the recombinational hotspots in H-2 have three

characteristics in common: (1) high frequency of homologous

recombination, (2) localization to a small stretch of DNA,

and (3) haplotype specificity (Steinmetz et al. 1987).

Furthermore, when the recombinational hotspots are present,

they act in a dominant fashion (Steinmetz and Uematsu 1987).

The structural basis of the recombinational hotspots

within the Mhc is unknown (Steinmetz et al. 1987).

Repetitive sequences have been identified in the proximity

of the Ab3/Ab2 and Eb hotspots (Steinmetz et al. 1987;

Uematsu et al. 1986). These repetitive sequences have been

suggested to play a role based on their similarity to Chi, a

recombinational hotspot in phage lambda, and human

hypervariable minisatellite sequences, constituting presumed

hotspots in man. These similarities may therefore indicate

that the basic mechanism of homologous recombination may be

similar in prokaryotes and eukaryotes.


3-Dimensional Structure of Mhc Molecules

A major advance in the understanding of Mhc molecules

came with the elucidation of the 3-dimensional structure of

the class I HLA-A2 molecule (Bjorkman et al. 1987a). Plasma
























membranes from the homozygous human lymphoblastoid cell line

JY were digested with papain to remove the transmembrane

anchor of the HLA-A2 molecule. The soluble fragment,

containing the α1, α2, α3, and β2M domains, was

crystallized, and the structure was then determined by

X-ray crystallographic analysis at 3.5 Å resolution. The molecule is

composed of two pairs of structurally similar domains: α1 and α2

have the same tertiary fold, and likewise α3 and β2M have

the same tertiary fold. The α3 and β2M domains are both

β-sandwich structures composed of 2 antiparallel β-pleated

sheets, one with 4 β-strands and one with 3 β-strands.

These two sheets are connected by a disulfide bond. This

tertiary structure has been described for the constant

region of immunoglobulin molecules, and is consistent with

the amino acid homology between the 2 molecules (Orr et al.

1979).

The α1 and α2 domains interact symmetrically to compose

the antigen binding site (Bjorkman et al. 1987a). It is

located on the top surface of the molecule, distal from the

membrane, in a position accessible for recognition by

receptors from the surface of another cell. The structure

consists of two parallel α-helices that span a platform

composed of an 8-strand antiparallel β-pleated sheet

structure. The antigen binding site is the groove that lies

between the two α-helices and atop the β-pleated sheet. The

dimensions of the antigen binding site (ABS) groove are









approximately 25 Å long, 10 Å wide, and 11 Å deep. This

would accommodate an 8-10 amino acid fragment in a linear

conformation or 14-17 amino acids in an α-helical

conformation. The interior of the antigen binding site is

lined with both polar and nonpolar amino acid side chains,

and many of the highly polymorphic amino acids responsible

for haplotype-specific associations with antigen are located

in the site (Bjorkman et al. 1987b).
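
The peptide lengths quoted above follow from dividing the length of the groove by the axial rise per residue of each conformation. The sketch below is a rough check only. The rise values (about 1.5 Å per residue for an α-helix and about 3.3 Å per residue for an extended strand) are textbook approximations assumed here, not figures taken from the cited structure papers, and the function name `residues_spanned` is hypothetical.

```python
# Approximate axial rise per residue, in angstroms. These are assumed
# textbook approximations, not values reported in the source.
RISE_PER_RESIDUE = {
    "alpha_helix": 1.5,
    "extended_strand": 3.3,
}

GROOVE_LENGTH = 25.0  # angstroms, as stated in the text


def residues_spanned(length_angstrom: float, conformation: str) -> float:
    """Number of residues needed to span a given length."""
    return length_angstrom / RISE_PER_RESIDUE[conformation]


for conformation in RISE_PER_RESIDUE:
    n = residues_spanned(GROOVE_LENGTH, conformation)
    print(f"{conformation}: about {n:.1f} residues")
```

With these assumed values the helical estimate falls near the upper end of the 14-17 residue range, and the extended estimate falls close to the lower end of the 8-10 residue range. The exact counts depend on the rise assumed and on whether the ends of the peptide extend beyond the groove.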

A large continuous region of electron density that is

not accounted for by the polypeptide chain of the HLA-A2

molecule was observed in the ABS (Bjorkman et al. 1987a).

It seems likely that this extra density is from a peptide or

mixture of peptides that co-crystallized with the Mhc

molecule.

Recently, the structure of HLA-Aw68, refined to a

resolution of 2.6 Å, was reported (Garrett et al. 1989).

The backbone structure of the two HLA class I molecules was

very similar, excluding the 13 amino acid differences, of

which 10 are in amino acid positions that face into the

antigen binding cleft. These amino acid differences

individually cause only local structural changes, but

overall substantially transform the ABS. For instance,

comparison of the structure from the 2 alleles illustrates

that various sub-sites of the groove have contour and

charge-distribution changes. Furthermore, the physical

characteristics of pockets which extend between the α-helix









and β-pleated sheets, and are thought to play a critical

role in determining peptide binding properties, can be

highly diversified between the alleles. This is due to

polymorphisms that result in amino acid side chain changes,

differences that ultimately dictate physical binding

properties. The number of amino acid differences between

HLA-A2 and HLA-Aw68 is approximately equal to the average

number of site differences between pairs of HLA alleles.

Therefore, the same degree of structural changes in local

pockets and sub-sites should be observed in other alleles.

A model of the Mhc class II ABS has been proposed based

on the class I structure and the pattern of polymorphism of

human and mouse class II alleles (Brown et al. 1988). The

basic 3-dimensional structure is the same: two α-helices

lying atop an 8-strand β-pleated sheet (Figure 2-5). Both

the a chain and the b chain contribute an α-helix and 4

strands of the β-pleated sheet. There are regions of the

model, however, whose tertiary structure cannot be

accurately predicted. These areas in the b chain are (1)

the loop between β-strand one and β-strand two, (2) the

central α-helix, and (3) the 3' segment. The undefined

parts of the a chain are (1,2) the loops between β-strands 1

and 2, and between β-strands 3 and 4, and (3) the 5' α-helix

segment. Analysis of the secondary structure of class II

molecules by physical criteria, such as Fourier transform

infrared and circular dichroism spectroscopy, is consistent





















with the class II model proposed by Brown et al. (1988)

(Gorga et al. 1989). Furthermore, the class II ABS model is

consistent with a variety of structural and functional

studies (Allen et al. 1987, Buus et al. 1987).



Generation of Mhc Class II Gene Polymorphism



The most outstanding feature of Mhc genes is their

unprecedented genetic polymorphism. No other vertebrate

genes exhibit such a high degree of diversity (Klein 1986).

Serological studies, tryptic peptide mapping, and molecular

characterization have estimated that greater than 100

alleles of some Mhc loci exist in natural populations of Mus

(Wakeland and Klein 1979; Duncan et al. 1979a; Gotze et al.

1980; Klein and Figueroa 1981; 1986). Many of the alleles

are globally distributed, with frequencies ranging from 1-10%

in wild mouse populations (Gotze et al. 1980; Nadeau et al.

1981). In addition, greater than 90% of wild mice are

heterozygous at H-2 (Duncan et al. 1979b), an observation

fully consistent with the high degree of diversity of Mhc

genes.
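
As a rough consistency check on these figures, the heterozygosity expected under random mating can be computed from allele frequencies as H = 1 - sum(p_i^2). The sketch below is illustrative only: the function name and the two uniform frequency distributions are assumptions chosen to bracket the 1-10% range quoted above, not data from the cited surveys.

```python
def expected_heterozygosity(freqs):
    """Expected fraction of heterozygotes under random mating: 1 - sum(p_i^2)."""
    total = sum(freqs)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in freqs)


# Hypothetical locus with 100 alleles, each at a frequency of 1%.
uniform_100 = [0.01] * 100
# Hypothetical locus with 10 alleles, each at a frequency of 10%.
uniform_10 = [0.10] * 10

print(round(expected_heterozygosity(uniform_100), 3))  # 0.99
print(round(expected_heterozygosity(uniform_10), 3))   # 0.9
```

Under these assumed distributions, many alleles at individually low frequencies are enough to give expected heterozygosity at or above 90%, in line with the observation cited above.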

Restriction fragment length polymorphism (RFLP)

analysis with single copy probes spanning the I-region

reveals variable and conserved tracts of DNA (Steinmetz et

al. 1984; Tarnuzzer 1988). The centromeric half of the

I-region, which encodes the Ab, Aa, and 5' portion of the Eb

gene, shows extensive polymorphism and allelic variability.

On the other hand, the telomeric portion of the I-region,

encoding the 3' portion of the Eb gene, Eb2, and Ea,

displays little polymorphism. The boundary runs through the

Eb gene, close to and perhaps overlapping with the

recombinational hotspot in the intron between Eb1 and Eb2

exons.

Nucleotide sequence comparisons of the four functional

class II genes derived from laboratory strains of mice are

consistent with the observations made at the RFLP level: Ab,

Aa, and Eb are polymorphic, whereas Ea is not (Benoist et

al. 1983a; 1983b; Choi et al. 1983; Malissen et al. 1983;

Mengle-Gaw and McDevitt 1983; Estess et al. 1986). Allelic

nucleotide sequence variation can be extensive; alleles of

Ab or Aa commonly differ by 5-10% of their nucleotide

sequence (Benoist et al. 1983b; Estess et al. 1986). Thus,

the RFLP and nucleotide sequence data suggest that the

diversity is indeed greater in both the coding and non-

coding regions of the variable tract as compared to the

conserved tract of the I-region.

Nucleotide sequence analysis of the Ab, Aa, and Eb

genes from different laboratory strains of mice indicates

that the majority of the diversity is localized in the amino

terminus of the molecule, specifically the a1 and b1 exons

(Choi et al. 1983; Benoist et al. 1983b; Estess et al.

1986). These are the exons that encode the antigen binding









site (Brown et al. 1988). Closer inspection of the

nucleotide sequence variation of the Ab, Aa, and Eb genes

reveals that most of the substitutions are clustered into

regions of hypervariability (Benoist et al. 1983b; Mengle-

Gaw and McDevitt 1983; Estess et al. 1986). This diversity

is also seen at the amino acid level. For instance, a Kabat

and Wu variability plot (Kabat and Wu 1970) of the 6 Aa

alleles sequenced by Benoist et al. (1983b) illustrates that

the amino acid substitutions fall into two hypervariable

regions at residues 11-15 and residues 56-57.
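
The variability index of Wu and Kabat is, for each aligned position, the number of different amino acids observed divided by the frequency of the most common amino acid at that position. A minimal sketch follows; the function name and the four-sequence toy alignment are hypothetical and are not the Aa sequences of Benoist et al. (1983b).

```python
from collections import Counter


def wu_kabat_variability(aligned_seqs):
    """Return the Wu-Kabat variability index for each alignment column."""
    if not aligned_seqs:
        return []
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("sequences must be aligned to equal length")
    scores = []
    for column in zip(*aligned_seqs):
        counts = Counter(column)
        # Frequency of the most common residue in this column.
        most_common_freq = counts.most_common(1)[0][1] / len(column)
        # Number of distinct residues divided by that frequency.
        scores.append(len(counts) / most_common_freq)
    return scores


# Hypothetical toy alignment (one-letter amino acid codes).
toy_alignment = ["MKTA", "MRTA", "MQTA", "MKSA"]
for position, score in enumerate(wu_kabat_variability(toy_alignment), start=1):
    print(position, round(score, 2))
```

An invariant position scores 1; positions where several residues occur at comparable frequencies score highest, which is what makes hypervariable regions stand out in such a plot.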

Hughes and Nei have examined the patterns of nucleotide

substitutions at both the class I loci (1988a), and the

class II loci (1989) of both humans and mice. The rates of

nonsynonymous (replacement) substitutions versus synonymous

(silent) substitutions were measured for the various domains

of the Mhc molecules. In both class I and class II loci,

the membrane distal domains encoding the antigen binding

site had a much higher rate of nonsynonymous than

synonymous substitutions. This contrasted with the other

parts of the molecules where the reverse was observed; the

rate of synonymous substitutions exceeded that of

nonsynonymous substitutions.
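
The distinction between the two kinds of substitution can be illustrated by classifying codon differences between two aligned coding sequences. The sketch below is a simplification and is not the procedure of Hughes and Nei: it only counts differing codons as silent or replacement, and does not normalize by the number of synonymous and nonsynonymous sites or correct for multiple hits. The function name and the example sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_differences(seq_a, seq_b):
    """Count differing codons as (synonymous, nonsynonymous)."""
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned, in frame, and equal in length")
    synonymous = nonsynonymous = 0
    for i in range(0, len(seq_a), 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        if codon_a == codon_b:
            continue
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1      # same amino acid: silent change
        else:
            nonsynonymous += 1   # different amino acid: replacement change
    return synonymous, nonsynonymous


# Hypothetical sequences: GCT -> GCC is silent (Ala), AAA -> AGA replaces Lys with Arg.
print(classify_codon_differences("ATGGCTAAA", "ATGGCCAGA"))  # (1, 1)
```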

These observations illustrate the extensive

polymorphism of Mhc class II genes, and imply that this

genetic diversity reflects a unique and important biological

role for these molecules. The functional significance of









this polymorphism is still unclear, although it is thought

to directly relate to disease susceptibility. In addition,

the evolutionary origin of Mhc allelic diversity is unknown;

however, two hypotheses dominate speculations: retention of

ancestral polymorphisms (Klein 1980; 1987), and

hypermutational diversification (Pease 1985).



Retention of Ancestral Polymorphisms

Klein (1980) first postulated that the extensive

diversification of Mhc alleles could be explained by

trans-species evolution: the hypothesis that most Mhc alleles

diverged prior to the origin of the species in which they

are presently found. The divergence of contemporary Mhc

alleles, therefore, reflects the steady accumulation of

mutations over long evolutionary timespans, rather than

hypermutational diversification subsequent to speciation.

The hypothesis of retention of ancestral polymorphisms

postulates that allelic lineages of Mhc genes are maintained

for extremely long evolutionary periods in natural

populations, independent of speciation events (Figure 2-6).

This predicts that selection may act to maintain specific

sets of alleles with specific antigen binding sites, and

consequently binding properties. The most common allelic

lineages would encode antigen binding sites which are

optimal for the presentation of the antigens expressed

by the prevalent endemic pathogens.



















The first experimental support for this hypothesis

demonstrated that Mhc class I molecules derived from

different sub-species of the Mus musculus complex had

identical serological reactivities and tryptic peptide maps

(Arden and Klein 1982). Direct evidence further

illustrating the antiquity of Mhc genes was reported by

McConnell et al. (1988). This study analyzed Ab gene

polymorphism by RFLP analysis, and revealed that greater

than 90% of the 31 alleles examined could be organized into

two evolutionary lineages based on the presence or absence

of an 861 base pair retroposon insertion into the intron

between the Ab1 and Ab2 exons. The flanking direct repeats

of host derived sequences on either side of the retroposon

indicate that the insertion into this position was a random

event during the evolutionary divergence of Ab. Ab alleles

with and without the retroposon insertion were found in 4

species and sub-species of the genus Mus, demonstrating that

this polymorphism arose in the ancestors of modern Mus

species, and was maintained as a polymorphism across

multiple speciation events. These findings have

subsequently been extended to 115 independently derived Ab

alleles, representing 9 different species or sub-species of

the genus Mus. Ab alleles from both lineages are present in

Mus caroli, demonstrating that alleles in these two lineages

diverged at least 8 million years ago (Figure 2-7) (Lu et al.

1990).



















[Figure 2-7. Labels recoverable from the original image: Ancestral Mus; Lineage 1; an insertion produces Lineage 2; an insertion in Lineage 2 produces Lineage 3. Number of H-2 haplotypes tested in modern Mus (taxon labels as printed): m.dom 67, m.mus 18, m.cas 4, sptd 6, hort 4, spretus 12, cery 1, cook 1, carol 2.]


Nucleotide sequence analysis of Ab alleles has revealed

that some of these genes have a deletion of two codons,

while others do not (Choi et al. 1983; Estess et al. 1986).

The deletions occur in exon 2 at amino acid positions 65 and

67. Figueroa et al. (1988) report 2 Ab-specific monoclonal

antibodies that correlate perfectly with the two types of Ab

genes. The H-2A.m27 antibody reacts with the Ab chains that

have the two deletions, and the H-2A.m25 antibody reacts

with Ab chains that are undeleted. Strains that are

homozygous for Ab show a perfectly antithetical relationship

between the determinants that these antibodies detect: they

are either m25-positive and m27-negative or vice versa. No

molecule has been found that reacts with both antibodies,

and only one molecule reacts with neither of the two

antibodies. Utilizing these two antibodies and Northern

blot hybridization with allele-specific oligonucleotides,

Figueroa et al. (1988) were able to demonstrate the

presence or absence of the amino acid position 65/67

deletion polymorphism in 10 species and sub-species of the

genus Mus, in addition to Rattus norvegicus. These data

indicate that the 65/67 deletion polymorphism already

existed in the last common ancestor of mice and rats.

A number of different mutations affecting both the a

and b chains of the E molecule can result in E molecule non-

expression (Jones et al. 1990). (These mutations will be

discussed at greater depth in the following sections of this









literature review.) Many of these mutations can be

identified in mice from multiple species and sub-species of

the genus Mus, indicating that the mutations were already

present in nascent species and survived multiple speciation

events.

The fact that many of these polymorphisms are found in

multiple species requires that the different alleles be

present at relatively high frequencies in the species-

founding populations. These founding populations must also

have been of reasonably large sizes. If either of these two

criteria had not been met, there would be a high likelihood

of losing the polymorphism by random drift, particularly the

retroposon polymorphism in the intron of Ab, where selection

would presumably act at a minimum (Nei 1987).
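
The dependence on allele frequency and founding population size can be made concrete with a simple sampling calculation. Assuming founders are drawn at random from the parental population, the chance that an allele at frequency p is absent from all 2N gene copies carried by N diploid founders is (1 - p)^(2N). The sketch below covers only this founding event, not subsequent drift; the function name and the example values are hypothetical.

```python
def prob_allele_missing(freq, n_founders):
    """Chance that an allele at frequency `freq` is absent from 2N founder gene copies."""
    if not 0.0 <= freq <= 1.0:
        raise ValueError("freq must lie between 0 and 1")
    if n_founders < 0:
        raise ValueError("n_founders must be non-negative")
    return (1.0 - freq) ** (2 * n_founders)


# Hypothetical allele at 5% frequency, with founding populations of 10 and 100 mice.
for n_founders in (10, 100):
    print(n_founders, prob_allele_missing(0.05, n_founders))
```

With these assumed values, the allele is missed in roughly a third of founding events of 10 individuals but almost never with 100 founders, which is the intuition behind requiring both reasonably high allele frequencies and reasonably large founding populations.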

These findings, together with similar results for

primates (Lawler et al. 1988; Parham et al. 1989; Gyllensten

and Erlich 1989; Mayer et al. 1988), demonstrate that the

retention of ancestral polymorphisms over extremely long

evolutionary periods can account for the extensive diversity

seen in modern Mhc alleles in natural populations of

rodents.



Hypermutational Diversification of Class II Genes

The presence of hypervariable regions of DNA within

class II genes suggests that hypermutational mechanisms such

as gene conversion or segmental exchange may be operating to









rapidly diversify the regions of Mhc genes responsible for

immune responsiveness. This hypothesis predicts that most

of the polymorphism will be generated within the lifetime of

a species, potentially allowing alleles to rapidly adapt to

changes in the antigenicity of endemic pathogens (Pease

1985).

Gene conversion or segmental exchange was originally

defined in fungi (Radding et al. 1978), and is a mechanism

by which DNA sequence is copied or transferred from one gene

to another. Although the DNA sequences can be transferred

to and from genes anywhere in the genome, such transfer is

more common within multigenic or multiallelic families

(Baltimore 1981; Robertson 1982; Slightom et al. 1980).

Gene conversion is defined by the DNA transfer between

discrete loci, whereas intragenic segmental exchange occurs

between alleles of a particular locus. Pairing between

partially homologous sequences during meiosis or mitosis is

followed by mismatch repair thereby converting part of one

sequence to that of another. The primary evidence for gene

conversion events is the clustering of nucleotide

substitutions. This pattern of diversity is clearly

documented in Mhc class I genes (Mellor et al. 1983; Weiss

et al. 1983; Nathenson et al. 1986; Geliebter and Nathenson

1987).

Direct evidence for gene conversion in Mhc class II

genes has been reported by Mengle-Gaw et al. (1984), where









an alloreactive T cell clone, 4.1.4, recognized a

determinant present on both the Eb and Abbm12 molecules.

Nucleotide sequence comparisons between Ab, Abbm12, and Eb

(Choi et al. 1983; McIntyre and Seidman 1984) revealed that

the Abbm12 sequence is identical to Eb in the region where it

differs from Abb. The region exchanged must have

encompassed a minimum of 14 nucleotides, because the 3

nucleotide changes occurred in this 14 base pair stretch.

The area of exchange is flanked by regions of exact homology

extending for 20 nucleotides 5' and 9 nucleotides 3'.
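
The reasoning used to bound an exchanged region can be sketched as follows: the minimum tract spans the first to the last nucleotide at which the mutant differs from its parental allele, and the maximum tract extends outward through flanking positions where mutant and donor remain identical. On that reading, the figures quoted above would imply a tract of between 14 and 14 + 20 + 9 = 43 nucleotides. The function name and the toy sequences below are hypothetical and are not the actual Ab or Eb sequences.

```python
def conversion_tract(parent, mutant, donor):
    """Return (min_len, max_len) of a putative donor-derived tract, or None."""
    if not (len(parent) == len(mutant) == len(donor)):
        raise ValueError("sequences must be aligned to equal length")
    changed = [i for i, (p, m) in enumerate(zip(parent, mutant)) if p != m]
    if not changed:
        return None
    first, last = changed[0], changed[-1]
    # Every position in the minimal tract must match the donor.
    if any(mutant[i] != donor[i] for i in range(first, last + 1)):
        return None
    # Extend through flanking positions where mutant and donor are identical.
    left = first
    while left > 0 and mutant[left - 1] == donor[left - 1]:
        left -= 1
    right = last
    while right < len(mutant) - 1 and mutant[right + 1] == donor[right + 1]:
        right += 1
    return last - first + 1, right - left + 1


# Hypothetical aligned sequences: the mutant carries three donor-type bases.
parent = "AAAACCCCGGGGTTTT"
donor = "TAAACTCAGCGGTTAT"
mutant = "AAAACTCAGCGGTTTT"
print(conversion_tract(parent, mutant, donor))  # (5, 13)
```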

McConnell et al. (1988) presented evidence for

segmental exchange occurring in Mhc class II genes. When

the nucleotide sequences of eight alleles of Ab were examined,

the sequence of the Ab2 exon of all 8 alleles corresponded

to the appropriate genomic evolutionary lineage, as defined

by the retroposon insertion. However, the nucleotide

sequence of the Ab1 exon of two of the 8 alleles, Abb and

Abnd, did not reflect their evolutionary lineage, and

therefore indicates the transfer of sequence, by segmental

exchange, from alleles of a different evolutionary lineage

(Figure 2-8).

At present, the relative importance of recombinational

mechanisms versus the accumulation of point mutations over

long evolutionary periods in the generation of Mhc class II

gene diversity has yet to be determined. This is one of the

goals of this dissertation.
































Functional Role of Mhc Polymorphism



Regulatory T lymphocytes are responsible for initiating

and coordinating antigen specific immune responses.

Activation of virgin T regulatory cells, or regulatory T

cells that have not come into contact with their specific

ligand, is dependent upon a set of signals delivered by the

antigen presenting cell. This stringency is designed to

maintain the specificity of the resultant immune response,

and ensure the inactivity of autoreactive T cells. First,

regulatory T lymphocytes must recognize processed peptide

antigen bound to molecules encoded by the I-region of the

major histocompatibility complex. This recognition is

achieved via T cell surface structures including the

clonally distributed T cell antigen receptor, and the co-

receptor molecule CD4. Second, the antigen presenting cell

must provide a costimulatory signal, such as the membrane

form of interleukin-1. The regulatory T cell must receive

both of these signals in tandem. Either signal alone is not

sufficient to induce T lymphocyte activation, or the

subsequent immune response to the antigen from which the

peptide was derived.

Class II molecules play a crucial role in this antigen

presenting cell-T cell interaction. Their function is to









act as promiscuous receptors for antigen fragments, thereby

making them recognizable to T lymphocytes. These two

events, antigen binding by class II molecules, and

regulatory T lymphocyte recognition of the resultant

bimolecular complex, not only form the basis for antigen

specific immune responsiveness, but, in addition, determine

to a large extent the intensity of the ensuing immune

response.



Regulation of Expression of Class II Molecules

Consistent with their function to bind and present

antigen to regulatory T lymphocytes, the expression of class

II molecules is restricted to the antigen presenting cells

of the body. These antigen presenting cell types include

macrophages, dendritic cells, B lymphocytes, and thymic

epithelial cells. Macrophages are the primary antigen

presenting cell in the body (Unanue and Allen 1987), and as

such have the unique ability to trigger virgin T cells

(Lassila et al. 1988). However, resting macrophages do not

constitutively express class II molecules on their surface;

rather, cell surface expression is under both positive and

negative control (Steinman et al. 1980; Snyder et al. 1982).

Supernatants of mitogen activated T cells induce the cell

surface expression of class II molecules on macrophages

(McNichols 1982). Biochemical analysis of the inducing

component of these supernatants has determined the factor









to be gamma-interferon (Steeg et al. 1982; King and Jones

1983). Gamma-interferon increases the cell surface

expression of both the A and E molecules, as well as class I

molecules. This control appears to act at the level of

transcription, such that there is a coordinate increase in

the level of mRNA of all four class II chains within 8 hours

of treatment with gamma-interferon (Paulnock-King et al.

1985). Prostaglandins, glucocorticoids, and the bacterial

endotoxin LPS have all been shown to have a negative effect

on the cell surface expression of class II molecules (Snyder

et al. 1982; Aberer et al. 1984; Steeg et al. 1982).

Precursor B lymphocytes do not express class II

molecules; however, mature B cells and plasma cells show

heterologous constitutive levels of class II on their

surface (Mond et al. 1981; Monroe and Cambier 1983). The

levels of class II expression on resting B cells can be

augmented by incubation with mitogen activated T cell

supernatants (Roehm et al. 1984), and subsequent studies

have shown the factor responsible for this to be

interleukin-4 (BSF-1) (Noelle et al. 1984). Interleukin-4

can increase the levels of class II mRNA within one hour and

cell surface expression levels as early as two hours after

incubation of B cells (Polla et al. 1986). B lymphocytes do

not have the ability to activate virgin T cells (Lassila et

al. 1989). However, they may play an integral role in

antigen presentation during a secondary T cell response









because of their ability to pick up and display minute

quantities of antigen (Lanzavecchia 1985).

These induction mechanisms for class II molecule

expression illustrate the importance of class II molecule

cell surface expression to the interaction of regulatory T

lymphocytes and antigen presenting cells resulting in an

immune response. Furthermore, this expression of class II

molecules on limited cell types ensures regulatory T cell

reactivity can take place only while interacting with

selected cells of the body. This introduces a control

mechanism to ensure the inactivity of autoreactive T cells.



The Functional Expression of Class II Molecules

Initial serological and biochemical characterization of

Mhc class II molecules revealed a heterodimeric glycoprotein

requiring the association of both the a and b chains (Jones

1977; Jones et al. 1978). Serological analysis of class II

molecules expressed in inbred and wild mice has shown that the A

molecule is expressed in all populations of mice examined.

However, four of eleven inbred strains and 5-30% of wild

haplotype mice fail to express an E molecule on the cell

surface (Jones et al. 1981; Nizetic et al. 1984). Analysis

of the Ea and Eb polypeptides of the four inbred E-strains

by 2-dimensional gel electrophoresis revealed that the H-2b

and the H-2s mice synthesize normal Eb chains but do not

express Ea chains, whereas the H-2f and H-2q mice do not









synthesize either Ea or Eb chains (Jones et al. 1978; Jones

et al. 1981). Recently the molecular defects resulting in E

molecule non-expression have been identified, and thus far,

seven independent defects have been detected (Jones et al.

1990).

The Ea gene of the H-2b and H-2s haplotypes has a 627

base pair deletion encompassing the promoter and first exon

(Mathis et al. 1983). The Ea gene has a single nucleotide

insertion in codon 64, causing a frameshift leading to a

stop codon at position 69 (Vu et al. 1989). The Ea gene

also has a single nucleotide insertion, but at codon -2,

thereby generating a downstream stop codon (Vu et al. 1989).

The Eb mutation of the H-2w 3nd H2W21 haplotypes is a

single nucleotide substitution at codon 7 generating a stop

codon. There are also two independent mutations occurring

in the RNA donor splice site at the first exon-intron border

of the Eb gene (Tacchini-Cottier and Jones 1988; Vu et al.

1988). Both of these mutations lead to aberrant RNA

processing. Jones et al. (1990) also report another Eb

mutation distinct from the first three, but have not as yet

molecularly characterized it. All the defects described

causing E molecule non-expression, with the exception of the

insertion affecting the Ea gene of H-2, have also been

found in various wild mouse haplotypes (Jones et al. 1990;

Dembic et al. 1984). The large number of mice not









expressing an E molecule may indicate that the two Mhc class

II molecules are not functionally equivalent.



Chain Association of Class II Molecules

The extensive polymorphism of Mhc class II molecules,

together with the critical nature of their function of

binding antigen to allow T lymphocyte recognition, suggests

that individuals expressing a greater variety of class II

molecules on the cell surface of an antigen presenting cell

would be at a selective advantage. Fathman and Kimoto

(1981) observed that the a and b chains of a given isotype

can transassociate in heterozygotes. These findings gave

rise to the notion of free association of allelic variants

within an isotype, suggesting that 4 types of class II

heterodimers will form in a heterozygote. In contrast,

cross-isotype pairing of A and E molecule polypeptide chains

does not occur except in artificial experimental systems

(Murphy et al. 1980). Preferential isotypic pairing is due

to a strongly increased affinity for the association of

isotype matched pairs of polypeptides (Sant and Germain

1989).
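
The count of four heterodimers follows from pairing each of the two a chains with each of the two b chains of one isotype. The sketch below enumerates these pairs under the free-association assumption and labels each as haplotype matched (cis) or transassociated (trans). The chain labels and function names are hypothetical and are used only for illustration.

```python
from itertools import product


def class_ii_heterodimers(a_chains, b_chains):
    """Pair every distinct a chain with every distinct b chain of one isotype."""
    distinct_a = list(dict.fromkeys(a_chains))  # drop duplicates, keep order
    distinct_b = list(dict.fromkeys(b_chains))
    return list(product(distinct_a, distinct_b))


def is_transassociated(a_chain, b_chain):
    """True when the haplotype suffixes of the two chain labels differ."""
    return a_chain.split("-")[-1] != b_chain.split("-")[-1]


# Hypothetical heterozygote carrying haplotypes labeled "b" and "k".
dimers = class_ii_heterodimers(["Aa-b", "Aa-k"], ["Ab-b", "Ab-k"])
for a_chain, b_chain in dimers:
    kind = "trans" if is_transassociated(a_chain, b_chain) else "cis"
    print(a_chain, b_chain, kind)
print(len(dimers))  # 4
```

A homozygote, by the same enumeration, yields a single A molecule, so free association would double the number of distinct heterodimers relative to cis pairing alone.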

Numerous observations now suggest that preferential

pairing of certain allelic A molecule polypeptide chains

limits the amount of transassociated A molecules that can be

formed. Tryptic peptide analysis from serologically related

groups of mice (Wakeland and Klein 1983) shows that the Aa and Ab









polypeptides from these strains differ by less than 10% of

their tryptic peptides (Wakeland and Darby 1983).

Restriction fragment length polymorphism analysis of the Aa

and Ab chain genes from this same allelic family

corroborates this observation at the DNA level (Tarnuzzer

1988; McConnell et al. 1986). These observations suggest

that the Aa and Ab genes on the same chromosome accumulate

mutations in a coordinate manner, thereby ensuring their

ability to functionally associate.

Gene transfection experiments by Germain et al. (1985)

clearly illustrated that allelic variation can dramatically

affect the ability of A molecule subunits to assemble

correctly, and be expressed on the cell surface. These

studies showed that haplotype mismatched chains cannot

associate as efficiently as haplotype matched chains, and

therefore are not expressed at appreciable levels. Further

analysis indicated that polymorphisms in the amino terminal

half of the Ab1 domain consistently controlled a and b chain

interactions (Braunstein and Germain 1987). Buerstedde et

al. (1988), using site-directed mutagenesis and DNA mediated

gene transfer, have shown that amino acid positions 9, 12,

13, 14, and 17 of the Ab1 exon are responsible for proper

chain association and cell surface expression for the H-2

and H-2k haplotypes examined. Amino acid positions 12

and 13 are particularly significant for proper

association.









These studies suggest that, in order for proper subunit

association and cell surface expression, the a and b chains

of the A molecule need to be co-adapted, and therefore be

from the same or similar haplotype (Figure 2-9).

The ability of polypeptide chains of the E molecule to

associate is under different selective pressures. In the

case of the E molecule, only the Eb chain is highly

diversified while Ea exhibits very low levels of diversity.

Therefore, the diversification of Eb is only constrained by

the requirement to associate with an essentially monomorphic

Ea (Figure 2-9). This discrepancy in selective pressures of

the various class II genes to properly associate may be due

in part to the presence of a recombinational hotspot in the

second intron of Eb. Homologous recombination at this

position would not allow co-evolution of the two genes.



The Role of the Invariant Chain in Class II Molecule
Expression

Mhc class II molecules are associated intracellularly

with a third glycoprotein called the invariant chain (Ii),

which displays little allelic variation among the different

strains of mice examined (Jones et al. 1979). The invariant

chain is a basic polypeptide of 31,000 daltons that is

coprecipitated with class II molecules in

immunoprecipitations using anti-Ia antisera or monoclonal

antibodies. It noncovalently associates with class II


















molecules in the membranes of the endoplasmic reticulum, but

has not been detected on the cell surface in association

with class II molecules (Sung and Jones 1981). It has been

demonstrated that the invariant chain is coordinately

regulated with class II molecules (Koch et al. 1984;

Paulnock-King et al. 1985). Although the function of the

invariant chain is unclear, it has been suggested that it

plays a role in the assembly and intracellular transport of

class II molecules to the cell surface (Sung and Jones 1981;

Jones et al. 1979).



The Presentation of Antigen by Class II Molecules

Regulatory T lymphocytes recognize the bimolecular

ligand of foreign antigen and a self class II molecule on

the surface of antigen presenting cells. However, unlike B

lymphocytes which directly interact with antigen, most T

lymphocytes only recognize a non-native form of the antigen

(Schwartz 1985). The conversion of an antigen from a native

to a non-native form has been termed antigen processing, and

is performed by antigen presenting cells which express class

II molecules on their surface. Although much is still

unknown about the intricacies of antigen processing, the

following is a summary of events (Werdelin et al. 1989;

Germain 1988).

The first step involved in antigen processing is

ingestion of the antigen. Macrophages accomplish this by









constitutive endocytosis, whereas B lymphocytes, by virtue

of their immunoglobulin receptor, utilize receptor-mediated

endocytosis. The ingested antigen is transported into the

interior of the cell in an endocytic vacuole.

The second step of the process takes place when the

endocytic vacuole becomes acidified and proteolytic enzymes

with an acid optimum become activated. This results in the

partial degradation of the antigen; hence, the antigen is

broken down into peptide fragments.

The third step in antigen processing is the binding of

antigenic fragments to Mhc class II molecules. This

presumably occurs in an intracellular compartment, but

exactly where in the cell this occurs is not known. Once an

antigenic fragment is bound to a class II molecule, it is

protected from complete proteolytic destruction. However,

parts of the antigen fragments which are outside the antigen

binding site may not be protected against further

degradation.

The fourth step consists of transporting the class II

molecule-processed antigen fragment complex back to the

surface of the antigen presenting cell. Once there, the

class II molecule acts to keep the antigen fragment in a

constant orientation with a stable conformation, thereby

allowing recognition by a T lymphocyte.

The processing requirements may vary with each

particular antigen, depending on the conditions required to









induce the conformational flexibility needed for the antigen

to bind a class II molecule (Allen 1987). For instance,

some proteins may require no processing, because at least a

portion of the protein has enough freedom in its native

state to become stably bound to a class II molecule. Other

proteins may simply need denaturation, such as a reduction

and alkylation of disulfide bonds, to reveal peptide

fragments able to bind to class II molecules. The most

stringent antigen processing would require proteolytic

cleavage of the native protein. Irrespective of the type of

antigen processing necessary, the immunogenic peptide must

possess two distinct features. First, it must be able to

bind to a class II molecule; the class II molecule

contact site of an immunogenic peptide is called an

agretope (Haber-Katz et al. 1983). Second, the immunogenic peptide

must also make contact with the T cell antigen receptor, and

this site on the peptide is termed an epitope.

The first direct evidence for peptide-class II molecule

association came from Babbitt et al. (1985) using an

equilibrium dialysis method employing purified class II

molecules and peptide fragments. These experiments showed a

peptide from hen egg lysozyme, HEL 46-61, previously

demonstrated to be immunogenic for H-2k and not for H-2d,

bound specifically to Ak molecules but not to Ad molecules in

a saturable process, with an affinity in the micromolar

range. This direct correlation between antigen-class II









molecule interaction and Mhc restriction was subsequently

extended for numerous other antigens (Buus et al. 1986a;

1986b; 1987; Guillet et al. 1987). Furthermore, inhibition

analysis illustrated that peptides restricted to a

particular class II molecule competitively inhibited one

another from binding. This suggested that a class II

molecule contained just a single antigen binding site (Buus

et al. 1987; Babbitt et al. 1986; Guillet et al. 1987), an

observation in agreement with X-ray crystallographic

analysis (Bjorkman et al. 1987a).

Utilizing a gel filtration system enabling complexes of

antigenic peptides and class II molecules to be separated

from unbound peptide, Buus et al. (1986b) were able to study

the kinetics of association and dissociation of these

complexes. These experiments, using the ovalbumin 323-339

peptide/Ad system, illustrated that the rate of complex

formation is very slow (Ka ~ 1 M^-1 s^-1), but once formed the class

II molecule-peptide complex is remarkably stable (Kd ~ 3 x 10^-6

s^-1). This suggests that the association of class II

molecules with antigen fragments most likely occurs in an

intracellular vesicular compartment as opposed to the plasma

membrane, since this would prevent soluble processed antigen

from diffusing away from the membrane bound class II

molecule. This intracellular compartment would probably

have a neutral pH, based on a 10-fold slower rate of complex

formation at pH 4.6, compared with pH 7.2, and the lability









of preformed complexes to acid pH. This peptide-class II

molecule complex sensitivity to acid pH may represent a

mechanism by which class II molecules could rid themselves of

complexed peptide, and be available to bind newly processed

antigen. Recycling of class II molecules (Pernis 1985)

together with biosynthesis (Harding et al. 1989) could

effectively prevent potentially constant saturation of class

II molecule binding sites with peptides derived from self-

proteins.
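
As a rough illustration of what a slow dissociation rate implies for complex stability, the half-life of a complex that decays by first-order dissociation is ln 2 divided by the dissociation rate constant. The sketch below assumes simple first-order kinetics; the function name and the rate constant passed in are illustrative assumptions of the order discussed above, not values taken from the cited experiments.

```python
import math


def half_life_hours(k_off_per_second):
    """Half-life (in hours) of a complex dissociating with first-order rate k_off."""
    if k_off_per_second <= 0:
        raise ValueError("rate constant must be positive")
    return math.log(2) / k_off_per_second / 3600.0


# Assumed, illustrative dissociation rate constant in s^-1.
ASSUMED_K_OFF = 3e-6
assumed_half_life = half_life_hours(ASSUMED_K_OFF)
print(f"half-life: {assumed_half_life:.1f} hours")
```

Under this assumed value the half-life is on the order of days, which is the sense in which a preformed complex can be called remarkably stable.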

Babbitt et al. (1986) first observed that class II

molecules can bind antigenic fragments of self-proteins.

Lorenz and Allen (1988; 1989) further characterized the

ability of class II molecules to bind self-peptides. These

studies provided direct functional proof in vivo that self

proteins are processed constitutively, and can be presented

in a fashion similar to that by which foreign antigens are

presented. In addition, experiments by Adorini et al.

(1988; 1990) demonstrate that peptides of foreign antigens

generated by processing events must compete for binding to

class II molecules with peptides generated from self-

antigens in vivo. Thus, self-tolerance does not occur at

the level of the antigen presenting cell, because antigen

presentation does not discriminate self from non-self.

Rather, it occurs at the level of the regulatory T

lymphocyte, either through functional or physical deletion

of self-reactive T cells.









Any given class II molecule can bind a wide variety of

peptides; however, different class II molecules show distinct

broad specificity patterns. This is reflected in the

variation between alleles in their capacity to present

different peptides to the immune system. For instance, when

overlapping peptides comprising an entire protein have been

analyzed for reactivity, different Mhc class II molecules

have been found to present different peptide determinants to

T lymphocytes (Roy et al. 1989; Allen et al. 1987).

Furthermore, class II molecules only bind a subset of

peptides derived from native protein (Braciale et al. 1989),

and there is a definite hierarchy of peptide determinants

that are immunodominant for particular allelic forms of Mhc

gene products (Ria et al. 1990; Roy et al. 1989; Berzofsky

et al. 1989). Which immunodominant region of the native

protein is ultimately recognized by T lymphocytes is

predominantly influenced by the particular Mhc class II

allele expressed. This influence reflects the ability of

processed fragments to bind to a particular class II

molecule, and demonstrates the effect Mhc class II molecule

polymorphisms have in controlling an immune response

(Benacerraf 1978; Babbitt et al. 1985; Buus et al. 1987).

In conclusion, these studies of Mhc molecule-antigen

interaction illustrate the broad specificity of the class II

molecule antigen binding site. Although the interaction is

generally permissive, the direct correlation of peptide









binding and Mhc restriction powerfully illustrates the

crucial role class II molecules play in the initiation of a

T cell dependent immune response.


Wild Mice



The goals of this dissertation are to elucidate the

evolutionary mechanisms responsible for generating Mhc class

II gene polymorphism, and examine the role selection plays

in driving this extensive diversification. Previous studies

addressing these questions utilized techniques such as

serology and tryptic peptide mapping, which have a limited

capacity to answer these questions, as compared to obtaining

the DNA sequence of the genes. The nucleotide sequence of a

limited number of Mhc class II genes has been obtained only

from a few standard inbred laboratory strains of mice.

Aside from having uncertain genetic origins, inbred strains

of mice were derived from a limited number of sources that

were generated by a high degree of inbreeding. This

represents a biased sampling of the mouse population and an

artificial collection of considerable homogeneity (Ferris et

al. 1982; Klein 1974).

Wild mice are unconfined animals whose breeding is not

controlled by man (Bruell 1970), and, as such, represent a

collection of I-region haplotypes of significant

heterogeneity, particularly when compared to standard









laboratory inbred strains of mice. A number of features

make the study of the evolutionary dynamics of the wild

mouse Mhc particularly attractive. Natural populations of

wild mice are abundant and their phylogenetic relationships

have been extensively characterized. Furthermore, these

mice represent the product of evolutionary processes where

the I-region haplotypes are fixed and maintained through

natural selection.



Natural History of Wild Mice

Wild mice can be divided into 3 categories of animals

depending on their association with man: aboriginal,

commensal, and feral (Sage 1981). Aboriginal mice are

genuinely wild, with essentially no interaction with man.

With the exception of one subspecies that is indigenous to

northwest Africa, aboriginal mice are found only on the

Eurasian continent. Typically, they are dry-area animals,

and feed on grass, seeds, and grain.

Commensal mice, on the other hand, live in close

association with man, and rely on man for their main source

of food and shelter. Marshall (1981) distinguishes 4

commensal subspecies of Mus musculus: M. m. domesticus, M.

m. musculus, M. m. castaneus, and M. m. molossinus.

Commensal mice, like aboriginal mice, are also indigenous to

Eurasia and, in addition, have radiated to habitats

throughout the world. They have successfully adapted to the









extremely varied climatic conditions of environments in

Europe, the Americas, Australia, Africa, and several

south Pacific islands. Ferris et al. (1983) estimate that

the commensal relationship between mouse and man has existed

for approximately 1 million years. This is based on fossil

evidence, nuclear DNA variation, and mitochondrial DNA

variation.

Feral mice were once commensals of man, but reverted to

a more aboriginal existence (Bruell 1970). They are found

in areas such as agricultural fields, open grasslands,

marshes, sandhills, and coastal islands, and feed on grass

and grain (Sage 1981). Permanent reversion to feral habits

primarily occurs only in dry climatic zones.

Mus musculus domesticus is presently found throughout

the world. However, it originated in western Europe and

subsequently spread to the Americas and Australia in

association with the global movements of Europeans (Bonhomme

1986a). Mus musculus musculus is endemic to northeastern

Europe and central Asia (Sage 1981); Mus musculus castaneus

is found in Malaya (Harrison 1955), India (Srivastva and

Wattel 1973), Indonesia (Hadi et al. 1976), and Nepal and

Thailand (Marshall 1977). The native range of Mus musculus

molossinus is eastern Asia, particularly Japan and Korea

(Hamijima 1962; Jones and Johnson 1965). Mus spretus is a

feral species endemic to the western rim of the Mediterranean

Sea (Bonhomme 1986b). Mus spicilegus are the aboriginal









mound-building mice found in the steppe of eastern Europe

(Petrov 1979). Mus spretoides is found in eastern Europe,

the Balkan peninsula, Cyprus, and Turkey (Bonhomme et al.

1984). Mus cookii, Mus cervicolor, and Mus caroli are all

endemic to southeast Asia (Marshall 1986). The natural

range of Mus platythrix is India (Marshall 1986).



Phylogenetic Relationships in the Genus Mus

The phylogenetic relationships of the various species

within the genus Mus have been extensively studied (Bonhomme

1986a), and a general understanding of their relationships

can be inferred (She et al. 1989). Different species are

distinguished from subspecies based on the presence of

reproductive barriers in natural populations. Therefore, M.

m. domesticus and M. m. musculus can interbreed in natural

habitats, and in regions where they come into contact, such

as central Europe, form a tightly defined hybrid zone. In

contrast, different species with overlapping ranges, such as

M. m. domesticus and M. spretus do not interbreed in natural

populations. Although these two species can be bred in a

laboratory environment, the resultant male hybrids are

commonly sterile.

The three major molecular techniques employed for

biochemical systematics include protein electrophoresis,

single copy nuclear DNA (scn DNA) hybridization, and

mitochondrial (mt) DNA restriction fragment length









polymorphism (RFLP) analysis (She et al. 1989). Protein

electrophoresis assays only the polymorphism in the coding

regions of the genome, and therefore is likely to be

constrained by natural selection. ScnDNA hybridization studies

reveal differences between two genomes of all single copy

DNA, including exons, introns, and flanking sequences.

Mitochondrial DNA RFLP analysis, on the other hand, assays

the cytoplasmic genome which has several unique

characteristics, such as a high evolutionary rate, strictly

maternal inheritance patterns, and an absence of

recombination.

Figure 2-10 illustrates the phylogenetic relationship

within the genera Mus and Rattus as determined by DNA-DNA

hybridization studies (She et al. 1989). Similar

phylogenetic relationships are obtained when these species

are compared by other techniques; however, the estimates of

the exact genetic distance among the Mus species vary

depending upon the technique used. In comparing 9 species

of Mus, 5 subspecies of Mus musculus, and species from the

genus Rattus, there are seven levels of divergence among the

species, ranging from 0.3 million years (Mus musculus

complex) to 10 million years (divergence between species

within the genus Mus and the genus Rattus) (Luckett and

Hartenberger 1985; She et al. 1989).

By analyzing the Mhc class II gene nucleotide sequence

from wild-derived alleles from a number of different species








and subspecies of the genera Mus and Rattus, it should be

possible to obtain an evolutionary perspective of the forces

acting to diversify and maintain contemporary Mhc alleles.



















Figure 2-10. Phylogenetic relationships within the genera Mus and Rattus as determined by DNA-DNA hybridization (figure not reproduced; axes: % DNA divergence and million years).













CHAPTER III
MATERIALS AND METHODS



Isolation of Genomic DNA



Genomic DNA was isolated from liver or kidney tissue by

a Proteinase K/SDS method as detailed in Maniatis et al.

(1982). The extraction was performed on a 340A Applied

Biosystems Inc. (Foster City, CA) nucleic acids extractor.

Frozen tissues were ground in a mortar containing liquid N2

to a fine powder, and added to a 3.5 ml solution of lysis

buffer (Applied Biosystems Inc., Foster City, CA) and

Proteinase K (final concentration of 0.3 mg/ml) (Applied

Biosystems Inc., Foster City, CA). The solution was

incubated overnight at 65°C. The remainder of the

extraction was performed by the machine. Briefly, the DNA

solution was extracted two times with a Tris equilibrated

phenol (pH 7.5)/chloroform/isoamyl alcohol solution (25:1

v/v), and one time with just the chloroform/isoamyl alcohol

solution. The genomic DNA was then ethanol precipitated,

washed in 70% ethanol, resuspended in TE buffer (10 mM

Tris-HCl, pH 7.5; 1 mM EDTA), and dialyzed overnight at 12°C

against TE buffer. The resulting DNA solutions were









electrophoresed on a 0.7% agarose gel for quantification and

to confirm their high molecular weight.



Polymerase Chain Reaction Amplification, Cloning, and
Sequencing



Amplification of Ab exon two was achieved via the

polymerase chain reaction (PCR) described by Saiki et al.

(1985) with slight modification. The initial 100 µl

reaction mixture contained 1 µg of genomic DNA, 50 mM KCl,

10 mM Tris (pH 8.3), 1.5 mM MgCl2, 0.01% gelatin, 0.5% DMSO

(v/v), and 250 mM of each dNTP (dATP, dCTP, dGTP, and dTTP).

80 pmoles of each oligonucleotide primer, which are

complementary to stable intron sequences, were also

included: mouse 2: CACGGCCCGCCGCGCTCCCGC; mouse 3:

CGGGCTGACCGCGTCCGTCCGCAG. Samples were then boiled for ten

minutes, quenched on ice, and 5 U Taq DNA polymerase

(Perkin-Elmer Cetus, Norwalk, CT) was added. The first five

amplification rounds consisted of 1 minute denaturing at

94°C, 2 minutes annealing at 25°C, and 3 minutes extension

at 72°C. At this point, 200 µl of dH2O, 5% DMSO (v/v) was

added with an additional 5 U Taq DNA polymerase. The

amplification protocol for the ensuing 23 cycles consisted

of 1 minute at 94°C, 2 minutes at 62°C, and 3 minutes at 72°C.

Immediately following the last cycle was a 7 minute 72°C

chase to ensure full extension of all amplified fragments.
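
The two-stage cycling protocol above can be written down as a small data structure, which makes the total programmed hold time and the base composition of the two primers easy to check. This is only a sketch: the class and function names are hypothetical, the temperatures and times are copied from the text as printed, and ramp times between temperatures are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    label: str
    temp_c: int
    minutes: int


@dataclass(frozen=True)
class Stage:
    cycles: int
    steps: tuple


# Cycling program as described in the text (ramp times not modelled).
PROGRAM = (
    Stage(5, (Step("denature", 94, 1), Step("anneal", 25, 2), Step("extend", 72, 3))),
    Stage(23, (Step("denature", 94, 1), Step("anneal", 62, 2), Step("extend", 72, 3))),
    Stage(1, (Step("final extension", 72, 7),)),
)

# Primer sequences as given in the text.
PRIMERS = {
    "mouse 2": "CACGGCCCGCCGCGCTCCCGC",
    "mouse 3": "CGGGCTGACCGCGTCCGTCCGCAG",
}


def hold_minutes(program):
    """Total programmed hold time in minutes, summed over all stages."""
    return sum(stage.cycles * sum(step.minutes for step in stage.steps)
               for stage in program)


def gc_fraction(seq):
    """Fraction of bases in seq that are G or C."""
    return sum(base in "GC" for base in seq) / len(seq)


print("hold time (min):", hold_minutes(PROGRAM))
for name, seq in PRIMERS.items():
    print(name, len(seq), "nt, GC fraction", round(gc_fraction(seq), 2))
```

Both primers are strongly GC-rich, which is in keeping with the inclusion of DMSO in the reaction mixture.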









The samples were then ethanol precipitated and

electrophoresed through a 5% nondenaturing acrylamide gel.

The fragment of interest was then excised and eluted into 3

ml of elution buffer (0.5 M ammonium acetate, 0.1% SDS, and

1 mM EDTA) at 50°C overnight. The mixture was then

centrifuged at 2,000 rpm for 10 minutes, the supernatant was

ethanol precipitated, the pellet was washed with 70%

ethanol, dried and resuspended in 10 µl of dH2O. The

amplified fragment was then ligated with SmaI-digested

M13mp18 overnight at 25°C under conditions described by the

supplier (Bethesda Research Laboratories, Bethesda, MD).

Insert-positive plaques were sequenced via the Sanger

dideoxyribonucleoside method employing the Sequenase

protocol (United States Biochemical, Cleveland, OH). To

eliminate potential errors introduced by the PCR, at least 2

clones per sequence were analyzed.
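
The check described above, comparing independent clones of the same amplified fragment, amounts to locating positions where aligned clone sequences disagree. A minimal sketch follows; the function name and the example sequences are hypothetical, and the clones are assumed to be already aligned to equal length.

```python
def clone_discrepancies(clones):
    """Return 0-based positions at which aligned clone sequences disagree."""
    if len(clones) < 2:
        raise ValueError("need at least two clones to compare")
    length = len(clones[0])
    if any(len(clone) != length for clone in clones):
        raise ValueError("clones must be aligned to the same length")
    return [i for i in range(length)
            if len({clone[i] for clone in clones}) > 1]


# Hypothetical example: two clones that differ at one position.
example_clones = ["ACGTTGCA", "ACGTAGCA"]
print(clone_discrepancies(example_clones))
```

A position flagged in this way would need a further clone to decide which base reflects the genomic sequence and which is a polymerase error.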



Spleen Cell Isolation, Immunostaining, and Flow Cytometric
Analysis



Freshly explanted spleens were minced through wire

screens to make a single cell suspension. Red blood cells

were lysed by incubating the cells in a 1X ammonium chloride

solution for 5 minutes at 25°C. The remaining spleen cells

were washed thoroughly with 1X PBS. 1 x 10^6 cells were

resuspended in 400 µl 1X PBS, 0.1% NaN3 solution, and then









incubated in a 1:2 dilution of the monoclonal antibody

culture supernatant for 30 minutes at 4°C. The samples were

washed 3 times with 1X PBS and incubated in an 800 µl volume

of a 1/500 dilution of FITC (Accurate Chemical and

Scientific Corp., Westbury, NY) in 1X PBS, 0.1% NaN3 for 30

minutes at 4°C. The samples were again washed 3 times with

1X PBS and brought up in a 400 µl volume for flow cytometry.

The cells were passed through a 4 micron nylon mesh filter

and analyzed on a FACSTAR fluorescence activated cell sorter

(Becton-Dickinson, Mountain View, CA) at a flow rate of 300

cells/second.



Data Analysis



The DNA sequence was analyzed by the following computer

programs. The nucleotide alignment and amino acid

translation were performed using Microgenie (Beckman,

Fullerton, CA). The allelic phylogenies were constructed

using the DNAPARS and DNACOMP programs in the PHYLIP package

(Felsenstein 1989), and the neighbor-joining and UPGMA

programs (provided by M. Nei). Nucleotide divergence and

diversity were calculated with the SYNO and SEND programs

(Nei and Jin 1989).
















CHAPTER IV
RESULTS



The Generation of Mhc Class II Ab Gene Polymorphism in Rodents





The mechanisms responsible for generating antigen

binding site polymorphisms in rodents have been a puzzle to

immunogeneticists for many years. In an attempt to assess

the roles mutational and recombinational processes play in

diversifying MHC class II genes, the nucleotide sequence of

46 alleles of Ab exon 2 (Ab1 exon) was determined and the

patterns of diversification examined. This is the most

polymorphic exon, and it contains the antigen binding site.

These alleles were obtained from a panel of rodents

containing 12 Mus species and sub-species and 2 species of

Rattus, thus providing alleles derived from species diverged

for increasing amounts of evolutionary time up to 10 million

years.












Animals

The DNA analyzed in this study was isolated from fresh

tissues or ethanol preserved tissues from various species of

rodents. The standard laboratory inbred mice were from the

mouse colony in the Tumor Biology Unit at the Department of

Pathology and Laboratory Medicine, University of Florida.

H-2 homozygous wild mice, whose origins and characteristics

have been described previously (Wakeland et al. 1987), are

from our wild mouse colony located at the Animal Care

Facility, University of Florida. Some wild-mouse derived

strains were supplied by F. Bonhomme's laboratory in

Montpellier, France. Three individuals of Rattus rattus

were trapped locally in Gainesville, Florida. The strains

included in this portion of the analysis are listed in Table

4-1.



Nucleotide Diversity Within the Ab1 Exon

The polymerase chain reaction coupled with DNA

sequencing technology was employed to obtain the nucleotide

sequence of 46 alleles of the Ab1 exon (27 alleles sequenced

by S.A.B.; 19 sequenced by J.X. She). These sequences were

combined with 10 previously reported laboratory mouse and

rat sequences to provide a data base of 56 sequences

(Malissen et al. 1983; Larhammar et al. 1983a; Eccles and

McMaster 1985; Estess et al. 1986; Acha-Orbea and McDevitt

1987; Figueroa et al. 1988). By aligning all the sequences

in our panel it was observed that the two codon deletions at

amino acid positions 65 and 67 were erroneously placed

previously in the literature (Estess et al. 1986). Shifting

the position of the two deletions to nucleotide positions

175-177 and 185-187 results in two fewer nucleotide

mismatches between the two forms of Ab1. The nucleotide

sequence of all the alleles is shown in Figure 4-1.


Table 4-1. List of Ab Alleles Analyzed.

Species                  Strain        Geographic Origin

M. m. domesticus         C57BL/6       Lab inbred 2
M. m. domesticus         BALB/c        Lab inbred 1
M. m. domesticus         B10.M         Lab inbred
M. m. domesticus         B10.BR        Lab inbred
M. m. domesticus         C3H.NB        Lab inbred
M. m. domesticus         B10.6R        Lab inbred
M. m. domesticus         B10.RIII      Lab inbred
M. m. domesticus         A.SW          Lab inbred 3
M. m. domesticus         B10.PL        Lab inbred
M. m. domesticus         NOD           Lab inbred 4
M. m. domesticus         DR1           Florida
M. m. domesticus         B10.CAA2      Michigan
M. m. domesticus         B10.STC77     Michigan
M. m. domesticus         B10.SAA48     Michigan
M. m. domesticus         B10.STC90     Michigan
M. m. domesticus         B10.BUA16     Michigan
M. m. domesticus         ERFOUD5       Morocco
M. m. domesticus         AZROU1        Morocco
M. m. domesticus         AZROU3        Morocco
M. m. domesticus         METKOVIC2     Yugoslavia
M. m. domesticus         ERFOUD1       Morocco
M. m. domesticus         FAYIUM4       Egypt
M. m. domesticus         FAYIUM5       Egypt
M. m. domesticus         JERUSALEM4    Israel
M. m. domesticus         24CI          Italy
M. m. domesticus         38CH          Italy
M. m. domesticus         METKOVIC1     Yugoslavia

M. m. musculus           VIBORG7       Denmark
M. m. musculus           MBS           Bulgaria
M. m. musculus           MDS           Denmark
M. m. musculus           MBT           Bulgaria
M. m. musculus           MYL           Yugoslavia
M. m. musculus           BRNO4         Czechoslovakia

M. m. castaneus          CAS           Thailand
M. m. castaneus          THONBURI1     Thailand

M. m. molossinus         MOL           Japan

M. spretus               SEG           Spain
M. spretus               SEI           Spain
M. spretus               STF           Tunisia

M. spicilegus            PANSEVO1      Yugoslavia
M. spicilegus            PANSEVO2      Yugoslavia
M. spicilegus            PANSEVOB      Yugoslavia
M. spicilegus            ZRU           Bulgaria
M. spicilegus            ZYD           Yugoslavia
M. spicilegus            ZBN           Bulgaria

M. spretoides            XBJ           Bulgaria

M. cookii                COK           Thailand

M. cervicolor popaeus    CRP           Thailand

M. caroli                KAR           Thailand
M. caroli                KAR2          Thailand

M. platythrix                          India

R. norvegicus            RT1B          Lab inbred
R. norvegicus            RT1U          Lab inbred

R. rattus                LN3           Gainesville
R. rattus                LN4           Gainesville
R. rattus                LN20          Gainesville

Alleles: Ab b, d, f, k, p, q, r, s, u, and nod (laboratory inbred mice); MudoAb1-15; MumuAb1-6 (MumuAb6 = MudoAb8); MucaAb1-2; MumoAb1; MuspAb1-3; MusiAb1-6; MustAb1; MucoAb1; MuceAb1; MucrAb1-2; MuplAb1; RT-1b and RT-1u; Rara1-3.


Figure 4-1. Nucleotide sequences of the Ab1 exon alleles (figure not reproduced).

DNA sequence analysis of these 56 sequences revealed 52

different alleles in the data set; 4 pairs of identical

alleles were found in independent samples from the Mus

musculus complex. The nucleotide diversity between alleles

was computed using Nei and Jin's program (1989). Most

allelic comparisons revealed 5-15% sequence diversity in the

Ab1 exon, although some alleles differed by as much as 25%

in comparisons both within and between mouse species. The

maximum value of sequence divergence was 32.7%, and occurred

between a rat allele (Rara1) and a mouse allele (MudoAb5).

The mean nucleotide sequence diversity among alleles within

a Mus species (D = 7.57 ± 0.7%) was comparable to that

observed between species (D = 7.73 ± 0.3%) and between all

alleles in the genus (D = 7.68 ± 0.3%). These results

indicate that the diversification of MHC genes is

independent of the phylogenetic relationships within the

genus Mus, and contrasts with the diversification patterns of other

nuclear genes (She et al. 1989).
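
To make the diversity figures above concrete, the sketch below computes the proportion of differing sites between two aligned sequences, applies the Jukes-Cantor correction for multiple substitutions, and averages the corrected distance over all pairs. It is an illustration only and is not the program of Nei and Jin used in this study; the function names and example sequences are hypothetical, and positions gapped in either sequence are simply skipped.

```python
import math
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of differing sites, ignoring positions gapped in either sequence."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def jukes_cantor(p):
    """Jukes-Cantor corrected distance for an observed proportion p (p < 0.75)."""
    if p >= 0.75:
        raise ValueError("distance undefined for p >= 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def mean_diversity(seqs):
    """Mean corrected pairwise distance over all pairs of sequences."""
    dists = [jukes_cantor(p_distance(a, b)) for a, b in combinations(seqs, 2)]
    return sum(dists) / len(dists)


# Hypothetical aligned fragments, 10 sites each.
example = ["ACGTACGTAC", "ACGTACGTAA", "ACGAACGTAC"]
print(round(mean_diversity(example), 4))
```

Comparing the mean within a species with the mean between species, as done above, only requires restricting the pairs passed to the averaging step.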












Phylogenetic Relationships of the Ab1 Alleles

The phylogenetic relationships of these Ab1 alleles

were analyzed using both phenetic (distance) methods, such

as neighbor-joining and UPGMA (Saitou and Nei 1987; Sokal

and Sneath 1963), and parsimony analysis (DNAPARS program in

PHYLIP; Felsenstein 1989). The distance methods determine

the allelic relationships by comparing the total sequence

divergence between alleles, whereas parsimony analysis forms

a network, basing the genealogy on the fewest number of

mutations between alleles. Similar results were obtained

for all three methods of analysis, and Figure 4-2

illustrates the allelic genealogy produced using UPGMA based

on the J-C distance. The results show that alleles in

separate species are commonly more closely related than alleles

within the same species. This observation is consistent

with the retention of ancestral polymorphisms. However,

these analyses revealed very few tightly-clustered allelic

lineages that were stably maintained over evolutionary time

spans. There are only 6 lineages of closely related

alleles, or alleles with less than 2% nucleotide sequence

divergence. These lineages were composed strictly of

alleles derived from the same or closely related species,

with divergence times of less than 1-2 million years. Most

of the Ab1 alleles could not be organized into homogeneous

lineages. For example, if lineages were defined as alleles

containing less than 5% nucleotide divergence in the Ab1

exon, 25 separate singleton alleles remained. This

indicates that most of the lineages defined in the analysis

are not retained as stable polymorphisms for more than 1-2

million years. This observation contrasts sharply with the

results reported by Gyllensten and Erlich (1989) for

polymorphisms of the primate class II DQa gene.


Figure 4-2. Allelic genealogy of the Ab1 alleles produced using UPGMA (figure not reproduced; scale: % of nucleotide divergence, 22 to 0).
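
For readers unfamiliar with UPGMA, the clustering step behind a genealogy of this kind can be sketched in a few lines. The code below is a generic textbook version, not the program supplied by M. Nei; the function name and the three-allele distance table are hypothetical. At each round it joins the two closest clusters, places the join at half their distance, and recomputes distances to the new cluster as a size-weighted average.

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster labels by UPGMA.

    names: list of labels.
    dist:  dict mapping frozenset({a, b}) -> pairwise distance.
    Returns (tree, merges): tree is a nested tuple; merges lists
    (cluster, height) in the order the clusters were joined.
    """
    size = {name: 1 for name in names}   # cluster -> number of leaves
    d = dict(dist)                       # working copy of the distances
    merges = []
    while len(size) > 1:
        # Join the two clusters separated by the smallest distance.
        a, b = min(combinations(size, 2), key=lambda pair: d[frozenset(pair)])
        height = d[frozenset((a, b))] / 2
        na, nb = size.pop(a), size.pop(b)
        new = (a, b)
        # Distance to the new cluster is the size-weighted average.
        for c in size:
            d[frozenset((new, c))] = (
                na * d[frozenset((a, c))] + nb * d[frozenset((b, c))]
            ) / (na + nb)
        size[new] = na + nb
        merges.append((new, height))
    (tree,) = size
    return tree, merges


# Hypothetical distances among three alleles.
demo_names = ["A", "B", "C"]
demo_dist = {
    frozenset(("A", "B")): 0.02,
    frozenset(("A", "C")): 0.10,
    frozenset(("B", "C")): 0.12,
}
demo_tree, demo_merges = upgma(demo_names, demo_dist)
print(demo_tree)
```

Cutting such a tree at a chosen divergence threshold is one way to define lineages; alleles that join the tree only above the cut are left as singletons.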

Parsimony analysis revealed a second interesting

feature of the diversification patterns among these alleles.

The most parsimonious network for all 52 alleles required

405 character changes and only 20 of the 82 (24.2%)

informative sites (nucleotide positions exhibiting at least

2 character states represented by 2 or more alleles in the

data set) were compatible with the genealogy. These results

indicate that the amount of homoplasy, or reverse, parallel,

or convergent mutations, is excessive, and suggest that the

observed evolutionary relationships are not reliable. An

example of the homoplasy in the data set is illustrated in

Figure 4-3. The actual analysis was done with nucleotide

sequences, but the presentation of the results is simplified

by showing the protein sequence. The carboxyl terminal

regions of the Abp and MumuAb alleles (underlined) are

identical to that of Abnod, although the remainder of their

sequences is clearly different. These results might be

explained by convergent evolution, but the amount and the

patterns of homoplasy observed are best explained by

postulating intra-exonic recombinational events among the

alleles.

[Figure 4-3. Example of homoplasy in the data set, shown as protein sequences; figure content not recoverable.]
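
The definition of an informative site given above (a nucleotide position with at least 2 character states, each represented by 2 or more alleles) can be expressed directly in code. This is a minimal sketch under that definition; the function names and the toy alignment are hypothetical and are not taken from the data set.

```python
from collections import Counter

def is_informative(column):
    """True if at least 2 character states each occur in 2 or more alleles."""
    counts = Counter(column)
    return sum(1 for n in counts.values() if n >= 2) >= 2

def informative_sites(alignment):
    """Return the 0-based indices of informative columns in an aligned set of sequences."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same aligned length")
    return [i for i in range(length)
            if is_informative([seq[i] for seq in alignment])]

# Hypothetical toy alignment of four alleles.
toy_alignment = [
    "ACGTA",
    "ACGTC",
    "ATGAC",
    "ATGAG",
]

sites = informative_sites(toy_alignment)
print(sites)
```

In the toy alignment the last column varies, but only one of its states occurs twice, so it is variable without being informative.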


The Abl Exon Consists of Five Polymorphic Sub-Domains

The majority of polymorphisms in the Abl exon occur in

5 specific regions, termed polymorphic segments. These

segments contain 56 of the 82 (68.3%) informative sites for

parsimony analysis, and are identified in Figure 4-4. Each

polymorphic segment encodes a specific element of the

hypothetical class
Figure 4-5. The BS1, BS2, and BS3 segments are located

within the region of the Abl exon that encodes the β-pleated

sheets of the antigen binding site, the α-helix segment is

located in the 5' end of the region that encodes the α-helix

of the ABS, and the 3' segment is located at the end of the

Abl exon in a region that encodes a portion of Ab whose

structure cannot be currently predicted. The Abl exon was

divided into 5 sub-domains based on the locations of these

polymorphic segments, and each sub-domain was analyzed

separately by parsimony analysis. This revealed that the

alleles in each sub-domain could be organized into a series

of highly divergent lineages. Furthermore, the total number

of mutations needed to produce all of the lineages in all of

the sub-domains was two-fold lower than that required for

lineages constructed from the entire exon. This indicates

that these sub-domain lineages have much lower levels of

homoplasy, and suggests that each polymorphic segment is

evolving independently.

[Figure 4-4. Polymorphic segments of the Abl exon. Labels recovered: BS1, BS2, BS3, α-helix, 3' segment; β-pleated sheet; TM, CY, 3'UT; scale bars 2 Kb and 50 bp.]

[Figure 4-5. Figure content not recoverable.]
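
The reasoning behind the sub-domain comparison can be illustrated with a small worked example. The sketch below is an assumption-laden toy and not the program used for this analysis: it scores trees with Fitch's small-parsimony count and searches the three possible topologies for four hypothetical alleles (A to D). The two segments are constructed to support different genealogies, as would be expected after intra-exonic recombination.

```python
def fitch(tree, states):
    """Return (candidate root states, minimum changes) for one site on a rooted binary tree."""
    if isinstance(tree, str):              # leaf: an allele name
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, states)
    right_set, right_cost = fitch(right, states)
    shared = left_set & right_set
    if shared:                             # children agree: no extra change needed
        return shared, left_cost + right_cost
    return left_set | right_set, left_cost + right_cost + 1

def tree_length(tree, alignment):
    """Total number of changes a tree requires, summed over all sites."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch(tree, {name: seq[i] for name, seq in alignment.items()})[1]
        for i in range(n_sites)
    )

def best_length(alignment, topologies):
    """Length of the most parsimonious tree among the candidate topologies."""
    return min(tree_length(t, alignment) for t in topologies)

# The three possible groupings of four alleles.
TOPOLOGIES = [
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("B", "D")),
    (("A", "D"), ("B", "C")),
]

# Hypothetical segments: segment_1 groups A with B, segment_2 groups A with C.
segment_1 = {"A": "AAA", "B": "AAA", "C": "GGG", "D": "GGG"}
segment_2 = {"A": "TTT", "B": "CCC", "C": "TTT", "D": "CCC"}
whole = {name: segment_1[name] + segment_2[name] for name in segment_1}

whole_length = best_length(whole, TOPOLOGIES)
segment_sum = (best_length(segment_1, TOPOLOGIES)
               + best_length(segment_2, TOPOLOGIES))
print(whole_length, segment_sum)
```

No single tree fits both toy segments, so the whole-sequence tree needs more changes than the two segment trees combined. The extra changes are homoplasy forced by the conflicting genealogies, which is the pattern reported above for the entire exon versus its sub-domains.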



The Abl Exon Sub-Domain Lineages Represent Ancestral
Polymorphisms

Parsimony analysis identified 5-11 distinct lineages in

each sub-domain, primarily defined by point mutations

occurring in the polymorphic segments. The consensus

sequences of these lineages are presented in Table 4-2.

These highly diversified polymorphic segments often differ

in 20-35% of their nucleotide sequences. The majority of

the diversity between polymorphic segments appears to result

from the accumulation of point mutations over long

evolutionary periods. As illustrated in Figure 4-2, each

sub-domain lineage contains alleles derived from multiple

mouse species, or even both mouse and rat. The data in

Table 4-3 illustrate that alleles in the same sub-domain

lineage often have identical or very similar nucleotide

sequences, yet may be derived from evolutionarily distant

rodent species. For example, some polymorphic segments are

identical in alleles derived from mouse and rat, indicating

they have been retained as polymorphisms for a minimum of 10

million years. These results indicate that the polymorphic

segments in the sub-domains of the Abl exon are extremely

stable polymorphisms, some of which first arose prior to the

divergence of mice and rats.
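
As a rough check on the 20-35% figure, the percent difference between two polymorphic segments can be computed from rows of Table 4-2. The sketch below assumes, as is conventional for such tables, that a dash denotes identity with the first (reference) row of the sub-domain. The three strings are rows as printed for the BS3 sub-domain; the function names are hypothetical, and no lineage numbers are assigned to the rows.

```python
def expand(reference, row):
    """Replace each dash in `row` with the reference base at that position."""
    if len(reference) != len(row):
        raise ValueError("row and reference must have the same length")
    return "".join(ref if ch == "-" else ch for ref, ch in zip(reference, row))

def percent_difference(seq_a, seq_b):
    """Percent of positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    mismatches = sum(1 for x, y in zip(seq_a, seq_b) if x != y)
    return 100.0 * mismatches / len(seq_a)

# Rows as printed in Table 4-2 for the BS3 sub-domain (dash = same as reference, assumed).
reference = "TACGTGCGCTTC"
row_x = "-GG---------"
row_y = "-T-A--------"

seq_x = expand(reference, row_x)
seq_y = expand(reference, row_y)
difference = percent_difference(seq_x, seq_y)
print(seq_x, seq_y, difference)
```

Under the dash assumption these two rows differ at 3 of 12 positions, which falls within the range quoted above.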





















Table 4-2


AB sub-domains   Lineages   Nucleotide Sequences   Number of Alleles   Species


1
2
BS1 3
4
5



1
2
3
4
5
BS2 6
7
8
9
10



1
2
3
BS3 4
5
6
7



1
2
3
a-helix 4
5
6
7




1
2
3
4
5
3'segment 6
7
8
9
10
11


13 23
TACCAGTTCAAGGGCGAG
-------- C--CC-TTC
GC----- G --------
--------G-----CT-
GT---......------------


61 71 80
CGGCTCGTGACCAGATACATC
--ATCT---- A ---------
--ATCT---GA----------
--ATAT---------------
A--TATC--GA ------.T-
-------------------
---------- T ---------
----- T --------------
----AT----G---T ------
---AGT---GA----CG-T--
--TA-A----...---------......-


97 107
TACGTGCGCTTC
-GG---------
---C------
--C----C A-
----------A-
------------
-T-A--------
-- C -C----A-


155 165 175 185
CCAGACGCCGAGTACTGGAACAGCCAGCCGGAGATC
----------------AC--T-***---A-T-***-
..----------------------------------
--CTCA--------------T-***--A-T-***-
--CACA------------------------------
-GG----------------------T-AA----T--
--GTGG-----CG--------------AA-------
--G----T-------------------A--------


242 252 262 272
GGGCCGGAGACCCACACCTCCCTGCGGCGGCTT
---------------------------------
-A-A-----GT--C-------------------
---GT----------------------------
AA-A--------- C -------------------
---....-T....-----....C-- -----------***--
....T --------C-------------------
--------- G-T-T---T ---------------
--------- T-AT---- T ---------------
****T ----T--T--T-------A------C-
---A--------CG--------A-------
---GT-------- G --------------***--


1,2,3,6,7,9,11
1,2,6
12,13
1,6
1,2,3,4,5,6,8,10


1,2,3,8
1,2,4,6
1,2,3,9
12
1,2,4,6
5,6,7
1,3
11
13
10


1,2,4,5,6,8,10
1,2,4,6,7,10
1
1,2,3
1,2,3,11,13
1,3,6
12



1,2,3,4,5,6,7,8,9,10
1,2,4,10,11
1,2,13
2
12
13
12




1,2,3,4,6,11
1,2,12,13
1,2,4,7
1,2,3
1,6
5,6
12,13
8
10
10
9


















