Analysis of the sequences required for transcriptional regulation of a human H4 histone gene in vivo


Material Information

Analysis of the sequences required for transcriptional regulation of a human H4 histone gene in vivo
Physical Description:
viii, 200 leaves : ill. ; 29 cm.
Kroeger, Paul Edmond, 1960-
Publication Date:


Subjects / Keywords:
Transcription, Genetic   ( mesh )
Transcription Factors   ( mesh )
Genes, Regulator   ( mesh )
Immunology and Medical Microbiology thesis Ph.D   ( mesh )
Dissertations, Academic -- Immunology and Medical Microbiology -- UF   ( mesh )
bibliography   ( marcgt )
non-fiction   ( marcgt )


Thesis (Ph.D.)--University of Florida, 1988.
Bibliography: leaves 184-199.
Statement of Responsibility:
by Paul Edmond Kroeger.

Record Information

Source Institution:
University of Florida
Rights Management:
All applicable rights reserved by the source institution and holding location.
Resource Identifier:
aleph - 001031878
oclc - 20429224
notis - AFB4056
System ID:

Full Text








I would like to thank Janet and Gary Stein for the opportunity to

work in their laboratory and explore molecular biology from a great

many perspectives. I also appreciate the advice and encouragement of

my other committee members, Drs. Ostrer, Hauswirth and Moyer.

The Steins' laboratory has been filled with many characters over

these last six years and I owe thanks to all of them. I would like to

thank Farhad Marashi, Mark Plumb, and Linda Green (especially Linda)

for their technical expertise and friendship. My fellow graduate

students Gerard Zambetti, Dave Collart, André van Wijnen, and Anna

Ramsey, I thank for their comradeship during the preceding years. The

laboratory would not have been the same without Charles Stewart, Urs

Pauli, and Sue Chrysogelos all of whom have given me new perspectives

on life and science. I thank Tim Morris for his constant good nature,

advice, and stimulating conversations (although we did not always

agree). I would particularly like to thank Ken Wright for our many

successful collaborative adventures in the laboratory, his friendship,

and generosity when it was most needed.

Finally I thank my wife Carol, and our new son, Alan, who have

given me constant inspiration to continue down what has been a long and

unusual path through graduate school. My parents have also been a

constant source of advice and encouragement, and I thank them for their

unending interest.



ACKNOWLEDGEMENTS

ABBREVIATIONS

ABSTRACT


1 INTRODUCTION

Viral Model Systems
Chromatin Studies
In Vitro Transcription
Enhancers and Silencers
Histone Genes

2 MATERIALS AND METHODS

3 HISTONE H4 5' REGULATORY SEQUENCES

Cell Line Construction
Initiation of Transcription and Basal
Distal Transcriptional Regulatory
Distal-Proximal Positive Element
Enhancer Element
Nuclear Run-on Analysis of H4 Transcription


Integrity of Flanking Sequences
Location of pSV2neo Plasmid Sequences
Compatibility of Mouse and Human Regulatory Proteins and Sequences

5 DISCUSSION AND CONCLUSIONS


A SAMPLE COPY NUMBER CALCULATION



C TABLE OF CONSTRUCTS

REFERENCES

BIOGRAPHICAL SKETCH


ATP: Adenosine 5'-triphosphate

bp: Base pair

C: Centigrade

CIP: Calf intestinal phosphatase

CTP: Cytidine 5'-triphosphate

DEPC: Diethylpyrocarbonate

DNA: Deoxyribonucleic acid

DNase I: Deoxyribonuclease I

DU: Densitometry units

EDTA: Disodium Ethylenediaminetetraacetate

EGTA: Ethylenebis(oxyethylenenitrilo)tetraacetic acid

GTP: Guanosine 5'-triphosphate

Hepes: N-2-hydroxyethylpiperazine-N'-2-ethanesulfonic acid

HU: Hydroxyurea

l: Liter

M: Molar

µCi: Microcurie

mg: Milligram

µg: Microgram

ml: Milliliter

µl: Microliter

mM: Millimolar

mRNA: Messenger ribonucleic acid

nm: Nanometer

nt: Nucleotides

OD: Optical density

Pipes: [1,4-piperazinebis(ethanesulfonic acid)]

PVS: Polyvinylsulfate

RNA: Ribonucleic acid

RNaseA: Ribonuclease A

rpm: Revolutions per minute

SDS: Sodium dodecyl sulfate

SV40: Simian virus 40

TCA: Trichloroacetic acid

Tris: Tris(hydroxymethyl)aminomethane + Hydrochloric acid

TTP: Thymidine 5'-triphosphate

Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy



Paul Edmond Kroeger

August 1988

Chairman: Janet Stein
Major Department: Immunology and Medical Microbiology

We have characterized the sequences required for the

transcriptional regulation of the F0108 human H4 histone gene in vivo.

Recombinant cell lines that contained deletion constructs of the H4

promoter region were prepared in mouse C127 cells, and the level of

human H4 histone gene expression was measured by S1 nuclease analysis.

We found that the minimal sequences required for the initiation of

transcription from this gene were contained within the 73 nucleotides

5' to the initiation site of transcription. Within this region are

located an in vivo protein binding site (Site II), the GGTCC element

and the TATA box. Deletion of the distal half of Site II abolished site

specific initiation of transcription and demonstrated that the TATA box

and GGTCC element were not sufficient for initiation in vivo. Extension

of the H4 promoter to -100 base pairs resulted in a significant

increase in transcription and this increase correlated with the


presence of an Sp1 site in the proximal half of the upstream protein

binding site, Site I. If the promoter region was lengthened to -410

nucleotides, there was a two-fold increase in the level of

transcription. Deletion analysis suggested that the "distal-proximal"

positive element was located in the region from -210 to -330 base

pairs 5' to the cap site. We investigated the functionality of a

previously identified enhancer-like element located very far upstream

in the pFO116 fragment of λ HHG 41 and demonstrated that although it

functioned in HeLa cells it was not functional in mouse C127 cells.


S1 analysis of distal deletion constructs supported the idea that a

negative regulatory element of H4 gene transcription was located

between nucleotides -730 and -1010. Analysis of the region

demonstrated consensus sequences for a topoisomerase II site, nuclear

matrix attachment sites, and a very high A/T content (70%) suggestive

of bent DNA. Taken together this set of results implied that the DNA

topology of this region might be important for H4 gene regulation.

Additional studies demonstrated that Alu repetitive sequences in

the histone deletion constructs could mediate specific integration into

the mouse chromosome and that high copy number was possible.




The goal of this study has been to assess the contribution of

promoter sequences in the F0108 human H4 histone gene 5' flanking

region to transcriptional regulation of the gene. We have endeavored

to define the sequences necessary for the initiation and augmentation

of transcription. The TATA box, GGTCC element, "CAAT box," "CCAAT box,"

and Sp1 site have been implicated in transcriptional regulation and are

reviewed below. We have also investigated a putative enhancer-like

element and negative regulatory sequence and so these sequences are

also discussed below.

Historical Background

The concepts governing gene regulation, as we know them today, have

their foundations in the work of many biochemists and geneticists who

introduced the ideas of positive and negative regulation in prokaryotic

gene expression. The observations of many, that the total genetic

potential of a cell was never expressed simultaneously, referred to as

"genetic adaptation," led Jacob and Monod (1961) to address the

question of what controls this phenomenon. In their seminal paper the

"operon model" was proposed. This model described how structural genes

expressed themselves and how that expression was regulated. It had

been known for some time that bacteria could respond to various


nutrients by synthesizing new metabolic enzymes, so Jacob and Monod

investigated the lactose metabolic pathway of Escherichia coli (E.

coli). Their work was encouraged by many earlier investigators,

including Demerec (1956), who made the observation that genes coding

for similar enzymatic function were located in localized regions of the

Salmonella chromosome. Demerec was able to conclude that the genes he

had investigated were in a nonrandom distribution and that perhaps this

conferred an evolutionary advantage to the organism.

The lac operon is one of the best-studied genetic systems in

all of prokaryotic and eukaryotic molecular biology. The many

intuitive observations and predictions of Jacob and Monod and

colleagues led to the identification of the components of the lac

operon: the repressor, produced by the lac I gene; the lac operator,

promoter, and three linked structural genes. The interplay of inducer

and repressor was demonstrated, and Jacob and Monod proposed that the

lac operon was subject to negative regulation. An initial observation

of Jacob and Monod (1961) was that the control gene would make

repressors that would turn off the structural genes. The isolation of

nonsense mutations in the lac I gene (Bourgeois et al., 1965) provided

convincing evidence for the nature of repressors. Suppression of the

nonsense mutation restored repressor function and demonstrated that

repressor genes encoded repressor proteins. The final proof was the

isolation of the lac repressor by Gilbert and Müller-Hill (1966). In

addition it was demonstrated that the lac operon and others were under

more general control by catabolite activator protein and 3'5'-cyclic-

AMP as it was shown that both are required, in addition to the inducing

molecule, for the operon to be transcribed (Emmer et al., 1970).

The ensuing years have led to refinement of the operon model as

well as its acceptance as one of the general organizational patterns

characteristic of prokaryotes. In particular, the concepts of

protein/DNA interactions, repression, and positive and negative

regulation have carried over into eukaryotic molecular biology and

have served as a basis for unraveling the complexity of the eukaryotic

cell. The extension of these ideas has allowed considerable progress;

however, the original view that all genes, prokaryotic and eukaryotic,

would have similar regulatory and organizational patterns has not been

borne out. In fact there is a great diversity in the regulatory

mechanisms that govern both prokaryotic and eukaryotic gene expression.


The control of eukaryotic gene regulation has been of obvious

interest, but research has been slower than in prokaryotes because of

the complexity and technical difficulties encountered when working with

the eukaryotic cell. Two avenues of study have predominated in

eukaryotic molecular biology: the investigation of viral models such

as adenovirus and SV40 (as was done with the prokaryotic phages lambda

and T7) and the characterization of cellular genes and the proteins

that regulate their expression.

Eukaryotic molecular biologists have had to develop the appropriate

technology because many of the advantageous prokaryotic techniques are

not directly applicable to eukaryotic systems. Two of the most

important discoveries that have revolutionized molecular biology are

restriction enzymes (reviewed by Nathans and Smith, 1975) and DNA

ligase (Modrich et al., 1973; Weiss and Richardson, 1967). With these

new enzymatic tools the ability to manipulate DNA fragments developed

quickly and was responsible for the present state of advancement.

Viral Model Systems

The utilization of viral model systems for the characterization of

eukaryotic regulatory mechanisms was a logical extension of the work

done in prokaryotes. In particular, adenovirus and SV40 have provided

considerable insights into eukaryotic gene regulation. Without an

understanding of the exact mechanisms involved in the various processes

of RNA transcription and DNA replication, it was obvious to early

investigators that viruses, such as SV40, could invade and eventually

kill the host cell and yet were extremely dependent on the cell's

enzymatic machinery to accomplish their replicative cycle.

Adenoviruses were first isolated by Rowe et al. (1953) as the

agent responsible for the degeneration of human adenoid tissue in

culture. The adenovirus life cycle in human cells has been examined

with respect to the virus-specific proteins produced, replication of

viral DNA, transcription of viral genes, and effect on the host cell

(reviewed in Tooze, 1980). Initial studies demonstrated that there

were two phases--early and late--in the expression of adenovirus genes

(Lindberg et al., 1972). As a measure of the impact of infection on the

cell, adenovirus mRNA comprises almost all the mRNA bound to

polyribosomes by the end of the replicative cycle (Thomas and Green,

1966). The early viral mRNA was detected and mapped to precise

locations on the adenovirus genome by R-loop mapping (Thomas et al.,

1976) and hybridization to restriction endonuclease fragments of

adenovirus DNA (Sharp et al., 1975). Restriction enzymes permitted the

mapping and orientation of DNA fragments and transcription units on the

SV40 genome as well (Khoury et al., 1973; Sambrook et al., 1973).

Several laboratories utilized adenovirus/SV40 recombinant hybrids

to define essential genomic regions of each. In particular, the hybrid

viruses were useful in the determination of the functional "helper"

domain of the SV40 T antigen, as adenovirus requires "help" to grow in

nonpermissive cells (Fey et al., 1979). With the mRNA coding regions

mapped on the adenovirus and SV40 genomes, a more informative analysis

and interpretation were initiated which have begun to elucidate the

complex nature of transcriptional regulation in these viruses. The

promoter structure and presence of enhancing/silencing elements in

these viruses have served as continuing models for studies of cellular

promoters and regulatory sequences. Additionally, although not

discussed here, both adenovirus and SV40 were utilized in the discovery

of mRNA splicing (Berk and Sharp, 1977, 1978), which has revolutionized

our concepts of gene regulation and expression.

Chromatin Studies

At the same time that the viral model systems were beginning to be

reasonably well understood, there were a number of investigators

pursuing the characterization of cellular genes and transcriptional

mechanisms. Although restriction enzymes had been discovered (Smith

and Wilcox, 1970) and their applicability realized, it was several

years before their purification and recombinant DNA technology were

worked out to make them sufficiently useful. This lag did not deter a

number of investigators from direct examination of the transcriptional

process in eukaryotic cells. As early as 1962 isolated pea embryo

chromatin had been utilized as a template for transcription (Huang and

Bonner, 1962). Isolated chromatin was incubated with bacterial RNA

polymerase (the purification of eukaryotic RNA polymerases had not been

achieved at this time) and the four ribonucleoside triphosphates. A

comparative analysis of transcription from chromatin and deproteinized

DNA of the same source indicated that the chromatin was less able to

support transcription (Huang and Bonner, 1962). It was postulated that

part of the chromatin was repressed, perhaps due to the presence of

histone proteins bound to the DNA. The amount of transcription

possible from a known quantity of chromatin was referred to as its

template capacity. The determination of template capacity in chick

oviduct, a steroid responsive tissue, led to the observation that the

level of transcription was modulated with the addition of hormone

(Dahmus and Bonner, 1965). The amount of template capacity also

correlated with the various developmental stages of sea urchin growth

(Johnson and Hnilica, 1970). Another more accurate measure of the

"transcriptional capacity" of a sample of chromatin was the number of

RNA polymerase initiation sites. Cedar and Felsenfeld (1973) first

measured the number of E. coli RNA polymerase initiation sites on

chromatin by incubating chromatin and RNA polymerase together in the

absence of ribonucleoside triphosphates. Next, the addition of the

ribonucleoside triphosphates with high levels of ammonium sulfate

permitted elongation but not reinitiation. One of the major criticisms

of this early work was that the use of bacterial RNA polymerase placed

accurate interpretation in doubt. Comparative studies were performed

by Mandel and Chambon (1970) and Tsai et al. (1976). These

investigators demonstrated that there was no competition for either

SV40 DNA or calf thymus DNA by the bacterial or eukaryotic RNA

polymerase. However, when Tsai et al. (1976) compared hen oviduct and

E. coli RNA polymerase initiation sites on chick DNA or chick oviduct

chromatin, they found no competition on the DNA, but direct competition

in the chromatin sample. Thus it appeared that chromosomal proteins

could modify the initiation specificity such that both enzymes were

competing for similar sites. To establish this point conclusively, the

product mRNAs had to be examined. Filter hybridization techniques were

developed that permitted the detection of reiterated gene transcripts

and particularly abundant mRNAs. At the level of sensitivity possible

with this methodology, in vitro chromatin transcription appeared to

reflect an accurate view of the transcriptional status in vivo

(Bacheler and Smith, 1976).

The next major advance was the fractionation of chromosomal

proteins in an effort to reconstitute transcriptionally competent DNA

into chromatin in vitro. The first attempts to reconstitute chromatin

were studies by Paul and Gilmour (1966, 1968) and Bekhor et al. (1969)

in which they fractionated chromatin proteins in an attempt to

discover what group of proteins controlled transcription. Their

results indicated that the non-histone chromosomal protein (NHCP)

fraction was probably responsible. The role of NHCP in the expression

of several genes has been reviewed (Stein et al., 1974; Simpson,


Experiments became more refined as exemplified by the studies of

Tsai et al. (1976) who examined the inducible ovalbumin gene in the

chick oviduct system. The role of NHCP was established, and through a

series of competition assays with induced and uninduced NHCPs it was

demonstrated that in vitro expression of the ovalbumin gene was

stimulated by the appearance, upon steroid induction, of a positive

regulatory factor. Histone genes, a moderately reiterated family

(Stein et al., 1984), were also studied in a similar manner to examine

the role of NHCPs. Several studies indicated that NHCPs were involved

in the increased expression of the histone genes during S-phase of the

cell cycle (Park et al., 1976; Stein et al., 1975). Kleinsmith et al.

(1976) extended the characterization and demonstrated that

phosphorylation of the NHCP was necessary for optimal in vitro

expression of the histone genes. When the NHCPs were treated with

phosphatase before addition to the reaction, there was a decrease in

the number of transcription initiation sites.

The role of the histone proteins in transcription has been of great

interest because they form such a close association with the DNA.

Studies with either electron microscopy or nuclease digestion have

demonstrated that there is either a change in the histone/DNA ratio or

a conformational change in the nucleosomes associated with genes

undergoing active transcription (Weintraub and Groudine, 1976). The

chromatin structure of specific genes has also been shown to be

conformationally altered only in tissues where they are

expressed. Examples include the β-globin gene in chick embryo red

blood cell nuclei and the ovalbumin gene in chick oviduct nuclei

(Garel and Axel, 1976). Also, several investigators have proposed that

nucleosomes might be "phased" on the chromosome so as to render

particular areas of the DNA accessible, or inaccessible, to

transcription factors (Gottschling and Cech, 1984; Linxweiler and Hörz,

1985). Thus, at this juncture, it became more realistic to assume that

the chromatin structure of active genes in comparison to silent loci

was a more open and dynamic conformation, yet not necessarily devoid of

histones as had been postulated.

In Vitro Transcription

During the early 1970s, several investigators actively pursued the

activity (or activities) responsible for the synthesis of the various

eukaryotic mRNAs. Almost simultaneously several laboratories were able

to isolate multiple RNA polymerase activities on DEAE-Sephadex columns

(Chambon, 1975; Roeder, 1976). Each peak of activity exhibited a

different susceptibility to the inhibitor α-amanitin (Kedinger, 1970).

There were differences in the results they obtained as evidenced by the

diverse number of variant RNA polymerase activities that were

originally identified (Roeder, 1976). As the purity of the RNA

polymerase activity increased it became more obvious that there were

three distinct RNA polymerase activities present in eukaryotic cells

(Roeder, 1976). It was very difficult for early investigators to make

progress toward understanding the relationship between the various

eukaryotic RNA polymerases and their respective function in the

expression of genes, because adequate templates for transcription in

vitro were not available. The predominant templates used were either

homopolymers, bacteriophage DNA, or fractions of genomic DNA enriched

in either ribosomal or satellite DNA (Chambon, 1975). These proved

unsatisfactory, and the results were often confusing. Several lines of

evidence suggested that ancillary factors were necessary in order for

RNA polymerase, in particular RNA polymerase II, to exhibit template

specific transcription (Chambon, 1975). The application of restriction

enzymes to the manipulation of DNA led to the cloning of specific genes

that were then suitable as templates for in vitro transcription

systems (Nathans and Smith, 1975).

The biological implications of the viral model systems that had

been studied in vivo, and the new DNA cloning technology, prompted

several investigators to develop cell free transcription systems. It

was obvious that it would be advantageous to work with an in vitro

system to dissect the various components of the eukaryotic

transcriptional apparatus. The first in vitro transcription systems

were developed for RNA polymerase III, and shortly thereafter, RNA

polymerase II. RNA polymerase III is responsible for the synthesis of

5S ribosomal RNA (Ng et al., 1979), tRNAs, and a few viral RNAs

including the adenovirus VAI and VAII RNAs (Fowlkes and Shenk, 1980).

Cell free transcription of the Xenopus 5S rRNA gene by RNA polymerase

III was first demonstrated by Birkenmeier et al. (1978) in nuclear

extracts of Xenopus oocytes. At the same time it was shown that

cytoplasmic extracts of human KB cells (Wu, 1978; Weil et al., 1979)

were able to transcribe selectively cloned 5S rRNA, tRNA, and

adenovirus VA RNA genes. The cytoplasmic extracts were shown to

contain a majority of the RNA polymerase III activity (Weil et al.,

1979) that had apparently leaked from the nucleus during preparation of

the extract. With respect to RNA polymerase II, Manley et al. (1980)

prepared a concentrated HeLa cell extract that was able to initiate

transcription accurately in vitro at a variety of adenovirus RNA

polymerase II transcriptional control regions.

In vitro transcription was and is a powerful technique for the

investigation of eukaryotic promoter function. The concomitant

development of various molecular techniques for the mutation and

reassortment of DNA sequences was fortuitous, and in a relatively short

period of time the basic sequence requirements of the RNA polymerase II

promoter were delineated (Efstratiadis et al., 1980). Although

considerable refinement has occurred in our knowledge of these

sequences, the basic elements have not changed. One of the first

sequences to be implicated because of similarity to prokaryotic

promoter sequences was the "TATAA" box (Goldberg-Hogness). This A-T

rich stretch is located -25 to -35 bp upstream of the mRNA start site

in RNA polymerase II promoters and is remarkably similar to the Pribnow

box (TATAAT) described for the promoters of prokaryotic genes (Pribnow,

1975). The only real difference is the location of the Pribnow box,

which is at -10 bp from the start of transcription (Rosenberg and

Court, 1979). It should be noted that the comparison of the Pribnow box

with the Hogness box has revealed variations in sequence and some

difference in function. Principally, the Pribnow box is absolutely

required for transcription to occur in prokaryotes; however, as

discussed below, the Hogness box is not as stringently required. The

second sequence that has been retained with equally remarkable

similarity is the "CAAT" box. The consensus sequence for this element

is 5'-GG(C/T)CAATCT-3' (Efstratiadis et al., 1980; Dynan and Tjian, 1985)

and is usually located -70 to -80 bp from the mRNA start site.
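
The positional conventions above can be illustrated with a short Python sketch. It is not part of the original study: the 90 bp flanking sequence is invented for illustration, the helper name find_elements is hypothetical, and the CAAT consensus is assumed to read GG(C/T)CAATCT. Positions are reported relative to the mRNA start site, with the last base of the flanking sequence taken as -1.

```python
import re

# Consensus strings taken from the text. The degenerate C/T position of the
# CAAT box is written as a regex character class (an assumed reading).
TATA_BOX = "TATAA"
CAAT_BOX = "GG[CT]CAATCT"


def find_elements(upstream, pattern):
    """Return (position, matched sequence) pairs for pattern in upstream.

    upstream is the 5' flanking sequence read 5'->3' and ending at the base
    immediately before the mRNA start site, so its last base is -1.
    The position reported is that of the first base of each match.
    """
    hits = []
    # A lookahead is used so that overlapping matches are also reported.
    for m in re.finditer("(?=(" + pattern + "))", upstream.upper()):
        hits.append((m.start() - len(upstream), m.group(1)))
    return hits


# Hypothetical 90 bp flanking sequence, invented for illustration only.
example = (
    "GC" * 7            # 14 bp of filler
    + "GGCCAATCT"       # CAAT box
    + "GC" * 18 + "G"   # 37 bp of filler
    + "TATAA"           # TATAA box
    + "GC" * 12 + "G"   # 25 bp of filler, ending at -1
)

tata_hits = find_elements(example, TATA_BOX)
caat_hits = find_elements(example, CAAT_BOX)
print(tata_hits, caat_hits)
```

In this invented sequence the TATAA box begins at -30 and the CAAT box at -76, within the ranges quoted above. A real promoter would of course need to be checked on both strands and against experimentally mapped start sites.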

Although the TATA box and CAAT box have been found in a majority of

RNA polymerase II promoters and appear to be the framework around which

gene specific variations in regulatory sequences occur, there have been

some genes described that have no TATA box (Contreras and Fiers, 1981;

Melton et al., 1986; Reynolds et al., 1984). A subset of these genes

have instead a highly G-C rich promoter and in general lack the

strict structure created by consensus RNA polymerase II sequences.

Examples include genes for enzymes such as mouse dihydrofolate reductase (Farnham

and Schimke, 1985), hamster 3-hydroxy 3-methylglutaryl coenzyme A

reductase (Reynolds et al., 1984), and human phosphoglycerate kinase

(Singer-Sam et al., 1984). These genes are often constitutive and hence

have been described as "housekeeping genes." Because the TATAA and

CAAT homologies were found in many genes, it was thought that they

might function in the regulation of transcription. Early in vitro

transcription experiments done by Wasylyk et al. (1980) indicated that

the promoter of the conalbumin gene could be deleted to -44 bp from the

mRNA start site without any effect on the transcription of the gene.

However, when these same investigators introduced even a single base

change into the TATAA box, there was a 10 fold decrease in the amount

of transcription. Similar results were obtained with the adenovirus 2

major late control region (Corden et al., 1980; Hu and Manley, 1981;

Concino et al., 1984).

In contrast to the in vitro results, it was noticed that the TATAA

box, in general, was not essential for transcription in vivo. Benoist

and Chambon (1980) made an SV40 deletion mutant that lacked the TATAA

box preceding the early transcription unit. This mutant was capable of

synthesizing T antigen and transforming rat cells. Similar results were

obtained with the polyoma virus early transcription unit (Bendig et

al., 1980). It was also established that the TATAA box preceding the

sea urchin H2A transcription unit was not necessary for function in

vivo (Grosschedl and Birnstiel, 1980a). The deletion mutants that

Grosschedl made were assayed by injection into Xenopus

oocytes. A 54 bp deletion that included the TATAA box lowered the level

of transcription 5 fold but did not abolish activity.

If the TATAA box is not absolutely essential in vivo for

transcription, then what is the function of this highly conserved

sequence? The answer came from a series of SV40 early promoter mutants

in which the TATAA box was deleted (Gluzman et al., 1980). From this

set of mutants it was demonstrated that in vivo the initiation of SV40

early transcription occurred downstream of the normal site. Also it was

established by Gluzman et al. (1980) that when there were deletions

between the start of transcription and the TATAA box the site of

initiation remained a constant 25 bp ± 2 bp downstream. This

demonstrated that regardless of the deletion, the mRNA cap site was

determined by the position of the TATAA box. Grosschedl and Birnstiel

(1980b) found that multiple initiation sites were utilized in vivo

when the TATAA box was deleted from the sea urchin H2A gene. Since the

lack of a TATAA box caused heterogeneity in the start site of

transcription for several genes, it is now considered that the TATAA

box functions in vivo to specify the correct mRNA initiation site.

Early in vitro transcription studies did not directly discern

whether the CAAT box was necessary for transcription (reviewed in

Shenk, 1981). However, more recent and detailed studies have determined

that the CAAT box does play a role in transcriptional regulation.

Detailed mutagenesis studies by McKnight and Kingsbury (1982), McKnight

et al. (1984), and Myers et al. (1986) elegantly demonstrated the need

for the CAAT box. Initially the studies of McKnight and Kingsbury

(1982) dissected the herpes simplex virus thymidine kinase gene (HSV tk) into

discrete areas required for expression: these included the TATAA box

and two upstream regions referred to as distal signal I (dsI) and

distal signal II (dsII). To pinpoint these small regions accurately

they developed a technique called "linker-scanning" mutagenesis which

introduces clustered sets of point mutations in a short sequence of

DNA. Specifically, these mutations were constructed by ligation of a

series of complementary 3' and 5' deletions joined via a synthetic

linker (BamHI). The mutants that McKnight and Kingsbury created spanned

the proximal 120 bp 5' to the mRNA start site and thus they were able

to assign a boundary to all the sequences required for HSV tk gene

expression after microinjection into Xenopus oocytes. In subsequent

studies dsI and dsII of the HSV tk gene have been shown to interact

specifically with a cellular protein (Jones et al., 1985). This

protein, Sp1, was initially purified by Dynan and Tjian (1983a) from

HeLa cells because of its affinity for the SV40 early promoter--later

identified as the G-C rich sequences of the 21 bp repeats. Once the

sequence of the binding site (GGGCGG) on SV40 was confirmed by various

in vitro methods (e.g., DNase I footprinting), the purified protein was

tested for binding on a variety of other genes that contain a G-C rich

sequence(s), including the mouse dihydrofolate reductase gene (Dynan et

al., 1986) and more recently the rat insulin-like growth factor gene by

Evans et al. (1988). Both of these genes contain several Sp1 binding

sites, identified in vitro by DNase I footprinting, and the sites in

the rat insulin-like growth factor gene are of varying affinity

depending on the sequence.
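
The idea behind linker-scanning mutagenesis can be sketched in a few lines of Python. This is only an illustration of the principle, not the cloning procedure used by McKnight and Kingsbury: the 10 bp linker sequence (which contains a BamHI site, GGATCC), the wild-type sequence, and the function names are all assumptions made for the example. Each mutant replaces one linker-length window with the linker, so the spacing of the flanking sequences is preserved and the changes are confined to a small cluster of bases.

```python
# Assumed 10 bp linker containing a BamHI site (GGATCC); the text names only
# the enzyme, so the exact linker sequence here is hypothetical.
LINKER = "CCGGATCCGG"


def linker_scan(seq, window_start, linker=LINKER):
    """Replace len(linker) bases of seq, starting at window_start, with linker.

    This mimics joining a 5' deletion ending at window_start to a 3' deletion
    beginning len(linker) bases further on, so overall length is unchanged.
    """
    end = window_start + len(linker)
    if window_start < 0 or end > len(seq):
        raise ValueError("window falls outside the sequence")
    return seq[:window_start] + linker + seq[end:]


def scan_series(seq, step=None, linker=LINKER):
    """Return (window_start, mutant) pairs that tile seq with the linker."""
    step = step or len(linker)
    last_start = len(seq) - len(linker)
    return [(s, linker_scan(seq, s, linker)) for s in range(0, last_start + 1, step)]


def changed_positions(wild_type, mutant):
    """Return the indices at which mutant differs from wild_type."""
    return [i for i, (a, b) in enumerate(zip(wild_type, mutant)) if a != b]


# Hypothetical 30 bp wild-type sequence, invented for illustration only.
wild_type = "ATGCATGCATGCATGCATGCATGCATGCAT"
mutants = scan_series(wild_type)

for start, mutant in mutants:
    print(start, mutant, changed_positions(wild_type, mutant))
```

Because some linker bases happen to match the wild-type sequence, each mutant carries a cluster of point changes within its window rather than ten substitutions, which is why the method is described as introducing clustered point mutations.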

Subsequent to the purification of Sp1 several groups reported the

identification of a cellular protein that interacts with the CAAT box

sequence and has been referred to as either CAAT box transcription

factor (CTF) by Jones et al. (1985) or CAAT box binding protein (CBP)

by Graves et al. (1986). Jones et al. (1985) demonstrated an

interaction in dsII of the HSV tk promoter between Sp1 and CTF, thus

indicating that distinct transcription factors may interact to regulate

expression. The identification of CTF prompted the search for other

putative transcription factors, and although the evidence is somewhat

preliminary, there appear to be at least 3-4 different CAAT box

binding activities depending on the source of the material used to

purify the activity and the criteria used for analysis (Dorn et al.,

1987). CBP and CTF differ from each other in their heat stability

(McKnight and Tjian, 1986). A CAAT box binding factor isolated from

HeLa cells in our laboratory (van Wijnen et al., 1988), termed HiNF-B,

is yet another addition to this growing family of proteins, with

properties that distinguish it from previously isolated CAAT box-binding proteins.


The most sophisticated study to date on the subject of

transcriptional regulatory sequences was done recently by Myers et al.

(1986). These investigators developed a quick method for the

introduction of single point mutations in a small region of DNA. They

mutated nearly every base from -1 to -101 bp of the mouse β-globin

promoter. With a battery of over 100 clones, each with a single base

change in the promoter, they were able to assay the expression of the

mutant constructs in vivo in a short term transient assay. Therefore,

they could assign functional limits to consensus regulatory sequences

and discover any minor, or as yet unnoticed, contributing nucleotides.

In addition, transversions and transitions were measured to assess any

effects on expression. They demonstrated a requirement for the TATAA

box (-25 bp) and the CAAT box (-75 bp) as well as an upstream sequence

characteristic of the β-globin genes, CACCC (-96 bp), in β-globin

transcription. Significantly, an "up" promoter mutation was discovered

when the two bases, GG, immediately 5' to the CCAAT box were changed to

AA. The result of this mutation was a 3-4 fold increase in the level

of message. The implications of this result are that a CAAT box

transcription factor is able to bind more tightly or more specifically

and therefore perform its function more efficiently. With the number

of CAAT box binding factors that are being found in various systems, it

is also possible that the "up" mutation results in the binding of an

alternative, as yet unidentified, protein that carries out the same

function, just more efficiently.
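
The size of the mutant collection needed for such a saturation analysis can be sketched as follows. The example enumerates every single-base substitution of a short sequence and labels each as a transition (purine to purine or pyrimidine to pyrimidine) or a transversion. This is only an illustration of the mutant space, not the mutagenesis procedure of Myers et al. (1986); the function names are hypothetical, and the CCAAT motif from the text is used as a toy input.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref, alt):
    """Label a base change as a 'transition' or a 'transversion'."""
    pair = {ref, alt}
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


def single_point_mutants(seq):
    """Yield (position, ref, alt, kind, mutant_sequence) for every substitution."""
    seq = seq.upper()
    for i, ref in enumerate(seq):
        for alt in "ACGT":
            if alt != ref:
                mutant = seq[:i] + alt + seq[i + 1:]
                yield i, ref, alt, classify_substitution(ref, alt), mutant


mutants = list(single_point_mutants("CCAAT"))
print(len(mutants))  # three substitutions per position
for position, ref, alt, kind, mutant in mutants[:3]:
    print(position, ref, alt, kind, mutant)
```

Each position has one possible transition and two possible transversions, so a region of about 100 bp has roughly 300 possible single-base substitutions. A battery of just over 100 clones, as described above, would therefore sample about one substitution per position.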

In addition, there are temporal and tissue-specific sequences that

are found in the promoters of some genes and regulate expression at the

transcriptional level. Many of these elements fall into a category of

modulatory sequences referred to as enhancers, negative elements, and


Enhancers and Silencers

The promoter of a gene has generally been defined as the minimal

sequences necessary for the initiation and maintenance of a basal level

of specific transcription. Additional elements that modify the

expression of a gene either during development, temporally, in a tissue

specific manner, or as a result of an inducer, would seem a necessity

if adequate regulation in the eukaryotic cell is to be achieved. In the

preceding 5-10 years a number of investigators have provided

considerable evidence for the existence of positive regulatory

sequences referred to as enhancers (reviewed in Serfling et al., 1985;

Maniatis et al., 1987). The properties of an enhancer are that 1)

there is strong activation of the linked gene from the correct

initiation site, 2) it exhibits independence of orientation, 3) it is

operative at long distances whether 3' or 5', and 4) it preferentially

stimulates transcription from the closest promoter, if they are

tandemly arranged (Serfling et al., 1985). The prototype enhancer

elements are the 72 bp repeats of SV40, which have been extensively

characterized (Benoist and Chambon, 1980; Fromm and Berg, 1982;

Treisman and Maniatis, 1985). Several experiments in which the SV40

enhancer has been fused to the mouse β-globin promoter have

demonstrated the relationships that exist between an enhancer and

promoter. Banerji et al. (1981) demonstrated that the SV40 enhancer

could promote hundred-fold higher levels of rabbit β-globin

transcription whether located 1400 or 3300 base pairs away. Treisman

and Maniatis (1985) demonstrated that SV40 enhanced transcription of

the mouse β-globin gene depended on the presence of a functional

promoter. Point mutations in the upstream promoter elements (UPE) of

the β-globin promoter abolished transcription almost totally. In

conjunction with these results, Treisman et al. (1985) demonstrated

that when the β-globin promoter was deleted, and the SV40 enhancer was

moved to a proximal position, transcription returned to a high level.

It would then appear that enhancers are like promoters but not vice

versa. Bienz and Pelham (1986) demonstrated that the tandem

duplication of transcriptional control sequences could result in

enhancing ability. They found that the duplication of a heat shock

regulatory element (HSE) could function as an enhancer (distance

activation) whereas a single HSE was inactive at a distance. So one of

the major differences between enhancers and promoters (action at a

distance) may be due to the number of "promoter" elements present with

some accompanying specific sequences (Maniatis et al., 1987). The

importance of the specific sequences should not be down-played, as a

consensus core sequence, 5'-GTGGAAAG-3', has been identified in viral

and cellular enhancers (Khoury and Gruss, 1983).

Differences may also be the result of the arrangement of

transcriptional regulatory sequences. Why do an increased number of

regulatory sequences in many cases stimulate transcription so

dramatically? It has been proposed that the resulting protein-protein

complexes that arise from the juxtaposition of regulatory sequences

result in increased transcription. Therefore since most enhancers

contain repeated elements it is possible that they function in

organization of the transcriptional apparatus. Exceptions to this

exist of course; tandem duplication of the CCAAT box does not lead to

a DNA fragment with enhancer qualities (Bienz and Pelham, 1986), i.e.

no enhancement at a distance. Perhaps this result is also a reflection

of the idea that some "transcription" factors bind to the DNA but do

not act directly. Instead they function through their association with

adjacent proteins (Maniatis et al., 1987). An example is that CTF has

been shown to associate closely with Sp1 protein in the Herpes virus tk

gene (Jones et al., 1985). Significantly, it has recently become

apparent that the mechanism of transcriptional activation by upstream

activation sites (UASs) in yeast is conserved in mammals. Several

studies over the last year have demonstrated 1) that activator

proteins in yeast are composed of a DNA binding domain in the amino

terminus of the protein and a transcriptional activator in the carboxy

terminus, and 2) that when the yeast proteins are expressed in

mammalian cells (with the appropriate binding site present in the

promoter of the target gene) they can activate transcription (Kakidani

and Ptashne, 1988; Webster et al., 1988; Hope and Struhl, 1986). Taken

together with what is known about transcriptional regulation in higher

eukaryotes, it appears that the separation of the DNA binding domain

and the transcriptional activation domain of regulatory proteins may be

conserved from yeast to mammals. In addition the mechanism is probably

conserved as well.

Several of the better characterized enhancer sequences are part

of a group related by tissue specificity of expression. The

Immunoglobulin (Ig) enhancer of the heavy chain locus is located

several thousand bp 3' to the variable region promoter. This enhancer

sequence, in its entirety, is only active in cells of the lymphoid

lineage (Gillies et al., 1983; Banerji et al., 1983). As has been

found for the SV40 enhancer, the Ig enhancer is composed of several

distinct elements that interact with specific proteins in vivo (Church

et al., 1985). One of the core elements of the Ig enhancer is the

"octamer" sequence, 5'-ATGCAAAT-3'. It is of special interest as it

also appears in the promoter of a few cellular genes, including histone

H2B (Harvey et al., 1982) and (2'-5') oligo-A synthetase (Benech et

al., 1985). How this element contributes to tissue specificity in one

context (Ig enhancer) and not in another (histone H2B) remains to be

determined. Recent in vitro binding studies of proteins that interact

with the SV40 "octamer" sequence have demonstrated that there are both

general and tissue specific factors present that bind this sequence,

and this may relate to its role in tissue specific regulation (Rosales

et al., 1987). Also, careful mapping of the binding of HeLa and B cell

nuclear proteins to the SV40 enhancer has revealed subtle differences

in the extent to which various motifs are protected which is indicative

of differential protein/DNA interactions (Davidson et al., 1986).

Enhancers should not be mistaken for promoters with additional

sequence attached or interspersed. In many cases they exhibit

exceptional cell-type and temporal specificity with respect to

transcriptional activation. Deletion analysis has indicated that

certain core sequences of the IgH enhancer may function in non-lymphoid

cells to shut off the enhancer action (Wasylyk and Wasylyk, 1986;

Kadesh et al., 1986).

The implication of a negative regulatory mechanism for the control

of IgH enhancer action presents a confusing picture of tissue specific

and temporal gene regulation. At first it was thought that the absence

of necessary factors for enhancer action was the reason for

differential activity in various tissues (Maniatis et al., 1987).

However, this has been shown to be somewhat incorrect as many of the

factors found in B cell extracts are also in other types of cells. So,

it is either a case of inaccessibility of the DNA binding sites in

nonlymphoid cells, or that there must be an interaction with a B cell

specific protein (Maniatis et al., 1987). Recently Sen and Baltimore

(1986) discovered a factor present in many cell types, NF-κB, that

interacts with the kappa-chain gene enhancer, but only after

modification to an active form in B-cells.

Negative regulation of gene expression is an old subject for

prokaryotic molecular biologists, but is relatively new to eukaryotic

gene regulation. The first description of the SV40 enhancer element

caused everyone to search for similar elements in other genes, and the

identification of negative regulatory sequences, especially in viral

enhancers, has had a similar effect. It is important to understand that

negative regulatory sequences can be divided into two groups, 1) those

sequences that shut off activity of another regulatory element (such as

an enhancer) and have been found to exist within the confines of the

enhancer element, and 2) sequences that act independently of other

regulatory elements to control the level of gene expression. This

latter type of element is the newest discovered and as such is less

well characterized. An interesting distinction can be made in that

some negative regulatory elements can act in either orientation and

with some distance independence and as such have been called either

dehancers or silencers (Baniahmad et al., 1987; Laimins et al., 1986;

Remmers et al., 1986).

Negative regulation of viral enhancer elements is best typified by

the IgH enhancer in which Wasylyk and Wasylyk (1986) have shown that

sequences on either side of the central core sequence down regulate

expression in fibroblasts as compared to B-cells. It is obvious that,

as mentioned above, there must be a mechanism by which the appropriate

genes are expressed at the right times in the right tissues. This may

occur through the regulation of many protein factors, but more likely

there is one protein that regulates the organization of the other

transcriptional factors. It seems apparent that the complexity of the

eukaryotic promoter would in many cases permit great specificity of

expression but could be a regulatory nightmare for the cell. An

exquisite example of coordinate regulation of many genes is found in

the Adenovirus system and the E1a protein. E1a, one of the immediate

early proteins produced in early infection, coordinates the expression

of several other genes (Yee et al., 1987) and also represses the

expression of other elements, such as the SV40 enhancer.

A particularly interesting example of negative regulation, which

relates to E1a regulation, has been described for embryonal carcinoma

cells (EC). SV40, polyoma virus, or Moloney murine leukemia virus are

unable to express their genomes when transfected into undifferentiated

EC cells. The induction of differentiation removes the block on the

expression of both viral and cellular genes (Gorman et al., 1985).

Mutants of polyoma virus were isolated that could replicate in the

undifferentiated EC cells, and it was found that the mutations occurred

predominantly in the promoter and enhancer regions of the early genes.

Alternatively, it has been found that the adenoviruses replicate well

in undifferentiated EC cells. In conjunction it was discovered that

mutants in the E1a region could grow in undifferentiated, but not

differentiated EC cells. Taken together with previous evidence about

the function of the E1a protein, it has been suggested that EC cells

contain an E1a-like protein that negatively regulates gene expression

until differentiation is induced (Gorman et al., 1985). Gorman et al.

(1985) have demonstrated that when the SV40 early promoter is

introduced by infection it is inactive in EC cells, but when introduced

by CaPO4 transfection it is expressed in an enhancer-independent

fashion. This result strongly suggests that the large number of

molecules present in the transiently transfected cell are able to

titrate out the negative factor (or factors) and thus allow expression

from some of the genomes present. Gorman et al. (1985) have also shown

that the negative factors in EC cells have different relative

affinities for the various enhancers, and surprisingly the affinity of

the interaction did not necessarily relate to the level of expression.

A number of cellular genes have been shown to contain negative

regulatory elements although their specific mode of action has not been

characterized. These genes include mouse β-interferon (Goodbourn et

al., 1986), mouse c-myc (Remmers et al., 1986), rat insulin 1 gene

(Laimins et al., 1986), chicken lysozyme (Baniahmad et al., 1987),

mouse p53 tumor antigen (Bienz-Tadmoor et al., 1985), chicken

ovalbumin (Gaub et al., 1987), and rat α-fetoprotein (Muglia and

Rothman-Denes, 1986). This list includes genes in which the negative

element is situated within an enhancer (mouse β-interferon) and those

in which it is interspersed between other promoter elements (chicken

lysozyme and rat α-fetoprotein). The best characterized of these

are the chicken lysozyme and mouse β-interferon genes in which the

sequences responsible for the negative effect have been identified

(Goodbourn et al., 1986; Baniahmad et al., 1987). The chicken lysozyme

gene is particularly of interest because it contains several possible

negative regulatory sequences located at -0.25, -1.0 and -2.4 kb from

the start of transcription and they are well separated from the

enhancer element identified 7 kb upstream (Theisen et al., 1986).

Additionally, it is interesting that both the chicken lysozyme and the

rat insulin 1 gene negative regulatory elements are contained within

repetitive elements. The chicken lysozyme element is found within the

CR1 repeat, which is a middle repetitive sequence and has limited

homology to the mammalian Alu-type sequences. Additionally, the CR1

repeats near the chicken ovalbumin gene are found in areas where there

is a change in the DNaseI sensitivity when the ovalbumin gene is

induced, perhaps indicative of a protein/DNA interaction (Stumph et

al., 1984). The rat insulin 1 element is a member of the family of long

interspersed rat repetitive sequences (LINES) that are present in

about 50,000 copies per cell (Laimins et al., 1986). The fact that

some of the negative regulatory elements identified so far are

associated with middle repetitive sequences has attracted attention.

Some investigators have proposed that the function of this arrangement

may be to coordinate transcriptional domains. The isolation of a domain

by blocking it off with repetitive elements would be consistent with

the structure of eukaryotic chromatin as we understand it today, and

would allow for coordinate control of a gene or set of genes of related

function (Laimins et al., 1986). Negative regulatory elements are

still awaiting the identification of factors that interact with them

and characterization of the protein/DNA and protein/protein

interactions that result in the negative regulation of transcription.

Histone Genes

Histone proteins have been known for a considerable time and their

composition has been the subject of much investigation (reviewed in

Isenberg, 1979). Little was known however about the genes encoding

these basic proteins until the late 1960s and the 1970s when many

investigators took advantage of the size of the histone messages, and

their relative abundance to investigate the regulation of this set of

genes. The histone genes have many characteristics that make them an

attractive model system for the investigation of regulation. They are

coordinately expressed during S-phase of the cell cycle, and this

expression is the result of both transcriptional and

posttranscriptional processes. Additionally, their small size and basic

structure (no introns, minimal processing) make them an easy system to

manipulate and study (Maxson et al., 1983). If we can understand how

the highly coupled expression of the histone genes is controlled,

perhaps we can then understand how other genes are expressed

coordinately and otherwise.

Historical background. One of the initial observations regarding

histone proteins was that they are present in a relatively invariant

1:1 molar ratio with DNA in the cell (Prescott, 1966). It was further

demonstrated that the amount of histone protein present in a cell

doubled during S-phase of the cell cycle (Bloch et al., 1967). Such

results suggested a possible coupling between these two metabolic

events. Borun et al. (1967) were able to demonstrate that a class of

polyribosomes (7-9S) were selectively enriched during S phase of the

HeLa cell cycle and that they coded for histone-like polypeptides in

vitro, thus giving more credence to the relationship that had been

demonstrated earlier. Borun et al. also noted several properties of

these small mRNAs that have become the foundation of present day

theory about histone mRNA regulation: 1) the addition of cytosine

arabinoside caused a fourfold increase in the "histone" mRNA

destabilization rate as compared to actinomycin D treated cells; 2) the

newly synthesized 7-9S RNA, at the G1-S boundary, became associated

with polyribosomes thus beginning histone synthesis; and 3) two hours

before the end of DNA synthesis in synchronized HeLa cells 7-9S mRNA

transcription ceased and the remaining 7-9S mRNA decayed with

approximately a one hour half life. Borun et al. proposed, somewhat

incorrectly, that the control of histone mRNA levels was through

transcriptional regulation. The refinement of molecular techniques has

allowed later investigators to define the degree to which

transcriptional and posttranscriptional mechanisms regulate histone

mRNA metabolism. Butler and Mueller (1973) repeated and extended the

results of Borun by demonstrating several basic facts. First,

cycloheximide was able to stabilize histone mRNA in the presence of

hydroxyurea, a potent inhibitor of DNA synthesis. When added to

synchronized HeLa cells, hydroxyurea causes a very rapid

destabilization of almost all histone mRNAs (90%) via the complete

shutdown of DNA synthesis (Baumbach et al., 1984; Heintz et al., 1983;

Sittman et al., 1983). This suggests that a protein(s) is (are)

necessary for the destabilization process to occur. The 10% of histone

message that remains is insensitive to hydroxyurea and probably

represents replication independent histone gene mRNAs (Wells and Kedes,

1985; Wu and Bonner, 1982). Second, transcription is not necessary for

the production of this putative destabilization factor as the addition

of a transcription inhibitor has no effect on the subsequent

destabilization of histone mRNA. Third, Butler and Mueller (1973)

demonstrated a transient increase in the pool of free histone proteins

for 20 minutes after treatment with hydroxyurea. They suggested in

their regulatory model that the free histone proteins might

autogenously regulate the translation of their own message and/or the

stability of the remaining message following the cessation of DNA

synthesis. Nearly 15 years later, the idea of autogenous regulation

has gained popularity, since Ross and coworkers (1986, 1987) have so

aptly demonstrated the specific degradation of histone mRNA in vitro,

and the isolation of a nuclease activity that degrades poly A minus

messages from the 3' end.

The histone enriched environment of the sea urchin genome allowed

for early isolation of the histone genes by equilibrium centrifugation and

subsequently the characterization of the coding and spacer region base

composition (Birnstiel, 1974). The sea urchin genes have been

successfully used as probes for the isolation of histone genes from

several species, including vertebrates such as Xenopus (Moorman et

al., 1980) and mouse (Seiler-Tuyns and Birnstiel, 1981). The higher

vertebrate histone genes were then used to expedite the isolation of

the human histone genes (Clark et al., 1981; Heintz et al., 1981;

Sierra et al., 1982). The replication dependent histone genes, which

comprise the majority of expressed histone genes, are characterized by

a lack of introns and an extremely well conserved 3' end sequence that

consists of a 15 bp stem and loop structure.

Human histone gene organization. The isolation of the human histone

genes, which had previously been so intensively studied, permitted the

proposed regulatory hypotheses to be tested. The organizational

pattern of the human histone genes was uncovered by restriction enzyme

analysis, and Southern blot hybridization (Southern, 1975) of

restricted phage clones demonstrated that, unlike the tandem repeats of

the lower eukaryotes, the human genes were clustered but had no obvious

organizational pattern (Sierra et al., 1982; Heintz et al., 1981;

Clark et al., 1981). Sierra et al. (1982) were able to isolate lambda

Charon 4A phage clones representative of three families or clusters.

Unlike the lower eukaryotic organization, none of these clustered

groups of human histone genes contained a human H1 gene. By using a

chicken H1 specific probe Carozzi et al. (1984) isolated a clone that

had all 5 human histone genes including an H1 histone. Recently,

several human histone genes have been localized to different

chromosomes (Triputti et al., 1986; Green et al., 1986). This

suggests that coordinate control of human histone gene expression might

not be as easily regulated as in lower eukaryotes.

Another question that had not been addressed up to this time was

whether different histone mRNAs were the product of different histone

genes. Lichtler et al. (1982) demonstrated convincingly that seven

species of human H4 histone mRNA were encoded by at least 3 separate

genes, thereby establishing that the human histone genes are a

repetitive family of genes, but not redundant. Lichtler et al. (1982)

also strengthened the possibility that different histone genes might be

subject to diverse regulation since it was obvious that certain H4

mRNAs were present at higher levels than others.

Transcriptional and Posttranscriptional regulation. Our knowledge

about these two steps in the regulation of histone mRNA metabolism has

been strengthened by the studies of Heintz et al. (1983); Sittman et

al. (1983) and Plumb et al. (1983a,b). Plumb et al. (1983b) utilized

HeLa cells synchronized by double thymidine block and hybrid selection

of pulse labelled histone mRNA. This technique permitted several

species of histone mRNA to be isolated on acrylamide gels. These

experiments demonstrated that the histone genes are transcribed in the

early part of S-phase, approximately 2-3 hours post release from double

thymidine block. The increase in histone mRNA transcription was

3-5 fold during this period. Baumbach et al. (1987) demonstrated a

similar increase in the level of histone gene transcription at the

beginning of S-phase with nuclear run-on analysis. However, one of the

anomalies of histone gene expression is that if one follows the total

increase in the amount of histone mRNA, the actual elevation is from

10-25 fold (Plumb et al., 1983b; Heintz et al., 1983). The actual

differences in histone mRNA levels have varied from one report to

another and this is probably the result of the various synchronization

and analysis techniques utilized. Conservatively, the level of

transcription increases 3 fold during the first 2-4 hours of S phase,

and the stability of histone mRNA rises 10-20 times during S-phase.

Outside of S phase or after the artificial cessation of DNA synthesis

by drug treatment, the half-life of histone mRNA is approximately 10-15

min (Sittman et al., 1983; Plumb et al., 1983a).
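
The practical consequence of such a short half-life can be shown with a small calculation. The sketch below assumes simple first-order (exponential) decay, which is a simplifying assumption and not a claim made in the studies cited. The function name is hypothetical, and the half-life values are the 10-15 minute range quoted above.

```python
def fraction_remaining(minutes, half_life_min):
    """Fraction of an mRNA pool left after `minutes`, assuming first-order decay."""
    if half_life_min <= 0:
        raise ValueError("half_life_min must be positive")
    return 0.5 ** (minutes / half_life_min)


# Fraction of histone mRNA remaining one hour after DNA synthesis stops,
# for the two ends of the half-life range quoted in the text.
for half_life in (10, 15):
    print(half_life, fraction_remaining(60, half_life))
```

Under this assumption, less than a tenth of the message would remain one hour after DNA synthesis stops at either end of the quoted range, which is consistent with the rapid loss of histone mRNA described for hydroxyurea-treated cells.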

Nuclease sensitivity and Protein/DNA interaction. Historically, a

hallmark of an active gene has been the presence of nuclease

hypersensitive sites in the promoter region of the gene. Chrysogelos et

al. (1985) and Moreno et al. (1986) have extensively characterized the

nuclease sensitivities of the flanking and coding regions of the F0108

human H4 histone gene. Together, their results demonstrate that the 5'

region of the F0108 H4 gene is a dynamic area of varying sensitivity to

DNase I, micrococcal and S1 nuclease. Since the histone genes are cell

cycle regulated with respect to transcription and total message levels,

Chrysogelos et al. (1985) were able to correlate the size of the DNase

I hypersensitive site with the stage of the cell cycle. As mentioned

earlier, the appearance of a DNaseI hypersensitive site is indicative

of protein/DNA interactions in the region. Pauli et al. (1987)

utilized the technique of genomic sequencing to visualize the in vivo

protein/DNA interactions in the promoter of the F0108 human H4 histone

gene. They demonstrated that there are two binding sites in the

proximal promoter region which have been designated Site I (-122 bp to

-89 bp) and Site II (-64 bp to -23 bp). Site I contains a putative Sp1

site and a possible CAAT box. Site II contains the GGTCC element (see

below) and the TATAA box. The protein/DNA complexes at Site I and Site

II are present throughout the cell cycle and presumably these

interactions in the promoter region are involved in the basal and

increased level of transcription demonstrated at the onset of S-phase.

Perhaps the interactions that regulate the level of transcription at

the start of S-phase occur through protein/protein interactions since

there is no apparent change in the protein/DNA interactions during the

cell cycle. In studies done by Heintz and Roeder (1984), it was

demonstrated that the pHuH4 histone gene was transcribed in vitro to a

greater extent in S-phase extracts than in G1-phase extracts. It would

be important to know whether there is a new protein that appears at

the onset of S-phase that acts either directly to augment

transcription by interacting with the DNA or through a protein/protein

interaction. Since the identification of protein/DNA interactions in

the promoter of the F0108 H4 gene, it has been of great interest to

us to ascertain if there is any functionality in the interaction and

this is addressed to some extent in this work.

Other histone genes, from a variety of species, have been

characterized with respect to the contribution of 5' flanking sequences

in transcriptional regulation. Notably, the human H2B gene has been

extensively characterized with in vitro transcription by Sive et al.

(1986). They demonstrated that the transcription of the H2B gene is

dependent on a number of sequences 5' to the TATA box including the H2B

octamer element and CCAAT box. Recently, the emphasis has been placed

on identification of the sequences responsible for the periodic

increase in histone gene transcription during the cell cycle.

Artishevsky et al. (1987) have demonstrated, although not convincingly,

that the sequences responsible for the S-phase increase in

transcription of a hamster H3 gene are located in the proximal

promoter region (-150 bp); however they were not explicitly defined.

The authors propose that this region of the hamster H3 gene bears

similarity to the sequence, 5'-GCGAAA-3', that has been shown to

regulate the cell cycle expression of the HO genes of yeast (Nasmyth,

1985). Taken as a whole, these many results support the idea that the

histone genes are controlled at the transcriptional level by promoters

that are composed of many elements that interact with different and

specific proteins. Though not dealt with here, van Wijnen et al.

(1987, 1988) have shown that the promoter region of several cloned

human histone genes can interact with nuclear proteins in a specific manner.


Sequence analysis. Only a few histone genes have been sequenced

extensively enough to permit a comparative analysis of 5' flanking

sequences. The majority of sequencing information concerning histone

genes has revolved around the coding sequences. Comparative analysis of

these protein sequences has revealed remarkable homogeneity from

species to species, especially with respect to histones H3 and H4

(Wells, 1986). Unfortunately little 5' flanking sequence for H4 histone

genes has been published, and most sequences extend only 80-120

nucleotides upstream (Wells, 1986). A comparison of the F0108A H4

histone gene (Sierra et al., 1983), which my studies have involved,

and the human H4 histone gene independently isolated by Heintz et al.

(1981), suggests that some of the sequences in the 5' proximal promoter

region are conserved, namely the TATA and GGTCC boxes. The TATA box is, of

course, a canonical RNA polymerase II transcription sequence and the

GGTCC box has been associated with many H4 gene promoters from sea

urchin to human (Hentschel and Birnstiel, 1981; Wells, 1986).

Comparison of the F0108 gene to the mouse H4 gene isolated by Seiler-

Tuyns and Birnstiel (1981) reveals extensive similarity between the

promoters, especially the TATA box, GGTCC element, and the CAAT

sequence that is found as either a single or double copy located just

5' to the GGTCC element in many H4 histone genes (Wells, 1986). The

significance of the H4 "CAAT" sequence is somewhat questionable as it

was originally thought to represent the "CCAAT" box that is

associated with many RNA polymerase II promoters. There have been

several CCAAT box factors isolated, and all of them require, for good

binding, the sequence 5'-CCAAT-3' (Dorn et al., 1987). The H4

histone gene with which we are working, F0108, does have two CCAAT

boxes located several hundred basepairs upstream and the possible

functionality of both the proximal CAAT boxes and the distal CCAAT

boxes is discussed in the work presented here.
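
The distinction drawn above between a bare CAAT sequence and a full 5'-CCAAT-3' box can be illustrated with a short motif scan. This is a minimal sketch only: the function name find_motifs and any sequence passed to it are hypothetical, and no actual H4 promoter sequence is reproduced here.

```python
import re

# Promoter elements named in the text.
MOTIFS = {"TATA": "TATA", "GGTCC": "GGTCC", "CCAAT": "CCAAT"}


def find_motifs(seq):
    """Return {element: [0-based start positions]} for one strand of seq."""
    seq = seq.upper()
    hits = {}
    for name, pattern in MOTIFS.items():
        # A lookahead is used so that overlapping occurrences are reported.
        hits[name] = [m.start() for m in re.finditer(f"(?={pattern})", seq)]
    # CAAT sites with no C immediately 5', i.e. not part of a CCAAT box.
    hits["CAAT_only"] = [
        m.start() for m in re.finditer(r"(?<!C)(?=CAAT)", seq)
    ]
    return hits
```

Only the strand supplied is scanned; a search of the complementary strand would require the reverse complement to be passed separately.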

The functionality of these and other sequences in the promoter of

histone genes has been one of the focuses of our work. Also, the

Heintz and Roeder laboratory has investigated the functionality of

promoter sequences in the human H4 gene they isolated. In vitro

transcription analysis of Bal 31 deletion mutants of the F0108 H4 gene

by Sierra et al. (1983) demonstrated, in whole cell extracts, that

promoter sequences could be deleted to within 50 bp of the cap site

without loss of transcription. These sequences include only the TATA

box and GGTCC element, but are apparently sufficient for accurate in

vitro transcription to occur. In vitro transcription analysis by Hanly

et al. (1985) demonstrated very similar effects. When only the TATA

box remained as the sole RNA polymerase II consensus element,

transcription was accurate but at a reduced level. Hanly et al.

(1985) have suggested that the sequences extending to -110 bp are

sufficient for maximal transcription of the human H4 histone gene in


The analysis of histone gene transcription in vitro has contributed

to our understanding of the minimal requirements for 5' sequence;

however, it has been demonstrated previously that the requirements for

initiation of mRNA synthesis in vitro and in vivo are different in many

instances. One might reasonably assume that the chromatin structure of

an integrated gene would affect its regulation and intrinsic

accessibility to regulatory proteins. We felt it was necessary to

extend these in vitro studies into stable cell lines for the reasons

outlined above and discussed in Materials and Methods (Chapter 2). A

logical extension of many in vitro studies has been to manipulate the

promoter or coding region of a gene in vitro and to replace it in vivo

and hopefully measure the effect of the manipulation on expression.

Perhaps this has been most successfully accomplished in yeast, where

the reintroduction of the manipulated gene can be done with precision

into the exact locus from which it came originally (Szostak et al.,

1983). This is a goal shared by many molecular biologists as it would

be a more accurate way to assess structure/function relationships.

Histone genes have been transiently expressed in a number of

different cell types (Kroeger et al., 1987; Capasso and Heintz, 1985;

Green et al., 1986; Bendig and Hentschel, 1983; Marashi et al., 1986).

The transient assay affords a reasonably quick way to examine the

effects of DNA manipulation. The results have suggested that

heterologous or homologous systems can be used to express transfected

genes. In probably one of the more radical transfection experiments,

Bendig and Hentschel (1983) introduced the embryonic histone gene

repeat of the sea urchin Psammechinus miliaris transiently into HeLa

cells. Correct 5' mRNA start sites were detected for all 5 genes of the

cluster, but the termination of transcription was generally aberrant

with the exception of the H2B gene. This set of results is suggestive

that heterologous systems may share many regulatory components that

allow them to transcribe foreign genes correctly, but may have--in

this case 3' processing--parts of the regulatory machinery that are

incompatible. This particular subject is discussed in the work

presented here. At the point where our work began, the only stable cell

lines created with an integrated human H4 histone gene were by Capasso

and Heintz (1985). They utilized one construct, pHuH4, to assess the

level of H4 histone gene regulation in mouse Ltk- cells. In vivo S1

nuclease analysis of this single construct permitted them to conclude

that mouse cells could accurately transcribe the human H4 gene. Green

et al. (1986) demonstrated that the F0108 human H4 histone gene was

expressed in mouse C127 lung fibroblasts. In these experiments the

F0108 gene was carried episomally on a construct made from the 69%

transforming fragment of Bovine papilloma virus.

With this understanding and background we initiated studies with

the human H4 histone gene F0108 (Sierra et al., 1982) to ascertain the

in vivo functionality of sequences in the 5' promoter region.



Experimental rationale and commentary. Of particular importance,

for histone and other eukaryotic genes, is the identification of

regulatory sequences and molecules that mediate transcriptional

control. Several laboratories, including our own, have conducted in

vitro and in vivo experiments to assess the functionality of the

histone gene coding region and flanking sequences in the regulation of

expression (van Wijnen et al., 1987; Sierra et al., 1983; Heintz et

al., 1983; Pauli et al., 1987; Dailey et al., 1986; Green et al.,


We felt that an in vivo approach, via the introduction of modified

genes by transfection, had the advantage that the integrated gene was

packaged as chromatin and presumably transcription factors, such as

RNA polymerase II, CTF, and Sp1 were present in proper and localized

concentrations due to the structural integrity of the nucleus.

Therefore the results would be a more accurate reflection of the actual

in vivo situation. The results were still cautiously interpreted in the

context of the experimental parameters present, such as copy number.

Some of our experiments have been done in a transient assay system and

the expression of the human H4 gene under these conditions was somewhat

different than when stably integrated. Presumably there were

differences in chromatin structure and factor to DNA ratios and this

may have been reflected in the results. Previous work has demonstrated

that the human H4 histone gene, with which we have worked, has a

defined chromatin structure that includes an extensive DNaseI

hypersensitive site, and that this site fluctuates in size during the

cell cycle, which may be the result of the interaction of

transcriptional control factors (Chrysogelos et al., 1985).

An in vivo experiment with a transfected gene requires an assay and

experimental approach that will allow for the detection of the

introduced gene. Several options were available for us to pursue. The

most commonly used have been 1) the promoter of a gene was linked to a

reporter gene such as chloramphenicol-acetyl-transferase (CAT) (Gorman

et al., 1982), or 2) the whole gene, coding and flanking regions, was

introduced into a heterologous environment (e.g. a human gene into a

mouse cell) (Capasso and Heintz, 1985, Marashi et al., 1986). Several

groups, including our own, have utilized such heterologous systems

because they allow for the easy detection, by S1 nuclease analysis, of

the mRNA of interest with little or no background. We decided that it

would be better to leave the H4 promoter attached to the H4 gene and

express these constructs in mouse cells.

The histone constructs we cotransfected with the pSV2neo plasmid

were expressed and detectable with S1 nuclease analysis in mouse cells.

We realized that the histone promoter deletion constructs could be

compared to one another and the differences in the steady state level

of histone mRNA from one construct to another were a direct reflection

of transcription. We concluded this because the coding region of all

the constructs had remained intact. Messenger RNA turnover was

presumably the same for each construct and any differences in the

steady-state level of histone mRNA were therefore a result of


We included a mouse H4 control in each of our S1 nuclease assays to

permit the quantitation of the total amount of mRNA and particularly

the amount of histone mRNA. In retrospect, this has helped us to

understand more about the interaction of transcription factors with the

H4 histone genes and in some cases has been an adequate internal

control. Because of the competition phenomenon we uncovered (described

in Chapter 4) the mouse H4 became a less than perfect internal control.

Originally we tried to incorporate the mouse 18S ribosomal RNA gene

into our S1 nuclease assay but were unable to find adequate

hybridization conditions for both histone and ribosomal probes. Ideally

another mouse histone gene in conjunction with the mouse H4 should have

been used.

Materials and general laboratory procedures. All chemicals were of

the highest quality available. Phenol was redistilled and stored frozen

with the addition of 0.1% (w/v) 8-hydroxyquinoline at -20°C. The

frozen phenol was equilibrated first with 100 mM Tris-HCl (pH 8.0) and

subsequently with 10 mM Tris-HCl and 1 mM EDTA (pH 8.0) until the pH

was between 6.0 and 7.0. Phenol/Chloroform extraction refers to the

addition of one volume of equilibrated phenol and one volume of

Chloroform/isoamyl alcohol (24:1) to a solution, mixing, and

separation of the phases by a brief centrifugation step. Next, at least

one volume of chloroform/isoamyl alcohol is added and the above

centrifugation step repeated. Hereafter precipitation refers to the

addition of 2-3 volumes of 95% ethanol, 1/10th volume 3M Sodium Acetate

(pH 5.0), to a solution of DNA or RNA. This was subsequently placed at

-20 or -70°C for a sufficient time to allow precipitation of the

nucleic acids. Radioactively labelled nucleotides, [γ-32P]ATP (~600

Ci/mmol) and [α-32P]dCTP (~3000 Ci/mmol), were purchased from Amersham

and ICN. X-ray film, Cronex and XAR-5, were obtained from Dupont and

Eastman Kodak respectively. For all experiments that involved RNA the

solutions were pretreated with 0.01% diethylpyrocarbonate (DEPC) and

glassware was treated with 0.1% DEPC. After a 30 min. treatment the

solutions and glassware were autoclaved for thirty minutes to remove

any traces of DEPC.
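
The precipitation convention defined above reduces to simple arithmetic. The following sketch is illustrative only; the function name is hypothetical, and it assumes that both the 1/10th volume of sodium acetate and the 2-3 volumes of ethanol are reckoned against the starting sample volume, which the text does not state explicitly.

```python
def precipitation_volumes(sample_ul, ethanol_volumes=2.5):
    """Volumes (in microliters) of 3 M sodium acetate and 95% ethanol to add.

    Assumption: both additions are calculated from the starting sample volume.
    """
    if not 2.0 <= ethanol_volumes <= 3.0:
        raise ValueError("the text specifies 2-3 volumes of ethanol")
    return {
        "sodium_acetate_ul": sample_ul / 10.0,
        "ethanol_ul": sample_ul * ethanol_volumes,
    }
```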

Plasmid growth and preparation. L-broth (Maniatis et al., 1982) was

prepared by mixing 10 g/l Bacto tryptone (Difco), 5 g/l yeast extract

(Difco), 5 g/l NaCl, and 2 ml/l 1 M NaOH in 1 L of ddH2O (double

distilled water). The medium was then autoclaved for 30 min. in order

to sterilize it. Ten milliliter starter cultures of bacteria were

prepared in sterile conical tubes and grown overnight at 37°C. These

were supplemented with sterile 20% glucose (100 µl), 1 M MgSO4 (10 µl),

and 50 µg/ml ampicillin (Sigma). Small inocula were removed from

glycerol stocks or colonies were picked from plates and placed in the

starter culture overnight. Large scale (500 ml) preparations were then

completed with 5 ml 20% glucose, 0.5 ml MgSO4 and 50 µg/ml ampicillin.

Cultures were grown at 37°C until they reached an optical density

(595nm) of 0.4 to 0.5. At this point 4.25 ml of 20 mg/ml

chloramphenicol were added and the cultures were allowed to grow for an

additional 16-18 hrs. If the bacteria contained a pUC plasmid or

derivative, the amplification step was omitted. The cells were

harvested and the plasmid DNA was prepared essentially as described by

Maniatis et al. (1982). The pellet was resuspended in 10 ml of

Solution 1 (50 mM glucose, 25 mM Tris-HCl pH 8.0, 10 mM EDTA, and 5

mg/ml lysozyme (Cooper Biomedical)) and incubated at room temperature

for 5 min. Next, 20 ml of Solution 2 (0.2 N NaOH, 1% SDS) was added and

the cells were placed on ice for 10 min. Fifteen ml of Solution 3 (5M

KAc, pH 4.8) was added and incubated on ice for 10 min. The cells were

then centrifuged at 10k rpm for 20 min., 4°C. The supernatants from all

tubes were pooled and precipitated with 0.6 volume of isopropanol for

15 min. at room temperature. The precipitate was recovered by

centrifugation at 10k rpm for 30 min. The pellet was dried and

resuspended in 8 ml of 10 mM Tris-HCl pH 8.0, 1 mM EDTA (TE). Eight

grams of CsCl and 640 µl of 10 mg/ml ethidium bromide were added and

the preparation was centrifuged for 36 hrs at 45k rpm in Beckman heat

sealed tubes in a Beckman Ti50 rotor. The DNA band was visualized by

ultraviolet illumination and recovered by side puncture with a 20

gauge hypodermic needle. The DNA was then either placed over a small

Dowex AG 50W-X8 column or butanol extracted 5X to remove the ethidium

bromide. The sample was then dialyzed extensively against TE. The DNA

was recovered by ethanol precipitation and subsequent centrifugation.

Quantitation of the yield was done spectrophotometrically (Beckman) at

260 nm.
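
The spectrophotometric quantitation can be sketched as follows. The conversion factor of 50 µg/ml of double-stranded DNA per absorbance unit at 260 nm (1 cm path length) is the conventional value and is an assumption here, since the text does not state which factor was applied; the function names are likewise hypothetical.

```python
def dna_conc_ug_per_ml(a260, dilution_factor, ug_per_ml_per_a260=50.0):
    """Concentration of the undiluted DNA stock in micrograms per ml.

    Assumption: 1 A260 unit corresponds to 50 ug/ml of double-stranded DNA.
    """
    return a260 * dilution_factor * ug_per_ml_per_a260


def total_yield_ug(a260, dilution_factor, stock_volume_ml):
    """Total DNA recovered, given the volume of the undiluted stock."""
    return dna_conc_ug_per_ml(a260, dilution_factor) * stock_volume_ml
```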

Plasmid preparation with TB. The method is similar to that outlined

above for L-Broth except that the TB medium was used. TB was prepared

as described by Tartof and Hobbs (1987). Bacto tryptone (6.65 gr.),

13.3 gr. of yeast extract, and 2.2 ml of glycerol were prepared in 450

ml of ddH2O. The medium was sterilized in the autoclave for 30 min. To

the sterile solution was added 55.5 ml of sterile 0.17M KH2PO4, 0.72M

K2HPO4. This medium was inoculated and bacteria were grown as above.

Because the medium is very rich, the yields were often large so

bacteria that contained pBR322 plasmids were not induced with

chloramphenicol. The DNA was prepared by the same method except that

the original volume of cells was split into two aliquots at the

beginning of the isolation procedure. This was found to be essential

and greatly facilitated lysis and subsequent isolation of the plasmid

DNA. For comparative purposes, 500 ml of TB can produce 4-5 mg of

total plasmid DNA in comparison to 1 mg with L-Broth with


Production of unidirectional deletions with Exonuclease III. This

method was carried out essentially as described by Stratagene (San

Diego, CA) from which the reagents were purchased. The method takes

advantage of the fact that Exonuclease III cannot digest 3' single

strand overhangs. For our purposes the pFO005 insert was cloned into

the PstI/HindIII sites of Bluescript M13+. The HindIII site is adjacent

to an ApaI site in the vector. To produce the deletions in which we

were interested, the pFO005 Bluescript clone was digested with HindIII

(5' overhang) and ApaI (3' overhang) to completion. We then mixed three

µg of digested DNA, 25 µl of 2X Exonuclease III buffer (100 mM Tris-HCl

pH 8.0, 10 mM MgCl2, 20 µg/ml tRNA), 5 µl of freshly prepared 200 mM 2-

mercaptoethanol, 30 units of Exonuclease III, and enough ddH2O to make

the final volume 50 µl. The reaction conditions were established

through a series of titration experiments to determine the extent of

deletion with time. After the addition of the enzyme (added last) 10 µl

aliquots were removed every minute for 5 min., diluted with 80 µl 1X

Mung Bean nuclease buffer (5X = 150 mM NaOAc, pH 5.0, 250 mM NaCl, 5

mM ZnCl2, 25% glycerol) and heated to 68°C for 15 min. Once the

deletion reactions had been stopped 9 units of Mung Bean nuclease in

dilution buffer (1X = 10 mM NaOAc, pH 5.0, 0.1 mM ZnOAc, 1 mM cysteine,

0.001% Triton X-100, 50% glycerol) were added and the reaction allowed

to proceed at 30°C for 30 min. The reaction was stopped by the addition

of 100 µl of phenol/chloroform and extracted. The aqueous layer was

removed and precipitated with 10 µl of 3 M NaOAc pH 7.0 and 2.5 volumes

of 95% ethanol. The DNA was recovered by centrifugation, ligated and

transfected as described below. This procedure worked very poorly and

resulted in very few positive clones. The deletions that were obtained

were characterized by run-off transcription from the T3 promoter of

each clone. The DNA was digested with NcoI and transcription reactions

carried out exactly as described by Stratagene. The transcripts were

electrophoresed on a 6% acrylamide, 8.3M urea gel and the extent of

deletion determined by comparison to run-off transcription from the

parental construct pFO005BS.
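
The comparison of run-off transcripts described above amounts to a subtraction of lengths. The sketch below is illustrative only: the function name and all lengths supplied to it are hypothetical, and it assumes that the run-off transcript spans the deleted region, so that each deleted base pair shortens the transcript by one nucleotide.

```python
def deletion_extents(parental_nt, clone_runoff_nt):
    """Estimate the deletion (in bp) carried by each clone.

    parental_nt: run-off transcript length from the parental construct.
    clone_runoff_nt: {clone name: run-off transcript length}.
    """
    extents = {}
    for clone, length in clone_runoff_nt.items():
        if length > parental_nt:
            raise ValueError(f"{clone}: transcript longer than parental")
        extents[clone] = parental_nt - length
    return extents
```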

DNA Fragment Elution. After restriction enzyme digestion DNA

fragments were usually electrophoresed in low percentage agarose gels

(0.7 to 1.0%) with 1X TBE (10X = 500 mM Tris-HCl pH 8.3, 500 mM boric

acid, 10 mM EDTA) and visualized by long wave ultraviolet illumination

of the ethidium bromide stained band (2 µg/ml for 15 min.). The band of

interest was excised from the gel. The Fragment Eluter (IBI) was first

run for 30 min. with low salt buffer (20 mM Tris-HCl, pH 8.0; 5 mM NaCl;

and 0.2 mM EDTA) at 125 volts. The gel fragment was then placed in the

well and the V-channel filled with 100 µl of high salt buffer (3M

NaOAc, 5% glycerol, 0.01% Bromophenol Blue). It was important that the

gel slice remain in the same orientation as it had been run previously

to facilitate the removal of the band. The band was electroeluted at

150 V for 15-20 min. after which the high salt buffer was carefully

removed in 100 µl aliquots. A total of four 100 µl aliquots were removed

from each channel. Five micrograms of glycogen (Boehringer-Mannheim)

were added and the sample was precipitated with 1 ml of 95% ethanol at

-70°C for 30 min. The DNA fragment was then recovered by centrifugation

at 10k rpm for 30 min. Fragments isolated in this manner were found to

be directly suitable for ligation reactions or probe preparation.

DNA ligation. The ligation of DNA fragments was done with T4 DNA

ligase (New England Biolabs) and essentially as described by King and

Blakesley (1986). DNA fragments were digested with the appropriate

enzymes dictated by the cloning scheme and fragments and vectors were

mixed in 10 µl of 1X ligation buffer (5X = 250 mM Tris-HCl pH 7.6, 50

mM MgCl2, 25% (w/v) polyethylene glycol 8000 (Eastman Kodak), 5 mM ATP,

5 mM dithiothreitol). Usually the vector (a pUC plasmid) was treated

with phosphatase prior to the reaction and therefore the vector to

insert ratio was 3:1. Blunt end ligations were carried out with less

than 20 µg/ml of total DNA. Sticky end ligations were done at 20-40 µg/ml

and diluted after 4 hrs at room temperature. Generally 10-20 units of

ligase were added for sticky end ligations and 200-400 units for blunt

end ligations. After 4 hours the reactions were diluted 1:2 with 1X

ligase buffer and an additional aliquot of ligase added to the

reaction. The reactions were then incubated overnight at 14°C (sticky

end) and 4°C (blunt end). The reactions were diluted 1:2 with TE and

transfected into DH5 bacteria as described by the methods of Bethesda

Research Laboratories, and Hanahan (1983).
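
The 3:1 vector to insert ratio and the total DNA concentrations quoted above can be worked through as follows. This is a sketch under stated assumptions: the ratio is taken to be a molar ratio, the function names are hypothetical, and the fragment sizes supplied are chosen by the user rather than taken from the text.

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, vector_to_insert=3.0):
    """Insert mass (ng) giving the stated vector:insert molar ratio.

    Assumption: the 3:1 ratio in the text is molar, not by mass.
    """
    return vector_ng * insert_bp / (vector_bp * vector_to_insert)


def total_dna_ug_per_ml(total_ng, volume_ul):
    """Total DNA concentration; ng per microliter equals ug per ml."""
    return total_ng / volume_ul
```

For example, 90 ng of a 3000 bp vector would call for 10 ng of a 1000 bp insert, and 100 ng of total DNA in a 10 µl reaction corresponds to 10 µg/ml, which is within the limit given for blunt end ligations.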

Preparation of competent bacterial cells for transformation.

Bacteria, either DH5 or HB101, were grown in 100 ml of Luria broth to

an OD590 of 0.375. The cells were divided between two sterile 50 ml

conical tubes and placed on ice for 10 min. All subsequent procedures

were carried out at 4°C. The cells were then harvested by

centrifugation for 5 min. at 5k rpm. The supernatant was removed and

the cells gently resuspended in 10 ml of CaCl2 buffer (60 mM CaCl2, 10

mM PIPES pH 7.0, 15% glycerol). The cells were then centrifuged for 5

min. at 5k rpm and gently resuspended again in CaCl2 buffer. They were

then placed on ice for 30 min. and centrifuged at 2.5k rpm for 5 min.

The cells were resuspended in 2 ml each of CaCl2 buffer and dispensed

into 200 µl aliquots and frozen at -70°C until needed.

Transformation of bacteria with plasmid DNA. Competent bacterial

cells, either DH5 or HB101, were thawed on ice and 5-10 µl of the

ligation were added and incubated with the cells for 30 min. on ice.

The DH5 cells were heat shocked at 42°C, and the HB101 cells at 37°C.

The cells were briefly placed on ice and then diluted with 900 µl of

room temperature S.O.C. (2% Bacto tryptone, 0.5% yeast extract, 10 mM

NaCl, 2.5 mM KCl, 10 mM MgCl2, 10 mM MgSO4). The cells were incubated

at 37°C for 1 hour and then plated on TYN (1% Tryptone, 1% yeast

extract, 0.5% NaCl) medium with ampicillin. If detection of insertion

of a DNA fragment was possible (DH5 cells and pUC plasmids) then 30 µl

of 2% X-gal (5-bromo-4-chloro-3-indolyl-β-D-galactoside) and 20 µl of

100 mM IPTG (isopropyl-β-D-thiogalactopyranoside) were included with

the bacteria spread on the plate. Resistant colonies grew up overnight

and white colonies, indicative of a disrupted lac Z gene, were picked

for further analysis.

Rapid plasmid preparation. The method is essentially as described

by Ish-Horowicz and Burke (1981) with some modifications. One

milliliter of saturated overnight culture, grown in TYN or L-broth,

was centrifuged for 20 sec. in an Eppendorf microfuge. The solutions

for preparation of DNA were the same as for the large scale preparation

described above. The cells were resuspended in 100 µl Solution 1 and

incubated for 5 min. at room temperature. Solution 2 (200 µl) was

added and incubated on ice for 5 min. Solution 3 (150 µl) was added

and incubated on ice for 5 min. The cells were then centrifuged for 5

min. and the supernatant extracted with phenol/chloroform. The

supernatant was then precipitated with 2 volumes of 95% ethanol at room

temperature. DNA was then suitable for restriction enzyme digestion and

agarose gel analysis.

Growth and preparation of cell lines. C127 cells were utilized in

all transfections and were grown in 10 cm tissue culture dishes as

monolayer cultures. The medium used in all experiments was Dulbecco's

modified essential medium (Gibco) supplemented with 5% calf serum

(Gibco), 5% horse serum (Gibco), 2 mM L-glutamine, and 100 U/ml

penicillin, 100 µg/ml streptomycin. To initiate a cell line (histone

plasmid and pSV2neo) or transient (histone plasmid only) transfection

the cells were refed with 10 ml of medium 2-4 hours before application

of the DNA precipitate. Stable cell lines were initiated by the

cotransfection of the histone plasmid and pSV2neo in a 10:1 ratio. This

was done essentially as described by Graham and van der Eb (1973) and

Gorman et al. (1982). Plasmid DNA, usually 10 µg/construct, was

diluted to 450 µl with 1 mM Tris-HCl pH 7.9, 0.1 mM EDTA. This was then

mixed with 50 µl of 2.5 M CaCl2. The DNA solution was then added

dropwise to 500 µl of 2X Hepes Buffered Saline (280 mM NaCl, 50 mM

HEPES, 1.5 mM Na2HPO4, pH 7.12 ± 0.05) in a sterile 15 ml conical tube

while the tube was vortexed. The precipitates were allowed to stand for

20 min. and were grey and cloudy in appearance. A poor precipitate was

obvious as settling out occurred during the 20 min. incubation. The DNA

precipitates were added to the plates dropwise under sterile conditions

with gentle swirling. After 4 hours the medium was removed and the

cells were shocked for 1-2 min with 15% glycerol in medium. This was

removed, the cells washed with 10 ml of incomplete medium and refed

with 20 ml of complete medium. For transient transfections the cells

were incubated for 24-48 hours and then harvested (80-90% confluency)

as described below.
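
The final concentrations in the calcium phosphate precipitate follow directly from the volumes given above. The short sketch below is illustrative only; the function and variable names are hypothetical, and additive volumes are assumed.

```python
def final_conc(stock_conc, stock_volume, total_volume):
    """Concentration after dilution (C1 x V1 = C2 x V2), volumes additive."""
    return stock_conc * stock_volume / total_volume


# 50 ul of 2.5 M CaCl2 in the 500 ul DNA mix (450 ul DNA + 50 ul CaCl2).
cacl2_in_dna_mix_molar = final_conc(2.5, 50, 500)
# The same CaCl2 after dropwise addition to 500 ul of 2X HBS (1 ml total).
cacl2_in_precipitate_molar = final_conc(2.5, 50, 1000)
# The 2X Hepes Buffered Saline is brought to 1X in the same 1 ml.
hbs_fold = final_conc(2, 500, 1000)
```

Under these assumptions the precipitate forms in 125 mM CaCl2 and 1X Hepes Buffered Saline.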

Cell lines were initiated by growing the cells to confluency,

approximately 2-3 days. At this point the cells were split 1:5 into

five plates and the medium was supplemented with 500 µg/ml of Geneticin

(G418, Gibco). The aminoglycoside phosphotransferase 3'(II) gene

carried on the pSV2neo plasmid confers resistance to this antibiotic

and therefore permits cell growth if present. Cells were refed with

medium + G418 every 3-4 days until resistant colonies were apparent

and most of the other cells had died. This usually took approximately

2-3 weeks. All the colonies on an individual plate were pooled and

subsequently passaged in drug-free medium--these were referred to as

polyclonal cell lines. The clone name for a cell line contains several

designations. For example: pFO003p1, the pFO designates this construct

as originally derived from the λHHG 41 clone isolated by Sierra et al.

(1982), 003 describes the deletion construct, and p1 refers to

polyclone number 1. When an "m" is used instead of a "p" this indicates

a monoclonal cell line. To produce monoclonal cell lines, 12

individual colonies, 2-3 from each plate, were picked with a cotton

plugged sterile pasteur pipette and grown in 24 well cell plates

(Corning). After these cells had expanded they were grown in 6 and 10

cm dishes as described above.

Cell lines and C127 cells were frozen down periodically in medium

supplemented with 20% foetal calf serum (Gibco) and 10% glycerol. Cells

were washed off the plate in Puck's Saline + 0.02% EDTA, centrifuged at

1500 rpm for 2 min, resuspended in freezing medium in Nunc Cryotubes,

and placed at -70°C.

Southern blot analysis. This method has been used to determine the

copy number of the individual monoclonal cell lines and the status of

the integrated constructs with respect to flanking sequences and mode

of integration. In general, DNAs from individual monoclonal cell lines

were digested to completion with restriction enzymes in the buffer

recommended by the supplier. The restriction enzyme reactions were

stopped by the addition of 1/10 volume of running dye (X TBE, 50%

glycerol, 0.2% sodium dodecyl sulfate, 0.01% bromophenol blue, and

0.01% xylene cyanol) and heated to 65°C for 15 min. The DNA was then

loaded onto 1% agarose gels and run 16-18 hours at 70 V. Gels were

stained in ddH2O with 5 µg/ml ethidium bromide. Next, the gels were

soaked in 25 mM HCl for 10 min. to cause strand breaks that permit

better transfer and then transferred to Zetabind nylon membranes (AMF-

Cuno) as described by Southern (1975) except that the transfer buffer

was 0.4 M NaOH (methodology kindly provided by Dr. Harry Ostrer,

University of Florida, Department of Pediatric Genetics). Transfer was

complete in 20-24 hrs. The filters were gently washed in 2X SSC (20X

SSC = 3M NaCl, 0.3M Sodium Citrate, pH 7.0) 3 times for 15 min. each.

The filters were briefly air dried and then washed in 0.1X SSC, 0.5%

SDS for 1 hr at 65°C. At this point filters were stored at 4°C in

plastic Seal-a-meal bags. Blots were prehybridized in 5X SSPE (15X SSPE

= 2.69 M NaCl, 150 mM NaH2PO4, 15 mM EDTA, pH 7.7), 0.1% SDS, and 1.0%

non-fat dry milk (Carnation) at 67-68°C for 4-6 hrs. Hybridizations

were performed in the above solution with the addition of either

denatured nick-translated or oligolabelled probe. For blots probed with

histone H4 sequences 1-2 x 10^6 cpm/ml of probe were used in the

hybridization. For mouse 18S ribosomal RNA hybridizations, 1-2 x 10^5

cpm/ml of the pUC974 insert probe were utilized. The specific activity

of all probes was at least 1 x 10^8 cpm/µg. The length of hybridization

was from 18-20 hrs at 67-68°C. Filters were washed 3 times at room

temperature with agitation in 5 mM NaPO4 pH 7.0, 2 mM EDTA, and 0.2%

SDS. Each wash was 30 min in length. After a brief drying period the

filters were sealed in plastic bags (to prevent dehydration and

facilitate the subsequent removal of probe fragments) and exposed to

preflashed XAR-5 film (Kodak) at -70°C.

Preparation of DNA from monoclonal and polyclonal cell lines. The

medium from each plate was removed and 2 ml of Puck's saline (Gibco)

with 0.02% EDTA were added. The cells were physically removed from the

plate by scraping with a rubber spatula and placed in a sterile 15 ml

Corex tube. The cells were pelleted by centrifugation at 1500 rpm for

2 min. at 4°C in an IEC-International centrifuge. At this point the

supernatant was removed and the cells were snap frozen on dry ice.

Frozen pellets were quickly resuspended in 1 ml of 0.1X SSC, 1.0% SDS,

and 200 µg/ml proteinase K (Sigma Chemical Company) and incubated for 4

hrs to overnight at 37°C. This mixture was then extracted 2 times and

precipitated with 2 volumes of 95% ethanol at -20°C overnight. The

precipitated nucleic acids were recovered by centrifugation at 10K rpm

for 10 min. at 4°C. The pellet was dried briefly and resuspended in 1

ml of TE and RNaseA (Sigma) was added to a final concentration of 50

µg/ml. Digestion proceeded for 1 hr at 37°C and was stopped by the

addition of SDS to 0.5% and phenol/chloroform extraction. DNA was then

precipitated with 2 volumes of 95% ethanol, centrifuged at 10K for 10

min, and the pellet resuspended in 500 µl of TE and stored at 4°C.

Copy number analysis. Approximately 30 µg of genomic DNA from an

individual cell line were diluted to 50 µl with TE. Digestions were

carried out in EcoRI buffer (Boehringer-Mannheim) with the following

regime: 1 unit/µg of EcoRI and XbaI were added and incubated at 37°C

for 4-8 hrs, at which point an additional 1 unit/µg was added and the

digestion proceeded overnight (16-18 hrs). The DNA was quantitated by

diluting 5 µl of the digestion into 1 ml of TE and determining the

OD260. The completion of digestion was determined by gel

electrophoresis of a small aliquot of the digestion on a 1% agarose

minigel (Bio-Rad). Ten micrograms of digested DNA were electrophoresed

and blotted as above (Southern Blotting). The probes used for the copy

number determination were either the EcoRI/XbaI fragment from pFO002

(for the human H4 histone genes) or the BamHI/SalI fragment from p974

(mouse 18S ribosomal gene for quantitation). The probes were labelled

by either nick-translation or oligolabelling (see below). The copy

number quantitation of the human H4 histone gene was done by

densitometric scanning of multiple autoradiograms. The exact amount of

DNA in each lane was determined by reprobing the Southern blots with

the mouse 18S ribosomal gene. This gene served as an internal control

for variations in the actual amount of DNA loaded and any loss during

the process. The copy number of the mouse 18S ribosomal gene should be

invariant and all densitometric values for the human H4 histone genes

were corrected to account for the actual amount of DNA in the lane

based on the internal control.
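
The correction for DNA loading described above can be sketched as a simple normalization. The code below is illustrative and is not the procedure as actually implemented: the function name, the choice of a single reference lane, and the example signal values are all assumptions, and the conversion of a corrected signal into an absolute copy number is not attempted.

```python
def correct_for_loading(lanes, reference_lane):
    """Correct human H4 densitometric signals for the DNA actually loaded.

    lanes: {lane name: (human H4 signal, mouse 18S signal)}.
    Each H4 value is divided by that lane's 18S signal relative to the
    18S signal of the reference lane.
    """
    ref_rrna = lanes[reference_lane][1]
    corrected = {}
    for name, (h4_signal, rrna_signal) in lanes.items():
        if rrna_signal <= 0:
            raise ValueError(f"{name}: no 18S signal to normalize against")
        loading = rrna_signal / ref_rrna
        corrected[name] = h4_signal / loading
    return corrected
```

A lane carrying twice the 18S signal of the reference lane therefore has its H4 value halved before cell lines are compared.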

Labelling of DNA fragments using Klenow fragment. This was done as

described by Maniatis et al. (1982). Two hundred nanograms of plasmid

or λ phage DNA were digested to completion with the restriction enzymes

of choice. One to two microcuries of [α-32P]dCTP were added with 0.5

units of the large fragment of E. coli DNA polymerase I (Klenow

fragment, BRL). The reaction was incubated for 10 min. at room

temperature. Then 2 µl of 0.2M EDTA, 100 µl of 0.3M sodium acetate,

and 20 µg of yeast tRNA were added to stop the reaction. The labelled

DNA fragments were recovered by precipitation with 95% ethanol at

-70°C. The DNA was recovered by centrifugation and resuspended in 100 µl

of TE.

Nick translation and oligolabelling. Both of these methods were

utilized for the production of DNA hybridization probes. Nick

translation was done as described by Rigby et al. (1977). For the copy

number experiments the EcoRI/XbaI fragment of pFO002 was isolated with

the IBI fragment eluter and 250 ng were used in the reaction. A 25 µl

reaction was composed of 2.5 µl of 10X buffer (500 mM Tris-HCl pH 7.5,

50 mM MgCl2, 1 mg/ml bovine serum albumin (BSA, Sigma Fraction V)),

2.5 µl of 10X nucleotides (330 µM each of dATP, dGTP, dTTP), 40-80 µCi

of [α-32P]dCTP, 2.5 units of E. coli DNA polymerase I (BRL), 1 µl of a 1

x 10^-4 dilution of DNaseI (stored in 10 mM HCl at 1 mg/ml) activated at

1:100 for 1-2 hours on ice in 10 mM Tris-HCl pH 7.5, 5 mM MgCl2, 1

mg/ml BSA. The reaction was begun with the final addition of the DNaseI

and incubated at 14°C for 45 min. The reaction was stopped by dilution

with TE and the probe purified over a pipette (10 mm x 100 mm, Fisher)

column of Biogel A1.5m in TE. The sample was applied to the column in a

200 µl aliquot and 200 µl fractions were collected. The labelled DNA

usually came off in fractions 6-10. These were pooled and quantitated

in the scintillation counter. The specific activity of these probes was

always greater than 1 x 10^8 cpm/µg. Oligo-labelling was done as

described by Feinberg and Vogelstein (1983). The DNA fragment (100 to

200 ng) was added to a 1.5 ml Eppendorf tube and ddH2O added to make

the final volume after addition of the other components either 12.5 µl

or 25 µl. This tube was then heated to 95-100°C for ~two minutes and

placed on ice. To this denatured DNA fragment was added 10 µl of 2X

oligolabelling buffer (2X = 500 mM Hepes pH 6.6, 50 µM each of dATP,

dGTP, dTTP; 125 mM Tris-HCl pH 8.0, 25 mM 2-mercaptoethanol, 0.55 mg/ml

mixed hexanucleotides (Pharmacia)). We added 25-50 µCi of [α-32P]dCTP

and 2.5 units of Klenow fragment (BRL). The reaction was allowed to

proceed for 2 hours to overnight and purified as described above for

the nick translation reaction. Specific activity of these probes

usually exceeded 2-4 x 108 cpm/pg.
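
The specific activity figures quoted for these probes are simply the counts recovered in the pooled column fractions divided by the mass of DNA labelled. A minimal sketch of that arithmetic follows; the function name and the example counts are hypothetical, and it assumes that all of the input DNA is recovered in the pooled fractions.

```python
def specific_activity(pooled_cpm: float, dna_ng: float) -> float:
    """Return probe specific activity in cpm per microgram of DNA.

    pooled_cpm: total counts per minute in the pooled column fractions.
    dna_ng:     nanograms of DNA fragment added to the labelling reaction
                (assumed here to be fully recovered).
    """
    if dna_ng <= 0:
        raise ValueError("dna_ng must be positive")
    return pooled_cpm / (dna_ng / 1000.0)


# Hypothetical example: 250 ng of fragment and 3.0e7 cpm in the pooled fractions.
activity = specific_activity(3.0e7, 250.0)
print(f"{activity:.2e} cpm/ug")
```

With these illustrative numbers the probe would fall just above the 1 x 10^8 cpm/µg threshold mentioned for nick translation.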

Preparation of total cellular RNA. Because of the sensitivity of

histone mRNA to degradation following the cessation of DNA synthesis,

it was important that the initial steps of this protocol be carried out

as quickly as possible.

The medium from 2-4 plates was removed and 1 ml of cold Puck's

saline (Gibco) + 0.02% EDTA was added and the cells were immediately

scraped from the dish and transferred to a sterile, DEPC treated, Corex

tube. The cells were pelleted in the clinical centrifuge at a setting

of five for 2 min., the supernatant was removed and the cells were

frozen on dry ice and subsequently stored at -20°C for no more than a

few days. Degradation can occur quickly and therefore it was necessary

to prepare the RNA as soon after harvesting as possible. The cell

pellet was resuspended in 1 ml of 2 mM Tris-HCl pH 7.4, 1 mM EDTA, and

10 µg/ml polyvinylsulfate (PVS, Eastman Kodak). SDS (10%) was added to

a final concentration of 1% and proteinase K added to 200 µg/ml.

Incubation was at 37°C for 30 min., at which point 5 M NaCl was added to

a final concentration of 500 mM and the incubation continued for an

additional 15 min. The total nucleic acids were extracted with 2

volumes of phenol/chloroform, 2 times, and with 3 volumes of

chloroform 1 time. The total nucleic acid was then precipitated by the

addition of 60 µl of 3 M NaAc and 2.5 vols of 95% ethanol (-20°C

overnight). The nucleic acids were recovered by centrifugation at 10K

rpm for 15 min. at 4°C. The pellet was resuspended in 500 µl of 10 mM

Tris-HCl (pH 7.4), 2 mM CaCl2, and 10 mM MgCl2 with the addition of 25

µl of proteinase K treated DNase I (see below for preparation) and

digested at 37°C until it was completely suspended (this usually

required from 30 min. to 1 hr., intermittent vortexing helped to

disrupt the pellet). When the pellet was no longer visible, SDS and

NaCl were added to a final concentration of 0.5% and 250 mM,

respectively. The solution was extracted 2 times with phenol/chloroform

and 1 time with chloroform, and precipitated with 3 vols of 95% ethanol

overnight. RNA was either stored in water at -70°C or in ethanol at

-20°C. Ethanol suspensions needed to be vigorously mixed to avoid

quantitation problems with the RNA aliquots. RNA stored in water was

also mixed before removal.

Preparation of RNase free DNaseI. Deoxyribonuclease I (Sigma) (1

mg/ml in 20 mM Tris-HCl pH 7.4, 10 mM CaCl2) was preincubated at 37°C

for 20 min. and then further incubated for 2 hrs. at 37°C in the

presence of 0.1 volumes of proteinase K (1 mg/ml in 20 mM Tris-HCl pH

7.4, 10 mM CaCl2) to digest any contaminating ribonuclease activity as

described by Tullis and Rubin (1980). This preparation was stable on

ice for several hours to overnight.

S1 nuclease protection assay. This method is essentially as

described by Berk and Sharp (1977) with modifications. In order to

detect the human histone H4 mRNAs 25 µg of total cellular RNA from a

C127 cell line containing an integrated human H4 histone gene construct

were added to a DEPC treated 1.5 ml Eppendorf tube. Sufficient human

and mouse probe, labelled with [γ-32P]ATP, was added to provide an

excess (5 to 10 ng) of protected fragment in the reaction. Probe excess

was either determined by titration of the probes with a stock C127 or

HeLa RNA sample or by addition of twice the amount of probe to some

reactions. One twentieth volume of 5 M NaCl and 3 volumes of 95% ethanol

were added and the solution was placed on dry ice for 15-30 min. The

precipitated RNA and probes were recovered by centrifugation at 10K

rpm for 15 min. at 4°C. The pellet was briefly dried in a Savant Speed

Vac (1-2 min.). Four microliters of 5X hybridization buffer (2 M NaCl,

0.2 M Pipes pH 6.4, and 5 mM EDTA) were added followed by 16 µl of

recrystallized formamide (Specialty Biochemicals). The buffer was added

first to the pellet to facilitate rehydration. The final volume, 20 µl,

was vortexed vigorously to resuspend the precipitated RNA and probe.

The tubes were placed at 90°C for 10 min. and then transferred

immediately to a 55°C water bath and incubated for 12-18 hrs

(overnight). Each tube was removed individually from the water bath and

the reaction diluted immediately with 8 volumes of ice-cold S1

digestion buffer (280 mM NaCl, 50 mM NaOAc, pH 4.5, and 5 mM ZnSO4) and

placed briefly on ice. S1 nuclease (Boehringer-Mannheim) was added to a

final concentration of 3 units/µl and digestion was then done at

24-26°C for one hour and at 4°C for 15 min. (the tubes were placed on

ice). Ten microliters each of 10% SDS and 5 M NH4OH were added and the

reaction was extracted and precipitated with 3 volumes of 95% ethanol.

The length of precipitation was from 3-12 hours at -20°C. (The

precipitations should not be done at -70°C as this will cause the

formation of formamide crystals). The precipitated probe fragment was

recovered by centrifugation at 10K rpm for 30 min. The pellet was

briefly dried and resuspended in 2-4 µl of loading buffer (80%

formamide, 1X TBE, 0.01% Bromophenol Blue, and 0.01% Xylene Cyanol).

Samples were denatured at 100°C for 3 min. and placed immediately on

dry ice until loaded. Samples were electrophoresed on a 6%

polyacrylamide, 8.3 M urea gel at 50W constant power for 3-4 hours

(the acrylamide to bisacrylamide ratio was 20:1). Gels were dried and

exposed to preflashed XAR-5 film (Kodak) at -70°C with Dupont Cronex

Lightning Plus Screens.
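
The final composition of the 20 µl hybridization mixture follows from the volumes given above (4 µl of 5X buffer plus 16 µl of formamide). The short sketch below works through that dilution; the helper name is hypothetical and the calculation assumes the formamide stock is 100% and that volumes are additive.

```python
def final_concentration(stock: float, stock_volume_ul: float, total_volume_ul: float) -> float:
    """Concentration after diluting stock_volume_ul of stock to total_volume_ul."""
    return stock * stock_volume_ul / total_volume_ul


# 5X hybridization buffer components as given in the text.
buffer_5x = {"NaCl (M)": 2.0, "Pipes (M)": 0.2, "EDTA (mM)": 5.0}

# 4 ul of 5X buffer in a 20 ul hybridization.
final = {name: final_concentration(conc, 4.0, 20.0) for name, conc in buffer_5x.items()}

# 16 ul of formamide (assumed 100%) in the same 20 ul.
formamide_percent = final_concentration(100.0, 16.0, 20.0)

for name, conc in final.items():
    print(f"{name}: {conc:g}")
print(f"formamide (%): {formamide_percent:g}")
```

Under these assumptions the hybridization is carried out in 0.4 M NaCl, 40 mM Pipes, 1 mM EDTA, and 80% formamide.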

DNA sequencing. All sequencing reactions were carried out exactly

as described by Maxam and Gilbert (1980) and so will not be detailed

here. For each fragment that was sequenced the G (Dimethyl Sulfate,

(DMS)); G+A (Formic acid); C+T (Hydrazine); C only (Hydrazine in high

salt); and A>C (1.2 N NaOH) reactions were done. Single end labelled

fragments were prepared as follows: plasmid DNAs were digested with an

appropriate restriction endonuclease, treated with phosphatase, and

labelled as described below. After the DNA was labelled it was digested

with a second restriction enzyme to produce two single end labelled

fragments. To purify the fragment of interest for analysis we

electrophoresed the DNA on a native 4% acrylamide gel. The location of

each labelled DNA band on the gel was determined by exposure to Cronex

(Dupont) X-ray film. After alignment of the film and the gel we excised

the bands of interest and eluted them in 500 µl of 500 mM ammonium

acetate, 10 mM MgCl2, 0.5% SDS, overnight at 37°C as described by

Maxam and Gilbert (1980). The acrylamide gel slice was ground with a

siliconized glass rod in a 1.5 ml Eppendorf tube prior to addition of

the elution buffer. After the overnight incubation the acrylamide was

centrifuged to the bottom of the tube at 10K rpm for 5 min. The

supernatant was removed and the pellet resuspended in 200-400 µl of

elution buffer, centrifuged, and the supernatant removed. This

procedure routinely resulted in recoveries of 80-90% of the labelled

DNA fragment. The pooled supernatants were then precipitated twice in

succession with 3M Sodium Acetate and 95% ethanol. These fragments were

then used in the sequencing reactions noted above. After the reactions

were carried out and the DNA was cleaved with piperidine and

lyophilized, it was electrophoresed (50W constant power) on a 6%

acrylamide, 8.3M urea gel (45 cm x 30cm x 0.5mm). The samples were

resuspended in 6 µl of S1 loading buffer and divided into two 3 µl

aliquots. These were boiled for 3 min. and placed on dry ice. To

maximize the amount of the sequence we could read, two loadings of the

reactions were done. The first 3 µl sample of each reaction was loaded

and electrophoresed for 5-6 hours or until the Bromophenol Blue reached

the bottom of the gel. The second sample was then loaded and

electrophoresed for an additional 5-6 hours. The gel was then dried and

exposed to either Cronex or XAR-5 film at room temperature overnight.

S1 nuclease analysis probe preparation. Two probes were routinely

used to quantitate the amount of human and mouse histone H4 mRNA

present in cell line samples. The human probe was prepared by digestion

of 50-100 µg of pFO005 or pFO002 with NcoI. This digestion was then

extracted, precipitated, and the DNA recovered by centrifugation at

10K rpm for 15 min. The pelleted DNA was resuspended in 50 µl of 50 mM

Tris-HCl pH 8.0, 0.1 mM EDTA; 1 unit of calf intestinal phosphatase

(CIP) was added and the mixture was incubated at 37°C for 30 min. An

additional aliquot of enzyme was added and the DNA incubated for 30

min. The reaction was stopped by the addition of EGTA

(ethyleneglycol-bis-(β-aminoethyl ether)-N,N,N',N'-tetraacetic acid) to 10 mM and

heated to 65°C for 20 min. The DNA was then extracted and precipitated.

The DNA was resuspended in 10 µl of [γ-32P]ATP (100 µCi) and 1 µl of 10X

Kinase buffer (500 mM Tris-HCl pH 7.6, 100 mM MgCl2, 100 mM

2-mercaptoethanol). After resuspension, 15 units of T4 polynucleotide

kinase (United States Biochemical Corporation) were added and the

reaction incubated at 37°C for 45 min. The reaction was stopped by

extraction followed by precipitation. The DNA was recovered,

resuspended and digested with HindIII to produce a probe fragment

labelled at the NcoI site in the human H4 gene. The reaction was

electrophoresed on a 1.0% agarose gel in 1X TBE and the 695 bp

NcoI/HindIII fragment, labelled at the NcoI site, was purified with the IBI

fragment eluter as described by IBI. The mouse H4 probe was produced in

a similar manner from the plasmid pBR-mus-hi-l-H4-Hinfl (Seiler-Tuyns

and Birnstiel, 1981) digested with BstNI. The labelled 1000 bp BstNI

fragment was isolated and used as a control in each S1 nuclease

protection assay. Although this probe was not single end labelled, we

had no ambiguities because of this fact. To make the probe shorter and

single end labelled would have possibly obscured the protected fragment

of the human H4 gene (280 nt). Both the human and mouse H4 S1 nuclease

probes were quantitated on agarose gels stained with ethidium bromide

and exposed to Cronex X-ray film to judge the relative strength of

each. Generally a large amount of probe (several micrograms) was

prepared simultaneously and S1 nuclease analysis was done on many

samples to ensure that the expression was measured with the same

strength probe in each case. Variation in the mouse and human probe

specific activity did occur; however, the data presented in this work

were prepared primarily from a large set of S1 nuclease assays in

which many cell lines were assayed side by side with the same mouse and

human probe preparation. When additional cell lines were subsequently

measured, samples assayed previously were included to ensure that the

results could be related to results from previous assays.

Densitometry and data analysis. Densitometry of autoradiograms

was done to quantitate the S1 nuclease analysis experiments of H4 gene

expression and the copy number of the cell lines. Several films of

different length exposure were utilized to determine the intensity of

the S1 protected fragment signal. Two densitometers were used, a Zeineh

laser densitometer and an LKB-Pharmacia high intensity laser

densitometer. Comparison of the capabilities of each densitometer

demonstrated that for most films either one was adequate; however for

particularly low intensity signals the LKB machine gave more

reproducible results. The data collected by both densitometers were

computer processed with either the Videophoresis II (Zeineh, Biomed

Instruments) or the GelScan XL programs (LKB-Pharmacia). Each program

was successfully used to analyze the intensity of radioactive signals

for expression and copy number. The areas under the curve for the S1

nuclease analysis (mouse and human) and the copy number blots (H4 and

18S ribosomal) were integrated and expressed as an amount of absorbance

units. To calculate the expression of a particular construct, the human

expression value was divided by the mouse value and expressed as a

ratio. Sample calculations for copy number are presented in Appendix A

and for S1 nuclease analysis in Appendix B.
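
The normalization described in this section can be sketched in a few lines. This is only an illustration of the ratio arithmetic, not the sample calculation of Appendix B; the function names and the absorbance values are hypothetical.

```python
def expression_ratio(human_area: float, mouse_area: float) -> float:
    """Human H4 signal normalized to the endogenous mouse H4 signal.

    Both arguments are integrated densitometer peak areas (absorbance units)
    taken from the same lane of the S1 nuclease autoradiogram.
    """
    if mouse_area <= 0:
        raise ValueError("mouse_area must be positive")
    return human_area / mouse_area


def expression_per_copy(ratio: float, copy_number: float) -> float:
    """Human/mouse expression ratio divided by the human H4 gene copy number."""
    if copy_number <= 0:
        raise ValueError("copy_number must be positive")
    return ratio / copy_number


# Hypothetical lane: human peak 0.5 units, mouse peak 2.0 units, 5 gene copies.
ratio = expression_ratio(0.5, 2.0)
per_copy = expression_per_copy(ratio, 5.0)
print(ratio, per_copy)
```

Dividing by the mouse signal corrects each lane for differences in RNA recovery and in the proportion of S-phase cells, so that lanes can be compared with one another.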

Agarose and acrylamide gel electrophoresis. Agarose (Bio-Rad

molecular biology grade) gels were prepared as described by Maniatis et

al. (1982). The gel buffer and the reservoir buffer were both 1X TBE.

Gels of 20 x 25 cm were used for large scale fragment

purification and Southern blot analysis of cell line DNAs. Minigels

were used for checking the extent of digestion and analysis of rapid

and other plasmid preparations. Acrylamide gels were routinely run for

S1 nuclease analysis and consisted of 6% acrylamide (20:1 acrylamide to

bis acrylamide), 8.3 M urea, and 1X TBE. The gel solution (75 ml) was

polymerized with the addition of 750 µl of 10% ammonium persulfate and

20 µl of N,N,N',N'-tetramethylethylenediamine. It was immediately

poured, the comb put into place and allowed to harden for 1 hour.

Before use the wells were rinsed with buffer and the gel was

preelectrophoresed for 30 min. at 50W constant power. The samples were

loaded and electrophoresed at 50W constant power.

Genomic sequencing. This technique was done as described by Church

and Gilbert (1984). Monoclonal cell lines pFO003m1, 5, and 6 were grown

in 15 cm plates (10 per construct). Seven of the 10 were treated with

0.5% DMS in 2-3 mls of medium for 1-2 minutes. Three were left

untreated, the DNA purified, and treated with DMS in vitro as a

control. The DMS was removed from the plate and the cells washed twice

in phosphate buffered saline (PBS = 150 mM NaPO4, 150 mM NaCl, pH 7.2, 60

mM Tris-HCl, pH 7.4). The DMS treated cells were scraped from the plate

and the DNA purified by incubation with proteinase K as described above

and extraction. To purify high molecular weight DNA only, 95% ethanol

was slowly added to the tube while swirling the solution with a

siliconized glass rod. The DNA was washed off the rod with TE and

quantitated spectrophotometrically. The purified DNA (30 µg) was

restricted with Hinc II, treated with piperidine and lyophilized as

described by the sequencing protocol of Maxam and Gilbert (1980). The

samples were then separated in a 6% acrylamide gel, with 8 M urea and

electrotransferred to a nylon membrane (Genescreen). The hybridization

probe was prepared as described by Pauli et al. (1987) with primer

extension of a fragment cloned into M13. In our experiments

hybridization was performed with the Hinc II 5' upper strand probe at

65°C for 16 hrs, followed by eight 5 min. washes at 65°C (1 mM EDTA, 40

mM NaHPO4, pH 7.2, 1% SDS). The membrane was then exposed to preflashed

XAR-5 film at -70°C. In these experiments I was responsible for the

growth of the cells and Dr. Urs Pauli performed the rest of the

experiment, with my constant encouragement, and occasional


Statistical analysis. The analysis of the S1 nuclease and copy

number data that we accumulated was suggested by Dr. Mike Conlon of the

University of Florida Biostatistics Unit. After he had examined the

data and gained an understanding of the complexities involved, he

advised that we employ a ranking test, the Wilcoxon Rank Sum Test. This

test makes the null assumption that two groups of data that are

compared came from the same random distribution. The members of both

groups are pooled and assigned a rank (i.e. 1, 2, 3, ...) from lowest to

highest. For example if we had two sets of data, A = 1, 2, 4,

6, and 12 and B = 10, 14, 16, 19, and 25, the members of group A and B

would be ranked together in order of increasing value. The absolute values of

the data are ignored and only the rank is examined.

Group A: (1, 2, 4, 6, 12) is converted to Ranks 1, 2, 3, 4, 6.

Group B: (10, 14, 16, 19, 25) is converted to Ranks 5, 7, 8, 9, 10.

We have 5 members in each group with only one point of overlap

between the two groups at ranks 5 and 6. The Rank Sum for group A = 16

and for group B = 39. To determine if the difference of the Rank sums

is significant, statistical tables of probability for this test were

employed. These two groups of data are not significantly different at

p < 0.05. The reason is the small sample size. With only five members

in each group the fact that one of the members of each group falls into

the range of the other group precludes any significance. As the groups

become larger the overlap allowed for significance becomes greater. I

have found with some of my data that larger sample sizes would have

been necessary to employ this test in all cases.
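
The ranking step of the test can be illustrated with the two example groups given above. The sketch below only pools the data, assigns ranks in increasing order, and sums the ranks per group; it does not look up significance, which in this work was taken from statistical tables. The function name is hypothetical, and tied values (none occur in the example) are given the average of the ranks they occupy, which is the usual convention.

```python
def rank_sums(group_a, group_b):
    """Return (rank sum of A, rank sum of B) after ranking the pooled values.

    Values are ranked in increasing order starting at 1. Tied values share
    the average of the ranks they span.
    """
    pooled = sorted([(v, 0) for v in group_a] + [(v, 1) for v in group_b])
    sums = [0.0, 0.0]
    i = 0
    while i < len(pooled):
        # Find the end of the run of values tied with pooled[i].
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # The run occupies ranks i+1 .. j, whose average is (i + 1 + j) / 2.
        average_rank = (i + 1 + j) / 2.0
        for _, label in pooled[i:j]:
            sums[label] += average_rank
        i = j
    return sums[0], sums[1]


group_a = [1, 2, 4, 6, 12]
group_b = [10, 14, 16, 19, 25]
sum_a, sum_b = rank_sums(group_a, group_b)
print(sum_a, sum_b)
```

A quick consistency check is that the two rank sums must add up to n(n + 1)/2 for the n pooled values, which is 55 for ten observations.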



It has been established that the steady state level of histone mRNA

during the cell cycle is a function of both transcription and message

stability. These two components of histone mRNA metabolism have been

studied in a number of different ways. Earlier studies by Plumb et al.

(1983a, b) utilized pulsed incorporation of 3H-uridine to determine the

contribution of transcription to the increase in histone mRNA levels

during the S-phase of the cell cycle. Later, Baumbach et al. (1987)

used nuclear run-on transcription to measure transcription of the

histone genes directly during the cell cycle. The increase in

transcription during early S-phase was determined to be 3-5 fold by

both Baumbach et al. (1987) and Plumb et al. (1983b). In the studies

of Baumbach et al. (1987), message stability was eliminated as a

variable in the experiments, and therefore they were able to determine

that histone gene transcription occurred throughout the cell cycle at a

basal level. Instead of an "on/off" mechanism for transcriptional

control an "enhancement" was apparent during the first 4 hours of S-

phase. The 3-5 fold enhancement in the histone gene transcription

level has been duplicated in various systems and by different methods

during the last 5 years (Sittman et al., 1983; Heintz et al., 1983;

Artishevsky et al., 1987).

The implications are that protein/DNA or protein/protein

interactions occur that stimulate the increased level of

transcription. Evidence for specific protein/DNA interactions has been

gathered by Artishevsky et al. (1987). They demonstrated, at the end

of G1 and the beginning of S phase, the presence of a factor that

interacted with the proximal promoter region of the hamster H3

promoter. The FO108 H4 gene, with which my work has been done, also

demonstrates protein/DNA interactions in the proximal promoter region

(Pauli et al., 1987, van Wijnen et al., 1987); however, there are no

detectable changes in these interactions during the cell cycle. Since

it has been demonstrated that transcription of the FO108 H4 histone

gene proceeds throughout the cell cycle at a basal level, it was of

interest to discover what sequences are necessary for basal and

enhanced expression. The promoter of the FO108 H4 histone gene is

potentially extensive and so deletions that encompass the entire 6.5 kb

of possible promoter sequence were prepared and analyzed. In the

proximal region of the promoter we wanted to understand the

functionality of elements such as the TATAA box, GGTCC element, Sp1

binding site, and putative CAAT boxes. More distal elements have also

been examined and these included a possible enhancer and negative

regulatory element located thousands of base pairs upstream.

As mentioned in the introduction, the differences encountered in in

vivo and in vitro transcription systems have sometimes been

considerable. In order to ascertain the functional in vivo promoter

sequences of the FO108 human H4 histone gene, we constructed a series

of mouse C127 cell lines each containing a different H4 promoter



deletion construct. As described in the prologue to the Materials and

Methods section, we decided that this was the best way to proceed. We

hoped that stable integration into the chromosome would give the most

accurate information about the function of H4 promoter sequences.

Cell line construction

The first step in these experiments was to construct the cell

lines. The mouse C127 cell line was chosen because it was a

heterologous host and had been previously used to support the stable

expression of the FO108 human H4 gene in an episomal form (Green et

al., 1986). Many of the histone H4 plasmid DNA constructs were

available already (Figure 3-1), although as the work progressed several

more were prepared to answer various questions that arose. The

constructs are all products of subclones of the original λ human

histone gene clone 41 (λHHG41) isolated by Sierra et al. (1982) and

this is diagrammed at the top of Figure 3-1. The proximal deletion

constructs J67, J56, J50, K8, and L14 (Figure 3-1) were all available

and had been made by Bal31 deletion of pFO108A (Sierra et al., 1983).

The precise determination of each deletion point will be outlined later

in the chapter. A subclone of pFO108, pFO108A, prepared by Sierra et

al. (1983), lacks some 3' sequences including an Alu repeat. Plasmid

pFO005 was made by A. van Wijnen from a HindIII digestion of pFO002.

Plasmid pFO002 was prepared from a BamHI/PstI digest of λHHG41 to

obtain a fragment with 1065 bp of 5' flanking sequence. Plasmid pFO003

was prepared from an XbaI digest of λHHG41 and has 6.5 kb of 5'

flanking sequence. Additional clones will be described as they pertain

to subjects under discussion later in the chapter--positive and

negative regulatory elements.

Initiation of Transcription and Basal Regulation

The initiation of transcription by RNA polymerase II and the

sequences required for it have been studied in considerable detail in a

number of genes, as outlined in the introduction (reviewed in Shenk,

1981). The importance of the TATA box has been established in vitro

and in vivo, and it is thought to be primarily responsible for the

specification of the transcription initiation site. We constructed

cell lines with several of the short proximal deletion constructs in

order to ascertain what sequences in the FO108 H4 histone gene were

necessary for the initiation of transcription. The general protocol

for DNA transfection and the subsequent selection and expansion process

is outlined in Figure 3-2. The constructs were cotransfected into C127

cells with the plasmid pSV2neo. The inclusion of the pSV2neo plasmid

permitted selection for expression with the antibiotic Geneticin

(G418). Once resistant cells were present as distinct colonies the

plates were either pooled and passaged (polyclones) or picked and

expanded as monoclonal cell lines. The specific method is described in

the Materials and Methods section.

To determine the level of transcription from each of the proximal

deletion constructs, we analyzed cell lines early in passage. The

results from S1 nuclease analysis of total cellular RNA from polyclonal

cell lines 108A, L14, K8, J50, J56, and J67 are presented in Figure 3-3.

RNA was prepared from each cell line as described and hybridized to two

probes, human and mouse, at 55°C for 8-16 hours as described in

Figure 3-2

Flow diagram for the production of both polyclonal and
monoclonal mouse cell lines that contain stable
integrated human histone H4 genes.

The method relies on the cotransfection of the histone plasmid with a
selectable marker, pSV2neo. This plasmid carries the gene that confers
resistance to a derivative of neomycin. The cotransfection procedure
permitted the pSV2neo plasmid to be taken up with the histone plasmid
into the mouse C127 cells. These stable cell lines were utilized to
study human H4 gene expression. The specific protocol is outlined in
Materials and Methods.



C127 (40%) -> 4 hr -> glycerol shock -> 2 days -> split 1:5, G418 -> 2-3 wks -> polyclones (pool) or monoclones (pick)

Figure 3-3 S1 nuclease analysis of proximal deletion polyclonal
cell lines.

S1 nuclease analysis was done as described in Materials and Methods and
quantitated by densitometry. Lanes: the cell line name is denoted above
the lane. For example polyclonal cell line pFO108A number 1 is denoted
as 108Ap1; C, C127 total cellular RNA and H, HeLa total cellular RNA
incubated with both human and mouse S1 probes as a positive control for
the size of the mouse and human S1 protected fragments, respectively;
M, pBR322 HpaII marker labelled with [α-32P]dCTP and Klenow fragment.
Both human (280 nt) and mouse (110 nt) protected fragments are noted at
the right.

Materials and Methods. The mouse H4 histone probe was included as an

internal control in each S1 nuclease assay not only for the intactness

of the RNA preparation, but also as an indicator of the amount of

histone mRNA present in the sample. The half-life of a histone mRNA

after the cessation of DNA synthesis is very short (Plumb et al.,

1983a, Sittman et al., 1983), and therefore the growth conditions of

the cells and temperature at the time of harvest are critical for the

adequate recovery of histone mRNA.

We particularly wanted to determine if there was a minimal amount

of promoter that could initiate transcription in vivo and if this was

different than that seen in vitro. Previously the shortest Bal31

deletion, J67, had been shown to initiate mRNA synthesis accurately in

vitro in a whole cell extract (Sierra et al., 1983). As shown in Figure

3-3, the construct J67, which we later learned has only the TATA box

and the GGTCC element, produced no correctly initiated transcripts.

The only transcription products detectable from the J67 construct were

initiated upstream of the normal mRNA start site. These are denoted

with arrows in Figure 3-3, and occur in the cell lines with J50, J56,

and J67 integrated. The upstream transcription start sites map

primarily to the TATA box (-30 bp) and the deletion end points. The

"deletion end point transcripts" originate from outside of the histone

flanking sequences either in the plasmid or surrounding chromosomal DNA

and are detected by virtue of the lack of homology between the probe

and the mRNA past the deletion point.

The possibility that J67 was unable to express correctly initiated

H4 mRNA was based on a single polyclonal cell line. To assure

ourselves that this was not a result of a spurious integration event we

Figure 3-4   Southern blot analysis of polyclonal cell lines:
Intactness of 5' flanking regions and copy number of the
constructs in each cell line.

Genomic DNA purified from each cell line was digested with EcoRI and
XbaI, electrophoresed, blotted, probed, and quantitated as described in
Materials and Methods. Lanes: 1, pFO108Ap1, 2, pFO108Ap2, 3, L14p2, 4,
L14p3, 5, K8p1, 6, K8p2, 7, J50p1, 8, J56p1 (passage 4), 9, J56p1
(passage 8), 10, J67p1a. Histone plasmid markers (EcoRI/XbaI digested
pFO002) were included on the blot equal to 1.3 (10 pg), 6.5 (50 pg),
and 13 (100 pg) gene equivalents per diploid genome in order to
quantitate the human histone H4 copy number. H, HeLa DNA digested with
EcoRI and XbaI as a positive control for the 1070 bp fragment. M, λ DNA
digested with EcoRI and HindIII and Klenow labelled. Pertinent sizes
are denoted to the right in kilobases. The probe for this experiment
was the EcoRI/XbaI fragment of pFO002 that had been nick-translated as
described in Materials and Methods.

determined the intactness of the flanking and coding sequences for each

of the constructs J67, J56, J50, K8, L14 and 108A in Figure 3-4. This

experiment also permitted us to determine the copy number of each cell

line. Ten micrograms of genomic DNA from each cell line was digested

to completion with EcoRI and XbaI and electrophoresed on a 1% agarose

gel, blotted and probed as described in Materials and Methods. In

order to quantitate the copy number of each cell line the gel also

contained plasmid DNAs of known amounts digested with both EcoRI and

XbaI. Ten, 50 and 100 pg correspond to 1.3, 6.5 and 13 gene

equivalents per diploid genome respectively as designated in Figure

3-4. Several exposures of the autoradiogram were scanned with a Zeineh

laser densitometer and quantitated in comparison to the controls.

Additionally, the Southern blot in Figure 3-4 was quantitated for the

actual amount of DNA by densitometrically scanning a photographic

negative of the gel prior to transfer, and differences in DNA amounts

have been taken into account in the copy number calculation. Later,

copy number blots for other constructs were reprobed with a clone of

the mouse 18S ribosomal gene kindly provided by Dr. David

Schlessinger (Washington Univ., St Louis) to allow exact determination

of the amount of DNA loaded in each lane and subsequently transferred.

A sample copy number calculation in which the ribosomal probe was

utilized is presented in Appendix A.
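
The copy number estimate rests on the plasmid standards run on the same blot: 10, 50, and 100 pg correspond to 1.3, 6.5, and 13 gene equivalents per diploid genome, or 0.13 gene equivalents per picogram. The sketch below is not the calculation of Appendix A; it is one plausible way to express the comparison, fitting a line through the origin to the standards and correcting for the relative amount of DNA loaded. The function names, the loading factor convention, and the densitometer signals are all hypothetical.

```python
STANDARDS_PG = [10.0, 50.0, 100.0]
GENE_EQUIV_PER_PG = 0.13  # from the text: 10 pg = 1.3 gene equivalents per diploid genome


def fit_slope(x, y):
    """Least-squares slope of a line through the origin, y = slope * x."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


def copy_number(sample_signal, standard_signals, loading_factor=1.0):
    """Estimate gene copies per diploid genome for one lane.

    sample_signal:    densitometer signal of the H4 band in the sample lane.
    standard_signals: signals of the 10, 50, and 100 pg plasmid standards.
    loading_factor:   DNA actually loaded relative to the nominal amount
                      (1.0 means exactly the nominal amount was loaded).
    """
    copies = [pg * GENE_EQUIV_PER_PG for pg in STANDARDS_PG]
    signal_per_copy = fit_slope(copies, standard_signals)
    return sample_signal / signal_per_copy / loading_factor


# Hypothetical signals: the standards read 130, 650, and 1300 units.
estimate = copy_number(400.0, [130.0, 650.0, 1300.0])
print(round(estimate, 2))
```

The loading factor stands in for the correction obtained from the photographic negative of the gel or, in later blots, from the 18S ribosomal probe.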

The Southern blot analysis demonstrated not only the copy number of

each cell line, but permitted us to conclude that the flanking region

of most constructs was intact. The mode of integration for the histone

plasmids is described further in chapter 4.

Table 3-1   Quantitation of Polyclonal Cell Line Expression.

Cell Line    Human/Mouse Exp    Copy number    Exp/Copy number



A quantitative summary of the expression data from the polyclonal cell
lines of the proximal deletion constructs. The human/mouse expression
ratio was determined by densitometry of the S1 nuclease protected
fragments in Figure 3-3. Copy number for each cell line was determined
from the Southern blot in Figure 3-4. Since these data were derived
from polyclonal cell lines it is not possible to interpret the results
strictly, and we would like to note that copy number in a polyclonal
cell line is somewhat ambiguous. Expression is denoted as Exp.

The results of the S1 nuclease analysis and copy number

determination are presented in Table 3-1. The S1 nuclease assay was

similarly quantitated with the densitometer and the results are

expressed as a ratio of the human to the mouse signal. The results,

although from only a few individual cell lines, have been repeated several

times. The S1 nuclease analysis results from the proximal deletion

polyclones suggested that J67 (-47bp) was unable to correctly initiate

histone mRNA transcription. Only when the promoter was extended in J56

(-73 bp) was correct initiation observed (Figure 3-3). It can be seen

from the data in Table 3-1 that the expression per copy of the J56

construct (-73 bp) is quite low in vivo (expression/copy = 0.0004), and

as noted later this may partly reflect the copy number rather than

the amount of 5' sequence present in the construct. When the

flanking sequences are extended to -100 bp in the construct J50 there

is an apparent 80 fold increase in the expression/copy ratio (0.034).

The expression/copy ratio of the remaining deletion constructs

stabilizes at a value of 0.02 to 0.01 with increased length of 5'

sequence. This 25-50 fold increase is probably exaggerated because of

copy number differences between J56 and the longer constructs. This

phenomenon (expression versus copy number) will be discussed later in

the chapter. Still it is likely that the difference in the

expression/copy ratio is 10 fold. These data are supported by the

results of Ken Wright in our laboratory, who has utilized in vitro

transcription to define the functionality of proximal promoter elements

and demonstrated that in nuclear extracts the transcription of J50




Figure 3-5 Schematic diagram of the proximal human histone H4
Bal31 deletion mutants: Sequence analysis of the
deletion points.

Each construct was sequenced according to the protocol of Maxam and
Gilbert (1980) and as described in Materials and Methods. The deletion
point of each construct is denoted with an asterisk over the last
nucleotide included in the sequence of that construct. For reference
the ATG codon, TATA box, GGTCC element, CAAT boxes and Sp1 site have
been underlined. The two bolded regions of the promoter correspond to
Site I and Site II, the DNase I protected regions of protein/DNA
interaction as defined by Pauli et al. (1987).

(-100 bp) is several fold higher than J56 (-73 bp) (Ken Wright,

personal communication).
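The fold differences quoted in this paragraph follow directly from the expression/copy ratios cited from Table 3-1. As a quick arithmetic check, the short Python sketch below recomputes them from the values given in the text. It is not part of the original analysis, and the names (J56_RATIO, fold_change and so on) are illustrative only.

```python
# Expression/copy ratios quoted in the text for the polyclonal cell lines.
J56_RATIO = 0.0004            # J56 (-73 bp)
J50_RATIO = 0.034             # J50 (-100 bp)
LONGER_RANGE = (0.01, 0.02)   # range reported for the longer deletion constructs


def fold_change(ratio, baseline):
    """Return how many times larger `ratio` is than `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return ratio / baseline


# Fold increase of J50 and of the longer constructs relative to J56.
j50_fold = fold_change(J50_RATIO, J56_RATIO)
longer_folds = tuple(fold_change(r, J56_RATIO) for r in LONGER_RANGE)

print(f"J50 vs J56: {j50_fold:.0f}-fold")
print(f"longer constructs vs J56: {longer_folds[0]:.0f}- to {longer_folds[1]:.0f}-fold")
```

With the quoted ratios, J50 comes out at about 85 times J56, which the text rounds to 80 fold, and the longer constructs fall at 25 to 50 times J56.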

Previously, the deletion points of the Bal31 deletions had been

determined by restriction enzyme analysis and electrophoresis on high

percentage agarose gels (Sierra et al., 1983). To determine the exact

deletion point, each construct was sequenced by the method of Maxam

and Gilbert (1980). Ken Wright and I collaborated in this effort and

the approach we undertook is described in Materials and Methods.

Importantly, the strategy permitted us to sequence across the deletion

point in each construct and to determine the exact end of Bal31

digestion. The deletion points we determined are denoted in Figure 3-


When we examined the sequence of the J67 (-47 bp) deletion, it was

obvious that the GGTCC element and TATA box were still present and the

proximal CAAT box (-53 bp) was absent. Our S1 nuclease analysis

suggested that this was not sufficient promoter sequence for correct in

vivo transcription initiation. To ensure that this was indeed the

case, we prepared 5 additional polyclonal cell lines of J67 and

demonstrated that they all contained integrated constructs (Figure 3-

6b,c); however, none expressed a correctly initiated histone H4 mRNA

(Figure 3-6a). The absence of a detectable S1 protected fragment in

the J67 polyclonal cell lines was repeated several times. Upstream

initiation of transcription was sometimes detectable although this was

not consistent. The importance of these results became apparent when

Drs. Urs Pauli and Susan Chrysogelos of our laboratory demonstrated the

binding of proteins to the proximal promoter region of this H4 gene in

Figure 3-6 S1 nuclease and Southern blot analysis of J67 polyclonal
cell lines for correct human H4 expression and copy number.

Additional J67 polyclonal cell lines were made to confirm that this
construct was unable to initiate human H4 mRNA transcription correctly.
A. S1 nuclease analysis of 25 µg total cellular RNA from 5 new J67
polyclonal lines and the one tested previously, J67pla. Also shown are
polyclonal lines 108Ap4 and 108Xp2. H, HeLa total cellular RNA. C,
C127 total cellular RNA. M, pBR322 HpaII markers. The human H4 S1
protected fragment (280 nt) is noted with an arrow at the left. There
was no detectable human H4 signal in any of the J67 lanes even upon
repetition and long exposure. B. Southern blot analysis of J67
polyclonal cell lines for copy number determination. J67 polyclones 1-5
and pF0108Am12 are shown. The position of 1070 bp is noted and the
arrow indicates the size of the deletion EcoRI/XbaI fragment from J67.
Plasmid DNAs in the amount of 10, 50, and 100 pg were included for copy
number quantitation as described in Figure 3-4. H, HeLa cell DNA digested
with EcoRI and XbaI; C, C127 cell DNA digested with EcoRI and XbaI. C.
The blot in B was reprobed with the 18S mouse ribosomal fragment for
quantitation of the amount of DNA in each lane. The size of the 18S
band, 1.3 kb, is noted at the right. Quantitation was done as described
in Materials and Methods and Appendix A.




vivo (Pauli et al., 1987). The specific areas of protein/DNA

interaction as defined by DNase I protection are outlined in Figure 3-5

with the construct deletion end points. Interestingly, the J67

deletion point is located in the middle of Site II and leaves the

proximal portion with the GGTCC element and TATA box intact. It would

appear that the absence of Site I and the presence of only half of Site

II are insufficient for transcription initiation in vivo. However,

when all of Site II is present in construct J56, a low but

detectable level of transcription is present (Figure 3-3 and Table 3-

1). The large increase in the expression/copy ratio of the J50 (-100

bp) construct is apparently the result of including a sequence with remarkable

similarity to the Sp1 (Dynan and Tjian, 1983b) binding site as described by

Briggs et al. (1985) and Evans et al. (1988). Although we have not proven

that the protein/DNA interaction at this site is the result of Sp1, it seems a

strong possibility that it could be Sp1 or a similar protein. J50 also

includes a putative CAAT box; however, the functionality of this

sequence is in question because it lacks the necessary homology to the

consensus sequence. Additionally, this CAAT box is not entirely

included in the protein binding domain of Site I as described by Pauli

et al. (1987) and it is therefore unlikely that it functions in the

same capacity. It should be mentioned that Sp1 has been shown to

interact with CTF in the HSVtk promoter (Jones et al., 1985), and a

possible interaction in the histone promoter should not be ruled out

immediately, although it is unlikely. The CAAT sequence is well

conserved evolutionarily in conjunction with the GGTCC element (Wells,

1986) and our results suggest that the removal of this element in the

distal half of Site II prevents correct transcription initiation.

We investigated whether any distal promoter elements had an

effect on the transcription of the F0108 human H4 histone gene.

Polyclonal cell lines were prepared from constructs pF0005 (-417 bp),

pF0004 (-6.0 to -7.5 kb), pF0002 (-1065 bp), and pFO003 (-6.5 kb). The

results of the S1 nuclease analysis and limited copy number analysis on

these cell lines suggested that upstream sequences beyond those already

examined might contribute to an increased level of expression (data not

shown). Upon reflection, it is likely that in most cases, the

increased level of expression we noted was the result of high copy

number, and not necessarily because of a strong promoter sequence such

as an enhancer. These results, although limited at the time, prompted

us to examine in a more rigorous way the distal 5' promoter sequences

of the F0108 H4 histone gene for possible regulatory areas that control


Transfection of the constructs pFO005 (-417 bp), pF0002 (-1065 bp),

and pF0003 (-6.5kb) into mouse C127 cells was done to assess any distal

contributions to the expression level of this H4 gene. As stated

previously, enhancer and silencer/negative regulatory elements can be

located at considerable distances from the promoter of a gene and still

accentuate or depress expression of the linked gene (Maniatis et al.,

1987; Theisen et al., 1986; Baniahmad et al., 1987). The new cell

lines were grown primarily as monoclones, and for continuity with the

previous studies, monoclonal cell lines of pF0108A and K8 were also


I will state now that we have found that there is a competition

between the transfected human H4 histone genes and the endogenous mouse

H4 gene for regulatory factors and this is discussed later and in

chapter 4. The interpretation of expression from each construct is

affected by this competition phenomenon, and becomes rather confusing.

We bring this up here only to make the reader aware that this situation

exists, and the results have been interpreted several ways, sometimes

with this taken into account. It has been extremely difficult to

understand the relationship that exists between the endogenous mouse H4

genes and the transfected human H4 genes. We have analyzed the

expression/copy data carefully to decipher any trends. The results of

this analysis are also reviewed in chapter 4. The choice of the mouse

H4 as an internal control for the S1 nuclease analysis was both

fortunate and detrimental to our interpretation. In short, the entire

expression analysis is presented here, but because of the realization

later in the course of this work about copy number and competition for

transcription factors, only some of the data will be incorporated into

the final synopsis.

The monoclonal cell lines were analyzed for the level of expression

and copy number present. The S1 nuclease analysis of the pFO003

monoclonal cell lines is presented in Figure 3-7 and was done as

described in Materials and Methods. Almost all of the monoclones were

positive for expression of the human H4 histone gene with the exception

of pFO003m18. We utilized several exposures to determine,

densitometrically, the level of expression from each cell line. The

expression data are presented as a ratio of the human and mouse


Figure 3-7 S1 nuclease analysis of pFO003 monoclonal cell lines.

S1 nuclease assays were performed as described in Materials and
Methods. Almost all 15 clones shown here are positive for expression
of the human H4 gene. The exception is pFO003m18. H, HeLa total
cellular RNA. C, C127 total cellular RNA. M, pBR322 digested with HpaII
and labelled with α-32P-dCTP and Klenow fragment. Dilutions of the
marker are noted as 1:4, 1:8, 1:16 and 1:32 for densitometry purposes.
The human (280 nt) and mouse (110 nt) protected fragments are denoted
with labels and arrows at the left. The clone numbers appear above the
individual lanes to which they correspond.

Figure 3-8 Southern blot analysis of pFO003 monoclonal cell lines.

Southern blot analysis was performed as described in Materials and
Methods. 10 µg of DNA from each cell line were analyzed with nick
translated EcoRI/XbaI fragment from pF0002. A. pFO003 cell line DNA
probed with H4 sequences. B. The histone probe was removed and the blot
was reprobed with the mouse 18S ribosomal fragment. Densitometry of the
1070 bp band specified by the arrow in A and the 18S ribosomal band in
B permitted quantitation of the copy number through normalization to
the amount of DNA actually loaded and transferred as described in the
Materials and Methods. The figure in A is a composite of several
exposures that reflects the actual copy number and accounts for
original quantitation errors. The plasmid controls for quantitation are
labelled 10, 50 and 100 designating the number of pg loaded. C, C127
cellular DNA. H, HeLa cellular DNA. M, λ DNA digested with EcoRI and
HindIII and labelled with α-32P-dCTP and Klenow fragment. The number
of each clone is designated above the lane.



Figure 3-9 S1 nuclease analysis of pFO108A and pF0002 monoclonal
cell lines.

S1 nuclease assays were performed as described in Materials and
Methods. The left panel is representative of results obtained from
pFO108A cell lines; the right panel shows results with total cellular
RNA from pFO002 cell lines. The human and mouse protected fragments
are designated with
labels and arrows. The markers, M, are pBR322 digested with HpaII and
important sizes are noted. The number above each lane corresponds to
the clone number of that construct. The markers were diluted M1:4 and
M1:8 for densitometry quantitation purposes. H, HeLa total cellular
RNA. C, C127 total cellular RNA.



Figure 3-10 Copy number analysis of pF0002 and pFO108A monoclonal
cell lines.

Southern blot analysis was performed as described in Materials and
Methods. 10 µg of DNA from each cell line were analyzed with nick
translated EcoRI/XbaI fragment from pF0002. A. pFO108A and pFO002 cell
line DNA probed with H4 sequences. B. The histone probe was removed and
the blot was reprobed with the mouse 18S ribosomal fragment.
Densitometry of the 1070 bp band specified by the arrow in A and the
18S ribosomal band in B permitted quantitation of the copy number
through normalization to the amount of DNA actually loaded and
transferred as described in the Materials and Methods. The figure in A
is a composite of several exposures that reflects the actual copy
number and accounts for original quantitation errors. The plasmid
controls for quantitation are labelled 10, 50 and 100 designating the
number of pg loaded. C, C127 cellular DNA. H, HeLa cellular DNA. M,
λ DNA digested with EcoRI and HindIII and labelled with α-32P-dCTP and
Klenow fragment. Each set of clones is designated with the black bar
and the number of the individual clones is above the lane.


densitometry signals in Table 3-2 (p. 103). The average expression of

nine pF0003 monoclonal cell lines, for which copy number was later

determined, was 2.29 ± 2.43.

It was obvious that these results varied, so the copy number of

each cell line was determined from the Southern blots in Figure 3-8a,b.

The Southern blots of pF0003 monoclonal cell line genomic DNA, digested

with EcoRI and XbaI, were prepared as detailed earlier and in Materials

and Methods. The hybridization probe was the 1070 bp EcoRI/XbaI

fragment isolated from pF0002 and nick-translated. The actual copy

number of each cell line was determined by densitometric analysis of

the 1070 bp EcoRI/XbaI band with normalization for the amount of DNA

actually loaded. The amount of DNA in each lane was determined by

removal of the histone probe at 80°C in 0.1X SSC and subsequent

hybridization with the oligo-labelled BamHI/SalI fragment of the mouse

18S ribosomal gene. Densitometry of the 18S ribosomal band (Figure 3-

8b) permitted normalization of the histone H4 copy numbers and

comparison to the plasmid controls for copy number (see Appendix A for

sample calculation of copy number).
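The normalization described above can be sketched in a few lines of Python. This is only an illustration of the general approach (plasmid standards define signal per picogram, and the 18S signal corrects for the DNA actually loaded); the author's own calculation is the one in Appendix A and may differ in detail. The plasmid size, genome size, genomic DNA load and all densitometry values below are hypothetical placeholders, not values from the thesis.

```python
# Hypothetical constants: NOT taken from the thesis; substitute real values.
PLASMID_SIZE_BP = 5_000           # assumed size of the transfected plasmid
DIPLOID_GENOME_BP = 6.0e9         # assumed size of the diploid mouse genome
GENOMIC_DNA_LOADED_PG = 10.0e6    # assumed nominal genomic DNA per lane (10 ug)


def signal_per_pg(standards):
    """Least-squares slope through the origin for (pg plasmid, signal) pairs."""
    numerator = sum(pg * signal for pg, signal in standards)
    denominator = sum(pg * pg for pg, _ in standards)
    return numerator / denominator


def copies_per_pg():
    """Copies per diploid genome represented by 1 pg of plasmid in one lane."""
    return DIPLOID_GENOME_BP / (PLASMID_SIZE_BP * GENOMIC_DNA_LOADED_PG)


def copy_number(h4_signal, rrna_signal, rrna_reference, standards):
    """Estimate copies per cell for one lane of the Southern blot.

    h4_signal      -- densitometric signal of the 1070 bp H4 band
    rrna_signal    -- 18S ribosomal signal in the same lane
    rrna_reference -- 18S signal expected for the nominal DNA load
    standards      -- (pg plasmid, signal) pairs from the plasmid control lanes
    """
    loading = rrna_signal / rrna_reference   # fraction of nominal DNA present
    corrected = h4_signal / loading          # H4 signal at the nominal load
    plasmid_pg = corrected / signal_per_pg(standards)
    return plasmid_pg * copies_per_pg()


# Invented densitometry readings for the 10, 50 and 100 pg plasmid controls.
standards = [(10, 1.0), (50, 5.0), (100, 10.0)]
estimate = copy_number(h4_signal=4.0, rrna_signal=0.8,
                       rrna_reference=1.0, standards=standards)
print(f"estimated copies per cell: {estimate:.1f}")
```

Fitting the standards through the origin assumes the film response is linear over the 10 to 100 pg range, which is one reason several exposures are usually compared.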

The copy number data help to explain some of the variation seen

with the original expression determination for each cell line. When

pF0003 copy number is taken into account for the expression data in

Table 3-2, the expression/copy ratio for all of the cell lines is

lowered and the average expression/copy is 0.094 ± 0.091. It is

apparent from the data in Table 3-2 that as copy number increases, the

expression/copy increases until approximately 20-40 copies are present,

after which it declines. The pFO003m15 cell line is perhaps lower than

expected with respect to expression because of an unusual or

deleterious integration site. The threshold of expression at 20-40

copies indicated that a limited number of human histone genes could be

integrated and expressed in any one cell. This phenomenon has been

investigated further and is discussed later in light of genomic

sequencing data presented in Chapter 4. Overall the pFO003 monoclonal

cell lines had higher expression levels than other cell lines (compare

expression values with others in Table 3-2), but the expression/copy

was similar. Since copy number was implicated in the level of

expression, we also calculated the average copy number of each group of

monoclonal cell lines and this is presented in Table 3-2. The level of

expression, as we have determined it here (Table 3-2), is a direct

reflection of the copy number.
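The expression/copy values and group averages discussed above are simple to compute once expression and copy number are known for each clone. The Python sketch below shows the arithmetic on invented numbers; the clone names and values are hypothetical and are not taken from Table 3-2. It also assumes that the spread reported with each average is a sample standard deviation, which the text does not state.

```python
from statistics import mean, stdev

# Hypothetical clones: (human/mouse expression ratio, copy number).
# These values are invented for illustration and are NOT from Table 3-2.
clones = {
    "clone_a": (0.5, 5.0),
    "clone_b": (2.0, 20.0),
    "clone_c": (4.5, 30.0),
}


def expression_per_copy(expression, copies):
    """Normalize an expression ratio to the number of integrated gene copies."""
    if copies <= 0:
        raise ValueError("copy number must be positive")
    return expression / copies


# Per-clone expression/copy values.
per_copy = {name: expression_per_copy(expr, copies)
            for name, (expr, copies) in clones.items()}

# Group summary: mean and sample standard deviation (assumed convention).
values = list(per_copy.values())
summary = (mean(values), stdev(values))

print(f"expression/copy = {summary[0]:.3f} +/- {summary[1]:.3f}")
```

When the standard deviation is as large as the mean, as in several of the averages reported here, the individual clone values are more informative than the group average.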

The results of the S1 analysis of the pF0108A and pF0002 monoclonal

cell lines are presented in Figure 3-9. Both cell lines expressed at a

relatively low level and the numerical data are presented in Table 3-2.

The average level of expression/copy for pFO108A is 0.079 ± 0.061 and for

pF0002 is 0.045 ± 0.053. The data collected for the pF0108A monoclones

were previously divided into two groups. Originally, there was a

construct, designated J40, that after sequencing of the deletion points

was found to be identical to pFO108A. Therefore, these data were

incorporated into the 108A data base. It is interesting to note that

pFO108A and J40 were thought to have different lengths of 5' sequence

and yet their expression was shown to be almost identical. This

separation of the original observations lends a measure of confidence

to the analysis process that has been used in these studies.

Figure 3-11 S1 nuclease analysis of pFO005 monoclonal cell lines.

Twenty five micrograms of total cellular RNA from each of the cell
lines were treated as described in Materials and Methods and the
autoradiograph of the S1 nuclease analysis was quantitated by
densitometry. Lanes are designated with the clone number of the cell
line. HeLa cell total RNA hybridized to both human and mouse probes, H.
C127 RNA hybridized to both human and mouse probes, C. pBR322 HpaII
markers labelled with α-32P-dCTP and Klenow fragment, M. One fourth the
amount of marker was electrophoresed for quantitation purposes, M1:4.
The construct name, pF0005, is displayed above the black line. Both
human and mouse (280 nt and 110 nt respectively) protected fragments
are noted at the right.
